18S rRNA Secondary Structure and Phylogenetic Position of Peloridiidae (Insecta, Hemiptera)



David Ouvrard,* Bruce C. Campbell,† Thierry Bourgoin,* and Kathleen L. Chan†

* Laboratoire d’Entomologie and ESA 8043 du CNRS, Muséum National d’Histoire Naturelle, 45 Rue Buffon, F-75005 Paris, France; and † Western Regional Research Center, USDA-ARS, 800 Buchanan Street, Albany, California 94710-1100

Received September 22, 1999; revised March 10, 2000

A secondary structure model for 18S rRNA of peloridiids, relict insects with a present-day circumantarctic distribution, is constructed using comparative sequence analysis, thermodynamic folding, a consensus method using 18S rRNA models of other taxa, and support of helices based on compensatory substitutions. Results show that the probable in vivo configuration of 18S rRNA is not predictable using current free-energy models to fold the entire molecule concurrently. This suggests that refinements in free-energy minimization algorithms are needed. Molecular phylogenetic datasets were created using 18S rRNA nucleotide alignments produced by CLUSTAL and rigorous interpretation of homologous position based on certain secondary substructures. Phylogenetic analysis of a hemipteran data matrix of 18S rDNA sequences placed peloridiids sister to Heteroptera. Affiliations among the three main euhemipteran lineages were not resolved. The peloridiid 18S rRNA model presented here provides the most accurate template to date for aligning homologous nucleotides of hemipteran taxa. Using folded 18S rRNA to infer homology of character as morpho-molecular structures or nucleotides and scoring particular sites or substructures is discussed. © 2000 Academic Press

INTRODUCTION

The 18S rRNA molecule of the ribosomal small subunit is frequently used to infer phylogenetic affiliations of ancient (>100 Mya) eukaryotic relationships. Its suitability as a phylogenetic tool is two-fold. First, it is a good source of phylogenetic information based on conservation of function, variable mutation rates depending on substructure position, and its ubiquity in all taxa. Second, the molecule provides readily obtainable nucleotide sequences because of high rRNA transcript copy number in eukaryotes and ease of PCR primer design (Woese, 1987; Kjer, 1995). The 18S rRNA molecule contains certain stable sections having low substitution rates. As such, 18S rRNAs can provide informative characters for assessing affiliations of evolutionarily distant taxa. However, 18S rRNAs also possess variable regions having high substitution rates. In some organisms these rapidly evolving expansion regions (or helices) provide phylogenetic signals for discerning relationships between evolutionarily closer clades.

Genetic changes (characters) used to predict phylogenetic affiliations are of greatest utility if homology of characters can be inferred as accurately as possible. Determining homology between molecular characters is a significant issue. In some instances, definitions for homology used in morphology (Remane, 1956; Patterson, 1982) are not exactly applicable to nucleotide data. For example, in comparing rRNA nucleotide sequences between taxa, character to character homology (primary homology) is frequently proposed by aligning respective sequences according to an evaluation of similarity. Individual base positions are aligned to maximize overall position likeness along the entire sequence. Variations of this approach are found in the computer programs MALIGN (Wheeler and Gladstein, 1994) and POY (Gladstein and Wheeler, 1997). These programs perform iterations of progressively optimized data matrices based on gap costs and tree length or character insertion and deletion, respectively. Cladistic analysis of such restructured character matrices may or may not support initially proposed homologies according to congruency of apomorphies (i.e., secondary homology: De Pinna, 1991; Bourgoin, 1996). An altogether different approach to determining homology of character is secondary reconstruction of rRNA molecules to improve recognition of primary homology (De Rijk and De Wachter, 1993; Kjer, 1995), applied in this study.

Peloridiids (common name moss bugs) are rare insects believed to be relictual members of an ancient preheteropteran lineage of Hemiptera (sensu lato) (Evans, 1981). The family includes 21 extant species having a classically Gondwanan distribution currently confined to small isolated populations in South America, Australia, New Zealand, New Caledonia, and certain remote islands in the southern Pacific (Evans, 1981; Burckhardt and Agosti, 1991). The phylogenetic position of the Peloridiidae among Hemiptera has been continually debated. Breddin (1897) first placed Peloridiidae in Heteroptera. Myers and China (1929) transferred them to Homoptera, as members of a newly created series, the Coleorhyncha. China (1962) later suggested transferring them to Auchenorrhyncha. Schlee (1969) placed Peloridiidae and Heteroptera as sister groups (adelphotaxa) in the superorder Heteropteroidea. Cobben (1978) considered Schlee’s synapomorphies as superficial and argued against monophyly of Heteroptera + Coleorhyncha. Evans (1981) regarded Peloridiidae as an independent suborder of Hemiptera with Homoptera and Heteroptera. Finally, based on genital morphology, Bourgoin (1993) argued that Homoptera and Auchenorrhyncha were paraphyletic and Peloridiidae + Fulgoromorpha might form a clade sister to Heteroptera. However, recent results suggest a closer relationship of Prosorrhyncha to Clypeorrhyncha than to Fulgoromorpha (Bourgoin et al., 1999). Studies using molecular data also indicate that Homoptera and Auchenorrhyncha are paraphyletic (Sorensen et al., 1995; von Dohlen and Moran, 1995; Campbell et al., 1995).

Molecular Phylogenetics and Evolution Vol. 16, No. 3, September, pp. 403–417, 2000. doi:10.1006/mpev.2000.0797, available online at http://www.idealibrary.com

1055-7903/00 $35.00. Copyright © 2000 by Academic Press. All rights of reproduction in any form reserved.

Clade Heteropteroidea (sensu Schlee, 1969) was recently supported with morphological (see Sorensen et al., 1995) and 18S rDNA data, the latter consisting of two studies showing four (Wheeler et al., 1993; Schuh and Slater, 1995) and two (Campbell et al., 1995) synapomorphic sites. The new suborder name Prosorrhyncha was proposed to replace Heteropteroidea, in addition to other new names to conform all hemipteran suborders with a “-rrhyncha” suffix (Sorensen et al., 1995). However, in these molecular studies, only partial 18S rDNA sequences were used. Campbell et al. (1995) and Bourgoin et al. (1997) found only one synapomorphic site to infer Prosorrhyncha as sister to Archaeorrhyncha (= Fulgoromorpha), forming clade Neohemiptera sensu Sorensen et al. (1995) (Fig. 1). To date, morphological and anatomical studies have not agreed on a definitive phylogenetic framework of these hemipteran subgroups. Because peloridiids possess a combination of ancestral and derived “homopteran” and heteropteran morphological features, knowing their evolutionary position in Hemiptera (sensu lato) might help resolve affiliations of the major hemipteran lineages. In this study, we present full 18S rRNA sequences of two peloridiids, Hackeriella veitchi, first described by Hacker in the genus Hemiodoecus in 1932, and Hemiowoodwardia wilsoni (Evans, 1936). We incorporate these sequences into a phylogenetic dataset to deduce evolutionary affiliations of hemipteran lineages. We also use these sequences to perform a rigorous inference of the secondary structure of peloridiid 18S RNA. Preliminary examination of this structure shows that it will serve as a reliable template to improve understanding of homology of characters. These characters can be in the form of nucleotides or “morpho-molecular” substructures to be incorporated into a larger dataset, containing other hemipteran exemplars.

FIG. 1. Inferred phylogeny of Hemiptera (sensu lato) showing taxonomic scheme based on cladification (after Sorensen et al., 1995).

MATERIAL AND METHODS

Insects

Peloridiid nymphs and adults were collected from moss associated with Nothofagus. Specimens of H. veitchi were collected in the Springbrook Plateau, S.E. Queensland, 14 March 1997, and those of H. wilsoni in Melba Gully, Otways, Victoria, 5 November 1997. Insects were collected from moss into ethylene glycol using a Berlese funnel. Peloridiids were transferred to 95% ethanol, shipped to BCC, and stored at −80°C prior to extraction of genomic DNA.

Isolation of Genomic DNA, PCR Amplification, Cloning, and Sequencing 18S rDNA

Insects were removed from alcohol and dried for 10 min using a SpeedVac Concentrator (Savant). Total genomic DNA was purified by homogenizing two insects in 200 µl buffer [10 mM Tris (pH 8.0), 2.5 mM MgCl2, 50 mM KCl], 200 µl phenol, and 20 µl 10% SDS. Liquid phases were separated by centrifugation. DNA was precipitated from the top phase using 95% ethanol (3:1). The pellet was dried, washed in 70% ethanol, dried again, and resuspended in 20 µl TE buffer [10 mM Tris (pH 8.0), 1 mM EDTA]. The polymerase chain reactions (PCR: Saiki et al., 1988) were performed with components of the Platinum Taq DNA Polymerase High Fidelity kit (Life Technologies, Inc.) in 25-µl reactions: 1 µl DNA template, 2.5 µl PCR buffer, 0.5 µl each dNTP, 2 µl each forward and reverse primer, 1 µl MgSO4, 0.125 µl Taq DNA polymerase, and 14.375 µl water. Full 18S rDNAs were amplified using the forward primer 5′-CTG GTT GAT CCT GCC AGA GT-3′ and the reverse primer 5′-TCC TTC CGC AGG TTC ACC-3′. The PCR cycling program was 94°C for 1 min, followed by 30 cycles of 94°C for 30 s, 55°C for 30 s, and 68°C for 1 min, with 68°C for 7 min after the last cycle. PCR products were cloned using plasmids and competent cells supplied in the TA Cloning System (Invitrogen). Putative clones containing 18S rDNA were selected using supplies and protocols supplied with the RPM Rapid Pure Miniprep Kit (BIO 101). Both top and bottom strands of 18S rDNA were fully sequenced by the dideoxynucleotide-termination method (Sanger et al., 1977) on a LI-COR automated DNA sequencer using components and protocols from SequiTherm EXCEL II DNA Sequencing Kit-LC (Epicentre Technologies).
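As a quick arithmetic check on the reaction recipe and cycling program described above, the listed volumes and hold times can be totalled. The sketch below is illustrative only: it assumes that "each dNTP" means four dNTPs added separately and it ignores thermocycler ramp times; the variable names are not from the paper.

```python
# Volumes (in microlitres) of one PCR reaction, as listed in the text.
# Assumption: "each dNTP" refers to four separately added dNTPs.
components_ul = {
    "DNA template": 1.0,
    "PCR buffer": 2.5,
    "dNTPs (4 x 0.5)": 4 * 0.5,
    "primers (2 x 2.0)": 2 * 2.0,
    "MgSO4": 1.0,
    "Taq DNA polymerase": 0.125,
    "water": 14.375,
}
total_volume_ul = sum(components_ul.values())

# Cycling program in seconds; ramp times between steps are ignored (assumption).
initial_denaturation_s = 60
cycle_s = 30 + 30 + 60          # 94 C, 55 C, 68 C steps of one cycle
n_cycles = 30
final_extension_s = 7 * 60
total_time_min = (
    initial_denaturation_s + n_cycles * cycle_s + final_extension_s
) / 60

print(f"Total reaction volume: {total_volume_ul} microlitres")
print(f"Programmed hold time: {total_time_min} min")
```

Under these assumptions the components add up to the stated 25-µl reaction volume.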

Predicting Secondary Structure

For a complete review of techniques used for predicting secondary structures see Chastain and Tinoco (1991). Descriptions of seven types of substructures of a folded rRNA molecule are provided for reference (Table 1). The three approaches used to estimate secondary structure for peloridiid 18S rRNA are outlined as follows.

Comparative sequence analysis. The comparative sequence method is initially used to find regions in 18S rRNA shared by a large number of taxa and to preliminarily infer secondary structure of relatively conserved portions of the molecule. According to Chastain and Tinoco (1991), this method provides the best results when sequences have approximately 70% base similarity. With this method, secondary structure is inferred based on the principle that substitutional changes not affecting function may be relatively conserved. The basic postulate of this comparative method is that structural conservation between different species, despite base substitutions in the sequence, is reliable evidence of an important functional role for that structure. During the first step, rRNA sequences from different organisms are aligned according to primary structure. The second step is to confirm paired regions as structural elements by positional covariance between available sequences. For example, A·U base pairs might replace G·C base pairs (or vice versa) to maintain stem substructures of helical regions. Semicompensatory substitutions can also conserve structural features, but indicate a stem which is not as energetically stable as one having full compensatory substitutions. As number and diversity of full 18S rDNA sequences available from databases such as GenBank have increased, complementary base pairings in conserved regions have become reasonably established according to positional covariance (Gutell, 1993).

TABLE 1

Glossary of Terms for Secondary Structures of RNAs

Term: Definition
Stem: A right-handed double helix composed of a succession of complementary hydrogen-bonded nucleotides between paired strands.
Single strand: Unpaired nucleotides separating stems.
Hairpin loop: Stem closed distally by a loop of unpaired nucleotides.
Terminal bulge: Succession of unpaired nucleotides at the end of a hairpin loop.
Lateral bulge: Succession of unpaired nucleotides on one strand of a stem.
Internal bulge: Group of nucleotides from the two opposite strands unable to form canonical base pairs.
Junction or multibranched loop: Succession of groups of unpaired nucleotides joining the last proximal pairing of several stems.
Compensatory substitution: Mutation on one strand of a stem following initial mutation of a complementary base to maintain structure.
Semicompensatory substitution: Mutation of a base on one strand not affecting structure (A·U or G·C replaced by G·U), except in terms of thermodynamics.
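The distinction between compensatory and semicompensatory substitutions (Table 1) can be expressed as a small classifier for the homologous base pair of a stem in two taxa. This is a minimal sketch, not a procedure from the paper: the function name and return labels are hypothetical, and it assumes that any change of both bases that preserves pairing counts as compensatory, while a change of a single base that preserves pairing (which necessarily involves a G·U wobble pair) counts as semicompensatory.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}
CAN_PAIR = WATSON_CRICK | WOBBLE


def classify_substitution(pair_a, pair_b):
    """Compare the homologous base pair of a stem in two taxa.

    Each argument is a (5' base, 3' base) tuple. Labels are illustrative.
    """
    if pair_a == pair_b:
        return "conserved"
    if pair_a not in CAN_PAIR or pair_b not in CAN_PAIR:
        return "unpaired in at least one taxon"
    # Count how many of the two positions differ between the taxa.
    changed = sum(x != y for x, y in zip(pair_a, pair_b))
    if changed == 2:
        return "compensatory"
    # One base changed and pairing is kept: one of the pairs is a G.U wobble.
    return "semicompensatory"


print(classify_substitution(("G", "C"), ("A", "U")))  # both bases changed
print(classify_substitution(("U", "A"), ("U", "G")))  # one base changed
```

The second call mirrors the U·A to U·G replacement that the text reports for helix 8.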

Secondary structure determination of full 18S rRNAs has also been proposed by comparing closely related sequences using the DCSE program (De Rijk and De Wachter, 1993). This program maps a secondary structural prediction established for one organism onto the sequence of another organism. It then checks whether this second sequence can realistically fold according to the initially predicted model. Another program, the RNAlign program (Corpet and Michot, 1994), uses a similar approach but for small relatively conserved regions. Refinement of aligned sequences based on secondary structure can also be attempted manually following the procedures outlined by Kjer (1995).
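The idea of overlaying one organism's structure on another organism's sequence can be sketched in a few lines. The code below is not DCSE or RNAlign and does not reproduce their behavior; it assumes the template is given in dot-bracket notation, that the second sequence is already aligned to it position by position, and that Watson-Crick and G·U pairs both count as compatible. All names are hypothetical.

```python
PAIRABLE = {"AU", "UA", "GC", "CG", "GU", "UG"}


def parse_pairs(dot_bracket):
    """Return the (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.append((stack.pop(), i))
    return pairs


def template_support(template, sequence):
    """Fraction of template base pairs that the aligned sequence can form."""
    if len(template) != len(sequence):
        raise ValueError("template and sequence must have the same length")
    pairs = parse_pairs(template)
    if not pairs:
        return 0.0
    compatible = sum(sequence[i] + sequence[j] in PAIRABLE for i, j in pairs)
    return compatible / len(pairs)


# Hypothetical hairpin template and two toy sequences.
template = "(((...)))"
print(template_support(template, "GGGAAACCC"))
print(template_support(template, "GGGAAACCA"))
```

A low fraction would flag a helix where the published model cannot simply be transferred and another approach is needed.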

Published models of 18S rRNA secondary structures of arthropods used as references in the comparative sequence method in our study included Arachnida, Aphonopelma sp. (Hendriks et al., 1988a); Crustacea, Artemia salina (Nelles et al., 1984); Insecta: Coleoptera, Tenebrio molitor (Hendriks et al., 1988b); Hemiptera, Acyrthosiphon pisum (Kwon et al., 1991); Diptera, Drosophila melanogaster (Hancock et al., 1988; Gutell, 1994; van de Peer et al., 2000).

For constructing the model of peloridiid 18S rRNA using comparative sequence analysis we also examined full sequences of 18S rRNAs of other hemipterans (Campbell et al., 1995), including Okanagana utahensis Davis, 1919 (Cicadidae, U06478); Philaenus spumarius L., 1758 (Cercopidae, U06480); Pealius kelloggii (Bemis, 1904) (Aleyrodidae, U06479); Lygus hesperus Knight, 1917 (Miridae, U06476). The comparative sequence analysis is impractical for determining secondary structure of strictly conserved sequences (no compensatory substitutions) and highly variable regions (difficulty of alignment).

Thermodynamic folding. This second method is based on predicting thermodynamically stable rRNA molecules by calculating free energies of all possible secondary structures and retaining the one of lowest energy (i.e., the most stable). In some instances, structures with free energies slightly higher than the one with lowest energy are also of interest. Currently used algorithms for thermodynamic folding are predicated on incomplete experimental data, so the predicted probability of certain structures can be uncertain (Chastain and Tinoco, 1991). Furthermore, there may be stabilizing tertiary interactions between helices in vivo that are not incorporated into currently available algorithms used to predict structure based on lowest free energy (Fields and Gutell, 1996).
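The principle of scoring every admissible structure and keeping the best one can be illustrated with a deliberately simplified dynamic program. The sketch below maximizes the number of nested base pairs (the Nussinov recursion) instead of minimizing free energy, so it is a stand-in for, not an implementation of, the thermodynamic algorithms cited in this section. The function name, the minimum loop size of three unpaired bases, and the toy sequences are assumptions.

```python
CAN_PAIR = {"AU", "UA", "GC", "CG", "GU", "UG"}


def max_base_pairs(seq, min_loop=3):
    """Largest number of nested base pairs that `seq` can form.

    best[i][j] holds the optimum for the subsequence seq[i..j]; at least
    `min_loop` unpaired bases must separate the two bases of a pair.
    """
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]              # case: base j stays unpaired
            for k in range(i, j - min_loop):    # case: base j pairs with base k
                if seq[k] + seq[j] in CAN_PAIR:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


print(max_base_pairs("GGGAAAUCCC"))
```

Real folding programs replace the count of pairs with experimentally derived energy terms for stacks and loops, which is where the incomplete thermodynamic data discussed above come into play.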

The Zuker–Steigler algorithm (Zuker and Steigler, 1981) was used to infer secondary structures of variable regions <50 bases in length for which no model was published. This algorithm is a subprogram of MacDNASIS (Hitachi Software). For folding longer variable regions, the mfold program v. 3 (Mathews et al., 1999; Zuker et al., 1999) was used from the mfold server, http://mfold2.wustl.edu/~mfold/rna/form1.cgi. In some cases, a consensus structure was chosen instead of a structure predicated on lowest energy (optimal thermodynamic structure). In such cases, this consensus structure was defined by comparing several structures of low energy (suboptimal structures) for the same molecule (automated procedure in Konings and Hogeweg, 1989). Retained consensus helices were folded for several insects so that the final consensus structure was one interpreted from both phylogenetic comparison and thermodynamic folding methods.
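One simple way to compare several suboptimal foldings of the same molecule is to keep only the base pairs that every folding shares. The sketch below illustrates that idea; it is not the automated procedure of Konings and Hogeweg (1989), and the dot-bracket input format, the function names, and the three toy structures are assumptions.

```python
def parse_pairs(dot_bracket):
    """Return the set of (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs


def consensus_pairs(structures):
    """Base pairs present in every one of the supplied structures."""
    pair_sets = [parse_pairs(s) for s in structures]
    return sorted(set.intersection(*pair_sets))


# Three hypothetical low-energy foldings of one 11-base region.
suboptimal = [
    "((((...))))",
    "(((.....)))",
    "(((.(..))))",
]
print(consensus_pairs(suboptimal))
```

A stricter or looser rule (for example, pairs present in most rather than all foldings) would be an equally reasonable design choice.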

Reliability of this method depends greatly on thermodynamic parameters. Currently, experimental thermodynamic data are missing for some substructures such as bulges, hairpins, and internal loops (Table 1). Also, there are no experimental thermodynamic values for junctions or multibranched loops (Jaeger et al., 1989). Prediction of secondary structures having numerous multibranched loops should be accepted with caution. Furthermore, tertiary and quaternary relations, not inferred in lowest energy models, may stabilize true helices in vivo (Noller, 1984). There are more than 20 proteins present in the small subunit of the ribosome that could play a role in constraining structural conformations which are not predicted using available algorithms.

The thermodynamic folding technique appears to give best results (i.e., results that agree with broad phylogenetic comparisons) when pairing base positions relatively near to each other in the primary sequence, particularly those bases forming a hairpin loop (Konings and Gutell, 1995). It is for this reason that an entire 18S rRNA cannot be inferred from a single folding using any of the available thermodynamic algorithms. Attempts to fold entire primary sequences of 18S rRNAs between even relatively closely related taxa (e.g., among insects in the same family) render secondary structural models too different for reliable comparison, and even basic postulates on structure cannot be tested.

Consensus method. Chastain and Tinoco (1991) suggested successive use of thermodynamic and phylogenetic comparison techniques. In our study, structures proposed by other authors for other insects were used as templates to overlay a structure for the peloridiid sequences to determine degree of congruence. Figure 2 shows how a comparison between different taxa can be used to refine the structure of an unknown helix for a new taxon. If more than one structure could be inferred, the interpretation based on least free energy was retained when calculation of free-energy values was possible with MacDNASIS v 3.6. Helical structures were inferred using thermodynamics only in regions where rRNA sequences were highly variable among taxa and no consensus model was available. Drawings of secondary structures were created using simple drawing software (Deskdraw, MK Software) or were imported from MacDNASIS files.

Steps and decisions followed to infer secondary structure for 18S rRNAs in our study were:

(1) Comparison between sequences. Can the selected sequence of peloridiids be folded into structures proposed in the literature? Yes → 2; No → 3.

(2) Comparison between structures proposed in the literature and thermodynamic folding. Does thermodynamic folding result in more stable and compatible helices, including examination of homologous sequences of other insects (e.g., T. molitor and D. melanogaster)? If so: → Refinement; if not: published helix retained. → 4.

(3) Peloridiid substructure sequence folded using Zuker–Steigler algorithm. Then → 4.

(4) In each case, supporting evidence of each helix is sought by searching for presence or absence of compensatory or semicompensatory substitutions.
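The four steps above form a small decision tree, which can be written out as a function that returns the path taken for one helix. This is only a restatement of the listed steps under assumed names; the two boolean inputs stand for judgments that the authors made by inspection, and nothing here automates those judgments.

```python
def helix_workflow(fits_published_model, folding_more_stable=False):
    """Return the sequence of steps followed for one helix.

    fits_published_model: can the peloridiid sequence be folded into a
        structure proposed in the literature? (step 1)
    folding_more_stable: does thermodynamic folding give a more stable and
        compatible helix than the published one? (step 2, only used if
        the first answer is yes)
    """
    steps = ["1: compare sequence with published structures"]
    if fits_published_model:
        steps.append("2: compare published helix with thermodynamic folding")
        if folding_more_stable:
            steps.append("refine helix")
        else:
            steps.append("retain published helix")
    else:
        steps.append("3: fold substructure thermodynamically")
    steps.append("4: look for compensatory or semicompensatory substitutions")
    return steps


for step in helix_workflow(fits_published_model=True, folding_more_stable=False):
    print(step)
```

Every path ends at step 4, which reflects that substitution evidence is sought for each helix regardless of how its structure was obtained.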

Phylogenetic Analysis

Phylogenetic positions of Peloridiidae and other major lineages within Hemiptera were assessed using 18S rDNA nucleotide sequences of other hemipteran taxa, as presented in Campbell et al. (1995). The sequences of H. veitchi and H. wilsoni and other insect taxa were aligned initially using CLUSTAL W (Thompson et al., 1994). Three data matrices were analyzed: full (all informative sites), attenuated (excluding ambiguously aligned sites), and polarized (excluding sites homoplasious in outgroup taxa), as defined in Campbell et al. (1995). Phylogenetic analyses were performed with PAUP* (v. 4.0b2a) (Swofford, 1998). Indels were treated as missing data.
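To make the difference between the full and attenuated matrices concrete, the sketch below selects alignment columns. It assumes that "informative" means parsimony-informative (at least two character states, each found in at least two taxa), treats gaps as missing data as stated above, and takes the set of ambiguously aligned columns as user-supplied input. The polarized matrix is not covered because identifying homoplasious sites requires a tree. The taxon labels and toy alignment are hypothetical, and this is not the procedure of Campbell et al. (1995).

```python
from collections import Counter

MISSING = {"-", "?", "N"}


def is_parsimony_informative(column):
    """True if at least two states each occur in at least two taxa."""
    counts = Counter(base for base in column if base not in MISSING)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def informative_columns(alignment, excluded=()):
    """Indices of informative columns, skipping any listed in `excluded`."""
    sequences = list(alignment.values())
    n_sites = len(sequences[0])
    return [
        i
        for i in range(n_sites)
        if i not in excluded
        and is_parsimony_informative([seq[i] for seq in sequences])
    ]


# Toy alignment with hypothetical taxon labels.
alignment = {
    "taxon1": "ACGA-",
    "taxon2": "ACGAT",
    "taxon3": "ATGCT",
    "taxon4": "ATACG",
}
full_sites = informative_columns(alignment)
attenuated_sites = informative_columns(alignment, excluded={3})
print(full_sites, attenuated_sites)
```

Here column 3 is excluded purely for illustration, as if it had been judged ambiguously aligned.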

RESULTS AND DISCUSSION

Primary Structure

Full 18S rRNA sequences of H. veitchi and H. wilsoni, 1909 bp each, are deposited in GenBank, under Accession Nos. AF004766 and AF131198, respectively.

Secondary Structure Model

Numbering of helices is based on van de Peer et al. (2000). Highly variable expansion regions of primary sequences and secondary structures among eukaryotes are often noted as E6, E9, E10, E23, or E41. The “E” refers to Eukaryota. Such expansion regions are absent from bacterial 16S and 16S-like rRNAs.

Helices 1 to 21 (Fig. 3, 18S rRNA model Part I). (1 and 2) No variation between peloridiid sequences and published 18S rRNA sequences of other insects. (3) Agrees with T. molitor model, but does not agree with A. pisum. The former model was chosen in accordance with Gutell (1994). (4) Same as the predicted structure for T. molitor, the strongest thermodynamically. (5) Each base in this helix is conserved in more than 95% of eukaryotes (Gutell, 1994). The retained arrangement is strongly supported. (6) In peloridiids this helix does not show great variation in length, unlike other eukaryotic expansion regions. Despite not expanding in length there is great variation between taxa in base composition of this helix. As a guiding aid, Gutell (1994) shows that the 1st base pairing of helix 6 should be a G·C pair. In the peloridiid model this G is at base position 54 and the C is at position 86. If mfold is constrained to predict the structure with this G·C pair at the base of this helix, the obtained structure for the peloridiid differs from the structure of helix 6 predicted for T. molitor (Hendriks et al., 1988b), but our configuration for this helix shows greater stability (≈3.7 kcal/mol; less free energy than in the T. molitor model) because of a greater number of G·C pairs. (7) A short helix with only 3 base pairings. These pairings are conserved throughout eukaryotes. (8) The base pairing distal to the second internal bulge of this helix shows a semicompensatory mutation replacing U·A in T. molitor by U·G in the peloridiids. The last base pairing is also a semicompensatory mutation replacing C·G in T. molitor by U·G in the peloridiids. (9) Only one base substitutional difference between T. molitor and the peloridiids occurs in this helix. An A (position 147, T. molitor) is replaced by C (144). This change has no effect on the structural configuration because this site is in an internal bulge. (10) This helix comprises one of the more variable regions of eukaryotic 18S rRNA. The structure proposed for T. molitor by Hendriks et al. (1988b) cannot be reconstructed using the peloridiid sequences. Only the thermodynamic approach is available to infer secondary structure until there are more insect sequences to take into consideration. Calculations using the Zuker–Steigler algorithm yield helix 10 and two subhelices enumerated as E10 1 and E10 2. Free-energy minimization also predicts similar two-subhelix models in other hemipteran 18S rRNAs, O. utahensis, P. spumarius, and L. hesperus, examined by us. A recently available model of this region (van de Peer et al., 2000) shows only 2 base pair differences compared with the model for D. melanogaster (Gutell, 1994). However, folding H. veitchi 18S RNA following the Gutell model ruptures a number of pairings and fails the “comparative sequence analysis” test. Thus, we opted to retain our more stably folded configuration. For comparison, helices E10 1 and E10 2 in Gutell’s model are provided (Fig. 3). (11) The terminal bulge of this hairpin loop is composed of five bases in T. molitor and six bases in the peloridiids. The character states of the bases in the loop differ between the two insects, but the overall configuration of this helix is the same in both. This indicates that constraints are less important in bulges than in stems. (12) The basal base pairings of this structure in peloridiids differ from the homologous helix of T. molitor, but the predicted structures in the 12 terminal pairings of the hairpin loop, the lateral bulge, and the terminal loop are the same for both taxa. The sequences of these taxa in this portion of the molecule differ by only two base changes in the terminal loop, which do not affect secondary structure. (13–15) These helices are conserved regions among all eukaryotes (Gutell, 1994) and retain the same configuration in the peloridiids. (16) The absence of a G at position 478 in the peloridiids, as opposed to its presence in T. molitor, transforms an internal bulge to a lateral bulge in the peloridiids. (17) No ambiguities exist in this helix since both thermodynamic results and comparative sequence analysis predict the same structures. (18) A congruent structure using the two folding techniques was not found. The structure with the least energy was retained. Only the final hairpin tetraloop UAAC and 4 pairings in the stem could be constructed in common. (19) The retained structure is the same as that predicted for the basal portion of helix 19 of T. molitor (Hendriks et al., 1988b). (20) The retention of this pseudoknot has been experimentally confirmed (Gutell et al., 1994). (21) The 4 pairs comprising this stem are presented by Hendriks et al. (1988b) in the apical part of what is defined as helix 19. (Another pseudoknot has been described as being located here, but has not been numbered; Gutell et al., 1994).

FIG. 2. Steps in folding of a helix combining techniques of thermodynamic folding and phylogenetic comparison. (A) First step, helix 48 of Hackeriella veitchi (Hemiptera, Prosorrhyncha, Peloridiidae) is folded using Zuker–Steigler MacDNASIS subprogram. (B) Second step, two compensatory substitutions are found in helices 48 of Lygus hesperus (Hemiptera, Prosorrhyncha, Heteroptera) and Pealius kelloggii (Hemiptera, Sternorrhyncha, Aleyrodoidea). (C) Third step, the four bases in the internal bulge of helix 48 in H. veitchi are forced to pair and render a final inference for the structural configuration of this helix.

FIG. 3. Proposed secondary structure model of 18S rRNA of Hackeriella veitchi, Part I: Helices 1 to 21.

Helices 22 to 29 (Fig. 4, 18S rRNA model Part II). (22) This helix consists of only 3 paired bases as predicted by Gutell (1994). The bases proposed to form helix 20 in the T. molitor model (Hendriks et al., 1988b) are now placed in helix 26 of the peloridiid model. (23) The structural configuration of this helix was inferred from thermodynamics alone. No consensus secondary structure was available from known sequences of other eukaryotes. (E23) This helix and subhelices generally incorporate the longest expansion region of eukaryotic 18S rRNA. Our model of the helix was constructed using mfold. Refinements of the mfold algorithm have improved its efficacy in rRNA folding, particularly for sequences longer than 50 bases. This particular structure is composed of eight subhelices designated in our model as E23 1 to E23 8. (24) The predicted secondary structures of this helix for the peloridiids and T. molitor are identical. This helix possesses a number of compensatory substitutions that signify structural integrity of the model. There is one compensatory substitution at the base of the large internal bulge, three semicompensatory substitutions, and one reinforcement of a noncanonical A·C base pair by a canonical A·U base pair (Fig. 6). Three substitutional differences occur in the large internal bulge that do not affect folding configuration. (25 and 26) Thermodynamic folding of peloridiid sequences and comparison with T. molitor results in the same structural configuration for these two helices. Only two substitutional differences occur in unpaired bases of these helices in these taxa. (27) This retained structure is the most stable, with a total secondary structure energy of −16 kcal/mol. (28–30) These three helices were folded using MacDNASIS and are more stable than those proposed for T. molitor by Hendriks et al. (1988b). (31) This helix was folded following the instructions of Leontis and Westhof (1998), who isolated a recursive motif in rRNAs using supportive information from chemical probes. According to these authors, helix 31 contains 2 noncanonical A·A base pairs.

Helices 32 to 50 (Fig. 5, 18S rRNA model Part III). (32) The configuration of this helix is energetically robust and concurs with secondary structure models proposed by Gutell et al. (1994) and Hendriks et al. (1988b). (33) In peloridiids this helix is 2 base pairs shorter than in the T. molitor model. The reduced number of base pairs lowers the degree of thermodynamic support of the longer helix. A shorter helix is also inferred in the Gutell model (1994) for D. melanogaster. (34) This helix conforms with the proposed model for T. molitor by Hendriks et al. (1988b). (35) The nucleotides in this helix show no variation between the taxa used in our study and a comparative sequence analysis was not possible. As such, the most thermodynamically stable structure is retained. (36) This structural configuration is as in Hendriks et al. (1988b), with the exception of the distal base pair in the stem. There is a semicompensatory substitution; a G·U (1273·1566) in T. molitor is replaced by a G·C (1311·1563) in the peloridiids. (37) This helix is constructed according to the T. molitor model. A compensatory substitution in this helix, a 4th pair G·C in the main stem of the peloridiids, is replaced by A·U in T. molitor. The one semicompensatory substitution is represented by the 10th pair G·U in the main stem replaced by G·C in T. molitor (Fig. 5). (38) The retained helix is similar to the structure proposed for T. molitor, but a C·G (1372·1548) is constrained to pair in our model, making it more thermodynamically stable than the model for T. molitor (−6.6 kcal/mol instead of −3.1 kcal/mol, respectively). (39–41) Folding of this region using the Zuker–Steigler algorithm created one hairpin loop. Since this result did not agree with the three-helix configuration proposed for Drosophila (Gutell, 1994) and T. molitor (Hendriks et al., 1988b) models, the more thermodynamically stable configuration proposed for T. molitor was retained (−2.8 kcal/mol compared to −0.3 kcal/mol). (42) This short helix consists of only 3 paired bases as in Hendriks et al. (1988b). (43) This helix is a component of an expansion region and is highly variable among insects. As such, we relied on thermodynamic folding to predict its configuration. (44) This helix was folded in concert with helix 41 using thermodynamic folding. (45 and 46) There are four substitutional differences distinguishing the peloridiids and T. molitor in this region, but these changes are found in bulges and do not alter configuration. (47) Our calculations for this helix (−11.5 kcal/mol) support the same optimal structure as in Hendriks et al. (1988b). (48) This helix possesses one compensatory substitution, a C·G (1708·1726) in T. molitor replaced by A·U (1701·1719) in the peloridiids. The helix is 4 base pairs longer in the proximal portion of the stem of the peloridiids than that proposed for T. molitor by Hendriks et al. (1988b). (49) Thermodynamic folding in this region infers one long hairpin loop with 131 base pairs, the longest helix of the entire 18S rRNA model. Variation between the sequences of T. molitor and the peloridiids is most evident in the distal portion of this helix. As such, the proposed models differ. The configuration proposed here is supported, however, by two compensatory and six semicompensatory substitutions (Fig. 5). Seven other substitutional changes between these taxa do not affect our proposed structure since all occur in bulges and terminal loops. If helix 49 proposed by Hendriks et al. (1988b) for T. molitor is constrained to fit the model based on the peloridiid, then the number of compensatory and semicompensatory substitutions is the same. Six of these substitutions do not alter configuration, but two noncompensatory substitutions nullify the structure for T. molitor. (50) This extremity of the 18S rRNA gene is conserved among eukaryotes (Gutell, 1994) and our thermodynamic-based results are the same as those published for the T. molitor model.

FIG. 4. Proposed secondary structure model of 18S rRNA of Hackeriella veitchi, Part II: Helices 22 to 31.
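The substitution categories used in the helix-by-helix comparison can be made explicit with a small sketch. The Python below is illustrative only and is not part of the original analysis. The function names and the exact category rules are assumptions inferred from the examples in the text, where a G·U replaced by G·C is called semicompensatory and a C·G replaced by A·U is called compensatory.

```python
# Hypothetical helper, not from the paper: labels the change at one paired
# stem position between two taxa, using the categories named in the text.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def pair_kind(pair):
    """Return 'canonical', 'wobble', or 'mismatch' for a (5', 3') base pair."""
    if pair in WATSON_CRICK:
        return "canonical"
    if pair in WOBBLE:
        return "wobble"
    return "mismatch"


def classify_substitution(pair_a, pair_b):
    """Classify the difference between homologous pairs in two taxa."""
    changed = sum(x != y for x, y in zip(pair_a, pair_b))
    if changed == 0:
        return "identical"
    if "mismatch" in (pair_kind(pair_a), pair_kind(pair_b)):
        # One taxon cannot form the pair, e.g. A-C in one taxon, A-U in the other.
        return "pairing gained or lost"
    if changed == 2:
        # Both bases differ and both taxa can still pair.
        return "compensatory"
    # Exactly one base differs; one side is necessarily a G-U wobble.
    return "semicompensatory"


# Helix 36: G-U in T. molitor versus G-C in the peloridiids.
print(classify_substitution(("G", "U"), ("G", "C")))
# Helix 48: C-G in T. molitor versus A-U in the peloridiids.
print(classify_substitution(("C", "G"), ("A", "U")))
```

Under these assumed rules, the A·C to A·U change described for helix 24 falls into the third category, which corresponds to what the text calls reinforcement of a noncanonical pair.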

Phylogenetic Implications

Procurement of full 18S rRNA sequences of peloridiids provided a means to assess phylogenetic affiliations of Coleorhyncha and robustness of support for Neohemiptera. A previous study, using partial 18S rRNA sequences (Campbell et al., 1995), inferred two most-parsimonious trees placing Prosorrhyncha either sister to Archaeorrhyncha (fulgoromorphs), forming clade Neohemiptera, or sister to Clypeorrhyncha (cicadomorphs). Similar results were reached by Bourgoin et al. (1997) but with clade Neohemiptera and Auchenorrhyncha equally supported, according to the outgroup. These studies did not include an 18S rDNA sequence of a peloridiid long enough to reveal the character state at the singular neohemipteran synapomorphic site found. In the full peloridiid sequences, base position 1439 is the homologous site of this neohemipteran synapomorphy. The character state of the peloridiids at this site is an A, as in other neohemipteran insects, as opposed to a G in nonneohemipteran insects. This synapomorphy suggests, based on this limited molecular evidence, that peloridiids are members of Neohemiptera.

FIG. 5. Proposed secondary structure model of 18S rRNA of Hackeriella veitchi, Part III: Helices 32 to 50.

FIG. 6. Helix 24 showing positions of compensatory substitutions of bases. (A) Hackeriella veitchi. (B) Tenebrio molitor.

However, a cladistic analysis taking into account full 18S rRNA sequences is necessary to confirm congruency of homologous and independent characters for inferring evolutionary affiliations among hemipteran taxa. Use of rRNA secondary structure enables assessment of such character independence. For example, presence of compensatory mutations shows that character state of a base may be dependent on the character state of a site elsewhere in the sequence. This issue of character independence is particularly striking in helix 11. In this helix unpaired bases in the terminal bulge are quite variable between T. molitor and peloridiids, whereas the primary sequences of these taxa are conserved in its stem. Some authors suggest treating bases differently in datasets. Bases in paired (stems) and unpaired sites could be differentiated by excluding one type (Wheeler and Honeycutt, 1988) or distinguishing them in a matrix to weight one set of characters (Smith, 1989). Intuitively, it could be argued that paired nucleotides forming stems could be considered as candidates for weighting by one-half. Such nucleotides are interdependent characters, as shown through compensatory mutations, and should not be assessed as possessing as strong a phylogenetic signal as independent characters. However, to score such characters with half a weight a priori is not satisfactory because the interdependence does not necessarily render such paired nucleotides with equal weights. Evolutionary studies of transitional states in compensatory mutated sites might indicate that one of these sites should be given a greater weight. The first mutation signals an important phylogenetic event, but a second compensatory mutation is necessary for maintaining the functional integrity of the ribosome. Hence, weighting the “responding” compensatory mutation more than the first mutation has justification.
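As a concrete illustration of the weighting schemes discussed above, the following Python sketch assigns a weight to every site from a secondary structure annotation. It is a minimal example under stated assumptions: the dot-bracket notation, the function name, and the default weights are not taken from this study, and the uniform one-half weight for stem sites is exactly the a priori scheme that the text argues is not fully satisfactory.

```python
def site_weights(dot_bracket, stem_weight=0.5, loop_weight=1.0):
    """Weight paired sites ('(' or ')') and unpaired sites ('.') differently.

    Hypothetical helper: the structure string is assumed to use plain
    dot-bracket notation without pseudoknot symbols.
    """
    weights = []
    for symbol in dot_bracket:
        if symbol in "()":
            weights.append(stem_weight)
        elif symbol == ".":
            weights.append(loop_weight)
        else:
            raise ValueError(f"unexpected structure symbol: {symbol!r}")
    return weights


# Toy hairpin: a 3-bp stem closing a 4-nt terminal loop.
print(site_weights("(((....)))"))
```

Giving the two partners of a pair unequal weights, as the closing argument of the paragraph suggests, would require a per-site rule rather than the single `stem_weight` used here.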

Using secondary structure as a guide to initially align homologous regions of rRNAs between taxa, followed by use of parsimony and direct optimization (MALIGN and POY), has been suggested as one approach to minimize evolutionary events between respective nucleotide sequences or parts of sequences (Gladstein and Wheeler, 1996). However, employing such iterative programs as POY and MALIGN may contradict homology based on secondary structure, the homology hypotheses being totally different in the two approaches. Homology proposed by aligning sequences using secondary structure could be jeopardized by a new homology hypothesis using the principle of parsimony to explain variations between these sequences.

Cladistic Analysis

Phylogenetic analysis of the full dataset portrays a basal Sternorrhyncha sister to Euhemiptera (sensu Zrzavy, 1992), the latter represented as a quadratomous node (Fig. 7). Whereas clades Archaeorrhyncha and Clypeorrhyncha are monophyletic, Coleorhyncha appears paraphyletic. There is little doubt that unique morphological features of peloridiids (Evans, 1963) establish their monophyly. This anomalous paraphyly of Peloridiidae using the full dataset possibly originated from use of a published partial sequence of Hemiodoecus leai, a peloridiid sequence consisting of approximately 450 bp (Wheeler et al., 1993). The other peloridiid taxa were represented by full sequences. This anomaly and the fact that sections of the full data matrix could only be aligned ambiguously (based on a criterion of sequence similarity) indicated that an analysis using a dataset of unambiguously aligned and conclusively homologous sites was important.

FIG. 7. Majority-rule consensus tree of full dataset based on a heuristic analysis (1000 bootstrap replicates) of informative sites. (Length = 927; CI = 0.540; HI = 0.460; RI = 0.530; RC = 0.287). Taxa abbreviations: Archaeorrhyncha: Sacu, Siphanta acuta (Flatidae, Accession No. U06476); Hsev, Hysteropterum severini (Issidae, U15214); Sfum, Scolops fumida (Dictyopharidae, U15216); Ohes, Oliarus hesperinus (Cixiidae, U15215); Pmar, Prokelisia marginata (Delphacidae, U09207); Prosorrhyncha: Arem, Aquarius remigis (Gerridae, U15691); Ofas, Oncopeltus fasciatus (Lygaeidae, U15188); Coleorhyncha (boldface, underlined): Hlea, Hemiodoecus leai (Peloridiidae) (partial sequence from Wheeler et al., 1993); Hvei, Hackeriella veitchi (Peloridiidae, AF004766); Hwil, Hemiowoodwardia wilsoni (Peloridiidae, AF131198); Clypeorrhyncha: Outa, Okanagana utahensis (Cicadidae, U06478); Ppla, Prosapia plagiata (Cercopidae, U16264); Pspu, Philaenus spumarius (Cercopidae, U06480); Sfes, Spissistilus festinus (Membracidae, U06477); Evar, Euscelidius variegatus (Cicadellidae, U15148); Gatr, Graphocephala atropunctata (Cicadellidae, U15213); Sternorrhyncha: Teug, Trioza eugeniae (Psyllidae, U06482); Pkel, Pealius kelloggi (Aleyrodidae, U06479); Aaur, Aonidiella aurantii (Diaspididae, U06475); Apis, Acyrthosiphon pisum (Aphididae, X62623); Outgroup: Tmol, Tenebrio molitor, X07801.

Next, sites from the data matrix that were ambiguously aligned, because of either extensive homoplasy between all taxa (saturation) or presence of indels, were excluded. This treatment resulted in an attenuated dataset. The analysis of this dataset rendered a consensus tree with four distinct hemipteran lineages (Fig. 8). Sternorrhyncha was inferred sister to Euhemiptera that was represented as a trichotomy of archaeorrhynchan, prosorrhynchan, and clypeorrhynchan lineages. The peloridiids were grouped in a clade (Coleorhyncha) sister to Heteroptera, thus forming clade Prosorrhyncha. The trichotomous euhemipteran node was still unresolved.
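One part of the attenuation step, exclusion of columns that contain indels, is simple enough to sketch. The Python below is a hedged illustration rather than the procedure actually used: the function name, the gap symbol, and the toy alignment are assumptions, and the saturation criterion mentioned in the text is not modelled because it depends on judgments about homoplasy that the text does not specify.

```python
def drop_gapped_columns(alignment, gap="-"):
    """Remove every alignment column in which at least one taxon has a gap.

    `alignment` maps taxon labels to aligned sequences of equal length.
    Returns the retained column indices and the filtered alignment.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) > 1:
        raise ValueError("all aligned sequences must have the same length")
    length = lengths.pop() if lengths else 0
    kept = [i for i in range(length)
            if all(seq[i] != gap for seq in alignment.values())]
    filtered = {name: "".join(seq[i] for i in kept)
                for name, seq in alignment.items()}
    return kept, filtered


# Toy data only; the labels are placeholders, not sequences from the study.
toy = {"taxon1": "AC-GU", "taxon2": "ACGGU", "taxon3": "A-GGU"}
kept, filtered = drop_gapped_columns(toy)
print(kept, filtered)
```

Returning the retained indices alongside the sequences keeps the link between the attenuated matrix and the original site numbering, which matters when sites are later mapped onto helices.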

To resolve the trichotomous node, homoplasious sites among outgroup taxa (relative to euhemipterans) were removed. This was achieved by first excluding sites having different character states in the psyllid (the basal sternorrhynchan lineage) and T. molitor. Then, sites homoplasious within Sternorrhyncha (the most basal hemipteran lineage) were excluded from the data matrix. These filtrations of characters resulted in retention of only symplesiomorphic sites of a theoretical basal lineage represented by the psyllid as outgroup to euhemipterans. This dataset was termed the polarized dataset. Analysis of this dataset resulted in six most-parsimonious trees (Fig. 9A). Half of these trees supported Neohemiptera and half supported an arrangement in which Prosorrhyncha and Clypeorrhyncha were sisters; the consensus being an unresolved phylogeny for Euhemiptera (Fig. 9B). To assess nodal decay (Bremer, 1994), trees one and two steps longer than the most-parsimonious trees were evaluated. A consensus of these 3528 longer trees did not support a monophyletic Neohemiptera, but recognized a Prosorrhyncha + Clypeorrhyncha clade (Fig. 9C). Interestingly, such a prosorrhynchan + clypeorrhynchan arrangement agrees with Bourgoin et al. (1999) and with paleoentomological interpretations proposed by Shcherbakov (1988).
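The first filtration used to build the polarized dataset, retaining only sites at which the psyllid and T. molitor share a character state, can be sketched as follows. This is an assumed, simplified rendering: the function name and the toy sequences are hypothetical, the taxon labels merely reuse the abbreviations of Fig. 7, and the second filtration (removal of sites homoplasious within Sternorrhyncha) is left out because it cannot be expressed without a tree.

```python
def outgroup_consistent_sites(alignment, taxon_a, taxon_b):
    """Return indices of columns where two reference taxa have the same state."""
    seq_a, seq_b = alignment[taxon_a], alignment[taxon_b]
    if len(seq_a) != len(seq_b):
        raise ValueError("reference sequences must be aligned to equal length")
    return [i for i, (x, y) in enumerate(zip(seq_a, seq_b)) if x == y]


# Toy sequences, invented for illustration only.
toy = {"Teug": "ACGUA", "Tmol": "ACCUA", "Hvei": "GCGUA"}
print(outgroup_consistent_sites(toy, "Teug", "Tmol"))
```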

A monophyletic Prosorrhyncha was supported in all trees and was represented in the data matrix by two synapomorphic sites. These synapomorphies are actually compensatory substitutions in the stem of helix 37 as an A·U (1321·1358) in Prosorrhyncha compared to a G·C in other insects. Whereas Neohemiptera has one supporting synapomorphic site in a short conserved portion of the 18S rRNA molecule (and retained in the polarized dataset), the potential affiliation of Prosorrhyncha + Clypeorrhyncha is supported by two synapomorphic sites retained in the full dataset. However, these sites were excluded from the other datasets because they were bracketed by sections of the sequence that were ambiguously aligned. The two sites are paired in the stem of helix E23-1 as an A·U (663·706) in the thermodynamically computed model of the variable E23 region. These homologous positions are replaced by a G·C in other insects (except in A. pisum where they are an A·C). The only synapomorphy supporting clade Neohemiptera is situated in a lateral bulge in helix 43 (site 1439, Fig. 5).

At this point, affiliations of the three clades of Euhemiptera are unresolved, despite increasing the number of taxa and incorporating full peloridiid sequences. Construction of a new matrix taking into account secondary structures of all Hemiptera could resolve this trichotomy (T. Bourgoin et al., unpublished). Such structurally based matrices should use models constructed with recent biochemical and molecular biological data and not simply primary sequences aligned using rather limited algorithms (De Rijk and De Wachter, 1993). Secondary structure models should be broadly applied in a hemipteran data matrix to avoid arbitrary loss of phylogenetic (nonhomoplasious) information, as was possibly the case in our efforts to purge disorder in the attenuated and polarized data matrices. The method employed by Sorensen et al. (1995) and applied in Campbell et al. (1995), Bourgoin et al. (1997), and in this study eliminated any biases introduced from ambiguously aligned regions in the full dataset, rendering an attenuated dataset. However, construction of the polarized dataset from the attenuated dataset using only symplesiomorphic sites in the outgroup led to a “generalized hypothetical Sternorrhyncha polarization” (or “sternorrhynchan archetype polarization”). Such a polarization cannot define a strict outgroup comparison according to Hennig (1966).

FIG. 8. Majority-rule consensus tree of attenuated dataset based on a branch-and-bound analysis (1000 bootstrap replicates) of informative sites. (Length = 182; CI = 0.599; HI = 0.401; RI = 0.678; RC = 0.406). Numbers at nodes represent level of bootstrap support. Taxa abbreviations as in Fig. 7.

FIG. 9. Trees resulting from branch-and-bound analysis of the polarized hemipteran data matrix. (A) Six most-parsimonious trees showing equal inference for Neohemiptera (Prosorrhyncha + Archaeorrhyncha) and Prosorrhyncha + Clypeorrhyncha. (Length = 35; CI = 0.800; HI = 0.200; RI = 0.905; RC = 0.724). (B) Majority-rule consensus of the six trees. (C) Majority-rule consensus of 3528 trees one and two steps longer than the most-parsimonious trees.
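The tree statistics quoted in the captions of Figs. 7 to 9 are linked by two standard identities: the homoplasy index is the complement of the consistency index (HI = 1 − CI), and the rescaled consistency index is the product of the consistency and retention indices (RC = CI × RI). The short Python check below is an added illustration with a hypothetical function name; it only recomputes HI and RC from the rounded CI and RI values printed in the captions.

```python
def derived_indices(ci, ri):
    """Return (HI, RC) implied by a consistency index and a retention index."""
    homoplasy_index = 1.0 - ci   # HI = 1 - CI
    rescaled_ci = ci * ri        # RC = CI * RI
    return round(homoplasy_index, 3), round(rescaled_ci, 3)


# Values transcribed from the captions of Figs. 7, 8, and 9A: (CI, RI, HI, RC).
REPORTED = {
    "Fig. 7": (0.540, 0.530, 0.460, 0.287),
    "Fig. 8": (0.599, 0.678, 0.401, 0.406),
    "Fig. 9A": (0.800, 0.905, 0.200, 0.724),
}

for figure, (ci, ri, hi, rc) in REPORTED.items():
    print(figure, "recomputed:", derived_indices(ci, ri), "reported:", (hi, rc))
```

For Fig. 7 the recomputed RC is 0.286 against the reported 0.287, a difference of the size expected when already-rounded CI and RI values are multiplied.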

Proposed Model

Use of thermodynamic algorithms alone to infer secondary structure can result in erroneous interpretations. However, this method has the advantage of being rapid, with duration of analysis limited only by CPU speed to run the algorithms. To achieve improvements, phylogenetic, molecular biological, and physical chemistry data must be further assessed. An improvement in these algorithms is already recognizable in comparing the Zuker–Steigler algorithm (1981) with upgraded versions incorporating the latest refinements on the mfold server. We found the recent mfold algorithms to agree most frequently with those from comparative sequence analysis. As suggested by Gutell (1993), a “phylogenetic events” factor should be incorporated into folding algorithms so that base pairing can be manually constrained, or conversely remain unpaired in bulges or terminal loops, to enable phylogenetic comparisons.
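To show where a manual constraint of the kind suggested by Gutell (1993) enters a folding recursion, the sketch below uses simple base-pair maximization (a Nussinov-style dynamic program). It is deliberately not the Zuker–Steigler free-energy algorithm or mfold: it has no energy parameters and only counts nested pairs. The function name, the `unpaired` argument, the minimum loop size, and the toy sequence are all assumptions made for illustration.

```python
PAIRABLE = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_pairs(seq, min_loop=3, unpaired=frozenset()):
    """Maximum number of nested base pairs in `seq`.

    Positions (0-based) listed in `unpaired` are never allowed to pair,
    mimicking a user-imposed constraint. At least `min_loop` bases must
    separate the two partners of any pair.
    """
    n = len(seq)
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]                 # case 1: j stays unpaired
            if j not in unpaired:
                for k in range(i, j - min_loop):   # case 2: j pairs with k
                    if k in unpaired or (seq[k], seq[j]) not in PAIRABLE:
                        continue
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1] if n else 0


hairpin = "GGGAAAUCC"
print(max_pairs(hairpin))                 # unconstrained
print(max_pairs(hairpin, unpaired={2}))   # third base forced to stay unpaired
```

In a free-energy method the same idea applies, with the pair count replaced by an energy sum; forbidding or forcing a pair simply removes or fixes terms in the recursion.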

However, examination of compensatory substitutions can furnish fundamental information on several helices. Compensatory substitutions are directed changes that reflect constraints applied to the two- or three-dimensional configuration of the 18S rRNA molecule. Phylogenetic studies overlooking secondary structure potentially lose valuable phylogenetic information inherent to the structural constraints of the molecule. Comparison of primary sequences might reveal portions undergoing a high rate of evolution, but preservation of a particular helix in such regions would show that functional constraints exist. For example, when helix 48 is compared among five arthropods there are five compensatory mutations in the 11 pairs forming its stem (Fig. 10). Of course, the number of compensatory mutations would increase if a larger number of phylogenetically distant taxa were added, but comparison of more closely related taxa furnishes better evidence for confirming helix structure because the risk of comparing nonhomologous events is reduced.

The evolutionary dynamics of accumulating compensatory mutations is believed to first involve a noncompensatory mutation that may become fixed in a population through molecular drive (depending on number of deleterious changes in copies of the molecule). This is followed, at some later time, by a selective compensatory substitution that restores robustness of a structure and, therefore, fitness of descendants. Hancock et al. (1988) consider that hydrogen bonding between G·U pairs, while suboptimal, maintains a structure at an intermediate level prior to an eventual compensatory mutation. Rousset et al. (1991) show that G·U pairs are deleterious in some positions, but also may be advantageous in other positions where they become fixed and no compensatory mutation is required to transform a G·U to an A·U pairing. If G·U pairings did not exist in several different intermediate lineages, then it is likely that lack of a compensatory mutation was not a selective disadvantage. At the species level certain helices may be constant in length, number, and position of bulges along a stem and length of terminal loop.
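The role of G·U as an intermediate can be illustrated by enumerating two-step routes between pairs. The Python sketch below is an added example, not part of the original study; the function name is hypothetical, and it assumes that each step changes exactly one base and that only Watson-Crick and G·U pairs count as able to pair.

```python
from itertools import product

PAIRABLE = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def pairing_intermediates(start, end):
    """Pairs reachable from `start` by one base change, and one base change
    away from `end`, that can still form a base pair."""
    found = []
    for mid in product("ACGU", repeat=2):
        step1 = sum(a != b for a, b in zip(start, mid))
        step2 = sum(a != b for a, b in zip(mid, end))
        if step1 == 1 and step2 == 1 and mid in PAIRABLE:
            found.append(mid)
    return found


print(pairing_intermediates(("G", "C"), ("A", "U")))  # [('G', 'U')]
print(pairing_intermediates(("A", "U"), ("U", "A")))  # []
```

The first call recovers the G·C to G·U to A·U route described above; the second shows that some compensatory changes have no pairing intermediate at all under these assumptions, so the helix would have to pass through a mismatch.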

Our analysis of 18S rRNA secondary structure using peloridiid sequences reflects the fact that stabilizing factors for the molecule are not completely known. The molecular biological environment in the ribosome probably contributes to configuration of ribosomal RNAs in some yet to be learned fashion and there are likely subtle differences between the in vivo structure and one predicted by current least free-energy algorithms. For example, intertaxon variability of helices 6, E10, E23, and 43 suggests that there is no widely shared structural conformation of these helices. Closely related taxa might have sequences in these helices that are incompatible with folding in the same configuration. This absence of overall similarity could indicate presence of specific features that are recognition sites for associated proteins or RNA–RNA interactions. This was recently demonstrated in helix 23 of prokaryotes for which in vivo structural variation reflected the flexibility of the helix to interact with ribosomal proteins S8 and S17 to maintain functional integrity (Clemons et al., 1999). Thus, such variable helices might result in structural configurations that are not necessarily congruent with phylogeny. Studies on translation processing could identify key zones maintaining normal molecular biological functions of the ribosome that reflect evolutionary patterns. A recent report showed how functionality of secondary structure in variable helices of small subunit ribosomal RNA in prokaryotes is affected by tertiary forces (Garrett, 1999; Clemons et al., 1999). Such associations between functionality and molecular forces could further impact our philosophy of character treatment in rRNA datasets.

FIG. 10. Helix 48 showing positions of five different compensatory mutations among hemipterans and a crustacean.

Finally, a potential field of interest concerns use of some structures to characterize and distinguish groups of taxa: these structures are “signatures” as defined in Winker and Woese (1991). The 18S rRNAs of several insects show a morpho-molecular feature in helix 24 which could be interpreted as a synapomorphy for the Fulgoromorpha (Archaeorrhyncha). A mispairing after the third pair is present in the five studied species of fulgoromorphs belonging to five different families, but is replaced by a conventional base pair in other Hemiptera studied. Such homologous structures are numerous enough in the long rRNA molecule to show variation and may be used as independent characters in a cladistic analysis.

CONCLUSION

When inferring evolutionary affiliations using molecular phylogenetics, it is important to recognize particular structural characteristics of molecules if these molecules are associated with significant functional processes. The results presented in this paper emphasize the importance of taking into account molecular changes that affect folding of 18S rRNA and possibly its function. Appreciation of such factors that affect folding, and formation of various helices and their substructures, improves ability to infer homology of character, whether in a broad sense as a particular structure or acutely as individual nucleotides. In addition, by taking into account particular aspects of secondary and tertiary structures, one is in a better position to visualize where distorted weighting of a phylogenetic signal might have occurred within a dataset. Moreover, one could use structural homology of features within substructures of the molecule as characters, similar to a morphological treatment, and analyze this dataset independently or in conjunction with a nucleotide dataset. The next step following such assessment of secondary structural elements for the current hemipteran dataset is translating these secondary structural elements into characters for a cladistic analysis to provide more robustness of phylogenetic signals (T. Bourgoin et al., unpublished).

ACKNOWLEDGMENTS

We greatly appreciate the efforts of G. B. and S. R. Monteith, Queensland Museum, Brisbane, Queensland, Australia in collecting and providing specimens of the peloridiids used in this study. The authors thank C. D’Haese for providing documentation on MALIGN and POY software and for useful discussion and K. Kjer and an anonymous reviewer for their thorough and helpful reviews of the manuscript.

REFERENCES

Bourgoin, T. (1993). Female genitalia in Hemiptera Fulgoromorpha, morphological and phylogenetic data. Ann. Soc. Entomol. Fr. (N.S.) 29: 225–244.

Bourgoin, T. (1996). Phylogenie des hexapodes. La recherche des synapomorphies ne fait pas toujours le cladisme! Bull. Soc. Zool. Fr. 121: 5–20.

Bourgoin, T., Steffen-Campbell, J. D., and Campbell, B. C. (1997). Molecular phylogeny of Fulgoromorpha (Insecta, Hemiptera, Archaeorrhyncha). The enigmatic Tettigometridae: Evolutionary affiliations and historical biogeography. Cladistics 13: 207–224.

Bourgoin, T., Chan, K., Ouvrard, D., and Campbell, B. C. (1999). Character Analysis Using Ribosomal RNA Secondary Structure and Its Application to Hemiptera and Fulgoromorpha Phylogeny. 10th International Auchenorrhyncha Congress, Cardiff.

Breddin, G. (1897). Hemipteren. In “Ergebnisse der Hamburger Magalhaensischen Sammelreise 1892/93. II. Band. Arthropoden” (Naturhistorischen Museum zu Hamburg, Ed.), pp. 10–13. Friederichsen, Hamburg.

Bremer, K. (1994). Branch support and tree stability. Cladistics 10: 295–304.

Burckhardt, D., and Agosti, D. (1991). New records of South American Peloridiidae (Homoptera: Coleorrhyncha). Rev. Chilena Entomol. 19: 71–75.

Campbell, B. C., Steffen-Campbell, J. D., Sorensen, J. T., and Gill, R. J. (1995). Paraphyly of Homoptera and Auchenorrhyncha inferred from 18S rDNA nucleotide sequences. Syst. Entomol. 20: 175–194.

Chastain, M., and Tinoco, I., Jr. (1991). Structural elements in RNA. Prog. Nucleic Acid Res. Mol. Biol. 41: 131–177.

China, W. E. (1962). South American Peloridiidae (Hemiptera–Homoptera: Coleorrhyncha). Trans. R. Entomol. Soc. Lond. 114: 131–161.

Clemons, W. M., Jr., May, J. L. C., Wimberly, B. T., McCutcheon, J. P., Capel, M. S., and Ramakrishnan, V. (1999). Structure of a bacterial 30S ribosomal subunit at 5.5 Å resolution. Nature 400: 833–840.

Cobben, R. H. (1978). Evolutionary trends in Heteroptera. Part II. Mouthpart-structures and feeding strategies. Mededelingen Landbouwhogeschool Wageningen Nederland 78–5, 407 pp.

Corpet, F., and Michot, B. (1994). RNAlign program: Alignment of RNA sequences using both primary and secondary structures. Comput. Appl. Biosci. 10: 389–399.

De Pinna, M. C. (1991). Concepts and tests of homology in the cladistic paradigm. Cladistics 7: 367–394.

De Rijk, P., and De Wachter, R. (1993). DCSE, an interactive tool for sequence alignment and secondary structure research. Comput. Appl. Biosci. 9: 735–740.

Evans, J. W. (1963). The phylogeny of Homoptera. Annu. Rev. Entomol. 8: 77–94.

Evans, J. W. (1981). A review of present knowledge of the family Peloridiidae and new genera and new species from New Zealand and New Caledonia (Hemiptera: Insecta). Rec. Austral. Mus. 34: 381–406.

Fields, D. S., and Gutell, R. R. (1996). An analysis of large RNA sequences folded by a thermodynamic method. Folding Design 1: 419–430.

Garrett, R. (1999). Mechanics of the ribosome. Nature 400: 811–812.

Gladstein, D. S., and Wheeler, W. C. (1997). “POY: Phylogeny Reconstruction via Direct Optimization,” anon ftp.amnh.org/pub/wheeler/programs/poy. American Museum of Natural History, New York.

Gutell, R. R. (1993). Comparative studies of RNA: Inferring higher-order structure from patterns of sequence variation. Curr. Opin. Struct. Biol. 3: 313–322.

Gutell, R. R. (1994). Collection of small subunit (16S- and 16S-like) ribosomal RNA structures. Nucleic Acids Res. 22: 3502–3507.

Gutell, R. R., Larsen, N., and Woese, C. R. (1994). Lessons from an evolving RNA: 16S and 23S RNA structures from a comparative perspective. Microbiol. Rev. 58: 10–26.

Hancock, J. M., Tautz, D., and Dover, G. A. (1988). Evolution of the secondary structures and compensatory mutations of the ribosomal RNAs of Drosophila melanogaster. Mol. Biol. Evol. 5: 393–414.

Hendriks, L., Van Broeckhoven, C., Vandenberghe, A., Van De Peer, Y., and De Wachter, R. (1988a). Primary and secondary structure of the 18S ribosomal RNA of the bird spider Eurypelma californica and evolutionary relationships among eukaryotic phyla. Eur. J. Biochem. 177: 15–20.

Hendriks, L., De Baere, R., Van Broeckhoven, C., and De Wachter, R. (1988b). Primary and secondary structure of the 18S ribosomal RNA of the insect species Tenebrio molitor. FEBS Lett. 232: 115–120.

Hennig, W. (1966). “Phylogenetic Systematics,” Univ. of Illinois Press, Urbana.

Jaeger, J. A., Turner, D. H., and Zuker, M. (1989). Improved predictions of secondary structures for RNA. Proc. Natl. Acad. Sci. USA 86: 7706–7710.

Kjer, K. M. (1995). Use of rRNA secondary structure in phylogenetic studies to identify homologous positions: An example of alignment and data presentation from the frogs. Mol. Phylogenet. Evol. 4: 314–330.

Konings, D. A. M., and Hogeweg, P. (1989). Pattern analysis of RNA secondary structure: Similarity and consensus of minimal-energy folding. J. Mol. Biol. 207: 597–614.

Konings, D. A. M., and Gutell, R. R. (1995). A comparison of thermodynamic foldings with comparatively derived structures of 16S and 16S-like rRNAs. RNA 1: 559–574.

Kwon, O-Y., Ogino, K., and Ishikawa, H. (1991). The longest 18S ribosomal RNA ever known: Nucleotide sequence and presumed secondary structure of the 18S RNA of the pea aphid, Acyrthosiphon pisum. Eur. J. Biochem. 202: 827–833.

Leontis, N. B., and Westhof, E. (1998). A common motif organizes the structure of multi-helix loops in 16S and 23S ribosomal RNAs. J. Mol. Biol. 283: 571–583.

Mathews, D. H., Sabina, J., Zuker, M., and Turner, D. H. (1999). Expanded sequence dependence of thermodynamic parameters provides robust prediction of RNA secondary structure. J. Mol. Biol. 288: 911–940.

Myers, J. G., and China, W. E. (1929). The systematic position of the Peloridiidae as elucidated by a further study of the external anatomy of Hemiodoecus leai China. Ann. Mag. Nat. Hist. 105: 282–294.

Nelles, L., Fang, B-L., Volckaert, G., Vandenberghe, A., and De Wachter, R. (1984). Nucleotide sequence of a crustacean 18S ribosomal RNA gene and secondary structure of eukaryotic small subunit ribosomal RNAs. Nucleic Acids Res. 12: 8749–8768.

Noller, H. F. (1984). Structure of the ribosomal RNA. Annu. Rev. Biochem. 53: 119–162.

Patterson, C. (1982). Morphological characters and homology. In “Problems of Phylogenetic Reconstruction” (K. A. Joysey and A. F. Friday, Eds.), pp. 21–74. Academic Press, London.

Remane, A. (1956). “Die Grundlagen des Natürlichen Systems des Vergleichenden Anatomie und der Phylogenetik,” Akademische Verlagsgesellschaft Geest & Portig, Leipzig.

Rousset, F., Pelandakis, M., and Solignac, M. (1991). Evolution of compensatory substitutions through G·U intermediate state in Drosophila RNA. Proc. Natl. Acad. Sci. USA 88: 10032–10036.

Saiki, R. K., Gelfand, D. H., Stoffel, S., Scharf, S. J., Higuchi, R., Horn, G. T., Mullis, K. B., and Erlich, H. A. (1988). Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239: 487–491.

Sanger, F., Nicklen, S., and Coulson, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74: 5463–5467.

Schlee, D. (1969). Morphologie und Symbiose: Ihre Beweiskraft für die Verwandtschaftsbeziehungen der Coleorrhyncha. Stuttg. Beitr. Naturk. 210: 1–27.

Schuh, R. T., and Slater, J. A. (1995). “True Bugs of the World (Hemiptera: Heteroptera): Classification and Natural History,” Cornell Univ. Press, Ithaca, NY.

Shcherbakov, D. E. (1988). Origin and evolution of Auchenorrhyncha (Homoptera) based upon fossil evidence. Proc. 28th Int. Congr. Entomol., p. 8.

Smith, A. B. (1989). RNA sequence data in phylogenetic reconstruction: Testing the limits of its resolution. Cladistics 5: 321–344.

Sorensen, J. T., Campbell, B. C., Gill, R. J., and Steffen-Campbell, J. D. (1995). Non-monophyly of Auchenorrhyncha (“Homoptera”), based upon 18S rDNA phylogeny: Eco-evolutionary and cladistic implications within pre-Heteropterodea Hemiptera (S. L.) and a proposal for new monophyletic suborders. Pan-Pacific Entomol. 71: 31–60.

Swofford, D. L. (1998). PAUP*. “Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4,” Sinauer, Sunderland, MA.

Thompson, J. D., Higgins, D. G., and Gibson, T. J. (1994). CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position specific gap penalties and weight matrix choice. Nucleic Acids Res. 22: 4673–4680.

van de Peer, Y., De Rijk, P., Wuyts, J., Winkelmans, T., and De Wachter, R. (2000). The European Small Subunit Ribosomal RNA database. Nucleic Acids Res. 28: 175–176.

von Dohlen, C. D., and Moran, N. A. (1995). Molecular phylogeny of the Homoptera: A paraphyletic taxon. J. Mol. Evol. 41: 211–223.

Wheeler, W. C., and Gladstein, D. S. (1994). MALIGN: A multiple sequence alignment program. J. Hered. 85: 417.

Wheeler, W. C., and Honeycutt, R. L. (1988). Paired sequence difference in ribosomal RNAs: Evolutionary and phylogenetic implications. Mol. Biol. Evol. 5: 90–96.

Wheeler, W. C., Schuh, R. T., and Bang, R. (1993). Cladistic relationships among higher groups of Heteroptera: Congruence between morphological and molecular data sets. Entomol. Scand. 24: 121–137.

Winker, S., and Woese, C. R. (1991). A definition of the domains Archaea, Bacteria and Eucarya in terms of small subunit ribosomal RNA characteristics. Syst. Appl. Microbiol. 14: 305–310.

Woese, C. R. (1987). Bacterial evolution. Microbiol. Rev. 51: 221–271.

Zrzavy, J. (1992). Evolution of antennae and historical ecology of the hemipteran insects (Paraneoptera). Acta Entomol. Bohem. 89: 77–86.

Zuker, M., Mathews, D. H., and Turner, D. H. (1999). Algorithms and thermodynamics for RNA secondary structure prediction: A practical guide. In “RNA Biochemistry and Biotechnology” (J. Barciszewski and B. F. C. Clark, Eds.), NATO ASI Series, Kluwer Academic, Dordrecht.

Zuker, M., and Steigler, P. (1981). Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. Nucleic Acids Res. 9: 133–148.
