Using Exon and Intron Sequences of the Gene Mp20 to Resolve Basal Relationships in Cicindela...

Syst. Biol. 53(4):554–570, 2004
Copyright © Society of Systematic Biologists
ISSN: 1063-5157 print / 1076-836X online
DOI: 10.1080/10635150490472940

Using Exon and Intron Sequences of the Gene Mp20 to Resolve Basal Relationships in Cicindela (Coleoptera: Cicindelidae)

JOAN PONS,1,2 TIMOTHY G. BARRACLOUGH,2 KOSMAS THEODORIDES,1,2 ANABELA CARDOSO,1,3 AND ALFRIED P. VOGLER1,2

1 Department of Entomology, The Natural History Museum, London SW7 5BD, United Kingdom; E-mail: [email protected] (J.P.)

2 Department of Biological Sciences, Imperial College London, Silwood Park Campus, Ascot, Berkshire SL5 7PY, United Kingdom

3 Faculdade de Ciências da Universidade de Lisboa, Departamento de Zoologia e Antropologia e Centro de Biologia Ambiental, Rua Ernesto Vasconcelos, 1746-016 Campo Grande, Lisboa, Portugal

Abstract. The genus Cicindela (Coleoptera: Cicindelidae) is a species-rich cosmopolitan group of tiger beetles useful for comparing clade diversification worldwide. Knowledge about relationships of major groups is important for this analysis, but basal nodes in Cicindela have been difficult to resolve with standard mtDNA markers. Here we developed the Mp20 gene, a single-copy nuclear marker coding for a muscle-associated protein in insects, for phylogenetic analysis of basal groups of Cicindela. Nearly full-length sequences were obtained for 51 cicindelids, including major taxonomic groups from all continents. Sequences of Mp20 were between 1.2 and 1.7 kb long and spanned three introns. Phylogenetic signal of exon and intron sequences was compared with that from four gene regions of mtDNA (COI, COIII, Cytb, 16S rRNA; 2.4 kb total). Because introns differed in length, sequence alignment was conducted using various procedures of phenetic and parsimony-based character coding of indels to assess their phylogenetic information content, but major nodes were recovered consistently. Mp20 sequences contributed two thirds of the total support of the combined analysis, with most signal from the introns. We found major clades of Cicindela to be geographically largely coincident with continental regions, confined to Australasia, the Holarctic, the Indian subcontinent, Africa, and South and Central America. Clock estimates using various maximum-likelihood (ML) branch length calculations resulted in roughly similar divergence times whether Mp20 exon, introns, or mtDNA were used, and they were not greatly affected by different procedures for coding and optimizing indel characters. Based on existing clock calibrations in Cicindela, basal splits of continental lineages occurred in the mid-Miocene, placing the radiation of basal groups of Cicindela in a period when their open-vegetation habitats expanded globally.
[Congruence; DNA sequence alignment; ESTs; Miocene; molecular clock; single copy genes.]

Several recent studies have demonstrated the power of single-copy nuclear markers in molecular systematics, but for many groups the choice of such markers remains limited. In the Coleoptera (beetles), only a handful of nuclear markers have been used, including elongation factor-1alpha (Cognato and Vogler, 2001; Jordal, 2002), wingless (Ober, 2002), enolase (Farrell et al., 2001), and phosphoenolpyruvate carboxykinase (Sota and Vogler, 2003). As new gene sequences become available for polymerase chain reaction (PCR) primer design, additional markers can be tested for amplification and phylogenetic information content. Most nuclear markers contain intron sequences with a typically higher rate of sequence evolution and greater length variation, and these might provide different kinds of character variation and extend the range of hierarchical levels where a marker can potentially be useful (e.g., Hillis and Dixon, 1991).

Here we employed the Mp20 locus, encoding a muscle-associated protein of about 20 kD (Ayme-Southgate et al., 1989). The Mp20 gene to date has only been described in D. melanogaster, where it is located in a single position at polytene region 49F 9–13 (Ayme-Southgate et al., 1989). A close match of the Mp20 coding sequence was obtained repeatedly in cDNA libraries from taxonomically divergent species of Coleoptera (Theodorides et al., 2002), and hence sequence information was available for primer design. We used these primers to investigate relationships in tiger beetles (Cicindelidae, Coleoptera) of the genus Cicindela. This genus represents a spectacular worldwide radiation of nearly 1000 species. The beetles are very effective predators of small arthropods and have diversified in open habitats such as grasslands, salt flats, sand dunes, and river edges (Pearson and Vogler, 2001). We are specifically interested in questions about the factors promoting species diversification, in comparisons of subclades that are geographically confined to different continents (Barraclough and Vogler, 2002).

The genus Cicindela s.l. (subtribe Cicindelina of some authors) has been subdivided into some 55 subgenera by Rivalier (1950–1963), mainly based on male genitalic characters. However, Rivalier’s work did not primarily attempt to establish relationships between these groups, and because he treated the major continental regions separately, the basal relationships in Cicindela were not addressed in much detail. Similarly, conventional mtDNA markers failed to resolve deeper relationships between subgenera, although they have been successful in resolving relationships within some of the North American species groups (Vogler and Kelley, 1998; Vogler et al., 2004). The poor understanding of basal relationships in Cicindela leaves open many questions about the early radiation and interchange between major biogeographic regions, and prohibits age estimations of deep branches based on molecular clocks.

The goal of this paper is to test the phylogenetic information content of exon and intron sequences of Mp20, in comparison to the better established mtDNA markers, for resolving relationships among the subgenera of Cicindela s.l. and estimating an evolutionary time frame for their diversification. The Mp20 gene, combined with four partial mtDNA genes, provided a well-supported tree of basal lineages of Cicindela. Intron sequences in Mp20 played an important role in establishing tree topology and clock estimations.

Downloaded from http://sysbio.oxfordjournals.org/ by guest on December 9, 2014

FIGURE 1. Schematic representation of the Mp20 gene. Exons are represented by boxes, and introns by grey double-headed arrows. The positions of the primers are indicated with dark arrows. The length of the regions amplified is indicated in base pairs (bp), with a minimum and maximum number for the taxa sequenced in this study.

MATERIAL AND METHODS

Taxon Sampling and DNA Procedures

This study included 47 species from 24 subgenera of Cicindela s.l. plus four outgroups (Appendix). We selected representatives of major continental groups, including the four independent North American radiations (Cicindela s. str., Cicindelidia, Ellipsoptera, Habroscelimorpha) (Barraclough and Vogler, 2002) plus several groups from South America (Brasiella, Cylindera), the Indian subcontinent (Calochroa, Jansenia, Eugrapha), Africa (Lophyra, Lophyridia), Madagascar (Hipparidium, Chaetotaxis), Southeast Asia (Monelica, Abroscelis, Callytron), Australia (Rivacindela), and New Zealand (Neocicindela). In some cases we included several closely related species from a subgenus to test the utility of Mp20 at shallow nodes. The outgroups included three genera (Odontocheila, Prothyma, Peridexia) from another subtribe (Prothymina) within the Cicindelini, and one genus from the tribe Megacephalini (Pseudoxycheila).

DNA extractions and procedures for amplification of three mitochondrial fragments (cytochrome b [cytb], cytochrome oxidase III [COIII], and 16S rRNA) were performed as described previously (Vogler and Welsh, 1997). A fragment of 771 bp of cytochrome oxidase I (COI) was amplified as described by Ribera et al. (2001). Amplification was not successful in a few taxa for part of the data set, as follows: C. pimeriana (16S rRNA), C. guerrensis, M. captotriola (COI), P. chaudori, R. salicursoria, Macfarlandia (COIII), L. chloris (Cytb), N. parryi (16S rRNA and COIII), and N. ginevi (COI and COIII).

Primers for polymerase chain reaction (PCR) amplification of Mp20 were based on expressed sequence tags (ESTs) from four species of taxonomically divergent beetles, including the adephagan Carabus granulatus (Carabidae), and the polyphagans Agriotes lineatus (Elateridae), Mycetophagus quadripustulatus (Mycetophagidae), and Curculio glandium (Curculionidae), in combination with the D. melanogaster genomic sequence. The following oligonucleotide pair was designed for amplification of Mp20: Mp20-5′ (5′-ATG TCT CTK GAA CGT CAA GTC C-3′) and Mp20-3′ (5′-TGN CCG GCY TGK GTG GCR CCC TTG-3′). The primer binding sites correspond to sequences in the 1st and 4th exons and amplify almost the complete coding sequence except for 19 amino acids at the 3′ end (Fig. 1). PCR followed a touchdown protocol: after an initial denaturation step of 94°C for 3.5 min, 8 cycles were performed at 94°C for 30 s, 61°C for 35 s (decreasing by 0.5°C every cycle), and 72°C for 2.5 min. This was followed by 32 cycles at 94°C for 30 s, 57°C for 35 s, and 72°C for 2.5 min, and a final extension step at 72°C for 10 min.
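The touchdown schedule described above can be written out cycle by cycle. The following Python sketch is an illustration only and not part of the original protocol; the function name `touchdown_schedule` and its parameters are hypothetical, and it assumes that the first touchdown cycle anneals at 61°C with each later touchdown cycle 0.5°C cooler.

```python
def touchdown_schedule(start_anneal=61.0, step=0.5, touchdown_cycles=8,
                       final_anneal=57.0, final_cycles=32):
    """Return a list of (cycle_number, annealing_temperature_C) tuples.

    Assumption: the first touchdown cycle anneals at start_anneal and
    every following touchdown cycle is `step` degrees cooler.
    """
    schedule = []
    # Touchdown phase: annealing temperature drops by `step` each cycle.
    for i in range(touchdown_cycles):
        schedule.append((i + 1, start_anneal - step * i))
    # Constant phase: fixed annealing temperature for the remaining cycles.
    for j in range(final_cycles):
        schedule.append((touchdown_cycles + j + 1, final_anneal))
    return schedule


if __name__ == "__main__":
    for cycle, temp in touchdown_schedule():
        print(f"cycle {cycle:2d}: anneal at {temp:.1f} C for 35 s")
```

Under this reading, the eight touchdown cycles anneal at 61.0°C down to 57.5°C, after which the remaining 32 cycles anneal at 57°C, for 40 cycles in total.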

PCR-amplified fragments were cloned blunt-ended in pMOSBlue using a cloning kit (Amersham Pharmacia Biotech). Inserts were detected with PCR using T7 and U-19 primers and sequenced from both strands on an ABI3700 DNA Analyzer (Applied Biosystems). Cicindelid-specific internal primers in the 2nd exon of the Mp20 gene were designed to obtain complete sequences of both strands: MP20-F2 (5′-GTT CTC TGC CAG GTT ATG AA-3′) and MP20-R1 (5′-ACC AGG AGG GAA CTT YTT GC-3′). The extent of intron sequences in Mp20 was predicted by comparing cicindelid sequences with the EST from Carabus granulatus. GenBank accession numbers of mitochondrial and Mp20 sequences for each species are given in the Appendix.
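The primers listed above contain IUPAC ambiguity codes (K, N, Y, R), so each one stands for a pool of oligonucleotides. As a hedged illustration, the Python sketch below counts and expands that pool using the standard IUPAC nucleotide codes; the helper names (`degeneracy`, `expand`) and the dictionary layout are assumptions made for this example.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "Mp20-5'": "ATG TCT CTK GAA CGT CAA GTC C",
    "Mp20-3'": "TGN CCG GCY TGK GTG GCR CCC TTG",
    "MP20-F2": "GTT CTC TGC CAG GTT ATG AA",
    "MP20-R1": "ACC AGG AGG GAA CTT YTT GC",
}


def clean(primer):
    """Remove the spacing used for readability and upper-case the sequence."""
    return primer.replace(" ", "").upper()


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    n = 1
    for base in clean(primer):
        n *= len(IUPAC[base])
    return n


def expand(primer):
    """List every unambiguous sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in clean(primer))
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(clean(seq)), "nt, degeneracy", degeneracy(seq))
```

With these definitions, Mp20-5′ expands to 2 sequences, Mp20-3′ to 32, MP20-F2 to 1, and MP20-R1 to 2.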

Phylogenetic Analysis

Tree alignments were conducted using POY vers. 3.0 (Gladstein and Wheeler, 2002). For all searches reported, we present the best (lowest cost) trees from 100 random addition replicates, collecting no more than 3 shortest trees in each replicate (commands: -random 100 -maxtrees 3). Aligned sequence matrices can be produced from the POY tree alignment with the -impliedalignment command. This alignment (“implied alignment”) is reconstructed from the list of synapomorphies at internal nodes, which is based on the initial cladogram (Wheeler, 2003). The aligned matrix obtained from this procedure is a representation of the homologies established in the direct optimization, and in contrast to most standard alignment procedures does not represent homoplastic insertions in the same column. The resulting matrix of aligned sequences is a synapomorphy scheme that approximates the character transformation in the dynamic homology search of POY (Wheeler, 2003). The implied alignment does not represent an alignment in the traditional sense, and it should not be used as a matrix of fixed character correspondences for searches of shorter trees (although parsimony searches on the implied alignment would frequently find shorter trees than reported by POY [unpublished observation] and these could be used as a starting tree for another round of searches in POY [Wheeler, 2003]). However, the implied alignment is a sufficiently close representation of the character optimization in the initial cladogram search that it can be used as input for subsequent analysis of character variation, and we use it here for estimates of molecular rates.

Tree support was based on an approximate method for estimating Bremer support implemented in POY, using tree-bisection-reconnection (TBR) swapping on a constraint file obtained with the program Jack2Hen, available with the POY software.

Tree searches were also conducted on fixed matrices from standard multiple alignment procedures performed using the advanced ClustalW form (Higgins et al., 1996) available online at the Institut Pasteur (http://bioweb.pasteur.fr/seqanal/interfaces/clustalw.html). Parsimony searches on these matrices were carried out in PAUP version 4.0b10 (Swofford, 2002), with gaps coded as a 5th character state. In addition, gaps were coded according to Simmons and Ochoterena (2000) using the GapCoder software (Young and Healy, 2003), whereby each indel of a particular length and position in the aligned matrix is coded as a separate character and the presence/absence of these indels represents binary character states in the parsimony analysis.
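The gap coding described above can be illustrated with a small example. The Python sketch below is a simplified take on presence/absence indel coding and is not the GapCoder implementation: each distinct gap interval (start and end position in the alignment) becomes one binary character. The function names and the toy alignment are hypothetical, and the sketch deliberately omits any special scoring for taxa in which a longer gap spans the indel being coded, as well as any special handling of leading or trailing gaps.

```python
import re


def gap_intervals(seq):
    """Return the half-open (start, end) interval of every gap run in seq."""
    return [(m.start(), m.end()) for m in re.finditer(r"-+", seq)]


def code_indels(alignment):
    """Code indels as binary presence/absence characters.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    Returns (indels, matrix) where indels is the sorted list of distinct
    gap intervals and matrix maps each taxon to a string of '0'/'1'.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) > 1:
        raise ValueError("aligned sequences must have equal length")
    per_taxon = {name: set(gap_intervals(seq)) for name, seq in alignment.items()}
    indels = sorted(set().union(*per_taxon.values()))
    matrix = {
        name: "".join("1" if interval in gaps else "0" for interval in indels)
        for name, gaps in per_taxon.items()
    }
    return indels, matrix


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    toy = {
        "taxonA": "ACGT--ACGT",
        "taxonB": "ACGT--ACGT",
        "taxonC": "ACGTTTACGT",
        "taxonD": "AC----ACGT",
    }
    indels, matrix = code_indels(toy)
    print(indels)
    for name, row in matrix.items():
        print(name, row)
```

In the toy alignment, the 2-bp gap shared by taxonA and taxonB and the 4-bp gap in taxonD start at different positions, so they are coded as two separate characters even though they overlap.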

Bayesian phylogenetic analyses were performed using MrBayes 3.0 software (Huelsenbeck and Ronquist, 2001). Searches were run for 2 million generations with four Markov chain Monte Carlo (MCMC) chains starting from random trees, which were sampled every 100th generation. The log-likelihood scores of sample points were plotted against generation time, and stationarity of the Markov chains was assumed when the log-likelihood values reached a stable equilibrium (Huelsenbeck and Ronquist, 2001). The stationarity of the chains was confirmed by plotting the remaining log-likelihood values (sump command). All sample points prior to stationarity were discarded as burn-in, and the remaining points were used to generate a strict consensus tree, with the posterior probability of each clade given by the proportion of sampled trees in which it was recovered. Each search was conducted three times independently, starting from random trees, to explore tree space; the runs retrieved identical topologies and very similar parameters and credibility values in all analyses reported here.
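The way clade posterior probabilities follow from the post-burn-in sample can be sketched in a few lines. The Python example below is an illustration under simplifying assumptions and does not reproduce MrBayes: each sampled tree is represented only as a collection of clades (frozensets of taxon names), and the function name `clade_posteriors` and the toy sample are hypothetical.

```python
from collections import Counter


def clade_posteriors(sampled_trees, burnin):
    """Estimate clade posterior probabilities from an MCMC tree sample.

    sampled_trees: list of trees, each given as an iterable of clades,
                   where a clade is a frozenset of taxon names.
    burnin:        number of initial samples to discard.
    Returns a dict mapping clade -> proportion of retained trees with it.
    """
    kept = sampled_trees[burnin:]
    if not kept:
        raise ValueError("burn-in removes every sample")
    counts = Counter()
    for tree in kept:
        # set() guards against a clade being listed twice for one tree.
        counts.update(set(tree))
    return {clade: n / len(kept) for clade, n in counts.items()}


if __name__ == "__main__":
    ab = frozenset({"A", "B"})
    abc = frozenset({"A", "B", "C"})
    ac = frozenset({"A", "C"})
    # Toy sample of four trees; the first one is treated as burn-in.
    sample = [[ab], [ab, abc], [ab], [ac]]
    for clade, p in clade_posteriors(sample, burnin=1).items():
        print(sorted(clade), round(p, 3))
```

With 2 million generations sampled every 100th generation, each run yields on the order of 20,000 sampled trees before burn-in is removed.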

Bayesian analysis was implemented in several ways. First, a given ClustalW alignment was subjected to analysis using MODELTEST (Posada and Crandall, 1998), which performs a hierarchical test of likelihood fits under 56 different models of character variation. A GTR+Γ+I model was selected in this analysis, and Bayesian searches were performed under this preferred model. Further, Bayesian searches were performed under a GTR model calculating site-specific rates for eight partitions showing greatly different dynamics of character change in parsimony optimization: mitochondrial structural RNA, protein 1st, 2nd, and 3rd positions, Mp20 exon 1st, 2nd, and 3rd positions, and introns. These default MrBayes analyses treat indel characters as missing, but information from indels can be incorporated in the search as a separate set of binary characters (analogous to the treatment of morphological characters). To incorporate phylogenetic information from indels, we used the binary recoded character matrix from the Simmons and Ochoterena (2000) gap coding procedure in the Bayesian analysis, using a GTR+Γ+I model for the DNA data, and for the binary characters a model estimating the among-site rate variation according to a gamma distribution.

Estimating the Relative Ages of Nodes from Sequence Data

Branch lengths were calculated based on the tree and model parameters calculated by Bayesian inference under the three treatments described above (ClustalW single base coding, binary gap coding, and POY implied alignment). In all cases, models constrained to a molecular clock fitted significantly worse when assessed with the likelihood ratio test (P < 0.001, data not shown). Therefore, branch lengths were fitted to a molecular clock using Sanderson’s (1997) nonparametric rate smoothing (NPRS) algorithm. This method does not assume a strict molecular clock, but rather that neighboring branches on the tree tend to have similar rates. NPRS was applied as implemented in TreeEdit v. 1.0a9 (Rambaut and Charleston, 2002). We also estimated branch lengths from the implied alignment and tree obtained with POY, estimating maximum-likelihood (ML) branch lengths in PAUP based on a GTR+Γ+I model selected by MODELTEST, and correcting these branch lengths by the NPRS method. As the number of characters is finite, stochastic variation may affect branch length estimates. To take this type of error into account, we applied a resampling scheme by generating 1000 bootstrap replicates of the data and calculating branch lengths on each of these new data sets given the original tree topology and GTR+Γ+I model, fitted to a clock using NPRS (Baldwin and Sanderson, 1998). Absolute node ages were calculated with reference to an mtDNA calibration from North American taxa based on the Pleistocene biogeographic divergence along the Florida Peninsula and the closure of the Isthmus of Panama (Barraclough and Vogler, 2002).
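The resampling step described above amounts to drawing alignment columns with replacement. The Python sketch below shows only that step, as a minimal illustration; the function names, the seed handling, and the toy alignment are assumptions, and the subsequent branch length estimation and rate smoothing on each replicate (done in PAUP and TreeEdit in the text) are not reproduced here.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample the columns of an alignment with replacement.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    rng:       a random.Random instance, so that runs are reproducible.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("aligned sequences must have equal length")
    # The same column indices are used for every taxon, so whole columns
    # (site patterns) are resampled, not individual bases.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][c] for c in columns) for name in names}


def bootstrap_replicates(alignment, n_replicates=1000, seed=0):
    """Generate n_replicates pseudoreplicate alignments."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


if __name__ == "__main__":
    toy = {"a": "ACGTACGT", "b": "ACGAACGA", "c": "TCGTACGT"}
    replicates = bootstrap_replicates(toy, n_replicates=3, seed=1)
    for rep in replicates:
        print(rep)
```

Each replicate has the same number of sites as the original data; the spread of branch length estimates across replicates then gives a measure of the stochastic error due to the finite number of characters.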

RESULTS

Mp20 Sequences and Gene Organization

PCR amplification of Mp20 was successful for taxa from a wide taxonomic range of cicindelids, but in most cases sequencing was only possible after cloning of PCR products. On average, five clones were sequenced per species, but for nine species only a single clone was obtained (Appendix). Multiple clones usually differed in length, probably representing allelic variation within individuals responsible for the problems with direct sequencing. Intraindividual length variation was entirely confined to the introns, and inferred indels were very short (1 to 3 bp), except for a single clone of L. catena exhibiting an insertion of 83 bp. Mp20 sequences differed between clones obtained from a single specimen by between 0% and 2%. Sequences obtained from a single individual were monophyletic and their divergences were lower than sequence divergence between species, generally by a factor of 10 or more. Intraindividual variation in some species such as C. mathani, E. marginata, and N. ginevi is likely due to the presence of two recognizably different alleles. Within these major types, minor nucleotide discrepancies in singleton clones were encountered, which we attributed to PCR errors (calculated at about 1.7 discrepancies per 1000 nucleotides). We cannot exclude that the variation of clones represents more than two alleles in a few species such as L. chloris, P. fulvia, and B. hemichrysea, perhaps indicating the presence of multiple copies of the Mp20 gene (paralogs), but this is unlikely. Only a single copy of Mp20 is present in the full genome sequence of D. melanogaster, and no evidence for paralogs was obtained in the cDNA libraries from which these sequences were obtained originally. Further, the monophyly of gene copies obtained from each individual demonstrates that if gene duplication had affected the locus, it would be recent relative to the divergence of the species under investigation here. To simplify the phylogenetic analysis, species were represented by a 50% majority rule consensus sequence of the more frequently isolated allele (i.e., removing the putative PCR errors). Where the same number of clones was isolated for each allele, we randomly picked one of them.
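The majority-rule consensus of clones described above can be sketched as follows. This Python example is an assumption-laden illustration rather than the procedure actually used: it takes clones of one allele that are already aligned to equal length, calls a base only when it occurs in more than half of the clones, and writes `N` otherwise. The function name, the `N` placeholder, and the toy clones are hypothetical.

```python
from collections import Counter


def majority_consensus(clones, threshold=0.5, ambiguous="N"):
    """Column-wise majority-rule consensus of aligned clone sequences.

    A base is called only if its frequency in the column is strictly
    greater than `threshold`; otherwise `ambiguous` is written.
    """
    if not clones:
        raise ValueError("at least one clone is required")
    length = len(clones[0])
    if any(len(clone) != length for clone in clones):
        raise ValueError("clones must be aligned to equal length")
    consensus = []
    for column in zip(*clones):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(clones) > threshold else ambiguous)
    return "".join(consensus)


if __name__ == "__main__":
    # Five toy clones; two carry a singleton difference of the kind
    # attributed to PCR error in the text.
    clones = ["ACGTA", "ACGTA", "ACTTA", "ACGTC", "ACGTA"]
    print(majority_consensus(clones))
```

Singleton differences are outvoted by the remaining clones, which is the sense in which the consensus removes putative PCR errors.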

The length of Mp20 PCR products was in the range of 1.2 to 1.7 kb. Sequences from all species exhibited three introns (Fig. 1), recognizable by the presence of the canonical splicing motif ‘GT . . . AG’ in all cases. The third intron separated a codon, with two nucleotides located in the second and the remaining nucleotide in the third exon. Intron length varied substantially among taxa (Fig. 1) whereas all exons were of equal length. The first intron of Prothyma (798 bp in length) and the second intron of Chaetotaxis rugicollis (516 bp) were not included in the analysis and treated as missing data, because of their great divergence from all other species. No length variation was observed in mtDNA, except for single-nucleotide indels in the 16S gene region of M. arachnoides, C. maroccana, I. labeoaneae, L. chloris, and P. fulvia, five single-nucleotide indels in P. chaudoiri, and a single- plus a two-nucleotide indel in O. confusa.
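The recognition of introns by their GT...AG boundaries, and the case of an intron that interrupts a codon, can be illustrated with a toy sequence. The Python sketch below is hypothetical throughout: the function names, the intron coordinates, and the sequence are invented for the example and are not Mp20 data.

```python
def has_canonical_splice_sites(intron):
    """True if the intron starts with GT and ends with AG."""
    s = intron.upper()
    return len(s) >= 4 and s.startswith("GT") and s.endswith("AG")


def splice(genomic, introns):
    """Remove introns and return the joined exon sequence.

    genomic: genomic DNA string.
    introns: list of half-open (start, end) intervals on `genomic`.
    Raises ValueError if an interval lacks the GT...AG boundaries.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        if not has_canonical_splice_sites(genomic[start:end]):
            raise ValueError(f"interval {start}-{end} lacks GT...AG boundaries")
        exons.append(genomic[position:start])
        position = end
    exons.append(genomic[position:])
    return "".join(exons)


if __name__ == "__main__":
    # The codon GCT is split by the intron: "GC" lies upstream of it
    # and "T" downstream.
    genomic = "ATGGC" + "GTAAGTTTTCAG" + "TTGA"
    coding = splice(genomic, [(5, 17)])
    print(coding)
    print([coding[i:i + 3] for i in range(0, len(coding), 3)])
```

After splicing, the two nucleotides before the intron and the single nucleotide after it are read together as one codon, which is the arrangement the text describes for the codon-splitting intron.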

Sequence Alignment and Tree Topology

Searches for tree alignments were initially conducted in POY on all intron and exon data and the four mitochondrial regions combined, and topologies were assessed for the recovery of certain nodes (Table 1). Alignment parameters were varied extensively with respect to the relative cost of indel versus nucleotide change, and the weight of the introns relative to the remaining (exons plus mtDNA) partitions. Three main clades were recovered under equal weighting. One group included the subgenera Cicindela s. str., Cicindelidia, Lophyridia, Lophyra, Calochroa, and Hipparidium (henceforth clade I). A second group was centered around the subgenus Cylindera (s.l.), including Old World and New World groups (Ellipsoptera, Brasiella, Eugrapha, and Ifasina; clade II). The third group included a number of mostly Australian and Southeast Asian taxa (Rivacindela, Neocicindela, Macfarlandia, and Abroscelis) plus in many cases Hyphaetha (clade III). In addition, two smaller clades were frequently recovered, including a group of two species of the Indian endemic radiation of Jansenia (J. chloropleura and J. rostrula) and the Malagasy Chaetotaxis rugicollis, which frequently also included Taenidia circumdata (henceforth Jansenia clade); and a group of two species of Myriochile (M. undulata and M. mastersi) and Monelica fastidiosa.

Increasing the gap costs relative to nucleotide changes greatly affected tree topologies. Gap cost = 3 and especially gap cost = 4 generally resulted in the loss of clades I, II, and III, the outgroup was polyphyletic, and relationships within the three clades were increasingly inconsistent with the traditional taxonomy (Table 1). When the intron regions were downweighted relative to the other partitions by a factor of between 2 and 4, this partly reconstituted the recovery of the three main clades at higher gap costs. At lower gap costs, downweighting the intron regions generally resulted in better recovery of major taxonomic groups, although their relative positions varied greatly, and one outgroup species (Prothyma) frequently appeared in a derived position as sister to clade I (Table 1). The tree in Figure 2 (equal gap cost, introns downweighted by a factor of three) best illustrates features commonly encountered in various analyses.

Alternative alignment strategies were conducted with a two-step protocol, clustering sequences with ClustalW followed by tree searches on prealigned data matrices. The alignment space was explored with a hierarchical design, initially testing a wide range of gap opening penalties (from 10 to 1) and extension penalties (0.05 and 0.001). We selected those which produced the lowest incongruence between intron and exon partitions (as measured by the incongruence length difference (ILD) per character change in the combined data; ILD/change ratio), followed by variation of the parameters in a narrower range and detailed analysis of the resulting trees. The set of trees retrieved from these alignments recovered clades I, II, and III, but differed mostly within clade I and with regard to the relationship of clade II and the Jansenia clade, and the presence of Prothyma in the ingroup. The common nodes shared by all trees based on these alignments are labeled with asterisks in Figure 3. The alignment based on gap opening penalty 6 and gap extension penalty 0.001, in both the pairwise and multiple alignment menu of ClustalW


TABLE 1. Monophyletic groups recovered under different weighting schemes and partitions of the data. M = monophyletic; N = polyphyletic; P = paraphyletic; U = unresolved in strict consensus of all shortest trees.

                              Mp20 exon  mtDNA  Gap = 1               Gap = 2               Gap = 3               Gap = 4
                                                Mp20Mp20 + mtDNA      Mp20Mp20 + mtDNA      Mp20Mp20 + mtDNA      Mp20Mp20 + mtDNA
Weight non introns            N/a        N/a    1   1   2   3   4     1   1   2   3   4     1   1   2   3   4     1   1   2   3   4
Clade I                       M          N      M   M   M   M   M     M   M   M   M   M     P1  M1  M1  M   M1    M1  M1  M1  M1  M1
  Cicindela                   U          P      M5  P   M5  M   M     M5  P   M5  M   M     N   P   P   M5  P     N   P   P   P   P
  Cicindelidia                U          M      P   M4  M4  M   M     P   M4  P   P   M     M4  M4  P   M4  M4    P   P   P   M4  N
  Lophyridia, Cosmodela       U          M      M   M   M   M   M     M   M   M   M   M     M   M   M   M   M     P   P   M   M   M
  (Loph, Calo, Hipp)          U          M      P   M   M   M   M     P   M   P   M   M     M   M1  M1  P   M     M   M   M1  M1  M1
Clade II (Cylindera s.l.)     U          N      M   M   M   M   M     M   M   M   M   M     M7  N   M   M   M     N   N   N   N   N
  New World                   M          N      M   P   P   M   P     N   M   M   M   P     M   M   M   M   M     M   M   M   M   M
  Old World                   N          M      P   P   P   M   P     P   M   M   M   P     P7  N   M   M   M     N   N   N   N   N
Clade I + clade II            U          N      N   N   N   M   N     P   P   M1  N   N     N   N   N   M1  N     N   N   N   N   N
Clade III                     M          N      M2  M   M   M   M     M2  M2  M2  M   M     N   N   N   N   N     N   N   N   N   N
  Riva + Neoc                 U          M      P   M   M   M   M     N   M2  M2  M   M     N   N   N   N   N     N   N   N   N   N
Jansenia clade                U          M3     N   P   P   M   P     M6  M3  N   P   P     P   N   N   P   M3    M3  M3  N   N   P
Myriochile clade              M          M      M   M   M   M   M     M   M   P   M   M     M   M   P   P   P     M   M   P   P   P
Ingroup                       M          P      M*  M   P   M   M     M*  M   M*  M*  M*    M*  N   N   N   M*    N   N   N   N   N

Within each gap cost, the first column refers to Mp20 alone and the following four columns to Mp20 + mtDNA.

Jansenia clade: J. rostrula, J. chloropleura, C. rugicollis, T. circumdata; Myriochile clade: Monelica mastersi, Myriochile fastidiosa, M. undulata; Riva + Neoc, Rivacindela plus Neocicindela; (Loph, Calo, Hipp), clade of Lophyra + Calochroa + Hipparidium.
1 Clades monophyletic or paraphyletic, but the outgroup Prothyma sp. included in the clade; 2 Clade III monophyletic but also includes H. biramosa; 3 does not include T. circumdata; 4 does not include C. ioessa; 5 includes C. ioessa; 6 does not include C. rugicollis; 7 does not include E. zaza. Asterisks mark those trees where the ingroup is not monophyletic because Prothyma sp. is moved to the ingroup.

FIGURE 2. Phylogenetic tree obtained from the combined Mp20 and mtDNA data under gap cost = 1 and weight of non-intron regions = 3. Bremer Support is shown above nodes and was calculated with the same weights applied as in the primary tree search. The three main clades I, II, and III are highlighted.

FIGURE 3. Phylogram representing the single shortest tree obtained from simultaneous analysis of all data, with branch length representing average number of parsimony steps. Numbers above branches refer to PBS (mitochondrial/Mp20 exons/Mp20 introns), and numbers below branches are bootstrap support values above 50%. The three major clades referred to in the text are indicated. This phylogram is based on the ClustalW alignment with gap opening penalty 6 and gap extension penalty 0.001, and minor modifications by eye. Mp20 sequences were also aligned in ClustalW under a variety of alignment parameters (gap open 10, 6, 4, 2, and 1 and gap extension 0.05 and 0.001). Asterisks indicate those nodes present in the strict consensus of trees obtained under all of these alignment parameters.

produced the lowest incongruence (ILD/step = 0.0056). This alignment was the only one that accurately reflected exon and intron boundaries, and in the tree derived from it none of the outgroup species was included in the ingroup. Minor manual modifications of this alignment further reduced the incongruence (0.0049) and also produced the shortest tree obtained in any of the analyses (5133 steps). This Mp20 alignment was combined with mitochondrial sequences for tree searches and resulted in a single tree of 13440 steps (Fig. 3).

The preferred ClustalW alignment was also used as input for Bayesian tree searches using a GTR+Γ+I model of nucleotide substitution. The resulting tree (Fig. 4) was very similar to the preferred parsimony tree but also resolved the reciprocal monophyly of Cicindela and Cicindelidia that was otherwise recovered only in a few POY alignments. Tree searches were also performed using a site-specific model, and with the GTR+Γ+I model and gaps coded as binary states (see Material and Methods). These analyses retrieved similar tree topologies and credibility values (Fig. 4), and those nodes not recovered in all three searches showed low credibility values in all three trees.

Phylogenetic Information Content and Dynamics of Sequence Evolution in mtDNA and Mp20

The data were separated according to functional criteria to test for the phylogenetic signal of various data partitions and potential conflict in a parsimony framework. In the ClustalW alignment obtained under the preferred parameters, Mp20 provided a far smaller number of positions than mtDNA (1767 versus 2655; Table 2) and

TABLE 2. Tree statistics from parsimony analysis of mtDNA and Mp20 partitions separately and combined.

                              Sites (a)
Partition                     Total  Const  Inf   PBS (b)  Tree length  PBS/step  RI      Shortest trees  Resolved nodes (c)  Shared nodes (d)  Length constrained (e)
All data                      4422   1920   1872  1155     13440        0.086     0.4527  1               49                  n/a               n/a
All data, gap = ?             4422   2355   1570  n/a      11390        n/a       0.3868  1               49                  41                11420 (0.26)
All, gap binary (f)           5048   2355   1915  n/a      12403        n/a       0.4182  2               48                  45                12417 (0.11)
mtDNA                         2655   1551   896   436      8146         0.053     0.3045  1               49                  26                8263 (1.42)
mtDNA 3rd pos.                634    22     582   n/a      5934         n/a       0.2985  6               45                  20                6040 (1.75)
mtDNA RNA                     750    501    148   n/a      887          n/a       0.3558  >10^4           43                  12                947 (6.33)
Mp20                          1767   369    976   719      5133         0.140     0.6635  1               49                  37                5177 (0.85)
Exons                         469    320    102   136      433          0.314     0.5796  867             28                  20                454 (4.62)
Introns                       1298   49     874   583      4675         0.124     0.6747  2               47                  32                4723 (1.02)
mt + exons                    3124   1871   998   n/a      8637         n/a       0.3149  6               36                  27                8717 (0.92)
mt + exons + intron no gaps   3347   1920   1122  644      9234         0.070     0.3281  2               47                  34                9289 (0.59)
Intron 1                      744    9      488   454      2688         0.169     0.6225  110             33                  23                2719 (1.14)
Intron 2                      474    24     334   49       1673         0.029     0.7500  8               40                  21                1739 (2.67)
Intron 3                      80     16     52    80       252          0.317     0.7116  >10^4           21                  17                265 (4.90)
Intron no gaps                223    49     124   72       554          0.129     0.5969  416             32                  22                572 (3.14)
Intron gaps only              1075   0      750   511      4093         0.124     0.6877  1               49                  31                4151 (1.40)
Binary gaps only              626    0      345   n/a      969          n/a       0.7487  85              38                  28                997 (2.80)

(a) Number of sites in partition, number of constant sites, and number of potentially informative character changes.
(b) Total PBS of the partition on the tree from simultaneous analysis of total data.
(c) Number of resolved nodes in strict consensus of shortest trees.
(d) Nodes shared with tree based on simultaneous analysis from all data (Fig. 3).
(e) Minimum length of tree for the partition when constrained to topology of tree based on simultaneous analysis of all data according to Swofford (1991), and the percentage of steps added in parentheses.
(f) Gaps coded according to Simmons and Ochoterena (2000).

contributed fewer steps to the total cost of the simultaneous analysis (5133 versus 8146). However, the phylogenetic information content of Mp20 was greater owing to its greater internal consistency (RI = 0.66 versus 0.30) and its higher proportion of informative sites (55.2% versus 33.7%). Consequently, the Partitioned Bremer Support (PBS) attributed to Mp20 in the simultaneous analysis tree was greater than that of mtDNA (719 versus 436). Most of the phylogenetically informative variation in Mp20 was contributed by the introns, with Intron 1 providing a total PBS of 454, slightly higher than that of all mtDNA combined (Table 2). The Mp20 partition was in closer agreement with the combined data than the mtDNA; the single tree from Mp20 shared more nodes (37 of 49 total) with the simultaneous analysis tree and required fewer extra steps (0.85% versus 1.42% for all mtDNA) to fit it. Based on PBS, the Mp20 partitions (mainly exons) provided most of the support for basal relationships, whereas support from mitochondrial partitions for these nodes was weak or negative (Fig. 3). Alternative coding of indels as ‘missing’ characters had only a very slight impact on the tree topology. Similarly, the gap coding procedure according to Simmons and Ochoterena (2000) produced trees remarkably similar to those obtained with indels coded as 5th character, with 45 of 49 nodes shared (Table 2).
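The derived statistics quoted in this comparison can be recomputed from the entries of Table 2. The following is a minimal Python sketch rather than the authors' procedure: the dictionary and function names are illustrative, and expressing the extra steps relative to the constrained tree length is an assumption, chosen because it matches the percentages printed in the table.

```python
# Values transcribed from Table 2 for the two main partitions:
# (total sites, informative sites, PBS, tree length, constrained tree length).
TABLE2 = {
    "mtDNA": (2655, 896, 436, 8146, 8263),
    "Mp20": (1767, 976, 719, 5133, 5177),
}


def partition_summary(total, informative, pbs, length, constrained):
    """Return derived statistics for one partition.

    The percentage of extra steps is taken relative to the constrained
    length (an assumption that matches the values printed in Table 2).
    """
    return {
        "pct_informative": 100.0 * informative / total,
        "pbs_per_step": pbs / length,
        "pct_extra_steps": 100.0 * (constrained - length) / constrained,
    }


summaries = {name: partition_summary(*row) for name, row in TABLE2.items()}

for name, stats in summaries.items():
    rounded = {key: round(value, 3) for key, value in stats.items()}
    print(name, rounded)
```

Other rows of Table 2 can be added to the dictionary in the same five-value layout to compare further partitions.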

Separate analysis of introns retrieved topologies similar to those of the combined analysis, resolving deep and shallow nodes. Trees based only on the intron partitions shared more nodes with the combined analysis (32 nodes) than trees from the exons (20 nodes). The signal in introns was derived from both the indel-free and the indel-containing sites, with trees from the latter sharing

FIGURE 4. Relative node ages in Cicindela. Branch length estimates on combined data based on the GTR+Γ+I model and then adjusted for a molecular clock via NPRS. The absolute calibration is based on the node marked by an asterisk, defining E. marginata and E. puritana plus E. sperata, set to 5 Mya according to Barraclough and Vogler (2002). Numbers above the branches refer to credibility values in the Bayesian analysis. Numbers below the nodes are the estimated confidence intervals (standard deviations) from the calibration of 1000 bootstrap replicates of the data matrix. The biogeographic regions of clade distribution are indicated by vertical bars. Nodes of the tree not supported in alternative Bayesian searches under the site-specific model are marked by †, and nodes not supported when gaps are coded as binary characters are marked by #. See Material and Methods and Table 5 for details about these alternative tree searches.

a greater number of nodes with the combined analysis tree than the tree from indel-free positions only (31 versus 22 nodes, Table 2), although the latter (based on 223 sites of which 124 were potentially informative) still recovered clades I, II, III and most of the relevant subclades within these groups. Although intron positions in total contributed greater signal, the exon partitions showed nearly three times higher PBS per site than intron partitions (Table 2).

Character incongruence as determined by ILD was significant for most of the partitions, but internal conflict was generally greater in mtDNA partitions than in Mp20. The ILD (corrected for the number of character changes) in mtDNA was 0.0390, compared to values of 0.0049 or 0.0169 for Mp20 when analyzed as two (exons versus introns) or four (exons/intron 1/intron 2/intron 3) partitions, respectively. However, the level of incongruence was decreased if mtDNA plus Mp20 exons or mtDNA plus Mp20 exons and non-indel positions were analyzed in combination (ILD/change = 0.0067, P < 0.96; and ILD/change = 0.0109, P < 0.66, respectively). This demonstrated that combining all data in simultaneous analysis produced a more consistent character distribution than any of the partitions separately. This was confirmed by the observation that incongruence between the four mitochondrial partitions was reduced when analyzed together with the Mp20 data (ILD/change = 0.0390 versus 0.0120).
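The ILD values for Mp20 can be traced back to the tree lengths in Table 2. The sketch below, in Python, takes the ILD as the length of the combined tree minus the summed lengths of the separate partition trees; dividing by the combined tree length is an assumption about how the correction for character changes was applied, adopted here because it reproduces the reported ratios. The function and constant names are illustrative.

```python
def ild_per_step(combined_length, partition_lengths):
    """Incongruence length difference scaled by the combined tree length.

    ILD is the extra length required when partitions are forced onto a
    single tree: combined length minus the sum of the partition lengths.
    """
    ild = combined_length - sum(partition_lengths)
    return ild / combined_length


# Shortest-tree lengths transcribed from Table 2.
EXONS, INTRONS = 433, 4675
INTRON1, INTRON2, INTRON3 = 2688, 1673, 252
MP20, MTDNA = 5133, 8146
MT_PLUS_EXONS = 8637

mp20_two = ild_per_step(MP20, [EXONS, INTRONS])
mp20_four = ild_per_step(MP20, [EXONS, INTRON1, INTRON2, INTRON3])
mt_exons = ild_per_step(MT_PLUS_EXONS, [MTDNA, EXONS])

print(round(mp20_two, 4), round(mp20_four, 4), round(mt_exons, 4))
```

The same function applies to any other combination of partitions for which both the combined and the separate tree lengths are available.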

Rate Variation Within and Between Data Partitions, and Evidence for Saturation

Mp20 and mtDNA partitions differed in base composition and rates of substitution. The Mp20 introns and mtDNA were A+T rich (average 67.07% and 75.84%, respectively), whereas the exons were G+C rich (64.69% on average) based on informative characters only (Table 3). These values differed across taxa, when assessed with the heterogeneity chi-square test in PAUP, but only due

TABLE 3. Nucleotide bias and parameters of preferred models of evolution for different data partitions. The table gives the percentage of the four nucleotides in each partition across all taxa, and the result of a chi-square test of the nucleotide bias across species implemented in PAUP (based on informative characters only). Nucleotide percentages for second positions of Mp20 exons are not included because this partition has two informative positions only. Parameters for the preferred model of evolution selected by MODELTEST for different data partitions are also given, including the preferred substitution model, the proportion of invariant sites, the distribution of variable sites, and the shape of the alpha parameter (where applicable). Model parameters were estimated for each of the partitions separately, but they were very similar when estimated on the Bayesian tree of Figure 4.

Partition         % A    % C    % G    % T    Bias       Substitution model  Proportion invariant  Variable sites  Shape alpha
All               33.86  17.92  12.33  35.89  P < 0.001  GTR                 0.4121                Gamma           0.6002
mtDNA             37.78  16.11  8.05   38.06  P < 0.001  GTR                 0.5117                Gamma           0.3962
Protein           33.56  18.33  7.17   38.94  P < 0.001  GTR                 0.4564                Gamma           0.3098
  1st positions   25.31  30.11  12.46  32.12  n.s.       GTR                 0.6746                Gamma           0.6093
  2nd positions   21.44  32.55  10.97  35.04  n.s.       ts/tv 1.92          0.8132                Gamma           0.6022
  3rd positions   38.58  14.96  5.79   40.67  P < 0.001  GTR                 0.0142                Gamma           0.4189
Structural        48.83  5.07   12.43  33.67  n.s.       GTR                 0.5292                Gamma           0.4308
Mp20              28.63  20.34  18.04  32.99  n.s.       GTR                 0.2427                Gamma           2.0809
Exons             12.87  39.32  25.37  22.44  n.s.       TrN                 0.5706                Gamma           1.1007
  1st positions   16.34  47.88  21.40  14.38  n.s.       F81                 0.7412                Gamma           0.7156
  2nd positions   -      -      -      -      -          F81                 -                     Equal           -
  3rd positions   12.46  37.89  26.42  23.23  n.s.       ts/tv 3.91          -                     Gamma           1.2584
Introns           31.89  16.41  16.52  35.18  n.s.       GTR                 0.0512                Gamma           3.9982

to the apparent bias in 3rd codon positions of the mtDNA (Table 3). This bias was due mostly to a small number of species with relatively low A+T content. These species were widely scattered throughout the tree, indicating that A+T bias is unlikely to have resulted in artificial groupings.

MtDNA and Mp20 partitions also differed in their distribution of character changes and site-to-site rate heterogeneity. For mtDNA, likelihood models describing character variation fit the data better with increasing model complexity, and for most partitions a GTR+Γ+I model was found to be better than simpler models. However, for the Mp20 exons, single-parameter or two-parameter models produced a fit as good as that of complex models, in particular if the three codon positions were considered separately (Table 3). Whereas the estimated proportion of invariant sites was high for all partitions except the highly variable mtDNA 3rd positions and the Mp20 introns, the shape of the gamma distribution differed greatly between mtDNA and Mp20. The alpha parameter was much greater in Mp20 (Table 3), indicating more homogeneous rates among sites.
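The link between the alpha parameter and rate homogeneity can be made explicit. For gamma-distributed site rates scaled to a mean of 1, the variance is 1/alpha, so the coefficient of variation of rates among sites is 1/sqrt(alpha). The Python sketch below applies this to shape values transcribed from Table 3; it considers the gamma component only and ignores the proportion of invariant sites, and the names are illustrative.

```python
import math


def rate_cv(alpha):
    """Coefficient of variation of gamma-distributed site rates with mean 1."""
    return 1.0 / math.sqrt(alpha)


# Gamma shape parameters transcribed from Table 3.
ALPHA = {
    "mtDNA": 0.3962,
    "Mp20": 2.0809,
    "Mp20 exons": 1.1007,
    "Mp20 introns": 3.9982,
}

cv = {name: rate_cv(alpha) for name, alpha in ALPHA.items()}

for name, value in cv.items():
    print(f"{name}: CV of site rates = {value:.2f}")
```

A smaller coefficient of variation corresponds to more even rates across sites, which is the sense in which a larger alpha indicates more homogeneous variation.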

We calculated rates of change per branch in various partitions for the tree based on the GTR+Γ+I model of Figure 4. Rates were expressed as the average rate per branch in a given clade, and hence the values permit a direct comparison of likelihood rates of change between different partitions. Rates in mtDNA were on average nearly four times higher than in Mp20, but with great variation of rates within each of the major partitions. For example, the average rate of change in intron positions was approximately four times higher than in exon sequences, and nearly twice as high as in Mp20 3rd codon positions. In mtDNA, rates of change were very high in 3rd codon positions, whereas rates in 1st codon positions (not shown) and structural RNA regions were at least four times lower (Table 4). Rates were also calculated separately for clades I, II, and III, the ingroup and

TABLE 4. Rates of nucleotide substitution in different data partitions and subclades. The values represent average rates of change per nucleotide site as estimated based on ML branch length (GTR+Γ+I model) using the parameters and tree estimated in a Bayesian tree search (Fig. 4). Values are the arithmetic mean of the rates obtained for all 2n − 2 branches in a subclade of n taxa (in parentheses). The estimated parameters were as follows: base frequencies (A = 0.3690, C = 0.1400, G = 0.1132, T = 0.3778), substitution model (AC = 0.8597, AG = 5.1963, AT = 1.0650, CG = 1.2714, CT = 8.8476, GT = 1), and among-site rate variation (Γ = 0.6002, invariants = 0.4121).

                          mtDNA                           Mp20
                All data  All    Protein  3rd    RNA      All    Exons  3rd    Introns
All taxa (51)   0.078     0.113  0.122    0.153  0.038    0.028  0.009  0.016  0.036
Ingroup (46)    0.063     0.098  0.113    0.148  0.027    0.022  0.008  0.011  0.027
Clade I (20)    0.044     0.057  0.076    0.158  0.019    0.015  0.005  0.009  0.019
Clade II (10)   0.047     0.056  0.074    0.204  0.012    0.032  0.007  0.015  0.048
Clade III (7)   0.047     0.060  0.076    0.307  0.026    0.027  0.010  0.022  0.041
Outgroups (5)   0.134     0.121  0.147    0.753  0.083    0.144  0.031  0.097  0.314

the outgroup. In these comparisons rates were found to differ by similar ratios (Table 4), i.e., the partitions exhibited an intrinsic rate that differed in concert throughout the tree, even for the unlinked Mp20 and mtDNA genes.
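The ratios between partitions can be tabulated directly from Table 4 for each group of taxa. The following Python sketch is an illustration only: it uses the "All" columns for mtDNA and Mp20 and the exon and intron columns for Mp20, and the dictionary and function names are assumptions.

```python
# Mean rates per site transcribed from Table 4:
# (mtDNA all, Mp20 all, Mp20 exons, Mp20 introns).
RATES = {
    "All taxa": (0.113, 0.028, 0.009, 0.036),
    "Ingroup": (0.098, 0.022, 0.008, 0.027),
    "Clade I": (0.057, 0.015, 0.005, 0.019),
    "Clade II": (0.056, 0.032, 0.007, 0.048),
    "Clade III": (0.060, 0.027, 0.010, 0.041),
    "Outgroups": (0.121, 0.144, 0.031, 0.314),
}


def rate_ratios(mt_all, mp20_all, exons, introns):
    """Return the mtDNA/Mp20 and intron/exon rate ratios for one group."""
    return {"mtDNA/Mp20": mt_all / mp20_all, "intron/exon": introns / exons}


ratios = {group: rate_ratios(*row) for group, row in RATES.items()}

for group, values in ratios.items():
    print(group, {key: round(value, 2) for key, value in values.items()})
```

Listing the ratios side by side shows where they are similar among subclades and where the outgroup departs from the ingroup pattern.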

Whereas higher rates in mtDNA were widely found within the ingroup, in the outgroup this ratio was reversed, as rates were faster in Mp20 than in mtDNA (0.144 versus 0.121, respectively; Table 4). The rate of change in mtDNA in the outgroup was very similar to that in the ingroup (0.121 versus 0.098), but the Mp20 rate in the outgroup was about six times faster than in the ingroup (0.144 versus 0.022). Interestingly, estimated rates in Mp20 based on parsimony branch length were very similar to those estimated by ML, but in mtDNA parsimony rates were lower by a factor of 2 to 4 (not shown), suggesting a much greater level of saturation in mtDNA. Saturation in mtDNA relative to Mp20 was also evident in plots of branch length optimized under ML models of different complexity (Fig. 5). A plot of branch length calculated under the simple F81 model against a GTR+Γ+I model produced a largely linear correlation for Mp20.

TABLE 5. Age estimates of subclades calculated from different partitions based on ultrametric branch lengths. Absolute node ages were determined by setting the split of E. marginata from E. sperata plus E. puritana to 5 Mya (Barraclough and Vogler, 2002). The last column shows the ages estimated in that study where applicable.

                                  Clustal all data                      Clustal   Clustal    POY (c)  Barraclough and
Node                              GTR+Γ+I (a)  SSR (a)  Gap binary (a)  Mp20 (b)  mtDNA (b)  GTR+Γ+I  Vogler, 2002
Ingroup                           12.7–15.9    16.3     15.5            14.6      16.7       13.5     n/a
Clade I                           8.4–10.6     10.9     10.8            9.6       9.3        10.7     n/a
Clade II                          9.2–11.6     11.2     10.8            11.4      10.7       11.2     n/a
Clade III                         11.0–13.8    13.9     12.5            14.4      12.6       13.1     n/a
N. America-S. America
  Ellipsoptera-Brasiella          8.1–10.3     9.7      9.4             10.0      9.0        10.5     11.9
Australia-New Zealand
  Rivacindela-Neocicindela        6.6–8.6      8.5      8.0             7.9       9.2        6.7      n/a
India-Madagascar
  Calochroa-Hipparidium           5.1–6.7      6.7      6.1             5.7       4.9        7.1      n/a
  E. minuta-E. zaza               7.0–8.8      7.8      8.7             8.1       8.3        8.2      n/a
  Jansenia-Chaetotaxis            11.2–13.8    11.5     12.5            10.8      14.0       12.5     n/a
Barraclough and Vogler, 2002
  C. debilis-Ellipsoptera         6.5–8.5      7.8      7.6             7.9       7.2        8.2      8.0
  Cicindela s. str.-Cicindelidia  6.7–8.5      8.6      8.1             8.4       7.0        9.2      8.9

(a) ML branch lengths for all data partitions using the preferred Clustal alignment and estimated in a Bayesian search based on three different models: GTR+Γ+I, site-specific rate (SSR), and GTR+Γ+I with gaps recoded as binary characters (see Material and Methods). Branch lengths were fitted by NPRS. The first column includes the confidence intervals for the estimated ages based on 1000 bootstrap resamplings of the original data.
(b) ML branch lengths were calculated for Mp20 and mtDNA based on the topology of Figure 4, estimating preferred MODELTEST parameters separately for each partition, and fitting the branch lengths by NPRS.
(c) ML branch lengths based on the implied alignment and the topology obtained with POY (gap = 1, intron positions downweighted 1/3), estimating preferred MODELTEST parameters, and fitting the branch lengths by NPRS.

In contrast, for mtDNA this correlation was weak, as the estimates of branch length were much higher under the complex model, indicating the difficulty of reconstructing the full extent of character variation without taking into account the great rate heterogeneity in this data set. This effect was almost entirely explained by the behavior of 3rd positions in this analysis (not shown).

Estimation of Node Ages

Branch length estimates were used for calculation of node ages. Only the Mp20 exon data were marginally consistent with a molecular clock (0.025 < P < 0.05; likelihood ratio test), whereas all other partitions and the combined data set under any of the treatments were not (P < 0.001, see Table 5). Hence for estimation of node ages we used branch lengths based on the NPRS algorithm, although results were closely correlated with clock estimates assuming rate constancy (r² = 0.98). Ages were estimated for major nodes in the tree, with focus on sister lineages confined to different

FIGURE 5. Analysis of saturation of character variation. ML branch lengths were calculated on the single shortest tree obtained from the combined data (Fig. 4). The plots show the branch length as optimized under an F81 model against a complex GTR+Γ+I model, separate for mtDNA (left) and Mp20 (right panel).

biogeographic regions and including clades I, II and III. Calibration of absolute dates was based on an estimate of 5 Mya for the split of Ellipsoptera marginata from E. sperata plus E. puritana (Barraclough and Vogler, 2002).
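Once branch lengths have been made ultrametric, a single calibration point converts relative node depths into absolute ages by proportional scaling. The Python sketch below illustrates only this last step, under the assumption that relative depths are already available; the node labels and depth values are hypothetical and are not taken from the data, and the function name is illustrative.

```python
CALIBRATION_AGE_MYA = 5.0  # split of E. marginata from E. sperata plus E. puritana


def calibrate(relative_depths, calibration_node, calibration_age=CALIBRATION_AGE_MYA):
    """Scale relative node depths of an ultrametric tree to absolute ages.

    relative_depths maps node labels to depths in arbitrary units; the
    calibration node is assigned calibration_age and all other nodes are
    scaled by the same factor.
    """
    scale = calibration_age / relative_depths[calibration_node]
    return {node: depth * scale for node, depth in relative_depths.items()}


# Hypothetical relative depths in arbitrary units, for illustration only.
example_depths = {"calibration": 1.0, "node_x": 2.5}
ages = calibrate(example_depths, "calibration")
print(ages)
```

Because every age is a multiple of the calibration age, any error in the 5 Mya estimate propagates proportionally to all other nodes.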

Based on these extrapolations, the Mp20 and mtDNA combined data with the GTR+Γ+I model indicated an age of the ingroup node (Cicindela s.l.) in the range of 12.7 to 15.9 Mya (based on bootstrap replicates), with the estimates for the ages of clades I, II and III just below this range (Table 5). Separations of several sister groups from India and Madagascar were estimated to range from 5.1 to 6.7 Mya and from 11.2 to 13.8 Mya, and the split of the Australian Rivacindela from Neocicindela of New Zealand was placed in the range of 6.6 to 8.6 Mya. Ages were also estimated under different likelihood models and gap coding procedures: a site-specific model and a model including gaps as binary characters. Moreover, the implied alignment and tree obtained with POY were also used to estimate the ages with the model selected by MODELTEST (GTR+Γ+I). In all cases the values obtained were within, or very slightly outside of, the ranges of bootstrapped data and analysis under the GTR+Γ+I model (Table 5). Ages calculated for the Mp20 and mtDNA partitions separately also closely matched the calculations from the combined analysis (Table 5). Because variation in exons was clock-like, ages from this partition were also estimated without the NPRS correction, and produced dates that were slightly younger at lower nodes and slightly older at deep nodes (not shown). Our study included 14 nodes that had also been estimated in the diversification of the North American tiger beetles based on mtDNA only (Barraclough and Vogler, 2002), and these data showed good correlation with the previously estimated ages (r² = 0.85, Table 5).

DISCUSSION

Utility of Mp20 in Molecular Systematics of Cicindela

The Mp20 locus was shown here to produce a phylogenetic signal largely congruent with mtDNA. It added in particular to knowledge about basal relationships in Cicindela that had been difficult to resolve with mtDNA alone. The total number of base pairs sequenced for Mp20 was only about half that for mtDNA, and Mp20 showed a roughly four times lower rate of nucleotide variation. Yet, in the combined analysis Mp20 contributed a larger amount of phylogenetic signal, as it showed greater overall character consistency, greater information content per base pair, and contributed a greater amount of total support. Variation in Mp20 showed several features that probably contributed to the greater utility in the phylogenetic analysis, including lower site-to-site rate heterogeneity and fewer invariable sites, a lower level of saturation of character variation, and a less pronounced AT bias than was found in mtDNA.

Major elements of the tree were supported by both the Mp20 and mtDNA, including the major clades I, II, and III, and several groups within clade I. However, topological congruence decreased towards the root, where nodal support values were generally lower and the mtDNA partitions produced very low or negative PBS values. In contrast, at the tips mtDNA generally was in agreement with Mp20, showing a much greater proportion of nodes recovered by all partitions and higher total PBS. Recent work has questioned that Bremer Support values are comparable, even within a single tree (DeBry, 2001), and indeed these values should not be considered to be comparable in a statistical sense. However, BS and PBS remain well-established parameters for estimating relative degree of corroboration among competing hypotheses (Grant and Kluge, 2003), and the trends in these nodal support values (rather than any particular value) are useful for establishing weaker parts of our phylogenetic knowledge. Therefore we accept that incongruence is greater at deep nodes, presumably due to the poor performance of mtDNA and the various biases of character variation and nucleotide composition in this marker. The underlying phylogenetic signal in mtDNA, however, may not be very different from that in Mp20. This is corroborated by the fact that interactions with Mp20 in simultaneous analysis reduced the level of incongruence between the various mtDNA partitions, suggesting that the various mtDNA partitions support a common signal that was more easily apparent in the context of the Mp20 data.

Other causes typically affecting incongruence between mitochondrial and nuclear markers, such as differential introgression and incomplete lineage sorting, are unlikely to play a role here. These effects should be strongest between closely related taxa, rather than deeper in the tree. In addition, the presence of multiple genomic copies and their diversification could result in the sequencing of paralogs, and in this case it would be the Mp20 partition that contributes the false signal. However, this again is unlikely, as one of the main criteria for selecting Mp20 was that it was among only a small number of candidates from the EST libraries that apparently lacked any close paralogs within the D. melanogaster genome or in the Coleoptera cDNA libraries (unpublished). Further, as the Mp20 tree is congruent with mtDNA at the tip level, the Mp20 sequences can be confidently assumed to be orthologs at least at this hierarchical level, and it is difficult to conceive a scenario where different paralogs would be picked up by PCR deeper in the tree only. Hence the phylogenetic history of both markers was probably largely the same, and incongruence could be attributed mostly to differences in parameters of sequence variation.

Utility of Intron Sequences and Alignment

Intron sequences add to the technical difficulties of PCR amplification from nuclear genes, and they also tend to complicate phylogenetic analysis as they generally exhibit higher rates of variation and suffer from length variation. Most authors have therefore disregarded intron sequences in phylogenetic analysis, or have amplified from exons only, by designing primers that avoid introns or by using cDNA templates. However, in our analysis

the introns made an important contribution and clearly added to the strengths of Mp20, as seen in other nuclear genes in recent studies (e.g., Sanchis et al., 2001; Kawakita et al., 2003). The introns contributed about five times greater total PBS than exons, and their fairly high rate of change, exceeding the rate in exons by threefold, and the lower site-by-site rate heterogeneity, were useful to resolve relationships over a range of hierarchical levels. In addition, the fact that introns and exons differ in types and rates of sequence variation may have beneficial effects, as interaction of data of different kinds frequently leads to improved recovery of phylogenetic signal when analyzed simultaneously (e.g., Olmstead and Sweere, 1994). Finally, the length variation in the intron sequences provided a major source of phylogenetic signal, which was largely congruent with other characters, as the indel-containing sites produced topologies similar to those of the exons and the combined analysis.

However, sequence alignment remains a difficult issue when using intron sequences. It is an inevitable step in such analysis to choose a value for the cost of indels prior to the tree search, but there are no simple criteria for this choice. The selection of parameters after empirical exploration of parameter values based on congruence analysis (Wheeler, 1995) has been criticized because the differential weighting required in these procedures ultimately reduces (not increases) total character congruence, and because of the poorly justified use of only a small portion of a potentially infinite parameter space (Grant and Kluge, 2003). Here we addressed the problem of sequence alignment with two very different procedures. First, the dynamic homology assessment of POY was used to explore a range of pertinent gap cost parameters. The selection of a tree from this (by no means exhaustive) set of tree searches, however, was arbitrary in the current case, and the tree shown in Figure 2 merely represents an example of the trees obtained. Following Grant and Kluge (2003), the analysis probably cannot be taken further, and as such the various resulting hypotheses are presented in tabulated form (Table 1).

We found, however, that low gap costs produced trees which are generally more in agreement with existing ideas of relationships. If gap costs were set higher, it was preferable to downweight the intron (alignment variable) regions, reducing the overall costs of indels. There was also a tendency for larger gaps to be preferred (lower extension cost in the Clustal alignment). These findings indicate that indels should not be weighted with a high cost as afforded to them by the POY method (Wheeler, 1996; Giribet and Wheeler, 1999), which treats each single-nucleotide indel separately, and where the penalty for the introduction of a gap in the alignment conflates the weight of the indels in the phylogenetic analysis.

The Clustal analysis was performed with a slightly different perspective. Gap costs here represent a penalty when aligning two or more sequences to each other, and the resulting character matrices are independent of the tree search. The resulting matrices can be compared directly, without the problems from differential weighting as in the POY analysis. To the degree that the ILD represents the congruence of different partitions (see for potential problems Dolphin et al., 2000; Yoder et al., 2001; Darlu and Lecointre, 2002; Dowton and Austin, 2002), we could then show that alignment conditions could be obtained under which the phylogenetic signal in the introns was very compatible with the other data. This applies both to the topologies, which were very similar to those obtained from non-intron partitions, and also to the rates of change throughout the tree, which differed in concert among intron and non-intron partitions. For example, the approximately fourfold higher rate of change in introns compared to exons was observed consistently for all subclades tested (Table 4). Because introns and exons as part of a linkage group are expected to have a similar phylogenetic history, this suggests that the optimal alignments produced here are not simply a fortuitous result of phenetic procedures but represent a good reflection of base pair homologies.

Phylogenetic and Biogeographic History of Cicindela

Molecular clock estimates in insects have relied almost exclusively on mtDNA but this marker has been problematic, in particular for inferences about deeper nodes. The Mp20 data increased the confidence in these calculations, by providing a better topology for the deep nodes and also providing data apparently less affected by saturation. Therefore, with the Mp20 data in hand it was possible for the first time to calculate ages in the deeper nodes in Cicindela, for a test of competing biogeographic scenarios explaining the cosmopolitan distribution of the group.

We found that the main groupings in Cicindela s.l. recovered in this analysis conform to major geographic regions. For example, clade I included the Holarctic subgenus Cicindela s. str. and its Nearctic sister Cicindelidia, which combined were sister to a clade of Cosmodela, Lophyridia, and the C. japonica group (Cicindela s. str. subgroup 6 of Rivalier), with distribution mostly in the eastern Palearctic. Sister to this Holarctic clade were the Indian/Madagascan/African Lophyra plus Calochroa (India) and Hipparidium (Madagascar). Clade II is a grouping of species considered by Rivalier to be part of the cosmopolitan (sub)genus Cylindera. Within this globally distributed group the major subclades were again confined to particular biogeographic regions. The two Indian subgenera Eugrapha and Ifasina (the former also including the Malagasy E. zaza) were monophyletic and sister to a New World clade including the South American Brasiella and Hemichrysea, which were sister to the Cylindera debilis group plus Ellipsoptera from North and Central America. Clade III included taxa from Australia and New Zealand plus a few subgenera with wide distribution in Southeast Asia. These associations had not been recognized before and they remain to be tested further. The analysis also established the deep separation of the Australian clade from all others, although several groups (Jansenia, Habroscelimorpha, Taenidia, and Chaetotaxis) remained difficult to place.

These relationships of taxa in major biogeographic regions now permit testing alternative scenarios of dispersal or continental break-up to explain continental distributions of Cicindela s.l. Based on the molecular clock calibration, the age of Cicindela s.l. was placed between 12.7 and 15.9 Mya in the analysis of all data, and in a very similar range in the analyses based on Mp20 and mtDNA alone (Table 5). For nodes of known geological ages, such as the separation of India and Madagascar, or Australia and New Zealand, the calculated node ages were generally 10 Mya or less, and hence about an order of magnitude lower than the time of break-up of major land masses estimated from geological evidence. This indicates that the cosmopolitan distribution of Cicindela involved occasional faunal interchange between continents. However, this exchange is comparatively rare, as radiations can be identified that are limited to a particular region.
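The "order of magnitude" comparison can be illustrated with a short calculation. The sketch below is only illustrative: the 10 Mya upper bound on node ages comes from the text, but the geological break-up ages (about 88 Mya for India and Madagascar, about 80 Mya for Australia and New Zealand) are assumed round figures from the general geological literature, not values reported in this study, and the function names are hypothetical.

```python
import math

# Upper bound on the calculated node ages quoted in the text (Mya).
NODE_AGE_MAX_MYA = 10.0

# Approximate geological break-up ages (Mya). These are ASSUMED round
# figures for illustration only; they are not taken from this study.
ASSUMED_BREAKUP_MYA = {
    "India-Madagascar": 88.0,
    "Australia-New Zealand": 80.0,
}


def fold_difference(breakup_mya, node_age_mya):
    """Return how many times older the break-up is than the node age."""
    return breakup_mya / node_age_mya


def compare(breakups, node_age_mya):
    """Map each split to (fold difference, log10 of the fold difference)."""
    result = {}
    for split, breakup in breakups.items():
        fold = fold_difference(breakup, node_age_mya)
        result[split] = (fold, math.log10(fold))
    return result


if __name__ == "__main__":
    table = compare(ASSUMED_BREAKUP_MYA, NODE_AGE_MAX_MYA)
    for split, (fold, log_fold) in table.items():
        print(f"{split}: {fold:.1f}x older (log10 = {log_fold:.2f})")
```

Under these assumed dates, a log10 value close to 1 corresponds to roughly one order of magnitude; because 10 Mya is an upper bound on the node ages, the actual discrepancy would be at least this large.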

If the molecular clock calibration is correct, the origin of the Cicindela s.l. clade is placed in the mid-Miocene, a period of expanding dry climate and spread of grasslands and savanna at the expense of forest habitat worldwide (Behrensmeyer, 1992). These changes of temperate and tropical ecosystems may have promoted the radiation of Cicindela, which have conspicuously diversified in open habitats, in contrast to the mostly forest-dwelling basal lineages of Cicindelidae. The group therefore might be one of several that have undergone diversification and colonization of new biogeographic areas during the early and middle Miocene with the changes in habitat availability. For instance, hypsodont mammals were able to expand their habitat and food resources in Europe (Jernvall and Fortelius, 2002), shrubland and fynbos flora radiated in South Africa (Goldblatt et al., 2002), and the diversification of hominids in savanna habitat coincided with this period (Pickford, 2002). These changes also led to the dispersal of several groups between Africa and Eurasia such as tiger frogs (Kosuch et al., 2001), rodents and lagomorphs (Winkler, 2002), and hominids (Chaimanee et al., 2003), and could also have prompted the dispersal of early Cicindela between continents.

In conclusion, the use of Mp20, a molecular marker not previously used in phylogenetics, greatly advanced our knowledge of basal relationships in Cicindela. Yet, the deep nodes remain poorly supported, perhaps due to fast radiation during the early evolution of the group that may have coincided with a period when their habitat greatly expanded. Better resolution will likely come from additional nuclear gene markers, and it will be interesting to establish if intron sequences are more generally useful for establishing Miocene relationships in insects.

ACKNOWLEDGMENTS

We thank D. Broszka, D. Pearson, D. Sumlin, and F. Cassola for specimens, and A. de Riva for contributing to the development of primers for Mp20. We are indebted to J. Gomez-Zurita and M. Arnedo for their help with phylogenetic methods and comments on the manuscript. We are also grateful for extremely constructive criticism during the review process by K. Kjer, K. Ober, C. Simon, and an anonymous referee that greatly improved this manuscript. This study was supported by NERC grant NER/A/S/2000/00489. Funding for AC was through a stipend of Fundação para a Ciência e a Tecnologia (PRAXIS XXI/BD/18409/98), and KT was funded through BBSRC grant G15548 (to APV and P. Foster).

REFERENCES

Ayme-Southgate, A., P. Lasko, C. French, and M. L. Pardue. 1989. Characterization of the gene for Mp20—a Drosophila muscle protein that is not found in asynchronous oscillatory flight-muscle. J. Cell Biol. 108:521–531.

Baldwin, B. G., and M. J. Sanderson. 1998. Age and rate of diversification of the Hawaiian silversword alliance (Compositae). Proc. Natl. Acad. Sci. U.S.A. 95:9402–9406.

Barraclough, T. G., and A. P. Vogler. 2002. Recent diversification rates in North American tiger beetles (genus Cicindela). Mol. Biol. Evol. 19:1706–1716.

Behrensmeyer, A. K. 1992. Terrestrial ecosystems through time. Chicago University Press, Chicago, Illinois.

Chaimanee, Y., D. Jolly, M. Benammi, P. Tafforeau, D. Duzer, I. Moussa, and J. J. Jaeger. 2003. A Middle Miocene hominoid from Thailand and orangutan origins. Nature 422:61–65.

Cognato, A. I., and A. P. Vogler. 2001. Exploring data interaction and nucleotide alignment in a multiple gene analysis of Ips (Coleoptera: Scolytinae). Syst. Biol. 50:758–780.

Darlu, P., and G. Lecointre. 2002. When does the incongruence length difference test fail? Mol. Biol. Evol. 19:432–437.

DeBry, R. W. 2001. Improving interpretation of the decay index for DNA sequence data. Syst. Biol. 50:742–752.

Dolphin, K., R. Belshaw, C. D. L. Orme, and D. L. J. Quicke. 2000. Noise and incongruence: Interpreting results of the incongruence length difference test. Mol. Phylogenet. Evol. 17:401–406.

Dowton, M., and A. D. Austin. 2002. Increased congruence does not necessarily indicate increased phylogenetic accuracy—the behavior of the incongruence length difference test in mixed-model analyses. Syst. Biol. 51:19–31.

Farrell, B. D., A. S. Sequeira, B. C. O’Meara, B. B. Normark, J. H. Chung, and B. H. Jordal. 2001. The evolution of agriculture in beetles (Curculionidae: Scolytinae and Platypodinae). Evolution 55:2011–2027.

Giribet, G., and W. C. Wheeler. 1999. On gaps. Mol. Phylogenet. Evol. 13:132–143.

Gladstein, D., and W. C. Wheeler. 2002. POY. Program and documentation, version 3.0. American Museum of Natural History, New York.

Goldblatt, P., V. Savolainen, O. Porteous, I. Sostaric, M. Powell, G. Reeves, J. C. Manning, T. G. Barraclough, and M. W. Chase. 2002. Radiation in the Cape flora and the phylogeny of peacock irises Moraea (Iridaceae) based on four plastid DNA regions. Mol. Phylogenet. Evol. 25:341–360.

Grant, T., and A. G. Kluge. 2003. Data exploration in phylogenetic inference: Scientific, heuristic, or neither. Cladistics 19:379–418.

Higgins, D. G., J. D. Thompson, and T. J. Gibson. 1996. Using CLUSTAL for multiple sequence alignment. Methods Enzymol. 266:383–401.

Hillis, D. M., and M. T. Dixon. 1991. Ribosomal DNA: Molecular evolution and phylogenetic inference. Quart. Rev. Biol. 66:411–453.

Huelsenbeck, J. P., and F. Ronquist. 2001. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 17:754–755.

Jernvall, J., and M. Fortelius. 2002. Common mammals drive the evolutionary increase of hypsodonty in the Neogene. Nature 417:538–540.

Jordal, B. H. 2002. Elongation factor 1 alpha resolves the monophyly of the haplodiploid ambrosia beetles Xyleborini (Coleoptera: Curculionidae). Insect Mol. Biol. 11:453–465.

Kawakita, A., T. Sota, J. S. Ascher, M. Ito, H. Tanaka, and M. Kato. 2003. Evolution and phylogenetic utility of alignment gaps within intron sequences of three nuclear genes in bumble bees (Bombus). Mol. Biol. Evol. 20:87–92.

Kosuch, J., M. Vences, A. Dubois, A. Ohler, and W. Bohme. 2001. Out of Asia: Mitochondrial DNA evidence for an Oriental origin of tiger frogs, genus Hoplobatrachus. Mol. Phylogenet. Evol. 21:398–407.

Ober, K. A. 2002. Phylogenetic relationships of the carabid subfamily Harpalinae (Coleoptera) based on molecular sequence data. Mol. Phylogenet. Evol. 24:228–248.

Olmstead, R. G., and J. A. Sweere. 1994. Combining data in phylogenetic systematics: An empirical approach using three molecular data sets in the Solanaceae. Syst. Biol. 43:467–481.

Pearson, D. L., and A. P. Vogler. 2001. Tiger beetles: The ecology and evolution of the Cicindelidae. Cornell University Press, Ithaca, New York.

Pickford, M. 2002. Palaeoenvironments and hominoid evolution. Z. Morphol. Anthropol. 83:337–348.

Posada, D., and K. A. Crandall. 1998. MODELTEST: Testing the model of DNA substitution. Bioinformatics 14:817–818.

Rambaut, A., and M. Charleston. 2002. TreeEdit. Phylogenetic Tree Editor v1.0 alpha 10. Oxford University.

Ribera, I., T. G. Barraclough, and A. P. Vogler. 2001. The effect of habitat type on speciation rates and range movements in aquatic beetles: Inferences from species-level phylogenies. Mol. Ecol. 10:737–750.

Rivalier, E. 1950. Demembrement du genre Cicindela Linne. Rev. Franc. Entomol. 17:217–244.

Rivalier, E. 1963. Demembrement du genre Cicindela Linne. V. Faune australienne. Rev. Franc. Entomol. 30:30–48.

Sanchis, A., J. M. Michelena, A. Latorre, D. L. J. Quicke, U. Gardenfors, and R. Belshaw. 2001. The phylogenetic analysis of variable-length sequence data: Elongation factor-1alpha introns in European populations of the parasitoid wasp genus Pauesia (Hymenoptera: Braconidae: Aphidiinae). Mol. Biol. Evol. 18:1117–1131.

Sanderson, M. J. 1997. A nonparametric approach to estimating divergence times in the absence of rate constancy. Mol. Biol. Evol. 14:1218–1231.

Simmons, M. P., and H. Ochoterena. 2000. Gaps as characters in sequence-based phylogenetic analyses. Syst. Biol. 49:369–381.

Sota, T., and A. P. Vogler. 2003. Reconstructing species phylogeny of the carabid beetles Ohomopterus using multiple nuclear DNA sequences: Heterogeneous information content and the performance of simultaneous analyses. Mol. Phylogenet. Evol. 26:139–154.

Swofford, D. L. 1991. When are phylogeny estimates from molecular and morphological data incongruent? Pages 295–333 in Phylogenetic analysis of DNA sequences (M. M. Miyamoto and J. Cracraft, eds.). Oxford University Press, Oxford.

Swofford, D. L. 2002. PAUP*: Phylogenetic Analysis using Parsimony. Version 4.0b. Sinauer Associates, Sunderland, Massachusetts.

Theodorides, K., A. de Riva, J. Gomez-Zurita, P. G. Foster, and A. P. Vogler. 2002. Comparison of EST libraries from seven beetle species: Towards a framework for phylogenomics of the Coleoptera. Insect Mol. Biol. 11:467–475.

Vogler, A. P., A. Cardoso, and T. G. Barraclough. 2004. Distribution of homoplasy and the effect of weighting when the tree is fully sampled: Species level phylogenetics of North American tiger beetles (genus Cicindela). Syst. Biol. submitted.

Vogler, A. P., and C. K. Kelley. 1998. Covariation of defensive traits in Cicindela tiger beetles: A phylogenetic approach using mtDNA. Evolution 52:529–538.

Vogler, A. P., and A. Welsh. 1997. Phylogeny of North American Cicindela tiger beetles inferred from multiple mitochondrial DNA sequences. Mol. Phylogenet. Evol. 8:225–235.

Wheeler, W. C. 1995. Sequence alignment, parameter sensitivity, and the phylogenetic analysis of molecular data. Syst. Biol. 44:321–331.

Wheeler, W. C. 1996. Optimization alignment: The end of multiple sequence alignment in phylogenetics? Cladistics 12:1–9.

Wheeler, W. C. 2003. Implied alignment: A synapomorphy-based multiple-sequence alignment method and its use in cladogram search. Cladistics 19:261–268.

Winkler, A. J. 2002. Neogene paleobiogeography and East African paleoenvironments: Contributions from the Tugen Hills rodents and lagomorphs. J. Human Evol. 42:237–256.

Yoder, A. D., J. A. Irwin, and B. A. Payseur. 2001. Failure of the ILD to determine data combinability for slow loris phylogeny. Syst. Biol. 50:408–424.

Young, N. D., and J. Healy. 2003. GapCoder automates the use of indel characters in phylogenetic analysis. BMC Bioinformatics 4:6.

First submitted 19 May 2003; reviews 19 September 2003; final acceptance 24 February 2004

Associate Editor: Karl Kjer

APPENDIX. Taxa used in this study, their collecting locality, and GenBank accession numbers. Mp20 sequences were obtained from a consensus sequence from three or more PCR product clones (†), directly from PCR product (∗), or from a single PCR product clone (√).

Species name | Locality | Mp20 | 16S | cytB | COI | COIII
Abroscelis ancholaris | Shiga, Ishikawa, Japan | AJ514267† | AJ515098 | AJ515067 | AJ514954 | AJ515123
Brasiella argentata | Morona-Santiago, Ecuador | AJ514296√ | AF439012 | AF439142 | AJ514982 | AF439072
Brasiella hemichrysea | | AJ514297† | AF439013 | AF439143 | AJ514983 | AF439074
Callytron malabaricum | Kumpta, Karnataka, India | AJ514309∗ | AJ515119 | AJ515090 | AJ514994 | AJ515140
Calochroa fabriciana | Jog Falls, Karnataka, India | AJ514287√ | AJ515108 | AJ515076 | AJ514973 | AJ515130
Chaetotaxis rugicollis | Analalava forest, Fauchage, Madagascar | AJ514304∗ | AJ515117 | AJ515086 | AJ514990 | AJ514419
Cicindela japana | Hakusan, Ishikawa, Japan | AJ514276√ | AJ515107 | AJ515075 | AJ514963 | AJ515129
Cicindela japonica | Kyoto University, Japan | AJ514272† | AJ515103 | AJ515071 | AJ514959 | AJ515125
Cicindela maroccana | Sines, Portugal | AJ514275† | AJ515106 | AJ515074 | AJ514962 | AJ515128
Cicindela parowana | Beowawe, Eureka Co., Nevada, US | AJ514278† | AF438958 | AF439087 | AJ514965 | AF439026
Cicindela pimeriana | Willcox, Cochise Co, Arizona, US | AJ514279√ | N/A | AF439086 | AJ514966 | AF439025
Cicindela theatina | | AJ514277† | AF438836 | AF438845 | AJ514964 | AF438854
Cicindela willistoni | Quivera, Stafford Co., Kansas, US | AJ514280† | AF438890 | AF438939 | AJ514967 | AF438915
Cicindelidia dysenterica | Tlaxcala, Mexico | AJ514284† | AF438981 | AF439112 | AJ514970 | AF439048
Cicindelidia guerrerensis | Michoauna, Mexico | AJ514281† | AF438972 | AF439102 | N/A | AF439040
Cicindelidia ioessa | Santa Marta, Mexico | AJ514282∗ | AF438962 | AF439091 | AJ514968 | AF439030
Cicindelidia mathani | Rio Blanco, Pichincha, Ecuador | AJ514286† | AF438992 | AF439113 | AJ514972 | AF439058
Cicindelidia oaxensis | | AJ514283∗ | AF438974 | AF439104 | AJ514969 | AF439042
Cicindelidia politula | Guadelupe, Texas, US | AJ514285∗ | AF438977 | AF439008 | AJ514971 | AF439045
Cosmodela duponti | Jog Falls, Karnataka, India | AJ514273∗ | AJ515104 | AJ515072 | AJ514960 | AJ515126
Cylindera debilis | Croton Springs, Texas, US | AJ514295† | AF133002 | AF133018 | AJ514981 | AF133034
Ellipsoptera marginata | | AJ514294† | AF132989 | AF133005 | AJ514980 | AF133021
Ellipsoptera puritana | Little Cove, Maryland, US | AJ514292† | AF132997 | AF133013 | AJ514978 | AF133029
Ellipsoptera sperata | Red Bluff, Texas, US | AJ514293† | AF132996 | AF133012 | AJ514979 | AF133028
Eugrapha minuta | Jog Falls, Karnataka, India | AJ514301† | AJ515114 | AJ515083 | AJ514987 | AJ515136
Eugrapha zaza | Tulear Embouchure, Andranolaka, Madagascar | AJ514300† | AJ515094 | AJ515082 | AJ514986 | AJ515135
Habroscelimorpha californica brevihamata | Mazaflan, Sinaloa, Mexico | AJ514291∗ | AF438995 | AF439126 | AJ514977 | AF439061
Habroscelimorpha schwarzi | Panama Viejo, Panama | AJ514308∗ | AF439000 | AF439131 | AJ514993 | AF439065
Hipparidium equestre | Antanarivo, Madagascar | AJ514288√ | AJ515109 | AJ515077 | AJ514974 | AJ515131
Hyphaeta birramosa | | AJ514268† | AJ515099 | AJ515068 | AJ514955 | AJ514415
Ifasina belli | Jog Falls, Karnataka, India | AJ514298† | AJ515112 | AJ515080 | AJ514984 | AJ515133
Ifasina labeoaneae | | AJ514299† | AJ515113 | AJ515081 | AJ514985 | AJ515134
Jansenia chloropleura | | AJ514303† | AJ515116 | AJ515085 | AJ514989 | AJ515138
Jansenia rostrula | | AJ514302† | AJ515115 | AJ515084 | AJ514988 | AJ515137
Lophyra catena | Bangalore, Karnataka, India | AJ514289√ | AJ515110 | AJ515078 | AJ514975 | AJ514418
Lophyridia cardoni | Chengalpattu, Madras, India | AJ514271† | AJ515102 | AJ515070 | AJ514958 | AJ515124
Lophyridia chloris | | AJ514270† | AJ515101 | N/A | AJ514957 | AJ514417
Lophyridia lunulata | Kairouen, Tunisia | AJ514274† | AJ515105 | AJ515073 | AJ514961 | AJ515127
Macfarlandia arachnoides | | AJ514266√ | AJ515097 | AJ515066 | AJ514953 | N/A
Monelica fastidiosa | Jog Falls, Karnataka, India | AJ514306† | AJ514424 | AJ515088 | AJ514991 | AJ515139
Myriochile mastersi catoptriola | Mt. Magnet, Western Australia, Australia | AJ514305† | AJ514423 | AJ515087 | N/A | AJ514420
Myriochile undulata | Ambala, Hariyana, India | AJ514307√ | AJ515118 | AJ515089 | AJ514992 | AJ514421
Neocicindela ginevi | New Zealand | AJ514264† | AJ514422 | AJ515064 | N/A | N/A
Neocicindela parryi | New Zealand | AJ514265† | N/A | AJ515065 | AJ514952 | N/A
Odontocheila confusa | San Jose de Chiquitos, Santa Cruz, Bolivia | AJ514310† | AF438903 | AJ515091 | AJ514995 | AF438928
Peridexia fulvia | | AJ514269† | AJ515100 | AJ515069 | AJ514956 | AJ514416
Prothyma sp APV-2001 | Karnataka, India | AJ514312∗ | AJ515121 | AJ515093 | AJ514997 | AJ515141
Pseudoxycheila chaudoiri | Nanegalito-Nanegal, Pichincha, Ecuador | AJ514311∗ | AJ515120 | AJ515092 | AJ514996 | N/A
Rivacindela aurifodina | Lake Goongarrie S29°59′65′′ E121°09′46′′, Australia | AJ514262√ | AJ515095 | AJ515062 | AJ514950 | AJ515122
Rivacindela salicursoria | Lake Lefroy S31°26′30′′ E121°33′57′′, Australia | AJ514263† | AJ515096 | AJ515063 | AJ514951 | N/A
Taenidia circumdata | | AJ514290† | AJ515111 | AJ515079 | AJ514976 | AJ515132
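
For readers who want to work with the Appendix programmatically, the Mp20 column combines an accession number with a symbol for how the sequence was obtained. The following is a minimal sketch of how such a field might be decoded. The function name `parse_mp20` and the dictionary `MP20_SOURCE` are hypothetical helpers introduced here for illustration, and the accession pattern (two capital letters followed by six digits) is an assumption based only on the entries listed in the table.

```python
import re

# Meaning of each symbol, following the Appendix caption.
# The plain ASCII asterisk is accepted as an alias for the printed symbol.
MP20_SOURCE = {
    "†": "consensus of three or more PCR product clones",
    "∗": "direct sequencing of PCR product",
    "*": "direct sequencing of PCR product",
    "√": "single PCR product clone",
}

# Assumed field layout: two capital letters, six digits, one symbol.
_FIELD = re.compile(r"([A-Z]{2}\d{6})([†∗*√])")


def parse_mp20(field):
    """Split an Mp20 field such as 'AJ514267†' into (accession, source)."""
    match = _FIELD.fullmatch(field.strip())
    if match is None:
        raise ValueError(f"unrecognized Mp20 field: {field!r}")
    accession, symbol = match.groups()
    return accession, MP20_SOURCE[symbol]


if __name__ == "__main__":
    for field in ("AJ514267†", "AJ514309∗", "AJ514296√"):
        print(parse_mp20(field))
```

Fields for the mtDNA columns carry no symbol and would not match this pattern, so the helper is intended for the Mp20 column only.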
