Investigating the Path of Plastid Genome Degradation in an Early-Transitional Clade of Heterotrophic Orchids, and Implications for Heterotrophic Angiosperms

Article, Fast Track

Craig F. Barrett,*,1 John V. Freudenstein,2 Jeff Li,1 Dustin R. Mayfield-Jones,3 Leticia Perez,1 J. Chris Pires,3 and Cristian Santos1

1 Department of Biological Sciences, California State University, Los Angeles
2 Department of Evolution, Ecology, and Organismal Biology and the Museum of Biological Diversity, Ohio State University
3 Division of Biological Sciences, University of Missouri

*Corresponding author: E-mail: [email protected].
Associate editor: Naoki Takebayashi

Mol. Biol. Evol. 31(12):3095–3112. doi:10.1093/molbev/msu252. Advance Access publication August 28, 2014.
© The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: [email protected]

Abstract

Parasitic organisms exemplify morphological and genomic reduction. Some heterotrophic, parasitic plants harbor drastically reduced and degraded plastid genomes resulting from relaxed selective pressure on photosynthetic function. However, few studies have addressed the initial stages of plastome degradation in groups containing both photosynthetic and nonphotosynthetic species. Corallorhiza is a genus of leafless, heterotrophic orchids that contains both green, photosynthetic species and nongreen, putatively nonphotosynthetic species, and represents an ideal system in which to assess the beginning of the transition to a “minimal plastome.” Complete plastomes were generated for nine taxa of Corallorhiza using Illumina paired-end sequencing of genomic DNA to assess the degree of degradation among taxa, and for comparison with a general model of degradation among angiosperms. Quantification of total chlorophyll suggests that nongreen Corallorhiza still produce chlorophyll, but at 10-fold lower concentrations than green congeners. Complete plastomes and partial nuclear rDNA cistrons yielded a fully resolved tree for Corallorhiza, with at least two independent losses of photosynthesis, evidenced by gene deletions and pseudogenes in Co. striata and nongreen Co. maculata. All Corallorhiza show some evidence of degradation in genes of the NAD(P)H dehydrogenase complex. Among genes with open reading frames, photosynthesis-related genes displayed evidence of neutral evolution in nongreen Corallorhiza, whereas genes of the ATP synthase complex displayed some evidence of positive selection in these same groups, though for reasons unknown. Corallorhiza spans the early stages of a general model of plastome degradation and has added critical insight for understanding the process of plastome evolution in heterotrophic angiosperms.

Key words: Orchidaceae, chloroplast, pseudogene, parasite, photosynthesis, chlorophyll.

Introduction

Parasites often display extreme reduction in morphological and genomic features (e.g., Moran 2002; Lawrence 2005; Morrison et al. 2007). There are many examples of reduced genomes in parasitic animals (e.g., Protasio et al. 2012; Tsai et al. 2013) and fungi (e.g., Katinka et al. 2001; Yoder and Turgeon 2001; Galagan et al. 2003). Heterotrophic plants (sensu lato, those that obtain nutrients either from other plants or from fungi with which they associate, i.e., mycoheterotrophs) provide unique opportunities to study the types of changes that occur along the transition to parasitism. Some heterotrophic plants retain the ability to carry out photosynthesis (i.e., hemiparasites and partial mycoheterotrophs), whereas others have lost this ability, presumably due to a relaxation of selective constraints associated with photosynthetic function (i.e., holoparasites [Kuijt 1969] and holomycotrophs). Parasitism upon other plants likely evolved a minimum of 12 times among eudicot angiosperms (Westwood et al. 2010; McNeal et al. 2013), whereas mycoheterotrophy has evolved at least 40 times, mainly among monocot angiosperms, with at least 30 independent shifts within the family Orchidaceae alone (Leake 1994; Bidartondo 2005; Freudenstein and Barrett 2010; Leake and Cameron 2010; Merckx and Freudenstein 2010).

Plastid-encoded genes in heterotrophic plants are expected to display evidence of degradation through mutation with little or no selective consequences, due to relaxed purifying selection on photosynthesis. The plastid genome has been the target of recent studies in heterotrophic plants, in part brought about by the availability of next-generation sequencing technologies. Examples of heterotrophic taxa for which plastome evolution has been studied include various Orobanchaceae (e.g., Epifagus, Wolfe et al. 1992; Conopholis, Wimpee et al. 1991; Hyobanche/Harveya, Wolfe and dePamphilis 1998; Randle and Wolfe 2005; family-wide representative genera, Wicke et al. 2013), Convolvulaceae (Cuscuta, Funk et al. 2007; McNeal et al. 2007; Braukmann et al. 2013), monotropoid Ericaceae (Braukmann and Stefanovic 2012; Broe M and Freudenstein J, personal communication), and monocots including Orchidaceae (Delannoy et al. 2011; Logacheva et al. 2011, 2014; Barrett and Davis 2012).


Nearly all of those taxa display some degree of reduction in overall plastome size and gene content, particularly with reduction in photosynthesis-related gene systems, with many “extreme” examples of genomic reduction (e.g., Delannoy et al. 2011; Wicke et al. 2013). In the heterotrophic angiosperm Rafflesia and the free-living nonphotosynthetic alga Polytomella, failure to recover plastid genomes suggests their complete loss (Molina et al. 2014; Smith and Lee 2014). Although these examples of extreme reduction of the plastid genome are highly interesting, the initial stages in the progression of plastid genome degradation remain poorly understood. To address this issue, focus must be oriented toward lower taxa (e.g., see Braukmann et al. 2013) displaying variation in trophic strategy and thus representing a “snapshot” of the transition from autotrophy to strict heterotrophy.

Building on previous syntheses of plastid genome evolution in heterotrophic plants (e.g., see Krause 2008; Wicke et al. 2011), Barrett and Davis (2012) proposed a model of plastid genome degradation with five “stages”: (1) ndh; (2) psa/psb, pet, rbcL, ycf3, ycf4; (3) rpo; (4) atp; (5) rpl, rps, rrn/trn, accD, clpP, infA, matK, ycf1, ycf2. The rationale behind this model is that in many groups the NAD(P)H complex (i.e., ndh genes; stage 1) is the first plastid-encoded gene system to show evidence of degradation, even in some photosynthetic groups (see Wicke et al. 2011; Barrett and Davis 2012, and references therein). Second, as a result of the transition to a heterotrophic lifestyle, purifying selection is expected to be relaxed for genes involved directly in photosynthesis (stage 2), followed by degradation of the plastid-encoded RNA polymerase complex (“PEP,” or rpo genes; stage 3), due to the role of the latter in transcribing plastid operons encoding many gene products with photosynthesis-related functions. The next complex hypothesized to show evidence of degradation is the ATP synthase complex (atp genes; stage 4). Although this complex is directly involved in synthesizing ATP as part of the photosynthetic process, its function may remain essential in plastids other than chloroplasts (e.g., amyloplasts). The last proposed stage is the degradation of “housekeeping” genes that play roles in basic processes of the plastid itself (e.g., on-site protein synthesis, RNA processing).
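The five stages listed above can be read as a lookup from gene name to stage. The following is a minimal Python sketch of that reading, not code from the study: the function names `stage_of` and `stages_affected` are hypothetical, the gene groupings are taken from the stage list above, and genes the model does not list simply return no stage.

```python
# Hypothetical helper: map a plastid gene name to a stage of the
# five-stage degradation model summarized above (Barrett and Davis 2012).
# Exact gene names are checked before three-letter prefixes so that
# ycf1/ycf2 (stage 5) are not confused with ycf3/ycf4 (stage 2).

EXACT_STAGE = {
    "rbcL": 2, "ycf3": 2, "ycf4": 2,
    "accD": 5, "clpP": 5, "infA": 5, "matK": 5, "ycf1": 5, "ycf2": 5,
}
PREFIX_STAGE = {
    "ndh": 1,
    "psa": 2, "psb": 2, "pet": 2,
    "rpo": 3,
    "atp": 4,
    "rpl": 5, "rps": 5, "rrn": 5, "trn": 5,
}


def stage_of(gene):
    """Return the model stage (1-5) for a gene, or None if the model does not list it."""
    if gene in EXACT_STAGE:
        return EXACT_STAGE[gene]
    return PREFIX_STAGE.get(gene[:3])


def stages_affected(genes):
    """Sorted list of stages represented among a collection of degraded genes."""
    return sorted({s for s in map(stage_of, genes) if s is not None})


# Illustrative input only: a handful of plastid gene names.
example = ["ndhJ", "psbM", "rpoC1", "cemA"]
print(stages_affected(example))  # cemA is not listed in the model, so it is skipped
```

Checking exact names first is a design choice forced by the ycf genes, which fall into two different stages despite sharing a prefix.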

An ideal taxon for comparison to the model of Barrett and Davis (2012) is the mycoheterotrophic Corallorhiza (the coralroot orchids). Corallorhiza is a genus of 12 leafless species (Freudenstein 1992, 1997; Freudenstein and Senyo 2008). The genus contains some species that are partially heterotrophic and have presumably retained the capacity for photosynthesis. These taxa all have some degree of visible green tissue (Co. trifida, Co. odontorhiza, Co. wisteriana, Co. bulbosa, Co. macrantha, and Co. maculata var. mexicana; see supplementary fig. S1, Supplementary Material online). Photosynthesis has been demonstrated in one of these species, Co. trifida, the “greenest” of the coralroots (Zimmer et al. 2008; Cameron et al. 2009). Other, putatively achlorophyllous coralroot species completely lack visible green tissue (supplementary fig. S1, Supplementary Material online), including the Co. striata complex and some members of the Co. maculata complex (Co. mertensiana and Co. maculata vars. maculata and occidentalis). Preliminary evidence of plastome degradation in Corallorhiza comes from heterologous probe experiments (Freudenstein and Doyle 1994a), the presence of RuBisCO Large Subunit (rbcL) pseudogenes in some lineages but not others (Barrett and Freudenstein 2008), and sequencing of the complete plastome of Co. striata var. vreelandii (Barrett and Davis 2012).

Corallorhiza represents a potentially powerful model clade for answering questions related to the earliest stages of plastome degradation in heterotrophic plants. Furthermore, little is known about the plastid genomes of partial heterotrophs: do they also display evidence of “transitional” plastome degradation? Are all nongreen species of Corallorhiza truly achlorophyllous? Which gene complexes are the first to accumulate loss-of-function mutations? Is there a typical progression of pseudogene formation/gene loss within this genus in accordance with the model of Barrett and Davis (2012), and how do the plastomes of Corallorhiza compare with those of other parasitic angiosperms? When comparing plastomes of “green” versus “nongreen” species, are there differences in the numbers of functional genes, overall plastome size, GC content, nonsynonymous versus synonymous substitution rates, and structural rearrangements? In this study, paired-end (PE) Illumina sequencing of genomic DNA was carried out to generate complete plastomes for nine members of Corallorhiza, representing all species complexes in the genus, in order to address the following principal hypotheses:

(I) Nongreen members of Corallorhiza display plastid genome degradation (pseudogenes, deletions, rearrangements), whereas members with some green tissue are more conserved, when both are compared with the plastomes of green, leafy orchid species.

(II) There is a general path of degradation among plastid-encoded gene complexes, corresponding to the model of Barrett and Davis (2012); Corallorhiza occupies the relatively early stages in this model.

(III) In addition to pseudogenes and losses, some members of Corallorhiza have experienced relaxed purifying selection for loci with intact reading frames, especially in lineages that display evidence of degradation in other regions of the plastome.

Results

Chlorophyll Content

All Corallorhiza taxa sampled have detectable levels of chlorophyll (fig. 1). Of the three green taxa included, Co. trifida has the highest mean total chlorophyll concentration (31.6 ng/mg), whereas Co. wisteriana and Co. odontorhiza have slightly less (21.5 and 17.9 ng/mg, respectively); there is relatively wide variation in chlorophyll content among individuals within each green species. All nongreen members of Corallorhiza have lower chlorophyll concentrations compared with green Corallorhiza, with none exceeding a mean value of 2 ng/mg (fig. 1). Chlorophyll content varies significantly among green versus nongreen taxa after taking phylogeny into account by way of Phylogenetic Analysis of Variance (PPhyANOVA = 0.0146; see fig. 1). Moreover, these values are strikingly consistent across the nongreen species for which chlorophylls were measured, with very little variation within each species.
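The size of the green versus nongreen gap can be checked with simple arithmetic. The sketch below is illustrative only: it assumes the three green means are 31.6, 21.5, and 17.9 ng/mg (this decimal placement is an assumption, chosen to fit the 0 to 40 ng/mg axis of figure 1) and uses the stated ceiling of 2 ng/mg for nongreen taxa, so each ratio is a lower bound rather than a measured fold difference. The name `min_fold_difference` is hypothetical.

```python
# Mean total chlorophyll (ng/mg) for the three green taxa; the decimal
# placement is an assumption of this sketch, not a verified value.
green_means = {
    "Co. trifida": 31.6,
    "Co. wisteriana": 21.5,
    "Co. odontorhiza": 17.9,
}
NONGREEN_UPPER_BOUND = 2.0  # no nongreen taxon exceeds this mean (ng/mg)


def min_fold_difference(green_mean, nongreen_bound=NONGREEN_UPPER_BOUND):
    """Lower bound on the green/nongreen ratio, given an upper bound on the nongreen mean."""
    return green_mean / nongreen_bound


for taxon, mean in green_means.items():
    print(f"{taxon}: at least {min_fold_difference(mean):.1f}-fold")

# Unweighted mean across the three green taxa.
overall = sum(green_means.values()) / len(green_means)
print(f"green taxa overall: {overall:.1f} ng/mg, "
      f"at least {min_fold_difference(overall):.1f}-fold")
```

Under these assumptions the bound for the green taxa taken together comes out near 12-fold, in line with the roughly 10-fold difference quoted in the abstract.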

Genomic Data Sets and Plastomes

Characteristics of the Illumina data sets and resulting finished plastome assemblies are listed in tables 1 and 2. Mean coverage of plastomes for newly sequenced genomic data sets ranges from 810 (Co. wisteriana) to 2713 (Co. odontorhiza). The length of the plastome among the Corallorhiza accessions sampled ranges from 151,506 bp in Co. macrantha to 137,505 bp in Co. striata vreelandii (table 2). On average, plastomes of taxa with at least some visible green tissue are 148.74 kb in length, while those from taxa lacking visible green tissue average 144.98 kb, but the difference in mean total plastome length between these two groupings is nonsignificant after taking relationships into account (PPhyANOVA = 0.06189).

The length of the inverted repeat (IR) ranges from 25,775 bp in Co. odontorhiza to 27,243 bp in Co. maculata var. mexicana, with no major variation in gene content nor evidence of any gene being incorporated into or lost from the IR. Corallorhiza striata var. vreelandii has by far the shortest large single copy (LSC; 72,152 bp). This is followed by Co. maculata var. maculata (80,401 bp), Co. mertensiana (81,146 bp), and Co. maculata var. occidentalis (81,363 bp). Co. macrantha (84,253 bp) and Co. maculata var. mexicana (84,349 bp) have the longest LSC regions. Thus, the range of length variation among members of the Co. maculata complex for the LSC region (3,852 bp) is substantial. Nongreen taxa have significantly shorter LSC regions after accounting for phylogenetic relationships (PPhyANOVA = 0.0148). Unique to Co. maculata var. maculata is an approximately 16-kb inversion in the LSC region, with breakpoints occurring near ycf4 and within the petD intron (supplementary fig. S2, Supplementary Material online).
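Because a typical angiosperm plastome consists of one LSC, one SSC, and two copies of the IR, the region lengths should sum to the total length. The following Python sketch checks that relationship; the values are copied from Table 2, while the dictionary layout and the name `quadripartite_length` are assumptions made for illustration.

```python
# Region lengths in bp, as (LSC, SSC, IR, reported total), copied from Table 2.
TABLE2 = {
    "Co. bulbosa": (82853, 12368, 26711, 148643),
    "Co. macrantha": (84253, 12554, 27112, 151031),
    "Co. maculata var. maculata": (80401, 12885, 26800, 146886),
    "Co. maculata var. mexicana": (84349, 12671, 27243, 151506),
    "Co. maculata var. occidentalis": (81363, 12368, 26432, 146595),
    "Co. mertensiana": (81146, 13737, 26529, 147941),
    "Co. odontorhiza": (82259, 13508, 25775, 147317),
    "Co. striata var. vreelandii": (72152, 12387, 26483, 137505),
    "Co. trifida": (83085, 14427, 25936, 149384),
    "Co. wisteriana": (82350, 11743, 26172, 146437),
}


def quadripartite_length(lsc, ssc, ir):
    """Total plastome length for one LSC, one SSC, and two IR copies."""
    return lsc + ssc + 2 * ir


# Collect any accession whose regions do not add up to the reported total.
mismatches = [
    name
    for name, (lsc, ssc, ir, total) in TABLE2.items()
    if quadripartite_length(lsc, ssc, ir) != total
]
print(mismatches)  # an empty list means every row is internally consistent
```

A check of this kind is a cheap way to catch transcription errors in region boundaries before comparing lengths across taxa.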

The length of the small single copy (SSC) also varies substantially, ranging from 11,743 bp in Co. wisteriana to 14,427 bp in Co. trifida, but there is no significant difference in length between the SSC regions of the plastome for green versus nongreen taxa (PPhyANOVA = 0.7382), nor is there any correlation between SSC length and chlorophyll content (PIC = 0.4093). GC content is lowest in Co. striata vreelandii (34.14%) and highest in Co. trifida (35.74%), but does not differ significantly among green versus nongreen taxa, nor does it correlate with chlorophyll content after accounting for phylogenetic relationships (PPhyANOVA = 0.0648 and PIC = 0.0989, respectively).

Table 1. Voucher Information, GenBank Accession Numbers, the Number of PE Illumina Reads (No. reads), and Mean Coverage of Each Plastome.

Species                                                  Voucher                     GenBank[a]  No. reads   Mean cov.[b]
Corallorhiza bulbosa A. Rich. & Galeotti                 CFB 238[c] (OS, MEXU)[d]    KM390013    10,757,100  2438
Co. macrantha Schltr.                                    Salazar 8191[c] (OS, MEXU)  KM390017    9,222,076   1364
Co. maculata (Raf.) Raf. var. maculata                   JVF 2919 (OS)               KM390014    11,498,452  1091
Co. maculata var. mexicana (Lindl.) Freudenstein         CFB 232[c] (OS, MEXU)       KM390015    7,828,614   1182
Co. maculata var. occidentalis (Lindl.) Ames             JVF 2095 (OS)               KM390016    13,204,554  2251
Co. mertensiana Bong.                                    JVF 1999 (OS)               KM390018    14,540,670  2120
Co. odontorhiza (Willd.) Poir.                           JVF 2778 (OS)               KM390021    21,861,728  2713
Co. striata L. var. vreelandii[e] (Rydb.) L.O. Williams  Taylor 341 (UNM)            JX087681    3,204,074   439
Co. trifida Chatel.                                      JVF 2676 (OS)               KM390019    13,617,730  2369
Co. wisteriana Conrad                                    JVF 2462 (OS)               KM390020    9,992,438   810

[a] GenBank accession numbers will be included contingent upon a favorable review.
[b] Mean depth of coverage of each plastome, based on remapping the original read pool to the finished plastome.
[c] Duplicate vouchers at Universidad Nacional Autonoma de Mexico (UNAM).
[d] Herbarium codes: (OS) Ohio State University Herbarium; (UNM) University of New Mexico Herbarium; (MEXU) Universidad Nacional Autonoma de Mexico.
[e] Sequenced and assembled in Barrett and Davis (2012).

FIG. 1. Total chlorophyll concentration (chlorophylls a and b combined, as measured by UV-Vis spectroscopy), averaged across multiple individuals in each taxon. Numbers adjacent to taxon names are sample sizes (number of individuals). Error bars represent standard deviations. [Bar chart not reproduced. Y-axis: total chlorophyll (ng/mg), 0 to 40. Taxa shown: C. trifida (4), C. wisteriana (4), C. odontorhiza (4), C. mertensiana (4), C. mac. occidentalis (9), C. striata (11), C. bentleyi (5), with a “Non-green taxa” label on the x-axis.]

Structural Evolution of the Plastome: Pseudogenes and Losses

Figure 2 shows the maximum likelihood (ML) tree based on whole plastomes + rDNA (see below), with gene losses-of-function parsimoniously mapped, assuming no reversals. Nongreen taxa have significantly fewer putatively functional loci (as assessed by intact reading frames) than green Corallorhiza and leafy outgroups (PPhyANOVA < 0.0001; table 2). In the subset of coralroot taxa from which chlorophyll data are available, chlorophyll concentration is also significantly correlated with the number of putatively functional genes (PIC = 0.002).

FIG. 2. Maximum-likelihood tree based on whole plastomes + partial rDNA cistron, with pseudogene (Ψ) content and deletions mapped parsimoniously. Deletions are shown in two classes: partial gene deletion (>6 bp) causing frame shift or truncation (<50% of gene missing), and complete or nearly complete gene deletion causing frame shift or truncation (>50% of gene missing). psa, photosystem I; psb, PSII; pet, cytochrome; rpo, plastid-encoded polymerase; ndh, NAD(P)H dehydrogenase; ccsA, cytochrome C heme-binding protein A; cemA, chloroplast envelope membrane protein A; rbcL, RuBisCO large subunit; trn, transfer RNA; ycf3 and ycf4, putative photosystem assembly factors. Dark circles on branches indicate putative transitions to strict heterotrophy. [Tree image not reproduced. Tips: C. striata (non-green), C. trifida (green), C. odontorhiza (green), C. wisteriana (green), C. bulbosa (green), C. macrantha (green), C. maculata mexicana (green), C. mertensiana (non-green), C. maculata maculata (non-green), C. maculata occidentalis (non-green). The figure also labels the C. maculata complex, its Mexican and northern North American groups, and the 16-kb inversion.]

Table 2. Green Tissue Status, Lengths of Plastomes and Subregions, GC Content, and Numbers of Putatively Functional Plastid Genes for Each Accession.

Species                         Green tissue[a]  Length (bp)  LSC (bp)  SSC (bp)  IR (bp)  GC (%)  No. functional genes[c]
Corallorhiza bulbosa            Yes              148,643      82,853    12,368    26,711   35.19   105
Co. macrantha                   Yes              151,031      84,253    12,554    27,112   35.21   104
Co. maculata var. maculata      No               146,886      80,401    12,885    26,800   34.94   89
Co. maculata var. mexicana      Yes              151,506      84,349    12,671    27,243   35.18   103
Co. maculata var. occidentalis  No               146,595      81,363    12,368    26,432   35.14   88
Co. mertensiana                 No               147,941      81,146    13,737    26,529   35.23   90
Co. odontorhiza                 Yes              147,317      82,259    13,508    25,775   35.62   103
Co. striata var. vreelandii[d]  No               137,505      72,152    12,387    26,483   34.14   82
Co. trifida[b]                  Yes              149,384      83,085    14,427    25,936   35.74   107
Co. wisteriana                  Yes              146,437      82,350    11,743    26,172   35.28   103

NOTE: GC, G+C base percentage of each plastome.
[a] Presence of visible green tissue, based on observation of multiple populations of each species.
[b] Photosynthesis has only been demonstrated in Co. trifida.
[c] The number of putatively functional genes, as assessed by intact reading frames.
[d] Sequenced and assembled in Barrett and Davis (2012).

Plastid genes encoding subunits of the NAD(P)H dehydrogenase complex (ndh genes) have mostly become pseudogenes in Corallorhiza, with some wholly or mostly deleted. Over 50% of the gene encoding Photosystem II (PSII) protein M (psbM), a small gene of 105 bp in length, has been deleted in all members of Corallorhiza except Co. striata vreelandii and Co. trifida; it has been completely deleted in Co. maculata vars. maculata and occidentalis. Among members of Corallorhiza displaying at least some green tissue, few other pseudogenes or gene losses were detected. The cemA locus is a pseudogene in all but Co. bulbosa, Co. macrantha, and Co. trifida, whereas psaI has experienced a 4-bp insertion in Co. macrantha near the 5′-end of the gene, causing a reading frame shift (fig. 2). Grep searches of the original reads confirmed this to be the case. No pseudogenes or major deletions causing shifts in reading frame were detected in atp genes in any species of Corallorhiza.
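The text reports that grep searches of the raw reads were used to confirm the psaI insertion. As a rough Python stand-in for that kind of check (not the authors' command, which is not given), the sketch below counts reads containing a short diagnostic motif on either strand. The motif and reads are made-up toy sequences, and `reverse_complement` and `count_reads_with` are hypothetical names.

```python
# Complement table for unambiguous DNA bases plus N.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Reverse complement of a DNA string (upper-cased)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def count_reads_with(motif, reads):
    """Count reads that contain the motif on either strand."""
    forward = motif.upper()
    reverse = reverse_complement(forward)
    return sum(
        1 for read in reads
        if forward in read.upper() or reverse in read.upper()
    )


# Toy data only: these sequences are invented for illustration.
toy_reads = ["ACGTTTGACCA", "TGGTCAAACGT", "GGGGCCCC", "ttgacc"]
print(count_reads_with("TTGACC", toy_reads))
```

In practice the motif would span the putative insertion plus flanking bases, so that reads supporting the insertion can be told apart from reads supporting the uninterrupted sequence.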

As previously discussed in Barrett and Davis (2012), at least one member of all major photosynthesis-related gene systems in Co. striata vreelandii has experienced pseudogene formation or gene deletion, with the exception of genes encoding subunits of the ATP synthase complex (atp genes). Several loss-of-function mutations have occurred in the nongreen, northern North American members of the Co. maculata complex (Co. mertensiana, Co. maculata maculata, and Co. maculata occidentalis). These three taxa share a partial deletion in psbB, as well as numerous pseudogenes in the psa (Photosystem I), psb (PSII), and rpo (plastid-encoded RNA polymerase) complexes, evidenced by reading frame shifts and internal stop codons. They also all have pseudogenes for ccsA (cytochrome C biogenesis protein) and rbcL (RuBisCO large subunit).

Unique mutations to Co. mertensiana among the northern North American members of the Co. maculata complex include deletion of a large portion of psaA, pseudogenes for psbA and psbD, and a deletion in rpoC1. Corallorhiza maculata maculata and Co. maculata occidentalis both contain pseudogenes for petA (Cytochrome f), petG (cytochrome b6f complex subunit V), and rpoB. Mutations unique to Co. maculata occidentalis include pseudogenes for psbH and petB, with deletions in ccsA and rpoC1. Mutations in Co. maculata maculata include the evolution of pseudogenes in psbA and deletions in rpoA and rpoC2. Although the petD gene in Co. maculata maculata appears to bear an intact reading frame, a breakpoint of a 16-kb inversion (fig. 2) separates the two exons of this gene, which in other taxa are adjacent and thus presumably cis-spliced.

Phylogenetic Analyses

Relationships based on ML analyses of various data matrices were generally consistent (supplementary figs. S3 and S4, Supplementary Material online). The main difference among ML topologies resulting from these data sets is the placement of Co. bulbosa. For plastid coding data, Co. bulbosa is sister to Co. odontorhiza + Co. wisteriana (bootstrap = 97), whereas for nuclear rDNA (partial ETS-18S-ITS1-5.8S-ITS2-26S) it is placed as sister to Co. macrantha and Co. maculata mexicana (bootstrap = 92). Whole plastomes, including introns and intergenic spacers, place Co. bulbosa as sister to the rest of the Co. maculata complex (bootstrap = 91). Nuclear rDNA sequences place Co. trifida as sister to the remaining Corallorhiza; however, the placement of Co. striata as sister to the remaining Corallorhiza (excluding Co. trifida) received low support (bootstrap = 69), suggesting little support from rDNA at the base of Corallorhiza.

Analyzing the entire plastome along with nuclear rDNA yields a completely resolved Corallorhiza tree, with 100% bootstrap support for all nodes (supplementary fig. S3, Supplementary Material online). Parsimony analysis of this data set yielded identical results to ML, including identical jackknife support values (100% for all nodes; tree not shown; see supplementary fig. S3, Supplementary Material online).

FIG. 3. A basic model of plastid genome degradation (Barrett and Davis 2012), using sequenced exemplar plastomes from angiosperm families containing heterotrophs: Convolvulaceae (Funk et al. 2007; McNeal et al. 2007), Orobanchaceae (Wolfe et al. 1992; Wicke et al. 2013), Orchidaceae (Chang et al. 2006; Wu et al. 2010; Delannoy et al. 2011; Barrett and Davis 2012; Logacheva et al. 2011; Yang et al. 2013), and Petrosaviaceae (Logacheva et al. 2014). Plastomes are ranked by numbers of putatively functional genes (highest to lowest). Black-filled spaces represent putatively functional genes with open reading frames, whereas gray-filled spaces represent pseudogenes, and white spaces represent complete or nearly complete gene losses. “Status”: trophic strategy of each species; AU, nonparasitic autotroph; HemiP, hemiparasite; HP, holoparasite; PM, partial mycoheterotroph (analogous to hemiparasite); HM, holomycotroph. For the purposes of comparison, both leafy, green orchids (Oncidium, Phalaenopsis) are considered nonparasitic autotrophs, even though they may still rely partially on their mycorrhizal fungi. L (bp), total length of each plastome in base pairs. Photosynthesis-related excludes atp genes, as these may serve functions outside of photosynthesis.

This topology was chosen for all downstream tree-based analyses. From the root of the tree (with Cymbidium as the outgroup), Oncidium and Phalaenopsis are successively sister to a monophyletic Corallorhiza, with Co. striata vreelandii and Co. trifida successively diverging from the ancestor of the remaining members of Corallorhiza. Nested within this clade, (Co. odontorhiza, Co. wisteriana) is sister to the Co. maculata complex. The Mexican Co. bulbosa is sister to all other members of the Co. maculata complex with 100% support, whereas within the latter, the Mexican Co. macrantha and Co. maculata mexicana are sister to one another, and this clade is collectively sister to a northern North American clade of (Co. mertensiana, (Co. maculata maculata, Co. maculata occidentalis)).
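The topology just described, combined with the green or nongreen status of each tip, is enough to count the minimum number of changes in that trait. The sketch below uses Fitch parsimony as an illustration; this is a swapped-in technique, since the mapping in figure 2 assumes no reversals whereas Fitch parsimony is unordered. The nested-tuple encoding, the exact placement of the leafy outgroups, and the name `fitch` are assumptions of this example.

```python
# Topology described in the text, encoded as nested 2-tuples.
TREE = (
    "Cymbidium",
    ("Oncidium",
     ("Phalaenopsis",
      ("Co. striata",
       ("Co. trifida",
        (("Co. odontorhiza", "Co. wisteriana"),
         ("Co. bulbosa",
          (("Co. macrantha", "Co. maculata mexicana"),
           ("Co. mertensiana",
            ("Co. maculata maculata", "Co. maculata occidentalis"))))))))),
)

# Tips lacking visible green tissue (Table 2); all other tips are scored green.
NONGREEN = {
    "Co. striata",
    "Co. mertensiana",
    "Co. maculata maculata",
    "Co. maculata occidentalis",
}


def fitch(node):
    """Return (candidate state set, minimum number of changes) for a subtree."""
    if isinstance(node, str):
        return ({"nongreen"} if node in NONGREEN else {"green"}), 0
    left_set, left_cost = fitch(node[0])
    right_set, right_cost = fitch(node[1])
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed here.
        return shared, left_cost + right_cost
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


root_states, changes = fitch(TREE)
print(root_states, changes)
```

With this encoding the count comes out to two changes with a green root, which matches the inference of at least two independent losses of photosynthesis.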

Analyses of Selective Regime

Various branch models were constructed to explicitly test the hypothesis of different ω-ratios for different branches, specifically for nongreen Corallorhiza. Only loci for which all taxa had intact reading frames were included. The most basic model (M0) is a “one-ratio” model, which assumes a single ω-ratio across all branches (here, ω_leafy = ω_cor = ω_mac-ng = ω_str; see table 3 for branch-based abbreviations of ω-ratios). Model M1, which allows all of Corallorhiza to have a different ω-ratio than the leafy outgroups (ω_leafy ≠ ω_cor = ω_mac-ng = ω_str), has a significantly better fit for atp genes (table 3). Also for atp genes, a three-ratio model (M2), allowing different ω-ratios in Corallorhiza and additionally for nongreen Corallorhiza, as opposed to leafy outgroups, has a significantly better fit than the nested M1 model. Model M4, which is a two-ratio model specifying different ω-ratios for leafy outgroups + green Corallorhiza as opposed to nongreen Corallorhiza (ω_leafy = ω_cor ≠ ω_mac-ng = ω_str), is highly significant compared with the one-ratio model (M0) after Bonferroni correction. Photosynthesis-related genes (excluding atp genes) have a significantly better fit for models M2 (ω_leafy ≠ ω_cor ≠ ω_mac-ng = ω_str) and M4 (ω_leafy = ω_cor ≠ ω_mac-ng = ω_str) than M0, but these were nonsignificant after Bonferroni correction. Regardless, the branch-specific estimate (model M4) of ω in nongreen taxa (ω_mac-ng = ω_str = 0.73720) was more than twice that of green taxa (ω_leafy = ω_cor = 0.27361), approaching a pattern expected under selective neutrality for photosynthesis-related genes (ω ~ 1). “Housekeeping” genes show significantly better fit for M1 and M4, but these again were nonsignificant after Bonferroni correction for multiple comparisons. None of the branch models had a significantly better fit in nested comparisons for ycf1 + ycf2 or for matK.

Branch-site models were applied in the atp complex to test for evidence of positive selection. First, for a model in which all Corallorhiza were specified as a foreground clade, the alternative model, which allows some sites to be under positive selection (ω_2 > 1), did not fit the data significantly better than the null model, which allows no sites to be under positive selection (ω_1 = ω_2 = 1; χ² = 0.6, df = 1, P = 0.4386). However, the alternative model had a significantly better fit than the null model when specifying nongreen Corallorhiza taxa as foreground branches, after Bonferroni correction for multiple comparisons (χ² = 11.4, df = 1, P = 0.0007), suggesting that some sites in the atp complex are under positive selection.
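The two P values reported for the branch-site tests can be checked directly from the test statistics. The following is a minimal sketch, assuming a plain chi-square reference distribution with 1 degree of freedom, for which the upper-tail probability reduces to erfc(sqrt(x/2)). The function name is hypothetical and is not part of PAML.

```python
import math


def chi2_p_value_df1(statistic: float) -> float:
    """Upper-tail P value for a chi-square statistic with 1 degree of freedom.

    For df = 1, P(X > x) = erfc(sqrt(x / 2)), so no external library is needed.
    """
    if statistic < 0:
        raise ValueError("a likelihood ratio statistic cannot be negative")
    return math.erfc(math.sqrt(statistic / 2.0))


# Test statistics quoted in the text for the atp complex (branch-site models).
tests = {
    "all Corallorhiza as foreground": 0.6,
    "nongreen Corallorhiza as foreground": 11.4,
}

for label, stat in tests.items():
    print(f"{label}: chi2 = {stat}, P = {chi2_p_value_df1(stat):.4f}")
```

Both values agree with those given in the text to the reported precision.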

Discussion

Chlorophyll Content of Corallorhiza

Some Corallorhiza, and many heterotrophic plants, are often referred to as “achlorophyllous” based on their lack of visible green coloration. However, all Corallorhiza included in this study contain at least some detectable levels of chlorophyll (fig. 1). Montfort and Küsters (1940) detected chlorophyll and photosynthetic activity in Co. trifida, which is not surprising given the predominantly green coloration of the aboveground organs of this species; a later study by Cummings and Welschmeyer (1998) detected some levels of chlorophyll in the nongreen Co. maculata, as well as in another nongreen, strictly heterotrophic orchid, Cephalanthera austiniae. Taxa with at least some visible green tissue had, on average, approximately 10-fold higher chlorophyll concentrations than nongreen taxa (fig. 1). Although leafless, Co. trifida is the greenest of the coralroots, with green pigmentation throughout nearly the entire aboveground portion of the plant. The distribution of green tissues among other partially green coralroot species is more variable, but the common theme is that at least some green tissue is noticeable in or around the ovary.

Coupled with the fact that green or partially green coralroots also show a general lack of degradation in photosynthesis-related plastid gene systems (with the exception of the ndh complex and a few other “minor” examples; see below), these observations suggest that the aforementioned “green” coralroot taxa may indeed all be partially heterotrophic, relying to some small degree on photosynthetic carbon. Though it has been demonstrated that Co. trifida is an inefficient photosynthesizer (i.e., net photosynthetic carbon assimilation cannot compensate for net respiration), there is likely to be an adaptive reason that photosynthesis persists in this species, and possibly in others such as Co. odontorhiza and Co. wisteriana. Ironically, Co. trifida is the one coralroot species that might be expected to depend most heavily on photosynthesis (Zimmer et al. 2008; Cameron et al. 2009). One possibility is that these species supplement fungal carbon uptake with photosynthetic carbon in specific tissues only, mainly in the ovary, inside of which hundreds or even thousands of minute “dust seeds” are produced. This has been experimentally demonstrated in the partially mycoheterotrophic orchid Limodorum abortivum, with the additional findings that photosynthesis is highest in ovary tissue and increases under experimentally induced fungal carbon limitation (Bellino et al. 2014). It is currently unknown whether any temporal fungal carbon limitations occur throughout the short season when Corallorhiza displays aboveground growth (typically late spring for flowering to summer for seed/ovary development). Based on current knowledge, it is hypothesized that for vegetatively reduced orchids such as “green” Corallorhiza and Limodorum, compensation of carbon in and around developing ovaries has some adaptive value, which in turn would impose purifying selection on the plastid- (and presumably nuclear-) encoded components of photosynthesis.

The more puzzling finding applies to nongreen taxa: nongreen coralroots had surprisingly consistent (albeit relatively minute) chlorophyll concentrations, with very little variation within or among species (fig. 1), despite having been sampled from different parts of their respective geographic ranges. Chlorophyll has been detected in other strictly heterotrophic plants (Cummings and Welschmeyer 1998), including members of Orobanchaceae and the monotropoid Ericaceae. The question naturally arises: Why would putatively nonphotosynthetic taxa continue to produce chlorophyll? There are several hypotheses as to why this might be the case, including photoprotection/avoidance of light damage to DNA and other cellular components, involvement of the chlorophyll biosynthesis machinery in other biochemical pathways such as plastid-nuclear signaling, or prevention of the buildup of chlorophyll precursors that may lead to increased oxidative damage (discussed in Wickett et al. 2011 and references therein). In nearly all cases studied, strictly parasitic species

Table 3. Branch Models for Coding Regions of the Plastome in Corallorhiza for Which All Accessions Have Intact Reading Frames, Based on Various Configurations of Coding Data, Implemented in the CODEML Module of PAML.

Model  ω estimates                                                              np(a)  ln L(b)       LRT(c)     P/Pcorr(d)

atp genes
M0     ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.21505                             26     -9802.76868
M1     ω_leafy = 0.15175; ω_cor = ω_mac-ng = ω_str = 0.26949                    27     -9798.12794   9.281476   **/**
M2     ω_leafy = 0.15211; ω_cor = 0.22238; ω_mac-ng = ω_str = 0.38239           28     -9795.68827   4.879344   */n.s.
M3     ω_leafy = 0.15212; ω_cor = 0.22242; ω_mac-ng = 0.41999; ω_str = 0.36243  29     -9795.62404   0.128448
M4     ω_leafy = ω_cor = 0.18380; ω_mac-ng = ω_str = 0.38274                    27     -9797.32610   10.885142  ***/**
M5     ω_leafy = ω_cor = 0.18382; ω_mac-ng = 0.41997; ω_str = 0.36292           28     -9797.26308   0.126042

Photosynthesis genes (pet, psa/b, rbcL)
M0     ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.33866                             26     -1854.62551
M1     ω_leafy = 0.29790; ω_cor = ω_mac-ng = ω_str = 0.32741                    27     -1854.48900   0.273012
M2     ω_leafy = 0.29856; ω_cor = 0.25277; ω_mac-ng = ω_str = 0.73721           28     -1852.32692   4.324170   */n.s.
M3     ω_leafy = 0.29858; ω_cor = 0.25278; ω_mac-ng = 0.59352; ω_str = 0.88556  29     -1852.22042   0.213002
M4     ω_leafy = ω_cor = 0.27361; ω_mac-ng = ω_str = 0.73720                    27     -1852.39825   4.454520   */n.s.
M5     ω_leafy = ω_cor = 0.27362; ω_mac-ng = 0.59352; ω_str = 0.88550           28     -1852.29177   0.212952

Housekeeping genes (rps/l, infA, accD, clpP, etc.)
M0     ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.30179                             26     -25545.1250
M1     ω_leafy = 0.29055; ω_cor = ω_mac-ng = ω_str = 0.32343                    27     -25542.7794   4.691148   */n.s.
M2     ω_leafy = 0.29066; ω_cor = 0.31584; ω_mac-ng = ω_str = 0.34192           28     -25542.6470   0.264796
M3     ω_leafy = 0.29065; ω_cor = 0.31583; ω_mac-ng = 0.33612; ω_str = 0.34612  29     -25542.6407   0.012646
M4     ω_leafy = ω_cor = 0.30267; ω_mac-ng = ω_str = 0.34230                    27     -25542.8978   4.454430   */n.s.
M5     ω_leafy = ω_cor = 0.30269; ω_mac-ng = 0.33612; ω_str = 0.34668           28     -25542.8908   0.014052

ycf1 & 2
M0     ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.66003                             26     -25384.8570
M1     ω_leafy = 0.82314; ω_cor = ω_mac-ng = ω_str = 0.61213                    27     -25383.1494   3.415136
M2     ω_leafy = 0.82370; ω_cor = 0.61073; ω_mac-ng = ω_str = 0.61543           28     -25383.1485   0.001854
M3     ω_leafy = 0.82338; ω_cor = 0.61050; ω_mac-ng = 0.59460; ω_str = 0.63131  29     -25383.1215   0.053994
M4     ω_leafy = ω_cor = 0.67331; ω_mac-ng = ω_str = 0.61464                    27     -25384.6635   0.386918
M5     ω_leafy = ω_cor = 0.67313; ω_mac-ng = 0.59296; ω_str = 0.62959           28     -25384.6382   0.050592

matK
M0     ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.43481                             26     -3666.30553
M1     ω_leafy = 0.44943; ω_cor = ω_mac-ng = ω_str = 0.42324                    27     -3666.27784   0.055394
M2     ω_leafy = 0.45060; ω_cor = 0.38642; ω_mac-ng = ω_str = 0.53159           28     -3665.93065   0.694374
M3     ω_leafy = 0.45107; ω_cor = 0.38419; ω_mac-ng = 0.33356; ω_str = 0.99284  29     -3664.63804   2.585212
M4     ω_leafy = ω_cor = 0.41948; ω_mac-ng = ω_str = 0.53022                    27     -3666.08306   0.444942
M5     ω_leafy = ω_cor = 0.41857; ω_mac-ng = 0.33352; ω_str = 0.98601           28     -3664.80458   2.556974

NOTE: Abbreviations for ω-ratios: leafy = green, leafy outgroups (Cymbidium, Oncidium, Phalaenopsis); cor = all Corallorhiza; mac-ng = nongreen Co. maculata (i.e., Co. mertensiana, Co. maculata maculata, Co. maculata occidentalis); str = Co. striata. Successively nested models were compared based on log-likelihood ratio tests (M1 vs. M0, M2 vs. M1, M3 vs. M2, M4 vs. M0, M5 vs. M4); all comparisons had 1 degree of freedom.
*P < 0.05; **P < 0.01; ***P < 0.001; “n.s.” = not significant (where indicated); otherwise, blank space indicates no significance.
(a) The number of free parameters for a particular model.
(b) Log-likelihood of the data for a particular model.
(c) The chi-square distributed log-likelihood ratio test statistic used to evaluate significant differences in model fit.
(d) P values for uncorrected/Bonferroni-corrected χ² tests, where Pcorr depends on the number of branch parameters in the model being tested.
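The likelihood ratio statistics in table 3 follow from the tabulated log-likelihoods as twice the difference between the alternative and the null model. The sketch below recomputes three of the atp comparisons. It assumes the log-likelihoods are negative with the decimal placement shown in the code (the placement under which the tabulated statistics are mutually consistent), and the dictionary and function names are hypothetical.

```python
import math

# Log-likelihoods for the atp gene set (assumed decimal placement and sign).
LNL_ATP = {
    "M0": -9802.76868,
    "M1": -9798.12794,
    "M2": -9795.68827,
    "M4": -9797.32610,
}

# (alternative, null) pairs, as listed in the table note.
NESTED_PAIRS = [("M1", "M0"), ("M2", "M1"), ("M4", "M0")]


def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Likelihood ratio statistic: 2 * (lnL_alternative - lnL_null)."""
    return 2.0 * (lnl_alt - lnl_null)


def p_value_df1(statistic: float) -> float:
    """Upper-tail chi-square P value for 1 degree of freedom."""
    return math.erfc(math.sqrt(max(statistic, 0.0) / 2.0))


for alt, null in NESTED_PAIRS:
    stat = lrt_statistic(LNL_ATP[null], LNL_ATP[alt])
    print(f"{alt} vs. {null}: LRT = {stat:.5f}, uncorrected P = {p_value_df1(stat):.4f}")
```

The recomputed statistics match the tabulated values to within rounding of the log-likelihoods. A Bonferroni-style correction would then compare each uncorrected P value against a stricter threshold.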


(including the nongreen coralroots) that still synthesize detectable levels of chlorophyll do so at reduced levels. The fact that all nongreen coralroots contain relatively low chlorophyll concentrations means that these pigments are masked, and thus the overall coloration of these species tends to be red, purple, or brown, most likely due to anthocyanins occurring at higher concentrations (Freudenstein and Doyle 1994a; personal observation). Occasionally, yellow variants occur among populations of the various coralroot species (Freudenstein 1997; Barrett CF, personal observation). These are presumably recurrent anthocyanin-deficient mutants (yellow coloration may be due to persistence of chlorophylls and other yellow/orange carotenoid pigments), though no specific studies have been carried out to demonstrate the causes of this phenomenon in Corallorhiza. Regardless, reduced chlorophyll concentrations and the resulting apparent lack of visible green coloration in Corallorhiza are excellent indicators of relaxed selective constraints on photosynthetic machinery, and this is likely to be the case for many other parasite-containing taxa.

Characteristics of Corallorhiza Plastomes

With perhaps the exception of Co. striata vreelandii, plastomes among coralroot taxa do not altogether display the major size reduction that would be expected for a group of leafless parasites, and thus can be said to occupy the early stages along the path to a highly reduced or “minimal” plastid genome (Barbrook et al. 2006; Krause 2008). In a previous study, it was hypothesized that members of the Co. striata complex would have the most reduced plastomes in the genus in terms of overall size (Barrett and Davis 2012), but even the plastome of Co. striata vreelandii (the representative sequenced in that study and included here) only represents an approximately 6% total genome size reduction relative to the leafy Phalaenopsis (Chang et al. 2006). In fact, Co. macrantha and Co. maculata mexicana (table 2) both have larger plastomes than two of the three leafy outgroup orchid taxa, Phalaenopsis (148,964 bp) and Oncidium (146,484 bp). Thus, the shift to leaflessness per se in Corallorhiza does not appear to be associated with a reduction in plastome size for the genus overall (at least for partially green taxa), although an examination of plastomes for the closest relatives of Corallorhiza (e.g., Oreorchis, Aplectrum, Cremastra, Govenia, Calypso) in tribe Calypsoeae will help clarify this.

A reduction in plastome size due to deletions and mutations resulting in pseudogenes is expected to be associated with a decrease in GC content, as once-functional (and relatively GC-rich) genes become riddled with loss-of-function mutations over time as a result of relaxed photosynthesis-associated selective pressures. These coding regions begin to resemble noncoding DNA, which typically has a lower GC content (e.g., Bernardi 1989; Glémin et al. 2014). This is apparently the case for Co. striata vreelandii (34.02% GC; table 2), which is 1.16% lower than the mean GC content for the remaining coralroot plastomes (35.28%). Mean GC content for the nongreen members of the Co. maculata complex is 35.10%, and no different than that found in the green coralroots. This suggests that large-scale differences in plastid GC content may take longer to manifest, and thus likely come about later in the process of plastid modification relative to other processes such as photosynthesis gene deletions and other initial gene mutations causing loss of function. This lends additional support to the notion that Co. striata vreelandii is farther “down the path” of plastid genome degradation relative to the nongreen members of the Co. maculata complex.
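GC content as discussed above is simply the percentage of G and C among the unambiguous bases of a sequence. The following is a minimal sketch of that calculation; the function name and the toy sequences are hypothetical and are not taken from the plastomes analyzed here.

```python
def gc_content_percent(sequence: str) -> float:
    """Percentage of G + C among unambiguous bases (A, C, G, T) of a sequence.

    Ambiguity codes such as N and gap characters are ignored.
    """
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc_count = sum(1 for base in bases if base in "GC")
    return 100.0 * gc_count / len(bases)


# Toy sequences for illustration only.
intact_like = "ATGGCCGGTCTGCAAGGC"
degraded_like = "ATGACTAATTTGCAATAA"

print(f"intact-like:   {gc_content_percent(intact_like):.2f}% GC")
print(f"degraded-like: {gc_content_percent(degraded_like):.2f}% GC")
```

Applied to whole plastome sequences, the same calculation would yield values comparable to those in table 2, assuming ambiguous positions are treated the same way.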

More broadly, the leafy orchids Phalaenopsis aphrodite and the hybrid Oncidium Gower Ramsey have GC contents of 35.6% and 37.32%, respectively (Chang et al. 2006; Wu et al. 2010); these values are very similar to what is found in Co. trifida and other green coralroots, and thus green coralroots do not seem to deviate much from other leafy orchids. Members of holo- and hemiparasitic Orobanchaceae (Wicke et al. 2013) represent an interesting clade for comparison with Corallorhiza in terms of GC content; Orobanchaceae range from 38.08% GC in the hemiparasitic Schwalbea to 31.09% in the holoparasitic Phelipanche purpurea. Variation in GC content among plastomes of Corallorhiza lies completely within the bounds of that observed in Orobanchaceae, which is hypothesized to be in the relatively more advanced stages of degradation.

Structural Evolution of the Plastome: Pseudogenes and Losses

ndh Genes

Degradation of the plastid-encoded ndh complex is advanced in Corallorhiza, as has been observed in other groups (e.g., Chang et al. 2006; Funk et al. 2007; McNeal et al. 2007; Braukmann et al. 2009; Blazier et al. 2011; Braukmann and Stefanović 2012; Braukmann et al. 2013; Iles et al. 2013; Peredo et al. 2013; Wicke et al. 2013), including some of the leafy, photosynthetic orchid plastomes included in figure 3. Interestingly, there is variation among the orchid plastomes sequenced to date with respect to the status of ndh genes (Chang et al. 2006; Wu et al. 2010; Delannoy et al. 2011; Barrett and Davis 2012; Logacheva et al. 2011; Yang et al. 2013), suggesting multiple independent degradation pathways in this complex in the orchid family.

The ancestor of Corallorhiza likely already experienced some degree of degradation in the ndh complex, as evidenced by several shared pseudogenes. Sequencing of additional orchid plastomes could reveal an evolutionary transition from a functional plastid-encoded ndh complex in leafy relatives (e.g., Aplectrum, Oreorchis, Cremastra, Govenia) to a largely degraded state as observed in Corallorhiza. However, there is even some variation in the degree of degradation among certain members of ndh within Corallorhiza, which in part reflects the plastid phylogeny (fig. 2). This pattern includes a major deletion in ndhE in all taxa but Co. striata vreelandii; a shared pseudogene for ndhJ for all members of the Co. maculata complex besides Co. bulbosa (which here is sister to the remaining members of the Co. maculata complex; fig. 2); a loss of the same gene in (Co. odontorhiza, Co. wisteriana); and hypothesized parallel deletions of the ndhG pseudogene in Co. bulbosa and the clade of (Co. macrantha, Co. maculata mexicana).

The plastid-encoded ndh complex is believed to be dispensable for a number of reasons (reviewed in Martín and Sabater 2010; Wicke et al. 2011, 2013), which may differ for various groups of plants displaying evidence of degradation in this gene complex. Generally, the ndh complex is believed to be essential for photosynthetic function in environments with highly variable light intensities, by functioning in regulation of cyclic electron transport and also in mitigating the effects of photo-oxidative damage under high light intensities (Martín et al. 2009; Martín and Sabater 2010). In the ancestor of Corallorhiza, selective constraints on the ndh complex may have been lifted by a combination of decreased dependence upon photosynthetic carbon as a result of a shift to heavy dependence upon fungal carbon (a similar situation is also observed in carnivorous plants using animals as a carbon source in the family Lentibulariaceae; Wicke et al. 2014) and a tendency to grow in dark, shaded, mature forests where light intensity is always low. This may also occur in other orchid lineages (including leafy, photosynthetic orchids) due to high dependence on fungal carbon as seedlings, and also the tendency to live in habitats with relatively low, or at least static, light intensity (e.g., a shaded rainforest understory or floor).

Photosynthesis-Related Genes (excluding ndh, rpo, and atp)

Nongreen members of Corallorhiza (that is, those with relatively reduced levels of chlorophyll; fig. 1) all display evidence of degradation of genes directly involved in photosynthetic processes (fig. 2; e.g., psa/b, pet, rbcL). Of these taxa, Co. striata vreelandii displays the highest number of degraded genes (table 2, figs. 2 and 3), as reported in Barrett and Davis (2012), followed by northern North American members of the Co. maculata complex (Co. mertensiana, Co. maculata maculata, Co. maculata occidentalis). Both of these complexes (Co. striata, Co. maculata) have independently undergone degradation in the following genes/gene complexes (omitting ndh; see above): pet, psa/b, rpo, ccsA, cemA, and rbcL. It is highly unlikely that transcript editing of any kind can compensate for these changes, and it can be assumed that these changes are irreversible. The possibility exists, however slight, that all deleted genes or pseudogenes among the plastomes of these taxa are redundant, and that functional copies are encoded in the nuclear genome, or that genes have been “regained” through intergenomic or horizontal transfer (e.g., Iorizzo et al. 2012; Straub et al. 2013; Wicke et al. 2013).

Based on the degraded states of photosynthesis-related genes among these plastomes, it can be concluded that the aforementioned nongreen taxa are strict parasites of fungi, incapable of photosynthesis, and thus the terms “holomycotroph” or “full mycoheterotroph” are indeed appropriate. No studies of photosynthetic activity have been carried out in Corallorhiza, aside from those in the “greenest” coralroot, Co. trifida (Zimmer et al. 2008; Cameron et al. 2009). It will be beneficial to measure photosynthetic rates (or determine the lack thereof), along with plastid-gene transcript levels, among multiple individuals of both green and nongreen coralroot taxa, preferably growing in close proximity to one another and in similar habitats, to definitively characterize photosynthetic capacity in the genus.

Though green members of Corallorhiza do not display significant degradation of photosynthesis-related genes (psa, psb, pet, rbcL, etc.), there are a few important exceptions, which may or may not affect the efficacy of photosynthesis in these taxa. First, members of Corallorhiza excluding Co. trifida and Co. striata have experienced a large deletion in psbM (PSII protein M). This low-molecular-weight peptide is believed to play a role in maintaining the stability of PSII dimers; psbM-deficient tobacco mutants displayed impaired stability of PSII but were still able to function (Umate et al. 2007; Kawakami et al. 2011). Photosynthesis has only been explicitly measured and demonstrated in Co. trifida (Zimmer et al. 2008; Cameron et al. 2009), which has a putatively functional copy of psbM. Thus, it is possible that 1) other “green” coralroots (Co. odontorhiza, Co. wisteriana, Co. bulbosa, Co. macrantha, Co. maculata mexicana) carry out inhibited photosynthesis; 2) this gene has been transferred to the nucleus or mitochondrion, and its product is imported into the plastid, allowing stabilization of the PSII dimer complex and some level of photosynthetic function; or 3) a nonfunctional psbM gene has no effect on photosynthetic function.

Another partially green coralroot, Co. macrantha (restricted to high-elevation forests in southern Mexico), has a pseudogenized copy of psaI (Photosystem I Reaction Center Subunit VIII) due to a 4-bp insertion near the 5′-end of the gene relative to other coralroots, causing a reading frame shift. The psaI protein is believed to function in the stabilization of a nuclear-encoded subunit of the same complex (psaL) in Arabidopsis thaliana, and has also been shown to bind to the Light-harvesting Complex II (Jensen et al. 2007); it was suggested that the latter putative function is redundant. In Corallorhiza, this might explain why psaI in Co. macrantha has apparently become a pseudogene. Thus, among the green coralroots, Co. macrantha may represent yet another very early step in the transition to the loss of photosynthesis, or at least a reduction in photosynthetic efficacy, but this remains a hypothesis to be tested.
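The effect of a 4-bp insertion on a reading frame can be illustrated with a toy example: any insertion or deletion whose length is not a multiple of three changes every downstream codon. The sketch below uses a made-up open reading frame and a made-up insertion; it is not the psaI sequence, and the helper names are hypothetical.

```python
def causes_frameshift(indel_length: int) -> bool:
    """An indel shifts the reading frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0


def codons(sequence: str) -> list[str]:
    """Split a sequence into complete codons, dropping any trailing partial codon."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


# Hypothetical toy ORF and a 4-bp insertion after the second codon.
reference = "ATGGCTAAAGGTTTCTGA"
insertion = "TTAC"
mutant = reference[:6] + insertion + reference[6:]

print("frameshift:", causes_frameshift(len(insertion)))
print("reference codons:", codons(reference))
print("mutant codons:   ", codons(mutant))
```

In the mutant, all codons downstream of the insertion differ from the reference and the original stop codon is no longer read in frame, which is the kind of change that leads to a gene being scored as a pseudogene.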

rpo Genes

There is evidence for rpo pseudogenes in both the nongreen Co. striata and Co. maculata complexes, based on large deletions, reading frame shifts, and internal stop codons. Plastid genes are transcribed by two polymerases in monocots: a nuclear-encoded polymerase and a plastid-encoded polymerase (Liere et al. 2011). The plastid-encoded RNA polymerase (PEP) is believed to preferentially transcribe photosynthesis-related genes, whereas the rpoB-C1-C2 operon itself is transcribed by a nuclear-encoded RNA polymerase (Hajdukiewicz et al. 1997; reviewed in Liere et al. 2011; Krause 2012). In addition to transcribing photosynthesis-related genes for the most part, PEP also tends to be more active in later leaf developmental stages (Hajdukiewicz et al. 1997; reviewed in Liere et al. 2011). Based on this latter finding, it might be expected that all members of Corallorhiza should show some degradation in rpo genes, as leaves do not fully develop in any species within this genus (i.e., only the basal leaf sheath persists, but the blade or lamina never develops), presumably obviating the role of PEP in transcription of photosynthesis-related genes (e.g., psa, psb) in leaves. However, green coralroots have intact reading frames for all rpo genes, whereas nongreen coralroots show varying degrees of degradation in the rpo complex. As with photosynthesis-related genes in Corallorhiza (figs. 2 and 3), it does not appear to be the transition to leaflessness per se that has resulted in degradation of the rpo complex, but more likely the shift from partial heterotrophy (and the associated need to carry out photosynthesis, however small the amount) to strict heterotrophy. An interesting parallel occurs in the carnivorous family Lentibulariaceae, in which rpo genes display significant departures from purifying selection (Wicke et al. 2014). An interesting follow-up would be to determine whether degradation first begins among photosystem genes (psa, psb), diminishing the importance of having a functional PEP complex, or whether the opposite is the case, in which a dysfunctional (or nonexistent) PEP has contributed to relaxed selective pressures associated with photosynthesis-related genes (i.e., if psa and psb genes can no longer be transcribed, or are transcribed by an inefficient PEP, then is there any benefit to maintaining them?). Regardless, the degradation of rpo genes in nongreen Corallorhiza closely mirrors that in genes directly involved in photosynthesis (psa/b, rbcL, etc.).

Relationships in Corallorhiza

Complete plastome sequences of Corallorhiza yielded a highly resolved, highly supported tree consistent with previous studies (Freudenstein and Doyle 1994a, 1994b; Barrett and Freudenstein 2008; Freudenstein and Senyo 2008). Here, for the first time, relationships among the six taxa that comprise the Co. maculata complex (as recognized in Freudenstein 1997) are fully resolved, with high support for Co. bulbosa as sister to the rest of the complex (bootstrap = 91%), whereas previous studies could not resolve relationships in this complex (Barrett and Freudenstein 2008; Freudenstein and Senyo 2008). Complete plastomes also confirm the sister relationship of Co. striata to the rest of Corallorhiza, with Co. trifida (moving away from the base of the tree), then Co. odontorhiza and Co. wisteriana diverging successively. Depending on the data set analyzed, Co. bulbosa appears in a few different places, most likely due to homoplasy and the few variable characters available to resolve the position of this species in smaller data sets, as evidenced by short internal branches (supplementary fig. S3, Supplementary Material online). Thus, inclusion of complete plastomes (pseudogenes, spacers, and introns) resolves the otherwise uncertain position of this species. Nuclear rDNA, on the other hand, places Co. bulbosa as sister to (Co. macrantha, Co. maculata mexicana) with moderate support, a similar finding to the combined analysis of Barrett and Freudenstein (2008) based on plastid rbcL + nuclear ITS, though that study did not include Co. macrantha. Combining all nuclear and plastid data places Co. bulbosa as sister to the remaining members of the Co. maculata complex with 100% bootstrap support (supplementary fig. S3, Supplementary Material online). The complete plastome + rDNA data set fully resolves relationships among the major species complexes of Corallorhiza (excluding members not sampled from some complexes).

In particular, relationships within the Co. maculata complex shed light on a phylogeographically intriguing aspect of plastome degradation. All Mexican members of the Co. maculata complex have at least some visible green tissue and are presumed to be partially mycoheterotrophic, based on conservation of the plastid-encoded photosynthetic apparatus (for the most part; see fig. 2). Those members of the Co. maculata complex that occur in northern North America have no visible green tissue (two of these species were shown to have highly reduced chlorophyll levels; fig. 1) and display evidence of degradation in these genes. The topology of the Co. maculata complex based on combined plastome + rDNA data (supplementary fig. S3, Supplementary Material online; fig. 2) lends support to a previous hypothesis that this complex (and possibly the genus Corallorhiza itself) originated in southern Mexico (Freudenstein and Doyle 1994a, 1994b; Barrett and Freudenstein 2008; Freudenstein and Senyo 2008) and expanded into northern North America, with the ancestor of Co. mertensiana, Co. maculata maculata, and Co. maculata occidentalis undergoing a shift from partial to strict heterotrophy (and continuing this trend among extant lineages), and thus presumably losing the ability to carry out photosynthesis. This suggests that deeper sampling among individuals in members of both the Co. maculata complex and the Co. striata complex (which is here represented only by Co. striata vreelandii) will yield a much clearer picture of plastome degradation in Corallorhiza on a phylogeographic scale.

Selective Regime among Genes with Open Reading Frames

Model M4, allowing green and nongreen taxa to have different ω-ratios (ω_leafy = ω_cor ≠ ω_mac-ng = ω_str), fit the data significantly better than the one-ratio model for both housekeeping genes and photosynthesis-related genes, suggesting different selective scenarios for nongreen taxa relative to green taxa, but these comparisons are nonsignificant when corrected for multiple comparisons (table 3). For housekeeping genes, ω-values for green versus nongreen taxa did not differ by much (ω_leafy = ω_cor = 0.30267; ω_mac-ng = ω_str = 0.34230; table 3). However, this difference is much more pronounced for photosynthesis-related genes (ω_leafy = ω_cor = 0.27361; ω_mac-ng = ω_str = 0.73720), with the estimate for nongreen taxa being approximately 2.6 times that of green taxa and approaching a value expected under neutral evolution (ω ~ 1). Thus, even though some photosynthesis genes in nongreen taxa still have intact reading frames, there appears to be a relaxation of selective pressure associated with these genes.

For atp genes, models M1 and M4 fit the data significantly better than does M0, even after correcting for multiple comparisons, suggesting different selective regimes in Corallorhiza versus leafy, green outgroups, or at least in nongreen Corallorhiza versus green Corallorhiza and leafy outgroups. For example, in model M1, the ω-value for Corallorhiza is nearly twice that for leafy orchids (ω_leafy = 0.15175; ω_cor = ω_mac-ng = ω_str = 0.26949), and in model M4, nongreen Corallorhiza have ω-values over twice those for green Corallorhiza and leafy orchids. Further investigations using branch-site models, allowing some sites in the “foreground” branches to be under positive selection, suggest evidence for positive selection in nongreen Corallorhiza for the atp complex.

The plastid-encoded atp genes play a critical role in photosynthesis by both generating ATP and translocating protons across the thylakoid membrane (Evron et al. 2000; McCarty et al. 2000; Allen et al. 2011). However, in taxa that display evidence of loss of function for photosystem and other photosynthesis-related genes, the fact that atp genes are preserved as intact reading frames suggests that they may play additional roles in plastid function (discussed in Wicke et al. 2013). This is bolstered by the fact that this complex shows evidence of positive selection in strictly heterotrophic, nongreen taxa, but not for Corallorhiza as a whole. This would support the hypothesis that the atp complex has either taken on additional, as-yet unknown roles in holomycotrophic plastids, or at a minimum has fallen under increased selective pressure to adapt to the overall “alternative lifestyle” associated with strict fungal heterotrophy in nonphotosynthetic plastids (including in amyloplasts, chromoplasts, elaioplasts, proteinoplasts, etc.). A future research objective is to examine the potential biological roles and molecular evolutionary patterns of the atp complex in nonphotosynthetic plants and their green, photosynthetic relatives, as members of this gene complex are often preserved and putatively functional among various heterotrophic angiosperms (see fig. 3 and references therein, including discussion in Wicke et al. 2013).

Comparison of Corallorhiza with Other Parasitic Angiosperms Using the Model of Barrett and Davis (2012)

A comparison of the plastomes of several heterotrophic taxa that have been sequenced to date shows that Corallorhiza is not quite in the “advanced” stages of plastid genome degradation relative to several taxa in Orobanchaceae (e.g., Wicke et al. 2013) or other orchids such as Neottia (Logacheva et al. 2012) and Rhizanthella (Delannoy et al. 2011) (fig. 3). Interpreted in the context of the basic model presented in Barrett and Davis (2012), members of Corallorhiza fall into categories 1 and 3. Taxa with some visible green tissue display pseudogenes/deletions mainly in the ndh complex (category 1; gene-class categories are listed at the top of fig. 3), with a few additional changes, whereas those with no green tissue have experienced pseudogenization and deletions in ndh, photosynthesis-related, and rpo genes but not in atp genes or other “housekeeping” genes (category 3, excepting the loss of a redundant trnT-GGU in Co. striata vreelandii; Barrett and Davis 2012; figs. 2 and 3). Thus, nonparasitic autotrophs and partial heterotrophs roughly fall either into category 0 (no degradation of plastid genes) or category 1 (degradation of the ndh complex only). More specifically, all partial heterotrophs fall into category 1, with evidence of degradation of the ndh complex.

Strict heterotrophs fall into more “advanced” categories. Several, but not all, members of Orobanchaceae and two orchids (Neottia and Rhizanthella) can be classified in category 5, in which there is evidence of degradation of atp genes and at least some evidence of degradation in “housekeeping” genes (e.g., rpl, rps, trn). There are some genes for which no taxa have experienced pseudogenization or loss, including rpl2, rpl16, rpl20, rps2, rps4, rps8, rps11, ycf1, ycf2, trnD-guc, trnE-uuc, trnI-cau, trnM-cau, trnQ-uug, trnY-gua, rrn16, rrn23, rrn4.5, and rrn5. This suggests that they may be indispensable for plastid function, at least in the angiosperms included in this comparison, and that a “minimal set” of genes might be necessary for functional plastid “housekeeping” and perhaps other functions outside of photosynthesis and ATP production.

Overall, no taxa fell in categories 2 or 4, suggesting that degradation of the rpo genes occurs more or less simultaneously with degradation in photosynthesis-related genes, and that degradation of atp genes occurs along with that in “housekeeping” genes (also discussed in Wicke et al. 2013). Thus, the model of Barrett and Davis (2012) could be modified in light of more recently sequenced plastomes, including those of Corallorhiza, to have five general categories: 1) no degradation; 2) degradation in the ndh complex only; 3) degradation of the ndh complex and photosynthesis-related genes, including rpo genes; 4) degradation in all plastid gene systems, including “housekeeping” and atp genes; and 5) complete or nearly complete loss of the plastid genome (as is hypothesized to have occurred in some strict heterotrophs such as Corynaea and Rafflesia [Nickrent et al. 1997; Molina et al. 2014]).
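The five revised categories can be written down as a simple decision rule. The following is a minimal sketch, not part of the original analysis: the function name, the class labels, and the rule of assigning a taxon to the most advanced gene class showing any degradation are illustrative assumptions. Minor exceptions noted in the text (e.g., loss of a single redundant tRNA) would need to be filtered out by hand before calling it.

```python
# Gene classes in the hypothesized order of degradation (labels are illustrative).
GENE_CLASSES = ("ndh", "photosynthesis", "rpo", "atp", "housekeeping")


def revised_category(degraded_classes, plastome_lost=False):
    """Return the revised degradation category (1-5) for one taxon.

    degraded_classes: iterable of gene-class labels in which pseudogenes
    or deletions were observed. plastome_lost: True if the plastid genome
    is completely or nearly completely lost.
    """
    degraded = set(degraded_classes)
    unknown = degraded - set(GENE_CLASSES)
    if unknown:
        raise ValueError(f"unknown gene classes: {sorted(unknown)}")
    if plastome_lost:
        return 5
    if degraded & {"atp", "housekeeping"}:
        return 4
    if degraded & {"photosynthesis", "rpo"}:
        return 3
    if "ndh" in degraded:
        return 2
    return 1
```

Under this rule, a partially heterotrophic taxon with only ndh degradation is placed in category 2, and a nongreen taxon with ndh, photosynthesis-related, and rpo degradation in category 3. Lineages that deviate from the model, such as Cuscuta, are not handled well by such a simple rule.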

There is some degree of deviation from the Barrett and Davis (2012) model, as might be expected given the potential for somewhat idiosyncratic patterns of plastome evolution in unrelated groups. The obvious exception to this model would be the sequenced members of the genus Cuscuta in the family Convolvulaceae. One of the best-studied parasitic systems in terms of plastome evolution, Cuscuta contains members with “extreme” plastome reduction relative to closely related leafy species (approximately 50% total size reduction in Cuscuta gronovii and Cuscuta obtusiflora relative to the leafy Ipomoea) (fig. 3 and references therein). Yet both highly reduced species of Cuscuta retain most, if not all, of the genes for photosynthesis, but have degraded RNA polymerase complexes and also show evidence of pseudogenes/deletions of some housekeeping genes. A further survey based on probe hybridization in Cuscuta, with greatly increased taxon sampling, by Braukmann et al. (2013) suggests that this genus encapsulates multiple transitions to highly degraded plastid genomes, with members of subgenus Grammica displaying even more advanced stages of degradation. These taxa were not included in figure 3, as they are based on hybridization and have not yet been sequenced to generate complete plastomes. Recycling of respiratory CO2 has been offered as an explanation for the retention of putatively functional genes in the psa, psb, and pet complexes (McNeal et al. 2007), underscoring the importance of studying the physiology of strict heterotrophs on a taxon-by-taxon basis, and also highlighting how little is known about the basic physiology and functional genomics of these plants.

Conclusion

All members of Corallorhiza produce detectable levels of chlorophyll based on the current sample, but nongreen, strictly heterotrophic members have on average tenfold lower concentrations relative to green, partially heterotrophic members. Genomic data allowed assembly of complete plastomes as well as partial rDNA operons, which, when combined, fully resolve relationships among the major species complexes in Corallorhiza. Based on complete plastomes, Corallorhiza can be said to span the “early” stages of plastid genome degradation as a result of relaxed selective pressures on photosynthetic function. Corallorhiza is useful in providing a model system to understand the earliest stages in the transition to a “minimal plastome,” as are observed in other heterotrophic plants (e.g., Orobanchaceae, other orchids like Rhizanthella). There have been at least two independent transitions to strict heterotrophy from partial heterotrophy in the genus (Co. maculata and Co. striata complexes). Nongreen coralroots also display evidence of relaxed selective pressures on photosynthesis-related genes, which have retained open reading frames. Members of the atp complex display relaxed purifying selection among Corallorhiza as a whole but, more specifically, display evidence for positive selection for nongreen, strictly heterotrophic taxa, although for reasons currently unknown.

Generally, there is a clear path of plastid genome degradation in the heterotrophic angiosperm lineages studied thus far (with some deviations from the overall model, e.g., some Cuscuta). Moreover, it can be hypothesized that there is a stepwise progression to a “minimal” plastome, and Corallorhiza occupies categories 2 (partially heterotrophic taxa, which display degradation of the ndh complex only, and a few other pseudogenes) and 3 (strictly heterotrophic taxa, including degradation of the ndh complex, photosynthesis-related genes, and rpo genes [excluding atp genes]) in a modified model of the degree of plastome degradation among angiosperms. Future studies will include 1) deeper sampling of complete plastomes within species complexes of Corallorhiza to gain a phylogeographic perspective on plastome degradation; 2) direct assessment of the relative photosynthetic capabilities of members of each coralroot taxon in the field; and 3) quantification of transcript levels for plastid-encoded genes within each species to gain a clearer understanding of gene expression, and to test the hypothesis of the existence of duplicate, functional copies of pseudogenized or deleted plastid genes.

Materials and Methods

Plant Material, DNA Isolation, Library Preparation, and PE Sequencing

Plant material was collected and preserved either in silica gel or frozen at −80 °C. Total genomic DNAs were extracted using the cetyltrimethylammonium bromide (CTAB) method (Doyle JJ and Doyle JL 1987) from 1.0 g frozen or 0.2–0.5 g silica-dried material and quantified using a NanoDrop 2000 spectrophotometer (Thermo Scientific, Waltham, MA). Samples with total DNA concentrations > 100 ng/µl were selected and run on a 1.5% agarose gel to check for high molecular weight, nondegraded DNA. Samples were shipped on ice to Cold Spring Harbor Laboratory (CSHL, Woodbury, NY), where they were sheared to approximately 450 bp, followed by library preparation and barcoding for PE Illumina sequencing. The exception was Co. odontorhiza, for which library preparation and sequencing were completed at the University of Missouri–Columbia using the Illumina TruSeq Library Preparation Kit (Illumina Inc., San Diego, CA). Co. odontorhiza DNAs were sheared to approximately 320 bp. All samples were run on Illumina HiSeq 2000 machines to generate 100-bp PE reads. Indexed samples were pooled and run with accessions from other projects, with a total of 40 or 20 accessions in a single lane on the CSHL and University of Missouri machines, respectively. Voucher information and characteristics of each of the resulting data sets and completed plastomes are listed in table 1.

Quantification of Chlorophyll Content

Approximately 0.1 g of −80 °C frozen tissue was harvested from the outer wall of the developing ovary and weighed for chlorophyll quantification. Ovary tissue was chosen 1) because this is the only organ that is visibly green in some taxa (e.g., Co. odontorhiza, Co. wisteriana) and 2) to be consistent between taxa. Chlorophylls were extracted in 80% ice-cold acetone, stored at −80 °C, and chlorophylls a and b were quantified on a Cary 4000 UV-VIS spectrophotometer (Agilent Technologies Inc., Santa Clara, CA; Department of Plant Cellular & Molecular Biology, Ohio State University), and their concentrations were calculated using formulas of Porra (2002) (Lima D, personal communication). Data from chlorophylls a and b were pooled to give total chlorophylls. Although individual accessions used to quantify chlorophyll concentration among species were not the same as those used for sequencing, multiple individuals from across each species’ geographic range were included to account for variation in chlorophyll content among individuals within species. Corallorhiza bentleyi is included here (an endangered member of the Co. striata complex) but was not sequenced. Conversely, Mexican-endemic members of the Co. maculata complex (Co. macrantha, Co. bulbosa, Co. maculata var. mexicana) were not included in the chlorophyll analysis due to rarity, and thus lack of fresh/frozen material. Voucher specimens for plants used for chlorophyll analysis are listed in supplementary figure S5, Supplementary Material online.
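As an illustration of how chlorophyll concentrations are obtained from absorbance readings, the sketch below uses the simultaneous equations commonly cited from Porra's work for buffered 80% acetone (absorbances at 663.6 and 646.6 nm, concentrations in µg/ml). The exact equations and wavelengths used in this study are not stated here, so the coefficients, the function names, and the per-gram conversion are assumptions for illustration only.

```python
def chlorophyll_ug_per_ml(a663_6, a646_6):
    """Return (chl a, chl b, total) in ug/ml of extract.

    Assumed coefficients: the widely cited equations for 80% acetone;
    absorbances are expected to be baseline-corrected before use.
    """
    chl_a = 12.25 * a663_6 - 2.55 * a646_6
    chl_b = 20.31 * a646_6 - 4.91 * a663_6
    return chl_a, chl_b, chl_a + chl_b


def total_chlorophyll_ug_per_g(a663_6, a646_6, extract_ml, tissue_g):
    """Scale total chlorophyll (a + b) to ug per g of fresh tissue."""
    if tissue_g <= 0:
        raise ValueError("tissue mass must be positive")
    _, _, total = chlorophyll_ug_per_ml(a663_6, a646_6)
    return total * extract_ml / tissue_g
```

Pooling chlorophylls a and b, as described above, corresponds to the third returned value; dividing by tissue mass makes values comparable among accessions that were weighed separately.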


Data Processing

The resulting paired FASTQ data were processed using a custom UNIX script (“process_fastq.sh”) that calls several freely available Perl scripts for the purposes of 1) quality filtering/trimming and adaptor removal (“IlluQC_PRLL.pl,” “TrimmingReads.pl”; Patel and Jain 2012); 2) “shuffling” PE files into a single file (“ShuffleSequences_fastq.pl,” part of the VELVET suite of tools [E. Cabot, University of Wisconsin-Madison; Zerbino and Birney 2008]); and 3) conversion to FASTA format (“fastq2fasta.pl”; B. Knaus 2009 [http://brianknaus.com/software/srtoolbox, last accessed May 1, 2014]). Reads were trimmed from their 3′-ends, discarding bases under a quality threshold of PHRED score of 20 (corresponding to an error rate of 1/100 or greater), and also discarding trimmed reads below 61 bp in length.
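The trimming rule described above (remove 3′ bases below PHRED 20, then discard reads shorter than 61 bp) can be sketched as follows. This is a simplified stand-in and not the logic of the Perl scripts named above, whose exact behavior may differ; the function names and the assumption of PHRED+33 quality encoding are illustrative.

```python
def phred_to_error(q):
    """Convert a PHRED score to an error probability (Q20 -> 0.01)."""
    return 10 ** (-q / 10)


def decode_quals(qual_string, offset=33):
    """Decode a FASTQ quality string, assuming PHRED+33 encoding."""
    return [ord(c) - offset for c in qual_string]


def trim_3prime(seq, quals, min_q=20, min_len=61):
    """Trim low-quality bases from the 3' end of a read.

    Returns (sequence, qualities) after trimming, or None if the
    trimmed read is shorter than min_len and should be discarded.
    """
    end = len(seq)
    # Walk back from the 3' end while the last base is below threshold.
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    if end < min_len:
        return None
    return seq[:end], quals[:end]
```

For a 100-bp read, this rule keeps the read only if at least 61 bases remain once the low-quality 3′ tail is removed.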

Read Mapping

PE reads were then mapped five times iteratively in Geneious v6 (created by Biomatters Inc., Auckland, New Zealand; http://www.geneious.com, last accessed December 1, 2013) using automatically determined settings to a combined file of 1) the Phalaenopsis aphrodite plastome (GenBank accession number NC_007499; Chang et al. 2006); 2) the Phoenix dactylifera mitochondrial genome (NC_016740; Fang et al. 2012); and 3) the Asclepias syriaca nuclear ribosomal DNA consensus operon (JF312046; Straub et al. 2011). The purpose of read mapping was to filter or “screen” reads that matched the reference genomes, and to get initial, crude estimates of coverage to be used in downstream de novo assemblies.

Plastome Assembly

Mean coverage estimates from Geneious were used as input for de novo plastome assembly (“-exp_cov” parameter). De novo assembly was completed using both the VELVET v1.2.10 assembler (Zerbino and Birney 2008) and the Geneious v6 de novo assembler. VELVET assemblies were generated using a variety of hash lengths (i.e., “word match” or “kmer length”) ranging from 51 to 81 for each complete data set, with specified expected coverage estimates taken from Geneious mapping (specified to be in approximate kmer coverage; see Zerbino and Birney 2008). Coverage cutoff values were input as roughly one-half of the specified expected coverage estimates, minimum contig length was set to 300 bp, and insert size set to 450 bp (320 bp for Co. odontorhiza), and the resulting contigs were assembled in Sequencher v6 (GeneCodes, Ann Arbor, MI). The resulting “draft de novo assembly” from VELVET was used as a reference to map reads in Geneious, as described above (five iterations, automatically determined settings). Read mappings were visually scanned in Geneious to check for areas of low coverage or potential misassemblies. Read pairs that mapped successfully to the draft reference were used in a second round of de novo assembly, this time in Geneious with automatically determined settings, saving 100 consensus contigs. This method allowed construction of complete draft plastomes with relative ease.
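Because VELVET expects coverage in k-mer rather than nucleotide units, a mapped coverage estimate has to be converted for each hash length. The sketch below applies the conversion given in the VELVET documentation, Ck = C × (L − k + 1) / L, and derives a cutoff of one-half the expected coverage as described above. The function names and the example coverage of 200× are hypothetical, and a single read length is assumed even though trimmed reads vary in length.

```python
def kmer_coverage(base_coverage, read_length, k):
    """Convert per-base coverage C to k-mer coverage Ck = C * (L - k + 1) / L."""
    if not 0 < k <= read_length:
        raise ValueError("k must be between 1 and the read length")
    return base_coverage * (read_length - k + 1) / read_length


def velvetg_options(base_coverage, read_length, k,
                    insert_length=450, min_contig=300):
    """Assemble velvetg option values for one hash length.

    The coverage cutoff is set to one-half of the expected k-mer coverage.
    """
    exp_cov = kmer_coverage(base_coverage, read_length, k)
    return {
        "-exp_cov": round(exp_cov, 1),
        "-cov_cutoff": round(exp_cov / 2, 1),
        "-ins_length": insert_length,
        "-min_contig_lgth": min_contig,
    }


# Hypothetical example: 200x mapped coverage, 100-bp reads, odd k from 51 to 81.
for k in range(51, 82, 10):
    print(k, velvetg_options(200, 100, k))
```

Longer hash lengths yield lower k-mer coverage for the same data, which is why the expected coverage and cutoff must be recomputed for every value of k.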

Contigs from both assemblies (Geneious, Velvet) were merged in Sequencher to form complete plastomes (parameters = “dirty data” or “large gap”; minimum overlap = 20 bp; minimum similarity = 70–90%). Any apparent “gaps” among contigs resulting from one software platform were usually covered by a contig from the other, allowing assembly of completed plastomes (Large Single Copy–Inverted Repeat B–Small Single Copy). In a few cases, VELVET scaffolds contained strings of Ns between 30 and 100 bp, mostly associated with homopolymer repeats (> 10 bp). These were resolved by 1) the Geneious de novo assembly (typically) and 2) UNIX “grep” searches with flanking sequences of 30–50 bp used as search terms, taken from each end of the string of Ns. Grep results were converted to FASTA format using a custom Perl script to build a “bridge” of reads. This was done iteratively until the region in question was resolved, although most cases were resolved with a single iteration. Grep searches were applied to all cases of ambiguity between contigs from the two de novo assembly methods as a quality control. Polymerase chain reaction amplification and Sanger sequencing were not necessary due to high read coverage (table 1).

IR boundaries for each draft plastome were determined by using the first and last approximately 20 bp of the draft plastome as search terms (reverse complement) in Sequencher to find matches, and marking their match points as features. Then the first approximately 700 bp and last approximately 1–2,000 bp of the draft plastome were copied, reverse complemented, and manually aligned to the IR, using the marked features from above as reference points. IR boundaries were determined informatically using PE information (i.e., read pairs that spanned the IR boundaries). Because the orientation of the ends of the LSC and SSC can vary, the plastomes were often in different configurations; single copy regions were reoriented to correspond to the “standard” model of gene order to simplify downstream comparison. Special attention was paid to the possibility of significant modification or even loss of the IR (either partial or complete), and also to the possibility of other genomic rearrangements or structural modifications. In other words, when building draft plastomes, it was not assumed that all accessions displayed the “standard” plastome structure found in angiosperms (i.e., LSC, IRb, SSC, IRa).

Plastome Annotation

Plastid genomes were annotated by first aligning the set of coding loci (coding sequences [CDS]) containing introns and all tRNAs from Phalaenopsis to each draft plastome (minus IRa) in Sequencher to aid in intron/exon and tRNA boundary determination. Annotations were conducted in DOGMA (Wyman et al. 2004), which uses BLASTX searches against a database of annotated plastomes and amino acid translations, and allows manual determination of start/stop codons. DOGMA BLASTX parameters were as follows: genetic code = plant plastid, e value = 5, percent cutoff for protein coding genes = 55, and percent cutoff for tRNAs = 80. The resulting GenBank feature table was exported from the DOGMA server and imported, along with the finished plastid FASTA file, into SEQUIN (National Center for Biotechnology Information), where the annotation was validated and completed for submission to GenBank. A large inversion in Co. maculata var. maculata was manually reversed for the purposes of phylogenetic analyses and other whole-plastome comparisons, but the plastome for that accession was annotated with the inversion, as assembled de novo. Plastome maps were drawn with GenomeVx (Conant and Wolfe 2008) and manually edited in Adobe Illustrator CS6 (Adobe Inc., San Jose, CA).

Ribosomal DNA Assembly and Annotation

Nuclear ribosomal DNA reads were filtered by mapping reads from Co. odontorhiza to a nearly complete rDNA operon reference (partial ETS [External Transcribed Spacer], 18S rDNA, ITS1 [Internal Transcribed Spacer], 5.8S rDNA, ITS2, 26S rDNA) from A. syriaca, described above, as well as mapping to an ITS-spanning sequence from Co. odontorhiza from GenBank (accession number EU391326). Other regions were not assembled (e.g., 5S rDNA). Resulting reads that mapped to each reference were assembled de novo in Geneious, and resulting contigs were merged through overlap in Sequencher. This draft assembly was used for a second round of read mapping (original Co. odontorhiza reads) and de novo assembly in Geneious, resulting in a single consensus contig. The consensus contig was based on 50% majority per site, bearing in mind that the rDNA operon exists in numerous repeated copies that often differ from one another, appearing as polymorphism in read pileups. The Co. odontorhiza rDNA consensus operon was then used as a reference in read mapping for all other Corallorhiza species, followed by individual de novo assemblies as above.
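A per-site majority-rule consensus of the kind described for the rDNA operon can be sketched in a few lines. This is not the Geneious implementation, which may resolve ties and sub-threshold sites with ambiguity codes; the function name, the use of “N” for unresolved sites, and the strict “more than 50%” rule are assumptions made for illustration.

```python
from collections import Counter


def majority_consensus(columns, threshold=0.5, ambiguous="N"):
    """Build a consensus from pileup columns.

    columns: iterable of strings, one per site, each holding the bases
    observed at that site across reads. A base is called only if its
    frequency among A/C/G/T calls is strictly greater than threshold.
    """
    consensus = []
    for column in columns:
        counts = Counter(b for b in column.upper() if b in "ACGT")
        if not counts:
            consensus.append(ambiguous)
            continue
        base, n = counts.most_common(1)[0]
        frequency = n / sum(counts.values())
        consensus.append(base if frequency > threshold else ambiguous)
    return "".join(consensus)
```

Sites at which repeat copies differ appear as mixed columns; with this rule, the base carried by most copies is reported and the minority variants are ignored.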

Nearly complete ribosomal DNA operons (partial ETS, 18S rDNA, ITS1, 5.8S, ITS2, 26S rDNA, excluding 5S rDNA) were annotated by first aligning finished contigs in Geneious (using the MUSCLE v3.8.31 plugin in Geneious; Edgar 2004) and transferring the annotation features from Co. odontorhiza to the aligned Corallorhiza sequences. Then the entire alignment was annotated in SEQUIN for submission to GenBank by adding each feature to Co. odontorhiza and subsequently propagating the features to the remaining sequences. Reads were again mapped in Geneious to each finished, annotated rDNA cistron to check for misassemblies. GenBank accession numbers for rDNA partial operons are KM390003–KM390012. Available outgroup sequences for Cymbidium, Oncidium, and Phalaenopsis were mined from GenBank for the purposes of downstream phylogenetic analyses (18S rDNA: Cy. farberi, GenBank accession number JN418934; ITS: Cy. aloifolium, JF729014; Phalaenopsis aphrodite, AY391543; O. andradeanum, FJ565598).

Data Matrix Assembly and Sequence Alignment

Annotated plastomes were imported into Geneious, with subsequent removal of the second copy of the IR (IRa), and aligned with MAFFT v7.157 (Katoh et al. 2002) using the “fast-progressive” method (gap opening penalty = 6, offset = 0.1), followed by manual adjustment. The alignment also included complete plastomes of Cy. aloifolium (GenBank accession number NC_021429), Oncidium Gower Ramsey (GenBank accession number NC_014056), and Phalaenopsis aphrodite (GenBank accession number NC_017049). A short inversion of the petN-psbM region in Cymbidium was removed for that accession in downstream analyses. Annotated CDS were exported from Geneious, and data from each locus were saved as separate FASTA files. Each file was submitted for codon-based alignment to the MACSE web server (Ranwez et al. 2011; http://mbb.univ-montp2.fr/MBB/subsection/softExec.php?soft=macse, last accessed May 30, 2014), which allows alignment of both intact open reading frames and incomplete sequences (including pseudogenes with modified reading frames). Alignments were adjusted manually when necessary and imported back into Geneious or SequenceMatrix (Vaidya et al. 2011), where they were concatenated into larger matrices. Protein-coding genes that contained pseudogenes (shifted reading frames, internal stop codons) in at least one taxon were excluded. Data matrices are deposited in supplementary fig. S6, Supplementary Material online.

Phylogenetic Analyses

Phylogenetic analyses were conducted using both Parsimony and ML. Parsimony searches were conducted in TNT (Goloboff et al. 2003, 2008) with 100 random addition starting sequences, saving 100 trees per replicate, and using tree bisection–reconnection branch swapping on the pool of saved trees. Support was assessed through 2,000 jackknife pseudoreplicates with the above search parameters. Model fit for ML analyses was assessed for the complete plastome alignment in MEGA v6 (Tamura et al. 2013), which yielded GTR+G as the best-fit model based on the corrected Akaike Information Criterion (AICc) metric (Burnham and Anderson 2002). All ML analyses were conducted in RAxML v8.0.20 (Stamatakis 2006). Three separate searches were initiated for each matrix analyzed, from different random starting seeds, using an unpartitioned GTR+G model to ensure convergence on the same topology. A codon-partitioned model was not chosen due to the possibility of relaxed functional constraints among some coding regions for some taxa, which could possibly represent model violations. Support was assessed with 1,000 standard bootstrap pseudoreplicates. Cymbidium was chosen as the outgroup taxon.

Plastid Pseudogenes, Genomic Deletions, and Genomic Attributes

The presence of pseudogenes and whole gene/regional deletions were initially assessed visually in DOGMA by identifying interrupted BLASTX reading frames. All coding loci aligned in MACSE (see above) were imported into Geneious, and sequences were translated to assess pseudogene status through the presence of frame shifts and/or premature stop codons. Loci with interrupted reading frames were considered to be pseudogenes unless there was an alternative start or stop codon within three codon positions of the other taxa in the alignment. In addition, the alignment of whole annotated plastomes (described above) was used to visualize any larger genomic deletions spanning CDS boundaries, or rearrangements.
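The reading-frame checks described above (frame shifts and premature stop codons) can be approximated with a simple screen of each CDS. This is an illustrative sketch only: the function name is hypothetical, the three standard stop codons are assumed, and it does not implement the three-codon tolerance for alternative start/stop codons or account for RNA editing, so flagged loci would still need manual inspection.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reading_frame_problems(cds):
    """Return a list of reasons a CDS may be a pseudogene (empty if none).

    Gap characters from an alignment are removed before checking.
    """
    seq = cds.upper().replace("-", "")
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length not a multiple of 3")
    # Split into complete codons, ignoring any trailing partial codon.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # A stop codon anywhere before the final codon is premature.
    for position, codon in enumerate(codons[:-1], start=1):
        if codon in STOP_CODONS:
            problems.append(f"premature stop at codon {position}")
            break
    return problems
```

An empty list indicates an uninterrupted frame under these simple rules; any entry marks the locus as a candidate pseudogene for closer examination in the alignment.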


Hypotheses of differences in plastid genomic attributes among nongreen versus green taxa, and correlations between chlorophyll content and genomic attributes, were assessed among taxa using two phylogeny-based analytical approaches. First, a Phylogenetic Analysis of Variance (Garland et al. 1993) was used to test for differences in continuous plastome variables (e.g., length, GC content) among green versus nongreen taxa (i.e., a discrete grouping variable), taking phylogenetic relationships into account. Analyses were run in the R package “geiger” (Harmon et al. 2008) with 10,000 simulations each, using the “aov.phylo” function, based on an ultrametric tree generated through Nonparametric Rate Smoothing (Sanderson 1997). These analyses were run for a complete data set of 13 taxa (three leafy, green outgroup taxa and ten species of Corallorhiza). A second set of analyses was conducted on a subset of six taxa of Corallorhiza for which chlorophyll data were available, to mitigate problems with missing chlorophyll data. In these analyses, continuous genomic variables (length, number of putatively functional genes, GC content) were correlated with chlorophyll content (continuous) using Phylogenetically Independent Contrasts to account for phylogenetic relatedness among taxa (Felsenstein 1985). The latter analyses were conducted under a Brownian motion model with the PDAP package (Midford et al. 2005) in Mesquite v2.75 (Maddison WP and Maddison DR 2011).

Model-Based Analyses of Selective Regime

Changes in selective regime among species of Corallorhiza for codon-based matrices were assessed with the CODEML module in PAML v4.8 (Yang 1997, 2007) using “branch” models. Only loci for which all taxa had open reading frames were included in the analyses. Various matrices are listed in table 3. The objective was to test hypotheses of different ω ratios for 1) Corallorhiza versus leafy outgroups and 2) nongreen Corallorhiza versus green Corallorhiza + leafy outgroups. To accomplish this, different branch models were fit to the data, using the topology resulting from the ML analysis of whole plastomes + rDNA. Six different branch models were tested. First, a model in which all branches evolve under one ω ratio (Model M0) was tested, where ω = dN/dS, the ratio of nonsynonymous substitutions per nonsynonymous site to synonymous substitutions per synonymous site. A two-ratio model (M1) was then run, in which all branches of Corallorhiza were allowed to have a different ω ratio (ω_cor) than the remaining leafy outgroup sequences (ω_leafy). A three-ratio model was then run (M2), allowing separate ω values for ω_leafy, ω_cor, and nongreen Corallorhiza (ω_str = ω_mac-ng; Co. striata, and Co. mertensiana, Co. maculata var. maculata and var. occidentalis, respectively), followed by a four-ratio model (M3) with additionally separate ω ratios for Co. striata (ω_str) and the nongreen members of the Co. maculata complex (ω_mac-ng). The latter three models (M1–M3) specifically test hypotheses of whether Corallorhiza, and particularly nongreen Corallorhiza, display departures from purifying selection. Two additional models were run to test whether the nongreen Corallorhiza differ significantly from green Corallorhiza and leafy outgroup taxa. Specifically, Model M4 (two ratios) sets ω_leafy = ω_cor and ω_str = ω_mac-ng, whereas M5 (three ratios) allows ω_str and ω_mac-ng to differ. Successively nested models were compared based on log-likelihood ratio tests (M1 vs. M0; M2 vs. M1; M3 vs. M2; M4 vs. M0; M5 vs. M4), where the likelihood ratio test statistic was calculated as two times the difference in log-likelihood between the two models, with degrees of freedom calculated as the difference in the number of free parameters for the models, and tested against the χ² distribution. P values for multiple comparisons to the same data set were Bonferroni-corrected (Dunn 1959), as suggested by Yang (2007), with the corrected significance level corresponding to α/m, where m is the number of branches being tested. Genes ycf1 and ycf2 were analyzed separately because they have extraordinarily high substitution rates (e.g., Barrett et al. 2013), as was matK, based on the finding of a pseudogenized copy in a previous study (Freudenstein and Senyo 2008).
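The nested-model comparison described above reduces to a small calculation once log-likelihoods and parameter counts are available from CODEML output. The sketch below is a stand-alone illustration using only the Python standard library; the function names are hypothetical, the χ² tail probability is computed from its closed-form series for integer degrees of freedom, and any log-likelihood values passed in are the user's own (none are taken from this study).

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc term plus a finite series over odd double factorials.
    term, total = 1.0, 0.0
    for j in range(1, (df - 1) // 2 + 1):
        if j > 1:
            term *= x / (2 * j - 1)
        total += term
    return (math.erfc(math.sqrt(half))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total)


def likelihood_ratio_test(lnl_null, lnl_alt, np_null, np_alt,
                          alpha=0.05, n_tests=1):
    """Return (statistic, df, p value, significant after Bonferroni)."""
    df = np_alt - np_null
    if df <= 0:
        raise ValueError("alternative model must have more free parameters")
    statistic = 2.0 * (lnl_alt - lnl_null)
    p_value = chi2_sf(statistic, df)
    return statistic, df, p_value, p_value < alpha / n_tests
```

For example, a two-ratio model that improves the log-likelihood by 5 units over the one-ratio model gives a statistic of 10 with 1 degree of freedom; whether that is declared significant then depends on the α/m threshold chosen for the number of comparisons.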

To further investigate the ω values among branches for the atp complex, two sets of branch-site models were tested against the data. In the first, all Corallorhiza were set as the foreground branch, and in the second, nongreen Corallorhiza (Co. striata vreelandii, Co. maculata var. maculata and var. occidentalis, and Co. mertensiana) were set as foreground branches. For both sets of branch-site models, two alternative models were tested. The “null” model specifies 0 < ω0 < 1 for a proportion of sites and ω1 = ω2 = 1 for the remaining sites (no positive selection), whereas the alternative model allows ω2 > 1 (positive selection on some sites). The alternative model (some sites under positive selection) was then tested against the null model (no sites under positive selection) using a χ² distribution as above, with 1 degree of freedom (i.e., the alternative and null models differ by one parameter).

Comparison with Other Parasitic Angiosperm Lineages

Plastid gene content (losses, pseudogenes, putatively functional genes) was compared among diverse lineages of angiosperms containing heterotrophs, including newly sequenced Corallorhiza. Taxa and genes were sorted by the numbers of putatively functional genes and gene functional class, respectively, to illustrate the different stages of plastid genome degradation. Specifically, genes were arranged according to the model of Barrett and Davis (2012), in which degradation of plastid functional gene classes is hypothesized to have occurred roughly in the following order: 1) NAD(P)H dehydrogenase genes (ndh); 2) “photosynthesis”-related genes, including pet, psa, psb, cemA, ccsA, rbcL, ycf3, and ycf4; 3) Plastid-encoded RNA Polymerase genes (rpo); 4) ATP synthase genes (atp); and finally 5) “housekeeping” genes (rpl, rps, rrn, trn, matK, clpP, infA, accD, ycf1, and ycf2). Though not a formal test in the strict sense, the use of this model does allow an assessment of how advanced the sequenced accessions of Corallorhiza are when compared with other heterotrophic angiosperms; it is hypothesized that Corallorhiza occupies the early stages of this model. Technically, atp genes function in the generation of ATP and movement of protons across the thylakoid membrane, and thus play a crucial role in photosynthesis, but this functional class was kept separate from photosynthesis genes due to the possibility that ATP synthase serves roles outside of photosynthesis.

Supplementary Material

Supplementary figures S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by the National Science Foundation (awards DEB-0830020 to Jerrold I. Davis and DBI 1110443 to J.C.P.), the California State University Program for Education and Research in Biotechnology (CSUPERB) to C.F.B., the California State University, Los Angeles Center for Effective Teaching and Learning Faculty-Student Mentorship award to C.F.B. and J.L., and the California State University Council on Ocean Affairs, Science and Technology to C.S. The authors are grateful to Jerrold Davis, Patrick Edger, Ryan Ellingson, Gwynne Lim, Jeffrey Morawetz, and Chris Randle for feedback and laboratory/bioinformatics support; the staff at Cold Spring Harbor Laboratory (Woodbury, NY) for sample preparation and sequencing; Daniel Lima and Richard Sayre for assistance with chlorophyll spectroscopy and data interpretation; Gerardo Salazar (UNAM) for assistance in collecting Mexican specimens; and Maria Logacheva and three anonymous reviewers for critical insight that improved the paper.

References

Allen JF, de Paula WBM, Puthiyaveetil S, Nield J. 2011. A structural phylogenetic map for chloroplast photosynthesis. Trends Plant Sci. 16:645–655.

Barbrook AC, Howe CJ, Purton S. 2006. Why are plastid genomes retained in non-photosynthetic organisms? Trends Plant Sci. 11:101–108.

Barrett CF, Davis JI. 2012. The plastid genome of the mycoheterotrophic Corallorhiza striata (Orchidaceae) is in the relatively early stages of degradation. Am J Bot. 99:1513–1523.

Barrett CF, Davis JI, Leebens-Mack J, Conran JG, Stevenson DW. 2013. Plastid genomes and deep relationships among the commelinid monocot angiosperms. Cladistics 29:65–87.

Barrett CF, Freudenstein JV. 2008. Molecular evolution of rbcL in the mycoheterotrophic coralroot orchids (Corallorhiza Gagnebin, Orchidaceae). Mol Phylogenet Evol. 47:665–679.

Bellino A, Alfani A, Selosse M-A, Guerrieri R, Borghetti M, Baldantoni D. 2014. Nutritional regulation in mixotrophic plants: new insights from Limodorum abortivum. Oecologia 175:875–885.

Bernardi G. 1989. The isochore organization of the human genome. Annu Rev Genet. 23:637–661.

Bidartondo MI. 2005. The evolutionary ecology of myco-heterotrophy. New Phytol. 167:335–352.

Blazier J, Guisinger MM, Jansen RK. 2011. Recent loss of plastid-encoded ndh genes within Erodium (Geraniaceae). Plant Mol Biol. 76:263–272.

Braukmann T, Kuzmina M, Stefanovic S. 2013. Plastid genome evolution across the genus Cuscuta (Convolvulaceae): two clades within subgenus Grammica exhibit extensive gene loss. J Exp Bot. 64:977–989.

Braukmann T, Stefanovic S. 2012. Plastid genome evolution in mycoheterotrophic Ericaceae. Plant Mol Biol. 79:5–20.

Braukmann TWA, Kuzmina M, Stefanovic S. 2009. Loss of all plastid ndh genes in Gnetales and conifers: extent and evolutionary significance for the seed plant phylogeny. Curr Genet. 55:323–337.

Burnham KP, Anderson DR. 2002. Avoiding pitfalls when using information-theoretic methods. J Wildl Manage. 66:912–918.

Cameron DD, Preiss K, Gebauer G, Read DJ. 2009. The chlorophyll-containing orchid Corallorhiza trifida derives little carbon through photosynthesis. New Phytol. 183:358–364.

Chang CC, Lin HC, Lin IP, Chow TY, Chen HH, Chen WH, Cheng CH, Lin CY, Liu SM, Chang CC, et al. 2006. The chloroplast genome of Phalaenopsis aphrodite (Orchidaceae): comparative analysis of evolutionary rate with that of grasses and its phylogenetic implications. Mol Biol Evol. 23:279–291.

Conant GC, Wolfe KH. 2008. GenomeVx: simple web-based creation of editable circular chromosome maps. Bioinformatics 24:861–862.

Cummings MP, Welschmeyer NA. 1998. Pigment composition of putatively achlorophyllous angiosperms. Plant Syst Evol. 210:105–111.

Delannoy E, Fujii S, Colas des Francs-Small C, Brundrett M, Small I. 2011. Rampant gene loss in the underground orchid Rhizanthella gardneri highlights evolutionary constraints on plastid genomes. Mol Biol Evol. 28:2077–2086.

Doyle JJ, Doyle JL. 1987. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem Bull. 19:11–15.

Dunn OJ. 1959. Confidence intervals for the means of dependent, normally distributed variables. J Am Stat Assoc. 54:613–621.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Evron Y, Johnson EA, McCarty RE. 2000. Regulation of proton flow and ATP synthesis in chloroplasts. J Bioenerg Biomembr. 32:501–506.

Fang YJ, Wu H, Zhang TW, Yang M, Yin YX, Pan LL, Yu XG, Zhang XW, Hu SNA, Al-Mssallem IS, et al. 2012. A complete sequence and transcriptomic analyses of date palm (Phoenix dactylifera L.) mitochondrial genome. PLoS One 7:e37164.

Felsenstein J. 1985. Phylogenies and the comparative method. Am Nat. 125:1–15.

Freudenstein JV. 1992. Systematics of Corallorhiza and the Corallorhizinae (Orchidaceae) [PhD dissertation]. [Ithaca (NY)]: Cornell University.

Freudenstein JV. 1997. A monograph of Corallorhiza (Orchidaceae). Harv Pap Bot. 10:5–51.

Freudenstein JV, Barrett CF. 2010. Mycoheterotrophy and diversity in Orchidaceae. In: Seberg O, Petersen G, Barfod A, Davis JI, editors. Diversity, phylogeny and evolution in the monocotyledons. The Proceedings of the Fourth International Conference on Monocot Systematics. Aarhus (Denmark): Aarhus University Press. p. 25–37.

Freudenstein JV, Doyle JJ. 1994a. Character transformation and relationships in Corallorhiza (Orchidaceae: Epidendroideae). 1. Plastid DNA. Am J Bot. 81:1449–1457.

Freudenstein JV, Doyle JJ. 1994b. Plastid DNA, morphological variation, and the phylogenetic species concept: the Corallorhiza maculata (Orchidaceae) complex. Syst Bot. 19:273–290.

Freudenstein JV, Senyo DM. 2008. Relationships and evolution of matK in a group of leafless orchids (Corallorhiza and Corallorhizinae; Orchidaceae: Epidendroideae). Am J Bot. 95:498–505.

Funk HT, Berg S, Krupinska K, Maier UG, Krause K. 2007. Complete DNA sequences of the plastid genomes of two parasitic flowering plant species, Cuscuta reflexa and Cuscuta gronovii. BMC Plant Biol. 7:45.

Galagan JE, Calvo SE, Borkovich KA, Selker EU, Read ND, Jaffe D, FitzHugh W, Ma LJ, Smirnov S, Purcell S, et al. 2003. The genome sequence of the filamentous fungus Neurospora crassa. Nature 422:859–868.

Garland T, Dickerman AW, Janis CM, Jones JA. 1993. Phylogenetic analysis of covariance by computer simulation. Syst Biol. 42:265–292.

Glemin S, Clement Y, David J, Ressayre A. 2014. GC content evolution in coding regions of angiosperm genomes: a unifying hypothesis. Trends Genet. 30:263–270.

Goloboff PA, Farris JS, Kallersjo M, Oxelman B, Ramirez MJ, Szumik CA. 2003. Improvements to resampling measures of group support. Cladistics 19:324–332.

Goloboff PA, Farris JS, Nixon KC. 2008. TNT, a free program for phylogenetic analysis. Cladistics 24:774–786.

Barrett et al. doi:10.1093/molbev/msu252. MBE. Downloaded from http://mbe.oxfordjournals.org/ by guest on July 7, 2016.

Hajdukiewicz PTJ, Allison LA, Maliga P. 1997. The two RNA polymerases encoded by the nuclear and the plastid compartments transcribe distinct groups of genes in tobacco plastids. EMBO J. 16:4041–4048.

Harmon LJ, Weir JT, Brock CD, Glor RE, Challenger W. 2008. GEIGER: investigating evolutionary radiations. Bioinformatics 24:129–131.

Iles WJD, Smith SY, Graham SW. 2013. A well-supported phylogenetic framework for the monocot order Alismatales reveals multiple losses of the plastid NADH dehydrogenase complex and a strong long-branch effect. In: Wilkin P, Mayo S, editors. Early events in monocot evolution. Cambridge: Cambridge University Press. p. 1–28.

Iorizzo M, Grzebelus D, Senalik D, Szklarczyk M, Spooner D, Simon P. 2012. Against the traffic: the first evidence for mitochondrial DNA transfer into the plastid genome. Mob Genet Elements. 2:261.

Jensen PE, Bassi R, Boekema EJ, Dekker JP, Jansson S, Leister D, Robinson C, Scheller HV. 2007. Structure, function and regulation of plant photosystem I. Biochim Biophys Acta. 1767:335–352.

Katinka MD, Duprat S, Cornillot E, Metenier G, Thomarat F, Prensier G, Barbe V, Peyretaillade E, Brottier P, Wincker P, et al. 2001. Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature 414:450–453.

Katoh K, Misawa K, Kuma Ki, Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30:3059–3066.

Kawakami K, Umena Y, Iwai M, Kawabata Y, Ikeuchi M, Kamiya N, Shen J-R. 2011. Roles of PsbI and PsbM in photosystem II dimer formation and stability studied by deletion mutagenesis and X-ray crystallography. Biochim Biophys Acta. 1807:319–325.

Knaus BJ. 2010. Short read toolbox. Available from: http://brianknaus.com

Krause K. 2008. From chloroplasts to "cryptic" plastids: evolution of plastid genomes in parasitic plants. Curr Genet. 54:111–121.

Krause K. 2012. Plastid genomes of parasitic plants: a trail of reductions and losses. In: Bullerwell C, editor. Organelle genetics. Berlin (Germany): Springer. p. 79–103.

Kuijt J. 1969. The biology of flowering parasitic plants. Berkeley (CA): University of California Press.

Lawrence JG. 2005. Common themes in the genome strategies of pathogens. Curr Opin Genet Dev. 15:584–588.

Leake JR. 1994. The biology of myco-heterotrophic ("saprophytic") plants. New Phytol. 127:171–216.

Leake JR, Cameron DD. 2010. Physiological ecology of mycoheterotrophy. New Phytol. 185:601–605.

Liere K, Weihe A, Börner T. 2011. The transcription machineries of plant mitochondria and chloroplasts: composition, function, and regulation. J Plant Physiol. 168:1345–1360.

Logacheva MD, Schelkunov MI, Nuraliev MS, Samigullin TH, Penin AA. 2014. The plastid genome of mycoheterotrophic monocot Petrosavia stellaris exhibits both gene losses and multiple rearrangements. Genome Biol Evol. 6:238–246.

Logacheva MD, Schelkunov MI, Penin AA. 2011. Sequencing and analysis of plastid genome in mycoheterotrophic orchid Neottia nidus-avis. Genome Biol Evol. 3:1296–1303.

Maddison WP, Maddison DR. 2011. Mesquite: a modular system for evolutionary analysis, version 2.75. Available from: http://mesquiteproject.org

Martín M, Funk HT, Serrot PH, Poltnigg P, Sabater B. 2009. Functional characterization of the thylakoid ndh complex phosphorylation by site-directed mutations in the ndhF gene. Biochim Biophys Acta. 1787:920–928.

Martín M, Sabater B. 2010. Plastid ndh genes in plant evolution. Plant Physiol Biochem. 48:636–645.

McCarty RE, Evron Y, Johnson EA. 2000. The chloroplast ATP synthase: a rotary enzyme? Annu Rev Plant Physiol Plant Mol Biol. 51:83–109.

McNeal JR, Arumugunathan K, Kuehl JV, Boore JL, Depamphilis CW. 2007. Systematics and plastid genome evolution of the cryptically photosynthetic parasitic plant genus Cuscuta (Convolvulaceae). BMC Biol. 5:55.

McNeal JR, Bennett JR, Wolfe AD, Mathews S. 2013. Phylogeny and origins of holoparasitism in Orobanchaceae. Am J Bot. 100:971–983.

Merckx V, Freudenstein JV. 2010. Evolution of mycoheterotrophy in plants: a phylogenetic perspective. New Phytol. 185:605–609.

Midford PE, Garland T Jr, Maddison WP. 2005. PDAP package of Mesquite, version 1.07. Available from: http://mesquiteproject.org

Molina J, Hazzouri KM, Nickrent D, Geisler M, Meyer RS, Pentony MM, Flowers JM, Pelser P, Barcelona J, Inovejas SA, et al. 2014. Possible loss of the chloroplast genome in the parasitic flowering plant Rafflesia lagascae (Rafflesiaceae). Mol Biol Evol. 31:793–803.

Montfort C, Küsters E. 1940. Saprophytismus und Photosynthese. I. Biochemische, physiologische Studien an Humus-Orchideen. Botanisches Archiv 40:571–633.

Moran NA. 2002. Microbial minimalism: genome reduction in bacterial pathogens. Cell 108:583–586.

Morrison HG, McArthur AG, Gillin FD, Aley SB, Adam RD, Olsen GJ, Best AA, Cande WZ, Chen F, Cipriano MJ, et al. 2007. Genomic minimalism in the early diverging intestinal parasite Giardia lamblia. Science 317:1921–1926.

Nickrent DL, Yan OY, Duff RJ, dePamphilis CW. 1997. Do nonasterid holoparasitic flowering plants have plastid genomes? Plant Mol Biol. 34:717–729.

Patel RK, Jain M. 2012. NGS QC Toolkit: a toolkit for quality control of next generation sequencing data. PLoS One 7:e30619.

Peredo EL, King UM, Les DH. 2013. The plastid genome of Najas flexilis: adaptation to submersed environments is accompanied by the complete loss of the NDH complex in an aquatic angiosperm. PLoS One 8:e68591.

Porra RJ. 2002. The chequered history of the development and use of simultaneous equations for the accurate determination of chlorophylls a and b. Photosyn Res. 73:149–156.

Protasio AV, Tsai IJ, Babbage A, Nichol S, Hunt M, Aslett MA, De Silva N, Velarde GS, Anderson TJC, Clark RC, et al. 2012. A systematically improved high quality genome and transcriptome of the human blood fluke Schistosoma mansoni. PLoS Negl Trop Dis. 6:e1455.

Randle CP, Wolfe AD. 2005. The evolution and expression of rbcL in holoparasitic sister-genera Harveya and Hyobanche (Orobanchaceae). Am J Bot. 92:1575–1585.

Ranwez V, Harispe S, Delsuc F, Douzery EJP. 2011. MACSE: Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons. PLoS One 6:e22594.

Sanderson MJ. 1997. A nonparametric approach to estimating divergence times in the absence of rate constancy. Mol Biol Evol. 14:1218–1231.

Smith DR, Lee RW. 2014. A plastid without a genome: evidence from the nonphotosynthetic green algal genus Polytomella. Plant Physiol. 164:1812–1819.

Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690.

Straub SCK, Cronn RC, Edwards C, Fishbein M, Liston A. 2013. Horizontal transfer of DNA from the mitochondrial to the plastid genome and its subsequent evolution in milkweeds (Apocynaceae). Genome Biol Evol. 5:1872–1885.

Straub SCK, Fishbein M, Livshultz T, Foster Z, Parks M, Weitemier K, Cronn RC, Liston A. 2011. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing. BMC Genomics 12:211.

Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. 2013. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol. 30:2725–2729.

Tsai IJ, Zarowiecki M, Holroyd N, Garciarrubio A, Sanchez-Flores A, Brooks KL, Tracey A, Bobes RJ, Fragoso G, Sciutto E, et al. 2013. The genomes of four tapeworm species reveal adaptations to parasitism. Nature 496:57–63.

Umate P, Schwenkert S, Karbat I, Dal Bosco C, Mlcochova L, Volz S, Zer H, Herrmann RG, Ohad I, Meurer J. 2007. Deletion of psbM in tobacco alters the Q(B) site properties and the electron flow within photosystem II. J Biol Chem. 282:9758–9767.

Vaidya G, Lohman DJ, Meier R. 2011. SequenceMatrix: concatenation software for the fast assembly of multi-gene datasets with character set and codon information. Cladistics 27:171–180.

Westwood JH, Yoder JI, Timko MP, dePamphilis CW. 2010. The evolution of parasitism in plants. Trends Plant Sci. 15:227–235.

Wicke S, Müller KF, de Pamphilis CW, Quandt D, Wickett NJ, Zhang Y, Renner SS, Schneeweiss GM. 2013. Mechanisms of functional and physical genome reduction in photosynthetic and nonphotosynthetic parasitic plants of the broomrape family. Plant Cell 25:3711–3725.

Wicke S, Schäferhoff B, dePamphilis CW, Müller KF. 2014. Disproportional plastome-wide increase of substitution rates and relaxed purifying selection in genes of carnivorous Lentibulariaceae. Mol Biol Evol. 31:529–545.

Wicke S, Schneeweiss GM, dePamphilis CW, Müller KF, Quandt D. 2011. The evolution of the plastid chromosome in land plants: gene content, gene order, gene function. Plant Mol Biol. 76:273–297.

Wickett NJ, Honaas LA, Wafula EK, Das M, Huang K, Wu BA, Landherr L, Timko MP, Yoder J, Westwood JH, et al. 2011. Transcriptomes of the parasitic plant family Orobanchaceae reveal surprising conservation of chlorophyll synthesis. Curr Biol. 21:2098–2104.

Wimpee CF, Wrobel RL, Garvin DK. 1991. A divergent plastid genome in Conopholis americana, an achlorophyllous parasitic plant. Plant Mol Biol. 17:161–166.

Wolfe AD, dePamphilis CW. 1998. The effect of relaxed functional constraints on the photosynthetic gene rbcL in photosynthetic and nonphotosynthetic parasitic plants. Mol Biol Evol. 15:1243–1258.

Wolfe KH, Morden CW, Palmer JD. 1992. Function and evolution of a minimal plastid genome from a nonphotosynthetic parasitic plant. Proc Natl Acad Sci U S A. 89:10648–10652.

Wu FH, Chan MT, Liao DC, Hsu CT, Lee YW, Daniell H, Duvall MR, Lin CS. 2010. Complete chloroplast genome of Oncidium Gower Ramsey and evaluation of molecular markers for identification and breeding in Oncidiinae. BMC Plant Biol. 10:68.

Wyman SK, Jansen RK, Boore JL. 2004. Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20:3252–3255.

Yang JB, Tang M, Li HT, Zhang ZR, Li DZ. 2013. Complete chloroplast genome of the genus Cymbidium: lights into the species identification, phylogenetic implications and population genetic analyses. BMC Evol Biol. 13:84.

Yang ZH. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci. 13:555–556.

Yang ZH. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 24:1586–1591.

Yoder OC, Turgeon BG. 2001. Fungal genomics and pathogenicity. Curr Opin Plant Biol. 4:315–321.

Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18:821–829.

Zimmer K, Meyer C, Gebauer G. 2008. The ectomycorrhizal specialist orchid Corallorhiza trifida is a partial myco-heterotroph. New Phytol. 178:395–400.


Nearly all of those taxa display some degree of reduction in overall plastome size and gene content, particularly in photosynthesis-related gene systems, with many "extreme" examples of genomic reduction (e.g., Delannoy et al. 2011; Wicke et al. 2013). In the heterotrophic angiosperm Rafflesia and the free-living, nonphotosynthetic alga Polytomella, failure to recover plastid genomes suggests their complete loss (Molina et al. 2014; Smith and Lee 2014). Although these examples of extreme reduction of the plastid genome are highly interesting, the initial stages in the progression of plastid genome degradation remain poorly understood. To address this issue, focus must be oriented toward lower taxa (e.g., see Braukmann et al. 2013) displaying variation in trophic strategy, and thus representing a "snapshot" of the transition from autotrophy to strict heterotrophy.

Building on previous syntheses of plastid genome evolution in heterotrophic plants (e.g., see Krause 2008; Wicke et al. 2011), Barrett and Davis (2012) proposed a model of plastid genome degradation with five "stages": (1) ndh; (2) psa, psb, pet, rbcL, ycf3, ycf4; (3) rpo; (4) atp; (5) rpl, rps, rrn, trn, accD, clpP, infA, matK, ycf1, ycf2. The rationale behind this model is that in many groups, the NAD(P)H dehydrogenase complex (i.e., ndh genes; stage 1) is the first plastid-encoded gene system to show evidence of degradation, even in some photosynthetic groups (see Wicke et al. 2011; Barrett and Davis 2012, and references therein). Second, as a result of the transition to a heterotrophic lifestyle, purifying selection is expected to be relaxed for genes involved directly in photosynthesis (stage 2), followed by degradation of the plastid-encoded RNA polymerase complex ("PEP," or rpo-genes; stage 3), due to the role of the latter in transcribing plastid operons encoding many gene products with photosynthesis-related functions. The next complex hypothesized to show evidence of degradation is the ATP synthase complex (atp-genes; stage 4). Although this complex is directly involved in synthesizing ATP as part of the photosynthetic process, its function may remain essential in plastids other than chloroplasts (e.g., amyloplasts). The last proposed stage is the degradation of "housekeeping" genes that play roles in basic processes of the plastid itself (e.g., on-site protein synthesis, RNA processing).
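The five stages listed above can be written down as a small lookup, which makes it easy to ask which stages a given set of degraded genes touches. The following is a minimal sketch only: the gene-family prefixes come from the text, whereas the function names and the prefix-matching rule are illustrative assumptions and not part of the Barrett and Davis (2012) model itself.

```python
# Stages of the degradation model as listed in the text (Barrett and Davis 2012).
# Matching genes to stages by name prefix is an illustrative assumption.
STAGES = [
    (1, ("ndh",)),
    (2, ("psa", "psb", "pet", "rbcL", "ycf3", "ycf4")),
    (3, ("rpo",)),
    (4, ("atp",)),
    (5, ("rpl", "rps", "rrn", "trn", "accD", "clpP", "infA", "matK", "ycf1", "ycf2")),
]


def stage_of_gene(gene):
    """Return the model stage for a gene name, or None if the model does not list it."""
    for stage, prefixes in STAGES:
        if any(gene.startswith(prefix) for prefix in prefixes):
            return stage
    return None


def stages_affected(degraded_genes):
    """Return the sorted list of model stages touched by a set of degraded genes."""
    stages = {stage_of_gene(gene) for gene in degraded_genes}
    stages.discard(None)  # genes such as ccsA or cemA are not assigned to a stage
    return sorted(stages)


# Hypothetical input: a few gene names of the kind reported as degraded.
example = ["ndhA", "psbB", "rbcL", "rpoC1", "ccsA"]
print(stages_affected(example))
```

A plastome whose degraded genes fall only in stages 1 to 3, with atp and housekeeping genes intact, would sit in the early part of the proposed trajectory.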

An ideal taxon for comparison to the model of Barrett and Davis (2012) is the mycoheterotrophic Corallorhiza (the coralroot orchids). Corallorhiza is a genus of 12 leafless species (Freudenstein 1992, 1997; Freudenstein and Senyo 2008). The genus contains some species that are partially heterotrophic and have presumably retained the capacity for photosynthesis. These taxa all have some degree of visible green tissue (Co. trifida, Co. odontorhiza, Co. wisteriana, Co. bulbosa, Co. macrantha, and Co. maculata var. mexicana; see supplementary fig. S1, Supplementary Material online). Photosynthesis has been demonstrated in one of these species, Co. trifida, the "greenest" of the coralroots (Zimmer et al. 2008; Cameron et al. 2009). Other, putatively achlorophyllous coralroot species completely lack visible green tissue (supplementary fig. S1, Supplementary Material online), including the Co. striata complex and some members of the Co. maculata complex (Co. mertensiana and Co. maculata vars. maculata and occidentalis). Preliminary evidence of plastome degradation in Corallorhiza comes from heterologous probe experiments (Freudenstein and Doyle 1994a), the presence of RuBisCO large subunit (rbcL) pseudogenes in some lineages but not others (Barrett and Freudenstein 2008), and sequencing of the complete plastome of Co. striata var. vreelandii (Barrett and Davis 2012).

Corallorhiza represents a potentially powerful model clade for answering questions related to the earliest stages of plastome degradation in heterotrophic plants. Furthermore, little is known about the plastid genomes of partial heterotrophs: do they also display evidence of "transitional" plastome degradation? Are all nongreen species of Corallorhiza truly achlorophyllous? Which gene complexes are the first to accumulate loss-of-function mutations? Is there a typical progression of pseudogene formation/gene loss within this genus, in accordance with the model of Barrett and Davis (2012), and how do the plastomes of Corallorhiza compare with those of other parasitic angiosperms? When comparing plastomes of "green" versus "nongreen" species, are there differences in the numbers of functional genes, overall plastome size, GC content, nonsynonymous versus synonymous substitution rates, and structural rearrangements? In this study, paired-end (PE) Illumina sequencing of genomic DNA was carried out to generate complete plastomes for nine members of Corallorhiza, representing all species complexes in the genus, in order to address the following principal hypotheses:

(I) Nongreen members of Corallorhiza display plastid genome degradation (pseudogenes, deletions, rearrangements), whereas members with some green tissue are more conserved, when both are compared with the plastomes of green, leafy orchid species.

(II) There is a general path of degradation among plastid-encoded gene complexes, corresponding to the model of Barrett and Davis (2012). Corallorhiza occupies the relatively early stages in this model.

(III) In addition to pseudogenes and losses, some members of Corallorhiza have experienced relaxed purifying selection for loci with intact reading frames, especially in lineages that display evidence of degradation in other regions of the plastome.

Results

Chlorophyll Content

All Corallorhiza taxa sampled have detectable levels of chlorophyll (fig. 1). Of the three green taxa included, Co. trifida has the highest mean total chlorophyll concentration (31.6 ng/mg), whereas Co. wisteriana and Co. odontorhiza have slightly less (21.5 and 17.9 ng/mg, respectively); there is relatively wide variation in chlorophyll content among individuals within each green species. All nongreen members of Corallorhiza have lower chlorophyll concentrations compared with green Corallorhiza, with none exceeding a mean value of 2 ng/mg (fig. 1). Chlorophyll content varies significantly among green versus nongreen taxa after taking phylogeny into account by way of Phylogenetic Analysis of Variance (P_PhyANOVA = 0.0146; see fig. 1). Moreover, these values are strikingly consistent across the nongreen species for which chlorophylls were measured, with very little variation within each species.
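The logic of a phylogenetic ANOVA in the sense of Garland et al. (1993) can be sketched as follows: the observed F statistic for the grouping is compared with a null distribution of F statistics obtained by simulating the trait along the tree under Brownian motion. The block below is a simplified illustration under stated assumptions; the toy tree, branch lengths, tip names, and trait values are hypothetical and are not the data analyzed here, and the analysis in the paper may differ in its details.

```python
import random


def simulate_bm(node, value, rng, out):
    """Simulate Brownian motion down a tree of (label_or_children, branch_length) nodes."""
    label, length = node
    value = value + rng.gauss(0.0, length ** 0.5)  # variance grows with branch length
    if isinstance(label, str):
        out[label] = value
    else:
        for child in label:
            simulate_bm(child, value, rng, out)
    return out


def f_statistic(values, groups):
    """One-way ANOVA F statistic for tip values given a tip -> group mapping."""
    by_group = {}
    for name, x in values.items():
        by_group.setdefault(groups[name], []).append(x)
    n, k = len(values), len(by_group)
    grand = sum(values.values()) / n
    means = {g: sum(v) / len(v) for g, v in by_group.items()}
    ss_between = sum(len(v) * (means[g] - grand) ** 2 for g, v in by_group.items())
    ss_within = sum((x - means[g]) ** 2 for g, v in by_group.items() for x in v)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def phy_anova_p(tree, observed, groups, n_sim=999, seed=1):
    """Return (observed F, simulation-based P value)."""
    rng = random.Random(seed)
    f_obs = f_statistic(observed, groups)
    hits = sum(
        1 for _ in range(n_sim)
        if f_statistic(simulate_bm(tree, 0.0, rng, {}), groups) >= f_obs
    )
    return f_obs, (hits + 1) / (n_sim + 1)


# Hypothetical toy tree and trait values (not the Corallorhiza data).
toy_tree = ([
    ([("green_1", 1.0), ("green_2", 1.0)], 1.0),
    ([("nongreen_1", 1.0), ("nongreen_2", 1.0)], 1.0),
    ("green_3", 2.0),
], 0.0)
toy_values = {"green_1": 30.0, "green_2": 22.0, "green_3": 18.0,
              "nongreen_1": 1.5, "nongreen_2": 1.0}
toy_groups = {"green_1": "green", "green_2": "green", "green_3": "green",
              "nongreen_1": "nongreen", "nongreen_2": "nongreen"}

f_obs, p_value = phy_anova_p(toy_tree, toy_values, toy_groups)
print(f_obs, p_value)
```

Because related taxa share part of their simulated history, the null F distribution is wider than under an ordinary ANOVA, which is why a phylogenetically corrected P value is typically more conservative.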

Genomic Data Sets and Plastomes

Characteristics of the Illumina data sets and resulting finished plastome assemblies are listed in tables 1 and 2. Mean coverage of plastomes for newly sequenced genomic data sets ranges from 81.0× (Co. wisteriana) to 271.3× (Co. odontorhiza). The length of the plastome among the Corallorhiza accessions sampled ranges from 151,506 bp in Co. macrantha to 137,505 bp in Co. striata vreelandii (table 2). On average, plastomes of taxa with at least some visible green tissue are 148.74 kb in length, while those from taxa lacking visible green tissue average 144.98 kb, but the difference in mean total plastome length between these two groupings is nonsignificant after taking relationships into account (P_PhyANOVA = 0.06189).

The length of the inverted repeat (IR) ranges from 25,775 bp in Co. odontorhiza to 27,243 bp in Co. maculata var. mexicana, with no major variation in gene content, nor evidence of any gene being incorporated into or lost from the IR. Corallorhiza striata var. vreelandii has by far the shortest large single copy (LSC; 72,152 bp). This is followed by Co. maculata var. maculata (80,401 bp), Co. mertensiana (81,146 bp), and Co. maculata var. occidentalis (81,363 bp). Co. macrantha (84,253 bp) and Co. maculata var. mexicana (84,349 bp) have the longest LSC regions. Thus, the range of length variation among members of the Co. maculata complex for the LSC region (3,852 bp) is substantial. Nongreen taxa have significantly shorter LSC regions after accounting for phylogenetic relationships (P_PhyANOVA = 0.0148). Unique to Co. maculata var. maculata is an approximately 16-kb inversion in the LSC region, with breakpoints occurring near ycf4 and within the petD intron (supplementary fig. S2, Supplementary Material online).
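Because a typical plastome is quadripartite (one LSC, one SSC, and two copies of the IR), the region lengths reported in table 2 can be cross-checked against the total lengths. The short sketch below does this with the values copied from table 2; the dictionary layout and function name are illustrative choices.

```python
# (LSC, SSC, IR, total length) in bp, copied from table 2.
TABLE2 = {
    "Co. bulbosa": (82853, 12368, 26711, 148643),
    "Co. macrantha": (84253, 12554, 27112, 151031),
    "Co. maculata var. maculata": (80401, 12885, 26800, 146886),
    "Co. maculata var. mexicana": (84349, 12671, 27243, 151506),
    "Co. maculata var. occidentalis": (81363, 12368, 26432, 146595),
    "Co. mertensiana": (81146, 13737, 26529, 147941),
    "Co. odontorhiza": (82259, 13508, 25775, 147317),
    "Co. striata var. vreelandii": (72152, 12387, 26483, 137505),
    "Co. trifida": (83085, 14427, 25936, 149384),
    "Co. wisteriana": (82350, 11743, 26172, 146437),
}


def quadripartite_length(lsc, ssc, ir):
    """Total plastome length for one LSC, one SSC, and two IR copies."""
    return lsc + ssc + 2 * ir


for taxon, (lsc, ssc, ir, total) in TABLE2.items():
    status = "ok" if quadripartite_length(lsc, ssc, ir) == total else "MISMATCH"
    print(f"{taxon}: {status}")

shortest_lsc = min(TABLE2, key=lambda taxon: TABLE2[taxon][0])
print("Shortest LSC:", shortest_lsc)
```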

The length of the small single copy (SSC) also varies substantially, ranging from 11,743 bp in Co. wisteriana to 14,427 bp in Co. trifida, but there is no significant difference in length between the SSC regions of the plastome for green versus nongreen taxa (P_PhyANOVA = 0.7382), nor is there any correlation between SSC length and chlorophyll content (P_PIC = 0.4093). GC content is lowest in Co. striata vreelandii (34.14%) and highest in Co. trifida (35.74%), but does not differ significantly among green versus nongreen taxa, nor does it correlate with chlorophyll content after accounting for phylogenetic relationships (P_PhyANOVA = 0.0648 and P_PIC = 0.0989, respectively).

Table 1. Voucher Information, GenBank Accession Numbers, the Number of PE Illumina Reads (# reads), and Mean Coverage of Each Plastome.

Species                                                   Voucher                       GenBank(a)  # reads    ×-Cov(b)
Corallorhiza bulbosa A. Rich. & Galeotti                  CFB 238(c) (OS, MEXU)(d)      KM390013    10757100   243.8
Co. macrantha Schltr.                                     Salazar 8191(c) (OS, MEXU)    KM390017    9222076    136.4
Co. maculata (Raf.) Raf. var. maculata                    JVF 2919 (OS)                 KM390014    11498452   109.1
Co. maculata var. mexicana (Lindl.) Freudenstein          CFB 232(c) (OS, MEXU)         KM390015    7828614    118.2
Co. maculata var. occidentalis (Lindl.) Ames              JVF 2095 (OS)                 KM390016    13204554   225.1
Co. mertensiana Bong.                                     JVF 1999 (OS)                 KM390018    14540670   212.0
Co. odontorhiza (Willd.) Poir.                            JVF 2778 (OS)                 KM390021    21861728   271.3
Co. striata L. var. vreelandii(e) (Rydb.) L.O. Williams   Taylor 341 (UNM)              JX087681    3204074    43.9
Co. trifida Chatel.                                       JVF 2676 (OS)                 KM390019    13617730   236.9
Co. wisteriana Conrad                                     JVF 2462 (OS)                 KM390020    9992438    81.0

(a) GenBank accession numbers will be included contingent upon a favorable review.
(b) Mean depth of coverage of each plastome (×-cov), based on remapping the original read pool to the finished plastome.
(c) Duplicate vouchers at Universidad Nacional Autonoma de Mexico (UNAM).
(d) Herbarium codes: (OS) Ohio State University Herbarium; (UNM) University of New Mexico Herbarium; (MEXU) Universidad Nacional Autonoma de Mexico.
(e) Sequenced and assembled in Barrett and Davis (2012).

[Figure 1: bar chart. Vertical axis: total chlorophyll (ng/mg), 0 to 40. Taxa (sample sizes): C. trifida (4), C. wisteriana (4), C. odontorhiza (4), C. mertensiana (4), C. mac. occidentalis (9), C. striata (11), C. bentleyi (5); the last four are bracketed as non-green taxa.]

FIG. 1. Total chlorophyll concentration (chlorophylls a and b combined, as measured by UV-Vis spectroscopy), averaged across multiple individuals in each taxon. Numbers adjacent to taxon names are sample sizes (# individuals). Error bars represent standard deviations.

Structural Evolution of the Plastome: Pseudogenes and Losses

Figure 2 shows the maximum likelihood (ML) tree based on whole plastomes + rDNA (see below), with gene losses-of-function parsimoniously mapped, assuming no reversals. Nongreen taxa have significantly fewer putatively functional loci (as assessed by intact reading frames) than green Corallorhiza and leafy outgroups (P_PhyANOVA < 0.0001; table 2). In the subset of coralroot taxa from which chlorophyll data are available, chlorophyll concentration is also significantly correlated with the number of putatively functional genes (P_PIC = 0.002).
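The P_PIC values reported here refer to correlations assessed with phylogenetically independent contrasts (Felsenstein 1985). As a rough illustration of how such contrasts are computed on a fully bifurcating tree, the following sketch implements the standard pruning recursion and a correlation through the origin. The toy tree, branch lengths, and trait values are hypothetical, and significance testing is omitted.

```python
def contrasts(node, trait):
    """Return (node value, adjusted branch length, standardized contrasts).

    A tip is (name, branch_length); an internal node is ((left, right), branch_length).
    """
    label, length = node
    if isinstance(label, str):
        return trait[label], length, []
    left, right = label
    x1, v1, c1 = contrasts(left, trait)
    x2, v2, c2 = contrasts(right, trait)
    contrast = (x1 - x2) / (v1 + v2) ** 0.5          # standardized contrast
    value = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)   # weighted ancestral estimate
    adjusted = length + v1 * v2 / (v1 + v2)           # lengthen branch for uncertainty
    return value, adjusted, c1 + c2 + [contrast]


def pic_correlation(tree, x, y):
    """Correlation of two traits computed from contrasts, forced through the origin."""
    cx = contrasts(tree, x)[2]
    cy = contrasts(tree, y)[2]
    sxy = sum(a * b for a, b in zip(cx, cy))
    sxx = sum(a * a for a in cx)
    syy = sum(b * b for b in cy)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical four-taxon tree and trait values (not the Corallorhiza data).
left_clade = ((("a", 1.0), ("b", 1.0)), 1.0)
right_clade = ((("c", 1.0), ("d", 1.0)), 1.0)
toy_tree = ((left_clade, right_clade), 0.0)
trait_x = {"a": 1.0, "b": 2.0, "c": 4.0, "d": 7.0}
trait_y = {name: 2.0 * value + 1.0 for name, value in trait_x.items()}

print(pic_correlation(toy_tree, trait_x, trait_y))
```

With n tips the recursion yields n - 1 contrasts, each of which is independent of the others under Brownian motion, so the correlation is not inflated by shared ancestry.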

Plastid genes encoding subunits of the NAD(P)H dehydrogenase complex (ndh genes) have mostly become pseudogenes in Corallorhiza, with some wholly or mostly deleted. Over 50% of the gene encoding Photosystem II (PSII) protein M (psbM), a small gene of 105 bp in length, has been deleted in all members of Corallorhiza except Co. striata vreelandii and Co. trifida; it has been completely deleted in Co. maculata vars. maculata and occidentalis. Among members of Corallorhiza displaying at least some green tissue, few other pseudogenes or gene losses were detected. The cemA locus is a pseudogene in all but Co. bulbosa, Co. macrantha, and Co. trifida, whereas psaI has experienced a 4-bp insertion in Co. macrantha near the 5′-end of the gene, causing a reading frame shift (fig. 2). Grep searches of the original reads confirmed this to be the case. No pseudogenes or major deletions causing shifts in reading frame were detected in atp genes in any species of Corallorhiza.

[Figure 2: phylogenetic tree with mapped pseudogenes and deletions. Tips, top to bottom: C. striata (non-green); C. trifida, C. odontorhiza, C. wisteriana, C. bulbosa, C. macrantha, C. maculata mexicana (green); C. mertensiana, C. maculata maculata, C. maculata occidentalis (non-green). Labels on the tree: C. maculata complex; Mexico; Northern North America; 16 kb inversion.]

FIG. 2. Maximum-likelihood tree based on whole plastomes + partial rDNA cistron, with pseudogene (Ψ) content and deletions mapped parsimoniously. Deletions are shown in two classes: partial gene deletion (>6 bp) causing frame shift or truncation (<50% of gene missing), and complete or nearly complete gene deletion causing frame shift or truncation (>50% of gene missing). psa, photosystem I; psb, PSII; pet, cytochrome; rpo, plastid-encoded polymerase; ndh, NAD(P)H dehydrogenase; ccsA, cytochrome C heme-binding protein A; cemA, chloroplast envelope membrane protein A; rbcL, RuBisCO large subunit; trn, transfer RNA; ycf3 and ycf4, putative photosystem assembly factors. Dark circles on branches indicate putative transitions to strict heterotrophy.

Table 2. Green Tissue Status, Lengths of Plastomes and Subregions, GC Content, and Numbers of Putatively Functional Plastid Genes for Each Accession.

Species                          Green tissue(a)  Length (bp)  LSC (bp)  SSC (bp)  IR (bp)  GC (%)  No. functional genes(c)
Corallorhiza bulbosa             Yes              148643       82853     12368     26711    35.19   105
Co. macrantha                    Yes              151031       84253     12554     27112    35.21   104
Co. maculata var. maculata       No               146886       80401     12885     26800    34.94   89
Co. maculata var. mexicana       Yes              151506       84349     12671     27243    35.18   103
Co. maculata var. occidentalis   No               146595       81363     12368     26432    35.14   88
Co. mertensiana                  No               147941       81146     13737     26529    35.23   90
Co. odontorhiza                  Yes              147317       82259     13508     25775    35.62   103
Co. striata var. vreelandii(d)   No               137505       72152     12387     26483    34.14   82
Co. trifida(b)                   Yes              149384       83085     14427     25936    35.74   107
Co. wisteriana                   Yes              146437       82350     11743     26172    35.28   103

NOTE. GC (%), G+C base percentage of each plastome.
(a) Presence of visible green tissue based on observation of multiple populations of each species.
(b) Photosynthesis has only been demonstrated in Co. trifida.
(c) The number of putatively functional genes, as assessed by intact reading frames.
(d) Sequenced and assembled in Barrett and Davis (2012).
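A grep-style check of this kind amounts to counting raw reads that contain a short sequence spanning the putative insertion, on either strand, and comparing that count with reads supporting the uninterrupted sequence. The sketch below illustrates the idea; all sequences and reads are hypothetical placeholders, not the psaI sequence or the actual read data.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def count_reads_with(reads, query):
    """Count reads containing the query or its reverse complement."""
    forward = query.upper()
    reverse = reverse_complement(forward)
    return sum(1 for read in reads if forward in read.upper() or reverse in read.upper())


# Hypothetical junction sequences: the insertion allele carries 4 extra bases (AAGG).
without_insertion = "GACTCATT"
with_insertion = "GACTAAGGCATT"

# Hypothetical reads; the second is the reverse strand of the first.
reads = [
    "TTGACTAAGGCATTCC",
    reverse_complement("TTGACTAAGGCATTCC"),
    "TTGACTCATTCC",
]

print("reads supporting insertion:", count_reads_with(reads, with_insertion))
print("reads supporting no insertion:", count_reads_with(reads, without_insertion))
```

If essentially all reads covering the site carry the insertion, an assembly artifact becomes an unlikely explanation for the frame shift.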

As previously discussed in Barrett and Davis (2012), at least one member of all major photosynthesis-related gene systems in Co. striata vreelandii has experienced pseudogene formation or gene deletion, with the exception of genes encoding subunits of the ATP synthase complex (atp genes). Several loss-of-function mutations have occurred in the nongreen, northern North American members of the Co. maculata complex (Co. mertensiana, Co. maculata maculata, and Co. maculata occidentalis). These three taxa share a partial deletion in psbB, as well as numerous pseudogenes in the psa (Photosystem I), psb (PSII), and rpo (plastid-encoded RNA polymerase) complexes, evidenced by reading frame shifts and internal stop codons. They also all have pseudogenes for ccsA (cytochrome C biogenesis protein) and rbcL (RuBisCO large subunit).
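The two lines of evidence mentioned above, reading frame shifts and internal stop codons, can be screened for with a very simple check on an annotated coding sequence. The following is a minimal sketch that assumes the standard stop codons (TAA, TAG, TGA) and ignores complications such as introns, RNA editing, and alternative start codons; the function name and example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reading_frame_status(cds):
    """Classify a coding sequence as intact or name the first problem found."""
    cds = cds.upper()
    if len(cds) < 3:
        return "too short"
    if len(cds) % 3 != 0:
        return "frameshift"          # length is not a multiple of three
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        return "internal stop"       # premature stop before the final codon
    if codons[-1] not in STOP_CODONS:
        return "no terminal stop"
    return "intact"


# Hypothetical toy sequences.
print(reading_frame_status("ATGGCTTAA"))     # intact toy ORF
print(reading_frame_status("ATGTAAGCTTAA"))  # premature stop
print(reading_frame_status("ATGGCTTTAA"))    # one extra base shifts the frame
```

In practice a length check alone is a crude proxy for a frame shift, because it is normally judged against an alignment with an intact homolog from a related taxon.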

Mutations unique to Co. mertensiana among the northern North American members of the Co. maculata complex include deletion of a large portion of psaA, pseudogenes for psbA and psbD, and a deletion in rpoC1. Corallorhiza maculata maculata and Co. maculata occidentalis both contain pseudogenes for petA (cytochrome f), petG (cytochrome b6/f complex subunit V), and rpoB. Mutations unique to Co. maculata occidentalis include pseudogenes for psbH and petB, with deletions in ccsA and rpoC1. Mutations in Co. maculata maculata include the evolution of pseudogenes in psbA and deletions in rpoA and rpoC2. Although the petD gene in Co. maculata maculata appears to bear an intact reading frame, a breakpoint of a 16-kb inversion (fig. 2) separates the two exons of this gene, which in other taxa are adjacent and thus presumably cis-spliced.

Phylogenetic Analyses

Relationships based on ML analyses of various data matrices were generally consistent (supplementary figs. S3 and S4, Supplementary Material online). The main difference among ML topologies resulting from these data sets is the placement of Co. bulbosa. For plastid coding data, Co. bulbosa is sister to Co. odontorhiza + Co. wisteriana (bootstrap = 97%), whereas for nuclear rDNA (partial ETS-18S-ITS1-5.8S-ITS2-26S), it is placed as sister to Co. macrantha and Co. maculata mexicana (bootstrap = 92%). Whole plastomes, including introns and intergenic spacers, place Co. bulbosa as sister to the rest of the Co. maculata complex (bootstrap = 91%). Nuclear rDNA sequences place Co. trifida as sister to the remaining Corallorhiza; however, the placement of Co. striata as sister to the remaining Corallorhiza (excluding Co. trifida) received low support (bootstrap = 69%), suggesting little support from rDNA at the base of Corallorhiza.

Analyzing the entire plastome along with nuclear rDNA yields a completely resolved Corallorhiza tree, with 100% bootstrap support for all nodes (supplementary fig. S3, Supplementary Material online). Parsimony analysis of this data set yielded identical results to ML, including identical jackknife support values (100% for all nodes; tree not shown, see supplementary fig. S3, Supplementary Material online).

FIG. 3. A basic model of plastid genome degradation (Barrett and Davis 2012), using sequenced exemplar plastomes from angiosperm families containing heterotrophs: Convolvulaceae (Funk et al. 2007; McNeal et al. 2007), Orobanchaceae (Wolfe et al. 1992; Wicke et al. 2013), Orchidaceae (Chang et al. 2006; Wu et al. 2010; Delannoy et al. 2011; Barrett and Davis 2012; Logacheva et al. 2011; Yang et al. 2013), and Petrosaviaceae (Logacheva et al. 2014). Plastomes are ranked by numbers of putatively functional genes (highest to lowest). Black-filled spaces represent putatively functional genes with open reading frames, whereas gray-filled spaces represent pseudogenes, and white spaces represent complete or nearly complete gene losses. "Status": trophic strategy of each species. AU, nonparasitic autotroph; HemiP, hemiparasite; HP, holoparasite; PM, partial mycoheterotroph (analogous to hemiparasite); HM, holomycotroph. For the purposes of comparison, both leafy, green orchids (Oncidium, Phalaenopsis) are considered nonparasitic autotrophs, even though they may still rely partially on their mycorrhizal fungi. L (bp), total length of each plastome in base pairs. Photosynthesis-related: excludes atp genes, as these may serve functions outside of photosynthesis.


This topology was chosen for all downstream tree-based analyses. From the root of the tree (with Cymbidium as the outgroup), Oncidium and Phalaenopsis are successively sister to a monophyletic Corallorhiza, with Co. striata vreelandii and Co. trifida successively diverging from the ancestor of the remaining members of Corallorhiza. Nested within this clade, (Co. odontorhiza, Co. wisteriana) is sister to the Co. maculata complex. The Mexican Co. bulbosa is sister to all other members of the Co. maculata complex with 100% support, whereas within the latter, the Mexican Co. macrantha and Co. maculata mexicana are sister to one another, and this clade is collectively sister to a northern North American clade of (Co. mertensiana, (Co. maculata maculata, Co. maculata occidentalis)).

Analyses of Selective Regime

Various branch models were constructed to explicitly test the hypothesis of different ω-ratios for different branches, specifically for nongreen Corallorhiza. Only loci for which all taxa had intact reading frames were included. The most basic model (M0) is a “one-ratio” model, which assumes a single ω-ratio across all branches (here, ω_leafy = ω_cor = ω_mac-ng = ω_str; see table 3 for branch-based abbreviations of ω-ratios). Model M1, which allows all of Corallorhiza to have a different ω-ratio than the leafy outgroups (ω_leafy ≠ ω_cor = ω_mac-ng = ω_str), has a significantly better fit for atp genes (table 3). Also for atp genes, a three-ratio model (M2), allowing different ω-ratios in Corallorhiza and additionally for nongreen Corallorhiza as opposed to leafy outgroups, has a significantly better fit than the nested M1 model. Model M4, which is a two-ratio model specifying different ω-ratios for leafy outgroups + green Corallorhiza as opposed to nongreen Corallorhiza (ω_leafy = ω_cor ≠ ω_mac-ng = ω_str), is highly significant compared with the one-ratio model (M0) after Bonferroni correction. Photosynthesis-related genes (excluding atp genes) have a significantly better fit for models M2 (ω_leafy ≠ ω_cor ≠ ω_mac-ng = ω_str) and M4 (ω_leafy = ω_cor ≠ ω_mac-ng = ω_str) than M0, but these were nonsignificant after Bonferroni correction. Regardless, the branch-specific estimate (model M4) of ω in nongreen taxa (ω_mac-ng = ω_str = 0.73720) was more than twice that of green taxa (ω_leafy = ω_cor = 0.27361), approaching a pattern expected under selective neutrality for photosynthesis-related genes (ω ~ 1). “Housekeeping” genes show significantly better fit for M1 and M4, but these again were nonsignificant after Bonferroni correction for multiple comparisons. None of the branch models had a significantly better fit in nested comparisons for ycf1 + ycf2 or for matK.

Branch-site models were applied in the atp complex to test for evidence of positive selection. First, for a model in which all Corallorhiza were specified as a foreground clade, the alternative model, which allows some sites to be under positive selection (ω_2 > 1), did not fit the data significantly better than the null model, which allows no sites to be under positive selection (ω_1 = ω_2 = 1; χ² = 0.6, df = 1, P = 0.4386). However, the alternative model had a significantly better fit than the null model when specifying nongreen Corallorhiza taxa as foreground branches, after Bonferroni correction for multiple comparisons (χ² = 11.4, df = 1, P = 0.0007), suggesting that some sites in the atp complex are under positive selection.
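The P values quoted for these branch-site comparisons follow directly from the χ² statistic when the test has 1 degree of freedom. The sketch below is only an illustration of that relationship, not the CODEML workflow used in this study; the function name `chi2_sf_df1` is hypothetical, and it relies on the identity that, for 1 degree of freedom, the upper-tail probability equals erfc(sqrt(x/2)).

```python
import math


def chi2_sf_df1(stat: float) -> float:
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom.

    Hypothetical helper for illustration; uses P(X > x) = erfc(sqrt(x / 2)).
    """
    if stat < 0:
        raise ValueError("a likelihood ratio statistic cannot be negative")
    return math.erfc(math.sqrt(stat / 2.0))


# Statistics reported in the text for the atp complex (df = 1 in both cases).
p_all_corallorhiza = chi2_sf_df1(0.6)    # all Corallorhiza as foreground
p_nongreen_only = chi2_sf_df1(11.4)      # nongreen Corallorhiza as foreground

print(f"all Corallorhiza foreground: P = {p_all_corallorhiza:.4f}")
print(f"nongreen foreground:         P = {p_nongreen_only:.4f}")
```

Any Bonferroni adjustment would be applied to these uncorrected values afterward; the number of comparisons to correct for is not assumed here.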

Discussion

Chlorophyll Content of Corallorhiza

Some Corallorhiza, and many heterotrophic plants, are often referred to as “achlorophyllous” based on their lack of visible green coloration. However, all Corallorhiza included in this study contain at least some detectable levels of chlorophyll (fig. 1). Montfort and Küsters (1940) detected chlorophyll and photosynthetic activity in Co. trifida, which is not surprising given the predominantly green coloration of the above-ground organs of this species; a later study by Cummings and Welschmeyer (1998) detected some levels of chlorophyll in the nongreen Co. maculata, as well as in another nongreen, strictly heterotrophic orchid, Cephalanthera austiniae. Taxa with at least some visible green tissue had, on average, approximately a 10-fold higher chlorophyll concentration than nongreen taxa (fig. 1). Although leafless, Co. trifida is the greenest of the coralroots, with green pigmentation throughout nearly the entire above-ground portion of the plant. The distribution of green tissues among other partially green coralroot species is more variable, but the common theme is that at least some green tissue is noticeable in or around the ovary.

Coupled with the fact that green or partially green coralroots also show a general lack of degradation in photosynthesis-related plastid gene systems (with the exception of the ndh complex and a few other “minor” examples; see below), these observations suggest that the aforementioned “green” coralroot taxa may indeed all be partially heterotrophic, relying to some small degree on photosynthetic carbon. Though it has been demonstrated that Co. trifida is an inefficient photosynthesizer (i.e., net photosynthetic carbon assimilation cannot compensate for net respiration), there is likely to be an adaptive reason that photosynthesis persists in this species, and possibly in others such as Co. odontorhiza and Co. wisteriana. Ironically, Co. trifida is the one coralroot species that might be expected to depend most heavily on photosynthesis (Zimmer et al. 2008; Cameron et al. 2009). One possibility is that these species supplement fungal carbon uptake with photosynthetic carbon in specific tissues only, mainly in the ovary, inside of which hundreds or even thousands of minute “dust seeds” are produced. This has been experimentally demonstrated in the partially mycoheterotrophic orchid Limodorum abortivum, with the additional findings that photosynthesis is highest in ovary tissue and increases under experimentally induced fungal carbon limitation (Bellino et al. 2014). It is currently unknown if any temporal fungal carbon limitations occur throughout the short season when Corallorhiza displays aboveground growth (typically late spring for flowering to summer for seed/ovary development). Based on current knowledge, it is hypothesized that for vegetatively reduced orchids such as “green” Corallorhiza and Limodorum, compensation of carbon in and around developing ovaries seems to have some adaptive value, which in turn would cause purifying selection on the plastid- (and presumably nuclear-) encoded components of photosynthesis.

The more puzzling finding applies to nongreen taxa: nongreen coralroots had surprisingly consistent (albeit relatively minute) chlorophyll concentrations, with very little variation within or among species (fig. 1), despite having been sampled from different parts of their respective geographic ranges. Chlorophyll has been detected in other strictly heterotrophic plants (Cummings and Welschmeyer 1998), including members of Orobanchaceae and the monotropoid Ericaceae.

The question naturally arises: Why would putatively nonphotosynthetic taxa continue to produce chlorophyll? There are several hypotheses as to why this might be the case, including photoprotection/avoidance of light damage to DNA and other cellular components, involvement of the chlorophyll biosynthesis machinery in other biochemical pathways such as plastid-nuclear signaling, or prevention of the buildup of chlorophyll precursors that may lead to increased oxidative damage (discussed in Wickett et al. 2011 and references therein). In nearly all cases studied, strictly parasitic species (including the nongreen coralroots) that still synthesize detectable levels of chlorophyll do so at reduced levels. The fact that all nongreen coralroots contain relatively low chlorophyll concentrations means that these pigments are masked, and thus the overall coloration of these species tends to be red, purple, or brown, most likely due to anthocyanins occurring at higher concentrations (Freudenstein and Doyle 1994a; personal observation). Occasionally, yellow variants occur among populations of the various coralroot species (Freudenstein 1997; Barrett CF, personal observation). These are presumably recurrent anthocyanin-deficient mutants (yellow coloration may be due to persistence of chlorophylls and other yellow/orange carotenoid pigments), though no specific studies have been carried out to demonstrate the causes of this phenomenon in Corallorhiza. Regardless, reduced chlorophyll concentrations and the resulting apparent lack of visible green coloration in Corallorhiza are excellent indicators of relaxed selective constraints on photosynthetic machinery, and this is likely to be the case for many other parasite-containing taxa.

Table 3. Branch Models for Coding Regions of the Plastome in Corallorhiza for Which All Accessions Have Intact Reading Frames, Based on Various Configurations of Coding Data, Implemented in the CODEML Module of PAML.

| Model | Omega Estimates | np(a) | ln L(b) | χ²(c) | P/Pcorr(d) |
|---|---|---|---|---|---|
| atp genes | | | | | |
| M0 | ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.21505 | 26 | -9802.76868 | | |
| M1 | ω_leafy = 0.15175; ω_cor = ω_mac-ng = ω_str = 0.26949 | 27 | -9798.12794 | 9.281476 | |
| M2 | ω_leafy = 0.15211; ω_cor = 0.22238; ω_mac-ng = ω_str = 0.38239 | 28 | -9795.68827 | 4.879344 | – |
| M3 | ω_leafy = 0.15212; ω_cor = 0.22242; ω_mac-ng = 0.41999; ω_str = 0.36243 | 29 | -9795.62404 | 0.128448 | |
| M4 | ω_leafy = ω_cor = 0.18380; ω_mac-ng = ω_str = 0.38274 | 27 | -9797.32610 | 10.885142 | |
| M5 | ω_leafy = ω_cor = 0.18382; ω_mac-ng = 0.41997; ω_str = 0.36292 | 28 | -9797.26308 | 0.126042 | |
| Photosynthesis genes (pet, psa/b, rbcL) | | | | | |
| M0 | ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.33866 | 26 | -1854.62551 | | |
| M1 | ω_leafy = 0.29790; ω_cor = ω_mac-ng = ω_str = 0.32741 | 27 | -1854.48900 | 0.273012 | |
| M2 | ω_leafy = 0.29856; ω_cor = 0.25277; ω_mac-ng = ω_str = 0.73721 | 28 | -1852.32692 | 4.324170 | – |
| M3 | ω_leafy = 0.29858; ω_cor = 0.25278; ω_mac-ng = 0.59352; ω_str = 0.88556 | 29 | -1852.22042 | 0.213002 | |
| M4 | ω_leafy = ω_cor = 0.27361; ω_mac-ng = ω_str = 0.73720 | 27 | -1852.39825 | 4.454520 | – |
| M5 | ω_leafy = ω_cor = 0.27362; ω_mac-ng = 0.59352; ω_str = 0.88550 | 28 | -1852.29177 | 0.212952 | |
| Housekeeping genes (rps/l, infA, accD, clpP, etc.) | | | | | |
| M0 | ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.30179 | 26 | -25545.1250 | | |
| M1 | ω_leafy = 0.29055; ω_cor = ω_mac-ng = ω_str = 0.32343 | 27 | -25542.7794 | 4.691148 | – |
| M2 | ω_leafy = 0.29066; ω_cor = 0.31584; ω_mac-ng = ω_str = 0.34192 | 28 | -25542.6470 | 0.264796 | |
| M3 | ω_leafy = 0.29065; ω_cor = 0.31583; ω_mac-ng = 0.33612; ω_str = 0.34612 | 29 | -25542.6407 | 0.012646 | |
| M4 | ω_leafy = ω_cor = 0.30267; ω_mac-ng = ω_str = 0.34230 | 27 | -25542.8978 | 4.454430 | – |
| M5 | ω_leafy = ω_cor = 0.30269; ω_mac-ng = 0.33612; ω_str = 0.34668 | 28 | -25542.8908 | 0.014052 | |
| ycf1 & 2 | | | | | |
| M0 | ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.66003 | 26 | -25384.8570 | | |
| M1 | ω_leafy = 0.82314; ω_cor = ω_mac-ng = ω_str = 0.61213 | 27 | -25383.1494 | 3.415136 | |
| M2 | ω_leafy = 0.82370; ω_cor = 0.61073; ω_mac-ng = ω_str = 0.61543 | 28 | -25383.1485 | 0.001854 | |
| M3 | ω_leafy = 0.82338; ω_cor = 0.61050; ω_mac-ng = 0.59460; ω_str = 0.63131 | 29 | -25383.1215 | 0.053994 | |
| M4 | ω_leafy = ω_cor = 0.67331; ω_mac-ng = ω_str = 0.61464 | 27 | -25384.6635 | 0.386918 | |
| M5 | ω_leafy = ω_cor = 0.67313; ω_mac-ng = 0.59296; ω_str = 0.62959 | 28 | -25384.6382 | 0.050592 | |
| matK | | | | | |
| M0 | ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.43481 | 26 | -3666.30553 | | |
| M1 | ω_leafy = 0.44943; ω_cor = ω_mac-ng = ω_str = 0.42324 | 27 | -3666.27784 | 0.055394 | |
| M2 | ω_leafy = 0.45060; ω_cor = 0.38642; ω_mac-ng = ω_str = 0.53159 | 28 | -3665.93065 | 0.694374 | |
| M3 | ω_leafy = 0.45107; ω_cor = 0.38419; ω_mac-ng = 0.33356; ω_str = 0.99284 | 29 | -3664.63804 | 2.585212 | |
| M4 | ω_leafy = ω_cor = 0.41948; ω_mac-ng = ω_str = 0.53022 | 27 | -3666.08306 | 0.444942 | |
| M5 | ω_leafy = ω_cor = 0.41857; ω_mac-ng = 0.33352; ω_str = 0.98601 | 28 | -3664.80458 | 2.556974 | |

NOTE. Abbreviations for ω-ratios: leafy = green, leafy outgroups (Cymbidium, Oncidium, Phalaenopsis); cor = all Corallorhiza; mac-ng = nongreen Co. maculata (i.e., Co. mertensiana, Co. maculata maculata, Co. maculata occidentalis); str = Co. striata. Successively nested models were compared based on log-likelihood ratio tests (M1 vs. M0, M2 vs. M1, M3 vs. M2, M4 vs. M0, M5 vs. M4); all comparisons had 1 degree of freedom.
*P < 0.05; **P < 0.01; ***P < 0.001; “–” = not significant (where indicated); otherwise, blank space indicates no significance.
(a) The number of free parameters for a particular model.
(b) Log-likelihood of the data for a particular model.
(c) The chi-square distributed log-likelihood ratio test statistic used to evaluate significant differences in model fit.
(d) P values for uncorrected/Bonferroni-corrected χ² tests, where Pcorr is scaled by the number of branch parameters in the model being tested.
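The likelihood ratio statistics reported in Table 3 can be recomputed from the log-likelihoods of each nested pair of models as twice the difference in ln L. The sketch below does this for the atp genes only. It assumes the ln L values are negative and carry the decimal placement shown in the code (an interpretation chosen because it reproduces the tabulated χ² values); the names `ATP_LNL`, `COMPARISONS`, and `lrt_statistic` are hypothetical, and 3.841 is the standard 5% critical value of χ² with 1 degree of freedom, before any Bonferroni correction.

```python
# Log-likelihoods for the atp genes, as interpreted from Table 3 (assumption:
# values are negative, with four digits before the decimal point).
ATP_LNL = {
    "M0": -9802.76868,
    "M1": -9798.12794,
    "M2": -9795.68827,
    "M3": -9795.62404,
    "M4": -9797.32610,
    "M5": -9797.26308,
}

# Nested comparisons listed in the table note: (alternative, null).
COMPARISONS = [("M1", "M0"), ("M2", "M1"), ("M3", "M2"), ("M4", "M0"), ("M5", "M4")]

# Standard 5% critical value for chi-square with 1 degree of freedom (uncorrected).
CRITICAL_05_DF1 = 3.841


def lrt_statistic(lnl_alt: float, lnl_null: float) -> float:
    """Likelihood ratio test statistic: twice the gain in log-likelihood."""
    return 2.0 * (lnl_alt - lnl_null)


lrt = {
    f"{alt} vs {null}": lrt_statistic(ATP_LNL[alt], ATP_LNL[null])
    for alt, null in COMPARISONS
}

for name, stat in lrt.items():
    flag = "exceeds" if stat > CRITICAL_05_DF1 else "below"
    print(f"{name}: 2*dlnL = {stat:.4f} ({flag} the uncorrected 5% threshold)")
```

Under these assumptions, the comparisons M1 vs. M0, M2 vs. M1, and M4 vs. M0 exceed the uncorrected threshold, which matches the description of the atp genes in the Results.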

Characteristics of Corallorhiza Plastomes

With perhaps the exception of Co. striata vreelandii, plastomes among coralroot taxa do not altogether display the major size reduction that would be expected for a group of leafless parasites, and thus can be said to occupy the early stages along the path to a highly reduced or “minimal” plastid genome (Barbrook et al. 2006; Krause 2008). In a previous study, it was hypothesized that members of the Co. striata complex would have the most reduced plastomes in the genus in terms of overall size (Barrett and Davis 2012), but even the plastome of Co. striata vreelandii (the representative sequenced in that study and included here) only represents an approximately 6% total genome size reduction relative to the leafy Phalaenopsis (Chang et al. 2006). In fact, Co. macrantha and Co. maculata mexicana (table 2) both have larger plastomes than two of the three leafy outgroup orchid taxa, Phalaenopsis (148,964 bp) and Oncidium (146,484 bp). Thus, the shift to leaflessness per se in Corallorhiza does not appear to be associated with a reduction in plastome size for the genus overall (at least for partially green taxa), although an examination of plastomes for the closest relatives of Corallorhiza (e.g., Oreorchis, Aplectrum, Cremastra, Govenia, Calypso) in tribe Calypsoeae will help clarify this.

A reduction in plastome size due to deletions and mutations resulting in pseudogenes is expected to be associated with a decrease in GC content, as once-functional (and relatively GC-rich) genes become riddled with loss-of-function mutations over time as a result of relaxed photosynthesis-associated selective pressures. These coding regions begin to resemble noncoding DNA, which typically has a lower GC content (e.g., Bernardi 1989; Glémin et al. 2014). This is apparently the case for Co. striata vreelandii (34.02% GC; table 2), which is 1.16% lower than the mean GC content for the remaining coralroot plastomes (35.28%). Mean GC content for the nongreen members of the Co. maculata complex is 35.10%, and no different than that found in the green coralroots. This suggests that large-scale differences in plastid GC content may take longer to manifest, and thus likely come about later in the process of plastid modification relative to other processes such as photosynthesis gene deletions and other initial gene mutations causing loss of function. This lends additional support to the notion that Co. striata vreelandii is farther “down the path” of plastid genome degradation relative to the nongreen members of the Co. maculata complex.
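For readers who want to repeat the GC comparison on their own sequences, GC content is simply the percentage of G and C among unambiguous bases. The sketch below is a minimal illustration and not the procedure used to produce table 2; `gc_percent` is a hypothetical helper, and the two short sequences are invented toy strings chosen only to show how AT-biased substitutions lower the value.

```python
def gc_percent(sequence: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc_count = sum(base in "GC" for base in bases)
    return 100.0 * gc_count / len(bases)


# Invented toy sequences (not real plastid data): an intact, GC-richer stretch
# and the same stretch after several G/C -> A/T substitutions.
intact_like = "ATGGCGCCGTTAGC"
degraded_like = "ATGACATTATTAAC"

print(f"intact-like:   {gc_percent(intact_like):.2f}% GC")
print(f"degraded-like: {gc_percent(degraded_like):.2f}% GC")
```

Ambiguous characters such as N are skipped rather than counted, so partially resolved regions do not bias the estimate downward.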

More broadly, the leafy orchids Phalaenopsis aphrodite and the hybrid Oncidium Gower Ramsey have GC contents of 35.6% and 37.32%, respectively (Chang et al. 2006; Wu et al. 2010); these values are very similar to what is found in Co. trifida and other green coralroots, and thus green coralroots do not seem to deviate much from other leafy orchids. Members of holo- and hemiparasitic Orobanchaceae (Wicke et al. 2013) represent an interesting clade for comparison with Corallorhiza in terms of GC content. Orobanchaceae range from 38.08% GC in the hemiparasitic Schwalbea to 31.09% in the holoparasitic Phelipanche purpurea. Variation in GC content among plastomes of Corallorhiza lies completely within the bounds of that observed in Orobanchaceae, which is hypothesized to be in the relatively more advanced stages of degradation.

Structural Evolution of the Plastome: Pseudogenes and Losses

ndh Genes

Degradation of the plastid-encoded ndh complex is advanced in Corallorhiza, as has been observed in other groups (e.g., Chang et al. 2006; Funk et al. 2007; McNeal et al. 2007; Braukmann et al. 2009; Blazier et al. 2011; Braukmann and Stefanovic 2012; Braukmann et al. 2013; Iles et al. 2013; Peredo et al. 2013; Wicke et al. 2013), including some of the leafy, photosynthetic orchid plastomes included in figure 3. Interestingly, there is variation among the orchid plastomes sequenced to date with respect to the status of ndh genes (Chang et al. 2006; Wu et al. 2010; Delannoy et al. 2011; Barrett and Davis 2012; Logacheva et al. 2011; Yang et al. 2013), suggesting multiple independent degradation pathways in this complex in the orchid family.

The ancestor of Corallorhiza likely already experienced some degree of degradation in the ndh complex, as evidenced by several shared pseudogenes. Sequencing of additional orchid plastomes could reveal an evolutionary transition from a functional plastid-encoded ndh complex in leafy relatives (e.g., Aplectrum, Oreorchis, Cremastra, Govenia) to a largely degraded state, as observed in Corallorhiza. However, there is even some variation in the degree of degradation among certain members of ndh within Corallorhiza, which in part reflects the plastid phylogeny (fig. 2). This pattern includes a major deletion in ndhE in all taxa but Co. striata vreelandii; a shared pseudogene for ndhJ for all members of the Co. maculata complex besides Co. bulbosa (which here is sister to the remaining members of the Co. maculata complex; fig. 2); a loss of the same gene in (Co. odontorhiza, Co. wisteriana); and hypothesized parallel deletions of the ndhG pseudogene in Co. bulbosa and the clade of (Co. macrantha, Co. maculata mexicana).

The plastid-encoded ndh complex is believed to be dispensable for a number of reasons (reviewed in Martín and Sabater 2010; Wicke et al. 2011, 2013), which may differ for various groups of plants displaying evidence of degradation in this gene complex. Generally, the ndh complex is believed to be essential for photosynthetic function in environments with highly variable light intensities, by functioning in regulation of cyclic electron transport and also in mitigating the effects of photo-oxidative damage under high light intensities (Martín et al. 2009; Martín and Sabater 2010). In the ancestor of Corallorhiza, selective constraints on the ndh complex may have been lifted by a combination of decreased dependence upon photosynthetic carbon as a result of a shift to heavy dependence upon fungal carbon (a similar situation is also observed in carnivorous plants using animals as a carbon source, in the family Lentibulariaceae; Wicke et al. 2014) and a tendency to grow in dark, shaded, mature forests where light intensity is always low. This may also occur in other orchid lineages (including leafy, photosynthetic orchids) due to high dependence on fungal carbon as seedlings, and also the tendency to live in habitats with relatively low, or at least static, amounts of light intensity (e.g., a shaded rainforest understory or floor).

Photosynthesis-Related Genes (excluding ndh, rpo, and atp)

Nongreen members of Corallorhiza, that is, those with relatively reduced levels of chlorophyll (fig. 1), all display evidence of degradation of genes directly involved in photosynthetic processes (fig. 2; e.g., psa/b, pet, rbcL). Of these taxa, Co. striata vreelandii displays the highest number of degraded genes (table 2; figs. 2 and 3), as reported in Barrett and Davis (2012), followed by northern North American members of the Co. maculata complex (Co. mertensiana, Co. maculata maculata, Co. maculata occidentalis). Both of these complexes (Co. striata, Co. maculata) have independently undergone degradation in the following genes/gene complexes (omitting ndh; see above): pet, psa/b, rpo, ccsA, cemA, and rbcL. It is highly unlikely that transcript editing of any kind can compensate for these changes, and it can be assumed that these changes are irreversible. The possibility exists, however slight, that all deleted genes or pseudogenes among the plastomes of these taxa are redundant and that functional copies are encoded in the nuclear genome, or that genes have been “regained” through intergenomic or horizontal transfer (e.g., Iorizzo et al. 2012; Straub et al. 2013; Wicke et al. 2013).

Based on the degraded states of photosynthesis-related genes among these plastomes, it can be concluded that the aforementioned nongreen taxa are strict parasites of fungi, incapable of photosynthesis, and thus the terms “holomycotroph” or “full mycoheterotroph” are indeed appropriate. No studies of photosynthetic activity have been carried out in Corallorhiza aside from those in the “greenest” coralroot, Co. trifida (Zimmer et al. 2008; Cameron et al. 2009). It will be beneficial to measure photosynthetic rates (or determine the lack thereof), along with plastid-gene transcript levels, among multiple individuals of both green and nongreen coralroot taxa, preferably growing in close proximity to one another and in similar habitats, to definitively characterize photosynthetic capacity in the genus.

Though green members of Corallorhiza do not display significant degradation of photosynthesis-related genes (psa, psb, pet, rbcL, etc.), there are a few important exceptions, which may or may not affect the efficacy of photosynthesis in these taxa. First, members of Corallorhiza excluding Co. trifida and Co. striata have experienced a large deletion in psbM (PSII protein M). This low molecular weight peptide is believed to play a role in maintaining the stability of PSII dimers; psbM-deficient tobacco mutants displayed impaired stability of PSII but were still able to function (Umate et al. 2007; Kawakami et al. 2011). Photosynthesis has only been explicitly measured and demonstrated in Co. trifida (Zimmer et al. 2008; Cameron et al. 2009), which has a putatively functional copy of psbM. Thus, it is possible that either 1) other “green” coralroots (Co. odontorhiza, Co. wisteriana, Co. bulbosa, Co. macrantha, Co. maculata mexicana) carry out inhibited photosynthesis; 2) this gene has been transferred to the nucleus or mitochondrion and its product is imported into the plastid, allowing stabilization of the PSII dimer complex and some level of photosynthetic function; or 3) a nonfunctional psbM gene has no effect on photosynthetic function.

Another partially green coralroot, Co. macrantha (restricted to high elevation forests in southern Mexico), has a pseudogenized copy of psaI (Photosystem I Reaction Center Subunit VIII) due to a 4-bp insertion near the 5′-end of the gene relative to other coralroots, causing a reading frame shift. The psaI protein is believed to function in the stabilization of a nuclear-encoded subunit of the same complex (psaL) in Arabidopsis thaliana, and has also been shown to bind to the Light-harvesting Complex II (Jensen et al. 2007); it was suggested that the latter putative function is redundant. In Corallorhiza, this might explain why psaI in Co. macrantha has apparently become a pseudogene. Thus, among the green coralroots, Co. macrantha may represent yet another very early step in the transition to the loss of photosynthesis, or at least a reduction in photosynthetic efficacy, but this remains a hypothesis to be tested.

rpo Genes

There is evidence for rpo pseudogenes in both the nongreen Co. striata and Co. maculata complexes, based on large deletions, reading frame shifts, and internal stop codons. Plastid genes are transcribed by two polymerases in monocots: a nuclear-encoded polymerase and a plastid-encoded polymerase (Liere et al. 2011). The plastid-encoded RNA polymerase (PEP) is believed to preferentially transcribe photosynthesis-related genes, whereas the rpoB-C1-C2 operon itself is transcribed by a nuclear-encoded RNA polymerase (Hajdukiewicz et al. 1997; reviewed in Liere et al. 2011; Krause 2012). In addition to transcribing photosynthesis-related genes for the most part, PEP also tends to be more active in later leaf developmental stages (Hajdukiewicz et al. 1997; reviewed in Liere et al. 2011).

Based on this latter finding, it might be expected that all members of Corallorhiza should show some degradation in rpo genes, as leaves do not fully develop in any species within this genus (i.e., only the basal leaf sheath persists, but the blade or lamina never develops), presumably obviating the role of PEP in transcription of photosynthesis-related genes (e.g., psa, psb) in leaves. However, green coralroots have intact reading frames for all rpo genes, whereas nongreen coralroots show varying degrees of degradation in the rpo complex. As with photosynthesis-related genes in Corallorhiza (figs. 2 and 3), it does not appear to be the transition to leaflessness per se that has resulted in degradation of the rpo complex, but more likely the shift from partial heterotrophy (and the associated need to carry out photosynthesis, whatever small amount) to strict heterotrophy. An interesting parallel occurs in the carnivorous family Lentibulariaceae, in which rpo genes display significant departures from purifying selection (Wicke et al. 2014). An interesting follow-up would be to determine whether degradation first begins among photosystem genes (psa, psb), diminishing the importance of having a functional PEP complex, or whether the opposite is the case, in which a dysfunctional (or nonexistent) PEP has contributed to relaxed selective pressures associated with photosynthesis-related genes (i.e., if psa and psb genes can no longer be transcribed, or are transcribed by an inefficient PEP, then is there any benefit to maintaining them?). Regardless, the degradation of rpo genes in nongreen Corallorhiza closely mirrors that in genes directly involved in photosynthesis (psa/b, rbcL, etc.).

Relationships in Corallorhiza

Complete plastome sequences of Corallorhiza yielded a highly resolved, highly supported tree consistent with previous studies (Freudenstein and Doyle 1994a, 1994b; Barrett and Freudenstein 2008; Freudenstein and Senyo 2008). Here, for the first time, relationships among the six taxa that comprise the Co. maculata complex (as recognized in Freudenstein 1997) are fully resolved, with high support for Co. bulbosa as sister to the rest of the complex (bootstrap = 91), whereas previous studies could not resolve relationships in this complex (Barrett and Freudenstein 2008; Freudenstein and Senyo 2008). Complete plastomes also confirm the sister relationship of Co. striata to the rest of Corallorhiza, with Co. trifida (moving away from the base of the tree), then Co. odontorhiza and Co. wisteriana diverging successively. Depending on the data set analyzed, Co. bulbosa appears in a few different places, most likely due to homoplasy and few variable characters available to resolve the position of this species in smaller data sets, as evidenced by short internal branches (supplementary fig. S3, Supplementary Material online). Thus, inclusion of complete plastomes (pseudogenes, spacers, and introns) resolves the otherwise uncertain position of this species. Nuclear rDNA, on the other hand, places Co. bulbosa as sister to (Co. macrantha, Co. maculata mexicana) with moderate support, a similar finding to the combined analysis of Barrett and Freudenstein (2008) based on plastid rbcL + nuclear ITS, though that study did not include Co. macrantha. Combining all nuclear and plastid data places Co. bulbosa as sister to the remaining members of the Co. maculata complex with 100% bootstrap support (supplementary fig. S3, Supplementary Material online). The complete plastome + rDNA data set fully resolves relationships among the major species complexes of Corallorhiza (excluding members not sampled from some complexes).

In particular, relationships within the Co. maculata complex shed light on a phylogeographically intriguing aspect of plastome degradation. All Mexican members of the Co. maculata complex have at least some visible green tissue, and are presumed to be partially mycoheterotrophic based on conservation of the plastid-encoded photosynthetic apparatus (for the most part; see fig. 2). Those members of the Co. maculata complex that occur in northern North America have no visible green tissue (two of these species were shown to have highly reduced chlorophyll levels; fig. 1) and display evidence of degradation in these genes. The topology of the Co. maculata complex based on combined plastome + rDNA data (supplementary fig. S3, Supplementary Material online; fig. 2) lends support to a previous hypothesis that this complex (and possibly the genus Corallorhiza itself) originated in southern Mexico (Freudenstein and Doyle 1994a, 1994b; Barrett and Freudenstein 2008; Freudenstein and Senyo 2008) and expanded into northern North America, with the ancestor of Co. mertensiana, Co. maculata maculata, and Co. maculata occidentalis undergoing a shift from partial to strict heterotrophy (and continuing this trend among extant lineages), and thus presumably losing the ability to carry out photosynthesis. This suggests that deeper sampling among individuals in members of both the Co. maculata complex and the Co. striata complex (which is here represented only by Co. striata vreelandii) will yield a much clearer picture of plastome degradation in Corallorhiza on a phylogeographic scale.

Selective Regime among Genes with Open Reading Frames

Model M4, allowing green and nongreen taxa to have different ω-ratios (ω_leafy = ω_cor ≠ ω_mac-ng = ω_str), fit the data significantly better than the one-ratio model for both housekeeping genes and photosynthesis-related genes, suggesting different selective scenarios for nongreen taxa relative to green taxa, but these comparisons are nonsignificant when corrected for multiple comparisons (table 3). For housekeeping genes, ω-values for green versus nongreen taxa did not differ by much (ω_leafy = ω_cor = 0.30267; ω_mac-ng = ω_str = 0.34230; table 3). However, this difference is much more pronounced for photosynthesis-related genes (ω_leafy = ω_cor = 0.27361; ω_mac-ng = ω_str = 0.73720), with the estimate for nongreen taxa being approximately 2.6 times that of green taxa, and approaching a value expected under neutral evolution (ω ~ 1). Thus, it is evident that even though some photosynthesis genes in nongreen taxa still have intact reading frames, there may be a relaxation of selective pressure associated with these genes.

For atp genes, Models M1 and M4 fit the data significantly better than does M0, even after correcting for multiple comparisons, suggesting different selective regimes in Corallorhiza versus leafy green outgroups, or at least in nongreen Corallorhiza versus green Corallorhiza and leafy outgroups. For example, in model M1, the ω value for Corallorhiza is nearly twice that for leafy orchids (ω_leafy = 0.15175; ω_cor = ω_mac-ng = ω_str = 0.26949), and in model M4, nongreen Corallorhiza have ω values over twice those for green Corallorhiza and leafy orchids. Further investigations using branch-site models, allowing some sites in the “foreground” branches to be under positive selection, suggest evidence for positive selection in nongreen Corallorhiza for the atp complex.
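The “nearly twice” and “over twice” comparisons for the atp genes are simple ratios of the foreground to background ω estimates. The sketch below works through that arithmetic using the M1 and M4 estimates for atp genes from table 3, with decimal points read as shown in the code (an interpretation of the tabulated digits); `fold_difference` is a hypothetical helper name.

```python
def fold_difference(omega_foreground: float, omega_background: float) -> float:
    """Ratio of two omega (dN/dS) estimates, foreground over background."""
    if omega_background <= 0:
        raise ValueError("background omega must be positive")
    return omega_foreground / omega_background


# atp genes, model M1: all Corallorhiza versus leafy outgroups.
m1_fold = fold_difference(0.26949, 0.15175)

# atp genes, model M4: nongreen Corallorhiza versus green Corallorhiza + leafy outgroups.
m4_fold = fold_difference(0.38274, 0.18380)

print(f"M1: Corallorhiza omega is {m1_fold:.2f} times the leafy estimate")
print(f"M4: nongreen omega is {m4_fold:.2f} times the green estimate")
```

A fold difference by itself says nothing about significance; that judgment rests on the likelihood ratio tests summarized in table 3.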

The plastid-encoded atp genes play a critical role in photosynthesis by both generating ATP and translocating protons across the thylakoid membrane (Evron et al. 2000; McCarty et al. 2000; Allen et al. 2011). However, in taxa that display evidence of loss-of-function for photosystem and other photosynthesis-related genes, the fact that atp genes are preserved as intact reading frames suggests that they may play additional roles in plastid function (discussed in Wicke et al. 2013). This is bolstered by the fact that this complex shows evidence of positive selection in strictly heterotrophic, nongreen taxa, but not for Corallorhiza as a whole. This would support the hypothesis that the atp complex has either taken on additional, as-of-yet unknown roles in holomycotrophic plastids, or at a minimum has fallen under increased selective pressure to adapt to the overall “alternative lifestyle” associated with strict fungal heterotrophy in nonphotosynthetic plastids (including in amyloplasts, chromoplasts, elaioplasts, proteinoplasts, etc.). A future research objective is to examine the potential biological roles and molecular evolutionary patterns of the atp complex in nonphotosynthetic plants and their green, photosynthetic relatives, as members of this gene complex are often preserved and putatively functional among various heterotrophic angiosperms (see fig. 3 and references therein, including discussion in Wicke et al. 2013).

Comparison of Corallorhiza with Other Parasitic Angiosperms Using the Model of Barrett and Davis (2012)

A comparison of the plastomes of several heterotrophic taxa that have been sequenced to date shows that Corallorhiza is not quite in the “advanced” stages of plastid genome degradation relative to several taxa in Orobanchaceae (e.g., Wicke et al. 2013) or other orchids such as Neottia (Logacheva et al. 2012) and Rhizanthella (Delannoy et al. 2011) (fig. 3). Interpreted in the context of the basic model presented in Barrett and Davis (2012), members of Corallorhiza fall into categories 1 and 3. Taxa with some visible green tissue display pseudogenes/deletions mainly in the ndh complex (category 1; gene-class categories are listed at the top of fig. 3), with a few additional changes, whereas those with no green tissue have experienced pseudogenization and deletions in ndh, photosynthesis-related, and rpo genes, but not in atp genes or other “housekeeping” genes (category 3, excepting the loss of a redundant trnT-GGU in Co. striata var. vreelandii; Barrett and Davis 2012; figs. 2 and 3). Thus, nonparasitic autotrophs and partial heterotrophs roughly fall either into category 0 (no degradation of plastid genes) or category 1 (degradation of the ndh complex only). More specifically, all partial heterotrophs fall into category 1, with evidence of degradation of the ndh complex.

Strict heterotrophs fall into more “advanced” categories. Several, but not all, members of Orobanchaceae and two orchids (Neottia and Rhizanthella) can be classified in category 5, in which there is evidence of degradation of atp genes and at least some evidence of degradation in “housekeeping” genes (e.g., rpl, rps, trn, etc.). There are some genes for which no taxa have experienced pseudogenization or loss, including rpl2, rpl16, rpl20, rps2, rps4, rps8, rps11, ycf1, ycf2, trnD-guc, trnE-uuc, trnI-cau, trnM-cau, trnQ-uug, trnY-gua, rrn16, rrn23, rrn4.5, and rrn5. This suggests that they may be indispensable for plastid function, at least in the angiosperms included in this comparison, and that a “minimal set” of genes might be necessary for functional plastid “housekeeping” and perhaps other functions outside of photosynthesis and ATP production.

Overall, no taxa fell in categories 2 or 4, suggesting that degradation of the rpo genes occurs more or less simultaneously with degradation in photosynthesis-related genes, and that degradation of atp genes occurs along with that in “housekeeping” genes (also discussed in Wicke et al. 2013). Thus, the model of Barrett and Davis (2012) could be modified in light of more recently sequenced plastomes, including those of Corallorhiza, to have five general categories: 1) no degradation; 2) degradation in the ndh complex only; 3) degradation of the ndh complex and photosynthesis-related genes, including rpo genes; 4) degradation in all plastid gene systems, including “housekeeping” and atp genes; and 5) complete or nearly complete loss of the plastid genome (as is hypothesized to have occurred in some strict heterotrophs such as Corynaea and Rafflesia [Nickrent et al. 1997; Molina et al. 2014]).
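The five categories of the modified model can be read as a simple decision rule. The following is a minimal sketch in Python, assuming that each plastome has already been summarized as the set of gene classes showing pseudogenes or deletions; the function name, the class labels, and the `plastome_lost` flag are illustrative and are not part of the original analysis.

```python
def degradation_category(degraded_classes, plastome_lost=False):
    """Assign a plastome to one of the five categories of the modified model.

    degraded_classes: gene-class labels with pseudogenes or deletions, drawn
    from {"ndh", "photosynthesis", "rpo", "atp", "housekeeping"}
    (labels are hypothetical shorthand for the classes named in the text).
    """
    degraded = set(degraded_classes)
    if plastome_lost:
        return 5  # complete or nearly complete loss of the plastid genome
    if degraded & {"atp", "housekeeping"}:
        return 4  # degradation in all systems, incl. atp and housekeeping
    if degraded & {"photosynthesis", "rpo"}:
        return 3  # ndh plus photosynthesis-related genes, including rpo
    if "ndh" in degraded:
        return 2  # degradation in the ndh complex only
    return 1      # no degradation


# Example: a nongreen taxon with losses in ndh, photosynthesis and rpo genes
print(degradation_category({"ndh", "photosynthesis", "rpo"}))
```

This rule is deliberately coarse: it would place a taxon with a single lost tRNA in category 4, and it cannot represent lineages that deviate from the stepwise order, so borderline cases still require judgment.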

There is some degree of deviation from the Barrett and Davis (2012) model, as might be expected given the potential for somewhat idiosyncratic patterns of plastome evolution in unrelated groups. The obvious exception to this model would be the sequenced members of the genus Cuscuta in the family Convolvulaceae. One of the best-studied parasitic systems in terms of plastome evolution, Cuscuta contains members with “extreme” plastome reduction relative to closely related leafy species (approximately 50% total size reduction in Cu. gronovii and Cu. obtusiflora relative to the leafy Ipomoea) (fig. 3 and references therein). Yet both highly reduced species of Cuscuta retain most, if not all, of the genes for photosynthesis, but have degraded RNA polymerase complexes and also show evidence of pseudogenes/deletions of some housekeeping genes. A further survey based on probe hybridization in Cuscuta with greatly increased taxon sampling by Braukmann et al. (2013) suggests that this genus encapsulates multiple transitions to highly degraded plastid genomes, with members of subgenus Grammica displaying even more advanced stages of degradation. These taxa were not included in figure 3, as they are based on hybridization and have not yet been sequenced to generate complete plastomes. Recycling of respiratory CO2 has been offered as an explanation for the retention of putatively functional genes in the psa, psb, and pet complexes (McNeal et al. 2007), underscoring the importance of studying the physiology of strict heterotrophs on a taxon-by-taxon basis, and also highlighting how little is known about the basic physiology and functional genomics of these plants.

Conclusion

All members of Corallorhiza produce detectable levels of chlorophyll based on the current sample, but nongreen, strictly heterotrophic members have on average tenfold lower concentrations relative to green, partially heterotrophic members. Genomic data allowed assembly of complete plastomes as well as partial rDNA operons, which, when combined, fully resolve relationships among the major species complexes in Corallorhiza. Based on complete plastomes, Corallorhiza can be said to span the “early” stages of plastid genome degradation as a result of relaxed selective pressures on photosynthetic function. Corallorhiza is useful in providing a model system to understand the earliest stages in the transition to a “minimal plastome,” as are observed in other heterotrophic plants (e.g., Orobanchaceae, other orchids like Rhizanthella). There have been at least two independent transitions to strict heterotrophy from partial heterotrophy in the genus (Co. maculata and Co. striata complexes). Nongreen coralroots also display evidence of relaxed selective pressures on photosynthesis-related genes which have retained open reading frames. Members of the atp complex display relaxed purifying selection among Corallorhiza as a whole, but more specifically display evidence for positive selection for nongreen, strictly heterotrophic taxa, although for reasons currently unknown.

Generally, there is a clear path of plastid genome degradation in the heterotrophic angiosperm lineages studied thus far (with some deviations from the overall model, e.g., some Cuscuta). Moreover, it can be hypothesized that there is a stepwise progression to a “minimal” plastome, and Corallorhiza occupies categories 2 (partially heterotrophic taxa, which display degradation of the ndh complex only and a few other pseudogenes) and 3 (strictly heterotrophic taxa, including degradation of the ndh complex, photosynthesis-related genes, and rpo genes [excluding atp genes]) in a modified model of the degree of plastome degradation among angiosperms. Future studies will include 1) deeper sampling of complete plastomes within species complexes of Corallorhiza to gain a phylogeographic perspective on plastome degradation; 2) direct assessment of the relative photosynthetic capabilities of members of each coralroot taxon in the field; and 3) quantification of transcript levels for plastid-encoded genes within each species to gain a clearer understanding of gene expression, and to test the hypothesis of the existence of duplicate, functional copies of pseudogenized or deleted plastid genes.

Materials and Methods

Plant Material, DNA Isolation, Library Preparation, and PE Sequencing

Plant material was collected and preserved either in silica gel or frozen at −80 °C. Total genomic DNAs were extracted using the cetyltrimethylammonium bromide (CTAB) method (Doyle JJ and Doyle JL 1987) from 1.0 g frozen or 0.2–0.5 g silica-dried material, and quantified using a NanoDrop 2000 spectrophotometer (Thermo Scientific, Waltham, MA). Samples with total DNA concentrations >100 ng/μl were selected and run on a 1.5% agarose gel to check for high molecular weight, nondegraded DNA. Samples were shipped on ice to Cold Spring Harbor Laboratory (CSHL, Woodbury, NY), where they were sheared to approximately 450 bp, followed by library preparation and barcoding for PE Illumina sequencing. The exception was Co. odontorhiza, for which library preparation and sequencing were completed at the University of Missouri–Columbia using the Illumina TruSeq Library Preparation Kit (Illumina Inc., San Diego, CA); Co. odontorhiza DNAs were sheared to approximately 320 bp. All samples were run on Illumina HiSeq 2000 machines to generate 100-bp PE reads. Indexed samples were pooled and run with accessions from other projects, with a total of 40 or 20 accessions in a single lane on the CSHL and University of Missouri machines, respectively. Voucher information and characteristics of each of the resulting data sets and completed plastomes are listed in table 1.

Quantification of Chlorophyll Content

Approximately 0.1 g of −80 °C frozen tissue was harvested from the outer wall of the developing ovary and weighed for chlorophyll quantification. Ovary tissue was chosen 1) because this is the only organ that is visibly green in some taxa (e.g., Co. odontorhiza, Co. wisteriana) and 2) to be consistent between taxa. Chlorophylls were extracted in 80% ice-cold acetone and stored at −80 °C. Chlorophylls a and b were quantified on a Cary 4000 UV-VIS spectrophotometer (Agilent Technologies Inc., Santa Clara, CA; Department of Plant Cellular & Molecular Biology, Ohio State University), and their concentrations were calculated using the formulas of Porra (2002) (Lima D, personal communication). Data from chlorophylls a and b were pooled to give total chlorophylls. Although individual accessions used to quantify chlorophyll concentration among species were not the same as those used for sequencing, multiple individuals from across each species’ geographic range were included to account for variation in chlorophyll content among individuals within species. Corallorhiza bentleyi (an endangered member of the Co. striata complex) is included here but was not sequenced. Conversely, Mexican-endemic members of the Co. maculata complex (Co. macrantha, Co. bulbosa, Co. maculata var. mexicana) were not included in the chlorophyll analysis due to rarity and thus lack of fresh/frozen material. Voucher specimens for plants used for chlorophyll analysis are listed in supplementary figure S5, Supplementary Material online.
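The calculation of chlorophyll concentrations from absorbance readings can be sketched as follows. The text cites Porra (2002) without stating which equations were applied, so the coefficients below are an assumption: they are the simultaneous equations commonly quoted for buffered 80% aqueous acetone, with absorbance read at 663.6 and 646.6 nm. The function names and the per-gram conversion are likewise illustrative.

```python
def chlorophyll_ug_per_ml(a663_6, a646_6):
    """Chlorophyll a, b and total (micrograms per ml of extract).

    Coefficients are ASSUMED to be those for buffered 80% aqueous acetone;
    the source does not state which equations were used.
    """
    chl_a = 12.25 * a663_6 - 2.55 * a646_6
    chl_b = 20.31 * a646_6 - 4.91 * a663_6
    return chl_a, chl_b, chl_a + chl_b


def chlorophyll_ug_per_g(total_ug_per_ml, extract_volume_ml, tissue_mass_g):
    """Scale an extract concentration to micrograms per gram of tissue."""
    return total_ug_per_ml * extract_volume_ml / tissue_mass_g


# Hypothetical readings, for illustration only
chl_a, chl_b, total = chlorophyll_ug_per_ml(1.0, 0.5)
print(round(chl_a, 3), round(chl_b, 3), round(total, 3))
```

Pooling chlorophylls a and b, as described above, corresponds to the third returned value.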


Data Processing

The resulting paired FASTQ data were processed using a custom UNIX script (“process_fastq.sh”) that calls several freely available Perl scripts for the purposes of 1) quality filtering/trimming and adaptor removal (“IlluQC_PRLL.pl,” “TrimmingReads.pl”; Patel and Jain 2012); 2) “shuffling” PE files into a single file (“ShuffleSequences_fastq.pl,” part of the VELVET suite of tools [E. Cabot, University of Wisconsin-Madison; Zerbino and Birney 2008]); and 3) conversion to FASTA format (“fastq2fasta.pl,” B. Knaus 2009 [http://brianknaus.com/software/srtoolbox/, last accessed May 1, 2014]). Reads were trimmed from their 3′-ends, discarding bases under a quality threshold of PHRED score 20 (corresponding to an error rate of 1/100 or greater), and also discarding trimmed reads below 61 bp in length.
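The trimming rule described above can be illustrated with a short Python sketch. It is not the Perl code that was actually run: it assumes Phred+33 quality encoding and a simple rule that removes bases from the 3′ end until a base at or above the threshold is reached. The function names are hypothetical.

```python
def phred_error_probability(q):
    """Error probability implied by a PHRED score (Q20 corresponds to 1/100)."""
    return 10 ** (-q / 10)


def trim_read(seq, qual, min_q=20, min_len=61, offset=33):
    """Trim low-quality bases from the 3' end of a read.

    Returns (seq, qual) after trimming, or None if the trimmed read is
    shorter than min_len. Assumes Phred+33 encoded quality strings.
    """
    end = len(seq)
    # Walk back from the 3' end while the base quality is under the threshold
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    if end < min_len:
        return None
    return seq[:end], qual[:end]


# A 70-bp read whose last 5 bases have quality 2 ('#') is trimmed to 65 bp
read = trim_read("A" * 70, "I" * 65 + "#" * 5)
print(len(read[0]))
```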

Read Mapping

PE reads were then mapped five times iteratively in Geneious v6 (created by Biomatters Inc., Auckland, New Zealand; http://www.geneious.com, last accessed December 1, 2013) using automatically determined settings to a combined file of 1) the Phalaenopsis aphrodite plastome (GenBank accession number NC_007499; Chang et al. 2006); 2) the Phoenix dactylifera mitochondrial genome (NC_016740; Fang et al. 2012); and 3) the Asclepias syriaca nuclear ribosomal DNA consensus operon (JF312046; Straub et al. 2011). The purpose of read mapping was to filter or “screen” reads that matched the reference genomes, and to get initial, crude estimates of coverage to be used in downstream de novo assemblies.

Plastome Assembly

Mean coverage estimates from Geneious were used as input for de novo plastome assembly (“-exp_cov” parameter). De novo assembly was completed using both the VELVET v1.2.10 assembler (Zerbino and Birney 2008) and the Geneious v6 de novo assembler. VELVET assemblies were generated using a variety of hash lengths (i.e., “word match” or “kmer length”) ranging from 51 to 81 for each complete data set, with specified expected coverage estimates taken from Geneious mapping (specified to be in approximate kmer coverage; see Zerbino and Birney 2008). Coverage cutoff values were input as roughly one-half of the specified expected coverage estimates, minimum contig length was set to 300 bp, and insert size was set to 450 bp (320 bp for Co. odontorhiza); the resulting contigs were assembled in Sequencher v6 (GeneCodes, Ann Arbor, MI). The resulting “draft de novo assembly” from VELVET was used as a reference to map reads in Geneious as described above (five iterations, automatically determined settings). Read mappings were visually scanned in Geneious to check for areas of low coverage or potential misassemblies. Read pairs that mapped successfully to the draft reference were used in a second round of de novo assembly, this time in Geneious with automatically determined settings, saving 100 consensus contigs. This method allowed construction of complete draft plastomes with relative ease.
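Converting a nucleotide coverage estimate into approximate kmer coverage, and deriving the coverage cutoff from it, can be sketched as below. The conversion Ck = C × (L − k + 1) / L follows the VELVET documentation; treating the read length as a single fixed value is a simplification here, because trimmed reads ranged from 61 to 100 bp. The function names and the example coverage are hypothetical.

```python
def kmer_coverage(nucleotide_coverage, read_length, k):
    """Approximate kmer coverage: Ck = C * (L - k + 1) / L."""
    return nucleotide_coverage * (read_length - k + 1) / read_length


def velvet_coverage_settings(nucleotide_coverage, read_length, k):
    """Expected coverage and a cutoff of one-half of it, as in the text."""
    exp_cov = kmer_coverage(nucleotide_coverage, read_length, k)
    return {"exp_cov": exp_cov, "cov_cutoff": exp_cov / 2}


# Hypothetical example: 200x mean nucleotide coverage, 100-bp reads, k = 51
print(velvet_coverage_settings(200, 100, 51))
```

Because kmer coverage falls as k approaches the read length, the expected coverage has to be recomputed for each hash length in the range tested.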

Contigs from both assemblies (Geneious, VELVET) were merged in Sequencher to form complete plastomes (parameters = “dirty data” or “large gap,” minimum overlap = 20 bp, minimum similarity 70–90%). Any apparent “gaps” among contigs resulting from one software platform were usually covered by a contig from the other, allowing assembly of completed plastomes (Large Single Copy–Inverted Repeat B–Small Single Copy). In a few cases VELVET scaffolds contained strings of Ns between 30 and 100 bp, mostly associated with homopolymer repeats (>10 bp). These were resolved by 1) the Geneious de novo assembly (typically) and 2) UNIX “grep” searches, with flanking sequences of 30–50 bp taken from each end of the string of Ns used as search terms. Grep results were converted to FASTA format using a custom Perl script to build a “bridge” of reads. This was done iteratively until the region in question was resolved, although most cases were resolved with a single iteration. Grep searches were applied to all cases of ambiguity between contigs from the two de novo assembly methods as a quality control. Polymerase chain reaction amplification and Sanger sequencing were not necessary due to high read coverage (table 1).
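The grep-based “bridge” step can be approximated in Python as follows. This is a sketch of the idea only, not the custom Perl script mentioned above: it assumes exact matching of the flanking sequence, also checks the reverse complement of each read (which a plain grep would not do unless run twice), and uses hypothetical function names.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def reads_matching_flank(reads, flank):
    """Collect reads containing the flank, oriented to match the flank."""
    flank_rc = reverse_complement(flank)
    hits = []
    for read in reads:
        if flank in read:
            hits.append(read)
        elif flank_rc in read:
            hits.append(reverse_complement(read))
    return hits


def to_fasta(reads, prefix="read"):
    """Write the matching reads as FASTA records for alignment into a bridge."""
    return "".join(f">{prefix}_{i}\n{r}\n" for i, r in enumerate(reads, start=1))


reads = ["TTACGTAGGC", "GCCTACGTAA", "AAAAAAAAAA"]
print(to_fasta(reads_matching_flank(reads, "ACGTAG")))
```

Reads recovered with the flank from one side of the string of Ns extend into the unresolved region; repeating the search with the newly extended end as the flank gives the iterative procedure described above.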

IR boundaries for each draft plastome were determined by using the first and last approximately 20 bp of the draft plastome as search terms (reverse complement) in Sequencher to find matches, and marking their match points as features. Then, the first approximately 700 bp and last approximately 1–2,000 bp of the draft plastome were copied, reverse complemented, and manually aligned to the IR, using the marked features from above as reference points. IR boundaries were determined informatically using PE information (i.e., read pairs that spanned the IR boundaries). Because the orientation of the ends of the LSC and SSC can vary, the plastomes were often in different configurations; single-copy regions were reoriented to correspond to the “standard” model of gene order to simplify downstream comparison. Special attention was paid to the possibility of significant modification or even loss of the IR (either partial or complete), and also to the possibility of other genomic rearrangements or structural modifications. In other words, when building draft plastomes, it was not assumed that all accessions displayed the “standard” plastome structure found in angiosperms (i.e., LSC, IRb, SSC, IRa).
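The first step, searching the draft plastome for the reverse complement of its own terminal sequence, can be sketched in Python. The search was done interactively in Sequencher; the code below only illustrates the principle, assumes exact matches, and uses hypothetical names and a toy sequence.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def terminal_repeat_matches(draft, k=20, end="last"):
    """Positions where the reverse complement of a terminal k-mer occurs.

    end="last" uses the final k bases of the draft as the search term;
    end="first" uses the initial k bases. Positions are 0-based.
    """
    probe = draft[-k:] if end == "last" else draft[:k]
    target = reverse_complement(probe)
    positions = []
    start = draft.find(target)
    while start != -1:
        positions.append(start)
        start = draft.find(target, start + 1)
    return positions


# Toy draft: LSC + IRb + SSC, followed by the first 8 bases of IRa
lsc, irb, ssc = "C" * 10, "GATTACAGGCCTTAGC", "T" * 10
draft = lsc + irb + ssc + reverse_complement(irb)[:8]
print(terminal_repeat_matches(draft, k=8))
```

A match marks the point in IRb that corresponds to the end of the draft, which is the reference point used for the manual alignment described above.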

Plastome Annotation

Plastid genomes were annotated by first aligning the set of coding loci (coding sequences [CDS]) containing introns and all tRNAs from Phalaenopsis to each draft plastome (minus IRa) in Sequencher to aid in intron/exon and tRNA boundary determination. Annotations were conducted in DOGMA (Wyman et al. 2004), which uses BLASTX searches against a database of annotated plastomes and amino acid translations, and allows manual determination of start/stop codons. DOGMA BLASTX parameters were as follows: genetic code = plant plastid, e value = 5, percent cutoff for protein coding genes = 55, and percent cutoff for tRNAs = 80. The resulting GenBank feature table was exported from the DOGMA server and imported along with the finished plastid FASTA file into SEQUIN (National Center for Biotechnology Information), where the annotation was validated and completed for submission to GenBank. A large inversion in Co. maculata var. maculata was manually reversed for the purposes of phylogenetic analyses and other whole-plastome comparisons, but the plastome for that accession was annotated with the inversion as assembled de novo. Plastome maps were drawn with GenomeVx (Conant and Wolfe 2008) and manually edited in Adobe Illustrator CS6 (Adobe Inc., San Jose, CA).

Ribosomal DNA Assembly and Annotation

Nuclear ribosomal DNA reads were filtered by mapping reads from Co. odontorhiza to a nearly complete rDNA operon reference (partial ETS [External Transcribed Spacer], 18S rDNA, ITS1 [Internal Transcribed Spacer], 5.8S rDNA, ITS2, 26S rDNA) from A. syriaca described above, as well as mapping to an ITS-spanning sequence from Co. odontorhiza from GenBank (accession number EU391326). Other regions were not assembled (e.g., 5S rDNA). Resulting reads that mapped to each reference were assembled de novo in Geneious, and resulting contigs were merged through overlap in Sequencher. This draft assembly was used for a second round of read mapping (original Co. odontorhiza reads) and de novo assembly in Geneious, resulting in a single consensus contig. The consensus contig was based on 50% majority per site, bearing in mind that the rDNA operon exists in numerous repeated copies that often differ from one another, appearing as polymorphism in read pileups. The Co. odontorhiza rDNA consensus operon was then used as a reference in read mapping for all other Corallorhiza species, followed by individual de novo assemblies as above.
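A per-site 50% majority consensus can be illustrated with the sketch below. It is not the Geneious implementation: it assumes reads are already aligned to equal length, that “.” marks positions a read does not cover, and that a base must exceed (not merely equal) the threshold to be called; sites without a majority are written as N. All names are hypothetical.

```python
from collections import Counter


def majority_consensus(aligned_reads, threshold=0.5, uncovered=".", ambiguous="N"):
    """Call the consensus base at each column of a read pileup.

    A base is called only if its frequency among covering reads is strictly
    greater than the threshold (an assumption about how ties are handled).
    """
    consensus = []
    for column in zip(*aligned_reads):
        bases = [b for b in column if b != uncovered]
        if not bases:
            consensus.append(ambiguous)
            continue
        base, count = Counter(bases).most_common(1)[0]
        consensus.append(base if count / len(bases) > threshold else ambiguous)
    return "".join(consensus)


# The third column is split 2:2 between G and T, so no base has a majority
print(majority_consensus(["ACGT", "ACGA", "ACTA", "AGTA"]))
```

With multicopy rDNA, columns such as the third one above are where differences among repeat copies would appear as polymorphism in the pileup.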

Nearly complete ribosomal DNA operons (partial ETS, 18S rDNA, ITS1, 5.8S, ITS2, 26S rDNA; excluding 5S rDNA) were annotated by first aligning finished contigs in Geneious (using the MUSCLE v3.8.31 plugin in Geneious; Edgar 2004) and transferring the annotation features from Co. odontorhiza to the aligned Corallorhiza sequences. Then, the entire alignment was annotated in SEQUIN for submission to GenBank by adding each feature to Co. odontorhiza and subsequently propagating the features to the remaining sequences. Reads were again mapped in Geneious to each finished, annotated rDNA cistron to check for misassemblies. GenBank accession numbers for rDNA partial operons are KM390003–KM390012. Available outgroup sequences for Cymbidium, Oncidium, and Phalaenopsis were mined from GenBank for the purposes of downstream phylogenetic analyses (18S rDNA: Cy. faberi, GenBank accession number JN418934; ITS: Cy. aloifolium, JF729014; Phalaenopsis aphrodite, AY391543; O. andradeanum, FJ565598).

Data Matrix Assembly and Sequence Alignment

Annotated plastomes were imported into Geneious, with subsequent removal of the second copy of the IR (IRa), and aligned with MAFFT v7.157 (Katoh et al. 2002) using the “fast-progressive” method (gap opening penalty = 6, offset = 0.1), followed by manual adjustment. The alignment also included complete plastomes of Cy. aloifolium (GenBank accession number NC_021429), Oncidium Gower Ramsey (GenBank accession number NC_014056), and Phalaenopsis aphrodite (GenBank accession number NC_007499). A short inversion of the petN-psbM region in Cymbidium was removed for that accession in downstream analyses. Annotated CDS were exported from Geneious, and data from each locus were saved as separate FASTA files. Each file was submitted for codon-based alignment to the MACSE web server (Ranwez et al. 2011; http://mbb.univ-montp2.fr/MBB/subsection/softExec.php?soft=macse, last accessed May 30, 2014), which allows alignment of both intact open reading frames and incomplete sequences (including pseudogenes with modified reading frames). Alignments were adjusted manually when necessary and imported back into Geneious or SequenceMatrix (Vaidya et al. 2011), where they were concatenated into larger matrices. Protein-coding genes that contained pseudogenes (shifted reading frames, internal stop codons) in at least one taxon were excluded. Data matrices are deposited in supplementary fig. S6, Supplementary Material online.

Phylogenetic Analyses

Phylogenetic analyses were conducted using both Parsimony and ML. Parsimony searches were conducted in TNT (Goloboff et al. 2003, 2008) with 100 random addition starting sequences, saving 100 trees per replicate, and using tree bisection–reconnection branch swapping on the pool of saved trees. Support was assessed through 2,000 Jackknife pseudoreplicates with the above search parameters. Model fit for ML analyses was assessed for the complete plastome alignment in MEGA v6 (Tamura et al. 2013), which yielded GTR+G as the best-fit model based on the corrected Akaike Information Criterion (AICc) metric (Burnham and Anderson 2002). All ML analyses were conducted in RAxML v8.0.20 (Stamatakis 2006). Three separate searches were initiated for each matrix analyzed from different random starting seeds, using an unpartitioned GTR+G model, to ensure convergence on the same topology. A codon-partitioned model was not chosen due to the possibility of relaxed functional constraints among some coding regions for some taxa, which could possibly represent model violations. Support was assessed with 1,000 standard bootstrap pseudoreplicates. Cymbidium was chosen as the outgroup taxon.

Plastid Pseudogenes, Genomic Deletions, and Genomic Attributes

The presence of pseudogenes and whole gene/regional deletions was initially assessed visually in DOGMA by identifying interrupted BLASTX reading frames. All coding loci aligned in MACSE (see above) were imported into Geneious, and sequences were translated to assess pseudogene status through the presence of frame shifts and/or premature stop codons. Loci with interrupted reading frames were considered to be pseudogenes unless there was an alternative start or stop codon within three codon positions of the other taxa in the alignment. In addition, the alignment of whole, annotated plastomes (described above) was used to visualize any larger genomic deletions spanning CDS boundaries, or rearrangements.
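The reading-frame criteria can be expressed as a small screening function. The sketch below is a simplification of the assessment described above: it works on a single unaligned coding sequence, treats a length that is not a multiple of three as a possible frame shift, uses the three standard stop codons, and does not implement the allowance for alternative start or stop codons within three codon positions. The function name and problem labels are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reading_frame_problems(cds):
    """List reasons a coding sequence may be a pseudogene (empty if none found).

    Alignment gaps ('-') are removed before the sequence is examined.
    """
    cds = cds.upper().replace("-", "")
    problems = []
    if len(cds) % 3 != 0:
        problems.append("frameshift")
    # Split into complete codons, ignoring any trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    # Any stop codon before the final codon interrupts the reading frame
    for position, codon in enumerate(codons[:-1], start=1):
        if codon in STOP_CODONS:
            problems.append(f"premature_stop@{position}")
    return problems


print(reading_frame_problems("ATGAAATAA"))     # intact reading frame
print(reading_frame_problems("ATGTAAAAATAA"))  # internal stop at codon 2
```

In practice the codon-aware alignment is what distinguishes a true frame shift from a length difference that is a multiple of three, so this check is only a first pass.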


Hypotheses of differences in plastid genomic attributes among nongreen versus green taxa, and correlations between chlorophyll content and genomic attributes, were assessed among taxa using two phylogeny-based analytical approaches. First, a Phylogenetic Analysis of Variance (Garland et al. 1993) was used to test for differences in continuous plastome variables (e.g., length, GC content) among green versus nongreen taxa (i.e., a discrete grouping variable), taking phylogenetic relationships into account. Analyses were run in the R package “geiger” (Harmon et al. 2008) with 10,000 simulations each using the “aov.phylo” function, based on an ultrametric tree generated through Nonparametric Rate Smoothing (Sanderson 1997). These analyses were run for a complete data set of 13 taxa (three leafy, green outgroup taxa and ten species of Corallorhiza). A second set of analyses was conducted on a subset of six taxa of Corallorhiza for which chlorophyll data were available, to mitigate problems with missing chlorophyll data. In these analyses, continuous genomic variables (length, number of putatively functional genes, GC content) were correlated with chlorophyll content (continuous) using Phylogenetically Independent Contrasts to account for phylogenetic relatedness among taxa (Felsenstein 1985). The latter analyses were conducted under a Brownian motion model with the PDAP package (Midford et al. 2005) in Mesquite v2.75 (Maddison WP and Maddison DR 2011).

Model-Based Analyses of Selective Regime

Changes in selective regime among species of Corallorhiza for codon-based matrices were assessed with the CODEML module in PAML v4.8 (Yang 1997, 2007) using “branch” models. Only loci for which all taxa had open reading frames were included in the analyses. Various matrices are listed in table 3. The objective was to test hypotheses of different ω ratios for 1) Corallorhiza versus leafy outgroups and 2) nongreen Corallorhiza versus green Corallorhiza + leafy outgroups. To accomplish this, different branch models were fit to the data using the topology resulting from the ML analysis of whole plastomes + rDNA. Six different branch models were tested. First, a model in which all branches evolve under one ω ratio (Model M0) was tested, where ω = dN/dS, the ratio of nonsynonymous substitutions per nonsynonymous site to synonymous substitutions per synonymous site. A two-ratio model (M1) was then run, in which all branches of Corallorhiza were allowed to have a different ω ratio (ω_cor) than the remaining leafy outgroup sequences (ω_leafy). A three-ratio model was then run (M2), allowing separate values for ω_leafy, ω_cor, and nongreen Corallorhiza (ω_str = ω_mac-ng; Co. striata, Co. mertensiana, Co. maculata var. maculata and occidentalis, respectively), followed by a four-ratio model (M3) with additionally separate ω ratios for Co. striata (ω_str) and the nongreen members of the Co. maculata complex (ω_mac-ng). The latter three models (M1–M3) specifically test hypotheses of whether Corallorhiza, and particularly nongreen Corallorhiza, display departures from purifying selection. Two additional models were run to test whether the nongreen Corallorhiza differ significantly from green Corallorhiza and leafy outgroup taxa. Specifically, Model M4 (two ratios) sets ω_leafy = ω_cor and ω_str = ω_mac-ng, whereas M5 (three ratios) allows ω_str and ω_mac-ng to differ. Successively nested models were compared based on log-likelihood ratio tests (M1 vs. M0, M2 vs. M1, M3 vs. M2, M4 vs. M0, M5 vs. M4), where the likelihood ratio test statistic was calculated as two times the difference in log-likelihood between the two models, with degrees of freedom calculated as the difference in the number of free parameters for the models, and tested against the χ² distribution. P values for multiple comparisons to the same data set were Bonferroni-corrected (Dunn 1959), as suggested by Yang (2007), with the corrected significance level corresponding to α/m, where m is the number of branches being tested. Genes ycf1 and ycf2 were analyzed separately because they have extraordinarily high substitution rates (e.g., Barrett et al. 2013), as was matK based on the finding of a pseudogenized copy in a previous study (Freudenstein and Senyo 2008).
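The likelihood ratio test with a Bonferroni-corrected significance level can be sketched using only the Python standard library. The upper-tail χ² probability is computed from closed-form expressions for integer degrees of freedom; the function names and the log-likelihood values in the example are hypothetical and are not results from this study.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: start from df = 1 and add one term per two extra df
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * term
        term *= half / (i + 0.5)
    return total


def likelihood_ratio_test(lnl_null, lnl_alt, df, n_tests=1, alpha=0.05):
    """Return (statistic, P value, significant at alpha / n_tests)."""
    statistic = 2.0 * (lnl_alt - lnl_null)
    p_value = chi2_sf(statistic, df)
    return statistic, p_value, p_value < alpha / n_tests


# Hypothetical log-likelihoods for two nested models differing by one parameter
print(likelihood_ratio_test(-1000.0, -995.0, df=1, n_tests=5))
```

For a comparison such as M1 versus M0, `df` would be 1 because the two-ratio model adds one free ω parameter.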

To further investigate the ω values among branches for the atp complex, two sets of branch-site models were tested against the data. In the first, all Corallorhiza were set as the foreground branch, and in the second, nongreen Corallorhiza (Co. striata var. vreelandii, Co. maculata var. maculata and occidentalis, and Co. mertensiana) were set as foreground branches. For both sets of branch-site models, two alternative models were tested. The “null” model specifies 0 < ω0 < 1 for a proportion of sites and ω1 = ω2 = 1 for the remaining sites (no positive selection), whereas the alternative model allows ω2 > 1 (positive selection on some sites). The alternative model (some sites under positive selection) was then tested against the null model (no sites under positive selection) using a χ² distribution as above, with 1 degree of freedom (i.e., the alternative and null models differ by one parameter).

Comparison with Other Parasitic Angiosperm Lineages

Plastid gene content (losses, pseudogenes, putatively functional genes) was compared among diverse lineages of angiosperms containing heterotrophs, including newly sequenced Corallorhiza. Taxa and genes were sorted by the numbers of putatively functional genes and gene functional class, respectively, to illustrate the different stages of plastid genome degradation. Specifically, genes were arranged according to the model of Barrett and Davis (2012), in which degradation of plastid functional gene classes is hypothesized to have occurred roughly in the following order: 1) NAD(P)H dehydrogenase genes (ndh); 2) “photosynthesis”-related genes, including pet, psa, psb, cemA, ccsA, rbcL, ycf3, and ycf4; 3) Plastid-encoded RNA Polymerase genes (rpo); 4) ATP synthase genes (atp); and finally 5) “housekeeping” genes (rpl, rps, rrn, trn, matK, clpP, infA, accD, ycf1, and ycf2). Though not a formal test in the strict sense, the use of this model does allow an assessment of how advanced the sequenced accessions of Corallorhiza are when compared with other heterotrophic angiosperms; it is hypothesized that Corallorhiza occupies the early stages of this model. Technically, atp genes function in the generation of ATP and movement of protons across the thylakoid membrane, and thus play a crucial role in photosynthesis, but this functional class was kept separate from photosynthesis genes due to the possibility that ATP synthase serves roles outside of photosynthesis.
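The arrangement of genes by functional class can be sketched as a lookup that follows the order given above. The gene lists are taken from the text; the function name, the numeric ranks, and the handling of unrecognized names are illustrative assumptions.

```python
# Genes assigned by full name (listed explicitly in the text)
_NAMED = {
    "cemA": 2, "ccsA": 2, "rbcL": 2, "ycf3": 2, "ycf4": 2,
    "matK": 5, "clpP": 5, "infA": 5, "accD": 5, "ycf1": 5, "ycf2": 5,
}
# Genes assigned by three-letter prefix
_PREFIX = {
    "ndh": 1, "pet": 2, "psa": 2, "psb": 2, "rpo": 3, "atp": 4,
    "rpl": 5, "rps": 5, "rrn": 5, "trn": 5,
}
_LABELS = {1: "ndh", 2: "photosynthesis", 3: "rpo", 4: "atp", 5: "housekeeping"}


def gene_class(gene):
    """Return (rank, label) for a plastid gene in the hypothesized loss order."""
    rank = _NAMED.get(gene, _PREFIX.get(gene[:3]))
    if rank is None:
        return 6, "unclassified"  # hypothetical fallback, not in the model
    return rank, _LABELS[rank]


genes = ["rps4", "atpA", "rbcL", "ndhF", "rpoB", "psbA"]
print(sorted(genes, key=lambda g: (gene_class(g)[0], g)))
```

Keeping atp as its own rank, rather than folding it into the photosynthesis class, mirrors the decision described above to treat ATP synthase genes separately.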

Supplementary Material

Supplementary figures S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by the National Science Foundation (awards DEB-0830020 to Jerrold I. Davis and DBI 1110443 to J.C.P.), the California State University Program for Education and Research in Biotechnology (CSUPERB) to C.F.B., the California State University, Los Angeles Center for Effective Teaching and Learning Faculty-Student Mentorship award to C.F.B. and J.L., and the California State University Council on Ocean Affairs, Science and Technology to C.S. The authors are grateful to Jerrold Davis, Patrick Edger, Ryan Ellingson, Gwynne Lim, Jeffrey Morawetz, and Chris Randle for feedback and laboratory/bioinformatics support; the staff at Cold Spring Harbor Laboratory (Woodbury, NY) for sample preparation and sequencing; Daniel Lima and Richard Sayre for assistance with chlorophyll spectroscopy and data interpretation; Gerardo Salazar (UNAM) for assistance in collecting Mexican specimens; and Maria Logacheva and three anonymous reviewers for critical insight that improved the paper.

ReferencesAllen JF de Paula WBM Puthiyaveetil S Nield J 2011 A structural

phylogenetic map for chloroplast photosynthesis Trends Plant Sci16645ndash655

Barbrook AC, Howe CJ, Purton S. 2006. Why are plastid genomes retained in non-photosynthetic organisms? Trends Plant Sci. 11:101–108.

Barrett CF, Davis JI. 2012. The plastid genome of the mycoheterotrophic Corallorhiza striata (Orchidaceae) is in the relatively early stages of degradation. Am J Bot. 99:1513–1523.

Barrett CF, Davis JI, Leebens-Mack J, Conran JG, Stevenson DW. 2013. Plastid genomes and deep relationships among the commelinid monocot angiosperms. Cladistics 29:65–87.

Barrett CF, Freudenstein JV. 2008. Molecular evolution of rbcL in the mycoheterotrophic coralroot orchids (Corallorhiza Gagnebin, Orchidaceae). Mol Phylogenet Evol. 47:665–679.

Bellino A, Alfani A, Selosse M-A, Guerrieri R, Borghetti M, Baldantoni D. 2014. Nutritional regulation in mixotrophic plants: new insights from Limodorum abortivum. Oecologia 175:875–885.

Bernardi G. 1989. The isochore organization of the human genome. Annu Rev Genet. 23:637–661.

Bidartondo MI. 2005. The evolutionary ecology of myco-heterotrophy. New Phytol. 167:335–352.

Blazier J, Guisinger MM, Jansen RK. 2011. Recent loss of plastid-encoded ndh genes within Erodium (Geraniaceae). Plant Mol Biol. 76:263–272.

Braukmann T, Kuzmina M, Stefanovic S. 2013. Plastid genome evolution across the genus Cuscuta (Convolvulaceae): two clades within subgenus Grammica exhibit extensive gene loss. J Exp Bot. 64:977–989.

Braukmann T, Stefanovic S. 2012. Plastid genome evolution in mycoheterotrophic Ericaceae. Plant Mol Biol. 79:5–20.

Braukmann TWA, Kuzmina M, Stefanovic S. 2009. Loss of all plastid ndh genes in Gnetales and conifers: extent and evolutionary significance for the seed plant phylogeny. Curr Genet. 55:323–337.

Burnham KP, Anderson DR. 2002. Avoiding pitfalls when using information-theoretic methods. J Wildl Manage. 66:912–918.

Cameron DD, Preiss K, Gebauer G, Read DJ. 2009. The chlorophyll-containing orchid Corallorhiza trifida derives little carbon through photosynthesis. New Phytol. 183:358–364.

Chang CC, Lin HC, Lin IP, Chow TY, Chen HH, Chen WH, Cheng CH, Lin CY, Liu SM, Chang CC, et al. 2006. The chloroplast genome of Phalaenopsis aphrodite (Orchidaceae): comparative analysis of evolutionary rate with that of grasses and its phylogenetic implications. Mol Biol Evol. 23:279–291.

Conant GC, Wolfe KH. 2008. GenomeVx: simple web-based creation of editable circular chromosome maps. Bioinformatics 24:861–862.

Cummings MP, Welschmeyer NA. 1998. Pigment composition of putatively achlorophyllous angiosperms. Plant Syst Evol. 210:105–111.

Delannoy E, Fujii S, Colas des Francs-Small C, Brundrett M, Small I. 2011. Rampant gene loss in the underground orchid Rhizanthella gardneri highlights evolutionary constraints on plastid genomes. Mol Biol Evol. 28:2077–2086.

Doyle JJ, Doyle JL. 1987. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem Bull. 19:11–15.

Dunn OJ. 1959. Confidence intervals for the means of dependent, normally distributed variables. J Am Stat Assoc. 54:613–621.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Evron Y, Johnson EA, McCarty RE. 2000. Regulation of proton flow and ATP synthesis in chloroplasts. J Bioenerg Biomembr. 32:501–506.

Fang YJ, Wu H, Zhang TW, Yang M, Yin YX, Pan LL, Yu XG, Zhang XW, Hu SNA, Al-Mssallem IS, et al. 2012. A complete sequence and transcriptomic analyses of date palm (Phoenix dactylifera L.) mitochondrial genome. PLoS One 7:e37164.

Felsenstein J. 1985. Phylogenies and the comparative method. Am Nat. 125:1–15.

Freudenstein JV. 1992. Systematics of Corallorhiza and the Corallorhizinae (Orchidaceae) [PhD dissertation]. [Ithaca (NY)]: Cornell University.

Freudenstein JV. 1997. A monograph of Corallorhiza (Orchidaceae). Harv Pap Bot. 10:5–51.

Freudenstein JV, Barrett CF. 2010. Mycoheterotrophy and diversity in Orchidaceae. In: Seberg O, Petersen G, Barfod A, Davis JI, editors. Diversity, phylogeny and evolution in the monocotyledons. The Proceedings of the Fourth International Conference on Monocot Systematics. Aarhus (Denmark): Aarhus University Press. p. 25–37.

Freudenstein JV, Doyle JJ. 1994a. Character transformation and relationships in Corallorhiza (Orchidaceae: Epidendroideae). 1. Plastid DNA. Am J Bot. 81:1449–1457.

Freudenstein JV, Doyle JJ. 1994b. Plastid DNA, morphological variation, and the phylogenetic species concept: the Corallorhiza maculata (Orchidaceae) complex. Syst Bot. 19:273–290.

Freudenstein JV, Senyo DM. 2008. Relationships and evolution of matK in a group of leafless orchids (Corallorhiza and Corallorhizinae; Orchidaceae: Epidendroideae). Am J Bot. 95:498–505.

Funk HT, Berg S, Krupinska K, Maier UG, Krause K. 2007. Complete DNA sequences of the plastid genomes of two parasitic flowering plant species, Cuscuta reflexa and Cuscuta gronovii. BMC Plant Biol. 7:45.

Galagan JE, Calvo SE, Borkovich KA, Selker EU, Read ND, Jaffe D, FitzHugh W, Ma LJ, Smirnov S, Purcell S, et al. 2003. The genome sequence of the filamentous fungus Neurospora crassa. Nature 422:859–868.

Garland T, Dickerman AW, Janis CM, Jones JA. 1993. Phylogenetic analysis of covariance by computer simulation. Syst Biol. 42:265–292.

Glemin S, Clement Y, David J, Ressayre A. 2014. GC content evolution in coding regions of angiosperm genomes: a unifying hypothesis. Trends Genet. 30:263–270.

Goloboff PA, Farris JS, Kallersjo M, Oxelman B, Ramirez MJ, Szumik CA. 2003. Improvements to resampling measures of group support. Cladistics 19:324–332.

Goloboff PA, Farris JS, Nixon KC. 2008. TNT, a free program for phylogenetic analysis. Cladistics 24:774–786.

Barrett et al. doi:10.1093/molbev/msu252 MBE

Hajdukiewicz PTJ, Allison LA, Maliga P. 1997. The two RNA polymerases encoded by the nuclear and the plastid compartments transcribe distinct groups of genes in tobacco plastids. EMBO J. 16:4041–4048.

Harmon LJ, Weir JT, Brock CD, Glor RE, Challenger W. 2008. GEIGER: investigating evolutionary radiations. Bioinformatics 24:129–131.

Iles WJD, Smith SY, Graham SW. 2013. A well-supported phylogenetic framework for the monocot order Alismatales reveals multiple losses of the plastid NADH dehydrogenase complex and a strong long-branch effect. In: Wilkin P, Mayo S, editors. Early events in monocot evolution. Cambridge: Cambridge University Press. p. 1–28.

Iorizzo M, Grzebelus D, Senalik D, Szklarczyk M, Spooner D, Simon P. 2012. Against the traffic: the first evidence for mitochondrial DNA transfer into the plastid genome. Mob Genet Elements. 2:261.

Jensen PE, Bassi R, Boekema EJ, Dekker JP, Jansson S, Leister D, Robinson C, Scheller HV. 2007. Structure, function and regulation of plant photosystem I. Biochim Biophys Acta. 1767:335–352.

Katinka MD, Duprat S, Cornillot E, Metenier G, Thomarat F, Prensier G, Barbe V, Peyretaillade E, Brottier P, Wincker P, et al. 2001. Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature 414:450–453.

Katoh K, Misawa K, Kuma Ki, Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30:3059–3066.

Kawakami K, Umena Y, Iwai M, Kawabata Y, Ikeuchi M, Kamiya N, Shen J-R. 2011. Roles of PsbI and PsbM in photosystem II dimer formation and stability studied by deletion mutagenesis and X-ray crystallography. Biochim Biophys Acta. 1807:319–325.

Knaus BJ. 2010. Short read toolbox. Available from: http://brianknaus.com

Krause K. 2008. From chloroplasts to “cryptic” plastids: evolution of plastid genomes in parasitic plants. Curr Genet. 54:111–121.

Krause K. 2012. Plastid genomes of parasitic plants: a trail of reductions and losses. In: Bullerwell C, editor. Organelle genetics. Berlin (Germany): Springer. p. 79–103.

Kuijt J. 1969. The biology of flowering parasitic plants. Berkeley (CA): University of California Press.

Lawrence JG. 2005. Common themes in the genome strategies of pathogens. Curr Opin Genet Dev. 15:584–588.

Leake JR. 1994. The biology of myco-heterotrophic (saprophytic) plants. New Phytol. 127:171–216.

Leake JR, Cameron DD. 2010. Physiological ecology of mycoheterotrophy. New Phytol. 185:601–605.

Liere K, Weihe A, Borner T. 2011. The transcription machineries of plant mitochondria and chloroplasts: composition, function, and regulation. J Plant Physiol. 168:1345–1360.

Logacheva MD, Schelkunov MI, Nuraliev MS, Samigullin TH, Penin AA. 2014. The plastid genome of mycoheterotrophic monocot Petrosavia stellaris exhibits both gene losses and multiple rearrangements. Genome Biol Evol. 6:238–246.

Logacheva MD, Schelkunov MI, Penin AA. 2011. Sequencing and analysis of plastid genome in mycoheterotrophic orchid Neottia nidus-avis. Genome Biol Evol. 3:1296–1303.

Maddison WP, Maddison DR. 2011. Mesquite: a modular system for evolutionary analysis, version 2.75. Available from: http://mesquiteproject.org

Martín M, Funk HT, Serrot PH, Poltnigg P, Sabater B. 2009. Functional characterization of the thylakoid ndh complex phosphorylation by site-directed mutations in the ndhF gene. Biochim Biophys Acta. 1787:920–928.

Martín M, Sabater B. 2010. Plastid ndh genes in plant evolution. Plant Physiol Biochem. 48:636–645.

McCarty RE, Evron Y, Johnson EA. 2000. The chloroplast ATP synthase: a rotary enzyme? Annu Rev Plant Physiol Plant Mol Biol. 51:83–109.

McNeal JR, Arumugunathan K, Kuehl JV, Boore JL, Depamphilis CW. 2007. Systematics and plastid genome evolution of the cryptically photosynthetic parasitic plant genus Cuscuta (Convolvulaceae). BMC Biol. 5:55.

McNeal JR, Bennett JR, Wolfe AD, Mathews S. 2013. Phylogeny and origins of holoparasitism in Orobanchaceae. Am J Bot. 100:971–983.

Merckx V, Freudenstein JV. 2010. Evolution of mycoheterotrophy in plants: a phylogenetic perspective. New Phytol. 185:605–609.

Midford PE, Garland T Jr, Maddison WP. 2005. PDAP package of Mesquite, version 1.07. Available from: http://mesquiteproject.org

Molina J, Hazzouri KM, Nickrent D, Geisler M, Meyer RS, Pentony MM, Flowers JM, Pelser P, Barcelona J, Inovejas SA, et al. 2014. Possible loss of the chloroplast genome in the parasitic flowering plant Rafflesia lagascae (Rafflesiaceae). Mol Biol Evol. 31:793–803.

Montfort C, Küsters E. 1940. Saprophytismus und Photosynthese. I. Biochemische physiologische Studien an Humus-Orchideen. Botanisches Archiv 40:571–633.

Moran NA. 2002. Microbial minimalism: genome reduction in bacterial pathogens. Cell 108:583–586.

Morrison HG, McArthur AG, Gillin FD, Aley SB, Adam RD, Olsen GJ, Best AA, Cande WZ, Chen F, Cipriano MJ, et al. 2007. Genomic minimalism in the early diverging intestinal parasite Giardia lamblia. Science 317:1921–1926.

Nickrent DL, Yan OY, Duff RJ, dePamphilis CW. 1997. Do nonasterid holoparasitic flowering plants have plastid genomes? Plant Mol Biol. 34:717–729.

Patel RK, Jain M. 2012. NGS QC Toolkit: a toolkit for quality control of next generation sequencing data. PLoS One 7:e30619.

Peredo EL, King UM, Les DH. 2013. The plastid genome of Najas flexilis: adaptation to submersed environments is accompanied by the complete loss of the NDH complex in an aquatic angiosperm. PLoS One 8:e68591.

Porra RJ. 2002. The chequered history of the development and use of simultaneous equations for the accurate determination of chlorophylls a and b. Photosyn Res. 73:149–156.

Protasio AV, Tsai IJ, Babbage A, Nichol S, Hunt M, Aslett MA, De Silva N, Velarde GS, Anderson TJC, Clark RC, et al. 2012. A systematically improved high quality genome and transcriptome of the human blood fluke Schistosoma mansoni. PLoS Negl Trop Dis. 6:e1455.

Randle CP, Wolfe AD. 2005. The evolution and expression of rbcL in holoparasitic sister-genera Harveya and Hyobanche (Orobanchaceae). Am J Bot. 92:1575–1585.

Ranwez V, Harispe S, Delsuc F, Douzery EJP. 2011. MACSE: Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons. PLoS One 6:e22594.

Sanderson MJ. 1997. A nonparametric approach to estimating divergence times in the absence of rate constancy. Mol Biol Evol. 14:1218–1231.

Smith DR, Lee RW. 2014. A plastid without a genome: evidence from the nonphotosynthetic green algal genus Polytomella. Plant Physiol. 164:1812–1819.

Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690.

Straub SCK, Cronn RC, Edwards C, Fishbein M, Liston A. 2013. Horizontal transfer of DNA from the mitochondrial to the plastid genome and its subsequent evolution in milkweeds (Apocynaceae). Genome Biol Evol. 5:1872–1885.

Straub SCK, Fishbein M, Livshultz T, Foster Z, Parks M, Weitemier K, Cronn RC, Liston A. 2011. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing. BMC Genomics 12:211.

Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. 2013. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol. 30:2725–2729.

Tsai IJ, Zarowiecki M, Holroyd N, Garciarrubio A, Sanchez-Flores A, Brooks KL, Tracey A, Bobes RJ, Fragoso G, Sciutto E, et al. 2013. The genomes of four tapeworm species reveal adaptations to parasitism. Nature 496:57–63.

Umate P, Schwenkert S, Karbat I, Dal Bosco C, Mlcochova L, Volz S, Zer H, Herrmann RG, Ohad I, Meurer J. 2007. Deletion of psbM in tobacco alters the Q(B) site properties and the electron flow within photosystem II. J Biol Chem. 282:9758–9767.

Vaidya G, Lohman DJ, Meier R. 2011. SequenceMatrix: concatenation software for the fast assembly of multi-gene datasets with character set and codon information. Cladistics 27:171–180.

Westwood JH, Yoder JI, Timko MP, dePamphilis CW. 2010. The evolution of parasitism in plants. Trends Plant Sci. 15:227–235.

Wicke S, Müller KF, de Pamphilis CW, Quandt D, Wickett NJ, Zhang Y, Renner SS, Schneeweiss GM. 2013. Mechanisms of functional and physical genome reduction in photosynthetic and nonphotosynthetic parasitic plants of the broomrape family. Plant Cell 25:3711–3725.

Wicke S, Schäferhoff B, dePamphilis CW, Müller KF. 2014. Disproportional plastome-wide increase of substitution rates and relaxed purifying selection in genes of carnivorous Lentibulariaceae. Mol Biol Evol. 31:529–545.

Wicke S, Schneeweiss GM, dePamphilis CW, Müller KF, Quandt D. 2011. The evolution of the plastid chromosome in land plants: gene content, gene order, gene function. Plant Mol Biol. 76:273–297.

Wickett NJ, Honaas LA, Wafula EK, Das M, Huang K, Wu BA, Landherr L, Timko MP, Yoder J, Westwood JH, et al. 2011. Transcriptomes of the parasitic plant family Orobanchaceae reveal surprising conservation of chlorophyll synthesis. Curr Biol. 21:2098–2104.

Wimpee CF, Wrobel RL, Garvin DK. 1991. A divergent plastid genome in Conopholis americana, an achlorophyllous parasitic plant. Plant Mol Biol. 17:161–166.

Wolfe AD, dePamphilis CW. 1998. The effect of relaxed functional constraints on the photosynthetic gene rbcL in photosynthetic and nonphotosynthetic parasitic plants. Mol Biol Evol. 15:1243–1258.

Wolfe KH, Morden CW, Palmer JD. 1992. Function and evolution of a minimal plastid genome from a nonphotosynthetic parasitic plant. Proc Natl Acad Sci U S A. 89:10648–10652.

Wu FH, Chan MT, Liao DC, Hsu CT, Lee YW, Daniell H, Duvall MR, Lin CS. 2010. Complete chloroplast genome of Oncidium Gower Ramsey and evaluation of molecular markers for identification and breeding in Oncidiinae. BMC Plant Biol. 10:68.

Wyman SK, Jansen RK, Boore JL. 2004. Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20:3252–3255.

Yang JB, Tang M, Li HT, Zhang ZR, Li DZ. 2013. Complete chloroplast genome of the genus Cymbidium: lights into the species identification, phylogenetic implications and population genetic analyses. BMC Evol Biol. 13:84.

Yang ZH. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci. 13:555–556.

Yang ZH. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 24:1586–1591.

Yoder OC, Turgeon BG. 2001. Fungal genomics and pathogenicity. Curr Opin Plant Biol. 4:315–321.

Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18:821–829.

Zimmer K, Meyer C, Gebauer G. 2008. The ectomycorrhizal specialist orchid Corallorhiza trifida is a partial myco-heterotroph. New Phytol. 178:395–400.


into account by way of Phylogenetic Analysis of Variance (P_PhyANOVA = 0.0146; see fig. 1). Moreover, these values are strikingly consistent across the nongreen species for which chlorophylls were measured, with very little variation within each species.

Genomic Data Sets and Plastomes

Characteristics of the Illumina data sets and resulting finished plastome assemblies are listed in tables 1 and 2. Mean coverage of plastomes for newly sequenced genomic data sets ranges from 81.0× (Co. wisteriana) to 271.3× (Co. odontorhiza). The length of the plastome among the Corallorhiza accessions sampled ranges from 151,506 bp in Co. maculata var. mexicana to 137,505 bp in Co. striata var. vreelandii (table 2). On average, plastomes of taxa with at least some visible green tissue are 148.74 kb in length, whereas those from taxa lacking visible green tissue average 144.98 kb, but the difference in mean total plastome length between these two groupings is nonsignificant after taking relationships into account (P_PhyANOVA = 0.06189).

The length of the inverted repeat (IR) ranges from 25,775 bp in Co. odontorhiza to 27,243 bp in Co. maculata var. mexicana, with no major variation in gene content nor evidence of any gene being incorporated into or lost from the IR. Corallorhiza striata var. vreelandii has by far the shortest large single copy (LSC; 72,152 bp). This is followed by Co. maculata var. maculata (80,401 bp), Co. mertensiana (81,146 bp), and Co. maculata var. occidentalis (81,363 bp). Co. macrantha (84,253 bp) and Co. maculata var. mexicana (84,349 bp) have the longest LSC regions. Thus, the range of length variation among members of the Co. maculata complex for the LSC region (3,852 bp) is substantial. Nongreen taxa have significantly shorter LSC regions after accounting for phylogenetic relationships (P_PhyANOVA = 0.0148). Unique to Co. maculata var. maculata is an approximately 16-kb inversion in the LSC region, with breakpoints occurring near ycf4 and within the petD intron (supplementary fig. S2, Supplementary Material online).

The length of the small single copy (SSC) also varies substantially, ranging from 11,743 bp in Co. wisteriana to 14,427 bp in Co. trifida, but there is no significant difference in length between the SSC regions of the plastome for green versus nongreen taxa (P_PhyANOVA = 0.7382), nor is there any correlation between SSC length and chlorophyll content (P_IC = 0.4093). GC content is lowest in Co. striata vreelandii (34.14%) and highest in Co. trifida (35.74%), but does not differ significantly among green versus nongreen taxa, nor

Table 1. Voucher Information, GenBank Accession Numbers, the Number of PE Illumina Reads (# reads), and Mean Coverage of Each Plastome.

Species                                                  Voucher                      GenBank^a   # reads      Mean cov. (×)^b
Corallorhiza bulbosa A. Rich. & Galeotti                 CFB 238^c (OS, MEXU)^d       KM390013    10,757,100   243.8
Co. macrantha Schltr.                                    Salazar 8191^c (OS, MEXU)    KM390017    9,222,076    136.4
Co. maculata (Raf.) Raf. var. maculata                   JVF 2919 (OS)                KM390014    11,498,452   109.1
Co. maculata var. mexicana (Lindl.) Freudenstein         CFB 232^c (OS, MEXU)         KM390015    7,828,614    118.2
Co. maculata var. occidentalis (Lindl.) Ames             JVF 2095 (OS)                KM390016    13,204,554   225.1
Co. mertensiana Bong.                                    JVF 1999 (OS)                KM390018    14,540,670   212.0
Co. odontorhiza (Willd.) Poir.                           JVF 2778 (OS)                KM390021    21,861,728   271.3
Co. striata L. var. vreelandii^e (Rydb.) L.O. Williams   Taylor 341 (UNM)             JX087681    3,204,074    43.9
Co. trifida Chatel.                                      JVF 2676 (OS)                KM390019    13,617,730   236.9
Co. wisteriana Conrad                                    JVF 2462 (OS)                KM390020    9,992,438    81.0

a GenBank accession numbers will be included contingent upon a favorable review.
b Mean depth of coverage of each plastome, based on remapping the original read pool to the finished plastome.
c Duplicate vouchers at Universidad Nacional Autonoma de Mexico (UNAM).
d Herbarium codes: (OS) Ohio State University Herbarium; (UNM) University of New Mexico Herbarium; (MEXU) Universidad Nacional Autonoma de Mexico.
e Sequenced and assembled in Barrett and Davis (2012).

[Figure 1: bar chart. Y-axis: total chlorophyll (ng/mg), 0 to 40. Bars (sample size): C. trifida (4), C. wisteriana (4), C. odontorhiza (4), C. mertensiana (4), C. mac. occidentalis (9), C. striata (11), C. bentleyi (5). Non-green taxa are marked as a group.]

FIG. 1. Total chlorophyll concentration (chlorophylls a and b combined, as measured by UV-Vis spectroscopy) averaged across multiple individuals in each taxon. Numbers adjacent to taxon names are sample sizes (# individuals). Error bars represent standard deviations.


does it correlate with chlorophyll content after accounting for phylogenetic relationships (P_PhyANOVA = 0.0648 and P_IC = 0.0989, respectively).

Structural Evolution of the Plastome: Pseudogenes and Losses

Figure 2 shows the maximum likelihood (ML) tree based on whole plastomes + rDNA (see below), with gene losses-of-function parsimoniously mapped, assuming no reversals. Nongreen taxa have significantly fewer putatively functional loci (as assessed by intact reading frames) than green Corallorhiza and leafy outgroups (P_PhyANOVA < 0.0001; table 2). In the subset of coralroot taxa from which chlorophyll data are available, chlorophyll concentration is also significantly correlated with the number of putatively functional genes (P_IC = 0.002).

Plastid genes encoding subunits of the NAD(P)H dehydrogenase complex (ndh genes) have mostly become pseudogenes in Corallorhiza, with some wholly or mostly deleted. Over 50% of the gene encoding Photosystem II (PSII) protein M (psbM), a small gene of 105 bp in length, has been deleted in all members of Corallorhiza except Co. striata vreelandii and Co. trifida; it has been completely deleted in Co. maculata vars. maculata and occidentalis. Among members of Corallorhiza displaying at least some green tissue, few other pseudogenes or gene losses were detected. The cemA locus is

[Figure 2: phylogenetic tree. Tip labels (green tissue status): C. striata (non-green), C. trifida (green), C. odontorhiza (green), C. wisteriana (green), C. bulbosa (green), C. macrantha (green), C. maculata mexicana (green), C. mertensiana (non-green), C. maculata maculata (non-green), C. maculata occidentalis (non-green). The C. maculata complex is bracketed, with Mexican and northern North American groups labeled; pseudogenes (Ψ), gene deletions, and a 16-kb inversion are mapped on branches.]

FIG. 2. Maximum-likelihood tree based on whole plastomes + partial rDNA cistron, with pseudogene (Ψ) content and deletions mapped parsimoniously. Deletions are marked either as partial gene deletions (> 6 bp) causing frame shift or truncation (< 50% of gene missing) or as complete or nearly complete gene deletions causing frame shift or truncation (> 50% of gene missing). psa, photosystem I; psb, PSII; pet, cytochrome; rpo, plastid-encoded polymerase; ndh, NAD(P)H dehydrogenase; ccsA, cytochrome C heme-binding protein A; cemA, chloroplast envelope membrane protein A; rbcL, RuBisCO large subunit; trn, transfer RNA; ycf3 and ycf4, putative photosystem assembly factors. Dark circles on branches indicate putative transitions to strict heterotrophy.

Table 2. Green Tissue Status, Lengths of Plastomes and Subregions, GC Content, and Numbers of Putatively Functional Plastid Genes for Each Accession.

Species                          Green Tissue^a  Length (bp)  LSC (bp)  SSC (bp)  IR (bp)  %GC    No. Functional Genes^c
Corallorhiza bulbosa             Yes             148,643      82,853    12,368    26,711   35.19  105
Co. macrantha                    Yes             151,031      84,253    12,554    27,112   35.21  104
Co. maculata var. maculata       No              146,886      80,401    12,885    26,800   34.94  89
Co. maculata var. mexicana       Yes             151,506      84,349    12,671    27,243   35.18  103
Co. maculata var. occidentalis   No              146,595      81,363    12,368    26,432   35.14  88
Co. mertensiana                  No              147,941      81,146    13,737    26,529   35.23  90
Co. odontorhiza                  Yes             147,317      82,259    13,508    25,775   35.62  103
Co. striata var. vreelandii^d    No              137,505      72,152    12,387    26,483   34.14  82
Co. trifida^b                    Yes             149,384      83,085    14,427    25,936   35.74  107
Co. wisteriana                   Yes             146,437      82,350    11,743    26,172   35.28  103

NOTE: %GC, G+C base percentage of each plastome.
a Presence of visible green tissue, based on observation of multiple populations of each species.
b Photosynthesis has only been demonstrated in Co. trifida.
c The number of putatively functional genes, as assessed by intact reading frames.
d Sequenced and assembled in Barrett and Davis (2012).
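The subregion lengths in table 2 can be cross-checked against the total plastome lengths, because a quadripartite plastome consists of one LSC, one SSC, and two copies of the IR. The following minimal Python sketch transcribes the table values and checks that arithmetic; the dictionary layout and the function name are illustrative assumptions, not part of the original analysis.

```python
# Values transcribed from table 2: (LSC, SSC, IR, total length), all in bp.
PLASTOMES = {
    "Co. bulbosa": (82853, 12368, 26711, 148643),
    "Co. macrantha": (84253, 12554, 27112, 151031),
    "Co. maculata var. maculata": (80401, 12885, 26800, 146886),
    "Co. maculata var. mexicana": (84349, 12671, 27243, 151506),
    "Co. maculata var. occidentalis": (81363, 12368, 26432, 146595),
    "Co. mertensiana": (81146, 13737, 26529, 147941),
    "Co. odontorhiza": (82259, 13508, 25775, 147317),
    "Co. striata var. vreelandii": (72152, 12387, 26483, 137505),
    "Co. trifida": (83085, 14427, 25936, 149384),
    "Co. wisteriana": (82350, 11743, 26172, 146437),
}


def quadripartite_length(lsc, ssc, ir):
    """Total length of a plastome with one LSC, one SSC, and two IR copies."""
    return lsc + ssc + 2 * ir


# Accessions whose reported total disagrees with the sum of the subregions.
mismatches = [
    name
    for name, (lsc, ssc, ir, total) in PLASTOMES.items()
    if quadripartite_length(lsc, ssc, ir) != total
]

# Accessions with the longest and shortest reported plastomes.
longest = max(PLASTOMES, key=lambda name: PLASTOMES[name][3])
shortest = min(PLASTOMES, key=lambda name: PLASTOMES[name][3])

print("mismatches:", mismatches)
print("longest:", longest, "| shortest:", shortest)
```

A check of this kind is cheap to run whenever table values are re-entered by hand, since a single mistyped digit in any subregion shows up as a mismatch.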


a pseudogene in all but Co. bulbosa, Co. macrantha, and Co. trifida, whereas psaI has experienced a 4-bp insertion in Co. macrantha near the 5′-end of the gene, causing a reading frame shift (fig. 2). Grep searches of the original reads confirmed this to be the case. No pseudogenes or major deletions causing shifts in reading frame were detected in atp genes in any species of Corallorhiza.
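A read-level check of this kind can be sketched as follows. This is a minimal illustration in Python, not the authors' actual grep command; the allele sequences and reads are hypothetical toy data, and the function names are assumptions introduced for the example. Because sequencing reads derive from both strands, the search covers each motif and its reverse complement.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def count_supporting_reads(reads, motif):
    """Count reads that contain the motif on either strand."""
    forward = motif.upper()
    reverse = reverse_complement(forward)
    return sum(
        1 for read in reads if forward in read.upper() or reverse in read.upper()
    )


# Hypothetical alleles: the second carries a 4-bp insertion ("GGCA").
reference_allele = "ATGACCGATTTAAC"
insertion_allele = "ATGACCGAGGCATTTAAC"

# Hypothetical reads standing in for the original Illumina read pool.
reads = [
    "CCATGACCGAGGCATTTAACGG",                      # insertion, forward strand
    reverse_complement("TTATGACCGAGGCATTTAACAA"),  # insertion, reverse strand
    "CCATGACCGATTTAACGG",                          # reference allele
    "GGGGCCCCAAAATTTT",                            # unrelated read
]

n_insertion = count_supporting_reads(reads, insertion_allele)
n_reference = count_supporting_reads(reads, reference_allele)
print("insertion-supporting reads:", n_insertion)
print("reference-supporting reads:", n_reference)
```

Comparing the two counts indicates whether the insertion is supported by the raw reads or is more likely an assembly artifact.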

As previously discussed in Barrett and Davis (2012), at least one member of all major photosynthesis-related gene systems in Co. striata vreelandii has experienced pseudogene formation or gene deletion, with the exception of genes encoding subunits of the ATP synthase complex (atp genes). Several loss-of-function mutations have occurred in the nongreen, northern North American members of the Co. maculata complex (Co. mertensiana, Co. maculata maculata, and Co. maculata occidentalis). These three taxa share a partial deletion in psbB, as well as numerous pseudogenes in the psa (Photosystem I), psb (PSII), and rpo (plastid-encoded RNA polymerase) complexes, evidenced by reading frame shifts and internal stop codons. They also all have pseudogenes for ccsA (cytochrome C biogenesis protein) and rbcL (RuBisCO large subunit).

Mutations unique to Co. mertensiana among the northern North American members of the Co. maculata complex include deletion of a large portion of psaA, pseudogenes for psbA and psbD, and a deletion in rpoC1. Corallorhiza maculata maculata and Co. maculata occidentalis both contain pseudogenes for petA (cytochrome f), petG (cytochrome b6f complex subunit V), and rpoB. Mutations unique to Co. maculata occidentalis include pseudogenes for psbH and petB, with deletions in ccsA and rpoC1. Mutations in Co. maculata maculata include the evolution of pseudogenes in psbA and deletions in rpoA and rpoC2. Although the petD gene in Co. maculata maculata appears to bear an intact reading frame, a breakpoint of a 16-kb inversion (fig. 2) separates the two exons of this gene, which in other taxa are adjacent and thus presumably cis-spliced.

Phylogenetic Analyses

Relationships based on ML analyses of various data matrices were generally consistent (supplementary figs. S3 and S4, Supplementary Material online). The main difference among ML topologies resulting from these data sets is the placement of Co. bulbosa. For plastid coding data, Co. bulbosa is sister to Co. odontorhiza + Co. wisteriana (bootstrap = 97), whereas for nuclear rDNA (partial ETS-18S-ITS1-5.8S-ITS2-26S) it is placed as sister to Co. macrantha + Co. maculata mexicana (bootstrap = 92). Whole plastomes, including introns and intergenic spacers, place Co. bulbosa as sister to the rest of the Co. maculata complex (bootstrap = 91). Nuclear rDNA sequences place Co. trifida as sister to the remaining Corallorhiza; however, the placement of Co. striata as sister to the remaining Corallorhiza (excluding Co. trifida) received low support (bootstrap = 69), suggesting little support from rDNA at the base of Corallorhiza.

Analyzing the entire plastome along with nuclear rDNA yields a completely resolved Corallorhiza tree, with 100% bootstrap support for all nodes (supplementary fig. S3, Supplementary Material online). Parsimony analysis of this data set yielded identical results to ML, including identical jackknife support values (100% for all nodes; tree not shown, see supplementary fig. S3, Supplementary Material online).

FIG. 3. A basic model of plastid genome degradation (Barrett and Davis 2012), using sequenced exemplar plastomes from angiosperm families containing heterotrophs: Convolvulaceae (Funk et al. 2007; McNeal et al. 2007), Orobanchaceae (Wolfe et al. 1992; Wicke et al. 2013), Orchidaceae (Chang et al. 2006; Wu et al. 2010; Delannoy et al. 2011; Barrett and Davis 2012; Logacheva et al. 2011; Yang et al. 2013), and Petrosaviaceae (Logacheva et al. 2014). Plastomes are ranked by numbers of putatively functional genes (highest to lowest). Black-filled spaces represent putatively functional genes with open reading frames, whereas gray-filled spaces represent pseudogenes and white spaces represent complete or nearly complete gene losses. “Status,” trophic strategy of each species: AU, nonparasitic autotroph; HemiP, hemiparasite; HP, holoparasite; PM, partial mycoheterotroph (analogous to hemiparasite); HM, holomycotroph. For the purposes of comparison, both leafy green orchids (Oncidium, Phalaenopsis) are considered nonparasitic autotrophs, even though they may still rely partially on their mycorrhizal fungi. L (bp), total length of each plastome in base pairs. Photosynthesis-related excludes atp genes, as these may serve functions outside of photosynthesis.


This topology was chosen for all downstream tree-based analyses. From the root of the tree (with Cymbidium as the outgroup), Oncidium and Phalaenopsis are successively sister to a monophyletic Corallorhiza, with Co. striata vreelandii and Co. trifida successively diverging from the ancestor of the remaining members of Corallorhiza. Nested within this clade, (Co. odontorhiza, Co. wisteriana) is sister to the Co. maculata complex. The Mexican Co. bulbosa is sister to all other members of the Co. maculata complex with 100% support, whereas within the latter, the Mexican Co. macrantha and Co. maculata mexicana are sister to one another, and this clade is collectively sister to a northern North American clade of (Co. mertensiana, (Co. maculata maculata, Co. maculata occidentalis)).

Analyses of Selective Regime

Various branch models were constructed to explicitly test the hypothesis of different ω-ratios for different branches, specifically for nongreen Corallorhiza. Only loci for which all taxa had intact reading frames were included. The most basic model (M0) is a “one-ratio” model, which assumes a single ω-ratio across all branches (here, ω_leafy = ω_cor = ω_mac-ng = ω_str; see table 3 for branch-based abbreviations of ω-ratios). Model M1, which allows all of Corallorhiza to have a different ω-ratio than the leafy outgroups (ω_leafy ≠ ω_cor = ω_mac-ng = ω_str), has a significantly better fit for atp genes (table 3). Also for atp genes, a three-ratio model (M2), allowing different ω-ratios in Corallorhiza and additionally for nongreen Corallorhiza as opposed to leafy outgroups, has a significantly better fit than the nested M1 model. Model M4, which is a two-ratio model specifying different ω-ratios for leafy outgroups + green Corallorhiza as opposed to nongreen Corallorhiza (ω_leafy = ω_cor ≠ ω_mac-ng = ω_str), is highly significant compared with the one-ratio model (M0) after Bonferroni correction. Photosynthesis-related genes (excluding atp genes) have a significantly better fit for models M2 (ω_leafy ≠ ω_cor ≠ ω_mac-ng = ω_str) and M4 (ω_leafy = ω_cor ≠ ω_mac-ng = ω_str) than M0, but these were nonsignificant after Bonferroni correction. Regardless, the branch-specific estimate (model M4) of ω in nongreen taxa (ω_mac-ng = ω_str = 0.73720) was more than twice that of green taxa (ω_leafy = ω_cor = 0.27361), approaching a pattern expected under selective neutrality for photosynthesis-related genes (ω ~ 1). “Housekeeping” genes show significantly better fit for M1 and M4, but these again were nonsignificant after Bonferroni correction for multiple comparisons. None of the branch models had a significantly better fit in nested comparisons for ycf1 + ycf2 or for matK.

Branch-site models were applied in the atp complex to test for evidence of positive selection. First, for a model in which all Corallorhiza were specified as a foreground clade, the alternative model, which allows some sites to be under positive selection (ω2 > 1), did not fit the data significantly better than the null model, which allows no sites to be under positive selection (ω1 = ω2 = 1; χ² = 0.6, df = 1, P = 0.4386). However, the alternative model had a significantly better fit than the null model when specifying nongreen Corallorhiza taxa as foreground branches, after Bonferroni correction for multiple comparisons (χ² = 11.4, df = 1, P = 0.0007), suggesting that some sites in the atp complex are under positive selection.
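The P values reported for these nested comparisons follow from a likelihood ratio test in which the test statistic is compared with a chi-square distribution with one degree of freedom. The short Python sketch below reproduces that step under the stated assumption of df = 1; the function names, the log-likelihood values in the usage lines, and the number of tests passed to the Bonferroni helper are hypothetical and are not taken from the study.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio statistic: twice the gain in log-likelihood."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_pvalue_df1(statistic):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom.

    For df = 1 the statistic is the square of a standard normal deviate,
    so the tail probability equals erfc(sqrt(statistic / 2)).
    """
    if statistic < 0:
        raise ValueError("the test statistic must be non-negative")
    return math.erfc(math.sqrt(statistic / 2.0))


def bonferroni_significant(p_value, n_tests, alpha=0.05):
    """True if p_value passes a Bonferroni-adjusted threshold of alpha / n_tests."""
    return p_value < alpha / n_tests


# The two test statistics reported above for the branch-site comparisons.
p_all_corallorhiza = chi2_pvalue_df1(0.6)
p_nongreen = chi2_pvalue_df1(11.4)
print(f"statistic 0.6  -> P = {p_all_corallorhiza:.4f}")
print(f"statistic 11.4 -> P = {p_nongreen:.4f}")

# Hypothetical log-likelihoods, only to show how a statistic would be derived.
example_statistic = lrt_statistic(lnl_null=-100.0, lnl_alt=-94.3)

# Hypothetical number of comparisons (the study's actual count is not given here).
print(bonferroni_significant(p_nongreen, n_tests=10))
```

Running the sketch returns values that round to the P = 0.4386 and P = 0.0007 reported above, which supports reading the two statistics as 0.6 and 11.4.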

Discussion

Chlorophyll Content of Corallorhiza

Some Corallorhiza and many heterotrophic plants are often referred to as “achlorophyllous” based on their lack of visible green coloration. However, all Corallorhiza included in this study contain at least some detectable levels of chlorophyll (fig. 1). Montfort and Küsters (1940) detected chlorophyll and photosynthetic activity in Co. trifida, which is not surprising given the predominantly green coloration of the above-ground organs of this species; a later study by Cummings and Welschmeyer (1998) detected some levels of chlorophyll in the nongreen Co. maculata, as well as in another nongreen, strictly heterotrophic orchid, Cephalanthera austiniae. Taxa with at least some visible green tissue on average had approximately a 10-fold higher chlorophyll concentration than nongreen taxa (fig. 1). Although leafless, Co. trifida is the greenest of the coralroots, with green pigmentation throughout nearly the entire above-ground portion of the plant. The distribution of green tissues among other partially green coralroot species is more variable, but the common theme is that at least some green tissue is noticeable in or around the ovary.

Coupled with the fact that green or partially green coralroots also show a general lack of degradation in photosynthesis-related plastid gene systems (with the exception of the ndh complex and a few other “minor” examples; see below), these observations suggest that the aforementioned “green” coralroot taxa may indeed all be partially heterotrophic, relying to some small degree on photosynthetic carbon. Though it has been demonstrated that Co. trifida is an inefficient photosynthesizer (i.e., net photosynthetic carbon assimilation cannot compensate for net respiration), there is likely to be an adaptive reason that photosynthesis persists in this species, and possibly in others such as Co. odontorhiza and Co. wisteriana. Ironically, Co. trifida is the one coralroot species that might be expected to depend most heavily on photosynthesis (Zimmer et al. 2008; Cameron et al. 2009). One possibility is that these species supplement fungal carbon uptake with photosynthetic carbon in specific tissues only, mainly in the ovary, inside of which hundreds or even thousands of minute “dust seeds” are produced. This has been experimentally demonstrated in the partially mycoheterotrophic orchid Limodorum abortivum, with the additional findings that photosynthesis is highest in ovary tissue and increases under experimentally induced fungal carbon limitation (Bellino et al. 2014). It is currently unknown if any temporal fungal carbon limitations occur throughout the short season when Corallorhiza displays aboveground growth (typically late spring for flowering to summer for seed/ovary development). Based on current knowledge, it is hypothesized that for vegetatively reduced orchids such as “green” Corallorhiza and Limodorum, compensation of carbon in and around developing ovaries seems to have some adaptive value, which in turn

Barrett et al. doi:10.1093/molbev/msu252 MBE

would cause purifying selection on the plastid- (and presumably nuclear-) encoded components of photosynthesis.

The more puzzling finding applies to nongreen taxa: nongreen coralroots had surprisingly consistent (albeit relatively minute) chlorophyll concentrations, with very little variation within or among species (fig. 1), despite having been sampled from different parts of their respective geographic ranges. Chlorophyll has been detected in other strictly heterotrophic plants (Cummings and Welschmeyer 1998), including members of Orobanchaceae and the monotropoid Ericaceae.

The question naturally arises: Why would putatively nonphotosynthetic taxa continue to produce chlorophyll? There are several hypotheses as to why this might be the case, including photoprotection/avoidance of light damage to DNA and other cellular components, involvement of the chlorophyll biosynthesis machinery in other biochemical pathways such as plastid-nuclear signaling, or prevention of the buildup of chlorophyll precursors that may lead to increased oxidative damage (discussed in Wickett et al. 2011 and references therein). In nearly all cases studied, strictly parasitic species

Table 3. Branch Models for Coding Regions of the Plastome in Corallorhiza for Which All Accessions Have Intact Reading Frames, Based on Various Configurations of Coding Data, Implemented in the CODEML Module of PAML.

Model | ω estimates | np (a) | ln L (b) | χ2 (c) | P/Pcorr (d)

atp genes
M0 | ωleafy = ωcor = ωmac-ng = ωstr = 0.21505 | 26 | -9802.76868 | |
M1 | ωleafy = 0.15175; ωcor = ωmac-ng = ωstr = 0.26949 | 27 | -9798.12794 | 9.281476 |
M2 | ωleafy = 0.15211; ωcor = 0.22238; ωmac-ng = ωstr = 0.38239 | 28 | -9795.68827 | 4.879344 | n.s.
M3 | ωleafy = 0.15212; ωcor = 0.22242; ωmac-ng = 0.41999; ωstr = 0.36243 | 29 | -9795.62404 | 0.128448 |
M4 | ωleafy = ωcor = 0.18380; ωmac-ng = ωstr = 0.38274 | 27 | -9797.32610 | 10.885142 |
M5 | ωleafy = ωcor = 0.18382; ωmac-ng = 0.41997; ωstr = 0.36292 | 28 | -9797.26308 | 0.126042 |

Photosynthesis genes (pet, psa/b, rbcL)
M0 | ωleafy = ωcor = ωmac-ng = ωstr = 0.33866 | 26 | -1854.62551 | |
M1 | ωleafy = 0.29790; ωcor = ωmac-ng = ωstr = 0.32741 | 27 | -1854.48900 | 0.273012 |
M2 | ωleafy = 0.29856; ωcor = 0.25277; ωmac-ng = ωstr = 0.73721 | 28 | -1852.32692 | 4.324170 | n.s.
M3 | ωleafy = 0.29858; ωcor = 0.25278; ωmac-ng = 0.59352; ωstr = 0.88556 | 29 | -1852.22042 | 0.213002 |
M4 | ωleafy = ωcor = 0.27361; ωmac-ng = ωstr = 0.73720 | 27 | -1852.39825 | 4.454520 | n.s.
M5 | ωleafy = ωcor = 0.27362; ωmac-ng = 0.59352; ωstr = 0.88550 | 28 | -1852.29177 | 0.212952 |

Housekeeping genes (rps/l, infA, accD, clpP, etc.)
M0 | ωleafy = ωcor = ωmac-ng = ωstr = 0.30179 | 26 | -25545.1250 | |
M1 | ωleafy = 0.29055; ωcor = ωmac-ng = ωstr = 0.32343 | 27 | -25542.7794 | 4.691148 | n.s.
M2 | ωleafy = 0.29066; ωcor = 0.31584; ωmac-ng = ωstr = 0.34192 | 28 | -25542.6470 | 0.264796 |
M3 | ωleafy = 0.29065; ωcor = 0.31583; ωmac-ng = 0.33612; ωstr = 0.34612 | 29 | -25542.6407 | 0.012646 |
M4 | ωleafy = ωcor = 0.30267; ωmac-ng = ωstr = 0.34230 | 27 | -25542.8978 | 4.454430 | n.s.
M5 | ωleafy = ωcor = 0.30269; ωmac-ng = 0.33612; ωstr = 0.34668 | 28 | -25542.8908 | 0.014052 |

ycf1 & 2
M0 | ωleafy = ωcor = ωmac-ng = ωstr = 0.66003 | 26 | -25384.8570 | |
M1 | ωleafy = 0.82314; ωcor = ωmac-ng = ωstr = 0.61213 | 27 | -25383.1494 | 3.415136 |
M2 | ωleafy = 0.82370; ωcor = 0.61073; ωmac-ng = ωstr = 0.61543 | 28 | -25383.1485 | 0.001854 |
M3 | ωleafy = 0.82338; ωcor = 0.61050; ωmac-ng = 0.59460; ωstr = 0.63131 | 29 | -25383.1215 | 0.053994 |
M4 | ωleafy = ωcor = 0.67331; ωmac-ng = ωstr = 0.61464 | 27 | -25384.6635 | 0.386918 |
M5 | ωleafy = ωcor = 0.67313; ωmac-ng = 0.59296; ωstr = 0.62959 | 28 | -25384.6382 | 0.050592 |

matK
M0 | ωleafy = ωcor = ωmac-ng = ωstr = 0.43481 | 26 | -3666.30553 | |
M1 | ωleafy = 0.44943; ωcor = ωmac-ng = ωstr = 0.42324 | 27 | -3666.27784 | 0.055394 |
M2 | ωleafy = 0.45060; ωcor = 0.38642; ωmac-ng = ωstr = 0.53159 | 28 | -3665.93065 | 0.694374 |
M3 | ωleafy = 0.45107; ωcor = 0.38419; ωmac-ng = 0.33356; ωstr = 0.99284 | 29 | -3664.63804 | 2.585212 |
M4 | ωleafy = ωcor = 0.41948; ωmac-ng = ωstr = 0.53022 | 27 | -3666.08306 | 0.444942 |
M5 | ωleafy = ωcor = 0.41857; ωmac-ng = 0.33352; ωstr = 0.98601 | 28 | -3664.80458 | 2.556974 |

NOTE. Abbreviations for ω-ratios: leafy = green, leafy outgroups (Cymbidium, Oncidium, Phalaenopsis); cor = all Corallorhiza; mac-ng = nongreen Co. maculata (i.e., Co. mertensiana, Co. maculata maculata, Co. maculata occidentalis); str = Co. striata. Successively nested models were compared based on log-likelihood ratio tests (M1 vs. M0, M2 vs. M1, M3 vs. M2, M4 vs. M0, M5 vs. M4); all comparisons had 1 degree of freedom.

*P < 0.05; **P < 0.01; ***P < 0.001; “n.s.” = not significant (where indicated); otherwise, blank space indicates no significance.
(a) The number of free parameters for a particular model.
(b) Log-likelihood of the data for a particular model.
(c) The chi-square distributed log-likelihood ratio test statistic used to evaluate significant differences in model fit.
(d) P values for uncorrected/Bonferroni-corrected χ2 tests, where Pcorr depends on the number of branch parameters in the model being tested.

Plastid Genome Degradation and Implications. doi:10.1093/molbev/msu252 MBE

(including the nongreen coralroots) that still synthesize detectable levels of chlorophyll do so at reduced levels. The fact that all nongreen coralroots contain relatively low chlorophyll concentrations means that these pigments are masked, and thus the overall coloration of these species tends to be red/purple or brown, most likely due to anthocyanins occurring at higher concentrations (Freudenstein and Doyle 1994a; personal observation). Occasionally, yellow variants occur among populations of the various coralroot species (Freudenstein 1997; Barrett CF, personal observation). These are presumably recurrent anthocyanin-deficient mutants (yellow coloration may be due to persistence of chlorophylls and other yellow/orange carotenoid pigments), though no specific studies have been carried out to demonstrate the causes of this phenomenon in Corallorhiza. Regardless, reduced chlorophyll concentrations and the resulting apparent lack of visible green coloration in Corallorhiza are excellent indicators of relaxed selective constraints on photosynthetic machinery, and this is likely to be the case for many other parasite-containing taxa.

Characteristics of Corallorhiza Plastomes

With perhaps the exception of Co. striata vreelandii, plastomes among coralroot taxa do not altogether display the major size reduction that would be expected for a group of leafless parasites, and thus can be said to occupy the early stages along the path to a highly reduced or “minimal” plastid genome (Barbrook et al. 2006; Krause 2008). In a previous study, it was hypothesized that members of the Co. striata complex would have the most reduced plastomes in the genus in terms of overall size (Barrett and Davis 2012), but even the plastome of Co. striata vreelandii (the representative sequenced in that study and included here) only represents an approximately 6% total genome size reduction relative to the leafy Phalaenopsis (Chang et al. 2006). In fact, Co. macrantha and Co. maculata mexicana (table 2) both have larger plastomes than two of the three leafy outgroup orchid taxa, Phalaenopsis (148,964 bp) and Oncidium (146,484 bp). Thus, the shift to leaflessness per se in Corallorhiza does not appear to be associated with a reduction in plastome size for the genus overall (at least for partially green taxa), although an examination of plastomes for the closest relatives of Corallorhiza (e.g., Oreorchis, Aplectrum, Cremastra, Govenia, Calypso) in tribe Calypsoeae will help clarify this.

A reduction in plastome size due to deletions and mutations resulting in pseudogenes is expected to be associated with a decrease in GC content, as once-functional (and relatively GC-rich) genes become riddled with loss-of-function mutations over time as a result of relaxed photosynthesis-associated selective pressures. These coding regions begin to resemble noncoding DNA, which typically has a lower GC content (e.g., Bernardi 1989; Glémin et al. 2014). This is apparently the case for Co. striata vreelandii (34.02% GC; table 2), which is 1.16% lower than the mean GC content for the remaining coralroot plastomes (35.28%). Mean GC content for the nongreen members of the Co. maculata complex is 35.10%, and no different than that found in the green coralroots. This suggests that large-scale differences in plastid GC content may take longer to manifest, and thus likely come about later in the process of plastid modification relative to other processes such as photosynthesis gene deletions and other initial gene mutations causing loss of function. This lends additional support to the notion that Co. striata vreelandii is farther “down the path” of plastid genome degradation relative to the nongreen members of the Co. maculata complex.
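To make the GC-content comparison concrete, the sketch below shows one common way such a percentage is computed from a nucleotide sequence. It is an illustration only: the function name and both toy sequences are hypothetical, the sequences are not taken from any Corallorhiza plastome, and the way ambiguous bases are handled (ignored) is an assumption rather than the procedure used in this study.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at  # ambiguous symbols such as N are ignored (assumption)
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


# Hypothetical 15-bp fragments: a GC-rich "intact" coding fragment, and the
# same-length fragment after several AT-biased substitutions.
toy_intact = "ATGGCCGGTACCGGA"
toy_degraded = "ATGACTAGTATCGAA"

print(f"intact:   {gc_percent(toy_intact):.2f}% GC")
print(f"degraded: {gc_percent(toy_degraded):.2f}% GC")
```

Applied to complete plastome sequences, the same calculation would give the kind of whole-genome percentages compared in the paragraph above.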

More broadly, the leafy orchids Phalaenopsis aphrodite and the hybrid Oncidium Gower Ramsey have GC contents of 35.6% and 37.32%, respectively (Chang et al. 2006; Wu et al. 2010); these values are very similar to what is found in Co. trifida and other green coralroots, and thus green coralroots do not seem to deviate much from other leafy orchids. Members of holo- and hemiparasitic Orobanchaceae (Wicke et al. 2013) represent an interesting clade for comparison with Corallorhiza in terms of GC content. Orobanchaceae range from 38.08% GC in the hemiparasitic Schwalbea to 31.09% in the holoparasitic Phelipanche purpurea. Variation in GC content among plastomes of Corallorhiza lies completely within the bounds of that observed in Orobanchaceae, which is hypothesized to be in the relatively more advanced stages of degradation.

Structural Evolution of the Plastome: Pseudogenes and Losses

ndh Genes

Degradation of the plastid-encoded ndh complex is advanced in Corallorhiza, as has been observed in other groups (e.g., Chang et al. 2006; Funk et al. 2007; McNeal et al. 2007; Braukmann et al. 2009; Blazier et al. 2011; Braukmann and Stefanović 2012; Braukmann et al. 2013; Iles et al. 2013; Peredo et al. 2013; Wicke et al. 2013), including some of the leafy, photosynthetic orchid plastomes included in figure 3. Interestingly, there is variation among the orchid plastomes sequenced to date with respect to the status of ndh genes (Chang et al. 2006; Wu et al. 2010; Delannoy et al. 2011; Barrett and Davis 2012; Logacheva et al. 2011; Yang et al. 2013), suggesting multiple independent degradation pathways in this complex in the orchid family.

The ancestor of Corallorhiza likely already experienced some degree of degradation in the ndh complex, as evidenced by several shared pseudogenes. Sequencing of additional orchid plastomes could reveal an evolutionary transition from a functional plastid-encoded ndh complex in leafy relatives (e.g., Aplectrum, Oreorchis, Cremastra, Govenia) to a largely degraded state, as observed in Corallorhiza. However, there is even some variation in the degree of degradation among certain members of ndh within Corallorhiza, which in part reflects the plastid phylogeny (fig. 2). This pattern includes a major deletion in ndhE in all taxa but Co. striata vreelandii; a shared pseudogene for ndhJ for all members of the Co. maculata complex besides Co. bulbosa (which here is sister to the remaining members of the Co. maculata complex; fig. 2); a loss of the same gene in (Co. odontorhiza, Co. wisteriana); and hypothesized parallel deletions of the ndhG

pseudogene in Co. bulbosa and the clade of (Co. macrantha, Co. maculata mexicana).

The plastid-encoded ndh complex is believed to be dispensable for a number of reasons (reviewed in Martín and Sabater 2010; Wicke et al. 2011, 2013), which may differ for various groups of plants displaying evidence of degradation in this gene complex. Generally, the ndh complex is believed to be essential for photosynthetic function in environments with highly variable light intensities, by functioning in regulation of cyclic electron transport and also in mitigating the effects of photo-oxidative damage under high light intensities (Martín et al. 2009; Martín and Sabater 2010). In the ancestor of Corallorhiza, selective constraints on the ndh complex may have been lifted by a combination of decreased dependence upon photosynthetic carbon as a result of a shift to heavy dependence upon fungal carbon (a similar situation is also observed in carnivorous plants using animals as a carbon source in the family Lentibulariaceae; Wicke et al. 2014) and a tendency to grow in dark, shaded, mature forests where light intensity is always low. This may also occur in other orchid lineages (including leafy, photosynthetic orchids) due to high dependence on fungal carbon as seedlings, and also the tendency to live in habitats with relatively low, or at least static, amounts of light intensity (e.g., a shaded rainforest understory or floor).

Photosynthesis-Related Genes (excluding ndh, rpo, and atp)

Nongreen members of Corallorhiza (that is, those with relatively reduced levels of chlorophyll; fig. 1) all display evidence of degradation of genes directly involved in photosynthetic processes (fig. 2; e.g., psa/b, pet, rbcL). Of these taxa, Co. striata vreelandii displays the highest number of degraded genes (table 2; figs. 2 and 3), as reported in Barrett and Davis (2012), followed by northern North American members of the Co. maculata complex (Co. mertensiana, Co. maculata maculata, Co. maculata occidentalis). Both of these complexes (Co. striata, Co. maculata) have independently undergone degradation in the following genes/gene complexes (omitting ndh; see above): pet, psa/b, rpo, ccsA, cemA, and rbcL. It is highly unlikely that transcript editing of any kind can compensate for these changes, and it can be assumed that these changes are irreversible. The possibility exists, however slight, that all deleted genes or pseudogenes among the plastomes of these taxa are redundant and that functional copies are encoded in the nuclear genome, or that genes have been “regained” through intergenomic or horizontal transfer (e.g., Iorizzo et al. 2012; Straub et al. 2013; Wicke et al. 2013).

Based on the degraded states of photosynthesis-related genes among these plastomes, it can be concluded that the aforementioned nongreen taxa are strict parasites of fungi, incapable of photosynthesis, and thus the terms “holomycotroph” or “full mycoheterotroph” are indeed appropriate. No studies of photosynthetic activity have been carried out in Corallorhiza, aside from those in the “greenest” coralroot, Co. trifida (Zimmer et al. 2008; Cameron et al. 2009). It will be beneficial to measure photosynthetic rates (or determine the lack thereof), along with plastid-gene transcript levels, among multiple individuals of both green and nongreen coralroot taxa, preferably growing in close proximity to one another and in similar habitats, to definitively characterize photosynthetic capacity in the genus.

Though green members of Corallorhiza do not display significant degradation of photosynthesis-related genes (psa, psb, pet, rbcL, etc.), there are a few important exceptions, which may or may not affect the efficacy of photosynthesis in these taxa. First, members of Corallorhiza excluding Co. trifida and Co. striata have experienced a large deletion in psbM (PSII protein M). This low molecular weight peptide is believed to play a role in maintaining the stability of PSII dimers; psbM-deficient tobacco mutants displayed impaired stability of PSII but were still able to function (Umate et al. 2007; Kawakami et al. 2011). Photosynthesis has only been explicitly measured and demonstrated in Co. trifida (Zimmer et al. 2008; Cameron et al. 2009), which has a putatively functional copy of psbM. Thus, it is possible that either 1) other “green” coralroots (Co. odontorhiza, Co. wisteriana, Co. bulbosa, Co. macrantha, Co. maculata mexicana) carry out inhibited photosynthesis; 2) this gene has been transferred to the nucleus or mitochondrion, and its product is imported into the plastid, allowing stabilization of the PSII dimer complex and some level of photosynthetic function; or 3) a nonfunctional psbM gene has no effect on photosynthetic function.

Another partially green coralroot, Co. macrantha (restricted to high elevation forests in southern Mexico), has a pseudogenized copy of psaI (Photosystem I Reaction Center Subunit VIII) due to a 4-bp insertion near the 5′-end of the gene relative to other coralroots, causing a reading frame shift. The psaI protein is believed to function in the stabilization of a nuclear-encoded subunit of the same complex (psaL) in Arabidopsis thaliana, and has also been shown to bind to the Light-harvesting Complex II (Jensen et al. 2007); it was suggested that the latter putative function is redundant. In Corallorhiza, this might explain why psaI in Co. macrantha has apparently become a pseudogene. Thus, among the green coralroots, Co. macrantha may represent yet another very early step in the transition to the loss of photosynthesis, or at least a reduction in photosynthetic efficacy, but this remains a hypothesis to be tested.
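The effect of a 4-bp insertion on a reading frame can be illustrated with a toy example. The sketch below is hypothetical throughout: the 27-bp open reading frame, the inserted bases, and the insertion position are invented for illustration and are not the psaI sequence of Co. macrantha. It shows only the general principle that an insertion whose length is not a multiple of three shifts every downstream codon and can expose a premature in-frame stop codon.

```python
from typing import List, Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq: str) -> List[str]:
    """Split seq into complete codons, reading in frame from position 0."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def first_stop(seq: str) -> Optional[int]:
    """0-based codon index of the first in-frame stop codon, or None."""
    for index, codon in enumerate(codons(seq)):
        if codon in STOP_CODONS:
            return index
    return None


def shifts_frame(indel_length: int) -> bool:
    """An insertion or deletion shifts the frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0


# Hypothetical intact reading frame: ATG GCT GAA ACT TGG CTG CGT AAA TAA
toy_orf = "ATGGCTGAAACTTGGCTGCGTAAATAA"
# Hypothetical 4-bp insertion placed directly after the start codon.
toy_mutant = toy_orf[:3] + "TTAC" + toy_orf[3:]

print("frame shifted by 4-bp insertion:", shifts_frame(4))
print("first stop codon, intact:", first_stop(toy_orf))
print("first stop codon, mutant:", first_stop(toy_mutant))
```

In this toy case the stop codon moves from the end of the reading frame to the fourth codon, which is the kind of change that would render a gene a putative pseudogene.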

rpo Genes

There is evidence for rpo pseudogenes in both the nongreen Co. striata and Co. maculata complexes, based on large deletions, reading frame shifts, and internal stop codons. Plastid genes are transcribed by two polymerases in monocots: a nuclear-encoded polymerase and a plastid-encoded polymerase (Liere et al. 2011). The plastid-encoded RNA polymerase (PEP) is believed to preferentially transcribe photosynthesis-related genes, whereas the rpoB-C1-C2 operon itself is transcribed by a nuclear-encoded RNA polymerase (Hajdukiewicz et al. 1997; reviewed in Liere et al. 2011; Krause 2012). In addition to transcribing photosynthesis-related genes for the most part, PEP also tends to be more active in later leaf developmental stages (Hajdukiewicz et al. 1997; reviewed in Liere et al. 2011).

Based on this latter finding, it might be expected that all members of Corallorhiza should show some degradation in rpo genes, as leaves do not fully develop in any species within this genus (i.e., only the basal leaf sheath persists, but the blade or lamina never develops), presumably obviating the role of PEP in transcription of photosynthesis-related genes (e.g., psa, psb) in leaves. However, green coralroots have intact reading frames for all rpo genes, whereas nongreen coralroots show varying degrees of degradation in the rpo complex. As with photosynthesis-related genes in Corallorhiza (figs. 2 and 3), it does not appear to be the transition to leaflessness per se that has resulted in degradation of the rpo complex, but more likely the shift from partial heterotrophy (and the associated need to carry out photosynthesis, in whatever small amount) to strict heterotrophy. An interesting parallel occurs in the carnivorous family Lentibulariaceae, in which rpo genes display significant departures from purifying selection (Wicke et al. 2014). An interesting follow-up would be to determine whether degradation first begins among photosystem genes (psa, psb), diminishing the importance of having a functional PEP complex, or whether the opposite is the case, in which a dysfunctional (or nonexistent) PEP has contributed to relaxed selective pressures associated with photosynthesis-related genes (i.e., if psa and psb genes can no longer be transcribed, or are transcribed by an inefficient PEP, then is there any benefit to maintaining them?). Regardless, the degradation of rpo genes in nongreen Corallorhiza closely mirrors that in genes directly involved in photosynthesis (psa/b, rbcL, etc.).

Relationships in Corallorhiza

Complete plastome sequences of Corallorhiza yielded a highly resolved, highly supported tree consistent with previous studies (Freudenstein and Doyle 1994a, 1994b; Barrett and Freudenstein 2008; Freudenstein and Senyo 2008). Here, for the first time, relationships among the six taxa that comprise the Co. maculata complex (as recognized in Freudenstein 1997) are fully resolved, with high support for Co. bulbosa as sister to the rest of the complex (bootstrap = 91), whereas previous studies could not resolve relationships in this complex (Barrett and Freudenstein 2008; Freudenstein and Senyo 2008). Complete plastomes also confirm the sister relationship of Co. striata to the rest of Corallorhiza, with Co. trifida (moving away from the base of the tree), then Co. odontorhiza and Co. wisteriana diverging successively. Depending on the data set analyzed, Co. bulbosa appears in a few different places, most likely due to homoplasy and few variable characters available to resolve the position of this species in smaller data sets, as evidenced by short internal branches (supplementary fig. S3, Supplementary Material online). Thus, inclusion of complete plastomes (pseudogenes, spacers, and introns) resolves the otherwise uncertain position of this species. Nuclear rDNA, on the other hand, places Co. bulbosa as sister to (Co. macrantha, Co. maculata mexicana) with moderate support, a similar finding to the combined analysis of Barrett and Freudenstein (2008) based on plastid rbcL + nuclear ITS, though that study did not include Co. macrantha. Combining all nuclear and plastid data places Co. bulbosa as sister to the remaining members of the Co. maculata complex with 100% bootstrap support (supplementary fig. S3, Supplementary Material online). The complete plastome + rDNA data set fully resolves relationships among the major species complexes of Corallorhiza (excluding members not sampled from some complexes).

In particular, relationships within the Co. maculata complex shed light on a phylogeographically intriguing aspect of plastome degradation. All Mexican members of the Co. maculata complex have at least some visible green tissue, and are presumed to be partially mycoheterotrophic based on conservation of the plastid-encoded photosynthetic apparatus (for the most part; see fig. 2). Those members of the Co. maculata complex that occur in northern North America have no visible green tissue (two of these species were shown to have highly reduced chlorophyll levels; fig. 1) and display evidence of degradation in these genes. The topology of the Co. maculata complex based on combined plastome + rDNA data (supplementary fig. S3, Supplementary Material online; fig. 2) lends support to a previous hypothesis that this complex (and possibly the genus Corallorhiza itself) originated in southern Mexico (Freudenstein and Doyle 1994a, 1994b; Barrett and Freudenstein 2008; Freudenstein and Senyo 2008) and expanded into northern North America, with the ancestor of Co. mertensiana, Co. maculata maculata, and Co. maculata occidentalis undergoing a shift from partial to strict heterotrophy (and continuing this trend among extant lineages), and thus presumably losing the ability to carry out photosynthesis. This suggests that deeper sampling among individuals in members of both the Co. maculata complex and the Co. striata complex (which is here represented only by Co. striata vreelandii) will yield a much clearer picture of plastome degradation in Corallorhiza on a phylogeographic scale.

Selective Regime among Genes with Open Reading Frames

Model M4, allowing green and nongreen taxa to have different ω-ratios (ωleafy = ωcor, ωmac-ng = ωstr), fit the data significantly better than the one-ratio model for both housekeeping genes and photosynthesis-related genes, suggesting different selective scenarios for nongreen taxa relative to green taxa, but these comparisons are nonsignificant when corrected for multiple comparisons (table 3). For housekeeping genes, ω-values for green versus nongreen taxa did not differ by much (ωleafy = ωcor = 0.30267; ωmac-ng = ωstr = 0.34230; table 3). However, this difference is much more pronounced for photosynthesis-related genes (ωleafy = ωcor = 0.27361; ωmac-ng = ωstr = 0.73720), with the estimate for nongreen taxa being approximately 2.6 times that of green taxa, and approaching a value expected under neutral evolution (ω ~ 1). Thus, it is evident that even though some photosynthesis genes in nongreen taxa still have intact reading frames, there may be a relaxation of selective pressure associated with these genes.

For atp genes, Models M1 and M4 fit the data significantly better than does M0, even after correcting for multiple comparisons, suggesting different selective regimes in Corallorhiza

versus leafy, green outgroups, or at least in nongreen Corallorhiza versus green Corallorhiza and leafy outgroups. For example, in model M1, the ω value for Corallorhiza is nearly twice that for leafy orchids (ωleafy = 0.15175; ωcor = ωmac-ng = ωstr = 0.26949), and in model M4, nongreen Corallorhiza have ω values over twice those for green Corallorhiza and leafy orchids. Further investigations using branch-site models, allowing some sites in the “foreground” branches to be under positive selection, suggest evidence for positive selection in nongreen Corallorhiza for the atp complex.
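The nested-model comparisons for the atp genes can be reproduced from the log-likelihoods listed in table 3. The sketch below is a minimal illustration, assuming the standard likelihood ratio statistic (twice the difference in log-likelihood) and the conventional chi-square critical values for 1 degree of freedom; the function names are illustrative, and the significance labels are uncorrected, so no Bonferroni adjustment is applied.

```python
# Log-likelihoods for the atp gene set (table 3): one-ratio model M0 and the
# two-ratio models M1 and M4, each of which adds one free parameter to M0.
LNL_ATP = {"M0": -9802.76868, "M1": -9798.12794, "M4": -9797.32610}

# Chi-square critical values for 1 degree of freedom, largest first.
CRITICAL_DF1 = [(10.828, "***"), (6.635, "**"), (3.841, "*")]


def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Likelihood ratio test statistic for two nested models."""
    return 2.0 * (lnl_alt - lnl_null)


def significance_label(stat: float) -> str:
    """Uncorrected label: * P < 0.05, ** P < 0.01, *** P < 0.001, else empty."""
    for threshold, label in CRITICAL_DF1:
        if stat >= threshold:
            return label
    return ""


for alt in ("M1", "M4"):
    stat = lrt_statistic(LNL_ATP["M0"], LNL_ATP[alt])
    print(f"{alt} vs. M0: statistic = {stat:.4f} {significance_label(stat)}")
```

The statistics obtained this way agree with the χ2 column of table 3 to within rounding of the listed log-likelihoods.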

The plastid-encoded atp genes play a critical role in photosynthesis by both generating ATP and translocating protons across the thylakoid membrane (Evron et al. 2000; McCarty et al. 2000; Allen et al. 2011). However, in taxa that display evidence of loss-of-function for photosystem and other photosynthesis-related genes, the fact that atp genes are preserved as intact reading frames suggests that they may play additional roles in plastid function (discussed in Wicke et al. 2013). This is bolstered by the fact that this complex shows evidence of positive selection in strictly heterotrophic, nongreen taxa, but not for Corallorhiza as a whole. This would support the hypothesis that the atp complex has either taken on additional, as-of-yet unknown roles in holomycotrophic plastids, or at a minimum has fallen under increased selective pressure to adapt to the overall “alternative lifestyle” associated with strict fungal heterotrophy in nonphotosynthetic plastids (including in amyloplasts, chromoplasts, elaioplasts, proteinoplasts, etc.). A future research objective is to examine the potential biological roles and molecular evolutionary patterns of the atp complex in nonphotosynthetic plants and their green, photosynthetic relatives, as members of this gene complex are often preserved and putatively functional among various heterotrophic angiosperms (see fig. 3 and references therein, including discussion in Wicke et al. 2013).

Comparison of Corallorhiza with Other Parasitic Angiosperms Using the Model of Barrett and Davis (2012)

A comparison of the plastomes of several heterotrophic taxa that have been sequenced to date shows that Corallorhiza is not quite in the “advanced” stages of plastid genome degradation relative to several taxa in Orobanchaceae (e.g., Wicke et al. 2013) or other orchids such as Neottia (Logacheva et al. 2012) and Rhizanthella (Delannoy et al. 2011) (fig. 3). Interpreted in the context of the basic model presented in Barrett and Davis (2012), members of Corallorhiza fall into categories 1 and 3. Taxa with some visible green tissue display pseudogenes/deletions mainly in the ndh complex (category 1; gene-class categories are listed at the top of fig. 3), with a few additional changes, whereas those with no green tissue have experienced pseudogenization and deletions in ndh, photosynthesis-related, and rpo genes, but not in atp genes or other “housekeeping” genes (category 3; excepting the loss of a redundant trnT-GGU in Co. striata vreelandii; Barrett and Davis 2012; figs. 2 and 3). Thus, nonparasitic autotrophs and partial heterotrophs roughly fall either into category 0 (no degradation of plastid genes) or category 1 (degradation of the ndh complex only). More specifically, all partial heterotrophs fall into category 1, with evidence of degradation of the ndh complex.

Strict heterotrophs fall into more “advanced” categories. Several, but not all, members of Orobanchaceae and two orchids (Neottia and Rhizanthella) can be classified in category 5, in which there is evidence of degradation of atp genes and at least some evidence of degradation in “housekeeping” genes (e.g., rpl, rps, trn, etc.). There are some genes for which no taxa have experienced pseudogenization or loss, including rpl2, rpl16, rpl20, rps2, rps4, rps8, rps11, ycf1, ycf2, trnD-guc, trnE-uuc, trnI-cau, trnM-cau, trnQ-uug, trnY-gua, rrn16, rrn23, rrn4.5, and rrn5. This suggests that they may be indispensable for plastid function, at least in the angiosperms included in this comparison, and that a “minimal set” of genes might be necessary for functional plastid “housekeeping” and perhaps other functions outside of photosynthesis and ATP production.

Overall, no taxa fell in categories 2 or 4, suggesting that degradation of the rpo genes occurs more or less simultaneously with degradation in photosynthesis-related genes, and that degradation of atp genes occurs along with that in “housekeeping” genes (also discussed in Wicke et al. 2013). Thus, the model of Barrett and Davis (2012) could be modified in light of more recently sequenced plastomes, including those of Corallorhiza, having five general categories: 1) no degradation; 2) degradation in the ndh complex only; 3) degradation of the ndh complex and photosynthesis-related genes, including rpo genes; 4) degradation in all plastid gene systems, including “housekeeping” and atp genes; and 5) complete or nearly complete loss of the plastid genome (as is hypothesized to have occurred in some strict heterotrophs such as Corynaea and Rafflesia [Nickrent et al. 1997; Molina et al. 2014]).
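The five categories of the modified model can be read as a simple decision rule. The sketch below is one possible encoding, offered only as an illustration: the function name, the gene-system labels, and the choice to return no category for patterns that do not fit the stepwise sequence are all assumptions, and the rule is deliberately coarse (it ignores isolated pseudogenes such as those noted in otherwise intact green coralroots).

```python
from typing import Iterable, Optional

GENE_SYSTEMS = {"ndh", "photosynthesis", "rpo", "atp", "housekeeping"}


def degradation_category(degraded: Iterable[str],
                         plastome_present: bool = True) -> Optional[int]:
    """Assign a plastome to categories 1-5 of the modified model.

    `degraded` lists the gene systems showing pseudogenes or losses.
    Returns None when the pattern does not match the stepwise sequence.
    """
    systems = frozenset(degraded)
    unknown = systems - GENE_SYSTEMS
    if unknown:
        raise ValueError(f"unrecognized gene systems: {sorted(unknown)}")
    if not plastome_present:
        return 5  # complete or nearly complete loss of the plastid genome
    if not systems:
        return 1  # no degradation
    if systems == {"ndh"}:
        return 2  # ndh complex only
    core = {"ndh", "photosynthesis"}
    late = systems & {"atp", "housekeeping"}
    if core <= systems and not late:
        return 3  # ndh plus photosynthesis-related genes (rpo included here)
    if core <= systems and late == {"atp", "housekeeping"}:
        return 4  # all plastid gene systems
    return None  # pattern deviates from the model


print(degradation_category(["ndh"]))
print(degradation_category(["ndh", "photosynthesis", "rpo"]))
print(degradation_category(["rpo", "housekeeping"]))
```

Under this encoding, a pattern with degraded rpo and housekeeping genes but retained photosynthesis genes receives no category, which mirrors how some lineages may deviate from a strictly stepwise model.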

There is some degree of deviation from the Barrett and Davis (2012) model, as might be expected given the potential for somewhat idiosyncratic patterns of plastome evolution in unrelated groups. The obvious exception to this model would be the sequenced members of the genus Cuscuta in the family Convolvulaceae. One of the best-studied parasitic systems in terms of plastome evolution, Cuscuta contains members with “extreme” plastome reduction relative to closely related leafy species (approximately 50% total size reduction in Cuscuta gronovii and Cuscuta obtusiflora relative to the leafy Ipomoea) (fig. 3 and references therein). Yet both highly reduced species of Cuscuta retain most, if not all, of the genes for photosynthesis, but have degraded RNA polymerase complexes and also show evidence of pseudogenes/deletions of some housekeeping genes. A further survey based on probe hybridization in Cuscuta, with greatly increased taxon sampling, by Braukmann et al. (2013) suggests that this genus encapsulates multiple transitions to highly degraded plastid genomes, with members of subgenus Grammica displaying even more advanced stages of degradation. These taxa were not included in figure 3, as they are based on hybridization and have not yet been sequenced to generate complete

3105

Plastid Genome Degradation and Implications doi101093molbevmsu252 MBE by guest on July 7 2016

httpmbeoxfordjournalsorg

Dow

nloaded from

plastomes Recycling of respiratory CO2 has been offered as anexplanation for the retention of putatively functional genes inthe psa psb and pet complexes (McNeal et al 2007) under-scoring the importance of studying the physiology of strictheterotrophs on a taxon-by-taxon basis and also highlightinghow little is known about the basic physiology and functionalgenomics of these plants

Conclusion

All members of Corallorhiza produce detectable levels of chlorophyll based on the current sample, but nongreen, strictly heterotrophic members have on average tenfold lower concentrations relative to green, partially heterotrophic members. Genomic data allowed assembly of complete plastomes as well as partial rDNA operons, which when combined fully resolve relationships among the major species complexes in Corallorhiza. Based on complete plastomes, Corallorhiza can be said to span the “early” stages of plastid genome degradation as a result of relaxed selective pressures on photosynthetic function. Corallorhiza is useful in providing a model system to understand the earliest stages in the transition to a “minimal plastome,” as is observed in other heterotrophic plants (e.g., Orobanchaceae, other orchids like Rhizanthella). There have been at least two independent transitions to strict heterotrophy from partial heterotrophy in the genus (Co. maculata and Co. striata complexes). Nongreen coralroots also display evidence of relaxed selective pressures on photosynthesis-related genes that have retained open reading frames. Members of the atp complex display relaxed purifying selection among Corallorhiza as a whole, but more specifically display evidence for positive selection in nongreen, strictly heterotrophic taxa, although for reasons currently unknown.

Generally, there is a clear path of plastid genome degradation in the heterotrophic angiosperm lineages studied thus far (with some deviations from the overall model, e.g., some Cuscuta). Moreover, it can be hypothesized that there is a stepwise progression to a “minimal” plastome, and Corallorhiza occupies categories 2 (partially heterotrophic taxa, which display degradation of the ndh complex only and a few other pseudogenes) and 3 (strictly heterotrophic taxa, including degradation of the ndh complex, photosynthesis-related genes, and rpo genes [excluding atp genes]) in a modified model of the degree of plastome degradation among angiosperms. Future studies will include 1) deeper sampling of complete plastomes within species complexes of Corallorhiza to gain a phylogeographic perspective on plastome degradation; 2) direct assessment of the relative photosynthetic capabilities of members of each coralroot taxon in the field; and 3) quantification of transcript levels for plastid-encoded genes within each species to gain a clearer understanding of gene expression and to test the hypothesis of the existence of duplicate, functional copies of pseudogenized or deleted plastid genes.

Materials and Methods

Plant Material, DNA Isolation, Library Preparation, and PE Sequencing

Plant material was collected and preserved either in silica gel or frozen at −80 °C. Total genomic DNAs were extracted using the cetyltrimethylammonium bromide (CTAB) method (Doyle JJ and Doyle JL 1987) from 1.0 g frozen or 0.2–0.5 g silica-dried material and quantified using a NanoDrop 2000 spectrophotometer (Thermo Scientific, Waltham, MA). Samples with total DNA concentrations >100 ng/µl were selected and run on a 1.5% agarose gel to check for high molecular weight, nondegraded DNA. Samples were shipped on ice to Cold Spring Harbor Laboratory (CSHL, Woodbury, NY), where they were sheared to approximately 450 bp, followed by library preparation and barcoding for PE Illumina sequencing. The exception was Co. odontorhiza, for which library preparation and sequencing were completed at the University of Missouri–Columbia using the Illumina TruSeq Library Preparation Kit (Illumina Inc., San Diego, CA); Co. odontorhiza DNAs were sheared to approximately 320 bp. All samples were run on Illumina HiSeq 2000 machines to generate 100-bp PE reads. Indexed samples were pooled and run with accessions from other projects, with a total of 40 or 20 accessions in a single lane on the CSHL and University of Missouri machines, respectively. Voucher information and characteristics of each of the resulting data sets and completed plastomes are listed in table 1.

Quantification of Chlorophyll Content

Approximately 0.1 g of −80 °C frozen tissue was harvested from the outer wall of the developing ovary and weighed for chlorophyll quantification. Ovary tissue was chosen 1) because this is the only organ that is visibly green in some taxa (e.g., Co. odontorhiza, Co. wisteriana) and 2) to be consistent between taxa. Chlorophylls were extracted in 80% ice-cold acetone and stored at −80 °C; chlorophylls a and b were quantified on a Cary 4000 UV-VIS spectrophotometer (Agilent Technologies Inc., Santa Clara, CA; Department of Plant Cellular & Molecular Biology, Ohio State University), and their concentrations were calculated using formulas of Porra (2002) (Lima D, personal communication). Data from chlorophylls a and b were pooled to give total chlorophylls. Although individual accessions used to quantify chlorophyll concentration among species were not the same as those used for sequencing, multiple individuals from across each species’ geographic range were included to account for variation in chlorophyll content among individuals within species. Corallorhiza bentleyi (an endangered member of the Co. striata complex) is included here but was not sequenced. Conversely, Mexican-endemic members of the Co. maculata complex (Co. macrantha, Co. bulbosa, Co. maculata var. mexicana) were not included in the chlorophyll analysis due to rarity and thus lack of fresh/frozen material. Voucher specimens for plants used for chlorophyll analysis are listed in supplementary figure S5, Supplementary Material online.
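As an illustration of the kind of calculation involved, the sketch below uses the simultaneous equations widely quoted from Porra and coworkers for chlorophylls a and b in 80% buffered aqueous acetone (absorbance at 663.6 and 646.6 nm, concentrations in µg/ml). The text does not state which equations or wavelengths were actually applied, so the coefficients, the optional baseline subtraction, and all function names here are assumptions.

```python
def chlorophyll_ug_per_ml(a663_6, a646_6, baseline=0.0):
    """Chlorophyll a, b, and total (ug/ml) from absorbances in 80% acetone.

    Coefficients are assumed (commonly cited values for 80% buffered acetone);
    `baseline` is an optional absorbance (e.g., at 750 nm) subtracted from both.
    """
    a663 = a663_6 - baseline
    a646 = a646_6 - baseline
    chl_a = 12.25 * a663 - 2.55 * a646
    chl_b = 20.31 * a646 - 4.91 * a663
    return chl_a, chl_b, chl_a + chl_b   # a and b pooled to give total

def per_gram_fresh_weight(conc_ug_per_ml, extract_volume_ml, tissue_mass_g):
    """Convert an extract concentration to ug chlorophyll per g of tissue."""
    return conc_ug_per_ml * extract_volume_ml / tissue_mass_g

# Hypothetical readings, for illustration only.
chl_a, chl_b, total = chlorophyll_ug_per_ml(0.5, 0.2)
print(per_gram_fresh_weight(total, extract_volume_ml=1.0, tissue_mass_g=0.1))
```

Normalizing by tissue mass, as in the second function, is what makes concentrations comparable between green and nongreen taxa sampled at roughly 0.1 g each.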

Data Processing

The resulting paired FASTQ data were processed using a custom UNIX script (“process_fastq.sh”) that calls several freely available Perl scripts for the purposes of 1) quality filtering/trimming and adaptor removal (“IlluQC_PRLL.pl”, “TrimmingReads.pl”; Patel and Jain 2012); 2) “shuffling” PE files into a single file (“ShuffleSequences_fastq.pl”, part of the VELVET suite of tools [E. Cabot, University of Wisconsin-Madison; Zerbino and Birney 2008]); and 3) conversion to FASTA format (“fastq2fasta.pl”, B. Knaus 2009 [http://brianknaus.com/software/srtoolbox, last accessed May 1, 2014]). Reads were trimmed from their 3′-ends, discarding bases under a quality threshold of PHRED score of 20 (corresponding to an error rate of 1/100 or greater), and also discarding trimmed reads below 61 bp in length.
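The trimming rule described above can be sketched in a few lines. This is not the Perl code that was used; it is a minimal illustration that assumes Phred+33 quality encoding and a simple "trim until the last base passes" strategy, and the function names are hypothetical.

```python
def phred_to_error(q):
    """Error probability implied by a PHRED score (Q20 -> 0.01)."""
    return 10 ** (-q / 10)

def trim_3prime(seq, qual, min_q=20, min_len=61, offset=33):
    """Trim low-quality bases from the 3' end of one read.

    seq, qual: sequence and quality strings of equal length.
    offset: ASCII offset of the quality encoding (assumed Phred+33).
    Returns (seq, qual) after trimming, or None if the read is too short.
    """
    end = len(seq)
    # Walk back from the 3' end while the terminal base is below threshold.
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    if end < min_len:
        return None                   # discard reads shorter than min_len
    return seq[:end], qual[:end]

read = trim_3prime("A" * 100, "I" * 90 + "#" * 10)
print(len(read[0]) if read else "discarded")
```

Because the reads are paired, a real pipeline would also need to decide what to do with the mate of a discarded read; that step is omitted here.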

Read Mapping

PE reads were then mapped five times iteratively in Geneious v6 (created by Biomatters Inc., Auckland, New Zealand; http://www.geneious.com, last accessed December 1, 2013), using automatically determined settings, to a combined file of 1) the Phalaenopsis aphrodite plastome (GenBank accession number NC_007499; Chang et al. 2006), 2) the Phoenix dactylifera mitochondrial genome (NC_016740; Fang et al. 2012), and 3) the Asclepias syriaca nuclear ribosomal DNA consensus operon (JF312046; Straub et al. 2011). The purpose of read mapping was to filter or “screen” reads that matched the reference genomes and to get initial, crude estimates of coverage to be used in downstream de novo assemblies.

Plastome Assembly

Mean coverage estimates from Geneious were used as input for de novo plastome assembly (“-exp_cov” parameter). De novo assembly was completed using both the VELVET v1.2.10 assembler (Zerbino and Birney 2008) and the Geneious v6 de novo assembler. VELVET assemblies were generated using a variety of hash lengths (i.e., “word match” or “kmer length”) ranging from 51 to 81 for each complete data set, with specified expected coverage estimates taken from Geneious mapping (specified to be in approximate kmer coverage; see Zerbino and Birney 2008). Coverage cutoff values were input as roughly one-half of the specified expected coverage estimates, minimum contig length was set to 300 bp, and insert size was set to 450 bp (320 bp for Co. odontorhiza); the resulting contigs were assembled in Sequencher v6 (GeneCodes, Ann Arbor, MI). The resulting “draft de novo assembly” from VELVET was used as a reference to map reads in Geneious as described above (five iterations, automatically determined settings). Read mappings were visually scanned in Geneious to check for areas of low coverage or potential misassemblies. Read pairs that mapped successfully to the draft reference were used in a second round of de novo assembly, this time in Geneious with automatically determined settings, saving 100 consensus contigs. This method allowed construction of complete draft plastomes with relative ease.
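The conversion from mapped (nucleotide) coverage to approximate kmer coverage follows the relationship given in the VELVET documentation, Ck = C × (L − k + 1) / L, where C is nucleotide coverage, L is read length, and k is the hash length. The sketch below applies it across the range of hash lengths mentioned above; the function names, the step between hash lengths, and the example coverage value are illustrative assumptions.

```python
def kmer_coverage(nucleotide_coverage, read_length, k):
    """Approximate kmer coverage: Ck = C * (L - k + 1) / L."""
    if not 0 < k <= read_length:
        raise ValueError("k must be between 1 and the read length")
    return nucleotide_coverage * (read_length - k + 1) / read_length

def coverage_settings(nucleotide_coverage, read_length, k):
    """Expected coverage and a cutoff of roughly one-half of it."""
    exp_cov = kmer_coverage(nucleotide_coverage, read_length, k)
    return {"hash_length": k, "exp_cov": exp_cov, "cov_cutoff": exp_cov / 2}

# Hypothetical mean plastome coverage of 400x with 100-bp reads.
for k in range(51, 82, 10):           # odd hash lengths from 51 to 81
    print(coverage_settings(400, 100, k))
```

Note that trimmed reads are shorter than 100 bp on average, so a value computed this way is only a rough guide, consistent with the "approximate" wording in the text.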

Contigs from both assemblies (Geneious, Velvet) were merged in Sequencher to form complete plastomes (parameters = “dirty data” or “large gap”, minimum overlap = 20 bp, minimum similarity 70–90%). Any apparent “gaps” among contigs resulting from one software platform were usually covered by a contig from the other, allowing assembly of completed plastomes (Large Single Copy–Inverted Repeat B–Small Single Copy). In a few cases, VELVET scaffolds contained strings of Ns between 30 and 100 bp, mostly associated with homopolymer repeats (>10 bp). These were resolved by 1) the Geneious de novo assembly (typically) and 2) UNIX “grep” searches, with flanking sequences of 30–50 bp taken from each end of the string of Ns used as search terms. Grep results were converted to FASTA format using a custom Perl script to build a “bridge” of reads. This was done iteratively until the region in question was resolved, although most cases were resolved with a single iteration. Grep searches were applied to all cases of ambiguity between contigs from the two de novo assembly methods as a quality control. Polymerase chain reaction amplification and Sanger sequencing were not necessary due to high read coverage (table 1).
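The grep-based “bridging” step can be illustrated with a short sketch. This is not the custom Perl script referred to above; it is a hypothetical Python equivalent that searches reads for a flanking sequence on either strand, orients the hits to the flank, and writes them in FASTA format. All names are assumptions.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def reads_matching_flank(reads, flank):
    """Return reads containing `flank` on either strand, oriented to the flank."""
    rc_flank = reverse_complement(flank)
    hits = []
    for read in reads:
        if flank in read:
            hits.append(read)
        elif rc_flank in read:
            hits.append(reverse_complement(read))
    return hits

def to_fasta(reads, prefix="bridge"):
    """Format reads as FASTA records for assembly into a bridge."""
    return "".join(f">{prefix}_{i}\n{read}\n" for i, read in enumerate(reads, 1))

# Toy reads and a toy 10-bp flank (real flanks were 30-50 bp).
reads = ["GGAAACCCGGGTCC", "TTACCCGGGTTTAA", "GGGGGGGGGG"]
print(to_fasta(reads_matching_flank(reads, "AAACCCGGGT")))
```

Repeating the search with a new flank taken from the far end of the assembled hits extends the bridge further, which corresponds to the iterative procedure described in the text.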

IR boundaries for each draft plastome were determined by using the first and last approximately 20 bp of the draft plastome as search terms (reverse complement) in Sequencher to find matches, and marking their match points as features. Then, the first approximately 700 bp and last approximately 1–2,000 bp of the draft plastome were copied, reverse complemented, and manually aligned to the IR using the marked features from above as reference points. IR boundaries were determined informatically using PE information (i.e., read pairs that spanned the IR boundaries). Because the orientation of the ends of the LSC and SSC can vary, the plastomes were often in different configurations; single copy regions were reoriented to correspond to the “standard” model of gene order to simplify downstream comparison. Special attention was paid to the possibility of significant modification or even loss of the IR (either partial or complete), and also to the possibility of other genomic rearrangements or structural modifications. In other words, when building draft plastomes, it was not assumed that all accessions displayed the “standard” plastome structure found in angiosperms (i.e., LSC, IRb, SSC, IRa).
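The first step of this procedure, searching for the reverse complement of the terminal ~20 bp elsewhere in the draft sequence, was done interactively in Sequencher. A hypothetical scripted equivalent, with assumed function names and a toy sequence, might look as follows.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_all(sequence, probe):
    """All 0-based start positions of `probe` in `sequence`."""
    positions, pos = [], sequence.find(probe)
    while pos != -1:
        positions.append(pos)
        pos = sequence.find(probe, pos + 1)
    return positions

def terminal_revcomp_matches(draft, probe_len=20):
    """Positions where the reverse complement of each end of the draft occurs."""
    return {
        "start": find_all(draft, reverse_complement(draft[:probe_len])),
        "end": find_all(draft, reverse_complement(draft[-probe_len:])),
    }

# Toy draft: a 20-bp segment, a spacer, its reverse complement, and a tail.
segment = "ACGGTCATGCATTGCAGTCA"
draft = segment + "TTTTTTTTTT" + reverse_complement(segment) + "GGGGGCCCAT"
print(terminal_revcomp_matches(draft))
```

Exact string matching of this kind only marks candidate match points; as stated above, the boundaries themselves were confirmed with read pairs spanning the IR junctions.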

Plastome Annotation

Plastid genomes were annotated by first aligning the set of coding loci (coding sequences [CDS]) containing introns and all tRNAs from Phalaenopsis to each draft plastome (minus IRa) in Sequencher to aid in intron/exon and tRNA boundary determination. Annotations were conducted in DOGMA (Wyman et al. 2004), which uses BLASTX searches against a database of annotated plastomes and amino acid translations, and allows manual determination of start/stop codons. DOGMA BLASTX parameters were as follows: genetic code = plant plastid, e value = 5, percent cutoff for protein coding genes = 55, and percent cutoff for tRNAs = 80. The resulting GenBank feature table was exported from the DOGMA server and imported, along with the finished plastid FASTA file, into SEQUIN (National Center for Biotechnology Information), where the annotation was validated and completed for submission to GenBank. A large inversion in Co. maculata var. maculata was manually reversed for the purposes of phylogenetic analyses and other whole-plastome comparisons, but the plastome for that accession was annotated with the inversion as assembled de novo. Plastome maps were drawn with GenomeVx (Conant and Wolfe 2008) and manually edited in Adobe Illustrator CS6 (Adobe Inc., San Jose, CA).

Ribosomal DNA Assembly and Annotation

Nuclear ribosomal DNA reads were filtered by mapping reads from Co. odontorhiza to a nearly complete rDNA operon reference (partial ETS [External Transcribed Spacer], 18S rDNA, ITS1 [Internal Transcribed Spacer], 5.8S rDNA, ITS2, 26S rDNA) from A. syriaca described above, as well as mapping to an ITS-spanning sequence from Co. odontorhiza from GenBank (accession number EU391326). Other regions were not assembled (e.g., 5S rDNA). Resulting reads that mapped to each reference were assembled de novo in Geneious, and resulting contigs were merged through overlap in Sequencher. This draft assembly was used for a second round of read mapping (original Co. odontorhiza reads) and de novo assembly in Geneious, resulting in a single consensus contig. The consensus contig was based on 50% majority per site, bearing in mind that the rDNA operon exists in numerous repeated copies that often differ from one another, appearing as polymorphism in read pileups. The Co. odontorhiza rDNA consensus operon was then used as a reference in read mapping for all other Corallorhiza species, followed by individual de novo assemblies as above.

Nearly complete ribosomal DNA operons (partial ETS, 18S rDNA, ITS1, 5.8S, ITS2, 26S rDNA, excluding 5S rDNA) were annotated by first aligning finished contigs in Geneious (using the MUSCLE v3.8.31 plugin in Geneious; Edgar 2004) and transferring the annotation features from Co. odontorhiza to the aligned Corallorhiza sequences. Then, the entire alignment was annotated in SEQUIN for submission to GenBank by adding each feature to Co. odontorhiza and subsequently propagating the features to the remaining sequences. Reads were again mapped in Geneious to each finished, annotated rDNA cistron to check for misassemblies. GenBank accession numbers for rDNA partial operons are KM390003–KM390012. Available outgroup sequences for Cymbidium, Oncidium, and Phalaenopsis were mined from GenBank for the purposes of downstream phylogenetic analyses (18S rDNA: Cy. farberi, GenBank accession number JN418934; ITS: Cy. aloifolium, JF729014; Phalaenopsis aphrodite, AY391543; O. andradeanum, FJ565598).

Data Matrix Assembly and Sequence Alignment

Annotated plastomes were imported into Geneious, with subsequent removal of the second copy of the IR (IRa), and aligned with MAFFT v7.157 (Katoh et al. 2002) using the “fast-progressive” method (gap opening penalty = 6, offset = 0.1), followed by manual adjustment. The alignment also included complete plastomes of Cy. aloifolium (GenBank accession number NC_021429), Oncidium Gower Ramsey (GenBank accession number NC_014056), and Phalaenopsis aphrodite (GenBank accession number NC_017049). A short inversion of the petN-psbM region in Cymbidium was removed for that accession in downstream analyses. Annotated CDS were exported from Geneious, and data from each locus were saved as separate FASTA files. Each file was submitted for codon-based alignment to the MACSE web server (Ranwez et al. 2011; http://mbb.univ-montp2.fr/MBB/subsection/softExec.php?soft=macse, last accessed May 30, 2014), which allows alignment of both intact open reading frames and incomplete sequences (including pseudogenes with modified reading frames). Alignments were adjusted manually when necessary and imported back into Geneious or SequenceMatrix (Vaidya et al. 2011), where they were concatenated into larger matrices. Protein-coding genes that contained pseudogenes (shifted reading frames, internal stop codons) in at least one taxon were excluded. Data matrices are deposited in supplementary fig. S6, Supplementary Material online.

Phylogenetic Analyses

Phylogenetic analyses were conducted using both Parsimony and ML. Parsimony searches were conducted in TNT (Goloboff et al. 2003, 2008) with 100 random addition starting sequences, saving 100 trees per replicate, and using tree bisection–reconnection branch swapping on the pool of saved trees. Support was assessed through 2,000 Jackknife pseudoreplicates with the above search parameters. Model fit for ML analyses was assessed for the complete plastome alignment in MEGA v6 (Tamura et al. 2013), which yielded GTR+G as the best-fit model based on the corrected Akaike Information Criterion (AICc) metric (Burnham and Anderson 2002). All ML analyses were conducted in RAxML v8.0.20 (Stamatakis 2006). Three separate searches were initiated for each matrix analyzed from different random starting seeds, using an unpartitioned GTR+G model, to ensure convergence on the same topology. A codon-partitioned model was not chosen due to the possibility of relaxed functional constraints among some coding regions for some taxa, which could possibly represent model violations. Support was assessed with 1,000 standard bootstrap pseudoreplicates. Cymbidium was chosen as the outgroup taxon.
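For reference, the AICc used in model selection adds a small-sample correction to the AIC. The sketch below states the standard formula; how MEGA counts parameters and defines sample size for an alignment is not described in the text, so the arguments `k` and `n` and the example values are assumptions for illustration.

```python
def aicc(log_likelihood, k, n):
    """Corrected Akaike Information Criterion.

    AICc = 2k - 2 lnL + 2k(k + 1) / (n - k - 1)
    k: number of free parameters; n: sample size (e.g., alignment sites).
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)

def best_model(fits, n):
    """Name of the model with the lowest AICc; fits maps name -> (lnL, k)."""
    return min(fits, key=lambda name: aicc(fits[name][0], fits[name][1], n))

# Hypothetical log-likelihoods and parameter counts, for illustration only.
fits = {"simple": (-5200.0, 30), "complex": (-5100.0, 39)}
print(best_model(fits, n=10000))
```

With alignments as long as whole plastomes, n is large relative to k, so the correction term is small and AICc approaches the ordinary AIC.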

Plastid Pseudogenes, Genomic Deletions, and Genomic Attributes

The presence of pseudogenes and whole-gene/regional deletions was initially assessed visually in DOGMA by identifying interrupted BLASTX reading frames. All coding loci aligned in MACSE (see above) were imported into Geneious, and sequences were translated to assess pseudogene status through the presence of frame shifts and/or premature stop codons. Loci with interrupted reading frames were considered to be pseudogenes unless there was an alternative start or stop codon within three codon positions of the other taxa in the alignment. In addition, the alignment of whole, annotated plastomes (described above) was used to visualize any larger genomic deletions spanning CDS boundaries, or rearrangements.
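The translation-based screen can be sketched as follows. This is a simplified, hypothetical check using the standard genetic code: it flags lengths that are not a multiple of three, internal stop codons, and a missing terminal stop. It does not implement the rule about alternative start or stop codons within three codon positions, and it ignores RNA editing.

```python
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order.
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}

def translate(cds):
    """Translate complete codons; unknown codons become 'X'."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3))

def reading_frame_problems(cds):
    """List reasons a CDS might be scored as a putative pseudogene."""
    cds = cds.upper().replace("-", "")        # drop alignment gaps
    problems = []
    if len(cds) % 3 != 0:
        problems.append("length not a multiple of 3 (possible frameshift)")
    protein = translate(cds)
    if "*" in protein[:-1]:
        problems.append(f"premature stop at codon {protein.index('*') + 1}")
    if not protein.endswith("*"):
        problems.append("no terminal stop codon")
    return problems

print(reading_frame_problems("ATGGCTTAA"))
print(reading_frame_problems("ATGTAAGCTTAA"))
```

An empty list means the reading frame is open under these simple criteria; it does not by itself show that the gene is functional.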

Hypotheses of differences in plastid genomic attributes among nongreen versus green taxa, and correlations between chlorophyll content and genomic attributes, were assessed among taxa using two phylogeny-based analytical approaches. First, a Phylogenetic Analysis of Variance (Garland et al. 1993) was used to test for differences in continuous plastome variables (e.g., length, GC content) among green versus nongreen taxa (i.e., a discrete grouping variable), taking phylogenetic relationships into account. Analyses were run in the R package “geiger” (Harmon et al. 2008) with 10,000 simulations each using the “aov.phylo” function, based on an ultrametric tree generated through Nonparametric Rate Smoothing (Sanderson 1997). These analyses were run for a complete data set of 13 taxa (three leafy, green outgroup taxa and ten species of Corallorhiza). A second set of analyses was conducted on a subset of six taxa of Corallorhiza for which chlorophyll data were available, to mitigate problems with missing chlorophyll data. In these analyses, continuous genomic variables (length, number of putatively functional genes, GC content) were correlated with chlorophyll content (continuous) using Phylogenetically Independent Contrasts to account for phylogenetic relatedness among taxa (Felsenstein 1985). The latter analyses were conducted under a Brownian motion model with the PDAP package (Midford et al. 2005) in Mesquite v2.75 (Maddison WP and Maddison DR 2011).
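The logic of independent contrasts (Felsenstein 1985) can be shown on a toy tree. The sketch below is not the PDAP implementation; it is a minimal illustration assuming a fully bifurcating tree written as nested tuples, with hypothetical taxon names, branch lengths, and trait values.

```python
import math

def contrasts(tree, trait):
    """Standardized independent contrasts under Brownian motion.

    tree: leaf = (name, branch_length); internal = (left, right, branch_length).
    trait: dict mapping leaf name -> trait value.
    """
    out = []

    def visit(node):
        if len(node) == 2:                       # leaf
            name, length = node
            return trait[name], length
        left, right, length = node
        x1, v1 = visit(left)
        x2, v2 = visit(right)
        out.append((x1 - x2) / math.sqrt(v1 + v2))
        # Weighted ancestral value; branch lengthened to reflect its uncertainty.
        x = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
        return x, length + v1 * v2 / (v1 + v2)

    visit(tree)
    return out

def correlation_through_origin(c1, c2):
    """Correlation of two sets of contrasts, forced through the origin."""
    sxy = sum(a * b for a, b in zip(c1, c2))
    return sxy / math.sqrt(sum(a * a for a in c1) * sum(b * b for b in c2))

# Toy three-taxon tree and made-up values, for illustration only.
tree = ((("A", 1.0), ("B", 1.0), 1.0), ("C", 2.0), 0.0)
chlorophyll = {"A": 1.0, "B": 3.0, "C": 5.0}
length = {"A": 140.0, "B": 146.0, "C": 150.0}
print(correlation_through_origin(contrasts(tree, chlorophyll), contrasts(tree, length)))
```

A tree with n tips yields n − 1 contrasts, which is why the six-taxon subset leaves few degrees of freedom for the correlations.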

Model-Based Analyses of Selective Regime

Changes in selective regime among species of Corallorhiza for codon-based matrices were assessed with the CODEML module in PAML v4.8 (Yang 1997, 2007) using “branch” models. Only loci for which all taxa had open reading frames were included in the analyses. Various matrices are listed in table 3. The objective was to test hypotheses of different ω ratios for 1) Corallorhiza versus leafy outgroups and 2) nongreen Corallorhiza versus green Corallorhiza + leafy outgroups. To accomplish this, different branch models were fit to the data using the topology resulting from the ML analysis of whole plastomes + rDNA. Six different branch models were tested. First, a model in which all branches evolve under one ω ratio (Model M0) was tested, where ω = dN/dS, the ratio of nonsynonymous substitutions per nonsynonymous site to synonymous substitutions per synonymous site. A two-ratio model (M1) was then run in which all branches of Corallorhiza were allowed to have a different ratio (ω_cor) than the remaining leafy outgroup sequences (ω_leafy). A three-ratio model was then run (M2), allowing separate values for ω_leafy, ω_cor, and nongreen Corallorhiza (ω_str = ω_mac-ng; Co. striata, Co. mertensiana, Co. maculata var. maculata and occidentalis, respectively), followed by a four-ratio model (M3) with additionally separate ratios for Co. striata (ω_str) and the nongreen members of the Co. maculata complex (ω_mac-ng). The latter three models (M1–M3) specifically test hypotheses of whether Corallorhiza, and particularly nongreen Corallorhiza, display departures from purifying selection. Two additional models were run to test whether the nongreen Corallorhiza differ significantly from green Corallorhiza and leafy outgroup taxa. Specifically, Model M4 (two ratios) sets ω_leafy = ω_cor and ω_str = ω_mac-ng, whereas M5 (three ratios) allows ω_str and ω_mac-ng to differ. Successively nested models were compared based on log-likelihood ratio tests (M1 vs. M0, M2 vs. M1, M3 vs. M2, M4 vs. M0, M5 vs. M4), where the likelihood ratio test statistic was calculated as two times the difference in log-likelihood between the two models, with degrees of freedom calculated as the difference in the number of free parameters for the models, and tested against the χ² distribution. P values for multiple comparisons to the same data set were Bonferroni-corrected (Dunn 1959), as suggested by Yang (2007), with the corrected significance level corresponding to α/m, where m is the number of branches being tested. Genes ycf1 and ycf2 were analyzed separately because they have extraordinarily high substitution rates (e.g., Barrett et al. 2013), as was matK, based on the finding of a pseudogenized copy in a previous study (Freudenstein and Senyo 2008).
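The likelihood ratio test and Bonferroni correction described above amount to a short calculation once the log-likelihoods are in hand. The sketch below is a minimal, standard-library illustration; the function names and the example log-likelihood values are hypothetical, and the χ² tail probability is computed from its closed form for integer degrees of freedom rather than by any routine used in the original analysis.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of the chi-square distribution (integer df >= 1)."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        terms = sum(y ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-y) * terms
    terms = sum(y ** (i + 0.5) / math.gamma(i + 1.5) for i in range((df - 1) // 2))
    return math.erfc(math.sqrt(y)) + math.exp(-y) * terms

def likelihood_ratio_test(lnl_null, lnl_alt, df, m=1, alpha=0.05):
    """Return (statistic, P value, significant at the Bonferroni level alpha/m)."""
    stat = 2.0 * (lnl_alt - lnl_null)
    p = chi2_sf(stat, df)
    return stat, p, p < alpha / m

# Hypothetical log-likelihoods for a nested pair such as M1 vs. M0 (df = 1).
print(likelihood_ratio_test(lnl_null=-1000.0, lnl_alt=-995.0, df=1, m=5))
```

The same function applies to the branch-site comparisons, where the null and alternative models differ by one parameter and df = 1.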

To further investigate the ω values among branches for the atp complex, two sets of branch-site models were tested against the data. In the first, all Corallorhiza were set as the foreground branch, and in the second, nongreen Corallorhiza (Co. striata var. vreelandii, Co. maculata var. maculata and occidentalis, and Co. mertensiana) were set as foreground branches. For both sets of branch-site models, two alternative models were tested. The “null” model specifies 0 < ω0 < 1 for a proportion of sites and ω1 = ω2 = 1 for the remaining sites (no positive selection), whereas the alternative model allows ω2 > 1 (positive selection on some sites). The alternative model (some sites under positive selection) was then tested against the null model (no sites under positive selection) using a χ² distribution as above with 1 degree of freedom (i.e., the alternative and null models differ by one parameter).

Comparison with Other Parasitic Angiosperm Lineages

Plastid gene content (losses, pseudogenes, putatively functional genes) was compared among diverse lineages of angiosperms containing heterotrophs, including newly sequenced Corallorhiza. Taxa and genes were sorted by the numbers of putatively functional genes and gene functional class, respectively, to illustrate the different stages of plastid genome degradation. Specifically, genes were arranged according to the model of Barrett and Davis (2012), in which degradation of plastid functional gene classes is hypothesized to have occurred roughly in the following order: 1) NAD(P)H dehydrogenase genes (ndh); 2) “photosynthesis”-related genes, including pet, psa, psb, cemA, ccsA, rbcL, ycf3, and ycf4; 3) Plastid-encoded RNA Polymerase genes (rpo); 4) ATP synthase genes (atp); and finally 5) “housekeeping” genes (rpl, rps, rrn, trn, matK, clpP, infA, accD, ycf1, and ycf2). Though not a formal test in the strict sense, the use of this model does allow an assessment of how advanced the sequenced accessions of Corallorhiza are when compared with other heterotrophic angiosperms; it is hypothesized that Corallorhiza occupies the early stages of this model. Technically, atp genes function in the generation of ATP and movement of protons across the thylakoid membrane and thus play a crucial role in photosynthesis, but this functional class was kept separate from photosynthesis genes due to the possibility that ATP synthase serves roles outside of photosynthesis.

Supplementary Material

Supplementary figures S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by the National Science Foundation (awards DEB-0830020 to Jerrold I. Davis and DBI 1110443 to J.C.P.); the California State University Program for Education and Research in Biotechnology (CSUPERB) to C.F.B.; the California State University, Los Angeles Center for Effective Teaching and Learning Faculty-Student Mentorship award to C.F.B. and J.L.; and the California State University Council on Ocean Affairs, Science and Technology to C.S. The authors are grateful to Jerrold Davis, Patrick Edger, Ryan Ellingson, Gwynne Lim, Jeffrey Morawetz, and Chris Randle for feedback and laboratory/bioinformatics support; the staff at Cold Spring Harbor Laboratory (Woodbury, NY) for sample preparation and sequencing; Daniel Lima and Richard Sayre for assistance with chlorophyll spectroscopy and data interpretation; Gerardo Salazar (UNAM) for assistance in collecting Mexican specimens; and Maria Logacheva and three anonymous reviewers for critical insight that improved the paper.

References

Allen JF, de Paula WBM, Puthiyaveetil S, Nield J. 2011. A structural phylogenetic map for chloroplast photosynthesis. Trends Plant Sci. 16:645–655.

Barbrook AC, Howe CJ, Purton S. 2006. Why are plastid genomes retained in non-photosynthetic organisms? Trends Plant Sci. 11:101–108.

Barrett CF, Davis JI. 2012. The plastid genome of the mycoheterotrophic Corallorhiza striata (Orchidaceae) is in the relatively early stages of degradation. Am J Bot. 99:1513–1523.

Barrett CF, Davis JI, Leebens-Mack J, Conran JG, Stevenson DW. 2013. Plastid genomes and deep relationships among the commelinid monocot angiosperms. Cladistics 29:65–87.

Barrett CF, Freudenstein JV. 2008. Molecular evolution of rbcL in the mycoheterotrophic coralroot orchids (Corallorhiza Gagnebin, Orchidaceae). Mol Phylogenet Evol. 47:665–679.

Bellino A, Alfani A, Selosse M-A, Guerrieri R, Borghetti M, Baldantoni D. 2014. Nutritional regulation in mixotrophic plants: new insights from Limodorum abortivum. Oecologia 175:875–885.

Bernardi G. 1989. The isochore organization of the human genome. Annu Rev Genet. 23:637–661.

Bidartondo MI. 2005. The evolutionary ecology of myco-heterotrophy. New Phytol. 167:335–352.

Blazier J, Guisinger MM, Jansen RK. 2011. Recent loss of plastid-encoded ndh genes within Erodium (Geraniaceae). Plant Mol Biol. 76:263–272.

Braukmann T, Kuzmina M, Stefanovic S. 2013. Plastid genome evolution across the genus Cuscuta (Convolvulaceae): two clades within subgenus Grammica exhibit extensive gene loss. J Exp Bot. 64:977–989.

Braukmann T, Stefanovic S. 2012. Plastid genome evolution in mycoheterotrophic Ericaceae. Plant Mol Biol. 79:5–20.

Braukmann TWA, Kuzmina M, Stefanovic S. 2009. Loss of all plastid ndh genes in Gnetales and conifers: extent and evolutionary significance for the seed plant phylogeny. Curr Genet. 55:323–337.

Burnham KP, Anderson DR. 2002. Avoiding pitfalls when using information-theoretic methods. J Wildl Manage. 66:912–918.

Cameron DD, Preiss K, Gebauer G, Read DJ. 2009. The chlorophyll-containing orchid Corallorhiza trifida derives little carbon through photosynthesis. New Phytol. 183:358–364.

Chang CC, Lin HC, Lin IP, Chow TY, Chen HH, Chen WH, Cheng CH, Lin CY, Liu SM, Chang CC, et al. 2006. The chloroplast genome of Phalaenopsis aphrodite (Orchidaceae): comparative analysis of evolutionary rate with that of grasses and its phylogenetic implications. Mol Biol Evol. 23:279–291.

Conant GC, Wolfe KH. 2008. GenomeVx: simple web-based creation of editable circular chromosome maps. Bioinformatics 24:861–862.

Cummings MP, Welschmeyer NA. 1998. Pigment composition of putatively achlorophyllous angiosperms. Plant Syst Evol. 210:105–111.

Delannoy E, Fujii S, Colas des Francs-Small C, Brundrett M, Small I. 2011. Rampant gene loss in the underground orchid Rhizanthella gardneri highlights evolutionary constraints on plastid genomes. Mol Biol Evol. 28:2077–2086.

Doyle JJ, Doyle JL. 1987. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem Bull. 19:11–15.

Dunn OJ. 1959. Confidence intervals for the means of dependent, normally distributed variables. J Am Stat Assoc. 54:613–621.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Evron Y, Johnson EA, McCarty RE. 2000. Regulation of proton flow and ATP synthesis in chloroplasts. J Bioenerg Biomembr. 32:501–506.

Fang YJ, Wu H, Zhang TW, Yang M, Yin YX, Pan LL, Yu XG, Zhang XW, Hu SNA, Al-Mssallem IS, et al. 2012. A complete sequence and transcriptomic analyses of date palm (Phoenix dactylifera L.) mitochondrial genome. PLoS One 7:e37164.

Felsenstein J. 1985. Phylogenies and the comparative method. Am Nat. 125:1–15.

Freudenstein JV. 1992. Systematics of Corallorhiza and the Corallorhizinae (Orchidaceae) [PhD dissertation]. [Ithaca (NY)]: Cornell University.

Freudenstein JV. 1997. A monograph of Corallorhiza (Orchidaceae). Harv Pap Bot. 10:5–51.

Freudenstein JV, Barrett CF. 2010. Mycoheterotrophy and diversity in Orchidaceae. In: Seberg O, Petersen G, Barfod A, Davis JI, editors. Diversity, phylogeny, and evolution in the monocotyledons. The Proceedings of the Fourth International Conference on Monocot Systematics. Aarhus (Denmark): Aarhus University Press. p. 25–37.

Freudenstein JV, Doyle JJ. 1994a. Character transformation and relationships in Corallorhiza (Orchidaceae: Epidendroideae). 1. Plastid DNA. Am J Bot. 81:1449–1457.

Freudenstein JV, Doyle JJ. 1994b. Plastid DNA, morphological variation, and the phylogenetic species concept: the Corallorhiza maculata (Orchidaceae) complex. Syst Bot. 19:273–290.

Freudenstein JV, Senyo DM. 2008. Relationships and evolution of matK in a group of leafless orchids (Corallorhiza and Corallorhizinae; Orchidaceae: Epidendroideae). Am J Bot. 95:498–505.

Funk HT, Berg S, Krupinska K, Maier UG, Krause K. 2007. Complete DNA sequences of the plastid genomes of two parasitic flowering plant species, Cuscuta reflexa and Cuscuta gronovii. BMC Plant Biol. 7:45.

Galagan JE, Calvo SE, Borkovich KA, Selker EU, Read ND, Jaffe D, FitzHugh W, Ma LJ, Smirnov S, Purcell S, et al. 2003. The genome sequence of the filamentous fungus Neurospora crassa. Nature 422:859–868.

Garland T, Dickerman AW, Janis CM, Jones JA. 1993. Phylogenetic analysis of covariance by computer simulation. Syst Biol. 42:265–292.

Glemin S, Clement Y, David J, Ressayre A. 2014. GC content evolution in coding regions of angiosperm genomes: a unifying hypothesis. Trends Genet. 30:263–270.

Goloboff PA, Farris JS, Kallersjo M, Oxelman B, Ramirez MJ, Szumik CA. 2003. Improvements to resampling measures of group support. Cladistics 19:324–332.

Goloboff PA, Farris JS, Nixon KC. 2008. TNT, a free program for phylogenetic analysis. Cladistics 24:774–786.

Hajdukiewicz PTJ, Allison LA, Maliga P. 1997. The two RNA polymerases encoded by the nuclear and the plastid compartments transcribe distinct groups of genes in tobacco plastids. EMBO J. 16:4041–4048.

Harmon LJ, Weir JT, Brock CD, Glor RE, Challenger W. 2008. GEIGER: investigating evolutionary radiations. Bioinformatics 24:129–131.

Iles WJD, Smith SY, Graham SW. 2013. A well-supported phylogenetic framework for the monocot order Alismatales reveals multiple losses of the plastid NADH dehydrogenase complex and a strong long-branch effect. In: Wilkin P, Mayo S, editors. Early events in monocot evolution. Cambridge: Cambridge University Press. p. 1–28.

Iorizzo M, Grzebelus D, Senalik D, Szklarczyk M, Spooner D, Simon P. 2012. Against the traffic: the first evidence for mitochondrial DNA transfer into the plastid genome. Mob Genet Elements. 2:261.

Jensen PE, Bassi R, Boekema EJ, Dekker JP, Jansson S, Leister D, Robinson C, Scheller HV. 2007. Structure, function and regulation of plant photosystem I. Biochim Biophys Acta. 1767:335–352.

Katinka MD, Duprat S, Cornillot E, Metenier G, Thomarat F, Prensier G, Barbe V, Peyretaillade E, Brottier P, Wincker P, et al. 2001. Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature 414:450–453.

Katoh K, Misawa K, Kuma Ki, Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30:3059–3066.

Kawakami K, Umena Y, Iwai M, Kawabata Y, Ikeuchi M, Kamiya N, Shen J-R. 2011. Roles of PsbI and PsbM in photosystem II dimer formation and stability studied by deletion mutagenesis and X-ray crystallography. Biochim Biophys Acta. 1807:319–325.

Knaus BJ. 2010. Short read toolbox. Available from: http://brianknaus.com

Krause K. 2008. From chloroplasts to “cryptic” plastids: evolution of plastid genomes in parasitic plants. Curr Genet. 54:111–121.

Krause K. 2012. Plastid genomes of parasitic plants: a trail of reductions and losses. In: Bullerwell C, editor. Organelle genetics. Berlin (Germany): Springer. p. 79–103.

Kuijt J. 1969. The biology of flowering parasitic plants. Berkeley (CA): University of California Press.

Lawrence JG. 2005. Common themes in the genome strategies of pathogens. Curr Opin Genet Dev. 15:584–588.

Leake JR. 1994. The biology of myco-heterotrophic (Saprophytic) plants. New Phytol. 127:171–216.

Leake JR, Cameron DD. 2010. Physiological ecology of mycoheterotrophy. New Phytol. 185:601–605.

Liere K, Weihe A, Borner T. 2011. The transcription machineries of plant mitochondria and chloroplasts: composition, function, and regulation. J Plant Physiol. 168:1345–1360.

Logacheva MD, Schelkunov MI, Nuraliev MS, Samigullin TH, Penin AA. 2014. The plastid genome of mycoheterotrophic monocot Petrosavia stellaris exhibits both gene losses and multiple rearrangements. Genome Biol Evol. 6:238–246.

Logacheva MD, Schelkunov MI, Penin AA. 2011. Sequencing and analysis of plastid genome in mycoheterotrophic orchid Neottia nidus-avis. Genome Biol Evol. 3:1296–1303.

Maddison WP, Maddison DR. 2011. Mesquite: a modular system for evolutionary analysis, version 2.75. Available from: http://mesquiteproject.org

Martín M, Funk HT, Serrot PH, Poltnigg P, Sabater B. 2009. Functional characterization of the thylakoid ndh complex phosphorylation by site-directed mutations in the ndhF gene. Biochim Biophys Acta. 1787:920–928.

Martín M, Sabater B. 2010. Plastid ndh genes in plant evolution. Plant Physiol Biochem. 48:636–645.

McCarty RE, Evron Y, Johnson EA. 2000. The chloroplast ATP synthase: a rotary enzyme? Annu Rev Plant Physiol Plant Mol Biol. 51:83–109.

McNeal JR, Arumugunathan K, Kuehl JV, Boore JL, dePamphilis CW. 2007. Systematics and plastid genome evolution of the cryptically photosynthetic parasitic plant genus Cuscuta (Convolvulaceae). BMC Biol. 5:55.

McNeal JR, Bennett JR, Wolfe AD, Mathews S. 2013. Phylogeny and origins of holoparasitism in Orobanchaceae. Am J Bot. 100:971–983.

Merckx V, Freudenstein JV. 2010. Evolution of mycoheterotrophy in plants: a phylogenetic perspective. New Phytol. 185:605–609.

Midford PE, Garland T Jr, Maddison WP. 2005. PDAP package of Mesquite, version 1.07. Available from: http://mesquiteproject.org

Molina J, Hazzouri KM, Nickrent D, Geisler M, Meyer RS, Pentony MM, Flowers JM, Pelser P, Barcelona J, Inovejas SA, et al. 2014. Possible loss of the chloroplast genome in the parasitic flowering plant Rafflesia lagascae (Rafflesiaceae). Mol Biol Evol. 31:793–803.

Montfort C, Küsters E. 1940. Saprophytismus und photosynthese. I. Biochemische physiologische studien an humus-orchideen. Botanisches Archiv 40:571–633.

Moran NA. 2002. Microbial minimalism: genome reduction in bacterial pathogens. Cell 108:583–586.

Morrison HG, McArthur AG, Gillin FD, Aley SB, Adam RD, Olsen GJ, Best AA, Cande WZ, Chen F, Cipriano MJ, et al. 2007. Genomic minimalism in the early diverging intestinal parasite Giardia lamblia. Science 317:1921–1926.

Nickrent DL, Yan OY, Duff RJ, dePamphilis CW. 1997. Do nonasterid holoparasitic flowering plants have plastid genomes? Plant Mol Biol. 34:717–729.

Patel RK, Jain M. 2012. NGS QC Toolkit: a toolkit for quality control of next generation sequencing data. PLoS One 7:e30619.

Peredo EL, King UM, Les DH. 2013. The plastid genome of Najas flexilis: adaptation to submersed environments is accompanied by the complete loss of the NDH complex in an aquatic angiosperm. PLoS One 8:e68591.

Porra RJ. 2002. The chequered history of the development and use of simultaneous equations for the accurate determination of chlorophylls a and b. Photosyn Res. 73:149–156.

Protasio AV, Tsai IJ, Babbage A, Nichol S, Hunt M, Aslett MA, De Silva N, Velarde GS, Anderson TJC, Clark RC, et al. 2012. A systematically improved high quality genome and transcriptome of the human blood fluke Schistosoma mansoni. PLoS Negl Trop Dis. 6:e1455.

Randle CP, Wolfe AD. 2005. The evolution and expression of rbcL in holoparasitic sister-genera Harveya and Hyobanche (Orobanchaceae). Am J Bot. 92:1575–1585.

Ranwez V, Harispe S, Delsuc F, Douzery EJP. 2011. MACSE: Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons. PLoS One 6:e22594.

Sanderson MJ. 1997. A nonparametric approach to estimating divergence times in the absence of rate constancy. Mol Biol Evol. 14:1218–1231.

Smith DR, Lee RW. 2014. A plastid without a genome: evidence from the nonphotosynthetic green algal genus Polytomella. Plant Physiol. 164:1812–1819.

Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690.

Straub SCK, Cronn RC, Edwards C, Fishbein M, Liston A. 2013. Horizontal transfer of DNA from the mitochondrial to the plastid genome and its subsequent evolution in milkweeds (Apocynaceae). Genome Biol Evol. 5:1872–1885.

Straub SCK, Fishbein M, Livshultz T, Foster Z, Parks M, Weitemier K, Cronn RC, Liston A. 2011. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing. BMC Genomics 12:211.

Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. 2013. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol. 30:2725–2729.

Tsai IJ, Zarowiecki M, Holroyd N, Garciarrubio A, Sanchez-Flores A, Brooks KL, Tracey A, Bobes RJ, Fragoso G, Sciutto E, et al. 2013. The genomes of four tapeworm species reveal adaptations to parasitism. Nature 496:57–63.

Umate P, Schwenkert S, Karbat I, Dal Bosco C, Mlcochova L, Volz S, Zer H, Herrmann RG, Ohad I, Meurer J. 2007. Deletion of psbM in tobacco alters the Q(B) site properties and the electron flow within photosystem II. J Biol Chem. 282:9758–9767.

Vaidya G, Lohman DJ, Meier R. 2011. SequenceMatrix: concatenation software for the fast assembly of multi-gene datasets with character set and codon information. Cladistics 27:171–180.

Westwood JH, Yoder JI, Timko MP, dePamphilis CW. 2010. The evolution of parasitism in plants. Trends Plant Sci. 15:227–235.

Wicke S, Müller KF, dePamphilis CW, Quandt D, Wickett NJ, Zhang Y, Renner SS, Schneeweiss GM. 2013. Mechanisms of functional and physical genome reduction in photosynthetic and nonphotosynthetic parasitic plants of the broomrape family. Plant Cell 25:3711–3725.

Wicke S, Schäferhoff B, dePamphilis CW, Müller KF. 2014. Disproportional plastome-wide increase of substitution rates and relaxed purifying selection in genes of carnivorous Lentibulariaceae. Mol Biol Evol. 31:529–545.

Wicke S, Schneeweiss GM, dePamphilis CW, Müller KF, Quandt D. 2011. The evolution of the plastid chromosome in land plants: gene content, gene order, gene function. Plant Mol Biol. 76:273–297.

Wickett NJ, Honaas LA, Wafula EK, Das M, Huang K, Wu BA, Landherr L, Timko MP, Yoder J, Westwood JH, et al. 2011. Transcriptomes of the parasitic plant family Orobanchaceae reveal surprising conservation of chlorophyll synthesis. Curr Biol. 21:2098–2104.

Wimpee CF, Wrobel RL, Garvin DK. 1991. A divergent plastid genome in Conopholis americana, an achlorophyllous parasitic plant. Plant Mol Biol. 17:161–166.

Wolfe AD, dePamphilis CW. 1998. The effect of relaxed functional constraints on the photosynthetic gene rbcL in photosynthetic and nonphotosynthetic parasitic plants. Mol Biol Evol. 15:1243–1258.

Wolfe KH, Morden CW, Palmer JD. 1992. Function and evolution of a minimal plastid genome from a nonphotosynthetic parasitic plant. Proc Natl Acad Sci U S A. 89:10648–10652.

Wu FH, Chan MT, Liao DC, Hsu CT, Lee YW, Daniell H, Duvall MR, Lin CS. 2010. Complete chloroplast genome of Oncidium Gower Ramsey and evaluation of molecular markers for identification and breeding in Oncidiinae. BMC Plant Biol. 10:68.

Wyman SK, Jansen RK, Boore JL. 2004. Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20:3252–3255.

Yang JB, Tang M, Li HT, Zhang ZR, Li DZ. 2013. Complete chloroplast genome of the genus Cymbidium: lights into the species identification, phylogenetic implications and population genetic analyses. BMC Evol Biol. 13:84.

Yang ZH. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci. 13:555–556.

Yang ZH. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 24:1586–1591.

Yoder OC, Turgeon BG. 2001. Fungal genomics and pathogenicity. Curr Opin Plant Biol. 4:315–321.

Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18:821–829.

Zimmer K, Meyer C, Gebauer G. 2008. The ectomycorrhizal specialist orchid Corallorhiza trifida is a partial myco-heterotroph. New Phytol. 178:395–400.


does it correlate with chlorophyll content after accounting for phylogenetic relationships (P_PhyANOVA = 0.0648 and P_PIC = 0.0989, respectively).

Structural Evolution of the Plastome: Pseudogenes and Losses

Figure 2 shows the maximum likelihood (ML) tree based on whole plastomes + rDNA (see below), with gene losses-of-function parsimoniously mapped, assuming no reversals. Nongreen taxa have significantly fewer putatively functional loci (as assessed by intact reading frames) than green Corallorhiza and leafy outgroups (P_PhyANOVA < 0.0001; table 2). In the subset of coralroot taxa from which chlorophyll data are available, chlorophyll concentration is also significantly correlated with the number of putatively functional genes (P_PIC = 0.002).
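The "intact reading frame" criterion used above to count putatively functional genes can be illustrated with a minimal sketch. The rules below (an ATG start codon, a length that is a multiple of three, a terminal stop codon, and no internal stop codons) are assumptions made for illustration; the exact criteria applied by the authors are not given here, and plastid-specific complications such as RNA editing or alternative start codons are ignored. The function name `has_intact_reading_frame` is hypothetical.

```python
# Standard stop codons; plastid RNA editing is deliberately ignored here.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_intact_reading_frame(cds: str) -> bool:
    """Return True if a coding sequence looks like an intact open reading frame.

    Illustrative assumptions: the sequence must start with ATG, have a length
    divisible by three, end in a stop codon, and contain no internal stops.
    """
    seq = cds.upper()
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False  # e.g. a 4-bp insertion shifts the frame
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if codons[0] != "ATG" or codons[-1] not in STOP_CODONS:
        return False
    # Any stop codon before the last position truncates the protein.
    return not any(codon in STOP_CODONS for codon in codons[1:-1])


if __name__ == "__main__":
    print(has_intact_reading_frame("ATGGCTAAATAA"))      # intact toy gene
    print(has_intact_reading_frame("ATGGCTACGTAAATAA"))  # 4-bp insertion
    print(has_intact_reading_frame("ATGTAAGCTTAA"))      # internal stop codon
```

Under these assumptions, a small indel whose length is not a multiple of three, like the 4-bp psaI insertion described for Co. macrantha, is enough to move a locus from the functional to the pseudogene category.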

Plastid genes encoding subunits of the NAD(P)H dehydrogenase complex (ndh genes) have mostly become pseudogenes in Corallorhiza, with some wholly or mostly deleted. Over 50% of the gene encoding Photosystem II (PSII) protein M (psbM), a small gene of 105 bp in length, has been deleted in all members of Corallorhiza except Co. striata vreelandii and Co. trifida; it has been completely deleted in Co. maculata vars. maculata and occidentalis. Among members of Corallorhiza displaying at least some green tissue, few other pseudogenes or gene losses were detected. The cemA locus is

FIG. 2. Maximum-likelihood tree based on whole plastomes + partial rDNA cistron, with pseudogene (Ψ) content and deletions mapped parsimoniously. Branch symbols distinguish partial gene deletions (>6 bp) causing frame shift or truncation (<50% of gene missing) from complete or nearly complete gene deletions causing frame shift or truncation (>50% of gene missing). psa, photosystem I; psb, PSII; pet, cytochrome; rpo, plastid-encoded polymerase; ndh, NAD(P)H dehydrogenase; ccsA, cytochrome C heme-binding protein A; cemA, chloroplast envelope membrane protein A; rbcL, RuBisCO large subunit; trn, transfer RNA; ycf3 and ycf4, putative photosystem assembly factors. Dark circles on branches indicate putative transitions to strict heterotrophy.

Table 2. Green Tissue Status, Lengths of Plastomes and Subregions, GC Content, and Numbers of Putatively Functional Plastid Genes for Each Accession.

| Species | Green Tissue(a) | Length (bp) | LSC (bp) | SSC (bp) | IR (bp) | GC (%) | No. Functional Genes(c) |
|---|---|---|---|---|---|---|---|
| Corallorhiza bulbosa | Yes | 148643 | 82853 | 12368 | 26711 | 35.19 | 105 |
| Co. macrantha | Yes | 151031 | 84253 | 12554 | 27112 | 35.21 | 104 |
| Co. maculata var. maculata | No | 146886 | 80401 | 12885 | 26800 | 34.94 | 89 |
| Co. maculata var. mexicana | Yes | 151506 | 84349 | 12671 | 27243 | 35.18 | 103 |
| Co. maculata var. occidentalis | No | 146595 | 81363 | 12368 | 26432 | 35.14 | 88 |
| Co. mertensiana | No | 147941 | 81146 | 13737 | 26529 | 35.23 | 90 |
| Co. odontorhiza | Yes | 147317 | 82259 | 13508 | 25775 | 35.62 | 103 |
| Co. striata var. vreelandii(d) | No | 137505 | 72152 | 12387 | 26483 | 34.14 | 82 |
| Co. trifida(b) | Yes | 149384 | 83085 | 14427 | 25936 | 35.74 | 107 |
| Co. wisteriana | Yes | 146437 | 82350 | 11743 | 26172 | 35.28 | 103 |

NOTE: GC, G+C base percentage of each plastome.
(a) Presence of visible green tissue, based on observation of multiple populations of each species.
(b) Photosynthesis has only been demonstrated in Co. trifida.
(c) The number of putatively functional genes, as assessed by intact reading frames.
(d) Sequenced and assembled in Barrett and Davis (2012).


a pseudogene in all but Co. bulbosa, Co. macrantha, and Co. trifida, whereas psaI has experienced a 4-bp insertion in Co. macrantha near the 5′-end of the gene, causing a reading frame shift (fig. 2). Grep searches of the original reads confirmed this to be the case. No pseudogenes or major deletions causing shifts in reading frame were detected in atp genes in any species of Corallorhiza.

As previously discussed in Barrett and Davis (2012), at least one member of all major photosynthesis-related gene systems in Co. striata vreelandii has experienced pseudogene formation or gene deletion, with the exception of genes encoding subunits of the ATP synthase complex (atp genes). Several loss-of-function mutations have occurred in the nongreen, northern North American members of the Co. maculata complex (Co. mertensiana, Co. maculata maculata, and Co. maculata occidentalis). These three taxa share a partial deletion in psbB, as well as numerous pseudogenes in the psa (Photosystem I), psb (PSII), and rpo (plastid-encoded RNA polymerase) complexes, evidenced by reading frame shifts and internal stop codons. They also all have pseudogenes for ccsA (cytochrome C biogenesis protein) and rbcL (RuBisCO large subunit).

Mutations unique to Co. mertensiana among the northern North American members of the Co. maculata complex include deletion of a large portion of psaA, pseudogenes for psbA and psbD, and a deletion in rpoC1. Corallorhiza maculata maculata and Co. maculata occidentalis both contain pseudogenes for petA (cytochrome f), petG (cytochrome b6f complex subunit V), and rpoB. Mutations unique to Co. maculata occidentalis include pseudogenes for psbH and petB, with deletions in ccsA and rpoC1. Mutations in Co. maculata maculata include pseudogene formation in psbA and deletions in rpoA and rpoC2. Although the petD gene in Co. maculata maculata appears to bear an intact reading frame, a breakpoint of a 16-kb inversion (fig. 2) separates the two exons of this gene, which in other taxa are adjacent and thus presumably cis-spliced.

Phylogenetic Analyses

Relationships based on ML analyses of various data matrices were generally consistent (supplementary figs. S3 and S4, Supplementary Material online). The main difference among ML topologies resulting from these data sets is the placement of Co. bulbosa. For plastid coding data, Co. bulbosa is sister to Co. odontorhiza + Co. wisteriana (bootstrap = 97), whereas for nuclear rDNA (partial ETS-18S-ITS1-5.8S-ITS2-26S), it is placed as sister to Co. macrantha + Co. maculata mexicana (bootstrap = 92). Whole plastomes, including introns and intergenic spacers, place Co. bulbosa as sister to the rest of the Co. maculata complex (bootstrap = 91). Nuclear rDNA sequences place Co. trifida as sister to the remaining Corallorhiza; however, the placement of Co. striata as sister to the remaining Corallorhiza (excluding Co. trifida) received low support (bootstrap = 69), suggesting little support from rDNA at the base of Corallorhiza.

Analyzing the entire plastome along with nuclear rDNA yields a completely resolved Corallorhiza tree, with 100% bootstrap support for all nodes (supplementary fig. S3, Supplementary Material online). Parsimony analysis of this data set yielded identical results to ML, including identical jackknife support values (100% for all nodes; tree not shown, see supplementary fig. S3, Supplementary Material online).

FIG. 3. A basic model of plastid genome degradation (Barrett and Davis 2012), using sequenced exemplar plastomes from angiosperm families containing heterotrophs: Convolvulaceae (Funk et al. 2007; McNeal et al. 2007), Orobanchaceae (Wolfe et al. 1992; Wicke et al. 2013), Orchidaceae (Chang et al. 2006; Wu et al. 2010; Delannoy et al. 2011; Barrett and Davis 2012; Logacheva et al. 2011; Yang et al. 2013), and Petrosaviaceae (Logacheva et al. 2014). Plastomes are ranked by numbers of putatively functional genes (highest to lowest). Black-filled spaces represent putatively functional genes with open reading frames, whereas gray-filled spaces represent pseudogenes, and white spaces represent complete or nearly complete gene losses. “Status”: trophic strategy of each species. AU, nonparasitic autotroph; HemiP, hemiparasite; HP, holoparasite; PM, partial mycoheterotroph (analogous to hemiparasite); HM, holomycotroph. For the purposes of comparison, both leafy green orchids (Oncidium, Phalaenopsis) are considered nonparasitic autotrophs, even though they may still rely partially on their mycorrhizal fungi. L (bp), total length of each plastome in base pairs. “Photosynthesis-related” excludes atp genes, as these may serve functions outside of photosynthesis.


This topology was chosen for all downstream tree-based analyses. From the root of the tree (with Cymbidium as the outgroup), Oncidium and Phalaenopsis are successively sister to a monophyletic Corallorhiza, with Co. striata vreelandii and Co. trifida successively diverging from the ancestor of the remaining members of Corallorhiza. Nested within this clade, (Co. odontorhiza, Co. wisteriana) is sister to the Co. maculata complex. The Mexican Co. bulbosa is sister to all other members of the Co. maculata complex with 100% support, whereas within the latter, the Mexican Co. macrantha and Co. maculata mexicana are sister to one another, and this clade is collectively sister to a northern North American clade of (Co. mertensiana, (Co. maculata maculata, Co. maculata occidentalis)).

Analyses of Selective Regime

Various branch models were constructed to explicitly test the hypothesis of different ω-ratios for different branches, specifically for nongreen Corallorhiza. Only loci for which all taxa had intact reading frames were included. The most basic model (M0) is a “one-ratio” model, which assumes a single ω-ratio across all branches (here, ω_leafy = ω_cor = ω_mac-ng = ω_str; see table 3 for branch-based abbreviations of ω-ratios). Model M1, which allows all of Corallorhiza to have a different ω-ratio than the leafy outgroups (ω_leafy ≠ ω_cor = ω_mac-ng = ω_str), has a significantly better fit for atp genes (table 3). Also for atp genes, a three-ratio model (M2), allowing different ω-ratios in Corallorhiza and additionally for nongreen Corallorhiza as opposed to leafy outgroups, has a significantly better fit than the nested M1 model. Model M4, which is a two-ratio model specifying different ω-ratios for leafy outgroups + green Corallorhiza as opposed to nongreen Corallorhiza (ω_leafy = ω_cor ≠ ω_mac-ng = ω_str), is highly significant compared with the one-ratio model (M0) after Bonferroni correction. Photosynthesis-related genes (excluding atp genes) have a significantly better fit for models M2 (ω_leafy ≠ ω_cor ≠ ω_mac-ng = ω_str) and M4 (ω_leafy = ω_cor ≠ ω_mac-ng = ω_str) than M0, but these were nonsignificant after Bonferroni correction. Regardless, the branch-specific estimate (model M4) of ω in nongreen taxa (ω_mac-ng = ω_str = 0.73720) was more than twice that of green taxa (ω_leafy = ω_cor = 0.27361), approaching a pattern expected under selective neutrality for photosynthesis-related genes (ω ~ 1). “Housekeeping” genes show significantly better fit for M1 and M4, but these again were nonsignificant after Bonferroni correction for multiple comparisons. None of the branch models had a significantly better fit in nested comparisons for ycf1 + ycf2 or for matK.
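The nested-model comparisons described above rest on a likelihood ratio test: twice the difference in log-likelihood between two nested models is compared against a chi-square distribution with 1 degree of freedom. The sketch below is a worked illustration rather than the authors' code. The function names are hypothetical, and the log-likelihoods are the atp-gene values for M0 and M1 as read from table 3. For 1 degree of freedom the chi-square tail probability can be written with the complementary error function, so only the standard library is needed.

```python
import math


def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Twice the log-likelihood gain of the alternative over the nested null."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf_df1(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# atp genes, one-ratio model (M0) versus two-ratio model (M1); values from table 3.
lnl_m0 = -9802.76868
lnl_m1 = -9798.12794

stat = lrt_statistic(lnl_m0, lnl_m1)
p_value = chi2_sf_df1(stat)

# Ratio of the M4 omega estimates for photosynthesis-related genes
# (nongreen versus green taxa), as quoted in the text.
omega_fold = 0.73720 / 0.27361

if __name__ == "__main__":
    print(f"2 * delta lnL = {stat:.5f}, P = {p_value:.4f}")
    print(f"omega (nongreen) / omega (green) = {omega_fold:.2f}")
```

The same two functions apply to any other pair of nested models in table 3. A Bonferroni-style correction would then compare each P value against a threshold divided by the number of comparisons, a step left out of this sketch because the exact correction used is not fully specified here.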

Branch-site models were applied in the atp complex to test for evidence of positive selection. First, for a model in which all Corallorhiza were specified as a foreground clade, the alternative model, which allows some sites to be under positive selection (ω2 > 1), did not fit the data significantly better than the null model, which allows no sites to be under positive selection (ω1 = ω2 = 1; 2Δℓ = 0.6, df = 1, P = 0.4386). However, the alternative model had a significantly better fit than the null model when specifying nongreen Corallorhiza taxa as foreground branches, after Bonferroni correction for multiple comparisons (2Δℓ = 11.4, df = 1, P = 0.0007), suggesting that some sites in the atp complex are under positive selection.

Discussion

Chlorophyll Content of Corallorhiza

Some Corallorhiza and many heterotrophic plants are often referred to as “achlorophyllous” based on their lack of visible green coloration. However, all Corallorhiza included in this study contain at least some detectable levels of chlorophyll (fig. 1). Montfort and Küsters (1940) detected chlorophyll and photosynthetic activity in Co. trifida, which is not surprising given the predominantly green coloration of the above-ground organs of this species; a later study by Cummings and Welschmeyer (1998) detected some levels of chlorophyll in the nongreen Co. maculata, as well as in another nongreen, strictly heterotrophic orchid, Cephalanthera austiniae. Taxa with at least some visible green tissue, on average, had approximately a 10-fold higher chlorophyll concentration than nongreen taxa (fig. 1). Although leafless, Co. trifida is the greenest of the coralroots, with green pigmentation throughout nearly the entire above-ground portion of the plant. The distribution of green tissues among other partially green coralroot species is more variable, but the common theme is that at least some green tissue is noticeable in or around the ovary.

Coupled with the fact that green or partially green coralroots also show a general lack of degradation in photosynthesis-related plastid gene systems (with the exception of the ndh complex and a few other “minor” examples; see below), these observations suggest that the aforementioned “green” coralroot taxa may indeed all be partially heterotrophic, relying to some small degree on photosynthetic carbon. Though it has been demonstrated that Co. trifida is an inefficient photosynthesizer (i.e., net photosynthetic carbon assimilation cannot compensate for net respiration), there is likely to be an adaptive reason that photosynthesis persists in this species, and possibly in others such as Co. odontorhiza and Co. wisteriana. Ironically, Co. trifida is the one coralroot species that might be expected to depend most heavily on photosynthesis (Zimmer et al. 2008; Cameron et al. 2009). One possibility is that these species supplement fungal carbon uptake with photosynthetic carbon in specific tissues only, mainly in the ovary, inside of which hundreds or even thousands of minute “dust seeds” are produced. This has been experimentally demonstrated in the partially mycoheterotrophic orchid Limodorum abortivum, with the additional findings that photosynthesis is highest in ovary tissue and increases under experimentally induced fungal carbon limitation (Bellino et al. 2014). It is currently unknown if any temporal fungal carbon limitations occur throughout the short season when Corallorhiza displays aboveground growth (typically late spring for flowering to summer for seed/ovary development). Based on current knowledge, it is hypothesized that for vegetatively reduced orchids such as “green” Corallorhiza and Limodorum, compensation of carbon in and around developing ovaries seems to have some adaptive value, which in turn would cause purifying selection on the plastid- (and presumably nuclear-) encoded components of photosynthesis.

The more puzzling finding applies to nongreen taxa: nongreen coralroots had surprisingly consistent (albeit relatively minute) chlorophyll concentrations, with very little variation within or among species (fig. 1), despite having been sampled from different parts of their respective geographic ranges. Chlorophyll has been detected in other strictly heterotrophic plants (Cummings and Welschmeyer 1998), including members of Orobanchaceae and the monotropoid Ericaceae. The question naturally arises: Why would putatively nonphotosynthetic taxa continue to produce chlorophyll? There are several hypotheses as to why this might be the case, including photoprotection/avoidance of light damage to DNA and other cellular components, involvement of the chlorophyll biosynthesis machinery in other biochemical pathways such as plastid-nuclear signaling, or prevention of the buildup of chlorophyll precursors that may lead to increased oxidative damage (discussed in Wickett et al. 2011 and references therein). In nearly all cases studied, strictly parasitic species

Table 3. Branch-Models for Coding Regions of the Plastome in Corallorhiza for Which All Accessions Have Intact Reading Frames, Based on Various Configurations of Coding Data Implemented in the CODEML Module of PAML.

| Model | Omega Estimates | np(a) | ln L(b) | 2Δℓ(c) | P/Pcorr(d) |
|---|---|---|---|---|---|
| atp genes | | | | | |
| M0 | ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.21505 | 26 | −9802.76868 | | |
| M1 | ω_leafy = 0.15175; ω_cor = ω_mac-ng = ω_str = 0.26949 | 27 | −9798.12794 | 9.281476 | |
| M2 | ω_leafy = 0.15211; ω_cor = 0.22238; ω_mac-ng = ω_str = 0.38239 | 28 | −9795.68827 | 4.879344 | n.s. |
| M3 | ω_leafy = 0.15212; ω_cor = 0.22242; ω_mac-ng = 0.41999; ω_str = 0.36243 | 29 | −9795.62404 | 0.128448 | |
| M4 | ω_leafy = ω_cor = 0.18380; ω_mac-ng = ω_str = 0.38274 | 27 | −9797.32610 | 10.885142 | |
| M5 | ω_leafy = ω_cor = 0.18382; ω_mac-ng = 0.41997; ω_str = 0.36292 | 28 | −9797.26308 | 0.126042 | |
| Photosynthesis genes (pet, psa/b, rbcL) | | | | | |
| M0 | ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.33866 | 26 | −1854.62551 | | |
| M1 | ω_leafy = 0.29790; ω_cor = ω_mac-ng = ω_str = 0.32741 | 27 | −1854.48900 | 0.273012 | |
| M2 | ω_leafy = 0.29856; ω_cor = 0.25277; ω_mac-ng = ω_str = 0.73721 | 28 | −1852.32692 | 4.324170 | n.s. |
| M3 | ω_leafy = 0.29858; ω_cor = 0.25278; ω_mac-ng = 0.59352; ω_str = 0.88556 | 29 | −1852.22042 | 0.213002 | |
| M4 | ω_leafy = ω_cor = 0.27361; ω_mac-ng = ω_str = 0.73720 | 27 | −1852.39825 | 4.454520 | n.s. |
| M5 | ω_leafy = ω_cor = 0.27362; ω_mac-ng = 0.59352; ω_str = 0.88550 | 28 | −1852.29177 | 0.212952 | |
| Housekeeping genes (rps/l, infA, accD, clpP, etc.) | | | | | |
| M0 | ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.30179 | 26 | −25545.1250 | | |
| M1 | ω_leafy = 0.29055; ω_cor = ω_mac-ng = ω_str = 0.32343 | 27 | −25542.7794 | 4.691148 | n.s. |
| M2 | ω_leafy = 0.29066; ω_cor = 0.31584; ω_mac-ng = ω_str = 0.34192 | 28 | −25542.6470 | 0.264796 | |
| M3 | ω_leafy = 0.29065; ω_cor = 0.31583; ω_mac-ng = 0.33612; ω_str = 0.34612 | 29 | −25542.6407 | 0.012646 | |
| M4 | ω_leafy = ω_cor = 0.30267; ω_mac-ng = ω_str = 0.34230 | 27 | −25542.8978 | 4.454430 | n.s. |
| M5 | ω_leafy = ω_cor = 0.30269; ω_mac-ng = 0.33612; ω_str = 0.34668 | 28 | −25542.8908 | 0.014052 | |
| ycf1 & 2 | | | | | |
| M0 | ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.66003 | 26 | −25384.8570 | | |
| M1 | ω_leafy = 0.82314; ω_cor = ω_mac-ng = ω_str = 0.61213 | 27 | −25383.1494 | 3.415136 | |
| M2 | ω_leafy = 0.82370; ω_cor = 0.61073; ω_mac-ng = ω_str = 0.61543 | 28 | −25383.1485 | 0.001854 | |
| M3 | ω_leafy = 0.82338; ω_cor = 0.61050; ω_mac-ng = 0.59460; ω_str = 0.63131 | 29 | −25383.1215 | 0.053994 | |
| M4 | ω_leafy = ω_cor = 0.67331; ω_mac-ng = ω_str = 0.61464 | 27 | −25384.6635 | 0.386918 | |
| M5 | ω_leafy = ω_cor = 0.67313; ω_mac-ng = 0.59296; ω_str = 0.62959 | 28 | −25384.6382 | 0.050592 | |
| matK | | | | | |
| M0 | ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.43481 | 26 | −3666.30553 | | |
| M1 | ω_leafy = 0.44943; ω_cor = ω_mac-ng = ω_str = 0.42324 | 27 | −3666.27784 | 0.055394 | |
| M2 | ω_leafy = 0.45060; ω_cor = 0.38642; ω_mac-ng = ω_str = 0.53159 | 28 | −3665.93065 | 0.694374 | |
| M3 | ω_leafy = 0.45107; ω_cor = 0.38419; ω_mac-ng = 0.33356; ω_str = 0.99284 | 29 | −3664.63804 | 2.585212 | |
| M4 | ω_leafy = ω_cor = 0.41948; ω_mac-ng = ω_str = 0.53022 | 27 | −3666.08306 | 0.444942 | |
| M5 | ω_leafy = ω_cor = 0.41857; ω_mac-ng = 0.33352; ω_str = 0.98601 | 28 | −3664.80458 | 2.556974 | |

NOTE: Abbreviations for ω-ratios: leafy = green, leafy outgroups (Cymbidium, Oncidium, Phalaenopsis); cor = all Corallorhiza; mac-ng = nongreen Co. maculata (i.e., Co. mertensiana, Co. maculata maculata, Co. maculata occidentalis); str = Co. striata. Successively nested models were compared based on log-likelihood ratio tests (M1 vs. M0, M2 vs. M1, M3 vs. M2, M4 vs. M0, M5 vs. M4); all comparisons had 1 degree of freedom.
*P < 0.05; **P < 0.01; ***P < 0.001; “n.s.” = not significant (where indicated); otherwise blank space indicates no significance.
(a) The number of free parameters for a particular model.
(b) Log-likelihood of the data for a particular model.
(c) The chi-square distributed log-likelihood ratio test statistic used to evaluate significant differences in model fit.
(d) P values for uncorrected/Bonferroni-corrected χ² tests, where the correction (Pcorr) is based on the number of branch parameters in the model being tested.


(including the nongreen coralroots) that still synthesize detectable levels of chlorophyll do so at reduced levels. The fact that all nongreen coralroots contain relatively low chlorophyll concentrations means that these pigments are masked, and thus the overall coloration of these species tends to be red, purple, or brown, most likely due to anthocyanins occurring at higher concentrations (Freudenstein and Doyle 1994a; personal observation). Occasionally, yellow variants occur among populations of the various coralroot species (Freudenstein 1997; Barrett CF, personal observation). These are presumably recurrent, anthocyanin-deficient mutants (yellow coloration may be due to persistence of chlorophylls and other yellow/orange carotenoid pigments), though no specific studies have been carried out to demonstrate the causes of this phenomenon in Corallorhiza. Regardless, reduced chlorophyll concentrations and the resulting apparent lack of visible green coloration in Corallorhiza are excellent indicators of relaxed selective constraints on photosynthetic machinery, and this is likely to be the case for many other parasite-containing taxa.

Characteristics of Corallorhiza Plastomes

With perhaps the exception of Co. striata vreelandii, plastomes among coralroot taxa do not altogether display the major size reduction that would be expected for a group of leafless parasites, and thus can be said to occupy the early stages along the path to a highly reduced or “minimal” plastid genome (Barbrook et al. 2006; Krause 2008). In a previous study, it was hypothesized that members of the Co. striata complex would have the most reduced plastomes in the genus in terms of overall size (Barrett and Davis 2012), but even the plastome of Co. striata vreelandii (the representative sequenced in that study and included here) only represents an approximately 6% total genome size reduction relative to the leafy Phalaenopsis (Chang et al. 2006). In fact, Co. macrantha and Co. maculata mexicana (table 2) both have larger plastomes than two of the three leafy outgroup orchid taxa, Phalaenopsis (148,964 bp) and Oncidium (146,484 bp). Thus, the shift to leaflessness per se in Corallorhiza does not appear to be associated with a reduction in plastome size for the genus overall (at least for partially green taxa), although an examination of plastomes for the closest relatives of Corallorhiza (e.g., Oreorchis, Aplectrum, Cremastra, Govenia, Calypso) in tribe Calypsoeae will help clarify this.

A reduction in plastome size due to deletions and mutations resulting in pseudogenes is expected to be associated with a decrease in GC content, as once-functional (and relatively GC-rich) genes become riddled with loss-of-function mutations over time as a result of relaxed photosynthesis-associated selective pressures. These coding regions begin to resemble noncoding DNA, which typically has a lower GC content (e.g., Bernardi 1989; Glémin et al. 2014). This is apparently the case for Co. striata vreelandii (34.02% GC; table 2), which is 1.16% lower than the mean GC content for the remaining coralroot plastomes (35.28%). Mean GC content for the nongreen members of the Co. maculata complex is 35.10%, and no different than that found in the green coralroots. This suggests that large-scale differences in plastid GC content may take longer to manifest, and thus likely come about later in the process of plastid modification relative to other processes such as photosynthesis gene deletions and other initial gene mutations causing loss of function. This lends additional support to the notion that Co. striata vreelandii is farther “down the path” of plastid genome degradation relative to the nongreen members of the Co. maculata complex.
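The GC-content comparison above rests on a simple per-sequence calculation. The following is a minimal sketch of it, assuming GC content is computed over unambiguous bases only; the function name and the toy sequences are illustrative and are not taken from the study.

```python
def gc_content(seq: str) -> float:
    """Return GC content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


# Toy sequences (hypothetical, not real plastome data): an intact,
# relatively GC-rich reading frame versus an AT-enriched, decayed copy.
intact = "ATGGCCGGTCTGCAAGCGTAA"
decayed = "ATGATAAGTCTAAAAGCATAA"
print(f"intact: {gc_content(intact):.2f}%  decayed: {gc_content(decayed):.2f}%")
```

Applied to whole plastomes, the same calculation would yield values comparable to those reported in table 2.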

More broadly, the leafy orchids Phalaenopsis aphrodite and the hybrid Oncidium Gower Ramsey have GC contents of 35.6% and 37.32%, respectively (Chang et al. 2006; Wu et al. 2010); these values are very similar to what is found in Co. trifida and other green coralroots, and thus green coralroots do not seem to deviate much from other leafy orchids. Members of holo- and hemiparasitic Orobanchaceae (Wicke et al. 2013) represent an interesting clade for comparison with Corallorhiza in terms of GC content. Orobanchaceae range from 38.08% GC in the hemiparasitic Schwalbea to 31.09% in the holoparasitic Phelipanche purpurea. Variation in GC content among plastomes of Corallorhiza lies completely within the bounds of that observed in Orobanchaceae, which is hypothesized to be in the relatively more advanced stages of degradation.

Structural Evolution of the Plastome: Pseudogenes and Losses

ndh Genes

Degradation of the plastid-encoded ndh complex is advanced in Corallorhiza, as has been observed in other groups (e.g., Chang et al. 2006; Funk et al. 2007; McNeal et al. 2007; Braukmann et al. 2009; Blazier et al. 2011; Braukmann and Stefanović 2012; Braukmann et al. 2013; Iles et al. 2013; Peredo et al. 2013; Wicke et al. 2013), including some of the leafy, photosynthetic orchid plastomes included in figure 3. Interestingly, there is variation among the orchid plastomes sequenced to date with respect to the status of ndh genes (Chang et al. 2006; Wu et al. 2010; Delannoy et al. 2011; Barrett and Davis 2012; Logacheva et al. 2011; Yang et al. 2013), suggesting multiple independent degradation pathways in this complex in the orchid family.

The ancestor of Corallorhiza likely already experienced some degree of degradation in the ndh complex, as evidenced by several shared pseudogenes. Sequencing of additional orchid plastomes could reveal an evolutionary transition from a functional plastid-encoded ndh complex in leafy relatives (e.g., Aplectrum, Oreorchis, Cremastra, Govenia) to a largely degraded state as observed in Corallorhiza. However, there is even some variation in the degree of degradation among certain members of ndh within Corallorhiza, which in part reflects the plastid phylogeny (fig. 2). This pattern includes a major deletion in ndhE in all taxa but Co. striata vreelandii; a shared pseudogene for ndhJ for all members of the Co. maculata complex besides Co. bulbosa (which here is sister to the remaining members of the Co. maculata complex; fig. 2); a loss of the same gene in (Co. odontorhiza, Co. wisteriana); and hypothesized parallel deletions of the ndhG pseudogene in Co. bulbosa and the clade of (Co. macrantha, Co. maculata mexicana).

The plastid-encoded ndh complex is believed to be dispensable for a number of reasons (reviewed in Martín and Sabater 2010; Wicke et al. 2011, 2013), which may differ for various groups of plants displaying evidence of degradation in this gene complex. Generally, the ndh complex is believed to be essential for photosynthetic function in environments with highly variable light intensities, by functioning in regulation of cyclic electron transport and also in mitigating the effects of photo-oxidative damage under high light intensities (Martín et al. 2009; Martín and Sabater 2010). In the ancestor of Corallorhiza, selective constraints on the ndh complex may have been lifted by a combination of decreased dependence upon photosynthetic carbon as a result of a shift to heavy dependence upon fungal carbon (a similar situation is also observed in carnivorous plants using animals as a carbon source in the family Lentibulariaceae; Wicke et al. 2014) and a tendency to grow in dark, shaded, mature forests where light intensity is always low. This may also occur in other orchid lineages (including leafy, photosynthetic orchids) due to high dependence on fungal carbon as seedlings, and also the tendency to live in habitats with relatively low, or at least static, amounts of light intensity (e.g., a shaded rainforest understory or floor).

Photosynthesis-Related Genes (excluding ndh, rpo, and atp)

Nongreen members of Corallorhiza, that is, those with relatively reduced levels of chlorophyll (fig. 1), all display evidence of degradation of genes directly involved in photosynthetic processes (fig. 2; e.g., psa/b, pet, rbcL). Of these taxa, Co. striata vreelandii displays the highest number of degraded genes (table 2; figs. 2 and 3), as reported in Barrett and Davis (2012), followed by northern North American members of the Co. maculata complex (Co. mertensiana, Co. maculata maculata, Co. maculata occidentalis). Both of these complexes (Co. striata, Co. maculata) have independently undergone degradation among genes in the following genes/gene complexes (omitting ndh, see above): pet, psa/b, rpo, ccsA, cemA, and rbcL. It is highly unlikely that transcript editing of any kind can compensate for these changes, and it can be assumed that these changes are irreversible. The possibility exists, however slight, that all deleted genes or pseudogenes among the plastomes of these taxa are redundant and that functional copies are encoded in the nuclear genome, or that genes have been “regained” through intergenomic or horizontal transfer (e.g., Iorizzo et al. 2012; Straub et al. 2013; Wicke et al. 2013).

Based on the degraded states of photosynthesis-related genes among these plastomes, it can be concluded that the aforementioned nongreen taxa are strict parasites of fungi, incapable of photosynthesis, and thus the terms “holomycotroph” or “full mycoheterotroph” are indeed appropriate. No studies of photosynthetic activity have been carried out in Corallorhiza aside from those in the “greenest” coralroot, Co. trifida (Zimmer et al. 2008; Cameron et al. 2009). It will be beneficial to measure photosynthetic rates (or determine the lack thereof) along with plastid-gene transcript levels among multiple individuals of both green and nongreen coralroot taxa, preferably growing in close proximity to one another and in similar habitats, to definitively characterize photosynthetic capacity in the genus.

Though green members of Corallorhiza do not display significant degradation of photosynthesis-related genes (psa, psb, pet, rbcL, etc.), there are a few important exceptions, which may or may not affect the efficacy of photosynthesis in these taxa. First, members of Corallorhiza excluding Co. trifida and Co. striata have experienced a large deletion in psbM (PSII protein M). This low molecular weight peptide is believed to play a role in maintaining the stability of PSII dimers; psbM-deficient tobacco mutants displayed impaired stability of PSII but were still able to function (Umate et al. 2007; Kawakami et al. 2011). Photosynthesis has only been explicitly measured and demonstrated in Co. trifida (Zimmer et al. 2008; Cameron et al. 2009), which has a putatively functional copy of psbM. Thus, it is possible that either 1) other “green” coralroots (Co. odontorhiza, Co. wisteriana, Co. bulbosa, Co. macrantha, Co. maculata mexicana) carry out inhibited photosynthesis; 2) this gene has been transferred to the nucleus or mitochondrion and its product is imported into the plastid, allowing stabilization of the PSII dimer complex and some level of photosynthetic function; or 3) a nonfunctional psbM gene has no effect on photosynthetic function.

Another partially green coralroot, Co. macrantha (restricted to high elevation forests in southern Mexico), has a pseudogenized copy of psaI (Photosystem I Reaction Center Subunit VIII) due to a 4-bp insertion near the 5′-end of the gene relative to other coralroots, causing a reading frame shift. The psaI protein is believed to function in the stabilization of a nuclear-encoded subunit of the same complex (psaL) in Arabidopsis thaliana, and has also been shown to bind to the Light-harvesting Complex II (Jensen et al. 2007); it was suggested that the latter putative function is redundant. In Corallorhiza, this might explain why psaI in Co. macrantha has apparently become a pseudogene. Thus, among the green coralroots, Co. macrantha may represent yet another very early step in the transition to the loss of photosynthesis, or at least a reduction in photosynthetic efficacy, but this remains a hypothesis to be tested.
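To illustrate why a 4-bp insertion is expected to disrupt a gene such as psaI, the sketch below compares a toy reading frame before and after insertions of 4 bp and 3 bp. The sequence, insertion position, and function names are hypothetical and chosen only to show the mechanism; they are not the Co. macrantha psaI sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq: str) -> list:
    """Split a sequence into complete codons, reading from position 0."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def first_stop_index(seq: str):
    """Return the codon index of the first in-frame stop, or None if absent."""
    for i, codon in enumerate(codons(seq)):
        if codon in STOP_CODONS:
            return i
    return None


def insert_bases(seq: str, pos: int, bases: str) -> str:
    """Insert bases into seq immediately before position pos."""
    return seq[:pos] + bases + seq[pos:]


# Hypothetical 8-codon reading frame ending in a single stop codon.
orf = "ATGGCTGACTATAGCCTGAAATAA"
shifted = insert_bases(orf, 3, "GGGG")   # 4 bp: not a multiple of 3
in_frame = insert_bases(orf, 3, "GGG")   # 3 bp: frame preserved

for label, seq in [("original", orf), ("+4 bp", shifted), ("+3 bp", in_frame)]:
    print(label, codons(seq), "first stop at codon", first_stop_index(seq))
```

In this toy case the 4-bp insertion changes every downstream codon and introduces a premature stop, whereas the 3-bp insertion only adds one codon.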

rpo Genes

There is evidence for rpo pseudogenes in both the nongreen Co. striata and Co. maculata complexes, based on large deletions, reading frame shifts, and internal stop codons. Plastid genes are transcribed by two polymerases in monocots: a nuclear-encoded polymerase and a plastid-encoded polymerase (Liere et al. 2011). The plastid-encoded RNA polymerase (PEP) is believed to preferentially transcribe photosynthesis-related genes, whereas the rpoB-C1-C2 operon itself is transcribed by a nuclear-encoded RNA polymerase (Hajdukiewicz et al. 1997; reviewed in Liere et al. 2011; Krause 2012). In addition to transcribing photosynthesis-related genes for the most part, PEP also tends to be more active in later leaf developmental stages (Hajdukiewicz et al. 1997; reviewed in Liere et al. 2011).

Based on this latter finding, it might be expected that all members of Corallorhiza should show some degradation in rpo genes, as leaves do not fully develop in any species within this genus (i.e., only the basal leaf sheath persists, but the blade or lamina never develops), presumably obviating the role of PEP in transcription of photosynthesis-related genes (e.g., psa, psb) in leaves. However, green coralroots have intact reading frames for all rpo genes, whereas nongreen coralroots show varying degrees of degradation in the rpo complex. As with photosynthesis-related genes in Corallorhiza (figs. 2 and 3), it does not appear to be the transition to leaflessness per se that has resulted in degradation of the rpo complex, but more likely the shift from partial heterotrophy (and the associated need to carry out photosynthesis, whatever small amount) to strict heterotrophy. An interesting parallel occurs in the carnivorous family Lentibulariaceae, in which rpo genes display significant departures from purifying selection (Wicke et al. 2014). An interesting follow-up would be to determine whether degradation first begins among photosystem genes (psa, psb), diminishing the importance of having a functional PEP complex, or whether the opposite is the case, in which a dysfunctional (or nonexistent) PEP has contributed to relaxed selective pressures associated with photosynthesis-related genes (i.e., if psa and psb genes can no longer be transcribed, or are transcribed by an inefficient PEP, then is there any benefit to maintaining them?). Regardless, the degradation of rpo genes in nongreen Corallorhiza closely mirrors that in genes directly involved in photosynthesis (psa/b, rbcL, etc.).

Relationships in Corallorhiza

Complete plastome sequences of Corallorhiza yielded a highly resolved, highly supported tree consistent with previous studies (Freudenstein and Doyle 1994a, 1994b; Barrett and Freudenstein 2008; Freudenstein and Senyo 2008). Here, for the first time, relationships among the six taxa that comprise the Co. maculata complex (as recognized in Freudenstein 1997) are fully resolved, with high support for Co. bulbosa as sister to the rest of the complex (bootstrap = 91%), whereas previous studies could not resolve relationships in this complex (Barrett and Freudenstein 2008; Freudenstein and Senyo 2008). Complete plastomes also confirm the sister relationship of Co. striata to the rest of Corallorhiza, with Co. trifida (moving away from the base of the tree), then Co. odontorhiza and Co. wisteriana diverging successively. Depending on the data set analyzed, Co. bulbosa appears in a few different places, most likely due to homoplasy and few variable characters available to resolve the position of this species in smaller data sets, as evidenced by short internal branches (supplementary fig. S3, Supplementary Material online). Thus, inclusion of complete plastomes (pseudogenes, spacers, and introns) resolves the otherwise uncertain position of this species. Nuclear rDNA, on the other hand, places Co. bulbosa as sister to (Co. macrantha, Co. maculata mexicana) with moderate support, a similar finding to the combined analysis of Barrett and Freudenstein (2008) based on plastid rbcL + nuclear ITS, though that study did not include Co. macrantha. Combining all nuclear and plastid data places Co. bulbosa as sister to the remaining members of the Co. maculata complex with 100% bootstrap support (supplementary fig. S3, Supplementary Material online). The complete plastome + rDNA data set fully resolves relationships among the major species complexes of Corallorhiza (excluding members not sampled from some complexes).

In particular, relationships within the Co. maculata complex shed light on a phylogeographically intriguing aspect of plastome degradation. All Mexican members of the Co. maculata complex have at least some visible green tissue and are presumed to be partially mycoheterotrophic, based on conservation of plastid-encoded photosynthetic apparatus (for the most part; see fig. 2). Those members of the Co. maculata complex that occur in northern North America have no visible green tissue (two of these species were shown to have highly reduced chlorophyll levels; fig. 1) and display evidence of degradation in these genes. The topology of the Co. maculata complex based on combined plastome + rDNA data (supplementary fig. S3, Supplementary Material online; fig. 2) lends support to a previous hypothesis that this complex (and possibly the genus Corallorhiza itself) originated in southern Mexico (Freudenstein and Doyle 1994a, 1994b; Barrett and Freudenstein 2008; Freudenstein and Senyo 2008) and expanded into northern North America, with the ancestor of Co. mertensiana, Co. maculata maculata, and Co. maculata occidentalis undergoing a shift from partial to strict heterotrophy (and continuing this trend among extant lineages), and thus presumably losing the ability to carry out photosynthesis. This suggests that deeper sampling among individuals in members of both the Co. maculata complex and the Co. striata complex (which is here represented only by Co. striata vreelandii) will yield a much clearer picture of plastome degradation in Corallorhiza on a phylogeographic scale.

Selective Regime among Genes with Open Reading Frames

Model M4, allowing green and nongreen taxa to have different $\omega$-ratios ($\omega_{\text{leafy}} = \omega_{\text{cor}}$; $\omega_{\text{mac-ng}} = \omega_{\text{str}}$), fit the data significantly better than the one-ratio model for both housekeeping genes and photosynthesis-related genes, suggesting different selective scenarios for nongreen taxa relative to green taxa, but these comparisons are nonsignificant when corrected for multiple comparisons (table 3). For housekeeping genes, $\omega$ values for green versus nongreen taxa did not differ by much ($\omega_{\text{leafy}} = \omega_{\text{cor}} = 0.30267$; $\omega_{\text{mac-ng}} = \omega_{\text{str}} = 0.34230$; table 3). However, this difference is much more pronounced for photosynthesis-related genes ($\omega_{\text{leafy}} = \omega_{\text{cor}} = 0.27361$; $\omega_{\text{mac-ng}} = \omega_{\text{str}} = 0.73720$), with the estimate for nongreen taxa being approximately 2.6 times that of green taxa, and approaching a value expected under neutral evolution ($\omega \approx 1$). Thus, it is evident that even though some photosynthesis genes in nongreen taxa still have intact reading frames, there may be a relaxation of selective pressure associated with these genes.
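The model comparison described above can be sketched as a likelihood ratio test between nested models, followed by a correction for multiple comparisons. The sketch below makes several assumptions that are not stated in the text: that M4 differs from the one-ratio model by a single free parameter (1 degree of freedom), that a Bonferroni correction is used, and that the log-likelihoods shown are hypothetical placeholders. Only the $\omega$ estimates are taken from the passage.

```python
import math


def omega_fold_change(omega_nongreen: float, omega_green: float) -> float:
    """Ratio of the nongreen to the green omega (dN/dS) estimate."""
    return omega_nongreen / omega_green


def lrt_df1(lnl_null: float, lnl_alt: float):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns (test statistic, p-value). For 1 degree of freedom, the
    chi-square survival function equals erfc(sqrt(x / 2)).
    """
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    return stat, math.erfc(math.sqrt(stat / 2.0))


def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni-adjusted p-value (assumed correction, capped at 1)."""
    return min(1.0, p_value * n_tests)


# Omega estimates quoted in the text for model M4.
print("photosynthesis:", omega_fold_change(0.73720, 0.27361))
print("housekeeping:  ", omega_fold_change(0.34230, 0.30267))

# Hypothetical log-likelihoods, for illustration only.
stat, p = lrt_df1(lnl_null=-10250.0, lnl_alt=-10247.5)
print("2*dlnL =", stat, "p =", p, "adjusted p =", bonferroni(p, n_tests=3))
```

With the placeholder values, a comparison that is significant on its own can become nonsignificant after adjustment, which is the pattern the text reports for housekeeping and photosynthesis-related genes.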

For atp genes, Models M1 and M4 fit the data significantly better than does M0, even after correcting for multiple comparisons, suggesting different selective regimes in Corallorhiza versus leafy, green outgroups, or at least in nongreen Corallorhiza versus green Corallorhiza and leafy outgroups. For example, in model M1, the $\omega$ value for Corallorhiza is nearly twice that for leafy orchids ($\omega_{\text{leafy}} = 0.15175$; $\omega_{\text{cor}} = \omega_{\text{mac-ng}} = \omega_{\text{str}} = 0.26949$), and in model M4, nongreen Corallorhiza have $\omega$ values over twice those for green Corallorhiza and leafy orchids. Further investigations using branch-site models, allowing some sites in the “foreground” branches to be under positive selection, suggest evidence for positive selection in nongreen Corallorhiza for the atp complex.

The plastid-encoded atp genes play a critical role in photosynthesis by both generating ATP and translocating protons across the thylakoid membrane (Evron et al. 2000; McCarty et al. 2000; Allen et al. 2011). However, in taxa that display evidence of loss-of-function for photosystem and other photosynthesis-related genes, the fact that atp genes are preserved as intact reading frames suggests that they may play additional roles in plastid function (discussed in Wicke et al. 2013). This is bolstered by the fact that this complex shows evidence of positive selection in strictly heterotrophic, nongreen taxa, but not for Corallorhiza as a whole. This would support the hypothesis that the atp complex has either taken on additional, as-of-yet unknown roles in holomycotrophic plastids, or at a minimum has fallen under increased selective pressure to adapt to the overall “alternative lifestyle” associated with strict fungal heterotrophy in nonphotosynthetic plastids (including in amyloplasts, chromoplasts, elaioplasts, proteinoplasts, etc.). A future research objective is to examine the potential biological roles and molecular evolutionary patterns of the atp complex in nonphotosynthetic plants and their green, photosynthetic relatives, as members of this gene complex are often preserved and putatively functional among various heterotrophic angiosperms (see fig. 3 and references therein, including discussion in Wicke et al. 2013).

Comparison of Corallorhiza with Other Parasitic Angiosperms Using the Model of Barrett and Davis (2012)

A comparison of the plastomes of several heterotrophic taxa that have been sequenced to date shows that Corallorhiza is not quite in the “advanced” stages of plastid genome degradation relative to several taxa in Orobanchaceae (e.g., Wicke et al. 2013) or other orchids such as Neottia (Logacheva et al. 2012) and Rhizanthella (Delannoy et al. 2011) (fig. 3). Interpreted in the context of the basic model presented in Barrett and Davis (2012), members of Corallorhiza fall into categories 1 and 3. Taxa with some visible green tissue display pseudogenes/deletions mainly in the ndh complex (category 1; gene-class categories are listed at the top of fig. 3), with a few additional changes, whereas those with no green tissue have experienced pseudogenization and deletions in ndh, photosynthesis-related, and rpo genes, but not in atp genes or other “housekeeping” genes (category 3, excepting the loss of a redundant trnT-GGU in Co. striata vreelandii; Barrett and Davis 2012; figs. 2 and 3). Thus, nonparasitic autotrophs and partial heterotrophs roughly fall either into category 0 (no degradation of plastid genes) or category 1 (degradation of the ndh complex only). More specifically, all partial heterotrophs fall into category 1, with evidence of degradation of the ndh complex.

Strict heterotrophs fall into more “advanced” categories. Several, but not all, members of Orobanchaceae and two orchids (Neottia and Rhizanthella) can be classified in category 5, in which there is evidence of degradation of atp genes and at least some evidence of degradation in “housekeeping” genes (e.g., rpl, rps, trn, etc.). There are some genes for which no taxa have experienced pseudogenization or loss, including rpl2, rpl16, rpl20, rps2, rps4, rps8, rps11, ycf1, ycf2, trnD-guc, trnE-uuc, trnI-cau, trnM-cau, trnQ-uug, trnY-gua, rrn16, rrn23, rrn4.5, and rrn5. This suggests that they may be indispensable for plastid function, at least in the angiosperms included in this comparison, and suggests that a “minimal set” of genes might be necessary for functional plastid “housekeeping” and perhaps other functions outside of photosynthesis and ATP production.

Overall, no taxa fell in categories 2 or 4, suggesting that degradation of the rpo genes occurs more or less simultaneously with degradation in photosynthesis-related genes, and that degradation of atp genes occurs along with that in “housekeeping” genes (also discussed in Wicke et al. 2013). Thus, the model of Barrett and Davis (2012) could be modified in light of more recently sequenced plastomes, including those of Corallorhiza, having five general categories: 1) no degradation; 2) degradation in the ndh complex only; 3) degradation of the ndh complex and photosynthesis-related genes, including rpo genes; 4) degradation in all plastid gene systems, including “housekeeping” and atp genes; and 5) complete or nearly complete loss of the plastid genome (as is hypothesized to have occurred in some strict heterotrophs such as Corynaea and Rafflesia [Nickrent et al. 1997; Molina et al. 2014]).
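The five categories of the modified model can be expressed as a simple decision rule. The sketch below is one possible reading of that model, not an implementation used in the study: the gene-class labels, the enum names, and the precedence of the checks are assumptions, and it deliberately ignores the “few other pseudogenes” tolerated in partially heterotrophic taxa as well as exceptions such as Cuscuta.

```python
from enum import IntEnum
from typing import Iterable


class Category(IntEnum):
    NO_DEGRADATION = 1
    NDH_ONLY = 2
    NDH_PHOTOSYNTHESIS_RPO = 3
    ALL_SYSTEMS = 4
    PLASTOME_LOST = 5


def classify(degraded_classes: Iterable[str], plastome_present: bool = True) -> Category:
    """Assign a plastome to one of the five categories of the modified model.

    degraded_classes holds hypothetical labels for gene classes judged to be
    degraded: "ndh", "photosynthesis", "rpo", "atp", "housekeeping".
    """
    degraded = set(degraded_classes)
    if not plastome_present:
        return Category.PLASTOME_LOST
    if degraded & {"atp", "housekeeping"}:
        return Category.ALL_SYSTEMS
    if degraded & {"photosynthesis", "rpo"}:
        return Category.NDH_PHOTOSYNTHESIS_RPO
    if "ndh" in degraded:
        return Category.NDH_ONLY
    return Category.NO_DEGRADATION


# Illustrative calls using the patterns described in the text.
print(classify({"ndh"}).name)                           # green coralroots
print(classify({"ndh", "photosynthesis", "rpo"}).name)  # nongreen coralroots
```

Whether a gene class counts as degraded is left to the caller, which is where judgment about isolated pseudogenes (e.g., psbM or psaI in green coralroots) would enter.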

There is some degree of deviation from the Barrett and Davis (2012) model, as might be expected given the potential for somewhat idiosyncratic patterns of plastome evolution in unrelated groups. The obvious exception to this model would be the sequenced members of the genus Cuscuta in the family Convolvulaceae. One of the best-studied parasitic systems in terms of plastome evolution, Cuscuta contains members with “extreme” plastome reduction relative to closely related leafy species (approximately 50% total size reduction in Cuscuta gronovii and Cuscuta obtusiflora relative to the leafy Ipomoea) (fig. 3 and references therein). Yet both highly reduced species of Cuscuta retain most, if not all, of the genes for photosynthesis, but have degraded RNA polymerase complexes and also show evidence of pseudogenes/deletions of some housekeeping genes. A further survey based on probe-hybridization in Cuscuta with greatly increased taxon sampling by Braukmann et al. (2013) suggests that this genus encapsulates multiple transitions to highly degraded plastid genomes, with members of subgenus Grammica displaying even more advanced stages of degradation. These taxa were not included in figure 3, as they are based on hybridization and have not yet been sequenced to generate complete plastomes. Recycling of respiratory CO2 has been offered as an explanation for the retention of putatively functional genes in the psa, psb, and pet complexes (McNeal et al. 2007), underscoring the importance of studying the physiology of strict heterotrophs on a taxon-by-taxon basis, and also highlighting how little is known about the basic physiology and functional genomics of these plants.

Conclusion

All members of Corallorhiza produce detectable levels of chlorophyll based on the current sample, but nongreen, strictly heterotrophic members have on average tenfold lower concentrations relative to green, partially heterotrophic members. Genomic data allowed assembly of complete plastomes as well as partial rDNA operons, which when combined fully resolve relationships among the major species complexes in Corallorhiza. Based on complete plastomes, Corallorhiza can be said to span the “early” stages of plastid genome degradation as a result of relaxed selective pressures on photosynthetic function. Corallorhiza is useful in providing a model system to understand the earliest stages in the transition to a “minimal plastome,” as are observed in other heterotrophic plants (e.g., Orobanchaceae, other orchids like Rhizanthella). There have been at least two independent transitions to strict heterotrophy from partial heterotrophy in the genus (Co. maculata and Co. striata complexes). Nongreen coralroots also display evidence of relaxed selective pressures on photosynthesis-related genes which have retained open reading frames. Members of the atp complex display relaxed purifying selection among Corallorhiza as a whole, but more specifically display evidence for positive selection for nongreen, strictly heterotrophic taxa, although for reasons currently unknown.

Generally, there is a clear path of plastid genome degradation in the heterotrophic angiosperm lineages studied thus far (with some deviations from the overall model, e.g., some Cuscuta). Moreover, it can be hypothesized that there is a stepwise progression to a “minimal” plastome, and Corallorhiza occupies categories 2 (partially heterotrophic taxa, which display degradation of the ndh complex only and a few other pseudogenes) and 3 (strictly heterotrophic taxa, including degradation of the ndh complex, photosynthesis-related genes, and rpo genes [excluding atp genes]) in a modified model of the degree of plastome degradation among angiosperms. Future studies will include 1) deeper sampling of complete plastomes within species complexes of Corallorhiza to gain a phylogeographic perspective on plastome degradation; 2) direct assessment of the relative photosynthetic capabilities of members of each coralroot taxon in the field; and 3) quantification of transcript levels for plastid-encoded genes within each species to gain a clearer understanding of gene expression, and to test the hypothesis of the existence of duplicate, functional copies of pseudogenized or deleted plastid genes.

Materials and Methods

Plant Material, DNA Isolation, Library Preparation, and PE Sequencing

Plant material was collected and preserved either in silica gel or frozen at −80 °C. Total genomic DNAs were extracted using the cetyltrimethylammonium bromide (CTAB) method (Doyle JJ and Doyle JL 1987) from 10 g frozen or 0.2–0.5 g silica-dried material and quantified using a NanoDrop 2000 spectrophotometer (Thermo Scientific, Waltham, MA). Samples with total DNA concentrations > 100 ng/μl were selected and run on a 1.5% agarose gel to check for high molecular weight, nondegraded DNA. Samples were shipped on ice to Cold Spring Harbor Laboratory (CSHL, Woodbury, NY), where they were sheared to approximately 450 bp, followed by library preparation and barcoding for PE Illumina sequencing. The exception was Co. odontorhiza, for which library preparation and sequencing were completed at the University of Missouri–Columbia using the Illumina TruSeq Library Preparation Kit (Illumina, Inc., San Diego, CA). Co. odontorhiza DNAs were sheared to approximately 320 bp. All samples were run on Illumina HiSeq 2000 machines to generate 100-bp PE reads. Indexed samples were pooled and run with accessions from other projects, with a total of 40 or 20 accessions in a single lane on the CSHL and University of Missouri machines, respectively. Voucher information and characteristics of each of the resulting data sets and completed plastomes are listed in table 1.

Quantification of Chlorophyll Content

Approximately 0.1 g of −80 °C frozen tissue was harvested from the outer wall of the developing ovary and weighed for chlorophyll quantification. Ovary tissue was chosen 1) because this is the only organ that is visibly green in some taxa (e.g., Co. odontorhiza, Co. wisteriana) and 2) to be consistent between taxa. Chlorophylls were extracted in 80% ice-cold acetone and stored at −80 °C, and chlorophylls a and b were quantified on a Cary 4000 UV-VIS spectrophotometer (Agilent Technologies, Inc., Santa Clara, CA; Department of Plant Cellular & Molecular Biology, Ohio State University), and their concentrations were calculated using formulas of Porra (2002) (Lima D, personal communication). Data from chlorophylls a and b were pooled to give total chlorophylls. Although individual accessions used to quantify chlorophyll concentration among species were not the same as those used for sequencing, multiple individuals from across each species’ geographic range were included to account for variation in chlorophyll content among individuals within species. Corallorhiza bentleyi is included here (an endangered member of the Co. striata complex) but was not sequenced. Conversely, Mexican-endemic members of the Co. maculata complex (Co. macrantha, Co. bulbosa, Co. maculata var. mexicana) were not included in the chlorophyll analysis due to rarity and thus lack of fresh/frozen material. Voucher specimens for plants used for chlorophyll analysis are listed in supplementary figure S5, Supplementary Material online.
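The chlorophyll calculation can be sketched as follows. The text states only that formulas of Porra (2002) were used; the coefficients below are the widely cited simultaneous equations for chlorophylls a and b in 80% aqueous acetone (absorbance at 663.6 and 646.6 nm, results in μg/ml) and are an assumption here, so they should be checked against Porra (2002) before any use. The function names, extract volume, and absorbance readings are hypothetical.

```python
def chlorophyll_ug_per_ml(a663_6: float, a646_6: float):
    """Chlorophyll a, b, and total (ug/ml) from absorbances in 80% acetone.

    Coefficients are assumed (commonly cited for 80% aqueous acetone);
    absorbances are expected to be baseline-corrected by the caller.
    """
    chl_a = 12.25 * a663_6 - 2.55 * a646_6
    chl_b = 20.31 * a646_6 - 4.91 * a663_6
    return chl_a, chl_b, chl_a + chl_b


def total_chlorophyll_per_gram(a663_6: float, a646_6: float,
                               extract_volume_ml: float,
                               tissue_mass_g: float) -> float:
    """Total chlorophyll (a + b) in ug per g of tissue."""
    _, _, total = chlorophyll_ug_per_ml(a663_6, a646_6)
    return total * extract_volume_ml / tissue_mass_g


# Hypothetical readings for roughly 0.1 g of ovary tissue in 2 ml of extract.
print(total_chlorophyll_per_gram(0.40, 0.15, extract_volume_ml=2.0, tissue_mass_g=0.1))
```

Pooling chlorophylls a and b, as described in the text, corresponds to the third value returned by the first function.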

Data Processing

The resulting paired FASTQ data were processed using a custom UNIX script (“process_fastq.sh”) that calls several freely available Perl scripts for the purposes of 1) quality filtering/trimming and adaptor removal (“IlluQC_PRLL.pl,” “TrimmingReads.pl”; Patel and Jain 2012); 2) “shuffling” PE files into a single file (“ShuffleSequences_fastq.pl,” part of the VELVET suite of tools [E. Cabot, University of Wisconsin-Madison; Zerbino and Birney 2008]); and 3) conversion to FASTA format (“fastq2fasta.pl”; B. Knaus 2009 [http://brianknaus.com/software/srtoolbox, last accessed May 1, 2014]). Reads were trimmed from their 3′-ends, discarding bases under a quality threshold of PHRED score of 20 (corresponding to an error rate of 1/100 or greater), and also discarding trimmed reads below 61 bp in length.
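The trimming rule described above can be sketched in a few lines. This is an illustration of the stated thresholds (PHRED 20, minimum length 61 bp), not a reimplementation of the Perl scripts named in the text; the function names and the Phred+33 quality encoding are assumptions.

```python
def phred_to_error(q: float) -> float:
    """Convert a PHRED quality score to the probability of a base-call error."""
    return 10 ** (-q / 10)


def decode_phred33(quality_string: str) -> list:
    """Decode a FASTQ quality string, assuming Phred+33 (Sanger) encoding."""
    return [ord(char) - 33 for char in quality_string]


def trim_3prime(seq: str, quals: list, min_q: int = 20):
    """Remove trailing bases whose quality is below min_q.

    Only the 3' end is trimmed; low-quality bases inside the read are kept.
    """
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end], quals[:end]


def keep_read(trimmed_seq: str, min_len: int = 61) -> bool:
    """Retain a trimmed read only if it is at least min_len bases long."""
    return len(trimmed_seq) >= min_len


# Toy read: the final two bases fall below PHRED 20 and are removed.
seq, quals = trim_3prime("ACGTACGT", [38, 38, 35, 30, 12, 25, 8, 2])
print(seq, quals, keep_read(seq), phred_to_error(20))
```

For paired-end data, a pipeline would additionally need to decide what to do with a read whose mate is discarded; the text does not specify this.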

Read Mapping

PE reads were then mapped five times iteratively in Geneious v6 (created by Biomatters, Inc., Auckland, New Zealand; http://www.geneious.com, last accessed December 1, 2013) using automatically determined settings to a combined file of 1) the Phalaenopsis aphrodite plastome (GenBank accession number NC_007499; Chang et al. 2006); 2) the Phoenix dactylifera mitochondrial genome (NC_016740; Fang et al. 2012); and 3) the Asclepias syriaca nuclear ribosomal DNA consensus operon (JF312046; Straub et al. 2011). The purpose of read mapping was to filter or “screen” reads that matched the reference genomes, and to get initial, crude estimates of coverage to be used in downstream de novo assemblies.

Plastome Assembly

Mean coverage estimates from Geneious were used as input for de novo plastome assembly (“-exp_cov” parameter). De novo assembly was completed using both the VELVET v1.2.10 assembler (Zerbino and Birney 2008) and the Geneious v6 de novo assembler. VELVET assemblies were generated using a variety of hash lengths (i.e., “word match” or “kmer length”) ranging from 51 to 81 for each complete data set, with specified expected coverage estimates taken from Geneious mapping (specified to be in approximate kmer coverage; see Zerbino and Birney 2008). Coverage cutoff values were input as roughly one-half of the specified expected coverage estimates, minimum contig length was set to 300 bp, and insert size set to 450 bp (320 bp for Co. odontorhiza), and the resulting contigs were assembled in Sequencher v6 (GeneCodes, Ann Arbor, MI). The resulting “draft de novo assembly” from VELVET was used as a reference to map reads in Geneious as described above (five iterations, automatically determined settings). Read mappings were visually scanned in Geneious to check for areas of low coverage or potential misassemblies. Read pairs that mapped successfully to the draft reference were used in a second round of de novo assembly, this time in Geneious with automatically determined settings, saving 100 consensus contigs. This method allowed construction of complete draft plastomes with relative ease.
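The conversion from nucleotide coverage to approximate k-mer coverage, and the derived cutoff, can be sketched as follows. The formula Ck = C × (L − k + 1) / L is the one given in the Velvet documentation; the dictionary keys mirror velvetg option names, while the function names, the example coverage of 200×, and the use of the untrimmed 100-bp read length are assumptions made for illustration.

```python
def kmer_coverage(nucleotide_coverage: float, read_length: int, k: int) -> float:
    """Approximate k-mer coverage: Ck = C * (L - k + 1) / L."""
    if k > read_length:
        raise ValueError("k cannot exceed the read length")
    return nucleotide_coverage * (read_length - k + 1) / read_length


def velvetg_settings(nucleotide_coverage: float, read_length: int, k: int,
                     insert_size: int = 450) -> dict:
    """Assemble the settings described in the text for one hash length."""
    exp_cov = kmer_coverage(nucleotide_coverage, read_length, k)
    return {
        "exp_cov": exp_cov,
        "cov_cutoff": exp_cov / 2,   # roughly one-half of expected coverage
        "min_contig_lgth": 300,
        "ins_length": insert_size,   # 320 for Co. odontorhiza
    }


# Hypothetical mean plastome coverage of 200x with 100-bp reads.
for k in (51, 61, 71, 81):
    print(k, velvetg_settings(200, 100, k))
```

Because k-mer coverage falls as k approaches the read length, the expected-coverage and cutoff values need to be recomputed for each hash length tried.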

Contigs from both assemblies (Geneious, VELVET) were merged in Sequencher to form complete plastomes (parameters = “dirty data” or “large gap,” minimum overlap = 20 bp, minimum similarity = 70–90%). Any apparent “gaps” among contigs resulting from one software platform were usually covered by a contig from the other, allowing assembly of completed plastomes (Large Single Copy–Inverted Repeat B–Small Single Copy). In a few cases, VELVET scaffolds contained strings of Ns between 30 and 100 bp, mostly associated with homopolymer repeats (>10 bp). These were resolved by 1) the Geneious de novo assembly (typically) and 2) UNIX “grep” searches, with flanking sequences of 30–50 bp taken from each end of the string of Ns used as search terms. Grep results were converted to FASTA format using a custom Perl script to build a “bridge” of reads. This was done iteratively until the region in question was resolved, although most cases were resolved with a single iteration. Grep searches were applied to all cases of ambiguity between contigs from the two de novo assembly methods as a quality control. Polymerase chain reaction amplification and Sanger sequencing were not necessary due to high read coverage (table 1).
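
The grep-based “bridging” step can be sketched in Python as follows. This is a minimal illustration rather than the custom Perl script used here: the function names are hypothetical, reads are assumed to be already loaded as (name, sequence) pairs, and searching the reverse complement of the flank is an added assumption about how both read orientations might be recovered.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def reads_matching(reads, flank):
    """Return reads containing the flank, oriented to the flank's strand."""
    hits = []
    flank_rc = revcomp(flank)
    for name, seq in reads:
        if flank in seq:
            hits.append((name, seq))
        elif flank_rc in seq:
            # Read comes from the opposite strand: flip it before reporting.
            hits.append((name, revcomp(seq)))
    return hits


def to_fasta(records):
    """Format (name, sequence) pairs as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)
```

The recovered reads extend past the flank into the unresolved region; new flanks can then be taken from their far ends and the search repeated until the two sides meet.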

IR boundaries for each draft plastome were determined by using the first and last approximately 20 bp of the draft plastome as search terms (reverse complement) in Sequencher to find matches, and marking their match points as features. Then the first approximately 700 bp and last approximately 1–2,000 bp of the draft plastome were copied, reverse complemented, and manually aligned to the IR using the marked features from above as reference points. IR boundaries were determined informatically using PE information (i.e., read pairs that spanned the IR boundaries). Because the orientation of the ends of the LSC and SSC can vary, the plastomes were often in different configurations; single copy regions were reoriented to correspond to the “standard” model of gene order to simplify downstream comparison. Special attention was paid to the possibility of significant modification or even loss of the IR (either partial or complete), and also to the possibility of other genomic rearrangements or structural modifications. In other words, when building draft plastomes, it was not assumed that all accessions displayed the “standard” plastome structure found in angiosperms (i.e., LSC–IRb–SSC–IRa).
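
The reverse-complement search used to locate the IR can be illustrated with a short sketch. It assumes a linear draft in the order LSC, IRb, SSC, IRa with an exact (mismatch-free) repeat, which real assemblies need not satisfy; the function names and the toy sequences in the usage example are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_ir_start(draft, k=20):
    """Locate the start of IRb by searching for the reverse complement
    of the last k bases of the draft (the end of IRa). Returns -1 if absent."""
    return draft.find(revcomp(draft[-k:]))


def ir_length(draft, start):
    """Extend the match base by base to estimate the length of the repeat."""
    rc = revcomp(draft)  # rc[:n] equals revcomp(draft[-n:])
    n = 0
    while start + n < len(draft) and draft[start + n] == rc[n]:
        n += 1
    return n


# Toy draft: invented LSC, IRb, and SSC segments, followed by IRa.
lsc, irb, ssc = "ATGCATTTGACC", "GTTACCAAGTC", "CACACATT"
draft = lsc + irb + ssc + revcomp(irb)
start = find_ir_start(draft, k=5)
```

An exact-match extension stops at the first mismatch, so in practice the boundary would still need to be confirmed with read pairs spanning the junction, as described above.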

Plastome Annotation

Plastid genomes were annotated by first aligning the set of coding loci (coding sequences [CDS]) containing introns and all tRNAs from Phalaenopsis to each draft plastome (minus IRa) in Sequencher to aid in intron/exon and tRNA boundary determination. Annotations were conducted in DOGMA (Wyman et al. 2004), which uses BLASTX searches against a database of annotated plastomes and amino acid translations, and allows manual determination of start/stop codons. DOGMA BLASTX parameters were as follows: genetic code = plant plastid, e value = 5, percent cutoff for protein coding genes = 55, and percent cutoff for tRNAs = 80. The resulting GenBank feature table was exported from the DOGMA server and imported, along with the finished plastid FASTA file, into SEQUIN (National Center for Biotechnology Information), where the annotation was validated and completed for

Plastid Genome Degradation and Implications. doi:10.1093/molbev/msu252. MBE

submission to GenBank. A large inversion in Co. maculata var. maculata was manually reversed for the purposes of phylogenetic analyses and other whole-plastome comparisons, but the plastome for that accession was annotated with the inversion as assembled de novo. Plastome maps were drawn with GenomeVx (Conant and Wolfe 2008) and manually edited in Adobe Illustrator CS6 (Adobe Inc., San Jose, CA).

Ribosomal DNA Assembly and Annotation

Nuclear ribosomal DNA reads were filtered by mapping reads from Co. odontorhiza to a nearly complete rDNA operon reference (partial ETS [External Transcribed Spacer], 18S rDNA, ITS1 [Internal Transcribed Spacer], 5.8S rDNA, ITS2, 26S rDNA) from A. syriaca, described above, as well as mapping to an ITS-spanning sequence from Co. odontorhiza from GenBank (accession number EU391326). Other regions were not assembled (e.g., 5S rDNA). Resulting reads that mapped to each reference were assembled de novo in Geneious, and resulting contigs were merged through overlap in Sequencher. This draft assembly was used for a second round of read mapping (original Co. odontorhiza reads) and de novo assembly in Geneious, resulting in a single consensus contig. The consensus contig was based on 50% majority per site, bearing in mind that the rDNA operon exists in numerous repeated copies that often differ from one another, appearing as polymorphism in read pileups. The Co. odontorhiza rDNA consensus operon was then used as a reference in read mapping for all other Corallorhiza species, followed by individual de novo assemblies as above.
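
A 50% majority consensus over a read pileup can be sketched as below. This is an illustration only and does not reproduce Geneious' consensus caller: the rule that a base must exceed half of the non-gap calls, and the use of “N” as a placeholder where no base does, are assumptions of the example.

```python
from collections import Counter


def majority_consensus(columns, threshold=0.5, ambiguous="N"):
    """Call one base per pileup column.

    columns: iterable of strings, each holding the bases observed at one site.
    A base is called only if its frequency among non-gap calls exceeds threshold.
    """
    consensus = []
    for column in columns:
        counts = Counter(base for base in column.upper() if base != "-")
        if not counts:
            consensus.append(ambiguous)
            continue
        base, count = counts.most_common(1)[0]
        frequency = count / sum(counts.values())
        consensus.append(base if frequency > threshold else ambiguous)
    return "".join(consensus)
```

Sites where repeat copies are split evenly between two bases are left ambiguous under this rule, which is one way the intragenomic polymorphism mentioned above would surface in a consensus.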

Nearly complete ribosomal DNA operons (partial ETS, 18S rDNA, ITS1, 5.8S, ITS2, 26S rDNA; excluding 5S rDNA) were annotated by first aligning finished contigs in Geneious (using the MUSCLE v3.8.31 plugin in Geneious; Edgar 2004) and transferring the annotation features from Co. odontorhiza to the aligned Corallorhiza sequences. Then the entire alignment was annotated in SEQUIN for submission to GenBank by adding each feature to Co. odontorhiza and subsequently propagating the features to the remaining sequences. Reads were again mapped in Geneious to each finished, annotated rDNA cistron to check for misassemblies. GenBank accession numbers for rDNA partial operons are KM390003–KM390012. Available outgroup sequences for Cymbidium, Oncidium, and Phalaenopsis were mined from GenBank for the purposes of downstream phylogenetic analyses (18S rDNA: Cy. farberi, GenBank accession number JN418934; ITS: Cy. aloifolium, JF729014; Phalaenopsis aphrodite, AY391543; O. andradeanum, FJ565598).

Data Matrix Assembly and Sequence Alignment

Annotated plastomes were imported into Geneious, with subsequent removal of the second copy of the IR (IRa), and aligned with MAFFT v7.157 (Katoh et al. 2002) using the “fast-progressive” method (gap opening penalty = 6, offset = 0.1), followed by manual adjustment. The alignment also included complete plastomes of Cy. aloifolium (GenBank accession number NC_021429), Oncidium Gower Ramsey (GenBank accession number NC_014056), and Phalaenopsis aphrodite (GenBank accession number NC_017049). A short inversion of the petN-psbM region in Cymbidium was removed for that accession in downstream analyses. Annotated CDS were exported from Geneious, and data from each locus were saved as separate FASTA files. Each file was submitted for codon-based alignment to the MACSE web server (Ranwez et al. 2011; http://mbb.univ-montp2.fr/MBB/subsection/softExec.php?soft=macse, last accessed May 30, 2014), which allows alignment of both intact open reading frames and incomplete sequences (including pseudogenes with modified reading frames). Alignments were adjusted manually when necessary and imported back into Geneious or SequenceMatrix (Vaidya et al. 2011), where they were concatenated into larger matrices. Protein-coding genes that contained pseudogenes (shifted reading frames, internal stop codons) in at least one taxon were excluded. Data matrices are deposited in supplementary fig. S6, Supplementary Material online.

Phylogenetic Analyses

Phylogenetic analyses were conducted using both Parsimony and ML. Parsimony searches were conducted in TNT (Goloboff et al. 2003, 2008) with 100 random addition starting sequences, saving 100 trees per replicate, and using tree bisection–reconnection branch swapping on the pool of saved trees. Support was assessed through 2,000 Jackknife pseudoreplicates with the above search parameters. Model fit for ML analyses was assessed for the complete plastome alignment in MEGA v6 (Tamura et al. 2013), which yielded GTR+G as the best-fit model based on the corrected Akaike Information Criterion (AICc) metric (Burnham and Anderson 2002). All ML analyses were conducted in RAxML v8.0.20 (Stamatakis 2006). Three separate searches were initiated for each matrix analyzed, from different random starting seeds, using an unpartitioned GTR+G model to ensure convergence on the same topology. A codon-partitioned model was not chosen due to the possibility of relaxed functional constraints among some coding regions for some taxa, which could possibly represent model violations. Support was assessed with 1,000 standard bootstrap pseudoreplicates. Cymbidium was chosen as the outgroup taxon.
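
For readers unfamiliar with the model-selection metric, the sketch below computes AICc from a log-likelihood, a parameter count, and a sample size, using the standard form AICc = -2 ln L + 2k + 2k(k + 1)/(n - k - 1), and selects the model with the lowest score. The log-likelihoods, parameter counts, and alignment length in the usage example are invented for illustration and are not results from this study.

```python
def aicc(log_likelihood, n_params, n_samples):
    """Corrected Akaike Information Criterion (lower is better)."""
    if n_samples - n_params - 1 <= 0:
        raise ValueError("sample size too small for the AICc correction")
    aic = -2.0 * log_likelihood + 2.0 * n_params
    correction = 2.0 * n_params * (n_params + 1) / (n_samples - n_params - 1)
    return aic + correction


def best_model(models, n_samples):
    """models maps a model name to (log-likelihood, number of free parameters)."""
    return min(models, key=lambda name: aicc(*models[name], n_samples))


# Hypothetical fits for three substitution models on a 1,000-site alignment.
fits = {"JC": (-5200.0, 23), "HKY": (-5100.0, 27), "GTR+G": (-5050.0, 32)}
chosen = best_model(fits, 1000)
```

The correction term shrinks toward zero as the number of sites grows relative to the number of parameters, at which point AICc and AIC rank models identically.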

Plastid Pseudogenes, Genomic Deletions, and Genomic Attributes

The presence of pseudogenes and whole gene/regional deletions was initially assessed visually in DOGMA by identifying interrupted BLASTX reading frames. All coding loci aligned in MACSE (see above) were imported into Geneious, and sequences were translated to assess pseudogene status through the presence of frame shifts and/or premature stop codons. Loci with interrupted reading frames were considered to be pseudogenes unless there was an alternative start or stop codon within three codon positions of the other taxa in the alignment. In addition, the alignment of whole annotated plastomes (described above) was used to visualize any larger genomic deletions spanning CDS boundaries, or rearrangements.
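
The two signatures used to flag candidate pseudogenes, frame shifts and premature stop codons, can be checked programmatically along these lines. This is a simplified sketch with a hypothetical function name: it assumes the standard stop codons, strips alignment gaps, and does not implement the tolerance for alternative start or stop codons within three codon positions described above.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reading_frame_problems(cds):
    """Return a list of reasons a coding sequence may be a pseudogene."""
    seq = cds.upper().replace("-", "")
    problems = []
    remainder = len(seq) % 3
    if remainder:
        problems.append("length not a multiple of three")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - remainder, 3)]
    # With an intact frame the final codon is the expected stop; otherwise
    # every complete codon is treated as internal.
    internal = codons if remainder else codons[:-1]
    if any(codon in STOP_CODONS for codon in internal):
        problems.append("premature stop codon")
    return problems
```

An empty list means neither signature was found; it does not by itself demonstrate that a gene is functional.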

Barrett et al. doi:10.1093/molbev/msu252. MBE

Hypotheses of differences in plastid genomic attributes among nongreen versus green taxa, and correlations between chlorophyll content and genomic attributes, were assessed among taxa using two phylogeny-based analytical approaches. First, a Phylogenetic Analysis of Variance (Garland et al. 1993) was used to test for differences in continuous plastome variables (e.g., length, GC content) among green versus nongreen taxa (i.e., a discrete grouping variable), taking phylogenetic relationships into account. Analyses were run in the R package “geiger” (Harmon et al. 2008) with 10,000 simulations each using the “aov.phylo” function, based on an ultrametric tree generated through Nonparametric Rate Smoothing (Sanderson 1997). These analyses were run for a complete data set of 13 taxa (three leafy, green outgroup taxa and ten species of Corallorhiza). A second set of analyses was conducted on a subset of six taxa of Corallorhiza for which chlorophyll data were available, to mitigate problems with missing chlorophyll data. In these analyses, continuous genomic variables (length, number of putatively functional genes, GC content) were correlated with chlorophyll content (continuous) using Phylogenetically Independent Contrasts to account for phylogenetic relatedness among taxa (Felsenstein 1985). The latter analyses were conducted under a Brownian motion model with the PDAP package (Midford et al. 2005) in Mesquite v2.75 (Maddison WP and Maddison DR 2011).
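
To make the contrasts approach concrete, the sketch below computes standardized independent contrasts on a small bifurcating tree following Felsenstein's (1985) algorithm under Brownian motion, then correlates the contrasts of two traits through the origin. It is not the PDAP implementation; the nested-tuple tree encoding, the function names, and the trait values used to exercise it are assumptions of the example.

```python
import math


def contrasts(tree, values):
    """Standardized contrasts for a bifurcating tree.

    A tip is a name (str); an internal node is a pair of
    (child, branch_length) tuples. values maps tip names to trait values.
    """
    results = []

    def visit(node):
        # Returns (trait estimate at node, extra branch length to add above it).
        if isinstance(node, str):
            return values[node], 0.0
        (left, len_left), (right, len_right) = node
        x_left, extra_left = visit(left)
        x_right, extra_right = visit(right)
        v_left, v_right = len_left + extra_left, len_right + extra_right
        results.append((x_left - x_right) / math.sqrt(v_left + v_right))
        estimate = (x_left / v_left + x_right / v_right) / (1 / v_left + 1 / v_right)
        return estimate, v_left * v_right / (v_left + v_right)

    visit(tree)
    return results


def origin_correlation(xs, ys):
    """Correlation of two sets of contrasts, forced through the origin."""
    numerator = sum(x * y for x, y in zip(xs, ys))
    return numerator / math.sqrt(sum(x * x for x in xs) * sum(y * y for y in ys))


# Toy tree ((A:1,B:1):1,C:2) with invented trait values.
toy_tree = (((("A", 1.0), ("B", 1.0)), 1.0), ("C", 2.0))
trait = contrasts(toy_tree, {"A": 1.0, "B": 3.0, "C": 5.0})
```

The sign of each contrast depends on the arbitrary left/right ordering of sister lineages, which is why the correlation is computed through the origin and why both traits must be contrasted on the same tree encoding.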

Model-Based Analyses of Selective Regime

Changes in selective regime among species of Corallorhiza for codon-based matrices were assessed with the CODEML module in PAML v4.8 (Yang 1997, 2007), using “branch” models. Only loci for which all taxa had open reading frames were included in the analyses. Various matrices are listed in table 3. The objective was to test hypotheses of different $\omega$ ratios for 1) Corallorhiza versus leafy outgroups and 2) nongreen Corallorhiza versus green Corallorhiza + leafy outgroups. To accomplish this, different branch models were fit to the data using the topology resulting from the ML analysis of whole plastomes + rDNA. Six different branch models were tested. First, a model in which all branches evolve under one ratio (Model M0) was tested, where $\omega = d_N/d_S$, the ratio of nonsynonymous substitutions per nonsynonymous site to synonymous substitutions per synonymous site. A two-ratio model (M1) was then run in which all branches of Corallorhiza were allowed to have a different ratio ($\omega_{\text{cor}}$) than the remaining leafy outgroup sequences ($\omega_{\text{leafy}}$). A three-ratio model was then run (M2), allowing separate values for $\omega_{\text{leafy}}$, $\omega_{\text{cor}}$, and nongreen Corallorhiza ($\omega_{\text{str}} = \omega_{\text{mac-ng}}$; Co. striata, Co. mertensiana, Co. maculata var. maculata and occidentalis, respectively), followed by a four-ratio model (M3) with additionally separate ratios for Co. striata ($\omega_{\text{str}}$) and the nongreen members of the Co. maculata complex ($\omega_{\text{mac-ng}}$). The latter three models (M1–M3) specifically test hypotheses of whether Corallorhiza, and particularly nongreen Corallorhiza, display departures from purifying selection. Two additional models were run to test whether the nongreen Corallorhiza differ significantly from green Corallorhiza and leafy outgroup taxa. Specifically, Model M4 (two ratios) sets $\omega_{\text{leafy}} = \omega_{\text{cor}}$ and $\omega_{\text{str}} = \omega_{\text{mac-ng}}$, whereas M5 (three ratios) allows $\omega_{\text{str}}$ and $\omega_{\text{mac-ng}}$ to differ. Successively nested models were compared based on log-likelihood ratio tests (M1 vs. M0, M2 vs. M1, M3 vs. M2, M4 vs. M0, M5 vs. M4), where the likelihood ratio test statistic ($2\Delta\ell$) was calculated as two times the difference in log-likelihood between the two models, with degrees of freedom calculated as the difference in the number of free parameters for the models, and tested against the $\chi^2$ distribution. P values for multiple comparisons to the same data set were Bonferroni-corrected (Dunn 1959), as suggested by Yang (2007), with the corrected significance level corresponding to $\alpha/m$, where $m$ is the number of branches being tested. Genes ycf1 and ycf2 were analyzed separately because they have extraordinarily high substitution rates (e.g., Barrett et al. 2013), as was matK, based on the finding of a pseudogenized copy in a previous study (Freudenstein and Senyo 2008).
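
The nested-model comparison described above amounts to the following calculation, shown here as a minimal sketch using only the Python standard library. The chi-square tail probability is computed from its closed-form series for integer degrees of freedom; the log-likelihood values, the number of tests, and the function names are hypothetical and are not CODEML output from this study.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2:
        tail, k = math.erfc(math.sqrt(half)), 1   # exact result for df = 1
    else:
        tail, k = math.exp(-half), 2              # exact result for df = 2
    while k < df:
        # Recurrence: Q(k + 2) = Q(k) + (x/2)^(k/2) e^(-x/2) / Gamma(k/2 + 1)
        tail += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return tail


def likelihood_ratio_test(lnl_null, lnl_alt, df, n_tests=1, alpha=0.05):
    """Return (statistic, P value, significant after Bonferroni correction)."""
    statistic = 2.0 * (lnl_alt - lnl_null)
    p_value = chi2_sf(statistic, df)
    return statistic, p_value, p_value < alpha / n_tests


# Hypothetical comparison of a two-ratio model against a one-ratio model.
statistic, p_value, significant = likelihood_ratio_test(-1000.0, -995.0, df=1, n_tests=5)
```

Dividing the significance level by the number of tests, rather than multiplying each P value, gives the same decisions and keeps the reported P values uncorrected.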

To further investigate the $\omega$ values among branches for the atp complex, two sets of branch-site models were tested against the data. In the first, all Corallorhiza were set as the foreground branch, and in the second, nongreen Corallorhiza (Co. striata vreelandii, Co. maculata var. maculata and occidentalis, and Co. mertensiana) were set as foreground branches. For both sets of branch-site models, two alternative models were tested. The “null” model specifies $\omega_0 < 1$ for a proportion of sites and $\omega_1 = \omega_2 = 1$ for the remaining sites (no positive selection), whereas the alternative model allows $\omega_2 > 1$ (positive selection on some sites). The alternative model (some sites under positive selection) was then tested against the null model (no sites under positive selection) using a $\chi^2$ distribution as above, with 1 degree of freedom (i.e., the alternative and null models differ by one parameter).

Comparison with Other Parasitic Angiosperm Lineages

Plastid gene content (losses, pseudogenes, putatively functional genes) was compared among diverse lineages of angiosperms containing heterotrophs, including newly sequenced Corallorhiza. Taxa and genes were sorted by the numbers of putatively functional genes and gene functional class, respectively, to illustrate the different stages of plastid genome degradation. Specifically, genes were arranged according to the model of Barrett and Davis (2012), in which degradation of plastid functional gene classes is hypothesized to have occurred roughly in the following order: 1) NAD(P)H dehydrogenase genes (ndh); 2) “photosynthesis”-related genes, including pet, psa, psb, cemA, ccsA, rbcL, ycf3, and ycf4; 3) Plastid-encoded RNA Polymerase genes (rpo); 4) ATP synthase genes (atp); and finally 5) “housekeeping” genes (rpl, rps, rrn, trn, matK, clpP, infA, accD, ycf1, and ycf2). Though not a formal test in the strict sense, the use of this model does allow an assessment of how advanced the sequenced accessions of Corallorhiza are when compared with other heterotrophic angiosperms; it is hypothesized that Corallorhiza occupies the early stages of this model. Technically, atp genes function in the generation of ATP and movement of protons across the Thylakoid Membrane, and thus play a crucial role in photosynthesis, but this functional class was kept separate from photosynthesis genes due to the possibility that ATP synthase serves roles outside of photosynthesis.

Supplementary Material

Supplementary figures S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by the National Science Foundation (awards DEB-0830020 to Jerrold I. Davis and DBI 1110443 to J.C.P.), the California State University Program for Education and Research in Biotechnology (CSUPERB) to C.F.B., the California State University, Los Angeles Center for Effective Teaching and Learning Faculty-Student Mentorship award to C.F.B. and J.L., and the California State University Council on Ocean Affairs, Science and Technology to C.S. The authors are grateful to Jerrold Davis, Patrick Edger, Ryan Ellingson, Gwynne Lim, Jeffrey Morawetz, and Chris Randle for feedback and laboratory/bioinformatics support; the staff at Cold Spring Harbor Laboratory (Woodbury, NY) for sample preparation and sequencing; Daniel Lima and Richard Sayre for assistance with chlorophyll spectroscopy and data interpretation; Gerardo Salazar (UNAM) for assistance in collecting Mexican specimens; and Maria Logacheva and three anonymous reviewers for critical insight that improved the paper.

References

Allen JF, de Paula WBM, Puthiyaveetil S, Nield J. 2011. A structural phylogenetic map for chloroplast photosynthesis. Trends Plant Sci. 16:645–655.

Barbrook AC, Howe CJ, Purton S. 2006. Why are plastid genomes retained in non-photosynthetic organisms? Trends Plant Sci. 11:101–108.

Barrett CF, Davis JI. 2012. The plastid genome of the mycoheterotrophic Corallorhiza striata (Orchidaceae) is in the relatively early stages of degradation. Am J Bot. 99:1513–1523.

Barrett CF, Davis JI, Leebens-Mack J, Conran JG, Stevenson DW. 2013. Plastid genomes and deep relationships among the commelinid monocot angiosperms. Cladistics 29:65–87.

Barrett CF, Freudenstein JV. 2008. Molecular evolution of rbcL in the mycoheterotrophic coralroot orchids (Corallorhiza Gagnebin, Orchidaceae). Mol Phylogenet Evol. 47:665–679.

Bellino A, Alfani A, Selosse M-A, Guerrieri R, Borghetti M, Baldantoni D. 2014. Nutritional regulation in mixotrophic plants: new insights from Limodorum abortivum. Oecologia 175:875–885.

Bernardi G. 1989. The isochore organization of the human genome. Annu Rev Genet. 23:637–661.

Bidartondo MI. 2005. The evolutionary ecology of myco-heterotrophy. New Phytol. 167:335–352.

Blazier J, Guisinger MM, Jansen RK. 2011. Recent loss of plastid-encoded ndh genes within Erodium (Geraniaceae). Plant Mol Biol. 76:263–272.

Braukmann T, Kuzmina M, Stefanovic S. 2013. Plastid genome evolution across the genus Cuscuta (Convolvulaceae): two clades within subgenus Grammica exhibit extensive gene loss. J Exp Bot. 64:977–989.

Braukmann T, Stefanovic S. 2012. Plastid genome evolution in mycoheterotrophic Ericaceae. Plant Mol Biol. 79:5–20.

Braukmann TWA, Kuzmina M, Stefanovic S. 2009. Loss of all plastid ndh genes in Gnetales and conifers: extent and evolutionary significance for the seed plant phylogeny. Curr Genet. 55:323–337.

Burnham KP, Anderson DR. 2002. Avoiding pitfalls when using information-theoretic methods. J Wildl Manage. 66:912–918.

Cameron DD, Preiss K, Gebauer G, Read DJ. 2009. The chlorophyll-containing orchid Corallorhiza trifida derives little carbon through photosynthesis. New Phytol. 183:358–364.

Chang CC, Lin HC, Lin IP, Chow TY, Chen HH, Chen WH, Cheng CH, Lin CY, Liu SM, Chang CC, et al. 2006. The chloroplast genome of Phalaenopsis aphrodite (Orchidaceae): comparative analysis of evolutionary rate with that of grasses and its phylogenetic implications. Mol Biol Evol. 23:279–291.

Conant GC, Wolfe KH. 2008. GenomeVx: simple web-based creation of editable circular chromosome maps. Bioinformatics 24:861–862.

Cummings MP, Welschmeyer NA. 1998. Pigment composition of putatively achlorophyllous angiosperms. Plant Syst Evol. 210:105–111.

Delannoy E, Fujii S, Colas des Francs-Small C, Brundrett M, Small I. 2011. Rampant gene loss in the underground orchid Rhizanthella gardneri highlights evolutionary constraints on plastid genomes. Mol Biol Evol. 28:2077–2086.

Doyle JJ, Doyle JL. 1987. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem Bull. 19:11–15.

Dunn OJ. 1959. Confidence intervals for the means of dependent, normally distributed variables. J Am Stat Assoc. 54:613–621.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Evron Y, Johnson EA, McCarty RE. 2000. Regulation of proton flow and ATP synthesis in chloroplasts. J Bioenerg Biomembr. 32:501–506.

Fang YJ, Wu H, Zhang TW, Yang M, Yin YX, Pan LL, Yu XG, Zhang XW, Hu SNA, Al-Mssallem IS, et al. 2012. A complete sequence and transcriptomic analyses of date palm (Phoenix dactylifera L.) mitochondrial genome. PLoS One 7:e37164.

Felsenstein J. 1985. Phylogenies and the comparative method. Am Nat. 125:1–15.

Freudenstein JV. 1992. Systematics of Corallorhiza and the Corallorhizinae (Orchidaceae) [PhD dissertation]. [Ithaca (NY)]: Cornell University.

Freudenstein JV. 1997. A monograph of Corallorhiza (Orchidaceae). Harv Pap Bot. 10:5–51.

Freudenstein JV, Barrett CF. 2010. Mycoheterotrophy and diversity in Orchidaceae. In: Seberg O, Petersen G, Barfod A, Davis JI, editors. Diversity, phylogeny and evolution in the monocotyledons. The Proceedings of the Fourth International Conference on Monocot Systematics. Aarhus (Denmark): Aarhus University Press. p. 25–37.

Freudenstein JV, Doyle JJ. 1994a. Character transformation and relationships in Corallorhiza (Orchidaceae: Epidendroideae). 1. Plastid DNA. Am J Bot. 81:1449–1457.

Freudenstein JV, Doyle JJ. 1994b. Plastid DNA, morphological variation, and the phylogenetic species concept: the Corallorhiza maculata (Orchidaceae) complex. Syst Bot. 19:273–290.

Freudenstein JV, Senyo DM. 2008. Relationships and evolution of matK in a group of leafless orchids (Corallorhiza and Corallorhizinae; Orchidaceae: Epidendroideae). Am J Bot. 95:498–505.

Funk HT, Berg S, Krupinska K, Maier UG, Krause K. 2007. Complete DNA sequences of the plastid genomes of two parasitic flowering plant species, Cuscuta reflexa and Cuscuta gronovii. BMC Plant Biol. 7:45.

Galagan JE, Calvo SE, Borkovich KA, Selker EU, Read ND, Jaffe D, FitzHugh W, Ma LJ, Smirnov S, Purcell S, et al. 2003. The genome sequence of the filamentous fungus Neurospora crassa. Nature 422:859–868.

Garland T, Dickerman AW, Janis CM, Jones JA. 1993. Phylogenetic analysis of covariance by computer simulation. Syst Biol. 42:265–292.

Glemin S, Clement Y, David J, Ressayre A. 2014. GC content evolution in coding regions of angiosperm genomes: a unifying hypothesis. Trends Genet. 30:263–270.

Goloboff PA, Farris JS, Kallersjo M, Oxelman B, Ramirez MJ, Szumik CA. 2003. Improvements to resampling measures of group support. Cladistics 19:324–332.

Goloboff PA, Farris JS, Nixon KC. 2008. TNT, a free program for phylogenetic analysis. Cladistics 24:774–786.

Hajdukiewicz PTJ, Allison LA, Maliga P. 1997. The two RNA polymerases encoded by the nuclear and the plastid compartments transcribe distinct groups of genes in tobacco plastids. EMBO J. 16:4041–4048.

Harmon LJ, Weir JT, Brock CD, Glor RE, Challenger W. 2008. GEIGER: investigating evolutionary radiations. Bioinformatics 24:129–131.

Iles WJD, Smith SY, Graham SW. 2013. A well-supported phylogenetic framework for the monocot order Alismatales reveals multiple losses of the plastid NADH dehydrogenase complex and a strong long-branch effect. In: Wilkin P, Mayo S, editors. Early events in monocot evolution. Cambridge: Cambridge University Press. p. 1–28.

Iorizzo M, Grzebelus D, Senalik D, Szklarczyk M, Spooner D, Simon P. 2012. Against the traffic: the first evidence for mitochondrial DNA transfer into the plastid genome. Mob Genet Elements. 2:261.

Jensen PE, Bassi R, Boekema EJ, Dekker JP, Jansson S, Leister D, Robinson C, Scheller HV. 2007. Structure, function and regulation of plant photosystem I. Biochim Biophys Acta. 1767:335–352.

Katinka MD, Duprat S, Cornillot E, Metenier G, Thomarat F, Prensier G, Barbe V, Peyretaillade E, Brottier P, Wincker P, et al. 2001. Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature 414:450–453.

Katoh K, Misawa K, Kuma Ki, Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30:3059–3066.

Kawakami K, Umena Y, Iwai M, Kawabata Y, Ikeuchi M, Kamiya N, Shen J-R. 2011. Roles of PsbI and PsbM in photosystem II dimer formation and stability studied by deletion mutagenesis and X-ray crystallography. Biochim Biophys Acta. 1807:319–325.

Knaus BJ. 2010. Short read toolbox. Available from: http://brianknaus.com

Krause K. 2008. From chloroplasts to “cryptic” plastids: evolution of plastid genomes in parasitic plants. Curr Genet. 54:111–121.

Krause K. 2012. Plastid genomes of parasitic plants: a trail of reductions and losses. In: Bullerwell C, editor. Organelle genetics. Berlin (Germany): Springer. p. 79–103.

Kuijt J. 1969. The biology of flowering parasitic plants. Berkeley (CA): University of California Press.

Lawrence JG. 2005. Common themes in the genome strategies of pathogens. Curr Opin Genet Dev. 15:584–588.

Leake JR. 1994. The biology of myco-heterotrophic (Saprophytic) plants. New Phytol. 127:171–216.

Leake JR, Cameron DD. 2010. Physiological ecology of mycoheterotrophy. New Phytol. 185:601–605.

Liere K, Weihe A, Borner T. 2011. The transcription machineries of plant mitochondria and chloroplasts: composition, function, and regulation. J Plant Physiol. 168:1345–1360.

Logacheva MD, Schelkunov MI, Nuraliev MS, Samigullin TH, Penin AA. 2014. The plastid genome of mycoheterotrophic monocot Petrosavia stellaris exhibits both gene losses and multiple rearrangements. Genome Biol Evol. 6:238–246.

Logacheva MD, Schelkunov MI, Penin AA. 2011. Sequencing and analysis of plastid genome in mycoheterotrophic orchid Neottia nidus-avis. Genome Biol Evol. 3:1296–1303.

Maddison WP, Maddison DR. 2011. Mesquite: a modular system for evolutionary analysis, version 2.75. Available from: http://mesquiteproject.org

Martín M, Funk HT, Serrot PH, Poltnigg P, Sabater B. 2009. Functional characterization of the thylakoid ndh complex phosphorylation by site-directed mutations in the ndhF gene. Biochim Biophys Acta. 1787:920–928.

Martín M, Sabater B. 2010. Plastid ndh genes in plant evolution. Plant Physiol Biochem. 48:636–645.

McCarty RE, Evron Y, Johnson EA. 2000. The chloroplast ATP synthase: a rotary enzyme? Annu Rev Plant Physiol Plant Mol Biol. 51:83–109.

McNeal JR, Arumugunathan K, Kuehl JV, Boore JL, Depamphilis CW. 2007. Systematics and plastid genome evolution of the cryptically photosynthetic parasitic plant genus Cuscuta (Convolvulaceae). BMC Biol. 5:55.

McNeal JR, Bennett JR, Wolfe AD, Mathews S. 2013. Phylogeny and origins of holoparasitism in Orobanchaceae. Am J Bot. 100:971–983.

Merckx V, Freudenstein JV. 2010. Evolution of mycoheterotrophy in plants: a phylogenetic perspective. New Phytol. 185:605–609.

Midford PE, Garland T Jr, Maddison WP. 2005. PDAP package of Mesquite, version 1.07. Available from: http://mesquiteproject.org

Molina J, Hazzouri KM, Nickrent D, Geisler M, Meyer RS, Pentony MM, Flowers JM, Pelser P, Barcelona J, Inovejas SA, et al. 2014. Possible loss of the chloroplast genome in the parasitic flowering plant Rafflesia lagascae (Rafflesiaceae). Mol Biol Evol. 31:793–803.

Montfort C, Küsters E. 1940. Saprophytismus und photosynthese. I. Biochemische physiologische studien an humus-orchideen. Botanisches Archiv 40:571–633.

Moran NA. 2002. Microbial minimalism: genome reduction in bacterial pathogens. Cell 108:583–586.

Morrison HG, McArthur AG, Gillin FD, Aley SB, Adam RD, Olsen GJ, Best AA, Cande WZ, Chen F, Cipriano MJ, et al. 2007. Genomic minimalism in the early diverging intestinal parasite Giardia lamblia. Science 317:1921–1926.

Nickrent DL, Yan OY, Duff RJ, dePamphilis CW. 1997. Do nonasterid holoparasitic flowering plants have plastid genomes? Plant Mol Biol. 34:717–729.

Patel RK, Jain M. 2012. NGS QC Toolkit: a toolkit for quality control of next generation sequencing data. PLoS One 7:e30619.

Peredo EL, King UM, Les DH. 2013. The plastid genome of Najas flexilis: adaptation to submersed environments is accompanied by the complete loss of the NDH complex in an aquatic angiosperm. PLoS One 8:e68591.

Porra RJ. 2002. The chequered history of the development and use of simultaneous equations for the accurate determination of chlorophylls a and b. Photosyn Res. 73:149–156.

Protasio AV, Tsai IJ, Babbage A, Nichol S, Hunt M, Aslett MA, De Silva N, Velarde GS, Anderson TJC, Clark RC, et al. 2012. A systematically improved high quality genome and transcriptome of the human blood fluke Schistosoma mansoni. PLoS Negl Trop Dis. 6:e1455.

Randle CP, Wolfe AD. 2005. The evolution and expression of rbcL in holoparasitic sister-genera Harveya and Hyobanche (Orobanchaceae). Am J Bot. 92:1575–1585.

Ranwez V, Harispe S, Delsuc F, Douzery EJP. 2011. MACSE: Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons. PLoS One 6:e22594.

Sanderson MJ. 1997. A nonparametric approach to estimating divergence times in the absence of rate constancy. Mol Biol Evol. 14:1218–1231.

Smith DR, Lee RW. 2014. A plastid without a genome: evidence from the nonphotosynthetic green algal genus Polytomella. Plant Physiol. 164:1812–1819.

Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690.

Straub SCK, Cronn RC, Edwards C, Fishbein M, Liston A. 2013. Horizontal transfer of DNA from the mitochondrial to the plastid genome and its subsequent evolution in milkweeds (Apocynaceae). Genome Biol Evol. 5:1872–1885.

Straub SCK, Fishbein M, Livshultz T, Foster Z, Parks M, Weitemier K, Cronn RC, Liston A. 2011. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing. BMC Genomics 12:211.

Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. 2013. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol. 30:2725–2729.

Tsai IJ, Zarowiecki M, Holroyd N, Garciarrubio A, Sanchez-Flores A, Brooks KL, Tracey A, Bobes RJ, Fragoso G, Sciutto E, et al. 2013. The genomes of four tapeworm species reveal adaptations to parasitism. Nature 496:57–63.

Umate P, Schwenkert S, Karbat I, Dal Bosco C, Mlcochova L, Volz S, Zer H, Herrmann RG, Ohad I, Meurer J. 2007. Deletion of psbM in tobacco alters the Q(B) site properties and the electron flow within photosystem II. J Biol Chem. 282:9758–9767.

Vaidya G, Lohman DJ, Meier R. 2011. SequenceMatrix: concatenation software for the fast assembly of multi-gene datasets with character set and codon information. Cladistics 27:171–180.

Westwood JH, Yoder JI, Timko MP, dePamphilis CW. 2010. The evolution of parasitism in plants. Trends Plant Sci. 15:227–235.

Wicke S, Müller KF, de Pamphilis CW, Quandt D, Wickett NJ, Zhang Y, Renner SS, Schneeweiss GM. 2013. Mechanisms of functional and physical genome reduction in photosynthetic and nonphotosynthetic parasitic plants of the broomrape family. Plant Cell 25:3711–3725.

Wicke S, Schäferhoff B, dePamphilis CW, Müller KF. 2014. Disproportional plastome-wide increase of substitution rates and relaxed purifying selection in genes of carnivorous Lentibulariaceae. Mol Biol Evol. 31:529–545.

Wicke S, Schneeweiss GM, dePamphilis CW, Müller KF, Quandt D. 2011. The evolution of the plastid chromosome in land plants: gene content, gene order, gene function. Plant Mol Biol. 76:273–297.

Wickett NJ, Honaas LA, Wafula EK, Das M, Huang K, Wu BA, Landherr L, Timko MP, Yoder J, Westwood JH, et al. 2011. Transcriptomes of the parasitic plant family Orobanchaceae reveal surprising conservation of chlorophyll synthesis. Curr Biol. 21:2098–2104.

Wimpee CF, Wrobel RL, Garvin DK. 1991. A divergent plastid genome in Conopholis americana, an achlorophyllous parasitic plant. Plant Mol Biol. 17:161–166.

Wolfe AD, dePamphilis CW. 1998. The effect of relaxed functional constraints on the photosynthetic gene rbcL in photosynthetic and nonphotosynthetic parasitic plants. Mol Biol Evol. 15:1243–1258.

Wolfe KH, Morden CW, Palmer JD. 1992. Function and evolution of a minimal plastid genome from a nonphotosynthetic parasitic plant. Proc Natl Acad Sci U S A. 89:10648–10652.

Wu FH, Chan MT, Liao DC, Hsu CT, Lee YW, Daniell H, Duvall MR, Lin CS. 2010. Complete chloroplast genome of Oncidium Gower Ramsey and evaluation of molecular markers for identification and breeding in Oncidiinae. BMC Plant Biol. 10:68.

Wyman SK, Jansen RK, Boore JL. 2004. Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20:3252–3255.

Yang JB, Tang M, Li HT, Zhang ZR, Li DZ. 2013. Complete chloroplast genome of the genus Cymbidium: lights into the species identification, phylogenetic implications and population genetic analyses. BMC Evol Biol. 13:84.

Yang ZH. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci. 13:555–556.

Yang ZH. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 24:1586–1591.

Yoder OC, Turgeon BG. 2001. Fungal genomics and pathogenicity. Curr Opin Plant Biol. 4:315–321.

Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18:821–829.

Zimmer K, Meyer C, Gebauer G. 2008. The ectomycorrhizal specialist orchid Corallorhiza trifida is a partial myco-heterotroph. New Phytol. 178:395–400.


a pseudogene in all but Co. bulbosa, Co. macrantha, and Co. trifida, whereas psaI has experienced a 4-bp insertion in Co. macrantha near the 5′-end of the gene, causing a reading frame shift (fig. 2). Grep searches of the original reads confirmed this to be the case. No pseudogenes or major deletions causing shifts in reading frame were detected in atp genes in any species of Corallorhiza.
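A read-level check of this kind can be illustrated with a minimal sketch. It assumes the reads are already available as plain strings and that a short query motif spanning the putative insertion has been chosen. The function names and the toy sequences are hypothetical and are not taken from the study, which reports only that grep searches were used.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only are swapped)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_supporting_reads(reads, motif: str) -> int:
    """Count reads that contain the motif on either strand."""
    forward = motif.upper()
    reverse = reverse_complement(forward)
    return sum(
        1 for read in reads
        if forward in read.upper() or reverse in read.upper()
    )


# Hypothetical toy data: two reads carry the motif (one on each strand).
toy_reads = ["TTGACGTACC", "GGTACGTCAA", "AAAAAAAA"]
toy_motif = "GACGTACC"
n_support = count_supporting_reads(toy_reads, toy_motif)
```

Searching both strands matters because paired-end reads sample either orientation of the plastome; a forward-only search would undercount support.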

As previously discussed in Barrett and Davis (2012), at least one member of all major photosynthesis-related gene systems in Co. striata vreelandii has experienced pseudogene formation or gene deletion, with the exception of genes encoding subunits of the ATP synthase complex (atp genes). Several loss-of-function mutations have occurred in the nongreen, northern North American members of the Co. maculata complex (Co. mertensiana, Co. maculata maculata, and Co. maculata occidentalis). These three taxa share a partial deletion in psbB, as well as numerous pseudogenes in the psa (Photosystem I), psb (PSII), and rpo (plastid-encoded RNA polymerase) complexes, evidenced by reading frame shifts and internal stop codons. They also all have pseudogenes for ccsA (cytochrome c biogenesis protein) and rbcL (RuBisCO large subunit).

Unique mutations to Co. mertensiana among the northern North American members of the Co. maculata complex include deletion of a large portion of psaA, pseudogenes for psbA and psbD, and a deletion in rpoC1. Corallorhiza maculata maculata and Co. maculata occidentalis both contain pseudogenes for petA (cytochrome f), petG (cytochrome b6f complex subunit V), and rpoB. Mutations unique to Co. maculata occidentalis include pseudogenes for psbH and petB, with deletions in ccsA and rpoC1. Mutations in Co. maculata maculata include the evolution of pseudogenes in psbA and deletions in rpoA and rpoC2. Although the petD gene in Co. maculata maculata appears to bear an intact reading frame, a breakpoint of a 16-kb inversion (fig. 2) separates the two exons of this gene, which in other taxa are adjacent and thus presumably cis-spliced.
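The two criteria used above to call pseudogenes, reading frame shifts and internal stop codons, can be expressed as a simple check. The sketch below is a simplification under stated assumptions: the input is an ungapped coding sequence beginning at its start codon, a frame shift is approximated by a length that is not a multiple of three, and the function name and toy sequences are hypothetical. Actual annotation compares each gene against reference sequences.

```python
# Stop codons of the standard/plastid genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def classify_reading_frame(cds: str) -> str:
    """Return 'intact', 'frameshift', or 'internal_stop' for a coding sequence."""
    seq = cds.upper().replace("-", "")  # drop alignment gaps, if any
    if len(seq) % 3 != 0:
        return "frameshift"
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # The final codon is allowed (and expected) to be a stop codon.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        return "internal_stop"
    return "intact"


# Hypothetical six-codon gene and two mutated versions of it.
toy_gene = "ATGGCTTCTAAAGGTTAA"
with_4bp_insertion = toy_gene[:6] + "ACGT" + toy_gene[6:]  # length 22, not divisible by 3
with_premature_stop = "ATGGCTTAAAAAGGTTAA"                # third codon is TAA
```

A 4-bp insertion, such as the one reported for psaI in Co. macrantha, always changes the length by a non-multiple of three and therefore shifts the frame for every downstream codon.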

Phylogenetic Analyses

Relationships based on ML analyses of various data matrices were generally consistent (supplementary figs. S3 and S4, Supplementary Material online). The main difference among ML topologies resulting from these data sets is the placement of Co. bulbosa. For plastid coding data, Co. bulbosa is sister to Co. odontorhiza + Co. wisteriana (bootstrap = 97), whereas for nuclear rDNA (partial ETS-18S-ITS1-5.8S-ITS2-26S) it is placed as sister to Co. macrantha + Co. maculata mexicana (bootstrap = 92). Whole plastomes, including introns and intergenic spacers, place Co. bulbosa as sister to the rest of the Co. maculata complex (bootstrap = 91). Nuclear rDNA sequences place Co. trifida as sister to the remaining Corallorhiza; however, the placement of Co. striata as sister to the remaining Corallorhiza (excluding Co. trifida) received low support (bootstrap = 69), suggesting little support from rDNA at the base of Corallorhiza.

Analyzing the entire plastome along with nuclear rDNA yields a completely resolved Corallorhiza tree, with 100% bootstrap support for all nodes (supplementary fig. S3, Supplementary Material online). Parsimony analysis of this data set yielded identical results to ML, including identical jackknife support values (100% for all nodes; tree not shown, see supplementary fig. S3, Supplementary Material online).

FIG. 3. A basic model of plastid genome degradation (Barrett and Davis 2012), using sequenced exemplar plastomes from angiosperm families containing heterotrophs: Convolvulaceae (Funk et al. 2007; McNeal et al. 2007), Orobanchaceae (Wolfe et al. 1992; Wicke et al. 2013), Orchidaceae (Chang et al. 2006; Wu et al. 2010; Delannoy et al. 2011; Barrett and Davis 2012; Logacheva et al. 2011; Yang et al. 2013), and Petrosaviaceae (Logacheva et al. 2014). Plastomes are ranked by numbers of putatively functional genes (highest to lowest). Black-filled spaces represent putatively functional genes with open reading frames, whereas gray-filled spaces represent pseudogenes, and white spaces represent complete or nearly complete gene losses. “Status”: trophic strategy of each species; AU, nonparasitic autotroph; HemiP, hemiparasite; HP, holoparasite; PM, partial mycoheterotroph (analogous to hemiparasite); HM, holomycotroph. For the purposes of comparison, both leafy, green orchids (Oncidium, Phalaenopsis) are considered nonparasitic autotrophs, even though they may still rely partially on their mycorrhizal fungi. L (bp), total length of each plastome in base pairs. “Photosynthesis-related” excludes atp genes, as these may serve functions outside of photosynthesis.


This topology was chosen for all downstream tree-based analyses. From the root of the tree (with Cymbidium as the outgroup), Oncidium and Phalaenopsis are successively sister to a monophyletic Corallorhiza, with Co. striata vreelandii and Co. trifida successively diverging from the ancestor of the remaining members of Corallorhiza. Nested within this clade, (Co. odontorhiza, Co. wisteriana) is sister to the Co. maculata complex. The Mexican Co. bulbosa is sister to all other members of the Co. maculata complex with 100% support, whereas within the latter, the Mexican Co. macrantha and Co. maculata mexicana are sister to one another, and this clade is collectively sister to a northern North American clade of (Co. mertensiana, (Co. maculata maculata, Co. maculata occidentalis)).

Analyses of Selective Regime

Various branch models were constructed to explicitly test the hypothesis of different ω-ratios for different branches, specifically for nongreen Corallorhiza. Only loci for which all taxa had intact reading frames were included. The most basic model (M0) is a “one-ratio” model, which assumes a single ω-ratio across all branches (here, ωleafy = ωcor = ωmac-ng = ωstr; see table 3 for branch-based abbreviations of ω-ratios). Model M1, which allows all of Corallorhiza to have a different ω-ratio than the leafy outgroups (ωleafy ≠ ωcor = ωmac-ng = ωstr), has a significantly better fit for atp-genes (table 3). Also for atp-genes, a three-ratio model (M2) allowing different ω-ratios in Corallorhiza, and additionally for nongreen Corallorhiza as opposed to leafy outgroups, has a significantly better fit than the nested M1 model. Model M4, which is a two-ratio model specifying different ω-ratios for leafy outgroups + green Corallorhiza as opposed to nongreen Corallorhiza (ωleafy = ωcor ≠ ωmac-ng = ωstr), is highly significant compared with the one-ratio model (M0) after Bonferroni correction. Photosynthesis-related genes (excluding atp-genes) have a significantly better fit for models M2 (ωleafy ≠ ωcor ≠ ωmac-ng = ωstr) and M4 (ωleafy = ωcor ≠ ωmac-ng = ωstr) than M0, but these were nonsignificant after Bonferroni correction. Regardless, the branch-specific estimate (model M4) of ω in nongreen taxa (ωmac-ng = ωstr = 0.73720) was more than twice that of green taxa (ωleafy = ωcor = 0.27361), approaching a pattern expected under selective neutrality for photosynthesis-related genes (ω ~ 1). “Housekeeping” genes show significantly better fit for M1 and M4, but these again were nonsignificant after Bonferroni correction for multiple comparisons. None of the branch models had a significantly better fit in nested comparisons for ycf1 + ycf2 or for matK.

Branch-site models were applied in the atp complex to test for evidence of positive selection. First, for a model in which all Corallorhiza were specified as a foreground clade, the alternative model, which allows some sites to be under positive selection (ω2 > 1), did not fit the data significantly better than the null model, which allows no sites to be under positive selection (ω1 = ω2 = 1; χ² = 0.6, df = 1, P = 0.4386). However, the alternative model had a significantly better fit than the null model when specifying nongreen Corallorhiza taxa as foreground branches, after Bonferroni correction for multiple comparisons (χ² = 11.4, df = 1, P = 0.0007), suggesting that some sites in the atp complex are under positive selection.
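As a rough check on the values reported above, the P values of these likelihood ratio tests can be recomputed from the χ² statistics. The sketch below assumes a plain χ² distribution with 1 degree of freedom, as stated in the text; the function name is illustrative and is not part of PAML.

```python
import math


def chi2_sf_df1(stat: float) -> float:
    """Upper-tail probability of a chi-square variate with 1 degree of freedom.

    For df = 1 the statistic is the square of a standard normal deviate,
    so P(X > stat) = erfc(sqrt(stat / 2)).
    """
    if stat < 0:
        raise ValueError("test statistic must be non-negative")
    return math.erfc(math.sqrt(stat / 2.0))


# Statistics reported for the two branch-site comparisons in the atp complex.
p_all_corallorhiza = chi2_sf_df1(0.6)    # all Corallorhiza as foreground
p_nongreen_only = chi2_sf_df1(11.4)      # nongreen taxa as foreground
```

Using the closed form for df = 1 keeps the example within the standard library; for other degrees of freedom a statistics package would be needed.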

Discussion

Chlorophyll Content of Corallorhiza

Some Corallorhiza, and many heterotrophic plants, are often referred to as “achlorophyllous” based on their lack of visible green coloration. However, all Corallorhiza included in this study contain at least some detectable levels of chlorophyll (fig. 1). Montfort and Küsters (1940) detected chlorophyll and photosynthetic activity in Co. trifida, which is not surprising given the predominantly green coloration of the above-ground organs of this species; a later study by Cummings and Welschmeyer (1998) detected some levels of chlorophyll in the nongreen Co. maculata, as well as in another nongreen, strictly heterotrophic orchid, Cephalanthera austiniae. Taxa with at least some visible green tissue, on average, had approximately a 10-fold higher chlorophyll concentration than nongreen taxa (fig. 1). Although leafless, Co. trifida is the greenest of the coralroots, with green pigmentation throughout nearly the entire above-ground portion of the plant. The distribution of green tissues among other partially green coralroot species is more variable, but the common theme is that at least some green tissue is noticeable in or around the ovary.

Coupled with the fact that green or partially green coralroots also show a general lack of degradation in photosynthesis-related plastid gene systems (with the exception of the ndh complex and a few other “minor” examples; see below), these observations suggest that the aforementioned “green” coralroot taxa may indeed all be partially heterotrophic, relying to some small degree on photosynthetic carbon. Though it has been demonstrated that Co. trifida is an inefficient photosynthesizer (i.e., net photosynthetic carbon assimilation cannot compensate for net respiration), there is likely to be an adaptive reason that photosynthesis persists in this species, and possibly in others such as Co. odontorhiza and Co. wisteriana. Ironically, Co. trifida is the one coralroot species that might be expected to depend most heavily on photosynthesis (Zimmer et al. 2008; Cameron et al. 2009). One possibility is that these species supplement fungal carbon uptake with photosynthetic carbon in specific tissues only, mainly in the ovary, inside of which hundreds or even thousands of minute “dust seeds” are produced. This has been experimentally demonstrated in the partially mycoheterotrophic orchid Limodorum abortivum, with the additional findings that photosynthesis is highest in ovary tissue and increases under experimentally induced fungal carbon limitation (Bellino et al. 2014). It is currently unknown if any temporal fungal carbon limitations occur throughout the short season when Corallorhiza displays aboveground growth (typically late spring for flowering to summer for seed/ovary development). Based on current knowledge, it is hypothesized that for vegetatively reduced orchids such as “green” Corallorhiza and Limodorum, compensation of carbon in and around developing ovaries seems to have some adaptive value, which in turn would cause purifying selection on the plastid- (and presumably nuclear-) encoded components of photosynthesis.

The more puzzling finding applies to nongreen taxa: nongreen coralroots had surprisingly consistent (albeit relatively minute) chlorophyll concentrations, with very little variation within or among species (fig. 1), despite having been sampled from different parts of their respective geographic ranges. Chlorophyll has been detected in other strictly heterotrophic plants (Cummings and Welschmeyer 1998), including members of Orobanchaceae and the monotropoid Ericaceae. The question naturally arises: Why would putatively nonphotosynthetic taxa continue to produce chlorophyll? There are several hypotheses as to why this might be the case, including photoprotection/avoidance of light damage to DNA and other cellular components, involvement of the chlorophyll biosynthesis machinery in other biochemical pathways such as plastid-nuclear signaling, or prevention of the buildup of chlorophyll precursors that may lead to increased oxidative damage (discussed in Wickett et al. 2011 and references therein). In nearly all cases studied, strictly parasitic species

Table 3. Branch Models for Coding Regions of the Plastome in Corallorhiza for Which All Accessions Have Intact Reading Frames, Based on Various Configurations of Coding Data, Implemented in the CODEML Module of PAML.

Model   Omega Estimates   np(a)   ln L(b)   Test statistic(c)   P/Pcorr(d)

atp genes
M0   ωleafy = ωcor = ωmac-ng = ωstr = 0.21505   26   -9802.76868
M1   ωleafy = 0.15175; ωcor = ωmac-ng = ωstr = 0.26949   27   -9798.12794   9.281476
M2   ωleafy = 0.15211; ωcor = 0.22238; ωmac-ng = ωstr = 0.38239   28   -9795.68827   4.879344   ns
M3   ωleafy = 0.15212; ωcor = 0.22242; ωmac-ng = 0.41999; ωstr = 0.36243   29   -9795.62404   0.128448
M4   ωleafy = ωcor = 0.18380; ωmac-ng = ωstr = 0.38274   27   -9797.32610   10.885142
M5   ωleafy = ωcor = 0.18382; ωmac-ng = 0.41997; ωstr = 0.36292   28   -9797.26308   0.126042

Photosynthesis genes (pet, psa/b, rbcL)
M0   ωleafy = ωcor = ωmac-ng = ωstr = 0.33866   26   -1854.62551
M1   ωleafy = 0.29790; ωcor = ωmac-ng = ωstr = 0.32741   27   -1854.48900   0.273012
M2   ωleafy = 0.29856; ωcor = 0.25277; ωmac-ng = ωstr = 0.73721   28   -1852.32692   4.324170   ns
M3   ωleafy = 0.29858; ωcor = 0.25278; ωmac-ng = 0.59352; ωstr = 0.88556   29   -1852.22042   0.213002
M4   ωleafy = ωcor = 0.27361; ωmac-ng = ωstr = 0.73720   27   -1852.39825   4.454520   ns
M5   ωleafy = ωcor = 0.27362; ωmac-ng = 0.59352; ωstr = 0.88550   28   -1852.29177   0.212952

Housekeeping genes (rps/l, infA, accD, clpP, etc.)
M0   ωleafy = ωcor = ωmac-ng = ωstr = 0.30179   26   -25545.1250
M1   ωleafy = 0.29055; ωcor = ωmac-ng = ωstr = 0.32343   27   -25542.7794   4.691148   ns
M2   ωleafy = 0.29066; ωcor = 0.31584; ωmac-ng = ωstr = 0.34192   28   -25542.6470   0.264796
M3   ωleafy = 0.29065; ωcor = 0.31583; ωmac-ng = 0.33612; ωstr = 0.34612   29   -25542.6407   0.012646
M4   ωleafy = ωcor = 0.30267; ωmac-ng = ωstr = 0.34230   27   -25542.8978   4.454430   ns
M5   ωleafy = ωcor = 0.30269; ωmac-ng = 0.33612; ωstr = 0.34668   28   -25542.8908   0.014052

ycf1 & 2
M0   ωleafy = ωcor = ωmac-ng = ωstr = 0.66003   26   -25384.8570
M1   ωleafy = 0.82314; ωcor = ωmac-ng = ωstr = 0.61213   27   -25383.1494   3.415136
M2   ωleafy = 0.82370; ωcor = 0.61073; ωmac-ng = ωstr = 0.61543   28   -25383.1485   0.001854
M3   ωleafy = 0.82338; ωcor = 0.61050; ωmac-ng = 0.59460; ωstr = 0.63131   29   -25383.1215   0.053994
M4   ωleafy = ωcor = 0.67331; ωmac-ng = ωstr = 0.61464   27   -25384.6635   0.386918
M5   ωleafy = ωcor = 0.67313; ωmac-ng = 0.59296; ωstr = 0.62959   28   -25384.6382   0.050592

matK
M0   ωleafy = ωcor = ωmac-ng = ωstr = 0.43481   26   -3666.30553
M1   ωleafy = 0.44943; ωcor = ωmac-ng = ωstr = 0.42324   27   -3666.27784   0.055394
M2   ωleafy = 0.45060; ωcor = 0.38642; ωmac-ng = ωstr = 0.53159   28   -3665.93065   0.694374
M3   ωleafy = 0.45107; ωcor = 0.38419; ωmac-ng = 0.33356; ωstr = 0.99284   29   -3664.63804   2.585212
M4   ωleafy = ωcor = 0.41948; ωmac-ng = ωstr = 0.53022   27   -3666.08306   0.444942
M5   ωleafy = ωcor = 0.41857; ωmac-ng = 0.33352; ωstr = 0.98601   28   -3664.80458   2.556974

NOTE. Abbreviations for ω-ratios: leafy = green, leafy outgroups (Cymbidium, Oncidium, Phalaenopsis); cor = all Corallorhiza; mac-ng = nongreen Co. maculata (i.e., Co. mertensiana, Co. maculata maculata, Co. maculata occidentalis); str = Co. striata. Successively nested models were compared based on log-likelihood ratio tests (M1 vs. M0, M2 vs. M1, M3 vs. M2, M4 vs. M0, M5 vs. M4); all comparisons had 1 degree of freedom.

P < 0.05, P < 0.01, P < 0.001; “ns” = not significant (where indicated); otherwise, blank space indicates no significance.
(a) The number of free parameters for a particular model.
(b) Log-likelihood of the data for a particular model.
(c) The chi-square distributed log-likelihood ratio test statistic used to evaluate significant differences in model fit.
(d) P values for uncorrected/Bonferroni-corrected χ² tests, where Pcorr = branch parameters in the model being tested.
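The test statistics in table 3 follow from the tabulated log-likelihoods, and this can be verified with a short calculation. The sketch below recomputes the statistic as twice the difference in log-likelihood between nested models, using the atp-gene rows; the dictionary layout and function names are illustrative assumptions, not output of CODEML.

```python
# Log-likelihoods and free-parameter counts for the atp-gene models (table 3).
LNL = {"M0": -9802.76868, "M1": -9798.12794, "M2": -9795.68827, "M4": -9797.32610}
NP = {"M0": 26, "M1": 27, "M2": 28, "M4": 27}


def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Likelihood ratio test statistic: 2 * (lnL_alternative - lnL_null)."""
    return 2.0 * (lnl_alt - lnl_null)


def degrees_of_freedom(null_model: str, alt_model: str) -> int:
    """Difference in the number of free parameters between nested models."""
    return NP[alt_model] - NP[null_model]


# Nested comparisons listed in the table note, as (null, alternative) pairs.
comparisons = [("M0", "M1"), ("M1", "M2"), ("M0", "M4")]
results = {
    (null, alt): (lrt_statistic(LNL[null], LNL[alt]), degrees_of_freedom(null, alt))
    for null, alt in comparisons
}
```

Each comparison differs by one free parameter, which matches the single degree of freedom stated in the table note.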


(including the nongreen coralroots) that still synthesize detectable levels of chlorophyll do so at reduced levels. The fact that all nongreen coralroots contain relatively low chlorophyll concentrations means that these pigments are masked, and thus the overall coloration of these species tends to be red, purple, or brown, most likely due to anthocyanins occurring at higher concentrations (Freudenstein and Doyle 1994a; personal observation). Occasionally, yellow variants occur among populations of the various coralroot species (Freudenstein 1997; Barrett CF, personal observation). These are presumably recurrent anthocyanin-deficient mutants (yellow coloration may be due to persistence of chlorophylls and other yellow/orange carotenoid pigments), though no specific studies have been carried out to demonstrate the causes of this phenomenon in Corallorhiza. Regardless, reduced chlorophyll concentrations and the resulting apparent lack of visible green coloration in Corallorhiza are excellent indicators of relaxed selective constraints on photosynthetic machinery, and this is likely to be the case for many other parasite-containing taxa.

Characteristics of Corallorhiza Plastomes

With perhaps the exception of Co. striata vreelandii, plastomes among coralroot taxa do not altogether display the major size reduction that would be expected for a group of leafless parasites, and thus can be said to occupy the early stages along the path to a highly reduced or “minimal” plastid genome (Barbrook et al. 2006; Krause 2008). In a previous study, it was hypothesized that members of the Co. striata complex would have the most reduced plastomes in the genus in terms of overall size (Barrett and Davis 2012), but even the plastome of Co. striata vreelandii (the representative sequenced in that study and included here) only represents an approximately 6% total genome size reduction relative to the leafy Phalaenopsis (Chang et al. 2006). In fact, Co. macrantha and Co. maculata mexicana (table 2) both have larger plastomes than two of the three leafy outgroup orchid taxa, Phalaenopsis (148,964 bp) and Oncidium (146,484 bp). Thus, the shift to leaflessness per se in Corallorhiza does not appear to be associated with a reduction in plastome size for the genus overall (at least for partially green taxa), although an examination of plastomes for the closest relatives of Corallorhiza (e.g., Oreorchis, Aplectrum, Cremastra, Govenia, Calypso) in tribe Calypsoeae will help clarify this.

A reduction in plastome size due to deletions and mutations resulting in pseudogenes is expected to be associated with a decrease in GC content, as once-functional (and relatively GC-rich) genes become riddled with loss-of-function mutations over time as a result of relaxed photosynthesis-associated selective pressures. These coding regions begin to resemble noncoding DNA, which typically has a lower GC content (e.g., Bernardi 1989; Glémin et al. 2014). This is apparently the case for Co. striata vreelandii (34.02% GC; table 2), which is 1.16% lower than the mean GC content for the remaining coralroot plastomes (35.28%). Mean GC content for the nongreen members of the Co. maculata complex is 35.10%, and no different than that found in the green coralroots. This suggests that large-scale differences in plastid GC content may take longer to manifest, and thus likely come about later in the process of plastid modification relative to other processes such as photosynthesis gene deletions and other initial gene mutations causing loss of function. This lends additional support to the notion that Co. striata vreelandii is farther “down the path” of plastid genome degradation relative to the nongreen members of the Co. maculata complex.
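For readers unfamiliar with the measure, GC content is simply the percentage of G and C bases among all called bases in a sequence. The following sketch shows one way to compute it; the function name and the toy sequences are hypothetical, and ambiguous bases (e.g., N) are excluded from the denominator as an assumption of this example.

```python
def gc_content(seq: str) -> float:
    """Percent G+C among unambiguous A/C/G/T bases of a DNA sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


# Hypothetical toy sequences: an intact, GC-richer region and a degraded one.
intact_like = "ATGGCCGCTAGCGGATCC"
degraded_like = "ATGATTAATAGCAAATTT"
difference = gc_content(intact_like) - gc_content(degraded_like)
```

Applied to whole plastome sequences, the same calculation yields the percentages compared in the text.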

More broadly, the leafy orchids Phalaenopsis aphrodite and the hybrid Oncidium Gower Ramsey have GC contents of 35.6% and 37.32%, respectively (Chang et al. 2006; Wu et al. 2010); these values are very similar to what is found in Co. trifida and other green coralroots; thus, green coralroots do not seem to deviate much from other leafy orchids. Members of holo- and hemiparasitic Orobanchaceae (Wicke et al. 2013) represent an interesting clade for comparison with Corallorhiza in terms of GC content. Orobanchaceae range from 38.08% GC in the hemiparasitic Schwalbea to 31.09% in the holoparasitic Phelipanche purpurea. Variation in GC content among plastomes of Corallorhiza lies completely within the bounds of that observed in Orobanchaceae, which is hypothesized to be in the relatively more advanced stages of degradation.

Structural Evolution of the Plastome: Pseudogenes and Losses

ndh Genes

Degradation of the plastid-encoded ndh complex is advanced in Corallorhiza, as has been observed in other groups (e.g., Chang et al. 2006; Funk et al. 2007; McNeal et al. 2007; Braukmann et al. 2009; Blazier et al. 2011; Braukmann and Stefanovic 2012; Braukmann et al. 2013; Iles et al. 2013; Peredo et al. 2013; Wicke et al. 2013), including some of the leafy, photosynthetic orchid plastomes included in figure 3. Interestingly, there is variation among the orchid plastomes sequenced to date with respect to the status of ndh genes (Chang et al. 2006; Wu et al. 2010; Delannoy et al. 2011; Barrett and Davis 2012; Logacheva et al. 2011; Yang et al. 2013), suggesting multiple, independent degradation pathways in this complex in the orchid family.

The ancestor of Corallorhiza likely already experienced some degree of degradation in the ndh complex, as evidenced by several shared pseudogenes. Sequencing of additional orchid plastomes could reveal an evolutionary transition from a functional plastid-encoded ndh complex in leafy relatives (e.g., Aplectrum, Oreorchis, Cremastra, Govenia) to a largely degraded state, as observed in Corallorhiza. However, there is even some variation in the degree of degradation among certain members of ndh within Corallorhiza, which in part reflects the plastid phylogeny (fig. 2). This pattern includes a major deletion in ndhE in all taxa but Co. striata vreelandii; a shared pseudogene for ndhJ for all members of the Co. maculata complex besides Co. bulbosa (which here is sister to the remaining members of the Co. maculata complex; fig. 2); a loss of the same gene in (Co. odontorhiza, Co. wisteriana); and hypothesized parallel deletions of the ndhG pseudogene in Co. bulbosa and the clade of (Co. macrantha, Co. maculata mexicana).

The plastid-encoded ndh complex is believed to be dispensable for a number of reasons (reviewed in Martín and Sabater 2010; Wicke et al. 2011, 2013), which may differ for various groups of plants displaying evidence of degradation in this gene complex. Generally, the ndh complex is believed to be essential for photosynthetic function in environments with highly variable light intensities, by functioning in regulation of cyclic electron transport, and also in mitigating the effects of photo-oxidative damage under high light intensities (Martín et al. 2009; Martín and Sabater 2010). In the ancestor of Corallorhiza, selective constraints on the ndh complex may have been lifted by a combination of decreased dependence upon photosynthetic carbon as a result of a shift to heavy dependence upon fungal carbon (a similar situation is also observed in carnivorous plants using animals as a carbon source in the family Lentibulariaceae; Wicke et al. 2014), and a tendency to grow in dark, shaded, mature forests where light intensity is always low. This may also occur in other orchid lineages (including leafy, photosynthetic orchids) due to high dependence on fungal carbon as seedlings, and also the tendency to live in habitats with relatively low or at least static amounts of light intensity (e.g., a shaded rainforest understory or floor).

Photosynthesis-Related Genes (excluding ndh, rpo, and atp)

Nongreen members of Corallorhiza (that is, those with relatively reduced levels of chlorophyll; fig. 1) all display evidence of degradation of genes directly involved in photosynthetic processes (fig. 2; e.g., psa/b, pet, rbcL). Of these taxa, Co. striata vreelandii displays the highest number of degraded genes (table 2; figs. 2 and 3), as reported in Barrett and Davis (2012), followed by northern North American members of the Co. maculata complex (Co. mertensiana, Co. maculata maculata, Co. maculata occidentalis). Both of these complexes (Co. striata, Co. maculata) have independently undergone degradation among genes in the following genes/gene complexes (omitting ndh; see above): pet, psa/b, rpo, ccsA, cemA, and rbcL. It is highly unlikely that transcript editing of any kind can compensate for these changes, and it can be assumed that these changes are irreversible. The possibility exists, however slight, that all deleted genes or pseudogenes among the plastomes of these taxa are redundant and that functional copies are encoded in the nuclear genome, or that genes have been “regained” through intergenomic or horizontal transfer (e.g., Iorizzo et al. 2012; Straub et al. 2013; Wicke et al. 2013).

Based on the degraded states of photosynthesis-related genes among these plastomes, it can be concluded that the aforementioned nongreen taxa are strict parasites of fungi, incapable of photosynthesis, and thus the terms “holomycotroph” or “full mycoheterotroph” are indeed appropriate. No studies of photosynthetic activity have been carried out in Corallorhiza aside from those in the “greenest” coralroot, Co. trifida (Zimmer et al. 2008; Cameron et al. 2009). It will be beneficial to measure photosynthetic rates (or determine the lack thereof) along with plastid-gene transcript levels among multiple individuals of both green and nongreen coralroot taxa, preferably growing in close proximity to one another and in similar habitats, to definitively characterize photosynthetic capacity in the genus.

Though green members of Corallorhiza do not display significant degradation of photosynthesis-related genes (psa, psb, pet, rbcL, etc.), there are a few important exceptions, which may or may not affect the efficacy of photosynthesis in these taxa. First, members of Corallorhiza excluding Co. trifida and Co. striata have experienced a large deletion in psbM (PSII protein M). This low molecular weight peptide is believed to play a role in maintaining the stability of PSII dimers; psbM-deficient tobacco mutants displayed impaired stability of PSII but were still able to function (Umate et al. 2007; Kawakami et al. 2011). Photosynthesis has only been explicitly measured and demonstrated in Co. trifida (Zimmer et al. 2008; Cameron et al. 2009), which has a putatively functional copy of psbM. Thus, it is possible that either 1) other “green” coralroots (Co. odontorhiza, Co. wisteriana, Co. bulbosa, Co. macrantha, Co. maculata mexicana) carry out inhibited photosynthesis; 2) this gene has been transferred to the nucleus or mitochondrion and its product is imported into the plastid, allowing stabilization of the PSII dimer complex and some level of photosynthetic function; or 3) a nonfunctional psbM gene has no effect on photosynthetic function.

Another partially green coralroot, Co. macrantha (restricted to high elevation forests in southern Mexico), has a pseudogenized copy of psaI (Photosystem I Reaction Center Subunit VIII) due to a 4-bp insertion near the 5′-end of the gene relative to other coralroots, causing a reading frame shift. The psaI protein is believed to function in the stabilization of a nuclear-encoded subunit of the same complex (psaL) in Arabidopsis thaliana, and has also been shown to bind to the Light-harvesting Complex II (Jensen et al. 2007); it was suggested that the latter putative function is redundant. In Corallorhiza, this might explain why psaI in Co. macrantha has apparently become a pseudogene. Thus, among the green coralroots, Co. macrantha may represent yet another very early step in the transition to the loss of photosynthesis, or at least a reduction in photosynthetic efficacy, but this remains a hypothesis to be tested.

rpo Genes

There is evidence for rpo pseudogenes in both the nongreen Co. striata and Co. maculata complexes, based on large deletions, reading frame shifts, and internal stop codons. Plastid genes are transcribed by two polymerases in monocots: a nuclear-encoded polymerase and a plastid-encoded polymerase (Liere et al. 2011). The plastid-encoded RNA polymerase (PEP) is believed to preferentially transcribe photosynthesis-related genes, whereas the rpoB-C1-C2 operon itself is transcribed by a nuclear-encoded RNA polymerase (Hajdukiewicz et al. 1997; reviewed in Liere et al. 2011; Krause 2012). In addition to transcribing photosynthesis-related genes for the most part, PEP also tends to be more active in later leaf developmental stages (Hajdukiewicz et al. 1997; reviewed in Liere et al. 2011).


Based on this latter finding, it might be expected that all members of Corallorhiza should show some degradation in rpo genes, as leaves do not fully develop in any species within this genus (i.e., only the basal leaf sheath persists, but the blade or lamina never develops), presumably obviating the role of PEP in transcription of photosynthesis-related genes (e.g., psa, psb) in leaves. However, green coralroots have intact reading frames for all rpo genes, whereas nongreen coralroots show varying degrees of degradation in the rpo complex. As with photosynthesis-related genes in Corallorhiza (figs. 2 and 3), it does not appear to be the transition to leaflessness per se that has resulted in degradation of the rpo complex, but more likely the shift from partial heterotrophy (and the associated need to carry out photosynthesis, whatever small amount) to strict heterotrophy. An interesting parallel occurs in the carnivorous family Lentibulariaceae, in which rpo genes display significant departures from purifying selection (Wicke et al. 2014). An interesting follow-up would be to determine whether degradation first begins among photosystem genes (psa, psb), diminishing the importance of having a functional PEP complex, or whether the opposite is the case, in which a dysfunctional (or nonexistent) PEP has contributed to relaxed selective pressures associated with photosynthesis-related genes (i.e., if psa and psb genes can no longer be transcribed, or are done so by an inefficient PEP, then is there any benefit to maintaining them?). Regardless, the degradation of rpo genes in nongreen Corallorhiza closely mirrors that in genes directly involved in photosynthesis (psa/b, rbcL, etc.).

Relationships in Corallorhiza

Complete plastome sequences of Corallorhiza yielded a highly resolved, highly supported tree consistent with previous studies (Freudenstein and Doyle 1994a, 1994b; Barrett and Freudenstein 2008; Freudenstein and Senyo 2008). Here, for the first time, relationships among the six taxa that comprise the Co. maculata complex (as recognized in Freudenstein 1997) are fully resolved, with high support for Co. bulbosa as sister to the rest of the complex (bootstrap = 91), whereas previous studies could not resolve relationships in this complex (Barrett and Freudenstein 2008; Freudenstein and Senyo 2008). Complete plastomes also confirm the sister relationship of Co. striata to the rest of Corallorhiza, with Co. trifida (moving away from the base of the tree), then Co. odontorhiza and Co. wisteriana diverging successively. Depending on the data set analyzed, Co. bulbosa appears in a few different places, most likely due to homoplasy and few variable characters available to resolve the position of this species in smaller data sets, as evidenced by short internal branches (supplementary fig. S3, Supplementary Material online). Thus, inclusion of complete plastomes (pseudogenes, spacers, and introns) resolves the otherwise uncertain position of this species. Nuclear rDNA, on the other hand, places Co. bulbosa as sister to (Co. macrantha, Co. maculata mexicana) with moderate support, a similar finding to the combined analysis of Barrett and Freudenstein (2008) based on plastid rbcL + nuclear ITS, though that study did not include Co. macrantha. Combining all nuclear and plastid data places Co. bulbosa as sister to the remaining members of the Co. maculata complex with 100% bootstrap support (supplementary fig. S3, Supplementary Material online). The complete plastome + rDNA data set fully resolves relationships among the major species complexes of Corallorhiza (excluding members not sampled from some complexes).

In particular, relationships within the Co. maculata complex shed light on a phylogeographically intriguing aspect of plastome degradation. All Mexican members of the Co. maculata complex have at least some visible green tissue and are presumed to be partially mycoheterotrophic, based on conservation of the plastid-encoded photosynthetic apparatus (for the most part; see fig. 2). Those members of the Co. maculata complex that occur in northern North America have no visible green tissue (two of these species were shown to have highly reduced chlorophyll levels; fig. 1) and display evidence of degradation in these genes. The topology of the Co. maculata complex based on combined plastome + rDNA data (supplementary fig. S3, Supplementary Material online; fig. 2) lends support to a previous hypothesis that this complex (and possibly the genus Corallorhiza itself) originated in southern Mexico (Freudenstein and Doyle 1994a, 1994b; Barrett and Freudenstein 2008; Freudenstein and Senyo 2008) and expanded into northern North America, with the ancestor of Co. mertensiana, Co. maculata maculata, and Co. maculata occidentalis undergoing a shift from partial to strict heterotrophy (and continuing this trend among extant lineages) and thus presumably losing the ability to carry out photosynthesis. This suggests that deeper sampling among individuals in members of both the Co. maculata complex and the Co. striata complex (which is here represented only by Co. striata vreelandii) will yield a much clearer picture of plastome degradation in Corallorhiza on a phylogeographic scale.

Selective Regime among Genes with Open Reading Frames

Model M4, allowing green and nongreen taxa to have different ω-ratios (ω_leafy = ω_cor; ω_mac-ng = ω_str), fit the data significantly better than the one-ratio model for both housekeeping genes and photosynthesis-related genes, suggesting different selective scenarios for nongreen taxa relative to green taxa, but these comparisons are nonsignificant when corrected for multiple comparisons (table 3). For housekeeping genes, ω-values for green versus nongreen taxa did not differ by much (ω_leafy = ω_cor = 0.30267; ω_mac-ng = ω_str = 0.34230; table 3). However, this difference is much more pronounced for photosynthesis-related genes (ω_leafy = ω_cor = 0.27361; ω_mac-ng = ω_str = 0.73720), with the estimate for nongreen taxa being approximately 2.6× that of green taxa and approaching the value expected under neutral evolution (ω ≈ 1). Thus, even though some photosynthesis genes in nongreen taxa still have intact reading frames, there appears to be a relaxation of selective pressure associated with these genes.

For atp genes, Models M1 and M4 fit the data significantly better than does M0, even after correcting for multiple comparisons, suggesting different selective regimes in Corallorhiza versus leafy green outgroups, or at least in nongreen Corallorhiza versus green Corallorhiza and leafy outgroups. For example, in model M1 the ω value for Corallorhiza is nearly twice that for leafy orchids (ω_leafy = 0.15175; ω_cor = ω_mac-ng = ω_str = 0.26949), and in model M4 nongreen Corallorhiza have ω values over twice those for green Corallorhiza and leafy orchids. Further investigations using branch-site models, allowing some sites in the "foreground" branches to be under positive selection, suggest evidence for positive selection in nongreen Corallorhiza for the atp complex.

The plastid-encoded atp genes play a critical role in photosynthesis by both generating ATP and translocating protons across the thylakoid membrane (Evron et al. 2000; McCarty et al. 2000; Allen et al. 2011). However, in taxa that display evidence of loss of function for photosystem and other photosynthesis-related genes, the fact that atp genes are preserved as intact reading frames suggests that they may play additional roles in plastid function (discussed in Wicke et al. 2013). This is bolstered by the fact that this complex shows evidence of positive selection in strictly heterotrophic, nongreen taxa, but not for Corallorhiza as a whole. This would support the hypothesis that the atp complex has either taken on additional, as-yet-unknown roles in holomycotrophic plastids or, at a minimum, has fallen under increased selective pressure to adapt to the overall "alternative lifestyle" associated with strict fungal heterotrophy in nonphotosynthetic plastids (including in amyloplasts, chromoplasts, elaioplasts, proteinoplasts, etc.). A future research objective is to examine the potential biological roles and molecular evolutionary patterns of the atp complex in nonphotosynthetic plants and their green, photosynthetic relatives, as members of this gene complex are often preserved and putatively functional among various heterotrophic angiosperms (see fig. 3 and references therein, including discussion in Wicke et al. 2013).

Comparison of Corallorhiza with Other Parasitic Angiosperms Using the Model of Barrett and Davis (2012)

A comparison of the plastomes of several heterotrophic taxa that have been sequenced to date shows that Corallorhiza is not quite in the "advanced" stages of plastid genome degradation relative to several taxa in Orobanchaceae (e.g., Wicke et al. 2013) or other orchids such as Neottia (Logacheva et al. 2012) and Rhizanthella (Delannoy et al. 2011) (fig. 3). Interpreted in the context of the basic model presented in Barrett and Davis (2012), members of Corallorhiza fall into categories 1 and 3. Taxa with some visible green tissue display pseudogenes/deletions mainly in the ndh complex (category 1; gene-class categories are listed at the top of fig. 3), with a few additional changes, whereas those with no green tissue have experienced pseudogenization and deletions in ndh, photosynthesis-related, and rpo genes, but not in atp genes or other "housekeeping" genes (category 3, excepting the loss of a redundant trnT-GGU in Co. striata vreelandii; Barrett and Davis 2012; figs. 2 and 3). Thus, nonparasitic autotrophs and partial heterotrophs roughly fall either into category 0 (no degradation of plastid genes) or category 1 (degradation of the ndh complex only). More specifically, all partial heterotrophs fall into category 1, with evidence of degradation of the ndh complex.

Strict heterotrophs fall into more "advanced" categories. Several, but not all, members of Orobanchaceae and two orchids (Neottia and Rhizanthella) can be classified in category 5, in which there is evidence of degradation of atp genes and at least some evidence of degradation in "housekeeping" genes (e.g., rpl, rps, trn, etc.). There are some genes for which no taxa have experienced pseudogenization or loss, including rpl2, rpl16, rpl20, rps2, rps4, rps8, rps11, ycf1, ycf2, trnD-guc, trnE-uuc, trnI-cau, trnM-cau, trnQ-uug, trnY-gua, rrn16, rrn23, rrn4.5, and rrn5. This suggests that they may be indispensable for plastid function, at least in the angiosperms included in this comparison, and that a "minimal set" of genes might be necessary for functional plastid "housekeeping" and perhaps other functions outside of photosynthesis and ATP production.

Overall, no taxa fell in categories 2 or 4, suggesting that degradation of the rpo genes occurs more or less simultaneously with degradation in photosynthesis-related genes, and that degradation of atp genes occurs along with that in "housekeeping" genes (also discussed in Wicke et al. 2013). Thus, the model of Barrett and Davis (2012) could be modified in light of more recently sequenced plastomes, including those of Corallorhiza, to have five general categories: 1) no degradation; 2) degradation in the ndh complex only; 3) degradation of the ndh complex and photosynthesis-related genes, including rpo genes; 4) degradation in all plastid gene systems, including "housekeeping" and atp genes; and 5) complete or nearly complete loss of the plastid genome (as is hypothesized to have occurred in some strict heterotrophs such as Corynaea and Rafflesia [Nickrent et al. 1997; Molina et al. 2014]).
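For illustration only, the five modified categories listed above can be written as a simple decision rule. The sketch below is not part of the original analysis; the gene-class labels ("ndh", "photosynthesis", "rpo", "atp", "housekeeping") and the function name are hypothetical, and a class counts as degraded if it contains at least one pseudogene or deletion.

```python
def plastome_category(degraded_classes, plastome_present=True):
    """Assign a taxon to one of the five modified categories (1-5).

    degraded_classes: iterable of hypothetical gene-class labels that show
    at least one pseudogene or deletion in the taxon.
    """
    if not plastome_present:
        return 5  # complete or nearly complete loss of the plastid genome
    degraded = set(degraded_classes)
    if degraded & {"atp", "housekeeping"}:
        return 4  # degradation reaches atp and/or housekeeping genes
    if degraded & {"photosynthesis", "rpo"}:
        return 3  # ndh plus photosynthesis-related and/or rpo genes
    if "ndh" in degraded:
        return 2  # ndh complex only
    return 1      # no degradation


# Hypothetical inputs mirroring the two groups of Corallorhiza in the text.
examples = {
    "green Corallorhiza": {"ndh"},
    "nongreen Corallorhiza": {"ndh", "photosynthesis", "rpo"},
}
for name, classes in examples.items():
    print(name, plastome_category(classes))
```

Such a rule is deliberately coarse: it takes the most advanced degraded class as decisive, so a taxon that departs from the stepwise order (for example, one with degraded rpo and housekeeping genes but intact photosynthesis genes) is still assigned a category and would need to be flagged separately.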

There is some degree of deviation from the Barrett and Davis (2012) model, as might be expected given the potential for somewhat idiosyncratic patterns of plastome evolution in unrelated groups. The obvious exception to this model would be the sequenced members of the genus Cuscuta in the family Convolvulaceae. One of the best-studied parasitic systems in terms of plastome evolution, Cuscuta contains members with "extreme" plastome reduction relative to closely related leafy species (approximately 50% total size reduction in Cu. gronovii and Cu. obtusiflora relative to the leafy Ipomoea) (fig. 3 and references therein). Yet both highly reduced species of Cuscuta retain most, if not all, of the genes for photosynthesis, but have degraded RNA polymerase complexes and also show evidence of pseudogenes/deletions of some housekeeping genes. A further survey based on probe hybridization in Cuscuta with greatly increased taxon sampling by Braukmann et al. (2013) suggests that this genus encapsulates multiple transitions to highly degraded plastid genomes, with members of subgenus Grammica displaying even more advanced stages of degradation. These taxa were not included in figure 3, as they are based on hybridization and have not yet been sequenced to generate complete plastomes. Recycling of respiratory CO2 has been offered as an explanation for the retention of putatively functional genes in the psa, psb, and pet complexes (McNeal et al. 2007), underscoring the importance of studying the physiology of strict heterotrophs on a taxon-by-taxon basis and also highlighting how little is known about the basic physiology and functional genomics of these plants.

Conclusion

All members of Corallorhiza produce detectable levels of chlorophyll based on the current sample, but nongreen, strictly heterotrophic members have on average tenfold lower concentrations relative to green, partially heterotrophic members. Genomic data allowed assembly of complete plastomes as well as partial rDNA operons, which, when combined, fully resolve relationships among the major species complexes in Corallorhiza. Based on complete plastomes, Corallorhiza can be said to span the "early" stages of plastid genome degradation as a result of relaxed selective pressures on photosynthetic function. Corallorhiza is useful in providing a model system to understand the earliest stages in the transition to a "minimal plastome," as is observed in other heterotrophic plants (e.g., Orobanchaceae; other orchids like Rhizanthella). There have been at least two independent transitions to strict heterotrophy from partial heterotrophy in the genus (Co. maculata and Co. striata complexes). Nongreen coralroots also display evidence of relaxed selective pressures on photosynthesis-related genes that have retained open reading frames. Members of the atp complex display relaxed purifying selection among Corallorhiza as a whole, but more specifically display evidence for positive selection in nongreen, strictly heterotrophic taxa, although for reasons currently unknown.

Generally, there is a clear path of plastid genome degradation in the heterotrophic angiosperm lineages studied thus far (with some deviations from the overall model, e.g., some Cuscuta). Moreover, it can be hypothesized that there is a stepwise progression to a "minimal" plastome, and Corallorhiza occupies categories 2 (partially heterotrophic taxa, which display degradation of the ndh complex only and a few other pseudogenes) and 3 (strictly heterotrophic taxa, including degradation of the ndh complex, photosynthesis-related genes, and rpo genes [excluding atp genes]) in a modified model of the degree of plastome degradation among angiosperms. Future studies will include 1) deeper sampling of complete plastomes within species complexes of Corallorhiza to gain a phylogeographic perspective on plastome degradation; 2) direct assessment of the relative photosynthetic capabilities of members of each coralroot taxon in the field; and 3) quantification of transcript levels for plastid-encoded genes within each species to gain a clearer understanding of gene expression and to test the hypothesis of the existence of duplicate, functional copies of pseudogenized or deleted plastid genes.

Materials and Methods

Plant Material, DNA Isolation, Library Preparation, and PE Sequencing

Plant material was collected and preserved either in silica gel or frozen at −80 °C. Total genomic DNAs were extracted using the cetyltrimethylammonium bromide (CTAB) method (Doyle JJ and Doyle JL 1987) from 1.0 g frozen or 0.2–0.5 g silica-dried material and quantified using a NanoDrop 2000 spectrophotometer (Thermo Scientific, Waltham, MA). Samples with total DNA concentrations > 100 ng/µl were selected and run on a 1.5% agarose gel to check for high-molecular-weight, nondegraded DNA. Samples were shipped on ice to Cold Spring Harbor Laboratory (CSHL, Woodbury, NY), where they were sheared to approximately 450 bp, followed by library preparation and barcoding for PE Illumina sequencing. The exception was Co. odontorhiza, for which library preparation and sequencing were completed at the University of Missouri–Columbia using the Illumina TruSeq Library Preparation Kit (Illumina Inc., San Diego, CA); Co. odontorhiza DNAs were sheared to approximately 320 bp. All samples were run on Illumina HiSeq 2000 machines to generate 100-bp PE reads. Indexed samples were pooled and run with accessions from other projects, with a total of 40 or 20 accessions in a single lane on the CSHL and University of Missouri machines, respectively. Voucher information and characteristics of each of the resulting data sets and completed plastomes are listed in table 1.

Quantification of Chlorophyll Content

Approximately 0.1 g of −80 °C frozen tissue was harvested from the outer wall of the developing ovary and weighed for chlorophyll quantification. Ovary tissue was chosen 1) because this is the only organ that is visibly green in some taxa (e.g., Co. odontorhiza, Co. wisteriana) and 2) to be consistent between taxa. Chlorophylls were extracted in 80% ice-cold acetone and stored at −80 °C; chlorophylls a and b were quantified on a Cary 4000 UV-VIS spectrophotometer (Agilent Technologies Inc., Santa Clara, CA; Department of Plant Cellular & Molecular Biology, Ohio State University), and their concentrations were calculated using the formulas of Porra (2002) (Lima D, personal communication). Data from chlorophylls a and b were pooled to give total chlorophylls. Although individual accessions used to quantify chlorophyll concentration among species were not the same as those used for sequencing, multiple individuals from across each species' geographic range were included to account for variation in chlorophyll content among individuals within species. Corallorhiza bentleyi (an endangered member of the Co. striata complex) is included here but was not sequenced. Conversely, Mexican-endemic members of the Co. maculata complex (Co. macrantha, Co. bulbosa, Co. maculata var. mexicana) were not included in the chlorophyll analysis due to rarity and thus lack of fresh/frozen material. Voucher specimens for plants used for chlorophyll analysis are listed in supplementary figure S5, Supplementary Material online.
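The text cites the formulas of Porra (2002) without reproducing them. As a hedged illustration, the sketch below assumes the widely used simultaneous equations for chlorophylls a and b in buffered 80% acetone (absorbance at 663.6 and 646.6 nm, 1-cm path length, concentrations in µg/ml); whether exactly these coefficients were applied here is an assumption. The function names, extract volume, and absorbance readings are hypothetical.

```python
def chlorophyll_ug_per_ml(a663_6, a646_6):
    """Chlorophyll a, b, and total (ug/ml) from absorbances in 80% acetone.

    Coefficients are the assumed simultaneous equations for buffered 80%
    acetone; they are not stated in the source text.
    """
    chl_a = 12.25 * a663_6 - 2.55 * a646_6
    chl_b = 20.31 * a646_6 - 4.91 * a663_6
    return chl_a, chl_b, chl_a + chl_b


def total_chlorophyll_per_gram(a663_6, a646_6, extract_ml, tissue_g):
    """Total chlorophyll (ug per g tissue) for a given extract volume and tissue mass."""
    _, _, total = chlorophyll_ug_per_ml(a663_6, a646_6)
    return total * extract_ml / tissue_g


# Hypothetical readings for a 0.1 g tissue sample extracted in 2 ml of acetone.
chl_a, chl_b, total = chlorophyll_ug_per_ml(0.50, 0.20)
print(round(chl_a, 3), round(chl_b, 3), round(total, 3))
print(round(total_chlorophyll_per_gram(0.50, 0.20, extract_ml=2.0, tissue_g=0.1), 1))
```

Pooling a and b, as described above, corresponds to the third returned value. Any baseline correction (for example, subtracting absorbance at 750 nm) would have to be applied to the readings before calling these functions.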

Data Processing

The resulting paired FASTQ data were processed using a custom UNIX script ("process_fastq.sh") that calls several freely available Perl scripts for the purposes of 1) quality filtering/trimming and adaptor removal ("IlluQC_PRLL.pl," "TrimmingReads.pl"; Patel and Jain 2012); 2) "shuffling" PE files into a single file ("ShuffleSequences_fastq.pl," part of the VELVET suite of tools [E. Cabot, University of Wisconsin-Madison; Zerbino and Birney 2008]); and 3) conversion to FASTA format ("fastq2fasta.pl"; B. Knaus 2009 [http://brianknaus.com/software/srtoolbox, last accessed May 1, 2014]). Reads were trimmed from their 3′-ends, discarding bases under a quality threshold of PHRED score 20 (corresponding to an error rate of 1/100 or greater) and also discarding trimmed reads below 61 bp in length.
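The trimming rule described above can be sketched as follows. This is a minimal illustration, not the code of the cited Perl scripts, whose exact behavior may differ; it assumes Phred+33 quality encoding, and the function names are hypothetical.

```python
MIN_QUALITY = 20   # PHRED threshold given in the text
MIN_LENGTH = 61    # trimmed reads shorter than this are discarded


def trim_3prime(seq, qual, min_quality=MIN_QUALITY, offset=33):
    """Remove bases from the 3' end while their quality is below the threshold.

    qual is the FASTQ quality string; offset=33 assumes Phred+33 encoding.
    """
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_quality:
        end -= 1
    return seq[:end], qual[:end]


def passes_length_filter(seq, min_length=MIN_LENGTH):
    """Keep a trimmed read only if it is at least min_length bases long."""
    return len(seq) >= min_length


# Hypothetical 100-bp read: 70 high-quality bases followed by 30 low-quality bases.
read_seq = "A" * 100
read_qual = "I" * 70 + "#" * 30   # 'I' = Q40, '#' = Q2 under Phred+33
trimmed_seq, trimmed_qual = trim_3prime(read_seq, read_qual)
print(len(trimmed_seq), passes_length_filter(trimmed_seq))
```

For paired-end data, both mates of a pair would normally need to pass the filter before the files are shuffled together; that bookkeeping is omitted here.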

Read Mapping

PE reads were then mapped five times iteratively in Geneious v6 (created by Biomatters Inc., Auckland, New Zealand; http://www.geneious.com, last accessed December 1, 2013), using automatically determined settings, to a combined file of 1) the Phalaenopsis aphrodite plastome (GenBank accession number NC_007499; Chang et al. 2006); 2) the Phoenix dactylifera mitochondrial genome (NC_016740; Fang et al. 2012); and 3) the Asclepias syriaca nuclear ribosomal DNA consensus operon (JF312046; Straub et al. 2011). The purpose of read mapping was to filter or "screen" reads that matched the reference genomes and to get initial, crude estimates of coverage to be used in downstream de novo assemblies.

Plastome Assembly

Mean coverage estimates from Geneious were used as input for de novo plastome assembly ("-exp_cov" parameter). De novo assembly was completed using both the VELVET v1.2.10 assembler (Zerbino and Birney 2008) and the Geneious v6 de novo assembler. VELVET assemblies were generated using a variety of hash lengths (i.e., "word match" or "kmer length") ranging from 51 to 81 for each complete data set, with specified expected coverage estimates taken from Geneious mapping (specified in approximate kmer coverage; see Zerbino and Birney 2008). Coverage cutoff values were input as roughly one-half of the specified expected coverage estimates, minimum contig length was set to 300 bp, and insert size was set to 450 bp (320 bp for Co. odontorhiza), and the resulting contigs were assembled in Sequencher v6 (GeneCodes, Ann Arbor, MI). The resulting "draft de novo assembly" from VELVET was used as a reference to map reads in Geneious as described above (five iterations, automatically determined settings). Read mappings were visually scanned in Geneious to check for areas of low coverage or potential misassemblies. Read pairs that mapped successfully to the draft reference were used in a second round of de novo assembly, this time in Geneious with automatically determined settings, saving 100 consensus contigs. This method allowed construction of complete draft plastomes with relative ease.
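The conversion from mapped (per-base) coverage to the approximate kmer coverage expected by VELVET, and the one-half coverage cutoff, can be illustrated with a short calculation. The sketch assumes the relationship Ck = C × (L − k + 1) / L given in the VELVET documentation, a uniform read length of 100 bp (trimmed reads are in fact 61 to 100 bp), and hash lengths stepped by 10 between 51 and 81; the step size and the example coverage are hypothetical.

```python
def kmer_coverage(base_coverage, read_length, k):
    """Approximate k-mer coverage from per-base coverage: C * (L - k + 1) / L."""
    if not 0 < k <= read_length:
        raise ValueError("k must be between 1 and the read length")
    return base_coverage * (read_length - k + 1) / read_length


def velvet_coverage_settings(base_coverage, read_length=100,
                             hash_lengths=(51, 61, 71, 81), insert_size=450):
    """Return per-hash-length values for the velvetg options named in the keys."""
    settings = {}
    for k in hash_lengths:
        exp_cov = kmer_coverage(base_coverage, read_length, k)
        settings[k] = {
            "exp_cov": exp_cov,
            "cov_cutoff": exp_cov / 2,      # roughly one-half of expected coverage
            "min_contig_lgth": 300,
            "ins_length": insert_size,      # 320 for Co. odontorhiza
        }
    return settings


# Hypothetical mean plastome coverage of 200x from read mapping.
for k, values in velvet_coverage_settings(200).items():
    print(k, values)
```

Because kmer coverage falls as the hash length approaches the read length, the expected coverage and cutoff should be recomputed for each hash length rather than reused across runs.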

Contigs from both assemblies (Geneious, VELVET) were merged in Sequencher to form complete plastomes (parameters = "dirty data" or "large gap"; minimum overlap = 20 bp; minimum similarity 70–90%). Any apparent "gaps" among contigs resulting from one software platform were usually covered by a contig from the other, allowing assembly of completed plastomes (Large Single Copy–Inverted Repeat B–Small Single Copy). In a few cases, VELVET scaffolds contained strings of Ns between 30 and 100 bp, mostly associated with homopolymer repeats (> 10 bp). These were resolved by 1) the Geneious de novo assembly (typically) and 2) UNIX "grep" searches, with flanking sequences of 30–50 bp taken from each end of the string of Ns used as search terms. Grep results were converted to FASTA format using a custom Perl script to build a "bridge" of reads. This was done iteratively until the region in question was resolved, although most cases were resolved with a single iteration. Grep searches were applied to all cases of ambiguity between contigs from the two de novo assembly methods as a quality control. Polymerase chain reaction amplification and Sanger sequencing were not necessary due to high read coverage (table 1).
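The iterative grep-and-bridge procedure can be sketched in a few lines. This is an illustrative stand-in for the UNIX grep searches and custom Perl script, which are not reproduced in the text; it takes the longest extension among matching reads instead of building and inspecting an alignment of hits, and all names and the toy sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def reads_matching(reads, query):
    """Like grep on both strands: return hits oriented the same way as the query."""
    hits = []
    rc_query = revcomp(query)
    for read in reads:
        if query in read:
            hits.append(read)
        elif rc_query in read:
            hits.append(revcomp(read))
    return hits


def extend_right(reads, flank):
    """Longest sequence found immediately downstream of flank among matching reads."""
    best = ""
    for read in reads_matching(reads, flank):
        tail = read[read.index(flank) + len(flank):]
        if len(tail) > len(best):
            best = tail
    return best


def bridge_gap(reads, left_flank, right_flank, max_iterations=10):
    """Walk rightward from left_flank until right_flank is reached; return the gap."""
    k = len(left_flank)
    bridge = left_flank
    for _ in range(max_iterations):
        extension = extend_right(reads, bridge[-k:])
        if not extension:
            return None  # no read extends the bridge; region unresolved
        bridge += extension
        pos = bridge.find(right_flank)
        if pos != -1:
            return bridge[k:pos]
    return None


# Toy example: a 12-bp homopolymer between two 10-bp flanks, spanned by two reads.
left, right = "GATTACAGGC", "CCATGGAGTC"
reads = [revcomp(left + "TTTTTT"), "AGGC" + "T" * 12 + right]
print(bridge_gap(reads, left, right))
```

In practice the matching reads would be aligned and a consensus taken across them, since a single read may carry sequencing errors, particularly in homopolymer runs.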

IR boundaries for each draft plastome were determined by using the first and last approximately 20 bp of the draft plastome as search terms (reverse complement) in Sequencher to find matches and marking their match points as features. Then the first approximately 700 bp and last approximately 1–2,000 bp of the draft plastome were copied, reverse complemented, and manually aligned to the IR using the marked features from above as reference points. IR boundaries were determined informatically using PE information (i.e., read pairs that spanned the IR boundaries). Because the orientation of the ends of the LSC and SSC can vary, the plastomes were often in different configurations; single-copy regions were reoriented to correspond to the "standard" model of gene order to simplify downstream comparison. Special attention was paid to the possibility of significant modification or even loss of the IR (either partial or complete), and also to the possibility of other genomic rearrangements or structural modifications. In other words, when building draft plastomes, it was not assumed that all accessions displayed the "standard" plastome structure found in angiosperms (i.e., LSC, IRb, SSC, IRa).
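The reverse-complement search used to locate the IR can be illustrated on a toy sequence. The sketch assumes a linear draft laid out as LSC, IRb, SSC, IRa with perfectly identical IR copies; it does not use paired-end information and is not the Sequencher procedure itself. Function names and sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def inverted_repeat_bounds(draft, probe_length=20):
    """Locate IRb and IRa in a draft ordered LSC-IRb-SSC-IRa.

    Returns (irb_start, irb_end, ira_start) as 0-based, end-exclusive
    coordinates, or None if the terminal probe has no inverted match.
    """
    probe = revcomp(draft[-probe_length:])   # should equal the start of IRb
    irb_start = draft.find(probe)
    if irb_start == -1:
        return None
    length = 0
    n = len(draft)
    # Extend inward from both copies while the bases remain complementary.
    while (irb_start + length < n - 1 - length
           and draft[irb_start + length] == revcomp(draft[n - 1 - length])):
        length += 1
    return irb_start, irb_start + length, n - length


# Toy draft: 12-bp LSC, 14-bp IR, 10-bp SSC, and the reverse-complemented IR.
lsc, ir, ssc = "T" * 12, "GATTACAGGCTCAG", "AAAAAAAACC"
draft = lsc + ir + ssc + revcomp(ir)
print(inverted_repeat_bounds(draft, probe_length=8))
```

A draft in which the probe is not found, or in which the extension stops after only a short distance, would flag the kind of IR modification or loss that the text says was explicitly considered.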

Plastome Annotation

Plastid genomes were annotated by first aligning the set of coding loci (coding sequences [CDS]) containing introns and all tRNAs from Phalaenopsis to each draft plastome (minus IRa) in Sequencher to aid in intron/exon and tRNA boundary determination. Annotations were conducted in DOGMA (Wyman et al. 2004), which uses BLASTX searches against a database of annotated plastomes and amino acid translations and allows manual determination of start/stop codons. DOGMA BLASTX parameters were as follows: genetic code = plant plastid; e value = 5; percent cutoff for protein-coding genes = 55; and percent cutoff for tRNAs = 80. The resulting GenBank feature table was exported from the DOGMA server and imported, along with the finished plastid FASTA file, into SEQUIN (National Center for Biotechnology Information), where the annotation was validated and completed for submission to GenBank. A large inversion in Co. maculata var. maculata was manually reversed for the purposes of phylogenetic analyses and other whole-plastome comparisons, but the plastome for that accession was annotated with the inversion as assembled de novo. Plastome maps were drawn with GenomeVx (Conant and Wolfe 2008) and manually edited in Adobe Illustrator CS6 (Adobe Inc., San Jose, CA).

Ribosomal DNA Assembly and Annotation

Nuclear ribosomal DNA reads were filtered by mapping reads from Co. odontorhiza to a nearly complete rDNA operon reference (partial ETS [External Transcribed Spacer], 18S rDNA, ITS1 [Internal Transcribed Spacer], 5.8S rDNA, ITS2, 26S rDNA) from A. syriaca, described above, as well as mapping to an ITS-spanning sequence from Co. odontorhiza from GenBank (accession number EU391326). Other regions were not assembled (e.g., 5S rDNA). Resulting reads that mapped to each reference were assembled de novo in Geneious, and resulting contigs were merged through overlap in Sequencher. This draft assembly was used for a second round of read mapping (original Co. odontorhiza reads) and de novo assembly in Geneious, resulting in a single consensus contig. The consensus contig was based on a 50% majority per site, bearing in mind that the rDNA operon exists in numerous repeated copies that often differ from one another, appearing as polymorphism in read pileups. The Co. odontorhiza rDNA consensus operon was then used as a reference in read mapping for all other Corallorhiza species, followed by individual de novo assemblies as above.
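A per-site majority consensus of the kind described above can be sketched as follows. This is a simplified illustration and not the Geneious implementation, which may resolve ties and ambiguities differently (for example, with IUPAC codes); here a base is called only if it exceeds half of the reads at a site, and the function name and pileup are hypothetical.

```python
from collections import Counter


def majority_consensus(columns, threshold=0.5, ambiguous="N"):
    """Call one base per pileup column if its frequency exceeds the threshold.

    columns: list of non-empty strings, each holding the bases observed at
    one site across the mapped reads.
    """
    consensus = []
    for column in columns:
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(column) > threshold else ambiguous)
    return "".join(consensus)


# Hypothetical pileup: the second site is split evenly between two rDNA variants.
pileup = ["AAAT", "CCGG", "GGGG"]
print(majority_consensus(pileup))
```

Sites where no base reaches a majority are exactly those where divergent copies of the rDNA repeat are present at similar frequency, which is why the text cautions that polymorphism in read pileups is expected.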

Nearly complete ribosomal DNA operons (partial ETS, 18S rDNA, ITS1, 5.8S, ITS2, 26S rDNA; excluding 5S rDNA) were annotated by first aligning finished contigs in Geneious (using the MUSCLE v3.8.31 plugin in Geneious; Edgar 2004) and transferring the annotation features from Co. odontorhiza to the aligned Corallorhiza sequences. Then the entire alignment was annotated in SEQUIN for submission to GenBank by adding each feature to Co. odontorhiza and subsequently propagating the features to the remaining sequences. Reads were again mapped in Geneious to each finished, annotated rDNA cistron to check for misassemblies. GenBank accession numbers for rDNA partial operons are KM390003–KM390012. Available outgroup sequences for Cymbidium, Oncidium, and Phalaenopsis were mined from GenBank for the purposes of downstream phylogenetic analyses (18S rDNA: Cy. faberi, GenBank accession number JN418934; ITS: Cy. aloifolium, JF729014; Phalaenopsis aphrodite, AY391543; O. andradeanum, FJ565598).

Data Matrix Assembly and Sequence Alignment

Annotated plastomes were imported into Geneious, with subsequent removal of the second copy of the IR (IRa), and aligned with MAFFT v7.157 (Katoh et al. 2002) using the "fast-progressive" method (gap opening penalty = 6; offset = 0.1), followed by manual adjustment. The alignment also included complete plastomes of Cy. aloifolium (GenBank accession number NC_021429), Oncidium Gower Ramsey (GenBank accession number NC_014056), and Phalaenopsis aphrodite (GenBank accession number NC_017049). A short inversion of the petN-psbM region in Cymbidium was removed for that accession in downstream analyses. Annotated CDS were exported from Geneious, and data from each locus were saved as separate FASTA files. Each file was submitted for codon-based alignment to the MACSE web server (Ranwez et al. 2011; http://mbb.univ-montp2.fr/MBB/subsection/softExec.php?soft=macse, last accessed May 30, 2014), which allows alignment of both intact open reading frames and incomplete sequences (including pseudogenes with modified reading frames). Alignments were adjusted manually when necessary and imported back into Geneious or SequenceMatrix (Vaidya et al. 2011), where they were concatenated into larger matrices. Protein-coding genes that contained pseudogenes (shifted reading frames, internal stop codons) in at least one taxon were excluded. Data matrices are deposited in supplementary fig. S6, Supplementary Material online.

Phylogenetic Analyses

Phylogenetic analyses were conducted using both parsimony and ML. Parsimony searches were conducted in TNT (Goloboff et al. 2003, 2008) with 100 random addition starting sequences, saving 100 trees per replicate, and using tree bisection–reconnection branch swapping on the pool of saved trees. Support was assessed through 2,000 jackknife pseudoreplicates with the above search parameters. Model fit for ML analyses was assessed for the complete plastome alignment in MEGA v6 (Tamura et al. 2013), which yielded GTR+G as the best-fit model based on the corrected Akaike Information Criterion (AICc) metric (Burnham and Anderson 2002). All ML analyses were conducted in RAxML v8.0.20 (Stamatakis 2006). Three separate searches were initiated for each matrix analyzed, from different random starting seeds, using an unpartitioned GTR+G model to ensure convergence on the same topology. A codon-partitioned model was not chosen due to the possibility of relaxed functional constraints among some coding regions for some taxa, which could possibly represent model violations. Support was assessed with 1,000 standard bootstrap pseudoreplicates. Cymbidium was chosen as the outgroup taxon.
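The AICc comparison used for model selection can be written out explicitly: AICc = 2k − 2 lnL + 2k(k + 1)/(n − k − 1), where k is the number of free parameters and n the sample size. The sketch below is illustrative only; the log-likelihoods are invented, the parameter counts exclude branch lengths, and treating alignment length as n is one common convention, not necessarily the one applied in MEGA.

```python
def aicc(log_likelihood, n_params, sample_size):
    """Corrected Akaike Information Criterion (smaller is better)."""
    aic = 2 * n_params - 2 * log_likelihood
    correction = (2 * n_params * (n_params + 1)) / (sample_size - n_params - 1)
    return aic + correction


def best_model(fits, sample_size):
    """fits maps model name -> (log-likelihood, number of free parameters)."""
    return min(fits, key=lambda name: aicc(fits[name][0], fits[name][1], sample_size))


# Hypothetical fits for three substitution models on a 130,000-site alignment.
fits = {
    "JC": (-52000.0, 0),
    "HKY+G": (-50500.0, 5),
    "GTR+G": (-50400.0, 9),
}
for name, (lnl, k) in fits.items():
    print(name, round(aicc(lnl, k, 130000), 3))
print(best_model(fits, 130000))
```

With alignments of this length the small-sample correction term is negligible, so AICc and AIC rank the models almost identically.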

Plastid Pseudogenes, Genomic Deletions, and Genomic Attributes

The presence of pseudogenes and whole-gene/regional deletions was initially assessed visually in DOGMA by identifying interrupted BLASTX reading frames. All coding loci aligned in MACSE (see above) were imported into Geneious, and sequences were translated to assess pseudogene status through the presence of frameshifts and/or premature stop codons. Loci with interrupted reading frames were considered to be pseudogenes unless there was an alternative start or stop codon within three codon positions of the other taxa in the alignment. In addition, the alignment of whole annotated plastomes (described above) was used to visualize any larger genomic deletions spanning CDS boundaries, or rearrangements.
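The screening criteria above can be approximated programmatically. The following sketch is an illustration under simplifying assumptions and not the procedure used in Geneious: a frameshift is inferred when the ungapped length is not a multiple of three, only the three standard stop codons are considered, and the three-codon tolerance is applied to stop codons near the 3′ end only. Function and label names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reading_frame_status(cds, tolerance=3):
    """Classify an aligned coding sequence as intact, frameshift, or premature_stop.

    cds: nucleotide string, possibly containing alignment gaps ('-').
    tolerance: a stop within this many codons of the end is treated as an
    alternative stop codon, not as a premature one.
    """
    seq = cds.replace("-", "").upper()
    if len(seq) % 3 != 0:
        return "frameshift"
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    last_allowed = len(codons) - 1 - tolerance
    for index, codon in enumerate(codons):
        if codon in STOP_CODONS and index < last_allowed:
            return "premature_stop"
    return "intact"


# Hypothetical sequences: intact, an early internal stop, and a 1-bp deletion.
print(reading_frame_status("ATGAAATAA"))
print(reading_frame_status("ATG" + "TAA" + "AAA" * 5 + "TAA"))
print(reading_frame_status("ATGAAAT"))
```

A length check alone misses compensating insertions and deletions, so in practice the codon-aware alignment (which marks frameshifts explicitly) and visual inspection remain necessary.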

Hypotheses of differences in plastid genomic attributes among nongreen versus green taxa, and correlations between chlorophyll content and genomic attributes, were assessed among taxa using two phylogeny-based analytical approaches. First, a Phylogenetic Analysis of Variance (Garland et al. 1993) was used to test for differences in continuous plastome variables (e.g., length, GC content) among green versus nongreen taxa (i.e., a discrete grouping variable), taking phylogenetic relationships into account. Analyses were run in the R package "geiger" (Harmon et al. 2008) with 10,000 simulations each using the "aov.phylo" function, based on an ultrametric tree generated through Nonparametric Rate Smoothing (Sanderson 1997). These analyses were run for a complete data set of 13 taxa (three leafy, green outgroup taxa and ten species of Corallorhiza). A second set of analyses was conducted on a subset of six taxa of Corallorhiza for which chlorophyll data were available, to mitigate problems with missing chlorophyll data. In these analyses, continuous genomic variables (length, number of putatively functional genes, GC content) were correlated with chlorophyll content (continuous) using Phylogenetically Independent Contrasts to account for phylogenetic relatedness among taxa (Felsenstein 1985). The latter analyses were conducted under a Brownian motion model with the PDAP package (Midford et al. 2005) in Mesquite v2.75 (Maddison WP and Maddison DR 2011).
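The logic of phylogenetically independent contrasts can be shown on a small, fully bifurcating tree. The sketch below follows the standard algorithm of Felsenstein (1985) under Brownian motion; it is a minimal re-expression for illustration, not the PDAP implementation, and the tree, branch lengths, and trait values are invented.

```python
import math


def contrasts(tree, trait):
    """Standardized independent contrasts for one trait.

    tree: a leaf name (str) or a pair ((left, branch_length), (right, branch_length)).
    trait: dict mapping leaf name -> trait value.
    """
    results = []

    def visit(node):
        # Returns (estimated value at node, extra length added to its stem).
        if isinstance(node, str):
            return trait[node], 0.0
        (left, v_left), (right, v_right) = node
        x_left, extra_left = visit(left)
        x_right, extra_right = visit(right)
        v_left += extra_left
        v_right += extra_right
        results.append((x_left - x_right) / math.sqrt(v_left + v_right))
        value = (x_left / v_left + x_right / v_right) / (1 / v_left + 1 / v_right)
        return value, v_left * v_right / (v_left + v_right)

    visit(tree)
    return results


def contrast_correlation(contrasts_x, contrasts_y):
    """Correlation through the origin between two sets of contrasts."""
    sxy = sum(a * b for a, b in zip(contrasts_x, contrasts_y))
    sxx = sum(a * a for a in contrasts_x)
    syy = sum(b * b for b in contrasts_y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical three-taxon tree ((A:1, B:1):1, C:2) and invented trait values.
tree = (((("A", 1.0), ("B", 1.0)), 1.0), ("C", 2.0))
chlorophyll = {"A": 3.0, "B": 1.0, "C": 5.0}
plastome_length = {"A": 147.0, "B": 138.0, "C": 149.0}
r = contrast_correlation(contrasts(tree, chlorophyll), contrasts(tree, plastome_length))
print(round(r, 3))
```

A tree with n tips yields n − 1 contrasts, so the six-taxon chlorophyll subset described above would provide only five contrasts per variable, which limits statistical power.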

Model-Based Analyses of Selective Regime

Changes in selective regime among species of Corallorhiza for codon-based matrices were assessed with the CODEML module in PAML v4.8 (Yang 1997, 2007) using "branch" models. Only loci for which all taxa had open reading frames were included in the analyses. Various matrices are listed in table 3. The objective was to test hypotheses of different ω ratios for 1) Corallorhiza versus leafy outgroups and 2) nongreen Corallorhiza versus green Corallorhiza + leafy outgroups. To accomplish this, different branch models were fit to the data using the topology resulting from the ML analysis of whole plastomes + rDNA. Six different branch models were tested. First, a model in which all branches evolve under one ω ratio (Model M0) was tested, where ω = dN/dS, the ratio of nonsynonymous substitutions per nonsynonymous site to synonymous substitutions per synonymous site. A two-ratio model (M1) was then run in which all branches of Corallorhiza were allowed to have a different ratio (ω_cor) than the remaining leafy outgroup sequences (ω_leafy). A three-ratio model was then run (M2), allowing separate values for ω_leafy, ω_cor, and nongreen Corallorhiza (ω_str = ω_mac-ng; Co. striata, and Co. mertensiana, Co. maculata var. maculata and occidentalis, respectively), followed by a four-ratio model (M3) with additionally separate ratios for Co. striata (ω_str) and the nongreen members of the Co. maculata complex (ω_mac-ng). The latter three models (M1–M3) specifically test hypotheses of whether Corallorhiza, and particularly nongreen Corallorhiza, display departures from purifying selection. Two additional models were run to test whether the nongreen Corallorhiza differ significantly from green Corallorhiza and leafy outgroup taxa. Specifically, Model M4 (two ratios) sets ω_leafy = ω_cor and ω_str = ω_mac-ng, whereas M5 (three ratios) allows ω_str and ω_mac-ng to differ. Successively nested models were compared based on log-likelihood ratio tests (M1 vs. M0, M2 vs. M1, M3 vs. M2, M4 vs. M0, M5 vs. M4), where the likelihood ratio test statistic was calculated as two times the difference in log-likelihood between the two models, with degrees of freedom calculated as the difference in the number of free parameters for the models, and tested against the χ² distribution. P values for multiple comparisons to the same data set were Bonferroni-corrected (Dunn 1959), as suggested by Yang (2007), with the corrected significance level corresponding to α/m, where m is the number of branches being tested. Genes ycf1 and ycf2 were analyzed separately because they have extraordinarily high substitution rates (e.g., Barrett et al. 2013), as was matK, based on the finding of a pseudogenized copy in a previous study (Freudenstein and Senyo 2008).
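Each of the nested comparisons listed above differs by a single ω parameter, so every likelihood ratio test has 1 degree of freedom. For that case the χ² tail probability has a closed form, p = erfc(√(Λ/2)), where Λ is twice the difference in log-likelihood. The sketch below applies this together with a Bonferroni-adjusted threshold; the log-likelihood values, α = 0.05, and m = 3 are hypothetical and are not taken from table 3.

```python
import math


def likelihood_ratio_test_df1(lnl_null, lnl_alternative):
    """Return (test statistic, P value) for nested models differing by one parameter."""
    statistic = max(2.0 * (lnl_alternative - lnl_null), 0.0)
    p_value = math.erfc(math.sqrt(statistic / 2.0))  # chi-square tail, 1 df
    return statistic, p_value


def bonferroni_threshold(alpha, m):
    """Corrected significance level alpha / m."""
    return alpha / m


# Hypothetical log-likelihoods for a one-ratio and a two-ratio branch model.
statistic, p_value = likelihood_ratio_test_df1(-25310.40, -25305.15)
threshold = bonferroni_threshold(0.05, 3)
print(round(statistic, 2), round(p_value, 5), p_value < threshold)
```

A comparison can therefore be significant at the nominal level yet fail the corrected threshold, which is the pattern reported in the text for housekeeping and photosynthesis-related genes under Model M4.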

To further investigate the ω values among branches for the atp complex, two sets of branch-site models were tested against the data. In the first, all Corallorhiza were set as the foreground branch, and in the second, nongreen Corallorhiza (Co. striata vreelandii, Co. maculata var. maculata and occidentalis, and Co. mertensiana) were set as foreground branches. For both sets of models, two alternative models were tested. The “null” model specifies ω0 < 1 for a proportion of sites and ω1 = ω2 = 1 for the remaining sites (no positive selection), whereas the alternative model allows ω2 > 1 (positive selection on some sites). The alternative model (some sites under positive selection) was then tested against the null model (no sites under positive selection) using a χ2 distribution as above, with 1 degree of freedom (i.e., the alternative and null models differ by one parameter).

Comparison with Other Parasitic Angiosperm Lineages

Plastid gene content (losses, pseudogenes, putatively functional genes) was compared among diverse lineages of angiosperms containing heterotrophs, including the newly sequenced Corallorhiza. Taxa and genes were sorted by the numbers of putatively functional genes and by gene functional class, respectively, to illustrate the different stages of plastid genome degradation. Specifically, genes were arranged according to the model of Barrett and Davis (2012), in which degradation of plastid functional gene classes is hypothesized to have occurred roughly in the following order: 1) NAD(P)H dehydrogenase genes (ndh); 2) “photosynthesis”-related genes, including pet, psa, psb, cemA, ccsA, rbcL, ycf3, and ycf4; 3) plastid-encoded RNA polymerase genes (rpo); 4) ATP synthase genes (atp); and finally 5) “housekeeping” genes (rpl, rps, rrn, trn, matK, clpP, infA, accD, ycf1, and ycf2). Though not a formal test in the strict sense, the use of this model does allow an assessment of how advanced the sequenced accessions of Corallorhiza are when compared with other heterotrophic angiosperms; it is hypothesized that Corallorhiza occupies the early stages of this model. Technically, atp genes function in the generation of ATP and movement of protons across the thylakoid membrane, and thus play a crucial role in photosynthesis, but this functional class was kept separate from photosynthesis genes due to the possibility that ATP synthase serves roles outside of photosynthesis.
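As a rough illustration of how genes can be arranged by the five functional classes listed above, the following sketch assigns each gene name to a class by its prefix and sorts genes in the hypothesized order of loss. The function and variable names are hypothetical, and the example gene list is arbitrary; only the class membership comes from the text.

```python
# Functional classes in the hypothesized order of degradation (first = lost earliest).
CLASS_ORDER = ["ndh", "photosynthesis", "rpo", "atp", "housekeeping"]

# Gene names or name prefixes for the two mixed classes, as listed in the text.
PHOTOSYNTHESIS = ("pet", "psa", "psb", "cemA", "ccsA", "rbcL", "ycf3", "ycf4")
HOUSEKEEPING = ("rpl", "rps", "rrn", "trn", "matK", "clpP", "infA", "accD",
                "ycf1", "ycf2")


def functional_class(gene):
    """Return the functional class of a plastid gene, based on its name prefix."""
    if gene.startswith("ndh"):
        return "ndh"
    if gene.startswith(PHOTOSYNTHESIS):
        return "photosynthesis"
    if gene.startswith("rpo"):
        return "rpo"
    if gene.startswith("atp"):
        return "atp"
    if gene.startswith(HOUSEKEEPING):
        return "housekeeping"
    raise ValueError(f"unclassified gene: {gene}")


def sort_by_degradation_order(genes):
    """Sort genes by functional class first, then alphabetically within a class."""
    return sorted(genes, key=lambda g: (CLASS_ORDER.index(functional_class(g)), g))


example = ["matK", "atpB", "rbcL", "ndhF", "rpoC1", "psbA"]
print(sort_by_degradation_order(example))
```

Keeping atp as its own class, rather than folding it into the photosynthesis tuple, mirrors the choice explained in the text.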

Supplementary Material

Supplementary figures S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by the National Science Foundation (awards DEB-0830020 to Jerrold I. Davis and DBI 1110443 to J.C.P.), the California State University Program for Education and Research in Biotechnology (CSUPERB) to C.F.B., the California State University, Los Angeles Center for Effective Teaching and Learning Faculty-Student Mentorship award to C.F.B. and J.L., and the California State University Council on Ocean Affairs, Science and Technology to C.S. The authors are grateful to Jerrold Davis, Patrick Edger, Ryan Ellingson, Gwynne Lim, Jeffrey Morawetz, and Chris Randle for feedback and laboratory/bioinformatics support; the staff at Cold Spring Harbor Laboratory (Woodbury, NY) for sample preparation and sequencing; Daniel Lima and Richard Sayre for assistance with chlorophyll spectroscopy and data interpretation; Gerardo Salazar (UNAM) for assistance in collecting Mexican specimens; and Maria Logacheva and three anonymous reviewers for critical insight that improved the paper.

References

Allen JF, de Paula WBM, Puthiyaveetil S, Nield J. 2011. A structural phylogenetic map for chloroplast photosynthesis. Trends Plant Sci. 16:645–655.

Barbrook AC, Howe CJ, Purton S. 2006. Why are plastid genomes retained in non-photosynthetic organisms? Trends Plant Sci. 11:101–108.

Barrett CF, Davis JI. 2012. The plastid genome of the mycoheterotrophic Corallorhiza striata (Orchidaceae) is in the relatively early stages of degradation. Am J Bot. 99:1513–1523.

Barrett CF, Davis JI, Leebens-Mack J, Conran JG, Stevenson DW. 2013. Plastid genomes and deep relationships among the commelinid monocot angiosperms. Cladistics 29:65–87.

Barrett CF, Freudenstein JV. 2008. Molecular evolution of rbcL in the mycoheterotrophic coralroot orchids (Corallorhiza Gagnebin, Orchidaceae). Mol Phylogenet Evol. 47:665–679.

Bellino A, Alfani A, Selosse M-A, Guerrieri R, Borghetti M, Baldantoni D. 2014. Nutritional regulation in mixotrophic plants: new insights from Limodorum abortivum. Oecologia 175:875–885.

Bernardi G. 1989. The isochore organization of the human genome. Annu Rev Genet. 23:637–661.

Bidartondo MI. 2005. The evolutionary ecology of myco-heterotrophy. New Phytol. 167:335–352.

Blazier J, Guisinger MM, Jansen RK. 2011. Recent loss of plastid-encoded ndh genes within Erodium (Geraniaceae). Plant Mol Biol. 76:263–272.

Braukmann T, Kuzmina M, Stefanovic S. 2013. Plastid genome evolution across the genus Cuscuta (Convolvulaceae): two clades within subgenus Grammica exhibit extensive gene loss. J Exp Bot. 64:977–989.

Braukmann T, Stefanovic S. 2012. Plastid genome evolution in mycoheterotrophic Ericaceae. Plant Mol Biol. 79:5–20.

Braukmann TWA, Kuzmina M, Stefanovic S. 2009. Loss of all plastid ndh genes in Gnetales and conifers: extent and evolutionary significance for the seed plant phylogeny. Curr Genet. 55:323–337.

Burnham KP, Anderson DR. 2002. Avoiding pitfalls when using information-theoretic methods. J Wildl Manage. 66:912–918.

Cameron DD, Preiss K, Gebauer G, Read DJ. 2009. The chlorophyll-containing orchid Corallorhiza trifida derives little carbon through photosynthesis. New Phytol. 183:358–364.

Chang CC, Lin HC, Lin IP, Chow TY, Chen HH, Chen WH, Cheng CH, Lin CY, Liu SM, Chang CC, et al. 2006. The chloroplast genome of Phalaenopsis aphrodite (Orchidaceae): comparative analysis of evolutionary rate with that of grasses and its phylogenetic implications. Mol Biol Evol. 23:279–291.

Conant GC, Wolfe KH. 2008. GenomeVx: simple web-based creation of editable circular chromosome maps. Bioinformatics 24:861–862.

Cummings MP, Welschmeyer NA. 1998. Pigment composition of putatively achlorophyllous angiosperms. Plant Syst Evol. 210:105–111.

Delannoy E, Fujii S, Colas des Francs-Small C, Brundrett M, Small I. 2011. Rampant gene loss in the underground orchid Rhizanthella gardneri highlights evolutionary constraints on plastid genomes. Mol Biol Evol. 28:2077–2086.

Doyle JJ, Doyle JL. 1987. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem Bull. 19:11–15.

Dunn OJ. 1959. Confidence intervals for the means of dependent, normally distributed variables. J Am Stat Assoc. 54:613–621.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Evron Y, Johnson EA, McCarty RE. 2000. Regulation of proton flow and ATP synthesis in chloroplasts. J Bioenerg Biomembr. 32:501–506.

Fang YJ, Wu H, Zhang TW, Yang M, Yin YX, Pan LL, Yu XG, Zhang XW, Hu SNA, Al-Mssallem IS, et al. 2012. A complete sequence and transcriptomic analyses of date palm (Phoenix dactylifera L.) mitochondrial genome. PLoS One 7:e37164.

Felsenstein J. 1985. Phylogenies and the comparative method. Am Nat. 125:1–15.

Freudenstein JV. 1992. Systematics of Corallorhiza and the Corallorhizinae (Orchidaceae) [PhD dissertation]. [Ithaca (NY)]: Cornell University.

Freudenstein JV. 1997. A monograph of Corallorhiza (Orchidaceae). Harv Pap Bot. 10:5–51.

Freudenstein JV, Barrett CF. 2010. Mycoheterotrophy and diversity in Orchidaceae. In: Seberg O, Petersen G, Barfod A, Davis JI, editors. Diversity, phylogeny, and evolution in the monocotyledons. The Proceedings of the Fourth International Conference on Monocot Systematics. Aarhus (Denmark): Aarhus University Press. p. 25–37.

Freudenstein JV, Doyle JJ. 1994a. Character transformation and relationships in Corallorhiza (Orchidaceae: Epidendroideae). 1. Plastid DNA. Am J Bot. 81:1449–1457.

Freudenstein JV, Doyle JJ. 1994b. Plastid DNA, morphological variation, and the phylogenetic species concept: the Corallorhiza maculata (Orchidaceae) complex. Syst Bot. 19:273–290.

Freudenstein JV, Senyo DM. 2008. Relationships and evolution of matK in a group of leafless orchids (Corallorhiza and Corallorhizinae; Orchidaceae: Epidendroideae). Am J Bot. 95:498–505.

Funk HT, Berg S, Krupinska K, Maier UG, Krause K. 2007. Complete DNA sequences of the plastid genomes of two parasitic flowering plant species, Cuscuta reflexa and Cuscuta gronovii. BMC Plant Biol. 7:45.

Galagan JE, Calvo SE, Borkovich KA, Selker EU, Read ND, Jaffe D, FitzHugh W, Ma LJ, Smirnov S, Purcell S, et al. 2003. The genome sequence of the filamentous fungus Neurospora crassa. Nature 422:859–868.

Garland T, Dickerman AW, Janis CM, Jones JA. 1993. Phylogenetic analysis of covariance by computer simulation. Syst Biol. 42:265–292.

Glemin S, Clement Y, David J, Ressayre A. 2014. GC content evolution in coding regions of angiosperm genomes: a unifying hypothesis. Trends Genet. 30:263–270.

Goloboff PA, Farris JS, Kallersjo M, Oxelman B, Ramirez MJ, Szumik CA. 2003. Improvements to resampling measures of group support. Cladistics 19:324–332.

Goloboff PA, Farris JS, Nixon KC. 2008. TNT, a free program for phylogenetic analysis. Cladistics 24:774–786.


Hajdukiewicz PTJ, Allison LA, Maliga P. 1997. The two RNA polymerases encoded by the nuclear and the plastid compartments transcribe distinct groups of genes in tobacco plastids. EMBO J. 16:4041–4048.

Harmon LJ, Weir JT, Brock CD, Glor RE, Challenger W. 2008. GEIGER: investigating evolutionary radiations. Bioinformatics 24:129–131.

Iles WJD, Smith SY, Graham SW. 2013. A well-supported phylogenetic framework for the monocot order Alismatales reveals multiple losses of the plastid NADH dehydrogenase complex and a strong long-branch effect. In: Wilkin P, Mayo S, editors. Early events in monocot evolution. Cambridge: Cambridge University Press. p. 1–28.

Iorizzo M, Grzebelus D, Senalik D, Szklarczyk M, Spooner D, Simon P. 2012. Against the traffic: the first evidence for mitochondrial DNA transfer into the plastid genome. Mob Genet Elements. 2:261.

Jensen PE, Bassi R, Boekema EJ, Dekker JP, Jansson S, Leister D, Robinson C, Scheller HV. 2007. Structure, function and regulation of plant photosystem I. Biochim Biophys Acta. 1767:335–352.

Katinka MD, Duprat S, Cornillot E, Metenier G, Thomarat F, Prensier G, Barbe V, Peyretaillade E, Brottier P, Wincker P, et al. 2001. Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature 414:450–453.

Katoh K, Misawa K, Kuma Ki, Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30:3059–3066.

Kawakami K, Umena Y, Iwai M, Kawabata Y, Ikeuchi M, Kamiya N, Shen J-R. 2011. Roles of PsbI and PsbM in photosystem II dimer formation and stability studied by deletion mutagenesis and X-ray crystallography. Biochim Biophys Acta. 1807:319–325.

Knaus BJ. 2010. Short read toolbox. Available from: http://brianknaus.com

Krause K. 2008. From chloroplasts to “cryptic” plastids: evolution of plastid genomes in parasitic plants. Curr Genet. 54:111–121.

Krause K. 2012. Plastid genomes of parasitic plants: a trail of reductions and losses. In: Bullerwell C, editor. Organelle genetics. Berlin (Germany): Springer. p. 79–103.

Kuijt J. 1969. The biology of flowering parasitic plants. Berkeley (CA): University of California Press.

Lawrence JG. 2005. Common themes in the genome strategies of pathogens. Curr Opin Genet Dev. 15:584–588.

Leake JR. 1994. The biology of myco-heterotrophic (saprophytic) plants. New Phytol. 127:171–216.

Leake JR, Cameron DD. 2010. Physiological ecology of mycoheterotrophy. New Phytol. 185:601–605.

Liere K, Weihe A, Borner T. 2011. The transcription machineries of plant mitochondria and chloroplasts: composition, function, and regulation. J Plant Physiol. 168:1345–1360.

Logacheva MD, Schelkunov MI, Nuraliev MS, Samigullin TH, Penin AA. 2014. The plastid genome of mycoheterotrophic monocot Petrosavia stellaris exhibits both gene losses and multiple rearrangements. Genome Biol Evol. 6:238–246.

Logacheva MD, Schelkunov MI, Penin AA. 2011. Sequencing and analysis of plastid genome in mycoheterotrophic orchid Neottia nidus-avis. Genome Biol Evol. 3:1296–1303.

Maddison WP, Maddison DR. 2011. Mesquite: a modular system for evolutionary analysis, version 2.75. Available from: http://mesquiteproject.org

Martín M, Funk HT, Serrot PH, Poltnigg P, Sabater B. 2009. Functional characterization of the thylakoid ndh complex phosphorylation by site-directed mutations in the ndhF gene. Biochim Biophys Acta. 1787:920–928.

Martín M, Sabater B. 2010. Plastid ndh genes in plant evolution. Plant Physiol Biochem. 48:636–645.

McCarty RE, Evron Y, Johnson EA. 2000. The chloroplast ATP synthase: a rotary enzyme? Annu Rev Plant Physiol Plant Mol Biol. 51:83–109.

McNeal JR, Arumugunathan K, Kuehl JV, Boore JL, Depamphilis CW. 2007. Systematics and plastid genome evolution of the cryptically photosynthetic parasitic plant genus Cuscuta (Convolvulaceae). BMC Biol. 5:55.

McNeal JR, Bennett JR, Wolfe AD, Mathews S. 2013. Phylogeny and origins of holoparasitism in Orobanchaceae. Am J Bot. 100:971–983.

Merckx V, Freudenstein JV. 2010. Evolution of mycoheterotrophy in plants: a phylogenetic perspective. New Phytol. 185:605–609.

Midford PE, Garland T Jr, Maddison WP. 2005. PDAP package of Mesquite, version 1.07. Available from: http://mesquiteproject.org

Molina J, Hazzouri KM, Nickrent D, Geisler M, Meyer RS, Pentony MM, Flowers JM, Pelser P, Barcelona J, Inovejas SA, et al. 2014. Possible loss of the chloroplast genome in the parasitic flowering plant Rafflesia lagascae (Rafflesiaceae). Mol Biol Evol. 31:793–803.

Montfort C, Küsters E. 1940. Saprophytismus und Photosynthese. I. Biochemische physiologische Studien an Humus-Orchideen. Botanisches Archiv 40:571–633.

Moran NA. 2002. Microbial minimalism: genome reduction in bacterial pathogens. Cell 108:583–586.

Morrison HG, McArthur AG, Gillin FD, Aley SB, Adam RD, Olsen GJ, Best AA, Cande WZ, Chen F, Cipriano MJ, et al. 2007. Genomic minimalism in the early diverging intestinal parasite Giardia lamblia. Science 317:1921–1926.

Nickrent DL, Yan OY, Duff RJ, dePamphilis CW. 1997. Do nonasterid holoparasitic flowering plants have plastid genomes? Plant Mol Biol. 34:717–729.

Patel RK, Jain M. 2012. NGS QC Toolkit: a toolkit for quality control of next generation sequencing data. PLoS One 7:e30619.

Peredo EL, King UM, Les DH. 2013. The plastid genome of Najas flexilis: adaptation to submersed environments is accompanied by the complete loss of the NDH complex in an aquatic angiosperm. PLoS One 8:e68591.

Porra RJ. 2002. The chequered history of the development and use of simultaneous equations for the accurate determination of chlorophylls a and b. Photosyn Res. 73:149–156.

Protasio AV, Tsai IJ, Babbage A, Nichol S, Hunt M, Aslett MA, De Silva N, Velarde GS, Anderson TJC, Clark RC, et al. 2012. A systematically improved high quality genome and transcriptome of the human blood fluke Schistosoma mansoni. PLoS Negl Trop Dis. 6:e1455.

Randle CP, Wolfe AD. 2005. The evolution and expression of rbcL in holoparasitic sister-genera Harveya and Hyobanche (Orobanchaceae). Am J Bot. 92:1575–1585.

Ranwez V, Harispe S, Delsuc F, Douzery EJP. 2011. MACSE: Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons. PLoS One 6:e22594.

Sanderson MJ. 1997. A nonparametric approach to estimating divergence times in the absence of rate constancy. Mol Biol Evol. 14:1218–1231.

Smith DR, Lee RW. 2014. A plastid without a genome: evidence from the nonphotosynthetic green algal genus Polytomella. Plant Physiol. 164:1812–1819.

Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690.

Straub SCK, Cronn RC, Edwards C, Fishbein M, Liston A. 2013. Horizontal transfer of DNA from the mitochondrial to the plastid genome and its subsequent evolution in milkweeds (Apocynaceae). Genome Biol Evol. 5:1872–1885.

Straub SCK, Fishbein M, Livshultz T, Foster Z, Parks M, Weitemier K, Cronn RC, Liston A. 2011. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing. BMC Genomics 12:211.

Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. 2013. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol. 30:2725–2729.

Tsai IJ, Zarowiecki M, Holroyd N, Garciarrubio A, Sanchez-Flores A, Brooks KL, Tracey A, Bobes RJ, Fragoso G, Sciutto E, et al. 2013. The genomes of four tapeworm species reveal adaptations to parasitism. Nature 496:57–63.

Umate P, Schwenkert S, Karbat I, Dal Bosco C, Mlcochova L, Volz S, Zer H, Herrmann RG, Ohad I, Meurer J. 2007. Deletion of psbM in tobacco alters the Q(B) site properties and the electron flow within photosystem II. J Biol Chem. 282:9758–9767.

Vaidya G, Lohman DJ, Meier R. 2011. SequenceMatrix: concatenation software for the fast assembly of multi-gene datasets with character set and codon information. Cladistics 27:171–180.

Westwood JH, Yoder JI, Timko MP, dePamphilis CW. 2010. The evolution of parasitism in plants. Trends Plant Sci. 15:227–235.

Wicke S, Muller KF, de Pamphilis CW, Quandt D, Wickett NJ, Zhang Y, Renner SS, Schneeweiss GM. 2013. Mechanisms of functional and physical genome reduction in photosynthetic and nonphotosynthetic parasitic plants of the broomrape family. Plant Cell 25:3711–3725.

Wicke S, Schäferhoff B, dePamphilis CW, Müller KF. 2014. Disproportional plastome-wide increase of substitution rates and relaxed purifying selection in genes of carnivorous Lentibulariaceae. Mol Biol Evol. 31:529–545.

Wicke S, Schneeweiss GM, dePamphilis CW, Muller KF, Quandt D. 2011. The evolution of the plastid chromosome in land plants: gene content, gene order, gene function. Plant Mol Biol. 76:273–297.

Wickett NJ, Honaas LA, Wafula EK, Das M, Huang K, Wu BA, Landherr L, Timko MP, Yoder J, Westwood JH, et al. 2011. Transcriptomes of the parasitic plant family Orobanchaceae reveal surprising conservation of chlorophyll synthesis. Curr Biol. 21:2098–2104.

Wimpee CF, Wrobel RL, Garvin DK. 1991. A divergent plastid genome in Conopholis americana, an achlorophyllous parasitic plant. Plant Mol Biol. 17:161–166.

Wolfe AD, dePamphilis CW. 1998. The effect of relaxed functional constraints on the photosynthetic gene rbcL in photosynthetic and nonphotosynthetic parasitic plants. Mol Biol Evol. 15:1243–1258.

Wolfe KH, Morden CW, Palmer JD. 1992. Function and evolution of a minimal plastid genome from a nonphotosynthetic parasitic plant. Proc Natl Acad Sci U S A. 89:10648–10652.

Wu FH, Chan MT, Liao DC, Hsu CT, Lee YW, Daniell H, Duvall MR, Lin CS. 2010. Complete chloroplast genome of Oncidium Gower Ramsey and evaluation of molecular markers for identification and breeding in Oncidiinae. BMC Plant Biol. 10:68.

Wyman SK, Jansen RK, Boore JL. 2004. Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20:3252–3255.

Yang JB, Tang M, Li HT, Zhang ZR, Li DZ. 2013. Complete chloroplast genome of the genus Cymbidium: lights into the species identification, phylogenetic implications and population genetic analyses. BMC Evol Biol. 13:84.

Yang ZH. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci. 13:555–556.

Yang ZH. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 24:1586–1591.

Yoder OC, Turgeon BG. 2001. Fungal genomics and pathogenicity. Curr Opin Plant Biol. 4:315–321.

Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18:821–829.

Zimmer K, Meyer C, Gebauer G. 2008. The ectomycorrhizal specialist orchid Corallorhiza trifida is a partial myco-heterotroph. New Phytol. 178:395–400.


This topology was chosen for all downstream tree-based analyses. From the root of the tree (with Cymbidium as the outgroup), Oncidium and Phalaenopsis are successively sister to a monophyletic Corallorhiza, with Co. striata vreelandii and Co. trifida successively diverging from the ancestor of the remaining members of Corallorhiza. Nested within this clade, (Co. odontorhiza, Co. wisteriana) is sister to the Co. maculata complex. The Mexican Co. bulbosa is sister to all other members of the Co. maculata complex with 100% support, whereas within the latter, the Mexican Co. macrantha and Co. maculata mexicana are sister to one another, and this clade is collectively sister to a northern North American clade of (Co. mertensiana, (Co. maculata maculata, Co. maculata occidentalis)).

Analyses of Selective Regime

Various branch models were constructed to explicitly test the hypothesis of different ω-ratios for different branches, specifically for nongreen Corallorhiza. Only loci for which all taxa had intact reading frames were included. The most basic model (M0) is a “one-ratio” model, which assumes a single ω-ratio across all branches (here, ω_leafy = ω_cor = ω_mac-ng = ω_str; see table 3 for branch-based abbreviations of ω-ratios). Model M1, which allows all of Corallorhiza to have a different ω-ratio than the leafy outgroups (ω_leafy ≠ ω_cor = ω_mac-ng = ω_str), has a significantly better fit for atp genes (table 3). Also for atp genes, a three-ratio model (M2), allowing different ω-ratios in Corallorhiza and additionally for nongreen Corallorhiza as opposed to leafy outgroups, has a significantly better fit than the nested M1 model. Model M4, which is a two-ratio model specifying different ω-ratios for leafy outgroups + green Corallorhiza as opposed to nongreen Corallorhiza (ω_leafy = ω_cor ≠ ω_mac-ng = ω_str), is highly significant compared with the one-ratio model (M0) after Bonferroni correction. Photosynthesis-related genes (excluding atp genes) have a significantly better fit for models M2 (ω_leafy ≠ ω_cor ≠ ω_mac-ng = ω_str) and M4 (ω_leafy = ω_cor ≠ ω_mac-ng = ω_str) than M0, but these were nonsignificant after Bonferroni correction. Regardless, the branch-specific estimate (model M4) of ω in nongreen taxa (ω_mac-ng = ω_str = 0.73720) was more than twice that of green taxa (ω_leafy = ω_cor = 0.27361), approaching a pattern expected under selective neutrality for photosynthesis-related genes (ω ~ 1). “Housekeeping” genes show a significantly better fit for M1 and M4, but these again were nonsignificant after Bonferroni correction for multiple comparisons. None of the branch models had a significantly better fit in nested comparisons for ycf1 + ycf2 or for matK.

Branch-site models were applied in the atp complex to test for evidence of positive selection. First, for a model in which all Corallorhiza were specified as a foreground clade, the alternative model, which allows some sites to be under positive selection (ω2 > 1), did not fit the data significantly better than the null model, which allows no sites to be under positive selection (ω1 = ω2 = 1; χ2 = 0.6, df = 1, P = 0.4386). However, the alternative model had a significantly better fit than the null model when specifying nongreen Corallorhiza taxa as foreground branches, after Bonferroni correction for multiple comparisons (χ2 = 11.4, df = 1, P = 0.0007), suggesting that some sites in the atp complex are under positive selection.
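The two P values reported for the branch-site tests follow directly from the test statistics and the single degree of freedom. The short check below is a sketch with a hypothetical function name; it relies on the identity that, for 1 degree of freedom, the χ2 upper-tail probability equals erfc(sqrt(x/2)).

```python
import math


def chi2_pvalue_df1(stat):
    """Upper-tail chi-square probability for 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


# Test statistics as reported in the text for the two foreground settings.
reported = {
    "all Corallorhiza as foreground": 0.6,
    "nongreen Corallorhiza as foreground": 11.4,
}

for label, stat in reported.items():
    print(f"{label}: statistic = {stat}, P = {chi2_pvalue_df1(stat):.4f}")
```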

Discussion

Chlorophyll Content of Corallorhiza

Some Corallorhiza and many heterotrophic plants are often referred to as “achlorophyllous” based on their lack of visible green coloration. However, all Corallorhiza included in this study contain at least some detectable level of chlorophyll (fig. 1). Montfort and Küsters (1940) detected chlorophyll and photosynthetic activity in Co. trifida, which is not surprising given the predominantly green coloration of the above-ground organs of this species; a later study by Cummings and Welschmeyer (1998) detected some chlorophyll in the nongreen Co. maculata, as well as in another nongreen, strictly heterotrophic orchid, Cephalanthera austiniae. Taxa with at least some visible green tissue had, on average, an approximately 10-fold higher chlorophyll concentration than nongreen taxa (fig. 1). Although leafless, Co. trifida is the greenest of the coralroots, with green pigmentation throughout nearly the entire above-ground portion of the plant. The distribution of green tissues among other partially green coralroot species is more variable, but the common theme is that at least some green tissue is noticeable in or around the ovary.

Coupled with the fact that green or partially green coralroots also show a general lack of degradation in photosynthesis-related plastid gene systems (with the exception of the ndh complex and a few other “minor” examples; see below), these observations suggest that the aforementioned “green” coralroot taxa may indeed all be partially heterotrophic, relying to some small degree on photosynthetic carbon. Though it has been demonstrated that Co. trifida is an inefficient photosynthesizer (i.e., net photosynthetic carbon assimilation cannot compensate for net respiration), there is likely to be an adaptive reason that photosynthesis persists in this species, and possibly in others such as Co. odontorhiza and Co. wisteriana. Ironically, Co. trifida is the one coralroot species that might be expected to depend most heavily on photosynthesis (Zimmer et al. 2008; Cameron et al. 2009). One possibility is that these species supplement fungal carbon uptake with photosynthetic carbon in specific tissues only, mainly in the ovary, inside of which hundreds or even thousands of minute “dust seeds” are produced. This has been experimentally demonstrated in the partially mycoheterotrophic orchid Limodorum abortivum, with the additional findings that photosynthesis is highest in ovary tissue and increases under experimentally induced fungal carbon limitation (Bellino et al. 2014). It is currently unknown if any temporal fungal carbon limitations occur throughout the short season when Corallorhiza displays aboveground growth (typically late spring for flowering to summer for seed/ovary development). Based on current knowledge, it is hypothesized that for vegetatively reduced orchids such as “green” Corallorhiza and Limodorum, compensation of carbon in and around developing ovaries has some adaptive value, which in turn would cause purifying selection on the plastid- (and presumably nuclear-) encoded components of photosynthesis.

The more puzzling finding applies to nongreen taxa: nongreen coralroots had surprisingly consistent (albeit relatively minute) chlorophyll concentrations, with very little variation within or among species (fig. 1), despite having been sampled from different parts of their respective geographic ranges. Chlorophyll has been detected in other strictly heterotrophic plants (Cummings and Welschmeyer 1998), including members of Orobanchaceae and the monotropoid Ericaceae.

The question naturally arises: Why would putatively nonphotosynthetic taxa continue to produce chlorophyll? There are several hypotheses as to why this might be the case, including photoprotection/avoidance of light damage to DNA and other cellular components; involvement of the chlorophyll biosynthesis machinery in other biochemical pathways, such as plastid-nuclear signaling; or prevention of the buildup of chlorophyll precursors that may lead to increased oxidative damage (discussed in Wickett et al. 2011 and references therein). In nearly all cases studied, strictly parasitic species (including the nongreen coralroots) that still synthesize detectable levels of chlorophyll do so at reduced levels. The fact that all nongreen coralroots contain relatively low chlorophyll concentrations means that these pigments are masked, and thus the overall coloration of these species tends to be red, purple, or brown, most likely due to anthocyanins occurring at higher concentrations (Freudenstein and Doyle 1994a; personal observation). Occasionally, yellow variants occur among populations of the various coralroot species (Freudenstein 1997; Barrett CF, personal observation). These are presumably recurrent anthocyanin-deficient mutants (yellow coloration may be due to persistence of chlorophylls and other yellow/orange carotenoid pigments), though no specific studies have been carried out to demonstrate the causes of this phenomenon in Corallorhiza. Regardless, reduced chlorophyll concentrations and the resulting apparent lack of visible green coloration in Corallorhiza are excellent indicators of relaxed selective constraints on photosynthetic machinery, and this is likely to be the case for many other parasite-containing taxa.

Table 3. Branch Models for Coding Regions of the Plastome in Corallorhiza for Which All Accessions Have Intact Reading Frames, Based on Various Configurations of Coding Data, Implemented in the CODEML Module of PAML.

Model  ω estimates                                                              np(a)  ln L(b)       LRT(c)      P/Pcorr(d)

atp genes
M0     ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.21505                             26     -9802.76868
M1     ω_leafy = 0.15175, ω_cor = ω_mac-ng = ω_str = 0.26949                    27     -9798.12794   9.281476
M2     ω_leafy = 0.15211, ω_cor = 0.22238, ω_mac-ng = ω_str = 0.38239           28     -9795.68827   4.879344    ns
M3     ω_leafy = 0.15212, ω_cor = 0.22242, ω_mac-ng = 0.41999, ω_str = 0.36243  29     -9795.62404   0.128448
M4     ω_leafy = ω_cor = 0.18380, ω_mac-ng = ω_str = 0.38274                    27     -9797.32610   10.885142
M5     ω_leafy = ω_cor = 0.18382, ω_mac-ng = 0.41997, ω_str = 0.36292           28     -9797.26308   0.126042

Photosynthesis genes (pet, psa/b, rbcL)
M0     ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.33866                             26     -1854.62551
M1     ω_leafy = 0.29790, ω_cor = ω_mac-ng = ω_str = 0.32741                    27     -1854.48900   0.273012
M2     ω_leafy = 0.29856, ω_cor = 0.25277, ω_mac-ng = ω_str = 0.73721           28     -1852.32692   4.324170    ns
M3     ω_leafy = 0.29858, ω_cor = 0.25278, ω_mac-ng = 0.59352, ω_str = 0.88556  29     -1852.22042   0.213002
M4     ω_leafy = ω_cor = 0.27361, ω_mac-ng = ω_str = 0.73720                    27     -1852.39825   4.454520    ns
M5     ω_leafy = ω_cor = 0.27362, ω_mac-ng = 0.59352, ω_str = 0.88550           28     -1852.29177   0.212952

Housekeeping genes (rps/l, infA, accD, clpP, etc.)
M0     ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.30179                             26     -25545.1250
M1     ω_leafy = 0.29055, ω_cor = ω_mac-ng = ω_str = 0.32343                    27     -25542.7794   4.691148    ns
M2     ω_leafy = 0.29066, ω_cor = 0.31584, ω_mac-ng = ω_str = 0.34192           28     -25542.6470   0.264796
M3     ω_leafy = 0.29065, ω_cor = 0.31583, ω_mac-ng = 0.33612, ω_str = 0.34612  29     -25542.6407   0.012646
M4     ω_leafy = ω_cor = 0.30267, ω_mac-ng = ω_str = 0.34230                    27     -25542.8978   4.454430    ns
M5     ω_leafy = ω_cor = 0.30269, ω_mac-ng = 0.33612, ω_str = 0.34668           28     -25542.8908   0.014052

ycf1 & 2
M0     ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.66003                             26     -25384.8570
M1     ω_leafy = 0.82314, ω_cor = ω_mac-ng = ω_str = 0.61213                    27     -25383.1494   3.415136
M2     ω_leafy = 0.82370, ω_cor = 0.61073, ω_mac-ng = ω_str = 0.61543           28     -25383.1485   0.001854
M3     ω_leafy = 0.82338, ω_cor = 0.61050, ω_mac-ng = 0.59460, ω_str = 0.63131  29     -25383.1215   0.053994
M4     ω_leafy = ω_cor = 0.67331, ω_mac-ng = ω_str = 0.61464                    27     -25384.6635   0.386918
M5     ω_leafy = ω_cor = 0.67313, ω_mac-ng = 0.59296, ω_str = 0.62959           28     -25384.6382   0.050592

matK
M0     ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.43481                             26     -3666.30553
M1     ω_leafy = 0.44943, ω_cor = ω_mac-ng = ω_str = 0.42324                    27     -3666.27784   0.055394
M2     ω_leafy = 0.45060, ω_cor = 0.38642, ω_mac-ng = ω_str = 0.53159           28     -3665.93065   0.694374
M3     ω_leafy = 0.45107, ω_cor = 0.38419, ω_mac-ng = 0.33356, ω_str = 0.99284  29     -3664.63804   2.585212
M4     ω_leafy = ω_cor = 0.41948, ω_mac-ng = ω_str = 0.53022                    27     -3666.08306   0.444942
M5     ω_leafy = ω_cor = 0.41857, ω_mac-ng = 0.33352, ω_str = 0.98601           28     -3664.80458   2.556974

NOTE. Abbreviations for ω-ratios: leafy = green, leafy outgroups (Cymbidium, Oncidium, Phalaenopsis); cor = all Corallorhiza; mac-ng = nongreen Co. maculata (i.e., Co. mertensiana, Co. maculata maculata, Co. maculata occidentalis); str = Co. striata. Successively nested models were compared based on log-likelihood ratio tests (M1 vs. M0, M2 vs. M1, M3 vs. M2, M4 vs. M0, M5 vs. M4); all comparisons had 1 degree of freedom.
*P < 0.05; **P < 0.01; ***P < 0.001; “ns” = not significant (where indicated), otherwise blank space indicates no significance.
(a) The number of free parameters for a particular model.
(b) Log-likelihood of the data for a particular model.
(c) The chi-square distributed log-likelihood ratio test statistic used to evaluate significant differences in model fit.
(d) P values for uncorrected/Bonferroni-corrected χ2 tests, where Pcorr = α/branch parameters in the model being tested.

Characteristics of Corallorhiza Plastomes

With perhaps the exception of Co. striata vreelandii, plastomes among coralroot taxa do not altogether display the major size reduction that would be expected for a group of leafless parasites, and thus can be said to occupy the early stages along the path to a highly reduced, or “minimal,” plastid genome (Barbrook et al. 2006; Krause 2008). In a previous study, it was hypothesized that members of the Co. striata complex would have the most reduced plastomes in the genus in terms of overall size (Barrett and Davis 2012), but even the plastome of Co. striata vreelandii (the representative sequenced in that study and included here) only represents an approximately 6% total genome size reduction relative to the leafy Phalaenopsis (Chang et al. 2006). In fact, Co. macrantha and Co. maculata mexicana (table 2) both have larger plastomes than two of the three leafy outgroup orchid taxa, Phalaenopsis (148,964 bp) and Oncidium (146,484 bp). Thus, the shift to leaflessness per se in Corallorhiza does not appear to be associated with a reduction in plastome size for the genus overall (at least for partially green taxa), although an examination of plastomes for the closest relatives of Corallorhiza (e.g., Oreorchis, Aplectrum, Cremastra, Govenia, Calypso) in tribe Calypsoeae will help clarify this.

A reduction in plastome size due to deletions and mutations resulting in pseudogenes is expected to be associated with a decrease in GC content, as once-functional (and relatively GC-rich) genes become riddled with loss-of-function mutations over time as a result of relaxed photosynthesis-associated selective pressures. These coding regions begin to resemble noncoding DNA, which typically has a lower GC content (e.g., Bernardi 1989; Glemin et al. 2014). This is apparently the case for Co. striata vreelandii (34.02% GC; table 2), which is 1.16% lower than the mean GC content for the remaining coralroot plastomes (35.28%). Mean GC content for the nongreen members of the Co. maculata complex is 35.10%, and no different than that found in the green coralroots. This suggests that large-scale differences in plastid GC content may take longer to manifest, and thus likely come about later in the process of plastid modification relative to other processes, such as photosynthesis gene deletions and other initial gene mutations causing loss of function. This lends additional support to the notion that Co. striata vreelandii is farther “down the path” of plastid genome degradation relative to the nongreen members of the Co. maculata complex.
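The GC-content comparison above rests on a simple per-sequence calculation, which can be sketched as follows. This is an illustrative helper, not the tool used in the study; the function name is hypothetical, and the choice to ignore ambiguous bases (e.g., N) in the denominator is an assumption.

```python
def gc_content(sequence):
    """Return GC content as a percentage of unambiguous (A, C, G, T) bases."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    # Ambiguous bases such as N are excluded from the denominator (assumption).
    return 100.0 * gc / (gc + at)

if __name__ == "__main__":
    # Toy sequence; a real plastome would be read from a FASTA file.
    print(gc_content("ATGCGCATTA"))
```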

More broadly, the leafy orchids Phalaenopsis aphrodite and the hybrid Oncidium Gower Ramsey have GC contents of 35.6% and 37.32%, respectively (Chang et al. 2006; Wu et al. 2010); these values are very similar to what is found in Co. trifida and other green coralroots, so green coralroots do not seem to deviate much from other leafy orchids. Members of holo- and hemiparasitic Orobanchaceae (Wicke et al. 2013) represent an interesting clade for comparison with Corallorhiza in terms of GC content. Orobanchaceae range from 38.08% GC in the hemiparasitic Schwalbea to 31.09% in the holoparasitic Phelipanche purpurea. Variation in GC content among plastomes of Corallorhiza lies completely within the bounds of that observed in Orobanchaceae, which is hypothesized to be in the relatively more advanced stages of degradation.

Structural Evolution of the Plastome: Pseudogenes and Losses

ndh Genes

Degradation of the plastid-encoded ndh complex is advanced in Corallorhiza, as has been observed in other groups (e.g., Chang et al. 2006; Funk et al. 2007; McNeal et al. 2007; Braukmann et al. 2009; Blazier et al. 2011; Braukmann and Stefanovic 2012; Braukmann et al. 2013; Iles et al. 2013; Peredo et al. 2013; Wicke et al. 2013), including some of the leafy, photosynthetic orchid plastomes included in figure 3. Interestingly, there is variation among the orchid plastomes sequenced to date with respect to the status of ndh genes (Chang et al. 2006; Wu et al. 2010; Delannoy et al. 2011; Barrett and Davis 2012; Logacheva et al. 2011; Yang et al. 2013), suggesting multiple independent degradation pathways in this complex in the orchid family.

The ancestor of Corallorhiza likely already experienced some degree of degradation in the ndh complex, as evidenced by several shared pseudogenes. Sequencing of additional orchid plastomes could reveal an evolutionary transition from a functional plastid-encoded ndh complex in leafy relatives (e.g., Aplectrum, Oreorchis, Cremastra, Govenia) to a largely degraded state, as observed in Corallorhiza. However, there is even some variation in the degree of degradation among certain members of ndh within Corallorhiza, which in part reflects the plastid phylogeny (fig. 2). This pattern includes a major deletion in ndhE in all taxa but Co. striata vreelandii; a shared pseudogene for ndhJ for all members of the Co. maculata complex besides Co. bulbosa (which here is sister to the remaining members of the Co. maculata complex; fig. 2); a loss of the same gene in (Co. odontorhiza, Co. wisteriana); and hypothesized parallel deletions of the ndhG pseudogene in Co. bulbosa and the clade of (Co. macrantha, Co. maculata mexicana).

The plastid-encoded ndh complex is believed to be dispensable for a number of reasons (reviewed in Martín and Sabater 2010; Wicke et al. 2011, 2013), which may differ for various groups of plants displaying evidence of degradation in this gene complex. Generally, the ndh complex is believed to be essential for photosynthetic function in environments with highly variable light intensities, by functioning in regulation of cyclic electron transport and also in mitigating the effects of photo-oxidative damage under high light intensities (Martín et al. 2009; Martín and Sabater 2010). In the ancestor of Corallorhiza, selective constraints on the ndh complex may have been lifted by a combination of decreased dependence upon photosynthetic carbon as a result of a shift to heavy dependence upon fungal carbon (a similar situation is also observed in carnivorous plants using animals as a carbon source in the family Lentibulariaceae; Wicke et al. 2014) and a tendency to grow in dark, shaded, mature forests where light intensity is always low. This may also occur in other orchid lineages (including leafy, photosynthetic orchids) due to high dependence on fungal carbon as seedlings, and also the tendency to live in habitats with relatively low, or at least static, amounts of light intensity (e.g., a shaded rainforest understory or floor).

Photosynthesis-Related Genes (excluding ndh, rpo, and atp)

Nongreen members of Corallorhiza (that is, those with relatively reduced levels of chlorophyll; fig. 1) all display evidence of degradation of genes directly involved in photosynthetic processes (fig. 2; e.g., psa/b, pet, rbcL). Of these taxa, Co. striata vreelandii displays the highest number of degraded genes (table 2; figs. 2 and 3), as reported in Barrett and Davis (2012), followed by northern North American members of the Co. maculata complex (Co. mertensiana, Co. maculata maculata, Co. maculata occidentalis). Both of these complexes (Co. striata, Co. maculata) have independently undergone degradation among genes in the following genes/gene complexes (omitting ndh; see above): pet, psa/b, rpo, ccsA, cemA, and rbcL. It is highly unlikely that transcript editing of any kind can compensate for these changes, and it can be assumed that these changes are irreversible. The possibility exists, however slight, that all deleted genes or pseudogenes among the plastomes of these taxa are redundant and that functional copies are encoded in the nuclear genome, or that genes have been “regained” through intergenomic or horizontal transfer (e.g., Iorizzo et al. 2012; Straub et al. 2013; Wicke et al. 2013).

Based on the degraded states of photosynthesis-related genes among these plastomes, it can be concluded that the aforementioned nongreen taxa are strict parasites of fungi, incapable of photosynthesis, and thus the terms “holomycotroph” or “full mycoheterotroph” are indeed appropriate. No studies of photosynthetic activity have been carried out in Corallorhiza aside from those in the “greenest” coralroot, Co. trifida (Zimmer et al. 2008; Cameron et al. 2009). It will be beneficial to measure photosynthetic rates (or determine the lack thereof), along with plastid-gene transcript levels, among multiple individuals of both green and nongreen coralroot taxa, preferably growing in close proximity to one another and in similar habitats, to definitively characterize photosynthetic capacity in the genus.

Though green members of Corallorhiza do not display significant degradation of photosynthesis-related genes (psa, psb, pet, rbcL, etc.), there are a few important exceptions, which may or may not affect the efficacy of photosynthesis in these taxa. First, members of Corallorhiza excluding Co. trifida and Co. striata have experienced a large deletion in psbM (PSII protein M). This low-molecular-weight peptide is believed to play a role in maintaining the stability of PSII dimers; psbM-deficient tobacco mutants displayed impaired stability of PSII but were still able to function (Umate et al. 2007; Kawakami et al. 2011). Photosynthesis has only been explicitly measured and demonstrated in Co. trifida (Zimmer et al. 2008; Cameron et al. 2009), which has a putatively functional copy of psbM. Thus, it is possible that either 1) other “green” coralroots (Co. odontorhiza, Co. wisteriana, Co. bulbosa, Co. macrantha, Co. maculata mexicana) carry out inhibited photosynthesis; 2) this gene has been transferred to the nucleus or mitochondrion, and its product is imported into the plastid, allowing stabilization of the PSII dimer complex and some level of photosynthetic function; or 3) a nonfunctional psbM gene has no effect on photosynthetic function.

Another partially green coralroot, Co. macrantha (restricted to high-elevation forests in southern Mexico), has a pseudogenized copy of psaI (Photosystem I Reaction Center Subunit VIII) due to a 4-bp insertion near the 5′-end of the gene relative to other coralroots, causing a reading frame shift. The psaI protein is believed to function in the stabilization of a nuclear-encoded subunit of the same complex (psaL) in Arabidopsis thaliana, and has also been shown to bind to the Light-harvesting Complex II (Jensen et al. 2007); it was suggested that the latter putative function is redundant. In Corallorhiza, this might explain why psaI in Co. macrantha has apparently become a pseudogene. Thus, among the green coralroots, Co. macrantha may represent yet another very early step in the transition to the loss of photosynthesis, or at least a reduction in photosynthetic efficacy, but this remains a hypothesis to be tested.
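The effect of a small insertion on a reading frame, of the kind described for psaI, can be illustrated with a simple check. This is a hedged sketch rather than the annotation procedure used in the study: the function name and the toy sequences are hypothetical, and the requirement of an ATG start codon is an assumption (some plastid genes use alternative or edited start codons).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def reading_frame_problems(cds):
    """List reasons why a coding sequence does not look like an intact ORF."""
    cds = cds.upper()
    problems = []
    if len(cds) % 3 != 0:
        problems.append("length not a multiple of 3")
    # Split into complete codons, ignoring any trailing partial codon.
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    if not codons or codons[0] != "ATG":
        problems.append("no ATG start codon")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("internal stop codon")
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("no terminal stop codon")
    return problems

if __name__ == "__main__":
    intact = "ATGGCTGCTAAATAA"                    # toy ORF, not a real psaI sequence
    shifted = intact[:3] + "ACGT" + intact[3:]    # 4-bp insertion near the 5' end
    print(reading_frame_problems(intact))
    print(reading_frame_problems(shifted))
```

In the toy example, the 4-bp insertion shifts every downstream codon, which both introduces a premature stop codon and removes the original terminal stop.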

rpo Genes

There is evidence for rpo pseudogenes in both the nongreen Co. striata and Co. maculata complexes, based on large deletions, reading frame shifts, and internal stop codons. Plastid genes are transcribed by two polymerases in monocots: a nuclear-encoded polymerase and a plastid-encoded polymerase (PEP) (Liere et al. 2011). The plastid-encoded RNA polymerase is believed to preferentially transcribe photosynthesis-related genes, whereas the rpoB-C1-C2 operon itself is transcribed by a nuclear-encoded RNA polymerase (Hajdukiewicz et al. 1997; reviewed in Liere et al. 2011; Krause 2012). In addition to transcribing photosynthesis-related genes for the most part, PEP also tends to be more active in later leaf developmental stages (Hajdukiewicz et al. 1997; reviewed in Liere et al. 2011).


Based on this latter finding, it might be expected that all members of Corallorhiza should show some degradation in rpo genes, as leaves do not fully develop in any species within this genus (i.e., only the basal leaf sheath persists, but the blade or lamina never develops), presumably obviating the role of PEP in transcription of photosynthesis-related genes (e.g., psa, psb) in leaves. However, green coralroots have intact reading frames for all rpo genes, whereas nongreen coralroots show varying degrees of degradation in the rpo complex. As with photosynthesis-related genes in Corallorhiza (figs. 2 and 3), it does not appear to be the transition to leaflessness per se that has resulted in degradation of the rpo complex, but more likely the shift from partial heterotrophy (and the associated need to carry out photosynthesis, whatever small amount) to strict heterotrophy. An interesting parallel occurs in the carnivorous family Lentibulariaceae, in which rpo genes display significant departures from purifying selection (Wicke et al. 2014). An interesting follow-up would be to determine whether degradation first begins among photosystem genes (psa, psb), diminishing the importance of having a functional PEP complex, or whether the opposite is the case, in which a dysfunctional (or nonexistent) PEP has contributed to relaxed selective pressures associated with photosynthesis-related genes (i.e., if psa and psb genes can no longer be transcribed, or are transcribed by an inefficient PEP, then is there any benefit to maintaining them?). Regardless, the degradation of rpo genes in nongreen Corallorhiza closely mirrors that in genes directly involved in photosynthesis (psa/b, rbcL, etc.).

Relationships in Corallorhiza

Complete plastome sequences of Corallorhiza yielded a highly resolved, highly supported tree, consistent with previous studies (Freudenstein and Doyle 1994a, 1994b; Barrett and Freudenstein 2008; Freudenstein and Senyo 2008). Here, for the first time, relationships among the six taxa that comprise the Co. maculata complex (as recognized in Freudenstein 1997) are fully resolved, with high support for Co. bulbosa as sister to the rest of the complex (bootstrap = 91), whereas previous studies could not resolve relationships in this complex (Barrett and Freudenstein 2008; Freudenstein and Senyo 2008). Complete plastomes also confirm the sister relationship of Co. striata to the rest of Corallorhiza, with Co. trifida (moving away from the base of the tree), then Co. odontorhiza and Co. wisteriana diverging successively. Depending on the data set analyzed, Co. bulbosa appears in a few different places, most likely due to homoplasy and few variable characters available to resolve the position of this species in smaller data sets, as evidenced by short internal branches (supplementary fig. S3, Supplementary Material online). Thus, inclusion of complete plastomes (pseudogenes, spacers, and introns) resolves the otherwise uncertain position of this species. Nuclear rDNA, on the other hand, places Co. bulbosa as sister to (Co. macrantha, Co. maculata mexicana) with moderate support, a similar finding to the combined analysis of Barrett and Freudenstein (2008) based on plastid rbcL + nuclear ITS, though that study did not include Co. macrantha. Combining all nuclear and plastid data places Co. bulbosa as sister to the remaining members of the Co. maculata complex with 100% bootstrap support (supplementary fig. S3, Supplementary Material online). The complete plastome + rDNA data set fully resolves relationships among the major species complexes of Corallorhiza (excluding members not sampled from some complexes).

In particular, relationships within the Co. maculata complex shed light on a phylogeographically intriguing aspect of plastome degradation. All Mexican members of the Co. maculata complex have at least some visible green tissue, and are presumed to be partially mycoheterotrophic based on conservation of the plastid-encoded photosynthetic apparatus (for the most part; see fig. 2). Those members of the Co. maculata complex that occur in northern North America have no visible green tissue (two of these species were shown to have highly reduced chlorophyll levels; fig. 1) and display evidence of degradation in these genes. The topology of the Co. maculata complex based on combined plastome + rDNA data (supplementary fig. S3, Supplementary Material online; fig. 2) lends support to a previous hypothesis that this complex (and possibly the genus Corallorhiza itself) originated in southern Mexico (Freudenstein and Doyle 1994a, 1994b; Barrett and Freudenstein 2008; Freudenstein and Senyo 2008) and expanded into northern North America, with the ancestor of Co. mertensiana, Co. maculata maculata, and Co. maculata occidentalis undergoing a shift from partial to strict heterotrophy (and continuing this trend among extant lineages), and thus presumably losing the ability to carry out photosynthesis. This suggests that deeper sampling among individuals in members of both the Co. maculata complex and the Co. striata complex (which is here represented only by Co. striata vreelandii) will yield a much clearer picture of plastome degradation in Corallorhiza on a phylogeographic scale.

Selective Regime among Genes with Open Reading Frames

Model M4, allowing green and nongreen taxa to have different ω-ratios (ω_leafy = ω_cor; ω_mac-ng = ω_str), fit the data significantly better than the one-ratio model for both housekeeping genes and photosynthesis-related genes, suggesting different selective scenarios for nongreen taxa relative to green taxa, but these comparisons are nonsignificant when corrected for multiple comparisons (table 3). For housekeeping genes, ω-values for green versus nongreen taxa did not differ by much (ω_leafy = ω_cor = 0.30267; ω_mac-ng = ω_str = 0.34230; table 3). However, this difference is much more pronounced for photosynthesis-related genes (ω_leafy = ω_cor = 0.27361; ω_mac-ng = ω_str = 0.73720), with the estimate for nongreen taxa being approximately 2.6 times that of green taxa, and approaching a value expected under neutral evolution (ω ≈ 1). Thus, it is evident that even though some photosynthesis genes in nongreen taxa still have intact reading frames, there may be a relaxation of selective pressure associated with these genes.

For atp genes, Models M1 and M4 fit the data significantly better than does M0, even after correcting for multiple comparisons, suggesting different selective regimes in Corallorhiza versus leafy, green outgroups, or at least in nongreen Corallorhiza versus green Corallorhiza and leafy outgroups. For example, in model M1, the ω value for Corallorhiza is nearly twice that for leafy orchids (ω_leafy = 0.15175; ω_cor = ω_mac-ng = ω_str = 0.26949), and in model M4, nongreen Corallorhiza have ω values over twice those for green Corallorhiza and leafy orchids. Further investigations using branch-site models, allowing some sites in the “foreground” branches to be under positive selection, suggest evidence for positive selection in nongreen Corallorhiza for the atp complex.

The plastid-encoded atp genes play a critical role in photosynthesis by both generating ATP and translocating protons across the thylakoid membrane (Evron et al. 2000; McCarty et al. 2000; Allen et al. 2011). However, in taxa that display evidence of loss-of-function for photosystem and other photosynthesis-related genes, the fact that atp genes are preserved as intact reading frames suggests that they may play additional roles in plastid function (discussed in Wicke et al. 2013). This is bolstered by the fact that this complex shows evidence of positive selection in strictly heterotrophic, nongreen taxa, but not for Corallorhiza as a whole. This would support the hypothesis that the atp complex has either taken on additional, as-of-yet unknown roles in holomycotrophic plastids, or at a minimum has fallen under increased selective pressure to adapt to the overall “alternative lifestyle” associated with strict fungal heterotrophy in nonphotosynthetic plastids (including in amyloplasts, chromoplasts, elaioplasts, proteinoplasts, etc.). A future research objective is to examine the potential biological roles and molecular evolutionary patterns of the atp complex in nonphotosynthetic plants and their green, photosynthetic relatives, as members of this gene complex are often preserved and putatively functional among various heterotrophic angiosperms (see fig. 3 and references therein, including discussion in Wicke et al. 2013).

Comparison of Corallorhiza with Other Parasitic Angiosperms Using the Model of Barrett and Davis (2012)

A comparison of the plastomes of several heterotrophic taxa that have been sequenced to date shows that Corallorhiza is not quite in the “advanced” stages of plastid genome degradation relative to several taxa in Orobanchaceae (e.g., Wicke et al. 2013) or other orchids such as Neottia (Logacheva et al. 2012) and Rhizanthella (Delannoy et al. 2011) (fig. 3). Interpreted in the context of the basic model presented in Barrett and Davis (2012), members of Corallorhiza fall into categories 1 and 3. Taxa with some visible green tissue display pseudogenes/deletions mainly in the ndh complex (category 1; gene-class categories are listed at the top of fig. 3), with a few additional changes, whereas those with no green tissue have experienced pseudogenization and deletions in ndh, photosynthesis-related, and rpo genes, but not in atp genes or other “housekeeping” genes (category 3, excepting the loss of a redundant trnT-GGU in Co. striata vreelandii; Barrett and Davis 2012; figs. 2 and 3). Thus, nonparasitic autotrophs and partial heterotrophs roughly fall either into category 0 (no degradation of plastid genes) or category 1 (degradation of the ndh complex only). More specifically, all partial heterotrophs fall into category 1, with evidence of degradation of the ndh complex.

Strict heterotrophs fall into more “advanced” categories. Several, but not all, members of Orobanchaceae and two orchids (Neottia and Rhizanthella) can be classified in category 5, in which there is evidence of degradation of atp genes and at least some evidence of degradation in “housekeeping” genes (e.g., rpl, rps, trn, etc.). There are some genes for which no taxa have experienced pseudogenization or loss, including rpl2, rpl16, rpl20, rps2, rps4, rps8, rps11, ycf1, ycf2, trnD-guc, trnE-uuc, trnI-cau, trnM-cau, trnQ-uug, trnY-gua, rrn16, rrn23, rrn4.5, and rrn5. This suggests that they may be indispensable for plastid function, at least in the angiosperms included in this comparison, and suggests that a “minimal set” of genes might be necessary for functional plastid “housekeeping” and perhaps other functions outside of photosynthesis and ATP production.

Overall, no taxa fell in categories 2 or 4, suggesting that degradation of the rpo genes occurs more or less simultaneously with degradation in photosynthesis-related genes, and that degradation of atp genes occurs along with that in “housekeeping” genes (also discussed in Wicke et al. 2013). Thus, the model of Barrett and Davis (2012) could be modified in light of more recently sequenced plastomes, including those of Corallorhiza, to have five general categories: 1) no degradation; 2) degradation in the ndh complex only; 3) degradation of the ndh complex and photosynthesis-related genes, including rpo genes; 4) degradation in all plastid gene systems, including “housekeeping” and atp genes; and 5) complete or nearly complete loss of the plastid genome (as is hypothesized to have occurred in some strict heterotrophs such as Corynaea and Rafflesia [Nickrent et al. 1997; Molina et al. 2014]).
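The five categories of the modified model can be expressed as a small decision rule. The sketch below is only an illustration of that scheme: the function name, the gene-class labels, and the rule of assigning a taxon to the most advanced category it qualifies for are assumptions made here, and taxa that depart from the model would be misclassified by design.

```python
def degradation_category(degraded_classes, plastome_present=True):
    """Assign a plastome to one of the five categories of the modified model.

    degraded_classes: set of gene classes showing pseudogenes or losses,
    drawn from {"ndh", "photosynthesis", "rpo", "atp", "housekeeping"}
    (hypothetical labels chosen for this sketch).
    """
    if not plastome_present:
        return 5  # complete or nearly complete loss of the plastid genome
    if degraded_classes & {"atp", "housekeeping"}:
        return 4  # degradation in all gene systems, incl. housekeeping/atp
    if degraded_classes & {"photosynthesis", "rpo"}:
        return 3  # ndh plus photosynthesis-related genes, including rpo
    if "ndh" in degraded_classes:
        return 2  # ndh complex only
    return 1      # no degradation

if __name__ == "__main__":
    # Patterns described in the text for green and nongreen coralroots.
    print(degradation_category({"ndh"}))
    print(degradation_category({"ndh", "photosynthesis", "rpo"}))
```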

There is some degree of deviation from the Barrett and Davis (2012) model, as might be expected given the potential for somewhat idiosyncratic patterns of plastome evolution in unrelated groups. The obvious exception to this model would be the sequenced members of the genus Cuscuta in the family Convolvulaceae. One of the best-studied parasitic systems in terms of plastome evolution, Cuscuta contains members with “extreme” plastome reduction relative to closely related leafy species (approximately 50% total size reduction in Cu. gronovii and Cu. obtusiflora relative to the leafy Ipomoea) (fig. 3 and references therein). Yet both highly reduced species of Cuscuta retain most, if not all, of the genes for photosynthesis, but have degraded RNA polymerase complexes and also show evidence of pseudogenes/deletions of some housekeeping genes. A further survey based on probe hybridization in Cuscuta, with greatly increased taxon sampling, by Braukmann et al. (2013) suggests that this genus encapsulates multiple transitions to highly degraded plastid genomes, with members of subgenus Grammica displaying even more advanced stages of degradation. These taxa were not included in figure 3, as they are based on hybridization and have not yet been sequenced to generate complete plastomes. Recycling of respiratory CO2 has been offered as an explanation for the retention of putatively functional genes in the psa, psb, and pet complexes (McNeal et al. 2007), underscoring the importance of studying the physiology of strict heterotrophs on a taxon-by-taxon basis, and also highlighting how little is known about the basic physiology and functional genomics of these plants.

Conclusion

All members of Corallorhiza produce detectable levels of chlorophyll based on the current sample, but nongreen, strictly heterotrophic members have, on average, tenfold lower concentrations relative to green, partially heterotrophic members. Genomic data allowed assembly of complete plastomes, as well as partial rDNA operons, which when combined fully resolve relationships among the major species complexes in Corallorhiza. Based on complete plastomes, Corallorhiza can be said to span the “early” stages of plastid genome degradation as a result of relaxed selective pressures on photosynthetic function. Corallorhiza is useful in providing a model system to understand the earliest stages in the transition to a “minimal plastome,” as are observed in other heterotrophic plants (e.g., Orobanchaceae, other orchids like Rhizanthella). There have been at least two independent transitions to strict heterotrophy from partial heterotrophy in the genus (Co. maculata and Co. striata complexes). Nongreen coralroots also display evidence of relaxed selective pressures on photosynthesis-related genes which have retained open reading frames. Members of the atp complex display relaxed purifying selection among Corallorhiza as a whole, but more specifically display evidence for positive selection for nongreen, strictly heterotrophic taxa, although for reasons currently unknown.

Generally, there is a clear path of plastid genome degradation in the heterotrophic angiosperm lineages studied thus far (with some deviations from the overall model, e.g., some Cuscuta). Moreover, it can be hypothesized that there is a stepwise progression to a “minimal” plastome, and Corallorhiza occupies categories 2 (partially heterotrophic taxa, which display degradation of the ndh complex only, and a few other pseudogenes) and 3 (strictly heterotrophic taxa, including degradation of the ndh complex, photosynthesis-related genes, and rpo genes [excluding atp genes]) in a modified model of the degree of plastome degradation among angiosperms. Future studies will include 1) deeper sampling of complete plastomes within species complexes of Corallorhiza to gain a phylogeographic perspective on plastome degradation; 2) direct assessment of the relative photosynthetic capabilities of members of each coralroot taxon in the field; and 3) quantification of transcript levels for plastid-encoded genes within each species to gain a clearer understanding of gene expression, and to test the hypothesis of the existence of duplicate, functional copies of pseudogenized or deleted plastid genes.

Materials and Methods

Plant Material, DNA Isolation, Library Preparation, and PE Sequencing

Plant material was collected and preserved either in silica gel or frozen at −80 °C. Total genomic DNAs were extracted using the cetyltrimethylammonium bromide (CTAB) method (Doyle JJ and Doyle JL 1987) from 1.0 g frozen or 0.2–0.5 g silica-dried material, and quantified using a NanoDrop 2000 spectrophotometer (Thermo Scientific, Waltham, MA). Samples with total DNA concentrations > 100 ng/µl were selected and run on a 1.5% agarose gel to check for high-molecular-weight, nondegraded DNA. Samples were shipped on ice to Cold Spring Harbor Laboratory (CSHL, Woodbury, NY), where they were sheared to approximately 450 bp, followed by library preparation and barcoding for PE Illumina sequencing. The exception was Co. odontorhiza, for which library preparation and sequencing were completed at the University of Missouri–Columbia using the Illumina TruSeq Library Preparation Kit (Illumina Inc., San Diego, CA). Co. odontorhiza DNAs were sheared to approximately 320 bp. All samples were run on Illumina HiSeq 2000 machines to generate 100-bp PE reads. Indexed samples were pooled and run with accessions from other projects, with a total of 40 or 20 accessions in a single lane on the CSHL and University of Missouri machines, respectively. Voucher information and characteristics of each of the resulting data sets and completed plastomes are listed in table 1.

Quantification of Chlorophyll Content

Approximately 0.1 g of −80 °C frozen tissue was harvested from the outer wall of the developing ovary and weighed for chlorophyll quantification. Ovary tissue was chosen 1) because this is the only organ that is visibly green in some taxa (e.g., Co. odontorhiza, Co. wisteriana), and 2) to be consistent between taxa. Chlorophylls were extracted in 80% ice-cold acetone, stored at −80 °C, and chlorophylls a and b were quantified on a Cary 4000 UV-VIS spectrophotometer (Agilent Technologies Inc., Santa Clara, CA; Department of Plant Cellular & Molecular Biology, Ohio State University), and their concentrations were calculated using formulas of Porra (2002) (Lima D, personal communication). Data from chlorophylls a and b were pooled to give total chlorophylls. Although individual accessions used to quantify chlorophyll concentration among species were not the same as those used for sequencing, multiple individuals from across each species' geographic range were included to account for variation in chlorophyll content among individuals within species. Corallorhiza bentleyi is included here (an endangered member of the Co. striata complex), but was not sequenced. Conversely, Mexican-endemic members of the Co. maculata complex (Co. macrantha, Co. bulbosa, Co. maculata var. mexicana) were not included in the chlorophyll analysis due to rarity and thus lack of fresh/frozen material. Voucher specimens for plants used for chlorophyll analysis are listed in supplementary figure S5, Supplementary Material online.
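A calculation of this kind can be sketched as follows. The text does not give the equations or wavelengths used, so the coefficients below are an assumption: they are the widely cited simultaneous equations for buffered 80% aqueous acetone (Porra et al. 1989), which yield concentrations in µg/ml from absorbances at 663.6 and 646.6 nm. The function names, the tissue-normalization step, and the example readings are hypothetical.

```python
def chlorophyll_ug_per_ml(a663_6, a646_6):
    """Chlorophyll a, b, and total (ug/ml) from absorbances in 80% acetone.

    Coefficients are assumed (Porra et al. 1989, buffered 80% acetone);
    absorbances are assumed to be baseline-corrected already.
    """
    chl_a = 12.25 * a663_6 - 2.55 * a646_6
    chl_b = 20.31 * a646_6 - 4.91 * a663_6
    return chl_a, chl_b, chl_a + chl_b

def per_gram_tissue(conc_ug_per_ml, extract_volume_ml, tissue_mass_g):
    """Convert an extract concentration to ug chlorophyll per g of tissue."""
    return conc_ug_per_ml * extract_volume_ml / tissue_mass_g

if __name__ == "__main__":
    # Hypothetical readings for roughly 0.1 g of tissue in 5 ml of extract.
    chl_a, chl_b, total = chlorophyll_ug_per_ml(0.40, 0.20)
    print(round(total, 3), round(per_gram_tissue(total, 5.0, 0.1), 1))
```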


Data Processing

The resulting paired FASTQ data were processed using a custom UNIX script (“process_fastq.sh”) that calls several freely available Perl scripts for the purposes of 1) quality filtering/trimming and adaptor removal (“IlluQC_PRLL.pl,” “TrimmingReads.pl”; Patel and Jain 2012); 2) “shuffling” PE files into a single file (“ShuffleSequences_fastq.pl,” part of the VELVET suite of tools [E. Cabot, University of Wisconsin-Madison; Zerbino and Birney 2008]); and 3) conversion to FASTA format (“fastq2fasta.pl,” B. Knaus 2009 [http://brianknaus.com/software/srtoolbox/, last accessed May 1, 2014]). Reads were trimmed from their 3′-ends, discarding bases under a quality threshold of PHRED score of 20 (corresponding to an error rate of 1/100 or greater), and also discarding trimmed reads below 61 bp in length.
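The trimming rule described above can be sketched in Python. This is not the logic of the Perl scripts named in the text, whose internals are not described here; it is a simplified stand-in that removes trailing bases below the threshold and then applies the length cutoff. The function names are hypothetical.

```python
def phred_to_error(q):
    """Error probability implied by a PHRED quality score."""
    return 10 ** (-q / 10)

def trim_3prime(seq, quals, min_q=20, min_len=61):
    """Trim low-quality bases from the 3' end of a read.

    seq: read sequence; quals: list of integer PHRED scores (same length).
    Returns (trimmed_seq, trimmed_quals), or None if the trimmed read is
    shorter than min_len and should be discarded.
    """
    end = len(seq)
    # Walk back from the 3' end while bases fall below the threshold.
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    if end < min_len:
        return None
    return seq[:end], quals[:end]

if __name__ == "__main__":
    read = "A" * 70
    quals = [30] * 65 + [10] * 5   # five low-quality bases at the 3' end
    trimmed = trim_3prime(read, quals)
    print(len(trimmed[0]), phred_to_error(20))
```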

Read Mapping

PE reads were then mapped five times iteratively in Geneious v6 (created by Biomatters Inc., Auckland, New Zealand; http://www.geneious.com, last accessed December 1, 2013), using automatically determined settings, to a combined file of 1) the Phalaenopsis aphrodite plastome (GenBank accession number NC_007499; Chang et al. 2006), 2) the Phoenix dactylifera mitochondrial genome (NC_016740; Fang et al. 2012), and 3) the Asclepias syriaca nuclear ribosomal DNA consensus operon (JF312046; Straub et al. 2011). The purpose of read mapping was to filter or “screen” reads that matched the reference genomes, and to get initial, crude estimates of coverage to be used in downstream de novo assemblies.

Plastome Assembly

Mean coverage estimates from Geneious were used as input for de novo plastome assembly (“-exp_cov” parameter). De novo assembly was completed using both the VELVET v1.2.10 assembler (Zerbino and Birney 2008) and the Geneious v6 de novo assembler. VELVET assemblies were generated using a variety of hash lengths (i.e., “word match” or “kmer length”) ranging from 51 to 81 for each complete data set, with specified expected coverage estimates taken from Geneious mapping (specified to be in approximate kmer coverage; see Zerbino and Birney 2008). Coverage cutoff values were input as roughly one-half of the specified expected coverage estimates, minimum contig length was set to 300 bp, and insert size was set to 450 bp (320 bp for Co. odontorhiza); the resulting contigs were assembled in Sequencher v6 (GeneCodes, Ann Arbor, MI). The resulting “draft de novo assembly” from VELVET was used as a reference to map reads in Geneious as described above (five iterations, automatically determined settings). Read mappings were visually scanned in Geneious to check for areas of low coverage or potential misassemblies. Read pairs that mapped successfully to the draft reference were used in a second round of de novo assembly, this time in Geneious, with automatically determined settings, saving 100 consensus contigs. This method allowed construction of complete draft plastomes with relative ease.
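As a rough illustration of how a mapping-based coverage estimate translates into VELVET settings, the sketch below applies the k-mer coverage relation given in the VELVET documentation, Ck = C × (L − k + 1) / L, and halves it for the coverage cutoff. The read length and the coverage value are hypothetical (the section does not state them), and the helper names are invented for this example.

```python
def kmer_coverage(nucleotide_coverage: float, read_length: int, k: int) -> float:
    """Convert nucleotide coverage C to k-mer coverage Ck = C * (L - k + 1) / L."""
    if k > read_length:
        raise ValueError("k cannot exceed the read length")
    return nucleotide_coverage * (read_length - k + 1) / read_length


def velvetg_options(nucleotide_coverage: float, read_length: int, k: int,
                    insert_size: int = 450) -> dict:
    """Assemble velvetg options following the settings described in the text."""
    exp_cov = kmer_coverage(nucleotide_coverage, read_length, k)
    return {
        "-exp_cov": exp_cov,
        "-cov_cutoff": exp_cov / 2,   # roughly one-half of expected coverage
        "-min_contig_lgth": 300,
        "-ins_length": insert_size,   # 320 for Co. odontorhiza
    }


if __name__ == "__main__":
    # Hypothetical input: 200x mean coverage from mapping, 100 bp reads.
    for k in (51, 61, 71, 81):
        print(k, velvetg_options(200.0, 100, k))
```

Because k-mer coverage falls as k grows, each hash length in the 51 to 81 range gets its own expected-coverage and cutoff values.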

Contigs from both assemblies (Geneious, Velvet) were merged in Sequencher to form complete plastomes (parameters = “dirty data” or “large gap,” minimum overlap = 20 bp, minimum similarity 70–90%). Any apparent “gaps” among contigs resulting from one software platform were usually covered by a contig from the other, allowing assembly of completed plastomes (Large Single Copy–Inverted Repeat B–Small Single Copy). In a few cases, VELVET scaffolds contained strings of Ns between 30 and 100 bp, mostly associated with homopolymer repeats (> 10 bp). These were resolved by 1) the Geneious de novo assembly (typically) and 2) UNIX “grep” searches, with flanking sequences of 30–50 bp, taken from each end of the string of Ns, used as search terms. Grep results were converted to FASTA format using a custom Perl script to build a “bridge” of reads. This was done iteratively until the region in question was resolved, although most cases were resolved with a single iteration. Grep searches were applied to all cases of ambiguity between contigs from the two de novo assembly methods as a quality control. Polymerase chain reaction amplification and Sanger sequencing were not necessary due to high read coverage (table 1).
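The grep-based “bridge” step can be approximated in a few lines. This is only a sketch of the idea, not the authors' Perl script: it searches reads for a flanking sequence on either strand, orients the hits, and naively extends the flank with the longest matching read rather than building a consensus. All names and the toy sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def bridge_reads(reads, flank: str):
    """Return reads containing `flank` on either strand, oriented like `flank`."""
    rc_flank = revcomp(flank)
    hits = []
    for read in reads:
        if flank in read:
            hits.append(read)
        elif rc_flank in read:
            hits.append(revcomp(read))
    return hits


def extend_right(flank: str, hits) -> str:
    """Naive extension: append the longest sequence found downstream of `flank`."""
    best_tail = ""
    for read in hits:
        tail = read[read.index(flank) + len(flank):]
        if len(tail) > len(best_tail):
            best_tail = tail
    return flank + best_tail


if __name__ == "__main__":
    reads = ["CCGATTACATTG", revcomp("GATTACATTGGC"), "AAAAAAA"]
    hits = bridge_reads(reads, "GATTACA")
    print(extend_right("GATTACA", hits))  # one iteration of bridging
```

Repeating the search with the end of the extended sequence as the new flank mirrors the iterative procedure described above.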

IR boundaries for each draft plastome were determined by using the first and last approximately 20 bp of the draft plastome as search terms (reverse complement) in Sequencher to find matches, and marking their match points as features. Then, the first approximately 700 bp and last approximately 1–2,000 bp of the draft plastome were copied, reverse complemented, and manually aligned to the IR using the marked features from above as reference points. IR boundaries were determined informatically using PE information (i.e., read pairs that spanned the IR boundaries). Because the orientation of the ends of the LSC and SSC can vary, the plastomes were often in different configurations; single copy regions were reoriented to correspond to the “standard” model of gene order to simplify downstream comparison. Special attention was paid to the possibility of significant modification or even loss of the IR (either partial or complete), and also to the possibility of other genomic rearrangements or structural modifications. In other words, when building draft plastomes, it was not assumed that all accessions displayed the “standard” plastome structure found in angiosperms (i.e., LSC, IRb, SSC, IRa).

Plastome Annotation

Plastid genomes were annotated by first aligning the set of coding loci (coding sequences [CDS]) containing introns and all tRNAs from Phalaenopsis to each draft plastome (minus IRa) in Sequencher to aid in intron/exon and tRNA boundary determination. Annotations were conducted in DOGMA (Wyman et al. 2004), which uses BLASTX searches against a database of annotated plastomes and amino acid translations, and allows manual determination of start/stop codons. DOGMA BLASTX parameters were as follows: genetic code = plant plastid, e value = 5, percent cutoff for protein coding genes = 55, and percent cutoff for tRNAs = 80. The resulting GenBank feature table was exported from the DOGMA server and imported along with the finished plastid FASTA file into SEQUIN (National Center for Biotechnology Information), where the annotation was validated and completed for submission to GenBank. A large inversion in Co. maculata var. maculata was manually reversed for the purposes of phylogenetic analyses and other whole-plastome comparisons, but the plastome for that accession was annotated with the inversion as assembled de novo. Plastome maps were drawn with GenomeVx (Conant and Wolfe 2008) and manually edited in Adobe Illustrator CS6 (Adobe, Inc., San Jose, CA).

Ribosomal DNA Assembly and Annotation

Nuclear ribosomal DNA reads were filtered by mapping reads from Co. odontorhiza to a nearly complete rDNA operon reference (partial ETS [External Transcribed Spacer], 18S rDNA, ITS1 [Internal Transcribed Spacer], 5.8S rDNA, ITS2, 26S rDNA) from A. syriaca, described above, as well as mapping to an ITS-spanning sequence from Co. odontorhiza from GenBank (accession number EU391326). Other regions were not assembled (e.g., 5S rDNA). Resulting reads that mapped to each reference were assembled de novo in Geneious, and resulting contigs were merged through overlap in Sequencher. This draft assembly was used for a second round of read mapping (original Co. odontorhiza reads) and de novo assembly in Geneious, resulting in a single consensus contig. The consensus contig was based on 50% majority per site, bearing in mind that the rDNA operon exists in numerous repeated copies that often differ from one another, appearing as polymorphism in read pileups. The Co. odontorhiza rDNA consensus operon was then used as a reference in read mapping for all other Corallorhiza species, followed by individual de novo assemblies as above.
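A per-site majority consensus of the kind described for the rDNA pileups might look like the sketch below. It is an assumption-laden simplification of what Geneious does: each pileup column is a string of base calls, a base is called only when it exceeds the 50% threshold, and anything else (including ties) is written as N instead of an IUPAC ambiguity code.

```python
from collections import Counter


def majority_consensus(columns, threshold: float = 0.5, ambiguous: str = "N") -> str:
    """Call a consensus base per pileup column using a majority threshold.

    Assumption: a base must make up strictly more than `threshold` of the
    A/C/G/T calls in a column; otherwise `ambiguous` is emitted.
    """
    consensus = []
    for column in columns:
        counts = Counter(base for base in column.upper() if base in "ACGT")
        total = sum(counts.values())
        if total == 0:
            consensus.append(ambiguous)
            continue
        base, n = counts.most_common(1)[0]
        consensus.append(base if n / total > threshold else ambiguous)
    return "".join(consensus)


if __name__ == "__main__":
    # Four hypothetical pileup columns; the second and third are polymorphic,
    # as expected among divergent rDNA repeat copies.
    print(majority_consensus(["AAAA", "AACC", "ACGT", "AAAC"]))
```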

Nearly complete ribosomal DNA operons (partial ETS, 18S rDNA, ITS1, 5.8S, ITS2, 26S rDNA; excluding 5S rDNA) were annotated by first aligning finished contigs in Geneious (using the MUSCLE v3.8.31 plugin in Geneious; Edgar 2004) and transferring the annotation features from Co. odontorhiza to the aligned Corallorhiza sequences. Then, the entire alignment was annotated in SEQUIN for submission to GenBank by adding each feature to Co. odontorhiza and subsequently propagating the features to the remaining sequences. Reads were again mapped in Geneious to each finished, annotated rDNA cistron to check for misassemblies. GenBank accession numbers for rDNA partial operons are KM390003–KM390012. Available outgroup sequences for Cymbidium, Oncidium, and Phalaenopsis were mined from GenBank for the purposes of downstream phylogenetic analyses (18S rDNA: Cy. farberi, GenBank accession number JN418934; ITS: Cy. aloifolium, JF729014; Phalaenopsis aphrodite, AY391543; O. andradeanum, FJ565598).

Data Matrix Assembly and Sequence Alignment

Annotated plastomes were imported into Geneious, with subsequent removal of the second copy of the IR (IRa), and aligned with MAFFT v7.157 (Katoh et al. 2002) using the “fast-progressive” method (gap opening penalty = 6, offset = 0.1), followed by manual adjustment. The alignment also included complete plastomes of Cy. aloifolium (GenBank accession number NC_021429), Oncidium Gower Ramsey (GenBank accession number NC_014056), and Phalaenopsis aphrodite (GenBank accession number NC_017049). A short inversion of the petN-psbM region in Cymbidium was removed for that accession in downstream analyses. Annotated CDS were exported from Geneious, and data from each locus were saved as separate FASTA files. Each file was submitted for codon-based alignment to the MACSE web server (Ranwez et al. 2011; http://mbb.univ-montp2.fr/MBB/subsection/softExec.php?soft=macse, last accessed May 30, 2014), which allows alignment of both intact open reading frames and incomplete sequences (including pseudogenes with modified reading frames). Alignments were adjusted manually when necessary and imported back into Geneious or SequenceMatrix (Vaidya et al. 2011), where they were concatenated into larger matrices. Protein-coding genes that contained pseudogenes (shifted reading frames, internal stop codons) in at least one taxon were excluded. Data matrices are deposited in supplementary fig. S6, Supplementary Material online.

Phylogenetic Analyses

Phylogenetic analyses were conducted using both Parsimony and ML. Parsimony searches were conducted in TNT (Goloboff et al. 2003, 2008) with 100 random addition starting sequences, saving 100 trees per replicate, and using tree bisection–reconnection branch swapping on the pool of saved trees. Support was assessed through 2,000 Jackknife pseudoreplicates with the above search parameters. Model fit for ML analyses was assessed for the complete plastome alignment in MEGA v6 (Tamura et al. 2013), which yielded GTR+G as the best-fit model based on the corrected Akaike Information Criterion (AICc) metric (Burnham and Anderson 2002). All ML analyses were conducted in RAxML v8.0.20 (Stamatakis 2006). Three separate searches were initiated for each matrix analyzed, from different random starting seeds, using an unpartitioned GTR+G model to ensure convergence on the same topology. A codon-partitioned model was not chosen due to the possibility of relaxed functional constraints among some coding regions for some taxa, which could possibly represent model violations. Support was assessed with 1,000 standard bootstrap pseudoreplicates. Cymbidium was chosen as the outgroup taxon.
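For readers unfamiliar with the AICc, the following sketch shows how candidate substitution models can be ranked with it. The formula is the standard small-sample correction, AICc = 2k − 2 ln L + 2k(k + 1)/(n − k − 1); the log-likelihoods, parameter counts, and sample size in the example are invented and are not the values obtained in MEGA.

```python
def aicc(log_likelihood: float, k: int, n: int) -> float:
    """Corrected Akaike Information Criterion for k free parameters and n data points."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n <= k + 1")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def best_model(fits: dict, n: int) -> str:
    """Return the model name with the lowest AICc; `fits` maps name -> (lnL, k)."""
    return min(fits, key=lambda name: aicc(fits[name][0], fits[name][1], n))


if __name__ == "__main__":
    # Hypothetical fits: (log-likelihood, number of free parameters).
    fits = {"JC": (-1050.0, 1), "HKY": (-1020.0, 5), "GTR+G": (-1000.0, 10)}
    print(best_model(fits, n=1000))
```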

Plastid Pseudogenes, Genomic Deletions, and Genomic Attributes

The presence of pseudogenes and whole gene/regional deletions was initially assessed visually in DOGMA by identifying interrupted BLASTX reading frames. All coding loci aligned in MACSE (see above) were imported into Geneious, and sequences were translated to assess pseudogene status through the presence of frame shifts and/or premature stop codons. Loci with interrupted reading frames were considered to be pseudogenes unless there was an alternative start or stop codon within three codon positions of the other taxa in the alignment. In addition, the alignment of whole, annotated plastomes (described above) was used to visualize any larger genomic deletions spanning CDS boundaries, or rearrangements.
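A first-pass screen for interrupted reading frames can be sketched as below. It only flags the two signals named above (a length that is not a multiple of three, and in-frame stop codons before the final codon); it does not implement the three-codon tolerance for alternative start or stop positions, and the function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the plant plastid code


def reading_frame_problems(cds: str) -> list:
    """List reasons a coding sequence may be a pseudogene candidate.

    Alignment gaps ('-') are removed first. The final codon is not checked,
    because a terminal stop codon is expected there.
    """
    seq = cds.upper().replace("-", "")
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length not a multiple of 3 (possible frame shift)")
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    for index, codon in enumerate(codons[:-1], start=1):
        if codon in STOP_CODONS:
            problems.append(f"premature stop at codon {index}")
    return problems


if __name__ == "__main__":
    print(reading_frame_problems("ATGAAATAA"))      # intact toy ORF
    print(reading_frame_problems("ATGTAAAAATAA"))   # internal stop
    print(reading_frame_problems("ATG-AAATA"))      # disrupted frame
```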


Hypotheses of differences in plastid genomic attributes among nongreen versus green taxa, and correlations between chlorophyll content and genomic attributes, were assessed among taxa using two phylogeny-based analytical approaches. First, a Phylogenetic Analysis of Variance (Garland et al. 1993) was used to test for differences in continuous plastome variables (e.g., length, GC content) among green versus nongreen taxa (i.e., a discrete grouping variable), taking phylogenetic relationships into account. Analyses were run in the R package “geiger” (Harmon et al. 2008) with 10,000 simulations each using the “aov.phylo” function, based on an ultrametric tree generated through Nonparametric Rate Smoothing (Sanderson 1997). These analyses were run for a complete data set of 13 taxa (three leafy, green outgroup taxa and ten species of Corallorhiza). A second set of analyses was conducted on a subset of six taxa of Corallorhiza for which chlorophyll data were available, to mitigate problems with missing chlorophyll data. In these analyses, continuous genomic variables (length, number of putatively functional genes, GC content) were correlated with chlorophyll content (continuous) using Phylogenetically Independent Contrasts to account for phylogenetic relatedness among taxa (Felsenstein 1985). The latter analyses were conducted under a Brownian motion model with the PDAP package (Midford et al. 2005) in Mesquite v2.75 (Maddison WP and Maddison DR 2011).
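The logic of Phylogenetically Independent Contrasts can be shown on a toy tree. The sketch below follows Felsenstein's (1985) algorithm under Brownian motion and correlates contrasts through the origin; it is not the PDAP implementation, and the tree encoding, tip names, and trait values are all hypothetical.

```python
import math


def independent_contrasts(node, trait, out):
    """Compute standardized contrasts for a bifurcating tree.

    A tip is a name (str); an internal node is ((left, length), (right, length)).
    Contrasts are appended to `out`; returns (node estimate, added branch length).
    """
    if isinstance(node, str):
        return trait[node], 0.0
    (left, v1), (right, v2) = node
    x1, extra1 = independent_contrasts(left, trait, out)
    x2, extra2 = independent_contrasts(right, trait, out)
    v1, v2 = v1 + extra1, v2 + extra2
    out.append((x1 - x2) / math.sqrt(v1 + v2))
    estimate = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
    return estimate, v1 * v2 / (v1 + v2)


def correlation_through_origin(a, b) -> float:
    """Correlation of two contrast vectors, forced through the origin."""
    numerator = sum(x * y for x, y in zip(a, b))
    return numerator / math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))


if __name__ == "__main__":
    tree = (((("A", 1.0), ("B", 1.0)), 1.0), ("C", 2.0))  # ((A:1,B:1):1,C:2)
    chlorophyll = {"A": 1.0, "B": 3.0, "C": 5.0}           # invented values
    plastome_kb = {"A": 137.0, "B": 141.0, "C": 148.0}     # invented values
    c1, c2 = [], []
    independent_contrasts(tree, chlorophyll, c1)
    independent_contrasts(tree, plastome_kb, c2)
    print(c1, c2, correlation_through_origin(c1, c2))
```

A tree with n tips yields n − 1 contrasts per trait, which is why the six-taxon subset leaves few degrees of freedom for the correlation.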

Model-Based Analyses of Selective Regime

Changes in selective regime among species of Corallorhiza for codon-based matrices were assessed with the CODEML module in PAML v4.8 (Yang 1997, 2007), using “branch” models. Only loci for which all taxa had open reading frames were included in the analyses. Various matrices are listed in table 3. The objective was to test hypotheses of different $\omega$ ratios for 1) Corallorhiza versus leafy outgroups and 2) nongreen Corallorhiza versus green Corallorhiza + leafy outgroups. To accomplish this, different branch models were fit to the data using the topology resulting from the ML analysis of whole plastomes + rDNA. Six different branch models were tested. First, a model in which all branches evolve under one $\omega$ ratio (Model M0) was tested, where $\omega = d_N/d_S$, the ratio of nonsynonymous substitutions per nonsynonymous site to synonymous substitutions per synonymous site. A two-ratio model (M1) was then run, in which all branches of Corallorhiza were allowed to have a different ratio ($\omega_{\text{cor}}$) than the remaining leafy outgroup sequences ($\omega_{\text{leafy}}$). A three-ratio model was then run (M2), allowing separate values for $\omega_{\text{leafy}}$, $\omega_{\text{cor}}$, and nongreen Corallorhiza ($\omega_{\text{str}} = \omega_{\text{mac-ng}}$; Co. striata, and Co. mertensiana, Co. maculata var. maculata and occidentalis, respectively), followed by a four-ratio model (M3) with additionally separate ratios for Co. striata ($\omega_{\text{str}}$) and the nongreen members of the Co. maculata complex ($\omega_{\text{mac-ng}}$). The latter three models (M1–M3) specifically test hypotheses of whether Corallorhiza, and particularly nongreen Corallorhiza, display departures from purifying selection. Two additional models were run to test whether the nongreen Corallorhiza differ significantly from green Corallorhiza and leafy outgroup taxa. Specifically, Model M4 (two ratios) sets $\omega_{\text{leafy}} = \omega_{\text{cor}}$ and $\omega_{\text{str}} = \omega_{\text{mac-ng}}$, whereas M5 (three ratios) allows $\omega_{\text{str}}$ and $\omega_{\text{mac-ng}}$ to differ. Successively nested models were compared based on log-likelihood ratio tests (M1 vs. M0, M2 vs. M1, M3 vs. M2, M4 vs. M0, M5 vs. M4), where the likelihood ratio test statistic was calculated as two times the difference in log-likelihood between the two models, with degrees of freedom calculated as the difference in the number of free parameters for the models, and tested against the $\chi^2$ distribution. P values for multiple comparisons to the same data set were Bonferroni-corrected (Dunn 1959), as suggested by Yang (2007), with the corrected significance level corresponding to $\alpha/m$, where m is the number of branches being tested. Genes ycf1 and ycf2 were analyzed separately because they have extraordinarily high substitution rates (e.g., Barrett et al. 2013), as was matK, based on the finding of a pseudogenized copy in a previous study (Freudenstein and Senyo 2008).
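The nested-model comparison can be sketched with the standard library alone. The code computes the likelihood ratio statistic as twice the log-likelihood difference, evaluates it against a chi-square distribution with integer degrees of freedom, and applies a Bonferroni-style threshold of alpha/m. The log-likelihood values are invented, and the helper names are hypothetical; CODEML itself only supplies the log-likelihoods.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square distribution (integer df >= 1)."""
    if x <= 0:
        return 1.0
    if df % 2:                      # odd df: start from df = 1
        q, j = math.erfc(math.sqrt(x / 2)), 1
    else:                           # even df: start from df = 2
        q, j = math.exp(-x / 2), 2
    while j < df:                   # recurrence Q(j + 2) = Q(j) + term
        q += (x / 2) ** (j / 2) * math.exp(-x / 2) / math.gamma(j / 2 + 1)
        j += 2
    return q


def likelihood_ratio_test(lnl_null: float, lnl_alt: float, df: int,
                          alpha: float = 0.05, m: int = 1):
    """Return (statistic, P value, significant at alpha / m)."""
    statistic = 2 * (lnl_alt - lnl_null)
    p_value = chi2_sf(statistic, df)
    return statistic, p_value, p_value < alpha / m


if __name__ == "__main__":
    # Hypothetical M0 (one ratio) versus M1 (two ratios): one extra parameter.
    print(likelihood_ratio_test(lnl_null=-1000.0, lnl_alt=-995.0, df=1, m=5))
```

The same function covers the branch-site comparisons, which also differ by one parameter.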

To further investigate the $\omega$ values among branches for the atp complex, two sets of branch-site models were tested against the data. In the first, all Corallorhiza were set as the foreground branch, and in the second, nongreen Corallorhiza (Co. striata vreelandii, Co. maculata var. maculata and occidentalis, and Co. mertensiana) were set as foreground branches. For both sets of branch models, two alternative models were tested. The “null” model specifies $\omega_0 < 1$ for a proportion of sites and $\omega_1 = \omega_2 = 1$ for the remaining sites (no positive selection), whereas the alternative model allows $\omega_2 > 1$ (positive selection on some sites). The alternative model (some sites under positive selection) was then tested against the null model (no sites under positive selection) using a $\chi^2$ distribution as above, with 1 degree of freedom (i.e., the alternative and null models differ by one parameter).

Comparison with Other Parasitic Angiosperm Lineages

Plastid gene content (losses, pseudogenes, putatively functional genes) was compared among diverse lineages of angiosperms containing heterotrophs, including newly sequenced Corallorhiza. Taxa and genes were sorted by the numbers of putatively functional genes and gene functional class, respectively, to illustrate the different stages of plastid genome degradation. Specifically, genes were arranged according to the model of Barrett and Davis (2012), in which degradation of plastid functional gene classes is hypothesized to have occurred roughly in the following order: 1) NAD(P)H dehydrogenase genes (ndh); 2) “photosynthesis”-related genes, including pet, psa, psb, cemA, ccsA, rbcL, ycf3, and ycf4; 3) Plastid-encoded RNA Polymerase genes (rpo); 4) ATP synthase genes (atp); and finally, 5) “housekeeping” genes (rpl, rps, rrn, trn, matK, clpP, infA, accD, ycf1, and ycf2). Though not a formal test in the strict sense, the use of this model does allow an assessment of how advanced the sequenced accessions of Corallorhiza are when compared with other heterotrophic angiosperms; it is hypothesized that Corallorhiza occupies the early stages of this model. Technically, atp genes function in the generation of ATP and movement of protons across the Thylakoid Membrane, and thus play a crucial role in photosynthesis, but this functional class was kept separate from photosynthesis genes due to the possibility that ATP synthase serves roles outside of photosynthesis.
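The sorting scheme described above can be expressed compactly. The sketch assigns each gene to one of the five functional classes named in the text and orders genes by the hypothesized sequence of degradation, and taxa by their number of putatively functional genes. The class assignments come from the text; the function names and the example counts are hypothetical.

```python
CLASS_ORDER = ["ndh", "photosynthesis", "rpo", "atp", "housekeeping"]

PREFIX_CLASS = {
    "ndh": "ndh", "pet": "photosynthesis", "psa": "photosynthesis",
    "psb": "photosynthesis", "rpo": "rpo", "atp": "atp",
    "rpl": "housekeeping", "rps": "housekeeping",
    "rrn": "housekeeping", "trn": "housekeeping",
}
EXACT_CLASS = {
    "cemA": "photosynthesis", "ccsA": "photosynthesis", "rbcL": "photosynthesis",
    "ycf3": "photosynthesis", "ycf4": "photosynthesis",
    "matK": "housekeeping", "clpP": "housekeeping", "infA": "housekeeping",
    "accD": "housekeeping", "ycf1": "housekeeping", "ycf2": "housekeeping",
}


def functional_class(gene: str) -> str:
    """Functional class of a plastid gene under the five-class scheme."""
    if gene in EXACT_CLASS:
        return EXACT_CLASS[gene]
    return PREFIX_CLASS.get(gene[:3], "unclassified")


def sort_genes(genes):
    """Order genes by hypothesized degradation stage, then by name."""
    rank = {name: i for i, name in enumerate(CLASS_ORDER)}
    return sorted(genes, key=lambda g: (rank.get(functional_class(g), len(rank)), g))


def sort_taxa(functional_gene_counts: dict):
    """Order taxa from most to fewest putatively functional genes."""
    return sorted(functional_gene_counts, key=functional_gene_counts.get, reverse=True)


if __name__ == "__main__":
    print(sort_genes(["rpl2", "atpA", "ndhF", "rbcL", "rpoB"]))
    print(sort_taxa({"taxon_1": 70, "taxon_2": 110, "taxon_3": 95}))  # invented counts
```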

Supplementary Material

Supplementary figures S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by the National Science Foundation (awards DEB-0830020 to Jerrold I. Davis and DBI 1110443 to J.C.P.), the California State University Program for Education and Research in Biotechnology (CSUPERB) to C.F.B., the California State University, Los Angeles Center for Effective Teaching and Learning Faculty-Student Mentorship award to C.F.B. and J.L., and the California State University Council on Ocean Affairs, Science and Technology to C.S. The authors are grateful to Jerrold Davis, Patrick Edger, Ryan Ellingson, Gwynne Lim, Jeffrey Morawetz, and Chris Randle for feedback and laboratory/bioinformatics support; the staff at Cold Spring Harbor Laboratory (Woodbury, NY) for sample preparation and sequencing; Daniel Lima and Richard Sayre for assistance with chlorophyll spectroscopy and data interpretation; Gerardo Salazar (UNAM) for assistance in collecting Mexican specimens; and Maria Logacheva and three anonymous reviewers for critical insight that improved the paper.

References

Allen JF, de Paula WBM, Puthiyaveetil S, Nield J. 2011. A structural phylogenetic map for chloroplast photosynthesis. Trends Plant Sci. 16:645–655.

Barbrook AC, Howe CJ, Purton S. 2006. Why are plastid genomes retained in non-photosynthetic organisms? Trends Plant Sci. 11:101–108.

Barrett CF, Davis JI. 2012. The plastid genome of the mycoheterotrophic Corallorhiza striata (Orchidaceae) is in the relatively early stages of degradation. Am J Bot. 99:1513–1523.

Barrett CF, Davis JI, Leebens-Mack J, Conran JG, Stevenson DW. 2013. Plastid genomes and deep relationships among the commelinid monocot angiosperms. Cladistics 29:65–87.

Barrett CF, Freudenstein JV. 2008. Molecular evolution of rbcL in the mycoheterotrophic coralroot orchids (Corallorhiza Gagnebin, Orchidaceae). Mol Phylogenet Evol. 47:665–679.

Bellino A, Alfani A, Selosse M-A, Guerrieri R, Borghetti M, Baldantoni D. 2014. Nutritional regulation in mixotrophic plants: new insights from Limodorum abortivum. Oecologia 175:875–885.

Bernardi G. 1989. The isochore organization of the human genome. Annu Rev Genet. 23:637–661.

Bidartondo MI. 2005. The evolutionary ecology of myco-heterotrophy. New Phytol. 167:335–352.

Blazier J, Guisinger MM, Jansen RK. 2011. Recent loss of plastid-encoded ndh genes within Erodium (Geraniaceae). Plant Mol Biol. 76:263–272.

Braukmann T, Kuzmina M, Stefanovic S. 2013. Plastid genome evolution across the genus Cuscuta (Convolvulaceae): two clades within subgenus Grammica exhibit extensive gene loss. J Exp Bot. 64:977–989.

Braukmann T, Stefanovic S. 2012. Plastid genome evolution in mycoheterotrophic Ericaceae. Plant Mol Biol. 79:5–20.

Braukmann TWA, Kuzmina M, Stefanovic S. 2009. Loss of all plastid ndh genes in Gnetales and conifers: extent and evolutionary significance for the seed plant phylogeny. Curr Genet. 55:323–337.

Burnham KP, Anderson DR. 2002. Avoiding pitfalls when using information-theoretic methods. J Wildl Manage. 66:912–918.

Cameron DD, Preiss K, Gebauer G, Read DJ. 2009. The chlorophyll-containing orchid Corallorhiza trifida derives little carbon through photosynthesis. New Phytol. 183:358–364.

Chang CC, Lin HC, Lin IP, Chow TY, Chen HH, Chen WH, Cheng CH, Lin CY, Liu SM, Chang CC, et al. 2006. The chloroplast genome of Phalaenopsis aphrodite (Orchidaceae): comparative analysis of evolutionary rate with that of grasses and its phylogenetic implications. Mol Biol Evol. 23:279–291.

Conant GC, Wolfe KH. 2008. GenomeVx: simple web-based creation of editable circular chromosome maps. Bioinformatics 24:861–862.

Cummings MP, Welschmeyer NA. 1998. Pigment composition of putatively achlorophyllous angiosperms. Plant Syst Evol. 210:105–111.

Delannoy E, Fujii S, Colas des Francs-Small C, Brundrett M, Small I. 2011. Rampant gene loss in the underground orchid Rhizanthella gardneri highlights evolutionary constraints on plastid genomes. Mol Biol Evol. 28:2077–2086.

Doyle JJ, Doyle JL. 1987. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem Bull. 19:11–15.

Dunn OJ. 1959. Confidence intervals for the means of dependent, normally distributed variables. J Am Stat Assoc. 54:613–621.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Evron Y, Johnson EA, McCarty RE. 2000. Regulation of proton flow and ATP synthesis in chloroplasts. J Bioenerg Biomembr. 32:501–506.

Fang YJ, Wu H, Zhang TW, Yang M, Yin YX, Pan LL, Yu XG, Zhang XW, Hu SNA, Al-Mssallem IS, et al. 2012. A complete sequence and transcriptomic analyses of date palm (Phoenix dactylifera L.) mitochondrial genome. PLoS One 7:e37164.

Felsenstein J. 1985. Phylogenies and the comparative method. Am Nat. 125:1–15.

Freudenstein JV. 1992. Systematics of Corallorhiza and the Corallorhizinae (Orchidaceae) [PhD dissertation]. [Ithaca (NY)]: Cornell University.

Freudenstein JV. 1997. A monograph of Corallorhiza (Orchidaceae). Harv Pap Bot. 10:5–51.

Freudenstein JV, Barrett CF. 2010. Mycoheterotrophy and diversity in Orchidaceae. In: Seberg O, Petersen G, Barfod A, Davis JI, editors. Diversity, phylogeny and evolution in the monocotyledons. The Proceedings of the Fourth International Conference on Monocot Systematics. Aarhus (Denmark): Aarhus University Press. p. 25–37.

Freudenstein JV, Doyle JJ. 1994a. Character transformation and relationships in Corallorhiza (Orchidaceae: Epidendroideae). 1. Plastid DNA. Am J Bot. 81:1449–1457.

Freudenstein JV, Doyle JJ. 1994b. Plastid DNA, morphological variation, and the phylogenetic species concept: the Corallorhiza maculata (Orchidaceae) complex. Syst Bot. 19:273–290.

Freudenstein JV, Senyo DM. 2008. Relationships and evolution of matK in a group of leafless orchids (Corallorhiza and Corallorhizinae; Orchidaceae: Epidendroideae). Am J Bot. 95:498–505.

Funk HT, Berg S, Krupinska K, Maier UG, Krause K. 2007. Complete DNA sequences of the plastid genomes of two parasitic flowering plant species, Cuscuta reflexa and Cuscuta gronovii. BMC Plant Biol. 7:45.

Galagan JE, Calvo SE, Borkovich KA, Selker EU, Read ND, Jaffe D, FitzHugh W, Ma LJ, Smirnov S, Purcell S, et al. 2003. The genome sequence of the filamentous fungus Neurospora crassa. Nature 422:859–868.

Garland T, Dickerman AW, Janis CM, Jones JA. 1993. Phylogenetic analysis of covariance by computer simulation. Syst Biol. 42:265–292.

Glemin S, Clement Y, David J, Ressayre A. 2014. GC content evolution in coding regions of angiosperm genomes: a unifying hypothesis. Trends Genet. 30:263–270.

Goloboff PA, Farris JS, Kallersjo M, Oxelman B, Ramirez MJ, Szumik CA. 2003. Improvements to resampling measures of group support. Cladistics 19:324–332.

Goloboff PA, Farris JS, Nixon KC. 2008. TNT, a free program for phylogenetic analysis. Cladistics 24:774–786.


Hajdukiewicz PTJ, Allison LA, Maliga P. 1997. The two RNA polymerases encoded by the nuclear and the plastid compartments transcribe distinct groups of genes in tobacco plastids. EMBO J. 16:4041–4048.

Harmon LJ, Weir JT, Brock CD, Glor RE, Challenger W. 2008. GEIGER: investigating evolutionary radiations. Bioinformatics 24:129–131.

Iles WJD, Smith SY, Graham SW. 2013. A well-supported phylogenetic framework for the monocot order Alismatales reveals multiple losses of the plastid NADH dehydrogenase complex and a strong long-branch effect. In: Wilkin P, Mayo S, editors. Early events in monocot evolution. Cambridge: Cambridge University Press. p. 1–28.

Iorizzo M, Grzebelus D, Senalik D, Szklarczyk M, Spooner D, Simon P. 2012. Against the traffic: the first evidence for mitochondrial DNA transfer into the plastid genome. Mob Genet Elements. 2:261.

Jensen PE, Bassi R, Boekema EJ, Dekker JP, Jansson S, Leister D, Robinson C, Scheller HV. 2007. Structure, function and regulation of plant photosystem I. Biochim Biophys Acta. 1767:335–352.

Katinka MD, Duprat S, Cornillot E, Metenier G, Thomarat F, Prensier G, Barbe V, Peyretaillade E, Brottier P, Wincker P, et al. 2001. Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature 414:450–453.

Katoh K, Misawa K, Kuma Ki, Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30:3059–3066.

Kawakami K, Umena Y, Iwai M, Kawabata Y, Ikeuchi M, Kamiya N, Shen J-R. 2011. Roles of PsbI and PsbM in photosystem II dimer formation and stability studied by deletion mutagenesis and X-ray crystallography. Biochim Biophys Acta. 1807:319–325.

Knaus BJ. 2010. Short read toolbox. Available from: http://brianknaus.com

Krause K. 2008. From chloroplasts to “cryptic” plastids: evolution of plastid genomes in parasitic plants. Curr Genet. 54:111–121.

Krause K. 2012. Plastid genomes of parasitic plants: a trail of reductions and losses. In: Bullerwell C, editor. Organelle genetics. Berlin (Germany): Springer. p. 79–103.

Kuijt J. 1969. The biology of flowering parasitic plants. Berkeley (CA): University of California Press.

Lawrence JG. 2005. Common themes in the genome strategies of pathogens. Curr Opin Genet Dev. 15:584–588.

Leake JR. 1994. The biology of myco-heterotrophic (Saprophytic) plants. New Phytol. 127:171–216.

Leake JR, Cameron DD. 2010. Physiological ecology of mycoheterotrophy. New Phytol. 185:601–605.

Liere K, Weihe A, Borner T. 2011. The transcription machineries of plant mitochondria and chloroplasts: composition, function, and regulation. J Plant Physiol. 168:1345–1360.

Logacheva MD, Schelkunov MI, Nuraliev MS, Samigullin TH, Penin AA. 2014. The plastid genome of mycoheterotrophic monocot Petrosavia stellaris exhibits both gene losses and multiple rearrangements. Genome Biol Evol. 6:238–246.

Logacheva MD, Schelkunov MI, Penin AA. 2011. Sequencing and analysis of plastid genome in mycoheterotrophic orchid Neottia nidus-avis. Genome Biol Evol. 3:1296–1303.

Maddison WP, Maddison DR. 2011. Mesquite: a modular system for evolutionary analysis, version 2.75. Available from: http://mesquiteproject.org

Martín M, Funk HT, Serrot PH, Poltnigg P, Sabater B. 2009. Functional characterization of the thylakoid ndh complex phosphorylation by site-directed mutations in the ndhF gene. Biochim Biophys Acta. 1787:920–928.

Martín M, Sabater B. 2010. Plastid ndh genes in plant evolution. Plant Physiol Biochem. 48:636–645.

McCarty RE, Evron Y, Johnson EA. 2000. The chloroplast ATP synthase: a rotary enzyme? Annu Rev Plant Physiol Plant Mol Biol. 51:83–109.

McNeal JR, Arumugunathan K, Kuehl JV, Boore JL, Depamphilis CW. 2007. Systematics and plastid genome evolution of the cryptically photosynthetic parasitic plant genus Cuscuta (Convolvulaceae). BMC Biol. 5:55.

McNeal JR, Bennett JR, Wolfe AD, Mathews S. 2013. Phylogeny and origins of holoparasitism in Orobanchaceae. Am J Bot. 100:971–983.

Merckx V, Freudenstein JV. 2010. Evolution of mycoheterotrophy in plants: a phylogenetic perspective. New Phytol. 185:605–609.

Midford PE, Garland T Jr, Maddison WP. 2005. PDAP package of Mesquite, version 1.07. Available from: http://mesquiteproject.org

Molina J, Hazzouri KM, Nickrent D, Geisler M, Meyer RS, Pentony MM, Flowers JM, Pelser P, Barcelona J, Inovejas SA, et al. 2014. Possible loss of the chloroplast genome in the parasitic flowering plant Rafflesia lagascae (Rafflesiaceae). Mol Biol Evol. 31:793–803.

Montfort C, Küsters E. 1940. Saprophytismus und Photosynthese. I. Biochemische physiologische Studien an Humus-Orchideen. Botanisches Archiv 40:571–633.

Moran NA. 2002. Microbial minimalism: genome reduction in bacterial pathogens. Cell 108:583–586.

Morrison HG, McArthur AG, Gillin FD, Aley SB, Adam RD, Olsen GJ, Best AA, Cande WZ, Chen F, Cipriano MJ, et al. 2007. Genomic minimalism in the early diverging intestinal parasite Giardia lamblia. Science 317:1921–1926.

Nickrent DL, Yan OY, Duff RJ, dePamphilis CW. 1997. Do nonasterid holoparasitic flowering plants have plastid genomes? Plant Mol Biol. 34:717–729.

Patel RK, Jain M. 2012. NGS QC Toolkit: a toolkit for quality control of next generation sequencing data. PLoS One 7:e30619.

Peredo EL, King UM, Les DH. 2013. The plastid genome of Najas flexilis: adaptation to submersed environments is accompanied by the complete loss of the NDH complex in an aquatic angiosperm. PLoS One 8:e68591.

Porra RJ. 2002. The chequered history of the development and use of simultaneous equations for the accurate determination of chlorophylls a and b. Photosyn Res. 73:149–156.

Protasio AV, Tsai IJ, Babbage A, Nichol S, Hunt M, Aslett MA, De Silva N, Velarde GS, Anderson TJC, Clark RC, et al. 2012. A systematically improved high quality genome and transcriptome of the human blood fluke Schistosoma mansoni. PLoS Negl Trop Dis. 6:e1455.

Randle CP, Wolfe AD. 2005. The evolution and expression of rbcL in holoparasitic sister-genera Harveya and Hyobanche (Orobanchaceae). Am J Bot. 92:1575–1585.

Ranwez V, Harispe S, Delsuc F, Douzery EJP. 2011. MACSE: Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons. PLoS One 6:e22594.

Sanderson MJ. 1997. A nonparametric approach to estimating divergence times in the absence of rate constancy. Mol Biol Evol. 14:1218–1231.

Smith DR, Lee RW. 2014. A plastid without a genome: evidence from the nonphotosynthetic green algal genus Polytomella. Plant Physiol. 164:1812–1819.

Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690.

Straub SCK, Cronn RC, Edwards C, Fishbein M, Liston A. 2013. Horizontal transfer of DNA from the mitochondrial to the plastid genome and its subsequent evolution in milkweeds (Apocynaceae). Genome Biol Evol. 5:1872–1885.

Straub SCK, Fishbein M, Livshultz T, Foster Z, Parks M, Weitemier K, Cronn RC, Liston A. 2011. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing. BMC Genomics 12:211.

Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. 2013. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol. 30:2725–2729.

Tsai IJ, Zarowiecki M, Holroyd N, Garciarrubio A, Sanchez-Flores A, Brooks KL, Tracey A, Bobes RJ, Fragoso G, Sciutto E, et al. 2013. The genomes of four tapeworm species reveal adaptations to parasitism. Nature 496:57–63.

Umate P, Schwenkert S, Karbat I, Dal Bosco C, Mlcochova L, Volz S, Zer H, Herrmann RG, Ohad I, Meurer J. 2007. Deletion of psbM in tobacco alters the Q(B) site properties and the electron flow within photosystem II. J Biol Chem. 282:9758–9767.

Vaidya G, Lohman DJ, Meier R. 2011. SequenceMatrix: concatenation software for the fast assembly of multi-gene datasets with character set and codon information. Cladistics 27:171–180.

Westwood JH, Yoder JI, Timko MP, dePamphilis CW. 2010. The evolution of parasitism in plants. Trends Plant Sci. 15:227–235.

Wicke S, Müller KF, de Pamphilis CW, Quandt D, Wickett NJ, Zhang Y, Renner SS, Schneeweiss GM. 2013. Mechanisms of functional and physical genome reduction in photosynthetic and nonphotosynthetic parasitic plants of the broomrape family. Plant Cell 25:3711–3725.

Wicke S, Schäferhoff B, dePamphilis CW, Müller KF. 2014. Disproportional plastome-wide increase of substitution rates and relaxed purifying selection in genes of carnivorous Lentibulariaceae. Mol Biol Evol. 31:529–545.

Wicke S, Schneeweiss GM, dePamphilis CW, Müller KF, Quandt D. 2011. The evolution of the plastid chromosome in land plants: gene content, gene order, gene function. Plant Mol Biol. 76:273–297.

Wickett NJ, Honaas LA, Wafula EK, Das M, Huang K, Wu BA, Landherr L, Timko MP, Yoder J, Westwood JH, et al. 2011. Transcriptomes of the parasitic plant family Orobanchaceae reveal surprising conservation of chlorophyll synthesis. Curr Biol. 21:2098–2104.

Wimpee CF, Wrobel RL, Garvin DK. 1991. A divergent plastid genome in Conopholis americana, an achlorophyllous parasitic plant. Plant Mol Biol. 17:161–166.

Wolfe AD, dePamphilis CW. 1998. The effect of relaxed functional constraints on the photosynthetic gene rbcL in photosynthetic and nonphotosynthetic parasitic plants. Mol Biol Evol. 15:1243–1258.

Wolfe KH, Morden CW, Palmer JD. 1992. Function and evolution of a minimal plastid genome from a nonphotosynthetic parasitic plant. Proc Natl Acad Sci U S A. 89:10648–10652.

Wu FH, Chan MT, Liao DC, Hsu CT, Lee YW, Daniell H, Duvall MR, Lin CS. 2010. Complete chloroplast genome of Oncidium Gower Ramsey and evaluation of molecular markers for identification and breeding in Oncidiinae. BMC Plant Biol. 10:68.

Wyman SK, Jansen RK, Boore JL. 2004. Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20:3252–3255.

Yang JB, Tang M, Li HT, Zhang ZR, Li DZ. 2013. Complete chloroplast genome of the genus Cymbidium: lights into the species identification, phylogenetic implications and population genetic analyses. BMC Evol Biol. 13:84.

Yang ZH. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci. 13:555–556.

Yang ZH. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 24:1586–1591.

Yoder OC, Turgeon BG. 2001. Fungal genomics and pathogenicity. Curr Opin Plant Biol. 4:315–321.

Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18:821–829.

Zimmer K, Meyer C, Gebauer G. 2008. The ectomycorrhizal specialist orchid Corallorhiza trifida is a partial myco-heterotroph. New Phytol. 178:395–400.


Barrett et al doi101093molbevmsu252 MBE by guest on July 7 2016


would cause purifying selection on the plastid- (and presumably nuclear-) encoded components of photosynthesis.

The more puzzling finding applies to nongreen taxa: nongreen coralroots had surprisingly consistent (albeit relatively minute) chlorophyll concentrations, with very little variation within or among species (fig. 1), despite having been sampled from different parts of their respective geographic ranges. Chlorophyll has been detected in other strictly heterotrophic plants (Cummings and Welschmeyer 1998), including members of Orobanchaceae and the monotropoid Ericaceae.

The question naturally arises: Why would putatively nonphotosynthetic taxa continue to produce chlorophyll? There are several hypotheses as to why this might be the case, including photoprotection/avoidance of light damage to DNA and other cellular components; involvement of the chlorophyll biosynthesis machinery in other biochemical pathways, such as plastid-nuclear signaling; or prevention of the buildup of chlorophyll precursors that may lead to increased oxidative damage (discussed in Wickett et al. 2011 and references therein). In nearly all cases studied, strictly parasitic species

Table 3. Branch-Models for Coding Regions of the Plastome in Corallorhiza for Which All Accessions Have Intact Reading Frames, Based on Various Configurations of Coding Data, Implemented in the CODEML Module of PAML.

| Model | Omega Estimates | np (a) | ln L (b) | χ² (c) | P/Pcorr (d) |
|---|---|---|---|---|---|
| atp genes | | | | | |
| M0 | ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.21505 | 26 | -9802.76868 | | |
| M1 | ω_leafy = 0.15175; ω_cor = ω_mac-ng = ω_str = 0.26949 | 27 | -9798.12794 | 9.281476 | |
| M2 | ω_leafy = 0.15211; ω_cor = 0.22238; ω_mac-ng = ω_str = 0.38239 | 28 | -9795.68827 | 4.879344 | – |
| M3 | ω_leafy = 0.15212; ω_cor = 0.22242; ω_mac-ng = 0.41999; ω_str = 0.36243 | 29 | -9795.62404 | 0.128448 | |
| M4 | ω_leafy = ω_cor = 0.18380; ω_mac-ng = ω_str = 0.38274 | 27 | -9797.32610 | 10.885142 | |
| M5 | ω_leafy = ω_cor = 0.18382; ω_mac-ng = 0.41997; ω_str = 0.36292 | 28 | -9797.26308 | 0.126042 | |
| Photosynthesis genes (pet, psa/b, rbcL) | | | | | |
| M0 | ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.33866 | 26 | -1854.62551 | | |
| M1 | ω_leafy = 0.29790; ω_cor = ω_mac-ng = ω_str = 0.32741 | 27 | -1854.48900 | 0.273012 | |
| M2 | ω_leafy = 0.29856; ω_cor = 0.25277; ω_mac-ng = ω_str = 0.73721 | 28 | -1852.32692 | 4.324170 | – |
| M3 | ω_leafy = 0.29858; ω_cor = 0.25278; ω_mac-ng = 0.59352; ω_str = 0.88556 | 29 | -1852.22042 | 0.213002 | |
| M4 | ω_leafy = ω_cor = 0.27361; ω_mac-ng = ω_str = 0.73720 | 27 | -1852.39825 | 4.454520 | – |
| M5 | ω_leafy = ω_cor = 0.27362; ω_mac-ng = 0.59352; ω_str = 0.88550 | 28 | -1852.29177 | 0.212952 | |
| Housekeeping genes (rps/l, infA, accD, clpP, etc.) | | | | | |
| M0 | ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.30179 | 26 | -25545.1250 | | |
| M1 | ω_leafy = 0.29055; ω_cor = ω_mac-ng = ω_str = 0.32343 | 27 | -25542.7794 | 4.691148 | – |
| M2 | ω_leafy = 0.29066; ω_cor = 0.31584; ω_mac-ng = ω_str = 0.34192 | 28 | -25542.6470 | 0.264796 | |
| M3 | ω_leafy = 0.29065; ω_cor = 0.31583; ω_mac-ng = 0.33612; ω_str = 0.34612 | 29 | -25542.6407 | 0.012646 | |
| M4 | ω_leafy = ω_cor = 0.30267; ω_mac-ng = ω_str = 0.34230 | 27 | -25542.8978 | 4.454430 | – |
| M5 | ω_leafy = ω_cor = 0.30269; ω_mac-ng = 0.33612; ω_str = 0.34668 | 28 | -25542.8908 | 0.014052 | |
| ycf1 & 2 | | | | | |
| M0 | ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.66003 | 26 | -25384.8570 | | |
| M1 | ω_leafy = 0.82314; ω_cor = ω_mac-ng = ω_str = 0.61213 | 27 | -25383.1494 | 3.415136 | |
| M2 | ω_leafy = 0.82370; ω_cor = 0.61073; ω_mac-ng = ω_str = 0.61543 | 28 | -25383.1485 | 0.001854 | |
| M3 | ω_leafy = 0.82338; ω_cor = 0.61050; ω_mac-ng = 0.59460; ω_str = 0.63131 | 29 | -25383.1215 | 0.053994 | |
| M4 | ω_leafy = ω_cor = 0.67331; ω_mac-ng = ω_str = 0.61464 | 27 | -25384.6635 | 0.386918 | |
| M5 | ω_leafy = ω_cor = 0.67313; ω_mac-ng = 0.59296; ω_str = 0.62959 | 28 | -25384.6382 | 0.050592 | |
| matK | | | | | |
| M0 | ω_leafy = ω_cor = ω_mac-ng = ω_str = 0.43481 | 26 | -3666.30553 | | |
| M1 | ω_leafy = 0.44943; ω_cor = ω_mac-ng = ω_str = 0.42324 | 27 | -3666.27784 | 0.055394 | |
| M2 | ω_leafy = 0.45060; ω_cor = 0.38642; ω_mac-ng = ω_str = 0.53159 | 28 | -3665.93065 | 0.694374 | |
| M3 | ω_leafy = 0.45107; ω_cor = 0.38419; ω_mac-ng = 0.33356; ω_str = 0.99284 | 29 | -3664.63804 | 2.585212 | |
| M4 | ω_leafy = ω_cor = 0.41948; ω_mac-ng = ω_str = 0.53022 | 27 | -3666.08306 | 0.444942 | |
| M5 | ω_leafy = ω_cor = 0.41857; ω_mac-ng = 0.33352; ω_str = 0.98601 | 28 | -3664.80458 | 2.556974 | |

NOTE: Abbreviations for ω-ratios: leafy = green, leafy outgroups (Cymbidium, Oncidium, Phalaenopsis); cor = all Corallorhiza; mac-ng = nongreen Co. maculata (i.e., Co. mertensiana, Co. maculata maculata, Co. maculata occidentalis); str = Co. striata. Successively nested models were compared based on log-likelihood ratio tests (M1 vs. M0, M2 vs. M1, M3 vs. M2, M4 vs. M0, M5 vs. M4); all comparisons had 1 degree of freedom.

*P < 0.05; **P < 0.01; ***P < 0.001; “–” = not significant (where indicated); otherwise, blank space indicates no significance.
(a) The number of free parameters for a particular model.
(b) Log-likelihood of the data for a particular model.
(c) The chi-square distributed log-likelihood ratio test statistic used to evaluate significant differences in model fit.
(d) P values for uncorrected/Bonferroni corrected χ² tests, where Pcorr = branch parameters in the model being tested.


(including the nongreen coralroots) that still synthesize detectable levels of chlorophyll do so at reduced levels. The fact that all nongreen coralroots contain relatively low chlorophyll concentrations means that these pigments are masked, and thus the overall coloration of these species tends to be red, purple, or brown, most likely due to anthocyanins occurring at higher concentrations (Freudenstein and Doyle 1994a; personal observation). Occasionally, yellow variants occur among populations of the various coralroot species (Freudenstein 1997; Barrett CF, personal observation). These are presumably recurrent anthocyanin-deficient mutants (yellow coloration may be due to persistence of chlorophylls and other yellow/orange carotenoid pigments), though no specific studies have been carried out to demonstrate the causes of this phenomenon in Corallorhiza. Regardless, reduced chlorophyll concentrations and the resulting apparent lack of visible green coloration in Corallorhiza are excellent indicators of relaxed selective constraints on photosynthetic machinery, and this is likely to be the case for many other parasite-containing taxa.

Characteristics of Corallorhiza Plastomes

With perhaps the exception of Co. striata vreelandii, plastomes among coralroot taxa do not altogether display the major size reduction that would be expected for a group of leafless parasites, and thus can be said to occupy the early stages along the path to a highly reduced or “minimal” plastid genome (Barbrook et al. 2006; Krause 2008). In a previous study, it was hypothesized that members of the Co. striata complex would have the most reduced plastomes in the genus in terms of overall size (Barrett and Davis 2012), but even the plastome of Co. striata vreelandii (the representative sequenced in that study and included here) only represents an approximately 6% total genome size reduction relative to the leafy Phalaenopsis (Chang et al. 2006). In fact, Co. macrantha and Co. maculata mexicana (table 2) both have larger plastomes than two of the three leafy outgroup orchid taxa, Phalaenopsis (148,964 bp) and Oncidium (146,484 bp). Thus, the shift to leaflessness per se in Corallorhiza does not appear to be associated with a reduction in plastome size for the genus overall (at least for partially green taxa), although an examination of plastomes for the closest relatives of Corallorhiza (e.g., Oreorchis, Aplectrum, Cremastra, Govenia, Calypso) in tribe Calypsoeae will help clarify this.

A reduction in plastome size due to deletions and mutations resulting in pseudogenes is expected to be associated with a decrease in GC content, as once-functional (and relatively GC-rich) genes become riddled with loss-of-function mutations over time as a result of relaxed photosynthesis-associated selective pressures. These coding regions begin to resemble noncoding DNA, which typically has a lower GC content (e.g., Bernardi 1989; Glémin et al. 2014). This is apparently the case for Co. striata vreelandii (34.02% GC; table 2), which is 1.16% lower than the mean GC content for the remaining coralroot plastomes (35.28%). Mean GC content for the nongreen members of the Co. maculata complex is 35.10% and no different than that found in the green coralroots. This suggests that large-scale differences in plastid GC content may take longer to manifest, and thus likely come about later in the process of plastid modification relative to other processes such as photosynthesis gene deletions and other initial gene mutations causing loss of function. This lends additional support to the notion that Co. striata vreelandii is farther “down the path” of plastid genome degradation relative to the nongreen members of the Co. maculata complex.
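As a minimal sketch of how a plastome-wide GC percentage of the kind compared above might be computed, the following Python function counts G and C among unambiguous bases. The function name `gc_percent` and the toy sequences are illustrative assumptions, not part of the study's pipeline.

```python
def gc_percent(seq: str) -> float:
    """Return GC content (%) of a nucleotide sequence.

    Ambiguity codes (e.g., N) are ignored; only A, C, G, and T are
    counted in the denominator. This is an illustrative helper, not
    the method used by the authors.
    """
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


if __name__ == "__main__":
    # Toy sequences (hypothetical), not real plastome data.
    print(gc_percent("GGCCAATT"))  # half of the bases are G or C
    print(gc_percent("ATGCNN"))    # Ns are excluded from the total
```

Applied to complete plastome sequences, a helper of this kind would yield per-taxon values that could then be compared among green and nongreen taxa.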

More broadly, the leafy orchids Phalaenopsis aphrodite and the hybrid Oncidium Gower Ramsey have GC contents of 35.6% and 37.32%, respectively (Chang et al. 2006; Wu et al. 2010); these values are very similar to what is found in Co. trifida and other green coralroots; thus, green coralroots do not seem to deviate much from other leafy orchids. Members of holo- and hemiparasitic Orobanchaceae (Wicke et al. 2013) represent an interesting clade for comparison with Corallorhiza in terms of GC content. Orobanchaceae range from 38.08% GC in the hemiparasitic Schwalbea to 31.09% in the holoparasitic Phelipanche purpurea. Variation in GC content among plastomes of Corallorhiza lies completely within the bounds of that observed in Orobanchaceae, which is hypothesized to be in the relatively more advanced stages of degradation.

Structural Evolution of the Plastome: Pseudogenes and Losses

ndh Genes

Degradation of the plastid-encoded ndh complex is advanced in Corallorhiza, as has been observed in other groups (e.g., Chang et al. 2006; Funk et al. 2007; McNeal et al. 2007; Braukmann et al. 2009; Blazier et al. 2011; Braukmann and Stefanovic 2012; Braukmann et al. 2013; Iles et al. 2013; Peredo et al. 2013; Wicke et al. 2013), including some of the leafy, photosynthetic orchid plastomes included in figure 3. Interestingly, there is variation among the orchid plastomes sequenced to date with respect to the status of ndh genes (Chang et al. 2006; Wu et al. 2010; Delannoy et al. 2011; Barrett and Davis 2012; Logacheva et al. 2011; Yang et al. 2013), suggesting multiple, independent degradation pathways in this complex in the orchid family.

The ancestor of Corallorhiza likely already experienced some degree of degradation in the ndh complex, as evidenced by several shared pseudogenes. Sequencing of additional orchid plastomes could reveal an evolutionary transition from a functional plastid-encoded ndh complex in leafy relatives (e.g., Aplectrum, Oreorchis, Cremastra, Govenia) to a largely degraded state, as observed in Corallorhiza. However, there is even some variation in the degree of degradation among certain members of ndh within Corallorhiza, which in part reflects the plastid phylogeny (fig. 2). This pattern includes a major deletion in ndhE in all taxa but Co. striata vreelandii; a shared pseudogene for ndhJ for all members of the Co. maculata complex besides Co. bulbosa (which here is sister to the remaining members of the Co. maculata complex; fig. 2); a loss of the same gene in (Co. odontorhiza, Co. wisteriana); and hypothesized parallel deletions of the ndhG pseudogene in Co. bulbosa and the clade of (Co. macrantha, Co. maculata mexicana).

The plastid-encoded ndh complex is believed to be dispensable for a number of reasons (reviewed in Martín and Sabater 2010; Wicke et al. 2011, 2013), which may differ for various groups of plants displaying evidence of degradation in this gene complex. Generally, the ndh complex is believed to be essential for photosynthetic function in environments with highly variable light intensities, by functioning in regulation of cyclic electron transport and also in mitigating the effects of photo-oxidative damage under high light intensities (Martín et al. 2009; Martín and Sabater 2010). In the ancestor of Corallorhiza, selective constraints on the ndh complex may have been lifted by a combination of decreased dependence upon photosynthetic carbon as a result of a shift to heavy dependence upon fungal carbon (a similar situation is also observed in carnivorous plants using animals as a carbon source in the family Lentibulariaceae; Wicke et al. 2014) and a tendency to grow in dark, shaded, mature forests where light intensity is always low. This may also occur in other orchid lineages (including leafy, photosynthetic orchids) due to high dependence on fungal carbon as seedlings, and also the tendency to live in habitats with relatively low or at least static amounts of light intensity (e.g., a shaded rainforest understory or floor).

Photosynthesis-Related Genes (excluding ndh, rpo, and atp)

Nongreen members of Corallorhiza (that is, those with relatively reduced levels of chlorophyll; fig. 1) all display evidence of degradation of genes directly involved in photosynthetic processes (fig. 2; e.g., psa/b, pet, rbcL). Of these taxa, Co. striata vreelandii displays the highest number of degraded genes (table 2; figs. 2 and 3), as reported in Barrett and Davis (2012), followed by northern North American members of the Co. maculata complex (Co. mertensiana, Co. maculata maculata, Co. maculata occidentalis). Both of these complexes (Co. striata, Co. maculata) have independently undergone degradation among genes in the following genes/gene complexes (omitting ndh, see above): pet, psa/b, rpo, ccsA, cemA, and rbcL. It is highly unlikely that transcript editing of any kind can compensate for these changes, and it can be assumed that these changes are irreversible. The possibility exists, however slight, that all deleted genes or pseudogenes among the plastomes of these taxa are redundant and that functional copies are encoded in the nuclear genome, or that genes have been “regained” through intergenomic or horizontal transfer (e.g., Iorizzo et al. 2012; Straub et al. 2013; Wicke et al. 2013).

Based on the degraded states of photosynthesis-related genes among these plastomes, it can be concluded that the aforementioned nongreen taxa are strict parasites of fungi, incapable of photosynthesis, and thus the terms “holomycotroph” or “full mycoheterotroph” are indeed appropriate. No studies of photosynthetic activity have been carried out in Corallorhiza, aside from those in the “greenest” coralroot, Co. trifida (Zimmer et al. 2008; Cameron et al. 2009). It will be beneficial to measure photosynthetic rates (or determine the lack thereof), along with plastid-gene transcript levels, among multiple individuals of both green and nongreen coralroot taxa, preferably growing in close proximity to one another and in similar habitats, to definitively characterize photosynthetic capacity in the genus.

Though green members of Corallorhiza do not display significant degradation of photosynthesis-related genes (psa, psb, pet, rbcL, etc.), there are a few important exceptions, which may or may not affect the efficacy of photosynthesis in these taxa. First, members of Corallorhiza excluding Co. trifida and Co. striata have experienced a large deletion in psbM (PSII protein M). This low molecular weight peptide is believed to play a role in maintaining the stability of PSII dimers; psbM-deficient tobacco mutants displayed impaired stability of PSII but were still able to function (Umate et al. 2007; Kawakami et al. 2011). Photosynthesis has only been explicitly measured and demonstrated in Co. trifida (Zimmer et al. 2008; Cameron et al. 2009), which has a putatively functional copy of psbM. Thus, it is possible that either 1) other “green” coralroots (Co. odontorhiza, Co. wisteriana, Co. bulbosa, Co. macrantha, Co. maculata mexicana) carry out inhibited photosynthesis; 2) this gene has been transferred to the nucleus or mitochondrion and its product is imported into the plastid, allowing stabilization of the PSII dimer complex and some level of photosynthetic function; or 3) a nonfunctional psbM gene has no effect on photosynthetic function.

Another partially green coralroot, Co. macrantha (restricted to high elevation forests in southern Mexico), has a pseudogenized copy of psaI (Photosystem I Reaction Center Subunit VIII) due to a 4-bp insertion near the 5′-end of the gene relative to other coralroots, causing a reading frame shift. The psaI protein is believed to function in the stabilization of a nuclear-encoded subunit of the same complex (psaL) in Arabidopsis thaliana, and has also been shown to bind to the Light-harvesting Complex II (Jensen et al. 2007); it was suggested that the latter putative function is redundant. In Corallorhiza, this might explain why psaI in Co. macrantha has apparently become a pseudogene. Thus, among the green coralroots, Co. macrantha may represent yet another very early step in the transition to the loss of photosynthesis, or at least a reduction in photosynthetic efficacy, but this remains a hypothesis to be tested.
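To illustrate why a 4-bp insertion near the 5′-end of a gene disrupts its reading frame, the following Python sketch translates a short open reading frame before and after such an insertion. The sequences are hypothetical toy examples (not the actual psaI sequence of any coralroot), and the helper names `translate` and `has_intact_orf` are assumptions introduced for illustration only.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate in frame 1; trailing bases that do not fill a codon are dropped."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def has_intact_orf(dna: str) -> bool:
    """True if length is a multiple of 3, starts with M, and has one terminal stop."""
    protein = translate(dna)
    return (
        len(dna) % 3 == 0
        and protein.startswith("M")
        and protein.endswith("*")
        and "*" not in protein[:-1]
    )


# Hypothetical wild-type ORF and a mutant carrying a 4-bp insertion
# immediately after the start codon.
wild_type = "ATGGCTAAATTCGGTTAA"
mutant = wild_type[:3] + "ACTG" + wild_type[3:]

if __name__ == "__main__":
    print(translate(wild_type), has_intact_orf(wild_type))
    print(translate(mutant), has_intact_orf(mutant))
```

Because four is not a multiple of three, every codon downstream of the insertion is shifted, and in this toy case an internal stop codon appears shortly after the insertion site.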

rpo Genes

There is evidence for rpo pseudogenes in both the nongreen Co. striata and Co. maculata complexes, based on large deletions, reading frame shifts, and internal stop codons. Plastid genes are transcribed by two polymerases in monocots: a nuclear-encoded polymerase and a plastid-encoded polymerase (PEP; Liere et al. 2011). The plastid-encoded RNA polymerase is believed to preferentially transcribe photosynthesis-related genes, whereas the rpoB-C1-C2 operon itself is transcribed by a nuclear-encoded RNA polymerase (Hajdukiewicz et al. 1997; reviewed in Liere et al. 2011; Krause 2012). In addition to transcribing photosynthesis-related genes for the most part, PEP also tends to be more active in later leaf developmental stages (Hajdukiewicz et al. 1997; reviewed in Liere et al. 2011).


Based on this latter finding, it might be expected that all members of Corallorhiza should show some degradation in rpo genes, as leaves do not fully develop in any species within this genus (i.e., only the basal leaf sheath persists, but the blade or lamina never develops), presumably obviating the role of PEP in transcription of photosynthesis-related genes (e.g., psa, psb) in leaves. However, green coralroots have intact reading frames for all rpo genes, whereas nongreen coralroots show varying degrees of degradation in the rpo complex. As with photosynthesis-related genes in Corallorhiza (figs. 2 and 3), it does not appear to be the transition to leaflessness per se that has resulted in degradation of the rpo complex, but more likely the shift from partial heterotrophy (and the associated need to carry out photosynthesis, whatever small amount) to strict heterotrophy. An interesting parallel occurs in the carnivorous family Lentibulariaceae, in which rpo genes display significant departures from purifying selection (Wicke et al. 2014). An interesting follow-up would be to determine whether degradation first begins among photosystem genes (psa, psb), diminishing the importance of having a functional PEP complex, or whether the opposite is the case, in which a dysfunctional (or nonexistent) PEP has contributed to relaxed selective pressures associated with photosynthesis-related genes (i.e., if psa and psb genes can no longer be transcribed, or are transcribed by an inefficient PEP, then is there any benefit to maintaining them?). Regardless, the degradation of rpo genes in nongreen Corallorhiza closely mirrors that in genes directly involved in photosynthesis (psa/b, rbcL, etc.).

Relationships in Corallorhiza

Complete plastome sequences of Corallorhiza yielded a highly resolved, highly supported tree consistent with previous studies (Freudenstein and Doyle 1994a, 1994b; Barrett and Freudenstein 2008; Freudenstein and Senyo 2008). Here, for the first time, relationships among the six taxa that comprise the Co. maculata complex (as recognized in Freudenstein 1997) are fully resolved, with high support for Co. bulbosa as sister to the rest of the complex (bootstrap = 91), whereas previous studies could not resolve relationships in this complex (Barrett and Freudenstein 2008; Freudenstein and Senyo 2008). Complete plastomes also confirm the sister relationship of Co. striata to the rest of Corallorhiza, with Co. trifida (moving away from the base of the tree), then Co. odontorhiza and Co. wisteriana diverging successively. Depending on the data set analyzed, Co. bulbosa appears in a few different places, most likely due to homoplasy and few variable characters available to resolve the position of this species in smaller data sets, as evidenced by short internal branches (supplementary fig. S3, Supplementary Material online). Thus, inclusion of complete plastomes (pseudogenes, spacers, and introns) resolves the otherwise uncertain position of this species. Nuclear rDNA, on the other hand, places Co. bulbosa as sister to (Co. macrantha, Co. maculata mexicana) with moderate support, a similar finding to the combined analysis of Barrett and Freudenstein (2008) based on plastid rbcL + nuclear ITS, though that study did not include Co. macrantha. Combining all nuclear and plastid data places Co. bulbosa as sister to the remaining members of the Co. maculata complex with 100% bootstrap support (supplementary fig. S3, Supplementary Material online). The complete plastome + rDNA data set fully resolves relationships among the major species complexes of Corallorhiza (excluding members not sampled from some complexes).

In particular, relationships within the Co. maculata complex shed light on a phylogeographically intriguing aspect of plastome degradation. All Mexican members of the Co. maculata complex have at least some visible green tissue and are presumed to be partially mycoheterotrophic based on conservation of plastid-encoded photosynthetic apparatus (for the most part, see fig. 2). Those members of the Co. maculata complex that occur in northern North America have no visible green tissue (two of these species were shown to have highly reduced chlorophyll levels; fig. 1) and display evidence of degradation in these genes. The topology of the Co. maculata complex based on combined plastome + rDNA data (supplementary fig. S3, Supplementary Material online; fig. 2) lends support to a previous hypothesis that this complex (and possibly the genus Corallorhiza itself) originated in southern Mexico (Freudenstein and Doyle 1994a, 1994b; Barrett and Freudenstein 2008; Freudenstein and Senyo 2008) and expanded into northern North America, with the ancestor of Co. mertensiana, Co. maculata maculata, and Co. maculata occidentalis undergoing a shift from partial to strict heterotrophy (and continuing this trend among extant lineages), and thus presumably losing the ability to carry out photosynthesis. This suggests that deeper sampling among individuals in members of both the Co. maculata complex and the Co. striata complex (which is here represented only by Co. striata vreelandii) will yield a much clearer picture of plastome degradation in Corallorhiza on a phylogeographic scale.

Selective Regime among Genes with Open Reading Frames

Model M4, allowing green and nongreen taxa to have different ω-ratios (ω_leafy = ω_cor; ω_mac-ng = ω_str), fit the data significantly better than the one-ratio model for both housekeeping genes and photosynthesis-related genes, suggesting different selective scenarios for nongreen taxa relative to green taxa, but these comparisons are nonsignificant when corrected for multiple comparisons (table 3). For housekeeping genes, ω-values for green versus nongreen taxa did not differ by much (ω_leafy = ω_cor = 0.30267; ω_mac-ng = ω_str = 0.34230; table 3). However, this difference is much more pronounced for photosynthesis-related genes (ω_leafy = ω_cor = 0.27361; ω_mac-ng = ω_str = 0.73720), with the ω estimate for nongreen taxa being approximately 2.6 times that of green taxa, and approaching a value expected under neutral evolution (ω ~ 1). Thus, it is evident that even though some photosynthesis genes in nongreen taxa still have intact reading frames, there may be a relaxation of selective pressure associated with these genes.
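The nested-model comparisons discussed here rest on likelihood ratio tests with 1 degree of freedom. A minimal Python sketch of that calculation follows, using only the standard library. The log-likelihoods are read from table 3 (M0 vs. M4), with decimal placement inferred from the reported test statistics; the Bonferroni factor of 2 (the number of branch ω parameters in M4) is an assumption about how the correction was applied, and the function names are illustrative.

```python
import math


def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Likelihood ratio test statistic: twice the log-likelihood difference."""
    return max(0.0, 2.0 * (lnl_alt - lnl_null))


def chi2_pvalue_1df(stat: float) -> float:
    """Upper-tail P value of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def bonferroni(p: float, n_tests: int) -> float:
    """Bonferroni-corrected P value, capped at 1."""
    return min(1.0, p * n_tests)


# Log-likelihoods as read from table 3 (decimal placement inferred).
LNL = {
    "photosynthesis": {"M0": -1854.62551, "M4": -1852.39825},
    "atp": {"M0": -9802.76868, "M4": -9797.32610},
}
N_BRANCH_PARAMS_M4 = 2  # assumption: correction factor for M4

if __name__ == "__main__":
    for genes, lnl in LNL.items():
        stat = lrt_statistic(lnl["M0"], lnl["M4"])
        p = chi2_pvalue_1df(stat)
        print(genes, round(stat, 5), p, bonferroni(p, N_BRANCH_PARAMS_M4))
```

Under these assumptions, the photosynthesis-gene comparison falls below 0.05 only before correction, whereas the atp comparison remains below 0.05 after correction, in line with the pattern described in the text.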

For atp genes, Models M1 and M4 fit the data significantly better than does M0, even after correcting for multiple comparisons, suggesting different selective regimes in Corallorhiza versus leafy, green outgroups, or at least in nongreen Corallorhiza versus green Corallorhiza and leafy outgroups. For example, in model M1, the ω value for Corallorhiza is nearly twice that for leafy orchids (ω_leafy = 0.15175; ω_cor = ω_mac-ng = ω_str = 0.26949), and in model M4, nongreen Corallorhiza have ω values over twice those for green Corallorhiza and leafy orchids. Further investigations using branch-site models, allowing some sites in the “foreground” branches to be under positive selection, suggest evidence for positive selection in nongreen Corallorhiza for the atp complex.

The plastid-encoded atp genes play a critical role in photosynthesis by both generating ATP and translocating protons across the thylakoid membrane (Evron et al. 2000; McCarty et al. 2000; Allen et al. 2011). However, in taxa that display evidence of loss-of-function for photosystem and other photosynthesis-related genes, the fact that atp genes are preserved as intact reading frames suggests that they may play additional roles in plastid function (discussed in Wicke et al. 2013). This is bolstered by the fact that this complex shows evidence of positive selection in strictly heterotrophic, nongreen taxa, but not for Corallorhiza as a whole. This would support the hypothesis that the atp complex has either taken on additional, as-of-yet unknown roles in holomycotrophic plastids, or at a minimum has fallen under increased selective pressure to adapt to the overall “alternative lifestyle” associated with strict fungal heterotrophy in nonphotosynthetic plastids (including in amyloplasts, chromoplasts, elaioplasts, proteinoplasts, etc.). A future research objective is to examine the potential biological roles and molecular evolutionary patterns of the atp complex in nonphotosynthetic plants and their green, photosynthetic relatives, as members of this gene complex are often preserved and putatively functional among various heterotrophic angiosperms (see fig. 3 and references therein, including discussion in Wicke et al. 2013).

Comparison of Corallorhiza with Other Parasitic Angiosperms Using the Model of Barrett and Davis (2012)

A comparison of the plastomes of several heterotrophic taxa that have been sequenced to date shows that Corallorhiza is not quite in the “advanced” stages of plastid genome degradation relative to several taxa in Orobanchaceae (e.g., Wicke et al. 2013) or other orchids such as Neottia (Logacheva et al. 2012) and Rhizanthella (Delannoy et al. 2011) (fig. 3). Interpreted in the context of the basic model presented in Barrett and Davis (2012), members of Corallorhiza fall into categories 1 and 3. Taxa with some visible green tissue display pseudogenes/deletions mainly in the ndh complex (category 1; gene-class categories are listed at the top of fig. 3), with a few additional changes, whereas those with no green tissue have experienced pseudogenization and deletions in ndh, photosynthesis-related, and rpo genes, but not in atp genes or other “housekeeping” genes (category 3, excepting the loss of a redundant trnT-GGU in Co. striata vreelandii; Barrett and Davis 2012; figs. 2 and 3). Thus, nonparasitic autotrophs and partial heterotrophs roughly fall either into category 0 (no degradation of plastid genes) or category 1 (degradation of the ndh complex only). More specifically, all partial heterotrophs fall into category 1, with evidence of degradation of the ndh complex.

Strict heterotrophs fall into more “advanced” categories. Several, but not all, members of Orobanchaceae and two orchids (Neottia and Rhizanthella) can be classified in category 5, in which there is evidence of degradation of atp genes and at least some evidence of degradation in “housekeeping” genes (e.g., rpl, rps, trn, etc.). There are some genes for which no taxa have experienced pseudogenization or loss, including rpl2, rpl16, rpl20, rps2, rps4, rps8, rps11, ycf1, ycf2, trnD-guc, trnE-uuc, trnI-cau, trnM-cau, trnQ-uug, trnY-gua, rrn16, rrn23, rrn4.5, and rrn5. This suggests that they may be indispensable for plastid function, at least in the angiosperms included in this comparison, and suggests that a “minimal set” of genes might be necessary for functional plastid “housekeeping,” and perhaps other functions outside of photosynthesis and ATP production.

Overall, no taxa fell in categories 2 or 4, suggesting that degradation of the rpo genes occurs more or less simultaneously with degradation in photosynthesis-related genes, and that degradation of atp genes occurs along with that in “housekeeping” genes (also discussed in Wicke et al. 2013). Thus, the model of Barrett and Davis (2012) could be modified in light of more recently sequenced plastomes, including those of Corallorhiza, to have five general categories: 1) no degradation; 2) degradation in the ndh complex only; 3) degradation of the ndh complex and photosynthesis-related genes, including rpo genes; 4) degradation in all plastid gene systems, including “housekeeping” and atp genes; and 5) complete or nearly complete loss of the plastid genome (as is hypothesized to have occurred in some strict heterotrophs such as Corynaea and Rafflesia [Nickrent et al. 1997; Molina et al. 2014]).
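The five categories of the modified model can be expressed as a simple decision rule. The Python sketch below is one possible encoding, assuming that each plastome is summarized by the set of gene classes showing pseudogenes or deletions; the class labels ("ndh", "photosynthesis", "rpo", "atp", "housekeeping") and the function name are illustrative assumptions rather than definitions from the model itself.

```python
def degradation_category(degraded_classes, plastome_present=True):
    """Assign a plastome to one of the five categories of the modified model.

    degraded_classes: set of gene-class labels with pseudogenes or deletions.
    plastome_present: False if the plastid genome is (nearly) completely lost.
    Taxa that deviate from the stepwise model are assigned to the most
    advanced category they reach; this is a simplifying assumption.
    """
    degraded = set(degraded_classes)
    if not plastome_present:
        return 5  # complete or nearly complete loss of the plastid genome
    if degraded & {"atp", "housekeeping"}:
        return 4  # degradation in all gene systems, incl. housekeeping and atp
    if degraded & {"photosynthesis", "rpo"}:
        return 3  # ndh plus photosynthesis-related genes, including rpo
    if "ndh" in degraded:
        return 2  # degradation in the ndh complex only
    return 1      # no degradation


if __name__ == "__main__":
    print(degradation_category(set()))                             # leafy autotroph
    print(degradation_category({"ndh"}))                           # green coralroot
    print(degradation_category({"ndh", "photosynthesis", "rpo"}))  # nongreen coralroot
```

A rule of this kind treats degradation as strictly stepwise, so lineages with idiosyncratic patterns would need to be flagged separately rather than forced into a single category.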

There is some degree of deviation from the Barrett and Davis (2012) model, as might be expected given the potential for somewhat idiosyncratic patterns of plastome evolution in unrelated groups. The obvious exception to this model would be the sequenced members of the genus Cuscuta in the family Convolvulaceae. One of the best-studied parasitic systems in terms of plastome evolution, Cuscuta contains members with “extreme” plastome reduction relative to closely related leafy species (approximately 50% total size reduction in Cuscuta gronovii and Cuscuta obtusiflora relative to the leafy Ipomoea) (fig. 3 and references therein). Yet both highly reduced species of Cuscuta retain most, if not all, of the genes for photosynthesis, but have degraded RNA polymerase complexes and also show evidence of pseudogenes/deletions of some housekeeping genes. A further survey based on probe-hybridization in Cuscuta with greatly increased taxon sampling by Braukmann et al. (2013) suggests that this genus encapsulates multiple transitions to highly degraded plastid genomes, with members of subgenus Grammica displaying even more advanced stages of degradation. These taxa were not included in figure 3, as they are based on hybridization and have not yet been sequenced to generate complete plastomes. Recycling of respiratory CO2 has been offered as an explanation for the retention of putatively functional genes in the psa, psb, and pet complexes (McNeal et al. 2007), underscoring the importance of studying the physiology of strict heterotrophs on a taxon-by-taxon basis, and also highlighting how little is known about the basic physiology and functional genomics of these plants.

Conclusion

All members of Corallorhiza produce detectable levels of chlorophyll based on the current sample, but nongreen, strictly heterotrophic members have, on average, tenfold lower concentrations relative to green, partially heterotrophic members. Genomic data allowed assembly of complete plastomes as well as partial rDNA operons, which, when combined, fully resolve relationships among the major species complexes in Corallorhiza. Based on complete plastomes, Corallorhiza can be said to span the “early” stages of plastid genome degradation as a result of relaxed selective pressures on photosynthetic function. Corallorhiza is useful in providing a model system to understand the earliest stages in the transition to a “minimal plastome,” as are observed in other heterotrophic plants (e.g., Orobanchaceae, other orchids like Rhizanthella). There have been at least two independent transitions to strict heterotrophy from partial heterotrophy in the genus (Co. maculata and Co. striata complexes). Nongreen coralroots also display evidence of relaxed selective pressures on photosynthesis-related genes, which have retained open reading frames. Members of the atp complex display relaxed purifying selection among Corallorhiza as a whole, but more specifically display evidence for positive selection for nongreen, strictly heterotrophic taxa, although for reasons currently unknown.

Generally, there is a clear path of plastid genome degradation in the heterotrophic angiosperm lineages studied thus far (with some deviations from the overall model, e.g., some Cuscuta). Moreover, it can be hypothesized that there is a stepwise progression to a “minimal” plastome, and Corallorhiza occupies categories 2 (partially heterotrophic taxa, which display degradation of the ndh complex only and a few other pseudogenes) and 3 (strictly heterotrophic taxa, including degradation of the ndh complex, photosynthesis-related genes, and rpo genes [excluding atp genes]) in a modified model of the degree of plastome degradation among angiosperms. Future studies will include 1) deeper sampling of complete plastomes within species complexes of Corallorhiza to gain a phylogeographic perspective on plastome degradation; 2) direct assessment of the relative photosynthetic capabilities of members of each coralroot taxon in the field; and 3) quantification of transcript levels for plastid-encoded genes within each species to gain a clearer understanding of gene expression, and to test the hypothesis of the existence of duplicate, functional copies of pseudogenized or deleted plastid genes.

Materials and Methods

Plant Material, DNA Isolation, Library Preparation, and PE Sequencing

Plant material was collected and preserved either in silica gel or frozen at −80 °C. Total genomic DNAs were extracted using the cetyltrimethylammonium bromide (CTAB) method (Doyle JJ and Doyle JL 1987) from 10 g frozen or 0.2–0.5 g silica-dried material, and quantified using a NanoDrop 2000 spectrophotometer (Thermo Scientific, Waltham, MA). Samples with total DNA concentrations > 100 ng/µl were selected and run on a 1.5% agarose gel to check for high molecular weight, nondegraded DNA. Samples were shipped on ice to Cold Spring Harbor Laboratory (CSHL, Woodbury, NY), where they were sheared to approximately 450 bp, followed by library preparation and barcoding for PE Illumina sequencing. The exception was Co. odontorhiza, for which library preparation and sequencing were completed at the University of Missouri–Columbia using the Illumina TruSeq Library Preparation Kit (Illumina Inc., San Diego, CA). Co. odontorhiza DNAs were sheared to approximately 320 bp. All samples were run on Illumina HiSeq 2000 machines to generate 100-bp PE reads. Indexed samples were pooled and run with accessions from other projects, with a total of 40 or 20 accessions in a single lane on the CSHL and University of Missouri machines, respectively. Voucher information and characteristics of each of the resulting data sets and completed plastomes are listed in table 1.

Quantification of Chlorophyll Content

Approximately 0.1 g of −80 °C frozen tissue was harvested from the outer wall of the developing ovary and weighed for chlorophyll quantification. Ovary tissue was chosen 1) because this is the only organ that is visibly green in some taxa (e.g., Co. odontorhiza, Co. wisteriana) and 2) to be consistent between taxa. Chlorophylls were extracted in 80% ice-cold acetone and stored at −80 °C; chlorophylls a and b were quantified on a Cary 4000 UV-VIS spectrophotometer (Agilent Technologies Inc., Santa Clara, CA; Department of Plant Cellular & Molecular Biology, Ohio State University), and their concentrations were calculated using formulas of Porra (2002) (Lima D, personal communication). Data from chlorophylls a and b were pooled to give total chlorophylls. Although individual accessions used to quantify chlorophyll concentration among species were not the same as those used for sequencing, multiple individuals from across each species’ geographic range were included to account for variation in chlorophyll content among individuals within species. Corallorhiza bentleyi (an endangered member of the Co. striata complex) is included here but was not sequenced. Conversely, Mexican-endemic members of the Co. maculata complex (Co. macrantha, Co. bulbosa, Co. maculata var. mexicana) were not included in the chlorophyll analysis due to rarity and thus lack of fresh/frozen material. Voucher specimens for plants used for chlorophyll analysis are listed in supplementary figure S5, Supplementary Material online.

Data Processing

The resulting paired FASTQ data were processed using a custom UNIX script (“process_fastq.sh”) that calls several freely available Perl scripts for the purposes of 1) quality filtering/trimming and adaptor removal (“IlluQC_PRLL.pl,” “TrimmingReads.pl”; Patel and Jain 2012); 2) “shuffling” PE files into a single file (“ShuffleSequences_fastq.pl,” part of the VELVET suite of tools [E. Cabot, University of Wisconsin-Madison; Zerbino and Birney 2008]); and 3) conversion to FASTA format (“fastq2fasta.pl,” B. Knaus 2009 [http://brianknaus.com/software/srtoolbox, last accessed May 1, 2014]). Reads were trimmed from their 3′-ends, discarding bases under a quality threshold of PHRED score of 20 (corresponding to an error rate of 1/100 or greater), and also discarding trimmed reads below 61 bp in length.
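The trimming rule stated above (remove 3′ bases below PHRED 20, then discard reads shorter than 61 bp) can be sketched in a few lines of Python. This is a minimal illustration, not the Perl scripts cited in the text: the function names are hypothetical, Phred+33 quality encoding is assumed, and simple trailing-base trimming is assumed, which may differ from the behavior of the published tools.

```python
def phred_to_error(q):
    """Convert a PHRED score to an error probability (Q20 -> 0.01)."""
    return 10 ** (-q / 10)


def decode_quals(qual_string, offset=33):
    """Decode a FASTQ quality string; Phred+33 encoding is an assumption."""
    return [ord(ch) - offset for ch in qual_string]


def trim_3prime(seq, quals, min_q=20):
    """Remove bases from the 3' end while their quality is below min_q."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end], quals[:end]


def keep_read(seq, min_len=61):
    """Apply the minimum-length filter to an already trimmed read."""
    return len(seq) >= min_len


if __name__ == "__main__":
    seq, quals = trim_3prime("ACGTACGT", [30, 30, 30, 30, 30, 10, 25, 5])
    print(seq, keep_read(seq))
```

Note that only the trailing low-quality run is removed; an internal low-quality base (the sixth base in the usage example) is retained under this assumed rule.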

Read Mapping

PE reads were then mapped five times iteratively in Geneious v6 (created by Biomatters Inc., Auckland, New Zealand; http://www.geneious.com, last accessed December 1, 2013), using automatically determined settings, to a combined file of 1) the Phalaenopsis aphrodite plastome (GenBank accession number NC_007499; Chang et al. 2006); 2) the Phoenix dactylifera mitochondrial genome (NC_016740; Fang et al. 2012); and 3) the Asclepias syriaca nuclear ribosomal DNA consensus operon (JF312046; Straub et al. 2011). The purpose of read mapping was to filter or “screen” reads that matched the reference genomes and to get initial, crude estimates of coverage to be used in downstream de novo assemblies.

Plastome Assembly

Mean coverage estimates from Geneious were used as input for de novo plastome assembly (“-exp_cov” parameter). De novo assembly was completed using both the VELVET v1.2.10 assembler (Zerbino and Birney 2008) and the Geneious v6 de novo assembler. VELVET assemblies were generated using a variety of hash lengths (i.e., “word match” or “kmer length”) ranging from 51 to 81 for each complete data set, with specified expected coverage estimates taken from Geneious mapping (specified to be in approximate kmer coverage; see Zerbino and Birney 2008). Coverage cutoff values were input as roughly one-half of the specified expected coverage estimates, minimum contig length was set to 300 bp, and insert size was set to 450 bp (320 bp for Co. odontorhiza), and the resulting contigs were assembled in Sequencher v6 (GeneCodes, Ann Arbor, MI). The resulting “draft de novo assembly” from VELVET was used as a reference to map reads in Geneious as described above (five iterations, automatically determined settings). Read mappings were visually scanned in Geneious to check for areas of low coverage or potential misassemblies. Read pairs that mapped successfully to the draft reference were used in a second round of de novo assembly, this time in Geneious with automatically determined settings, saving 100 consensus contigs. This method allowed construction of complete draft plastomes with relative ease.
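Because VELVET expects coverage in k-mer rather than nucleotide units, the mapped coverage must be converted before use. The Velvet documentation gives the relation C_k = C (L − k + 1)/L, where C is nucleotide coverage, L is read length, and k is the hash length. The Python sketch below applies that relation together with the one-half cutoff described above; the function names and the example coverage of 200× are hypothetical, and a single read length is assumed even though trimmed reads vary from 61 to 100 bp.

```python
def kmer_coverage(nucleotide_coverage, read_length, k):
    """Convert nucleotide coverage C to k-mer coverage C_k = C*(L-k+1)/L."""
    if not 0 < k <= read_length:
        raise ValueError("k must be between 1 and the read length")
    return nucleotide_coverage * (read_length - k + 1) / read_length


def coverage_settings(nucleotide_coverage, read_length, k):
    """Expected coverage and a cutoff of one-half that value, as in the text."""
    expected = kmer_coverage(nucleotide_coverage, read_length, k)
    return {"exp_cov": expected, "cov_cutoff": expected / 2}


if __name__ == "__main__":
    # Hypothetical example: 200x mapped coverage, 100-bp reads.
    for k in (51, 61, 71, 81):
        print(k, coverage_settings(200, 100, k))
```

The loop over k mirrors the range of hash lengths (51 to 81) reported above; longer k-mers yield lower k-mer coverage for the same data.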

Contigs from both assemblies (Geneious, Velvet) were merged in Sequencher to form complete plastomes (parameters = “dirty data” or “large gap,” minimum overlap = 20 bp, minimum similarity 70–90%). Any apparent “gaps” among contigs resulting from one software platform were usually covered by a contig from the other, allowing assembly of completed plastomes (Large Single Copy–Inverted Repeat B–Small Single Copy). In a few cases, VELVET scaffolds contained strings of Ns between 30 and 100 bp, mostly associated with homopolymer repeats (> 10 bp). These were resolved by 1) the Geneious de novo assembly (typically) and 2) UNIX “grep” searches, with flanking sequences of 30–50 bp, taken from each end of the string of Ns, used as search terms. Grep results were converted to FASTA format using a custom Perl script to build a “bridge” of reads. This was done iteratively until the region in question was resolved, although most cases were resolved with a single iteration. Grep searches were applied to all cases of ambiguity between contigs from the two de novo assembly methods as a quality control. Polymerase chain reaction amplification and Sanger sequencing were not necessary due to high read coverage (table 1).
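The grep-based “bridging” step can be illustrated with a short Python sketch. It is a stand-in for the UNIX grep search and custom Perl script mentioned above, not a reproduction of them: the function names are hypothetical, exact string matching is assumed, and reads matching the reverse complement of the flank are reoriented, which is an added convenience not described in the text.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def reads_matching(reads, flank):
    """Return reads containing the flank, oriented to match the flank."""
    flank_rc = revcomp(flank)
    hits = []
    for read in reads:
        if flank in read:
            hits.append(read)
        elif flank_rc in read:
            hits.append(revcomp(read))
    return hits


def to_fasta(sequences, prefix="bridge_read"):
    """Format matched reads as FASTA records for manual alignment."""
    return "".join(
        f">{prefix}_{i}\n{seq}\n" for i, seq in enumerate(sequences, start=1)
    )


if __name__ == "__main__":
    reads = ["TTACCGTAGG", revcomp("GGACCGTACC"), "GGGGGGGG"]
    print(to_fasta(reads_matching(reads, "ACCGTA")), end="")
```

In practice the flank would be the 30–50 bp adjoining a string of Ns, and the search would be repeated with the newly extended sequence until the gap is spanned.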

IR boundaries for each draft plastome were determined by using the first and last approximately 20 bp of the draft plastome as search terms (reverse complement) in Sequencher to find matches, and marking their match points as features. Then, the first approximately 700 bp and last approximately 1–2,000 bp of the draft plastome were copied, reverse complemented, and manually aligned to the IR using the marked features from above as reference points. IR boundaries were determined informatically using PE information (i.e., read pairs that spanned the IR boundaries). Because the orientation of the ends of the LSC and SSC can vary, the plastomes were often in different configurations; single copy regions were reoriented to correspond to the “standard” model of gene order to simplify downstream comparison. Special attention was paid to the possibility of significant modification or even loss of the IR (either partial or complete), and also to the possibility of other genomic rearrangements or structural modifications. In other words, when building draft plastomes, it was not assumed that all accessions displayed the “standard” plastome structure found in angiosperms (i.e., LSC–IRb–SSC–IRa).

Plastome Annotation

Plastid genomes were annotated by first aligning the set of coding loci (coding sequences [CDS]) containing introns and all tRNAs from Phalaenopsis to each draft plastome (minus IRa) in Sequencher to aid in intron/exon and tRNA boundary determination. Annotations were conducted in DOGMA (Wyman et al. 2004), which uses BLASTX searches against a database of annotated plastomes and amino acid translations and allows manual determination of start/stop codons. DOGMA BLASTX parameters were as follows: genetic code = plant plastid, e value = 5, percent cutoff for protein coding genes = 55, and percent cutoff for tRNAs = 80. The resulting GenBank feature table was exported from the DOGMA server and imported, along with the finished plastid FASTA file, into SEQUIN (National Center for Biotechnology Information), where the annotation was validated and completed for submission to GenBank. A large inversion in Co. maculata var. maculata was manually reversed for the purposes of phylogenetic analyses and other whole-plastome comparisons, but the plastome for that accession was annotated with the inversion as assembled de novo. Plastome maps were drawn with GenomeVx (Conant and Wolfe 2008) and manually edited in Adobe Illustrator CS6 (Adobe Inc., San Jose, CA).

Ribosomal DNA Assembly and Annotation

Nuclear ribosomal DNA reads were filtered by mapping reads from Co. odontorhiza to a nearly complete rDNA operon reference (partial ETS [External Transcribed Spacer], 18S rDNA, ITS1 [Internal Transcribed Spacer], 5.8S rDNA, ITS2, 26S rDNA) from A. syriaca, described above, as well as mapping to an ITS-spanning sequence from Co. odontorhiza from GenBank (accession number EU391326). Other regions were not assembled (e.g., 5S rDNA). Resulting reads that mapped to each reference were assembled de novo in Geneious, and resulting contigs were merged through overlap in Sequencher. This draft assembly was used for a second round of read mapping (original Co. odontorhiza reads) and de novo assembly in Geneious, resulting in a single consensus contig. The consensus contig was based on 50% majority per site, bearing in mind that the rDNA operon exists in numerous repeated copies that often differ from one another, appearing as polymorphism in read pileups. The Co. odontorhiza rDNA consensus operon was then used as a reference in read mapping for all other Corallorhiza species, followed by individual de novo assemblies as above.

Nearly complete ribosomal DNA operons (partial ETS, 18S rDNA, ITS1, 5.8S, ITS2, 26S rDNA; excluding 5S rDNA) were annotated by first aligning finished contigs in Geneious (using the MUSCLE v3.8.31 plugin in Geneious; Edgar 2004) and transferring the annotation features from Co. odontorhiza to the aligned Corallorhiza sequences. Then, the entire alignment was annotated in SEQUIN for submission to GenBank by adding each feature to Co. odontorhiza and subsequently propagating the features to the remaining sequences. Reads were again mapped in Geneious to each finished, annotated rDNA cistron to check for misassemblies. GenBank accession numbers for rDNA partial operons are KM390003–KM390012. Available outgroup sequences for Cymbidium, Oncidium, and Phalaenopsis were mined from GenBank for the purposes of downstream phylogenetic analyses (18S rDNA: Cy. farberi, GenBank accession number JN418934; ITS: Cy. aloifolium, JF729014; Phalaenopsis aphrodite, AY391543; O. andradeanum, FJ565598).

Data Matrix Assembly and Sequence Alignment

Annotated plastomes were imported into Geneious, with subsequent removal of the second copy of the IR (IRa), and aligned with MAFFT v7.157 (Katoh et al. 2002) using the “fast-progressive” method (gap opening penalty = 6, offset = 0.1), followed by manual adjustment. The alignment also included complete plastomes of Cy. aloifolium (GenBank accession number NC_021429), Oncidium Gower Ramsey (GenBank accession number NC_014056), and Phalaenopsis aphrodite (GenBank accession number NC_017049). A short inversion of the petN-psbM region in Cymbidium was removed for that accession in downstream analyses. Annotated CDS were exported from Geneious, and data from each locus were saved as separate FASTA files. Each file was submitted for codon-based alignment to the MACSE web server (Ranwez et al. 2011; http://mbb.univ-montp2.fr/MBB/subsection/softExec.php?soft=macse, last accessed May 30, 2014), which allows alignment of both intact open reading frames and incomplete sequences (including pseudogenes with modified reading frames). Alignments were adjusted manually when necessary and imported back into Geneious or SequenceMatrix (Vaidya et al. 2011), where they were concatenated into larger matrices. Protein-coding genes that contained pseudogenes (shifted reading frames, internal stop codons) in at least one taxon were excluded. Data matrices are deposited in supplementary fig. S6, Supplementary Material online.

Phylogenetic Analyses

Phylogenetic analyses were conducted using both Parsimony and ML. Parsimony searches were conducted in TNT (Goloboff et al. 2003, 2008) with 100 random addition starting sequences, saving 100 trees per replicate, and using tree bisection–reconnection branch swapping on the pool of saved trees. Support was assessed through 2,000 Jackknife pseudoreplicates with the above search parameters. Model fit for ML analyses was assessed for the complete plastome alignment in MEGA v6 (Tamura et al. 2013), which yielded GTR+G as the best-fit model based on the corrected Akaike Information Criterion (AICc) metric (Burnham and Anderson 2002). All ML analyses were conducted in RAxML v8.0.20 (Stamatakis 2006). Three separate searches were initiated for each matrix analyzed, from different random starting seeds, using an unpartitioned GTR+G model to ensure convergence on the same topology. A codon-partitioned model was not chosen due to the possibility of relaxed functional constraints among some coding regions for some taxa, which could possibly represent model violations. Support was assessed with 1,000 standard bootstrap pseudoreplicates. Cymbidium was chosen as the outgroup taxon.

Plastid Pseudogenes, Genomic Deletions, and Genomic Attributes

The presence of pseudogenes and whole gene/regional deletions was initially assessed visually in DOGMA by identifying interrupted BLASTX reading frames. All coding loci aligned in MACSE (see above) were imported into Geneious, and sequences were translated to assess pseudogene status through the presence of frame shifts and/or premature stop codons. Loci with interrupted reading frames were considered to be pseudogenes unless there was an alternative start or stop codon within three codon positions of the other taxa in the alignment. In addition, the alignment of whole annotated plastomes (described above) was used to visualize any larger genomic deletions spanning CDS boundaries, or rearrangements.
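A crude first-pass screen for the two signals named above (frame shifts and premature stop codons) might look like the following Python sketch. It is illustrative only and is not the Geneious-based procedure used in the study: the function name is hypothetical, a length not divisible by three is used as an assumed proxy for a frame shift, and the exception for alternative start or stop codons within three codon positions is not implemented.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def assess_cds(cds):
    """Flag a coding sequence as a putative pseudogene.

    Alignment gaps ("-") are removed first. The final complete codon is
    treated as the expected terminal stop and is not counted as internal.
    """
    cds = cds.upper().replace("-", "")
    n_codons = len(cds) // 3
    codons = [cds[3 * i:3 * i + 3] for i in range(n_codons)]
    internal_stops = [
        i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS
    ]
    in_frame = len(cds) % 3 == 0
    return {
        "length_multiple_of_3": in_frame,
        "internal_stop_codons": internal_stops,
        "putative_pseudogene": bool(internal_stops) or not in_frame,
    }


if __name__ == "__main__":
    print(assess_cds("ATGAAATAA"))      # intact toy reading frame
    print(assess_cds("ATGTAAAAATAA"))   # premature stop at codon index 1
```

Any locus flagged this way would still need inspection in the codon alignment, because an indel that is a multiple of three, or an alternative stop codon near the expected position, would be misclassified by so simple a rule.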

Hypotheses of differences in plastid genomic attributes among nongreen versus green taxa, and correlations between chlorophyll content and genomic attributes, were assessed using two phylogeny-based analytical approaches. First, a Phylogenetic Analysis of Variance (Garland et al. 1993) was used to test for differences in continuous plastome variables (e.g., length, GC content) among green versus nongreen taxa (i.e., a discrete grouping variable), taking phylogenetic relationships into account. Analyses were run in the R package “geiger” (Harmon et al. 2008) with 10,000 simulations each, using the “aov.phylo” function, based on an ultrametric tree generated through Nonparametric Rate Smoothing (Sanderson 1997). These analyses were run for a complete data set of 13 taxa (three leafy, green outgroup taxa and ten species of Corallorhiza). A second set of analyses was conducted on a subset of six taxa of Corallorhiza for which chlorophyll data were available, to mitigate problems with missing chlorophyll data. In these analyses, continuous genomic variables (length, number of putatively functional genes, GC content) were correlated with chlorophyll content (continuous) using Phylogenetically Independent Contrasts to account for phylogenetic relatedness among taxa (Felsenstein 1985). The latter analyses were conducted under a Brownian motion model with the PDAP package (Midford et al. 2005) in Mesquite v2.75 (Maddison WP and Maddison DR 2011).
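For readers unfamiliar with independent contrasts, the following Python sketch shows the core of Felsenstein's (1985) algorithm under Brownian motion on a fully bifurcating tree. It is an independent illustration, not the PDAP implementation used in the study: the nested-tuple tree encoding, the function names, and the toy trait values are all hypothetical. Each node is written as `(label, branch_length)`, where `label` is a tip name or a pair of child nodes.

```python
import math


def independent_contrasts(node, values):
    """Return (node value, adjusted branch length, standardized contrasts)."""
    label, branch_length = node
    if isinstance(label, str):                     # tip
        return values[label], branch_length, []
    left, right = label
    x_l, v_l, c_l = independent_contrasts(left, values)
    x_r, v_r, c_r = independent_contrasts(right, values)
    contrast = (x_l - x_r) / math.sqrt(v_l + v_r)  # standardized contrast
    x = (x_l / v_l + x_r / v_r) / (1 / v_l + 1 / v_r)
    v = branch_length + v_l * v_r / (v_l + v_r)    # lengthen the stem branch
    return x, v, c_l + c_r + [contrast]


def contrast_correlation(c1, c2):
    """Correlation between two sets of contrasts, computed through the origin."""
    numerator = sum(a * b for a, b in zip(c1, c2))
    return numerator / math.sqrt(sum(a * a for a in c1) * sum(b * b for b in c2))


if __name__ == "__main__":
    # Toy tree ((A:1,B:1):1,C:2) with invented trait values.
    tree = (((("A", 1.0), ("B", 1.0)), 1.0), ("C", 2.0)), 0.0
    trait_x = {"A": 1.0, "B": 3.0, "C": 5.0}
    trait_y = {"A": 3.0, "B": 7.0, "C": 11.0}
    _, _, cx = independent_contrasts(tree, trait_x)
    _, _, cy = independent_contrasts(tree, trait_y)
    print(contrast_correlation(cx, cy))
```

The correlation is computed through the origin because the sign of each contrast is arbitrary; tip branch lengths must be greater than zero for the weighted averages to be defined.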

Model-Based Analyses of Selective Regime

Changes in selective regime among species of Corallorhiza for codon-based matrices were assessed with the CODEML module in PAML v4.8 (Yang 1997, 2007) using “branch” models. Only loci for which all taxa had open reading frames were included in the analyses. Various matrices are listed in table 3. The objective was to test hypotheses of different $\omega$ ratios for 1) Corallorhiza versus leafy outgroups and 2) nongreen Corallorhiza versus green Corallorhiza + leafy outgroups. To accomplish this, different branch models were fit to the data using the topology resulting from the ML analysis of whole plastomes + rDNA. Six different branch models were tested. First, a model in which all branches evolve under one ratio (Model M0) was tested, where $\omega = d_N/d_S$, the ratio of nonsynonymous substitutions per nonsynonymous site to synonymous substitutions per synonymous site. A two-ratio model (M1) was then run, in which all branches of Corallorhiza were allowed to have a different ratio ($\omega_{\text{cor}}$) than the remaining leafy outgroup sequences ($\omega_{\text{leafy}}$). A three-ratio model was then run (M2), allowing separate values for $\omega_{\text{leafy}}$, $\omega_{\text{cor}}$, and nongreen Corallorhiza ($\omega_{\text{str}} = \omega_{\text{mac-ng}}$; Co. striata, and Co. mertensiana, Co. maculata var. maculata and occidentalis, respectively), followed by a four-ratio model (M3) with additionally separate ratios for Co. striata ($\omega_{\text{str}}$) and the nongreen members of the Co. maculata complex ($\omega_{\text{mac-ng}}$). The latter three models (M1–M3) specifically test hypotheses of whether Corallorhiza, and particularly nongreen Corallorhiza, display departures from purifying selection. Two additional models were run to test whether the nongreen Corallorhiza differ significantly from green Corallorhiza and leafy outgroup taxa. Specifically, Model M4 (two ratios) sets $\omega_{\text{leafy}} = \omega_{\text{cor}}$ and $\omega_{\text{str}} = \omega_{\text{mac-ng}}$, whereas M5 (three ratios) allows $\omega_{\text{str}}$ and $\omega_{\text{mac-ng}}$ to differ. Successively nested models were compared based on log-likelihood ratio tests (M1 vs. M0, M2 vs. M1, M3 vs. M2, M4 vs. M0, M5 vs. M4), where the likelihood ratio test statistic was calculated as two times the difference in log-likelihood between the two models, with degrees of freedom calculated as the difference in the number of free parameters for the models, and tested against the $\chi^2$ distribution. P values for multiple comparisons to the same data set were Bonferroni-corrected (Dunn 1959), as suggested by Yang (2007), with the corrected significance level corresponding to $\alpha/m$, where m is the number of branches being tested. Genes ycf1 and ycf2 were analyzed separately because they have extraordinarily high substitution rates (e.g., Barrett et al. 2013), as was matK, based on the finding of a pseudogenized copy in a previous study (Freudenstein and Senyo 2008).
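The likelihood ratio test and Bonferroni correction described above reduce to a few lines of arithmetic. The Python sketch below assumes one degree of freedom, which follows from each listed comparison adding a single ratio parameter (e.g., M1 vs. M0), and uses the identity that the upper tail of a 1-df chi-square distribution at x equals erfc(sqrt(x/2)). The function names and the log-likelihood values in the usage example are hypothetical.

```python
import math


def lrt_one_df(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns (test statistic, P value). The statistic is 2 * (lnL_alt -
    lnL_null); small negative values from optimization noise are set to 0.
    """
    statistic = max(2.0 * (lnl_alt - lnl_null), 0.0)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


def bonferroni_alpha(alpha, m):
    """Corrected significance level alpha/m for m comparisons."""
    return alpha / m


if __name__ == "__main__":
    # Hypothetical log-likelihoods for a one-ratio vs. two-ratio comparison.
    statistic, p = lrt_one_df(-100.0, -98.0)
    print(statistic, p, p < bonferroni_alpha(0.05, 5))
```

For comparisons with more than one degree of freedom, a general chi-square survival function would be needed in place of the 1-df shortcut used here.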

To further investigate the $\omega$ values among branches for the atp complex, two sets of branch-site models were tested against the data. In the first, all Corallorhiza were set as the foreground branch, and in the second, nongreen Corallorhiza (Co. striata var. vreelandii, Co. maculata var. maculata and occidentalis, and Co. mertensiana) were set as foreground branches. For both sets of branch-site models, two alternative models were tested. The “null” model specifies $0 < \omega_0 < 1$ for a proportion of sites and $\omega_1 = \omega_2 = 1$ for the remaining sites (no positive selection), whereas the alternative model allows $\omega_2 > 1$ (positive selection on some sites). The alternative model (some sites under positive selection) was then tested against the null model (no sites under positive selection) using a $\chi^2$ distribution as above, with 1 degree of freedom (i.e., the alternative and null models differ by one parameter).

Comparison with Other Parasitic Angiosperm Lineages

Plastid gene content (losses, pseudogenes, putatively functional genes) was compared among diverse lineages of angiosperms containing heterotrophs, including newly sequenced Corallorhiza. Taxa and genes were sorted by the numbers of putatively functional genes and gene functional class, respectively, to illustrate the different stages of plastid genome degradation. Specifically, genes were arranged according to the model of Barrett and Davis (2012), in which degradation of plastid functional gene classes is hypothesized to have occurred roughly in the following order: 1) NAD(P)H dehydrogenase genes (ndh); 2) “photosynthesis”-related genes, including pet, psa, psb, cemA, ccsA, rbcL, ycf3, and ycf4; 3) Plastid-encoded RNA Polymerase genes (rpo); 4) ATP synthase genes (atp); and finally 5) “housekeeping” genes (rpl, rps, rrn, trn, matK, clpP, infA, accD, ycf1, and ycf2). Though not a formal test in the strict sense, the use of this model does allow an assessment of how advanced the sequenced accessions of Corallorhiza are when compared with other heterotrophic angiosperms; it is hypothesized that Corallorhiza occupies the early stages of this model. Technically, atp genes function in the generation of ATP and movement of protons across the thylakoid membrane and thus play a crucial role in photosynthesis, but this functional class was kept separate from photosynthesis genes due to the possibility that ATP synthase serves roles outside of photosynthesis.

Supplementary Material

Supplementary figures S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by the National Science Foundation (awards DEB-0830020 to Jerrold I. Davis and DBI 1110443 to J.C.P.), the California State University Program for Education and Research in Biotechnology (CSUPERB) to C.F.B., the California State University, Los Angeles Center for Effective Teaching and Learning Faculty-Student Mentorship award to C.F.B. and J.L., and the California State University Council on Ocean Affairs, Science and Technology to C.S. The authors are grateful to Jerrold Davis, Patrick Edger, Ryan Ellingson, Gwynne Lim, Jeffrey Morawetz, and Chris Randle for feedback and laboratory/bioinformatics support; the staff at Cold Spring Harbor Laboratory (Woodbury, NY) for sample preparation and sequencing; Daniel Lima and Richard Sayre for assistance with chlorophyll spectroscopy and data interpretation; Gerardo Salazar (UNAM) for assistance in collecting Mexican specimens; and Maria Logacheva and three anonymous reviewers for critical insight that improved the paper.

References

Allen JF, de Paula WBM, Puthiyaveetil S, Nield J. 2011. A structural phylogenetic map for chloroplast photosynthesis. Trends Plant Sci. 16:645–655.

Barbrook AC, Howe CJ, Purton S. 2006. Why are plastid genomes retained in non-photosynthetic organisms? Trends Plant Sci. 11:101–108.

Barrett CF, Davis JI. 2012. The plastid genome of the mycoheterotrophic Corallorhiza striata (Orchidaceae) is in the relatively early stages of degradation. Am J Bot. 99:1513–1523.

Barrett CF, Davis JI, Leebens-Mack J, Conran JG, Stevenson DW. 2013. Plastid genomes and deep relationships among the commelinid monocot angiosperms. Cladistics 29:65–87.

Barrett CF, Freudenstein JV. 2008. Molecular evolution of rbcL in the mycoheterotrophic coralroot orchids (Corallorhiza Gagnebin, Orchidaceae). Mol Phylogenet Evol. 47:665–679.

Bellino A, Alfani A, Selosse M-A, Guerrieri R, Borghetti M, Baldantoni D. 2014. Nutritional regulation in mixotrophic plants: new insights from Limodorum abortivum. Oecologia 175:875–885.

Bernardi G. 1989. The isochore organization of the human genome. Annu Rev Genet. 23:637–661.

Bidartondo MI. 2005. The evolutionary ecology of myco-heterotrophy. New Phytol. 167:335–352.

Blazier J, Guisinger MM, Jansen RK. 2011. Recent loss of plastid-encoded ndh genes within Erodium (Geraniaceae). Plant Mol Biol. 76:263–272.

Braukmann T, Kuzmina M, Stefanovic S. 2013. Plastid genome evolution across the genus Cuscuta (Convolvulaceae): two clades within subgenus Grammica exhibit extensive gene loss. J Exp Bot. 64:977–989.

Braukmann T, Stefanovic S. 2012. Plastid genome evolution in mycoheterotrophic Ericaceae. Plant Mol Biol. 79:5–20.

Braukmann TWA, Kuzmina M, Stefanovic S. 2009. Loss of all plastid ndh genes in Gnetales and conifers: extent and evolutionary significance for the seed plant phylogeny. Curr Genet. 55:323–337.

Burnham KP, Anderson DR. 2002. Avoiding pitfalls when using information-theoretic methods. J Wildl Manage. 66:912–918.

Cameron DD, Preiss K, Gebauer G, Read DJ. 2009. The chlorophyll-containing orchid Corallorhiza trifida derives little carbon through photosynthesis. New Phytol. 183:358–364.

Chang CC, Lin HC, Lin IP, Chow TY, Chen HH, Chen WH, Cheng CH, Lin CY, Liu SM, Chang CC, et al. 2006. The chloroplast genome of Phalaenopsis aphrodite (Orchidaceae): comparative analysis of evolutionary rate with that of grasses and its phylogenetic implications. Mol Biol Evol. 23:279–291.

Conant GC, Wolfe KH. 2008. GenomeVx: simple web-based creation of editable circular chromosome maps. Bioinformatics 24:861–862.

Cummings MP, Welschmeyer NA. 1998. Pigment composition of putatively achlorophyllous angiosperms. Plant Syst Evol. 210:105–111.

Delannoy E, Fujii S, Colas des Francs-Small C, Brundrett M, Small I. 2011. Rampant gene loss in the underground orchid Rhizanthella gardneri highlights evolutionary constraints on plastid genomes. Mol Biol Evol. 28:2077–2086.

Doyle JJ, Doyle JL. 1987. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem Bull. 19:11–15.

Dunn OJ. 1959. Confidence intervals for the means of dependent, normally distributed variables. J Am Stat Assoc. 54:613–621.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Evron Y, Johnson EA, McCarty RE. 2000. Regulation of proton flow and ATP synthesis in chloroplasts. J Bioenerg Biomembr. 32:501–506.

Fang YJ, Wu H, Zhang TW, Yang M, Yin YX, Pan LL, Yu XG, Zhang XW, Hu SNA, Al-Mssallem IS, et al. 2012. A complete sequence and transcriptomic analyses of date palm (Phoenix dactylifera L.) mitochondrial genome. PLoS One 7:e37164.

Felsenstein J. 1985. Phylogenies and the comparative method. Am Nat. 125:1–15.

Freudenstein JV. 1992. Systematics of Corallorhiza and the Corallorhizinae (Orchidaceae) [PhD dissertation]. [Ithaca (NY)]: Cornell University.

Freudenstein JV. 1997. A monograph of Corallorhiza (Orchidaceae). Harv Pap Bot. 10:5–51.

Freudenstein JV, Barrett CF. 2010. Mycoheterotrophy and diversity in Orchidaceae. In: Seberg O, Petersen G, Barfod A, Davis JI, editors. Diversity, phylogeny and evolution in the monocotyledons. The Proceedings of the Fourth International Conference on Monocot Systematics. Aarhus (Denmark): Aarhus University Press. p. 25–37.

Freudenstein JV, Doyle JJ. 1994a. Character transformation and relationships in Corallorhiza (Orchidaceae: Epidendroideae). 1. Plastid DNA. Am J Bot. 81:1449–1457.

Freudenstein JV, Doyle JJ. 1994b. Plastid DNA, morphological variation, and the phylogenetic species concept: the Corallorhiza maculata (Orchidaceae) complex. Syst Bot. 19:273–290.

Freudenstein JV, Senyo DM. 2008. Relationships and evolution of matK in a group of leafless orchids (Corallorhiza and Corallorhizinae; Orchidaceae: Epidendroideae). Am J Bot. 95:498–505.

Funk HT, Berg S, Krupinska K, Maier UG, Krause K. 2007. Complete DNA sequences of the plastid genomes of two parasitic flowering plant species, Cuscuta reflexa and Cuscuta gronovii. BMC Plant Biol. 7:45.

Galagan JE, Calvo SE, Borkovich KA, Selker EU, Read ND, Jaffe D, FitzHugh W, Ma LJ, Smirnov S, Purcell S, et al. 2003. The genome sequence of the filamentous fungus Neurospora crassa. Nature 422:859–868.

Garland T, Dickerman AW, Janis CM, Jones JA. 1993. Phylogenetic analysis of covariance by computer simulation. Syst Biol. 42:265–292.

Glemin S, Clement Y, David J, Ressayre A. 2014. GC content evolution in coding regions of angiosperm genomes: a unifying hypothesis. Trends Genet. 30:263–270.

Goloboff PA, Farris JS, Kallersjo M, Oxelman B, Ramirez MJ, Szumik CA. 2003. Improvements to resampling measures of group support. Cladistics 19:324–332.

Goloboff PA, Farris JS, Nixon KC. 2008. TNT, a free program for phylogenetic analysis. Cladistics 24:774–786.

Hajdukiewicz PTJ, Allison LA, Maliga P. 1997. The two RNA polymerases encoded by the nuclear and the plastid compartments transcribe distinct groups of genes in tobacco plastids. EMBO J. 16:4041–4048.

Harmon LJ, Weir JT, Brock CD, Glor RE, Challenger W. 2008. GEIGER: investigating evolutionary radiations. Bioinformatics 24:129–131.

Iles WJD, Smith SY, Graham SW. 2013. A well-supported phylogenetic framework for the monocot order Alismatales reveals multiple losses of the plastid NADH dehydrogenase complex and a strong long-branch effect. In: Wilkin P, Mayo S, editors. Early events in monocot evolution. Cambridge: Cambridge University Press. p. 1–28.

Iorizzo M, Grzebelus D, Senalik D, Szklarczyk M, Spooner D, Simon P. 2012. Against the traffic: the first evidence for mitochondrial DNA transfer into the plastid genome. Mob Genet Elements. 2:261.

Jensen PE, Bassi R, Boekema EJ, Dekker JP, Jansson S, Leister D, Robinson C, Scheller HV. 2007. Structure, function and regulation of plant photosystem I. Biochim Biophys Acta. 1767:335–352.

Katinka MD, Duprat S, Cornillot E, Metenier G, Thomarat F, Prensier G, Barbe V, Peyretaillade E, Brottier P, Wincker P, et al. 2001. Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature 414:450–453.

Katoh K, Misawa K, Kuma Ki, Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30:3059–3066.

Kawakami K, Umena Y, Iwai M, Kawabata Y, Ikeuchi M, Kamiya N, Shen J-R. 2011. Roles of PsbI and PsbM in photosystem II dimer formation and stability studied by deletion mutagenesis and X-ray crystallography. Biochim Biophys Acta. 1807:319–325.

Knaus BJ. 2010. Short read toolbox. Available from: http://brianknaus.com.

Krause K. 2008. From chloroplasts to “cryptic” plastids: evolution of plastid genomes in parasitic plants. Curr Genet. 54:111–121.

Krause K. 2012. Plastid genomes of parasitic plants: a trail of reductions and losses. In: Bullerwell C, editor. Organelle genetics. Berlin (Germany): Springer. p. 79–103.

Kuijt J. 1969. The biology of flowering parasitic plants. Berkeley (CA): University of California Press.

Lawrence JG. 2005. Common themes in the genome strategies of pathogens. Curr Opin Genet Dev. 15:584–588.

Leake JR. 1994. The biology of myco-heterotrophic (Saprophytic) plants. New Phytol. 127:171–216.

Leake JR, Cameron DD. 2010. Physiological ecology of mycoheterotrophy. New Phytol. 185:601–605.

Liere K, Weihe A, Borner T. 2011. The transcription machineries of plant mitochondria and chloroplasts: composition, function, and regulation. J Plant Physiol. 168:1345–1360.

Logacheva MD, Schelkunov MI, Nuraliev MS, Samigullin TH, Penin AA. 2014. The plastid genome of mycoheterotrophic monocot Petrosavia stellaris exhibits both gene losses and multiple rearrangements. Genome Biol Evol. 6:238–246.

Logacheva MD, Schelkunov MI, Penin AA. 2011. Sequencing and analysis of plastid genome in mycoheterotrophic orchid Neottia nidus-avis. Genome Biol Evol. 3:1296–1303.

Maddison WP, Maddison DR. 2011. Mesquite: a modular system for evolutionary analysis, version 2.75. Available from: http://mesquiteproject.org.

Martín M, Funk HT, Serrot PH, Poltnigg P, Sabater B. 2009. Functional characterization of the thylakoid ndh complex phosphorylation by site-directed mutations in the ndhF gene. Biochim Biophys Acta. 1787:920–928.

Martín M, Sabater B. 2010. Plastid ndh genes in plant evolution. Plant Physiol Biochem. 48:636–645.

McCarty RE, Evron Y, Johnson EA. 2000. The chloroplast ATP synthase: a rotary enzyme? Annu Rev Plant Physiol Plant Mol Biol. 51:83–109.

McNeal JR, Arumugunathan K, Kuehl JV, Boore JL, Depamphilis CW. 2007. Systematics and plastid genome evolution of the cryptically photosynthetic parasitic plant genus Cuscuta (Convolvulaceae). BMC Biol. 5:55.

McNeal JR, Bennett JR, Wolfe AD, Mathews S. 2013. Phylogeny and origins of holoparasitism in Orobanchaceae. Am J Bot. 100:971–983.

Merckx V, Freudenstein JV. 2010. Evolution of mycoheterotrophy in plants: a phylogenetic perspective. New Phytol. 185:605–609.

Midford PE, Garland T Jr, Maddison WP. 2005. PDAP package of Mesquite, version 1.07. Available from: http://mesquiteproject.org.

Molina J, Hazzouri KM, Nickrent D, Geisler M, Meyer RS, Pentony MM, Flowers JM, Pelser P, Barcelona J, Inovejas SA, et al. 2014. Possible loss of the chloroplast genome in the parasitic flowering plant Rafflesia lagascae (Rafflesiaceae). Mol Biol Evol. 31:793–803.

Montfort C, Küsters E. 1940. Saprophytismus und photosynthese. I. Biochemische physiologische studien an humus-orchideen. Botanisches Archiv 40:571–633.

Moran NA. 2002. Microbial minimalism: genome reduction in bacterial pathogens. Cell 108:583–586.

Morrison HG, McArthur AG, Gillin FD, Aley SB, Adam RD, Olsen GJ, Best AA, Cande WZ, Chen F, Cipriano MJ, et al. 2007. Genomic minimalism in the early diverging intestinal parasite Giardia lamblia. Science 317:1921–1926.

Nickrent DL, Yan OY, Duff RJ, dePamphilis CW. 1997. Do nonasterid holoparasitic flowering plants have plastid genomes? Plant Mol Biol. 34:717–729.

Patel RK, Jain M. 2012. NGS QC Toolkit: a toolkit for quality control of next generation sequencing data. PLoS One 7:e30619.

Peredo EL, King UM, Les DH. 2013. The plastid genome of Najas flexilis: adaptation to submersed environments is accompanied by the complete loss of the NDH complex in an aquatic angiosperm. PLoS One 8:e68591.

Porra RJ. 2002. The chequered history of the development and use of simultaneous equations for the accurate determination of chlorophylls a and b. Photosyn Res. 73:149–156.

Protasio AV, Tsai IJ, Babbage A, Nichol S, Hunt M, Aslett MA, De Silva N, Velarde GS, Anderson TJC, Clark RC, et al. 2012. A systematically improved high quality genome and transcriptome of the human blood fluke Schistosoma mansoni. PLoS Negl Trop Dis. 6:e1455.

Randle CP, Wolfe AD. 2005. The evolution and expression of rbcL in holoparasitic sister-genera Harveya and Hyobanche (Orobanchaceae). Am J Bot. 92:1575–1585.

Ranwez V, Harispe S, Delsuc F, Douzery EJP. 2011. MACSE: Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons. PLoS One 6:e22594.

Sanderson MJ. 1997. A nonparametric approach to estimating divergence times in the absence of rate constancy. Mol Biol Evol. 14:1218–1231.

Smith DR, Lee RW. 2014. A plastid without a genome: evidence from the nonphotosynthetic green algal genus Polytomella. Plant Physiol. 164:1812–1819.

Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690.

Straub SCK, Cronn RC, Edwards C, Fishbein M, Liston A. 2013. Horizontal transfer of DNA from the mitochondrial to the plastid genome and its subsequent evolution in milkweeds (Apocynaceae). Genome Biol Evol. 5:1872–1885.

Straub SCK, Fishbein M, Livshultz T, Foster Z, Parks M, Weitemier K, Cronn RC, Liston A. 2011. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing. BMC Genomics 12:211.

Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. 2013. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol. 30:2725–2729.

Tsai IJ, Zarowiecki M, Holroyd N, Garciarrubio A, Sanchez-Flores A, Brooks KL, Tracey A, Bobes RJ, Fragoso G, Sciutto E, et al. 2013. The genomes of four tapeworm species reveal adaptations to parasitism. Nature 496:57–63.

Umate P, Schwenkert S, Karbat I, Dal Bosco C, Mlcochova L, Volz S, Zer H, Herrmann RG, Ohad I, Meurer J. 2007. Deletion of psbM in tobacco alters the Q(B) site properties and the electron flow within photosystem II. J Biol Chem. 282:9758–9767.

Vaidya G, Lohman DJ, Meier R. 2011. SequenceMatrix: concatenation software for the fast assembly of multi-gene datasets with character set and codon information. Cladistics 27:171–180.

Westwood JH, Yoder JI, Timko MP, dePamphilis CW. 2010. The evolution of parasitism in plants. Trends Plant Sci. 15:227–235.

Wicke S, Muller KF, de Pamphilis CW, Quandt D, Wickett NJ, Zhang Y, Renner SS, Schneeweiss GM. 2013. Mechanisms of functional and physical genome reduction in photosynthetic and nonphotosynthetic parasitic plants of the broomrape family. Plant Cell 25:3711–3725.

Wicke S, Schäferhoff B, dePamphilis CW, Müller KF. 2014. Disproportional plastome-wide increase of substitution rates and relaxed purifying selection in genes of carnivorous Lentibulariaceae. Mol Biol Evol. 31:529–545.

Wicke S, Schneeweiss GM, dePamphilis CW, Muller KF, Quandt D. 2011. The evolution of the plastid chromosome in land plants: gene content, gene order, gene function. Plant Mol Biol. 76:273–297.

Wickett NJ, Honaas LA, Wafula EK, Das M, Huang K, Wu BA, Landherr L, Timko MP, Yoder J, Westwood JH, et al. 2011. Transcriptomes of the parasitic plant family Orobanchaceae reveal surprising conservation of chlorophyll synthesis. Curr Biol. 21:2098–2104.

Wimpee CF, Wrobel RL, Garvin DK. 1991. A divergent plastid genome in Conopholis americana, an achlorophyllous parasitic plant. Plant Mol Biol. 17:161–166.

Wolfe AD, dePamphilis CW. 1998. The effect of relaxed functional constraints on the photosynthetic gene rbcL in photosynthetic and nonphotosynthetic parasitic plants. Mol Biol Evol. 15:1243–1258.

Wolfe KH, Morden CW, Palmer JD. 1992. Function and evolution of a minimal plastid genome from a nonphotosynthetic parasitic plant. Proc Natl Acad Sci U S A. 89:10648–10652.

Wu FH, Chan MT, Liao DC, Hsu CT, Lee YW, Daniell H, Duvall MR, Lin CS. 2010. Complete chloroplast genome of Oncidium Gower Ramsey and evaluation of molecular markers for identification and breeding in Oncidiinae. BMC Plant Biol. 10:68.

Wyman SK, Jansen RK, Boore JL. 2004. Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20:3252–3255.

Yang JB, Tang M, Li HT, Zhang ZR, Li DZ. 2013. Complete chloroplast genome of the genus Cymbidium: lights into the species identification, phylogenetic implications and population genetic analyses. BMC Evol Biol. 13:84.

Yang ZH. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci. 13:555–556.

Yang ZH. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 24:1586–1591.

Yoder OC, Turgeon BG. 2001. Fungal genomics and pathogenicity. Curr Opin Plant Biol. 4:315–321.

Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18:821–829.

Zimmer K, Meyer C, Gebauer G. 2008. The ectomycorrhizal specialist orchid Corallorhiza trifida is a partial myco-heterotroph. New Phytol. 178:395–400.

(including the nongreen coralroots) that still synthesize detectable levels of chlorophyll do so at reduced levels. The fact that all nongreen coralroots contain relatively low chlorophyll concentrations means that these pigments are masked, and thus the overall coloration of these species tends to be red, purple, or brown, most likely due to anthocyanins occurring at higher concentrations (Freudenstein and Doyle 1994a; personal observation). Occasionally, yellow variants occur among populations of the various coralroot species (Freudenstein 1997; Barrett CF, personal observation). These are presumably recurrent anthocyanin-deficient mutants (yellow coloration may be due to persistence of chlorophylls and other yellow/orange carotenoid pigments), though no specific studies have been carried out to demonstrate the causes of this phenomenon in Corallorhiza. Regardless, reduced chlorophyll concentrations and the resulting apparent lack of visible green coloration in Corallorhiza are excellent indicators of relaxed selective constraints on photosynthetic machinery, and this is likely to be the case for many other parasite-containing taxa.

Characteristics of Corallorhiza Plastomes

With perhaps the exception of Co. striata vreelandii, plastomes among coralroot taxa do not altogether display the major size reduction that would be expected for a group of leafless parasites, and thus can be said to occupy the early stages along the path to a highly reduced or “minimal” plastid genome (Barbrook et al. 2006; Krause 2008). In a previous study, it was hypothesized that members of the Co. striata complex would have the most reduced plastomes in the genus in terms of overall size (Barrett and Davis 2012), but even the plastome of Co. striata vreelandii (the representative sequenced in that study and included here) only represents an approximately 6% total genome size reduction relative to the leafy Phalaenopsis (Chang et al. 2006). In fact, Co. macrantha and Co. maculata mexicana (table 2) both have larger plastomes than two of the three leafy outgroup orchid taxa, Phalaenopsis (148,964 bp) and Oncidium (146,484 bp). Thus, the shift to leaflessness per se in Corallorhiza does not appear to be associated with a reduction in plastome size for the genus overall (at least for partially green taxa), although an examination of plastomes for the closest relatives of Corallorhiza (e.g., Oreorchis, Aplectrum, Cremastra, Govenia, Calypso) in tribe Calypsoeae will help clarify this.

A reduction in plastome size due to deletions and mutations resulting in pseudogenes is expected to be associated with a decrease in GC content, as once-functional (and relatively GC-rich) genes become riddled with loss-of-function mutations over time as a result of relaxed photosynthesis-associated selective pressures. These coding regions begin to resemble noncoding DNA, which typically has a lower GC content (e.g., Bernardi 1989; Glemin et al. 2014). This is apparently the case for Co. striata vreelandii (34.02% GC; table 2), which is 1.16% lower than the mean GC content for the remaining coralroot plastomes (35.28%). Mean GC content for the nongreen members of the Co. maculata complex is 35.10%, no different from that found in the green coralroots. This suggests that large-scale differences in plastid GC content may take longer to manifest, and thus likely come about later in the process of plastid modification relative to other processes, such as photosynthesis gene deletions and other initial gene mutations causing loss of function. This lends additional support to the notion that Co. striata vreelandii is farther “down the path” of plastid genome degradation relative to the nongreen members of the Co. maculata complex.
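The GC comparison above rests on a simple per-sequence statistic. As a minimal sketch (the function name and the toy sequences are illustrative assumptions, not data or code from this study), GC content could be computed as the percentage of G and C among unambiguous bases:

```python
def gc_percent(seq: str) -> float:
    """Return GC content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at  # ambiguity codes such as N are ignored
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


# Hypothetical toy sequences, NOT real plastome data: a GC-rich "coding-like"
# fragment and an AT-enriched "decayed" fragment of the same length.
coding_like = "ATGGCCGGCACCGATTAA"
decayed = "ATGGCTAGTACTAATTAA"

print(f"coding-like: {gc_percent(coding_like):.2f}% GC")
print(f"decayed:     {gc_percent(decayed):.2f}% GC")
```

Applied to whole plastome sequences, the same calculation would give values comparable to those reported in table 2, provided ambiguous bases are handled the same way.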

More broadly, the leafy orchids Phalaenopsis aphrodite and the hybrid Oncidium Gower Ramsey have GC contents of 35.6% and 37.32%, respectively (Chang et al. 2006; Wu et al. 2010); these values are very similar to what is found in Co. trifida and other green coralroots, thus green coralroots do not seem to deviate much from other leafy orchids. Members of holo- and hemiparasitic Orobanchaceae (Wicke et al. 2013) represent an interesting clade for comparison with Corallorhiza in terms of GC content. Orobanchaceae range from 38.08% GC in the hemiparasitic Schwalbea to 31.09% in the holoparasitic Phelipanche purpurea. Variation in GC content among plastomes of Corallorhiza lies completely within the bounds of that observed in Orobanchaceae, which is hypothesized to be in the relatively more advanced stages of degradation.

Structural Evolution of the Plastome: Pseudogenes and Losses

ndh Genes

Degradation of the plastid-encoded ndh complex is advanced in Corallorhiza, as has been observed in other groups (e.g., Chang et al. 2006; Funk et al. 2007; McNeal et al. 2007; Braukmann et al. 2009; Blazier et al. 2011; Braukmann and Stefanovic 2012; Braukmann et al. 2013; Iles et al. 2013; Peredo et al. 2013; Wicke et al. 2013), including some of the leafy, photosynthetic orchid plastomes included in figure 3. Interestingly, there is variation among the orchid plastomes sequenced to date with respect to the status of ndh genes (Chang et al. 2006; Wu et al. 2010; Delannoy et al. 2011; Barrett and Davis 2012; Logacheva et al. 2011; Yang et al. 2013), suggesting multiple independent degradation pathways in this complex in the orchid family.

The ancestor of Corallorhiza likely already experienced some degree of degradation in the ndh complex, as evidenced by several shared pseudogenes. Sequencing of additional orchid plastomes could reveal an evolutionary transition from a functional plastid-encoded ndh complex in leafy relatives (e.g., Aplectrum, Oreorchis, Cremastra, Govenia) to a largely degraded state, as observed in Corallorhiza. However, there is even some variation in the degree of degradation among certain members of ndh within Corallorhiza, which in part reflects the plastid phylogeny (fig. 2). This pattern includes a major deletion in ndhE in all taxa but Co. striata vreelandii; a shared pseudogene for ndhJ for all members of the Co. maculata complex besides Co. bulbosa (which here is sister to the remaining members of the Co. maculata complex; fig. 2); a loss of the same gene in (Co. odontorhiza, Co. wisteriana); and hypothesized parallel deletions of the ndhG pseudogene in Co. bulbosa and the clade of (Co. macrantha, Co. maculata mexicana).

The plastid-encoded ndh complex is believed to be dispensable for a number of reasons (reviewed in Martín and Sabater 2010; Wicke et al. 2011, 2013), which may differ for various groups of plants displaying evidence of degradation in this gene complex. Generally, the ndh complex is believed to be essential for photosynthetic function in environments with highly variable light intensities, by functioning in regulation of cyclic electron transport and also in mitigating the effects of photo-oxidative damage under high light intensities (Martín et al. 2009; Martín and Sabater 2010). In the ancestor of Corallorhiza, selective constraints on the ndh complex may have been lifted by a combination of decreased dependence upon photosynthetic carbon as a result of a shift to heavy dependence upon fungal carbon (a similar situation is also observed in carnivorous plants using animals as a carbon source in the family Lentibulariaceae; Wicke et al. 2014) and a tendency to grow in dark, shaded, mature forests where light intensity is always low. This may also occur in other orchid lineages (including leafy, photosynthetic orchids) due to high dependence on fungal carbon as seedlings and also the tendency to live in habitats with relatively low, or at least static, amounts of light intensity (e.g., a shaded rainforest understory or floor).

Photosynthesis-Related Genes (excluding ndh, rpo, and atp)

Nongreen members of Corallorhiza, that is, those with relatively reduced levels of chlorophyll (fig. 1), all display evidence of degradation of genes directly involved in photosynthetic processes (fig. 2; e.g., psa/psb, pet, rbcL). Of these taxa, Co. striata vreelandii displays the highest number of degraded genes (table 2; figs. 2 and 3), as reported in Barrett and Davis (2012), followed by northern North American members of the Co. maculata complex (Co. mertensiana, Co. maculata maculata, Co. maculata occidentalis). Both of these complexes (Co. striata, Co. maculata) have independently undergone degradation in the following genes/gene complexes (omitting ndh; see above): pet, psa/psb, rpo, ccsA, cemA, and rbcL. It is highly unlikely that transcript editing of any kind can compensate for these changes, and it can be assumed that these changes are irreversible. The possibility exists, however slight, that all deleted genes or pseudogenes among the plastomes of these taxa are redundant and that functional copies are encoded in the nuclear genome, or that genes have been “regained” through intergenomic or horizontal transfer (e.g., Iorizzo et al. 2012; Straub et al. 2013; Wicke et al. 2013).

Based on the degraded states of photosynthesis-related genes among these plastomes, it can be concluded that the aforementioned nongreen taxa are strict parasites of fungi, incapable of photosynthesis, and thus the terms “holomycotroph” or “full mycoheterotroph” are indeed appropriate. No studies of photosynthetic activity have been carried out in Corallorhiza aside from those in the “greenest” coralroot, Co. trifida (Zimmer et al. 2008; Cameron et al. 2009). It will be beneficial to measure photosynthetic rates (or determine the lack thereof), along with plastid-gene transcript levels, among multiple individuals of both green and nongreen coralroot taxa, preferably growing in close proximity to one another and in similar habitats, to definitively characterize photosynthetic capacity in the genus.

Though green members of Corallorhiza do not display significant degradation of photosynthesis-related genes (psa, psb, pet, rbcL, etc.), there are a few important exceptions, which may or may not affect the efficacy of photosynthesis in these taxa. First, members of Corallorhiza excluding Co. trifida and Co. striata have experienced a large deletion in psbM (PSII protein M). This low-molecular-weight peptide is believed to play a role in maintaining the stability of PSII dimers; psbM-deficient tobacco mutants displayed impaired stability of PSII but were still able to function (Umate et al. 2007; Kawakami et al. 2011). Photosynthesis has only been explicitly measured and demonstrated in Co. trifida (Zimmer et al. 2008; Cameron et al. 2009), which has a putatively functional copy of psbM. Thus, it is possible that 1) other “green” coralroots (Co. odontorhiza, Co. wisteriana, Co. bulbosa, Co. macrantha, Co. maculata mexicana) carry out inhibited photosynthesis; 2) this gene has been transferred to the nucleus or mitochondrion, and its product is imported into the plastid, allowing stabilization of the PSII dimer complex and some level of photosynthetic function; or 3) a nonfunctional psbM gene has no effect on photosynthetic function.

Another partially green coralroot, Co. macrantha (restricted to high-elevation forests in southern Mexico), has a pseudogenized copy of psaI (Photosystem I Reaction Center Subunit VIII) due to a 4-bp insertion near the 5′-end of the gene relative to other coralroots, causing a reading frame shift. The psaI protein is believed to function in the stabilization of a nuclear-encoded subunit of the same complex (psaL) in Arabidopsis thaliana, and has also been shown to bind to the Light-harvesting Complex II (Jensen et al. 2007); it was suggested that the latter putative function is redundant. In Corallorhiza, this might explain why psaI in Co. macrantha has apparently become a pseudogene. Thus, among the green coralroots, Co. macrantha may represent yet another very early step in the transition to the loss of photosynthesis, or at least a reduction in photosynthetic efficacy, but this remains a hypothesis to be tested.
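To illustrate why a 4-bp insertion is expected to disrupt a gene such as psaI, the following minimal sketch flags two simple signs of a broken reading frame: a length that is not a multiple of three and a stop codon before the end. The function name, the checks chosen, and the toy sequence are assumptions made for illustration; they are not the annotation criteria used in this study, and real plastid genes (e.g., those with edited or alternative start codons) would need more careful handling.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reading_frame_problems(cds: str) -> list:
    """List simple signs that a coding sequence lacks an intact reading frame."""
    cds = cds.upper()
    problems = []
    remainder = len(cds) % 3
    if remainder != 0:
        problems.append("length not a multiple of 3 (possible frameshift)")
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - remainder, 3)]
    # A stop codon anywhere before the final codon truncates the product.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("internal stop codon")
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("no terminal stop codon")
    return problems


# Hypothetical toy ORF (NOT the real psaI sequence) and a copy carrying a
# 4-bp insertion near its 5' end, mimicking the kind of change described.
intact = "ATGGCTGAAACTTGGTAA"
mutated = intact[:3] + "ATAG" + intact[3:]

print(reading_frame_problems(intact))
print(reading_frame_problems(mutated))
```

Because four is not a multiple of three, every codon downstream of the insertion is read in a shifted frame, which in this toy case also exposes a premature stop codon.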

rpo Genes

There is evidence for rpo pseudogenes in both the nongreen Co. striata and Co. maculata complexes, based on large deletions, reading frame shifts, and internal stop codons. Plastid genes are transcribed by two polymerases in monocots: a nuclear-encoded polymerase and a plastid-encoded polymerase (PEP) (Liere et al. 2011). PEP is believed to preferentially transcribe photosynthesis-related genes, whereas the rpoB-C1-C2 operon itself is transcribed by a nuclear-encoded RNA polymerase (Hajdukiewicz et al. 1997; reviewed in Liere et al. 2011; Krause 2012). In addition to transcribing photosynthesis-related genes for the most part, PEP also tends to be more active in later leaf developmental stages (Hajdukiewicz et al. 1997; reviewed in Liere et al. 2011).

Based on this latter finding, it might be expected that all members of Corallorhiza should show some degradation in rpo genes, as leaves do not fully develop in any species within this genus (i.e., only the basal leaf sheath persists, but the blade or lamina never develops), presumably obviating the role of PEP in transcription of photosynthesis-related genes (e.g., psa, psb) in leaves. However, green coralroots have intact reading frames for all rpo genes, whereas nongreen coralroots show varying degrees of degradation in the rpo complex. As with photosynthesis-related genes in Corallorhiza (figs. 2 and 3), it does not appear to be the transition to leaflessness per se that has resulted in degradation of the rpo complex, but more likely the shift from partial heterotrophy (and the associated need to carry out photosynthesis, whatever small amount) to strict heterotrophy. An interesting parallel occurs in the carnivorous family Lentibulariaceae, in which rpo genes display significant departures from purifying selection (Wicke et al. 2014). An interesting follow-up would be to determine whether degradation first begins among photosystem genes (psa, psb), diminishing the importance of having a functional PEP complex, or whether the opposite is the case, in which a dysfunctional (or nonexistent) PEP has contributed to relaxed selective pressures associated with photosynthesis-related genes (i.e., if psa and psb genes can no longer be transcribed, or are transcribed by an inefficient PEP, then is there any benefit to maintaining them?). Regardless, the degradation of rpo genes in nongreen Corallorhiza closely mirrors that in genes directly involved in photosynthesis (psa/psb, rbcL, etc.).

Relationships in Corallorhiza

Complete plastome sequences of Corallorhiza yielded a highly resolved, highly supported tree consistent with previous studies (Freudenstein and Doyle 1994a, 1994b; Barrett and Freudenstein 2008; Freudenstein and Senyo 2008). Here, for the first time, relationships among the six taxa that comprise the Co. maculata complex (as recognized in Freudenstein 1997) are fully resolved, with high support for Co. bulbosa as sister to the rest of the complex (bootstrap = 91), whereas previous studies could not resolve relationships in this complex (Barrett and Freudenstein 2008; Freudenstein and Senyo 2008). Complete plastomes also confirm the sister relationship of Co. striata to the rest of Corallorhiza, with Co. trifida (moving away from the base of the tree), then Co. odontorhiza and Co. wisteriana, diverging successively. Depending on the data set analyzed, Co. bulbosa appears in a few different places, most likely due to homoplasy and the few variable characters available to resolve the position of this species in smaller data sets, as evidenced by short internal branches (supplementary fig. S3, Supplementary Material online). Thus, inclusion of complete plastomes (pseudogenes, spacers, and introns) resolves the otherwise uncertain position of this species. Nuclear rDNA, on the other hand, places Co. bulbosa as sister to (Co. macrantha, Co. maculata mexicana) with moderate support, a similar finding to the combined analysis of Barrett and Freudenstein (2008) based on plastid rbcL + nuclear ITS, though that study did not include Co. macrantha. Combining all nuclear and plastid data places Co. bulbosa as sister to the remaining members of the Co. maculata complex with 100% bootstrap support (supplementary fig. S3, Supplementary Material online). The complete plastome + rDNA data set fully resolves relationships among the major species complexes of Corallorhiza (excluding members not sampled from some complexes).

In particular, relationships within the Co. maculata complex shed light on a phylogeographically intriguing aspect of plastome degradation. All Mexican members of the Co. maculata complex have at least some visible green tissue and are presumed to be partially mycoheterotrophic, based on conservation of the plastid-encoded photosynthetic apparatus (for the most part; see fig. 2). Those members of the Co. maculata complex that occur in northern North America have no visible green tissue (two of these species were shown to have highly reduced chlorophyll levels; fig. 1) and display evidence of degradation in these genes. The topology of the Co. maculata complex based on combined plastome + rDNA data (supplementary fig. S3, Supplementary Material online; fig. 2) lends support to a previous hypothesis that this complex (and possibly the genus Corallorhiza itself) originated in southern Mexico (Freudenstein and Doyle 1994a, 1994b; Barrett and Freudenstein 2008; Freudenstein and Senyo 2008) and expanded into northern North America, with the ancestor of Co. mertensiana, Co. maculata maculata, and Co. maculata occidentalis undergoing a shift from partial to strict heterotrophy (and continuing this trend among extant lineages), and thus presumably losing the ability to carry out photosynthesis. This suggests that deeper sampling among individuals in members of both the Co. maculata complex and the Co. striata complex (which is here represented only by Co. striata vreelandii) will yield a much clearer picture of plastome degradation in Corallorhiza on a phylogeographic scale.

Selective Regime among Genes with Open Reading Frames

Model M4, allowing green and nongreen taxa to have different $\omega$-ratios ($\omega_{\text{leafy}} = \omega_{\text{cor}}$; $\omega_{\text{mac-ng}} = \omega_{\text{str}}$), fit the data significantly better than the one-ratio model for both housekeeping genes and photosynthesis-related genes, suggesting different selective scenarios for nongreen taxa relative to green taxa, but these comparisons are nonsignificant when corrected for multiple comparisons (table 3). For housekeeping genes, $\omega$-values for green versus nongreen taxa did not differ by much ($\omega_{\text{leafy}} = \omega_{\text{cor}} = 0.30267$; $\omega_{\text{mac-ng}} = \omega_{\text{str}} = 0.34230$; table 3). However, this difference is much more pronounced for photosynthesis-related genes ($\omega_{\text{leafy}} = \omega_{\text{cor}} = 0.27361$; $\omega_{\text{mac-ng}} = \omega_{\text{str}} = 0.73720$), with the $\omega$ estimate for nongreen taxa being approximately 2.6 times that of green taxa and approaching a value expected under neutral evolution ($\omega \approx 1$). Thus, it is evident that even though some photosynthesis genes in nongreen taxa still have intact reading frames, there may be a relaxation of selective pressure associated with these genes.
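Comparisons of this kind between nested branch models are conventionally made with a likelihood-ratio test. The sketch below assumes one degree of freedom (a two-ratio model against the one-ratio model) and uses a Bonferroni adjustment for multiple comparisons; the correction method, the function names, and the log-likelihood values are all assumptions for illustration and are not taken from table 3.

```python
import math


def lrt_pvalue_df1(lnl_null: float, lnl_alt: float) -> float:
    """P value of a likelihood-ratio test with one degree of freedom."""
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # For 1 df, the chi-square survival function equals erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


def bonferroni(pvalues):
    """Bonferroni-adjusted P values (assumed correction, capped at 1)."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


# Hypothetical log-likelihoods (one-ratio model, two-ratio model) for three
# gene sets; these numbers are placeholders, NOT results from this study.
fits = {
    "housekeeping": (-25000.0, -24997.6),
    "photosynthesis": (-31000.0, -30997.2),
    "atp": (-8000.0, -7992.0),
}

raw = [lrt_pvalue_df1(null, alt) for null, alt in fits.values()]
for name, p, p_adj in zip(fits, raw, bonferroni(raw)):
    print(f"{name}: P = {p:.4g}, adjusted P = {p_adj:.4g}")
```

A comparison that is significant before adjustment can become nonsignificant afterwards, which is the pattern described above for the housekeeping and photosynthesis-related gene sets.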

For atp genes, Models M1 and M4 fit the data significantly better than does M0, even after correcting for multiple comparisons, suggesting different selective regimes in Corallorhiza versus leafy green outgroups, or at least in nongreen Corallorhiza versus green Corallorhiza and leafy outgroups. For example, in model M1 the $\omega$ value for Corallorhiza is nearly twice that for leafy orchids ($\omega_{\text{leafy}} = 0.15175$; $\omega_{\text{cor}} = \omega_{\text{mac-ng}} = \omega_{\text{str}} = 0.26949$), and in model M4 nongreen Corallorhiza have $\omega$ values over twice those for green Corallorhiza and leafy orchids. Further investigations using branch-site models, allowing some sites in the “foreground” branches to be under positive selection, suggest evidence for positive selection in nongreen Corallorhiza for the atp complex.

The plastid-encoded atp genes play a critical role in photosynthesis by both generating ATP and translocating protons across the thylakoid membrane (Evron et al. 2000; McCarty et al. 2000; Allen et al. 2011). However, in taxa that display evidence of loss-of-function for photosystem and other photosynthesis-related genes, the fact that atp genes are preserved as intact reading frames suggests that they may play additional roles in plastid function (discussed in Wicke et al. 2013). This is bolstered by the fact that this complex shows evidence of positive selection in strictly heterotrophic, nongreen taxa, but not for Corallorhiza as a whole. This would support the hypothesis that the atp complex has either taken on additional, as-yet unknown roles in holomycotrophic plastids or, at a minimum, has fallen under increased selective pressure to adapt to the overall “alternative lifestyle” associated with strict fungal heterotrophy in nonphotosynthetic plastids (including in amyloplasts, chromoplasts, elaioplasts, proteinoplasts, etc.). A future research objective is to examine the potential biological roles and molecular evolutionary patterns of the atp complex in nonphotosynthetic plants and their green, photosynthetic relatives, as members of this gene complex are often preserved and putatively functional among various heterotrophic angiosperms (see fig. 3 and references therein, including discussion in Wicke et al. 2013).

Comparison of Corallorhiza with Other Parasitic Angiosperms Using the Model of Barrett and Davis (2012)

A comparison of the plastomes of several heterotrophic taxa that have been sequenced to date shows that Corallorhiza is not quite in the “advanced” stages of plastid genome degradation relative to several taxa in Orobanchaceae (e.g., Wicke et al. 2013) or other orchids such as Neottia (Logacheva et al. 2011) and Rhizanthella (Delannoy et al. 2011) (fig. 3). Interpreted in the context of the basic model presented in Barrett and Davis (2012), members of Corallorhiza fall into categories 1 and 3. Taxa with some visible green tissue display pseudogenes/deletions mainly in the ndh complex (category 1; gene-class categories are listed at the top of fig. 3), with a few additional changes, whereas those with no green tissue have experienced pseudogenization and deletions in ndh, photosynthesis-related, and rpo genes, but not in atp genes or other “housekeeping” genes (category 3, excepting the loss of a redundant trnT-GGU in Co. striata vreelandii; Barrett and Davis 2012; figs. 2 and 3). Thus, nonparasitic autotrophs and partial heterotrophs roughly fall either into category 0 (no degradation of plastid genes) or category 1 (degradation of the ndh complex only). More specifically, all partial heterotrophs fall into category 1, with evidence of degradation of the ndh complex.

Strict heterotrophs fall into more “advanced” categories. Several, but not all, members of Orobanchaceae and two orchids (Neottia and Rhizanthella) can be classified in category 5, in which there is evidence of degradation of atp genes and at least some evidence of degradation in “housekeeping” genes (e.g., rpl, rps, trn, etc.). There are some genes for which no taxa have experienced pseudogenization or loss, including rpl2, rpl16, rpl20, rps2, rps4, rps8, rps11, ycf1, ycf2, trnD-GUC, trnE-UUC, trnI-CAU, trnM-CAU, trnQ-UUG, trnY-GUA, rrn16, rrn23, rrn4.5, and rrn5. This suggests that they may be indispensable for plastid function, at least in the angiosperms included in this comparison, and that a “minimal set” of genes might be necessary for functional plastid “housekeeping” and perhaps other functions outside of photosynthesis and ATP production.

Overall, no taxa fell in categories 2 or 4, suggesting that degradation of the rpo genes occurs more or less simultaneously with degradation in photosynthesis-related genes, and that degradation of atp genes occurs along with that in “housekeeping” genes (also discussed in Wicke et al. 2013). Thus, the model of Barrett and Davis (2012) could be modified in light of more recently sequenced plastomes, including those of Corallorhiza, to have five general categories: 1) no degradation; 2) degradation in the ndh complex only; 3) degradation of the ndh complex and photosynthesis-related genes, including rpo genes; 4) degradation in all plastid gene systems, including “housekeeping” and atp genes; and 5) complete or nearly complete loss of the plastid genome (as is hypothesized to have occurred in some strict heterotrophs such as Corynaea and Rafflesia [Nickrent et al. 1997; Molina et al. 2014]).
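The five categories of the modified model can be read as a simple decision rule. The following is a minimal sketch of that rule, not code used in the study; the function name, the gene-class labels, and the `plastome_present` flag are hypothetical names introduced here for illustration.

```python
# Illustrative only: maps observed degradation (pseudogenes or deletions per
# functional gene class) onto the five categories of the modified model.
# Labels and function name are assumptions, not taken from the paper.

GENE_CLASSES = {"ndh", "photosynthesis", "rpo", "atp", "housekeeping"}


def degradation_category(degraded, plastome_present=True):
    """Return the category (1-5) for a set of degraded gene classes."""
    degraded = set(degraded)
    unknown = degraded - GENE_CLASSES
    if unknown:
        raise ValueError(f"unknown gene classes: {sorted(unknown)}")
    if not plastome_present:
        return 5  # complete or nearly complete loss of the plastid genome
    if degraded & {"atp", "housekeeping"}:
        return 4  # degradation reaches atp and/or housekeeping genes
    if degraded & {"photosynthesis", "rpo"}:
        return 3  # ndh plus photosynthesis-related and rpo genes
    if "ndh" in degraded:
        return 2  # ndh complex only
    return 1      # no degradation


print(degradation_category({"ndh"}))                           # green coralroot pattern
print(degradation_category({"ndh", "photosynthesis", "rpo"}))  # nongreen coralroot pattern
```

A rule this coarse only reports the most advanced class affected, so it cannot flag lineages that deviate from the stepwise order of the model.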

There is some degree of deviation from the Barrett and Davis (2012) model, as might be expected given the potential for somewhat idiosyncratic patterns of plastome evolution in unrelated groups. The obvious exception to this model would be the sequenced members of the genus Cuscuta in the family Convolvulaceae. One of the best-studied parasitic systems in terms of plastome evolution, Cuscuta contains members with “extreme” plastome reduction relative to closely related leafy species (approximately 50% total size reduction in Cu. gronovii and Cu. obtusiflora relative to the leafy Ipomoea) (fig. 3 and references therein). Yet both highly reduced species of Cuscuta retain most, if not all, of the genes for photosynthesis, but have degraded RNA polymerase complexes and also show evidence of pseudogenes/deletions of some housekeeping genes. A further survey based on probe hybridization in Cuscuta, with greatly increased taxon sampling, by Braukmann et al. (2013) suggests that this genus encapsulates multiple transitions to highly degraded plastid genomes, with members of subgenus Grammica displaying even more advanced stages of degradation. These taxa were not included in figure 3, as they are based on hybridization and have not yet been sequenced to generate complete


Plastid Genome Degradation and Implications doi101093molbevmsu252 MBE by guest on July 7 2016


plastomes. Recycling of respiratory CO2 has been offered as an explanation for the retention of putatively functional genes in the psa, psb, and pet complexes (McNeal et al. 2007), underscoring the importance of studying the physiology of strict heterotrophs on a taxon-by-taxon basis, and also highlighting how little is known about the basic physiology and functional genomics of these plants.

Conclusion

All members of Corallorhiza produce detectable levels of chlorophyll based on the current sample, but nongreen, strictly heterotrophic members have, on average, tenfold lower concentrations relative to green, partially heterotrophic members. Genomic data allowed assembly of complete plastomes as well as partial rDNA operons, which when combined fully resolve relationships among the major species complexes in Corallorhiza. Based on complete plastomes, Corallorhiza can be said to span the “early” stages of plastid genome degradation as a result of relaxed selective pressures on photosynthetic function. Corallorhiza is useful in providing a model system to understand the earliest stages in the transition to a “minimal plastome,” as observed in other heterotrophic plants (e.g., Orobanchaceae, other orchids like Rhizanthella). There have been at least two independent transitions to strict heterotrophy from partial heterotrophy in the genus (Co. maculata and Co. striata complexes). Nongreen coralroots also display evidence of relaxed selective pressures on photosynthesis-related genes that have retained open reading frames. Members of the atp complex display relaxed purifying selection among Corallorhiza as a whole, but more specifically display evidence for positive selection in nongreen, strictly heterotrophic taxa, although for reasons currently unknown.

Generally, there is a clear path of plastid genome degradation in the heterotrophic angiosperm lineages studied thus far (with some deviations from the overall model, e.g., some Cuscuta). Moreover, it can be hypothesized that there is a stepwise progression to a “minimal” plastome, and Corallorhiza occupies categories 2 (partially heterotrophic taxa, which display degradation of the ndh complex only and a few other pseudogenes) and 3 (strictly heterotrophic taxa, including degradation of the ndh complex, photosynthesis-related genes, and rpo genes [excluding atp genes]) in a modified model of the degree of plastome degradation among angiosperms. Future studies will include 1) deeper sampling of complete plastomes within species complexes of Corallorhiza to gain a phylogeographic perspective on plastome degradation, 2) direct assessment of the relative photosynthetic capabilities of members of each coralroot taxon in the field, and 3) quantification of transcript levels for plastid-encoded genes within each species to gain a clearer understanding of gene expression, and to test the hypothesis of the existence of duplicate functional copies of pseudogenized or deleted plastid genes.

Materials and Methods

Plant Material, DNA Isolation, Library Preparation, and PE Sequencing

Plant material was collected and preserved either in silica gel or frozen at −80 °C. Total genomic DNAs were extracted using the cetyltrimethylammonium bromide (CTAB) method (Doyle JJ and Doyle JL 1987) from 1.0 g frozen or 0.2–0.5 g silica-dried material and quantified using a NanoDrop 2000 spectrophotometer (Thermo Scientific, Waltham, MA). Samples with total DNA concentrations > 100 ng/μl were selected and run on a 1.5% agarose gel to check for high-molecular-weight, nondegraded DNA. Samples were shipped on ice to Cold Spring Harbor Laboratory (CSHL; Woodbury, NY), where they were sheared to approximately 450 bp, followed by library preparation and barcoding for PE Illumina sequencing. The exception was Co. odontorhiza, for which library preparation and sequencing were completed at the University of Missouri–Columbia using the Illumina TruSeq Library Preparation Kit (Illumina Inc., San Diego, CA). Co. odontorhiza DNAs were sheared to approximately 320 bp. All samples were run on Illumina HiSeq 2000 machines to generate 100-bp PE reads. Indexed samples were pooled and run with accessions from other projects, with a total of 40 or 20 accessions in a single lane on the CSHL and University of Missouri machines, respectively. Voucher information and characteristics of each of the resulting data sets and completed plastomes are listed in table 1.

Quantification of Chlorophyll Content

Approximately 0.1 g of −80 °C frozen tissue was harvested from the outer wall of the developing ovary and weighed for chlorophyll quantification. Ovary tissue was chosen 1) because this is the only organ that is visibly green in some taxa (e.g., Co. odontorhiza, Co. wisteriana) and 2) to be consistent between taxa. Chlorophylls were extracted in 80% ice-cold acetone and stored at −80 °C. Chlorophylls a and b were quantified on a Cary 4000 UV-VIS spectrophotometer (Agilent Technologies Inc., Santa Clara, CA; Department of Plant Cellular & Molecular Biology, Ohio State University), and their concentrations were calculated using formulas of Porra (2002) (Lima D, personal communication). Data from chlorophylls a and b were pooled to give total chlorophylls. Although individual accessions used to quantify chlorophyll concentration among species were not the same as those used for sequencing, multiple individuals from across each species’ geographic range were included to account for variation in chlorophyll content among individuals within species. Corallorhiza bentleyi (an endangered member of the Co. striata complex) is included here but was not sequenced. Conversely, Mexican-endemic members of the Co. maculata complex (Co. macrantha, Co. bulbosa, Co. maculata var. mexicana) were not included in the chlorophyll analysis due to rarity and thus lack of fresh/frozen material. Voucher specimens for plants used for chlorophyll analysis are listed in supplementary figure S5, Supplementary Material online.
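The calculation from absorbance to chlorophyll concentration can be sketched as follows. The text cites Porra (2002) without stating which equations or wavelengths were applied, so the coefficients below (the widely used simultaneous equations for buffered 80% acetone, with absorbances read at 663.6 and 646.6 nm) are an assumption, as are the function names and the normalization to tissue mass.

```python
# Minimal sketch, assuming Porra-type simultaneous equations for chlorophylls
# in 80% acetone. Coefficients and wavelengths are assumptions; the paper does
# not list them. Results are in micrograms per ml of extract.

def chlorophyll_ug_per_ml(a663_6, a646_6):
    """Return (chl a, chl b, total) from baseline-corrected absorbances."""
    chl_a = 12.25 * a663_6 - 2.55 * a646_6
    chl_b = 20.31 * a646_6 - 4.91 * a663_6
    return chl_a, chl_b, chl_a + chl_b


def total_chlorophyll_per_g(a663_6, a646_6, extract_volume_ml, tissue_mass_g):
    """Pool chlorophylls a and b and normalize to fresh tissue mass (ug/g)."""
    _, _, total = chlorophyll_ug_per_ml(a663_6, a646_6)
    return total * extract_volume_ml / tissue_mass_g


chl_a, chl_b, total = chlorophyll_ug_per_ml(0.5, 0.2)
print(round(chl_a, 3), round(chl_b, 3), round(total, 3))
```

Pooling a and b before normalization mirrors the statement that data from the two chlorophylls were combined into total chlorophyll.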


Barrett et al doi101093molbevmsu252 MBE by guest on July 7 2016


Data Processing

The resulting paired FASTQ data were processed using a custom UNIX script (“process_fastq.sh”) that calls several freely available Perl scripts for the purposes of 1) quality filtering/trimming and adaptor removal (“IlluQC_PRLL.pl,” “TrimmingReads.pl”; Patel and Jain 2012), 2) “shuffling” PE files into a single file (“ShuffleSequences_fastq.pl,” part of the VELVET suite of tools [E. Cabot, University of Wisconsin-Madison; Zerbino and Birney 2008]), and 3) conversion to FASTA format (“fastq2fasta.pl,” B. Knaus 2009 [http://brianknaus.com/software/srtoolbox, last accessed May 1, 2014]). Reads were trimmed from their 3′-ends, discarding bases under a quality threshold of PHRED score 20 (corresponding to an error rate of 1/100 or greater), and also discarding trimmed reads below 61 bp in length.
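The trimming rule described above can be illustrated with a short sketch. This is not the Perl toolkit used in the study, and its exact trimming behavior may differ; the function names are hypothetical, and the rule implemented here (strip trailing bases until one reaches the quality threshold, then apply the length filter) is one plain reading of the text.

```python
# Illustrative 3'-end quality trimming with a PHRED threshold of 20 and a
# minimum retained length of 61 bp. Function names are assumptions.

def phred_to_error(q):
    """Convert a PHRED score to the expected base-call error probability."""
    return 10 ** (-q / 10)


def trim_3prime(seq, quals, min_q=20, min_len=61):
    """Trim low-quality bases from the 3' end; return None if too short."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    if end < min_len:
        return None  # read discarded by the length filter
    return seq[:end], quals[:end]


read = "A" * 70
quals = [35] * 65 + [10] * 5          # last five bases fall below Q20
trimmed = trim_3prime(read, quals)
print(len(trimmed[0]), phred_to_error(20))
```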

Read Mapping

PE reads were then mapped five times iteratively in Geneious v6 (created by Biomatters Inc., Auckland, New Zealand; http://www.geneious.com, last accessed December 1, 2013), using automatically determined settings, to a combined file of 1) the Phalaenopsis aphrodite plastome (GenBank accession number NC_007499; Chang et al. 2006), 2) the Phoenix dactylifera mitochondrial genome (NC_016740; Fang et al. 2012), and 3) the Asclepias syriaca nuclear ribosomal DNA consensus operon (JF312046; Straub et al. 2011). The purpose of read mapping was to filter or “screen” reads that matched the reference genomes and to get initial, crude estimates of coverage to be used in downstream de novo assemblies.

Plastome Assembly

Mean coverage estimates from Geneious were used as input for de novo plastome assembly (“-exp_cov” parameter). De novo assembly was completed using both the VELVET v1.2.10 assembler (Zerbino and Birney 2008) and the Geneious v6 de novo assembler. VELVET assemblies were generated using a variety of hash lengths (i.e., “word match” or “kmer length”) ranging from 51 to 81 for each complete data set, with specified expected coverage estimates taken from Geneious mapping (specified to be in approximate kmer coverage; see Zerbino and Birney 2008). Coverage cutoff values were input as roughly one-half of the specified expected coverage estimates, minimum contig length was set to 300 bp, and insert size was set to 450 bp (320 bp for Co. odontorhiza), and the resulting contigs were assembled in Sequencher v6 (GeneCodes, Ann Arbor, MI). The resulting “draft de novo assembly” from VELVET was used as a reference to map reads in Geneious as described above (five iterations, automatically determined settings). Read mappings were visually scanned in Geneious to check for areas of low coverage or potential misassemblies. Read pairs that mapped successfully to the draft reference were used in a second round of de novo assembly, this time in Geneious, with automatically determined settings, saving 100 consensus contigs. This method allowed construction of complete draft plastomes with relative ease.
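Because the expected coverage is specified in kmer rather than nucleotide terms, a conversion is needed for each hash length. A minimal sketch follows, using the relation between nucleotide coverage C and kmer coverage Ck given in the VELVET documentation, Ck = C(L − k + 1)/L for read length L. The step of 10 between hash lengths, the example coverage of 200×, and all function and key names are assumptions for illustration; the text only states that hash lengths ranged from 51 to 81.

```python
# Sketch: derive per-k assembly settings from a mapped nucleotide coverage.
# The coverage value, k step, and dictionary keys are hypothetical.

def kmer_coverage(nucleotide_coverage, read_length, k):
    """Convert nucleotide coverage to approximate k-mer coverage."""
    return nucleotide_coverage * (read_length - k + 1) / read_length


def assembly_settings(nucleotide_coverage, read_length=100, insert_size=450,
                      k_values=range(51, 82, 10)):
    """Build one settings record per hash length."""
    settings = []
    for k in k_values:
        exp_cov = kmer_coverage(nucleotide_coverage, read_length, k)
        settings.append({
            "k": k,
            "exp_cov": exp_cov,
            "cov_cutoff": exp_cov / 2,   # roughly one-half of expected coverage
            "min_contig_length": 300,
            "insert_size": insert_size,
        })
    return settings


for record in assembly_settings(200):
    print(record)
```

With 100-bp reads, the longest hash lengths leave few kmers per read, so the kmer coverage drops well below the nucleotide coverage; this is why the two must not be confused when setting the cutoff.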

Contigs from both assemblies (Geneious, VELVET) were merged in Sequencher to form complete plastomes (parameters = “dirty data” or “large gap,” minimum overlap = 20 bp, minimum similarity 70–90%). Any apparent “gaps” among contigs resulting from one software platform were usually covered by a contig from the other, allowing assembly of completed plastomes (Large Single Copy–Inverted Repeat B–Small Single Copy). In a few cases, VELVET scaffolds contained strings of Ns between 30 and 100 bp, mostly associated with homopolymer repeats (> 10 bp). These were resolved by 1) the Geneious de novo assembly (typically) and 2) UNIX “grep” searches, with flanking sequences of 30–50 bp taken from each end of the string of Ns used as search terms. Grep results were converted to FASTA format using a custom Perl script to build a “bridge” of reads. This was done iteratively until the region in question was resolved, although most cases were resolved with a single iteration. Grep searches were applied to all cases of ambiguity between contigs from the two de novo assembly methods as a quality control. Polymerase chain reaction amplification and Sanger sequencing were not necessary due to high read coverage (table 1).
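The flank-based read search used to bridge strings of Ns can be sketched in a few lines. This is a hypothetical Python stand-in for the grep and Perl steps, not the scripts used in the study; the function names and the toy 10-bp flank are introduced here for illustration (the text describes flanks of 30–50 bp).

```python
# Sketch: collect reads containing a flanking sequence (in either strand),
# orient them to the flank, and report the bases that extend past it.

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def reads_matching_flank(reads, flank):
    """Return reads that contain the flank, oriented to the flank's strand."""
    flank_rc = revcomp(flank)
    hits = []
    for read in reads:
        if flank in read:
            hits.append(read)
        elif flank_rc in read:
            hits.append(revcomp(read))
    return hits


def extensions(hits, flank):
    """Bases found immediately downstream of the flank in each oriented read."""
    return [hit[hit.index(flank) + len(flank):] for hit in hits]


flank = "GATTACAGGC"
reads = ["AAGATTACAGGCTTTTT", revcomp("CCGATTACAGGCTTTGG"), "CCCCCCCCCC"]
hits = reads_matching_flank(reads, flank)
print(extensions(hits, flank))
```

Repeating the search with the end of the newly recovered sequence as the next flank corresponds to the iterative bridging described in the text.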

IR boundaries for each draft plastome were determined by using the first and last approximately 20 bp of the draft plastome as search terms (reverse complement) in Sequencher to find matches, and marking their match points as features. Then the first approximately 700 bp and last approximately 1–2,000 bp of the draft plastome were copied, reverse complemented, and manually aligned to the IR, using the marked features from above as reference points. IR boundaries were determined informatically using PE information (i.e., read pairs that spanned the IR boundaries). Because the orientation of the ends of the LSC and SSC can vary, the plastomes were often in different configurations; single-copy regions were reoriented to correspond to the “standard” model of gene order to simplify downstream comparison. Special attention was paid to the possibility of significant modification or even loss of the IR (either partial or complete), and also to the possibility of other genomic rearrangements or structural modifications. In other words, when building draft plastomes, it was not assumed that all accessions displayed the “standard” plastome structure found in angiosperms (i.e., LSC, IRb, SSC, IRa).

Plastome Annotation

Plastid genomes were annotated by first aligning the set of coding loci (coding sequences [CDS]) containing introns and all tRNAs from Phalaenopsis to each draft plastome (minus IRa) in Sequencher to aid in intron/exon and tRNA boundary determination. Annotations were conducted in DOGMA (Wyman et al. 2004), which uses BLASTX searches against a database of annotated plastomes and amino acid translations, and allows manual determination of start/stop codons. DOGMA BLASTX parameters were as follows: genetic code = plant plastid, e value = 5, percent cutoff for protein-coding genes = 55, and percent cutoff for tRNAs = 80. The resulting GenBank feature table was exported from the DOGMA server and imported, along with the finished plastid FASTA file, into SEQUIN (National Center for Biotechnology Information), where the annotation was validated and completed for submission to GenBank. A large inversion in Co. maculata var. maculata was manually reversed for the purposes of phylogenetic analyses and other whole-plastome comparisons, but the plastome for that accession was annotated with the inversion as assembled de novo. Plastome maps were drawn with GenomeVx (Conant and Wolfe 2008) and manually edited in Adobe Illustrator CS6 (Adobe Inc., San Jose, CA).

Ribosomal DNA Assembly and Annotation

Nuclear ribosomal DNA reads were filtered by mapping reads from Co. odontorhiza to a nearly complete rDNA operon reference (partial ETS [External Transcribed Spacer], 18S rDNA, ITS1 [Internal Transcribed Spacer], 5.8S rDNA, ITS2, 26S rDNA) from A. syriaca, described above, as well as mapping to an ITS-spanning sequence from Co. odontorhiza from GenBank (accession number EU391326). Other regions were not assembled (e.g., 5S rDNA). Resulting reads that mapped to each reference were assembled de novo in Geneious, and resulting contigs were merged through overlap in Sequencher. This draft assembly was used for a second round of read mapping (original Co. odontorhiza reads) and de novo assembly in Geneious, resulting in a single consensus contig. The consensus contig was based on a 50% majority per site, bearing in mind that the rDNA operon exists in numerous repeated copies that often differ from one another, appearing as polymorphism in read pileups. The Co. odontorhiza rDNA consensus operon was then used as a reference in read mapping for all other Corallorhiza species, followed by individual de novo assemblies as above.

Nearly complete ribosomal DNA operons (partial ETS, 18S rDNA, ITS1, 5.8S, ITS2, 26S rDNA, excluding 5S rDNA) were annotated by first aligning finished contigs in Geneious (using the MUSCLE v3.8.31 plugin in Geneious; Edgar 2004) and transferring the annotation features from Co. odontorhiza to the aligned Corallorhiza sequences. Then the entire alignment was annotated in SEQUIN for submission to GenBank by adding each feature to Co. odontorhiza and subsequently propagating the features to the remaining sequences. Reads were again mapped in Geneious to each finished, annotated rDNA cistron to check for misassemblies. GenBank accession numbers for rDNA partial operons are KM390003–KM390012. Available outgroup sequences for Cymbidium, Oncidium, and Phalaenopsis were mined from GenBank for the purposes of downstream phylogenetic analyses (18S rDNA: Cy. farberi, GenBank accession number JN418934; ITS: Cy. aloifolium, JF729014; Phalaenopsis aphrodite, AY391543; O. andradeanum, FJ565598).

Data Matrix Assembly and Sequence Alignment

Annotated plastomes were imported into Geneious, with subsequent removal of the second copy of the IR (IRa), and aligned with MAFFT v7.157 (Katoh et al. 2002) using the “fast-progressive” method (gap opening penalty = 6, offset = 0.1), followed by manual adjustment. The alignment also included complete plastomes of Cy. aloifolium (GenBank accession number NC_021429), Oncidium Gower Ramsey (GenBank accession number NC_014056), and Phalaenopsis aphrodite (GenBank accession number NC_017049). A short inversion of the petN-psbM region in Cymbidium was removed for that accession in downstream analyses. Annotated CDS were exported from Geneious, and data from each locus were saved as separate FASTA files. Each file was submitted for codon-based alignment to the MACSE web server (Ranwez et al. 2011; http://mbb.univ-montp2.fr/MBB/subsection/softExec.php?soft=macse, last accessed May 30, 2014), which allows alignment of both intact open reading frames and incomplete sequences (including pseudogenes with modified reading frames). Alignments were adjusted manually when necessary and imported back into Geneious or SequenceMatrix (Vaidya et al. 2011), where they were concatenated into larger matrices. Protein-coding genes that contained pseudogenes (shifted reading frames, internal stop codons) in at least one taxon were excluded. Data matrices are deposited in supplementary fig. S6, Supplementary Material online.

Phylogenetic Analyses

Phylogenetic analyses were conducted using both parsimony and ML. Parsimony searches were conducted in TNT (Goloboff et al. 2003, 2008) with 100 random addition starting sequences, saving 100 trees per replicate, and using tree bisection–reconnection branch swapping on the pool of saved trees. Support was assessed through 2,000 jackknife pseudoreplicates with the above search parameters. Model fit for ML analyses was assessed for the complete plastome alignment in MEGA v6 (Tamura et al. 2013), which yielded GTR+G as the best-fit model based on the corrected Akaike Information Criterion (AICc) metric (Burnham and Anderson 2002). All ML analyses were conducted in RAxML v8.0.20 (Stamatakis 2006). Three separate searches were initiated for each matrix analyzed, from different random starting seeds, using an unpartitioned GTR+G model to ensure convergence on the same topology. A codon-partitioned model was not chosen due to the possibility of relaxed functional constraints among some coding regions for some taxa, which could possibly represent model violations. Support was assessed with 1,000 standard bootstrap pseudoreplicates. Cymbidium was chosen as the outgroup taxon.
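For readers unfamiliar with the AICc metric used in model selection, the standard small-sample correction can be written out as below. This is a generic sketch of the formula, not output from MEGA; the function name and the example log-likelihoods, parameter counts, and sample size are hypothetical, and what counts as the sample size n (e.g., number of alignment sites) is an assumption that depends on the software.

```python
# Sketch of AICc: AIC = 2k - 2 lnL, plus the correction 2k(k + 1)/(n - k - 1).
# All numeric inputs in the example are made up for illustration.

def aicc(log_likelihood, n_params, n_samples):
    """Corrected Akaike Information Criterion; lower values indicate better fit."""
    if n_samples - n_params - 1 <= 0:
        raise ValueError("n_samples must exceed n_params + 1")
    aic = 2 * n_params - 2 * log_likelihood
    return aic + (2 * n_params * (n_params + 1)) / (n_samples - n_params - 1)


# Compare two hypothetical candidate models fitted to the same alignment.
candidates = {"simple": (-1050.0, 5), "complex": (-1030.0, 10)}
scores = {name: aicc(lnl, k, n_samples=2000) for name, (lnl, k) in candidates.items()}
print(min(scores, key=scores.get), scores)
```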

Plastid Pseudogenes, Genomic Deletions, and Genomic Attributes

The presence of pseudogenes and whole gene/regional deletions was initially assessed visually in DOGMA by identifying interrupted BLASTX reading frames. All coding loci aligned in MACSE (see above) were imported into Geneious, and sequences were translated to assess pseudogene status through the presence of frame shifts and/or premature stop codons. Loci with interrupted reading frames were considered to be pseudogenes unless there was an alternative start or stop codon within three codon positions of the other taxa in the alignment. In addition, the alignment of whole annotated plastomes (described above) was used to visualize any larger genomic deletions spanning CDS boundaries, or rearrangements.
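The translation-based screen described above can be approximated with a short sketch. It is not the Geneious workflow used in the study: the function names are hypothetical, the check works on a single ungapped CDS rather than an alignment, and it omits the rule about alternative start or stop codons within three codon positions as well as any RNA editing. The codon table is the standard one, whose amino acid assignments match those of the plant plastid code.

```python
# Sketch: flag a CDS as a candidate pseudogene if its length is not a multiple
# of three or if a stop codon occurs before the final codon.

_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def translate(cds):
    """Translate complete codons; unknown codons become 'X'."""
    cds = cds.upper()
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X")
                   for i in range(0, len(cds) - len(cds) % 3, 3))


def candidate_pseudogene(cds):
    """True if the reading frame is shifted or interrupted by a premature stop."""
    if len(cds) % 3 != 0:
        return True
    protein = translate(cds)
    return "*" in protein[:-1]


print(translate("ATGAAATAA"), candidate_pseudogene("ATGAAATAA"))
print(candidate_pseudogene("ATGTAAAAATAA"))
```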


Hypotheses of differences in plastid genomic attributes among nongreen versus green taxa, and correlations between chlorophyll content and genomic attributes, were assessed among taxa using two phylogeny-based analytical approaches. First, a Phylogenetic Analysis of Variance (Garland et al. 1993) was used to test for differences in continuous plastome variables (e.g., length, GC content) among green versus nongreen taxa (i.e., a discrete grouping variable), taking phylogenetic relationships into account. Analyses were run in the R package “geiger” (Harmon et al. 2008) with 10,000 simulations each, using the “aov.phylo” function, based on an ultrametric tree generated through Nonparametric Rate Smoothing (Sanderson 1997). These analyses were run for a complete data set of 13 taxa (three leafy, green outgroup taxa and ten species of Corallorhiza). A second set of analyses was conducted on a subset of six taxa of Corallorhiza for which chlorophyll data were available, to mitigate problems with missing chlorophyll data. In these analyses, continuous genomic variables (length, number of putatively functional genes, GC content) were correlated with chlorophyll content (continuous) using Phylogenetically Independent Contrasts to account for phylogenetic relatedness among taxa (Felsenstein 1985). The latter analyses were conducted under a Brownian motion model with the PDAP package (Midford et al. 2005) in Mesquite v2.75 (Maddison WP and Maddison DR 2011).

Model-Based Analyses of Selective Regime

Changes in selective regime among species of Corallorhiza for codon-based matrices were assessed with the CODEML module in PAML v4.8 (Yang 1997, 2007) using “branch” models. Only loci for which all taxa had open reading frames were included in the analyses. The various matrices are listed in table 3. The objective was to test hypotheses of different $\omega$ ratios for 1) Corallorhiza versus leafy outgroups and 2) nongreen Corallorhiza versus green Corallorhiza + leafy outgroups. To accomplish this, different branch models were fit to the data using the topology resulting from the ML analysis of whole plastomes + rDNA. Six different branch models were tested. First, a model in which all branches evolve under one $\omega$ ratio (Model M0) was tested, where $\omega = d_N/d_S$, the ratio of nonsynonymous substitutions per nonsynonymous site to synonymous substitutions per synonymous site. A two-ratio model (M1) was then run in which all branches of Corallorhiza were allowed to have a different ratio ($\omega_{\mathrm{cor}}$) than the remaining leafy outgroup sequences ($\omega_{\mathrm{leafy}}$). A three-ratio model (M2) was then run, allowing separate values for $\omega_{\mathrm{leafy}}$, $\omega_{\mathrm{cor}}$, and nongreen Corallorhiza ($\omega_{\mathrm{str}} = \omega_{\mathrm{mac\text{-}ng}}$; Co. striata, Co. mertensiana, and Co. maculata var. maculata and var. occidentalis), followed by a four-ratio model (M3) with additionally separate ratios for Co. striata ($\omega_{\mathrm{str}}$) and the nongreen members of the Co. maculata complex ($\omega_{\mathrm{mac\text{-}ng}}$). The latter three models (M1–M3) specifically test hypotheses of whether Corallorhiza, and particularly nongreen Corallorhiza, display departures from purifying selection. Two additional models were run to test whether the nongreen Corallorhiza differ significantly from green Corallorhiza and leafy outgroup taxa. Specifically, Model M4 (two ratios) sets $\omega_{\mathrm{leafy}} = \omega_{\mathrm{cor}}$ and $\omega_{\mathrm{str}} = \omega_{\mathrm{mac\text{-}ng}}$, whereas M5 (three ratios) allows $\omega_{\mathrm{str}}$ and $\omega_{\mathrm{mac\text{-}ng}}$ to differ. Successively nested models were compared based on log-likelihood ratio tests (M1 vs. M0, M2 vs. M1, M3 vs. M2, M4 vs. M0, M5 vs. M4), where the likelihood ratio test statistic was calculated as two times the difference in log-likelihood between the two models, with degrees of freedom calculated as the difference in the number of free parameters for the models, and tested against the $\chi^2$ distribution. P values for multiple comparisons to the same data set were Bonferroni-corrected (Dunn 1959), as suggested by Yang (2007), with the corrected significance level corresponding to $\alpha/m$, where $m$ is the number of branches being tested. Genes ycf1 and ycf2 were analyzed separately because they have extraordinarily high substitution rates (e.g., Barrett et al. 2013), as was matK, based on the finding of a pseudogenized copy in a previous study (Freudenstein and Senyo 2008).
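The comparison of nested models can be made concrete with a small sketch. It is not part of the PAML workflow; the function names, the example log-likelihoods, and the choice of three tested branches are hypothetical. The chi-square upper-tail probability is computed from the standard closed forms for integer degrees of freedom, so no external statistics library is assumed.

```python
# Sketch: likelihood ratio test between nested models with a Bonferroni-
# corrected significance level (alpha / m). Example values are made up.
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        q, d = math.exp(-half), 2
    else:
        q, d = math.erfc(math.sqrt(half)), 1
    while d < df:  # step up two degrees of freedom at a time
        q += half ** (d / 2) * math.exp(-half) / math.gamma(d / 2 + 1)
        d += 2
    return q


def likelihood_ratio_test(lnl_null, lnl_alt, df, alpha=0.05, m=1):
    """Return (test statistic, P value, significant after Bonferroni?)."""
    statistic = 2.0 * (lnl_alt - lnl_null)
    p_value = chi2_sf(statistic, df)
    return statistic, p_value, p_value < alpha / m


# Hypothetical comparison of a two-ratio model against the one-ratio model.
print(likelihood_ratio_test(lnl_null=-1000.0, lnl_alt=-995.0, df=1, m=3))
```

Dividing the significance level by the number of tests is equivalent to multiplying each P value by that number; the former is used here to match the wording of the text.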

To further investigate the $\omega$ values among branches for the atp complex, two sets of branch-site models were tested against the data. In the first, all Corallorhiza were set as the foreground branch, and in the second, nongreen Corallorhiza (Co. striata var. vreelandii, Co. maculata var. maculata and var. occidentalis, and Co. mertensiana) were set as foreground branches. For both sets of branch-site models, two alternative models were tested. The “null” model specifies $0 < \omega_0 < 1$ for a proportion of sites and $\omega_1 = \omega_2 = 1$ for the remaining sites (no positive selection), whereas the alternative model allows $\omega_2 > 1$ (positive selection on some sites). The alternative model (some sites under positive selection) was then tested against the null model (no sites under positive selection) using a $\chi^2$ distribution as above, with 1 degree of freedom (i.e., the alternative and null models differ by one parameter).

Comparison with Other Parasitic Angiosperm Lineages

Plastid gene content (losses, pseudogenes, putatively functional genes) was compared among diverse lineages of angiosperms containing heterotrophs, including newly sequenced Corallorhiza. Taxa and genes were sorted by the numbers of putatively functional genes and gene functional class, respectively, to illustrate the different stages of plastid genome degradation. Specifically, genes were arranged according to the model of Barrett and Davis (2012), in which degradation of plastid functional gene classes is hypothesized to have occurred roughly in the following order: 1) NAD(P)H dehydrogenase genes (ndh); 2) “photosynthesis”-related genes, including pet, psa, psb, cemA, ccsA, rbcL, ycf3, and ycf4; 3) plastid-encoded RNA polymerase genes (rpo); 4) ATP synthase genes (atp); and finally 5) “housekeeping” genes (rpl, rps, rrn, trn, matK, clpP, infA, accD, ycf1, and ycf2). Though not a formal test in the strict sense, the use of this model does allow an assessment of how advanced the sequenced accessions of Corallorhiza are when compared with other heterotrophic angiosperms; it is hypothesized that Corallorhiza occupies the early stages of this model. Technically, atp genes function in the generation of ATP and movement of protons across the thylakoid membrane, and thus play a crucial role in photosynthesis, but this functional class was kept separate from photosynthesis genes due to the possibility that ATP synthase serves roles outside of photosynthesis.

Supplementary Material

Supplementary figures S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by the National Science Foundation (awards DEB-0830020 to Jerrold I. Davis and DBI 1110443 to J.C.P.), the California State University Program for Education and Research in Biotechnology (CSUPERB) to C.F.B., the California State University, Los Angeles Center for Effective Teaching and Learning Faculty-Student Mentorship award to C.F.B. and J.L., and the California State University Council on Ocean Affairs, Science, and Technology to C.S. The authors are grateful to Jerrold Davis, Patrick Edger, Ryan Ellingson, Gwynne Lim, Jeffrey Morawetz, and Chris Randle for feedback and laboratory/bioinformatics support; the staff at Cold Spring Harbor Laboratory (Woodbury, NY) for sample preparation and sequencing; Daniel Lima and Richard Sayre for assistance with chlorophyll spectroscopy and data interpretation; Gerardo Salazar (UNAM) for assistance in collecting Mexican specimens; and Maria Logacheva and three anonymous reviewers for critical insight that improved the paper.

References

Allen JF, de Paula WBM, Puthiyaveetil S, Nield J. 2011. A structural phylogenetic map for chloroplast photosynthesis. Trends Plant Sci. 16:645–655.

Barbrook AC, Howe CJ, Purton S. 2006. Why are plastid genomes retained in non-photosynthetic organisms? Trends Plant Sci. 11:101–108.

Barrett CF, Davis JI. 2012. The plastid genome of the mycoheterotrophic Corallorhiza striata (Orchidaceae) is in the relatively early stages of degradation. Am J Bot. 99:1513–1523.

Barrett CF, Davis JI, Leebens-Mack J, Conran JG, Stevenson DW. 2013. Plastid genomes and deep relationships among the commelinid monocot angiosperms. Cladistics 29:65–87.

Barrett CF, Freudenstein JV. 2008. Molecular evolution of rbcL in the mycoheterotrophic coralroot orchids (Corallorhiza Gagnebin, Orchidaceae). Mol Phylogenet Evol. 47:665–679.

Bellino A, Alfani A, Selosse M-A, Guerrieri R, Borghetti M, Baldantoni D. 2014. Nutritional regulation in mixotrophic plants: new insights from Limodorum abortivum. Oecologia 175:875–885.

Bernardi G. 1989. The isochore organization of the human genome. Annu Rev Genet. 23:637–661.

Bidartondo MI. 2005. The evolutionary ecology of myco-heterotrophy. New Phytol. 167:335–352.

Blazier J, Guisinger MM, Jansen RK. 2011. Recent loss of plastid-encoded ndh genes within Erodium (Geraniaceae). Plant Mol Biol. 76:263–272.

Braukmann T, Kuzmina M, Stefanovic S. 2013. Plastid genome evolution across the genus Cuscuta (Convolvulaceae): two clades within subgenus Grammica exhibit extensive gene loss. J Exp Bot. 64:977–989.

Braukmann T, Stefanovic S. 2012. Plastid genome evolution in mycoheterotrophic Ericaceae. Plant Mol Biol. 79:5–20.

Braukmann TWA, Kuzmina M, Stefanovic S. 2009. Loss of all plastid ndh genes in Gnetales and conifers: extent and evolutionary significance for the seed plant phylogeny. Curr Genet. 55:323–337.

Burnham KP, Anderson DR. 2002. Avoiding pitfalls when using information-theoretic methods. J Wildl Manage. 66:912–918.

Cameron DD, Preiss K, Gebauer G, Read DJ. 2009. The chlorophyll-containing orchid Corallorhiza trifida derives little carbon through photosynthesis. New Phytol. 183:358–364.

Chang CC, Lin HC, Lin IP, Chow TY, Chen HH, Chen WH, Cheng CH, Lin CY, Liu SM, Chang CC, et al. 2006. The chloroplast genome of Phalaenopsis aphrodite (Orchidaceae): comparative analysis of evolutionary rate with that of grasses and its phylogenetic implications. Mol Biol Evol. 23:279–291.

Conant GC, Wolfe KH. 2008. GenomeVx: simple web-based creation of editable circular chromosome maps. Bioinformatics 24:861–862.

Cummings MP, Welschmeyer NA. 1998. Pigment composition of putatively achlorophyllous angiosperms. Plant Syst Evol. 210:105–111.

Delannoy E, Fujii S, Colas des Francs-Small C, Brundrett M, Small I. 2011. Rampant gene loss in the underground orchid Rhizanthella gardneri highlights evolutionary constraints on plastid genomes. Mol Biol Evol. 28:2077–2086.

Doyle JJ, Doyle JL. 1987. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem Bull. 19:11–15.

Dunn OJ 1959 Confidence intervals for the means of dependent nor-mally distributed variables J Am Stat Assoc 54613ndash621

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32:1792–1797.

Evron Y, Johnson EA, McCarty RE. 2000. Regulation of proton flow and ATP synthesis in chloroplasts. J Bioenerg Biomembr 32:501–506.

Fang YJ, Wu H, Zhang TW, Yang M, Yin YX, Pan LL, Yu XG, Zhang XW, Hu SNA, Al-Mssallem IS, et al. 2012. A complete sequence and transcriptomic analyses of date palm (Phoenix dactylifera L.) mitochondrial genome. PLoS One 7:e37164.

Felsenstein J. 1985. Phylogenies and the comparative method. Am Nat 125:1–15.

Freudenstein JV. 1992. Systematics of Corallorhiza and the Corallorhizinae (Orchidaceae) [PhD dissertation]. [Ithaca (NY)]: Cornell University.

Freudenstein JV. 1997. A monograph of Corallorhiza (Orchidaceae). Harv Pap Bot 10:5–51.

Freudenstein JV, Barrett CF. 2010. Mycoheterotrophy and diversity in Orchidaceae. In: Seberg O, Petersen G, Barfod A, Davis JI, editors. Diversity, phylogeny, and evolution in the monocotyledons. The Proceedings of the Fourth International Conference on Monocot Systematics. Aarhus (Denmark): Aarhus University Press. p. 25–37.

Freudenstein JV, Doyle JJ. 1994a. Character transformation and relationships in Corallorhiza (Orchidaceae: Epidendroideae). 1. Plastid DNA. Am J Bot 81:1449–1457.

Freudenstein JV, Doyle JJ. 1994b. Plastid DNA, morphological variation, and the phylogenetic species concept: the Corallorhiza maculata (Orchidaceae) complex. Syst Bot 19:273–290.

Freudenstein JV, Senyo DM. 2008. Relationships and evolution of matK in a group of leafless orchids (Corallorhiza and Corallorhizinae; Orchidaceae: Epidendroideae). Am J Bot 95:498–505.

Funk HT, Berg S, Krupinska K, Maier UG, Krause K. 2007. Complete DNA sequences of the plastid genomes of two parasitic flowering plant species, Cuscuta reflexa and Cuscuta gronovii. BMC Plant Biol 7:45.

Galagan JE, Calvo SE, Borkovich KA, Selker EU, Read ND, Jaffe D, FitzHugh W, Ma LJ, Smirnov S, Purcell S, et al. 2003. The genome sequence of the filamentous fungus Neurospora crassa. Nature 422:859–868.

Garland T, Dickerman AW, Janis CM, Jones JA. 1993. Phylogenetic analysis of covariance by computer simulation. Syst Biol 42:265–292.

Glemin S, Clement Y, David J, Ressayre A. 2014. GC content evolution in coding regions of angiosperm genomes: a unifying hypothesis. Trends Genet 30:263–270.

Goloboff PA, Farris JS, Kallersjo M, Oxelman B, Ramirez MJ, Szumik CA. 2003. Improvements to resampling measures of group support. Cladistics 19:324–332.

Goloboff PA, Farris JS, Nixon KC. 2008. TNT, a free program for phylogenetic analysis. Cladistics 24:774–786.


Barrett et al. · doi:10.1093/molbev/msu252 · MBE


Hajdukiewicz PTJ, Allison LA, Maliga P. 1997. The two RNA polymerases encoded by the nuclear and the plastid compartments transcribe distinct groups of genes in tobacco plastids. EMBO J 16:4041–4048.

Harmon LJ, Weir JT, Brock CD, Glor RE, Challenger W. 2008. GEIGER: investigating evolutionary radiations. Bioinformatics 24:129–131.

Iles WJD, Smith SY, Graham SW. 2013. A well-supported phylogenetic framework for the monocot order Alismatales reveals multiple losses of the plastid NADH dehydrogenase complex and a strong long-branch effect. In: Wilkin P, Mayo S, editors. Early events in monocot evolution. Cambridge: Cambridge University Press. p. 1–28.

Iorizzo M, Grzebelus D, Senalik D, Szklarczyk M, Spooner D, Simon P. 2012. Against the traffic: the first evidence for mitochondrial DNA transfer into the plastid genome. Mob Genet Elements 2:261.

Jensen PE, Bassi R, Boekema EJ, Dekker JP, Jansson S, Leister D, Robinson C, Scheller HV. 2007. Structure, function and regulation of plant photosystem I. Biochim Biophys Acta 1767:335–352.

Katinka MD, Duprat S, Cornillot E, Metenier G, Thomarat F, Prensier G, Barbe V, Peyretaillade E, Brottier P, Wincker P, et al. 2001. Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature 414:450–453.

Katoh K, Misawa K, Kuma Ki, Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res 30:3059–3066.

Kawakami K, Umena Y, Iwai M, Kawabata Y, Ikeuchi M, Kamiya N, Shen J-R. 2011. Roles of PsbI and PsbM in photosystem II dimer formation and stability studied by deletion mutagenesis and X-ray crystallography. Biochim Biophys Acta 1807:319–325.

Knaus BJ. 2010. Short read toolbox. Available from: http://brianknaus.com

Krause K. 2008. From chloroplasts to “cryptic” plastids: evolution of plastid genomes in parasitic plants. Curr Genet 54:111–121.

Krause K. 2012. Plastid genomes of parasitic plants: a trail of reductions and losses. In: Bullerwell C, editor. Organelle genetics. Berlin (Germany): Springer. p. 79–103.

Kuijt J. 1969. The biology of flowering parasitic plants. Berkeley (CA): University of California Press.

Lawrence JG. 2005. Common themes in the genome strategies of pathogens. Curr Opin Genet Dev 15:584–588.

Leake JR. 1994. The biology of myco-heterotrophic (saprophytic) plants. New Phytol 127:171–216.

Leake JR, Cameron DD. 2010. Physiological ecology of mycoheterotrophy. New Phytol 185:601–605.

Liere K, Weihe A, Borner T. 2011. The transcription machineries of plant mitochondria and chloroplasts: composition, function, and regulation. J Plant Physiol 168:1345–1360.

Logacheva MD, Schelkunov MI, Nuraliev MS, Samigullin TH, Penin AA. 2014. The plastid genome of mycoheterotrophic monocot Petrosavia stellaris exhibits both gene losses and multiple rearrangements. Genome Biol Evol 6:238–246.

Logacheva MD, Schelkunov MI, Penin AA. 2011. Sequencing and analysis of plastid genome in mycoheterotrophic orchid Neottia nidus-avis. Genome Biol Evol 3:1296–1303.

Maddison WP, Maddison DR. 2011. Mesquite: a modular system for evolutionary analysis, version 2.75. Available from: http://mesquiteproject.org

Martín M, Funk HT, Serrot PH, Poltnigg P, Sabater B. 2009. Functional characterization of the thylakoid ndh complex phosphorylation by site-directed mutations in the ndhF gene. Biochim Biophys Acta 1787:920–928.

Martín M, Sabater B. 2010. Plastid ndh genes in plant evolution. Plant Physiol Biochem 48:636–645.

McCarty RE, Evron Y, Johnson EA. 2000. The chloroplast ATP synthase: a rotary enzyme. Annu Rev Plant Physiol Plant Mol Biol 51:83–109.

McNeal JR, Arumugunathan K, Kuehl JV, Boore JL, Depamphilis CW. 2007. Systematics and plastid genome evolution of the cryptically photosynthetic parasitic plant genus Cuscuta (Convolvulaceae). BMC Biol 5:55.

McNeal JR, Bennett JR, Wolfe AD, Mathews S. 2013. Phylogeny and origins of holoparasitism in Orobanchaceae. Am J Bot 100:971–983.

Merckx V, Freudenstein JV. 2010. Evolution of mycoheterotrophy in plants: a phylogenetic perspective. New Phytol 185:605–609.

Midford PE, Garland T Jr, Maddison WP. 2005. PDAP package of Mesquite, version 1.07. Available from: http://mesquiteproject.org

Molina J, Hazzouri KM, Nickrent D, Geisler M, Meyer RS, Pentony MM, Flowers JM, Pelser P, Barcelona J, Inovejas SA, et al. 2014. Possible loss of the chloroplast genome in the parasitic flowering plant Rafflesia lagascae (Rafflesiaceae). Mol Biol Evol 31:793–803.

Montfort C, Küsters E. 1940. Saprophytismus und photosynthese. I. Biochemische physiologische studien an humus-orchideen. Botanisches Archiv 40:571–633.

Moran NA. 2002. Microbial minimalism: genome reduction in bacterial pathogens. Cell 108:583–586.

Morrison HG, McArthur AG, Gillin FD, Aley SB, Adam RD, Olsen GJ, Best AA, Cande WZ, Chen F, Cipriano MJ, et al. 2007. Genomic minimalism in the early diverging intestinal parasite Giardia lamblia. Science 317:1921–1926.

Nickrent DL, Yan OY, Duff RJ, dePamphilis CW. 1997. Do nonasterid holoparasitic flowering plants have plastid genomes? Plant Mol Biol 34:717–729.

Patel RK, Jain M. 2012. NGS QC Toolkit: a toolkit for quality control of next generation sequencing data. PLoS One 7:e30619.

Peredo EL, King UM, Les DH. 2013. The plastid genome of Najas flexilis: adaptation to submersed environments is accompanied by the complete loss of the NDH complex in an aquatic angiosperm. PLoS One 8:e68591.

Porra RJ. 2002. The chequered history of the development and use of simultaneous equations for the accurate determination of chlorophylls a and b. Photosyn Res 73:149–156.

Protasio AV, Tsai IJ, Babbage A, Nichol S, Hunt M, Aslett MA, De Silva N, Velarde GS, Anderson TJC, Clark RC, et al. 2012. A systematically improved high quality genome and transcriptome of the human blood fluke Schistosoma mansoni. PLoS Negl Trop Dis 6:e1455.

Randle CP, Wolfe AD. 2005. The evolution and expression of rbcL in holoparasitic sister-genera Harveya and Hyobanche (Orobanchaceae). Am J Bot 92:1575–1585.

Ranwez V, Harispe S, Delsuc F, Douzery EJP. 2011. MACSE: Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons. PLoS One 6:e22594.

Sanderson MJ. 1997. A nonparametric approach to estimating divergence times in the absence of rate constancy. Mol Biol Evol 14:1218–1231.

Smith DR, Lee RW. 2014. A plastid without a genome: evidence from the nonphotosynthetic green algal genus Polytomella. Plant Physiol 164:1812–1819.

Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690.

Straub SCK, Cronn RC, Edwards C, Fishbein M, Liston A. 2013. Horizontal transfer of DNA from the mitochondrial to the plastid genome and its subsequent evolution in milkweeds (Apocynaceae). Genome Biol Evol 5:1872–1885.

Straub SCK, Fishbein M, Livshultz T, Foster Z, Parks M, Weitemier K, Cronn RC, Liston A. 2011. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing. BMC Genomics 12:211.

Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. 2013. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol 30:2725–2729.

Tsai IJ, Zarowiecki M, Holroyd N, Garciarrubio A, Sanchez-Flores A, Brooks KL, Tracey A, Bobes RJ, Fragoso G, Sciutto E, et al. 2013. The genomes of four tapeworm species reveal adaptations to parasitism. Nature 496:57–63.

Umate P, Schwenkert S, Karbat I, Dal Bosco C, Mlcochova L, Volz S, Zer H, Herrmann RG, Ohad I, Meurer J. 2007. Deletion of psbM in tobacco alters the Q(B) site properties and the electron flow within photosystem II. J Biol Chem 282:9758–9767.

Vaidya G, Lohman DJ, Meier R. 2011. SequenceMatrix: concatenation software for the fast assembly of multi-gene datasets with character set and codon information. Cladistics 27:171–180.

Westwood JH, Yoder JI, Timko MP, dePamphilis CW. 2010. The evolution of parasitism in plants. Trends Plant Sci 15:227–235.

Wicke S, Müller KF, de Pamphilis CW, Quandt D, Wickett NJ, Zhang Y, Renner SS, Schneeweiss GM. 2013. Mechanisms of functional and physical genome reduction in photosynthetic and nonphotosynthetic parasitic plants of the broomrape family. Plant Cell 25:3711–3725.

Wicke S, Schäferhoff B, dePamphilis CW, Müller KF. 2014. Disproportional plastome-wide increase of substitution rates and relaxed purifying selection in genes of carnivorous Lentibulariaceae. Mol Biol Evol 31:529–545.

Wicke S, Schneeweiss GM, dePamphilis CW, Müller KF, Quandt D. 2011. The evolution of the plastid chromosome in land plants: gene content, gene order, gene function. Plant Mol Biol 76:273–297.

Wickett NJ, Honaas LA, Wafula EK, Das M, Huang K, Wu BA, Landherr L, Timko MP, Yoder J, Westwood JH, et al. 2011. Transcriptomes of the parasitic plant family Orobanchaceae reveal surprising conservation of chlorophyll synthesis. Curr Biol 21:2098–2104.

Wimpee CF, Wrobel RL, Garvin DK. 1991. A divergent plastid genome in Conopholis americana, an achlorophyllous parasitic plant. Plant Mol Biol 17:161–166.

Wolfe AD, dePamphilis CW. 1998. The effect of relaxed functional constraints on the photosynthetic gene rbcL in photosynthetic and nonphotosynthetic parasitic plants. Mol Biol Evol 15:1243–1258.

Wolfe KH, Morden CW, Palmer JD. 1992. Function and evolution of a minimal plastid genome from a nonphotosynthetic parasitic plant. Proc Natl Acad Sci U S A 89:10648–10652.

Wu FH, Chan MT, Liao DC, Hsu CT, Lee YW, Daniell H, Duvall MR, Lin CS. 2010. Complete chloroplast genome of Oncidium Gower Ramsey and evaluation of molecular markers for identification and breeding in Oncidiinae. BMC Plant Biol 10:68.

Wyman SK, Jansen RK, Boore JL. 2004. Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20:3252–3255.

Yang JB, Tang M, Li HT, Zhang ZR, Li DZ. 2013. Complete chloroplast genome of the genus Cymbidium: lights into the species identification, phylogenetic implications and population genetic analyses. BMC Evol Biol 13:84.

Yang ZH. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci 13:555–556.

Yang ZH. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol 24:1586–1591.

Yoder OC, Turgeon BG. 2001. Fungal genomics and pathogenicity. Curr Opin Plant Biol 4:315–321.

Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res 18:821–829.

Zimmer K, Meyer C, Gebauer G. 2008. The ectomycorrhizal specialist orchid Corallorhiza trifida is a partial myco-heterotroph. New Phytol 178:395–400.


pseudogene in Co. bulbosa and the clade of (Co. macrantha, Co. maculata mexicana).

The plastid-encoded ndh complex is believed to be dispensable for a number of reasons (reviewed in Martín and Sabater 2010; Wicke et al. 2011, 2013), which may differ for various groups of plants displaying evidence of degradation in this gene complex. Generally, the ndh complex is believed to be essential for photosynthetic function in environments with highly variable light intensities, by functioning in regulation of cyclic electron transport and also in mitigating the effects of photo-oxidative damage under high light intensities (Martín et al. 2009; Martín and Sabater 2010). In the ancestor of Corallorhiza, selective constraints on the ndh complex may have been lifted by a combination of decreased dependence upon photosynthetic carbon, as a result of a shift to heavy dependence upon fungal carbon (a similar situation is also observed in carnivorous plants using animals as a carbon source in the family Lentibulariaceae; Wicke et al. 2014), and a tendency to grow in dark, shaded, mature forests where light intensity is always low. This may also occur in other orchid lineages (including leafy, photosynthetic orchids) due to high dependence on fungal carbon as seedlings, and also the tendency to live in habitats with relatively low, or at least static, light intensity (e.g., a shaded rainforest understory or floor).

Photosynthesis-Related Genes (excluding ndh, rpo, and atp)

Nongreen members of Corallorhiza (that is, those with relatively reduced levels of chlorophyll; fig. 1) all display evidence of degradation of genes directly involved in photosynthetic processes (fig. 2; e.g., psa/b, pet, rbcL). Of these taxa, Co. striata vreelandii displays the highest number of degraded genes (table 2; figs. 2 and 3), as reported in Barrett and Davis (2012), followed by northern North American members of the Co. maculata complex (Co. mertensiana, Co. maculata maculata, Co. maculata occidentalis). Both of these complexes (Co. striata, Co. maculata) have independently undergone degradation in the following genes/gene complexes (omitting ndh; see above): pet, psa/b, rpo, ccsA, cemA, and rbcL. It is highly unlikely that transcript editing of any kind can compensate for these changes, and it can be assumed that these changes are irreversible. The possibility exists, however slight, that all deleted genes or pseudogenes among the plastomes of these taxa are redundant and that functional copies are encoded in the nuclear genome, or that genes have been “regained” through intergenomic or horizontal transfer (e.g., Iorizzo et al. 2012; Straub et al. 2013; Wicke et al. 2013).

Based on the degraded states of photosynthesis-related genes among these plastomes, it can be concluded that the aforementioned nongreen taxa are strict parasites of fungi, incapable of photosynthesis, and thus the terms “holomycotroph” or “full mycoheterotroph” are indeed appropriate. No studies of photosynthetic activity have been carried out in Corallorhiza aside from those in the “greenest” coralroot, Co. trifida (Zimmer et al. 2008; Cameron et al. 2009). It will be beneficial to measure photosynthetic rates (or determine the lack thereof), along with plastid-gene transcript levels, among multiple individuals of both green and nongreen coralroot taxa, preferably growing in close proximity to one another and in similar habitats, to definitively characterize photosynthetic capacity in the genus.

Though green members of Corallorhiza do not display significant degradation of photosynthesis-related genes (psa, psb, pet, rbcL, etc.), there are a few important exceptions, which may or may not affect the efficacy of photosynthesis in these taxa. First, members of Corallorhiza excluding Co. trifida and Co. striata have experienced a large deletion in psbM (PSII protein M). This low molecular weight peptide is believed to play a role in maintaining the stability of PSII dimers; psbM-deficient tobacco mutants displayed impaired stability of PSII but were still able to function (Umate et al. 2007; Kawakami et al. 2011). Photosynthesis has only been explicitly measured and demonstrated in Co. trifida (Zimmer et al. 2008; Cameron et al. 2009), which has a putatively functional copy of psbM. Thus, it is possible that 1) other “green” coralroots (Co. odontorhiza, Co. wisteriana, Co. bulbosa, Co. macrantha, Co. maculata mexicana) carry out inhibited photosynthesis; 2) this gene has been transferred to the nucleus or mitochondrion, and its product is imported into the plastid, allowing stabilization of the PSII dimer complex and some level of photosynthetic function; or 3) a nonfunctional psbM gene has no effect on photosynthetic function.

Another partially green coralroot, Co. macrantha (restricted to high-elevation forests in southern Mexico), has a pseudogenized copy of psaI (Photosystem I Reaction Center Subunit VIII) due to a 4-bp insertion near the 5′-end of the gene relative to other coralroots, causing a reading frame shift. The psaI protein is believed to function in the stabilization of a nuclear-encoded subunit of the same complex (psaL) in Arabidopsis thaliana, and has also been shown to bind to the Light-harvesting Complex II (Jensen et al. 2007); it was suggested that the latter putative function is redundant. In Corallorhiza, this might explain why psaI in Co. macrantha has apparently become a pseudogene. Thus, among the green coralroots, Co. macrantha may represent yet another very early step in the transition to the loss of photosynthesis, or at least a reduction in photosynthetic efficacy, but this remains a hypothesis to be tested.
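Why a 4-bp insertion disrupts a gene can be illustrated with a minimal sketch. The sequence below is a hypothetical toy open reading frame, not the real psaI sequence, and the inserted bases and helper names (`codons`, `first_internal_stop`, `insert_bases`) are assumptions chosen for illustration. Because 4 is not a multiple of 3, every codon downstream of the insertion is read in a shifted frame, which commonly exposes a premature stop codon.

```python
# Toy illustration of a frameshift; the sequence is HYPOTHETICAL, not psaI.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def codons(seq):
    """Split a sequence into complete codons, reading from position 0."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

def first_internal_stop(seq):
    """Index of the first stop codon before the last complete codon, or None."""
    for idx, codon in enumerate(codons(seq)[:-1]):
        if codon in STOP_CODONS:
            return idx
    return None

def insert_bases(seq, position, bases):
    """Return a copy of seq with bases inserted before the given 0-based position."""
    return seq[:position] + bases + seq[position:]

intact = "ATGGCTAAATTGCCGTCTGAACGTTAA"      # 9 codons, single terminal stop
mutant = insert_bases(intact, 6, "CCTT")     # hypothetical 4-bp insertion near the 5' end

for name, seq in (("intact", intact), ("mutant", mutant)):
    print(name, "length mod 3 =", len(seq) % 3,
          "first internal stop codon index =", first_internal_stop(seq))
```

In practice, pseudogene calls are made from alignments against intact homologs rather than from a single sequence; the sketch only shows the reading-frame logic.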

rpo Genes

There is evidence for rpo pseudogenes in both the nongreen Co. striata and Co. maculata complexes, based on large deletions, reading frame shifts, and internal stop codons. Plastid genes are transcribed by two polymerases in monocots: a nuclear-encoded polymerase and a plastid-encoded polymerase (Liere et al. 2011). The plastid-encoded RNA polymerase (PEP) is believed to preferentially transcribe photosynthesis-related genes, whereas the rpoB-C1-C2 operon itself is transcribed by a nuclear-encoded RNA polymerase (Hajdukiewicz et al. 1997; reviewed in Liere et al. 2011; Krause 2012). In addition to transcribing photosynthesis-related genes for the most part, PEP also tends to be more active in later leaf developmental stages (Hajdukiewicz et al. 1997; reviewed in Liere et al. 2011).


Based on this latter finding, it might be expected that all members of Corallorhiza should show some degradation in rpo genes, as leaves do not fully develop in any species within this genus (i.e., only the basal leaf sheath persists, but the blade or lamina never develops), presumably obviating the role of PEP in transcription of photosynthesis-related genes (e.g., psa, psb) in leaves. However, green coralroots have intact reading frames for all rpo genes, whereas nongreen coralroots show varying degrees of degradation in the rpo complex. As with photosynthesis-related genes in Corallorhiza (figs. 2 and 3), it does not appear to be the transition to leaflessness per se that has resulted in degradation of the rpo complex, but more likely the shift from partial heterotrophy (and the associated need to carry out photosynthesis, in whatever small amount) to strict heterotrophy. An interesting parallel occurs in the carnivorous family Lentibulariaceae, in which rpo genes display significant departures from purifying selection (Wicke et al. 2014). A useful follow-up would be to determine whether degradation first begins among photosystem genes (psa, psb), diminishing the importance of having a functional PEP complex, or whether the opposite is the case, in which a dysfunctional (or nonexistent) PEP has contributed to relaxed selective pressures associated with photosynthesis-related genes (i.e., if psa and psb genes can no longer be transcribed, or are transcribed by an inefficient PEP, is there any benefit to maintaining them?). Regardless, the degradation of rpo genes in nongreen Corallorhiza closely mirrors that in genes directly involved in photosynthesis (psa/b, rbcL, etc.).

Relationships in Corallorhiza

Complete plastome sequences of Corallorhiza yielded a highly resolved, highly supported tree consistent with previous studies (Freudenstein and Doyle 1994a, 1994b; Barrett and Freudenstein 2008; Freudenstein and Senyo 2008). Here, for the first time, relationships among the six taxa that comprise the Co. maculata complex (as recognized in Freudenstein 1997) are fully resolved, with high support for Co. bulbosa as sister to the rest of the complex (bootstrap = 91), whereas previous studies could not resolve relationships in this complex (Barrett and Freudenstein 2008; Freudenstein and Senyo 2008). Complete plastomes also confirm the sister relationship of Co. striata to the rest of Corallorhiza, with Co. trifida (moving away from the base of the tree), then Co. odontorhiza and Co. wisteriana, diverging successively. Depending on the data set analyzed, Co. bulbosa appears in a few different places, most likely due to homoplasy and the few variable characters available to resolve the position of this species in smaller data sets, as evidenced by short internal branches (supplementary fig. S3, Supplementary Material online). Thus, inclusion of complete plastomes (pseudogenes, spacers, and introns) resolves the otherwise uncertain position of this species. Nuclear rDNA, on the other hand, places Co. bulbosa as sister to (Co. macrantha, Co. maculata mexicana) with moderate support, a similar finding to the combined analysis of Barrett and Freudenstein (2008) based on plastid rbcL + nuclear ITS, though that study did not include Co. macrantha. Combining all nuclear and plastid data places Co. bulbosa as sister to the remaining members of the Co. maculata complex with 100% bootstrap support (supplementary fig. S3, Supplementary Material online). The complete plastome + rDNA data set fully resolves relationships among the major species complexes of Corallorhiza (excluding members not sampled from some complexes).

In particular, relationships within the Co. maculata complex shed light on a phylogeographically intriguing aspect of plastome degradation. All Mexican members of the Co. maculata complex have at least some visible green tissue and are presumed to be partially mycoheterotrophic, based on conservation of the plastid-encoded photosynthetic apparatus (for the most part; see fig. 2). Those members of the Co. maculata complex that occur in northern North America have no visible green tissue (two of these species were shown to have highly reduced chlorophyll levels; fig. 1) and display evidence of degradation in these genes. The topology of the Co. maculata complex based on combined plastome + rDNA data (supplementary fig. S3, Supplementary Material online; fig. 2) lends support to a previous hypothesis that this complex (and possibly the genus Corallorhiza itself) originated in southern Mexico (Freudenstein and Doyle 1994a, 1994b; Barrett and Freudenstein 2008; Freudenstein and Senyo 2008) and expanded into northern North America, with the ancestor of Co. mertensiana, Co. maculata maculata, and Co. maculata occidentalis undergoing a shift from partial to strict heterotrophy (and continuing this trend among extant lineages), and thus presumably losing the ability to carry out photosynthesis. This suggests that deeper sampling among individuals in members of both the Co. maculata complex and the Co. striata complex (which is here represented only by Co. striata vreelandii) will yield a much clearer picture of plastome degradation in Corallorhiza on a phylogeographic scale.

Selective Regime among Genes with Open Reading Frames

Model M4, allowing green and nongreen taxa to have different ω-ratios (ω_leafy = ω_cor; ω_mac-ng = ω_str), fit the data significantly better than the one-ratio model for both housekeeping genes and photosynthesis-related genes, suggesting different selective scenarios for nongreen taxa relative to green taxa, but these comparisons are nonsignificant when corrected for multiple comparisons (table 3). For housekeeping genes, ω-values for green versus nongreen taxa did not differ by much (ω_leafy = ω_cor = 0.30267; ω_mac-ng = ω_str = 0.34230; table 3). However, this difference is much more pronounced for photosynthesis-related genes (ω_leafy = ω_cor = 0.27361; ω_mac-ng = ω_str = 0.73720), with the estimate for nongreen taxa being approximately 2.7 times that of green taxa and approaching the value expected under neutral evolution (ω ≈ 1). Thus, even though some photosynthesis genes in nongreen taxa still have intact reading frames, selective pressure on these genes appears to be relaxed.
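The size of the contrast between gene classes can be checked with simple arithmetic on the reported M4 estimates. The sketch below uses only the four ω values quoted in the text; the dictionary layout and the function name `fold_difference` are illustrative choices, not part of the original analysis.

```python
# omega (dN/dS) estimates under model M4, as quoted in the text (table 3).
omega = {
    "housekeeping": {"green": 0.30267, "nongreen": 0.34230},
    "photosynthesis": {"green": 0.27361, "nongreen": 0.73720},
}

def fold_difference(green, nongreen):
    """Ratio of the nongreen omega estimate to the green omega estimate."""
    return nongreen / green

for gene_class, w in omega.items():
    fold = fold_difference(w["green"], w["nongreen"])
    # omega = 1 is the neutral expectation; values below 1 indicate purifying selection.
    gap_to_neutral = 1.0 - w["nongreen"]
    print(f"{gene_class}: nongreen/green = {fold:.2f}, "
          f"nongreen omega is {gap_to_neutral:.3f} below neutrality")
```

The fold difference is a descriptive summary only; whether the two-ratio model is preferred is decided by the likelihood ratio tests reported in table 3.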

For atp genes, Models M1 and M4 fit the data significantly better than does M0, even after correcting for multiple comparisons, suggesting different selective regimes in Corallorhiza versus leafy green outgroups, or at least in nongreen Corallorhiza versus green Corallorhiza and leafy outgroups. For example, in model M1 the ω value for Corallorhiza is nearly twice that for leafy orchids (ω_leafy = 0.15175; ω_cor = ω_mac-ng = ω_str = 0.26949), and in model M4 nongreen Corallorhiza have ω values over twice those for green Corallorhiza and leafy orchids. Further investigations using branch-site models, allowing some sites in the “foreground” branches to be under positive selection, suggest evidence for positive selection in nongreen Corallorhiza for the atp complex.
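Comparisons of nested branch models such as M0 versus M1 are conventionally made with a likelihood ratio test, followed by a correction for the number of tests. The sketch below is one way this could look, under stated assumptions: the compared models differ by a single free ω parameter (1 degree of freedom), a Bonferroni-style threshold is used as the multiple-comparison correction, and the log-likelihoods are HYPOTHETICAL placeholders, not values from table 3.

```python
import math

def lrt_df1(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by one free parameter.

    Returns (test statistic, p-value). For 1 degree of freedom the chi-square
    survival function has the closed form erfc(sqrt(x / 2)).
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, math.erfc(math.sqrt(stat / 2.0))

def bonferroni_significant(p_values, alpha=0.05):
    """Flag a test as significant only if p < alpha / number of tests."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]

# HYPOTHETICAL (null lnL, alternative lnL) pairs, for illustration only.
comparisons = {
    "gene set A": (-5000.0, -4998.0),
    "gene set B": (-10250.0, -10243.0),
    "gene set C": (-7500.0, -7499.5),
}

p_values = [lrt_df1(null, alt)[1] for null, alt in comparisons.values()]
for name, p, keep in zip(comparisons, p_values, bonferroni_significant(p_values)):
    print(f"{name}: p = {p:.4g}, significant after correction: {keep}")
```

With these placeholder values, a test can be nominally significant at 0.05 yet fail the corrected threshold, which is the pattern described for the housekeeping and photosynthesis-related gene comparisons.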

The plastid-encoded atp genes play a critical role in photosynthesis by both generating ATP and translocating protons across the thylakoid membrane (Evron et al. 2000; McCarty et al. 2000; Allen et al. 2011). However, in taxa that display evidence of loss of function for photosystem and other photosynthesis-related genes, the fact that atp genes are preserved as intact reading frames suggests that they may play additional roles in plastid function (discussed in Wicke et al. 2013). This is bolstered by the fact that this complex shows evidence of positive selection in strictly heterotrophic, nongreen taxa, but not for Corallorhiza as a whole. This would support the hypothesis that the atp complex has either taken on additional, as-yet-unknown roles in holomycotrophic plastids or, at a minimum, has fallen under increased selective pressure to adapt to the overall “alternative lifestyle” associated with strict fungal heterotrophy in nonphotosynthetic plastids (including amyloplasts, chromoplasts, elaioplasts, proteinoplasts, etc.). A future research objective is to examine the potential biological roles and molecular evolutionary patterns of the atp complex in nonphotosynthetic plants and their green, photosynthetic relatives, as members of this gene complex are often preserved and putatively functional among various heterotrophic angiosperms (see fig. 3 and references therein, including discussion in Wicke et al. 2013).

Comparison of Corallorhiza with Other Parasitic Angiosperms Using the Model of Barrett and Davis (2012)

A comparison of the plastomes of several heterotrophic taxa that have been sequenced to date shows that Corallorhiza is not quite in the “advanced” stages of plastid genome degradation relative to several taxa in Orobanchaceae (e.g., Wicke et al. 2013) or other orchids such as Neottia (Logacheva et al. 2011) and Rhizanthella (Delannoy et al. 2011) (fig. 3). Interpreted in the context of the basic model presented in Barrett and Davis (2012), members of Corallorhiza fall into categories 1 and 3. Taxa with some visible green tissue display pseudogenes/deletions mainly in the ndh complex (category 1; gene-class categories are listed at the top of fig. 3), with a few additional changes, whereas those with no green tissue have experienced pseudogenization and deletions in ndh, photosynthesis-related, and rpo genes, but not in atp genes or other “housekeeping” genes (category 3; excepting the loss of a redundant trnT-GGU in Co. striata vreelandii; Barrett and Davis 2012; figs. 2 and 3). Thus, nonparasitic autotrophs and partial heterotrophs roughly fall either into category 0 (no degradation of plastid genes) or category 1 (degradation of the ndh complex only). More specifically, all partial heterotrophs fall into category 1, with evidence of degradation of the ndh complex.

Strict heterotrophs fall into more “advanced” categories. Several, but not all, members of Orobanchaceae and two orchids (Neottia and Rhizanthella) can be classified in category 5, in which there is evidence of degradation of atp genes and at least some evidence of degradation in “housekeeping” genes (e.g., rpl, rps, trn, etc.). There are some genes for which no taxa have experienced pseudogenization or loss, including rpl2, rpl16, rpl20, rps2, rps4, rps8, rps11, ycf1, ycf2, trnD-guc, trnE-uuc, trnI-cau, trnM-cau, trnQ-uug, trnY-gua, rrn16, rrn23, rrn4.5, and rrn5. This suggests that they may be indispensable for plastid function, at least in the angiosperms included in this comparison, and that a “minimal set” of genes might be necessary for functional plastid “housekeeping” and perhaps other functions outside of photosynthesis and ATP production.

Overall, no taxa fell in categories 2 or 4, suggesting that degradation of the rpo genes occurs more or less simultaneously with degradation in photosynthesis-related genes, and that degradation of atp genes occurs along with that in “housekeeping” genes (also discussed in Wicke et al. 2013). Thus, the model of Barrett and Davis (2012) could be modified in light of more recently sequenced plastomes, including those of Corallorhiza, to have five general categories: 1) no degradation; 2) degradation in the ndh complex only; 3) degradation of the ndh complex and photosynthesis-related genes, including rpo genes; 4) degradation in all plastid gene systems, including “housekeeping” and atp genes; and 5) complete or nearly complete loss of the plastid genome (as is hypothesized to have occurred in some strict heterotrophs such as Corynaea and Rafflesia [Nickrent et al. 1997; Molina et al. 2014]).
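The five categories of the modified model can be written as a small decision rule. This is a simplified sketch, not the authors' scoring procedure: the function name, the boolean inputs, and the rule of assigning a plastome to the most advanced category for which it shows any degradation are all assumptions made for illustration.

```python
def degradation_category(ndh=False, photosynthesis_or_rpo=False,
                         atp_or_housekeeping=False, plastome_lost=False):
    """Return the category (1-5) of the modified model for one plastome.

    Each flag records whether pseudogenes/deletions were observed in that
    gene class; the most advanced class showing degradation decides.
    """
    if plastome_lost:
        return 5
    if atp_or_housekeeping:
        return 4
    if photosynthesis_or_rpo:
        return 3
    if ndh:
        return 2
    return 1

# Gene-class patterns as described in the text for three kinds of taxa.
examples = {
    "leafy autotroph": degradation_category(),
    "green coralroot": degradation_category(ndh=True),
    "nongreen coralroot": degradation_category(ndh=True, photosynthesis_or_rpo=True),
}
for label, category in examples.items():
    print(f"{label}: category {category}")
```

A rule this coarse necessarily hides lineage-specific deviations: a plastome that retains photosynthesis genes but has lost rpo and some housekeeping genes would still be scored by its most advanced class.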

There is some degree of deviation from the Barrett and Davis (2012) model, as might be expected given the potential for somewhat idiosyncratic patterns of plastome evolution in unrelated groups. The obvious exception to this model would be the sequenced members of the genus Cuscuta in the family Convolvulaceae. One of the best-studied parasitic systems in terms of plastome evolution, Cuscuta contains members with “extreme” plastome reduction relative to closely related leafy species (approximately 50% total size reduction in Cuscuta gronovii and Cuscuta obtusiflora relative to the leafy Ipomoea) (fig. 3 and references therein). Yet both highly reduced species of Cuscuta retain most, if not all, of the genes for photosynthesis, but have degraded RNA polymerase complexes and also show evidence of pseudogenes/deletions of some housekeeping genes. A further survey based on probe hybridization in Cuscuta with greatly increased taxon sampling by Braukmann et al. (2013) suggests that this genus encapsulates multiple transitions to highly degraded plastid genomes, with members of subgenus Grammica displaying even more advanced stages of degradation. These taxa were not included in figure 3, as they are based on hybridization and have not yet been sequenced to generate complete plastomes. Recycling of respiratory CO2 has been offered as an explanation for the retention of putatively functional genes in the psa, psb, and pet complexes (McNeal et al. 2007), underscoring the importance of studying the physiology of strict heterotrophs on a taxon-by-taxon basis, and also highlighting how little is known about the basic physiology and functional genomics of these plants.

Conclusion

All members of Corallorhiza produce detectable levels of chlorophyll based on the current sample, but nongreen, strictly heterotrophic members have on average tenfold lower concentrations relative to green, partially heterotrophic members. Genomic data allowed assembly of complete plastomes as well as partial rDNA operons, which when combined fully resolve relationships among the major species complexes in Corallorhiza. Based on complete plastomes, Corallorhiza can be said to span the “early” stages of plastid genome degradation as a result of relaxed selective pressures on photosynthetic function. Corallorhiza is useful in providing a model system to understand the earliest stages in the transition to a “minimal plastome,” as observed in other heterotrophic plants (e.g., Orobanchaceae, other orchids like Rhizanthella). There have been at least two independent transitions to strict heterotrophy from partial heterotrophy in the genus (Co. maculata and Co. striata complexes). Nongreen coralroots also display evidence of relaxed selective pressures on photosynthesis-related genes which have retained open reading frames. Members of the atp complex display relaxed purifying selection among Corallorhiza as a whole, but more specifically display evidence for positive selection for nongreen, strictly heterotrophic taxa, although for reasons currently unknown.

Generally, there is a clear path of plastid genome degradation in the heterotrophic angiosperm lineages studied thus far (with some deviations from the overall model, e.g., some Cuscuta). Moreover, it can be hypothesized that there is a stepwise progression to a “minimal” plastome, and Corallorhiza occupies categories 2 (partially heterotrophic taxa, which display degradation of the ndh complex only and a few other pseudogenes) and 3 (strictly heterotrophic taxa, including degradation of the ndh complex, photosynthesis-related genes, and rpo genes [excluding atp genes]) in a modified model of the degree of plastome degradation among angiosperms. Future studies will include 1) deeper sampling of complete plastomes within species complexes of Corallorhiza to gain a phylogeographic perspective on plastome degradation; 2) direct assessment of the relative photosynthetic capabilities of members of each coralroot taxon in the field; and 3) quantification of transcript levels for plastid-encoded genes within each species, to gain a clearer understanding of gene expression and to test the hypothesis of the existence of duplicate, functional copies of pseudogenized or deleted plastid genes.

Materials and Methods

Plant Material, DNA Isolation, Library Preparation, and PE Sequencing

Plant material was collected and preserved either in silica gel or frozen at −80 °C. Total genomic DNAs were extracted using the cetyltrimethylammonium bromide (CTAB) method (Doyle JJ and Doyle JL 1987) from 1.0 g frozen or 0.2–0.5 g silica-dried material, and quantified using a NanoDrop 2000 spectrophotometer (Thermo Scientific, Waltham, MA). Samples with total DNA concentrations > 100 ng/μl were selected and run on a 1.5% agarose gel to check for high molecular weight, nondegraded DNA. Samples were shipped on ice to Cold Spring Harbor Laboratory (CSHL; Woodbury, NY), where they were sheared to approximately 450 bp, followed by library preparation and barcoding for PE Illumina sequencing. The exception was Co. odontorhiza, for which library preparation and sequencing were completed at the University of Missouri–Columbia using the Illumina TruSeq Library Preparation Kit (Illumina Inc., San Diego, CA). Co. odontorhiza DNAs were sheared to approximately 320 bp. All samples were run on Illumina HiSeq 2000 machines to generate 100-bp PE reads. Indexed samples were pooled and run with accessions from other projects, with a total of 40 or 20 accessions in a single lane on the CSHL and University of Missouri machines, respectively. Voucher information and characteristics of each of the resulting data sets and completed plastomes are listed in table 1.

Quantification of Chlorophyll Content

Approximately 0.1 g of −80 °C frozen tissue was harvested from the outer wall of the developing ovary and weighed for chlorophyll quantification. Ovary tissue was chosen 1) because this is the only organ that is visibly green in some taxa (e.g., Co. odontorhiza, Co. wisteriana) and 2) to be consistent between taxa. Chlorophylls were extracted in 80% ice-cold acetone and stored at −80 °C, and chlorophylls a and b were quantified on a Cary 4000 UV-VIS spectrophotometer (Agilent Technologies Inc., Santa Clara, CA; Department of Plant Cellular & Molecular Biology, Ohio State University), and their concentrations were calculated using formulas of Porra (2002) (Lima D, personal communication). Data from chlorophylls a and b were pooled to give total chlorophylls. Although individual accessions used to quantify chlorophyll concentration among species were not the same as those used for sequencing, multiple individuals from across each species’ geographic range were included to account for variation in chlorophyll content among individuals within species. Corallorhiza bentleyi is included here (an endangered member of the Co. striata complex) but was not sequenced. Conversely, Mexican-endemic members of the Co. maculata complex (Co. macrantha, Co. bulbosa, Co. maculata var. mexicana) were not included in the chlorophyll analysis due to rarity and thus lack of fresh/frozen material. Voucher specimens for plants used for chlorophyll analysis are listed in supplementary figure S5, Supplementary Material online.

Data Processing

The resulting paired FASTQ data were processed using a custom UNIX script (“process_fastq.sh”) that calls several freely available Perl scripts for the purposes of 1) quality filtering/trimming and adaptor removal (“IlluQC_PRLL.pl,” “TrimmingReads.pl”; Patel and Jain 2012); 2) “shuffling” PE files into a single file (“ShuffleSequences_fastq.pl,” part of the VELVET suite of tools [E. Cabot, University of Wisconsin-Madison; Zerbino and Birney 2008]); and 3) conversion to FASTA format (“fastq2fasta.pl,” B. Knaus 2009 [http://brianknaus.com/software/srtoolbox, last accessed May 1, 2014]). Reads were trimmed from their 3′-ends, discarding bases under a quality threshold of PHRED score of 20 (corresponding to an error rate of 1/100 or greater), and also discarding trimmed reads below 61 bp in length.
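The trimming rule described above (remove 3′ bases below PHRED 20, then discard reads shorter than 61 bp) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Perl scripts actually used: the function names are hypothetical, quality scores are assumed to be already decoded to integers, and the real tools may apply additional filters.

```python
from typing import List, Optional, Tuple


def phred_to_error(q: float) -> float:
    """Convert a PHRED quality score to an error probability (Q20 -> 0.01)."""
    return 10 ** (-q / 10)


def trim_3prime(
    seq: str, quals: List[int], min_q: int = 20, min_len: int = 61
) -> Optional[Tuple[str, List[int]]]:
    """Trim bases with quality below min_q from the 3' end of a read.

    Returns the trimmed (sequence, qualities), or None if the trimmed read
    is shorter than min_len and should be discarded.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    end = len(seq)
    # Walk back from the 3' end while the last base is below the threshold.
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    if end < min_len:
        return None
    return seq[:end], quals[:end]


if __name__ == "__main__":
    read = "A" * 70
    quals = [30] * 65 + [10] * 5
    trimmed = trim_3prime(read, quals)
    print(len(trimmed[0]) if trimmed else "discarded")
```

With paired-end data, discarding one mate usually means handling its partner as well (dropping it or treating it as a single read); that bookkeeping is omitted here.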

Read Mapping

PE reads were then mapped five times iteratively in Geneious v6 (created by Biomatters Inc., Auckland, New Zealand; http://www.geneious.com, last accessed December 1, 2013) using automatically determined settings to a combined file of 1) the Phalaenopsis aphrodite plastome (GenBank accession number NC_007499; Chang et al. 2006); 2) the Phoenix dactylifera mitochondrial genome (NC_016740; Fang et al. 2012); and 3) the Asclepias syriaca nuclear ribosomal DNA consensus operon (JF312046; Straub et al. 2011). The purpose of read mapping was to filter or “screen” reads that matched the reference genomes and to get initial, crude estimates of coverage to be used in downstream de novo assemblies.

Plastome Assembly

Mean coverage estimates from Geneious were used as input for de novo plastome assembly (“-exp_cov” parameter). De novo assembly was completed using both the VELVET v1.2.10 assembler (Zerbino and Birney 2008) and the Geneious v6 de novo assembler. VELVET assemblies were generated using a variety of hash lengths (i.e., “word match” or “kmer length”) ranging from 51 to 81 for each complete data set, with specified expected coverage estimates taken from Geneious mapping (specified to be in approximate kmer coverage; see Zerbino and Birney 2008). Coverage cutoff values were input as roughly one-half of the specified expected coverage estimates, minimum contig length was set to 300 bp, and insert size set to 450 bp (320 bp for Co. odontorhiza), and the resulting contigs were assembled in Sequencher v6 (GeneCodes, Ann Arbor, MI). The resulting “draft de novo assembly” from VELVET was used as a reference to map reads in Geneious as described above (five iterations, automatically determined settings). Read mappings were visually scanned in Geneious to check for areas of low coverage or potential misassemblies. Read pairs that mapped successfully to the draft reference were used in a second round of de novo assembly, this time in Geneious with automatically determined settings, saving 100 consensus contigs. This method allowed construction of complete draft plastomes with relative ease.
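Because VELVET expects coverage in k-mer rather than nucleotide units, mapping-based coverage has to be converted before it is supplied as expected coverage. A minimal sketch follows, assuming the conversion given in the VELVET documentation, Ck = C × (L − k + 1) / L, where C is nucleotide coverage, L is read length, and k is the hash length. The function names and the example values (200× coverage, 100-bp reads) are illustrative assumptions, not values from this study.

```python
from typing import Dict


def kmer_coverage(base_coverage: float, read_length: int, k: int) -> float:
    """Convert nucleotide coverage to k-mer coverage: Ck = C * (L - k + 1) / L."""
    if not 0 < k <= read_length:
        raise ValueError("k must be between 1 and the read length")
    return base_coverage * (read_length - k + 1) / read_length


def velvet_settings(base_coverage: float, read_length: int = 100, k: int = 51) -> Dict[str, float]:
    """Suggest expected coverage and a cutoff of one-half that value."""
    exp_cov = kmer_coverage(base_coverage, read_length, k)
    return {"exp_cov": exp_cov, "cov_cutoff": exp_cov / 2}


if __name__ == "__main__":
    # Hypothetical data set: 200x mean plastome coverage, 100-bp reads.
    for k in (51, 61, 71, 81):
        print(k, velvet_settings(200, 100, k))
```

After quality trimming, read lengths vary, so a mean trimmed read length would be the more appropriate value for L; the conversion is approximate in any case.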

Contigs from both assemblies (Geneious, VELVET) were merged in Sequencher to form complete plastomes (parameters = “dirty data” or “large gap,” minimum overlap = 20 bp, minimum similarity 70–90%). Any apparent “gaps” among contigs resulting from one software platform were usually covered by a contig from the other, allowing assembly of completed plastomes (Large Single Copy–Inverted Repeat B–Small Single Copy). In a few cases, VELVET scaffolds contained strings of Ns between 30 and 100 bp, mostly associated with homopolymer repeats (> 10 bp). These were resolved by 1) the Geneious de novo assembly (typically) and 2) UNIX “grep” searches with flanking sequences of 30–50 bp used as search terms, taken from each end of the string of Ns. Grep results were converted to FASTA format using a custom Perl script to build a “bridge” of reads. This was done iteratively until the region in question was resolved, although most cases were resolved with a single iteration. Grep searches were applied to all cases of ambiguity between contigs from the two de novo assembly methods as a quality control. Polymerase chain reaction amplification and Sanger sequencing were not necessary due to high read coverage (table 1).
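The grep-based “bridge” step can be illustrated with a short Python sketch: reads containing a flanking sequence are collected and written in FASTA format for inspection or reassembly. This is a stand-in for the UNIX grep and custom Perl script actually used; the function names are hypothetical, and searching the reverse complement of the flank as well is an assumption of the sketch.

```python
from typing import Iterable, List

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def reads_matching_flank(reads: Iterable[str], flank: str) -> List[str]:
    """Keep reads that contain the flank or its reverse complement."""
    rc = reverse_complement(flank)
    return [r for r in reads if flank in r or rc in r]


def to_fasta(reads: Iterable[str], prefix: str = "read") -> str:
    """Format reads as FASTA records with sequential names."""
    return "".join(f">{prefix}_{i}\n{r}\n" for i, r in enumerate(reads, 1))


if __name__ == "__main__":
    reads = ["GGACCGTAGG", "CCTACGGTCC", "AAAAAAAAAA"]
    bridge = reads_matching_flank(reads, "ACCGTA")
    print(to_fasta(bridge), end="")
```

In practice the new sequence recovered past the gap edge would supply the next search term, and the search would be repeated until reads from both flanks overlap.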

IR boundaries for each draft plastome were determined by using the first and last approximately 20 bp of the draft plastome as search terms (reverse complement) in Sequencher to find matches, and marking their match points as features. Then the first approximately 700 bp and last approximately 1–2,000 bp of the draft plastome were copied, reverse complemented, and manually aligned to the IR using the marked features from above as reference points. IR boundaries were determined informatically using PE information (i.e., read pairs that spanned the IR boundaries). Because the orientation of the ends of the LSC and SSC can vary, the plastomes were often in different configurations; single copy regions were reoriented to correspond to the “standard” model of gene order to simplify downstream comparison. Special attention was paid to the possibility of significant modification or even loss of the IR (either partial or complete), and also to the possibility of other genomic rearrangements or structural modifications. In other words, when building draft plastomes, it was not assumed that all accessions displayed the “standard” plastome structure found in angiosperms (i.e., LSC, IRb, SSC, IRa).
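The first step of the IR search, locating the reverse complement of each end of the draft within the draft itself, can be sketched as follows. The sketch assumes a linear draft whose two ends extend a short way into the second IR copy; the function name and the toy sequences in the usage example are hypothetical, and the confirmation with boundary-spanning read pairs described above is not reproduced.

```python
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_ir_anchors(draft: str, k: int = 20) -> Tuple[int, int]:
    """Locate the IR copy inside a linear draft plastome.

    The reverse complements of the first and last k bases of the draft are
    searched within the draft. Returns (start, end) as a Python slice of the
    matched IR copy; either value is -1 if the corresponding search fails.
    """
    start = draft.find(reverse_complement(draft[:k]))
    tail_hit = draft.find(reverse_complement(draft[-k:]))
    end = tail_hit + k if tail_hit != -1 else -1
    return start, end


if __name__ == "__main__":
    # Toy example: 12-bp "IR" between an 8-bp "LSC" and an 8-bp "SSC".
    irb = "GATTACAGGCTA"
    ira = reverse_complement(irb)
    draft = ira[-4:] + "T" * 8 + irb + "C" * 8 + ira[:4]
    print(find_ir_anchors(draft, k=4))
```

A failed search (a value of -1) is itself informative, since it may indicate modification or loss of the IR rather than an assembly problem.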

Plastome Annotation

Plastid genomes were annotated by first aligning the set of coding loci (coding sequences [CDS]) containing introns and all tRNAs from Phalaenopsis to each draft plastome (minus IRa) in Sequencher to aid in intron/exon and tRNA boundary determination. Annotations were conducted in DOGMA (Wyman et al. 2004), which uses BLASTX searches against a database of annotated plastomes and amino acid translations, and allows manual determination of start/stop codons. DOGMA BLASTX parameters were as follows: genetic code = plant plastid, e value = 5, percent cutoff for protein coding genes = 55, and percent cutoff for tRNAs = 80. The resulting GenBank feature table was exported from the DOGMA server and imported along with the finished plastid FASTA file into SEQUIN (National Center for Biotechnology Information), where the annotation was validated and completed for submission to GenBank. A large inversion in Co. maculata var. maculata was manually reversed for the purposes of phylogenetic analyses and other whole-plastome comparisons, but the plastome for that accession was annotated with the inversion as assembled de novo. Plastome maps were drawn with GenomeVx (Conant and Wolfe 2008) and manually edited in Adobe Illustrator CS6 (Adobe Inc., San Jose, CA).

Ribosomal DNA Assembly and Annotation

Nuclear ribosomal DNA reads were filtered by mapping reads from Co. odontorhiza to a nearly complete rDNA operon reference (partial ETS [External Transcribed Spacer], 18S rDNA, ITS1 [Internal Transcribed Spacer], 5.8S rDNA, ITS2, 26S rDNA) from A. syriaca, described above, as well as mapping to an ITS-spanning sequence from Co. odontorhiza from GenBank (accession number EU391326). Other regions were not assembled (e.g., 5S rDNA). Resulting reads that mapped to each reference were assembled de novo in Geneious, and resulting contigs were merged through overlap in Sequencher. This draft assembly was used for a second round of read mapping (original Co. odontorhiza reads) and de novo assembly in Geneious, resulting in a single consensus contig. The consensus contig was based on 50% majority per site, bearing in mind that the rDNA operon exists in numerous repeated copies that often differ from one another, appearing as polymorphism in read pileups. The Co. odontorhiza rDNA consensus operon was then used as a reference in read mapping for all other Corallorhiza species, followed by individual de novo assemblies as above.

Nearly complete ribosomal DNA operons (partial ETS, 18S rDNA, ITS1, 5.8S, ITS2, 26S rDNA; excluding 5S rDNA) were annotated by first aligning finished contigs in Geneious (using the MUSCLE v3.8.31 plugin in Geneious; Edgar 2004) and transferring the annotation features from Co. odontorhiza to the aligned Corallorhiza sequences. Then the entire alignment was annotated in SEQUIN for submission to GenBank by adding each feature to Co. odontorhiza and subsequently propagating the features to the remaining sequences. Reads were again mapped in Geneious to each finished, annotated rDNA cistron to check for misassemblies. GenBank accession numbers for rDNA partial operons are KM390003–KM390012. Available outgroup sequences for Cymbidium, Oncidium, and Phalaenopsis were mined from GenBank for the purposes of downstream phylogenetic analyses (18S rDNA: Cy. farberi, GenBank accession number JN418934; ITS: Cy. aloifolium, JF729014; Phalaenopsis aphrodite, AY391543; O. andradeanum, FJ565598).

Data Matrix Assembly and Sequence Alignment

Annotated plastomes were imported into Geneious, with subsequent removal of the second copy of the IR (IRa), and aligned with MAFFT v7.157 (Katoh et al. 2002) using the “fast-progressive” method (gap opening penalty = 6, offset = 0.1), followed by manual adjustment. The alignment also included complete plastomes of Cy. aloifolium (GenBank accession number NC_021429), Oncidium Gower Ramsey (GenBank accession number NC_014056), and Phalaenopsis aphrodite (GenBank accession number NC_017049). A short inversion of the petN-psbM region in Cymbidium was removed for that accession in downstream analyses. Annotated CDS were exported from Geneious, and data from each locus were saved as separate FASTA files. Each file was submitted for codon-based alignment to the MACSE web server (Ranwez et al. 2011; http://mbb.univ-montp2.fr/MBB/subsection/softExec.php?soft=macse, last accessed May 30, 2014), which allows alignment of both intact open reading frames and incomplete sequences (including pseudogenes with modified reading frames). Alignments were adjusted manually when necessary and imported back into Geneious or SequenceMatrix (Vaidya et al. 2011), where they were concatenated into larger matrices. Protein-coding genes that contained pseudogenes (shifted reading frames, internal stop codons) in at least one taxon were excluded. Data matrices are deposited in supplementary fig. S6, Supplementary Material online.

Phylogenetic Analyses

Phylogenetic analyses were conducted using both Parsimony and ML. Parsimony searches were conducted in TNT (Goloboff et al. 2003, 2008), with 100 random addition starting sequences, saving 100 trees per replicate and using tree bisection–reconnection branch swapping on the pool of saved trees. Support was assessed through 2,000 Jackknife pseudoreplicates with the above search parameters. Model fit for ML analyses was assessed for the complete plastome alignment in MEGA v6 (Tamura et al. 2013), which yielded GTR+G as the best-fit model based on the corrected Akaike Information Criterion (AICc) metric (Burnham and Anderson 2002). All ML analyses were conducted in RAxML v8.0.20 (Stamatakis 2006). Three separate searches were initiated for each matrix analyzed from different random starting seeds, using an unpartitioned GTR+G model, to ensure convergence on the same topology. A codon-partitioned model was not chosen due to the possibility of relaxed functional constraints among some coding regions for some taxa, which could possibly represent model violations. Support was assessed with 1,000 standard bootstrap pseudoreplicates. Cymbidium was chosen as the outgroup taxon.

Plastid Pseudogenes, Genomic Deletions, and Genomic Attributes

The presence of pseudogenes and whole gene/regional deletions was initially assessed visually in DOGMA by identifying interrupted BLASTX reading frames. All coding loci aligned in MACSE (see above) were imported into Geneious, and sequences were translated to assess pseudogene status through the presence of frame shifts and/or premature stop codons. Loci with interrupted reading frames were considered to be pseudogenes unless there was an alternative start or stop codon within three codon positions of the other taxa in the alignment. In addition, the alignment of whole, annotated plastomes (described above) was used to visualize any larger genomic deletions spanning CDS boundaries, or rearrangements.
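The translation-based screen for interrupted reading frames can be illustrated with a small Python sketch. It is a simplified stand-in for the manual inspection in Geneious: the function name is hypothetical, the codon table is the standard assignment of codons to amino acids (shared by the plant plastid code), and the exception for alternative start or stop codons within three codon positions is not implemented.

```python
from itertools import product
from typing import List

# Standard codon assignments, bases ordered T, C, A, G ("*" marks a stop codon).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def reading_frame_problems(cds: str) -> List[str]:
    """Report features suggesting that a coding sequence is a pseudogene."""
    cds = cds.upper().replace("-", "")  # drop alignment gaps
    problems = []
    if len(cds) % 3 != 0:
        problems.append("length not a multiple of 3")
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    # Codons with ambiguous bases are translated as "X".
    protein = "".join(CODON_TABLE.get(c, "X") for c in codons)
    if "*" in protein[:-1]:  # a stop anywhere before the final codon
        problems.append("premature stop codon")
    return problems


if __name__ == "__main__":
    print(reading_frame_problems("ATGGCTTAA"))
    print(reading_frame_problems("ATGTAAGCTTAA"))
```

A length check of this kind only flags frame shifts that change the total length by a non-multiple of three; compensating indels would still require inspection of the alignment.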

Hypotheses of differences in plastid genomic attributes among nongreen versus green taxa, and correlations between chlorophyll content and genomic attributes, were assessed among taxa using two phylogeny-based analytical approaches. First, a Phylogenetic Analysis of Variance (Garland et al. 1993) was used to test for differences in continuous plastome variables (e.g., length, GC content) among green versus nongreen taxa (i.e., a discrete grouping variable), taking phylogenetic relationships into account. Analyses were run in the R package “geiger” (Harmon et al. 2008) with 10,000 simulations each using the “aov.phylo” function, based on an ultrametric tree generated through Nonparametric Rate Smoothing (Sanderson 1997). These analyses were run for a complete data set of 13 taxa (three leafy, green outgroup taxa and ten species of Corallorhiza). A second set of analyses was conducted on a subset of six taxa of Corallorhiza for which chlorophyll data were available, to mitigate problems with missing chlorophyll data. In these analyses, continuous genomic variables (length, number of putatively functional genes, GC content) were correlated with chlorophyll content (continuous) using Phylogenetically Independent Contrasts to account for phylogenetic relatedness among taxa (Felsenstein 1985). The latter analyses were conducted under a Brownian motion model with the PDAP package (Midford et al. 2005) in Mesquite v2.75 (Maddison WP and Maddison DR 2011).

Model-Based Analyses of Selective Regime

Changes in selective regime among species of Corallorhiza for codon-based matrices were assessed with the CODEML module in PAML v4.8 (Yang 1997, 2007) using “branch” models. Only loci for which all taxa had open reading frames were included in the analyses. Various matrices are listed in table 3. The objective was to test hypotheses of different ω ratios for 1) Corallorhiza versus leafy outgroups and 2) nongreen Corallorhiza versus green Corallorhiza + leafy outgroups. To accomplish this, different branch models were fit to the data using the topology resulting from the ML analysis of whole plastomes + rDNA. Six different branch models were tested. First, a model in which all branches evolve under one ω ratio (Model M0) was tested, where ω = dN/dS, the ratio of nonsynonymous substitutions per nonsynonymous site to synonymous substitutions per synonymous site. A two-ratio model (M1) was then run, in which all branches of Corallorhiza were allowed to have a different ω ratio (ω_cor) than the remaining leafy outgroup sequences (ω_leafy). A three-ratio model was then run (M2), allowing separate ω values for ω_leafy, ω_cor, and nongreen Corallorhiza (ω_str = ω_mac-ng; Co. striata, Co. mertensiana, Co. maculata var. maculata and occidentalis, respectively), followed by a four-ratio model (M3) with additionally separate ω ratios for Co. striata (ω_str) and the nongreen members of the Co. maculata complex (ω_mac-ng). The latter three models (M1–M3) specifically test hypotheses of whether Corallorhiza, and particularly nongreen Corallorhiza, display departures from purifying selection. Two additional models were run to test whether the nongreen Corallorhiza differ significantly from green Corallorhiza and leafy outgroup taxa. Specifically, Model M4 (two ratios) sets ω_leafy = ω_cor and ω_str = ω_mac-ng, whereas M5 (three ratios) allows ω_str and ω_mac-ng to differ. Successively nested models were compared based on log-likelihood ratio tests (M1 vs. M0, M2 vs. M1, M3 vs. M2, M4 vs. M0, M5 vs. M4), where the likelihood ratio test statistic was calculated as two times the difference in log-likelihood between the two models, with degrees of freedom calculated as the difference in the number of free parameters for the models, and tested against the χ² distribution. P values for multiple comparisons to the same data set were Bonferroni-corrected (Dunn 1959), as suggested by Yang (2007), with the corrected significance level corresponding to α/m, where m is the number of branches being tested. Genes ycf1 and ycf2 were analyzed separately because they have extraordinarily high substitution rates (e.g., Barrett et al. 2013), as was matK, based on the finding of a pseudogenized copy in a previous study (Freudenstein and Senyo 2008).
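The nested-model comparison can be made concrete with a short Python sketch of a likelihood ratio test with a Bonferroni-adjusted threshold. It is illustrative only: the function names and the log-likelihood values in the usage example are hypothetical, not CODEML output from this study, and the chi-square tail probability is computed with the standard library for integer degrees of freedom.

```python
import math
from typing import Tuple


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square distribution (integer df >= 1)."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed forms for df = 1 and df = 2, then the recurrence
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1), with a = k / 2.
    if df % 2 == 1:
        k, sf = 1, math.erfc(math.sqrt(half))
    else:
        k, sf = 2, math.exp(-half)
    while k < df:
        sf += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return sf


def likelihood_ratio_test(
    lnl_null: float, lnl_alt: float, df: int, n_tests: int = 1, alpha: float = 0.05
) -> Tuple[float, float, bool]:
    """Return (statistic, P value, significant at alpha / n_tests)."""
    stat = 2.0 * (lnl_alt - lnl_null)
    p = chi2_sf(stat, df)
    return stat, p, p < alpha / n_tests


if __name__ == "__main__":
    # Hypothetical log-likelihoods for a one-ratio vs. two-ratio comparison.
    print(likelihood_ratio_test(-1000.0, -995.0, df=1, n_tests=3))
```

Here `n_tests` plays the role of m in the Bonferroni correction, so a comparison is called significant only if its P value falls below α/m.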

To further investigate the ω values among branches for the atp complex, two sets of branch-site models were tested against the data. In the first, all Corallorhiza were set as the foreground branch, and in the second, nongreen Corallorhiza (Co. striata vreelandii, Co. maculata var. maculata and occidentalis, and Co. mertensiana) were set as foreground branches. For both sets of branch-site models, two alternative models were tested. The “null” model specifies 0 < ω0 < 1 for a proportion of sites and ω1 = ω2 = 1 for the remaining sites (no positive selection), whereas the alternative model allows ω2 > 1 (positive selection on some sites). The alternative model (some sites under positive selection) was then tested against the null model (no sites under positive selection) using a χ² distribution as above, with 1 degree of freedom (i.e., the alternative and null models differ by one parameter).

Comparison with Other Parasitic Angiosperm Lineages

Plastid gene content (losses, pseudogenes, putatively functional genes) was compared among diverse lineages of angiosperms containing heterotrophs, including newly sequenced Corallorhiza. Taxa and genes were sorted by the numbers of putatively functional genes and gene functional class, respectively, to illustrate the different stages of plastid genome degradation. Specifically, genes were arranged according to the model of Barrett and Davis (2012), in which degradation of plastid functional gene classes is hypothesized to have occurred roughly in the following order: 1) NAD(P)H dehydrogenase genes (ndh); 2) “photosynthesis”-related genes including pet, psa, psb, cemA, ccsA, rbcL, ycf3, and ycf4; 3) Plastid-encoded RNA Polymerase genes (rpo); 4) ATP synthase genes (atp); and finally 5) “housekeeping” genes (rpl, rps, rrn, trn, matK, clpP, infA, accD, ycf1, and ycf2). Though not a formal test in the strict sense, the use of this model does allow an assessment of how advanced the sequenced accessions of Corallorhiza are when compared with other heterotrophic angiosperms; it is hypothesized that Corallorhiza occupies the early stages of this model. Technically, atp genes function in the generation of ATP and movement of protons across the thylakoid membrane, and thus play a crucial role in photosynthesis, but this functional class was kept separate from photosynthesis genes due to the possibility that ATP synthase serves roles outside of photosynthesis.

Supplementary Material

Supplementary figures S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by the National Science Foundation (awards DEB-0830020 to Jerrold I. Davis and DBI 1110443 to J.C.P.), the California State University Program for Education and Research in Biotechnology (CSUPERB) to C.F.B., the California State University, Los Angeles Center for Effective Teaching and Learning Faculty-Student Mentorship award to C.F.B. and J.L., and the California State University Council on Ocean Affairs, Science and Technology to C.S. The authors are grateful to Jerrold Davis, Patrick Edger, Ryan Ellingson, Gwynne Lim, Jeffrey Morawetz, and Chris Randle for feedback and laboratory/bioinformatics support; the staff at Cold Spring Harbor Laboratory (Woodbury, NY) for sample preparation and sequencing; Daniel Lima and Richard Sayre for assistance with chlorophyll spectroscopy and data interpretation; Gerardo Salazar (UNAM) for assistance in collecting Mexican specimens; and Maria Logacheva and three anonymous reviewers for critical insight that improved the paper.

References

Allen JF, de Paula WBM, Puthiyaveetil S, Nield J. 2011. A structural phylogenetic map for chloroplast photosynthesis. Trends Plant Sci. 16:645–655.

Barbrook AC, Howe CJ, Purton S. 2006. Why are plastid genomes retained in non-photosynthetic organisms? Trends Plant Sci. 11:101–108.

Barrett CF, Davis JI. 2012. The plastid genome of the mycoheterotrophic Corallorhiza striata (Orchidaceae) is in the relatively early stages of degradation. Am J Bot. 99:1513–1523.

Barrett CF, Davis JI, Leebens-Mack J, Conran JG, Stevenson DW. 2013. Plastid genomes and deep relationships among the commelinid monocot angiosperms. Cladistics 29:65–87.

Barrett CF, Freudenstein JV. 2008. Molecular evolution of rbcL in the mycoheterotrophic coralroot orchids (Corallorhiza Gagnebin, Orchidaceae). Mol Phylogenet Evol. 47:665–679.

Bellino A, Alfani A, Selosse M-A, Guerrieri R, Borghetti M, Baldantoni D. 2014. Nutritional regulation in mixotrophic plants: new insights from Limodorum abortivum. Oecologia 175:875–885.

Bernardi G. 1989. The isochore organization of the human genome. Annu Rev Genet. 23:637–661.

Bidartondo MI. 2005. The evolutionary ecology of myco-heterotrophy. New Phytol. 167:335–352.

Blazier J, Guisinger MM, Jansen RK. 2011. Recent loss of plastid-encoded ndh genes within Erodium (Geraniaceae). Plant Mol Biol. 76:263–272.

Braukmann T, Kuzmina M, Stefanovic S. 2013. Plastid genome evolution across the genus Cuscuta (Convolvulaceae): two clades within subgenus Grammica exhibit extensive gene loss. J Exp Bot. 64:977–989.

Braukmann T, Stefanovic S. 2012. Plastid genome evolution in mycoheterotrophic Ericaceae. Plant Mol Biol. 79:5–20.

Braukmann TWA, Kuzmina M, Stefanovic S. 2009. Loss of all plastid ndh genes in Gnetales and conifers: extent and evolutionary significance for the seed plant phylogeny. Curr Genet. 55:323–337.

Burnham KP, Anderson DR. 2002. Avoiding pitfalls when using information-theoretic methods. J Wildl Manage. 66:912–918.

Cameron DD, Preiss K, Gebauer G, Read DJ. 2009. The chlorophyll-containing orchid Corallorhiza trifida derives little carbon through photosynthesis. New Phytol. 183:358–364.

Chang CC, Lin HC, Lin IP, Chow TY, Chen HH, Chen WH, Cheng CH, Lin CY, Liu SM, Chang CC, et al. 2006. The chloroplast genome of Phalaenopsis aphrodite (Orchidaceae): comparative analysis of evolutionary rate with that of grasses and its phylogenetic implications. Mol Biol Evol. 23:279–291.

Conant GC, Wolfe KH. 2008. GenomeVx: simple web-based creation of editable circular chromosome maps. Bioinformatics 24:861–862.

Cummings MP, Welschmeyer NA. 1998. Pigment composition of putatively achlorophyllous angiosperms. Plant Syst Evol. 210:105–111.

Delannoy E, Fujii S, Colas des Francs-Small C, Brundrett M, Small I. 2011. Rampant gene loss in the underground orchid Rhizanthella gardneri highlights evolutionary constraints on plastid genomes. Mol Biol Evol. 28:2077–2086.

Doyle JJ, Doyle JL. 1987. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem Bull. 19:11–15.

Dunn OJ. 1959. Confidence intervals for the means of dependent, normally distributed variables. J Am Stat Assoc. 54:613–621.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Evron Y, Johnson EA, McCarty RE. 2000. Regulation of proton flow and ATP synthesis in chloroplasts. J Bioenerg Biomembr. 32:501–506.

Fang YJ, Wu H, Zhang TW, Yang M, Yin YX, Pan LL, Yu XG, Zhang XW, Hu SNA, Al-Mssallem IS, et al. 2012. A complete sequence and transcriptomic analyses of date palm (Phoenix dactylifera L.) mitochondrial genome. PLoS One 7:e37164.

Felsenstein J. 1985. Phylogenies and the comparative method. Am Nat. 125:1–15.

Freudenstein JV. 1992. Systematics of Corallorhiza and the Corallorhizinae (Orchidaceae) [PhD dissertation]. [Ithaca (NY)]: Cornell University.

Freudenstein JV. 1997. A monograph of Corallorhiza (Orchidaceae). Harv Pap Bot. 10:5–51.

Freudenstein JV, Barrett CF. 2010. Mycoheterotrophy and diversity in Orchidaceae. In: Seberg O, Petersen G, Barfod A, Davis JI, editors. Diversity, phylogeny and evolution in the monocotyledons. The Proceedings of the Fourth International Conference on Monocot Systematics. Aarhus (Denmark): Aarhus University Press. p. 25–37.

Freudenstein JV, Doyle JJ. 1994a. Character transformation and relationships in Corallorhiza (Orchidaceae: Epidendroideae). 1. Plastid DNA. Am J Bot. 81:1449–1457.

Freudenstein JV, Doyle JJ. 1994b. Plastid DNA, morphological variation, and the phylogenetic species concept: the Corallorhiza maculata (Orchidaceae) complex. Syst Bot. 19:273–290.

Freudenstein JV, Senyo DM. 2008. Relationships and evolution of matK in a group of leafless orchids (Corallorhiza and Corallorhizinae; Orchidaceae: Epidendroideae). Am J Bot. 95:498–505.

Funk HT, Berg S, Krupinska K, Maier UG, Krause K. 2007. Complete DNA sequences of the plastid genomes of two parasitic flowering plant species, Cuscuta reflexa and Cuscuta gronovii. BMC Plant Biol. 7:45.

Galagan JE, Calvo SE, Borkovich KA, Selker EU, Read ND, Jaffe D, FitzHugh W, Ma LJ, Smirnov S, Purcell S, et al. 2003. The genome sequence of the filamentous fungus Neurospora crassa. Nature 422:859–868.

Garland T, Dickerman AW, Janis CM, Jones JA. 1993. Phylogenetic analysis of covariance by computer simulation. Syst Biol. 42:265–292.

Glemin S, Clement Y, David J, Ressayre A. 2014. GC content evolution in coding regions of angiosperm genomes: a unifying hypothesis. Trends Genet. 30:263–270.

Goloboff PA, Farris JS, Kallersjo M, Oxelman B, Ramirez MJ, Szumik CA. 2003. Improvements to resampling measures of group support. Cladistics 19:324–332.

Goloboff PA, Farris JS, Nixon KC. 2008. TNT, a free program for phylogenetic analysis. Cladistics 24:774–786.

Hajdukiewicz PTJ, Allison LA, Maliga P. 1997. The two RNA polymerases encoded by the nuclear and the plastid compartments transcribe distinct groups of genes in tobacco plastids. EMBO J. 16:4041–4048.

Harmon LJ, Weir JT, Brock CD, Glor RE, Challenger W. 2008. GEIGER: investigating evolutionary radiations. Bioinformatics 24:129–131.

Iles WJD, Smith SY, Graham SW. 2013. A well-supported phylogenetic framework for the monocot order Alismatales reveals multiple losses of the plastid NADH dehydrogenase complex and a strong long-branch effect. In: Wilkin P, Mayo S, editors. Early events in monocot evolution. Cambridge: Cambridge University Press. p. 1–28.

Iorizzo M, Grzebelus D, Senalik D, Szklarczyk M, Spooner D, Simon P. 2012. Against the traffic: the first evidence for mitochondrial DNA transfer into the plastid genome. Mob Genet Elements. 2:261.

Jensen PE, Bassi R, Boekema EJ, Dekker JP, Jansson S, Leister D, Robinson C, Scheller HV. 2007. Structure, function and regulation of plant photosystem I. Biochim Biophys Acta. 1767:335–352.

Katinka MD, Duprat S, Cornillot E, Metenier G, Thomarat F, Prensier G, Barbe V, Peyretaillade E, Brottier P, Wincker P, et al. 2001. Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature 414:450–453.

Katoh K, Misawa K, Kuma Ki, Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30:3059–3066.

Kawakami K, Umena Y, Iwai M, Kawabata Y, Ikeuchi M, Kamiya N, Shen J-R. 2011. Roles of PsbI and PsbM in photosystem II dimer formation and stability studied by deletion mutagenesis and X-ray crystallography. Biochim Biophys Acta. 1807:319–325.

Knaus BJ. 2010. Short read toolbox. Available from: http://brianknaus.com

Krause K. 2008. From chloroplasts to “cryptic” plastids: evolution of plastid genomes in parasitic plants. Curr Genet. 54:111–121.

Krause K. 2012. Plastid genomes of parasitic plants: a trail of reductions and losses. In: Bullerwell C, editor. Organelle genetics. Berlin (Germany): Springer. p. 79–103.

Kuijt J. 1969. The biology of flowering parasitic plants. Berkeley (CA): University of California Press.

Lawrence JG. 2005. Common themes in the genome strategies of pathogens. Curr Opin Genet Dev. 15:584–588.

Leake JR. 1994. The biology of myco-heterotrophic (Saprophytic) plants. New Phytol. 127:171–216.

Leake JR, Cameron DD. 2010. Physiological ecology of mycoheterotrophy. New Phytol. 185:601–605.

Liere K, Weihe A, Borner T. 2011. The transcription machineries of plant mitochondria and chloroplasts: composition, function, and regulation. J Plant Physiol. 168:1345–1360.

Logacheva MD, Schelkunov MI, Nuraliev MS, Samigullin TH, Penin AA. 2014. The plastid genome of mycoheterotrophic monocot Petrosavia stellaris exhibits both gene losses and multiple rearrangements. Genome Biol Evol. 6:238–246.

Logacheva MD, Schelkunov MI, Penin AA. 2011. Sequencing and analysis of plastid genome in mycoheterotrophic orchid Neottia nidus-avis. Genome Biol Evol. 3:1296–1303.

Maddison WP, Maddison DR. 2011. Mesquite: a modular system for evolutionary analysis. Version 2.75. Available from: http://mesquiteproject.org

Martín M, Funk HT, Serrot PH, Poltnigg P, Sabater B. 2009. Functional characterization of the thylakoid ndh complex phosphorylation by site-directed mutations in the ndhF gene. Biochim Biophys Acta. 1787:920–928.

Martín M, Sabater B. 2010. Plastid ndh genes in plant evolution. Plant Physiol Biochem. 48:636–645.

McCarty RE, Evron Y, Johnson EA. 2000. The chloroplast ATP synthase: a rotary enzyme? Annu Rev Plant Physiol Plant Mol Biol. 51:83–109.

McNeal JR, Arumugunathan K, Kuehl JV, Boore JL, Depamphilis CW. 2007. Systematics and plastid genome evolution of the cryptically photosynthetic parasitic plant genus Cuscuta (Convolvulaceae). BMC Biol. 5:55.

McNeal JR, Bennett JR, Wolfe AD, Mathews S. 2013. Phylogeny and origins of holoparasitism in Orobanchaceae. Am J Bot. 100:971–983.

Merckx V, Freudenstein JV. 2010. Evolution of mycoheterotrophy in plants: a phylogenetic perspective. New Phytol. 185:605–609.

Midford PE, Garland T Jr, Maddison WP. 2005. PDAP package of Mesquite. Version 1.07. Available from: http://mesquiteproject.org

Molina J, Hazzouri KM, Nickrent D, Geisler M, Meyer RS, Pentony MM, Flowers JM, Pelser P, Barcelona J, Inovejas SA, et al. 2014. Possible loss of the chloroplast genome in the parasitic flowering plant Rafflesia lagascae (Rafflesiaceae). Mol Biol Evol. 31:793–803.

Montfort C, Küsters E. 1940. Saprophytismus und Photosynthese. I. Biochemische physiologische Studien an Humus-Orchideen. Botanisches Archiv 40:571–633.

Moran NA. 2002. Microbial minimalism: genome reduction in bacterial pathogens. Cell 108:583–586.

Morrison HG, McArthur AG, Gillin FD, Aley SB, Adam RD, Olsen GJ, Best AA, Cande WZ, Chen F, Cipriano MJ, et al. 2007. Genomic minimalism in the early diverging intestinal parasite Giardia lamblia. Science 317:1921–1926.

Nickrent DL, Yan OY, Duff RJ, dePamphilis CW. 1997. Do nonasterid holoparasitic flowering plants have plastid genomes? Plant Mol Biol. 34:717–729.

Patel RK, Jain M. 2012. NGS QC Toolkit: a toolkit for quality control of next generation sequencing data. PLoS One 7:e30619.

Peredo EL, King UM, Les DH. 2013. The plastid genome of Najas flexilis: adaptation to submersed environments is accompanied by the complete loss of the NDH complex in an aquatic angiosperm. PLoS One 8:e68591.

Porra RJ. 2002. The chequered history of the development and use of simultaneous equations for the accurate determination of chlorophylls a and b. Photosyn Res. 73:149–156.

Protasio AV, Tsai IJ, Babbage A, Nichol S, Hunt M, Aslett MA, De Silva N, Velarde GS, Anderson TJC, Clark RC, et al. 2012. A systematically improved high quality genome and transcriptome of the human blood fluke Schistosoma mansoni. PLoS Negl Trop Dis. 6:e1455.

Randle CP, Wolfe AD. 2005. The evolution and expression of rbcL in holoparasitic sister-genera Harveya and Hyobanche (Orobanchaceae). Am J Bot. 92:1575–1585.

Ranwez V, Harispe S, Delsuc F, Douzery EJP. 2011. MACSE: Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons. PLoS One 6:e22594.

Sanderson MJ. 1997. A nonparametric approach to estimating divergence times in the absence of rate constancy. Mol Biol Evol. 14:1218–1231.

Smith DR, Lee RW. 2014. A plastid without a genome: evidence from the nonphotosynthetic green algal genus Polytomella. Plant Physiol. 164:1812–1819.

Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690.

Straub SCK, Cronn RC, Edwards C, Fishbein M, Liston A. 2013. Horizontal transfer of DNA from the mitochondrial to the plastid genome and its subsequent evolution in milkweeds (Apocynaceae). Genome Biol Evol. 5:1872–1885.

Straub SCK, Fishbein M, Livshultz T, Foster Z, Parks M, Weitemier K, Cronn RC, Liston A. 2011. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing. BMC Genomics 12:211.

Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. 2013. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol. 30:2725–2729.

Tsai IJ, Zarowiecki M, Holroyd N, Garciarrubio A, Sanchez-Flores A, Brooks KL, Tracey A, Bobes RJ, Fragoso G, Sciutto E, et al. 2013. The genomes of four tapeworm species reveal adaptations to parasitism. Nature 496:57–63.

Umate P, Schwenkert S, Karbat I, Dal Bosco C, Mlcochova L, Volz S, Zer H, Herrmann RG, Ohad I, Meurer J. 2007. Deletion of psbM in tobacco alters the Q(B) site properties and the electron flow within photosystem II. J Biol Chem. 282:9758–9767.

Vaidya G, Lohman DJ, Meier R. 2011. SequenceMatrix: concatenation software for the fast assembly of multi-gene datasets with character set and codon information. Cladistics 27:171–180.

Westwood JH, Yoder JI, Timko MP, dePamphilis CW. 2010. The evolution of parasitism in plants. Trends Plant Sci. 15:227–235.

Wicke S, Muller KF, de Pamphilis CW, Quandt D, Wickett NJ, Zhang Y, Renner SS, Schneeweiss GM. 2013. Mechanisms of functional and physical genome reduction in photosynthetic and nonphotosynthetic parasitic plants of the broomrape family. Plant Cell 25:3711–3725.

Wicke S, Schäferhoff B, dePamphilis CW, Müller KF. 2014. Disproportional plastome-wide increase of substitution rates and relaxed purifying selection in genes of carnivorous Lentibulariaceae. Mol Biol Evol. 31:529–545.

Wicke S, Schneeweiss GM, dePamphilis CW, Muller KF, Quandt D. 2011. The evolution of the plastid chromosome in land plants: gene content, gene order, gene function. Plant Mol Biol. 76:273–297.

Wickett NJ, Honaas LA, Wafula EK, Das M, Huang K, Wu BA, Landherr L, Timko MP, Yoder J, Westwood JH, et al. 2011. Transcriptomes of the parasitic plant family Orobanchaceae reveal surprising conservation of chlorophyll synthesis. Curr Biol. 21:2098–2104.

Wimpee CF, Wrobel RL, Garvin DK. 1991. A divergent plastid genome in Conopholis americana, an achlorophyllous parasitic plant. Plant Mol Biol. 17:161–166.

Wolfe AD, dePamphilis CW. 1998. The effect of relaxed functional constraints on the photosynthetic gene rbcL in photosynthetic and nonphotosynthetic parasitic plants. Mol Biol Evol. 15:1243–1258.

Wolfe KH, Morden CW, Palmer JD. 1992. Function and evolution of a minimal plastid genome from a nonphotosynthetic parasitic plant. Proc Natl Acad Sci U S A. 89:10648–10652.

Wu FH, Chan MT, Liao DC, Hsu CT, Lee YW, Daniell H, Duvall MR, Lin CS. 2010. Complete chloroplast genome of Oncidium Gower Ramsey and evaluation of molecular markers for identification and breeding in Oncidiinae. BMC Plant Biol. 10:68.

Wyman SK, Jansen RK, Boore JL. 2004. Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20:3252–3255.

Yang JB, Tang M, Li HT, Zhang ZR, Li DZ. 2013. Complete chloroplast genome of the genus Cymbidium: lights into the species identification, phylogenetic implications and population genetic analyses. BMC Evol Biol. 13:84.

Yang ZH. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci. 13:555–556.

Yang ZH. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 24:1586–1591.

Yoder OC, Turgeon BG. 2001. Fungal genomics and pathogenicity. Curr Opin Plant Biol. 4:315–321.

Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18:821–829.

Zimmer K, Meyer C, Gebauer G. 2008. The ectomycorrhizal specialist orchid Corallorhiza trifida is a partial myco-heterotroph. New Phytol. 178:395–400.

Based on this latter finding, it might be expected that all members of Corallorhiza should show some degradation in rpo genes, as leaves do not fully develop in any species within this genus (i.e., only the basal leaf sheath persists, but the blade or lamina never develops), presumably obviating the role of PEP in transcription of photosynthesis-related genes (e.g., psa, psb) in leaves. However, green coralroots have intact reading frames for all rpo genes, whereas nongreen coralroots show varying degrees of degradation in the rpo complex. As with photosynthesis-related genes in Corallorhiza (figs. 2 and 3), it does not appear to be the transition to leaflessness per se that has resulted in degradation of the rpo complex, but more likely the shift from partial heterotrophy (and the associated need to carry out photosynthesis, however small the amount) to strict heterotrophy. An interesting parallel occurs in the carnivorous family Lentibulariaceae, in which rpo genes display significant departures from purifying selection (Wicke et al. 2014). A useful follow-up would be to determine whether degradation first begins among photosystem genes (psa, psb), diminishing the importance of having a functional PEP complex, or whether the opposite is the case, in which a dysfunctional (or nonexistent) PEP has contributed to relaxed selective pressures associated with photosynthesis-related genes (i.e., if psa and psb genes can no longer be transcribed, or are transcribed by an inefficient PEP, then is there any benefit to maintaining them?). Regardless, the degradation of rpo genes in nongreen Corallorhiza closely mirrors that in genes directly involved in photosynthesis (psa/b, rbcL, etc.).

Relationships in Corallorhiza

Complete plastome sequences of Corallorhiza yielded a highly resolved, highly supported tree, consistent with previous studies (Freudenstein and Doyle 1994a, 1994b; Barrett and Freudenstein 2008; Freudenstein and Senyo 2008). Here, for the first time, relationships among the six taxa that comprise the Co. maculata complex (as recognized in Freudenstein 1997) are fully resolved, with high support for Co. bulbosa as sister to the rest of the complex (bootstrap = 91), whereas previous studies could not resolve relationships in this complex (Barrett and Freudenstein 2008; Freudenstein and Senyo 2008). Complete plastomes also confirm the sister relationship of Co. striata to the rest of Corallorhiza, with Co. trifida (moving away from the base of the tree), then Co. odontorhiza and Co. wisteriana diverging successively. Depending on the data set analyzed, Co. bulbosa appears in a few different places, most likely due to homoplasy and the few variable characters available to resolve the position of this species in smaller data sets, as evidenced by short internal branches (supplementary fig. S3, Supplementary Material online). Thus, inclusion of complete plastomes (pseudogenes, spacers, and introns) resolves the otherwise uncertain position of this species. Nuclear rDNA, on the other hand, places Co. bulbosa as sister to (Co. macrantha, Co. maculata mexicana) with moderate support, a similar finding to the combined analysis of Barrett and Freudenstein (2008) based on plastid rbcL + nuclear ITS, though that study did not include Co. macrantha. Combining all nuclear and plastid data places Co. bulbosa as sister to the remaining members of the Co. maculata complex, with 100% bootstrap support (supplementary fig. S3, Supplementary Material online). The complete plastome + rDNA data set fully resolves relationships among the major species complexes of Corallorhiza (excluding members not sampled from some complexes).

In particular, relationships within the Co. maculata complex shed light on a phylogeographically intriguing aspect of plastome degradation. All Mexican members of the Co. maculata complex have at least some visible green tissue and are presumed to be partially mycoheterotrophic, based on conservation of the plastid-encoded photosynthetic apparatus (for the most part; see fig. 2). Those members of the Co. maculata complex that occur in northern North America have no visible green tissue (two of these species were shown to have highly reduced chlorophyll levels; fig. 1) and display evidence of degradation in these genes. The topology of the Co. maculata complex based on combined plastome + rDNA data (supplementary fig. S3, Supplementary Material online; fig. 2) lends support to a previous hypothesis that this complex (and possibly the genus Corallorhiza itself) originated in southern Mexico (Freudenstein and Doyle 1994a, 1994b; Barrett and Freudenstein 2008; Freudenstein and Senyo 2008) and expanded into northern North America, with the ancestor of Co. mertensiana, Co. maculata maculata, and Co. maculata occidentalis undergoing a shift from partial to strict heterotrophy (and continuing this trend among extant lineages), and thus presumably losing the ability to carry out photosynthesis. This suggests that deeper sampling among individuals in members of both the Co. maculata complex and the Co. striata complex (which is here represented only by Co. striata vreelandii) will yield a much clearer picture of plastome degradation in Corallorhiza on a phylogeographic scale.

Selective Regime among Genes with Open Reading Frames

Model M4, allowing green and nongreen taxa to have different ω-ratios (ω_leafy = ω_cor, ω_mac-ng = ω_str), fit the data significantly better than the one-ratio model for both housekeeping genes and photosynthesis-related genes, suggesting different selective scenarios for nongreen taxa relative to green taxa, but these comparisons are nonsignificant when corrected for multiple comparisons (table 3). For housekeeping genes, ω values for green versus nongreen taxa did not differ by much (ω_leafy = ω_cor = 0.30267; ω_mac-ng = ω_str = 0.34230; table 3). However, this difference is much more pronounced for photosynthesis-related genes (ω_leafy = ω_cor = 0.27361; ω_mac-ng = ω_str = 0.73720), with the estimate for nongreen taxa being approximately 2.7× that of green taxa and approaching the value expected under neutral evolution (ω ≈ 1). Thus, even though some photosynthesis genes in nongreen taxa still have intact reading frames, there may be a relaxation of selective pressure associated with these genes.
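The model comparison described above can be illustrated with a minimal sketch of a likelihood-ratio test between the one-ratio model and a two-ratio model such as M4, which adds one free ω parameter (one degree of freedom). The log-likelihood values, the number of tests, and the use of a Bonferroni correction are assumptions chosen for illustration; the correction actually applied and the real values are those reported in table 3.

```python
import math


def lrt_pvalue_df1(lnl_null: float, lnl_alt: float) -> float:
    """P-value of a likelihood-ratio test with one degree of freedom."""
    stat = 2.0 * (lnl_alt - lnl_null)
    if stat <= 0.0:
        return 1.0
    # For 1 df, the chi-square survival function equals erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni-adjusted p-value (assumed correction, capped at 1)."""
    return min(1.0, p_value * n_tests)


# Hypothetical log-likelihoods, NOT the values behind table 3.
lnl_m0, lnl_m4 = -10250.0, -10247.5
p_raw = lrt_pvalue_df1(lnl_m0, lnl_m4)
p_adj = bonferroni(p_raw, n_tests=5)  # number of tests is also hypothetical
print(f"raw p = {p_raw:.4f}, adjusted p = {p_adj:.4f}")

# Ratio of the omega estimates reported for photosynthesis-related genes.
omega_green, omega_nongreen = 0.27361, 0.73720
omega_fold = omega_nongreen / omega_green
print(f"nongreen/green omega ratio = {omega_fold:.2f}")
```

With these made-up likelihoods the raw test falls below 0.05 while the adjusted one does not, which is the pattern described for housekeeping and photosynthesis-related genes.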

For atp genes, Models M1 and M4 fit the data significantly better than does M0, even after correcting for multiple comparisons, suggesting different selective regimes in Corallorhiza versus leafy green outgroups, or at least in nongreen Corallorhiza versus green Corallorhiza and leafy outgroups. For example, in model M1, the ω value for Corallorhiza is nearly twice that for leafy orchids (ω_leafy = 0.15175; ω_cor = ω_mac-ng = ω_str = 0.26949), and in model M4, nongreen Corallorhiza have ω values over twice those for green Corallorhiza and leafy orchids. Further investigations using branch-site models, allowing some sites in the “foreground” branches to be under positive selection, suggest evidence for positive selection in nongreen Corallorhiza for the atp complex.

The plastid-encoded atp genes play a critical role in photosynthesis by both generating ATP and translocating protons across the thylakoid membrane (Evron et al. 2000; McCarty et al. 2000; Allen et al. 2011). However, in taxa that display evidence of loss-of-function for photosystem and other photosynthesis-related genes, the fact that atp genes are preserved as intact reading frames suggests that they may play additional roles in plastid function (discussed in Wicke et al. 2013). This is bolstered by the fact that this complex shows evidence of positive selection in strictly heterotrophic, nongreen taxa, but not for Corallorhiza as a whole. This would support the hypothesis that the atp complex has either taken on additional, as-yet-unknown roles in holomycotrophic plastids or, at a minimum, has fallen under increased selective pressure to adapt to the overall “alternative lifestyle” associated with strict fungal heterotrophy in nonphotosynthetic plastids (including in amyloplasts, chromoplasts, elaioplasts, proteinoplasts, etc.). A future research objective is to examine the potential biological roles and molecular evolutionary patterns of the atp complex in nonphotosynthetic plants and their green, photosynthetic relatives, as members of this gene complex are often preserved and putatively functional among various heterotrophic angiosperms (see fig. 3 and references therein, including discussion in Wicke et al. 2013).

Comparison of Corallorhiza with Other Parasitic Angiosperms Using the Model of Barrett and Davis (2012)

A comparison of the plastomes of several heterotrophic taxa that have been sequenced to date shows that Corallorhiza is not quite in the “advanced” stages of plastid genome degradation relative to several taxa in Orobanchaceae (e.g., Wicke et al. 2013) or other orchids such as Neottia (Logacheva et al. 2011) and Rhizanthella (Delannoy et al. 2011) (fig. 3). Interpreted in the context of the basic model presented in Barrett and Davis (2012), members of Corallorhiza fall into categories 1 and 3. Taxa with some visible green tissue display pseudogenes/deletions mainly in the ndh complex (category 1; gene-class categories are listed at the top of fig. 3), with a few additional changes, whereas those with no green tissue have experienced pseudogenization and deletions in ndh, photosynthesis-related, and rpo genes, but not in atp genes or other “housekeeping” genes (category 3, excepting the loss of a redundant trnT-GGU in Co. striata vreelandii; Barrett and Davis 2012; figs. 2 and 3). Thus, nonparasitic autotrophs and partial heterotrophs roughly fall either into category 0 (no degradation of plastid genes) or category 1 (degradation of the ndh complex only). More specifically, all partial heterotrophs fall into category 1, with evidence of degradation of the ndh complex.

Strict heterotrophs fall into more “advanced” categories. Several, but not all, members of Orobanchaceae and two orchids (Neottia and Rhizanthella) can be classified in category 5, in which there is evidence of degradation of atp genes and at least some evidence of degradation in “housekeeping” genes (e.g., rpl, rps, trn, etc.). There are some genes for which no taxa have experienced pseudogenization or loss, including rpl2, rpl16, rpl20, rps2, rps4, rps8, rps11, ycf1, ycf2, trnD-guc, trnE-uuc, trnI-cau, trnM-cau, trnQ-uug, trnY-gua, rrn16, rrn23, rrn4.5, and rrn5. This suggests that they may be indispensable for plastid function, at least in the angiosperms included in this comparison, and that a “minimal set” of genes might be necessary for functional plastid “housekeeping” and perhaps other functions outside of photosynthesis and ATP production.

Overall, no taxa fell in categories 2 or 4, suggesting that degradation of the rpo genes occurs more or less simultaneously with degradation in photosynthesis-related genes, and that degradation of atp genes occurs along with that in “housekeeping” genes (also discussed in Wicke et al. 2013). Thus, the model of Barrett and Davis (2012) could be modified in light of more recently sequenced plastomes, including those of Corallorhiza, to have five general categories: 1) no degradation; 2) degradation in the ndh complex only; 3) degradation of the ndh complex and photosynthesis-related genes, including rpo genes; 4) degradation in all plastid gene systems, including “housekeeping” and atp genes; and 5) complete or nearly complete loss of the plastid genome (as is hypothesized to have occurred in some strict heterotrophs, such as Corynaea and Rafflesia [Nickrent et al. 1997; Molina et al. 2014]).
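The five categories of the modified model can be expressed as a simple decision rule. The sketch below is only an illustration: the function name and the reduction of each gene class to a single true/false flag are assumptions, and real plastomes (e.g., those with a few isolated pseudogenes, or the Cuscuta pattern discussed in the text) will not always map cleanly onto one category.

```python
def degradation_category(ndh_degraded: bool,
                         photosynthesis_rpo_degraded: bool,
                         atp_housekeeping_degraded: bool,
                         plastome_lost: bool = False) -> int:
    """Assign a plastome to one of the five categories of the modified model.

    Each flag is a simplification (hypothetical): True means at least some
    pseudogenes or deletions were observed in that gene class.
    """
    if plastome_lost:
        return 5  # complete or nearly complete loss of the plastid genome
    if atp_housekeeping_degraded:
        return 4  # degradation in all systems, incl. housekeeping and atp
    if photosynthesis_rpo_degraded:
        return 3  # ndh plus photosynthesis-related and rpo genes
    if ndh_degraded:
        return 2  # ndh complex only
    return 1      # no degradation


# Green (partially heterotrophic) vs. nongreen (strictly heterotrophic)
# coralroots, following the placement given in the text.
green = degradation_category(True, False, False)
nongreen = degradation_category(True, True, False)
print(green, nongreen)
```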

There is some degree of deviation from the Barrett and Davis (2012) model, as might be expected given the potential for somewhat idiosyncratic patterns of plastome evolution in unrelated groups. The obvious exception to this model would be the sequenced members of the genus Cuscuta in the family Convolvulaceae. One of the best-studied parasitic systems in terms of plastome evolution, Cuscuta contains members with “extreme” plastome reduction relative to closely related leafy species (approximately 50% total size reduction in Cu. gronovii and Cu. obtusiflora relative to the leafy Ipomoea) (fig. 3 and references therein). Yet both highly reduced species of Cuscuta retain most, if not all, of the genes for photosynthesis, but have degraded RNA polymerase complexes and also show evidence of pseudogenes/deletions of some housekeeping genes. A further survey based on probe hybridization in Cuscuta, with greatly increased taxon sampling, by Braukmann et al. (2013) suggests that this genus encapsulates multiple transitions to highly degraded plastid genomes, with members of subgenus Grammica displaying even more advanced stages of degradation. These taxa were not included in figure 3, as they are based on hybridization and have not yet been sequenced to generate complete plastomes. Recycling of respiratory CO2 has been offered as an explanation for the retention of putatively functional genes in the psa, psb, and pet complexes (McNeal et al. 2007), underscoring the importance of studying the physiology of strict heterotrophs on a taxon-by-taxon basis and also highlighting how little is known about the basic physiology and functional genomics of these plants.

Conclusion

All members of Corallorhiza produce detectable levels of chlorophyll based on the current sample, but nongreen, strictly heterotrophic members have on average tenfold lower concentrations relative to green, partially heterotrophic members. Genomic data allowed assembly of complete plastomes as well as partial rDNA operons, which when combined fully resolve relationships among the major species complexes in Corallorhiza. Based on complete plastomes, Corallorhiza can be said to span the “early” stages of plastid genome degradation as a result of relaxed selective pressures on photosynthetic function. Corallorhiza is useful in providing a model system to understand the earliest stages in the transition to a “minimal plastome,” as observed in other heterotrophic plants (e.g., Orobanchaceae, other orchids like Rhizanthella). There have been at least two independent transitions to strict heterotrophy from partial heterotrophy in the genus (Co. maculata and Co. striata complexes). Nongreen coralroots also display evidence of relaxed selective pressures on photosynthesis-related genes that have retained open reading frames. Members of the atp complex display relaxed purifying selection among Corallorhiza as a whole, but more specifically display evidence for positive selection in nongreen, strictly heterotrophic taxa, although for reasons currently unknown.

Generally, there is a clear path of plastid genome degradation in the heterotrophic angiosperm lineages studied thus far (with some deviations from the overall model, e.g., some Cuscuta). Moreover, it can be hypothesized that there is a stepwise progression to a “minimal” plastome, and Corallorhiza occupies categories 2 (partially heterotrophic taxa, which display degradation of the ndh complex only and a few other pseudogenes) and 3 (strictly heterotrophic taxa, including degradation of the ndh complex, photosynthesis-related genes, and rpo genes [excluding atp genes]) in a modified model of the degree of plastome degradation among angiosperms. Future studies will include 1) deeper sampling of complete plastomes within species complexes of Corallorhiza to gain a phylogeographic perspective on plastome degradation; 2) direct assessment of the relative photosynthetic capabilities of members of each coralroot taxon in the field; and 3) quantification of transcript levels for plastid-encoded genes within each species, to gain a clearer understanding of gene expression and to test the hypothesis of the existence of duplicate, functional copies of pseudogenized or deleted plastid genes.

Materials and Methods

Plant Material, DNA Isolation, Library Preparation, and PE Sequencing

Plant material was collected and preserved either in silica gel or frozen at −80 °C. Total genomic DNAs were extracted using the cetyltrimethylammonium bromide (CTAB) method (Doyle JJ and Doyle JL 1987) from 1.0 g frozen or 0.2–0.5 g silica-dried material, and quantified using a NanoDrop 2000 spectrophotometer (Thermo Scientific, Waltham, MA). Samples with total DNA concentrations > 100 ng/µl were selected and run on a 1.5% agarose gel to check for high molecular weight, nondegraded DNA. Samples were shipped on ice to Cold Spring Harbor Laboratory (CSHL, Woodbury, NY), where they were sheared to approximately 450 bp, followed by library preparation and barcoding for PE Illumina sequencing. The exception was Co. odontorhiza, for which library preparation and sequencing were completed at the University of Missouri–Columbia using the Illumina TruSeq Library Preparation Kit (Illumina Inc., San Diego, CA). Co. odontorhiza DNAs were sheared to approximately 320 bp. All samples were run on Illumina HiSeq 2000 machines to generate 100-bp PE reads. Indexed samples were pooled and run with accessions from other projects, with a total of 40 or 20 accessions in a single lane on the CSHL and University of Missouri machines, respectively. Voucher information and characteristics of each of the resulting data sets and completed plastomes are listed in table 1.

Quantification of Chlorophyll Content

Approximately 0.1 g of −80 °C frozen tissue was harvested from the outer wall of the developing ovary and weighed for chlorophyll quantification. Ovary tissue was chosen 1) because this is the only organ that is visibly green in some taxa (e.g., Co. odontorhiza, Co. wisteriana) and 2) to be consistent between taxa. Chlorophylls were extracted in 80% ice-cold acetone and stored at −80 °C. Chlorophylls a and b were quantified on a Cary 4000 UV-VIS spectrophotometer (Agilent Technologies Inc., Santa Clara, CA; Department of Plant Cellular & Molecular Biology, Ohio State University), and their concentrations were calculated using formulas of Porra (2002) (Lima D, personal communication). Data from chlorophylls a and b were pooled to give total chlorophylls. Although individual accessions used to quantify chlorophyll concentration among species were not the same as those used for sequencing, multiple individuals from across each species’ geographic range were included to account for variation in chlorophyll content among individuals within species. Corallorhiza bentleyi (an endangered member of the Co. striata complex) is included here but was not sequenced. Conversely, Mexican-endemic members of the Co. maculata complex (Co. macrantha, Co. bulbosa, Co. maculata var. mexicana) were not included in the chlorophyll analysis due to rarity and thus lack of fresh/frozen material. Voucher specimens for plants used for chlorophyll analysis are listed in supplementary figure S5, Supplementary Material online.
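As a rough illustration of the calculation, the sketch below applies the simultaneous equations for buffered 80% aqueous acetone that are commonly attributed to Porra et al. (1989) and reviewed in Porra (2002). Which equations and wavelengths were actually used is not stated here, so the coefficients, the function names, the extract volume, and the absorbance readings are all assumptions for illustration; absorbances are assumed to be already corrected for baseline (e.g., by subtracting the reading at 750 nm).

```python
def chlorophyll_ug_per_ml(a663_6: float, a646_6: float):
    """Chlorophyll a, b and a+b in ug/mL of extract (80% acetone).

    Coefficients are assumed from the widely quoted Porra-type equations;
    confirm against the original reference before reuse.
    """
    chl_a = 12.25 * a663_6 - 2.55 * a646_6
    chl_b = 20.31 * a646_6 - 4.91 * a663_6
    return chl_a, chl_b, chl_a + chl_b


def total_chl_ug_per_g(a663_6: float, a646_6: float,
                       extract_ml: float, tissue_g: float) -> float:
    """Total chlorophyll per gram of tissue (hypothetical helper)."""
    _, _, total = chlorophyll_ug_per_ml(a663_6, a646_6)
    return total * extract_ml / tissue_g


# Hypothetical readings for 0.1 g of ovary tissue extracted in 5 mL.
chl_a, chl_b, chl_total = chlorophyll_ug_per_ml(0.5, 0.2)
per_gram = total_chl_ug_per_g(0.5, 0.2, extract_ml=5.0, tissue_g=0.1)
print(f"a = {chl_a:.3f}, b = {chl_b:.3f}, a+b = {chl_total:.3f} ug/mL")
print(f"total = {per_gram:.1f} ug/g")
```

Pooling chlorophylls a and b, as described above, corresponds to the summed value returned as the third element.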

Data Processing

The resulting paired FASTQ data were processed using a custom UNIX script (“process_fastq.sh”) that calls several freely available Perl scripts for the purposes of 1) quality filtering/trimming and adaptor removal (“IlluQC_PRLL.pl,” “TrimmingReads.pl”; Patel and Jain 2012); 2) “shuffling” PE files into a single file (“ShuffleSequences_fastq.pl,” part of the VELVET suite of tools [E. Cabot, University of Wisconsin-Madison; Zerbino and Birney 2008]); and 3) conversion to FASTA format (“fastq2fasta.pl”; B. Knaus 2009 [http://brianknaus.com/software/srtoolbox/, last accessed May 1, 2014]). Reads were trimmed from their 3′-ends, discarding bases under a quality threshold of PHRED score of 20 (corresponding to an error rate of 1/100 or greater), and also discarding trimmed reads below 61 bp in length.
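The trimming rule described above can be sketched in a few lines. This is not the code of the Perl scripts named in the text, whose exact trimming algorithm may differ; it is a minimal illustration that assumes Phred+33 quality encoding and removes bases from the 3′-end until one with quality of at least 20 is reached. The function names are hypothetical.

```python
def phred_to_error(q: float) -> float:
    """Convert a PHRED score to its per-base error probability."""
    return 10 ** (-q / 10)


def trim_3prime(seq: str, qual: str, min_q: int = 20,
                min_len: int = 61, offset: int = 33):
    """Trim low-quality bases from the 3' end of one read.

    Returns (sequence, quality) after trimming, or None if the trimmed
    read is shorter than min_len. Assumes Phred+33 encoded qualities.
    """
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    if end < min_len:
        return None
    return seq[:end], qual[:end]


# Toy reads: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
kept = trim_3prime("A" * 70, "I" * 65 + "#" * 5)
dropped = trim_3prime("A" * 70, "I" * 60 + "#" * 10)
print(len(kept[0]), dropped)
```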

Read Mapping

PE reads were then mapped five times iteratively in Geneious v6 (created by Biomatters Inc., Auckland, New Zealand; http://www.geneious.com, last accessed December 1, 2013), using automatically determined settings, to a combined file of 1) the Phalaenopsis aphrodite plastome (GenBank accession number NC_007499; Chang et al. 2006), 2) the Phoenix dactylifera mitochondrial genome (NC_016740; Fang et al. 2012), and 3) the Asclepias syriaca nuclear ribosomal DNA consensus operon (JF312046; Straub et al. 2011). The purpose of read mapping was to filter or “screen” reads that matched the reference genomes and to get initial, crude estimates of coverage to be used in downstream de novo assemblies.

Plastome Assembly

Mean coverage estimates from Geneious were used as input for de novo plastome assembly (“-exp_cov” parameter). De novo assembly was completed using both the VELVET v1.2.10 assembler (Zerbino and Birney 2008) and the Geneious v6 de novo assembler. VELVET assemblies were generated using a variety of hash lengths (i.e., “word match” or “kmer length”) ranging from 51 to 81 for each complete data set, with specified expected coverage estimates taken from Geneious mapping (specified to be in approximate kmer coverage; see Zerbino and Birney 2008). Coverage cutoff values were input as roughly one-half of the specified expected coverage estimates, minimum contig length was set to 300 bp, and insert size was set to 450 bp (320 bp for Co. odontorhiza), and the resulting contigs were assembled in Sequencher v6 (GeneCodes, Ann Arbor, MI). The resulting “draft de novo assembly” from VELVET was used as a reference to map reads in Geneious as described above (five iterations, automatically determined settings). Read mappings were visually scanned in Geneious to check for areas of low coverage or potential misassemblies. Read pairs that mapped successfully to the draft reference were used in a second round of de novo assembly, this time in Geneious with automatically determined settings, saving 100 consensus contigs. This method allowed construction of complete draft plastomes with relative ease.
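The conversion from mapped (nucleotide) coverage to approximate kmer coverage can be sketched as follows, using the relationship given in the VELVET documentation, Ck = C × (L − k + 1) / L, where C is nucleotide coverage, L is read length, and k is the hash length. The coverage value of 200× is hypothetical, the function names are illustrative, and the sketch assumes untrimmed 100-bp reads, so values for trimmed reads would be somewhat lower.

```python
def kmer_coverage(nucleotide_coverage: float, read_length: int, k: int) -> float:
    """Approximate kmer coverage: Ck = C * (L - k + 1) / L."""
    if k > read_length:
        raise ValueError("hash length cannot exceed read length")
    return nucleotide_coverage * (read_length - k + 1) / read_length


def velvet_coverage_settings(nucleotide_coverage: float,
                             read_length: int, k: int) -> dict:
    """Expected coverage and a cutoff of one-half of it, as in the text."""
    exp_cov = kmer_coverage(nucleotide_coverage, read_length, k)
    return {"exp_cov": exp_cov, "cov_cutoff": exp_cov / 2}


# Hypothetical mean plastome coverage of 200x from read mapping,
# evaluated over odd hash lengths spanning the range used (51 to 81).
for k in range(51, 82, 10):
    print(k, velvet_coverage_settings(200.0, 100, k))
```

The expected coverage drops steeply as k approaches the read length, which is one reason a range of hash lengths is worth trying.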

Contigs from both assemblies (Geneious, VELVET) were merged in Sequencher to form complete plastomes (parameters = "dirty data" or "large gap," minimum overlap = 20 bp, minimum similarity 70–90%). Any apparent "gaps" among contigs resulting from one software platform were usually covered by a contig from the other, allowing assembly of completed plastomes (Large Single Copy–Inverted Repeat B–Small Single Copy). In a few cases, VELVET scaffolds contained strings of Ns between 30 and 100 bp, mostly associated with homopolymer repeats (>10 bp). These were resolved by 1) the Geneious de novo assembly (typically) and 2) UNIX "grep" searches, with flanking sequences of 30–50 bp taken from each end of the string of Ns used as search terms. Grep results were converted to FASTA format using a custom Perl script to build a "bridge" of reads. This was done iteratively until the region in question was resolved, although most cases were resolved with a single iteration. Grep searches were applied to all cases of ambiguity between contigs from the two de novo assembly methods as a quality control. Polymerase chain reaction amplification and Sanger sequencing were not necessary due to high read coverage (table 1).
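The read "bridging" step can be sketched in a few lines. The study used UNIX grep and a custom Perl script that is not reproduced here; the Python below is only an assumed equivalent. It collects reads that contain either flanking sequence in either orientation and writes them as FASTA. All function names and the record prefix are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGTNacgtn", "TGCANtgcan"))[::-1]


def find_bridging_reads(reads, left_flank: str, right_flank: str) -> list:
    """Return reads containing either flank, oriented to the strand of the flanks."""
    hits = []
    for read in reads:
        for oriented in (read, reverse_complement(read)):
            if left_flank in oriented or right_flank in oriented:
                hits.append(oriented)
                break
    return hits


def to_fasta(seqs, prefix: str = "bridge_read") -> str:
    """Format sequences as FASTA records with numbered headers."""
    return "".join(f">{prefix}_{i}\n{s}\n" for i, s in enumerate(seqs, start=1))
```

In an iterative use, the ends of the reads recovered in one pass would supply the flanking search terms for the next pass until the gap is spanned.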

IR boundaries for each draft plastome were determined by using the first and last approximately 20 bp of the draft plastome as search terms (reverse complement) in Sequencher to find matches, and marking their match points as features. Then, the first approximately 700 bp and last approximately 1,000–2,000 bp of the draft plastome were copied, reverse complemented, and manually aligned to the IR using the marked features from above as reference points. IR boundaries were determined informatically using PE information (i.e., read pairs that spanned the IR boundaries). Because the orientation of the ends of the LSC and SSC can vary, the plastomes were often in different configurations; single copy regions were reoriented to correspond to the "standard" model of gene order to simplify downstream comparison. Special attention was paid to the possibility of significant modification or even loss of the IR (either partial or complete), and also to the possibility of other genomic rearrangements or structural modifications. In other words, when building draft plastomes, it was not assumed that all accessions displayed the "standard" plastome structure found in angiosperms (i.e., LSC, IRb, SSC, IRa).
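The first step of the IR search can be illustrated with a short sketch. The study performed it interactively in Sequencher, so the code below is only an assumed analogue: it reverse complements the first and last n bases of a draft and reports where each matches within the draft. The function names are hypothetical, and exact string matching is a simplification that would miss matches containing sequencing errors.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGTNacgtn", "TGCANtgcan"))[::-1]


def locate_ir_anchors(draft: str, n: int = 20) -> tuple:
    """Find matches for the reverse complements of the first and last n bases.

    Returns the 0-based start positions of the two matches within the draft,
    or -1 for an end whose reverse complement does not occur.
    """
    head_rc = reverse_complement(draft[:n])
    tail_rc = reverse_complement(draft[-n:])
    return draft.find(head_rc), draft.find(tail_rc)
```

A result of -1 for either end would be one signal that the draft does not extend into a second IR copy, or that the IR is modified or absent, which is the possibility the text cautions about.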

Plastome Annotation

Plastid genomes were annotated by first aligning the set of coding loci (coding sequences [CDS]) containing introns and all tRNAs from Phalaenopsis to each draft plastome (minus IRa) in Sequencher to aid in intron/exon and tRNA boundary determination. Annotations were conducted in DOGMA (Wyman et al. 2004), which uses BLASTX searches against a database of annotated plastomes and amino acid translations, and allows manual determination of start/stop codons. DOGMA BLASTX parameters were as follows: genetic code = plant plastid, e value = 5, percent cutoff for protein coding genes = 55, and percent cutoff for tRNAs = 80. The resulting GenBank feature table was exported from the DOGMA server and imported along with the finished plastid FASTA file into SEQUIN (National Center for Biotechnology Information), where the annotation was validated and completed for submission to GenBank. A large inversion in Co. maculata var. maculata was manually reversed for the purposes of phylogenetic analyses and other whole-plastome comparisons, but the plastome for that accession was annotated with the inversion as assembled de novo. Plastome maps were drawn with GenomeVx (Conant and Wolfe 2008) and manually edited in Adobe Illustrator CS6 (Adobe Inc., San Jose, CA).

Plastid Genome Degradation and Implications. doi:10.1093/molbev/msu252. MBE

Ribosomal DNA Assembly and Annotation

Nuclear ribosomal DNA reads were filtered by mapping reads from Co. odontorhiza to a nearly complete rDNA operon reference (partial ETS [External Transcribed Spacer], 18S rDNA, ITS1 [Internal Transcribed Spacer], 5.8S rDNA, ITS2, 26S rDNA) from A. syriaca, described above, as well as mapping to an ITS-spanning sequence from Co. odontorhiza from GenBank (accession number EU391326). Other regions were not assembled (e.g., 5S rDNA). Resulting reads that mapped to each reference were assembled de novo in Geneious, and resulting contigs were merged through overlap in Sequencher. This draft assembly was used for a second round of read mapping (original Co. odontorhiza reads) and de novo assembly in Geneious, resulting in a single consensus contig. The consensus contig was based on 50% majority per site, bearing in mind that the rDNA operon exists in numerous repeated copies that often differ from one another, appearing as polymorphism in read pileups. The Co. odontorhiza rDNA consensus operon was then used as a reference in read mapping for all other Corallorhiza species, followed by individual de novo assemblies as above.
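A per-site majority consensus of this kind can be sketched as follows. This is not the Geneious implementation; it is a simplified illustration that assumes reads are already aligned to equal length, calls a base only when it exceeds the threshold fraction of non-gap characters, writes "N" otherwise, and skips columns that contain only gaps. The function name and these tie-handling choices are assumptions.

```python
from collections import Counter


def majority_consensus(aligned_reads, threshold: float = 0.5, ambiguous: str = "N") -> str:
    """Build a consensus from equal-length aligned reads by per-site majority."""
    consensus = []
    for column in zip(*aligned_reads):
        counts = Counter(base for base in column if base != "-")
        if not counts:
            continue  # gap-only column: nothing to call
        base, n = counts.most_common(1)[0]
        frequency = n / sum(counts.values())
        consensus.append(base if frequency > threshold else ambiguous)
    return "".join(consensus)
```

Sites where divergent rDNA copies are present at similar frequencies would appear as "N" under this sketch, which is one way the polymorphism mentioned above would surface.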

Nearly complete ribosomal DNA operons (partial ETS, 18S rDNA, ITS1, 5.8S, ITS2, 26S rDNA, excluding 5S rDNA) were annotated by first aligning finished contigs in Geneious (using the MUSCLE v3.8.31 plugin in Geneious; Edgar 2004) and transferring the annotation features from Co. odontorhiza to the aligned Corallorhiza sequences. Then, the entire alignment was annotated in SEQUIN for submission to GenBank by adding each feature to Co. odontorhiza and subsequently propagating the features to the remaining sequences. Reads were again mapped in Geneious to each finished, annotated rDNA cistron to check for misassemblies. GenBank accession numbers for rDNA partial operons are KM390003–KM390012. Available outgroup sequences for Cymbidium, Oncidium, and Phalaenopsis were mined from GenBank for the purposes of downstream phylogenetic analyses (18S rDNA: Cy. faberi, GenBank accession number JN418934; ITS: Cy. aloifolium, JF729014; Phalaenopsis aphrodite, AY391543; O. andradeanum, FJ565598).

Data Matrix Assembly and Sequence Alignment

Annotated plastomes were imported into Geneious, with subsequent removal of the second copy of the IR (IRa), and aligned with MAFFT v7.157 (Katoh et al. 2002) using the "fast-progressive" method (gap opening penalty = 6, offset = 0.1), followed by manual adjustment. The alignment also included complete plastomes of Cy. aloifolium (GenBank accession number NC_021429), Oncidium Gower Ramsey (GenBank accession number NC_014056), and Phalaenopsis aphrodite (GenBank accession number NC_017049). A short inversion of the petN-psbM region in Cymbidium was removed for that accession in downstream analyses. Annotated CDS were exported from Geneious, and data from each locus were saved as separate FASTA files. Each file was submitted for codon-based alignment to the MACSE web server (Ranwez et al. 2011; http://mbb.univ-montp2.fr/MBB/subsection/softExec.php?soft=macse, last accessed May 30, 2014), which allows alignment of both intact open reading frames and incomplete sequences (including pseudogenes with modified reading frames). Alignments were adjusted manually when necessary and imported back into Geneious or SequenceMatrix (Vaidya et al. 2011), where they were concatenated into larger matrices. Protein-coding genes that contained pseudogenes (shifted reading frames, internal stop codons) in at least one taxon were excluded. Data matrices are deposited in supplementary fig. S6, Supplementary Material online.

Phylogenetic Analyses

Phylogenetic analyses were conducted using both Parsimony and maximum likelihood (ML). Parsimony searches were conducted in TNT (Goloboff et al. 2003, 2008), with 100 random addition starting sequences, saving 100 trees per replicate, and using tree bisection–reconnection branch swapping on the pool of saved trees. Support was assessed through 2,000 Jackknife pseudoreplicates with the above search parameters. Model fit for ML analyses was assessed for the complete plastome alignment in MEGA v6 (Tamura et al. 2013), which yielded GTR+G as the best-fit model based on the corrected Akaike Information Criterion (AICc) metric (Burnham and Anderson 2002). All ML analyses were conducted in RAxML v8.0.20 (Stamatakis 2006). Three separate searches were initiated for each matrix analyzed from different random starting seeds, using an unpartitioned GTR+G model, to ensure convergence on the same topology. A codon-partitioned model was not chosen due to the possibility of relaxed functional constraints among some coding regions for some taxa, which could possibly represent model violations. Support was assessed with 1,000 standard bootstrap pseudoreplicates. Cymbidium was chosen as the outgroup taxon.
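For readers unfamiliar with the AICc metric, a minimal sketch of the calculation follows. It uses the standard definition AICc = 2k − 2 lnL + 2k(k + 1) / (n − k − 1), where k is the number of free parameters and n is the sample size. Treating n as the number of alignment sites is an assumption, and the function names and any log-likelihood values supplied to them are hypothetical; no values from this study are used.

```python
def aicc(log_likelihood: float, k: int, n: int) -> float:
    """Corrected Akaike Information Criterion: 2k - 2lnL + 2k(k + 1) / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def best_model(candidates: dict, n: int) -> str:
    """Return the model with the lowest AICc.

    candidates maps a model name to (log-likelihood, number of free parameters).
    """
    return min(candidates, key=lambda m: aicc(candidates[m][0], candidates[m][1], n))
```

The model with the lowest AICc is preferred; the correction term matters most when n is small relative to k.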

Plastid Pseudogenes, Genomic Deletions, and Genomic Attributes

The presence of pseudogenes and whole gene/regional deletions was initially assessed visually in DOGMA by identifying interrupted BLASTX reading frames. All coding loci aligned in MACSE (see above) were imported into Geneious, and sequences were translated to assess pseudogene status through the presence of frame shifts and/or premature stop codons. Loci with interrupted reading frames were considered to be pseudogenes unless there was an alternative start or stop codon within three codon positions of the other taxa in the alignment. In addition, the alignment of whole, annotated plastomes (described above) was used to visualize any larger genomic deletions spanning CDS boundaries, or rearrangements.
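The translation-based screen can be approximated with a short sketch. This is a simplified stand-in for the manual inspection in Geneious: it flags a CDS whose ungapped length is not a multiple of three or whose translation contains a stop codon before the final codon. It does not implement the exception for alternative start or stop codons within three codon positions, and the function names are hypothetical. The codon table is the standard one, whose amino acid assignments are the same as those of the plant plastid code.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cds: str) -> str:
    """Translate complete codons; codons with ambiguous bases become 'X'."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3))


def assess_cds(cds: str) -> list:
    """Return a list of reasons a CDS looks interrupted (empty if none found)."""
    seq = cds.upper().replace("-", "")
    issues = []
    if len(seq) % 3 != 0:
        issues.append("length not a multiple of 3 (possible frameshift)")
    if "*" in translate(seq)[:-1]:
        issues.append("premature stop codon")
    return issues
```

A locus returning a non-empty list would be a candidate pseudogene to check against the alignment of the other taxa, rather than a final call.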

Hypotheses of differences in plastid genomic attributes among nongreen versus green taxa, and correlations between chlorophyll content and genomic attributes, were assessed among taxa using two phylogeny-based analytical approaches. First, a Phylogenetic Analysis of Variance (Garland et al. 1993) was used to test for differences in continuous plastome variables (e.g., length, GC content) among green versus nongreen taxa (i.e., a discrete grouping variable), taking phylogenetic relationships into account. Analyses were run in the R package "geiger" (Harmon et al. 2008) with 10,000 simulations each using the "aov.phylo" function, based on an ultrametric tree generated through Nonparametric Rate Smoothing (Sanderson 1997). These analyses were run for a complete data set of 13 taxa (three leafy, green outgroup taxa and ten species of Corallorhiza). A second set of analyses was conducted on a subset of six taxa of Corallorhiza for which chlorophyll data were available, to mitigate problems with missing chlorophyll data. In these analyses, continuous genomic variables (length, number of putatively functional genes, GC content) were correlated with chlorophyll content (continuous) using Phylogenetically Independent Contrasts to account for phylogenetic relatedness among taxa (Felsenstein 1985). The latter analyses were conducted under a Brownian motion model with the PDAP package (Midford et al. 2005) in Mesquite v2.75 (Maddison WP and Maddison DR 2011).

Model-Based Analyses of Selective Regime

Changes in selective regime among species of Corallorhiza for codon-based matrices were assessed with the CODEML module in PAML v4.8 (Yang 1997, 2007) using "branch" models. Only loci for which all taxa had open reading frames were included in the analyses. Various matrices are listed in table 3. The objective was to test hypotheses of different ω ratios for 1) Corallorhiza versus leafy outgroups and 2) nongreen Corallorhiza versus green Corallorhiza + leafy outgroups. To accomplish this, different branch models were fit to the data using the topology resulting from the ML analysis of whole plastomes + rDNA. Six different branch models were tested. First, a model in which all branches evolve under one ω ratio (Model M0) was tested, where ω = dN/dS, the ratio of nonsynonymous substitutions per nonsynonymous site to synonymous substitutions per synonymous site. A two-ratio model (M1) was then run, in which all branches of Corallorhiza were allowed to have a different ω ratio (ω_cor) than the remaining leafy outgroup sequences (ω_leafy). A three-ratio model was then run (M2), allowing separate ω values for ω_leafy, ω_cor, and nongreen Corallorhiza (ω_str = ω_mac-ng; Co. striata, and Co. mertensiana, Co. maculata var. maculata and occidentalis, respectively), followed by a four-ratio model (M3) with additionally separate ω ratios for Co. striata (ω_str) and the nongreen members of the Co. maculata complex (ω_mac-ng). The latter three models (M1–M3) specifically test hypotheses of whether Corallorhiza, and particularly nongreen Corallorhiza, display departures from purifying selection. Two additional models were run to test whether the nongreen Corallorhiza differ significantly from green Corallorhiza and leafy outgroup taxa. Specifically, Model M4 (two ratios) sets ω_leafy = ω_cor and ω_str = ω_mac-ng, whereas M5 (three ratios) allows ω_str and ω_mac-ng to differ. Successively nested models were compared based on log-likelihood ratio tests (M1 vs. M0; M2 vs. M1; M3 vs. M2; M4 vs. M0; M5 vs. M4), where the likelihood ratio test statistic was calculated as two times the difference in log-likelihood between the two models, with degrees of freedom calculated as the difference in the number of free parameters for the models, and tested against the χ² distribution. P values for multiple comparisons to the same data set were Bonferroni-corrected (Dunn 1959), as suggested by Yang (2007), with the corrected significance level corresponding to α/m, where m is the number of branches being tested. Genes ycf1 and ycf2 were analyzed separately because they have extraordinarily high substitution rates (e.g., Barrett et al. 2013), as was matK, based on the finding of a pseudogenized copy in a previous study (Freudenstein and Senyo 2008).
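The likelihood ratio comparison of nested models can be sketched with the Python standard library alone. The code below is an illustration, not the procedure used with CODEML output in this study. It computes two times the difference in log-likelihood, obtains the upper-tail χ² probability for integer degrees of freedom from the recurrence Q(a + 1, y) = Q(a, y) + y^a e^(−y) / Γ(a + 1), and applies a Bonferroni-corrected threshold of α/m. The function names and any log-likelihoods passed to them are hypothetical.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of the chi-square distribution for integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)             # Q(1, y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))  # Q(1/2, y)
    # Step a upward by 1 until it reaches df / 2.
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(lnl_null: float, lnl_alt: float, df: int,
                          m: int = 1, alpha: float = 0.05) -> tuple:
    """Return (statistic, P value, significant at the Bonferroni level alpha / m)."""
    statistic = 2.0 * (lnl_alt - lnl_null)
    p_value = chi2_sf(statistic, df)
    return statistic, p_value, p_value < alpha / m
```

For example, comparing a one-ratio model with a two-ratio model would use df = 1, because the models differ by a single ω parameter.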

To further investigate the ω values among branches for the atp complex, two sets of branch-site models were tested against the data. In the first, all Corallorhiza were set as the foreground branch, and in the second, nongreen Corallorhiza (Co. striata var. vreelandii, Co. maculata var. maculata and occidentalis, and Co. mertensiana) were set as foreground branches. For both sets of branch-site models, two alternative models were tested. The "null" model specifies 0 < ω0 < 1 for a proportion of sites and ω1 = ω2 = 1 for the remaining sites (no positive selection), whereas the alternative model allows ω2 > 1 (positive selection on some sites). The alternative model (some sites under positive selection) was then tested against the null model (no sites under positive selection) using a χ² distribution as above, with 1 degree of freedom (i.e., the alternative and null models differ by one parameter).

Comparison with Other Parasitic Angiosperm Lineages

Plastid gene content (losses, pseudogenes, putatively functional genes) was compared among diverse lineages of angiosperms containing heterotrophs, including newly sequenced Corallorhiza. Taxa and genes were sorted by the numbers of putatively functional genes and gene functional class, respectively, to illustrate the different stages of plastid genome degradation. Specifically, genes were arranged according to the model of Barrett and Davis (2012), in which degradation of plastid functional gene classes is hypothesized to have occurred roughly in the following order: 1) NAD(P)H dehydrogenase genes (ndh); 2) "photosynthesis"-related genes, including pet, psa, psb, cemA, ccsA, rbcL, ycf3, and ycf4; 3) Plastid-encoded RNA Polymerase genes (rpo); 4) ATP synthase genes (atp); and finally 5) "housekeeping" genes (rpl, rps, rrn, trn, matK, clpP, infA, accD, ycf1, and ycf2). Though not a formal test in the strict sense, the use of this model does allow an assessment of how advanced the sequenced accessions of Corallorhiza are when compared with other heterotrophic angiosperms; it is hypothesized that Corallorhiza occupies the early stages of this model. Technically, atp genes function in the generation of ATP and movement of protons across the thylakoid membrane, and thus play a crucial role in photosynthesis, but this functional class was kept separate from photosynthesis genes due to the possibility that ATP synthase serves roles outside of photosynthesis.
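The sorting scheme described above can be expressed as a small sketch. The class assignments follow the gene lists given in the text for the Barrett and Davis (2012) model; the function names, the reliance on gene-name prefixes, and any gene counts supplied by a caller are assumptions made for illustration.

```python
CLASS_ORDER = ["ndh", "photosynthesis", "rpo", "atp", "housekeeping"]
PHOTOSYNTHESIS_PREFIXES = ("pet", "psa", "psb")
PHOTOSYNTHESIS_GENES = {"cemA", "ccsA", "rbcL", "ycf3", "ycf4"}
HOUSEKEEPING_PREFIXES = ("rpl", "rps", "rrn", "trn")
HOUSEKEEPING_GENES = {"matK", "clpP", "infA", "accD", "ycf1", "ycf2"}


def functional_class(gene: str) -> str:
    """Assign a plastid gene to one of the five functional classes."""
    if gene.startswith("ndh"):
        return "ndh"
    if gene.startswith(PHOTOSYNTHESIS_PREFIXES) or gene in PHOTOSYNTHESIS_GENES:
        return "photosynthesis"
    if gene.startswith("rpo"):
        return "rpo"
    if gene.startswith("atp"):
        return "atp"
    if gene.startswith(HOUSEKEEPING_PREFIXES) or gene in HOUSEKEEPING_GENES:
        return "housekeeping"
    raise ValueError(f"unclassified gene: {gene}")


def sort_genes(genes) -> list:
    """Order genes by hypothesized order of class degradation, then by name."""
    return sorted(genes, key=lambda g: (CLASS_ORDER.index(functional_class(g)), g))


def sort_taxa(functional_gene_counts: dict) -> list:
    """Order taxa from most to fewest putatively functional genes."""
    return sorted(functional_gene_counts, key=lambda t: -functional_gene_counts[t])
```

Arranging a presence/pseudogene/absence matrix with these two orderings places the least degraded plastomes and the earliest-lost gene classes first.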

Supplementary Material

Supplementary figures S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by the National Science Foundation (awards DEB-0830020 to Jerrold I. Davis and DBI 1110443 to J.C.P.); the California State University Program for Education and Research in Biotechnology (CSUPERB) to C.F.B.; the California State University, Los Angeles Center for Effective Teaching and Learning Faculty-Student Mentorship award to C.F.B. and J.L.; and the California State University Council on Ocean Affairs, Science and Technology to C.S. The authors are grateful to Jerrold Davis, Patrick Edger, Ryan Ellingson, Gwynne Lim, Jeffrey Morawetz, and Chris Randle for feedback and laboratory/bioinformatics support; the staff at Cold Spring Harbor Laboratory (Woodbury, NY) for sample preparation and sequencing; Daniel Lima and Richard Sayre for assistance with chlorophyll spectroscopy and data interpretation; Gerardo Salazar (UNAM) for assistance in collecting Mexican specimens; and Maria Logacheva and three anonymous reviewers for critical insight that improved the paper.

References

Allen JF, de Paula WBM, Puthiyaveetil S, Nield J. 2011. A structural phylogenetic map for chloroplast photosynthesis. Trends Plant Sci. 16:645–655.

Barbrook AC, Howe CJ, Purton S. 2006. Why are plastid genomes retained in non-photosynthetic organisms? Trends Plant Sci. 11:101–108.

Barrett CF, Davis JI. 2012. The plastid genome of the mycoheterotrophic Corallorhiza striata (Orchidaceae) is in the relatively early stages of degradation. Am J Bot. 99:1513–1523.

Barrett CF, Davis JI, Leebens-Mack J, Conran JG, Stevenson DW. 2013. Plastid genomes and deep relationships among the commelinid monocot angiosperms. Cladistics 29:65–87.

Barrett CF, Freudenstein JV. 2008. Molecular evolution of rbcL in the mycoheterotrophic coralroot orchids (Corallorhiza Gagnebin, Orchidaceae). Mol Phylogenet Evol. 47:665–679.

Bellino A, Alfani A, Selosse M-A, Guerrieri R, Borghetti M, Baldantoni D. 2014. Nutritional regulation in mixotrophic plants: new insights from Limodorum abortivum. Oecologia 175:875–885.

Bernardi G. 1989. The isochore organization of the human genome. Annu Rev Genet. 23:637–661.

Bidartondo MI. 2005. The evolutionary ecology of myco-heterotrophy. New Phytol. 167:335–352.

Blazier J, Guisinger MM, Jansen RK. 2011. Recent loss of plastid-encoded ndh genes within Erodium (Geraniaceae). Plant Mol Biol. 76:263–272.

Braukmann T, Kuzmina M, Stefanovic S. 2013. Plastid genome evolution across the genus Cuscuta (Convolvulaceae): two clades within subgenus Grammica exhibit extensive gene loss. J Exp Bot. 64:977–989.

Braukmann T, Stefanovic S. 2012. Plastid genome evolution in mycoheterotrophic Ericaceae. Plant Mol Biol. 79:5–20.

Braukmann TWA, Kuzmina M, Stefanovic S. 2009. Loss of all plastid ndh genes in Gnetales and conifers: extent and evolutionary significance for the seed plant phylogeny. Curr Genet. 55:323–337.

Burnham KP, Anderson DR. 2002. Avoiding pitfalls when using information-theoretic methods. J Wildl Manage. 66:912–918.

Cameron DD, Preiss K, Gebauer G, Read DJ. 2009. The chlorophyll-containing orchid Corallorhiza trifida derives little carbon through photosynthesis. New Phytol. 183:358–364.

Chang CC, Lin HC, Lin IP, Chow TY, Chen HH, Chen WH, Cheng CH, Lin CY, Liu SM, Chang CC, et al. 2006. The chloroplast genome of Phalaenopsis aphrodite (Orchidaceae): comparative analysis of evolutionary rate with that of grasses and its phylogenetic implications. Mol Biol Evol. 23:279–291.

Conant GC, Wolfe KH. 2008. GenomeVx: simple web-based creation of editable circular chromosome maps. Bioinformatics 24:861–862.

Cummings MP, Welschmeyer NA. 1998. Pigment composition of putatively achlorophyllous angiosperms. Plant Syst Evol. 210:105–111.

Delannoy E, Fujii S, Colas des Francs-Small C, Brundrett M, Small I. 2011. Rampant gene loss in the underground orchid Rhizanthella gardneri highlights evolutionary constraints on plastid genomes. Mol Biol Evol. 28:2077–2086.

Doyle JJ, Doyle JL. 1987. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem Bull. 19:11–15.

Dunn OJ. 1959. Confidence intervals for the means of dependent, normally distributed variables. J Am Stat Assoc. 54:613–621.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Evron Y, Johnson EA, McCarty RE. 2000. Regulation of proton flow and ATP synthesis in chloroplasts. J Bioenerg Biomembr. 32:501–506.

Fang YJ, Wu H, Zhang TW, Yang M, Yin YX, Pan LL, Yu XG, Zhang XW, Hu SNA, Al-Mssallem IS, et al. 2012. A complete sequence and transcriptomic analyses of date palm (Phoenix dactylifera L.) mitochondrial genome. PLoS One 7:e37164.

Felsenstein J. 1985. Phylogenies and the comparative method. Am Nat. 125:1–15.

Freudenstein JV. 1992. Systematics of Corallorhiza and the Corallorhizinae (Orchidaceae) [PhD dissertation]. [Ithaca (NY)]: Cornell University.

Freudenstein JV. 1997. A monograph of Corallorhiza (Orchidaceae). Harv Pap Bot. 10:5–51.

Freudenstein JV, Barrett CF. 2010. Mycoheterotrophy and diversity in Orchidaceae. In: Seberg O, Petersen G, Barfod A, Davis JI, editors. Diversity, phylogeny and evolution in the monocotyledons. The Proceedings of the Fourth International Conference on Monocot Systematics. Aarhus (Denmark): Aarhus University Press. p. 25–37.

Freudenstein JV, Doyle JJ. 1994a. Character transformation and relationships in Corallorhiza (Orchidaceae: Epidendroideae). 1. Plastid DNA. Am J Bot. 81:1449–1457.

Freudenstein JV, Doyle JJ. 1994b. Plastid DNA, morphological variation, and the phylogenetic species concept: the Corallorhiza maculata (Orchidaceae) complex. Syst Bot. 19:273–290.

Freudenstein JV, Senyo DM. 2008. Relationships and evolution of matK in a group of leafless orchids (Corallorhiza and Corallorhizinae; Orchidaceae: Epidendroideae). Am J Bot. 95:498–505.

Funk HT, Berg S, Krupinska K, Maier UG, Krause K. 2007. Complete DNA sequences of the plastid genomes of two parasitic flowering plant species, Cuscuta reflexa and Cuscuta gronovii. BMC Plant Biol. 7:45.

Galagan JE, Calvo SE, Borkovich KA, Selker EU, Read ND, Jaffe D, FitzHugh W, Ma LJ, Smirnov S, Purcell S, et al. 2003. The genome sequence of the filamentous fungus Neurospora crassa. Nature 422:859–868.

Garland T, Dickerman AW, Janis CM, Jones JA. 1993. Phylogenetic analysis of covariance by computer simulation. Syst Biol. 42:265–292.

Glemin S, Clement Y, David J, Ressayre A. 2014. GC content evolution in coding regions of angiosperm genomes: a unifying hypothesis. Trends Genet. 30:263–270.

Goloboff PA, Farris JS, Kallersjo M, Oxelman B, Ramirez MJ, Szumik CA. 2003. Improvements to resampling measures of group support. Cladistics 19:324–332.

Goloboff PA, Farris JS, Nixon KC. 2008. TNT, a free program for phylogenetic analysis. Cladistics 24:774–786.

Hajdukiewicz PTJ, Allison LA, Maliga P. 1997. The two RNA polymerases encoded by the nuclear and the plastid compartments transcribe distinct groups of genes in tobacco plastids. EMBO J. 16:4041–4048.

Harmon LJ, Weir JT, Brock CD, Glor RE, Challenger W. 2008. GEIGER: investigating evolutionary radiations. Bioinformatics 24:129–131.

Iles WJD, Smith SY, Graham SW. 2013. A well-supported phylogenetic framework for the monocot order Alismatales reveals multiple losses of the plastid NADH dehydrogenase complex and a strong long-branch effect. In: Wilkin P, Mayo S, editors. Early events in monocot evolution. Cambridge: Cambridge University Press. p. 1–28.

Iorizzo M, Grzebelus D, Senalik D, Szklarczyk M, Spooner D, Simon P. 2012. Against the traffic: the first evidence for mitochondrial DNA transfer into the plastid genome. Mob Genet Elements 2:261.

Jensen PE, Bassi R, Boekema EJ, Dekker JP, Jansson S, Leister D, Robinson C, Scheller HV. 2007. Structure, function and regulation of plant photosystem I. Biochim Biophys Acta. 1767:335–352.

Katinka MD, Duprat S, Cornillot E, Metenier G, Thomarat F, Prensier G, Barbe V, Peyretaillade E, Brottier P, Wincker P, et al. 2001. Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature 414:450–453.

Katoh K, Misawa K, Kuma Ki, Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30:3059–3066.

Kawakami K, Umena Y, Iwai M, Kawabata Y, Ikeuchi M, Kamiya N, Shen J-R. 2011. Roles of PsbI and PsbM in photosystem II dimer formation and stability studied by deletion mutagenesis and X-ray crystallography. Biochim Biophys Acta. 1807:319–325.

Knaus BJ. 2010. Short read toolbox. Available from: http://brianknaus.com

Krause K. 2008. From chloroplasts to "cryptic" plastids: evolution of plastid genomes in parasitic plants. Curr Genet. 54:111–121.

Krause K. 2012. Plastid genomes of parasitic plants: a trail of reductions and losses. In: Bullerwell C, editor. Organelle genetics. Berlin (Germany): Springer. p. 79–103.

Kuijt J. 1969. The biology of flowering parasitic plants. Berkeley (CA): University of California Press.

Lawrence JG. 2005. Common themes in the genome strategies of pathogens. Curr Opin Genet Dev. 15:584–588.

Leake JR. 1994. The biology of myco-heterotrophic (Saprophytic) plants. New Phytol. 127:171–216.

Leake JR, Cameron DD. 2010. Physiological ecology of mycoheterotrophy. New Phytol. 185:601–605.

Liere K, Weihe A, Borner T. 2011. The transcription machineries of plant mitochondria and chloroplasts: composition, function, and regulation. J Plant Physiol. 168:1345–1360.

Logacheva MD, Schelkunov MI, Nuraliev MS, Samigullin TH, Penin AA. 2014. The plastid genome of mycoheterotrophic monocot Petrosavia stellaris exhibits both gene losses and multiple rearrangements. Genome Biol Evol. 6:238–246.

Logacheva MD, Schelkunov MI, Penin AA. 2011. Sequencing and analysis of plastid genome in mycoheterotrophic orchid Neottia nidus-avis. Genome Biol Evol. 3:1296–1303.

Maddison WP, Maddison DR. 2011. Mesquite: a modular system for evolutionary analysis, version 2.75. Available from: http://mesquiteproject.org

Martín M, Funk HT, Serrot PH, Poltnigg P, Sabater B. 2009. Functional characterization of the thylakoid ndh complex phosphorylation by site-directed mutations in the ndhF gene. Biochim Biophys Acta. 1787:920–928.

Martín M, Sabater B. 2010. Plastid ndh genes in plant evolution. Plant Physiol Biochem. 48:636–645.

McCarty RE, Evron Y, Johnson EA. 2000. The chloroplast ATP synthase: a rotary enzyme? Annu Rev Plant Physiol Plant Mol Biol. 51:83–109.

McNeal JR, Arumugunathan K, Kuehl JV, Boore JL, Depamphilis CW. 2007. Systematics and plastid genome evolution of the cryptically photosynthetic parasitic plant genus Cuscuta (Convolvulaceae). BMC Biol. 5:55.

McNeal JR, Bennett JR, Wolfe AD, Mathews S. 2013. Phylogeny and origins of holoparasitism in Orobanchaceae. Am J Bot. 100:971–983.

Merckx V, Freudenstein JV. 2010. Evolution of mycoheterotrophy in plants: a phylogenetic perspective. New Phytol. 185:605–609.

Midford PE, Garland T Jr, Maddison WP. 2005. PDAP package of Mesquite, version 1.07. Available from: http://mesquiteproject.org

Molina J, Hazzouri KM, Nickrent D, Geisler M, Meyer RS, Pentony MM, Flowers JM, Pelser P, Barcelona J, Inovejas SA, et al. 2014. Possible loss of the chloroplast genome in the parasitic flowering plant Rafflesia lagascae (Rafflesiaceae). Mol Biol Evol. 31:793–803.

Montfort C, Küsters E. 1940. Saprophytismus und Photosynthese. I. Biochemische physiologische Studien an Humus-Orchideen. Botanisches Archiv 40:571–633.

Moran NA. 2002. Microbial minimalism: genome reduction in bacterial pathogens. Cell 108:583–586.

Morrison HG, McArthur AG, Gillin FD, Aley SB, Adam RD, Olsen GJ, Best AA, Cande WZ, Chen F, Cipriano MJ, et al. 2007. Genomic minimalism in the early diverging intestinal parasite Giardia lamblia. Science 317:1921–1926.

Nickrent DL, Yan OY, Duff RJ, dePamphilis CW. 1997. Do nonasterid holoparasitic flowering plants have plastid genomes? Plant Mol Biol. 34:717–729.

Patel RK, Jain M. 2012. NGS QC Toolkit: a toolkit for quality control of next generation sequencing data. PLoS One 7:e30619.

Peredo EL, King UM, Les DH. 2013. The plastid genome of Najas flexilis: adaptation to submersed environments is accompanied by the complete loss of the NDH complex in an aquatic angiosperm. PLoS One 8:e68591.

Porra RJ. 2002. The chequered history of the development and use of simultaneous equations for the accurate determination of chlorophylls a and b. Photosyn Res. 73:149–156.

Protasio AV, Tsai IJ, Babbage A, Nichol S, Hunt M, Aslett MA, De Silva N, Velarde GS, Anderson TJC, Clark RC, et al. 2012. A systematically improved high quality genome and transcriptome of the human blood fluke Schistosoma mansoni. PLoS Negl Trop Dis. 6:e1455.

Randle CP, Wolfe AD. 2005. The evolution and expression of rbcL in holoparasitic sister-genera Harveya and Hyobanche (Orobanchaceae). Am J Bot. 92:1575–1585.

Ranwez V, Harispe S, Delsuc F, Douzery EJP. 2011. MACSE: Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons. PLoS One 6:e22594.

Sanderson MJ. 1997. A nonparametric approach to estimating divergence times in the absence of rate constancy. Mol Biol Evol. 14:1218–1231.

Smith DR, Lee RW. 2014. A plastid without a genome: evidence from the nonphotosynthetic green algal genus Polytomella. Plant Physiol. 164:1812–1819.

Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690.

Straub SCK, Cronn RC, Edwards C, Fishbein M, Liston A. 2013. Horizontal transfer of DNA from the mitochondrial to the plastid genome and its subsequent evolution in milkweeds (Apocynaceae). Genome Biol Evol. 5:1872–1885.

Straub SCK, Fishbein M, Livshultz T, Foster Z, Parks M, Weitemier K, Cronn RC, Liston A. 2011. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing. BMC Genomics 12:211.

Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. 2013. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol. 30:2725–2729.

Tsai IJ, Zarowiecki M, Holroyd N, Garciarrubio A, Sanchez-Flores A, Brooks KL, Tracey A, Bobes RJ, Fragoso G, Sciutto E, et al. 2013. The genomes of four tapeworm species reveal adaptations to parasitism. Nature 496:57–63.

Umate P, Schwenkert S, Karbat I, Dal Bosco C, Mlcochova L, Volz S, Zer H, Herrmann RG, Ohad I, Meurer J. 2007. Deletion of psbM in tobacco alters the Q(B) site properties and the electron flow within photosystem II. J Biol Chem. 282:9758–9767.

Vaidya G, Lohman DJ, Meier R. 2011. SequenceMatrix: concatenation software for the fast assembly of multi-gene datasets with character set and codon information. Cladistics 27:171–180.

Westwood JH, Yoder JI, Timko MP, dePamphilis CW. 2010. The evolution of parasitism in plants. Trends Plant Sci. 15:227–235.

Wicke S, Muller KF, de Pamphilis CW, Quandt D, Wickett NJ, Zhang Y, Renner SS, Schneeweiss GM. 2013. Mechanisms of functional and physical genome reduction in photosynthetic and nonphotosynthetic parasitic plants of the broomrape family. Plant Cell 25:3711–3725.

Wicke S, Schäferhoff B, dePamphilis CW, Müller KF. 2014. Disproportional plastome-wide increase of substitution rates and relaxed purifying selection in genes of carnivorous Lentibulariaceae. Mol Biol Evol. 31:529–545.

Wicke S, Schneeweiss GM, dePamphilis CW, Muller KF, Quandt D. 2011. The evolution of the plastid chromosome in land plants: gene content, gene order, gene function. Plant Mol Biol. 76:273–297.

Wickett NJ, Honaas LA, Wafula EK, Das M, Huang K, Wu BA, Landherr L, Timko MP, Yoder J, Westwood JH, et al. 2011. Transcriptomes of the parasitic plant family Orobanchaceae reveal surprising conservation of chlorophyll synthesis. Curr Biol. 21:2098–2104.

Wimpee CF, Wrobel RL, Garvin DK. 1991. A divergent plastid genome in Conopholis americana, an achlorophyllous parasitic plant. Plant Mol Biol. 17:161–166.

Wolfe AD, dePamphilis CW. 1998. The effect of relaxed functional constraints on the photosynthetic gene rbcL in photosynthetic and nonphotosynthetic parasitic plants. Mol Biol Evol. 15:1243–1258.

Wolfe KH, Morden CW, Palmer JD. 1992. Function and evolution of a minimal plastid genome from a nonphotosynthetic parasitic plant. Proc Natl Acad Sci U S A. 89:10648–10652.

Wu FH, Chan MT, Liao DC, Hsu CT, Lee YW, Daniell H, Duvall MR, Lin CS. 2010. Complete chloroplast genome of Oncidium Gower Ramsey and evaluation of molecular markers for identification and breeding in Oncidiinae. BMC Plant Biol. 10:68.

Wyman SK, Jansen RK, Boore JL. 2004. Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20:3252–3255.

Yang JB, Tang M, Li HT, Zhang ZR, Li DZ. 2013. Complete chloroplast genome of the genus Cymbidium: lights into the species identification, phylogenetic implications and population genetic analyses. BMC Evol Biol. 13:84.

Yang ZH. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci. 13:555–556.

Yang ZH. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 24:1586–1591.

Yoder OC, Turgeon BG. 2001. Fungal genomics and pathogenicity. Curr Opin Plant Biol. 4:315–321.

Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18:821–829.

Zimmer K, Meyer C, Gebauer G. 2008. The ectomycorrhizal specialist orchid Corallorhiza trifida is a partial myco-heterotroph. New Phytol. 178:395–400.

versus leafy green outgroups, or at least in nongreen Corallorhiza versus green Corallorhiza and leafy outgroups. For example, in model M1 the ω value for Corallorhiza is nearly twice that for leafy orchids (ω_leafy = 0.15175; ω_cor = ω_mac-ng = ω_str = 0.26949), and in model M4 nongreen Corallorhiza have ω values over twice those for green Corallorhiza and leafy orchids. Further investigations using branch-site models, allowing some sites in the “foreground” branches to be under positive selection, suggest evidence for positive selection in nongreen Corallorhiza for the atp complex.

The plastid-encoded atp genes play a critical role in photosynthesis by both generating ATP and translocating protons across the thylakoid membrane (Evron et al. 2000; McCarty et al. 2000; Allen et al. 2011). However, in taxa that display evidence of loss-of-function for photosystem and other photosynthesis-related genes, the fact that atp genes are preserved as intact reading frames suggests that they may play additional roles in plastid function (discussed in Wicke et al. 2013). This is bolstered by the fact that this complex shows evidence of positive selection in strictly heterotrophic, nongreen taxa but not for Corallorhiza as a whole. This would support the hypothesis that the atp complex has either taken on additional, as-of-yet unknown roles in holomycotrophic plastids or, at a minimum, has fallen under increased selective pressure to adapt to the overall “alternative lifestyle” associated with strict fungal heterotrophy in nonphotosynthetic plastids (including in amyloplasts, chromoplasts, elaioplasts, proteinoplasts, etc.). A future research objective is to examine the potential biological roles and molecular evolutionary patterns of the atp complex in nonphotosynthetic plants and their green, photosynthetic relatives, as members of this gene complex are often preserved and putatively functional among various heterotrophic angiosperms (see fig. 3 and references therein, including discussion in Wicke et al. 2013).

Comparison of Corallorhiza with Other Parasitic Angiosperms Using the Model of Barrett and Davis (2012)

A comparison of the plastomes of several heterotrophic taxa that have been sequenced to date shows that Corallorhiza is not quite in the “advanced” stages of plastid genome degradation relative to several taxa in Orobanchaceae (e.g., Wicke et al. 2013) or other orchids such as Neottia (Logacheva et al. 2012) and Rhizanthella (Delannoy et al. 2011) (fig. 3). Interpreted in the context of the basic model presented in Barrett and Davis (2012), members of Corallorhiza fall into categories 1 and 3. Taxa with some visible green tissue display pseudogenes/deletions mainly in the ndh complex (category 1; gene-class categories are listed at the top of fig. 3), with a few additional changes, whereas those with no green tissue have experienced pseudogenization and deletions in ndh, photosynthesis-related, and rpo genes, but not in atp genes or other “housekeeping” genes (category 3, excepting the loss of a redundant trnT-GGU in Co. striata vreelandii; Barrett and Davis 2012; figs. 2 and 3). Thus, nonparasitic autotrophs and partial heterotrophs roughly fall either into category 0 (no degradation of plastid genes) or category 1 (degradation of the ndh complex only). More specifically, all partial heterotrophs fall into category 1, with evidence of degradation of the ndh complex.

Strict heterotrophs fall into more “advanced” categories. Several, but not all, members of Orobanchaceae and two orchids (Neottia and Rhizanthella) can be classified in category 5, in which there is evidence of degradation of atp genes and at least some evidence of degradation in “housekeeping” genes (e.g., rpl, rps, trn, etc.). There are some genes for which no taxa have experienced pseudogenization or loss, including rpl2, rpl16, rpl20, rps2, rps4, rps8, rps11, ycf1, ycf2, trnD-guc, trnE-uuc, trnI-cau, trnM-cau, trnQ-uug, trnY-gua, rrn16, rrn23, rrn4.5, and rrn5. This suggests that they may be indispensable for plastid function, at least in the angiosperms included in this comparison, and suggests that a “minimal set” of genes might be necessary for functional plastid “housekeeping” and perhaps other functions outside of photosynthesis and ATP production.

Overall, no taxa fell in categories 2 or 4, suggesting that degradation of the rpo genes occurs more or less simultaneously with degradation in photosynthesis-related genes, and that degradation of atp genes occurs along with that in “housekeeping” genes (also discussed in Wicke et al. 2013). Thus, the model of Barrett and Davis (2012) could be modified in light of more recently sequenced plastomes, including those of Corallorhiza, to have five general categories: 1) no degradation; 2) degradation in the ndh complex only; 3) degradation of the ndh complex and photosynthesis-related genes, including rpo genes; 4) degradation in all plastid gene systems, including “housekeeping” and atp genes; and 5) complete or nearly complete loss of the plastid genome (as is hypothesized to have occurred in some strict heterotrophs such as Corynaea and Rafflesia [Nickrent et al. 1997; Molina et al. 2014]).
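The five categories of the modified model can be expressed as a simple decision rule. The following is a minimal sketch, not part of the original analysis: the function name, the class labels, and the `plastome_lost` flag are hypothetical, and the rule assumes that a taxon is assigned to the most advanced category for which any degradation is observed. Taxa that deviate from the model (e.g., Cuscuta, with degraded rpo and housekeeping genes but retained photosynthesis genes) would be lumped into category 4 by this rule, so it should be read as an illustration of the stepwise scheme rather than a classifier for real data.

```python
# Functional gene classes, in the order of degradation hypothesized by the model.
GENE_CLASSES = ("ndh", "photosynthesis", "rpo", "atp", "housekeeping")


def modified_category(degraded, plastome_lost=False):
    """Return the category (1-5) of the modified model for one taxon.

    `degraded` is the set of gene classes that show pseudogenes or deletions.
    `plastome_lost` marks complete or nearly complete loss of the plastome.
    Both arguments are hypothetical inputs chosen for this illustration.
    """
    degraded = set(degraded)
    unknown = degraded - set(GENE_CLASSES)
    if unknown:
        raise ValueError(f"unknown gene classes: {sorted(unknown)}")
    if plastome_lost:
        return 5  # complete or nearly complete loss of the plastid genome
    if degraded & {"atp", "housekeeping"}:
        return 4  # degradation extends to atp and/or housekeeping genes
    if degraded & {"photosynthesis", "rpo"}:
        return 3  # ndh plus photosynthesis-related and rpo genes
    if "ndh" in degraded:
        return 2  # ndh complex only
    return 1      # no degradation


# Pattern described for green, partially heterotrophic Corallorhiza.
green = modified_category({"ndh"})
# Pattern described for nongreen, strictly heterotrophic Corallorhiza.
nongreen = modified_category({"ndh", "photosynthesis", "rpo"})
```

Under these assumptions, the green pattern maps to category 2 and the nongreen pattern to category 3, matching the placement of Corallorhiza in the modified model.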

There is some degree of deviation from the Barrett and Davis (2012) model, as might be expected given the potential for somewhat idiosyncratic patterns of plastome evolution in unrelated groups. The obvious exception to this model would be the sequenced members of the genus Cuscuta in the family Convolvulaceae. One of the best-studied parasitic systems in terms of plastome evolution, Cuscuta contains members with “extreme” plastome reduction relative to closely related leafy species (approximately 50% total size reduction in Cuscuta gronovii and Cuscuta obtusiflora relative to the leafy Ipomoea) (fig. 3 and references therein). Yet both highly reduced species of Cuscuta retain most, if not all, of the genes for photosynthesis but have degraded RNA polymerase complexes and also show evidence of pseudogenes/deletions of some housekeeping genes. A further survey based on probe hybridization in Cuscuta with greatly increased taxon sampling by Braukmann et al. (2013) suggests that this genus encapsulates multiple transitions to highly degraded plastid genomes, with members of subgenus Grammica displaying even more advanced stages of degradation. These taxa were not included in figure 3, as they are based on hybridization and have not yet been sequenced to generate complete plastomes. Recycling of respiratory CO2 has been offered as an explanation for the retention of putatively functional genes in the psa, psb, and pet complexes (McNeal et al. 2007), underscoring the importance of studying the physiology of strict heterotrophs on a taxon-by-taxon basis, and also highlighting how little is known about the basic physiology and functional genomics of these plants.

Conclusion

All members of Corallorhiza produce detectable levels of chlorophyll based on the current sample, but nongreen, strictly heterotrophic members have, on average, tenfold lower concentrations relative to green, partially heterotrophic members. Genomic data allowed assembly of complete plastomes as well as partial rDNA operons, which, when combined, fully resolve relationships among the major species complexes in Corallorhiza. Based on complete plastomes, Corallorhiza can be said to span the “early” stages of plastid genome degradation as a result of relaxed selective pressures on photosynthetic function. Corallorhiza is useful in providing a model system to understand the earliest stages in the transition to a “minimal plastome,” as are observed in other heterotrophic plants (e.g., Orobanchaceae, other orchids like Rhizanthella). There have been at least two independent transitions to strict heterotrophy from partial heterotrophy in the genus (Co. maculata and Co. striata complexes). Nongreen coralroots also display evidence of relaxed selective pressures on photosynthesis-related genes which have retained open reading frames. Members of the atp complex display relaxed purifying selection among Corallorhiza as a whole, but more specifically display evidence for positive selection for nongreen, strictly heterotrophic taxa, although for reasons currently unknown.

Generally, there is a clear path of plastid genome degradation in the heterotrophic angiosperm lineages studied thus far (with some deviations from the overall model, e.g., some Cuscuta). Moreover, it can be hypothesized that there is a stepwise progression to a “minimal” plastome, and Corallorhiza occupies categories 2 (partially heterotrophic taxa, which display degradation of the ndh complex only and a few other pseudogenes) and 3 (strictly heterotrophic taxa, including degradation of the ndh complex, photosynthesis-related genes, and rpo genes [excluding atp genes]) in a modified model of the degree of plastome degradation among angiosperms. Future studies will include 1) deeper sampling of complete plastomes within species complexes of Corallorhiza to gain a phylogeographic perspective on plastome degradation; 2) direct assessment of the relative photosynthetic capabilities of members of each coralroot taxon in the field; and 3) quantification of transcript levels for plastid-encoded genes within each species to gain a clearer understanding of gene expression, and to test the hypothesis of the existence of duplicate, functional copies of pseudogenized or deleted plastid genes.

Materials and Methods

Plant Material, DNA Isolation, Library Preparation, and PE Sequencing

Plant material was collected and preserved either in silica gel or frozen at −80 °C. Total genomic DNAs were extracted using the cetyltrimethylammonium bromide (CTAB) method (Doyle JJ and Doyle JL 1987) from 1.0 g frozen or 0.2–0.5 g silica-dried material and quantified using a NanoDrop 2000 spectrophotometer (Thermo Scientific, Waltham, MA). Samples with total DNA concentrations >100 ng/µl were selected and run on a 1.5% agarose gel to check for high molecular weight, nondegraded DNA. Samples were shipped on ice to Cold Spring Harbor Laboratory (CSHL, Woodbury, NY), where they were sheared to approximately 450 bp, followed by library preparation and barcoding for PE Illumina sequencing. The exception was Co. odontorhiza, for which library preparation and sequencing were completed at the University of Missouri–Columbia using the Illumina TruSeq Library Preparation Kit (Illumina Inc., San Diego, CA); Co. odontorhiza DNAs were sheared to approximately 320 bp. All samples were run on Illumina HiSeq 2000 machines to generate 100-bp PE reads. Indexed samples were pooled and run with accessions from other projects, with a total of 40 or 20 accessions in a single lane on the CSHL and University of Missouri machines, respectively. Voucher information and characteristics of each of the resulting data sets and completed plastomes are listed in table 1.

Quantification of Chlorophyll Content

Approximately 0.1 g of −80 °C frozen tissue was harvested from the outer wall of the developing ovary and weighed for chlorophyll quantification. Ovary tissue was chosen 1) because this is the only organ that is visibly green in some taxa (e.g., Co. odontorhiza, Co. wisteriana) and 2) to be consistent between taxa. Chlorophylls were extracted in 80% ice-cold acetone and stored at −80 °C; chlorophylls a and b were quantified on a Cary 4000 UV-VIS spectrophotometer (Agilent Technologies Inc., Santa Clara, CA; Department of Plant Cellular & Molecular Biology, Ohio State University), and their concentrations were calculated using formulas of Porra (2002) (Lima D, personal communication). Data from chlorophylls a and b were pooled to give total chlorophylls. Although individual accessions used to quantify chlorophyll concentration among species were not the same as those used for sequencing, multiple individuals from across each species’ geographic range were included to account for variation in chlorophyll content among individuals within species. Corallorhiza bentleyi (an endangered member of the Co. striata complex) is included here but was not sequenced. Conversely, Mexican-endemic members of the Co. maculata complex (Co. macrantha, Co. bulbosa, Co. maculata var. mexicana) were not included in the chlorophyll analysis due to rarity and thus lack of fresh/frozen material. Voucher specimens for plants used for chlorophyll analysis are listed in supplementary figure S5, Supplementary Material online.
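The chlorophyll calculation can be illustrated with a short sketch. The text does not state which of the equations reviewed by Porra (2002) were applied, so the coefficients below are an assumption: they are the widely used simultaneous equations for buffered 80% aqueous acetone (absorbances at 663.6 and 646.6 nm, concentrations in µg/ml). The function names, the extract volume, and the example absorbances are hypothetical and are not data from this study.

```python
def chlorophyll_ug_per_ml(a663_6, a646_6):
    """Chlorophyll a, b, and total (µg/ml) in 80% acetone.

    Assumes the 80% acetone equations commonly attributed to Porra and
    colleagues; absorbances are assumed to be baseline-corrected already.
    """
    chl_a = 12.25 * a663_6 - 2.55 * a646_6
    chl_b = 20.31 * a646_6 - 4.91 * a663_6
    return chl_a, chl_b, chl_a + chl_b


def chlorophyll_ug_per_g(a663_6, a646_6, extract_volume_ml, fresh_mass_g):
    """Total chlorophyll per gram of tissue (a and b pooled, as in the text)."""
    _, _, total = chlorophyll_ug_per_ml(a663_6, a646_6)
    return total * extract_volume_ml / fresh_mass_g


# Hypothetical readings for roughly 0.1 g of ovary tissue in 1 ml of extract.
example = chlorophyll_ug_per_g(0.20, 0.10, extract_volume_ml=1.0, fresh_mass_g=0.1)
```

Pooling a and b before normalizing by tissue mass mirrors the statement that data from both chlorophylls were combined to give total chlorophylls.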

Data Processing

The resulting paired FASTQ data were processed using a custom UNIX script (“process_fastq.sh”) that calls several freely available Perl scripts for the purposes of 1) quality filtering/trimming and adaptor removal (“IlluQC_PRLL.pl,” “TrimmingReads.pl”; Patel and Jain 2012); 2) “shuffling” PE files into a single file (“ShuffleSequences_fastq.pl,” part of the VELVET suite of tools [E. Cabot, University of Wisconsin-Madison; Zerbino and Birney 2008]); and 3) conversion to FASTA format (“fastq2fasta.pl”; B. Knaus 2009 [http://brianknaus.com/software/srtoolbox, last accessed May 1, 2014]). Reads were trimmed from their 3′-ends, discarding bases under a quality threshold of PHRED score of 20 (corresponding to an error rate of 1/100 or greater), and also discarding trimmed reads below 61 bp in length.
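The trimming rule described above can be sketched as follows. This is a Python stand-in written for illustration, not the Perl scripts that were actually used, and their exact trimming behavior may differ; the function names are hypothetical, and Phred+33 quality encoding is assumed.

```python
def phred_to_error(q):
    """Convert a PHRED score to the corresponding base-call error probability."""
    return 10 ** (-q / 10)


def decode_phred33(quality_string):
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(char) - 33 for char in quality_string]


def trim_3prime(seq, quals, min_q=20, min_len=61):
    """Remove low-quality bases from the 3' end of one read.

    Bases are removed from the 3' end while their quality is below `min_q`.
    Returns (sequence, qualities), or None if the trimmed read is shorter
    than `min_len` and should be discarded.
    """
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    if end < min_len:
        return None
    return seq[:end], quals[:end]


# A hypothetical 70-bp read whose last five bases fall below Q20.
read = "A" * 70
qualities = [30] * 65 + [10] * 5
trimmed = trim_3prime(read, qualities)
```

With paired-end data, a pair would typically be kept only if both mates survive trimming; that bookkeeping is omitted here for brevity.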

Read Mapping

PE reads were then mapped five times iteratively in Geneious v6 (created by Biomatters Inc., Auckland, New Zealand; http://www.geneious.com, last accessed December 1, 2013) using automatically determined settings to a combined file of 1) the Phalaenopsis aphrodite plastome (GenBank accession number NC_007499; Chang et al. 2006), 2) the Phoenix dactylifera mitochondrial genome (NC_016740; Fang et al. 2012), and 3) the Asclepias syriaca nuclear ribosomal DNA consensus operon (JF312046; Straub et al. 2011). The purpose of read mapping was to filter or “screen” reads that matched the reference genomes and to get initial, crude estimates of coverage to be used in downstream de novo assemblies.

Plastome Assembly

Mean coverage estimates from Geneious were used as input for de novo plastome assembly (“-exp_cov” parameter). De novo assembly was completed using both the VELVET v1.2.10 assembler (Zerbino and Birney 2008) and the Geneious v6 de novo assembler. VELVET assemblies were generated using a variety of hash lengths (i.e., “word match” or “kmer length”) ranging from 51 to 81 for each complete data set, with specified expected coverage estimates taken from Geneious mapping (specified to be in approximate kmer coverage; see Zerbino and Birney 2008). Coverage cutoff values were input as roughly one-half of the specified expected coverage estimates, minimum contig length was set to 300 bp, and insert size was set to 450 bp (320 bp for Co. odontorhiza); the resulting contigs were assembled in Sequencher v6 (GeneCodes, Ann Arbor, MI). The resulting “draft de novo assembly” from VELVET was used as a reference to map reads in Geneious as described above (five iterations, automatically determined settings). Read mappings were visually scanned in Geneious to check for areas of low coverage or potential misassemblies. Read pairs that mapped successfully to the draft reference were used in a second round of de novo assembly, this time in Geneious with automatically determined settings, saving 100 consensus contigs. This method allowed construction of complete draft plastomes with relative ease.
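The conversion from nucleotide coverage to approximate kmer coverage can be sketched with the relation given in the Velvet documentation, Ck = C × (L − k + 1) / L, where C is nucleotide coverage, L is read length, and k is the hash length. In the sketch below the function names are hypothetical, a uniform read length of 100 bp is assumed (trimmed reads in this study ranged from 61 to 100 bp), and the dictionary keys simply echo the settings named in the text rather than reproducing an exact command line.

```python
def kmer_coverage(nucleotide_coverage, read_length, k):
    """Approximate kmer coverage from nucleotide coverage: C * (L - k + 1) / L."""
    if not 0 < k <= read_length:
        raise ValueError("k must be between 1 and the read length")
    return nucleotide_coverage * (read_length - k + 1) / read_length


def velvet_settings(nucleotide_coverage, read_length=100, k=61, insert_size=450):
    """Collect assembly settings following the rules described in the text."""
    expected = kmer_coverage(nucleotide_coverage, read_length, k)
    return {
        "hash_length": k,
        "exp_cov": expected,          # expected coverage, in kmer coverage
        "cov_cutoff": expected / 2,   # roughly one-half of expected coverage
        "min_contig_length": 300,     # bp
        "insert_size": insert_size,   # bp; 320 for Co. odontorhiza
    }


# Hypothetical mean nucleotide coverage of 100x from read mapping.
settings = velvet_settings(100)
```

Because kmer coverage falls as k increases, the expected coverage and cutoff need to be recomputed for each hash length tried between 51 and 81.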

Contigs from both assemblies (Geneious, VELVET) were merged in Sequencher to form complete plastomes (parameters = “dirty data” or “large gap”; minimum overlap = 20 bp; minimum similarity = 70–90%). Any apparent “gaps” among contigs resulting from one software platform were usually covered by a contig from the other, allowing assembly of completed plastomes (Large Single Copy–Inverted Repeat B–Small Single Copy). In a few cases, VELVET scaffolds contained strings of Ns between 30 and 100 bp, mostly associated with homopolymer repeats (>10 bp). These were resolved by 1) the Geneious de novo assembly (typically) and 2) UNIX “grep” searches, using flanking sequences of 30–50 bp taken from each end of the string of Ns as search terms. Grep results were converted to FASTA format using a custom Perl script to build a “bridge” of reads. This was done iteratively until the region in question was resolved, although most cases were resolved with a single iteration. Grep searches were applied to all cases of ambiguity between contigs from the two de novo assembly methods as a quality control. Polymerase chain reaction amplification and Sanger sequencing were not necessary due to high read coverage (table 1).
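The flank-based search used to bridge strings of Ns can be sketched in a few lines. The original workflow used UNIX grep and a custom Perl script that are not reproduced here; the Python stand-in below is an assumption about how such a search could work, with hypothetical function names and toy sequences. It also searches the reverse complement of the flank, which may or may not match the original procedure.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def reads_matching_flank(reads, flank):
    """Return reads containing the flank, oriented to match the flank."""
    flank = flank.upper()
    flank_rc = revcomp(flank)
    hits = []
    for read in reads:
        read = read.upper()
        if flank in read:
            hits.append(read)
        elif flank_rc in read:
            hits.append(revcomp(read))
    return hits


def to_fasta(seqs, prefix="read"):
    """Format sequences as FASTA text for import into an assembly editor."""
    return "".join(f">{prefix}_{i}\n{seq}\n" for i, seq in enumerate(seqs, 1))


# Toy example: the flank occurs in one read directly and in another as its
# reverse complement; the third read does not match.
toy_reads = ["TTAAGGCTCC", "GGAGCCTTAA", "CCCCCCCC"]
bridge = reads_matching_flank(toy_reads, "AAGGCT")
fasta_text = to_fasta(bridge)
```

Reads recovered this way extend beyond the flank into the unresolved region; repeating the search with the newly exposed sequence as the next flank corresponds to the iterative bridging described above.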

IR boundaries for each draft plastome were determined by using the first and last approximately 20 bp of the draft plastome as search terms (reverse complement) in Sequencher to find matches, and marking their match points as features. Then, the first approximately 700 bp and last approximately 1–2000 bp of the draft plastome were copied, reverse complemented, and manually aligned to the IR, using the marked features from above as reference points. IR boundaries were determined informatically using PE information (i.e., read pairs that spanned the IR boundaries). Because the orientation of the ends of the LSC and SSC can vary, the plastomes were often in different configurations; single copy regions were reoriented to correspond to the “standard” model of gene order to simplify downstream comparison. Special attention was paid to the possibility of significant modification or even loss of the IR (either partial or complete), and also to the possibility of other genomic rearrangements or structural modifications. In other words, when building draft plastomes, it was not assumed that all accessions displayed the “standard” plastome structure found in angiosperms (i.e., LSC, IRb, SSC, IRa).
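The first step of the IR search, locating reverse-complement matches of the terminal seeds, can be illustrated with a toy sequence. This sketch only mimics the seed search that was performed interactively in Sequencher; the function names and the toy plastome are hypothetical, and the confirmation of boundaries with paired-end reads is not modeled.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_all(text, pattern):
    """All start positions (0-based, overlapping allowed) of pattern in text."""
    hits = []
    start = text.find(pattern)
    while start != -1:
        hits.append(start)
        start = text.find(pattern, start + 1)
    return hits


def find_inverted_matches(draft, seed_length=20):
    """Locate reverse-complement matches of the first and last seeds of a draft."""
    draft = draft.upper()
    return {
        "first_seed_rc": find_all(draft, revcomp(draft[:seed_length])),
        "last_seed_rc": find_all(draft, revcomp(draft[-seed_length:])),
    }


# Toy draft laid out as LSC + IR + SSC + reverse-complemented IR.
lsc, ir, ssc = "AAAAACCCCC", "GATTACAGGT", "TTTTTGGGGG"
toy_draft = lsc + ir + ssc + revcomp(ir)
matches = find_inverted_matches(toy_draft, seed_length=5)
```

In the toy draft, the reverse complement of the last seed is found at the start of the first IR copy, which is the kind of match point that was marked as a feature before manual alignment. In real data, short seeds can also match by chance, which is one reason the boundaries were checked against spanning read pairs.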

Plastome Annotation

Plastid genomes were annotated by first aligning the set of coding loci (coding sequences [CDS]) containing introns and all tRNAs from Phalaenopsis to each draft plastome (minus IRa) in Sequencher to aid in intron/exon and tRNA boundary determination. Annotations were conducted in DOGMA (Wyman et al. 2004), which uses BLASTX searches against a database of annotated plastomes and amino acid translations and allows manual determination of start/stop codons. DOGMA BLASTX parameters were as follows: genetic code = plant plastid, e value = 5, percent cutoff for protein coding genes = 55, and percent cutoff for tRNAs = 80. The resulting GenBank feature table was exported from the DOGMA server and imported, along with the finished plastid FASTA file, into SEQUIN (National Center for Biotechnology Information), where the annotation was validated and completed for submission to GenBank. A large inversion in Co. maculata var. maculata was manually reversed for the purposes of phylogenetic analyses and other whole-plastome comparisons, but the plastome for that accession was annotated with the inversion as assembled de novo. Plastome maps were drawn with GenomeVx (Conant and Wolfe 2008) and manually edited in Adobe Illustrator CS6 (Adobe Inc., San Jose, CA).

Ribosomal DNA Assembly and Annotation

Nuclear ribosomal DNA reads were filtered by mapping reads from Co. odontorhiza to a nearly complete rDNA operon reference (partial ETS [External Transcribed Spacer], 18S rDNA, ITS1 [Internal Transcribed Spacer], 5.8S rDNA, ITS2, 26S rDNA) from A. syriaca, described above, as well as mapping to an ITS-spanning sequence from Co. odontorhiza from GenBank (accession number EU391326). Other regions were not assembled (e.g., 5S rDNA). Resulting reads that mapped to each reference were assembled de novo in Geneious, and resulting contigs were merged through overlap in Sequencher. This draft assembly was used for a second round of read mapping (original Co. odontorhiza reads) and de novo assembly in Geneious, resulting in a single consensus contig. The consensus contig was based on 50% majority per site, bearing in mind that the rDNA operon exists in numerous repeated copies that often differ from one another, appearing as polymorphism in read pileups. The Co. odontorhiza rDNA consensus operon was then used as a reference in read mapping for all other Corallorhiza species, followed by individual de novo assemblies as above.

Nearly complete ribosomal DNA operons (partial ETS, 18S rDNA, ITS1, 5.8S, ITS2, 26S rDNA; excluding 5S rDNA) were annotated by first aligning finished contigs in Geneious (using the MUSCLE v3.8.31 plugin in Geneious; Edgar 2004) and transferring the annotation features from Co. odontorhiza to the aligned Corallorhiza sequences. Then, the entire alignment was annotated in SEQUIN for submission to GenBank by adding each feature to Co. odontorhiza and subsequently propagating the features to the remaining sequences. Reads were again mapped in Geneious to each finished, annotated rDNA cistron to check for misassemblies. GenBank accession numbers for rDNA partial operons are KM390003–KM390012. Available outgroup sequences for Cymbidium, Oncidium, and Phalaenopsis were mined from GenBank for the purposes of downstream phylogenetic analyses (18S rDNA: Cy. farberi, GenBank accession number JN418934; ITS: Cy. aloifolium, JF729014; Phalaenopsis aphrodite, AY391543; O. andradeanum, FJ565598).

Data Matrix Assembly and Sequence Alignment

Annotated plastomes were imported into Geneious, with subsequent removal of the second copy of the IR (IRa), and aligned with MAFFT v7.157 (Katoh et al. 2002) using the “fast-progressive” method (gap opening penalty = 6, offset = 0.1), followed by manual adjustment. The alignment also included complete plastomes of Cy. aloifolium (GenBank accession number NC_021429), Oncidium Gower Ramsey (GenBank accession number NC_014056), and Phalaenopsis aphrodite (GenBank accession number NC_017049). A short inversion of the petN-psbM region in Cymbidium was removed for that accession in downstream analyses. Annotated CDS were exported from Geneious, and data from each locus were saved as separate FASTA files. Each file was submitted for codon-based alignment to the MACSE web server (Ranwez et al. 2011; http://mbb.univ-montp2.fr/MBB/subsection/softExec.php?soft=macse, last accessed May 30, 2014), which allows alignment of both intact open reading frames and incomplete sequences (including pseudogenes with modified reading frames). Alignments were adjusted manually when necessary and imported back into Geneious or SequenceMatrix (Vaidya et al. 2011), where they were concatenated into larger matrices. Protein-coding genes that contained pseudogenes (shifted reading frames, internal stop codons) in at least one taxon were excluded. Data matrices are deposited in supplementary fig. S6, Supplementary Material online.

Phylogenetic Analyses

Phylogenetic analyses were conducted using both Parsimony and ML. Parsimony searches were conducted in TNT (Goloboff et al. 2003, 2008) with 100 random addition starting sequences, saving 100 trees per replicate, and using tree bisection–reconnection branch swapping on the pool of saved trees. Support was assessed through 2,000 Jackknife pseudoreplicates with the above search parameters. Model fit for ML analyses was assessed for the complete plastome alignment in MEGA v6 (Tamura et al. 2013), which yielded GTR+G as the best-fit model based on the corrected Akaike Information Criterion (AICc) metric (Burnham and Anderson 2002). All ML analyses were conducted in RAxML v8.0.20 (Stamatakis 2006). Three separate searches were initiated for each matrix analyzed, from different random starting seeds, using an unpartitioned GTR+G model to ensure convergence on the same topology. A codon-partitioned model was not chosen due to the possibility of relaxed functional constraints among some coding regions for some taxa, which could possibly represent model violations. Support was assessed with 1,000 standard bootstrap pseudoreplicates. Cymbidium was chosen as the outgroup taxon.

Plastid Pseudogenes, Genomic Deletions, and Genomic Attributes

The presence of pseudogenes and whole gene/regional deletions was initially assessed visually in DOGMA by identifying interrupted BLASTX reading frames. All coding loci aligned in MACSE (see above) were imported into Geneious, and sequences were translated to assess pseudogene status through the presence of frame shifts and/or premature stop codons. Loci with interrupted reading frames were considered to be pseudogenes unless there was an alternative start or stop codon within three codon positions of the other taxa in the alignment. In addition, the alignment of whole annotated plastomes (described above) was used to visualize any larger genomic deletions spanning CDS boundaries, or rearrangements.
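The translation-based screen for interrupted reading frames can be sketched as follows. This is a simplified illustration and not the Geneious procedure itself: the function names are hypothetical, a coding sequence whose ungapped length is not a multiple of three is treated as frame-shifted (a crude proxy), and the exception for alternative start or stop codons within three codon positions is not implemented. The codon table is the standard one, whose amino acid assignments are shared with the bacterial and plant plastid code.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Build the 64-codon table from the conventional TCAG ordering.
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence; codons with ambiguous bases become 'X'."""
    cds = cds.upper()
    codons = (cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def looks_like_pseudogene(cds):
    """Flag a sequence with a disrupted frame or a premature stop codon."""
    cds = cds.upper().replace("-", "")  # drop alignment gaps
    if len(cds) % 3 != 0:
        return True                     # length suggests a frame shift
    protein = translate(cds)
    return "*" in protein[:-1]          # stop codon before the final codon


intact = looks_like_pseudogene("ATGAAATAA")          # Met-Lys-stop
interrupted = looks_like_pseudogene("ATGTAAAAATAA")  # internal stop codon
```

Flagged loci would still need inspection against the alignment, since an apparent interruption near either end may reflect an alternative start or stop codon rather than loss of function.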

Hypotheses of differences in plastid genomic attributes among nongreen versus green taxa, and correlations between chlorophyll content and genomic attributes, were assessed using two phylogeny-based analytical approaches. First, a Phylogenetic Analysis of Variance (Garland et al. 1993) was used to test for differences in continuous plastome variables (e.g., length, GC content) among green versus nongreen taxa (i.e., a discrete grouping variable), taking phylogenetic relationships into account. Analyses were run in the R package “geiger” (Harmon et al. 2008) with 10,000 simulations each, using the “aov.phylo” function, based on an ultrametric tree generated through Nonparametric Rate Smoothing (Sanderson 1997). These analyses were run for a complete data set of 13 taxa (three leafy, green outgroup taxa and ten species of Corallorhiza). A second set of analyses was conducted on a subset of six taxa of Corallorhiza for which chlorophyll data were available, to mitigate problems with missing chlorophyll data. In these analyses, continuous genomic variables (length, number of putatively functional genes, GC content) were correlated with chlorophyll content (continuous) using Phylogenetically Independent Contrasts to account for phylogenetic relatedness among taxa (Felsenstein 1985). The latter analyses were conducted under a Brownian motion model with the PDAP package (Midford et al. 2005) in Mesquite v2.75 (Maddison WP and Maddison DR 2011).
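The logic of independent contrasts (Felsenstein 1985) can be illustrated on a toy tree. The sketch below is a bare-bones version of the algorithm under Brownian motion and is not the PDAP implementation; the tree encoding (nested tuples of the form (label or pair of children, branch length)), the function names, and all trait values are hypothetical, and strictly positive branch lengths on a fully bifurcating tree are assumed.

```python
import math


def contrasts(tree, trait):
    """Standardized independent contrasts for one trait.

    A leaf is (name, branch_length); an internal node is
    ((left_child, right_child), branch_length).
    """
    results = []

    def visit(node):
        label, length = node
        if isinstance(label, str):          # leaf: return its trait value
            return trait[label], length
        left, right = label
        x1, v1 = visit(left)
        x2, v2 = visit(right)
        # Contrast between sister nodes, scaled by its expected standard deviation.
        results.append((x1 - x2) / math.sqrt(v1 + v2))
        # Weighted ancestral value and lengthened branch for the parent node.
        value = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
        return value, length + v1 * v2 / (v1 + v2)

    visit(tree)
    return results


def correlation_through_origin(xs, ys):
    """Correlation of two sets of contrasts, computed through the origin."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Toy three-taxon tree ((A:1, B:1):1, C:2) with made-up trait values.
clade_ab = ((("A", 1.0), ("B", 1.0)), 1.0)
toy_tree = ((clade_ab, ("C", 2.0)), 0.0)
chlorophyll = {"A": 1.0, "B": 3.0, "C": 5.0}
gene_count = {"A": 2.0, "B": 6.0, "C": 10.0}
r = correlation_through_origin(contrasts(toy_tree, chlorophyll),
                               contrasts(toy_tree, gene_count))
```

Contrasts are correlated through the origin because the sign of each contrast is arbitrary; the two toy traits are exactly proportional, so their contrasts are perfectly correlated.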

Model-Based Analyses of Selective Regime

Changes in selective regime among species of Corallorhiza for codon-based matrices were assessed with the CODEML module in PAML v4.8 (Yang 1997, 2007) using “branch” models. Only loci for which all taxa had open reading frames were included in the analyses. Various matrices are listed in table 3. The objective was to test hypotheses of different ω ratios for 1) Corallorhiza versus leafy outgroups and 2) nongreen Corallorhiza versus green Corallorhiza + leafy outgroups. To accomplish this, different branch models were fit to the data using the topology resulting from the ML analysis of whole plastomes + rDNA. Six different branch models were tested. First, a model in which all branches evolve under one ω ratio (Model M0) was tested, where ω = dN/dS, the ratio of nonsynonymous substitutions per nonsynonymous site to synonymous substitutions per synonymous site. A two-ratio model (M1) was then run, in which all branches of Corallorhiza were allowed to have a different ω ratio (ω_cor) than the remaining leafy outgroup sequences (ω_leafy). A three-ratio model was then run (M2), allowing separate ω values for ω_leafy, ω_cor, and nongreen Corallorhiza (ω_str = ω_mac-ng; Co. striata, Co. mertensiana, Co. maculata var. maculata and occidentalis, respectively), followed by a four-ratio model (M3) with additionally separate ω ratios for Co. striata (ω_str) and the nongreen members of the Co. maculata complex (ω_mac-ng). The latter three models (M1–M3) specifically test hypotheses of whether Corallorhiza, and particularly nongreen Corallorhiza, display departures from purifying selection. Two additional models were run to test whether the nongreen Corallorhiza differ significantly from green Corallorhiza and leafy outgroup taxa. Specifically, Model M4 (two ratios) sets ω_leafy = ω_cor and ω_str = ω_mac-ng, whereas M5 (three ratios) allows ω_str and ω_mac-ng to differ. Successively nested models were compared based on log-likelihood ratio tests (M1 vs. M0, M2 vs. M1, M3 vs. M2, M4 vs. M0, M5 vs. M4), where the likelihood ratio test statistic was calculated as two times the difference in log-likelihood between the two models, with degrees of freedom calculated as the difference in the number of free parameters for the models, and tested against the χ² distribution. P values for multiple comparisons to the same data set were Bonferroni-corrected (Dunn 1959), as suggested by Yang (2007), with the corrected significance level corresponding to α/m, where m is the number of branches being tested. Genes ycf1 and ycf2 were analyzed separately because they have extraordinarily high substitution rates (e.g., Barrett et al. 2013), as was matK, based on the finding of a pseudogenized copy in a previous study (Freudenstein and Senyo 2008).
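The nested-model comparison can be sketched with the standard library alone. The function names and the example log-likelihoods below are hypothetical and are not results from this study; the χ² tail probability is implemented only for 1 degree of freedom (the case for each successive pair of branch models, which differ by one ω parameter) and for even degrees of freedom, where simple closed forms exist.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of the chi-square distribution."""
    if x < 0:
        raise ValueError("test statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df > 0 and df % 2 == 0:
        half = x / 2
        term = total = 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    raise NotImplementedError("only df = 1 or even df are handled in this sketch")


def likelihood_ratio_test(lnl_null, lnl_alt, df=1, alpha=0.05, n_tests=1):
    """Compare two nested models; returns (statistic, p value, significant).

    The statistic is twice the difference in log-likelihood, and the
    significance level is Bonferroni-corrected as alpha / n_tests.
    """
    statistic = 2 * (lnl_alt - lnl_null)
    p_value = chi2_sf(max(statistic, 0.0), df)
    return statistic, p_value, p_value < alpha / n_tests


# Hypothetical log-likelihoods for a one-ratio model and a two-ratio model.
stat, p, significant = likelihood_ratio_test(-1000.0, -995.0, df=1, n_tests=5)
```

In practice the log-likelihoods would be read from the CODEML output for each pair of nested models, and `n_tests` would be set according to the correction described in the text.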

To further investigate the values among branches for theatp complex two sets of branch-site models were testedagainst the data In the first all Corallorhiza were set as theforeground branch and in the second nongreen Corallorhiza(Co striata vreelandii Co maculata var maculata and occi-dentalis and Co mertensiana) were set as foregroundbranches For both sets of branch models two alternativemodels were tested The ldquonullrdquo model specifies 0lt 1 for aproportion of sites and1 =2 = 1 for the remaining sites (nopositive selection) whereas the alternative model allows2 4 1 (positive selection on some sites) The alternativemodel (some sites under positive selection) was then testedagainst the null model (no sites under positive selection)using a 2 distribution as above with 1 degree of freedom(ie the alternative and null models differ by one parameter)

Comparison with Other Parasitic AngiospermLineages

Plastid gene content (losses, pseudogenes, putatively functional genes) was compared among diverse lineages of angiosperms containing heterotrophs, including newly sequenced Corallorhiza. Taxa and genes were sorted by the numbers of putatively functional genes and gene functional class, respectively, to illustrate the different stages of plastid genome degradation. Specifically, genes were arranged according to the model of Barrett and Davis (2012), in which degradation of plastid functional gene classes is hypothesized to have occurred roughly in the following order: 1) NAD(P)H dehydrogenase genes (ndh); 2) “photosynthesis”-related genes, including pet, psa, psb, cemA, ccsA, rbcL, ycf3, and ycf4; 3) plastid-encoded RNA polymerase genes (rpo); 4) ATP synthase genes (atp); and finally 5) “housekeeping” genes (rpl, rps, rrn, trn, matK, clpP, infA, accD, ycf1, and ycf2). Though not a formal test in the strict sense, the use of this model does allow an assessment of how advanced the sequenced accessions of Corallorhiza are when compared with other heterotrophic angiosperms; it is hypothesized that Corallorhiza occupies the early stages of this model. Technically, atp genes function in the generation of ATP and movement of protons across the thylakoid membrane, and thus play a crucial role in photosynthesis, but this functional class was kept separate from photosynthesis genes due to the possibility that ATP synthase serves roles outside of photosynthesis.
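The ordering of functional gene classes in this model can be expressed as a simple sort key. The sketch below is illustrative only: the function names are hypothetical, genes are assigned to classes by name prefix using the lists given in the text, and any gene not matched by an earlier class is assumed to be “housekeeping.”

```python
# Hypothesized order of degradation, following the five classes in the text.
CLASS_ORDER = ["ndh", "photosynthesis", "rpo", "atp", "housekeeping"]

PHOTOSYNTHESIS_PREFIXES = ("pet", "psa", "psb")
PHOTOSYNTHESIS_GENES = {"cemA", "ccsA", "rbcL", "ycf3", "ycf4"}


def functional_class(gene: str) -> str:
    """Assign a plastid gene name to one of the five functional classes."""
    if gene.startswith("ndh"):
        return "ndh"
    if gene.startswith(PHOTOSYNTHESIS_PREFIXES) or gene in PHOTOSYNTHESIS_GENES:
        return "photosynthesis"
    if gene.startswith("rpo"):
        return "rpo"
    if gene.startswith("atp"):
        return "atp"
    # Assumption: everything else (rpl, rps, rrn, trn, matK, ...) is housekeeping.
    return "housekeeping"


def sort_genes(genes: list[str]) -> list[str]:
    """Sort genes by functional class (model order), then alphabetically."""
    return sorted(genes, key=lambda g: (CLASS_ORDER.index(functional_class(g)), g))


def sort_taxa(functional_counts: dict[str, int]) -> list[str]:
    """Sort taxa from most to fewest putatively functional genes."""
    return sorted(functional_counts, key=lambda t: (-functional_counts[t], t))
```

Sorting both axes of a presence/absence table this way places the least degraded plastomes and the earliest-lost gene classes first, which is the layout the comparison relies on.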

Supplementary Material

Supplementary figures S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by the National Science Foundation (awards DEB-0830020 to Jerrold I. Davis and DBI 1110443 to J.C.P.), the California State University Program for Education and Research in Biotechnology (CSUPERB) to C.F.B., the California State University, Los Angeles Center for Effective Teaching and Learning Faculty-Student Mentorship award to C.F.B. and J.L., and the California State University Council on Ocean Affairs, Science, and Technology to C.S. The authors are grateful to Jerrold Davis, Patrick Edger, Ryan Ellingson, Gwynne Lim, Jeffrey Morawetz, and Chris Randle for feedback and laboratory/bioinformatics support; the staff at Cold Spring Harbor Laboratory (Woodbury, NY) for sample preparation and sequencing; Daniel Lima and Richard Sayre for assistance with chlorophyll spectroscopy and data interpretation; Gerardo Salazar (UNAM) for assistance in collecting Mexican specimens; and Maria Logacheva and three anonymous reviewers for critical insight that improved the paper.

References

Allen JF, de Paula WBM, Puthiyaveetil S, Nield J. 2011. A structural phylogenetic map for chloroplast photosynthesis. Trends Plant Sci. 16:645–655.

Barbrook AC, Howe CJ, Purton S. 2006. Why are plastid genomes retained in non-photosynthetic organisms? Trends Plant Sci. 11:101–108.

Barrett CF, Davis JI. 2012. The plastid genome of the mycoheterotrophic Corallorhiza striata (Orchidaceae) is in the relatively early stages of degradation. Am J Bot. 99:1513–1523.

Barrett CF, Davis JI, Leebens-Mack J, Conran JG, Stevenson DW. 2013. Plastid genomes and deep relationships among the commelinid monocot angiosperms. Cladistics 29:65–87.

Barrett CF, Freudenstein JV. 2008. Molecular evolution of rbcL in the mycoheterotrophic coralroot orchids (Corallorhiza Gagnebin, Orchidaceae). Mol Phylogenet Evol. 47:665–679.

Bellino A, Alfani A, Selosse M-A, Guerrieri R, Borghetti M, Baldantoni D. 2014. Nutritional regulation in mixotrophic plants: new insights from Limodorum abortivum. Oecologia 175:875–885.

Bernardi G. 1989. The isochore organization of the human genome. Annu Rev Genet. 23:637–661.

Bidartondo MI. 2005. The evolutionary ecology of myco-heterotrophy. New Phytol. 167:335–352.

Blazier J, Guisinger MM, Jansen RK. 2011. Recent loss of plastid-encoded ndh genes within Erodium (Geraniaceae). Plant Mol Biol. 76:263–272.

Braukmann T, Kuzmina M, Stefanovic S. 2013. Plastid genome evolution across the genus Cuscuta (Convolvulaceae): two clades within subgenus Grammica exhibit extensive gene loss. J Exp Bot. 64:977–989.

Braukmann T, Stefanovic S. 2012. Plastid genome evolution in mycoheterotrophic Ericaceae. Plant Mol Biol. 79:5–20.

Braukmann TWA, Kuzmina M, Stefanovic S. 2009. Loss of all plastid ndh genes in Gnetales and conifers: extent and evolutionary significance for the seed plant phylogeny. Curr Genet. 55:323–337.

Burnham KP, Anderson DR. 2002. Avoiding pitfalls when using information-theoretic methods. J Wildl Manage. 66:912–918.

Cameron DD, Preiss K, Gebauer G, Read DJ. 2009. The chlorophyll-containing orchid Corallorhiza trifida derives little carbon through photosynthesis. New Phytol. 183:358–364.

Chang CC, Lin HC, Lin IP, Chow TY, Chen HH, Chen WH, Cheng CH, Lin CY, Liu SM, Chang CC, et al. 2006. The chloroplast genome of Phalaenopsis aphrodite (Orchidaceae): comparative analysis of evolutionary rate with that of grasses and its phylogenetic implications. Mol Biol Evol. 23:279–291.

Conant GC, Wolfe KH. 2008. GenomeVx: simple web-based creation of editable circular chromosome maps. Bioinformatics 24:861–862.

Cummings MP, Welschmeyer NA. 1998. Pigment composition of putatively achlorophyllous angiosperms. Plant Syst Evol. 210:105–111.

Delannoy E, Fujii S, Colas des Francs-Small C, Brundrett M, Small I. 2011. Rampant gene loss in the underground orchid Rhizanthella gardneri highlights evolutionary constraints on plastid genomes. Mol Biol Evol. 28:2077–2086.

Doyle JJ, Doyle JL. 1987. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem Bull. 19:11–15.

Dunn OJ. 1959. Confidence intervals for the means of dependent, normally distributed variables. J Am Stat Assoc. 54:613–621.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Evron Y, Johnson EA, McCarty RE. 2000. Regulation of proton flow and ATP synthesis in chloroplasts. J Bioenerg Biomembr. 32:501–506.

Fang YJ, Wu H, Zhang TW, Yang M, Yin YX, Pan LL, Yu XG, Zhang XW, Hu SNA, Al-Mssallem IS, et al. 2012. A complete sequence and transcriptomic analyses of date palm (Phoenix dactylifera L.) mitochondrial genome. PLoS One 7:e37164.

Felsenstein J. 1985. Phylogenies and the comparative method. Am Nat. 125:1–15.

Freudenstein JV. 1992. Systematics of Corallorhiza and the Corallorhizinae (Orchidaceae) [PhD dissertation]. [Ithaca (NY)]: Cornell University.

Freudenstein JV. 1997. A monograph of Corallorhiza (Orchidaceae). Harv Pap Bot. 10:5–51.

Freudenstein JV, Barrett CF. 2010. Mycoheterotrophy and diversity in Orchidaceae. In: Seberg O, Petersen G, Barfod A, Davis JI, editors. Diversity, phylogeny, and evolution in the monocotyledons. The Proceedings of the Fourth International Conference on Monocot Systematics. Aarhus (Denmark): Aarhus University Press. p. 25–37.

Freudenstein JV, Doyle JJ. 1994a. Character transformation and relationships in Corallorhiza (Orchidaceae: Epidendroideae). 1. Plastid DNA. Am J Bot. 81:1449–1457.

Freudenstein JV, Doyle JJ. 1994b. Plastid DNA, morphological variation, and the phylogenetic species concept: the Corallorhiza maculata (Orchidaceae) complex. Syst Bot. 19:273–290.

Freudenstein JV, Senyo DM. 2008. Relationships and evolution of matK in a group of leafless orchids (Corallorhiza and Corallorhizinae; Orchidaceae: Epidendroideae). Am J Bot. 95:498–505.

Funk HT, Berg S, Krupinska K, Maier UG, Krause K. 2007. Complete DNA sequences of the plastid genomes of two parasitic flowering plant species, Cuscuta reflexa and Cuscuta gronovii. BMC Plant Biol. 7:45.

Galagan JE, Calvo SE, Borkovich KA, Selker EU, Read ND, Jaffe D, FitzHugh W, Ma LJ, Smirnov S, Purcell S, et al. 2003. The genome sequence of the filamentous fungus Neurospora crassa. Nature 422:859–868.

Garland T, Dickerman AW, Janis CM, Jones JA. 1993. Phylogenetic analysis of covariance by computer simulation. Syst Biol. 42:265–292.

Glemin S, Clement Y, David J, Ressayre A. 2014. GC content evolution in coding regions of angiosperm genomes: a unifying hypothesis. Trends Genet. 30:263–270.

Goloboff PA, Farris JS, Kallersjo M, Oxelman B, Ramirez MJ, Szumik CA. 2003. Improvements to resampling measures of group support. Cladistics 19:324–332.

Goloboff PA, Farris JS, Nixon KC. 2008. TNT, a free program for phylogenetic analysis. Cladistics 24:774–786.


Hajdukiewicz PTJ, Allison LA, Maliga P. 1997. The two RNA polymerases encoded by the nuclear and the plastid compartments transcribe distinct groups of genes in tobacco plastids. EMBO J. 16:4041–4048.

Harmon LJ, Weir JT, Brock CD, Glor RE, Challenger W. 2008. GEIGER: investigating evolutionary radiations. Bioinformatics 24:129–131.

Iles WJD, Smith SY, Graham SW. 2013. A well-supported phylogenetic framework for the monocot order Alismatales reveals multiple losses of the plastid NADH dehydrogenase complex and a strong long-branch effect. In: Wilkin P, Mayo S, editors. Early events in monocot evolution. Cambridge: Cambridge University Press. p. 1–28.

Iorizzo M, Grzebelus D, Senalik D, Szklarczyk M, Spooner D, Simon P. 2012. Against the traffic: the first evidence for mitochondrial DNA transfer into the plastid genome. Mob Genet Elements. 2:261.

Jensen PE, Bassi R, Boekema EJ, Dekker JP, Jansson S, Leister D, Robinson C, Scheller HV. 2007. Structure, function and regulation of plant photosystem I. Biochim Biophys Acta. 1767:335–352.

Katinka MD, Duprat S, Cornillot E, Metenier G, Thomarat F, Prensier G, Barbe V, Peyretaillade E, Brottier P, Wincker P, et al. 2001. Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature 414:450–453.

Katoh K, Misawa K, Kuma Ki, Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30:3059–3066.

Kawakami K, Umena Y, Iwai M, Kawabata Y, Ikeuchi M, Kamiya N, Shen J-R. 2011. Roles of PsbI and PsbM in photosystem II dimer formation and stability studied by deletion mutagenesis and X-ray crystallography. Biochim Biophys Acta. 1807:319–325.

Knaus BJ. 2010. Short read toolbox. Available from: http://brianknaus.com

Krause K. 2008. From chloroplasts to “cryptic” plastids: evolution of plastid genomes in parasitic plants. Curr Genet. 54:111–121.

Krause K. 2012. Plastid genomes of parasitic plants: a trail of reductions and losses. In: Bullerwell C, editor. Organelle genetics. Berlin (Germany): Springer. p. 79–103.

Kuijt J. 1969. The biology of flowering parasitic plants. Berkeley (CA): University of California Press.

Lawrence JG. 2005. Common themes in the genome strategies of pathogens. Curr Opin Genet Dev. 15:584–588.

Leake JR. 1994. The biology of myco-heterotrophic (Saprophytic) plants. New Phytol. 127:171–216.

Leake JR, Cameron DD. 2010. Physiological ecology of mycoheterotrophy. New Phytol. 185:601–605.

Liere K, Weihe A, Borner T. 2011. The transcription machineries of plant mitochondria and chloroplasts: composition, function and regulation. J Plant Physiol. 168:1345–1360.

Logacheva MD, Schelkunov MI, Nuraliev MS, Samigullin TH, Penin AA. 2014. The plastid genome of mycoheterotrophic monocot Petrosavia stellaris exhibits both gene losses and multiple rearrangements. Genome Biol Evol. 6:238–246.

Logacheva MD, Schelkunov MI, Penin AA. 2011. Sequencing and analysis of plastid genome in mycoheterotrophic orchid Neottia nidus-avis. Genome Biol Evol. 3:1296–1303.

Maddison WP, Maddison DR. 2011. Mesquite: a modular system for evolutionary analysis, version 2.75. Available from: http://mesquiteproject.org

Martín M, Funk HT, Serrot PH, Poltnigg P, Sabater B. 2009. Functional characterization of the thylakoid ndh complex phosphorylation by site-directed mutations in the ndhF gene. Biochim Biophys Acta. 1787:920–928.

Martín M, Sabater B. 2010. Plastid ndh genes in plant evolution. Plant Physiol Biochem. 48:636–645.

McCarty RE, Evron Y, Johnson EA. 2000. The chloroplast ATP synthase: a rotary enzyme. Annu Rev Plant Physiol Plant Mol Biol. 51:83–109.

McNeal JR, Arumugunathan K, Kuehl JV, Boore JL, dePamphilis CW. 2007. Systematics and plastid genome evolution of the cryptically photosynthetic parasitic plant genus Cuscuta (Convolvulaceae). BMC Biol. 5:55.

McNeal JR, Bennett JR, Wolfe AD, Mathews S. 2013. Phylogeny and origins of holoparasitism in Orobanchaceae. Am J Bot. 100:971–983.

Merckx V, Freudenstein JV. 2010. Evolution of mycoheterotrophy in plants: a phylogenetic perspective. New Phytol. 185:605–609.

Midford PE, Garland T Jr, Maddison WP. 2005. PDAP package of Mesquite, version 1.07. Available from: http://mesquiteproject.org

Molina J, Hazzouri KM, Nickrent D, Geisler M, Meyer RS, Pentony MM, Flowers JM, Pelser P, Barcelona J, Inovejas SA, et al. 2014. Possible loss of the chloroplast genome in the parasitic flowering plant Rafflesia lagascae (Rafflesiaceae). Mol Biol Evol. 31:793–803.

Montfort C, Küsters E. 1940. Saprophytismus und photosynthese. I. Biochemische physiologische studien an humus-orchideen. Botanisches Archiv 40:571–633.

Moran NA. 2002. Microbial minimalism: genome reduction in bacterial pathogens. Cell 108:583–586.

Morrison HG, McArthur AG, Gillin FD, Aley SB, Adam RD, Olsen GJ, Best AA, Cande WZ, Chen F, Cipriano MJ, et al. 2007. Genomic minimalism in the early diverging intestinal parasite Giardia lamblia. Science 317:1921–1926.

Nickrent DL, Yan OY, Duff RJ, dePamphilis CW. 1997. Do nonasterid holoparasitic flowering plants have plastid genomes? Plant Mol Biol. 34:717–729.

Patel RK, Jain M. 2012. NGS QC Toolkit: a toolkit for quality control of next generation sequencing data. PLoS One 7:e30619.

Peredo EL, King UM, Les DH. 2013. The plastid genome of Najas flexilis: adaptation to submersed environments is accompanied by the complete loss of the NDH complex in an aquatic angiosperm. PLoS One 8:e68591.

Porra RJ. 2002. The chequered history of the development and use of simultaneous equations for the accurate determination of chlorophylls a and b. Photosyn Res. 73:149–156.

Protasio AV, Tsai IJ, Babbage A, Nichol S, Hunt M, Aslett MA, De Silva N, Velarde GS, Anderson TJC, Clark RC, et al. 2012. A systematically improved high quality genome and transcriptome of the human blood fluke Schistosoma mansoni. PLoS Negl Trop Dis. 6:e1455.

Randle CP, Wolfe AD. 2005. The evolution and expression of rbcL in holoparasitic sister-genera Harveya and Hyobanche (Orobanchaceae). Am J Bot. 92:1575–1585.

Ranwez V, Harispe S, Delsuc F, Douzery EJP. 2011. MACSE: Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons. PLoS One 6:e22594.

Sanderson MJ. 1997. A nonparametric approach to estimating divergence times in the absence of rate constancy. Mol Biol Evol. 14:1218–1231.

Smith DR, Lee RW. 2014. A plastid without a genome: evidence from the nonphotosynthetic green algal genus Polytomella. Plant Physiol. 164:1812–1819.

Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690.

Straub SCK, Cronn RC, Edwards C, Fishbein M, Liston A. 2013. Horizontal transfer of DNA from the mitochondrial to the plastid genome and its subsequent evolution in milkweeds (Apocynaceae). Genome Biol Evol. 5:1872–1885.

Straub SCK, Fishbein M, Livshultz T, Foster Z, Parks M, Weitemier K, Cronn RC, Liston A. 2011. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing. BMC Genomics 12:211.

Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. 2013. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol. 30:2725–2729.

Tsai IJ, Zarowiecki M, Holroyd N, Garciarrubio A, Sanchez-Flores A, Brooks KL, Tracey A, Bobes RJ, Fragoso G, Sciutto E, et al. 2013. The genomes of four tapeworm species reveal adaptations to parasitism. Nature 496:57–63.

Umate P, Schwenkert S, Karbat I, Dal Bosco C, Mlcochova L, Volz S, Zer H, Herrmann RG, Ohad I, Meurer J. 2007. Deletion of psbM in tobacco alters the Q(B) site properties and the electron flow within photosystem II. J Biol Chem. 282:9758–9767.

Vaidya G, Lohman DJ, Meier R. 2011. SequenceMatrix: concatenation software for the fast assembly of multi-gene datasets with character set and codon information. Cladistics 27:171–180.

Westwood JH, Yoder JI, Timko MP, dePamphilis CW. 2010. The evolution of parasitism in plants. Trends Plant Sci. 15:227–235.

Wicke S, Müller KF, de Pamphilis CW, Quandt D, Wickett NJ, Zhang Y, Renner SS, Schneeweiss GM. 2013. Mechanisms of functional and physical genome reduction in photosynthetic and nonphotosynthetic parasitic plants of the broomrape family. Plant Cell 25:3711–3725.

Wicke S, Schäferhoff B, dePamphilis CW, Müller KF. 2014. Disproportional plastome-wide increase of substitution rates and relaxed purifying selection in genes of carnivorous Lentibulariaceae. Mol Biol Evol. 31:529–545.

Wicke S, Schneeweiss GM, dePamphilis CW, Müller KF, Quandt D. 2011. The evolution of the plastid chromosome in land plants: gene content, gene order, gene function. Plant Mol Biol. 76:273–297.

Wickett NJ, Honaas LA, Wafula EK, Das M, Huang K, Wu BA, Landherr L, Timko MP, Yoder J, Westwood JH, et al. 2011. Transcriptomes of the parasitic plant family Orobanchaceae reveal surprising conservation of chlorophyll synthesis. Curr Biol. 21:2098–2104.

Wimpee CF, Wrobel RL, Garvin DK. 1991. A divergent plastid genome in Conopholis americana, an achlorophyllous parasitic plant. Plant Mol Biol. 17:161–166.

Wolfe AD, dePamphilis CW. 1998. The effect of relaxed functional constraints on the photosynthetic gene rbcL in photosynthetic and nonphotosynthetic parasitic plants. Mol Biol Evol. 15:1243–1258.

Wolfe KH, Morden CW, Palmer JD. 1992. Function and evolution of a minimal plastid genome from a nonphotosynthetic parasitic plant. Proc Natl Acad Sci U S A. 89:10648–10652.

Wu FH, Chan MT, Liao DC, Hsu CT, Lee YW, Daniell H, Duvall MR, Lin CS. 2010. Complete chloroplast genome of Oncidium Gower Ramsey and evaluation of molecular markers for identification and breeding in Oncidiinae. BMC Plant Biol. 10:68.

Wyman SK, Jansen RK, Boore JL. 2004. Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20:3252–3255.

Yang JB, Tang M, Li HT, Zhang ZR, Li DZ. 2013. Complete chloroplast genome of the genus Cymbidium: lights into the species identification, phylogenetic implications and population genetic analyses. BMC Evol Biol. 13:84.

Yang ZH. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci. 13:555–556.

Yang ZH. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 24:1586–1591.

Yoder OC, Turgeon BG. 2001. Fungal genomics and pathogenicity. Curr Opin Plant Biol. 4:315–321.

Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18:821–829.

Zimmer K, Meyer C, Gebauer G. 2008. The ectomycorrhizal specialist orchid Corallorhiza trifida is a partial myco-heterotroph. New Phytol. 178:395–400.


plastomes. Recycling of respiratory CO2 has been offered as an explanation for the retention of putatively functional genes in the psa, psb, and pet complexes (McNeal et al. 2007), underscoring the importance of studying the physiology of strict heterotrophs on a taxon-by-taxon basis and also highlighting how little is known about the basic physiology and functional genomics of these plants.

Conclusion

All members of Corallorhiza produce detectable levels of chlorophyll based on the current sample, but nongreen, strictly heterotrophic members have, on average, tenfold lower concentrations relative to green, partially heterotrophic members. Genomic data allowed assembly of complete plastomes as well as partial rDNA operons, which, when combined, fully resolve relationships among the major species complexes in Corallorhiza. Based on complete plastomes, Corallorhiza can be said to span the “early” stages of plastid genome degradation as a result of relaxed selective pressures on photosynthetic function. Corallorhiza is useful in providing a model system to understand the earliest stages in the transition to a “minimal plastome,” as are observed in other heterotrophic plants (e.g., Orobanchaceae, other orchids like Rhizanthella). There have been at least two independent transitions to strict heterotrophy from partial heterotrophy in the genus (Co. maculata and Co. striata complexes). Nongreen coralroots also display evidence of relaxed selective pressures on photosynthesis-related genes that have retained open reading frames. Members of the atp complex display relaxed purifying selection among Corallorhiza as a whole but, more specifically, display evidence for positive selection in nongreen, strictly heterotrophic taxa, although for reasons currently unknown.

Generally, there is a clear path of plastid genome degradation in the heterotrophic angiosperm lineages studied thus far (with some deviations from the overall model, e.g., some Cuscuta). Moreover, it can be hypothesized that there is a stepwise progression to a “minimal” plastome, and Corallorhiza occupies categories 2 (partially heterotrophic taxa, which display degradation of the ndh complex only and a few other pseudogenes) and 3 (strictly heterotrophic taxa, including degradation of the ndh complex, photosynthesis-related genes, and rpo genes [excluding atp genes]) in a modified model of the degree of plastome degradation among angiosperms. Future studies will include 1) deeper sampling of complete plastomes within species complexes of Corallorhiza to gain a phylogeographic perspective on plastome degradation; 2) direct assessment of the relative photosynthetic capabilities of members of each coralroot taxon in the field; and 3) quantification of transcript levels for plastid-encoded genes within each species to gain a clearer understanding of gene expression and to test the hypothesis of the existence of duplicate, functional copies of pseudogenized or deleted plastid genes.

Materials and Methods

Plant Material, DNA Isolation, Library Preparation, and PE Sequencing

Plant material was collected and preserved either in silica gel or frozen at −80 °C. Total genomic DNAs were extracted using the cetyltrimethylammonium bromide (CTAB) method (Doyle JJ and Doyle JL 1987) from 1.0 g frozen or 0.2–0.5 g silica-dried material and quantified using a NanoDrop 2000 spectrophotometer (Thermo Scientific, Waltham, MA). Samples with total DNA concentrations > 100 ng/µl were selected and run on a 1.5% agarose gel to check for high molecular weight, nondegraded DNA. Samples were shipped on ice to Cold Spring Harbor Laboratory (CSHL, Woodbury, NY), where they were sheared to approximately 450 bp, followed by library preparation and barcoding for PE Illumina sequencing. The exception was Co. odontorhiza, for which library preparation and sequencing were completed at the University of Missouri–Columbia using the Illumina TruSeq Library Preparation Kit (Illumina Inc., San Diego, CA). Co. odontorhiza DNAs were sheared to approximately 320 bp. All samples were run on Illumina HiSeq 2000 machines to generate 100-bp PE reads. Indexed samples were pooled and run with accessions from other projects, with a total of 40 or 20 accessions in a single lane on the CSHL and University of Missouri machines, respectively. Voucher information and characteristics of each of the resulting data sets and completed plastomes are listed in table 1.

Quantification of Chlorophyll Content

Approximately 0.1 g of −80 °C frozen tissue was harvested from the outer wall of the developing ovary and weighed for chlorophyll quantification. Ovary tissue was chosen 1) because this is the only organ that is visibly green in some taxa (e.g., Co. odontorhiza, Co. wisteriana) and 2) to be consistent between taxa. Chlorophylls were extracted in 80% ice-cold acetone stored at −80 °C, and chlorophylls a and b were quantified on a Cary 4000 UV-VIS spectrophotometer (Agilent Technologies Inc., Santa Clara, CA; Department of Plant Cellular & Molecular Biology, Ohio State University), and their concentrations were calculated using formulas of Porra (2002) (Lima D, personal communication). Data from chlorophylls a and b were pooled to give total chlorophylls. Although individual accessions used to quantify chlorophyll concentration among species were not the same as those used for sequencing, multiple individuals from across each species’ geographic range were included to account for variation in chlorophyll content among individuals within species. Corallorhiza bentleyi (an endangered member of the Co. striata complex) is included here but was not sequenced. Conversely, Mexican-endemic members of the Co. maculata complex (Co. macrantha, Co. bulbosa, Co. maculata var. mexicana) were not included in the chlorophyll analysis due to rarity and thus lack of fresh/frozen material. Voucher specimens for plants used for chlorophyll analysis are listed in supplementary figure S5, Supplementary Material online.
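The calculation of chlorophyll concentrations from absorbance readings can be sketched as follows. The text does not state which of the equations reviewed by Porra (2002) were applied, so this example assumes the simultaneous equations for buffered 80% aqueous acetone (absorbances at 663.6 and 646.6 nm, results in µg/ml); the coefficients, function names, and the per-gram conversion are assumptions for illustration, not the authors' worksheet.

```python
def chlorophylls_80pct_acetone(a663_6: float, a646_6: float) -> dict[str, float]:
    """Chlorophyll a, b, and total (µg per ml extract) from absorbances.

    Assumption: simultaneous equations for buffered 80% aqueous acetone,
    with absorbances already corrected for turbidity (e.g., A750 subtracted).
    """
    chl_a = 12.25 * a663_6 - 2.55 * a646_6
    chl_b = 20.31 * a646_6 - 4.91 * a663_6
    # Chlorophylls a and b are pooled to give total chlorophyll.
    return {"a": chl_a, "b": chl_b, "total": chl_a + chl_b}


def per_gram_fresh_weight(conc_ug_per_ml: float, extract_ml: float, tissue_g: float) -> float:
    """Convert an extract concentration to µg chlorophyll per g tissue."""
    return conc_ug_per_ml * extract_ml / tissue_g
```

With hypothetical readings of 1.0 (663.6 nm) and 0.5 (646.6 nm), the total is 16.22 µg/ml; for 0.1 g of tissue in 1 ml of extract, this corresponds to 162.2 µg per g.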


Data Processing

The resulting paired FASTQ data were processed using a custom UNIX script (“process_fastq.sh”) that calls several freely available Perl scripts for the purposes of 1) quality filtering/trimming and adaptor removal (“IlluQC_PRLL.pl,” “TrimmingReads.pl”; Patel and Jain 2012); 2) “shuffling” PE files into a single file (“ShuffleSequences_fastq.pl,” part of the VELVET suite of tools [E. Cabot, University of Wisconsin-Madison; Zerbino and Birney 2008]); and 3) conversion to FASTA format (“fastq2fasta.pl”; B. Knaus 2009 [http://brianknaus.com/software/srtoolbox, last accessed May 1, 2014]). Reads were trimmed from their 3′-ends, discarding bases under a quality threshold of PHRED score of 20 (corresponding to an error rate of 1/100 or greater), and also discarding trimmed reads below 61 bp in length.
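The trimming rule can be illustrated with a short sketch. It is not the code of TrimmingReads.pl, whose exact behavior is not described here; the function names are hypothetical, and the sketch assumes PHRED+33 quality encoding and that trimming stops at the first base from the 3′-end that meets the threshold.

```python
def phred_to_error(q: int) -> float:
    """Error probability implied by a PHRED score: 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def trim_3prime(seq: str, qual: str, min_q: int = 20, min_len: int = 61,
                offset: int = 33):
    """Trim low-quality bases from the 3' end of one read.

    Returns (sequence, quality) after trimming, or None when the trimmed
    read is shorter than min_len. Assumes PHRED+33 encoded qualities.
    """
    end = len(seq)
    # Walk back from the 3' end while the base quality is under the threshold.
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    if end < min_len:
        return None
    return seq[:end], qual[:end]
```

A 70-bp read whose last five bases have quality 2 is trimmed to 65 bp and kept, whereas the same read with ten low-quality bases is trimmed to 60 bp and discarded.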

Read Mapping

PE reads were then mapped five times iteratively in Geneious v6 (created by Biomatters Inc., Auckland, New Zealand; http://www.geneious.com, last accessed December 1, 2013), using automatically determined settings, to a combined file of 1) the Phalaenopsis aphrodite plastome (GenBank accession number NC_007499; Chang et al. 2006); 2) the Phoenix dactylifera mitochondrial genome (NC_016740; Fang et al. 2012); and 3) the Asclepias syriaca nuclear ribosomal DNA consensus operon (JF312046; Straub et al. 2011). The purpose of read mapping was to filter or “screen” reads that matched the reference genomes and to get initial, crude estimates of coverage to be used in downstream de novo assemblies.

Plastome Assembly

Mean coverage estimates from Geneious were used as input for de novo plastome assembly (“-exp_cov” parameter). De novo assembly was completed using both the VELVET v1.2.10 assembler (Zerbino and Birney 2008) and the Geneious v6 de novo assembler. VELVET assemblies were generated using a variety of hash lengths (i.e., “word match” or “kmer length”) ranging from 51 to 81 for each complete data set, with specified expected coverage estimates taken from Geneious mapping (specified to be in approximate kmer coverage; see Zerbino and Birney 2008). Coverage cutoff values were input as roughly one-half of the specified expected coverage estimates, minimum contig length was set to 300 bp, and insert size was set to 450 bp (320 bp for Co. odontorhiza), and the resulting contigs were assembled in Sequencher v6 (GeneCodes, Ann Arbor, MI). The resulting “draft de novo assembly” from VELVET was used as a reference to map reads in Geneious as described above (five iterations, automatically determined settings). Read mappings were visually scanned in Geneious to check for areas of low coverage or potential misassemblies. Read pairs that mapped successfully to the draft reference were used in a second round of de novo assembly, this time in Geneious with automatically determined settings, saving 100 consensus contigs. This method allowed construction of complete draft plastomes with relative ease.
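The conversion from nucleotide coverage to approximate k-mer coverage, and the one-half rule for the coverage cutoff, can be sketched as below. The relation Ck = C × (L − k + 1) / L is the one given in the Velvet documentation; the function names and the example values are hypothetical.

```python
def kmer_coverage(nucleotide_coverage: float, read_length: int, k: int) -> float:
    """Approximate k-mer coverage: Ck = C * (L - k + 1) / L."""
    if k > read_length:
        raise ValueError("k-mer length cannot exceed read length")
    return nucleotide_coverage * (read_length - k + 1) / read_length


def velvet_coverage_settings(nucleotide_coverage: float, read_length: int, k: int) -> dict[str, float]:
    """Expected coverage and a cutoff of roughly one-half of it."""
    exp_cov = kmer_coverage(nucleotide_coverage, read_length, k)
    return {"exp_cov": exp_cov, "cov_cutoff": exp_cov / 2.0}
```

For 100-bp reads at a mean mapped coverage of 100×, a hash length of 51 gives an expected k-mer coverage of 50 and a cutoff of 25, whereas a hash length of 81 gives 20 and 10; the expected coverage therefore has to be recomputed for each hash length tried.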

Contigs from both assemblies (Geneious, Velvet) were merged in Sequencher to form complete plastomes (parameters = “dirty data” or “large gap”; minimum overlap = 20 bp; minimum similarity 70–90%). Any apparent “gaps” among contigs resulting from one software platform were usually covered by a contig from the other, allowing assembly of completed plastomes (Large Single Copy–Inverted Repeat B–Small Single Copy). In a few cases, VELVET scaffolds contained strings of Ns between 30 and 100 bp, mostly associated with homopolymer repeats (> 10 bp). These were resolved by 1) the Geneious de novo assembly (typically) and 2) UNIX “grep” searches, with flanking sequences of 30–50 bp taken from each end of the string of Ns used as search terms. Grep results were converted to FASTA format using a custom Perl script to build a “bridge” of reads. This was done iteratively until the region in question was resolved, although most cases were resolved with a single iteration. Grep searches were applied to all cases of ambiguity between contigs from the two de novo assembly methods as a quality control. Polymerase chain reaction amplification and Sanger sequencing were not necessary due to high read coverage (table 1).

IR boundaries for each draft plastome were determined by using the first and last approximately 20 bp of the draft plastome as search terms (reverse complement) in Sequencher to find matches and marking their match points as features. Then, the first approximately 700 bp and last approximately 1–2,000 bp of the draft plastome were copied, reverse complemented, and manually aligned to the IR using the marked features from above as reference points. IR boundaries were determined informatically using PE information (i.e., read pairs that spanned the IR boundaries). Because the orientation of the ends of the LSC and SSC can vary, the plastomes were often in different configurations; single copy regions were reoriented to correspond to the “standard” model of gene order to simplify downstream comparison. Special attention was paid to the possibility of significant modification or even loss of the IR (either partial or complete), and also to the possibility of other genomic rearrangements or structural modifications. In other words, when building draft plastomes, it was not assumed that all accessions displayed the “standard” plastome structure found in angiosperms (i.e., LSC, IRb, SSC, IRa).
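The reverse-complement search used to locate candidate IR match points can be sketched as follows. This is a toy illustration of the idea, not the Sequencher procedure: the function names are hypothetical, and only exact matches are reported.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (A/C/G/T/N, either case)."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_revcomp_matches(genome: str, probe: str) -> list[int]:
    """0-based start positions where the reverse complement of probe occurs."""
    target = reverse_complement(probe)
    positions = []
    start = genome.find(target)
    while start != -1:
        positions.append(start)
        start = genome.find(target, start + 1)
    return positions
```

In practice the probe would be the first or last ~20 bp of a draft plastome; a match elsewhere in the draft marks a candidate point at which the second IR copy begins, which would then be checked against read pairs spanning the boundary.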

Plastome Annotation

Plastid genomes were annotated by first aligning the set of coding loci (coding sequences [CDS]) containing introns and all tRNAs from Phalaenopsis to each draft plastome (minus IRa) in Sequencher to aid in intron/exon and tRNA boundary determination. Annotations were conducted in DOGMA (Wyman et al. 2004), which uses BLASTX searches against a database of annotated plastomes and amino acid translations and allows manual determination of start/stop codons. DOGMA BLASTX parameters were as follows: genetic code = plant plastid, e value = 5, percent cutoff for protein coding genes = 55, and percent cutoff for tRNAs = 80. The resulting GenBank feature table was exported from the DOGMA server and imported along with the finished plastid FASTA file into SEQUIN (National Center for Biotechnology Information), where the annotation was validated and completed for submission to GenBank. A large inversion in Co. maculata var. maculata was manually reversed for the purposes of phylogenetic analyses and other whole-plastome comparisons, but the plastome for that accession was annotated with the inversion as assembled de novo. Plastome maps were drawn with GenomeVx (Conant and Wolfe 2008) and manually edited in Adobe Illustrator CS6 (Adobe Inc., San Jose, CA).

Ribosomal DNA Assembly and Annotation

Nuclear ribosomal DNA reads were filtered by mapping reads from Co. odontorhiza to a nearly complete rDNA operon reference (partial ETS [External Transcribed Spacer], 18S rDNA, ITS1 [Internal Transcribed Spacer], 5.8S rDNA, ITS2, 26S rDNA) from A. syriaca, described above, as well as mapping to an ITS-spanning sequence from Co. odontorhiza from GenBank (accession number EU391326). Other regions were not assembled (e.g., 5S rDNA). Resulting reads that mapped to each reference were assembled de novo in Geneious, and resulting contigs were merged through overlap in Sequencher. This draft assembly was used for a second round of read mapping (original Co. odontorhiza reads) and de novo assembly in Geneious, resulting in a single consensus contig. The consensus contig was based on 50% majority per site, bearing in mind that the rDNA operon exists in numerous repeated copies that often differ from one another, appearing as polymorphism in read pileups. The Co. odontorhiza rDNA consensus operon was then used as a reference in read mapping for all other Corallorhiza species, followed by individual de novo assemblies as above.
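A per-site majority consensus of the kind described above can be sketched as follows. This is an illustration rather than the Geneious implementation: the function name is hypothetical, reads are assumed to be already aligned and of equal length with "-" marking gaps, and a site is called only when one base is present in strictly more than 50% of the non-gap reads (otherwise "N" is written).

```python
from collections import Counter


def majority_consensus(aligned_reads: list[str], threshold: float = 0.5,
                       ambiguous: str = "N") -> str:
    """Per-site majority-rule consensus of equal-length aligned reads."""
    consensus = []
    for column in zip(*aligned_reads):
        bases = [b for b in column if b != "-"]
        if not bases:
            # No read covers this site.
            consensus.append("-")
            continue
        base, count = Counter(bases).most_common(1)[0]
        # Call the base only when it exceeds the majority threshold.
        consensus.append(base if count / len(bases) > threshold else ambiguous)
    return "".join(consensus)
```

For the four toy reads ACGT, ACGA, ACTA, and AGTA, the consensus is ACNA: the third site is split evenly between G and T, which is how copy-to-copy variation in the rDNA array would show up in a read pileup.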

Nearly complete ribosomal DNA operons (partial ETS, 18S rDNA, ITS1, 5.8S, ITS2, 26S rDNA, excluding 5S rDNA) were annotated by first aligning finished contigs in Geneious (using the MUSCLE v3.8.31 plugin in Geneious; Edgar 2004) and transferring the annotation features from Co. odontorhiza to the aligned Corallorhiza sequences. Then, the entire alignment was annotated in SEQUIN for submission to GenBank by adding each feature to Co. odontorhiza and subsequently propagating the features to the remaining sequences. Reads were again mapped in Geneious to each finished, annotated rDNA cistron to check for misassemblies. GenBank accession numbers for rDNA partial operons are KM390003-KM390012. Available outgroup sequences for Cymbidium, Oncidium, and Phalaenopsis were mined from GenBank for the purposes of downstream phylogenetic analyses (18S rDNA: Cy. farberi, GenBank accession number JN418934; ITS: Cy. aloifolium, JF729014; Phalaenopsis aphrodite, AY391543; O. andradeanum, FJ565598).

Data Matrix Assembly and Sequence Alignment

Annotated plastomes were imported into Geneious, with subsequent removal of the second copy of the IR (IRa), and aligned with MAFFT v7.157 (Katoh et al. 2002) using the “fast-progressive” method (gap opening penalty = 6, offset = 0.1), followed by manual adjustment. The alignment also included complete plastomes of Cy. aloifolium (GenBank accession number NC_021429), Oncidium Gower Ramsey (GenBank accession number NC_014056), and Phalaenopsis aphrodite (GenBank accession number NC_017049). A short inversion of the petN-psbM region in Cymbidium was removed for that accession in downstream analyses. Annotated CDS were exported from Geneious, and data from each locus were saved as separate FASTA files. Each file was submitted for codon-based alignment to the MACSE web server (Ranwez et al. 2011; http://mbb.univ-montp2.fr/MBB/subsection/softExec.php?soft=macse, last accessed May 30, 2014), which allows alignment of both intact open reading frames and incomplete sequences (including pseudogenes with modified reading frames). Alignments were adjusted manually when necessary and imported back into Geneious or SequenceMatrix (Vaidya et al. 2011), where they were concatenated into larger matrices. Protein-coding genes that contained pseudogenes (shifted reading frames, internal stop codons) in at least one taxon were excluded. Data matrices are deposited in supplementary fig. S6, Supplementary Material online.

Phylogenetic Analyses

Phylogenetic analyses were conducted using both Parsimony and ML. Parsimony searches were conducted in TNT (Goloboff et al. 2003, 2008) with 100 random addition starting sequences, saving 100 trees per replicate, and using tree bisection–reconnection branch swapping on the pool of saved trees. Support was assessed through 2,000 Jackknife pseudoreplicates with the above search parameters. Model fit for ML analyses was assessed for the complete plastome alignment in MEGA v6 (Tamura et al. 2013), which yielded GTR+G as the best-fit model based on the corrected Akaike Information Criterion (AICc) metric (Burnham and Anderson 2002). All ML analyses were conducted in RAxML v8.0.20 (Stamatakis 2006). Three separate searches were initiated for each matrix analyzed, from different random starting seeds, using an unpartitioned GTR+G model to ensure convergence on the same topology. A codon-partitioned model was not chosen because relaxed functional constraints among some coding regions for some taxa could represent model violations. Support was assessed with 1,000 standard bootstrap pseudoreplicates. Cymbidium was chosen as the outgroup taxon.

Plastid Pseudogenes, Genomic Deletions, and Genomic Attributes

The presence of pseudogenes and whole gene/regional deletions were initially assessed visually in DOGMA by identifying interrupted BLASTX reading frames. All coding loci aligned in MACSE (see above) were imported into Geneious, and sequences were translated to assess pseudogene status through the presence of frame shifts and/or premature stop codons. Loci with interrupted reading frames were considered to be pseudogenes unless there was an alternative start or stop codon within three codon positions of the other taxa in the alignment. In addition, the alignment of whole annotated plastomes (described above) was used to visualize any larger genomic deletions spanning CDS boundaries, or rearrangements.
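The translation-based screen for interrupted reading frames can be sketched as follows. This is an illustrative Python example, not the Geneious or DOGMA procedure: the function name and flag labels are hypothetical, and the exception described above for alternative start or stop codons within three codon positions is deliberately not implemented.

```python
from itertools import product

# Standard codon table in NCBI order (T, C, A, G at each position);
# the plant plastid code (table 11) assigns the same amino acids.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def screen_cds(seq):
    """Return a list of flags suggesting that a CDS may be a pseudogene.

    Alignment gaps ('-') are removed first; codons containing
    non-ACGT symbols are translated as 'X'.
    """
    seq = seq.upper().replace("-", "")
    flags = []
    if len(seq) % 3 != 0:
        flags.append("length_not_multiple_of_3")
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    protein = "".join(CODON_TABLE.get(c, "X") for c in codons)
    # A stop codon anywhere before the final codon is treated as premature.
    if "*" in protein[:-1]:
        flags.append("internal_stop")
    return flags
```

An empty list means that no interruption was detected by these two simple criteria; any flagged locus would still need the manual inspection described in the text.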


Barrett et al doi101093molbevmsu252 MBE by guest on July 7 2016


Hypotheses of differences in plastid genomic attributes among nongreen versus green taxa, and correlations between chlorophyll content and genomic attributes, were assessed using two phylogeny-based analytical approaches. First, a Phylogenetic Analysis of Variance (Garland et al. 1993) was used to test for differences in continuous plastome variables (e.g., length, GC content) among green versus nongreen taxa (i.e., a discrete grouping variable), taking phylogenetic relationships into account. Analyses were run in the R package “geiger” (Harmon et al. 2008) with 10,000 simulations each using the “aov.phylo” function, based on an ultrametric tree generated through Nonparametric Rate Smoothing (Sanderson 1997). These analyses were run for a complete data set of 13 taxa (three leafy, green outgroup taxa and ten species of Corallorhiza). A second set of analyses was conducted on a subset of six taxa of Corallorhiza for which chlorophyll data were available, to mitigate problems with missing chlorophyll data. In these analyses, continuous genomic variables (length, number of putatively functional genes, GC content) were correlated with chlorophyll content (continuous) using Phylogenetically Independent Contrasts to account for phylogenetic relatedness among taxa (Felsenstein 1985). The latter analyses were conducted under a Brownian motion model with the PDAP package (Midford et al. 2005) in Mesquite v2.75 (Maddison WP and Maddison DR 2011).

Model-Based Analyses of Selective Regime

Changes in selective regime among species of Corallorhiza for codon-based matrices were assessed with the CODEML module in PAML v4.8 (Yang 1997, 2007) using “branch” models. Only loci for which all taxa had open reading frames were included in the analyses. Various matrices are listed in table 3. The objective was to test hypotheses of different $\omega$ ratios for 1) Corallorhiza versus leafy outgroups and 2) nongreen Corallorhiza versus green Corallorhiza + leafy outgroups. To accomplish this, different branch models were fit to the data using the topology resulting from the ML analysis of whole plastomes + rDNA. Six different branch models were tested. First, a model in which all branches evolve under one ratio (Model M0) was tested, where $\omega = d_N/d_S$, the ratio of nonsynonymous substitutions per nonsynonymous site to synonymous substitutions per synonymous site. A two-ratio model (M1) was then run, in which all branches of Corallorhiza were allowed to have a different ratio ($\omega_{\text{cor}}$) than the remaining leafy outgroup sequences ($\omega_{\text{leafy}}$). A three-ratio model was then run (M2), allowing separate values for $\omega_{\text{leafy}}$, $\omega_{\text{cor}}$, and nongreen Corallorhiza ($\omega_{\text{str}} = \omega_{\text{mac-ng}}$; Co. striata, Co. mertensiana, Co. maculata var. maculata and occidentalis, respectively), followed by a four-ratio model (M3) with additionally separate ratios for Co. striata ($\omega_{\text{str}}$) and the nongreen members of the Co. maculata complex ($\omega_{\text{mac-ng}}$). The latter three models (M1–M3) specifically test hypotheses of whether Corallorhiza, and particularly nongreen Corallorhiza, display departures from purifying selection. Two additional models were run to test whether the nongreen Corallorhiza differ significantly from green Corallorhiza and leafy outgroup taxa. Specifically, Model M4 (two ratios) sets $\omega_{\text{leafy}} = \omega_{\text{cor}}$ and $\omega_{\text{str}} = \omega_{\text{mac-ng}}$, whereas M5 (three ratios) allows $\omega_{\text{str}}$ and $\omega_{\text{mac-ng}}$ to differ. Successively nested models were compared based on log-likelihood ratio tests (M1 vs. M0, M2 vs. M1, M3 vs. M2, M4 vs. M0, M5 vs. M4), where the likelihood ratio test statistic was calculated as two times the difference in log-likelihood between the two models, with degrees of freedom calculated as the difference in the number of free parameters for the models, and tested against the $\chi^2$ distribution. P values for multiple comparisons to the same data set were Bonferroni-corrected (Dunn 1959), as suggested by Yang (2007), with the corrected significance level corresponding to $\alpha/m$, where m is the number of branches being tested. Genes ycf1 and ycf2 were analyzed separately because they have extraordinarily high substitution rates (e.g., Barrett et al. 2013), as was matK, based on the finding of a pseudogenized copy in a previous study (Freudenstein and Senyo 2008).
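The likelihood ratio test and Bonferroni correction described above can be illustrated with a short Python sketch. It is not part of PAML; the function names are hypothetical, the log-likelihood values in the usage note are invented, and the chi-square tail probability is implemented only for 1 or 2 degrees of freedom, where simple closed forms exist.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of the chi-square distribution.

    Closed forms are used, so only df = 1 or df = 2 is supported here.
    """
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch supports df = 1 or 2 only")


def likelihood_ratio_test(lnl_null, lnl_alt, df, alpha=0.05, m=1):
    """Compare two nested models.

    Returns (statistic, p_value, significant), where the statistic is
    2 * (lnL_alt - lnL_null) and significance is judged against the
    Bonferroni-corrected level alpha / m.
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    p_value = chi2_sf(stat, df)
    return stat, p_value, p_value < alpha / m
```

For example, with invented log-likelihoods of -1000.0 (null) and -997.0 (alternative) and one extra free parameter, the statistic is 6.0 and the P value is about 0.014; this would pass an uncorrected 0.05 threshold but not a corrected threshold of 0.05/5.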

To further investigate the $\omega$ values among branches for the atp complex, two sets of branch-site models were tested against the data. In the first, all Corallorhiza were set as the foreground branch, and in the second, nongreen Corallorhiza (Co. striata var. vreelandii, Co. maculata var. maculata and occidentalis, and Co. mertensiana) were set as foreground branches. For both sets of branch-site models, two alternative models were tested. The “null” model specifies $\omega_0 < 1$ for a proportion of sites and $\omega_1 = \omega_2 = 1$ for the remaining sites (no positive selection), whereas the alternative model allows $\omega_2 > 1$ (positive selection on some sites). The alternative model (some sites under positive selection) was then tested against the null model (no sites under positive selection) using a $\chi^2$ distribution as above, with 1 degree of freedom (i.e., the alternative and null models differ by one parameter).

Comparison with Other Parasitic Angiosperm Lineages

Plastid gene content (losses, pseudogenes, putatively functional genes) was compared among diverse lineages of angiosperms containing heterotrophs, including newly sequenced Corallorhiza. Taxa and genes were sorted by the numbers of putatively functional genes and gene functional class, respectively, to illustrate the different stages of plastid genome degradation. Specifically, genes were arranged according to the model of Barrett and Davis (2012), in which degradation of plastid functional gene classes is hypothesized to have occurred roughly in the following order: 1) NAD(P)H dehydrogenase genes (ndh); 2) “photosynthesis”-related genes, including pet, psa, psb, cemA, ccsA, rbcL, ycf3, and ycf4; 3) plastid-encoded RNA polymerase genes (rpo); 4) ATP synthase genes (atp); and finally 5) “housekeeping” genes (rpl, rps, rrn, trn, matK, clpP, infA, accD, ycf1, and ycf2). Though not a formal test in the strict sense, the use of this model does allow an assessment of how advanced the sequenced accessions of Corallorhiza are when compared with other heterotrophic angiosperms; it is hypothesized that Corallorhiza occupies the early stages of this model. Technically, atp genes function in the generation of ATP and movement of protons across the thylakoid membrane, and thus play a crucial role in photosynthesis, but this functional class was kept separate from photosynthesis genes due to the possibility that ATP synthase serves roles outside of photosynthesis.
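The sorting of taxa and genes described above can be sketched in Python. The functional classes and their order follow the model as stated in the text; everything else (function names, the status labels, and the toy input in the usage note) is hypothetical and does not represent data from this study.

```python
# Order of functional classes in the degradation model described above.
CLASS_ORDER = ["ndh", "photosynthesis", "rpo", "atp", "housekeeping"]
PHOTO_PREFIXES = ("pet", "psa", "psb")
PHOTO_GENES = {"cemA", "ccsA", "rbcL", "ycf3", "ycf4"}


def gene_class(gene):
    """Assign a plastid gene name to one of the five functional classes."""
    if gene.startswith("ndh"):
        return "ndh"
    if gene.startswith(PHOTO_PREFIXES) or gene in PHOTO_GENES:
        return "photosynthesis"
    if gene.startswith("rpo"):
        return "rpo"
    if gene.startswith("atp"):
        return "atp"
    return "housekeeping"  # rpl, rps, rrn, trn, matK, clpP, infA, accD, ycf1, ycf2


def order_matrix(status):
    """Order a gene-content table for display.

    `status` maps taxon -> {gene: 'functional' | 'pseudogene' | 'lost'}.
    Taxa are sorted by decreasing number of functional genes; genes are
    sorted by functional class, then alphabetically within a class.
    """
    genes = sorted(
        {g for per_taxon in status.values() for g in per_taxon},
        key=lambda g: (CLASS_ORDER.index(gene_class(g)), g),
    )
    taxa = sorted(
        status,
        key=lambda t: -sum(v == "functional" for v in status[t].values()),
    )
    return taxa, genes
```

Called on a toy table with two invented taxa, one of which carries an ndhF pseudogene, the function places the fully functional taxon first and lists ndhF before psbA, rpoB, atpA, and rps2.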

Supplementary Material

Supplementary figures S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by the National Science Foundation (awards DEB-0830020 to Jerrold I. Davis and DBI 1110443 to J.C.P.), the California State University Program for Education and Research in Biotechnology (CSUPERB) to C.F.B., the California State University, Los Angeles Center for Effective Teaching and Learning Faculty-Student Mentorship award to C.F.B. and J.L., and the California State University Council on Ocean Affairs, Science, and Technology to C.S. The authors are grateful to Jerrold Davis, Patrick Edger, Ryan Ellingson, Gwynne Lim, Jeffrey Morawetz, and Chris Randle for feedback and laboratory/bioinformatics support; the staff at Cold Spring Harbor Laboratory (Woodbury, NY) for sample preparation and sequencing; Daniel Lima and Richard Sayre for assistance with chlorophyll spectroscopy and data interpretation; Gerardo Salazar (UNAM) for assistance in collecting Mexican specimens; and Maria Logacheva and three anonymous reviewers for critical insight that improved the paper.

References

Allen JF, de Paula WBM, Puthiyaveetil S, Nield J. 2011. A structural phylogenetic map for chloroplast photosynthesis. Trends Plant Sci. 16:645–655.

Barbrook AC, Howe CJ, Purton S. 2006. Why are plastid genomes retained in non-photosynthetic organisms? Trends Plant Sci. 11:101–108.

Barrett CF, Davis JI. 2012. The plastid genome of the mycoheterotrophic Corallorhiza striata (Orchidaceae) is in the relatively early stages of degradation. Am J Bot. 99:1513–1523.

Barrett CF, Davis JI, Leebens-Mack J, Conran JG, Stevenson DW. 2013. Plastid genomes and deep relationships among the commelinid monocot angiosperms. Cladistics 29:65–87.

Barrett CF, Freudenstein JV. 2008. Molecular evolution of rbcL in the mycoheterotrophic coralroot orchids (Corallorhiza Gagnebin, Orchidaceae). Mol Phylogenet Evol. 47:665–679.

Bellino A, Alfani A, Selosse M-A, Guerrieri R, Borghetti M, Baldantoni D. 2014. Nutritional regulation in mixotrophic plants: new insights from Limodorum abortivum. Oecologia 175:875–885.

Bernardi G. 1989. The isochore organization of the human genome. Annu Rev Genet. 23:637–661.

Bidartondo MI. 2005. The evolutionary ecology of myco-heterotrophy. New Phytol. 167:335–352.

Blazier J, Guisinger MM, Jansen RK. 2011. Recent loss of plastid-encoded ndh genes within Erodium (Geraniaceae). Plant Mol Biol. 76:263–272.

Braukmann T, Kuzmina M, Stefanovic S. 2013. Plastid genome evolution across the genus Cuscuta (Convolvulaceae): two clades within subgenus Grammica exhibit extensive gene loss. J Exp Bot. 64:977–989.

Braukmann T, Stefanovic S. 2012. Plastid genome evolution in mycoheterotrophic Ericaceae. Plant Mol Biol. 79:5–20.

Braukmann TWA, Kuzmina M, Stefanovic S. 2009. Loss of all plastid ndh genes in Gnetales and conifers: extent and evolutionary significance for the seed plant phylogeny. Curr Genet. 55:323–337.

Burnham KP, Anderson DR. 2002. Avoiding pitfalls when using information-theoretic methods. J Wildl Manage. 66:912–918.

Cameron DD, Preiss K, Gebauer G, Read DJ. 2009. The chlorophyll-containing orchid Corallorhiza trifida derives little carbon through photosynthesis. New Phytol. 183:358–364.

Chang CC, Lin HC, Lin IP, Chow TY, Chen HH, Chen WH, Cheng CH, Lin CY, Liu SM, Chang CC, et al. 2006. The chloroplast genome of Phalaenopsis aphrodite (Orchidaceae): comparative analysis of evolutionary rate with that of grasses and its phylogenetic implications. Mol Biol Evol. 23:279–291.

Conant GC, Wolfe KH. 2008. GenomeVx: simple web-based creation of editable circular chromosome maps. Bioinformatics 24:861–862.

Cummings MP, Welschmeyer NA. 1998. Pigment composition of putatively achlorophyllous angiosperms. Plant Syst Evol. 210:105–111.

Delannoy E, Fujii S, Colas des Francs-Small C, Brundrett M, Small I. 2011. Rampant gene loss in the underground orchid Rhizanthella gardneri highlights evolutionary constraints on plastid genomes. Mol Biol Evol. 28:2077–2086.

Doyle JJ, Doyle JL. 1987. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem Bull. 19:11–15.

Dunn OJ. 1959. Confidence intervals for the means of dependent, normally distributed variables. J Am Stat Assoc. 54:613–621.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Evron Y, Johnson EA, McCarty RE. 2000. Regulation of proton flow and ATP synthesis in chloroplasts. J Bioenerg Biomembr. 32:501–506.

Fang YJ, Wu H, Zhang TW, Yang M, Yin YX, Pan LL, Yu XG, Zhang XW, Hu SNA, Al-Mssallem IS, et al. 2012. A complete sequence and transcriptomic analyses of date palm (Phoenix dactylifera L.) mitochondrial genome. PLoS One 7:e37164.

Felsenstein J. 1985. Phylogenies and the comparative method. Am Nat. 125:1–15.

Freudenstein JV. 1992. Systematics of Corallorhiza and the Corallorhizinae (Orchidaceae) [PhD dissertation]. [Ithaca (NY)]: Cornell University.

Freudenstein JV. 1997. A monograph of Corallorhiza (Orchidaceae). Harv Pap Bot. 10:5–51.

Freudenstein JV, Barrett CF. 2010. Mycoheterotrophy and diversity in Orchidaceae. In: Seberg O, Petersen G, Barfod A, Davis JI, editors. Diversity, phylogeny and evolution in the monocotyledons. The Proceedings of the Fourth International Conference on Monocot Systematics. Aarhus (Denmark): Aarhus University Press. p. 25–37.

Freudenstein JV, Doyle JJ. 1994a. Character transformation and relationships in Corallorhiza (Orchidaceae: Epidendroideae). 1. Plastid DNA. Am J Bot. 81:1449–1457.

Freudenstein JV, Doyle JJ. 1994b. Plastid DNA, morphological variation, and the phylogenetic species concept: the Corallorhiza maculata (Orchidaceae) complex. Syst Bot. 19:273–290.

Freudenstein JV, Senyo DM. 2008. Relationships and evolution of matK in a group of leafless orchids (Corallorhiza and Corallorhizinae; Orchidaceae: Epidendroideae). Am J Bot. 95:498–505.

Funk HT, Berg S, Krupinska K, Maier UG, Krause K. 2007. Complete DNA sequences of the plastid genomes of two parasitic flowering plant species, Cuscuta reflexa and Cuscuta gronovii. BMC Plant Biol. 7:45.

Galagan JE, Calvo SE, Borkovich KA, Selker EU, Read ND, Jaffe D, FitzHugh W, Ma LJ, Smirnov S, Purcell S, et al. 2003. The genome sequence of the filamentous fungus Neurospora crassa. Nature 422:859–868.

Garland T, Dickerman AW, Janis CM, Jones JA. 1993. Phylogenetic analysis of covariance by computer simulation. Syst Biol. 42:265–292.

Glemin S, Clement Y, David J, Ressayre A. 2014. GC content evolution in coding regions of angiosperm genomes: a unifying hypothesis. Trends Genet. 30:263–270.

Goloboff PA, Farris JS, Kallersjo M, Oxelman B, Ramirez MJ, Szumik CA. 2003. Improvements to resampling measures of group support. Cladistics 19:324–332.

Goloboff PA, Farris JS, Nixon KC. 2008. TNT, a free program for phylogenetic analysis. Cladistics 24:774–786.


Hajdukiewicz PTJ, Allison LA, Maliga P. 1997. The two RNA polymerases encoded by the nuclear and the plastid compartments transcribe distinct groups of genes in tobacco plastids. EMBO J. 16:4041–4048.

Harmon LJ, Weir JT, Brock CD, Glor RE, Challenger W. 2008. GEIGER: investigating evolutionary radiations. Bioinformatics 24:129–131.

Iles WJD, Smith SY, Graham SW. 2013. A well-supported phylogenetic framework for the monocot order Alismatales reveals multiple losses of the plastid NADH dehydrogenase complex and a strong long-branch effect. In: Wilkin P, Mayo S, editors. Early events in monocot evolution. Cambridge: Cambridge University Press. p. 1–28.

Iorizzo M, Grzebelus D, Senalik D, Szklarczyk M, Spooner D, Simon P. 2012. Against the traffic: the first evidence for mitochondrial DNA transfer into the plastid genome. Mob Genet Elements. 2:261.

Jensen PE, Bassi R, Boekema EJ, Dekker JP, Jansson S, Leister D, Robinson C, Scheller HV. 2007. Structure, function and regulation of plant photosystem I. Biochim Biophys Acta. 1767:335–352.

Katinka MD, Duprat S, Cornillot E, Metenier G, Thomarat F, Prensier G, Barbe V, Peyretaillade E, Brottier P, Wincker P, et al. 2001. Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature 414:450–453.

Katoh K, Misawa K, Kuma Ki, Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30:3059–3066.

Kawakami K, Umena Y, Iwai M, Kawabata Y, Ikeuchi M, Kamiya N, Shen J-R. 2011. Roles of PsbI and PsbM in photosystem II dimer formation and stability studied by deletion mutagenesis and X-ray crystallography. Biochim Biophys Acta. 1807:319–325.

Knaus BJ. 2010. Short read toolbox. Available from: http://brianknaus.com

Krause K. 2008. From chloroplasts to “cryptic” plastids: evolution of plastid genomes in parasitic plants. Curr Genet. 54:111–121.

Krause K. 2012. Plastid genomes of parasitic plants: a trail of reductions and losses. In: Bullerwell C, editor. Organelle genetics. Berlin (Germany): Springer. p. 79–103.

Kuijt J. 1969. The biology of flowering parasitic plants. Berkeley (CA): University of California Press.

Lawrence JG. 2005. Common themes in the genome strategies of pathogens. Curr Opin Genet Dev. 15:584–588.

Leake JR. 1994. The biology of myco-heterotrophic (Saprophytic) plants. New Phytol. 127:171–216.

Leake JR, Cameron DD. 2010. Physiological ecology of mycoheterotrophy. New Phytol. 185:601–605.

Liere K, Weihe A, Borner T. 2011. The transcription machineries of plant mitochondria and chloroplasts: composition, function, and regulation. J Plant Physiol. 168:1345–1360.

Logacheva MD, Schelkunov MI, Nuraliev MS, Samigullin TH, Penin AA. 2014. The plastid genome of mycoheterotrophic monocot Petrosavia stellaris exhibits both gene losses and multiple rearrangements. Genome Biol Evol. 6:238–246.

Logacheva MD, Schelkunov MI, Penin AA. 2011. Sequencing and analysis of plastid genome in mycoheterotrophic orchid Neottia nidus-avis. Genome Biol Evol. 3:1296–1303.

Maddison WP, Maddison DR. 2011. Mesquite: a modular system for evolutionary analysis, version 2.75. Available from: http://mesquiteproject.org

Martín M, Funk HT, Serrot PH, Poltnigg P, Sabater B. 2009. Functional characterization of the thylakoid ndh complex phosphorylation by site-directed mutations in the ndhF gene. Biochim Biophys Acta. 1787:920–928.

Martín M, Sabater B. 2010. Plastid ndh genes in plant evolution. Plant Physiol Biochem. 48:636–645.

McCarty RE, Evron Y, Johnson EA. 2000. The chloroplast ATP synthase: a rotary enzyme? Annu Rev Plant Physiol Plant Mol Biol. 51:83–109.

McNeal JR, Arumugunathan K, Kuehl JV, Boore JL, Depamphilis CW. 2007. Systematics and plastid genome evolution of the cryptically photosynthetic parasitic plant genus Cuscuta (Convolvulaceae). BMC Biol. 5:55.

McNeal JR, Bennett JR, Wolfe AD, Mathews S. 2013. Phylogeny and origins of holoparasitism in Orobanchaceae. Am J Bot. 100:971–983.

Merckx V, Freudenstein JV. 2010. Evolution of mycoheterotrophy in plants: a phylogenetic perspective. New Phytol. 185:605–609.

Midford PE, Garland T Jr, Maddison WP. 2005. PDAP package of Mesquite, version 1.07. Available from: http://mesquiteproject.org

Molina J, Hazzouri KM, Nickrent D, Geisler M, Meyer RS, Pentony MM, Flowers JM, Pelser P, Barcelona J, Inovejas SA, et al. 2014. Possible loss of the chloroplast genome in the parasitic flowering plant Rafflesia lagascae (Rafflesiaceae). Mol Biol Evol. 31:793–803.

Montfort C, Küsters E. 1940. Saprophytismus und Photosynthese. I. Biochemische physiologische studien an humus-orchideen. Botanisches Archiv 40:571–633.

Moran NA. 2002. Microbial minimalism: genome reduction in bacterial pathogens. Cell 108:583–586.

Morrison HG, McArthur AG, Gillin FD, Aley SB, Adam RD, Olsen GJ, Best AA, Cande WZ, Chen F, Cipriano MJ, et al. 2007. Genomic minimalism in the early diverging intestinal parasite Giardia lamblia. Science 317:1921–1926.

Nickrent DL, Yan OY, Duff RJ, dePamphilis CW. 1997. Do nonasterid holoparasitic flowering plants have plastid genomes? Plant Mol Biol. 34:717–729.

Patel RK, Jain M. 2012. NGS QC Toolkit: a toolkit for quality control of next generation sequencing data. PLoS One 7:e30619.

Peredo EL, King UM, Les DH. 2013. The plastid genome of Najas flexilis: adaptation to submersed environments is accompanied by the complete loss of the NDH complex in an aquatic angiosperm. PLoS One 8:e68591.

Porra RJ. 2002. The chequered history of the development and use of simultaneous equations for the accurate determination of chlorophylls a and b. Photosyn Res. 73:149–156.

Protasio AV, Tsai IJ, Babbage A, Nichol S, Hunt M, Aslett MA, De Silva N, Velarde GS, Anderson TJC, Clark RC, et al. 2012. A systematically improved high quality genome and transcriptome of the human blood fluke Schistosoma mansoni. PLoS Negl Trop Dis. 6:e1455.

Randle CP, Wolfe AD. 2005. The evolution and expression of rbcL in holoparasitic sister-genera Harveya and Hyobanche (Orobanchaceae). Am J Bot. 92:1575–1585.

Ranwez V, Harispe S, Delsuc F, Douzery EJP. 2011. MACSE: Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons. PLoS One 6:e22594.

Sanderson MJ. 1997. A nonparametric approach to estimating divergence times in the absence of rate constancy. Mol Biol Evol. 14:1218–1231.

Smith DR, Lee RW. 2014. A plastid without a genome: evidence from the nonphotosynthetic green algal genus Polytomella. Plant Physiol. 164:1812–1819.

Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690.

Straub SCK, Cronn RC, Edwards C, Fishbein M, Liston A. 2013. Horizontal transfer of DNA from the mitochondrial to the plastid genome and its subsequent evolution in milkweeds (Apocynaceae). Genome Biol Evol. 5:1872–1885.

Straub SCK, Fishbein M, Livshultz T, Foster Z, Parks M, Weitemier K, Cronn RC, Liston A. 2011. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing. BMC Genomics 12:211.

Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. 2013. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol. 30:2725–2729.

Tsai IJ, Zarowiecki M, Holroyd N, Garciarrubio A, Sanchez-Flores A, Brooks KL, Tracey A, Bobes RJ, Fragoso G, Sciutto E, et al. 2013. The genomes of four tapeworm species reveal adaptations to parasitism. Nature 496:57–63.

Umate P, Schwenkert S, Karbat I, Dal Bosco C, Mlcochova L, Volz S, Zer H, Herrmann RG, Ohad I, Meurer J. 2007. Deletion of psbM in tobacco alters the Q(B) site properties and the electron flow within photosystem II. J Biol Chem. 282:9758–9767.

Vaidya G, Lohman DJ, Meier R. 2011. SequenceMatrix: concatenation software for the fast assembly of multi-gene datasets with character set and codon information. Cladistics 27:171–180.

Westwood JH, Yoder JI, Timko MP, dePamphilis CW. 2010. The evolution of parasitism in plants. Trends Plant Sci. 15:227–235.

Wicke S, Muller KF, de Pamphilis CW, Quandt D, Wickett NJ, Zhang Y, Renner SS, Schneeweiss GM. 2013. Mechanisms of functional and physical genome reduction in photosynthetic and nonphotosynthetic parasitic plants of the broomrape family. Plant Cell 25:3711–3725.

Wicke S, Schäferhoff B, dePamphilis CW, Müller KF. 2014. Disproportional plastome-wide increase of substitution rates and relaxed purifying selection in genes of carnivorous Lentibulariaceae. Mol Biol Evol. 31:529–545.

Wicke S, Schneeweiss GM, dePamphilis CW, Muller KF, Quandt D. 2011. The evolution of the plastid chromosome in land plants: gene content, gene order, gene function. Plant Mol Biol. 76:273–297.

Wickett NJ, Honaas LA, Wafula EK, Das M, Huang K, Wu BA, Landherr L, Timko MP, Yoder J, Westwood JH, et al. 2011. Transcriptomes of the parasitic plant family Orobanchaceae reveal surprising conservation of chlorophyll synthesis. Curr Biol. 21:2098–2104.

Wimpee CF, Wrobel RL, Garvin DK. 1991. A divergent plastid genome in Conopholis americana, an achlorophyllous parasitic plant. Plant Mol Biol. 17:161–166.

Wolfe AD, dePamphilis CW. 1998. The effect of relaxed functional constraints on the photosynthetic gene rbcL in photosynthetic and nonphotosynthetic parasitic plants. Mol Biol Evol. 15:1243–1258.

Wolfe KH, Morden CW, Palmer JD. 1992. Function and evolution of a minimal plastid genome from a nonphotosynthetic parasitic plant. Proc Natl Acad Sci U S A. 89:10648–10652.

Wu FH, Chan MT, Liao DC, Hsu CT, Lee YW, Daniell H, Duvall MR, Lin CS. 2010. Complete chloroplast genome of Oncidium Gower Ramsey and evaluation of molecular markers for identification and breeding in Oncidiinae. BMC Plant Biol. 10:68.

Wyman SK, Jansen RK, Boore JL. 2004. Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20:3252–3255.

Yang JB, Tang M, Li HT, Zhang ZR, Li DZ. 2013. Complete chloroplast genome of the genus Cymbidium: lights into the species identification, phylogenetic implications and population genetic analyses. BMC Evol Biol. 13:84.

Yang ZH. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci. 13:555–556.

Yang ZH. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 24:1586–1591.

Yoder OC, Turgeon BG. 2001. Fungal genomics and pathogenicity. Curr Opin Plant Biol. 4:315–321.

Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18:821–829.

Zimmer K, Meyer C, Gebauer G. 2008. The ectomycorrhizal specialist orchid Corallorhiza trifida is a partial myco-heterotroph. New Phytol. 178:395–400.


Data Processing

The resulting paired FASTQ data were processed using a custom UNIX script (“process_fastq.sh”) that calls several freely available Perl scripts for the purposes of 1) quality filtering/trimming and adaptor removal (“IlluQC_PRLL.pl,” “TrimmingReads.pl”; Patel and Jain 2012); 2) “shuffling” PE files into a single file (“ShuffleSequences_fastq.pl,” part of the VELVET suite of tools [E. Cabot, University of Wisconsin-Madison; Zerbino and Birney 2008]); and 3) conversion to FASTA format (“fastq2fasta.pl,” B. Knaus 2009 [http://brianknaus.com/software/srtoolbox/, last accessed May 1, 2014]). Reads were trimmed from their 3′-ends, discarding bases under a quality threshold of PHRED score of 20 (corresponding to an error rate of 1/100 or greater), and also discarding trimmed reads below 61 bp in length.
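The trimming rule can be illustrated with a minimal Python sketch. It does not reproduce the behavior of TrimmingReads.pl; it assumes that trimming simply removes the trailing run of bases with quality below the threshold, and the function names are hypothetical.

```python
def phred_to_error(q):
    """Convert a PHRED quality score to an error probability (Q20 -> 0.01)."""
    return 10 ** (-q / 10)


def trim_3prime(seq, quals, min_q=20, min_len=61):
    """Trim low-quality bases from the 3' end of one read.

    `quals` is a list of integer PHRED scores, one per base.
    Returns (trimmed_seq, trimmed_quals), or None if the trimmed read
    is shorter than `min_len` and should be discarded.
    """
    end = len(seq)
    # Walk back from the 3' end while bases fall below the threshold.
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    if end < min_len:
        return None
    return seq[:end], quals[:end]
```

With these defaults, a 70-bp read whose last five bases have quality 10 is returned as a 65-bp read, whereas a 62-bp read with the same tail is discarded because only 57 bp remain.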

Read Mapping

PE reads were then mapped five times iteratively in Geneious v6 (created by Biomatters, Inc., Auckland, New Zealand; http://www.geneious.com, last accessed December 1, 2013) using automatically determined settings to a combined file of 1) the Phalaenopsis aphrodite plastome (GenBank accession number NC_007499; Chang et al. 2006), 2) the Phoenix dactylifera mitochondrial genome (NC_016740; Fang et al. 2012), and 3) the Asclepias syriaca nuclear ribosomal DNA consensus operon (JF312046; Straub et al. 2011). The purpose of read mapping was to filter, or “screen,” reads that matched the reference genomes and to get initial crude estimates of coverage to be used in downstream de novo assemblies.

Plastome Assembly

Mean coverage estimates from Geneious were used as input for de novo plastome assembly (“-exp_cov” parameter). De novo assembly was completed using both the VELVET v1.2.10 assembler (Zerbino and Birney 2008) and the Geneious v6 de novo assembler. VELVET assemblies were generated using a variety of hash lengths (i.e., “word match” or “k-mer length”) ranging from 51 to 81 for each complete data set, with specified expected coverage estimates taken from Geneious mapping (specified to be in approximate k-mer coverage; see Zerbino and Birney 2008). Coverage cutoff values were input as roughly one-half of the specified expected coverage estimates, minimum contig length was set to 300 bp, and insert size was set to 450 bp (320 bp for Co. odontorhiza); the resulting contigs were assembled in Sequencher v6 (GeneCodes, Ann Arbor, MI). The resulting “draft de novo assembly” from VELVET was used as a reference to map reads in Geneious as described above (five iterations, automatically determined settings). Read mappings were visually scanned in Geneious to check for areas of low coverage or potential misassemblies. Read pairs that mapped successfully to the draft reference were used in a second round of de novo assembly, this time in Geneious with automatically determined settings, saving 100 consensus contigs. This method allowed construction of complete draft plastomes with relative ease.
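The conversion of a mapping-based (nucleotide) coverage estimate to approximate k-mer coverage, and the derived assembly settings, can be sketched as follows. The conversion formula, Ck = C × (L − k + 1)/L, is the one given in the VELVET manual; the function names, the dictionary layout, and the read length and coverage used in the usage note are assumptions for illustration, not values from this study.

```python
def kmer_coverage(nucleotide_cov, read_len, k):
    """Approximate k-mer coverage: Ck = C * (L - k + 1) / L."""
    if k > read_len:
        raise ValueError("k cannot exceed the read length")
    return nucleotide_cov * (read_len - k + 1) / read_len


def velvet_settings(nucleotide_cov, read_len, k,
                    min_contig_lgth=300, ins_length=450):
    """Collect settings following the text: the coverage cutoff is taken
    as one-half of the expected (k-mer) coverage. Keys mirror the names
    of the corresponding velvetg options."""
    exp_cov = kmer_coverage(nucleotide_cov, read_len, k)
    return {
        "exp_cov": exp_cov,
        "cov_cutoff": exp_cov / 2,
        "min_contig_lgth": min_contig_lgth,
        "ins_length": ins_length,
    }
```

For instance, a hypothetical 100× nucleotide coverage with 100-bp reads and k = 51 corresponds to an expected k-mer coverage of 50 and a coverage cutoff of 25.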

Contigs from both assemblies (Geneious, VELVET) were merged in Sequencher to form complete plastomes (parameters = “dirty data” or “large gap,” minimum overlap = 20 bp, minimum similarity 70–90%). Any apparent “gaps” among contigs resulting from one software platform were usually covered by a contig from the other, allowing assembly of completed plastomes (Large Single Copy–Inverted Repeat B–Small Single Copy). In a few cases, VELVET scaffolds contained strings of Ns between 30 and 100 bp, mostly associated with homopolymer repeats (>10 bp). These were resolved by 1) the Geneious de novo assembly (typically) and 2) UNIX “grep” searches, with flanking sequences of 30–50 bp taken from each end of the string of Ns used as search terms. Grep results were converted to FASTA format using a custom Perl script to build a “bridge” of reads. This was done iteratively until the region in question was resolved, although most cases were resolved with a single iteration. Grep searches were applied to all cases of ambiguity between contigs from the two de novo assembly methods as a quality control. Polymerase chain reaction amplification and Sanger sequencing were not necessary due to high read coverage (table 1).
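The idea behind the flank-based searches can be sketched in Python. This is a simplification of the procedure described above: it reports only reads that contain both flanks, in either orientation, whereas the grep searches could also recover reads matching a single flank and extend the bridge iteratively. The function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def bridging_reads(reads, left_flank, right_flank):
    """Return the sequence found between the two flanks in every read
    that contains both, checking each read in both orientations."""
    gaps = []
    for read in reads:
        for oriented in (read, revcomp(read)):
            i = oriented.find(left_flank)
            if i == -1:
                continue
            start = i + len(left_flank)
            j = oriented.find(right_flank, start)
            if j != -1:
                gaps.append(oriented[start:j])
                break
    return gaps
```

If all bridging reads return the same intervening sequence, that sequence is a candidate replacement for the string of Ns; disagreement among reads would point to a region needing closer inspection.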

IR boundaries for each draft plastome were determined by using the first and last approximately 20 bp of the draft plastome as search terms (reverse complement) in Sequencher to find matches, and marking their match points as features. Then, the first approximately 700 bp and last approximately 1–2,000 bp of the draft plastome were copied, reverse complemented, and manually aligned to the IR using the marked features from above as reference points. IR boundaries were determined informatically using PE information (i.e., read pairs that spanned the IR boundaries). Because the orientation of the ends of the LSC and SSC can vary, the plastomes were often in different configurations; single copy regions were reoriented to correspond to the “standard” model of gene order to simplify downstream comparison. Special attention was paid to the possibility of significant modification or even loss of the IR (either partial or complete), and also to the possibility of other genomic rearrangements or structural modifications. In other words, when building draft plastomes, it was not assumed that all accessions displayed the “standard” plastome structure found in angiosperms (i.e., LSC, IRb, SSC, IRa).
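The first step, searching a draft plastome for the reverse complement of one of its terminal seeds, can be sketched in Python. This stands in for the Sequencher search only; the function names are hypothetical, exact matching is assumed, and the sketch does not attempt the manual alignment or the paired-end confirmation of boundaries.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_revcomp_seed(draft, seed_len=20, from_end=True):
    """Locate the reverse complement of a terminal seed in the draft.

    The seed is the last (or first) `seed_len` bases of `draft`.
    Returns the 0-based index of the first exact match of its reverse
    complement, or -1 if there is none (e.g., if no second IR copy
    adjoins that end of the draft).
    """
    seed = draft[-seed_len:] if from_end else draft[:seed_len]
    return draft.find(revcomp(seed))
```

For a draft laid out as LSC, IR, SSC, and a reverse-complemented IR copy, the seed taken from the 3′ end matches at the first base of the upstream IR copy, which marks one candidate boundary.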

Plastome Annotation

Plastid genomes were annotated by first aligning the set of coding loci (coding sequences [CDS]) containing introns and all tRNAs from Phalaenopsis to each draft plastome (minus IRa) in Sequencher to aid in intron/exon and tRNA boundary determination. Annotations were conducted in DOGMA (Wyman et al. 2004), which uses BLASTX searches against a database of annotated plastomes and amino acid translations, and allows manual determination of start/stop codons. DOGMA BLASTX parameters were as follows: genetic code = plant plastid, e value = 5, percent cutoff for protein coding genes = 55, and percent cutoff for tRNAs = 80. The resulting GenBank feature table was exported from the DOGMA server and imported, along with the finished plastid FASTA file, into SEQUIN (National Center for Biotechnology Information), where the annotation was validated and completed for submission to GenBank. A large inversion in Co. maculata var. maculata was manually reversed for the purposes of phylogenetic analyses and other whole-plastome comparisons, but the plastome for that accession was annotated with the inversion as assembled de novo. Plastome maps were drawn with GenomeVx (Conant and Wolfe 2008) and manually edited in Adobe Illustrator CS6 (Adobe Inc., San Jose, CA).

Ribosomal DNA Assembly and Annotation

Nuclear ribosomal DNA reads were filtered by mapping reads from Co. odontorhiza to a nearly complete rDNA operon reference (partial ETS [External Transcribed Spacer], 18S rDNA, ITS1 [Internal Transcribed Spacer], 5.8S rDNA, ITS2, 26S rDNA) from A. syriaca, described above, as well as mapping to an ITS-spanning sequence from Co. odontorhiza from GenBank (accession number EU391326). Other regions were not assembled (e.g., 5S rDNA). Resulting reads that mapped to each reference were assembled de novo in Geneious, and resulting contigs were merged through overlap in Sequencher. This draft assembly was used for a second round of read mapping (original Co. odontorhiza reads) and de novo assembly in Geneious, resulting in a single consensus contig. The consensus contig was based on 50% majority per site, bearing in mind that the rDNA operon exists in numerous repeated copies that often differ from one another, appearing as polymorphism in read pileups. The Co. odontorhiza rDNA consensus operon was then used as a reference in read mapping for all other Corallorhiza species, followed by individual de novo assemblies as above.
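A per-site majority consensus of the kind described above can be sketched in Python. This is a simplified illustration, not the Geneious implementation: the function name is hypothetical, each pileup column is given as a plain string of base calls, and the 50% rule is read as "strictly more than half", which is an assumption.

```python
from collections import Counter


def majority_consensus(columns, threshold=0.5, ambiguous="N"):
    """Call one base per pileup column.

    A base is called only if its frequency among A/C/G/T calls in the
    column exceeds `threshold`; otherwise `ambiguous` is emitted.
    """
    consensus = []
    for column in columns:
        bases = [b for b in column.upper() if b in "ACGT"]
        if not bases:
            consensus.append(ambiguous)
            continue
        base, count = Counter(bases).most_common(1)[0]
        consensus.append(base if count / len(bases) > threshold else ambiguous)
    return "".join(consensus)


# Hypothetical pileup columns: unanimous, tied, fully mixed, 3-of-4 majority.
toy_columns = ["AAAA", "AACC", "ACGT", "GGGT"]
toy_consensus = majority_consensus(toy_columns)
```

Sites where divergent rDNA repeat copies are equally frequent come out as ambiguous under this rule rather than being forced to one base.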

Nearly complete ribosomal DNA operons (partial ETS, 18S rDNA, ITS1, 5.8S, ITS2, 26S rDNA; excluding 5S rDNA) were annotated by first aligning finished contigs in Geneious (using the MUSCLE v3.8.31 plugin in Geneious; Edgar 2004) and transferring the annotation features from Co. odontorhiza to the aligned Corallorhiza sequences. Then, the entire alignment was annotated in SEQUIN for submission to GenBank by adding each feature to Co. odontorhiza and subsequently propagating the features to the remaining sequences. Reads were again mapped in Geneious to each finished, annotated rDNA cistron to check for misassemblies. GenBank accession numbers for rDNA partial operons are KM390003–KM390012. Available outgroup sequences for Cymbidium, Oncidium, and Phalaenopsis were mined from GenBank for the purposes of downstream phylogenetic analyses (18S rDNA: Cy. farberi, GenBank accession number JN418934; ITS: Cy. aloifolium, JF729014; Phalaenopsis aphrodite, AY391543; O. andradeanum, FJ565598).

Data Matrix Assembly and Sequence Alignment

Annotated plastomes were imported into Geneious, with subsequent removal of the second copy of the IR (IRa), and aligned with MAFFT v7.157 (Katoh et al. 2002) using the “fast-progressive” method (gap opening penalty = 6, offset = 0.1), followed by manual adjustment. The alignment also included complete plastomes of Cy. aloifolium (GenBank accession number NC_021429), Oncidium Gower Ramsey (GenBank accession number NC_014056), and Phalaenopsis aphrodite (GenBank accession number NC_017049). A short inversion of the petN-psbM region in Cymbidium was removed for that accession in downstream analyses. Annotated CDS were exported from Geneious, and data from each locus were saved as separate FASTA files. Each file was submitted for codon-based alignment to the MACSE web server (Ranwez et al. 2011; http://mbb.univ-montp2.fr/MBB/subsection/softExec.php?soft=macse, last accessed May 30, 2014), which allows alignment of both intact open reading frames and incomplete sequences (including pseudogenes with modified reading frames). Alignments were adjusted manually when necessary and imported back into Geneious or SequenceMatrix (Vaidya et al. 2011), where they were concatenated into larger matrices. Protein-coding genes that contained pseudogenes (shifted reading frames, internal stop codons) in at least one taxon were excluded. Data matrices are deposited in supplementary fig. S6, Supplementary Material online.
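The concatenation of per-locus alignments into a larger matrix can be sketched in Python. This is a generic illustration, not the behavior of Geneious or SequenceMatrix: the function name and toy loci are hypothetical, and padding absent taxa with "?" is an assumption made here.

```python
def concatenate(alignments, missing="?"):
    """Concatenate per-locus alignments into one matrix.

    `alignments` is a list of dicts mapping taxon name to aligned
    sequence. Taxa absent from a locus are padded with `missing`.
    """
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    matrix = {taxon: [] for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within a locus must have equal length")
        width = lengths.pop()
        for taxon in taxa:
            matrix[taxon].append(aln.get(taxon, missing * width))
    return {taxon: "".join(parts) for taxon, parts in matrix.items()}


# Hypothetical toy loci; the second locus lacks taxon "B".
locus1 = {"A": "ATG", "B": "AT-"}
locus2 = {"A": "CC"}
supermatrix = concatenate([locus1, locus2])
```

Loci flagged as pseudogenes in any taxon would simply be left out of the input list before concatenation, matching the exclusion rule stated above.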

Phylogenetic Analyses

Phylogenetic analyses were conducted using both Parsimony and ML. Parsimony searches were conducted in TNT (Goloboff et al. 2003, 2008) with 100 random addition starting sequences, saving 100 trees per replicate, and using tree bisection–reconnection branch swapping on the pool of saved trees. Support was assessed through 2,000 jackknife pseudoreplicates with the above search parameters. Model fit for ML analyses was assessed for the complete plastome alignment in MEGA v6 (Tamura et al. 2013), which yielded GTR+G as the best-fit model based on the corrected Akaike Information Criterion (AICc) metric (Burnham and Anderson 2002). All ML analyses were conducted in RAxML v8.0.20 (Stamatakis 2006). Three separate searches were initiated for each matrix analyzed, from different random starting seeds, using an unpartitioned GTR+G model to ensure convergence on the same topology. A codon-partitioned model was not chosen due to the possibility of relaxed functional constraints among some coding regions for some taxa, which could represent model violations. Support was assessed with 1,000 standard bootstrap pseudoreplicates. Cymbidium was chosen as the outgroup taxon.

Plastid Pseudogenes, Genomic Deletions, and Genomic Attributes

The presence of pseudogenes and whole gene/regional deletions was initially assessed visually in DOGMA by identifying interrupted BLASTX reading frames. All coding loci aligned in MACSE (see above) were imported into Geneious, and sequences were translated to assess pseudogene status through the presence of frame shifts and/or premature stop codons. Loci with interrupted reading frames were considered to be pseudogenes unless there was an alternative start or stop codon within three codon positions of the other taxa in the alignment. In addition, the alignment of whole annotated plastomes (described above) was used to visualize any larger genomic deletions spanning CDS boundaries, or rearrangements.
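The reading-frame screen described above can be sketched in Python. This is a deliberately reduced illustration: the function name is hypothetical, only alignment gaps written as "-" are handled, the stop codons are those shared by the standard and plant plastid genetic codes, and the three-codon tolerance for alternative start or stop codons is not implemented.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reading_frame_problems(aligned_cds):
    """Return a list of reasons a CDS looks like a pseudogene.

    Gaps ("-") are stripped first. Two checks are made: whether the
    ungapped length is a multiple of three (a possible frame shift) and
    whether a stop codon occurs before the final codon.
    """
    seq = aligned_cds.upper().replace("-", "")
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length not a multiple of 3")
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    internal_stops = [i for i, codon in enumerate(codons[:-1])
                      if codon in STOP_CODONS]
    if internal_stops:
        problems.append(f"internal stop codon at codon {internal_stops[0] + 1}")
    return problems


# Hypothetical toy sequences.
intact = reading_frame_problems("ATG---AAATAA")
premature = reading_frame_problems("ATGTAAAAATAA")
shifted = reading_frame_problems("ATGAAATA")
```

An empty list means neither check fired; it does not by itself establish that a locus is functional.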


Hypotheses of differences in plastid genomic attributes among nongreen versus green taxa, and correlations between chlorophyll content and genomic attributes, were assessed using two phylogeny-based analytical approaches. First, a Phylogenetic Analysis of Variance (Garland et al. 1993) was used to test for differences in continuous plastome variables (e.g., length, GC content) among green versus nongreen taxa (i.e., a discrete grouping variable), taking phylogenetic relationships into account. Analyses were run in the R package “geiger” (Harmon et al. 2008) with 10,000 simulations each, using the “aov.phylo” function, based on an ultrametric tree generated through Nonparametric Rate Smoothing (Sanderson 1997). These analyses were run for a complete data set of 13 taxa (three leafy, green outgroup taxa and ten species of Corallorhiza). A second set of analyses was conducted on a subset of six taxa of Corallorhiza for which chlorophyll data were available, to mitigate problems with missing chlorophyll data. In these analyses, continuous genomic variables (length, number of putatively functional genes, GC content) were correlated with chlorophyll content (continuous) using Phylogenetically Independent Contrasts to account for phylogenetic relatedness among taxa (Felsenstein 1985). The latter analyses were conducted under a Brownian motion model with the PDAP package (Midford et al. 2005) in Mesquite v2.75 (Maddison WP and Maddison DR 2011).

Model-Based Analyses of Selective Regime

Changes in selective regime among species of Corallorhiza for codon-based matrices were assessed with the CODEML module in PAML v4.8 (Yang 1997, 2007), using “branch” models. Only loci for which all taxa had open reading frames were included in the analyses. Various matrices are listed in table 3. The objective was to test hypotheses of different ω ratios for 1) Corallorhiza versus leafy outgroups and 2) nongreen Corallorhiza versus green Corallorhiza + leafy outgroups. To accomplish this, different branch models were fit to the data, using the topology resulting from the ML analysis of whole plastomes + rDNA. Six different branch models were tested. First, a model in which all branches evolve under one ratio (Model M0) was tested, where ω = dN/dS, the ratio of nonsynonymous substitutions per nonsynonymous site to synonymous substitutions per synonymous site. A two-ratio model (M1) was then run in which all branches of Corallorhiza were allowed to have a different ratio (ω_cor) than the remaining leafy outgroup sequences (ω_leafy). A three-ratio model was then run (M2), allowing separate values for ω_leafy, ω_cor, and nongreen Corallorhiza (ω_str = ω_mac-ng; Co. striata, and Co. mertensiana, Co. maculata var. maculata and var. occidentalis, respectively), followed by a four-ratio model (M3) with additionally separate ratios for Co. striata (ω_str) and the nongreen members of the Co. maculata complex (ω_mac-ng). The latter three models (M1–M3) specifically test hypotheses of whether Corallorhiza, and particularly nongreen Corallorhiza, display departures from purifying selection. Two additional models were run to test whether the nongreen Corallorhiza differ significantly from green Corallorhiza and leafy outgroup taxa. Specifically, Model M4 (two ratios) sets ω_leafy = ω_cor and ω_str = ω_mac-ng, whereas M5 (three ratios) allows ω_str and ω_mac-ng to differ. Successively nested models were compared based on log-likelihood ratio tests (M1 vs. M0, M2 vs. M1, M3 vs. M2, M4 vs. M0, M5 vs. M4), where the likelihood ratio test statistic was calculated as two times the difference in log-likelihood between the two models, with degrees of freedom calculated as the difference in the number of free parameters for the models, and tested against the χ² distribution. P values for multiple comparisons to the same data set were Bonferroni-corrected (Dunn 1959) as suggested by Yang (2007), with the corrected significance level corresponding to α/m, where m is the number of branches being tested. Genes ycf1 and ycf2 were analyzed separately because they have extraordinarily high substitution rates (e.g., Barrett et al. 2013), as was matK, based on the finding of a pseudogenized copy in a previous study (Freudenstein and Senyo 2008).
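The likelihood ratio test with a Bonferroni-corrected threshold can be sketched in Python using only the standard library. The log-likelihood values below are hypothetical placeholders rather than results from this study, the function names are illustrative, and the χ² tail probability is computed from the closed-form expression for integer degrees of freedom instead of a statistics package.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc term plus a finite sum over odd double factorials.
    total, term = 0.0, math.sqrt(x)
    for j in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * j + 1)
    return (math.erfc(math.sqrt(half))
            + math.sqrt(2.0 / math.pi) * math.exp(-half) * total)


def likelihood_ratio_test(lnl_null, lnl_alt, df, alpha=0.05, n_tests=1):
    """Return (statistic, P value, significant at alpha / n_tests)."""
    statistic = 2.0 * (lnl_alt - lnl_null)
    p_value = chi2_sf(statistic, df)
    return statistic, p_value, p_value < alpha / n_tests


# Hypothetical example: a one-ratio null versus a two-ratio alternative
# (one extra free parameter), with five comparisons on the same data set.
statistic, p_value, significant = likelihood_ratio_test(
    lnl_null=-1000.0, lnl_alt=-995.0, df=1, n_tests=5)
```

With five comparisons the per-test threshold becomes 0.05/5 = 0.01, so a difference of five log-likelihood units for one extra parameter remains significant after correction.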

To further investigate the ω values among branches for the atp complex, two sets of branch-site models were tested against the data. In the first, all Corallorhiza were set as the foreground branch, and in the second, nongreen Corallorhiza (Co. striata var. vreelandii, Co. maculata var. maculata and var. occidentalis, and Co. mertensiana) were set as foreground branches. For both sets of branch-site models, two alternative models were tested. The “null” model specifies ω_0 < 1 for a proportion of sites and ω_1 = ω_2 = 1 for the remaining sites (no positive selection), whereas the alternative model allows ω_2 > 1 (positive selection on some sites). The alternative model (some sites under positive selection) was then tested against the null model (no sites under positive selection) using a χ² distribution as above, with 1 degree of freedom (i.e., the alternative and null models differ by one parameter).

Comparison with Other Parasitic Angiosperm Lineages

Plastid gene content (losses, pseudogenes, putatively functional genes) was compared among diverse lineages of angiosperms containing heterotrophs, including newly sequenced Corallorhiza. Taxa and genes were sorted by the numbers of putatively functional genes and gene functional class, respectively, to illustrate the different stages of plastid genome degradation. Specifically, genes were arranged according to the model of Barrett and Davis (2012), in which degradation of plastid functional gene classes is hypothesized to have occurred roughly in the following order: 1) NAD(P)H dehydrogenase genes (ndh); 2) “photosynthesis”-related genes, including pet, psa, psb, cemA, ccsA, rbcL, ycf3, and ycf4; 3) Plastid-encoded RNA Polymerase genes (rpo); 4) ATP synthase genes (atp); and finally 5) “housekeeping” genes (rpl, rps, rrn, trn, matK, clpP, infA, accD, ycf1, and ycf2). Though not a formal test in the strict sense, the use of this model does allow an assessment of how advanced the sequenced accessions of Corallorhiza are when compared with other heterotrophic angiosperms; it is hypothesized that Corallorhiza occupies the early stages of this model. Technically, atp genes function in the generation of ATP and movement of protons across the thylakoid membrane, and thus play a crucial role in photosynthesis, but this functional class was kept separate from photosynthesis genes due to the possibility that ATP synthase serves roles outside of photosynthesis.

Supplementary Material

Supplementary figures S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by the National Science Foundation (awards DEB-0830020 to Jerrold I. Davis and DBI 1110443 to J.C.P.), the California State University Program for Education and Research in Biotechnology (CSUPERB) to C.F.B., the California State University, Los Angeles Center for Effective Teaching and Learning Faculty-Student Mentorship award to C.F.B. and J.L., and the California State University Council on Ocean Affairs, Science, and Technology to C.S. The authors are grateful to Jerrold Davis, Patrick Edger, Ryan Ellingson, Gwynne Lim, Jeffrey Morawetz, and Chris Randle for feedback and laboratory/bioinformatics support; the staff at Cold Spring Harbor Laboratory (Woodbury, NY) for sample preparation and sequencing; Daniel Lima and Richard Sayre for assistance with chlorophyll spectroscopy and data interpretation; Gerardo Salazar (UNAM) for assistance in collecting Mexican specimens; and Maria Logacheva and three anonymous reviewers for critical insight that improved the paper.

References

Allen JF, de Paula WBM, Puthiyaveetil S, Nield J. 2011. A structural phylogenetic map for chloroplast photosynthesis. Trends Plant Sci. 16:645–655.

Barbrook AC, Howe CJ, Purton S. 2006. Why are plastid genomes retained in non-photosynthetic organisms? Trends Plant Sci. 11:101–108.

Barrett CF, Davis JI. 2012. The plastid genome of the mycoheterotrophic Corallorhiza striata (Orchidaceae) is in the relatively early stages of degradation. Am J Bot. 99:1513–1523.

Barrett CF, Davis JI, Leebens-Mack J, Conran JG, Stevenson DW. 2013. Plastid genomes and deep relationships among the commelinid monocot angiosperms. Cladistics 29:65–87.

Barrett CF, Freudenstein JV. 2008. Molecular evolution of rbcL in the mycoheterotrophic coralroot orchids (Corallorhiza Gagnebin, Orchidaceae). Mol Phylogenet Evol. 47:665–679.

Bellino A, Alfani A, Selosse M-A, Guerrieri R, Borghetti M, Baldantoni D. 2014. Nutritional regulation in mixotrophic plants: new insights from Limodorum abortivum. Oecologia 175:875–885.

Bernardi G. 1989. The isochore organization of the human genome. Annu Rev Genet. 23:637–661.

Bidartondo MI. 2005. The evolutionary ecology of myco-heterotrophy. New Phytol. 167:335–352.

Blazier J, Guisinger MM, Jansen RK. 2011. Recent loss of plastid-encoded ndh genes within Erodium (Geraniaceae). Plant Mol Biol. 76:263–272.

Braukmann T, Kuzmina M, Stefanović S. 2013. Plastid genome evolution across the genus Cuscuta (Convolvulaceae): two clades within subgenus Grammica exhibit extensive gene loss. J Exp Bot. 64:977–989.

Braukmann T, Stefanović S. 2012. Plastid genome evolution in mycoheterotrophic Ericaceae. Plant Mol Biol. 79:5–20.

Braukmann TWA, Kuzmina M, Stefanović S. 2009. Loss of all plastid ndh genes in Gnetales and conifers: extent and evolutionary significance for the seed plant phylogeny. Curr Genet. 55:323–337.

Burnham KP, Anderson DR. 2002. Avoiding pitfalls when using information-theoretic methods. J Wildl Manage. 66:912–918.

Cameron DD, Preiss K, Gebauer G, Read DJ. 2009. The chlorophyll-containing orchid Corallorhiza trifida derives little carbon through photosynthesis. New Phytol. 183:358–364.

Chang CC, Lin HC, Lin IP, Chow TY, Chen HH, Chen WH, Cheng CH, Lin CY, Liu SM, Chang CC, et al. 2006. The chloroplast genome of Phalaenopsis aphrodite (Orchidaceae): comparative analysis of evolutionary rate with that of grasses and its phylogenetic implications. Mol Biol Evol. 23:279–291.

Conant GC, Wolfe KH. 2008. GenomeVx: simple web-based creation of editable circular chromosome maps. Bioinformatics 24:861–862.

Cummings MP, Welschmeyer NA. 1998. Pigment composition of putatively achlorophyllous angiosperms. Plant Syst Evol. 210:105–111.

Delannoy E, Fujii S, Colas des Francs-Small C, Brundrett M, Small I. 2011. Rampant gene loss in the underground orchid Rhizanthella gardneri highlights evolutionary constraints on plastid genomes. Mol Biol Evol. 28:2077–2086.

Doyle JJ, Doyle JL. 1987. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem Bull. 19:11–15.

Dunn OJ. 1959. Confidence intervals for the means of dependent, normally distributed variables. J Am Stat Assoc. 54:613–621.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Evron Y, Johnson EA, McCarty RE. 2000. Regulation of proton flow and ATP synthesis in chloroplasts. J Bioenerg Biomembr. 32:501–506.

Fang YJ, Wu H, Zhang TW, Yang M, Yin YX, Pan LL, Yu XG, Zhang XW, Hu SNA, Al-Mssallem IS, et al. 2012. A complete sequence and transcriptomic analyses of date palm (Phoenix dactylifera L.) mitochondrial genome. PLoS One 7:e37164.

Felsenstein J. 1985. Phylogenies and the comparative method. Am Nat. 125:1–15.

Freudenstein JV. 1992. Systematics of Corallorhiza and the Corallorhizinae (Orchidaceae) [PhD dissertation]. [Ithaca (NY)]: Cornell University.

Freudenstein JV. 1997. A monograph of Corallorhiza (Orchidaceae). Harv Pap Bot. 10:5–51.

Freudenstein JV, Barrett CF. 2010. Mycoheterotrophy and diversity in Orchidaceae. In: Seberg O, Petersen G, Barfod A, Davis JI, editors. Diversity, phylogeny and evolution in the monocotyledons. The Proceedings of the Fourth International Conference on Monocot Systematics. Aarhus (Denmark): Aarhus University Press. p. 25–37.

Freudenstein JV, Doyle JJ. 1994a. Character transformation and relationships in Corallorhiza (Orchidaceae: Epidendroideae). 1. Plastid DNA. Am J Bot. 81:1449–1457.

Freudenstein JV, Doyle JJ. 1994b. Plastid DNA, morphological variation, and the phylogenetic species concept: the Corallorhiza maculata (Orchidaceae) complex. Syst Bot. 19:273–290.

Freudenstein JV, Senyo DM. 2008. Relationships and evolution of matK in a group of leafless orchids (Corallorhiza and Corallorhizinae; Orchidaceae: Epidendroideae). Am J Bot. 95:498–505.

Funk HT, Berg S, Krupinska K, Maier UG, Krause K. 2007. Complete DNA sequences of the plastid genomes of two parasitic flowering plant species, Cuscuta reflexa and Cuscuta gronovii. BMC Plant Biol. 7:45.

Galagan JE, Calvo SE, Borkovich KA, Selker EU, Read ND, Jaffe D, FitzHugh W, Ma LJ, Smirnov S, Purcell S, et al. 2003. The genome sequence of the filamentous fungus Neurospora crassa. Nature 422:859–868.

Garland T, Dickerman AW, Janis CM, Jones JA. 1993. Phylogenetic analysis of covariance by computer simulation. Syst Biol. 42:265–292.

Glémin S, Clément Y, David J, Ressayre A. 2014. GC content evolution in coding regions of angiosperm genomes: a unifying hypothesis. Trends Genet. 30:263–270.

Goloboff PA, Farris JS, Källersjö M, Oxelman B, Ramírez MJ, Szumik CA. 2003. Improvements to resampling measures of group support. Cladistics 19:324–332.

Goloboff PA, Farris JS, Nixon KC. 2008. TNT, a free program for phylogenetic analysis. Cladistics 24:774–786.


Hajdukiewicz PTJ, Allison LA, Maliga P. 1997. The two RNA polymerases encoded by the nuclear and the plastid compartments transcribe distinct groups of genes in tobacco plastids. EMBO J. 16:4041–4048.

Harmon LJ, Weir JT, Brock CD, Glor RE, Challenger W. 2008. GEIGER: investigating evolutionary radiations. Bioinformatics 24:129–131.

Iles WJD, Smith SY, Graham SW. 2013. A well-supported phylogenetic framework for the monocot order Alismatales reveals multiple losses of the plastid NADH dehydrogenase complex and a strong long-branch effect. In: Wilkin P, Mayo S, editors. Early events in monocot evolution. Cambridge: Cambridge University Press. p. 1–28.

Iorizzo M, Grzebelus D, Senalik D, Szklarczyk M, Spooner D, Simon P. 2012. Against the traffic: the first evidence for mitochondrial DNA transfer into the plastid genome. Mob Genet Elements 2:261.

Jensen PE, Bassi R, Boekema EJ, Dekker JP, Jansson S, Leister D, Robinson C, Scheller HV. 2007. Structure, function and regulation of plant photosystem I. Biochim Biophys Acta. 1767:335–352.

Katinka MD, Duprat S, Cornillot E, Méténier G, Thomarat F, Prensier G, Barbe V, Peyretaillade E, Brottier P, Wincker P, et al. 2001. Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature 414:450–453.

Katoh K, Misawa K, Kuma Ki, Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30:3059–3066.

Kawakami K, Umena Y, Iwai M, Kawabata Y, Ikeuchi M, Kamiya N, Shen J-R. 2011. Roles of PsbI and PsbM in photosystem II dimer formation and stability studied by deletion mutagenesis and X-ray crystallography. Biochim Biophys Acta. 1807:319–325.

Knaus BJ. 2010. Short read toolbox. Available from: http://brianknaus.com

Krause K. 2008. From chloroplasts to “cryptic” plastids: evolution of plastid genomes in parasitic plants. Curr Genet. 54:111–121.

Krause K. 2012. Plastid genomes of parasitic plants: a trail of reductions and losses. In: Bullerwell C, editor. Organelle genetics. Berlin (Germany): Springer. p. 79–103.

Kuijt J. 1969. The biology of flowering parasitic plants. Berkeley (CA): University of California Press.

Lawrence JG. 2005. Common themes in the genome strategies of pathogens. Curr Opin Genet Dev. 15:584–588.

Leake JR. 1994. The biology of myco-heterotrophic (Saprophytic) plants. New Phytol. 127:171–216.

Leake JR, Cameron DD. 2010. Physiological ecology of mycoheterotrophy. New Phytol. 185:601–605.

Liere K, Weihe A, Börner T. 2011. The transcription machineries of plant mitochondria and chloroplasts: composition, function, and regulation. J Plant Physiol. 168:1345–1360.

Logacheva MD, Schelkunov MI, Nuraliev MS, Samigullin TH, Penin AA. 2014. The plastid genome of mycoheterotrophic monocot Petrosavia stellaris exhibits both gene losses and multiple rearrangements. Genome Biol Evol. 6:238–246.

Logacheva MD, Schelkunov MI, Penin AA. 2011. Sequencing and analysis of plastid genome in mycoheterotrophic orchid Neottia nidus-avis. Genome Biol Evol. 3:1296–1303.

Maddison WP, Maddison DR. 2011. Mesquite: a modular system for evolutionary analysis, version 2.75. Available from: http://mesquiteproject.org

Martín M, Funk HT, Serrot PH, Poltnigg P, Sabater B. 2009. Functional characterization of the thylakoid ndh complex phosphorylation by site-directed mutations in the ndhF gene. Biochim Biophys Acta. 1787:920–928.

Martín M, Sabater B. 2010. Plastid ndh genes in plant evolution. Plant Physiol Biochem. 48:636–645.

McCarty RE, Evron Y, Johnson EA. 2000. The chloroplast ATP synthase: a rotary enzyme? Annu Rev Plant Physiol Plant Mol Biol. 51:83–109.

McNeal JR, Arumugunathan K, Kuehl JV, Boore JL, dePamphilis CW. 2007. Systematics and plastid genome evolution of the cryptically photosynthetic parasitic plant genus Cuscuta (Convolvulaceae). BMC Biol. 5:55.

McNeal JR, Bennett JR, Wolfe AD, Mathews S. 2013. Phylogeny and origins of holoparasitism in Orobanchaceae. Am J Bot. 100:971–983.

Merckx V, Freudenstein JV. 2010. Evolution of mycoheterotrophy in plants: a phylogenetic perspective. New Phytol. 185:605–609.

Midford PE, Garland T Jr, Maddison WP. 2005. PDAP package of Mesquite, version 1.07. Available from: http://mesquiteproject.org

Molina J, Hazzouri KM, Nickrent D, Geisler M, Meyer RS, Pentony MM, Flowers JM, Pelser P, Barcelona J, Inovejas SA, et al. 2014. Possible loss of the chloroplast genome in the parasitic flowering plant Rafflesia lagascae (Rafflesiaceae). Mol Biol Evol. 31:793–803.

Montfort C, Küsters E. 1940. Saprophytismus und Photosynthese. I. Biochemische physiologische Studien an Humus-Orchideen. Botanisches Archiv 40:571–633.

Moran NA. 2002. Microbial minimalism: genome reduction in bacterial pathogens. Cell 108:583–586.

Morrison HG, McArthur AG, Gillin FD, Aley SB, Adam RD, Olsen GJ, Best AA, Cande WZ, Chen F, Cipriano MJ, et al. 2007. Genomic minimalism in the early diverging intestinal parasite Giardia lamblia. Science 317:1921–1926.

Nickrent DL, Yan OY, Duff RJ, dePamphilis CW. 1997. Do nonasterid holoparasitic flowering plants have plastid genomes? Plant Mol Biol. 34:717–729.

Patel RK, Jain M. 2012. NGS QC Toolkit: a toolkit for quality control of next generation sequencing data. PLoS One 7:e30619.

Peredo EL, King UM, Les DH. 2013. The plastid genome of Najas flexilis: adaptation to submersed environments is accompanied by the complete loss of the NDH complex in an aquatic angiosperm. PLoS One 8:e68591.

Porra RJ. 2002. The chequered history of the development and use of simultaneous equations for the accurate determination of chlorophylls a and b. Photosyn Res. 73:149–156.

Protasio AV, Tsai IJ, Babbage A, Nichol S, Hunt M, Aslett MA, De Silva N, Velarde GS, Anderson TJC, Clark RC, et al. 2012. A systematically improved high quality genome and transcriptome of the human blood fluke Schistosoma mansoni. PLoS Negl Trop Dis. 6:e1455.

Randle CP, Wolfe AD. 2005. The evolution and expression of rbcL in holoparasitic sister-genera Harveya and Hyobanche (Orobanchaceae). Am J Bot. 92:1575–1585.

Ranwez V, Harispe S, Delsuc F, Douzery EJP. 2011. MACSE: Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons. PLoS One 6:e22594.

Sanderson MJ. 1997. A nonparametric approach to estimating divergence times in the absence of rate constancy. Mol Biol Evol. 14:1218–1231.

Smith DR, Lee RW. 2014. A plastid without a genome: evidence from the nonphotosynthetic green algal genus Polytomella. Plant Physiol. 164:1812–1819.

Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690.

Straub SCK, Cronn RC, Edwards C, Fishbein M, Liston A. 2013. Horizontal transfer of DNA from the mitochondrial to the plastid genome and its subsequent evolution in milkweeds (Apocynaceae). Genome Biol Evol. 5:1872–1885.

Straub SCK, Fishbein M, Livshultz T, Foster Z, Parks M, Weitemier K, Cronn RC, Liston A. 2011. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing. BMC Genomics 12:211.

Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. 2013. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol. 30:2725–2729.

Tsai IJ, Zarowiecki M, Holroyd N, Garciarrubio A, Sanchez-Flores A, Brooks KL, Tracey A, Bobes RJ, Fragoso G, Sciutto E, et al. 2013. The genomes of four tapeworm species reveal adaptations to parasitism. Nature 496:57–63.

Umate P, Schwenkert S, Karbat I, Dal Bosco C, Mlcochova L, Volz S, Zer H, Herrmann RG, Ohad I, Meurer J. 2007. Deletion of psbM in tobacco alters the Q(B) site properties and the electron flow within photosystem II. J Biol Chem. 282:9758–9767.

Vaidya G, Lohman DJ, Meier R. 2011. SequenceMatrix: concatenation software for the fast assembly of multi-gene datasets with character set and codon information. Cladistics 27:171–180.

Westwood JH, Yoder JI, Timko MP, dePamphilis CW. 2010. The evolution of parasitism in plants. Trends Plant Sci. 15:227–235.

Wicke S, Müller KF, dePamphilis CW, Quandt D, Wickett NJ, Zhang Y, Renner SS, Schneeweiss GM. 2013. Mechanisms of functional and physical genome reduction in photosynthetic and nonphotosynthetic parasitic plants of the broomrape family. Plant Cell 25:3711–3725.

Wicke S, Schäferhoff B, dePamphilis CW, Müller KF. 2014. Disproportional plastome-wide increase of substitution rates and relaxed purifying selection in genes of carnivorous Lentibulariaceae. Mol Biol Evol. 31:529–545.

Wicke S, Schneeweiss GM, dePamphilis CW, Müller KF, Quandt D. 2011. The evolution of the plastid chromosome in land plants: gene content, gene order, gene function. Plant Mol Biol. 76:273–297.

Wickett NJ, Honaas LA, Wafula EK, Das M, Huang K, Wu BA, Landherr L, Timko MP, Yoder J, Westwood JH, et al. 2011. Transcriptomes of the parasitic plant family Orobanchaceae reveal surprising conservation of chlorophyll synthesis. Curr Biol. 21:2098–2104.

Wimpee CF, Wrobel RL, Garvin DK. 1991. A divergent plastid genome in Conopholis americana, an achlorophyllous parasitic plant. Plant Mol Biol. 17:161–166.

Wolfe AD, dePamphilis CW. 1998. The effect of relaxed functional constraints on the photosynthetic gene rbcL in photosynthetic and nonphotosynthetic parasitic plants. Mol Biol Evol. 15:1243–1258.

Wolfe KH, Morden CW, Palmer JD. 1992. Function and evolution of a minimal plastid genome from a nonphotosynthetic parasitic plant. Proc Natl Acad Sci U S A. 89:10648–10652.

Wu FH, Chan MT, Liao DC, Hsu CT, Lee YW, Daniell H, Duvall MR, Lin CS. 2010. Complete chloroplast genome of Oncidium Gower Ramsey and evaluation of molecular markers for identification and breeding in Oncidiinae. BMC Plant Biol. 10:68.

Wyman SK, Jansen RK, Boore JL. 2004. Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20:3252–3255.

Yang JB, Tang M, Li HT, Zhang ZR, Li DZ. 2013. Complete chloroplast genome of the genus Cymbidium: lights into the species identification, phylogenetic implications and population genetic analyses. BMC Evol Biol. 13:84.

Yang ZH. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci. 13:555–556.

Yang ZH. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 24:1586–1591.

Yoder OC, Turgeon BG. 2001. Fungal genomics and pathogenicity. Curr Opin Plant Biol. 4:315–321.

Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18:821–829.

Zimmer K, Meyer C, Gebauer G. 2008. The ectomycorrhizal specialist orchid Corallorhiza trifida is a partial myco-heterotroph. New Phytol. 178:395–400.

3112

Barrett et al doi101093molbevmsu252 MBE by guest on July 7 2016

httpmbeoxfordjournalsorg

Dow

nloaded from

submission to GenBank A large inversion in Co maculata varmaculata was manually reversed for the purposes of phylo-genetic analyses and other whole-plastome comparisons butthe plastome for that accession was annotated with the in-version as assembled de novo Plastome maps were drawnwith GenomeVx (Conant and Wolfe 2008) and manuallyedited in Adobe Illustrator CS6 (Adobe Inc San Jose CA)

Ribosomal DNA Assembly and Annotation

Nuclear ribosomal DNA reads were filtered by mapping readsfrom Co odontorhiza to a nearly complete rDNA operonreference (partial ETS [External Transcribed Spacer] 18 SrDNA ITS1 [Internal Transcribed Spacer] 58 S rDNA ITS226 S rDNA) from A syriaca described above as well as map-ping to an ITS-spanning sequence from Co odontorhiza fromGenBank (accession number EU391326) Other regions werenot assembled (eg 5 S rDNA) Resulting reads that mappedto each reference were assembled de novo in Geneious andresulting contigs were merged through overlap inSequencher This draft assembly was used for a secondround of read mapping (original Co odontorhiza reads) andde novo assembly in Geneious resulting in a single consensuscontig The consensus contig was based on 50 majority persite bearing in mind that the rDNA operon exists in numer-ous-repeated copies that often differ from one another ap-pearing as polymorphism in read pileups The Co odontorhizarDNA consensus operon was then used as a reference in readmapping for all other Corallorhiza species followed by indi-vidual de novo assemblies as above

Nearly complete ribosomal DNA operons (partial ETS, 18S rDNA, ITS1, 5.8S, ITS2, 26S rDNA; excluding 5S rDNA) were annotated by first aligning finished contigs in Geneious (using the MUSCLE v3.8.31 plugin in Geneious; Edgar 2004) and transferring the annotation features from Co. odontorhiza to the aligned Corallorhiza sequences. Then, the entire alignment was annotated in SEQUIN for submission to GenBank by adding each feature to Co. odontorhiza and subsequently propagating the features to the remaining sequences. Reads were again mapped in Geneious to each finished, annotated rDNA cistron to check for misassemblies. GenBank accession numbers for rDNA partial operons are KM390003–KM390012. Available outgroup sequences for Cymbidium, Oncidium, and Phalaenopsis were mined from GenBank for the purposes of downstream phylogenetic analyses (18S rDNA: Cy. farberi, GenBank accession number JN418934; ITS: Cy. aloifolium, JF729014; Phalaenopsis aphrodite, AY391543; O. andradeanum, FJ565598).

Data Matrix Assembly and Sequence Alignment

Annotated plastomes were imported into Geneious, with subsequent removal of the second copy of the IR (IRa), and aligned with MAFFT v7.157 (Katoh et al. 2002) using the “fast-progressive” method (gap opening penalty = 6, offset = 0.1), followed by manual adjustment. The alignment also included complete plastomes of Cy. aloifolium (GenBank accession number NC_021429), Oncidium Gower Ramsey (GenBank accession number NC_014056), and Phalaenopsis aphrodite (GenBank accession number NC_017049). A short inversion of the petN-psbM region in Cymbidium was removed for that accession in downstream analyses. Annotated CDS were exported from Geneious, and data from each locus were saved as separate FASTA files. Each file was submitted for codon-based alignment to the MACSE web server (Ranwez et al. 2011; http://mbb.univ-montp2.fr/MBB/subsection/softExec.php?soft=macse, last accessed May 30, 2014), which allows alignment of both intact open reading frames and incomplete sequences (including pseudogenes with modified reading frames). Alignments were adjusted manually when necessary and imported back into Geneious or SequenceMatrix (Vaidya et al. 2011), where they were concatenated into larger matrices. Protein-coding genes that contained pseudogenes (shifted reading frames, internal stop codons) in at least one taxon were excluded. Data matrices are deposited in supplementary fig. S6, Supplementary Material online.

Phylogenetic Analyses

Phylogenetic analyses were conducted using both parsimony and ML. Parsimony searches were conducted in TNT (Goloboff et al. 2003, 2008), with 100 random addition starting sequences, saving 100 trees per replicate, and using tree bisection–reconnection branch swapping on the pool of saved trees. Support was assessed through 2,000 jackknife pseudoreplicates with the above search parameters. Model fit for ML analyses was assessed for the complete plastome alignment in MEGA v6 (Tamura et al. 2013), which yielded GTR+G as the best-fit model based on the corrected Akaike Information Criterion (AICc) metric (Burnham and Anderson 2002). All ML analyses were conducted in RAxML v8.0.20 (Stamatakis 2006). Three separate searches were initiated for each matrix analyzed, from different random starting seeds, using an unpartitioned GTR+G model to ensure convergence on the same topology. A codon-partitioned model was not chosen because relaxed functional constraints among some coding regions for some taxa could represent model violations. Support was assessed with 1,000 standard bootstrap pseudoreplicates. Cymbidium was chosen as the outgroup taxon.
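For readers unfamiliar with the AICc criterion used in model selection, the calculation can be sketched as follows. This is a generic illustration of the standard small-sample correction, not the MEGA implementation; the log-likelihoods, parameter counts, and sample size below are invented placeholders and are not results from this study.

```python
def aicc(log_likelihood, k, n):
    """Corrected Akaike Information Criterion.

    log_likelihood: maximized log-likelihood of the model
    k: number of free parameters
    n: sample size (for alignments, often taken as the number of sites)
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n <= k + 1")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


# Hypothetical candidates: (log-likelihood, number of free parameters).
candidates = {
    "JC": (-10500.0, 1),
    "HKY+G": (-10200.0, 6),
    "GTR+G": (-10150.0, 10),
}
n_sites = 1000  # placeholder sample size

scores = {name: aicc(lnl, k, n_sites) for name, (lnl, k) in candidates.items()}
best = min(scores, key=scores.get)  # lowest AICc indicates the preferred model
print(best, round(scores[best], 2))
```

The model with the lowest AICc is preferred; the correction term penalizes parameter-rich models more strongly when the sample size is small relative to the number of parameters.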

Plastid Pseudogenes, Genomic Deletions, and Genomic Attributes

The presence of pseudogenes and whole gene/regional deletions was initially assessed visually in DOGMA by identifying interrupted BLASTX reading frames. All coding loci aligned in MACSE (see above) were imported into Geneious, and sequences were translated to assess pseudogene status through the presence of frame shifts and/or premature stop codons. Loci with interrupted reading frames were considered to be pseudogenes unless there was an alternative start or stop codon within three codon positions of the other taxa in the alignment. In addition, the alignment of whole, annotated plastomes (described above) was used to visualize any larger genomic deletions spanning CDS boundaries, or rearrangements.
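The reading-frame screen described above can be approximated programmatically. The following is a minimal sketch under simplifying assumptions: it takes a single coding sequence with alignment gaps written as "-", uses the three standard stop codons, and treats a length that is not a multiple of three as a crude proxy for a frame shift. It does not implement the rule about alternative start or stop codons within three codon positions, and the function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reading_frame_problems(cds):
    """Return a list of reasons why a coding sequence looks interrupted.

    An empty list means no problem was detected by this simple screen.
    """
    seq = cds.upper().replace("-", "")  # drop alignment gaps
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length not a multiple of three (possible frame shift)")
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # The final codon is expected to be a stop, so only earlier codons count.
    for position, codon in enumerate(codons[:-1], start=1):
        if codon in STOP_CODONS:
            problems.append(f"premature stop codon at codon {position}")
            break
    return problems


# Made-up sequences for illustration only.
print(reading_frame_problems("ATGAAATTTTAA"))     # intact reading frame
print(reading_frame_problems("ATGTAAAAATTTTAA"))  # internal stop codon
print(reading_frame_problems("ATGAAATTTTA"))      # length not divisible by three
```

A screen of this kind only flags candidates; manual inspection of the alignment, as described in the text, is still needed before a locus is scored as a pseudogene.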


Hypotheses of differences in plastid genomic attributes among nongreen versus green taxa, and correlations between chlorophyll content and genomic attributes, were assessed among taxa using two phylogeny-based analytical approaches. First, a Phylogenetic Analysis of Variance (Garland et al. 1993) was used to test for differences in continuous plastome variables (e.g., length, GC content) among green versus nongreen taxa (i.e., a discrete grouping variable), taking phylogenetic relationships into account. Analyses were run in the R package “geiger” (Harmon et al. 2008) with 10,000 simulations each, using the “aov.phylo” function, based on an ultrametric tree generated through Nonparametric Rate Smoothing (Sanderson 1997). These analyses were run for a complete data set of 13 taxa (three leafy, green outgroup taxa and ten species of Corallorhiza). A second set of analyses was conducted on a subset of six taxa of Corallorhiza for which chlorophyll data were available, to mitigate problems with missing chlorophyll data. In these analyses, continuous genomic variables (length, number of putatively functional genes, GC content) were correlated with chlorophyll content (continuous) using Phylogenetically Independent Contrasts to account for phylogenetic relatedness among taxa (Felsenstein 1985). The latter analyses were conducted under a Brownian motion model with the PDAP package (Midford et al. 2005) in Mesquite v2.75 (Maddison WP and Maddison DR 2011).
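The logic of Phylogenetically Independent Contrasts under Brownian motion (Felsenstein 1985) can be sketched in a few lines. This is not the PDAP implementation: the nested-tuple tree encoding, the function names, the four-taxon tree, and all trait values are hypothetical and serve only to show how standardized contrasts are computed and then correlated through the origin.

```python
import math


def contrasts(node, trait):
    """Return (value, adjusted branch length, contrasts) for a subtree.

    A tip is encoded as (name, branch_length); an internal node as
    (left_subtree, right_subtree, branch_length). Trees must be bifurcating.
    """
    if len(node) == 2:  # tip
        name, length = node
        return trait[name], length, []
    left, right, length = node
    x1, v1, c1 = contrasts(left, trait)
    x2, v2, c2 = contrasts(right, trait)
    standardized = (x1 - x2) / math.sqrt(v1 + v2)
    # Ancestral value: average of the daughters weighted by 1 / branch length.
    ancestral = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
    # Lengthen the subtending branch to reflect uncertainty in that estimate.
    extended = length + v1 * v2 / (v1 + v2)
    return ancestral, extended, c1 + c2 + [standardized]


def pic_correlation(tree, trait_x, trait_y):
    """Correlation of two traits' contrasts, forced through the origin."""
    cx = contrasts(tree, trait_x)[2]
    cy = contrasts(tree, trait_y)[2]
    numerator = sum(a * b for a, b in zip(cx, cy))
    denominator = math.sqrt(sum(a * a for a in cx) * sum(b * b for b in cy))
    return numerator / denominator


# Hypothetical ultrametric tree ((A,B),(C,D)) and invented trait values.
tree = ((("A", 1.0), ("B", 1.0), 1.0), (("C", 1.0), ("D", 1.0), 1.0), 0.0)
chlorophyll = {"A": 1.0, "B": 2.0, "C": 4.0, "D": 5.0}
plastome_length = {name: 2.0 * value + 1.0 for name, value in chlorophyll.items()}
print(pic_correlation(tree, chlorophyll, plastome_length))
```

A tree with n tips yields n - 1 contrasts, which are treated as independent data points; this is why the six-taxon subset described above provides only five contrasts per variable.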

Model-Based Analyses of Selective Regime

Changes in selective regime among species of Corallorhiza for codon-based matrices were assessed with the CODEML module in PAML v4.8 (Yang 1997, 2007) using “branch” models. Only loci for which all taxa had open reading frames were included in the analyses. Various matrices are listed in table 3. The objective was to test hypotheses of different ω ratios for 1) Corallorhiza versus leafy outgroups, and 2) nongreen Corallorhiza versus green Corallorhiza + leafy outgroups. To accomplish this, different branch models were fit to the data using the topology resulting from the ML analysis of whole plastomes + rDNA. Six different branch models were tested. First, a model in which all branches evolve under one ω ratio (Model M0) was tested, where ω = dN/dS, the ratio of nonsynonymous substitutions per nonsynonymous site to synonymous substitutions per synonymous site. A two-ratio model (M1) was then run in which all branches of Corallorhiza were allowed to have a different ratio (ω_cor) than the remaining leafy outgroup sequences (ω_leafy). A three-ratio model was then run (M2), allowing separate ω values for ω_leafy, ω_cor, and nongreen Corallorhiza (ω_str = ω_mac-ng; Co. striata; Co. mertensiana, Co. maculata var. maculata and occidentalis, respectively), followed by a four-ratio model (M3) with additionally separate ratios for Co. striata (ω_str) and the nongreen members of the Co. maculata complex (ω_mac-ng). The latter three models (M1–M3) specifically test hypotheses of whether Corallorhiza, and particularly nongreen Corallorhiza, display departures from purifying selection. Two additional models were run to test whether the nongreen Corallorhiza differ significantly from green Corallorhiza and leafy outgroup taxa. Specifically, Model M4 (two ratios) sets ω_leafy = ω_cor and ω_str = ω_mac-ng, whereas M5 (three ratios) allows ω_str and ω_mac-ng to differ. Successively nested models were compared based on log-likelihood ratio tests (M1 vs. M0, M2 vs. M1, M3 vs. M2, M4 vs. M0, M5 vs. M4), where the likelihood ratio test statistic was calculated as two times the difference in log-likelihood between the two models, with degrees of freedom calculated as the difference in the number of free parameters for the models, and tested against the χ² distribution. P values for multiple comparisons to the same data set were Bonferroni-corrected (Dunn 1959), as suggested by Yang (2007), with the corrected significance level corresponding to α/m, where m is the number of branches being tested. Genes ycf1 and ycf2 were analyzed separately because they have extraordinarily high substitution rates (e.g., Barrett et al. 2013), as was matK, based on the finding of a pseudogenized copy in a previous study (Freudenstein and Senyo 2008).
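The nested-model comparison described above reduces to a simple calculation once the log-likelihoods are in hand. The sketch below is not part of PAML; it implements the likelihood ratio test with a closed-form upper-tail χ² probability for integer degrees of freedom and applies a Bonferroni-adjusted threshold of α/m. The function names are hypothetical, and the log-likelihood values are invented and do not correspond to any result in table 3.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: normal tail term plus a finite series in sqrt(x).
    total, term = 0.0, math.sqrt(x)
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    tail = math.erfc(math.sqrt(x / 2))
    return tail + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total


def likelihood_ratio_test(lnl_null, lnl_alt, df, alpha=0.05, m=1):
    """Return (statistic, p value, significant at the Bonferroni level alpha/m)."""
    statistic = 2 * (lnl_alt - lnl_null)
    p_value = chi2_sf(statistic, df)
    return statistic, p_value, p_value < alpha / m


# Invented log-likelihoods for a one-ratio versus a two-ratio comparison,
# which differ by one free parameter; m = 4 is a placeholder.
statistic, p_value, significant = likelihood_ratio_test(-25010.4, -25004.1, df=1, m=4)
print(round(statistic, 2), p_value, significant)
```

The same function applies to the branch-site comparison with 1 degree of freedom; only the log-likelihoods and the value of m change.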

To further investigate the ω values among branches for the atp complex, two sets of branch-site models were tested against the data. In the first, all Corallorhiza were set as the foreground branch, and in the second, nongreen Corallorhiza (Co. striata vreelandii, Co. maculata var. maculata and occidentalis, and Co. mertensiana) were set as foreground branches. For both sets of branch-site models, two alternative models were tested. The “null” model specifies 0 < ω_0 < 1 for a proportion of sites and ω_1 = ω_2 = 1 for the remaining sites (no positive selection), whereas the alternative model allows ω_2 > 1 (positive selection on some sites). The alternative model (some sites under positive selection) was then tested against the null model (no sites under positive selection) using a χ² distribution as above, with 1 degree of freedom (i.e., the alternative and null models differ by one parameter).

Comparison with Other Parasitic Angiosperm Lineages

Plastid gene content (losses, pseudogenes, putatively functional genes) was compared among diverse lineages of angiosperms containing heterotrophs, including newly sequenced Corallorhiza. Taxa and genes were sorted by the numbers of putatively functional genes and gene functional class, respectively, to illustrate the different stages of plastid genome degradation. Specifically, genes were arranged according to the model of Barrett and Davis (2012), in which degradation of plastid functional gene classes is hypothesized to have occurred roughly in the following order: 1) NAD(P)H dehydrogenase genes (ndh); 2) “photosynthesis”-related genes, including pet, psa, psb, cemA, ccsA, rbcL, ycf3, and ycf4; 3) Plastid-encoded RNA Polymerase genes (rpo); 4) ATP synthase genes (atp); and finally 5) “housekeeping” genes (rpl, rps, rrn, trn, matK, clpP, infA, accD, ycf1, and ycf2). Though not a formal test in the strict sense, the use of this model does allow an assessment of how advanced the sequenced accessions of Corallorhiza are when compared with other heterotrophic angiosperms; it is hypothesized that Corallorhiza occupies the early stages of this model. Technically, atp genes function in the generation of ATP and movement of protons across the thylakoid membrane, and thus play a crucial role in photosynthesis, but this functional class was kept separate from photosynthesis genes due to the possibility that ATP synthase serves roles outside of photosynthesis.
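The ordering of genes by functional class can be expressed as a small lookup. The sketch below encodes the five classes listed above in their hypothesized order of degradation; the function names and the example gene list are illustrative assumptions, and the classification rests only on the gene names and prefixes given in the text.

```python
# Classes in the hypothesized order of degradation (Barrett and Davis 2012).
CLASS_ORDER = ["ndh", "photosynthesis", "rpo", "atp", "housekeeping"]

PHOTOSYNTHESIS_GENES = {"cemA", "ccsA", "rbcL", "ycf3", "ycf4"}
HOUSEKEEPING_GENES = {"matK", "clpP", "infA", "accD", "ycf1", "ycf2"}


def functional_class(gene):
    """Assign a plastid gene name to one of the five functional classes."""
    if gene.startswith("ndh"):
        return "ndh"
    if gene.startswith(("pet", "psa", "psb")) or gene in PHOTOSYNTHESIS_GENES:
        return "photosynthesis"
    if gene.startswith("rpo"):
        return "rpo"
    if gene.startswith("atp"):
        return "atp"
    if gene.startswith(("rpl", "rps", "rrn", "trn")) or gene in HOUSEKEEPING_GENES:
        return "housekeeping"
    raise ValueError(f"gene not covered by the model: {gene}")


def sort_by_degradation_order(genes):
    """Sort genes by functional class, then alphabetically within a class."""
    return sorted(genes, key=lambda g: (CLASS_ORDER.index(functional_class(g)), g))


# Arbitrary example list of plastid genes.
genes = ["rps12", "atpA", "rbcL", "ndhF", "rpoB", "matK", "psbA"]
print(sort_by_degradation_order(genes))
```

Sorting the columns of a gene-content matrix this way, and the rows by the number of putatively functional genes, gives the kind of staged layout described in the text.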

Supplementary Material

Supplementary figures S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by the National Science Foundation (awards DEB-0830020 to Jerrold I. Davis and DBI 1110443 to J.C.P.), the California State University Program for Education and Research in Biotechnology (CSUPERB) to C.F.B., the California State University, Los Angeles Center for Effective Teaching and Learning Faculty-Student Mentorship award to C.F.B. and J.L., and the California State University Council on Ocean Affairs, Science, and Technology to C.S. The authors are grateful to Jerrold Davis, Patrick Edger, Ryan Ellingson, Gwynne Lim, Jeffrey Morawetz, and Chris Randle for feedback and laboratory/bioinformatics support; the staff at Cold Spring Harbor Laboratory (Woodbury, NY) for sample preparation and sequencing; Daniel Lima and Richard Sayre for assistance with chlorophyll spectroscopy and data interpretation; Gerardo Salazar (UNAM) for assistance in collecting Mexican specimens; and Maria Logacheva and three anonymous reviewers for critical insight that improved the paper.

References

Allen JF, de Paula WBM, Puthiyaveetil S, Nield J. 2011. A structural phylogenetic map for chloroplast photosynthesis. Trends Plant Sci. 16:645–655.

Barbrook AC, Howe CJ, Purton S. 2006. Why are plastid genomes retained in non-photosynthetic organisms? Trends Plant Sci. 11:101–108.

Barrett CF, Davis JI. 2012. The plastid genome of the mycoheterotrophic Corallorhiza striata (Orchidaceae) is in the relatively early stages of degradation. Am J Bot. 99:1513–1523.

Barrett CF, Davis JI, Leebens-Mack J, Conran JG, Stevenson DW. 2013. Plastid genomes and deep relationships among the commelinid monocot angiosperms. Cladistics 29:65–87.

Barrett CF, Freudenstein JV. 2008. Molecular evolution of rbcL in the mycoheterotrophic coralroot orchids (Corallorhiza Gagnebin, Orchidaceae). Mol Phylogenet Evol. 47:665–679.

Bellino A, Alfani A, Selosse M-A, Guerrieri R, Borghetti M, Baldantoni D. 2014. Nutritional regulation in mixotrophic plants: new insights from Limodorum abortivum. Oecologia 175:875–885.

Bernardi G. 1989. The isochore organization of the human genome. Annu Rev Genet. 23:637–661.

Bidartondo MI. 2005. The evolutionary ecology of myco-heterotrophy. New Phytol. 167:335–352.

Blazier J, Guisinger MM, Jansen RK. 2011. Recent loss of plastid-encoded ndh genes within Erodium (Geraniaceae). Plant Mol Biol. 76:263–272.

Braukmann T, Kuzmina M, Stefanovic S. 2013. Plastid genome evolution across the genus Cuscuta (Convolvulaceae): two clades within subgenus Grammica exhibit extensive gene loss. J Exp Bot. 64:977–989.

Braukmann T, Stefanovic S. 2012. Plastid genome evolution in mycoheterotrophic Ericaceae. Plant Mol Biol. 79:5–20.

Braukmann TWA, Kuzmina M, Stefanovic S. 2009. Loss of all plastid ndh genes in Gnetales and conifers: extent and evolutionary significance for the seed plant phylogeny. Curr Genet. 55:323–337.

Burnham KP, Anderson DR. 2002. Avoiding pitfalls when using information-theoretic methods. J Wildl Manage. 66:912–918.

Cameron DD, Preiss K, Gebauer G, Read DJ. 2009. The chlorophyll-containing orchid Corallorhiza trifida derives little carbon through photosynthesis. New Phytol. 183:358–364.

Chang CC, Lin HC, Lin IP, Chow TY, Chen HH, Chen WH, Cheng CH, Lin CY, Liu SM, Chang CC, et al. 2006. The chloroplast genome of Phalaenopsis aphrodite (Orchidaceae): comparative analysis of evolutionary rate with that of grasses and its phylogenetic implications. Mol Biol Evol. 23:279–291.

Conant GC, Wolfe KH. 2008. GenomeVx: simple web-based creation of editable circular chromosome maps. Bioinformatics 24:861–862.

Cummings MP, Welschmeyer NA. 1998. Pigment composition of putatively achlorophyllous angiosperms. Plant Syst Evol. 210:105–111.

Delannoy E, Fujii S, Colas des Francs-Small C, Brundrett M, Small I. 2011. Rampant gene loss in the underground orchid Rhizanthella gardneri highlights evolutionary constraints on plastid genomes. Mol Biol Evol. 28:2077–2086.

Doyle JJ, Doyle JL. 1987. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem Bull. 19:11–15.

Dunn OJ. 1959. Confidence intervals for the means of dependent, normally distributed variables. J Am Stat Assoc. 54:613–621.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Evron Y, Johnson EA, McCarty RE. 2000. Regulation of proton flow and ATP synthesis in chloroplasts. J Bioenerg Biomembr. 32:501–506.

Fang YJ, Wu H, Zhang TW, Yang M, Yin YX, Pan LL, Yu XG, Zhang XW, Hu SNA, Al-Mssallem IS, et al. 2012. A complete sequence and transcriptomic analyses of date palm (Phoenix dactylifera L.) mitochondrial genome. PLoS One 7:e37164.

Felsenstein J. 1985. Phylogenies and the comparative method. Am Nat. 125:1–15.

Freudenstein JV. 1992. Systematics of Corallorhiza and the Corallorhizinae (Orchidaceae) [PhD dissertation]. [Ithaca (NY)]: Cornell University.

Freudenstein JV. 1997. A monograph of Corallorhiza (Orchidaceae). Harv Pap Bot. 10:5–51.

Freudenstein JV, Barrett CF. 2010. Mycoheterotrophy and diversity in Orchidaceae. In: Seberg O, Petersen G, Barfod A, Davis JI, editors. Diversity, phylogeny and evolution in the monocotyledons. The Proceedings of the Fourth International Conference on Monocot Systematics. Aarhus (Denmark): Aarhus University Press. p. 25–37.

Freudenstein JV, Doyle JJ. 1994a. Character transformation and relationships in Corallorhiza (Orchidaceae: Epidendroideae). 1. Plastid DNA. Am J Bot. 81:1449–1457.

Freudenstein JV, Doyle JJ. 1994b. Plastid DNA, morphological variation, and the phylogenetic species concept: the Corallorhiza maculata (Orchidaceae) complex. Syst Bot. 19:273–290.

Freudenstein JV, Senyo DM. 2008. Relationships and evolution of matK in a group of leafless orchids (Corallorhiza and Corallorhizinae; Orchidaceae: Epidendroideae). Am J Bot. 95:498–505.

Funk HT, Berg S, Krupinska K, Maier UG, Krause K. 2007. Complete DNA sequences of the plastid genomes of two parasitic flowering plant species, Cuscuta reflexa and Cuscuta gronovii. BMC Plant Biol. 7:45.

Galagan JE, Calvo SE, Borkovich KA, Selker EU, Read ND, Jaffe D, FitzHugh W, Ma LJ, Smirnov S, Purcell S, et al. 2003. The genome sequence of the filamentous fungus Neurospora crassa. Nature 422:859–868.

Garland T, Dickerman AW, Janis CM, Jones JA. 1993. Phylogenetic analysis of covariance by computer simulation. Syst Biol. 42:265–292.

Glemin S, Clement Y, David J, Ressayre A. 2014. GC content evolution in coding regions of angiosperm genomes: a unifying hypothesis. Trends Genet. 30:263–270.

Goloboff PA, Farris JS, Kallersjo M, Oxelman B, Ramirez MJ, Szumik CA. 2003. Improvements to resampling measures of group support. Cladistics 19:324–332.

Goloboff PA, Farris JS, Nixon KC. 2008. TNT, a free program for phylogenetic analysis. Cladistics 24:774–786.


Hajdukiewicz PTJ, Allison LA, Maliga P. 1997. The two RNA polymerases encoded by the nuclear and the plastid compartments transcribe distinct groups of genes in tobacco plastids. EMBO J. 16:4041–4048.

Harmon LJ, Weir JT, Brock CD, Glor RE, Challenger W. 2008. GEIGER: investigating evolutionary radiations. Bioinformatics 24:129–131.

Iles WJD, Smith SY, Graham SW. 2013. A well-supported phylogenetic framework for the monocot order Alismatales reveals multiple losses of the plastid NADH dehydrogenase complex and a strong long-branch effect. In: Wilkin P, Mayo S, editors. Early events in monocot evolution. Cambridge: Cambridge University Press. p. 1–28.

Iorizzo M, Grzebelus D, Senalik D, Szklarczyk M, Spooner D, Simon P. 2012. Against the traffic: the first evidence for mitochondrial DNA transfer into the plastid genome. Mob Genet Elements. 2:261.

Jensen PE, Bassi R, Boekema EJ, Dekker JP, Jansson S, Leister D, Robinson C, Scheller HV. 2007. Structure, function and regulation of plant photosystem I. Biochim Biophys Acta. 1767:335–352.

Katinka MD, Duprat S, Cornillot E, Metenier G, Thomarat F, Prensier G, Barbe V, Peyretaillade E, Brottier P, Wincker P, et al. 2001. Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature 414:450–453.

Katoh K, Misawa K, Kuma Ki, Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30:3059–3066.

Kawakami K, Umena Y, Iwai M, Kawabata Y, Ikeuchi M, Kamiya N, Shen J-R. 2011. Roles of PsbI and PsbM in photosystem II dimer formation and stability studied by deletion mutagenesis and X-ray crystallography. Biochim Biophys Acta. 1807:319–325.

Knaus BJ. 2010. Short read toolbox. Available from: http://brianknaus.com

Krause K. 2008. From chloroplasts to “cryptic” plastids: evolution of plastid genomes in parasitic plants. Curr Genet. 54:111–121.

Krause K. 2012. Plastid genomes of parasitic plants: a trail of reductions and losses. In: Bullerwell C, editor. Organelle genetics. Berlin (Germany): Springer. p. 79–103.

Kuijt J. 1969. The biology of flowering parasitic plants. Berkeley (CA): University of California Press.

Lawrence JG. 2005. Common themes in the genome strategies of pathogens. Curr Opin Genet Dev. 15:584–588.

Leake JR. 1994. The biology of myco-heterotrophic (Saprophytic) plants. New Phytol. 127:171–216.

Leake JR, Cameron DD. 2010. Physiological ecology of mycoheterotrophy. New Phytol. 185:601–605.

Liere K, Weihe A, Borner T. 2011. The transcription machineries of plant mitochondria and chloroplasts: composition, function, and regulation. J Plant Physiol. 168:1345–1360.

Logacheva MD, Schelkunov MI, Nuraliev MS, Samigullin TH, Penin AA. 2014. The plastid genome of mycoheterotrophic monocot Petrosavia stellaris exhibits both gene losses and multiple rearrangements. Genome Biol Evol. 6:238–246.

Logacheva MD, Schelkunov MI, Penin AA. 2011. Sequencing and analysis of plastid genome in mycoheterotrophic orchid Neottia nidus-avis. Genome Biol Evol. 3:1296–1303.

Maddison WP, Maddison DR. 2011. Mesquite: a modular system for evolutionary analysis, version 2.75. Available from: http://mesquiteproject.org

Martín M, Funk HT, Serrot PH, Poltnigg P, Sabater B. 2009. Functional characterization of the thylakoid ndh complex phosphorylation by site-directed mutations in the ndhF gene. Biochim Biophys Acta. 1787:920–928.

Martín M, Sabater B. 2010. Plastid ndh genes in plant evolution. Plant Physiol Biochem. 48:636–645.

McCarty RE, Evron Y, Johnson EA. 2000. The chloroplast ATP synthase: a rotary enzyme? Annu Rev Plant Physiol Plant Mol Biol. 51:83–109.

McNeal JR, Arumugunathan K, Kuehl JV, Boore JL, Depamphilis CW. 2007. Systematics and plastid genome evolution of the cryptically photosynthetic parasitic plant genus Cuscuta (Convolvulaceae). BMC Biol. 5:55.

McNeal JR, Bennett JR, Wolfe AD, Mathews S. 2013. Phylogeny and origins of holoparasitism in Orobanchaceae. Am J Bot. 100:971–983.

Merckx V, Freudenstein JV. 2010. Evolution of mycoheterotrophy in plants: a phylogenetic perspective. New Phytol. 185:605–609.

Midford PE, Garland T Jr, Maddison WP. 2005. PDAP package of Mesquite, version 1.07. Available from: http://mesquiteproject.org

Molina J, Hazzouri KM, Nickrent D, Geisler M, Meyer RS, Pentony MM, Flowers JM, Pelser P, Barcelona J, Inovejas SA, et al. 2014. Possible loss of the chloroplast genome in the parasitic flowering plant Rafflesia lagascae (Rafflesiaceae). Mol Biol Evol. 31:793–803.

Montfort C, Küsters E. 1940. Saprophytismus und Photosynthese. I. Biochemische physiologische Studien an Humus-Orchideen. Botanisches Archiv 40:571–633.

Moran NA. 2002. Microbial minimalism: genome reduction in bacterial pathogens. Cell 108:583–586.

Morrison HG, McArthur AG, Gillin FD, Aley SB, Adam RD, Olsen GJ, Best AA, Cande WZ, Chen F, Cipriano MJ, et al. 2007. Genomic minimalism in the early diverging intestinal parasite Giardia lamblia. Science 317:1921–1926.

Nickrent DL, Yan OY, Duff RJ, dePamphilis CW. 1997. Do nonasterid holoparasitic flowering plants have plastid genomes? Plant Mol Biol. 34:717–729.

Patel RK, Jain M. 2012. NGS QC Toolkit: a toolkit for quality control of next generation sequencing data. PLoS One 7:e30619.

Peredo EL, King UM, Les DH. 2013. The plastid genome of Najas flexilis: adaptation to submersed environments is accompanied by the complete loss of the NDH complex in an aquatic angiosperm. PLoS One 8:e68591.

Porra RJ. 2002. The chequered history of the development and use of simultaneous equations for the accurate determination of chlorophylls a and b. Photosyn Res. 73:149–156.

Protasio AV, Tsai IJ, Babbage A, Nichol S, Hunt M, Aslett MA, De Silva N, Velarde GS, Anderson TJC, Clark RC, et al. 2012. A systematically improved high quality genome and transcriptome of the human blood fluke Schistosoma mansoni. PLoS Negl Trop Dis. 6:e1455.

Randle CP, Wolfe AD. 2005. The evolution and expression of rbcL in holoparasitic sister-genera Harveya and Hyobanche (Orobanchaceae). Am J Bot. 92:1575–1585.

Ranwez V, Harispe S, Delsuc F, Douzery EJP. 2011. MACSE: Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons. PLoS One 6:e22594.

Sanderson MJ. 1997. A nonparametric approach to estimating divergence times in the absence of rate constancy. Mol Biol Evol. 14:1218–1231.

Smith DR, Lee RW. 2014. A plastid without a genome: evidence from the nonphotosynthetic green algal genus Polytomella. Plant Physiol. 164:1812–1819.

Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690.

Straub SCK, Cronn RC, Edwards C, Fishbein M, Liston A. 2013. Horizontal transfer of DNA from the mitochondrial to the plastid genome and its subsequent evolution in milkweeds (Apocynaceae). Genome Biol Evol. 5:1872–1885.

Straub SCK, Fishbein M, Livshultz T, Foster Z, Parks M, Weitemier K, Cronn RC, Liston A. 2011. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing. BMC Genomics 12:211.

Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. 2013. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol. 30:2725–2729.

Tsai IJ, Zarowiecki M, Holroyd N, Garciarrubio A, Sanchez-Flores A, Brooks KL, Tracey A, Bobes RJ, Fragoso G, Sciutto E, et al. 2013. The genomes of four tapeworm species reveal adaptations to parasitism. Nature 496:57–63.

Umate P, Schwenkert S, Karbat I, Dal Bosco C, Mlcochova L, Volz S, Zer H, Herrmann RG, Ohad I, Meurer J. 2007. Deletion of psbM in tobacco alters the Q(B) site properties and the electron flow within photosystem II. J Biol Chem. 282:9758–9767.

Vaidya G, Lohman DJ, Meier R. 2011. SequenceMatrix: concatenation software for the fast assembly of multi-gene datasets with character set and codon information. Cladistics 27:171–180.

Westwood JH, Yoder JI, Timko MP, dePamphilis CW. 2010. The evolution of parasitism in plants. Trends Plant Sci. 15:227–235.

Wicke S, Muller KF, de Pamphilis CW, Quandt D, Wickett NJ, Zhang Y, Renner SS, Schneeweiss GM. 2013. Mechanisms of functional and physical genome reduction in photosynthetic and nonphotosynthetic parasitic plants of the broomrape family. Plant Cell 25:3711–3725.

Wicke S, Schäferhoff B, dePamphilis CW, Müller KF. 2014. Disproportional plastome-wide increase of substitution rates and relaxed purifying selection in genes of carnivorous Lentibulariaceae. Mol Biol Evol. 31:529–545.

Wicke S, Schneeweiss GM, dePamphilis CW, Muller KF, Quandt D. 2011. The evolution of the plastid chromosome in land plants: gene content, gene order, gene function. Plant Mol Biol. 76:273–297.

Wickett NJ, Honaas LA, Wafula EK, Das M, Huang K, Wu BA, Landherr L, Timko MP, Yoder J, Westwood JH, et al. 2011. Transcriptomes of the parasitic plant family Orobanchaceae reveal surprising conservation of chlorophyll synthesis. Curr Biol. 21:2098–2104.

Wimpee CF, Wrobel RL, Garvin DK. 1991. A divergent plastid genome in Conopholis americana, an achlorophyllous parasitic plant. Plant Mol Biol. 17:161–166.

Wolfe AD, dePamphilis CW. 1998. The effect of relaxed functional constraints on the photosynthetic gene rbcL in photosynthetic and nonphotosynthetic parasitic plants. Mol Biol Evol. 15:1243–1258.

Wolfe KH, Morden CW, Palmer JD. 1992. Function and evolution of a minimal plastid genome from a nonphotosynthetic parasitic plant. Proc Natl Acad Sci U S A. 89:10648–10652.

Wu FH, Chan MT, Liao DC, Hsu CT, Lee YW, Daniell H, Duvall MR, Lin CS. 2010. Complete chloroplast genome of Oncidium Gower Ramsey and evaluation of molecular markers for identification and breeding in Oncidiinae. BMC Plant Biol. 10:68.

Wyman SK, Jansen RK, Boore JL. 2004. Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20:3252–3255.

Yang JB, Tang M, Li HT, Zhang ZR, Li DZ. 2013. Complete chloroplast genome of the genus Cymbidium: lights into the species identification, phylogenetic implications and population genetic analyses. BMC Evol Biol. 13:84.

Yang ZH. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci. 13:555–556.

Yang ZH. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 24:1586–1591.

Yoder OC, Turgeon BG. 2001. Fungal genomics and pathogenicity. Curr Opin Plant Biol. 4:315–321.

Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18:821–829.

Zimmer K, Meyer C, Gebauer G. 2008. The ectomycorrhizal specialist orchid Corallorhiza trifida is a partial myco-heterotroph. New Phytol. 178:395–400.

3112

Barrett et al doi101093molbevmsu252 MBE by guest on July 7 2016

httpmbeoxfordjournalsorg

Dow

nloaded from

Hypotheses of differences in plastid genomic attributesamong nongreen versus green taxa and correlations betweenchlorophyll content and genomic attributes were assessedamong taxa in using two phylogeny-based analyticalapproaches First a Phylogenetic Analysis of Variance(Garland et al 1993) was used to test for differences in con-tinuous plastome variables (eg length GC content) amonggreen versus nongreen taxa (ie a discrete grouping variable)taking phylogenetic relationships into account Analyses wererun in the R package ldquogeigerrdquo (Harmon et al 2008) with 10000simulations each using the ldquoaovphylordquo function based on anultrametic tree generated through Nonparametric RateSmoothing (Sanderson 1997) These analyses were run for acomplete data set of 13 taxa (three leafy green outgroup taxaand ten species of Corallorhiza) A second set of analyses wasconducted on a subset of six taxa of Corallorhiza for whichchlorophyll data were available to mitigate problems withmissing chlorophyll data In these analyses continuous geno-mic variables (length number of putatively functional genesGC content) were correlated with chlorophyll content (con-tinuous) using Phylogenetically Independent Contrasts toaccount for phylogenetic relatedness among taxa(Felsenstein 1985) The latter analyses were conductedunder a Brownian motion model with the PDAP package(Midford et al 2005) in Mesquite v275 (Maddison WP andMaddison DR 2011)

Model-Based Analyses of Selective Regime

Changes in selective regime among species of Corallorhiza forcodon-based matrices were assessed with the CODEMLmodule in PAML v48 (Yang 1997 2007) using ldquobranchrdquomodels Only loci for which all taxa had open readingframes were included in the analyses Various matrices arelisted in table 3 The objective was to test hypotheses of dif-ferent ratios for 1) Corallorhiza versus leafy outgroups and2) nongreen Corallorhiza versus green Corallorhiza + leafyoutgroups To accomplish this different branch modelswere fit to the data using the topology resulting from theML analysis of whole plastomes + rDNA Six different branchmodels were tested First a model in which all branchesevolve under one ratio (Model M0) was tested where= dNdS the ratio of nonsynonymous substitutions pernonsynonymous site to synonymous substitutions per syn-onymous site A two-ratio model (M1) was then run in whichall branches of Corallorhiza were allowed to have a differentratio (cor) than the remaining leafy outgroups sequences(leafy) A three-ratio model was then run (M2) allowingseparate values for leafy cor and nongreen Corallorhiza(str =mac-ng Co striata Co mertensiana Co maculata varmaculata and occidentalis respectively) followed by a four-ratio model (M3) with additionally separate ratios for Costriata (str) and the nongreen members of the Co maculatacomplex (mac-ng) The latter three models (M1ndashM3) speci-fically test hypotheses of whether Corallorhiza and particu-larly nongreen Corallorhiza display departures from purifyingselection Two additional models were run to test whetherthe nongreen Corallorhiza differ significantly from green

Corallorhiza and leafy outgroup taxa Specifically Model M4(two ratios) sets leafy =cor and str =mac-ng whereas M5(three ratios) allows str and mac-ng to differ Successivelynested models were compared based on log-likelihood ratiotests (M1 vs M0 M2 vs M1 M3 vs M2 M4 vs M0 M5 vsM4) where the likelihood ratio test statistic () was calcu-lated as two times the difference in log-likelihood betweenthe two models with degrees of freedom calculated as thedifference in the number of free parameters for the modelsand tested against the 2 distribution P values for multiplecomparisons to the same data set were Bonferroni-corrected(Dunn 1959) as suggested by Yang (2007) with the correctedsignificance level corresponding to m where m is thenumber of branches being tested Genes ycf1 and ycf2 wereanalyzed separately because they have extraordinarily highsubstitution rates (eg Barrett et al 2013) as was matKbased on the finding of a pseudogenized copy in a previousstudy (Freudenstein and Senyo 2008)

To further investigate the ω values among branches for the atp complex, two sets of branch-site models were tested against the data. In the first, all Corallorhiza were set as the foreground branch, and in the second, nongreen Corallorhiza (Co. striata vreelandii, Co. maculata var. maculata and occidentalis, and Co. mertensiana) were set as foreground branches. For both sets of branch models, two alternative models were tested. The “null” model specifies 0 < ω0 < 1 for a proportion of sites and ω1 = ω2 = 1 for the remaining sites (no positive selection), whereas the alternative model allows ω2 > 1 (positive selection on some sites). The alternative model (some sites under positive selection) was then tested against the null model (no sites under positive selection) using a χ² distribution as above, with 1 degree of freedom (i.e., the alternative and null models differ by one parameter).
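Setting foreground branches amounts to labeling them in the tree file supplied to CODEML. The sketch below appends a “#1” label to selected terminal branches of a Newick topology, following the branch-label convention described in the PAML manual. It is a hedged illustration: the function name, the tip names, and the toy tree are hypothetical, the tree is assumed to carry no branch lengths, and internal branches (which a clade-wide foreground would also need) are not handled.

```python
import re


def mark_foreground(newick, foreground_tips, label="#1"):
    """Append a branch label to the named terminal branches of a Newick string.

    Assumes a topology-only tree (no branch lengths) with unique tip names.
    """
    marked = newick
    for tip in foreground_tips:
        # Match the tip only as a whole token delimited by tree punctuation.
        pattern = r"(?<=[(,])" + re.escape(tip) + r"(?=[,)])"
        marked, n_hits = re.subn(pattern, tip + " " + label, marked)
        if n_hits != 1:
            raise ValueError(f"tip {tip!r} matched {n_hits} times; expected 1")
    return marked


# HYPOTHETICAL toy topology and tip names, for illustration only.
tree = "((Co_striata,Co_trifida),(Co_maculata,Outgroup));"
print(mark_foreground(tree, ["Co_striata", "Co_maculata"]))
```

The exact placement of labels relative to branch lengths, and the labeling of internal branches or whole clades, should be checked against the manual for the PAML version in use.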

Comparison with Other Parasitic Angiosperm Lineages

Plastid gene content (losses, pseudogenes, putatively functional genes) was compared among diverse lineages of angiosperms containing heterotrophs, including newly sequenced Corallorhiza. Taxa and genes were sorted by the numbers of putatively functional genes and gene functional class, respectively, to illustrate the different stages of plastid genome degradation. Specifically, genes were arranged according to the model of Barrett and Davis (2012), in which degradation of plastid functional gene classes is hypothesized to have occurred roughly in the following order: 1) NAD(P)H dehydrogenase genes (ndh); 2) “photosynthesis”-related genes, including pet, psa, psb, cemA, ccsA, rbcL, ycf3, and ycf4; 3) plastid-encoded RNA polymerase genes (rpo); 4) ATP synthase genes (atp); and finally 5) “housekeeping” genes (rpl, rps, rrn, trn, matK, clpP, infA, accD, ycf1, and ycf2). Though not a formal test in the strict sense, the use of this model does allow an assessment of how advanced the sequenced accessions of Corallorhiza are when compared with other heterotrophic angiosperms; it is hypothesized that Corallorhiza occupies the early stages of this model. Technically, atp genes function in the generation of ATP and movement of protons across the thylakoid membrane, and thus play a crucial role in photosynthesis, but this functional class was kept separate from photosynthesis genes due to the possibility that ATP synthase serves roles outside of photosynthesis.
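The sorting scheme described above can be illustrated with a short Python sketch. The functional classes and their order follow the text; everything else is an assumption made for illustration: the function names, the three-state coding (“functional”, “pseudogene”, “lost”), the toy taxa, and the choice to list the most intact plastomes first.

```python
# Functional classes in the hypothesized order of degradation (from the text).
CLASS_ORDER = [
    ("ndh", ("ndh",)),
    ("photosynthesis", ("pet", "psa", "psb", "cemA", "ccsA", "rbcL", "ycf3", "ycf4")),
    ("rpo", ("rpo",)),
    ("atp", ("atp",)),
    ("housekeeping", ("rpl", "rps", "rrn", "trn", "matK", "clpP", "infA",
                      "accD", "ycf1", "ycf2")),
]


def class_rank(gene):
    """Index of the functional class whose name or prefix matches the gene."""
    for rank, (_, prefixes) in enumerate(CLASS_ORDER):
        if gene.startswith(prefixes):
            return rank
    raise ValueError(f"unclassified gene: {gene}")


def order_genes(genes):
    """Sort genes by functional class, then alphabetically within a class."""
    return sorted(genes, key=lambda g: (class_rank(g), g))


def order_taxa(status):
    """Sort taxa by number of putatively functional genes (most first)."""
    def n_functional(taxon):
        return sum(1 for state in status[taxon].values() if state == "functional")
    return sorted(status, key=lambda t: (-n_functional(t), t))


# HYPOTHETICAL toy matrix; these are not results from the study.
status = {
    "taxon_A": {"ndhF": "lost", "rbcL": "pseudogene", "atpA": "functional", "rps2": "functional"},
    "taxon_B": {"ndhF": "functional", "rbcL": "functional", "atpA": "functional", "rps2": "functional"},
    "taxon_C": {"ndhF": "lost", "rbcL": "lost", "atpA": "pseudogene", "rps2": "functional"},
}
genes = order_genes(status["taxon_A"])
for taxon in order_taxa(status):
    print(taxon, [status[taxon][g] for g in genes])
```

Laid out this way, the rows run from the most intact plastome to the most degraded, and the columns run from the classes hypothesized to be lost first to those lost last.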

Supplementary Material

Supplementary figures S1–S5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

This work was supported by the National Science Foundation (awards DEB-0830020 to Jerrold I. Davis and DBI 1110443 to J.C.P.), the California State University Program for Education and Research in Biotechnology (CSUPERB) to C.F.B., the California State University, Los Angeles Center for Effective Teaching and Learning Faculty-Student Mentorship award to C.F.B. and J.L., and the California State University Council on Ocean Affairs, Science, and Technology to C.S. The authors are grateful to Jerrold Davis, Patrick Edger, Ryan Ellingson, Gwynne Lim, Jeffrey Morawetz, and Chris Randle for feedback and laboratory/bioinformatics support; the staff at Cold Spring Harbor Laboratory (Woodbury, NY) for sample preparation and sequencing; Daniel Lima and Richard Sayre for assistance with chlorophyll spectroscopy and data interpretation; Gerardo Salazar (UNAM) for assistance in collecting Mexican specimens; and Maria Logacheva and three anonymous reviewers for critical insight that improved the paper.

References

Allen JF, de Paula WBM, Puthiyaveetil S, Nield J. 2011. A structural phylogenetic map for chloroplast photosynthesis. Trends Plant Sci. 16:645–655.

Barbrook AC, Howe CJ, Purton S. 2006. Why are plastid genomes retained in non-photosynthetic organisms? Trends Plant Sci. 11:101–108.

Barrett CF, Davis JI. 2012. The plastid genome of the mycoheterotrophic Corallorhiza striata (Orchidaceae) is in the relatively early stages of degradation. Am J Bot. 99:1513–1523.

Barrett CF, Davis JI, Leebens-Mack J, Conran JG, Stevenson DW. 2013. Plastid genomes and deep relationships among the commelinid monocot angiosperms. Cladistics 29:65–87.

Barrett CF, Freudenstein JV. 2008. Molecular evolution of rbcL in the mycoheterotrophic coralroot orchids (Corallorhiza Gagnebin, Orchidaceae). Mol Phylogenet Evol. 47:665–679.

Bellino A, Alfani A, Selosse M-A, Guerrieri R, Borghetti M, Baldantoni D. 2014. Nutritional regulation in mixotrophic plants: new insights from Limodorum abortivum. Oecologia 175:875–885.

Bernardi G. 1989. The isochore organization of the human genome. Annu Rev Genet. 23:637–661.

Bidartondo MI. 2005. The evolutionary ecology of myco-heterotrophy. New Phytol. 167:335–352.

Blazier J, Guisinger MM, Jansen RK. 2011. Recent loss of plastid-encoded ndh genes within Erodium (Geraniaceae). Plant Mol Biol. 76:263–272.

Braukmann T, Kuzmina M, Stefanovic S. 2013. Plastid genome evolution across the genus Cuscuta (Convolvulaceae): two clades within subgenus Grammica exhibit extensive gene loss. J Exp Bot. 64:977–989.

Braukmann T, Stefanovic S. 2012. Plastid genome evolution in mycoheterotrophic Ericaceae. Plant Mol Biol. 79:5–20.

Braukmann TWA, Kuzmina M, Stefanovic S. 2009. Loss of all plastid ndh genes in Gnetales and conifers: extent and evolutionary significance for the seed plant phylogeny. Curr Genet. 55:323–337.

Burnham KP, Anderson DR. 2002. Avoiding pitfalls when using information-theoretic methods. J Wildl Manage. 66:912–918.

Cameron DD, Preiss K, Gebauer G, Read DJ. 2009. The chlorophyll-containing orchid Corallorhiza trifida derives little carbon through photosynthesis. New Phytol. 183:358–364.

Chang CC, Lin HC, Lin IP, Chow TY, Chen HH, Chen WH, Cheng CH, Lin CY, Liu SM, Chang CC, et al. 2006. The chloroplast genome of Phalaenopsis aphrodite (Orchidaceae): comparative analysis of evolutionary rate with that of grasses and its phylogenetic implications. Mol Biol Evol. 23:279–291.

Conant GC, Wolfe KH. 2008. GenomeVx: simple web-based creation of editable circular chromosome maps. Bioinformatics 24:861–862.

Cummings MP, Welschmeyer NA. 1998. Pigment composition of putatively achlorophyllous angiosperms. Plant Syst Evol. 210:105–111.

Delannoy E, Fujii S, Colas des Francs-Small C, Brundrett M, Small I. 2011. Rampant gene loss in the underground orchid Rhizanthella gardneri highlights evolutionary constraints on plastid genomes. Mol Biol Evol. 28:2077–2086.

Doyle JJ, Doyle JL. 1987. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem Bull. 19:11–15.

Dunn OJ. 1959. Confidence intervals for the means of dependent, normally distributed variables. J Am Stat Assoc. 54:613–621.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Evron Y, Johnson EA, McCarty RE. 2000. Regulation of proton flow and ATP synthesis in chloroplasts. J Bioenerg Biomembr. 32:501–506.

Fang YJ, Wu H, Zhang TW, Yang M, Yin YX, Pan LL, Yu XG, Zhang XW, Hu SNA, Al-Mssallem IS, et al. 2012. A complete sequence and transcriptomic analyses of date palm (Phoenix dactylifera L.) mitochondrial genome. PLoS One 7:e37164.

Felsenstein J. 1985. Phylogenies and the comparative method. Am Nat. 125:1–15.

Freudenstein JV. 1992. Systematics of Corallorhiza and the Corallorhizinae (Orchidaceae) [PhD dissertation]. [Ithaca (NY)]: Cornell University.

Freudenstein JV. 1997. A monograph of Corallorhiza (Orchidaceae). Harv Pap Bot. 10:5–51.

Freudenstein JV, Barrett CF. 2010. Mycoheterotrophy and diversity in Orchidaceae. In: Seberg O, Petersen G, Barfod A, Davis JI, editors. Diversity, phylogeny and evolution in the monocotyledons. The Proceedings of the Fourth International Conference on Monocot Systematics. Aarhus (Denmark): Aarhus University Press. p. 25–37.

Freudenstein JV, Doyle JJ. 1994a. Character transformation and relationships in Corallorhiza (Orchidaceae: Epidendroideae). 1. Plastid DNA. Am J Bot. 81:1449–1457.

Freudenstein JV, Doyle JJ. 1994b. Plastid DNA, morphological variation, and the phylogenetic species concept: the Corallorhiza maculata (Orchidaceae) complex. Syst Bot. 19:273–290.

Freudenstein JV, Senyo DM. 2008. Relationships and evolution of matK in a group of leafless orchids (Corallorhiza and Corallorhizinae; Orchidaceae: Epidendroideae). Am J Bot. 95:498–505.

Funk HT, Berg S, Krupinska K, Maier UG, Krause K. 2007. Complete DNA sequences of the plastid genomes of two parasitic flowering plant species, Cuscuta reflexa and Cuscuta gronovii. BMC Plant Biol. 7:45.

Galagan JE, Calvo SE, Borkovich KA, Selker EU, Read ND, Jaffe D, FitzHugh W, Ma LJ, Smirnov S, Purcell S, et al. 2003. The genome sequence of the filamentous fungus Neurospora crassa. Nature 422:859–868.

Garland T, Dickerman AW, Janis CM, Jones JA. 1993. Phylogenetic analysis of covariance by computer simulation. Syst Biol. 42:265–292.

Glemin S, Clement Y, David J, Ressayre A. 2014. GC content evolution in coding regions of angiosperm genomes: a unifying hypothesis. Trends Genet. 30:263–270.

Goloboff PA, Farris JS, Kallersjo M, Oxelman B, Ramirez MJ, Szumik CA. 2003. Improvements to resampling measures of group support. Cladistics 19:324–332.

Goloboff PA, Farris JS, Nixon KC. 2008. TNT, a free program for phylogenetic analysis. Cladistics 24:774–786.

Hajdukiewicz PTJ, Allison LA, Maliga P. 1997. The two RNA polymerases encoded by the nuclear and the plastid compartments transcribe distinct groups of genes in tobacco plastids. EMBO J. 16:4041–4048.

Harmon LJ, Weir JT, Brock CD, Glor RE, Challenger W. 2008. GEIGER: investigating evolutionary radiations. Bioinformatics 24:129–131.

Iles WJD, Smith SY, Graham SW. 2013. A well-supported phylogenetic framework for the monocot order Alismatales reveals multiple losses of the plastid NADH dehydrogenase complex and a strong long-branch effect. In: Wilkin P, Mayo S, editors. Early events in monocot evolution. Cambridge: Cambridge University Press. p. 1–28.

Iorizzo M, Grzebelus D, Senalik D, Szklarczyk M, Spooner D, Simon P. 2012. Against the traffic: the first evidence for mitochondrial DNA transfer into the plastid genome. Mob Genet Elements. 2:261.

Jensen PE, Bassi R, Boekema EJ, Dekker JP, Jansson S, Leister D, Robinson C, Scheller HV. 2007. Structure, function and regulation of plant photosystem I. Biochim Biophys Acta. 1767:335–352.

Katinka MD, Duprat S, Cornillot E, Metenier G, Thomarat F, Prensier G, Barbe V, Peyretaillade E, Brottier P, Wincker P, et al. 2001. Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature 414:450–453.

Katoh K, Misawa K, Kuma Ki, Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30:3059–3066.

Kawakami K, Umena Y, Iwai M, Kawabata Y, Ikeuchi M, Kamiya N, Shen J-R. 2011. Roles of PsbI and PsbM in photosystem II dimer formation and stability studied by deletion mutagenesis and X-ray crystallography. Biochim Biophys Acta. 1807:319–325.

Knaus BJ. 2010. Short read toolbox. Available from: http://brianknaus.com

Krause K. 2008. From chloroplasts to “cryptic” plastids: evolution of plastid genomes in parasitic plants. Curr Genet. 54:111–121.

Krause K. 2012. Plastid genomes of parasitic plants: a trail of reductions and losses. In: Bullerwell C, editor. Organelle genetics. Berlin (Germany): Springer. p. 79–103.

Kuijt J. 1969. The biology of flowering parasitic plants. Berkeley (CA): University of California Press.

Lawrence JG. 2005. Common themes in the genome strategies of pathogens. Curr Opin Genet Dev. 15:584–588.

Leake JR. 1994. The biology of myco-heterotrophic (Saprophytic) plants. New Phytol. 127:171–216.

Leake JR, Cameron DD. 2010. Physiological ecology of mycoheterotrophy. New Phytol. 185:601–605.

Liere K, Weihe A, Borner T. 2011. The transcription machineries of plant mitochondria and chloroplasts: composition, function, and regulation. J Plant Physiol. 168:1345–1360.

Logacheva MD, Schelkunov MI, Nuraliev MS, Samigullin TH, Penin AA. 2014. The plastid genome of mycoheterotrophic monocot Petrosavia stellaris exhibits both gene losses and multiple rearrangements. Genome Biol Evol. 6:238–246.

Logacheva MD, Schelkunov MI, Penin AA. 2011. Sequencing and analysis of plastid genome in mycoheterotrophic orchid Neottia nidus-avis. Genome Biol Evol. 3:1296–1303.

Maddison WP, Maddison DR. 2011. Mesquite: a modular system for evolutionary analysis, version 2.75. Available from: http://mesquiteproject.org

Martín M, Funk HT, Serrot PH, Poltnigg P, Sabater B. 2009. Functional characterization of the thylakoid ndh complex phosphorylation by site-directed mutations in the ndhF gene. Biochim Biophys Acta. 1787:920–928.

Martín M, Sabater B. 2010. Plastid ndh genes in plant evolution. Plant Physiol Biochem. 48:636–645.

McCarty RE, Evron Y, Johnson EA. 2000. The chloroplast ATP synthase: a rotary enzyme? Annu Rev Plant Physiol Plant Mol Biol. 51:83–109.

McNeal JR, Arumugunathan K, Kuehl JV, Boore JL, Depamphilis CW. 2007. Systematics and plastid genome evolution of the cryptically photosynthetic parasitic plant genus Cuscuta (Convolvulaceae). BMC Biol. 5:55.

McNeal JR, Bennett JR, Wolfe AD, Mathews S. 2013. Phylogeny and origins of holoparasitism in Orobanchaceae. Am J Bot. 100:971–983.

Merckx V, Freudenstein JV. 2010. Evolution of mycoheterotrophy in plants: a phylogenetic perspective. New Phytol. 185:605–609.

Midford PE, Garland T Jr, Maddison WP. 2005. PDAP package of Mesquite, version 1.07. Available from: http://mesquiteproject.org

Molina J, Hazzouri KM, Nickrent D, Geisler M, Meyer RS, Pentony MM, Flowers JM, Pelser P, Barcelona J, Inovejas SA, et al. 2014. Possible loss of the chloroplast genome in the parasitic flowering plant Rafflesia lagascae (Rafflesiaceae). Mol Biol Evol. 31:793–803.

Montfort C, Küsters E. 1940. Saprophytismus und photosynthese. I. Biochemische physiologische studien an humus-orchideen. Botanisches Archiv 40:571–633.

Moran NA. 2002. Microbial minimalism: genome reduction in bacterial pathogens. Cell 108:583–586.

Morrison HG, McArthur AG, Gillin FD, Aley SB, Adam RD, Olsen GJ, Best AA, Cande WZ, Chen F, Cipriano MJ, et al. 2007. Genomic minimalism in the early diverging intestinal parasite Giardia lamblia. Science 317:1921–1926.

Nickrent DL, Yan OY, Duff RJ, dePamphilis CW. 1997. Do nonasterid holoparasitic flowering plants have plastid genomes? Plant Mol Biol. 34:717–729.

Patel RK, Jain M. 2012. NGS QC Toolkit: a toolkit for quality control of next generation sequencing data. PLoS One 7:e30619.

Peredo EL, King UM, Les DH. 2013. The plastid genome of Najas flexilis: adaptation to submersed environments is accompanied by the complete loss of the NDH complex in an aquatic angiosperm. PLoS One 8:e68591.

Porra RJ. 2002. The chequered history of the development and use of simultaneous equations for the accurate determination of chlorophylls a and b. Photosyn Res. 73:149–156.

Protasio AV, Tsai IJ, Babbage A, Nichol S, Hunt M, Aslett MA, De Silva N, Velarde GS, Anderson TJC, Clark RC, et al. 2012. A systematically improved high quality genome and transcriptome of the human blood fluke Schistosoma mansoni. PLoS Negl Trop Dis. 6:e1455.

Randle CP, Wolfe AD. 2005. The evolution and expression of rbcL in holoparasitic sister-genera Harveya and Hyobanche (Orobanchaceae). Am J Bot. 92:1575–1585.

Ranwez V, Harispe S, Delsuc F, Douzery EJP. 2011. MACSE: Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons. PLoS One 6:e22594.

Sanderson MJ. 1997. A nonparametric approach to estimating divergence times in the absence of rate constancy. Mol Biol Evol. 14:1218–1231.

Smith DR, Lee RW. 2014. A plastid without a genome: evidence from the nonphotosynthetic green algal genus Polytomella. Plant Physiol. 164:1812–1819.

Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690.

Straub SCK, Cronn RC, Edwards C, Fishbein M, Liston A. 2013. Horizontal transfer of DNA from the mitochondrial to the plastid genome and its subsequent evolution in milkweeds (Apocynaceae). Genome Biol Evol. 5:1872–1885.

Straub SCK, Fishbein M, Livshultz T, Foster Z, Parks M, Weitemier K, Cronn RC, Liston A. 2011. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing. BMC Genomics 12:211.

Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. 2013. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol. 30:2725–2729.

Tsai IJ, Zarowiecki M, Holroyd N, Garciarrubio A, Sanchez-Flores A, Brooks KL, Tracey A, Bobes RJ, Fragoso G, Sciutto E, et al. 2013. The genomes of four tapeworm species reveal adaptations to parasitism. Nature 496:57–63.

Umate P, Schwenkert S, Karbat I, Dal Bosco C, Mlcochova L, Volz S, Zer H, Herrmann RG, Ohad I, Meurer J. 2007. Deletion of psbM in tobacco alters the Q(B) site properties and the electron flow within photosystem II. J Biol Chem. 282:9758–9767.

Vaidya G, Lohman DJ, Meier R. 2011. SequenceMatrix: concatenation software for the fast assembly of multi-gene datasets with character set and codon information. Cladistics 27:171–180.

Westwood JH, Yoder JI, Timko MP, dePamphilis CW. 2010. The evolution of parasitism in plants. Trends Plant Sci. 15:227–235.

Wicke S, Muller KF, de Pamphilis CW, Quandt D, Wickett NJ, Zhang Y, Renner SS, Schneeweiss GM. 2013. Mechanisms of functional and physical genome reduction in photosynthetic and nonphotosynthetic parasitic plants of the broomrape family. Plant Cell 25:3711–3725.

Wicke S, Schäferhoff B, dePamphilis CW, Müller KF. 2014. Disproportional plastome-wide increase of substitution rates and relaxed purifying selection in genes of carnivorous Lentibulariaceae. Mol Biol Evol. 31:529–545.

Wicke S, Schneeweiss GM, dePamphilis CW, Muller KF, Quandt D. 2011. The evolution of the plastid chromosome in land plants: gene content, gene order, gene function. Plant Mol Biol. 76:273–297.

Wickett NJ, Honaas LA, Wafula EK, Das M, Huang K, Wu BA, Landherr L, Timko MP, Yoder J, Westwood JH, et al. 2011. Transcriptomes of the parasitic plant family Orobanchaceae reveal surprising conservation of chlorophyll synthesis. Curr Biol. 21:2098–2104.

Wimpee CF, Wrobel RL, Garvin DK. 1991. A divergent plastid genome in Conopholis americana, an achlorophyllous parasitic plant. Plant Mol Biol. 17:161–166.

Wolfe AD, dePamphilis CW. 1998. The effect of relaxed functional constraints on the photosynthetic gene rbcL in photosynthetic and nonphotosynthetic parasitic plants. Mol Biol Evol. 15:1243–1258.

Wolfe KH, Morden CW, Palmer JD. 1992. Function and evolution of a minimal plastid genome from a nonphotosynthetic parasitic plant. Proc Natl Acad Sci U S A. 89:10648–10652.

Wu FH, Chan MT, Liao DC, Hsu CT, Lee YW, Daniell H, Duvall MR, Lin CS. 2010. Complete chloroplast genome of Oncidium Gower Ramsey and evaluation of molecular markers for identification and breeding in Oncidiinae. BMC Plant Biol. 10:68.

Wyman SK, Jansen RK, Boore JL. 2004. Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20:3252–3255.

Yang JB, Tang M, Li HT, Zhang ZR, Li DZ. 2013. Complete chloroplast genome of the genus Cymbidium: lights into the species identification, phylogenetic implications and population genetic analyses. BMC Evol Biol. 13:84.

Yang ZH. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci. 13:555–556.

Yang ZH. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 24:1586–1591.

Yoder OC, Turgeon BG. 2001. Fungal genomics and pathogenicity. Curr Opin Plant Biol. 4:315–321.

Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18:821–829.

Zimmer K, Meyer C, Gebauer G. 2008. The ectomycorrhizal specialist orchid Corallorhiza trifida is a partial myco-heterotroph. New Phytol. 178:395–400.

3112

Barrett et al doi101093molbevmsu252 MBE by guest on July 7 2016

httpmbeoxfordjournalsorg

Dow

nloaded from

protons across the Thylakoid Membrane and thus play acrucial role in photosynthesis but this functional class waskept separate from photosynthesis genes due to the possibil-ity that ATP synthase serves roles outside of photosynthesis

Supplementary MaterialSupplementary figures S1ndashS5 are available at MolecularBiology and Evolution online (httpwwwmbeoxfordjour-nalsorg)

Acknowledgments

This work was supported by the National Science Foundation(awards DEB-0830020 to Jerrold I Davis and DBI 1110443 toJCP) the California State University Program for Educationand Research in Biotechnology (CSUPERB) to CFB theCalifornia State University Los Angeles Center for EffectiveTeaching and Learning Faculty-Student Mentorship award toCFB and JL and the California State University Council onOcean Affairs Science and Technology to CS The authorsare grateful to Jerrold Davis Patrick Edger Ryan EllingsonGwynne Lim Jeffrey Morawetz and Chris Randle for feedbackand laboratorybioinformatics support the staff at ColdSpring Harbor Laboratory (Woodbury NY) for sample prep-aration and sequencing Daniel Lima and Richard Sayre forassistance with chlorophyll spectroscopy and data interpre-tation Gerardo Salazar (UNAM) for assistance in collectingMexican specimens Maria Logacheva and three anonymousreviewers for critical insight that improved the paper

ReferencesAllen JF de Paula WBM Puthiyaveetil S Nield J 2011 A structural

phylogenetic map for chloroplast photosynthesis Trends Plant Sci16645ndash655

Barbrook AC Howe CJ Purton S 2006 Why are plastid genomes re-tained in non-photosynthetic organisms Trends Plant Sci 11101ndash108

Barrett CF Davis JI 2012 The plastid genome of the mycoheterotrophicCorallorhiza striata (Orchidaceae) is in the relatively early stages ofdegradation Am J Bot 991513ndash1523

Barrett CF Davis JI Leebens-Mack J Conran JG Stevenson DW 2013Plastid genomes and deep relationships among the commelinidmonocot angiosperms Cladistics 2965ndash87

Barrett CF Freudenstein JV 2008 Molecular evolution of rbcL in themycoheterotrophic coralroot orchids (Corallorhiza GagnebinOrchidaceae) Mol Phylogenet Evol 47665ndash679

Bellino A Alfani A Selosse M-A Guerrieri R Borghetti M Baldantoni D2014 Nutritional regulation in mixotrophic plants new insightsfrom Limodorum abortivum Oecologia 175875ndash885

Bernardi G 1989 The isochore organization of the human genomeAnnu Rev Genet 23637ndash661

Bidartondo MI 2005 The evolutionary ecology of myco-heterotrophyNew Phytol 167335ndash352

Blazier J Guisinger MM Jansen RK 2011 Recent loss of plastid-encodedndh genes within Erodium (Geraniaceae) Plant Mol Biol 76263ndash272

Braukmann T Kuzmina M Stefanovic S 2013 Plastid genome evolutionacross the genus Cuscuta (Convolvulaceae) two clades within sub-genus Grammica exhibit extensive gene loss J Exp Bot 64977ndash989

Braukmann T Stefanovic S 2012 Plastid genome evolution in mycohe-terotrophic Ericaceae Plant Mol Biol 795ndash20

Braukmann TWA Kuzmina M Stefanovic S 2009 Loss of all plastid ndhgenes in Gnetales and conifers extent and evolutionary significancefor the seed plant phylogeny Curr Genet 55323ndash337

Burnham KP Anderson DR 2002 Avoiding pitfalls when using infor-mation-theoretic methods J Wildl Manage 66912ndash918

Cameron DD Preiss K Gebauer G Read DJ 2009 The chlorophyll-con-taining orchid Corallorhiza trifida derives little carbon through pho-tosynthesis New Phytol 183358ndash364

Chang CC Lin HC Lin IP Chow TY Chen HH Chen WH Cheng CH LinCY Liu SM Chang CC et al 2006 The chloroplast genome ofPhalaenopsis aphrodite (Orchidaceae) comparative analysis of evo-lutionary rate with that of grasses and its phylogenetic implicationsMol Biol Evol 23279ndash291

Conant GC Wolfe KH 2008 GenomeVx simple web-based creation ofeditable circular chromosome maps Bioinformatics 24861ndash862

Cummings MP Welschmeyer NA 1998 Pigment composition ofputatively achlorophyllous angiosperms Plant Syst Evol 210105ndash111

Delannoy E Fujii S Colas des Francs-Small C Brundrett M Small I 2011Rampant gene loss in the underground orchid Rhizanthella gardnerihighlights evolutionary constraints on plastid genomes Mol BiolEvol 282077ndash2086

Doyle JJ Doyle JL 1987 A rapid DNA isolation procedure for smallquantities of fresh leaf tissue Phytochem Bull 1911ndash15

Dunn OJ 1959 Confidence intervals for the means of dependent nor-mally distributed variables J Am Stat Assoc 54613ndash621

Edgar RC 2004 MUSCLE multiple sequence alignment with high accu-racy and high throughput Nucleic Acids Res 321792ndash1797

Evron Y Johnson EA McCarty RE 2000 Regulation of proton flow andATP synthesis in chloroplasts J Bioenerg Biomembr 32501ndash506

Fang YJ Wu H Zhang TW Yang M Yin YX Pan LL Yu XG Zhang XWHu SNA Al-Mssallem IS et al 2012 A complete sequence andtranscriptomic analyses of date palm (Phoenix dactylifera L) mito-chondrial genome PLoS One 7e37164

Felsenstein J 1985 Phylogenies and the comparative method Am Nat1251ndash15

Freudenstein JV 1992 Systematics of Corallorhiza and theCorallorhizinae (Orchidaceae) [PhD dissertation] [Ithaca (NY)]Cornell University

Freudenstein JV 1997 A monograph of Corallorhiza (Orchidaceae)Harv Pap Bot 105ndash51

Freudenstein JV Barrett CF 2010 Mycoheterotrophy and diversity inOrchidaceae In Seberg O Petersen G Barfod A Davis JI editorsDiversity phylogeny and evolution in the monocotyledons TheProceedings of the Fourth International Conference on MonocotSystematics Aarhus (Denmark) Aarhus University Press p 25ndash37

Freudenstein JV Doyle JJ 1994a Character transformation and relation-ships in Corallorhiza (Orchidaceae Epidendroideae) 1 Plastid DNAAm J Bot 811449ndash1457

Freudenstein JV Doyle JJ 1994b Plastid DNA morphological variationand the phylogenetic species conceptmdashthe Corallorhiza maculata(Orchidaceae) complex Syst Bot 19273ndash290

Freudenstein JV Senyo DM 2008 Relationships and evolution of matKin a group of leafless orchids (Corallorhiza and CorallorhizinaeOrchidaceae Epidendroideae) Am J Bot 95498ndash505

Funk HT Berg S Krupinska K Maier UG Krause K 2007 Complete DNAsequences of the plastid genomes of two parasitic flowering plantspecies Cuscuta reflexa and Cuscuta gronovii BMC Plant Biol 745

Galagan JE Calvo SE Borkovich KA Selker EU Read ND Jaffe DFitzHugh W Ma LJ Smirnov S Purcell S et al 2003 The genomesequence of the filamentous fungus Neurospora crassa Nature 422859ndash868

Garland T Dickerman AW Janis CM Jones JA 1993 Phylogenetic anal-ysis of covariance by computer simulation Syst Biol 42265ndash292

Glemin S Clement Y David J Ressayre A 2014 GC content evolution incoding regions of angiosperm genomes a unifying hypothesisTrends Genet 30263ndash270

Goloboff PA Farris JS Kallersjo M Oxelman B Ramirez MJ Szumik CA2003 Improvements to resampling measures of group supportCladistics 19324ndash332

Goloboff PA Farris JS Nixon KC 2008 TNT a free program for phylo-genetic analysis Cladistics 24774ndash786

3110

Barrett et al doi101093molbevmsu252 MBE by guest on July 7 2016

httpmbeoxfordjournalsorg

Dow

nloaded from

Hajdukiewicz PTJ Allison LA Maliga P 1997 The two RNA polymerasesencoded by the nuclear and the plastid compartments transcribedistinct groups of genes in tobacco plastids EMBO J 164041ndash4048

Harmon LJ Weir JT Brock CD Glor RE Challenger W 2008 GEIGERinvestigating evolutionary radiations Bioinformatics 24129ndash131

Iles WJD Smith SY Graham SW 2013 A well-supported phylogeneticframework for the monocot order Alismatales reveals multiplelosses of the plastid NADH dehydrogenase complex and a stronglong-branch effect In Wilkin P Mayo S editors Early events inmonocot evolution Cambridge Cambridge University Pressp 1ndash28

Iorizzo M Grzebelus D Senalik D Szklarczyk M Spooner D Simon P2012 Against the traffic the first evidence for mitochondrial DNAtransfer into the plastid genome Mob Genet Elements 2261

Jensen PE Bassi R Boekema EJ Dekker JP Jansson S Leister D RobinsonC Scheller HV 2007 Structure function and regulation of plantphotosystem I Biochim Biophys Acta 1767335ndash352

Katinka MD Duprat S Cornillot E Metenier G Thomarat F Prensier GBarbe V Peyretaillade E Brottier P Wincker P et al 2001 Genomesequence and gene compaction of the eukaryote parasiteEncephalitozoon cuniculi Nature 414450ndash453

Katoh K Misawa K Kuma Ki Miyata T 2002 MAFFT a novel methodfor rapid multiple sequence alignment based on fast Fourier trans-form Nucleic Acids Res 303059ndash3066

Kawakami K Umena Y Iwai M Kawabata Y Ikeuchi M Kamiya N ShenJ-R 2011 Roles of PsbI and PsbM in photosystem II dimer formationand stability studied by deletion mutagenesis and X-ray crystallog-raphy Biochim Biophys Acta 1807319ndash325

Knaus BJ 2010 Short read toolbox Available from httpbrianknauscom

Krause K 2008 From chloroplasts to ldquocrypticrdquo plastids evolution ofplastid genomes in parasitic plants Curr Genet 54111ndash121

Krause K 2012 Plastid genomes of parasitic plants a trail of reductionsand losses In Bullerwell C editor Organelle genetics Berlin(Germany) Springer p 79ndash103

Kuijt J 1969 The biology of flowering parasitic plants Berkeley (CA)University of California Press

Lawrence JG 2005 Common themes in the genome strategies of path-ogens Curr Opin Genet Dev 15584ndash588

Leake JR 1994 The biology of myco-heterotrophic (Saprophytic) plantsNew Phytol 127171ndash216

Leake JR Cameron DD 2010 Physiological ecology of mycoheterotro-phy New Phytol 185601ndash605

Liere K Weihe A Borner T 2011 The transcription machineries of plantmitochondria and chloroplasts composition function and regula-tion J Plant Physiol 1681345ndash1360

Logacheva MD Schelkunov MI Nuraliev MS Samigullin TH Penin AA2014 The plastid genome of mycoheterotrophic monocotPetrosavia stellaris exhibits both gene losses and multiple rearrange-ments Genome Biol Evol 6238ndash246

Logacheva MD Schelkunov MI Penin AA 2011 Sequencing and analysisof plastid genome in mycoheterotrophic orchid Neottia nidus-avisGenome Biol Evol 31296ndash1303

Maddison WP Maddison DR 2011 Mesquite a modular system forevolutionary analysis version 275 Available from httpmesquite-projectorg

Martın M Funk HT Serrot PH Poltnigg P Sabater B 2009 Functionalcharacterization of the thylakoid ndh complex phosphorylation bysite-directed mutations in the ndhF gene Biochim Biophys Acta1787920ndash928

Martın M Sabater B 2010 Plastid ndh genes in plant evolution PlantPhysiol Biochem 48636ndash645

McCarty RE Evron Y Johnson EA 2000 The chloroplast ATPsynthase a rotary enzyme Annu Rev Plant Physiol Plant Mol Biol5183ndash109

McNeal JR Arumugunathan K Kuehl JV Boore JL Depamphilis CW2007 Systematics and plastid genome evolution of the crypticallyphotosynthetic parasitic plant genus Cuscuta (Convolvulaceae)BMC Biol 555

McNeal JR Bennett JR Wolfe AD Mathews S 2013 Phylogenyand origins of holoparasitism in Orobanchaceae Am J Bot 100971ndash983

Merckx V Freudenstein JV 2010 Evolution of mycoheterotrophy inplants a phylogenetic perspective New Phytol 185605ndash609

Midford PE Garland T Jr Maddison WP 2005 PDAP package ofMesquite version 107 Available from httpmesquiteprojectorg

Molina J Hazzouri KM Nickrent D Geisler M Meyer RS Pentony MMFlowers JM Pelser P Barcelona J Inovejas SA et al 2014 Possible lossof the chloroplast genome in the parasitic flowering plant Rafflesialagascae (Rafflesiaceae) Mol Biol Evol 31793ndash803

Montfort C Keurousters E 1940 Saprophytismus und photosythese IBiochemische physiologische studien an humus-orchideenBotanisces Archiv 40571ndash633

Moran NA 2002 Microbial minimalism genome reduction in bacterialpathogens Cell 108583ndash586

Morrison HG McArthur AG Gillin FD Aley SB Adam RD Olsen GJ BestAA Cande WZ Chen F Cipriano MJ et al 2007 Genomic minimal-ism in the early diverging intestinal parasite Giardia lamblia Science3171921ndash1926

Nickrent DL Yan OY Duff RJ dePamphilis CW 1997 Do nonasteridholoparasitic flowering plants have plastid genomes Plant Mol Biol34717ndash729

Patel RK Jain M 2012 NGS QC Toolkit a toolkit for quality control ofnext generation sequencing data PLoS One 7e30619

Peredo EL King UM Les DH 2013 The plastid genome of Najas flexilisadaptation to submersed environments is accompanied by thecomplete loss of the NDH complex in an aquatic angiospermPLoS One 8e68591

Porra RJ. 2002. The chequered history of the development and use of simultaneous equations for the accurate determination of chlorophylls a and b. Photosyn Res. 73:149–156.

Protasio AV, Tsai IJ, Babbage A, Nichol S, Hunt M, Aslett MA, De Silva N, Velarde GS, Anderson TJC, Clark RC, et al. 2012. A systematically improved high quality genome and transcriptome of the human blood fluke Schistosoma mansoni. PLoS Negl Trop Dis. 6:e1455.

Randle CP, Wolfe AD. 2005. The evolution and expression of rbcL in holoparasitic sister-genera Harveya and Hyobanche (Orobanchaceae). Am J Bot. 92:1575–1585.

Ranwez V, Harispe S, Delsuc F, Douzery EJP. 2011. MACSE: Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons. PLoS One. 6:e22594.

Sanderson MJ. 1997. A nonparametric approach to estimating divergence times in the absence of rate constancy. Mol Biol Evol. 14:1218–1231.

Smith DR, Lee RW. 2014. A plastid without a genome: evidence from the nonphotosynthetic green algal genus Polytomella. Plant Physiol. 164:1812–1819.

Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics. 22:2688–2690.

Straub SCK, Cronn RC, Edwards C, Fishbein M, Liston A. 2013. Horizontal transfer of DNA from the mitochondrial to the plastid genome and its subsequent evolution in milkweeds (Apocynaceae). Genome Biol Evol. 5:1872–1885.

Straub SCK, Fishbein M, Livshultz T, Foster Z, Parks M, Weitemier K, Cronn RC, Liston A. 2011. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing. BMC Genomics. 12:211.

Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. 2013. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol. 30:2725–2729.

Tsai IJ, Zarowiecki M, Holroyd N, Garciarrubio A, Sanchez-Flores A, Brooks KL, Tracey A, Bobes RJ, Fragoso G, Sciutto E, et al. 2013. The genomes of four tapeworm species reveal adaptations to parasitism. Nature. 496:57–63.

Umate P, Schwenkert S, Karbat I, Dal Bosco C, Mlcochova L, Volz S, Zer H, Herrmann RG, Ohad I, Meurer J. 2007. Deletion of psbM in tobacco alters the Q(B) site properties and the electron flow within photosystem II. J Biol Chem. 282:9758–9767.

Vaidya G, Lohman DJ, Meier R. 2011. SequenceMatrix: concatenation software for the fast assembly of multi-gene datasets with character set and codon information. Cladistics. 27:171–180.

Westwood JH, Yoder JI, Timko MP, dePamphilis CW. 2010. The evolution of parasitism in plants. Trends Plant Sci. 15:227–235.

Wicke S, Müller KF, dePamphilis CW, Quandt D, Wickett NJ, Zhang Y, Renner SS, Schneeweiss GM. 2013. Mechanisms of functional and physical genome reduction in photosynthetic and nonphotosynthetic parasitic plants of the broomrape family. Plant Cell. 25:3711–3725.

Wicke S, Schäferhoff B, dePamphilis CW, Müller KF. 2014. Disproportional plastome-wide increase of substitution rates and relaxed purifying selection in genes of carnivorous Lentibulariaceae. Mol Biol Evol. 31:529–545.

Wicke S, Schneeweiss GM, dePamphilis CW, Müller KF, Quandt D. 2011. The evolution of the plastid chromosome in land plants: gene content, gene order, gene function. Plant Mol Biol. 76:273–297.

Wickett NJ, Honaas LA, Wafula EK, Das M, Huang K, Wu BA, Landherr L, Timko MP, Yoder J, Westwood JH, et al. 2011. Transcriptomes of the parasitic plant family Orobanchaceae reveal surprising conservation of chlorophyll synthesis. Curr Biol. 21:2098–2104.

Wimpee CF, Wrobel RL, Garvin DK. 1991. A divergent plastid genome in Conopholis americana, an achlorophyllous parasitic plant. Plant Mol Biol. 17:161–166.

Wolfe AD, dePamphilis CW. 1998. The effect of relaxed functional constraints on the photosynthetic gene rbcL in photosynthetic and nonphotosynthetic parasitic plants. Mol Biol Evol. 15:1243–1258.

Wolfe KH, Morden CW, Palmer JD. 1992. Function and evolution of a minimal plastid genome from a nonphotosynthetic parasitic plant. Proc Natl Acad Sci U S A. 89:10648–10652.

Wu FH, Chan MT, Liao DC, Hsu CT, Lee YW, Daniell H, Duvall MR, Lin CS. 2010. Complete chloroplast genome of Oncidium Gower Ramsey and evaluation of molecular markers for identification and breeding in Oncidiinae. BMC Plant Biol. 10:68.

Wyman SK, Jansen RK, Boore JL. 2004. Automatic annotation of organellar genomes with DOGMA. Bioinformatics. 20:3252–3255.

Yang JB, Tang M, Li HT, Zhang ZR, Li DZ. 2013. Complete chloroplast genome of the genus Cymbidium: lights into the species identification, phylogenetic implications and population genetic analyses. BMC Evol Biol. 13:84.

Yang ZH. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci. 13:555–556.

Yang ZH. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 24:1586–1591.

Yoder OC, Turgeon BG. 2001. Fungal genomics and pathogenicity. Curr Opin Plant Biol. 4:315–321.

Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18:821–829.

Zimmer K, Meyer C, Gebauer G. 2008. The ectomycorrhizal specialist orchid Corallorhiza trifida is a partial myco-heterotroph. New Phytol. 178:395–400.

Hajdukiewicz PTJ, Allison LA, Maliga P. 1997. The two RNA polymerases encoded by the nuclear and the plastid compartments transcribe distinct groups of genes in tobacco plastids. EMBO J. 16:4041–4048.

Harmon LJ, Weir JT, Brock CD, Glor RE, Challenger W. 2008. GEIGER: investigating evolutionary radiations. Bioinformatics. 24:129–131.

Iles WJD, Smith SY, Graham SW. 2013. A well-supported phylogenetic framework for the monocot order Alismatales reveals multiple losses of the plastid NADH dehydrogenase complex and a strong long-branch effect. In: Wilkin P, Mayo S, editors. Early events in monocot evolution. Cambridge: Cambridge University Press. p. 1–28.

Iorizzo M, Grzebelus D, Senalik D, Szklarczyk M, Spooner D, Simon P. 2012. Against the traffic: the first evidence for mitochondrial DNA transfer into the plastid genome. Mob Genet Elements. 2:261.

Jensen PE, Bassi R, Boekema EJ, Dekker JP, Jansson S, Leister D, Robinson C, Scheller HV. 2007. Structure, function and regulation of plant photosystem I. Biochim Biophys Acta. 1767:335–352.

Katinka MD, Duprat S, Cornillot E, Metenier G, Thomarat F, Prensier G, Barbe V, Peyretaillade E, Brottier P, Wincker P, et al. 2001. Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature. 414:450–453.

Katoh K, Misawa K, Kuma Ki, Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30:3059–3066.

Kawakami K, Umena Y, Iwai M, Kawabata Y, Ikeuchi M, Kamiya N, Shen J-R. 2011. Roles of PsbI and PsbM in photosystem II dimer formation and stability studied by deletion mutagenesis and X-ray crystallography. Biochim Biophys Acta. 1807:319–325.

Knaus BJ. 2010. Short read toolbox. Available from: http://brianknaus.com.

Krause K. 2008. From chloroplasts to "cryptic" plastids: evolution of plastid genomes in parasitic plants. Curr Genet. 54:111–121.

Krause K. 2012. Plastid genomes of parasitic plants: a trail of reductions and losses. In: Bullerwell C, editor. Organelle genetics. Berlin (Germany): Springer. p. 79–103.

Kuijt J. 1969. The biology of parasitic flowering plants. Berkeley (CA): University of California Press.

Lawrence JG. 2005. Common themes in the genome strategies of pathogens. Curr Opin Genet Dev. 15:584–588.

Leake JR. 1994. The biology of myco-heterotrophic (Saprophytic) plants. New Phytol. 127:171–216.

Leake JR, Cameron DD. 2010. Physiological ecology of mycoheterotrophy. New Phytol. 185:601–605.

Liere K, Weihe A, Börner T. 2011. The transcription machineries of plant mitochondria and chloroplasts: composition, function, and regulation. J Plant Physiol. 168:1345–1360.

Logacheva MD, Schelkunov MI, Nuraliev MS, Samigullin TH, Penin AA. 2014. The plastid genome of mycoheterotrophic monocot Petrosavia stellaris exhibits both gene losses and multiple rearrangements. Genome Biol Evol. 6:238–246.

Logacheva MD, Schelkunov MI, Penin AA. 2011. Sequencing and analysis of plastid genome in mycoheterotrophic orchid Neottia nidus-avis. Genome Biol Evol. 3:1296–1303.

Maddison WP, Maddison DR. 2011. Mesquite: a modular system for evolutionary analysis. Version 2.75. Available from: http://mesquiteproject.org.
