Gene 475 (2011) 104–112


Characterization of the complete chloroplast genome of Hevea brasiliensis reveals genome rearrangement, RNA editing sites and phylogenetic relationships

Sithichoke Tangphatsornruang ⁎, Pichahpuk Uthaipaisanwong, Duangjai Sangsrakru, Juntima Chanprasert, Thippawan Yoocha, Nukoon Jomchai, Somvong Tragoonrung

National Center for Genetic Engineering and Biotechnology, 113 Phaholyothin Rd., Klong 1, Klong Luang, Pathumthani, 12120, Thailand

Article info

Article history:
Accepted 5 January 2011
Available online 15 January 2011

Received by Jean-Marc Deragon

Keywords:
Hevea brasiliensis
Chloroplast/plastid genome
RNA editing

Abstract

Rubber tree (Hevea brasiliensis) is an economically important plant that is widely grown for natural rubber production. However, genomic research on rubber tree has lagged behind that on other species in the Euphorbiaceae family. We report the complete chloroplast genome sequence of rubber tree as being 161,191 bp in length, including a pair of inverted repeats of 26,810 bp separated by a small single copy region of 18,362 bp and a large single copy region of 89,209 bp. The chloroplast genome contains 112 unique genes, 16 of which are duplicated in the inverted repeat. Of the 112 unique genes, 78 are predicted protein-coding genes, 4 are ribosomal RNA genes and 30 are tRNA genes. Relative to other plant chloroplast genomes, we observed a unique rearrangement in the rubber tree chloroplast genome: a 30-kb inversion between the trnE(UUC)-trnS(GCU) and the trnT(GGU)-trnR(UCU). A comparison between the rubber tree chloroplast genes and cDNA sequences revealed 51 RNA editing sites, of which most (48 sites) were located in 26 protein coding genes and the other 3 sites were in introns. Phylogenetic analysis based on chloroplast genes demonstrated a close relationship between Hevea and Manihot in Euphorbiaceae and provided strong support for a monophyletic group of the eurosid I.

© 2011 Elsevier B.V. All rights reserved.

Abbreviations: bp, base pair; cp, chloroplast; IDP, Isopentenyl diphosphate; MVA, Mevalonate; MEP, 1-Deoxy-D-xylulose 5-phosphate/2-C-methyl-D-erythritol 4-phosphate; H. brasiliensis, Hevea brasiliensis; PCR, Polymerase chain reaction; RCA, Rolling circle amplification; ML, Maximum likelihood; MP, Maximum parsimony; TBR, Tree bisection and reconnection; A, Adenosine; C, Cytidine; I, Inosine; U, Uridine; G, Guanosine; LSC, Large single copy; SSC, Small single copy; IR, Inverted repeat; ycf, Hypothetical chloroplast open reading frame; EST, Expressed sequence tag.

⁎ Corresponding author. Tel.: +66 2 564 6700x3259; fax: +66 2 564 6584.
E-mail address: [email protected] (S. Tangphatsornruang).

0378-1119/$ – see front matter © 2011 Elsevier B.V. All rights reserved.
doi:10.1016/j.gene.2011.01.002

1. Introduction

Chloroplasts are plant organelles with their own genome, containing genes coding for transcription and translation machinery and components of the photosynthetic complex. Since the first complete chloroplast (cp) genome sequence, that of liverwort (Marchantia polymorpha), was reported in 1986 (Ohyama et al., 1986), more than 150 chloroplast genomes have been sequenced and characterized, disclosing an enormous amount of evolutionary and functional information about chloroplasts. Chloroplast genomes are sufficiently large and complex to include structural and point mutations that are useful for evolutionary studies from intraspecific to interspecific levels (Neale et al., 1988; McCauley, 1992; Graham and Olmstead, 2000; Provan et al., 2001). Structural mutations such as duplications of tRNA genes (Hipkins et al., 1995), rpl19, rpl2, rpl23 (Bowman et al., 1988) and psbA (Lidholm et al., 1991); losses of ndh genes (Wakasugi et al., 1994), hypothetical chloroplast open reading frame (ycf) genes, infA, and accD (Hiratsuka et al., 1989; Maier et al., 1995; Millen et al., 2001); as well as rearrangements of cp genomes (Palmer et al., 1987; Wolfe et al., 1991; Wojciechowski et al., 2004; Guo et al., 2007; Tangphatsornruang et al., 2010b) have been reported in plants and algae. Therefore, chloroplast genome sequences have been used to study phylogenetic relationships (Provan et al., 2001; Lee et al., 2006; Tangphatsornruang et al., 2010b) and to test hypotheses of seed dispersal, intraspecific differentiation and interspecific introgression (Petit et al., 2003, 2005).

In chloroplasts, transcripts undergo a series of RNA processing steps such as intron splicing, polycistronic cleavage, and RNA editing. RNA editing is a mechanism that changes genetic information at the transcript level by nucleotide insertion, deletion or conversion (Bock, 2000; Knoop, 2010). Since the first report of chloroplast RNA editing, in the maize rpl2 gene (Hoch et al., 1991), editing sites have been reported in Arabidopsis thaliana (Tillich et al., 2005), Atropa belladonna (Schmitz-Linneweber et al., 2002), Lotus japonicus (Kato et al., 2000), black pine (Wakasugi et al., 1996), cassava (Daniell et al., 2008), pea (Miyamoto et al., 2002), tobacco (Sasaki et al., 2003), maize (Maier et al., 1995; Halter et al., 2004) and rice (Corneille et al., 2000). Comparison of sequences surrounding the editing sites revealed no consensus sequence or secondary structure (Hirose et al., 1999). This raised the question of how RNA editing sites are recognized. Previous studies suggested the involvement of distinct cis-acting elements and trans-acting factors in recognition of an individual editing site (Chaudhuri et al., 1995; Bock et al., 1996; Chaudhuri and Maliga, 1996; Hirose and Sugiura, 2001; Miyamoto et al., 2002; Lurin et al., 2004; Kotera et al., 2005; Hayes and Hanson, 2007). The RNA-binding pentatricopeptide repeat (PPR) proteins were identified as trans-acting factors responsible for targeting specific editing events (Lurin et al., 2004; Kotera et al., 2005; Hammani et al., 2009).

Hevea brasiliensis is a perennial plant in the Euphorbiaceae family and is the most widely cultivated species for commercial production of natural rubber. The chemical composition of natural rubber is cis-polyisoprene, a high-molecular weight polymer formed from sequential condensation of isopentenyl diphosphate (IDP) units catalysed by the action of rubber transferase (Cornish, 2001a). IDP is also an important intermediate for biosynthesis of essential oils, abscisic acid, cytokinin, phytoalexin, sterols, chlorophyll, carotenoids and gibberellins (Chappell, 1995a; McGarvey and Croteau, 1995; Lichtenthaler et al., 1997; Cornish, 2001b). There are two IDP biosynthesis pathways: the mevalonate (MVA) pathway, which occurs in the cytosol (Chappell, 1995b); and the 1-deoxy-D-xylulose 5-phosphate/2-C-methyl-D-erythritol 4-phosphate (MEP) pathway, which occurs in plastids (Lichtenthaler, 1999; Ko et al., 2003). One approach to improving rubber production in H. brasiliensis would be to engineer chloroplasts and modify metabolic flux to produce more biosynthetic intermediates. The availability of the complete chloroplast genome sequence should also facilitate chloroplast transformation. Improved transformation efficiency and foreign gene expression can be achieved through utilization of endogenous flanking sequences and regulatory elements (Birch-Machin et al., 2004; Maliga, 2004; Tangphatsornruang et al., 2010a). Transformation of the chloroplast genome offers a number of advantages over nuclear transformation, including a high level of transgene expression, polycistronic transcription, lack of gene silencing or positional effect, and transgene containment (Daniell et al., 2002; Maliga, 2002, 2004; Bock, 2007).

We sequenced the chloroplast genome of H. brasiliensis in order to gain information for genome annotation, comparative genomic studies and also to lay the groundwork for chloroplast engineering. We employed the massively-parallel pyrosequencing technology developed by 454 Life Sciences Technology (Margulies et al., 2005). This technology has been applied to the sequencing of genomes, transcriptome profiling and methylation studies. Previous work demonstrated the success of high throughput sequencing technology in obtaining chloroplast genome sequences (Cai et al., 2006; Moore et al., 2006; Cronn et al., 2008; Tangphatsornruang et al., 2010b). This overcomes the traditional labor-intensive methods involving isolation of chloroplast DNA followed by random shearing and cloning into vectors; or long PCR amplification by conserved primers (Goremykin et al., 2003, 2004; Dhingra and Folta, 2005; Heinze, 2007), or rolling circle amplification (RCA) (Jansen et al., 2005; Bausher et al., 2006). In this study, we determined the complete nucleotide sequence of the H. brasiliensis chloroplast genome, annotated it, compared the structures with other plant species, identified RNA editing sites, and used the rubber tree chloroplast genome to determine phylogenetic relationships among angiosperms.

2. Materials and methods

2.1. DNA sequencing, assembly and annotation

DNA was isolated from 1 g of leaves of H. brasiliensis, clone RRIM600, using the DNeasy Plant Mini Kit (Qiagen). The DNA (10 μg) was sheared by nebulization, subjected to 454 library preparation and shotgun sequencing using the GS FLX Titanium platform (Margulies et al., 2005) at the in-house facility (National Center for Genetic Engineering and Biotechnology, Thailand). The obtained nucleotide sequence reads were assembled using Newbler de novo sequence assembly software (Roche). The chloroplast genome sequence was compared with the reference sequence from the complete chloroplast genome of Manihot esculenta (Daniell et al., 2008) using Sequencher software (Gene Codes Corporation). Remaining gaps were closed by PCR and Sanger sequencing using the BigDye Terminator v3.1 Cycle sequencing kit. The 3 primer pairs used for closing the gaps are 1) gap_LSCF: 5′-GGG CTC TAA AAA GAC ATC TCC A-3′, gap_LSCR: 5′-CTT TCT GTC TTT CAC GAT TCC A-3′, 2) gap_SSC1F: 5′-TGT ATG ACC ATC GAG GAA CTT G-3′, gap_SSC1R: 5′-GTC GGA GTG ATG GAA AAG AAA G-3′ and 3) gap_SSC2F: 5′-GCT GAA TAG ACA AAT CGA TTG AA-3′, gap_SSC2R: 5′-TGA TCC ATT TTC TAG CCC AAG-3′. PCR products were purified by electrophoresis in agarose gel using the Qiaquick Gel Extraction Kit (QIAGEN).

2.2. Genome analysis

The genome was annotated using the program DOGMA (Dual Organellar GenoMe Annotator; Wyman et al., 2004). The predicted annotations were verified using BLAST similarity search (Altschul et al., 1990). All genes, rRNAs, and tRNAs were identified using the plastid/bacterial genetic code. The chloroplast genome of H. brasiliensis was compared with chloroplast genomes of Arabidopsis (Sato et al., 1999), Populus, Jatropha and Manihot (Daniell et al., 2008) using Mauve software (Darling et al., 2004). REPuter (Kurtz and Schleiermacher, 1999) was used to identify and locate direct repeat and inverted repeat sequences in the rubber tree chloroplast genome, with cutoff criteria of n ≥ 30 bp and a sequence identity ≥ 90%.

2.3. RNA editing

To reveal RNA editing sites, more than two million cDNA sequences of rubber trees were downloaded from the DDBJ read archive (ID=DRA000170) and aligned with the protein coding genes extracted from the rubber tree chloroplast genome using GS Reference Mapper version 2.3 (Roche).

Some RNA editing sites (rps2eU134TI, rps14eU149PL, ndhKeU65SL, petBi178, ndhBeC1290YY, ndhBeU467PL, ndhDeU887PL, ndhDeU878SL and ndhDeU599SL) were confirmed by Sanger sequencing of cDNA products. In brief, total RNA was extracted from 0.5 g of young leaf using Concert™ Plant RNA Reagent (Invitrogen), treated with DNA-free DNaseI (Ambion) and converted to a pool of cDNA using the RevertAid H Minus First Strand cDNA synthesis kit (Fermentas). Primer sequences are given in Supplementary Fig. 2.

2.4. Phylogenetic analysis

A set of 33 protein-coding genes (atpA, atpB, atpE, atpF, atpH, atpI, ccsA, cemA, matK, petA, petG, petN, psaA, psaB, psaC, psbC, psbD, psbE, psbF, psbI, psbJ, psbK, psbN, psbZ, rbcL, rpl2, rpl20, rpoB, rpoC2, rps4, rps14, rps15 and ycf3) from 39 chloroplast genomes representing all lineages of angiosperms was analyzed. These 33 genes are present in all 39 chloroplast genomes and are publicly available in the GenBank database. Sequences were aligned using MUSCLE (version 3.6) (Edgar, 2004) and edited manually. For maximum likelihood (ML) analysis, RAxML version 7.0 (Stamatakis, 2006) was used with the GTR+I+G model. The local bootstrap probability of each branch was calculated from 100 replications. Phylogenetic analyses using maximum parsimony (MP) were performed using PAUP version 4.0b10 (Swofford, 2002). MP searches included 1000 random addition replicates and a heuristic search using tree bisection and reconnection (TBR) branch swapping with the Multrees option. Bootstrap analysis was performed with 100 replicates with TBR branch swapping. TreeView (Page, 1996) was used for displaying and printing phylogenetic trees.

3. Results and discussion

3.1. Sequencing and assembly of the H. brasiliensis chloroplast genome

A total of 995,092 quality-filtered sequence reads were generated, with an average read length of 332 bases, covering 330 Mb. From the assembly analysis, 3 contigs, assembled from 60,855 reads (5.49%), were shown to be parts of the chloroplast genome by alignment with the M. esculenta chloroplast genome. The proportion of sequences from the chloroplast genome in rubber tree (5.49%) is similar to that in a previous study in mungbean (5.22%) (Tangphatsornruang et al., 2010b). The gaps between contigs were located in the large single copy (LSC; between trnS-GCU and trnE-UUC) with a size of 84 bp, in the small single copy (SSC; between ndhF and trnL-UAG) with a size of 1074 bp, and at the SSC-IRa junction with a size of 299 bp. A common characteristic of these gaps is the presence of multiple copies of AT-rich repeats, as also found by Tangphatsornruang et al. (2010b). Closing the gaps with Sanger sequencing resulted in a complete chloroplast genome sequence.

Since 454 sequencing technology has a limitation in reading long homopolymer regions (Moore et al., 2006; Huse et al., 2007; Tangphatsornruang et al., 2010b), we performed Sanger sequencing of all homopolymers (>7 bp) present in the chloroplast genome (Supplementary Table 1). Throughout the rubber tree chloroplast genome, there are 229 homopolymers (>7 bp); 45 homopolymers are present in 18 coding genes and 184 are present in non-coding regions. Among the protein coding sequences, ycf1 contains the highest number of homopolymers (21), followed by ycf2 (4). The longest homopolymer stretch is 19 bp, located in the intergenic region between atpF and atpA. Out of 229 homopolymers, 221 were polyA/T and only 8 were polyG/C. The number of corrected homopolymeric bases from GS FLX Titanium in this study was 258 out of 2227 (11.58%), which is 4 times higher than the homopolymer error rate previously reported for the earlier version of the GS FLX platform (Tangphatsornruang et al., 2010b).

The complete chloroplast genome sequence was deposited in the NCBI database (HQ285842). The chloroplast genome contains a pair of identical inverted repeat regions (IRa and IRb), which are 26,810 bp each. The inverted repeats are separated by a large single-copy (LSC) region of 89,209 bp and a small single-copy (SSC) region of 18,362 bp.

Fig. 1. Map of the H. brasiliensis chloroplast genome. The thick lines indicate the extent of the inverted repeats (IRa and IRb) which separate the genome into small and large single copy regions. Genes on the outside of the map are transcribed clockwise and those on the inside of the map are transcribed counter clockwise. Genes containing introns and pseudogenes are marked with * and # respectively. Arrows indicate the positions of a 30-kb unique rearrangement relative to the cassava chloroplast genome.
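The homopolymer survey described in Section 3.1 amounts to scanning the assembled sequence for single-base runs longer than 7 bp. The sketch below is a minimal illustration of such a scan and is not the authors' pipeline; the function name `find_homopolymers`, the `min_len` parameter and the toy sequence are assumptions made for this example.

```python
import re

def find_homopolymers(seq, min_len=8):
    """Return (start, base, length) for each single-base run of at least min_len bases.

    min_len=8 mirrors the ">7 bp" threshold quoted in the text; start is 0-based.
    """
    # ([ACGT]) captures one base; the backreference \1 requires the same base
    # to repeat at least min_len - 1 more times.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(seq.upper())]

# Hypothetical toy sequence: a 9-bp poly-A, a 7-bp poly-T (too short), an 8-bp poly-C.
toy = "GATC" + "A" * 9 + "GC" + "T" * 7 + "GA" + "C" * 8 + "T"
runs = find_homopolymers(toy)
print(runs)
```

Applied to a real assembly, each reported run would mark a region worth re-checking by Sanger sequencing, as was done in this study.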

Table 2
RNA editing events in the rubber tree chloroplast genes. The annotation nomenclature of RNA editing events is according to Lenz et al., 2009.

Number  RNA editing sites
1   matKeU1168RW
2   matKeU634HY
3   matKeU149SF
4   rps16i493
5   rpoBeU551SL
6   rpoC1eU41SL
7   rpoC2eU3746SL
8   rps2eU134TI
9   rps2eU248SL
10  atpIeU635SL
11  psbDeU435II
12  rps14eU149PL
13  rps14eU80SL
14  ndhKeU65SL
15  ndhCeU323SL
16  psbEeU214PS
17  petLeU5PL
18  rps18eU221SL
19  clpPeU556HY
20  psbBeU414II
21  petBi178


3.2. Genome content and organization

The positions of all the genes identified in the H. brasiliensis chloroplast genome and the functional categorization of these genes are presented in Fig. 1. The genome contains 112 unique genes, including 30 tRNA genes, 4 rRNA genes and 78 predicted protein coding genes (Table 1). In addition, there are 16 genes duplicated in the inverted repeat (IR), making a total of 128 genes present in the rubber tree chloroplast genome. Coding regions (90,532 bp; 56.16%) account for over half of the chloroplast genome, with the peptide-coding regions forming the largest group (78,681 bp; 48.81%), followed by ribosomal RNA genes (9050 bp; 5.61%) and transfer RNA genes (2801 bp; 1.74%). The remaining 43.84% is covered by intergenic regions (29.61%) and a total of 23 introns (13.27%) present within 22 genes (or 17 unique genes). The trnK-UUU gene has the largest intron (2535 bp), within which the matK gene is located. There are 30 unique tRNA genes (7 of which are duplicated in the IR), which recognize all RNA codons for the 20 amino acids according to the wobble rules. Based on the sequences of protein-coding genes and tRNA genes within the chloroplast genome, we were able to deduce the frequency of codon usage, as summarized in Supplementary Table 3. We observed that the codon usage was biased towards a high representation of A and U at the third codon position, as in all other land plants (Shimada and Sugiura, 1991; Cai et al., 2006; Gao et al., 2009). The rubber tree psbC and rps19 genes contain GUG as a start codon. Sequence alignment between the chloroplast genome and the rubber tree ESTs also confirmed that both psbC and rps19 transcripts have GUG as the start codon. Studies of psbC and rps19 translation also revealed that GUG is the initiation codon in several plants and algae (Rochaix et al., 1989; Carpenter et al., 1990; Yukawa et al., 2005; Kuroda et al., 2007).
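The genome size, coding fractions and gene totals quoted above follow from simple arithmetic on the reported region and class lengths. The following sketch reproduces that arithmetic as a consistency check; the variable names and the helper `percent_of_genome` are illustrative assumptions, and all numbers are taken from the text.

```python
# Region lengths (bp) as reported for the H. brasiliensis chloroplast genome.
LSC, SSC, IR = 89_209, 18_362, 26_810
genome_size = LSC + SSC + 2 * IR  # the inverted repeat occurs twice

# Coding-sequence lengths (bp) by class, as reported in Section 3.2.
coding_bp = {"protein": 78_681, "rRNA": 9_050, "tRNA": 2_801}
total_coding = sum(coding_bp.values())

def percent_of_genome(bp):
    """Share of the genome in percent, rounded to two decimals."""
    return round(100 * bp / genome_size, 2)

shares = {name: percent_of_genome(bp) for name, bp in coding_bp.items()}
coding_share = percent_of_genome(total_coding)

# Gene counts: unique genes by class, plus the copies duplicated in the IR.
unique_genes = 78 + 4 + 30
total_genes = unique_genes + 16

print(genome_size, total_coding, coding_share, shares, unique_genes, total_genes)
```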

The previously sequenced chloroplast genomes of Malpighiales (Manihot, Jatropha and Populus) and that of Hevea reported here were compared using the Arabidopsis chloroplast genome as the reference sequence (Darling et al., 2004) (Supplementary Fig. 1). We observed a unique genome rearrangement of a 30 kb fragment in the LSC, between trnS(GCU)-trnE(UUC) and trnR(UCU)-trnT(GGU), in the rubber tree chloroplast genome compared with the others. Although we were unable to identify any significant repeats in the spacers between the rubber tree trnT(GGU)-trnR(UCU) and trnE(UUC)-trnS(GCU), these regions are biased towards high AT content (85.54% and 84.92%, respectively).

Table 1
Genes encoded by the Hevea brasiliensis chloroplast genome.

1. Photosystem I: psaA, psaB, psaC, psaI, psaJ, ycf3^a, ycf4
2. Photosystem II: psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbN, psbT, psbZ
3. Cytochrome b6/f: petA, petB^b, petD^b, petG, petL, petN
4. ATP synthase: atpA, atpB, atpE, atpF, atpH, atpI
5. Rubisco: rbcL
6. NADH oxidoreductase: ndhA^b, ndhB^b,c, ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhJ, ndhK
7. Large subunit ribosomal proteins: rpl2^b,c, rpl14, rpl16^b, rpl20, rpl22, rpl23^c, rpl32, rpl33, rpl36
8. Small subunit ribosomal proteins: rps2, rps3, rps4, rps7^c, rps8, rps11, rps12^b,c,d, rps14, rps15, rps16^b, rps18, rps19
9. RNAP: rpoA, rpoB, rpoC1^b, rpoC2
10. Other proteins: accD, ccsA, cemA, clpP^a, matK
11. Proteins of unknown function: ycf1, ycf2^c
12. Ribosomal RNAs: rrn16^c, rrn23^c, rrn4.5^c, rrn5^c
13. Transfer RNAs: A(UGC)^b,c, C(GCA), D(GUC), E(UUC), F(GAA), G(GCC)^b, G(UCC), H(GUG), I(CAU)^c, I(GAU)^b,c, K(UUU)^b, L(CAA)^c, L(UAA)^b, L(UAG), fM(CAU), M(CAU), N(GUU)^c, P(UGG), Q(UUG), R(ACG)^c, R(UCU), S(GCU), S(GGA), S(UGA), T(GGU), T(UGU), V(GAC)^c, V(UAC)^b, W(CCA), Y(GUA)

^a Gene containing two introns.
^b Gene containing a single intron.
^c Two gene copies in the IRs.
^d Gene divided into two independent transcription units.

Analysis of the repeat sequences in the rubber tree chloroplast genome identified twenty-five direct repeats and seventeen inverted repeats of 30 bp or longer with a sequence identity of at least 90% (Supplementary Table 4). Eighteen repeats are 30 to 40 bp long, eleven repeats are 41–50 bp long, seven repeats are 51–80 bp long, and six repeats are longer than 80 bp. The longest repeat in rubber tree chloroplast DNA is a 151-bp direct repeat between trnG-GCC and trnT-GGU. Most of the direct repeats are distributed within the intergenic spacer regions, the intron sequences, and the tRNA and ycf2 genes.

Two ycf genes (ycf15 and ycf68) are probably not functional in the rubber tree chloroplast genome due to the presence of premature stop codons. In several chloroplast genomes, ycf15 and ycf68 have also been reported as non-functional genes (Sato et al., 1999; Schmitz-Linneweber et al., 2001; Steane, 2005; Raubeson et al., 2007; Daniell et al., 2008). The infA gene is present but probably non-functional in the rubber tree chloroplast genome due to the presence of a premature stop codon in both chloroplast DNA and cDNA sequences. The loss of infA from the chloroplast genome has been reported to occur multiple times during angiosperm evolution (Millen et al., 2001). The infA gene has been lost from the M. esculenta chloroplast genome, the closest fully sequenced relative of H. brasiliensis; but it is present in Populus, another plant species in the Malpighiales order (Millen et al., 2001; Daniell et al., 2008).

Table 2 (continued)

Number  RNA editing sites
22  petBeU611SL
23  petDeU481SL
24  rpoAeU836SL
25  rpoAeU200SF
26  rpl23eU89SL
27  ycf2eU467PL
28  ycf2eC1608VV
29  ycf2eA1645VI
30  ndhBeU1481PL
31  ndhBeC1290YY
32  ndhBeU1255HY
33  ndhBeU59SL
34  ndhBeU830SL
35  ndhBeU746SF
36  ndhBeU611SL
37  ndhBeU586HY
38  ndhBeU542TM
39  ndhBeU467PL
40  ndhBeU149SL
41  rps12-3endi186
42  ndhDeU887PL
43  ndhDeU878SL
44  ndhDeU674SL
45  ndhDeU599SL
46  ndhDeU313RW
47  ndhEeU233PL
48  ndhGeU347PL
49  ndhAeU961PS
50  ndhAeU566SL
51  ndhHeU505HY
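The site labels in Table 2 pack several fields into one string: a gene name, an "e" (exon) or "i" (intron) marker, the nucleotide found in the edited transcript, the nucleotide position and, for coding sites, the amino acids before and after editing. The sketch below splits the labels with a regular expression and tallies the editing classes. It is an illustrative reading of the labels as they appear in this table, not a full implementation of the Lenz et al. (2009) nomenclature; the names `LABEL_RE`, `parse_label` and `CHANGE` are assumptions, and the mapping from edited nucleotide to change direction is inferred from the counts reported in Section 3.3.

```python
import re
from collections import Counter

# Site labels copied from Table 2 (51 entries).
LABELS = """
matKeU1168RW matKeU634HY matKeU149SF rps16i493 rpoBeU551SL rpoC1eU41SL
rpoC2eU3746SL rps2eU134TI rps2eU248SL atpIeU635SL psbDeU435II rps14eU149PL
rps14eU80SL ndhKeU65SL ndhCeU323SL psbEeU214PS petLeU5PL rps18eU221SL
clpPeU556HY psbBeU414II petBi178 petBeU611SL petDeU481SL rpoAeU836SL
rpoAeU200SF rpl23eU89SL ycf2eU467PL ycf2eC1608VV ycf2eA1645VI ndhBeU1481PL
ndhBeC1290YY ndhBeU1255HY ndhBeU59SL ndhBeU830SL ndhBeU746SF ndhBeU611SL
ndhBeU586HY ndhBeU542TM ndhBeU467PL ndhBeU149SL rps12-3endi186 ndhDeU887PL
ndhDeU878SL ndhDeU674SL ndhDeU599SL ndhDeU313RW ndhEeU233PL ndhGeU347PL
ndhAeU961PS ndhAeU566SL ndhHeU505HY
""".split()

# gene, then "e"/"i", optional edited nucleotide, position, optional amino-acid pair.
LABEL_RE = re.compile(
    r"^(?P<gene>.+?)(?P<kind>[ei])(?P<nt>[ACGU]?)(?P<pos>\d+)(?P<aa>[A-Z]{2})?$"
)

# Assumed mapping from the edited nucleotide to the change reported in the text.
CHANGE = {"U": "C-to-U", "C": "U-to-C", "A": "G-to-A"}

def parse_label(label):
    """Split one Table 2 label into its fields; raise if it does not fit the pattern."""
    match = LABEL_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognised label: {label}")
    return match.groupdict()

sites = [parse_label(label) for label in LABELS]
tally = Counter("intron" if s["kind"] == "i" else CHANGE[s["nt"]] for s in sites)
exon_genes = {s["gene"] for s in sites if s["kind"] == "e"}

print(dict(tally), len(exon_genes))
```

Under these assumptions the tally gives 45 C-to-U, 2 U-to-C and 1 G-to-A sites in 26 protein coding genes, plus 3 intron sites, which matches the counts given in the text.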

3.3. RNA editing search by comparison between coding sequences and cDNAs

To determine the RNA editing sites, we compared the proteincoding sequences extracted from the rubber tree chloroplast genomewith 2,265,782 rubber tree cDNA sequences downloaded from theDDBJ read archive (ID=DRA000170). The chloroplast gene sequencesmatched 52,971 out of 2.2 million rubber tree ESTs (2.23%). Therewere 6765 EST reads (2,059,201 bp) mapped to chloroplast proteincoding genes which is equivalent to 23× coverage of the chloroplastcoding region. Table 2 presents 51 RNA editing sites identified andnamed according to the proposed universal nomenclature by Lenzet al., 2009 (Lenz et al., 2009). Forty eight were in protein codingregions of 26 protein coding genes, 3 were in introns of rps16, petB andrps12. Out of 48 RNA editings in mRNA, a C-to-U change was the mostcommon (45), followed by a U-to-C change (2) and a G-to-A change(1). In chloroplasts and mitochondria of seed plants, a conversionfrom C to U is the most predominant form (Bock, 2000). The reverseU-to-C editing is rarely observed in seed plants (Gualberto et al., 1990;

At.genomic MIWHVQNENFILDSTRIFMKAFHLLLFDGSFIFPECILIFGLILLLMIDSTSDQKDIPWAt.cDNA MIWHVQNENFILDSTRIFMKAFHLLLFDGSFIFPECILIFGLILLLMIDLTSDQKDIPW

-------------------------------------------------*---------Sl.genomic MIWHVQNENFILDSTRIFMKAFHLLLFDGSLIFPECILIFGLILLLMIDSTSDQKDIPWSl.cDNA MIWHVQNENFILDSTRIFMKAFHLLLFDGSLIFPECILIFGLILLLMIDLTSDQKDIPW

-------------------------------------------------*---------Nt.genomic MIWHVQNENFILDSTRIFMKAFHLLLFDGSLIFPECILIFGLILLLMIDSTSDQKDIPWNt.cDNA MIWHVQNENFILDSTRIFMKAFHLLLFDGSLIFPECILIFGLILLLMIDLTSDQKDIPW

-------------------------------------------------*---------Hb.genomic MIWHVQNENFILDSTRIFMKAFHLLLFDGSFIFPECILIFGLILLLMIDSTSDQKDIPWHb.cDNA MIWHVQNENFILDSTRIFMKAFHLLLFDGSFIFPECILIFGLILLLMIDLTSDQKDIPW

-------------------------------------------------*---------

Fig. 2. Sequence alignment of ndhB proteins translated from chloroplast genomes before RNA editing and cDNAs after RNA editing of Arabidopsis thaliana (At), Solanum lycopersicum (Sl), Nicotiana tabacum (Nt) and Hevea brasiliensis (Hb) using CLUSTAL 2.0.12. Stars represent RNA editing sites and hyphens represent unedited sites.

Schuster et al., 1990); but it is common in hornworts and ferns (Yoshinaga et al., 1996; Steinhauser et al., 1999; Vangerow et al., 1999). The two U-to-C RNA editing events were found only in the ndhB and ycf2 transcripts, whose genes lie very close to each other (7266 bp apart). It is also possible that this fragment of the chloroplast genome may have been transferred to the mitochondrial genome, where extensive RNA editing occurs. Several lines of evidence have suggested translocation of chloroplast DNA fragments to mitochondrial genomes in many plant species (Stern and Lonsdale, 1982; Stern and Palmer, 1984; Moon et al., 1987). Further experiments on cDNA sequencing of transcripts extracted from isolated chloroplasts will be required to test this hypothesis.

An uncommon G-to-A change at the ycf2eA1645VI site observed here has never been reported in chloroplasts of higher land plants, although A-to-I/G editing has commonly been observed in tRNAs, where it expands the ability to read additional codons (Pfitzinger et al., 1990; Dao et al., 1994; Agris et al., 2007). Recently, the adenosine deaminase acting on tRNAs (ADAT) gene responsible for editing the adenosine at the wobble position of cp-tRNAArg(ACG) has been identified in Arabidopsis chloroplasts (Delannoy et al., 2009; Karcher and Bock, 2009). However, it should be noted that only two edited (G-to-A) ycf2 transcripts were found, compared with six unedited ycf2 transcripts. This G-to-A conversion may therefore be due to sequencing error, in which case the number of RNA editing events in this study would be overestimated.
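
The caveat about the ycf2 site can be made concrete with a small calculation. The sketch below is illustrative only: it treats each cDNA read as an independent trial and uses assumed per-read miscall probabilities (the values of `p` are hypothetical and not taken from this study) to ask how likely it is that at least 2 of 8 reads show the same G-to-A change by error alone.

```python
from math import comb

def prob_at_least_k_errors(n_reads: int, k: int, per_read_error: float) -> float:
    """P(X >= k) for X ~ Binomial(n_reads, per_read_error)."""
    return sum(
        comb(n_reads, i) * per_read_error**i * (1 - per_read_error) ** (n_reads - i)
        for i in range(k, n_reads + 1)
    )

# Read counts reported in the text for the ycf2 G-to-A site.
edited, unedited = 2, 6
n = edited + unedited
edited_fraction = edited / n  # fraction of reads carrying the change

# ASSUMPTION: illustrative probabilities that a single read shows this
# specific miscall; real values depend on platform and sequence context.
for p in (0.001, 0.01, 0.05):
    print(f"p={p}: P(>= {edited} of {n} reads in error) = "
          f"{prob_at_least_k_errors(n, edited, p):.5f}")
```

Under these assumptions the probability depends strongly on the assumed error rate, which is why sequencing isolated-chloroplast transcripts at greater depth would be needed to settle the question.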

S. Tangphatsornruang et al. / Gene 475 (2011) 104–112

There are 45 non-synonymous substitutions, which occur most frequently in ndhB (11), followed by ndhD (5). The ndhB transcripts were also found to be highly edited in other plants such as maize, sugarcane, rice, barley, tomato, tobacco and Arabidopsis (Freyer et al., 1995; Kahlau et al., 2006; Chateigner-Boutin and Small, 2007). Fig. 2 shows the amino acid sequence alignment of the ndhB proteins from Arabidopsis, tobacco, tomato and rubber tree with RNA editing positions. All RNA editing events in the highly edited ndhB transcripts maintained the conserved ndhB amino acid sequences in all 4 plant species. We also observed that 40 RNA editing events in rubber tree chloroplasts caused amino acid changes to highly hydrophobic residues (such as L, F, I, M, V and W), with conversions from serine to leucine as the most frequent transitions. The majority of RNA editing in messenger RNAs occurred at the second codon position (36), followed by the first codon position (10) and the third codon position (2). In RNA editing events at the second codon, there was a
bias toward pyrimidine nucleotides at the 5′ upstream position and purine nucleotides at the 3′ downstream position. However, it is unclear whether these biases are due to evolutionary or mechanistic limitations of the editing process.

Fig. 3. The phylogenetic relationships based on 33 protein-coding genes from 39 plant taxa with the ML value of lnL=−230655.55. Numbers above nodes are bootstrap support values. Ordinal and higher level group names are also indicated.
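
The codon-position counts reported for the RNA editing sites (36 at the second, 10 at the first and 2 at the third codon position) come from comparing genomic and cDNA sequences. A minimal sketch of that kind of comparison is given below. The function name `find_edits`, the toy sequences and the reduced codon table are assumptions made for illustration, not the authors' pipeline. The script scans two aligned, in-frame coding sequences, records each mismatch with its codon position and amino acid change, and tallies edits by codon position.

```python
from collections import Counter

# Reduced codon table covering only the codons in the toy example (assumption).
# A codon outside this table would raise KeyError.
CODON_TABLE = {
    "TCA": "S", "TTA": "L",  # serine -> leucine via a second-position C-to-U edit
    "CCA": "P", "CTA": "L",  # proline -> leucine via a second-position C-to-U edit
    "CAT": "H", "TAT": "Y",  # histidine -> tyrosine via a first-position C-to-U edit
    "ATG": "M", "GGT": "G",
}

def find_edits(genomic: str, cdna: str):
    """Return (nt_pos, codon_pos, dna_base, cdna_base, aa_before, aa_after) per mismatch."""
    if len(genomic) != len(cdna) or len(genomic) % 3:
        raise ValueError("sequences must be aligned, in-frame and of equal length")
    edits = []
    for i, (g, c) in enumerate(zip(genomic, cdna)):
        if g != c:
            start = i - i % 3  # first base of the affected codon
            aa_before = CODON_TABLE[genomic[start:start + 3]]
            aa_after = CODON_TABLE[cdna[start:start + 3]]
            edits.append((i + 1, i % 3 + 1, g, c, aa_before, aa_after))
    return edits

# Hypothetical toy sequences; the cDNA is written with T in place of U.
genomic = "ATGTCACCACATGGT"
cdna = "ATGTTACTATATGGT"

edits = find_edits(genomic, cdna)
by_position = Counter(e[1] for e in edits)  # edits per codon position in the toy data

# Counts reported in the text, keyed by codon position.
reported_counts = {2: 36, 1: 10, 3: 2}
print(edits)
print(by_position, sum(reported_counts.values()))
```

The reported counts sum to 48, which matches the number of editing sites located in protein-coding genes given in the abstract.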

3.4. Phylogenetic analysis

Our phylogenetic data set included 33 protein-coding genes for 39 plant taxa (Supplementary Table 5), including 37 angiosperms and two outgroup gymnosperms (Ginkgo and Pinus). These 33 genes are present in the chloroplast genome of each of the 39 species, which minimized the problem of missing data in the sequence alignment. The sequence alignment used for phylogenetic analyses comprised 26,585 characters. ML analysis resulted in a single
tree with ln L=−230655.55 (Fig. 3). ML bootstrap values were also high, with values of ≥95% for 36 of the 37 nodes, and 34 nodes with 100% bootstrap support (Fig. 3). MP analysis resulted in a single resolved tree with a length of 40,109, a consistency index of 0.48 and a retention index of 0.657 (not shown). Bootstrap analyses indicated that 32 out of 36 nodes had values of 100%.
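
The tree statistics quoted above can be related to one another with the standard definitions CI = M/S and RI = (G − S)/(G − M), where S is the tree length and M and G are the minimum and maximum possible numbers of steps. The sketch below is a rough back-calculation, assuming these textbook definitions apply unchanged to the reported values. Because CI and RI are rounded in the text, the derived M and G are approximate and are not figures reported by the authors.

```python
# Values reported in the text for the MP tree.
tree_length = 40_109   # S
ci = 0.48              # consistency index (rounded in the source)
ri = 0.657             # retention index (rounded in the source)

homoplasy_index = 1 - ci                   # HI = 1 - CI
min_steps = ci * tree_length               # M = CI * S (approximate)
# Solve RI = (G - S) / (G - M) for G (approximate).
max_steps = (tree_length - ri * min_steps) / (1 - ri)

def support_fraction(nodes_meeting: int, nodes_total: int) -> float:
    """Fraction of nodes meeting a bootstrap threshold."""
    return nodes_meeting / nodes_total

ml_ge95 = support_fraction(36, 37)  # ML nodes with >= 95% support
ml_100 = support_fraction(34, 37)   # ML nodes with 100% support
mp_100 = support_fraction(32, 36)   # MP nodes with 100% support

print(f"HI={homoplasy_index:.2f}, M~{min_steps:.0f}, G~{max_steps:.0f}")
print(f"ML >=95%: {ml_ge95:.1%}, ML 100%: {ml_100:.1%}, MP 100%: {mp_100:.1%}")
```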

Both the MP and ML trees had similar topologies with two major clades, Monocots and Eudicots, with Amborella as the earliest diverging angiosperm lineage. The only incongruence between the MP and ML trees is the position of Calycanthus. In the MP tree, Calycanthus was placed sister to Eudicots, whereas it was positioned close to both Monocots and Eudicots in the ML tree. This incongruence was observed in previous phylogenetic studies (Leebens-Mack et al., 2005; Bausher et al., 2006; Jansen et al., 2006; Ruhlman et al., 2006). Some studies supported Monocots as the sister clade to Magnoliids+Eudicots (Nickrent et al., 2002; Zanis et al., 2002). However, phylogenies based on phytochromes (Mathews and Donoghue, 1999), 17 cp genes (Graham and Olmstead, 2000), 21 cp genes (Tangphatsornruang et al., 2010b) and 61 cp genes (Cai et al., 2006; Lee et al., 2006; Hansen et al., 2007) supported Magnoliids as sister to Monocots and Eudicots. By sequencing three chloroplast genomes of Magnoliids, Cai et al. (2006) provided strong support for Monocots and Eudicots as sister clades, with Magnoliids diverging before the Monocots–Eudicots split.
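
The incongruence in the position of Calycanthus can be illustrated by listing the clades that each topology implies and taking the difference. The sketch below uses heavily simplified four-tip trees written as nested tuples. These toy trees are an assumption made for illustration: they collapse each major group to a single tip and read "close to both Monocots and Eudicots" as Calycanthus being sister to a Monocots+Eudicots clade. They are not the published trees.

```python
def clades(tree):
    """Return (leaf set, set of clades) for a rooted tree given as nested tuples.

    Each clade is a frozenset of the leaf names below one internal node.
    """
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves, found = frozenset(), set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)
    return leaves, found

# Hypothetical simplified topologies (assumptions, see the note above).
mp_tree = ("Amborella", ("Monocots", ("Calycanthus", "Eudicots")))
ml_tree = ("Amborella", ("Calycanthus", ("Monocots", "Eudicots")))

_, mp_clades = clades(mp_tree)
_, ml_clades = clades(ml_tree)

only_mp = mp_clades - ml_clades  # clades found only in the MP toy tree
only_ml = ml_clades - mp_clades  # clades found only in the ML toy tree
print(sorted(map(sorted, only_mp)), sorted(map(sorted, only_ml)))
```

In this toy comparison the two trees differ by a single clade each, which mirrors the statement that the placement of Calycanthus is the only incongruence.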

Our MP and ML trees revealed a monophyly of the Monocots and Eudicots where Ranunculales was placed sister to the remaining Eudicots. The overall structure of the trees is similar to previously reported trees (Lee et al., 2006; Daniell et al., 2008; Logacheva et al., 2008; Tangphatsornruang et al., 2010b). Addition of the H. brasiliensis chloroplast genes placed Hevea sister to Manihot, grouped it together with Jatropha and Populus in the order Malpighiales, and provided strong support for a monophyletic eurosid I group. The relationships within Malpighiales were also supported by the study based on the atpF gene (Daniell et al., 2008).

4. Conclusion

We performed shotgun genome sequencing of H. brasiliensis using 454 pyrosequencing technology and obtained the complete chloroplast genome sequence. The approach has been demonstrated here as a fast and efficient way of obtaining organellar genomes. Gene content and structural organization of the rubber tree chloroplast genome are similar to those of M. esculenta, with the exception of the 30-kb fragment rearrangement in the LSC. By comparing the rubber tree chloroplast genes and the cDNA sequences, we determined the distribution and location of RNA editing sites in the chloroplast genome. The proposed phylogenetic relationships among angiosperms, based on chloroplast DNA sequences including those of the rubber tree reported here, provided strong support for a monophyletic eurosid I group and demonstrated a close relationship between Hevea, Manihot, Jatropha and Populus in Malpighiales.

Supplementary materials related to this article can be found online at doi:10.1016/j.gene.2011.01.002.

Acknowledgements

We acknowledge funding support by the National Center for Genetic Engineering and Biotechnology, Thailand.

References

Agris, P.F., Vendeix, F.A., Graham, W.D., 2007. tRNA's wobble decoding of the genome: 40 years of modification. J. Mol. Biol. 366, 1–13.

Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J., 1990. Basic local alignment search tool. J. Mol. Biol. 215, 403–410.

Bausher, M.G., Singh, N.D., Lee, S.B., Jansen, R.K., Daniell, H., 2006. The complete chloroplast genome sequence of Citrus sinensis (L.) Osbeck var ‘Ridge Pineapple’: organization and phylogenetic relationships to other angiosperms. BMC Plant Biol. 6, 21.

Birch-Machin, I., Newell, C.A., Hibberd, J.M., Gray, J.C., 2004. Accumulation of rotavirus VP6 protein in chloroplasts of transplastomic tobacco is limited by protein stability. Plant Biotechnol. J. 2, 261–270.

Bock, R., 2000. Sense from nonsense: how the genetic information of chloroplasts is altered by RNA editing. Biochimie 82, 549–557.

Bock, R., 2007. Plastid biotechnology: prospects for herbicide and insect resistance, metabolic engineering and molecular farming. Curr. Opin. Biotechnol. 18, 100–106.

Bock, R., Hermann, M., Kossel, H., 1996. In vivo dissection of cis-acting determinants for plastid RNA editing. EMBO J. 15, 5052–5059.

Bowman, C.M., Barker, R.F., Dyer, T.A., 1988. In wheat ctDNA, segments of ribosomal protein genes are dispersed repeats, probably conserved by nonreciprocal recombination. Curr. Genet. 14, 127–136.

Cai, Z., Penaflor, C., Kuehl, J.V., Leebens-Mack, J., Carlson, J.E., dePamphilis, C.W., Boore, J.L., Jansen, R.K., 2006. Complete plastid genome sequences of Drimys, Liriodendron, and Piper: implications for the phylogenetic relationships of magnoliids. BMC Evol. Biol. 6, 77.

Carpenter, S.D., Charite, J., Eggers, B., Vermaas, W.F., 1990. The psbC start codon in Synechocystis sp. PCC 6803. FEBS Lett. 260, 135–137.

Chappell, J., 1995a. The biochemistry and molecular biology of isoprenoid metabolism. Plant Physiol. 107, 1–6.

Chappell, J., 1995b. Biochemistry and molecular biology of the isoprenoid biosynthetic pathway in plants. Annu. Rev. Plant Physiol. Plant Mol. Biol. 46, 521–547.

Chateigner-Boutin, A.L., Small, I., 2007. A rapid high-throughput method for the detection and quantification of RNA editing based on high-resolution melting of amplicons. Nucleic Acids Res. 35, e114.

Chaudhuri, S., Maliga, P., 1996. Sequences directing C to U editing of the plastid psbL mRNA are located within a 22 nucleotide segment spanning the editing site. EMBO J. 15, 5958–5964.

Chaudhuri, S., Carrer, H., Maliga, P., 1995. Site-specific factor involved in the editing of the psbL mRNA in tobacco plastids. EMBO J. 14, 2951–2957.

Corneille, S., Lutz, K., Maliga, P., 2000. Conservation of RNA editing between rice and maize plastids: are most editing events dispensable? Mol. Gen. Genet. 264, 419–424.

Cornish, K., 2001a. Similarities and differences in rubber biochemistry among plant species. Phytochemistry 57, 1123–1134.

Cornish, K., 2001b. Similarities and differences in rubber biochemistry among plant species. Phytochemistry 57, 1123–1134.

Cronn, R., Liston, A., Parks, M., Gernandt, D.S., Shen, R., Mockler, T., 2008. Multiplex sequencing of plant chloroplast genomes using Solexa sequencing-by-synthesis technology. Nucleic Acids Res. 36.

Daniell, H., Khan, M.S., Allison, L., 2002. Milestones in chloroplast genetic engineering: an environmentally friendly era in biotechnology. Trends Plant Sci. 7, 84–91.

Daniell, H., Wurdack, K.J., Kanagaraj, A., Lee, S.B., Saski, C., Jansen, R.K., 2008. The complete nucleotide sequence of the cassava (Manihot esculenta) chloroplast genome and the evolution of atpF in Malpighiales: RNA editing and multiple losses of a group II intron. Theor. Appl. Genet. 116, 723–737.

Dao, V., Guenther, R., Malkiewicz, A., Nawrot, B., Sochacka, E., Kraszewski, A., Jankowska, J., Everett, K., Agris, P.F., 1994. Ribosome binding of DNA analogs of tRNA requires base modifications and supports the “extended anticodon”. Proc. Natl Acad. Sci. USA 91, 2125–2129.

Darling, A.C., Mau, B., Blattner, F.R., Perna, N.T., 2004. Mauve: multiple alignment of conserved genomic sequence with rearrangements. Genome Res. 14, 1394–1403.

Delannoy, E., Le Ret, M., Faivre-Nitschke, E., Estavillo, G.M., Bergdoll, M., Taylor, N.L., Pogson, B.J., Small, I., Imbault, P., Gualberto, J.M., 2009. Arabidopsis tRNA adenosine deaminase arginine edits the wobble nucleotide of chloroplast tRNAArg(ACG) and is essential for efficient chloroplast translation. Plant Cell 21, 2058–2071.

Dhingra, A., Folta, K.M., 2005. ASAP: amplification, sequencing & annotation of plastomes. BMC Genomics 6.

Edgar, R.C., 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797.

Freyer, R., Lopez, C., Maier, R.M., Martin, M., Sabater, B., Kossel, H., 1995. Editing of the chloroplast ndhB encoded transcript shows divergence between closely related members of the grass family (Poaceae). Plant Mol. Biol. 29, 679–684.

Gao, L., Yi, X., Yang, Y.X., Su, Y.J., Wang, T., 2009. Complete chloroplast genome sequence of a tree fern Alsophila spinulosa: insights into evolutionary changes in fern chloroplast genomes. BMC Evol. Biol. 9, 130.

Goremykin, V.V., Hirsch-Ernst, K.I., Wolfl, S., Hellwig, F.H., 2003. Analysis of the Amborella trichopoda chloroplast genome sequence suggests that amborella is not a basal angiosperm. Mol. Biol. Evol. 20, 1499–1505.

Goremykin, V.V., Hirsch-Ernst, K.I., Wolfl, S., Hellwig, F.H., 2004. The chloroplast genome of Nymphaea alba: whole-genome analyses and the problem of identifying the most basal angiosperm. Mol. Biol. Evol. 21, 1445–1454.

Graham, S.W., Olmstead, R.G., 2000. Utility of 17 chloroplast genes for inferring the phylogeny of the basal angiosperms. Am. J. Bot. 87, 1712–1730.

Gualberto, J.M., Weil, J.H., Grienenberger, J.M., 1990. Editing of the wheat coxIII transcript: evidence for twelve C to U and one U to C conversions and for sequence similarities around editing sites. Nucleic Acids Res. 18, 3771–3776.

Guo, X., Castillo-Ramirez, S., Gonzalez, V., Bustos, P., Fernandez-Vazquez, J.L., Santamaria, R.I., Arellano, J., Cevallos, M.A., Davila, G., 2007. Rapid evolutionary change of common bean (Phaseolus vulgaris L) plastome, and the genomic diversification of legume chloroplasts. BMC Genomics 8, 228.

Halter, C.P., Peeters, N.M., Hanson, M.R., 2004. RNA editing in ribosome-less plastids of iojap maize. Curr. Genet. 45, 331–337.

Hammani, K., Okuda, K., Tanz, S.K., Chateigner-Boutin, A.L., Shikanai, T., Small, I., 2009. A study of new Arabidopsis chloroplast RNA editing mutants reveals general features of editing factors and their target sites. Plant Cell 21, 3686–3699.

Hansen, D.R., Dastidar, S.G., Cai, Z., Penaflor, C., Kuehl, J.V., Boore, J.L., Jansen, R.K., 2007. Phylogenetic and evolutionary implications of complete chloroplast genome sequences of four early-diverging angiosperms: Buxus (Buxaceae), Chloranthus (Chloranthaceae), Dioscorea (Dioscoreaceae), and Illicium (Schisandraceae). Mol. Phylogenet. Evol. 45, 547–563.

Hayes, M.L., Hanson, M.R., 2007. Identification of a sequence motif critical for editing of a tobacco chloroplast transcript. RNA 13, 281–288.

Heinze, B., 2007. A database of PCR primers for the chloroplast genomes of higher plants. Plant Meth. 3, 4.

Hipkins, V.D., Marshall, K.A., Neale, D.B., Rottmann, W.H., Strauss, S.H., 1995. A mutation hotspot in the chloroplast genome of a conifer (Douglas-fir: Pseudotsuga) is caused by variability in the number of direct repeats derived from a partially duplicated tRNA gene. Curr. Genet. 27, 572–579.

Hiratsuka, J., Shimada, H., Whittier, R., Ishibashi, T., Sakamoto, M., Mori, M., Kondo, C., Honji, Y., Sun, C.R., Meng, B.Y., et al., 1989. The complete sequence of the rice (Oryza sativa) chloroplast genome: intermolecular recombination between distinct tRNA genes accounts for a major plastid DNA inversion during the evolution of the cereals. Mol. Gen. Genet. 217, 185–194.

Hirose, T., Sugiura, M., 2001. Involvement of a site-specific trans-acting factor and a common RNA-binding protein in the editing of chloroplast mRNAs: development of a chloroplast in vitro RNA editing system. EMBO J. 20, 1144–1152.

Hirose, T., Kusumegi, T., Tsudzuki, T., Sugiura, M., 1999. RNA editing sites in tobacco chloroplast transcripts: editing as a possible regulator of chloroplast RNA polymerase activity. Mol. Gen. Genet. 262, 462–467.

Hoch, B., Maier, R.M., Appel, K., Igloi, G.L., Kossel, H., 1991. Editing of a chloroplast mRNA by creation of an initiation codon. Nature 353, 178–180.

Huse, S.M., Huber, J.A., Morrison, H.G., Sogin, M.L., Welch, D.M., 2007. Accuracy and quality of massively parallel DNA pyrosequencing. Genome Biol. 8, R143.

Jansen, R.K., Raubeson, L.A., Boore, J.L., dePamphilis, C.W., Chumley, T.W., Haberle, R.C., Wyman, S.K., Alverson, A.J., Peery, R., Herman, S.J., Fourcade, H.M., Kuehl, J.V., McNeal, J.R., Leebens-Mack, J., Cui, L., 2005. Methods for obtaining and analyzing whole chloroplast genome sequences. Meth. Enzymol. 395, 348–384.

Jansen, R.K., Kaittanis, C., Saski, C., Lee, S.B., Tomkins, J., Alverson, A.J., Daniell, H., 2006. Phylogenetic analyses of Vitis (Vitaceae) based on complete chloroplast genome sequences: effects of taxon sampling and phylogenetic methods on resolving relationships among rosids. BMC Evol. Biol. 6, 32.

Kahlau, S., Aspinall, S., Gray, J.C., Bock, R., 2006. Sequence of the tomato chloroplast DNA and evolutionary comparison of solanaceous plastid genomes. J. Mol. Evol. 63, 194–207.

Karcher, D., Bock, R., 2009. Identification of the chloroplast adenosine-to-inosine tRNA editing enzyme. RNA 15, 1251–1257.

Kato, T., Kaneko, T., Sato, S., Nakamura, Y., Tabata, S., 2000. Complete structure of the chloroplast genome of a legume, Lotus japonicus. DNA Res. 7, 323–330.

Knoop, V., 2010. When you can't trust the DNA: RNA editing changes transcript sequences. Cell. Mol. Life Sci. 68, 567–586.

Ko, J.H., Chow, K.S., Han, K.H., 2003. Transcriptome analysis reveals novel features of the molecular events occurring in the laticifers of Hevea brasiliensis (para rubber tree). Plant Mol. Biol. 53, 479–492.

Kotera, E., Tasaka, M., Shikanai, T., 2005. A pentatricopeptide repeat protein is essential for RNA editing in chloroplasts. Nature 433, 326–330.

Kuroda, H., Suzuki, H., Kusumegi, T., Hirose, T., Yukawa, Y., Sugiura, M., 2007. Translation of psbC mRNAs starts from the downstream GUG, not the upstream AUG, and requires the extended Shine–Dalgarno sequence in tobacco chloroplasts. Plant Cell Physiol. 48, 1374–1378.

Kurtz, S., Schleiermacher, C., 1999. REPuter: fast computation of maximal repeats in complete genomes. Bioinformatics 15, 426–427.

Lee, S.B., Kaittanis, C., Jansen, R.K., Hostetler, J.B., Tallon, L.J., Town, C.D., Daniell, H., 2006. The complete chloroplast genome sequence of Gossypium hirsutum: organization and phylogenetic relationships to other angiosperms. BMC Genomics 7, 61.

Leebens-Mack, J., Raubeson, L.A., Cui, L., Kuehl, J.V., Fourcade, M.H., Chumley, T.W., Boore, J.L., Jansen, R.K., dePamphilis, C.W., 2005. Identifying the basal angiosperm node in chloroplast genome phylogenies: sampling one's way out of the Felsenstein zone. Mol. Biol. Evol. 22, 1948–1963.

Lenz, H., Rudinger, M., Volkmar, U., Fischer, S., Herres, S., Grewe, F., Knoop, V., 2009. Introducing the plant RNA editing prediction and analysis computer tool PREPACT and an update on RNA editing site nomenclature. Curr. Genet. 56, 189–201.

Lichtenthaler, H., 1999. The 1-deoxy-D-xylulose-5-phosphate pathway of isoprenoid biosynthesis in plants. Annu. Rev. Plant Physiol. Plant Mol. Biol. 50, 47–65.

Lichtenthaler, H.K., Schwender, J., Disch, A., Rohmer, M., 1997. Biosynthesis of isoprenoids in higher plant chloroplasts proceeds via a mevalonate-independent pathway. FEBS Lett. 400, 271–274.

Lidholm, J., Szmidt, A., Gustafsson, P., 1991. Duplication of the psbA gene in the chloroplast genome of two Pinus species. Mol. Gen. Genet. 226, 345–352.

Logacheva, M.D., Samigullin, T.H., Dhingra, A., Penin, A.A., 2008. Comparative chloroplast genomics and phylogenetics of Fagopyrum esculentum ssp. ancestrale – a wild ancestor of cultivated buckwheat. BMC Plant Biol. 8, 59.

Lurin, C., Andres, C., Aubourg, S., Bellaoui, M., Bitton, F., Bruyere, C., Caboche, M., Debast, C., Gualberto, J., Hoffmann, B., Lecharny, A., Le Ret, M., Martin-Magniette, M.L., Mireau, H., Peeters, N., Renou, J.P., Szurek, B., Taconnat, L., Small, I., 2004. Genome-wide analysis of Arabidopsis pentatricopeptide repeat proteins reveals their essential role in organelle biogenesis. Plant Cell 16, 2089–2103.

Maier, R.M., Neckermann, K., Igloi, G.L., Kossel, H., 1995. Complete sequence of the maize chloroplast genome: gene content, hotspots of divergence and fine tuning of genetic information by transcript editing. J. Mol. Biol. 251, 614–628.

Maliga, P., 2002. Engineering the plastid genome of higher plants. Curr. Opin. Plant Biol. 5, 164–172.

Maliga, P., 2004. Plastid transformation in higher plants. Annu. Rev. Plant Biol. 55, 289–313.

Margulies, M., Egholm, M., Altman, W.E., Attiya, S., Bader, J.S., Bemben, L.A., Berka, J., Braverman, M.S., Chen, Y.J., Chen, Z., Dewell, S.B., Du, L., Fierro, J.M., Gomes, X.V., Godwin, B.C., He, W., Helgesen, S., Ho, C.H., Irzyk, G.P., Jando, S.C., Alenquer, M.L., Jarvie, T.P., Jirage, K.B., Kim, J.B., Knight, J.R., Lanza, J.R., Leamon, J.H., Lefkowitz, S.M., Lei, M., Li, J., Lohman, K.L., Lu, H., Makhijani, V.B., McDade, K.E., McKenna, M.P., Myers, E.W., Nickerson, E., Nobile, J.R., Plant, R., Puc, B.P., Ronan, M.T., Roth, G.T., Sarkis, G.J., Simons, J.F., Simpson, J.W., Srinivasan, M., Tartaro, K.R., Tomasz, A., Vogt, K.A., Volkmer, G.A., Wang, S.H., Wang, Y., Weiner, M.P., Yu, P., Begley, R.F., Rothberg, J.M., 2005. Genome sequencing in microfabricated high-density picolitre reactors. Nature 437, 376–380.

Mathews, S., Donoghue, M.J., 1999. The root of angiosperm phylogeny inferred from duplicate phytochrome genes. Science 286, 947–950.

McCauley, D.E., 1992. The use of chloroplast DNA polymorphism in studies of gene flow in plants. Trends Ecol. Evol. 10, 198–202.

McGarvey, D.J., Croteau, R., 1995. Terpenoid metabolism. Plant Cell 7, 1015–1026.

Millen, R.S., Olmstead, R.G., Adams, K.L., Palmer, J.D., Lao, N.T., Heggie, L., Kavanagh, T.A., Hibberd, J.M., Gray, J.C., Morden, C.W., Calie, P.J., Jermiin, L.S., Wolfe, K.H., 2001. Many parallel losses of infA from chloroplast DNA during angiosperm evolution with multiple independent transfers to the nucleus. Plant Cell 13, 645–658.

Miyamoto, T., Obokata, J., Sugiura, M., 2002. Recognition of RNA editing sites is directed by unique proteins in chloroplasts: biochemical identification of cis-acting elements and trans-acting factors involved in RNA editing in tobacco and pea chloroplasts. Mol. Cell. Biol. 22, 6726–6734.

Moon, E., Kao, T.H., Wu, R., 1987. Rice chloroplast DNA molecules are heterogeneous as revealed by DNA sequences of a cluster of genes. Nucleic Acids Res. 15, 611–630.

Moore, M.J., Dhingra, A., Soltis, P.S., Shaw, R., Farmerie, W.G., Folta, K.M., Soltis, D.E., 2006. Rapid and accurate pyrosequencing of angiosperm plastid genomes. BMC Plant Biol. 6, 17.

Neale, D.B., Saghai-Maroof, M.A., Allard, R.W., Zhang, Q., Jorgensen, R., 1988. Chloroplast DNA diversity in populations of wild and cultivated barley. Genetics 120, 1105–1110.

Nickrent, D.L., Blarer, A., Qiu, Y.-L., Soltis, D.E., Soltis, P.S., Zanis, M., 2002. Molecular data place Hydnoraceae with Aristolochiaceae. Am. J. Bot. 89, 1809–1817.

Ohyama, K., Fukuzawa, H., Kohchi, T., Shirai, H., Sano, T., Sano, S., Umesono, K., Shiki, Y., Takeuchi, M., Chang, Z., Aota, S., Inokuchi, H., Ozeki, H., 1986. Chloroplast gene organization deduced from complete sequence of liverwort Marchantia polymorpha chloroplast DNA. Nature 322, 572–574.

Page, R.D., 1996. TreeView: an application to display phylogenetic trees on personal computers. Comput. Appl. Biosci. 12, 357–358.

Palmer, J.D., Osorio, B., Aldrich, J., Thompson, W.F., 1987. Chloroplast DNA evolution among legumes: loss of a large inverted repeat occurred prior to other sequence rearrangements. Curr. Genet. 11, 275–286.

Petit, R.J., Aguinagalde, I., de Beaulieu, J.L., Bittkau, C., Brewer, S., Cheddadi, R., Ennos, R., Fineschi, S., Grivet, D., Lascoux, M., Mohanty, A., Muller-Starck, G.M., Demesure-Musch, B., Palme, A., Martin, J.P., Rendell, S., Vendramin, G.G., 2003. Glacial refugia: hotspots but not melting pots of genetic diversity. Science 300, 1563–1565.

Petit, R.J., Duminil, J., Fineschi, S., Hampe, A., Salvini, D., Vendramin, G.G., 2005. Comparative organization of chloroplast, mitochondrial and nuclear diversity in plant populations. Mol. Ecol. 14, 689–701.

Pfitzinger, H., Weil, J.H., Pillay, D.T., Guillemaut, P., 1990. Codon recognition mechanisms in plant chloroplasts. Plant Mol. Biol. 14, 805–814.

Provan, J., Powell, W., Hollingsworth, P.M., 2001. Chloroplast microsatellites: new tools for studies in plant ecology and evolution. Trends Ecol. Evol. 16, 142–147.

Raubeson, L.A., Peery, R., Chumley, T.W., Dziubek, C., Fourcade, H.M., Boore, J.L., Jansen, R.K., 2007. Comparative chloroplast genomics: analyses including new sequences from the angiosperms Nuphar advena and Ranunculus macranthus. BMC Genomics 8, 174.

Rochaix, J.D., Kuchka, M., Mayfield, S., Schirmer-Rahire, M., Girard-Bascou, J., Bennoun, P., 1989. Nuclear and chloroplast mutations affect the synthesis or stability of the chloroplast psbC gene product in Chlamydomonas reinhardtii. EMBO J. 8, 1013–1021.

Ruhlman, T., Lee, S.B., Jansen, R.K., Hostetler, J.B., Tallon, L.J., Town, C.D., Daniell, H., 2006. Complete plastid genome sequence of Daucus carota: implications for biotechnology and phylogeny of angiosperms. BMC Genomics 7.

Sasaki, T., Yukawa, Y., Miyamoto, T., Obokata, J., Sugiura, M., 2003. Identification of RNA editing sites in chloroplast transcripts from the maternal and paternal progenitors of tobacco (Nicotiana tabacum): comparative analysis shows the involvement of distinct trans-factors for ndhB editing. Mol. Biol. Evol. 20, 1028–1035.

Sato, S., Nakamura, Y., Kaneko, T., Asamizu, E., Tabata, S., 1999. Complete structure of the chloroplast genome of Arabidopsis thaliana. DNA Res. 6, 283–290.

Schmitz-Linneweber, C., Maier, R.M., Alcaraz, J.P., Cottet, A., Herrmann, R.G., Mache, R., 2001. The plastid chromosome of spinach (Spinacia oleracea): complete nucleotide sequence and gene organization. Plant Mol. Biol. 45, 307–315.

Schmitz-Linneweber, C., Regel, R., Du, T.G., Hupfer, H., Herrmann, R.G., Maier, R.M., 2002. The plastid chromosome of Atropa belladonna and its comparison with that of Nicotiana tabacum: the role of RNA editing in generating divergence in the process of plant speciation. Mol. Biol. Evol. 19, 1602–1612.

Schuster, W., Hiesel, R., Wissinger, B., Brennicke, A., 1990. RNA editing in the cytochrome b locus of the higher plant Oenothera berteriana includes a U-to-C transition. Mol. Cell. Biol. 10, 2428–2431.

Shimada, H., Sugiura, M., 1991. Fine structural features of the chloroplast genome: comparison of the sequenced chloroplast genomes. Nucleic Acids Res. 19, 983–995.

Stamatakis, A., 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22, 2688–2690.

Steane, D.A., 2005. Complete nucleotide sequence of the chloroplast genome from the Tasmanian blue gum, Eucalyptus globulus (Myrtaceae). DNA Res. 12, 215–220.

Steinhauser, S., Beckert, S., Capesius, I., Malek, O., Knoop, V., 1999. Plant mitochondrialRNA editing. J. Mol. Evol. 48, 303–312.

Stern, D.B., Lonsdale, D.M., 1982. Mitochondrial and chloroplast genomes of maize havea 12-kilobase DNA sequence in common. Nature 299, 698–702.

Stern, D.B., Palmer, J.D., 1984. Extensive and widespread homologies between mitochondrial DNA and chloroplast DNA in plants. Proc. Natl Acad. Sci. USA 81, 1946–1950.

Swofford, D.L., 2002. PAUP: Phylogenetic Analysis Using Parsimony version 4.0b. Sinauer Associates, Sunderland, Massachusetts.

Tangphatsornruang, S., Birch-Machin, I., Newell, C.A., Gray, J., 2010a. The effect of different 3′ untranslated regions on the accumulation and stability of transcripts of a gfp transgene in chloroplasts of transplastomic tobacco. Plant Mol. Biol. doi:10.1007/s11103-010-9689-1 (Epub).

Tangphatsornruang, S., Sangsrakru, D., Chanprasert, J., Uthaipaisanwong, P., Yoocha, T., Jomchai, N., Tragoonrung, S., 2010b. The chloroplast genome sequence of mungbean (Vigna radiata) determined by high-throughput pyrosequencing: structural organization and phylogenetic relationships. DNA Res. 17, 11–22.

Tillich, M., Funk, H.T., Schmitz-Linneweber, C., Poltnigg, P., Sabater, B., Martin, M., Maier, R.M., 2005. Editing of plastid RNA in Arabidopsis thaliana ecotypes. Plant J. 43, 708–715.

Vangerow, S., Teerkorn, T., Knoop, V., 1999. Phylogenetic information in the mitochondrial nad5 gene of pteridophytes: RNA editing and intron sequences. Plant Biol. 1, 235–243.

Wakasugi, T., Tsudzuki, J., Ito, S., Nakashima, K., Tsudzuki, T., Sugiura, M., 1994. Loss of all ndh genes as determined by sequencing the entire chloroplast genome of the black pine Pinus thunbergii. Proc. Natl Acad. Sci. USA 91, 9794–9798.

Wakasugi, T., Hirose, T., Horihata, M., Tsudzuki, T., Kossel, H., Sugiura, M., 1996. Creation of a novel protein-coding region at the RNA level in black pine chloroplasts: the pattern of RNA editing in the gymnosperm chloroplast is different from that in angiosperms. Proc. Natl Acad. Sci. USA 93, 8766–8770.

Wojciechowski, M.F., Lavin, M., Sanderson, M.J., 2004. A phylogeny of legume (Leguminosae) based on analysis of the plastid matK gene resolves many well-supported subclades within the family. Am. J. Bot. 91, 1846–1862.

Wolfe, K.H., Morden, C.W., Palmer, J.D., 1991. Ins and outs of plastid genome evolution. Curr. Opin. Genet. Dev. 1, 523–529.

Wyman, S.K., Jansen, R.K., Boore, J.L., 2004. Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20, 3252–3255.

Yoshinaga, K., Iinuma, H., Masuzawa, T., Ueda, K., 1996. Extensive RNA editing of U to C in addition to C to U substitution in the rbcL transcripts of hornwort chloroplasts and the origin of RNA editing in green plants. Nucleic Acids Res. 24, 1008–1014.

Yukawa, M., Tsudzuki, T., Sugiura, M., 2005. The 2005 version of the chloroplast DNA sequence from tobacco (Nicotiana tabacum). Plant Mol. Biol. Rep. 23, 1–7.

Zanis, M.J., Soltis, D.E., Soltis, P.S., Mathews, S., Donoghue, M.J., 2002. The root of the angiosperms revisited. Proc. Natl Acad. Sci. USA 99, 6848–6853.