Phylogenetic analysis of the nuclear alcohol dehydrogenase (Adh) gene family in Carex section...
Molecular Phylogenetics and Evolution 33 (2004) 671–686
www.elsevier.com/locate/ympev
Phylogenetic analysis of the nuclear alcohol dehydrogenase (Adh) gene family in Carex section Acrocystis (Cyperaceae)
and combined analyses of Adh and nuclear ribosomal ITS and ETS sequences for inferring species relationships
Eric H. Roalson a,*, Elizabeth A. Friar b
a School of Biological Sciences and Center for Integrated Biotechnology, Washington State University, Pullman, WA 99164-4236, USA
b Rancho Santa Ana Botanic Garden, 1500 N. College Ave., Claremont, CA 91711, USA
Received 2 February 2004; revised 20 July 2004
Available online 25 September 2004
Abstract
We analyzed sequence variation for the alcohol dehydrogenase (Adh) gene family in Carex section Acrocystis (Cyperaceae) to
reconstruct Adh gene trees for Acrocystis species and to characterize the structure of the Adh gene family in Carex. Two Adh loci
were included with ITS and ETS sequences in a combined Bayesian inference analysis of Carex section Acrocystis to gain a better
understanding of species relationships in the section. In addition, we comment on how the results presented here contribute to our
knowledge of the birth-death process of the Adh gene family in angiosperms. It appears that the structure of the Adh gene family in
Carex is complex with possibly six loci present in the gene family. Additionally, variation among Acrocystis species within loci is
quite low, and there is little phylogenetic resolution in the individual datasets. Bayesian inference analysis of the combined ITS,
ETS, Adh1, and Adh2 datasets resulted in a moderately well-supported phylogenetic hypothesis of relationships in the section which
is discussed in relation to previous hypotheses of relationships.
© 2004 Elsevier Inc. All rights reserved.
Keywords: Acrocystis; Adh; Alcohol dehydrogenase; Carex; Cyperaceae; ETS; Gene family evolution; ITS
1. Introduction
The genus Carex L. (Cyperaceae), with at least 2000 species (Reznicek, 1990), is one of the largest genera in
the world. The genus is relatively poorly known phylogenetically, with few studies and a general lack of
resolution among phylogenetic studies to date (Roalson et al., 2001; Starr et al., 1999; Starr et al., 2003;
Yen and Olmstead, 2000). Carex taxonomy is complex with four subgenera and more than 70 sections recognized
(Reznicek, 1990). Phylogenetic studies have called into question the monophyly of many subgenera and
sections (Roalson et al., 2001).
1055-7903/$ - see front matter © 2004 Elsevier Inc. All rights reserved.
doi:10.1016/j.ympev.2004.08.005
* Corresponding author. Fax: +509 335 3184.
E-mail address: [email protected] (E.H. Roalson).
Carex section Acrocystis consists of 45–49 taxa rang-
ing across North America and Eurasia, with one species
in Andean South America (Mackenzie, 1935). Previous
studies have provided evidence that a portion of this section forms a clade within Carex
and includes all of the North American members and a few of the Eurasian
members of Acrocystis, although not all non-North
American species in the section were sampled (Roalson
et al., 2001). Relationships within this clade of core-
Acrocystis have been examined using nrDNA internal
transcribed spacer (ITS) and external transcribed spacer
(ETS) sequences (Roalson and Friar, 2004). The ITS/
ETS sequence topology of relationships in Carex section
Acrocystis supports a grade of Eurasian species leading
to a grade of western North American species, with
two clades of eastern North American species nested
within the western North American grade (Roalson
and Friar, 2004).
In order to explore phylogenetic relationships in the core-Acrocystis clade in more detail, we present results
on the phylogenetic utility of the nuclear alcohol dehy-
drogenase (Adh) gene family. The Adh gene family re-
cently has been suggested to be useful for
phylogenetic estimation in a number of organisms
(Charlesworth et al., 1998; Gaut et al., 1999; Sang
et al., 1997; Small et al., 1998). Adh, in addition to several other low-copy-number nuclear gene families
(phytochrome B, Mathews et al., 2000; granule-bound
starch synthase, Mason-Gamer et al., 1998; Miller
et al., 1999; pistillata, Bailey and Doyle, 1999), has
been useful for examining divergence among closely re-
lated species where nuclear ribosomal spacers (ITS/
ETS) and chloroplast spacers (e.g., trnL-F) do not pro-
vide sufficient variation for phylogenetic reconstruction
(Sang et al., 1997; Small et al., 1998).
Adh genes have been explored in detail in plants with
numerous studies either characterizing the gene family
in a particular species (Gaut and Clegg, 1993; Millar
and Dennis, 1996; Mitchell et al., 1989; Miyashita,
2001; Perry and Furnier, 1996; Small and Wendel,
2000a,b), or utilizing it for phylogenetic inference
(Charlesworth et al., 1998; Gaut et al., 1999; Koch
et al., 2000; Sang et al., 1997; Small et al., 1998). A majority of flowering plants has been found to have
two or three Adh loci, each composed of 10 exons and
nine introns (Sang et al., 1997; and references therein),
although more recent studies (Small and Wendel,
2000b) have begun to suggest that the gene family struc-
ture of Adh may be more complex than previously
thought. While the process of gene duplication and deletion in the genome is relatively well known
(for review, see Li, 1997), the rate of this birth-death process across
organisms has not been studied nearly as much. Differ-
ent gene families in the angiosperms have shown differ-
ent patterns of duplication and loss of gene family
members, from long term persistence of gene family
members in functionally conserved genes (catalase gene
family: Klotz et al., 1997; CONSTANS LIKE gene family: Lagercrantz and Axelsson, 2000)
to quick turnover of gene family copies (Pin2 gene family: Barta et al.,
2002) and concerted evolution homogenizing tandem re-
peats, with persistence of alternated copies rare beyond
the species or species complex level (nuclear ribosomal
DNA tandem repeats: Arnheim et al., 1980; Dover,
1982). For Adh, it has been suggested that there is a
“slow flux” of gene duplication and loss across angiosperms (Clegg et al., 1997).
Here, Carex alcohol dehydrogenase gene family
structure is explored and sequences of Adh are used to
investigate relationships in Carex section Acrocystis.
This study has three primary goals: (1) to characterize
the structure of the Adh gene family in Carex; (2) to
reconstruct Adh gene trees for Acrocystis species to ex-
plore the phylogenetic utility of this region of DNA;
and (3) to analyze a combined ITS, ETS, Adh1, and Adh2 dataset to further explore phylogenetic
relationships in the Carex section Acrocystis lineage. In addition,
we comment on how the results presented here
contribute to our knowledge of the birth-death process
of gene family evolution in the Adh gene family in
angiosperms.
2. Materials and methods
2.1. Taxon sampling
Total DNAs were isolated from leaf tissue from
either live plants or dried herbarium specimens using a
modification of the CTAB method (Doyle and Doyle,
1987; Roalson and Friar, 2000). Samples of as many taxa of Carex section Acrocystis as possible were
included in this study. Forty-five accessions of 24 “core-Acrocystis” taxa
were variously included in the different
individual and combined analyses. Voucher information
and GenBank accessions for all samples are listed in
Table 1.
2.2. Amplification, cloning, and sequencing
Preliminary studies focused on an approximately
2000 bp region of Adh covering exon 2 to exon 9 that
was amplified using primers Adh2f and Adh9r (Fig. 1;
Table 2). The primers were developed using GenBank
Poaceae sequences (X02915, AF050457, X04049,
AF044307, X16296, X12734, X12733, and AF050456).
All PCR amplifications were hot-started, with 1 unit of Taq DNA polymerase being added after 5 min of
denaturing at 96 °C. Thirty-five cycles of amplification
were carried out in a Stratagene Robocycler 96, with 1
min denaturing at 96 °C, 1 min annealing at 56 °C, and 2 min extension at 72 °C. A final 10 min extension at
72 °C followed amplification. The products were verified
with 0.8% agarose gel in TBE buffer.
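As a rough check on the cycling program described above, the following sketch adds up the programmed step times. The step durations are taken from the text; the function and variable names are illustrative only, and ramp or block-transfer time on the thermocycler is not included.

```python
# Programmed step times from the text, in minutes.
# Names are hypothetical; ramp/transfer time is ignored.
HOT_START_DENATURE_MIN = 5
CYCLES = 35
STEP_MINUTES = {"denature_96C": 1, "anneal_56C": 1, "extend_72C": 2}
FINAL_EXTENSION_MIN = 10


def total_run_minutes() -> int:
    """Sum the hot-start denaturation, all cycles, and the final extension."""
    per_cycle = sum(STEP_MINUTES.values())
    return HOT_START_DENATURE_MIN + CYCLES * per_cycle + FINAL_EXTENSION_MIN


print(total_run_minutes())  # 5 + 35 * 4 + 10 = 155
```

Under these assumptions the program holds samples at temperature for about 155 min.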
PCR products were cloned using the PCR-Script Amp Cloning Kit (Stratagene) and either miniprepped
using the PERFECTprep Plasmid DNA Preparation
Kit (5 Prime-3 Prime, Inc.) followed by direct sequenc-
ing from the plasmid, or PCR amplified directly from
colonies followed by purification and cycle sequencing.
Sequencing was performed using an Applied Biosystems
Model 373A Automated DNA Sequencing System. Direct cycle-sequencing of purified template DNAs
followed the manufacturer's specifications, using the PRISM
DyeDeoxy Terminator Kit (Perkin–Elmer).
Table 1
Taxa of Carex (abbreviated “C.”) section Acrocystis and outgroups sampled in the Adh and combined analyses and their voucher information
Taxon | Abbreviation | Location and Voucher | GenBank Accession Nos. (Adh) | GenBank Accession Nos. (ITS) | GenBank Accession Nos. (ETS)
C. albicans Willd. ex Spreng. var. albicans | 8150 | West Virginia, Grant Co.; Reznicek 8150 (BRCH) | *AY689017 (8150.2)^1; *AY689016 (8150.1)^2 | *AY325479 | *AY325454
 | 1200 | Missouri, Barry Co.; Morse 1200 | AY689015 (1200.1)^1 | X | X
C. albicans Willd. ex Spreng. var. australis (L.H.Bailey) J. Rettig | 1455 | Mississippi, Washington Co.; Rettig 1455 (TAES) | AY689018 (1455.1)^1 | X | X
C. brainerdii Mackenzie | 8588 | California, Shasta Co.; Vincent 8588 | *AY689014 (8588.3)^2 | *AY325485 | *AY325460
C. brevicaulis Mackenzie | ss | Oregon, Lincoln Co.; Wilson s.n. (OSC) | *AY688968 (ss5)^1; *AY688967 (ss3)^2 | *AY325471 | *AY325446
C. communis L.H.Bailey var. communis | 1334 | Canada, Quebec; Roalson 1334 | *AY688988 (1334.1)^1; AY688966 (1334.3)^4 | *AY325469 | *AY325444
 | 1344 | Michigan, Alcona Co.; Roalson 1344 | AY688993 (1344.3)^1; *AY688992 (1344.1)^2 | X | X
C. deflexa Hornem. var. boottii L.H.Bailey | cl | Oregon, Klamath Co.; Kuykendall s.n. (OSC) | *AY689020 (cl.3)^1; *AY689013 (cl.2)^2 | *AY686719 | *AY686724
C. deflexa Hornem. var. deflexa | 9701 | Canada, Alberta; Ford 9701 (WIN) | *AY689000 (9701.1)^1; *AY689001 (9701.3)^2 | *AY686720 | X
C. flacca Schreb. subsp. serrulata (Biv.) Greuter [og] | | Greece; Hartvig & Franzen 8709 | X | *AF284982 | *AY325429
C. floridana Schw. | 1288 | Texas, Houston Co.; Roalson 1288 | *AY688991 (1288.3)^1 | X | X
 | 6261 | Texas, Jasper Co.; Jones 6261 (BRCH) | X | *AY325482 | *AY325457
C. geophila Mackenzie | 1408 | Arizona, Cochise Co.; Roalson 1408 | AY688977 (1408.1)^1 | X | X
 | 1409 | Arizona, Cochise Co.; Roalson 1409 | *AY688999 (1409.3)^1; *AY688998 (1409.1)^2 | *AY325474 | *AY325449
C. globosa Boott | 1347 | California, Monterey Co.; Roalson 1347 | AY688984 (1347.2)^1; AY688981 (1347.6)^2 | X | X
 | 12316 | California, Marin Co.; Zika 12316 (OSC) | *AY688979 (12316.12)^1; *AY688980 (12316.13)^2 | *AY325487 | *AY325462
C. inops L.H.Bailey subsp. inops | lac | Oregon, Klamath Co.; Kuykendall s.n. (OSC) | *AY688983 (lac.2)^1 | *AY686721 | X
 | 4237 | Oregon, Deschutes Co.; Halse 4237 (BRCH) | *AY689006 (4237.1)^2 | X | X
C. laxiflora Lam. [og] | 1291 | Texas, Upshur Co.; Roalson 1291 | AY688971 (1291.3)^2; AY688959 (1291.1)^6 | X | X
C. lucorum Willd. ex Link var. lucorum | 1327 | Pennsylvania, Sullivan Co.; Roalson 1327 | *AY688969 (1327.1)^1; *AY688970 (1327.2)^2 | X | X
 | 1336 | Ohio, Lucas Co.; Roalson 1336 | X | *AY325464 | *AY325436
C. mandshurica Meinsh. [og] | | Korea, Kangwon Province; Tyson 5044 (POM) | X | *AF285045 | *AY325432
C. novae-angliae Schw. | 1333 | Canada, Quebec; Roalson 1333 | *AY689021 (1333.1)^1 | *AY325475 | *AY325450
C. oxyandra Kudo | 15896 | Japan; Ikeda 15896 (OKAY) | *AY689003 (15896.3)^1; AY688960 (15896.2)^3; AY688961 (15896.1)^5 | *AF285061 | *AY325443
C. pauciflora Lightf. [og] | s.n. | Austria; Polatschek s.n. | AY689022 (s.n.1)^2 | X | X
C. peckii Howe | 89-178 | Minnesota, Clearwater Co.; McNeilus 89-178 (BRCH) | *AY689007 (178.1)^1 | *AY325483 | *AY325458
C. pensylvanica Lam. | 1330 | Canada, Quebec; Roalson 1330 | *AY688995 (1330.3)^2; AY688994 (1330.1)^2 | X | X
 | 1341 | Michigan, Iosco Co.; Roalson 1341 | X | *AY686722 | X
 | 8142 | West Virginia, Pendleton Co.; Reznicek 8142 (BRCH) | *AY689004 (8142.1)^1 | X | X
C. pilulifera L. | 805 | Sweden, Uppland; Alm 805 | *AY689023 (805.4)^1 | *AF284975 | *AY325438
C. rossii Boott | 1411 | California, San Bernardino Co.; Roalson 1411 | AY688973 (1411.1)^2 | X | X
 | 6864 | Oregon, Lane Co.; Wilson 6864 (OSC) | AY689011 (6864.1)^1 | X | X
 | 8166 | California, Siskiyou Co.; Wilson 8166 (OSC) | AY688974 (8166.9)^2 | X | X
C. rossii Boott [1] | cw | Oregon, Benton Co.; Wilson s.n. (OSC) | *AY689009 (cw.2)^1; *AY689010 (cw.4)^2 | *AY325463 | *AY325435
C. rossii Boott [2] | 8101b | California, Siskiyou Co.; Wilson 8101b (OSC) | AY688976 (8101b.8)^2 | *AY325473 | *AY325448
 | 8107 | California, Siskiyou Co.; Wilson 8107 (OSC) | *AY688986 (8107.1)^1; *AY688989 (8107.2)^2 | X | X
C. serpenticola P.Zika | 6803 | California, Del Norte Co.; Wilson 6803 (OSC) | *AY688978 (6803.15)^2 | X | X
 | 7631 | Oregon, Curry Co.; Wilson 7631 (OSC) | *AY688982 (7631.6)^1 | X | X
 | 12319 | California, Marin Co.; Zika 12319 (OSC) | X | *AY325476 | *AY325451
C. sp. nov. | 3011 | North Carolina, Buncombe Co.; Rothrock 3011 | *AY689002 (3011.3)^1; AY688963 (3011.2)^3; AY688964 (3011.1)^5 | *AY325467 | *AY325441
C. spissa L.H.Bailey [og] | | California, San Diego Co.; Tilforth & Wisura 2140 | X | *AF285040 | *AY325431
C. tonsa (Fern.) E.P.Bicknell var. rugosperma (Mackenzie) Crins | 1332 | Canada, Quebec; Roalson 1332 | AY688996 (1332.1)^1; AY688997 (1332.3)^2; AY688962 (1332.2)^5 | X | X
 | 1340 | Ohio, Lucas Co.; Roalson 1340 | *AY688987 (1340.1)^1; *AY688990 (1340.3)^2 | *AY686723 | X
C. tonsa (Fern.) E.P.Bicknell var. tonsa | 7047 | Michigan, Van Buren Co.; Jones 7047 (BRCH) | AY689008 (7047.3)^1 | X | X
C. turbinata Liebm. [1] | 1398 | Mexico, Chihuahua; Laferriere 1398 (ARIZ) | *AY689019 (1398.1)^2 | *AY325465 | *AY325439
C. turbinata Liebm. [2] (“leucodonta”-type) | 1224 | Arizona, Santa Cruz Co.; Roalson 1224 | *AY688975 (1224.2)^2 | *AF284973 | *AY325434
 | 1383 | Mexico, Chihuahua; Roalson 1383 | AY689012 (1383.3)^2 | X | X
C. umbellata Schkuhr ex Willd. [1] | 8963 | Michigan, Washtenaw Co.; Jones 8963 (BRCH) | *AY689005 (8963.2)^1 | *AY325486 | *AY325461
C. umbellata Schkuhr ex Willd. [2] (“microrhyncha”-type) | 1307 | Texas, Burleson Co.; Roalson 1307 | X | *AY325472 | *AY325447
 | 1308 | Texas, Brazos Co.; Roalson 1308 | *AY688985 (1308.1)^1; AY688965 (1308.6)^4 | X | X
C. umbrosa Host subsp. sabynensis Less. ex Kunth [og] | | U.S.S.R. [Russia], Siberia; Murray et al. 344 | X | *AF285042 | *AY325430
C. vulpinoidea Michx. [og] | 1294 | Texas, Morris Co.; Roalson 1294 | AY689024 (1294.1)^1; AY688972 (1294.6)^2 | X | X
C. wahuensis C.A.Mey. subsp. robusta (Fr. & Sav.) T.Koyama [og] | | Japan, Shizuoka Pref.; Amano s.n. | X | *AF285023 | *AY325433
Collections from USA unless otherwise noted. Specimens are deposited in RSA unless otherwise noted. The collection named “sp. nov.” is an individual that does not fit well in current species
circumscriptions and may be an undescribed species. Taxa used as outgroups in the various analyses are denoted by “[og]” following the species authority. Asterisks denote those sequences used in
the combined data Bayesian inference analysis. Numbers in parentheses following GenBank accession numbers are clone numbers from the figures and text, and superscripts (shown here as ^n) following the clone
numbers refer to Adh locus number as described in the figures and text.
Table 2
Primers used for amplification and sequencing of Adh
Primer name | Sequence
Adh2f | 5′-CKG CBG TGG CVT GGG AGG CMG GSA AGC C-3′
Adh2-1f | 5′-GAY GTC TWC TTY TGG GAA GCY A-3′
Adh3f | 5′-CCW CGG ATC TTT GGT CAT GA-3′
Adh3if | 5′-CTA TAC CCT TBG CTC TTC CA-3′
Adh3ir | 5′-TGG AAG AGC VAA GGG TAT AG-3′
Adh4xf | 5′-CAG AGG AGA GCA ACA TGT GTG A-3′
Adh4f | 5′-TCC CGC TTC TCC ATC AAT GGC A-3′
Adh4r | 5′-TGC CAT TGA TGG AGA AGC GGG A-3′
Adh5f | 5′-TGG CAA TTT TTG GTC TGG GAG C-3′
Adh5r | 5′-GCT CCC AGA CCA AAA ATT GCC A-3′
Adh7r | 5′-YTC CGT GCA RCC AAA TTT CT-3′
Adh8xr | 5′-CTC CAT TAG TCA TCT CAG CAA GA-3′
Adh9-1r | 5′-YAC VCC GAC CAA AAC TGC CA-3′
Adh9r | 5′-AAG TTC ATB GGR TGR GTC KTG AA-3′
Refer to Fig. 1 for their annealing sites on the Adh gene.
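Several primers in Table 2 contain IUPAC ambiguity codes, and some forward/reverse pairs appear to be reverse complements of each other. The sketch below is a minimal illustration, not part of the original methods: it reverse-complements a primer using the standard IUPAC complement table and counts how many distinct oligonucleotides a degenerate primer represents. The helper names are hypothetical; the sequences are copied from Table 2.

```python
import math

# Standard IUPAC nucleotide complements (e.g. R = A/G pairs with Y = C/T).
IUPAC_COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")
# Number of bases each IUPAC symbol can stand for.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3, "N": 4,
}

# A subset of Table 2, written 5' to 3' with the table's triplet spacing.
PRIMERS = {
    "Adh2f": "CKG CBG TGG CVT GGG AGG CMG GSA AGC C",
    "Adh3if": "CTA TAC CCT TBG CTC TTC CA",
    "Adh3ir": "TGG AAG AGC VAA GGG TAT AG",
    "Adh4f": "TCC CGC TTC TCC ATC AAT GGC A",
    "Adh4r": "TGC CAT TGA TGG AGA AGC GGG A",
    "Adh5f": "TGG CAA TTT TTG GTC TGG GAG C",
    "Adh5r": "GCT CCC AGA CCA AAA ATT GCC A",
}


def clean(primer: str) -> str:
    """Remove the spacing used in the table and upper-case the sequence."""
    return primer.replace(" ", "").upper()


def reverse_complement(primer: str) -> str:
    """Complement each symbol, then reverse, keeping ambiguity codes valid."""
    return clean(primer).translate(IUPAC_COMPLEMENT)[::-1]


def degeneracy(primer: str) -> int:
    """Number of distinct unambiguous oligos represented by the primer."""
    return math.prod(IUPAC_DEGENERACY[base] for base in clean(primer))


for fwd, rev in [("Adh3if", "Adh3ir"), ("Adh4f", "Adh4r"), ("Adh5f", "Adh5r")]:
    print(fwd, rev, reverse_complement(PRIMERS[fwd]) == clean(PRIMERS[rev]))
print("Adh2f degeneracy:", degeneracy(PRIMERS["Adh2f"]))
```

Run on these sequences, the three internal pairs check out as exact reverse complements, which is consistent with their use for sequencing both strands from the same site.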
Fig. 1. Diagram of the Adh gene in Carex. Exons and introns are abbreviated “ex” and “in,” respectively. The lines below the gene represent the two
regions sequenced, the longer fragment being approximately 2000 bp and the shorter approximately 1000 bp. Arrows indicate the locations and
directions of the PCR and sequencing primers. Primer sequences are listed in Table 2. Adh2f and Adh9r were used to amplify the 2000 bp fragment
and Adh4xf and Adh8xr were used to amplify the 1000 bp fragment.
Due to difficulties in amplification of the 2000+ bp fragment and problems placing internal primers within
the large third intron for complete coverage, new primers located in exon 4 (Adh4xf) and exon 8 (Adh8xr) were
used for the bulk of the samples in these analyses (Fig. 1; Table 2). These fragments were treated as described
above for the longer fragments. The large fragments were sequenced using the plasmid primers T3 and T7
and the internal primers Adh2-1f, Adh3f, Adh3if, Adh3ir, Adh4f, Adh4r, Adh5f, Adh5r, Adh7r, and Adh9-1r
(Table 2). The smaller fragment was sequenced using the plasmid primers T3 and T7 and the internal primers
Adh5f, Adh5r, and Adh7r. For each individual the entire cloned fragment was sequenced in three to nine
clones. In order to avoid having a large percentage of missing data, all sequences were truncated to include
only the exon 4 to exon 8 sequence region.
Sequences were edited using Sequencher 3.0 (Gene Codes Corporation, Inc.) and aligned manually.
Sequences obtained in this study have been assigned GenBank accession numbers as listed in Table 1.
2.3. Phylogenetic analyses
Coding regions of Adh sequences from throughout
the angiosperms and pines, including a subset of the
Carex Adh sequences, were analyzed using maximum
parsimony (MP), as implemented in PAUP*4.0b10
(Swofford, 2001), to test how the Carex sequences com-
pare to other sequenced angiosperms and infer the root
of the Carex sequences within a broader evolutionary
context. This tree was rooted using mammal and bird Adh sequences and a Lycopersicon esculentum
short-chain dehydrogenase (samples and GenBank accessions
in Table 3). This analysis utilized heuristic searches
(ACCTRAN; gaps treated as missing; starting trees ob-
tained via stepwise addition; addition sequence random
with 100 replicates; TBR branch-swapping; STEEPEST
DESCENT). Clade support was estimated using 100 heuristic
bootstrap (bs) replicates (10 random addition cycles per replicate, 10 trees saved from each addition
cycle, TBR branch swapping, STEEPEST DESCENT;
Felsenstein, 1985; Hillis and Bull, 1993).
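The bootstrap procedure referred to above resamples alignment columns with replacement and asks how often a clade is recovered. The sketch below illustrates only the resampling logic under simplifying assumptions: `clade_found` is a hypothetical callable standing in for a full heuristic parsimony search (which PAUP* performs in the actual analysis), and the three-sequence alignment is a toy example, not data from this study.

```python
import random


def bootstrap_columns(alignment, rng):
    """Build one pseudoreplicate by sampling alignment columns with replacement."""
    n_sites = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}


def bootstrap_support(alignment, clade_found, n_reps=100, seed=0):
    """Percentage of pseudoreplicates in which clade_found(replicate) is true."""
    rng = random.Random(seed)
    hits = sum(bool(clade_found(bootstrap_columns(alignment, rng)))
               for _ in range(n_reps))
    return 100.0 * hits / n_reps


def mismatches(a, b):
    """Number of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def a_and_b_closest(aln):
    """Toy stand-in for a tree search: are A and B each other's nearest sequence?"""
    ab = mismatches(aln["A"], aln["B"])
    return ab < min(mismatches(aln["A"], aln["C"]), mismatches(aln["B"], aln["C"]))


toy = {"A": "ACGTACGTAC", "B": "ACGTACGTTC", "C": "TCGAACCTTG"}
print(bootstrap_support(toy, a_and_b_closest, n_reps=100))
```

The 100 replicates used here mirror the number reported in the text; the seed is fixed only so that repeated runs of the sketch give the same value.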
Relationships among all sequences of Carex Adh
were also inferred using MP (Swofford, 2001). Rooting
was based on the results of the angiosperm coding se-
quence analysis. This analysis utilized heuristic searches
(ACCTRAN; gaps treated as missing; starting trees obtained via stepwise addition; addition sequence random
with 1000 replicates; TBR branch-swapping; steepest descent).
Due to the large number of trees of equal length,
the number of trees saved per replicate was set to 500.
This allows a more thorough search of tree space in
the analysis given a limited amount of computer mem-
ory. Clade support was estimated using 100 heuristic
bootstrap replicates (10 random addition cycles per replicate, 10 trees saved from each addition cycle, TBR
branch swapping, STEEPEST DESCENT; Felsenstein,
1985; Hillis and Bull, 1993). Sequences considered as
outgroups to the Acrocystis sequences in this analysis
are the accessions of Carex laxiflora, C. pauciflora, and
C. vulpinoidea, all members of other subgenera or sec-
tions of Carex (Roalson et al., 2001).
Bayesian inference analysis of a combined ITS/ETS/Adh1/Adh2 Carex section Acrocystis dataset was per-
formed using MrBayes v.3.0 (Huelsenbeck and Ron-
quist, 2001). The four datasets of 30 taxa in this
analysis include some taxa with partial datasets. Missing
data are: ITS, none; ETS, Carex deflexa var. deflexa, C.
inops subsp. inops, C. pensylvanica, and C. tonsa var.
Table 3
GenBank accessions used in the angiosperm Adh coding sequence analyses
Family Taxa, GenBank Accession Nos.
Arecaceae Calamus usitatus, U58363; Phoenix reclinata A, U58362; Washingtonia robusta A, U65973;
Washingtonia robusta B, U65972.
Asteraceae Helianthus annuus, AX146927.
Brassicaceae Arabidopsis griffithiana, AB015504; Arabidopsis himalaica, AB015503; Arabidopsis korshinskyi, AB015505;
Arabidopsis suecica, AB015507; Arabidopsis thaliana, D63463; Arabidopsis wallichii, AB015506;
Arabis flagellosa, AB015500; Arabis gemmifera, D63459; Arabis glabra, AB015499; Arabis hirsuta, AB015502;
Arabis stelleri, AB015498; Arabis lyrata, AB015501; Brassica oleracea, AB015508; Leavenworthia crassa 1, AF037472;
Leavenworthia crassa 2, AF037510; Leavenworthia crassa 3, AF037559; Leavenworthia stylosa 1, AF037564;
Leavenworthia stylosa 2, AF037558; Leavenworthia stylosa 3, AF037560; Leavenworthia uniflora 1, AF037557;
Leavenworthia uniflora 2, AF037512; Leavenworthia uniflora 3, AF037561.
Fabaceae Glycine max 1, AF079058; Glycine max 2, AF079499; Phaseolus acutifolius 1, Z23171; Pisum sativum, X06281;
Trifolium repens 1, X14826.
Joinvilleaceae Joinvillea ascendens, U91623.
Malvaceae Gossypium barbadense A, AF085821; Gossypium barbadense C (sgA), AF036578;
Gossypium barbadense C (sgD), AF036570; Gossypium darwinii C (sgA), AF036579;
Gossypium darwinii C (sgD), AF036573; Gossypium hirsutum A, AF090164; Gossypium hirsutum C (sgA), AF036575;
Gossypium hirsutum C (sgD), AF036569; Gossypium mustelinum C (sgA), AF036577;
Gossypium mustelinum C (sgD), AF036572; Gossypium raimondii C, AF036568; Gossypium robinsonii C, AF036567;
Gossypium tomentosum C (sgA), AF036576; Gossypium tomentosum C (sgD), AF036571.
Paeoniaceae Paeonia anomala 1a, AF009046; Paeonia anomala 2, AF009064; Paeonia californica 1a, AF009041;
Paeonia californica 2, AF009056; Paeonia lactiflora 1a, AF009049; Paeonia lactiflora 2, AF009068;
Paeonia lutea 1a, AF009042; Paeonia lutea 2, AF009057; Paeonia rockii 1a, AF009045; Paeonia rockii 2, AF009063;
Paeonia suffruticosa subsp. spontanea 1a, AF009043; Paeonia suffruticosa subsp. spontanea 2, AF009060.
Pinaceae Pinus banksiana 1, U48366; Pinus banksiana 2, U48367; Pinus banksiana 3, U48368; Pinus banksiana 4, U48369;
Pinus banksiana 5, U48370; Pinus banksiana 6, U48371; Pinus banksiana 7, U48372.
Poaceae Anomochloa marantoidea 1, U91622; Anomochloa marantoidea 2, U91625; Arundo donax 1, U91619;
Bambusa glaucescens 1, U91626; Eragrostis japonica 1, U91620; Hordeum vulgare subsp. spontaneum 1, AF052664;
Hordeum vulgare 2, X12733; Hordeum vulgare 3, X12734; Lithachne humilis 1, U91624;
Muhlenbergia setarioides 1, U91621; Oryza sativa 1, X16296; Oryza sativa 2, X16297; Pennisetum americanum 1, X16547;
Pennisetum glaucum 1, M59082; Sorghum bicolor 1, AF050456; Tripsacum dactyloides 1, AF045548;
Zea luxurians 1, AF044307; Zea mays 1f, AF050457; Zea mays 1s, X04049; Zea mays 2n, X02915.
Rosaceae Malus domestica, Z48234; Fragaria x ananassa, X15588.
Solanaceae Lycopersicon esculentum 2, M86724; Lycopersicon esculentum 3a, S75487; Nicotiana tabacum, X81853;
Petunia hybrida 1, X54106; Petunia hybrida 2, U25536; Solanum tuberosum 1, M25154; Solanum tuberosum 2, M25153;
Solanum tuberosum 3, M25152.
Vitaceae Vitis vinifera, U36586.
Outgroups
(Mammals and Birds)
Apteryx australis subsp. australis 1, S78778; Homo sapiens 1, X03350; Mus caroli 1, M11307; Papio hamadryas, M25035;
Rattus norvegicus, M15327.
Numbers or letters following species names refer to loci designations from original publication. Comments in parentheses refer to the appropriate
subgenome (sg) in Gossypium.
rugosperma; Adh1, C. brainerdii, C. flacca subsp. serru-
lata, C. mandshurica, C. spissa, C. turbinata [1], C. turbi-
nata [2], C. umbrosa subsp. sabynensis, and C. wahuensis
subsp. robusta; and Adh2, C. flacca subsp. serrulata, C.
floridana, C. mandshurica, C. novae-angliae, C. oxyandra,
C. peckii, C. pilulifera, C. sp. nov., C. spissa, C. umbellata
[1], C. umbellata [2], C. umbrosa subsp. sabynensis, and C.
wahuensis subsp. robusta. Some taxa in this analysis are represented by sequences concatenated from different
accessions of the same species (Table 1). Outgroups for
this analysis are based on results from previous analyses
(Roalson and Friar, 2004; Roalson et al., 2001) and in-
clude Carex flacca subsp. serrulata, C. mandshurica, C.
spissa, C. umbrosa subsp. sabynensis, and C. wahuensis
subsp. robusta. Six partitions were set to correspond with
the ITS, ETS, Adh1 exon, Adh1 intron, Adh2 exon, and Adh2 intron datasets. The parameters for each dataset
were allowed to vary independently (“unlinked”). Priors
for the six molecular dataset partitions included each
dataset with a separate model with variable substitution
types, rates, and invariant sites chosen based on the re-
sults of analysis using DT_ModSel (Minin et al., 2003;
ITS: TrN + I + G; ETS: TrN + G; Adh1 exons:
K80 + I; Adh1 introns: HKY; Adh2 exons: JC + I + G;
Adh2 introns: HKY + I). The DT_ModSel uses a Bayes-
ian information criterion to select a model using branch-length error as a performance measure in a decision the-
ory framework that also includes a penalty for overfitting
(Minin et al., 2003). One hundred million generations
were run with four chains (Markov Chain Monte Carlo),
and a tree was saved every 100 generations. The trees
from the MrBayes analysis were loaded into PAUP*, discarding
the samples from the “burnin” of the chain
(Huelsenbeck and Ronquist, 2001; the first 20,000,000 generations or the first 200,001 samples) to only include
sample points after stationarity was clearly reached.
Examination of likelihood plots suggests that stationarity
was reached for all parameters within the designated bur-
nin (data not shown). Additionally, multiple indepen-
dent runs (three) starting from different random trees
were conducted to determine if convergence and mixing
had occurred. A majority rule consensus tree was made from one of the analyses, showing nodes with a posterior
probability of 0.50 or more. Majority rule consensus
trees of the trees sampled in Bayesian inference analyses
yielded probabilities that the clades are monophyletic
(Lewis, 2001).
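The burn-in arithmetic and the reading of consensus frequencies as posterior probabilities can be illustrated with a short sketch. It assumes, for illustration only, that each sampled tree is reduced to a set of clades (each clade a frozenset of taxon labels); the function names and toy trees are hypothetical and do not reproduce how MrBayes or PAUP* store trees.

```python
from collections import Counter


def burnin_samples(burnin_generations, sample_every):
    """Samples falling within the burn-in, counting the starting state at generation 0."""
    return burnin_generations // sample_every + 1


def clade_posteriors(sampled_trees, n_burnin):
    """Frequency of each clade among the post-burn-in trees."""
    kept = sampled_trees[n_burnin:]
    counts = Counter(clade for tree in kept for clade in tree)
    return {clade: n / len(kept) for clade, n in counts.items()}


def majority_rule(posteriors, threshold=0.5):
    """Keep clades whose posterior probability is at least the threshold."""
    return {clade: p for clade, p in posteriors.items() if p >= threshold}


# A tree saved every 100 generations with a 20,000,000-generation burn-in.
print(burnin_samples(20_000_000, 100))  # 200001

# Toy chain of four sampled trees; the first is discarded as burn-in.
trees = [
    {frozenset("AC")},
    {frozenset("AB")},
    {frozenset("AB")},
    {frozenset("AC")},
]
print(majority_rule(clade_posteriors(trees, n_burnin=1)))
```

With the values given in the text, the burn-in corresponds to the 200,001 samples reported; in the toy chain only the clade {A, B} reaches the 0.50 threshold (2 of 3 retained trees).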
2.4. Molecular evolution
Tests of molecular evolution focused upon anoma-
lous sequences placed in clades outside of the two major
loci or placed basal to the divergence of all of the other
Adh loci. Eight sequences fell into this category (C. lax-
iflora 1291.1, C. oxyandra 15896.1, C. oxyandra 15896.2,
C. tonsa var. rugosperma 1332.2, C. sp. nov. 3011.1, C.
sp. nov. 3011.2, C. communis var. communis 1334.3,
and C. umbellata [2] 1308.6) and the sequences of these samples were compared with the entire Carex Adh dataset
in tests of recombination, using Plato 2.02 (Grassly
and Rambaut, 1998). Plato utilizes a sliding window of
varying size to find regions of sequences that deviate
from the null hypothesis of phylogenetic structure pro-
vided by a maximum likelihood model fit to the data
set (results of Modeltest 2.1 [Posada and Crandall,
1998]; data not shown) and phylogenetic tree. Monte Carlo simulations are implemented to test for significant
deviation from the null hypothesis.
The coding regions of the eight anomalous sequences
were also compared with six typical Acrocystis Adh cod-
ing sequences (C. laxiflora 1291.3, C. brevicaulis ss.3, C.
brevicaulis ss.5, C. lucorum var. lucorum 1327.1, C. luco-
rum var. lucorum 1327.2, and C. vulpinoidea 1294.6).
GCUA 1.1 (McInerney, 1998) was employed to test for differences of base composition among these sequences.
A resampling test for codon bias was conducted using
techniques described by Morton (1998). While Morton's test (1998) explored codon adaptation, it was applied
here to measure strict codon bias. This test uses a ran-
dom codon usage distribution with 500 replicates and
compares the gene to this distribution. If the gene is
within two standard deviations from the mean, then bias is considered to be not significantly different from the
cumulative usage.
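The decision rule described above (500 random replicates, then a two-standard-deviation comparison) can be sketched as follows. This is a simplified illustration and not Morton's (1998) procedure: the deviation statistic (a sum of squared frequency differences), the function names, and the four-codon example are all assumptions made here, and synonymous codon families are not treated separately.

```python
import random
import statistics
from collections import Counter


def deviation(codons, expected):
    """Sum of squared differences between observed and expected codon frequencies."""
    counts = Counter(codons)
    n = len(codons)
    return sum((counts.get(c, 0) / n - p) ** 2 for c, p in expected.items())


def within_two_sd(gene_codons, expected, n_reps=500, seed=1):
    """True if the gene's deviation lies within 2 SD of the resampled mean."""
    rng = random.Random(seed)
    names = list(expected)
    weights = [expected[c] for c in names]
    # Null distribution: random genes of the same length drawn from the
    # cumulative (expected) codon usage.
    null = [
        deviation(rng.choices(names, weights=weights, k=len(gene_codons)), expected)
        for _ in range(n_reps)
    ]
    mean, sd = statistics.mean(null), statistics.stdev(null)
    return abs(deviation(gene_codons, expected) - mean) <= 2 * sd


# Hypothetical cumulative usage over four alanine codons, and a gene
# that uses only one of them.
cumulative = {"GCT": 0.25, "GCC": 0.25, "GCA": 0.25, "GCG": 0.25}
skewed_gene = ["GCT"] * 60
print(within_two_sd(skewed_gene, cumulative))
```

A gene that uses a single codon exclusively falls far outside the resampled distribution, so the sketch reports it as significantly different from the cumulative usage.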
3. Results
3.1. Angiosperm Adh phylogenetic analysis
The aligned angiosperm Adh coding sequence data
matrix was 1173 bp long with 951 variable sites, of
which 807 were potentially parsimony-informative.
The sequences varied from 351 to 1170 bp in length.
There were 14 gaps ranging from 3 to 18 bp in length.
Unaligned sequence lengths differed because different regions were sequenced in different studies.
Maximum parsimony analysis of the angiosperm Adh coding sequence data set resulted in 120 most-parsimonious
trees. Figs. 2 and 3 show the strict consensus
of the 120 most-parsimonious trees.
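The distinction between variable and potentially parsimony-informative sites used above can be made concrete with a small sketch. It follows the usual definition (a site is parsimony-informative when at least two character states each occur in at least two sequences) and treats gaps as missing data, as in the analyses described here; the function name and the toy alignment are illustrative and are not data from this study.

```python
from collections import Counter

# Symbols treated as missing data rather than as character states.
GAP_OR_MISSING = {"-", "?", "N"}


def site_summary(alignment):
    """Return (variable, parsimony_informative) column counts for aligned sequences."""
    variable = informative = 0
    for column in zip(*alignment):
        counts = Counter(
            base.upper() for base in column if base.upper() not in GAP_OR_MISSING
        )
        if len(counts) >= 2:
            variable += 1
            # Informative: at least two states, each present in two or more sequences.
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1
    return variable, informative


toy_alignment = ["ACGTA", "ACGAA", "ATGAC", "ATG-C"]
print(site_summary(toy_alignment))  # (3, 2)
```

In the toy alignment, the fourth column is variable but not informative, because one of its two states occurs in only a single sequence.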
This analysis provides a hypothesis of relationships of
Adh loci in angiosperms. While several internal nodes
that group major clades together are not well-supported,
many clades do have high bootstrap support (Figs. 2
and 3). For most angiosperm families, Adh loci coalesce within the family, with the exception of the Arecaceae
and Solanaceae (Figs. 2 and 3). There is clear incongru-
ence between the angiosperm Adh strict consensus tree
and current hypotheses of relationships of angiosperms
(Soltis et al., 2000). Particularly, the dicot samples form
a grade leading to the monocots, the closely related fam-
ilies Fabaceae and Rosaceae are not placed together,
and the single Asteraceae sample (Helianthus annuus) is placed at the base of the angiosperm clade (Figs. 2
and 3). Carex samples were placed by this analysis in
their expected location as sister to the Poaceae + Join-
villeaceae (Chase et al., 2000), and they form a strongly
supported monophyletic group (bs = 91%; Fig. 3). While
there is clear incongruity of the Adh parsimony phylogeny with the expected relationships of angiosperm
lineages, we expect that this reflects the gene history and is not a result of analysis-related issues,
such as long branch attraction. Maximum likelihood analyses of the Adh dataset using a single model for the
entire gene (chosen by DT_ModSel) give results similar to the parsimony topology, in that they suggest
paraphyly of the eudicots with respect to the monocots and the coalescence of most loci
within families (Roalson, unpubl. data).
This preliminary tree suggests there are two main clades of Carex Adh sequences, with the sequence C.
laxiflora 1291.1 sister to the rest of the Adh sequences.
If the C. laxiflora 1291.1 sequence was a member of
either of these clades, it would be expected to be near
the other outgroup sequences associated with one of
the two clades. As it is not, it was considered a potential
separate locus sister to the rest of the Carex Adh
sequences and used as the outgroup for the more detailed Carex Adh sequence analyses. The C. laxiflora
1291.1 sequence was sister to the rest of the Carex
Adh sequences regardless of which other Carex se-
quences were included (data not shown).
3.2. Carex Adh phylogenetic analysis
The five Adh sequencing primers produced overlapping fragments that collectively covered the 5′ end of
exon 4 to the 5′ end of exon 8 along both strands for
Fig. 2. Maximum parsimony strict consensus (part 1) of 120 most-parsimonious trees for the angiosperm Adh coding sequences analysis
(length = 6531 steps, CI = 0.290, RI = 0.749, RC = 0.217). Numbers above or below the branches are bootstrap percentages. Generic epithets are
abbreviated as follows: A. = Arabidopsis; Ap. = Apteryx; Ar. = Arabis; B. = Brassica; G. = Gossypium; Gl. = Glycine; He. = Helianthus, Ho. =Homo;
L. = Leavenworthia; Ly. = Lycopersicon; Mu. = Mus; N. = Nicotiana; P. = Phaseolus; Pe. = Petunia; Pi. = Pisum; Po. = Papio; Pu. = Pinus;
R. = Rattus; So. = Solanum; and T. = Trifolium.
678 E.H. Roalson, E.A. Friar / Molecular Phylogenetics and Evolution 33 (2004) 671–686
the entire Carex Adh dataset. The aligned data matrix
was 978 bp long with 277 variable sites, of which 121
were potentially parsimony-informative. The length of
the unaligned sequences varied from 822 to 915 bp.
There were 31 gaps ranging from 1 to 93 bp in length.
Maximum parsimony analysis of the entire Carex Adh
dataset resulted in 106,000 most-parsimonious trees.
Fig. 4 is the strict consensus of these trees.
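The distinction between variable and parsimony-informative sites in the alignment summary above can be illustrated with a minimal Python sketch. It is not the software used in this study: the function name `classify_sites`, the toy alignment, and the choice to ignore gap and ambiguity characters are assumptions made for illustration. A site is counted as parsimony-informative when at least two character states each occur in at least two sequences. The last lines simply express the reported counts (277 and 121 of 978 aligned positions) as proportions.

```python
from collections import Counter


def classify_sites(alignment):
    """Count (variable, parsimony-informative) columns in an alignment.

    `alignment` is a list of equal-length, upper-case aligned sequences.
    Gap and ambiguity characters are ignored (an illustrative assumption).
    """
    ignored = {"-", "?", "N"}
    variable = 0
    informative = 0
    for column in zip(*alignment):
        counts = Counter(base for base in column if base not in ignored)
        if len(counts) < 2:
            continue  # constant column
        variable += 1
        # Informative: at least two states, each seen in at least two sequences.
        if sum(1 for n in counts.values() if n >= 2) >= 2:
            informative += 1
    return variable, informative


# Hypothetical toy alignment, not data from this study.
toy_alignment = [
    "ACGTAC",
    "ACGTTC",
    "ATGAAC",
    "ATG-TC",
]
print(classify_sites(toy_alignment))

# Proportions implied by the counts reported in the text.
aligned_length, n_variable, n_informative = 978, 277, 121
print(f"{n_variable / aligned_length:.1%} variable, "
      f"{n_informative / aligned_length:.1%} parsimony-informative")
```

In the toy alignment, three columns vary but only two of them are informative, because in the remaining column the second state occurs in a single sequence.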
Although there are a large number of most-parsimonious trees, this analysis supports two main clades of Acrocystis Adh sequences, here referred to as Adh1 and Adh2 (Fig. 4). Each of these clades is defined by being sister to one or more of the outgroup sequences. The Adh1 clade is sister to C. vulpinoidea 1294.1 and the Adh2 clade is sister to C. laxiflora 1291.3 (Fig. 4). In addition to the two main clades, there are several Acrocystis sequences placed outside of the outgroups of these clades. Sister to the Adh1/C. vulpinoidea 1294.1 clade is a clade of two Acrocystis Adh sequences (C. oxyandra 15896.2 and C. sp. nov. 3011.2). These could potentially represent a third locus of Adh in Carex. Sister to the Adh2/C. laxiflora 1291.3 clade is a clade of two Acrocystis Adh sequences (C. communis 1334.3 and C. microrhyncha 1308.6). Sister to all of these is a clade of three more Acrocystis Adh sequences (C. oxyandra 15896.1, C. rugosperma 1332.2, and C. sp. nov. 3011.1). These two clades could potentially represent a fourth or a fourth and fifth locus. Finally, the C. laxiflora 1291.1 sequence used as the outgroup for this analysis appears to predate the divergence of any of the other duplication events, possibly indicating it is a fifth or sixth locus of Adh in Carex.
Fig. 3. Maximum parsimony strict consensus (part 2) of 120 most-parsimonious trees for the angiosperm Adh coding sequences analysis (length = 6531 steps, CI = 0.290, RI = 0.749, RC = 0.217). Numbers above the branches are bootstrap percentages. Generic epithets are abbreviated as follows: An. = Anomochloa; Ao. = Arundo; Ba. = Bambusa; C. = Carex; Ca. = Calamus; E. = Eragrostis; F. = Fragaria; H. = Hordeum; J. = Joinvillea; Li. = Lithachne; M. = Muhlenbergia; Ma. = Malus; O. = Oryza; Pa. = Paeonia; Ph. = Phoenix; Pn. = Pennisetum; S. = Sorghum; Tr. = Tripsacum; V. = Vitis; W. = Washingtonia; and Z. = Zea. Specific and subspecific epithets are abbreviated as follows: H. vulgare subsp. spon. = H. vulgare subsp. spontaneum, and Pa. suff. subsp. spon. = Pa. suffruticosa subsp. spontanea.
The data presented here suggest that there may be two or three times as many Adh loci in Carex as in some other groups studied. Given that most Carex species are thought to be diploid (Davies, 1956), this presents a complex scenario of gene family evolution in the group. While the additional sequences within species could be the result of PCR error, we consider this unlikely because all of the sequence types were found in multiple cloning events. Verification of Adh locus number will require Southern blot analysis and localization of loci on genetic or physical maps. More detailed studies in a broader phylogenetic context will be necessary to explore the evolution of the Adh gene family in the Cyperaceae.
Only the Carex Adh1 and Adh2 clades have enough samples to be used to infer species relationships, so further discussion will focus on these putative loci. Resolution within the Adh1 and Adh2 clades is very low, with few nodes within the clades supported in the strict consensus tree (Fig. 4). The Adh1 locus does resolve a clade of several species from eastern North America (C. albicans var. albicans, C. albicans var. australis, C. floridana, C. lucorum var. lucorum, C. pensylvanica, C. tonsa var. rugosperma, and C. umbellata), as has been found previously using nrDNA spacer sequences (ITS + ETS; Roalson and Friar, 2004). The Adh2 locus also resolves a clade of eastern North American species and a clade of some of the western North American species (eastern: C. albicans var. albicans, C. communis var. communis, C. deflexa var. deflexa, C. lucorum var. lucorum, C. pensylvanica, and C. tonsa var. rugosperma; western: C. geophila, C. rossii, and C. turbinata; Fig. 4).
Fig. 4. Maximum parsimony strict consensus of 106,000 most-parsimonious trees for the Carex Adh sequence analysis (length = 459 steps, CI = 0.697, RI = 0.928, RC = 0.647). Numbers above the branches are bootstrap percentages. All species in the tree are members of the genus Carex. Numbers in brackets following Carex turbinata and C. umbellata refer to the two different “types” recognized within these species, as noted in Table 1. Numbers following species names not in brackets are collection abbreviations and clone numbers.
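The tree statistics quoted in the captions of Figs. 2–4 can be cross-checked against one another, since the rescaled consistency index is conventionally defined as the product of the consistency index and the retention index (RC = CI × RI). The short Python sketch below assumes that definition applies to the reported values; the function and variable names are illustrative and are not part of any phylogenetics package.

```python
def rescaled_consistency(ci, ri):
    """Rescaled consistency index under the usual definition RC = CI * RI."""
    return ci * ri


# (CI, RI, reported RC) as given in the figure captions.
reported = {
    "angiosperm Adh (Figs. 2 and 3)": (0.290, 0.749, 0.217),
    "Carex Adh (Fig. 4)": (0.697, 0.928, 0.647),
}

for analysis, (ci, ri, rc) in reported.items():
    computed = rescaled_consistency(ci, ri)
    print(f"{analysis}: CI x RI = {computed:.3f}, reported RC = {rc:.3f}")
```

Both products agree with the reported RC values to three decimal places, which is the agreement expected if the captions were transcribed correctly.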
3.3. Molecular evolution
Two primary clades of sequences (putative loci) were found in Carex, which we refer to as Adh1 and Adh2. Additionally, several samples were placed outside of the two putative loci and could represent additional loci. The introns of the two major loci were alignable, but had distinctive gaps associated with each locus. Not all samples yielded both loci, and no distinct pattern of which samples had which loci is evident. More than half of the species were found with both loci, although sometimes in different samples. This implies either that there has been random loss of copies of these loci or that sampling was not sufficient to find both loci in all samples. Additionally, eight sequences were found outside of the two main Adh clades from five Acrocystis species and one outgroup species. These eight sequences are explored further in the Molecular Evolution Results and the Discussion.
Statistical tests for recombination using Plato 2.11 found no evidence of recombination. This result was consistent across different window sizes (5–50 bp) and numbers of replicates (100–500) for the Monte Carlo simulations.
Estimates of base composition using GCUA 1.1 found no differences between the anomalous and typical sequences in either exon or intron GC content (Table 4). The sequences have approximately 50% GC content in exons and 29% GC content in introns.
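As a rough illustration of the comparison summarized above, the following Python sketch averages the Table 4 values for the anomalous and typical sequences separately. It is only a descriptive summary of the published percentages, not a reproduction of the GCUA analysis or a statistical test, and the names `TABLE_4` and `mean_gc` are introduced here for illustration.

```python
from statistics import mean

# (sequence, exon GC %, intron GC %, anomalous?) transcribed from Table 4.
TABLE_4 = [
    ("C. communis var. communis 1334.3", 51.20, 30.91, True),
    ("C. sp. nov. 3011.1", 51.60, 26.80, True),
    ("C. sp. nov. 3011.2", 52.23, 29.34, True),
    ("C. laxiflora 1291.1", 48.20, 30.67, True),
    ("C. umbellata 1308.6", 50.62, 30.04, True),
    ("C. oxyandra 15896.1", 51.60, 27.54, True),
    ("C. oxyandra 15896.2", 51.40, 28.92, True),
    ("C. tonsa var. rugosperma 1332.2", 51.40, 26.80, True),
    ("C. brevicaulis ss.3", 50.62, 29.93, False),
    ("C. brevicaulis ss.5", 49.38, 27.81, False),
    ("C. laxiflora 1291.3", 50.62, 28.82, False),
    ("C. lucorum var. lucorum 1327.1", 49.38, 27.59, False),
    ("C. lucorum var. lucorum 1327.2", 51.56, 29.46, False),
    ("C. vulpinoidea 1294.6", 50.93, 29.43, False),
]


def mean_gc(rows, anomalous=None):
    """Mean (exon, intron) GC % over all rows, or over one group only
    when `anomalous` is True or False. Values are rounded to 0.1."""
    chosen = [r for r in rows if anomalous is None or r[3] == anomalous]
    exon = round(mean(r[1] for r in chosen), 1)
    intron = round(mean(r[2] for r in chosen), 1)
    return exon, intron


print("all sequences:", mean_gc(TABLE_4))
print("anomalous:    ", mean_gc(TABLE_4, anomalous=True))
print("typical:      ", mean_gc(TABLE_4, anomalous=False))
```

The group means differ by well under one percentage point in both exons and introns, in line with the statement that base composition does not distinguish the anomalous sequences.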
Table 4
Percent GC content of a subset of Carex Adh sequences including anomalous sequences (marked with an asterisk)
Sequence                             Exon GC   Intron GC
C. communis var. communis 1334.3*    51.20     30.91
C. sp. nov. 3011.1*                  51.60     26.80
C. sp. nov. 3011.2*                  52.23     29.34
C. laxiflora 1291.1*                 48.20     30.67
C. umbellata 1308.6*                 50.62     30.04
C. oxyandra 15896.1*                 51.60     27.54
C. oxyandra 15896.2*                 51.40     28.92
C. tonsa var. rugosperma 1332.2*     51.40     26.80
C. brevicaulis ss.3                  50.62     29.93
C. brevicaulis ss.5                  49.38     27.81
C. laxiflora 1291.3                  50.62     28.82
C. lucorum var. lucorum 1327.1       49.38     27.59
C. lucorum var. lucorum 1327.2       51.56     29.46
C. vulpinoidea 1294.6                50.93     29.43
The resampling tests applied found all of the gene sequences to be statistically homogeneous with respect to codon usage starting at codon 85 (codons 1–84 were excluded because they were not represented in all samples). This suggests that there are no real differences in codon usage among the different putative loci and anomalous sequences and does not support any suggestion of selection, at least in this segment of the coding region. Additionally, none of the anomalous sequences was found to have stop codons, suggesting they may not be pseudogenes.
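A check for premature stop codons of the kind mentioned above can be sketched in a few lines of Python. This is a generic illustration and not the procedure used in this study: it assumes the introns have already been removed, that the reading frame of the spliced exon sequence is known, and that the standard nuclear genetic code applies. The function name and the example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard nuclear genetic code


def in_frame_stops(coding_seq, frame=0):
    """Return the 0-based indices of in-frame stop codons.

    `coding_seq` is assumed to be spliced exon sequence (introns removed);
    `frame` is the offset of the first complete codon. Trailing bases that
    do not fill a codon are ignored.
    """
    seq = coding_seq.upper().replace("U", "T")
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    return [index for index, codon in enumerate(codons) if codon in STOP_CODONS]


# Hypothetical toy sequences, not data from this study.
print(in_frame_stops("ATGGCTTGGAAA"))        # no in-frame stop
print(in_frame_stops("ATGGCTTGAAAA"))        # TGA at codon index 2
print(in_frame_stops("CATGTAAGG", frame=1))  # TAA at codon index 1
```

An empty result for a partial coding region is consistent with a functional gene but does not demonstrate it, which matches the cautious wording used in the text.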
3.4. Combined ITS/ETS/Adh1/Adh2 Bayesian analysis
In the Bayesian inference analyses, plots of log-likelihood scores and all other parameters reached stationarity prior to generation 20,000,000 in all independent analyses (plots not shown). The first 200,001 sample points were thus discarded as burn-in, leaving 800,000 samples for construction of a 50% majority rule consensus tree. Fig. 5 illustrates the 50% majority rule consensus tree from one of the runs (results from this run are used in all further results and discussions). All three independent analyses resulted in similar posterior probability distributions, and we therefore infer that the chains converged and mixed adequately (data not shown). Fifty-two percent of nodes have a posterior probability ≥95%.
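The burn-in and consensus steps described above can be sketched conceptually in Python. The figures in the text imply 200,001 + 800,000 = 1,000,001 sampled points per run; the sketch below uses a toy sample of ten trees instead. It is a simplified illustration and not the MrBayes implementation: each sampled tree is represented as a set of clades (frozensets of taxon labels), whereas real software works with bipartitions of unrooted trees, and all names and data here are hypothetical.

```python
from collections import Counter


def clade_posteriors(sampled_trees, burn_in):
    """Posterior probability of each clade, estimated as the fraction of
    post-burn-in samples in which that clade appears."""
    kept = sampled_trees[burn_in:]
    counts = Counter(clade for tree in kept for clade in tree)
    return {clade: n / len(kept) for clade, n in counts.items()}


def majority_rule(posteriors, threshold=0.5):
    """Clades retained in a majority rule consensus (frequency > threshold)."""
    return {clade: p for clade, p in posteriors.items() if p > threshold}


# Hypothetical toy sample over four taxa A-D; the first two samples are burn-in.
AB, CD, AC = frozenset("AB"), frozenset("CD"), frozenset("AC")
samples = [{AC}, {AC}] + [{AB, CD}] * 6 + [{AB}] + [{AC}]

posteriors = clade_posteriors(samples, burn_in=2)
consensus = majority_rule(posteriors)
for clade, p in sorted(consensus.items(), key=lambda item: -item[1]):
    print(sorted(clade), p)
```

In this toy sample, clade AB is present in 7 of the 8 retained trees and CD in 6 of 8, so both enter the 50% majority rule consensus, while AC (1 of 8) does not.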
Some taxa included in this analysis are missing portions of the total matrix, and some species sequences are combinations of sequences from different accessions within the species (see Section 2 and Table 1 for details). While neither of these conditions is optimal for phylogeny reconstruction, we believe they have minimal impact on the phylogenetic analyses presented here. Test analyses with different combinations of taxa with missing data excluded did not significantly affect the general tree topology (data not shown). Combining sequences from different accessions of the same species is often done (e.g., Michelangeli et al., 2003; Soltis et al., 2003), and the most critical issue involved with this practice is confidence that both accessions truly belong to the same species. Where sequences from different collections of a species were combined, we very carefully verified species identification, and where species circumscription is problematic, samples were only combined when their source populations were similar morphologically and in close geographic proximity.
While the sampling differs somewhat between this combined ITS/ETS/Adh1/Adh2 analysis and previously published ITS/ETS analyses, the trees can be generally compared. The Bayesian inference majority rule consensus tree is congruent with the maximum parsimony strict consensus phylogeny (Roalson and Friar, 2004). There is some difference in internal node structure between the previously published maximum likelihood topology and this Bayesian inference phylogeny, but these differences are associated with very short branches in the maximum likelihood tree, nodes with low posterior probability values, or both (Roalson and Friar, 2004).
Fig. 5. Bayesian inference majority rule tree. Numbers above branches are posterior probability values. Numbers in brackets following Carex turbinata and C. umbellata accessions refer to the two different “types” recognized within these species, as noted in Table 1 and the text. Numbers in brackets following Carex rossii samples refer only to two different composite sequences included in this analysis. Boxes to the right of species names refer to previously defined species complexes or “orphan” species and are abbreviated as follows: D = the deflexa complex; N = the nigromarginata complex; O = orphan species (see Roalson and Friar, 2004 for discussion); P = the pensylvanica complex; and U = the umbellata complex.
4. Discussion
4.1. Evolution of the Adh gene family
These results suggest that there are two major loci of the Adh gene family, Adh1 and Adh2, in Carex. This is consistent with the presence of two or three Adh loci in most angiosperms. In contrast to the groups examined in other phylogenetic studies of angiosperms (Paeonia, Sang et al., 1997; Leavenworthia, Charlesworth et al., 1998; Gossypium, Small et al., 1998; Poaceae, Gaut et al., 1999), Carex has additional putative loci that form clades outside of the two major loci or are placed basal to the divergence of all other Adh loci (Figs. 2–4). Recent studies of Adh gene family structure (Gossypium; Small and Wendel, 2000b) have also found a more complex gene family structure than previously thought, with as many as seven Adh loci present in some diploid Gossypium species.
The phylogenetic estimates shown in Figs. 2–4 support the inference that the duplication event leading to the Adh1 and Adh2 copies predated the diversification of Carex, but did not predate the divergence of Carex from other graminoid monocots. Until Adh of other Cyperaceae genera is sequenced, it will remain unclear whether the duplication is Carex-specific or associated with a broader group of Cyperaceae genera. There is no evidence of within-locus variation within species (data not shown).
Analysis of angiosperm Adh sequences also provides some insight into the birth–death process of loci in the Adh gene family. Some have suggested that there is a small, stable number of loci in the Adh gene family, in which constraints on gene copy number and a slow flux of gene duplication and loss lead to a dynamic equilibrium of copy number (Clegg et al., 1997). The angiosperm Adh analysis provides evidence that loss and gain of loci might be more common than previously suggested, at least in some groups (Figs. 2 and 3). For instance, the Adh loci of Leavenworthia coalesce after the divergence of Leavenworthia from Arabis, which are closely related genera (Al-Shehbaz, 1988). While many of the loci appear to be of relatively recent origin and coalesce within genera or families, this is not true for all loci. Some, particularly those in the Arecaceae and Solanaceae, appear to coalesce at deep nodes in the tree, suggesting that there has also been persistence of ancestral Adh loci in some lineages. Additionally, the incongruence of the Adh phylogeny with well-supported relationships in the angiosperms may suggest additional deep coalescence of some Adh loci, particularly the placement of the Asteraceae sample at the base of the angiosperms, the grade of the dicots leading to the monocots, and the distant relationship of Adh loci from the closely related families Rosaceae and Fabaceae (Figs. 2 and 3; Soltis et al., 2000). These results suggest that there may be a combination of rapid gene family homogenization and long-term persistence of gene family members in different lineages of the angiosperms. Alternatively, other molecular evolutionary processes (e.g., signal saturation) could be acting to obscure the “true” phylogenetic signal of the Adh gene family. Furthermore, analysis of the Carex Adh loci (Fig. 4) suggests that there may be more than the two or three loci typically found in angiosperms, as has been found in pines (Perry and Furnier, 1996) and Gossypium (Small and Wendel, 2000b).
4.2. Molecular evolution of Carex Adh loci
The individuals that possess anomalous sequences (Adh loci 3–6) also have sequences that fall within the Adh1 and Adh2 loci. Carex oxyandra 15896 and C. sp. nov. 3011 sequences are found in the Adh3 and Adh5 clades and in the Adh2 locus. Carex rugosperma 1332 has sequences in the Adh5 clade as well as in both the Adh1 and Adh2 loci. If these groups of sequences do represent additional loci, it is likely that either Carex has more than the expected two or three Adh loci and copies of each locus have yet to be sequenced, or there is a complicated pattern of gain and loss of loci in this group. Since there is no evidence that these anomalous sequences are recombining with members of the Adh1 and Adh2 loci, and the typical Adh copies form monophyletic groups, the usefulness of the Adh1 and Adh2 loci as phylogenetic markers merits further discussion.
The studies of molecular evolution of the Carex Adh sequences support the idea that all sequences studied here are functional and not under radically different selection regimes (no evidence of recombination, similar base composition, similar codon usage biases, and no stop codons). This differs from the results of studies of molecular evolution of Adh genes in some other groups (e.g., Poaceae; Gaut et al., 1999). Some have suggested that gene family expansion is likely accompanied by functional diversification, which can often be seen in the different genetic structure of divergent loci (Gaut et al., 1999). While there has apparently been duplication and divergence of copy number in the Carex Adh gene family, we do not see any evidence of changes in selective constraints on different loci.
4.3. Phylogenetic relationships in Carex section Acrocystis
The analyses of Acrocystis Adh sequences do not provide well-resolved phylogenetic hypotheses of intra-sectional relationships. As with ITS/ETS (Roalson and Friar, 2004), species with multiple samples do not coalesce, with the exception of the C. globosa Adh1 and Adh2 sequences and the C. communis Adh1 sequences (Fig. 4). Non-coalescence of Adh sequences within species has been found in other angiosperms as well (Small et al., 1999). In some cases, this lack of coalescence is due to a lack of sequence variation among samples (apparently most cases fall into this category), while in others there may be a persistence of ancestral polymorphisms (particularly C. pensylvanica, C. tonsa var. rugosperma, and C. umbellata; Fig. 4). There is a clade of at least some eastern North American species supported by both the Adh1 and Adh2 analyses, as was found with ITS/ETS.
While the Adh sequences provide less phylogenetic signal than might be desired, there is sequence variation and some phylogenetic structure. In order to better address species relationships within Acrocystis, the two primary Adh loci, Adh1 and Adh2, were combined with sequences from the nuclear ribosomal DNA internal and external transcribed spacer regions (ITS and ETS). These four datasets were then analyzed using Bayesian inference in order to apply model-based approaches and allow each of the datasets or dataset partitions to be adequately modeled. The results of this analysis continue to support a tree topology similar to that previously supported by ITS and ETS alone, but with greater resolution and better statistical support for several nodes (Fig. 5; Roalson and Friar, 2004).
Several species as currently defined appear to be para- or polyphyletic (Fig. 5), particularly Carex deflexa, C. rossii, C. turbinata, and C. umbellata. Since most species are represented by a single exemplar, species boundaries of all species cannot be addressed here. The Bayesian analysis presented here suggests that Carex deflexa may be polyphyletic as currently circumscribed (Crins and Rettig, 2003). Samples of C. deflexa var. deflexa and C. deflexa var. boottii are here included in the same analysis for the first time, and they clearly do not form a monophyletic group (Fig. 5). Additionally, they are separated by at least one strongly supported node (posterior probability = 99%). At times C. deflexa var. boottii has been considered a separate species (C. brevipes), and this might be a more appropriate treatment until more detailed studies of this group are conducted. Carex rossii has previously been suggested to be possibly polyphyletic (Roalson and Friar, 2004), and those results are not refuted here with the addition of Adh sequences.
Carex turbinata, as currently circumscribed (Crins and Rettig, 2003), has often been recognized as two species: C. turbinata and C. leucodonta. Accessions representing both of these types were included here ([1] and [2]; Table 1) and they do not form a monophyletic group (Fig. 5). Additionally, they are separated by at least one moderately well-supported node in this analysis. Similarly, the circumscription of Carex umbellata has recently been changed to include C. microrhyncha (Crins and Rettig, 2003). With the inclusion of C. microrhyncha in C. umbellata, C. umbellata is made paraphyletic with respect to C. tonsa var. rugosperma (Fig. 5). As C. tonsa var. tonsa is not included here, its position cannot be addressed, but it is clear that species circumscriptions among these species need to be revisited.
Systematic studies of Carex section Acrocystis by several authors (Crins and Ball, 1983; Mackenzie, 1913a,b; Rettig, 1990; Rettig and Giannasi, 1990; B. Ford, University of Manitoba, ongoing studies, pers. commun.) have primarily focused on several species complexes (the deflexa complex; the nigromarginata complex; the pensylvanica complex; and the umbellata complex). Whether these species complexes circumscribe monophyletic groups or evolutionary lineages has recently been called into question (Roalson and Friar, 2004), and the results presented here lend support to the non-monophyly of many of these informal groupings (Fig. 5). In particular, the nigromarginata complex appears to be polyphyletic, with none of the sampled species from the complex grouping with any other species from the complex (Fig. 5). Unfortunately, previous systematic studies of the nigromarginata complex (Rettig, 1990; Rettig and Giannasi, 1990) were conducted without sampling any other species in the section, so their results are difficult to compare to the results presented here.
In contrast, many of the taxa considered part of the deflexa complex form a clade, although to the exclusion of C. deflexa var. deflexa itself as well as C. geophila (Fig. 5). Several other species previously referred to as “orphan species” (see Roalson and Friar, 2004) and a member of the pensylvanica complex (C. inops subsp. inops) are clearly related to these members of the deflexa complex, and so the circumscription of this informal group would have to be changed considerably to achieve monophyly. Carex deflexa var. deflexa is here suggested to form a clade with C. communis var. communis (orphan species) and C. peckii (nigromarginata complex; Fig. 5), a novel suggestion. While morphologically these species are quite different (hence their placement in different informal groupings), they all have similar geographic distributions (Crins and Rettig, 2003). Carex deflexa var. deflexa and C. peckii, particularly, have similar distributions across the northern latitudes of Canada, the northern midwest US, and northeastern US, and these distributions significantly overlap with C. communis var. communis in the midwest and northeastern US and eastern Canada (Crins and Rettig, 2003). An alternative hypothesis to the close relationship of these three species that should be noted (but cannot be directly addressed here) is that past or current introgression among these species could lead to the pattern of genetic similarity seen here. We do not consider this likely, though, as hybridization among these species has never been documented or suggested (Cayouette and Catling, 1992). The potential relationship of these three species needs to be further explored.
The pensylvanica complex is here represented by three of the five taxa currently recognized (Roalson and Friar, 2004). Our analyses clearly suggest that the western North American C. inops subsp. inops is not closely related to the eastern North American C. lucorum var. lucorum and C. pensylvanica (Fig. 5). Previous analyses suggested that C. inops subsp. heliophila was relatively closely related to C. lucorum var. lucorum, and so the monophyly of C. inops may be suspect. This needs to be further explored with all species from the complex. Interestingly, C. lucorum var. lucorum and C. pensylvanica group closely with C. novae-angliae and C. sp. nov. (posterior probability = 96%; Fig. 5). As a close relationship among these species has not been previously suggested, potential morphological characteristics supporting this grouping need to be explored.
Of all of the previously recognized species complexes, the umbellata complex is the only informal group that appears to be monophyletic (Fig. 5). As not all taxa recognized in the complex have been sampled (e.g., C. tonsa var. tonsa), we cannot unequivocally pronounce its monophyly. Additionally, as previously discussed, species boundaries are still problematic in this group.
5. Conclusions
Exploration of alcohol dehydrogenase gene family structure in Carex section Acrocystis suggests a more complicated locus structure than has been found in several other groups (Paeonia, Sang et al., 1997; Brassicaceae, Charlesworth et al., 1998; Poaceae, Gaut et al., 1999), with potentially as many as six Adh loci present in the genus (Fig. 4). Divergence within loci among taxa in Carex section Acrocystis is relatively low, although phylogenetic signal is present. In combination with nrDNA ITS and ETS sequences analyzed using Bayesian inference techniques, the Adh datasets continue to support the non-monophyly of some species and species complexes suggested previously (Roalson and Friar, 2004). Lineages in the section need to be explored further with more detailed sampling and more variable markers in order to define them conclusively.
Acknowledgments
The authors thank J. Travis Columbus, Bruce Ford, Brandon Gaut, Takuji Hoshino, Stanley Jones, Brian Morton, J. Mark Porter, Tony Reznicek, Julian Starr, Barb Wilson, and Richard Whitkus for helpful discussions, research advice, plant material, and help with field work; J. Travis Columbus, J. Mark Porter, and two anonymous reviewers for comments on previous versions of the manuscript; and RSA, TAES, and BRCH for allowing access to herbarium material. Financial support was provided by National Science Foundation Dissertation Improvement Grant DEB-9801495, Rancho Santa Ana Botanic Garden, the A.W. Mellon Foundation, the Hardman Foundation, and the Washington State University School of Biological Sciences. Portions of the paper are partial fulfillment of the requirements for a Doctor of Philosophy degree for EHR, Claremont Graduate University.
References
Al-Shehbaz, I.A., 1988. The genera of Arabideae (Cruciferae; Brass-
icaceae) in the southeastern United States. J. Arn. Arbor. 69, 85–
166.
Arnheim, N., Krystal, M., Schmickel, R., Wilson, G., Ryder, O.,
Zimmer, E., 1980. Molecular evidence for genetic exchanges among
ribosomal genes on nonhomologous chromosomes in man and
apes. Proc. Natl. Acad. Sci. USA 77, 7323–7327.
Bailey, C.D., Doyle, J.J., 1999. Potential phylogenetic utility of the
low-copy nuclear gene pistillata in dicotyledonous plants: Com-
parison to nrDNA ITS and trnL intron in Sphaerocardamum and
other Brassicaceae. Mol. Phylogenet. Evol. 13, 20–30.
Barta, E., Pintar, A., Pongor, S., 2002. Repeats with variations:
accelerated evolution of the Pin2 family of proteinase inhibitors.
Trends Genet. 18, 600–603.
Cayouette, J., Catling, P.M., 1992. Hybridization in the genus Carex
with special reference to North America. Bot. Rev. 58, 351–438.
Charlesworth, D., Liu, F.-L., Zhang, L., 1998. The evolution of the
alcohol dehydrogenase gene family by loss of introns in plants of
the genus Leavenworthia (Brassicaceae). Mol. Biol. Evol. 15, 552–
559.
Chase, M.W., Soltis, D.E., Soltis, P.E., et al. (10 co-authors), 2000.
Higher-level systematics of the monocotyledons: An assessment of
current knowledge and a new classification. In: Wilson, K.L.,
Morrison, D.A. (Eds.), Monocots: Systematics and Evolution.
CSIRO Publishing, Victoria, Australia, pp. 3–16.
Clegg, M.T., Cummings, M.P., Durbin, M.L., 1997. The evolution of
plant nuclear genes. Proc. Natl. Acad. Sci. USA 94, 7791–7798.
Crins, W.J., Ball, P.W., 1983. The taxonomy of the Carex pensylvanica
complex (Cyperaceae) in North America. Can. J. Bot. 61, 1692–
1717.
Crins, W.J., Rettig, J.H., 2003. Carex Linnaeus sect. Acrocystis
Dumortier. In: Flora of North America Editorial Committee
(Eds.), Flora of North America, north of Mexico, vol. 23,
Magnoliophyta: Commelinidae (in part): Cyperaceae. Oxford
University Press, New York, pp. 532–545.
Davies, E.W., 1956. Cytology, evolution and origin of the aneuploid
series in the genus Carex. Hereditas 42, 349–365.
Dover, G., 1982. Molecular drive: a cohesive mode of species
evolution. Nature 299, 111–117.
Doyle, J.J., Doyle, J.L., 1987. A rapid DNA isolation procedure for
small quantities of fresh leaf tissue. Phytochem. Bull. 19, 11–15.
Felsenstein, J., 1985. Confidence limits on phylogenies: An approach
using the bootstrap. Evolution 39, 783–791.
Gaut, B.S., Clegg, M.T., 1993. Nucleotide polymorphism in the Adh1
locus of pearl millet (Pennisetum glaucum) (Poaceae). Genetics 135,
1091–1097.
Gaut, B.S., Peek, A.S., Morton, B.R., Clegg, M.T., 1999. Patterns of
genetic diversification within the Adh gene family in the grasses
(Poaceae). Mol. Biol. Evol. 16, 1086–1097.
Grassly, N., Rambaut, A., 1998. PLATO 2.02—Partial Likelihoods
Assessed Through Optimization. Department of Zoology, Univer-
sity of Oxford, South Parks Road, Oxford, OX1 3PS.
Hillis, D.M., Bull, J.J., 1993. An empirical test of bootstrapping as a
method for assessing confidence in phylogenetic analysis. Syst.
Biol. 42, 182–192.
Huelsenbeck, J.P., Ronquist, F., 2001. MRBAYES: Bayesian inference
of phylogenetic trees. Bioinformatics 17, 754–755.
Klotz, M.G., Klassen, G.R., Loewen, P.C., 1997. Phylogenetic
relationships among prokaryotic and eukaryotic catalases. Mol.
Biol. Evol. 14, 951–958.
Koch, M.A., Haubold, B., Mitchell-Olds, T., 2000. Comparative
evolutionary analysis of chalcone synthase and alcohol dehydro-
genase loci in Arabidopsis, Arabis, and related genera (Brassica-
ceae). Mol. Biol. Evol. 17, 1483–1498.
Lagercrantz, U., Axelsson, T., 2000. Rapid evolution of the family of
CONSTANS LIKE genes in plants. Mol. Biol. Evol. 17, 1499–
1507.
Lewis, P.O., 2001. A likelihood approach to estimating phylogeny
from discrete morphological character data. Syst. Biol. 50, 913–
925.
Li, W.-H., 1997. Molecular evolution. Sinauer Associates Inc.,
Sunderland, MA, USA.
Mackenzie, K.K., 1913a. Western allies of Carex pennsylvanica.
Torreya 13, 14–16.
Mackenzie, K.K., 1913b. Notes on Carex-VII. Carex umbellata and its
allies. Bull. Torrey Bot. Club. 40, 529–554.
Mackenzie, K.K., 1935. Cyperaceae—Cariceae. North Amer. Flora 18
(4–7), 169–478.
Mason-Gamer, R.J., Weil, C.F., Kellogg, E.A., 1998. Granule-bound
starch synthase: structure, function, and phylogenetic utility. Mol.
Biol. Evol. 15, 1658–1673.
Mathews, S., Tsai, R.C., Kellogg, E.A., 2000. Phylogenetic structure in
the grass family (Poaceae): Evidence from the nuclear gene
phytochrome B. Amer. J. Bot. 87, 96–107.
McInerney, J., 1998. GCUA: General Codon Usage Analysis. Bioin-
formatics 14, 372–373.
Michelangeli, F.A., Davis, J.I., Stevenson, D.W., 2003. Phylogenetic
relationships among Poaceae and related families as inferred from
morphology, inversions in the plastid genome, and sequence data
from the mitochondrial and plastid genomes. Amer. J. Bot. 90, 93–
106.
Millar, A.A., Dennis, E.S., 1996. The alcohol dehydrogenase genes of
cotton. Plant Mol. Biol. 31, 897–904.
Miller, R.E., Rausher, M.D., Manos, P.S., 1999. Phylogenetic
systematics of Ipomoea (Convolvulaceae) based on ITS and waxy
sequences. Syst. Bot. 24, 209–227.
Minin, V., Abdo, Z., Joyce, P., Sullivan, J., 2003. Performance-based
selection of likelihood models for phylogeny estimation. Syst. Biol.
52, 674–683.
Mitchell, L.E., Dennis, E.S., Peacock, W.J., 1989. Molecular analysis
of an alcohol dehydrogenase (Adh) gene from chromosome 1 of
wheat. Genome 32, 349–358.
Miyashita, N.T., 2001. DNA variation in the 5′ upstream region of the
Adh locus of the wild plants Arabidopsis thaliana and Arabis
gemmifera. Mol. Biol. Evol. 18, 164–171.
Morton, B.R., 1998. Selection on the codon bias of chloroplast and
cyanelle genes in different plant and algal lineages. J. Mol. Evol. 46,
449–459.
Perry, D.J., Furnier, G.R., 1996. Pinus banksiana has at least seven
expressed alcohol dehydrogenase genes in two linked groups. Proc.
Natl. Acad. Sci. USA 93, 13020–13023.
Posada, D., Crandall, K.A., 1998. Modeltest: testing the model of
DNA substitution. Bioinformatics 14, 817–818.
Rettig, J.H., 1990. Achene micromorphology of the Carex
nigromarginata complex (section Acrocystis, Cyperaceae). Rhodora 92, 70–79.
Rettig, J.H., Giannasi, D.E., 1990. Foliar flavonoids of the Carex
nigromarginata complex (sect. Acrocystis, Cyperaceae). Biochem.
Syst. Ecol. 18, 393–397.
Reznicek, A.A., 1990. Evolution in sedges (Carex, Cyperaceae). Can.
J. Bot. 68, 1409–1432.
Roalson, E.H., Friar, E.A., 2000. Supraspecific classification of
Eleocharis (Cyperaceae) revisited: Evidence from the internal
transcribed spacer (ITS) region of nuclear ribosomal DNA. Syst.
Bot. 25, 323–336.
Roalson, E.H., Friar, E.A., 2004. Phylogenetic relationships and
biogeographic patterns in North American members of Carex
section Acrocystis (Cyperaceae) using nrDNA ITS and ETS
sequence data. Plant Syst. Evol. 243, 175–187.
Roalson, E.H., Columbus, J.T., Friar, E.A., 2001. Phylogenetic
relationships in Cariceae (Cyperaceae) based on ITS (nrDNA)
and trnT-L-F (cpDNA) region sequences: Assessment of
subgeneric and sectional relationships in Carex with emphasis on section
Acrocystis. Syst. Bot. 26, 318–341.
Sang, T., Donoghue, M.J., Zhang, D., 1997. Evolution of alcohol
dehydrogenase genes in Peonies (Paeonia): Phylogenetic
relationships of putative nonhybrid species. Mol. Biol. Evol. 14,
994–1007.
Small, R.L., Ryburn, J.A., Cronn, R.C., Seelanan, T., Wendel, J.F.,
1998. The tortoise and the hare: Choosing between noncoding
plastome and nuclear Adh sequences for phylogeny reconstruction
in a recently diverged plant group. Amer. J. Bot. 85, 1301–1315.
Small, R.L., Ryburn, J.A., Wendel, J.F., 1999. Low levels of
nucleotide diversity at homeologous Adh loci in allotetraploid
cotton (Gossypium L.). Mol. Biol. Evol. 16, 491–501.
Small, R.L., Wendel, J.F., 2000a. Phylogeny, duplication, and
intraspecific variation of Adh sequences in New World diploid cottons
(Gossypium L., Malvaceae). Mol. Phylogenet. Evol. 16, 73–84.
Small, R.L., Wendel, J.F., 2000b. Copy number lability and
evolutionary dynamics of the Adh gene family in diploid and tetraploid
cotton (Gossypium). Genetics 155, 1913–1926.
Soltis, D.E., Soltis, P.S., Chase, M.W., et al. (13 co-authors), 2000.
Angiosperm phylogeny inferred from 18S rDNA, rbcL, and atpB
sequences. Bot. J. Linn. Soc. 133, 381–461.
Soltis, D.E., Senters, A.E., Zanis, M.J., Kim, S., Thompson, J.D.,
Soltis, P.S., Ronse de Craene, L.P., Endress, P.K., Farris, J.S.,
2003. Gunnerales are sister to other core eudicots: implications for
the evolution of pentamery. Amer. J. Bot. 90, 461–470.
Starr, J.R., Bayer, R.J., Ford, B.A., 1999. The phylogenetic position of
Carex section Phyllostachys and its implications for phylogeny and
subgeneric circumscription in Carex (Cyperaceae). Amer. J. Bot.
86, 563–577.
Starr, J.R., Harris, S.A., Simpson, D.A., 2003. Potential of the 5′ and
3′ external transcribed spacers (ETS) of the rDNA in the
Cyperaceae: new sequences for lower-level phylogenies in sedges
with an example from Uncinia Pers. Int. J. Plant Sci. 164, 213–227.
Swofford, D., 2001. PAUP*: Phylogenetic Analysis using Parsimony,
Version 4.0b10. Laboratory of Molecular Systematics, Smithsonian
Institution, Washington, DC, Sinauer, Sunderland, MA.
Yen, A.C., Olmstead, R.G., 2000. Molecular systematics of
Cyperaceae tribe Cariceae based on two chloroplast DNA regions: ndhF
and trnL intron-intergenic spacer. Syst. Bot. 25, 479–494.