The age of the grasses and clusters of origins of C4 photosynthesis

15
The age of the grasses and clusters of origins of C 4 photosynthesis ALBERTO VICENTINI * 1 , JANET C. BARBER w , SANDRA S. ALISCIONI z, LILIANA M. GIUSSANI§ and ELIZABETH A. KELLOGG * *Department of Biology, University of Missouri, St Louis, One University Boulevard, St Louis, MO 63121, USA, wDepartment of Biology, Saint Louis University, 3507 Laclede Avenue, St Louis, MO 63103, USA, zCa ´tedra de Bota ´nica Agrı ´cola, Facultad de Agronomı ´a, Universidad de Buenos Aires, Av. San Martin 4453, C1417DSE, Buenos Aires, Argentina, §Instituto de Bota ´nica Darwinion, Labarde ´n 200, Casilla de Correo 22, San Isidro, Buenos Aires 1642, Argentina Abstract At high temperatures and relatively low CO 2 concentrations, plants can most efficiently fix carbon to form carbohydrates through C 4 photosynthesis rather than through the ancestral and more widespread C 3 pathway. Because most C 4 plants are grasses, studies of the origin of C 4 are intimately tied to studies of the origin of the grasses. We present here a phylogeny of the grass family, based on nuclear and chloroplast genes, and calibrated with six fossils. We find that the earliest origins of C 4 likely occurred about 32 million years ago (Ma) in the Oligocene, coinciding with a reduction in global CO 2 levels. After the initial appearance of C 4 species, photosynthetic pathway changed at least 15 more times; we estimate nine total origins of C 4 from C 3 ancestors, at least two changes of C 4 subtype, and five reversals to C 3 . We find a cluster of C 4 to C 3 reversals in the Early Miocene correlating with a drop in global temperatures, and a subsequent cluster of C 4 origins in the Mid-Miocene, correlating with the rise in temperature at the Mid-Miocene climatic optimum. In the process of dating the origins of C 4 , we were also able to provide estimated times for other major events in grass evolution. We find that the common ancestor of the grasses (the crown node) originated in the upper Cretaceous. The common ancestor of maize and rice lived at 52 8 Ma. Keywords: evolution, Miocene climate, photosynthesis, Poaceae Received 23 April 2008 and accepted 27 May 2008 Introduction About a quarter of the world’s terrestrial primary productivity comes from C 4 plants (Lloyd & Farquhar, 1994), most of which, whether computed in terms of number of species or in terms of biomass, are grasses (Sage et al., 1999a, b). Grasses that use the C 4 photosyn- thetic pathway dominate the Great Plains of North America, the grasslands of Africa and Australia, and are components of most grassland and savanna biomes in the warmer parts of the world. In addition, C 4 grasses such as maize, sorghum, sugar cane and the various species of millets provide a significant source of human calories. Finally, species currently used or being studied as sources of biofuels, such as maize, sugar cane, switchgrass and Miscanthus, are C 4 . The C 4 pathway is derived from the simpler and more ancient C 3 carbon fixation mechanism found in the majority of plants, and is presumed to be an adaptation to the low atmospheric CO 2 levels and high O 2 levels that have prevailed since the Oligocene (Sage, 2004), in which the C 3 pathway is inherently inefficient (Ehleringer & Monson, 1993). Rubisco (ribulose 1,5,-bisphosphate car- boxylase oxygenase), the enzyme that fixes CO 2 in plants, binds O 2 , as well as CO 2 , a process known as photo- respiration, which increasingly reduces the efficiency of photosynthesis as temperatures increase (Ehleringer & Monson, 1993; Tipple & Pagani, 2007). In contrast, C 4 plants minimize photorespiration by sequestering Rubisco in cells around the veins, and use a different primary carbon acceptor (phosphoenol pyruvate) in the mesophyll. C 4 photosynthesis is thus more efficient than 1 Present address: Center for Tropical Forest Science, Smithsonian Tropical Research Institute, Apartado Postal 0843-03092, Panama ´, Panama. Correspondence: Elizabeth A. Kellogg, tel. 1 1 314 516 6217, fax 1 1 314 516 6233, e-mail: [email protected] Global Change Biology (2008) 14, 1–15, doi: 10.1111/j.1365-2486.2008.01688.x r 2008 The Authors Journal compilation r 2008 Blackwell Publishing Ltd 1

Transcript of The age of the grasses and clusters of origins of C4 photosynthesis

The age of the grasses and clusters of origins of C4

photosynthesis

A L B E R T O V I C E N T I N I *1, J A N E T C . B A R B E R w , S A N D R A S . A L I S C I O N I z,L I L I A N A M . G I U S S A N I § and E L I Z A B E T H A . K E L L O G G *

*Department of Biology, University of Missouri, St Louis, One University Boulevard, St Louis, MO 63121, USA, wDepartment of

Biology, Saint Louis University, 3507 Laclede Avenue, St Louis, MO 63103, USA, zCatedra de Botanica Agrıcola, Facultad de

Agronomıa, Universidad de Buenos Aires, Av. San Martin 4453, C1417DSE, Buenos Aires, Argentina, §Instituto de Botanica

Darwinion, Labarden 200, Casilla de Correo 22, San Isidro, Buenos Aires 1642, Argentina

Abstract

At high temperatures and relatively low CO2 concentrations, plants can most efficiently

fix carbon to form carbohydrates through C4 photosynthesis rather than through the

ancestral and more widespread C3 pathway. Because most C4 plants are grasses, studies

of the origin of C4 are intimately tied to studies of the origin of the grasses. We present

here a phylogeny of the grass family, based on nuclear and chloroplast genes, and

calibrated with six fossils. We find that the earliest origins of C4 likely occurred about 32

million years ago (Ma) in the Oligocene, coinciding with a reduction in global CO2

levels. After the initial appearance of C4 species, photosynthetic pathway changed at

least 15 more times; we estimate nine total origins of C4 from C3 ancestors, at least two

changes of C4 subtype, and five reversals to C3. We find a cluster of C4 to C3 reversals in

the Early Miocene correlating with a drop in global temperatures, and a subsequent

cluster of C4 origins in the Mid-Miocene, correlating with the rise in temperature at the

Mid-Miocene climatic optimum. In the process of dating the origins of C4, we were also

able to provide estimated times for other major events in grass evolution. We find that

the common ancestor of the grasses (the crown node) originated in the upper Cretaceous.

The common ancestor of maize and rice lived at 52� 8 Ma.

Keywords: evolution, Miocene climate, photosynthesis, Poaceae

Received 23 April 2008 and accepted 27 May 2008

Introduction

About a quarter of the world’s terrestrial primary

productivity comes from C4 plants (Lloyd & Farquhar,

1994), most of which, whether computed in terms of

number of species or in terms of biomass, are grasses

(Sage et al., 1999a, b). Grasses that use the C4 photosyn-

thetic pathway dominate the Great Plains of North

America, the grasslands of Africa and Australia, and

are components of most grassland and savanna biomes

in the warmer parts of the world. In addition, C4 grasses

such as maize, sorghum, sugar cane and the various

species of millets provide a significant source of human

calories. Finally, species currently used or being studied

as sources of biofuels, such as maize, sugar cane,

switchgrass and Miscanthus, are C4.

The C4 pathway is derived from the simpler and more

ancient C3 carbon fixation mechanism found in the

majority of plants, and is presumed to be an adaptation

to the low atmospheric CO2 levels and high O2 levels that

have prevailed since the Oligocene (Sage, 2004), in which

the C3 pathway is inherently inefficient (Ehleringer &

Monson, 1993). Rubisco (ribulose 1,5,-bisphosphate car-

boxylase oxygenase), the enzyme that fixes CO2 in plants,

binds O2, as well as CO2, a process known as photo-

respiration, which increasingly reduces the efficiency of

photosynthesis as temperatures increase (Ehleringer &

Monson, 1993; Tipple & Pagani, 2007). In contrast,

C4 plants minimize photorespiration by sequestering

Rubisco in cells around the veins, and use a different

primary carbon acceptor (phosphoenol pyruvate) in the

mesophyll. C4 photosynthesis is thus more efficient than

1Present address: Center for Tropical Forest Science, Smithsonian

Tropical Research Institute, Apartado Postal 0843-03092, Panama,

Panama.

Correspondence: Elizabeth A. Kellogg, tel. 1 1 314 516 6217,

fax 1 1 314 516 6233, e-mail: [email protected]

Global Change Biology (2008) 14, 1–15, doi: 10.1111/j.1365-2486.2008.01688.x

r 2008 The AuthorsJournal compilation r 2008 Blackwell Publishing Ltd 1

C3, particularly at high temperatures, but it requires

more energy and only becomes advantageous when

atmospheric pCO2 (partial pressure of CO2) is at its

current relatively low level (below � 500 ppm) and

when temperatures are relatively high (above � 25 1C)

(Ehleringer et al., 1997).

The grass family (Poaceae) includes more C4 species

than all other plant families combined (Clayton &

Renvoize, 1986; Sage et al., 1999a). The C4 pathway

has originated independently in four of the 13 grass

subfamilies (Sinha & Kellogg, 1996; Kellogg, 1999;

Giussani et al., 2001; Christin et al., 2008), including

multiple changes in subfamily Panicoideae (Giussani

et al., 2001; Duvall et al., 2003). C4 also appears in

subfamilies Chloridoideae, Aristidoideae, and Micrair-

oideae. The various C4 lineages differ in leaf anatomy

(Prendergast et al., 1987), the tissue-specific expression

of photosynthetic enzymes (Sinha & Kellogg, 1996),

and the primary enzymes used in metabolizing the

four-carbon molecules. The differences support the

hypothesis of independent gains of the pathway.

The great C4 grasslands of today expanded in the Late

Miocene, 6–8 million years ago (Ma) (Wang et al., 1994;

Fox & Koch, 2003), but the earliest fossil grass leaf with

C4 anatomy is dated at 12.5 Ma (Nambudiri et al., 1978;

Jacobs et al., 1999), and silica bodies (phytoliths)

diagnostic of the C4 subfamily Chloridoideae have been

identified at about 19 Ma (Stromberg, 2005). C4 origins

thus precede C4 grassland expansions.

Global CO2 levels in most of the Tertiary were much

higher than at present, well over 1000 ppm and fluctu-

ating to even higher levels during the Eocene (Cerling,

1991; Ekart et al., 1999; Pearson & Palmer, 2000; Royer

et al., 2004). The Oligocene saw a drop in atmospheric

CO2, with modern (i.e. preindustrial) levels reached by

the Early Miocene (Pagani et al., 1999, 2005). Global

temperature also fluctuated during this time, falling

from a high during the Eocene climatic optimum to

recent relatively low global temperatures, with much

variation along the way (Zachos et al., 2001; Lunt et al.,

2007). The connection between global temperature and

other climatic parameters (seasonality, aridity) is not

simple, and remains a subject of active research.

Ehleringer et al. (1991) proposed low atmospheric

CO2 levels to explain broad ecological expansion of C4

grasslands in the Late Miocene, an idea that was later

pursued by Ehleringer et al. (1997) and Cerling et al.

(1997). More recently, it has become clear that the

preindustrial CO2 levels were achieved much earlier,

between ca. 35 and 25 Ma during the Oligocene (Pagani

et al., 2005), prompting (Sage, 2001, 2004) to suggest a

possible correlation between the Oligocene drop in CO2

and the origins (rather than expanded distribution) of

C4 species.

A test of Sage’s hypothesis requires (a) a well-

supported phylogeny of the grasses, including repre-

sentatives of the various C4 lineages; (b) inference of the

ancestral photosynthetic pathway at the nodes (specia-

tion events) in the phylogeny; and (c) assigning dates to

the nodes. Some years ago, we produced a phylogeny of

the subfamily Panicoideae based on the chloroplast

gene ndhF (Giussani et al., 2001) and identified nodes

in the phylogeny where C4 might have originated.

That study formed the basis of recent work by Christin

et al. (2008) and the present paper. Because the

chloroplast is maternally inherited, its history reflects

only one line of evidence for the history of the plants.

Accordingly, we chose to verify the phylogeny of Gius-

sani et al. (2001) by assembling data for a nuclear gene,

phytochrome B (phyB), and present those results here.

In contrast, Christin et al. (2008) continued to use

chloroplast markers, including ndhF, but expanded the

sampling of C4 species to include representatives in

other grass subfamilies.

A phylogeny estimates the numbers of mutations

separating extant species from their ancestors, but con-

verting the number of mutations to an estimate of time

requires a calibration point, generally a fossil. An in-

dividual fossil provides a minimum age for a taxonomic

group. For example, Thomasson (1978) described a

fossil that can be assigned to the modern genus

Dichanthelium; the formation in which the fossil was

found is dated at 8 Ma. Thus, the genus Dichanthelium

must be at least 8 Ma old, although it could clearly be

older. Other reliable fossil dates are provided by a fossil

of Setaria at 7 Ma (Elias, 1942), and the chloridoid

phytoliths at 19 Ma (Stromberg, 2005).

Multiple fossils have been suggested to provide a

date for the earliest grasses, and these need to be

assessed in any effort to date the grass phylogeny.

Unequivocal grass pollen is known from 55 Ma

(Jacobs et al., 1999), and pollen that may be from a grass

or from a close relative is dated to 70 Ma (Linder, 1986;

Herendeen & Crane, 1995). A two-flowered grass

spikelet fossil from the Early Eocene (Crepet &

Feldman, 1991; � 55 Ma) does not match any extant

taxon, but indicates that multiflowered spikelets had

arisen by the Early Tertiary. A set of morphologically

diverse phytoliths from the North American Great

Plains suggests that all the major subfamilies had

diverged by 35 Ma (Stromberg, 2005). However, a recent

study of fossilized dinosaur dung (coprolites) in

India (Prasad et al., 2005) places grass diversification

much earlier, at 65–67 Ma. Because India was not

connected to any landmass at that date, Prasad et al.

(2005) interpreted this to mean that the radiation of the

family occurred even earlier, before the breakup of

Gondwana (�80 Ma).

2 A . V I C E N T I N I et al.

r 2008 The AuthorsJournal compilation r 2008 Blackwell Publishing Ltd, Global Change Biology, doi: 10.1111/j.1365-2486.2008.01688.x

Here, we present a calibrated phylogeny of the grasses,

focusing particularly on subfamily Panicoideae, and

incorporating both chloroplast and nuclear markers. We

investigate the effects of various fossil calibrations and

find that (1) all but the coprolite calibration provide

similar estimates for C4 origins; (2) the earliest C4 origins

correlate with the drop in global CO2 levels in the

Oligocene, confirming the results of Christin et al. (2008)

and the earlier hypothesis of Sage (2004); and (3) C4

origins are not correlated in any simple way with global

temperatures (i.e. warm global temperatures may not

lead to C4 origins). Our most striking result, however, is

(4) that changes in photosynthetic pathway are clustered

in time, suggesting that global climatic forces in addition

to temperature may have facilitated transitions between

C3 and C4. Our data also provide estimates for dates of

critical nodes of the grass phylogeny, including the

divergence of maize from rice.

Materials and methods

Taxon sampling

We assembled sequence data for 97 grasses plus three

more distantly related monocots (Muscari in Hycintha-

ceae, Flagellaria in Flagellariaceae, and Joinvillea in

Joinvilleaceae). The software used for dating the tree

also requires a still more distant relative that is pruned

from the tree during the analysis, so we included the

eudicot Arabidopsis. All the subfamilies of grasses are

represented except for the recently emended Micrair-

oideae for which phyB sequences were unavailable. We

focused on subfamily Panicoideae, which comprises

three large clades (Giussani et al., 2001; Aliscioni et al.,

2003): Andropogoneae, Paniceae taxa with chromosome

numbers in multiples of 10 (the x 5 10 Paniceae), and

Paniceae taxa with chromosome numbers in multiples

of 9 (the x 5 9 Paniceae). Multiple representatives of

each clade are included here (Table S1).

DNA sequencing

Sequences of the nuclear gene phyB were generated

to maximize comparability with the available ndhF

(chloroplast) data. Many of the ndhF data in GenBank

(and used by Christin et al., 2008) were produced by

several of us (Giussani et al., 2001; Aliscioni et al., 2003),

so we were able to produce phyB sequences from either

the same DNA used in our previous study, or new DNA

extracted from the silica-dried ndhF voucher material.

In a few cases, we extracted new DNA from living plant

material. Total genomic DNA was extracted with the

cetyl-trimethylammonium bromide method (Doyle &

Doyle, 1987) and purified via cesium chloride gradients

(Sambrook et al., 1989). Approximately 1.2 kb of exon 1

of phyB was amplified following Mathews et al. (2000).

Except in maize (which has two copies Sheehan et al.,

2004), phyB is apparently single copy in grasses

(Mathews & Sharrock, 1996); thus, strong PCR products

were purified and sequenced directly. PCR products

were cloned before sequencing for 36 taxa that

amplified weakly. Fragments were sequenced on both

strands with the external amplification primers and two

internal primers (Mathews et al., 2000). Sequence

divergence among multiple clones was o1% and all

the cloned sequences of a taxon clustered together in

preliminary analyses, so consensus sequences were

generated for each species, coding ambiguities as

polymorphic. All the sequences of ndhF and additional

phyB sequences were downloaded from GenBank.

Genbank accession numbers are listed in Table S1.

Multiple alignments of both ndhF and phyB were

produced in MacClade (Maddison & Maddison, 2000);

alignment was easily done by eye. Sequences were

translated to verify the presence of an open reading

frame. Following the procedures used by the Grass

Phylogeny Working Group (2001), 13 taxa were

represented by an ndhF sequence of one species and a

phyB sequence of another; this assumes that the relevant

taxa are monophyletic, an assumption corroborated by

data in the literature.

Phylogenetic analysis

We first tested whether the new data from the nuclear

gene indicated the same phylogeny as the published

ndhF data. We analyzed the phyB and ndhF datasets

individually with parsimony and Bayesian approaches,

and retrieved trees in which major clades were strongly

supported and identical in composition (deposited in

TreeBase). Nonetheless, there were several points of

conflict between the trees. Conflict was highly localized

and involved relationships of only four regions of

the tree, within, rather than between, major clades.

Accordingly, we combined the two datasets.

For the combined dataset, MrModeltest 2.2 (Nylan-

der, 2004) chose the GTR 1 I 1 G as the best model for

each gene. Two Bayesian analyses were performed with

MrBayes (Huelsenbeck & Ronquist, 2001) running on

the Beowulf cluster at University of Missouri, St Louis,

with the following options: ngen 5 10 000 000, print-

freq 5 1000, samplefreq 5 1000, and burnin 5 1000. A

majority rule consensus tree was obtained with the

18 000 trees saved for two independent runs and the

frequency of each node was recorded as the Bayesian

posterior probability (Fig. S1). The 50% majority rule

consensus tree was fully dichotomous and identical to

the maximum likelihood tree. Because assignment of

C L U S T E R E D O R I G I N S O F C 4 P H O T O S Y N T H E S I S 3

r 2008 The AuthorsJournal compilation r 2008 Blackwell Publishing Ltd, Global Change Biology, doi: 10.1111/j.1365-2486.2008.01688.x

dates requires an outgroup that is then pruned from the

analysis, we rooted the trees using two unambiguous

outgroups, Arabidopsis and Muscari. These outgroups

are omitted from all the figures for clarity.

Molecular dating

Dating used a Bayesian relaxed-clock method (Thorne

& Kishino, 2002) as implemented in Multidivtime

(Rutschmann, 2005). Model parameters, and

branch-lengths and their variance–covariance matrices

were estimated separately for the ndhF and phyB

partitions (Estbranches step). Estimation of rates and

chronogram sampling was performed with 10 000

Markov chain samples (numsamps) taken at every 1000

cycles (sampfreq) starting after 1 000 000 initial cycles

(burnin). Priors for parameters rtrate and rtratesd in the

command file multicntrl.dat were selected by taking the

median amount of evolution from tips to the ingroup

root and dividing this figure by the prior age for the

ingroup node (rttm parameter), as suggested in the

Multidivtime manual (Rutschmann, 2005). The rttm

parameter was set to 1.5 [i.e. 150 Ma with a standard

deviation of 0.01 (rttmsd)], and bigtime was set to 10

(i.e. 1000 Ma). Other parameters were the default

values: brownmean 5 1, brownsd 5 1, minab 5 1,

newk 5 0.1, othk 5 0.5, thek 5 0.5. Multiple calibrations

were tested, using fossils, as described in the text

(Table 1).

Ancestral state reconstruction

Species were coded as C3 or C4; among the C4 species,

we distinguished among the NADP malic enzyme,

NAD malic enzyme, and phosphoenolpyruvate

carboxykinase (PCK) subtypes (Table S1). The photo-

synthesis type of Aristida was set to NADP, although

Aristida differs from other NADP plants in having a

double bundle sheath (Prendergast & Hattersley, 1987),

carbon reduction in both sheaths, lack of a suberized

lamella in any bundle sheath cells, and a distinctive

pattern of expression of photosynthetic genes (Sinha &

Kellogg, 1996). Likewise, Sporobolus was coded as NAD,

although different species of Sporobolus exhibit either

the NAD or PCK subtypes. Steinchisma, the only true

C3/C4 intermediate in this sample of taxa, was coded as

NADP. Outside the genus Steinchisma, the only known

C3/C4 intermediate in Poaceae is Neurachne minor (Hat-

tersley & Stone, 1986), which was not available for this

study.

To infer the photosynthetic pathway of ancestral species,

Pagel’s (1994) maximum likelihood method for discrete

characters was used for both the unconstrained ML tree

and the chronograms resulting from the dating analyses. Tab

le1

Fo

ssil

dat

esu

sed

for

the

dif

fere

nt

cali

bra

tio

nsc

hem

es

No

de

Ag

e(M

a)F

oss

ilR

efer

ence

Po

llen

55-7

0P

ol-

len

L70

Sp

ikel

etP

hy

toli

thG

on

dw

ana

Co

pro

lite

Pol

l.S

pike

.P

hyt.

b7

Set

aria

Eli

as(1

942)

L7

L7

L7

L7

L7

L7

W8

Dic

han

thel

ium

Th

om

asso

n(1

978)

L8

L8

L8

L8

L8

L8

G19

Ch

lori

do

idS

tro

mb

erg

(200

5)L

19L

19L

19L

19L

19L

19

E35

NA

ph

yto

lith

sS

tro

mb

erg

(200

5)L

35L

35

A55

–70

Po

llen

Lin

der

(198

6)L

55/

U70

L70

L70

C55

Mu

ltifl

ow

ersp

ikel

etC

rep

et&

Fel

dm

an(1

991)

L55

L55

E80

Co

nti

nen

tal

dri

ftP

rasa

det

al.

(200

5)L

80

E67

Co

pro

lite

sP

rasa

det

al.

(200

5)L

67

Tim

eis

ind

icat

edin

mil

lio

ns

of

yea

rs.

Lo

rU

bef

ore

the

nu

mb

ero

fm

illi

on

so

fy

ears

ind

icat

esw

het

her

the

nu

mb

erse

rved

asa

low

er(L

)o

ru

pp

er(U

)b

ou

nd

ary

for

the

no

de.

No

des

are

ind

icat

edin

Fig

s1

and

S2.

4 A . V I C E N T I N I et al.

r 2008 The AuthorsJournal compilation r 2008 Blackwell Publishing Ltd, Global Change Biology, doi: 10.1111/j.1365-2486.2008.01688.x

For each tree, we tested different models of transition rates

(equal, symmetrical and all-different models) and a range

of initial likelihood values (ip 5 0–1) using function ace of

package Ape in R (Paradis et al., 2004). The best combina-

tion of transition rate model and initial likelihood setting

was then obtained given the lowest absolute Akaike

Information Criterion (AIC) score for the tested recon-

structions. In all cases, the best model was the ‘equal rates’

model regardless of the initial ip parameter (the default

0.01 was then used). Thus, these data indicate that gains

and losses of C4 photosynthesis are equally likely.

We also explored the panicoid portion of the tree using

parsimony optimization on the individual gene trees

under ACCTRAN and DELTRAN optimizations. As

expected, the precise number of gains and losses varied

depending on the weighting scheme and also somewhat

depending on the sample of taxa. For example, for the

phyB trees, there were two to five gains and six to nine

losses of C4 under ACCTRAN optimization, whereas

DELTRAN optimization showed three to six gains and

five to seven losses. Similarly, optimization on the ndhF

trees found 2–12 gains and zero to nine losses under

ACCTRAN, and 5–13 gains and zero to six losses under

DELTRAN. Nevertheless, changes (of whatever direction)

were optimized to the same nodes as identified by the

likelihood weighting employed here (see Pollen.Spike.Phyt.

calibration in Fig. S2).

Simulation of transitions of photosynthetic pathways

With LASER (Rabosky, 2006), we created 1000 simulated

datasets, each with as many transitions in photosynthetic

pathway as observed, generated under a pure birth con-

stant rate speciation model (Rabosky, 2006). These simu-

lated datasets were then scaled to vary within the age of

the grasses (from node B in Fig. 1 to the present), and we

defined a set of time windows of different sizes (varying

from 1 to 10 Ma with increments of 0.5). Then, for each

time –window, we tested at different points in time (from

50 Ma to the present with increments of 0.5 Ma) whether

the number of observed transition events was greater than

expected under a constant rate of transitions (number of

transition events in the 1000 simulated datasets). As the

observed date for transitions, we used the mean age of

each event (Fig. 3a–d). We did this separately for C3–C4

and C4–C3 events for each of the four calibration schemes

discussed in the text.

Results

Phylogenies and dating

The phylogenetic tree based on combined DNA

sequences from the chloroplast and the nucleus

(Fig. S1) was well supported and consistent with pre-

vious phylogenetic results (GPWG, 2001; Sanchez-Ken

et al., 2007; Bouchenak-Khelladi et al., 2008; Christin

et al., 2008). We retrieved a large clade including Bam-

busoideae, Ehrhartoideae, and Pooideae (the BEP clade)

sister to the clade including Panicoideae, Arundindoi-

deae, Centothecoideae, Chloridoideae, Micrairoideae,

Aristidoideae, and Danthonioideae (the PACCMAD

clade). All the C4 origins are in the latter clade, as found

in all the previous studies (e.g. GPWG, 2001).

We estimated dates for the internal nodes, and con-

verted the tree to a chronogram (Fig. 1), using multiple

fossil calibrations. We assigned the minimum age for

the stem nodes of the Chloridoideae, Dichanthelium, and

Setaria (nodes G, W, and b) at 19, 8, and 7 Ma, respec-

tively, based on fossil phytoliths and anthecia (Elias,

1942; Thomasson, 1978; Stromberg, 2005) (Table 1). (The

stem node for a group is the earliest date at which a

distinctive feature might have arisen. For example, the

phytoliths for Chloridoideae must have originated by

node H, but could have originated much earlier im-

mediately after the divergence of the chloridoid ances-

tor from its sister group at node G. G is the stem node,

whereas H is the crown node.)

We tested the effects of using different dates to

calibrate deep nodes in the phylogeny (Table 1). The

two grass pollen fossils [one at 55 Ma unequivocally a

grass, the other at 70 Ma possibly a grass (Linder, 1986;

Herendeen & Crane, 1995)], let us date the stem node of

the family (A in Fig. 1) as 55 or 70 Ma (calibrations

PollenL55 and PollenL70). Spikelets arose between nodes

B and C, and multiflowered spikelets between nodes C

and D (GPWG, 2001). Accordingly, we used the date of

the Eocene grass spikelet fossil (Crepet & Feldman,

1991) to constrain the stem node (C) of the multiflow-

ered spikelet clade to 55 Ma (Spikelet calibration). Phy-

tolith data from the North American Great Plains

suggest that the BEP clade and the PACCMAD clade

had diverged by 35 Ma (Stromberg, 2005), providing a

minimum date for node E (Phytolith calibration). The

dinosaur coprolite (Prasad et al., 2005), however, places

the same node much earlier, at 65–67 Ma (Coprolite

calibration), although Prasad et al. (2005) preferred a

Gondwanan age for the node (�80 Ma; Gondwana cali-

bration). All the calibrations thus used four fossils –

chloridoid phytoliths, Dichanthelium, Setaria, and one of

six possible dates for the deep nodes (Table 1).

Dates estimated from the PollenL55, PollenL70, Spike-

let, and Phytolith calibrations were similar for all nodes,

so the latter three were combined with those of the three

more recent fossils to provide an estimate of dates based

on six fossils (Poll.Spike.Phyt calibration; Figs 1 and 2;

Tables 1 and 2). Dates estimated from Poll.Spike.Phyt

were similar to dates estimated by other authors (Bre-

C L U S T E R E D O R I G I N S O F C 4 P H O T O S Y N T H E S I S 5

r 2008 The AuthorsJournal compilation r 2008 Blackwell Publishing Ltd, Global Change Biology, doi: 10.1111/j.1365-2486.2008.01688.x

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

PQ

R

ST

U

V

W

X

Y

Z

a

bc

Bambusoideae

Pooideae

Andropogoneae

Paniceae.X=10

Paniceae.X=9

Photosynthesis type

C4 NAD

C4 NADP

C4 PCK

C3

6 A . V I C E N T I N I et al.

r 2008 The AuthorsJournal compilation r 2008 Blackwell Publishing Ltd, Global Change Biology, doi: 10.1111/j.1365-2486.2008.01688.x

mer, 2002; Gaut, 2002; Christin et al., 2008) who used

fewer points of calibration. Values for the Coprolite and

Gondwana calibrations were consistently older than the

others (Table 2; Fig. 2).

Independent gains of C4 occurred in Aristidoideae,

Chloridoideae, Panicoideae, and multiple other

lineages (Fig. 1 and Fig. S2). The earliest possible change

to C4 occurred some time after the divergence of the

stem node of the Aristidoideae (node F), making node F

the maximum age of C4 grasses. Likewise, the dates of

the stem node of the chloridoids (G) and of Danthoniop-

sis (J) are maxima. Other switches between photosyn-

thetic pathways could be assigned to intervals between

dated nodes (Fig. 3).

Clustered changes of photosynthetic pathway andcorrelation with climate

We found that evolutionary changes in photosynthetic

pathway were clustered in geological time. We simu-

lated the pattern of C4 origins and reversals assuming a

constant rate of change throughout grass evolution

(Rabosky, 2006), and tested whether significantly more

changes occurred in any window of time than predicted

for constant rates. Transition times were significantly

clustered regardless of calibration and window size

(Fig. 3). Four to five C3–C4 transitions and two to four

reversals occurred within 10 Ma (the maximum win-

dow-size tested), depending on calibration scheme.

Small window sizes (o5 Ma) resolved two separate

time periods containing significantly more C3–C4 tran-

sitions (42) than random, as well as one period of

clustered reversals. Notably, for all the analyses with

all calibrations, C3–C4 clusters occurred during different

periods of geological time than did the C4–C3 clusters

(Figs 3 and 4).

We correlated the estimated dates of C4 transitions

with published values for global pCO2 (Pagani et al.,

2005). The earliest unequivocal origin of C4 occurred at

node J (second red dot from the left in Fig. 4),

31.7 � 5.9 Ma by the Poll.Spike.Phyt. calibration, or

slightly later with the PollenL55 calibration, suggesting

that C4 originated as global pCO2 fell to the range in

which C4 is energetically favorable (Table 2; Figs 3 and

4). This is consistent with climatic models that indicate

conditions favorable for C4 grasses in the late Oligo-

cene, even at CO2 levels around 800 ppm (Lunt et al.,

2007). The Gondwana and Coprolite calibrations, in con-

trast, place node J at 51.8 and 44.3 Ma, when global

pCO2 was over 1000 ppm (Berner, 2004).

Our data also suggest that C4 could have originated

even earlier, possibly soon after the stem node of

Aristidoideae (node F). The Poll.Spike.Phyt. and Pol-

lenL55 calibrations date node F at 42–44 Ma, but this is

a maximum date, so the calibration does not necessarily

contradict a correlation between pCO2 and C4 origin

(Figs 3a and 4a). The Coprolite and Gondwana calibra-

tions, in contrast, place both node F and J much earlier,

near or at the Eocene climatic optimum (Figs 3 and 4),

indicating no correlation between pCO2 and C4.

Finally, we correlated C4 origins with the global deep-

sea oxygen isotopic (@18O) signal (Zachos et al., 2001), a

proxy for temperature. Poll.Spike.Phyt. dates the first

unequivocal C4 origins (node J) coincident with the

glacial aberration Oi-1 (Zachos et al., 2001) (Figs 3a

and 4a). Clusters of C3–C4 changes occur during the

Mid-Miocene climatic optimum (MMCO), 17–15 Ma,

and also during a much cooler period � 12–11 Ma. A

cluster of reversals to C3 correlated with @18O values

suggesting a cool climate somewhat before the MMCO,

at � 21–18 Ma. The PollenL55 calibration indicates a

second cluster of reversals to C3 at � 13–10 Ma. Thus,

the origin of C4 clades is not a simple response to

elevated global temperature, and likely reflects more

complex environmental changes.

Discussion

Dates of C4 origins

Our best estimate of the earliest unequivocal C4 origin

(after node I and before J) is between 34.3 ( � 6.2) and

31.7 ( � 5.9) Ma, using the six-fossil (Poll.Spike.Phyt.)

calibration. This is comparable with the results of

Christin et al. (2008), who place the oldest C4 between

32.0 ( � 4.4) and 25.0 ( � 4.0) Ma. The standard devia-

tions of their estimate and ours overlap. Our data place

the earliest unambiguous C4 at the common ancestor of

Danthoniopsis and other PACCMAD taxa, whereas

Christin et al. (2008) place the earliest C4 as the ancestor

Fig. 1 Chronogram for the Poll.Spike.Phyt. (six-fossil) calibration (Table 2) with the maximum likelihood reconstruction of photosynth-

esis pathway evolution. Graph of pCO2 is from Pagani et al. (2005) and @18O from Zachos et al. (2001). Nodes labeled A and B refer to stem

and crown nodes of grasses, respectively; other labeled nodes mark changes in the photosynthetic pathway (see also Fig. S2a–d). Except

for Joinvillea and Flagellaria, all taxa are members of the Poaceae (grasses). Grass subfamilies are indicated by vertical bars. Danthoniopsis

and Gynerium are assigned to different subfamilies by different authors, so are left unassigned here. Dashed red line indicates time at

which pCO2 first drops below 500 ppm.

C L U S T E R E D O R I G I N S O F C 4 P H O T O S Y N T H E S I S 7

r 2008 The AuthorsJournal compilation r 2008 Blackwell Publishing Ltd, Global Change Biology, doi: 10.1111/j.1365-2486.2008.01688.x

of the core Chloridoideae. The difference reflects differ-

ences in taxonomic sampling in this portion of the tree

(broader in Christin et al., 2008) and in the amount of

DNA sequence per taxon (broader in the present study).

The phylogeny of the PACCMAD clade has always

been problematical (GPWG, 2001), and neither the tree

of Christin et al. (2008) nor the one presented here

provides strong support for relationships among the

major lineages of the clade.

Table 2 shows the dates of all C4 origins estimated by

Christin et al. (2008) and the present study. Comparing

our estimates based on the six-fossil (Poll.Spike.Phyt.),

calibration shows that none of the estimates are signifi-

cantly different, as indicated by broad overlap of con-

fidence intervals. Our Spikelet calibration is the same as

that used by Christin et al. (2008), as is the PollenL55

calibration. Because these two calibrations plus

PollenL70 and Phytolith all gave comparable dates for

C4 origins, it is not surprising that our results agree with

those of Christin et al. In addition, one of our two

datasets (ndhF) was the same as that used by Christin

et al., again explaining the general congruence of our

results.

Table 2 also shows that the mean dates estimated by

Christin et al. (2008) are generally slightly younger than

ours. Christin et al. (2008) constrained node D by

imposing both a lower limit (55 Ma) and an upper limit

(60 Ma) on the date, based on the Spikelet calibration. By

imposing an upper limit, all dates higher in the tree are

forced to be somewhat younger. (We used a similar

constraint in early analyses, but felt that the results were

artifactually precise, so reverted to using only the lower

limits presented here.)

With their broader taxonomic sample, Christin et al.

(2008) were able to estimate dates for additional C4

origins, beyond those presented here. These results

are summarized at the end of Table 2. They place the

origin of C4 in Micrairoideae at ca. 11 Ma, and date the

stem nodes of the C4 species of Neurachne and Alloter-

opsis at 4 and 15 Ma, respectively.

An important difference between the taxonomic

sample of Christin et al. (2008) and that of the present

study is in the Aristidoideae. Because we had only

a single representative of this subfamily, we could

provide only a maximum possible date for the C4 origin

(44.4 � 7.5 Ma). Christin et al. (2008) included all the

three genera of the subfamily, the C4 Aristida, Stipagros-

tis, and the C3 Sartidia. The latter two are native to

southern Africa and are not easy to collect, so their

inclusion is a valuable addition to the literature.

Importantly, Sartidia is strongly supported as sister

to Stipagrostis, which leads to an estimate of two

Nodes of interest

Age

est

imat

es (

Ma)

10

20

30

40

50

60

70

80

90

100

110

120

130

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c

Calibrations

PollenL55

PollenL70

Spikelet

Phytolith

Coprolite

Gondwana

Poll.Spike.Phyt

Fig. 2 Age estimates of seven calibration schemes (Table S2). Nodes listed are indicated in Figs 1 and S2.

8 A . V I C E N T I N I et al.

r 2008 The AuthorsJournal compilation r 2008 Blackwell Publishing Ltd, Global Change Biology, doi: 10.1111/j.1365-2486.2008.01688.x

Tab

le2

Dat

eso

fo

rig

ins

of

C4

un

der

dif

fere

nt

cali

bra

tio

nsc

hem

es(t

his

pap

er)

com

par

edw

ith

ori

gin

sh

yp

oth

esiz

edb

yC

hri

stin

etal

.(2

008)

No

de

Lab

elC

lad

eP

oll

enL

55P

oll

enL

70S

pik

elet

Ph

yto

lith

Co

pro

lite

Go

nd

wan

aP

oll.

Spi

ke.

Phy

t.*

Ch

rist

inet

al.

(200

8)

A[2

]G

rass

stem

no

de

–P

oll

enfo

ssil

89.5�

12.7w

90.6�

11.7

93.8�

1190�

12.2

111.

8�

11.3

121.

2�

10.9

93.8�

10.9

B[3

]G

rass

cro

wn

no

de

66.1�

11.3

67�

1170

.9�

9.3

66.7�

1191

.9�

910

4.4�

8.8

70.9�

9.2

C[5

]M

ult

iflo

wer

edsp

ikel

etfo

ssil

61.6�

10.7

62.5�

10.5

66.3�

8.7

62.2�

10.3

86.5�

8.4

98.8�

8.2

66.2�

8.6

D[6

]P

uel

ioid

eae-

RE

ST

52�

9.7

52.6�

9.6

55.7�

8.4

52.6�

9.4

77.1�

6.4

89.8�

655.6�

8.3

E[7

]B

EP

–PA

CC

MA

Dd

iver

gen

ce–

NA

ph

yto

lith

s

and

Co

pro

lite

foss

il

48.4�

9.1

48.9�

951

.8�

849�

8.8

72.7�

5.5

85.2�

551.6�

7.9

F[2

2]z

Ari

stid

a–

PA

CC

MA

Dcr

ow

nn

od

e41

.6�

8.3

42.1�

8.3

44.5�

7.6

42�

862

.4�

6.6

73.1�

6.9

44.4�

7.5

G[2

5]z

Spo

robo

lus

1E

ragr

osti

s–

Ch

lori

do

idfo

ssil

35.3�

7.3

35.6�

7.2

37.7�

6.7

35.6�

752

.6�

6.4

61.6�

6.9

37.6�

6.6

H[9

7]S

poro

bolu

s1

Era

gros

tis

21�

5.8

21.2�

5.9

22.4�

5.8

21.3�

5.7

31.4�

6.8

36.8�

822.5�

5.7

25.0�

4.0

I[2

6]z

Dan

thon

iops

is-Z

eacl

ade

32.2�

6.7

32.5�

6.7

34.4�

6.3

32.5�

6.6

47.9�

6.2

56.1�

6.8

34.3�

6.2

J[2

7]z

Dan

thon

iops

is-Z

eacl

ade

–E

arli

est

un

amb

igu

ou

sC

429

.8�

6.3

30.1�

6.3

31.8�

5.9

30.1�

6.2

44.3�

6.1

51.8�

6.7

31.7�

5.9

K[2

8]G

yner

ium

27.6�

5.9

27.8�

5.9

29.4�

5.5

27.9�

5.8

40.9�

5.7

47.9�

6.4

29.3�

5.5

L[3

0]P

anic

eae.

X5

1022

.9�

523

.1�

5.1

24.4�

4.8

23.1�

533

.9�

5.3

39.7�

5.9

24.3�

4.9

21.9�

3.9

M[4

7]P

anic

eae.

X5

1017

.4�

4.2

17.6�

4.2

18.6�

4.1

17.6�

4.1

25.9�

4.9

30.3�

5.4

18.5�

4.1

O[4

8]Ta

tian

yx-A

ltop

arad

isiu

mcl

ade

15.9�

3.9

16.1�

3.9

17�

3.8

16.1�

3.9

23.6�

4.5

27.7�

5.1

16.9�

3.8

12.3�

3.2

P[4

9]Ta

tian

yx-A

ltop

arad

isiu

mcl

ade

14�

3.6

14.1�

3.6

14.9�

3.5

14.1�

3.5

20.8�

4.2

24.3�

4.8

14.8�

3.5

11.3�

3.0

Q[5

3]H

omol

epis

6.8�

2.3

6.9�

2.4

7.3�

2.4

6.9�

2.3

10�

3.1

11.7�

3.6

7.2�

2.4

R[5

5]P

anic

um

prio

nit

is10

.6�

2.9

10.7�

311

.3�

310

.6�

2.9

15.7�

3.7

18.3�

4.2

11.2�

2.9

10.4�

2.9

S[5

8]z

Pas

palu

m11�

3.1

11.1�

3.1

11.7�

3.2

11.1�

3.1

16.3�

3.9

19�

4.5

11.7�

3.1

14.1�

3.4

N[5

9]z

An

thae

nan

tia§

1A

xon

opu

scl

ade

16.5�

416

.6�

417

.5�

416

.6�

424

.4�

4.7

28.6�

5.3

17.4�

414

.3�

3.5

T[6

1]E

chin

olae

na

13�

3.6

13.1�

3.6

13.8�

3.6

13�

3.5

19.3�

4.4

22.5�

5.1

13.7�

3.5

U[6

6]S

tein

chis

ma

hian

s3.

6�

1.8

3.6�

1.8

3.8�

1.9

3.6�

1.8

5.3�

2.5

6.2�

2.9

3.8�

1.8

V[6

7]C

3-g

rad

ew

ith

inP

anic

eae.

X5

919

.4�

4.5

19.6�

4.6

20.6�

4.4

19.6�

4.5

28.7�

5.1

33.7�

5.8

20.6�

4.5

13.8�

3.5

W[6

8]C

3-g

rad

ew

ith

inth

eP

anic

eae.

X5

9–

Dic

han

thel

ium

foss

il16

.7�

416

.8�

417

.7�

416

.8�

424

.6�

4.7

28.8�

5.4

17.8�

418

.5�

3.7

X[6

9]M

ult

iph

oto

-pat

hw

ays

clad

ein

Pan

icea

e.X

59

16�

3.8

16.1�

3.9

17�

3.8

16.1�

3.8

23.5�

4.5

27.5�

5.2

17�

3.9

16.4�

3.6

Y[7

0]M

ult

iph

oto

-pat

hw

ays

clad

ein

Pan

icea

e.X

59

14.9�

3.6

15�

3.7

15.8�

3.6

15.1�

3.6

21.9�

4.3

25.7�

515.9�

3.7

21.2�

3.9

Z[7

1]P

anic

um

s.s.

(Pan

icu

mC

4N

AD

)12

.3�

3.1

12.5�

3.2

13.1�

3.1

12.5�

3.1

18.1�

3.9

21.2�

4.5

13.1�

3.2

a[7

2]P

anic

um

s.s.

(Pan

icu

mC

4N

AD

)6.

6�

2.1

6.7�

2.1

7�

2.1

6.7�

2.1

9.7�

2.7

11.4�

3.2

7.1�

2.2

b[7

5]U

roch

loa

(PC

Kcl

ade)

–S

etar

iafo

ssil

11.1�

2.9

11.3�

2.9

11.8�

2.9

11.3�

2.9

16.3�

3.6

19.1�

4.2

11.8�

3

c[8

2]U

roch

loa

(PC

Kcl

ade)

7.1�

27.

2�

2.1

7.6�

2.1

7.2�

2.1

10.4�

2.7

12.3�

3.2

7.6�

2.2

Ech

inoc

hloa

2.5�

1.2

2.5�

1.3

2.6�

1.3

2.5�

1.3

3.6�

1.8

4.3�

2.1

2.6�

1.3

4.4�

2.8

Eri

achn

est

em11

.5�

3.6

Eri

achn

ecr

own

6.6�

2.8

Sti

pagr

osti

sst

em15

.1�

4.6

Sti

pagr

osti

scr

ow

n7.

5�

3.1

Cen

trop

odia

stem

22.0�

4.6

Neu

rach

ne

mu

nro

i4.

4�

3.3

All

oter

opsi

sse

mia

lata

15.3�

3.5

* Bo

ldfa

ceco

lum

nin

dic

ates

ou

rb

est

esti

mat

eo

fd

ates

bas

edo

nd

ata

pre

sen

ted

her

e.w N

um

ber

sar

ein

mil

lio

ns

of

yea

rs,

plu

so

rm

inu

sst

and

ard

dev

iati

on

s.z I

nd

icat

esn

od

esin

wh

ich

tree

sin

the

pre

sen

tp

aper

dif

fer

fro

mth

etr

eein

Ch

rist

inet

al.

(200

8).

§A

nth

aen

anti

ala

nat

ais

the

curr

entl

yac

cep

ted

nam

efo

rth

eta

xo

nca

lled

Lep

toco

ryph

ium

lan

atu

mb

yC

hri

stin

etal

.(2

008)

.

C L U S T E R E D O R I G I N S O F C 4 P H O T O S Y N T H E S I S 9

r 2008 The AuthorsJournal compilation r 2008 Blackwell Publishing Ltd, Global Change Biology, doi: 10.1111/j.1365-2486.2008.01688.x

(a) PollenL55

(b) Poll.Spike.Phyt

(c) Coprolite

(d) Gondwana

(e) PollenL55

(f) Poll.Spike.Phyt

(g) Coprolite

(h) Gondwana

10 A . V I C E N T I N I et al.

r 2008 The AuthorsJournal compilation r 2008 Blackwell Publishing Ltd, Global Change Biology, doi: 10.1111/j.1365-2486.2008.01688.x

independent origins of C4 in the subfamily, at or after

the crown node. Because of the position of Aristidoi-

deae in the PACCMAD clade, this also affects the

estimate of the earliest origin of C4. The only caveat is

that the analytical method (parsimony) that Christin

et al. (2008) used for inferring independent origins can

make the timing of changes appear more precise than is

warranted by the data (Pagel, 1999), and thus it would

be of interest to re-analyze their dataset in a maximum

likelihood framework. Interestingly, the recent phylo-

genetic analysis of Bouchenak-Khelladi et al. (2008)

suggests that Aristidoideae might have originated

somewhat later than estimated by either Christin et al.

(2008) or the present study.

The coprolite fossil has not been considered pre-

viously as a possible calibration for the grass phylogeny,

beyond its initial description in 2005 (Prasad et al.,

2005), and was not included by Christin et al. (2008).

This fossil, which underlies both the Gondwana and the

Coprolite calibrations presented here, places C4 origins

in the Eocene or possibly as early as the Paleocene,

when global CO2 levels were far higher than those in

which C4 is predicted to be favored. These dates contra-

dict the expectation, based on the well-characterized

energetics of the pathway, that C4 is only favored at CO2

levels closer to 500 ppm.

We did not attempt to incorporate the putative bam-

busoid fossil Programinis burmitis, from the Early Cre-

taceous of Burma (Poinar, 2004). The identity of the

fossil is not certain, nor is the date. If the fossil is indeed

a multiflowered grass spikelet, then node C would be

Early Cretaceous, with resulting dates older than those

from the Coprolite calibration. We also tested 82 Ma as

the upper bound for node B, as suggested in Bremer

(2002), but observed little effect.

Clustered C4 origins

The clustering of changes in photosynthetic pathway is

one of the striking results of this study, and points to

periods of earth history in which the selective regime

changed to favor one pathway over the other. The work

of Christin et al. (2008) supports the identification of

clusters, in that the origins of C4 in Eriachne, Neurachne,

and Alloteropsis – not included here because of lack of

phyB sequences – all fall within the time intervals

identified for changes in other lineages.

We found significant clustering with all the

calibrations, indicating that the result was insensitive

to absolute dates. We undertook an additional set of

simulations to test whether the result was sensitive to

our use of the mean estimated date for each change,

rather than a sample within the range. For these

simulations, rather than using the mean estimated date,

we sampled a random date within the range of possible

dates for that transition event (Fig. 3a–d). We then

tested whether we still observed clusters of events. As

we expected, the signal of clustering was not strong,

given the broad range of possible dates sampled, but we

still found an excess of changes in the Mid- to Late-

Miocene in a number of the simulations. However, by

choosing dates from within the confidence intervals for

each node independently, we effectively destroyed the

correlations between dates imposed by the tree. For

example, if we randomly sampled 13.4 Ma for node N

(the mean minus one standard deviation) and 17.2 Ma

for node T (the mean plus one standard deviation), it

would violate the relative positions of those nodes in

the phylogeny, which requires that node N must be

earlier than node T; in other words, the dates cannot

vary independently of each other. We found

it remarkable that any significant clustering appeared

at all, given this extensive disruption of the signal in

the data.

The distinction between clusters of C4–C3 transitions

and those of C3–C4 suggests an effect of cycles of

climate change, because global pCO2 has remained

relatively low since the Early Miocene or earlier (Zachos

et al., 2001). Both regional temperature and seasonal

distribution of precipitation affect the distribution of C4

grasses (Hattersley, 1992; Sage et al., 1999b; Huang et al.,

2001; Osborne, 2008), and regional climates are con-

nected in a complex way to global temperatures, ocean

circulation patterns, and polar glaciation (Pagani et al.,

1999; Vlastelic et al., 2005; Meehl et al., 2007). C4 origins

in times of global cooling, such as that in the Early

Fig. 3 (a–d) Time of change events in photosynthetic pathway for four calibration schemes, Pollen55–70, Pollen.Spike.Phyt., Coprolite, and

Gondwana. The maximum age of autapomorphic events is indicated by an arrow with length equal mean � SD. Non autapomorphic

events are indicated by a box with length equal to the range between maxima and minima age estimates; the line indicating the standard

deviations. (e–h) Results of a simulation testing whether the rate of transitions from C3–C4 and from C4–C3 was significantly higher for

the observed dataset than for a model in which the rate of transition was constant, at different points in geological time (x) and for a range

of time window sizes (y). Significant departures (at Po0.05) from constant rate of transitions are indicated in red for C3–C4 transitions

and in blue for C4–C3 transitions. The color gradient indicates the number of events observed in each time window (light gray area

indicates searched space but in which the null model was not rejected). pCO2 mean levels for the Miocene indicated are from Pagani et al.

(2005), and the @18O data are from Zachos et al. (2001). Note that the axes of the two graphs for each calibration scheme run in opposite

directions.

C L U S T E R E D O R I G I N S O F C 4 P H O T O S Y N T H E S I S 11

r 2008 The AuthorsJournal compilation r 2008 Blackwell Publishing Ltd, Global Change Biology, doi: 10.1111/j.1365-2486.2008.01688.x

Oligocene, are consistent with the hypothesis that

glacial conditions correlate with increased seasonality

at low latitudes (Pagani et al., 1999). However, we

also find C4 origins in relatively warm periods such

as the MMCO (17–15 Ma; Zachos et al., 2001). This

observation fits other climatic models (Lunt et al.,

2007) correlating high global temperature with conti-

nental aridity.

The C4 clades originating in the Mid-Miocene are the

Tatianyx-Altoparadisium and the Anthaenantia-Axonopus

clades of the x 5 10 Paniceae, Alloteropsis semialata, and

the large speciose (ca. 900 species) clade of the

x 5 9 Paniceae that includes Digitaria, Panicum sensu

stricto, the clade exhibiting the PCK subtype of C4

photosynthesis and the ‘bristle clade’ that includes

foxtail and pearl millet. The origin and subsequent

diversification of these clades in the Mid-Miocene is

consistent with the recent observation by Osborne

(2008) that current abundance of Paniceae species is

correlated with moist, aseasonal conditions.

The relationship among global temperature, regional

hydrology, and seasonality are poorly understood for

the Oligocene and Early to Middle Miocene. Kurschner

et al. (2008), using stomatal indices of fossil trees,

estimate broad fluctuations of pCO2 levels and global

mean temperature within ranges that could affect the

selective value of different photosynthetic pathways,

although the range of variation they find is more

extreme than in other studies. Additional data on

paleoclimate will undoubtedly shed light on the origin

of C4 clades.

Age of major grass lineages

Estimated dates for all the nodes under the various

calibration schemes are listed in Table S2 and Fig. S3.

For many of the calibration schemes, the standard

deviations overlap, indicating that the various calibra-

tions converge on similar estimates for the age of

particular events.

Our analysis provides the most robust estimate to

date for the origin of the grass family, and for the

common ancestor of the BEP and PACCMAD clades

(Table 2). We estimate that the stem node of the grasses

occurred at 93.8 � 10.9 Ma, somewhat older than the

estimates provided by Gaut (2002) (77 Ma) and by

Bremer (2002) (82 Ma). This most likely reflects the

broader sampling in the present study as well as the

use of more fossil calibration points. We place the

PollenL55

Poll.Spike.Phyt

Coprolite

Gondwana

(a)

(b)

(c)

(d)

Fig. 4 (a–d) Timing of changes in photosynthetic pathway and

variation in levels of pCO2 (Pagani et al., 2005) and @18O (Zachos

et al., 2001). Each photosynthetic change is represented by a dot

at the mean age for the event and at the mean @18O value for the

time period in which the change has occurred (see Fig. S3A–C for

error estimates on dates). Vertical bars indicate time periods

showing significantly higher transition rates, with C3–C4 in red

and C4–C3 in blue; the width of the bar indicates the minimum

time window that includes those events and departs signifi-

cantly from a constant rate of transitions (see also Fig. 3e–h).

Climatic events indicated are from Zachos et al. (2001).

12 A . V I C E N T I N I et al.

r 2008 The AuthorsJournal compilation r 2008 Blackwell Publishing Ltd, Global Change Biology, doi: 10.1111/j.1365-2486.2008.01688.x

common ancestor of the BEP and PACCMAD clades at

51.6 � 7.9 Ma, approximately the same as the dates

suggested by Gaut (2002) (50 Ma) and by Bremer

(2002) (55 Ma). This ancestor is also the common

ancestor of maize and rice, so serves to calibrate

divergence of those two genomes.

In contrast to all the other fossil calibrations, the

Gondwana and Coprolite calibrations suggest that the

stem node of the grass family occurred around

111–120 Ma, appreciably earlier than the oldest putative

grass pollen fossil (�70 Ma). These dates are compar-

able with those postulated for the base of the monocots

(Janssen & Bremer, 2004). Thus, if either of these two

calibrations is correct, then most other molecular clock

dates for monocots need to be revised. Alternatively, the

coprolite fossil itself may need to be re-evaluated.

Conclusions

In summary, we have shown that photosynthetic

pathway is labile in evolutionary time. The earliest

origin of C4 photosynthesis likely occurred in the

Oligocene at 31.7 � 5.9 Ma. An additional eight origins

occurred subsequently, with several clustered in the

Mid-Miocene and leading to some of the most speciose

clades of Paniceae. The only recent origin, at 3.8 Ma, is

of a C3/C4 intermediate, indicating that C4 has not

arisen recently despite low pCO2 levels. Reversals to

C3 were clustered at a time distinct from the C4 origins.

The present study was made possible by recent

improvements in methods for dating phylogenetic

trees, and in new data on past climates. Nonetheless,

the error bars on the phylogenetic dates remain large, in

many cases being plus or minus ca. 25%. Correlations

with paleoclimate are tantalizing but imprecise. Here,

we were able to overlay photosynthetic pathway

changes on graphs of pCO2 and @18O through time. It

would be of considerable interest to be able to include

estimates of precipitation and/or seasonality as well if

those paleoclimatic data become available. Finally,

dating of phylogenetic trees is only as good as the

available calibration points, generally fossils. Any

additions to the fossil record will permit greater

precision in estimates of dates.

Acknowledgements

This work was supported by National Science Foundationgrants 9815392 and 0108501 (E. A. K.). We thank editor RowanSage for his helpful comments that have greatly improved themanuscript.

References

Aliscioni SS, Giussani LM, Zuloaga FO, Kellogg EA (2003) A

molecular phylogeny of Panicum (Poaceae: Paniceae): test

of monophyly and phylogenetic placement within the

Panicoideae. American Journal of Botany, 90, 796–821.

Berner RA (2004) Atmospheric carbon dioxide over Phanerozoic

time. In: The Phanerozoic Carbon Cycle: CO2 and O2 (ed. Berner

RA), pp. 72–99. Oxford University Press, Oxford, UK.

Bouchenak-Khelladi Y, Salamin N, Savolainen V, Forest F,

vanderBank M, Chase MW, Hodkinson TR (2008) Large

multi-gene phylogenetic trees of the grasses (Poaceae):

progress towards complete tribal and generic taxon sampling.

Molecular Phylogenetics and Evolution, 47, 488–505.

Bremer K (2002) Gondwanan evolution of the grass alliance of

families (Poales). Evolution, 56, 1374–1387.

Cerling TE (1991) Carbon dioxide in the atmosphere: evidence

from Cenozoic and Mesozoic paleosols. American Journal of

Science, 291, 377–400.

Cerling TE, Harris JM, MacFadden BJ, Leakey MG, Quade J,

Wisenmann V, Ehleringer JR (1997) Global vegetation change

through the Miocene/Pliocene boundary. Nature, 389, 153–158.

Christin P-A, Besnard G, Samaritani E, Duvall MR, Hodkinson

TR, Savolainen V, Salamin N (2008) Oligocene CO2 decline

promoted C4 photosynthesis in grasses. Current Biology, 18,

37–43.

Clayton WD, Renvoize SA (1986) Genera Graminum. Her Ma-

jesty’s Stationery Office, London.

Crepet WL, Feldman GD (1991) The earliest remains of grasses in

the fossil record. American Journal of Botany, 78, 1010–1014.

Doyle JJ, Doyle JL (1987) A rapid DNA isolation procedure for

small quantities of fresh leaf tissue. Phytochemical Bulletin, 19,

11–15.

Duvall MR, Saar DE, Grayburn WS, Holbrook GP (2003) Com-

plex transitions between C3 and C4 photosynthesis during the

evolution of Paniceae: a phylogenetic case study emphasizing

the position of Steinchisma hians (Poaceae), a C3–C4 intermedi-

ate. International Journal of Plant Sciences, 164, 949–958.

Ehleringer JR, Cerling TE, Helliker BR (1997) C4 photosynthesis,

atmospheric CO2, and climate. Oecologia, 112, 285–299.

Ehleringer JR, Monson RK (1993) Evolutionary and ecological

aspects of photosynthetic pathway variation. Annual Review of

Ecology and Systematics, 24, 411–439.

Ehleringer JR, Sage RF, Flanagan LB, Pearcy RW (1991) Climate

change and the evolution of C4 photosynthesis. Trends in

Ecology and Evolution, 6, 95–99.

Ekart DD, Cerling TE, Montanez IP, Tabor NJ (1999) A 400 year

carbon isotope record of pedogenic carbonate: implications for

paleoatmospheric carbon dioxide. American Journal of Science,

299, 805–827.

Elias MK (1942) Tertiary prairie grasses and other herbs from the

High Plains. Geological Society of America Special Paper (Regular

Studies), 41, 1–176.

Fox DL, Koch PL (2003) Tertiary history of C4 biomass in the

Great Plains, USA. Geology, 31, 809–812.

Gaut BS (2002) Evolutionary dynamics of grass genomes. New

Phytologist, 154, 15–28.

Giussani LM, Cota-Sanchez JH, Zuloaga FO, Kellogg EA (2001)

A molecular phylogeny of the grass subfamily Panicoideae

C L U S T E R E D O R I G I N S O F C 4 P H O T O S Y N T H E S I S 13

r 2008 The AuthorsJournal compilation r 2008 Blackwell Publishing Ltd, Global Change Biology, doi: 10.1111/j.1365-2486.2008.01688.x

(Poaceae) shows multiple origins of C4 photosynthesis.

American Journal of Botany, 88, 1993–2012.

Grass Phylogeny Working Group (GPWG) (2001) Phylogeny and

subfamlial classification of the Poaceae. Annals of the Missouri

Botanical Garden, 88, 373–457.

Hattersley PW (1992) C4 photosynthetic pathway variation in

grasses (Poaceae): its significance for arid and semi-arid lands.

In: Desertified Grasslands: Their Biology and Management

(ed. Chapman G), pp. 181–212. Academic Press, London.

Hattersley PW, Stone NE (1986) Photosythetic enzyme activities

in the C3–C4 intermediate Neurachne minor S. T. Blake

(Poaceae). Australian Journal of Plant Physiology, 13, 399–408.

Herendeen PS, Crane PR (1995) The fossil history of the

monocotyledons. In: Monocotyledons: Systematics and Evolution

(eds Rudall PJ, Cribb P, Cutler DF, Humphries CJ), pp. 1–21.

Royal Botanic Gardens, Kew, Surrey, UK.

Huang Y, Street-Perrott FA, Metcalfe SE, Brenner M, Moreland

M, Freeman KH (2001) Climate change as the dominant

control on glacial–interglacial variations in C3 and C4 plant

abundance. Science, 293, 1647–1651.

Huelsenbeck JP, Ronquist FR (2001) MrBayes: Bayesian inference

of phylogeny. Bioinformatics, 17, 754–755.

Jacobs BF, Kingston JD, Jacobs LL (1999) The origin of

grass-dominated ecosystems. Annals of the Missouri Botanical

Garden, 86, 590–643.

Janssen T, Bremer K (2004) The age of major monocot groups

inferred from 800 1 rbcL sequences. Botanical Journal of the

Linnean Society, 146, 385–398.

Kellogg EA (1999) Phylogenetic aspects of the evolution of C4

photosynthesis. In: C4 Plant Biology (eds Sage R, Monson R),

pp. 411–444. Academic Press, New York.

Kurschner WM, Kvacek Z, Dilcher DL (2008) The impact of

Miocene atmospheric carbon dioxide fluctuations on climate

and the evolution of terrestrial ecosystems. Proceedings of the

National Academy of Sciences USA, 105, 449–453.

Linder HP (1986) The evolutionary history of the Poales/Resti-

onales – a hypothesis. Kew Bulletin, 42, 297–318.

Lloyd J, Farquhar G (1994) 13C discrimination during CO2

assimilation by the terrestrial biosphere. Oecologia, 99, 201–215.

Lunt DJ, Ross I, Hopley PJ, Valdes PJ (2007) Modeling Late

Oligocene C4 grasses and climate. Palaeogeography, Palaeocli-

matology, Palaeoecology, 251, 239–253.

Maddison DR, Maddison WP (2000) MacClade 4: Analysis of

Phylogeny and Character Evolution. Sinauer Associates, Sunder-

land, Massachusetts, USA.

Mathews S, Sharrock RA (1996) The phytochrome gene family in

grasses (Poaceae): a phylogeny and evidence that grasses have

a subset of the loci found in dicot angiosperms. Molecular

Biology and Evolution, 13, 1141–1150.Mathews S, Tsai RC, Kellogg EA (2000) Phylogenetic structure in

the grass family (Poaceae): evidence from the nuclear gene

phytochrome B. American Journal of Botany, 87, 96–107.Meehl GA, Stocker TF, Collins WD et al. (2007) Global climate

projections. In: Climate Change 2007: The Physical Science Basis

(eds Solomon S, Qin D, Manning M, Chen Z, Marquis M, Avery

KB, Tignor M, Miller HL), pp. 747–845. Cambridge University

Press, Cambridge, United Kingdom and New York, NY, USA.Nambudiri EMV, Tidwell WD, Smith BN, Hebbert NP (1978) A

C4 plant from the Pliocene. Nature, 276, 816–817.

Nylander JAA (2004) MrModeltest v. 2. Program distributed by

the author. Evolutionary Biology Centre, Uppsala University.

Osborne CP (2008) Atmosphere, ecology and evolution: what

drove the Miocene expansion of C4 grasslands? Journal of

Ecology, 96, 35–45.

Pagani M, Freeman KH, Arthur MA (1999) Late Miocene atmo-

spheric CO2 concentrations and the expansion of C4 grasses.

Science, 285, 876–879.

Pagani M, Zachos JC, Freeman KH, Tipple B, Bohaty S (2005)

Marked decline in atmospheric carbon dioxide concentrations

during the Paleocene. Science, 309, 600–603.

Pagel M (1994) Detecting correlated evolution on phylogenies: a

general method for the comparative analysis of discrete char-

acters. Proceedings of the Royal Society of London, Series B.

Biological Sciences, 255, 37–45.

Pagel M (1999) Inferring the historical patterns of biological

evolution. Nature, 401, 877–884.

Paradis E, Claude J, Strimmer K (2004) APE: analyses of phyloge-

netic and evolution in R language. Bioinformatics, 20, 289–290.

Pearson PN, Palmer MR (2000) Atmospheric carbon dioxide

concentrations over the last 60 million years. Nature, 406,

695–699.

Poinar GO Jr (2004) Programinis burmitis gen. et sp. nov., and P.

laminatus sp. nov., Early Cretaceous grass-like monocots in

Burmese amber. Australian Systematic Botany, 17, 497–504.

Prasad V, Stromberg CAE, Alimohammadian H, Sahni A (2005)

Dinosaur coprolites and the early evolution of grasses and

grazers. Science, 310, 1177–1180.

Prendergast HDV, Hattersley PW (1987) Australian C4 grasses

(Poaceae): leaf blade anatomical features in relation to C4 acid

decarboxylation types. Australian Journal of Botany, 35, 355–382.

Prendergast HDV, Hattersley PW, Stone NE (1987) New struc-

tural/biochemical associations in leaf blades of C4 grasses

(Poaceae). Australian Journal of Plant Physiology, 14, 403–420.

Rabosky DL (2006) LASER: a maximum likelihood toolkit for

detecting temporal shifts in diversification rates from

molecular phylogenies. Evolutionary Bioinformatics Online, 2,

257–260.

Royer DL, Berner RA, Montanez IP, Tabor NJ, Beerling DJ (2004)

CO2 as a primary driver of Phanerozoic climate change. GSA

Today, 14, 4–10.

Rutschmann F (2005) Bayesian molecular dating using PAML/multi-

divtime. A step-by-step manual. Version 1.5. University of Zurich,

Switzerland. Available at http://www.plant.ch

Sage RF (2001) Environmental and evolutionary preconditions

for the origin and diversification of the C4 photosynthetic

syndrome. Plant Biology, 3, 202–213.

Sage RF (2004) The evolution of C4 photosynthesis. New Phytol-

ogist, 161, 341–370.Sage RF, Li M, Monson RK (1999a) The taxonomic distribution of

C4 photosynthesis. In: C4 Plant Biology (eds Sage RF, Monson

RK), pp. 551–584. Academic Press, San Diego.Sage RF, Wedlin DA, Li M (1999b) The biogeography of C4

photosynthesis: patterns and controlling factors. In: C4 Plant

Biology (eds Sage RF, Monson RK), pp. 313–373. Academic

Press, San Diego.Sambrook J, Fritsch EF, Maniatis TR (1989) Molecular Cloning: A

Laboratory Manual. Cold Spring Harbor Laboratory Press,

Plainview, New York, USA.

14 A . V I C E N T I N I et al.

r 2008 The AuthorsJournal compilation r 2008 Blackwell Publishing Ltd, Global Change Biology, doi: 10.1111/j.1365-2486.2008.01688.x

Sanchez-Ken JG, Clark LG, Kellogg EA, Kay EE (2007) Reinstate-

ment and emendation of subfamily Micrairoideae. Systematic

Botany, 32, 71–80.

Sheehan MJ, Farmer PR, Brutnell TP (2004) Structure and

expression of maize phytochrome family homeologs. Genetics,

167, 1395–1405.

Sinha NR, Kellogg EA (1996) Parallelism and diversity in multi-

ple origins of C4 photosynthesis in the grass family. American

Journal of Botany, 83, 1458–1470.

Stromberg CAE (2005) Decoupled taxonomic radiation and

ecological expansion of open-habitat grasses in the Cenozoic

of North America. Proceedings of the National Academy of

Sciences USA, 102, 11980–11984.

Thomasson JR (1978) Observations on the characteristics of the

lemma and palea of the late Cenozoic grass Panicum elegans.

American Journal of Botany, 65, 34–39.

Thorne JL, Kishino H (2002) Divergence time estimation and rate

evolution within multilocus data sets. Systematic Biology, 51,

689–702.

Tipple BJ, Pagani M (2007) The early origins of terrestrial C4

photosynthesis. Annual Review of Earth and Planetary Science,

35, 435–461.

Vlastelic I, Carpentier M, Lewin E (2005) Miocene climate change

recorded in the chemical and isotopic (Pb, Nd, Hf) signature of

Southern Ocean sediments. Geochemistry, Geophysics, and Geo-

systems, 6, Q03003, doi: 10.1029/2004GC000819.

Wang Y, Cerling TE, MacFadden BJ (1994) Fossil horses and

carbon isotopes: new evidence for Cenozoic dietary, habitat,

and ecosystem changes in North America. Palaeogeography,

Palaeoclimatology, Palaeoecology, 107, 269–279.

Zachos J, Pagani M, Sloan L, Thomas E, Billups K (2001) Trends,

rhythms, and aberrations in global climate 65 Ma to present.

Science, 292, 686–693.

Supporting Information

Additional Supporting Information may be found in the on-

line version of this article:

Fig. S1: Consensus tree from independent Bayesian analyses

of the combined ndhf and phyB datasets with reconstruction

of the pattern of photosynthesis type evolution. Numbers at

nodes are posterior probabilities. This tree contains 15 more

species than shown in Fig. 1; four are C3 species of the BEP

clade, and the remainder are C4 species in Andropogoneae

or the multi-photosynthetic pathways clade. Their inclu-

sion/exclusion has no effect on optimization of C4 origins

on the tree.

Fig. S2: Chronograms for the four calibration schemes, Pollen,

Spikelet, Phytolith and Coprolite, with the respective re-

construction of photosynthetic pathway evolution. A max-

imum parsimony (MP) reconstruction is shown for the

Pollen chronogram, and the maximum likelihood (ML)

reconstructions are shown for the other three chronograms

( see Fig.1 for the ML reconstruction for Pollen. Spike.

Phyt.). Note that the pattern of photosynthesis type evolu-

tion is basically the same in all schemes and the MP and ML

reconstructions indicate transitions at the same nodes.

Fig. S3: The same tree as Fig. 1, but with nodes numbered to

correspond to estimated dates presented in Table S2.

Table S1: Taxa and GenBank numbers.

Table S2: Dates for all nodes in the grass phylogeny.

Please note: Wiley-Blackwell are not responsible for the con-

tent or functionality of any supporting materials supplied

by the authors. Any queries (other than missing material)

should be directed to the corresponding author for the

article.

C L U S T E R E D O R I G I N S O F C 4 P H O T O S Y N T H E S I S 15

r 2008 The AuthorsJournal compilation r 2008 Blackwell Publishing Ltd, Global Change Biology, doi: 10.1111/j.1365-2486.2008.01688.x