Diversity in coffee assessed with SSR markers: structure of the genus Coffea and perspectives for breeding
Philippe Cubry, Pascal Musoli, Hyacinte Legnate, David Pot, Fabien de Bellis, Valerie Poncet, Francois Anthony, Magali Dufour, and Thierry Leroy
Abstract: The present study shows transferability of microsatellite markers developed in the two cultivated coffee species (Coffea arabica L. and C. canephora Pierre ex Froehn.) to 15 species representing the previously identified main groups of the genus Coffea. Evaluation of the genetic diversity and available resources within Coffea and development of molecular markers transferable across species are important steps for breeding of the two cultivated species. We worked on 15 species with 60 microsatellite markers developed using different strategies (SSR-enriched libraries, BAC libraries, gene sequences). We focused our analysis on 4 species used for commercial or breeding purposes. Our results establish the high transferability of microsatellite markers within Coffea. We show the large amount of diversity available within wild species for breeding applications. Finally, we discuss the consequences for future comparative mapping studies and breeding of the two cultivated species.
Key words: SSR markers, microsatellites, Coffea, transferability, cross-amplification, genetic diversity.
Résumé: The present study demonstrates the transferability of microsatellite markers developed in the two cultivated coffee species (Coffea arabica L. and C. canephora Pierre ex Froehn.) to 15 species representing the main previously identified groups of the genus Coffea. Evaluating the diversity and genetic resources available within the genus Coffea and developing molecular markers transferable from one species to another are important steps for the improvement of these two species. We worked on 15 species with 60 microsatellite markers developed using different methodologies (microsatellite-enriched libraries, BAC libraries, gene sequences). We analysed in particular four species of interest for commerce or breeding. Our results establish that microsatellites are highly transferable within the genus Coffea. We highlight the large reservoir of diversity for breeding that the wild species of this genus constitute. Finally, we discuss the implications for future comparative mapping studies and for the improvement of the two cultivated species.
Mots-clés: microsatellite markers, Coffea, transferability, cross-amplification, genetic diversity.
Received 5 April 2007. Accepted 10 October 2007. Published on the NRC Research Press Web site at genome.nrc.ca on 18 December 2007.
Corresponding Editor: F. Belzile.
P. Cubry,¹ D. Pot, F. de Bellis, M. Dufour, and T. Leroy. CIRAD, UMR DAP, TA A-96/03, avenue Agropolis, 34398 Montpellier CEDEX 5, France.
P. Musoli. Coffee Research Institute, P.O. Box 185, Mukono, Uganda.
H. Legnate. CNRA, BP 808, DIVO, République de Côte d’Ivoire.
V. Poncet. IRD, UMR DIA-PC, 911 avenue Agropolis, BP 64501, 34394 Montpellier CEDEX 5, France.
F. Anthony. IRD, UMR RPB, 911 avenue Agropolis, BP 64501, 34394 Montpellier CEDEX 5, France.
¹ Corresponding author (e-mail: [email protected]).
Genome 51: 50–63 (2008) doi:10.1139/G07-096 © 2007 NRC Canada
Introduction
The genus Coffea consists of 103 species (Davis and Stoffelen 2006) originating from intertropical regions of Africa and Madagascar. Only two species are cultivated: C. arabica L., which represents 65% of the world’s coffee production, and C. canephora Pierre ex Froehn. Coffee species are diploid (2n = 2x = 22) except for C. arabica, which is tetraploid (2n = 4x = 44). Coffea arabica is self-compatible, like two diploid species, C. heterocalyx Stoff. and C. anthonyi Stoff. & F.Anthony (Davis and Stoffelen 2006). Previous phylogenetic studies based on other markers such as rDNA (Lashermes et al. 1997) and cpDNA variation (Cros et al. 1998) have shown that the genus Coffea is organized into 4 groups with different geographical origins, i.e., Central and West Africa (WC clade), East Africa (E clade), Central Africa (C clade), and Madagascar (M clade).
Microsatellite markers have different properties from the other markers previously used (such as RFLPs, isozymes, and cpDNA) and give a complementary view of the diversity of the coffee genus. SSR (simple sequence repeat) or microsatellite markers are highly variable and codominant (Tautz and Renz 1984; Jarne and Lagoda 1996). Their transferability within the coffee genus has already been analysed for 6 species, C. canephora, C. eugenioides S.Moore, C. heterocalyx, C. liberica Bull ex Hiern., C. anthonyi, and C. pseudozanguebariae Bridson (Poncet et al. 2004), and compared with AFLPs (Prakash et al. 2005). SSR markers have also been used to assess genetic diversity in the two main cultivated species, C. arabica and C. canephora (Anthony et al. 2002a, 2002b; Moncada and McCouch 2004; Cubry et al. 2005; Prakash et al. 2005). The present study gives cross-amplification results for a set of microsatellite markers in a larger sample of species and individuals.
In addition to a large survey of the transferability of the markers, we performed a detailed analysis of the two cultivated species (C. arabica and C. canephora) and two related species used for both quality and productivity improvement (C. liberica and C. congensis). Coffee prices have been very low in recent years, and farmers have to produce better quality coffee to maintain their incomes. Identifying the amount of genetic diversity available for improvement is especially important for C. arabica, which has been identified as a species with a very narrow genetic base (Anthony et al. 2002a). Since the genus Coffea diverged recently from other genera (5 to 25 million years ago; Lashermes et al. 1996), most of the species are genetically closely related and many hybridizations are possible (Louarn 1992). Indeed, spontaneous and viable crosses of C. canephora × C. congensis, C. arabica × C. liberica, and C. arabica × C. canephora have been described (Cramer 1948; Prakash et al. 2002). These hybrids are widely used in breeding programs for resistance to pests and disease or for quality improvement.
In the present paper, we analyse the diversity of 15 Coffea species belonging to the 4 previously identified genetic groups using 60 microsatellite markers from different origins and covering the whole genome. We also detail the relationships among 4 species, 2 cultivated and 2 related wild ones. Finally, we discuss the consequences for breeding of C. arabica and C. canephora.
Materials and methods
Plant material
We used a total of 42 individuals from 15 Coffea species in our study (Table 1). Four species of particular interest were represented by more than 4 individuals to enable comparison of several diversity variables. These 4 species were C. canephora, C. arabica, C. congensis, and C. liberica.
For C. arabica, we studied both cultivated and wild accessions, including commercial hybrids between the two main cultivars, ‘Typica’ and ‘Bourbon’. For C. canephora and C. liberica, we analysed, respectively, genotypes from different genetic groups (B, C, SG2, and Guinean) and varieties (liberica, dewevrei) chosen to represent the greatest diversity (Louarn 1992; Anthony 1992; Montagnon 2000; Dussert et al. 2003). Coffea canephora accessions also included new material from Uganda (Musoli et al. 2006), including wild material surveyed in Itwara Forest (UW) and the cultivar ‘Nganda’ (UN). Coffea congensis was represented by accessions from different Central African regions.
Eleven other species from different geographic origins covering the whole distribution of Coffea were included to provide an overview of the global diversity, including at least 2 species of each of the previously described diversity clades (i.e., C, WC, E, and M).
Coffea canephora genotypes were kindly provided by the CNRA (Centre National de Recherche Agronomique) from the field collection in Divo (République de Côte d’Ivoire). Wild C. canephora (UW) and ‘Nganda’ (UN) genotypes from Uganda were provided by the CORI (Coffee Research Institute) of Uganda. Coffea arabica, C. congensis, C. liberica, and C. sessiliflora Bridson genotypes came from field collections in French Guiana. One individual of each of these 4 species was kindly provided by the IRD (Institut de Recherche pour le Développement) greenhouse collection in Montpellier, France. Material of 9 other species also came from the IRD collection: C. anthonyi, previously known as C. ‘sp. Moloundou’, C. bertrandii A.Chev., C. eugenioides, C. humilis A.Chev., C. millotii J.-F.Leroy, C. pseudozanguebariae, C. racemosa Lour., C. salvatrix Swynn. & Philipson, and C. stenophylla G.Don.
DNA extraction
Genomic DNA was extracted from ground leaves following an extraction method using a MATAB buffer adapted from Risterucci et al. (2000). The extracts were then purified using products from the solution-based Wizard® SV Genomic DNA Purification System (Promega, Madison, Wisconsin, USA, Cat. No. A1125).
Microsatellite markers
In this study, we used microsatellite markers obtained from different origins (Table 2). DLxxx primers were previously published and developed from a C. canephora BAC library (Leroy et al. 2005). A second set came from a microsatellite motif–enriched library of C. canephora clone 126 (Dufour et al. 2001) and from an enriched library of C. arabica ‘Caturra’ (Rovelli et al. 2000). Primers for the enriched C. arabica library came from Poncet et al. (2004) and primers for the enriched C. canephora library were designed by Poncet et al. (2007) using Primer3 software (Rozen and Skaletski 2000). SSRxxx primers were designed from sequences of sucrose synthase (SuSy) genes (Geromel et al. 2006) using Primer3 (D. Pot, unpublished data). A total of 60 loci were screened in this study and all of them, except SSRxxx loci, have been mapped on an intraspecific C. canephora genetic map (T. Leroy, unpublished data).
PCR and data acquisition
For each reaction, 2.5 ng of DNA template was mixed with 5 µL of PCR buffer (10 mmol/L Tris-HCl, 50 mmol/L KCl, 2 mmol/L MgCl2, 0.001% glycerol), 200 µmol/L dNTPs, 0.10 µmol/L of reverse primer, 0.08 µmol/L of forward primer tailed with M13 sequence, 0.10 µmol/L of fluorescently labelled M13 primer, and 0.1 U of Taq DNA polymerase. PCR amplifications were performed in an Eppendorf Mastercycler ep 384 (Eppendorf, Westbury, New York, USA). The amplification program consisted of an initial denaturation cycle of 4 min at 94 °C followed by 9 cycles of “touch-down” PCR consisting of 45 s at 94 °C, 1 min at 60 °C to 55 °C, decreasing by 0.5 °C each cycle, and 1 min 30 s at 72 °C. The next 26 cycles consisted of 94 °C for 45 s, 55 °C for 1 min, and 72 °C for 1 min 30 s prior to a final elongation step at 72 °C for 5 min.
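The cycling program described above can be written out as a per-cycle annealing schedule. The following is a minimal sketch, not the authors' thermocycler configuration: the function names are hypothetical, and it assumes the decrement is applied after each touch-down cycle starting from 60 °C. Under that assumption, 9 touch-down cycles end at 56 °C before the 26 cycles at 55 °C.

```python
def annealing_schedule(start_c=60.0, step_c=0.5, touchdown_cycles=9,
                       plateau_c=55.0, plateau_cycles=26):
    """Annealing temperature (deg C) for every cycle of the program."""
    # Assumption: cycle i of the touch-down phase anneals at start - i * step.
    touchdown = [start_c - step_c * i for i in range(touchdown_cycles)]
    return touchdown + [plateau_c] * plateau_cycles


def hold_time_seconds(n_cycles, denature_s=45, anneal_s=60, extend_s=90,
                      initial_s=4 * 60, final_s=5 * 60):
    """Sum of the programmed hold times; ramping between steps is ignored."""
    return initial_s + n_cycles * (denature_s + anneal_s + extend_s) + final_s


schedule = annealing_schedule()
print(len(schedule))                       # total number of cycles
print(schedule[:9])                        # touch-down annealing temperatures
print(hold_time_seconds(len(schedule)) / 60)  # programmed hold time in minutes
```

The hold-time figure is only a lower bound on run time, since temperature ramps depend on the instrument.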
Fluorescently labelled PCR products were analysed by electrophoresis on a 6.5% polyacrylamide gel using a LI-COR® 4300 automated sequencer (LI-COR Biosciences, Lincoln, Nebraska, USA). Gel images were retrieved and annotated with the manufacturer’s program SAGAGT. We assigned allele sizes manually to each individual on the basis of the automated analyses of SAGAGT. Previously studied individuals of C. canephora (Cubry et al. 2005) served as controls. The data matrix was exported as a text file and formatted in Excel® software for the different programs used for the analysis.
Table 1. List of plant material and providers.
Coffea species | Working name | Variety or diversity group | Collection
Species of particular interest for commercial or breeding purposes
C. arabica | Arabica_1 | ‘Caturra’ | IRD, France
C. arabica | Arabica_2 | ‘Red Catuaí 1’ | CIRAD, French Guiana
C. arabica | Arabica_3 | ‘Guinee pita 1’ | CIRAD, French Guiana
C. arabica | Arabica_4 | ‘Sidamo 1’ | CIRAD, French Guiana
C. arabica | Arabica_5 | ‘Mundo Novo’ | CIRAD, French Guiana
C. arabica | Arabica_et1 | Wild Ethiopian | CIRAD, French Guiana
C. arabica | Arabica_et2 | Wild Ethiopian | CIRAD, French Guiana
C. arabica | Arabica_et3 | Wild Ethiopian | CIRAD, French Guiana
C. canephora | Can_b1 | Congolese group B | CNRA, République de Côte d’Ivoire
C. canephora | Can_c1 | Congolese group C | CNRA, République de Côte d’Ivoire
C. canephora | Can_sg2_1 | Congolese group SG2 | CNRA, République de Côte d’Ivoire
C. canephora | Can_g1 | Guinean | CNRA, République de Côte d’Ivoire
C. canephora | Can_g2 | Guinean | CNRA, République de Côte d’Ivoire
C. canephora | Can_u1 | Uganda, ‘Nganda’ | CORI, Uganda
C. canephora | Can_u2 | Uganda, wild | CORI, Uganda
C. canephora | Can_u3 | Uganda, wild | CORI, Uganda
C. canephora | Can_g3 | Guinean | CNRA, République de Côte d’Ivoire
C. congensis | Congensis_1 | | IRD, France
C. congensis | Congensis_2 | | CIRAD, French Guiana
C. congensis | Congensis_3 | | CIRAD, French Guiana
C. congensis | Congensis_4 | | CIRAD, French Guiana
C. congensis | Congensis_5 | | CIRAD, French Guiana
C. liberica | Liberica_1 | | IRD, France
C. liberica | Liberica_2_l | liberica | CIRAD, French Guiana
C. liberica | Liberica_3_l | liberica | CIRAD, French Guiana
C. liberica | Liberica_4_l | liberica | CIRAD, French Guiana
C. liberica | Liberica_5_d | dewevrei | CIRAD, French Guiana
C. liberica | Liberica_6_d | dewevrei | CIRAD, French Guiana
C. liberica | Liberica_7_d | dewevrei | CIRAD, French Guiana
Other species included in this study
C. anthonyi | Anthonyi | | IRD, France
C. bertrandii | Bertrandii | | IRD, France
C. brevipes | Brevipes | | IRD, France
C. eugenioides | Eugenioides | | IRD, France
C. humilis | Humilis | | IRD, France
C. millotii | Milloti | | IRD, France
C. pseudozanguebariae | Pseudozanguebariae | | IRD, France
C. racemosa | Racemosa | | IRD, France
C. salvatrix | Salvatrix | | IRD, France
C. sessiliflora | Sessiliflora_1 | | IRD, France
C. sessiliflora | Sessiliflora_2 | | CIRAD, French Guiana
C. sessiliflora | Sessiliflora_3 | | CIRAD, French Guiana
C. stenophylla | Stenophylla | | IRD

Table 2. List of the 60 SSR markers used in the study.
EMBL acc. No. | Marker name | Repeat type | No. of repeats | Primer sequences (5′→3′) | Sequence origin | Primer origin | Species of origin
AJ250257 | 257 | CA | 9 | F: GACCATTACATTTCACACAC; R: GCATTTTGTTGCACACTGTA | Combes 2000 | Poncet 2004 | Coffea arabica ‘Caturra’
AM231186 | 305 | TG | 8 | F: AACTTCACTAATCTGTTGTTGCTG; R: GCACATCTATCCATCTTTTGG | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231546 | 327 | CA | 9 | F: GGCTCAAAATCACCCTTTGT; R: CTAGGATCGTGGCAGAAGAAG | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231547 | 329 | GT | 10 | F: ACTCAGACAAACCCTTCAAC; R: GATGTTTTGCATCTATTTGG | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231548 | 334 | AC | 8 | F: TATGCCTCAGCACCTATCTA; R: TACTTCCCCTGTTCCTTATG | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231549 | 341 | CA, TA | 12, 5 | F: CATTGGTGTCAAGGGTCAAG; R: AAAGTATCAGAAGGAAAAGTCTCGTAA | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231550 | 350 | GT | 8 | F: TCAAAAGAGGGCACGAA; R: ACGACAATAACTTTGCATGTCT | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231551 | 351 | GT | 13 | F: AAGGATGGCAAGTGGATTTCT; R: GCAGCTCTTGATTGTAGTTTCGT | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231552 | 355 | TG | 15 | F: CTATGATGTCTTCCAACCTTCTAAC; R: GGTCCAATTCTGTTTCAATTTC | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231553 | 356 | TG | 14 | F: TGAAGTCAACCTGAATACCAGA; R: ACGCACGCACGAATG | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231554 | 358 | CA | 11 | F: CATGCACTATTATGTTTGTGTTTT; R: TCTCGTCATATTTACAGGTAGGTT | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231555 | 360 | CA | 10 | F: ACAGTAGTATTTCATGCCACATCC; R: ACATTTGATTGCCTCTTGACC | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231556 | 364 | A | 21 | F: AGAAGAATGAAGACGAAACACA; R: TAACGCCTGCCATCG | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231557 | 367 | AC | 12 | F: TCAATCCCTGTATTCCTGTTT; R: CTAGGCACTTAAAATCTCTATAACG | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231558 | 368 | TG | 13 | F: CACATCTCCATCCATAACCATTT; R: TCCTACCTACTTGCCTGTGCT | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231559 | 371 | CA | 9 | F: AGACACACAAGGCAATAATCAAAC; R: TCTTGAGCAGCATGGGAAC | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231560 | 384 | AC | 10 | F: ACGCTATGACAAGGCAATGA; R: TGCAGTAGTTTCACCCTTTATCC | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231561 | 388 | CA | 9 | F: ATGAAACGAGAATCCATACCCTAC; R: AGAGGTAAAAGGAAAATGCTAGACC | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231562 | 392 | TC | 16 | F: AAGGTATTGGTCTGCCTTTGT; R: CTAACCCTAATCCCCAGCA | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231563 | 394 | TG | 9 | F: GCCGTCTCGTATCCCTCA; R: GAAGCCAGAAAGTCAGTCACATAG | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231564 | 395 | GT | 13 | F: CATCATTTTGTTGGCAAAG; R: TGGTTATTTCCTTCTTTGTATTG | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231565 | 429 | A | 13 | F: CATTCGATGCCAACAACCT; R: GGGTCAACGCTTCTCCTG | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231566 | 442 | CA | 19 | F: CGCAAATCTGAGTATCCCAAC; R: TGGATCAACACTGCCCTTC | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231567 | 445 | AC | 10 | F: CCACAGCTTGAATGACCAGA; R: AATTGACCAAGTAATCACCGACT | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231568 | 456 | AC | 14 | F: TGGTTGTTTTCTTCCATCAATC; R: TCCAGTTTCCCACGCTCT | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231569 | 460 | CA | 11 | F: TGCCTTCAAAATGCTCTATAACC; R: GCTGATATTCTTGGATGGAGTTG | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231570 | 461 | AC | 9 | F: CGGCTGTGACTGATGTG; R: AATTGCTAAGGGTCGAGAA | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231571 | 463 | AC | 8 | F: CATTCTTCCCACGATTCTATCTC; R: GTGACTTTCGGTTGAAATACTGG | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231572 | 471 | CT | 12 | F: TTACCTCCCGGCCAGAC; R: CAGGAGACCAAGACCTTAGCA | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231573 | 472 | CA, TA | 8, 8 | F: AATCATGGGGACAGGACAAG; R: TCTGCTAGACTTGACATCTTTTGG | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231574 | 477 | AC | 16 | F: CGAGGGTTGGGAAAAGGT; R: ACCACCTGATGTTCCATTTGT | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231575 | 495 | AC | 8 | F: CATGGATGGGAAGGCAGT; R: CTTGGAAAACTTGCTGAATGTG | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AM231576 | 501 | TG | 8 | F: CACCACCATCTAATGCACCT; R: CTGCACCAGCTAATTCAAGC | Dufour 2001 | Poncet 2007 | Coffea canephora, clone 126
AJ308753 | 753 | CA | 15 | F: GGAGACGCAGGTGGTAGAAG; R: TCGAGAAGTCTTGGGGTGTT | Rovelli 2000 | Poncet 2004 | Coffea arabica ‘Caturra’
AJ308755 | 755 | CA | 20 | F: CCCTCCCTCTTTCTCCTCTC; R: TCTGGGTTTTCTGTGTTCTCG | Rovelli 2000 | Poncet 2004 | Coffea arabica ‘Caturra’
AJ308774 | 774 | CT, CA | 5, 7 | F: GCCACAAGTTTCGTGCTTTT; R: GGGTGTCGGTGTAGGTGTATG | Rovelli 2000 | Poncet 2004 | Coffea arabica ‘Caturra’
AJ308779 | 779 | TG | 17 | F: TCCCCCATCTTTTTCTTTCC; R: GGGAGTGTTTTTGTGTTGCTT | Rovelli 2000 | Poncet 2004 | Coffea arabica ‘Caturra’
AJ308782 | 782 | GT | 15 | F: AAAGGAAAATTGTTGGCTCTGA; R: TCCACATACATTTCCCAGCA | Rovelli 2000 | Poncet 2004 | Coffea arabica ‘Caturra’
AJ308790 | 790 | GT | 21 | F: TTTTCTGGGTTTTCTGTGTTCTC; R: TAACTCTCCATTCCCGCATT | Rovelli 2000 | Poncet 2004 | Coffea arabica ‘Caturra’
AJ308809 | 809 | TGA | 11 | F: AGCAAGTGGAGCAGAAGAAG; R: CGGTGAATAAGTCGCAGTC | Rovelli 2000 | Poncet 2004 | Coffea arabica ‘Caturra’
AJ308837 | 837 | TG, GA | 16, 11 | F: CTCGCTTTCACGCTCTCTCT; R: CGGTATGTTCCTCGTTCCTC | Rovelli 2000 | Poncet 2004 | Coffea arabica ‘Caturra’
AJ308838 | 838 | AC | 9 | F: CCCGTTGCCATCCTTACTTA; R: ATACCCGATACATTTGGATACTCG | Rovelli 2000 | Poncet 2004 | Coffea arabica ‘Caturra’
AJ871882 | DL003 | CAAT | 5 | F: TAACAGAAGCACCAAAACC; R: TCTAAACCCACCTCACAAC | Leroy 2005 | Leroy 2005 | Coffea canephora, clone 126
AJ871889 | DL010 | A | 14 | F: TAGTCCCTTTTCAGTGGT; R: TTTCTTTGTTACGGAGTG | Leroy 2005 | Leroy 2005 | Coffea canephora, clone 126
AJ871890 | DL011 | GCT, CAT | 4, 8 | F: ATACATAAGCAAGCACTGA; R: CAGAACAAATGAAATGGA | Leroy 2005 | Leroy 2005 | Coffea canephora, clone 126
AJ871892 | DL013 | CA, CT | 6, 8 | F: AGAGGGATGTCAGCATAA; R: ATTTGTGTTTGGTAGATGTG | Leroy 2005 | Leroy 2005 | Coffea canephora, clone 126
AJ871899 | DL020 | T | 23 | F: TGCTCAAACTTCTTGCT; R: CGCCAACTCTAATGTGT | Leroy 2005 | Leroy 2005 | Coffea canephora, clone 126
AJ871904 | DL025 | C | 17 | F: TTGTTGAGAGTGGAGGA; R: CCAAAGACAGTGCAGTAA | Leroy 2005 | Leroy 2005 | Coffea canephora, clone 126
AJ871905 | DL026 | A | 17 | F: CGAGACGAGCATAAGAA; R: GCTGGAATGAAGAATGTAG | Leroy 2005 | Leroy 2005 | Coffea canephora, clone 126
AJ871911 | DL032 | TACG | 3 | F: TGTTGGTGAAGAAATCC; R: ATGGAGACAGGAAATAAAC | Leroy 2005 | Leroy 2005 | Coffea canephora, clone 126
AM231577 | SSR001 | T | 3 | F: CAATACGGCATGCATTTGAC; R: TGTTGAACACGCAATTGACC | Geromel 2006 | Pot 2006 | Coffea canephora, clone 126
AM231578 | SSR003 | A | 6 | F: ATTTGCGTGCTGGATGTTTT; R: ACCATGTAGGAAGGCCACAG | Geromel 2006 | Pot 2006 | Coffea canephora, clone 126
AM231579 | SSR004 | T | 9 | F: CCAACCCTAAGATGATTTTTGT; R: AACCCCTCTCAAAACCCAGT | Geromel 2006 | Pot 2006 | Coffea canephora, clone 126
AM231582 | SSR005 | GAT | 2 | F: ATGTGGTGCTGATGTGCAGT; R: GTCACGTGGGATGATGAGAA | Geromel 2006 | Pot 2006 | Coffea canephora, clone 126
AM231580 | SSR009 | GAAAA | 5 | F: CAAACAAAACAGTACAATTCAATCC; R: ATCCCTGCGAGACCTGACTA | Geromel 2006 | Pot 2006 | Coffea canephora, clone 126
AM231581 | SSR010 | ATT | 2 | F: CGAAAGGAACACAGGAACCA; R: CAGTGGTGAACTTAATCGTCCA | Geromel 2006 | Pot 2006 | Coffea canephora, clone 126
AM231583 | SSR014 | T | 14 | F: GGATCTTATCGCAATGAACCA; R: CCAACAGTGTCCTTGCTGAA | Geromel 2006 | Pot 2006 | Coffea canephora, clone 126
AM231584 | SSR015 | T | 12 | F: TTCTTCACAAGAACCAACCCTAA; R: AACCCCTCTCAAAACCCAAT | Geromel 2006 | Pot 2006 | Coffea canephora, clone 126
AM231585 | SSR016 | T | 13 | F: TGGTCAATTTGAAGCGACTG; R: CCTCCATCCTTTCCCTTACC | Geromel 2006 | Pot 2006 | Coffea canephora, clone 126
AM231586 | SSR017 | TA | 7 | F: TGTTCCTCTGGCTGTTGATG; R: CCGGTTGAATGAGGGTAAAG | Geromel 2006 | Pot 2006 | Coffea canephora, clone 126

Data analysis
A dissimilarity matrix was computed from the data file using the software DARwin 5 (Perrier et al. 2003). The dissimilarities were calculated using a simple matching distance index. Since C. arabica exhibited a maximum of 2 alleles per locus in our data, we decided to treat genotypes from this species as diploid genotypes. The dissimilarity matrix was used to infer a global diversity tree using the weighted neighbor-joining method (Saitou and Nei 1987) as implemented in DARwin. Five thousand bootstrap iterations were calculated to test the robustness of the nodes. Considering that some species were represented by more than one individual, we inferred another diversity tree with one randomly chosen individual per species. This tree allowed a better understanding of the genetic relationships between species without the interference of sampling size per species. The same inference method used for the global tree was used for this second tree.
Several genetic variables (e.g., number of alleles, gene diversity, and observed heterozygosity) were calculated using PowerMarker software (Liu and Muse 2005) for the global sample and for each of the 4 species of particular interest. We also computed the percentage of polymorphic loci by species. Ninety-five percent confidence intervals for each variable were estimated by performing 5000 bootstrap iterations across loci.
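The simple matching dissimilarity between two diploid genotypes can be illustrated with a short sketch. It assumes the common allelic form d = 1 − (1/L) Σ m_l/π, where m_l is the number of alleles shared at locus l, π is the ploidy (2 here, because C. arabica was scored as diploid), and L is the number of loci typed in both individuals. Whether DARwin treats missing data exactly this way is an assumption, and the function name and allele sizes are hypothetical.

```python
from collections import Counter


def simple_matching_dissimilarity(genotype_a, genotype_b, ploidy=2):
    """Dissimilarity between two multilocus genotypes.

    Each genotype is a list with one entry per locus: a tuple of allele
    sizes, or None when the locus did not amplify. Loci missing in either
    individual are skipped (an assumption of this sketch).
    """
    shared_fraction = 0.0
    loci_used = 0
    for alleles_a, alleles_b in zip(genotype_a, genotype_b):
        if alleles_a is None or alleles_b is None:
            continue
        # Multiset intersection counts alleles matched one-to-one.
        matches = sum((Counter(alleles_a) & Counter(alleles_b)).values())
        shared_fraction += matches / ploidy
        loci_used += 1
    if loci_used == 0:
        return None
    return 1.0 - shared_fraction / loci_used


# Hypothetical allele sizes (bp) at three loci.
ind_1 = [(100, 102), (150, 150), None]
ind_2 = [(100, 104), (150, 152), (200, 200)]
print(simple_matching_dissimilarity(ind_1, ind_2))
```

Skipping loci with missing data keeps null amplifications from inflating the distance between species, which matters here because amplification rates differ between species.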
Results
Amplifications across the genus
The availability (percentage of amplification) per marker ranged from 30.9% to 100% among the 42 analysed genotypes, with a mean of 81.5% calculated from the raw matrix of observations (see Table S1²). Even though 3 markers appeared to be specific to the Central Africa clade, good transferability of microsatellites across Coffea species was observed.
The percentage of amplification per individual ranged from 51.7% for one C. liberica genotype (note that the mean across all C. liberica genotypes is about 72%) to 98.3% for one C. canephora genotype. Values obtained here are close to those found by Poncet et al. (2004). For the 4 main species, amplification ranged from 72% for C. liberica to 89% for C. arabica and 90% for C. canephora. Amplification for C. congensis was intermediate (83%).
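Amplification percentages of this kind are simple row and column summaries of a presence/absence matrix. The sketch below assumes a hypothetical layout with one row per individual and one column per marker, and True where a product was scored; it is an illustration, not the authors' script.

```python
def amplification_percent(calls):
    """Percentage of successful amplifications in a list of booleans."""
    return 100.0 * sum(calls) / len(calls)


def summarize(matrix):
    """Return (per-marker, per-individual) amplification percentages.

    matrix: list of rows, one row per individual, one column per marker.
    """
    per_individual = [amplification_percent(row) for row in matrix]
    per_marker = [amplification_percent(col) for col in zip(*matrix)]
    return per_marker, per_individual


# Toy matrix: 2 individuals x 3 markers (hypothetical data).
toy = [[True, True, False],
       [True, False, False]]
per_marker, per_individual = summarize(toy)
print(per_marker)

# With 60 markers, 31 and 59 successful loci would correspond to the
# extreme per-individual values reported in the text.
print(round(amplification_percent([True] * 31 + [False] * 29), 1))
print(round(amplification_percent([True] * 59 + [False] * 1), 1))
```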
Genus diversity analysis
Figure 1 presents the neighbor-joining tree for the 42 individuals of the study based on 60 microsatellite loci. Bootstrap values greater than 40 are shown; this threshold was arbitrarily chosen for the readability of the figure. Ten diversity groups were discriminated by the analysis. The 4 genetic groups WC, C, E, and M, previously described by Lashermes et al. (1997), are indicated on this figure.
Groups C, E, and M were discriminated by our study, whereas species of the WC clade were classified in 7 different groups. Coffea arabica and C. congensis each formed a distinct group, while C. canephora and C. liberica were each represented by two groups. These two groups correspond to different geographical origins (Central and West Africa), as previously described by Berthaud (1986). For C. liberica, these two groups appear to be the varieties C. liberica var. liberica and C. liberica var. dewevrei. For C. canephora, the two groups correspond to the Guinean (G) clade and the Congolese clade, including the B and SG2 diversity groups. We observed strong relationships between B, SG2, and related Ugandan accessions (UW, UN), as previously described (Musoli et al. 2006). Coffea brevipes can be grouped with the Central African (Congolese) clade of C. canephora, while C. humilis and C. stenophylla appear to be grouped.
Within C. arabica, wild and cultivated materials were differentiated, as expected from previous studies based on a small number of SSR markers (Anthony et al. 2002a, 2002b). The cultivated varieties represent a narrow genetic base, since the dissimilarities between those genotypes are the smallest in the dendrogram.
The second tree, considering only one individual per spe-cies, allows us to describe 5 different groups for oursampled species. Groups M, C, and E are still discriminated,while species from West Africa (WC clade) are separatedinto two groups: C. arabica, C. canephora, and the relatedspecies C. congensis and C. brevipes form one group, whileC. liberica, C. humilis, and C. stenophylla form anothergroup. Bootstrap values supporting these groups are quitehigh for microsatellite markers.
The global diversity is high, with a mean gene diversity of 0.72 ± 0.03 and a mean allele number of 10.8 (see Table 3 for details). The number of alleles varies from 1 to 22 according to the locus considered. Of the total number of alleles (648), 304 (47%) are specific to one species. A complete table of private alleles is given as supplementary material (Table S2²). The percentage of the total number of private alleles for each species ranges from 0% for C. anthonyi to 31.25% for C. canephora, with a mean of 6.45% (see Table S3²). These results show the great amount of interspecific diversity within the genus, even if some species are represented by only one individual.
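Private alleles, that is, alleles observed in a single species, can be identified by recording which species carry each allele. The following is a minimal sketch with hypothetical species labels and allele sizes; the last two lines only re-derive the summary figures quoted above from the reported totals (648 alleles, 60 loci, 304 private alleles).

```python
def private_alleles(alleles_by_species):
    """Map each species to the set of alleles found in that species only.

    alleles_by_species: dict of species name -> set of (locus, size) tuples.
    """
    carriers = {}
    for species, alleles in alleles_by_species.items():
        for allele in alleles:
            carriers.setdefault(allele, set()).add(species)
    private = {species: set() for species in alleles_by_species}
    for allele, species_set in carriers.items():
        if len(species_set) == 1:
            (only_species,) = species_set
            private[only_species].add(allele)
    return private


# Hypothetical observations at one locus.
observed = {
    "sp_A": {("loc1", 100), ("loc1", 102)},
    "sp_B": {("loc1", 102), ("loc1", 110)},
    "sp_C": {("loc1", 102)},
}
result = private_alleles(observed)
print(result["sp_A"], result["sp_B"], result["sp_C"])

print(round(648 / 60, 1))       # mean number of alleles per locus
print(round(100 * 304 / 648))   # percentage of species-specific alleles
```

A species sampled with a single individual can contribute at most two alleles per locus, so private-allele counts for such species are best read as lower bounds.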
Considering the global sample, 59 markers are polymor-phic. Only one, SSR016, which derived from a genic se-quence, exhibited no polymorphism. At the intraspecificlevel, 91.7%, 75%, 76.7%, and 65% of the markers are poly-morphic in C. canephora, C. congensis, C. liberica, andC. arabica, respectively. For the other species, polymorphisminformation should not be taken into consideration becauseonly one or a small number of individuals are available.
² Supplementary data for this article are available on the journal Web site (http://genome.nrc.ca) or may be purchased from the Depository of Unpublished Data, Document Delivery, CISTI, National Research Council Canada, Building M-55, 1200 Montreal Road, Ottawa, ON K1A 0R6, Canada. DUD 5250. For more information on obtaining material refer to http://cisti-icist.nrc-cnrc.gc.ca/irm/unpub_e.shtml.
Diversity analysis of several species
Four species of particular interest because of their economic importance or breeding potential were analysed in more detail in our study. This subsample of 4 species contributed an important part of the global sample diversity, with a mean number of alleles of 8. On the species diversity diagram (Fig. 2) they appear to be in 2 related clades. Table 3 presents the results for allele number, gene diversity, and observed heterozygosity for C. arabica, C. canephora, C. congensis, and C. liberica (Table S4² presents values calculated for all the species). Coffea arabica shows the lowest diversity, with a mean number of alleles of 2.10. Moreover, it is the only species that shows gene diversity less than observed heterozygosity. The global amount of diversity in these 4 species is substantial, with a mean gene diversity higher than 0.35. Coffea canephora appears to be the most diverse, with a gene diversity of 0.55 and a mean number of alleles of 5.00.
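The observation that gene diversity is lower than observed heterozygosity in C. arabica can be reproduced with a small worked example. The sketch below uses the uncorrected estimator 1 − Σ p_i² for gene diversity; PowerMarker may apply a sample-size correction, so this form is an assumption, and the allele sizes are hypothetical. When every individual carries the same two alleles, as expected for a tetraploid scored as diploid, gene diversity is 0.50 while observed heterozygosity is 1.00, the pattern seen in several C. arabica entries of Table 3.

```python
from collections import Counter


def gene_diversity(genotypes):
    """Uncorrected gene diversity, 1 - sum(p_i^2), at one locus.

    genotypes: list of (allele, allele) tuples; None marks missing data.
    """
    alleles = [a for g in genotypes if g is not None for a in g]
    total = len(alleles)
    counts = Counter(alleles)
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def observed_heterozygosity(genotypes):
    """Fraction of typed individuals carrying two different alleles."""
    typed = [g for g in genotypes if g is not None]
    return sum(1 for a, b in typed if a != b) / len(typed)


# Eight individuals, all with the same two alleles (hypothetical sizes).
fixed_het = [(120, 124)] * 8
print(gene_diversity(fixed_het), observed_heterozygosity(fixed_het))

# A diploid outcrossing sample for contrast (hypothetical sizes).
mixed = [(120, 120), (120, 124), (124, 124), (120, 124)]
print(gene_diversity(mixed), observed_heterozygosity(mixed))
```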
Discussion
Coffea diversity
The global amount of diversity within Coffea appears to be high. Considering the 4 previously described clades, we show that 3 groups can be confirmed (i.e., groups C, M, and E), while the fourth (WC) appears divided in two (Fig. 2). This division can be attributed to the use of SSRs, which have different properties from the previously used markers, and to the high number of markers used in this study compared with previous studies. Indeed, the high mutation rate of microsatellite markers helps us to better investigate structure within species and species complexes.
Moreover, microsatellites are valuable tools to assess ge-netic structure at the species level, as demonstrated by theglobal diversity diagram (Fig. 1). This figure shows the rela-tionships within 4 species of the WC clade, indicating struc-ture at the intraspecific level for C. liberica, C. canephora,and C. arabica. In contrast, C. congensis appears to be ho-mogeneous, at least for the genotypes studied.
Finally, we validated our sampling strategy, which con-sisted of analysing at least 2 species per previously knowndiversity clade for the whole genus to have an overview ofthe global genus diversity. We sampled more genotypes for4 species particularly well known and of important eco-nomic and breeding interest (Lashermes et al. 1997; An-thony 1992; Poncet et al. 2004).
Our results validate the microsatellite-based approach toquickly study Coffea species by covering the entire genome,while sequence-based studies are generally limited to smallnumbers of genomic regions.
Transferability of microsatellite markers
We have confirmed the transferability of SSR markers across the genus Coffea for a larger sample of species than previously described. SSRs are useful markers for comparative studies across genera (Casasoli 2004). Their transferability over species across a genus has been shown for several genera including Lycopersicon (Alvarez et al. 2001), Oryza (Gao et al. 2005), Vigna (Yu et al. 1999), and Coffea (Combes et al. 2000; Poncet et al. 2004). Newly developed microsatellites based on C. canephora sequences exhibit the same properties as those previously developed based on
Fig. 1. Neighbor-joining tree for the 42 individuals analyzed based on the dissimilarity matrix calculated by simple matching. Bootstrap values were calculated with 5000 repetitions; only values greater than or equal to 40 are shown.
Table 3. Summary statistics calculated for the 60 SSR markers for the global sample (all 15 species studied), the 4 species focused on, and each of the 4 species separately.

            15 species        C. arabica        C. canephora      C. congensis      C. liberica       4 species
Marker       N  GD    Ho       N  GD    Ho       N  GD    Ho       N  GD    Ho       N  GD    Ho       N  GD    Ho
DL003        6  0.61  0.37     2  0.23  0.29     2  0.40  0.14     1  0.00  0.00     3  0.37  0.50     3  0.57  0.25
DL010       12  0.77  0.36     2  0.19  0.22     5  0.57  0.11     4  0.54  0.80     3  0.56  1.00     8  0.72  0.45
DL011        7  0.62  0.17     1  0.00  0.00     4  0.61  0.22     3  0.29  0.20     3  0.47  0.14     6  0.63  0.16
DL013       12  0.86  0.30     2  0.50  1.00     3  0.45  0.00     1  0.00  0.00     3  0.51  0.00     7  0.83  0.38
DL020       13  0.86  0.43     4  0.56  0.89     6  0.69  0.50     2  0.26  0.00     5  0.64  0.43    11  0.86  0.50
DL025        7  0.78  0.34     3  0.54  1.00     3  0.53  0.11     3  0.29  0.20     3  0.51  0.00     6  0.74  0.37
DL026       13  0.81  0.11     1  0.00  0.00     5  0.68  0.00     4  0.58  0.00     6  0.51  0.43     9  0.79  0.10
DL032        7  0.73  0.27     2  0.50  1.00     2  0.40  0.00     1  0.00  0.00     4  0.50  0.40     5  0.59  0.36
SSR016       1  0.00  0.00     1  0.00  0.00     1  0.00  0.00     1  0.00  0.00     1  0.00  0.00     1  0.00  0.00
SSR014      13  0.79  0.26     1  0.00  0.00     4  0.58  0.11     5  0.51  0.40     6  0.64  0.43     9  0.77  0.19
SSR015       4  0.32  0.22     2  0.50  1.00     1  0.00  0.00     1  0.00  0.00     1  0.00  0.00     2  0.27  0.32
SSR017       9  0.72  0.09     1  0.00  0.00     1  0.00  0.00     2  0.16  0.20     6  0.68  0.14     7  0.66  0.08
SSR001       2  0.15  0.00     1  0.00  0.00     2  0.18  0.00     1  0.00  0.00     0    NA    NA     2  0.08  0.00
SSR003       3  0.25  0.00     1  0.00  0.00     2  0.41  0.00     2  0.28  0.00     1  0.00  0.00     2  0.25  0.00
SSR004       3  0.11  0.02     2  0.10  0.11     1  0.00  0.00     1  0.00  0.00     1  0.00  0.00     2  0.03  0.03
257         13  0.65  0.31     3  0.56  1.00     2  0.41  0.00     1  0.00  0.00     3  0.42  0.25     8  0.61  0.42
305          7  0.57  0.40     2  0.50  1.00     3  0.42  0.60     2  0.29  0.40     1  0.00  0.00     4  0.34  0.33
327         13  0.82  0.33     3  0.55  1.00     6  0.68  0.29     4  0.59  0.50     1  0.00  0.00     8  0.52  0.14
329         13  0.85  0.41     3  0.60  1.00     5  0.59  0.25     3  0.52  0.25     4  0.48  0.29     6  0.51  0.30
334          4  0.58  0.10     2  0.10  0.11     4  0.47  0.22     2  0.28  0.00     2  0.37  0.00     4  0.59  0.58
341          6  0.68  0.11     1  0.00  0.00     3  0.47  0.00     2  0.25  0.00     2  0.19  0.25    11  0.82  0.50
350         10  0.81  0.43     4  0.69  0.78     3  0.54  0.29     5  0.63  0.60     3  0.51  0.00    10  0.83  0.52
351         13  0.83  0.54     2  0.50  1.00     6  0.59  0.71     5  0.67  0.75     3  0.38  0.20     4  0.38  0.10
355         16  0.88  0.51     2  0.50  1.00     7  0.72  0.44     4  0.56  0.60     5  0.66  0.71     6  0.73  0.11
356         14  0.81  0.59     4  0.53  0.43     5  0.69  0.78     4  0.53  0.67     1  0.00  0.00     8  0.78  0.46
358          8  0.71  0.14     1  0.00  0.00     4  0.59  0.22     1  0.00  0.00     1  0.00  0.00     9  0.78  0.73
SSR009      10  0.66  0.39     1  0.00  0.00     4  0.62  0.56     3  0.50  0.50     6  0.69  0.75    13  0.87  0.71
SSR010       4  0.30  0.29     1  0.00  0.00     4  0.45  0.50     0    NA    NA     0    NA    NA     8  0.77  0.62
360         12  0.83  0.26     0    NA    NA     6  0.69  0.22     2  0.30  0.00     6  0.69  0.60     6  0.66  0.09
364          7  0.48  0.24     1  0.00  0.00     6  0.69  0.56     2  0.38  0.00     3  0.48  0.40    14  0.87  0.33
367         11  0.84  0.49     2  0.50  1.00     6  0.72  0.44     4  0.56  0.25     4  0.56  0.33     7  0.61  0.25
368         21  0.88  0.29     1  0.00  0.00    10  0.80  0.33     5  0.61  0.25     4  0.52  0.25    10  0.81  0.59
371         11  0.78  0.54     2  0.50  1.00     5  0.55  0.38     5  0.70  0.80     5  0.59  0.43    13  0.82  0.20
384          9  0.83  0.21     2  0.10  0.11     4  0.60  0.11     2  0.16  0.20     5  0.56  0.43    10  0.76  0.67
388         18  0.87  0.31     3  0.44  0.00     8  0.78  0.78     2  0.16  0.20     3  0.52  1.00     8  0.82  0.23
392         15  0.83  0.36     1  0.00  0.00     5  0.62  0.13     4  0.58  0.60     8  0.80  0.86    11  0.82  0.38
394         14  0.74  0.32     1  0.00  0.00     5  0.43  0.44     4  0.56  0.40     8  0.77  0.67    13  0.80  0.37
395         16  0.87  0.29     3  0.21  0.13     9  0.78  0.57     2  0.26  0.00     4  0.48  0.20    10  0.64  0.37
429         20  0.86  0.27     1  0.00  0.00     8  0.79  0.56     3  0.47  0.00     5  0.61  0.20    15  0.87  0.27
442          7  0.59  0.18     1  0.00  0.00     6  0.69  0.29     2  0.23  0.33     0    NA    NA    13  0.81  0.22
445          9  0.79  0.42     2  0.50  1.00     3  0.49  0.33     3  0.36  0.50     2  0.37  0.00     8  0.66  0.17
Table 3 (concluded).

            15 species        C. arabica        C. canephora      C. congensis      C. liberica       4 species
Marker       N  GD    Ho       N  GD    Ho       N  GD    Ho       N  GD    Ho       N  GD    Ho       N  GD    Ho
456         11  0.70  0.22     1  0.00  0.00    11  0.81  0.44     0    NA    NA     0    NA    NA     5  0.74  0.44
460         22  0.88  0.54     2  0.50  1.00     5  0.60  0.13     8  0.74  0.80     7  0.73  0.80    11  0.74  0.28
461         13  0.86  0.41     4  0.47  0.56     7  0.74  0.33     3  0.38  0.20     5  0.64  0.57    18  0.85  0.68
463          7  0.73  0.59     2  0.50  1.00     5  0.71  0.67     2  0.19  0.25     3  0.50  0.40    14  0.89  0.45
471         11  0.81  0.26     1  0.00  0.00     5  0.65  0.29     4  0.56  0.25     5  0.62  0.67     5  0.65  0.65
472         15  0.88  0.45     6  0.69  1.00     6  0.72  0.25     4  0.52  0.25     3  0.38  0.33     9  0.77  0.32
477         16  0.87  0.38     2  0.50  1.00     5  0.53  0.33     2  0.26  0.00     3  0.49  0.00    10  0.88  0.54
495          9  0.75  0.07     1  0.00  0.00     6  0.69  0.33     1  0.00  0.00     1  0.00  0.00    11  0.82  0.43
SSR005      11  0.69  0.10     2  0.10  0.11     3  0.50  0.00     3  0.29  0.20     3  0.45  0.20     7  0.66  0.10
501         16  0.85  0.47     2  0.49  0.89     9  0.79  0.56     1  0.00  0.00     5  0.51  0.57    14  0.87  0.58
753         13  0.82  0.58     3  0.54  1.00     5  0.66  0.38     4  0.65  1.00     4  0.53  0.67     8  0.79  0.72
755         15  0.87  0.59     3  0.59  1.00     9  0.76  0.56     5  0.67  0.75     4  0.60  0.80    13  0.89  0.76
774          8  0.58  0.12     2  0.10  0.11     3  0.26  0.11     1  0.00  0.00     1  0.00  0.00     6  0.53  0.10
779          9  0.86  0.58     2  0.50  1.00     7  0.74  0.38     4  0.58  0.60     5  0.73  0.86     9  0.85  0.71
782          9  0.77  0.19     5  0.68  0.80     1  0.00  0.00     5  0.64  0.20     4  0.54  0.20     6  0.73  0.25
790         16  0.88  0.56     3  0.60  1.00     9  0.77  0.67     5  0.66  0.60     4  0.42  0.43    14  0.86  0.71
809          8  0.71  0.53     2  0.50  1.00     3  0.35  0.44     1  0.00  0.00     5  0.72  1.00     7  0.67  0.65
837         10  0.82  0.24     3  0.29  0.13     6  0.70  0.43     3  0.42  0.25     3  0.48  0.20     9  0.82  0.25
838         16  0.89  0.46     3  0.60  1.00     5  0.65  0.22     2  0.28  0.50     3  0.45  0.20    10  0.87  0.50
Mean*       11  0.72  0.32     2  0.30  0.49     5  0.55  0.29     3  0.34  0.27     4  0.44  0.34     8  0.69  0.37
Mean†       11  0.72  0.32     2  0.30  0.48     5  0.55  0.30     3  0.35  0.27     4  0.45  0.35     8  0.69  0.37
SD           1  0.03  0.02     0  0.03  0.06     0  0.03  0.03     0  0.03  0.04     0  0.03  0.04     0  0.03  0.03
2.5% l.b.   10  0.66  0.27     2  0.23  0.37     4  0.49  0.24     2  0.29  0.20     3  0.39  0.27     7  0.63  0.31
97.5% u.b.  12  0.76  0.36     2  0.36  0.61     5  0.60  0.35     3  0.41  0.34     4  0.51  0.43     9  0.74  0.42

Note: N, number of alleles; GD, gene diversity; Ho, observed heterozygosity; NA, not available (missing data); SD, standard deviation; 2.5% l.b. and 97.5% u.b., lower and upper boundaries of the 95% confidence interval.
*Mean values based only on markers with no missing data for the considered species.
†Mean values calculated over 5000 bootstrap iterations and based only on markers with no missing data for the considered species.
C. arabica sequences, since the mean percentages of amplification are the same. This result will support the development of comparative mapping, the use of new markers, and the transfer of knowledge from one species to another.
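One way to compare marker sets is to compute, for each set, the mean percentage of species in which its markers amplify. The sketch below does this for a small invented score table; the marker names, species labels, origin labels, and resulting percentages are placeholders and are not data from this study.

```python
from statistics import mean
from typing import Dict, List


def amplification_rate(scores: Dict[str, bool]) -> float:
    """Percentage of species in which one marker gave an amplification product."""
    return 100.0 * sum(scores.values()) / len(scores)


def mean_rate_by_origin(
    results: Dict[str, Dict[str, bool]], origin: Dict[str, str]
) -> Dict[str, float]:
    """Average the per-marker amplification rates within each marker origin."""
    by_origin: Dict[str, List[float]] = {}
    for marker, scores in results.items():
        by_origin.setdefault(origin[marker], []).append(amplification_rate(scores))
    return {source: mean(rates) for source, rates in by_origin.items()}


# Invented example: True means the marker amplified in that species.
results = {
    "mk_A1": {"sp1": True, "sp2": True, "sp3": False, "sp4": True},
    "mk_A2": {"sp1": True, "sp2": True, "sp3": True, "sp4": True},
    "mk_C1": {"sp1": True, "sp2": False, "sp3": True, "sp4": True},
    "mk_C2": {"sp1": True, "sp2": True, "sp3": True, "sp4": True},
}
origin = {"mk_A1": "arabica", "mk_A2": "arabica", "mk_C1": "canephora", "mk_C2": "canephora"}
print(mean_rate_by_origin(results, origin))
```

In this toy table both origins reach the same mean rate, which is the pattern the paragraph above describes for the real marker sets.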
SSRs located in genes involved in sucrose metabolism appear to behave in a specific way, since they exhibit either very low diversity (1–4 alleles in the global sample) or intermediate diversity (9–13 alleles). These results will allow us to use these markers to study gene regions implicated in sucrose metabolism.
In our work, using new markers, we confirm the relationship between C. anthonyi and C. eugenioides, which was previously described by Lashermes et al. (1997). These two species show high similarity based on both morphological and molecular data. However, C. anthonyi originated from Cameroon, while C. eugenioides is native to East Africa. No other coffee species belonging to the same clade (C) has been observed between these two distant geographic areas, and there is no clear explanation for the discontinuous distribution of these coffee trees (Anthony 1992).
We can use these two species to improve C. arabica varieties, considering their genetic relationships and the original self-compatible system of C. anthonyi (Anthony et al. 2006). These two species show some of the lowest concentrations of caffeine (0.6%) in the genus Coffea and exhibit high concentrations of trigonelline (1.6% for C. anthonyi, 1.3% for C. eugenioides; F. Anthony, personal communication), an alkaloid compound. These two characters have always interested coffee breeders. However, since few genotypes are held in collections worldwide, these two species have not been well characterized agronomically, and experiments are necessary to assess potential resistances to biotic and abiotic stresses that could be used for improvement.
On the other hand, part of the C. arabica genome has been shown to originate from an ancestral species genetically close to C. eugenioides or C. anthonyi (Lashermes et al. 1999). These relationships can be used to better understand the formation and functioning of the allotetraploid genome of C. arabica, in particular the behaviour of homeologous chromosomes during meiosis.
Diversity and genetic properties of cultivated and related wild species
The diversity and genetic relationships of C. arabica, C. canephora, and related species are examined in our work. Coffea arabica has been treated as a diploid species because no more than 2 alleles were detected per genotype at any locus. This is not surprising considering the allotetraploid origin and amphidiploid nature of C. arabica and its autogamy. Coffea arabica is the only species that exhibited an expected heterozygosity lower than the observed heterozygosity. This result is consistent with other studies (Lashermes et al. 1999; Aggarwal et al. 2007). It could result from heterozygosity that became fixed (Lashermes et al. 1999) during a speciation process that brought together two different ancestral genomes. Data derived from SNP analysis (Pot et al. 2006) support this hypothesis, with the construction of two haplotypes based on sequences. One is close to C. canephora and related species,
Fig. 2. Neighbor-joining tree for 15 individuals (one per species) based on the dissimilarity matrix calculated by simple matching. Bootstrap values were calculated with 5000 repetitions.
while the other is closely related to C. eugenioides. However, heterozygosity within the two ancestral genomes appears to have been lost, since only one allele from each genome remains in C. arabica. This result indicates a possible lack of recombination between the ancestral genomes, while recombination within each genome occurs normally.
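The excess of observed over expected heterozygosity in C. arabica can be illustrated with a toy calculation. The sketch below computes the number of alleles (N), gene diversity (GD), and observed heterozygosity (Ho) for one locus, assuming the uncorrected estimator GD = 1 - sum(p_i^2), which may differ from the estimator used to build Table 3. The function name and allele sizes are hypothetical. When every individual carries the same two alleles, one fixed in each ancestral genome, the locus scores N = 2, GD = 0.5, and Ho = 1.

```python
from collections import Counter
from typing import List, Optional, Tuple

Call = Optional[Tuple[int, int]]  # two allele sizes, or None for missing data


def locus_stats(genotypes: List[Call]) -> Tuple[int, Optional[float], Optional[float]]:
    """Return (N, GD, Ho) for one locus scored as diploid; GD and Ho are None if unscored."""
    scored = [g for g in genotypes if g is not None]
    if not scored:
        return 0, None, None
    counts = Counter(allele for g in scored for allele in g)
    total = sum(counts.values())
    # Uncorrected gene diversity: one minus the sum of squared allele frequencies.
    gd = 1.0 - sum((c / total) ** 2 for c in counts.values())
    # Observed heterozygosity: share of scored individuals with two different alleles.
    ho = sum(1 for g in scored if g[0] != g[1]) / len(scored)
    return len(counts), gd, ho


# Hypothetical selfing allotetraploid: allele 100 fixed in one subgenome, 104 in the other.
fixed_heterozygotes = [(100, 104)] * 9
print(locus_stats(fixed_heterozygotes))
```

Several C. arabica entries in Table 3 show exactly this combination (N = 2, GD = 0.50, Ho = 1.00), which is what fixed heterozygosity between two non-recombining genomes would be expected to produce under diploid scoring.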
We included the two varieties of C. liberica, i.e., C. liberica var. liberica and C. liberica var. dewevrei. These two varieties were genetically well differentiated in previous work (N’Diaye et al. 2005). In our study, the differentiation between these two varieties and their divergence from other species were confirmed.
Coffea congensis, which is considered an ecotype of C. canephora (Prakash et al. 2005), is differentiated from C. canephora, but both species are grouped in the same cluster in Fig. 2. Our study also points out the relatedness of C. canephora and C. brevipes. Coffea brevipes originated from Cameroon and Gabon (Chevalier 1947; Anthony 1992; Stoffelen 1998). This species has been described, like C. congensis (Sybenga 1960; Anthony 1992; Prakash et al. 2005), as an ecotype of C. canephora (Chevalier 1947; Anthony 1992; Stoffelen 1998). Our work provides evidence supporting the hypothesis that C. brevipes is a dwarf form of C. canephora, since this species appears to be related to the Central African genotypes of C. canephora (Fig. 1). Field studies should be performed to validate this point of view.
Coffea canephora is the most diverse species, with 95 private alleles, i.e., 31.25% of the total number of private alleles and 14.66% of the total number of alleles. Our results (Fig. 1) confirm the division of this species into at least two groups, i.e., a Congolese group from Central Africa and a Guinean group from West Africa. By comparison, C. liberica and C. congensis exhibit, respectively, 52 and 27 private alleles, while C. arabica presents 20 private alleles. The global amount of diversity for C. canephora, C. congensis, and C. liberica is very high compared with that for C. arabica, which has the lowest diversity, even though wild individuals of this species are more diverse than cultivated ones. These results are in accordance with previous studies (Anthony et al. 2002a; Moncada and McCouch 2004) and corroborate the very narrow genetic base of C. arabica, suggesting a small number of founders for this species.
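A private allele is one found in a single species of the sample. The sketch below shows one way such a count could be obtained from per-species allele sets; the species labels, marker names, and allele sizes are invented for illustration, and the function is not part of any package cited here.

```python
from typing import Dict, Set, Tuple

Allele = Tuple[str, int]  # (marker name, allele size)


def private_alleles(alleles: Dict[str, Set[Allele]]) -> Dict[str, Set[Allele]]:
    """For each species, keep the alleles that no other species in the sample carries."""
    private: Dict[str, Set[Allele]] = {}
    for species, own in alleles.items():
        others: Set[Allele] = set()
        for other_species, other_alleles in alleles.items():
            if other_species != species:
                others |= other_alleles
        private[species] = own - others
    return private


# Invented example with two markers scored in three species.
sample = {
    "species_A": {("M1", 150), ("M1", 152), ("M2", 200)},
    "species_B": {("M1", 150), ("M2", 200), ("M2", 204)},
    "species_C": {("M1", 150), ("M2", 206), ("M2", 208)},
}
counts = {sp: len(found) for sp, found in private_alleles(sample).items()}
print(counts)
```

Read at face value, the percentages quoted above imply roughly 304 private alleles overall (95/0.3125) and about 648 alleles in total (95/0.1466); these totals are back-calculated here and are not figures stated in the text.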
Conclusion and consequences for breeding
Our work shows the transferability of SSR markers across the genus Coffea. We point out the potential usefulness of related wild species in breeding strategies for C. arabica and C. canephora to provide new variability. These results increase the importance of genus diversity studies. Our results, as well as previous analyses using ITS and RFLP markers (Lashermes et al. 1997, 1999), lead us to consider that a high potential for breeding has not yet been exploited using species of these two clades.
We propose working along two axes. First, since C. liberica, C. congensis, and the cultivated species are all grouped in related clades, the potential for crosses between these species is high and the resulting hybrids would be expected to have a substantial level of fertility (Louarn 1992). Variability observed within these species can be used for improvement of beverage and bean quality, productivity, and resistance to biotic and abiotic stresses in the cultivated species. Second, the breeding potential of species from other diversity groups is important to assess, since interesting characters have been described. For example, C. racemosa (E clade according to Cros et al. 1998) has been used for coffee leaf miner resistance (Guerreiro et al. 1999; Mondego et al. 2005) and C. anthonyi (C clade) could be used for self-compatibility.
Breeding C. arabica will have to take into account its allopolyploid origin. Considering the low rate of recombination between the two ancestral genomes, the introduction of recessive alleles coding for traits of interest will be difficult.
Comparative genetic mapping and association mapping will be developed for future breeding programs. Relationships between C. canephora, C. eugenioides, C. arabica, and related species will be analysed to assess valuable traits for both quality and resistance improvement throughout the genus.
Acknowledgements
Technical help was provided by the Montpellier Languedoc-Roussillon Genopole genotyping platform. The authors thank the NARO-CORI (Uganda), the CNRA (République de Côte d’Ivoire), and the IRD (France) for providing plant material. P. Cubry is supported by a grant from the French Ministry of Research. The authors are grateful to J.L. Noyer for discussions and advice on an early version of the manuscript. We also thank an anonymous reviewer for comments and advice on this paper.
References
Aggarwal, R.K., Hendre, P.S., Varshney, R.K., Bhat, P.R., Krishnakumar, V., and Singh, L. 2007. Identification, characterization and utilization of EST-derived genic microsatellite markers for genome analyses of coffee and related species. Theor. Appl. Genet. 114: 359–372. PMID:17115127.
Alvarez, A.E., van de Wiel, C.C.M., Smulders, M.J.M., and Vosman, B. 2001. Use of microsatellites to evaluate genetic diversity and species relationships in the genus Lycopersicon. Theor. Appl. Genet. 103: 1283–1292. doi:10.1007/s001220100662.
Anthony, F. 1992. Les ressources génétiques des caféiers: collecte, gestion d’un conservatoire et évaluation de la diversité génétique. Collection Travaux et Documents Microfiches n° 81, ORSTOM (now IRD), Paris.
Anthony, F., Combes, C., Astorga, C., Bertrand, B., Graziosi, G., and Lashermes, P. 2002a. The origin of cultivated Coffea arabica L. varieties revealed by AFLP and SSR markers. Theor. Appl. Genet. 104: 894–900. PMID:12582651.
Anthony, F., Quiros, O., Topart, P., Bertrand, B., and Lashermes, P. 2002b. Detection by simple sequence repeat markers of introgression from Coffea canephora in Coffea arabica cultivars. Plant Breed. 121: 542–544. doi:10.1046/j.1439-0523.2002.00748.x.
Anthony, F., Noirot, M., Couturon, E., and Stoffelen, P. 2006. New coffee (Coffea L.) species from Cameroon bring original characters for breeding [CD-ROM]. In 21st International Conference on Coffee Science, Montpellier, 11–15 September 2006. Edited by ASIC. Paris, France.
Berthaud, J. 1986. Les ressources génétiques pour l’amélioration des caféiers africains diploïdes. Doctoral thesis, Université de Paris-Sud, Orsay, France.
Casasoli, M. 2004. Cartographie génétique comparée chez les fagacées. Doctoral thesis, Université de Bordeaux 1, Bordeaux, France.
Chevalier, A. 1947. Les caféiers du globe, fascicule III, Systématique des caféiers et faux caféiers. Paris.
Combes, M.C., Andrzejewski, S., Anthony, F., Bertrand, B., Rovelli, P., Graziosi, G., and Lashermes, P. 2000. Characterization of microsatellite loci in Coffea arabica and related coffee species. Mol. Ecol. 9: 1178–1180. doi:10.1046/j.1365-294x.2000.00954-5.x. PMID:10964241.
Cramer, P.J.S. 1948. Les caféiers hybrides du groupe Congusta. Bull. Agric. du Congo Belge, 34: 29–48.
Cros, J., Combes, M.C., Trouslot, P., Anthony, F., Hamon, S., Charrier, A., and Lashermes, P. 1998. Phylogenetic analysis of chloroplast DNA variation in Coffea L. Mol. Phylogenet. Evol. 9: 109–117. doi:10.1006/mpev.1997.0453. PMID:9479700.
Cubry, P., De Bellis, F., Pot, D., Musoli, P., Legnate, H., Leroy, T., and Dufour, M. 2005. Genetic diversity analyses and linkage disequilibrium evaluation in some natural and cultivated populations of Coffea canephora. In Proceedings of the 4th Plant Genomics European Meeting, Amsterdam, 20–23 September 2005.
Davis, A., and Stoffelen, P. 2006. An annotated taxonomic conspectus of the genus Coffea (Rubiaceae). Bot. J. Linn. Soc. 152: 465–512. doi:10.1111/j.1095-8339.2006.00584.x.
Dufour, M., Hamon, P., Noirot, M., Risterucci, A.M., and Leroy, T. 2001. Potential use of SSR markers for Coffea spp. genetic mapping [CD-ROM]. In 19th International Scientific Colloquium on Coffee, Trieste, 2001. Edited by ASIC. Paris, France.
Dussert, S., Lashermes, P., Anthony, F., Montagnon, C., Trouslot, P., Combes, M.-C., et al. 2003. Coffee (Coffea canephora). In Genetic diversity of cultivated tropical plants. Edited by P. Hamon, M. Seguin, X. Perrier, and J.C. Glaszmann. Science Publishers, Inc., Enfield, N.H. pp. 239–258.
Gao, L.Z., Zhang, C.H., and Jia, J.Z. 2005. Cross-species transferability of rice microsatellites in its wild relatives and the potential for conservation genetic studies. Genet. Resour. Crop Evol. 52: 931–940. doi:10.1007/s10722-003-6124-3.
Geromel, C., Ferreira, L.P., Cavalari, A.A., Pereira, L.F.P., Guerreiro, S.M.C., Vieira, L.G.E., et al. 2006. Biochemical and genomic analysis of sucrose metabolism during coffee (Coffea arabica) fruit development. J. Exp. Bot. 57: 3243–3258. doi:10.1093/jxb/erl084. PMID:16926239.
Guerreiro, O., Silvarolla, M.B., and Eskes, A.B. 1999. Expression and mode of inheritance of resistance in coffee to leaf miner Perileucoptera coffeella. Euphytica, 105: 7–15.
The International Plant Names Index. 2007. Available from http://www.ipni.org [accessed 10 December 2007].
Jarne, P., and Lagoda, P.J. 1996. Microsatellites, from molecules to populations and back. Trends Ecol. Evol. 11: 424–429. doi:10.1016/0169-5347(96)10049-5.
Lashermes, P., Cros, J., Combes, M.C., Trouslot, P., Anthony, F., Hamon, S., and Charrier, A. 1996. Inheritance and restriction fragment length polymorphism of chloroplast DNA in the genus Coffea L. Theor. Appl. Genet. 93: 626–632.
Lashermes, P., Combes, M.C., Trouslot, P., and Charrier, A. 1997. Phylogenetic relationships of coffee-tree species (Coffea L.) as inferred from ITS sequences of nuclear ribosomal DNA. Theor. Appl. Genet. 94: 947–953. doi:10.1007/s001220050500.
Lashermes, P., Combes, M.C., Robert, J., Trouslot, P., D’Hont, A., Anthony, F., and Charrier, A. 1999. Molecular characterization and origin of the Coffea arabica L. genome. Mol. Gen. Genet. 261: 259–266. PMID:10102360.
Leroy, T., Marraccini, P., Dufour, M., Montagnon, C., Lashermes, P., Sabau, X., et al. 2005. Construction and characterization of a Coffea canephora BAC library to study the organization of sucrose biosynthesis genes. Theor. Appl. Genet. 111: 1032–1041. doi:10.1007/s00122-005-0018-z. PMID:16133319.
Liu, K., and Muse, S.V. 2005. PowerMarker: integrated analysis environment for genetic marker data. Bioinformatics, 21: 2128–2129. doi:10.1093/bioinformatics/bti282. PMID:15705655.
Louarn, J. 1992. La fertilité des hybrides interspécifiques et les relations génomiques entre caféiers diploïdes d’origine africaine (genre Coffea L., sous-genre Coffea). Doctoral thesis, Université de Paris-Sud, Orsay, France.
Moncada, P., and McCouch, S. 2004. Simple sequence repeat diversity in diploid and tetraploid Coffea species. Genome, 47: 501–509. doi:10.1139/g03-129. PMID:15190367.
Mondego, J.M.C., Guerreiro-Filho, O., Bengtson, M.H., Drummond, R.D., Felix, J.M., Duarte, M.P., et al. 2005. Isolation and characterization of Coffea genes induced during coffee leaf miner (Leucoptera coffeella) infestation. Plant Sci. 69: 351–360.
Montagnon, C. 2000. Optimization des gains génétiques dans le schéma de sélection récurrente réciproque de Coffea canephora Pierre. Doctoral thesis, Ecole Nationale Supérieure Agronomique de Montpellier, Montpellier, France.
Musoli, P., Aluka, P., Cubry, P., Dufour, M., De Bellis, F., Ogwang, J., et al. 2006. Fighting against coffee wilt disease: Uganda wild canephora genetic diversity and usefulness. In 21st International Conference on Coffee Science, Montpellier, 11–15 September 2006. Edited by ASIC. Paris, France.
N’Diaye, A., Poncet, V., Louarn, J., Hamon, S., and Noirot, M. 2005. Genetic differentiation between Coffea liberica var. liberica and C. liberica var. dewevrei and comparison with C. canephora. Plant Syst. Evol. 253: 95–104. doi:10.1007/s00606-005-0300-1.
Perrier, X., Flori, A., and Bonnot, F. 2003. Data analysis methods. In Genetic diversity of cultivated tropical plants. Science Publishers, Inc., Enfield, N.H. pp. 43–76.
Poncet, V., Hamon, P., Minier, J., Carasco, C., Hamon, S., and Noirot, M. 2004. SSR cross-amplification and variation within coffee trees (Coffea spp.). Genome, 47: 1071–1081. doi:10.1139/g04-064. PMID:15644965.
Poncet, V., Dufour, M., Hamon, P., Hamon, S., de Kochko, A., and Leroy, T. 2007. Development of genomic microsatellite markers in Coffea canephora and their transferability to other coffee species. Genome, 50(12): 1156–1161. doi:10.1139/G07-073.
Pot, D., Bouchet, S., Cubry, P., Dufour, M., De Bellis, F., Jourdan, I., et al. 2006. Nucleotide diversity of genes involved in sucrose metabolism. Towards the identification of candidate genes controlling sucrose variability in Coffea spp. In 21st International Conference on Coffee Science, Montpellier, 11–15 September 2006. Edited by ASIC. Paris, France.
Prakash, N.S., Combes, M.C., Somanna, N., and Lashermes, P. 2002. AFLP analysis of introgression in coffee cultivars (Coffea arabica L.) derived from a natural interspecific hybrid. Euphytica, 124: 265–271. doi:10.1023/A:1015736220358.
Prakash, N.S., Combes, M.C., Dussert, S., Naveen, S., and Lashermes, P. 2005. Analysis of genetic diversity in Indian robusta coffee genepool (Coffea canephora) in comparison with a representative core collection using SSRs and AFLPs. Genet. Resour. Crop Evol. 52: 333–343. doi:10.1007/s10722-003-2125-5.
Risterucci, A.M., Grivet, L., N’Goran, J.A.K., Pieretti, I., Flament, M.H., and Lanaud, C. 2000. A high-density linkage map of Theobroma cacao L. Theor. Appl. Genet. 101: 948–955. doi:10.1007/s001220051566.
Rovelli, P., Mettulio, R., Anthony, F., Anzueto, F., and Lashermes, P. 2000. Microsatellites in Coffea arabica L. In Coffee biotechnology and quality. Edited by T. Sera, C.R. Soccol, A. Pandey, and S. Roussos. Kluwer Academic Publishers, the Netherlands. pp. 123–133.
Rozen, S., and Skaletsky, H.J. 2000. Primer 3. Version 0.2 [computer program]. Available from http://primer3.sourceforge.net/.
Saitou, N., and Nei, M. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4: 406–425. PMID:3447015.
Stoffelen, P. 1998. Coffea and Psilanthus (Rubiaceae) in tropical Africa: a systematic and palynological study, including a revision of the West and Central African species. Doctoral thesis, Katholieke Universiteit Leuven, Leuven, Belgium.
Sybenga, J. 1960. Genetics and cytology of coffee. A literature review. Bibliographica Genet. 19: 217–316.
Tautz, D., and Renz, M. 1984. Simple sequences are ubiquitous repetitive components of eukaryotic genomes. Nucleic Acids Res. 12: 4127–4138. doi:10.1093/nar/12.10.4127. PMID:6328411.
Yu, K., Park, S.J., and Poysa, V. 1999. Abundance and variation of microsatellite DNA sequences in beans (Phaseolus and Vigna). Genome, 42: 27–34. doi:10.1139/gen-42-1-27.