Diversity in coffee assessed with SSR markers:
structure of the genus Coffea and perspectives
for breeding
Philippe Cubry, Pascal Musoli, Hyacinte Legnaté, David Pot, Fabien de Bellis,
Valérie Poncet, François Anthony, Magali Dufour, and Thierry Leroy
Abstract: The present study shows transferability of microsatellite markers developed in the two cultivated coffee species
(Coffea arabica L. and C. canephora Pierre ex Froehn.) to 15 species representing the previously identified main groups
of the genus Coffea. Evaluation of the genetic diversity and available resources within Coffea and development of molecu-
lar markers transferable across species are important steps for breeding of the two cultivated species. We worked on 15
species with 60 microsatellite markers developed using different strategies (SSR-enriched libraries, BAC libraries, gene se-
quences). We focused our analysis on 4 species used for commercial or breeding purposes. Our results establish the high
transferability of microsatellite markers within Coffea. We show the large amount of diversity available within wild spe-
cies for breeding applications. Finally we discuss the consequences for future comparative mapping studies and breeding
of the two cultivated species.
Key words: SSR markers, microsatellites, Coffea, transferability, cross-amplification, genetic diversity.
Résumé : La présente étude montre la transférabilité de marqueurs microsatellites développés sur les deux espèces de caféiers cultivées (Coffea arabica L. et C. canephora Pierre ex Froehn.) à 15 espèces représentant les principaux groupes précédemment identifiés du genre Coffea. L’évaluation de la diversité et des ressources génétiques disponibles au sein du genre Coffea et le développement de marqueurs moléculaires transférables d’une espèce à l’autre sont des étapes importantes pour l’amélioration de ces deux espèces. Nous avons travaillé sur 15 espèces avec 60 marqueurs microsatellites développés suivant différentes méthodologies (banques enrichies en microsatellites, banques BAC, séquences de gènes). Nous avons plus particulièrement analysé quatre espèces d’intérêt en commerce ou en amélioration. Nos résultats établissent que les microsatellites sont hautement transférables dans le genre Coffea. Nous mettons en évidence l’important réservoir de diversité pour l’amélioration que constituent les espèces sauvages de ce genre. Enfin nous discutons des implications pour de futures études de cartographie comparée et l’amélioration des deux espèces cultivées.
Mots-clés : marqueurs microsatellites, Coffea, transférabilité, amplification croisée, diversité génétique.
Introduction
The genus Coffea consists of 103 species (Davis and Stoffelen 2006) originating
from intertropical regions of Africa and Madagascar. Only two species are
cultivated: C. arabica L., which represents 65% of the world’s coffee
production, and C. canephora Pierre ex Froehn. Coffee species are diploid
(2n = 2x = 22) except for C. arabica, which is tetraploid (2n = 4x = 44).
Coffea arabica is self-compatible, like two diploid species, C. heterocalyx
Stoff. and C. anthonyi Stoff. & F.Anthony (Davis and Stoffelen 2006). Previous
phylogenetic studies based on other markers such as rDNA (Lashermes et al.
1997) and cpDNA variation (Cros et al. 1998) have shown that the genus Coffea
is organized into 4 groups with different geographical origins, i.e., Central
and West Africa (WC clade), East Africa (E clade), Central Africa (C clade),
and Madagascar (M clade).
Microsatellite markers have different properties from the other markers
previously used (such as RFLPs, isozymes, and cpDNA) and give a complementary
view of diversity in the coffee genus. SSR (simple sequence repeat) or
microsatellite markers are highly variable and codominant (Tautz and Renz
1984; Jarne and Lagoda 1996). Their transferability within the coffee genus
has already been analysed for 6 species, C. canephora, C. eugenioides S.Moore,
C. heterocalyx, C. liberica Bull ex Hiern., C. anthonyi, and
C. pseudozanguebariae Bridson (Poncet et al. 2004), and compared with AFLPs
(Prakash et al. 2005). SSR markers have also been used to assess genetic
diversity in the two main cultivated species, C. arabica and C. canephora
(Anthony et al. 2002a, 2002b; Moncada and McCouch 2004; Cubry et al. 2005;
Prakash et al. 2005). The present study gives cross-amplification results for
a set of microsatellite markers in a larger sample of species and individuals.
Received 5 April 2007. Accepted 10 October 2007. Published on the NRC Research Press Web site at genome.nrc.ca on 18 December 2007.
Corresponding Editor: F. Belzile.
P. Cubry,¹ D. Pot, F. de Bellis, M. Dufour, and T. Leroy. CIRAD, UMR DAP, TA A-96/03, avenue Agropolis, 34398 Montpellier CEDEX 5, France.
P. Musoli. Coffee Research Institute, P.O. Box 185, Mukono, Uganda.
H. Legnaté. CNRA, BP 808, DIVO, République de Côte d’Ivoire.
V. Poncet. IRD, UMR DIA-PC, 911 avenue Agropolis, BP 64501, 34394 Montpellier CEDEX 5, France.
F. Anthony. IRD, UMR RPB, 911 avenue Agropolis, BP 64501, 34394 Montpellier CEDEX 5, France.
¹Corresponding author (e-mail: philippe.cubry@cirad.fr).
Genome 51: 50–63 (2008) doi:10.1139/G07-096 © 2007 NRC Canada
In addition to a large survey of the transferability of the
markers, we performed a detailed analysis of the two culti-
vated species (C. arabica and C. canephora) and two related
species used for both quality and productivity improvement
(C. liberica and C. congensis). A crisis of low prices has occurred in recent
years, and farmers have to produce better quality coffee to maintain their
incomes. Identifying the amount of genetic diversity available for
improvement is especially important for C. arabica, which has been identified
as a species with a very narrow genetic base (Anthony et al. 2002a). Since the
genus Coffea diverged recently from other genera (5 to 25 million years ago;
Lashermes et al. 1996), most of the species are genetically closely related
and many hybridizations are possible (Louarn 1992). Indeed, spontaneous and
viable crosses of C. canephora × C. congensis, C. arabica × C. liberica, and
C. arabica × C. canephora have been described (Cramer 1948; Prakash et al.
2002). These hybrids are widely used in breeding programs for resistance to
pests and disease or for quality improvement.
In the present paper, we analyse the diversity of 15 Coffea
species belonging to the 4 previously identified genetic
groups using 60 microsatellite markers from different ori-
gins and covering the whole genome. We also detail the re-
lationships among 4 species, 2 cultivated and 2 related wild
ones. Finally, we discuss the consequences for breeding of
C. arabica and C. canephora.
Materials and methods
Plant material
We used a total of 42 individuals from 15 Coffea species
in our study (Table 1). Four species of particular interest
were represented by more than 4 individuals to enable com-
parison of several diversity variables. These 4 species were
C. canephora, C. arabica, C. congensis, and C. liberica.
For C. arabica, we studied both cultivated and wild ac-
cessions, including commercial hybrids between the two
main cultivars, ‘Typica’ and ‘Bourbon’. For C. canephora
and C. liberica, we analysed, respectively, genotypes from
different genetic groups (B, C, SG2, and Guinean) and vari-
eties (liberica, dewevrei) chosen to represent the greatest di-
versity (Louarn 1992; Anthony 1992; Montagnon 2000;
Dussert et al. 2003). Coffea canephora accessions also in-
cluded new material from Uganda (Musoli et al. 2006), in-
cluding wild material surveyed in Itwara Forest (UW) and
the cultivar ‘Nganda’ (UN). Coffea congensis was repre-
sented by accessions from different Central African regions.
Eleven other species from different geographic origins covering the whole
distribution range of Coffea were included to provide an overview of the
global diversity, including at least 2 species from each of the previously
described diversity clades (i.e., C, WC, E, and M).
Coffea canephora genotypes were kindly provided by the CNRA (Centre National
de Recherche Agronomique) from the field collection in Divo (République de
Côte d’Ivoire). Wild C. canephora (UW) and ‘Nganda’ (UN) genotypes from Uganda
were kindly provided by the CORI (Coffee Research Institute) of Uganda.
Coffea arabica, C. congensis, C. liberica, and C. sessiliflora Bridson
genotypes came from field collections in French Guiana. One individual of each
of these 4 species was kindly provided by the IRD (Institut de Recherche pour
le Développement) greenhouse collection in Montpellier, France. Material of 9
other species also came from the IRD collection: C. anthonyi, previously known
as C. ‘sp. Moloundou’, C. bertrandii A.Chev., C. eugenioides, C. humilis
A.Chev., C. millotii J.-F.Leroy, C. pseudozanguebariae, C. racemosa Lour.,
C. salvatrix Swynn. & Philipson, and C. stenophylla G.Don.
DNA extraction
Genomic DNA was extracted from ground leaves following an extraction method
using a MATAB buffer adapted from Risterucci et al. (2000). The extracts were
then purified using products from the solution-based Wizard® SV Genomic DNA
Purification System (Promega, Madison, Wisconsin, USA, Cat. No. A1125).
Microsatellite markers
In this study, we used microsatellite markers obtained
from different origins (Table 2). DLxxx primers were previ-
ously published and developed from a C. canephora BAC
library (Leroy et al. 2005). A second set came from a micro-
satellite motif–enriched library of C. canephora clone 126
(Dufour et al. 2001) and from an enriched library of
C. arabica ‘Caturra’ (Rovelli et al. 2000). Primers for the
enriched C. arabica library came from Poncet et al. (2004)
and primers for the enriched C. canephora library were de-
signed by Poncet et al. (2007) using Primer3 software
(Rozen and Skaletski 2000). SSRxxx primers were designed
from sequences of sucrose synthase (SuSy) genes (Geromel
et al. 2006) using Primer3 (D. Pot, unpublished data). A to-
tal of 60 loci were screened in this study and all of them,
except SSRxxx loci, have been mapped on an intraspecific
C. canephora genetic map (T. Leroy, unpublished data).
PCR and data acquisition
For each reaction, 2.5 ng of DNA template was mixed with 5 μL of PCR buffer
(10 mmol/L Tris-HCl, 50 mmol/L KCl, 2 mmol/L MgCl₂, 0.001% glycerol),
200 μmol/L dNTPs, 0.10 μmol/L of reverse primer, 0.08 μmol/L of forward primer
tailed with M13 sequence, 0.10 μmol/L of fluorescently labelled M13 primer,
and 0.1 U of Taq DNA polymerase. PCR amplifications were performed in an
Eppendorf Mastercycler ep 384 (Eppendorf, Westbury, New York, USA). The
amplification program consisted of an initial denaturation cycle of 4 min at
94 °C followed by 9 cycles of ‘‘touch-down’’ PCR consisting of 45 s at 94 °C,
1 min at 60 °C to 55 °C, decreasing by 0.5 °C each cycle, and 1 min 30 s at
72 °C. The next 26 cycles consisted of 94 °C for 45 s, 55 °C for 1 min, and
72 °C for 1 min 30 s prior to a final elongation step at 72 °C for 5 min.
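The cycling program described above can be written out as an explicit list of steps, which makes it easy to check the cycle count and total run time. The following is a minimal sketch, not the authors' thermocycler file: the function name `touchdown_program` and its parameters are hypothetical, and it assumes that the first touch-down cycle anneals at 60 °C and that each later cycle is 0.5 °C lower (so the ninth anneals at 56 °C before the fixed 55 °C cycles).

```python
def touchdown_program(start_anneal=60.0, step=0.5, touchdown_cycles=9,
                      final_anneal=55.0, standard_cycles=26):
    """Return the PCR program as a list of (step name, temperature in C, seconds).

    Assumption: touch-down cycle i (0-based) anneals at start_anneal - i * step.
    """
    program = [("initial denaturation", 94.0, 4 * 60)]
    # Touch-down phase: annealing temperature drops by `step` each cycle.
    for cycle in range(touchdown_cycles):
        program.append(("denaturation", 94.0, 45))
        program.append(("annealing", start_anneal - step * cycle, 60))
        program.append(("extension", 72.0, 90))
    # Standard phase: fixed annealing temperature.
    for _ in range(standard_cycles):
        program.append(("denaturation", 94.0, 45))
        program.append(("annealing", final_anneal, 60))
        program.append(("extension", 72.0, 90))
    program.append(("final elongation", 72.0, 5 * 60))
    return program


def total_seconds(program):
    """Sum of hold times, ignoring ramping between temperatures."""
    return sum(seconds for _, _, seconds in program)


if __name__ == "__main__":
    steps = touchdown_program()
    anneals = [temp for name, temp, _ in steps if name == "annealing"]
    print("touch-down annealing temperatures:", anneals[:9])
    print("hold time (min):", total_seconds(steps) / 60)
```

Ramping times between temperatures are instrument dependent and are not included in the total.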
Fluorescently labelled PCR products were analysed by electrophoresis on a 6.5%
polyacrylamide gel using a LI-COR® 4300 automated sequencer (LI-COR
Biosciences, Lincoln, Nebraska, USA). Gel images were retrieved and annotated
with the manufacturer’s program SAGA GT. We assigned allele sizes manually to
each individual on the basis of the automated analyses of SAGA GT. Previously
studied individuals of C. canephora (Cubry et al. 2005) served as controls.
The data matrix was exported as a text file and formatted in Excel® for the
different programs used for the analysis.
Data analysis
A dissimilarity matrix was computed from the data file
using the software DARwin 5 (Perrier et al. 2003). The dissimilarities were
calculated using a simple matching distance index. Since C. arabica exhibited
a maximum of 2 alleles per locus in our data, we treated genotypes from this
species as diploid genotypes. The dissimilarity matrix was used to infer a
global diversity tree using the
weighted neighbor-joining method (Saitou and Nei 1987) as
Table 1. List of plant material and providers.
Coffea species Working name Variety or diversity group Collection
Species of particular interest for commercial or breeding purposes
C. arabica Arabica_1 ‘Caturra’ IRD, France
C. arabica Arabica_2 ‘Red Catuaí 1’ CIRAD, French Guiana
C. arabica Arabica_3 ‘Guinee pita 1’ CIRAD, French Guiana
C. arabica Arabica_4 ‘Sidamo 1’ CIRAD, French Guiana
C. arabica Arabica_5 ‘Mundo Novo’ CIRAD, French Guiana
C. arabica Arabica_et1 Wild Ethiopian CIRAD, French Guiana
C. arabica Arabica_et2 Wild Ethiopian CIRAD, French Guiana
C. arabica Arabica_et3 Wild Ethiopian CIRAD, French Guiana
C. canephora Can_b1 Congolese group B CNRA, République de Côte d’Ivoire
C. canephora Can_c1 Congolese group C CNRA, République de Côte d’Ivoire
C. canephora Can_sg2_1 Congolese group SG2 CNRA, République de Côte d’Ivoire
C. canephora Can_g1 Guinean CNRA, République de Côte d’Ivoire
C. canephora Can_g2 Guinean CNRA, République de Côte d’Ivoire
C. canephora Can_u1 Uganda, ‘Nganda’ CORI, Uganda
C. canephora Can_u2 Uganda, wild CORI, Uganda
C. canephora Can_u3 Uganda, wild CORI, Uganda
C. canephora Can_g3 Guinean CNRA, République de Côte d’Ivoire
C. congensis Congensis_1 IRD, France
C. congensis Congensis_2 CIRAD, French Guiana
C. congensis Congensis_3 CIRAD, French Guiana
C. congensis Congensis_4 CIRAD, French Guiana
C. congensis Congensis_5 CIRAD, French Guiana
C. liberica Liberica_1 IRD, France
C. liberica Liberica_2_l liberica CIRAD, French Guiana
C. liberica Liberica_3_l liberica CIRAD, French Guiana
C. liberica Liberica_4_l liberica CIRAD, French Guiana
C. liberica Liberica_5_d dewevrei CIRAD, French Guiana
C. liberica Liberica_6_d dewevrei CIRAD, French Guiana
C. liberica Liberica_7_d dewevrei CIRAD, French Guiana
Other species included in this study
C. anthonyi Anthonyi IRD, France
C. bertrandii Bertrandii IRD, France
C. brevipes Brevipes IRD, France
C. eugenioides Eugenioides IRD, France
C. humilis Humilis IRD, France
C. millotii Milloti IRD, France
C. pseudozanguebariae Pseudozanguebariae IRD, France
C. racemosa Racemosa IRD, France
C. salvatrix Salvatrix IRD, France
C. sessiliflora Sessiliflora_1 IRD, France
C. sessiliflora Sessiliflora_2 CIRAD, French Guiana
C. sessiliflora Sessiliflora_3 CIRAD, French Guiana
C. stenophylla Stenophylla IRD
Table 2. List of the 60 SSR markers used in the study.
EMBL acc. No.  Marker name  Repeat type  No. of repeats  Primer sequences (5′→3′)  Sequence origin  Primer origin  Species of origin
AJ250257 257 CA 9 F: GACCATTACATTTCACACAC Combes 2000 Poncet 2004 Coffea arabica ‘Caturra’
R: GCATTTTGTTGCACACTGTA
AM231186 305 TG 8 F: AACTTCACTAATCTGTTGTTGCTG Dufour 2001 Poncet 2007 Coffea canephora, clone 126
R: GCACATCTATCCATCTTTTGG
AM231546 327 CA 9 F: GGCTCAAAATCACCCTTTGT Dufour 2001 Poncet 2007 Coffea canephora , clone 126
R: CTAGGATCGTGGCAGAAGAAG
AM231547 329 GT 10 F: ACTCAGACAAACCCTTCAAC Dufour 2001 Poncet 2007 Coffea canephora, clone 126
R: GATGTTTTGCATCTATTTGG
AM231548 334 AC 8 F: TATGCCTCAGCACCTATCTA Dufour 2001 Poncet 2007 Coffea canephora, clone 126
R: TACTTCCCCTGTTCCTTATG
AM231549 341 CA, TA 12, 5 F: CATTGGTGTCAAGGGTCAAG Dufour 2001 Poncet 2007 Coffea canephora, clone 126
R: AAAGTATCAGAAGGAAAAGTCTCGTAA
AM231550 350 GT 8 F: TCAAAAGAGGGCACGAA Dufour 2001 Poncet 2007 Coffea canephora , clone 126
R: ACGACAATAACTTTGCATGTCT
AM231551 351 GT 13 F: AAGGATGGCAAGTGGATTTCT Dufour 2001 Poncet 2007 Coffea canephora , clone 126
R: GCAGCTCTTGATTGTAGTTTCGT
AM231552 355 TG 15 F: CTATGATGTCTTCCAACCTTCTAAC Dufour 2001 Poncet 2007 Coffea canephora, clone 126
R: GGTCCAATTCTGTTTCAATTTC
AM231553 356 TG 14 F: TGAAGTCAACCTGAATACCAGA Dufour 2001 Poncet 2007 Coffea canephora, clone 126
R: ACGCACGCACGAATG
AM231554 358 CA 11 F: CATGCACTATTATGTTTGTGTTTT Dufour 2001 Poncet 2007 Coffea canephora, clone 126
R: TCTCGTCATATTTACAGGTAGGTT
AM231555 360 CA 10 F: ACAGTAGTATTTCATGCCACATCC Dufour 2001 Poncet 2007 Coffea canephora, clone 126
R: ACATTTGATTGCCTCTTGACC
AM231556 364 A 21 F: AGAAGAATGAAGACGAAACACA Dufour 2001 Poncet 2007 Coffea canephora, clone 126
R: TAACGCCTGCCATCG
AM231557 367 AC 12 F: TCAATCCCTGTATTCCTGTTT Dufour 2001 Poncet 2007 Coffea canephora, clone 126
R: CTAGGCACTTAAAATCTCTATAACG
AM231558 368 TG 13 F: CACATCTCCATCCATAACCATTT Dufour 2001 Poncet 2007 Coffea canephora, clone 126
R: TCCTACCTACTTGCCTGTGCT
AM231559 371 CA 9 F: AGACACACAAGGCAATAATCAAAC Dufour 2001 Poncet 2007 Coffea canephora , clone 126
R: TCTTGAGCAGCATGGGAAC
AM231560 384 AC 10 F: ACGCTATGACAAGGCAATGA Dufour 2001 Poncet 2007 Coffea canephora, clone 126
R: TGCAGTAGTTTCACCCTTTATCC
AM231561 388 CA 9 F: ATGAAACGAGAATCCATACCCTAC Dufour 2001 Poncet 2007 Coffea canephora, clone 126
R: AGAGGTAAAAGGAAAATGCTAGACC
AM231562 392 TC 16 F: AAGGTATTGGTCTGCCTTTGT Dufour 2001 Poncet 2007 Coffea canephora, clone 126
R: CTAACCCTAATCCCCAGCA
AM231563 394 TG 9 F: GCCGTCTCGTATCCCTCA Dufour 2001 Poncet 2007 Coffea canephora , clone 126
R: GAAGCCAGAAAGTCAGTCACATAG
AM231564 395 GT 13 F: CATCATTTTGTTGGCAAAG Dufour 2001 Poncet 2007 Coffea canephora, clone 126
R: TGGTTATTTCCTTCTTTGTATTG
AM231565 429 A 13 F: CATTCGATGCCAACAACCT Dufour 2001 Poncet 2007 Coffea canephora, clone 126
R: GGGTCAACGCTTCTCCTG
AM231566 442 CA 19 F: CGCAAATCTGAGTATCCCAAC Dufour 2001 Poncet 2007 Coffea canephora, clone 126
R: TGGATCAACACTGCCCTTC
AM231567 445 AC 10 F: CCACAGCTTGAATGACCAGA Dufour 2001 Poncet 2007 Coffea canephora , clone 126
R: AATTGACCAAGTAATCACCGACT
AM231568 456 AC 14 F: TGGTTGTTTTCTTCCATCAATC Dufour 2001 Poncet 2007 Coffea canephora, clone 126
R: TCCAGTTTCCCACGCTCT
AM231569 460 CA 11 F: TGCCTTCAAAATGCTCTATAACC Dufour 2001 Poncet 2007 Coffea canephora, clone 126
R: GCTGATATTCTTGGATGGAGTTG
AM231570 461 AC 9 F: CGGCTGTGACTGATGTG Dufour 2001 Poncet 2007 Coffea canephora , clone 126
R: AATTGCTAAGGGTCGAGAA
AM231571 463 AC 8 F: CATTCTTCCCACGATTCTATCTC Dufour 2001 Poncet 2007 Coffea canephora, clone 126
R: GTGACTTTCGGTTGAAATACTGG
AM231572 471 CT 12 F: TTACCTCCCGGCCAGAC Dufour 2001 Poncet 2007 Coffea canephora, clone 126
R: CAGGAGACCAAGACCTTAGCA
AM231573 472 CA, TA 8, 8 F: AATCATGGGGACAGGACAAG Dufour 2001 Poncet 2007 Coffea canephora , clone 126
R: TCTGCTAGACTTGACATCTTTTGG
AM231574 477 AC 16 F: CGAGGGTTGGGAAAAGGT Dufour 2001 Poncet 2007 Coffea canephora , clone 126
R: ACCACCTGATGTTCCATTTGT
AM231575 495 AC 8 F: CATGGATGGGAAGGCAGT Dufour 2001 Poncet 2007 Coffea canephora , clone 126
R: CTTGGAAAACTTGCTGAATGTG
AM231576 501 TG 8 F: CACCACCATCTAATGCACCT Dufour 2001 Poncet 2007 Coffea canephora, clone 126
R: CTGCACCAGCTAATTCAAGC
AJ308753 753 CA 15 F: GGAGACGCAGGTGGTAGAAG Rovelli 2000 Poncet 2004 Coffea arabica ‘Caturra’
R: TCGAGAAGTCTTGGGGTGTT
AJ308755 755 CA 20 F: CCCTCCCTCTTTCTCCTCTC Rovelli 2000 Poncet 2004 Coffea arabica ‘Caturra’
R: TCTGGGTTTTCTGTGTTCTCG
AJ308774 774 CT, CA 5, 7 F: GCCACAAGTTTCGTGCTTTT Rovelli 2000 Poncet 2004 Coffea arabica ‘Caturra’
R: GGGTGTCGGTGTAGGTGTATG
AJ308779 779 TG 17 F: TCCCCCATCTTTTTCTTTCC Rovelli 2000 Poncet 2004 Coffea arabica ‘Caturra’
R: GGGAGTGTTTTTGTGTTGCTT
AJ308782 782 GT 15 F: AAAGGAAAATTGTTGGCTCTGA Rovelli 2000 Poncet 2004 Coffea arabica ‘Caturra’
R: TCCACATACATTTCCCAGCA
AJ308790 790 GT 21 F: TTTTCTGGGTTTTCTGTGTTCTC Rovelli 2000 Poncet 2004 Coffea arabica ‘Caturra’
R: TAACTCTCCATTCCCGCATT
AJ308809 809 TGA 11 F: AGCAAGTGGAGCAGAAGAAG Rovelli 2000 Poncet 2004 Coffea arabica ‘Caturra’
R: CGGTGAATAAGTCGCAGTC
AJ308837 837 TG, GA 16, 11 F: CTCGCTTTCACGCTCTCTCT Rovelli 2000 Poncet 2004 Coffea arabica ‘Caturra’
R: CGGTATGTTCCTCGTTCCTC
AJ308838 838 AC 9 F: CCCGTTGCCATCCTTACTTA Rovelli 2000 Poncet 2004 Coffea arabica ‘Caturra’
R: ATACCCGATACATTTGGATACTCG
AJ871882 DL003 CAAT 5 F: TAACAGAAGCACCAAAACC Leroy 2005 Leroy 2005 Coffea canephora, clone 126
R: TCTAAACCCACCTCACAAC
AJ871889 DL010 A 14 F: TAGTCCCTTTTCAGTGGT Leroy 2005 Leroy 2005 Coffea canephora, clone 126
R: TTTCTTTGTTACGGAGTG
AJ871890 DL011 GCT, CAT 4, 8 F: ATACATAAGCAAGCACTGA Leroy 2005 Leroy 2005 Coffea canephora, clone 126
R: CAGAACAAATGAAATGGA
AJ871892 DL013 CA, CT 6, 8 F: AGAGGGATGTCAGCATAA Leroy 2005 Leroy 2005 Coffea canephora, clone 126
R: ATTTGTGTTTGGTAGATGTG
AJ871899 DL020 T 23 F: TGCTCAAACTTCTTGCT Leroy 2005 Leroy 2005 Coffea canephora, clone 126
R: CGCCAACTCTAATGTGT
AJ871904 DL025 C 17 F: TTGTTGAGAGTGGAGGA Leroy 2005 Leroy 2005 Coffea canephora, clone 126
R: CCAAAGACAGTGCAGTAA
AJ871905 DL026 A 17 F: CGAGACGAGCATAAGAA Leroy 2005 Leroy 2005 Coffea canephora, clone 126
R: GCTGGAATGAAGAATGTAG
AJ871911 DL032 TACG 3 F: TGTTGGTGAAGAAATCC Leroy 2005 Leroy 2005 Coffea canephora, clone 126
R: ATGGAGACAGGAAATAAAC
AM231577 SSR001 T 3 F: CAATACGGCATGCATTTGAC Geromel 2006 Pot 2006 Coffea canephora, clone 126
R: TGTTGAACACGCAATTGACC
AM231578 SSR003 A 6 F: ATTTGCGTGCTGGATGTTTT Geromel 2006 Pot 2006 Coffea canephora, clone 126
R: ACCATGTAGGAAGGCCACAG
AM231579 SSR004 T 9 F: CCAACCCTAAGATGATTTTTGT Geromel 2006 Pot 2006 Coffea canephora, clone 126
R: AACCCCTCTCAAAACCCAGT
AM231582 SSR005 GAT 2 F: ATGTGGTGCTGATGTGCAGT Geromel 2006 Pot 2006 Coffea canephora, clone 126
R: GTCACGTGGGATGATGAGAA
AM231580 SSR009 GAAAA 5 F: CAAACAAAACAGTACAATTCAATCC Geromel 2006 Pot 2006 Coffea canephora, clone 126
R: ATCCCTGCGAGACCTGACTA
AM231581 SSR010 ATT 2 F: CGAAAGGAACACAGGAACCA Geromel 2006 Pot 2006 Coffea canephora, clone 126
R: CAGTGGTGAACTTAATCGTCCA
AM231583 SSR014 T 14 F: GGATCTTATCGCAATGAACCA Geromel 2006 Pot 2006 Coffea canephora, clone 126
R: CCAACAGTGTCCTTGCTGAA
AM231584 SSR015 T 12 F: TTCTTCACAAGAACCAACCCTAA Geromel 2006 Pot 2006 Coffea canephora, clone 126
R: AACCCCTCTCAAAACCCAAT
AM231585 SSR016 T 13 F: TGGTCAATTTGAAGCGACTG Geromel 2006 Pot 2006 Coffea canephora, clone 126
R: CCTCCATCCTTTCCCTTACC
AM231586 SSR017 TA 7 F: TGTTCCTCTGGCTGTTGATG Geromel 2006 Pot 2006 Coffea canephora, clone 126
R: CCGGTTGAATGAGGGTAAAG
implemented in DARwin. Five thousand bootstrap iterations
were calculated to test the robustness of the nodes. Consid-
ering that some species were represented by more than one
individual, we inferred another diversity tree with one ran-
domly chosen individual per species. This tree allowed a
better understanding of the genetic relationships between
species without the interference of sampling size per spe-
cies. The same inference method used for the global tree
was used for this second tree.
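To make the simple matching index concrete, the sketch below computes a dissimilarity between two multilocus genotypes. It is an illustration only and does not reproduce DARwin's code: it assumes the common allelic-data form d = 1 − (1/L) Σ (m_l / π), where m_l is the number of alleles shared at locus l, π is the ploidy (2 here, since all genotypes were treated as diploid), and L is the number of loci scored in both individuals. The function names and the choice to skip loci with missing data are assumptions.

```python
from collections import Counter


def shared_alleles(genotype_a, genotype_b):
    """Number of alleles in common, counted with multiplicity.

    (150, 150) vs (150, 152) shares one allele; (150, 152) vs (152, 150) shares two.
    """
    return sum((Counter(genotype_a) & Counter(genotype_b)).values())


def simple_matching_dissimilarity(ind_a, ind_b, ploidy=2):
    """Simple matching dissimilarity between two individuals.

    ind_a, ind_b: lists of genotypes, one per locus, in the same locus order.
    A genotype is a tuple of allele sizes, or None if amplification failed.
    Assumption: loci missing in either individual are skipped.
    """
    scores = []
    for genotype_a, genotype_b in zip(ind_a, ind_b):
        if genotype_a is None or genotype_b is None:
            continue
        scores.append(shared_alleles(genotype_a, genotype_b) / ploidy)
    if not scores:
        raise ValueError("no locus scored in both individuals")
    return 1.0 - sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical allele sizes (bp) at three loci; the third failed in individual a.
    a = [(150, 152), (200, 200), None]
    b = [(150, 154), (200, 200), (90, 92)]
    print(simple_matching_dissimilarity(a, b))
```

A full dissimilarity matrix is obtained by applying the function to every pair of individuals; the tree-building step itself (weighted neighbor-joining) is not sketched here.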
Several genetic variables (e.g., number of alleles, gene di-
versity, and observed heterozygosity) were calculated using
PowerMarker software (Liu and Muse 2005) for the global
sample and for each of the 4 species of particular interest.
We also computed the percentage of polymorphic loci by
species. Ninety-five percent confidence intervals for each
variable were estimated by performing 5000 bootstrap itera-
tions across loci.
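The per-locus variables and the bootstrap across loci can be sketched as follows. This is not PowerMarker's implementation: gene diversity is computed here in its uncorrected form, 1 − Σ p_i², whereas PowerMarker may apply a sample-size correction, and the percentile method used for the confidence interval is an assumption. All function names are hypothetical.

```python
import random
from collections import Counter


def gene_diversity(genotypes):
    """Uncorrected gene diversity 1 - sum(p_i^2) at one locus.

    genotypes: list of diploid genotypes (allele_1, allele_2), None if missing.
    Returns None when no genotype was scored.
    """
    alleles = [allele for g in genotypes if g is not None for allele in g]
    if not alleles:
        return None
    counts = Counter(alleles)
    n = len(alleles)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


def observed_heterozygosity(genotypes):
    """Fraction of scored individuals that are heterozygous at one locus."""
    scored = [g for g in genotypes if g is not None]
    if not scored:
        return None
    return sum(1 for a, b in scored if a != b) / len(scored)


def bootstrap_ci_across_loci(per_locus_values, n_boot=5000, alpha=0.05, seed=0):
    """Percentile confidence interval for the mean of a per-locus statistic.

    Loci are resampled with replacement n_boot times.
    """
    rng = random.Random(seed)
    k = len(per_locus_values)
    means = sorted(sum(rng.choices(per_locus_values, k=k)) / k
                   for _ in range(n_boot))
    lower = means[int((alpha / 2) * n_boot)]
    upper = means[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical locus where every individual carries the same two alleles.
    locus = [(150, 152), (150, 152), (150, 152)]
    print(gene_diversity(locus), observed_heterozygosity(locus))
    # Hypothetical per-locus gene diversities.
    print(bootstrap_ci_across_loci([0.2, 0.4, 0.6, 0.8]))
```

In the first usage example every individual is heterozygous for the same two alleles, which gives a gene diversity of 0.5 with an observed heterozygosity of 1; this is one way observed heterozygosity can exceed gene diversity.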
Results
Amplifications across the genus
The availability (percentage of amplification) per marker ranged from 30.9% to
100% among the 42 analysed genotypes, with a mean of 81.5% calculated from the
raw matrix of observations (see Table S1²). Although 3 markers appeared to be
specific to the Central Africa clade, good transferability of microsatellites
across Coffea species was observed.
The percentage of amplification per individual ranged from 51.7% for one
C. liberica genotype (note that the mean for all C. liberica genotypes is
about 72%) to 98.3% for one C. canephora genotype. Values obtained here are
close to those found by Poncet et al. (2004). For the 4 main species,
amplification ranged from 72% for C. liberica to 89% for C. arabica and 90%
for C. canephora. Amplification for C. congensis was intermediate (83%).
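The amplification percentages above are simple proportions of scored cells in the genotype matrix. With 60 markers, for example, 51.7% would correspond to 31 amplified markers and 98.3% to 59. A minimal sketch of the calculation follows; the matrix layout and the function name are assumptions made for illustration.

```python
def amplification_rates(matrix):
    """Percentage of successful amplifications per marker and per individual.

    matrix[i][j] is the genotype of individual i at marker j, or None when
    no amplification product was obtained.
    """
    n_individuals = len(matrix)
    n_markers = len(matrix[0])
    per_marker = [
        100.0 * sum(row[j] is not None for row in matrix) / n_individuals
        for j in range(n_markers)
    ]
    per_individual = [
        100.0 * sum(genotype is not None for genotype in row) / n_markers
        for row in matrix
    ]
    return per_marker, per_individual


if __name__ == "__main__":
    # Hypothetical data: 2 individuals scored at 3 markers.
    data = [
        [(150, 152), None, (98, 98)],
        [(150, 150), (210, 214), (98, 100)],
    ]
    markers, individuals = amplification_rates(data)
    print(markers)
    print([round(value, 1) for value in individuals])
```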
Genus diversity analysis
Figure 1 presents the neighbor-joining tree for the 42 in-
dividuals of the study based on 60 microsatellite loci. Boot-
strap values greater than 40 are shown; this threshold was
arbitrarily chosen for the readability of the figure. Ten diver-
sity groups were discriminated by the analysis. The 4 ge-
netic groups WC, C, E, and M, previously described by
Lashermes et al. (1997), are indicated on this figure.
Groups C, E, and M were discriminated by our study, whereas species of the WC
clade were classified in 7 different groups. Coffea arabica and C. congensis
each formed a single distinct group, while C. canephora and C. liberica were
each represented by two groups. These two groups correspond to different
geographical origins (Central and West Africa), as previously described by
Berthaud (1986). For
C. liberica, these two groups appear to be the varieties,
C. liberica var. liberica and C. liberica var. dewevrei. For
C. canephora the two groups correspond to the Guinean (G)
clade and the Congolese clade, including the B and SG2
diversity groups. We observed strong relationships between
B, SG2, and related Ugandan accessions (UW, UN), as pre-
viously described (Musoli et al. 2006). Coffea brevipes can
be grouped with the Central African (Congolese) clade of
C. canephora, while C. humilis and C. stenophylla appear
to be grouped.
Within C. arabica, wild and cultivated materials were differentiated, as
expected from previous studies based on a small number of SSR markers (Anthony
et al. 2002a, 2002b). The cultivated varieties represent a narrow genetic
base, since the dissimilarities between those genotypes are the smallest in
the dendrogram.
The second tree, considering only one individual per species, allows us to
describe 5 different groups for our sampled species. Groups M, C, and E are
still discriminated, while species from West and Central Africa (WC clade) are
separated into two groups: C. arabica, C. canephora, and the related species
C. congensis and C. brevipes form one group, while C. liberica, C. humilis,
and C. stenophylla form another group. Bootstrap values supporting these
groups are quite high for microsatellite markers.
The global diversity is high, with a mean gene diversity of 0.72 ± 0.03 and a
mean allele number of 10.8 (see Table 3 for details). The number of alleles
varies from 1 to 22 according to the locus considered. Of the total number of
alleles (648), 304 (47%) are specific to one species. A complete table of
private alleles is given as supplementary material (Table S2²). The percentage
of the total number of private alleles for each species ranges from 0% for
C. anthonyi to 31.25% for C. canephora, with a mean of 6.45% (see Table S3²).
These results show the great amount of interspecific diversity within the
genus, even though some species are represented by only one individual.
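The summary figures in this paragraph follow from simple counts: 648 alleles over 60 loci gives a mean of 10.8 alleles per locus, and 304 of 648 alleles is about 47%. An allele is counted as private when it is observed in a single species. The sketch below shows one way such a count could be made; the input layout and the function name are hypothetical and do not describe how the supplementary tables were produced.

```python
from collections import defaultdict


def private_alleles(observations):
    """Count distinct alleles and list those found in only one species.

    observations: iterable of (species, locus, allele) tuples, one per
    allele observed in an individual of that species.
    Returns (total number of distinct locus/allele pairs,
             dict mapping species -> list of private (locus, allele) pairs).
    """
    carriers = defaultdict(set)
    for species, locus, allele in observations:
        carriers[(locus, allele)].add(species)
    private = defaultdict(list)
    for (locus, allele), species_set in carriers.items():
        if len(species_set) == 1:
            private[next(iter(species_set))].append((locus, allele))
    return len(carriers), dict(private)


if __name__ == "__main__":
    # Hypothetical observations at one locus.
    obs = [
        ("C. canephora", "DL010", 150),
        ("C. canephora", "DL010", 152),
        ("C. arabica", "DL010", 152),
        ("C. liberica", "DL010", 160),
    ]
    total, private = private_alleles(obs)
    print(total, private)
```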
Considering the global sample, 59 markers are polymor-
phic. Only one, SSR016, which derived from a genic se-
quence, exhibited no polymorphism. At the intraspecific
level, 91.7%, 75%, 76.7%, and 65% of the markers are poly-
morphic in C. canephora, C. congensis, C. liberica, and
C. arabica, respectively. For the other species, polymorphism
information should not be taken into consideration because
only one or a small number of individuals are available.
Diversity analysis of several species
Four species of particular interest because of their economic importance or
breeding potential were analysed in more detail in our study. This subsample
of 4 species contributed an important part of the global sample diversity,
with a mean number of alleles of 8. On the species diversity diagram (Fig. 2)
they appear to be in 2 related clades. Table 3 presents the results for allele
number, gene diversity, and observed heterozygosity for C. arabica,
C. canephora, C. congensis, and C. liberica (Table S4² presents values
calculated for all the species). Coffea arabica shows the lowest diversity,
with a mean number of alleles of 2.10. Moreover, it is the only species that
shows gene diversity less than observed heterozygosity. The global amount of
diversity in these 4 species is substantial, with a mean gene diversity higher
than 0.35. Coffea canephora appears to be the most diverse, with a gene
diversity of 0.55 and a mean number of alleles of 5.00.
²Supplementary data for this article are available on the journal Web site (http://genome.nrc.ca) or may be purchased from the Depository of Unpublished Data, Document Delivery, CISTI, National Research Council Canada, Building M-55, 1200 Montreal Road, Ottawa, ON K1A 0R6, Canada. DUD 5250. For more information on obtaining material refer to http://cisti-icist.nrc-cnrc.gc.ca/irm/unpub_e.shtml.
Discussion
Coffea diversity
The global amount of diversity within Coffea appears to be high. Considering
the 4 previously described clades, we show that 3 groups can be confirmed
(i.e., groups C, M, and E), while the fourth (WC) appears divided in two
(Fig. 2). This division can be attributed to the use of SSRs, which have
different properties from the previously used markers, and to the high number
of markers used in this study compared with previous studies. Indeed, the high
mutation rate of microsatellite markers helps us to better investigate
structure within species and species complexes.
Moreover, microsatellites are valuable tools to assess ge-
netic structure at the species level, as demonstrated by the
global diversity diagram (Fig. 1). This figure shows the rela-
tionships within 4 species of the WC clade, indicating struc-
ture at the intraspecific level for C. liberica, C. canephora,
and C. arabica. In contrast, C. congensis appears to be ho-
mogeneous, at least for the genotypes studied.
Finally, we validated our sampling strategy, which con-
sisted of analysing at least 2 species per previously known
diversity clade for the whole genus to have an overview of
the global genus diversity. We sampled more genotypes for
4 species particularly well known and of important eco-
nomic and breeding interest (Lashermes et al. 1997; An-
thony 1992; Poncet et al. 2004).
Our results validate the microsatellite-based approach to
quickly study Coffea species by covering the entire genome,
while sequence-based studies are generally limited to small
numbers of genomic regions.
Transferability of microsatellite markers
We have confirmed the transferability of SSR markers
across the genus Coffea for a larger sample of species than
previously described. SSRs are useful markers for compara-
tive studies across genera (Casasoli 2004). Their transfer-
ability over species across a genus has been shown for
several genera including Lycopersicon (Alvarez et al. 2001),
Oryza (Gao et al. 2005), Vigna (Yu et al. 1999), and Coffea
(Combes et al. 2000; Poncet et al. 2004). Newly developed
microsatellites based on C. canephora sequences exhibit the
same properties as those previously developed based on
Fig. 1. Neighbor-joining tree for the 42 individuals analyzed based on the dissimilarity matrix calculated by simple matching. Bootstrap
values were calculated with 5000 repetitions; only values greater than or equal to 40 are shown.
Table 3. Summary statistics calculated for the 60 SSR markers for the global sample (all 15 species studied), the 4 species focused on, and each of the 4 species separately.
15 species C. arabica C. canephora C. congensis C. liberica 4 species
Marker N GD Ho N GD Ho N GD Ho N GD Ho N GD Ho N GD Ho
DL003 6 0.61 0.37 2 0.23 0.29 2 0.40 0.14 1 0.00 0.00 3 0.37 0.50 3 0.57 0.25
DL010 12 0.77 0.36 2 0.19 0.22 5 0.57 0.11 4 0.54 0.80 3 0.56 1.00 8 0.72 0.45
DL011 7 0.62 0.17 1 0.00 0.00 4 0.61 0.22 3 0.29 0.20 3 0.47 0.14 6 0.63 0.16
DL013 12 0.86 0.30 2 0.50 1.00 3 0.45 0.00 1 0.00 0.00 3 0.51 0.00 7 0.83 0.38
DL020 13 0.86 0.43 4 0.56 0.89 6 0.69 0.50 2 0.26 0.00 5 0.64 0.43 11 0.86 0.50
DL025 7 0.78 0.34 3 0.54 1.00 3 0.53 0.11 3 0.29 0.20 3 0.51 0.00 6 0.74 0.37
DL026 13 0.81 0.11 1 0.00 0.00 5 0.68 0.00 4 0.58 0.00 6 0.51 0.43 9 0.79 0.10
DL032 7 0.73 0.27 2 0.50 1.00 2 0.40 0.00 1 0.00 0.00 4 0.50 0.40 5 0.59 0.36
SSR016 1 0.00 0.00 1 0.00 0.00 1 0.00 0.00 1 0.00 0.00 1 0.00 0.00 1 0.00 0.00
SSR014 13 0.79 0.26 1 0.00 0.00 4 0.58 0.11 5 0.51 0.40 6 0.64 0.43 9 0.77 0.19
SSR015 4 0.32 0.22 2 0.50 1.00 1 0.00 0.00 1 0.00 0.00 1 0.00 0.00 2 0.27 0.32
SSR017 9 0.72 0.09 1 0.00 0.00 1 0.00 0.00 2 0.16 0.20 6 0.68 0.14 7 0.66 0.08
SSR001 2 0.15 0.00 1 0.00 0.00 2 0.18 0.00 1 0.00 0.00 0 NA NA 2 0.08 0.00
SSR003 3 0.25 0.00 1 0.00 0.00 2 0.41 0.00 2 0.28 0.00 1 0.00 0.00 2 0.25 0.00
SSR004 3 0.11 0.02 2 0.10 0.11 1 0.00 0.00 1 0.00 0.00 1 0.00 0.00 2 0.03 0.03
257 13 0.65 0.31 3 0.56 1.00 2 0.41 0.00 1 0.00 0.00 3 0.42 0.25 8 0.61 0.42
305 7 0.57 0.40 2 0.50 1.00 3 0.42 0.60 2 0.29 0.40 1 0.00 0.00 4 0.34 0.33
327 13 0.82 0.33 3 0.55 1.00 6 0.68 0.29 4 0.59 0.50 1 0.00 0.00 8 0.52 0.14
329 13 0.85 0.41 3 0.60 1.00 5 0.59 0.25 3 0.52 0.25 4 0.48 0.29 6 0.51 0.30
334 4 0.58 0.10 2 0.10 0.11 4 0.47 0.22 2 0.28 0.00 2 0.37 0.00 4 0.59 0.58
341 6 0.68 0.11 1 0.00 0.00 3 0.47 0.00 2 0.25 0.00 2 0.19 0.25 11 0.82 0.50
350 10 0.81 0.43 4 0.69 0.78 3 0.54 0.29 5 0.63 0.60 3 0.51 0.00 10 0.83 0.52
351 13 0.83 0.54 2 0.50 1.00 6 0.59 0.71 5 0.67 0.75 3 0.38 0.20 4 0.38 0.10
355 16 0.88 0.51 2 0.50 1.00 7 0.72 0.44 4 0.56 0.60 5 0.66 0.71 6 0.73 0.11
356 14 0.81 0.59 4 0.53 0.43 5 0.69 0.78 4 0.53 0.67 1 0.00 0.00 8 0.78 0.46
358 8 0.71 0.14 1 0.00 0.00 4 0.59 0.22 1 0.00 0.00 1 0.00 0.00 9 0.78 0.73
SSR009 10 0.66 0.39 1 0.00 0.00 4 0.62 0.56 3 0.50 0.50 6 0.69 0.75 13 0.87 0.71
SSR010 4 0.30 0.29 1 0.00 0.00 4 0.45 0.50 0 NA NA 0 NA NA 8 0.77 0.62
360 12 0.83 0.26 0 NA NA 6 0.69 0.22 2 0.30 0.00 6 0.69 0.60 6 0.66 0.09
364 7 0.48 0.24 1 0.00 0.00 6 0.69 0.56 2 0.38 0.00 3 0.48 0.40 14 0.87 0.33
367 11 0.84 0.49 2 0.50 1.00 6 0.72 0.44 4 0.56 0.25 4 0.56 0.33 7 0.61 0.25
368 21 0.88 0.29 1 0.00 0.00 10 0.80 0.33 5 0.61 0.25 4 0.52 0.25 10 0.81 0.59
371 11 0.78 0.54 2 0.50 1.00 5 0.55 0.38 5 0.70 0.80 5 0.59 0.43 13 0.82 0.20
384 9 0.83 0.21 2 0.10 0.11 4 0.60 0.11 2 0.16 0.20 5 0.56 0.43 10 0.76 0.67
388 18 0.87 0.31 3 0.44 0.00 8 0.78 0.78 2 0.16 0.20 3 0.52 1.00 8 0.82 0.23
392 15 0.83 0.36 1 0.00 0.00 5 0.62 0.13 4 0.58 0.60 8 0.80 0.86 11 0.82 0.38
394 14 0.74 0.32 1 0.00 0.00 5 0.43 0.44 4 0.56 0.40 8 0.77 0.67 13 0.80 0.37
395 16 0.87 0.29 3 0.21 0.13 9 0.78 0.57 2 0.26 0.00 4 0.48 0.20 10 0.64 0.37
429 20 0.86 0.27 1 0.00 0.00 8 0.79 0.56 3 0.47 0.00 5 0.61 0.20 15 0.87 0.27
442 7 0.59 0.18 1 0.00 0.00 6 0.69 0.29 2 0.23 0.33 0 NA NA 13 0.81 0.22
445 9 0.79 0.42 2 0.50 1.00 3 0.49 0.33 3 0.36 0.50 2 0.37 0.00 8 0.66 0.17
456 11 0.70 0.22 1 0.00 0.00 11 0.81 0.44 0 NA NA 0 NA NA 5 0.74 0.44
460 22 0.88 0.54 2 0.50 1.00 5 0.60 0.13 8 0.74 0.80 7 0.73 0.80 11 0.74 0.28
461 13 0.86 0.41 4 0.47 0.56 7 0.74 0.33 3 0.38 0.20 5 0.64 0.57 18 0.85 0.68
463 7 0.73 0.59 2 0.50 1.00 5 0.71 0.67 2 0.19 0.25 3 0.50 0.40 14 0.89 0.45
471 11 0.81 0.26 1 0.00 0.00 5 0.65 0.29 4 0.56 0.25 5 0.62 0.67 5 0.65 0.65
472 15 0.88 0.45 6 0.69 1.00 6 0.72 0.25 4 0.52 0.25 3 0.38 0.33 9 0.77 0.32
477 16 0.87 0.38 2 0.50 1.00 5 0.53 0.33 2 0.26 0.00 3 0.49 0.00 10 0.88 0.54
495 9 0.75 0.07 1 0.00 0.00 6 0.69 0.33 1 0.00 0.00 1 0.00 0.00 11 0.82 0.43
SSR005 11 0.69 0.10 2 0.10 0.11 3 0.50 0.00 3 0.29 0.20 3 0.45 0.20 7 0.66 0.10
501 16 0.85 0.47 2 0.49 0.89 9 0.79 0.56 1 0.00 0.00 5 0.51 0.57 14 0.87 0.58
753 13 0.82 0.58 3 0.54 1.00 5 0.66 0.38 4 0.65 1.00 4 0.53 0.67 8 0.79 0.72
755 15 0.87 0.59 3 0.59 1.00 9 0.76 0.56 5 0.67 0.75 4 0.60 0.80 13 0.89 0.76
774 8 0.58 0.12 2 0.10 0.11 3 0.26 0.11 1 0.00 0.00 1 0.00 0.00 6 0.53 0.10
779 9 0.86 0.58 2 0.50 1.00 7 0.74 0.38 4 0.58 0.60 5 0.73 0.86 9 0.85 0.71
782 9 0.77 0.19 5 0.68 0.80 1 0.00 0.00 5 0.64 0.20 4 0.54 0.20 6 0.73 0.25
790 16 0.88 0.56 3 0.60 1.00 9 0.77 0.67 5 0.66 0.60 4 0.42 0.43 14 0.86 0.71
809 8 0.71 0.53 2 0.50 1.00 3 0.35 0.44 1 0.00 0.00 5 0.72 1.00 7 0.67 0.65
837 10 0.82 0.24 3 0.29 0.13 6 0.70 0.43 3 0.42 0.25 3 0.48 0.20 9 0.82 0.25
838 16 0.89 0.46 3 0.60 1.00 5 0.65 0.22 2 0.28 0.50 3 0.45 0.20 10 0.87 0.50
Mean* 11 0.72 0.32 2 0.30 0.49 5 0.55 0.29 3 0.34 0.27 4 0.44 0.34 8 0.69 0.37
Mean† 11 0.72 0.32 2 0.30 0.48 5 0.55 0.30 3 0.35 0.27 4 0.45 0.35 8 0.69 0.37
SD 1 0.03 0.02 0 0.03 0.06 0 0.03 0.03 0 0.03 0.04 0 0.03 0.04 0 0.03 0.03
2.5% l.b. 10 0.66 0.27 2 0.23 0.37 4 0.49 0.24 2 0.29 0.20 3 0.39 0.27 7 0.63 0.31
97.5% u.b. 12 0.76 0.36 2 0.36 0.61 5 0.60 0.35 3 0.41 0.34 4 0.51 0.43 9 0.74 0.42
Note: N, number of alleles; GD, gene diversity; Ho, observed heterozygosity; NA, not available (missing data); SD, standard deviation; 2.5% l.b. and 97.5% u.b., lower and upper boundaries of the 95% confidence interval.
*Mean values based only on markers with no missing data for the considered species.
†Mean values calculated over 5000 bootstrap iterations and based only on markers with no missing data for the considered species.
C. arabica sequences, since the mean percentages of amplification
are the same. This result will support the development of
comparative mapping, the use of new markers, and the transfer
of knowledge from one species to another.
SSRs located in genes involved in sucrose metabolism appear
to behave distinctively: they exhibit either very low
diversity (1–4 alleles in the global sample) or intermediate
diversity (9–13 alleles). These results will allow us to use
these markers to study gene regions implicated in sucrose
metabolism.
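The two ranges quoted above can be checked against the allele numbers in the "15 species" column of Table 3. The sketch below is only illustrative: it assumes that the ten markers whose names begin with "SSR" are the SSRs located in sucrose-metabolism genes, which is an assumption made here and not something this section states. The dictionary and function names are hypothetical.

```python
# Allele numbers (N, "15 species" column of Table 3) for markers named SSR*.
# Assumption: these are the gene-derived SSRs discussed in the text.
ALLELES_GLOBAL = {
    "SSR016": 1, "SSR014": 13, "SSR015": 4, "SSR017": 9, "SSR001": 2,
    "SSR003": 3, "SSR004": 3, "SSR009": 10, "SSR010": 4, "SSR005": 11,
}


def diversity_class(n_alleles):
    """Assign a marker to one of the two ranges quoted in the text."""
    if 1 <= n_alleles <= 4:
        return "very low"
    if 9 <= n_alleles <= 13:
        return "intermediate"
    return "unclassified"


# Group marker names by diversity class.
classes = {}
for marker, n in ALLELES_GLOBAL.items():
    classes.setdefault(diversity_class(n), []).append(marker)

for label, markers in classes.items():
    print(f"{label}: {len(markers)} markers -> {', '.join(sorted(markers))}")
```

Under this assumption, every SSR-named marker falls into one of the two ranges and none lies between 5 and 8 alleles.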
In our work, using new markers, we confirm the relationship
between C. anthonyi and C. eugenioides previously described by
Lashermes et al. (1997). These two species show high similarity
based on both morphological and molecular data. However,
C. anthonyi originates from Cameroon, while C. eugenioides is
native to East Africa. No other coffee species belonging to the
same clade (C) has been observed between these two distant
geographic areas, and there is no clear explanation for the
discontinuous distribution of these coffee trees (Anthony 1992).
These two species could be used to improve C. arabica varieties,
given their genetic relationships and the distinctive
self-compatibility of C. anthonyi (Anthony et al. 2006).
They show some of the lowest concentrations of caffeine (0.6%)
in the genus Coffea and high concentrations of trigonelline,
an alkaloid (1.6% for C. anthonyi, 1.3% for C. eugenioides;
F. Anthony, personal communication). Both characters have long
interested coffee breeders. However, because few genotypes are
held in collections worldwide, these two species have not been
well characterized agronomically, and experiments are needed to
assess potential resistances to biotic and abiotic stresses that
could be used for improvement.
In addition, part of the C. arabica genome has been shown to
originate from an ancestral species genetically close to
C. eugenioides or C. anthonyi (Lashermes et al. 1999). These
relationships can be used to better understand the formation
and functioning of the allotetraploid genome of C. arabica,
in particular the behaviour of homeologous chromosomes during
meiosis.
Diversity and genetic properties of cultivated and related
wild species
The diversity and genetic relationships of C. arabica,
C. canephora, and related species are examined in our work.
Coffea arabica was treated as a diploid species because no more
than 2 alleles were observed per individual at any locus. This
is not surprising considering the allotetraploid origin and
amphidiploid nature of C. arabica and its autogamy. Coffea
arabica is the only species whose expected heterozygosity was
lower than its observed heterozygosity. This result is
consistent with other studies (Lashermes et al. 1999; Aggarwal
et al. 2007). It could result from heterozygosity fixed
(Lashermes et al. 1999) during the speciation process, which
brought together two different ancestral genomes. Data derived
from SNP analysis (Pot et al. 2006) support this hypothesis,
with the construction of two haplotypes based on sequences.
One is close to C. canephora and related species,
Fig. 2. Neighbor-joining tree for 15 individuals (one per species) based on the dissimilarity matrix calculated by simple matching. Bootstrap
values were calculated with 5000 repetitions.
while the other exhibits strong relationships with C. euge-
nioides. However, heterozygosity within the two ancestral
genomes appears to have been lost, since only one allele
from each genome remains in C. arabica. This result indi-
cates a possible lack of recombination between the ancestral
genomes, while recombination within each genome occurs
normally.
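To make the link between fixed heterozygosity and the statistics in Table 3 concrete, the sketch below computes gene diversity and observed heterozygosity for a toy locus. It assumes the uncorrected form GD = 1 − Σ p_i², where p_i is the frequency of allele i; the estimator actually used for Table 3 may include a sample-size correction. The genotype lists and function names are hypothetical, while the species means are copied from the "Mean*" row of Table 3.

```python
from collections import Counter


def gene_diversity(genotypes):
    """Uncorrected gene diversity: 1 minus the sum of squared allele frequencies."""
    alleles = [allele for genotype in genotypes for allele in genotype]
    n = len(alleles)
    return 1.0 - sum((count / n) ** 2 for count in Counter(alleles).values())


def observed_heterozygosity(genotypes):
    """Proportion of individuals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


# Hypothetical locus with fixed heterozygosity: every individual carries
# the same two alleles, one assumed to come from each ancestral genome.
fixed_locus = [(150, 158)] * 8

# Hypothetical monomorphic locus: every individual is the same homozygote.
monomorphic_locus = [(150, 150)] * 8

print(gene_diversity(fixed_locus), observed_heterozygosity(fixed_locus))
print(gene_diversity(monomorphic_locus), observed_heterozygosity(monomorphic_locus))

# Mean GD and mean Ho per species, from the "Mean*" row of Table 3.
TABLE3_MEANS = {
    "C. arabica": (0.30, 0.49),
    "C. canephora": (0.55, 0.29),
    "C. congensis": (0.34, 0.27),
    "C. liberica": (0.44, 0.34),
}

# Species whose observed heterozygosity exceeds gene diversity.
excess = [species for species, (gd, ho) in TABLE3_MEANS.items() if ho > gd]
print(excess)
```

The fixed-heterozygosity toy locus gives GD = 0.50 and Ho = 1.00, the same pattern reported for many C. arabica markers in Table 3 (2 alleles, GD 0.50, Ho 1.00).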
We included the two varieties of C. liberica, i.e., C. liber-
ica var. liberica and C. liberica var. dewevrei. These two
varieties were genetically well differentiated in previous
work (N’Diaye et al. 2005). Our study confirmed both the
differentiation between these two varieties and their
divergence from other species.
Coffea congensis, which is considered an ecotype of
C. canephora (Prakash et al. 2005), is differentiated from
C. canephora, but both species are grouped in the same
cluster in Fig. 2. Our study also points out the relatedness
of C. canephora and C. brevipes. Coffea brevipes originates
from Cameroon and Gabon (Chevalier 1947; Anthony 1992;
Stoffelen 1998). Like C. congensis (Sybenga 1960; Anthony 1992;
Prakash et al. 2005), it has been described as an ecotype of
C. canephora (Chevalier 1947; Anthony 1992; Stoffelen 1998).
Our work supports the hypothesis that C. brevipes is a dwarf
form of C. canephora, since this species appears to be related
to the Central African genotypes of C. canephora (Fig. 1).
Field studies should be performed to test this hypothesis.
Coffea canephora is the most diverse species, with 95 pri-
vate alleles, i.e., 31.25% of the total number of private al-
leles and 14.66% of the total number of alleles. Our results
(Fig. 1) confirm the division of this species into at least two
groups, i.e., a Congolese group from Central Africa and a
Guinean group from West Africa. In contrast, C. liberica
and C. congensis exhibit, respectively, 52 and 27 private
alleles, while C. arabica has 20 private alleles. The overall
diversity of C. canephora, C. congensis, and C. liberica is
very high compared with that of C. arabica, which has the
lowest diversity, although wild individuals of this species are
more diverse than cultivated ones. These
results are in accordance with previous studies (Anthony et
al. 2002a; Moncada and McCouch 2004) and corroborate
the very narrow genetic base of C. arabica, suggesting a
small number of founders for this species.
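The percentages quoted above for C. canephora can be retraced from Table 3. As a rough check, the sketch below sums the allele numbers in the "15 species" column (60 markers) to obtain the total number of alleles, and back-calculates the total number of private alleles from the 31.25% figure. That second total is an inference made here, not a value given in the text, and the variable names are illustrative.

```python
# Allele numbers per marker, "15 species" column of Table 3, in table order.
N_GLOBAL = [
    6, 12, 7, 12, 13, 7, 13, 7, 1, 13, 4, 9, 2, 3, 3,
    13, 7, 13, 13, 4, 6, 10, 13, 16, 14, 8, 10, 4, 12, 7,
    11, 21, 11, 9, 18, 15, 14, 16, 20, 7, 9,
    11, 22, 13, 7, 11, 15, 16, 9, 11, 16, 13, 15, 8, 9, 9, 16, 8, 10, 16,
]

# Private alleles per species, as quoted in the text.
PRIVATE = {"C. canephora": 95, "C. liberica": 52, "C. congensis": 27, "C. arabica": 20}

total_alleles = sum(N_GLOBAL)

# Inferred, not stated in the text: 95 private alleles are said to be 31.25%
# of all private alleles, so the implied total is 95 / 0.3125.
total_private = round(PRIVATE["C. canephora"] / 0.3125)

canephora_share_of_all = 100 * PRIVATE["C. canephora"] / total_alleles

print(f"total alleles over {len(N_GLOBAL)} markers: {total_alleles}")
print(f"C. canephora private alleles / all alleles: {canephora_share_of_all:.2f}%")
for species, n in PRIVATE.items():
    print(f"{species}: {100 * n / total_private:.2f}% of the inferred private total")
```

The same loop gives the share of private alleles for the three other species, which the text does not report and which depends entirely on the inferred total.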
Conclusion and consequences for breeding
Our work shows the transferability of SSR markers across
the genus Coffea. We point out the potential usefulness of
related wild species in breeding strategies for C. arabica
and C. canephora to provide new variability. These results
underline the importance of genus-wide diversity studies. Our
results, as well as previous analyses using ITS and RFLP
markers (Lashermes et al. 1997, 1999), lead us to consider
that considerable breeding potential among species of these
two clades has not yet been exploited.
We propose working along two axes. First, since C. liberica,
C. congensis, and the cultivated species are all grouped in
related clades, the potential for crosses between these species
is high and the resulting hybrids should show a high level of
fertility (Louarn 1992). Variability observed within these
species can be used to improve beverage and bean quality,
productivity, and resistance to biotic and abiotic stresses in
the cultivated species. Second, the breeding potential of
species from other diversity groups should be assessed, since
interesting characters have been described. For example,
C. racemosa (E clade according to Cros et al. 1998) has been
used for coffee leaf miner resistance (Guerreiro et al. 1999;
Mondego et al. 2005) and C. anthonyi (C clade) could be used
for self-compatibility.
Breeding C. arabica will have to take into account its al-
lopolyploid origin. Considering the low rate of recombina-
tion between the two ancestral genomes, the introduction of
recessive alleles coding for traits of interest will be difficult.
Comparative genetic mapping and association mapping
will be developed for future breeding programs. Relation-
ships between C. canephora, C. eugenioides, C. arabica,
and related species will be analysed to assess valuable traits
for both quality and resistance improvement throughout the
genus.
Acknowledgements
Technical help was provided by the Montpellier Languedoc-
Roussillon Genopole genotyping platform. The authors
thank the NARO-CORI (Uganda), the CNRA (République de Côte
d’Ivoire), and the IRD (France) for providing plant
material. P. Cubry is supported by a grant from the French
Ministry of Research. The authors are grateful to J.L. Noyer
for discussions and advice on an early version of the
manuscript. We also thank an anonymous reviewer for
comments and advice on this paper.
References
Aggarwal, R.K., Hendre, P.S., Varshney, R.K., Bhat, P.R.,
Krishnakumar, V., and Singh, L. 2007. Identification, character-
ization and utilization of EST-derived genic microsatellite mar-
kers for genome analyses of coffee and related species. Theor.
Appl. Genet. 114: 359–372. PMID:17115127.
Alvarez, A.E., van de Wiel, C.C.M., Smulders, M.J.M., and Vosman,
B. 2001. Use of microsatellites to evaluate genetic diversity
and species relationships in the genus Lycopersicon. Theor.
Appl. Genet. 103: 1283–1292. doi:10.1007/s001220100662.
Anthony, F. 1992. Les ressources génétiques des caféiers: collecte,
gestion d’un conservatoire et évaluation de la diversité
génétique. Collection Travaux et Documents Microfichés n° 81,
ORSTOM (now IRD), Paris.
Anthony, F., Combes, C., Astorga, C., Bertrand, B., Graziosi, G.,
and Lashermes, P. 2002a. The origin of cultivated Coffea ara-
bica L. varieties revealed by AFLP and SSR markers. Theor.
Appl. Genet. 104: 894–900. PMID:12582651.
Anthony, F., Quirós, O., Topart, P., Bertrand, B., and Lashermes,
P. 2002b. Detection by simple sequence repeat markers of
introgression from Coffea canephora in Coffea arabica cultivars.
Plant Breed. 121: 542–544. doi:10.1046/j.1439-0523.2002.00748.x.
Anthony, F., Noirot, M., Couturon, E., and Stoffelen, P. 2006. New
coffee (Coffea L.) species from Cameroon bring original charac-
ters for breeding [CD-ROM]. In 21st International Conference
on Coffee Science, Montpellier, 11–15 September 2006. Edited
by ASIC. Paris, France.
Berthaud, J. 1986. Les ressources génétiques pour l’amélioration
des caféiers africains diploïdes. Doctoral thesis, Université de
Paris-Sud, Orsay, France.
Casasoli, M. 2004. Cartographie génétique comparée chez les
fagacées. Doctoral thesis, Université de Bordeaux 1, Bordeaux,
France.
Chevalier, A. 1947. Les caféiers du globe, fascicule III,
Systématique des caféiers et faux caféiers. Paris.
Combes, M.C., Andrzejewski, S., Anthony, F., Bertrand, B., Rovelli,
P., Graziosi, G., and Lashermes, P. 2000. Characterization of
microsatellite loci in Coffea arabica and related coffee species.
Mol. Ecol. 9: 1178–1180. doi:10.1046/j.1365-294x.2000.00954-5.x.
PMID:10964241.
Cramer, P.J.S. 1948. Les caféiers hybrides du groupe Congusta.
Bull. Agric. du Congo Belge, 34: 29–48.
Cros, J., Combes, M.C., Trouslot, P., Anthony, F., Hamon, S.,
Charrier, A., and Lashermes, P. 1998. Phylogenetic analysis of
chloroplast DNA variation in Coffea L. Mol. Phylogenet. Evol.
9: 109–117. doi:10.1006/mpev.1997.0453. PMID:9479700.
Cubry, P., De Bellis, F., Pot, D., Musoli, P., Legnaté, H., Leroy, T.,
and Dufour, M. 2005. Genetic diversity analyses and linkage
disequilibrium evaluation in some natural and cultivated popula-
tions of Coffea canephora. In Proceedings of the 4th Plant
Genomics European Meeting, Amsterdam, 20–23 September
2005.
Davis, A., and Stoffelen, P. 2006. An annotated taxonomic con-
spectus of the genus Coffea (Rubiaceae). Bot. J. Linn. Soc. 152:
465–512. doi:10.1111/j.1095-8339.2006.00584.x.
Dufour, M., Hamon, P., Noirot, M., Risterucci, A.M., and Leroy, T.
2001. Potential use of SSR markers for Coffea spp. genetic map-
ping [CD-ROM]. In 19th International Scientific Colloquium on
Coffee, Trieste, 2001. Edited by ASIC. Paris, France.
Dussert, S., Lashermes, P., Anthony, F., Montagnon, C., Trouslot,
P., Combes, M.-C., et al. 2003. Coffee (Coffea canephora). In
Genetic diversity of cultivated tropical plants. Edited by P. Ha-
mon, M. Seguin, X. Perrier, and J.C. Glaszmann. Science Pub-
lishers, Inc., Enfield, N.H. pp. 239–258.
Gao, L.Z., Zhang, C.H., and Jia, J.Z. 2005. Cross-species transfer-
ability of rice microsatellites in its wild relatives and the poten-
tial for conservation genetic studies. Genet. Resour. Crop Evol.
52: 931–940. doi:10.1007/s10722-003-6124-3.
Geromel, C., Ferreira, L.P., Cavalari, A.A., Pereira, L.F.P.,
Guerreiro, S.M.C., Vieira, L.G.E., et al. 2006. Biochemical and
genomic analysis of sucrose metabolism during coffee (Coffea
arabica) fruit development. J. Exp. Bot. 57: 3243–3258.
doi:10.1093/jxb/erl084. PMID:16926239.
Guerreiro, O., Silvarolla, M.B., and Eskes, A.B. 1999. Expression
and mode of inheritance of resistance in coffee to leaf miner
Perileucoptera coffeella. Euphytica, 105: 7–15.
The International Plant Names Index. 2007. Available from
http://www.ipni.org [accessed 10 December 2007].
Jarne, P., and Lagoda, P.J. 1996. Microsatellites, from molecules to
populations and back. Trends Ecol. Evol. 11: 424–429.
doi:10.1016/0169-5347(96)10049-5.
Lashermes, P., Cros, J., Combes, M.C., Trouslot, P., Anthony, F.,
Hamon, S., and Charrier, A. 1996. Inheritance and restriction
fragment length polymorphism of chloroplast DNA in the genus
Coffea L. Theor. Appl. Genet. 93: 626–632.
Lashermes, P., Combes, M.C., Trouslot, P., and Charrier, A. 1997.
Phylogenetic relationships of coffee-tree species (Coffea L.) as
inferred from ITS sequences of nuclear ribosomal DNA. Theor.
Appl. Genet. 94: 947–953. doi:10.1007/s001220050500.
Lashermes, P., Combes, M.C., Robert, J., Trouslot, P., D’Hont, A.,
Anthony, F., and Charrier, A. 1999. Molecular characterization
and origin of the Coffea arabica L. genome. Mol. Gen. Genet.
261: 259–266. PMID:10102360.
Leroy, T., Marraccini, P., Dufour, M., Montagnon, C., Lashermes,
P., Sabau, X., et al. 2005. Construction and characterization of a
Coffea canephora BAC library to study the organization of su-
crose biosynthesis genes. Theor. Appl. Genet. 111: 1032–1041.
doi:10.1007/s00122-005-0018-z. PMID:16133319.
Liu, K., and Muse, S.V. 2005. PowerMarker: integrated analysis
environment for genetic marker data. Bioinformatics, 21: 2128–
2129. doi:10.1093/bioinformatics/bti282. PMID:15705655.
Louarn, J. 1992. La fertilité des hybrides interspécifiques et les
relations génomiques entre caféiers diploïdes d’origine africaine
(genre Coffea L., sous-genre Coffea). Doctoral thesis, Université
de Paris-Sud, Orsay, France.
Moncada, P., and McCouch, S. 2004. Simple sequence repeat di-
versity in diploid and tetraploid Coffea species. Genome, 47:
501–509. doi:10.1139/g03-129. PMID:15190367.
Mondego, J.M.C., Guerreiro-Filho, O., Bengtson, M.H.,
Drummond, R.D., Felix, J.M., Duarte, M.P., et al. 2005. Isolation
and characterization of Coffea genes induced during coffee
leaf miner (Leucoptera coffeella) infestation. Plant Sci. 69:
351–360.
Montagnon, C. 2000. Optimization des gains génétiques dans le
schéma de sélection récurrente réciproque de Coffea canephora
Pierre. Doctoral thesis, Ecole Nationale Supérieure Agronomique
de Montpellier, Montpellier, France.
Musoli, P., Aluka, P., Cubry, P., Dufour, M., De Bellis, F.,
Ogwang, J., et al. 2006. Fighting against coffee wilt disease:
Uganda wild canephora genetic diversity and usefulness. In 21st
International Conference on Coffee Science, Montpellier, 11–
15 September 2006. Edited by ASIC. Paris, France.
N’Diaye, A., Poncet, V., Louarn, J., Hamon, S., and Noirot, M.
2005. Genetic differentiation between Coffea liberica var.
liberica and C. liberica var. dewevrei and comparison with
C. canephora. Plant Syst. Evol. 253: 95–104.
doi:10.1007/s00606-005-0300-1.
Perrier, X., Flori, A., and Bonnot, F. 2003. Data analysis methods.
In Genetic diversity of cultivated tropical plants. Science Pub-
lishers, Inc., Enfield, N.H. pp. 43–76.
Poncet, V., Hamon, P., Minier, J., Carasco, C., Hamon, S., and
Noirot, M. 2004. SSR cross-amplification and variation within
coffee trees (Coffea spp.). Genome, 47: 1071–1081.
doi:10.1139/g04-064. PMID:15644965.
Poncet, V., Dufour, M., Hamon, P., Hamon, S., de Kochko, A., and
Leroy, T. 2007. Development of genomic microsatellite markers
in Coffea canephora and their transferability to other coffee spe-
cies. Genome, 50(12):1156–1161. doi:10.1139/G07-073.
Pot, D., Bouchet, S., Cubry, P., Dufour, M., De Bellis, F., Jourdan,
I., et al. 2006. Nucleotide diversity of genes involved in sucrose
metabolism. Towards the identification of candidate genes
controlling sucrose variability in Coffea spp. In 21st International
Conference on Coffee Science, Montpellier, 11–15 September
2006. Edited by ASIC. Paris, France.
Prakash, N.S., Combes, M.C., Somanna, N., and Lashermes, P.
2002. AFLP analysis of introgression in coffee cultivars (Coffea
arabica L.) derived from a natural interspecific hybrid. Euphy-
tica, 124: 265–271. doi:10.1023/A:1015736220358.
Prakash, N.S., Combes, M.C., Dussert, S., Naveen, S., and La-
shermes, P. 2005. Analysis of genetic diversity in Indian robusta
coffee genepool (Coffea canephora) in comparison with a repre-
sentative core collection using SSRs and AFLPs. Genet. Resour.
Crop Evol. 52: 333–343. doi:10.1007/s10722-003-2125-5.
Risterucci, A.M., Grivet, L., N’Goran, J.A.K., Pieretti, I., Flament,
M.H., and Lanaud, C. 2000. A high-density linkage map of
Theobroma cacao L. Theor. Appl. Genet. 101: 948–955.
doi:10.1007/s001220051566.
Rovelli, P., Mettulio, R., Anthony, F., Anzueto, F., and Lashermes,
P. 2000. Microsatellites in Coffea arabica L. In Coffee biotech-
nology and quality. Edited by T. Sera, C.R. Soccol, A. Pandey,
and S. Roussos. Kluwer Academic Publishers, the Netherlands.
pp. 123–133.
Rozen, S., and Skaletsky, H.J. 2000. Primer 3. Version 0.2 [com-
puter program]. Available from http://primer3.sourceforge.net/.
Saitou, N., and Nei, M. 1987. The neighbor-joining method: a new
method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:
406–425. PMID:3447015.
Stoffelen, P. 1998. Coffea and Psilanthus (Rubiaceae) in tropical
Africa: a systematic and palynological study, including a revi-
sion of the West and Central African species. Doctoral thesis,
Katholieke Universiteit Leuven, Leuven, Belgium.
Sybenga, J. 1960. Genetics and cytology of coffee. A literature re-
view. Bibliographica Genet. 19: 217–316.
Tautz, D., and Renz, M. 1984. Simple sequences are ubiquitous re-
petitive components of eukaryotic genomes. Nucleic Acids Res.
12: 4127–4138. doi:10.1093/nar/12.10.4127. PMID:6328411.
Yu, K., Park, S.J., and Poysa, V. 1999. Abundance and variation of
microsatellite DNA sequences in beans (Phaseolus and Vigna).
Genome, 42: 27–34. doi:10.1139/gen-42-1-27.