RESEARCH ARTICLE
Unveiling a unique genetic diversity of cultivated Coffea arabica L. in its main domestication center: Yemen
C. Montagnon · A. Mahyoub · W. Solano · F. Sheibani
Received: 21 July 2020 / Accepted: 15 January 2021 / Published online: 15 February 2021
© The Author(s) 2021
Abstract Whilst it is established that almost all cultivated coffee (Coffea arabica L.) varieties originated in Yemen after some coffee seeds were introduced into Yemen from neighboring Ethiopia, the actual coffee genetic diversity in Yemen and its significance to the coffee world had never been explored. We observed five genetic clusters. The first cluster, which we named the Ethiopian-Only (EO) cluster, was made up exclusively of the Ethiopian accessions. This cluster was clearly separated from the Yemen and cultivated varieties clusters, hence confirming the genetic distance between wild Ethiopian accessions and coffee cultivated varieties around the world. The second cluster, which we named the SL-17 cluster, was a small cluster of cultivated worldwide varieties and included no Yemen samples. Two other clusters were made up of worldwide varieties and Yemen samples. We named these the Yemen Typica-Bourbon cluster and the Yemen SL-34 cluster. Finally, we observed one cluster that was unique to Yemen and was not related to any known cultivated varieties and not even to any known Ethiopian accession: we name this cluster the New-Yemen cluster. We discuss the consequences of these findings and their potential to pave the way for further comprehensive genetic improvement projects for the identification of major resilience/adaptation and cup quality genes that have been shaped through the domestication process of C. arabica.
Keywords Coffea arabica · Genetic diversity · Yemen · Domestication
Introduction
12.5 million households around the world receive an income from coffee growing (Browning 2018). Coffee is mainly produced by two species: Coffea arabica L., producing Arabica coffee, and Coffea canephora Pierre ex A.Froehner, producing the coffee known as Conilon when produced in Brazil and Robusta anywhere else in the world. Arabica coffee production is facing multiple challenges such as climate change (Bunn et al. 2015) and serious diseases such as coffee leaf rust caused by Hemileia vastatrix (Avelino et al. 2015), and Coffee Berry Disease caused by Colletotrichum kahawae (Van der Vossen et al. 2015). Whilst there is no single solution to such complex challenges, the study of genetics and the breeding of improved varieties is a critical area of investigation for potential solutions. The search for genetically superior quality coffee varieties has become a central issue for the specialty market (Montagnon et al. 2019). Unfortunately, the genetic diversity of C. arabica is among the lowest of the cultivated crops, due to a recent single event of polyploidization (Scalabrin et al. 2020). Furthermore, the major part of this low genetic diversity is found in Ethiopia and, to a lesser degree, in South Sudan (Chevalier 1929; Harlan 1969; Sylvain 1958; Thomas 1942).

Supplementary Information: The online version contains supplementary material available at https://doi.org/10.1007/s10722-021-01139-y.

C. Montagnon (✉)
RD2 Vision, 60 rue du Carignan, 34270 Valflaunes, France
e-mail: Christophe.montagnon@rd2vision.com
A. Mahyoub
Qima Coffee, Asir, Sana’a, Yemen
W. Solano
CATIE, Centro Agronómico Tropical de Investigación y Enseñanza, Turrialba, Cartago 7170, Costa Rica
F. Sheibani
Qima Coffee, 21 Warren Street, London W1T 5LT, UK

Genet Resour Crop Evol (2021) 68:2411–2422
https://doi.org/10.1007/s10722-021-01139-y
Whilst Ethiopia and South Sudan are the center of origin of C. arabica, Yemen has been its key center of domestication. Both historical records (de la Roque 1716; Ukers 1922; Chevalier 1929; Cramer 1957; Haarer 1958; Meyer 1965; Koehler 2017) and past phenotypic (Montagnon and Bouharmont 1996) or genetic studies (Lashermes et al. 1996; Anthony et al. 2002; Silvestrini et al. 2007; Pruvot-Woehl et al. 2020; Scalabrin et al. 2020) indicate that all the Arabica varieties cultivated outside Ethiopia transited through the Yemen domestication center. Still, while all the previous genetic studies used Yemeni coffee samples to check the Yemeni stop-over between Ethiopia and the spread of Arabica coffee varieties to the world, none has revealed the Yemeni genetic diversity. The few genetic studies focusing on Yemeni coffees (Al-Murish et al. 2013; Hussein et al. 2017) indicated the genetic heterogeneity of varieties as identified by Yemeni farmers. Neither Ethiopian accessions nor worldwide cultivated varieties were included in these studies.
Various molecular markers are available to geneticists and breeders for genetic studies and genome analysis, namely Single Nucleotide Polymorphisms (SNPs) and Simple Sequence Repeats (SSRs) (reviewed by Adhikari et al. 2017). The Genotyping by Sequencing (GBS) technique, which provides thousands of SNPs for a dense coverage of the genome, is no doubt the ideal method to study the relationship between genotype and phenotype, notably in polyploids (Clevenger et al. 2015). SNPs are often used in coffee, for example for genome-wide selection in C. canephora (Alkimim et al. 2020). Scalabrin et al. (2020) demonstrated the recent unique polyploidization event by sequencing the C. arabica genome using SNPs from the two sub-genomes of this tetraploid species. However, when the objective is fingerprinting or a genetic diversity study, the high level of polymorphism in SSRs renders them a reliable, practical and cost-effective choice (Hodel et al. 2016). In grapes, the first approach to characterize a Vitis germplasm collection with ten SSRs proved a high discriminating capacity for grapevine varieties (Emmanuelli et al. 2013). In the same study, SSRs proved as efficient as SNPs to establish the genetic diversity of grapevine. Singh et al. (2013) found that SSR markers were more efficient than SNP markers when the objective was strictly the study of genetic diversity. Anthony et al. (2002) found that 6 SSR markers were efficient to confirm the origin of cultivated C. arabica varieties, spreading from Yemen after an early introduction from Ethiopia. More recently, da Silva et al. (2019) used 30 markers to efficiently discriminate between C. arabica varieties and three diploid Coffea species. Benti et al. (2020) used 14 SSR markers to efficiently discriminate between 40 cultivated C. arabica varieties in Ethiopia. Finally, Pruvot-Woehl et al. (2020) demonstrated that a set of 8 SSR markers, the same set used in the present study, was efficient in discriminating between varieties when used to genotype 2533 samples representing the largest known genetic diversity of C. arabica, and could be used for varietal authentication and hence for genetic diversity studies.
Over the past half-decade, Qima Coffee (www.qimacoffee.com), a coffee company working at the ground level in Yemen, has developed research activities in order to better understand the genetic landscape of Yemeni coffee. A breeding population made of 45 individuals representing various coffee morphotypes observed in Yemen has been gathered. Using this breeding population, together with Ethiopian accessions and cultivars as well as a representation of the cultivated varieties worldwide, we present here the first global study aiming at describing the C. arabica coffee genetic diversity in Yemen. The main questions addressed in the study are as follows: What is the magnitude and the structure of the genetic diversity in Yemen? How does the genetic diversity of Yemen compare with the known genetic diversity of coffee in Ethiopia and that of the cultivated C. arabica varieties worldwide? And finally, does Yemen’s genetic diversity offer opportunities for genetic improvement of C. arabica?
Material and methods
Plant material
137 samples of C. arabica were studied. They belong
to the following three categories (Table 1).
Ethiopian accessions (EA)
72 accessions represent the Ethiopian accessions collected in 1966 and 1968 by the FAO and Orstom surveys, respectively (FAO 1968; Charrier 1978). Those 72 accessions were selected because they are part of the core collection defined by World Coffee Research and the Centro Agronómico Tropical de Investigación y Enseñanza (CATIE) in 2014 (Solano, personal communication).
Worldwide cultivars (WWC)
20 samples represent the main cultivated varieties grown worldwide outside Ethiopia. This includes Bourbon and Typica, but also East African and Indian varieties that have been shown to have transited through Yemen from Ethiopia before being introduced in all present coffee producing countries (reviewed in Pruvot-Woehl et al. 2020).
Yemen Qima breeding population (YQ)
45 samples come from the Qima breeding population, which is made up of 45 trees selected from Yemeni germplasm representing the major coffee growing areas.
DNA extraction and SSR marker analysis
All DNA extraction and SSR marker analysis operations were performed by the ADNiD laboratory of the Qualtech company in the South of France (http://www.qualtech-groupe.com/en/).
Genomic DNA was extracted from approximately 20 mg of dried tissue according to an in-house protocol with SDS buffer. DNA was then purified with magnetic beads (Agencourt AMPure XP, Beckman Coulter, Brea, California, USA) followed by elution in Tris-EDTA (TE) buffer.
The DNA concentration was estimated with an Enspire spectrofluorimeter (Perkin Elmer) with a bisbenzimide DNA intercalator (Hoechst 33258) and by comparison with known standards of DNA.
Eight SSR primer pairs (Table 2) selected after
Combes et al. (2000) and whose wide discrimination
power was confirmed by Pruvot-Woehl et al. (2020)
have been used.
PCR was performed in a 15 µL final volume comprising 30 ng genomic DNA and 7.5 µL of 2× PCR buffer (Type-it Microsatellite PCR Kit, Qiagen), 1.0 µM each of forward and reverse primer (10 µM). Amplifications were carried out in a thermal cycler (Eppendorf) programmed at 94 °C for 5 min for initial denaturation, followed by 35 cycles of 94 °C for 30 s, the primer-dependent annealing temperature for 30 s and 72 °C for 1 min, followed by a final extension step at 72 °C for 5 min. The final holding temperature was 4 °C.
PCR samples were run on a capillary electrophoresis instrument (ABI 3130XL) with an internal standard, the GeneScan 500 LIZ size standard (Applied Biosystems).
Alleles were scored using GeneMapper v.4.1 software (Applied Biosystems).
Table 1 Distribution of C. arabica sample categories across the main genetic clusters
Genetic cluster (our study); "-" marks a cluster with no sample from that category
Sample category                    Ethiopian only  SL-17  Yemen Bourbon Typica  Yemen SL-34  New-Yemen  Total
Ethiopian accessions               68              4      -                     -            -          72
WW cultivars                       5               4      9                     2            -          20
Yemen Qima breeding populations    -               -      13                    8            24         45
Total                              73              8      22                    10           24         137
Data analysis
The method described by Pruvot-Woehl et al. (2020) was used. Because C. arabica is tetraploid, the presence/absence (1/0) of each allele was coded. Strictly speaking, we are dealing with SSR allelic phenotypes rather than genotypes. Indeed, the phenotype AB could be any of the following genotypes: AABB, ABAB, AAAB, ABBB.
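The presence/absence coding described above can be illustrated with a minimal Python sketch. The sample names and allele sizes below are invented for illustration (they only fall within the size ranges of Table 2), and the function name `presence_absence` is hypothetical; this is not the DARwin6 input procedure itself.

```python
# Hypothetical allele calls: sample -> marker -> set of fragment sizes (bp).
# These values are invented for illustration, not taken from the study.
calls = {
    "sample_1": {"Sat-11": {143, 145}, "Sat-32": {119}},
    "sample_2": {"Sat-11": {143}, "Sat-32": {119, 125}},
}


def presence_absence(calls):
    """Code each (marker, allele) pair as one 1/0 column per sample.

    Allele dosage is ignored on purpose: in a tetraploid, the observed
    alleles define an allelic phenotype, not a full genotype.
    """
    columns = sorted(
        {(marker, allele)
         for per_marker in calls.values()
         for marker, alleles in per_marker.items()
         for allele in alleles}
    )
    matrix = {
        sample: [1 if allele in per_marker.get(marker, set()) else 0
                 for marker, allele in columns]
        for sample, per_marker in calls.items()
    }
    return columns, matrix


columns, matrix = presence_absence(calls)
for sample, row in matrix.items():
    print(sample, row)
```

Each row is the allelic phenotype of one sample; two samples carrying the same alleles in different doses receive identical rows, which is the limitation noted in the text.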
DARwin6 software (Perrier and Jacquemoud-Collet 2006) was used with single data files. The dissimilarity matrix was calculated using the Dice index and was the basis for the construction of the genetic diversity tree using the weighted Neighbor-Joining method (Saitou and Nei 1987) and for the Principal Coordinates Analysis (PCoA). As indicated by Perrier and Jacquemoud-Collet (2006), the PCoA gives an overall representation of the diversity, while tree methods tend to represent individual genetic relationships faithfully. Hence, these two different ways of viewing the data are complementary.
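As an illustration of the dissimilarity step, the sketch below computes a Dice dissimilarity between 1/0 allelic profiles, assuming the standard form d = (b + c) / (2a + b + c), where a counts alleles present in both samples and b + c counts alleles present in only one. The profiles and function names are hypothetical, and the tree building and PCoA performed in DARwin6 are not reproduced here.

```python
def dice_dissimilarity(x, y):
    """Dice dissimilarity between two equal-length 1/0 profiles."""
    shared = sum(1 for p, q in zip(x, y) if p == 1 and q == 1)  # a
    mismatched = sum(1 for p, q in zip(x, y) if p != q)         # b + c
    denominator = 2 * shared + mismatched
    # Two profiles with no allele at all are treated as identical.
    return mismatched / denominator if denominator else 0.0


def dissimilarity_matrix(profiles):
    """Pairwise Dice dissimilarities, keyed by (sample_i, sample_j)."""
    names = sorted(profiles)
    return {(i, j): dice_dissimilarity(profiles[i], profiles[j])
            for i in names for j in names}


# Invented profiles, one column per (marker, allele) pair.
profiles = {
    "sample_1": [1, 1, 1, 0],
    "sample_2": [1, 0, 1, 1],
    "sample_3": [0, 0, 0, 1],
}
matrix = dissimilarity_matrix(profiles)
print(matrix[("sample_1", "sample_2")])
```

Shared absences do not enter the formula, so two samples are not made to look similar merely because both lack a rare allele.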
In order to check the robustness of the clusters, a Discriminant Analysis (DA) (Tomassone et al. 1988) was run on the coordinates on the first five axes of the PCoA. The statistical difference between the genetic clusters was checked with the Wilks' Lambda test. The percentage of correct classification was checked through cross-validation, in which each sample is classified using the model built on all the samples except that one. The DA and related tests were performed with the Xlstat software (Addinsoft 2020).
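The cross-validation logic described above (classify each sample with a model built without it) can be sketched as follows. Note the substitution: for brevity, a nearest-centroid classifier stands in for the Discriminant Analysis run in Xlstat, so this only illustrates the leave-one-out procedure, not the authors' model. The coordinates and labels are invented.

```python
from math import dist


def centroid(points):
    """Coordinate-wise mean of a list of equal-length points."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def leave_one_out_accuracy(coords, labels):
    """Share of samples correctly classified when each one is held out.

    A nearest-centroid rule is used here as a simple stand-in for DA.
    """
    correct = 0
    for i, (point, truth) in enumerate(zip(coords, labels)):
        groups = {}
        for j, (other, label) in enumerate(zip(coords, labels)):
            if j != i:  # build the model without the held-out sample
                groups.setdefault(label, []).append(other)
        predicted = min(groups,
                        key=lambda label: dist(point, centroid(groups[label])))
        correct += predicted == truth
    return correct / len(coords)


# Invented coordinates on two axes for two hypothetical clusters.
coords = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["A", "A", "A", "B", "B", "B"]
accuracy = leave_one_out_accuracy(coords, labels)
print(accuracy)
```

Because the held-out sample never contributes to the centroids it is compared against, the resulting rate is a less optimistic estimate than classifying samples with a model fitted on all of them.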
Results
The Neighbor-Joining tree based on all the samples (Fig. 1) shows five well-marked clusters. Sample categories are not evenly distributed among the clusters (Table 1). One major cluster was almost entirely comprised of Ethiopian accessions. We named it the Ethiopian Only (EO) cluster. Five worldwide cultivated varieties belonged to that cluster: (i) Geisha (CATIE code T.02722) is the famous Geisha which originally became renowned in Panama (Pruvot-Woehl et al. 2020), (ii) Java is a variety originally selected in Cameroon (Bouharmont 1994) in an Ethiopian population, (iii) Chiroso is a name given to a variety grown in Colombia and said to have a superior cup quality (Montagnon Pers. Obs.), (iv) SL-06 was selected in Kenya in the early twentieth century, supposedly from a Kent tree (Jones 1956) and (v) Mibirizi is one of the first varieties grown in the Great Lakes region; its origin is unknown (Leplae 1936). The CATIE code of the Mibirizi accession in this cluster is T.02702.
The second cluster was made of Ethiopian accessions and worldwide cultivated varieties. The cultivated varieties were selected in East Africa in the early twentieth century: SL-14, SL-17 and K-7. SL-14 and SL-17 were selected from a “Drought Resistant” population in Kenya. The origin of this “Drought Resistant” population is unknown (Jones 1956). Both K-7 and K-758 have the same genetic fingerprint and are supposed to descend from Kent selections (Fernie 1970). Mibirizi with CATIE code T.03622 was also in that cluster. No Yemeni samples were found in that cluster. We named this cluster the “SL-17” cluster.
Table 2 List of the microsatellites used in the study
SSR marker  Primer sequence forward (5′–3′)  Primer sequence reverse (5′–3′)  Size product (bp)
Sat-11      ACCCGAAAGAAAGAACCAA              CCACACAACTCTCCTCATTC             143–145
Sat-225     CATGCCATCATCAATTCCAT             TTACTGCTCATCATTCCGCA             283–317
Sat-235     TCGTTCTGTCATTAAATCGTCAA          GCAAATCATGAAAATAGTTGGTG          245–278
Sat-24      GGCTCGAGATATCTGTTTAG             TTTAATGGGCATAGGGTCC              167–181
Sat-254     ATGTTCTTCGCTTCGCTAAC             AAGTGTGGGAGTGTCTGCAT             221–237
Sat-29      GACCATTACATTTCACACAC             GCATTTTGTTGCACACTGTA             137–154
Sat-32      AACTCTCCATTCCCGCATTC             CTGGGTTTTCTGTGTTCTCG             119–125
Sat-47      TGATGGACAGGAGTTGATGG             TGCCAATCTACCTACCCCTT             135–169
The third and fourth clusters were made only of worldwide coffee cultivars and Yemeni samples. One cluster included Yemeni accessions and two cultivated varieties selected in East Africa in the early twentieth century: SL-09 and SL-34. SL-09 is of unknown origin while SL-34 descends from a heterogeneous “French Mission” population made of seeds brought by French Fathers of the Holy Ghost directly from Yemen (Aden). We named this cluster the “Yemen SL-34” cluster. The other cluster included Yemeni accessions and various worldwide cultivated varieties including Bourbon, Typica, SL-28 (which has the same DNA fingerprint as Coorg), Kent, KP-263 and 532, Bronze 009 and Moka. We named this cluster the “Yemen Typica-Bourbon” cluster. Typica and Bourbon are the only two varieties that have spread around the world since the eighteenth century: Typica through India and then Asia, and Bourbon via Bourbon Island, now La Réunion. Kent and Coorg are Indian selections deriving from Old Chicks, the population formed by the first introduction of coffee in India by Baba Budan (Haarer 1958; Kushalappa and Eskes 1989). In East Africa, the denomination “Moka” was given to any coffee beans or seeds proceeding from the port of Mocha in Yemen (Jones 1956). Cramer (1957) and Carvalho et al. (1984) described the Moka variety as having small roundish fruits, originating from a mutation that can occur in different genetic backgrounds.

Fig. 1 Neighbor-Joining tree based on the dissimilarity matrix involving the 137 samples of the study. Black: Ethiopian accessions; Blue: Worldwide cultivars; Red: Yemen Qima breeding population. Clusters labeled on the tree: Ethiopian Only, SL-17, Yemen SL-34, Yemen Typica Bourbon, New-Yemen. (Color figure online)
Finally, we found one cluster which was made only of Yemeni samples and no other worldwide cultivars. We named this the “New-Yemen” cluster.
Figure 2 shows the graph based on the first two axes of the PCoA, which explained 40.2% of the whole variation. The “Ethiopian only” genetic cluster is on the right part of the graph while “SL-17”, “Yemen Bourbon/Typica” and “Yemen SL-34” are in the upper left quarter. Only New-Yemen is on the lower left part of the graph, thus confirming its genetic singularity.
The average allele number per marker was 7.4 for the whole population. However, the “Ethiopian only” cluster had 7.0 alleles per marker while all the other clusters together had only 2.63 alleles per marker. “SL-17”, “Yemen Bourbon/Typica”, “Yemen SL-34” and “New-Yemen” clusters had 3.5, 2.4, 1.8 and 1.9 alleles per marker, respectively.
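The statistic reported above, the average number of alleles per marker within a group of samples, can be computed as in the following sketch. The allele calls and sample names are invented for illustration and the function name is hypothetical; the values printed are not those of the study.

```python
def mean_alleles_per_marker(calls, samples):
    """Average number of distinct alleles per marker within `samples`."""
    alleles = {}
    for sample in samples:
        for marker, observed in calls[sample].items():
            alleles.setdefault(marker, set()).update(observed)
    return sum(len(found) for found in alleles.values()) / len(alleles)


# Invented allele calls: sample -> marker -> set of fragment sizes (bp).
calls = {
    "sample_1": {"Sat-11": {143, 145}, "Sat-32": {119}},
    "sample_2": {"Sat-11": {143}, "Sat-32": {119, 125}},
    "sample_3": {"Sat-11": {145}, "Sat-32": {121}},
}

whole = mean_alleles_per_marker(calls, ["sample_1", "sample_2", "sample_3"])
subset = mean_alleles_per_marker(calls, ["sample_1", "sample_2"])
print(whole, subset)
```

Computing the figure separately for each cluster, as in the text, shows how much of the overall allelic richness is carried by a given group of samples.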
The DA based on the coordinates on the first five axes confirmed the statistical difference between the clusters, as the Wilks' Lambda test was highly significant (P < 0.0001). The overall rate of correct classification of samples to genetic clusters through cross-validation averaged 91%, and was 100% for the “New-Yemen” cluster.
The detailed list of samples with their sample
category and attributed genetic cluster is provided in
supplemental data (Table S1).
Discussion
To the best of our knowledge, our study is the first ever
to zoom in on the genetic diversity of C. arabica in
Yemen. Unveiling this genetic diversity enables a
better understanding of the genetic diversity of the
coffee varieties cultivated worldwide.
The relevance of the set of SSR markers used in the present study for exploring the genetic diversity of C. arabica was confirmed, as established by Pruvot-Woehl et al. (2020). This was also in agreement with the usefulness of SSRs in genetic studies (Hodel et al. 2016). The set of 8 SSR markers is highly polymorphic, with 7.4 alleles per marker in our study. Pruvot-Woehl et al. (2020) reported 11.9 alleles per marker for the same eight markers. However, introgressed varieties (descending from interspecific crosses with C. canephora or C. liberica) were included in their studied population. Da Silva et al. (2019) also included introgressed varieties and even other Coffea species in their study and had 6.9 alleles per marker (30 SSRs), hence fewer than in the present study. Benti et al. (2020) found 7.5 alleles per marker (14 SSRs) for a population of 40 Ethiopian Arabica varieties, two of which were introgressed varieties.

Fig. 2 Graph on the first two axes of the principal coordinates analysis (PCoA) based on the dissimilarity matrix involving the 137 samples of the study. (Color figure online)
The DA confirmed the robustness of the genetic clusters, notably that of the “New-Yemen” cluster.
Yemen holds most of the C. arabica genetic diversity known outside of Ethiopia
Our study confirmed previous knowledge based on history (de la Roque 1716; Ukers 1922; Chevalier 1929; Cramer 1957; Haarer 1958; Meyer 1965; Koehler 2017) and past genetic studies (Lashermes et al. 1996; Anthony et al. 2002; Silvestrini et al. 2007; Pruvot-Woehl et al. 2020; Scalabrin et al. 2020) that the vast majority of the coffee varieties cultivated outside Ethiopia transited through Yemen. However, the detailed genetic structure of the coffee trees cultivated in Yemen and the connection of this detailed genetic diversity to the varieties cultivated worldwide were not known. We have shown in this study that there are at least three distinct C. arabica genetic clusters in Yemen. Those clusters are well separated, leading to three main hypotheses: (i) introduction of several populations, each with a narrow genetic base, and selection over time of the most adapted populations, (ii) introduction of one or several populations with a larger genetic base and selection in Yemen, within those populations, of the different genetic clusters observed today or (iii) reintroduction in Yemen of the varieties selected worldwide.
Our data could support either of the first two hypotheses, or a combination of the two. The third hypothesis cannot be dismissed but is unlikely because there is no historical record of such a re-introduction of all major coffee varieties back to Yemen.
The Yemen Typica-Bourbon cluster encompasses most of the important varieties cultivated worldwide. Hence, Yemen today is still holding most of the genetic diversity that it delivered to the world 300 years ago. Moreover, Yemen also hosts a unique specific genetic diversity. Indeed, no worldwide cultivated variety in our study belongs to the New-Yemen cluster, meaning that either it spread out of Yemen in the eighteenth century but was lost or counter-selected en route, or it simply never left Yemen.
There was no correlation between the genetic clusters and the name of coffee cultivars given by Yemeni coffee farmers. This was in line with the findings of Hussein et al. (2017). In fact, most of the given names are related to some obvious visual characteristics that are not dependent on the precise genetic background. For instance, taller coffees would be called Udaini, Jufaini and Jadi while more compact trees would be called Dawairi and Tufahi (Sheibani, own observation). The discrepancy between given names and actual genetic fingerprint has been precisely shown by Pruvot-Woehl et al. (2020) in various parts of the world.
Origin and history of the C. arabica coffee varieties cultivated worldwide
None of the Ethiopian accessions in our study clustered with the Yemeni accessions. However, the Ethiopian accessions sampled are not representative of all the possible existing genetic diversity in Ethiopia (FAO 1968; Charrier 1978; Scalabrin et al. 2020). Furthermore, the only available Ethiopian germplasm, which was used in this study, comes from two surveys made some 50 years ago that did not cover the South Sudan region and the East Ethiopian Hararghe coffee zones. Scalabrin et al. (2020) suggest that there has been a general West–East movement of C. arabica from South West Ethiopia/South Sudan towards the Ethiopian East part of the Great Rift, then to Hararghe Ethiopian coffee zones, then to Yemen and then to the world. Montagnon and Bouharmont (1996) were the first to highlight a genetic difference between the Ethiopian accessions West and East of the Great Rift Valley. Whether the East of the Great Rift was home to wild C. arabica or whether it was a first place of domestication with wild C. arabica coming from the Western forests remains an open question. Further East, Hararghe zones were planted with coffee trees coming from the West and/or Eastern parts of the Great Rift Valley. Hararghe coffee could be related to coffee planted in Yemen as there were intense trade relationships between the two regions (Haarer 1958). Scalabrin et al. (2020), focusing on Ethiopian accessions, identified three main genetic groups: the “Jimma-Bonga” and “Sheka” groups were made of Ethiopian accessions from the Western part of the coffee areas in Ethiopia while the third group, called “Harar-Yemen”, was more closely related to the worldwide cultivars and the Yemeni samples that were part of their study. Our study, through focusing on the Yemen coffee genetic diversity, indicated that Yemen accessions are made of genetically distinct clusters and that their likely origin in Ethiopia is not a single one. Under this scenario, studying more Ethiopian samples coming from the Eastern part of the Great Rift Valley and from Hararghe is a priority to better understand the origin of the Yemeni coffee genetic landscape.
The worldwide cultivars found in the Ethiopian Only cluster of our study represent coffee seeds that did not pass through the Yemen route. Geisha (CATIE code T.02722) is a famous, well-documented example. It was brought out of South Western Ethiopia directly to Kenya in the early twentieth century (Koehler 2017). The Java variety is another example. It was selected from an “Abyssinian”, hence Ethiopian, population that traveled first to Indonesia and then to Cameroon, where the Java variety was finally selected (Bouharmont 1994). SL-06 was also found in the Ethiopian Only cluster, yet was first reported by Jones (1956) to be a single tree selection from Kent. As it clearly clusters with Ethiopian landraces and is genetically distinct from any Yemeni cluster, it is very unlikely that SL-06 was part of the seeds that transited to India and were at the origin of Kent. We cannot rule out potential mislabeling or mixing anywhere between the Kenyan research stations and the CATIE germplasm collection, so this “SL-06” might be unrelated to the original SL-06 Kenyan selection. Chiroso is a variety cultivated on a small scale in Colombia, famous for its cup quality. We show in our study that Chiroso is part of those Ethiopian landraces that “escaped” Ethiopia. It is very likely that the more DNA fingerprints are collected from varieties grown on a small scale with exceptional cup quality (Montagnon et al. 2019; Pruvot-Woehl et al. 2020), the more examples of Ethiopian landraces that bypassed the Yemen route will be found. Mibirizi CATIE code T.02702 is part of this “Ethiopian Only” cluster and is genetically very close to Ethiopian accession T.04620 of the CATIE collection. Another Mibirizi accession with CATIE code T.03622 is part of the SL-17 cluster and is genetically very different from Mibirizi with CATIE code T.02702. This might be due to mislabeling of the Mibirizi variety somewhere between East Africa and the CATIE collection. Given the limited information on Mibirizi (Leplae 1936), it is very unlikely that Mibirizi is an Ethiopian landrace and, if either of the two Mibirizi codes is correct, it would rather be the T.03622 code. However, the hypothesis cannot be discarded that several independent introductions in Central Africa, through trade or cultural contact, led to two genetic backgrounds of the material both called Mibirizi.
In our study, the SL-17 cluster includes four
Ethiopian accessions and no Yemeni accessions.
Hence, one hypothesis is that it transited through
Yemen but was not captured in the Yemen Qima
population either because representatives of this
cluster still exist but were not surveyed or because it
has disappeared in Yemen. The other hypothesis is that
it never transited through Yemen and was directly
introduced in East Africa from Ethiopia.
The Yemen SL-34 cluster, unlike the SL-17 cluster, is clearly represented in Yemen. The two worldwide cultivated varieties that are part of this cluster are SL-09 and SL-34. According to Jones (1956), SL-09 is of unknown origin while SL-34 is from the “French Mission” origin, whose geographical origin is either Central Africa, Bourbon Island or Yemen.
The Yemen Typica-Bourbon cluster represents the main early routes followed by the majority of C. arabica varieties cultivated worldwide. As reviewed by Montagnon et al. (2019) and Pruvot-Woehl et al. (2020), the two main coffee routes out of Yemen in the early eighteenth century were through Bourbon Island (today La Réunion) and India. The seeds that transited through Bourbon Island most likely represented a small fraction of the Yemen Typica-Bourbon cluster as they only gave rise to the Bourbon variety. The seeds that transited through India necessarily included a wider diversity of the Yemen Typica-Bourbon cluster as they gave rise to the Typica variety through the Indonesian route and then to the New World. This same population also gave rise to several varieties from this cluster that were first cultivated in India and then introduced to East Africa.
Finally, the New-Yemen cluster either never left Yemen or was lost en route. Regardless of which scenario is true, the fact is that it constitutes a unique genetic cluster not found in the cultivated varieties worldwide. The origins of this cluster in Ethiopia are unknown. Its ancestors in Ethiopia may have disappeared due to deforestation or genetic extinction. Similarly, it is also possible that its ancestors may exist in some populations in Ethiopia whose genetic diversity has not yet been studied or published.
Our study allows us to connect the dots and better understand the movement and spread of C. arabica around the world. Altogether, until today, three main routes were considered: (i) the Yemen-India route, (ii) the Yemen-Bourbon Island route and (iii) what we refer to as the “Escapee” route of some Ethiopian accessions that spread to the world without passing through Yemen. Our results confirm these three routes, with the importance of the Yemen Typica-Bourbon cluster for the first two routes. However, they also highlight another previously overlooked route: the direct Yemen-East African route in the late nineteenth/early twentieth century, which the SL-17 and the Yemen SL-34 clusters might well have followed. The results also confirm the need for further exploration of the genetic diversity in the Hararghe region as a priority to investigate any connections between the Yemen clusters and their ancestors in Ethiopia.
Hence, for the first time, we present here deeper
knowledge and a fuller picture of the genetic structure
of C. arabica in Yemen, the gateway country between
Ethiopia, the homeland of C. arabica coffee, and the
world of cultivated coffee varieties. The varieties out
of Yemen have been incredibly resilient: Bourbon and
Typica have been cultivated for 300 years. Mundo
Novo, Caturra or Catuai, all selected from the Yemen
Typica-Bourbon varieties, have been successful in
Brazil for more than 50 years now (Guerreiro Filho
et al. 2018). SL-28, also from the Yemen Typica-Bourbon
cluster, has been appreciated by Kenyan farmers for
almost a century. Only the introgressed varieties
originating from interspecific crosses with Robusta
have partly replaced the original Yemen-descended
varieties for their yield and (now often compromised)
resistance to coffee leaf rust (Zambolim 2016; Montagnon
et al. 2019). Most recently, F1 hybrids, which
take advantage of hybrid vigor, have shown
significant superiority over the traditional Yemen-descended
varieties (Georget et al. 2019; Marie
et al. 2020).
Breseghello and Coelho (2013) proposed a com-
prehensive review of events from crop domestication
to modern breeding of crops. Just after domestication,
‘‘the origin of crops’’, comes the intuitive farmer
selection, which is the ‘‘origin of landraces’’, followed
by pure line selection and mass selection, constituting
the ‘‘origin of cultivars’’. Later on, plant breeding
based on controlled mapping and possibly marker
assisted selection will take place. Our results indicate
that Yemen was at the very least a major origin of
landraces and cultivars for the world, forming a cluster
distinct from Ethiopian accessions. This general
observation was made in former studies (Montagnon
and Bouharmont 1996; Lashermes et al. 1996;
Anthony et al. 2002; Silvestrini et al. 2007;
Pruvot-Woehl et al. 2020; Scalabrin et al. 2020). However, the
present study goes deeper into the description of the
genetic diversity of C. arabica in Yemen and its
significance for the subsequent selection of cultivars
worldwide. This pattern of genetic distance between
wild and cultivated populations as the result of
domestication and early selection has been observed
in annual (Gepts 2004; Glaszmann et al. 2010) as well
as in perennial crops, namely tea (Camellia
sinensis (L.) Kuntze; Meegahakumbura et al. 2018),
peach (Prunus persica (L.) Batsch; Akagi et al. 2016)
and grapevine (Vitis vinifera L.; Riaz et al. 2018).
Gepts (2004) recalls that crop domestication is a
long-term selection experiment that has genetic con-
sequences and often decreases the genetic diversity
and the gene expression diversity (Flint-Garcia 2013;
Liu et al. 2019; Turner-Hissong et al. 2020). This in
turn offers a unique opportunity for breeders to
understand, identify and target genes for adaptation
(Ross-Ibarra et al. 2007; Glaszmann et al. 2010),
including for polygenic adaptation as recently shown
in cocoa, Theobroma cacao L. (Hämälä et al. 2020).
With our results, it is now possible to revisit and
fully explore Yemen’s coffee genetic diversity. Yemen’s
coffee lands have a harsh climate, with both high
and low temperatures at the extremes of the range found in coffee
growing areas worldwide, together with one of the
lowest rainfall levels. There is no doubt that this
environment has favored resilient varieties, not only
between the 1400s (when coffee was first introduced to Yemen)
and 1700s (when today’s main worldwide coffee
varieties were taken out of Yemen), but also during the
last 300 years of coffee cultivation and propagation.
We also unveil the New-Yemen cluster that has not
been observed anywhere else in the world so far. This
newly found genetic cluster represents a huge oppor-
tunity for the sustainability of the global coffee sector.
Indeed, addressing the effects of climate change in
coffee (Bunn et al. 2015; Davis et al. 2019) will partly
rely on new varieties adapted to extreme temperatures.
Yemen can offer the world of coffee several centuries
worth of genes selected under an extreme coffee climate. The
findings could not only provide the global coffee
community with a deeper exploration and understand-
ing of the genetic diversity at the origin of proven
successful varieties but also offer a completely new
genetic reservoir: the New-Yemen cluster.
In addition to the challenge of climate change, our
results offer the specialty coffee market new, unexplored
genetic diversity for cup quality, which can
significantly increase the diversity and sustainability
of the coffee sector (Montagnon et al. 2019).
Last but not least, while Yemen is one of the oldest
coffee growing countries, very little was known about
the country’s coffee genetic landscape. Our results
will be critical in guiding the selection of the best
planting material for the Yemeni coffee growers.
Conclusion
To the best of our knowledge, our study is the first to
unveil and describe the genetic diversity of C. arabica
in Yemen, a key center for the development of the
cultivated varieties around the world. We observed
three genetic clusters in Yemen. Hence, either coffee
was introduced in Yemen in a single event with a
genetic diversity covering the three Yemeni clusters,
or there were several independent introductions of
coffee in Yemen. Furthermore, we showed that the
major part of the genetic diversity of the coffee
cultivars is still present in Yemen today, 300 years after coffee
was propagated out of Yemen to be cultivated around
the world. However, one genetic cluster, the New-
Yemen cluster, was not known before our study as it
has not been observed elsewhere in the world, either
because it never left Yemen or because it was lost en
route. The New-Yemen cluster is not related to any
population in Ethiopia observed thus far from a
genetic point of view. Hence, either the genetic source
was lost in Ethiopia or it is related to populations that
have not yet been included in genetic diversity
analyses. This work paves the way to new research
opportunities, namely the search for adaptation genes
in the Yemeni coffee gene pool.
Acknowledgements The authors would like to thank Dr Jane
Cheserek from the Kenya Agricultural & Livestock Research
Organization who found and provided us with the original
article of Jones (1956) ‘‘Notes on the varieties of Coffea arabica
in Kenya’’, as well as two anonymous reviewers for their useful
comments. The authors would also like to thank the Ministry of
Agriculture and Irrigation of Yemen for their support of the
work and the coffee farmers of Yemen.
Author contributions CM wrote the paper and did all the
genetic analysis, MA oversaw the breeding populations in
Yemen, WS provided data for Ethiopian accessions and
participated in the interpretation/discussion of data, FS had the
idea of the study, prepared the samples from Yemen and
participated in the discussion and writing of the paper.
Funding Qima Coffee.
Data availability Data is available on demand.
Compliance with ethical standards
Conflict of interest The authors declare that they have no
conflict of interest.
Open Access This article is licensed under a Creative Com-
mons Attribution 4.0 International License, which permits use,
sharing, adaptation, distribution and reproduction in any med-
ium or format, as long as you give appropriate credit to the
original author(s) and the source, provide a link to the Creative
Commons licence, and indicate if changes were made. The
images or other third party material in this article are included in
the article’s Creative Commons licence, unless indicated
otherwise in a credit line to the material. If material is not
included in the article’s Creative Commons licence and your
intended use is not permitted by statutory regulation or exceeds
the permitted use, you will need to obtain permission directly
from the copyright holder. To view a copy of this licence, visit
http://creativecommons.org/licenses/by/4.0/.
References
Addinsoft (2020). XLSTAT statistical and data analysis solu-
tion. New York. USA. https://www.xlstat.com. Accessed
01 July 2020
Adhikari S, Saha S, Biswas A, Rana TS, Bandyopadhyay TK,
Ghosh P (2017) Application of molecular markers in plant
genome analysis: a review. Nucleus 60:283–297. https://
doi.org/10.1007/s13237-017-0214-7
Akagi T, Hanada T, Yaegaki H, Gradziel TM, Tao R (2016)
Genome-wide view of genetic diversity reveals paths of
selection and cultivar differentiation in peach domestica-
tion. DNA Res 23:271–282. https://doi.org/10.1093/
dnares/dsw014
Alkimim ER, Caixeta ET, Sousa TV, Resende MDV, da Silva
FL, Sakiyama NS, Zambolim L (2020) Selective efficiency
of genome-wide selection in Coffea canephora breeding.
Tree Genet Gen
Al-Murish TM, Elshafei AA, Al-Doss AA, Barakat MN (2013)
Genetic diversity of coffee (Coffea arabica L.) in Yemen
via SRAP, TRAP and SSR markers. J Food Agric Environ
11:411–416
Anthony F, Combes MC, Astorga C, Bertrand B, Graziosi G,
Lashermes P (2002) The origin of cultivated Coffea ara-
bica L. varieties revealed by AFLP and SSR markers.
Theor Appl Genet 104:894–900. https://doi.org/10.1007/
s00122-001-0798-8
Avelino J, Cristancho M, Georgiou S et al (2015) The coffee rust
crises in Colombia and Central America (2008–2013):
impacts, plausible causes and proposed solutions. Food
Secur 7:303–321. https://doi.org/10.1007/s12571-015-
0446-9
Benti T, Gebre E, Tesfaye K, Berecha G, Lashermes P, Kyallo
M, Kouadio Yao N (2020) Genetic diversity among com-
mercial arabica coffee (Coffea arabica L.) varieties in
Ethiopia using simple sequence repeat markers. J Crop
Improv. https://doi.org/10.1080/15427528.2020.1803169
Bouharmont P (1994) La variété Java: un caféier arabica
sélectionné au Cameroun. Plantations, recherche,
développement 1:38–45
Breseghello F, Coelho ASG (2013) Traditional and modern
plant breeding methods with examples in rice (Oryza sativa
L.). J Agric Food Chem 61:8277–8286. https://doi.org/10.
1021/jf305531j
Browning D (2018). How many coffee farms are there in the
world?. ASIC Conference Portland, 2018/09/16–20 https://
www.youtube.com/watch?v=vKaeDkpqPSg, Accessed 01
July 2020
Bunn C, Läderach P, Jimenez JGP, Montagnon C, Schilling T
(2015) Multiclass classification of agro-ecological zones
for Arabica coffee: an improved understanding of the
impacts of climate change. PLoS ONE. https://doi.org/10.
1371/journal.pone.0140490
Carvalho A, Fazuoli LC, Medina Filho HP (1984) Effects of
X-radiation on the induction of mutations in Coffea ara-
bica. Bragantia 43:553–567
Charrier A (1978). Etude de la structure et de la variabilité
génétique des caféiers : Résultats des études et des
expérimentations réalisées au Cameroun, en Côte d’Ivoire
et à Madagascar sur l’espèce Coffea arabica L. collectée en
Ethiopie par une mission Orstom en 1966. Bulletin IFCC n°
14, Paris, FRA.
Chevalier A (1929) Les caféiers du globe. I. Généralités sur les
caféiers, Encyclopédie biologique, Paul Lechevalier, Paris
Clevenger J, Chavarro C, Pearl SA, Ozias-Akins P, Jackson SA
(2015) Single nucleotide polymorphism identification in
polyploids: a review, example, and recommendations. Mol
plant 8:831–846. https://doi.org/10.1016/j.molp.2015.02.
002
Combes MC, Andrzejewski S, Anthony F, Bertrand B, Rovelli
P, Graziosi G, Lashermes P (2000) Characterization of
microsatellite loci in Coffea arabica and related coffee
species. Mol Ecol 9:1178–1180. https://doi.org/10.1046/j.
1365-294x.2000.00954-5.x
Cramer P J S (1957). A Review of Literature of Coffee Research
in Indonesia (from about 1602 to 1945). IICA.
da Silva BSR, Sant’Ana GC, et al (2019) Population structure and
genetic relationships between Ethiopian and Brazilian
Coffea arabica genotypes revealed by SSR markers.
Genetica 147:205–216. https://doi.org/10.1007/s10709-
019-00064-4
Davis AP, Chadburn H, Moat J, O’Sullivan R, Hargreaves S,
Lughadha EN (2019) High extinction risk for wild coffee
species and implications for coffee sector sustainability.
Sci Adv. https://doi.org/10.1126/sciadv.aav3473
De La Roque J (1716) Voyage de l’Arabie heureuse, par l’Océan
oriental, et le détroit de la Mer Rouge : fait par les François
pour la première fois, dans les années 1708, 1709 et 1710.
André Cailleau, Paris. https://play.google.com/books/reader?id=3fsOAAAAQAAJ&hl=fr&num=10&printsec=frontcover&pg=GBS.PP7.
Accessed 01 July 2020.
Emanuelli F, Lorenzi S, Grzeskowiak L et al. (2013). Genetic
diversity and population structure assessed by SSR and
SNP markers in a large germplasm collection of
grape. BMC Plant Biol. http://www.biomedcentral.com/1471-2229/13/39
FAO (1968) FAO coffee mission to Ethiopia: 1964–1965. FAO,
Rome
Fernie LM (1970) The improvement of arabica coffee in East
Africa. Crop Improvement in East Africa. Techn, Comm,
p19
Flint-Garcia SA (2013) Genetics and consequences of crop
domestication. J Agric Food Chem 61:8267–8276. https://
doi.org/10.1021/jf305511d
Georget F, Marie L, Alpizar E et al (2019) Starmaya: The first
arabica F1 coffee hybrid produced using genetic male
sterility. Front Plant Sci 10:1344. https://doi.org/10.3389/fpls.2019.01344
Gepts P (2004) Crop domestication as a long-term selection
experiment. Plant Breed Rev 24:1–44
Glaszmann JC, Kilian B, Upadhyaya HD, Varshney RK (2010)
Accessing genetic diversity for crop improvement. Curr
Opin Plant Biol 13:167–173. https://doi.org/10.1016/j.pbi.
2010.01.004
Guerreiro Filho O, Ramalho MAP, Andrade VT (2018) Alcides
Carvalho and the selection of Catuaí cultivar: interpreting
the past and drawing lessons for the future. Crop Breed and
Appl Biotechnol 18:460–466
Haarer AE (1958) Modern coffee production. Ebenezer Baylis
and Son, The Trinity Press, London (UK)
Hämälä T, Guiltinan MJ, Marden JH, Maximova SN, dePamphilis
CW, Tiffin P (2020) Gene expression modularity
reveals footprints of polygenic adaptation in Theobroma
cacao. Mol Biol Evol 37:110–123. https://doi.org/10.1093/
molbev/msz206
Harlan JR (1969) Ethiopia: a center of diversity. Econ Bot
23:309–314
Hodel RG, Segovia-Salcedo MC, Landi JB et al (2016) The
report of my death was an exaggeration: a review for
researchers using microsatellites in the 21st century. Appl
Plant Sci. https://doi.org/10.3732/apps.1600025
Hussein MAA, Al-Azab AAA, Habib SS, El Sherif FM, El-
Garhy HA (2017) Genetic diversity, structure and DNA
fingerprint for developing molecular IDs of Yemeni coffee
(Coffea Arabica L.) Germplasm assessed by SSR Markers.
Egypt J Plant Breed 203:1–25
Jones P A (1956). Notes on the varieties of Coffea arabica in
Kenya. Coffee Board of Kenya Monthly Bulletin,
November 1956.
Koehler J (2017) Where the wild coffee grows: The untold story
of coffee from the cloud forests of Ethiopia to your cup.
Bloomsbury Publishing, USA
Kushalappa AC, Eskes AB (1989) Advances in coffee rust
research. Annu Rev Phytopathol 27:503–531. https://doi.
org/10.1146/annurev.py.27.090189.002443
Lashermes P, Trouslot P, Anthony F, Combes M, Charrier A
(1996) Genetic diversity for RAPD markers between cul-
tivated and wild accessions of Coffea arabica. Euphytica
87:59–64
Leplae E (1936) Les plantations de café au Congo belge, leur
histoire, 1881–1935, leur importance actuelle. Van Cam-
penhout G, Bruxelles
Liu W, Chen L, Zhang S et al (2019) Decrease of gene
expression diversity during domestication of animals and
plants. BMC Evol Biol. https://doi.org/10.1186/s12862-
018-1340-9
Marie L, Abdallah C, Campa C et al (2020) G × E interactions
on yield and quality in Coffea arabica: new F1 hybrids
outperform American cultivars. Euphytica 216:1–17.
https://doi.org/10.1007/s10681-020-02608-8
Meegahakumbura MK, Wambulwa MC, Li MM et al (2018)
Domestication origin and breeding history of the tea plant
(Camellia sinensis) in China and India based on nuclear
microsatellites and cpDNA sequence data. Front Plant Sci.
https://doi.org/10.3389/fpls.2017.02270
Meyer FG (1965) Notes on wild Coffea arabica from South-
western Ethiopia, with some historical considerations.
Econ Bot 19:136–151
Montagnon C, Bouharmont P (1996) Multivariate analysis of
phenotypic diversity of Coffea arabica. Genet Resour Crop
Evol 43:221–227
Montagnon C, Marraccini P, Bertrand B (2019) Breeding for
coffee quality. In: Oberthur et al. (eds) Specialty Coffee-
Managing Quality. Cropster Innsbruck, Austria,
pp 109–143
Perrier X, Jacquemoud-Collet J P (2006). DARwin software.
http://darwin.cirad.fr/darwin
Pruvot-Woehl S, Krishnan S, Solano W, Schilling T, Toniutti L,
Bertrand B, Montagnon C (2020) Authentication of Coffea
arabica varieties through DNA fingerprinting and its sig-
nificance for the coffee sector. J AOAC Int 103:325–334.
https://doi.org/10.1093/jaocint/qsz003
Riaz S, De Lorenzis G, Velasco D et al (2018) Genetic diversity
analysis of cultivated and wild grapevine (Vitis vinifera L.)
accessions around the Mediterranean basin and Central
Asia. BMC Plant Biol. https://doi.org/10.1186/s12870-
018-1351-0
Ross-Ibarra J, Morrell PL, Gaut BS (2007) Plant domestication,
a unique opportunity to identify the genetic basis of
adaptation. Proc Natl Acad Sci 104:8641–8648. https://doi.
org/10.1073/pnas.0700643104
Saitou N, Nei M (1987) The Neighbor-Joining method: a new
method for reconstructing phylogenetic trees. Mol Biol
Evol 4:406–425. https://doi.org/10.1093/oxfordjournals.
molbev.a040454
Scalabrin S, Toniutti L, Di Gaspero G et al. (2020). A single
polyploidization event at the origin of the tetraploid gen-
ome of Coffea arabica is responsible for the extremely low
genetic variation in wild and cultivated germplasm. Sci-
entific Reports.
Silvestrini M, Junqueira MG, Favarin AC, Guerreiro-Filho O,
Maluf MP, Silvarolla MB, Colombo CA (2007) Genetic
diversity and structure of Ethiopian, Yemen and Brazilian
Coffea arabica L. accessions using microsatellites mark-
ers. Genet Resour Crop Evol 54:1367–1379. https://doi.
org/10.1007/s10722-006-9122-4
Singh N, Choudhury DR, Singh AK et al (2013) Comparison of
SSR and SNP markers in estimation of genetic diversity
and population structure of Indian rice varieties. PLoS
ONE. https://doi.org/10.1371/journal.pone.0084136
Sylvain PG (1958) Ethiopian coffee—its significance to world
coffee problems. Econ Bot 12:111–139
Thomas AS (1942) The wild Arabica coffee on the Boma Pla-
teau. Anglo-Egyptian Sudan Emp J Exp Agric 10:207–212
Tomassone R, Danzart M, Daudin J J and Masson J P
(1988). Discrimination et classement. Masson.
Turner-Hissong SD, Mabry ME, Beissinger TM, Ross-Ibarra J,
Pires JC (2020) Evolutionary insights into plant breeding.
Curr Opin Plant Biol 54:93–100. https://doi.org/10.1016/j.
pbi.2020.03.003
Ukers MA (1922) All about coffee. The Tea and Coffee Trade
Journal, New York
Van der Vossen H, Bertrand B, Charrier A (2015) Next gener-
ation variety development for sustainable production of
arabica coffee (Coffea arabica L.): a review. Euphytica
204:243–256. https://doi.org/10.1007/s10681-015-1398-z
Zambolim L (2016) Current status and management of coffee
leaf rust in Brazil. Trop Plant Pathol 41:1–8. https://doi.
org/10.1007/s40858-016-0065-9
Publisher’s Note Springer Nature remains neutral with
regard to jurisdictional claims in published maps and
institutional affiliations.
2422 Genet Resour Crop Evol (2021) 68:2411–2422
... Previous research has confirmed substantial genetic variation across Arabica coffee cultivars in Saudi Arabian and Yemeni varietals [11,[39][40][41]. It was documented that most Yemeni coffee varieties originated from ancient "heirloom" cultivars of C. arabica that were initially naturalized centuries ago [41,42]. ...
... Previous research has confirmed substantial genetic variation across Arabica coffee cultivars in Saudi Arabian and Yemeni varietals [11,[39][40][41]. It was documented that most Yemeni coffee varieties originated from ancient "heirloom" cultivars of C. arabica that were initially naturalized centuries ago [41,42]. The findings of the current study and several others addressing genetic diversity in Arabica coffee cultivars in Saudi Arabia and Yemen bolster the proposition that the Arabian Peninsula constitutes the most significant hub of coffee diversity beyond the species' original center in Ethiopia and South Sudan [6,41,42]. ...
... It was documented that most Yemeni coffee varieties originated from ancient "heirloom" cultivars of C. arabica that were initially naturalized centuries ago [41,42]. The findings of the current study and several others addressing genetic diversity in Arabica coffee cultivars in Saudi Arabia and Yemen bolster the proposition that the Arabian Peninsula constitutes the most significant hub of coffee diversity beyond the species' original center in Ethiopia and South Sudan [6,41,42]. ...
Article
Full-text available
The biodiversity of 12 coffee (Coffea arabica L.) cultivars collected from the Al-Baha region in the southwest of Saudi Arabia was evaluated using 25 morphological variations and genetic diversity as demonstrated by molecular polymorphism generated by eight Inter Simple Sequence Repeats (ISSRs) and nine Start Codon Targeted (SCoT) primers. Substantial variations were scored in the morphological traits reflected in the clustering of the examined cultivars in PCA of the coffee cultivars. The examined cultivars were grouped in two groups, one included the cultivars coded Y5, Y6, R113, and Y7 and the other group comprised two clusters; one comprised cultivars coded R8 and R4 and the other comprised cultivars R112, R114, and Y2. In the meantime, the cultivars coded R9 and R111 were differentiated together from other cultivars, while the Y3 cultivar was confirmed by the analysis of ISSR data and SCoT data, which also support the grouping of R9 and R111 cultivars. Principle Component Analysis (PCA) of morphological, ISSR, and SCoT data as a combined set differentiated the examined species into four groups in a scatter plot in agreement with their separation in the cluster trees. The diversity profile among the examined C. arabica cultivars proved that R111 and R4 cultivars are highly diverse, while R8 and Y5 cultivars exhibit low diversity. Alpha diversity indices indicated that R9 and R111 cultivars are the most dominant and stable C. arabica cultivars among the examined samples in the study region.
... Indeed, while the species' native habitat covers Ethiopia and South Sudan in Africa (Meyer 1965;Davis et al. 2012;Koehler 2017;Krishnan et al. 2021), the major center of domestication from the fifteenth century onwards was Yemen, part of the Asian continent, albeit very close to Ethiopia. Yemen designated as primary domestication center was first based on historical data (De la Roque 1716 ;Ukers 1922;Chevalier 1929;Cramer 1957;Haarer 1958;Meyer 1965;Tuchscherer 2001;Koehler 2017) and then confirmed by genetic research (Anthony et al. 2001(Anthony et al. , 2002Silvestrini et al. 2007;Scalabrin et al. 2020;Krishnan et al. 2021;Montagnon et al. 2021Montagnon et al. , 2022b. India was the second country outside Africa to host C. arabica from the end of the seventeenth century. ...
... A set of 10 SRR markers was used (Table 2). SSR markers are suitable for studying the genetic diversity of plants in general (Testolin et al. 2023) and our SSR marker set has proved its discriminatory power for C. arabica in several investigations (Combes et al. 2000;Pruvot-Woehl et al. 2020;Krishnan et al. 2021;Montagnon et al. 2021Montagnon et al. , 2022a as well as C. canephora (Montagnon, unpublished data). PCR was performed in a final volume of 15 µl, comprising 30 ng of genomic DNA and 7.5 µl of 2xPCR buffer (Type-it Microsatellite PCR Kit, Qiagen, Hilden, Germany) and 1 µl each of forward and reverse primers (10 µM). ...
... As in previous research on C. arabica based on SSR markers (Pruvot-woehl et al. 2020;Krishnan et al. 2021;Montagnon et al. 2021Montagnon et al. , 2022a, and in order to take into account the tetraploidy of C. arabica, the phenotype and not the genotype of the alleles is scored by considering these markers as dominant markers. Indeed, an individual showing the AB phenotype (A and B being two alleles) for a given marker could correspond to genotypes AABB, ABAB, AAAB or ABBB. ...
... The Caturra Chiroso variety (small size) is non-derived/unrelated to the old Caturra variety. It means that Caturra Chiroso is not a mutation of Caturra but rather an Ethiopian landrace, an exclusive wild Ethiopian accession non-worldwide cultivated-Ethiopian-Only genetic cluster -, previously determined using microsatellite markers (Montagnon et al. 2021). Montagnon et al. (2021) included Caturra Chiroso in that analysis as another ingroup, without knowing its common name in the Urrao region, naming it as simply "Chiroso". ...
... It means that Caturra Chiroso is not a mutation of Caturra but rather an Ethiopian landrace, an exclusive wild Ethiopian accession non-worldwide cultivated-Ethiopian-Only genetic cluster -, previously determined using microsatellite markers (Montagnon et al. 2021). Montagnon et al. (2021) included Caturra Chiroso in that analysis as another ingroup, without knowing its common name in the Urrao region, naming it as simply "Chiroso". Interestingly, outside Ethiopia, few Ethiopian accessions were known, and none were previously known for their low stature, being Caturra Chiroso and Chiroso the first ones. ...
Article
Full-text available
Varieties represent a defined group with differentiated characteristics derived through natural selection and/or selective breeding from within a species. In the Central-Andean region of Colombia (Urrao) there are three endemic varieties of the species Coffea arabica L ["Caturra Chiroso" (CCH), "Bourbon Chiroso" (BCH), and "Chiroso" (CHCH)], known as "Chiroso" group, globally renowned for their high quality and distinctive cup profile. Despite its significance, there is a lack of reported genomic resources or basic biological information for these. In this study, we conducted the first assembly and characterization of the complete chloroplast (Cp) genomes of these varieties and reconstructed their ancestry relationships. The Cp genomes were 155,188 bp in length (A = 30.93%; C = 19.06%; G = 18.37%; T = 31.64%); containing 131 genes, comprising 86 protein-coding genes, 8 rRNA genes, and 37 tRNA genes. They consisted of four subregions: the large single-copy (LSC) region (85,159 bp; 83 genes), the short single-copy (SSC) region (18,136 bp; 12 genes), and the inverted repeats IRA (25,944 bp; 18 genes) and IRB (25,945 bp; 18 genes). Likewise, among 26 intraspecific varieties analyzed , CCH + BCH formed a unique haplotype, and CHCH + Bourbon + Caturra formed another. CCH and BCH featured an exclusive Cytosine mutation (SNP: C/A), position 47,413 bp (intergenic spacer region trnT(UGU)-trnL(UAA)]. Likewise, a total of 445 short tandem repeats were found in the Cp genomes (dinucleotides: 370; trinucleotides: 71; tetranucleotides: 1). Finally, the three formed a well-supported monophyletic group with conspecific varieties, being more closely related to Eastern Ethiopian-origin varieties [e.g. Berbere region], as well as with traditional ones like Typica, Bourbon, and Caturra. These coffee varieties are a valuable new genetic resource for use as a gene source for genetic improvement, biotechnology, direct exploitation and cultivation worldwide.
... Specialty coffee samples (> 80 points in the Specialty Coffee Association of America (SCAA) scale) that were harvested in 2020 from Yemen (n = 124) were ethically sourced by Qima Coffee from smallholder coffee farmers in five regions of this country (Al Mahwit, Dhamar, Ibb, Sa'dah, and Sana'a) following the guidelines and regulations included in The Coffee Guide by the International Trade Centre and the International Coffee Organization ICC-102-9 Rules on Certificates of Origin. These samples were processed using the Natural post-harvest method and included genetic varieties from the Typica Bourbon group (SL-28, SL-34, Kent) and the recently described New-Yemen group (Yemenia) [7,45]. In addition, specialty coffee samples from Africa, Asia, Central America, South America, and Oceania (n = 97) processed using Natural, Honey, and Washed post-harvest methods were acquired. ...
... Despite the role of Yemeni coffee in the history of coffee and its unique attributes, quality, and "terroir," there has been very limited scientific work to explore the chemical distinctiveness of Yemeni coffee in comparison to other origins [11][12][13]. Some studies which delved into the genetic makeup of Yemeni coffee plants have unveiled a tapestry of diversity that sets Yemen apart from all other coffee-growing nations [7,8][14][15][16]. This study, incorporating over 200 samples from major coffee regions worldwide, is a pioneering endeavor to elucidate the discriminative capabilities of NIR spectra of whole green coffee beans from Yemen and other regions, including Africa, Asia, Central America, and South America. ...
Article
Full-text available
Yemeni smallholder coffee farmers face several challenges, including the ongoing civil conflict, limited rainfall levels for irrigation, and a lack of post-harvest processing infrastructure. Decades of political instability have affected the quality, accessibility, and reputation of Yemeni coffee beans. Despite these challenges, Yemeni coffee is highly valued for its unique flavor profile and is considered one of the most valuable coffees in the world. Due to its exclusive nature and perceived value, it is also a prime target for food fraud and adulteration. This is the first study to identify the potential of Near Infrared Spectroscopy and chemometrics—more specifically, the discriminant analysis (PCA-LDA)—as a promising, fast, and cost-effective tool for the traceability of Yemeni coffee and sustainability of the Yemeni coffee sector. The NIR spectral signatures of whole green coffee beans from Yemeni regions (n = 124; Al Mahwit, Dhamar, Ibb, Sa’dah, and Sana’a) and other origins (n = 97) were discriminated with accuracy, sensitivity, and specificity ≥ 98% using PCA-LDA models. These results show that the chemical composition of green coffee and other factors captured on the spectral signatures can influence the discrimination of the geographical origin, a crucial component of coffee valuation in the international markets.
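The accuracy, sensitivity, and specificity figures mentioned in this abstract are standard confusion-matrix summaries. The sketch below shows how they are usually computed for a two-class origin problem. The labels and predictions are invented toy values, not data from the study, and the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(y_true, y_pred, positive="Yemen"):
    """Return (accuracy, sensitivity, specificity) for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)   # share of true Yemeni samples recovered
    specificity = tn / (tn + fp)   # share of non-Yemeni samples correctly rejected
    return accuracy, sensitivity, specificity


# Toy example (invented): 5 Yemeni and 5 other-origin samples, one Yemeni sample missed.
y_true = ["Yemen"] * 5 + ["Other"] * 5
y_pred = ["Yemen"] * 4 + ["Other"] + ["Other"] * 5

accuracy, sensitivity, specificity = binary_metrics(y_true, y_pred)
print(accuracy, sensitivity, specificity)
```

In a real workflow the predictions would come from a classifier such as the PCA-LDA models named in the abstract, evaluated on samples held out from training.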
... Even though the genetic variability in Arabica coffee is low compared to many crop wild relatives [18], Ethiopian coffee has a wider genetic base than coffee grown elsewhere in the world (Box 1) [19,21]. Moreover, the spatial variation in genetic composition across Ethiopian landscapes is complex, reflecting a coffee management history involving processes such as isolated traditional selection, trade, spread of coffee into reforested areas after depopulation due to wars, and current management intensification and spread of modern cultivars [20,22]. ...
... Moreover, the risk of losing genetic diversity of wild coffee through introgression of genes from domesticated populations would be reduced also, as it is unlikely that introgressed genes will be selected for across multiple environmental gradients (see also e.g. ref. 21). As long as such large areas exist, the need to put a ban on the use of improved varieties outside the forests in the same landscape will be of lower priority. ...
Article
Full-text available
The reality for conservation of biodiversity across our planet is that all ecosystems are modified by humans in some way or another. Thus, biodiversity conservation needs to be implemented in multifunctional landscapes. In this paper we use a fascinating coffee-dominated landscape in southwest Ethiopia as our lens to derive general lessons for biodiversity conservation in a post-wild world. Considering a hierarchy of scales from genes to multi-species interactions and social-ecological system contexts, we focus on (i) threats to the genetic diversity of crop wild relatives, (ii) the mechanisms behind trade-offs between biodiversity and agricultural yields, (iii) underexplored species interactions suppressing pest and disease levels, (iv) how the interactions of climate change and land-use change sometimes provide opportunities for restoration, and finally, (v) how to work closely with stakeholders to identify scenarios for sustainable development. The story on how the ecology and evolution of coffee within its indigenous distribution shape biodiversity conservation from genes to social-ecological systems can inspire us to view other landscapes with fresh eyes. The ubiquitous presence of human-nature interactions demands proactive, creative solutions to foster biodiversity conservation not only in remote protected areas but across entire landscapes inhabited by people.
... The Moka phenotype may originate from punctual mutations in Typica, similar to Caturra being a dwarf mutant of "tall" Bourbon (World Coffee Research, 2019), also indistinguishable with our markers. The term "Mocha" was historically used for Yemeni coffee, which was described as small-beaned and of superior quality (Haarer, 1923; Ukers, 1922) but is genetically diverse (Montagnon et al., 2021). Guadeloupean Moka is not likely closely related to Yemeni accessions. ...
Article
Full-text available
Societal Impact Statement: Despite strong historical declines, Guadeloupe and Haiti's coffee sectors remain important to rural communities' livelihood and resilience. Coffee also holds value as part of the islands' historical legacy and cultural identities. Furthermore, it is often grown in agroforestry systems providing important ecosystem services, which will become more important as these vulnerable islands work to adapt to a changing climate. Current efforts to revitalize coffee farms and target strategically important specialty markets would benefit from understanding existing genetic resources and the historical factors that shaped them. Our study reveals the rich history reflected in current coffee stands on the islands. Summary: The West Indies, particularly former French colonies like Haiti and Guadeloupe, were central to the spread of coffee in the Americas. The histories of these islands were shared until the 19th century, when they diverged significantly. Still, both islands experienced a strong decline in their coffee sector. Characterizing the genetic and varietal diversity of their coffee resources and understanding historical factors shaping them can help support revitalization efforts. To that end, we performed Kompetitive Allele-Specific PCR (KASP) genotyping of 80 informative single nucleotide polymorphism (SNP) markers on field samples from across the main coffee-growing region of Guadeloupe, and two historically important ones in Haiti, as well as 146 reference accessions from international collections. We also compared bioclimatic variables from sampled geographic areas and searched for historical determinants of present coffee resources. At least five Coffea arabica varietal groups were found in Haiti, versus two in Guadeloupe, with admixed individuals in both. The traditional Typica variety is still present in both islands, growing across a variety of climatic environments. 
We also found Coffea canephora on both islands, with multiple likely origins, and identified C. liberica var. liberica in Guadeloupe. These differences are explained by the Islands' respective histories. Overall, Guadeloupe experienced fewer, but older introductions of non‐Typica coffee. By contrast, several recent introductions have taken place in Haiti, driven by local and global factors and reflecting the history of Arabica varietal development and spread. Diversity on these islands is dynamic, and our results reveal opportunities and limits to the future of Guadeloupean and Haitian coffee.
... These persistent paralogs are more apparent in Coffea arabica and Zea mays (Fig. 7). Since these species have undergone recent domestication (85,86), it may be possible that domestication has hindered efficient gene shedding or promoted paralog retention (87). ...
Preprint
Multiple rounds of whole genome duplication (WGD) followed by re-diploidization have occurred throughout the evolutionary history of angiosperms. To understand why these cycles occur, much work has been done to model the genomic consequences and evolutionary significance of WGD. Since the machinations of diploidization are strongly influenced by the mode of speciation (whether a lineage was derived from ancient allo or autopolyploid), methods which can classify ancient whole genome duplication events as allo or auto are of great importance. Here we present a forward-time polyploid genome evolution simulator called SpecKS. Using extensive simulations, we demonstrate that allo and autopolyploid-derived species exhibit differently shaped Ks histograms. We also demonstrate sensitivity of the Ks histogram to the effective population size (Ne) of the ancestral species. Our findings indicate that error in the common method of estimating WGD time from the Ks histogram peak scales with the degree of allopolyploidy, and we present an alternative, accurate estimation method that is independent of the degree of allopolyploidy. Lastly, we use SpecKS results to derive tests that reveal whether a genome is descended from allo or autopolyploidy, and whether the ancestral species had a high or low Ne. We apply this test to transcriptomic data for over 200 species across the plant kingdom, validating the theory that the majority of angiosperm lineages are derived from allopolyploidization events.
Chapter
Full-text available
Coffee, comprising Arabica coffee (Coffea arabica L.) and Robusta or Conillon coffee (Coffea canephora Pierre ex A. Froehner), is a vital global commodity, second only to oil in value. The genetic resources of coffee, particularly C. arabica, are critical for its production, trade, and social impact. This chapter reviews the historical and contemporary understanding of Arabica coffee, including its origin, evolution, genetic diversity, and conservation strategies. Key themes include the importance of wild and cultivated accessions, the role of molecular markers and next-generation technologies in assessing genetic diversity, and the utilization of germplasm in breeding programs. Conservation efforts, both in-situ and ex-situ, are also discussed, highlighting the challenges of maintaining genetic diversity amidst increasing anthropogenic pressures.
Article
Geisha coffee is recognized for its unique aromas and flavors and, accordingly, has achieved the highest prices in the specialty coffee markets. We report the development of a chromosome-level, well-annotated genome assembly of Coffea arabica var. Geisha. Geisha is considered an Ethiopian landrace that represents germplasm from the Ethiopian center of origin of coffee. We used a hybrid de novo assembly approach combining two long-read single-molecule sequencing technologies, Oxford Nanopore and Pacific Biosciences, together with scaffolding with Hi-C libraries. The final assembly is 1.03 Gb in size, with a BUSCO assessment of assembly completeness of 97.7% of single-copy ortholog clusters. RNAseq and IsoSeq data were used as transcriptional experimental evidence for annotation and gene prediction, revealing the presence of 47,062 gene loci encompassing 53,273 protein-coding transcripts. Comparison of the assembly to the progenitor subgenomes separated the set of chromosome sequences inherited from C. canephora from those of C. eugenioides. Corresponding orthologs between the two Arabica varieties, Geisha and Red Bourbon, had a 99.67% median identity, higher than what we observe with the progenitor assemblies (median 97.28%). Both Geisha and Red Bourbon contain a recombination event on Chromosome 10 relative to the two progenitors that must have happened before the geographical separation of the two varieties, consistent with a single allopolyploidization event giving rise to C. arabica. Broadening the availability of high-quality genome assemblies of Coffea arabica varieties paves the way for understanding the evolution and domestication of coffee, as well as the genetic basis and environmental interactions that explain why a variety like Geisha is capable of producing beans of such exceptional and unique quality.
Article
Full-text available
Ethiopia is the center of origin and genetic diversity of arabica coffee. Forty-two commercial arabica coffee varieties were developed by Jimma Agricultural Research Center (JARC) of Ethiopian Institute of Agricultural Research (EIAR) and released for production under diverse agro-ecologies of the country. Information on the level of genetic diversity among these varieties is scarce. Out of the 42 varieties, the genetic diversity of 40 widely cultivated commercial varieties was assessed using 14 simple sequence repeat (SSR) markers. These markers revealed polymorphism among the varieties. A high average number of polymorphic alleles (7.5) and high polymorphic information content (PIC = 80%) per locus were detected among the varieties. The genetic similarity among varieties using Jaccard's similarity coefficient ranged from 0.14 to 0.78, with a mean of 0.38. The genetic similarity coefficient in 92% of the possible pair-wise combinations ranged from 0.14 to 0.50, indicating the presence of distant genetic relatedness among the varieties. Unweighted pair group method using arithmetic mean (UPGMA) clustering showed six major clusters and three singletons. Coffee varieties belonging to the same geographic origin were distributed across clusters. This study represents the first evidence of the presence of a high level of genetic diversity in Ethiopian commercial arabica coffee varieties. Divergent varieties with complementing traits could be crossed to develop productive hybrid coffee varieties.
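Jaccard's similarity coefficient, as used in this abstract, compares two varieties by the alleles they carry: the number of shared alleles divided by the number of alleles present in either. A minimal sketch follows, with invented allele labels (marker name plus fragment size) standing in for real SSR scores; the variety names and the helper `jaccard_similarity` are hypothetical.

```python
from itertools import combinations


def jaccard_similarity(alleles_a, alleles_b):
    """Shared alleles divided by all alleles observed in either profile."""
    a, b = set(alleles_a), set(alleles_b)
    union = a | b
    if not union:
        raise ValueError("both allele profiles are empty")
    return len(a & b) / len(union)


# Invented SSR profiles: each entry is "<marker>:<fragment size in bp>".
profiles = {
    "variety_1": {"M1:152", "M1:160", "M2:201", "M3:118"},
    "variety_2": {"M1:152", "M2:201", "M2:207", "M3:124"},
    "variety_3": {"M1:152", "M1:160", "M2:201", "M3:118"},
}

# Pairwise similarity matrix, stored as a dict keyed by variety pair.
similarity = {
    (x, y): jaccard_similarity(profiles[x], profiles[y])
    for x, y in combinations(sorted(profiles), 2)
}

for pair, value in similarity.items():
    print(pair, round(value, 3))
```

A matrix like this (or its complement, 1 minus similarity, as a distance) is the usual input to UPGMA clustering.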
Article
Full-text available
Conventional American cultivars of coffee are no longer adapted to global warming. Finding highly productive and stable cultivars in different environments without neglecting quality characteristics has become a priority for breeders. In this study, new Arabica F1 hybrid clones were compared to conventional American varieties in seven contrasting environments for yield, rust incidence, and canopy volume. Quality was assessed through bean size, weight of 100 beans, biochemical analysis (24 aroma precursors and 31 volatile compounds), and sensory analysis. Conventional varieties were the least productive, producing 50% less than the best hybrid. The AMMI model analysis pointed out five hybrids as the most stable and productive. Two F1 hybrid clones, H1-Centroamericano and H16-Mundo Maya, were superior to the most planted American cultivar in Latin and Central America, showing high yield and high stability. H1-Centroamericano and Starmaya contain more d-limonene than Caturra, while Starmaya contains more 3-methylbutanoic acid than the control. These two volatile compounds were linked with good cup quality in previous studies. In terms of sensory analysis, Starmaya and H1-Centroamericano scored better than the control.
Article
Full-text available
Crop domestication is a fascinating area of study, as shown by a multitude of recent reviews. Coupled with the increasing availability of genomic and phenomic resources in numerous crop species, insights from evolutionary biology will enable a deeper understanding of the genetic architecture and short-term evolution of complex traits, which can be used to inform selection strategies. Future advances in crop improvement will rely on the integration of population genetics with plant breeding methodology, and the development of community resources to support research in a variety of crop life histories and reproductive strategies. We highlight recent advances related to the role of selective sweeps and demographic history in shaping genetic architecture, how these breakthroughs can inform selection strategies, and the application of precision gene editing to leverage these connections.
Article
Full-text available
Background: Locating the optimal varieties for coffee cultivation is increasingly considered a key condition for sustainable production and marketing. Variety performance varies when it comes to susceptibility to coffee leaf rust and other diseases, adaptation to climate change, and high cup quality for specialty markets. But because of poor organization and the lack of a professional coffee seed sector, most existing coffee farms (and even seed lots and nurseries) do not know which varieties they are using. DNA fingerprinting of coffee planting material will contribute to professionalizing the coffee seed sector. Objective: The objective of this paper is i) to check at a large scale the robustness of the existing coffee DNA fingerprinting method based on eight Simple Sequence Repeat (SSR) markers and ii) to describe how it can help in moving the needle towards a more professional seed sector. Method: 2533 samples representing all possible genetic backgrounds of Arabica varieties were DNA fingerprinted with 8 SSR markers. The genetic diversity was analyzed and the genetic conformity to varietal references was assessed. Results: The DNA fingerprinting method proved to be robust in authenticating varieties and in tracing back the history of C. arabica breeding and of the movement of C. arabica varieties. The genetic conformity of two important coffee varieties, Marsellesa and Gesha, proved to be 91% and 39% respectively. Conclusions: DNA fingerprinting provides different actors in the coffee sector with a powerful new tool: farmers can verify the identity of their cultivated varieties, coffee roasters can be assured that marketing claims related to varieties are correct, and most of all, those looking to establish a more professional and reliable coffee seed sector have a reliable new monitoring tool to establish and check the genetic purity of seed stock and nursery plants. Highlights: While C. arabica is primarily self-pollinating, even fixed line varieties appear to be drifting away from their original genetic reference due to uncontrolled cross-pollination. A set of 8 SSR markers applied to the largest possible genetically diverse set of samples proves able to discriminate between a wide range of varieties. Figures confirm that genetic non-conformity of coffee varieties can represent up to 61% of checked samples.
Article
Full-text available
The genome of the allotetraploid species Coffea arabica L. was sequenced to assemble independently the two component subgenomes (putatively deriving from C. canephora and C. eugenioides) and to perform a genome-wide analysis of the genetic diversity in cultivated coffee germplasm and in wild populations growing in the center of origin of the species. We assembled a total length of 1.536 Gbp, 444 Mb and 527 Mb of which were assigned to the canephora and eugenioides subgenomes, respectively, and predicted 46,562 gene models, 21,254 and 22,888 of which were assigned to the canephora and to the eugenioides subgenome, respectively. Through a genome-wide SNP genotyping of 736 C. arabica accessions, we analyzed the genetic diversity in the species and its relationship with geographic distribution and historical records. We observed a weak population structure due to low-frequency derived alleles and highly negative values of Tajima's D, suggesting a recent and severe bottleneck, most likely resulting from a single event of polyploidization, not only for the cultivated germplasm but also for the entire species. This conclusion is strongly supported by forward simulations of mutation accumulation. However, PCA revealed a cline of genetic diversity reflecting a west-to-east geographical distribution from the center of origin in East Africa to the Arabian Peninsula. The extremely low levels of variation observed in the species, as a consequence of the polyploidization event, make the exploitation of diversity within the species for breeding purposes less interesting than in most crop species and stress the need for introgression of new variability from the diploid progenitors.
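Tajima's D, the statistic this abstract reports as highly negative, contrasts two estimates of diversity: the mean number of pairwise differences and the number of segregating sites scaled by sample size. An excess of rare variants pulls D below zero. The sketch below implements the standard Tajima (1989) formula on a toy alignment. The sequences are invented, every variable site is assumed to be a simple substitution, and this is not the pipeline used in the study.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for equal-length aligned sequences (needs n >= 4)."""
    n = len(sequences)
    if n < 4:
        raise ValueError("the variance term is zero for fewer than 4 sequences")
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # S: number of alignment columns with more than one state.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites, D is undefined")

    # pi: mean number of differences over all pairs of sequences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy alignment (invented): every variant is a singleton, i.e. only rare alleles.
toy = ["AAAA", "TAAA", "ATAA", "AATA", "AAAT"]
d_value = tajimas_d(toy)
print(round(d_value, 3))
```

With only singleton variants, as in the toy alignment, D comes out negative, which is the pattern the abstract associates with a recent bottleneck.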
Article
Full-text available
Separating footprints of adaptation from demography is challenging. When selection has acted on a single locus with major effect, this issue can be alleviated through signatures left by selective sweeps. However, as adaptation is often driven by small allele frequency shifts at many loci, studies focusing on single genes are able to identify only a small portion of genomic variants responsible for adaptation. In the face of this challenge, we utilize co-expression information to search for signals of polygenic adaptation in Theobroma cacao, a tropical tree species that is the source of chocolate. Using transcriptomics and a weighted correlation network analysis, we group genes with similar expression patterns into functional modules. We then ask whether modules enriched for specific biological processes exhibit cumulative effects of differential selection in the form of high FST and dXY between populations. Indeed, modules putatively involved in protein modification, flowering, and water transport show signs of polygenic adaptation even though individual genes that are members of those groups do not bear strong signatures of selection. Modelling of demography, background selection, and the effects of genomic features reveals that these patterns are unlikely to arise by chance. We also found that specific modules were enriched for signals of strong or relaxed purifying selection, with one module bearing signs of adaptive differentiation and an excess of deleterious mutations. Our results provide insight into polygenic adaptation, and contribute to understanding of population structure, demographic history, and genome evolution in T. cacao.
Article
Full-text available
Information about population structure and genetic relationships within and among wild and Brazilian Coffea arabica L. genotypes is highly relevant to optimize the use of genetic resources for breeding purposes. In this study, we evaluated genetic diversity, clustering based on Jaccard's coefficient, and population structure in 33 genotypes of C. arabica and of three diploid Coffea species (C. canephora, C. eugenioides and C. racemosa) using 30 SSR markers. A total of 206 alleles were identified, with a mean of 6.9 over all loci. The set of SSR markers was able to discriminate all genotypes and revealed that Ethiopian accessions presented higher genetic diversity than commercial varieties. Population structure analysis indicated two genetic groups, one corresponding to Ethiopian accessions and another corresponding predominantly to commercial cultivars. Thirty-four private alleles were detected in the group of accessions collected from the west side of the Great Rift Valley. We observed a lower average genetic distance of the C. arabica genotypes in relation to C. eugenioides than to C. canephora. Interestingly, commercial cultivars were genetically closer to C. eugenioides than to C. canephora and C. racemosa. The great allelic richness observed in Ethiopian Arabica coffee, especially in the Western group, showed that these accessions can be a potential source of new alleles to be explored by coffee breeding programs.
Article
Full-text available
This study was carried out to determine the degree of genetic diversity and the genetic relationships among seventeen genotypes, comprising 16 commercial cultivars and one accession of Yemeni coffee (Coffea arabica L.) germplasm collected from different governorates in Yemen, and to analyze the DNA fingerprinting data to create molecular IDs for the conservation and protection of these genotypes. These goals were pursued using 15 previously described SSR primer pairs, whose efficiency and performance for these objectives were also evaluated. The SSR loci were highly polymorphic (100% polymorphism on average); the number of scored alleles was high, ranging from 4 to 25 with a mean of 10.7 per locus. Heterozygosity values per locus and per genotype were low, with an average of 0.21, and the polymorphic information content (PIC), or gene diversity, ranged from 0.57 to 0.98. The discriminating power (DP) for all loci was high, with a mean value of 0.81; the most informative primer pair was gSSRCa 021, with a DP value of 0.94. The probability of matching fingerprints was low, with an average value of 0.19. No pair of genotypes shared an identical DNA fingerprint across all loci, even genotypes with the same name that were collected from different geographical regions. The 15 SSR loci differentiated the 17 Yemeni coffee genotypes by detecting 86 clear specific/exclusive alleles. All cultivars and accessions had alleles unique to that genotype alone, with a mean of 5.06. Cluster analysis grouped the 17 genotypes into three main clusters, with genetic similarity ranging from 0.00 to 0.486 and an average of 0.243. The genotypes were also classified according to fruit color and geographical region. The results of this study confirmed the presence of moderate to high genetic diversity among Yemeni coffee genotypes, indicating their importance as a source of genetic variability for coffee improvement, especially given the narrow genetic base of Arabica coffee.
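The abstract mentions PIC and gene diversity side by side. Under their usual definitions they are closely related but not identical quantities, both computed from allele frequencies at a locus. The sketch below uses the common textbook formulas (gene diversity as one minus the sum of squared allele frequencies, and PIC after Botstein et al., 1980); whether the cited study used exactly these forms is an assumption, and the frequencies are invented.

```python
from itertools import combinations


def gene_diversity(freqs):
    """Expected heterozygosity at one locus: 1 - sum(p_i^2)."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphic information content: gene diversity minus sum over i<j of 2 p_i^2 p_j^2."""
    correction = sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return gene_diversity(freqs) - correction


# Invented allele frequencies for two hypothetical loci.
two_alleles = [0.5, 0.5]
four_alleles = [0.25, 0.25, 0.25, 0.25]

print(gene_diversity(two_alleles), pic(two_alleles))
print(gene_diversity(four_alleles), pic(four_alleles))
```

PIC is never larger than gene diversity, and both rise as a locus carries more alleles at even frequencies, which is why loci with many alleles tend to be the most informative for fingerprinting.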
Article
Full-text available
Background: The genetic mechanisms underlying the domestication of animals and plants have been of great interest to biologists since Darwin. To date, little is known about the global pattern of gene expression changes during domestication. Results: We generated and collected transcriptome data for seven pairs of domestic animals and plants including dog, silkworm, chicken, rice, cotton, soybean and maize and their wild progenitors and compared the expression profiles between the domestic and wild species. Intriguingly, although the number of expressed genes varied little, the domestic species generally exhibited lower gene expression diversity than did the wild species, and this lower diversity was observed for both domestic plants and different kinds of domestic animals including insect, bird and mammal in the whole-genome gene set (WGGS), candidate selected gene set (CSGS) and non-CSGS, with CSGS exhibiting a higher degree of decreased expression diversity. Moreover, different from previous reports which found 2 to 4% of genes were selected by human, we identified 6892 candidate selected genes accounting for 7.57% of the whole-genome genes in rice and revealed that fewer than 8% of the whole-genome genes had been affected by domestication. Conclusions: Our results showed that domestication affected the pattern of variation in gene expression throughout the genome and generally decreased the expression diversity across species, and this decrease may have been associated with decreased genetic diversity. This pattern might have profound effects on the phenotypic and physiological changes of domestic animals and plants and provide insights into the genetic mechanisms at the transcriptome level other than decreased genetic diversity and increased linkage disequilibrium underpinning artificial selection.
Article
Full-text available
The coffee cultivar Catuaí is among the most successful cultivars in Brazilian agriculture; it has been on the market for more than 40 years. It was obtained by Dr. Alcides Carvalho, a researcher of the Instituto Agronômico de Campinas (IAC), from the cross between ‘Caturra’ and ‘Mundo Novo’ carried out in 1949 for the purpose of joining plant vigor with small plant size. Our aim was to report the activities that culminated in the recommendation of 16 lines of ‘Catuaí’, consisting of eight lines with red fruit and eight with yellow fruit, analyzing the data of several experiments. The decision regarding what to recommend was made in the F1:2 generation, based on two harvests. It became clear that Dr. Alcides should be taken as an example by all breeders, above all in his persistence, scientific rigor, and belief that farmers can be an important ally of breeders.