Access to this full-text is provided by Springer Nature.
Content available from BMC Plant Biology
This content is subject to copyright. Terms and conditions apply.
Baiakhmetovetal. BMC Plant Biology (2021) 21:505
https://doi.org/10.1186/s12870-021-03287-w
RESEARCH
Evidence forextensive hybridisation
andpastintrogression events infeather grasses
using genome-wide SNP genotyping
Evgenii Baiakhmetov1,2*, Daria Ryzhakova2,3, Polina D. Gudkova2,3 and Marcin Nobis1,2*
Abstract
Background: The proper identification of feather grasses in nature is often limited due to phenotypic variability and
high morphological similarity between many species. Among plausible factors influencing this issue are hybridisation
and introgression recently detected in the genus. Nonetheless, to date, only a bounded set of taxa have been investi-
gated using integrative taxonomy combining morphological and molecular data. Here, we report the first large-scale
study on five feather grass species across several hybrid zones in Russia and Central Asia. In total, 302 specimens were
sampled in the field and classified based on the current descriptions of these taxa. They were then genotyped with
high density genome-wide markers and measured based on a set of morphological characters to delimitate species
and assess levels of hybridisation and introgression. Moreover, we tested species for past introgression and estimated
divergence times between them.
Results: Our findings demonstrated that 250 specimens represent five distinct species: S. baicalensis, S. capillata, S.
glareosa, S. grandis and S. krylovii. The remaining 52 individuals provided evidence for extensive hybridisation between
S. capillata and S. baicalensis, S. capillata and S. krylovii, S. baicalensis and S. krylovii, as well as to a lesser extent between
S. grandis and S. krylovii, S. grandis and S. baicalensis. We detected past reticulation events between S. baicalensis, S.
krylovii, S. grandis and inferred that diversification within species S. capillata, S. baicalensis, S. krylovii and S. grandis
started ca. 130–96 kya. In addition, the assessment of genetic population structure revealed signs of contemporary
gene flow between populations across species from the section Leiostipa, despite significant geographical distances
between some of them. Lastly, we concluded that only 5 out of 52 hybrid taxa were properly identified solely based
on morphology.
Conclusions: Our results support the hypothesis that hybridisation is an important mechanism driving evolution
in Stipa. As an outcome, this phenomenon complicates identification of hybrid taxa in the field using morphological
characters alone. Thus, integrative taxonomy seems to be the only reliable way to properly resolve the phylogenetic
issue of Stipa. Moreover, we believe that feather grasses may be a suitable genus to study hybridisation and introgres-
sion events in nature.
Keywords: Feather grasses, Hybridisation, Introgression, Integrative taxonomy, Genome-wide genotyping, DArTseq,
Divergence-time estimation, Population structure
© The Author(s) 2021, corrected publication 2021. Open Access This article is licensed under a Creative Commons Attribution 4.0
International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you
give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes
were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated
otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not
permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To
view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/. The Creative Commons Public Domain Dedication waiver
(http:// creat iveco mmons. org/ publi cdoma in/ zero/1. 0/) applies to the data made available in this article, unless otherwise stated in a
credit line to the data.
Background
e proper delimitation of species plays an important
role in taxonomy as well as in studies related to spe-
ciation, biogeography and ecology, leading to effective
conservation and management of biodiversity. In the
Open Access
*Correspondence: evgenii.baiakhmetov@doctoral.uj.edu.pl; m.nobis@uj.edu.
pl
2 Research laboratory ‘Herbarium’, National Research Tomsk State
University, Lenin 36 Ave., 634050 Tomsk, Russia
Full list of author information is available at the end of the article
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Page 2 of 20
Baiakhmetovetal. BMC Plant Biology (2021) 21:505
last two decades, traditional approaches relying mostly
on morphological features have been supplemented by
molecular data that boosted the discovery of new spe-
cies. Although one estimate of the number of plant spe-
cies is around 298,000 [1], recently it has been shown
that the plant kingdom is comprised of at least 374,000
taxa [2]. Nowadays, many systematicists emphasise the
need to apply multidisciplinary data, so-called integra-
tive approaches or integrative taxonomy [3]. For instance,
information from a variety of disciplines, e.g., morphol-
ogy, biochemistry, cytogenetics and ‘omics studies,
increases the reliability and validity in identifying taxa
[4–6].
To date, among the molecular methods, DNA bar-
coding has been a widely utilised tool to identify taxa
at different levels aiming not only to facilitate revision-
ary taxonomy, but also to broaden our understanding
of molecular phylogenetics and population-level varia-
tions [7–9]. Among standard plant markers, chloroplast
regions rbcL and matK and the nuclear internal tran-
scribed spacer (ITS) locus have been proposed for DNA
barcoding of land plants [10, 11]. Additionally, several
non-coding plastid regions have been suggested as sup-
plementary loci where further resolution is required [10].
Stipa is one of the largest genera in the grass subfam-
ily Pooideae (Poaceae), currently comprising nearly 150
cool-season species with C3 photosynthesis common
in Eurasia and North Africa [12, 13]. Based on ITS and
the plastid trnK region, the genus has been proven to be
monophyletic [14, 15]. Nonetheless, the traditional bar-
codes are not able to validate the sectional subdivision in
Stipa proposed, e.g., by Tzvelev [16, 17] or Freitag [18].
Recently, the nuclear intergenic spacer (IGS) region [19]
and marker sets derived from whole chloroplast genomes
of 19 species [20] were proposed for phylogenetic stud-
ies of feather grasses. Although these markers are more
phylogenetically informative in comparison to the previ-
ously used barcodes, they are still unable to discriminate
all taxa, causing unresolved nodes in the reconstructed
trees [19, 20].
One of the plausible explanations for this unresolved
branching in Stipa is that many feather grasses are of
hybrid origin [16, 21–23]. Presently, hybridisation is con-
sidered to be widespread among at least 25% of plant
taxa, mostly the youngest species [24]. is phenomenon
is often accompanied by introgression via repeated back-
crossing to one or both parental species that can lead to
diversification and speciation [25, 26]. In grasses, hybrid
speciation has been explicitly studied at the genome level,
e.g., in Triticum [27] and Brachypodium [28]. Nonethe-
less, previously many hybrids and introgressed individu-
als were characterised exclusively based on morphology
that limited their successful identification in nature [29,
30]. In addition, hybridisation may lead new organisms
not only to intermediate traits of parental species but
also to extreme, or transgressive, phenotypes [31] that
complicate their proper taxonomic treatment. In feather
grasses, the hypothesis of hybrid origin of some species
was initially tested using multivariate morphological
analyses [23, 32] and, more recently, applying molecular
markers among genetically closely related species in the
Stipa heptapotamica complex [33] as well as within two
genetically distant species, S. krylovii and S. bungeana
[34]. Furthermore, due to the usage of integrative taxo-
nomic approaches, it was shown that some Stipa taxa,
previously assigned to S. richteriana and S. grandis,
appeared to be cryptic species [33, 35]. us, taking into
account that ca. 30% of feather grasses could be of hybrid
origin [13] and that cryptic species are present, integra-
tive taxonomy seems to be the only reliable way to prop-
erly resolve the phylogenetic issue of Stipa.
Importantly, the advent of next generation sequencing
technologies, primarily Illumina, and the continuously
reducing sequencing cost have facilitated the implemen-
tation of genomic data in studies of non-model organ-
isms. For instance, high-throughput techniques based
on restriction enzymes, e.g., RADseq [36] and genotyp-
ing-by-sequencing [37], which have been foremost used
in agricultural species [38], are currently widely applied
in phylogenetics and studies related to hybridisation in
many wild plant genera with little or no previous genomic
information [39–41]. Recently, a promising result has
also been demonstrated in Stipa where the usage of the
DArTseq technique resulted in an increased number of
markers that was several 100-fold higher than in the pre-
vious genomic studies [34].
During field studies on steppe communities, we
observed high morphological variability in populations
of genetically related plants. In particular, we noticed that
some specimens of feather grasses representing S. capil-
lata, S. grandis and S. krylovii seemingly share mixed
morphological characteristics between these species,
while taxon S. baicalensis is frequently observed within
populations of the aforesaid taxa and resembles an inter-
mediate phenotype between S. grandis and S. krylovii or
S. capillata and S. krylovii. e above-mentioned species
have wide distribution ranges (Fig.1 and Supplementary
TableS1). Specifically, S. capillata is the most widespread
taxon within the genus, grows on the dry grasslands and
is common in Siberia, Western Asia and is also present in
a limited number of refugia in Europe and North Africa.
Two species, S. baicalensis and S. grandis, share similar
ranges in southwestern Siberia, in the Baikal region, in
the south part of Zabaykalsky Krai, in Mongolia and in
northeastern China; S. baicalensis is also present in the
south part of the Russian Far East, whereas S. grandis
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Page 3 of 20
Baiakhmetovetal. BMC Plant Biology (2021) 21:505
occurs in Central China and Tibet. Finally, S. krylovii
grows in Siberia, Mongolia, China, Northern Nepal,
Southern Tajikistan, Eastern Kazakhstan and Eastern
Kyrgyzstan [13, 42].
We hypothesise that the observed variability in S.
baicalensis, S. capillata, S. grandis and S. krylovii is
due to the presence of interspecific hybrids and that
may lead to species misidentification based on the
current descriptions of these taxa [16, 43–45]. us, in
the current study, we aim to use an integrative taxonomy
approach to (1) delimitate species and test if S. baicalen-
sis is a hybrid between S. grandis × S. krylovii or S. cap-
illata × S. krylovii; (2) assess levels of hybridisation and
introgression (if present) between the examined taxa and
populations at the molecular level; (3) estimate diver-
gence times between the studied taxa; (4) obtain insight
Fig. 1 The general distribution map of (a) S. baicalensis (yellow), S. capillata (red), S. grandis (green), S. krylovii (blue) and sampling locations (b) in
East Kazakhstan and southwestern Siberia (Russia), (c) in southeastern Siberia and (d) in Eastern Kyrgyzstan. The dashed lines indicate hypothetical
borders. The coloured circles depict species found in the numbered locations. The exact coordinates of the locations are presented in the
Supplementary Table S1
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Page 4 of 20
Baiakhmetovetal. BMC Plant Biology (2021) 21:505
into the extent of hybridisation between these species at
the morphological level and (5) assess whether morpho-
logical characters can be used to identify hybrids in the
field.
Results
DNA‑based species delimitation
e DArTseq technique was applied to obtain a total of
8660 SNP markers to infer the genetic structure of 302
Stipa specimens. Firstly, analyses of genetic clustering
with an unweighted pair group method using arithmetic
average (UPGMA) and fastSTRU CTU RE revealed five
major clades corresponding to morphospecies S. glare-
osa, S. capillata, S. grandis, S. krylovii and S. baicalensis
(Fig.2). According to the fastSTRU CTU RE analysis, the
first and the fourth clades consisted exclusively of pure
specimens of S. glareosa and S. krylovii, respectively. e
remaining clades beside pure specimens of S. capillata, S.
grandis and S. baicalensis included hybrid individuals. In
particular, the second clade comprised pure specimens of
S. capillata and the admixed individuals S. capillata × S.
baicalensis and S. capillata × S. krylovii. e third cluster
consisted of pure specimens of S. grandis and hybrids S.
grandis × S. krylovii and S. grandis × S. baicalensis. e
fifth clade included pure specimens of S. baicalensis and
the admixed individuals S. baicalensis × S. krylovii.
In total, fastSTRU CTU RE inferred 52 individuals with
an admixture of two genetic clusters including an excep-
tion of S. capillata × S. baicalensis (0454631) that had a
minor proportion (0.02) of a third cluster representing S.
krylovii. Among pure individuals only one specimen of S.
krylovii (0454646) had an insignificant admixed propor-
tion (0.03) of S. grandis. Noteworthy, the vast majority
of admixed individuals (49 or 94%) had a proportion of
membership in the range from 0.46 to 0.54 indicating F1
hybrids or later generations of hybrids that have no back-
crossing to the parental species. e remaining admixed
samples represented: (1) one individual (0477009) was
formed by a 0.78–0.22 admixture between S. baicalensis
and S. krylovii evidencing a first-generation backcross
(F1× S. baicalensis), (2) one individual (000948) was
shared between S. grandis (0.88) and S. krylovii (0.12)
indicating a second-generation backcross (first-genera-
tion backcross × S. grandis), (3) one individual (000956)
was admixed between S. grandis (0.64) and S. krylovii
(0.36) that may suggest a first-generation backcross (F1×
S. grandis) or a more complex backcross to S. grandis via
different intermediate combinations.
A consistent result was also found with a princi-
pal coordinates analysis (PCoA). e first three axes
explained 29.6, 19.9 and 19.2% of the total genetic diver-
gence within the studied taxa, respectively. According
to the PCoA, pure individuals were grouped into five
Fig. 2 The UPGMA dendrogram (at the top) aligned with the best supported fastSTRU CTU RE model K = 5 (on the bottom). The genetic distance
was calculated using the Jaccard Similarity Coefficient (y-axis, top). Individuals are represented by coloured bars according to the proportion of
membership (y-axis, bottom) of a genotype to the respective cluster
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Page 5 of 20
Baiakhmetovetal. BMC Plant Biology (2021) 21:505
markedly differentiated groups correspondingly to their
taxonomic classifications (Fig.3; an interactive version of
the three-dimensional plot can be accessed via https://
plot. ly/ ~eugen ebaya hmetov/ 40/). e remaining hybrids
F1 had intermediate positions between their parental
species. Two hybrid individuals S. grandis × S. krylovii
(000948 and 000956) were grouped closer to S. grandis
reflecting a higher proportion of membership with the
first taxon established earlier with fastSTRU CTU RE.
Similarly, an admixed individual S. baicalensis × S. kry-
lovii (0477009) with the proportion of 0.78 and 0.22 was
closer to S. baicalensis.
Hybrid generation identication
e NewHybrids analysis revealed a more complex pat-
tern of hybridisation than it was inferred with fastSTRU
CTU RE. Among 16 admixed specimens of S. baicalen-
sis × S. krylovii, previously assigned as F1, six individu-
als with posterior probabilities (PP) in a range of 0.84
and 1.00 were identified being F2 (F1× F1) hybrids sug-
gesting that F1 hybrids are able to reproduce further.
One specimen, S. baicalensis × S. krylovii (0477009),
was proven to be a first-generation backcross (F1× S.
baicalensis) having PP of 0.81. In addition, five mixed
individuals had PP between two categories (F1 and F2
hybrids) in a range of 0.22–0.78 suggesting uncertainty
in the assignment (Fig.4a and Supplementary TableS2).
ese mixed assignment individuals may represent a
more advanced hybrid generation than can be detected
by NewHybrids. Within 14 hybrids of S. capillata × S.
krylovii, previously assigned to F1 hybrids, we detected
six individuals of F1 (PP of 0.87–1.00) and one speci-
men was identified as an F2 hybrid with PP of 0.83. e
remaining seven individuals had mixed assignments
in a range of 0.39–0.73 for the F1 class and 0.27–0.61
for the F2 class, respectively (Fig.4b). e analysis also
demonstrated that 13 out 14 hybrids of S. capillata ×
S. baicalensis were F1 (PP of 0.86–1.00) and one indi-
vidual remained unclassified sharing PP (0.54 and 0.46)
between F1 and F2 categories (Fig.4c). Among S. gran-
dis × S. krylovii only one F1 hybrid was detected (PP
of 0.91), two individuals were assigned to the F2 class
(PP of 0.83 and 1.00) and two specimens were classified
as first generation backcrosses (F1× S. grandis) hav-
ing PP of 0.88 and 0.99. Although the specimen 000948
was inferred to be an F1 backcross, it is more plausible
that it represents rather a second-generation backcross
established by fastSTRU CTU RE due to NewHybrids
being unable to detect more advanced backcrosses than
F1. Additionally, one individual had mixed assignments
between the F1 (PP of 0.27) and F2 (PP of 0.72) classes
(Fig.4d). Finally, the only hybrid detected for S. grandis
Fig. 3 The PCoA plot based on genetic distances between samples. a The plot of the two principal axes. b The plot of the three principal axes. The
pie charts represent the proportions of membership established by fastSTRU CTU RE for the best K = 5
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Page 6 of 20
Baiakhmetovetal. BMC Plant Biology (2021) 21:505
× S. baicalensis appeared to be a first generation hybrid
with PP of 1.00 (Fig.4e).
Testing forintrogression
A total of 6894 SNP markers were used to test for reticu-
lation events between the studied taxa. Due to five dif-
ferent admixed combinations detected by fastSTRU CTU
RE and NewHybrids, we tested all possible four-species
combinations regardless of their phylogenetic positions
(Fig.2). e results of the f4 statistic suggest no gene flow
between S. capillata and the remaining species because
of negligible deviations from the expected 50/50 ratio
of BABA/ABBA patterns and the lowest Z-scores of any
tests (Table1). is finding disagrees with the presence
of contemporary hybrids S. capillata × S. krylovii and S.
capillata × S. baicalensis inferred with fastSTRU CTU
RE and NewHybrids. Nonetheless, it can be explained
by the fact that all identified admixed individuals were
excluded from this analysis. On the other hand, intro-
gression events were suggested between S. grandis and
Fig. 4 The assignment of Stipa taxa into four hybrid classes according to the posterior probabilities (y-axis) inferred in NewHybrids. a S. baicalensis
× S. krylovii, (b) S. capillata × S. krylovii, (c) S. capillata × S. baicalensis, (d) S. grandis × S. krylovii, (e) S. grandis × S. baicalensis. Hybrid classes are
coloured by black (F1 hybrid), grey (F2), cyan (backcross to the first parental species, BC to parent 1) and pink (backcross to the second parental
species, BC to parent 2)
Table 1 Test for introgression between the studied species using 6894 SNPs
Outgroup (A) for all tests was S. glareosa; nBABA, number of BABA patterns; nABBA, number of ABBA patterns. Standard error in all tests was < 0.01. Negative f4 and
Z-score < −3 indicate gene ow between B and C, positive f4 and Z-score > 3 suggest reticulation events between B and D.
No B C D nBABA nABBA f4Z‑score
1S. capillata S. grandis S. krylovii 11 11 −0.000094 −0.187
2S. capillata S. grandis S. baicalensis 15 17 −0.000284 −0.494
3S. capillata S. krylovii S. baicalensis 13 15 −0.000190 −0.293
4S. grandis S. krylovii S. baicalensis 23 64 −0.006000 −4.570
5S. grandis S. capillata S. baicalensis 130 17 0.016400 10.000
6S. grandis S. krylovii S. capillata 11 166 −0.022400 −9.090
7S. krylovii S. baicalensis S. grandis 64 30 0.004870 3.530
8S. krylovii S. capillata S. grandis 166 11 0.022500 8.910
9S. krylovii S. baicalensis S. capillata 15 136 −0.017600 −10.200
10 S. baicalensis S. krylovii S. grandis 23 30 −0.001130 −1.260
11 S. baicalensis S. capillata S. grandis 130 15 0.016700 10.400
12 S. baicalensis S. krylovii S. capillata 13 136 −0.017800 −10.800
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Page 7 of 20
Baiakhmetovetal. BMC Plant Biology (2021) 21:505
S. baicalensis (combinations 5 and 11), S. grandis and S.
krylovii (combinations 4, 6, 7 and 8), S. krylovii and S.
baicalensis (combinations 9 and 12). Additionally, when
S. grandis, S. krylovii and S. baicalensis were analysed
together (combinations 4, 7 and 10) the ratio of BABA/
ABBA patterns were either almost equal (combination
10) or relatively lower (combinations 4 and 7) compared
to the other tests that indicated gene flow among these
species. One potential explanation is that these species
are involved in introgression at the same rate, which the-
oretically cancel out each other.
Population dierentiation
A total of 3483 SNP markers were used to investigate
the genetic differentiation in populations of S. baicalen-
sis, 6288 SNPs in S. capillata, 4635 SNPs in S. grandis
and 6912 SNPs in S. krylovii (Supplementary Fig. S1).
e pairwise Fst values demonstrated strong differen-
tiation among four populations of S. baicalensis, while
the results of STRU CTU RE and PCoA revealed two
and three genetic clusters, respectively (Fig. 5a and
Supplementary Fig.S2), where populations 1 and 4 are
merged (Fst of 0.32, Supplementary TableS3) regard-
less of the fact that the distance between them is more
than 1000 km. Additionally, the second most likely K
according to STRU CTU RE was K = 3 indicating that
this number of clusters is also a likely option. Relatively
strong differentiation was also shown for populations
of S. capillata with an exception of populations 5 and 6
with a moderate Fst value of 0.13, suggesting potential
gene flow. According to the PCoA, almost all individu-
als were grouped together excluding population 3 and a
few specimens of populations 2 and 5. Nonetheless, the
first two axes of PCoA explained only 18% of the total
genetic divergence within the specimens. On the other
hand, STRU CTU RE supported K = 4 as the best fitting
model, while two and nine clusters were also among the
probable options (Fig. 5b and Supplementary Fig. S2).
Within S. grandis, evidence for weak differentiation was
shown for geographically close populations 5 and 6 (Fst
of 0.10) as well as for 7 and 8 (Fst of 0.08), while the first
two axes of the PCoA explained 23.3% of the variation
and revealed the close genetic relationship between geo-
graphically distant populations 1 and 2 (Fst of 0.42). In
Fig. 5 PCoA plots, best supported STRU CTU RE models and localities of the studied populations across four species. a S. baicalensis. b S. capillata. c
S. grandis. d S. krylovii
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Page 8 of 20
Baiakhmetovetal. BMC Plant Biology (2021) 21:505
addition, based on the second axis of the PCoA, popu-
lations 3 and 4 were distant to each other as well as to
the remaining populations (Fig.5c). Further, the STRU
CTU RE analysis suggested that K = 2 was the most prob-
able number of separate clusters within S. grandis (Sup-
plementary Fig.S2), where one pure cluster represented
individuals from populations 7 and 8, while 9 out of 13
specimens of population 3 and 4 constituted the sec-
ond pure group. e rest of S. grandis specimens were
admixed between these pure clusters. Lastly, based on
the PCoA individuals of S. krylovii were clustered into
the two main groups representing population 8 from
Kyrgyzstan and the remaining populations from Russia
(Fig.5d). Although few specimens of population 1 were
genetically more distant to the other individuals, the sec-
ond axis of PCoA explained only 5.4% of the total genetic
divergence. e pairwise Fst values also supported the
division within S. krylovii into two groups indicating
the strong differentiation of population 8 (Fst in a range
of 0.37–0.46) and a moderate or near to moderate dif-
ferentiation among the remaining populations (Fst in a
range of 0.06–0.19). Similarly, the STRU CTU RE analysis
revealed two genetic clusters (Supplementary Fig. S2)
that described the population structure the best: the first
one represented population 8 from Kyrgyzstan, while the
second cluster comprised populations from Russia.
Divergence‑time estimation
e SNAPP phylogeny based on 2717 SNP markers
among pure individuals revealed largely the same topol-
ogy as the UPGMA dendrogram, except for the pair
S. grandis and S. krylovii that were grouped together,
while S. baicalensis was a sister taxon (Fig.6 and Sup-
plementary Fig.S3). e result suggests that the poten-
tial split between S. capillata and three species, namely,
S. grandis, S. krylovii and S. baicalensis, took place
approximately 1.07 Mya with the 95% Highest Posterior
Density interval (HPD) of 1.51–0.71 Mya. e most
recent common ancestor for S. grandis, S. krylovii and
S. baicalensis was inferred to be 0.79 Mya (95% HPD:
1.12–0.53 Mya), whereas the lowest divergence time of
0.73 Mya was registered for S. grandis and S. krylovii
(95% HPD: 1.02–0.48 Mya). e chronogram also indi-
cates that diversification within S. capillata, S. krylovii
Fig. 6 Phylogeny and divergence date estimates inferred by SNAPP. Blue coloured trees represent the most probable topology. Numbers at each
node represent mean ages of divergence time estimates and the 95% HPD intervals (in the brackets). The black rectangles on the nodes indicate
the 95% HPD intervals of the estimated posterior distributions of the divergence times. The red circle indicates the presumed divergence time split
set as a reference. The Bayesian posterior probabilities were 1.00 for the nodes with the shown 95% HPD intervals. The scale shows divergence time
in Mya
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Page 9 of 20
Baiakhmetovetal. BMC Plant Biology (2021) 21:505
and S. baicalensis started at relatively the same time,
ca. 130–114 kya (95% HPD: 181–74 kya), while S. gran-
dis was established to be the youngest species (96 kya,
95% HPD: 137–63 kya). Lastly, although the topologies
within the species had large uncertainty, some nodes
had comparatively high Bayesian posterior probabili-
ties (BPP ≥ 0.80; Supplementary Fig.S3). In particular,
individuals of S. capillata from Kyrgyzstan started to
differentiate 97 kya (95% HPD: 137–58 kya), while the
well-supported split (BPP of 0.92) between specimens
from localities 5 (Russia, the Republic of Khakassia)
and 19 (Russia, the Republic of Buryatia) took place 86
kya (95% HPD: 122–53 kya). e most recent common
ancestor for individuals of S. krylovii from Kyrgyzstan
was inferred to be 62 kya (95% HPD: 94–35 kya),
whereas specimens from localities 22 (Mongolia) and
24 (Russia, Zabaykalsky Krai) had the potential split 68
kya (95% HPD: 99–40 kya). Interestingly, specimens of
S. baicalensis from localities 11 and 13 (both from Rus-
sia, the West shore of Lake Baikal) and individuals from
the remaining localities had nearly the same divergence
times of 97 kya (95% HPD: 140–60 kya) and 93 kya
(95% HPD: 132–58 kya), respectively. Additionally, the
potential split between populations of S. grandis from
the Republic of Khakassia (locality 5) and the Republic
of Buryatia (locality 18) took place 65 kya (95% HPD:
91–39 kya), while the well-supported split (BPP of 0.95)
between specimens from localities 23 and 24 (both
from Russia, Zabaykalsky Krai) took place 58 kya (95%
HPD: 86–34 kya).
Morpho‑molecular analysis
As non-parametric Spearman correlation coefficients did
not demonstrate any strong correlation (> 0.90) between
the measured variables (Supplementary Fig. S4), we
retained all morphological characters (Table3) for a fac-
tor analysis of mixed data (FAMD) and analyses of notch
plots. Subsequently, to investigate if the observed phe-
notypes are congruent with molecular data, we supple-
mented the result of the FAMD analysis with the genetic
clusters inferred by fastSTRU CTU RE for the best K = 5.
As a result, the FAMD revealed five markedly differ-
entiated groups (Fig.7) of operational taxonomic units
(OTUs) in accordance with the detected clusters using
the SNP data (Figs.2, 3 and 4). e first four dimensions
explained 31.0, 18.9, 10.9 and 8.1% of the total variability,
respectively. e first three axes are mainly composed by
the contribution of 11 quantitative and four qualitative
variables (Supplementary TableS4).
Importantly, due to having the genetic clusters assigned
by fastSTRU CTU RE, the two-dimensional plot revealed
the slight overlapping of OTUs belonging to the pure
species S. baicalensis and S. krylovii, whereas OTUs
representing the admixed specimens S. baicalensis × S.
krylovii were present in both clouds of the parental taxa
(Fig.7a). Furthermore, hybrids S. capillata × S. baicalen-
sis were also present in both clouds of the pure species.
On the other hand, all the admixed individuals S. capil-
lata × S. krylovii were grouped together with only one
parental species, S. capillata. Interestingly, hybrids
between S. grandis and S. krylovii were mainly grouped
Fig. 7 The factor analysis of mixed data performed on 17 quantitative and six qualitative characters of the five examined species of Stipa. a Plot of
the two principal axes. b Plot of the three principal axes. The pie charts represent the proportions of membership established by fastSTRU CTU RE for
the best K = 5
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Page 10 of 20
Baiakhmetovetal. BMC Plant Biology (2021) 21:505
together with OTUs of S. baicalensis with an exception
of the first- and the second-generation backcrosses. e
only hybrid detected between S. grandis and S. baicalen-
sis was clustered together with the pure individuals of the
former taxon. A more clear dispersal of the pure species
can be seen in the three-dimensional plot, where differ-
ences between the studied species are explained by the
third principal axis (Fig.7b; an interactive version of the
plot available at https:// plot. ly/ ~eugen ebaya hmetov/ 42/).
Additionally, all combinations of two axes plots for the
four dimensions are present in Supplementary Fig.S5.
e result of FAMD and notch plots of the quantita-
tive variables demonstrated that the studied species can
be differentiated morphologically mainly based on the
length of the lower segment of the awn (Col1L), the dis-
tance from the end of the dorsal line of hairs to the top
of the lemma (DDL), the length of the anthecium (AL),
the length of the middle segment of the awn (Col2L),
the length of the callus (CL), the length of ligules of the
internal vegetative shoots (LigIV), the length of ligules of
the middle cauline leaves (LigC) and the length of hairs
on the top of the lemma (LHTA). In addition, the length
of hairs on the lower segment of the awn (HLCol1),
the length of hairs on the middle segment of the awn
(HLCol2) and the length of the callus base (CBL) can
aid to distinguish S. glareosa from the remaining taxa
(Supplementary Fig.S6). For instance, the notch plot of
Col1L showed significant differences between means and
strong evidence of differing medians within the pure spe-
cies except the pair S. baicalensis and S. capillata, while
the hybrid individuals had mostly intermediate positions
between the parental species except specimens of S. cap-
illata × S. baicalensis that were significantly different
from the samples of the pure taxa. e differences across
all quantitative variables between individuals of the pure
species and the hybrids can be better evaluated in the
interactive box plots presented in Supplementary File1.
Among the qualitative variables, the main contribu-
tion to the axes of the FAMD had the abaxial surface of
vegetative leaves (AbSVL), the type of hairs on the top
of the anthecium (HTTA), the type of the awn genicula-
tion (AG) and the presence of hairs below nodes (PHBN)
(Supplementary TableS4). For instance, vegetative leaves
with prickles were common in S. glareosa (all samples), S.
capillata (61 out of 66 samples), S. capillata × S. krylovii
(12 out of 14 samples), S. capillata × S. baicalensis (10
out of 14 samples), less frequent in S. grandis (2 out of 54
samples), S. krylovii (5 out of 81 samples), S. baicalensis
× S. krylovii (2 out of 14 samples) and totally absent in
S. baicalensis, S. grandis × S. krylovii and S. grandis × S.
baicalensis (Supplementary Fig.S7).
us, based on the results of the molecular analyses
and the FAMD combining both phenotypic and SNP
data, we were able to differentiate the pure species and
the hybrid individuals at the morphological level. We
established that using the traditional identification keys
[16, 43–45] 71 out of 302 specimens had been misiden-
tified, mostly due to their hybrid nature (Supplementary
TableS5). In particular, 47 samples previously identified
as pure species appeared to be hybrids. e remaining
24 specimens earlier were classified either as hybrids (15
samples) or misleadingly assigned to S. baicalensis (9
samples). Interestingly, in the latter case, all individuals
were previously reported from the northeastern part of
Kazakhstan [46].
In general, S. baicalensis was the most problematic spe-
cies for taxonomic identification comprising 54 doubtful
samples. Specifically, the above-mentioned specimens
from Kazakhstan genetically were proven to be pure S.
capillata, 10 specimens appeared to be hybrids S. capil-
lata × S. baicalensis, 8 were S. baicalensis × S. krylovii
and 10 were hybrids between S. capillata and S. kry-
lovii. On the other hand, specimens previously identi-
fied as hybrids S. baicalensis × S. krylovii (3 samples)
and S. grandis × S. krylovii (2 samples) were genetically
assigned to the pure species S. baicalensis. Oppositely,
specimens morphologically identified as S. krylovii (8
samples), S. grandis (1 sample) and S. capillata (1 sam-
ple) appeared to be hybrids S. baicalensis × S. krylovii (6
samples) and S. capillata × S. baicalensis (2 samples), S.
grandis × S. baicalensis and S. capillata × S. baicalensis,
respectively. Additionally, one specimen S. capillata ×
S. grandis was proven to be S. capillata × S. baicalensis.
At last, one individual S. capillata × S. baicalensis was
genetically verified to be the pure species S. capillata.
e remaining doubtful samples morphologically were
either misleadingly assigned as the pure species or as taxa
of hybrid nature. For instance, 4 specimens of S. capil-
lata were hybrids between S. capillata and S. krylovii,
while specimens of S. grandis (2 samples) and S. krylovii
(2 samples) were shown to be genetically admixed as S.
grandis × S. krylovii. In opposite, 9 individuals identified
as S. capillata × S. krylovii appeared to be the pure spe-
cies S. capillata. Interestingly, only 5 out of 52 hybrid taxa
were properly identified based on morphology including
S. baicalensis × S. krylovii (3 samples) and S. grandis × S.
krylovii (2 samples). e only one taxon that had not any
questionable individuals was S. glareosa representing the
section Smirnovia.
Discussion
e current understanding of taxonomy and species
limits in Stipa is still largely based on morphological
characters. Our study highlights the necessity of using
molecular tools to properly identify taxa and detect
processes underlying speciation. is is of particular
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Page 11 of 20
Baiakhmetovetal. BMC Plant Biology (2021) 21:505
relevance in hybrid zones where ongoing hybridisation
and introgression may lead admixed individuals to phe-
notypes similar to one of the parental species, compli-
cating identification of such taxa using morphological
characters alone. Furthermore, integrative studies sug-
gest that apparently intermediate phenotypes between
two species are not necessarily hybrids [47]. On the
other hand, as indicated in the present work, although
some interspecific hybrids have intermediate charac-
ters between parental taxa, their phenotypic traits can
also overlap with non-parental species leading to misi-
dentification. Given the circumstances, here we utilised
an integrative approach combining genome-wide data
and morphology to delimitate species and ascertain the
extent of hybridisation in feather grasses.
e study clearly illustrates that a molecular-based
analysis, e.g., such as fastSTRU CTU RE, combined with a
factor analysis of mixed data, utilising both quantitative
and qualitative variables, can largely resolve the problem
with species identification in the face of ongoing hybrid-
isation and introgression. Primarily, such an approach
allows to visualise and easily trace if the observed phe-
notypes are congruent with molecular data. Besides that,
this approach may aid in selecting a set of traits that can
be useful for species identification in the field. Our find-
ings clearly demonstrate that the studied individuals can
be clustered into five species groups representing S. cap-
illata, S. baicalensis, S. glareosa, S. grandis and S. kry-
lovii. us, here we found no evidence that S. baicalensis
is of hybrid origin from S. grandis × S. krylovii or S.
capillata × S. krylovii but is instead a genetically dis-
tinct species. e general branching of the phylogenetic
trees is in good agreement with the current taxonomic
classification. In particular, a representative of the sec-
tion Smirnovia, S. glareosa, was genetically distant to the
remaining species from the section Leiostipa. Neverthe-
less, our result contradicts a previous research, where S.
capillata, S. krylovii and S. baicalensis represented one
clade and S. grandis was a sister taxon [19], while here,
such a sister taxon was S. capillata. e current result
is likely more accurate due to applying several thou-
sand SNPs across the genome in comparison to only one
nuclear locus in the above-mentioned study. Addition-
ally, we demonstrated that the potential split between S.
capillata and the remaining representatives of Leiostipa
took place approximately 1.07 Mya, which is similar to
our previous estimate of 1.73 Mya based on the nucleolar
organising regions [48]. Here, we also reported for the
first time that diversification within species S. capillata,
S. baicalensis, S. krylovii and S. grandis started ca. 130–
96 kya (95% HPD: 181–63 kya). ese ages may corre-
spond to the potential window of time between the Last
Interglacial period, which began around 130 kya, and the
Last Glacial Period (LGP), which started about 110 kya.
us, the observed pattern is similar to dispersing events
reported for different taxa across the plant kingdom [49,
50] suggesting climatic changes as a feasible factor in the
current distribution of feather grasses. Of note, due to
the divergence times that were inferred in SNAPP, which
uses the multi-species coalescent model ignoring pos-
sible introgression, the confidence may be exaggerated.
On the other hand, although introgression does cause
biased errors in coalescent-based species tree inference
[51–53], it should not affect the estimates in the present
study, since all admixed individuals were excluded from
the analysis.
Importantly, the results of molecular analyses were
congruent and provided evidence for the existence of
hybridisation between pairs S. capillata × S. baicalen-
sis, S. capillata × S. krylovii, S. grandis × S. krylovii, S.
grandis × S. baicalensis and S. baicalensis × S. krylovii.
e presence of F2 or more advanced hybrid genera-
tions suggests that F1 individuals are able to reproduce
further. is observation is in agreement with our previ-
ous findings on hybridisation within the genus [32, 33],
where a direct approach proved that a hybrid taxon S.
heptapotamica produces fertile pollen grains and is capa-
ble of backcrossing to primarily one parental species [33].
Indeed, here we detected backcrosses S. baicalensis × S.
krylovii and S. grandis × S. krylovii to their former paren-
tal species. Moreover, the analysis of BABA/ABBA pat-
terns among species revealed signs of past introgression
between S. baicalensis and S. krylovii, S. baicalensis and
S. grandis, S. krylovii and S. grandis. Taking into account
the diversification times of these species, we can hypoth-
esise that if such a gene flow occurred it was relatively
recent in evolutionary terms and seemingly still present
between S. baicalensis and S. krylovii, as well as in pair S.
grandis and S. krylovii. Nevertheless, we treat our BABA/
ABBA analysis with caution due to such a test originally
being applied in human studies whose genome is avail-
able at the chromosome level and the number of SNPs
was remarkably higher than used here. us, we intend
to reassess the past gene flow within these taxa when a
more continuous genome will be obtained. Additionally,
although here we did not detect any backcrosses between
S. baicalensis and S. grandis, it cannot be excluded that
such a combination exists in nature, especially since both
species are mostly common in Mongolia and China;
however specimens from there were not presented in the
study. Lastly, our study revealed only unidirectional back-
crossing of hybrid taxa either to S. baicalensis or S. gran-
dis, but not to S. krylovii. Similarly, due to the relatively
small sample size and the limited number of localities
used here, we cannot draw any reliable conclusions con-
cerning possible barriers to gene flow to the latter taxon.
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Page 12 of 20
Baiakhmetovetal. BMC Plant Biology (2021) 21:505
According to our results, the populations’ expan-
sion seemingly started during the LGP. Nonetheless,
the assessment of genetic structures revealed signs of
contemporary gene flow between populations across
all species, despite significant geographical distances
between some of them. For instance, populations of S.
baicalensis, S. capillata and S. grandis from the eastern
part of Khakassia and the southwestern area of Buryatia
represented either one genetic cluster or were admixed
between two. Among potential explanations are shift-
ing their distributions in response to climate change, or
seasonal migration of wild animals and livestock graz-
ing. While we currently do not possess enough data to
verify the first assumption, seeds of feather grasses usu-
ally are spread naturally by wind or water. On the other
hand, they also can be frequently dispersed by the wool
of mammals including, e.g., sheep, goats and horses [16,
18]. Although this study was not intended to explore
population differences in detail, we believe that our find-
ings regarding gene flow merit further studies in order
to better understand the intraspecific variation and rela-
tionships among populations as well as to discuss poten-
tial consequences of such events.
Our results also illustrated a complex association
between species at the morphological level. ere are
usually few phenotypic characters differentiating hybrid-
ising species and these characters are often function-
ally or developmentally correlated [54]. In the present
case, although the studied species had a set of distinc-
tive characters, the current identification keys do not
provide a solution on distinguishing admixed individu-
als in the section Leiostipa. As a result, only 5 out of 52
hybrid taxa were properly identified based on morphol-
ogy. Furthermore, the results demonstrated that the most
problematic taxon is S. baicalensis, which was frequently
misidentified either with S. capillata or a cross S. capil-
lata × S. krylovii, while a few individuals were mislead-
ingly assigned to S. grandis × S. krylovii. Additionally,
several hybrids between S. baicalensis and S. capil-
lata were determined as pure S. krylovii. erefore, we
believe that the identification keys should be revised in
order to properly delimitate pure species and propose
a taxonomic treatment for the hybrid taxa identified in
the study. Moreover, for a more comprehensive morpho-
logical assessment we suggest using scanning electron
microscopy that can assist in finding unique ultrastruc-
tures among pure and admixed individuals.
Although here we highlight that morphological char-
acters alone cannot be used to properly identify hybrids
and backcrossed individuals in the field, it is a com-
mon issue in plants rather than an exception in feather
grasses. For instance, in a study on three species of wil-
lows it was shown that based on phenotype only 5% of
specimens were classified as introgressed individuals,
which was much less than the 19% detected using SNP
data [55]. Another research on tropical trees demon-
strated that even limited genomic sampling, when com-
bined with morphology and geography, can greatly
improve estimates of species diversity for clades where
hybridisation contributes to taxonomic difficulties [56].
Recently, an investigation on several pine species using
SNP data derived from the DArTseq pipeline revealed
that one species, previously considered of hybrid origin,
is a genetically distinct species and provided insights into
the challenges of solely using morphological traits when
identifying taxa with cryptic hybridisation and variable
morphology [57].
In grasses, hybridisation and introgression phenom-
ena are still mainly studied in crop species, e.g., rice
[58], wheat [59] and sugarcane [60]. To date, we have
detected hybrids not only between genetically closely
related species [33], but also among genetically distant
Stipa taxa [13, 34]. e results present here and our pre-
vious findings helps us to shift toward thinking of the
Stipa phylogeny as reticulate webs rather than a strictly
bifurcating tree. Nonetheless, studying hybridisation in
feather grasses is not only of particular interest to plant
taxonomists. e presence of parental species, multiple
generation hybrids and backcrosses in different propor-
tions in a hybrid zone may indicate renewed sympatry
providing important data for studying species boundaries
and patterns of speciation [61]. From this point of view,
Stipa may be a suitable genus to study these phenomena.
Despite the increasing interest in feather grasses at the
molecular level [19, 20, 48, 62–65], there is still a lack of
substantial knowledge regarding, e.g., chromosome num-
bers of admixed and pure taxa in hybrid zones, fertility
of pollen grains in F1 and later generation hybrids and
backcrosses and genomic information related to specific
loci contributing to reproductive barriers. We believe
that only an integrative approach combining the afore-
said data can properly interpret evolutionary patterns
and processes in feather grasses.
Conclusions
In the current study we revealed a complex taxonomic
issue in feather grasses with variable morphology exhib-
ited due to extensive hybridisation. Based on SNPs
derived from genome-wide genotyping we detected five
genetic groups representing separate morphospecies and
showed that S. baicalensis is a genetically distinct spe-
cies instead of a taxon of hybrid origin as it was previ-
ously hypothesised. We demonstrated the presence of F1
hybrids between S. capillata × S. baicalensis, S. capillata
× S. krylovii, S. grandis × S. krylovii, S. grandis × S. bai-
calensis, S. baicalensis × S. krylovii and F2 individuals in
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Page 13 of 20
Baiakhmetovetal. BMC Plant Biology (2021) 21:505
S. capillata × S. krylovii, S. grandis × S. krylovii, S. bai-
calensis × S. krylovii indicating low levels of reproduc-
tive isolation in these species. We also discovered a few
backcrosses S. baicalensis × S. krylovii and S. grandis
× S. krylovii to their former parental species suggesting
possible introgression among the taxa.
Furthermore, we detected reticulation events between
S. baicalensis and S. krylovii, S. baicalensis and S. grandis,
S. krylovii and S. grandis. On the other hand, we revealed
signs of contemporary gene flow between populations of
the species from the section Leiostipa. Another impor-
tant outcome of the research is divergence date estimates
inferred at the species and population levels. Here we
deduce that diversification within the studied species
started ca. 130–96 kya and hypothesise that climatic
changes during the LGP were a driving force behind the
current distribution of feather grasses. Importantly, here
we also emphasise the usefulness of applying integrative
approaches combining molecular and morphological
data to delimitate species and detect hybridisation and
introgression events in feather grasses. Finally, we con-
clude that Stipa may be a suitable genus to study these
phenomena.
Methods
Plant material
In total, 302 fully developed Stipa samples were used
for molecular and morphological studies. We gathered
individuals representing the section Leiostipa (S. bai-
calensis, S. capillata, S. grandis and S. krylovii) from
localities where these taxa grow together as well as from
areas where they grow separately from each other (Fig.1,
Supplementary TableS1). Additionally, we included two
populations of S. baicalensis previously reported from
the northeastern part of Kazakhstan [46] and 13 speci-
mens of S. glareosa belonging to the section Smirno-
via that were found in localities 10, 12, 13 and 16. All
voucher specimens used in the study are preserved at
TK and KRA. All maps were visualised using ArcGIS Pro
2.7.1 (ESRI, Redlands, USA). e species distribution
ranges were established based on the revision of herbar-
ium specimens preserved at AA, ALTB, B, BM, BRNU,
COLO, E, FR, FRU, G, GAT, GFW, GOET, IFP, K, KAS,
KFTA, KHOR, KRA, KRAM, KUZ, L, LE, LECB, M, MO,
MSB, MW, NY, P, PE, PR, PRC, TAD, TASH, TK, UPS,
W, WA, WU and Z.
DNA extraction, amplication andDArT sequencing
is section was performed according to the previously
reported procedures [34]. In brief, genomic DNA was
isolated from dried leaf tissues using a Genomic Mini AX
Plant Kit (A&A Biotechnology, Poland) and sent to Diver-
sity Arrays Technology Pty Ltd. (Canberra, Australia) for
the following genome complexity reduction using restric-
tion enzymes and high-throughput polymorphism detec-
tion [66].
All DNA samples were processed in digestion/ligation
reactions as described previously [66], but replacing a
single PstI-compatible adaptor with two different adap-
tors corresponding to two different restriction enzyme
overhangs. e PstI-compatible adapter was designed to
include Illumina flowcell attachment sequence, sequenc-
ing primer sequence and “staggered”, varying length bar-
code region, similar to the sequence previously reported
[37]. e reverse adapter contained a flowcell attachment
region and MseI-compatible overhang sequence. Only
“mixed fragments” (PstI-MseI) were effectively ampli-
fied by PCR using an initial denaturation step of 94 °C
for 1 min, followed by 30 cycles with the following tem-
perature profile: denaturation at 94 °C for 20 s, annealing
at 58 °C for 30 s and extension at 72 °C for 45 s, with an
additional final extension at 72 °C for 7 min. After PCR
equimolar amounts of amplification products from each
sample of the 96-well microtiter plate were bulked and
applied to c-Bot (Illumina, USA) bridge PCR followed
by sequencing on Hiseq2500 (Illumina, USA). e single
read sequencing was performed for 77 cycles.
Sequences generated from each lane were processed
using proprietary DArT analytical pipelines. In the pri-
mary pipeline, the fastq files were first processed to
filter away poor quality sequences, applying more strin-
gent selection criteria to the barcode region compared
to the rest of the sequence. In that way the assignments
of the sequences to specific samples carried in the “bar-
code split” step were reliable. Approximately 2.5 mln
sequences per barcode/sample were identified and used
in marker calling.
DArTseq data analysis
For the downstream analyses, we applied co-dominant
single nucleotide polymorphism (SNP) markers pro-
cessed in the R-package dartR v.1.5.5 [67] with the fol-
lowing parameters: (1) a scoring reproducibility of 100%,
(2) SNP loci with read depth < 5 or > 50 were removed, (3)
at least 95% loci called (the respective DNA fragment had
been identified in greater than 95% of all individuals), (4)
monomorphic loci were removed, (5) SNPs that shared
secondaries (had more than one sequence tag repre-
sented in the dataset) were randomly filtered out to keep
only one random sequence tag.
DNA‑based species delimitation
Five approaches were used to analyse the genetic struc-
ture at the species level: (1) Unweighted Pair Group
Method with Arithmetic Mean (UPGMA), (2) fastSTRU
CTU RE analysis, (3) Principal Coordinates Analysis
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Page 14 of 20
Baiakhmetovetal. BMC Plant Biology (2021) 21:505
(PCoA), (4) NewHybrids analysis, (5) calculation of the
f4 statistic. In addition, to assess the genetic differentia-
tion at the population level within S. baicalensis, S. capil-
lata, S. grandis and S. krylovii we performed PCoA and
STRU CTU RE analyses and calculated FST. Furthermore,
we used SNAPP to estimate divergence times within the
studied species and populations.
Firstly, a UPGMA cluster analysis based on Jaccard’s
distance matrix was performed using R-packages dartR
and visualised with stats v.3.6.2 [68]. Next, the genetic
structure was investigated using fastSTRU CTU RE v.1.0,
which implements the Bayesian clustering algorithm
STRU CTU RE, assuming Hardy-Weinberg equilibrium
between alleles, in a fast and resource-efficient man-
ner [69]. A number of clusters (K-values) ranging from 1
to 10 were tested using the default parameters with ten
replicate runs per dataset. e most likely K-value was
estimated with the best choice function implemented
in fastSTRU CTU RE. e output matrices for the best
K-values were reordered and plotted using the R package
pophelperShiny v.2.1.0 [70]. We applied the threshold of
0.10 < q < 0.90 as the most widely utilised measure for the
assessment of hybridisation [71, 72] with q-values > 0.9
being pure species and 0.45 < q < 0.55 being F1 hybrids,
while first- and second-generation backcrosses with
one parent were considered at q-values 0.25 and 0.125,
respectively [73]. en, a PCoA on a Euclidean distance
matrix was performed using the R-package dartR and
visualised with ggplot2 v.3.3.0 [74] to show the first two
components and plotly v.4.9.2 [75] to illustrate the first
three components.
Hybrid generation identication
Next, to assign individuals to a genetic category (pure,
hybrid F1 or F2, backcross hybrid) based on their mul-
tilocus genotypes, we performed analyses using a Bayes-
ian model-based clustering method implemented in
NewHybrids v.1.1 [76] via the R package dartR. e
program utilises Markov chain Monte Carlo (MCMC)
simulations to compute the posterior probability of an
individual belonging to pre-defined categories compar-
ing two parental genotypes at a time. us, we used five
data sets representing parental species and their hybrids
defined by the UPGMA, fastSTRU CTU RE and PCoA
outputs from the previous steps. Specifically, we had the
following datasets: (1) S. baicalensis, S. krylovii and the
admixed individuals between the parental species; (2) S.
baicalensis, S. capillata and the admixed individuals; (3)
S. grandis, S. krylovii and the admixed individuals; (4) S.
capillata, S. krylovii and the admixed individuals; (5) S.
baicalensis, S. grandis and the admixed individuals. Due
to limitations of NewHybrids, only 200 loci were used
per run. In the beginning, the first 200 loci ranked on
information content (option “AvgPIC”) were chosen via
dartR. en, to ensure the convergence of the algorithm,
we used ten different 200-SNP subsets selected in ran-
dom (option “random”). e Jeffreys prior was used for
θ and π and a burn-in of 10,000 MCMC generations fol-
lowed by 10,000 post burn-in sweeps. e resulting pos-
terior probabilities were calculated based on 11 runs and
a probability threshold > 0.8 was set for the assignment
into a genetic category. e calculated posterior prob-
abilities for the assigned categories were visualised using
the R package pophelperShiny.
Testing forintrogression
Further, to calculate the f4 statistic we retained only pure
individuals determined via the UPGMA, fastSTRU CTU
RE and PCoA analyses. All filtering steps were conducted
using the R-package dartR with the above-mentioned
sequence. Next, the processed data was converted into
the EIGENSTRAT format using the R-package dartR.
e subsequent calculation of f4 statistics was performed
in ADMIXTOOLS v. 7.0.1 [77] using the R package
admixr v.0.9.1 [78]. In brief, the f4 statistic [79] is simi-
lar to the D statistic [80, 81] and measures the average
correlation in allele frequency differences between four
populations, e.g., A, B, C and D [82]. If a divergent out-
group is provided as population A, we can test for gene
flow between B and C (if the statistic is negative and
Z-score < − 3) or B and D (if the statistic is positive and
Z-score > 3). Due to such a test originally being applied
to continuous genome-wide data, it is important to men-
tion some limitations of the analysis while working with
non-model organisms. If a draft genome is available,
it is possible to specify positions of SNPs along contigs
or scaffolds by a special parameter “blockname” imple-
mented in ADMIXTOOLS. Nonetheless, the package
currently does not support more than 600 contigs/scaf-
folds [83] decreasing the potential number of used SNPs
for very fragmented genomes. at was the case of Stipa,
where only one draft genome comprising 5931 contigs is
currently available [48]. Additionally, SNPs positions and
genetic distances are used for a block jackknife method
to test for a significant deviation from the null expecta-
tion of the f4 statistic. us, in the absence of a reference
genome or the presence of a very fragmented genome the
f4/D statistics should be treated with caution. Here we
consider signatures of past gene flow events only if BABA
or ABBA patterns are greater than 50. All possible four-
species combinations were tested, while one species, S.
glareosa, was selected as an outgroup (A) for all runs.
Population dierentiation
To perform PCoA and STRU CTU RE analyses and cal-
culate FST at the population level we retained only pure
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Page 15 of 20
Baiakhmetovetal. BMC Plant Biology (2021) 21:505
species and kept populations with more than three indi-
viduals per population. To maintain as many popula-
tions as possible, we merged several individuals growing
relatively close to each other (Table2 and Supplementary
Table S1). All filtering steps were conducted using the
R-package dartR with the above-mentioned sequence.
PCoA analyses were also performed according to the
aforesaid flow. Next, we used STRU CTU RE v.2.3.4 [84]
instead of fastSTRU CTU RE due to the former software
being markedly superior to the latter one under weak
population differentiation [85]. To overcome the rela-
tively slow speed of the analysis we applied parallel com-
puting using StrAuto v.1.0 [86]. Five replicate runs were
performed for each number of clusters (K) from one
to ten with a burn-in of 50,000 iterations followed by
500,000 MCMC iterations. e optimal K value was iden-
tified based on Evanno’s method of ΔK statistics [87] as
implemented in Structure Harvester [88]. e calculated
posterior probabilities for the assigned categories were
visualised using the R package pophelperShiny. en,
the fixation index Fst was calculated using the R-pack-
age dartR with 1000 bootstraps to obtain p-values. is
measure assesses genetic differentiation among popula-
tions, where values of 0.00–0.05 indicate low differentia-
tion, 0.05–0.15 indicate moderate differentiation, while
Fst > 0.15 indicates high levels of differentiation [89].
Divergence‑time estimation
Finally, to estimate divergence times between the studied
taxa, we retained only one pure individual per species
and population using dartR and followed the above-
mentioned sequence of filtering steps with an exception
of the called loci (100% instead of 95%). en, the SNP
data was converted to the Nexus format using dartR.
Subsequently, we created an XML file using the SNAPP-
specific template provided in Stange etal., 2018. SNAPP
v.1.5.1 [90] utilises the multi-species coalescent approach
and is well-suited for analyses of genome-wide data
deducing the species tree and divergence times directly
from SNPs [91]. We applied one time calibration, setting
a log-normal distribution with a mean of 4.39 Mya and
a standard deviation (SD) of 0.18 for the split between
S. glareosa (section Smirnovia) and S. capillata (sec-
tion Leiostipa) as it was inferred in our previous study
[48]. e analysis was performed three times indepen-
dently, 1.25 million MCMC generations for each run
using BEAST2 v.2.6.3 [92]. Tracer v.1.7.1 [93] was used
to visually check the combined log file regarding Effec-
tive Sample Size (ESS) values. As all ESSs exceeded 200,
we combined tree files using LogCombiner v.2.6.3 (a part
of the BEAST package) with the first 10% discarded as
burn-in from each run. e final maximum clade cred-
ibility tree was summarised in TreeAnnotator v.2.6.3 (a
part of the BEAST package). Lately, we visualised a pat-
tern across all of the posterior trees via DensiTree v.2.01
[94], while FigTree v.1.4.4 [95] was used to inspect the
Bayesian posterior probabilities and the 95% credibility
intervals of the final tree.
Morphological analysis
A total of 302 specimens were examined under a light
microscope SMZ800 (Nikon, Japan) across the 17 most
informative quantitative and six qualitative morphologi-
cal characters commonly used in keys and taxonomic
descriptions of Stipa (Table3). Firstly, the Shapiro-Wilk
test was used in the R-package MVN v.5.8 [96] to assess
the normality of the distribution of each characteristic.
Secondly, the non-parametric Spearman’s correlation test
was applied using R-packages stats and Hmisc v.4.3–1
[97] to examine relations between the studied characters.
Table 2 Analysed populations
Species Populations and their localities
according to Fig.1 and
Supplementary TableS1
Number of
individuals per
population
S. baicalensis Population 1 (localities 5, 6 and 7) 5
Population 2 (locality 11) 8
Population 3 (locality 13) 15
Population 4 (locality 16) 8
S. capillata Population 1 (locality 1) 3
Population 2 (locality 3) 7
Population 3 (locality 4) 7
Population 4 (locality 5) 6
Population 5 (locality 8) 19
Population 6 (locality 9) 9
Population 7 (locality 19) 6
Population 8 (locality 21) 5
Population 9 (localities 25, 29 and 30) 4
S. grandis Population 1 (locality 5) 3
Population 2 (locality 11) 7
Population 3 (locality 14) 7
Population 4 (locality 16) 6
Population 5 (localities 15 and 17) 8
Population 6 (locality 18) 8
Population 7 (locality 23) 7
Population 8 (locality 24) 7
S. krylovii Population 1 (locality 11) 15
Population 2 (locality 15) 7
Population 3 (locality 17) 6
Population 4 (locality 18) 8
Population 5 (locality 20) 7
Population 6 (locality 21) 10
Population 7 (locality 24) 6
Population 8 (localities 26, 27 and 28) 20
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Page 16 of 20
Baiakhmetovetal. BMC Plant Biology (2021) 21:505
e combined correlogram with the significance test was
visualised using the R-package corrplot v.0.84 [98].
Next, a Factor Analysis of Mixed Data (FAMD) [99]
was accomplished using the R-package FactoMineR v.2.3
[100] to characterise the variation within and among
groups of taxa without a priori taxonomic classification
and to extract the variables that best identified them. e
number of principal components used in the analysis was
chosen based on Cattell’s scree test [101]. R-packages
factoextra v.1.0.6 [102] and plotly were used to visualise
the first two and the first three principal components,
respectively. Subsequently, the plots were supplemented
with the result of the fastSTRU CTU RE analysis for the
best K = 5.
Additionally, to evaluate distributional relationships
between each response variable and the studied taxa,
notch plots and interactive box plots were created using
R-packages ggplot2 v.3.3.0 and plotly v.4.9.2, respec-
tively. e notched box plots display a confidence inter-
val around the median, which is normally based on the
median ± 1.57 × interquartile range/square root of n.
According to this graphical method for data analysis,
if the notches of the two boxes do not overlap, there is
“strong evidence” (95% confidence) that their medi-
ans differ. In addition, to reveal significant differences
between means of particular characters across all exam-
ined taxa the nonparametric Kruskal-Wallis test followed
by the post-hoc Wilcoxon test with Bonferroni correction
were performed using the R-package stats v.3.6.2.
Supplementary Information
The online version contains supplementary material available at https:// doi.
org/ 10. 1186/ s12870- 021- 03287-w.
Additional le1. Interactive box plots.
Additional le2: Supplementary TableS1. List of samples used in the
study. TableS2. Average posterior probabilities inferred in NewHybrids
for first (F1) and second (F2) generation hybrids and backcrosses (F1xP1
and F1xP2). TableS3. Pairwise Fst values for population differentiation
across the four studied species. Fst > 0.15 indicating high levels of dif-
ferentiation are in bold type. TableS4. Contribution (%) by dimension
of each character (abbreviations according to Table 3) in FAMD. The first
five characters contributing the most are in bold type. Abbreviations of
the qualitative variables and their contributions to the principal axes are
underlined. TableS5. The assigned species names based on morphologi-
cal and molecular data. Mismatches are shown in bold type. Supplemen‑
tary Figure S1 Venn diagram representing polymorphic SNPs among four
pure Stipa species. The admixed individuals and S. glareosa, which did not
show patterns of hybridisation, were omitted in the metric’s calculation.
Table 3 Morphological characters used in the present study
Character Abbreviation
Quantitative characters (mm)
Width of blades of vegetative shoots WVS
Length of ligules of the middle cauline leaves LigC
Length of ligules of the internal vegetative shoots LigIV
Length of the lower glume LG
Length of the anthecium AL
Length of the callus CL
Length of the callus base CBL
Length of hairs on the dorsal line on the lemma LHD
Length of hairs on the ventral line on the lemma LHV
Distance from the end of the dorsal line of hairs to the top of the lemma DDL
Distance from the end of the ventral line of hairs to the top of the lemma DVL
Length of hairs on the top of the lemma LHTA
Length of the lower segment of the awn Col1L
Length of the middle segment of the awn Col2L
Length of the seta SL
Length of hairs on the lower segment of the awn HLCol1
Length of hairs on the middle segment of the awn HLCol2
Qualitative characters
Character of the abaxial surface of vegetative leaves (glabrous, with prickles) AbSVL
Character of the adaxial surface of vegetative leaves (short hairs, long hairs, mixed) AdSVL
Type of the awn geniculation (single, double) AG
Character of nodes (glabrous, with hairs) CN
Type of hairs on the top of the anthecium (glabrous, poor developed, well developed) HTTA
Presence of hairs below nodes (glabrous, with hairs) PHBN
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Page 17 of 20
Baiakhmetovetal. BMC Plant Biology (2021) 21:505
Supplementary Figure S2 Delta K values calculated by Evanno’s method
across four species. (a) S. baicalensis. (b) S. capillata. (c) S. grandis. (d) S.
krylovii. Supplementary Figure S3 Phylogeny (at the top) and divergence
date estimates at the species level (on the bottom) inferred by SNAPP. The
scale shows divergence time in Mya. The red circles indicate nodes with
the Bayesian posterior probabilities (BPP) > 0.8. The lower-case letters refer
to the embedded table containing data regarding the exact estimates of
the divergence times (in kya), BPPs and 95% HPD intervals. Supplemen‑
tary Figure S4 Correlation matrix of the studied morphological characters
(abbreviations according to Table 3). Colour intensity and the size of the
circle are proportional to the correlation coefficients (displayed in the
circle). Positive correlations are blue while negative are red. All p-values of
Pearson correlations were < 0.01. Supplementary Figure S5 Factor analy-
sis of mixed data performed on 17 quantitative and six qualitative char-
acters of the five examined species of Stipa. (a) Plot of the principal axes
one and two. (b) Plot of the principal axes one and three. (c) Plot of the
principal axes one and four. (d) Plot of the principal axes two and three.
(e) Plot of the principal axes two and four. (f) Plot of the principal axes
three and four. The pie charts represent the proportions of membership
established by fastSTRU CTU RE for the best K = 5. Supplementary Figure
S6 Notched boxplot demonstrating the mean (white circle), the median
(dark black line), 95% confidence interval around the median (notch),
inter-quartile ranges (25 to 75%), whiskers (5 and 95%) and minimum and
maximum measurements (crosses) of quantitative characters (a-q) for the
studied species. Statistical significance was tested by Wilcoxon rank-sum
test for post hoc group comparisons with Bonferroni correction, p < 0.001,
p < 0.01, p < 0.05, p < 0.1 and p < 1 noted as ‘***’, ‘**’, ‘*’, ‘.’ and no symbol,
respectively. Due to the small sample size, p-values cannot be properly
estimated for S. grandis × S. krylovii and S. grandis × S. baicalensis. Each
dot represents an observation. Supplementary Figure S7 Bar charts
displaying frequencies of the qualitative characters. (a) AbSVL. (b) AdSVL.
(c) HTTA. (d) AG. (e) CN. (f) PHBN.
Acknowledgements
We would like to express our gratitude to two anonymous reviewers for
providing valuable comments on the manuscript.
Authors’ contributions
E.B., M.N., P.D.G. supervised the study. P.D.G., M.N. planned the field studies, carried
out the sampling, revised herbarium materials and performed the taxonomic iden-
tification. P.D.G., M.N. established the general distribution of the species and D.R.
created the map. D.R., E.B., P.D.G. conducted the morphological measurements. E.B.
performed all the analyses and wrote the manuscript. All authors revised the draft,
provided comments and approved the final version of the manuscript.
Funding
The study was supported by a RSF grant (project no. 19–74-10067), partially
by a DS grant of the Jagiellonian University (446.31150.2.2020) and by the
National Science Centre, Poland (grant no. 2018/29/B/NZ9/00313). The open-
access publication of this article was funded by the BioS Priority Research Area
under the program “Excellence Initiative – Research University” at the Jagiel-
lonian University in Krakow.
Availability of data and materials
The SNP dataset derived from the DArTseq pipeline in the genlight format is
available via Figshare repository, https:// doi. org/ 10. 6084/ m9. figsh are. 14461 802.
Declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Author details
1 Institute of Botany, Faculty of Biology, Jagiellonian University, Gronostajowa
3, 30-387 Kraków, Poland. 2 Research laboratory ‘Herbarium’, National Research
Tomsk State University, Lenin 36 Ave., 634050 Tomsk, Russia. 3 Depar tment
of Biology, Altai State University, Lenin 61 Ave., 656049 Barnaul, Russia.
Received: 23 April 2021 Accepted: 20 October 2021
References
1. Mora C, Tittensor DP, Adl S, Simpson AG, Worm B. How many species
are there on earth and in the ocean? PLoS Biol 2011;9(8):e1001127.
https:// doi. org/ 10. 1371/ journ al. pbio. 10011 27.
2. Christenhusz MJM, Byng JW. The number of known plants species in
the world and its annual increase. Phytotaxa. 2016;261(3):201–217.
https:// doi. org/ 10. 11646/ phyto taxa. 261.3.1
3. Dayrat B. Towards integrative taxonomy. Biol J Linn Soc 2005;83:407–
415. https:// doi. org/ 10. 1111/j. 1095- 8312. 2005. 00503.x.
4. Raupach MJ, Amann R, Wheeler Q, Roos C. The application of "-omics"
technologies for the classification and identification of animals. Org
Divers Evol 2016;16:1–12. https:// doi. org/ 10. 1007/ s13127- 015- 0234-6.
5. Massó S, López-Pujol J, Vilatersana R. Reinterpretation of an endangered
taxon based on integrative taxonomy: the case of Cynara baetica (Com-
positae). PLoS One 2018;13(11):e0207094. https:// doi. org/ 10. 1371/ journ
al. pone. 02070 94.
6. Lovrenčić L, Bonassin L, Boštjančić LL, Podnar M, Jelić M, Klobučar
G, et al. New insights into the genetic diversity of the stone crayfish:
taxonomic and conservation implications. BMC Evol Biol 2020;20:146.
https:// doi. org/ 10. 1186/ s1286 2e- 020- 01709-1.
7. Hajibabaei M, Singer GAC, Hebert PDN, Hickey DA. DNA barcoding:
how it complements taxonomy, molecular phylogenetics and popula-
tion genetics. Trends Genet 2007;23(4):167–172. https:// doi. org/ 10.
1016/j. tig. 2007. 02. 001.
8. Kress WJ. Plant DNA barcodes: applications today and in the future. J
Syst Evol 2017;55(44):291–307. https:// doi. org/ 10. 1111/ jse. 12254.
9. Tizard J, Patel S, Waugh J, Tavares E, Bergmann T, Gill B, et al. DNA bar-
coding a unique avifauna: an important tool for evolution, systematics
and conservation. BMC Evol Biol 2019;19:52. https:// doi. org/ 10. 1186/
s12862- 019- 1346-y.
10. CBOL Plant Working Group. A DNA barcode for land plants. P Natl Acad
Sci USA 2009;106(31):12794–12797. https:// doi. org/ 10. 1073/ pnas. 09058
45106.
11. Li DZ, Gao LM, Li HT, Wang H, Ge XJ, Liu JQ, et al. Comparative analysis
of a large dataset indicates that internal transcribed spacer (ITS) should
be incorporated into the core barcode for seed plants. P Natl Acad Sci
USA. 2011;108(49):19641–19646. https:// doi. org/ 10. 1073/ pnas. 11045
51108.
12. Nobis M. Taxonomic revision of the central Asiatic Stipa tianschanica
complex (Poaceae) with particular reference to the epidermal micro-
morphology of the lemma. Folia Geobot 2014;49:283–308. https:// doi.
org/ 10. 1007/ s12224- 013- 9164-2.
13. Nobis M, Gudkova P, Nowak A, Sawicki J, Nobis A. A revision of the genus
Stipa (Poaceae) in middle Asia, including a key to species identification,
an annotated checklist and phytogeographic analysis. Ann Mo Bot Gard
2020;105:1–63. https:// doi. org/ 10. 3417/ 20193 78.
14. Hamasha HR, von Hagen KB, Röser M. Stipa (Poaceae) and allies in
the Old World: molecular phylogenetics realigns genus circumscrip-
tion and gives evidence on the origin of American and Australian
lineages. Plant Syst Evol 2012;298:351–367. https:// doi. org/ 10. 1007/
s00606- 011- 0549-5.
15. Romaschenko K, Peterson PM, Soreng RJ, Garcia-Jacas N, Futorna O,
Susanna A. Systematics and evolution of the needle grasses (Poaceae:
Pooideae: Stipeae) based on analysis of multiple chloroplast loci, ITS,
and lemma micromorphology. Taxon. 2012;61:18–44. https:// doi. org/
10. 1002/ tax. 611002.
16. Tzvelev NN. Zlaki USSR. Leningrad: Nauka Press; 1976.
17. Tzvelev NN. Notes on the tribe Stipeae Dumort. (Poaceae). Novosti Sist
Vyssh Rast. 2012;43:20–9.
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Page 18 of 20
Baiakhmetovetal. BMC Plant Biology (2021) 21:505
18. Freitag H. The genus Stipa (Gramineae) in southwest and South Asia.
Notes Roy Bot Gard Edinburgh. 1985;42:355–489.
19. Krawczyk K, Nobis M, Nowak A, Szczecińska M, Sawicki J. Phylogenetic
implications of nuclear rRNA IGS variation in Stipa L. (Poaceae). Sci Rep.
2017;7:11506. https:// doi. org/ 10. 1038/ s41598- 017- 11804-x.
20. Krawczyk K, Nobis M, Myszczyński K, Klichowska E, Sawicki J. Plastid
superbarcodes as a tool for species discrimination in feather grasses
(Poaceae: Stipa). Sci Rep 2018;8:1924. https:// doi. org/ 10. 1038/
s41598- 018- 20399-w.
21. Smirnov PA. Stiparum Armeniae minus cognitarum descriptiones. Byull
Moskovsk Obshch Isp Prir Otd Biol. 1970;75:113–5.
22. Kotukhov YA. Synopsis of feather grass (Stipa L.) and false needlegrasses
(Ptilagrostis Griseb.) the eastern of Kazakhstan (the Kazakh Altai, Zaisan
valley and Prialtayskie ranges). Bot Issl Sib Kazakhst. 2002;8:3–16.
23. Nobis M. Taxonomic revision of the Stipa lipskyi group (Poaceae: Stipa
section Smirnovia) in the Pamir alai and tian-Shan Mountains. Plant Syst
Evol 2013;299:1307–1354. https:// doi. org/ 10. 1007/ s00606- 013- 0799-5.
24. Mallet J. Hybridization as an invasion of the genome. Trends Ecol Evol
2005;20:229–237. https:// doi. org/ 10. 1016/j. tree. 2005. 02. 010.
25. Hegar ty MJ, Hiscock SJ. Hybrid speciation in plants: new insights from
molecular studies. New Phytol 2005;165:411–423. https:// doi. org/ 10.
1111/j. 1469- 8137. 2004. 01253.x.
26. Abbott R, Albach D, Ansell S, Arntzen JW, Baird SJ, Bierne N, et al.
Hybridization and speciation. J Evol Biol 2013;26:229–246. https:// doi.
org/ 10. 1111/j. 1420- 9101. 2012. 02599.x.
27. Matsuoka Y, Takumi S, Nasuda S. Genetic mechanisms of allopolyploid
speciation through hybrid genome doubling: novel insights from wheat
(Triticum and Aegilops) studies. Int Rev Cel Mol Bio 2014;309:199–258.
https:// doi. org/ 10. 1016/ b978-0- 12- 800255- 1. 00004-1.
28. Gordon SP, Contreras-Moreira B, Levy JJ, Djamei A, Czedik-Eysenberg
A, Tartaglio VS, et al. Gradual polyploid genome evolution revealed
by pan-genomic analysis of Brachypodium hybridum and its diploid
progenitors. Nat Commun 2020;11:3670. https:// doi. org/ 10. 1038/
s41467- 020- 17302-5.
29. Rieseberg LH, Ellstrand NC. What can molecular and morpho-
logical markers tell us about plant hybridization. Crit Rev Plant Sci
1993;12:213–241. https:// doi. org/ 10. 1080/ 07352 68930 97019 02.
30. Hardig TM, Brunsfeld SJ, Fritz RS, Morgan M, Orians CM. Morphological
and molecular evidence for hybridization and introgression in a willow
(Salix) hybrid zone. Mol Ecol 2000;9:9–24. https:// doi. org/ 10. 1046/j.
1365- 294x. 2000. 00757.x.
31. Rieseberg L, Archer M, Wayne R. Transgressive segregation, adaptation
and speciation. Heredity. 1999;83:363–372. https:// doi. org/ 10. 1038/ sj.
hdy. 68861 70.
32. Nobis M, Nowak A, Nobis A, Nowak S, Żabicka J, Żabicki P. Stipa ×fallax
(Poaceae: Pooideae: Stipeae), a new natural hybrid from Tajikistan, and
a new combination in Stipa drobovii. Phytotaxa. 2017;303:141–154.
https:// doi. org/ 10. 11646/ phyto taxa. 303.2.4
33. Nobis M, Gudkova PD, Baiakhmetov E, Żabicka J, Krawczyk K, Sawicki
J. Hybridisation, introgression events and cryptic speciation in Stipa
(Poaceae): a case study of the Stipa heptapotamica hybrid-complex.
Perspect Plant Ecol Evol Syst 2019;39:125457. https:// doi. org/ 10. 1016/j.
ppees. 2019. 05. 001.
34. Baiakhmetov E, Nowak A, Gudkova PD, Nobis M. Morphological and
genome-wide evidence for natural hybridisation within the genus
Stipa (Poaceae). Sci Rep 2020;10:13803. https:// doi. org/ 10. 1038/
s41598- 020- 70582-1.
35. Nie B, Jiao BH, Ren LF, Gudkova PD, Chen WL, Zhang WH. Integrative
taxonomy recognized a new cryptic species within Stipa grandis from
loess plateau of China. J Syst Evol 2020. https:// doi. org/ 10. 1111/ jse.
12714.
36. Baird NA, Etter PD, Atwood TS, Currey MC, Shiver AL, Lewis ZA, et al.
Rapid SNP discovery and genetic mapping using sequenced RAD
markers. PLoS One 2008;3:e3376. https://dx.doi.org/10.1371%2Fjournal.
pone.0003376
37. Elshire RJ, Glaubitz JC, Sun Q, Poland JA, Kawamoto K, Buckler ES, et al.
A robust, simple genotyping-by-sequencing (GBS) approach for high
diversity species. PLoS One 2011;6:e19379. https:// doi. org/ 10. 1371/
journ al. pone. 00193 79.
38. Poland JA, Rife TW. Genotyping-by-sequencing for plant breeding and
genetics. Plant Genome 2012;5:92–102. https:// doi. org/ 10. 3835/ plant
genom e2012. 05. 0005.
39. Schilling MP, Gompert Z, Li FW, Windham MD, Wolf PG. Admixture,
evolution, and variation in reproductive isolation in the Boechera
puberula clade. BMC Evol Biol 2018;18:61. https:// doi. org/ 10. 1186/
s12862- 018- 1173-6.
40. Wagner ND, Gramlich S, Hörandl E. RAD sequencing resolved phyloge-
netic relationships in European shrub willows (Salix L. subg. Chamaetia
and subg. Vetrix) and revealed multiple evolution of dwarf shrubs. Ecol
Evol 2018;8(16):8243–8255. https:// doi. org/ 10. 1002/ ece3. 4360.
41. Hodkinson TR, Perdereau A, Klaas M, Cormican P, Barth S. Genotyping
by sequencing and plastome analysis finds high genetic variability and
geographical structure in Dactylis glomerata L in Northwest Europe
despite lack of ploidy variation Agronomy 2019;9:342. https:// doi. org/
10. 3390/ agron omy90 70342.
42. Gudkova PD, Olonova MV, Feoktisov DS. The comparison of ecologo-
climatic niches of two species feather grass Stipa sareptana a.K. Becker
and S. krylovii Roshev. (Poaceae). Ukr J Ecol 2017;7:263–269. https:// doi.
org/ 10. 15421/ 2017_ 115
43. Tzvelev NN. Gramineae. In: Grubov VI, editor. Plantae Asiae Centralis
(secus materies Instituti Botanici nomine V. L. Komarovii). Leningrad:
Nauka; 1968. p. 1–243.
44. Wu ZL, Phillips SM. Stipa. In: Wu ZY, et al., editors. Flora of China. Beijing:
Science Press; 2006. p. 196–203.
45. Tzvelev NN, Probatova NS. Grasses of Russia. Moscow: KMK Scientific
Press; 2019.
46. Kotukhov YA, Anufrieva OA. Addition to the flora of Kazakhstan. Terra.
2008:48–55.
47. Curtu AL, Gailing O, Finkeldey R. Evidence for hybridization and intro-
gression within a species-rich oak (Quercus spp.) community. BMC Evol
Biol 2007;7:218. https:// doi. org/ 10. 1186/ 1471- 2148-7- 218.
48. Baiakhmetov E, Guyomar C, Shelest E, Nobis M, Gudkova PD. The first
draft genome of feather grasses using SMRT sequencing and its impli-
cations in molecular studies of Stipa. Sci Rep 2021;11:15345. https:// doi.
org/ 10. 1038/ s41598- 021- 94068-w.
49. Dur vasula A, Fulgione A, Gutaker RM, Alacakaptan SI, Flood PJ, Neto
C, et al. (2017). African genomes illuminate the early history and
transition to selfing in Arabidopsis thaliana. P Natl Acad Sci USA.
2017;114(20):5213–5218. https:// doi. org/ 10. 1073/ pnas. 16167 36114.
50. Cornejo-Romero A, Vargas-Mendoza CF, Aguilar-Martínez GF, Medina-
Sánchez J, Rendón-Aguilar B, Valverde PL, et al. Alternative glacial-
interglacial refugia demographic hypotheses tested on Cephalocereus
columna-trajani (Cactaceae) in the intertropical Mexican drylands. PLoS
One 2017;12(4):e0175905. https:// doi. org/ 10. 1371/ journ al. pone. 01759
05.
51. Long C, Kubatko L. The effect of gene flow on coalescent-based
species-tree inference. Syst Biol 2018;67(5):770–785. https:// doi. org/ 10.
1093/ sysbio/ syy020.
52. Jaio X, Flouri T, Rannala B, Yang Z. The impact of cross-species gene flow
on species tree estimation. Syst Biol 2020;69(5):830–847. https:// doi.
org/ 10. 1093/ sysbio/ syaa0 01.
53. Jiao X, Yang Z. Defining species when there is gene flow. Syst Biol
2021;70(1):108–119. https:// doi. org/ 10. 1093/ sysbio/ syaa0 52.
54. Rieseberg L, Wendel J. lntrogression and Its Consequences in Plants. In:
Harrison RG, editor. Hybrid Zones and the Evolutionary Process. Oxford:
Oxford University Press; 1993. p. 70–109.
55. Fogelqvist J, Verkhozina AV, Katyshev AI, Pucholt P, Dixelius C,
Rönnberg-Wästljung AC, et al. Genetic and morphological evidence
for introgression between three species of willows. BMC Evol Biol
2015;15:193. https:// doi. org/ 10. 1186/ s12862- 015- 0461-7.
56. Federman S, Donoghue MJ, Daly DC, Eaton DAR. Reconciling species
diversity in a tropical plant clade (Canarium, Burseraceae). PLoS One
2018;13(6):e0198882. https:// doi. org/ 10. 1371/ journ al. pone. 01988 82.
57. Buck R, Hyasat S, Hossfeld A, Flores-Rentería L. Patterns of hybridization
and cryptic introgression among one- and four-needled pinyon pines.
Ann Bot 2020;126(3):401–411. https:// doi. org/ 10. 1093/ aob/ mcaa0 45.
58. Civán P, Brown TA. Role of genetic introgression during the evolution
of cultivated rice (Oryza sativa L.). BMC Evol Biol. 2018;18:57. https:// doi.
org/ 10. 1186/ s12862- 018- 1180-7.
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Page 19 of 20
Baiakhmetovetal. BMC Plant Biology (2021) 21:505
59. Cheng H, Liu J, Wen J, Nie X, Xu L, Chen N, et al. Frequent intra- and
inter-species introgression shapes the landscape of genetic variation
in bread wheat. Genome Biol 2019;20:136. https:// doi. org/ 10. 1186/
s13059- 019- 1744-x.
60. Pachakkil B, Terajima Y, Ohmido N, Ebina M, Irei S, Hayashi H, et al.
Cytogenetic and agronomic characterization of intergeneric hybrids
between Saccharum spp hybrid and Erianthus arundinaceus Sci Rep
2019;9:1748. https:// doi. org/ 10. 1038/ s41598- 018- 38316-6.
61. Harrison RG, Larson EL. Hybridization, introgression, and the nature of
species boundaries, J Hered 2014;105(1):795–809. https:// doi. org/ 10.
1093/ jhered/ esu033.
62. Yang YQ, Li X, Kong X, Ma L, Hu X, Yang Y. Transcriptome analysis reveals
diversified adaptation of Stipa purpurea along a drought gradient on
the Tibetan plateau. Funct Integr Genomic 2015;15:295–307. https://
doi. org/ 10. 1007/ s10142- 014- 0419-7.
63. Wan D, Wan Y, Hou X, Ren W, Ding Y, Sa R. De novo assembly and tran-
scriptomic profiling of the grazing response in Stipa grandis. PLoS One
2015;10:e0122641. https:// doi. org/ 10. 1371/ journ al. pone. 01226 41.
64. Schuber t M, Marcussen T, Meseguer AS, Fjellheim S. The grass subfamily
Pooideae: cretaceous–Palaeocene origin and climate-driven Cenozoic
diversification. Glob Ecol Biogeogr 2019;28:1168–1182. https:// doi. org/
10. 1111/ geb. 12923.
65. Yan D, Ren J, Liu J, Ding Y, Niu J. De novo assembly, annotation, marker
discovery, and genetic diversity of the Stipa breviflora Griseb. (Poaceae)
response to grazing. PLOS One. 2020;15(12): e0244222. https:// doi. org/
10. 1371/ journ al. pone. 02442 22
66. Kilian A, Wenzl P, Huttner E, Carling J, Xia L, Blois H, et al. Diversity
arrays technology: a generic genome profiling technology on open
platforms. Methods Mol Biol 2012;888:67–89. https:// doi. org/ 10. 1007/
978-1- 61779- 870-2_5.
67. Gruber B, Unmack PJ, Berry OF, Georges A dartr: An r package to
facilitate analysis of SNP data generated from reduced representation
genome sequencing Mol Ecol Resour 2018;18:691–699. https:// doi. org/
10. 1111/ 1755- 0998. 12745.
68. R Core Team. R: A language and environment for statistical computing.
R Foundation for Statistical Computing. 2021. https:// www.R- proje ct.
org. Accessed 23 Apr 2021.
69. Raj A, Stephens M, Pritchard JK. fastSTRU C TU RE: variational inference
of population structure in large SNP data sets. Genetics. 2014;197:573–
589. https:// doi. org/ 10. 1534/ genet ics. 114. 164350.
70. Francis RM. POPHELPER: an R package and web app to analyse and
visualize population structure. Mol Ecol Resour 2017;17(1):27–32.
https:// doi. org/ 10. 1111/ 1755- 0998. 12509.
71. Burgarella C, Lorenzo Z, Jabbour-Zahab R, Lumaret R, Guichoux E, Petit
RJ, et al. Detection of hybrids in nature: application to oaks (Quercus
suber and Q. ilex). Heredity. 2009;102:442–452. https:// doi. org/ 10. 1038/
hdy. 2009.8.
72. Winkler KA, Pamminger-Lahnsteiner B, Wanzenböck J, Weiss S. Hybridi-
zation and restricted gene flow between native and introduced stocks
of Alpine whitefish (Coregonus sp.) across multiple environments.
Mol Ecol. 2011;20:456–472. https:// doi. org/ 10. 1111/j. 1365- 294X. 2010.
04961.x.
73. Beugin MP, Gayet T, Pontier D, Devillard S, Jombart T. A fast likelihood solution
to the genetic clustering problem. Methods Ecol Evol 2018;9:1006–1016.
https:// doi. org/ 10. 1111/ 2041- 210X. 12968.
74. Wickham H. ggplot2: elegant graphics for data analysis. New York:
Springer; 2016.
75. Sievert C, Parmer C, Hocking T, Chamberlain S, Ram K, Corvellec
M, et al. plotly: create interactive web graphics via ’plotly.js’. 2021.
https:// rdrr. io/ cran/ plotly. Accessed 23 Apr 2021.
76. Anderson EC, Thompson EA. A model-based method for iden-
tifying species hybrids using multilocus genetic data. Genetics.
2002;160(3):1217–29.
77. Patterson N, Moorjani P, Luo Y, Mallick S, Rohland N, Zhan Y, et al.
Ancient admixture in human history. Genetics. 2012;192(3):1065–
1093. https:// doi. org/ 10. 1534/ genet ics. 112. 145037.
78. Petr M, Vernot B, Kelso J. Admixr–R package for reproducible analyses
using ADMIXTOOLS. Bioinformatics. 2019;35(17):3194–3195. https://
doi. org/ 10. 1093/ bioin forma tics/ btz030.
79. Reich D, Thangaraj K, Patterson N, Price AL, Singh L. Reconstructing
Indian population history. Nature. 2009;461:489–494. https:// doi. org/
10. 1038/ natur e08365.
80. Green RE, Krause J, Briggs AW, Maricic T, Stenzel U, Kircher M, et al. A
draft sequence of the Neandertal genome. Science. 2010;328:710–
722. https:// doi. org/ 10. 1126/ scien ce. 11880 21.
81. Durand EY, Patterson N, Reich D, Slatkin M. Testing for ancient
admixture between closely related populations. Mol Biol Evol
2011;28:2239–2252. https:// doi. org/ 10. 1093/ molbev/ msr048.
82. Lipson M. Applying f4-statistics and admixture graphs: theory and
examples. Mol Ecol Resour 2020;20(6):1658–1667. https:// doi. org/ 10.
1111/ 1755- 0998. 13230.
83. Taylor RS, Manseau M, Horn RL, Keobouasone S, Golding GB, Wilson
PJ. The role of introgression and ecotypic parallelism in delineating
intraspecific conservation units. Mol Ecol 2020;29(15):2793–2909.
https:// doi. org/ 10. 1111/ mec. 15522.
84. Pritchard JK, Stephens M, Donnelly P. Inference of population struc-
ture using multilocus genotype data. Genetics. 2000;155(2):945–59.
85. Stift M, Kolář F, Meirmans PG. STRU CTU RE is more robust than other
clustering methods in simulated mixed-ploidy populations. Heredity.
2019;123:429–441. https:// doi. org/ 10. 1038/ s41437- 019- 0247-6.
86. Chhatre VE, Emerson KJ. StrAuto: automation and parallelization of
STRU CTU RE analysis. BMC Bioinformatics 2017;18:192. https:// doi.
org/ 10. 1186/ s12859- 017- 1593-0.
87. Evanno G, Regnaut S, Goudet J. Detecting the number of clusters of
individuals using the software STRU CTU RE: a simulation study. Mol
Ecol 2005;14(8):2611–2620. https:// doi. org/ 10. 1111/j. 1365- 294X. 2005.
02553.x.
88. Earl DA, vonHoldt BM. STRU CTU RE HARVESTER: a website and
program for visualizing STRU CTU RE output and implementing the
Evanno method. Conserv Genet Resour 2012;4(2):359–361. https://
doi. org/ 10. 1007/ s12686- 011- 9548-7.
89. Hartl DL, Clark AG. Principles of population genetics. 3rd ed. Sunder-
land: Sinauer Associates; 1997.
90. Bryant D, Bouckaert R, Felsenstein J, Rosenberg NA, RoyChoudhury
A. Inferring species trees directly from Biallelic genetic markers:
bypassing gene trees in a full coalescent analysis. Mol Biol Evol
2012;29(8):1917–1932. https:// doi. org/ 10. 1093/ molbev/ mss086.
91. Stange M, Sánchez-Villagra MR, Salzburger W, Matschiner M. Bayesian
divergence-time estimation with genome-wide single-nucleotide
polymorphism data of sea catfishes (Ariidae) supports Miocene closure
of the Panamanian isthmus. Syst Biol 2018;67(4):681–699. https:// doi.
org/ 10. 1093/ sysbio/ syy006.
92. Bouck aert R, Vaughan TG, Barido-Sottani J, Duchêne S, Fourment M,
Gavryushkina A, et al. BEAST 2.5: An advanced software platform for
Bayesian evolutionary analysis. Plos Comput Biol. 2019;15(4):e1006650.
https:// doi. org/ 10. 1371/ journ al. pcbi. 10066 50.
93. Rambaut A, Drummond AJ, Xie D, Baele G, Suchard MA. Posterior
summarization in Bayesian Phylogenetics using tracer 1.7. Syst Biol
2018;67(5):901–904. https:// doi. org/ 10. 1093/ sysbio/ syy032.
94. Bouck aert RR. DensiTree: making sense of sets of phylogenetic trees.
Bioinformatics. 2010;26(10):1372–1373. https:// doi. org/ 10. 1093/ bioin
forma tics/ btq110.
95. Rambaut A. Figtree v1.4.4. 2021. http:// tree. bio. ed. ac. uk/ softw are/ figtr
ee. Accessed 23 Apr 2021.
96. Korkmaz S, Goksuluk D, Zararsiz G. Mvn: an r package for assessing
multivariate normality. R J 2014;6:151–162. https:// doi. org/ 10. 32614/
rj- 2014- 031
97. Harrell FEJ. Hmisc: Harrell Miscellaneous. 2021. https:// CRAN.R- proje ct.
org/ packa ge= Hmisc. Accessed 23 Apr 2021.
98. Wei T, Simko V. R package "Corrplot": visualization of a correlation
matrix. 2021. https:// github. com/ taiyun/ corrp lot. Accessed 23 Apr
2021.
99. Pagès J. Analyse Factorielle de Donnees Mixtes. Rev Stat Appl.
2004;4:93–111.
100. Lê S, Josse J, Husson F. FactoMineR: an R package for multivariate analy-
sis. J Stat Softw 2008;25(1):1–18. https:// doi. org/ 10. 18637/ jss. v025. i01
101. Cattell RB. The scree test for the number of factors. Multivariate Behav
Res. 1966;1:245–76.
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Page 20 of 20
Baiakhmetovetal. BMC Plant Biology (2021) 21:505
•
fast, convenient online submission
•
thorough peer review by experienced researchers in your field
•
rapid publication on acceptance
•
support for research data, including large and complex data types
•
gold Open Access which fosters wider collaboration and increased citations
maximum visibility for your research: over 100M website views per year
•
At BMC, research is always in progress.
Learn more biomedcentral.com/submissions
Ready to submit your research
Ready to submit your research
? Choose BMC and benefit from:
? Choose BMC and benefit from:
102. Kassambara A. Factoextra: extract and visualize the results of multivari-
ate data analyses. 2021. https:// rdrr. io/ cran/ facto extra. Accessed 23 Apr
2021.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in pub-
lished maps and institutional affiliations.
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
1.
2.
3.
4.
5.
6.
Terms and Conditions
Springer Nature journal content, brought to you courtesy of Springer Nature Customer Service Center GmbH (“Springer Nature”).
Springer Nature supports a reasonable amount of sharing of research papers by authors, subscribers and authorised users (“Users”), for small-
scale personal, non-commercial use provided that all copyright, trade and service marks and other proprietary notices are maintained. By
accessing, sharing, receiving or otherwise using the Springer Nature journal content you agree to these terms of use (“Terms”). For these
purposes, Springer Nature considers academic use (by researchers and students) to be non-commercial.
These Terms are supplementary and will apply in addition to any applicable website terms and conditions, a relevant site licence or a personal
subscription. These Terms will prevail over any conflict or ambiguity with regards to the relevant terms, a site licence or a personal subscription
(to the extent of the conflict or ambiguity only). For Creative Commons-licensed articles, the terms of the Creative Commons license used will
apply.
We collect and use personal data to provide access to the Springer Nature journal content. We may also use these personal data internally within
ResearchGate and Springer Nature and as agreed share it, in an anonymised way, for purposes of tracking, analysis and reporting. We will not
otherwise disclose your personal data outside the ResearchGate or the Springer Nature group of companies unless we have your permission as
detailed in the Privacy Policy.
While Users may use the Springer Nature journal content for small scale, personal non-commercial use, it is important to note that Users may
not:
use such content for the purpose of providing other users with access on a regular or large scale basis or as a means to circumvent access
control;
use such content where to do so would be considered a criminal or statutory offence in any jurisdiction, or gives rise to civil liability, or is
otherwise unlawful;
falsely or misleadingly imply or suggest endorsement, approval , sponsorship, or association unless explicitly agreed to by Springer Nature in
writing;
use bots or other automated methods to access the content or redirect messages
override any security feature or exclusionary protocol; or
share the content in order to create substitute for Springer Nature products or services or a systematic database of Springer Nature journal
content.
In line with the restriction against commercial use, Springer Nature does not permit the creation of a product or service that creates revenue,
royalties, rent or income from our content or its inclusion as part of a paid for service or for other commercial gain. Springer Nature journal
content cannot be used for inter-library loans and librarians may not upload Springer Nature journal content on a large scale into their, or any
other, institutional repository.
These terms of use are reviewed regularly and may be amended at any time. Springer Nature is not obligated to publish any information or
content on this website and may remove it or features or functionality at our sole discretion, at any time with or without notice. Springer Nature
may revoke this licence to you at any time and remove access to any copies of the Springer Nature journal content which have been saved.
To the fullest extent permitted by law, Springer Nature makes no warranties, representations or guarantees to Users, either express or implied
with respect to the Springer nature journal content and all parties disclaim and waive any implied warranties or warranties imposed by law,
including merchantability or fitness for any particular purpose.
Please note that these rights do not automatically extend to content, data or other material published by Springer Nature that may be licensed
from third parties.
If you would like to use or distribute our Springer Nature journal content to a wider audience or on a regular basis or in any other manner not
expressly permitted by these Terms, please contact Springer Nature at
onlineservice@springernature.com
Available via license: CC BY 4.0
Content may be subject to copyright.