Article

Factors affecting species delimitations with the GMYC model: Insights from a butterfly survey


Abstract

1. The generalized mixed Yule-coalescent (GMYC) model has become one of the most popular approaches for species delimitation based on single-locus data, and it is widely used in biodiversity assessments and phylogenetic community ecology. We here examine an array of factors affecting GMYC resolution (tree reconstruction method, taxon sampling coverage/taxon richness, and geographic sampling intensity/geographic scale).
2. We test GMYC performance based on empirical data (DNA barcoding of the Romanian butterflies) on a solid taxonomic framework (i.e. all species are thought to be described and can be determined with independent sources of evidence). The dataset is comprehensive (176 species), and intensely and homogeneously sampled (1303 samples representing the main populations of butterflies in this country). Taxonomy was assessed based on morphology, including linear and geometric morphometry when needed.
3. The number of GMYC entities obtained constantly exceeds the total number of morphospecies in the dataset. We show that approximately 80% of the species studied are recognised as entities by GMYC. Interestingly, we show that this percentage is practically the maximum that a single threshold method can provide for this dataset. Thus the ca. 20% of failures are attributable to intrinsic properties of the COI polymorphism: overlap in inter- and intraspecific divergences and non-monophyly of the species likely because of introgression or lack of independent lineage sorting.
4. Our results demonstrate that this method is remarkably stable under a wide array of circumstances, including most phylogenetic reconstruction methods, high singleton presence (up to 95%), taxon richness (above five species), and presence of gaps in intraspecific sampling coverage (removal of intermediate haplotypes). Hence, the method is useful to designate an optimal divergence threshold in an objective manner, and to pinpoint potential cryptic species that are worth being studied in detail. However, the existence of a substantial percentage of species wrongly delimited indicates that GMYC cannot be used as sufficient evidence for evaluating the specific status of particular cases without additional data.
5. Finally, we provide a set of guidelines to maximize efficiency in GMYC analyses and discuss the range of studies that can take advantage of the method.
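
The "approximately 80% of the species ... recognised as entities" figure in the abstract can be illustrated with a minimal Python sketch. It assumes, since the abstract does not spell out the criterion, that a morphospecies counts as recognised only when a single delimited entity contains exactly its specimens. The function name `recovered_fraction` and the toy specimen labels are hypothetical.

```python
from collections import defaultdict


def recovered_fraction(morphospecies, entities):
    """Fraction of morphospecies whose specimens form exactly one entity.

    Both arguments map specimen ID -> label. The exact-match criterion is
    an assumption made for illustration, not the authors' stated method.
    """
    def groups(assignment):
        out = defaultdict(set)
        for specimen, label in assignment.items():
            out[label].add(specimen)
        return out

    species_sets = groups(morphospecies)
    if not species_sets:
        raise ValueError("no morphospecies supplied")
    entity_sets = {frozenset(s) for s in groups(entities).values()}
    matched = sum(
        1 for members in species_sets.values()
        if frozenset(members) in entity_sets
    )
    return matched / len(species_sets)


# Hypothetical toy data: species B is split into two entities.
morpho = {"s1": "A", "s2": "A", "s3": "B", "s4": "B", "s5": "C"}
delimited = {"s1": "e1", "s2": "e1", "s3": "e2", "s4": "e3", "s5": "e4"}
print(recovered_fraction(morpho, delimited))
```

In the toy data, four entities are delimited for three morphospecies, so the entity count exceeds the morphospecies count while only two of the three species are recovered exactly.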

... However, compared to other approaches, it often produces more OTUs [73]. Dysmicoccus brevipes, M. hirsutus, and Ps. ...
... annonae (Pacheco da Silva & Kaydan) [38,75]. The bPTP approach was eliminated from our study because it estimated too many singletons, which might have led to an incorrect interpretation of the data [73]. As input, bPTP does not require an ultrametric tree or a sequence-similarity threshold [65]. ...
... However, compared to other approaches, it often produces more OTUs [73]. The use of genetic/molecular databases has been shown to increase the accuracy of identification [79]. ...
Article
Full-text available
Mealybugs are insects belonging to the family Pseudococcidae. This family includes many plant-pest species with similar morphologies, which may lead to errors in mealybug identification and delimitation. In the present study, we employed molecular-species-delimitation approaches based on distance (ASAP) and coalescence (GMYC and mPTP) methods to identify mealybugs collected from coffee and other plant hosts in the states of Espírito Santo, Bahia, Minas Gerais, and Pernambuco, Brazil. We obtained 171 new COI sequences, and 565 from the BOLD Systems database, representing 26 candidate species of Pseudococcidae. The MOTUs estimated were not congruent across different methods (ASAP-25; GMYC-30; mPTP-22). Misidentifications were revealed in the sequences from the BOLD Systems database involving Phenacoccus solani × Ph. solenopsis, Ph. tucumanus × Ph. baccharidis, and Planococcus citri × Pl. minor species. Ten mealybug species were collected from coffee plants in Espírito Santo. Due to the incorrect labeling of the species sequences, the COI barcode library of the dataset from the database needs to be carefully analyzed to avoid the misidentification of species. The systematics and taxonomy of mealybugs may be improved by integrative taxonomy, which may facilitate the integrated pest management of these pests.
... XML files were made with the BEAUti version 2.6.7 interface with the following settings: GTR+G+I substitution model, empirical base frequencies, 4 gamma categories and all codon positions partitioned with unlinked base frequencies and substitution rates. Since there is no agreement concerning the most appropriate clock and tree priors for reconstructing gene trees for species delimitation (Monaghan et al., 2009;Ratnasingham & Hebert 2013;Talavera et al., 2013;Tang et al., 2014), preliminary analyses to compare the use of two different clock (strict and relaxed lognormal) and two different tree priors (coalescent constant population and Yule) were undertaken (Rodrigues et al., 2020). The results of these exploratory analyses (data not shown) indicated the strict clock and Yule priors as the most suitable for our data set, thus these priors were used for the Bayesian inference analyses. ...
... The sGMYC analysis based on a single gene revealed the presence of 370 MOTUs (likelihood ratio: 600.4823, confidence interval: 349-383, threshold time: -0.01053644). This species-delimitation algorithm relies on the priors and parameters used to construct the ultrametric tree (Ceccarelli et al., 2012), and tends to overestimate species diversity compared to other methods (Paz & Crawford, 2012;Miralles & Vences, 2013;Talavera et al., 2013;Kekkonen & Hebert, 2014). In our study, the sGMYC method seems to be the most accurate since it recovered substantially fewer putative species than the bPTP and sPTP analyses despite its hypothesized oversplitting. ...
... In our study, the sGMYC method seems to be the most accurate since it recovered substantially fewer putative species than the bPTP and sPTP analyses despite its hypothesized oversplitting. Moreover, the sGMYC approach has been suggested to suit datasets with large numbers of singleton taxa (Talavera et al., 2013), which is what we observe for Polypedilum. Based on the aforementioned considerations, we chose the putative species delimited by the sGMYC method as the basis for the biogeographical analyses. ...
Preprint
Full-text available
Aim: The Neotropics, particularly South America, hold unparalleled levels of species richness compared to other major biomes. Some Neotropical areas are hotspots of a fragmentarily known insect diversity and are under manifest threat from biodiversity loss and climate change. Therefore, methods for rapidly estimating this diversity are urgently required to complement slower traditional taxonomic approaches. A variety of algorithms for delimiting species through single-locus DNA barcodes have been developed and applied for rapid estimates of species diversity in a wide array of taxa; however, tree-based and distance-based methods may lead to different group assignments, either overestimating or underestimating the number of putative species. Here, we investigate the performance of different DNA-based species delimitation approaches for a rapid estimate of the diversity of Polypedilum (Chironomidae, Diptera) in South America. Location: Worldwide. Methods: We analyze a mtDNA dataset comprising 1,492 specimens from 598 locations worldwide. Molecular operational taxonomic units (MOTUs) ranged from 267 to 520, based on the Barcode Index Number (BIN), Bayesian Poisson tree processes (bPTP), multi-rate Poisson tree processes (mPTP), single-rate Poisson tree processes (sPTP), and generalized mixed Yule coalescent (sGMYC) approaches. Results: Our results highlight Polypedilum as a species-rich yet incompletely documented genus, and identify the sGMYC method as the most adequate to estimate putative species in our dataset. Furthermore, based on these data, we describe the distribution of diversity and some biogeographical patterns of Polypedilum. Main Conclusions: Our findings imply that the genus exhibits high levels of endemism and species richness in the Neotropics, which confirms our hypothesis that there are substantial differences in community structure between the Polypedilum fauna in South America and the neighboring regions.
... Model assumptions include that (1) species are monophyletic, (2) there is no intraspecific geographical structuring, and (3) there is no extinction (Fujisawa & Barraclough, 2013). The method has become very popular in ecology because it does not require prior knowledge of the target study group, which makes it a particularly useful tool for studies involving species for which taxonomic knowledge is limited or non-existent (Talavera et al., 2013). ...
... The performance of the GMYC model has been predominantly tested on simulated data where the effects of factors are controlled, and the model's assumptions are not violated (Esselstyn et al., 2012;Fujisawa & Barraclough, 2013;Papadopoulou et al., 2009;Talavera et al., 2013). Most applications of the GMYC using empirical data will, however, likely violate these assumptions. ...
... Applying a multiple threshold method may be useful in large data sets where there is significant variation in intra- and interspecific genetic divergences. Generally, however, a single-threshold approach is recommended as it is less likely to oversplit (Blair and Bryson Jr, 2017; Fujisawa & Barraclough, 2013; Talavera et al., 2013). ...
Article
Full-text available
Species delimitation tools are vital to taxonomy and the discovery of new species. These tools can make use of genetic data to estimate species boundaries, where one of the most widely-used methods is the Generalised Mixed Yule Coalescent (GMYC) model. Despite its popularity, a number of factors are known to influence the performance and resulting inferences of the GMYC. Moreover, the few studies that have assessed model performance to date have been predominantly based on simulated datasets, where model assumptions are not violated. Here, we present a user-friendly R Shiny application, “SPEDE-sampler” (SPEcies DElimitation sampler), that assesses the effect of computational and methodological choices, in combination with sampling effects, on the GMYC model. Output phylogenies are used to test the effect that 1) sample size, 2) BEAST and GMYC parameters (e.g. prior settings, single vs multiple threshold, clock model), and 3) singletons have on GMYC output. Optional predefined grouping information (e.g. morphospecies/ecotypes) can be uploaded in order to compare it to GMYC species and estimate percentage match scores. Additionally, predefined groups that contribute to inflated species richness estimates are identified by SPEDE-sampler, allowing for the further investigation of potential cryptic species or geographic sub-structuring in those groups. Merging by the GMYC is also recorded to identify where traditional taxonomy has overestimated species numbers. Four worked examples are provided to illustrate the functionality of the program’s workflow, and the variation that can arise when applying the GMYC model to empirical datasets. The R Shiny program is available for download on GitHub. Link: https://onlinelibrary.wiley.com/share/author/PV2UXQJRPJXG4CZRQFY2?target=10.1111/1755-0998.13591
... [54] with default parameters in a Linux environment. For each dataset, identical haplotypes were removed using ALTER [55] because the GMYC model can only handle dichotomous branches [12,56].
…                Subfamily   22   274   160
Stomatellinae    Subfamily    6    11     9
Trochinae        Subfamily   16    37    23
Turbininae       Subfamily   80   375   226
Umboniinae       Subfamily   18    32    25
Angariidae       Family       2     3     2
Areneidae        Family       1     1     1
Calliostomatidae Family      17    64    33
Colloniidae      Family       9    22    13
Liotiidae        Family       2     4     3
Margaritidae     Family       9    19    12
Phasianellidae   Family       5     9     9
Skeneidae        Family       7     7     7
Solariellidae    Family      15    45    39
Tegulidae        Family      20   159   116
Trochidae        Family     130   624   411
Turbinidae       Family      81   377   228
The existence of a "barcode gap" for each dataset was evaluated using the "Spider" package [57] in R version 3.6 [58] by calculating the difference between the minimum interspecific distance and the maximum intraspecific distance. ...
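
The barcode-gap calculation described in the snippet above (minimum interspecific distance minus maximum intraspecific distance) can be sketched in Python as follows. This is not the Spider package's API; the functions `p_distance` and `barcode_gap` are hypothetical, and the use of uncorrected p-distances is an assumption, since the snippet does not name the distance metric.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcode_gap(records):
    """Minimum interspecific minus maximum intraspecific distance.

    records: list of (species_label, aligned_sequence) tuples.
    A positive value indicates a gap; a negative value indicates overlap.
    """
    intra, inter = [], []
    for (sp1, seq1), (sp2, seq2) in combinations(records, 2):
        d = p_distance(seq1, seq2)
        (intra if sp1 == sp2 else inter).append(d)
    if not intra or not inter:
        raise ValueError("need at least one intra- and one interspecific pair")
    return min(inter) - max(intra)


# Hypothetical toy alignment with two species.
records = [
    ("A", "ACGTACGTAC"),
    ("A", "ACGTACGTAT"),
    ("B", "TTGTACGAAC"),
    ("B", "TTGTACGAAC"),
]
print(barcode_gap(records))
```

A dataset-wide gap computed this way is conservative: a single pair of closely related species, or one divergent specimen within a species, is enough to make it negative.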
... It is well-known that GMYC is prone to over-splitting species [45,56], especially the multi-threshold version of GMYC [68,69]. The mGMYC method is very sensitive to deep coalescent events since it allows many temporal thresholds for speciation, resulting in the overestimation of the number of species [70]. ...
Article
Full-text available
In the context of diminishing global biodiversity, the validity and practicality of species delimitation methods for the identification of many neglected and undescribed biodiverse species have been paid increasing attention. DNA sequence-based species delimitation methods are mainly classified into two categories, namely, distance-based and tree-based methods, and have been widely adopted in many studies. In the present study, we performed three distance-based (ad hoc threshold, ABGD, and ASAP) and four tree-based (sGMYC, mGMYC, PTP, and mPTP) analyses based on Trochoidea COI data and analyzed the discordance between them. Moreover, we also observed the performance of these methods at different taxonomic ranks (the genus, subfamily, and family ranks). The results suggested that the distance-based approach is generally superior to the tree-based approach, with the ASAP method being the most efficient. In terms of phylogenetic methods, the single threshold version performed better than the multiple threshold version of GMYC, and PTP showed higher efficiency than mPTP in delimiting species. Additionally, GMYC was found to be significantly influenced by taxonomic rank, showing poorer efficiency in datasets at the genus level than at higher levels. Finally, our results highlighted that cryptic diversity within Trochoidea (Mollusca: Vetigastropoda) might be underestimated, which provides quantitative evidence for excavating the cryptic lineages of these species.
... This was necessary because when these ingroup taxa were analysed together using bGMYC, log ratio values (coalescent rate/Yule rate) were below zero. This indicated that the model was not a good fit to the data, likely due to substantial divergences among the main clades, and/or high rate heterogeneity (Talavera et al., 2013). In addition, there is a limit to the number of species in the guide tree for BPP analyses, with >30 species causing issues with the program's memory. ...
... Results for BPP, the validation method employed here, were however unclear and varied widely among different prior settings, possibly due to the lack of dense sampling for the two nuclear markers and lack of additional nuclear data available for analyses. When employing these methods there are limitations associated with the characteristics of the data analysed, such as sampling coverage, number of samples per species, effective population size, or species divergence times (Ahrens et al., 2016;Luo et al., 2018;Talavera et al., 2013;Zhang et al., 2011). Additionally, there are other caveats associated with the methods, such as the inability to distinguish between population and species-level structure, which might lead to inaccurate delimitation, overestimating species numbers (Sukumaran and Knowles, 2017). ...
Article
Full-text available
Uropeltidae is a clade of small fossorial snakes (ca. 65 extant species) endemic to peninsular India and Sri Lanka. Uropeltid taxonomy has been confusing, and the status of some species has not been revised for over a century. Attempts to revise uropeltid systematics and undertake evolutionary studies have been hampered by incompletely sampled and incompletely resolved phylogenies. To address this issue, we take advantage of historical museum collections, including type specimens, and apply genome-wide shotgun (GWS) sequencing, along with recent field sampling (using Sanger sequencing) to establish a near-complete multilocus species-level phylogeny (ca. 87% complete at species level). This results in a phylogeny that supports the monophyly of all genera (if Brachyophidium is considered a junior synonym of Teretrurus), and provides a firm platform for future taxonomic revision. Sri Lankan uropeltids are probably monophyletic, indicating a single colonisation event of this island from Indian ancestors. However, the position of Rhinophis goweri (endemic to Eastern Ghats, southern India) is unclear and warrants further investigation, and evidence that it may nest within the Sri Lankan radiation indicates a possible recolonisation event. DNA sequence data and morphology suggest that currently recognised uropeltid species diversity is substantially underestimated. Our study highlights the benefits of integrating museum collections in molecular genetic analyses and their role in understanding the systematics and evolutionary history of understudied organismal groups.
... In addition to BIN Discordance analysis, we also used other molecular delineation methods to delineate MOTUs. To minimize the risk of oversplitting (Talavera et al. 2013), the dataset was collapsed to retain only unique haplotypes. Four species delimitation approaches were employed: Assemble Species by Automatic Partitioning (ASAP) (Puillandre et al. 2021), jMOTU (Jones et al. 2011), General Mixed Yule Coalescent (GMYC) (Fujisawa and Barraclough 2013), and bPTP (Zhang et al. 2013). ...
... The performance of phylogeny-based methods is sensitive to multiple factors, such as general phylogenetic history, sampling intensity, DNA sequence length, speciation rate, and differences of effective population size among species (Esselstyn et al. 2012). The number of species can be underestimated or overestimated with ancestral polymorphism (Esselstyn et al. 2012), but previous studies showed that the sGMYC performs better than mGMYC (Talavera et al. 2013). Three indicators (Match ratio, Ctax, and Rtax) suggest that different species definition methods also diverge in terms of the location of species boundaries (Miralles and Vences 2013). ...
Article
Full-text available
Barcode libraries are generally assembled with two main objectives in mind: specimen identification and species discovery/delimitation. In this study, the standard COI barcode region was sequenced from 681 specimens belonging to katydids (Tettigoniidae), cave crickets (Rhaphidophoridae), and leaf-rolling crickets (Gryllacrididae) from Zhejiang Province, China. Of these, four COI-5P sequences were excluded from subsequent analyses because they were likely NUMTs (nuclear mitochondrial pseudogenes). The final dataset consisted of 677 barcode sequences representing 90 putative species-level taxa. Automated cluster delineation using the Barcode of Life Data System (BOLD) revealed 118 BINs (Barcode Index Numbers). Among these 90 species-level taxa, 68 corresponded with morphospecies, while the remaining 22 were identified based on reverse taxonomy using BIN assignment. Thirteen of these morphospecies were represented by a single barcode (so-called singletons), and each of 19 morphospecies was split into more than one BIN. The consensus delimitation scheme yielded 55 Molecular Operational Taxonomic Units (MOTUs). Only four morphospecies (Imax > DNN) failed to be recovered as monophyletic clades (i.e., Elimaea terminalis, Phyllomimus klapperichi, Sinochlora szechwanensis and Xizicus howardi), so it is speculated that these may be species complexes. Therefore, the diversity of katydids, cave crickets, and leaf-rolling crickets in Zhejiang Province is probably slightly higher than what current taxonomy would suggest.
... This model was then modified to include Bayesian support (bPTP) and the potential divergence in intraspecific diversity (mPTP) (Zhang et al. 2013; Kapli et al. 2017). All these coalescent single-gene species delimitation methods have been proved to be effective on simulations and empirical data (Talavera et al. 2013; Tang et al. 2014), being widely used to assess biodiversity (Pons et al. 2006; Machado et al. 2017; Hofmann et al. 2019; Ramirez et al. 2020b, 2021; Cañedo-Apolaya et al. 2021). ...
Chapter
DNA barcoding is a powerful tool to identify species using nucleotide differences found in the cytochrome oxidase I mitochondrial gene between species. Since it was proposed, several projects have promoted the construction of a DNA barcode library for all eukaryotes. DNA barcoding in conjunction with single-gene species delimitation methods has been used to estimate diversity at species level, allowing a rapid and comprehensive assessment of biodiversity. These methods have been developed with the purpose of clustering data sets of orthologous sequences into molecular operational taxonomic units (MOTUs). The MOTUs obtained by these methods can be compared with the taxonomic data to discover hidden diversity. In this chapter we discuss the main concepts and methodologies of the DNA barcoding approach and review the use of species delimitation methods in the Neotropics. The huge biodiversity, lack of taxonomic effort and scarcity of funds are still challenging the broad use of this technology in the Neotropics. Keywords: COI; MOTU; Barcode of life; Hidden biodiversity
... Input BI ultra-metric tree for GMYC was generated in BEAST 1.10.4. To avoid potential biases in threshold estimation, the identical COI haplotypes were pruned (see Talavera et al. 2013) using Collapsetypes 4.6 (Chesters 2013). Input BEAST file was created in BEAUti 1.10.4, implementing the best model of evolution and the partition scheme specified above, and selecting a relaxed molecular clock (uncorrelated lognormal) model, a coalescent (constant size) prior (see Monaghan et al. 2009) and a UPGMA starting tree. ...
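
The haplotype-pruning step mentioned in the snippet above can be sketched in a few lines of Python. This is a simplified illustration and does not reproduce Collapsetypes or any other dedicated tool: the function `collapse_haplotypes` is hypothetical and assumes that two sequences are the same haplotype only when their aligned strings are identical after upper-casing, so ambiguity codes and gaps count as differences.

```python
def collapse_haplotypes(sequences):
    """Group specimens that share an identical aligned sequence.

    sequences: dict mapping specimen ID -> aligned sequence.
    Returns a dict mapping one representative ID per haplotype to the
    list of all specimen IDs carrying that haplotype.
    """
    by_sequence = {}
    for specimen, seq in sequences.items():
        by_sequence.setdefault(seq.upper(), []).append(specimen)
    # The first specimen seen for each haplotype acts as its representative.
    return {members[0]: members for members in by_sequence.values()}


# Hypothetical toy data: s1 and s2 differ only in letter case.
seqs = {"s1": "ACGT", "s2": "acgt", "s3": "ACGA"}
haplotypes = collapse_haplotypes(seqs)
print(haplotypes)
```

Only the representative sequences would then be passed on to tree building, while the membership lists allow every original specimen to be mapped back to its delimited entity afterwards.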
Article
Full-text available
Investigations of material collected partly in 1999 and mainly between 2006 and 2016 in New Guinea, mostly along the high, central mountain chain of the island, further increased our knowledge of the diversity of the genus Labiobaetis Novikova & Kluge on this island. Previously, 37 species were reported from New Guinea. We have identified six new species using a combination of morphology and genetic analysis (COI). They are described and illustrated based on their larvae. Five of the six new species belong to the group petersorum, which is endemic to the island. Additionally, Labiobaetis xeniolus Lugo-Ortiz & McCafferty is also assigned to this group. The morphological characterisation of the group petersorum is enhanced, and a key to all species of this group is provided. Complementary descriptions and remarks to the morphology of known species of the group petersorum are provided. Additionally, a genetic analysis (COI) including most species and several additional Molecular Operational Taxonomic Units (MOTUs) of the group petersorum is discussed. One of the new species belongs to the group vitilis. The morphological characterization of this group is slightly enhanced, and the obtained COI sequence was added to the genetic analysis of the group petersorum. The total number of Labiobaetis species worldwide is augmented to 162.
... superfamily level) (Figure 2b). Similar results have been reported for Coleoptera and Lepidoptera (Ahrens et al., 2016;Talavera et al., 2013). Alternatively, high rates of perfect matches could be explained by the nearly 19-fold higher difference between average nearest-neighbour than intraspecies divergence (Yin et al., 2022). ...
Article
Full-text available
Single‐locus molecular delimitation plays a key role in meeting the need to expedite the exploration and description of the species on our planet. Multiple methods have been developed to aid data interpretation over the past 20 years, but species delimitation remains difficult due to their varying performance. In this study, we examine the accuracy of five widely used delimitation methods (i.e. BIN, ABGD, ASAP, GMYC and mPTP) in analysing 63 empirical data sets that included 1850 mitochondrial COI sequences derived from eriophyoid mites assigned to 456 morphospecies. Our results establish that all five methods resolve approximately 90% of morphospecies. We investigated some factors which might affect the species delimitation results, that is taxonomic rank, number of haplotypes per species, mean number of host plants per species, and geographical distance among sampling sites. We found complex interactions between these factors which affected delimitation effectiveness. An increase in haplotype number negatively affected delimitation accuracy, while increased geographical distance improved delimitation accuracy. BIN was influenced by the number of host plants per species as cryptic speciation linked to host plant usage might be prevalent in eriophyoid mites, while ABGD was not significantly impacted by other factors. Our results highlight multiple factors that affect molecular species delimitation and underline the value of employing multiple analytical approaches to aid species delimitation.
... GMYC has recently become one of the most frequently used delimitation methods, and the 'splits' implementation in particular is widely used within the framework of the "traditional model" [34,92] because of its "just add water" approach. However, it is known that such delimitation results in excessive "splitting" [95]. Moreover, GMYC has limitations in its objective representation of data and of the biological uncertainty of models [96]. ...
Article
Full-text available
To date, a rather large set of both mathematical theories for species delimitation, based on single-locus genetic data, and their implementations as software products, has been accumulated. Comparison of the efficiencies of different delineation methods in the task of accumulating and analyzing data with reference to different taxa in different regions, is vital. The aim of this study was to compare the efficiency of fifteen single-locus species delimitation methods using the example of a fish species found in a single lake in European Russia (Lake Plescheyevo) with reference to other sequences of revealed taxa deposited in international databases. We analyzed 186 original COI sequences belonging to 24 haplotypes, and 101 other sequences previously deposited in GenBank and BOLD. Comparison of all 15 alternative taxonomies demonstrated that all methods adequately separate only the genera, while the number of delimited mOTUs differed from 16 (locMin) to 43 (HwM/CoMa). We can assume that the effectiveness of each method is correlated with the number of matches based on Ctax and MatchRatio criteria. The most comparable results were provided by bGMYC, mPTP, STACEY, KoT and ASAP and the most synchronous results were obtained from bGMYC, mPTP, STACEY and ASAP. We believe that these results are maximally realistic in the number of revealed mOTUs. A high genetic diversity, resulting in the existence of several mOTUs and phylogenetic lineages within many species, demonstrates the usefulness of the "polymorphic species" concept, which does not underestimate species richness and does not prevent the rational use and protection of biodiversity.
... Using a fixed and arbitrary value for the distance thresholds between species for high taxonomic levels (e.g., 2% by Hebert et al. 2003) is considered a naive mistake for DNA barcoding uses (Collins and Cruickshank 2013). Plausible values may be obtained by accumulating data on a target taxon (e.g., Talavera et al. 2013; Gonçalves et al. 2021; Bianchi and Gonçalves 2021a), since the coalescent depths among species are variable in each lineage (Fujita et al. 2012). The 7% genetic distance between G. pulchra and G. anthracina seems acceptable to consider them different species since 6% has been considered an applicable threshold for Theraphosidae (Hamilton et al. 2011). ...
Article
Full-text available
Taxonomic researchers have used multiple sources of evidence to support species hypotheses and delimitations. Grammostola Simon (Mygalomorphae: Theraphosidae) comprises 20 valid species endemic to South America, six occurring in Brazil. The classical morphological approach based mainly on genitalia may be misleading in recognizing species in this genus. Thus, we used morphology, geographical distribution, genetic distance, and phylogeny to support the redescription of Grammostola pulchra from southern Brazil, a species described a century ago. We also diagnosed and illustrated the species. Males have a developed apical keel at the apex of the embolus; for the first time, this type of structure has been reported in a species of Grammostola. The molecular analyses using the partial sequence of Cytochrome c oxidase subunit I showed 7% of genetic distance (p-distance) between G. pulchra and Grammostola anthracina. Distance and tree-based methods (ASAP and bPTP, respectively) assigned G. pulchra as a valid species. The gene-tree under Bayesian and Maximum-Likelihood recovered a similar topology, placing G. pulchra as closely related to Grammostola burzaquensis and G. anthracina. Morphological characters which could be important in the taxonomy of the genus are further discussed. Citation: Pittella RS, Bassa PG, Zefa E, Bianchi FM. 2023. Using the integrative approach to update a gap of one century: redescription and new distribution records of the South American Tarantulas Grammostola pulchra (Araneae: Mygalomorphae: Theraphosidae). Zool Stud 62:05.
... In this study, several commonly used molecular species delimitation methods were applied. Compared with morphological results, different analyses were conducted using two different barcoding fragments potentially over- or underestimating the number of putative species [78], yet here the results estimated by GMYC for the COI tree were congruent with the number of species identified morphologically. One new species was identified and four previously described species (supported by morphological diagnosis) were confirmed based on different species delimitation methods. ...
Article
Full-text available
One new species of the genus Dendronotus (Nudibranchia: Dendronotidae) is described from Norway and Northern Ireland, as well as from the adjacent North Sea, and one new subspecies of Dendronotus arcticus is described from Norway by applying a combination of fine-scale morphological and molecular phylogenetic data. The present case demonstrates multilevel morphological and molecular similarities and differences considering on the one hand a grouping of three similar looking sympatric taxa (D. yrjargul, D. arcticus gartensis n. subsp. and D. keatleyae n. sp.), and on the other hand two different looking apparently allopatric subspecies (D. arcticus arcticus and D. arcticus gartensis n. subsp.). The type species of the genus, D. frondosus, which is the commonest dendronotid in Norway and the United Kingdom, consistently demonstrates substantial molecular and fine-scale morphological differences from D. keatleyae n. sp. The present study, apart from providing purely taxonomic information, also provides new data for a broad discussion of global biodiversity patterns.
... The delimitation category of a given taxon (and thus their regional incidence) may depend on the size of the region where it has been studied. In general, the same species can appear as a single entity if the assessment is made at a narrow geographic scale and as a multiple or lumped entity for wider geographic sampling areas (Talavera et al., 2013). We assessed the delimitation categories for the Samoan butterflies at two different scales: worldwide (W in Table 1) and Australian and Oceanian regions (AO in Table 1) as described by Holt et al. (2013) (Australia, New Zealand, Samoan Is., Polynesia, Papua New Guinea-Melanesia and Micronesia). ...
Article
We investigated the entire butterfly fauna of the Samoan Archipelago (Pacific Ocean) by combining COI barcode sequences for specimens from these islands with those available in repositories at larger biogeographic scale. Haplotype networks and a generalized mixed Yule-coalescent (GMYC) model were applied to identify evolutionary significant units (ESUs). The ESUs from Samoan islands were compared with ESUs of the same or sister taxa regionally and worldwide to explore the level of endemicity and of congruence between established taxonomy and COI barcodes. The level of ESUs endemicity was similar to that shown by species and subspecies. Australia was the most frequent origin for Samoan lineages, followed by Orient-Asia. When comparing the agreement and mismatch between established taxonomy and ESUs between the Australia-Oceania region and Europe and North America, the COI molecular marker revealed a similar performance in taxonomic identification. Despite this overall convergent pattern, the degree of mtDNA divergence and the analysis of functional traits suggested that the mechanisms producing patterns of genetic differentiation in temperate butterflies over ancient continental lands differ from those occurring across a vast ocean into geologically young islands. Mechanisms on Samoan islands include relatively recent and exceptional oceanic dispersal, possibly followed by repeated extinction events. In the Australia-Oceania region we found a similar fraction of species showing introgression with the maintenance of phenotypic differences as it occurs on the mainland, but the phenomenon was limited to sectors of each species distribution area. Regular gene flow among the Samoan islands seems to prevent allopatric speciation within the archipelago.
... the bPTP and ABGD methods. This result is consistent with previous studies, where researchers discovered more operational taxonomic units using GMYC analysis compared to other species delimitation methods (Esselstyn et al. 2012; Fujisawa & Barraclough 2013; Kekkonen & Hebert 2014; Miralles & Vences 2013; Talavera et al. 2013; Tang et al. 2014; Lorén et al. 2018; Rasouli-Dogaheh et al. 2022). In addition, among the different approaches in this study, the ITS secondary structure was the most sensitive method to separate genetic groups in Cyanobacteria. ...
Article
Information regarding the diversity of Cyanobacteria in many parts of the world is still minimal. One example of a region that has not yet been widely studied is southwestern Asia, including the region of the Persian Gulf. A culture-dependent study of cyanobacterial diversity in a rainwater basin on Kharg Island enabled the isolation of a novel taxon, previously unnamed, from a simple trichal cyanobacterium. Further comparisons showed the existence of closely related strains/taxa from other parts of the world, namely the strains CCALA 945 isolated from South Italy and "Leptolyngbya india" from India. Herein, we have thus described the new genus Khargia and the new species Khargia iranica. Other strains, isolated by other authors, were included in the Khargia genus as additional species: Khargia italica and Khargia indica. The recognition of the new genus was based on morphological evaluations (identification by both light and electron microscopy), the phylogenetic analyses based on the 16S rRNA gene, and species delimitation based on Automatic Barcode Gap Definition (ABGD), Generalized Mixed Yule Coalescent (GMYC), bayesian version of Poisson Tree Processes (bPTP), and the secondary structure of 16S-23S rRNA ITS region of the studied strains.
... In the case of singleton sequences, it is important to know how they manifest in large barcode libraries. There are two possibilities: (1) only a single specimen of a species was sampled or (2) multiple individuals within a species lacking true genetic polymorphisms were sampled (Talavera et al. 2013). This information can be used to assess whether VLFs arise as a result of sequencing error or divergence, since with small sample sizes, actual biological variants (i.e. ...
Article
Here, we introduce VLF, an R package to determine the distribution of very low frequency variants (VLFs) in nucleotide and amino acid sequences for the analysis of errors in DNA sequence records. The package allows users to assess VLFs in aligned and trimmed protein-coding sequences by automatically calculating the frequency of nucleotides or amino acids in each sequence position and outputting those that occur under a user-specified frequency (default of p = 0.001). These results can then be used to explore fundamental population genetic and phylogeographic patterns, mechanisms and processes at the microevolutionary level, such as nucleotide and amino acid sequence conservation. Our package extends earlier work pertaining to an implementation of VLF analysis in Microsoft Excel, which was found to be both computationally slow and error prone. We compare those results to our own herein. Results between the two implementations are found to be highly consistent for a large DNA barcode dataset of bird species. Differences in results are readily explained by both manual human error and inadequate Linnean taxonomy (specifically, species synonymy). Here, VLF is also applied to a subset of avian barcodes to assess the extent of biological artifacts at the species level for Canada goose (Branta canadensis), as well as within a large dataset of DNA barcodes for fishes of forensic and regulatory importance. The novelty of VLF and its benefit over the previous implementation include its high level of automation, speed, scalability and ease-of-use, all desirable characteristics which will be extremely valuable as more sequence data are rapidly accumulated in popular reference databases, such as BOLD and GenBank.
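The per-position frequency calculation described in this abstract can be illustrated with a minimal Python sketch. This is not the VLF package or its API. The function name `very_low_frequency_variants` and the returned tuple layout are hypothetical, chosen only for illustration. The default threshold of 0.001 is taken from the abstract.

```python
from collections import Counter


def very_low_frequency_variants(aligned_seqs, threshold=0.001):
    """Flag residues that are rare in their alignment column.

    Hypothetical helper, not the VLF package API. Returns a list of
    (sequence_index, position, residue) tuples for every residue whose
    column frequency is strictly below `threshold`.
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")

    n = len(aligned_seqs)
    hits = []
    for pos in range(length):
        # Count how often each residue occurs in this column.
        counts = Counter(seq[pos] for seq in aligned_seqs)
        for i, seq in enumerate(aligned_seqs):
            if counts[seq[pos]] / n < threshold:
                hits.append((i, pos, seq[pos]))
    return hits
```

With ten aligned sequences, a residue seen once in a column has frequency 0.1, so it is reported only when the threshold is raised above that value. The default of 0.001 is meaningful only for libraries with well over a thousand sequences.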
... Based on the concatenated dataset analysis, three methods suggested L. wuyishanensis and L. liuyei were one MOTU, whereas GMYC divided L. swinhoei and L. continentalis into two MOTUs. GMYC typically over-splits species, owing to low genetic diversity across lineages and overlap of interspecific and intraspecific divergences, as well as a lack of reciprocal monophyly within sister clades (Talavera et al. 2013; Pentinsaari et al. 2016; Stokkan et al. 2018; Yuan et al. 2021). The genus Lucanus is susceptible to several pressures, such as habitat selection, sexual selection, and food resources, and only occurs in wooded alpine areas above 800 m with more demanding environmental conditions and tiny ecological niches (Switala et al. 2014; Chen et al. 2020). ...
Article
Citation: Zhou LY, Zhan ZH, Zhu XL, Wan X (2022) Multilocus phylogeny and species delimitation suggest synonymies of two Lucanus Scopoli, 1763 (Coleoptera, Lucanidae) species names. ZooKeys 1135: 139-155. https:// Abstract Phylogenetic relationships of four nominal Lucanus Scopoli, 1763 species, L. swinhoei Parry, 1874, L. continentalis Zilioli, 1998, L. liuyei Huang & Chen, 2010, and L. wuyishanensis Schenk, 1999, are assessed based on mitochondrial (16S rDNA, COI) and nuclear (28S rDNA, Wingless) genes. The genetic distance is 0.0072 between L. swinhoei and L. continentalis, and 0.0094 between L. wuyishanensis and L. liuyei. Three species-delimitation approaches (ABGD, PTP, and GMYC) consistently showed L. swinhoei + L. continentalis and L. wuyishanensis + L. liuyei as two MOTUs. A new synonymy, L. liuyei = L. wuyishanensis, is proposed. Synonymy of L. swinhoei over L. continentalis is confirmed.
... Recently diverged groups are expected to reach reciprocal monophyly faster with mitochondrial genes versus nuclear genes. This phenomenon alone can provide novel insight into mechanisms of reproductive isolation and speciation, particularly when combined with powerful analytical tools for species delimitation (Blair & Bryson, 2017; Fujisawa & Barraclough, 2013; Kapli et al., 2017; Talavera, Dincȃ, & Vila, 2013). Many taxonomic groups also show evidence of mtDNA or cpDNA introgression (Çoraman, Dundarova, Dietz, & Mayer, 2020; Leaché & Cole, 2007; Leaché & McGuire, 2006; Mastrantonio, Porretta, Urbanelli, Crasta, & Nascetti, 2016; Vitelli et al., 2016; Yan et al., 2018), evolutionary patterns that would be invisible with only nuclear DNA (nDNA). ...
Preprint
The genomics revolution continues to change how ecologists and evolutionary biologists study the evolution and maintenance of biodiversity. It is now easier than ever to generate large molecular data sets consisting of hundreds to thousands of independently evolving nuclear loci to estimate a suite of evolutionary and demographic parameters. However, any inferences will be incomplete or inaccurate if incorrect taxonomic identities are perpetuated throughout the analytical pipeline. Due to decades of research and comprehensive online databases, sequencing of mitochondrial DNA (mtDNA), chloroplast DNA (cpDNA) and select nuclear genes can provide researchers with a cost effective and simple means to verify the species identity of samples prior to subsequent phylogeographic and population genomic analysis. The addition of these sequences to genomic studies can also shed light on other important evolutionary questions such as explanations for gene tree-species tree discordance, species limits, sex-biased dispersal patterns, and mtDNA introgression. Although the mtDNA and cpDNA genomes often should not be used exclusively to make historical inferences given their well-known limitations, the addition of these data to modern genomic studies adds little cost and effort while simultaneously providing a wealth of useful data that can have significant implications for both basic and applied research.
... Input BI ultra-metric tree for GMYC was generated in BEAST 1.10.4. To avoid potential biases in threshold estimation, the outgroups were removed, and identical CO1 haplotypes were pruned (see Talavera et al. 2013) using Collapsetypes 4.6 (Chesters 2013). Input BEAST file was created in BEAUTi, implementing the best model of evolution and the partition scheme specified above, and selecting a relaxed molecular clock (uncorrelated lognormal) model, a coalescent (constant size) prior (see Monaghan et al. 2009) and a UPGMA starting tree. ...
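The haplotype-pruning step mentioned in this excerpt can be illustrated with a minimal Python sketch. It is not the Collapsetypes tool. The function name `collapse_identical_haplotypes` is hypothetical, and the sketch assumes that two sequences count as the same haplotype only when their aligned strings match exactly (ignoring case). Dedicated tools may also handle ambiguity codes and missing data, which this sketch does not.

```python
def collapse_identical_haplotypes(records):
    """Keep one representative per distinct aligned sequence.

    Hypothetical helper. `records` maps specimen name -> aligned sequence.
    Returns (unique, members): `unique` holds the first specimen seen for
    each haplotype, and `members` lists every specimen sharing it.
    """
    unique = {}
    members = {}
    representative = {}  # normalised sequence -> name of first specimen
    for name, seq in records.items():
        key = seq.upper()  # assumption: exact match after case folding
        if key in representative:
            members[representative[key]].append(name)
        else:
            representative[key] = name
            unique[name] = seq
            members[name] = [name]
    return unique, members
```

Keeping the `members` mapping allows the pruned specimens to be assigned back to the delimited entities once the analysis on unique haplotypes is complete.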
Article
Based on recently collected larvae from Algeria and Morocco, the species delimitation within the genus Centroptilum Eaton, 1869 in that region is validated. Two new species are described and illustrated, one from north-eastern Algeria, and one from North Morocco, using an integrated approach with morphological and molecular evidence. A table summarising the morphological differences between the new species and Centroptilum luteolum (Müller, 1776) from Central Europe is provided. Further, molecular evidence for additional undescribed species of Centroptilum in other regions of the West Palearctic is provided and discussed.
... For the PTP technique to perform properly, the models in each analysis need to be fitted to data representing at least five separate and previously verified species (Reid & Carstens 2012; Fujisawa & Barraclough 2013; Talavera et al. 2013; Zhang et al. 2013; Leliaert et al. 2014; Dellicour & Flot 2015). Unfortunately, the hypnorum-group has too few a priori recognised species, with only three species in the tree by Cameron et al. (2007). ...
Article
The hypnorum-complex of bumblebees (in the genus Bombus Latreille, 1802) has been interpreted as consisting of a single widespread Old-World species, Bombus hypnorum (Linnaeus, 1758) s. lat., and its closely similar sister species in the New World, B. perplexus Cresson, 1863. We examined barcodes for evidence of species’ gene coalescents within this species complex, using the closely related vagans-group to help calibrate Poisson-tree-process models to a level of branching appropriate for discovering species. The results support seven candidate species within the hypnorum-complex (Bombus taiwanensis Williams, Sung, Lin & Lu, 2022, B. wolongensis Williams, Ren & Xie sp. nov., B. bryorum Richards, 1930, B. hypnorum, B. koropokkrus Sakagami & Ishikawa, 1972, and B. hengduanensis Williams, Ren & Xie sp. nov., plus B. perplexus), which are comparable in status to the currently accepted species of the vagans-group. Morphological corroboration of the coalescent candidate species is subtle but supports the gene coalescents if these candidates are considered near-cryptic species.
... The transition between intra- and interspecific variation is visible as a distinct gap in the distribution, the so-called "barcode gap", and a corresponding distance threshold can then be applied to delimit species (Hebert et al., 2004). Another popular tree-based approach, the General Mixed Yule-coalescent (GMYC) approach (Fujisawa and Barraclough, 2013; Talavera et al., 2013), is a species delimitation method that estimates species boundaries on a time-calibrated (ultrametric) phylogeny. The GMYC method finds a time point, a species boundary on the tree, to the left of which the branching process follows the speciation model and to the right of which it follows the population model. ...
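The distance-threshold idea in the first sentence of this excerpt can be illustrated with a minimal Python sketch. It computes uncorrected p-distances and groups sequences whose pairwise distance does not exceed a fixed threshold. The function names `p_distance` and `cluster_by_threshold`, the single-linkage grouping rule, and the fixed user-supplied threshold are all assumptions made for illustration. Published distance methods estimate the threshold from the data, and this sketch does not implement GMYC.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a non-ACGT character are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def cluster_by_threshold(seqs, threshold):
    """Single-linkage grouping: link any pair with distance <= threshold.

    `seqs` maps name -> aligned sequence. Returns a sorted list of groups.
    """
    names = list(seqs)
    parent = {n: n for n in names}

    def find(n):
        # Union-find root lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if p_distance(seqs[a], seqs[b]) <= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())
```

Single linkage is the simplest choice, but it can chain distinct groups together through intermediate haplotypes, so the grouping is sensitive to how densely the sequences are sampled.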
Article
Although the diversity of chlorophycean microalgae has been studied for a long time, our knowledge of their taxonomic and phylogenetic relationships is still deficient. This study represents one of the first efforts to examine the congruence of the morphological identification of microalgal species from genera Eubrownia, Spongiococcum, and Chlorococcum with DNA-based species delimitation methods. We found unique combinations of morphological, molecular, and ecological features, distinguishing some members of the Moewusinia clade at the genus and species levels. Most genetic groups recovered by GMYC, ABGD, and PTP analyses of 18S rRNA and ITS2 were consistent with their morphological identification. The ITS2 region of genera Eubrownia and Chlorococcum is among the longest within the chlorophytes studied to date and has a unique secondary structure with branched helix III and uncommonly long helix IV. Moreover, molecular signatures distinguishing the two genera were found in the basal stem region of helix I. The CBC analysis is able to effectively discriminate between E. isobilateralis, E. aggregata, and all Chlorococcum species, except for C. infusionum and C. echinozygotum. Thus, we confirmed tentative hypotheses of species, obtained by DNA-based delimitation methods, with morphological data, molecular signatures, ITS2 secondary structure, and CBC criterion.
... F and five specimens of C. annulata. This could be due to either general properties of our dataset, including unbalanced geographical range sampling, skewed species abundances and the availability for analyses of a single specimen (Talavera et al., 2013; Ahrens et al., 2016), or because of paraphyly/polyphyly of some clades (Hendrich et al., 2010; Scicchitano et al., 2018). Interestingly, the intraspecific differentiation observed in C. annulata corresponds to samplings at different geographic areas (Montville and Apple Tree Park -Springbrook). ...
Article
The Phasmida genus Candovia comprises nine traditionally recognized species, all endemic to Australia. In this study, Candovia diversity is explored through molecular species-delimitation analyses using the COIFol gene fragment and phylogenetic inferences leveraging seven additional mitochondrial and nuclear loci. Molecular results were integrated with morphological observations, leading us to confirm the already described species and to the delineation of several new taxa and of the new genus Paracandovia. New Candovia species from various parts of Queensland and New South Wales are described and illustrated (C. alata sp. nov., C. byfieldensis sp. nov., C. dalgleishae sp. nov., C. eungellensis sp. nov., C. karasi sp. nov., C. koensi sp. nov. and C. wollumbinensis sp. nov.). New combinations are proposed and species removed from synonymy with the erection of the new genus Paracandovia (P. cercata stat. rev., comb. nov., P. longipes stat. rev., comb. nov., P. pallida comb. nov., P. peridromes comb. nov., P. tenera stat. rev., comb. nov.). Phylogenetic analyses suggest that the egg capitulum may have independently evolved multiple times throughout the evolutionary history of these insects. Furthermore, two newly described species represent the first taxa with fully developed wings in this previously considered apterous clade.
... One source of support is often from coalescent analyses in fast-evolving genes, which often provide the most direct evidence of evolutionarily independent lineages (Monaghan et al., 2005, 2009; Zhang et al., 2013). Methods for recognizing species from gene coalescents necessarily depend on obtaining a representative gene tree for at least five genuine species (Reid & Carstens, 2012; Fujisawa & Barraclough, 2013; Talavera et al., 2013; Zhang et al., 2013; Leliaert et al., 2014; Dellicour & Flot, 2015) and on the correct rooting of the tree (Zhang et al., 2013), and also require accounting for any: (1) 'numts' (nuclear paralogous copies of mitochondrial genes: Magnacca & Brown, 2010; Song et al., 2014); (2) heteroplasmy (Magnacca & Brown, 2010; Francoso et al., 2016; Williams et al., 2019); (3) other forms of incomplete lineage sorting (Maddison & Knowles, 2006); and (4) introgression (Ballard & Whitlock, 2004; Monaghan et al., 2005). A lack of interbreeding now became a necessary, but not sufficient, criterion for recognizing separate species, with species sometimes accepted as showing a possibly temporary lack of interbreeding between parts of their populations when dispersal barriers were introduced (Goulson et al., 2011). ...
Article
Splitting or lumping of species is a concern because of its potential confounding effect on comparisons of biodiversity and on conservation assessments. By comparing global lists of species reported by previous authors to lists of the presently recognized species that were known to those authors, a simple ratio can be used to describe their relative splitting or lumping of species. One group of ‘model’ organisms claimed for the study of what species are and how to recognize them is bumblebees. A comparison of four bumblebee subgenera shows: (1) an early phase (up to and including 1931) showing splitting, in which taxonomy was dominated by a typological concept of invariant species with heavy reliance on colour-pattern characters; (2) a middle phase (1935–98) showing lumping, associated with a shift to a polytypic concept of species emphasizing morphological characters, often justified with an interbreeding concept of species, but only rarely associated directly with process-related characters; and (3) a recent phase (after 2000), using a concept of species as evolutionarily independent lineages, as evidenced by corroboration from integrative assessment, usually including evidence for coalescents of species in fast-evolving genes compared with morphology. Analysis of splitting or lumping should help to improve biodiversity comparisons and conservation.
... For transformation, we used RelTime (Tamura et al., 2012) as implemented in the time tree tool in MEGA X (Kumar et al., 2018). We ran the single-threshold version of GMYC, which has been shown to outperform the multiple-threshold version (Fujisawa & Barraclough, 2013; Talavera et al., 2013; Michonneau, 2015) using the SPLITS (Ezard et al., 2009) package for R (R Core Team, 2020). We ran mPTP using the command line version downloaded from http://github.com/Pas-Kapli/mptp. ...
Article
Oval frogs (Elachistocleis) have a broad geographic distribution covering nearly all of South America and parts of Central America. They also have a large inter- and intraspecific variation of the few morphological characters commonly used as diagnostic traits among species of the genus. Based on molecular data, we provide the most complete phylogeny of Elachistocleis to date, and explore its genetic diversity using distance-based and tree-based methods for putative species delimitation. Our results show that at least two of the most relevant traditional characters used in the taxonomy of this group (belly pattern and dorsal median white line) carry less phylogenetic information than previously thought. Based on our results, we propose some synonymizations and some candidate new species. This study is a first major step in disentangling the current systematics of Elachistocleis. Yet, a comprehensive review of morphological data is needed before any new species descriptions can be properly made.
... Here it can be postulated that these suggested splits are likely artefactual, possibly resulting from the low number of specimens sequenced and potential biases in branch length estimation (due to uneven gene coverage among some specimens), which could have affected the tree-based SD methods. Finally, the results of our SD analyses are also consistent with a general trend reported by other authors (Miralles & Vences 2013; Talavera et al. 2013; Hamilton et al. 2014; Lecocq et al. 2015; Dellicour & Flot 2018; Luo et al. 2018), which is that tree-based approaches (such as GMYC and PTP) generally yield higher numbers of putative species whereas distance-based methods (such as ABGD and ASAP) generate lower numbers of putative species. The latter underlines the importance of using a wide array of methods and settings to critically interpret and reassess the results of SD analyses, while avoiding biases that could result from the reliance on a limited number of approaches. ...
Article
In this study, 31 species of noctuid stemborers belonging to the genus Sesamia Guenée, 1852 (Lepidoptera: Noctuidae: Noctuinae: Apameini: Sesamiina) are reviewed. All these species are assigned to the Sesamia cretica group sensu Tams & Bowden (1953). Based on genitalic characters, several subgroups are hereby defined. Nine species belong to a species complex defined as the Sesamia albivena Hampson, 1902 subgroup; it consists of S. albivena, S. mocoensis Tams & Bowden, 1953, n. stat., S. sudanensis Tams & Bowden, 1953, n. stat., S. taenioleuca (Wallengren, 1863), and five new species that are described (S. aethiopica n. sp. from Ethiopia, S. kafulo n. sp. from Botswana and Zambia, S. kavirondo n. sp. from Kenya and Uganda, S. maloukou n. sp. from Republic of Congo, and S. soyema n. sp. from Ethiopia). Four species belong to a species complex defined as the Sesamia cretica subgroup; it encompasses S. cretica, S. rufescens Hampson, 1910, and two new species that are described (S. ihambane n. sp. from Mozambique and Tanzania and S. kikuyuensis n. sp. from Kenya); two new synonyms are introduced for Sesamia cretica: Nonagria uniformis Dudgeon 1905 n. syn. and Sesamia griselda Warren, 1913, n. syn. Ten species belong to a species complex defined as the Sesamia fuscifrontia Hampson, 1914 subgroup; it includes S. fuscifrontia, S. geyri (Strand, 1915) and eight new species that are described (S. babati n. sp. from Tanzania, S. babessi n. sp. from Cameroon and Zambia, S. mabira n. sp. from Uganda, S. nangaensis n. sp. from Cameroon and Republic of Congo, S. rungwa n. sp. from Tanzania, S. simillima n. sp. from Benin, Cameroon, Kenya and Uganda, S. taveta n. sp. from Kenya and S. ulaukae n. sp. from Ethiopia). One species belongs to a species complex defined as the Sesamia salama n. sp. subgroup; it consists of S. salama n. sp. from Kenya and another undescribed Sesamia species from South Africa. 
One species belongs to a species complex defined as the Sesamia viettei Rungs, 1954 subgroup. Six species belong to a species complex defined as the Sesamia wiltshirei Rungs, 1963 subgroup; it groups S. wiltshirei and five new species that are described (S. djenoensis n. sp. from Republic of Congo, S. inexpectata n. sp. from South Africa and Zambia, S. lefini n. sp. from Republic of Congo, S. echinochloa n. sp. from Botswana, Kenya, Mozambique, South Africa, Tanzania and Zambia and S. rindini n. sp. from Tanzania). A supplemental description of the previously described species is also provided. Novel host plant records are also provided for 11 species of the S. cretica group. To complement the morphological study, both phylogenetic and molecular species delimitation analyses were carried out on a multimarker (four mitochondrial and two nuclear genes) molecular dataset encompassing 144 specimens representing 35 species (including 25 species from the S. cretica group). Molecular analyses provide a well-supported phylogenetic framework for the species of interest, which are all recovered monophyletic. Molecular species delimitation analyses also support the species status of almost all sampled species. Interestingly, the inferred tree indicates that the S. cretica group and the S. fuscifrontia subgroup are both paraphyletic; this indicates that, while highly informative, the chosen genitalic characters in Sesamia are not all synapomorphies.
... It is an efficient tool to trace young diverging clades (Birch et al., 2017), although it can be influenced by geographical range (Gaytán et al., 2020) and isolation by distance (Mason et al., 2020). Species delimitation tools based on coalescence such as the Generalised Mixed Yule Coalescence Method (GMYC; Fujisawa and Barraclough, 2013) and the Bayesian implementation on Poisson Tree Processes (bPTP; Zhang et al., 2013a) are generally less conservative, with a tendency to predict more OTUs than morphometric analyses (Talavera et al., 2013), outperforming the conventional distance-based species delimitation (Nguyen et al., 2016; Yu et al., 2017). Nevertheless, the lack of consistency in performance and conservatism of these species assignment methods across studies (Fig. S1-S6) highlights the need for comparisons on the efficacy of different methods in resolving taxonomy of morphologically cryptic species (Table 1). ...
Article
The unregulated wildlife trade increases the risk of global biological invasions and, therefore, accurate taxonomic assignments and origin-tracing methods are critical. First, using a meta-analysis comparing studies from the last 10 years, we quantified the efficiency of popular analytical methods used for species assignments and confirmed the higher sensitivity of coalescent-based methods in isolating cryptic operational taxonomic units (OTUs). Second, we proposed a workflow for species identification of unidentified animals from the trade, in this case cryptic Brown frogs (Rana) imported into the Republic of Korea. This integrated workflow relies on the use of a single-locus 16S rRNA gene barcoding along with morphometry, phylogenetic trait, species delimitation modelling and phylogeography. Out of 171 samples, we identified three erroneously imported non-native species: R. chensinensis, R. amurensis and R. kukunoris. Bayes factor delimitation modelling most supported the presence of 12 OTUs from the trade, highlighting a hidden genetic diversity. Both molecular and morphological analyses converged towards a high phenotypic crypticity and similarity in genetic sequences between Korean R. huanrenensis and Chinese R. chensinensis. The combined model-based OTUs and 16S rRNA gene phylogeny of traded and control specimens (n = 230) recovered the trade pathways, and revealed the widespread and likely wild-harvested origins of traded Rana individuals. Our results also highlight the independent evolution of toe webbings in Rana for the last 12.0 Mya, a potential key trait for species identification of northeastern Asian Rana. With the workflow for large-scale species identification developed herein, we urge the development of trade monitoring and legislation on Rana species in northeast Asia.
... Only PTP and bPTP separated L3-L5 into three partitions, and only GMYC separated L5 from the others. These topological species delimitation methods are often suggested to oversplit species (Talavera et al., 2013; Hamilton et al., 2014; Tang et al., 2014; Luo et al., 2018; Xu et al., 2020; Ekimova et al., 2021). L3 and L4 are derived from the same Korean populations, whereas L5 is from the geographically separated Bohai Sea (Fig. 3). ...
Article
Sipunculans are non-segmented marine worms with an anterior retractable introvert, which are commonly included in Annelida based on molecular phylogenetic and phylogenomic analyses. They generally burrow in the soft sediments or live inside the crevices of hard substrata (e.g. calcareous/coralline rocks). However, members of some sipunculan genera (mainly Phascolion and Aspidosiphon) are known to have a peculiar habit of dwelling in vacant shells of gastropods or scaphopods. In this study, we investigated the shell utilization and preference pattern of the species of Phascolion and Aspidosiphon in Japan. We collected 302 sipunculans, comprising 273 and 29 individuals in Phascolion and Aspidosiphon, respectively, from 57–800 m depth of three study sites in the Pacific coast of Honshu Island, Japan. The species of Phascolion were found in vacant shells of 38 genera of 27 families of gastropods and six genera of four families of scaphopods, whereas the species of Aspidosiphon were found in 11 genera of 11 families of gastropods and one genus of scaphopod. These results suggest that members of each genus use a wide range of gastropod and scaphopod shells. The body size of the sipunculans was positively correlated with the shell size, suggesting that they change the shells as they grow. Furthermore, we investigated the shell preference of Phascolion species by comparing morphological characteristics of shells occupied and unoccupied by sipunculans. Generalized linear mixed model (GLMM) analyses suggest that the species of Phascolion tend to use long and narrow shells. Such shells likely fit well the elongated trunk of sipunculans.
... The GMYC species delimitation independently recovered FMNH 265806, 255454, 270492 and VNUF R.2014.50 and C. ngati (IBER 4829 and VNUF R.2020.12) as conspecific. However, the likelihood ratio test was insignificant (p=0.10000) which is not surprising given the high percentage of singletons in the data set (Talavera et al. 2013). The bPTP analysis also recovered these specimens as conspecific with a 0.84 posterior probability. ...
Article
Convergent morphological specializations for an arboreal lifestyle in most species of the Cyrtodactylus brevipalmatus group have been a confounding factor for establishing a stable taxonomy among its species. Recent references to C. interdigitalis from throughout Thailand and Laos were made without comparisons to the type material from Tham Yai Nam Nao, Nam Nao National Park, Phetchabun Province, Thailand, but instead, were based on general morphological similarity and distribution. The taxonomy of C. interdigitalis is stabilized here by comparing the paratypes to other specimens from Thailand and Laos and recovering their phylogenetic relationships based on newly acquired genetic data, including those from the type locality. The phylogeny recovered all specimens outside the type locality to be either C. ngati from Vietnam or new species closely related to C. ngati. Cyrtodactylus interdigitalis is shown here to be a range-restricted upland endemic on the Phetchabun massif of northern Thailand. The phylogeny also indicates that C. ngati extends hundreds of kilometers farther south into northern Thailand and central Laos. We hypothesize that the significant morphological divergence in body shape of the types of C. ngati, compared to that of the Lao and Thai populations, may be due to local adaptations for utilizing karst (C. ngati) rather than vegetation (Lao and Thai populations). Additionally, phylogenetic and multivariate analyses identified a potentially new species from Phu Hin Rong Kla National Park, Phitsanulok Province, in northern Thailand and another from the Khlong Naka Wildlife Sanctuary, Ranong Province, in southern Thailand. A series of newly examined specimens from Kaeng Krachan National Park, Phetchaburi Province, Thailand represents a possible ~82 km range extension to the southeast of C. rukhadeva. 
This research continues to underscore the high diversity of range-restricted upland endemics in Thailand and the importance of examining type material (if possible) in the context of a phylogeny so as to construct proper taxonomies that reveal, rather than obscure, diversity.
... A FASTA alignment was used as input file. For the GMYC analysis, non-unique haplotypes were removed from the alignments (Talavera et al. 2013). Ultrametric tree ...
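The haplotype-collapsing step mentioned in this excerpt can be illustrated with a minimal sketch. It assumes a plain FASTA alignment held in a string and treats two sequences as the same haplotype only when they are identical character for character after upper-casing; the function names `parse_fasta` and `collapse_haplotypes` are hypothetical and are not taken from the cited study, which does not describe its implementation.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (name, sequence) tuples."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def collapse_haplotypes(records):
    """Keep one representative per unique sequence.

    Returns a list of (representative_name, sequence, all_member_names)
    in order of first appearance. Sequences are compared after upper-casing;
    gaps and ambiguity codes are treated as ordinary characters (assumption).
    """
    members = {}  # upper-cased sequence -> names carrying it
    for name, seq in records:
        members.setdefault(seq.upper(), []).append(name)
    return [(names[0], seq, names) for seq, names in members.items()]


if __name__ == "__main__":
    example = ">a\nACGT\n>b\nacgt\n>c\nACGA\n"
    for rep, seq, names in collapse_haplotypes(parse_fasta(example)):
        print(rep, seq, names)
```

Because gaps and ambiguity codes count as ordinary characters here, sequences that differ only in missing data are kept as separate haplotypes; a real pipeline may need a more careful comparison.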
Article
Until recently, three species of the genus Ahnfeltia were recorded from the Russian Pacific coast: A. fastigiata, A. plicata and A. tobuchiensis. A morphological analysis and DNA-barcoding of Ahnfeltia have revealed another species occurring in this area, A. borealis. The endemic taxon of this region, A. tobuchiensis, is conspecific with A. fastigiata and should be reduced to the varietal level, A. fastigiata var. tobuchiensis, as was previously suggested by Maggs et al. The occurrence of A. plicata in the Northwest Pacific requires confirmation by molecular and morphological data.
... r-project.org/projects/splits/). The single-threshold model of bGMYC was preferred since it has been demonstrated to outperform the multiple-threshold model (Fujisawa & Barraclough, 2013; Talavera & al., 2013). ...
Article
The genus Arthrorhaphis is a group of ascomycetes comprising lichenised and non‐lichenised taxa from temperate to arctic‐alpine regions in both hemispheres. Nine species and two infraspecific taxa are currently recognised. Their delimitation, inter‐relationships, and phylogenetic placement remain poorly understood. We have used an integrative taxonomic approach to assess taxon limits, phylogenetic placement of the family, and to test the hypothesis that transition to lichenisation has happened only once. We present a first molecular phylogenetic hypothesis of all but one known Arthrorhaphis species based on Bayesian inference and maximum likelihood analyses of multilocus DNA sequence data. Our results support monophyly of Arthrorhaphis, phylogenetic placement in the Ostropomycetidae, and lichenisation having evolved from lichenicolous ancestors only once. The lichenicolous Arthrorhaphis species are well‐defined both morphologically and genetically. The lichenised A. alpina s.l. and A. citrinella s.l., however, include multiple genetic clades that are partly supported by phenotypic data. We split A. citrinella s.l. into the following five species: (1) A. bullata sp. nov., (2) A. catolechioides comb. & stat. nov., (3) A. citrinella, (4) A. farinosa sp. nov., and (5) A. vulgaris comb. & stat. nov. A sixth phylogenetic clade from the Neotropics remains undescribed herein due to insufficient data. Five circumarctic accessions of A. alpina s.l. form a genetically distinct but morphologically poorly understood clade sister to the alpina‐vacillans clade, which we preliminarily name “A. septentrionalis”. Jointly, our multispecies coalescence analyses, of both single‐locus (bGMYC) and multilocus (bPtP, bP&P) datasets, largely support our proposed species hypotheses in Arthrorhaphis.
... Only PTP and bPTP separated L3-L5 into three partitions, and only GMYC separated L5 from the others. These topological species delimitation methods are often suggested to oversplit species (Talavera et al., 2013; Hamilton et al., 2014; Tang et al., 2014; Luo et al., 2018; Xu et al., 2020; Ekimova et al., 2021). L3 and L4 are derived from the same Korean populations, whereas L5 is from the geographically separated Bohai Sea (Fig. 3). ...
Article
Lingulidae are often considered living fossils, because they have shown little morphological change since the Paleozoic. Limited morphological variation has also made the taxonomic study of living lingulids challenging. We investigated species diversity and phylogenetic relationships of extant lingulids and show that they are substantially more diverse than realized, demonstrating that morphological stasis was commonly accompanied by speciation. Species delimitation based on cytochrome c oxidase subunit I (COI) gene sequences from 194 specimens sampled from East Asia, Australia, Oceania, and the Americas suggested 14–22 species in the lingulids (9–17 species in Lingula and 4–5 species in Glottidia), in contrast to the 11–12 species currently recognized globally in the family. Four-gene phylogenetic analyses supported the sister relationship between Lingula and Glottidia. Within Lingula, L. adamsi, which possesses large, brownish shells, was recovered as sister to all remaining Lingula species, which have more or less greenish shells. Within the greenish Lingula clade, the ‘L. anatina’ complex was sister to the clade that includes the ‘L. reevei’ complex. The ‘L. anatina’ complex was further separated into two major clades with partly separate ranges centered on (i) temperate East Asia, and (ii) the tropical west-central Pacific. Within Glottidia, Pacific species were nested within Atlantic species. Time-calibrated phylogenetic analyses suggested that Lingula likely originated in the early Cretaceous contrary to a previously proposed hypothesis advocating a Cenozoic origin. The separation of Lingula and Glottidia appears to date from the Mesozoic, not from the Carboniferous, contrary to a previous hypothesis. Overall, our results uncovered substantial cryptic diversity in lingulids, which will form the basis for conservation and further taxonomic revision.
... However, the GMYC method is known often to oversplit, especially when compared to the PTP, and this has usually been explained with errors in time calibration of the tree (Pentinsaari et al., 2017). Unfortunately, the reltime method we used for time calibration was not included in the comparative analysis by Talavera et al. (2013), who tested alternative methods for dating ML phylogenies to be used for GMYC. Even when different ...
Article
The investigation of species boundaries in strictly endogeic animals is challenging because they are prone to fine-scale genetic and phenotypic geographical differentiation owing to low dispersal ability. An integrative approach exploiting different sources of information has seldom been adopted in these animals and even more rarely by treating all data sources equally. We investigated species boundaries in the endogeic centipede Clinopodes carinthiacus across the south-eastern Alps by studying genetic and morphological differentiation in a sample of 66 specimens from 27 sites, complemented by the morphological examination of more than 1100 specimens from other sites. Hypotheses of species delimitation were obtained independently from the molecular sequences of three markers (mitochondrial 16S and COI and nuclear 28S) by means of different species discovery methods (automatic barcode gap discovery, assemble species by automatic partitioning, general mixed Yule coalescent and the Poisson tree process) and from ten morphological characters by means of a model-based cluster analysis and Bayesian model selection. We found strong support for the existence of at least two species: C. carinthiacus s.s. and Clinopodes strasseri, which was formerly described as a subspecies of another species, and later placed in synonymy with C. carinthiacus. The two species coexist in syntopy in at least one site.
... One source of support is often from coalescent analyses in fast-evolving genes, which often provide the most direct evidence of evolutionarily independent lineages (Monaghan et al., 2005, 2009; Zhang et al., 2013). Methods for recognizing species from gene coalescents necessarily depend on obtaining a representative gene tree for at least five genuine species (Reid & Carstens, 2012; Fujisawa & Barraclough, 2013; Talavera et al., 2013; Zhang et al., 2013; Leliaert et al., 2014; Dellicour & Flot, 2015) and on the correct rooting of the tree (Zhang et al., 2013), and also require accounting for any: (1) 'numts' (nuclear paralogous copies of mitochondrial genes: Magnacca & Brown, 2010; Song et al., 2014); (2) heteroplasmy (Magnacca & Brown, 2010; Francoso et al., 2016; Williams et al., 2019); (3) other forms of incomplete lineage sorting (Maddison & Knowles, 2006); and (4) introgression (Ballard & Whitlock, 2004; Monaghan et al., 2005). A lack of interbreeding now became a necessary, but not sufficient, criterion for recognizing separate species, with species sometimes accepted as showing a possibly temporary lack of interbreeding between parts of their populations when dispersal barriers were introduced (Goulson et al., 2011). ...
Article
Splitting or lumping of species is a concern because of its potential confounding effect on comparisons of biodiversity and on conservation assessments. By comparing global lists of species reported by previous authors to lists of the presently recognized species that were known to those authors, a simple ratio can be used to describe their relative splitting or lumping of species. One group of 'model' organisms claimed for the study of what species are and how to recognize them is bumblebees. A comparison of four bumblebee subgenera shows: (1) an early phase (up to and including 1931) showing splitting, in which taxonomy was dominated by a typological concept of invariant species with heavy reliance on colour-pattern characters; (2) a middle phase (1935-98) showing lumping, associated with a shift to a polytypic concept of species emphasizing morphological characters, often justified with an interbreeding concept of species, but only rarely associated directly with process-related characters; and (3) a recent phase (after 2000), using a concept of species as evolutionarily independent lineages, as evidenced by corroboration from integrative assessment, usually including evidence for coalescents of species in fast-evolving genes compared with morphology. Analysis of splitting or lumping should help to improve biodiversity comparisons and conservation.
... In the current study, the GMYC approach was more adept at recognizing species than the bPTP and ABGD approaches. This result is consistent with previous findings, which showed that the GMYC method provided more usable taxonomic units than other species delimitation methodologies [91,92,97,99,101-103]. Furthermore, one functional pattern analysis (ITS secondary structure) of the 16S-23S ITS dataset was applied in this study. ...
Article
Simple trichal types constitute a group of cyanobacteria with an abundance of novel, often cryptic taxa. Here, we investigated material collected from wet surface-soil in a saline environment in Petchaburi Province, central Thailand. A morphological comparison of the isolated strain with similar known species, as well as its phylogenetic and species delimitation analyses based on the combined datasets of other related organisms, especially simple trichal cyanobacteria, revealed that the material of this study represented an independent taxon. Using a multifaceted method, we propose that this material represents a new genus, Thainema gen. nov., belonging to the family Leptolyngbyaceae, with the type species Thainema salinarum sp. nov. This novel taxon shares similar ecological habitats with strains previously placed in the same lineage.
... The GMYC method is sometimes considered prone to oversplitting (Lohse 2009, Talavera et al. 2013), but it did not oversplit the datasets in our previous study on the section Nidulantes (Sklenář et al. 2020), nor the dataset in this study. The method delimiting the highest number of species was bPTP and the most conservative method (resulting in the lowest number of delimited species) was bGMYC. ...
Article
Since the last revision in 2015, the taxonomy of section Flavipedes evolved rapidly along with the availability of new species delimitation techniques. This study aims to re-evaluate the species boundaries of section Flavipedes members using modern delimitation methods applied to an extended set of strains (n = 90) collected from various environments. The analysis used DNA sequences of three housekeeping genes (benA, CaM, RPB2) and consisted of two steps: application of several single-locus (GMYC, bGMYC, PTP, bPTP) and multi-locus (STACEY) species delimitation methods to sort the isolates into putative species, which were subsequently validated using DELINEATE software that was applied for the first time in fungal taxonomy. As a result, four new species are introduced, i.e. A. alboluteus, A. alboviridis, A. inusitatus and A. lanuginosus, and A. capensis is synonymized with A. iizukae. Phenotypic analyses were performed for the new species and their relatives, and the results showed that the growth parameters at different temperatures and colony characteristics were useful for differentiation of these taxa. The revised section harbors 18 species, most of which are known from soil. However, the most common species from the section are ecologically diverse, occurring in the indoor environment (six species), clinical samples (five species), food and feed (four species), droppings (four species) and other less common substrates/environments. Due to the occurrence of section Flavipedes species in the clinical material/hospital environment, we also evaluated the susceptibility of 67 strains to six antifungals (amphotericin B, itraconazole, posaconazole, voriconazole, isavuconazole, terbinafine) using the reference EUCAST method. These results showed some potentially clinically relevant differences in susceptibility between species. For example, MICs higher than those observed for A. fumigatus wild-type were found for both triazoles and amphotericin B for A. ardalensis, A. iizukae, and A. spelaeus whereas A. lanuginosus, A. luppiae, A. movilensis, A. neoflavipes, A. olivimuriae and A. suttoniae were comparable to or more susceptible than A. fumigatus. Finally, terbinafine was in vitro active against all species except A. alboviridis.
... The phylogenetic analysis and species delimitation presented here have important limitations, as they assessed only one locus, represented by part of the mitochondrial DNA. Additionally, the literature strongly recommends that GMYC results should not be considered alone to guide taxonomic reviews, but should be discussed in the light of other evidence and biological information (Talavera et al., 2013; Tang et al., 2014). The elucidation of the phylogenetic relationships and species delimitation in the M. americana complex depends not only on advances in molecular analysis, but also on better sampling of Amazonian populations, which are still underrepresented. ...
Article
The red brocket deer Mazama americana Erxleben, 1777 is considered a polyphyletic complex of cryptic species with wide chromosomal divergence. Evidence indicates that the observed chromosomal divergences result in reproductive isolation. The description of a neotype for M. americana allowed its genetic characterization and represented a comparative basis to resolve the taxonomic uncertainties of the group. Thus, we designated a neotype for the synonym Mazama rufa Illiger, 1815 and tested its recognition as a distinct species from the M. americana complex with the analysis of morphological, cytogenetic and molecular data. We also evaluated its distribution by sampling fecal DNA in the wild. Morphological data from craniometry and body biometry indicated an overlap of quantitative measurements between M. rufa and the entire M. americana complex. The phylogenetic hypothesis obtained through mtDNA confirmed the reciprocal monophyly relationship between M. americana and M. rufa, and both were identified as distinct molecular operational taxonomic units by the General Mixed Yule Coalescent species delimitation analysis. Finally, classic cytogenetic data and fluorescence in situ hybridization with whole chromosome painting probes showed M. rufa with a karyotype of 2n = 52, FN = 56. Comparative analyses indicate that at least fifteen rearrangements separate M. rufa and M. americana (sensu stricto) karyotypes, which confirmed their substantial chromosomal divergence. This divergence should represent an important reproductive barrier and allow its characterization as a distinct and valid species. Genetic analysis of fecal samples demonstrated a wide distribution of M. rufa in the South American continent through the Atlantic Forest, Cerrado and southern region of the Amazon. Thus, we support the revalidation of M. rufa as a distinct species under the concept of biological isolation, with its karyotype as the main diagnostic character.
The present work serves as a basis for the taxonomic review of the M. americana complex, which should be mainly based on cytogenetic characterization and directed towards a better sampling of the Amazon region, the evaluation of available names in the species synonymy and a multi-locus phylogenetic analysis.
... Therefore, the number of species depends on the method used, although all of them pointed to undescribed cryptic species in the region. Some studies argue that in some cases GMYC can lead to an overestimation of the number of species (e.g., Talavera et al., 2013; García-Melo et al., 2019). Although the GMYC analysis delimited the highest number of species in our study, the results of this method were corroborated by the WP and ABGD result 2. Therefore, we cannot state that the GMYC analysis overestimated the number of delimited species in our study. ...
Article
Recent studies in eastern Amazon coastal drainages and their surroundings have revealed new fish species that sometimes exhibit little morphological differentiation (cryptic species). Thus, we used a DNA-based species delimitation approach to test if populations showing the morphotype and typical character states of the Aphyocharax avary holotype correspond either to A. avary or A. brevicaudatus, two known species from the region, or if they form independent lineages, indicating cryptic speciation. WP and GMYC analyses recovered five lineages (species) in the ingroup, while a bPTP analysis delimited three lineages. ABGD analyses produced two possible results: one corroborating the WP and GMYC methods and another corroborating the bPTP method. All methods indicate undescribed cryptic species in the region and show variation from at least 1 to 4 species in the ingroup, depending on the approach, corroborating previous studies, and revealing this region as a possible hotspot for discovering undescribed fish species.
... This data filtering is necessary because of limitations of the species delimitation algorithms, and it also saves computational effort. Moreover, in genealogical inference using BEAST, identical sequences are interpreted as coalescent alleles, generating branch lengths different from zero, probably affecting the GMYC estimates (Reid & Carstens, 2012; Talavera et al., 2013). ...
Article
Hypostomus Lacépède, 1803, is a species-rich and widespread fish genus from the Neotropical region, but its evolutionary history and systematics remain largely unclear. In addition, species from regions with high levels of species richness and endemism such as the Northeastern Mata Atlântica (NMA) and São Francisco (SF) hydrographic freshwater ecoregions are underrepresented in phylogenetic studies so far. In this study, we performed a broad sampling of Hypostomus in NMA and SF to investigate the interspecific boundaries and phylogenetic relationships using a multilocus approach based on one mitochondrial and three nuclear markers. Seven genetic groups were found for both ecoregions, in addition to one lineage exclusive to the SF represented by populations of the species Hypostomus velhochico. Moreover, multilocus analyses validated 16 formally described species and two new lineages for the NMA and SF ecoregions, as well as putative cases of species complexes. In general, lineages with independent evolutionary histories were revealed within the same ecoregion alongside genetically related groups between NMA, SF, and other ecoregions. These data reinforce the intricate scenario of this group of Neotropical fish and are useful to outline further phylogeographic studies in Hypostomus.
... With small datasets, ABGD, PTP and TCS, when employed in an integrative fashion, are more stable approaches in inferring OTUs than GMYC (Hamilton et al., 2014). The GMYC method is susceptible to tree reconstruction method, sampling bias and taxon coverage (Talavera et al., 2013). Regarding the ABGD method, Meier et al. (2008) argued in favour of choosing the smallest interspecific genetic distance for determining the 'barcode gap', but in our samples, this method failed to recognize two species that are well supported morphologically, M. fuscus and M. gallicus. ...
Article
Previously considered a thelytokous parthenogenetic species, the widespread ant cricket Myrmecophilus acervorum actually turns out to have a mixed reproductive system: our recent surveys in the central part of its distribution area have revealed the presence of both sexes. Detailed morphological and morphometric descriptions of the previously unknown males are here provided. New data on species distribution in south-eastern Europe are presented, including the first records of M. balcanicus in Bulgaria and of M. nonveilleri in Bulgaria and Hungary. Phylogenetic and phylogeographic analyses have revealed several haplotypes of M. acervorum in Europe, with six of them forming a parthenogenetic clade in populations distributed west of the Carpathians. We tested our samples for bacterial infection by Wolbachia and, surprisingly, Wolbachia was identified only in populations with both sexes and no amplification was obtained from parthenogenetic populations. Phylogenetic analyses performed with sequences pertaining to five nominal species related to M. acervorum yielded topologically congruent trees with four well-supported groups: one group with M. acervorum samples, the second group with M. nonveilleri samples, the third group with M. fuscus and M. gallicus samples, and the fourth group with samples of M. balcanicus. We performed species delineation tests on our sequences, which delimited between four and seven putative species.
... Initially, GMYC was proposed by Pons et al. (2006) as a molecular method for species delimitation in a Maximum Likelihood context and based on ultrametric trees. The method was found to be robust in a range of departures from its simplifying assumptions (Fujisawa & Barraclough 2013;Talavera et al. 2013). Herein, this species delimitation method was robust and concordant for the analyzed species of Pseudocellus, recovering the new species based on traditional morphology (Fig. 2). ...
Article
A new species of epigean ricinuleid of the genus Pseudocellus Platnick, 1980 from El Triunfo Biosphere Reserve, Chiapas, Mexico is described. DNA barcoding utilizing mitochondrial cytochrome c oxidase subunit 1 (CO1) and morphology were used for species delimitation. Molecular analyses and species delimitation included four methods: 1) General Mixed Yule Coalescent model (GMYC), 2) Automatic Barcode Gap Discovery (ABGD), 3) Bayesian Poisson Tree Process (bPTP), and 4) Assemble Species by Automatic Partitioning (ASAP). All molecular methods and morphology were consistent in delimiting and recognizing the new species described herein. The average interspecific genetic distance (p-distance) among analyzed species of Pseudocellus was 11.6%. The species is described based on adult males and females: Pseudocellus giribeti sp. nov. This is the seventh species described from Chiapas, which holds the highest number of ricinuleids species for the country. The total number of described species of Pseudocellus from Mexico increases to 21, having the highest species diversity of known ricinuleids worldwide.
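The average interspecific p-distance reported in this abstract is the kind of summary that can be computed directly from an alignment. The sketch below is illustrative only: it assumes pre-aligned sequences of equal length, skips sites where either sequence has a gap or an ambiguous base (`-`, `N`, `?`), and averages over all pairs of specimens assigned to different species. The function names and the pair-weighting scheme are assumptions, not the procedure used by the authors of the cited study.

```python
def p_distance(seq_a, seq_b, ignore="-N?"):
    """Proportion of compared sites at which two aligned sequences differ.

    Sites where either sequence carries a character listed in `ignore`
    are skipped (an assumption; other conventions exist).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in ignore or b in ignore:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def mean_interspecific_distance(sequences, species):
    """Mean p-distance over all specimen pairs from different species.

    sequences: dict mapping specimen id -> aligned sequence
    species:   dict mapping specimen id -> species label
    """
    ids = sorted(sequences)
    total = 0.0
    pairs = 0
    for i, first in enumerate(ids):
        for second in ids[i + 1:]:
            if species[first] != species[second]:
                total += p_distance(sequences[first], sequences[second])
                pairs += 1
    if pairs == 0:
        raise ValueError("need specimens from at least two species")
    return total / pairs


if __name__ == "__main__":
    seqs = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "ACGTACGT"}
    labels = {"s1": "X", "s2": "Y", "s3": "X"}
    print(mean_interspecific_distance(seqs, labels))
```

Averaging over specimen pairs weights heavily sampled species more strongly; averaging per species pair first is an equally plausible alternative and would give different values on unbalanced datasets.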
... These methods aim to identify putative species by discriminating between-species coalescence from within-species coalescence using information from branching rates. Because it is not uncommon that different species delimitation analyses result in incongruent results (Carstens et al. 2013), it is important that putative species delimited by GMYC and PTP are validated with other lines of evidence (Talavera et al. 2013; Zhang et al. 2013). Therefore, we used a conservative, multifaceted approach to species delimitation in which putative GMYC- and PTP-delimited species are only accepted as species if they are also diagnosably morphologically distinct. ...
Article
Tetrastigma loheri (Vitaceae) is a vine species native to Borneo and the Philippines. Because it is a commonly encountered forest species in the Philippines, T. loheri is potentially suitable for studying patterns of genetic diversity and connectivity among fragmented forest ecosystems in various parts of this country. However, previous research suggests that T. loheri is part of a species complex in the Philippines (i.e. the T. loheri s. l. complex) that potentially also contains Philippine plants identified as T. diepenhorstii, T. philippinense, T. stenophyllum, and T. trifoliolatum. This uncertainty about its taxonomic delimitation can make it challenging to draw conclusions that are relevant to conservation from genetic studies using this species. Here, we tested the hypothesis that T. loheri s. l. is composed of more than one species in the Philippines. For this, we used generalized mixed Yule coalescent (GMYC) and Poisson tree process (PTP) species delimitation models to identify clades within DNA sequence phylogenies of T. loheri s. l. that might constitute species within this complex. Although these methods identified several putative species, these are statistically poorly supported and subsequent random forest analyses using a geometric morphometric leaf shape dataset and several other vegetative characters did not result in the identification of characters that can be used to discriminate these putative species morphologically. Furthermore, the results of principal component and principal coordinates analyses of these data suggest the absence of morphological discontinuities within the species complex. Under a unified species concept that uses phylogenetic and morphological distinction as operational criteria for species recognition, we therefore conclude that the currently available data do not support recognizing multiple species in the T. loheri s. l. complex. This implies that T. loheri is best considered as a single, morphologically variable species when used for studying patterns of genetic diversity and connectivity in the Philippines.
Article
A new species of Alainites, Alainites albai sp. nov., is described from northern Morocco. It can be separated from the other west Palearctic species by the gill number, the spination of the distal margin of tergites, the leg setation, and the paraproct shape and spination. This species is widespread in the study area but never abundant. It prefers small to medium streams with slow flow, and does not seem to be very sensitive to pollution and water logging activities.
Article
Endeavours in species discovery, particularly the characterisation of cryptic species, have been greatly aided by the application of DNA molecular sequence data to phylogenetic reconstruction and inference of evolutionary and biogeographic processes. However, the extent of cryptic and undescribed diversity remains unclear in tropical freshwaters, where biodiversity is declining at alarming rates. To investigate how data on previously undiscovered biodiversity impacts inferences of biogeography and diversification dynamics, we generated a densely sampled species-level family tree of Afrotropical Mochokidae catfishes (220 valid species) that was ca. 70% complete. This was achieved through extensive continental sampling specifically targeting the genus Chiloglanis, a specialist of the relatively unexplored fast-flowing lotic habitat. Applying multiple species-delimitation methods, we report exceptional levels of species discovery for a vertebrate genus, conservatively delimiting a staggering ca. 50 putative new Chiloglanis species, resulting in a near 80% increase in species richness for the genus. Biogeographic reconstructions of the family identified the Congo Basin as a critical region in the generation of mochokid diversity, and further revealed complex scenarios for the build-up of continental assemblages of the two most species rich mochokid genera, Synodontis and Chiloglanis. While Synodontis showed most divergence events within freshwater ecoregions consistent with largely in situ diversification, Chiloglanis showed much less aggregation of freshwater ecoregions, suggesting dispersal as a key diversification process in this older group. Despite the significant increase in mochokid diversity identified here, diversification rates were best supported by a constant rate model consistent with patterns in many other tropical continental radiations.
While our findings highlight fast-flowing lotic freshwaters as potential hotspots for undescribed and cryptic species diversity, a third of all freshwater fishes are currently threatened with extinction, signifying an urgent need to increase exploration of tropical freshwaters to better characterise and conserve its biodiversity.
Article
The genomics revolution continues to change how ecologists and evolutionary biologists study the evolution and maintenance of biodiversity. It is now easier than ever to generate large molecular data sets consisting of hundreds to thousands of independently evolving nuclear loci to estimate a suite of evolutionary and demographic parameters. However, any inferences will be incomplete or inaccurate if incorrect taxonomic identities are perpetuated throughout the analytical pipeline. Due to decades of research and comprehensive online databases, sequencing and analysis of mitochondrial DNA (mtDNA), chloroplast DNA (cpDNA) and select nuclear genes can provide researchers with a cost effective and simple means to verify the species identity of samples prior to subsequent phylogeographic and population genomic analysis. The addition of these sequences to genomic studies can also shed light on other important evolutionary questions such as explanations for gene tree-species tree discordance, species limits, sex-biased dispersal patterns, adaptation, and mtDNA introgression. Although the mtDNA and cpDNA genomes often should not be used exclusively to make historical inferences given their well-known limitations, the addition of these data to modern genomic studies adds little cost and effort while simultaneously providing a wealth of useful data that can have significant implications for both basic and applied research.
Article
The advent of molecular techniques has resulted in the discovery of previously unknown cryptic species in many organisms, including rotifers, allowing researchers to perform detailed tests on reproductive barriers between closely related taxa. Here, we review the available literature about molecular techniques implemented for detecting cryptic diversity in rotifers and delimiting species, and discuss the potential mechanisms of reproductive isolation among rotifer cryptic species. Half of the evaluated studies used quantitative statistical approaches for species delimitation. Species boundaries defined by molecular approaches were also examined by conducting mating experiments. Those mating experiments were used to test various reproductive barriers. Most (75%) studies that identified reproductive isolation mechanisms provided evidence of prezygotic barriers. The available literature suggests that behavioral reproductive isolation may have a more important role than other prezygotic barriers in facilitating reproductive isolation. Postzygotic barriers such as hybrid unviability or sterility and female mortality also contribute to reproductive isolation among rotifer cryptic species. The prevalence of prezygotic barriers in our assessment may stem from the difficulty of studying postzygotic barriers, which can require long-term maintenance and monitoring of rotifer populations. Sequencing tools, including whole genome sequencing, could be implemented to investigate the molecular basis of reproductive isolation in rotifers.
Article
Full-text available
Molecular phylogenetic studies have shown that the characters of the reduced shell of the false limpets of the genus Siphonaria Sowerby I, 1823 are highly variable and often insufficient for species delimitation. The taxonomy and distribution of Siphonaria in the Indian Ocean are poorly known. We sampled Siphonaria in the Seychelles Bank to check the occurrence of recorded species using DNA sequences and to study the paths through which Siphonaria species have colonised the Seychelles Bank by reconstructing their phylogenetic relationships. Analyses of a dataset comprising 16S rRNA gene sequences of 33 specimens from the Seychelles Bank and 300 additional Siphonaria sequences from other regions from GenBank with various methods for species delimitation resulted in 19–102 primary species hypotheses. Assemble Species by Automatic Partitioning provided a conservative estimate of the species number (42) in which several indisputable species were lumped. The results of Automatic Barcode Gap Discovery depended strongly on the assumed prior maximum intraspecific divergence, whereas the tree‐based methods Generalised Mixed Yule Coalescent and Poisson Tree Processes resulted in high overestimates. The specimens from the Seychelles Bank represent three clades, belonging to the Siphonaria ‘atra’ group, the Siphonaria ‘normalis’ group and a possibly undescribed species recorded previously only from Hainan. At least two of the three species recorded from the Seychelles Bank came from the east, i.e., from the Coral Triangle in the Indo‐Australian Archipelago, the region with the highest marine biodiversity worldwide. A major transport mechanism across the Indian Ocean was probably the South Equatorial Current.
Article
Full-text available
Cambeva contains species with complex taxonomy or that are poorly delimited in terms of morphology and geographic distribution. We conducted an extensive review of Cambeva populations from coastal drainages of Southern to Southeastern Brazil to evaluate species geographic limits with an integrative analysis including morphological and molecular data (COI). We test whether two single-locus methods, Bayesian Poisson Tree Processes (bPTP) and Generalized Mixed Yule Coalescent (GMYC), are efficient at delimiting species boundaries in Cambeva by comparison with the diagnosable morphological units. Using GMYC, we also evaluated the combination of tree and molecular clock priors to reconstruct the input phylogeny and assessed how well the implemented model fitted our empirical data. Eleven species were identified using a morphological diagnosability criterion: Cambeva balios, C. barbosae, C. botuvera, C. cubataonis, C. davisi, C. guaraquessaba, C. iheringi, C. tupinamba, and C. zonata and two treated as undescribed species. In contrast with previous knowledge, many of them have wider distributions and high intraspecific variation. Species delimitation based on a single locus showed incongruences between the methods and differed strongly from the morphological delimitation. These disagreements and the violation of the GMYC model suggest that single-locus data are insufficient to delimit Cambeva species and that the failure may be attributable to events of mitochondrial introgression and incomplete lineage sorting.
Article
Caucasia is a global biodiversity hotspot, rich in amphibians, including several endemic species. We sequenced samples from Parsley frogs (genus Pelodytes) across their Anatolian range to generate a barcode reference database and to assess patterns of genetic diversity in the species. Different species delimitation methods (ABGD, ASAP, GMYC and PTP) were applied to assess species diversity in the genus Pelodytes based on published and newly obtained mtDNA sequences. A majority of the species delimitation tests (ABGD, GMYC and ASAP) recovered four taxonomic units corresponding to currently accepted taxonomy (P. atlanticus, P. caucasicus, P. ibericus and P. punctatus). PTP, on the other hand, recovered only two taxonomic units, one combining the three Iberian taxa (P. atlanticus, P. ibericus, and P. punctatus), and the other, P. caucasicus. In Anatolia, individuals from Giresun and Trabzon were found to be genetically closer to each other compared to those from Rize and Artvin, based on genetic distances and phylogenetic and haplotype network analyses.
Article
Full-text available
Characterising biodiversity is one of the main challenges for the coming decades. Most diversity has not been morphologically described and barcoding is now complementing morphology-based taxonomy to further develop inventories. Both approaches have been cross-validated at the level of species and OTUs. However, many known species are not listed in reference databases. One path to speed up inventories using barcoding is to directly identify individuals at coarser taxonomic levels. We therefore studied, for plant barcoding, whether morphology-based and molecular-based approaches are in agreement at genus, family and order levels. We used Agglomerative Hierarchical Clustering (with Ward, Complete and Single Linkage) and Stochastic Block Models (SBM), with two dissimilarity measures (Smith-Waterman scores, kmers). The agreement between morphology-based and molecular-based classifications ranges in most cases from good to very good at taxonomic levels above species, even though it decreases when taxonomic levels increase, or when using the tetramer-based distance. Agreement is correlated with the entropy of the morphology-based classification and with the ratio of the mean within-group and mean between-group dissimilarities. The Ward method globally leads to the best agreement whereas Single Linkage can behave poorly. SBM provides a useful tool to test whether or not the dissimilarities are structured by the botanical groups. These results suggest that automatic clustering and group identification at taxonomic levels above species are possible in barcoding.
Preprint
Full-text available
The wildlife trade increases the risk of global biological invasions and accurate species assignment is critical. Here, using cross-study comparisons, we quantified the efficiency of species assignment methods and confirmed the higher sensitivity of coalescent-based methods in isolating cryptic operational taxonomic units (OTUs). We then provided a framework incorporating morphometry, phylogenetic trait, species delimitation modelling and phylogeography to improve the accuracy of species identification of brown frogs (Rana) imported into the Republic of Korea. We identified three non-native species: R. chensinensis, R. amurensis and R. kukunoris, and the presence of 12 OTUs from the trade. The combined model-based OTUs and 16S phylogeny on traded and control specimens (n = 230) revealed the widespread and likely wild-harvested origins of traded Rana individuals. Our results also highlight the independent evolution of toe-webbings in Rana for the last 12.0 Mya, a key trait for species identification. With the framework for broad-scale species identification developed herein, we urge the development of trade monitoring and legislation on Rana species in northeast Asia.
Article
Full-text available
Although much biological research depends upon species diagnoses, taxonomic expertise is collapsing. We are convinced that the sole prospect for a sustainable identification capability lies in the construction of systems that employ DNA sequences as taxon 'barcodes'. We establish that the mitochondrial gene cytochrome c oxidase I (COI) can serve as the core of a global bioidentification system for animals. First, we demonstrate that COI profiles, derived from the low-density sampling of higher taxonomic categories, ordinarily assign newly analysed taxa to the appropriate phylum or order. Second, we demonstrate that species-level assignments can be obtained by creating comprehensive COI profiles. A model COI profile, based upon the analysis of a single individual from each of 200 closely allied species of lepidopterans, was 100% successful in correctly identifying subsequent specimens. When fully developed, a COI identification system will provide a reliable, cost-effective and accessible solution to the current problem of species identification. Its assembly will also generate important new insights into the diversification of life and the rules of molecular evolution.
Article
Full-text available
DNA barcoding-type studies assemble single-locus data from large samples of individuals and species, and have provided new kinds of data for evolutionary surveys of diversity. An important goal of many such studies is to delimit evolutionarily significant species units, especially in biodiversity surveys from environmental DNA samples. The Generalized Mixed Yule Coalescent (GMYC) method is a likelihood method for delimiting species by fitting within- and between-species branching models to reconstructed gene trees. Although the method has been widely used, it has not previously been described in detail or evaluated fully against simulations of alternative scenarios of true patterns of population variation and divergence between species. Here, we present important reformulations to the GMYC method as originally specified, and demonstrate its robustness to a range of departures from its simplifying assumptions. The main factor affecting the accuracy of delimitation is the mean population size of species relative to divergence times between them. Other departures from the model assumptions, such as varying population sizes among species, alternative scenarios for speciation and extinction, and population growth or subdivision within species, have relatively smaller effects. Our simulations demonstrate that support measures derived from the likelihood function provide a robust indication of when the model performs well and when it leads to inaccurate delimitations. Finally, the so-called single threshold version of the method outperforms the multiple threshold version of the method on simulated data: we argue that this might represent a fundamental limit due to the nature of evidence used to delimit species in this approach. Together with other studies comparing its performance relative to other methods, our findings support the robustness of GMYC as a tool for delimiting species when only single-locus information is available.
Article
Full-text available
Many cold adapted species occur in both montane settings and in the subarctic. Their disjunct distributions create taxonomic complexity because there is no standardized method to establish whether their allopatric populations represent single or different species. This study employs DNA barcoding to gain new perspectives on the levels and patterns of sequence divergence among populations of 122 arctic-alpine species of Lepidoptera from the Alps, Fennoscandia and North America. It reveals intraspecific variability in the barcode region ranging from 0.00-10.08%. Eleven supposedly different species pairs or groups show close genetic similarity, suggesting possible synonymy in many cases. However, a total of 33 species show evidence of cryptic diversity as evidenced by the presence of lineages with over 2% maximum barcode divergence in Europe, in North America or between the two continents. Our study also reveals cases where taxonomic names have been used inconsistently between regions and exposes misidentifications. Overall, DNA barcodes have great potential to both increase taxonomic resolution and to make decisions concerning the taxonomic status of allopatric populations more objective.
Article
Full-text available
Background: Species are considered the fundamental unit in many ecological and evolutionary analyses, yet accurate, complete, accessible taxonomic frameworks with which to identify them are often unavailable to researchers. In such cases DNA sequence-based species delimitation has been proposed as a means of estimating species boundaries for further analysis. Several methods have been proposed to accomplish this. Here we present a Bayesian implementation of an evolutionary model-based method, the general mixed Yule-coalescent model (GMYC). Our implementation integrates over the parameters of the model and uncertainty in phylogenetic relationships using the output of widely available phylogenetic models and Markov-Chain Monte Carlo (MCMC) simulation in order to produce marginal probabilities of species identities. Results: We conducted simulations testing the effects of species evolutionary history, levels of intraspecific sampling and number of nucleotides sequenced. We also re-analyze the dataset used to introduce the original GMYC model. We found that the model results are improved with addition of DNA sequence and increased sampling, although these improvements have limits. The most important factor in the success of the model is the underlying phylogenetic history of the species under consideration. Recent and rapid divergences result in higher amounts of uncertainty in the model and eventually cause the model to fail to accurately assess uncertainty in species limits. Conclusion: Our results suggest that the GMYC model can be useful under a wide variety of circumstances, particularly in cases where divergences are deeper, or taxon sampling is incomplete, as in many studies of ecological communities, but that, in accordance with expectations from coalescent theory, rapid, recent radiations may yield inaccurate results.
Our implementation differs from existing ones in two ways: it allows for the accounting for important sources of uncertainty in the model (phylogenetic and in parameters specific to the model) and in the specification of informative prior distributions that can increase the precision of the model. We have incorporated this model into a user-friendly R package available on the authors’ websites.
Article
Full-text available
Prospects for a comprehensive inventory of global biodiversity would be greatly improved by automating methods of species delimitation. The general mixed Yule-coalescent (GMYC) was recently proposed as a potential means of increasing the rate of biodiversity exploration. We tested this method with simulated data and applied it to a group of poorly known bats (Hipposideros) from the Philippines. We then used echolocation call characteristics to evaluate the plausibility of species boundaries suggested by GMYC. In our simulations, GMYC performed relatively well (errors in estimated species diversity less than 25%) when the product of the haploid effective population size (N_e) and speciation rate (SR; per lineage per million years) was less than or equal to 10^5, while interspecific variation in N_e was twofold or less. However, at higher but also biologically relevant values of N_e × SR and when N_e varied tenfold among species, performance was very poor. GMYC analyses of mitochondrial DNA sequences from Philippine Hipposideros suggest actual diversity may be approximately twice the current estimate, and available echolocation call data are mostly consistent with GMYC delimitations. In conclusion, we consider the GMYC model useful under some conditions, but additional information on N_e, SR and/or corroboration from independent character data are needed to allow meaningful interpretation of results.
Article
Full-text available
Integrative taxonomy is a recently developed approach that uses multiple lines of evidence such as molecular, morphological, ecological and geographical data to test species limits, and it stands as one of the most promising approaches to species delimitation in taxonomically difficult groups. The Pnigalio soemius complex (Hymenoptera: Eulophidae) represents an interesting taxonomical and ecological study case, as it is characterized by a lack of informative morphological characters, deep mitochondrial divergence, and is susceptible to infection by parthenogenesis-inducing Rickettsia. We tested the effectiveness of an integrative taxonomy approach in delimiting species within the P. soemius complex. We analysed two molecular markers (COI and ITS2) using different methods, performed multivariate analysis on morphometric data and exploited ecological data such as host-plant system associations, geographical separation, and the prevalence, type and effects of endosymbiont infection. The challenge of resolving different levels of resolution in the data was met by setting up a formal procedure of data integration within and between conflicting independent lines of evidence. An iterative corroboration process of multiple sources of data eventually indicated the existence of several cryptic species that can be treated as stable taxonomic hypotheses. Furthermore, the integrative approach confirmed a trend towards host specificity within the presumed polyphagous P. soemius and suggested that Rickettsia could have played a major role in the reproductive isolation and genetic diversification of at least two species.
Article
Full-text available
Eight years after DNA barcoding was formally proposed on a large scale, CO1 sequences are rapidly accumulating from around the world. While studies to date have mostly targeted local or regional species assemblages, the recent launch of the global iBOL project (International Barcode of Life), highlights the need to understand the effects of geographical scale on Barcoding's goals. Sampling has been central in the debate on DNA Barcoding, but the effect of the geographical scale of sampling has not yet been thoroughly and explicitly tested with empirical data. Here, we present a CO1 data set of aquatic predaceous diving beetles of the tribe Agabini, sampled throughout Europe, and use it to investigate how the geographic scale of sampling affects 1) the estimated intraspecific variation of species, 2) the genetic distance to the most closely related heterospecific, 3) the ratio of intraspecific and interspecific variation, 4) the frequency of taxonomically recognized species found to be monophyletic, and 5) query identification performance based on 6 different species assignment methods. Intraspecific variation was significantly correlated with the geographical scale of sampling (R-square = 0.7), and more than half of the species with 10 or more sampled individuals (N = 29) showed higher intraspecific variation than 1% sequence divergence. In contrast, the distance to the closest heterospecific showed a significant decrease with increasing geographical scale of sampling. The average genetic distance dropped from > 7% for samples within 1 km, to < 3.5% for samples up to > 6000 km apart. Over a third of the species were not monophyletic, and the proportion increased through locally, nationally, regionally, and continentally restricted subsets of the data. The success of identifying queries decreased with increasing spatial scale of sampling; liberal methods declined from 100% to around 90%, whereas strict methods dropped to below 50% at continental scales. 
The proportion of query identifications considered uncertain (more than one species < 1% distance from query) escalated from zero at local, to 50% at continental scale. Finally, by resampling the most widely sampled species we show that even if samples are collected to maximize the geographical coverage, up to 70 individuals are required to sample 95% of intraspecific variation. The results show that the geographical scale of sampling has a critical impact on the global application of DNA barcoding. Scale-effects result from the relative importance of different processes determining the composition of regional species assemblages (dispersal and ecological assembly) and global clades (demography, speciation, and extinction). The incorporation of geographical information, where available, will be required to obtain identification rates at global scales equivalent to those in regional barcoding studies. Our result hence provides an impetus for both smarter barcoding tools and sprouting national barcoding initiatives-smaller geographical scales deliver higher accuracy.
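The two quantities compared in this abstract, intraspecific variation and the distance to the closest heterospecific, can be illustrated with a minimal Python sketch. It assumes aligned sequences of equal length and uses uncorrected p-distances; the function names and the toy sequences are hypothetical and are not taken from the study, which may well have used a different distance model.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for x, y in zip(a, b) if x != y)
    return diffs / len(a)


def barcode_gap(samples):
    """Summarise each species from a list of (species, sequence) pairs.

    Returns {species: (max_intraspecific, min_interspecific)}.
    max_intraspecific is 0.0 for species with a single sequence;
    min_interspecific is None when no other species is present.
    """
    result = {}
    for sp in sorted({name for name, _ in samples}):
        own = [seq for name, seq in samples if name == sp]
        other = [seq for name, seq in samples if name != sp]
        max_intra = max(
            (p_distance(a, b) for a, b in combinations(own, 2)), default=0.0
        )
        min_inter = min(
            (p_distance(a, b) for a in own for b in other), default=None
        )
        result[sp] = (max_intra, min_inter)
    return result


# Hypothetical toy alignment (10 sites), for illustration only.
toy = [
    ("sp_A", "ACGTACGTAC"),
    ("sp_A", "ACGTACGTAT"),
    ("sp_B", "ACGTTCGTTT"),
    ("sp_B", "ACGTTCGTTT"),
]
gaps = barcode_gap(toy)
for species, (intra, inter) in gaps.items():
    print(species, "max intraspecific:", intra, "nearest heterospecific:", inter)
```

A species whose maximum intraspecific distance approaches or exceeds its nearest heterospecific distance has no barcode gap, which is the situation the abstract reports becoming more frequent as the geographical scale of sampling grows.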
Article
Full-text available
Large-scale sequencing of short mtDNA fragments for biodiversity inventories ('DNA barcoding') indicates that sequence variation in animal mtDNA is highly structured and partitioned into discrete genetic clusters that correspond broadly to species-level entities. Here we explore how the migration rate, an important demographic parameter that is directly related to population isolation, might affect variation in the strength of mtDNA clustering among taxa. Patterns of mtDNA variation were investigated in two groups of beetles that both contain lineages occupying habitats predicted to select for different dispersal abilities: predacious diving beetles (Dytiscidae) in the genus Bidessus from lotic and lentic habitats across Europe and darkling beetles (Tenebrionidae) in the genus Eutagenia from sand and other soil types in the Aegean Islands. The degree of genetic clustering was determined using the recently developed 'mixed Yule coalescent' (MYC) model that detects the transition from between-species to within-population branching patterns. Lineages from presumed stable habitats, and therefore displaying lower dispersal ability and migration rates, showed greater levels of mtDNA clustering and geographical subdivision than their close relatives inhabiting ephemeral habitats. Simulations of expected patterns of mtDNA variation under island models showed that MYC clusters are only detected when the migration rates are much lower than the value of Nm=1 typically used to define the threshold for neutral genetic divergence. Therefore, discrete mtDNA clusters provide strong evidence for independently evolving populations or species, but their formation is suppressed even under very low levels of dispersal.
Article
Full-text available
Uncovering cryptic biodiversity is essential for understanding evolutionary processes and patterns of ecosystem functioning, as well as for nature conservation. As European butterflies are arguably the best-studied group of invertebrates in the world, the discovery of a cryptic species, twenty years ago, within the common wood white Leptidea sinapis was a significant event, and these butterflies have become a model to study speciation. Here we show that the so-called 'sibling' Leptidea actually consist of three species. The new species can be discriminated on the basis of either DNA or karyological data. Such an unexpected discovery challenges our current knowledge on biodiversity, exemplifying how a widespread species can remain unnoticed even within an intensely studied natural model system for speciation.
Article
Full-text available
Singletons—species only known from a single specimen—and uniques—species that have only been collected once—are very common in biodiversity samples. Recent reviews suggest that in tropical arthropod samples, 30% of all species are represented by only one specimen (Bickel 1999; Novotny and Basset 2000; Coddington et al. 2009), with additional sampling helping little with eliminating rarity. Usually, such sampling only converts some of the singleton species to doubletons, with new singleton species being discovered in the process (Scharff et al. 2003; Coddington et al. 2009). Here, we first demonstrate that rare species are similarly common in specimen samples used for taxonomic research before we argue that the phenomenon of rarity has been insufficiently considered by the new quantitative techniques for species delimitation. Addressing this disconnect between theory and reality is pressing given that the last decade has seen a renewed interest in methods for species identification and delimitation (Sites and Marshall 2004; O’Meara 2010). Much of this interest has been fuelled by the availability of DNA sequences (Meier 2008). However, many newly proposed techniques implicitly or explicitly assume that all populations and species can be well sampled. But what is the value of these techniques if many species have only been collected once and/or are only known from one specimen? Here, we argue that all existing techniques need to be modified to accommodate the commonness of rarity and that all future techniques should be explicit about how rare species can be discovered and treated.
Article
Full-text available
DNA barcoding aims to accelerate species identification and discovery, but performance tests have shown marked differences in identification success. As a consequence, there remains a great need for comprehensive studies which objectively test the method in groups with a solid taxonomic framework. This study focuses on the 180 species of butterflies in Romania, accounting for about one third of the European butterfly fauna. This country includes five eco-regions, the highest of any in the European Union, and is a good representative for temperate areas. Morphology and DNA barcodes of more than 1300 specimens were carefully studied and compared. Our results indicate that 90 per cent of the species form barcode clusters allowing their reliable identification. The remaining cases involve nine closely related species pairs, some whose taxonomic status is controversial or that hybridize regularly. Interestingly, DNA barcoding was found to be the most effective identification tool, outperforming external morphology, and being slightly better than male genitalia. Romania is now the first country to have a comprehensive DNA barcode reference database for butterflies. Similar barcoding efforts based on comprehensive sampling of specific geographical regions can act as functional modules that will foster the early application of DNA barcoding while a global system is under development.
Article
Full-text available
High-throughput DNA sequencing has the potential to accelerate species discovery if it is able to recognize evolutionary entities from sequence data that are comparable to species. The general mixed Yule-coalescent (GMYC) model estimates the species boundary from DNA surveys by identifying independently evolving lineages as a transition from coalescent to speciation branching patterns on a phylogenetic tree. Applied here to 12 families from 4 orders of insects in Madagascar, we used the model to delineate 370 putative species from mitochondrial DNA sequence variation among 1614 individuals. These were compared with data from the nuclear genome and morphological identification and found to be highly congruent (98% and 94%). We developed a modified GMYC that allows for a variable transition from coalescent to speciation among lineages. This revised model increased the congruence with morphology (97%), suggesting that a variable threshold better reflects the clustering of sequence data into biological species. Local endemism was pronounced in all 5 insect groups. Most species (60-91%) and haplotypes (88-99%) were found at only 1 of the 5 study sites (40-1000 km apart). This pronounced endemism resulted in a 37% increase in species numbers using diagnostic nucleotides in a population aggregation analysis. Sample sizes between 7 and 10 individuals represented a threshold above which there was minimal increase in genetic diversity, broadly agreeing with coalescent theory and other empirical studies. Our results from > 1.4 Mb of empirical data suggest that the GMYC model captures species boundaries comparable to those from traditional methods without the need for prior hypotheses of population coherence. This provides a method of species discovery and biodiversity assessment using single-locus data from mixed or environmental samples while building a globally available taxonomic database for future identifications.
Article
Full-text available
Sample size has been one of the basic issues since the start of the DNA barcoding initiative and the global biodiversity investigation. As a contribution to resolving this problem, we propose a simple resampling approach to estimate several key sample sizes for a DNA barcoding project. We illustrate our approach using both structured populations simulated under the coalescent and real species of skipper butterflies. We found that the sample sizes widely used in DNA barcoding are insufficient to assess the genetic diversity of a species, and that population structure affects the estimation of sample sizes and may therefore bias species identification.
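A generic resampling procedure of the kind mentioned in this abstract can be sketched in Python as a haplotype accumulation curve. This is an illustration under stated assumptions, not the authors' procedure: individuals are drawn without replacement from the observed sample, genetic diversity is approximated by the count of distinct haplotypes, and the function names, the 95% target and the toy haplotype labels are all hypothetical.

```python
import random


def haplotype_accumulation(haplotypes, n_reps=200, seed=0):
    """Mean number of distinct haplotypes seen after sampling 1..N individuals.

    Each replicate shuffles the observed individuals and records how many
    distinct haplotypes have been seen after each additional individual.
    """
    rng = random.Random(seed)
    n = len(haplotypes)
    totals = [0] * n
    for _ in range(n_reps):
        order = list(haplotypes)
        rng.shuffle(order)
        seen = set()
        for i, hap in enumerate(order):
            seen.add(hap)
            totals[i] += len(seen)
    return [t / n_reps for t in totals]


def min_sample_size(curve, fraction=0.95):
    """Smallest sample size whose mean haplotype count reaches the target."""
    target = fraction * curve[-1]
    for k, value in enumerate(curve, start=1):
        if value >= target:
            return k
    return len(curve)


# Hypothetical sample: 20 individuals carrying 4 haplotypes of unequal frequency.
observed = ["h1"] * 12 + ["h2"] * 5 + ["h3"] * 2 + ["h4"]
curve = haplotype_accumulation(observed)
print("individuals needed for 95% of observed haplotypes:", min_sample_size(curve))
```

Because the curve is computed from the observed sample only, it can say nothing about haplotypes that were never collected; a curve that has not flattened by the full sample size is a sign that the sample is too small.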
Article
Full-text available
Comparative phylogeographical studies in island archipelagos can reveal lineage-specific differential responses to the geological and climatic history. We analysed patterns of genetic diversity in six codistributed lineages of darkling beetles (Tenebrionidae) in the central Aegean archipelago which differ in wing development and habitat preferences. A total of 600 specimens from 30 islands and eight adjacent mainland regions were sequenced for mitochondrial cytochrome oxidase I and nuclear Muscular protein 20. Individual gene genealogies were assessed for the presence of groups that obey an independent coalescent process using a mixed Yule coalescent model. The six focal taxa differed greatly in the number of coalescent groups and depth of lineage subdivision, which was closely mirrored by the degree of geographical structuring. The most severe subdivision at both mitochondrial DNA and nuclear DNA level was found in flightless lineages associated with presumed stable compact-soil habitats (phrygana, maquis), in contrast to sand-obligate lineages inhabiting ephemeral coastal areas that displayed greater homogeneity across the archipelago. A winged lineage, although associated with stable habitats, showed no significant phylogenetic or geographical structuring. Patterns of nucleotide diversity and local genetic differentiation, as measured using PhiST and hierarchical AMOVA, were consistent with high levels of ongoing gene flow in the winged taxon; frequent local extinction and island recolonisation for flightless sand-obligate taxa; and very low gene flow and geographical structure largely defined by the palaeogeographical history of the region in flightless compact-soil taxa. These results show that differences in dispersal rate, mediated by habitat persistence, greatly influence the levels of phylogeographical subdivision in lineages that are otherwise subjected to the same geological events and palaeoclimatic changes.
Article
Full-text available
By far the greatest challenge for diversity studies is to characterize the diversity of prokaryotes, which probably encompasses billions of species, most of which are unculturable. Recent advances in theory and analysis have focused on multi-locus approaches and on combined analysis of molecular and ecological data. However, broad environmental surveys of bacterial diversity still rely on single-locus data, notably 16S ribosomal DNA, and little other detailed information. Evolutionary methods of delimiting species from single-locus data alone need to consider population genetic and macroevolutionary theories for the expected levels of interspecific and intraspecific variation. We discuss the use of a recent evolutionary method, based on the theory of coalescence within independently evolving populations, compared with a traditional approach that uses a fixed threshold divergence to delimit species.
Article
Full-text available
Phylogenies reconstructed from contemporary taxa do not contain information about lineages that have gone extinct. We derive probability models for such phylogenies, allowing real data to be compared with specified null models of evolution, and lineage birth and death rates to be estimated.
With millions of species and their life-stage transformations, the animal kingdom provides a challenging target for taxonomy. Recent work has suggested that a DNA-based identification system, founded on the mitochondrial gene, cytochrome c oxidase subunit 1 (COI), can aid the resolution of this diversity. While past work has validated the ability of COI sequences to diagnose species in certain taxonomic groups, the present study extends these analyses across the animal kingdom. The results indicate that sequence divergences at COI regularly enable the discrimination of closely allied species in all animal phyla except the Cnidaria. This success in species diagnosis reflects both the high rates of sequence change at COI in most animal groups and constraints on intraspecific mitochondrial DNA divergence arising, at least in part, through selective sweeps mediated via interactions with the nuclear genome.
Analysis of Phylogenetics and Evolution (APE) is a package written in the R language for use in molecular evolution and phylogenetics. APE provides both utility functions for reading and writing data and manipulating phylogenetic trees, as well as several advanced methods for phylogenetic and evolutionary analysis (e.g. comparative and population genetic methods). APE takes advantage of the many R functions for statistics and graphics, and also provides a flexible framework for developing and implementing further statistical methods for the analysis of evolutionary processes. Availability: The program is free and available from the official R package archive at http://cran.r-project.org/src/contrib/PACKAGES.html#ape. APE is licensed under the GNU General Public License.
Short DNA sequences from a standardized region of the genome provide a DNA barcode for identifying species. Compiling a public library of DNA barcodes linked to named specimens could provide a new master key for identifying species, one whose power will rise with increased taxon coverage and with faster, cheaper sequencing. Recent work suggests that sequence diversity in a 648-bp region of the mitochondrial gene, cytochrome c oxidase I (COI), might serve as a DNA barcode for the identification of animal species. This study tested the effectiveness of a COI barcode in discriminating bird species, one of the largest and best-studied vertebrate groups. We determined COI barcodes for 260 species of North American birds and found that distinguishing species was generally straightforward. All species had a different COI barcode(s), and the