Preprint

Discerning structure versus speciation in phylogeographic analysis of Seepage Salamanders (Desmognathus aeneus) using demography, environment, geography, and phenotype


Abstract

Numerous mechanisms drive ecological speciation, including isolation by adaptation, barrier, distance, environment, hierarchy, and resistance. These promote genetic and phenotypic differentiation of local populations, formation of phylogeographic lineages, and ultimately, completed speciation via reinforcement. In contrast, it is possible that similar mechanisms might lead to lineage cohesion through stabilizing rather than diversifying ecomorphological selection and the long-term persistence of population structure within species. Processes that drive the formation and maintenance of geographic genetic diversity while facilitating high rates of migration and limiting phenotypic divergence may thereby result in population structure that is not accompanied by divergence towards reproductive isolation. We suggest that this framework can be applied more broadly to address the classic dilemma of “structure versus speciation” when evaluating phylogeographic diversity, unifying population genetics, species delimitation, and the underlying study of speciation. We demonstrate one such instance in the Seepage Salamander (Desmognathus aeneus) from the southeastern United States. Recent studies estimated up to 6.3% mitochondrial divergence and 4 phylogenomic lineages with broad admixture across geographic hybrid zones, which could potentially represent distinct species. However, while limited dispersal promotes substantial isolation by distance, extreme microhabitat specificity appears to yield stabilizing selection on ecologically mediated phenotypes. As a result, climatic cycles promote recurrent contact between lineages that are not adaptively differentiated and therefore experience repeated bouts of high migration and introgression through time. This leads to a unified, single species with deeply divergent phylogeographic lineages that nonetheless do not appear to represent incipient species.

Article
Background: Inference of complex demographic histories is a source of information about events that happened in the past of studied populations. Existing methods for demographic inference typically require input from the researcher in the form of a parameterized model. With an increased variety of methods and tools, each with its own interface, the model specification becomes tedious and error-prone. Moreover, optimization algorithms used to find model parameters sometimes turn out to be inefficient, for instance, by being not properly tuned or highly dependent on a user-provided initialization. The open-source software GADMA addresses these problems, providing automatic demographic inference. It proposes a common interface for several likelihood engines and provides global parameters optimization based on a genetic algorithm. Results: Here, we introduce the new GADMA2 software and provide a detailed description of the added and expanded features. It has a renovated core code base, new likelihood engines, an updated optimization algorithm, and a flexible setup for automatic model construction. We provide a full overview of GADMA2 enhancements, compare the performance of supported likelihood engines on simulated data, and demonstrate an example of GADMA2 usage on 2 empirical datasets. Conclusions: We demonstrate the better performance of a genetic algorithm in GADMA2 by comparing it to the initial version and other existing optimization approaches. Our experiments on simulated data indicate that GADMA2's likelihood engines are able to provide accurate estimations of demographic parameters even for misspecified models. We improve model parameters for 2 empirical datasets of inbred species.
Article
Contrasting environmental conditions across geographic space might promote divergent selection, making adaptation to local biotic and abiotic conditions necessary for populations to survive. In order to understand how populations adapt to different environmental conditions, studies of local adaptation have been widely used as an interface to address ecological and evolutionary questions. Here, we studied populations of Gymnodactylus amarali (Phyllodactylidae) isolated on rapidly created artificial islands. We combined a genotyping-by-sequencing (GBS) survey and redundancy analyses (RDA) to investigate genotype–environment associations (GEA), while DAPC, Fst, and Admixture analyses were used to determine genetic structure. Our hypothesis is that G. amarali populations on the islands are going through a local adaptation process and consequently becoming genetically different from the populations on the mainland. Our results indicate that geographic and environmental differences are related to genetic variation, as we detected the presence of two or three distinct genetic lineages in Serra da Mesa, Minaçu, and Colinas do Sul. Fst analysis shows moderate isolation between Serra da Mesa and Minaçu (0.082). The RDA indicated a potential local genetic signal correlated with temperature and precipitation. We identified 230 candidate loci associated with the environment, and at least two locally structured subpopulations (Serra da Mesa and Minaçu) show significant association with environmental variation.
Article
Speciation rates vary substantially across the tree of life. These rates should be linked to the rate at which population structure forms if a continuum between micro and macroevolutionary patterns exists. Previous studies have found speciation rates and the degree of population formation in clades to be either correlated or uncorrelated depending on the group, but no study has yet examined the relationship between speciation rates and population structure in a young group that is constrained spatially to a single-island system. We examine this correlation in 109 gemsnakes (Pseudoxyrhophiidae) endemic to Madagascar and originating in the early Miocene, which helps control for extinction variation across time and space. We find no relationship between rates of speciation and the formation rates of population structure over space in 33 species of gemsnakes. Rates of speciation show low variation, yet population structure varies widely across species, indicating that speciation rates and population structure are disconnected. We suspect this is largely due to the persistence of some lineages not susceptible to extinction. Importantly, we discuss how delimiting populations versus species may contribute to problems understanding the continuum between shallow and deep evolutionary processes.
Article
Genomic-scale datasets, sophisticated analytical techniques, and conceptual advances have disproportionately failed to resolve species boundaries in some groups relative to others. To understand the processes that underlie taxonomic intractability, we dissect the speciation history of an Australian lizard clade that arguably represents a "worst-case" scenario for species delimitation within vertebrates: the Ctenotus inornatus species group, a clade beset with decoupled genetic and phenotypic breaks, uncertain geographic ranges, and parallelism in purportedly diagnostic morphological characters. We sampled hundreds of localities to generate a genomic perspective on population divergence, structure, and admixture. Our results revealed rampant paraphyly of nominate taxa in the group, with lineages that are either morphologically cryptic or polytypic. Isolation-by-distance patterns reflect spatially continuous differentiation among certain pairs of putative species, yet genetic and geographic distances are decoupled in other pairs. Comparisons of mitochondrial and nuclear gene trees, tests of nuclear introgression, and historical demographic modelling identified gene flow between divergent candidate species. Levels of admixture are decoupled from phylogenetic relatedness; gene flow is often higher between sympatric species than between parapatric populations of the same species. Such idiosyncratic patterns of introgression contribute to species boundaries that are fuzzy while also varying in fuzziness. Our results suggest that "taxonomic disaster zones" like the C. inornatus species group result from spatial variation in the porosity of species boundaries and the resulting patterns of genetic and phenotypic variation. This study raises questions about the origin and persistence of hybridizing species and highlights the unique insights provided by taxa that have long eluded straightforward taxonomic categorization.
Article
The distribution of genetic diversity is often heterogeneous in space, and it usually correlates with environmental transitions or historical processes that affect demography. The coast of Chile encompasses two biogeographic provinces and spans a broad environmental gradient together with oceanographic processes linked to coastal topography that can affect species' genetic diversity. Here, we evaluated the genetic connectivity and historical demography of four Scurria limpets, S. scurra, S. variabilis, S. ceciliana and S. araucana, between ca. 19° S and 53° S in the Chilean coast using genome-wide SNPs markers. Genetic structure varied among species which was evidenced by species-specific breaks together with two shared breaks. One of the shared breaks was located at 22-25° S and was observed in S. araucana and S. variabilis, while the second break around 31-34° S was shared by three Scurria species. Interestingly, the identified genetic breaks are also shared with other low-disperser invertebrates. Demographic histories show bottlenecks in S. scurra and S. araucana populations and recent population expansion in all species. The shared genetic breaks can be linked to oceanographic features acting as soft barriers to dispersal and also to historical climate, evidencing the utility of comparing multiple and sympatric species to understand the influence of a particular seascape on genetic diversity.
Article
Genetic differentiation between and within natural populations is the result of the joint effects of neutral and adaptive processes. In addition, the spatial arrangement of the landscape promotes connectivity or creates barriers to gene flow, directly affecting speciation processes. In this study, we carried out a landscape genomics analysis using NextRAD data from a montane forest specialist bird complex, the Mesoamerican Chestnut-capped/Green-striped Brushfinch of the genus Arremon. Specifically, we examined population genomic structure using different assignment methods and genomic differentiation and diversity, and we tested alternative genetic isolation hypotheses at the individual level (e.g., isolation by barrier, IBB; isolation by environment, IBE; isolation by resistance, IBR). We found well-delimited genomic structuring (K = 5) across Mesoamerican montane forests in the studied group. Individual-level genetic distances among major montane ranges were mainly explained by IBR hypotheses in this sedentary Neotropical taxon. Our results uncover genetic distances/differentiation and patterns of gene flow in allopatric species that support the role of tropical mountains as spatial landscape drivers of biodiversity. IBR clearly supports a pattern of conserved niche-tracking of suitable habitat conditions and topographic complexity throughout glacial-interglacial dynamics.
Article
The germline mutation rate determines the pace of genome evolution and is an evolving parameter itself¹. However, little is known about what determines its evolution, as most studies of mutation rates have focused on single species with different methodologies². Here we quantify germline mutation rates across vertebrates by sequencing and comparing the high-coverage genomes of 151 parent–offspring trios from 68 species of mammals, fishes, birds and reptiles. We show that the per-generation mutation rate varies among species by a factor of 40, with mutation rates being higher for males than for females in mammals and birds, but not in reptiles and fishes. The generation time, age at maturity and species-level fecundity are the key life-history traits affecting this variation among species. Furthermore, species with higher long-term effective population sizes tend to have lower mutation rates per generation, providing support for the drift barrier hypothesis³. The exceptionally high yearly mutation rates of domesticated animals, which have been continually selected on fecundity traits including shorter generation times, further support the importance of generation time in the evolution of mutation rates. Overall, our comparative analysis of pedigree-based mutation rates provides ecological insights on the mutation rate evolution in vertebrates.
Article
Hybrid zones can be studied by modeling clines of trait variation (e.g., morphology, genetics) over a linear transect. Yet, hybrid zones can also be spatially complex, can shift over time, and can even lead to the formation of hybrid lineages with the right combination of dispersal and vicariance. We reassessed Sibley's (1950) gradient between Collared Towhee (Pipilo ocai) and Spotted Towhee (P. maculatus) in Central Mexico to test whether it conformed to a typical tension-zone cline model. By comparing historical and modern data, we found that cline centers for genetic and phenotypic traits have not shifted over the course of 70 years. This equilibrium suggests that secondary contact between these species, which originally diverged over 2 million years ago, likely dates to the Pleistocene. Given the amount of mtDNA divergence, parental ends of the cline have very low autosomal nuclear differentiation (FST = 0.12). Dramatic and coincident cline shifts in mtDNA and throat color suggest the possibility of sexual selection as a factor in differential introgression, while a contrasting cline shift in green back color hints at a role for natural selection. Supporting the idea of a continuum between clinal variation and hybrid lineage formation, the towhee gradient can be analyzed as one population under isolation-by-distance, as a two-population cline, and as three lineages experiencing divergence with gene flow. In the middle of the gradient, a hybrid lineage has become partly isolated, likely due both to forested habitat shrinking and fragmenting as it moved upslope after the last glacial maximum and a stark environmental transition. The towhee system offers a window into the potential outcomes of hybridization across a dynamic landscape including the creation of novel genomic and phenotypic combinations and incipient hybrid lineages.
Article
The Western European house mouse (Mus musculus domesticus) is a widespread human commensal that has recently been introduced to North America. Its introduction to the Americas is thought to have resulted from the transatlantic movements of Europeans that began in the early 16th century. To examine the details of this colonization history, we examine population structure, explore relevant demographic models, and infer the timing of divergence among house mouse populations in the eastern United States using published exome sequences from five North American populations and two European populations. For North American populations of house mice, levels of nucleotide variation were lower, and low-frequency alleles were less common, than for European populations. These patterns provide evidence of a mild bottleneck associated with the movement of house mice into North America. Several analyses revealed that one North American population is genetically admixed, which indicates at least two source populations from Europe were independently introduced to eastern North America. Estimated divergence times between North American and German populations ranged between ∼1,000–7,000 years ago and overlapped with the estimated divergence time between populations from Germany and France. Demographic models comparing different North American populations revealed that these populations diverged from each other mostly within the last 500 years, consistent with the timing of the arrival of Western European settlers to North America. Together, these results support a recent introduction of Western European house mice to eastern North America, highlighting the effects of human migration and colonization on the spread of an invasive human commensal.
Article
Genetic differentiation among local groups of individuals, i.e., genetic β‐diversity, is a key component of population persistence related to connectivity and isolation. However, most genetic investigations of natural populations focus on a single species, overlooking opportunities for multispecies conservation plans to benefit entire communities in an ecosystem. We present an approach to evaluate genetic β‐diversity within and among many species and demonstrate how this riverscape community genomics approach can be applied to identify common drivers of genetic structure. Our study evaluated genetic β‐diversity in 31 co‐distributed native stream fishes sampled from 75 sites across the White River Basin (Ozarks, USA) using SNP genotyping (ddRAD). Despite variance among species in the degree of genetic divergence, general spatial patterns were identified corresponding to river network architecture. Most species (N=24) were partitioned into discrete sub‐populations (K=2–7). We used partial redundancy analysis to compare species‐specific genetic β‐diversity across four models of genetic structure: Isolation by distance (IBD), isolation by barrier (IBB), isolation by stream hierarchy (IBH), and isolation by environment (IBE). A significant proportion of intraspecific genetic variation was explained by IBH (x̄=62%), with the remaining models generally redundant. We found evidence for consistent spatial modularity in that gene flow is higher within rather than between hierarchical units (i.e., catchments, watersheds, basins), supporting the generalization of the Stream Hierarchy Model. We discuss our conclusions regarding conservation and management and identify the 8‐digit Hydrologic Unit (HUC) as the most relevant spatial scale for managing genetic diversity across riverine networks.
Article
The association of molecular variants with phenotypic variation is a main issue in biology, often tackled with genome‐wide association studies (GWAS). GWAS are challenging, with increasing, but still limited, use in evolutionary biology. We used redundancy analysis (RDA) as a complementary ordination approach to single‐ and multitrait GWAS to explore the molecular basis of pigmentation variation in brown trout (Salmo trutta) belonging to wild populations impacted by hatchery fish. Based on 75,684 single nucleotide polymorphic (SNP) markers, RDA, single‐ and multitrait GWAS allowed the extraction of 337 independent colour patterning loci (CPLs) associated with trout pigmentation traits, such as the number of red and black spots on flanks. Collectively, these CPLs (i) mapped onto 35 out of 40 brown trout linkage groups indicating a polygenic genomic architecture of pigmentation, (ii) were found to be associated with 218 candidate genes, including 197 genes formerly mentioned in the literature as associated with skin pigmentation, skin patterning, differentiation or structure, notably in a close relative, the rainbow trout (Oncorhynchus mykiss), and (iii) related to functions relevant to pigmentation variation (e.g., calcium‐ and ion‐binding, cell adhesion). Annotated CPLs include genes with well‐known pigmentation effects (e.g., PMEL, SLC45A2, SOX10), but also markers associated with genes formerly found expressed in rainbow or brown trout skins. RDA was also shown to be useful to investigate management issues, especially the dynamics of trout pigmentation submitted to several generations of hatchery introgression.
Article
One of the most stunning patterns of the distribution of life on Earth is the latitudinal biodiversity gradient. In an influential article, Janzen (1967) predicted that tropical mountains are more effective migration barriers than temperate mountains of the same elevation, because annual temperature variation in the tropics is lower. A great deal of research has demonstrated that the mechanism envisioned by Janzen operates at broad latitudinal scales. However, the extent that the mechanism mediates biodiversity generally, and at smaller scales, is far less understood. We investigated whether climate overlap is associated with genetic similarity between populations within temperate regions using lizards in the Sierra Nevada mountain range of California as a study system. By comparing genetic differentiation between high- and low-elevation populations, we found that in addition to the expected strong pattern of isolation by distance, high climate overlap was negatively associated with genetic differentiation, indicating that population pairs that inhabit climatically similar environments are less genetically differentiated. Moreover, while climate overlap between high- and low-elevation sites is predicted to increase from the equator to temperate regions, we find that in adjacent mountain ranges at the same latitude in temperate regions, climate overlap values can vary widely. This study suggests that in addition to the well-studied main effect of latitude on climate overlap and population differentiation, local climate factors within bioclimatic regions can also influence genetic differentiation between populations and do so by the same general mechanism that operates at larger geographic scales.
Article
The efficacy of an allometric growth model to correct for ontogenetic body size variation has been known for decades, yet this method remains relatively obscure and rarely applied. We optimize the implementation of this method through a newly developed and easy-to-use R package GroupStruct and further extend its application from intraspecific to interspecific datasets. Using empirical examples, we show that different size correction methods (i.e., ratios, residuals, and allometry) can result in vastly different conclusions. Our results demonstrate that choosing the appropriate size correction method is crucial as it can have significant impacts on downstream analyses and has the potential to alter biological interpretations.
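As a rough illustration of the allometric correction described above, the sketch below applies a widely used form of the allometric growth equation, X_adj = log10(X) − β[log10(BL) − log10(BL_mean)], where β is the slope of log10(X) regressed on log10(BL). It is written in Python and does not reproduce the GroupStruct interface; the function name `allometric_adjust`, the pooling of all specimens into one group, and the toy measurements are assumptions made for illustration only.

```python
import numpy as np

def allometric_adjust(trait, body_length):
    """Size-correct a trait with the allometric formula
    X_adj = log10(X) - beta * (log10(BL) - log10(mean(BL))),
    where beta is the least-squares slope of log10(X) on log10(BL).
    All specimens are treated as a single group (an assumption).
    """
    trait = np.asarray(trait, dtype=float)
    body_length = np.asarray(body_length, dtype=float)
    log_x = np.log10(trait)
    log_bl = np.log10(body_length)
    # np.polyfit with degree 1 returns (slope, intercept)
    beta, _intercept = np.polyfit(log_bl, log_x, 1)
    return log_x - beta * (log_bl - np.log10(body_length.mean()))

# Hypothetical toy data: a trait that scales exactly as 0.2 * BL**0.9
body_length = np.array([30.0, 35.0, 40.0, 45.0, 50.0])
head_width = 0.2 * body_length ** 0.9
adjusted = allometric_adjust(head_width, body_length)
```

Because the toy trait depends on body length alone, every adjusted value collapses to the same number; with real measurements, the variation that remains after adjustment is what is left once the fitted size trend is removed.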
Article
Ecosystem degradation and biodiversity loss are major global challenges. When reproductive isolation between species is contingent on the interaction of intrinsic lineage traits with features of the environment, environmental change can weaken reproductive isolation and result in extinction through hybridization. By this process called speciation reversal, extinct species can leave traces in genomes of extant species through introgressive hybridization. Using historical and contemporary samples, we sequenced all four species of an Alpine whitefish radiation before and after anthropogenic lake eutrophication and the associated loss of one species through speciation reversal. Despite the extinction of this taxon, substantial fractions of its genome, including regions shaped by positive selection before eutrophication, persist within surviving species as a consequence of introgressive hybridization during eutrophication. Given the prevalence of environmental change, studying speciation reversal and its genomic consequences provides fundamental insights into evolutionary processes and informs biodiversity conservation.
Article
Dusky Salamanders (genus Desmognathus) currently comprise only 22 described, extant species. However, recent mitochondrial and nuclear estimates indicate the presence of up to 49 candidate species based on ecogeographic sampling. Previous studies also suggest a complex history of hybridization between these lineages. Studies in other groups suggest that disregarding admixture may affect both phylogenetic inference and clustering‐based species delimitation. With a dataset comprising 233 Anchored Hybrid Enrichment (AHE) loci sequenced for 896 Desmognathus specimens from all 49 candidate species, we test three hypotheses regarding (i) species‐level diversity, (ii) hybridization and admixture, and (iii) misleading phylogenetic inference. Using phylogenetic and population‐clustering analyses considering gene flow, we find support for at least 47 candidate species in the phylogenomic dataset, some of which are newly characterized here while others represent combinations of previously named lineages that are collapsed in the current dataset. Within these, we observe significant phylogeographic structure, with up to 64 total geographic genetic lineages, many of which hybridize either narrowly at contact zones or extensively across ecological gradients. We find strong support for both recent admixture between terminal lineages and ancient hybridization across internal branches. This signal appears to distort concatenated phylogenetic inference, wherein more heavily admixed terminal specimens occupy apparently artifactual early‐diverging topological positions, occasionally to the extent of forming false clades of intermediate hybrids. Additional geographic and genetic sampling and more robust computational approaches will be needed to clarify taxonomy, and to reconstruct a network topology to display evolutionary relationships in a manner that is consistent with their complex history of reticulation.
Article
Speciation rate measures how quickly a species gives rise to new species, and this rate varies up to 50-fold across vertebrate groups. In this study, we explore one hypothesis that explains this variation: Species that form geographically isolated populations more readily should also form new species more readily and thus should have higher speciation rates. This hypothesis links microevolutionary studies of speciation with macroevolutionary studies of biodiversity. We test this hypothesis using a diverse set of lizard and snake species found in South American savannahs. We find no effect of geographic population isolation on speciation rates. Our results suggest that other stages in the speciation process are more important controls on speciation rate variation.
Article
Many phylogeographic studies on species with large ranges have found genetic-geographic structure associated with changes in habitat and physical barriers preventing or reducing gene flow. These interactions with geographic space, contemporary and historical climate, and biogeographic barriers have complex effects on contemporary population genetic structure and processes of speciation. While allopatric speciation at biogeographic barriers is considered the primary mechanism for generating species, more recently it has been shown that parapatric modes of divergence may be equally or even more common. With genomic data and better modeling capabilities, we can more clearly define causes of speciation in relation to biogeography and migration between lineages, the location of hybrid zones with respect to the ecology of parental lineages, and differential introgression of genes between taxa. Here, we examine the origins of three Nearctic milksnakes (Lampropeltis elapsoides, Lampropeltis triangulum and Lampropeltis gentilis) using genome-scale data to better understand species diversification. Results from artificial neural networks show that a mix of a strong biogeographic barrier, environmental changes, and physical space has affected genetic structure in these taxa. These results underscore conspicuous environmental changes that occur as the sister taxa L. triangulum and L. gentilis diverged near the Great Plains into the forested regions of the Eastern Nearctic. This area has been recognized as a region for turnover for many vertebrate species, but as we show here the contemporary boundary does not isolate these sister species. These two species likely formed in the mid-Pleistocene and have remained partially reproductively isolated over much of this time, showing differential introgression of loci. We also demonstrate that when L. triangulum and L. gentilis are each in contact with the much older L. elapsoides, some limited gene flow has occurred. 
Given the strong agreement between nuclear and mtDNA genomes, along with estimates of ecological niche, we suggest that all three lineages should continue to be recognized as unique species. Furthermore, this work emphasizes the importance of considering complex modes of divergence and differential allelic introgression over a complex landscape when testing mechanisms of speciation. [Cline; delimitation; Eastern Nearctic; Great Plains; hybrids; introgression; speciation.].
Article
Landscape genomics identifies how spatial and environmental factors structure the amount and distribution of genetic variation among populations. Landscape genomic analyses have been applied across diverse taxonomic groups and ecological settings, and are increasingly used to analyse datasets composed of large numbers of genomic markers and multiple environmental predictors. It is in this context that multivariate methods show their strengths. Redundancy analysis (RDA) is a constrained ordination that, in a landscape genomics framework, models linear relationships among environment predictors and genomic variation, effectively identifying covarying allele frequencies associated with the multivariate environment. RDA can be used at both individual and population levels, can include covariates to account for confounding factors and can be used to directly infer genotype–environment associations on the landscape. The modelling of both multivariate response and explanatory variables allows RDA to accommodate the genomic and environmental complexity found in nature, producing a powerful and efficient tool for landscape genomics. In this review, we outline the diverse uses of RDA in landscape genomics, including variable selection, variance partitioning, genotype–environment associations, and the calculation of adaptive indices and genomic offset. To illustrate these applications, we use a published dataset for lodgepole pine that includes genomic, phenotypic and environmental data. We provide an introduction to the statistical basis of RDA, a tutorial on its use and interpretation in landscape genomics applications, discuss limitations and provide guidelines to avoid misuse. This review and associated tutorial provide a comprehensive resource to the landscape genomics community to improve understanding of RDA as a modelling framework, and encourage the appropriate use of RDA across diverse landscape genomics applications. 
RDA is truly a Swiss Army Knife for landscape genomics: a multipurpose, adaptable and versatile approach to identifying, evaluating and forecasting relationships between genetic and environmental variation.
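The description of RDA as a constrained ordination can be sketched in a few lines: regress each response column on the predictors, then run a principal component analysis on the fitted values. The block below is a minimal NumPy illustration under stated assumptions, not the tutorial from the review and not a replacement for an established implementation; the function name `rda`, the simulated genotype and environment matrices, and the choice to centre but not scale the variables are all hypothetical, and no permutation test of significance is included.

```python
import numpy as np

def rda(response, predictors):
    """Minimal redundancy analysis: multivariate least-squares regression of
    the centred response on the centred predictors, followed by a PCA (via
    SVD) of the fitted values. Returns sample scores, response loadings, and
    the proportion of response variance explained by the predictors."""
    Y = np.asarray(response, dtype=float)
    X = np.asarray(predictors, dtype=float)
    Y = Y - Y.mean(axis=0)
    X = X - X.mean(axis=0)
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    fitted = X @ coef
    U, s, Vt = np.linalg.svd(fitted, full_matrices=False)
    # Number of constrained axes is limited by predictor rank and response count
    n_axes = min(np.linalg.matrix_rank(X), Y.shape[1])
    sample_scores = U[:, :n_axes] * s[:n_axes]
    loadings = Vt[:n_axes].T
    constrained = (fitted ** 2).sum() / (Y ** 2).sum()
    return sample_scores, loadings, constrained

# Simulated data (hypothetical): 50 samples, 2 environmental variables, 20 loci
rng = np.random.default_rng(0)
env = rng.normal(size=(50, 2))
effects = rng.normal(size=(2, 20))
geno = env @ effects + rng.normal(scale=0.5, size=(50, 20))
scores, loadings, prop = rda(geno, env)
```

In a genotype–environment association setting, loci with unusually large loadings on the constrained axes would be the candidates for further inspection; conditioning on covariates (partial RDA) would require regressing them out of both matrices first, which this sketch omits.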
Article
Although body size correction and inferential statistics have been used in morphological studies for many decades, their applications are far from being ubiquitous. We performed a meta-analysis to quantify the extent of taxonomic papers that performed body size correction and implemented a statistical hypothesis testing framework during the analysis of morphological data. Our results indicate that in most papers, neither of these analyses were performed but instead, cursory comparisons of descriptive statistics were presented. With the development of numerous freely available and powerful statistical programs such as R, we find it prudent to outline a standardized and statistically defensible framework to enhance the workflow of morphological analyses in taxonomic studies. This 5-step approach can be applied to meristic and mensural data across a wide range of taxonomic groups. We include an easy-to-use companion R script to facilitate the implementation of this workflow. Our proposed framework is not rooted in phylogenetic or evolutionary theory and hence, should not be used in place of explicit species delimitation techniques. Nevertheless, it can be incorporated into a more robust integrative taxonomic framework and is particularly useful for identifying diagnostic characters for species diagnoses.
Article
Reproductive isolation is instrumental to the formation of new species (speciation), but it remains largely enigmatic how many incompatibilities are required to prevent hybridization and where they lie across the genome. By studying patterns of admixture in amphibian hybrid zones, we found that reproductive isolation is initiated by numerous small-effect incompatibilities scattered across the genome rather than concentrated in a few important genes. Unlike mammals and birds, in which Y/W degeneracy is a major cause of hybrid dysfunctions, the undifferentiated sex chromosomes of amphibians do not always host more genetic incompatibilities than other chromosomes. These combined results might explain why amphibian speciation is relatively slow, and its clock-like dynamics offer practical perspectives to categorize evolutionary lineages into species or subspecies.
Article
Evidence is accumulating that gene flow commonly occurs between recently-diverged species, despite the existence of barriers to gene flow in their genomes. However, we still know little about what regions of the genome become barriers to gene flow and how such barriers form. Here we compare genetic differentiation across the genomes of bumblebee species living in sympatry and allopatry to reveal the potential impact of gene flow during species divergence and uncover genetic barrier loci. We first compared the genomes of the alpine bumblebee Bombus sylvicola and a previously unidentified sister species living in sympatry in the Rocky Mountains, revealing prominent islands of elevated genetic divergence in the genome that co-localize with centromeres and regions of low recombination. This same pattern is observed between the genomes of another pair of closely-related species living in allopatry (B. bifarius and B. vancouverensis). Strikingly however, the genomic islands exhibit significantly elevated absolute divergence (dXY) in the sympatric, but not the allopatric, comparison indicating that they contain loci that have acted as barriers to historical gene flow in sympatry. Our results suggest that intrinsic barriers to gene flow between species may often accumulate in regions of low recombination and near centromeres through processes such as genetic hitchhiking, and that divergence in these regions is accentuated in the presence of gene flow.
Article
A primary roadblock to our understanding of speciation is that it usually occurs over a timeframe that is too long to study from start to finish. The idea of a speciation continuum provides something of a solution to this problem; rather than observing the entire process, we can simply reconstruct it from the multitude of speciation events that surround us. But what do we really mean when we talk about the speciation continuum, and can it really help us understand speciation? We explored these questions using a literature review and online survey of speciation researchers. Although most researchers were familiar with the concept and thought it was useful, our survey revealed extensive disagreement about what the speciation continuum actually tells us. This is due partly to the lack of a clear definition. Here, we provide an explicit definition that is compatible with the Biological Species Concept. That is, the speciation continuum is a continuum of reproductive isolation. After outlining the logic of the definition in light of alternatives, we explain why attempts to reconstruct the speciation process from present‐day populations will ultimately fail. We then outline how we think the speciation continuum concept can continue to act as a foundation for understanding the continuum of reproductive isolation that surrounds us.
Article
Testing among competing demographic models of divergence has become an important component of evolutionary research in model and non-model organisms. However, the effect of unaccounted-for demographic events on model choice and parameter estimation remains largely unexplored. Using extensive simulations, we demonstrate that under realistic divergence scenarios, failure to account for population size (Ne) changes in daughter and ancestral populations leads to strong biases in divergence time estimates as well as model choice. We illustrate these issues by reconstructing the recent demographic history of North Sea and Baltic Sea turbots (Scophthalmus maximus), testing 16 Isolation with Migration (IM) and 16 Secondary Contact (SC) scenarios and modelling changes in Ne as well as the effects of linked selection and barrier loci. Failure to account for changes in Ne resulted in selecting SC models with long periods of isolation and divergence times preceding the formation of the Baltic Sea. In contrast, models accounting for Ne changes suggest recent (<6 kya) divergence with constant gene flow. We further show how interpreting genomic landscapes of differentiation can help discern among competing models. For example, in the turbot data, islands of differentiation show signatures of recent selective sweeps, rather than old divergence resisting secondary introgression. The results have broad implications for the study of population divergence by highlighting the potential effects of unmodeled changes in Ne on demographic inference. Tested models should aim at representing realistic divergence scenarios for the target taxa, and extreme caution should always be exercised when interpreting results of demographic modelling.
Article
Balancing model complexity is a key challenge of modern computational ecology, particularly so since the spread of machine learning algorithms. Species distribution models are often implemented using a wide variety of machine learning algorithms that can be fine‐tuned to achieve the best model prediction while avoiding overfitting. We have released SDMtune, a new R package that aims to facilitate training, tuning, and evaluation of species distribution models in a unified framework. The main innovations of this package are its functions to perform data‐driven variable selection, and a novel genetic algorithm to tune model hyperparameters. Real‐time and interactive charts are displayed during the execution of several functions to help users understand the effect of removing a variable or varying model hyperparameters on model performance. SDMtune supports three different metrics to evaluate model performance: the area under the receiver operating characteristic curve, the true skill statistic, and Akaike's information criterion corrected for small sample sizes. It implements four statistical methods: artificial neural networks, boosted regression trees, maximum entropy modeling, and random forest. Moreover, it includes functions to display the outputs and create a final report. SDMtune therefore represents a new, unified and user‐friendly framework for the still‐growing field of species distribution modeling.
Article
Aim: Effective conservation policies rely on information about population genetic structure and the connectivity of remnants of suitable habitats. The interaction between natural and anthropogenic discontinuities across landscapes can uncover the relative contributions of different barriers to gene flow, with direct consequences for decision-making in conservation. We aimed to quantify the relative roles of land cover and topographic variables on the population genetic differentiation and diversity of a stream-breeding savanna tree frog (Bokermannohyla ibitiguara) across its range. Location: Serra da Canastra mountain range, Cerrado of Minas Gerais State, Brazil. Methods: We collected samples and extracted DNA samples from 12 populations within and outside a strictly protected park, and used 17 microsatellite markers to assess genetic structure, among-population differentiation and within-population diversity measures. We incorporated landscape data derived from digital models and satellite images to create connectivity matrices to correlate with genetic differentiation using Mantel tests. We used generalized linear models and path analyses to assess the roles of each landscape variable in shaping genetic diversity in this species. Results: Populations within and outside the park boundaries belonged to four genetic clusters. Most populations showed evidence of limited gene flow, with significant genetic differentiation, except for those within the park, which also had higher levels of allelic richness and heterozygosity. However, genetic differentiation among populations in this landscape was primarily explained by topographic complexity. Likewise, within-population measures of genetic diversity were best explained by models including elevation and topographic complexity, and not the amount of natural habitat or gallery forests. 
Main conclusions: Our results underscore that topography may be a strong historical factor shaping genetic structure among amphibian populations. Therefore, effective conservation strategies for endangered amphibians should avoid focusing exclusively on habitat suitability, and incorporate topographic complexity, which seems to be a key factor for the fauna of the extremely threatened Brazilian savanna.
Article
An understanding of genetic structure is essential for answering many questions in population genetics. However, complex population dynamics and scale-dependent processes can make it difficult to detect if there are distinct genetic clusters present in natural populations. Inferring discrete population structure is particularly challenging in the presence of continuous genetic variation such as isolation by distance. Here, we use the plant species Mimulus guttatus as a case study for understanding genetic structure at three spatial scales. We use reduced-representation sequencing and marker-based genotyping to understand dispersal dynamics and to characterise genetic structure. Our results provide insight into the spatial scale of genetic structure in a widespread plant species, and demonstrate how dispersal affects spatial genetic variation at the local, regional, and range-wide scale. At a fine-spatial scale, we show dispersal is rampant with little evidence of spatial genetic structure within populations. At a regional-scale, we show continuous differentiation driven by isolation by distance over hundreds of kilometres, with broad geographic genetic clusters that span major barriers to dispersal. Across Western North America, we observe geographic genetic structure and the genetic signature of multiple postglacial recolonisation events, with historical gene flow linking isolated populations. Our genetic analyses show M. guttatus is highly dispersive and maintains large metapopulations with high intrapopulation variation. This high diversity and dispersal confounds the inference of genetic structure, with multi-level sampling and spatially-explicit analyses required to understand population history.
Article
Whatever one's definition of species, it is generally expected that individuals of the same species should be genetically more similar to each other than they are to individuals of another species. Here we show that in the presence of cross-species gene flow, this expectation may be incorrect. We use the multispecies coalescent model with continuous-time migration or episodic introgression to study the impact of gene flow on genetic differences within and between species and highlight a surprising but plausible scenario in which different population sizes and asymmetrical migration rates cause a genetic sequence to be on average more closely related to a sequence from another species than to a sequence from the same species. Our results highlight the extraordinary impact that even a small amount of gene flow may have on the genetic history of the species. We suggest that contrasting long-term migration rate and short-term hybridization rate, both of which can be estimated using genetic data, may be a powerful approach to detecting the presence of reproductive barriers and to define species boundaries.
Article
Species‐level diversity and the underlying mechanisms that lead to the formation of new species, that is, speciation, have often been confounded with intraspecific diversity and population subdivision. The delineation between intraspecific and interspecific divergence processes has received much less attention than species delimitation. The ramifications of confounding speciation and population subdivision are that the term speciation has been used to describe many different biological divergence processes, rendering the results, or inferences, between studies incomparable. Phylogeographic studies have advanced our understanding of how spatial variation in the pattern of biodiversity can begin, become structured, and persist through time. Studies of species delimitation have further provided statistical and model‐based approaches to determine the phylogeographic entities that merit species status. However, without a proper understanding and delineation between the processes that generate and maintain intraspecific and interspecific diversity in a study system, the delimitation of species may still not be biologically and evolutionarily relevant. I argue that variation in the continuity of the divergence process among biological systems could be a key factor leading to the enduring contention in delineating divergence patterns, or species delimitation, meriting future comparative studies to help us better understand the nature of biological species.
Article
Despite the importance of climate‐adjusted provenancing to mitigate the effects of environmental change, climatic considerations alone are insufficient when restoring highly degraded sites. Here we propose a comprehensive landscape genomic approach to assist the restoration of moderately disturbed and highly degraded sites. To illustrate it we employ genomic datasets comprising thousands of single nucleotide polymorphisms from two plant species suitable for the restoration of iron‐rich Amazonian Savannas. We first use a subset of neutral loci to assess genetic structure and determine the genetic neighborhood size. We then identify genotype‐phenotype‐environment associations, map adaptive genetic variation, and predict adaptive genotypes for restoration sites. Whereas local provenances were found optimal to restore a moderately disturbed site, a mixture of genotypes seemed the most promising strategy to recover a highly degraded mining site. We discuss how our results can help define site‐adjusted provenancing strategies, and argue that our methods can be more broadly applied to assist other restoration initiatives.
Article
Background: The demographic history of any population is imprinted in the genomes of the individuals that make up the population. One of the most popular and convenient representations of genetic information is the allele frequency spectrum (AFS), the distribution of allele frequencies in populations. The joint AFS is commonly used to reconstruct the demographic history of multiple populations, and several methods based on diffusion approximation (e.g., ∂a∂i) and ordinary differential equations (e.g., moments) have been developed and applied for demographic inference. These methods provide an opportunity to simulate AFS under a variety of researcher-specified demographic models and to estimate the best model and associated parameters using likelihood-based local optimizations. However, there are no known algorithms to perform global searches of demographic models with a given AFS. Results: Here, we introduce a new method that implements a global search using a genetic algorithm for the automatic and unsupervised inference of demographic history from joint AFS data. Our method is implemented in the software GADMA (Genetic Algorithm for Demographic Model Analysis, https://github.com/ctlab/GADMA). Conclusions: We demonstrate the performance of GADMA by applying it to sequence data from humans and non-model organisms and show that it is able to automatically infer a demographic model close to or even better than the one that was previously obtained manually. Moreover, GADMA is able to infer multiple demographic models at different local optima close to the global one, providing a larger set of possible scenarios to further explore demographic history.
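To make the allele frequency spectrum concrete, the sketch below counts how many sites carry each possible number of derived alleles, first for one population and then jointly for two. It is a minimal illustration only and does not use or reproduce the GADMA, ∂a∂i, or moments interfaces; the function names `allele_frequency_spectrum` and `joint_afs`, the assumption of diploid genotypes coded 0/1/2, and the toy matrices are all hypothetical.

```python
import numpy as np

def allele_frequency_spectrum(genotypes):
    """Unfolded AFS for one population (hypothetical helper).

    genotypes: (n_individuals, n_sites) array of derived-allele counts,
    assuming diploids coded 0, 1 or 2 with no missing data.
    Returns an array where entry i is the number of sites with i derived alleles.
    """
    g = np.asarray(genotypes, dtype=int)
    n_chromosomes = 2 * g.shape[0]
    derived_per_site = g.sum(axis=0)
    return np.bincount(derived_per_site, minlength=n_chromosomes + 1)

def joint_afs(genotypes_a, genotypes_b):
    """Joint AFS for two populations genotyped at the same sites.

    Entry [i, j] is the number of sites with i derived alleles in
    population A and j derived alleles in population B.
    """
    a = np.asarray(genotypes_a, dtype=int)
    b = np.asarray(genotypes_b, dtype=int)
    spectrum = np.zeros((2 * a.shape[0] + 1, 2 * b.shape[0] + 1), dtype=int)
    # np.add.at accumulates correctly when several sites share the same cell.
    np.add.at(spectrum, (a.sum(axis=0), b.sum(axis=0)), 1)
    return spectrum

# Toy data: 3 individuals in population A, 2 in population B, 5 shared sites.
pop_a = np.array([[0, 1, 2, 0, 1],
                  [0, 1, 2, 1, 0],
                  [0, 0, 2, 0, 0]])
pop_b = np.array([[0, 2, 2, 0, 1],
                  [1, 2, 2, 0, 0]])
print(allele_frequency_spectrum(pop_a))
print(joint_afs(pop_a, pop_b).shape)
```

Demographic inference methods of the kind named in the abstract compare an observed spectrum like this one with the spectrum expected under a candidate model; that likelihood step is outside the scope of this sketch.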
Article
Numerous mechanisms can drive speciation, including isolation by adaptation, distance, and environment. These forces can promote genetic and phenotypic differentiation of local populations, the formation of phylogeographic lineages, and ultimately, completed speciation. However, conceptually similar mechanisms may also result in stabilizing rather than diversifying selection, leading to lineage integration and the long‐term persistence of population structure within genetically cohesive species. Processes that drive the formation and maintenance of geographic genetic diversity while facilitating high rates of migration and limiting phenotypic differentiation may thereby result in population genetic structure that is not accompanied by reproductive isolation. We suggest that this framework can be applied more broadly to address the classic dilemma of “structure” versus “species” when evaluating phylogeographic diversity, unifying population genetics, species delimitation, and the underlying study of speciation. We demonstrate one such instance in the Seepage Salamander (Desmognathus aeneus) from the southeastern United States. Recent studies estimated up to 6.3% mitochondrial divergence and four phylogenomic lineages with broad admixture across geographic hybrid zones, which could potentially represent distinct species supported by our species‐delimitation analyses. However, while limited dispersal promotes substantial isolation by distance, microhabitat specificity appears to yield stabilizing selection on a single, uniform, ecologically mediated phenotype. As a result, climatic cycles promote recurrent contact between lineages and repeated instances of high migration through time. Subsequent hybridization is apparently not counteracted by adaptive differentiation limiting introgression, leaving a single unified species with deeply divergent phylogeographic lineages that nonetheless do not appear to represent incipient species.
Article
Spotted and Northern Dusky Salamanders (Desmognathus conanti and D. fuscus) have a long and complex taxonomic history. At least 10 other currently recognized species in the genus were either described from populations previously considered D. fuscus, described as or later considered subspecies thereof, or later considered synonyms thereof, before ultimately being recognized as distinct. Recent molecular analyses have also revealed extensive cryptic diversity within both species, which are polyphyletic assemblages of 13 distinct mitochondrial lineages with 5.7–10.3% uncorrected ‘p’ distances in the COI barcode locus. Based on phylogenomic data and population-clustering analyses considering admixture between lineages, 11 candidate species were circumscribed by recent authors. Those within D. conanti are also ecomorphologically variable, comprising both large, robust, keel-tailed populations, and small, gracile, round-tailed forms. Evaluating their distinctiveness based on genetic, geographic, and morphological evidence, we conclude that six of the candidates represent new species: Desmognathus anicetus sp. nov., D. bairdi sp. nov., D. campi sp. nov., D. catahoula sp. nov., D. lycos sp. nov., and D. tilleyi sp. nov. Consequently, we recognize eight total species from populations formerly associated with the nominal species D. conanti and D. fuscus, the re-delimited concepts of which also contain additional phylogeographic lineage diversity that may represent further distinct species. In addition to existing mitochondrial and nuclear phylogenetic, network, and clustering results, we present preliminary analyses of linear morphometrics to bolster diagnostic specificity based on phenotypic characteristics. These changes stabilize the previously paraphyletic taxonomy of species-level lineages within Desmognathus, though additional cryptic diversity may exist both within the species considered here, and elsewhere in the genus.
Article
Species delimitation in the genomic era has focused predominantly on the application of multiple analytical methodologies to a single massive parallel sequencing (MPS) data set, rather than leveraging the unique but complementary insights provided by different classes of MPS data. In this study we demonstrate how the use of two independent MPS data sets, a sequence capture data set and a single nucleotide polymorphism (SNP) data set generated via genotyping-by-sequencing, enables the resolution of species in three complexes belonging to the grass genus Ehrharta, whose strong population structure and subtle morphological variation limit the effectiveness of traditional species delimitation approaches. Sequence capture data are used to construct a comprehensive phylogenetic tree of Ehrharta and to resolve population relationships within the focal clades, while SNP data are used to detect patterns of gene pool sharing across populations, using a novel approach that visualizes multiple values of K. Given that the two genomic data sets are independent, the strong congruence in the clusters they resolve provides powerful ratification of species boundaries in all three complexes studied. Our approach is also able to resolve a number of single-population species and a probable hybrid species, both of which would be difficult to detect and characterize using a single MPS data set. Overall, the data reveal the existence of 11 and five species in the E. setacea and E. rehmannii complexes, with the E. ramosa complex requiring further sampling before species limits are finalized. Despite phenotypic differentiation being generally subtle, true crypsis is limited to just a few species pairs and triplets. We conclude that, in the absence of strong morphological differentiation, the use of multiple, independent genomic data sets is necessary in order to provide the cross-data set corroboration that is foundational to an integrative taxonomic approach.
Article
Accelerating climate change and habitat loss make it imperative that plans to conserve biodiversity consider species' ability to adapt to changing environments. However, in biomes where biodiversity is highest, the evolutionary mechanisms responsible for generating adaptive variation and, ultimately, new species are frequently poorly understood. African rainforests represent one such biome, as decadal debates continue concerning the mechanisms generating African rainforest biodiversity. These debates hinge on the relative importance of geographic isolation versus divergent natural selection across environmental gradients. Hindering progress is a lack of robust tests of these competing hypotheses. Because African rainforests are severely at‐risk due to climate change and other anthropogenic activities, addressing this long‐standing debate is critical for making informed conservation decisions. We use demographic inference and allele frequency‐environment relationships to investigate mechanisms of diversification in an African rainforest skink, Trachylepis affinis, a species inhabiting the gradient between rainforest and rainforest‐savanna mosaic (ecotone). We provide compelling evidence of ecotone speciation, in which gene flow has all but ceased between rainforest and ecotone populations, at a level consistent with infrequent hybridization between sister species. Parallel patterns of genomic, morphological, and physiological divergence across this environmental gradient and pronounced allele frequency‐environment correlation indicate speciation is most likely driven by ecological divergence, supporting a central role for divergent natural selection. Our results provide strong evidence for the importance of ecological gradients in African rainforest speciation and inform conservation strategies that preserve the processes that produce and maintain biodiversity.
Article
The equal fitness paradigm (EFP) is a life-history model in which the currency of fitness is usable energy rather than individuals, and the principal trade-off is between survival, evaluated as generation time, and productivity, evaluated as growth and reproductive rates. In the current study I examined variation in generation time, age at first reproduction, productivity, and mortality in salamanders of the genus Desmognathus within the framework of the EFP. Desmognathus salamanders are restricted to eastern North America, with a center of distribution in the southern Appalachian Mountains. The data sources of the present report are published studies of life histories and demographics of five species of Desmognathus that include the smallest and largest members of the genus. The analysis showed that Desmognathus salamanders have greater ages at first reproduction, lengthier generation times, lower productivities, and lower mortality rates than are predicted by the scaling functions of the EFP for vertebrates of equivalent sizes. The differences among species in these parameters are correlated with variation in adult body size and the association between body size and habitat utilization in the genus, wherein the largest species are aquatic in mountain streams and the smallest are terrestrial in mesic forests. Streamside species of intermediate size exploit a broader range of habitats and are more widely distributed than the stream- and forest-dwelling forms. It is likely that the streamside mode of life in Desmognathus represents an adaptation promoting dispersal. Adaptive radiation in the genus is expressed in extreme life-history and body-size diversification mediated through variation in age at first reproduction and generation time.
Article
Significant advances have been made in species delimitation and numerous methods can test precisely defined models of speciation, though the synthesis of phylogeography and taxonomy is still sometimes incomplete. Emerging consensus treats distinct genealogical clusters in genome-scale data as strong initial evidence of speciation in most cases; a hypothesis that must therefore be falsified under an explicit evolutionary model. We can now test speciation hypotheses linking trait differentiation to specific mechanisms of divergence with increasingly large datasets. Integrative taxonomy can therefore reflect an understanding of how each axis of variation relates to underlying speciation processes, with nomenclature for distinct evolutionary lineages. We illustrate this approach here with Seal Salamanders (Desmognathus monticola) and introduce a new unsupervised machine-learning approach for species delimitation. Plethodontid salamanders are renowned for their morphological conservatism despite extensive phylogeographic divergence. We discover two geographic genetic clusters, for which demographic and spatial models of ecology and gene flow provide robust support for ecogeographic speciation despite limited phenotypic divergence. These data are integrated under evolutionary mechanisms (e.g., spatially localized gene flow with reduced migration) and reflected in emergent properties expected under models of reinforcement (e.g., ethological isolation and selection against hybrids). Their genetic divergence is prima facie evidence for species-level distinctiveness, supported by speciation models and divergence along axes such as behavior, geography, and climate that suggest an ecological basis with subsequent reinforcement through prezygotic isolation. 
As datasets grow more comprehensive, species delimitation models can be tested, rejected, or corroborated as explicit speciation hypotheses, providing for reciprocal illumination of evolutionary processes and integrative taxonomies.
Article
Aim: Generalized dissimilarity modelling (GDM) is a powerful and unique method for characterizing and predicting beta diversity, the change in biodiversity over space, time and environmental gradients. The number of studies applying GDM is expanding, with increasing recognition of its value in improving our understanding of the drivers of biodiversity patterns and in implementing a wide variety of spatial assessments relevant to biodiversity conservation. However, apart from the original presentation of the GDM technique, there has been little guidance available to users on applying GDM to different situations or on the key modelling decisions required. Innovation: We present an accessible working guide to GDM. We describe the context for the development of GDM, present a simple statistical explanation of how model fitting works, and step through key considerations involved in data preparation, model fitting, refinement and assessment. We then describe how several novel spatial biodiversity analyses can be implemented using GDM, with code to support broader implementation. We conclude by providing an overview of the range of GDM‐based analyses that have been undertaken to date and identify priority areas for future research and development. Main conclusions: Our vision is that this working guide will facilitate greater and more rigorous use of GDM as a powerful tool for undertaking biodiversity analyses and assessments.
Article
Delimiting species is a challenge, especially in scenarios of diversification with gene flow and when species are now allopatric, so that reproductive isolation cannot be directly tested. Continental burrowing crayfishes of the genus Parastacus present a disjunct distribution in southern South America. One of the species is P. nicoleti, which lives in underground waters in swampy and wooded areas of southern Chile. A previous assessment based on mitochondrial DNA sequences suggests that the taxon may represent a species complex. Here, using thousands of nuclear genomic single-nucleotide polymorphisms obtained via RADSeq from 81 specimens collected at 27 localities throughout the distributional range of the species, we apply an integrative species delimitation approach to test species boundaries and to investigate some aspects of the speciation process. Our analyses corroborate previous results; the scenario that we favor suggests that P. nicoleti encompasses seven distinct species. Additionally, demographic analyses show that the distinct species have followed distinct trajectories in size change during the last 17.5 million years and that speciation in this group occurred both in strict isolation and in the presence of gene flow.
Article
Conservation benefits from incorporating genomics to explore the impacts of population declines, inbreeding, loss of genetic variation and hybridization. Here we use the near-extinct Mariana Islands reedwarbler radiation to showcase how ancient DNA approaches can allow insights into the population dynamics of extinct species and threatened populations for which historical museum specimens or material with low DNA yield (e.g., scats, feathers) are the only sources for DNA. Despite their having paraphyletic mtDNA, nuclear SNPs support the distinctiveness of critically endangered Acrocephalus hiwae and the other three species in the radiation that went extinct between the 1960s and 1990s. Two extinct species, A. yamashinae and A. luscinius, were deeply divergent from each other and from a third less differentiated lineage containing A. hiwae and extinct A. nijoi. Both mtDNA and SNPs suggest that the two isolated populations of A. hiwae from Saipan and Alamagan Islands are sufficiently distinct to warrant subspecies recognition and separate conservation management. We detected no significant differences in genetic diversity or inbreeding between Saipan and Alamagan, nor strong signatures of geographic structuring within either island. However, the implications of possible signatures of inbreeding in both Saipan and Alamagan, and long-term population declines in A. hiwae that predate modern anthropogenic threats, require further study with denser population sampling. Our study highlights the value conservation genomics studies of island radiations have as windows onto the possible future for the world's biota as climate change and habitat destruction increasingly fragment their ranges and contribute to rapid declines in population abundances.
Article
Phylogeographic studies base inferences on large data sets and complex demographic models, but these models are applied in ways that could mislead researchers and compromise their inference. Researchers face three challenges associated with the use of models: (i) ‘model selection’, or the identification of an appropriate model for analysis; (ii) ‘evaluation of analytical results’, or the interpretation of the biological significance of the resulting parameter estimates, delimitations, and topologies; and (iii) ‘model evaluation’, or the use of statistical approaches to assess the fit of the model to the data. The field collectively invests most of its energy in point (ii) without considering the other points; we argue that attention to points (i) and (iii) is essential to phylogeographic inference.
Article
A period of isolation in allopatry typically precedes local adaptation and subsequent divergence among lineages. Alternatively, locally adapted phenotypes may arise and persist in the face of gene flow, resulting in strong correlations between ecologically-relevant phenotypic variation and corresponding environmental gradients. Quantifying genetic, ecological, and phenotypic divergence in such lineages can provide insights into the abiotic and biotic mechanisms that structure populations and drive the accumulation of phenotypic and taxonomic diversity. Low-vagility organisms whose distributions span ephemeral geographic barriers present the ideal evolutionary context within which to address these questions. Here, we combine genetic (mtDNA and genome-wide SNPs) and phenotypic data to investigate the divergence history of caecilians (Amphibia: Gymnophiona) endemic to the oceanic island of São Tomé in the Gulf of Guinea archipelago. Consistent with a previous mtDNA study, we find two phenotypically and genetically distinct lineages that occur along a north-to-south axis with extensive admixture in the centre of the island. Demographic modelling supports divergence in allopatry (~300 kya) followed by secondary contact (~95 kya). Consequently, in contrast to a morphological study that interpreted latitudinal phenotypic variation in these caecilians as a cline within a single widespread species, our analyses suggest a history of allopatric lineage divergence and subsequent hybridization that may have blurred species boundaries. We propose that late Pleistocene volcanic activity favoured allopatric divergence between these lineages with local adaptation to climate maintaining a stable hybrid zone in the centre of São Tomé Island. Our study joins a growing number of systems demonstrating lineage divergence on volcanic islands with stark environmental transitions across small geographic distances.
Article
Discontent about changes in species classifications has grown in recent years. Many of these changes are seen as arbitrary, stemming from unjustified conceptual and methodological grounds, or leading to species that are less distinct than those recognised in the past. We argue that current trends in species classification are the result of a paradigm shift toward which systematics and population genetics have converged and that regards species as the phylogenetic lineages that form the branches of the Tree of Life. Species delimitation now consists of determining which populations belong to which individual phylogenetic lineage. This requires inferences on the process of lineage splitting and divergence, a process to which we have only partial access through incidental evidence and assumptions that are themselves subject to refutation. This approach is not free of problems, as horizontal gene transfer, introgression, hybridisation, incorrect assumptions, sampling and methodological biases can mislead inferences of phylogenetic lineages. Increasing precision is demanded through the identification of both sister relationships and processes blurring or mimicking phylogeny, which has triggered, on the one hand, the development of methods that explicitly address such processes and, on the other hand, an increase in geographical and character data sampling necessary to infer/test such processes. Although our resolving power has increased, our knowledge of sister relationships – what we designate as species resolution – remains poor for many taxa and areas, which biases species limits and perceptions about how divergent species are or ought to be. 
We attribute to this conceptual shift the demise of trinominal nomenclature we are witnessing with the rise of subspecies to species or their rejection altogether; subspecies are raised to species if they are found to correspond to phylogenetic lineages, while they are rejected as fabricated taxa if they reflect arbitrary partitions of continuous or non‐hereditary variation. Conservation strategies, if based on taxa, should emphasise species and reduce the use of subspecies to avoid preserving arbitrary partitions of continuous variation; local variation is best preserved by focusing on biological processes generating ecosystem resilience and diversity rather than by formally naming diagnosable units of any kind. Since many binomials still designate complexes of species rather than individual species, many species have been discovered but not named, geographical sampling is sparse, gene lineages have been mistaken for species, plenty of species limits remain untested, and many groups and areas lack adequate species resolution, we cannot avoid frequent changes to classifications as we address these problems. Changes will not only affect neglected taxa or areas, but also popular ones and regions where taxonomic research remained dormant for decades and old classifications were taken for granted.
Article
Inferring the history of divergence between species in a framework that permits the presence of gene flow has been crucial for characterizing the “gray zone” of speciation, which is the period of time where lineages have diverged but have not yet achieved strict reproductive isolation. However, estimates of both divergence times and rates of gene flow often ignore spatial information, for example when considering the location and width of hybrid zones with respect to changes in environment between lineages. To connect phylogeographic estimates of lineage structure, migration, historical demography, and timing of divergence with hybrid zone dynamics, we used population genomic data from the North American ratsnake complex (Pantherophis obsoletus). We examined the spatial context of diversification by linking migration and timing of divergence to the location and widths of hybrid zones. Artificial neural network approaches were applied to understand how landscape features and past climate have influenced population genetic structure among these lineages. We found that rates of migration between lineages were associated with the overall width of hybrid zones. Timing of divergence was not related to migration rate or hybrid zone width across species pairs but may be related to the number of alleles weakly introgressing through hybrid zones. This research underscores how incomplete reproductive isolation can be better understood by considering differential allelic introgression and the effects of historical and contemporary landscape features on the formation of lineages as well as overall genomic estimates of migration rates through time.
Article
The study of recently diverged lineages whose geographical ranges come into contact can provide insight into the early stages of speciation and the potential roles of reproductive isolation in generating and maintaining species. Such insight can also be important for understanding the strategies and challenges for delimiting species within recently diverged species complexes. Here, we use mitochondrial and nuclear genetic data to study population structure, gene flow and demographic history across a geographically widespread rattlesnake clade, the western rattlesnake species complex (Crotalus cerberus, Crotalus viridis, Crotalus oreganus and relatives), which contains multiple lineages with ranges that overlap geographically or contact one another. We find evidence that the evolutionary history of this group does not conform to a bifurcating tree model and that pervasive gene flow has broadly influenced patterns of present-day genetic diversity. Our results suggest that lineage diversity has been shaped largely by drift and divergent selection in isolation, followed by secondary contact, in which reproductive isolating mechanisms appear weak and insufficient to prevent introgression, even between anciently diverged lineages. The complexity of divergence and secondary contact with gene flow among lineages also provides new context for why delimiting species within this complex has been difficult and contentious historically.
Article
Real geography is continuous, but standard models in population genetics are based on discrete, well-mixed populations. As a result, many methods of analyzing genetic data assume that samples are a random draw from a well-mixed population, but are applied to clustered samples from populations that are structured clinally over space. Here we use simulations of populations living in continuous geography to study the impacts of dispersal and sampling strategy on population genetic summary statistics, demographic inference, and genome-wide association studies. We find that most common summary statistics have distributions that differ substantially from those seen in well-mixed populations, especially when Wright's neighborhood size is less than 100 and sampling is spatially clustered. Stepping-stone models reproduce some of these effects, but discretizing the landscape introduces artifacts which in some cases are exacerbated at higher resolutions. The combination of low dispersal and clustered sampling causes demographic inference from the site frequency spectrum to infer more turbulent demographic histories, but averaged results across multiple simulations revealed surprisingly little systematic bias. We also show that the combination of spatially autocorrelated environments and limited dispersal causes genome-wide association studies to identify spurious signals of genetic association with purely environmentally determined phenotypes, and that this bias is only partially corrected by regressing out principal components of ancestry. Last, we discuss the relevance of our simulation results for inference from genetic variation in real organisms.
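For orientation, Wright's neighborhood size in a two-dimensional habitat is commonly written as N = 4πσ²ρ, where σ is the dispersal distance (the standard deviation of parent-offspring displacement along one axis) and ρ is the population density. The minimal sketch below evaluates that formula against the threshold of 100 mentioned in the abstract. The function name and all numeric values are hypothetical illustrations and are not taken from the study.

```python
import math


def neighborhood_size(sigma: float, density: float) -> float:
    """Wright's neighborhood size in two dimensions: 4 * pi * sigma^2 * density.

    sigma   -- dispersal distance (same length unit as used for density)
    density -- individuals per unit area
    """
    return 4.0 * math.pi * sigma ** 2 * density


# Hypothetical parameter values, chosen only to fall on either side of 100.
low_dispersal = neighborhood_size(sigma=0.5, density=5.0)
high_dispersal = neighborhood_size(sigma=2.0, density=5.0)

for label, value in [("low dispersal", low_dispersal),
                     ("high dispersal", high_dispersal)]:
    flag = "below" if value < 100 else "at or above"
    print(f"{label}: neighborhood size = {value:.1f} ({flag} 100)")
```

Because the formula depends on σ squared, a fourfold increase in dispersal distance raises the neighborhood size sixteenfold at constant density.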
Article
Dusky salamanders (Desmognathus) constitute a large, species-rich group within the family Plethodontidae, and though their systematic relationships have been addressed extensively, most studies have centered on particular species complexes and therefore offer only piecemeal phylogenetic perspective on the genus. Recent work has revealed Desmognathus to be far more clade rich—35 reciprocally monophyletic clades versus 22 recognized species—than previously imagined, results that, in turn, provide impetus for additional survey effort within clades and across geographic areas thus far sparsely sampled. We conceived and implemented a sampling regime combining level IV ecoregions and independent river drainages to yield a geographic grid for comprehensive recovery of all genealogically exclusive clades. We sampled over 550 populations throughout the distribution of Desmognathus in the eastern United States of America and generated mitochondrial DNA sequence data (mtDNA; 1,991 bp) for 536 specimens. A Bayesian phylogenetic reconstruction of the resulting haplotypes revealed forty-five reciprocally monophyletic clades, eleven of which have never been included in a comprehensive phylogenetic reconstruction, and an additional three not represented in any molecular systematic survey. Although general limitations associated with mtDNA data preclude new species delineation, we profile each of the 45 clades and assign names to 10 new clades (following a protocol for previous clade nomenclature). We also redefine several species complexes and erect new informal species complexes. Our dataset, which contains topotypic samples for nearly every currently recognized species and most synonymies, will offer a robust framework for future efforts to delimit species within Desmognathus.
Article
Gene flow between evolutionarily distinct lineages is increasingly recognized as a common occurrence. Such processes distort our ability to diagnose and delimit species, as well as confound attempts to estimate phylogenetic relationships. A conspicuous example is Dusky Salamanders (Desmognathus), a common model-system for ecology, evolution, and behavior. Only 22 species are described; 7 in the last 40 years. However, mitochondrial datasets indicate the presence of up to 45 "candidate species" and multiple paraphyletic taxa presenting a complex history of reticulation. Some authors have even suggested that the search for species boundaries in the group may be in vain. Here, we analyze nuclear and mitochondrial data containing 161 individuals from at least 49 distinct evolutionary lineages that we treat as candidate species. Concatenated and species-tree methods do not estimate fully resolved relationships among these taxa. Comparing topologies and applying methods for estimating phylogenetic networks, we find strong support for numerous instances of hybridization throughout the history of the group. We suggest that these processes may be more common than previously thought across the phylogeography-phylogenetics continuum, and that while the search for species boundaries in Desmognathus may not be in vain, it will be complicated by factors such as crypsis, parallelism, and gene-flow.
Article
Recent analyses of genomic sequence data suggest cross-species gene flow is common in both plants and animals, posing challenges to species tree estimation. We examine the levels of gene flow needed to mislead species tree estimation with three species and either episodic introgressive hybridization or continuous migration between an outgroup and one ingroup species. Several species tree estimation methods are examined, including the majority-vote method based on the most common gene tree topology (with either the true or reconstructed gene trees used), the UPGMA method based on the average sequence distances (or average coalescent times) between species, and the full-likelihood method based on multi-locus sequence data. Our results suggest that the majority-vote method based on gene tree topologies is more robust to gene flow than the UPGMA method based on coalescent times and both are more robust than likelihood assuming a multispecies coalescent (MSC) model with no cross-species gene flow. Comparison of the continuous migration model with the episodic introgression model suggests that a small amount of gene flow per generation can cause drastic changes to the genetic history of the species and mislead species tree methods, especially if the species diverged through radiative speciation events. Estimates of parameters under the MSC with gene flow suggest that African mosquito species in the Anopheles gambiae species complex constitute such an example of extreme impact of gene flow on species phylogeny.
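The majority-vote idea described above can be sketched in a few lines: tally the rooted topology of each gene tree and take the most frequent one as the species tree estimate. This is only an illustration under stated assumptions, not the authors' implementation. The function name, the Newick-style strings, and the locus counts are all hypothetical.

```python
from collections import Counter


def majority_vote(gene_tree_topologies):
    """Return the most common topology and the full tally.

    Topologies are compared as plain strings, so each one must be written
    in a single canonical form (e.g. always "((A,B),C)", never "(C,(B,A))").
    If two topologies tie, the one encountered first is returned.
    """
    counts = Counter(gene_tree_topologies)
    topology, _ = counts.most_common(1)[0]
    return topology, counts


# Hypothetical tally for three species across ten loci.
gene_trees = (
    ["((A,B),C)"] * 6
    + ["((A,C),B)"] * 3
    + ["((B,C),A)"] * 1
)

estimate, tally = majority_vote(gene_trees)
print("estimated species tree:", estimate)
print("topology counts:", dict(tally))
```

With three species there are only three possible rooted topologies, which is why a simple tally is enough here. Gene flow that shifts enough loci toward one of the minority topologies would change the winner.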
Article
ipyrad is a free and open source tool for assembling and analyzing restriction-site associated DNA sequence (RADseq) datasets using de novo and/or reference-based approaches. It is designed to be massively scalable to hundreds of taxa and thousands of samples, and can be efficiently parallelized on high performance computing clusters. It is available as both a command line interface (CLI) and as a Python package with an application programming interface (API), the latter of which can be used interactively to write complex, reproducible scripts, and implement a suite of downstream analysis tools. Availability and implementation: ipyrad is a free and open source program written in Python. Source code is available from the GitHub repository (https://github.com/dereneaton/ipyrad/), and Linux and MacOS installs are distributed through the conda package manager. Supplementary information: Complete documentation, including numerous tutorials, and Jupyter notebooks demonstrating example assemblies and applications of downstream analysis tools are available online: https://ipyrad.readthedocs.io/.