Article

Pseudo-parallel patterns of disjunctions in an Arctic-alpine plant lineage


... The use of ecological niche models (ENMs) for conservation purposes has become an important approach for evaluating the potential impacts of climate change on the geographic ranges of plant species (Gaynor et al., 2018; Randin et al., 2009; Stubbs et al., 2018). Projections of ENMs onto geographic space, known as species distribution models (SDMs), can be used to compare estimated current distributions with those forecast under a range of climate change scenarios. ...
... Despite these issues, ENMs and the resulting SDMs built from natural history collections are still of great utility in conservation, providing initial insights into possible future distributions of species and laying the foundation for more focused studies (Ellwood, Soltis, & Klein, 2019; Ferrarini et al., 2016). They have been applied to situations that include assorted species with diverse evolutionary histories, dispersal abilities, and climatic tolerances (Loarie et al., 2008; Randin et al., 2009; Stubbs et al., 2018; Warren et al., 2014). These comparisons of distributional responses to climate change, coupled with our preexisting knowledge of plant dispersal and adaptation, can be applied to conservation strategies for narrowly endemic species. ...
... Buffer size around each point was based on the dispersal potential and accessible area for species in this clade (Barve et al., 2011; Payton, 2012; Romero-Alvarez et al., 2017). Resultant shapefiles for each species were manually edited to be continuous areas for the final shapefile, per Stubbs et al. (2018). Recent work on narrow-range SDMs utilizing simulated and real-world datasets has shown that the number of environmental variables used in model creation is not one of the factors that increases the "minimum required sample size," which in narrowly distributed taxa may be as low as three to five points per species (also referred to as the "absolute minimum sample size"; van Proosdij et al., 2016). ...
Article
Full-text available
Recent studies have revealed that narrow endemics, particularly those native to the North American Coastal Plain, are experiencing range contractions due to human development and anthropogenically driven climate warming. We model how the projected distributions of a group of scrub-adapted plant species with similar evolutionary histories change in response to warming climates. The Scrub Mint clade (Lamiaceae) (SMC), which comprises 24 species in Dicerandra, Conradina, Stachydeoma, Piloblephis, and Clinopodium, including federally or state-listed threatened and endangered species, occurs in the scrub and sandhill biomes of the North American Coastal Plain. Georeferenced occurrence points were used to develop species distribution models (SDMs) to assess both present and predicted future ranges of all SMC species under future climate change. Future SDMs show that suitable environments for 67% of the SMC species would cover smaller geographical areas than at present. This loss of habitat is most pronounced in species of the Florida peninsula but is also prevalent in species found farther north. We use SDMs to identify the most at-risk species and geographic areas. Narrowly endemic species were more susceptible to habitat loss than those species with wider ranges. Using a large dataset and modeling habitat suitability at this regional scale, we demonstrate that scrub-adapted species are highly vulnerable to habitat reduction as a result of climate change.
... Micranthes tolmiei is relatively well-studied at the population level and displays another important east-west disjunction, between the Rocky Mountains and the mountain ranges west of the Great Basin (Stubbs et al. 2018a). In the western part of its range, it occurs widely and almost exclusively above tree line from the Sierra Nevada of California to southeastern Alaska, but in the Rocky Mountains M. tolmiei is restricted to Idaho and an adjacent portion of western Montana. ...
... Later, Engler and Irmscher (1916) reduced S. ledifolia to a variety of S. tolmiei, likely due to the morphological differences between these two taxa being continuous rather than discrete. Finally, Stubbs et al. (2018a) documented morphological differences between the Rocky Mountain plants and the western mountain specimens: the Rocky Mountain plants have shorter inflorescences, fewer flowers, and very reduced cymes or solitary flowers (compared to the corymb-like or raceme-like cyme found in the western plants). Notably, the Rocky Mountain plants have never been given a taxonomic distinction. ...
... Notably, the Rocky Mountain plants have never been given a taxonomic distinction. Based on phylogenomic data, morphology, and geography (Stubbs et al. 2018a), our research suggests three diagnosable clades: one group in the Sierra Nevada (corresponding to S. tolmiei var. ledifolia), another in the Pacific Northwest from Washington to southern Alaska (corresponding to S. tolmiei var. ...
... DNA extraction, probe design, target-capture, and dating analysis followed Stubbs et al. (2018), and are summarized briefly here. We used a target-capture approach for enrichment of pre-selected genomic regions optimized to provide resolution at multiple phylogenetic scales. ...
... We selected taxa to supplement the 27 ingroup accessions from our previous analysis (Stubbs et al., 2018) with the aim of having a substantial representation of every clade of Micranthes based on the most recent treatment of the group (Tkach et al., 2015). Total genomic DNA was extracted from silica-dried and herbarium specimen leaf material, following a modified cetyl trimethylammonium bromide (CTAB) extraction protocol (Doyle and Doyle, 1987). ...
... Methods of alignment and analyses followed Stubbs et al. (2018) and are reviewed here. The 481 single-copy loci and plastid genes were individually aligned with MAFFT. ...
Article
Full-text available
The increased availability of large phylogenomic datasets is often accompanied by difficulties in disentangling and harnessing the data. These difficulties may be enhanced for species resulting from reticulate evolution and/or rapid radiations producing large-scale discordance. As a result, there is a need for methods to investigate discordance, and in turn, use this conflict to inform and aid in downstream analyses. Therefore, we drew upon multiple analytical tools to investigate the evolution of Micranthes (Saxifragaceae), a clade of primarily arctic-alpine herbs impacted by reticulate and rapid radiations. To elucidate the evolution of Micranthes we sought near-complete taxon sampling with multiple accessions per species and assembled extensive nuclear (518 putatively single copy loci) and plastid (95 loci) datasets. In addition to a robust phylogeny for Micranthes, this research shows that genetic discordance presents a valuable opportunity to develop hypotheses about its underlying causes, such as hybridization, polyploidization, and range shifts. Specifically, we present a multi-step approach that incorporates multiple checkpoints for paralogy, including reciprocally blasting targeted genes against transcriptomes, running paralogy checks during the assembly step, and grouping genes into gene families to look for duplications. We demonstrate that a thorough assessment of discordance can be a source of evidence for evolutionary processes that were not adequately captured by a bifurcating tree model, and can help to clarify processes that have structured the evolution of Micranthes.
... Targeted sequencing of nuclear protein coding genes is a cost-effective method for extracting subgenome scale phylogenetic data (Mandel et al., 2014; Weitemier et al., 2014) to resolve rapid radiations in non-model taxa (Léveillé-Bourret et al., 2018; Villaverde et al., 2018). The advantages of targeted sequencing of full coding sequences include the ability to resolve relationships with hundreds of gene trees to address discordance, and additional sequence information at each locus can be obtained with inclusion of off-target flanking noncoding sequence (e.g., introns, see Stubbs et al., 2018; White et al., 2019). We, therefore, reconstructed the relationships within the Funariaceae (with emphasis on selected taxa with immersed capsules) by targeting 809 putative single copy nuclear genes sampled via enrichment of genomic libraries (see Weitemier et al., 2014). ...
... Although other studies using target enrichment of nuclear genes have included noncoding regions flanking targeted exons in their analysis (e.g., Léveillé-Bourret et al., 2018; Stubbs et al., 2018; White et al., 2019), our study includes a specific comparison of the level of phylogenetic signal obtained with and without flanking sequence (Fig. 2; Table S6). Adding sites from flanking regions, which significantly increased the total number of characters for each gene tree reconstruction, increased the percentage of genes concordant with the ASTRAL topology for the Physcomitrium-Entosthodon complex for virtually all nodes (Fig. 2; Table S6). ...
Article
Full-text available
Selection on spore dispersal mechanisms in mosses is thought to shape the transformation of the sporophyte. The majority of extant mosses develop a sporangium that dehisces through the loss of an operculum, and regulates spore release through the movement of articulate teeth, the peristome, lining the capsule mouth. Such complexity was acquired by the Mesozoic Era, but was lost in some groups during subsequent diversification events, challenging the resolution of the affinities for taxa with reduced architectures. The Funariaceae are a cosmopolitan and diverse lineage of mostly annual mosses, and exhibit variable sporophyte complexities, spanning from long, exserted, operculate capsules with two rings of well‐developed teeth, to capsules immersed among maternal leaves, lacking a differentiated line of dehiscence (i.e., inoperculate) and without peristomes. The family underwent a rapid diversification, and the relationships of taxa with reduced sporophytes remain ambiguous. Here we infer the relationships of five taxa with highly reduced sporophytes based on 648 nuclear loci (exons complemented by their flanking regions), using inferences from concatenated data and concordance analysis of single gene trees. Physcomitrellopsis is resolved as nested within one clade of Entosthodon. Physcomitrella s. l. is resolved as a polyphyletic assemblage and, along with its putative relative Aphanorrhegma, nested within Physcomitrium. We propose a new monophyletic delineation of Physcomitrium, which accommodates species of Physcomitrella and Aphanorrhegma. The monophyly of Physcomitrium s. l. is supported by a small plurality of exons, but a majority of trees inferred from exons and their adjacent non‐coding regions.
... If we compare the dating results obtained by Barres et al. (2013) and Herrando-Moraira et al. (2019b) side by side, the Centaureinae clade shows the most differences but is also one with strongly conflicting tree topologies (see figs. 3, 6 in Herrando-Moraira et al., 2019b). Similar results have been observed in Stubbs et al. (2018), where the greatest differences between two dating analyses were in the clades with a high proportion of gene conflicts. Finally, the third cause refers to the use of secondary calibration points and the fact that the nodes obtained from treePL analysis show confidence intervals that are biased and overly narrow (Barba-Montoya et al., 2021). ...
Article
Full-text available
Understanding the richness and diversification processes in the Mediterranean basin requires both knowledge of the current environmental complexity and paleogeographic and paleoclimate events and information from studies that introduce the temporal dimension. The Carthamus-Carduncellus complex (Cardueae, Compositae) constitutes a good case study to investigate the biogeographic history of this region because it evolved throughout the basin. We performed molecular dating, ancestral area estimation, and diversification analyses based on previous phylogenetic studies of a nearly complete taxon sampling of the complex. The main aims were to determine the role of tectonic and climatic events in the disjunction of the complex and the expansion route of the two main lineages, Carduncellus s.l. and Carthamus. Our results suggest that the main lineages in the complex originated during the Miocene. Later, all main paleogeographic and paleoclimatic events during the Neogene and Pleistocene in the Mediterranean basin had an important imprint on the evolutionary history of the complex. The Messinian Salinity Crisis facilitated the dispersion of the genus Carduncellus from North Africa to the Iberian Peninsula and the split of the genera Phonus and Femeniasia from the Carduncellus lineage. The onset of the Mediterranean climate in the Pliocene together with some orogenic processes could be the main causes of the diversification of the genus Carduncellus. In contrast, Pleistocene glaciations played a key role in the species diversification of Carthamus. In addition, we emphasize the problems derived from secondary dating and the existing differences between two previous dating analyses of the tribe Cardueae.
... Prior to executing PhyParts, we rooted 238 gene trees and the species topology we used as the mapping tree by employing the outgroup and the pxrr program in phyx (Brown et al., 2017), subsequently eliminating branch lengths from the mapping trees. We followed previous studies (e.g., Stubbs et al., 2018) and ran PhyParts with the -s option set to 50, ensuring that nodes with 50% bootstrap support or less in the gene trees were excluded from consideration. Results from PhyParts were rendered visually using PhyPartspiecharts.py, which uses pie charts to provide a concise representation of gene tree conflicts within our phylogenetic analyses. ...
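The support filter described in this excerpt can be illustrated with a minimal sketch. This is not PhyParts code: the `Node` record, the `filter_supported` helper, and the example clades are hypothetical, and only the cutoff of 50 comes from the excerpt. Treating a node at exactly 50 as excluded follows the excerpt's wording and is an assumption about the boundary case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical gene-tree node: the tips it subtends and its bootstrap support (0-100)."""
    tips: frozenset
    support: float


def filter_supported(nodes, cutoff=50.0):
    """Keep only nodes whose bootstrap support exceeds `cutoff`.

    Assumption: a node sitting exactly at the cutoff is dropped.
    """
    return [n for n in nodes if n.support > cutoff]


# Toy gene tree with made-up clades and support values.
gene_tree = [
    Node(frozenset({"A", "B"}), 97.0),
    Node(frozenset({"A", "B", "C"}), 42.0),  # weakly supported, ignored
    Node(frozenset({"D", "E"}), 50.0),       # at the cutoff, ignored here
]

kept = filter_supported(gene_tree, cutoff=50.0)
print([sorted(n.tips) for n in kept])
```

Only the nodes that survive this kind of filter would then be compared against the mapping tree, so weakly supported gene-tree nodes contribute neither concordance nor conflict.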
... Published phylogenetic analyses of subsection Arachnoideae either did not include all taxa and did not succeed in fully resolving phylogenetic relationships (Ebersbach et al., 2017; Tkach et al., 2015, 2019), or, when sampling all species (Gerschwitz-Eidt & Kadereit, 2020), did not succeed in resolving phylogenetic relationships and identified supported conflict between nuclear and plastid phylogenetic trees. Against this background, we here aim at reconstructing and dating the phylogeny of subsection Arachnoideae with a Hybseq approach using a bait set of 329 protein-coding nuclear loci designed for phylogenetic reconstructions in Saxifragales (Stubbs et al., 2018). This phylogenetic tree will be used to explore the ancestral ecology of the subsection. ...
Article
Full-text available
Saxifraga section Saxifraga subsection Arachnoideae is a lineage of 12 species distributed mainly in the European Alps. It is unusual in terms of ecological diversification by containing both high elevation species from exposed alpine habitats and low elevation species from shady habitats such as overhanging rocks and cave entrances. Our aims are to explore which of these habitat types is ancestral, and to identify the possible drivers of this remarkable ecological diversification. Using a Hybseq DNA‐sequencing approach and a complete species sample we reconstructed and dated the phylogeny of subsection Arachnoideae. Using Landolt indicator values, this phylogenetic tree was used for the reconstruction of the evolution of temperature, light and soil pH requirements in this lineage. Diversification of subsection Arachnoideae started in the late Pliocene and continued through the Pleistocene. Diversification both among and within clades was largely allopatric, and species from shady habitats with low light requirements are distributed in well‐known refugia. We hypothesize that low light requirements evolved when species persisting in cold‐stage refugia were forced into marginal habitats by more competitive warm‐stage vegetation. While we do not claim that such competition resulted in speciation, it very likely resulted in adaptive evolution. Saxifraga sect. Saxifraga subsect. Arachnoideae is unusual in terms of ecological diversification by containing both high elevation species from exposed alpine habitats and low elevation species from shady habitats. Based on phylogenetic relationships, the geographical distribution of species in relation to refugial areas in the Alps and a reconstruction of ecological preferences we hypothesize that low light requirements evolved when species persisting in cold‐stage refugia were forced into marginal habitats by more competitive warm‐stage vegetation.
... Recent studies have used high-throughput sequencing technologies to extensively sample nuclear gene regions to resolve historically recalcitrant nodes and generate more robust phylogenetic results (e.g. Gernandt et al. 2018; Stubbs et al. 2018; Couvreur et al. 2019; White et al. 2019). Specifically, hybridization-based target gene enrichment has developed into a valuable phylogenetic tool to target and sequence variable sites across the genome (e.g. ...
Article
Full-text available
Oenothera sect. Pachylophus has proven to be a valuable system in which to study plant-insect coevolution and the drivers of variation in floral morphology and scent. Current species circumscriptions based on morphological characteristics suggest that the section consists of five species, one of which is subdivided into five subspecies. Previous attempts to understand species (and subspecies) relationships at a molecular level have been largely unsuccessful due to high levels of incomplete lineage sorting and limited phylogenetic signal from slowly evolving gene regions. In the present study, target enrichment was used to sequence 322 conserved protein-coding nuclear genes from 50 individuals spanning the geographic range of Oenothera sect. Pachylophus, with species trees inferred using concatenation and coalescent-based methods. Our findings concur with previous research in suggesting that O. psammophila and O. harringtonii are nested within a paraphyletic Oenothera cespitosa. By contrast, our results show clearly that the two annual species (O. cavernae and O. brandegeei) did not arise from the O. cespitosa lineage, but rather from a common ancestor of Oenothera sect. Pachylophus. Budding speciation as a result of edaphic specialization appears to best explain the evolution of the narrow endemic species O. harringtonii and O. psammophila. Complete understanding of possible introgression among subspecies of O. cespitosa will require broader sampling across the full geographical and ecological ranges of these taxa.
... Briefly, this is a composite data product summarizing information from phylogenomic data and all species in GenBank. For this Saxifragales tree, we sequenced 627 species covering all major lineages for a panel of 301 protein-coding markers previously described (Stubbs & al., 2018). Then a supermatrix of all available data in GenBank was assembled and extensively curated to remove rogue sequences by building individual gene trees and identifying anomalous placements indicating misidentified or low-quality sequences. ...
Article
The family Saxifragaceae, the current composition of which is one of the great surprises of molecular systematics, has been subject to massive improvements in the knowledge of phylogenetic relationships. Nevertheless, developments from phylogenomic efforts have yet to be mobilized to inform biogeography and taxonomy. Here, we use a recent order‐level phylogeny for Saxifragaceae and related families covering 72% of species with a set of new analyses to assess habitat evolution and biogeography. Our results suggest a North American origin of the family in cold alpine habitats, followed by rapid recent evolution of diverse habitat tolerances. We also combine these recent phylogenomic results and a synthesis of the literature to improve generic limits and tribal classification of Saxifragaceae. We recognize 40 genera in 10 tribes, with 14 new combinations, and elevate one subtribe as well as describing four new taxa at the tribal level. Finally, we synthesize information about biogeography and morphology for the family.
... Importantly, this study includes multiple samples for most of the morphological species in Polemonium, an approach undertaken in only a few studies using data of this type (cf. Folk et al. 2017; Morales-Briones et al. 2018; Stubbs et al. 2018). In studies containing multiple individuals per morphological species, these proposed taxa are not always reciprocally monophyletic in both the nuclear and plastid genomes (Folk et al. 2017; Pham et al. 2017; Lee-Yaw et al. 2019). ...
Article
Phylogenomic data from a rapidly increasing number of studies provide new evidence for resolving relationships in recently radiated clades, but they also pose new challenges for inferring evolutionary histories. Most existing methods for reconstructing phylogenetic hypotheses rely solely on algorithms that only consider incomplete lineage sorting as a cause of intra- or inter-genomic discordance. Here, we utilize a variety of methods, including those to infer phylogenetic networks, to account for both incomplete lineage sorting and introgression as a cause for nuclear and cytoplasmic-nuclear discordance using phylogenomic data from the recently radiated flowering plant genus Polemonium (Polemoniaceae), an ecologically diverse genus in Western North America with known and suspected gene flow between species. We find evidence for widespread discordance among nuclear loci that can be explained by both incomplete lineage sorting and reticulate evolution in the evolutionary history of Polemonium. Furthermore, the histories of organellar genomes show strong discordance with the inferred species tree from the nuclear genome. Discordance between the nuclear and plastid genome is not completely explained by incomplete lineage sorting, and only one case of discordance is explained by detected introgression events. Our results suggest that multiple processes have been involved in the evolutionary history of Polemonium and that the plastid genome does not accurately reflect species relationships. We discuss several potential causes for this cytoplasmic-nuclear discordance, which emerging evidence suggests is more widespread across the Tree of Life than previously thought.
... TreePL estimates evolutionary rates and divergence dates on a tree given a set of age constraints and a smoothing factor determining the amount of among-branch rate heterogeneity (Pyron, 2014; Smith and O'Meara, 2012). We chose TreePL rather than BEAST (Drummond and Rambaut, 2007) or MrBayes (Ronquist and Huelsenbeck, 2003) because it is computationally tractable for the size of our data and it yields similar results to these better-known programs according to recent studies (Klimov et al., 2017; Stubbs et al., 2018). Analyses were based on fossil and secondary calibration points, generally following the scheme used in Herrando-Moraira et al. (2019) with a few modifications. ...
... Target sequence capture consists of enriching genomic libraries for regions of interest (nuclear or organellar), such as highly conserved regions (e.g., ultra-conserved elements, Faircloth et al., 2012; or anchors, Lemmon et al., 2012), more variable low-copy orthologous loci (e.g., exons plus their flanking non-coding introns, Mandel et al., 2014; Weitemier et al., 2014), or functional genes (Moore et al., 2018). In land plants, the Hyb-Seq method has recently become a standard procedure for generating large amounts of sequence data for phylogenomics of non-model organisms (Crowl et al., 2017; Gernandt et al., 2018; Stubbs et al., 2018; Medina et al., 2019). In addition to the resulting wealth of data, other advantages of this approach are the low levels of missing data (minimizing issues with orthology) and its cost-effectiveness (allowing for broad taxon sampling) (McKain et al., 2018; Dodsworth et al., 2019). ...
Article
Full-text available
The reduced cost of high‐throughput sequencing and the development of gene sets with wide phylogenetic applicability has led to the rise of sequence capture methods as a plausible platform for both phylogenomics and population genomics in plants. An important consideration in large targeted sequencing projects is the per‐sample cost, which can be inflated when using off‐the‐shelf kits or reagents not purchased in bulk. Here, we discuss methods to reduce per‐sample costs in high‐throughput targeted sequencing projects. We review the minimal equipment and consumable requirements for targeted sequencing while comparing several alternatives to reduce bulk costs in DNA extraction, library preparation, target enrichment, and sequencing. We consider how each of the workflow alterations may be affected by DNA quality (e.g., fresh vs. herbarium tissue), genome size, and the phylogenetic scale of the project. We provide a cost calculator for researchers considering targeted sequencing to use when designing projects, and identify challenges for future development of low‐cost sequencing in non‐model plant systems.
... This approach yields results similar to those of the software BEAST (e.g. Lagomarsino et al., 2016; Stubbs et al., 2018), but it runs faster on larger datasets. The dating procedure was divided into two main stages, which consisted of: (1) selection of the optimal model parameters; and (2) running the analysis with the optimal parameters selected and, additionally, accounting for the uncertainty in calibration points to obtain confidence intervals (95% CI) in the estimated node ages. ...
... Plant Transcriptomes Initiative (OneKP or 1KP) provide a more even phylogenetic distribution) and include sequences for over 830 flowering plant taxa (onekp.com/public_data.html). Transcriptome sequences have been successfully used to develop probe sets for targeting nuclear protein-coding genes in several plant groups (Chamala et al. 2015; Landis et al. 2015, 2017; Gardner et al. 2016; Heyduk et al. 2016; Crowl et al. 2017; García et al. 2017; Stubbs et al. 2018; Villaverde et al. 2018). Although intron-exon boundaries ...
Article
Full-text available
Sequencing of target-enriched libraries is an efficient and cost-effective method for obtaining DNA sequence data from hundreds of nuclear loci for phylogeny reconstruction. Much of the cost of developing targeted sequencing approaches is associated with the generation of preliminary data needed for the identification of orthologous loci for probe design. In plants, identifying orthologous loci has proven difficult due to a large number of whole-genome duplication events, especially in the angiosperms (flowering plants). We used multiple sequence alignments from over 600 angiosperms for 353 putatively single-copy protein-coding genes identified by the One Thousand Plant Transcriptomes Initiative to design a set of targeted sequencing probes for phylogenetic studies of any angiosperm group. To maximize the phylogenetic potential of the probes while minimizing the cost of production, we introduce a k-medoids clustering approach to identify the minimum number of sequences necessary to represent each coding sequence in the final probe set. Using this method, five to 15 representative sequences were selected per orthologous locus, representing the sequence diversity of angiosperms more efficiently than if probes were designed using available sequenced genomes alone. To test our approximately 80,000 probes, we hybridized libraries from 42 species spanning all higher-order groups of angiosperms, with a focus on taxa not present in the sequence alignments used to design the probes. Out of a possible 353 coding sequences, we recovered an average of 283 per species and at least 100 in all species. Differences among taxa in sequence recovery could not be explained by relatedness to the representative taxa selected for probe design, suggesting that there is no phylogenetic bias in the probe set. 
Our probe set, which targeted 260 kbp of coding sequence, achieved a median recovery of 137 kbp per taxon in coding regions, a maximum recovery of 250 kbp, and an additional median of 212 kbp per taxon in flanking non-coding regions across all species. These results suggest that the Angiosperms353 probe set described here is effective for any group of flowering plants and would be useful for phylogenetic studies from the species level to higher-order groups, including the entire angiosperm clade itself.
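The idea of picking a small set of representative sequences by k-medoids clustering can be sketched as follows. This is an illustration only, not the authors' implementation: the function names, the use of a simple proportion-of-differences distance, the deterministic starting medoids, and the toy aligned sequences are all assumptions made for the example.

```python
def p_distance(a, b):
    """Proportion of differing sites between two equal-length aligned sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def k_medoids(seqs, k, max_iter=100):
    """Pick k representative sequences (medoids).

    Alternates between assigning every sequence to its nearest medoid and
    re-picking, within each cluster, the member with the smallest total
    distance to the others, until the medoids stop changing.
    """
    n = len(seqs)
    dist = [[p_distance(seqs[i], seqs[j]) for j in range(n)] for i in range(n)]
    medoids = list(range(k))  # assumption: start from the first k sequences
    for _ in range(max_iter):
        clusters = {m: [] for m in medoids}
        for i in range(n):
            # A medoid always stays in its own cluster, so no cluster is empty.
            nearest = i if i in clusters else min(medoids, key=lambda m: dist[i][m])
            clusters[nearest].append(i)
        new_medoids = sorted(
            min(members, key=lambda c: sum(dist[c][j] for j in members))
            for members in clusters.values()
        )
        if new_medoids == sorted(medoids):
            break
        medoids = new_medoids
    return [seqs[m] for m in sorted(medoids)]


# Toy alignment: two obvious groups of three sequences each.
seqs = ["AAAAAA", "AAAAAT", "AAAATT", "GGGGGG", "GGGGGC", "GGGGCC"]
representatives = k_medoids(seqs, k=2)
print(representatives)
```

Because a medoid is always one of the input sequences, each representative is a real sequence that probes could be designed from, which is the property that distinguishes k-medoids from averaging-based clustering such as k-means.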
... HybPiper has already been successfully applied to analyse data from captured target loci in plants (e.g. Crowl et al., 2017; Landis et al., 2017; Chau et al., 2018; Gernandt et al., 2018; Kates et al., 2018; Medina et al., 2018; Stubbs et al., 2018; Vatanparast et al., 2018). Other new and promising tools are aTRAM (Allen et al., 2015, 2017), HybPhyloMarker (Fér and Schmickl, 2018), and SECAPR (Andermann et al., 2018). ...
Article
Target enrichment is a cost-effective sequencing technique that holds promise for elucidating evolutionary relationships in fast-evolving lineages. However, potential biases and the impact of bioinformatic sequence treatments on phylogenetic inference have not yet been thoroughly explored. Here, we investigate this issue with the ultimate goal of shedding light on a highly diversified group of Compositae (Asteraceae) comprising four main genera: Arctium, Cousinia, Saussurea, and Jurinea. Specifically, we compared sequence data extraction methods implemented in two easy-to-use workflows, PHYLUCE and HybPiper, and assessed the impact of two filtering practices intended to reduce phylogenetic noise. In addition, we compared two phylogenetic inference methods: 1) the concatenation approach, in which all loci were concatenated in a supermatrix; and 2) the coalescence approach, in which gene trees were produced independently and then used to construct a species tree under coalescence assumptions. Here we confirm the usefulness of the set of 1061 COS targets (a nuclear conserved orthology loci set developed for the Compositae) across a variety of taxonomic levels. Intergeneric relationships were completely resolved: there are two sister groups, Arctium-Cousinia and Saussurea-Jurinea, which are in agreement with a morphological hypothesis. Intrageneric relationships among species of Arctium, Cousinia, and Saussurea are also well defined. Conversely, conflicting species relationships remain for Jurinea. Methodological choices significantly affected phylogenies in terms of topology, branch length, and support. Across all analyses, the phylogeny obtained using HybPiper and the strictest scheme of removing fast-evolving sites was judged to be optimal.
Regarding methodological choices, we conclude that: 1) trees obtained under the coalescence approach are topologically more congruent with one another than those inferred using the concatenation approach; 2) refining treatments only improved support values under the concatenation approach; and 3) branch support values are maximized when fast-evolving sites are removed in the concatenation approach, and when a higher number of loci is analyzed in the coalescence approach.
... (Chamala et al. 2015; Landis et al. 2015, 2017; Gardner et al. 2016; Heyduk et al. 2016; Crowl et al. 2017; García et al. 2017; Stubbs et al. 2018). Although intron-exon boundaries are not known when probes are designed exclusively from transcriptomes in non-model organisms, this does not prevent efficient sequence recovery (Heyduk et al. 2016). ...
Preprint
Full-text available
Sequencing of target-enriched libraries is an efficient and cost-effective method for obtaining DNA sequence data from hundreds of nuclear loci for phylogeny reconstruction. Much of the cost associated with developing targeted sequencing approaches is preliminary data needed for identifying orthologous loci for probe design. In plants, identifying orthologous loci has proven difficult due to a large number of whole-genome duplication events, especially in the angiosperms (flowering plants). We used multiple sequence alignments from over 600 angiosperms for 353 putatively single-copy protein-coding genes to design a set of targeted sequencing probes for phylogenetic studies of any angiosperm lineage. To maximize the phylogenetic potential of the probes while minimizing the cost of production, we introduce a k-medoids clustering approach to identify the minimum number of sequences necessary to represent each coding sequence in the final probe set. Using this method, five to 15 representative sequences were selected per orthologous locus, representing the sequence diversity of angiosperms more efficiently than if probes were designed using available sequenced genomes alone. To test our approximately 80,000 probes, we hybridized libraries from 42 species spanning all higher-order lineages of angiosperms, with a focus on taxa not present in the sequence alignments used to design the probes. Out of a possible 353 coding sequences, we recovered an average of 283 per species and at least 100 in all species. Differences among taxa in sequence recovery could not be explained by relatedness to the representative taxa selected for probe design, suggesting that there is no phylogenetic bias in the probe set. Our probe set, which targeted 260 kbp of coding sequence, achieved a median recovery of 137 kbp per taxon in coding regions, a maximum recovery of 250 kbp, and an additional median of 212 kbp per taxon in flanking non-coding regions across all species. 
These results suggest that the Angiosperms353 probe set described here is effective for any group of flowering plants and would be useful for phylogenetic studies from the species level to higher-order lineages, including all angiosperms.
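The representative-selection idea in the abstract above can be illustrated with a minimal k-medoids sketch. This is not the authors' pipeline: the Hamming distance, the naive "first k items" initialisation, the alternating update heuristic, and the function names `hamming` and `k_medoids` are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def hamming(a: str, b: str) -> int:
    """Count mismatched columns between two equal-length aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def k_medoids(dist: Sequence[Sequence[float]], k: int, max_iter: int = 100) -> List[int]:
    """Return sorted indices of k medoids found by a simple alternating heuristic."""
    n = len(dist)
    medoids = list(range(k))  # assumption: naive deterministic start (first k items)
    for _ in range(max_iter):
        # Assignment step: attach every item to its nearest current medoid.
        clusters: Dict[int, List[int]] = {m: [] for m in medoids}
        for i in range(n):
            nearest = min(medoids, key=lambda m: dist[i][m])
            clusters[nearest].append(i)
        # Update step: in each cluster, pick the member with the smallest
        # total distance to the other members; keep the old medoid if empty.
        new_medoids = []
        for m, members in clusters.items():
            if not members:
                new_medoids.append(m)
                continue
            new_medoids.append(
                min(members, key=lambda c: sum(dist[c][j] for j in members))
            )
        if sorted(new_medoids) == sorted(medoids):
            break  # converged: medoids did not change
        medoids = new_medoids
    return sorted(medoids)


# Toy aligned sequences (hypothetical, not from the study).
seqs = ["AAAA", "AAAT", "AATT", "GGGG", "GGGC", "GGCC"]
dist = [[hamming(a, b) for b in seqs] for a in seqs]
representatives = [seqs[i] for i in k_medoids(dist, k=2)]
```

Unlike k-means centroids, medoids are always actual members of the dataset, which is why the approach suits choosing real sequences as probe templates.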
... habitats, latitudes, and elevations, and are dispersed throughout the Micranthes clade (Stubbs, Folk, Xiang, Soltis, & Cellinese, 2018; Stubbs, R.L. unpublished). ...
Article
Full-text available
Research has shown species undergoing range contractions and/or northward and higher elevational movements as a result of changing climates. Here, we evaluate how the distribution of a group of cold‐adapted plant species with similar evolutionary histories changes in response to warming climates. We selected 29 species of Micranthes (Saxifragaceae) representing the mountain and Arctic biomes of the Northern Hemisphere. For this analysis, 24,755 data points were input into ecological niche models to assess both present fundamental niches and predicted future ranges under climate change scenarios. Comparisons were made across the Northern Hemisphere between all cold‐adapted Micranthes, including Arctic species, montane species, and species defined as narrow endemics. Under future climate change models, 72% of the species would occupy smaller geographical areas than at present. This loss of habitat is most pronounced in Arctic species in general, but is also prevalent in species restricted to higher elevations in mountains. Additionally, narrowly endemic species restricted to high elevations were more susceptible to habitat loss than those species found at lower elevations. Using a large dataset and modeling habitat suitability at a global scale, our results empirically model the threats to cold‐adapted species as a result of warming climates. Although Arctic and alpine biomes share many underlying climate similarities, such as cold and short growing seasons, our results confirm that species in these climates have varied responses to climate change and that key abiotic variables differ between these two habitats.
Article
Full-text available
Angiosperms (flowering plants) are by far the most diverse land plant group with over 300,000 species. The sudden appearance of diverse angiosperms in the fossil record was referred to by Darwin as the “abominable mystery,” hence contributing to the heightened interest in angiosperm evolution. Angiosperms display wide ranges of morphological, physiological, and ecological characters, some of which have probably influenced their species richness. The evolutionary analyses of these characteristics help to address questions of angiosperm diversification and require well resolved phylogeny. Following the great successes of phylogenetic analyses using plastid sequences, dozens to thousands of nuclear genes from next‐generation sequencing have been used in angiosperm phylogenomic analyses, providing well resolved phylogenies and new insights into the evolution of angiosperms. In this review we focus on recent nuclear phylogenomic analyses of large angiosperm clades, orders, families, and subdivisions of some families and provide a summarized Nuclear Phylogenetic Tree of Angiosperm Families. The newly established nuclear phylogenetic relationships are highlighted and compared with previous phylogenetic results. The sequenced genomes of Amborella, Nymphaea, Chloranthus, Ceratophyllum, and species of monocots, Magnoliids, and basal eudicots, have facilitated the phylogenomics of relationships among five major angiosperms clades. All but one of the 64 angiosperm orders were included in nuclear phylogenomics with well resolved relationships except the placements of several orders. Most families have been included with robust and highly supported placements, especially for relationships within several large and important orders and families. Additionally, we examine the divergence time estimation and biogeographic analyses of angiosperm on the basis of the nuclear phylogenomic frameworks and discuss the differences compared with previous analyses. 
Furthermore, we discuss the implications of nuclear phylogenomic analyses on ancestral reconstruction of morphological, physiological, and ecological characters of angiosperm groups, limitations of current nuclear phylogenomic studies, and the taxa that require future attention.
Article
Full-text available
Biogeographic disjunctions, including intercontinental disjunctions, are frequent across plant lineages and have been of considerable interest to biologists for centuries. Their study has been reinvigorated by molecular dating and associated comparative methods. One of the "classic" disjunction patterns is that between Eastern Asia and North America. It has been speculated that this pattern is the result of vicariance following the sundering of a widespread Arcto-Tertiary flora. Subtribe Nepetinae in the mint family (Lamiaceae) is noteworthy because it contains three genera with this disjunction pattern: Agastache, Dracocephalum, and Meehania. These disjunctions are ostensibly the result of three separate events, allowing for concurrent testing of the tempo, origin, and type of each biogeographic event. Using four plastid and four nuclear markers, we estimated divergence times and analyzed the historical biogeography of Nepetinae, including comprehensive sampling of all major clades for the first time. We recover a well-supported and largely congruent phylogeny of Nepetinae between genomic compartments, although several cases of cyto-nuclear discordance are evident. We demonstrate that the three disjunctions are pseudo-congruent, with unidirectional movement from East Asia at slightly staggered times during the late Miocene and early Pliocene. With the possible exception of Meehania, we find that vicariance is likely the underlying driver of these disjunctions. The biogeographic history of Meehania in North America may be best explained by long-distance dispersal, but a more complete picture awaits deeper sampling of the nuclear genome and more advanced biogeographical models.
Article
Full-text available
Premise: Speciation not associated with morphological shifts is challenging to detect unless molecular data are employed. Using Sanger-sequencing approaches, the Lomatium packardiae/L. anomalum subcomplex within the larger Lomatium triternatum complex could not be resolved. Therefore, we attempt to resolve these boundaries here. Methods: The Angiosperms353 probe set was employed to resolve the ambiguity within Lomatium triternatum species complex using 48 accessions assigned to L. packardiae, L. anomalum, or L. triternatum. In addition to exon data, 54 nuclear introns were extracted and were complete for all samples. Three approaches were used to estimate evolutionary relationships and define species boundaries: STACEY, a Bayesian coalescent-based species tree analysis that takes incomplete lineage sorting into account; ASTRAL-III, another coalescent-based species tree analysis; and a concatenated approach using MrBayes. Climatic factors, morphological characters, and soil variables were measured and analyzed to provide additional support for recovered groups. Results: The STACEY analysis recovered three major clades and seven subclades, all of which are geographically structured, and some correspond to previously named taxa. No other analysis had full agreement between recovered clades and other parameters. Climatic niche and leaflet width and length provide some predictive ability for the major clades. Conclusions: The results suggest that these groups are in the process of incipient speciation and incomplete lineage sorting has been a major barrier to resolving boundaries within this lineage previously. These results are hypothesized through sequencing of multiple loci and analyzing data using coalescent-based processes.
Article
Targeted sequence capture is a promising approach for large-scale phylogenomics. However, rapid evolutionary radiations pose significant challenges for phylogenetic inference (e.g. incomplete lineage sorting (ILS), phylogenetic noise), and the ability of targeted nuclear loci to resolve species trees despite such issues remains poorly studied. We test the utility of targeted sequence capture for inferring phylogenetic relationships in rapid, recent angiosperm radiations, focusing on Burmeistera bellflowers (Campanulaceae), which diversified into ~130 species over less than 3 million years. We compared phylogenies estimated from supercontig (exons plus flanking sequences), exon-only, and flanking-only datasets with 506–546 loci (~4.7 million bases) for 46 Burmeistera species/lineages and 10 outgroup taxa. Nuclear loci resolved backbone nodes and many congruent internal relationships with high support in concatenation and coalescent-based species tree analyses, and inferences were largely robust to effects of missing taxa and base composition biases. Nevertheless, species trees were incongruent between datasets, and gene trees exhibited remarkably high levels of conflict (~4–60% congruence, ~40–99% conflict) not simply driven by poor gene tree resolution. Higher gene tree heterogeneity at shorter branches suggests an important role of ILS, as expected for rapid radiations. Phylogenetic informativeness analyses also suggest this incongruence has resulted from low resolving power at short internal branches, consistent with ILS, and homoplasy at deeper nodes, with exons exhibiting much greater risk of incorrect topologies due to homoplasy than other datasets. Our findings suggest that targeted sequence capture is feasible for resolving rapid, recent angiosperm radiations, and that results based on supercontig alignments containing nuclear exons and flanking sequences have higher phylogenetic utility and accuracy than either alone. 
We use our results to make practical recommendations for future target capture-based studies of Burmeistera and other rapid angiosperm radiations, including that such studies should analyze supercontigs to maximize the phylogenetic information content of loci.
Article
Full-text available
Premise of the study: Unrecognized variation in ploidy level can lead to an underestimation of species richness and a misleading delineation of geographic range. Caltha leptosepala (Ranunculaceae) comprises a complex of hexaploids (6x), rare nonaploids (9x), and dodecaploids (12x), all with unknown distributions. We delineate the geographic distribution and contact zones of the cytotypes, investigate morphologies of cytotypes and subspecies, and discuss the biogeography and evolutionary history of the polyploid complex. Methods: Using cytologically determined specimens as reference, propidium iodide flow cytometry was performed on silica-dried samples and herbarium specimens from across the range of C. leptosepala s.l. Genome size estimates from flow cytometry were used to infer cytotypes. A key morphological character, leaf length-to-width ratio, was measured to evaluate whether these dimensions are informative for taxon and/or cytotype delimitation. Key results: Dodecaploids were more northerly in distribution than hexaploids, and a single midlatitude population in the Northern Rockies yielded nonaploids. Genome size estimates were significantly different between all cytotypes and between hexaploid subspecies. Leaf length-to-width ratios were significantly different between subspecies and some cytotypes. Conclusions: Caltha leptosepala presents clear patterns of cytotype distribution at the large scale. Marked differences in morphology, range, and genome size were detected between the hexaploid subspecies, C. leptosepala subsp. howellii in the Cascade-Sierra axis and C. leptosepala subsp. leptosepala in the Rockies. Sympatry between cytotypes in the Cascades and a parapatric distribution in the Northern Rockies suggest unique origins and separate lineages in the respective contact zones.
Article
Full-text available
The ease with which phylogenomic data can be generated has drastically escalated the computational burden for even routine phylogenetic investigations. To address this, we present phyx: a collection of programs written in C++ to explore, manipulate, analyze and simulate phylogenetic objects (alignments, trees and MCMC logs). Modelled after Unix/GNU/Linux command line tools, individual programs perform a single task and operate on standard I/O streams that can be piped to quickly and easily form complex analytical pipelines. Because of the stream-centric paradigm, memory requirements are minimized (often only a single tree or sequence in memory at any instance), and hence phyx is capable of efficiently processing very large datasets. Availability and implementation: phyx runs on POSIX-compliant operating systems. Source code, installation instructions, documentation and example files are freely available under the GNU General Public License at https://github.com/FePhyFoFum/phyx. Contact: eebsmith@umich.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
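The stream-centric design described above (one record in memory at a time, input and output on standard streams) can be sketched in a few lines. This is an illustrative Python analogue, not phyx code; the function names `read_fasta`, `gc_content`, and `run_filter` are hypothetical.

```python
import io
from typing import IO, Iterable, Iterator, List, Optional, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Lazily yield (name, sequence) pairs, holding only one record in memory."""
    name: Optional[str] = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)  # emit the finished record
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)  # emit the last record


def gc_content(seq: str) -> float:
    """Fraction of G/C among unambiguous A/C/G/T characters."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    return (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0


def run_filter(stream_in: IO[str], stream_out: IO[str]) -> None:
    """Single-task filter: read FASTA from one stream, write name and GC to another."""
    for name, seq in read_fasta(stream_in):
        stream_out.write(f"{name}\t{gc_content(seq):.3f}\n")


# In-memory demonstration; in a real pipeline the streams would be stdin/stdout.
demo_out = io.StringIO()
run_filter(io.StringIO(">t1\nACGT\nGGCC\n>t2\nAATT\n"), demo_out)
```

Calling `run_filter(sys.stdin, sys.stdout)` from a script would let the filter sit in a shell pipeline, which is the composition style the abstract describes.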
Article
Full-text available
The integration of ecological niche modeling into phylogeographic analyses has allowed for the identification and testing of potential refugia under a hypothesis-based framework, where the expected patterns of higher genetic diversity in refugial populations and evidence of range expansion of non-refugial populations are corroborated with empirical data. In this study we focus on a montane-restricted cryophilic harvestman, Sclerobunus robustus, distributed throughout the heterogeneous Southern Rocky Mountains and Intermontane Plateau (SRMIP) of southwestern North America. We identified hypothetical refugia using ecological niche models (ENMs) across three time periods, corroborated these refugia with population genetic methods using double-digest RAD-seq data, and conducted population level phylogenetic and divergence dating analyses. ENMs identify two large temporally persistent regions in the mid-latitude highlands. Genetic patterns support these two hypothesized refugia with higher genetic diversity within refugial populations and evidence for range expansion in populations found outside of hypothesized refugia. Phylogenetic analyses identify five to six genetically divergent, geographically cohesive clades of S. robustus. Divergence dating analyses suggest that these separate refugia date to the Pliocene and that divergence between clades predates the late Pleistocene glacial cycles, while diversification within clades was likely driven by these cycles. Population genetic analyses reveal effects of both isolation by distance (IBD) and isolation by environment (IBE), with IBD more important in the continuous mountainous portion of the distribution, while IBE was stronger in the populations inhabiting the isolated sky islands of the south. Using model-based coalescent approaches, we find support for post-divergence migration between clades from separate refugia.
Article
Full-text available
Premise of the study: Using sequence data generated via target enrichment for phylogenetics requires reassembly of high-throughput sequence reads into loci, presenting a number of bioinformatics challenges. We developed HybPiper as a user-friendly platform for assembly of gene regions, extraction of exon and intron sequences, and identification of paralogous gene copies. We test HybPiper using baits designed to target 333 phylogenetic markers and 125 genes of functional significance in Artocarpus (Moraceae). Methods and results: HybPiper implements parallel execution of sequence assembly in three phases: read mapping, contig assembly, and target sequence extraction. The pipeline was able to recover nearly complete gene sequences for all genes in 22 species of Artocarpus. HybPiper also recovered more than 500 bp of nontargeted intron sequence in over half of the phylogenetic markers and identified paralogous gene copies in Artocarpus. Conclusions: HybPiper was designed for Linux and Mac OS X and is freely available at https://github.com/mossmatters/HybPiper.
Article
Full-text available
Species tree reconstruction is complicated by effects of incomplete lineage sorting, commonly modeled by the multi-species coalescent model (MSC). While there has been substantial progress in developing methods that estimate a species tree given a collection of gene trees, less attention has been paid to fast and accurate methods of quantifying support. In this article, we propose a fast algorithm to compute quartet-based support for each branch of a given species tree with regard to a given set of gene trees. We then show how the quartet support can be used in the context of the MSC to compute (1) the local posterior probability (PP) that the branch is in the species tree and (2) the length of the branch in coalescent units. We evaluate the precision and recall of the local PP on a wide set of simulated and biological datasets, and show that it has very high precision and improved recall compared with multi-locus bootstrapping. The estimated branch lengths are highly accurate when gene tree estimation error is low, but are underestimated when gene tree estimation error increases. Computation of both the branch length and local PP is implemented as new features in ASTRAL.
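The link between quartet support and branch length in coalescent units can be made concrete with the standard multi-species coalescent relationship for a single internal branch: the probability that a gene-tree quartet matches the species tree is 1 - (2/3)·exp(-d). The sketch below inverts that relationship to obtain a simple maximum-likelihood-style estimate. It is an assumption-laden illustration, not ASTRAL's implementation (whose estimator and local posterior probability calculation are not reproduced here); the function names and the quartet counts are hypothetical.

```python
import math
from typing import List, Sequence


def quartet_frequencies(counts: Sequence[int]) -> List[float]:
    """Normalise counts of the three quartet topologies around a branch."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no informative quartets for this branch")
    return [c / total for c in counts]


def coalescent_branch_length(z1: float) -> float:
    """Invert P(match) = 1 - (2/3) * exp(-d) for the branch length d.

    z1 is the observed frequency of the quartet topology that agrees with
    the species tree. Frequencies at or below 1/3 carry no signal for a
    positive length, so the estimate is clamped to zero.
    """
    if z1 <= 1.0 / 3.0:
        return 0.0
    if z1 >= 1.0:
        return math.inf  # no discordance observed: length is unbounded
    return -math.log(1.5 * (1.0 - z1))


# Hypothetical counts: 60 gene-tree quartets agree, 20 + 20 support the alternatives.
z = quartet_frequencies([60, 20, 20])
d = coalescent_branch_length(z[0])  # -ln(1.5 * 0.4) = -ln(0.6)
```

Because discordance saturates at 1/3 per topology, short branches (high incomplete lineage sorting) yield frequencies close to 1/3 and estimates close to zero, which is consistent with the abstract's note that branch lengths are underestimated when gene tree error inflates discordance.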
Article
Full-text available
Background: Phylogeographic studies of aquatic insects provide valuable insights into mechanisms that shape the genetic structure of communities, yet studies that include broad geographic areas are uncommon for this group. We conducted a broad scale phylogeographic analysis of the least salmonfly Pteronarcella badia (Plecoptera) across western North America. We tested hypotheses related to mode of dispersal and the influence of historic climate oscillations on population genetic structure. In order to generate a larger mitochondrial data set, we used 454 sequencing to reconstruct the complete mitochondrial genome in the early stages of the project. Results: Our analysis revealed high levels of population structure with several deeply divergent clades present across the sample area. Evidence from five mitochondrial genes and one nuclear locus identified a potentially cryptic lineage in the Pacific Northwest. Gene flow estimates and geographic clade distributions suggest that overland flight during the winged adult stage is an important dispersal mechanism for this taxon. We found evidence of multiple glacial refugia across the species distribution and signs of secondary contact within and among major clades. Conclusions: This study provides a basis for future studies of aquatic insect phylogeography at the inter-basin scale in western North America. Our findings add to an understanding of the role of historical climate isolations in shaping assemblages of aquatic insects in this region. We identified several geographic areas that may have historical importance for other aquatic organisms with similar distributions and dispersal strategies as P. badia. This work adds to the ever-growing list of studies that highlight the potential of next-generation DNA sequencing in a phylogenetic context to improve molecular data sets from understudied groups.
Article
Full-text available
Historical biogeography has been characterized by a large diversity of methods and unresolved debates about which processes, such as dispersal or vicariance, are most important for explaining distributions. A new R package, BioGeoBEARS, implements many models in a common likelihood framework, so that standard statistical model selection procedures can be applied to let the data choose the best model. Available models include a likelihood version of DIVA (“DIVALIKE”), LAGRANGE’s DEC model, and BAYAREA, as well as “+J” versions of these models which include founder-event speciation, an important process left out of most inference methods. I use BioGeoBEARS on a large sample of island and non-island clades (including two fossil clades) to show that founder-event speciation is a crucial process in almost every clade, and that most published datasets reject the non-J models currently in widespread use. BioGeoBEARS is open-source and freely available for installation at the Comprehensive R Archive Network at http://CRAN.R-project.org/package=BioGeoBEARS. A step-by-step tutorial is available at http://phylo.wikidot.com/biogeobears.
Article
Full-text available
At the intersection of geological activity, climatic fluctuations, and human pressure, the Mediterranean Basin – a hotspot of biodiversity – provides an ideal setting for studying endemism, evolution, and biogeography. Here, we focus on the Roucela complex (Campanula subgenus Roucela), a group of 13 bellflower species found primarily in the eastern Mediterranean Basin. Plastid and low-copy nuclear markers were employed to reconstruct evolutionary relationships and estimate divergence times within the Roucela complex using both concatenation and species tree analyses. Niche modeling, ancestral range estimation, and diversification analyses were conducted to provide further insights into patterns of endemism and diversification through time. Diversification of the Roucela clade appears to have been primarily the result of vicariance driven by the breakup of an ancient landmass. We found geologic events such as the formation of the mid-Aegean trench and the Messinian Salinity Crisis to be historically important in the evolutionary history of this group. Contrary to numerous past studies, the onset of the Mediterranean climate has not promoted diversification in the Roucela complex and, in fact, may be negatively affecting these species. This study highlights the diversity and complexity of historical processes driving plant evolution in the Mediterranean Basin.
Article
Full-text available
Phylogenetic inference is moving to large multilocus data sets, yet there remains uncertainty in the choice of marker and sequencing method at low taxonomic levels. To address this gap, we present a method for enriching long loci spanning intron-exon boundaries in the genus Heuchera. Two hundred seventy-eight loci were designed using a splice-site prediction method combining transcriptomic and genomic data. Biotinylated probes were designed for enrichment of these loci. Reference-based assembly was performed using genomic references; additionally, chloroplast and mitochondrial genomes were used as references for off-target reads. The data were aligned and subjected to coalescent and concatenated phylogenetic analyses to demonstrate support for major relationships. Complete or nearly complete (>99%) sequences were assembled from essentially all loci from all taxa. Aligned introns showed a fourfold increase in divergence as opposed to exons. Concatenated analysis gave decisive support to all nodes, and support was also high and relationships mostly similar in the coalescent analysis. Organellar phylogenies were also well-supported and conflicted with the nuclear signal. Our approach shows promise for resolving a recent radiation. Enrichment for introns is highly successful with little or no sequencing dropout at low taxonomic levels despite higher substitution and indel frequencies, and should be exploited in studies of species complexes.
Article
Full-text available
The use of transcriptomic and genomic datasets for phylogenetic reconstruction has become increasingly common as researchers attempt to resolve recalcitrant nodes with increasing amounts of data. The large size and complexity of these datasets introduce significant phylogenetic noise and conflict into subsequent analyses. The sources of conflict may include hybridization, incomplete lineage sorting, or horizontal gene transfer, and may vary across the phylogeny. For phylogenetic analysis, this noise and conflict has been accommodated in one of several ways: by binning gene regions into subsets to isolate consistent phylogenetic signal; by using gene-tree methods for reconstruction, where conflict is presumed to be explained by incomplete lineage sorting (ILS); or through concatenation, where noise is presumed to be the dominant source of conflict. The results provided herein emphasize that analysis of individual homologous gene regions can greatly improve our understanding of the underlying conflict within these datasets. Here we examined two published transcriptomic datasets, the angiosperm group Caryophyllales and the aculeate Hymenoptera, for the presence of conflict, concordance, and gene duplications in individual homologs across the phylogeny. We found significant conflict throughout the phylogeny in both datasets and in particular along the backbone. While some nodes in each phylogeny showed patterns of conflict similar to what might be expected with ILS alone, the backbone nodes also exhibited low levels of phylogenetic signal. In addition, certain nodes, especially in the Caryophyllales, had highly elevated levels of strongly supported conflict that cannot be explained by ILS alone. This study demonstrates that phylogenetic signal is highly variable in phylogenomic data sampled across related species and poses challenges when conducting species tree analyses on large genomic and transcriptomic datasets. 
Further insight into the conflict and processes underlying these complex datasets is necessary to improve and develop adequate models for sequence analysis and downstream applications. To aid this effort, we developed the open source software phyparts (https://bitbucket.org/blackrim/phyparts), which calculates unique, conflicting, and concordant bipartitions, maps gene duplications, and outputs summary statistics such as internode certainty (ICA) scores and node-specific counts of gene duplications.
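The distinction between concordant and conflicting bipartitions can be sketched with plain set operations. The version below assumes unrooted splits over an identical taxon set and uses the textbook compatibility test (two splits conflict only if all four pairwise intersections of their sides are non-empty). It is not phyparts' algorithm, which also has to handle rooted clades, missing taxa, and duplications; the function name `classify_split` and the taxon labels are hypothetical.

```python
from typing import FrozenSet, Iterable


def classify_split(
    gene_side: Iterable[str],
    species_side: Iterable[str],
    taxa: FrozenSet[str],
) -> str:
    """Classify a gene-tree split against a species-tree split.

    Each split is given by the taxa on one side; the other side is the
    complement within `taxa`.
    """
    a = frozenset(gene_side)
    b = taxa - a
    c = frozenset(species_side)
    d = taxa - c
    if a == c or a == d:
        return "concordant"  # same bipartition, possibly named by the other side
    if all((a & c, a & d, b & c, b & d)):
        return "conflicting"  # all four intersections non-empty: incompatible
    return "compatible"  # different split, but both could sit in one tree


taxa = frozenset("abcde")
species_split = {"a", "b"}  # the species-tree branch separating ab | cde
results = [
    classify_split({"a", "b"}, species_split, taxa),
    classify_split({"c", "d", "e"}, species_split, taxa),
    classify_split({"a", "c"}, species_split, taxa),
    classify_split({"a", "b", "c"}, species_split, taxa),
]
```

Tallying these labels per species-tree branch across many gene trees gives the kind of concordance and conflict counts that the abstract describes as summary output.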
Article
Full-text available
The estimation of species phylogenies requires multiple loci, since different loci can have different trees due to incomplete lineage sorting, modeled by the multi-species coalescent model. We recently developed a coalescent-based method, ASTRAL, which is statistically consistent under the multi-species coalescent model and which is more accurate than other coalescent-based methods on the datasets we examined. ASTRAL runs in polynomial time, by constraining the search space using a set of allowed 'bipartitions'. Despite the limitation to allowed bipartitions, ASTRAL is statistically consistent. We present a new version of ASTRAL, which we call ASTRAL-II. We show that ASTRAL-II has substantial advantages over ASTRAL: it is faster, can analyze much larger datasets (up to 1000 species and 1000 genes) and has substantially better accuracy under some conditions. ASTRAL's running time is O(n^2 k |X|^2), and ASTRAL-II's running time is O(n k |X|^2), where n is the number of species, k is the number of loci and X is the set of allowed bipartitions for the search space. ASTRAL-II is available in open source at https://github.com/smirarab/ASTRAL and datasets used are available at http://www.cs.utexas.edu/~phylo/datasets/astral2/. Contact: smirarab@gmail.com. Supplementary information: Supplementary data are available at Bioinformatics online.
Article
Full-text available
Premise of the study: Targeted sequencing using next-generation sequencing (NGS) platforms offers enormous potential for plant systematics by enabling economical acquisition of multilocus data sets that can resolve difficult phylogenetic problems. However, because discovery of single-copy nuclear (SCN) loci from NGS data requires both bioinformatics skills and access to high-performance computing resources, the application of NGS data has been limited. Methods and Results: We developed MarkerMiner 1.0, a fully automated, open-access bioinformatic workflow and application for discovery of SCN loci in angiosperms. Our new tool identified as many as 1993 SCN loci from transcriptomic data sampled as part of four independent test cases representing marker development projects at different phylogenetic scales. Conclusions: MarkerMiner is an easy-to-use and effective tool for discovery of putative SCN loci. It can be run locally or via the Web, and its tabular and alignment outputs facilitate efficient downstream assessments of phylogenetic utility, locus selection, intron-exon boundary prediction, and primer or probe development.
Article
Full-text available
Section Micranthes of the genus Saxifraga (Saxifragaceae) comprises 67 species that are distributed throughout the northern hemisphere. A previous phylogenetic analysis indicates that this section is a lineage distinct from the remainder of Saxifraga. Recent taxonomic treatments have divided section Micranthes into four subsections: Cuneifoliatae, Micranthes, Rotundifoliatae, and Stellares. To investigate the phylogenetic relationships among species in section Micranthes and to test the monophyly of each of the four subsections, we sequenced the chloroplast gene matK for 26 species of section Micranthes. The results of our parsimony analyses suggest that subsections Micranthes and Stellares are each monophyletic. The single taxon of subsection Cuneifoliatae (S. calycina) for which material could be obtained appears within a clade representing subsection Rotundifoliatae; hence Cuneifoliatae may not be distinct from Rotundifoliatae. Within section Micranthes there exists a high diversity of ovary positions, ranging from what has been described as fully superior to greater than one-half inferior. Examination of this character in light of our matK strict consensus tree indicates that the major trend in gynoecial evolution in section Micranthes has been from an ancestor with what has been termed a superior ovary towards greater inferiority. However, gynoecial evolution in subsection Micranthes is complex, with several apparent reversals towards greater superiority.
Article
The 1,000 plants (1KP) project is an international multi-disciplinary consortium that has generated transcriptome data from over 1,000 plant species, with exemplars for all of the major lineages across the Viridiplantae (green plants) clade. Here, we describe how to access the data used in a phylogenomics analysis of the first 85 species, and how to visualize our gene and species trees. Users can develop computational pipelines to analyse these data, in conjunction with data of their own that they can upload. Computationally estimated protein-protein interactions and biochemical pathways can be visualized at another site. Finally, we comment on our future plans and how they fit within this scalable system for the dissemination, visualization, and analysis of large multi-species data sets.
Article
• Premise of the study: Hyb-Seq, the combination of target enrichment and genome skimming, allows simultaneous data collection for low-copy nuclear genes and high-copy genomic targets for plant systematics and evolution studies. • Methods and Results: Genome and transcriptome assemblies for milkweed (Asclepias syriaca) were used to design enrichment probes for 3385 exons from 768 genes (>1.6 Mbp) followed by Illumina sequencing of enriched libraries. Hyb-Seq of 12 individuals (10 Asclepias species and two related genera) resulted in at least partial assembly of 92.6% of exons and 99.7% of genes and an average assembly length >2 Mbp. Importantly, complete plastomes and nuclear ribosomal DNA cistrons were assembled using off-target reads. Phylogenomic analyses demonstrated signal conflict between genomes. • Conclusions: The Hyb-Seq approach enables targeted sequencing of thousands of low-copy nuclear exons and flanking regions, as well as genome skimming of high-copy repeats and organellar genomes, to efficiently produce genome-scale data sets for phylogenomics.
Article
Founder-event speciation, where a rare jump dispersal event founds a new genetically isolated lineage, has long been considered crucial by many historical biogeographers, but its importance is disputed within the vicariance school. Probabilistic modeling of geographic range evolution creates the potential to test different biogeographical models against data using standard statistical model choice procedures, as long as multiple models are available. I re-implement the Dispersal-Extinction-Cladogenesis (DEC) model of LAGRANGE in the R package BioGeoBEARS, and modify it to create a new model, DEC+J, which adds founder-event speciation, the importance of which is governed by a new free parameter, j. The identifiability of DEC and DEC+J is tested on datasets simulated under a wide range of macroevolutionary models where geography evolves jointly with lineage birth/death events. The results confirm that DEC and DEC+J are identifiable even though these models ignore the fact that molecular phylogenies are missing many cladogenesis and extinction events. The simulations also indicate that DEC will have substantially increased errors in ancestral range estimation and parameter inference when the true model includes +J. DEC and DEC+J are compared on 13 empirical datasets drawn from studies of island clades. Likelihood ratio tests indicate that all clades reject DEC, and AICc model weights show large to overwhelming support for DEC+J, for the first time verifying the importance of founder-event speciation in island clades via statistical model choice. Under DEC+J, ancestral nodes are usually estimated to have ranges occupying only one island, rather than the widespread ancestors often favored by DEC. These results indicate that the assumptions of historical biogeography models can have large impacts on inference and require testing and comparison with statistical methods.
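The model-choice step described in this abstract (a likelihood ratio test plus AICc model weights for DEC versus DEC+J) can be illustrated with a minimal sketch. The log-likelihoods and sample size below are hypothetical, not values from the study, and the parameter counts (two for DEC, three for DEC+J) and the use of the number of tips as the AICc sample size are assumptions made for illustration. The sketch uses only the Python standard library and does not call BioGeoBEARS.

```python
import math


def aicc(log_lik, k, n):
    """Small-sample corrected AIC for a model with k free parameters and n observations."""
    aic = -2.0 * log_lik + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


def akaike_weights(scores):
    """Convert a list of AICc scores into relative model weights that sum to 1."""
    best = min(scores)
    rel = [math.exp(-0.5 * (s - best)) for s in scores]
    total = sum(rel)
    return [r / total for r in rel]


def lrt_1df(log_lik_null, log_lik_alt):
    """Likelihood ratio test for nested models differing by one free parameter.

    For 1 degree of freedom, the chi-square survival function equals erfc(sqrt(x / 2)).
    """
    stat = max(2.0 * (log_lik_alt - log_lik_null), 0.0)
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Hypothetical inputs (assumptions for illustration only).
lnL_dec, lnL_decj = -40.0, -30.0   # maximized log-likelihoods
n_tips = 20                        # sample size used in the AICc correction

scores = [aicc(lnL_dec, 2, n_tips), aicc(lnL_decj, 3, n_tips)]
weights = akaike_weights(scores)
stat, p_value = lrt_1df(lnL_dec, lnL_decj)

print(f"AICc: DEC={scores[0]:.2f}, DEC+J={scores[1]:.2f}")
print(f"AICc weights: DEC={weights[0]:.4f}, DEC+J={weights[1]:.4f}")
print(f"LRT statistic={stat:.2f}, p={p_value:.2e}")
```

With these made-up numbers the extra parameter is strongly favored, but the weights and p-value carry no meaning beyond showing how the two criteria are computed.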
Article
Geological and ecological features restrict dispersal and gene flow, leading to isolated populations. Dispersal barriers can be obvious physical structures in the landscape; however, microgeographic differences can also lead to genetic isolation. Our study examined dispersal barriers at both macro- and micro-geographical scales in the black-capped chickadee, a resident North American songbird. Although birds have high dispersal potential, evidence suggests dispersal is restricted by barriers. The chickadee's range encompasses a number of physiological features which may impede movement and lead to divergence. Analyses of 913 individuals from 34 sampling sites across the entire range using 11 microsatellite loci revealed as many as 13 genetic clusters. Populations in the east were largely panmictic whereas populations in the western portion of the range showed significant genetic structure, which often coincided with large mountain ranges, such as the Cascade and Rocky Mountains, as well as areas of unsuitable habitat. Unlike populations in the central and southern Rockies, populations on either side of the northern Rockies were not genetically distinct. Furthermore, Northeast Oregon represents a forested island within the Great Basin, genetically isolated from all other populations. Substructuring at the microgeographical scale was also evident within the Fraser Plateau of central British Columbia, and in the southeast Rockies where no obvious physical barriers are present, suggesting additional factors may be impeding dispersal and gene flow. Dispersal barriers are therefore not restricted to large physical structures, although mountain ranges and large water bodies do play a large role in structuring populations in this study. Heredity advance online publication, 30 July 2014; doi:10.1038/hdy.2014.64.
Article
Background: Biogeographers seek to understand the influences of global climate shifts and geologic changes to the landscape on the ecology and evolution of organisms. Across both longer and shorter timeframes, the western North American landscape has experienced dynamic transformations related to various geologic processes and climatic oscillations, including events as recently as the Last Glacial Maximum (LGM; ~20 Ka) that have impacted the evolution of the North American biota. Redside shiner is a cyprinid species that is widely distributed throughout western North America. The species’ native range includes several well-documented Pleistocene refugia. Here we use mitochondrial DNA sequence data to assess phylogeography, and to test two biogeographic hypotheses regarding post-glacial colonization by redside shiner: 1) redside shiner entered the Bonneville Basin at the time of the Bonneville Flood (Late Pleistocene; 14.5 Ka), and 2) redside shiner colonized British Columbia post-glacially from a single refugium in the Upper Columbia River drainage. Results: Genetic diversification in redside shiner began in the mid to late Pleistocene, but was not associated with the LGM. Different clades of redside shiner were distributed in multiple glacial age refugia, and each clade retains a signature of population expansion, with clades having secondary contact in some areas. Conclusions: Divergence times between redside shiner populations in the Bonneville Basin and the Upper Snake/Columbia River drainage precede the Bonneville Flood, thus it is unlikely that redside shiner invaded the Bonneville Basin during this flooding event. All but one British Columbia population of redside shiner are associated with the Upper Columbia River drainage with the lone exception being a population near the coast, suggesting that the province as a whole was colonized from multiple refugia, but the inland British Columbia redside shiner populations are affiliated with a refugium in the Upper Columbia River drainage.
Article
... Comparative phylogeography of north - ... analyses of multiple animal species has allowed a multi-kingdom approach to European phylogeography (see Chapter ... of plants and animals that exhibit a Cascade/Sierran pattern of genetic differentiation in northwestern North America . ...
Article
• Premise of the study: The Compositae (Asteraceae) are a large and diverse family of plants, and the most comprehensive phylogeny to date is a meta-tree based on 10 chloroplast loci that has several major unresolved nodes. We describe the development of an approach that enables the rapid sequencing of large numbers of orthologous nuclear loci to facilitate efficient phylogenomic analyses. • Methods and Results: We designed a set of sequence capture probes that target conserved orthologous sequences in the Compositae. We also developed a bioinformatic and phylogenetic workflow for processing and analyzing the resulting data. Application of our approach to 15 species from across the Compositae resulted in the production of phylogenetically informative sequence data from 763 loci and the successful reconstruction of known phylogenetic relationships across the family. • Conclusions: These methods should be of great use to members of the broader Compositae community, and the general approach should also be of use to researchers studying other families.
Chapter
Inference of shared history, based on congruence between the topological structure of cladograms for lineages, and the geographic distributions of lineages is becoming increasingly important to ecological and behavioral evolutionary studies. Yet clades which are topologically and geographically congruent may have originated at very different times, a phenomenon known as “pseudocongruence”. In this chapter, we explore the strengths and weaknesses of molecular data in identifying cases of pseudocongruence for the simplest possible case: when sister taxa are found in neighboring areas now separated by a disjunction or barrier of some kind. We focus on studies of littoral marine invertebrates in three well-studied marine model systems: coastal species from the southeastern United States, species pairs divided by the final closure of the Panama Seaway ca. 3 million years ago, and taxa which took part in the trans-Arctic interchange between the Pacific and Atlantic Oceans following the opening of the Bering Strait ca. 3.5 million years ago. For each of these model systems, we present molecular divergence, population genetic, and in some cases, paleontological evidence that sister taxa in neighboring areas likely diverged at different times. We also show that comparison of multiple data sets from the same taxa can reveal cases of rate variation. While comparisons of degree of molecular divergence may be confounded by rate variation, comparisons of phylogeographic structure also have the potential to distinguish between cases of strong geographical subdivision and recent gene flow.
Article
Although many NGS read pre-processing tools already existed, we could not find any tool or combination of tools which met our requirements in terms of flexibility, correct handling of paired-end data, and high performance. We have developed Trimmomatic as a more flexible and efficient pre-processing tool, which could correctly handle paired-end data. The value of NGS read pre-processing is demonstrated for both reference-based and reference-free tasks. Trimmomatic is shown to produce output which is at least competitive with, and in many cases superior to, that produced by other tools, in all scenarios tested. Trimmomatic is licensed under GPL V3. It is cross-platform (Java 1.5+ required) and available from http://www.usadellab.org/cms/index.php?page=trimmomatic CONTACT: usadel@bio1.rwth-aachen.de SUPPLEMENTARY INFORMATION: Manual and source code are available from http://www.usadellab.org/cms/index.php?page=trimmomatic.
Article
Lewisia kelloggii has been understood as a rare plant with a disjunct range in California and Idaho. Examination of herbarium specimens and analysis of isozymes in 6 Idaho and 7 California populations revealed consistent differences between plants of the 2 states. Fixed differences in alleles at 2 loci (AAT2 and PGI1) distinguished Idaho from California plants. Genetic identities based on isozymes between Idaho and California populations averaged 0.58, lower than the average for congeneric plant species. Idaho plants were smaller than most California plants, but California plants were variable. The most consistent morphological difference between Idaho and California specimens was the difference in the number of glands on the margins of bracts and sepals. Idaho plants had 0 (-5) pink glands on each margin of these organs, all on teeth near the tips. In California plants these organs had 12-25 glands on each margin, the distal ones elevated on teeth and the proximal ones sessile. We recognize the Idaho plants as a new species, L. sacajaweana, and retain the name L. kelloggii for the California populations.
Article
Phylogenies are increasingly used in all fields of medical and biological research. Moreover, because of the next generation sequencing revolution, datasets used for conducting phylogenetic analyses grow at an unprecedented pace. RAxML (Randomized Axelerated Maximum Likelihood) is a popular program for phylogenetic analyses of large datasets under maximum likelihood. Since the last RAxML paper in 2006, it has been continuously maintained and extended to accommodate the increasingly growing input datasets and to serve the needs of the user community. I present some of the most notable new features and extensions of RAxML, such as a substantial extension of substitution models and supported data types, the introduction of SSE3, AVX, and AVX2 vector intrinsics, techniques for reducing the memory requirements of the code, and a plethora of operations for conducting post-analyses on sets of trees. In addition, an up-to-date, 50-page user manual covering all new RAxML options is available. The code is available under GNU GPL at https://github.com/stamatak/standard-RAxML. Contact: Alexandros.Stamatakis@h-its.org.
Article
Despite the strength of climatic variability at high latitudes and upper elevations, we still do not fully understand how plants in North America that are distributed between Arctic and alpine areas responded to the environmental changes of the Quaternary. To address this question, we set out to resolve the evolutionary history of the King's Crown, Rhodiola integrifolia, using multi-locus population genetic and phylogenetic analyses in combination with ecological niche modeling. Our population genetic analyses of multiple anonymous nuclear loci revealed two major clades within R. integrifolia that diverged from each other ~700 kya: one occurring in Beringia to the north (including members of subspecies leedyi and part of subspecies integrifolia), and the other restricted to the Southern Rocky Mountain refugium in the south (including individuals of subspecies neomexicana and part of subspecies integrifolia). Ecological niche models corroborate our hypothesized locations of refugial areas inferred from our phylogeographic analyses and revealed some environmental differences between the regions inhabited by its two subclades. Our study underscores the role of geographic isolation in promoting genetic divergence and the evolution of endemic subspecies in R. integrifolia. Furthermore, our phylogenetic analyses of the plastid spacer region trnL-F demonstrate that among the native North American species, R. integrifolia and R. rhodantha are more closely related to one another than either is to R. rosea. An understanding of these historic processes lies at the heart of making informed management decisions regarding this and other Arctic-alpine species of concern in this increasingly threatened biome.
Article
After a transition from sexuality to asexuality, the evolutionary dynamics in apomictic lineages will largely depend on the frequency of recombination. We evaluated the presence and extent of asexuality and recombination within populations of the Easter daisy, Townsendia hookeri, from the Yukon Territory, Canada. Amplified fragment-length polymorphism (AFLP) fingerprints were used to genotype 78 individuals from four populations. Multilocus AFLP genotypes from each population were subjected to four tests for deviations from free recombination among loci, and the long-term frequency of sexuality was estimated for each population with a novel procedure. In addition, a sample of individuals was surveyed for genome size using flow cytometry, and pollen was assayed for male fertility. One male-fertile, diploid population showed evidence of rampant recombination. Two male-sterile populations (i.e., with aborted anthers) were tetraploid and asexual. The remaining population was male-sterile and included both triploids and tetraploids. Evidence of both sexuality and asexuality was uncovered in this mixed-ploidy population, at an equilibrium rate of approximately three sexual events every two generations. The presence and extent of sexuality differed with ploidy, while cryptic sex was uncovered within a morphologically asexual population, thus reinforcing the power of genome surveys to assess reproductive dynamics at the limit of a plant's geographical range.
Article
Tellima grandiflora, a herbaceous, diploid (2n = 14) perennial, is distributed from the peninsula and panhandle of Alaska to central California. Restriction site variation of chloroplast DNA was surveyed in 51 populations representing the geographic range of T. grandiflora using 20 endonucleases. Two well-differentiated clades of populations differing by 19 restriction site mutations and several length mutations are geographically structured. A northern group comprises populations from Alaska to central Oregon; populations from central Oregon to San Francisco, California, form a southern group. The southern lineage of the monotypic Tellima appears to have obtained its chloroplast genome via ancient hybridization with a species of Mitella. Although northern and southern lineages have well-differentiated chloroplast genomes, populations of T. grandiflora show a high degree of genetic similarity of nuclear-encoded allozymes; furthermore, no apparent morphological differences characterize the lineages. Significantly, several populations of T. grandiflora that possess the typical southern chloroplast genome also occur disjunctly on Prince of Wales Island, Alaska, and the Olympic Peninsula, Washington. Because both areas are proposed glacial refugia, we suggest that past glaciation may have created discontinuities in the geographic distribution of T. grandiflora. Following glaciation, migration of once-isolated populations possessing different chloroplast genomes resulted in the formation of a continuous geographic distribution with a major organellar discontinuity. Additional support for this hypothesis is provided by the presence of well-differentiated northern and southern chloroplast DNA lineages in Tolmiea menziesii, a species having a geographic distribution and life history traits similar to those of Tellima.
Article
Motivation: Advances in sequencing technology continue to deliver increasingly large molecular sequence data sets that are often heavily partitioned in order to accurately model the underlying evolutionary processes. In phylogenetic analyses, partitioning strategies involve estimating conditionally independent models of molecular evolution for different genes and different positions within those genes, requiring a large number of evolutionary parameters that have to be estimated, leading to an increased computational burden for such analyses. The past two decades have also seen the rise of multi-core processors, both in the CPU and GPU processor markets, enabling massively parallel computations that are not yet fully exploited by many software packages for multipartite analyses. Results: We here propose a Markov chain Monte Carlo (MCMC) approach using an adaptive multivariate transition kernel to estimate in parallel a large number of parameters, split across partitioned data, by exploiting multi-core processing. Across several real-world examples, we demonstrate that our approach enables the estimation of these multipartite parameters more efficiently than standard approaches that typically employ a mixture of univariate transition kernels. In one case, when estimating the relative rate parameter of the non-coding partition in a heterochronous data set, MCMC integration efficiency improves by over 14-fold. Availability: Our implementation is part of the BEAST code base, a widely used open source software package to perform Bayesian phylogenetic inference. Contact: guy.baele@kuleuven.be Supplementary information: Supplementary data are available at Bioinformatics online.
Article
Premise of the study: Estimating phylogenetic relationships in relatively recent evolutionary radiations is challenging, especially if short branches associated with recent divergence result in multiple gene tree histories. We combine anchored enrichment next-generation sequencing with species tree analyses to produce a robust estimate of phylogenetic relationships in the genus Protea (Proteaceae), an iconic radiation in South Africa. Methods: We sampled multiple individuals within 59 out of 112 species of Protea and 6 outgroup species for a total of 163 individuals, and obtained sequences for 498 low-copy, orthologous nuclear loci using anchored phylogenomics. We compare several approaches for building species trees, and explore gene tree-species tree discrepancies to determine whether poor phylogenetic resolution reflects a lack of informative sites, incomplete lineage sorting, or hybridization. Key results: Phylogenetic estimates from species tree approaches are similar to one another and recover previously well-supported clades within Protea, in addition to providing well-supported phylogenetic hypotheses for previously poorly resolved intrageneric relationships. Individual gene trees are markedly different from one another and from species trees. Nonetheless, analyses indicate that differences among gene trees occur primarily concerning clades supported by short branches. Conclusions: Species tree methods using hundreds of nuclear loci provided strong support for many previously unresolved relationships in the radiation of the genus Protea. In cases where support for particular relationships remains low, these appear to arise from few informative sites and lack of information rather than strongly supported disagreement among gene trees.
Article
Aim: Geologically dynamic areas often harbour remarkable levels of biodiversity. Among other factors, mountain building is assumed to be a precondition for species radiation, and yet, the potential role of immigration as a source of biodiversity prior to radiation is often neglected. Here, we studied the biogeographical history of the large genus Saxifraga to unravel the role played by the Qinghai‐Tibet Plateau (QTP) for the diversification of this genus and to understand factors that have led to the establishment of high biodiversity in and around this region. Location: QTP and surrounding mountain ranges and worldwide distribution range of Saxifraga. Methods: Using a total of 420 taxa (321 ingroup taxa) comprising more than 60% of extant Saxifraga species, we studied the evolutionary history of Saxifraga by performing phylogenetic analyses (maximum likelihood and Bayesian inference on nuclear ITS and plastid trnL–trnF, matK sequences), divergence time estimation (using uncorrelated log‐normal clock models and four fossil constraints in BEAST) and ancestral range estimation (using BioGeoBEARS). Results: Saxifraga originated in North America around 74 (64–83) Ma, dispersed to South America and northern Asia during its early diversification and colonized Europe and the QTP region by the Late Eocene. The QTP region was colonized several times independently, followed in some lineages by rapid radiations, temporally coinciding with recent uplifts of the Hengduan Mountains at the southeastern fringe of the QTP. Subsequently, several lineages dispersed out of Tibet. Main conclusions: Immigration, recent rapid radiation and lineage persistence were all important processes for the establishment of a rich species stock of Saxifraga in the QTP region. Because floristic exchanges between the neighbouring areas and the QTP region were bi‐directional, the spatio‐temporal evolution of Saxifraga contrasts with the ‘out of QTP’ pattern, which has often been assumed for northern temperate plants.
Article
The floras of mountain ranges, and their similarity, beta diversity and endemism, are indicative of processes of community assembly; they are also the initial conditions for coming disassembly and reassembly in response to climate change. As such, these characteristics can inform thinking on refugia. The published floras or approximations for 42 mountain ranges in the three major mountain systems (Sierra-Cascades, Rocky Mountains and Great Basin ranges) across the western USA and southwestern Canada were analysed. The similarity is higher among the ranges of the Rockies while equally low among the ranges of the Sierra-Cascades and Great Basin. Mantel correlations of similarity with geographic distance are also higher for the Rocky Mountains. Endemism is relatively high, but is highest in the Sierra-Cascades (due to the Sierra Nevada as the single largest range) and lowest in the Great Basin, where assemblages are allochthonous. These differences indicate that the geologic substrates of the Cascade volcanoes, which are much younger than any others, play a role in addition to geographic isolation in community assembly. The pattern of similarity and endemism indicates that the ranges of the Cascades will not function well as stepping stones and the endemic species that they harbor may need more protection than those of the Rocky Mountains. The geometry of the ranges is complemented by geology in setting the stage for similarity and the potential for refugia across the West. Understanding the geographic template as initial conditions for the future can guide the forecast of refugia and related monitoring or protection efforts.
Article
With c. 85 species, the genus Micranthes is among the larger genera of the Saxifragaceae. It is only distantly related to the morphologically similar genus Saxifraga, in which it has frequently been included as Saxifraga section Micranthes. To study the molecular evolution of Micranthes, we analysed nuclear ribosomal (internal transcribed spacer, ITS) and plastid (trnL–trnF) DNA sequences in a comprehensive set of taxa comprising c. 75% of the species. The molecular phylogenetic tree from the combined dataset revealed eight well-supported clades of Micranthes. These clades agree in part with previously acknowledged subsections or series of Saxifraga section Micranthes. As these eight groups can also be delineated morphologically, we suggest that they should be recognized as sections of Micranthes. New relationships were also detected for some species and species groups, e.g. section Davuricae sister to sections Intermediae and Merkianae, and M. micranthidifolia as a member of section Micranthes. Species proposed to be excluded from the genus Micranthes for morphological reasons were resolved in the molecular tree in Saxifraga. Many morphological characters surveyed were homoplasious to varying extents. Micromorphological characters support comparatively well the clades in the phylogenetic tree. An updated nomenclature and a taxonomic conspectus of sections and species of Micranthes are provided. © 2015 The Linnean Society of London, Botanical Journal of the Linnean Society, 2015.
Article
For this study data from morphology, anatomy, cytology, ecology, and reproductive biology were used to circumscribe eleven taxa of Saxifraga section Boraphila subsection Integrifoliae and to assess their relationships. Most taxa have strong protandry as a method of ensuring outcrossing, and taxa in the S. integrifolia complex (S. integrifolia, S. apetala, and S. nidifica) have been shown to be self-compatible. Saxifraga californica has been shown to be self-incompatible but not protandrous. In occasional populations of S. integrifolia, gynodioecy occurs as an alternate outcrossing mechanism. Limited hybridization experiments and morphological analysis of putative natural hybrids suggest that at least one species, S. hitchcockiana Elvander, nom. nov., evolved by reticulate evolution. Hybrids have been produced and putative hybrids in nature have been found between taxa of subsection Integrifoliae and subsection Nivali-virginienses. Chromosome numbers have been determined for all but two taxa: S. apetala (n = 38), S. aprica (n = 10), S. integrifolia (n = 19), S. hitchcockiana (n = 38), S. nidifica (n = 10, 19), S. oregana (n = 19, 38), S. rhomboidea (n = 10, 19, 20, 28), and S. tempestiva (n = 5). It is hypothesized that combinations of self-compatibility, hybridization, polyploidy, and aneuploid reduction have been significant influences on speciation in subsection Integrifoliae and probably in other subsections of section Boraphila. The treatment includes a new combination, S. nidifica var. claytoniifolia (Canby ex Small) Elvander. A key to the species and varieties as well as distribution maps are provided. All numbered collections cited are listed in an index.
Article
Aim: Complex migration histories with repeated range shifts during the Pleistocene characterize many arctic–alpine plants. Identifying these patterns provides insight into the causes of current distributions and possible responses to climate warming. We investigated patterns of genetic variation in North American species of a widespread Northern Hemisphere plant group to test different hypotheses of origin and refugial persistence. Location: North America. Methods: We used a phylogeographical approach to investigate the geographical origins of North American Rhodiola, especially the widespread western species R. integrifolia. Populations were sampled over much of the North American range (66 of R. integrifolia, 6 of R. rhodantha and 4 of R. rosea). We performed maximum likelihood phylogenetic analyses on sequences of the nuclear internal transcribed spacer (ITS) region and the plastid trnH–psbA intergenic spacer, and analysed geographical patterns of haplotype distribution and genetic diversity using plastid restriction‐site and sequence data from all populations. Results: Separate lineages of Rhodiola dispersed into North America via the Bering Strait (R. integrifolia and R. rhodantha) and amphi‐Atlantic regions (R. rosea). Genetic patterns within R. integrifolia indicate southward spread from Beringia, with subsequent persistence in both northern and southern refugia. Rhodiola integrifolia and the regional endemic R. rhodantha show evidence of past hybridization (resulting in chloroplast capture) where their ranges overlap in the southern Rocky Mountains. However, phylogenetic evidence suggests that both species are closely related to Asian taxa and probably migrated independently into North America. Main conclusions: The current geographical distributions and genetic structure of North American Rhodiola are the result of a complex history including multiple migrations into North America, persistence in multiple refugia during glaciations, and past hybridization. This complexity of processes within one group underscores the diverse history of arctic–alpine plants and the likelihood of divergent responses to changing environments.
Article
1. Here, I present a new, multifunctional phylogenetics package, phytools, for the R statistical computing environment. 2. The focus of the package is on methods for phylogenetic comparative biology; however, it also includes tools for tree inference, phylogeny input/output, plotting, manipulation and several other tasks. 3. I describe and tabulate the major methods implemented in phytools, and in addition provide some demonstration of its use in the form of two illustrative examples. 4. Finally, I conclude by briefly describing an active web-log that I use to document present and future developments for phytools. I also note other web resources for phylogenetics in the R computational environment.
Article
Aim: Coalescent models enable the direct estimation of parameters with clear biological relevance (i.e. divergence time, migration rate and rate of expansion), but they have typically been applied to phylogeographical research without a priori assessment of their fit to the empirical system. Here we explore the extent to which phylogeographical inference can be misled by evaluating the fit of several population genetic models to empirical data collected from the sandbar willow, Salix melanopsis. Location: The Pacific Northwest mesic forest of North America. Methods: We collected sequence data from five loci in 145 individuals. We assessed model fit in: (1) models delimiting previously proposed races within S. melanopsis; (2) historical biogeographical models, each describing the timing and pattern of diversification; and (3) coalescent models that correspond to those implemented in popular software packages such as IMa, LAMARC, and Migrate‐n. Results: We found little evidence for previous hypotheses of cryptic races delimited by habitat type (mesic, lowland or subalpine); rather, our results suggested that these variants originated from the same source population. Historical biogeographical models demonstrate that S. melanopsis has recently expanded from a single refugial population, probably located in the northern Rocky Mountains. An analysis using approximate Bayesian computation indicated that the single population expansion model implemented in LAMARC is a better fit to the data than multi‐population models incorporating migration and/or divergence as implemented in Migrate‐n and IMa, suggesting that the parameters estimated from the latter are potentially misleading for this system. Main conclusions: Our research highlights the importance of assessing model fit in addition to estimating parameters to understand evolutionary processes. Taken together, they allow us to infer the historical demography of S. melanopsis in a manner that is not biased by previous work in the system.
Article
When small RNA is sequenced on current sequencing machines, the resulting reads are usually longer than the RNA and therefore contain parts of the 3' adapter. That adapter must be found and removed error-tolerantly from each read before read mapping. Previous solutions are either hard to use or do not offer required features, in particular support for color space data. As an easy to use alternative, we developed the command-line tool cutadapt, which supports 454, Illumina and SOLiD (color space) data, offers two adapter trimming algorithms, and has other useful features. Cutadapt, including its MIT-licensed source code, is available for download at http://code.google.com/p/cutadapt/
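The trimming task described in this abstract can be illustrated with a minimal sketch. This is not cutadapt's algorithm or interface: the function name, parameter names, default values, and example sequences below are all assumptions chosen for illustration, and the sketch tolerates substitutions only, with no insertions or deletions. It scans the read for the leftmost position where the adapter, or a prefix of it cut off by the end of the read, matches within an allowed error rate, and removes everything from that position onward.

```python
def trim_3prime_adapter(read, adapter, max_error_rate=0.1, min_overlap=3):
    """Return `read` with a 3' adapter (full or truncated) removed.

    Hypothetical helper for illustration only. Mismatches are counted
    position by position (substitutions only, no indels).
    """
    for start in range(len(read)):
        # Length of the adapter prefix that fits between `start` and the read end.
        overlap = min(len(adapter), len(read) - start)
        if overlap < min_overlap:
            break  # remaining overlaps are too short to be trusted
        window = read[start:start + overlap]
        mismatches = sum(1 for a, b in zip(window, adapter) if a != b)
        # Allowed mismatches scale with the length of the overlap.
        if mismatches <= int(max_error_rate * overlap):
            return read[:start]
    return read


# Made-up sequences for the demonstration.
ADAPTER = "AGATCGGAAGAGC"
INSERT = "TTGACCTGAATC"

exact = trim_3prime_adapter(INSERT + ADAPTER, ADAPTER)
one_error = trim_3prime_adapter(INSERT + "AGATCGGTAGAGC", ADAPTER)
partial = trim_3prime_adapter(INSERT + "AGATC", ADAPTER)
untouched = trim_3prime_adapter(INSERT, ADAPTER)
```

In this example the first three calls recover the 12-base insert (from an exact adapter, an adapter with one substitution, and an adapter truncated to its first five bases), while the last call leaves an adapter-free read unchanged.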