Article

A rapid Bootstrap algorithm for the RAxML Web Servers

Authors:
To read the full-text of this research, you can request a copy directly from the authors.


... The concatenated alignment was analyzed using PartitionFinder v. 2.1.1 (Lanfear et al. 2012) to determine the best-fitting nucleotide substitution models. For RAxML (Stamatakis et al. 2008) we ran the analysis with a partitioned General Time Reversible model with invariant sites, gamma-distributed rate heterogeneity, and estimated initial base frequencies (Gu et al. 1995). We specified three putative candidate partitions for each exon and one for each intron. ...
... We inferred evolutionary relationships for the concatenated sequence alignment under maximum likelihood (ML) using the RAxML-HPC v.8 workflow on the CIPRES Science Gateway (Stamatakis et al. 2008; Miller et al. 2010) with nodal support inferred from 1,000 bootstrap replicates using both the full data set as well as for nuclear loci only (Supplementary Data SD5). We performed a Bayesian species tree analysis using *BEAST v. 2.6.0, also implemented in CIPRES (Miller et al. 2010; Bouckaert et al. 2014, 2019; Ogilvie et al. 2017). ...
Article
Full-text available
The Philippine archipelago hosts an exceptional diversity of murid rodents that have diversified following several independent colonization events. Here, we report the discovery of a new species of rodent from Mt. Kampalili on eastern Mindanao Island. Molecular and craniodental analyses reveal this species as a member of a Philippine “New Endemic” clade consisting of Tarsomys, Limnomys, and Rattus everetti (tribe Rattini). This new species of “shrew-mouse” is easily distinguished from its relatives in both craniodental and external characteristics including a long, narrow snout; small eyes and ears; short, dark, dense fur dorsally and ventrally; stout body with a tapering, visibly haired tail shorter than head and body length; stout forepaws; bulbous and nearly smooth braincase; narrow, tapering rostrum; short incisive foramina; slender mandible; and narrow, slightly opisthodont incisors. This new genus and species of murid rodent illustrates that murids of the tribe Rattini have exhibited greater species and morphological diversification within the Philippines than previously known and provides evidence that Mt. Kampalili represents a previously unrecognized center of mammalian endemism on Mindanao Island that is deserving of conservation action.
... For each marker and the combined analyses (Farris et al., 1996;Nixon and Carpenter, 1996) of all Sanger data only and plastomes only, maximum likelihood tree searches and ML bootstrapping were conducted using RAxML-HPC2 on TG ver. 7.2.8 on CIPRES web server (Stamatakis et al., 2008;Miller et al., 2010), with 1000 rapid bootstrap (BS) analyses followed by a search for the best-scoring tree in a single run (Stamatakis et al., 2008). ...
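The "rapid bootstrap followed by a search for the best-scoring tree in a single run" described in this excerpt corresponds to RAxML's `-f a` mode. As a rough sketch only, the command line for such a run could be assembled as below. The file name `alignment.phy`, the run name `example_run`, the seed value, and the helper function itself are hypothetical; the cited studies ran RAxML through CIPRES, whose exact invocation is not given in the excerpts.

```python
import shlex


def rapid_bootstrap_command(alignment, run_name, replicates=1000,
                            seed=12345, model="GTRGAMMA", binary="raxmlHPC"):
    """Build the argument list for a RAxML rapid bootstrap analysis.

    All default values are illustrative assumptions, not settings taken
    from any of the cited studies.
    """
    return [
        binary,
        "-f", "a",               # rapid bootstrap + best-scoring ML tree search in one run
        "-m", model,             # substitution model, e.g. GTRGAMMA
        "-p", str(seed),         # random seed for the parsimony starting trees
        "-x", str(seed),         # random seed that switches on rapid bootstrapping
        "-N", str(replicates),   # number of bootstrap replicates
        "-s", alignment,         # input alignment file
        "-n", run_name,          # suffix used for the output files
    ]


cmd = rapid_bootstrap_command("alignment.phy", "example_run")
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run` on a machine where RAxML is installed; the sketch only prints the command and does not execute it.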
Article
Adder's tongue ferns or Ophioglossaceae are best known among evolutionary biologists and botanists for having the highest chromosome count of any known organism, the presence of sporophores, and simple morphology. Previous studies recovered and strongly supported the monophyly of the family and the two multi-generic subfamilies, Botrychioideae and Ophioglossoideae, but the relationships among these and two other subfamilies (Helminthostachyoideae and Mankyuoideae) are not well resolved, preventing us from understanding the character evolution. The monophyly of and the relationships in the species-rich genus, Ophioglossum, have not been well understood. In this study, new phylogenetic trees are reconstructed based on four datasets: Sanger sequences of eight plastid markers of 184 accessions, 22 plastomes (12 are new), 29 morphological characters, and combined Sanger and morphological data. Our major results include: (1) the relationships among the four subfamilies are well resolved and strongly supported in Bayesian and parsimony analyses based on plastomes: Mankyua is sister to the rest, followed by Ophioglossoideae which are sister to Helminthostachys + Botrychioideae; (2) Sanger data, plastomes, and combined Sanger and morphological data recovered and strongly supported the monophyly of Ophioglossum in its current circumscription (sensu lato; s.l.) in Bayesian and/or parsimony analyses; (3) within Ophioglossum s.l., four deeply diverged clades are identified and the relationships among the four clades are well resolved; (4) evolution of 34 morphological characters is analyzed in the context of the new phylogeny, among which shape of rhizomes, germination time of spores, shape of early gametophytes, and a number of other characters are found to contain interesting phylogenetic signal; and (5) based on the new phylogeny and character evolution, we propose a new classification of Ophioglossaceae in which the currently circumscribed Ophioglossum is divided into four genera including three new ones: Goswamia, Haukia, and Whittieria considering their molecular, morphological, ecological, and biogeographical distinctiveness.
... A maximum credible tree was found in TreeAnnotator v.2.2.1 (Rambaut and Drummond, 2015) and edited in FigTree 1.4.3 (Rambaut, 2016). The ML analyses used RAxML v7.2.8 (Stamatakis, 2006; Stamatakis et al., 2008) on the CIPRES Science Gateway v3.1 (https://www.phylo.org, Miller et al., 2010) and the GTRGAMMA model for each gene, as recommended. ...
... Miller, et al., 2010) and the GTRGAMMA model for each gene, as recommended. The analyses used the rapid bootstrapping algorithm (Stamatakis et al., 2008) with 500 replicates for datasets 1-4. ...
Article
Full-text available
The striped-back shrew group demonstrates remarkable variation in skull and body size, tail length, and brightness of the dorsal stripe; and karyotypic and DNA variation has been reported in recent years. In this study, we investigated the phylogenetic structure of the group, as well as speciation patterns and demographic history in Mountains of Southwestern China and adjacent mountains, including the southern Himalayas, Mts. Bashan, Wushan, and Qinling. We sequenced a total of 462 specimens from 126 localities in the known range of the group and analyzed them based on 6.2 kb of sequence data from two mitochondrial, six nuclear, and two Y chromosome markers. Phylogenetic analyses of the concatenated mtDNA data revealed 14 sympatric and independently evolving lineages within the striped-back shrew group, including Sorex bedfordiae, S. cylindricauda, S. excelsus, S. sinalis and several cryptic species. All concatenated data (ten genes) showed a consistent genetic structure compared to the mtDNA lineages for the group, whereas the nuclear and the Y chromosome data showed a discordant genetic structure compared to the mtDNA lineages for the striped-back shrew group. Species delimitation analyses and deep genetic distance clearly support the species status of the 14 evolving lineages. The divergence time estimation suggested that the striped-back shrew group began to diversify from the middle Pleistocene (2.34 Ma), then flourished at approximately 2.14 Ma, followed by a series of rapid diversifications through the Pleistocene. Our results also revealed multiple mechanisms of speciation in the Mountains of Southwestern China and Adjacent Mountains with complex landscapes and climate. The uplifting of the Qinghai-Tibetan Plateau, Quaternary climate oscillations, riverine barriers, ecological elevation gradients, topographical diversity, and their own low dispersal capacity may have driven the speciation, genetic structure, and phylogeographic patterns of the striped-back shrew group.
... Phylogenetic trees were constructed by the maximum-likelihood (ML) and Bayesian analysis (BI) methods using the entire cp genome. The ML analyses were performed using RAxML-HPC2 on XSEDE (8.2.12) at the CIPRES Science Gateway website (Stamatakis et al. 2008;Miller et al. 2015) (http://www.phylo.org/sub_sections/portal/), as suggested with 1000 bootstrap replicates. ...
Article
Lilium concolor var. pulchellum is a perennial herbaceous plant with high ornamental and edible value; it is a critical breeding parent of Asiatic hybrids. In this study, we reported the complete chloroplast genome of L. concolor var. pulchellum. The total size of the genome is 152,126 bp with a GC content of 37.0%. It has a conserved quadripartite structure comprising 136 genes, including 38 tRNA genes, 8 rRNA genes, 83 protein-coding genes, and 7 pseudogenes. Phylogenetic analysis strongly supported a close relation between L. concolor var. pulchellum and L. callosum. The complete plastome sequence of L. concolor var. pulchellum could provide useful information for phyletic evolution of the genus Lilium.
... Afterward, the evolutionary history was inferred by using the maximum-likelihood (ML) approach in MEGA7.0 (Kumar et al. 2016) in the Tamura-Nei substitution model (Kumar et al. 2018). Bootstrap (BS) value was calculated through 1000 times of repeated analyses (Stamatakis et al. 2008) (Figure 1). As expected, L. buergeri closely grouped with L. bicolor and L. cuneata in genus Lespedeza. ...
Article
Full-text available
The complete chloroplast genome sequence of Lespedeza buergeri is presented in this report. It is 149,065 bp in length and divided into four distinct regions: a small single copy (SSC) region of 18,934 bp, a large single copy (LSC) region of 82,476 bp, and a pair of inverted repeat regions of 23,826 bp. The annotation of the L. buergeri complete chloroplast genome predicted a total of 123 genes (77 protein-coding genes, 38 transfer RNA genes, and 8 ribosomal RNA genes). Phylogenetic analysis with the reported chloroplast genomes revealed that L. buergeri is nested in the genus Lespedeza of the Fabaceae family. Furthermore, L. buergeri exhibited a close relationship with Lespedeza bicolor and Lespedeza cuneata. The results of this study might contribute to further investigation of the evolutionary relationships of the family Fabaceae.
... Maximum likelihood analyses were performed in RAxML-HPC BlackBox v8.2.9 (Stamatakis, 2014), using the GTRCAT substitution model. Bootstrap values were calculated from 1000 replicates (Stamatakis et al., 2008). The resulting tree was visualized as a rooted phylogram using FigTree v1.4.2 (tree.bio.ed.ac.uk) and exported to Inkscape v1.0 BETA (https://inkscape. ...
Article
Full-text available
The Caribbean is influenced by Sahara Dust Storms (SDS) every year. SDS can transport a diversity of microorganisms, including potential pathogens of humans, animals, and plants. In fact, SDS have been suggested as a source of Aspergillus sydowii, reported to cause aspergillosis disease in gorgonian sea fans. However, the diversity of fungal spores in SDS remains unknown and there are conflicting studies as to whether A. sydowii spores are capable of crossing the Atlantic Ocean. In this study, we estimated the fungal diversity of the Saharan dust trapped on air filters during five days of a ship’s trajectory in the eastern Atlantic during a dust event. Also, we investigated whether SDS is a potential source of opportunistic fungal pathogens. We isolated 30 morphospecies including the ascomycetes Aspergillus (33% of identified isolates), Thielavia (18%), Penicillium (12%), Chaetomium strumarium (3%), Periconia (2%), and Cladosporium sphaerospermum (1%). Many of these groups include opportunistic pathogens. Species diversity was similar across days but with significant differences between Days 3 vs 5 and between hazy vs clear days. We report for the first time that Thielavia, Chaetomium strumarium and Periconia are present in SDS and are capable of surviving long-distance transport in SDS. The presence of A. sydowii isolates is consistent with reports of SDS as a source of inoculum for sea fan aspergillosis. This could signify that SDS are carriers of viable, potentially pathogenic spores which can be deposited on terrestrial or aquatic substrates.
... and RAxML 7.0.3 (Stamatakis et al. 2008, Ronquist & Huelsenbeck 2003). The BI analysis was run on the MrBayes v3.1.2 ...
Article
We report a new species, Moelleriella puerensis, a fungal pathogen infecting scale insect nymphs, from Puer City, Yunnan Province, Southwestern China. Its diagnostic characteristics include pale yellow and thin pulvinate stromata, ovoid tubercles processes developing on the periphery of the stroma, a fully immersed perithecia, orange ostioles, cylindrical asci, and filiform ascospores disarticulated into short secondary spores. Teleomorph and anamorph typically existed in the same stroma, exclusively anamorphic stromata, with pale yellow conidiomata aggregated in the center of the stroma and several conidiomata per stroma fusing with neighboring conidiomata and existing paraphyses. Morphological observations and phylogenetic analyses of combined nrLSU, rpb1, and tef-1α sequence data confirmed the validity of M. puerensis.
... To infer phylogenetic trees, Bayesian inference (BI) and maximum likelihood (ML) analyses were performed on the CIPRES Science Gateway (Miller et al., 2010) using MrBayes 3.2.7 (Ronquist et al., 2012) and RAxML 8.1.12 (Stamatakis et al., 2008), respectively. For BI analysis, the GTR+I+G model was selected. ...
Article
A new bisexual species of Rotylenchus is described and illustrated based on morphological, morphometric and molecular characterizations. Rotylenchus zhongshanensis sp. nov. is characterized by having a conoid lip region complying with the basic pattern for Hoplolaimidae, but with pharyngeal glands slightly overlapping the intestine dorsally and the cuticle abnormally thickened in the female tail terminus. Females have a robust stylet (30.1–33.8 μm). The pharyngeal gland has a short dorsal overlap (11.2–16.8 μm) on the intestine. The vulva is located at 48.0–56.5% of body length, and phasmids are pore-like, 4–6 annuli posterior to the anus. For males, phasmids are pore-like, 11–17 annuli posterior to the cloaca. The spicules are ventrally arcuate (21.0–28.5 μm), with a gubernaculum 5–8 μm long. The rRNA and mitochondrial COI genes were successfully sequenced from the assembled whole-genome sequences of the new species, and were used for reconstructing the phylogenetic relationships of the new species. A new strain of the cyto-endosymbiont Cardinium was also discovered from the genome sequences of R. zhongshanensis sp. nov. The 16S rRNA phylogeny analyses revealed that this new bacterial strain is close to that from cyst and root-lesion nematodes.
... Phylogenetic analyses were carried out using maximum likelihood (ML), maximum parsimony (MP), and Bayesian analysis (BI). The ML analysis was performed in the CIPRES Science Gateway platform (Miller et al. 2010) using RAxML-HPC v. 8 on XSEDE (Stamatakis 2006, 2014). The GTR+GAMMA model of nucleotide evolution was used. ...
Article
During an investigation of fungal saprobes of Yunnan Province, China, a species in Tetraplosphaeriaceae was collected from decaying stems of Saccharum arundinaceum (Poaceae). The taxon has oblong, dark brown to black, four-columned conidia with mostly three long apical appendages. Phylogenetic analyses of combined LSU, SSU, ITS, β-tubulin and tef1-α sequence data revealed that our collection is a member of Tetraploa and has a close affinity to T. aquatica, T. dashaoensis, and T. nagasakiensis. Based on the morphological comparison and multi-locus phylogeny, we introduce T. cylindrica as a new species. The newly discovered taxon is described, illustrated, and compared to related taxa. A taxonomic insight is presented based on our new data regarding the classification problems of asexual and sexual morphs of Tetraploa. In addition, this is the first report of Tetraploa species that occur on Saccharum arundinaceum. The results of the present study will expand our ecological knowledge of Tetraploa.
... ML for the COI data set was performed in MEGA X using the K2p model with 100 bootstraps. The ML tree for the second data set was estimated as implemented in RAxML for XSEDE (Stamatakis 2006, Stamatakis et al. 2008) on the CIPRES Science Gateway web server (Miller et al. 2010). For the ML analysis, we used a mixed partition model determined by PartitionFinder (Lanfear et al. 2012), including the GTR+G substitution model, which is the only model available in RAxML (Stamatakis 2006). ...
Article
Apple snails, family Ampullariidae, are conspicuous inhabitants of water bodies in Peruvian Amazonia, and they are a significant protein resource for mostly the native people. Despite recent efforts to resolve the evolutionary relationships within the genus Pomacea Perry, 1810, its diversity is still undoubtedly undersampled, and the identities of some species are not yet adequately known. DNA barcodes and phylogenetic analyses with COI and 16S rRNA mitochondrial markers have allowed us to discriminate apple snail species sold in open-air markets in the city of Iquitos, Peru, as well as the Peruvian giant species of Pomacea, which probably was referred to in the past as P. maculata Perry, 1810. From open-air markets and rivers surrounding Iquitos we identified P. nobilis (Reeve, 1856) and P. aulanieri (Deville & Huppé, 1850), along with 2 unidentified species of Pomacea, which we designate P. sp. 2 and P. sp. 3. A third unidentified species, P. sp. 1, which has the Spanish name “churo gigante” (giant apple snail), was only found in lagoons of the Huallaga and Napo rivers. Pomacea sp. 1 does not correspond to P. maculata, although it does belong to the P. canaliculata (Lamarck, 1822) clade. Pomacea sp. 2 was the only species with high sequence similarity to sequences deposited in GenBank, which belong to a Pomacea species introduced to Florida, USA.
... The alignment was visually inspected using AliView (Larsson, 2014). The best scoring maximum likelihood tree was estimated using RAxML (Stamatakis, 2006) using the rapid bootstrap analysis algorithm (Stamatakis et al., 2008) with 1000 bootstrap replicates and a general time-reversible (GTR) nucleotide substitution model with a gamma distribution for rate heterogeneity. A GTR model was chosen because it has been found to perform at least as well as other models in phylogenetic reconstruction under a variety of conditions (Arenas, 2015). ...
... Convergence of runs was confirmed when the average standard deviation was <0.02 with effective sample sizes >200. In addition, a maximum likelihood analysis was conducted using RAxML as implemented in GENEIOUS 10.2.6 (Kearse et al., 2012) with 1000 pseudoreplicates of nonparametric bootstrapping (Stamatakis et al., 2008). Trees were viewed in FIGTREE 1.4.3 (Rambaut, 2016) and rooted to Tulasnella eichleriana based on previous studies (Cruz et al., 2014;Linde et al., 2017). ...
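The "pseudoreplicates of nonparametric bootstrapping" mentioned in this excerpt are built by resampling alignment columns with replacement. The following is a minimal sketch of that resampling step alone, on a made-up three-taxon toy alignment; the taxon names, sequences, seed, and function name are all hypothetical, and this is not RAxML's or Geneious's implementation. In a full analysis a tree would then be inferred from every pseudoreplicate and support values read from how often each clade recurs.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one pseudoreplicate: columns drawn with replacement.

    `alignment` maps taxon names to equal-length aligned sequences.
    """
    n_sites = len(next(iter(alignment.values())))
    # Draw as many column indices as there are sites, with replacement.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}


# Toy data, invented for illustration only.
toy = {
    "taxon_A": "ACGTACGT",
    "taxon_B": "ACGTTCGT",
    "taxon_C": "ACGAACGA",
}
rng = random.Random(42)  # fixed seed so the resampling is repeatable
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
print(len(replicates), "pseudoreplicates of", len(toy["taxon_A"]), "sites each")
```

Because the same column indices are applied to every taxon, each resampled site remains a genuine column of the original alignment.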
Article
While many Australian terrestrial orchids have highly specialized mycorrhizal associations, we tested the hypothesis that the geographically widespread orchid genus Cryptostylis associates with a diversity of fungal species. We investigated the mycorrhizal associations of five Australian Cryptostylis species (27 sites sampled) and included limited sampling from three Asiatic Cryptostylis species (two sites), using fungal isolation and molecular approaches. Like related orchid genera, Tulasnellaceae formed the main fungal associations of the Cryptostylis species we sampled, although some ectomycorrhizal, ericoid and saprotrophic fungi were detected infrequently. Each species of Australian Cryptostylis associated with three to seven Tulasnella Operational Taxonomic Units (OTUs), except for C. hunteriana where only one Tulasnella OTU was detected. In total, eleven Tulasnella OTUs associated with Australian Cryptostylis. The Asiatic Cryptostylis associated with four different Tulasnella OTUs belonging to the same lineage as the Australian species. While five Tulasnella OTUs (T. australiensis, T. prima, T. warcupii, T. densa, and T. punctata) were used by multiple species of Australian Cryptostylis, the most commonly used OTU differed between orchid species. The association with different Tulasnella fungi by Cryptostylis species co-occurring at the same site suggests that in any given environmental condition, Cryptostylis species may intrinsically favour different fungal OTUs.
... = 29), and a second calibration age resulting from population parameter estimation in the coalescent. This divergence age was estimated for lindeniana + linearis clades in SNAPP (Bryant et al., 2012) implemented in Beast 2.6 (Bouckaert et al., 2019) on CIPRES Science Gateway version 3.3 (Stamatakis et al., 2008;Miller et al., 2010). For this age estimation, we included all sampled individuals of A. hirsuta, A. lindeniana, A. linearis, and A. concinna in the SCP. ...
Article
Full-text available
The topographic gradients of the Tropical Andes may have triggered species divergence by different mechanisms. Topography separates species’ geographical ranges and offers climatic heterogeneity, which could potentially foster local adaptation to specific climatic conditions and result in narrowly distributed endemic species. Such a pattern is found in the Andean centered palm genus Aiphanes. To test the extent to which geographic barriers and climatic heterogeneity can explain distribution patterns in Aiphanes, we sampled 34 out of 36 currently recognized species in that genus and sequenced them by Sanger sequencing and/or sequence target capture sequencing. We generated Bayesian, likelihood, and species-tree phylogenies, with which we explored climatic trait evolution from current climatic occupation. We also estimated species distribution models to test the relative roles of geographical and climatic divergence in their evolution. We found that Aiphanes originated in the Miocene in Andean environments and possibly in mid-elevation habitats. Diversification is related to the occupation of the adjacent high and low elevation habitats tracking high annual precipitation and low precipitation seasonality (moist habitats). Different species in different clades repeatedly occupy all the different temperatures offered by the elevation gradient from 0 to 3,000 m in different geographically isolated areas. A pattern of conserved adaptation to moist environments is consistent among the clades. Our results stress the evolutionary roles of niche truncation of wide thermal tolerance by physical range fragmentation, coupled with water-related niche conservatism, to colonize the topographic gradient.
... The samples from the stationary phases of the independent runs were pooled to obtain the final results. Maximum likelihood bootstrapping (1000 replicates) was performed in RAxML v.7.7.1 (Stamatakis et al., 2008) using the same model as in the Bayesian inferences, with the dataset partitioned by codon and support calculated from 1000 bootstrap pseudoreplicates. Uncorrected pairwise sequence divergence was calculated in MEGA7. ...
Article
Full-text available
The eared nightjars (Lyncornis, formerly Eurostopodus) comprise six taxa distributed from southern India and Southeast Asia to Sulawesi. Species limits in this group have not been evaluated since 1940. In this study, we use three datasets (morphology, acoustics and mitochondrial DNA) to assess the taxonomic status of taxa in this genus. Multivariate analyses of vocalizations and phylogenetic analysis of mitochondrial DNA both revealed the presence of four major groups. Morphological analyses also revealed four major groups, but these agreed only in part with those identified by vocalizations and DNA. Lyncornis macrotis cerviniceps from mainland Southeast Asia and the isolated Lyncornis macrotis jacobsoni on Simeulue Island, off northwest Sumatra, differed by six diagnostic plumage characters, but could not be distinguished by their vocalizations or mitochondrial DNA. Conversely, Lyncornis macrotis macrotis from the Philippines and Lyncornis macrotis macropterus from Sulawesi differed diagnosably in song and by 5% sequence divergence but could not be diagnosed by plumage. We adopt an integrative approach and propose to recognize five monotypic species: Lyncornis temminckii, Lyncornis cerviniceps (synonym: Lyncornis bourdilloni), Lyncornis jacobsoni, Lyncornis macrotis and Lyncornis macropterus. Our study illustrates that taxonomic revisions based on single lines of evidence can underestimate diversity and underscores the importance of using multiple datasets in species-level taxonomy.
... Maximum likelihood analysis was done using the online RAxML-HPC (8.2.4) on CIPRES Science Gateway V. 3.3 (Stamatakis 2006, 2014, Stamatakis et al. 2008, Miller et al. 2010) followed by the GTRGAMMA substitution model and 1,000 bootstrap replicates. The final tree was selected amongst suboptimal trees from each run by comparing likelihood scores. ...
Article
Full-text available
Two Pseudoberkleasmium taxa were obtained from decaying leaves and culms of Cocos nucifera and Zea mays in northern Thailand. Pseudoberkleasmium collections were compared with closely related taxa based on morphological characteristics and combined ITS, LSU, SSU, TEF1-α, and RPB2 DNA sequence data. Pseudoberkleasmium chiangraiense sp. nov. and a new host record, P. chiangmaiense are presented with full descriptions, photo plates, and a phylogenetic tree showing the placements of taxa.
... We conducted constraint ML searches in RAxML version 8.1.20 using by-codon partitions and 10 independent iterations (Stamatakis 2006;Stamatakis et al. 2008), and time-calibrated the resulting ML tree in TreePL version 1.0 (Smith and O'Meara 2012). The TreePL analysis used secondary calibrations extracted from the reference backbone tree via "congruification" (Eastman et al. 2013), a function ("congruify") implemented in the R package geiger (Harmon et al. 2008). ...
Article
A growing body of research suggests that genome size in animals can be affected by ecological factors. Half a century ago, Ebeling et al. (1971; EEA71) proposed that genome size increases with depth in some teleost fish groups and discussed a number of biological mechanisms that may explain this pattern (e.g., passive accumulation, adaptive acclimation). Using phylogenetic comparative approaches, we revisit this hypothesis based on genome size and ecological data from up to 708 marine fish species in combination with a set of large‐scale phylogenies, including a newly inferred tree. We also conduct modelling approaches of trait evolution and implement a variety of regression analyses to assess the relationship between genome size and depth. Our reanalysis of the EEA71 dataset shows a weak association between these variables, but the overall pattern in their data is driven by a single clade. While new analyses based on our ‘all‐species’ dataset resulted in positive correlations, providing some evidence that genome size evolves as a function of depth, only one subclade consistently yielded statistically significant correlations. By contrast, negative correlations are rare and non‐significant. All in all, we find modest evidence for an increase in genome size along the depth axis in marine fishes. We discuss some mechanistic explanations for the observed trends.
... A maximum likelihood phylogeny was inferred for a BIN sequence alignment using RAxML Black box (RAxML, RRID:SCR_006086) [68] on XSEDE via the CIPRES portal (CIPRES Science Gateway, RRID:SCR_008439) [69]. This system uses a GTRCAT model, which is recommended for larger datasets. ...
Article
Full-text available
Background: Traditional biomonitoring approaches have delivered a basic understanding of biodiversity, but they cannot support the large-scale assessments required to manage and protect entire ecosystems. This study used DNA metabarcoding to assess spatial and temporal variation in species richness and diversity in arthropod communities from 52 protected areas spanning 3 Canadian ecoregions. Results: This study revealed the presence of 26,263 arthropod species in the 3 ecoregions and indicated that at least another 3,000–5,000 await detection. Results further demonstrate that communities are more similar within than between ecoregions, even after controlling for geographical distance. Overall α-diversity declined from east to west, reflecting a gradient in habitat disturbance. Shifts in species composition were high at every site, with turnover greater than nestedness, suggesting the presence of many transient species. Conclusions: Differences in species composition among their arthropod communities confirm that ecoregions are a useful synoptic for biogeographic patterns and for structuring conservation efforts. The present results also demonstrate that metabarcoding enables large-scale monitoring of shifts in species composition, making it possible to move beyond the biomass measurements that have been the key metric used in prior efforts to track change in arthropod communities.
... The trimmed alignment file was converted to Phylip format, and the best-fit amino acid substitution matrix, among-site rate heterogeneity model, and observed amino acid frequency were determined using ProtTest 3 software (Darriba et al., 2011). A maximum-likelihood phylogenetic reference tree was built using RAxML v8 (Stamatakis et al., 2008), and only those sequences that clustered with experimentally verified enzymes were considered putative homologues. These phylogenetic reference trees served as scaffolds to recruit translated environmental metatranscriptomic reads. ...
Article
Full-text available
Glycine betaine (GBT) is a compatible solute in high concentrations in marine microorganisms. As a component of labile organic matter, GBT has complex biochemical potential as a substrate for microbial use that is unconstrained in the environment. Here we determine the uptake kinetics and metabolic fate of GBT in two natural microbial communities in the North Pacific characterized by different nitrate concentrations. Dissolved GBT had maximum uptake rates of 0.36 and 0.56 nM hr‐1 with half‐saturation constants of 79 and 11 nM in the high nitrate and low nitrate stations, respectively. During multiday incubations, most GBT taken into cells was retained as a compatible solute. Stable isotopes derived from the added GBT were also observed in other metabolites, including choline, carnitine, and sarcosine, suggesting that GBT was used for biosynthesis and for catabolism to pyruvate and ammonium. Where nitrate was scarce, GBT was primarily metabolized via demethylation to glycine. Gene transcript data was consistent with SAR11 using GBT as a source of methyl groups to fuel the methionine cycle. Where nitrate concentrations were higher, more GBT was partitioned for lipid biosynthesis by both bacteria and eukaryotic phytoplankton. Our data highlight unexpected metabolic pathways and potential routes of microbial metabolite exchange.
... The best model of evolution was determined by MrModeltest v. 2.2 for each gene. Maximum likelihood analyses via RAxML v. 8.2.12 [55] were accomplished using RAxML-HPC2 on XSEDE v. 8.2.8 [55,56] using the CIPRES Science Gateway platform [57]. GTR + I + G evolution model was used with 1000 non-parametric bootstrapping iterations. ...
Article
Full-text available
High temperatures and seasonality in tropical ecosystems favour plant pathogens, which cause many fungal diseases. Among these, diseases caused by Botryosphaeriaceae species are prominent as dieback, canker and leaf spots. In this research, we isolated one leaf-spot-causing Botryosphaeriaceae species from Ficus altissima leaves, which were collected in Guangzhou, Guangdong Province, China. Isolation and identification of the pathogen were based on morphological and molecular aspects. Based on multigene phylogenetic analysis of combined internal transcribed spacer (ITS), translation elongation factor 1-α gene (tef1) and beta-tubulin gene (tub2), the fungus associated with leaf spots on F. altissima is described as Lasiodiplodia fici, a novel species. Pathogenicity assays were conducted by inoculating the fungus onto detached shoots and plants under controlled environmental conditions. The results revealed that the L. fici isolates can infect the plant tissues under stress conditions, developing disease symptoms on detached shoots within three days. However, when it was inoculated onto the leaves of the host and grown in natural conditions, the progression of the disease was slow. The putative pathogen was re-isolated, and Koch's postulates were satisfied. This is the first report of a Lasiodiplodia species causing disease on Ficus altissima. Results from the present study will provide additional knowledge on fungal pathogens associated with forest and ornamental plant species.
... A maximum likelihood (ML) tree was constructed with RAxML-HPC2 on XSEDE v8.1.12 (Stamatakis et al. 2008; Stamatakis 2014) using a GTR + G model in CIPRES Science Gateway. The topology of the ML tree was visualized by MEGA 5.0 (Tamura et al. 2011). ...
Article
Periphytic ciliates play a vital role in the material cycle and energy flow of the microbial food web; however, their taxonomy and biodiversity are inadequately studied given their high species richness. Two new and one little known species, viz. Derouxella lembodes gen. et sp. nov., Cyrtophoron multivacuolatum sp. nov., and Cyrtophoron apsheronica Aliev, 1991, collected from coastal waters of China, were investigated using modern methods. Derouxella gen. nov. can be recognized by having a dorsoventrally flattened body, a podite, one fragmented preoral kinety, two parallel circumoral kineties, and somatic kineties progressively shortened from right to left. Morphological classification and phylogenetic analyses based on nuclear small subunit ribosomal RNA (nSSU rRNA) gene and mitochondrial small subunit ribosomal RNA (mtSSU rRNA) gene sequence data inferred that Derouxella gen. nov. occupies an intermediate position between Hartmannulidae and Dysteriidae. Cyrtophoron multivacuolatum sp. nov. is characterized by large body size, the numbers of somatic kineties and nematodesmal rods, and having numerous contractile vacuoles. The genus Cyrtophoron and the poorly known species C. apsheronica were redefined. Even with the addition of newly obtained nSSU rRNA and mtSSU rRNA sequences of Cyrtophoron, the family Chlamydodontidae was still recovered as a monophyletic group, and the monophyly of Cyrtophoron was also supported.
... The most recent studies of caniform carnivoran phylogenetics used more than 10 gene sequences with ~7.7-22.0 kb for 16-44 taxa, which would be practically impossible to examine with the traditional ML method (Table 2.1; Eizirik et al., 2010;Yu et al., 2011a;. Therefore, they adopted the recently developed fast-ML search methods implemented in the programs PHYML (Guindon & Gascuel, 2003), GARLI (Zwickl, 2006), and RAxML (Stamatakis et al., 2008). Owing to the development of these fast-ML strategies, pluralistic evaluation of phylogenetic hypotheses by various optimality criteria with different measures of support has been realized with more efficiency than using PAUP (e.g. ...
Chapter
Recent advancements in phylogenetic resolution at higher taxonomic levels within the mammalian order Carnivora have been stimulated by the increasing application of nuclear DNA, which is less homoplastic than mitochondrial DNA, and therefore better suited for studying deep-level (e.g. among genera or older) relationships. Immense progress in sequencing nuclear and mitochondrial DNAs from carnivoran species has resulted in a wealth of data in publicly available DNA databases, allowing an improved understanding of phylogenetic relationships at every taxonomic level using the ‘total evidence’ supermatrix or supertree method. Here, we review recent molecular systematic studies for one of the most enigmatic species, the red panda, Ailurus fulgens, and show that the use of nuclear DNA, Bayesian and maximum likelihood phylogenetic inference, and the supermatrix approach have improved the resolution of the phylogenetic position of this species. Secondly, we show that such methodological improvements have also clarified the evolution of the family Mustelidae (weasels, martens, otters, badgers, and allies). We demonstrate this in light of phylogeny, chronology, and historical biogeography and provide an up-to-date subfamily classification of the Mustelidae. Finally, we discuss the implications of molecular systematics to setting and defining conservation priorities on the basis of the EDGE (Evolutionarily Distinct and Globally Endangered) value, and conclude that the supermatrix-based priority setting is preferable to the supertree-based one.
... The ML analysis was performed using RAxML software (Stamatakis et al. 2008). The nonparametric bootstrap analysis with 1,000 replicates was used. ...
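For readers unfamiliar with the resampling step behind a nonparametric bootstrap, the sketch below shows the standard idea of drawing alignment columns with replacement. It is a generic illustration, not RAxML's rapid bootstrap algorithm or the cited study's pipeline; the toy alignment, taxon names and seed are made up.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one pseudoreplicate by sampling alignment columns with replacement.

    `alignment` maps taxon name -> aligned sequence (all of equal length).
    """
    n_sites = len(next(iter(alignment.values())))
    # Draw n_sites column indices with replacement; the same columns are
    # used for every taxon so that sites stay intact.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[c] for c in columns) for name, seq in alignment.items()}


# Hypothetical toy alignment, for illustration only.
toy = {"taxon_A": "ACGTAC", "taxon_B": "ACGTTC", "taxon_C": "ACCTAC"}
rng = random.Random(42)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
```

In a real analysis a tree would be inferred from each pseudoreplicate, and the support for a branch would be the fraction of those trees that contain it.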
Article
A new cyanobacterial species of Aliinostoc, A. vietnamicum sp. nov., is recorded in the tropical forest soil from the Cát Tiên National Park, Vietnam. The analysis is based on morphological characters, 16S rDNA phylogeny, ITS secondary structure, and fatty acid composition analysis. A. vietnamicum differed from the other species of the genus by the size and shape of vegetative cells, size of akinetes and heterocytes, and presence of granular polyphosphate inclusions in vegetative cells. The evolutionary distance matrix based on the 16S rRNA gene shared 96.2–98.2% similarities with other Aliinostoc sequences. The phylogeny inferred by Maximum Likelihood and Bayesian Inference placed A. vietnamicum in the Aliinostoc clade, within the Nostocaceae. For the first time, fatty acids composition analysis was obtained for a member of the genus Aliinostoc with cultivation time experiments. α‐linolenic (27.54–37.75%), palmitic (13.87–22.65%), and stearic (10.08–20.27%) acids were the dominant fatty acids when cultured during the exponential growth phase, as well as during stationary. This is the first finding of a strain with such a high content of stearic acid among cyanobacteria with Nostoc‐like morphology.
... In order to investigate the phylogenetic status of the Chinese bison mitochondrial haplotypes, we performed a maximum-likelihood (ML) phylogenetic analysis with RAxML-HPC v8 [46] on the CIPRES server [47]. Three newly obtained mitochondrial genomes (CADG456, CADG465 and CADG467) were used to reconstruct the phylogenetic tree, which was aligned with 59 steppe bison, ten American bison and four yak sequences downloaded from GenBank and ENA (Table S5) using MAFFT v7.471 [48]. ...
Article
Full-text available
Steppe bison are a typical representative of the Mid-Late Pleistocene steppes of the northern hemisphere. Despite the abundance of fossil remains, many questions related to their genetic diversity, population structure and dispersal route are still elusive. Here, we present both near-complete and partial mitochondrial genomes, as well as a partial nuclear genome from fossil bison samples excavated from Late Pleistocene strata in northeastern China. Maximum-likelihood and Bayesian trees both suggest the bison clade are divided into three maternal haplogroups (A, B and C), and Chinese individuals fall in two of them. Bayesian analysis shows that the split between haplogroup C and the ancestor of haplogroups A and B dates at 326 ky BP (95% HPD: 397-264 ky BP). In addition, our nuclear phylogenomic tree also supports a basal position for the individual carrying haplogroup C. Admixture analyses suggest that CADG467 (haplogroup C) has a similar genetic structure to steppe bison from Siberia (haplogroup B). Our new findings indicate that the genetic diversity of Pleistocene bison was probably even higher than previously thought and that northeastern Chinese populations of several mammalian species, including Pleistocene bison, were genetically distinct.
... For phylogenetic tree construction, the genomes of SFB strains were analyzed through the PATRIC Codon Trees pipeline, which utilizes PATRIC Cross-Genus Protein Families (PGfams) to align 1000 single copy protein and nucleotide sequences using MUSCLE and Biopython respectively [43,44]. Support values for the phylogenetic tree were generated from 100 rounds of rapid bootstrapping in RAxML [45]. The P-value ...
Article
Full-text available
Background: Segmented filamentous bacteria (SFB) are intestinal commensal microorganisms that have been demonstrated to induce the innate and adaptive immune responses in mouse and rat hosts. SFB are Gram-positive, spore-forming bacteria that fail to grow optimally under in vitro conditions due to unique metabolic requirements. Recently, SFB have been implicated in improved health and growth outcomes in commercial turkey flocks. To assess the nature and variations in SFB of turkeys and how they may differ from mammalian-associated SFB, the genome of turkey-associated SFB was compared with six representative genomes from murine hosts using an in silico approach. Results: The SFB-turkey genome is 1.6 Mb with a G + C content of 26.14% and contains 1,604 coding sequences (CDS). Comparative genome analyses revealed that all seven SFB strains possess a common set of metabolic deficiencies and auxotrophies. Specifically, none of the SFB strains can synthesize most of the amino acids, nucleotides and cofactors, which emphasizes the importance of metabolite acquisition from the host intestinal environment. Among the seven SFB genomes, the SFB-turkey genome is the largest and contains the highest number of predicted CDS (1,604). The SFB-turkey genome possesses cellular metabolism genes that are absent in the rodent SFB strains, including catabolic pathways for sucrose, stachyose, raffinose and other complex glycans. Other unique genes associated with the SFB-turkey genome are loci for the biosynthesis of biotin and degradation enzymes to recycle primary bile acids, both of which may play an important role in helping turkey-associated SFB survive and secure mutualism with its avian host. Conclusions: Comparative genomic analysis of seven SFB genomes revealed that each strain has a core set of metabolic capabilities and deficiencies that make these bacteria challenging to culture under ex vivo conditions. 
When compared to the murine-associated strains, turkey-associated SFB serves as a phylogenetic outgroup and a unique member among all the sequenced strains of SFB. This turkey-associated SFB strain is the first reported non-mammalian SFB genome, and highlights the impact of host specificity and the evolution of metabolic capabilities.
... Based on the substitution GTR + I + G model, the maximum-likelihood (ML) phylogenetic tree was produced using RAxML-HPC2 on XSEDE (https://www.phylo.org/) with 1,000 bootstrap replicates (Stamatakis et al. 2008). The topology showed that 11 species of the genus Aristolochia occur in the same clade. ...
Article
Full-text available
Aristolochia hainanensis Merr. 1922, a well-known Chinese medicinal plant, is distributed in Hainan Province and Guangxi Province, China. In the current study, we sequenced the complete chloroplast genome of A. hainanensis. The complete plastome genome was 159,764 bp in length, with a GC content of 38.8%, showing a typical quadripartite organization. The genome contained a large single-copy (LSC) of 89,134 bp, a small single-copy (SSC) of 19,306 bp, and a pair of inverted repeats (IRs) of 25,662 bp. A total of 113 genes were annotated, including 79 protein-coding genes, 30 tRNAs, and four rRNAs. The trnK-UUU gene contained the longest intron (2644 bp). The topology of the maximum-likelihood tree supported a close relationship between A. hainanensis and A. kwangsiensis.
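The region lengths and gene counts in this abstract can be cross-checked with simple arithmetic: a quadripartite plastome consists of one LSC, one SSC and two IR copies. The short sketch below does that check using only the numbers quoted above; the variable and function names are illustrative.

```python
# Region lengths in bp, as reported in the abstract.
LSC = 89_134
SSC = 19_306
IR = 25_662  # length of ONE inverted repeat; the plastome carries two copies


def plastome_length(lsc, ssc, ir):
    """Total length of a quadripartite plastome: LSC + SSC + 2 * IR."""
    return lsc + ssc + 2 * ir


total_bp = plastome_length(LSC, SSC, IR)

# Annotated gene categories, as reported in the abstract.
gene_counts = {"protein_coding": 79, "tRNA": 30, "rRNA": 4}
total_genes = sum(gene_counts.values())
```

Both totals agree with the abstract: the regions sum to 159,764 bp and the gene categories sum to 113.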
... RAxML was run with the GTRCAT model and with each gene partitioned separately. Bootstraps were obtained with the rapid bootstrapping method (Stamatakis et al. 2008) with 100 iterations. Trees were visualized using a combination of FigTree v1.4.4 and the ggtree R package v3.0.4 (Yu et al. 2017). ...
Article
Nutritional symbioses are integral to the survival and diversity of many insects. The majority of herbivorous insects in the order Hemiptera possess stable, inherited symbionts that produce essential amino acids and vitamins. However, instability has been observed in cicadas, with one bacterial symbiont, Hodgkinia cicadicola, being repeatedly replaced by a new fungal symbiont, Ophiocordyceps. The fungal symbionts are thought to be derived from parasitic Ophiocordyceps species, but little is known about these parasitic ancestors or how the transition from parasite to mutualist occurs. We used a combination of targeted amplified genes and metagenomic sequencing to investigate the evolution of endosymbiotic Ophiocordyceps across 25 species of cicadas in the tribe Cryptotympanini. At least four parallel instances of Ophiocordyceps domestication were found in the studied group, arising from a single monophyletic clade of cicada-parasitic Ophiocordyceps with only one having been known previously. The genome of a symbiotic Ophiocordyceps strain from the cicada Megatibicen auletes has been sequenced and annotated, paving the way for future comparative analyses between symbiotic and parasitic Ophiocordyceps.
... The best models selected were GTR + I + G for the 12S and 16S genes, HKY + I for Cyt b, and GTR + G for ND4. Each inference was initiated with a random starting tree, and nodal support was assessed with 1000 bootstrap pseudoreplicates (Stamatakis et al., 2008). ...
Article
Full-text available
The newly described species Gloydius huangi Wang, Ren, Dong, Jiang, Siler et Che, 2019 was described based on only three specimens from two sites. We report a new distribution site from Markam County, Tibet Autonomous Region, China, with a supplementary description of variation in the morphology and mitochondrial genetics of the species. The new specimen differs from the types of G. huangi in head scalation, coloration patterns, and hemipenis morphology. A distinct genetic distance of 1.9–2.2%, based on a Cytb gene fragment, exists between the new specimen and the types of the species.
... Phylogenetic analyses of maximum likelihood (ML), maximum parsimony (MP), and Bayesian inference (BI) were carried out as detailed in Dissanayake et al. (2020). Maximum likelihood analysis was performed by using RAxML-HPC2 on XSEDE 8.2.12 (Stamatakis et al. 2008) in CIPRES Science Gateway. The MP and BI analyses were performed by using PAUP v. 4.0b (Swofford 2002) and MrBayes v.3.2.7 (Ronquist et al. 2012). ...
Article
Full-text available
During an investigation of ascomycetous fungi on bamboos in Sichuan province, China, a monotypic genus, Pseudokeissleriella, collected from dead culms of bamboos is introduced to accommodate P. bambusicola. Pseudokeissleriella bambusicola is characterized by having subglobose to globose, glabrous ascomata, and hyaline, septate, fusiform ascospores with subobtuse ends and a swollen upper cell, surrounded by a mucilaginous sheath with a central depression. The phylogenetic analyses based on a multi-gene matrix of SSU, ITS, LSU, tef-1α sequences showed that P. bambusicola presented a distinct lineage sister to Katumotoa and Neoophiosphaerella in Lentitheciaceae. The establishment of the new taxa was justified by morphological and phylogenetic evidence. Morpho-phylogenetic differences between Pseudokeissleriella and some related genera Katumotoa, Keissleriella, and Neoophiosphaerella are discussed. Descriptions, illustrations, and notes for the new taxa are provided.
... The samples from the stationary phases of the independent runs were pooled to obtain the final results. Maximum likelihood bootstrapping (1000 replicates) was performed in RAxML v.7.7.1 (Stamatakis et al., 2008) using the same model as in the Bayesian inferences, with the dataset partitioned by codon and support calculated from 1000 bootstrap pseudoreplicates. Uncorrected pairwise sequence divergence was calculated in MEGA7. ...
Article
The eared nightjars (Lyncornis, formerly Eurostopodus) comprise six taxa distributed from southern India and Southeast Asia to Sulawesi. Species limits in this group have not been evaluated since 1940. In this study, we use three datasets (morphology, acoustics and mitochondrial DNA) to assess the taxonomic status of taxa in this genus. Multivariate analyses of vocalizations and phylogenetic analysis of mitochondrial DNA both revealed the presence of four major groups. Morphological analyses also revealed four major groups, but these agreed only in part with those identified by vocalizations and DNA. Lyncornis macrotis cerviniceps from mainland Southeast Asia and the isolated Lyncornis macrotis jacobsoni on Simeulue Island, off north-west Sumatra, differed by six diagnostic plumage characters, but could not be distinguished by their vocalizations or mitochondrial DNA. Conversely, Lyncornis macrotis macrotis from the Philippines and Lyncornis macrotis macropterus from Sulawesi differed diagnosably in song and by 5% sequence divergence but could not be diagnosed by plumage. We adopt an integrative approach and propose to recognize five monotypic species: Lyncornis temminckii, Lyncornis cerviniceps (synonym: Lyncornis bourdilloni), Lyncornis jacobsoni, Lyncornis macrotis and Lyncornis macropterus. Our study illustrates that taxonomic revisions based on single lines of evidence can underestimate diversity and underscores the importance of using multiple datasets in species-level taxonomy.
... Given that previous phylogenetic analyses of the group have failed to find topological differences among standard coding/non-coding and gene-wise partitioning schemes, this analysis was unpartitioned to optimize computational times. Rapid bootstraps (option "-f a"; Stamatakis et al. 2008) were also calculated to assess support. Chloroplast phylogenetics followed the concatenated methods described above. ...
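In standard RAxML (v7/v8), option "-f a" runs rapid bootstrapping and a search for the best-scoring ML tree in a single invocation. The sketch below only assembles such a command line; the file name, run name, model, seed and replicate count are placeholders chosen for illustration and are not taken from the cited study.

```python
import shlex


def rapid_bootstrap_command(alignment, run_name, model="GTRGAMMA",
                            replicates=100, seed=12345, binary="raxmlHPC"):
    """Build (but do not run) a RAxML rapid-bootstrap command line.

    Flags used: -f a (rapid bootstrap plus best-tree search), -m (model),
    -p (parsimony random seed), -x (rapid bootstrap random seed),
    -N (number of replicates), -s (alignment file), -n (output run name).
    All default values here are assumptions for illustration.
    """
    return [
        binary,
        "-f", "a",
        "-m", model,
        "-p", str(seed),
        "-x", str(seed),
        "-N", str(replicates),
        "-s", alignment,
        "-n", run_name,
    ]


# Hypothetical unpartitioned run: no -q partition file is passed.
cmd = rapid_bootstrap_command("alignment.phy", "run1")
print(shlex.join(cmd))
```

Returning a list rather than a string makes it straightforward to hand the command to `subprocess.run` without shell quoting issues; a partitioned analysis would additionally pass a partition file with `-q`.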
Preprint
Full-text available
Applications of molecular phylogenetic approaches have uncovered evidence of hybridization across numerous clades of life, yet the environmental factors responsible for driving opportunities for hybridization remain obscure. Verbal models implicating geographic range shifts that brought species together during the Pleistocene have often been invoked, but quantitative tests using paleoclimatic data are needed to validate these models. Here, we produce a phylogeny for Heuchereae, a clade of 15 genera and 83 species in Saxifragaceae, with complete sampling of recognized species, using 277 nuclear loci and nearly complete chloroplast genomes. We then employ an improved framework with a coalescent simulation approach to test and ultimately confirm previous hybridization hypotheses and identify one new intergeneric hybridization event. Focusing on the North American distribution of Heuchereae, we introduce and implement a newly developed approach to reconstruct potential past distributions for ancestral lineages across all species in the clade and across a paleoclimatic record extending from the late Pliocene. Time calibration based on both nuclear and chloroplast trees recovers a mid- to late-Pleistocene date for most inferred hybridization events, a timeframe concomitant with repeated geographic range restriction into overlapping refugia. Our results indicate an important role for past episodes of climate change, and the contrasting responses of species with differing ecological strategies, in generating novel patterns of range contact among plant communities and therefore new opportunities for hybridization.
... The phylogenetic tree and posterior probabilities of its branching were obtained on the basis of the remaining trees, having stable estimates of the parameter models of nucleotide substitutions and likelihood. Maximum Likelihood (ML) analysis was performed using the program RAxML (Stamatakis et al. 2008 ... Table 1. (Figs 1-28) Cells cylindrical, circular, mostly solitary. ...
... The phylogenetic analyses were carried out for maximum likelihood in CIPRES web portal [44] using RAxML 7.4.2 Black Box [45]. ...
Article
Full-text available
During the investigation of xylarialean taxa in China and Thailand, six Rosellinia-like taxa were collected. Rhizomaticola gen. nov., with the type species Rh. guizhouensis, is established based on its morphology and multi-gene molecular data. Rhizomaticola lacks carbonaceous stromata and has black ascospores without a germ slit, which distinguishes it from Rosellinia, Dematophora, Stilbohypoxylon and Xylaria. Five Rosellinia-like species are introduced based on their morphology, including three new species (Dematophora populi, Rosellinia thailandica, Ro. vitis), one new record for China (Ro. cainii) and one known species (D. necatrix). Their descriptions and illustrations are detailed.
... Model-based analyses were performed on two datasets, the molecular only dataset for 51 species and the total evidence dataset for 69 species. Maximum likelihood (ML) analysis was performed using RAxML (Stamatakis, 2006) with rapid bootstrapping (Stamatakis et al., 2008) available via the Cyberinfrastructure for Phylogenetic Research (CIPRES) Science Gateway (www.phylo.org/portal2/) (Miller et al., 2011). ...
Article
The first comprehensive phylogenetic analysis of the plant bug subfamily Deraeocorinae is presented. A total of 86 morphological characters and 3899 base pairs of mitochondrial (16S, COI) and nuclear (18S, 28S) sequences were analysed separately for each partition and combined datasets, using parsimony, maximum likelihood and Bayesian inference. The fossil species Amberderaeous gigophthalmus Kim, Taszakowski & Jung, 2020 was analysed in the morphological and combined datasets in order to assess its systematic position. The phylogenetic results revealed that Deraeocorinae as presently constituted is not monophyletic, with the tribe Termatophylini nested separately to the other deraeocorine supraspecific lineages analysed. The remaining Deraeocorinae was found to be monophyletic as well as Clivinemini, Hyaliodini and Saturniomirini. Clivinemini + Saturniomirini is the sister group of the remaining Deraeocorinae. Deraeocorini, which is the most diverse tribe in the subfamily, was found to be non-monophyletic, with Surinamellini nested within it. Based on the results, we proposed that the clade including the questionably placed taxa and a temporarily placed fossil taxon is synonymized from Surinamellini to Deraeocorini, and that Surinamellini is treated as paraphyletic. The nominate genus Deraeocoris s.l. was found to be non-monophyletic. Termatophylini was provisionally retained within Deraeocorinae, pending broader taxon sampling of Deraeocorinae and other mirid subfamilies.
... = 29), and a second calibration age resulting from population parameter estimation in the coalescent. This divergence age was estimated for lindeniana + linearis clades in SNAPP (Bryant et al., 2012) implemented in Beast 2.6 (Bouckaert et al., 2019) on CIPRES Science Gateway version 3.3 (Stamatakis et al., 2008;. For this age estimation, we included all sampled individuals of A. hirsuta, A. lindeniana, A. linearis, and A. concinna in the SCP. ...
Chapter
Full-text available
Pimpinella species are annual, biennial, and perennial semibushy aromatic plants cultivated for folk medicine, pharmaceuticals, food, and spices. The karyology and genome size of 17 populations of 16 different Pimpinella species collected from different locations in Iran were analyzed for inter-specific karyotypic and genome size variations. For karyological studies, root tips were squashed and painted with a DAPI solution (1 mg/ml). For flow cytometric measurements, fresh leaves of the standard reference (Solanum lycopersicum cv. Stupick, 2C DNA = 1.96 pg) and the Pimpinella samples were stained with propidium iodide. We identified two ploidy levels: diploid (2x) and tetraploid (4x), as well as five metaphase chromosomal counts of 18, 20, 22, 24, and 40. 2n = 24 is reported for the first time in the Pimpinella genus, and the presence of a B-chromosome is reported for one species. The nuclear DNA content ranged from 2C = 2.48 to 2C = 5.50 pg, along with a wide range of genome sizes between 1212.72 and 2689.50 Mbp. The average monoploid genome size and the average value of 2C DNA/chromosome were not proportional to ploidy. There were considerable positive correlations between 2C DNA and total chromatin length and total chromosomal volume. The present study results enable us to classify the genus Pimpinella with a high degree of morphological variation in Iran. In addition, cytological studies demonstrate karyotypic differences between P. anthriscoides and other species of Pimpinella, which may be utilized as a novel identification key to affiliate into a distinct, new genus – Pseudopimpinella.
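The Mbp range quoted in this abstract appears consistent with the widely used conversion 1 pg DNA = 978 Mbp applied to the 1C value (half of the 2C content). The abstract does not state the factor, so that reading is an assumption; the sketch below simply reproduces the arithmetic with illustrative names.

```python
MBP_PER_PG = 978.0  # assumed conversion factor: 1 pg of DNA = 978 Mbp


def one_c_genome_size_mbp(two_c_pg):
    """Convert a 2C nuclear DNA content (pg) to a 1C genome size in Mbp.

    Assumption: the reported Mbp values are 1C sizes, i.e. (2C / 2) * 978.
    """
    return two_c_pg / 2.0 * MBP_PER_PG


smallest = one_c_genome_size_mbp(2.48)  # lowest 2C value in the abstract
largest = one_c_genome_size_mbp(5.50)   # highest 2C value in the abstract
```

Under this assumption the two extremes come out at about 1212.72 and 2689.50 Mbp, matching the range given in the abstract.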
... To further evaluate statistical support for the topology, maximum likelihood (ML) bootstrapping was performed. ML analysis was conducted using RAxML v7.7.1 (Stamatakis et al., 2008). Clade support for the ML analysis was assessed by 1000 bootstrap replicates. ...
Article
The European Robin Erithacus rubecula is currently treated as a single species with eight subspecies. A previous molecular study and new molecular, morphometric and bioacoustic data reported here strongly support the recognition of three species in this complex: E. rubecula (Europe, North Africa and Macaronesia except the central Canary Islands), E. superbus (Tenerife) and a recently described subspecies on Gran Canaria which we raise to species rank as E. marionae. The taxa on Tenerife and Gran Canaria have previously been lumped as a single taxon but differ from each other and from E. rubecula in territorial songs, tic calls, seep calls and wing length. All three species are characterised by moderate to high levels of interspecific mitochondrial DNA sequence divergence (mean 4.2–4.8%). Phylogenetic analysis indicates that E. marionae is sister to E. superbus + E. rubecula. Recognition of Gran Canaria and Tenerife Robins as separate species adds two single-island endemics to the Canary Islands avifauna.
... vital-it.ch) 31 was used to generate an ML tree, with 100 bootstrap replicates performed. ...
Article
Full-text available
Bacteria in the Shigella genus remain a major cause of dysentery in sub-Saharan Africa, and annually cause an estimated 600,000 deaths worldwide. Being spread by contaminated food and water, this study highlights how wild caught food, in the form of freshwater catfish, can act as vectors for Shigella flexneri in Southern Kenya. A metatranscriptomic approach was used to identify the presence of Shigella flexneri in the catfish which had been caught for consumption from the Galana river. The use of nanopore sequencing was shown to be a simple and effective method to highlight the presence of Shigella flexneri and could represent a potential new tool in the detection and prevention of this deadly pathogen. Rather than the presence/absence results of more traditional testing methods, the use of metatranscriptomics highlighted how primarily one SOS response gene was being transcribed, suggesting the bacteria may be dormant in the catfish. Additionally, COI sequencing of the vector catfish revealed they likely represent a cryptic species. Morphological assignment suggested the fish were widehead catfish Clarotes laticeps, which range across Africa, but the COI sequences from the Kenyan fish are distinctly different from C. laticeps sequenced in West Africa.
... An ITS phylogram was constructed using ML analysis performed by RAxML with default parameters and 1000 bootstrap replications [17,18]. The criterion used to assess BS support percentages (BP) was as follows: low 50-70%, moderate 71-84%, and strong 95-100. ...
... ML analysis of the alignment was performed using RAxML-HPC2 v.8.2.12 on the CIPRES server using the best substitution model as determined by model jumping in the Bayesian analysis, with a gamma-distributed rate variation and proportion of invariable sites. 37 Bootstrap analysis was performed to test the strength of tree topology using RAxML-HPC2 and Consense v.3.697 on the CIPRES server with 1,000 subsets. 9 The phylogenetic tree was visualized using FigTree software (http://tree.bio.ed.ac.uk/software/figtree/). ...
Article
Herpesviruses are found in free-living and captive chelonian populations, often in association with morbidity and mortality. To date, all known chelonian herpesviruses fall within the subfamily Alphaherpesvirinae. We detected a novel herpesvirus in 3 species of chelonians: a captive leopard tortoise (Stigmochelys pardalis) in western TX, USA; a steppe tortoise (Testudo [Agrionemys] horsfieldii) found near Fort Irwin, CA, USA; and 2 free-living, three-toed box turtles (Terrapene mexicana triunguis) found in Forest Park, St. Louis, MO. The leopard tortoise was coinfected with the tortoise intranuclear coccidian and had clinical signs of upper respiratory tract disease. The steppe tortoise had mucopurulent nasal discharge and lethargy. One of the three-toed box turtles had no clinical signs; the other was found dead with signs of trauma after being observed with blepharedema, tympanic membrane swelling, cervical edema, and other clinical signs several weeks prior to death. Generally, the branching order of the turtle herpesviruses mirrors the divergence patterns of their hosts, consistent with codivergence. Based on phylogenetic analysis, this novel herpesvirus clusters with a clade of viruses that infect emydid hosts and is likely of box turtle origin. Therefore, we suggest the name terrapene alphaherpesvirus 3 (TerAHV3) for the novel virus. This virus also has the ability to host-jump to tortoises, and previously documented herpesviral morbidity tends to be more common in aberrant hosts. The relationship between clinical signs and infection with TerAHV3 in these animals is unclear, and further investigation is merited.
Article
Full-text available
Physalacria auricularioides S. De la Peña-Lastra, A. Mateos, M. Saavedra & P. Alvarado, sp. nov. from a dead twig of Castanea sativa
Article
Full-text available
Novel species of fungi described in this study include those from various countries as follows: Australia, Agaricus albofoetidus, Agaricus aureoelephanti and Agaricus parviumbrus on soil, Fusarium ramsdenii from stem cankers of Araucaria cunninghamii, Keissleriella sporoboli from stem of Sporobolus natalensis, Leptosphaerulina queenslandica and Pestalotiopsis chiaroscuro from leaves of Sporobolus natalensis, Serendipita petricolae as endophyte from roots of Eriochilus petricola, Stagonospora tauntonensis from stem of Sporobolus natalensis, Teratosphaeria carnegiei from leaves of Eucalyptus grandis × E. camaldulensis and Wongia ficherai from roots of Eragrostis curvula. Canada, Lulworthia fundyensis from intertidal wood and Newbrunswickomyces abietophilus (incl. Newbrunswickomyces gen. nov.) on buds of Abies balsamea. Czech Republic, Geosmithia funiculosa from a bark beetle gallery on Ulmus minor and Neoherpotrichiella juglandicola (incl. Neoherpotrichiella gen. nov.) from wood of Juglans regia. France, Aspergillus rouenensis and Neoacrodontium gallica (incl. Neoacrodontium gen. nov.) from bore dust of Xestobium rufovillosum feeding on Quercus wood, Endoradiciella communis (incl. Endoradiciella gen. nov.) endophytic in roots of Microthlaspi perfoliatum and Entoloma simulans on soil. India, Amanita konajensis on soil and Keithomyces indicus from soil. Israel, Microascus rothbergiorum from Stylophora pistillata. Italy, Calonarius ligusticus on soil. Netherlands, Appendopyricularia juncicola (incl. Appendopyricularia gen. nov.), Eriospora juncicola and Tetraploa juncicola on dead culms of Juncus effusus, Gonatophragmium physciae on Physcia caesia and Paracosmospora physciae (incl. Paracosmospora gen. nov.) on Physcia tenella, Myrmecridium phragmitigenum on dead culm of Phragmites australis, Neochalara lolae on stems of Pteridium aquilinum, Niesslia nieuwwulvenica on dead culm of undetermined Poaceae, Nothodevriesia narthecii (incl. Nothodevriesia gen. nov.) on dead leaves of Narthecium ossifragum and Parastenospora pini (incl. Parastenospora gen. nov.) on dead twigs of Pinus sylvestris. Norway, Verticillium bjoernoeyanum from sand grains attached to a piece of driftwood on a sandy beach. Portugal, Collybiopsis cimrmanii on the base of living Quercus ilex and amongst dead leaves of Laurus and herbs. South Africa, Paraproliferophorum hyphaenes (incl. Paraproliferophorum gen. nov.) on living leaves of Hyphaene sp. and Saccothecium widdringtoniae on twigs of Widdringtonia wallichii. Spain, Cortinarius dryosalor on soil, Cyphellophora endoradicis endophytic in roots of Microthlaspi perfoliatum, Geoglossum laurisilvae on soil, Leptographium gemmatum from fluvial sediments, Physalacria auricularioides from a dead twig of Castanea sativa, Terfezia bertae and Tuber davidlopezii in soil. Sweden, Alpova larskersii, Inocybe alpestris and Inocybe boreogodeyi on soil. Thailand, Russula banwatchanensis, Russula purpureoviridis and Russula lilacina on soil. Ukraine, Nectriella adonidis on overwintered stems of Adonis vernalis. USA, Microcyclus jacquiniae from living leaves of Jacquinia keyensis and Penicillium neoherquei from a minute mushroom sporocarp. Morphological and culture characteristics are supported by DNA barcodes.
Preprint
Full-text available
Setaria P. Beauv. is the largest genus of the “bristle clade”, including between 115 and 160 species. Previous molecular phylogenetic studies showed Setaria likely to be para- or polyphyletic, retrieving several clades apparently consistent in all analyses and correlated with the geographic origin of species. In this study, we evaluate the phylogeny of the subtribe Cenchrinae using parsimony, likelihood, and Bayesian inference based on the plastid marker ndhF and increasing the number of sampled species. Our main objective was to analyze American taxa with inflorescences of the “Paspalidium type” (i.e., subgenera Paurochaetium and Reverchoniae) to test whether they, as traditionally circumscribed, form a natural group. Our findings recovered both subgenera as polyphyletic, with their species distributed in different morphologically distinctive clades and not necessarily correlated with the geographic origin. Additionally, we were able to include a second voucher of species that were imprecisely located in previous studies and define their placements in the tree, as well as confirm that Setaria is polyphyletic as currently delineated. A comparison with the results from other studies, comments on Stenotaphrum Trin., and a brief discussion on conflicting placements in the “Cenchrus clade” and of Acritochaete Pilg. are also included here.
Article
Molecular barcoding and morphological characters were used to identify a new saprotrophic species in Pestalotiopsis, which was associated with senescent leaves of Eleutherococcus brachypus (Araliaceae) in Jilin Province, China. The matrix of the internal transcribed spacer (ITS) region, translation elongation factor 1-alpha (tef1-α), and β-tubulin (tub2) was used in the maximum likelihood (ML), maximum parsimony (MP) and Bayesian inference (BI) methods. The new collections formed a distinct clade with Pestalotiopsis lijiangensis. The new species differs from P. lijiangensis by its conidial length/width ratio. Detailed description and micrographs revealed that the species is unique in its olivaceous concolorous median cells and has significantly smaller conidia compared to other related species. The apical appendages of Pestalotiopsis eleutherococci are distinct in position and slightly shorter, while the basal appendage is slightly longer, compared to P. lijiangensis. Therefore, we introduce Pestalotiopsis eleutherococci as a novel species.
Article
The genus Miniopterus is a monophyletic assemblage of many species characterized by remarkably conservative morphology. The number of recognized species has more than doubled over the last two decades, mainly with newly recognized Afrotropical and Malagasy species. A molecular phylogenetic analysis based on cytochrome c oxidase subunit I (COI) revealed a monophyletic clade of Miniopterus from Sri Lanka and southern India that is distinct from the other known taxa of this genus. The mean uncorrected pairwise sequence divergence among the three gene sequences of this new Miniopterus lineage was 0.83% (range 0.4–1.2%) and between this and other sampled taxa was 12.7% (range 8.5–15.9%). This lineage was also distinctive in craniodental morphometrics and hence it is herein described as a new species. The newly described species is easily distinguished by its external and cranial dimensions from its smaller (M. pusillus) and larger (M. magnater) congeners in India and Sri Lanka. It is also somewhat smaller than M. fuliginosus in both external and cranial dimensions. This is the first description of a new Miniopterus species from Asia in six decades and from India and Sri Lanka in eight decades. Our study highlights the importance of using both genetic and morphometric analyses in taxonomic studies on South Asian bats. Key words: cryptic species, Miniopteridae, cytochrome oxidase 1, morphometrics, taxonomy, South Asia, DNA barcode
Article
Full-text available
Background: Gnetales have a key phylogenetic position in the evolution of seed plants. Among the Gnetales, there is an extraordinary morphological diversity of seeds; the genus Ephedra, in particular, exhibits fleshy, coriaceous or winged (dry) seeds. Despite this striking diversity, its underlying genetic mechanisms remain poorly understood due to the limited studies in gymnosperms. Expanding the genomic and developmental data from gymnosperms contributes to a better understanding of seed evolution and development. Results: We performed transcriptome analyses on different plant tissues of two Ephedra species with different seed morphologies. Anatomical observations in early developing ovules show that differences in the seed morphologies are established early in their development. The transcriptomic analyses in dry-seeded Ephedra californica and fleshy-seeded Ephedra antisyphilitica allowed us to identify the major differences between the differentially expressed genes in these species. We detected several genes known to be involved in fruit ripening as upregulated in the fleshy seed of Ephedra antisyphilitica. Conclusions: This study allowed us to determine the differentially expressed genes involved in seed development of two Ephedra species. Furthermore, the results of this study of seeds with the enigmatic morphology in Ephedra californica and Ephedra antisyphilitica allowed us to corroborate the hypothesis which suggests that the extra envelopes covering the seeds of Gnetales are not genetically similar to integument. Our results highlight the importance of carrying out studies on less explored species such as gymnosperms, to gain a better understanding of the evolutionary history of plants.
Article
Access to inorganic phosphate (Pi), a principal intermediate of energy and nucleotide metabolism, profoundly affects cellular activities and plant performance. In most soils, antagonistic Pi-metal interactions restrict Pi bioavailability, which guides local root development to maximize Pi interception. Growing root tips scout the essential but immobile mineral nutrient; however, the mechanisms monitoring external Pi status are unknown. Here, we show that Arabidopsis LOW PHOSPHATE ROOT 1 (LPR1), one key determinant of Fe-dependent Pi sensing in root meristems, encodes a novel ferroxidase of high substrate specificity and affinity (apparent KM ∼ 2 μM Fe²⁺). LPR1 typifies an ancient, Fe-oxidizing multicopper protein family that evolved early upon bacterial land colonization. The ancestor of streptophyte algae and embryophytes (land plants) acquired LPR1-type ferroxidase from soil bacteria via horizontal gene transfer, a hypothesis supported by phylogenomics, homology modeling, and biochemistry. Our molecular and kinetic data on LPR1 regulation indicate that Pi-dependent Fe substrate availability determines LPR1 activity and function. Guided by the metabolic lifestyle of extant sister bacterial genera, we propose that Arabidopsis LPR1 monitors subtle concentration differentials of external Fe availability as a Pi-dependent cue to adjust root meristem maintenance via Fe redox signaling and cell wall modification. We further hypothesize that the acquisition of bacterial LPR1-type ferroxidase by embryophyte progenitors facilitated the evolution of local Pi sensing and acquisition during plant terrestrialization.
Article
The widespread species Parmotrema crinitum (Ach.) M. Choisy and Parmotrema perlatum (Huds.) M. Choisy are mainly distinguished by their reproductive strategies. While P. crinitum propagates by isidia, P. perlatum produces soredia. In this study, we aim to evaluate the phylogenetic relationship between both species and to critically examine their species boundaries. For this purpose, 46 samples belonging to P. crinitum and P. perlatum were used in our analysis, including 22 for which we studied the morphology and chemistry before extracting their DNA. We used 35 sequences of the internal transcribed spacer region of nuclear ribosomal DNA (ITS) of Parmotrema perlatum from Europe and Africa (20 of which were newly generated), and 11 of Parmotrema crinitum from Europe, North America and North Africa (two newly generated). Additionally, 28 sequences of several species from Parmotrema were included in the ITS dataset. The ITS data matrix was analyzed using different approaches, such as traditional phylogeny (maximum likelihood and Bayesian analyses), genetic distances, automatic barcode gap discovery (ABGD) and the coalescent-based method Poisson tree processes (PTP), in order to test congruence among results. Our results indicate that all samples referred to P. crinitum and P. perlatum nested in a well-supported monophyletic clade, but phylogenetic relationships among them remain unresolved. Delimitations inferred from PTP, ABGD and genetic distance analyses were comparable and suggested that P. crinitum and P. perlatum belong to the same lineage. Interestingly, two samples of P. perlatum separate in a different monophyletic clade, which is supported as a different lineage by all the analyses.
Article
Full-text available
Phylogenetic inference is considered to be one of the grand challenges in Bioinformatics due to the immense computational requirements. RAxML is currently among the fastest and most accurate programs for phylogenetic tree inference under the Maximum Likelihood (ML) criterion. First, we introduce new tree search heuristics that accelerate RAxML by a factor of 2.43 while returning equally good trees. The performance of the new search algorithm has been assessed on 18 real-world datasets comprising 148 up to 4,843 DNA sequences. We then present the implementation, optimization, and evaluation of RAxML on the IBM Cell Broadband Engine. We address the problems and provide solutions pertaining to the optimization of floating point code, control flow, communication, and scheduling of multi-level parallelism on the Cell.
Conference Paper
Full-text available
This paper addresses the problem of orchestrating and scheduling parallelism at multiple levels of granularity on heterogeneous multicore processors. We present policies and mechanisms for adaptive exploitation and scheduling of multiple layers of parallelism on the Cell Broadband Engine. Our policies combine event-driven task scheduling with malleable loop-level parallelism, which is exposed from the runtime system whenever task-level parallelism leaves cores idle. We present a runtime system for scheduling applications with layered parallelism on Cell and investigate its potential with RAxML, a computational biology application which infers large phylogenetic trees, using the Maximum Likelihood (ML) method. Our experiments show that the Cell benefits significantly from dynamic parallelization methods that selectively exploit the layers of parallelism in the system, in response to workload characteristics. Our runtime environment outperforms naive parallelization and scheduling based on MPI and Linux by up to a factor of 2.6. We are able to execute RAxML on one Cell four times faster than on a dual-processor system with Hyperthreaded Xeon processors, and 5-10% faster than on a single-processor system with a dual-core, quad-thread IBM Power5 processor.
Conference Paper
Full-text available
Phylogenetic inference is a grand challenge in Bioinformatics due to immense computational requirements. The increasing popularity of multi-gene alignments in biological studies, which typically provide a stable topological signal due to a more favorable ratio of the number of base pairs to the number of sequences, coupled with rapid accumulation of sequence data in general, poses new challenges for high performance computing. In this paper, we demonstrate how state-of-the-art Maximum Likelihood (ML) programs can be efficiently scaled to the IBM BlueGene/L (BG/L) architecture, by porting RAxML, which is currently among the fastest and most accurate programs for phylogenetic inference under the ML criterion. We simultaneously exploit coarse-grained and fine-grained parallelism that is inherent in every ML-based biological analysis. Performance is assessed using datasets consisting of 212 sequences and 566,470 base pairs, and 2,182 sequences and 51,089 base pairs, respectively. To the best of our knowledge, these are the largest datasets analyzed under ML to date. The capability to analyze such datasets will help to address novel biological questions via phylogenetic analyses. Our experimental results indicate that the fine-grained parallelization scales well up to 1,024 processors. Moreover, a larger number of processors can be efficiently exploited by a combination of coarse-grained and fine-grained parallelism. Finally, we demonstrate that our parallelization scales equally well on an AMD Opteron cluster with a less favorable network latency to processor speed ratio. We recorded super-linear speedups in several cases due to increased cache efficiency.
Article
Full-text available
Motivation: In recent years there has been increased interest in producing large and accurate phylogenetic trees using statistical approaches. However for a large number of taxa, it is not feasible to construct large and accurate trees using only a single processor. A number of specialized parallel programs have been produced in an attempt to address the huge computational requirements of maximum likelihood. We express a number of concerns about the current set of parallel phylogenetic programs which are currently severely limiting the widespread availability and use of parallel computing in maximum likelihood-based phylogenetic analysis. Results: We have identified the suitability of phylogenetic analysis to large-scale heterogeneous distributed computing. We have completed a distributed and fully cross-platform phylogenetic tree building program called distributed phylogeny reconstruction by maximum likelihood. It uses an already proven maximum likelihood-based tree building algorithm and a popular phylogenetic analysis library for all its likelihood calculations. It offers one of the most extensive sets of DNA substitution models currently available. We are the first, to our knowledge, to report the completion of a distributed phylogenetic tree building program that can achieve near-linear speedup while only using the idle clock cycles of machines. For those in an academic or corporate environment with hundreds of idle desktop machines, we have shown how distributed computing can deliver a 'free' ML supercomputer.
Article
Full-text available
A versatile method, quartet puzzling, is introduced to reconstruct the topology (branching pattern) of a phylogenetic tree based on DNA or amino acid sequence data. This method applies maximum-likelihood tree reconstruction to all possible quartets that can be formed from n sequences. The quartet trees serve as starting points to reconstruct a set of optimal n-taxon trees. The majority rule consensus of these trees defines the quartet puzzling tree and shows groupings that are well supported. Computer simulations show that the performance of quartet puzzling to reconstruct the true tree is always equal to or better than that of neighbor joining. For some cases with high transition/transversion bias quartet puzzling outperforms neighbor joining by a factor of 10. The application of quartet puzzling to mitochondrial RNA and tRNA(Val) sequences from amniotes demonstrates the power of the approach. A PHYLIP-compatible ANSI C program, PUZZLE, for analyzing nucleotide or amino acid sequence data is available.
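The first step described in this abstract, forming all possible quartets from n sequences, can be sketched in a few lines of Python. This is only an illustration of the enumeration: the taxon names and the helper functions `quartets` and `quartet_topologies` are hypothetical, and the maximum-likelihood evaluation of each quartet tree that PUZZLE performs is not reproduced here.

```python
from itertools import combinations
from math import comb


def quartets(taxa):
    """Return every 4-taxon subset (quartet) of the given taxa."""
    return list(combinations(taxa, 4))


def quartet_topologies(quartet):
    """Return the three possible unrooted topologies of one quartet.

    Each topology is written as a pair of sister pairs, e.g. ((a, b), (c, d))
    stands for the split ab|cd.
    """
    a, b, c, d = quartet
    return [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]


# Hypothetical taxon labels, used only for illustration.
taxa = ["A", "B", "C", "D", "E"]
all_quartets = quartets(taxa)

# The number of quartets grows as n choose 4.
assert len(all_quartets) == comb(len(taxa), 4)

for q in all_quartets:
    print(q, quartet_topologies(q))
```

Because the quartet count is n choose 4, it grows roughly with the fourth power of the number of sequences, which is one reason the later TREE-PUZZLE abstract on this page reports parallelizing the tree reconstruction step.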
Article
Full-text available
The multi-copy internal transcribed spacer (ITS) region of nuclear ribosomal DNA is widely used to infer phylogenetic relationships among closely related taxa. Here we use maximum likelihood (ML) and splits graph analyses to extract phylogenetic information from approximately 600 mostly cloned ITS sequences, representing 81 species and subspecies of Acer, and both species of its sister Dipteronia. Additional analyses compared sequence motifs in Acer and several hundred Anacardiaceae, Burseraceae, Meliaceae, Rutaceae, and Sapindaceae ITS sequences in GenBank. We also assessed the effects of using smaller data sets of consensus sequences with ambiguity coding (accounting for within-species variation) instead of the full (partly redundant) original sequences. Neighbor-nets and bipartition networks were used to visualize conflict among character state patterns. Species clusters observed in the trees and networks largely agree with morphology-based classifications; of de Jong's (1994) 16 sections, nine are supported in neighbor-net and bipartition networks, and ten by sequence motifs and the ML tree; of his 19 series, 14 are supported in networks, motifs, and the ML tree. Most nodes had higher bootstrap support with matrices of 105 or 40 consensus sequences than with the original matrix. Within-taxon ITS divergence did not differ between diploid and polyploid Acer, and there was little evidence of differentiated parental ITS haplotypes, suggesting that concerted evolution in Acer acts rapidly.
Article
Full-text available
TREE-PUZZLE is a program package for quartet-based maximum-likelihood phylogenetic analysis (formerly PUZZLE, Strimmer and von Haeseler, Mol. Biol. Evol., 13, 964-969, 1996) that provides methods for reconstruction, comparison, and testing of trees and models on DNA as well as protein sequences. To reduce waiting time for larger datasets the tree reconstruction part of the software has been parallelized using message passing that runs on clusters of workstations as well as parallel computers. Availability: http://www.tree-puzzle.de. The program is written in ANSI C. TREE-PUZZLE can be run on UNIX, Windows and Mac systems, including Mac OS X. To run the parallel version of PUZZLE, a Message Passing Interface (MPI) library has to be installed on the system. Free MPI implementations are available on the Web (cf. http://www.lam-mpi.org/mpi/implementations/).
Article
Full-text available
Likelihood-based statistical tests of competing evolutionary hypotheses (tree topologies) have been available for approximately a decade. By far the most commonly used is the Kishino-Hasegawa test. However, the assumptions that have to be made to ensure the validity of the Kishino-Hasegawa test place important restrictions on its applicability. In particular, it is only valid when the topologies being compared are specified a priori. Unfortunately, this means that the Kishino-Hasegawa test may be severely biased in many cases in which it is now commonly used: for example, in any case in which one of the competing topologies has been selected for testing because it is the maximum likelihood topology for the data set at hand. We review the theory of the Kishino-Hasegawa test and contend that for the majority of popular applications this test should not be used. Previously published results from invalid applications of the Kishino-Hasegawa test should be treated extremely cautiously, and future applications should use appropriate alternative tests instead. We review such alternative tests, both nonparametric and parametric, and give two examples which illustrate the importance of our contentions.
Article
Full-text available
MrBayes 3 performs Bayesian phylogenetic analysis combining information from different data partitions or subsets evolving under different stochastic evolutionary models. This allows the user to analyze heterogeneous data sets consisting of different data types (e.g. morphological, nucleotide, and protein) and to explore a wide variety of structured models mixing partition-unique and shared parameters. The program employs MPI to parallelize Metropolis coupling on Macintosh or UNIX clusters. Availability: http://morphbank.ebc.uu.se/mrbayes Contact: fredrik.ronquist@ebc.uu.se
Article
Full-text available
Most analysis programs for inferring molecular phylogenies are difficult to use, in particular for researchers with little programming experience. TREEFINDER is an easy-to-use integrative platform-independent analysis environment for molecular phylogenetics. In this paper the main features of TREEFINDER (version of April 2004) are described. TREEFINDER is written in ANSI C and Java and implements powerful statistical approaches for inferring gene trees and related analyses. In addition, it provides a user-friendly graphical interface and a phylogenetic programming language. TREEFINDER is a versatile framework for analyzing phylogenetic data across different platforms that is suited both for exploratory as well as advanced studies.
Article
Full-text available
The computation of large phylogenetic trees with statistical models such as maximum likelihood or Bayesian inference is computationally extremely intensive. It has repeatedly been demonstrated that these models are able to recover the true tree or a tree which is topologically closer to the true tree more frequently than less elaborate methods such as parsimony or neighbor joining. Due to the combinatorial and computational complexity, the size of trees which can be computed on a biologist's PC workstation within reasonable time is limited to trees containing approximately 100 taxa. In this paper we present the latest release of our program RAxML-III for rapid maximum likelihood-based inference of large evolutionary trees, which allows for computation of 1,000-taxon trees in less than 24 hours on a single PC processor. We compare RAxML-III to the currently fastest implementations for maximum likelihood and Bayesian inference: PHYML and MrBayes. Whereas RAxML-III performs worse than PHYML and MrBayes on synthetic data, it clearly outperforms both programs on all real data alignments used in terms of speed and final likelihood values. Availability: RAxML-III including all alignments and final trees mentioned in this paper is freely available as open source code at http://wwwbode.cs.tum/~stamatak Contact: stamatak@cs.tum.edu
Article
Full-text available
Motivation: Maximum likelihood (ML) is an increasingly popular optimality criterion for selecting evolutionary trees. Yet the computational complexity of ML was open for over 20 years, and only recently resolved by the authors for the Jukes-Cantor model of substitution and its generalizations. It was proved that reconstructing the ML tree is computationally intractable (NP-hard). In this work we explore three directions, which extend that result. Results: (1) We show that ML under the assumption of molecular clock is still computationally intractable (NP-hard). (2) We show that not only is it computationally intractable to find the exact ML tree, even approximating the logarithm of the ML for any multiplicative factor smaller than 1.00175 is computationally intractable. (3) We develop an algorithm for approximating log-likelihood under the condition that the input sequences are sparse. It employs any approximation algorithm for parsimony, and asymptotically achieves the same approximation ratio. We note that ML reconstruction for sparse inputs is still hard under this condition, and furthermore many real datasets satisfy it.
Article
Full-text available
We explored the use of multidimensional scaling (MDS) of tree-to-tree pairwise distances to visualize the relationships among sets of phylogenetic trees. We found the technique to be useful for exploring “tree islands” (sets of topologically related trees among larger sets of near-optimal trees), for comparing sets of trees obtained from bootstrapping and Bayesian sampling, for comparing trees obtained from the analysis of several different genes, and for comparing multiple Bayesian analyses. The technique was also useful as a teaching aid for illustrating the progress of a Bayesian analysis and as an exploratory tool for examining large sets of phylogenetic trees. We also identified some limitations to the method, including distortions of the multidimensional tree space into two dimensions through the MDS technique, and the definition of the MDS-defined space based on a limited sample of trees. Nonetheless, the technique is a useful approach for the analysis of large sets of phylogenetic trees.
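The input to such an MDS analysis is a matrix of tree-to-tree pairwise distances. The abstract does not say which distance was used, so the sketch below assumes a Robinson-Foulds style measure on rooted trees (the number of clades found in one tree but not the other). The nested-tuple tree encoding, the example trees, and the function names are all hypothetical, and the MDS projection itself is left to a numerical library.

```python
from itertools import combinations


def clade_set(tree):
    """Return the leaf sets below every internal node of a nested-tuple tree."""
    found = set()

    def walk(node):
        if isinstance(node, str):          # a leaf is just its label
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def clade_distance(tree_a, tree_b):
    """Count clades present in exactly one of the two rooted trees."""
    return len(clade_set(tree_a) ^ clade_set(tree_b))


def distance_matrix(trees):
    """Build the symmetric matrix of pairwise tree distances."""
    n = len(trees)
    matrix = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        matrix[i][j] = matrix[j][i] = clade_distance(trees[i], trees[j])
    return matrix


# Three hypothetical 5-taxon trees; the first and third are identical.
example_trees = [
    ((("A", "B"), "C"), ("D", "E")),
    ((("A", "C"), "B"), ("D", "E")),
    ((("A", "B"), "C"), ("D", "E")),
]
for row in distance_matrix(example_trees):
    print(row)
```

A matrix like this can then be passed to any MDS routine. Trees that share most clades end up close together in the projection, which is how groups such as the “tree islands” mentioned above would become visible.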
Article
Full-text available
IQPNNI is a program to infer maximum-likelihood phylogenetic trees from DNA or protein data with a large number of sequences. We present an improved and MPI-parallel implementation showing very good scaling and speedup behavior. Availability: IQPNNI (http://www.bi.uni-duesseldorf.de/software/iqpnni) is written in C++, executable on UNIX/Linux, Windows and MacOS systems. (Free) MPI libraries can be found at http://www.lam-mpi.org/mpi/implementations/. Contact: haeseler{at}cs.uni-duesseldorf.de
Article
Full-text available
We revisit statistical tests for branches of evolutionary trees reconstructed upon molecular data. A new, fast, approximate likelihood-ratio test (aLRT) for branches is presented here as a competitive alternative to nonparametric bootstrap and Bayesian estimation of branch support. The aLRT is based on the idea of the conventional LRT, with the null hypothesis corresponding to the assumption that the inferred branch has length 0. We show that the LRT statistic is asymptotically distributed as a maximum of three random variables drawn from the $\chi_0^2 + \chi_1^2$ distribution. The new aLRT of interior branch uses this distribution for significance testing, but the test statistic is approximated in a slightly conservative but practical way as $2(l_1 - l_2)$, i.e., double the difference between the maximum log-likelihood values corresponding to the best tree and the second best topological arrangement around the branch of interest. Such a test is fast because the log-likelihood value $l_2$ is computed by optimizing only over the branch of interest and the four adjacent branches, whereas other parameters are fixed at their optimal values corresponding to the best ML tree. The performance of the new test was studied on simulated 4-, 12-, and 100-taxon data sets with sequences of different lengths. The aLRT is shown to be accurate, powerful, and robust to certain violations of model assumptions. The aLRT is implemented within the algorithm used by the recent fast maximum likelihood tree estimation program PHYML (Guindon and Gascuel, 2003).
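The arithmetic behind the test statistic can be sketched as follows. The statistic itself, double the difference between the two log-likelihoods, comes from the abstract. The p-value calculation rests on two assumptions that the abstract does not spell out: the mixture of the two chi-square distributions is taken with equal weights of 1/2, and the three random variables are treated as independent. The function names and the log-likelihood values are hypothetical, and this is not the PHYML implementation.

```python
from math import erf, sqrt


def alrt_statistic(lnl_best, lnl_second):
    """Double the log-likelihood difference between the best and the
    second-best arrangement around a branch."""
    return 2.0 * (lnl_best - lnl_second)


def mixture_cdf(x):
    """CDF of an equal-weight mixture of chi-square with 0 degrees of freedom
    (a point mass at zero) and chi-square with 1 degree of freedom.

    ASSUMPTION: the 1/2 : 1/2 weighting is not stated in the abstract.
    """
    if x < 0:
        return 0.0
    # The chi-square(1) CDF equals erf(sqrt(x / 2)).
    return 0.5 + 0.5 * erf(sqrt(x / 2.0))


def alrt_p_value(statistic):
    """P(max of three draws >= statistic).

    ASSUMPTION: the three draws are treated as independent.
    """
    if statistic <= 0:
        return 1.0
    return 1.0 - mixture_cdf(statistic) ** 3


# Hypothetical log-likelihoods for the best and second-best arrangements.
stat = alrt_statistic(-1000.0, -1003.0)
print(stat, alrt_p_value(stat))
```

Under these assumptions, a larger log-likelihood gap between the best and second-best arrangement yields a smaller p-value, meaning stronger support for the branch.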
Article
Full-text available
The need to depict a phylogeny, or some other kind of abstract tree, is very frequently experienced by researchers from a broad range of biological and computational disciplines. Thousands of papers and talks include phylogeny figures, and often during everyday work, one would like to quickly get a graphical display of, e.g., the phylogenetic relationship between a set of sequences as calculated by an alignment program such as ClustalW or the phylogenetic package Phylip. A wealth of software tools capable of tree drawing exists; most are comprehensive packages that also perform various types of analysis, and hence they are available only for download and installing. Some online tools exist, too. This paper presents an online tool, PHY.FI, which encompasses all the qualities of existing online programs and adds functionality to hopefully eliminate the need for post-processing the phylogeny figure in some other general-purpose graphics program. PHY.FI is versatile, easy-to-use and fast, and supports comprehensive graphical control, several download image formats, and the possibility of dynamically collapsing groups of nodes into named subtrees (e.g. "Primates"). The user can create a color figure from any phylogeny, or other kind of tree, represented in the widely used parenthesized Newick format. PHY.FI is fast and easy to use, yet still offers full color control, tree manipulation, and several image formats. It does not require any downloading and installing, and thus any internet user regardless of computer skills, and computer platform, can benefit from it. PHY.FI is free for all and is available from this web address: http://cgi-www.daimi.au.dk/cgi-chili/phyfi/go.
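For readers unfamiliar with the parenthesized Newick format mentioned here, the following minimal sketch parses a Newick string into nested `(label, children)` tuples and lists its leaves. It assumes unquoted labels without comments, simply discards branch lengths, and is unrelated to the PHY.FI code; the function names and the example tree are hypothetical.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested (label, children) tuples."""
    text = text.strip().rstrip(";")
    pos = 0

    def node():
        nonlocal pos
        children = []
        if pos < len(text) and text[pos] == "(":
            pos += 1                                   # consume "("
            children.append(node())
            while pos < len(text) and text[pos] == ",":
                pos += 1                               # consume ","
                children.append(node())
            if pos >= len(text) or text[pos] != ")":
                raise ValueError("unbalanced parentheses in Newick string")
            pos += 1                                   # consume ")"
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        # Keep the label and drop an optional ":branch_length" suffix.
        label = text[start:pos].split(":")[0].strip()
        return (label, children)

    root = node()
    if pos != len(text):
        raise ValueError("unexpected trailing characters in Newick string")
    return root


def leaf_names(tree):
    """Return the leaf labels of a parsed tree in left-to-right order."""
    label, children = tree
    if not children:
        return [label]
    names = []
    for child in children:
        names.extend(leaf_names(child))
    return names


# A hypothetical tree with one named internal node.
tree = parse_newick("((Human:0.1,Chimp:0.1)Primates:0.3,Mouse:0.5);")
print(leaf_names(tree))
```

Named internal nodes, such as `Primates` in the example, are the hooks a drawing tool could use to collapse a group of leaves into a single labelled subtree.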
Article
Full-text available
RAxML-VI-HPC (randomized axelerated maximum likelihood for high performance computing) is a sequential and parallel program for inference of large phylogenies with maximum likelihood (ML). Low-level technical optimizations, a modification of the search algorithm, and the use of the GTR+CAT approximation as replacement for GTR+Gamma yield a program that is between 2.7 and 52 times faster than the previous version of RAxML. A large-scale performance comparison with GARLI, PHYML, IQPNNI and MrBayes on real data containing 1000 up to 6722 taxa shows that RAxML requires at least 5.6 times less main memory and yields better trees in similar times than the best competing program (GARLI) on datasets up to 2500 taxa. On datasets ≥4000 taxa it also runs 2-3 times faster than GARLI. RAxML has been parallelized with MPI to conduct parallel multiple bootstraps and inferences on distinct starting trees. The program has been used to compute ML trees on two of the largest alignments to date containing 25,057 (1463 bp) and 2182 (51,089 bp) taxa, respectively. Availability: icwww.epfl.ch/~stamatak
Article
Full-text available
Phylemon is an online platform for phylogenetic and evolutionary analyses of molecular sequence data. It has been developed as a web server that integrates a suite of different tools selected among the most popular stand-alone programs in phylogenetic and evolutionary analysis. It has been conceived as a natural response to the increasing demand of data analysis of many experimental scientists wishing to add a molecular evolution and phylogenetics insight into their research. Tools included in Phylemon cover a wide yet selected range of programs: from the most basic for multiple sequence alignment to elaborate statistical methods of phylogenetic reconstruction including methods for evolutionary rates analyses and molecular adaptation. Phylemon has several features that differentiate it from other resources: (i) it offers an integrated environment that enables the direct concatenation of evolutionary analyses, the storage of results and handles required data format conversions, (ii) once an outfile is produced, Phylemon suggests the next possible analyses, thus guiding the user and facilitating the integration of multi-step analyses, and (iii) users can define and save complete pipelines for specific phylogenetic analysis to be automatically used on many genes in subsequent sessions or multiple genes in a single session (phylogenomics). The Phylemon web server is available at http://phylemon.bioinfo.cipf.es.
Article
Full-text available
With the number of fully sequenced genomes increasing steadily, there is greater interest in performing large-scale phylogenomic analyses from large numbers of individual gene families. Maximum likelihood (ML) has been shown repeatedly to be one of the most accurate methods for phylogenetic construction. Recently, there have been a number of algorithmic improvements in maximum-likelihood-based tree search methods. However, it can still take a long time to analyse the evolutionary history of many gene families using a single computer. Distributed computing refers to a method of combining the computing power of multiple computers in order to perform some larger overall calculation. In this article, we present the first high-throughput implementation of a distributed phylogenetics platform, MultiPhyl, capable of using the idle computational resources of many heterogeneous non-dedicated machines to form a phylogenetics supercomputer. MultiPhyl allows a user to upload hundreds or thousands of amino acid or nucleotide alignments simultaneously and perform computationally intensive tasks such as model selection, tree searching and bootstrapping of each of the alignments using many desktop machines. The program implements a set of 88 amino acid models and 56 nucleotide maximum likelihood models and a variety of statistical methods for choosing between alternative models. A MultiPhyl webserver is available for public use at: http://www.cs.nuim.ie/distributed/multiphyl.php.
Article
Full-text available
Even when the maximum likelihood (ML) tree is a better estimate of the true phylogenetic tree than those produced by other methods, the result of a poor ML search may be no better than that of a more thorough search under some faster criterion. The ability to find the globally optimal ML tree is therefore important. Here, I compare a range of heuristic search strategies (and their associated computer programs) in terms of their success at locating the ML tree for 20 empirical data sets with 14 to 158 sequences and 411 to 120,762 aligned nucleotides. Three distinct topics are discussed: the success of the search strategies in relation to certain features of the data, the generation of starting trees for the search, and the exploration of multiple islands of trees. As a starting tree, there was little difference among the neighbor-joining tree based on absolute differences (including the BioNJ tree), the stepwise-addition parsimony tree (with or without nearest-neighbor-interchange (NNI) branch swapping), and the stepwise-addition ML tree. The latter produced the best ML score on average but was orders of magnitude slower than the alternatives. The BioNJ tree was second best on average. As search strategies, star decomposition and quartet puzzling were the slowest and produced the worst ML scores. The DPRml, IQPNNI, MultiPhyl, PhyML, PhyNav, and TreeFinder programs with default options produced qualitatively similar results, each locating a single tree that tended to be in an NNI suboptimum (rather than the global optimum) when the data set had low phylogenetic information. For such data sets, there were multiple tree islands with very similar ML scores. The likelihood surface only became relatively simple for data sets that contained approximately 500 aligned nucleotides for 50 sequences and 3,000 nucleotides for 100 sequences. The RAxML and GARLI programs allowed multiple islands to be explored easily, but both programs also tended to find NNI suboptima. 
A newly developed version of the likelihood ratchet using PAUP* successfully found the peaks of multiple islands, but its speed needs to be improved.
Article
Full-text available
In order to have confidence in model-based phylogenetic analysis, the model of nucleotide substitution adopted must be selected in a statistically rigorous manner. Several model-selection methods are applicable to maximum likelihood (ML) analysis, including the hierarchical likelihood-ratio test (hLRT), Akaike information criterion (AIC), Bayesian information criterion (BIC), and decision theory (DT), but their performance relative to empirical data has not been investigated thoroughly. In this study, we use 250 phylogenetic data sets obtained from TreeBASE to examine the effects that choice in model selection has on ML estimation of phylogeny, with an emphasis on optimal topology, bootstrap support, and hypothesis testing. We show that the use of different methods leads to the selection of two or more models for approximately 80% of the data sets and that the AIC typically selects more complex models than alternative approaches. Although ML estimation with different best-fit models results in incongruent tree topologies approximately 50% of the time, these differences are primarily attributable to alternative resolutions of poorly supported nodes. Furthermore, topologies and bootstrap values estimated with ML using alternative statistically supported models are more similar to each other than to topologies and bootstrap values estimated with ML under the Kimura two-parameter (K2P) model or maximum parsimony (MP). In addition, Swofford-Olsen-Waddell-Hillis (SOWH) tests indicate that ML trees estimated with alternative best-fit models are usually not significantly different from each other when evaluated with the same model. However, ML trees estimated with statistically supported models are often significantly suboptimal to ML trees made with the K2P model when both are evaluated with K2P, indicating that not all models perform in an equivalent manner. 
Nevertheless, the use of alternative statistically supported models generally does not affect tests of monophyletic relationships under either the Shimodaira-Hasegawa (S-H) or SOWH methods. Our results suggest that although choice in model selection has a strong impact on optimal tree topology, it rarely affects evolutionary inferences drawn from the data because differences are mainly confined to poorly supported nodes. Moreover, since ML with alternative best-fit models tends to produce more similar estimates of phylogeny than ML under the K2P model or MP, the use of any statistically based model-selection method is vastly preferable to forgoing the model-selection process altogether.
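The two information criteria named in this abstract can be illustrated with a minimal sketch. It uses the standard definitions AIC = 2k - 2 lnL and BIC = k ln(n) - 2 lnL. The log-likelihood values, the alignment length, and the choice to count only substitution-model parameters are made-up assumptions for illustration and are not taken from the study.

```python
import math

def aic(lnl, k):
    """Akaike information criterion: 2k - 2 lnL (lower is better)."""
    return 2 * k - 2 * lnl

def bic(lnl, k, n):
    """Bayesian information criterion: k ln(n) - 2 lnL (lower is better)."""
    return k * math.log(n) - 2 * lnl

# Hypothetical toy values: (log-likelihood, number of free model parameters).
# Branch-length parameters are ignored here, which is a simplifying assumption.
candidates = {
    "K2P": (-5200.0, 1),
    "HKY": (-5190.0, 4),
    "GTR": (-5185.0, 8),
}
n_sites = 1000  # assumed alignment length, used as the sample size for BIC

best_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_bic = min(candidates, key=lambda m: bic(*candidates[m], n_sites))
print("AIC selects:", best_aic)
print("BIC selects:", best_bic)
```

With these toy numbers the two criteria disagree, with AIC favouring the more parameter-rich model, which is the kind of pattern the abstract reports for AIC on empirical data.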
Article
Full-text available
Long-held ideas regarding the evolutionary relationships among animals have recently been upended by sometimes controversial hypotheses based largely on insights from molecular data. These new hypotheses include a clade of moulting animals (Ecdysozoa) and the close relationship of the lophophorates to molluscs and annelids (Lophotrochozoa). Many relationships remain disputed, including those that are required to polarize key features of character evolution, and support for deep nodes is often low. Phylogenomic approaches, which use data from many genes, have shown promise for resolving deep animal relationships, but are hindered by a lack of data from many important groups. Here we report a total of 39.9 Mb of expressed sequence tags from 29 animals belonging to 21 phyla, including 11 phyla previously lacking genomic or expressed-sequence-tag data. Analysed in combination with existing sequences, our data reinforce several previously identified clades that split deeply in the animal tree (including Protostomia, Ecdysozoa and Lophotrochozoa), unambiguously resolve multiple long-standing issues for which there was strong conflicting support in earlier studies with less data (such as velvet worms rather than tardigrades as the sister group of arthropods), and provide molecular support for the monophyly of molluscs, a group long recognized by morphologists. In addition, we find strong support for several new hypotheses. These include a clade that unites annelids (including sipunculans and echiurans) with nemerteans, phoronids and brachiopods, molluscs as sister to that assemblage, and the placement of ctenophores as the earliest diverging extant multicellular animals. A single origin of spiral cleavage (with subsequent losses) is inferred from well-supported nodes. Many relationships between a stable subset of taxa find strong support, and a diminishing number of lineages remain recalcitrant to placement on the tree.
Article
The recently-developed statistical method known as the "bootstrap" can be used to place confidence intervals on phylogenies. It involves resampling points from one's own data, with replacement, to create a series of bootstrap samples of the same size as the original data. Each of these is analyzed, and the variation among the resulting estimates taken to indicate the size of the error involved in making estimates from the original data. In the case of phylogenies, it is argued that the proper method of resampling is to keep all of the original species while sampling characters with replacement, under the assumption that the characters have been independently drawn by the systematist and have evolved independently. Majority-rule consensus trees can be used to construct a phylogeny showing all of the inferred monophyletic groups that occurred in a majority of the bootstrap samples. If a group shows up 95% of the time or more, the evidence for it is taken to be statistically significant. Existing computer programs can be used to analyze different bootstrap samples by using weights on the characters, the weight of a character being how many times it was drawn in bootstrap sampling. When all characters are perfectly compatible, as envisioned by Hennig, bootstrap sampling becomes unnecessary; the bootstrap method would show significant evidence for a group if it is defined by three or more characters.
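The character-weighting idea described above can be sketched in a few lines. This is a minimal illustration, assuming characters are identified by index and that each replicate's tree has already been reduced to a set of monophyletic groups; the function names `bootstrap_weights` and `majority_rule` are hypothetical and do not come from any existing program.

```python
import random
from collections import Counter

def bootstrap_weights(n_chars, rng):
    """Draw n_chars characters with replacement and return, for each
    original character, how many times it was drawn (its weight)."""
    draws = Counter(rng.randrange(n_chars) for _ in range(n_chars))
    return [draws.get(i, 0) for i in range(n_chars)]

def majority_rule(replicate_groups, threshold=0.5):
    """Return the groups found in more than `threshold` of the replicates,
    mapped to the fraction of replicates in which they occur."""
    n = len(replicate_groups)
    counts = Counter(g for groups in replicate_groups for g in set(groups))
    return {g: c / n for g, c in counts.items() if c / n > threshold}

rng = random.Random(42)
weights = bootstrap_weights(10, rng)  # one bootstrap sample as character weights

# Toy replicates (assumed): each is the set of groups found in one bootstrap tree.
replicates = [
    {frozenset("AB")},
    {frozenset("AB"), frozenset("CD")},
    {frozenset("AB")},
    {frozenset("CD")},
]
support = majority_rule(replicates)
```

The weights always sum to the number of characters, so a program that accepts per-character weights analyzes a sample of the same size as the original data. In the toy replicates, the group {A, B} occurs in three of four replicates and is kept, while {C, D} occurs in exactly half and is dropped.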
Book
We studied sequence variation in 16S rDNA in 204 individuals from 37 populations of the land snail Candidula unifasciata (Poiret 1801) across the core species range in France, Switzerland, and Germany. Phylogeographic, nested clade, and coalescence analyses were used to elucidate the species' evolutionary history. The study revealed the presence of two major evolutionary lineages that evolved in separate refuges in southeast France as a result of previous fragmentation during the Pleistocene. Applying a recent extension of the nested clade analysis (Templeton 2001), we inferred that range expansions along river valleys in independent corridors to the north led eventually to a secondary contact zone of the major clades around the Geneva Basin. There is evidence supporting the idea that the formation of the secondary contact zone and the colonization of Germany might be postglacial events. The phylogeographic history inferred for C. unifasciata differs from general biogeographic patterns of postglacial colonization previously identified for other taxa, and it might represent a common model for species with restricted dispersal.
Article
Evolutionary trees sit at the core of all realistic models describing a set of related sequences, including alignment, homology search, ancestral protein reconstruction and 2D/3D structural change. It is important to assess the stochastic error when estimating a tree, including models using the most realistic likelihood-based optimizations, yet computation times may be many days or weeks. If so, the bootstrap is computationally prohibitive. Here we show that the extremely fast "resampling of estimated log likelihoods" or RELL method behaves well under more general circumstances than previously examined. RELL approximates the bootstrap (BP) proportions of trees better than some bootstrap methods that rely on fast heuristics to search the tree space. The BIC approximation of the Bayesian posterior probability (BPP) of trees is made more accurate by including an additional term related to the determinant of the information matrix (which may also be obtained as a product of gradient or score vectors). Such estimates are shown to be very close to MCMC chain values. Our analysis of mammalian mitochondrial amino acid sequences suggests that when model breakdown occurs, as it typically does for sequences separated by more than a few million years, the BPP values are far too peaked and the real fluctuations in the likelihood of the data are many times larger than expected. Accordingly, several ways to incorporate the bootstrap and other types of direct resampling with MCMC procedures are outlined. Genes evolve by a process which involves some sites following a tree close to, but not identical with, the species tree. It is seen that under such a likelihood model BP (bootstrap proportions) and BPP estimates may still be reasonable estimates of the species tree. Since many of the methods studied are very fast computationally, there is no reason to ignore stochastic error even with the slowest ML or likelihood based methods.
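A minimal sketch of the RELL idea, as it is generally understood: per-site log-likelihoods are computed once for each candidate tree, and the bootstrap then resamples those stored values instead of re-running a tree search. The function name `rell_proportions`, the input layout, and the toy numbers are assumptions for illustration, not the authors' implementation.

```python
import random

def rell_proportions(site_lnl, n_reps=1000, seed=0):
    """Approximate bootstrap proportions by resampling stored site log-likelihoods.

    site_lnl[t][s] is the log-likelihood of site s under candidate tree t.
    Returns, for each tree, the fraction of replicates in which it scored best.
    """
    rng = random.Random(seed)
    n_trees = len(site_lnl)
    n_sites = len(site_lnl[0])
    wins = [0] * n_trees
    for _ in range(n_reps):
        # Resample site indices with replacement, once per replicate.
        sample = [rng.randrange(n_sites) for _ in range(n_sites)]
        # Sum the stored per-site values for each tree on the same sample.
        totals = [sum(tree[s] for s in sample) for tree in site_lnl]
        best = max(range(n_trees), key=totals.__getitem__)
        wins[best] += 1
    return [w / n_reps for w in wins]

# Toy input (assumed): tree 0 is better at every site, so it wins every replicate.
toy = [
    [-1.0, -1.2, -0.9],
    [-2.0, -2.2, -1.9],
]
props = rell_proportions(toy, n_reps=200)
```

Because no likelihood is recomputed inside the loop, each replicate costs only a sum over sites per tree, which is why the method is so much faster than a full bootstrap.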
Article
A metric on general phylogenetic trees is presented. This extends the work of most previous authors, who constructed metrics for binary trees. The metric presented in this paper makes possible the comparison of the many nonbinary phylogenetic trees appearing in the literature. This provides an objective procedure for comparing the different methods for constructing phylogenetic trees. The metric is based on elementary operations which transform one tree into another. Various results obtained in applying these operations are given. They enable the distance between any pair of trees to be calculated efficiently. This generalizes previous work by Bourque to the case where interior vertices can be labeled, and labels may contain more than one element or may be empty.
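A distance of this kind is commonly computed by counting the bipartitions (splits) that occur in one tree but not the other. The sketch below assumes each tree has already been reduced to its set of non-trivial splits over a shared set of leaf labels; it shows only the split-counting form for leaf-labelled trees, not the elementary-operation construction or the labelled-interior-vertex case described in the abstract. The function names are hypothetical.

```python
def normalize_split(side, taxa):
    """Return a canonical form of a bipartition: the side that does not
    contain a fixed reference taxon (the smallest label)."""
    side = frozenset(side)
    ref = min(taxa)
    return frozenset(taxa) - side if ref in side else side

def split_distance(splits_a, splits_b, taxa):
    """Count the splits present in exactly one of the two trees."""
    a = {normalize_split(s, taxa) for s in splits_a}
    b = {normalize_split(s, taxa) for s in splits_b}
    return len(a ^ b)

taxa = set("ABCDE")
tree1 = [{"A", "B"}, {"D", "E"}]  # splits of ((A,B),C,(D,E))
tree2 = [{"A", "C"}, {"D", "E"}]  # splits of ((A,C),B,(D,E))
d = split_distance(tree1, tree2, taxa)
```

Normalizing each split to one canonical side means that a bipartition is recognized as the same split regardless of which of its two sides was listed.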
Article
The application of maximum likelihood techniques to the estimation of evolutionary trees from nucleic acid sequence data is discussed. A computationally feasible method for finding such maximum likelihood estimates is developed, and a computer program is available. This method has advantages over the traditional parsimony algorithms, which can give misleading results if rates of evolution differ in different lineages. It also allows the testing of hypotheses about the constancy of evolutionary rates by likelihood ratio tests, and gives a rough indication of the error of the estimate of the tree.
Article
The increase in the number of large data sets and the complexity of current probabilistic sequence evolution models necessitates fast and reliable phylogeny reconstruction methods. We describe a new approach, based on the maximum-likelihood principle, which clearly satisfies these requirements. The core of this method is a simple hill-climbing algorithm that adjusts tree topology and branch lengths simultaneously. This algorithm starts from an initial tree built by a fast distance-based method and modifies this tree to improve its likelihood at each iteration. Due to this simultaneous adjustment of the topology and branch lengths, only a few iterations are sufficient to reach an optimum. We used extensive and realistic computer simulations to show that the topological accuracy of this new method is at least as high as that of the existing maximum-likelihood programs and much higher than the performance of distance-based and parsimony approaches. The reduction of computing time is dramatic in comparison with other maximum-likelihood packages, while the likelihood maximization ability tends to be higher. For example, only 12 min were required on a standard personal computer to analyze a data set consisting of 500 rbcL sequences with 1,428 base pairs from plant plastids, thus reaching a speed of the same order as some popular distance-based and parsimony algorithms. This new method is implemented in the PHYML program, which is freely available on our web page: http://www.lirmm.fr/w3ifa/MAAS/.
Article
Maximum likelihood (ML) methods have become very popular for constructing phylogenetic trees from sequence data. However, despite noticeable recent progress, with large and difficult datasets (e.g. multiple genes with conflicting signals) current ML programs still require huge computing time and can become trapped in bad local optima of the likelihood function. When this occurs, the resulting trees may still show some of the defects (e.g. long branch attraction) of starting trees obtained using fast distance or parsimony programs. Subtree pruning and regrafting (SPR) topological rearrangements are usually sufficient to intensively search the tree space. Here, we propose two new methods to make SPR moves more efficient. The first method uses a fast distance-based approach to detect the least promising candidate SPR moves, which are then simply discarded. The second method locally estimates the change in likelihood for any remaining potential SPRs, as opposed to globally evaluating the entire tree for each possible move. These two methods are implemented in a new algorithm with a sophisticated filtering strategy, which efficiently selects potential SPRs and concentrates most of the likelihood computation on the promising moves. Experiments with real datasets comprising 35-250 taxa show that, while indeed greatly reducing the amount of computation, our approach provides likelihood values at least as good as those of the best-known ML methods so far and is very robust to poor starting trees. Furthermore, combining our new SPR algorithm with local moves such as PHYML's nearest neighbor interchanges, the time needed to find good solutions can sometimes be reduced even more.
Article
A comprehensive phylogeny of papilionoid legumes was inferred from sequences of 2228 taxa in GenBank release 147. A semiautomated analysis pipeline was constructed to download, parse, assemble, align, combine, and build trees from a pool of 11,881 sequences. Initial steps included all-against-all BLAST similarity searches coupled with assembly, using a novel strategy for building length-homogeneous primary sequence clusters. This was followed by a combination of global and local alignment protocols to build larger secondary clusters of locally aligned sequences, thus taking into account the dramatic differences in length of the heterogeneous coding and noncoding sequence data present in GenBank. Next, clusters were checked for the presence of duplicate genes and other potentially misleading sequences and examined for combinability with other clusters on the basis of taxon overlap. Finally, two supermatrices were constructed: a "sparse" matrix based on the primary clusters alone (1794 taxa x 53,977 characters), and a somewhat more "dense" matrix based on the secondary clusters (2228 taxa x 33,168 characters). Both matrices were very sparse, with 95% of their cells containing gaps or question marks. These were subjected to extensive heuristic parsimony analyses using deterministic and stochastic heuristics, including bootstrap analyses. A "reduced consensus" bootstrap analysis was also performed to detect cryptic signal in a subtree of the data set corresponding to a "backbone" phylogeny proposed in previous studies. Overall, the dense supermatrix appeared to provide much more satisfying results, indicated by better resolution of the bootstrap tree, excellent agreement with the backbone papilionoid tree in the reduced bootstrap consensus analysis, few problematic large polytomies in the strict consensus, and less fragmentation of conventionally recognized genera. Nevertheless, at lower taxonomic levels several problems were identified and diagnosed. 
A large number of methodological issues in supermatrix construction at this scale are discussed, including detection of annotation errors in GenBank sequences; the shortage of effective algorithms and software for local multiple sequence alignment; the difficulty of overcoming effects of fragmentation of data into nearly disjoint blocks in sparse supermatrices; and the lack of informative tools to assess confidence limits in very large trees.
Article
Phylogenetic tree estimation plays a critical role in a wide variety of molecular studies, including molecular systematics, phylogenetics, and comparative genomics. Finding the optimal tree relating a set of sequences using score-based (optimality criterion) methods, such as maximum likelihood and maximum parsimony, may require all possible trees to be considered, which is not feasible even for modest numbers of sequences. In practice, trees are estimated using heuristics that represent a trade-off between topological accuracy and speed. I present a series of novel algorithms suitable for score-based phylogenetic tree reconstruction that demonstrably improve the accuracy of tree estimates while maintaining high computational speeds. The heuristics function by allowing the efficient exploration of large numbers of trees through novel hill-climbing and resampling strategies. These heuristics, and other computational approximations, are implemented for maximum likelihood estimation of trees in the program Leaphy, and its performance is compared to other popular phylogenetic programs. Trees are estimated from 4059 different protein alignments using a selection of phylogenetic programs and the likelihoods of the tree estimates are compared. Trees estimated using Leaphy are found to have equal to or better likelihoods than trees estimated using other phylogenetic programs in 4004 (98.6%) families and provide a unique best tree that no other program found in 1102 (27.1%) families. The improvement is particularly marked for larger families (80 to 100 sequences), where Leaphy finds a unique best tree in 81.7% of families.
Article
If the memory latency remains unchanged, the number of cycles of processor idle time is doubled with each doubling of speed of the processor. A factor of four will bring us to about 500 clock cycles. 1 Hiding the Memory Gap As the speed of processors increases, the memory gap increases. On the other hand, shrinkage enables L2 caches to increase in size and, to some extent, this balances out the effect of the increasing memory gap. However, there are reasons to believe that non-cacheable problems are increasing in importance. On the scientific side there are 3D simulations and similar applications. Again, many database servers used in transaction processing rarely, if ever, manage to establish a working set that will fit entirely within a cache. It must be accepted that, however large the cache memory is made, there will be plenty of problems that defeat it. For such problems the cache actually gets in the way and slows down the running of the program. Readers with a long memory will recall that the CRAY 1, which was designed specifically for s