Article

A phylogenomic analysis of the Ascomycota

November 2006
Fungal Genetics and Biology 43(10):715-25

DOI:10.1016/j.fgb.2006.05.001

Source
PubMed

Authors:

Barbara Robbertse

National Library of Medicine

Conrad L Schoch

National Library of Medicine

Joseph W Spatafora

Oregon State University

An automated procedure was developed to extract orthologous sequences from fungal genomes and incorporate them into phylogenomic analyses in a timely and efficient manner. The approach parses an all-versus-all BLASTP search of 17 proteomes and builds a similarity matrix from the e-values, which is then used to cluster proteins into related groups with a Markov Clustering algorithm. After this analysis was performed at different stringency levels, 854 single-copy protein clusters that were present in all 17 proteomes were identified. These clusters were culled to include only those in which all proteins had best hits to, and received hits from, a protein within the same cluster. The final data set included gapless alignments for 781 clusters of orthologous sequences, which were concatenated into one super alignment containing 195,664 amino acid characters. Neighbor-joining distance and maximum likelihood analyses resulted in identical topologies, and all nodes except one received 100% bootstrap support. The node supporting the position of Stagonospora nodorum received 83% support or higher; S. nodorum was also the only taxon differentially resolved in the maximum parsimony analyses. All analyses resolved the two derived subphyla Pezizomycotina and Saccharomycotina, and resolved Schizosaccharomyces pombe as an early diverging lineage of the Ascomycota. Importantly, these analyses resolved the Leotiomycetes as the sister group to the Sordariomycetes, a region of the Ascomycota phylogeny that has remained problematic in molecular phylogenetic studies with more limited character sampling. Additional phylogenetic analyses, which included orthologous sequences from an unannotated ascomycotan genome (e.g., Coccidioides immitis) and subsets of orthologs with different characteristics, supported this topology. Phylogenetic analyses of the 595 orthologs that included C. immitis resulted in a topology identical to that of the 781-ortholog analysis and correctly placed C. immitis in the Eurotiomycetes. This demonstrated the correct identification of orthologs and the ability to incorporate unannotated genomic data into a common phylogenetic analysis.
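The cluster-filtering steps described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the `species_of` mapping, and the `best_hits` structure are hypothetical, and the culling rule is a simplified reading of "best hits to and received hits from a protein within the same cluster". The clusters are assumed to come from a prior Markov Clustering run on the BLASTP similarity matrix.

```python
from collections import Counter


def single_copy_ubiquitous(clusters, species_of, all_species):
    """Keep clusters holding exactly one protein from every proteome.

    clusters    : list of lists of protein IDs (e.g. parsed clustering output)
    species_of  : dict mapping protein ID -> species name (hypothetical input)
    all_species : iterable of every species in the analysis
    """
    wanted = set(all_species)
    kept = []
    for cluster in clusters:
        counts = Counter(species_of[p] for p in cluster)
        if set(counts) == wanted and all(n == 1 for n in counts.values()):
            kept.append(cluster)
    return kept


def best_hits_closed(cluster, species_of, best_hits):
    """Return True if every member's best hit in each other proteome is
    itself a member of the cluster (simplified culling criterion).

    best_hits : dict mapping protein ID -> {species: best-hit protein ID}
    """
    members = set(cluster)
    species_in_cluster = {species_of[p] for p in cluster}
    for protein in cluster:
        for species in species_in_cluster - {species_of[protein]}:
            if best_hits.get(protein, {}).get(species) not in members:
                return False
    return True
```

Because a single-copy cluster has one member per species, requiring every member's best hits to stay inside the cluster also guarantees that each member receives a best hit from every other member, so one check covers both directions of the criterion.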

The Ascomycota Tree of Life: A phylum-wide phylogeny clarifies the origin and evolution of fundamental reproductive and ecological traits

Article

Full-text available

Mar 2010

We present a 6-gene, 420-species maximum-likelihood phylogeny of Ascomycota, the largest phylum of Fungi. This analysis is the most taxonomically complete to date with species sampled from all 15 currently circumscribed classes. A number of superclass-level nodes that have previously evaded resolution and were unnamed in classifications of the Fungi are resolved for the first time. Based on the 6-gene phylogeny we conducted a phylogenetic informativeness analysis of all 6 genes and a series of ancestral character state reconstructions that focused on morphology of sporocarps, ascus dehiscence, and evolution of nutritional modes and ecologies. A gene-by-gene assessment of phylogenetic informativeness yielded higher levels of informativeness for protein genes (RPB1, RPB2, and TEF1) as compared with the ribosomal genes, which have been the standard bearer in fungal systematics. Our reconstruction of sporocarp characters is consistent with 2 origins for multicellular sexual reproductive structures in Ascomycota, once in the common ancestor of Pezizomycotina and once in the common ancestor of Neolectomycetes. This first report of dual origins of ascomycete sporocarps highlights the complicated nature of assessing homology of morphological traits across Fungi. Furthermore, ancestral reconstruction supports an open sporocarp with an exposed hymenium (apothecium) as the primitive morphology for Pezizomycotina with multiple derivations of the partially (perithecia) or completely enclosed (cleistothecia) sporocarps. Ascus dehiscence is most informative at the class level within Pezizomycotina with most superclass nodes reconstructed equivocally. Character-state reconstructions support a terrestrial, saprobic ecology as ancestral. In contrast to previous studies, these analyses support multiple origins of lichenization events with the loss of lichenization as less frequent and limited to terminal, closely related species.

Fungal phylogenomics. A global analysis of fungal genomes and their evolution

Article

Oct 2010

Marina Marcet-Houben

Phylogenomic analysis uncovers the evolutionary history of nutrition and infection mode in rice blast fungus and other Magnaporthales

Article

Full-text available

Mar 2015

The order Magnaporthales (Ascomycota, Fungi) includes devastating pathogens of cereals, such as the rice blast fungus Pyricularia (Magnaporthe) oryzae, which is a model in host-pathogen interaction studies. Magnaporthales also includes saprotrophic species associated with grass roots and submerged wood. Despite its scientific and economic importance, the phylogenetic position of Magnaporthales within Sordariomycetes and the interrelationships of its constituent taxa remain controversial. In this study, we generated novel transcriptome data from 21 taxa that represent key Magnaporthales lineages of different infection and nutrition modes and phenotypes. Phylogenomic analysis of >200 conserved genes allowed the reconstruction of a robust Sordariomycetes tree of life that placed the monophyletic group of Magnaporthales sister to Ophiostomatales. Among Magnaporthales, three major clades were recognized: 1) an early diverging clade A comprising saprotrophs associated with submerged wood; 2) clade B, which includes the rice blast fungus and other pathogens that cause blast diseases of monocot plants and infect the above-ground tissues of host plants using a penetration structure, the appressorium; and 3) clade C, comprising primarily root-associated species that penetrate the root tissue with hyphopodia. The well-supported phylogenies provide a robust framework for elucidating the evolution of pathogenesis, nutrition modes, and phenotypic characters in Magnaporthales.

Phylogenetic relationships and diversity of Guignardia spp isolated from different hosts on ITS1-5,8S-ITS2 region

Article

Full-text available

Jun 2009
REV BRAS FRUTIC

Fungi of the genus Guignardia are commonly isolated from different plant species and are most often characterized as endophytes. However, some species of this genus, such as G. citricarpa and G. psidii, are known as causal agents of serious crop diseases, namely citrus black spot and guava fruit rot, respectively. They are also responsible for leaf spot diseases in different fruit species and in other crops. This work aimed to isolate, identify, and characterize the genetic diversity among Guignardia isolates obtained from citrus, mango, guava, eucalyptus, Brazilian grape tree, and Surinam cherry by analysis of the DNA sequence of the ITS1-5.8S-ITS2 cistron. The isolates obtained were found to belong to the species G. mangiferae and G. citricarpa. Two different Guignardia types found in mango could not be identified to species level and do not belong to any of the species deposited in GenBank. Thus, this work found that mango, guava, eucalyptus, Brazilian grape tree, and Surinam cherry host only G. mangiferae among the identified species, whereas citrus hosts both G. mangiferae and G. citricarpa. Mango hosts three different Guignardia types: G. mangiferae and two others that remain unidentified at the species level. Isolates of Guignardia obtained from guava fruit rot symptoms were also identified as G. mangiferae.

Toward genome-enabled mycology

Article

Aug 2013

Genome-enabled mycology is a rapidly expanding field that is characterized by the pervasive use of genome-scale data and associated computational tools in all aspects of fungal biology. Genome-enabled mycology is integrative and often requires teams of researchers with diverse skills in organismal mycology, bioinformatics and molecular biology. This issue of Mycologia presents the first complete fungal genomes in the history of the journal, reflecting the ongoing transformation of mycology into a genome-enabled science. Here, we consider the prospects for genome-enabled mycology and the technical and social challenges that will need to be overcome to grow the database of complete fungal genomes and enable all fungal biologists to make use of the new data.

Phylogenetic relationships and diversity of Guignardia spp. isolates from different hosts in the ITS1-5.8S-ITS2 regions

Article

Full-text available

Jun 2009
REV BRAS FRUTIC

Fungi of the genus Guignardia are frequently isolated from different plant species and are often characterized as endophytic fungi. However, some species of this genus, such as G. citricarpa and G. psidii, cause important diseases that affect agricultural crops, such as citrus black spot (Mancha-Preta dos Citros, MPC) and guava fruit rot, respectively. They are also reported to cause leaf spots on different fruit species and on other crops. This work aimed to isolate, identify, and characterize the genetic diversity among Guignardia isolates from citrus, mango, guava, eucalyptus, jabuticaba, and Surinam cherry by analyzing the DNA sequence of the ITS1-5.8S-ITS2 cistron. The isolates obtained were found to belong to the species G. citricarpa and G. mangiferae. However, two groups found in mango could not be identified to species level from their DNA sequence because of low similarity to the sequences of different Guignardia species already deposited in databases. Thus, guava, eucalyptus, jabuticaba, and Surinam cherry are hosts of G. mangiferae, whereas citrus hosts two forms, G. citricarpa and G. mangiferae. Mango hosts G. mangiferae and two other groups that have not yet been identified. Isolates of Guignardia obtained from guava fruit rot symptoms were also identified as G. mangiferae.

Article

Jul 2018
HARMFUL ALGAE

In order to better understand the relationships among current Nostocales cyanobacterial blooms, eight genomes were sequenced from cultured isolates or from environmental metagenomes of recent planktonic Nostocales blooms. Phylogenomic analysis of publicly available sequences placed the new genomes among a group of 15 genomes from four continents in a distinct ADA clade (Anabaena/Dolichospermum/Aphanizomenon) within the Nostocales. This clade contains four species-level groups, two of which include members with both Anabaena-like and Aphanizomenon flos-aquae-like morphology. The genomes contain many repetitive genetic elements and a sizable pangenome, in which ABC-type transporters are highly represented. Alongside common core genes for photosynthesis, the differentiation of N2-fixing heterocysts, and the uptake and incorporation of the major nutrients P, N and S, we identified several gene pathways in the pangenome that may contribute to niche partitioning. Genes for problematic secondary metabolites (cyanotoxins and taste-and-odor compounds) were sporadically present, as were other polyketide synthase (PKS) and nonribosomal peptide synthetase (NRPS) gene clusters. By contrast, genes predicted to encode the ribosomally generated bacteriocin peptides were found in all genomes.

Multiple Approaches to Phylogenomic Reconstruction of the Fungal Kingdom

Chapter

Full-text available

Nov 2017
Adv Genet

Fungi are possibly the most diverse eukaryotic kingdom, with over a million member species and an evolutionary history dating back a billion years. Fungi have been at the forefront of eukaryotic genomics, and owing to initiatives like the 1000 Fungal Genomes Project the amount of fungal genomic data has increased considerably over the last 5 years, enabling large-scale comparative genomics of species across the kingdom. In this chapter, we first review fungal evolution and the history of fungal genomics. We then review in detail seven phylogenomic methods and reconstruct the phylogeny of 84 fungal species from 8 phyla using each method. Six methods have seen extensive use in previous fungal studies, while a Bayesian supertree method is novel to fungal phylogenomics. We find that both established and novel phylogenomic methods can accurately reconstruct the fungal kingdom. Finally, we discuss the accuracy and suitability of each phylogenomic method utilized.

ScienceDirect - Infection, Genetics and Evolution: Rapidly evolving genes in pathogens: Methods for detecting positive selection and examples among fungi, bacteria, viruses and protists

Article

Full-text available

Jan 2009

... Rapidly evolving genes in pathogens : Methods for detecting positive selection and examples among fungi ... with hosts evolving to escape pathogen infection and pathogens evolving to escape ... Genes under positive selection in pathogens have mostly been sought among viruses ...

Construction and annotation of large phylogenetic trees

Article

Michael J. Sanderson

Broad availability of molecular sequence data allows construction of phylogenetic trees with 1000s or even 10 000s of taxa. This paper reviews methodological, technological and empirical issues raised in phylogenetic inference at this scale. Numerous algorithmic and computational challenges have been identified surrounding the core problem of reconstructing large trees accurately from sequence data, but many other obstacles, both upstream and downstream of this step, are less well understood. Before phylogenetic analysis, data must be generated de novo or extracted from existing databases, compiled into blocks of homologous data with controlled properties, aligned, examined for the presence of gene duplications or other kinds of complicating factors, and finally, combined with other evidence via supermatrix or supertree approaches. After phylogenetic analysis, confidence assessments are usually reported, along with other kinds of annotations, such as clade names, or annotations requiring additional inference procedures, such as trait evolution or divergence time estimates. Prospects for partial automation of large-tree construction are also discussed, as well as risks associated with 'outsourcing' phylogenetic inference beyond the systematics community.
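The supermatrix step mentioned in this abstract, in which per-gene alignments are combined into one concatenated data set, can be sketched in Python as follows. This is an illustrative sketch only: the function name `concatenate_alignments`, the dictionary-based alignment representation, and the use of `?` for taxa missing from a gene are assumptions, not details taken from the paper.

```python
def concatenate_alignments(alignments, taxa, missing="?"):
    """Join per-gene alignments into one supermatrix.

    alignments : list of dicts, each mapping taxon -> aligned sequence;
                 all sequences within one dict must share a length
    taxa       : list of every taxon to appear in the supermatrix
    missing    : character used to pad taxa absent from a gene (assumed)
    """
    parts = {taxon: [] for taxon in taxa}
    for alignment in alignments:
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in one alignment must share a length")
        width = lengths.pop()
        for taxon in taxa:
            # Pad with the missing-data symbol when a taxon lacks this gene.
            parts[taxon].append(alignment.get(taxon, missing * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}
```

For example, concatenating `{"A": "MK", "B": "ML"}` with `{"A": "QRS"}` for taxa `A` and `B` gives `A` the sequence `MKQRS` and pads `B` to `ML???`, so every row of the supermatrix has the same length.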

Phylogenetics and phylogenomics to understand fungal diversity

Preprint

Full-text available

Mar 2023

Taxonomy of Pathogenic Yeasts Candida, Cryptococcus, Malassezia, and Trichosporon: Current Status, Future Perspectives, and Proposal for Transfer of Six Candida Species to the Genus Nakaseomyces

Article

Oct 2022

This review describes the changes in yeast species names in the previous decade. Several yeast species have been reclassified to accommodate the “One fungus=One name” (1F=1N) principle of the Code. As the names of medically important yeasts have also been reviewed and revised, details of the genera Candida, Cryptococcus, Malassezia, and Trichosporon are described in Section 3, along with the history of name changes. Since the phylogenetic positions of Candida species in several clades have not been clarified, revision of this species has not been completed. Among the species that remain unrevised despite their importance in the medical field, we propose the transfer of six Candida species to be reclassified in the Nakaseomyces clade, including Nakaseomyces glabratus and Nakaseomyces nivalensis.

Calonectria in the age of genes and genomes: Towards understanding an important but relatively unknown group of pathogens

Article

Full-text available

Mar 2022
MOL PLANT PATHOL

The genus Calonectria includes many aggressive plant pathogens causing diseases on various agricultural crops as well as forestry and ornamental tree species. Some species have been accidentally introduced into new environments via international trade of putatively asymptomatic plant germplasm or contaminated soil, resulting in significant economic losses. This review provides an overview of the taxonomy, population biology, and pathology of Calonectria species, specifically emerging from contemporary studies that have relied on DNA‐based technologies. The growing importance of genomics in future research is highlighted. A life cycle is proposed for Calonectria species, aimed at improving our ability to manage diseases caused by these pathogens. The taxonomy, population biology, pathology, and genomics of Calonectria, an important but relatively unknown group of pathogens of agricultural crops as well as forestry and ornamental trees, are reviewed.

Global Characterization of Fungal Mitogenomes: New Insights on Genomic Diversity and Dynamism of Coding Genes and Accessory Elements

Article

Full-text available

Dec 2021

Fungi comprise a great diversity of species with distinct ecological functions and lifestyles. Similar to other eukaryotes, fungi rely on interactions with prokaryotes, and one of the most important symbiotic events was the acquisition of mitochondria. Mitochondria are organelles found in eukaryotic cells whose main function is to generate energy through aerobic respiration. Mitogenomes (mtDNAs) are double-stranded circular or linear DNA from mitochondria that may contain core genes and accessory elements that can be replicated, transcribed, and translated independently of the nuclear genome. Despite their importance, investigative studies on the diversity of fungal mitogenomes are scarce. Herein, we have evaluated 788 curated fungal mitogenomes available at the NCBI database to assess discrepancies and similarities among them and to better understand the mechanisms involved in the variability of fungal mtDNAs. Of a total of 12 fungal phyla, four do not have any representative with an available mitogenome, which highlights the underrepresentation of some groups in the currently available data. We selected representative and non-redundant mitogenomes based on a threshold of 90% similarity, eliminating 81 mtDNAs. Comparative analyses revealed considerable size variability of mtDNAs, with a difference of up to 260 kb in length. Furthermore, variation in mitogenome length and genomic composition is generally related to the number and length of accessory elements (introns, HEGs, and uORFs). We identified an overall average of 8.0 (0–39) introns, 8.0 (0–100) HEGs, and 8.2 (0–102) uORFs per genome, with high variation among phyla. Even though the length of the core protein-coding genes is considerably conserved, approximately 36.3% of the mitogenomes evaluated have at least one of the 14 core coding genes absent. Also, our results revealed that not even a single gene is shared among all mitogenomes. Other unusual genes, such as dpo and rpo, were also detected in many mitogenomes and displayed diverse evolutionary histories. Altogether, the results presented in this study suggest that fungal mitogenomes are diverse, contain accessory elements, and lack a conserved gene that can be used for the taxonomic classification of the Kingdom Fungi.

Using target enrichment sequencing to study the higher-level phylogeny of the largest lichen-forming fungi family: Parmeliaceae (Ascomycota)

Article

Full-text available

Dec 2020

Parmeliaceae is the largest family of lichen-forming fungi with a worldwide distribution. We used a target enrichment data set and a qualitative selection method for 250 out of 350 genes to infer the phylogeny of the major clades in this family including 81 taxa, with both subfamilies and all seven major clades previously recognized in the subfamily Parmelioideae. The reduced genome-scale data set was analyzed using concatenated-based Bayesian inference and two different Maximum Likelihood analyses, and a coalescent-based species tree method. The resulting topology was strongly supported with the majority of nodes being fully supported in all three concatenated-based analyses. The two subfamilies and each of the seven major clades in Parmelioideae were strongly supported as monophyletic. In addition, most backbone relationships in the topology were recovered with high nodal support. The genus Parmotrema was found to be polyphyletic and consequently, it is suggested to accept the genus Crespoa to accommodate the species previously placed in Parmotrema subgen. Crespoa. This study demonstrates the power of reduced genome-scale data sets to resolve phylogenetic relationships with high support. Due to lower costs, target enrichment methods provide a promising avenue for phylogenetic studies including larger taxonomic/specimen sampling than whole genome data would allow.

Genomic clustering within functionally related gene families in Ascomycota fungi

Article

Full-text available

Oct 2020

Multiple mechanisms collaborate for proper regulation of gene expression. One layer of this regulation is through the clustering of functionally related genes at discrete loci throughout the genome. This phenomenon occurs extensively throughout Ascomycota fungi and is an organizing principle for many gene families whose proteins participate in diverse molecular functions throughout the cell. Members of this phylum include organisms that serve as model systems and those of interest medically, pharmaceutically, and for industrial and biotechnological applications. In this review, we discuss the prevalence of functional clustering through a broad range of organisms within the phylum. Position effects on transcription, genomic locations of clusters, transcriptional regulation of clusters, and selective pressures contributing to the formation and maintenance of clusters are addressed, as are common methods to identify and characterize clusters.

Multiple hidden processes complicate phylogenomic inference of deep Basidiomycota relationships

Preprint

Jul 2017

Resolving deep divergences in the fungal tree of life remains a challenging task even for analyses of genome-scale phylogenetic datasets. Relationships between Basidiomycota subphyla, the rusts (Pucciniomycotina), smuts (Ustilaginomycotina) and mushroom forming fungi (Agaricomycotina) represent a particularly challenging situation that posed problems to both traditional multigene and genome-scale phylogenetic studies. Here, we address basal Basidiomycota relationships using three different phylogenomic datasets, concatenated and gene tree-based analyses and examine the contribution of several potential sources of uncertainty, including fast-evolving sites, putative long-branch taxa, model violation and missing data. We inferred conflicting results with different datasets and under different models. Fast-evolving sites and oversimplified models of amino acid substitution favored the grouping of smuts with mushroom-forming fungi, often leading to maximal bootstrap support in both concatenation and Astral analyses. The most conserved datasets grouped rusts with mushroom forming fungi, although this relationship proved labile, sensitive to model choice, different data subsets and missing data. Excluding putative long branch taxa, genes with the highest proportions of missing data and/or genes with strong signal failed to reveal a consistent trend toward one or the other topology, suggesting that additional sources of conflict are at play too. Our analyses suggest that topologies uniting smuts with mushroom forming fungi can arise as a result of inappropriate modeling of amino acid sites that might be prone to systematic bias. While concatenated analyses yielded strong but conflicting support, individual gene trees mostly provided poor support for rusts, smuts and mushroom-forming fungi, suggesting that the true Basidiomycota tree might be in a part of the tree space that is difficult to access using both concatenation and gene tree based approaches. 
Thus, basal Basidiomycota relationships remain unresolved and might represent a phylogenetic problem that remains contentious even in the genomic era.

Genome-scale phylogeny and contrasting modes of genome evolution in the fungal phylum Ascomycota

Article

Full-text available

Nov 2020

Ascomycota, the largest and most well-studied phylum of fungi, contains three subphyla: Saccharomycotina (budding yeasts), Pezizomycotina (filamentous fungi), and Taphrinomycotina (fission yeasts). Despite its importance, we lack a comprehensive genome-scale phylogeny or understanding of the similarities and differences in the mode of genome evolution within this phylum. By examining 1107 genomes from Saccharomycotina (332), Pezizomycotina (761), and Taphrinomycotina (14) species, we inferred a robust genome-wide phylogeny that resolves several contentious relationships and estimated that the Ascomycota last common ancestor likely originated in the Ediacaran period. Comparisons of genomic properties revealed that Saccharomycotina and Pezizomycotina differ greatly in their genome properties and enabled inference of the direction of evolutionary change. The Saccharomycotina typically have smaller genomes, lower guanine-cytosine contents, lower numbers of genes, and higher rates of molecular sequence evolution compared with Pezizomycotina. These results provide a robust evolutionary framework for understanding the diversity and ecological lifestyles of the largest fungal phylum.

Isolation, Molecular Identification, Phylogenetic Analysis and Biodiversity of Root Symbiotic Fungi (RSF) from Drynaria quercifolia L.

Article

Full-text available

Mar 2019

Jomar Lozano Aban

Fern epiphytes exposed to light- and water-deprived environments are common. Drynaria is an epiphytic fern found in such habitats. One of its unique ecophysiological adaptations is its association with fungi. This research is one of the few studies that have explored the phylogenetic relationships, colonization, occurrence, and diversity of symbiotic fungi found in D. quercifolia. Genomic DNA of the RSF was extracted, and the ITS (internal transcribed spacer) region of the 18S ribosomal DNA (rDNA) was sequenced. Five isolates were recorded. All the isolates were identified to the species level by using the Basic Local Alignment Search Tool program to find their closest type available in the NCBI databank. These five isolates belong to two genera: Trichoderma and Aspergillus. Their phylogenetic relationship was determined using Molecular Evolutionary Genetics Analysis (MEGA6), and two distinct monophyletic groups were formed: Sordariomycetes and Eurotiomycetes. The computed colonization rate (100%) implies their abundance in the roots of D. quercifolia, where species of the genera Trichoderma and Aspergillus were found to occur very frequently. Understanding the diversity of root fungal symbionts and the presence of dominating species is necessary to determine their impact on ecosystem functioning. These factors point to the potential of RSF in organic agriculture and green biotechnology.

The Mitochondrial Genome of the Phytopathogenic Fungus Bipolaris sorokiniana and the Utility of Mitochondrial Genome to Infer Phylogeny of Dothideomycetes

Article

Full-text available

May 2020

A number of species in Bipolaris are important plant pathogens. Due to a limited number of synapomorphic characters, it is difficult to perform species identification and to estimate the phylogeny of Bipolaris based solely on morphology. In this study, we sequenced the complete mitochondrial genome of Bipolaris sorokiniana and present a detailed annotation of the genome. The B. sorokiniana mitochondrial genome is 137,775 bp long and contains two ribosomal RNA genes, 12 core protein-coding genes, and 38 tRNA genes. In addition, two ribosomal protein genes (rps3 and rps5) and the fungal mitochondrial RNase P gene (rnpB) were identified. The large genome size is mostly determined by the presence of numerous intronic and intergenic regions. A total of 28 introns are inserted in eight core protein-coding genes. Together with the published mitochondrial genome sequences, we conducted a preliminary phylogenetic inference of Dothideomycetes under various datasets and substitution models. The monophyly of Capnodiales, Botryosphaeriales, and Pleosporales is consistently supported in all analyses. The Venturiaceae forms an independent lineage, with a distant phylogenetic relationship to Pleosporales. At the family level, the Mycosphaerellaceae, Botryosphaeriaceae, Phaeosphaeriaceae, and Pleosporaceae are recognized in the majority of trees.

Evolutionary Histories of Type III Polyketide Synthases in Fungi

Article

Full-text available

Jan 2020

Type III polyketide synthases (PKSs) produce secondary metabolites with diverse biological activities, including antimicrobials. While they have been extensively studied in plants and bacteria, only a handful of type III PKSs from fungi have been characterized in the last 15 years. The exploitation of fungal type III PKSs to produce novel bioactive compounds requires understanding the diversity of these enzymes, as well as of their biosynthetic pathways. Here, phylogenetic and reconciliation analyses of 522 type III PKSs from 1,193 fungal genomes revealed complex evolutionary histories with massive gene duplications and losses, explaining their discontinuous distribution in the fungal tree of life. In addition, horizontal gene transfer events from bacteria to fungi and, to a lesser extent, between fungi could be inferred. Ancestral gene duplication events have resulted in the divergence of eight phylogenetic clades. In particular, two clades show ancestral linkage and functional co-evolution between a type III PKS gene and a reducing PKS gene. Investigation of the occurrence of protein domains in predicted fungal type III PKS gene clusters highlighted the diversity of biosynthetic pathways, likely reflecting a large chemical landscape. Type III PKS genes are most often located next to genes encoding cytochrome P450s, MFS transporters, and transcription factors, defining ancestral core gene clusters. This analysis also allowed the prediction of gene clusters for the characterized fungal type III PKSs and provides working hypotheses for the elucidation of the full biosynthetic pathways. Altogether, our analyses provide the fundamental knowledge to motivate further characterization and exploitation of fungal type III PKS biosynthetic pathways.

Model Choice, Missing Data, and Taxon Sampling Impact Phylogenomic Inference of Deep Basidiomycota Relationships

Article

Jan 2020
SYST BIOL

Resolving deep divergences in the tree of life is challenging even for analyses of genome-scale phylogenetic data sets. Relationships between Basidiomycota subphyla, the rusts and allies (Pucciniomycotina), smuts and allies (Ustilaginomycotina), and mushroom-forming fungi and allies (Agaricomycotina) were found particularly recalcitrant both to traditional multigene and genome-scale phylogenetics. Here, we address basal Basidiomycota relationships using concatenated and gene tree-based analyses of various phylogenomic data sets to examine the contribution of several potential sources of bias. We evaluate the contribution of biological causes (hard polytomy, incomplete lineage sorting) versus unmodeled evolutionary processes and factors that exacerbate their effects (e.g., fast-evolving sites and long-branch taxa) to inferences of basal Basidiomycota relationships. Bayesian Markov Chain Monte Carlo and likelihood mapping analyses reject the hard polytomy with confidence. In concatenated analyses, fast-evolving sites and oversimplified models of amino acid substitution favored the grouping of smuts with mushroom-forming fungi, often leading to maximal bootstrap support in both concatenation and coalescent analyses. On the contrary, the most conserved data subsets grouped rusts and allies with mushroom-forming fungi, although this relationship proved labile, sensitive to model choice, to different data subsets and to missing data. Excluding putative long-branch taxa, genes with high proportions of missing data and/or with strong signal failed to reveal a consistent trend toward one or the other topology, suggesting that additional sources of conflict are at play. 
While concatenated analyses yielded strong but conflicting support, individual gene trees mostly provided poor support for any resolution of rusts, smuts, and mushroom-forming fungi, suggesting that the true Basidiomycota tree might be in a part of tree space that is difficult to access using both concatenation and gene tree-based approaches. Inference-based assessments of absolute model fit strongly reject best-fit models for the vast majority of genes, indicating a poor fit of even the most commonly used models. While this is consistent with previous assessments of site-homogenous models of amino acid evolution, this does not appear to be the sole source of confounding signal. Our analyses suggest that topologies uniting smuts with mushroom-forming fungi can arise as a result of inappropriate modeling of amino acid sites that might be prone to systematic bias. We speculate that improved models of sequence evolution could shed more light on basal splits in the Basidiomycota, which, for now, remain unresolved despite the use of whole genome data.

Resolution of deep divergence of club fungi (phylum Basidiomycota)

Article

Dec 2019

A long-standing question about the early evolution of club fungi (phylum Basidiomycota) is the relationship between the three major groups, Pucciniomycotina, Ustilaginomycotina and Agaricomycotina. It is unresolved whether Agaricomycotina are more closely related to Ustilaginomycotina or to Pucciniomycotina. Here we reconstructed the branching order of the three subphyla using two sources of phylogenetic signal, i.e., standard phylogenomic analysis and an alignment-free phylogenetic approach. Overall, beyond congruency within the frame of standard phylogenomic analysis, our results consistently and robustly supported the early divergence of Ustilaginomycotina and a closer relationship between Agaricomycotina and Pucciniomycotina. Keywords: Fungi, Basidiomycota, Phylogenetics, Phylogenomics, CVTree

Takashi Nakase's last tweet: What is the current direction of microbial taxonomy research?

Article

Dec 2019
FEMS YEAST RES

During the last few decades, type strains of most yeast species have been barcoded using the D1/D2 domain of their LSU rRNA gene and internal transcribed spacer (ITS) region. Species identification using DNA sequences regarding conspecificity in yeasts has also been studied. Most yeast species can be identified according to the sequence divergence of their ITS region or a combination of the D1/D2 and ITS regions. Studies that have examined intraspecific diversity have used multilocus sequence analyses, although the marker regions used in these analyses vary depending upon taxa. D1/D2 domain and ITS region sequences have been used as barcodes to develop primers suitable for the detection of the biological diversity of environmental DNA and the microbiome. Using these barcode sequences, it is possible to identify related lineages and infer their gene products and function, and how they adapt to their environment. If a barcode sequence is not variable enough to identify a described species, one could investigate the other biological traits of these yeasts, considering geological distance, environmental circumstances and reproductive isolation.

Endophytic Mycobiota of Jingbai Pear Trees in North China

Article

Mar 2019

Endophytic fungi exist in all known plants and play an important role in plant growth and health. As an important forest tree, the Jingbai pear (the best quality cultivar of Pyrus ussuriensis Maxim. ex Rupr.) has great ecological as well as economic value in north China. However, the mycobiota of the pear tree is still unknown. In this study, the fungal communities in different organs of the tree and in rhizosphere soils were investigated by Illumina MiSeq sequencing of ITS rDNA. Among the organs, the roots had the highest fungal richness and diversity, while the flowers had the lowest richness and diversity. The results demonstrated that each of the organs investigated harbored a distinctive fungal assemblage. Overall, Ascomycota was the most abundant phylum, followed by Basidiomycota and Zygomycota. Fungal communities from the different soils also differed from each other. The redundancy analysis (RDA) showed that fungal community structure correlated significantly with soil temperature, soil pH, soil nitrogen and soil carbon contents. The results indicate that plant organs, site conditions and soil properties may have important influences on the endophytic fungal community structure associated with Jingbai pear trees.

Population genomic analyses of RAD sequences resolves the phylogenetic relationship of the lichen-forming fungal species Usnea antarctica and Usnea aurantiacoatra

Article

Dec 2018

Neuropogonoid species in the lichen-forming fungal genus Usnea exhibit great morphological variation that can be misleading for delimitation of species. We specifically focused on the species delimitation of two closely-related, predominantly Antarctic species differing in the reproductive mode and representing a so-called species pair: the asexual U. antarctica and the sexual U. aurantiacoatra. Previous studies have revealed contradicting results. While multi-locus studies based on DNA sequence data provided evidence that these two taxa might be conspecific, microsatellite data suggested they represent distinct lineages. By using RADseq, we generated thousands of homologous markers to build a robust phylogeny of the two species. Furthermore, we successfully implemented these data in fine-scale population genomic analyses such as DAPC and fineRADstructure. Both Usnea species are readily delimited in phylogenetic inferences and, therefore, the hypothesis that both species are conspecific was rejected. Population genomic analyses also strongly confirmed separated genomes and, additionally, showed different levels of co-ancestry and substructure within each species. Lower co-ancestry in the asexual U. antarctica than in the sexual U. aurantiacoatra may be derived from a wider distributional range of the former species. Our results demonstrate the utility of this RADseq method in tracing population dynamics of lichens in future analyses.

Phylogenomic analysis of 2556 single-copy protein-coding genes resolves most evolutionary relationships for the major clades in the most diverse group of lichen-forming fungi

Article

Aug 2018

Phylogenomic datasets continue to enhance our understanding of evolutionary relationships in many lineages of organisms. However, genome-scale data have not been widely implemented in reconstructing relationships in lichenized fungi. Here we generate a data set comprised of 2556 single-copy protein-coding genes to reconstruct previously unresolved relationships in the most diverse family of lichen-forming fungi, Parmeliaceae. Our sampling included 51 taxa, mainly from the subfamily Parmelioideae, and represented six of the seven previously identified major clades within the family. Our results provided strong support for the monophyly of each of these major clades and most backbone relationships in the topology were recovered with high nodal support based on concatenated dataset and species tree analyses. The alectorioid clade was strongly supported as sister-group to all remaining clades, which were divided into two major sister-groups. In the first major clade the anzioid and usneoid clades formed a strongly supported sister-group relationship with the cetrarioid + hypogymnioid group. The sister-group relationship of Evernia with the cetrarioid clade was also strongly supported, whereas that between the anzioid and usneoid clades needs further investigation. In the second major clade Oropogon and Platismatia were sister to the parmelioid group, while the position of Omphalora was not fully resolved. This study demonstrates the power of genome-scale data sets to resolve long-standing, ambiguous phylogenetic relationships of lichen-forming fungi. Furthermore, the topology inferred in this study will provide a valuable framework for better understanding diversification in the most diverse lineage of lichen-forming fungi, Parmeliaceae.

Fungal Phylogeny in the Age of Genomics: Insights Into Phylogenetic Inference From Genome-Scale Datasets

Chapter

Jan 2017
Adv Genet

The genomic era has been transformative for many fields, including our understanding of the phylogenetic relationships between organisms. The wide availability of whole-genome sequences practically eliminated data availability as a limiting factor for inferring phylogenetic trees, providing hundreds to thousands of loci for analyses, leading to molecular phylogenetics gradually being replaced by phylogenomics. The new era has also brought new challenges: systematic errors (resulting from, e.g., model violation) can be more pronounced in phylogenomic datasets and can lead to strongly supported incorrect relationships, creating significant incongruence among studies. Here, we review common practices, technical and biological challenges of phylogenomic analyses, with examples illustrated from fungi. We compare major approaches of phylogenetic inference, and illustrate the advantages conferred and challenges presented in phylogenomic case studies across the fungal tree of life, including cases where genome-scale data could conclusively resolve contentious relationships, and others that remain challenging despite the flood of genomic data.

Conserved genomic collinearity as a source of broadly applicable, fast evolving, markers to resolve species complexes: A case study using the lichen-forming genus Peltigera section Polydactylon

Article

Aug 2017

Synteny can be maintained for certain genomic regions across broad phylogenetic groups. In these homologous genomic regions, sites that are under relaxed purifying selection, such as intergenic regions, could be used broadly as markers for population genetic and phylogenetic studies on species complexes. To explore the potential of this approach, we found 125 Collinear Orthologous Regions (COR) ranging from 1 to > 10 kb across nine genomes representing the Lecanoromycetes and Eurotiomycetes (Pezizomycotina, Ascomycota). Twenty-six of these COR were found in all 24 eurotiomycete genomes surveyed for this study. Given the high abundance and availability of fungal genomes we believe this approach could be adopted for other large groups of fungi outside the Pezizomycotina. As a proof of concept, we selected three Collinear Orthologous Regions (COR1b, COR3, and COR16), based on synteny analyses of several genomes representing three classes of Ascomycota: Eurotiomycetes, Lecanoromycetes, and Lichinomycetes. COR16, for example, was found across these three classes of fungi. Here we compare the resolving power of these three new markers with five loci commonly used in phylogenetic studies of fungi, using section Polydactylon of the cyanolichen-forming genus Peltigera (Lecanoromycetes) - a clade with several challenging species complexes. Sequence data were subjected to three species discovery and two validating methods. COR markers substantially increased phylogenetic resolution and confidence, and highly contributed to species delimitation. The level of phylogenetic signal provided by each of the COR markers was higher than the commonly used fungal barcode ITS. High cryptic diversity was revealed by all methods. As redefined here, most species represent lineages that have relatively narrower, and more homogeneous biogeographical ranges than previously understood. The scabrosoid clade consists of ten species, seven of which are new. 
For the dolichorhizoid clade, twenty-two new species were discovered for a total of twenty-nine species in this clade.

Diversity and Phylogenetic Relationships Among Isolated Root Symbiotic Fungi from Drynaria quercifolia L. in La Union, Philippines

Article

Jun 2017

Drynaria quercifolia is an epiphytic fern often exposed to water- and light-stressed environments. One distinct ecophysiological adaptation of epiphytic ferns is their symbiotic relationship with fungi. This is the first study undertaken to explore the phylogenetic relationship, colonization, occurrence rate, and diversity of root symbiotic fungi (RSF) found in D. quercifolia. Two hundred seventy-eight RSF isolates were collected from 300 representative root segments. Genomic DNA of the RSF was extracted, and the ITS (internal transcribed spacer) region of the 18S ribosomal DNA (rDNA) was sequenced. Thirteen species were recorded. Eight of the 13 RSF were identified to the species level using the Basic Local Alignment Search Tool nucleotide search program (BLASTn) against their closest type match available in the NCBI databank. However, five RSF were undescribed. The phylogenetic relationship of RSF was determined using Molecular Evolutionary Genetics Analysis (MEGA6), and four distinct monophyletic groups were formed: Sordariomycetes, Eurotiomycetes, Saccharomycetes, and Mucoromycotina. The computed colonization rate (92.67%) implies the abundance of RSF in the roots of D. quercifolia, where several species of the genus Trichoderma were found to occur very frequently. Sites 2 and 5 had the highest temperature, the highest light intensity, and the lowest substrate moisture content, conditions common in a stressful epiphytic habitat. Despite these conditions, the two sites manifested the highest RSF isolate diversity among the five tree-collection sites. Understanding the diversity and the presence of dominating RSF is necessary to determine their principal impact on ecosystem functioning. These principal factors explain their effects on increased plant productivity, nutrient acquisition, and environmental adaptation.

A class-wide phylogenetic assessment of Dothideomycetes

Article

Jan 2009

Phylogenetic Resolution of Deep Eukaryotic and Fungal Relationships Using Highly Conserved Low-Copy Nuclear Genes

Article

Sep 2016

A comprehensive and reliable eukaryotic tree of life is important for many aspects of biological studies from comparative developmental and physiological analyses to translational medicine and agriculture. Both gene-rich and taxon-rich approaches are effective strategies to improve phylogenetic accuracy and are greatly facilitated by marker genes that are universally distributed, well conserved and orthologous among divergent eukaryotes. Here we report the identification of 943 low-copy eukaryotic genes and we show that many of these genes are promising tools in resolving eukaryotic phylogenies, despite the challenges of determining deep eukaryotic relationships. As a case study, we demonstrate that smaller subsets of ~20 and 52 genes could resolve controversial relationships among widely divergent taxa and provide strong support for deep relationships such as the monophyly and branching order of several eukaryotic supergroups. In addition, the use of these genes resulted in fungal phylogenies that are congruent with previous phylogenomic studies that used much larger datasets, and successfully resolved several difficult relationships (e.g. forming a highly supported clade with Microsporidia, Mitosporidium and Rozella sister to other fungi). We propose that these genes are excellent for both gene-rich and taxon-rich analyses and can be applied at multiple taxonomic levels and facilitate a more complete understanding of the eukaryotic tree of life.

Schizosaccharomyces pombe as a model organism for studies of chromosome segregation

Article

Jun 2015

DOI: 10.15414/afz.2015.18.02.49–42. Received 4 April 2015; accepted 11 May 2015; available online 29 June 2015. Cell division is one of the key conditions for the development and reproduction of animals, plants, microorganisms and humans. Therefore, the study of the cell cycle has enormous relevance to the health, well-being, and biology of all living organisms, including the growth and development of organisms, diseases such as cancer, and aging. Thus, it is of great importance to study and understand the regulation and implementation of the cell cycle on a molecular basis. Two types of cell division have evolved, namely mitosis and meiosis. Whereas mitotic events lead to the generation of genetically identical cells, the main task of meiosis is to reduce the content of the genetic material by half, thereby ensuring genetic variability and diversity. We study the progress and regulation of chromosome segregation in meiosis using the simple model organism Schizosaccharomyces pombe because the basic molecular mechanism shares common principles in animals, humans, plants and unicellular organisms. Keywords: Schizosaccharomyces pombe, cell cycle, meiosis

POPULATION GENETICS AND GENOMICS OF COCCIDIOIDES IMMITIS AND COCCIDIOIDES POSADASII

Article

Jan 2009

Bridget Barker

The Ecological Genomics of Fungi

Chapter

Sep 2013

This chapter provides an overview on the diversity of basidiomycetous yeasts with emphasis on the human and animal pathogens. Comparative genomics studies clearly show that these yeast pathogens are well adapted to the human host and are able to circumvent the host defense systems. A discussion is provided on the diversity of mating type systems that regulate the (a)sexual development of basidiomycetes, including the human, animal, and plant pathogens. Two groups of fungi are discussed in detail as examples. The first includes Cryptococcus neoformans, which is causing a significant number of attributable mortalities among people infected with HIV, and its sibling species Cryptococcus gattii that is a primary pathogen causing outbreaks occurring in distinct locales involving a majority of individuals who have no known immunodeficiency. The second example is the adaptation of lipophilic or lipid-dependent Malassezia yeasts to the human and animal skin.

3 Pezizomycotina: Sordariomycetes and Leotiomycetes

Chapter

Apr 2015

The classes Sordariomycetes and Leotiomycetes comprise a large group of nonlichenized ascomycetous fungi, in which over 15,000 species have been described. The close evolutionary relationship of the two classes was recently defined by molecular phylogenetic analyses and subcellular data. Typically, these fungi produce inoperculate, unitunicate asci in perithecial or apothecial ascomata. Sordariomycetes and Leotiomycetes represent a wide range of ecology, including saprobes, plant endophytes, plant pathogens, mycoparasites, and insect and other animal associates. During the past two decades, fungal classification has been considerably advanced but also challenged by rapid developments in molecular phylogeny, genome sequencing, and metagenomics. Here we review history and progress in the phylogenetic classification of these taxa at familial and ordinal levels since the first edition of Mycota VII. Geoglossomycetes and Laboulbeniomycetes are also included. Problems and perspectives associated with studying these fungi in the genomic era are discussed.

The Podosphaera fusca TUB2 gene, a molecular “Swiss Army knife” with multiple applications in powdery mildew research

Article

Jan 2013

The powdery mildew fungus Podosphaera fusca (synonym Podosphaera xanthii) is the main causal agent of cucurbit powdery mildew and one of the most important limiting factors for cucurbit production worldwide. Despite the fungus’ economic importance, very little is known about the physiological and molecular processes involved in P. fusca biology and pathogenesis. In this study, we isolated and characterised the β-tubulin-encoding gene of P. fusca (PfTUB2) to develop molecular tools with different applications in powdery mildew research. PfTUB2 is predicted to encode a protein of 447 amino acid residues. The coding region is interrupted by six introns that occur at approximately the same positions as the introns present in other fungal TUB2-like genes. Once cloned, the PfTUB2 sequence information was used in different applications. Our results showed that the TUB2 gene is a good marker for molecular phylogenetics in powdery mildew fungi but it is unsuitable for the analysis of intraspecific diversity in P. fusca. The expression of PfTUB2 was proven to be stable in different temperature conditions, supporting its use as a reference gene in quantitative gene expression studies. Furthermore, an allele-specific PCR assay for the detection of resistance to methyl-2-benzimidazole carbamate (MBC) fungicides in P. fusca was developed based on the correlation between the single amino acid change E198A in β-tubulin and the MBC resistance phenotype. Lastly, PfTUB2 was used as a target gene in the development of a high-throughput method to quantify fungal growth in plant tissues.

The gut of Guatemalan passalid beetles: A habitat colonized by cellobiose- and xylose-fermenting yeasts

Article

Oct 2013
FUNGAL ECOL

The gut of insects is a productive environment for discovering undescribed species of yeasts, and the gut of wood-feeding insects of several families is especially rich in yeasts that carry out the fermentation of cellobiose and xylose. Passalid beetles (Passalidae, Coleoptera) live in dead wood that they ingest as their primary food source. We report the isolation, molecular identification and physiological characterization of 771 yeast cultures isolated from the gut of 16 species of passalids collected in nine localities in Guatemala. Ascomycete yeasts were present in the gut of every passalid studied, and the xylose-fermenting (X-F) yeasts Scheffersomyces shehatae and Scheffersomyces stipitis were the most abundant taxa isolated. The gut of the beetles also contained undescribed cellobiose-fermenting and X-F species in the Lodderomyces, Scheffersomyces and Spathaspora, and undescribed species in Sugiyamaella clades as well as rare yeast species in the Phaffomyces and Spencermartinsiella clades. Basidiomycete yeasts in the genera Cryptococcus and Trichosporon were also common. The yeast species richness was influenced by the host species and the substrate, and gut-inhabiting yeasts have the ability to survive the differing physiological conditions of several gut compartments.

Hibbett-HigherLevelClassification07

Data

Nov 2013

Myc Res 2007 Higher-level Fungi

Data

Oct 2013

Secondary Metabolism Gene Clusters Exhibit Increasingly Dynamic and Differential Expression during Asexual Growth, Conidiation, and Sexual Development in Neurospora crassa

Article

May 2022

Secondary metabolites (SMs) are low-molecular-weight compounds that often mediate interactions between fungi and their environments. Fungi enriched with SMs are of significant research interest to agriculture and medicine, especially from the aspects of pathogen ecology and environmental epidemiology.

Transcriptional Divergence Underpinning Sexual Development in the Fungal Class Sordariomycetes

Article

May 2022

Gene expression divergence through evolutionary processes is thought to be important for achieving programmed development in multicellular organisms. To test this premise in filamentous fungi, we investigated transcriptional profiles of 3,942 single-copy orthologous genes (SCOGs) in five related sordariomycete species that have morphologically diverged in the formation of their flask-shaped perithecia. We compared expression of the SCOGs to inferred gene expression levels of the most recent common ancestor of the five species, ranking genes from their largest increases to smallest increases in expression during perithecial development in each of the five species. We found that a large proportion of the genes that exhibited evolved increases in gene expression were important for normal perithecial development in Fusarium graminearum. Many of these genes were previously uncharacterized, encoding hypothetical proteins without any known functional protein domains. Interestingly, the developmental stages during which aberrant knockout phenotypes appeared largely coincided with the elevated expression of the deleted genes. In addition, we identified novel genes that affected normal perithecial development in Magnaporthe oryzae and Neurospora crassa, which were functionally and transcriptionally diverged from the orthologous counterparts in F. graminearum. Furthermore, comparative analysis of developmental transcriptomes and phylostratigraphic analysis suggested that genes encoding hypothetical proteins are generally young and transcriptionally divergent between related species. This study provides tangible evidence of shifts in gene expression that led to acquisition of novel function of orthologous genes in each lineage and demonstrates that several genes with hypothetical function are crucial for shaping multicellular fruiting bodies.

Recognition and delineation of yeast genera based on genomic data: Lessons from Trichosporonales

Article

Apr 2019

Delineation and characterization of genera in Trichosporonales (Agaricomycotina, Basidiomycota) was performed using 24 haploid and 3 naturally occurring hybrid genomes, with 3 Tremellales genomes used as outgroups. Orthologous group analysis of those genomes showed presence–absence patterns of orthologs that were consistent with the genus classifications. Many shared unique orthologs were identified in the well-supported lineages (genera Apiotrichum and Trichosporon), supporting the definitions of the genera Apiotrichum and Trichosporon from a genomic perspective. Specifically, we obtained 24 and 285 genus-specific genes from eight Apiotrichum and five Trichosporon species, respectively, and propose that these genus-specific genes can be used for delineation of those genera. On the other hand, the genus Cutaneotrichosporon shared only one genus-specific gene among eight genomes, indicating that this genus definition might require re-examination based on genomic data. In addition, taxonomic revisions are presented in this study, including the proposal of two genera, Pascua and Prillingera. Because genomic data can be systematically obtained and analyzed to compare species from a comprehensive viewpoint, they can be used not only to reconstruct reliable phylogenetic trees, but also to re-examine the definitions of taxonomic classifications. To our knowledge, this is the first report to discuss the ‘natural system’ of genus level classification in fungi based on genomic data.

Advances in Fungal Phylogenomics and Its Impact on Fungal Systematics

Chapter

Jan 2017
Adv Genet

In the past decade, advances in next-generation sequencing technologies and bioinformatic pipelines for phylogenomic analysis have led to remarkable progress in fungal systematics and taxonomy. A number of long-standing questions have been addressed using comparative analysis of genome sequence data, resulting in robust multigene phylogenies. These have added to, and often surpassed traditional morphology or single-gene phylogenetic methods. In this chapter, we provide a brief history of fungal systematics and highlight some examples to demonstrate the impact of phylogenomics on this field. We conclude by discussing some of the challenges and promises in fungal biology posed by the ongoing genomics revolution.

A phylogenetic overview of the Agaricomycotina

Article

Nov 2006

David Hibbett

The Agaricomycotina contains about one-third of the described species of Fungi, including mushrooms, jelly fungi and basidiomycetous yeasts. Recent phylogenetic analyses by P. Matheny and colleagues combining nuclear rRNA genes with the protein-coding genes rpb1, rpb2 and tef1 support the division of Agaricomycotina into Tremellomycetes, Dacrymycetes and Agaricomycetes. There is strong support for the monophyly of the Tremellomycetes, and its position as the sister group of the rest of the Agaricomycotina. Dacrymycetes and Agaricomycetes also are supported strongly, and together they form a clade that is equivalent to the Hymenomycetidae of Swann and Taylor. The deepest nodes in the Agaricomycetes, which are supported only by Bayesian measures of confidence, suggest that the Sebacinales, Cantharellales and Auriculariales are among the most ancient lineages. For the first time, the Polyporales are strongly supported as monophyletic and are placed as the sister group of the Thelephorales. The Agaricales, Boletales and Atheliales are united as the Agaricomycetidae, and the Russulales might be its sister group. There are still some problematical nodes that will require more loci to be resolved. Phylogenomics has promise for reconstructing these difficult backbone nodes, but current genome projects are limited mostly to the Agaricales, Boletales and Polyporales. Genome sequences from other major lineages, especially the early diverging clades, are needed to resolve the most ancient nodes and to assess deep homology in ecological characters in the Agaricomycotina.

Aspergillus Bibliography 2006

Article

Dec 2006

Arthur John Clutterbuck

Fungal evolution and taxonomy

Article

Feb 2010

Meredith Blackwell

Fungi and insects are closely associated in many terrestrial and some aquatic habitats. In addition to the pathogenic associations, many more interactions involve fungal spore dispersal. Recent advances in the study of insect-associated fungi have come from phylogenetic analyses with increased taxon sampling and additional DNA loci. In addition to providing stable phylogenies, some molecular studies have begun to unravel problems of dating of evolutionary events, convergent evolution and host switching. These studies also enlighten our understanding of fungal ecology and the development of organismal interactions. Mycologists continue to rely heavily, however, on identified specimens based on morphology to incorporate more of the estimated 1.5 million species of fungi in phylogenetic studies.

Phylogenetic analysis of entomopathogenic fungi

Article

Jan 2011

Entomopathogenic fungi (EPF) are cosmopolitan insect pathogens that produce several biologically active metabolites, and some of them have already been used commercially as biological control agents. The most common molecular techniques developed recently for phylogenetic studies, such as RAPDs, RFLPs, AFLPs, microsatellite analysis, telomeric fingerprinting, and direct sequencing and analysis of particular genomic regions and genes, alone or in combination with other genes (multi-gene approach), have been applied in studies of the phylogeny, and consequently the classification, evolution and taxonomy, of EPF. Each method has advantages and disadvantages, which are analysed. The best results for EPF phylogenetic studies have been obtained from the sequences of several genes not only within taxa but also within the genera, orders and subphyla to which EPF belong. A number of phylogenetic studies based on sequences of the nuclear rRNA gene-complex, housekeeping genes like benA, tef1, rpb1 and rpb2, or mitochondrial genomes, revealed the taxonomic status of many EPF and helped in the revision of the best known genera like Beauveria, Metarhizium, Paecilomyces and Lecanicillium (former Verticillium lecanii). The generally accepted notion that insect hosts or EPF geographic location are related to certain fungal genotypes has also been tested in a number of phylogenetic studies. However, the results obtained were contradictory, since some support a correlation of fungal genetic loci with their insect hosts or origins, while others show a complete lack of association. Studies based on gene clusters of different evolutionary origin (nuclear and/or mitochondrial) were the most informative and provided all the necessary data for establishing well-supported and accurate fungal phylogenetic relationships.

11 Phylogenomics Enabling Genome-Based Mycology

Chapter

Jan 2015

Jason E. Stajich

Evolutionary studies of microbes, including fungi, have changed completely with the application of molecular phylogenetics to understand species relationships and define populations. The postmolecular wave of innovation has included the complete sequencing of genomes, which has enabled an even deeper perspective on phylogeny, the history of genes within species, and the impact of ecology and lifestyle on the genomic composition. The standardization of sample preparation, sequencing, and methods of analysis will lower the barrier of entry so that all model and nonmodel systems can benefit from these tools in mycological studies. The application of genomics and transcriptomics to studies of fungi has shaped and will continue to shape our understanding of microfungi and macrofungi and their interactions with hosts and the environment.

L. A. S. JOHNSON REVIEW No. 9. Construction and annotation of large phylogenetic trees

Article

Sep 2007

Michael J. Sanderson

Broad availability of molecular sequence data allows construction of phylogenetic trees with 1000s or even 10 000s of taxa. This paper reviews methodological, technological and empirical issues raised in phylogenetic inference at this scale. Numerous algorithmic and computational challenges have been identified surrounding the core problem of reconstructing large trees accurately from sequence data, but many other obstacles, both upstream and downstream of this step, are less well understood. Before phylogenetic analysis, data must be generated de novo or extracted from existing databases, compiled into blocks of homologous data with controlled properties, aligned, examined for the presence of gene duplications or other kinds of complicating factors, and finally, combined with other evidence via supermatrix or supertree approaches. After phylogenetic analysis, confidence assessments are usually reported, along with other kinds of annotations, such as clade names, or annotations requiring additional inference procedures, such as trait evolution or divergence time estimates. Prospects for partial automation of large-tree construction are also discussed, as well as risks associated with 'outsourcing' phylogenetic inference beyond the systematics community.

An Empirical Test of Bootstrapping as a Method for Assessing Confidence in Phylogenetic Analysis

Article

Jun 1993

Bootstrapping is a common method for assessing confidence in phylogenetic analyses. Although bootstrapping was first applied in phylogenetics to assess the repeatability of a given result, bootstrap results are commonly interpreted as a measure of the probability that a phylogenetic estimate represents the true phylogeny. Here we use computer simulations and a laboratory-generated phylogeny to test bootstrapping results of parsimony analyses, both as measures of repeatability (i.e., the probability of repeating a result given a new sample of characters) and accuracy (i.e., the probability that a result represents the true phylogeny). Our results indicate that any given bootstrap proportion provides an unbiased but highly imprecise measure of repeatability, unless the actual probability of replicating the relevant result is nearly one. The imprecision of the estimate is great enough to render the estimate virtually useless as a measure of repeatability. Under conditions thought to be typical of most phylogenetic analyses, however, bootstrap proportions in majority-rule consensus trees provide biased but highly conservative estimates of the probability of correctly inferring the corresponding clades. Specifically, under conditions of equal rates of change, symmetric phylogenies, and internodal change of less-than-or-equal-to 20% of the characters, bootstrap proportions of greater-than-or-equal-to 70% usually correspond to a probability of greater-than-or-equal-to 95% that the corresponding clade is real. However, under conditions of very high rates of internodal change (approaching randomization of the characters among taxa) or highly unequal rates of change among taxa, bootstrap proportions >50% are overestimates of accuracy.

MRBAYES: Bayesian inference of phylogenetic trees

Article

Sep 2001

The program MRBAYES performs Bayesian inference of phylogeny using a variant of Markov chain Monte Carlo. Availability: MRBAYES, including the source code, documentation, sample data files, and an executable, is available at http://brahms.biology.rochester.edu/software.html. Contact: johnh{at}brahms.biology.rochester.edu

A Phylogenomic Study of DNA Repair Genes, Proteins, and Processes

Article

Dec 1999
MUTAT RES-FUND MOL M

The ability to recognize and repair abnormal DNA structures is common to all forms of life. Studies in a variety of species have identified an incredible diversity of DNA repair pathways. Documenting and characterizing the similarities and differences in repair between species has important value for understanding the origin and evolution of repair pathways as well as for improving our understanding of phenotypes affected by repair (e.g., mutation rates, lifespan, tumorigenesis, survival in extreme environments). Unfortunately, while repair processes have been studied in quite a few species, the ecological and evolutionary diversity of such studies has been limited. Complete genome sequences can provide potential sources of new information about repair in different species. In this paper, we present a global comparative analysis of DNA repair proteins and processes based upon the analysis of available complete genome sequences. We use a new form of analysis that combines genome sequence information and phylogenetic studies into a composite analysis we refer to as phylogenomics. We use this phylogenomic analysis to study the evolution of repair proteins and processes and to predict the repair phenotypes of those species for which we now know the complete genome sequence.

Phylogenomics: Improving Functional Predictions for Uncharacterized Genes by Evolutionary Analysis

Article

Mar 1998
GENOME RES

Jonathan A Eisen

The ability to accurately predict gene function based on gene sequence is an important tool in many areas of biological research. Such predictions have become particularly important in the genomics age in which numerous gene sequences are generated with little or no accompanying experimentally determined functional information. Almost all functional prediction methods rely on the identification, characterization, and quantification of sequence similarity between the gene of interest and genes for which functional information is available. Because sequence is the prime determining factor of function, sequence similarity is taken to imply similarity of function. There is no doubt that this assumption is valid in most cases. However, sequence similarity does not ensure identical functions, and it is common for groups of genes that are similar in sequence to have diverse (although usually related) functions. Therefore, the identification of sequence similarity is frequently not enough to assign a predicted function to an uncharacterized gene; one must have a method of choosing among similar genes with different functions. In such cases, most functional prediction methods assign likely functions by quantifying the levels of similarity among genes. I suggest that functional predictions can be greatly improved by focusing on how the genes became similar in sequence (i.e., evolution) rather than on the sequence similarity itself. It is well established that many aspects of comparative biology can benefit from evolutionary studies (Felsenstein 1985), and comparative molecular biology is no exception

Assembling the Fungal Tree of Life: Progress, Classification, and Evolution of Subcellular Traits

Article

Oct 2004

Based on an overview of progress in molecular systematics of the true fungi (Fungi/Eumycota) since 1990, little overlap was found among single-locus data matrices, which explains why no large-scale multilocus phylogenetic analysis had been undertaken to reveal deep relationships among fungi. As part of the project "Assembling the Fungal Tree of Life" (AFTOL), results of four Bayesian analyses are reported with complementary bootstrap assessment of phylogenetic confidence based on (1) a combined two-locus data set (nucSSU and nucLSU rDNA) with 558 species representing all traditionally recognized fungal phyla (Ascomycota, Basidiomycota, Chytridiomycota, Zygomycota) and the Glomeromycota, (2) a combined three-locus data set (nucSSU, nucLSU, and mitSSU rDNA) with 236 species, (3) a combined three-locus data set (nucSSU, nucLSU rDNA, and RPB2) with 157 species, and (4) a combined four-locus data set (nucSSU, nucLSU, mitSSU rDNA, and RPB2) with 103 species. Because of the lack of complementarity among single-locus data sets, the last three analyses included only members of the Ascomycota and Basidiomycota. The four-locus analysis resolved multiple deep relationships within the Ascomycota and Basidiomycota that were not revealed previously or that received only weak support in previous studies. The impact of this newly discovered phylogenetic structure on supraordinal classifications is discussed. Based on these results and reanalysis of subcellular data, current knowledge of the evolution of septal features of fungal hyphae is synthesized, and a preliminary reassessment of ascomal evolution is presented. Based on previously unpublished data and sequences from GenBank, this study provides a phylogenetic synthesis for the Fungi and a framework for future phylogenetic studies on fungi.

Basic Local Alignment Search Tool

Article

Oct 1990

A new approach to rapid sequence comparison, basic local alignment search tool (BLAST), directly approximates alignments that optimize a measure of local similarity, the maximal segment pair (MSP) score. Recent mathematical results on the stochastic properties of MSP scores allow an analysis of the performance of this method as well as the statistical significance of alignments it generates. The basic algorithm is simple and robust; it can be implemented in a number of ways and applied in a variety of contexts including straightforward DNA and protein sequence database searches, motif searches, gene identification searches, and in the analysis of multiple regions of similarity in long DNA sequences. In addition to its flexibility and tractability to mathematical analysis, BLAST is an order of magnitude faster than existing sequence comparison tools of comparable sensitivity.

TREEVIEW: An application to display phylogenetic trees on personal computers

Article

Sep 1996

Roderic Page

No abstract available.

The CLUSTAL_X Windows Interface: Flexible Strategies for Multiple Sequence Alignment Aided by Quality Analysis Tools

Article

Jan 1998

CLUSTAL X is a new windows interface for the widely-used progressive multiple sequence alignment program CLUSTAL W. The new system is easy to use, providing an integrated system for performing multiple sequence and profile alignments and analysing the results. CLUSTAL X displays the sequence alignment in a window on the screen. A versatile sequence colouring scheme allows the user to highlight conserved features in the alignment. Pull-down menus provide all the options required for traditional multiple sequence and profile alignment. New features include: the ability to cut-and-paste sequences to change the order of the alignment, selection of a subset of the sequences to be realigned, and selection of a sub-range of the alignment to be realigned and inserted back into the original alignment. Alignment quality analysis can be performed and low-scoring segments or exceptional residues can be highlighted. Quality analysis and realignment of selected residue ranges provide the user with a powerful tool to improve and refine difficult alignments and to trap errors in input sequences. CLUSTAL X has been compiled on SUN Solaris, IRIX5.3 on Silicon Graphics, Digital UNIX on DECstations, Microsoft Windows (32 bit) for PCs, Linux ELF for x86 PCs, and Macintosh PowerMac.

Gene ontology: tool for the unification of biology. The Gene Ontology Consortium

Article

Jun 2000

Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.

The MetaCyc database

Article

Feb 2002
NUCLEIC ACIDS RES

MetaCyc is a metabolic-pathway database that describes 445 pathways and 1115 enzymes occurring in 158 organisms. MetaCyc is a review-level database in that a given entry in MetaCyc often integrates information from multiple literature sources. The pathways in MetaCyc were determined experimentally, and are labeled with the species in which they are known to occur based on literature references examined to date. MetaCyc contains extensive commentary and literature citations. Applications of MetaCyc include pathway analysis of genomes, metabolic engineering and biochemistry education. MetaCyc is queried using the Pathway Tools graphical user interface, which provides a wide variety of query operations and visualization tools. MetaCyc is available via the World Wide Web at http://ecocyc.org/ecocyc/metacyc.html, and is available for local installation as a binary program for the PC and the Sun workstation, and as a set of flatfiles. Contact metacyc-info{at}ai.sri.com for information on obtaining a local copy of MetaCyc.

An efficient algorithm for large-scale detection of protein families

Article

May 2002
NUCLEIC ACIDS RES

Detection of protein families in large databases is one of the principal research objectives in structural and functional genomics. Protein family classification can significantly contribute to the delineation of functional diversity of homologous proteins, the prediction of function based on domain architecture or the presence of sequence motifs as well as comparative genomics, providing valuable evolutionary insights. We present a novel approach called TRIBE-MCL for rapid and accurate clustering of protein sequences into families. The method relies on the Markov cluster (MCL) algorithm for the assignment of proteins into families based on precomputed sequence similarity information. This novel approach does not suffer from the problems that normally hinder other protein sequence clustering algorithms, such as the presence of multi-domain proteins, promiscuous domains and fragmented proteins. The method has been rigorously tested and validated on a number of very large databases, including SwissProt, InterPro, SCOP and the draft human genome. Our results indicate that the method is ideally suited to the rapid and accurate detection of protein families on a large scale. The method has been used to detect and categorise protein families within the draft human genome and the resulting families have been used to annotate a large proportion of human proteins.

The COG Database: an Updated Version Includes Eukaryotes

Article

Oct 2003
BMC BIOINFORMATICS

The availability of multiple, essentially complete genome sequences of prokaryotes and eukaryotes spurred both the demand and the opportunity for the construction of an evolutionary classification of genes from these genomes. Such a classification system based on orthologous relationships between genes appears to be a natural framework for comparative genomics and should facilitate both functional annotation of genomes and large-scale evolutionary studies. We describe here a major update of the previously developed system for delineation of Clusters of Orthologous Groups of proteins (COGs) from the sequenced genomes of prokaryotes and unicellular eukaryotes and the construction of clusters of predicted orthologs for 7 eukaryotic genomes, which we named KOGs after eukaryotic orthologous groups. The COG collection currently consists of 138,458 proteins, which form 4873 COGs and comprise 75% of the 185,505 (predicted) proteins encoded in 66 genomes of unicellular organisms. The eukaryotic orthologous groups (KOGs) include proteins from 7 eukaryotic genomes: three animals (the nematode Caenorhabditis elegans, the fruit fly Drosophila melanogaster and Homo sapiens), one plant, Arabidopsis thaliana, two fungi (Saccharomyces cerevisiae and Schizosaccharomyces pombe), and the intracellular microsporidian parasite Encephalitozoon cuniculi. The current KOG set consists of 4852 clusters of orthologs, which include 59,838 proteins, or approximately 54% of the analyzed eukaryotic 110,655 gene products. Compared to the coverage of the prokaryotic genomes with COGs, a considerably smaller fraction of eukaryotic genes could be included into the KOGs; addition of new eukaryotic genomes is expected to result in substantial increase in the coverage of eukaryotic genomes with KOGs. Examination of the phyletic patterns of KOGs reveals a conserved core represented in all analyzed species and consisting of approximately 20% of the KOG set. 
This conserved portion of the KOG set is much greater than the ubiquitous portion of the COG set (approximately 1% of the COGs). In part, this difference is probably due to the small number of included eukaryotic genomes, but it could also reflect the relative compactness of eukaryotes as a clade and the greater evolutionary stability of eukaryotic genomes. The updated collection of orthologous protein sets for prokaryotes and eukaryotes is expected to be a useful platform for functional annotation of newly sequenced genomes, including those of complex eukaryotes, and genome-wide evolutionary studies.

From Gene Trees to Organismal Phylogeny in Prokaryotes: The Case of the γ-Proteobacteria

Article

Nov 2003
PLOS BIOL

The rapid increase in published genomic sequences for bacteria presents the first opportunity to reconstruct evolutionary events on the scale of entire genomes. However, extensive lateral gene transfer (LGT) may thwart this goal by preventing the establishment of organismal relationships based on individual gene phylogenies. The group for which cases of LGT are most frequently documented and for which the greatest density of complete genome sequences is available is the gamma-Proteobacteria, an ecologically diverse and ancient group including free-living species as well as pathogens and intracellular symbionts of plants and animals. We propose an approach to multigene phylogeny using complete genomes and apply it to the case of the gamma-Proteobacteria. We first applied stringent criteria to identify a set of likely gene orthologs and then tested the compatibilities of the resulting protein alignments with several phylogenetic hypotheses. Our results demonstrate phylogenetic concordance among virtually all (203 of 205) of the selected gene families, with each of the exceptions consistent with a single LGT event. The concatenated sequences of the concordant families yield a fully resolved phylogeny. This topology also received strong support in analyses aimed at excluding effects of heterogeneity in nucleotide base composition across lineages. Our analysis indicates that single-copy orthologous genes are resistant to horizontal transfer, even in ancient bacterial groups subject to high rates of LGT. This gene set can be identified and used to yield robust hypotheses for organismal phylogenies, thus establishing a foundation for reconstructing the evolutionary transitions, such as gene transfer, that underlie diversity in genome content and organization.

Rokas A, Williams BL, King N, Carroll SB. Genome-scale approaches to resolving incongruence in molecular phylogenies. Nature 425: 798-804

Article

Nov 2003
NATURE

One of the most pervasive challenges in molecular phylogenetics is the incongruence between phylogenies obtained using different data sets, such as individual genes. To systematically investigate the degree of incongruence, and potential methods for resolving it, we screened the genome sequences of eight yeast species and selected 106 widely distributed orthologous genes for phylogenetic analyses, singly and by concatenation. Our results suggest that data sets consisting of single or a small number of concatenated genes have a significant probability of supporting conflicting topologies. By contrast, analyses of the entire data set of concatenated genes yielded a single, fully resolved species tree with maximum support. Comparable results were obtained with a concatenation of a minimum of 20 genes; substantially more genes than commonly used but a small fraction of any genome. These results have important implications for resolving branches of the tree of life.

Body plan evolution of ascomycetes, as inferred from an RNA polymerase II phylogeny

Article

Apr 2004

The mode of evolution of the biologically diverse forms of ascomycetes is not well understood, largely because the descent relationships remain unresolved. By using sequences of the nuclear gene RPB2, we have inferred with considerable resolution the phylogenetic relationships between major groups within the phylum Ascomycota. These relationships allow us to deduce a historical pattern of body plan evolution. Within Taphrinomycotina, the most basal group, two simple body plans exist: uncovered asci with unicellular growth, or rudimentary ascoma with hyphal growth. Ancestral ascomycetes were filamentous; hyphal growth was lost independently in the yeast forms of Taphrinomycotina and Saccharomycotina. Pezizomycotina, the sister group to Saccharomycotina, retained mycelial growth while elaborating two basic ontogenetic pathways for ascoma formation and centrum development. The RPB2 phylogeny shows with significant statistical support that taxa in Pezizomycotina with ascohymenial ontogeny (ascoma generally forms after nuclear pairing) are ancestral and paraphyletic, whereas ascolocular fungi with fissitunicate asci are a clade derived from them. Ascolocular lichens are polyphyletic, whereas ascohymenial lichens comprise a monophyletic group that includes the Lecanorales. Our data are not consistent with a derived origin of Eurotiomycetes including Aspergillus and Trichophyton from within a lichen-forming ancestral group. For these reasons, the results of this study are considerably at variance with the conclusion that major fungal lineages are derived from lichen-symbiotic ancestors. Interpretation of our results in the context of early work suggests that ascoma ontogeny and centrum characters are not in conflict with the molecular data.

PhyloGenie: Automated phylome generation and analysis

Article

Feb 2004
NUCLEIC ACIDS RES

Phylogenetic reconstruction is the method of choice to determine the homologous relationships between sequences. Difficulties in producing high-quality alignments, which are the basis of good trees, and in automating the analysis of trees have unfortunately limited the use of phylogenetic reconstruction methods to individual genes or gene families. Due to the large number of sequences involved, phylogenetic analyses of proteomes preclude manual steps and therefore require a high degree of automation in sequence selection, alignment, phylogenetic inference and analysis of the resulting set of trees. We present a set of programs that automates the steps from seed sequence to phylogeny and a utility to extract all phylogenies that match specific topological constraints from a database of trees. Two example applications that show the type of questions that can be answered by phylome analysis are provided. The generation and analysis of the Thermoplasma acidophilum phylome with regard to lateral gene transfer between Thermoplasmata and Sulfolobus, showed best BLAST hits to be far less reliable indicators of lateral transfer than the corresponding protein phylogenies. The generation and analysis of the Danio rerio phylome provided more than twice as many proteins as described previously, supporting the hypothesis of an additional round of genome duplication in the actinopterygian lineage.

Abascal F, Zardoya R, Posada D. ProtTest: selection of best-fit models of protein evolution. Bioinformatics 21: 2104-2105

Article

Jun 2005

Using an appropriate model of amino acid replacement is very important for the study of protein evolution and phylogenetic inference. We have built a tool for the selection of the best-fit model of evolution, among a set of candidate models, for a given protein sequence alignment. Availability: ProtTest is available under the GNU license from http://darwin.uvigo.es Contact: fabascal{at}uvigo.es

Genome-Scale Gene Function Prediction Using Multiple Sources of High-Throughput Data in Yeast

Article

Feb 2004

Characterizing gene function is one of the major challenging tasks in the post-genomic era. To address this challenge, we have developed GeneFAS (Gene Function Annotation System), a new integrated probabilistic method for cellular function prediction by combining information from protein-protein interactions, protein complexes, microarray gene expression profiles, and annotations of known proteins through an integrative statistical model. Our approach is based on a novel assessment for the relationship between (1) the interaction/correlation of two proteins' high-throughput data and (2) their functional relationship in terms of their Gene Ontology (GO) hierarchy. We have developed a Web server for the predictions. We have applied our method to yeast Saccharomyces cerevisiae and predicted functions for 1548 out of 2472 unannotated proteins.

Multigene Analyses of Bilaterian Animals Corroborate the Monophyly of Ecdysozoa, Lophotrochozoa, and Protostomia

Article

Jun 2005

Almost a decade ago, a new phylogeny of bilaterian animals was inferred from small-subunit ribosomal RNA (rRNA) that claimed the monophyly of two major groups of protostome animals: Ecdysozoa (e.g., arthropods, nematodes, onychophorans, and tardigrades) and Lophotrochozoa (e.g., annelids, molluscs, platyhelminths, brachiopods, and rotifers). However, it received little additional support. In fact, several multigene analyses strongly argued against this new phylogeny. These latter studies were based on a large amount of sequence data and therefore showed an apparently strong statistical support. Yet, they covered only a few taxa (those for which complete genomes were available), making systematic artifacts of tree reconstruction more probable. Here we expand this sparse taxonomic sampling and analyze a large data set (146 genes, 35,371 positions) from a diverse sample of animals (35 species). Our study demonstrates that the incongruences observed between rRNA and multigene analyses were indeed due to long-branch attraction artifacts, illustrating the enormous impact of systematic biases on phylogenomic studies. A refined analysis of our data set excluding the most biased genes provides strong support in favor of the new animal phylogeny and in addition suggests that urochordates are more closely related to vertebrates than are cephalochordates. These findings have important implications for the interpretation of morphological and genomic data.

BIOVERSE: Enhancements to the framework for structural, functional and contextual modeling of proteins and proteomes

Article

Aug 2005
NUCLEIC ACIDS RES

We have made a number of enhancements to the previously described Bioverse web server and computational biology framework (http://bioverse.compbio.washington.edu). In this update, we provide an overview of the new features available that include: (i) expansion of the number of organisms represented in the Bioverse and addition of new data sources and novel prediction techniques not available elsewhere, including network-based annotation; (ii) reengineering the database backend and supporting code resulting in significant speed, search and ease-of-use improvements; and (iii) creation of a stateful and dynamic web application frontend to improve interface speed and usability. Integrated Java-based applications also allow dynamic visualization of real and predicted protein interaction networks.

Genome Sequencing in Microfabricated High-Density Picolitre Reactors

Article

Oct 2005
NATURE

The proliferation of large-scale DNA-sequencing projects in recent years has driven a search for alternative methods to reduce time and cost. Here we describe a scalable, highly parallel sequencing system with raw throughput significantly greater than that of state-of-the-art capillary electrophoresis instruments. The apparatus uses a novel fibre-optic slide of individual wells and is able to sequence 25 million bases, at 99% or better accuracy, in one four-hour run. To achieve an approximately 100-fold increase in throughput over current Sanger sequencing technology, we have developed an emulsion method for DNA amplification and an instrument for sequencing by synthesis using a pyrosequencing protocol optimized for solid support and picolitre-scale volumes. Here we show the utility, throughput, accuracy and robustness of this system by shotgun sequencing and de novo assembly of the Mycoplasma genitalium genome with 96% coverage at 99.96% accuracy in one run of the machine.

Genome trees and the nature of genome evolution

Article

Feb 2005

Genome trees are a means to capture the overwhelming amount of phylogenetic information that is present in genomes. Different formalisms have been introduced to reconstruct genome trees on the basis of various aspects of the genome. On the basis of these aspects, we separate genome trees into five classes: (a) alignment-free trees based on statistic properties of the genome, (b) gene content trees based on the presence and absence of genes, (c) trees based on chromosomal gene order, (d) trees based on average sequence similarity, and (e) phylogenomics-based genome trees. Despite their recent development, genome tree methods have already had some impact on the phylogenetic classification of bacterial species. However, their main impact so far has been on our understanding of the nature of genome evolution and the role of horizontal gene transfer therein. An ideal genome tree method should be capable of using all gene families, including those containing paralogs, in a phylogenomics framework capitalizing on existing methods in conventional phylogenetic reconstruction. We expect such sophisticated methods to help us resolve the branching order between the main bacterial phyla.

Heterotachy and long-branch attraction in phylogenetics

Article

Feb 2005
BMC EVOL BIOL

Probabilistic methods have progressively supplanted the Maximum Parsimony (MP) method for inferring phylogenetic trees. One of the major reasons for this shift was that MP is much more sensitive to the Long Branch Attraction (LBA) artefact than is Maximum Likelihood (ML). However, recent work by Kolaczkowski and Thornton suggested, on the basis of simulations, that MP is less sensitive than ML to tree reconstruction artefacts generated by heterotachy, a phenomenon that corresponds to shifts in site-specific evolutionary rates over time. These results led these authors to recommend that the results of ML and MP analyses should be both reported and interpreted with the same caution. This specific conclusion revived the debate on the choice of the most accurate phylogenetic method for analysing real data in which various types of heterogeneities occur. However, variation of evolutionary rates across species was not explicitly incorporated in the original study of Kolaczkowski and Thornton, and in most of the subsequent heterotachous simulations published to date, where all terminal branch lengths were kept equal, an assumption that is biologically unrealistic. In this report, we performed more realistic simulations to evaluate the relative performance of MP and ML methods when two kinds of heterogeneities are considered: (i) within-site rate variation (heterotachy), and (ii) rate variation across lineages. Using a similar protocol as Kolaczkowski and Thornton to generate heterotachous datasets, we found that heterotachy, which constitutes a serious violation of existing models, decreases the accuracy of ML whatever the level of rate variation across lineages. In contrast, the accuracy of MP can either increase or decrease when the level of heterotachy increases, depending on the relative branch lengths. This result demonstrates that MP is not insensitive to heterotachy, contrary to the report of Kolaczkowski and Thornton. Finally, in the case of LBA (i.e. when two non-sister lineages evolved faster than the others), ML outperforms MP over a wide range of conditions, except for unrealistic levels of heterotachy. For realistic combinations of both heterotachy and variation of evolutionary rates across lineages, ML is always more accurate than MP. Therefore, ML should be preferred over MP for analysing real data, all the more so since parametric methods also allow one to handle other types of biological heterogeneities much better, such as among sites rate variation. The confounding effects of heterotachy on tree reconstruction methods do exist, but can be eschewed by the development of mixture models in a probabilistic framework, as proposed by Kolaczkowski and Thornton themselves.

Genomics of the fungal kingdom: Insights into eukaryotic biology

Article

Jan 2006
GENOME RES

The last decade has witnessed a revolution in the genomics of the fungal kingdom. Since the sequencing of the first fungus in 1996, the number of available fungal genome sequences has increased by an order of magnitude. Over 40 complete fungal genomes have been publicly released with an equal number currently being sequenced--representing the widest sampling of genomes from any eukaryotic kingdom. Moreover, many of these sequenced species form clusters of related organisms designed to enable comparative studies. These data provide an unparalleled opportunity to study the biology and evolution of this medically, industrially, and environmentally important kingdom. In addition, fungi also serve as model organisms for all eukaryotes. The available fungal genomic resource, coupled with the experimental tractability of the fungi, is accelerating research into the fundamental aspects of eukaryotic biology. We provide here an overview of available fungal genomes and highlight some of the biological insights that have been derived through their analysis. We also discuss insights into the fundamental cellular biology shared between fungi and other eukaryotic organisms.

Animal Evolution and the Molecular Signature of Radiations Compressed in Time

Article

Jan 2006
SCIENCE

The phylogenetic relationships among most metazoan phyla remain uncertain. We obtained large numbers of gene sequences from metazoans, including key understudied taxa. Despite the amount of data and breadth of taxa analyzed, relationships among most metazoan phyla remained unresolved. In contrast, the same genes robustly resolved phylogenetic relationships within a major clade of Fungi of approximately the same age as the Metazoa. The differences in resolution within the two kingdoms suggest that the early history of metazoans was a radiation compressed in time, a finding that is in agreement with paleontological inferences. Furthermore, simulation analyses as well as studies of other radiations in deep time indicate that, given adequate sequence data, the lack of resolution in phylogenetic trees is a signature of closely spaced series of cladogenetic events.

PHYLIP-phylogeny inference package (Version 3.2)

Article

Jan 2002

J. Felsenstein

From gene trees to organismal phylogeny in prokaryotes, a case for gamma-proteobacteria

Article

Jan 2003

PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4.0b10

Book

Jan 2002

David L. Swofford

We studied sequence variation in 16S rDNA in 204 individuals from 37 populations of the land snail Candidula unifasciata (Poiret 1801) across the core species range in France, Switzerland, and Germany. Phylogeographic, nested clade, and coalescence analyses were used to elucidate the species' evolutionary history. The study revealed the presence of two major evolutionary lineages that evolved in separate refuges in southeast France as a result of previous fragmentation during the Pleistocene. Applying a recent extension of the nested clade analysis (Templeton 2001), we inferred that range expansions along river valleys in independent corridors to the north led eventually to a secondary contact zone of the major clades around the Geneva Basin. There is evidence supporting the idea that the formation of the secondary contact zone and the colonization of Germany might be postglacial events. The phylogeographic history inferred for C. unifasciata differs from general biogeographic patterns of postglacial colonization previously identified for other taxa, and it might represent a common model for species with restricted dispersal.

Dating the evolutionary radiations of the true fungi

Article

Aug 1993

In this paper we construct a relative time scale for the origin and radiation of major lineages of the true fungi, using the 18S ribosomal RNA gene sequence data of 37 fungal species, and then calibrate the time scale using fossil evidence. Of the sequences, 28 were from the literature or data banks and the remaining 9 are new. To estimate the order of origin of fungal lineages we reconstructed the phylogeny of the fungi using aligned sequence data. To compensate for the differences in nucleotide substitution rates among various fungal lineages, we normalized the pairwise substitution data before estimating the relative timing of fungal divergences. We divided the fungi into nine groups. We then calculated the average percent substitution for each group, and also the average for all the groups, for the time period beginning when the fungi diverged from a common ancestor and ending at the present. We used the ratios of group-specific percent substitutions to the average percent substitution to normalize ou...

PHYLIP – Phylogeny inference package (version 3.2)

Article

Jan 1989

J. Felsenstein

Cases in which Parsimony or Compatibility Methods Will be Positively Misleading

Article

Dec 1978
Syst Zool

Joseph Felsenstein

For some simple three- and four-species cases involving a character with two states, it is determined under what conditions several methods of phylogenetic inference will fail to converge to the true phylogeny as more and more data are accumulated. The methods are the Camin-Sokal parsimony method, the compatibility method, and Farris's unrooted Wagner tree parsimony method. In all cases the conditions for this failure (which is the failure to be statistically consistent) are essentially that parallel changes exceed informative, nonparallel changes. It is possible for these methods to be inconsistent even when change is improbable a priori, provided that evolutionary rates in different lineages are sufficiently unequal. It is by extension of this approach that we may provide a sound methodology for evaluating methods of phylogenetic inference.

Phylogenomics

Article

Dec 2005

The continuous flow of genomic data is creating unprecedented opportunities for the reconstruction of molecular phylogenies. Access to whole-genome data means that phylogenetic analysis can now be performed at different genomic levels, such as primary sequences and gene order, allowing for reciprocal corroboration of the results. We critically review the different kinds of phylogenomic methods currently available, paying particular attention to method reliability. Our emphasis is on methods for the analysis of primary sequences because these are the most advanced. We discuss the important issue of statistical inconsistency and show how failing to fully capture the process of sequence evolution in the underlying models leads to tree reconstruction artifacts. We suggest strategies for detecting and potentially overcoming these problems. These strategies involve the development of better models, the use of an improved taxon sampling and the exclusion of phylogenetically misleading data.

Graph Clustering by Flow Simulation

Article

May 2000

Stijn Marinus van Dongen

This thesis is concerned with the clustering of graphs by means of flow simulation, a problem that in its generality belongs to the field of cluster analysis. In this branch of science one designs and studies methods that, given certain data, generate a division into groups, the aim being to find a division that is natural. That is, different data elements in the same group should ideally be very similar, and data elements from different groups should ideally be very different. Sometimes such groups are absent altogether; then there is little pattern to be recognized in the data. The idea is that the presence of natural groups makes it possible to categorize the data. An example is the clustering of data (on symptoms or bodily characteristics) of patients suffering from the same disease. If clear groups exist in those data, this can lead to additional insight into the disease. Cluster analysis can thus be used for exploratory research. Further examples come from chemistry, taxonomy, psychiatry, archaeology, market research, and many other disciplines. Taxonomy, the study of the classification of organisms, has a rich history beginning with Aristotle and culminating in the works of Linnaeus. In fact, cluster analysis can be seen as the result of an increasingly systematic and abstract study of the diverse methods designed in different application areas, in which method is separated both from data and application area and from computational procedure.
In cluster analysis roughly two directions can be distinguished, depending on the type of data to be classified. The data elements in the example above are described by vectors (lists of scores or measurements), and the difference between two elements is determined by the difference between the vectors. This dissertation concerns cluster analysis applied to data of the type 'graph'. Examples come from pattern recognition, computer-aided design, databases equipped with hyperlinks, and the World Wide Web. In all these cases there are 'points' that are either connected or not. A system of points together with their connections is called a graph. A good clustering of a graph divides the points into groups such that few connections run between (points from) different groups and there are many connections within each group.
The first part of the dissertation, consisting of chapters 2 and 3, treats the position of cluster analysis in general and the position of graph clustering within cluster analysis in particular, as well as the relation of graph clustering to the related problem of graph partitioning. In the clustering problem one seeks a 'natural' division into groups, and the number and size of the groups are not prescribed. In the partitioning problem the number and sizes are prescribed, and one seeks, given these restrictions, an assignment of the elements to the groups such that the number of connections between the groups is minimal.
The dissertation further describes the theory, implementation, and abstract testing of a powerful new cluster algorithm for graphs called the Markov Cluster algorithm or MCL algorithm. The algorithm makes use of (and is in fact no more than a shell around) an algebraic process (called the MCL process) defined for Markov graphs, i.e. graphs for which the associated matrix is stochastic. In this process the initial graph is successively transformed by alternation of the two operators expansion and inflation. Expansion is taking the power of a matrix according to the classical matrix product. Stochastically speaking, this means computing the transition probabilities belonging to a multi-step relation. Inflation coincides with taking the power of a matrix according to the entrywise Hadamard-Schur product, followed by a column-wise rescaling so that the final result is again a (column) stochastic matrix. This is an unusual operator in the world of stochastics; its introduction is motivated entirely by its intended effect on graphs in which cluster structure is present. It is to be expected that multi-step relations corresponding to pairs of points lying within a natural cluster will have larger transition probabilities than pairs whose points lie in different clusters. The inflation operator favours multi-step relations with large associated probability and disfavours those with small associated probability. The expectation is therefore that the MCL process will create and sustain multi-step relations belonging to relations lying within one cluster, and that it will decimate all multi-step relations belonging to relations between different clusters. This indeed turns out to be the case. The MCL process generally converges to an idempotent matrix that is very sparse and consists of several components. The components are interpreted as a clustering of the initial graph. Because the inflation operator is parametrized, clusterings at different levels of granularity can be discovered.
The MCL algorithm consists first of a transformation step from a given graph to a stochastic initial graph, using the standard concept of a random walk on a graph. Second, it requires the specification of two sequences of values that define the successive expansion and inflation parametrizations. Finally, the algorithm computes the associated process and interprets the resulting limit. The idea of using random walks to discover cluster structure is not new, but the way it is carried out is. The idea is introduced as the 'graph clustering paradigm' in chapter 5, followed by some combinatorial proposals for clustering graphs. It is shown that there is a connection between the combinatorial and probabilistic cluster methods, and that an important distinction is the localization step that probabilistic methods generally introduce. The chapter concludes with an example of an MCL process and the formal definition of both process and algorithm. Notation and definitions have by then been introduced in chapter 4. In chapter 6 the interpretation function from idempotent matrices to clusterings is formalized, simple properties of the inflation operator are described, and the stability of MCL limits and the associated clusterings is analysed. The phenomenon of overlapping clusters is possible in principle and is an intrinsic part of the interpretation function, but turns out to be unstable (the overlap observed so far always corresponded to a graph automorphism mapping the overlapping part of the clusters onto itself). Chapter 7 introduces the classes of diagonally symmetric and diagonally positive semi-definite matrices (matrices that are diagonally similar to a symmetric, respectively positive semi-definite, matrix). Both classes are mapped into themselves by both expansion and inflation (for diagonally positive semi-definite matrices this holds only for part of the parametrization space of the inflation operator). It is shown that diagonally positive semi-definite matrices contain structure that generalizes the interpretation function from idempotent matrices to clusterings. From this follows a more precise characterization of the inflationary effect of the inflation operator on the spectrum of its argument matrix. Decoupling aspects of graphs and matrices are always closely tied to characteristics of the associated spectra. Chapter 8 describes a number of known results underlying the most widely used techniques for graph partitioning. Chapters 4 through 8 form the second part of the dissertation.
The third part reports on experiments with the MCL algorithm. Chapter 9 is theoretical in nature and introduces functions that can be used as a measure of the quality of a graph clustering. Among other things, a generic measure is derived that expresses how well a characteristic vector represents the mass of another (non-negative) vector. Element-wise or column-wise application of the measure gives an expression for the extent to which a clustering represents the mass of a weighted graph or matrix. In addition, a metric on the space of clusterings or partitions is derived, which is used to test the continuity properties and discriminating power of the MCL algorithm in chapter 12. Chapter 10 reports on experiments on small symmetric graphs with well-defined density characteristics, such as grid-like graphs. The MCL algorithm turns out experimentally to have strong separating power. Experiments with neighbourhood graphs (grid-like graphs defined on points in Euclidean space) show that the algorithm is not suitable when the diameter of the natural clusters is large. This phenomenon can be understood in terms of the (stochastic) flow properties of the algorithm. Chapter 11 addresses the scalability of the algorithm. Crucially, the limit of the MCL process is generally very sparse and the iterands of the process are sparse in a weighted interpretation of the notion of sparse. That is, the inflation operator ensures that most new nonzero elements (corresponding to multi-step relations) remain very small and eventually vanish again. This is all the more true the smaller the diameter of the natural clusters and the lower the connectivity of the whole graph. This suggests that during each expansion step, which causes the matrix to fill in, the columns of the newly computed matrix can be pruned by simply taking the k largest elements of a newly computed (stochastic) column and rescaling these elements to sum to 1, where k depends on the available computing capacity. Because computing the k largest values of a vector cannot in principle be done in linear time, in practice it proves necessary to use a more refined scheme in which the vector is first pruned by means of thresholds that depend on homogeneity properties of the vector. This leads in principle to a complexity of order O(Nk^2), where N is the dimension of the matrix. Chapter 12 reports on experiments on test graphs with ten thousand points whose connections were generated (randomly) in such a way that an a priori best clustering is known. These graphs have natural clusters of small diameter but have, as a whole, high to very high connectivity. The scaled MCL algorithm turns out to generate very good clusterings that lie close to the a priori known clustering. The parameter k can be chosen low, but the performance of the algorithm degrades more strongly as k is lower and the total connectivity of the input graph is higher. The appendix 'A cluster miscellany', beginning on page 149, is written for a general audience and contains short expositions on various aspects of cluster analysis, such as the history of the field and the role of the computer.
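The alternation of expansion and inflation described in this summary can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the added self-loops, the fixed inflation value of 2, the convergence tolerance, and the simplified reading of clusters from the nonzero rows of the limit matrix are all assumptions made for illustration, and no pruning to the k largest column entries is attempted.

```python
def normalize_columns(m):
    """Rescale every column to sum to 1, giving a column-stochastic matrix."""
    n = len(m)
    out = [[0.0] * n for _ in range(n)]
    for j in range(n):
        s = sum(m[i][j] for i in range(n))
        for i in range(n):
            out[i][j] = m[i][j] / s if s else 0.0
    return out


def expand(m):
    """Expansion: ordinary matrix product m * m (two-step transition probabilities)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, r):
    """Inflation: entrywise power r, then column-wise rescaling."""
    return normalize_columns([[x ** r for x in row] for row in m])


def mcl(adj, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster an undirected graph given as a square adjacency matrix (list of lists)."""
    n = len(adj)
    # Assumption: add self-loops before normalizing, a common practical choice.
    m = normalize_columns(
        [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    )
    for _ in range(max_iter):
        nxt = inflate(expand(m), inflation)
        delta = max(abs(nxt[i][j] - m[i][j]) for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:
            break
    # Simplified interpretation: each nonzero row of the limit lists one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)
```

Calling `mcl(adj)` on a graph with two disconnected pairs of nodes returns the two pairs as separate clusters. Raising the hypothetical `inflation` parameter would be expected to give finer clusterings on connected graphs, in line with the remark above that the parametrized inflation operator exposes different levels of granularity.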

Phylogenetic Analysis in Molecular Evolutionary Genetics

Article

Feb 1996

Masatoshi Nei

Recent developments of statistical methods in molecular phylogenetics are reviewed. It is shown that the mathematical foundations of these methods are not well established, but computer simulations and empirical data indicate that currently used methods such as neighbor joining, minimum evolution, likelihood, and parsimony methods produce reasonably good phylogenetic trees when a sufficiently large number of nucleotides or amino acids are used. However, when the rate of evolution varies extensively from branch to branch, many methods may fail to recover the true topology. Solid statistical tests for examining the accuracy of trees obtained by neighbor joining, minimum evolution, and least-squares method are available, but the methods for likelihood and parsimony trees are yet to be refined. Parsimony, likelihood, and distance methods can all be used for inferring amino acid sequences of the proteins of ancestral organisms that have become extinct.

Major fungal lineages are derived from lichen symbiotic ancestors

Article

Jun 2001

About one-fifth of all known extant fungal species form obligate symbiotic associations with green algae, cyanobacteria or with both photobionts. These symbioses, known as lichens, are one way for fungi to meet their requirement for carbohydrates. Lichens are widely believed to have arisen independently on several occasions, accounting for the high diversity and mixed occurrence of lichenized and non-lichenized (42 and 58%, respectively) fungal species within the Ascomycota. Depending on the taxonomic classification chosen, 15-18 orders of the Ascomycota include lichen-forming taxa, and 8-11 of these orders (representing about 60% of the Ascomycota species) contain both lichenized and non-lichenized species. Here we report a phylogenetic comparative analysis of the Ascomycota, a phylum that includes greater than 98% of known lichenized fungal species. Using a Bayesian phylogenetic tree sampling methodology combined with a statistical model of trait evolution, we take into account uncertainty about the phylogenetic tree and ancestral state reconstructions. Our results show that lichens evolved earlier than believed, and that gains of lichenization have been infrequent during Ascomycota evolution, but have been followed by multiple independent losses of the lichen symbiosis. As a consequence, major Ascomycota lineages of exclusively non-lichen-forming species are derived from lichen-forming ancestors. These species include taxa with important benefits and detriments to humans, such as Penicillium and Aspergillus.

PAL: an object-oriented programming library for molecular evolution and phylogenetics

Article

Aug 2001

Phylogenetic Analysis Library (PAL) is a collection of Java classes for use in molecular evolution and phylogenetics. PAL provides a modular environment for the rapid construction of both special-purpose and general analysis programs. PAL version 1.1 consists of 145 public classes or interfaces in 13 packages, including classes for models of character evolution, maximum-likelihood estimation, and the coalescent, with a total of more than 27000 lines of code. The PAL project is set up as a collaborative project to facilitate contributions from other researchers. Availability: The program is free and is available at http://www.pal-project.org. It requires Java 1.1 or later. PAL is licensed under the GNU General Public License. Contact: a.drummond@auckland.ac.nz; korbinian.strimmer@zoo.ox.ac.uk Supplementary information: An online description of the Application Programming Interface (API) of all public classes in PAL is available at http://www.pal-project.org/api/.

A Simple, Fast, and Accurate Algorithm to Estimate Large Phylogenies by Maximum Likelihood

Article

Nov 2003

The increase in the number of large data sets and the complexity of current probabilistic sequence evolution models necessitates fast and reliable phylogeny reconstruction methods. We describe a new approach, based on the maximum-likelihood principle, which clearly satisfies these requirements. The core of this method is a simple hill-climbing algorithm that adjusts tree topology and branch lengths simultaneously. This algorithm starts from an initial tree built by a fast distance-based method and modifies this tree to improve its likelihood at each iteration. Due to this simultaneous adjustment of the topology and branch lengths, only a few iterations are sufficient to reach an optimum. We used extensive and realistic computer simulations to show that the topological accuracy of this new method is at least as high as that of the existing maximum-likelihood programs and much higher than the performance of distance-based and parsimony approaches. The reduction of computing time is dramatic in comparison with other maximum-likelihood packages, while the likelihood maximization ability tends to be higher. For example, only 12 min were required on a standard personal computer to analyze a data set consisting of 500 rbcL sequences with 1,428 base pairs from plant plastids, thus reaching a speed of the same order as some popular distance-based and parsimony algorithms. This new method is implemented in the PHYML program, which is freely available on our web page: http://www.lirmm.fr/w3ifa/MAAS/.

Genome-Scale Phylogeny and the Detection of Systematic Biases

Article

Aug 2004

Phylogenetic inference from sequences can be misled by both sampling (stochastic) error and systematic error (nonhistorical signals where reality differs from our simplified models). A recent study of eight yeast species using 106 concatenated genes from complete genomes showed that even small internal edges of a tree received 100% bootstrap support. This effective negation of stochastic error from large data sets is important, but longer sequences exacerbate the potential for biases (systematic error) to be positively misleading. Indeed, when we analyzed the same data set using minimum evolution optimality criteria, an alternative tree received 100% bootstrap support. We identified a compositional bias as responsible for this inconsistency and showed that it is reduced effectively by coding the nucleotides as purines and pyrimidines (RY-coding), reinforcing the original tree. Thus, a comprehensive exploration of potential systematic biases is still required, even though genome-scale data sets greatly reduce sampling error.
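The RY-coding mentioned in this abstract recodes each nucleotide as a purine (R) or pyrimidine (Y). A minimal Python sketch follows; the function name `ry_code` and the choice to pass gaps and ambiguity codes through unchanged are assumptions for illustration, not details taken from the cited study.

```python
# Purines (A, G) map to R; pyrimidines (C, T, and U for RNA) map to Y.
RY_MAP = {"A": "R", "G": "R", "C": "Y", "T": "Y", "U": "Y"}


def ry_code(seq):
    """Recode a nucleotide sequence as purines (R) and pyrimidines (Y).

    Characters outside the map (gaps, N, other ambiguity codes) are kept
    as they are; that handling is an assumption of this sketch.
    """
    return "".join(RY_MAP.get(base, base) for base in seq.upper())
```

For example, `ry_code("ACGT")` gives `"RYRY"`. Because transitions (A/G and C/T changes) become invisible after recoding, only transversions contribute signal.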

Performance of Four Ribosomal DNA Regions to Infer Higher-Level Phylogenetic Relationships of Inoperculate Euascomycetes (Leotiomyceta)

Article

Apr 2005

The inoperculate euascomycetes are filamentous fungi that form saprobic, parasitic, and symbiotic associations with a wide variety of animals, plants, cyanobacteria, and other fungi. The higher-level relationships of this economically important group have been unsettled for over 100 years. A data set of 55 species was assembled including sequence data from nuclear and mitochondrial small and large subunit rDNAs for each taxon; 83 new sequences were obtained for this study. Parsimony and Bayesian analyses were performed using the four-region data set and all 14 possible subpartitions of the data. The mitochondrial LSU rDNA was used for the first time in a higher-level phylogenetic study of ascomycetes and its use in concatenated analyses is supported. The classes that were recognized in Leotiomyceta (=inoperculate euascomycetes) in a classification by Eriksson and Winka [Myconet 1 (1997) 1] are strongly supported as monophyletic. The following classes formed strongly supported sister-groups: Arthoniomycetes and Dothideomycetes, Chaetothyriomycetes and Eurotiomycetes, and Leotiomycetes and Sordariomycetes. Nevertheless, the backbone of the euascomycete phylogeny remains poorly resolved. Bayesian posterior probabilities were always higher than maximum parsimony bootstrap values, but converged with an increase in gene partitions analyzed in concatenated analyses. Comparison of five recent higher-level phylogenetic studies in ascomycetes demonstrates a high degree of uncertainty in the relationships between classes.

Phylogenomics and the reconstruction of the tree of life

Article

Jun 2005
NAT REV GENET

As more complete genomes are sequenced, phylogenetic analysis is entering a new era - that of phylogenomics. One branch of this expanding field aims to reconstruct the evolutionary history of organisms on the basis of the analysis of their genomes. Recent studies have demonstrated the power of this approach, which has the potential to provide answers to several fundamental evolutionary questions. However, challenges for the future have also been revealed. The very nature of the evolutionary history of organisms and the limitations of current phylogenetic reconstruction methods mean that part of the tree of life might prove difficult, if not impossible, to resolve with confidence.

Phylogenetic analysis using parsimony (* and other methods). Sinauer Associates

Jan 2002

D L Swofford

Swofford, D.L., 2002. PAUP*. Phylogenetic analysis using parsimony (* and other methods). Sinauer Associates, Sunderland, Massachusetts, USA.

ProtTest: selection of best-fit models of protein evolution

Jan 2005
2104-2105

F Abascal
R Zardoya
D Posada

Abascal, F., Zardoya, R., Posada, D., 2005. ProtTest: selection of best-fit models of protein evolution. Bioinformatics 21, 2104–2105.

Jan 2005
541-562

H Philippe
F Delsuc
H Brinkmann
N Lartillot

Philippe, H., Delsuc, F., Brinkmann, H., Lartillot, N., 2005a. Phylogenomics. Annu. Rev. Ecol. Evol. Syst. 36, 541–562.

Genome-scale approaches to resolving incongruence in molecular phylogenies

Jan 2003
798

Rokas
