Article

Fungal phylogenomics. A global analysis of fungal genomes and their evolution

Article
Full-text available
Recent molecular studies suggest that Opisthokonta, the eukaryotic supergroup including animals and fungi, should be expanded to include a diverse collection of primitively single-celled eukaryotes previously classified as Protozoa. These taxa include corallochytreans, nucleariids, ministeriids, choanoflagellates, and ichthyosporeans. Assignment of many of these taxa to Opisthokonta remains uncorroborated as it is based solely on small subunit ribosomal RNA trees lacking resolution and significant bootstrap support for critical nodes. Therefore, important details of the phylogenetic relationships of these putative opisthokonts with each other and with animals and fungi remain unclear. We have sequenced elongation factor 1-alpha (EF-1α), actin, β-tubulin, and HSP70, and/or α-tubulin from representatives of each of the proposed protistan opisthokont lineages, constituting the first protein-coding gene data for some of them. Our results show that members of all opisthokont protist groups encode a 12-amino acid insertion in EF-1α, previously found exclusively in animals and fungi. Phylogenetic analyses of combined multigene data sets including a diverse set of opisthokont and nonopisthokont taxa place all of the proposed opisthokont protists unequivocally in an exclusive clade with animals and fungi. Within this clade, the nucleariid appears as the closest sister taxon to fungi, while the corallochytrean and ichthyosporean form a group which, together with the ministeriid and choanoflagellates, form two to three separate sister lineages to animals. These results further establish Opisthokonta as a bona fide taxonomic group and suggest that any further testing of the legitimacy of this taxon should, at the least, include data from opisthokont protists. Our results also underline the critical position of these "animal-fungal allies" with respect to the origin and early evolution of animals and fungi.
Article
Full-text available
We conservatively estimate that there is a minimum of 712,000 extant fungal species worldwide, but we recognize that the actual species richness is likely much higher. This estimate was calculated from the ratio of fungal species to plant species for various ecologically defined groups of fungi in well-studied regions, along with data on each group's level of endemism. These calculations were based on information presented in the detailed treatments of the various fungal groups published in this special issue. Our intention was to establish a lower boundary for the number of fungal species worldwide that can be revised upward as more information becomes available. Establishing a lower boundary for fungal diversity is important as current estimates vary widely, hindering the ability to include fungi in discussions of ecology, biodiversity and conservation. Problems inherent in making these estimates, and the impact that additional data on fungal and plant species diversity will have on these estimates, are discussed.
Article
Full-text available
Although fungi are among the most important organisms in the world, only limited and incomplete information is currently available for most species and current estimates of species numbers for fungi differ significantly. This lack of basic information on taxonomic diversity has significant implications for many aspects of evolutionary biology. While the figure of 1.5 million estimated fungal species is commonly used, critics have questioned the validity of this estimate. Data on biogeographic distributions, levels of endemism, and host specificity must be taken into account when developing estimates of global fungal diversity. This paper introduces a set of papers that attempt to develop a rigorous, minimum estimate of global fungal diversity based on a critical assessment of current species lists and informed predictions of missing data and levels of endemism. As such, these papers represent both a meta-analysis of current data and a gap assessment to indicate where future research efforts should be concentrated. Keywords: Species lists; Ratio data; Endemism; Host specificity; Diversity estimates
Article
Full-text available
Microbial heavy metal retention was studied using seepage water sampled from a former uranium mining site in Eastern Thuringia, Germany. The seepage water has a low pH and contains high concentrations of metals, including uranium, rare earth elements (REE), and other heavy metals. Microbial influence on sorption and/or active uptake of heavy metals was studied using REE patterns. Incubation of seepage water with the bacterium Escherichia coli caused sorption of heavy metals to biomass. Incubation with the fungus Schizophyllum commune, however, had a much more pronounced effect, including significant fractionation of REE, pointing to the possibility of a specific active uptake mechanism. Extraction factors and fractionation coefficients are given to show the capacity of the presented bioextraction for future applications.
Article
Full-text available
Fungi, members of the kingdoms Chromista, Fungi s.str. and Protozoa studied by mycologists, have received scant consideration in discussions on biodiversity. The number of known species is about 69 000, but that in the world is conservatively estimated at 1·5 million; six times higher than hitherto suggested. The new world estimate is primarily based on vascular plant:fungus ratios in different regions. It is considered conservative as: (1) it is based on the lower estimates of world vascular plants; (2) no separate provision is made for the vast numbers of insects now suggested to exist; (3) ratios are based on areas still not fully known mycologically; and (4) no allowance is made for higher ratios in tropical and polar regions. Evidence that numerous new species remain to be found is presented. This realization has major implications for systematic manpower, resources, and classification. Fungi have played and continue to play a vital role in the evolution of terrestrial life (especially through mutualisms), ecosystem function and the maintenance of biodiversity, human progress, and the operation of Gaia. Conservation in situ and ex situ are complementary, and the significance of culture collections is stressed. International collaboration is required to develop a world inventory, quantify functional roles, and for effective conservation.
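The arithmetic behind a ratio-based estimate of this kind can be sketched as follows. The known-species count (69 000) and the 1.5 million estimate are taken from the abstract above; the plant species count and the fungus-per-plant ratio are hypothetical placeholders chosen only to illustrate the calculation, not values reported by the paper.

```python
# Figures stated in the abstract above.
KNOWN_FUNGAL_SPECIES = 69_000
ESTIMATED_FUNGAL_SPECIES = 1_500_000


def ratio_based_estimate(plant_species: int, fungi_per_plant: float) -> float:
    """Scale a plant species count by a fungus:plant ratio."""
    return plant_species * fungi_per_plant


def described_fraction(known: int, estimated: int) -> float:
    """Share of the estimated total that has already been described."""
    return known / estimated


# Hypothetical inputs (assumptions, not data from the paper).
example_estimate = ratio_based_estimate(plant_species=250_000, fungi_per_plant=6.0)
fraction = described_fraction(KNOWN_FUNGAL_SPECIES, ESTIMATED_FUNGAL_SPECIES)

print(f"example estimate: {example_estimate:,.0f} species")
print(f"described so far: {fraction:.1%}")
```

Under the abstract's own figures, fewer than one in twenty of the estimated species would have been described, which is the gap the abstract points to.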
Article
Full-text available
Aphids are important agricultural pests and also biological models for studies of insect-plant interactions, symbiosis, virus vectoring, and the developmental causes of extreme phenotypic plasticity. Here we present the 464 Mb draft genome assembly of the pea aphid Acyrthosiphon pisum. This first published whole genome sequence of a basal hemimetabolous insect provides an outgroup to the multiple published genomes of holometabolous insects. Pea aphids are host-plant specialists, they can reproduce both sexually and asexually, and they have coevolved with an obligate bacterial symbiont. Here we highlight findings from whole genome analysis that may be related to these unusual biological features. These findings include discovery of extensive gene duplication in more than 2000 gene families as well as loss of evolutionarily conserved genes. Gene family expansions relative to other published genomes include genes involved in chromatin modification, miRNA synthesis, and sugar transport. Gene losses include genes central to the IMD immune pathway, selenoprotein utilization, purine salvage, and the entire urea cycle. The pea aphid genome reveals that only a limited number of genes have been acquired from bacteria; thus the reduced gene count of Buchnera does not reflect gene transfer to the host genome. The inventory of metabolic genes in the pea aphid genome suggests that there is extensive metabolite exchange between the aphid and Buchnera, including sharing of amino acid biosynthesis between the aphid and Buchnera. The pea aphid genome provides a foundation for post-genomic studies of fundamental biological questions and applied agricultural problems.
Article
Full-text available
Lateral genetic transfer can lead to disagreements among phylogenetic trees comprising sequences from the same set of taxa. Where topological discordance is thought to have arisen through genetic transfer events, tree comparisons can be used to identify the lineages that may have shared genetic information. An 'edit path' of one or more transfer events can be represented with a series of subtree prune and regraft (SPR) operations, but finding the optimal such set of operations is NP-hard for comparisons between rooted trees, and may be so for unrooted trees as well. Efficient Evaluation of Edit Paths (EEEP) is a new tree comparison algorithm that uses evolutionarily reasonable constraints to identify and eliminate many unproductive search avenues, reducing the time required to solve many edit path problems. The performance of EEEP compares favourably to that of other algorithms when applied to strictly bifurcating trees with specified numbers of SPR operations. We also used EEEP to recover edit paths from over 19 000 unrooted, incompletely resolved protein trees containing up to 144 taxa as part of a large phylogenomic study. While inferred protein trees were far more similar to a reference supertree than random trees were to each other, the phylogenetic distance spanned by random versus inferred transfer events was similar, suggesting that real transfer events occur most frequently between closely related organisms, but can span large phylogenetic distances as well. While most of the protein trees examined here were very similar to the reference supertree, requiring zero or one edit operations for reconciliation, some trees implied up to 40 transfer events within a single orthologous set of proteins. Since sequence trees typically have no implied root and may contain unresolved or multifurcating nodes, the strategy implemented in EEEP is the most appropriate for phylogenomic analyses. 
The high degree of consistency among inferred protein trees shows that vertical inheritance is the dominant pattern of evolution, at least for the set of organisms considered here. However, the edit paths inferred using EEEP suggest an important role for genetic transfer in the evolution of microbial genomes as well.
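A single subtree prune and regraft (SPR) operation, the building block of the edit paths described above, can be sketched on a small rooted binary tree. This is only an illustration of one SPR move under simplifying assumptions (rooted, strictly bifurcating trees stored as nested tuples with unique leaf names); it is not the EEEP search algorithm, and the function names are hypothetical.

```python
def prune(tree, target):
    """Remove the subtree `target` and collapse its former parent node."""
    if tree == target:
        raise ValueError("cannot prune the whole tree")
    if isinstance(tree, str):  # a leaf that is not the target
        return tree
    left, right = tree
    if left == target:
        return right
    if right == target:
        return left
    return (prune(left, target), prune(right, target))


def regraft(tree, subtree, sibling):
    """Attach `subtree` as the new sister of the node equal to `sibling`."""
    if tree == sibling:
        return (tree, subtree)
    if isinstance(tree, str):
        return tree
    left, right = tree
    return (regraft(left, subtree, sibling), regraft(right, subtree, sibling))


def spr(tree, subtree, sibling):
    """Apply one SPR move: cut `subtree` out, then regraft it next to `sibling`."""
    pruned = prune(tree, subtree)
    moved = regraft(pruned, subtree, sibling)
    if moved == pruned:
        raise ValueError("regraft point not found after pruning")
    return moved


# Toy tree (((A,B),C),D): move leaf A so that it becomes the sister of D.
tree = ((("A", "B"), "C"), "D")
print(spr(tree, "A", "D"))
```

Tuples are ordered, so this sketch treats `("A", "B")` and `("B", "A")` as different; a real comparison of topologies would need to ignore child order.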
Article
Full-text available
The horizontal transfer of expressed genes from Bacteria into Ciliates which live in close contact with each other in the rumen (the foregut of ruminants) was studied using ciliate Expressed Sequence Tags (ESTs). More than 4000 ESTs were sequenced from representatives of the two major groups of rumen Ciliates: the order Entodiniomorphida (Entodinium simplex, Entodinium caudatum, Eudiplodinium maggii, Metadinium medium, Diploplastron affine, Polyplastron multivesiculatum and Epidinium ecaudatum) and the order Vestibuliferida, previously called Holotricha (Isotricha prostoma, Isotricha intestinalis and Dasytricha ruminantium). A comparison of the sequences with the completely sequenced genomes of Eukaryotes and Prokaryotes, followed by large-scale construction and analysis of phylogenies, identified 148 ciliate genes that specifically cluster with genes from the Bacteria and Archaea. The phylogenetic clustering with bacterial genes, coupled with the absence of close relatives of these genes in the Ciliate Tetrahymena thermophila, indicates that they have been acquired via Horizontal Gene Transfer (HGT) after the colonization of the gut by the rumen Ciliates. Among the HGT candidates, we found an over-representation (>75%) of genes involved in metabolism, specifically in the catabolism of complex carbohydrates, a rich food source in the rumen. We propose that the acquisition of these genes has greatly facilitated the Ciliates' colonization of the rumen, providing evidence for the role of HGT in the adaptation to new niches.
Article
Full-text available
The ability to accurately predict gene function based on gene sequence is an important tool in many areas of biological research. Such predictions have become particularly important in the genomics age in which numerous gene sequences are generated with little or no accompanying experimentally determined functional information. Almost all functional prediction methods rely on the identification, characterization, and quantification of sequence similarity between the gene of interest and genes for which functional information is available. Because sequence is the prime determining factor of function, sequence similarity is taken to imply similarity of function. There is no doubt that this assumption is valid in most cases. However, sequence similarity does not ensure identical functions, and it is common for groups of genes that are similar in sequence to have diverse (although usually related) functions. Therefore, the identification of sequence similarity is frequently not enough to assign a predicted function to an uncharacterized gene; one must have a method of choosing among similar genes with different functions. In such cases, most functional prediction methods assign likely functions by quantifying the levels of similarity among genes. I suggest that functional predictions can be greatly improved by focusing on how the genes became similar in sequence (i.e., evolution) rather than on the sequence similarity itself. It is well established that many aspects of comparative biology can benefit from evolutionary studies (Felsenstein 1985), and comparative molecular biology is no exception
Article
Full-text available
Using information from several metabolic databases, we have built our own metabolic database containing 434 pathways and 1157 different enzymes. We have used this information to construct a dendrogram that demonstrates the metabolic similarities between 282 species. The resulting species distribution and the clusters defined in the tree show a certain taxonomic congruence, especially in recent relationships between species. This dendrogram is another representation of the tree of life, based on metabolism that may complement the trees constructed by other methods. For example, the metabolic dissimilarity we demonstrate between Symbiobacterium thermophilum (previously defined as Actinobacteria) and the other Actinobacteria species, and the metabolic similarity between S. thermophilum and Clostridia, combined with other evidence, suggest that S. thermophilum may be re-classified as Firmicutes, Clostridia.
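The first step toward a dendrogram of this kind is a matrix of pairwise dissimilarities between species' metabolic profiles. The sketch below uses the Jaccard distance on sets of pathway names; the abstract does not state which measure the authors used, so the choice of Jaccard distance, the function names, and the toy profiles are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard_distance(a: set, b: set) -> float:
    """Return 1 - |A & B| / |A | B|, or 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pairwise_distances(profiles: dict) -> dict:
    """Distance for every unordered pair of species, keyed by sorted name pair."""
    return {
        (s1, s2): jaccard_distance(profiles[s1], profiles[s2])
        for s1, s2 in combinations(sorted(profiles), 2)
    }


# Hypothetical pathway profiles, not data from the paper.
profiles = {
    "species_A": {"glycolysis", "TCA cycle", "urea cycle"},
    "species_B": {"glycolysis", "TCA cycle"},
    "species_C": {"glycolysis", "methanogenesis"},
}

distances = pairwise_distances(profiles)
for pair, d in distances.items():
    print(pair, round(d, 3))
```

A distance table like this could then be passed to any standard hierarchical clustering routine to draw the tree.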
Article
Full-text available
Based on an overview of progress in molecular systematics of the true fungi (Fungi/Eumycota) since 1990, little overlap was found among single-locus data matrices, which explains why no large-scale multilocus phylogenetic analysis had been undertaken to reveal deep relationships among fungi. As part of the project "Assembling the Fungal Tree of Life" (AFTOL), results of four Bayesian analyses are reported with complementary bootstrap assessment of phylogenetic confidence based on (1) a combined two-locus data set (nucSSU and nucLSU rDNA) with 558 species representing all traditionally recognized fungal phyla (Ascomycota, Basidiomycota, Chytridiomycota, Zygomycota) and the Glomeromycota, (2) a combined three-locus data set (nucSSU, nucLSU, and mitSSU rDNA) with 236 species, (3) a combined three-locus data set (nucSSU, nucLSU rDNA, and RPB2) with 157 species, and (4) a combined four-locus data set (nucSSU, nucLSU, mitSSU rDNA, and RPB2) with 103 species. Because of the lack of complementarity among single-locus data sets, the last three analyses included only members of the Ascomycota and Basidiomycota. The four-locus analysis resolved multiple deep relationships within the Ascomycota and Basidiomycota that were not revealed previously or that received only weak support in previous studies. The impact of this newly discovered phylogenetic structure on supraordinal classifications is discussed. Based on these results and reanalysis of subcellular data, current knowledge of the evolution of septal features of fungal hyphae is synthesized, and a preliminary reassessment of ascomal evolution is presented. Based on previously unpublished data and sequences from GenBank, this study provides a phylogenetic synthesis for the Fungi and a framework for future phylogenetic studies on fungi.
Article
Full-text available
Comparisons of tree topologies provide relevant information in evolutionary studies. Most existing methods share the drawback of requiring a complete and exact mapping of terminal nodes between the compared trees. This severely limits the scope of genome-wide analyses, since trees containing duplications are pruned arbitrarily or discarded. To overcome this, we have developed treeKO, an algorithm that enables the comparison of tree topologies, even in the presence of duplication and loss events. To do so, treeKO recursively splits gene trees into pruned trees containing only orthologs to subsequently compute a distance based on the combined analyses of all pruned tree comparisons. In addition, treeKO implements the possibility of computing phylome support values, and reconciliation-based measures such as the number of inferred duplication and loss events.
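The recursive splitting step described above can be illustrated with a small sketch. It assumes that every internal node of the gene tree is already labelled as a speciation ("S") or duplication ("D") event and that trees are stored as nested tuples; this representation and the function name are hypothetical, and the sketch is one simplified reading of the abstract, not the treeKO implementation. The subsequent distance computation is not covered.

```python
from itertools import product


def split_at_duplications(node):
    """Return every pruned tree that keeps one child lineage per duplication.

    A node is either a leaf name (str) or a tuple (kind, left, right),
    where kind is "S" for speciation or "D" for duplication.
    """
    if isinstance(node, str):
        return [node]
    kind, left, right = node
    left_trees = split_at_duplications(left)
    right_trees = split_at_duplications(right)
    if kind == "D":
        # The two children are paralogous: keep them in separate pruned trees.
        return left_trees + right_trees
    if kind == "S":
        # Combine every pruned left subtree with every pruned right subtree.
        return [("S", l, r) for l, r in product(left_trees, right_trees)]
    raise ValueError(f"unknown node kind: {kind!r}")


# Toy gene tree with one duplication that predates the human/mouse split.
gene_tree = (
    "S",
    ("D", ("S", "human_1", "mouse_1"), ("S", "human_2", "mouse_2")),
    "yeast_1",
)

for pruned in split_at_duplications(gene_tree):
    print(pruned)
```

Each pruned tree contains at most one gene per species along any duplicated lineage, so it can be compared with a species tree using ordinary topology distances.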
Article
Full-text available
Carotenoids are colored compounds produced by plants, fungi, and microorganisms and are required in the diet of most animals for oxidation control or light detection. Pea aphids display a red-green color polymorphism, which influences their susceptibility to natural enemies, and the carotenoid torulene occurs only in red individuals. Unexpectedly, we found that the aphid genome itself encodes multiple enzymes for carotenoid biosynthesis. Phylogenetic analyses show that these aphid genes are derived from fungal genes, which have been integrated into the genome and duplicated. Red individuals have a 30-kilobase region, encoding a single carotenoid desaturase that is absent from green individuals. A mutation causing an amino acid replacement in this desaturase results in loss of torulene and of red body color. Thus, aphids are animals that make their own carotenoids.
Article
Full-text available
Fusarium species are among the most important phytopathogenic and toxigenic fungi. To understand the molecular underpinnings of pathogenicity in the genus Fusarium, we compared the genomes of three phenotypically diverse species: Fusarium graminearum, Fusarium verticillioides and Fusarium oxysporum f. sp. lycopersici. Our analysis revealed lineage-specific (LS) genomic regions in F. oxysporum that include four entire chromosomes and account for more than one-quarter of the genome. LS regions are rich in transposons and genes with distinct evolutionary profiles but related to pathogenicity, indicative of horizontal acquisition. Experimentally, we demonstrate the transfer of two LS chromosomes between strains of F. oxysporum, converting a non-pathogenic strain into a pathogen. Transfer of LS chromosomes between otherwise genetically isolated strains explains the polyphyletic origin of host specificity and the emergence of new pathogenic lineages in F. oxysporum. These findings put the evolution of fungal pathogenicity into a new perspective.
Article
Full-text available
Many bioinformatics analyses, ranging from gene clustering to phylogenetics, produce hierarchical trees as their main result. These are used to represent the relationships among different biological entities, thus facilitating their analysis and interpretation. A number of standalone programs are available that focus on tree visualization or that perform specific analyses on them. However, such applications are rarely suitable for large-scale surveys, in which a higher level of automation is required. Currently, many genome-wide analyses rely on tree-like data representation and hence there is a growing need for scalable tools to handle tree structures at large scale. Here we present the Environment for Tree Exploration (ETE), a Python programming toolkit that assists in the automated manipulation, analysis and visualization of hierarchical trees. ETE libraries provide a broad set of tree handling options as well as specific methods to analyze phylogenetic and clustering trees. Among other features, ETE allows for the independent analysis of tree partitions, has support for the extended newick format, provides an integrated node annotation system and allows trees to be linked to external data such as multiple sequence alignments or numerical arrays. In addition, ETE implements a number of built-in analytical tools, including phylogeny-based orthology prediction and cluster validation techniques. Finally, ETE's programmable tree drawing engine can be used to automate the graphical rendering of trees with customized node-specific visualizations. ETE provides a complete set of methods to manipulate tree data structures that extends current functionality in other bioinformatic toolkits of a more general purpose. ETE is free software and can be downloaded from http://ete.cgenomics.org.
Article
Full-text available
Oxidative phosphorylation is central to the energy metabolism of the cell. Due to adaptation to different life-styles and environments, fungal species have shaped their respiratory pathways in the course of evolution. To identify the main mechanisms behind the evolution of respiratory pathways, we conducted a phylogenomics survey of oxidative phosphorylation components in the genomes of sixty fungal species. Besides clarifying orthology and paralogy relationships among respiratory proteins, our results reveal three parallel losses of the entire complex I, two of which are coupled to duplications in alternative dehydrogenases. Duplications in respiratory proteins have been common, affecting 76% of the protein families surveyed. We detect several instances of paralogs of genes coding for subunits of respiratory complexes that have been recruited to other multi-protein complexes inside and outside the mitochondrion, emphasizing the role of evolutionary tinkering. Processes of gene loss and gene duplication followed by functional divergence have been rampant in the evolution of fungal respiration. Overall, the core proteins of the respiratory pathways are conserved in most lineages, with major changes affecting the lineages of microsporidia, Schizosaccharomyces and Saccharomyces/Kluyveromyces due to adaptation to anaerobic life-styles. We did not observe specific adaptations of the respiratory metabolism common to all pathogenic species.
Article
Full-text available
Candida glabrata has emerged as an important fungal pathogen of humans, causing life-threatening infections in immunocompromised patients. In contrast, mice do not develop disease upon systemic challenge, even with high infection doses. In this study we show that leukopenia, but not treatment with corticosteroids, leads to fungal burdens that are transiently increased over those in immunocompetent mice. However, even immunocompetent mice were not capable of clearing infections within 4 weeks. Tissue damage and immune responses to microabscesses were mild as monitored by clinical parameters, including blood enzyme levels, histology, myeloperoxidase, and cytokine levels. Furthermore, we investigated the suitability of amino acid auxotrophic C. glabrata strains for in vitro and in vivo studies of fitness and/or virulence. Histidine, leucine, or tryptophan auxotrophy, as well as a combination of these auxotrophies, did not influence in vitro growth in rich medium. The survival of all auxotrophic strains in immunocompetent mice was similar to that of the parental wild-type strain during the first week of infection and was only mildly reduced 4 weeks after infection, suggesting that C. glabrata is capable of utilizing a broad range of host-derived nutrients during infection. These data suggest that C. glabrata histidine, leucine, or tryptophan auxotrophic strains are suitable for the generation of knockout mutants for in vivo studies. Notably, our work indicates that C. glabrata has successfully developed immune evasion strategies enabling it to survive, disseminate, and persist within mammalian hosts.
Article
Full-text available
The ancestors of fungi are believed to be simple aquatic forms with flagellated spores, similar to members of the extant phylum Chytridiomycota (chytrids). Current classifications assume that chytrids form an early-diverging clade within the kingdom Fungi and imply a single loss of the spore flagellum, leading to the diversification of terrestrial fungi. Here we develop phylogenetic hypotheses for Fungi using data from six gene regions and nearly 200 species. Our results indicate that there may have been at least four independent losses of the flagellum in the kingdom Fungi. These losses of swimming spores coincided with the evolution of new mechanisms of spore dispersal, such as aerial dispersal in mycelial groups and polar tube eversion in the microsporidia (unicellular forms that lack mitochondria). The enigmatic microsporidia seem to be derived from an endoparasitic chytrid ancestor similar to Rozella allomycis, on the earliest diverging branch of the fungal phylogenetic tree.
Article
Full-text available
Microsporidia are obligate intracellular pathogens mainly infecting both vertebrate and invertebrate hosts. The group comprises approximately 150 genera with 1,200 species. Due to sequence divergence, phylogenetic reconstructions that are solely based on DNA sequence have been imprecise for these pathogens. Our previous study identified that three microsporidian genomes contained a putative sex-related locus similar to that of zygomycetes. In a comparison of genome architecture of the microsporidia to other fungi, Rhizopus oryzae, a zygomycete fungus, shared more common gene clusters with Encephalitozoon cuniculi, a microsporidian. This provides evidence supporting the hypothesis that microsporidia and zygomycete fungi may share a more recent common ancestor than other fungal lineages. Genetic recombination is an important outcome of sexual development. We describe genetic markers which will enable tests of whether sex occurs within E. cuniculi populations by analyzing tandem repeat DNA regions in three different isolates. Taken together, the phylogenetic relationship of microsporidia to fungi and the presence of a sex-related locus in their genomes suggest the microsporidia may have an extant sexual cycle. In addition, we describe recently reported evidence of horizontal gene transfer from Chlamydia to the E. cuniculi genome and show that these two obligate intracellular pathogens can infect the same host cells.
Article
Full-text available
The InParanoid project gathers proteomes of completely sequenced eukaryotic species plus Escherichia coli and calculates pairwise ortholog relationships among them. The new release 7.0 of the database has grown by an order of magnitude over the previous version and now includes 100 species and their collective 1.3 million proteins organized into 42.7 million pairwise ortholog groups. The InParanoid algorithm itself has been revised and is now both more specific and sensitive. Based on results from our recent benchmarking of low-complexity filters in homology assignment, a two-pass BLAST approach was developed that makes use of high-precision compositional score matrix adjustment, but avoids the alignment truncation that sometimes follows. We have also updated the InParanoid web site (http://InParanoid.sbc.su.se). Several features have been added, the response times have been improved and the site now sports a new, clearer look. As the number of ortholog databases has grown, it has become difficult to compare among these resources due to a lack of standardized source data and incompatible representations of ortholog relationships. To facilitate data exchange and comparisons among ortholog databases, we have developed and are making available two XML schemas: SeqXML for the input sequences and OrthoXML for the output ortholog clusters.
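Pairwise ortholog detection between two proteomes is commonly seeded with reciprocal best hits: two proteins are paired when each is the other's top-scoring match. The abstract above does not spell out the procedure, so the sketch below illustrates that general idea only; it is not the InParanoid algorithm, and the function names and similarity scores are hypothetical.

```python
def best_hits(scores: dict) -> dict:
    """Map each query protein to its highest-scoring subject protein."""
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: dict, b_vs_a: dict) -> list:
    """Pairs (a, b) where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical similarity scores keyed by (query, subject).
a_vs_b = {("a1", "b1"): 200, ("a1", "b2"): 50, ("a2", "b2"): 180, ("a3", "b1"): 90}
b_vs_a = {("b1", "a1"): 195, ("b2", "a2"): 175, ("b2", "a1"): 40}

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

Here `a3` finds `b1` as its best match, but `b1` prefers `a1`, so `a3` is left unpaired.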
Article
Full-text available
The highly compacted 2.9-Mb genome of Encephalitozoon cuniculi placed the microsporidia in the spotlight, encoding a mere 2,000 proteins and a highly reduced suite of biochemical pathways. This extreme level of reduction is not universal across the microsporidia, with genomes known to vary up to sixfold in size, suggesting that some genomes may harbor a gene content that is not as reduced as that of Enc. cuniculi. In this study, we present an in-depth survey of the large genome of Octosporea bayeri, a pathogen of Daphnia magna, with an estimated genome size of 24 Mb, in order to shed light on the organization and content of a large microsporidian genome. Using Illumina sequencing, 898 Mb of O. bayeri genome sequence was generated, resulting in 13.3 Mb of unique sequence. We annotated a total of 2,174 genes, of which 893 encode proteins with assigned function. The gene density of the O. bayeri genome is very low on average, but also highly uneven, so gene-dense regions also occur. The data presented here suggest that the O. bayeri proteome is well represented in this analysis and is more complex than that of Enc. cuniculi. Functional annotation of O. bayeri proteins suggests that this species might be less biochemically dependent on its host for its metabolism than its more reduced relatives. The combination of the data presented here, together with the imminent annotated genome of Daphnia magna, will provide a wealth of genetic and genomic tools to study host-parasite interactions in an interesting model for pathogenesis.
Article
Full-text available
Saccharomyces cerevisiae has been used for millennia in winemaking, but little is known about the selective forces acting on the wine yeast genome. We sequenced the complete genome of the diploid commercial wine yeast EC1118, resulting in an assembly of 31 scaffolds covering 97% of the S288c reference genome. The wine yeast differed strikingly from the other S. cerevisiae isolates in possessing 3 unique large regions, 2 of which were subtelomeric, the other being inserted within an EC1118 chromosome. These regions encompass 34 genes involved in key wine fermentation functions. Phylogeny and synteny analyses showed that 1 of these regions originated from a species closely related to the Saccharomyces genus, whereas the 2 other regions were of non-Saccharomyces origin. We identified Zygosaccharomyces bailii, a major contaminant of wine fermentations, as the donor species for 1 of these 2 regions. Although natural hybridization between Saccharomyces strains has been described, this report provides evidence that gene transfer may occur between Saccharomyces and non-Saccharomyces species. We show that the regions identified are frequent and differentially distributed among S. cerevisiae clades, being found almost exclusively in wine strains, suggesting acquisition through recent transfer events. Overall, these data show that the wine yeast genome is subject to constant remodeling through the contribution of exogenous genes. Our results suggest that these processes are favored by ecologic proximity and are involved in the molecular adaptation of wine yeasts to conditions of high sugar, low nitrogen, and high ethanol concentrations.
Article
Full-text available
To effectively apply evolutionary concepts in genome-scale studies, large numbers of phylogenetic trees have to be automatically analysed, at a level approaching human expertise. Complex architectures must be recognized within the trees, so that associated information can be extracted. Here, we present a new software library, PhyloPattern, for automating tree manipulations and analysis. PhyloPattern includes three main modules, which address essential tasks in high-throughput phylogenetic tree analysis: node annotation, pattern matching, and tree comparison. PhyloPattern thus allows the programmer to focus on: i) the use of predefined or user-defined annotation functions to perform immediate or deferred evaluation of node properties, ii) the search for user-defined patterns in large phylogenetic trees, iii) the pairwise comparison of trees by dynamically generating patterns from one tree and applying them to the other. PhyloPattern greatly simplifies and accelerates the work of the computer scientist in the evolutionary biology field. The library has been used to automatically identify phylogenetic evidence for domain shuffling or gene loss events in the evolutionary histories of protein sequences. However, any workflow that relies on phylogenetic tree analysis could be automated with PhyloPattern.
Article
Full-text available
While eukaryotes primarily evolve by duplication-divergence expansion (and reduction) of their own gene repertoire with only rare horizontal gene transfers, prokaryotes appear to evolve under both gene duplications and widespread horizontal gene transfers over long evolutionary time scales. But, the evolutionary origin of this striking difference in the importance of horizontal gene transfers remains by and large a mystery. We propose that the abundance of horizontal gene transfers in free-living prokaryotes is a simple but necessary consequence of two opposite effects: i) their apparent genome size constraint compared to typical eukaryote genomes and ii) their underlying genome expansion dynamics through gene duplication-divergence evolution, as demonstrated by the presence of many tandem and block repeated genes. In principle, this combination of genome size constraint and underlying duplication expansion should lead to a coalescent-like process with extensive turnover of functional genes. This would, however, imply the unlikely, systematic reinvention of functions from discarded genes within independent phylogenetic lineages. Instead, we propose that the long-term evolutionary adaptation of free-living prokaryotes must have resulted in the emergence of efficient non-phylogenetic pathways to circumvent gene loss. This need for widespread horizontal gene transfers due to genome size constraint implies, in particular, that prokaryotes must remain under strong selection pressure in order to maintain the long-term evolutionary adaptation of their "mutualized" gene pool, beyond the inevitable turnover of individual prokaryote species. By contrast, the absence of genome size constraint for typical eukaryotes has presumably relaxed their need for widespread horizontal gene transfers and strong selection pressure. 
Yet, the resulting loss of genetic functions, due to weak selection pressure and inefficient gene recovery mechanisms, must have ultimately favored the emergence of more complex life styles and ecological integration of many eukaryotes. This article was reviewed by Pierre Pontarotti, Eugene V Koonin and Sergei Maslov.
Article
Full-text available
Molecular phylogenetics and phylogenomics have greatly revised and enriched the fungal systematics in the last two decades. Most of the analyses have been performed by comparing single or multiple orthologous gene regions. Sequence alignment has always been an essential element in tree construction. These alignment-based methods (to be called the standard methods hereafter) need independent verification in order to put the fungal Tree of Life (TOL) on a secure footing. The ever-increasing number of sequenced fungal genomes and the recent success of our newly proposed alignment-free composition vector tree (CVTree, see Methods) approach have made the verification feasible. In all, 82 fungal genomes covering 5 phyla were obtained from the relevant genome sequencing centers. An unscaled phylogenetic tree with 3 outgroup species was constructed by using the CVTree method. Overall, the resultant phylogeny infers all major groups in accordance with standard methods. Furthermore, the CVTree provides information on the placement of several currently unsettled groups. Within the sub-phylum Pezizomycotina, our phylogeny places the Dothideomycetes and Eurotiomycetes as sister taxa. Within the Sordariomycetes, it infers that Magnaporthe grisea and the Plectosphaerellaceae are closely related to the Sordariales and Hypocreales, respectively. Within the Eurotiales, it supports that Aspergillus nidulans is the early-branching species among the 8 aspergilli. Within the Onygenales, it groups Histoplasma and Paracoccidioides together, supporting that the Ajellomycetaceae is a distinct clade from Onygenaceae. Within the sub-phylum Saccharomycotina, the CVTree clearly resolves two clades: (1) species that translate CTG as serine instead of leucine (the CTG clade) and (2) species that have undergone whole-genome duplication (the WGD clade). It places Candida glabrata at the base of the WGD clade. 
Using different input data and methodology, the CVTree approach is a good complement to the standard methods. The remarkable consistency between them has brought about more confidence to the current understanding of the fungal branch of TOL.
Article
Full-text available
Rhizopus oryzae is the primary cause of mucormycosis, an emerging, life-threatening infection characterized by rapid angioinvasive growth with an overall mortality rate that exceeds 50%. As a representative of the paraphyletic basal group of the fungal kingdom called "zygomycetes," R. oryzae is also used as a model to study fungal evolution. Here we report the genome sequence of R. oryzae strain 99-880, isolated from a fatal case of mucormycosis. The highly repetitive 45.3 Mb genome assembly contains abundant transposable elements (TEs), comprising approximately 20% of the genome. We predicted 13,895 protein-coding genes not overlapping TEs, many of which are paralogous gene pairs. The order and genomic arrangement of the duplicated gene pairs and their common phylogenetic origin provide evidence for an ancestral whole-genome duplication (WGD) event. The WGD resulted in the duplication of nearly all subunits of the protein complexes associated with respiratory electron transport chains, the V-ATPase, and the ubiquitin-proteasome systems. The WGD, together with recent gene duplications, resulted in the expansion of multiple gene families related to cell growth and signal transduction, as well as secreted aspartic protease and subtilase protein families, which are known fungal virulence factors. The duplication of the ergosterol biosynthetic pathway, especially the major azole target, lanosterol 14alpha-demethylase (ERG11), could contribute to the variable responses of R. oryzae to different azole drugs, including voriconazole and posaconazole. Expanded families of cell-wall synthesis enzymes, essential for fungal cell integrity but absent in mammalian hosts, reveal potential targets for novel and R. oryzae-specific diagnostic and therapeutic treatments.
Article
Full-text available
The first two steps of aflatoxin biosynthesis are catalyzed by the HexA/B and by the Pks protein. The phylogenetic analysis clearly distinguished fungal HexA/B from FAS subunits and from other homologous proteins. The phylogenetic trees of the HexA and HexB set of proteins share the same clustering. Proteins involved in the synthesis of fatty acids or in the aflatoxin or sterigmatocystin biosynthesis cluster separately. The Pks phylogenetic tree also differentiates the aflatoxin-related polypeptide sequences from those of other kinds of secondary metabolism. The function of some of the A. flavus Pks homologues may be deduced from the phylogenetic analysis. The conserved sequence motifs of protein domains shared by HexA/B and Pks - namely, β-ketoacyl synthase (KS), acetyl transferase (AT) and acyl carrier protein (ACP) - have been identified, and the HexA/B and Pks involved in aflatoxin biosynthesis have been distinguished from those involved in primary metabolism or other kinds of secondary metabolism.
Article
Full-text available
Multiple sequence alignments are central to many areas of bioinformatics. It has been shown that the removal of poorly aligned regions from an alignment increases the quality of subsequent analyses. Such an alignment trimming phase is complicated in large-scale phylogenetic analyses that deal with thousands of alignments. Here, we present trimAl, a tool for automated alignment trimming, which is especially suited for large-scale phylogenetic analyses. trimAl can consider several parameters, alone or in multiple combinations, for selecting the most reliable positions in the alignment. These include the proportion of sequences with a gap, the level of amino acid similarity and, if several alignments for the same set of sequences are provided, the level of consistency across different alignments. Moreover, trimAl can automatically select the parameters to be used in each specific alignment so that the signal-to-noise ratio is optimized. Availability: trimAl has been written in C++, it is portable to all platforms. trimAl is freely available for download (http://trimal.cgenomics.org) and can be used online through the Phylemon web server (http://phylemon2.bioinfo.cipf.es/). Supplementary Material is available at http://trimal.cgenomics.org/publications. Contact: tgabaldon@crg.es
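One of the trimming criteria named in the abstract above, the proportion of sequences with a gap at each position, can be illustrated with a minimal sketch. This is not trimAl's implementation or its interface: the function name `trim_gappy_columns`, the `max_gap_fraction` parameter and its default of 0.5 are assumptions chosen for illustration only.

```python
def trim_gappy_columns(alignment, max_gap_fraction=0.5, gap_chars="-"):
    """Drop alignment columns whose gap fraction exceeds max_gap_fraction.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    The function name and threshold are illustrative assumptions, not trimAl's API.
    """
    sequences = list(alignment.values())
    if not sequences:
        return {}
    n_seqs = len(sequences)
    n_cols = len(sequences[0])
    if any(len(seq) != n_cols for seq in sequences):
        raise ValueError("all aligned sequences must have the same length")

    # Keep a column only if the share of gapped sequences is within the limit.
    kept_columns = [
        i for i in range(n_cols)
        if sum(seq[i] in gap_chars for seq in sequences) / n_seqs <= max_gap_fraction
    ]
    return {
        name: "".join(seq[i] for i in kept_columns)
        for name, seq in alignment.items()
    }


if __name__ == "__main__":
    toy = {"a": "AC-G", "b": "A--G", "c": "AT-G"}
    print(trim_gappy_columns(toy))
```

In this toy alignment the third column is gapped in every sequence and is removed, while the second column (gapped in one of three sequences) survives under the assumed 0.5 threshold.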
Article
Full-text available
Recent steep declines in honey bee health have severely impacted the beekeeping industry, presenting new risks for agricultural commodities that depend on insect pollination. Honey bee declines could reflect increased pressures from parasites and pathogens. The incidence of the microsporidian pathogen Nosema ceranae has increased significantly in the past decade. Here we present a draft assembly (7.86 MB) of the N. ceranae genome derived from pyrosequence data, including initial gene models and genomic comparisons with other members of this highly derived fungal lineage. N. ceranae has a strongly AT-biased genome (74% A+T) and a diversity of repetitive elements, complicating the assembly. Of 2,614 predicted protein-coding sequences, we conservatively estimate that 1,366 have homologs in the microsporidian Encephalitozoon cuniculi, the most closely related published genome sequence. We identify genes conserved among microsporidia that lack clear homology outside this group, which are of special interest as potential virulence factors in this group of obligate parasites. A substantial fraction of the diminutive N. ceranae proteome consists of novel and transposable-element proteins. For a majority of well-supported gene models, a conserved sense-strand motif can be found within 15 bases upstream of the start codon; a previously uncharacterized version of this motif is also present in E. cuniculi. These comparisons provide insight into the architecture, regulation, and evolution of microsporidian genomes, and will drive investigations into honey bee-Nosema interactions.
Article
The recently-developed statistical method known as the "bootstrap" can be used to place confidence intervals on phylogenies. It involves resampling points from one's own data, with replacement, to create a series of bootstrap samples of the same size as the original data. Each of these is analyzed, and the variation among the resulting estimates taken to indicate the size of the error involved in making estimates from the original data. In the case of phylogenies, it is argued that the proper method of resampling is to keep all of the original species while sampling characters with replacement, under the assumption that the characters have been independently drawn by the systematist and have evolved independently. Majority-rule consensus trees can be used to construct a phylogeny showing all of the inferred monophyletic groups that occurred in a majority of the bootstrap samples. If a group shows up 95% of the time or more, the evidence for it is taken to be statistically significant. Existing computer programs can be used to analyze different bootstrap samples by using weights on the characters, the weight of a character being how many times it was drawn in bootstrap sampling. When all characters are perfectly compatible, as envisioned by Hennig, bootstrap sampling becomes unnecessary; the bootstrap method would show significant evidence for a group if it is defined by three or more characters.
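The resampling scheme described in the abstract above (keep every species, draw characters with replacement, or equivalently assign each character a weight equal to the number of times it was drawn) can be sketched as follows. The function names and the dictionary-of-strings alignment format are assumptions made for illustration; tree inference on each replicate and the majority-rule consensus step are left out.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: same taxa, columns resampled with replacement.

    alignment: dict mapping taxon name -> character string (equal lengths).
    """
    n_sites = len(next(iter(alignment.values())))
    # Draw n_sites column indices with replacement; all taxa share the same draw.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        taxon: "".join(seq[i] for i in columns)
        for taxon, seq in alignment.items()
    }


def character_weights(n_sites, rng):
    """Equivalent weighting form: weight[i] = times character i was drawn."""
    weights = [0] * n_sites
    for _ in range(n_sites):
        weights[rng.randrange(n_sites)] += 1
    return weights


if __name__ == "__main__":
    rng = random.Random(42)
    toy = {"t1": "ACGTAC", "t2": "ACGTTC", "t3": "AGGTAC"}
    print(bootstrap_alignment(toy, rng))
    print(character_weights(6, rng))
```

Each replicate would then be analyzed with the same tree-building method as the original data, and the fraction of replicates containing a given group would be reported as its bootstrap support.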
Article
Identifying the mechanisms of eukaryotic genome evolution by comparative genomics is often complicated by the multiplicity of events that have taken place throughout the history of individual lineages, leaving only distorted and superimposed traces in the genome of each living organism. The hemiascomycete yeasts, with their compact genomes, similar lifestyle and distinct sexual and physiological properties, provide a unique opportunity to explore such mechanisms. We present here the complete, assembled genome sequences of four yeast species, selected to represent a broad evolutionary range within a single eukaryotic phylum, that after analysis proved to be molecularly as diverse as the entire phylum of chordates. A total of approximately 24,200 novel genes were identified, the translation products of which were classified together with Saccharomyces cerevisiae proteins into about 4,700 families, forming the basis for interspecific comparisons. Analysis of chromosome maps and genome redundancies reveal that the different yeast lineages have evolved through a marked interplay between several distinct molecular mechanisms, including tandem gene repeat formation, segmental duplication, a massive genome duplication and extensive gene loss.
Article
A method to measure how similar are two phylogenetic trees for the same collection of evolutionary units (EUs) is described. The measure does not change with changes in the direction of evolution in either of the two trees being compared, and thus it depends only on the convexity and proximity of the groups of EUs within the collection under study. It can also be used with only partially resolved phylogenetic trees. The measure is based on the idea that a group of four EUs is the smallest group for which there is more than one possible distinct undirected phylogenetic tree. For a given phylogenetic tree, the undirected subtree inherited by a group of four EUs is the tree that results when all the branches containing only EUs not in this group of four are removed. Every group of four EUs either inherits one of three distinct types of undirected phylogenetic trees, or is unresolved. Two phylogenetic trees can be compared on the basis of which groups of four EUs inherit the same type of undirected phylogenetic tree. The ideas can be extended to comparisons of three or more trees.
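The four-EU comparison described in the abstract above can be sketched in a few lines. The nested-tuple tree encoding, the function names, and the choice to count a quartet that is unresolved in both trees as agreeing are assumptions made for this illustration, not details taken from the paper.

```python
from itertools import combinations


def clades(tree):
    """Return (leaf set, list of leaf sets under every internal node).

    tree: a leaf name (str) or a tuple of subtrees, e.g. (("A", "B"), ("C", "D")).
    """
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found.extend(child_clades)
    found.append(leaves)
    return leaves, found


def quartet_topology(quartet, clade_list):
    """Undirected topology inherited by four EUs, as a pair of pairs, or None."""
    q = frozenset(quartet)
    for clade in clade_list:
        inside = q & clade
        # A branch separating two of the four EUs from the other two fixes the topology.
        if len(inside) == 2:
            return frozenset([inside, q - inside])
    return None  # unresolved for this group of four


def quartet_agreement(tree_a, tree_b):
    """Fraction of four-EU groups that inherit the same topology in both trees."""
    leaves_a, clades_a = clades(tree_a)
    leaves_b, clades_b = clades(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("both trees must contain the same EUs")
    same = total = 0
    for quartet in combinations(sorted(leaves_a), 4):
        total += 1
        if quartet_topology(quartet, clades_a) == quartet_topology(quartet, clades_b):
            same += 1
    return same / total if total else 1.0


if __name__ == "__main__":
    t1 = ((("A", "B"), "C"), ("D", "E"))
    t2 = ((("A", "C"), "B"), ("D", "E"))
    print(quartet_agreement(t1, t2))
```

Because each topology is read from branches that split the four EUs two against two, the result does not depend on where either tree is rooted, which matches the direction-independence stated in the abstract.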
Article
A metric on general phylogenetic trees is presented. This extends the work of most previous authors, who constructed metrics for binary trees. The metric presented in this paper makes possible the comparison of the many nonbinary phylogenetic trees appearing in the literature. This provides an objective procedure for comparing the different methods for constructing phylogenetic trees. The metric is based on elementary operations which transform one tree into another. Various results obtained in applying these operations are given. They enable the distance between any pair of trees to be calculated efficiently. This generalizes previous work by Bourque to the case where interior vertices can be labeled, and labels may contain more than one element or may be empty.
Article
We survey the problem of comparing labeled trees based on simple local operations of deleting, inserting, and relabeling nodes. These operations lead to the tree edit distance, alignment distance, and inclusion problem. For each problem we review the results available and present, in detail, one or more of the central algorithms for solving the problem.
Article
The definition of similarity measures for phylogenetic trees has been motivated by the computation of consensus trees, the search by similarity in databases, and the assessment of phylogenetic reconstruction methods. The transposition distance for fully resolved trees is a recent addition to the extensive collection of available metrics for comparing phylogenetic trees. In this work, we generalize the transposition metric from fully resolved to arbitrary phylogenetic trees, through a construction that involves an embedding of the set of phylogenetic trees (up to isomorphisms) with a fixed number of labeled leaves into a symmetric group. We also show that this transposition distance can be computed in linear time and we establish some of its basic properties.
Article
Our understanding of the tree of life (TOL) is still fragmentary. Until recently, molecular phylogeneticists have built trees based on ribosomal RNA sequences and selected protein sequences, which, however, usually suffered from lack of support for the deeper branches and inconsistencies probably due to limited subsampling of the entire genome. Now, phylogenetic hypotheses can be based on the analysis of full genomes. We used available complete genome data as well as the eukaryote orthologous group (KOG) proteins to reconstruct with confidence basal branches of the fungal TOL. Phylogenetic analysis of a core of 531 KOGs shared among 21 fungal genomes, three animal genomes and one plant genome showed a single tree with high support resulting from four different methods of phylogenetic reconstruction. The single tree that we inferred from our dataset showed excellent nodal support for each branch, suggesting that it reflects the true phylogenetic relationships of the species involved.
Conference Paper
The comparison of rooted phylogenetic trees is essential to querying phylogenetic databases such as TreeBASE. Current comparison methods are based on either tree edit distances or common subtrees. However, a limitation of such methods is their inherent complexity. In this paper, a new distance over fully resolved phylogenetic trees, the transposition distance, is described which is based on a well-known bijection between perfect matchings and phylogenetic trees, and simple linear-time algorithms are presented for computing the new distance.
Article
In this study, we have carried out an in silico analysis of the available mitochondrial and nuclear genomes of fungi in order to identify the oxidative phosphorylation (OXPHOS) proteome, the complete set of proteins that perform OXPHOS in mitochondria. The presence of OXPHOS proteins has been investigated in 27 nuclear and 52 mitochondrial genomes of fungi. Comparative genomics reveals a high conservation of the OXPHOS system within each fungal phylum, and notable differences between the OXPHOS proteomes of the fungal phyla. The most striking differences concerned Complexes I and V. The absence of Complex I has been previously described in various species of Ascomycota and Microsporidia, and the NDUFB4 and NURM accessory subunits of Complex I appear to be specific to fungi belonging to the subphylum Pezizomycotina. In addition, the Complex V essential subunit ATP14 appears to be specific to two subphyla of Ascomycota: the Saccharomycotina and Pezizomycotina.
Article
Aspergillus fumigatus possesses a branched mitochondrial electron transport chain, with both cyanide-sensitive and -insensitive oxygen-consumption activities. Mitochondrial reactive oxygen species mediate signaling for alternative oxidase (AOX) expression. A 1173 bp-long Afaox gene encoding a 40 kDa protein has been cloned and identified. Recombinant constructs containing the Afaox ORF were transformed into Escherichia coli and Saccharomyces cerevisiae for heterologous expression. Positive transformants showed a cyanide-resistant and salicylhydroxamic acid-sensitive respiration, whereas in control cells the oxygen uptake was completely inhibited after KCN addition. In A. fumigatus, AOX activity and mRNA expression were both induced with menadione or paraquat, suggesting an important role of AOX under oxidative stress.
Article
D-Amino acid oxidase (DAAO) is a FAD-containing flavoenzyme that catalyzes the oxidative deamination of D-isomers of neutral and polar amino acids. This enzymatic activity has been identified in most eukaryotic organisms, the only exception being plants. In the various organisms in which it does occur, DAAO fulfills distinct physiological functions: from a catabolic role in yeast cells, which allows them to grow on D-amino acids as carbon and energy sources, to a regulatory role in the human brain, where it controls the levels of the neuromodulator D-serine. Since 1935, DAAO has been the object of an astonishing number of investigations and has become a model for the dehydrogenase-oxidase class of flavoproteins. Structural and functional studies have suggested that specific physiological functions are implemented through the use of different structural elements that control access to the active site and substrate/product exchange. Current research is attempting to delineate the regulation of DAAO functions in the context of complex biochemical and physiological networks.
Article
In considering the Origin of Species, it is quite conceivable that a naturalist, reflecting on the mutual affinities of organic beings, on their embryological relations, their geographical distribution, geological succession, and other such facts, might come to the conclusion that each species had not been independently created, but had descended, like varieties, from other species. Nevertheless, such a conclusion, even if well founded, would be unsatisfactory, until it could be shown how the innumerable species inhabiting this world have been modified, so as to acquire that perfection of structure and coadaptation which most justly excites our admiration. Naturalists continually refer to external conditions, such as climate, food, &c, as the only possible cause of variation. In one very limited sense, as we shall hereafter see, this may be true; but it is preposterous to attribute to mere external conditions, the structure, for instance, of the woodpecker, with its feet, tail, beak, and tongue, so admirably adapted to catch insects under the bark of trees. In the case of the misseltoe, which draws its nourishment from certain trees, which has seeds that must be transported by certain birds, and which has flowers with separate sexes absolutely requiring the agency of certain insects to bring pollen from one flower to the other, it is equally preposterous to account for the structure of this parasite, with its relations to several distinct organic beings, by the effects of external conditions, or of habit, or of the volition of the plant itself.
Article
Phylogenetic analyses serve many purposes, including the establishment of orthology relationships, the prediction of protein function and the detection of important evolutionary events. Within the context of the sequencing of the genome of the pea aphid, Acyrthosiphon pisum, we undertook a phylogenetic analysis for every protein of this species. The resulting phylome includes the evolutionary relationships of all predicted aphid proteins and their homologues among 13 other fully-sequenced arthropods and three out-group species. Subsequent analyses have revealed multiple gene expansions that are specific to aphids and have served to transfer functional annotations to 4058 pea aphid genes that display one-to-one orthology relationships with Drosophila melanogaster annotated genes. All phylogenies and alignments are accessible through the PhylomeDB database. Here we provide a description of this dataset and some examples of how it can be exploited.
Article
The relevance of horizontal gene transfer (HGT) in eukaryotes is a matter of debate. Recent analyses have shown clear examples in some species such as Candida parapsilosis, but broader surveys are lacking. To assess the impact of HGT in the fungal kingdom, we searched for prokaryotic-derived HGTs in 60 fully sequenced genomes. Using strict phylogenomic criteria, we detected 713 transferred genes. HGT affected most fungal clades, with particularly high rates in Pezizomycotina. Transferred genes included bacterial arsenite reductase, catalase, different racemases and peptidoglycan metabolism enzymes. Our results suggest an important role for HGT in fungal evolution.
Article
Candidemia remains a major cause of morbidity and mortality in the health care setting, and the epidemiology of Candida infection is changing. Clinical data from patients with candidemia were extracted from the Prospective Antifungal Therapy (PATH) Alliance database, a comprehensive registry that collects information regarding invasive fungal infections. A total of 2019 patients, enrolled from 1 July 2004 through 5 March 2008, were identified. Data regarding the candidemia episode were analyzed, including the specific fungal species and patient survival at 12 weeks after diagnosis. The incidence of candidemia caused by non-Candida albicans Candida species (54.4%) was higher than the incidence of candidemia caused by C. albicans (45.6%). The overall, crude 12-week mortality rate was 35.2%. Patients with Candida parapsilosis candidemia had the lowest mortality rate (23.7%; P<.001) and were less likely to be neutropenic (5.1%; P<.001) and to receive corticosteroids (33.5%; P<.001) or other immunosuppressive drugs (7.9%; P=.002), compared with patients infected with other Candida species. Candida krusei candidemia was most commonly associated with prior use of antifungal agents (70.6%; P<.001), hematologic malignancy (52.9%; P<.001) or stem cell transplantation (17.7%; P<.001), neutropenia (45.1%; P<.001), and corticosteroid treatment (60.8%; P<.001). Patients with C. krusei candidemia had the highest crude 12-week mortality in this series (52.9%; P<.001). Fluconazole was the most commonly administered antimicrobial, followed by the echinocandins, and amphotericin B products were infrequently administered. The epidemiology and choice of therapy for candidemia are rapidly changing. Additional study is warranted to differentiate host factors and differences in virulence among Candida species and to determine the best therapeutic regimen.