Bioprospecting metagenomes: glycosyl hydrolases for converting biomass

Biology Department, Brookhaven National Laboratory, Upton, New York 11973, USA.
Biotechnology for Biofuels (Impact Factor: 6.22). 06/2009; 2:10. DOI: 10.1186/1754-6834-2-10
Source: PubMed

ABSTRACT Over evolutionary time, microorganisms have evolved and accumulated remarkable physiological and functional heterogeneity, and they now constitute the major reserve of genetic diversity on Earth. Metagenomics, the study of genetic material recovered directly from environmental samples, gives access to this diversity without the need to cultivate cells. Accordingly, microbial communities and their metagenomes, isolated from biotopes with high turnover rates of recalcitrant biomass such as lignocellulosic plant cell walls, have become a major resource for bioprospecting and a major asset in the search for new biocatalysts (enzymes) for various industrial processes, including the production of biofuels from plant feedstocks. However, although metagenomics technologies have contributed to the discovery of novel enzymes, this relatively new enterprise still requires major improvements. In this review, we compare function-based metagenome screening with sequence-based metagenome data mining and discuss the advantages and limitations of both methods. We also describe the unusual enzymes discovered via metagenomic approaches and discuss the future prospects for metagenome technologies.


Available from: Luen Luen Li, Jun 03, 2015
  • ABSTRACT: More than 1000 microbial species live in the complex human intestine. The gut microbial community plays an important role in protecting the host against pathogenic microbes, modulating immunity, and regulating metabolic processes, and it is even regarded as an endocrine organ. However, traditional culture methods are very limited in their ability to identify microbes. With the application of molecular biology techniques to the intestinal microbiome, especially metagenomic sequencing based on next-generation sequencing, progress has been made in the study of the human intestinal microbiome. Metagenomics can be used to study intestinal microbiome diversity and dysbiosis, as well as their relationship to health and disease. Moreover, functional metagenomics can identify novel functional genes, microbial pathways, antibiotic resistance genes, and functional dysbiosis of the intestinal microbiome, and can determine interactions and co-evolution between microbiota and host, though some limitations remain. Metatranscriptomics, metaproteomics and metabolomics greatly complement metagenomics in the understanding of the human gut microbiome. This review aims to demonstrate that metagenomics can be a powerful tool for studying the human gut microbiome, with encouraging prospects. The limitations of metagenomics that remain to be overcome are also discussed, and metatranscriptomics, metaproteomics and metabolomics are briefly discussed in relation to the study of the human gut microbiome.
    World Journal of Gastroenterology 01/2015; 21(3):803-814. DOI:10.3748/wjg.v21.i3.803 · 2.43 Impact Factor
  • ABSTRACT: As one of the most abundant agricultural wastes, sugarcane bagasse is largely under-exploited, but it has great potential for the biofuel, fermentation, and cellulosic biorefinery industries. It also provides a unique ecological niche, as the microbes in this lignocellulose-rich environment thrive at relatively high temperatures (50°C) in microenvironments that range from an aerobic surface to an anoxic interior. The microbial community in bagasse is thus a good resource for the discovery and characterization of new biomass-degrading enzymes; however, it remains largely unexplored. We have constructed a fosmid library of sugarcane bagasse and obtained the largest bagasse metagenome to date. A taxonomic classification of the bagasse metagenome reveals the predominance of Proteobacteria, which are also found in high abundance in other aerobic environments. Based on the functional characterization of biomass-degrading enzymes, we have demonstrated that the bagasse microbial community benefits from a large repertoire of lignocellulolytic enzymes, which allows it to digest different components of lignocellulose into monomeric sugars. Comparative genomic analyses with other lignocellulolytic and non-lignocellulolytic metagenomes show that microbial communities are taxonomically separable by their aerobic "open" or anoxic "closed" environments. Importantly, a functional analysis of lignocellulose-active genes (based on the CAZy classifications) reveals core enzymes that are highly conserved within the lignocellulolytic group, regardless of taxonomic composition. Cellulases, in particular, are markedly more pronounced than in the non-lignocellulolytic group. In addition to the core enzymes, the bagasse fosmid library also contains some uniquely enriched glycoside hydrolases, as well as a large repertoire of the newly defined auxiliary activity proteins. Our study demonstrates a conservation and diversification of carbohydrate-active genes among diverse microbial species in different biomass-degrading niches, and underscores the importance of taking a global approach that functionally investigates a microbial community as a whole rather than focusing on individual organisms.
    Biotechnology for Biofuels 12/2015; 8(1):16. DOI:10.1186/s13068-015-0200-8 · 6.22 Impact Factor
  • ABSTRACT: Protein sequences predicted from metagenomic datasets are annotated by identifying their homologs via sequence comparisons with reference or curated proteins. However, a majority of metagenomic protein sequences are partial-length, because genes are identified on sequencing reads or on assembled nucleotide contigs, which themselves are often very fragmented. The fragmented nature of metagenomic protein predictions adversely impacts homology detection and, therefore, the quality of the overall annotation of the dataset. Here we present a novel algorithm called GRASP that accurately identifies the homologs of a given reference protein sequence from a database consisting of partial-length metagenomic proteins. Our homology detection strategy is guided by the reference sequence and involves the simultaneous search and assembly of overlapping database sequences. GRASP was compared to three commonly used protein sequence search programs (BLASTP, PSI-BLAST and FASTM). Our evaluations using several simulated and real datasets show that GRASP has a significantly higher sensitivity than these programs while maintaining a very high specificity. GRASP can be a very useful program for detecting and quantifying taxonomic and protein family abundances in metagenomic datasets. GRASP is implemented in GNU C++ and is freely available. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
    Nucleic Acids Research 11/2014; 43(3). DOI:10.1093/nar/gku1210 · 8.81 Impact Factor