The convergence of carbohydrate active gene repertoires in human gut microbes

Department of Chemistry and Biochemistry, University of Colorado, Boulder, CO 80309, USA.
Proceedings of the National Academy of Sciences (Impact Factor: 9.67). 10/2008; 105(39):15076-81. DOI: 10.1073/pnas.0807339105


The extreme variation in gene content among phylogenetically related microorganisms suggests that gene acquisition, expansion, and loss are important evolutionary forces for adaptation to new environments. Accordingly, phylogenetically disparate organisms that share a habitat may converge in gene content as they adapt to confront shared challenges. This response should be especially pronounced for functional genes that are important for survival in a particular habitat. We illustrate this principle by showing that the repertoires of two different types of carbohydrate-active enzymes, glycoside hydrolases and glycosyltransferases, have converged in bacteria and archaea that live in the human gut and that this convergence is largely due to horizontal gene transfer rather than gene family expansion. We also identify gut microbes that may have more similar dietary niches in the human gut than would be expected based on phylogeny. The techniques used to obtain these results should be broadly applicable to understanding the functional genes and evolutionary processes important for adaptation in many environments and useful for interpreting the large number of reference microbial genome sequences being generated for the International Human Microbiome Project.

Available from: Pedro M Coutinho, Sep 29, 2015
    • "On the other hand, human diet contains large quantities of glycans in cereals, fruits and vegetables, most of which reach the distal gut (colon) intact and are harvested by the gut microbiota to generate metabolic fuel. This is in accord with the important role that glycan metabolism plays in modulating the composition of the gut microbiota, and the inherent impact of this on health (Lozupone et al. 2008; Muegge et al. 2011; Scott et al. 2013). Starch, which is an exclusively α-glucan polymer organized in supramolecular insoluble semicrystalline granules (Tester et al. 2004), is the most abundant glycan in human diet. "
    ABSTRACT: α-Glucans from bacterial exo-polysaccharides or diet, e.g., resistant starch, legumes and honey are abundant in the human gut and fermentation of resistant fractions of these α-glucans by probiotic lactobacilli and bifidobacteria impacts human health positively. The ability to degrade polymeric α-glucans is confined to few strains encoding extracellular amylolytic activities of glycoside hydrolase (GH) family 13. Debranching pullulanases of the subfamily GH13_14 are the most common extracellular GH13 enzymes in lactobacilli, whereas corresponding enzymes are mainly α-amylases and amylopullulanases in bifidobacteria. Extracellular GH13 enzymes from both genera are frequently modular and possess starch binding domains, which are important for efficient catalysis and possibly to mediate attachment of cells to starch granules. α-1,6-Linked glucans, e.g., isomalto-oligosaccharides are potential prebiotics. The enzymes targeting these glucans are the most abundant intracellular GHs in bifidobacteria and lactobacilli. A phosphoenolpyruvate-dependent phosphotransferase system and a GH4 phospho-α-glucosidase are likely involved in metabolism of isomaltose and isomaltulose in probiotic lactobacilli based on transcriptional analysis. This specificity within GH4 is unique for lactobacilli, whereas canonical GH13_31 α-1,6-glucosidases active on longer α-1,6-gluco-oligosaccharides are ubiquitous in bifidobacteria and lactobacilli. Malto-oligosaccharide utilization operons encode more complex, diverse, and less biochemically understood activities in bifidobacteria compared to lactobacilli, where important members have been recently described at the molecular level. This review presents some aspects of α-glucan metabolism in probiotic bacteria and highlights vague issues that merit experimental effort, especially oligosaccharide uptake and the functionally unassigned enzymes, featuring in this important facet of glycan turnover by members of the gut microbiota.
    Biologia 06/2014; 69(6). DOI:10.2478/s11756-014-0367-7 · 0.83 Impact Factor
    • "Of them, 14 were enriched in T2D patients, whereas the remaining 8 were enriched in healthy individuals (Table 1). Further literature mining showed that many of the T2D-enriched microbial strains/species were previously identified as potential opportunistic pathogens, such as Bacteroides caccae ATCC 43185 (35), Clostridium bolteae ATCC BAA-613 (36), Escherichia coli DEC6E or not yet well-characterized microbial strains that are distinct from currently recognized strains such as those named Alistipes sp., Bacteroides sp., Parabacteroides sp. and Subdoligranulum sp. In addition, the mucin-degrading strain Akkermansia muciniphila ATCC BAA-835 was also found to be significantly enriched in T2D patients, which was also observed in the previous study (8). "
    ABSTRACT: Shotgun metagenome sequencing has become a fast, cheap and high-throughput technology for characterizing microbial communities in complex environments and human body sites. However, accurate identification of microorganisms at the strain/species level remains extremely challenging. We present a novel k-mer-based approach, termed GSMer, that identifies genome-specific markers (GSMs) from currently sequenced microbial genomes, which were then used for strain/species-level identification in metagenomes. Using 5390 sequenced microbial genomes, 8 770 321 50-mer strain-specific and 11 736 360 species-specific GSMs were identified for 4088 strains and 2005 species (4933 strains), respectively. The GSMs were first evaluated against mock community metagenomes, recently sequenced genomes and real metagenomes from different body sites, suggesting that the identified GSMs were specific to their targeting genomes. Sensitivity evaluation against synthetic metagenomes with different coverage suggested that 50 GSMs per strain were sufficient to identify most microbial strains with ≥0.25× coverage, and 10% of selected GSMs in a database should be detected for confident positive callings. Application of GSMs identified 45 and 74 microbial strains/species significantly associated with type 2 diabetes patients and obese/lean individuals from corresponding gastrointestinal tract metagenomes, respectively. Our result agreed with previous studies but provided strain-level information. The approach can be directly applied to identify microbial strains/species from raw metagenomes, without the effort of complex data pre-processing.
    Nucleic Acids Research 02/2014; 42(8). DOI:10.1093/nar/gku138 · 9.11 Impact Factor
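
The genome-specific marker idea in the abstract above can be illustrated with a toy sketch. This is not the GSMer implementation: the function names, the toy sequences, and k = 4 are all hypothetical, and the sketch ignores reverse complements, sequencing error, and the scale of real genome collections. It only shows the core step of keeping k-mers that occur in exactly one genome and then measuring what fraction of a genome's markers appear in a set of reads.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def genome_specific_markers(genomes, k):
    """Map each genome name to the k-mers that occur in that genome only."""
    owners = defaultdict(set)  # k-mer -> names of genomes containing it
    for name, seq in genomes.items():
        for kmer in kmers(seq, k):
            owners[kmer].add(name)
    markers = {name: set() for name in genomes}
    for kmer, names in owners.items():
        if len(names) == 1:  # unique to a single genome
            markers[next(iter(names))].add(kmer)
    return markers


def detected_fraction(reads, marker_set, k):
    """Fraction of a genome's markers observed in at least one read."""
    if not marker_set:
        return 0.0
    seen = set()
    for read in reads:
        seen |= kmers(read, k) & marker_set
    return len(seen) / len(marker_set)


# Toy data (hypothetical): two "genomes" differing at one position.
genomes = {"strain_A": "ACGTACGGA", "strain_B": "ACGTTCGGA"}
markers = genome_specific_markers(genomes, k=4)

reads = ["GTACGG"]  # a single read drawn from strain_A
frac_a = detected_fraction(reads, markers["strain_A"], k=4)
frac_b = detected_fraction(reads, markers["strain_B"], k=4)
print(frac_a, frac_b)
```

In this toy setting a strain could be called present when its detected fraction exceeds a cutoff; the abstract reports 10% of selected markers as the level needed for a confident call, which would correspond to comparing `frac_a` against 0.10.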
    • "In order to annotate and interpret these discoveries, it is important to place them in the context of current knowledge using phylogenetic techniques. For example, UniFrac analysis of gene-family phylogenies identified convergent evolution of carbohydrate-active enzymes in human gut communities [1]. Another study showed that blooms of closely related species cause the phylogenetic diversity of marine microbes to be much lower in surface waters compared to below the photic zone at the HOT/ALOHA study site, despite the fact that communities at different depths have similar numbers of operational taxonomic units (OTUs) [2]. "
    ABSTRACT: Sequence-based phylogenetic trees are a well-established tool for characterizing diversity of both macroorganisms and microorganisms. Phylogenetic methods have recently been applied to shotgun metagenomic data from microbial communities, particularly with the aim of classifying reads. But the accuracy of gene-family phylogenies that characterize evolutionary relationships among short, non-overlapping sequencing reads has not been thoroughly evaluated. To quantify errors in metagenomic read trees, we developed MetaPASSAGE, a software pipeline to generate in silico bacterial communities, simulate a sample of shotgun reads from a gene family represented in the community, orient or translate reads, and produce a profile-based alignment of the reads from which a gene-family phylogenetic tree can be built. We applied MetaPASSAGE to a variety of RNA and protein-coding gene families, built trees using a range of different phylogenetic methods, and compared the resulting trees using topological and branch-length error metrics. We identified read length as one of the major sources of error. Because phylogenetic methods use a reference database of full-length sequences from the gene family to guide construction of alignments and trees, we found that error can also be substantially reduced through increasing the size and diversity of the reference database. Finally, UniFrac analysis, which compares metagenomic samples based on a summary statistic computed over all branches in a read tree, is very robust to the level of error we observe. Bacterial community diversity can be quantified using phylogenetic approaches applied to shotgun metagenomic data. As sequencing reads get longer and more genomes across the bacterial tree of life are sequenced, the accuracy of this approach will continue to improve, opening the door to more applications.
    BMC Genomics 06/2013; 14(1):419. DOI:10.1186/1471-2164-14-419 · 3.99 Impact Factor