ABSTRACT: Stratified sulfurous lakes are appropriate environments for studying the links between composition and functionality in microbial communities and are potentially modern analogs of anoxic conditions prevailing in the ancient ocean. We explored these aspects in the Lake Banyoles karstic area (NE Spain) through metagenomics and in silico reconstruction of carbon, nitrogen and sulfur metabolic pathways that were tightly coupled through a few bacterial groups. The potential for nitrogen fixation and denitrification was detected in both autotrophs and heterotrophs, with a major role for nitrogen and carbon fixations in Chlorobiaceae. Campylobacterales accounted for a large percentage of denitrification genes, while Gallionellales were putatively involved in denitrification, iron oxidation and carbon fixation and may have a major role in the biogeochemistry of the iron cycle. Bacteroidales were also abundant and showed potential for dissimilatory nitrate reduction to ammonium. The very low abundance of genes for nitrification, the minor presence of anammox genes, the high potential for nitrogen fixation and mineralization and the potential for chemotrophic CO2 fixation and CO oxidation all provide potential clues to the functioning of the anoxic zones. We observed a higher gene abundance of ammonia-oxidizing bacteria than of ammonia-oxidizing archaea, which may have a geochemical and evolutionary link related to the dominance of Fe in these environments. Overall, these results offer a more detailed perspective on the microbial ecology of anoxic environments and may help to develop new geochemical proxies to infer biology and chemistry interactions in ancient ecosystems.
ABSTRACT: Bacterial community composition and functional potential change subtly across gradients in the surface ocean. In contrast, while there are significant phylogenetic divergences between communities from freshwater and marine habitats, the mechanisms underlying this phylogenetic structuring remain unknown. We hypothesized that the functional potential of natural bacterial communities is linked to this striking divide between microbiomes. To test this hypothesis, metagenomic sequencing of microbial communities along a 1,800 km transect in the Baltic Sea area, encompassing a continuous natural salinity gradient from limnic to fully marine conditions, was explored. Multivariate statistical analyses showed that salinity is the main determinant of dramatic changes in microbial community composition, but also of large-scale changes in core metabolic functions of bacteria. Strikingly, genetically and metabolically different pathways for key metabolic processes, such as respiration, biosynthesis of quinones and isoprenoids, glycolysis and osmolyte transport, were differentially abundant at high and low salinities. These shifts in functional capacities were observed at multiple taxonomic levels and within dominant bacterial phyla, while bacteria, such as SAR11, were able to adapt to the entire salinity gradient. We propose that the large differences in central metabolism required at high and low salinities dictate the striking divide between freshwater and marine microbiomes, and that the ability to inhabit different salinity regimes evolved early during bacterial phylogenetic differentiation. These findings significantly advance our understanding of microbial distributions and stress the need to incorporate salinity in future climate change models that predict increased levels of precipitation and a reduction in salinity.
ABSTRACT: Understanding the microbial content of the air has important scientific, health, and economic implications. While studies have primarily characterized the taxonomic content of air samples by sequencing the 16S or 18S ribosomal RNA gene, direct analysis of the genomic content of airborne microorganisms has not been possible due to the extremely low density of biological material in airborne environments. We developed sampling and amplification methods to enable adequate DNA recovery to allow metagenomic profiling of air samples collected from indoor and outdoor environments. Air samples were collected from a large urban building, a medical center, a house, and a pier. Analyses of metagenomic data generated from these samples reveal airborne communities with a high degree of diversity and different genera abundance profiles. The identities of many of the taxonomic groups and protein families also allow for the identification of the likely sources of the sampled airborne bacteria.
ABSTRACT: The Proteomics Standard Initiative Common QUery InterfaCe (PSICQUIC) specification was created by the Human Proteome Organization Proteomics Standards Initiative (HUPO-PSI) to enable computational access to molecular-interaction data resources by means of a standard Web Service and query language. Currently providing >150 million binary interaction evidences from 28 servers globally, the PSICQUIC interface allows the concurrent search of multiple molecular-interaction information resources using a single query. Here, we present an extension of the PSICQUIC specification (version 1.3), which has been released to be compliant with the enhanced standards in molecular interactions. The new release also includes a new reference implementation of the PSICQUIC server available to the data providers. It offers augmented web service capabilities and improves the user experience. PSICQUIC has been running for almost 5 years, with a user base growing from only 4 data providers to 28 (April 2013) allowing access to 151 310 109 binary interactions. The power of this web service is shown in the PSICQUIC View web application, an example of how to simultaneously query, browse and download results from the different PSICQUIC servers. This application is free and open to all users with no login requirement (http://www.ebi.ac.uk/Tools/webservices/psicquic/view/main.xhtml).
Full-text · Article · May 2013 · Nucleic Acids Research
ABSTRACT: A variety of microbial communities and their genes (the microbiome) exist throughout the human body, with fundamental roles in human health and disease. The National Institutes of Health (NIH)-funded Human Microbiome Project Consortium has established a population-scale framework to develop metagenomic protocols, resulting in a broad range of quality-controlled resources and data including standardized methods for creating, processing and interpreting distinct types of high-throughput metagenomic data available to the scientific community. Here we present resources from a population of 242 healthy adults sampled at 15 or 18 body sites up to three times, which have generated 5,177 microbial taxonomic profiles from 16S ribosomal RNA genes and over 3.5 terabases of metagenomic sequence so far. In parallel, approximately 800 reference strains isolated from the human body have been sequenced. Collectively, these data represent the largest resource describing the abundance and variety of the human microbiome, while providing a framework for current and future studies.
ABSTRACT: Gold standard abundances of KEGG gene families, pathways, and modules for four synthetic metagenomes. A ground truth of genes, small functional modules, and large pathways in each of four synthetic metagenomes was calculated from the organismal compositions in Supplemental Table S2 as follows. The abundances of gene families were computed by multiplying the frequency of each KEGG Orthology gene family in organisms' reference genomes by the organism's relative abundance (even or staggered) in each community. Module presence/absence and abundance were determined based on conjunctive normal form satisfaction of constituent gene presence/absence and relative copy number, respectively. Pathway presence/absence was used as defined by KEGG, with abundance multiplied by each organism's relative abundance within the communities.
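The gold-standard calculation described above can be illustrated with a minimal sketch. It assumes per-organism KEGG Orthology copy counts and relative abundances are available as plain dictionaries; the function names, the data layout, and the example KO identifiers are hypothetical and are not taken from the supplementary material. Module presence is shown as conjunctive normal form satisfaction: every clause (a set of alternative genes) must contain at least one present gene.

```python
from collections import defaultdict


def gene_family_abundance(ko_counts, rel_abundance):
    """Sum, over organisms, of (KO copies in the genome) x (organism relative abundance).

    ko_counts:     {organism: {ko_id: copies in reference genome}}  (assumed layout)
    rel_abundance: {organism: relative abundance in the community}
    """
    totals = defaultdict(float)
    for organism, fraction in rel_abundance.items():
        for ko, copies in ko_counts.get(organism, {}).items():
            totals[ko] += copies * fraction
    return dict(totals)


def module_present(cnf, present_kos):
    """Return True if every clause of the module has at least one present KO.

    cnf: list of clauses; each clause is a collection of alternative KOs (OR),
         and all clauses must be satisfied (AND).
    """
    return all(any(ko in present_kos for ko in clause) for clause in cnf)


# Hypothetical two-organism community with staggered abundances.
ko_counts = {
    "orgA": {"K00001": 2, "K00002": 1},
    "orgB": {"K00001": 1},
}
rel_abundance = {"orgA": 0.75, "orgB": 0.25}
abundance = gene_family_abundance(ko_counts, rel_abundance)

# Hypothetical module: K00001 AND (K00002 OR K00003).
module_cnf = [["K00001"], ["K00002", "K00003"]]
present = module_present(module_cnf, set(abundance))
```

With these illustrative inputs, a gene family carried by both organisms accumulates a contribution from each, weighted by that organism's share of the community.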
ABSTRACT: HUMAnN relative abundance estimates for metabolic pathways in the human microbiome. Inferred pathway abundance values for each of 297 KEGG pathways in 649 samples spanning seven body sites across the human microbiome.
ABSTRACT: Evaluation of parameter settings and processing modules in the HUMAnN pipeline. Four synthetic metagenomes (high and low complexity, equally and lognormally distributed organismal abundances) were used as a gold standard to evaluate variants of the HUMAnN pipeline for both metabolic modules (Mod.) and full KEGG pathways (Path). The accuracies of relative abundance estimates (left) were evaluated using Pearson correlation with the gold standards (Supplemental Table S3), whereas the coverage (right) was evaluated using partial AUC at 10% false positives in order to specifically prevent erroneously high-confidence false positives. The choices evaluated here include whether to assign genes to modules/pathways using MinPath (MP) or naively, inclusion of taxonomic limitation (Tax) without or with gene copy number correction (TaxC), add-one (Sm) or Witten-Bell (SmWB) abundance smoothing, and biological gap filling using pathway medians (GF) or averages (GFAve). The final HUMAnN pipeline that achieved the best overall performance is highlighted in bold and consists of the sequential execution of MinPath, taxonomic limitation with copy number correction, and median-based gap filling.
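Two of the ingredients named above, add-one smoothing and Pearson correlation against a gold standard, can be sketched generically as follows. This is a textbook illustration under stated assumptions, not the HUMAnN implementation: the function names and the toy counts are hypothetical, and the pipeline's actual smoothing and evaluation code may differ in detail.

```python
import math


def add_one_smooth(counts):
    """Add-one (Laplace) smoothing: add 1 to every count, then normalize to sum to 1."""
    total = sum(counts.values()) + len(counts)
    return {key: (value + 1) / total for key, value in counts.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical raw counts for three pathways; one pathway was not observed.
estimated = add_one_smooth({"pathA": 0, "pathB": 1, "pathC": 2})

# Hypothetical gold-standard relative abundances for the same pathways.
gold = {"pathA": 0.1, "pathB": 0.3, "pathC": 0.6}

keys = sorted(gold)
accuracy = pearson([estimated[k] for k in keys], [gold[k] for k in keys])
```

Smoothing keeps unobserved pathways from receiving an abundance of exactly zero, and the correlation summarizes how closely the estimated profile tracks the known one.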
ABSTRACT: HUMAnN coverage estimates for functional modules in the human microbiome. Inferred module presence/absence confidence values for each of 227 KEGG modules in 649 samples spanning seven body sites across the human microbiome.
ABSTRACT: As metagenomic studies continue to increase in their number, sequence volume and complexity, the scalability of biological analysis frameworks has become a rate-limiting factor to meaningful data interpretation. To address this issue, we have developed JCVI Metagenomics Reports (METAREP) as an open source tool to query, browse, and compare extremely large volumes of metagenomic annotations. Here we present improvements to this software including the implementation of a dynamic weighting of taxonomic and functional annotation, support for distributed searches, advanced clustering routines, and integration of additional annotation input formats. The utility of these improvements to data interpretation is demonstrated through the application of multiple comparative analysis strategies to shotgun metagenomic data produced by the National Institutes of Health Roadmap for Biomedical Research Human Microbiome Project (HMP) (http://nihroadmap.nih.gov). Specifically, the scalability of the dynamic weighting feature is evaluated and established by its application to the analysis of over 400 million weighted gene annotations derived from 14 billion short reads as predicted by the HMP Unified Metabolic Analysis Network (HUMAnN) pipeline. Further, the capacity of METAREP to facilitate the identification and simultaneous comparison of taxonomic and functional annotations including biological pathway and individual enzyme abundances from hundreds of community samples is demonstrated by providing scenarios that describe how these data can be mined to answer biological questions related to the human microbiome. These strategies provide users with a reference of how to conduct similar large-scale metagenomic analyses using METAREP with their own sequence data, while in this study they reveal insights into the nature and extent of variation in taxonomic and functional profiles across body habitats and individuals.
Over one thousand HMP WGS datasets and the latest open source code are available at http://www.jcvi.org/hmp-metarep.
ABSTRACT: HUMAnN relative abundance estimates for functional modules in the human microbiome. Inferred module abundance values for each of 250 KEGG modules in 649 samples spanning seven body sites across the human microbiome.
ABSTRACT: Performance of HUMAnN abundance and coverage inference as compared to a best-BLAST-hit (BBH) approach. The HUMAnN pipeline for KEGG metabolic modules and pathways was compared to a best-BLAST-hit approach to module reconstruction. Evaluations of abundance (left columns, by Pearson correlation) and coverage (right columns, by partial AUC at 10% false positives) are summarized in the first row. Subsequent rows show full scatterplots with regressions (left, abundances) and ROC curves (right, coverages) for each individual method over the entire synthetic community gold standards. These include A) high-complexity (100 organisms) staggered abundances, B) low-complexity (20 organisms) staggered, C) high-complexity even abundances, and D) low-complexity even.
ABSTRACT: Hierarchical cluster plot of 84 first and second visit sample pairs clustered by NCBI taxonomy. Hierarchical clustering analysis of human microbiome samples with first and second visits (n = 168) taken from 15 human body habitats clustered by NCBI taxonomy at the Family level. Clusters were generated by the average linkage clustering method using the Morisita-Horn index to generate a distance matrix (shown on the x-axis). Dataset labels encode the following information: [donor ID]-[habitat]-[gender]-[time point]-[sample ID]-[annotation-type].
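As an illustration of the distance measure mentioned above, the following minimal sketch computes the Morisita-Horn index in its standard form and builds a pairwise distance matrix (1 minus similarity). The sample names and family-level counts are hypothetical, and this is not METAREP's own code; the resulting matrix is the kind of input an average linkage clustering routine would take.

```python
def morisita_horn_similarity(x, y):
    """Morisita-Horn similarity between two abundance vectors over the same taxa.

    C = 2 * sum(x_i * y_i) / ((sum(x_i^2) / X^2 + sum(y_i^2) / Y^2) * X * Y),
    where X and Y are the total counts of each sample.
    """
    total_x = sum(x)
    total_y = sum(y)
    cross = sum(a * b for a, b in zip(x, y))
    d_x = sum(a * a for a in x) / (total_x ** 2)
    d_y = sum(b * b for b in y) / (total_y ** 2)
    return 2 * cross / ((d_x + d_y) * total_x * total_y)


def distance_matrix(samples):
    """Pairwise Morisita-Horn distances (1 - similarity) keyed by sample-name pair."""
    names = sorted(samples)
    return {
        (a, b): 1.0 - morisita_horn_similarity(samples[a], samples[b])
        for a in names
        for b in names
    }


# Hypothetical family-level counts for three samples (same taxon order in each).
samples = {
    "visit1": [1, 2, 3],
    "visit2": [2, 4, 6],   # same proportions as visit1, at twice the depth
    "other": [10, 0, 0],
}
distances = distance_matrix(samples)
```

Because the index depends on proportions rather than sequencing depth, two samples with identical relative composition have a distance of zero regardless of their total counts.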
ABSTRACT: Studies of the human microbiome have revealed that even healthy individuals differ remarkably in the microbes that occupy habitats such as the gut, skin and vagina. Much of this diversity remains unexplained, although diet, environment, host genetics and early microbial exposure have all been implicated. Accordingly, to characterize the ecology of human-associated microbial communities, the Human Microbiome Project has analysed the largest cohort and set of distinct, clinically relevant body habitats so far. We found the diversity and abundance of each habitat's signature microbes to vary widely even among healthy subjects, with strong niche specialization both within and among individuals. The project encountered an estimated 81-99% of the genera, enzyme families and community configurations occupied by the healthy Western microbiome. Metagenomic carriage of metabolic pathways was stable among individuals despite variation in community structure, and ethnic/racial background proved to be one of the strongest associations of both pathways and microbes with clinical metadata. These results thus delineate the range of structural and functional configurations normal in the microbial communities of a healthy population, enabling future characterization of the epidemiology, ecology and translational applications of the human microbiome.
ABSTRACT: Evaluation of individual HUMAnN processing modules on 10 synthetic metagenomes. An additional 10 synthetic metagenomes were generated with high-complexity (100 organisms) and random lognormally distributed abundances. These were searched against KEGG protein sequences using USEARCH, allowing multiple hits with maximum e-value 1. All combinations of select HUMAnN modules were then assessed, including best-BLAST-hit versus multiple hits weighted by p-value and the presence or absence of taxonomic limitation with or without copy number normalization. HUMAnN default settings are highlighted in gray. Processing steps recapitulated their behavior as observed in Supplemental Figure S1.