Supervised classification of human microbiota

Department of Computer Science, University of Colorado, Boulder, CO, USA.
FEMS Microbiology Reviews (Impact Factor: 13.24). 09/2010; 35(2):343-59. DOI: 10.1111/j.1574-6976.2010.00251.x
Source: PubMed

ABSTRACT: Recent advances in DNA sequencing technology have allowed the collection of high-dimensional data from human-associated microbial communities on an unprecedented scale. A major goal of these studies is the identification of important groups of microorganisms that vary according to physiological or disease states in the host, but the incidence of rare taxa and the large numbers of taxa observed make that goal difficult to attain using traditional approaches. Fortunately, similar problems have been addressed by the machine learning community in other fields of study such as microarray analysis and text classification. In this review, we demonstrate that several existing supervised classifiers can be applied effectively to microbiota classification, both for selecting subsets of taxa that are highly discriminative of the type of community, and for building models that can accurately classify unlabeled data. To encourage the development of new approaches to supervised classification of microbiota, we discuss several structures inherent in microbial community data that may be available for exploitation in novel approaches, and we include as supplemental information several benchmark classification tasks for use by the community.
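
The two uses of supervised classification named in the abstract (selecting discriminative taxa, then classifying unlabeled samples) can be illustrated with a minimal sketch. It is not the authors' method or any of the classifiers evaluated in the review: it uses a plain nearest-centroid rule chosen only for brevity, and the taxon names, counts, and class labels are hypothetical toy data.

```python
from math import sqrt


def to_relative_abundance(counts):
    """Convert one sample's raw taxon counts to proportions that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def class_centroids(samples, labels):
    """Return the mean relative-abundance vector for each class label."""
    grouped = {}
    for sample, label in zip(samples, labels):
        grouped.setdefault(label, []).append(sample)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in grouped.items()
    }


def rank_taxa(centroids):
    """Rank taxon indices by the spread of class means, largest first."""
    n_taxa = len(next(iter(centroids.values())))
    spread = [
        max(c[j] for c in centroids.values()) - min(c[j] for c in centroids.values())
        for j in range(n_taxa)
    ]
    return sorted(range(n_taxa), key=lambda j: spread[j], reverse=True)


def predict(sample, centroids, taxa):
    """Assign the label of the nearest centroid, using only the selected taxa."""
    def distance(label):
        return sqrt(sum((sample[j] - centroids[label][j]) ** 2 for j in taxa))
    return min(centroids, key=distance)


# Hypothetical toy table: rows are samples, columns are taxa (assumed names).
TAXA = ["taxon_A", "taxon_B", "taxon_C", "taxon_D"]
raw_counts = [
    [80, 10, 5, 5],
    [70, 15, 10, 5],
    [75, 12, 8, 5],
    [10, 70, 15, 5],
    [15, 65, 12, 8],
    [12, 75, 8, 5],
]
labels = ["healthy"] * 3 + ["diseased"] * 3  # assumed host states

samples = [to_relative_abundance(row) for row in raw_counts]
centroids = class_centroids(samples, labels)

# Step 1: keep the two taxa whose class means differ most.
top_taxa = rank_taxa(centroids)[:2]

# Step 2: classify a new, unlabeled sample using only those taxa.
unlabeled = to_relative_abundance([60, 20, 15, 5])
predicted = predict(unlabeled, centroids, top_taxa)

print([TAXA[j] for j in top_taxa], predicted)
```

On real data the number of taxa far exceeds the number of samples, so both the taxon selection and the accuracy estimate would need to be done inside cross-validation to avoid an optimistic result; this sketch omits that step.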

  • Source
    • "munities. Note that these counts of shared microbial OTUs are sensitive to sampling effort; more extensive sampling of water and invertebrate microbiota would presumably reveal additional microbes and might therefore increase these estimates of overlap with the fish gut microbiota. We then used Bayesian community-level source tracking (Knights et al., 2011) to estimate how much of the stickleback gut microbiota is from water, or invertebrate prey sources (after filtering OTUs found in fewer than 1% of all samples): on average 12.6% of the fish gut microbiota was from water sources, 73.3% from prey sources and 14.1% unknown (<0.05% of all samples came from presumed human gut"
    ABSTRACT: To explain differences in gut microbial communities we must determine how processes regulating microbial community assembly (colonization, persistence) differ among hosts and affect microbiota composition. We surveyed the gut microbiota of threespine stickleback (Gasterosteus aculeatus) from 10 geographically clustered populations and sequenced environmental samples to track potential colonizing microbes and quantify the effects of host environment and genotype. Gut microbiota composition and diversity varied among populations. These among-population differences were associated with multiple covarying ecological variables: habitat type (lake, stream, estuary), lake geomorphology and food- (but not water-) associated microbiota. Fish genotype also covaried with gut microbiota composition; more genetically divergent populations exhibited more divergent gut microbiota. Our results suggest that population level differences in stickleback gut microbiota may depend more on internal sorting processes (host genotype) than on colonization processes (transient environmental effects).
    The ISME Journal 04/2015; DOI:10.1038/ismej.2015.64 · 9.30 Impact Factor
  • Source
    • "Comparisons of alpha diversities were based on averages of 1000 rarefactions. Random forests supervised learning classification as implemented in QIIME (Knights et al. 2011) as well as ANOSIM and PERMANOVA tests of Bray-Curtis, weighted UniFrac, and unweighted UniFrac beta diversity metrics were used to compare community level differences between treatments. Individual OTUs that appeared in at least 25% of samples were examined for relative abundance differences between treatments using ANOVA with Bonferroni correction."
    ABSTRACT: The recent development of methods applying next-generation sequencing to microbial community characterization has led to the proliferation of these studies in a wide variety of sample types. Yet, variation in the physical properties of environmental samples demands that optimal DNA extraction techniques be explored for each new environment. The microbiota associated with many species of insects offer an extraction challenge as they are frequently surrounded by an armored exoskeleton, inhibiting disruption of the tissues within. In this study, we examine the efficacy of several commonly used protocols for extracting bacterial DNA from ants. While bacterial community composition recovered using Illumina 16S rRNA amplicon sequencing was not detectably biased by any method, the quantity of bacterial DNA varied drastically, reducing the number of samples that could be amplified and sequenced. These results indicate that the concentration necessary for dependable sequencing is around 10,000 copies of target DNA per microliter. Exoskeletal pulverization and tissue digestion increased the reliability of extractions, suggesting that these steps should be included in any study of insect-associated microorganisms that relies on obtaining microbial DNA from intact body segments. Although laboratory and analysis techniques should be standardized across diverse sample types as much as possible, minimal modifications such as these will increase the number of environments in which bacterial communities can be successfully studied.
    MicrobiologyOpen 09/2014; 3(6). DOI:10.1002/mbo3.216 · 2.21 Impact Factor
  • Source
    • "Supervised classification methods can be used to determine which taxa differ between predefined groups of samples (e.g., diseased versus healthy) and to build models that use these discriminatory taxa to predict the classification of a new sample. Examples of commonly employed supervised classification methods are described in Knights et al. (2011). Unsupervised classification (clustering), on the other hand, does not make use of any prior knowledge about the samples."
    ABSTRACT: Human microbiome research is an actively developing area of inquiry, with ramifications for our lifestyles, our interactions with microbes, and how we treat disease. Advances depend on carefully executed, controlled, and reproducible studies. Here, we provide a Primer for researchers from diverse disciplines interested in conducting microbiome research. We discuss factors to be considered in the design, execution, and data analysis of microbiome studies. These recommendations should help researchers to enter and contribute to this rapidly developing field.
    Cell 07/2014; 158(2):250-262. DOI:10.1016/j.cell.2014.06.037 · 32.24 Impact Factor