Extrapolation of urn models via poissonization: accurate measurements of the microbial unknown.

Department of Applied Mathematics, University of Colorado, Boulder, Colorado, United States of America.
PLoS ONE (Impact Factor: 3.53). 06/2011; 6(6):e21105. DOI: 10.1371/journal.pone.0021105
Source: PubMed

ABSTRACT: The availability of high-throughput parallel methods for sequencing microbial communities is increasing our knowledge of the microbial world at an unprecedented rate. Though most attention has focused on determining lower bounds on the α-diversity, i.e., the total number of different species present in the environment, tight bounds on this quantity may be highly uncertain because a small fraction of the environment could be composed of a vast number of different species. To better assess what remains unknown, we propose instead to predict the fraction of the environment that belongs to unsampled classes. Modeling samples as draws with replacement of colored balls from an urn with an unknown composition, and under the sole assumption that there are still undiscovered species, we show that conditionally unbiased predictors and exact prediction intervals (of constant length in logarithmic scale) are possible for the fraction of the environment that belongs to unsampled classes. Our predictions are based on a poissonization argument, which we have implemented in what we call the Embedding algorithm. For fixed (i.e., non-randomized) sample sizes, the algorithm leads to very accurate predictions on a sub-sample of the original sample. We quantify the effect of fixed sample sizes on our prediction intervals and test our methods and others found in the literature against simulated environments, which we devise taking into account datasets from human gut and hand microbiota. Our methodology applies to any dataset that can be conceptualized as a sample with replacement from an urn. For example, it could be applied to quantify the proportion of all the unseen solutions to a binding site problem in a random RNA pool, or to reassess the surveillance of a certain terrorist group, predicting the conditional probability that it deploys a new tactic in its next attack.
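
The urn setting described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's Embedding algorithm: it only simulates draws with replacement, computes the true fraction of the urn that belongs to unsampled colors, and compares it with the classical Good-Turing estimate (singletons divided by sample size), which is swapped in here as a simple baseline. The function names, the toy urn composition, and the way the Poisson sample size is generated are all assumptions made for illustration.

```python
import random
from collections import Counter


def sample_urn(proportions, n, rng):
    """Draw n balls with replacement; `proportions` maps color -> probability."""
    colors = list(proportions)
    weights = [proportions[c] for c in colors]
    return rng.choices(colors, weights=weights, k=n)


def unseen_fraction(proportions, sample):
    """True fraction of the urn made up of colors absent from the sample."""
    seen = set(sample)
    return sum(p for c, p in proportions.items() if c not in seen)


def good_turing_unseen(sample):
    """Classical Good-Turing baseline: number of singleton colors / sample size."""
    counts = Counter(sample)
    singletons = sum(1 for k in counts.values() if k == 1)
    return singletons / len(sample)


def poisson_size(mean, rng):
    """Poisson(mean) draw: count rate-1 exponential arrivals in [0, mean].

    Randomizing the sample size this way is the basic idea behind
    poissonization; how the paper uses it is not reproduced here.
    """
    t, k = rng.expovariate(1.0), 0
    while t <= mean:
        k += 1
        t += rng.expovariate(1.0)
    return k


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical urn: 5 common colors (90% of the mass) plus 1000 rare ones.
    urn = {f"common{i}": 0.18 for i in range(5)}
    urn.update({f"rare{i}": 0.0001 for i in range(1000)})
    n = poisson_size(500, rng)  # randomized sample size
    draws = sample_urn(urn, n, rng)
    print("sample size:", n)
    print("true unseen fraction:", round(unseen_fraction(urn, draws), 4))
    print("Good-Turing estimate:", round(good_turing_unseen(draws), 4))
```

Running the script prints the randomized sample size together with the true and estimated unseen fractions for one simulated urn. The point of the comparison is that the unseen fraction can be predicted from the sample alone, even though the number of unseen species cannot.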


Available from: Manuel E. Lladser, Apr 16, 2015
  • ABSTRACT: Limited studies have examined the intestinal microbiota composition in relation to changes in disease course of IBD over time. We aimed to study prospectively the fecal microbiota in IBD patients developing an exacerbation during follow-up. Fecal samples from 10 Crohn's disease (CD) and 9 ulcerative colitis (UC) patients during remission and subsequent exacerbation were included. Active disease was determined by colonoscopy and/or fecal calprotectin levels. Exclusion criteria were pregnancy, antibiotic use, enema use and/or medication changes between consecutive samples. The microbial composition was assessed by 16S rDNA pyrosequencing. After quality control, 6,194-11,030 sequences per sample were available for analysis. Patient-specific shifts in bacterial composition and diversity were observed during exacerbation compared to remission, but overarching shifts within UC or CD were not observed. Changes in the bacterial community composition between remission and exacerbation, as assessed by Bray-Curtis dissimilarity, were significantly larger in CD versus UC patients (0.59 vs. 0.42, respectively; p = 0.025). Thiopurine use was found to be a significant cause of clustering as shown by Principal Coordinate Analysis and was associated with decreases in bacterial richness (Chao1 501.2 vs. 847.6 in non-users; p<0.001) and diversity (Shannon index: 5.13 vs. 6.78, respectively; p<0.01). Shifts in microbial composition in IBD patients with changing disease activity over time seem to be patient-specific, and are more pronounced in CD than in UC patients. Furthermore, thiopurine use was found to be associated with the microbial composition and diversity, and should be considered when studying the intestinal microbiota in relation to disease course.
    PLoS ONE 03/2014; 9(3):e90981. DOI:10.1371/journal.pone.0090981 · 3.53 Impact Factor
  • ABSTRACT: We wish to estimate the total number of classes in a population. The classical approach assumes that each class independently contributes a Poisson number of representatives to the sample according to its sampling intensity; these intensities follow a stochastic abundance distribution. In this paper we present what we believe to be the first parametric departure from the mixed Poisson framework. We draw on probability theory that characterizes distributions on the integers by the ratios of their consecutive probabilities. Based on these distributions we construct a nonlinear regression model for the ratios of consecutive frequency counts; this allows us to predict the unobserved count and hence to estimate the total diversity. We find that this approach results in realistic estimates with good fits to data and reasonable standard errors, and it is geometrically intuitive. The method is especially well-suited to the high-diversity setting typical of modern microbial datasets derived from next-generation sequencing. We demonstrate its performance in low, medium and high diversity contexts, and via simulation. Finally, we present a dataset for which our method outperforms all competitors.
  • ABSTRACT: Vaginally administered antiviral agents may reduce the risk of HIV and HSV acquisition. Delivery of these drugs using intravaginal rings (IVRs) holds the potential benefits of improving adherence and decreasing systemic exposure, while maintaining steady-state drug levels in the vaginal tract. Elucidating how IVRs interact with the vaginal microbiome constitutes a critical step in evaluating the safety of these devices, as shifts in the vaginal microbiome have been linked with several disease states. To date, clinical IVR trials have relied on culture-dependent methods that omit the high diversity of unculturable microbial populations. Longitudinal, culture-independent characterization of the microbiota in vaginal samples from 6 women with recurrent genital HSV who used an acyclovir IVR was carried out and compared to the communities developing in biofilms on the IVR surface. The analysis utilized Illumina MiSeq sequence datasets generated from bar-coded amplicons of 16S rRNA gene fragments. Specific taxa in the vaginal communities of the study participants were found to be associated with the duration of recurrent genital HSV status and the number of HSV outbreaks. Taxonomic comparison of the vaginal and IVR biofilm communities did not reveal any significant differences, suggesting that the IVRs were not systematically enriched with members of the vaginal microbiome. Device usage did not alter the participants' vaginal microbial communities, within the confines of the current study design. Rigorous, molecular analysis of the effects of intravaginal devices on the corresponding microbial communities shows promise for integration with traditional approaches in the clinical evaluation of candidate products.
    Antiviral Research 12/2013; DOI:10.1016/j.antiviral.2013.12.004 · 3.43 Impact Factor