Article

Reducing the effects of PCR amplification and sequencing artifacts on 16S rRNA-based studies.

Department of Microbiology & Immunology, University of Michigan, Ann Arbor, Michigan, United States of America.
PLoS ONE (Impact Factor: 3.53). 12/2011; 6(12):e27310. DOI: 10.1371/journal.pone.0027310
Source: PubMed

ABSTRACT The advent of next generation sequencing has coincided with a growth in interest in using these approaches to better understand the role of the structure and function of the microbial communities in human, animal, and environmental health. Yet, use of next generation sequencing to perform 16S rRNA gene sequence surveys has resulted in considerable controversy surrounding the effects of sequencing errors on downstream analyses. We analyzed 2.7×10⁶ reads distributed among 90 identical mock community samples, which were collections of genomic DNA from 21 different species with known 16S rRNA gene sequences; we observed an average error rate of 0.0060. To improve this error rate, we evaluated numerous methods of identifying bad sequence reads, identifying regions within reads of poor quality, and correcting base calls and were able to reduce the overall error rate to 0.0002. Implementation of the PyroNoise algorithm provided the best combination of error rate, sequence length, and number of sequences. Perhaps more problematic than sequencing errors was the presence of chimeras generated during PCR. Because we knew the true sequences within the mock community and the chimeras they could form, we identified 8% of the raw sequence reads as chimeric. After quality filtering the raw sequences and using the Uchime chimera detection program, the overall chimera rate decreased to 1%. The chimeras that could not be detected were largely responsible for the identification of spurious operational taxonomic units (OTUs) and genus-level phylotypes. The number of spurious OTUs and phylotypes increased with sequencing effort, indicating that comparisons of communities should be made using an equal number of sequences. Finally, we applied our improved quality-filtering pipeline to several benchmarking studies and observed that even with our stringent data curation pipeline, biases in the data generation pipeline and batch effects were observed that could potentially confound the interpretation of microbial community data.
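
The abstract's recommendation to compare communities using an equal number of sequences can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the sample names, and the OTU counts below are hypothetical, and the approach shown (random subsampling without replacement down to the smallest sample's read count) is only one common way to equalize sequencing effort.

```python
import random
from collections import Counter


def subsample_counts(otu_counts, depth, seed=0):
    """Draw `depth` reads at random, without replacement, from one sample.

    `otu_counts` maps an OTU label to its read count (hypothetical data layout).
    """
    total = sum(otu_counts.values())
    if depth > total:
        raise ValueError(f"depth {depth} exceeds the {total} reads available")
    # Expand the count table into one label per read so every read is equally likely.
    reads = [otu for otu, n in otu_counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return Counter(rng.sample(reads, depth))


def rarefy_all(samples, seed=0):
    """Subsample every sample down to the read count of the smallest sample."""
    depth = min(sum(counts.values()) for counts in samples.values())
    return {name: subsample_counts(counts, depth, seed)
            for name, counts in samples.items()}


# Hypothetical count tables for two samples sequenced to different depths.
samples = {
    "sample_a": {"otu1": 500, "otu2": 300, "otu3": 5},
    "sample_b": {"otu1": 60, "otu2": 40},
}
rarefied = rarefy_all(samples)
for name, counts in rarefied.items():
    print(name, sum(counts.values()), dict(counts))
```

After subsampling, both samples contain the same number of reads, so differences in the number of observed OTUs are no longer driven by sequencing depth alone. Rare OTUs such as the hypothetical `otu3` may drop out of the deeper sample, which is the expected cost of equalizing effort.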

  • Source
    ABSTRACT: High current densities in microbial electrolysis cells (MECs) result from the predominance of various Geobacter species on the anode, but it is not known if archaeal communities similarly converge to one specific genus. MECs were examined here on the basis of maximum methane production and current density relative to the inoculum community structure. We used anaerobic digester (AD) sludge dominated by acetoclastic Methanosaeta, and an anaerobic bog sediment where hydrogenotrophic methanogens were detected. Inoculation using a solids-to-medium ratio of 25% (w/v) resulted in the highest methane production rates (0.27 mL mL⁻¹ cm⁻², gas volume normalized by liquid volume and cathode projected area) and highest peak current densities (0.5 mA cm⁻²) for the bog sample. Methane production was independent of the solids-to-medium ratio when AD sludge was used as the inoculum. 16S rRNA gene community analysis using pyrosequencing and quantitative PCR confirmed the convergence of Archaea to Methanobacterium and Methanobrevibacter, and of Bacteria to Geobacter, despite their absence in AD sludge. Combined with other studies, these findings suggest that Archaea of the hydrogenotrophic genera Methanobacterium and Methanobrevibacter are the most important microorganisms for methane production in MECs and that their presence in the inoculum improves the performance.
    Frontiers in Microbiology 01/2014; 5:778. · 3.94 Impact Factor
  • Source
    ABSTRACT: New mutations leading to structural variation (SV) in genomes, in the form of mobile element insertions, large deletions, gene duplications, and other chromosomal rearrangements, can play a key role in microbial evolution. Yet, SV is considerably more difficult to predict from short-read genome resequencing data than single-nucleotide substitutions and indels (SN), so it is not yet routinely identified in studies that profile population-level genetic diversity over time in evolution experiments. We implemented an algorithm for detecting polymorphic SV as part of the breseq computational pipeline. This procedure examines split-read alignments, in which the two ends of a single sequencing read match disjoint locations in the reference genome, in order to detect structural variants and estimate their frequencies within a sample. We tested our algorithm using simulated Escherichia coli data and then applied it to 500- and 1000-generation population samples from the Lenski E. coli long-term evolution experiment (LTEE). Knowledge of genes that are targets of selection in the LTEE and mutations present in previously analyzed clonal isolates allowed us to evaluate the accuracy of our procedure. Overall, SV accounted for ~25% of the genetic diversity found in these samples. By profiling rare SV, we were able to identify many cases where alternative mutations in key genes transiently competed within a single population. We also found, unexpectedly, that mutations in two genes that rose to prominence at these early time points always went extinct in the long term. Because it is not limited by the base-calling error rate of the sequencing technology, our approach for identifying rare SV in whole-population samples may have a lower detection limit than similar predictions of SNs in these data sets. We anticipate that this functionality of breseq will be useful for providing a more complete picture of genome dynamics during evolution experiments with haploid microorganisms.
    Frontiers in Genetics 01/2014; 5:468.
  • Source
    ABSTRACT: Receipt of broad-spectrum antibiotics enhances Candida albicans colonization of the GI tract, a risk factor for haematogenously-disseminated candidiasis. To understand how antibiotics influence C. albicans colonization, we treated mice orally with vancomycin or a combination of penicillin, streptomycin, and gentamicin (PSG) and then inoculated them with C. albicans by gavage. Only PSG treatment resulted in sustained, high-level GI colonization with C. albicans. Furthermore, PSG reduced bacterial diversity in the colon much more than vancomycin. Both antibiotic regimens significantly reduced IL-17A, IL-21, IL-22 and IFN-γ mRNA levels in the terminal ileum but had limited effect on the GI fungal microbiome. Through a series of models that employed Bayesian model averaging, we investigated the associations between antibiotic treatment, GI microbiota, and host immune response and their collective impact on C. albicans colonization. Our analysis revealed that bacterial genera were typically associated with either C. albicans colonization or altered cytokine expression but not with both. The only exception was Veillonella, which was associated with both increased C. albicans colonization and reduced IL-21 expression. Overall, antibiotic-induced changes in the bacterial microbiome were much more consistent determinants of C. albicans colonization than either the GI fungal microbiota or the GI immune response.
    Scientific Reports 02/2015; 5:8131. · 5.08 Impact Factor
