Identification and Quantification of Abundant Species from Pyrosequences of 16S rRNA by Consensus Alignment

School of Informatics and Computing, Bloomington, IN 47408, U.S.A.
Proceedings. IEEE International Conference on Bioinformatics and Biomedicine 02/2011; 2010:153-157. DOI: 10.1109/BIBM.2010.5706555
Source: PubMed


16S rRNA gene profiling has recently been boosted by the development of pyrosequencing methods. A common analysis is to group pyrosequences into Operational Taxonomic Units (OTUs), such that reads in an OTU are likely sampled from the same species. However, species diversity estimated from error-prone 16S rRNA pyrosequences may be inflated because reads sampled from the same 16S rRNA gene may appear different, and current OTU inference approaches typically involve time-consuming pairwise/multiple distance calculation and clustering. I propose a novel approach, AbundantOTU, based on a Consensus Alignment (CA) algorithm, which infers consensus sequences, each representing an OTU, taking advantage of the sequence redundancy for abundant species. Pyrosequencing reads can then be recruited to the consensus sequences to give quantitative information for the corresponding species. As tested on 16S rRNA pyrosequence datasets from mock communities with known species, AbundantOTU rapidly reported the identified sequences of the source 16S rRNAs and the abundances of the corresponding species. AbundantOTU was also applied to 16S rRNA pyrosequence datasets derived from real microbial communities, and the results are in general agreement with previous studies.
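
The read-recruitment step mentioned in the abstract can be illustrated with a minimal sketch. This is not the Consensus Alignment algorithm itself, which the abstract does not specify; it only shows one plausible way to assign reads to consensus sequences that are already known and to count abundances. The function names (`edit_distance`, `identity`, `recruit_reads`), the identity measure based on Levenshtein distance, and the default 0.97 threshold are all assumptions made for illustration.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def identity(a: str, b: str) -> float:
    """Assumed identity measure: 1 - edit distance / longer length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def recruit_reads(reads, consensus, min_identity=0.97):
    """Assign each read to its most similar consensus sequence.

    `consensus` maps a hypothetical OTU name to its consensus sequence.
    Reads whose best identity falls below `min_identity` stay unassigned.
    Returns (per-OTU read counts, number of unassigned reads).
    """
    counts = Counter({name: 0 for name in consensus})
    unassigned = 0
    for read in reads:
        best_name, best_id = None, 0.0
        for name, seq in consensus.items():
            score = identity(read, seq)
            if score > best_id:
                best_name, best_id = name, score
        if best_name is not None and best_id >= min_identity:
            counts[best_name] += 1
        else:
            unassigned += 1
    return dict(counts), unassigned
```

The per-OTU counts returned by `recruit_reads` play the role of the quantitative information described above. Global edit distance is quadratic in read length, so a real pipeline would likely use a faster alignment or seeding strategy.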



Available from: Yuzhen Ye
    • "Sequence reads were aligned with our own custom multiple alignment tool known as the Illinois-Mayo Taxon Operations for RNA Dataset Organization (IM-TORNADO) that merges paired end reads into a single multiple alignment and obtains taxa calls [19]. IM-TORNADO then clusters sequences into operational taxonomic units (OTUs) using AbundantOTU+ [20]. Further processing for visualization was performed using QIIME [21]. "
    ABSTRACT: Objective: To assess the vaginal microbiome throughout full-term uncomplicated pregnancy. Methods: Vaginal swabs were obtained from twelve pregnant women at 8-week intervals throughout their uncomplicated pregnancies. Patients with symptoms of vaginal infection or with recent antibiotic use were excluded. Swabs were obtained from the posterior fornix and cervix at 8–12, 17–21, 27–31, and 36–38 weeks of gestation. The microbial community was profiled using hypervariable tag sequencing of the V3–V5 region of the 16S rRNA gene, producing approximately 8 million reads on the Illumina MiSeq. Results: Samples were dominated by a single genus, Lactobacillus, and exhibited low species diversity. For a majority of the patients (n = 8), the vaginal microbiome was dominated by Lactobacillus crispatus throughout pregnancy. Two patients showed Lactobacillus iners dominance during the course of pregnancy, and two showed a shift between the first and second trimester from L. crispatus to L. iners dominance. In all of the samples only these two species were identified, and they were found at an abundance higher than 1% in this study. Comparative analyses also showed that the vaginal microbiome during pregnancy is characterized by a marked dominance of Lactobacillus species in both Caucasian and African-American subjects. In addition, our Caucasian subject population clustered by trimester and progressed towards a common attractor, while African-American women clustered by subject instead and did not progress towards a common attractor. Conclusion: Our analyses indicate that normal pregnancy is characterized by a microbiome that has low diversity and high stability. While Lactobacillus species strongly dominate the vaginal environment during pregnancy across the two studied ethnicities, observed differences between the longitudinal dynamics of the analyzed populations may contribute to divergent risk for pregnancy complications. This helps establish a baseline for investigating the role of the microbiome in complications of pregnancy such as preterm labor and preterm delivery.
    Full-text · Article · Jun 2014 · PLoS ONE
    • "Reads with at least 400 nucleotides (nt) were trimmed and checked for chimerism (Edgar et al., 2011). We obtained consensus OTU clusters and representative sequences using abundant OTU (Ye, 2010). Representative sequences and the OTU table were used for further analysis with the QIIME pipeline as detailed above (Caporaso et al., 2010). "
    ABSTRACT: Glycoside hydrolases (GHs), the enzymes that break down complex carbohydrates, are a highly diversified class of key enzymes associated with the gut microbiota and its metabolic functions. To learn more about the diversity of GHs and their potential role in a variety of gut microbiomes, we used a combination of 16S, metagenomic and targeted amplicon sequencing data to study one of these enzyme families in detail. Specifically, we employed a functional gene-targeted metagenomic approach to the 1-4-α-glucan-branching enzyme (gBE) gene in the gut microbiomes of four host species (human, chicken, cow and pig). The characteristics of operational taxonomic units (OTUs) and operational glucan-branching units (OGBUs) were distinctive in each of the hosts. Human and pig were most similar in OTU profiles while maintaining distinct OGBU profiles. Interestingly, the phylogenetic profiles identified from 16S and gBE gene sequences differed, suggesting the presence of different gBE genes in the same OTU across different vertebrate hosts. Our data suggest that gene-targeted metagenomic analysis is useful for an in-depth understanding of the diversity of a particular gene of interest. Specific carbohydrate metabolic genes appear to be carried by distinct OTUs in different individual hosts and among different vertebrate species' microbiomes, the characteristics of which differ according to host genetic background and/or diet. The ISME Journal advance online publication, 10 October 2013; doi:10.1038/ismej.2013.167.
    Full-text · Article · Oct 2013 · The ISME Journal
    • "Chimeras were detected and removed using the UCHIME algorithm (Edgar et al. 2011) using both reference-based (using the SILVA SSU database; Quast et al. 2013) and de novo chimera detection. Remaining sequences were clustered to 97% similarity using abundant OTU to construct operational taxonomic units (OTUs) (Yuzhen 2010). Student's t-tests were conducted to remove OTUs that were probably matches for contaminant bacteria. "
    ABSTRACT: Background/Question/Methods: Ticks are the most common disease vector in North America. Ticks take blood meals from one or more hosts throughout their life cycle, and conversely hosts can be bitten by many ticks over their lifetime. This provides ticks with many opportunities to become infected with bacteria from infected hosts, and for hosts to become infected with bacteria from ticks. We conducted a field survey and next-generation pyrosequencing of ticks and rodent hosts from two sites during peak tick season in 2011 to compare bacterial community composition between ticks and their hosts. Most previous research has focused on single bacterial taxa, such as Borrelia burgdorferi, but because microbial interactions can influence infection and transmission, investigating the whole bacterial community may provide additional insights. We used next-generation pyrosequencing to analyze the bacterial communities in ticks taken off rodents and in rodent blood samples. A total of 134 tick samples and 95 blood samples were sequenced, and OTU clusters were created at 97% identity and assigned to taxonomic groups using a BLAST comparison to the SILVA SSU non-redundant database. Results/Conclusions: Peromyscus leucopus was the most common rodent host and two tick species were found (Dermacentor variabilis and Ixodes scapularis). The two most abundant OTUs from tick samples most closely corresponded to the Francisella endosymbiont of Dermacentor ticks (56% of tick sequences) and Rickettsia massiliae (23%). Many of these Rickettsia sequences are likely from the Rickettsia endosymbiont of Ixodes ticks. However, there were some co-occurrences of Rickettsia and Francisella, suggesting non-endosymbiont Rickettsia infection in Dermacentor. The most common OTU from rodent blood samples was associated with three Bartonella species (45% of sequences), which have been identified as flea-borne pathogens of mammals. Two taxa were found in both tick and blood samples, Afipia broomeae (0.67% tick sequences, 4.5% blood sequences) and Bartonella (3% of tick sequences), suggesting ticks may also be infected by flea-borne bacteria. Mycoplasma, a directly-transmitted bacterial pathogen, was also found in rodent blood samples. Arsenophonus and Wolbachia sequences were infrequently detected in tick samples. Species in these genera have been found to infect ticks and other arthropods. These results suggest that the dominant bacteria in ticks and host blood samples are different, but that the common flea-borne pathogens, Bartonella and Afipia, co-occur in rodent blood and in ticks.
    No preview · Conference Paper · Aug 2013