Hypervariable loci in the human gut virome

Department of Microbiology, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA.
Proceedings of the National Academy of Sciences (Impact Factor: 9.67). 03/2012; 109(10):3962-6. DOI: 10.1073/pnas.1119061109
Source: PubMed


Genetic variation is critical in microbial immune evasion and drug resistance, but variation has rarely been studied in complex heterogeneous communities such as the human microbiome. To begin to study natural variation, we analyzed DNA viruses present in the lower gastrointestinal tract of 12 human volunteers by determining 48 billion bases of viral DNA sequence. Viral genomes mostly showed low variation, but 51 loci of ∼100 bp showed extremely high variation, so that up to 96% of the viral genomes encoded unique amino acid sequences. Some hotspots of hypervariation were in genes homologous to the bacteriophage BPP-1 viral tail-fiber gene, which is known to be hypermutagenized by a unique reverse-transcriptase (RT)-based mechanism. Unexpectedly, other hypervariable loci in our data were in previously undescribed gene types, including genes encoding predicted Ig-superfamily proteins. Most of the hypervariable loci were linked to genes encoding RTs of a single clade, which we find is the most abundant clade among gut viruses but only a minor component of bacterial RT populations. Hypervariation was targeted to 5'-AAY-3' asparagine codons, which allows maximal chemical diversification of the encoded amino acids while avoiding formation of stop codons. These findings document widespread targeted hypervariation in the human gut virome, identify previously undescribed types of genes targeted for hypervariation, clarify association with RT gene clades, and motivate studies of hypervariation in the full human microbiome.
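The claim about 5'-AAY-3' codons can be checked with a short enumeration. The sketch below is illustrative and not part of the study's analysis. It assumes that substitutions occur only at adenine positions, as described for the BPP-1 RT-based system; the abstract itself does not spell this out. The function name `reachable_amino_acids` is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, in TCAG order
# (TTT, TTC, TTA, TTG, TCT, ...); "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reachable_amino_acids(codon):
    """Return the set of translations reachable from `codon` when every
    adenine may be replaced by any base (assumed adenine-only mutagenesis)
    and all other positions stay fixed."""
    choices = [BASES if base == "A" else base for base in codon]
    return {CODON_TABLE["".join(variant)] for variant in product(*choices)}


if __name__ == "__main__":
    for codon in ("AAC", "AAT", "AAA", "AAG"):
        products = reachable_amino_acids(codon)
        print(codon, len(products - {"*"}), "amino acids,",
              "stop possible" if "*" in products else "no stop")
```

Under this assumption, both AAC and AAT reach the same 15 amino acids and never a stop codon, because all three standard stop codons (TAA, TAG, TGA) end in a purine. The lysine codons AAA and AAG, by contrast, can mutate into stop codons.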

Available from: Samuel Minot, Aug 20, 2014
  • Source
    • "human gut viromes, but contained two genes associated with plasmid replication (Fig. 1 – Figure supplement 2A). This could suggest a plasmid origin, but the high and even coverage of these genomes across several CsCl-purified viromes from different studies (Kim et al., 2011; Minot et al., 2012) suggests that they are derived from encapsidated particles typical of viruses (Fig. 1 – Figure supplement 2B). If confirmed, these sequences would represent the first complete genomes for an entirely new viral order."
    ABSTRACT: The ecological importance of viruses is now widely recognized, yet our limited knowledge of viral sequence space and virus–host interactions precludes accurate prediction of their roles and impacts. In this study, we mined publicly available bacterial and archaeal genomic data sets to identify 12,498 high-confidence viral genomes linked to their microbial hosts. These data augment public data sets 10-fold, provide first viral sequences for 13 new bacterial phyla including ecologically abundant phyla, and help taxonomically identify 7–38% of ‘unknown’ sequence space in viromes. Genome- and network-based classification was largely consistent with accepted viral taxonomy and suggested that (i) 264 new viral genera were identified (doubling known genera) and (ii) cross-taxon genomic recombination is limited. Further analyses provided empirical data on extrachromosomal prophages and coinfection prevalences, as well as evaluation of in silico virus–host linkage predictions. Together these findings illustrate the value of mining viral signal from microbial genomes.
    Full-text · Article · Jul 2015 · eLife Sciences
  • Source
    • "Much of the study of the human microbiome has concentrated on those indigenous bacterial communities inhabiting different body surfaces [1-4], but relatively little effort has been focused on viruses [5-9]. Recent studies have identified communities of viruses inhabiting the human oral cavity [10,11], the respiratory tract [8], skin [12], and the intestinal tract [5,7,13]. While the role of viruses in these communities has yet to be thoroughly examined, a common feature shared among these body surfaces has been that most of the viruses identified have been bacteriophage [5-7,11,14]. "
    ABSTRACT: Background Dental plaque is home to a diverse and complex community of bacteria, but has generally been believed to be inhabited by relatively few viruses. We sampled the saliva and dental plaque from 4 healthy human subjects to determine whether plaque was populated by viral communities, and whether there were differences in viral communities specific to subject or sample type. Results We found that the plaque was inhabited by a community of bacteriophage whose membership was mostly subject-specific. There was a significant proportion of viral homologues shared between plaque and salivary viromes within each subject, suggesting that some oral viruses were present in both sites. We also characterized Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) in oral streptococci, as their profiles provide clues to the viruses that oral bacteria may be able to counteract. While there were some CRISPR spacers specific to each sample type, many more were shared across sites and were highly subject specific. Many CRISPR spacers matched viruses present in plaque, suggesting that the evolution of CRISPR loci may have been specific to plaque-derived viruses. Conclusions Our findings of subject specificity to both plaque-derived viruses and CRISPR profiles suggest that human viral ecology may be highly personalized.
    Full-text · Article · Jun 2014 · BMC Microbiology
  • Source
    • "Additional contributions of metagenomics to clinical virology include the identification of a bunyavirus in patients with thrombocytopenia and leukopenia syndrome [13]. Characterizations of the human gut virome have revealed the diversity and predominance of bacteriophages in this environment [14], [15]. Recently, NGS was employed to determine the complexity of the human virome in febrile children with acute diarrhea [16], [17]. "
    ABSTRACT: We conducted an unbiased metagenomics survey using plasma from patients with chronic hepatitis B, chronic hepatitis C, autoimmune hepatitis (AIH), non-alcoholic steatohepatitis (NASH), and patients without liver disease (control). RNA and DNA libraries were sequenced from plasma filtrates enriched in viral particles to catalog virus populations. Hepatitis viruses were readily detected at high coverage in patients with chronic viral hepatitis B and C, but only a limited number of sequences resembling other viruses were found. The exception was a library from a patient diagnosed with hepatitis C virus (HCV) infection that contained multiple sequences matching GB virus C (GBV-C). Abundant GBV-C reads were also found in plasma from patients with AIH, whereas Torque teno virus (TTV) was found at high frequency in samples from patients with AIH and NASH. After taxonomic classification of sequences by BLASTn, a substantial fraction in each library, ranging from 35% to 76%, remained unclassified. These unknown sequences were assembled into scaffolds along with virus, phage and endogenous retrovirus sequences and then analyzed by BLASTx against the non-redundant protein database. Nearly the full genome of a heretofore-unknown circovirus was assembled, along with many scaffolds that encoded proteins with similarity to plant, insect and mammalian viruses. The presence of this novel circovirus was confirmed by PCR. BLASTx also identified many polypeptides resembling nucleo-cytoplasmic large DNA viruses (NCLDV) proteins. We re-evaluated these alignments with a profile hidden Markov method, HHblits, and observed inconsistencies in the target proteins reported by the different algorithms. This suggests that sequence alignments are insufficient to identify NCLDV proteins, especially when these alignments are only to small portions of the target protein. Nevertheless, we have now established a reliable protocol for the identification of viruses in plasma that can also be adapted to other patient samples such as urine, bile, saliva and other body fluids.
    Full-text · Article · Apr 2013 · PLoS ONE