Orthologous Gene Clusters and Taxon Signature Genes for Viruses of Prokaryotes

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
Journal of bacteriology (Impact Factor: 2.81). 12/2012; 195(5). DOI: 10.1128/JB.01801-12
Source: PubMed


Viruses are the most abundant biological entities on earth and encompass a vast amount of genetic diversity. The recent rapid increase in the number of sequenced viral genomes has created unprecedented opportunities for gaining new insight into the structure and evolution of the virosphere. Here we present an update of the Phage Orthologous Groups (POGs), a collection of 4,542 clusters of orthologous genes from bacteriophages that now also includes viruses infecting archaea and encompasses more than 1,000 distinct virus genomes. Analysis of this expanded dataset shows that the number of POGs keeps growing without saturation and that a substantial majority of the POGs remain specific to viruses, lacking homologues in prokaryotic cells outside of known proviruses. Thus, the great majority of virus genes apparently remains to be discovered. A complementary observation is that numerous viral genomes remain poorly, if at all, covered by POGs. The genome coverage by POGs is expected to increase as more genomes are sequenced. Taxon-specific, single-copy signature genes that are not observed in prokaryotic genomes outside of detected proviruses were identified for two-thirds of the 57 taxa (those with genomes available from at least 3 distinct viruses), with half of these present in all members of the respective taxon. These signatures can be used to specifically identify the presence and quantify the abundance of viruses from particular taxa in metagenomic samples and thus gain new insights into the ecology and evolution of viruses in relation to their hosts.
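
The signature-gene criteria named in the abstract (taxon-specific, single-copy, taxa with at least 3 genomes, fraction of taxon members carrying the gene) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name, the input layout (one `(pog_id, genome_id)` pair per gene copy) and the `min_genomes` parameter are assumptions made for illustration, and the filter against prokaryotic homologues outside proviruses is not modelled here.

```python
from collections import defaultdict


def find_signature_pogs(hits, genome_taxon, min_genomes=3):
    """Pick candidate signature POGs per taxon (illustrative sketch only).

    hits:         iterable of (pog_id, genome_id) pairs, one pair per gene copy
    genome_taxon: dict mapping genome_id -> taxon name
    Returns a dict: taxon -> list of (pog_id, fraction of taxon genomes with the POG).
    """
    # Count how many copies of each POG occur in each genome.
    copies = defaultdict(lambda: defaultdict(int))
    for pog, genome in hits:
        copies[pog][genome] += 1

    # Group genomes by taxon so coverage can be computed.
    taxon_genomes = defaultdict(set)
    for genome, taxon in genome_taxon.items():
        taxon_genomes[taxon].add(genome)

    signatures = defaultdict(list)
    for pog, per_genome in copies.items():
        taxa = {genome_taxon[g] for g in per_genome}
        if len(taxa) != 1:
            continue  # found in more than one taxon: not taxon-specific
        if any(n > 1 for n in per_genome.values()):
            continue  # more than one copy in some genome: not single-copy
        taxon = next(iter(taxa))
        members = taxon_genomes[taxon]
        if len(members) < min_genomes:
            continue  # taxon has too few sequenced genomes
        signatures[taxon].append((pog, len(per_genome) / len(members)))
    return dict(signatures)
```

A coverage fraction of 1.0 corresponds to a signature gene present in all members of the taxon; lower values mark genes that are specific to the taxon but not universal within it.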



Available from: Alison Waller
  • Source
    • "As a case in point, 39 new genera have been recently proposed within the bacteriophage family Siphoviridae (Adriaenssens et al., 2014). Despite the rapid accumulation of bacteriophage sequences, the diversity of phage genes does not show any signs of saturation, suggestive of a vast phage supergenome that so far has been barely tapped into (Kristensen et al., 2013). In the case of eukaryotes, the diversity of retroelements is not captured by the existing classification of viruses, resulting in a severe underestimate of the true impact of this class of genomic parasites. "
    ABSTRACT: Viruses and other selfish genetic elements are dominant entities in the biosphere, with respect to both physical abundance and genetic diversity. Various selfish elements parasitize all cellular life forms. The relative abundances of different classes of viruses are dramatically different between prokaryotes and eukaryotes. In prokaryotes, the great majority of viruses possess double-stranded (ds) DNA genomes, with a substantial minority of single-stranded (ss) DNA viruses and only limited presence of RNA viruses. In contrast, in eukaryotes, RNA viruses account for the majority of the virome diversity, although ssDNA and dsDNA viruses are common as well. Phylogenomic analysis yields tangible clues for the origins of major classes of eukaryotic viruses and in particular their likely roots in prokaryotes. Specifically, the ancestral genome of positive-strand RNA viruses of eukaryotes might have been assembled de novo from genes derived from prokaryotic retroelements and bacteria, although a primordial origin of this class of viruses cannot be ruled out. Different groups of double-stranded RNA viruses derive either from dsRNA bacteriophages or from positive-strand RNA viruses. The eukaryotic ssDNA viruses apparently evolved via a fusion of genes from prokaryotic rolling circle-replicating plasmids and positive-strand RNA viruses. Different families of eukaryotic dsDNA viruses appear to have originated from specific groups of bacteriophages on at least two independent occasions. Polintons, the largest known eukaryotic transposons, which are predicted to also form virus particles, were most likely the evolutionary intermediates between bacterial tectiviruses and several groups of eukaryotic dsDNA viruses, including the proposed order "Megavirales" that unites diverse families of large and giant viruses. Strikingly, evolution of all classes of eukaryotic viruses appears to have involved fusion between structural and replicative gene modules derived from different sources along with additional acquisitions of diverse genes. Published by Elsevier Inc.
    Full-text · Article · Mar 2015 · Virology
  • Source
    • "Virome studies allow for the first time to apprehend the overall phage genomic diversity, which is even greater than bacterial genomic diversity (Kristensen et al., 2013). It should be noted however, that phage genomic diversity is lower in fecal samples compared to that observed in open environments, as has already been observed for bacterial diversity (Ley et al., 2006). "
    ABSTRACT: Metagenomic approaches applied to viruses have highlighted their prevalence in almost all microbial ecosystems investigated. In all ecosystems, notably those associated with humans or animals, the viral fraction is dominated by bacteriophages. Whether they contribute to dysbiosis, i.e., the departure from microbiota composition in symbiosis at equilibrium and entry into a state favoring human or animal disease, is unknown at present. This review summarizes what has been learnt about phages associated with human and animal microbiota, and focuses on examples illustrating the several ways in which phages may contribute to a shift to pathogenesis, either by modifying population equilibrium, by horizontal transfer, or by modulating immunity.
    Full-text · Article · Mar 2014 · Frontiers in Cellular and Infection Microbiology
  • Source
    • "Marker genes for phage taxa. Phage Orthologous Groups (POGs) were constructed using the proteins contained in over 1000 phage genomes, including single-strand and double-strand DNA, single-strand and double-strand RNA phages and archaeal viruses (Kristensen et al., 2013). Then, taxon-specific marker genes were identified that are never found in other viral taxa (that is, 100% precision), and not found in non-prophage regions of bacterial chromosomes (that is, viral quotient greater than 85% (see Kristensen et al. (2013) for details)). The presence of a phage in a given sample is determined by the detection of one of these marker genes. "
    ABSTRACT: Bacteriophages have key roles in microbial communities, to a large extent shaping the taxonomic and functional composition of the microbiome, but data on the connections between phage diversity and the composition of communities are scarce. Using taxon-specific marker genes, we identified and monitored 20 viral taxa in 252 human gut metagenomic samples, mostly at the level of genera. On average, five phage taxa were identified in each sample, with up to three of these being highly abundant. The abundances of most phage taxa vary by up to four orders of magnitude between the samples, and several taxa that are highly abundant in some samples are absent in others. Significant correlations exist between the abundances of some phage taxa and human host metadata: for example, 'Group 936 lactococcal phages' are more prevalent and abundant in Danish samples than in samples from Spain or the United States of America. Quantification of phages that exist as integrated prophages revealed that the abundance profiles of prophages are highly individual-specific and remain unique to an individual over a 1-year time period, and prediction of prophage lysis across the samples identified hundreds of prophages that are apparently active in the gut and vary across the samples, in terms of presence and lytic state. Finally, a prophage-host network of the human gut was established and includes numerous novel host-phage associations.
    Full-text · Article · Mar 2014 · The ISME Journal
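
The marker-gene rule quoted in this entry (100% precision, viral quotient greater than 85%, a taxon called present when one of its markers is detected) can be sketched in a few lines of Python. This is an illustrative sketch, not the published implementation: the function names and the record layout are hypothetical, and precision and viral quotient are assumed to be precomputed as fractions between 0 and 1 (see Kristensen et al. (2013) for how they are defined).

```python
def select_markers(candidates, min_precision=1.0, min_vq=0.85):
    """Keep candidate genes that pass both thresholds (illustrative sketch only).

    candidates: iterable of dicts with keys 'gene', 'taxon', 'precision' and
                'viral_quotient'; both scores are assumed precomputed in [0, 1].
    Returns a dict mapping marker gene -> taxon.
    """
    return {
        c["gene"]: c["taxon"]
        for c in candidates
        # Precision must reach 100%; viral quotient must be strictly above 85%.
        if c["precision"] >= min_precision and c["viral_quotient"] > min_vq
    }


def taxa_present(sample_genes, markers):
    """Call a taxon present if at least one of its marker genes is detected."""
    return {markers[g] for g in sample_genes if g in markers}
```

Calling a taxon from a single detected marker favours sensitivity; a stricter variant could require several markers per taxon, at the cost of missing low-abundance phages.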