-
ABSTRACT: BACKGROUND: Ribonucleotide reductase (RNR), the enzyme responsible for the formation of deoxyribonucleotides from ribonucleotides, is found in all domains of life and many viral genomes. RNRs are also amongst the most abundant genes identified in environmental metagenomes. This study focused on understanding the distribution, diversity, and evolution of RNRs in phages (viruses that infect bacteria). Hidden Markov Model profiles were used to analyze the proteins encoded by 685 completely sequenced double-stranded DNA phages and 22 environmental viral metagenomes to identify RNR homologs in cultured phages and uncultured viral communities, respectively. RESULTS: RNRs were identified in 128 phage genomes, nearly tripling the number of phages known to encode RNRs. Class I RNR was the most common RNR class observed in phages (70%), followed by class II (29%) and class III (28%). Twenty-eight percent of the phages contained genes belonging to multiple RNR classes. RNR class distribution varied according to phage type, isolation environment, and the host's ability to utilize oxygen. The majority of the phages containing RNRs are Myoviridae (65%), followed by Siphoviridae (30%) and Podoviridae (3%). The phylogeny and genomic organization of phage and host RNRs reveal several distinct evolutionary scenarios involving horizontal gene transfer, co-evolution, and differential selection pressure. Several putative split RNR genes interrupted by self-splicing introns or inteins were identified, providing further evidence for the role of frequent genetic exchange. Finally, viral metagenomic data indicate that RNRs are prevalent and highly dynamic in uncultured viral communities, necessitating future research to determine the environmental conditions under which RNRs provide a selective advantage. CONCLUSIONS: This comprehensive study describes the distribution, diversity, and evolution of RNRs in phage genomes and environmental viral metagenomes. 
The distinct distributions of specific RNR classes amongst phages, combined with the various evolutionary scenarios predicted from RNR phylogenies suggest multiple inheritance sources and different selective forces for RNRs in phages. This study significantly improves our understanding of phage RNRs, providing insight into the diversity and evolution of this important auxiliary metabolic gene as well as the evolution of phages in response to their bacterial hosts and environments.
BMC Evolutionary Biology 02/2013; 13(1):33. · 3.52 Impact Factor
-
Elizabeth A Dinsdale, Robert A Edwards,
Barbara A Bailey,
Imre Tuba,
Sajia Akhter,
Katelyn McNair,
Robert Schmieder,
Naneh Apkarian,
Michelle Creek,
Eric Guan,
Mayra Hernandez,
Katherine Isaacs,
Chris Peterson,
Todd Regh,
Vadim Ponomarenko
ABSTRACT: Metagenomics is a primary tool for the description of microbial and viral communities. The sheer magnitude of the data generated in each metagenome makes identifying key differences in the function and taxonomy between communities difficult to elucidate. Here we discuss the application of seven different data mining and statistical analyses by comparing and contrasting the metabolic functions of 212 microbial metagenomes within and between 10 environments. Not all approaches are appropriate for all questions, and researchers should decide which approach addresses their questions. This work demonstrated the use of each approach: for example, random forests provided a robust and enlightening description of both the clustering of metagenomes and the metabolic processes that were important in separating microbial communities from different environments. All analyses identified that the presence of phage genes within the microbial community was a predictor of whether the microbial community was host-associated or free-living. Several analyses identified the subtle differences that occur with environments, such as those seen in different regions of the marine environment.
Frontiers in genetics. 01/2013; 4:41.
-
ABSTRACT: All sequence data contain inherent information that can be measured by Shannon's uncertainty theory. Such measurement is valuable in evaluating large data sets, such as metagenomic libraries, to prioritize their analysis and annotation, thus saving computational resources. Here, Shannon's index of complete phage and bacterial genomes was examined. The information content of a genome was found to be highly dependent on the genome length, GC content, and sequence word size. In metagenomic sequences, the amount of information correlated with the number of matches found by comparison to sequence databases. A sequence with more information (higher uncertainty) has a higher probability of being significantly similar to other sequences in the database. Measuring uncertainty may be used for rapid screening for sequences with matches in available database, prioritizing computational resources, and indicating which sequences with no known similarities are likely to be important for more detailed analysis.
Scientific Reports 01/2013; 3:1033.
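The uncertainty measurement described in the abstract above can be illustrated with a minimal sketch. It assumes overlapping words of a fixed size and a base-2 logarithm; the abstract does not state the exact formula or normalization used in the paper, and the function name `shannon_index` is hypothetical.

```python
import math
from collections import Counter


def shannon_index(sequence: str, word_size: int = 3) -> float:
    """Shannon uncertainty (in bits) over the overlapping words of a sequence.

    Assumption: words are counted with a sliding window of step 1 and the
    result is not normalized for sequence length or GC content.
    """
    seq = sequence.upper()
    n_words = len(seq) - word_size + 1
    if n_words <= 0:
        # Sequence shorter than one word: no information to measure.
        return 0.0
    counts = Counter(seq[i:i + word_size] for i in range(n_words))
    return -sum((c / n_words) * math.log2(c / n_words) for c in counts.values())


if __name__ == "__main__":
    # A homopolymer carries no uncertainty; a balanced sequence carries more.
    print(shannon_index("AAAAAAAA", word_size=1))
    print(shannon_index("ACGTACGT", word_size=1))
```

Because the value depends on the word size and the sequence length, as the abstract notes, scores are only comparable between sequences measured with the same settings.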
-
ABSTRACT: MOTIVATION: Metagenomes are often characterized by high levels of unknown sequences. Reads derived from known micro-organisms can easily be identified and analyzed using fast homology search algorithms and a suitable reference database, but the unknown sequences are often ignored in further analyses, biasing conclusions. Nevertheless, it is possible to use more data in a comparative metagenomic analysis by creating a cross-assembly of all reads, i.e. a single assembly of reads from different samples. Comparative metagenomics studies the inter-relationships between metagenomes from different samples. Using an assembly algorithm is a fast and intuitive way to link (partially) homologous reads without requiring a database of reference sequences. RESULTS: Here, we introduce crAss, a novel bioinformatic tool that enables fast, simple analysis of cross-assembly files, yielding distances between all metagenomic sample pairs and an insightful image displaying the similarities. Availability and Implementation: crAss is available as a web server at http://edwards.sdsu.edu/crass/ and the Perl source code can be downloaded to run as a stand-alone, command line tool. CONTACT: dutilh@cmbi.ru.nl.
Bioinformatics 10/2012; · 5.47 Impact Factor
-
ABSTRACT: Annotation of metagenomes involves comparing the individual sequence reads with a database of known sequences and assigning a unique function to each read. This is a time-consuming task that is computationally intensive (though not computationally complex). Here we present a novel approach to annotate metagenomes using unique k-mer oligopeptide sequences from seven to twelve amino acids long. We demonstrate that k-mer based annotations are faster and approach the sensitivity and precision of blastx based annotations without losing accuracy. A last-common ancestor approach was also developed to describe the members of the community. Availability and Implementation: This open-source application was implemented in Perl and can be accessed via a user-friendly website at http://edwards.sdsu.edu/rtmg. In addition, code to access the annotation servers is available for download from http://www.theseed.org/. FIGfams and k-mers are available for download from ftp://ftp.theseed.org/FIGfams/.
Bioinformatics 10/2012; · 5.47 Impact Factor
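The idea of annotating with unique k-mers, as described in the abstract above, can be sketched as follows. This is an illustration only, not the published Perl implementation: it assumes reads have already been translated to peptides, that a k-mer is "unique" when it occurs in exactly one protein family, and that a minimum number of agreeing k-mers is required. The names `build_unique_kmer_index`, `annotate`, and `min_hits` are hypothetical.

```python
from collections import Counter


def build_unique_kmer_index(families, k=8):
    """Map every k-mer found in exactly one family to that family's name.

    families: dict of family name -> list of protein sequences.
    """
    owner = {}
    ambiguous = set()
    for family, proteins in families.items():
        for protein in proteins:
            for i in range(len(protein) - k + 1):
                kmer = protein[i:i + k]
                if kmer in ambiguous:
                    continue
                previous = owner.get(kmer)
                if previous is None:
                    owner[kmer] = family
                elif previous != family:
                    # Shared between families, so it cannot act as a signature.
                    del owner[kmer]
                    ambiguous.add(kmer)
    return owner


def annotate(peptide, index, k=8, min_hits=2):
    """Return the family supported by the most unique k-mers, or None."""
    hits = Counter(
        index[peptide[i:i + k]]
        for i in range(len(peptide) - k + 1)
        if peptide[i:i + k] in index
    )
    if not hits:
        return None
    family, count = hits.most_common(1)[0]
    return family if count >= min_hits else None


if __name__ == "__main__":
    toy_families = {
        "famA": ["MKTAYIAKQRQISFVK"],
        "famB": ["MKTAYIAKGGGGSSSS"],
    }
    index = build_unique_kmer_index(toy_families, k=8)
    print(annotate("TAYIAKQRQISF", index, k=8))
```

The lookup is a dictionary access per k-mer, which is why this style of annotation can be much faster than an alignment-based search.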
-
ABSTRACT: Prophages are phages in lysogeny that are integrated into, and replicated as part of, the host bacterial genome. These mobile elements can have tremendous impact on their bacterial hosts' genomes and phenotypes, which may lead to strain emergence and diversification, increased virulence or antibiotic resistance. However, finding prophages in microbial genomes remains a problem with no definitive solution. The majority of existing tools rely on detecting genomic regions enriched in protein-coding genes with known phage homologs, which hinders the de novo discovery of phage regions. In this study, a weighted phage detection algorithm, PhiSpy was developed based on seven distinctive characteristics of prophages, i.e. protein length, transcription strand directionality, customized AT and GC skew, the abundance of unique phage words, phage insertion points and the similarity of phage proteins. The first five characteristics are capable of identifying prophages without any sequence similarity with known phage genes. PhiSpy locates prophages by ranking genomic regions enriched in distinctive phage traits, which leads to the successful prediction of 94% of prophages in 50 complete bacterial genomes with a 6% false-negative rate and a 0.66% false-positive rate.
Nucleic Acids Research 05/2012; 40(16):e126. · 8.03 Impact Factor
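Two of the characteristics listed in the abstract above are AT and GC skew. The sketch below computes the standard textbook skews, (G - C)/(G + C) and (A - T)/(A + T), in fixed windows. It is not the "customized" skew used by PhiSpy, whose details the abstract does not give; the function names, window size, and step are assumptions chosen for illustration.

```python
def skew(chunk, first, second):
    """Standard skew (first - second) / (first + second); 0.0 if neither occurs."""
    n_first = chunk.count(first)
    n_second = chunk.count(second)
    total = n_first + n_second
    return (n_first - n_second) / total if total else 0.0


def windowed_skews(sequence, window=100, step=100):
    """Return (start, gc_skew, at_skew) for each full window along the sequence."""
    seq = sequence.upper()
    results = []
    for start in range(0, max(len(seq) - window + 1, 1), step):
        chunk = seq[start:start + window]
        results.append((start, skew(chunk, "G", "C"), skew(chunk, "A", "T")))
    return results


if __name__ == "__main__":
    for start, gc, at in windowed_skews("GGGCATTT", window=4, step=4):
        print(start, gc, at)
```

A region whose skew profile departs sharply from its neighbors is one possible signal of foreign DNA; a real detector would combine it with the other characteristics named in the abstract.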
-
ABSTRACT: Phages (viruses that infect bacteria) have gained significant attention because of their abundance, diversity and important ecological roles. However, the lack of a universal gene shared by all phages presents a challenge for phage identification and characterization, especially in environmental samples where it is difficult to culture phage-host systems. Homologous conserved genes (or "signature genes") present in groups of closely-related phages can be used to explore phage diversity and define evolutionary relationships amongst these phages. Bioinformatic approaches are needed to identify candidate signature genes and design PCR primers to amplify those genes from environmental samples; however, there is currently no existing computational tool that biologists can use for this purpose.
Here we present PhiSiGns, a web-based and standalone application that performs a pairwise comparison of each gene present in user-selected phage genomes, identifies signature genes, generates alignments of these genes, and designs potential PCR primer pairs. PhiSiGns is available at (http://www.phantome.org/phisigns/; http://phisigns.sourceforge.net/) with a link to the source code. Here we describe the specifications of PhiSiGns and demonstrate its application with a case study.
PhiSiGns provides phage biologists with a user-friendly tool to identify signature genes and design PCR primers to amplify related genes from uncultured phages in environmental samples. This bioinformatics tool will facilitate the development of novel signature genes for use as molecular markers in studies of phage diversity, phylogeny, and evolution.
BMC Bioinformatics 03/2012; 13:37. · 2.75 Impact Factor
-
ABSTRACT: Phages are a primary driving force behind the evolution of bacterial pathogens by transferring a variety of virulence genes into their hosts. Similar to other bacterial genomes, the Salmonella enterica serovar Enteritidis LK5 genome contains several regions that are homologous to phages. Although genomic analysis demonstrated the presence of prophages, it was unable to confirm which phage elements within the genome were viable. Genetic markers were used to tag one of the prophages in the genome to allow monitoring of phage induction. Commonly used laboratory strains of Salmonella were resistant to phage infection, and therefore a rapid screen was developed to identify susceptible hosts. This approach showed that a genetically tagged prophage, ELPhiS (Enteritidis lysogenic phage S), was capable of infecting Salmonella serovars that are diverse in host range and virulence and has the potential to laterally transfer genes between these serovars via lysogenic conversion. The rapid screen approach is adaptable to any system with a large collection of isolates and may be used to test the viability of prophages found by sequencing the genomes of various bacterial pathogens.
Applied and environmental microbiology 03/2012; 78(6):1785-93. · 3.69 Impact Factor
-
Ramy K Aziz,
Scott Devoid,
Terrence Disz, Robert A Edwards,
Christopher S Henry,
Gary J Olsen,
Robert Olson,
Ross Overbeek,
Bruce Parrello,
Gordon D Pusch,
Rick L Stevens,
Veronika Vonstein,
Fangfang Xia
ABSTRACT: The remarkable advance in sequencing technology and the rising interest in medical and environmental microbiology, biotechnology, and synthetic biology resulted in a deluge of published microbial genomes. Yet, genome annotation, comparison, and modeling remain a major bottleneck to the translation of sequence information into biological knowledge, hence computational analysis tools are continuously being developed for rapid genome annotation and interpretation. Among the earliest, most comprehensive resources for prokaryotic genome analysis, the SEED project, initiated in 2003 as an integration of genomic data and analysis tools, now contains >5,000 complete genomes, a constantly updated set of curated annotations embodied in a large and growing collection of encoded subsystems, a derived set of protein families, and hundreds of genome-scale metabolic models. Until recently, however, maintaining current copies of the SEED code and data at remote locations has been a pressing issue. To allow high-performance remote access to the SEED database, we developed the SEED Servers (http://www.theseed.org/servers): four network-based servers intended to expose the data in the underlying relational database, support basic annotation services, offer programmatic access to the capabilities of the RAST annotation server, and provide access to a growing collection of metabolic models that support flux balance analysis. The SEED servers offer open access to regularly updated data, the ability to annotate prokaryotic genomes, the ability to create metabolic reconstructions and detailed models of metabolism, and access to hundreds of existing metabolic models. This work offers and supports a framework upon which other groups can build independent research efforts. 
Large integrations of genomic data represent one of the major intellectual resources driving research in biology, and programmatic access to the SEED data will provide significant utility to a broad collection of potential users.
PLoS ONE 01/2012; 7(10):e48053. · 4.09 Impact Factor
-
ABSTRACT: Viruses are the most abundant biological entities on the planet and play an important role in balancing microbes within an ecosystem and facilitating horizontal gene transfer. Although bacteriophages are abundant in rumen environments, little is known about the types of viruses present or their interaction with the rumen microbiome. We undertook random pyrosequencing of virus-enriched metagenomes (viromes) isolated from bovine rumen fluid and analysed the resulting data using comparative metagenomics. A high level of diversity was observed with up to 28,000 different viral genotypes obtained from each environment. The majority (~78%) of sequences did not match any previously described virus. Prophages outnumbered lytic phages approximately 2:1 with the most abundant bacteriophage and prophage types being associated with members of the dominant rumen phyla (Firmicutes and Proteobacteria). Metabolic profiling based on SEED subsystems revealed an enrichment of sequences with putative functional roles in DNA and protein metabolism, but a surprisingly low proportion of sequences assigned to carbohydrate and amino acid metabolism. We expanded our analysis to include previously described metagenomic data and 14 reference genomes. Clustered regularly interspaced short palindromic repeats (CRISPR) were detected in most of the microbial genomes, suggesting previous interactions between viral and microbial communities.
Environmental Microbiology 01/2012; 14(1):207-27. · 5.84 Impact Factor
-
ABSTRACT: Genes, like organisms, struggle for existence, and the most successful genes persist and widely disseminate in nature. The unbiased determination of the most successful genes requires access to sequence data from a wide range of phylogenetic taxa and ecosystems, which has finally become achievable thanks to the deluge of genomic and metagenomic sequences. Here, we analyzed 10 million protein-encoding genes and gene tags in sequenced bacterial, archaeal, eukaryotic and viral genomes and metagenomes, and our analysis demonstrates that genes encoding transposases are the most prevalent genes in nature. The finding that these genes, classically considered as selfish genes, outnumber essential or housekeeping genes suggests that they offer selective advantage to the genomes and ecosystems they inhabit, a hypothesis in agreement with an emerging body of literature. Their mobile nature not only promotes dissemination of transposable elements within and between genomes but also leads to mutations and rearrangements that can accelerate biological diversification and--consequently--evolution. By securing their own replication and dissemination, transposases guarantee to thrive so long as nucleic acid-based life forms exist.
Nucleic Acids Research 03/2010; 38(13):4207-17. · 8.03 Impact Factor
-
ABSTRACT: The SEED integrates many publicly available genome sequences into a single resource. The database contains accurate and up-to-date annotations based on the subsystems concept that leverages clustering between genomes and other clues to accurately and efficiently annotate microbial genomes. The backend is used as the foundation for many genome annotation tools, such as the Rapid Annotation using Subsystems Technology (RAST) server for whole genome annotation, the metagenomics RAST server for random community genome annotations, and the annotation clearinghouse for exchanging annotations from different resources. In addition to a web user interface, the SEED also provides Web services based API for programmatic access to the data in the SEED, allowing the development of third-party tools and mash-ups.
The currently exposed Web services encompass over forty different methods for accessing data related to microbial genome annotations. The Web services provide comprehensive access to the database back end, allowing any programmer access to the most consistent and accurate genome annotations available. The Web services are deployed using a platform independent service-oriented approach that allows the user to choose the most suitable programming platform for their application. Example code demonstrates that Web services can be used to access the SEED using common bioinformatics programming languages such as Perl, Python, and Java.
We present a novel approach to access the SEED database. Using Web services, a robust API for access to genomics data is provided, without requiring large volume downloads all at once. The API ensures timely access to the most current datasets available, including the new genomes as soon as they come online.
BMC Bioinformatics 01/2010; 11:319. · 2.75 Impact Factor
-
Florent E Angly,
Dana Willner,
Alejandra Prieto-Davó, Robert A Edwards,
Robert Schmieder,
Rebecca Vega-Thurber,
Dionysios A Antonopoulos,
Katie Barott,
Matthew T Cottrell,
Christelle Desnues, [......],
Folker Meyer,
R Michael Miller,
Egbert Mundt,
Robert K Naviaux,
Beltran Rodriguez-Mueller,
Rick Stevens,
Linda Wegley,
Lixin Zhang,
Baoli Zhu,
Forest Rohwer
ABSTRACT: Metagenomic studies characterize both the composition and diversity of uncultured viral and microbial communities. BLAST-based comparisons have typically been used for such analyses; however, sampling biases, high percentages of unknown sequences, and the use of arbitrary thresholds to find significant similarities can decrease the accuracy and validity of estimates. Here, we present Genome relative Abundance and Average Size (GAAS), a complete software package that provides improved estimates of community composition and average genome length for metagenomes in both textual and graphical formats. GAAS implements a novel methodology to control for sampling bias via length normalization, to adjust for multiple BLAST similarities by similarity weighting, and to select significant similarities using relative alignment lengths. In benchmark tests, the GAAS method was robust to both high percentages of unknown sequences and to variations in metagenomic sequence read lengths. Re-analysis of the Sargasso Sea virome using GAAS indicated that standard methodologies for metagenomic analysis may dramatically underestimate the abundance and importance of organisms with small genomes in environmental systems. Using GAAS, we conducted a meta-analysis of microbial and viral average genome lengths in over 150 metagenomes from four biomes to determine whether genome lengths vary consistently between and within biomes, and between microbial and viral communities from the same environment. Significant differences between biomes and within aquatic sub-biomes (oceans, hypersaline systems, freshwater, and microbialites) suggested that average genome length is a fundamental property of environments driven by factors at the sub-biome level. 
The behavior of paired viral and microbial metagenomes from the same environment indicated that microbial and viral average genome sizes are independent of each other, but indicative of community responses to stressors and environmental conditions.
PLoS Computational Biology 12/2009; 5(12):e1000593. · 5.22 Impact Factor
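The two corrections named in the GAAS abstract above, similarity weighting and length normalization, can be sketched in a simplified form. This is not the GAAS implementation: it assumes each read's similarity scores are scaled to sum to one, that per-genome totals are divided by genome length, and that average genome length is the abundance-weighted mean. All function and parameter names are hypothetical.

```python
def relative_abundance(read_hits, genome_lengths):
    """Estimate relative genome abundance from per-read similarity hits.

    read_hits: list of reads, each a list of (genome, similarity_score) pairs.
    genome_lengths: dict of genome -> length in base pairs.
    """
    weights = {}
    for hits in read_hits:
        total = sum(score for _, score in hits)
        if total <= 0:
            continue
        # Similarity weighting: each read contributes one unit in total.
        for genome, score in hits:
            weights[genome] = weights.get(genome, 0.0) + score / total
    # Length normalization: longer genomes yield proportionally more reads.
    normalized = {g: w / genome_lengths[g] for g, w in weights.items()}
    norm_total = sum(normalized.values())
    if norm_total == 0:
        return {}
    return {g: v / norm_total for g, v in normalized.items()}


def average_genome_length(abundance, genome_lengths):
    """Abundance-weighted mean genome length."""
    return sum(abundance[g] * genome_lengths[g] for g in abundance)


if __name__ == "__main__":
    lengths = {"small": 1000, "large": 4000}
    hits = [[("small", 1.0)]] + [[("large", 1.0)] for _ in range(4)]
    abundance = relative_abundance(hits, lengths)
    print(abundance, average_genome_length(abundance, lengths))
```

In the toy data the large genome receives four times as many reads but is four times as long, so after normalization both genomes are estimated to be equally abundant. This mirrors the abstract's point that uncorrected counts can underestimate organisms with small genomes.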
-
ABSTRACT: The coral holobiont is the community of metazoans, protists and microbes associated with scleractinian corals. Disruptions in these associations have been correlated with coral disease, but little is known about the series of events involved in the shift from mutualism to pathogenesis. To evaluate structural and functional changes in coral microbial communities, Porites compressa was exposed to four stressors: increased temperature, elevated nutrients, dissolved organic carbon loading and reduced pH. Microbial metagenomic samples were collected and pyrosequenced. Functional gene analysis demonstrated that stressors increased the abundance of microbial genes involved in virulence, stress resistance, sulfur and nitrogen metabolism, motility and chemotaxis, fatty acid and lipid utilization, and secondary metabolism. Relative changes in taxonomy also demonstrated that coral-associated microbiota (Archaea, Bacteria, protists) shifted from a healthy-associated coral community (e.g. Cyanobacteria, Proteobacteria and the zooxanthellae Symbiodinium) to a community (e.g. Bacteroidetes, Fusobacteria and Fungi) of microbes often found on diseased corals. Additionally, low-abundance Vibrio spp. were found to significantly alter microbiome metabolism, suggesting that the contribution of just a few members of a community can profoundly shift the health status of the coral holobiont.
Environmental Microbiology 05/2009; 11(8):2148-63. · 5.84 Impact Factor
-
Rebecca L Vega Thurber,
Katie L Barott,
Dana Hall,
Hong Liu,
Beltran Rodriguez-Mueller,
Christelle Desnues, Robert A Edwards,
Matthew Haynes,
Florent E Angly,
Linda Wegley,
Forest L Rohwer
ABSTRACT: During the last several decades corals have been in decline and at least one-third of all coral species are now threatened with extinction. Coral disease has been a major contributor to this threat, but little is known about the responsible pathogens. To date most research has focused on bacterial and fungal diseases; however, viruses may also be important for coral health. Using a combination of empirical viral metagenomics and real-time PCR, we show that Porites compressa corals contain a suite of eukaryotic viruses, many related to the Herpesviridae. This coral-associated viral consortium was found to shift in response to abiotic stressors. In particular, when exposed to reduced pH, elevated nutrients, and thermal stress, the abundance of herpes-like viral sequences rapidly increased in 2 separate experiments. Herpes-like viral sequences were rarely detected in apparently healthy corals, but were abundant in a majority of stressed samples. In addition, surveys of the Nematostella and Hydra genomic projects demonstrate that even distantly related Cnidarians contain numerous herpes-like viral genes, likely as a result of latent or endogenous viral infection. These data support the hypotheses that corals experience viral infections, which are exacerbated by stress, and that herpes-like viruses are common in Cnidarians.
Proceedings of the National Academy of Sciences 12/2008; 105(47):18413-8. · 9.68 Impact Factor
-
Elizabeth A Dinsdale, Robert A Edwards,
Dana Hall,
Florent Angly,
Mya Breitbart,
Jennifer M Brulc,
Mike Furlan,
Christelle Desnues,
Matthew Haynes,
Linlin Li, [......],
John Paul,
Beltran Rodriguez-Brito,
Yijun Ruan,
Brandon K Swan,
Rick Stevens,
David L Valentine,
Rebecca Vega Thurber,
Linda Wegley,
Bryan A White,
Forest Rohwer
Nature 11/2008; 455(7214):830. · 36.28 Impact Factor
-
Lutz Krause,
Naryttza N Diaz, Robert A Edwards,
Karl-Heinz Gartemann,
Holger Krömeke,
Heiko Neuweger,
Alfred Pühler,
Kai J Runte,
Andreas Schlüter,
Jens Stoye,
Rafael Szczepanowski,
Andreas Tauch,
Alexander Goesmann
ABSTRACT: A total community DNA sample from an agricultural biogas reactor continuously fed with maize silage, green rye, and small proportions of chicken manure has recently been sequenced using massively parallel pyrosequencing. In this study, the sample was computationally characterized without a prior assembly step, providing quantitative insights into the taxonomic composition and gene content of the underlying microbial community. Clostridiales from the phylum Firmicutes is the most prevalent phylogenetic order, and Methanomicrobiales are dominant among methanogenic archaea. An analysis of Operational Taxonomic Units (OTUs) revealed that the entire microbial community is only partially covered by the sequenced sample, even though estimates suggest only a moderate overall diversity of the community. Furthermore, the results strongly indicate that archaea related to the genus Methanoculleus, using CO2 as electron acceptor and H2 as electron donor, are the main producers of methane in the analyzed biogas reactor sample. A phylogenetic analysis of glycosyl hydrolase protein families suggests that Clostridia play an important role in the digestion of polysaccharides and oligosaccharides. Finally, the results revealed that most of the organisms constituting the sample are still unexplored.
Journal of Biotechnology 07/2008; 136(1-2):91-101. · 3.05 Impact Factor
-
Mya Breitbart,
Matthew Haynes,
Scott Kelley,
Florent Angly, Robert A Edwards,
Ben Felts,
Joseph M Mahaffy,
Jennifer Mueller,
James Nulton,
Steve Rayhawk,
Beltran Rodriguez-Brito,
Peter Salamon,
Forest Rohwer
ABSTRACT: Metagenomic sequencing of DNA viruses from the feces of a healthy week-old infant revealed a viral community with extremely low diversity. The identifiable sequences were dominated by phages, which likely influence the diversity and abundance of co-occurring microbes. The most abundant fecal viral sequences did not originate from breast milk or formula, suggesting a non-dietary initial source of viruses. Certain sequences were stable in the infant's gut over the first 3 months of life, but microarray experiments demonstrated that the overall viral community composition changed dramatically between 1 and 2 weeks of age.
Research in Microbiology 06/2008; 159(5):367-73. · 2.76 Impact Factor
-
ABSTRACT: The coral holobiont is the integrated assemblage of the coral animal, its symbiotic algae, protists, fungi and a diverse consortium of Bacteria and Archaea. Corals are a model system for the study of symbiosis, the breakdown of which can result in disease and mortality. Little is known, however, about viruses that infect corals and their symbionts. Here we present metagenomic analyses of the viral communities associated with healthy and partially bleached specimens of the Caribbean reef-building coral Diploria strigosa. Surprisingly, herpes-like sequences accounted for 4-8% of the total sequences in each metagenome; this abundance of herpes-like sequences is unprecedented in other marine viral metagenomes. Viruses similar to those that infect algae and plants were also present in the coral viral assemblage. Among the phage identified, cyanophages were abundant in both healthy and bleaching corals and vibriophages were also present. Therefore, coral-associated viruses could potentially infect all components of the holobiont--coral, algal and microbial. Thus, we expect viruses to figure prominently in the preservation and breakdown of coral health.
Environmental Microbiology 06/2008; 10(9):2277-86. · 5.84 Impact Factor
-
ABSTRACT: Metagenomics is providing striking insights into the ecology of microbial communities. The recently developed massively parallel 454 pyrosequencing technique gives the opportunity to rapidly obtain metagenomic sequences at a low cost and without cloning bias. However, the phylogenetic analysis of the short reads produced represents a significant computational challenge. The phylogenetic algorithm CARMA for predicting the source organisms of environmental 454 reads is described. The algorithm searches for conserved Pfam domain and protein families in the unassembled reads of a sample. These gene fragments (environmental gene tags, EGTs), are classified into a higher-order taxonomy based on the reconstruction of a phylogenetic tree of each matching Pfam family. The method exhibits high accuracy for a wide range of taxonomic groups, and EGTs as short as 27 amino acids can be phylogenetically classified up to the rank of genus. The algorithm was applied in a comparative study of three aquatic microbial samples obtained by 454 pyrosequencing. Profound differences in the taxonomic composition of these samples could be clearly revealed.
Nucleic Acids Research 05/2008; 36(7):2230-9. · 8.03 Impact Factor