SEED Servers: High-Performance Access to the SEED Genomes, Annotations, and Metabolic Models

Computation Institute, University of Chicago, Chicago, Illinois, United States of America
PLoS ONE (Impact Factor: 3.23). 10/2012; 7(10):e48053. DOI: 10.1371/journal.pone.0048053
Source: PubMed


The remarkable advance in sequencing technology and the rising interest in medical and environmental microbiology, biotechnology, and synthetic biology have resulted in a deluge of published microbial genomes. Yet genome annotation, comparison, and modeling remain a major bottleneck to the translation of sequence information into biological knowledge; hence, computational analysis tools are continuously being developed for rapid genome annotation and interpretation. Among the earliest, most comprehensive resources for prokaryotic genome analysis, the SEED project, initiated in 2003 as an integration of genomic data and analysis tools, now contains >5,000 complete genomes, a constantly updated set of curated annotations embodied in a large and growing collection of encoded subsystems, a derived set of protein families, and hundreds of genome-scale metabolic models. Until recently, however, maintaining current copies of the SEED code and data at remote locations has been a pressing issue. To allow high-performance remote access to the SEED database, we developed the SEED Servers: four network-based servers intended to expose the data in the underlying relational database, support basic annotation services, offer programmatic access to the capabilities of the RAST annotation server, and provide access to a growing collection of metabolic models that support flux balance analysis. The SEED servers offer open access to regularly updated data, the ability to annotate prokaryotic genomes, the ability to create metabolic reconstructions and detailed models of metabolism, and access to hundreds of existing metabolic models. This work offers and supports a framework upon which other groups can build independent research efforts. Large integrations of genomic data represent one of the major intellectual resources driving research in biology, and programmatic access to the SEED data will provide significant utility to a broad collection of potential users.



Available from: Robert Olson, Jan 07, 2014
  • Source
    • "an e-value cutoff of 1 × 10⁻³, which is approximately equivalent, for our database, to 30% similarity over 100 amino acids and 20% similarity over 500 amino acids. This custom database contains the entire NCBI non-redundant protein database, the PhAnToMe phage protein database (Aziz et al., 2012), the NCBI Viral RefSeq Database, and the Aedes aegypti, Anopheles gambiae, and Culex quinquefasciatus protein databases (Megy et al., 2012). The rationale for combining databases was to ensure that viral and phage proteins were present, while simultaneously reducing the false positive rate by the inclusion of all possible no"
    ABSTRACT: Mosquitoes, most often recognized for the microbial agents of disease they may carry, harbor diverse microbial communities that include viruses, bacteria, and fungi, collectively called the microbiota. The composition of the microbiota can directly and indirectly affect disease transmission through microbial interactions that could be revealed by its characterization in natural populations of mosquitoes. Furthermore, the use of shotgun metagenomic sequencing (SMS) approaches could allow the discovery of unknown members of the microbiota. In this study, we use RNA SMS to characterize the microbiota of seven individual mosquitoes (species include Culex pipiens, Culiseta incidens, and Ochlerotatus sierrensis) collected from a variety of habitats in California, USA. Sequencing was performed on the Illumina HiSeq platform and the resulting sequences were quality-checked and assembled into contigs using the A5 pipeline. Sequences related to single-stranded RNA viruses of the Bunyaviridae and Rhabdoviridae were uncovered, along with an unclassified genus of double-stranded RNA viruses. Phylogenetic analysis finds that in all three cases, the closest relatives of the identified viral sequences are other mosquito-associated viruses, suggesting widespread host-group specificity among disparate viral taxa. Interestingly, we identified a Narnavirus of fungi, also reported elsewhere in mosquitoes, that potentially demonstrates a nested host-parasite association between virus, fungi, and mosquito. Sequences related to 8 bacterial families and 13 fungal families were found across the seven samples. Bacillus and Escherichia/Shigella were identified in all samples and Wolbachia was identified in all Cx. pipiens samples, while no single fungal genus was found in more than two samples. This study exemplifies the utility of RNA SMS in the characterization of the natural microbiota of mosquitoes and, in particular, the value of identifying all microbes associated with a specific host.
    Frontiers in Microbiology 03/2015; 6:185. DOI:10.3389/fmicb.2015.00185 · 3.99 Impact Factor
  • Source
    • "FOCUS requires a group of reference genomes to model and identify the organisms present in a metagenome. 2,766 complete genomes were downloaded from the SEED servers (Aziz et al., 2012) on 20 December 2013 (see Table S1). k-mer frequencies (k = 6–8, default: k = 7) were calculated for both strands using Jellyfish 1.1.6 "
    ABSTRACT: One of the major goals in metagenomics is to identify the organisms present in a microbial community from unannotated shotgun sequencing reads. Taxonomic profiling has valuable applications in biological and medical research, including disease diagnostics. Most currently available approaches do not scale well with increasing data volumes, which is important because both the number and lengths of the reads provided by sequencing platforms keep increasing. Here we introduce FOCUS, an agile composition-based approach using non-negative least squares (NNLS) to report the organisms present in metagenomic samples and profile their abundances. FOCUS was tested with simulated and real metagenomes, and the results show that our approach accurately predicts the organisms present in microbial communities. FOCUS was implemented in Python. The source code and web server are freely available at
    PeerJ 06/2014; 2(4):e425. DOI:10.7717/peerj.425 · 2.11 Impact Factor
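
The k-mer counting step quoted in the FOCUS excerpt above can be illustrated with a minimal sketch. This is not the FOCUS or Jellyfish implementation; the function names `reverse_complement` and `kmer_frequencies` are hypothetical, and the choice to skip k-mers containing ambiguous bases is an assumption made for the example. It counts k-mers on both strands of a sequence and normalizes the counts to relative frequencies.

```python
from collections import Counter

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmer_frequencies(seq, k=7):
    """Relative k-mer frequencies counted on both strands (hypothetical helper).

    k-mers containing characters other than A, C, G, T are skipped;
    that policy is an assumption of this sketch.
    """
    seq = seq.upper()
    counts = Counter()
    for strand in (seq, reverse_complement(seq)):
        for i in range(len(strand) - k + 1):
            kmer = strand[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}
```

Calling `kmer_frequencies(genome, k=7)` on each reference genome would give one frequency profile per genome; profiles of this kind are the sort of input a composition-based method could then fit with non-negative least squares.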
  • Source
    • "After removing low quality reads with PRINSEQ (Schmieder and Edwards, 2011a) and human sequences using DeconSeq (Schmieder and Edwards, 2011b) with the commands specified in the Supplementary Materials, 13.6 million reads of ~140 bp were retained for further analysis (Supplementary Table 7). Genes related to butanedione synthesis from the SEED database (Aziz et al., 2012) and phenazine synthesis genes from the Patric database (Snyder et al., 2007) were used in BLASTn searches for related sequences in metagenomic data from CF patients. Metagenomic sequences were selected if they matched butanedione and phenazine synthesis pathway genes with a minimum length of 40 bp, sequence identity of 40% and a BLAST e-value cutoff of 1 × 10⁻¹⁰. "
    ABSTRACT: The airways of cystic fibrosis (CF) patients are chronically colonized by patient-specific polymicrobial communities. The conditions and nutrients available in CF lungs affect the physiology and composition of the colonizing microbes. Recent work in bioreactors has shown that the fermentation product 2,3-butanediol mediates cross-feeding between some fermenting bacteria and Pseudomonas aeruginosa, and that this mechanism increases bacterial current production. To examine bacterial fermentation in the respiratory tract, breath gas metabolites were measured and several metagenomes were sequenced from CF and non-CF volunteers. 2,3-butanedione was produced in nearly all respiratory tracts. Elevated levels in one patient decreased during antibiotic treatment, and breath concentrations varied between CF patients at the same time point. Some patients had high enough levels of 2,3-butanedione to irreversibly damage lung tissue. Antibiotic therapy likely dictates the activities of 2,3-butanedione-producing microbes, which suggests a need for further study with larger sample size. Sputum microbiomes were dominated by P. aeruginosa, Streptococcus spp. and Rothia mucilaginosa, and revealed the potential for 2,3-butanedione biosynthesis. Genes encoding 2,3-butanedione biosynthesis were disproportionately abundant in Streptococcus spp., whereas genes for consumption of butanedione pathway products were encoded by P. aeruginosa and R. mucilaginosa. We propose a model where low oxygen conditions in CF lung lead to fermentation and a decrease in pH, triggering 2,3-butanedione fermentation to avoid lethal acidification. We hypothesize that this may also increase phenazine production by P. aeruginosa, increasing reactive oxygen species and providing additional electron acceptors to CF microbes. The ISME Journal advance online publication, 9 January 2014; doi:10.1038/ismej.2013.229.
    The ISME Journal 01/2014; 8(6). DOI:10.1038/ismej.2013.229 · 9.30 Impact Factor
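
The hit-selection thresholds quoted in the excerpt above (length of at least 40 bp, 40% identity, e-value of 1 × 10⁻¹⁰) could be applied to BLAST results roughly as follows. This is a sketch under stated assumptions, not the authors' pipeline: it assumes the hits are in the standard 12-column BLAST tabular format (`-outfmt 6`), it reads the identity figure as a minimum, and the function name `filter_blast_hits` is hypothetical.

```python
def filter_blast_hits(lines, min_length=40, min_identity=40.0, max_evalue=1e-10):
    """Keep tabular BLAST hits that pass length, identity, and e-value cutoffs.

    Assumes the default 12-column tabular layout:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        pident = float(fields[2])    # percent identity
        length = int(fields[3])      # alignment length
        evalue = float(fields[10])   # expectation value
        if length >= min_length and pident >= min_identity and evalue <= max_evalue:
            kept.append((fields[0], fields[1], pident, length, evalue))
    return kept
```

The function accepts any iterable of lines, so an open file handle on a BLAST output file can be passed directly.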