SEED Servers: High-Performance Access to the SEED Genomes, Annotations, and Metabolic Models

Computation Institute, University of Chicago, Chicago, Illinois, United States of America
PLoS ONE (Impact Factor: 3.23). 10/2012; 7(10):e48053. DOI: 10.1371/journal.pone.0048053
Source: PubMed


The remarkable advance in sequencing technology and the rising interest in medical and environmental microbiology, biotechnology, and synthetic biology resulted in a deluge of published microbial genomes. Yet, genome annotation, comparison, and modeling remain a major bottleneck to the translation of sequence information into biological knowledge, hence computational analysis tools are continuously being developed for rapid genome annotation and interpretation. Among the earliest, most comprehensive resources for prokaryotic genome analysis, the SEED project, initiated in 2003 as an integration of genomic data and analysis tools, now contains >5,000 complete genomes, a constantly updated set of curated annotations embodied in a large and growing collection of encoded subsystems, a derived set of protein families, and hundreds of genome-scale metabolic models. Until recently, however, maintaining current copies of the SEED code and data at remote locations has been a pressing issue. To allow high-performance remote access to the SEED database, we developed the SEED Servers: four network-based servers intended to expose the data in the underlying relational database, support basic annotation services, offer programmatic access to the capabilities of the RAST annotation server, and provide access to a growing collection of metabolic models that support flux balance analysis. The SEED servers offer open access to regularly updated data, the ability to annotate prokaryotic genomes, the ability to create metabolic reconstructions and detailed models of metabolism, and access to hundreds of existing metabolic models. This work offers and supports a framework upon which other groups can build independent research efforts. Large integrations of genomic data represent one of the major intellectual resources driving research in biology, and programmatic access to the SEED data will provide significant utility to a broad collection of potential users.

Available from: Robert Olson, Jan 07, 2014
    • "Whole genome sequencing, assembly, annotation were performed as described previously (Chan et al., 2014). 16S rRNA gene sequence of C. neteri SSMD04 was searched using 'Genome Browser' function in RAST (Aziz et al., 2012) after automated annotation. Other 16S rRNA gene sequences of Cedecea."
    ABSTRACT: Cedecea neteri is a very rare human pathogen. We have isolated a strain of C. neteri, SSMD04, from pickled mackerel sashimi and identified it using molecular and phenotypic approaches. Using the biosensor Chromobacterium violaceum CV026, we have demonstrated the presence of short chain N-acyl-homoserine lactone (AHL) type quorum sensing (QS) activity in C. neteri SSMD04. Triple quadrupole LC/MS analysis revealed that C. neteri SSMD04 produced short chain N-butyryl-homoserine lactone (C4-HSL). With the available genome information of C. neteri SSMD04, we went on to analyse the genome and identified a pair of luxI/R homologues that share the highest similarity with croI/R homologues from Citrobacter rodentium. The AHL synthase, which we named cneI (636 bp), was found in the genome sequences of C. neteri SSMD04. At a distance of 8 bp from cneI is a sequence encoding a hypothetical protein, potentially the cognate receptor, a luxR homologue which we named cneR. Analysis of this protein's amino acid sequence reveals two signature domains, the autoinducer-binding domain and the C-terminal effector domain, which are typical characteristics of luxR. In addition, we found that this genome harboured an orphan luxR that is most closely related to easR in Enterobacter asburiae. To our knowledge, this is the first report on the AHL production activity in C. neteri, and the discovery of its luxI/R homologues, the orphan receptor and its whole genome sequence.
    PeerJ 09/2015; 3(Pt 3):e1216. DOI:10.7717/peerj.1216 · 2.11 Impact Factor
    • "an e-value cutoff of 1 × 10⁻³, which is approximately equivalent, for our database, to 30% similarity over 100 amino acids and 20% similarity over 500 amino acids. This custom database contains the entire NCBI non-redundant protein database, the PhAnToMe phage protein database (Aziz et al., 2012), the NCBI Viral RefSeq Database, and the Aedes aegypti, Anopheles gambiae, and Culex quinquefasciatus protein databases (Megy et al., 2012). The rationale for combining databases was to ensure that viral and phage proteins were present, while simultaneously reducing the false positive rate by the inclusion of all possible no"
    ABSTRACT: Mosquitoes, most often recognized for the microbial agents of disease they may carry, harbor diverse microbial communities that include viruses, bacteria, and fungi, collectively called the microbiota. The composition of the microbiota can directly and indirectly affect disease transmission through microbial interactions that could be revealed by its characterization in natural populations of mosquitoes. Furthermore, the use of shotgun metagenomic sequencing (SMS) approaches could allow the discovery of unknown members of the microbiota. In this study, we use RNA SMS to characterize the microbiota of seven individual mosquitoes (species include Culex pipiens, Culiseta incidens, and Ochlerotatus sierrensis) collected from a variety of habitats in California, USA. Sequencing was performed on the Illumina HiSeq platform and the resulting sequences were quality-checked and assembled into contigs using the A5 pipeline. Sequences related to single stranded RNA viruses of the Bunyaviridae and Rhabdoviridae were uncovered, along with an unclassified genus of double-stranded RNA viruses. Phylogenetic analysis finds that in all three cases, the closest relatives of the identified viral sequences are other mosquito-associated viruses, suggesting widespread host-group specificity among disparate viral taxa. Interestingly, we identified a Narnavirus of fungi, also reported elsewhere in mosquitoes, that potentially demonstrates a nested host-parasite association between virus, fungi, and mosquito. Sequences related to 8 bacterial families and 13 fungal families were found across the seven samples. Bacillus and Escherichia/Shigella were identified in all samples and Wolbachia was identified in all Cx. pipiens samples, while no single fungal genus was found in more than two samples. 
    This study exemplifies the utility of RNA SMS in the characterization of the natural microbiota of mosquitoes and, in particular, the value of identifying all microbes associated with a specific host.
    Frontiers in Microbiology 03/2015; 6:185. DOI:10.3389/fmicb.2015.00185 · 3.99 Impact Factor
    • "FOCUS requires a group of reference genomes to model and identify the organisms present in a metagenome. 2,766 complete genomes were downloaded from the SEED servers (Aziz et al., 2012) on 20 December 2013 (see Table S1). k-mer frequencies (k = 6–8, default: k = 7) were calculated for both strands using Jellyfish 1.1.6 "
    ABSTRACT: One of the major goals in metagenomics is to identify the organisms present in a microbial community from unannotated shotgun sequencing reads. Taxonomic profiling has valuable applications in biological and medical research, including disease diagnostics. Most currently available approaches do not scale well with increasing data volumes, which is important because both the number and lengths of the reads provided by sequencing platforms keep increasing. Here we introduce FOCUS, an agile composition-based approach using non-negative least squares (NNLS) to report the organisms present in metagenomic samples and profile their abundances. FOCUS was tested with simulated and real metagenomes, and the results show that our approach accurately predicts the organisms present in microbial communities. FOCUS was implemented in Python. The source code and web-server are freely available at
    PeerJ 06/2014; 2(4):e425. DOI:10.7717/peerj.425 · 2.11 Impact Factor