ABSTRACT: For many scientific applications, it is highly desirable to be able to compare metabolic models of closely related genomes. In this short report, we attempt to raise awareness of the fact that taking annotated genomes from public repositories and using them for metabolic model reconstructions is far from trivial because of annotation inconsistencies. We propose a protocol for comparative analysis of metabolic models on closely related genomes, using fifteen strains of the genus Brucella, which contains pathogens of both humans and livestock. This study led to the identification and subsequent correction of inconsistent annotations in the SEED database, as well as the identification of 31 biochemical reactions that are common to Brucella and that were not originally identified by automated metabolic reconstructions. We are currently implementing this protocol for improving automated annotations within the SEED database, and these improvements have been propagated into PATRIC, Model-SEED, KBase and RAST. This method is an enabling step for the future creation of consistent annotation systems and high-quality model reconstructions that will support the prediction of accurate phenotypes such as pathogenicity, media requirements or type of respiration.
ABSTRACT: The remarkable advance in sequencing technology and the rising interest in medical and environmental microbiology, biotechnology, and synthetic biology resulted in a deluge of published microbial genomes. Yet, genome annotation, comparison, and modeling remain a major bottleneck to the translation of sequence information into biological knowledge, hence computational analysis tools are continuously being developed for rapid genome annotation and interpretation. Among the earliest, most comprehensive resources for prokaryotic genome analysis, the SEED project, initiated in 2003 as an integration of genomic data and analysis tools, now contains >5,000 complete genomes, a constantly updated set of curated annotations embodied in a large and growing collection of encoded subsystems, a derived set of protein families, and hundreds of genome-scale metabolic models. Until recently, however, maintaining current copies of the SEED code and data at remote locations has been a pressing issue. To allow high-performance remote access to the SEED database, we developed the SEED Servers (http://www.theseed.org/servers): four network-based servers intended to expose the data in the underlying relational database, support basic annotation services, offer programmatic access to the capabilities of the RAST annotation server, and provide access to a growing collection of metabolic models that support flux balance analysis. The SEED servers offer open access to regularly updated data, the ability to annotate prokaryotic genomes, the ability to create metabolic reconstructions and detailed models of metabolism, and access to hundreds of existing metabolic models. This work offers and supports a framework upon which other groups can build independent research efforts.
Large integrations of genomic data represent one of the major intellectual resources driving research in biology, and programmatic access to the SEED data will provide significant utility to a broad collection of potential users.
PLoS ONE 01/2012; 7(10):e48053. · 3.53 Impact Factor
ABSTRACT: Many enzymes and other proteins are difficult subjects for bioinformatic analysis because they exhibit variant catalytic, structural, regulatory, and fusion mode features within a protein family whose sequences are not highly conserved. However, such features reflect dynamic and interesting scenarios of evolutionary importance. The value of experimental data obtained from individual organisms is instantly magnified to the extent that given features of the experimental organism can be projected upon related organisms. But how can one decide how far along the similarity scale it is reasonable to go before such inferences become doubtful? How can a credible picture of evolutionary events be deduced within the vertical trace of inheritance in combination with intervening events of lateral gene transfer (LGT)? We present a comprehensive analysis of a dehydrogenase protein family (TyrA) as a prototype example of how these goals can be accomplished through the use of cohesion group analysis. With this approach, the full collection of homologs is sorted into groups by a method that eliminates bias caused by an uneven representation of sequences from organisms whose phylogenetic spacing is not optimal. Each sufficiently populated cohesion group is phylogenetically coherent and defined by an overall congruence with a distinct section of the 16S rRNA gene tree. Exceptions that occasionally are found implicate a clearly defined LGT scenario whereby the recipient lineage is apparent and the donor lineage of the gene transferred is localized to those organisms that define the cohesion group. Systematic procedures to manage and organize otherwise overwhelming amounts of data are demonstrated.
Microbiology and molecular biology reviews: MMBR 04/2008; 72(1):13-53, table of contents. · 12.59 Impact Factor
ABSTRACT: The number of prokaryotic genome sequences becoming available is growing steadily and is growing faster than our ability to accurately annotate them.
We describe a fully automated service for annotating bacterial and archaeal genomes. The service identifies protein-encoding, rRNA and tRNA genes, assigns functions to the genes, predicts which subsystems are represented in the genome, uses this information to reconstruct the metabolic network and makes the output easily downloadable for the user. In addition, the annotated genome can be browsed in an environment that supports comparative analysis with the annotated genomes maintained in the SEED environment. The service normally makes the annotated genome available within 12-24 hours of submission, but ultimately the quality of such a service will be judged in terms of accuracy, consistency, and completeness of the produced annotations. We summarize our attempts to address these issues and discuss plans for incrementally enhancing the service.
By providing accurate, rapid annotation freely to the community we have created an important community resource. The service has now been utilized by over 120 external users annotating over 350 distinct genomes.
ABSTRACT: Despite its being a leading cause of nosocomial and community-acquired infections, surprisingly little is known about Staphylococcus aureus stress responses. In the current study, Affymetrix S. aureus GeneChips were used to define transcriptome changes in response to cold shock, heat shock, stringent, and SOS response-inducing conditions. Additionally, the RNA turnover properties of each response were measured. Each stress response induced distinct biological processes, subsets of virulence factors, and antibiotic determinants. The results were validated by real-time PCR and stress-mediated changes in antimicrobial agent susceptibility. Collectively, many S. aureus stress-responsive functions are conserved across bacteria, whereas others are unique to the organism. Sets of small stable RNA molecules with no open reading frames were also components of each response. Induction of the stringent, cold shock, and heat shock responses dramatically stabilized most mRNA species. Correlations between mRNA turnover properties and transcript titers suggest that S. aureus stress response-dependent alterations in transcript abundances can, in part, be attributed to alterations in RNA stability. This phenomenon was not observed within SOS-responsive cells.
Journal of Bacteriology 11/2006; 188(19):6739-56. · 3.19 Impact Factor
ABSTRACT: Bacterial pathogens regulate virulence factor expression at both the level of transcription initiation and mRNA processing/turnover. Within Staphylococcus aureus, virulence factor transcript synthesis is regulated by a number of two-component regulatory systems, the DNA binding protein SarA, and the SarA family of homologues. However, little is known about the factors that modulate mRNA stability or influence transcript degradation within the organism. As our entree to characterizing these processes, S. aureus GeneChips were used to simultaneously determine the mRNA half-lives of all transcripts produced during log-phase growth. It was found that the majority of log-phase transcripts (90%) have a short half-life (<5 min), whereas others are more stable, suggesting that cis- and/or trans-acting factors influence S. aureus mRNA stability. In support of this, it was found that two virulence factor transcripts, cna and spa, were stabilized in a sarA-dependent manner. These results were validated by complementation and real-time PCR and suggest that SarA may regulate target gene expression in a previously unrecognized manner by posttranscriptionally modulating mRNA turnover. Additionally, it was found that S. aureus produces a set of stable RNA molecules with no predicted open reading frame. Based on the importance of the S. aureus agr RNA molecule, RNAIII, and small stable RNA molecules within other pathogens, it is possible that these RNA molecules influence biological processes within the organism.
Journal of Bacteriology 04/2006; 188(7):2593-603. · 3.19 Impact Factor
ABSTRACT: The amount of genomic data available for study is increasing at a rate similar to that of Moore's law. This deluge of data is challenging bioinformaticians to develop newer, faster and better algorithms for analysis and examination of this data. The growing availability of large-scale computing grids coupled with high-performance networking is challenging computer scientists to develop better, faster methods of exploiting parallelism in these biological computations and deploying them across computing grids. In this paper, we describe two computations that must be run frequently and that require large amounts of computing resources to complete in a reasonable time. The data for these computations are very large, and the sequential computational time can exceed thousands of hours. We show the importance and relevance of these computations and the nature of the data and parallelism, and we show how we are meeting the challenge of efficiently distributing and managing these computations in the SEED project.
Challenges of Large Applications in Distributed Environments, 2005. CLADE 2005. Proceedings; 08/2005
ABSTRACT: The Access Grid creates collaborative scientific workspaces that challenge traditional desktop metaphors by integrating large-scale visualization displays and lab instruments.
IEEE Internet Computing 08/2003; 7(4):51-58. · 2.04 Impact Factor
ABSTRACT: While many issues in the area of virtual reality (VR) research have been addressed in recent years, the constant leaps forward in technology continue to push the field forward. VR research no longer is focused only on computer graphics, but instead has become even more interdisciplinary, combining the fields of networking, distributed computing, and even artificial intelligence. In this article we discuss some of the issues associated with distributed, collaborative virtual reality, as well as lessons learned during the development of two distributed virtual reality applications. 1 Introduction The Futures Laboratory at Argonne National Laboratory has been exploring what is needed to support large-scale shared space virtual environments (VE) for wide-area collaborations. Our research has focused on the system architecture, software design, and features needed to implement such environments. In this article we discuss two prototype systems under development at Argonne. Shared virtual spaces a...
ABSTRACT: The Futures Lab was founded within the Mathematics and Computer Science Division of Argonne National Laboratory in the fall of 1994. The goal of the lab is to develop new technology and systems to support collaborative science. In order to meet this goal, the lab is organized around three research areas: advanced networking, multimedia, and virtual environments. The Argonne Computing and Communications Infrastructure Futures Laboratory (Futures Lab) was created in 1994 to explore, develop, and prototype next-generation computing and communications infrastructure systems. An important goal of the Futures Lab project is to understand how to incorporate advanced display and media server systems into scientific computing environments. The objective is to create new collaborative environment technologies that combine advanced networking, virtual space technology, and high-end virtual environments to enable the construction of virtual teams for scientific research. We believe that digital medi...
ABSTRACT: Virtual reality has become an increasingly familiar part of the science of visualization and communication of information. This, combined with the increase in connectivity of remote sites via high-speed networks, allows for the development of a collaborative distributed virtual environment. Such an environment enables the development of supercomputer simulations with virtual reality visualizations that can be displayed at multiple sites, with each site interacting, viewing, and communicating about the results being discovered. The early results of an experimental collaborative virtual reality environment are discussed in this paper. The issues that need to be addressed in the implementation, as well as preliminary results, are covered. Also provided are a discussion of plans and a generalized application programmer's interface for CAVE to CAVE. 1 Introduction Sharing a visualization experience among remote virtual environments is a new area of research within the field ...
ABSTRACT: The Argonne Voyager Multimedia Server is being developed in the Futures Lab of the Mathematics and Computer Science Division at Argonne National Laboratory. As a network-based service for recording and playing multimedia streams, the Voyager system must be capable of sustaining certain minimal levels of performance in order to be viable. In this article, we examine the performance characteristics of the server. As we examine the architecture of the system, we try to determine where bottlenecks lie, show actual vs. potential performance, and recommend areas for improvement through custom architectures and system tuning.
Application-Specific Systems, Architectures and Processors, 1997. Proceedings., IEEE International Conference on; 08/1997
ABSTRACT: UbiWorld is a concept being developed by the Futures Laboratory group at Argonne National Laboratory that ties together the notion of ubiquitous computing (Ubicomp) with that of using virtual reality for rapid prototyping. The goal is to develop an environment where one can explore Ubicomp-type concepts without having to build real Ubicomp hardware. The basic notion is to extend object models in a virtual world by using distributed wide area heterogeneous computing technology to provide complex networking and processing capabilities to virtual reality objects. 1 Introduction In the Futures Laboratory in the Mathematics and Computer Science (MCS) Division at Argonne National Laboratory (ANL), our research agenda is driven partly by discussions of advanced computing scenarios. We find that by suspending disbelief momentarily and by engaging in serious discussion of such topics as off-planet infrastructure, green nomadic computing, and molecular nanotechnology, we are able to proje...
ABSTRACT: The use of virtual reality has been demonstrated to be very useful in the visualization of science and the synthesis of vast amounts of data. Further, advances in high-speed networks make feasible the coupling of remote supercomputer simulations and virtual environments. The critical performance issue to be addressed with this coupling is the end-to-end lag time (i.e., the delay between a user input and the result of the input). In this paper we quantify the lag times for four different applications. These applications range from coupling virtual environments with supercomputers to coupling multiple virtual environments for collaborative design. The lag times for each application are given in terms of rendering, network, simulation, tracking, and synchronization times. The results of the survey indicate that for some applications it is feasible, in terms of lag times, to couple supercomputers with virtual environments and to couple multiple virtual environments. For the applications ...
ABSTRACT: Virtual reality has become an increasingly familiar part of the science of visualization and communication of information. This, combined with the increase in connectivity of remote sites via high-speed networks, allows for the development of a collaborative distributed virtual environment. Such an environment enables the development of supercomputer simulations with virtual reality visualizations that can be displayed at multiple sites, with each site interacting, viewing, and communicating about the results being discovered. The CAVEComm library is a set of routines designed to generalize the communications between virtual environments and supercomputers.
ABSTRACT: When coupling supercomputer simulations to "virtual reality" for real-time interactive visualization, the critical performance metric is the end-to-end lag time in system response. Measuring the simulation, tracking, rendering, network, and synchronization components of lag time shows the feasibility of coupling supercomputers with virtual environments for some applications. For others, simulation time makes interactivity difficult. The article analyzes the components of lag for four applications that use virtual environments: Monte, a simple application to calculate π using a parallel Monte Carlo algorithm; Automotive Disk Brake, which uses a parallel finite element code to allow users to design and analyze an automotive disk braking system under different conditions; BoilerMaker, which lets users design and analyze the placement of pollution control system injectors in boilers and incinerators; and Calvin (Collaborative Architectural Layout Via Immersive Navigation), which allows people at different sites to work collaboratively on the design and viewing of architectural spaces.
IEEE Computational Science and Engineering 02/1996;
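For readers unfamiliar with the technique behind the Monte application mentioned in the abstract above, the following is a minimal sketch of estimating π with a Monte Carlo method whose sampling is split into independent chunks. It is not the code used in the article: the function names `count_hits` and `estimate_pi`, the worker count, and the per-worker seeding scheme are all assumptions made for illustration. A thread pool is used only to show how the work divides; in CPython, threads do not speed up CPU-bound loops, and a real parallel run would hand each chunk to a separate process or node.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def count_hits(n_samples, seed):
    """Count random points in the unit square that land inside the quarter circle."""
    rng = random.Random(seed)  # each worker gets its own seeded generator (assumption)
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(total_samples, n_workers=4):
    """Estimate pi by splitting the samples evenly across n_workers chunks."""
    per_worker = total_samples // n_workers
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Each chunk is independent, so partial counts can simply be summed.
        hits = sum(pool.map(count_hits, [per_worker] * n_workers, range(n_workers)))
    # The quarter circle covers pi/4 of the unit square.
    return 4.0 * hits / (per_worker * n_workers)


if __name__ == "__main__":
    print(estimate_pi(200_000))
```

Because each chunk depends only on its sample count and seed, the only communication needed is the final sum of hit counts, which is what makes this kind of computation easy to distribute.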