- ABSTRACT: DNA sequencing plays an increasingly important role in various fields of genetics, including the sequencing of whole genomes, libraries of cDNA clones and samples of metagenome communities. Sequencing technologies evolve continuously, and the emergence of ultrafast sequencing technologies has recently started a new era of DNA sequencing. Concurrently, the need for adapted bioinformatics tools arises. Since the ability to process current datasets efficiently is essential for modern genetics, a modular bioinformatics platform providing extensive sequence analysis methods is well suited to meet the constantly growing requirements. The Sequence Analysis and Management System (SAMS) is a bioinformatics software platform with a database backend designed to support the computational analysis of (1) whole genome shotgun (WGS) bacterial genome sequencing, (2) cDNA sequencing by reading expressed sequence tags (ESTs) and (3) sequence data obtained by ultrafast sequencing. It provides extensive bioinformatics analysis of sequenced single reads, sequencing libraries and fragments of arbitrary DNA sequences, such as assembled contigs of metagenome reads. The system has been implemented to cope with several thousand sequences, processing them efficiently and storing the results for further analysis. During project setup, SAMS automatically recognizes the data type.
- ABSTRACT: Random community genomes (metagenomes) are now commonly used to study microbes in different environments. Over the past few years, the major challenge associated with metagenomics has shifted from generating sequences to analyzing them. High-throughput, low-cost next-generation sequencing has given a wide range of researchers access to metagenomics. A high-throughput pipeline has been constructed to provide high-performance computing to all researchers interested in using metagenomics. The pipeline produces automated functional assignments of sequences in the metagenome by comparing them against both protein and nucleotide databases. Phylogenetic and functional summaries of the metagenomes are generated, and tools for comparative metagenomics are incorporated into the standard views. User access is controlled to ensure data privacy, but the collaborative environment underpinning the service provides a framework for sharing datasets between multiple users. In the metagenomics RAST, all users retain full control of their data, and everything is available for download in a variety of formats. The open-source metagenomics RAST service provides a new paradigm for the annotation and analysis of metagenomes. With built-in support for multiple data sources and a back end that houses abstract data types, the metagenomics RAST is stable, extensible, and freely available to all researchers. This service has removed one of the primary bottlenecks in metagenome sequence analysis: the availability of high-performance computing for annotating the data. http://metagenomics.nmpdr.org.
- ABSTRACT: The number of available prokaryotic genome sequences is growing steadily, and faster than our ability to annotate them accurately. We describe a fully automated service for annotating bacterial and archaeal genomes. The service identifies protein-encoding, rRNA and tRNA genes, assigns functions to the genes, predicts which subsystems are represented in the genome, uses this information to reconstruct the metabolic network and makes the output easily downloadable for the user. In addition, the annotated genome can be browsed in an environment that supports comparative analysis with the annotated genomes maintained in the SEED environment. The service normally makes the annotated genome available within 12-24 hours of submission, but ultimately the quality of such a service will be judged in terms of the accuracy, consistency, and completeness of the annotations it produces. We summarize our attempts to address these issues and discuss plans for incrementally enhancing the service. By providing accurate, rapid annotation freely to the community, we have created an important community resource. The service has now been used by over 120 external users annotating over 350 distinct genomes.
- ABSTRACT: The National Microbial Pathogen Data Resource (NMPDR) (http://www.nmpdr.org) is a National Institute of Allergy and Infectious Diseases (NIAID)-funded Bioinformatics Resource Center that supports research in selected Category B pathogens. NMPDR contains the complete genomes of ∼50 strains of pathogenic bacteria that are the focus of our curators, as well as >400 other genomes that provide a broad context for comparative analysis across the three phylogenetic domains. NMPDR integrates complete, public genomes with expertly curated biological subsystems to provide the most consistent genome annotations. Subsystems are sets of functional roles related by a biologically meaningful organizing principle and are built over large collections of genomes; they provide researchers with consistent functional assignments in a biologically structured context. Investigators can browse subsystems and reactions to develop accurate reconstructions of the metabolic networks of any sequenced organism. NMPDR provides a comprehensive bioinformatics platform, with tools and viewers for genome analysis. Results of precomputed gene clustering analyses can be retrieved in tabular or graphic format with one-click tools. NMPDR tools include Signature Genes, which finds the sets of genes that are shared by, or that differentiate, two groups of organisms. Essentiality data collated from genome-wide studies have been curated. Drug target identification and high-throughput, in silico compound screening are in development.
- ABSTRACT: A 70mer oligonucleotide microarray was constructed to analyze genome-wide expression profiles of Corynebacterium jeikeium, a skin bacterium that is predominantly present in the human axilla and involved in axillary odor formation. Oligonucleotides representing 100% of the predicted coding regions of the C. jeikeium K411 genome were designed and spotted in quadruplicate onto epoxy-coated glass slides. The quality of the printed microarray was demonstrated by co-hybridization with fluorescently labeled cDNA probes obtained from exponentially growing C. jeikeium cultures. Accordingly, genes detected with different intensities resulting in log2-transformed ratios greater than 0.8 or smaller than -0.8 can be regarded as differentially expressed with a confidence level greater than 99%. In an application example, we measured global changes of gene expression during growth of C. jeikeium in the presence of different concentrations of the deodorant component 4-hydroxy-3-methoxybenzyl alcohol, which is active in preventing body odor formation. Global expression profiling revealed that low concentrations of 4-hydroxy-3-methoxybenzyl alcohol (0.5 and 2.5 mg/ml) had almost no detectable effect on the transcriptome of C. jeikeium. A slightly higher concentration of 4-hydroxy-3-methoxybenzyl alcohol (5 mg/ml) resulted in differential expression of 95 genes, 86 of which showed enhanced expression when compared to a control culture. Besides many genes encoding proteins that apparently participate in transcription and translation, the drug resistance determinant cmx and the predicted virulence factors sapA and sapD showed significantly enhanced expression levels. Differential expression of relevant genes was validated by real-time reverse transcription PCR assays.
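
The cutoff quoted in the microarray abstract above (log2 ratios greater than 0.8 or smaller than -0.8) can be illustrated with a minimal Python sketch. It is not the authors' analysis pipeline: the function names, gene labels and intensity values are hypothetical, and the example assumes a single treated/control intensity pair per gene with no normalization or replicate handling.

```python
import math

# Cutoff taken from the abstract; everything else below is illustrative.
LOG2_CUTOFF = 0.8


def log2_ratio(treated: float, control: float) -> float:
    """Return the log2-transformed intensity ratio (treated / control)."""
    return math.log2(treated / control)


def classify(treated: float, control: float, cutoff: float = LOG2_CUTOFF) -> str:
    """Label a gene as 'up', 'down' or 'unchanged' by its log2 ratio."""
    m = log2_ratio(treated, control)
    if m > cutoff:
        return "up"
    if m < -cutoff:
        return "down"
    return "unchanged"


# Hypothetical spot intensities: gene label -> (treated, control).
spots = {
    "geneA": (2000.0, 1000.0),  # ratio 2.0, log2 = 1.0
    "geneB": (1000.0, 1000.0),  # ratio 1.0, log2 = 0.0
    "geneC": (400.0, 1000.0),   # ratio 0.4, log2 is about -1.32
}

calls = {gene: classify(t, c) for gene, (t, c) in spots.items()}
for gene, call in calls.items():
    print(gene, call)
```

A symmetric cutoff on the log2 scale treats a fold increase and the equivalent fold decrease alike; a log2 ratio of 0.8 corresponds to roughly a 1.74-fold change.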