-
ABSTRACT: Viruses are ubiquitous in the oceans and critical components of marine microbial communities, regulating nutrient transfer to higher trophic levels or to the dissolved organic pool through lysis of host cells. Hydrothermal vent systems are oases of biological activity in the deep oceans, for which knowledge of biodiversity and its impact on global ocean biogeochemical cycling is still in its infancy. In order to gain biological insight into viral communities present in hydrothermal vent systems, we developed a method based on deep-sequencing of pulsed field gel electrophoretic bands representing key viral fractions present in seawater within and surrounding a hydrothermal plume derived from Loki's Castle vent field at the Arctic Mid-Ocean Ridge. The reduction in virus community complexity afforded by this novel approach enabled the near-complete reconstruction of a lambda-like phage genome from the virus fraction of the plume. Phylogenetic examination of distinct gene regions in this lambdoid phage genome unveiled diversity at loci encoding superinfection exclusion- and integrase-like proteins. This suggests the importance of fine-tuning lysogenic conversion as a viral survival strategy, and provides insights into the nature of host-virus and virus-virus interactions within hydrothermal plumes. By reducing the complexity of the viral community through targeted sequencing of prominent dsDNA viral fractions, this method has selectively mimicked virus dominance approaching that hitherto achieved only through culturing, thus enabling bioinformatic analysis to locate a lambdoid viral "needle" within the greater viral community "haystack". Such targeted analyses have great potential for accelerating the extraction of biological knowledge from diverse and poorly understood environmental viral communities.
PLoS ONE 01/2012; 7(4):e34238. · 4.09 Impact Factor
-
Virginie Mittard-Runte,
Thomas Bekel,
Jochen Blom,
Michael Dondrup,
Kolja Henckel,
Sebastian Jaenicke,
Lutz Krause,
Burkhard Linke,
Heiko Neuweger,
Susanne Schneiker-Bekel,
Alexander Goesmann
ABSTRACT: In recent years, modern high-throughput techniques in genome and post-genome research have made a marked impact on the marine
sciences. Today, massively parallel DNA sequencing and hybridization approaches allow the identification of not only the gene
repertoire but also the gene regulatory networks that function within an organism. The huge amounts of data acquired from
such experiments can only be handled with intensive bioinformatics support that has to provide an adequate infrastructure
for storing and analysing these data. Bioinformatics has to deliver efficient data analysis algorithms, user-friendly tools
and software applications, as well as extensive hardware infrastructure to deal with these genome-scale analyses.
The following chapter briefly introduces not only the most relevant topics of bioinformatics for functional and structural
genomics but also addresses the practical aspects of other steps of a genome project such as sequencing or data management
issues. The chapter will take the reader through the different technical approaches that can be applied in marine genomics
projects.
In the first part, we will mainly focus on data generation, introducing classical genome sequencing approaches such as the
Sanger method and the shotgun technique. Moreover, a short overview of the current status of the next generation of sequencing
techniques will be given. In the second part, we briefly introduce the concept of data management for bioinformatics applications.
In the third part, we describe the basic principles of genome sequence analysis and address topics like EST clustering and
genome assembly, gene prediction, gene function assignment and classification as well as whole genome annotation. In the fourth
part of this chapter, we present an overview of transcriptome data analysis using microarray hybridization technology. After
a brief introduction to microarray technology we describe state-of-the-art methods for image processing, data normalization,
significance testing and cluster analysis.
04/2010: pages 315-378;
-
BMC Systems Biology 09/2009; · 3.15 Impact Factor
-
ABSTRACT: Microarray analysis has become a popular and routine method in functional genomics. It is typical for such experiments to involve a small number of replicates, which causes unreliable estimates of the sample variance. Microarrays have fostered the development of new statistical methods to analyze data resulting from experiments with small sample sizes. In this study, we tackle the problem of evaluating the performance of statistical tests for generating ranked gene lists from two-channel direct comparisons. We propose an evaluation method based on an oligonucleotide microarray with a large number of replicate spots yielding a maximum of 400 replicates per gene. We apply Spearman's rank correlation coefficient to ranked gene lists generated by eight widely used microarray-specific test statistics, which are applied to small random samples. We show that variance-stabilizing methods such as Cyber-T, SAM, and LIMMA can be beneficial for very small sample sizes and that SAM and the t-test provide stronger control of the type I error rate than the other methods. Specifically, we report that for four replicates all methods reach a high to very high correlation with our reference standard.
Journal of biotechnology 04/2009; 140(1-2):18-26. · 2.88 Impact Factor
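The evaluation idea in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes simulated log-ratios, a plain one-sample t statistic standing in for the eight test statistics, and a reference ranking computed from all replicates; the gene count, the random seed and all function names are hypothetical.

```python
import random
import statistics


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = statistics.fmean(rx), statistics.fmean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def t_statistic(sample):
    """One-sample t statistic against zero (assumed stand-in test)."""
    n = len(sample)
    return statistics.fmean(sample) / (statistics.stdev(sample) / n ** 0.5)


rng = random.Random(0)            # fixed seed, assumption for repeatability
n_genes, n_replicates, n_small = 50, 400, 4

# Simulated log-ratios: one row of replicate spots per hypothetical gene.
true_effects = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
data = [[rng.gauss(mu, 1.0) for _ in range(n_replicates)] for mu in true_effects]

# Reference ranking from all replicates versus a ranking from a small sample.
reference = [abs(t_statistic(gene)) for gene in data]
small = [abs(t_statistic(rng.sample(gene, n_small))) for gene in data]

rho = spearman(reference, small)
print(f"Spearman correlation with the reference ranking: {rho:.2f}")
```

Repeating the subsampling step many times and averaging rho for each sample size would give a curve of agreement against replicate number, which is the kind of comparison the abstract describes.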
-
ABSTRACT: Databases for either sequence, annotation, or microarray experiments data are extremely beneficial to the research community, as they centrally gather information from experiments performed by different scientists. However, data from different sources develop their full capacities only when combined. The idea of a data warehouse directly addresses this problem and solves it by integrating all required data into one single database - hence there are already many data warehouses available to genetics. For the model legume Medicago truncatula, there is currently no such single data warehouse that integrates all freely available gene sequences, the corresponding gene expression data, and annotation information. Thus, we created the data warehouse TRUNCATULIX, an integrative database of Medicago truncatula sequence and expression data.
The TRUNCATULIX data warehouse integrates five public databases for gene sequences, and gene annotations, as well as a database for microarray expression data covering raw data, normalized datasets, and complete expression profiling experiments. It can be accessed via an AJAX-based web interface using a standard web browser. For the first time, users can now quickly search for specific genes and gene expression data in a huge database based on high-quality annotations. The results can be exported as Excel, HTML, or as csv files for further usage.
The integration of sequence, annotation, and gene expression data from several Medicago truncatula databases in TRUNCATULIX provides the legume community with access to data and data mining capability not previously available. TRUNCATULIX is freely available at http://www.cebitec.uni-bielefeld.de/truncatulix/.
BMC Plant Biology 03/2009; 9:19. · 3.45 Impact Factor
-
Michael Dondrup,
Stefan P Albaum,
Thasso Griebel,
Kolja Henckel,
Sebastian Jünemann,
Tim Kahlke,
Christiane K Kleindt,
Helge Küster,
Burkhard Linke,
Dominik Mertens,
Virginie Mittard-Runte,
Heiko Neuweger,
Kai J Runte,
Andreas Tauch,
Felix Tille,
Alfred Pühler,
Alexander Goesmann
ABSTRACT: Understanding transcriptional regulation by genome-wide microarray studies can contribute to unravel complex relationships between genes. Attempts to standardize the annotation of microarray data include the Minimum Information About a Microarray Experiment (MIAME) recommendations, the MAGE-ML format for data interchange, and the use of controlled vocabularies or ontologies. The existing software systems for microarray data analysis implement the mentioned standards only partially and are often hard to use and extend. Integration of genomic annotation data and other sources of external knowledge using open standards is therefore a key requirement for future integrated analysis systems.
The EMMA 2 software has been designed to resolve shortcomings with respect to full MAGE-ML and ontology support and makes use of modern data integration techniques. We present a software system that features comprehensive data analysis functions for spotted arrays, and for the most common synthesized oligo arrays such as Agilent, Affymetrix and NimbleGen. The system is based on the full MAGE object model. Analysis functionality is based on R and Bioconductor packages and can make use of a compute cluster for distributed services.
Our model-driven approach for automatically implementing a full MAGE object model provides high flexibility and compatibility. Data integration via SOAP-based web-services is advantageous in a distributed client-server environment as the collaborative analysis of microarray data is gaining more and more relevance in international research consortia. The adequacy of the EMMA 2 software design and implementation has been proven by its application in many distributed functional genomics projects. Its scalability makes the current architecture suited for extensions towards future transcriptomics methods based on high-throughput sequencing approaches which have much higher computational requirements than microarrays.
BMC Bioinformatics 03/2009; 10:50. · 2.75 Impact Factor
-
Anke Becker,
Melanie J Barnett,
Delphine Capela,
Michael Dondrup,
Paul-Bertram Kamp,
Elizaveta Krol,
Burkhard Linke,
Silvia Rüberg,
Kai Runte,
Brenda K Schroeder,
Stefan Weidner,
Svetlana N Yurgel,
Jacques Batut,
Sharon R Long,
Alfred Pühler,
Alexander Goesmann
ABSTRACT: Sinorhizobium meliloti is a symbiotic soil bacterium of the alphaproteobacterial subdivision. Like other rhizobia, S. meliloti induces nitrogen-fixing root nodules on leguminous plants. This is an ecologically and economically important interaction, because plants engaged in symbiosis with rhizobia can grow without exogenous nitrogen fertilizers. The S. meliloti-Medicago truncatula (barrel medic) association is an important symbiosis model. The S. meliloti genome was published in 2001, and the M. truncatula genome currently is being sequenced. Many new resources and data have been made available since the original S. meliloti genome annotation and an update was needed. In June 2008, we submitted our annotation update to the EMBL and NCBI databases. Here we describe this new annotation and a new web-based portal RhizoGATE. About 1000 annotation updates were made; these included assigning functions to 313 putative proteins, assigning EC numbers to 431 proteins, and identifying 86 new putative genes. RhizoGATE incorporates the new annotation with the S. meliloti GenDB project, a platform that allows annotation updates in real time. Locations of transposon insertions, plasmid integrations, and array probe sequences are available in the GenDB project. RhizoGATE employs the EMMA platform for management and analysis of transcriptome data and the IGetDB data warehouse to integrate a variety of heterogeneous external data sources.
Journal of biotechnology 01/2009; 140(1-2):45-50. · 2.88 Impact Factor
-
ABSTRACT: The recent advances in metabolomics have created the potential to measure the levels of hundreds of metabolites which are the end products of cellular regulatory processes. The automation of the sample acquisition and subsequent analysis in high-throughput instruments that are capable of measuring metabolites is posing a challenge on the necessary systematic storage and computational processing of the experimental datasets. Whereas a multitude of specialized software systems for individual instruments and preprocessing methods exists, there is clearly a need for a free and platform-independent system that allows the standardized and integrated storage and analysis of data obtained from metabolomics experiments. Currently there exists no such system that on the one hand supports preprocessing of raw datasets but also allows to visualize and integrate the results of higher level statistical analyses within a functional genomics context.
To facilitate the systematic storage, analysis and integration of metabolomics experiments, we have implemented MeltDB, a web-based software platform for the analysis and annotation of datasets from metabolomics experiments. MeltDB supports open file formats (netCDF, mzXML, mzDATA) and facilitates the integration and evaluation of existing preprocessing methods. The system provides researchers with means to consistently describe and store their experimental datasets. Comprehensive analysis and visualization features of metabolomics datasets are offered to the community through a web-based user interface. The system covers the process from raw data to the visualization of results in a knowledge-based background and is integrated into the context of existing software platforms of genomics and transcriptomics at Bielefeld University. We demonstrate the potential of MeltDB by means of a sample experiment where we dissect the influence of three different carbon sources on the gram-negative bacterium Xanthomonas campestris pv. campestris on the level of measured metabolites. Experimental data are stored, analyzed and annotated within MeltDB and accessible via the public MeltDB web server.
The system is publicly available at http://meltdb.cebitec.uni-bielefeld.de.
Bioinformatics 10/2008; 24(23):2726-32. · 5.47 Impact Factor
-
Andreas Schlüter,
Thomas Bekel,
Naryttza N Diaz,
Michael Dondrup,
Rudolf Eichenlaub,
Karl-Heinz Gartemann,
Irene Krahn,
Lutz Krause,
Holger Krömeke,
Olaf Kruse,
Jan H Mussgnug,
Heiko Neuweger,
Karsten Niehaus,
Alfred Pühler,
Kai J Runte,
Rafael Szczepanowski,
Andreas Tauch,
Alexandra Tilker,
Prisca Viehöver,
Alexander Goesmann
ABSTRACT: Composition and gene content of a biogas-producing microbial community from a production-scale biogas plant fed with renewable primary products was analysed by means of a metagenomic approach applying the ultrafast 454-pyrosequencing technology. Sequencing of isolated total community DNA on a Genome Sequencer FLX System resulted in 616,072 reads with an average read length of 230 bases accounting for 141,664,289 bases sequence information. Assignment of obtained single reads to COG (Clusters of Orthologous Groups of proteins) categories revealed a genetic profile characteristic for an anaerobic microbial consortium conducting fermentative metabolic pathways. Assembly of single reads resulted in the formation of 8752 contigs larger than 500 bases in size. Contigs longer than 10kb mainly encode house-keeping proteins, e.g. DNA polymerase, recombinase, DNA ligase, sigma factor RpoD and genes involved in sugar and amino acid metabolism. A significant portion of contigs was allocated to the genome sequence of the archaeal methanogen Methanoculleus marisnigri JR1. Mapping of single reads to the M. marisnigri JR1 genome revealed that approximately 64% of the reference genome including methanogenesis gene regions are deeply covered. These results suggest that species related to those of the genus Methanoculleus play a dominant role in methanogenesis in the analysed fermentation sample. Moreover, assignment of numerous contig sequences to clostridial genomes including gene regions for cellulolytic functions indicates that clostridia are important for hydrolysis of cellulosic plant biomass in the biogas fermenter under study. Metagenome sequence data from a biogas-producing microbial community residing in a fermenter of a biogas plant provide the basis for a rational approach to improve the biotechnological process of biogas production.
Journal of Biotechnology 06/2008; 136(1-2):77-90. · 3.05 Impact Factor
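A quick arithmetic check of the read statistics quoted in the abstract above, written as a short Python sketch. Only the two figures from the abstract are used; the variable names are illustrative.

```python
# Figures quoted in the abstract.
reads = 616_072            # number of single reads
total_bases = 141_664_289  # total sequence information in bases

# Mean read length implied by the two figures.
mean_read_length = total_bases / reads
print(f"Mean read length: {mean_read_length:.2f} bases")

# The abstract states an average of 230 bases, which matches after rounding.
rounded_length = round(mean_read_length)
```

The implied mean falls just under 230 bases, so the three numbers in the abstract are consistent with one another.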
-
Helge Küster,
Anke Becker,
Christian Firnhaber,
Natalija Hohnjec,
Katja Manthey,
Andreas M Perlick,
Thomas Bekel,
Michael Dondrup,
Kolja Henckel,
Alexander Goesmann,
Folker Meyer,
Daniel Wipf,
Natalia Requena,
Ulrich Hildebrandt,
Rüdiger Hampp,
Uwe Nehls,
Franziska Krajinski,
Philipp Franken,
Alfred Pühler
ABSTRACT: The great majority of terrestrial plants enters a beneficial arbuscular mycorrhiza (AM) or ectomycorrhiza (ECM) symbiosis with soil fungi. In the SPP 1084 "MolMyk: Molecular Basics of Mycorrhizal Symbioses", high-throughput EST-sequencing was performed to obtain snapshots of the plant and fungal transcriptome in mycorrhizal roots and in extraradical hyphae. To focus activities, the interactions between Medicago truncatula and Glomus intraradices as well as Populus tremula and Amanita muscaria were selected as models for AM and ECM symbioses, respectively. Together, almost 20,000 expressed sequence tags (ESTs) were generated from different random and suppressive subtractive hybridization (SSH) cDNA libraries, providing a comprehensive overview of the mycorrhizal transcriptome. To automatically cluster and annotate EST-sequences, the BioMake and SAMS software tools were developed. In connection with the eNorthern software SteN, plant genes with a predicted mycorrhiza-induced expression were identified. To support experimental transcriptome profiling, macro- and microarray tools have been constructed for the two model mycorrhizae, based either on PCR-amplified cDNAs or 70mer oligonucleotides. These arrays were used to profile the transcriptome of AM and ECM roots under different conditions, and the data obtained were uploaded to the ArrayLIMS and EMMA databases that are designed to store and evaluate expression profiles from DNA arrays. Together, the EST- and transcriptome databases can be mined to identify candidate genes for targeted functional studies.
Phytochemistry 02/2007; 68(1):19-32. · 3.35 Impact Factor
-
Heiko Neuweger,
Jan Baumbach,
Stefan Albaum,
Thomas Bekel,
Michael Dondrup,
Andrea T Hüser,
Jörn Kalinowski,
Sebastian Oehm,
Alfred Pühler,
Sven Rahmann,
Jochen Weile,
Alexander Goesmann
ABSTRACT: The introduction of high-throughput genome sequencing and post-genome analysis technologies, e.g. DNA microarray approaches, has created the potential to unravel and scrutinize complex gene-regulatory networks on a large scale. The discovery of transcriptional regulatory interactions has become a major topic in modern functional genomics.
To facilitate the analysis of gene-regulatory networks, we have developed CoryneCenter, a web-based resource for the systematic integration and analysis of genome, transcriptome, and gene regulatory information for prokaryotes, especially corynebacteria. For this purpose, we extended and combined the following systems into a common platform: (1) GenDB, an open source genome annotation system, (2) EMMA, a MAGE compliant application for high-throughput transcriptome data storage and analysis, and (3) CoryneRegNet, an ontology-based data warehouse designed to facilitate the reconstruction and analysis of gene regulatory interactions. We demonstrate the potential of CoryneCenter by means of an application example. Using microarray hybridization data, we compare the gene expression of Corynebacterium glutamicum under acetate and glucose feeding conditions: Known regulatory networks are confirmed, but moreover CoryneCenter points out additional regulatory interactions.
CoryneCenter provides more than the sum of its parts. Its novel analysis and visualization features significantly simplify the process of obtaining new biological insights into complex regulatory systems. Although the platform currently focusses on corynebacteria, the integrated tools are by no means restricted to these species, and the presented approach offers a general strategy for the analysis and verification of gene regulatory networks. CoryneCenter provides freely accessible projects with the underlying genome annotation, gene expression, and gene regulation data. The system is publicly available at http://www.CoryneCenter.de.
BMC Systems Biology 02/2007; 1:55. · 3.15 Impact Factor
-
Helge Küster,
Natalija Hohnjec,
Franziska Krajinski,
Fikri El Yahyaoui,
Katja Manthey,
Jérôme Gouzy,
Michael Dondrup,
Folker Meyer,
Jörn Kalinowski,
Laurent Brechenmacher,
Diederik van Tuinen,
Vivienne Gianinazzi-Pearson,
Alfred Pühler,
Pascal Gamas,
Anke Becker
ABSTRACT: To construct macro- and microarray tools suitable for expression profiling in root endosymbioses of the model legume Medicago truncatula, we PCR-amplified a total of 6048 cDNA probes representing genes expressed in uninfected roots, mycorrhizal roots and young root nodules [Nucleic Acids Res. 30 (2002) 5579]. Including additional probes for either tissue-specific or constitutively expressed control genes, 5651 successfully amplified gene-specific probes were used to grid macro- and to spot microarrays designated Mt6k-RIT (M. truncatula 6k root interaction transcriptome). Subsequent to a technical validation of microarray printing, we performed two pilot expression profiling experiments using Cy-labeled targets from Sinorhizobium meliloti-induced root nodules and Glomus intraradices-colonized arbuscular mycorrhizal roots. These targets detected marker genes for nodule and arbuscular mycorrhiza development, amongst them different nodule-specific leghemoglobin and nodulin genes as well as a mycorrhiza-specific phosphate transporter gene. In addition, we identified several dozens of genes that have so far not been reported to be differentially expressed in nodules or arbuscular mycorrhiza thus demonstrating that Mt6k-RIT arrays serve as useful tools for an identification of genes relevant for legume root endosymbioses. A comprehensive profiling of such candidate genes will be very helpful to the development of breeding strategies and for the improvement of cultivation management targeted at increasing legume use in sustainable agricultural systems.
Journal of Biotechnology 04/2004; 108(2):95-113. · 3.05 Impact Factor
-
Andreas Wilke,
Christian Rückert,
Daniela Bartels,
Michael Dondrup,
Alexander Goesmann,
Andrea T Hüser,
Sebastian Kespohl,
Burkhard Linke,
Martina Mahne,
Alice McHardy,
Alfred Pühler,
Folker Meyer
ABSTRACT: In the "post-genome" era, mass spectrometry (MS) has become an important method for the analysis of proteome data. The rapid advancement of this technique in combination with other methods used in proteomics results in an increasing number of high-throughput projects. This leads to an increasing amount of data that needs to be archived and analyzed. To cope with the need for automated data conversion, storage, and analysis in the field of proteomics, the open source system ProDB was developed. The system handles data conversion from different mass spectrometer software, automates data analysis, and allows the annotation of MS spectra (e.g. assign gene names, store data on protein modifications). The system is based on an extensible relational database to store the mass spectra together with the experimental setup. It also provides a graphical user interface (GUI) for managing the experimental steps which led to the MS data. Furthermore, it allows the integration of genome and proteome data. Data from an ongoing experiment was used to compare manual and automated analysis. First tests showed that the automation resulted in a significant saving of time. Furthermore, the quality and interpretability of the results was improved in all cases.
Journal of Biotechnology 01/2004; 106(2-3):147-56. · 3.05 Impact Factor
-
ABSTRACT: The flood of data acquired from the increasing number of publicly available genomes has led to new demands for bioinformatics software. With the growing amount of information resulting from high throughput experiments new questions arise that often focus on the comparison of genes, genomes, and their expression profiles. Inferring new knowledge by combining different kinds of "post-genomics" data obviously necessitates the development of new approaches that allow the integration of variable data sources into a flexible framework. In this paper, we describe our concept for the integration of heterogeneous data into a platform for systems biology. We have implemented a Bioinformatics Resource for the Integration of heterogeneous Data from Genomic Explorations (BRIDGE) and illustrate the usability of our approach as a platform for systems biology for two sample applications.
Journal of Biotechnology 01/2004; 106(2-3):157-67. · 3.05 Impact Factor
-
ABSTRACT: As a high throughput technique, microarray experiments produce large data sets, consisting of measured data, laboratory protocols, and experimental settings. We have implemented the open source platform EMMA to store and analyze these data. The system provides automated pipelines for data processing and has a modular architecture that can be easily extended. EMMA features detailed reports about spots and their corresponding measurements. In addition to routine data analysis algorithms, the system can be integrated with other components that contain additional data sources (e.g. genome annotation systems).
Journal of Biotechnology 01/2004; 106(2-3):135-46. · 3.05 Impact Factor
-
ABSTRACT: A DNA microarray was developed to analyse global gene expression of the amino acid-producing bacterium Corynebacterium glutamicum. PCR products representing 93.4% of the predicted C. glutamicum genes were prepared and spotted in quadruplicate onto 3-aminopropyltrimethoxysilane-coated glass slides. The applicability of the C. glutamicum DNA microarray was demonstrated by co-hybridisation with fluorescently labelled cDNA probes. Analysis of the technical variance revealed that C. glutamicum genes detected with different intensities resulting in ratios greater than 1.52 or smaller than -1.52 can be regarded as differentially expressed with a confidence level of greater than 95%. In a validation example, we measured changes of the mRNA levels during growth of C. glutamicum with acetate and propionate as carbon sources. Acetate-grown C. glutamicum cultures were used as reference. At the 95% confidence interval, 117 genes revealed increased transcript levels in the presence of propionate, while 43 genes showed a decreased expression compared with the acetate-grown culture. Global expression profiling confirmed the induction of the prpD2B2C2 gene cluster already known to be essential for propionate degradation via the 2-methylcitrate cycle. Besides many genes of unknown function, the paralogous prpD1B1C1 gene cluster as well as fasI-B (encoding fatty-acid synthase IB), dtsR1 and dtsR2 (components of acyl-CoA carboxylases), gluABCD (glutamate transport system), putP (proline transport system), and pyc (pyruvate carboxylase) showed significantly increased expression levels. Differential expression of these genes was confirmed by real-time reverse transcription (RT) PCR assays.
Journal of Biotechnology 01/2004; 106(2-3):269-86. · 3.05 Impact Factor