ABSTRACT: The phylogenetic structure of the microbial community residing in a fermentation sample from a production-scale biogas plant fed with maize silage, green rye and liquid manure was analysed by an integrated approach using clone library sequences and metagenome sequence data obtained by 454-pyrosequencing. Sequencing of 109 clones from a bacterial and an archaeal 16S-rDNA amplicon library revealed that the obtained nucleotide sequences are similar but not identical to 16S-rDNA database sequences derived from different anaerobic environments including digestors and bioreactors. Most of the bacterial 16S-rDNA sequences could be assigned to the phylum Firmicutes with the most abundant class Clostridia and to the class Bacteroidetes, whereas most archaeal 16S-rDNA sequences cluster close to the methanogen Methanoculleus bourgensis. Further sequences of the archaeal library most probably represent so far non-characterised species within the genus Methanoculleus. A similar result was derived from phylogenetic analysis of mcrA clone sequences. The mcrA gene encodes the alpha-subunit of methyl-coenzyme-M reductase involved in the final step of methanogenesis. BLASTn analysis applying stringent settings resulted in assignment of 16S-rDNA metagenome sequence reads to 62 16S-rDNA amplicon sequences, thus enabling frequency of abundance estimations for 16S-rDNA clone library sequences. Ribosomal Database Project (RDP) Classifier processing of metagenome 16S-rDNA reads revealed abundance of the phyla Firmicutes, Bacteroidetes and Euryarchaeota and the orders Clostridiales, Bacteroidales and Methanomicrobiales. Moreover, a large fraction of 16S-rDNA metagenome reads could not be assigned to lower taxonomic ranks, demonstrating that numerous microorganisms in the analysed fermentation sample of the biogas plant are still unclassified or unknown.
Journal of Biotechnology 07/2009; 142(1):38-49. DOI:10.1016/j.jbiotec.2009.02.010 · 2.88 Impact Factor
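The abstract above describes assigning metagenome 16S-rDNA reads to amplicon sequences by stringent BLASTn and using the assignments to estimate abundance. A minimal sketch of that counting step is given here, assuming BLAST tabular output (`-outfmt 6`). The function name, the identity and alignment-length thresholds, and the example hits are hypothetical illustrations and are not the settings used in the study.

```python
from collections import Counter


def count_reads_per_amplicon(blast_lines, min_identity=98.0, min_length=100):
    """Count reads per amplicon from BLAST tabular (-outfmt 6) lines.

    The thresholds are hypothetical placeholders for "stringent settings".
    Each read is counted once, for its highest-bitscore passing hit.
    """
    best = {}  # read id -> (amplicon id, bitscore)
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        read_id, amplicon_id = fields[0], fields[1]
        identity = float(fields[2])   # column 3: percent identity
        length = int(fields[3])       # column 4: alignment length
        bitscore = float(fields[11])  # column 12: bit score
        if identity < min_identity or length < min_length:
            continue
        if read_id not in best or bitscore > best[read_id][1]:
            best[read_id] = (amplicon_id, bitscore)
    return Counter(amplicon for amplicon, _ in best.values())


# Made-up example hits (not data from the study).
example = [
    "read1\tcloneA\t99.5\t220\t1\t0\t1\t220\t10\t229\t1e-110\t400",
    "read1\tcloneB\t98.2\t210\t4\t0\t1\t210\t10\t219\t1e-100\t370",
    "read2\tcloneA\t100.0\t180\t0\t0\t1\t180\t50\t229\t1e-90\t330",
    "read3\tcloneB\t91.0\t200\t18\t0\t1\t200\t5\t204\t1e-60\t250",
]
counts = count_reads_per_amplicon(example)
print(counts)
```

Counting only the best passing hit per read keeps a read that matches several similar clones from inflating more than one abundance estimate; other tie-breaking rules are equally plausible.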
ABSTRACT: Sinorhizobium meliloti is a symbiotic soil bacterium of the alphaproteobacterial subdivision. Like other rhizobia, S. meliloti induces nitrogen-fixing root nodules on leguminous plants. This is an ecologically and economically important interaction, because plants engaged in symbiosis with rhizobia can grow without exogenous nitrogen fertilizers. The S. meliloti-Medicago truncatula (barrel medic) association is an important symbiosis model. The S. meliloti genome was published in 2001, and the M. truncatula genome currently is being sequenced. Many new resources and data have been made available since the original S. meliloti genome annotation and an update was needed. In June 2008, we submitted our annotation update to the EMBL and NCBI databases. Here we describe this new annotation and a new web-based portal RhizoGATE. About 1000 annotation updates were made; these included assigning functions to 313 putative proteins, assigning EC numbers to 431 proteins, and identifying 86 new putative genes. RhizoGATE incorporates the new annotation with the S. meliloti GenDB project, a platform that allows annotation updates in real time. Locations of transposon insertions, plasmid integrations, and array probe sequences are available in the GenDB project. RhizoGATE employs the EMMA platform for management and analysis of transcriptome data and the IGetDB data warehouse to integrate a variety of heterogeneous external data sources.
Journal of Biotechnology 03/2009; 140(1-2):45-50. DOI:10.1016/j.jbiotec.2008.11.006 · 2.88 Impact Factor
ABSTRACT: Understanding transcriptional regulation through genome-wide microarray studies can help to unravel complex relationships between genes. Attempts to standardize the annotation of microarray data include the Minimum Information About a Microarray Experiment (MIAME) recommendations, the MAGE-ML format for data interchange, and the use of controlled vocabularies or ontologies. The existing software systems for microarray data analysis implement the mentioned standards only partially and are often hard to use and extend. Integration of genomic annotation data and other sources of external knowledge using open standards is therefore a key requirement for future integrated analysis systems.
The EMMA 2 software has been designed to resolve shortcomings with respect to full MAGE-ML and ontology support and makes use of modern data integration techniques. We present a software system that features comprehensive data analysis functions for spotted arrays, and for the most common synthesized oligo arrays such as Agilent, Affymetrix and NimbleGen. The system is based on the full MAGE object model. Analysis functionality is based on R and Bioconductor packages and can make use of a compute cluster for distributed services.
Our model-driven approach for automatically implementing a full MAGE object model provides high flexibility and compatibility. Data integration via SOAP-based web-services is advantageous in a distributed client-server environment as the collaborative analysis of microarray data is gaining more and more relevance in international research consortia. The adequacy of the EMMA 2 software design and implementation has been proven by its application in many distributed functional genomics projects. Its scalability makes the current architecture suited for extensions towards future transcriptomics methods based on high-throughput sequencing approaches which have much higher computational requirements than microarrays.
ABSTRACT: Databases for sequence, annotation, or microarray experiment data are extremely beneficial to the research community, as they centrally gather information from experiments performed by different scientists. However, data from different sources develop their full capacities only when combined. The idea of a data warehouse directly addresses this problem and solves it by integrating all required data into one single database; hence there are already many data warehouses available to genetics. For the model legume Medicago truncatula, there is currently no such single data warehouse that integrates all freely available gene sequences, the corresponding gene expression data, and annotation information. Thus, we created the data warehouse TRUNCATULIX, an integrative database of Medicago truncatula sequence and expression data.
The TRUNCATULIX data warehouse integrates five public databases for gene sequences, and gene annotations, as well as a database for microarray expression data covering raw data, normalized datasets, and complete expression profiling experiments. It can be accessed via an AJAX-based web interface using a standard web browser. For the first time, users can now quickly search for specific genes and gene expression data in a huge database based on high-quality annotations. The results can be exported as Excel, HTML, or as csv files for further usage.
The integration of sequence, annotation, and gene expression data from several Medicago truncatula databases in TRUNCATULIX provides the legume community with access to data and data mining capability not previously available. TRUNCATULIX is freely available at http://www.cebitec.uni-bielefeld.de/truncatulix/.
ABSTRACT: A total community DNA sample from an agricultural biogas reactor continuously fed with maize silage, green rye, and small proportions of chicken manure has recently been sequenced using massively parallel pyrosequencing. In this study, the sample was computationally characterized without a prior assembly step, providing quantitative insights into the taxonomic composition and gene content of the underlying microbial community. Clostridiales from the phylum Firmicutes is the most prevalent phylogenetic order; Methanomicrobiales are dominant among methanogenic archaea. An analysis of Operational Taxonomic Units (OTUs) revealed that the entire microbial community is only partially covered by the sequenced sample, even though estimates suggest only a moderate overall diversity of the community. Furthermore, the results strongly indicate that archaea related to the genus Methanoculleus, using CO2 as electron acceptor and H2 as electron donor, are the main producers of methane in the analyzed biogas reactor sample. A phylogenetic analysis of glycosyl hydrolase protein families suggests that Clostridia play an important role in the digestion of polysaccharides and oligosaccharides. Finally, the results unveiled that most of the organisms constituting the sample are still unexplored.
Journal of Biotechnology 07/2008; 136(1-2):91-101. DOI:10.1016/j.jbiotec.2008.06.003 · 2.88 Impact Factor
ABSTRACT: Composition and gene content of a biogas-producing microbial community from a production-scale biogas plant fed with renewable primary products was analysed by means of a metagenomic approach applying the ultrafast 454-pyrosequencing technology. Sequencing of isolated total community DNA on a Genome Sequencer FLX System resulted in 616,072 reads with an average read length of 230 bases, accounting for 141,664,289 bases of sequence information. Assignment of obtained single reads to COG (Clusters of Orthologous Groups of proteins) categories revealed a genetic profile characteristic for an anaerobic microbial consortium conducting fermentative metabolic pathways. Assembly of single reads resulted in the formation of 8752 contigs larger than 500 bases in size. Contigs longer than 10 kb mainly encode house-keeping proteins, e.g. DNA polymerase, recombinase, DNA ligase, sigma factor RpoD and genes involved in sugar and amino acid metabolism. A significant portion of contigs was allocated to the genome sequence of the archaeal methanogen Methanoculleus marisnigri JR1. Mapping of single reads to the M. marisnigri JR1 genome revealed that approximately 64% of the reference genome including methanogenesis gene regions are deeply covered. These results suggest that species related to those of the genus Methanoculleus play a dominant role in methanogenesis in the analysed fermentation sample. Moreover, assignment of numerous contig sequences to clostridial genomes including gene regions for cellulolytic functions indicates that clostridia are important for hydrolysis of cellulosic plant biomass in the biogas fermenter under study. Metagenome sequence data from a biogas-producing microbial community residing in a fermenter of a biogas plant provide the basis for a rational approach to improve the biotechnological process of biogas production.
Journal of Biotechnology 06/2008; 136(1-2):77-90. DOI:10.1016/j.jbiotec.2008.05.008 · 2.88 Impact Factor
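As a quick consistency check on the sequencing figures quoted in the abstract above, the mean read length can be recomputed from the reported read count and total base count. This is only an illustrative calculation; the variable names are chosen here and do not come from the paper.

```python
# Figures as reported in the abstract.
total_reads = 616_072
total_bases = 141_664_289

# Mean read length implied by the two totals.
mean_read_length = total_bases / total_reads
print(f"{mean_read_length:.2f}")  # about 229.95, i.e. roughly 230 bases
```

The result agrees with the stated average read length of 230 bases after rounding.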
ABSTRACT: Pedro is a Java application that dynamically generates data entry forms for data models expressed in XML Schema, producing XML data files that validate against this schema. The software uses an intuitive tree-based navigation system, can supply context-sensitive help to users and features a sophisticated interface for populating data fields with terms from controlled vocabularies. The software also has the ability to import records from tab-delimited text files and features various validation routines. AVAILABILITY: The application, source code, example models from several domains and tutorials can be downloaded from http://pedro.man.ac.uk/.
ABSTRACT: The Proteomics Standards Initiative (PSI) aims to define community standards for data representation in proteomics and to facilitate data comparison, exchange and verification. To this end, a Level 1 Molecular Interaction XML data exchange format has been developed which has been accepted for publication and is freely available at the PSI website (http://psidev.sf.net/). Several major protein interaction databases are already making data available in this format. A draft XML interchange format for mass spectrometry data has been written and is currently undergoing evaluation, whilst work is ongoing to develop a proteomics data integration model, MIAPE.