
FIVA: Functional Information Viewer and Analyzer extracting biological knowledge from transcriptome data of prokaryotes

FIVA (Functional Information Viewer and Analyzer) helps researchers in the prokaryotic community quickly identify relevant biological processes following transcriptome analysis. Our software assists in functional profiling of large sets of genes and generates a comprehensive overview of affected biological processes. Availability: http://bioinformatics.biol.rug.nl/standalone/fiva/ Contact: o.p.kuipers@rug.nl Supplementary information: http://bioinformatics.biol.rug.nl/standalone/fiva/suppMaterials.php
Vol. 23 no. 9 2007, pages 1161–1163
BIOINFORMATICS APPLICATIONS NOTE doi:10.1093/bioinformatics/btl658
Gene expression
FIVA: Functional Information Viewer and Analyzer extracting
biological knowledge from transcriptome data of prokaryotes
Evert-Jan Blom¹, Dinne W. J. Bosman², Sacha A. F. T. van Hijum¹, Rainer Breitling³, Lars Tijsma², Remko Silvis¹, Jos B. T. M. Roerdink² and Oscar P. Kuipers¹,*
¹Molecular Genetics, Groningen Biomolecular Sciences, ²Institute for Mathematics and Computing Science and ³Groningen Bioinformatics Centre, University of Groningen, PO Box 800, 9700 AV, Groningen, The Netherlands
Received on October 19, 2006; revised on November 24, 2006; accepted on December 19, 2006
Advance Access publication January 19, 2007
Associate Editor: Dmitrij Frishman
ABSTRACT
Summary: FIVA (Functional Information Viewer and Analyzer) helps
researchers in the prokaryotic community quickly identify relevant
biological processes following transcriptome analysis. Our software
assists in functional profiling of large sets of genes and generates
a comprehensive overview of affected biological processes.
Availability: http://bioinformatics.biol.rug.nl/standalone/fiva/
Contact: o.p.kuipers@rug.nl
Supplementary information: http://bioinformatics.biol.rug.nl/standalone/fiva/suppMaterials.php
1 INTRODUCTION
Genome-wide expression profiles describing various cellular
states are obtained by use of DNA microarrays. Following
statistical analysis of the raw gene expression values,
data-driven methods such as unsupervised clustering allow grouping
of genes based on their (temporal) expression patterns. Genes
involved in similar cellular processes are expected to have a
high probability of exhibiting similar expression patterns.
Analysis and interpretation of these clusters is time-consuming
and error-prone. Various applications have been developed to
functionally profile differentially expressed genes from
DNA-microarray experiments.
Several of these, as reviewed by Khatri et al. (2005), overlap
with our application in terms of functionality and data
sources employed. Many of these focus on higher organisms
and therefore lack support for prokaryote gene identifiers.
A number of applications support only rarely used identifiers
(UniProt, GI accession) (Hosack et al., 2003) or only identifiers
for a limited set of organisms (Scheer et al., 2006). Moreover,
with few exceptions, these software products use Gene Ontology
as their exclusive data source. In addition, the laborious task
of preprocessing the list of differentially expressed
genes must be performed by the researcher. A stand-alone
application that focuses on prokaryotes is therefore essential
for the fast-growing community of microbiologists making use
of a plethora of (confidential) microbial genome sequences.
We have developed FIVA (Functional Information Viewer
and Analyzer). It uses several sources of biological annotations
to create an extensive functional profile based on gene
expression data. Furthermore, FIVA is capable of processing
groups of genes assembled by other criteria (e.g. functional
grouping of genes which are not available in current annotation
modules). The significance of each biological process is
calculated to distinguish between significant and spurious
occurrences.
2 PROGRAM OVERVIEW
2.1 Input
The input data for FIVA consist of transcriptome data and
genome annotation files (e.g. EMBL or GenBank), supplemented
with annotation information. FIVA supports a broad
variety of prokaryotic gene identifiers from the expression
datasets, including locus tags and standard gene names (further
details available in Supplementary Materials). Each annotation
module uses functional information from one of the following
sources to classify the groups of genes and determine any
significantly over-represented categories: (i) Gene Ontology;
(ii) metabolic pathways; (iii) COG classes; (iv) regulatory
interactions; (v) UniProt keywords; (vi) InterPro;
(vii) user-defined functional categories.
2.2 Processing
The analysis in FIVA first involves the partitioning of the gene
expression data into up- and down-regulated fractions. Testing
different settings to partition the data is not a trivial task.
FIVA offers the ability to automatically detect the optimal
settings for each individual experiment based on the number of
over-represented functional categories. In addition to this
partitioning method which is based on thresholds applied to
a single experiment, the iGA algorithm (Breitling et al., 2004) is
implemented. This algorithm optimizes the parameters for each
functional category, which greatly improves the sensitivity of
the analysis and increases the number of affected biological
processes that can be reliably detected. Furthermore, the
analysis of gene expression data can also be applied to
user-defined gene lists.
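The idea behind iGA can be illustrated with a short sketch. It is not FIVA's implementation: it assumes that genes are already sorted from most to least changed, that a hypergeometric tail probability is evaluated each time a member of the functional category is encountered in the ranked list, and that the smallest probability defines the category-specific cutoff. The function names `hypergeom_tail` and `iga_min_pvalue` are hypothetical.

```python
from math import comb


def hypergeom_tail(total, class_size, drawn, hits):
    """P(X >= hits) when `drawn` genes are taken from `total` genes,
    of which `class_size` belong to the functional category."""
    upper = min(class_size, drawn)
    favourable = sum(
        comb(class_size, k) * comb(total - class_size, drawn - k)
        for k in range(hits, upper + 1)
    )
    return favourable / comb(total, drawn)


def iga_min_pvalue(ranked_genes, class_members):
    """Scan a ranked gene list for one functional category.

    Returns (smallest p-value, rank at which it occurs,
    category members found at or above that rank).
    """
    total = len(ranked_genes)
    members = set(class_members) & set(ranked_genes)
    best_p, best_rank, best_members = 1.0, 0, []
    seen = []
    for rank, gene in enumerate(ranked_genes, start=1):
        if gene not in members:
            continue
        seen.append(gene)
        # Probability of finding at least len(seen) members in the top `rank` genes.
        p = hypergeom_tail(total, len(members), rank, len(seen))
        if p < best_p:
            best_p, best_rank, best_members = p, rank, list(seen)
    return best_p, best_rank, best_members


if __name__ == "__main__":
    # Toy data: ten genes ranked by expression change, one category of three genes.
    ranked = [f"g{i}" for i in range(1, 11)]
    print(iga_min_pvalue(ranked, {"g1", "g2", "g4"}))
```

Because the cutoff is chosen per category, no single threshold has to be fixed for the whole experiment, which is the property the text above attributes to iGA.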
*To whom correspondence should be addressed.
© 2007 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
A Fisher exact test is used to calculate P-values for each
cluster. This P-value describes the probability of observing a
specific enrichment of genes from a functional category in
a cluster by chance. The number of false positives, due to the
large number of statistical tests performed, is controlled
by four multiple testing corrections (Benjamini-Hochberg,
Bonferroni step-down, Bonferroni and Benjamini-Yekutieli).
These are implemented to adjust the raw P-values (see
Supplementary Website).
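The enrichment test and the four corrections can be sketched as follows. This is an illustration under stated assumptions rather than FIVA's code: the Fisher exact test is taken to be one-sided (over-representation only), "Bonferroni step-down" is taken to mean Holm's procedure, and all function names are hypothetical.

```python
from math import comb


def fisher_enrichment_p(total, category_size, cluster_size, overlap):
    """One-sided Fisher exact test: probability of at least `overlap`
    category genes in a cluster of `cluster_size` genes drawn by chance."""
    upper = min(category_size, cluster_size)
    favourable = sum(
        comb(category_size, k) * comb(total - category_size, cluster_size - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / comb(total, cluster_size)


def bonferroni(pvals):
    """Multiply every P-value by the number of tests."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def holm(pvals):
    """Bonferroni step-down (assumed here to be Holm's procedure)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for step, i in enumerate(order):
        running_max = max(running_max, (m - step) * pvals[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted


def benjamini_hochberg(pvals, dependent=False):
    """Benjamini-Hochberg step-up adjustment.
    With dependent=True the Benjamini-Yekutieli factor sum(1/k) is applied."""
    m = len(pvals)
    c = sum(1.0 / k for k in range(1, m + 1)) if dependent else 1.0
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for step, i in enumerate(order):
        rank = m - step  # rank of this P-value in ascending order
        running_min = min(running_min, pvals[i] * m * c / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Toy example: 20 genes, a category of 5, a cluster of 6 with 3 category genes.
    print(fisher_enrichment_p(20, 5, 6, 3))
    raw = [0.01, 0.04, 0.03, 0.005]
    for name, adj in [
        ("Bonferroni", bonferroni(raw)),
        ("Bonferroni step-down", holm(raw)),
        ("Benjamini-Hochberg", benjamini_hochberg(raw)),
        ("Benjamini-Yekutieli", benjamini_hochberg(raw, dependent=True)),
    ]:
        print(name, adj)
```

Counting, for each category, how many of the four adjusted P-values fall below the chosen significance level gives a per-category summary comparable to the symbols described in the legend of Fig. 1.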
2.3 Output
For each of the classification modules, a graphical representation
of the over-represented categories is generated. A preview
is created from these results, from which a selection of the
results can be made (Fig. 1). In order to conveniently compare
biological phenomena occurring in different experiments,
multiple experiments can be loaded simultaneously and are
displayed as columns. Clickable links are present for each
category in the individual graphical map, providing detailed
information for each cluster that contains an enrichment of
genes from this category. Furthermore, FIVA uses the KEGG
API (http://www.genome.jp/kegg/soap/) to communicate with
the KEGG database to color pathways based on the gene
distribution in the clusters.
2.4 Implementation and availability
FIVA was programmed as a stand-alone application in Java
using the Eclipse (http://www.eclipse.org/) framework and
runs on all Java-supporting operating systems (Mac OS,
MS Windows, UNIX and Linux). The graphical output can
also be viewed by all web browsers that are able to process
scalable vector graphics (SVG) or, to ensure portability
of the results, portable network graphics (PNG). More information
on the functionality of FIVA, as well as the results of
several test cases, can be found under the Supplementary
Materials.
Fig. 1. Graphical output of a single annotation module. Genes from two DNA-microarray datasets (gluc: growth on glucitol compared to growth on
glucose, man: growth on mannitol compared to glucose) were partitioned into up- and down-regulated clusters. The size of each cluster is displayed
in blue underneath the cluster name. Numbers in each rectangle represent absolute values of occurrences. The significance of occurrences is visualized
in a colour gradient which is displayed at the bottom of the plot. The description of each category is placed at the right. S: annotations that are
significant after multiple testing correction. Multiple testing correction results are visualized using five different symbols to distinguish between the
individual corrections. The number of symbols placed in each rectangle corresponds to the number of multiple testing corrections after which the
annotation is found significant.
3 CONCLUSION
A full information analysis was performed to assess the
overlap between the annotation modules (see Supplementary
Website). The gene ontology module is the most informative
annotation type for our test organism Bacillus subtilis and
covers a large portion of the information present in the other
types. However, the utilization of multiple modules yields
relevant areas which are not shared by any of the other
modules. For our test cases, several relevant categories were
identified by the metabolic pathways module but were missed
by the GO module (see Supplementary Website for more
information on this analysis). We conclude that combining
multiple annotation sources into one tool is advantageous
compared to using only one or a few sources. The combination
of various complementary annotation sources, together
with the dynamic visualization and elaborate statistical
analysis, allows a richer and more objective exploration
of prokaryote expression data than any other available tool
provides.
ACKNOWLEDGEMENTS
This study was fully supported by a grant from The
Netherlands Organization for Scientific Research and industrial
partners in the NWO-BMI project number 050.50.206 on
Computational Genomics of Prokaryotes and by Center IOP
Genomics. Work performed by SvH was supported by grant
QLK3-CT-2001-01473 under the EU programme ‘Quality of
life and management of living resources: The cell factory’.
We thank J.W. Veening for useful suggestions on experimental
procedures and G. te Meerman for expert advice on the
statistical analysis. Funding to pay the Open Access charges
was provided by the Molecular Genetics department of the
University of Groningen.
Conflict of Interest: none declared.
REFERENCES
Breitling, R. et al. (2004) Iterative group analysis (iGA): a simple tool to enhance
sensitivity and facilitate interpretation of microarray experiments. BMC
Bioinformatics, 5, 34.
Hosack, D.A. et al. (2003) Identifying biological themes within lists of genes with
EASE. Genome Biol., 4, R70.
Khatri, P. et al. (2005) Ontological analysis of gene expression data: current tools,
limitations, and open problems. Bioinformatics, 21, 3587–3595.
Scheer, M. et al. (2006) JProGO: a novel tool for the functional interpretation
of prokaryotic microarray data using Gene Ontology information.
Nucleic Acids Res., 34, W510–W515.