Comparative co-expression analysis in plant biology
ABSTRACT The analysis of gene expression data generated by high-throughput microarray transcript profiling experiments has shown that transcriptionally coordinated genes are often functionally related. Based on large-scale expression compendia grouping multiple experiments, this guilt-by-association principle has been applied to study modular gene programmes, identify cis-regulatory elements or predict functions for unknown genes in different model plants. Recently, several studies have demonstrated how, through the integration of gene homology and expression information, correlated gene expression patterns can be compared between species. The incorporation of detailed functional annotations, as well as experimental data describing protein-protein interactions, phenotypes or tissue-specific expression, provides an invaluable source of information to identify conserved gene modules and translate biological knowledge from model organisms to crops. In this review, we describe the different steps required to systematically compare expression data across species. Apart from the technical challenges of computing and displaying expression networks from multiple species, some future applications of plant comparative transcriptomics are highlighted.
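As a minimal sketch of the guilt-by-association idea described above, the following example links genes whose expression profiles are strongly correlated across experiments. The use of Pearson correlation, the 0.8 threshold, and the toy gene names and values are all assumptions made for illustration; the review itself does not prescribe a specific similarity measure or cutoff.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / (sx * sy)


def coexpression_edges(profiles, threshold=0.8):
    """Return gene pairs whose profiles correlate at or above the threshold."""
    genes = sorted(profiles)
    edges = set()
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if pearson(profiles[g], profiles[h]) >= threshold:
                edges.add((g, h))
    return edges


# Hypothetical toy compendium: gene -> expression in four experiments.
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 6.0, 8.2],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
edges = coexpression_edges(toy)
```

Here `geneA` and `geneB` rise together and are linked, while `geneC` is anti-correlated with both and stays unconnected. In practice the threshold would need to be tuned to the size and noise level of the compendium.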
Available from: PubMed Central
ABSTRACT: Arabidopsis thaliana is a reference plant that has been studied intensively for several decades. Recent advances in high-throughput experimental technology have enabled the generation of an unprecedented amount of data from A. thaliana, which has facilitated data-driven approaches to unravel the genetic organization of plant phenotypes. We previously published a description of a genome-scale functional gene network for A. thaliana, AraNet, which was constructed by integrating multiple co-functional gene networks inferred from diverse data types, and we demonstrated the predictive power of this network for complex phenotypes. More recently, we have observed significant growth in the availability of omics data for A. thaliana as well as improvements in data analysis methods that we anticipate will further enhance the integrated database of co-functional networks. Here, we present an updated co-functional gene network for A. thaliana, AraNet v2 (available at http://www.inetbio.org/aranet), which covers approximately 84% of the coding genome. We demonstrate significant improvements in both genome coverage and accuracy. To enhance the usability of the network, we implemented an AraNet v2 web server, which generates functional predictions for A. thaliana and 27 nonmodel plant species using an orthology-based projection of nonmodel plant genes on the A. thaliana gene network.
Nucleic Acids Research 10/2014; 43(D1). DOI:10.1093/nar/gku1053 · 8.81 Impact Factor
ABSTRACT: The analysis of gene expression data has shown that transcriptionally coordinated (co-expressed) genes are often functionally related, enabling scientists to use expression data in gene function prediction. This Focused Review discusses our original paper (Large-scale co-expression approach to dissect secondary cell wall formation across plant species, Frontiers in Plant Science 2:23). In this paper we applied cross-species analysis to co-expression networks of genes involved in cellulose biosynthesis. We showed that the co-expression networks from different species are highly similar, indicating that whole biological pathways are conserved across species. This finding has two important implications. First, the analysis can transfer gene function annotation from well-studied plants, such as Arabidopsis, to other, uncharacterized plant species. As the analysis finds genes that have similar sequence and similar expression pattern across different organisms, functionally equivalent genes can be identified. Second, since co-expression analyses are often noisy, a comparative analysis should have higher performance, as parts of co-expression networks that are conserved are more likely to be functionally relevant. In this Focused Review, we outline the comparative analysis done in the original paper and comment on the recent advances and approaches that allow comparative analyses of co-function networks. We hypothesize that in comparison to simple co-expression analysis, comparative analysis would yield more accurate gene function predictions. Finally, by combining comparative analysis with genomic information of green plants, we propose a possible composition of cellulose biosynthesis machinery during earlier stages of plant evolution.
Frontiers in Plant Science 08/2014; 5:394. DOI:10.3389/fpls.2014.00394 · 3.64 Impact Factor
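The idea that conserved parts of a co-expression network are more likely to be functionally relevant can be sketched as a filter over network edges. The example below keeps an edge from one species only if orthologs of both genes are also co-expressed in a second species. The function name, the ortholog mapping format and all gene identifiers are hypothetical; this is an illustration under those assumptions, not the procedure used in the original paper.

```python
def conserved_edges(edges_a, edges_b, orthologs):
    """Edges of species A whose orthologous gene pair is an edge in species B.

    edges_a, edges_b: sets of (gene, gene) tuples, treated as undirected.
    orthologs: dict mapping a species A gene to a list of species B genes.
    """
    undirected_b = {frozenset(edge) for edge in edges_b}
    kept = set()
    for g, h in edges_a:
        for og in orthologs.get(g, ()):
            for oh in orthologs.get(h, ()):
                # Skip cases where both genes map to the same ortholog.
                if og != oh and frozenset((og, oh)) in undirected_b:
                    kept.add((g, h))
    return kept


# Hypothetical networks and ortholog assignments.
edges_species_a = {("a1", "a2"), ("a1", "a3")}
edges_species_b = {("b2", "b1"), ("b3", "b4")}
orthologs = {"a1": ["b1"], "a2": ["b2"], "a3": ["b9"]}

conserved = conserved_edges(edges_species_a, edges_species_b, orthologs)
```

Only the `a1`-`a2` edge survives, because its orthologs `b1` and `b2` are also linked in the second network. Allowing a list of orthologs per gene accommodates one-to-many relationships, which are common in plants because of gene and genome duplications.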
ABSTRACT: In general, gene expression changes with conditions to drive specific metabolic pathways. Microarray-based datasets have been produced on a massive scale to monitor gene expression levels in parallel under numerous experimental treatments. Although several studies have linked gene expression data to metabolic pathways, none of these resources has been assembled for plants. Moreover, they do not provide advanced analyses such as pathway enrichment or the comparison of gene expression under different conditions. Therefore, EXPath was developed not only to comprehensively collect public microarray expression data from over 1000 samples covering biotic stress, abiotic stress, and hormone secretion, but also to allow the use of this abundant resource for coexpression analysis and the identification of differentially expressed genes (DEGs), and finally to infer the enriched KEGG pathways and gene ontology (GO) terms of three model plants: Arabidopsis thaliana, Oryza sativa, and Zea mays. Users can access the gene expression patterns of interest under various conditions via five main functions (Gene Search, Pathway Search, DEGs Search, Pathways/GO Enrichment, and Coexpression analysis) in EXPath, which are presented through a user-friendly interface and are valuable for further research. In conclusion, EXPath, freely available at http://expath.itps.ncku.edu.tw, is a database resource that collects and utilizes gene expression profiles derived from microarray platforms under various conditions to infer metabolic pathways for plants.
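Pathway or GO enrichment of the kind mentioned above is commonly assessed with a one-sided hypergeometric test. The sketch below shows that calculation; it is an assumption that such a test is appropriate here, since the abstract does not state which statistic EXPath uses, and the function name and numbers are hypothetical.

```python
from math import comb


def hypergeom_pvalue(population, annotated, drawn, hits):
    """Probability of observing at least `hits` annotated genes by chance.

    population: number of genes in the background set
    annotated:  background genes annotated to the pathway or GO term
    drawn:      size of the gene list being tested (e.g. DEGs)
    hits:       genes in the tested list that carry the annotation
    """
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical example: 10 background genes, 4 in the pathway,
# 3 differentially expressed genes, 2 of which are in the pathway.
p_value = hypergeom_pvalue(10, 4, 3, 2)
```

When many pathways or terms are tested at once, the resulting p-values would normally be corrected for multiple testing before calling any of them enriched.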