Manipulating large-scale Arabidopsis microarray expression data: identifying dominant expression patterns and biological process enrichment.
ABSTRACT A series of large-scale Arabidopsis thaliana microarray expression experiments profiling genome-wide expression across different developmental stages, cell types, and environmental conditions have resulted in tremendous amounts of gene expression data. This gene expression is the output of complex transcriptional regulatory networks and provides a starting point for identifying the dominant transcriptional regulatory modules acting within the plant. Highly co-expressed groups of genes are likely to be regulated by similar transcription factors. Therefore, finding these co-expressed groups can reduce the dimensionality of complex expression data into a set of dominant transcriptional regulatory modules. Determining the biological significance of these patterns is an informatics challenge and has required the development of new methods. Using these new methods we can begin to understand the biological information contained within large-scale expression data sets.
- SourceAvailable from: Nobutoshi Yamaguchi
Dataset: Yamaguchi N Science 2014 suppl
- [Show abstract] [Hide abstract]
ABSTRACT: Tools that provide improved ability to relate genotype to phenotype have the potential to accelerate breeding for desired traits and to improve our understanding of the molecular variants that underlie phenotypes. The availability of large-scale gene expression profiles in maize provides an opportunity to advance our understanding of complex traits in this agronomically important species. We built co-expression networks based on genome-wide expression data from a variety of maize accessions as well as an atlas of different tissues and developmental stages. We demonstrate that these networks reveal clusters of genes that are enriched for known biological function and contain extensive structure which has yet to be characterized. Furthermore, we found that co-expression networks derived from developmental or tissue atlases as compared to expression variation across diverse accessions capture unique functions. To provide convenient access to these networks, we developed a public, web-based Co-expression Browser (COB), which enables interactive queries of the genome-wide networks. We illustrate the utility of this system through two specific use cases: one in which gene-centric queries are used to provide functional context for previously characterized metabolic pathways, and a second where lists of genes produced by mapping studies are further resolved and validated using co-expression networks.PLoS ONE 01/2014; 9(6):e99193. · 3.53 Impact Factor