Disease signatures are robust across tissues and experiments

Stanford Center for Biomedical Informatics Research, Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA.
Molecular Systems Biology (Impact Factor: 10.87). 09/2009; 5(1):307. DOI: 10.1038/msb.2009.66
Source: PubMed


Meta-analyses combining gene expression microarray experiments offer new insights into the molecular pathophysiology of disease not evident from individual experiments. Although the established technical reproducibility of microarrays serves as a basis for meta-analysis, pathophysiological reproducibility across experiments is not well established. In this study, we carried out a large-scale analysis of disease-associated experiments obtained from NCBI GEO, and evaluated their concordance across a broad range of diseases and tissue types. On evaluating 429 experiments, representing 238 diseases and 122 tissues from 8435 microarrays, we find evidence for a general, pathophysiological concordance between experiments measuring the same disease condition. Furthermore, we find that the molecular signature of disease across tissues is overall more prominent than the signature of tissue expression across diseases. The results offer new insight into the quality of public microarray data using pathophysiological metrics, and support new directions in meta-analysis that include characterization of the commonalities of disease irrespective of tissue, as well as the creation of multi-tissue systems models of disease pathology using public data.
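As an illustration only, concordance between two experiments measuring the same disease can be pictured as a rank correlation between their differential-expression signatures. The sketch below is not the method used in the study. It assumes each signature is a dictionary mapping gene symbols to a disease-versus-control score (for example, a log fold change), and the function names and toy values are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def signature_concordance(sig_a, sig_b):
    """Spearman correlation of two signatures over the genes they share.

    sig_a, sig_b: dict of gene symbol -> differential-expression score
    (an assumed representation, chosen for this sketch).
    """
    genes = sorted(set(sig_a) & set(sig_b))
    if len(genes) < 3:
        raise ValueError("need at least three shared genes")
    ranks_a = average_ranks([sig_a[g] for g in genes])
    ranks_b = average_ranks([sig_b[g] for g in genes])
    return pearson(ranks_a, ranks_b)
```

Under these assumptions, two signatures that order their shared genes identically score 1.0 and two that order them in reverse score -1.0. Comparing the distribution of such scores for same-disease pairs against same-tissue pairs is one way to frame the disease-versus-tissue question the abstract raises.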

Available from: Tarangini Deshpande, Jan 20, 2014
    • "Our research highlights that systems approaches not only can aid in clinically motivated knowledge discovery, but also it offers opportunities for the identification of candidate biomarkers or targets with potential therapeutic benefits. Our findings contribute further evidence of the predictive power and reproducibility of insights resulting from systems-based approaches [23,24]."
    ABSTRACT: Background: This study aims to expand knowledge of the complex process of myocardial infarction (MI) through the application of a systems-based approach. Methods: We generated a gene co-expression network from microarray data originating from a mouse model of MI. We characterized it on the basis of connectivity patterns and independent biological information. The potential clinical novelty and relevance of top predictions were assessed in the context of disease classification models. Models were validated using independent gene expression data from mouse and human samples. Results: The gene co-expression network consisted of 178 genes and 7298 associations. The network was dissected into statistically and biologically meaningful communities of highly interconnected and co-expressed genes. Among the most significant communities, one was distinctly associated with molecular events underlying heart repair after MI (P < 0.05). Col5a2, a gene previously not specifically linked to MI response but responsible for the classic type of Ehlers-Danlos syndrome, was found to have many and strong co-expression associations within this community (11 connections with ρ > 0.85). To validate the potential clinical application of this discovery, we tested its disease discriminatory capacity on independently generated MI datasets from mice and humans. High classification accuracy and concordance were achieved across these evaluations, with areas under the receiver operating characteristic curve above 0.8. Conclusion: Network-based approaches can enable the discovery of clinically interesting predictive insights that are accurate and robust. Col5a2 shows predictive potential in MI, and in principle may represent a novel candidate marker for the identification and treatment of ischemic cardiovascular disease.
    BMC Medical Genomics 04/2013; 6(1):13. DOI:10.1186/1755-8794-6-13 · 2.87 Impact Factor
    • "The explosion of gene expression data has made central data repositories such as the NCBI Gene Expression Omnibus (GEO) and the EMBL-EBI ArrayExpress Archive a powerhouse for large-scale systematic meta-analysis, or expression GWAS (eGWAS). It is plausible that disease-related genes exhibit persistent differential expression patterns across multiple studies, and that by scanning through a large data repository related to a given disease condition, these genes can be identified [9, 10, 11•, 12–14]. In a series of studies, Butte and colleagues developed and applied various bioinformatics tools [15–17] to screen GEO datasets for a wide spectrum of human diseases or phenotypes including T2D [11•] and obesity [18]."
    ABSTRACT: The metabolically connected triad of obesity, diabetes, and cardiovascular diseases is a major public health threat, and is expected to worsen due to the global shift toward energy-rich and sedentary living. Despite decades of intense research, a large part of the molecular pathogenesis behind complex metabolic diseases remains unknown. Recent advances in genetics, epigenomics, transcriptomics, proteomics and metabolomics enable us to obtain large-scale snapshots of the etiological processes in multiple disease-related cells, tissues and organs. These datasets provide us with an opportunity to go beyond conventional reductionist approaches and to pinpoint the specific perturbations in critical biological processes. In this review, we summarize systems biology methodologies such as functional genomics, causality inference, data-driven biological network construction, and higher-level integrative analyses that can produce novel mechanistic insights, identify disease biomarkers, and uncover potential therapeutic targets from a combination of omics datasets. Importantly, we also demonstrate the power of these approaches by application examples in obesity, diabetes, and cardiovascular diseases.
    Current Cardiovascular Risk Reports 02/2013; 7(1):73-83. DOI:10.1007/s12170-012-0280-y
    • "In 2009, Dudley et al. [23] evaluated 429 experiments, representing 238 diseases and 122 tissues from 8435 microarrays, and found evidences of a general, pathophysiological concordance between microarray experiments measuring the same disease in different tissues. Our result showed that microarrays of cell response to drugs which altered the cellular expression pattern could also have similarity across cell lines or species."
    ABSTRACT: Background: Animal models are indispensable tools in studying the cause of human diseases and searching for the treatments. The scientific value of an animal model depends on the accurate mimicry of human diseases. The primary goal of the current study was to develop a cross-species method by using the animal models' expression data to evaluate the similarity to human diseases' and assess drug molecules' efficiency in drug research. Therefore, we hoped to reveal that it is feasible and useful to compare gene expression profiles across species in the studies of pathology, toxicology, drug repositioning, and drug action mechanism. Results: We developed a cross-species analysis method to analyze animal models' similarity to human diseases and effectiveness in drug research by utilizing the existing animal gene expression data in the public database, and mined some meaningful information to help drug research, such as potential drug candidates, possible drug repositioning, side effects and analysis in pharmacology. New animal models could be evaluated by our method before they are used in drug discovery. We applied the method to several cases of known animal model expression profiles and obtained some useful information to help drug research. We found that trichostatin A and some other HDACs could have very similar response across cell lines and species at gene expression level. Mouse hypoxia model could accurately mimic the human hypoxia, while mouse diabetes drug model might have some limitation. The transgenic mouse of Alzheimer was a useful model and we deeply analyzed the biological mechanisms of some drugs in this case. In addition, all the cases could provide some ideas for drug discovery and drug repositioning. Conclusions: We developed a new cross-species gene expression module comparison method to use animal models' expression data to analyse the effectiveness of animal models in drug research. Moreover, through data integration, our method could be applied for drug research, such as potential drug candidates, possible drug repositioning, side effects and information about pharmacology.
    BMC Systems Biology 12/2012; 6(3). DOI:10.1186/1752-0509-6-S3-S18 · 2.44 Impact Factor