Disease signatures are robust across tissues and experiments.

Stanford Center for Biomedical Informatics Research, Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA.
Molecular Systems Biology (Impact Factor: 14.1). 09/2009; 5:307. DOI: 10.1038/msb.2009.66
Source: PubMed

ABSTRACT Meta-analyses combining gene expression microarray experiments offer new insights into the molecular pathophysiology of disease not evident from individual experiments. Although the established technical reproducibility of microarrays serves as a basis for meta-analysis, pathophysiological reproducibility across experiments is not well established. In this study, we carried out a large-scale analysis of disease-associated experiments obtained from NCBI GEO, and evaluated their concordance across a broad range of diseases and tissue types. On evaluating 429 experiments, representing 238 diseases and 122 tissues from 8435 microarrays, we find evidence for a general, pathophysiological concordance between experiments measuring the same disease condition. Furthermore, we find that the molecular signature of disease across tissues is overall more prominent than the signature of tissue expression across diseases. The results offer new insight into the quality of public microarray data using pathophysiological metrics, and support new directions in meta-analysis that include characterization of the commonalities of disease irrespective of tissue, as well as the creation of multi-tissue systems models of disease pathology using public data.

  • ABSTRACT: High-throughput RNA sequencing (RNA-seq) enables comprehensive scans of entire transcriptomes, but best practices for analyzing RNA-seq data have not been fully defined, particularly for data collected with multiple sequencing platforms or at multiple sites. Here we used standardized RNA samples with built-in controls to examine sources of error in large-scale RNA-seq studies and their impact on the detection of differentially expressed genes (DEGs). Analysis of variations in guanine-cytosine content, gene coverage, sequencing error rate and insert size allowed identification of decreased reproducibility across sites. Moreover, commonly used methods for normalization (cqn, EDASeq, RUV2, sva, PEER) varied in their ability to remove these systematic biases, depending on sample complexity and initial data quality. Normalization methods that combine data from genes across sites are strongly recommended to identify and remove site-specific effects and can substantially improve RNA-seq studies.
    Nature Biotechnology 08/2014; · 39.08 Impact Factor
  • ABSTRACT: The biomarker discovery field is replete with molecular signatures that have not translated into the clinic despite ostensibly promising performance in predicting disease phenotypes. One widely cited reason is lack of classification consistency, largely due to failure to maintain performance from study to study. This failure is widely attributed to variability in data collected for the same phenotype among disparate studies, due to technical factors unrelated to phenotypes (e.g., laboratory settings resulting in "batch-effects") and non-phenotype-associated biological variation in the underlying populations. These sources of variability persist in new data collection technologies.
    PLoS ONE 10/2014; 9(10):e110840. · 3.53 Impact Factor
  • ABSTRACT: The lack of specific symptoms at early tumor stages, together with a high biological aggressiveness of the tumor, contributes to the high mortality rate for pancreatic cancer (PC), which has a five-year survival rate of less than 5%. Improved screening for earlier diagnosis, through the detection of diagnostic and prognostic biomarkers, provides the best hope of increasing the rate of curatively resectable carcinomas. Though many serum markers have been reported to be elevated in patients with PC, so far most of these markers have not been implemented into clinical routine due to low sensitivity or specificity. In this study, we have identified genes that are significantly upregulated in PC through a meta-analysis of a large number of microarray datasets. We demonstrate that the biological functions ascribed to these genes are clearly associated with PC and metastasis, and that these genes exhibit a strong link to pathways involved with inflammation and the immune response. This investigation has yielded new targets for cancer genes, and potential biomarkers for pancreatic cancer. The candidate list of cancer genes includes protein kinase genes, new members of gene families currently associated with PC, as well as genes not previously linked to PC. In this study, we are also able to move towards developing a signature for hypomethylated genes, which could be useful for early detection of PC. We also show that the significantly upregulated 800+ genes in our analysis can serve as an enriched pool for tissue and serum protein biomarkers in pancreatic cancer.
    PLoS ONE 04/2014; 9(4):e93046. · 3.53 Impact Factor