Content-based microarray search using differential expression profiles.

Department of Bioengineering, Stanford University School of Medicine, CA, USA.
BMC Bioinformatics (Impact Factor: 2.67). 01/2010; 11:603. DOI: 10.1186/1471-2105-11-603
Source: DBLP

ABSTRACT: With the expansion of public repositories such as the Gene Expression Omnibus (GEO), we are rapidly cataloging cellular transcriptional responses to diverse experimental conditions. Methods that query these repositories based on gene expression content, rather than textual annotations, may enable more effective experiment retrieval as well as the discovery of novel associations between drugs, diseases, and other perturbations.
We develop methods to retrieve gene expression experiments that differentially express the same transcriptional programs as a query experiment. Avoiding thresholds, we generate differential expression profiles that include a score for each gene measured in an experiment. We use existing and novel dimension reduction and correlation measures to rank relevant experiments in an entirely data-driven manner, allowing emergent features of the data to drive the results. A combination of matrix decomposition and p-weighted Pearson correlation proves the most suitable for comparing differential expression profiles. We apply this method to index all GEO DataSets, and demonstrate the utility of our approach by identifying pathways and conditions relevant to transcription factors Nanog and FoxO3.
Content-based gene expression search generates relevant hypotheses for biological inquiry. Experiments across platforms, tissue types, and protocols inform the analysis of new datasets.
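As a rough illustration of the ranking idea described in the abstract, the sketch below scores each stored differential expression profile against a query with a weighted Pearson correlation and sorts the results. The abstract does not define the "p-weighted" scheme, so the weighting used here (the product of `1 - p` for the two profiles) is purely an assumption, as are the function names `weighted_pearson`, `p_weights`, and `rank_experiments` and the toy data. The matrix decomposition step is omitted.

```python
import math


def weighted_pearson(x, y, w):
    """Weighted Pearson correlation of x and y with non-negative weights w."""
    if not (len(x) == len(y) == len(w)):
        raise ValueError("x, y and w must have the same length")
    total = sum(w)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weighted means of each profile.
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    # Weighted covariance and variances (the common 1/total factor cancels).
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / math.sqrt(vx * vy)


def p_weights(p_query, p_target):
    """ASSUMED weighting: genes with small p-values in both profiles count most."""
    return [(1.0 - pq) * (1.0 - pt) for pq, pt in zip(p_query, p_target)]


def rank_experiments(query, database):
    """Rank stored profiles by weighted correlation with the query.

    `query` is a (scores, p_values) pair; `database` maps an experiment
    name to a (scores, p_values) pair over the same genes, in the same order.
    Returns a list of (name, correlation), best match first.
    """
    q_scores, q_p = query
    results = []
    for name, (scores, p_values) in database.items():
        w = p_weights(q_p, p_values)
        results.append((name, weighted_pearson(q_scores, scores, w)))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): four genes, one score and one p-value per gene.
query = ([2.0, -1.0, 0.5, 0.0], [0.01, 0.02, 0.5, 0.9])
database = {
    "A": ([1.8, -0.9, 0.1, 0.2], [0.02, 0.05, 0.6, 0.8]),
    "B": ([-1.5, 1.0, 0.0, 0.1], [0.03, 0.04, 0.7, 0.9]),
}
ranked = rank_experiments(query, database)
```

Because every gene keeps a score and a weight, no significance threshold is applied; genes with weak evidence simply contribute less to the correlation.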

  • Source
    ABSTRACT: A main challenge of data-driven sciences is how to make maximal use of the progressively expanding databases of experimental datasets in order to keep research cumulative. We introduce the idea of a modeling-based dataset retrieval engine designed for relating a researcher's experimental dataset to earlier work in the field. The search is (i) data-driven to enable new findings, going beyond the state of the art of keyword searches in annotations, (ii) modeling-driven, to both include biological knowledge and insights learned from data, and (iii) scalable, as it is accomplished without building one unified grand model of all data. Assuming each dataset has been modeled beforehand, by the researchers or by database managers, we apply a rapidly computable and optimizable combination model to decompose a new dataset into contributions from earlier relevant models. By using the data-driven decomposition we identify a network of interrelated datasets from a large annotated human gene expression atlas. While tissue type and disease were major driving forces for determining relevant datasets, the found relationships were richer and the model-based search was more accurate than keyword search; it moreover recovered biologically meaningful relationships that are not straightforwardly visible from annotations, for instance, between cells in different developmental stages such as thymocytes and T-cells. Data-driven links and citations matched to a large extent; the data-driven links even uncovered corrections to the publication data, as two of the most linked datasets were not highly cited and turned out to have wrong publication entries in the database.
    PLoS ONE. 04/2014; 9(11).
  • Source
    ABSTRACT: Aim: In spite of significant progress in microarray techniques and accurate management, the analysis and interpretation of raw data remain a major challenge for most researchers worldwide. In this mini-review, the authors describe general parameters for overcoming current errors and drawbacks in the management, analysis, and interpretation of microarray raw data. Methods: Correct visualization of data depends on researchers' knowledge of methodologies, experimental design, appropriate platforms, up-to-date software, suitable statistical tests, and valuable databases. Hence, being up to date and skillful is considered a key factor in ensuring accurate data management, analysis, and interpretation. Results: Applying correlated methodologies, experimental designs, platforms, software, statistical tests, and virtual databases guarantees high-quality management, analysis, and interpretation of microarray raw data. Conclusion: In accordance with new efforts in the field of databases and software, microarray data management, analysis, and interpretation have improved. The rise of microarray technology applications may lead to a significant decrease in costs in the future.
    Albanian Medical Journal. 12/2014; 4:84-90.
  • ABSTRACT: Cutaneous atrophy is the major adverse effect of topical glucocorticoids; however, its molecular mechanisms are poorly understood. Here, we identify the stress-inducible mTOR inhibitor REDD1 (regulated in development and DNA damage response 1) as a major molecular target of glucocorticoids, which mediates cutaneous atrophy. In REDD1 knockout (KO) mice, all skin compartments (epidermis, dermis, subcutaneous fat) and epidermal stem and progenitor cells were protected from the atrophic effects of glucocorticoids. Moreover, REDD1 knockdown resulted in similar consequences in organotypic raft cultures of primary human keratinocytes. Expression profiling revealed that gene activation by glucocorticoids was strongly altered in REDD1 KO epidermis. In contrast, the down-regulation of genes involved in the anti-inflammatory glucocorticoid response was strikingly similar in wild-type and REDD1 KO mice. Integrative bioinformatics analysis of our and published gene array data revealed similar changes of gene expression in epidermis and in muscle undergoing glucocorticoid-dependent and glucocorticoid-independent atrophy. Importantly, the lack of REDD1 did not diminish the anti-inflammatory effects of glucocorticoids in a preclinical model. Our findings suggest that combining steroids with REDD1 inhibitors may yield novel, safer glucocorticoid-based therapies.
    EMBO Molecular Medicine. 12/2014 (Impact Factor: 7.80).
