Article

Integrative set enrichment testing for multiple omics platforms.

Department of Public Health Sciences, Henry Ford Hospital, 1 Ford Place, Detroit, MI 48202, USA.
BMC Bioinformatics (impact factor: 2.75). 11/2011; 12:459. DOI: 10.1186/1471-2105-12-459
Source: PubMed

ABSTRACT Enrichment testing assesses the overall evidence of differential expression among the elements of a defined set. When many molecular aspects have been measured, e.g., gene expression, metabolites, and proteins, it is desirable to assess their differential tendencies jointly across platforms using an integrated set enrichment test. In this work we explore the properties of several methods for performing a combined enrichment test, using gene expression and metabolomics as the motivating platforms.
Using two simulation models, we explored the properties of several enrichment methods, including two novel methods: the logistic regression 2-degree-of-freedom Wald test and the 2-dimensional permutation p-value for the sum-of-squared statistics test. Relative to their univariate counterparts, the joint tests can improve our ability to detect results that are marginal univariately, and they improve the ranking of associated pathways. However, some methods carry a risk of Type I error inflation, and self-contained methods lose specificity when the sets are not representative of the underlying association.
In this work we show that considering data from multiple platforms, in conjunction with summarization via a priori pathway information, leads to increased power to detect genomic associations with phenotypes.
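
The abstract names a 2-dimensional permutation p-value for the sum-of-squared statistics test but does not define it. The sketch below is one possible reading, not the authors' implementation: for each platform it sums the squared two-sample t statistics over the features in a set, then permutes the sample labels and counts how often both permuted sums are at least as large as the observed ones. The function names (`t_stat`, `ssq`, `joint_perm_pvalue`), the Welch-style t statistic, and the "both dimensions at least as extreme" rule are all assumptions made for illustration.

```python
import random
from statistics import mean, variance


def t_stat(values, labels):
    """Welch-style two-sample t statistic for one feature (labels are 0/1)."""
    a = [v for v, g in zip(values, labels) if g == 1]
    b = [v for v, g in zip(values, labels) if g == 0]
    se2 = variance(a) / len(a) + variance(b) / len(b)
    return (mean(a) - mean(b)) / se2 ** 0.5 if se2 > 0 else 0.0


def ssq(platform, labels):
    """Sum of squared t statistics over the features of one platform."""
    return sum(t_stat(feature, labels) ** 2 for feature in platform)


def joint_perm_pvalue(genes, metabolites, labels, n_perm=1000, seed=0):
    """Assumed joint rule: share of label permutations in which BOTH
    platform statistics are at least as large as the observed pair."""
    rng = random.Random(seed)
    observed = (ssq(genes, labels), ssq(metabolites, labels))
    perm = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(perm)  # shuffling keeps the two group sizes fixed
        if ssq(genes, perm) >= observed[0] and ssq(metabolites, perm) >= observed[1]:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: each inner list is one feature measured on 12 samples.
labels = [1] * 6 + [0] * 6
genes = [[2.1, 2.4, 1.9, 2.6, 2.2, 2.5, 1.0, 1.3, 0.8, 1.1, 1.2, 0.9]]
metabolites = [[5.0, 5.5, 4.8, 5.2, 5.1, 5.4, 4.1, 4.4, 3.9, 4.3, 4.0, 4.2]]
print(joint_perm_pvalue(genes, metabolites, labels, n_perm=500))
```

Because the same permuted labels are applied to both platforms, the dependence between the gene and metabolite measurements on each sample is preserved under the null. Note that a joint tail count of this kind is not guaranteed to be calibrated like an ordinary one-dimensional p-value.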



Keywords

2-dimensional permutation p-value
combined enrichment test
conjunction
differential expression behavior
enrichment methods
enrichment test
gene expression
joint tests
logistic regression 2-degree of freedom Wald test
marginal univariately
metabolites
motivating platforms
multiple platforms
novel methods
platforms
a priori pathway information
self-contained methods
simulation models
sum-of-squared statistics test