The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models.

Giuseppe Jurman
National Center for Toxicological Research, US Food and Drug Administration, Jefferson, Arkansas, USA.
Nature Biotechnology (Impact Factor: 39.08). 08/2010; 28(8):827-38. DOI: 10.1038/nbt.1665
Source: PubMed

ABSTRACT Gene expression data from microarrays are being applied to predict preclinical and clinical endpoints, but the reliability of these predictions has not been established. In the MAQC-II project, 36 independent teams analyzed six microarray data sets to generate predictive models for classifying a sample with respect to one of 13 endpoints indicative of lung or liver toxicity in rodents, or of breast cancer, multiple myeloma or neuroblastoma in humans. In total, >30,000 models were built using many combinations of analytical methods. The teams generated predictive models without knowing the biological meaning of some of the endpoints and, to mimic clinical reality, tested the models on data that had not been used for training. We found that model performance depended largely on the endpoint and team proficiency and that different approaches generated models of similar performance. The conclusions and recommendations from MAQC-II should be useful for regulatory agencies, study committees and independent investigators that evaluate methods for global gene expression analysis.

  • Source
    ABSTRACT: Phosphatases are proteins with the ability to dephosphorylate different substrates and are involved in critical cellular processes such as proliferation, tumor suppression, motility and survival. Little is known about their role in the different breast cancer (BC) phenotypes. We carried out microarray phosphatome profiling in 41 estrogen receptor-negative (ER-) BC patients, as determined by immunohistochemistry (IHC), containing both ERBB2+ and ERBB2- in order to characterize the differences between these two groups. We characterized and confirmed the distinct phosphatome of the two main ER- BC subgroups (in two independent microarray series) and that of ER+ BC (in three large independent series). Our findings point to the importance of the MAPK and PI3K pathways in ER- BCs, as some of the most differentially expressed phosphatases share ERK as a substrate (like DUSP4 and DUSP6) or regulate the PI3K pathway (INPP4B, PTEN). It was possible to identify a selective group of phosphatases upregulated only in the ER- ERBB2+ subgroup and not in ER+ (like DUSP6, DUSP10 and PPAPDC1A among others), suggesting a role of these phosphatases in specific BC subtypes, unlike other differentially expressed phosphatases (DUSP4 and ENPP1) that seemed to have a role in multiple BC subtypes. Significant correlation was found at the protein level by IHC between the expression of DUSP6 and phospho-ERK (p=0.04) but not of phospho-ERK with DUSP4. To show the potential prognostic relevance of phosphatases as a functional group of genes, we derived and validated in two large independent BC microarray series a multiphosphatase signature enriched in differentially expressed phosphatases, to predict distant metastasis-free survival (DMFS). ER- ERBB2+, ER- ERBB2- and ER+ BC patients have a distinct pattern of phosphatase RNA expression with a potential prognostic relevance. Further studies of the most relevant phosphatases found in this study are warranted.
    International journal of oncology. 09/2014;
  • Source
    ABSTRACT: Background: Understanding the aspects of cell functionality that account for disease or drug action mechanisms is one of the main challenges in the analysis of genomic data and underlies the future implementation of precision medicine. Results: Here we propose a simple probabilistic model in which signaling pathways are separated into elementary sub-pathways or signal transmission circuits (which ultimately trigger cell functions) and gene expression measurements are then transformed into probabilities of activation of such signal transmission circuits. Using this model, differential activation of such circuits between biological conditions can be estimated. Thus, circuit activation statuses can be interpreted as biomarkers that discriminate among the compared conditions. This type of mechanism-based biomarker accounts for cell functional activities and can easily be associated with disease or drug action mechanisms. The accuracy of the proposed model is demonstrated with simulations and real datasets. Conclusions: The proposed model provides detailed information that enables the interpretation of disease mechanisms as a consequence of the complex combinations of altered gene expression values. Moreover, it offers a framework for suggesting possible ways of therapeutic intervention in a pathologically perturbed system.
    BMC Systems Biology 10/2014; 8(1):121. · 2.85 Impact Factor
  • Source
    ABSTRACT: Detecting periodicity signals from time-series microarray data is commonly used to facilitate the understanding of the critical roles and underlying mechanisms of regulatory transcriptomes. However, time-series microarray data are noisy. How the temporal data structure affects the performance of periodicity detection has remained elusive. We present a novel method based on empirical mode decomposition (EMD) to examine this effect. We applied EMD to a yeast microarray dataset and extracted a series of intrinsic mode function (IMF) oscillations from the time-series data. Our analysis indicated that many periodically expressed genes might have been under-detected in the original analysis because of interference between decomposed IMF oscillations. By validating a protein complex coexpression analysis, we revealed that 56 genes were newly determined as periodic. We demonstrated that EMD can be used in combination with existing periodicity detection methods to improve their performance. This approach can be applied to other time-series microarray studies.
    PLoS ONE 11/2014; 9(11):e111719. · 3.53 Impact Factor

