
Integrative Subtype Discovery in Glioblastoma Using iCluster

Department of Epidemiology and Biostatistics, Memorial Sloan-Kettering Cancer Center, New York, New York, United States of America.
PLoS ONE (Impact Factor: 3.53). 04/2012; 7(4):e35236. DOI: 10.1371/journal.pone.0035236
Source: PubMed

ABSTRACT Large-scale cancer genome projects, such as the Cancer Genome Atlas (TCGA) project, are comprehensive molecular characterization efforts to accelerate our understanding of cancer biology and the discovery of new therapeutic targets. The accumulating wealth of multidimensional data provides a new paradigm for important research problems including cancer subtype discovery. The current standard approach relies on separate clustering analyses followed by manual integration. Results can be highly data type dependent, restricting the ability to discover new insights from multidimensional data. In this study, we present an integrative subtype analysis of the TCGA glioblastoma (GBM) data set. Our analysis revealed new insights through integrated subtype characterization. We found three distinct integrated tumor subtypes. Subtype 1 lacks the classical GBM events of chr 7 gain and chr 10 loss. This subclass is enriched for the G-CIMP phenotype and shows hypermethylation of genes involved in brain development and neuronal differentiation. The tumors in this subclass display a Proneural expression profile. Subtype 2 is characterized by a near complete association with EGFR amplification, overrepresentation of promoter methylation of homeobox and G-protein signaling genes, and a Classical expression profile. Subtype 3 is characterized by NF1 and PTEN alterations and exhibits a Mesenchymal-like expression profile. The data analysis workflow we propose provides a unified and computationally scalable framework to harness the full potential of large-scale integrated cancer genomic data for integrative subtype discovery.

  • ABSTRACT: Recent technological advances have expanded the breadth of available omic data, from whole-genome sequencing data, to extensive transcriptomic, methylomic and metabolomic data. A key goal of analyses of these data is the identification of effective models that predict phenotypic traits and outcomes, elucidating important biomarkers and generating important insights into the genetic underpinnings of the heritability of complex traits. There is still a need for powerful and advanced analysis strategies to fully harness the utility of these comprehensive high-throughput data, identifying true associations and reducing the number of false associations. In this Review, we explore the emerging approaches for data integration - including meta-dimensional and multi-staged analyses - which aim to deepen our understanding of the role of genetics and genomics in complex outcomes. With the use and further development of these approaches, an improved understanding of the relationship between genomic variation and human phenotypes may be revealed.
    Nature Reviews Genetics 02/2015; 16(2):85-97. DOI:10.1038/nrg3868 · 39.79 Impact Factor
  • ABSTRACT: Our current understanding of cancer genetics is grounded on the principle that cancer arises from a clone that has accumulated the requisite somatically acquired genetic aberrations, leading to malignant transformation. It also results in aberrant gene and protein expression. Next generation sequencing (NGS) or deep sequencing platforms are being used to create large catalogues of changes in copy numbers, mutations, structural variations, gene fusions, gene expression, and other types of information for cancer patients. However, inferring different types of biological changes from the raw reads generated by sequencing experiments is algorithmically and computationally challenging. In this article, we outline common steps for the quality control and processing of NGS data. We highlight the importance of accurate and application-specific alignment of these reads and the methodological steps and challenges in obtaining different types of information. We comment on the importance of integrating these data and building infrastructure to analyse them. We also provide exhaustive lists of available software to obtain information and point the readers to articles comparing software for deeper insight in specialised areas. We hope that the article will guide readers in choosing the right tools for analysing oncogenomic datasets.
  • ABSTRACT: Dysregulated EGFR in glioblastoma may inactivate the key autophagy protein Beclin1. High EGFR and low Beclin1 protein expression have each, independently, been associated with tumor progression and poor prognosis. High (H) compared to low (L) expression of EGFR and Beclin1 is here correlated with main clinical data in 117 patients after chemo- and radiotherapy. H-EGFR correlated with low Karnofsky performance status and worse neurological performance status, higher incidence of synchronous multifocality, poor radiological evidence of response, and shorter progression disease-free survival (PDFS) and overall survival (OS). H-Beclin1 cases showed better Karnofsky performance status, higher incidence of objective response, and longer PDFS and OS. A mutual strengthening effect emerges in the correlative power of stratified L-EGFR and H-Beclin1 expression with incidence of radiological response after treatment, unifocal disease, and better prognosis, thus identifying a group with even longer OS (30 months median OS compared to 18 months in L-EGFR, 15 months in H-Beclin1, and 11 months in all GBs) (P = 0.0001). Combined L-EGFR + H-Beclin1 expression may represent a biomarker for identifying relatively favorable clinical presentations and prognosis, thus envisaging possible EGFR/Beclin1-targeted therapies.
    BioMed Research International 10/2014; 2015. DOI:10.1155/2015/208076 · 2.71 Impact Factor