Molecular Stratification of Clear Cell Renal Cell Carcinoma by Consensus Clustering Reveals Distinct Subtypes and Survival Patterns.

Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC, USA.
Genes & cancer 02/2010; 1(2):152-163. DOI: 10.1177/1947601909359929
Source: PubMed

ABSTRACT: Clear cell renal cell carcinoma (ccRCC) is the predominant RCC subtype, but even within this classification, the natural history is heterogeneous and difficult to predict. A sophisticated understanding of the molecular features most discriminatory for the underlying tumor heterogeneity should be predicated on identifiable and biologically meaningful patterns of gene expression. Gene expression microarray data were analyzed using software that implements iterative unsupervised consensus clustering algorithms to identify the optimal molecular subclasses, without clinical or other classifying information. ConsensusCluster analysis identified two distinct subtypes of ccRCC within the training set, designated clear cell type A (ccA) and B (ccB). Based on the core tumors, or most well-defined arrays, in each subtype, logical analysis of data (LAD) defined a small, highly predictive gene set that could then be used to classify additional tumors individually. The subclasses were corroborated in a validation data set of 177 tumors and analyzed for clinical outcome. Based on individual tumor assignment, tumors designated ccA have markedly improved disease-specific survival compared to ccB (median survival of 8.6 vs 2.0 years, P = 0.002). Analyzed by both univariate and multivariate analysis, the classification schema was independently associated with survival. Using patterns of gene expression based on a defined gene set, ccRCC was classified into two robust subclasses based on inherent molecular features that ultimately correspond to marked differences in clinical outcome. This classification schema thus provides a molecular stratification applicable to individual tumors that has the potential to influence treatment decisions, define biological mechanisms involved in ccRCC tumor progression, and direct future drug discovery.
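
The general idea behind unsupervised consensus clustering can be illustrated with a minimal sketch. This is not the ConsensusCluster software used in the study: the base clusterer (a plain k-means), the resampling fraction, the number of resamples, and the synthetic "expression" matrix are all assumptions chosen for illustration. The sketch repeatedly subsamples the samples, clusters each subsample, and records how often each pair of samples lands in the same cluster.

```python
import numpy as np

def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd's k-means (illustrative base clusterer); one label per row of X."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def consensus_matrix(X, k, n_resamples=100, frac=0.8, seed=0):
    """Fraction of resamples in which each pair of samples co-clusters."""
    rng = np.random.default_rng(seed)
    n = len(X)
    together = np.zeros((n, n))  # times a pair shared a cluster
    sampled = np.zeros((n, n))   # times a pair was drawn together
    for _ in range(n_resamples):
        idx = rng.choice(n, size=int(frac * n), replace=False)
        labels = kmeans(X[idx], k, rng)
        same = (labels[:, None] == labels[None, :]).astype(float)
        sampled[np.ix_(idx, idx)] += 1.0
        together[np.ix_(idx, idx)] += same
    return np.divide(together, sampled,
                     out=np.zeros_like(together), where=sampled > 0)

# Synthetic data (hypothetical): 40 "tumors" x 5 "genes", two separated groups.
demo_rng = np.random.default_rng(1)
X = np.vstack([demo_rng.normal(0, 1, (20, 5)), demo_rng.normal(8, 1, (20, 5))])
C = consensus_matrix(X, k=2)
print("mean consensus within first group:", C[:20, :20].mean())
print("mean consensus between groups:", C[:20, 20:].mean())
```

Consensus values near 1 within a group and near 0 between groups indicate a stable partition. In practice the matrix is computed for several values of k, and the k giving the cleanest separation is taken as the number of subclasses.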

  • Source
    ABSTRACT: Background: Clear-cell renal cell carcinoma (ccRCC) is the most prevalent, chemotherapy-resistant, and lethal adult kidney cancer. There is a need for novel diagnostic and prognostic biomarkers for ccRCC, due to its heterogeneous molecular profiles and asymptomatic early stage. This study aims to develop classification models to distinguish early-stage from late-stage ccRCC based on gene expression profiles. We employed supervised learning algorithms (J48, Random Forest, SMO, and Naïve Bayes), with model learning enriched by fast correlation-based feature selection, to develop classification models trained on sequencing-based gene expression data from RNA-seq experiments obtained from The Cancer Genome Atlas. Results: The models developed in the study were evaluated by 10-fold cross-validation and independent dataset testing. The Random Forest-based prediction model performed best among the models developed in the study, with a sensitivity of 89%, an accuracy of 77%, and an area under the receiver operating characteristic curve of 0.8. Conclusions: We anticipate that the prioritized subset of 62 genes and the prediction models developed in this study will help experimental oncologists expedite understanding of the molecular mechanisms of stage progression and the discovery of prognostic factors for ccRCC tumors.
    Great Lakes Bioinformatics Conference 2014, Cincinnati, OH, USA; 05/2014
  • Source
    ABSTRACT: Formalin-fixed paraffin-embedded (FFPE) tissue samples are routinely archived in the course of patient care and can be linked to clinical outcomes with long-term follow-up. However, FFPE tissues have degraded RNA, which poses challenges for analyzing gene expression. Next-generation sequencing (NGS) is rapidly becoming accepted as an effective tool for measuring gene expression for research and clinical use. However, the feasibility of NGS has not been firmly established when using FFPE tissue. We optimized strategies for whole transcriptome sequencing (RNA-seq) using FFPE tissue. Ribosomal RNA (rRNA) was successfully depleted by competitive hybridization using the Ribo-Zero™ Kit (Epicentre Biotechnologies), and rRNA sequence content was less than one percent for each library. Gene expression measured by FFPE RNA-seq was compared to two different standards: RNA-seq from fresh frozen (FF) tissue and quantitative PCR (qPCR). Both FF and FFPE tumors were sequenced on an Illumina Genome Analyzer IIx with an average of 10 million reads. The distribution of FPKMs (fragments per kilobase of exon per million fragments mapped) and the number of detected genes were similar between FFPE and FF. RNA-seq expression values from FF and FFPE samples from the same renal cell carcinoma (RCC) correlated highly (r = 0.919 for tumor 1 and r = 0.954 for tumor 2). On hierarchical cluster analysis, samples clustered by patient identity rather than method of preservation. TaqMan qPCR of 424 RCC-related genes correlated highly with FFPE RNA-seq expression values (r = 0.775 for FFPE tumor 1, r = 0.803 for FFPE tumor 2). Expression fold changes were examined to assess the biologic relevance of gene expression. Expression fold changes between FFPE tumors (tumor 1/tumor 2) correlated well when comparing qPCR and RNA-seq (r = 0.890). Expression fold changes between tumors from different risk groups (our high-risk RCC / The Cancer Genome Atlas [TCGA] low-risk RCC) also correlated well when comparing RNA-seq determined from FF and FFPE tumors (r = 0.887). FFPE RNA-seq provides reliable gene expression data comparable to that obtained from fresh frozen tissue. It represents a useful tool for discovery and validation of biomarkers.
  • Source
    ABSTRACT: Objective: To investigate discriminating protein patterns and serum biomarkers between clear cell renal cell carcinoma (ccRCC) patients and healthy controls, as well as between paired pre- and post-operative ccRCC patients. Methods: We used magnetic bead-based separation followed by matrix-assisted laser desorption ionization (MALDI) time-of-flight (TOF) mass spectrometry (MS) to identify patients with ccRCC. A total of 162 serum samples were analyzed in this study, among which there were 58 serum samples from ccRCC patients, 40 from additional paired pre- and post-operative ccRCC patients (n = 20), and 64 from healthy volunteers as healthy controls. ClinProTools software identified several distinct markers between ccRCC patients and healthy controls, as well as between pre- and post-operative patients. Results: Patients with ccRCC could be identified with a mean sensitivity of 88.38% and a mean specificity of 91.67%. Of 67 m/z peaks that differed among the ccRCC, healthy controls, pre- and post-operative ccRCC patients, 24 were significantly different (P<0.05). Three candidate peaks, which were upregulated in ccRCC group and showed a tendency to return to healthy control values after surgery, were identified as peptide regions of RNA-binding protein 6 (RBP6), tubulin beta chain (TUBB), and zinc finger protein 3 (ZFP3) with the m/z values of 1466.98, 1618.22, and 5905.23, respectively. Conclusion: MB-MALDI-TOF-MS method could generate serum peptidome profiles of ccRCC, and provide a new approach to identify potential biomarkers for diagnosis as well as prognosis of this malignancy.
    PLoS ONE 11/2014; 9(11):e111364. DOI:10.1371/journal.pone.0111364 · 3.53 Impact Factor
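
The FFPE RNA-seq abstract listed above reports expression as FPKM and compares preservation methods by correlation. A minimal sketch of both calculations follows. The fragment counts, exon lengths, pseudocount, and the choice to correlate log2-transformed values are assumptions for illustration only; the abstract does not state how the reported r values were computed.

```python
import numpy as np

def fpkm(fragments, exon_length_bp, total_mapped=None):
    """Fragments per kilobase of exon per million fragments mapped."""
    fragments = np.asarray(fragments, dtype=float)
    exon_length_bp = np.asarray(exon_length_bp, dtype=float)
    if total_mapped is None:
        # Assumption: use the sum of the supplied counts as the library size.
        total_mapped = fragments.sum()
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million).
    return fragments * 1e9 / (exon_length_bp * total_mapped)

def pearson_log2(a, b, pseudocount=1.0):
    """Pearson r between log2-transformed expression vectors (assumed transform)."""
    la = np.log2(np.asarray(a, dtype=float) + pseudocount)
    lb = np.log2(np.asarray(b, dtype=float) + pseudocount)
    return float(np.corrcoef(la, lb)[0, 1])

# Hypothetical fragment counts for five genes in paired FF and FFPE libraries.
lengths = [1500, 2200, 900, 3100, 1200]
ff_counts = [300, 1200, 45, 800, 10]
ffpe_counts = [260, 1100, 60, 700, 14]

ff_fpkm = fpkm(ff_counts, lengths)
ffpe_fpkm = fpkm(ffpe_counts, lengths)
print("FF vs FFPE Pearson r (log2 FPKM):", pearson_log2(ff_fpkm, ffpe_fpkm))
```

Normalizing by exon length and library size makes values comparable across genes and libraries, which is why a correlation on FPKM rather than raw counts is the natural comparison between FF and FFPE samples.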
