A robust prognostic signature for hormone-positive node-negative breast cancer

Genome Medicine (Impact Factor: 5.34). 10/2013; 5(10):92. DOI: 10.1186/gm496
Source: PubMed


Systemic chemotherapy in the adjuvant setting can cure breast cancer in some patients who would otherwise recur with incurable, metastatic disease. However, since only a fraction of patients would have recurrence after surgery alone, the challenge is to separate high-risk patients (who stand to benefit from systemic chemotherapy) from low-risk patients (who can safely be spared treatment-related toxicities and costs).
We focus here on risk stratification in node-negative, ER-positive, HER2-negative breast cancer. We use a large database of publicly available microarray datasets to build a random forests classifier and develop a robust multi-gene mRNA transcription-based predictor of relapse-free survival at ten years, which we call the Random Forests Relapse Score (RFRS). Performance was assessed by internal cross-validation, multiple independent data sets, and comparison to existing algorithms using receiver operating characteristic (ROC) and Kaplan-Meier survival analysis. Internal redundancy of features was determined using k-means clustering to define optimal signatures with smaller numbers of primary genes, each with multiple alternates.
Internal out-of-bag (OOB) cross-validation for the initial (full-gene-set) model on training data reported an ROC AUC of 0.704, which was comparable to or better than those reported previously or obtained by applying existing methods to our data set. Three risk groups with probability cutoffs for low, intermediate and high risk were defined. Survival analysis determined a highly significant difference in relapse rate between these risk groups. Validation of the models against independent test datasets showed highly similar results. Smaller 17-gene and 8-gene optimized models were also developed with minimal reduction in performance. Furthermore, the signature was shown to be almost equally effective on both hormone-treated and untreated patients.
RFRS allows flexibility in both the number and identity of genes utilized, from thousands to as few as 17 or 8 genes, each with multiple alternatives. The RFRS reports a probability score strongly correlated with risk of relapse. This score could therefore be used to assign systemic chemotherapy specifically to those high-risk patients most likely to benefit from further treatment.
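The two evaluation steps named in the abstract, scoring a relapse-probability predictor by ROC AUC and binning probabilities into three risk groups, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy data, and the cutoffs 0.3 and 0.6 are assumptions made only for illustration, since the abstract does not state the actual cutoff values.

```python
from typing import List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC computed as the fraction of (relapse, non-relapse) pairs
    in which the relapse case has the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one relapse and one non-relapse case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def risk_group(prob: float, low_cut: float = 0.3, high_cut: float = 0.6) -> str:
    """Map a predicted relapse probability to a risk group.
    The default cutoffs are hypothetical, not taken from the paper."""
    if prob < low_cut:
        return "low"
    if prob >= high_cut:
        return "high"
    return "intermediate"


# Toy data (invented): 1 = relapse within follow-up, 0 = no relapse.
labels: List[int] = [0, 0, 1, 0, 1, 1]
scores: List[float] = [0.10, 0.25, 0.35, 0.40, 0.70, 0.90]

auc = roc_auc(labels, scores)
groups = [risk_group(s) for s in scores]
print(f"AUC = {auc:.3f}")
print(groups)
```

In practice the scores would come from a trained classifier, for example the out-of-bag vote fractions of a random forest, and the cutoffs would be chosen on training data before being applied to independent test sets.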



Available from: Obi Lee Griffith, May 28, 2014
  • ABSTRACT: Small bowel accounts for only 0.5% of cancer cases in the US but incidence rates have been rising at 2.4% per year over the past decade. One-third of these are adenocarcinomas but little is known about their molecular pathology and no molecular markers are available for clinical use. Using a retrospective 28-patient matched normal-tumor cohort, next-generation sequencing, gene expression arrays and CpG methylation arrays were used for molecular profiling. Next-generation sequencing identified novel mutations in IDH1, CDH1, KIT, FGFR2, FLT3, NPM1, PTEN, MET, AKT1, RET, NOTCH1 and ERBB4. Array data revealed 17% of CpGs and 5% of RNA transcripts assayed to be differentially methylated and expressed, respectively (p < 0.01). Merging gene expression and DNA methylation data revealed CHN2 as consistently hypermethylated and downregulated in this disease (Spearman -0.71, p < 0.001). Mutations in TP53, which were found in more than half of the cohort (15/28), and Kazald1 hypomethylation were both indicative of poor survival (p = 0.03, HR = 3.2 and p = 0.01, HR = 4.9, respectively). By integrating high-throughput mutational, gene expression and DNA methylation data, this study reveals for the first time the distinct molecular profile of small bowel adenocarcinoma and highlights potential clinically exploitable markers.
    Article · Jul 2015 · Oncotarget
  • ABSTRACT: Microarray analysis has revolutionized the role of genomic prognostication in breast cancer. However, most studies are single-series studies and suffer from methodological problems. We sought to use a meta-analytic approach in combining multiple publicly available datasets, while correcting for batch effects, to reach a more robust oncogenomic analysis. The aim of the present study was to find gene sets associated with distant metastasis-free survival (DMFS) in systemically untreated, node-negative breast cancer patients, from publicly available genomic microarray datasets. Four microarray series (having 742 patients) were selected after a systematic search and combined. Cox regression for each gene was done for the combined dataset (univariate, as well as multivariate - adjusted for expression of cell cycle-related genes) and for the 4 major molecular subtypes. The centre and microarray batch effects were adjusted by including them as random effects variables. The Cox regression coefficients for each analysis were then ranked and subjected to a Gene Set Enrichment Analysis (GSEA). Gene sets representing protein translation were independently negatively associated with metastasis in the Luminal A and Luminal B subtypes, but positively associated with metastasis in Basal tumors. Proteinaceous extracellular matrix (ECM) gene set expression was positively associated with metastasis, after adjustment for expression of cell cycle-related genes on the combined dataset. Finally, the positive association of the proliferation-related genes with metastases was confirmed. To the best of our knowledge, the results depicting mixed prognostic significance of protein translation in breast cancer subtypes are being reported for the first time. We attribute this to our study combining multiple series and performing a more robust meta-analytic Cox regression modeling on the combined dataset, thus discovering 'hidden' associations. This methodology seems to yield new and interesting results and may be used as a tool to guide new research.
    Article · Jun 2016 · PLoS ONE