Three-Gene Expression Signature Predicts Survival in Early-Stage Squamous Cell Carcinoma of the Lung
University of Gdansk, Gdansk, Poland. Clinical Cancer Research
(Impact Factor: 8.72).
08/2008; 14(15):4794-9. DOI: 10.1158/1078-0432.CCR-08-0576
Adjuvant treatment may improve survival in early-stage squamous cell carcinoma (SCC) of the lung; however, the absolute gain is modest and mainly limited to stage II-IIIA. Current staging methods are imprecise indicators of prognosis, but high-risk patients can be identified by gene expression profiling and considered for adjuvant therapy.
The expression of 29 genes was assessed by reverse transcriptase quantitative PCR in frozen primary tumor specimens obtained from 66 SCC patients who had undergone surgical resection. Expression values were dichotomized using the median as a cutoff value. We used a risk score to develop a gene expression model for the prediction of survival.
The univariate analysis of gene expression in the training cohort identified 10 genes with significant prognostic value: CSF1, EGFR, CA IX, PH4, KIAA0974, ANLN, VEGFC, NTRK1, FN1, and INR1. In the multivariate Cox model, CSF1 (hazard ratio, 3.5; P = 0.005), EGFR (hazard ratio, 2.7; P = 0.02), CA IX (hazard ratio, 0.2; P < 0.0001), and tumor size >4 cm (hazard ratio, 2.7; P = 0.02) emerged as significant markers for survival. The high prognostic value of a risk score based on the expression of the three genes (CSF1, EGFR, and CA IX) was positively validated in a separate cohort of 26 patients in an independent laboratory (P = 0.05).
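The abstract does not state how the risk score was computed. The sketch below illustrates one common construction, assuming that each gene is dichotomized at the cohort median and that the score is the sum of ln(hazard ratio) over the genes with high expression. The weighting scheme, the function names, and the toy expression values are all assumptions for illustration; only the three hazard ratios come from the abstract.

```python
import math
from statistics import median

# Hazard ratios reported for the three genes in the multivariate Cox model.
HAZARD_RATIOS = {"CSF1": 3.5, "EGFR": 2.7, "CA IX": 0.2}


def dichotomize(values):
    """Return 1 for values above the cohort median, else 0."""
    cutoff = median(values)
    return [1 if v > cutoff else 0 for v in values]


def risk_scores(expression):
    """Compute one score per patient.

    expression: dict mapping gene name -> list of per-patient values.
    ASSUMPTION: score = sum of ln(HR) over genes with high expression;
    the published score may have been defined differently.
    """
    genes = list(HAZARD_RATIOS)
    high = {g: dichotomize(expression[g]) for g in genes}
    n_patients = len(expression[genes[0]])
    return [
        sum(math.log(HAZARD_RATIOS[g]) * high[g][i] for g in genes)
        for i in range(n_patients)
    ]


# Hypothetical expression values for four patients (not study data).
toy = {
    "CSF1": [1.0, 2.0, 3.0, 4.0],
    "EGFR": [4.0, 3.0, 2.0, 1.0],
    "CA IX": [1.0, 4.0, 2.0, 3.0],
}
scores = risk_scores(toy)
```

Under this assumed scheme, high CSF1 or EGFR expression raises the score, while high CA IX expression (hazard ratio below 1) lowers it.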
The three-gene signature is strongly associated with prognosis in early-stage SCC. Positive independent validation suggests its suitability for selecting SCC patients with an increased risk of death who might benefit from adjuvant treatment.
Available from: Charles W Putnam
- "The accompanying clinical data were obtained as described in Methods. The three published reports [18,20,21] of this data set included the 23 stage III cases. However, our analysis was limited to data from the 107 stage I and II cases, a selection consonant with the principle of using the clinical data in optimal fashion to achieve the objective of the study; limiting the cases to stages I and II provided a relatively homogenous patient sample in which the most prominent variable was survival. "
Numerous microarray-based prognostic gene expression signatures of primary neoplasms have been published but often with little concurrence between studies, thus limiting their clinical utility. We describe a methodology using logistic regression, which circumvents limitations of conventional Kaplan Meier analysis. We applied this approach to a thrice-analyzed and published squamous cell carcinoma (SQCC) of the lung data set, with the objective of identifying gene expressions predictive of early death versus long survival in early-stage disease. A similar analysis was applied to a data set of triple negative breast carcinoma cases, which present similar clinical challenges.
Important to our approach is the selection of homogenous patient groups for comparison. In the lung study, we selected two groups (including only stages I and II), equal in size, of earliest deaths and longest survivors. Genes varying at least four-fold were tested by logistic regression for accuracy of prediction (area under a ROC plot). The gene list was refined by applying two sliding-window analyses and by validations using a leave-one-out approach and model building with validation subsets. In the breast study, a similar logistic regression analysis was used after selecting appropriate cases for comparison.
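The per-gene accuracy measure described above can be sketched as follows. A logistic model with a single predictor yields probabilities that are monotone in the expression value, so its area under the ROC curve depends only on how the expression values rank the two groups. The sketch therefore skips the model fit and computes the rank-based area directly. The function names are hypothetical, and the sliding-window and leave-one-out refinements are not reproduced.

```python
def roc_auc(positives, negatives):
    """Probability that a random positive value exceeds a random negative one.

    Ties count as one half. This equals the area under the ROC curve.
    """
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def gene_accuracy(early_deaths, long_survivors):
    """Direction-free accuracy of one gene's expression values.

    A gene that is lower in early deaths is as informative as one that
    is higher, so the larger of AUC and 1 - AUC is returned.
    """
    auc = roc_auc(early_deaths, long_survivors)
    return max(auc, 1.0 - auc)
```

Ranking every variable gene by `gene_accuracy` gives a candidate list that later validation steps could then prune.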
A total of 8594 variable genes were tested for accuracy in predicting earliest deaths versus longest survivors in SQCC. After applying the two sliding-window analyses and the leave-one-out analysis, 24 prognostic genes were identified; most of them were B-cell related. When the same data set of stage I and II cases was analyzed using a conventional Kaplan Meier (KM) approach, we identified fewer immune-related genes among the most statistically significant hits; when stage III cases were included, most of the prognostic genes were missed. Interestingly, logistic regression analysis of the breast cancer data set identified many immune-related genes predictive of clinical outcome.
Stratification of cases based on clinical data, careful selection of two groups for comparison, and the application of logistic regression analysis substantially improved predictive accuracy in comparison to conventional KM approaches. B cell-related genes dominated the list of prognostic genes in early stage SQCC of the lung and triple negative breast cancer.
Available from: Giorgio Valentini
- "It was not possible to find supporting literature for the association of NPPA with the tumor types in which CM234 was found to be activated or deactivated by Segal and colleagues. In  FN1 was found to be of prognostic value using a univariate analysis of gene expression. FN1 was also found to be a potential biomarker for hepatocellular carcinomas in . "
ABSTRACT: Co-expression based Cancer Modules (CMs) are sets of genes that act in concert to carry out specific functions in different cancer types, and are constructed by exploiting gene expression profiles related to specific clinical conditions or expression signatures associated to specific processes altered in cancer. Unfortunately, genes involved in cancer are not always detectable using only expression signatures or co-expressed sets of genes, and in principle other types of functional interactions should be exploited to obtain a comprehensive picture of the molecular mechanisms underlying the onset and progression of cancer.
We propose a novel semi-supervised method to rank genes with respect to CMs using networks constructed from different sources of functional information, not limited to gene expression data. It exploits on the one hand local learning strategies through score functions that extend the guilt-by-association approach, and on the other hand global learning strategies through graph kernels embedded in the score functions, able to take into account the overall topology of the network. The proposed kernelized score functions compare favorably with other state-of-the-art semi-supervised machine learning methods for gene ranking in biological networks and scale well with the number of genes, thus allowing fast processing of very large gene networks.
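One way to picture a kernelized score function is as guilt-by-association with kernel similarities in place of raw edge weights. The sketch below assumes an average-score form, in which a candidate gene is scored by its mean kernel similarity to the known members of a cancer module. The exact score functions and graph kernels of the method are not given in this abstract, so the formula, the function names, and the toy kernel values are assumptions for illustration.

```python
def average_kernel_score(kernel, module, gene):
    """Mean kernel similarity between `gene` and the module members.

    kernel: dict of dicts holding a symmetric similarity matrix, assumed
            to be precomputed from the functional network by a graph kernel.
    module: non-empty set of genes known to belong to the cancer module.
    """
    members = [m for m in module if m != gene]
    return sum(kernel[gene][m] for m in members) / len(members)


def rank_genes(kernel, module):
    """Rank genes outside the module by decreasing average score."""
    candidates = [g for g in kernel if g not in module]
    return sorted(
        candidates,
        key=lambda g: average_kernel_score(kernel, module, g),
        reverse=True,
    )


# Hypothetical 4-gene kernel matrix (not derived from real data).
toy_kernel = {
    "g1": {"g1": 1.0, "g2": 0.8, "g3": 0.6, "g4": 0.1},
    "g2": {"g1": 0.8, "g2": 1.0, "g3": 0.5, "g4": 0.2},
    "g3": {"g1": 0.6, "g2": 0.5, "g3": 1.0, "g4": 0.3},
    "g4": {"g1": 0.1, "g2": 0.2, "g3": 0.3, "g4": 1.0},
}
ranking = rank_genes(toy_kernel, {"g1", "g2"})
```

Because the kernel matrix is computed once from the whole network, scoring each candidate against a module is a simple average, which is consistent with the scalability the abstract reports.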
The modular nature of kernelized score functions provides an algorithmic scheme from which different gene ranking algorithms can be derived, and the results show that using integrated functional networks we can successfully predict CMs defined mainly through expression signatures obtained from gene expression data profiling. A preliminary analysis of top ranked "false positive" genes shows that our approach could in perspective be applied to discover novel genes involved in the onset and progression of tumors related to specific CMs.
Available from: Susan J Done
- "Although we are the first to show that the over-expression of ANLN and KIF2C, and the under-expression of MAPT predict for poor survival in patients with breast cancer (Figures 4 and 5), there is some evidence that supports the correlation of these three genes with prognosis and carcinogenesis in other cancers, and treatment in breast cancers. The over-expression of ANLN has been reported to be a biomarker for pancreatic carcinoma, and predicted for poor survival in early lung cancers. Shimo et al reported that the over-expression of KIF2C might be involved in breast carcinogenesis and is a therapeutic target for breast cancers. "
ABSTRACT: The ability of gene profiling to predict treatment response and prognosis in breast cancers has been demonstrated in many studies using DNA microarray analyses on RNA from fresh frozen tumor specimens. In certain clinical and research situations, performing such analyses on archival formalin fixed paraffin-embedded (FFPE) surgical specimens would be advantageous as large libraries of such specimens with long-term follow-up data are widely available. However, FFPE tissue processing can cause fragmentation and chemical modifications of the RNA. A number of recent technical advances have been reported to overcome these issues. Our current study evaluates whether or not the technology is ready for clinical applications.
A modified RNA extraction method and a recent DNA microarray technique, cDNA-mediated annealing, selection, extension and ligation (DASL, Illumina Inc) were evaluated. The gene profiles generated from FFPE specimens were compared to those obtained from paired fresh fine needle aspiration biopsies (FNAB) of 25 breast cancers of different clinical subtypes (based on ER and Her2/neu status). Selected RNA levels were validated using RT-qPCR, and two public databases were used to demonstrate the prognostic significance of the gene profiles generated from FFPE specimens.
Compared to FNAB, RNA isolated from FFPE samples was relatively more degraded; nonetheless, over 80% of the RNA samples were deemed suitable for subsequent DASL assay. Despite a higher noise level, a set of genes from FFPE specimens correlated very well with the gene profiles obtained from FNAB, and could differentiate breast cancer subtypes. Expression levels of these genes were validated using RT-qPCR. Finally, for the first time we correlated gene expression profiles from FFPE samples to survival using two independent microarray databases. Specifically, over-expression of ANLN and KIF2C, and under-expression of MAPT strongly correlated with poor outcomes in breast cancer patients.
We demonstrated that FFPE specimens retained important prognostic information that could be identified using a recent gene profiling technology. Our study supports the use of FFPE specimens for the development and refinement of prognostic gene signatures for breast cancer. Clinical applications of such prognostic gene profiles await future large-scale validation studies.