Conference Paper

Nonlinear Kernel-Based Approaches for Predicting Normal Tissue Toxicities

Dept. of Radiat. Oncology, Washington Univ., St. Louis, MO, USA
DOI: 10.1109/ICMLA.2008.126
Conference: Machine Learning and Applications, 2008. ICMLA '08. Seventh International Conference on
Source: IEEE Xplore


Since the early demonstration of the curative potential of radiation therapy for tumor sterilization, normal tissue toxicity has remained dose limiting. Accurate prediction of a patient's complication risk would allow personalization of treatment planning decisions. Nonlinear kernel methods can provide a robust framework for learning complex interactions between observed toxicities and treatment, anatomical, and patient-related variables. However, proper application of these powerful methods requires a better understanding of the high-dimensional feature space spanned by all these variables. In this work, we investigate methods for visualizing this high-dimensional space and compare different approaches for extracting discriminant features. Our preliminary results demonstrate that principal component analysis is a valuable tool for visualizing high-dimensional data and for determining the proper kernel type. In addition, variable selection based on resampling methods within the logistic regression framework seemed to yield improved prediction performance compared to the recursive feature elimination method.
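
As a rough illustration of the visualization step mentioned in the abstract, the sketch below projects a small synthetic dataset onto its first two principal components. It is not the authors' implementation: the data are randomly generated, the function names (`center`, `covariance`, `top_component`, `pca`) are hypothetical, and the eigenvectors are found by simple power iteration with deflation rather than a library routine.

```python
import math
import random


def center(rows):
    """Subtract the per-feature mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_component(cov, iters=500, seed=0):
    """Dominant eigenpair of a symmetric matrix via power iteration."""
    d = len(cov)
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # matrix has no remaining variance
            break
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue (variance along v).
    eigval = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                 for i in range(d))
    return eigval, v


def pca(rows, k=2):
    """Return the top-k components, their variances, and projected scores."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    components, variances = [], []
    for _ in range(k):
        lam, v = top_component(cov)
        components.append(v)
        variances.append(lam)
        # Deflation: remove the variance already explained by v.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    scores = [[sum(r[j] * c[j] for j in range(d)) for c in components]
              for r in centered]
    return components, variances, scores


# Synthetic stand-in for patient variables (assumption, not real data):
# feature 2 is roughly twice feature 1, feature 3 is pure noise.
rng = random.Random(0)
rows = []
for _ in range(200):
    t = rng.uniform(-1.0, 1.0)
    rows.append([t, 2.0 * t + rng.gauss(0.0, 0.05), rng.gauss(0.0, 0.05)])

components, variances, scores = pca(rows, k=2)
```

Each entry of `scores` is a two-dimensional point that could be plotted and colored by outcome. One possible reading, hedged accordingly, is that heavy class overlap in such a linear projection suggests trying a nonlinear kernel, while clean linear separation suggests a linear kernel may suffice.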

  • Source
    • "Recently, the academic and medical community has seen an increased interest in applying machine learning techniques to predicting radiation pneumonitis risk. In particular, support vector machines (SVMs), which have been successfully used in domains ranging from cancer classification [20] [18] to image retrieval [34] [41], are now being applied to the RP modeling problem with promising results [10] [30]. "
    ABSTRACT: Radiation-induced lung injury, radiation pneumonitis (RP), is a potentially fatal side-effect of thoracic radiation therapy. In this work, using an ensemble of support vector machines (SVMs), we build a binary RP risk model from clinical and dosimetric parameters. Patient/treatment data is partitioned into balanced subsets to prevent model bias. Forward feature selection, maximizing the area under the curve (AUC) for a cross-validated receiver operating characteristic (ROC) curve, is performed on each subset. Model parameter selection and construction occurs concurrently via alternating SVM and gradient descent steps to minimize estimated generalization error. We show that an ensemble classifier with a mean fusion function, five component SVMs, and limit of five features per classifier exhibits a mean AUC of 0.818—an improvement over previous SVM models of RP risk.
    Full-text · Article · Jun 2010 · Neurocomputing
  • Source
    • "We have demonstrated that supervised machine learning methods based on nonlinear kernels could be used to improve prediction of RP by a factor of 46% compared to traditional logistic regression methods. Potential benefits of these methods could be assessed based on PCA analysis of this data, where nonlinear kernels could be applied to resolve overlapping classes by mapping to higher-dimensional space [36]. We have applied resampling methods based on LOO to assess generalizability to unseen data and avoid overfitting pitfalls. "
    ABSTRACT: Radiotherapy outcomes are determined by complex interactions between physical and biological factors, reflecting both treatment conditions and underlying genetics. Recent advances in radiotherapy and biotechnology provide new opportunities and challenges for predicting radiation-induced toxicities, particularly radiation pneumonitis (RP), in lung cancer patients. In this work, we utilize datamining methods based on machine learning to build a predictive model of lung injury by retrospective analysis of treatment planning archives. In addition, biomarkers for this model are extracted from a prospective clinical trial that collects blood serum samples at multiple time points. We utilize a 3-way proteomics methodology to screen for differentially expressed proteins that are related to RP. Our preliminary results demonstrate that kernel methods can capture nonlinear dose-volume interactions, but fail to address missing biological factors. Our proteomics strategy yielded promising protein candidates, but their role in RP as well as their interactions with dose-volume metrics remain to be determined.
    Full-text · Article · Feb 2009 · BioMed Research International
  • Source
    ABSTRACT: Purpose: Radiation pneumonitis (RP) is a potentially fatal side effect arising in lung cancer patients who receive radiotherapy as part of their treatment. For the modeling of RP outcomes data, several predictive models based on traditional statistical methods and machine learning techniques have been reported. However, no guidance on the variation in their performance has been provided to date. Materials and methods: In this study, we explore several machine learning algorithms for classification of RP data. The performance of these classification algorithms is investigated in conjunction with several feature selection strategies, and the impact of the feature selection strategy on performance is further evaluated. The extracted features include patients' demographic, clinical, and pathological variables, treatment techniques, and dose-volume metrics. In conjunction, we have been developing an in-house Matlab-based open source software tool, called dose-response explorer system (DREES), customized for modeling and exploring dose response in radiation oncology. This software has been upgraded with a popular classification algorithm called support vector machine (SVM), which seems to provide improved performance in our exploration analysis and has strong potential to strengthen the ability of radiotherapy modelers in analyzing radiotherapy outcomes data. These tools are demonstrated on an institutional non-small cell lung carcinoma (NSCLC) dataset of patients who received radiotherapy. Results: Our methods were applied to an NSCLC dataset comprising records for 209 patients, each with 160 variables. Using several feature selection methods, relevant features were searched. Subsequently, with the selected features, various classification algorithms were tested. Through these experiments, we showed the usefulness of machine learning methods in the analysis of radiation oncology datasets.
    Conclusions: We have presented an open-source software tool and several machine learning algorithms for analyzing radiotherapy outcomes. We demonstrated the tool on a lung cancer patient dataset. We believe that the improved tool will provide radiation oncology modelers with new means to analyze radiation response data.
    Full-text · Article ·
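
The first citing abstract above reports results as an area under the ROC curve (AUC) for an ensemble combined with a mean fusion function. The sketch below shows, on made-up numbers, how those two quantities can be computed. The labels, the three score lists, and the helper names `auc` and `mean_fusion` are all hypothetical; the scores stand in for the outputs of component classifiers and do not come from any of the cited studies.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_fusion(score_lists):
    """Average the per-case scores of several classifiers."""
    return [sum(col) / len(col) for col in zip(*score_lists)]


# Toy data (assumption): 1 = complication observed, 0 = none.
labels = [1, 1, 1, 0, 0, 0]
clf_a = [0.9, 0.4, 0.7, 0.5, 0.2, 0.1]
clf_b = [0.6, 0.8, 0.3, 0.2, 0.4, 0.1]
clf_c = [0.5, 0.7, 0.8, 0.3, 0.1, 0.6]

individual_aucs = [auc(labels, s) for s in (clf_a, clf_b, clf_c)]
fused_scores = mean_fusion([clf_a, clf_b, clf_c])
fused_auc = auc(labels, fused_scores)
```

In this toy case each classifier misranks one pair while the averaged scores rank every pair correctly. That outcome is a property of the chosen numbers and should not be read as a general guarantee that fusion improves AUC.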