-
ABSTRACT: Alzheimer's disease (AD) and mild cognitive impairment (MCI) continue to be
widely studied. While there is no consensus on whether MCIs actually "convert"
to AD, the more important question is not whether MCIs convert, but what is the
best such definition. We focus on automatic prognostication, nominally using
only a baseline brain scan, of whether an MCI individual will convert to
AD within a multi-year period following the initial clinical visit. This is in
fact not a traditional supervised learning problem since, in ADNI, there are no
definitive labeled examples of MCI conversion. Prior works have defined MCI
subclasses based on whether or not clinical/cognitive scores such as CDR
significantly change from baseline. There are concerns with these definitions,
however, since, e.g., most MCIs (and ADs) do not change from a baseline CDR=0.5,
even while physiological changes may be occurring. These works ignore rich
phenotypical information in an MCI patient's brain scan and labeled AD and
Control examples, in defining conversion. We propose an innovative conversion
definition, wherein an MCI patient is declared to be a converter if any of the
patient's brain scans (at follow-up visits) are classified "AD" by an
(accurately-designed) Control-AD classifier. This novel definition bootstraps
the design of a second classifier, specifically trained to predict whether or
not MCIs will convert. This second classifier thus predicts whether an
AD-Control classifier will predict that a patient has AD. Our results
demonstrate this new definition leads not only to much higher prognostic
accuracy than by-CDR conversion, but also to subpopulations much more
consistent with known AD brain region biomarkers. We also identify key
prognostic region biomarkers, essential for accurately discriminating the
converter and nonconverter groups.
04/2011;
-
ABSTRACT: Alzheimer's disease (AD) and mild cognitive impairment (MCI) are of great current research interest. While there is no consensus on whether MCIs actually "convert" to AD, this concept is widely applied. Thus, the more important question is not whether MCIs convert, but what is the best such definition. We focus on automatic prognostication, nominally using only a baseline brain image, of whether an MCI will convert within a multi-year period following the initial clinical visit. This is not a traditional supervised learning problem since, in ADNI, there are no definitive labeled conversion examples. It is not unsupervised, either, since there are (labeled) ADs and Controls, as well as cognitive scores for MCIs. Prior works have defined MCI subclasses based on whether or not clinical scores significantly change from baseline. There are concerns with these definitions, however, since, e.g., most MCIs (and ADs) do not change from a baseline CDR = 0.5 at any subsequent visit in ADNI, even while physiological changes may be occurring. These works ignore rich phenotypical information in an MCI patient's brain scan and labeled AD and Control examples, in defining conversion. We propose an innovative definition, wherein an MCI is a converter if any of the patient's brain scans are classified "AD" by a Control-AD classifier. This definition bootstraps design of a second classifier, specifically trained to predict whether or not MCIs will convert. We thus predict whether an AD-Control classifier will predict that a patient has AD. Our results demonstrate that this definition leads not only to much higher prognostic accuracy than by-CDR conversion, but also to subpopulations more consistent with known AD biomarkers (including CSF markers). We also identify key prognostic brain region biomarkers.
PLoS ONE 01/2011; 6(10):e25074. · 4.09 Impact Factor
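The two-stage idea in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions: a toy nearest-centroid classifier stands in for whatever classifier the authors actually used, the two-dimensional feature vectors are made up, and the names `NearestCentroid` and `label_converters` are hypothetical.

```python
from statistics import fmean


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [fmean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class NearestCentroid:
    """Toy classifier (an assumption; a stand-in for the paper's classifiers)."""

    def fit(self, X, y):
        self.centroids = {
            label: centroid([x for x, t in zip(X, y) if t == label])
            for label in set(y)
        }
        return self

    def predict(self, x):
        return min(self.centroids, key=lambda label: sq_dist(x, self.centroids[label]))


def label_converters(ad_control_clf, followups):
    """An MCI patient is a converter if ANY follow-up scan is classified 'AD'."""
    return [
        "converter"
        if any(ad_control_clf.predict(scan) == "AD" for scan in scans)
        else "nonconverter"
        for scans in followups
    ]


# Stage 1: Control-AD classifier trained on labeled examples (made-up features).
labeled_X = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.8, 1.0]]
labeled_y = ["Control", "Control", "AD", "AD"]
ad_control = NearestCentroid().fit(labeled_X, labeled_y)

# MCI patients: one baseline scan each, plus one or more follow-up scans.
mci_baseline = [[0.4, 0.5], [0.2, 0.2], [0.45, 0.4], [0.1, 0.3]]
mci_followups = [
    [[0.45, 0.5], [0.8, 0.9]],
    [[0.2, 0.3], [0.3, 0.2]],
    [[0.7, 0.7]],
    [[0.1, 0.2]],
]
mci_labels = label_converters(ad_control, mci_followups)

# Stage 2: prognostic classifier trained on BASELINE scans with derived labels.
prognostic = NearestCentroid().fit(mci_baseline, mci_labels)
new_patient_prediction = prognostic.predict([0.5, 0.5])
```

The second classifier never sees follow-up scans at prediction time. It only learns, from baseline features, to anticipate what the Control-AD classifier will later say.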
-
ABSTRACT: Feature selection for classification in high-dimensional spaces can improve generalization, reduce classifier complexity, and identify important, discriminating feature "markers." For support vector machine (SVM) classification, a widely used technique is recursive feature elimination (RFE). We demonstrate that RFE is not consistent with margin maximization, central to the SVM learning approach. We thus propose explicit margin-based feature elimination (MFE) for SVMs and demonstrate both improved margin and improved generalization, compared with RFE. Moreover, for the case of a nonlinear kernel, we show that RFE assumes that the squared weight vector 2-norm is strictly decreasing as features are eliminated. We demonstrate this is not true for the Gaussian kernel and, consequently, RFE may give poor results in this case. MFE for nonlinear kernels gives better margin and generalization. We also present an extension which achieves further margin gains, by optimizing only two degrees of freedom--the hyperplane's intercept and its squared 2-norm--with the weight vector orientation fixed. We finally introduce an extension that allows margin slackness. We compare against several alternatives, including RFE and a linear programming method that embeds feature selection within the classifier design. On high-dimensional gene microarray data sets, University of California at Irvine (UCI) repository data sets, and Alzheimer's disease brain image data, MFE methods give promising results.
IEEE Transactions on Neural Networks 02/2010; 21(5):701-17. · 2.95 Impact Factor
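A small sketch can show how a weight-magnitude criterion and a margin criterion may disagree in a single elimination step for a linear classifier. It rests on assumptions: the weights `w` and intercept `b` are hypothetical values rather than a trained SVM, the data are made up, and the margin rule here (drop the feature whose removal leaves the largest margin, without retraining) is one reading of the abstract, not necessarily the authors' exact procedure.

```python
import math


def margin(w, b, X, y):
    """Smallest signed distance of any sample to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(v * v for v in w))
    scores = (t * (sum(wi * xi for wi, xi in zip(w, x)) + b) for x, t in zip(X, y))
    return min(scores) / norm


def drop(vec, m):
    """Copy of vec with component m removed."""
    return [v for j, v in enumerate(vec) if j != m]


def rfe_choice(w):
    """RFE-style step: eliminate the feature with the smallest squared weight."""
    return min(range(len(w)), key=lambda j: w[j] ** 2)


def mfe_choice(w, b, X, y):
    """Margin-based step (assumed form): eliminate the feature whose removal
    leaves the largest margin, keeping the remaining weights fixed."""
    return max(
        range(len(w)),
        key=lambda m: margin(drop(w, m), b, [drop(x, m) for x in X], y),
    )


# Hypothetical linear classifier and toy data: feature 0 has the larger weight
# but separates the classes poorly; feature 1 separates them well.
w, b = [1.0, 0.6], 0.0
X = [[0.2, 2.0], [2.0, 2.0], [-0.2, -2.0], [-2.0, -2.0]]
y = [1, 1, -1, -1]

rfe_pick = rfe_choice(w)
mfe_pick = mfe_choice(w, b, X, y)
```

On this toy input the two rules pick different features: the weight-based rule drops feature 1, while the margin-based rule drops feature 0 because the remaining feature then separates the classes with a wider margin.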
-
ABSTRACT: This paper describes a Software Tool for Automated MRI Post-processing (STAMP) of multiple types of brain MRIs on a workstation and for parallel processing on a supercomputer (STAMPS). This software tool enables the automation of nonlinear registration for a large image set and for multiple MR image types. The tool uses standard brain MRI post-processing tools (such as SPM, FSL, and HAMMER) for multiple MR image types in a pipeline fashion. It also contains novel MRI post-processing features. The STAMP image outputs can be used to perform brain analysis using Statistical Parametric Mapping (SPM) or single-/multi-image modality brain analysis using Support Vector Machines (SVMs). Since STAMPS is PBS-based, the supercomputer may be a multi-node computer cluster or one of the latest multi-core computers.
Computer methods and programs in biomedicine 05/2009; 95(2):146-57. · 1.14 Impact Factor
-
ABSTRACT: Feature selection for classification working in high-dimensional feature spaces can improve generalization accuracy, reduce classifier complexity, and is also useful for identifying the important feature "markers", e.g., biomarkers in a bioinformatics or biomedical context. For support vector machine (SVM) classification, a widely used feature selection technique is recursive feature elimination (RFE). In recent work, we demonstrated that the RFE objective is not generally consistent with the margin maximization objective that is central to the SVM learning approach. We thus proposed explicit margin-based feature elimination (MFE) for SVMs and demonstrated both improved margin and improved generalization accuracy, compared with RFE for the case of linear SVMs. In this paper, after reviewing MFE, we first introduce an extension which achieves further gains in margin at small computational cost. This extension solves the SVM optimization problem to maximize the classifier's margin at each feature elimination step, albeit in a lightweight fashion by optimizing only two degrees of freedom – the weight vector's slope and intercept. We next consider the case of a nonlinear kernel. We show that RFE defined for the nonlinear kernel case assumes that the weight vector length is strictly decreasing as features are eliminated. We demonstrate experimentally that this assumption is not in general valid for the Gaussian kernel and that, consequently, RFE may give poor results in this case. An extension of MFE for the nonlinear kernel case gives both better margin and generalization accuracy. This approach may help nonlinear kernel SVMs to avoid overfitting and, thus, to achieve better results than linear SVMs in some high-dimensional domains where use of nonlinear kernels has not to date been found very favorable.