Article
Combined SVM-Based Feature Selection and Classification.
Machine Learning (Impact Factor: 1.47). 01/2005; 61:129–150. DOI: 10.1007/s10994-005-1505-9
Source: DBLP

Article: White box radial basis function classifiers with component selection for clinical prediction models.
ABSTRACT: To propose a new flexible and sparse classifier that results in interpretable decision support systems. Support vector machines (SVMs) for classification are very powerful methods to obtain classifiers for complex problems. Although the performance of these methods is consistently high, and nonlinearities and interactions between variables can be handled efficiently when using nonlinear kernels such as the radial basis function (RBF) kernel, their use in domains where interpretability is an issue is hampered by their lack of transparency. Many feature selection algorithms have been developed to allow for some interpretation, but the impact of the different input variables on the prediction still remains unclear. Alternative models using additive kernels are restricted to main effects, reducing their usefulness in many applications. This paper proposes a new approach to expand the RBF kernel into interpretable and visualizable components, including main and two-way interaction effects. In order to obtain a sparse model representation, an iterative l1-regularized parametric model using the interpretable components as inputs is proposed. Results on toy problems illustrate the ability of the method to select the correct contributions and an improved performance over standard RBF classifiers in the presence of irrelevant input variables. For a 10-dimensional XOR problem, an SVM using the standard RBF kernel obtains an area under the receiver operating characteristic curve (AUC) of 0.947, whereas the proposed method achieves an AUC of 0.997. The latter additionally identifies the relevant components. In a second 10-dimensional artificial problem, the underlying class probability follows a logistic regression model. An SVM with the RBF kernel results in an AUC of 0.975, as opposed to 0.994 for the presented method. The proposed method is applied to two benchmark datasets: the Pima Indian diabetes and the Wisconsin Breast Cancer dataset. In both cases the AUC is comparable to that of the standard method (0.826 versus 0.826 and 0.990 versus 0.996) and to those reported in the literature. The selected components are consistent with different approaches reported in other work. However, this method is able to visualize the effect of each of the components, allowing for interpretation of the learned logic by experts in the application domain. This work proposes a new method to obtain flexible and sparse risk prediction models. The proposed method performs as well as a support vector machine using the standard RBF kernel, but has the additional advantage that the resulting model can be interpreted by experts in the application domain.
Artificial Intelligence in Medicine 10/2013; · 1.65 Impact Factor
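The AUC figures quoted in this abstract can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. A minimal sketch of that rank-based computation follows; the function name `auc` and the toy scores in the usage note are illustrative assumptions and are not taken from the paper.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (labels are 0 or 1).

    Counts, over all (positive, negative) pairs, how often the positive
    case is scored higher; a tie contributes one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

For example, `auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])` compares four pairs, of which three are ordered correctly. The quadratic loop is kept for clarity; a sort-based version would be preferable for large datasets.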
ABSTRACT: This paper considers a class of feature-selecting support vector machines (SVMs) based on Lq-norm regularization, where q ∈ (0,1). The standard SVM [Vapnik, V., 1995. The Nature of Statistical Learning Theory. Springer, NY.] minimizes the hinge loss function subject to the L2-norm penalty. Recently, the L1-norm SVM (L1-SVM) [Bradley, P., Mangasarian, O., 1998. Feature selection via concave minimization and support vector machines. In: Machine Learning Proceedings of the Fifteenth International Conference (ICML98). Citeseer, pp. 82–90.] was suggested for feature selection and has gained great popularity since its introduction. L0-norm penalization would result in more powerful sparsification, but the exact solution is NP-hard. This raises the question of whether fractional-norm (Lq for q between 0 and 1) penalization can yield benefits over the existing L1 and approximated L0 approaches for SVMs. The major obstacle to answering this is that the resulting objective functions are nonconvex. This paper addresses the difficult optimization problems of the fractional-norm SVM by introducing a new algorithm based on Difference of Convex functions (DC) programming techniques [Pham Dinh, T., Le Thi, H., 1998. A DC optimization algorithm for solving the trust-region subproblem. SIAM J. Optim. 8 (2), 476–505. Le Thi, H., Pham Dinh, T., 2008. A continuous approach for the concave cost supply problem via DC programming and DCA. Discrete Appl. Math. 156 (3), 325–338.], which efficiently solves a reweighted L1-SVM problem at each iteration. Numerical results on seven real-world biomedical datasets support the effectiveness of the proposed approach compared to other commonly used sparse SVM methods, including L1-SVM and recent approximated L0-SVM approaches.
Computational Statistics & Data Analysis 11/2013; 67:136–148. · 1.30 Impact Factor
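The reweighting idea mentioned in this abstract can be illustrated with the standard tangent-line bound on a concave penalty. The sketch below is not the paper's DC algorithm: it assumes a smoothed penalty, the sum over j of (|w_j| + eps)^q, and only shows how linearizing it at the current iterate yields per-feature weights for a weighted L1 term. The names `lq_penalty`, `l1_weights`, `linearized_penalty` and the smoothing constant `eps` are hypothetical.

```python
def lq_penalty(w, q, eps=1e-3):
    """Smoothed fractional-norm penalty: sum_j (|w_j| + eps)**q, with 0 < q < 1."""
    return sum((abs(wj) + eps) ** q for wj in w)


def l1_weights(w_k, q, eps=1e-3):
    """Slopes of the penalty at the current iterate w_k.

    Because t -> (t + eps)**q is concave for t >= 0, its tangent at |w_k_j|
    lies above it, so these slopes act as weights of a weighted L1 penalty.
    Coefficients near zero receive large weights and are pushed to stay at zero.
    """
    return [q * (abs(wj) + eps) ** (q - 1.0) for wj in w_k]


def linearized_penalty(w, w_k, q, eps=1e-3):
    """Tangent (weighted L1) upper bound on lq_penalty, built at w_k."""
    c = l1_weights(w_k, q, eps)
    const = lq_penalty(w_k, q, eps) - sum(cj * abs(v) for cj, v in zip(c, w_k))
    return const + sum(cj * abs(v) for cj, v in zip(c, w))
```

In a full iteration, a weighted L1-SVM would be solved with these weights and the weights then recomputed at the new solution; that solver step is omitted here because it requires a linear programming routine.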
ABSTRACT: Plenty of feature selection methods are available in the literature, owing to the availability of data with hundreds of variables and hence very high dimensionality. Feature selection methods provide a way of reducing computation time, improving prediction performance, and gaining a better understanding of the data in machine learning or pattern recognition applications. In this paper we provide an overview of some of the methods present in the literature. The objective is to provide a generic introduction to variable elimination which can be applied to a wide array of machine learning problems. We focus on Filter, Wrapper and Embedded methods. We also apply some of the feature selection techniques on standard datasets to demonstrate the applicability of feature selection techniques.
Computers & Electrical Engineering 01/2014; 40(1):16–28. · 0.93 Impact Factor
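Of the three families named in this abstract, filter methods are the simplest to sketch: each feature is scored independently of any classifier, and the top-scoring features are kept. The example below uses the absolute Pearson correlation with the label as the score, which is one common choice and an assumption here rather than the survey's specific recommendation; `pearson` and `rank_features` are hypothetical names.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def rank_features(X, y):
    """Return feature indices ordered from most to least correlated with y.

    X is a list of rows (one list of feature values per sample).
    """
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)
```

A wrapper method would instead retrain a classifier on candidate feature subsets, and an embedded method would obtain the selection from the training objective itself, for instance through a sparsity-inducing penalty.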