Conference Paper

A Kernel Statistical Test of Independence.

Conference: Advances in Neural Information Processing Systems 20, Proceedings of the Twenty-First Annual Conference on Neural Information Processing Systems, Vancouver, British Columbia, Canada, December 3-6, 2007
Source: DBLP
  • Source
    ABSTRACT: In this article we study the generalization abilities of several classifiers of support vector machine (SVM) type using a certain class of kernels that we call universal. It is shown that the soft margin algorithms with universal kernels are consistent for a large class of classification problems, including some kinds of noisy tasks, provided that the regularization parameter is chosen well. In particular, we derive a simple sufficient condition for this parameter in the case of Gaussian RBF kernels. On the one hand, our considerations are based on an investigation of an approximation property of the kernels used, the so-called universality, which ensures that all continuous functions can be approximated by certain kernel expressions. This approximation property also gives a new insight into the role of kernels in these and other algorithms. On the other hand, the results are achieved by a precise study of the underlying optimization problems of the classifiers. Furthermore, we show consistency for the maximal margin classifier as well as for the soft margin SVMs in the presence of large margins. In this case it turns out that constant regularization parameters also ensure consistency for the soft margin SVMs. Finally, we prove that even for simple, noise-free classification problems, SVMs with polynomial kernels can behave arbitrarily badly.
  • Source
    ABSTRACT: We introduce a framework for filtering features that employs the Hilbert-Schmidt Independence Criterion (HSIC) as a measure of dependence between the features and the labels. The key idea is that good features should maximise such dependence. Feature selection for various supervised learning problems (including classification and regression) is unified under this framework, and the solutions can be approximated using a backward-elimination algorithm. We demonstrate the usefulness of our method on both artificial and real world datasets.
    CoRR. 01/2007; abs/0704.2668.
  • Source
    ABSTRACT: This paper considers estimates of multidimensional density functions based on a bounded and bandlimited weight function. The asymptotic behavior of quadratic functions of density function estimates useful in setting up a test of goodness of fit of the density function is determined. A test of independence is also given. The methods use a Poissonization of sample size. The estimates considered are appropriate when one is interested in estimating density functions or in determining local deviations from a given density function.
    The Annals of Statistics 01/1975; · 2.44 Impact Factor
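
The second abstract above uses the Hilbert-Schmidt Independence Criterion (HSIC) as a dependence measure. As a rough illustration only, the sketch below computes one commonly used biased empirical estimate, trace(KHLH) / n^2, where K and L are Gram matrices on the two samples and H is the centering matrix. The Gaussian kernel, the bandwidths, the 1/n^2 normalization, and the function names `gaussian_gram` and `hsic_biased` are assumptions made for this example and are not taken from the listed papers.

```python
import numpy as np


def gaussian_gram(x, sigma=1.0):
    """Gaussian (RBF) Gram matrix for samples x of shape (n,) or (n, d)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    sq = np.sum(x ** 2, axis=1)
    # Pairwise squared distances, clipped at zero to absorb round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (x @ x.T), 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def hsic_biased(x, y, sigma_x=1.0, sigma_y=1.0):
    """Biased empirical HSIC: trace(K H L H) / n^2 (assumed normalization)."""
    K = gaussian_gram(x, sigma_x)
    L = gaussian_gram(y, sigma_y)
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H)) / n ** 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=200)
    y_dependent = x + 0.1 * rng.normal(size=200)
    y_independent = rng.normal(size=200)
    print("dependent:  ", hsic_biased(x, y_dependent))
    print("independent:", hsic_biased(x, y_independent))
```

On samples like these, the dependent pair should score noticeably higher than the independent pair. Turning the statistic into a test additionally requires a null distribution or threshold, which this sketch does not attempt.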