Conference Paper
Kernel Choice and Classifiability for RKHS Embeddings of Probability Distributions.
Conference: Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, Vancouver, British Columbia, Canada.
Source: DBLP

ABSTRACT: Let P be a distribution with support S. The salient features of S can be quantified with persistent homology, which summarizes topological features of the sublevel sets of the distance function (the distance of any point x to S). Given a sample from P, we can infer the persistent homology using an empirical version of the distance function. However, the empirical distance function is highly non-robust to noise and outliers. Even one outlier is deadly. The distance-to-a-measure (DTM), introduced by Chazal et al. (2011), and the kernel distance, introduced by Phillips et al. (2014), are smooth functions that provide useful topological information but are robust to noise and outliers. Chazal et al. (2014) derived concentration bounds for DTM. Building on these results, we derive limiting distributions and confidence sets, and we propose a method for choosing tuning parameters.
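As a concrete illustration (a minimal sketch, not the authors' code), the empirical distance function and the DTM can be written in a few lines of NumPy. The mass parameter m, which fixes the number k of nearest neighbors averaged over, is the kind of tuning parameter the abstract's method would choose; function names here are our own.

```python
import numpy as np

def empirical_distance(x, sample):
    """Distance from a query point x to the empirical support:
    the distance to the nearest sample point.  One outlier in the
    sample can change this function arbitrarily far from it."""
    return np.min(np.linalg.norm(sample - x, axis=1))

def dtm(x, sample, m=0.1):
    """Distance-to-a-measure (Chazal et al., 2011): root mean squared
    distance from x to its k = ceil(m * n) nearest sample points.
    Averaging over a mass fraction m makes it robust to outliers."""
    n = len(sample)
    k = max(1, int(np.ceil(m * n)))
    d2 = np.sum((sample - x) ** 2, axis=1)   # squared distances to all points
    knn = np.partition(d2, k - 1)[:k]        # k smallest squared distances
    return np.sqrt(knn.mean())
```

With an outlier planted at (100, 100), the empirical distance at the outlier is 0 (the outlier "creates" support there), while the DTM stays large, illustrating the robustness the abstract describes.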
Conference Paper: Optimal kernel choice for large-scale two-sample tests
ABSTRACT: Given samples from distributions p and q, a two-sample test determines whether to reject the null hypothesis that p = q, based on the value of a test statistic measuring the distance between the samples. One choice of test statistic is the maximum mean discrepancy (MMD), which is a distance between embeddings of the probability distributions in a reproducing kernel Hilbert space. The kernel used in obtaining these embeddings is critical in ensuring the test has high power, and correctly distinguishes unlike distributions with high probability. A means of parameter selection for the two-sample test based on the MMD is proposed. For a given test level (an upper bound on the probability of making a Type I error), the kernel is chosen so as to maximize the test power, and minimize the probability of making a Type II error. The test statistic, test threshold, and optimization over the kernel parameters are obtained with cost linear in the sample size. These properties make the kernel selection and test procedures suited to data streams, where the observations cannot all be stored in memory.In experiments, the new kernel selection approach yields a more powerful test than earlier kernel selection heuristics.
Advances in Neural Information Processing Systems 25 (NIPS 2012); 01/2012
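A hedged sketch of the linear-time MMD^2 statistic the abstract alludes to: it averages a kernel-difference term h over disjoint pairs of observations, so each point is touched once and nothing need be stored. The Gaussian kernel and the function names are our assumptions; the paper's contribution is choosing the kernel (e.g. the bandwidth sigma below) to maximize test power.

```python
import numpy as np

def gauss_k(a, b, sigma=1.0):
    """Gaussian (RBF) kernel, evaluated rowwise on paired samples."""
    return np.exp(-np.sum((a - b) ** 2, axis=-1) / (2 * sigma ** 2))

def mmd2_linear(X, Y, sigma=1.0):
    """Linear-time unbiased estimate of MMD^2: average
    h = k(x1,x2) + k(y1,y2) - k(x1,y2) - k(x2,y1)
    over disjoint pairs, one streaming pass over the data."""
    n = (min(len(X), len(Y)) // 2) * 2      # use an even number of points
    X, Y = X[:n], Y[:n]
    x1, x2 = X[0::2], X[1::2]               # disjoint pairs from X
    y1, y2 = Y[0::2], Y[1::2]               # disjoint pairs from Y
    h = (gauss_k(x1, x2, sigma) + gauss_k(y1, y2, sigma)
         - gauss_k(x1, y2, sigma) - gauss_k(x2, y1, sigma))
    return h.mean()
```

Under the null (p = q) the statistic concentrates near zero; for well-separated distributions it is clearly positive, which is what the test thresholds against.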
ABSTRACT: Recently, domain adaptation learning (DAL) has shown surprising performance by utilizing labeled samples from the source (or auxiliary) domain to learn a robust classifier for the target domain of interest, which has few or even no labeled samples. In this paper, by incorporating the classical graph-based transductive SSL paradigm, a novel DAL method is proposed based on a sparse graph constructed via kernel sparse representation of the data in an optimal reproducing kernel Hilbert space (RKHS) recovered by minimizing the inter-domain distribution discrepancy. Our method, named Sparsity regularization Label Propagation for Domain Adaptation Learning (SLPDAL), propagates the labels of the labeled data from both domains to the unlabeled data in the target domain through their sparse reconstructions, with sufficient smoothness, in three steps: (1) an optimal RKHS is first recovered so as to minimize the distribution discrepancy between the two domains; (2) the best kernel sparse reconstruction coefficients are then computed for each data point in both domains by l1-norm minimization in the RKHS, thus constructing a sparse graph; and (3) the labels of the labeled data from both domains are finally propagated to the unlabeled points in the target domain over the sparse graph under our proposed sparsity regularization framework, in which it is assumed that the label of each data point can be sparsely reconstructed from those of the other data points in both domains. Furthermore, based on the proposed sparsity regularization framework, an easy way is derived to extend SLPDAL to out-of-sample data. Promising experimental results have been obtained on a series of toy datasets and several real-world datasets covering faces, visual video and text.
Neurocomputing 09/2014; 139:202-219. DOI:10.1016/j.neucom.2014.02.044
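Step (3) resembles classical graph-based label propagation. A minimal sketch in the style of Zhou et al.'s iterative scheme is below, assuming an affinity matrix W has already been built; it is our illustration, not SLPDAL itself, which would build W from the kernel sparse codes of step (2) and add the sparsity regularization.

```python
import numpy as np

def propagate_labels(W, Y, alpha=0.99, iters=200):
    """Iterative label propagation over an affinity matrix W.
    Y holds one-hot rows for labeled points and zero rows for
    unlabeled ones; alpha trades off graph smoothness against
    fidelity to the initial labels."""
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))          # symmetric normalization D^-1/2 W D^-1/2
    F = Y.copy().astype(float)
    for _ in range(iters):
        F = alpha * S @ F + (1 - alpha) * Y  # diffuse, then re-inject known labels
    return F.argmax(axis=1)                  # predicted class per point
```

On a toy graph with two tight clusters joined by weak links, labeling one point per cluster is enough for the propagation to label the rest correctly.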