Article

The generalization performance of ERM algorithm with strongly mixing observations

Machine Learning (Impact Factor: 1.69). 06/2009; 75(3):275-295. DOI: 10.1007/s10994-009-5104-z
Source: DBLP

ABSTRACT Generalization performance is the central concern of theoretical research in machine learning. Most previous bounds describing the generalization ability of the Empirical Risk Minimization (ERM) algorithm are based on independent and identically distributed (i.i.d.) samples. To study the generalization performance of the ERM algorithm with dependent observations, we first establish an exponential bound on the rate of relative uniform convergence of the ERM algorithm with exponentially strongly mixing observations; we then derive generalization bounds and prove that the ERM algorithm with such observations is consistent. The main results not only extend previously known results for i.i.d. observations to the case of exponentially strongly mixing observations, but also improve previous results for strongly mixing samples. Because the ERM algorithm is usually time-consuming and prone to overfitting when the complexity of the hypothesis space is high, as an application of our main results we also explore a new strategy for implementing the ERM algorithm in a high-complexity hypothesis space.
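As a concrete reference point, here is a minimal sketch of the ERM principle the paper analyzes: over a (here, finite) hypothesis space, return the hypothesis with the smallest empirical risk on the sample. The hypothesis space, loss, and data below are illustrative placeholders, not constructions from the paper; the exhaustive search also shows why ERM becomes expensive as the hypothesis space grows, which motivates the implementation strategy mentioned above.

    import numpy as np

    def empirical_risk(h, X, y, loss):
        # Average loss of hypothesis h on the sample (i.i.d. or mixing alike).
        return np.mean([loss(h(x), t) for x, t in zip(X, y)])

    def erm(hypotheses, X, y, loss):
        # ERM by exhaustive search: pick the empirical-risk minimizer.
        # Cost grows with the size of the hypothesis space, which is why
        # ERM is slow in high-complexity hypothesis spaces.
        risks = [empirical_risk(h, X, y, loss) for h in hypotheses]
        return hypotheses[int(np.argmin(risks))]

    # Illustrative use: 1-D threshold classifiers under 0-1 loss.
    rng = np.random.default_rng(0)
    X = rng.uniform(0, 1, size=200)
    y = (X > 0.6).astype(int)
    hypotheses = [lambda x, t=t: int(x > t) for t in np.linspace(0, 1, 101)]
    best = erm(hypotheses, X, y, loss=lambda p, t: float(p != t))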

Full-text available from: Luoqing Li, Feb 27, 2014
  • ABSTRACT: Previous work studying the generalization ability of the support vector machine classification (SVMC) algorithm is usually based on the assumption of independent and identically distributed samples. In this paper, we go beyond this classical framework by studying the generalization ability of SVMC based on uniformly ergodic Markov chain (u.e.M.c.) samples. We analyze the excess misclassification error of SVMC based on u.e.M.c. samples and obtain the optimal learning rate of SVMC for such samples. We also introduce a new Markov sampling algorithm for SVMC that generates u.e.M.c. samples from a given dataset (a hedged sketch of this idea appears after this list), and we present numerical studies of the learning performance of SVMC based on Markov sampling on benchmark datasets. These studies show that SVMC based on Markov sampling not only generalizes better as the number of training samples grows, but also yields sparser classifiers when the dataset is large relative to the input dimension.
    IEEE Transactions on Neural Networks and Learning Systems 08/2014; 26(3). DOI: 10.1109/TCYB.2014.2346536 · 4.37 Impact Factor
  • ABSTRACT: Markov sampling is a natural sampling mechanism used extensively in applications, especially in the study of time series, content-based pattern recognition, and biological sequence analysis. In this paper, we generalize the study of the learning performance of the support vector machine classification (SVMC) algorithm with Markov chain samples, previously based on linear prediction models, to the case of the Gaussian kernel (a baseline Gaussian-kernel setup appears after this list). We present numerical studies of the learning performance of the Gaussian kernel SVMC algorithm based on Markov chain samples on benchmark repository datasets.
    2013 9th International Conference on Natural Computation (ICNC); 07/2013
  • ABSTRACT: This paper studies the generalization performance of radial basis function (RBF) networks using local Rademacher complexities. We propose a general result on controlling local Rademacher complexities with the L1-metric capacity. We then apply this result to estimate the complexities of RBF networks, from which a novel estimation error bound is obtained. An effective approximation error bound is also derived by carefully investigating the Hölder continuity of the lp loss function's derivative. Furthermore, we demonstrate that the RBF network minimizing an appropriately constructed structural risk admits a significantly better learning rate than existing results. An empirical study justifies the application of our structural risk in model selection (a toy Monte Carlo estimate of empirical Rademacher complexity appears after this list).
    IEEE Transactions on Neural Networks and Learning Systems 03/2015; 26(3):551-64. DOI: 10.1109/TNNLS.2014.2320280 · 4.37 Impact Factor
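On the Markov sampling algorithm mentioned in the first abstract above: one plausible reading is a Metropolis-style accept/reject walk over the dataset, in which a candidate example is accepted with probability governed by its loss under a preliminary classifier. The acceptance rule and the clf_loss interface below are my assumptions for illustration, not a verbatim transcription of the authors' algorithm.

    import numpy as np

    def markov_sample(X, y, clf_loss, n_samples, seed=0):
        # Draw a Markov chain of training examples from (X, y).
        # Assumed Metropolis-like rule: accept a candidate with probability
        # exp(-clf_loss(candidate)) / exp(-clf_loss(current)), so examples on
        # which a preliminary classifier has low loss are visited more often.
        rng = np.random.default_rng(seed)
        current = int(rng.integers(len(X)))
        chain = [current]
        while len(chain) < n_samples:
            cand = int(rng.integers(len(X)))
            accept = min(1.0, float(np.exp(clf_loss(current) - clf_loss(cand))))
            if rng.random() < accept:
                current = cand
            chain.append(current)  # rejected moves repeat the current state
        return X[chain], y[chain]

Under this rule the sampled indices form a Markov chain on the finite dataset; whether it satisfies the uniform ergodicity conditions studied in the paper depends on details not reconstructed here.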
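For the Gaussian-kernel SVMC in the second abstract, a standard baseline (independent of any particular sampling scheme) can be set up with scikit-learn's RBF-kernel SVC. The synthetic dataset and hyperparameter values are placeholders, not the benchmark settings from the paper.

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    # Synthetic stand-in for a benchmark dataset.
    X, y = make_classification(n_samples=500, n_features=10, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

    # Gaussian (RBF) kernel: k(x, x') = exp(-gamma * ||x - x'||^2).
    clf = SVC(kernel="rbf", gamma=0.1, C=1.0).fit(X_tr, y_tr)
    print("test accuracy:", clf.score(X_te, y_te))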
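Finally, the local Rademacher complexity machinery in the third abstract has an empirical counterpart that is easy to estimate by Monte Carlo: draw random sign vectors and measure how well the function class can correlate with them. The finite random function class below is a toy stand-in, not the RBF network class analyzed in the paper.

    import numpy as np

    def empirical_rademacher(function_values, n_trials=1000, seed=0):
        # Monte Carlo estimate of E_sigma[ sup_f (1/n) sum_i sigma_i f(x_i) ],
        # where function_values is an (m, n) array of m functions evaluated
        # at n sample points and sigma ranges over {-1, +1}^n.
        rng = np.random.default_rng(seed)
        m, n = function_values.shape
        total = 0.0
        for _ in range(n_trials):
            sigma = rng.choice([-1.0, 1.0], size=n)
            total += float(np.max(function_values @ sigma)) / n
        return total / n_trials

    # Toy class: 50 random bounded functions evaluated at 100 points.
    vals = np.random.default_rng(1).uniform(-1.0, 1.0, size=(50, 100))
    print(empirical_rademacher(vals))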