Conference Paper

Sparse Kernel Logistic Regression using Incremental Feature Selection for Text-Independent Speaker Identification

IESK, Magdeburg Univ.
DOI: 10.1109/ODYSSEY.2006.248115 Conference: Speaker and Language Recognition Workshop, 2006. IEEE Odyssey 2006: The
Source: IEEE Xplore


Logistic regression is a well known classification method in the field of statistical learning. Recently, a kernelized version of logistic regression has become very popular, because it allows non-linear probabilistic classification and shows promising results on several benchmark problems. In this paper we show that kernel logistic regression (KLR) and especially its sparse extensions (SKLR) are useful alternatives to standard Gaussian mixture models (GMMs) and support vector machines (SVMs) in Speaker recognition. While the classification results of KLR and SKLR are similar to the results of SVMs, we show that SKLR produces highly sparse models. Unlike SVMs the kernel logistic regression also provides an estimate of the conditional probability of class membership. In speaker identification experiments the SKLR methods outperform the SVM and the GMM baseline system on the POLY-COST database

Download full-text


Available from: Andreas Wendemuth
  • Source
    • "There is also ongoing research exploring the application of LR in speaker verification. Kernel LR is proposed in [48], [49] for an MFCC frame-level, feature-based speaker verification solution. i-Vector-based LR was recently investigated for the telephone-telephone condition of SRE2010 [12]. "
    [Show abstract] [Hide abstract]
    ABSTRACT: This study aims to explore the case of robust speaker recognition with multi-session enrollments and noise, with an emphasis on optimal organization and utilization of speaker information presented in the enrollment and development data. This study has two core objectives. First, we investigate more robust back-ends to address noisy multi-session enrollment data for speaker recognition. This task is achieved by proposing novel back-end algorithms. Second, we construct a highly discriminative speaker verification framework. This task is achieved through intrinsic and extrinsic back-end algorithm modification, resulting in complementary sub-systems. Evaluation of the proposed framework is performed on the NIST SRE2012 corpus. Results not only confirm individual sub-system advancements over an established baseline, the final grand fusion solution also represents a comprehensive overall advancement for the NIST SRE2012 core tasks. Compared with state-of-the-art SID systems on the NIST SRE2012, the novel parts of this study are: 1) exploring a more diverse set of solutions for low-dimensional i-Vector based modeling; and 2) diversifying the information configuration before modeling. All these two parts work together, resulting in very competitive performance with reasonable computational cost.
    Full-text · Article · Aug 2014 · IEEE/ACM Transactions on Audio, Speech, and Language Processing
  • [Show abstract] [Hide abstract]
    ABSTRACT: The sparsity driven classification technologies have attracted much attention in recent years, due to their capability of providing more compressive representations and clear interpretation. Two most popular classification approaches are support vector machines (SVMs) and kernel logistic regression (KLR), each having its own advantages. The sparsification of SVM has been well studied, and many sparse versions of 2-norm SVM, such as 1-norm SVM (1-SVM), have been developed. But, the sparsification of KLR has been less studied. The existing sparsification of KLR is mainly based on L 1 norm and L 2 norm penalties, which leads to the sparse versions that yield solutions not so sparse as it should be. A very recent study on L 1/2 regularization theory in compressive sensing shows that L 1/2 sparse modeling can yield solutions more sparse than those of 1 norm and 2 norm, and, furthermore, the model can be efficiently solved by a simple iterative thresholding procedure. The objective function dealt with in L 1/2 regularization theory is, however, of square form, the gradient of which is linear in its variables (such an objective function is the so-called linear gradient function). In this paper, through extending the linear gradient function of L 1/2 regularization framework to the logistic function, we propose a novel sparse version of KLR, the 1/2 quasi-norm kernel logistic regression (1/2-KLR). The version integrates advantages of KLR and L 1/2 regularization, and defines an efficient implementation scheme of sparse KLR. We suggest a fast iterative thresholding algorithm for 1/2-KLR and prove its convergence. We provide a series of simulations to demonstrate that 1/2-KLR can often obtain more sparse solutions than the existing sparsity driven versions of KLR, at the same or better accuracy level. The conclusion is also true even in comparison with sparse SVMs (1-SVM and 2-SVM). We show an exclusive advantage of 1/2-KLR that the regularization parameter in the algorithm can be adaptively set whenever the sparsity (correspondingly, the number of support vectors) is given, which suggests a methodology of comparing sparsity promotion capability of different sparsity driven classifiers. As an illustration of benefits of 1/2-KLR, we give two applications of 1/2-KLR in semi-supervised learning, showing that 1/2-KLR can be successfully applied to the classification tasks in which only a few data are labeled.
    No preview · Article · Apr 2012 · Sciece China. Information Sciences
  • [Show abstract] [Hide abstract]
    ABSTRACT: Hidden Markov models (HMMs) are powerful generative models for sequential data that have been used in automatic speech recognition for more than two decades. Despite their popularity, HMMs make inaccurate assumptions about speech signals, thereby limiting the achievable performance of the conventional speech recognizer. Penalized logistic regression (PLR) is a well-founded discriminative classifier with long roots in the history of statistics. Its classification performance is often compared with that of the popular support vector machine (SVM). However, for speech classification, only limited success with PLR has been reported, partially due to the difficulty with sequential data. In this paper, we present an elegant way of incorporating HMMs in the PLR framework. This leads to a powerful discriminative classifier that naturally handles sequential data. In this approach, speech classification is done using affine combinations of HMM log-likelihoods. We believe that such combinations of HMMs lead to a more accurate classifier than the conventional HMM-based classifier. Unlike similar approaches, we jointly estimate the HMM parameters and the PLR parameters using a single training criterion. The extension to continuous speech recognition is done via rescoring of N-best lists or lattices.
    No preview · Article · Sep 2010 · IEEE Transactions on Audio Speech and Language Processing
Show more