Conference Proceeding
Efficient speaker identification using distributional speaker model clustering
Klipsch Sch. of Electr. & Comput. Eng., New Mexico State Univ., Las Cruces, NM
11/2008;
DOI:10.1109/ACSSC.2008.5074619
pp. 1260-1264. In proceedings of: Signals, Systems and Computers, 2008 42nd Asilomar Conference on
Source: IEEE Xplore
Citations (12)
Article: Robust text-independent speaker identification using Gaussian mixture speaker models
ABSTRACT: This paper introduces and motivates the use of Gaussian mixture models (GMM) for robust text-independent speaker identification. The individual Gaussian components of a GMM are shown to represent some general speaker-dependent spectral shapes that are effective for modeling speaker identity. The focus of this work is on applications which require high identification rates using short utterances from unconstrained conversational speech and robustness to degradations produced by transmission over a telephone channel. A complete experimental evaluation of the Gaussian mixture speaker model is conducted on a 49 speaker, conversational telephone speech database. The experiments examine algorithmic issues (initialization, variance limiting, model order selection), spectral variability robustness techniques, large population performance, and comparisons to other speaker modeling techniques (uni-modal Gaussian, VQ codebook, tied Gaussian mixture, and radial basis functions). The Gaussian mixture speaker model attains 96.8% identification accuracy using 5 second clean speech utterances and 80.8% accuracy using 15 second telephone speech utterances with a 49 speaker population and is shown to outperform the other speaker modeling techniques on an identical 16 speaker telephone speech task.
IEEE Transactions on Speech and Audio Processing 02/1995
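The identification rule this abstract describes (one GMM per speaker, pick the speaker whose model best explains the utterance) can be illustrated with a minimal sketch. It assumes diagonal covariances and already-trained model parameters; the function names `gmm_log_likelihood` and `identify_speaker` and the dictionary layout are hypothetical, chosen for illustration, and are not taken from the cited paper.

```python
import numpy as np


def gmm_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D) feature vectors; weights: (M,); means, variances: (M, D).
    """
    diff = frames[:, None, :] - means[None, :, :]              # (T, M, D)
    # Log normalising constant of each diagonal Gaussian component.
    log_norm = -0.5 * np.log(2.0 * np.pi * variances).sum(axis=1)
    log_gauss = log_norm[None, :] - 0.5 * (diff ** 2 / variances[None, :, :]).sum(axis=2)
    log_weighted = np.log(weights)[None, :] + log_gauss        # (T, M)
    # Log-sum-exp over components for numerical stability.
    peak = log_weighted.max(axis=1, keepdims=True)
    log_frame = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return float(log_frame.mean())


def identify_speaker(frames, speaker_models):
    """Return the best-scoring speaker name and all scores.

    speaker_models maps a name to a (weights, means, variances) tuple.
    """
    scores = {name: gmm_log_likelihood(frames, *params)
              for name, params in speaker_models.items()}
    return max(scores, key=scores.get), scores
```

Scoring every enrolled model for every test utterance is what makes large-population identification costly, which is the motivation for reducing the speaker model search space listed in the keywords.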
Article: Speaker Verification Using Adapted Gaussian Mixture Models
ABSTRACT: Reynolds, Douglas A., Quatieri, Thomas F., and Dunn, Robert B., Speaker Verification Using Adapted Gaussian Mixture Models, Digital Signal Processing 10 (2000), 19–41. In this paper we describe the major elements of MIT Lincoln Laboratory's Gaussian mixture model (GMM)-based speaker verification system used successfully in several NIST Speaker Recognition Evaluations (SREs). The system is built around the likelihood ratio test for verification, using simple but effective GMMs for likelihood functions, a universal background model (UBM) for alternative speaker representation, and a form of Bayesian adaptation to derive speaker models from the UBM. The development and use of a handset detector and score normalization to greatly improve verification performance is also described and discussed. Finally, representative performance benchmarks and system behavior experiments on NIST SRE corpora are presented.
Digital Signal Processing
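The two ideas named in this abstract, deriving a speaker model from the UBM by Bayesian adaptation and scoring with a likelihood ratio, can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the Lincoln Laboratory system: only the component means are adapted, covariances are diagonal, weights and variances are shared with the UBM, and the relevance factor default of 16 is an illustrative choice. All function names are hypothetical.

```python
import numpy as np


def component_log_probs(frames, weights, means, variances):
    """Log of weight times Gaussian density per frame and component: (T, M)."""
    diff = frames[:, None, :] - means[None, :, :]
    log_norm = -0.5 * np.log(2.0 * np.pi * variances).sum(axis=1)
    log_gauss = log_norm[None, :] - 0.5 * (diff ** 2 / variances[None, :, :]).sum(axis=2)
    return np.log(weights)[None, :] + log_gauss


def log_sum_exp(values, axis):
    """Numerically stable log(sum(exp(values))) along one axis."""
    peak = values.max(axis=axis, keepdims=True)
    return np.squeeze(peak, axis=axis) + np.log(np.exp(values - peak).sum(axis=axis))


def avg_log_likelihood(frames, weights, means, variances):
    log_w = component_log_probs(frames, weights, means, variances)
    return float(log_sum_exp(log_w, axis=1).mean())


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of UBM means toward enrollment frames."""
    log_w = component_log_probs(frames, weights, means, variances)
    post = np.exp(log_w - log_sum_exp(log_w, axis=1)[:, None])   # (T, M) responsibilities
    counts = post.sum(axis=0)                                     # soft counts per component
    first_moment = post.T @ frames / np.maximum(counts, 1e-10)[:, None]
    alpha = (counts / (counts + relevance))[:, None]              # data-dependent mixing
    return alpha * first_moment + (1.0 - alpha) * means


def llr_score(frames, speaker_means, weights, ubm_means, variances):
    """Average log-likelihood ratio of the speaker model against the UBM."""
    return (avg_log_likelihood(frames, weights, speaker_means, variances)
            - avg_log_likelihood(frames, weights, ubm_means, variances))
```

A positive score favours the claimed speaker and a negative score favours the background model; in practice the decision threshold would be tuned on held-out data rather than fixed at zero.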
Article: Efficient Speaker Recognition Using Approximated Cross Entropy (ACE)
ABSTRACT: Techniques for efficient speaker recognition are presented. These techniques are based on approximating Gaussian mixture modeling (GMM) likelihood scoring using approximated cross entropy (ACE). Gaussian mixture modeling is used for representing both training and test sessions and is shown to perform speaker recognition and retrieval extremely efficiently without any notable degradation in accuracy compared to classic GMM-based recognition. In addition, a GMM compression algorithm is presented. This algorithm considerably decreases the storage needed for speaker retrieval.
IEEE Transactions on Audio Speech and Language Processing 10/2007
Keywords
applications
distributional distance measure
identification accuracy
KL divergence
large population speaker identification
likelihood computations
NIST-2002 large population speech corpora
NTIMIT
paper implements GMM-UBM
proposed method
speaker model search space
testing stage
TIMIT