Effect of GSM speech coding on the performance of Speaker Recognition System.
ABSTRACT: This paper investigates the influence of GSM speech coding on the performance of a text-independent Speaker Recognition System (SRS). The SRS performs recognition on the speech waveform reconstructed from the coded parameters, using the Gaussian Mixture Model (GMM) technique. The effect of GSM speech coding, namely the GSM EFR (Global System for Mobile Communications Enhanced Full Rate) codec, was evaluated using three transcoded databases obtained by passing the local ARADIGIT database through the GSM coder/decoder. Recognition was also evaluated using the original ARADIGIT database sampled at 16 kHz and its 8 kHz downsampled version. The ARADIGIT database consists of 60 speakers (31 male and 29 female) pronouncing the ten Arabic digits five times each. Several experiments were conducted to evaluate the degradation introduced by different aspects of the simulated coder.
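The GMM decision rule mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the features, the number of mixture components, or the training procedure. In practice the models are usually trained with the EM algorithm, while here the parameters are hard-coded toy values. The function names, the two-dimensional features, and the speaker labels are all hypothetical. Each speaker is represented by a diagonal-covariance GMM, and the test utterance is assigned to the speaker whose model gives the highest average log-likelihood per frame.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_frame_loglik(x, gmm):
    """Log p(x | gmm) for one frame; gmm is a list of (weight, mean, var)."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp trick for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def average_loglik(frames, gmm):
    """Average per-frame log-likelihood of an utterance under one model."""
    return sum(gmm_frame_loglik(x, gmm) for x in frames) / len(frames)


def identify(frames, models):
    """Return the best-scoring speaker label and the scores of all models."""
    scores = {name: average_loglik(frames, gmm) for name, gmm in models.items()}
    return max(scores, key=scores.get), scores


# Toy, hand-set models (assumption: real models would be trained on
# speech features extracted from enrollment data).
models = {
    "speaker_a": [(0.5, (0.0, 0.0), (1.0, 1.0)), (0.5, (1.0, 1.0), (1.0, 1.0))],
    "speaker_b": [(0.5, (4.0, 4.0), (1.0, 1.0)), (0.5, (5.0, 5.0), (1.0, 1.0))],
}
test_frames = [(0.2, -0.1), (0.9, 1.1), (0.4, 0.6)]
best, scores = identify(test_frames, models)
print(best, scores)
```

Averaging over frames rather than summing keeps scores comparable between utterances of different lengths; the identification decision itself is the same either way.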
- Source available from: citeseerx.ist.psu.edu
Article: Robust Speaker Recognition
ABSTRACT: We describe current approaches to text-independent speaker identification based on probabilistic modeling techniques. The probabilistic approaches have largely supplanted methods based on comparisons of long-term feature averages. They divide into two basic classes: nonparametric and parametric probability models. Nonparametric models have the advantage of being potentially more accurate (though possibly more fragile), while parametric models offer computational efficiency and the ability to characterize the effects of the environment through their effects on the parameters. A robust speaker-identification system is presented that is able to deal with various forms of anomalies that are localized in time, such as spurious noise events and crosstalk. It is based on a segmental approach in which normalized segment scores form the basic input for a variety of robust procedures. Experimental results are presented, illustrating the advantages and disadvantages of the different procedures. We show the role that cross-validation can play in determining how to weight the different sources of information when combining them into a single score. Finally, we explore a Bayesian approach to measuring confidence in the decisions made, which enables us to reject certain tests in order to achieve an improved, predicted performance level on the tests that are retained. IEEE Signal Processing Magazine, 11/1994 · 3.37 Impact Factor
ABSTRACT: We have investigated the influence of GSM speech coding on the performance of a text-independent speaker recognition system based on Gaussian Mixture Models (GMM). The performance degradation due to the use of the three GSM speech coders was assessed using three transcoded databases, obtained by passing the TIMIT database through each GSM coder/decoder. The recognition performance was also assessed using the original TIMIT database and its 8 kHz downsampled version. Different experiments were then carried out to explore feature calculation directly from the GSM EFR encoded parameters and to measure the degradation introduced by different aspects of the coder. 06/2000