Conference Paper

Effect of GSM speech coding on the performance of Speaker Recognition System

DOI: 10.1109/ISSPA.2010.5605487 Conference: 10th International Conference on Information Sciences, Signal Processing and their Applications, ISSPA 2010, Kuala Lumpur, Malaysia, 10-13 May, 2010
Source: DBLP


This paper investigates the influence of GSM speech coding on the performance of a text-independent Speaker Recognition System (SRS). The SRS performs recognition on the speech waveform reconstructed from the coded parameters, using the Gaussian Mixture Model (GMM) technique. The effect of GSM speech coding, specifically the GSM EFR (Global System for Mobile Communications Enhanced Full Rate) codec, was evaluated using three transcoded databases obtained by passing the local ARADIGIT database through the GSM coder/decoder. Recognition was also evaluated on the original ARADIGIT database sampled at 16 kHz and on its version downsampled to 8 kHz. The ARADIGIT database consists of 60 speakers (31 male and 29 female), each pronouncing the ten Arabic digits five times. Several experiments were conducted to evaluate the degradation introduced by different aspects of the simulated coder.
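The abstract does not give the feature set, the number of mixture components, or the scoring rule, so the following is only a minimal sketch of how GMM-based speaker identification is commonly done, not the authors' implementation. It assumes that feature frames (for example cepstral vectors) have already been extracted from the decoded speech, that each speaker model is a diagonal-covariance GMM whose parameters are already trained, and that the decision picks the model with the highest average log-likelihood. All names (`gmm_log_likelihood`, `identify_speaker`, `speaker_a`, `speaker_b`) and the toy numbers are hypothetical.

```python
import math


def gmm_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    total = 0.0
    for x in frames:
        # Log of each weighted Gaussian component evaluated at this frame.
        comp = []
        for w, mu, var in zip(weights, means, variances):
            log_p = math.log(w)
            for xi, mi, vi in zip(x, mu, var):
                log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
            comp.append(log_p)
        # Log-sum-exp over components, for numerical stability.
        m = max(comp)
        total += m + math.log(sum(math.exp(c - m) for c in comp))
    return total / len(frames)


def identify_speaker(frames, models):
    """Return the name of the model that best explains the frames."""
    return max(models, key=lambda name: gmm_log_likelihood(frames, *models[name]))


# Toy, made-up 2-D models: (weights, means, variances) per speaker.
models = {
    "speaker_a": ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "speaker_b": ([0.5, 0.5], [[5.0, 5.0], [6.0, 6.0]], [[1.0, 1.0], [1.0, 1.0]]),
}
test_frames = [[0.2, -0.1], [0.9, 1.1], [0.4, 0.6]]
print(identify_speaker(test_frames, models))  # prints "speaker_a"
```

In a study like the one described, the same scoring would be repeated with frames taken from the original, downsampled, and transcoded recordings, so that the change in identification rate can be attributed to the codec.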

  • ABSTRACT: In this paper, we investigate the influence of a noisy channel on the performance of an acoustic echo cancellation system. In mobile communications, acoustic echo is mainly caused by the coupling between the loudspeaker and the microphone of the mobile device, so an Acoustic Echo Canceller (AEC) should be used locally inside the device. This paper evaluates the performance of an AEC, generally based on adaptive filtering, when the transmitted speech is encoded and decoded by the AMR-WB (Adaptive Multi-Rate Wideband) speech codec. The encoded speech is transmitted over a channel modeled as AWGN (Additive White Gaussian Noise) and Rayleigh fading. The simulation results show a strong degradation of AEC performance due to the noisy channel.
    Conference Paper · Dec 2012
  • ABSTRACT: The paper examines issues related to the proper selection of models for quick speaker recognition based on short recordings of mobile telephone conversations. Knowing the type of encoder used during speech transmission makes it possible to apply an appropriate model that takes the specific characteristics of that encoder into account: full rate (FR), half rate (HR), enhanced full rate (EFR), or adaptive multi-rate (AMR). We analyse both proper model selection and automatic silence removal. Processing time is also analysed as part of this study.
    Conference Paper · Jun 2014