ABSTRACT: This paper investigates the contribution of formants and prosodic features, such as pitch and energy, to Arabic speech recognition
under real-life conditions. Our speech recognition system based on Hidden Markov Models (HMMs) is implemented using the HTK
Toolkit. The front-end of the system combines features based on conventional Mel-Frequency Cepstral Coefficients (MFCC), prosodic
information and formants. The experiments are performed on the ARADIGIT corpus, a database of spoken Arabic words.
The results show that, in noisy environments, the resulting multivariate feature vectors lead to a significant improvement,
up to 27%, in word accuracy relative to the word accuracy obtained from the state-of-the-art MFCC-based system.
Keywords: ASR system – HMM – MFCC – Formant – Prosodic features – Speech variability – Additive noise
International Journal of Speech Technology 05/2012; 14(4):351-359.
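
As an illustration of the combined front-end described in the abstract, the following is a minimal sketch of how per-frame MFCC, pitch, energy and formant streams might be stacked into a single multivariate feature vector. It is not the authors' HTK implementation: the function name `combine_features`, the feature dimensions (12 MFCCs, 3 formants) and the random toy data are assumptions chosen only for the example.

```python
import numpy as np


def combine_features(mfcc, pitch, energy, formants):
    """Stack per-frame feature streams into one observation vector per frame.

    Hypothetical helper: the dimensions and the ordering of the streams are
    assumptions for illustration, not taken from the paper.
    """
    mfcc = np.asarray(mfcc, dtype=float)                      # (frames, n_mfcc)
    formants = np.asarray(formants, dtype=float)              # (frames, n_formants)
    pitch = np.asarray(pitch, dtype=float).reshape(-1, 1)     # (frames, 1)
    energy = np.asarray(energy, dtype=float).reshape(-1, 1)   # (frames, 1)

    # All streams must be aligned frame by frame before concatenation.
    n_frames = mfcc.shape[0]
    for name, stream in (("pitch", pitch), ("energy", energy), ("formants", formants)):
        if stream.shape[0] != n_frames:
            raise ValueError(f"{name} has {stream.shape[0]} frames, expected {n_frames}")

    return np.hstack([mfcc, pitch, energy, formants])


# Toy data standing in for real acoustic measurements (assumed sizes).
rng = np.random.default_rng(0)
n_frames = 5
mfcc = rng.normal(size=(n_frames, 12))
pitch = rng.uniform(80.0, 300.0, size=n_frames)        # Hz
energy = rng.normal(size=n_frames)                     # log-energy
formants = rng.uniform(300.0, 3500.0, size=(n_frames, 3))  # F1..F3 in Hz

features = combine_features(mfcc, pitch, energy, formants)
print(features.shape)  # (5, 17)
```

In a sketch like this the streams have very different numeric ranges (cepstral coefficients versus frequencies in Hz), so a real system would likely normalize each stream before training; how the paper handles this is not stated in the abstract.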