Robust features for speech recognition based on admissible wavelet packets

Dept. of Electron. & Electr. Eng., Loughborough Univ. of Technol., UK
Electronics Letters (Impact Factor: 0.93). 01/2002; 37(25):1554 - 1556. DOI: 10.1049/el:20011029
Source: IEEE Xplore


A six-band filter structure, derived using admissible wavelet packets, is
proposed for extracting features for the recognition of noisy speech. A
simple compensation for white Gaussian noise is carried out, and the
recognition performance is compared with that of features based on Mel
frequency cepstral coefficients (MFCC) and a 24-band admissible wavelet
packet filter structure.
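
As a rough illustration of the idea in the abstract, the sketch below computes log sub-band energies from an admissible wavelet packet tree, that is, a binary tree in which each node is either left as a leaf or split into a low and a high sub-band. The Haar filter pair, the tree shape in `SIX_BAND_TREE`, and all function names are assumptions made for this example; the letter's actual six-band structure, filters, and noise compensation are not given on this page.

```python
import math

def haar_split(x):
    """One two-channel Haar analysis step: returns (low, high) sub-bands,
    each half the length of x (x is assumed to have even length)."""
    s = 1.0 / math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return low, high

def packet_leaves(x, tree):
    """Decompose x along an admissible tree.

    tree is None for a leaf, or a (low_subtree, high_subtree) pair.
    Leaves are returned in tree order, which is not strictly
    increasing frequency order inside high-pass branches."""
    if tree is None:
        return [x]
    low, high = haar_split(x)
    return packet_leaves(low, tree[0]) + packet_leaves(high, tree[1])

def log_energies(x, tree, floor=1e-12):
    """Log energy of each leaf sub-band; floor avoids log(0)."""
    return [math.log(sum(c * c for c in band) + floor)
            for band in packet_leaves(x, tree)]

# Hypothetical six-leaf tree (NOT the structure from the letter):
# finer splits at low frequencies, coarser splits at high frequencies.
LEAF = None
SIX_BAND_TREE = ((((LEAF, LEAF), LEAF), LEAF), (LEAF, LEAF))

if __name__ == "__main__":
    frame = [math.sin(0.3 * n) for n in range(32)]  # toy signal frame
    print(log_energies(frame, SIX_BAND_TREE))
```

Because the Haar pair is orthonormal, the sub-band energies sum to the energy of the input frame, which is a convenient sanity check. The tree above needs a frame length divisible by 16; a practical front end would use longer filters and real speech frames.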

  • Source
    • "The decomposition process is recursively applied to both the low and high frequency sub-bands to generate the next level of the hierarchy. If an orthonormal wavelet basis has been chosen, the coefficients computed are independent and possess a distinct feature of the original signal [7]. Wavelet packets can be described by the following collection of basis functions: "
    ABSTRACT: Vocal fold edema, nodules, and polyps are among the common types of benign vocal fold lesions that can affect the quality of a patient's speech signal. This paper proposes a non-invasive method that discriminates these three types of vocal fold disease and classifies them into their corresponding groups of vocal fold inflammation by processing the patients' speech signals. Experiments with two feature extraction methods, wavelet packet sub-bands and Mel-frequency-scaled filter banks, are carried out on 83 voiced signals uttered by individuals of both sexes, aged 19 to 81, each suffering from one of these three types of vocal fold swelling. Because the similarity of the three disorders leads to highly correlated extracted features across classes, a genetic algorithm is applied to find the most separable feature vector indexes. Classification with a support vector machine as a nonlinear classifier showed that entropy-based feature vectors, taken as an expression of vocal fold irregularities and computed in certain wavelet packet sub-bands, give the best classification rate of 91.18% for the three classes of vocal fold pathology.
    International Conference on Speech and Computer; 09/2005
  • Source
    • "Second, filter banks implementing wavelet transforms are typically dyadic, splitting the spectrum in half. We examine tree-structured filter banks, which were previously proposed as potential solution, allowing iteration at both high and low channels of a 2-channel filter bank [8]. We then look into rational filter banks to obtain finer frequency resolution and naturally simulate the critical bands of the human auditory system [9]. "
    ABSTRACT: Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Civil and Environmental Engineering, 2004, by Ghinwa F. Choueiter. Includes bibliographical references (p. 87-91).
  • ABSTRACT: A new preprocessing stage based on wavelet denoising is proposed to extract robust features in the presence of additive white Gaussian noise. Recognition performance is compared with that of the commonly used Mel frequency cepstral coefficients, with and without this preprocessing stage. With the proposed technique, word recognition accuracy is found to improve by 2 to 28% for signal-to-noise ratios in the range of 20 to 0 dB.
    Electronics Letters 02/2003; 39(1-39):163 - 165. DOI:10.1049/el:20030068 · 0.93 Impact Factor