Conference Paper

Discrete wavelet transform and support vector machine applied to pathological voice signals identification

Sch. of Eng. of Sao Carlos, Sao Paulo Univ., Sao Carlos, Brazil
DOI: 10.1109/ISM.2005.50 Conference: Multimedia, Seventh IEEE International Symposium on
Source: IEEE Xplore

ABSTRACT An algorithm that classifies pathological and normal voice signals using the Daubechies discrete wavelet transform (DWT-db) and a support vector machine (SVM) classifier is presented. DWT-db is used for time-frequency analysis, giving a quantitative evaluation of signal characteristics to identify pathologies in voice signals, particularly nodules in the vocal folds, in male and female subjects of different ages. After a linear prediction coefficients (LPC) filter is applied, the mean square values of the signals at a particular scale of the wavelet analysis are the inputs to a nonlinear least squares support vector machine (LS-SVM) classifier, which yields an adequate larynx pathology classifier with over 95% classification accuracy.
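
The feature-extraction step described in the abstract (mean square value of the wavelet coefficients at a given scale) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the Daubechies order, the decomposition depth, or the boundary handling, so the 4-tap db2 filter, the periodic extension, and the function names `dwt_step` and `mean_square_per_scale` are all assumptions made for illustration. The LPC filtering and LS-SVM stages are omitted.

```python
import math

# Daubechies 4-tap (db2) low-pass analysis filter.
# ASSUMPTION: the abstract does not give the Daubechies order; db2 is used
# here only because its coefficients have a short closed form.
_S3 = math.sqrt(3.0)
_NORM = 4.0 * math.sqrt(2.0)
LOW = [(1 + _S3) / _NORM, (3 + _S3) / _NORM, (3 - _S3) / _NORM, (1 - _S3) / _NORM]
# Matching high-pass filter via the quadrature mirror relation
# g[k] = (-1)^k * h[N-1-k].
HIGH = [((-1) ** k) * LOW[len(LOW) - 1 - k] for k in range(len(LOW))]


def dwt_step(signal):
    """One DWT level with periodic extension; returns (approx, detail)."""
    n = len(signal)
    if n % 2 != 0:
        raise ValueError("signal length must be even")
    approx, detail = [], []
    for i in range(0, n, 2):
        # Filter and downsample by two in a single pass.
        approx.append(sum(LOW[k] * signal[(i + k) % n] for k in range(len(LOW))))
        detail.append(sum(HIGH[k] * signal[(i + k) % n] for k in range(len(HIGH))))
    return approx, detail


def mean_square_per_scale(signal, levels):
    """Mean square value of the detail coefficients at each scale.

    Entry 0 corresponds to the finest scale (highest frequencies).
    """
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = dwt_step(approx)
        features.append(sum(c * c for c in detail) / len(detail))
    return features
```

For example, `mean_square_per_scale(frame, 3)` returns three values for a frame of voice samples whose length is a multiple of 8; in a pipeline like the one the abstract describes, the value at one chosen scale would be passed to the classifier.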

    • "Our research question is: Can we keep all the features and use feature interactions to our advantage? Recently, wavelet kernels have been investigated in certain applications including regression [3], voice classification [4], and biomarker discovery in protein structures [5], bringing in the rich framework of wavelet analysis from signal processing. As one attempt with a Haar wavelet in information retrieval highlights [6], there is one fundamental problem with existing wavelet kernels when it comes to classification: in order to make such wavelet kernels operational, a relation (traditionally temporal or spatial) is assumed between subsequent features that describe the data instances."
    ABSTRACT: Wavelet kernels have been introduced for both support vector regression and classification. Most of these wavelet kernels do not use the inner product of the embedding space, but use wavelets in a similar fashion to radial basis function kernels. Wavelet analysis is typically carried out on data with a temporal or spatial relation between consecutive data points. We argue that it is possible to order the features of a general data set so that consecutive features are statistically related to each other, thus enabling us to interpret the vector representation of an object as a series of equally or randomly spaced observations of a hypothetical continuous signal. By approximating the signal with compactly supported basis functions and employing the inner product of the embedding L2 space, we gain a new family of wavelet kernels. Empirical results show a clear advantage in favor of these kernels.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 11/2011; 33(10):2039-2050. DOI:10.1109/TPAMI.2011.28 · 5.69 Impact Factor
    • "Many of the recent algorithms described in the literature use wavelets as powerful tools in the feature-extraction stage. Wavelets are applied in various forms, such as the discrete wavelet transform (DWT) [1] [31] [32], continuous wavelet transform (CWT) [2] [36] [38] and wavelet packets [3] [33] [34] [35]. From the classification point of view, the efficiency of the overall system depends on the appropriateness of both the extracted features and the classification method applied. "
    ABSTRACT: The presence of abnormalities in the vocal system affects the quality of the voice and changes its characteristics. Digital analysis of pathological voices can be an effective and non-invasive tool for the detection of such alterations. This paper proposes a wavelet-based method to distinguish between normal and disordered voices. Wavelet filter banks are used in conjunction with support vector machines, as feature extractors and classifiers, respectively. Orthogonal filter banks are implemented using a highly efficient structure known as "lattice" that parameterizes filter banks and produces a few parameters. The overall problem is to find these parameters such that perfect classification is achieved. To search for such parameters, a genetic algorithm with a fitness function corresponding to the classification result is applied. Simulation is done on the KAY database (a comprehensive database including 710 normal and pathological voice signals, developed by the Massachusetts Eye and Ear Infirmary Voice and Speech Lab), and one additional test set. It is observed that a genetic algorithm is able to find the filter bank parameters such that a 100% correct classification rate is achieved in classifying normal and pathological voices when the test is performed on both databases.
    Computers in Biology and Medicine 09/2011; 41(9):822-8. DOI:10.1016/j.compbiomed.2011.06.019 · 1.90 Impact Factor
    • "In this study, mother wavelet function of the tenth order Daubechies has been chosen and the signals have been decomposed to five levels. The mother wavelet used in this study is reported to be effective in voice signal analysis [8] [10] and is being widely used in many pathological voice analyses [7] [15]. Due to the noise-like effect of irregularities in the vibration pattern of damaged vocal folds, the distribution manner of such variations within the whole frequency range of pathological speech signals is not clearly known. "
    ABSTRACT: Unilateral vocal fold paralysis, vocal fold polyp, and vocal fold nodules are the most common types of neurogenic and organic vocal disorders. This article aims to classify these types of vocal diseases into four different classes for the purpose of automatic screening. First, the reconstructed signal at each wavelet packet decomposition sub-band, over five levels of decomposition with the db10 mother wavelet, is used to extract the nonlinear features of self-similarity and approximate entropy. Wavelet packet coefficients are also used to measure energy and Shannon entropy features at different spectral sub-bands. To find a discriminant feature vector, three different methods are then applied: the Davies-Bouldin (DB) criterion, and a genetic algorithm (GA) whose fitness function is the recognition rate of either a support vector machine (SVM) or a k-nearest neighbor (KNN) classifier. Finally, the resulting feature vectors are passed to SVM and KNN classifiers. The results show that a feature vector of length 12, obtained by GA optimization with the SVM recognition rate as the fitness function and fed to an SVM classifier, achieves the highest classification accuracy of 91%. Furthermore, nonlinear features play an important role in pathological voice classification, accounting for approximately 67% of the optimal feature vector.
    Computers in Biology and Medicine 09/2009; 39(10):860-8. DOI:10.1016/j.compbiomed.2009.06.014 · 1.90 Impact Factor