Article

Evaluation of Perceptual and Multi Sub-Band Energy Features for Classification of Normal, Pathological and Noisy Phonocardiogram

Authors: Ai Health Highway

No full-text available. To read the full text of this research, you can request a copy directly from the authors.

References
Article
Full-text available
In this work, a novel region switching based classification method is proposed for speech emotion classification using vowel-like regions (VLRs) and non-vowel-like regions (non-VLRs). In the literature, the entire active speech region is normally processed for emotion classification. Few studies have been performed on segmented sound units, such as syllables, phones, vowels, consonants, and voiced regions, for speech emotion classification. This work presents a detailed analysis of the emotion information contained independently in segmented VLRs and non-VLRs. The proposed region switching based method is implemented by choosing the features of either VLRs or non-VLRs for each emotion. The VLRs are detected by identifying hypothesized VLR onset and end points. Segmentation of non-VLRs is done using the knowledge of VLRs and active speech regions. The performance is evaluated using the EMODB, IEMOCAP, and FAU AIBO databases. Experimental results show that both the VLRs and non-VLRs contain emotion-specific information. In terms of emotion classification rate, the proposed region switching based classification approach shows significant improvement over processing the entire active speech region, and it outperforms other state-of-the-art approaches on all three databases.
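As a rough, hypothetical illustration of the region-switching idea (not the authors' implementation), the sketch below picks, for each emotion, whichever region type performed better on held-out data; the emotion names and accuracy values are made up.

```python
# Minimal sketch of region switching: per emotion, keep the region type
# (VLR or non-VLR) whose classifier scored better on validation data.
# All numbers below are illustrative assumptions, not results from the paper.
val_acc = {
    "anger":   {"vlr": 0.78, "non_vlr": 0.71},
    "sadness": {"vlr": 0.66, "non_vlr": 0.74},
}

# Region switching: choose the better-performing region for each emotion.
switch = {emo: max(acc, key=acc.get) for emo, acc in val_acc.items()}
print(switch)   # -> {'anger': 'vlr', 'sadness': 'non_vlr'}
```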
Conference Paper
Full-text available
Auscultation is the primary tool for the detection and diagnosis of cardiovascular diseases in hospitals and home visits. This fact has led in recent years to the development of automatic methods for heart sound classification, allowing cardiovascular pathologies to be detected in an effective way. The aim of this paper is to review recent methods for automatic classification and to apply several signal processing techniques in order to evaluate them against the PhysioNet/CinC Challenge 2016 results. For this purpose, the records of the open PhysioNet/Computing in Cardiology database were modified by segmentation or filtering methods, and the results were tested using the challenge's best-ranked algorithms. Results show that adequate preprocessing of the data and subsequent feature selection may improve the performance of machine learning and classification techniques.
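As a sketch of one plausible preprocessing step of the kind evaluated in such studies, the code below band-pass filters a PCG recording to the range where heart sounds carry most of their energy. The 25-400 Hz band, filter order, and sampling rate are illustrative assumptions, not values taken from the paper.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def bandpass_pcg(pcg: np.ndarray, fs: float,
                 low: float = 25.0, high: float = 400.0,
                 order: int = 4) -> np.ndarray:
    """Zero-phase Butterworth band-pass filtering of a raw PCG signal."""
    nyq = fs / 2.0
    b, a = butter(order, [low / nyq, high / nyq], btype="band")
    return filtfilt(b, a, pcg)

# Example: filter a 10 s synthetic recording sampled at 2 kHz.
fs = 2000
pcg = np.random.randn(10 * fs)      # stand-in for a real recording
clean = bandpass_pcg(pcg, fs)
```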
Conference Paper
Full-text available
The goal of the 2016 PhysioNet/CinC Challenge is the development of an algorithm to classify normal/abnormal heart sounds. A total of 124 time-frequency features were extracted from the phonocardiogram (PCG) and input to a variant of the AdaBoost classifier. A second classifier using a convolutional neural network (CNN) was trained on PCG cardiac cycles decomposed into four frequency bands. The final decision rule to classify normal/abnormal heart sounds was based on an ensemble of classifiers combining the outputs of AdaBoost and the CNN. The algorithm was trained on a training dataset (normal = 2575, abnormal = 665) and evaluated on a blind test dataset. Our classifier ensemble approach obtained the highest score of the competition, with a sensitivity, specificity, and overall score of 0.9424, 0.7781, and 0.8602, respectively.
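The abstract does not specify the exact fusion rule, so the sketch below shows one simple score-level ensemble of the kind described: average the abnormality probabilities of the two classifiers and threshold. The 0.5 threshold, the averaging rule, and the probabilities are assumptions for illustration.

```python
import numpy as np

def ensemble_decision(p_adaboost: np.ndarray,
                      p_cnn: np.ndarray,
                      threshold: float = 0.5) -> np.ndarray:
    """Return 1 (abnormal) where the mean abnormality probability of
    the two classifiers exceeds the threshold, else 0 (normal)."""
    p_fused = (p_adaboost + p_cnn) / 2.0
    return (p_fused > threshold).astype(int)

# Example with made-up per-recording probabilities.
p_ada = np.array([0.9, 0.2, 0.7])
p_cnn = np.array([0.8, 0.1, 0.6])
print(ensemble_decision(p_ada, p_cnn))   # -> [1 0 1]
```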
Article
Full-text available
Music genre recognition based on visual representations has been successfully explored in recent years. Classifiers trained with textural descriptors (e.g., Local Binary Patterns, Local Phase Quantization, and Gabor filters) extracted from spectrograms have achieved state-of-the-art results on several music datasets. In this work, though, we argue that we can go further with time-frequency analysis through the use of representation learning. To show that, we compare the results obtained with a Convolutional Neural Network (CNN) with the results obtained by using handcrafted features and SVM classifiers. In addition, we have performed experiments fusing the results obtained with learned features and handcrafted features to assess the complementarity between these representations for the music classification task. Experiments were conducted on three music databases with distinct characteristics: a western music collection widely used in research benchmarks (the ISMIR 2004 database), a collection of Latin American music (the LMD database), and a collection of field recordings of ethnic African music. Our experiments show that the CNN compares favorably to other classifiers in several scenarios; hence, it is a very interesting alternative for music genre recognition. On the African database, the CNN surpassed both the handcrafted representations and the state of the art by a margin. In the case of the LMD database, the combination of the CNN and the Robust Local Binary Pattern achieved a recognition rate of 92%, which, to the best of our knowledge, is the best result (using an artist filter) on this dataset so far. On the ISMIR 2004 dataset, although the CNN did not improve the state of the art, it performed better than classifiers based individually on other kinds of features.
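For context, the visual representation both families of classifiers start from is a spectrogram treated as an image. The sketch below computes a log-scaled spectrogram with librosa; the FFT size, hop length, and sampling rate are illustrative assumptions rather than the paper's settings.

```python
import numpy as np
import librosa

fs = 22050
audio = np.random.randn(30 * fs).astype(np.float32)   # stand-in excerpt

# Magnitude spectrogram, then log scaling to obtain an image-like input
# for either texture descriptors (LBP, LPQ, Gabor) or a CNN.
S = np.abs(librosa.stft(audio, n_fft=2048, hop_length=512))
log_S = librosa.amplitude_to_db(S, ref=np.max)
```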
Conference Paper
Full-text available
The phonocardiogram (PCG) signal is used as a diagnostic test in ambulatory monitoring to evaluate the heart's hemodynamic status and to detect cardiovascular disease. The objective of this study is to develop an automatic classification method for anomaly (normal vs. abnormal) and quality (good vs. bad) detection of PCG recordings without segmentation. For this purpose, a subset of 18 features is selected among 40 features based on a wrapper feature selection scheme. These features are extracted from the time, frequency, and time-frequency domains without any segmentation. The selected features are fed into an ensemble of 20 feedforward neural networks for the classification task. The proposed algorithm achieved overall scores of 91.50% (94.23% sensitivity and 88.76% specificity) and 85.90% (86.91% sensitivity and 84.90% specificity) on the training and unseen test datasets, respectively. The proposed method achieved the second-best score in the PhysioNet/CinC Challenge 2016.
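The sketch below mimics the ensemble stage in the spirit of the 20-network classifier described above, using scikit-learn MLPs as stand-ins for the paper's feedforward networks. The feature extraction and wrapper selection steps are omitted; the network size, training data, and averaging rule are illustrative assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 18))        # 18 selected features per recording
y = rng.integers(0, 2, size=200)      # 0 = normal, 1 = abnormal

# Ensemble of 20 small feedforward networks with different seeds.
nets = [MLPClassifier(hidden_layer_sizes=(10,), max_iter=500,
                      random_state=i).fit(X, y) for i in range(20)]

# Average the abnormal-class probabilities across the ensemble, then threshold.
p = np.mean([net.predict_proba(X)[:, 1] for net in nets], axis=0)
y_hat = (p > 0.5).astype(int)
```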
Article
Full-text available
In the past few decades, analysis of heart sound signals (i.e., the phonocardiogram or PCG), especially for automated heart sound segmentation and classification, has been widely studied and has been reported to have potential value for detecting pathology accurately in clinical applications. However, comparative analyses of algorithms in the literature have been hindered by the lack of high-quality, rigorously validated, and standardized open databases of heart sound recordings. This paper describes a public heart sound database assembled for an international competition, the PhysioNet/Computing in Cardiology (CinC) Challenge 2016. The archive comprises nine different heart sound databases sourced from multiple research groups around the world. It includes 2435 heart sound recordings in total, collected from 1297 healthy subjects and patients with a variety of conditions, including heart valve disease and coronary artery disease. The recordings were collected in a variety of clinical or nonclinical environments (such as in-home visits) with a variety of equipment, and their lengths varied from several seconds to several minutes. This article reports detailed information about the subjects/patients, including demographics (number, age, gender), recordings (number, location, state, and length), associated synchronously recorded signals, sampling frequency, and the sensor types used. We also provide a brief summary of commonly used heart sound segmentation and classification methods, including open source code provided concurrently for the Challenge. A description of the PhysioNet/CinC Challenge 2016 is provided, including the main aims, the training and test sets, the hand-corrected annotations for the different heart sound states, the scoring mechanism, and the associated open source code. In addition, several potential benefits of the public heart sound database are discussed.
Article
Full-text available
Objective: This study focuses on first (S1) and second (S2) heart sound recognition based only on acoustic characteristics; assumptions about the individual durations of S1 and S2 and the time intervals of S1-S2 and S2-S1 are not involved in the recognition process. The main objective is to investigate whether reliable S1 and S2 recognition performance can still be attained in situations where duration and interval information might not be accessible. Methods: A deep neural network (DNN) method is proposed for recognizing S1 and S2 heart sounds. In the proposed method, heart sound signals are first converted into a sequence of Mel-frequency cepstral coefficients (MFCCs). The K-means algorithm is applied to cluster the MFCC features into two groups to refine their representation and discriminative capability. The refined features are then fed to a DNN classifier to perform S1 and S2 recognition. We conducted experiments using actual heart sound signals recorded with an electronic stethoscope. Precision, recall, F-measure, and accuracy are used as the evaluation metrics. Results: The proposed DNN-based method can achieve high precision, recall, and F-measure scores, with an accuracy rate of more than 91%. Conclusion: The DNN classifier provides higher evaluation scores than other well-known pattern classification methods. Significance: The proposed DNN-based method can achieve reliable S1 and S2 recognition performance based on acoustic characteristics without using an ECG reference or incorporating assumptions about the individual durations of S1 and S2 and the time intervals of S1-S2 and S2-S1.
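A minimal sketch of the front end described above: MFCCs extracted from a heart sound signal, with the frames clustered into two groups by K-means. The DNN is not reproduced here; the signal, sampling rate, and MFCC settings are illustrative assumptions.

```python
import numpy as np
import librosa
from sklearn.cluster import KMeans

fs = 2000
signal = np.random.randn(10 * fs).astype(np.float32)   # stand-in PCG

# 13 MFCCs per frame; transpose to (frames x coefficients).
mfcc = librosa.feature.mfcc(y=signal, sr=fs, n_mfcc=13).T

# Cluster the MFCC frames into two groups, intended to refine the
# representation before the classifier stage.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(mfcc)
```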
Article
Full-text available
This paper describes new techniques for automatic speaker verification using telephone speech. The operation of the system is based on a set of functions of time obtained from acoustic analysis of a fixed, sentence-long utterance. Cepstrum coefficients are extracted by means of LPC analysis successively throughout an utterance to form time functions, and frequency response distortions introduced by transmission systems are removed. The time functions are expanded by orthogonal polynomial representations and, after a feature selection procedure, brought into time registration with stored reference functions to calculate the overall distance. This is accomplished by a new time warping method using a dynamic programming technique. A decision is made to accept or reject an identity claim, based on the overall distance. Reference functions and decision thresholds are updated for each customer. Several sets of experimental utterances were used for the evaluation of the system, including male and female utterances recorded over a conventional telephone connection. Male utterances processed by ADPCM and LPC coding systems were used together with unprocessed utterances. Results of the experiment indicate that a verification error rate of one percent or less can be obtained even if the reference and test utterances are subjected to different transmission conditions.
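The dynamic-programming time registration described above is the classic dynamic time warping (DTW) idea. Below is a plain DTW distance between a test feature sequence and a stored reference; the Euclidean local distance and the unconstrained warping path are simplifying assumptions, not the paper's exact formulation.

```python
import numpy as np

def dtw_distance(test: np.ndarray, ref: np.ndarray) -> float:
    """DTW distance between two (frames x features) sequences."""
    n, m = len(test), len(ref)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(test[i - 1] - ref[j - 1])
            D[i, j] = cost + min(D[i - 1, j],      # insertion
                                 D[i, j - 1],      # deletion
                                 D[i - 1, j - 1])  # match
    return D[n, m]

# Example: distance between two short random cepstral sequences.
print(dtw_distance(np.random.randn(40, 12), np.random.randn(50, 12)))
```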
Article
This paper presents a multi-centroid diastolic duration model for hidden semi-Markov model (HSMM) based heart sound segmentation. The centroids are calculated by hierarchical agglomerative clustering of neighboring diastolic duration values using Ward's method until the cluster centers are at least a systolic duration apart. The multi-peak distribution yields a sharper likelihood gradient around the expected centroids and improves the discriminability of similar observations. The peak density at each centroid acts as a reference point for the HSMM to determine the origin of the hidden state and adjust the corresponding state duration based on the maximum likelihood criterion. This model overcomes the limitation of the single-peak mean value model, which may overfit the duration distribution when the heart rate variation is relatively large. An extended logistic regression-HSMM algorithm using the proposed duration model is presented for heart sound segmentation. In addition, a total variation filter is used to attenuate the effect of noise and emphasize the fundamental heart sounds, S1 and S2. The proposed method is evaluated on training set A of the 2016 PhysioNet/Computing in Cardiology Challenge and yields an average F1 score of 98.36 ± 0.43.
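The clustering step can be sketched with SciPy's hierarchical clustering. Here the Ward tree is simply cut at a fixed distance; the paper's actual stopping rule (cluster centers at least a systolic duration apart), the cut threshold, and the duration values below are assumptions for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Diastolic durations (seconds) from a recording with varying heart rate.
diastole = np.array([0.48, 0.50, 0.49, 0.62, 0.63, 0.61, 0.51])

# Ward's method on the one-dimensional duration values.
Z = linkage(diastole.reshape(-1, 1), method="ward")
labels = fcluster(Z, t=0.05, criterion="distance")

# One duration centroid per cluster.
centroids = [diastole[labels == k].mean() for k in np.unique(labels)]
print(centroids)
```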
Article
The support-vector network is a new learning machine for two-group classification problems. The machine conceptually implements the following idea: input vectors are non-linearly mapped to a very high-dimensional feature space. In this feature space a linear decision surface is constructed. Special properties of the decision surface ensure the high generalization ability of the learning machine. The idea behind the support-vector network was previously implemented for the restricted case where the training data can be separated without errors. We here extend this result to non-separable training data. The high generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated. We also compare the performance of the support-vector network to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
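A minimal sketch of the technique on a toy problem: a soft-margin SVM with a polynomial kernel, which performs the non-linear mapping implicitly. The dataset, kernel degree, and regularization constant are illustrative choices, not values from the paper.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
# Non-linearly separable labels: inside vs. outside the unit circle.
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1).astype(int)

# Polynomial kernel of degree 2 (coef0 > 0 includes lower-order terms).
clf = SVC(kernel="poly", degree=2, coef0=1.0, C=1.0).fit(X, y)
print(clf.score(X, y))   # training accuracy on the toy data
```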
Article
The identification of the exact positions of the first and second heart sounds within a phonocardiogram (PCG), or heart sound segmentation, is an essential step in the automatic analysis of heart sound recordings, allowing for the classification of pathological events. While threshold-based segmentation methods have shown modest success, probabilistic models, such as hidden Markov models, have recently been shown to surpass the capabilities of previous methods. Segmentation performance is further improved when a priori information about the expected duration of the states is incorporated into the model, as in a hidden semi-Markov model (HSMM). This article addresses the problem of accurately segmenting the first and second heart sounds within noisy, real-world PCG recordings using an HSMM, extended with the use of logistic regression for emission probability estimation. In addition, we implement a modified Viterbi algorithm for decoding the most likely sequence of states, and evaluate this method on a large dataset of 10172 seconds of PCG recorded from 112 patients (including 12181 first and 11627 second heart sounds). The proposed method achieved an average F1 score of 95.63 ± 0.85%, while the current state of the art achieved 86.28 ± 1.55% when evaluated on unseen test recordings. The greater discrimination between states afforded by using logistic regression, as opposed to the previous Gaussian distribution-based emission probability estimation, together with the use of an extended Viterbi algorithm, allows this method to significantly outperform the current state-of-the-art method based on a two-sided, paired t-test.
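To make the decoding step concrete, here is a plain (non-duration) Viterbi pass over the four heart-cycle states, with per-frame emission log-probabilities assumed to come from an earlier stage such as logistic regression. The duration extension of the HSMM and the paper's modified Viterbi are omitted; the uniform initial and transition probabilities in the demo are assumptions.

```python
import numpy as np

STATES = ["S1", "systole", "S2", "diastole"]

def viterbi(log_emit: np.ndarray, log_trans: np.ndarray) -> np.ndarray:
    """log_emit: (T x 4) per-frame log emission probabilities;
    log_trans: (4 x 4) log transition matrix. Returns the state path."""
    T, S = log_emit.shape
    delta = np.full((T, S), -np.inf)
    psi = np.zeros((T, S), dtype=int)
    delta[0] = log_emit[0] - np.log(S)           # uniform initial state
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans   # (from, to)
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_emit[t]
    path = np.zeros(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):               # backtrace
        path[t] = psi[t + 1, path[t + 1]]
    return path

# Demo with random emissions and uniform transitions.
T = 50
log_emit = np.log(np.random.dirichlet(np.ones(4), size=T))
log_trans = np.log(np.full((4, 4), 0.25))
print([STATES[s] for s in viterbi(log_emit, log_trans)[:5]])
```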
Article
Digital stethoscopes offer new opportunities for computerized analysis of heart sounds. Segmentation of heart sound recordings into periods related to the first and second heart sounds (S1 and S2) is fundamental to the analysis process. However, segmentation of heart sounds recorded with handheld stethoscopes in clinical environments is often complicated by background noise. A duration-dependent hidden Markov model (DHMM) is proposed for robust segmentation of heart sounds. The DHMM identifies the most likely sequence of physiological heart sounds based on the duration of the events, the amplitude of the signal envelope, and a predefined model structure. The DHMM was developed and tested with heart sounds recorded bedside with a commercially available handheld stethoscope from a population of patients referred for coronary angiography. The DHMM identified 890 of 901 S1 and S2 sounds in 73 test patients, which corresponds to 98.8% (CI: 97.8-99.3%) sensitivity, and misplaced 13 of 903 identified sounds, which corresponds to 98.6% (CI: 97.6-99.1%) positive predictivity. These results indicate that the DHMM is an appropriate model of the heart cycle and is suitable for segmentation of clinically recorded heart sounds.
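A small sketch of the signal-envelope input such duration-dependent models rely on: the amplitude envelope of a PCG via the Hilbert transform, smoothed with a moving average. The window length and sampling rate are illustrative assumptions.

```python
import numpy as np
from scipy.signal import hilbert

def pcg_envelope(pcg: np.ndarray, fs: float, win_s: float = 0.02) -> np.ndarray:
    """Smoothed amplitude envelope of a PCG signal."""
    env = np.abs(hilbert(pcg))                 # analytic-signal magnitude
    win = max(1, int(win_s * fs))              # moving-average window
    return np.convolve(env, np.ones(win) / win, mode="same")

# Example on a 5 s synthetic recording sampled at 2 kHz.
fs = 2000
env = pcg_envelope(np.random.randn(5 * fs), fs)
```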
Article
From the publisher: This is the first comprehensive introduction to Support Vector Machines (SVMs), a new-generation learning system based on recent advances in statistical learning theory. SVMs deliver state-of-the-art performance in real-world applications such as text categorisation, hand-written character recognition, image classification, and biosequence analysis, and are now established as one of the standard tools for machine learning and data mining. Students will find the book both stimulating and accessible, while practitioners will be guided smoothly through the material required for a good grasp of the theory and its applications. The concepts are introduced gradually in accessible and self-contained stages, while the presentation is rigorous and thorough. Pointers to relevant literature and web sites containing software ensure that it forms an ideal starting point for further study. Equally, the book and its associated web site will guide practitioners to updated literature, new applications, and on-line software.