Article

# Logistic Regression-HSMM-based Heart Sound Segmentation


## Abstract

The identification of the exact positions of the first and second heart sounds within a phonocardiogram (PCG), or heart sound segmentation, is an essential step in the automatic analysis of heart sound recordings, allowing for the classification of pathological events. While threshold-based segmentation methods have shown modest success, probabilistic models, such as hidden Markov models, have recently been shown to surpass the capabilities of previous methods. Segmentation performance is further improved when a priori information about the expected duration of the states is incorporated into the model, such as in a hidden semi-Markov model (HSMM). This article addresses the problem of the accurate segmentation of the first and second heart sound within noisy, real-world PCG recordings using an HSMM, extended with the use of logistic regression for emission probability estimation. In addition, we implemented a modified Viterbi algorithm for decoding the most likely sequence of states, and evaluated this method on a large dataset of 10172 seconds of PCG recorded from 112 patients (including 12181 first and 11627 second heart sounds). The proposed method achieved an average F1 score of 95.63 ± 0.85%, while the current state of the art achieved 86.28 ± 1.55% when evaluated on unseen test recordings. The greater discrimination between states afforded by using logistic regression, as opposed to the previous Gaussian distribution-based emission probability estimation, as well as the use of an extended Viterbi algorithm, allows this method to significantly outperform the current state-of-the-art method based on a two-sided, paired t-test.
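The core of the approach is an HSMM whose Viterbi decoder scores explicit state durations rather than per-frame self-transitions. As a rough illustration of that idea (not the authors' implementation; the real model uses trained logistic-regression emissions, learned duration densities, and an extended decoder that handles partial segments at the recording boundaries), a minimal duration-dependent Viterbi over the four states S1, systole, S2, and diastole can be sketched as:

```python
import numpy as np

# Heart sound states in fixed cyclic order: S1 -> systole -> S2 -> diastole -> S1
NEXT = {0: 1, 1: 2, 2: 3, 3: 0}

def hsmm_viterbi(log_emis, log_dur, max_dur):
    """Duration-dependent (HSMM) Viterbi decoding.

    log_emis: (T, 4) array of per-frame log emission likelihoods.
    log_dur:  (4, max_dur) array; log_dur[j, d-1] is the log probability
              that state j lasts exactly d frames.
    Returns an array of T state labels.
    """
    T = log_emis.shape[0]
    # Prefix sums so a segment's emission mass is a constant-time lookup.
    cum = np.vstack([np.zeros(4), np.cumsum(log_emis, axis=0)])
    delta = np.full((T + 1, 4), -np.inf)       # best log prob of ending state j at time t
    back = np.zeros((T + 1, 4, 2), dtype=int)  # (segment start, previous state)
    delta[0, :] = np.log(0.25)                 # uniform initial distribution
    for t in range(1, T + 1):
        for j in range(4):
            prev = [s for s in range(4) if NEXT[s] == j]
            for d in range(1, min(max_dur, t) + 1):
                seg = cum[t, j] - cum[t - d, j]  # emission mass of the candidate segment
                for s in prev:
                    score = delta[t - d, s] + log_dur[j, d - 1] + seg
                    if score > delta[t, j]:
                        delta[t, j] = score
                        back[t, j] = (t - d, s)
    # Backtrack segment by segment from the best final state.
    labels = np.empty(T, dtype=int)
    t, j = T, int(np.argmax(delta[T]))
    while t > 0:
        t0, s = back[t, j]
        labels[t0:t] = j
        t, j = t0, int(s)
    return labels
```

Per-frame rescaling of the emission terms does not change the arg-max path, which is why discriminative posteriors can be substituted for generative likelihoods in this decoder.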


... In [6], the authors predict the most likely sequence of heart sounds based on the events' durations and the signal's envelope amplitude using a hidden semi-Markov model (HSMM). The authors of [7] used logistic regression instead of the Gaussian distribution of [6] for the emission probabilities, extended the Viterbi algorithm, and achieved state-of-the-art results. A two-dimensional U-net CNN architecture is proposed in [8], inspired by the successful U-net architecture used in image segmentation. ...
... A two-dimensional U-net CNN architecture is proposed in [8], inspired by the successful U-net architecture used in image segmentation. They also combined this method with the work done in [7], treating the outputs of the CNN architecture as emission probabilities. These modifications further improved PCG segmentation. ...
... We use the preprocessing steps proposed in [6] for denoising and spike removal of the signals. Then, for feature extraction, we decompose each signal into four frequency bands (25-45, 45-80, 80-200, and 200-400 Hz) and use the segmentation method proposed in [7] to segment heart sounds into four parts. ...
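The four-band decomposition described in this excerpt can be approximated with a simple FFT brick-wall filter. This is an illustrative sketch, not the filter bank of the cited works (which typically use wavelet or Butterworth filters), and the 1000 Hz sampling rate is an assumption:

```python
import numpy as np

def fft_bandpass(x, fs, lo, hi):
    """Crude brick-wall band-pass via FFT masking (illustrative only)."""
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    X[(freqs < lo) | (freqs > hi)] = 0.0
    return np.fft.irfft(X, n=len(x))

def decompose_pcg(x, fs=1000):
    """Split a PCG signal into the four bands named in the excerpt."""
    bands = [(25, 45), (45, 80), (80, 200), (200, 400)]
    return np.stack([fft_bandpass(x, fs, lo, hi) for lo, hi in bands])
```

A 100 Hz tone, for instance, should land almost entirely in the third band (80-200 Hz).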
Preprint
Full-text available
Cardiovascular disease is one of the leading causes of death according to the WHO. Phonocardiography (PCG) is a cost-effective, non-invasive method suitable for heart monitoring. The main aim of this work is to classify heart sounds into normal/abnormal categories. Heart sounds are recorded using different stethoscopes, and thus vary in domain. Based on recent studies, this variability can affect heart sound classification. This work presents a Siamese network architecture for learning the similarity between normal vs. normal or abnormal vs. abnormal signals and the difference between normal vs. abnormal signals. By applying this similarity and difference learning across all domains, the task of domain-invariant heart sound classification can be achieved. We used the multi-domain 2016 PhysioNet/CinC challenge dataset for evaluation. Results: On the evaluation set provided by the challenge, we achieved a sensitivity of 82.8%, specificity of 75.3%, and mean accuracy of 79.1%. While overcoming the multi-domain problem, the proposed method surpassed the first-place method of the PhysioNet challenge by up to 10.9% in specificity and up to 5.6% in mean accuracy. Also, compared with similar state-of-the-art domain-invariant methods, our model converges faster and performs better in specificity (by 4.1%) and mean accuracy (by 1.5%) with an equal number of training epochs.
... sound (FHS) [4]. The S1-heart sound is produced during the closure of the atrioventricular (mitral, tricuspid) valves at the beginning of ventricular systole, and the S2-sound is generated during the closure of semilunar (aortic, pulmonary) valves at the end of the systole [5]. ...
... The innocent murmur is harmless (present in normal cases), but the presence of other murmurs is an indication of certain heart valve-related diseases. The automated identification of FHS components is the initial step in analyzing PCG signals to diagnose heart valve diseases [4]. The FHS can be detected using the electrocardiogram (ECG) signal as a reference, based on the detection of the QRS complex. ...
... Therefore, we have selected the order and window size of the Savitzky-Golay filter as 5 and 101, respectively, for the proposed work. In Table VI, the classification results of the SAE-DNN classifier are compared using the Shannon-energy envelope [51], Hilbert envelope [4], [52], and the proposed STKE energy envelope-based segmentation methods of PCG signals for HSAD. All performance measures of the SAE-DNN are highest using the proposed STKE envelope-based approach compared to the other two envelope-based methods for HSAD using PCG signals. ...
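The excerpts above mention locating fundamental heart sounds using the ECG QRS complex as a reference. A very small R-peak detector conveys the idea; this is a toy sketch (the threshold, smoothing window, and refractory period are illustrative choices, not from any cited work), and production systems use algorithms such as Pan-Tompkins:

```python
import numpy as np

def detect_r_peaks(ecg, fs, refractory=0.25):
    """Toy R-peak detector: square the signal, smooth it with a ~50 ms
    moving average, threshold, and enforce a refractory period."""
    energy = ecg ** 2
    win = max(1, int(0.05 * fs))
    smooth = np.convolve(energy, np.ones(win) / win, mode="same")
    thr = 0.5 * smooth.max()
    peaks, last = [], -np.inf
    for i in range(1, len(smooth) - 1):
        if (smooth[i] > thr and smooth[i] >= smooth[i - 1]
                and smooth[i] > smooth[i + 1] and (i - last) > refractory * fs):
            peaks.append(i)
            last = i
    return np.array(peaks)
```

Each detected R-peak can then anchor the search window for the following S1.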
Article
Full-text available
The phonocardiogram (PCG) signal deciphers the mechanical activity of the heart, and it consists of the fundamental heart sounds (S1 and S2), murmurs, and other associated sounds (S3 and S4). Detection of fundamental heart sound activity (FHSA) is vital for the automated analysis of PCG signals to diagnose various heart valve diseases. This paper proposes a time-frequency domain (TFD) deep neural network (DNN) approach for automated FHSA detection using PCG signals. The modified Gaussian window-based Stockwell Transform (MGWST) is used to obtain the time-frequency representation (TFR) of PCG signals. Shannon-Teager-Kaiser Energy (STKE), smoothing, and thresholding techniques are then employed to evaluate the segmented heart sound components. The TFD Shannon entropy (TFDSE) features are computed from the segmented heart sound components of the PCG signal. A deep neural network (DNN) developed from stacked autoencoders (SAEs) is used for the automated identification of FHSA components. The performance of the proposed approach is evaluated using two publicly available standard databases (Database 1: Michigan heart sound and murmur database; Database 2: PhysioNet Computing in Cardiology Challenge 2016). The results demonstrate that the proposed approach achieved accuracy, sensitivity, specificity, and precision values of 99.55%, 99.93%, 99.26%, and 99.02% for Database 1, and 95.43%, 97.92%, 98.32%, and 97.60% for Database 2, respectively. The proposed FHSA detection approach obtained better accuracy than existing methods.
... From the time series classification perspective, cardiac sequences can be segmented with the assistance of machine learning and deep learning. The most commonly employed features are cardiac signal envelopes, such as the homomorphic envelope (HoEnv), Hilbert envelope (HiEnv), power spectral density envelope (PSDEnv), and wavelet envelope [23], which yield accurate heart sound segmentation in the literature. In terms of machine learning, the logistic regression-hidden semi-Markov model (LR-HSMM) has been shown to address the problem of accurate heart sound segmentation [23]. ...
... The most commonly employed features are cardiac signal envelopes, such as the homomorphic envelope (HoEnv), Hilbert envelope (HiEnv), power spectral density envelope (PSDEnv), and wavelet envelope [23], which yield accurate heart sound segmentation in the literature. In terms of machine learning, the logistic regression-hidden semi-Markov model (LR-HSMM) has been shown to address the problem of accurate heart sound segmentation [23]. Moreover, neural network-based deep learning methods, such as the long short-term memory (LSTM) and the gated recurrent unit (GRU), have outperformed machine learning methods for the segmentation of cardiac sequences [24][25][26]. ...
... The existing techniques listed in Table 5 segment cardiac sequences using heart sound stage division as a reference and use envelopes extracted from filtered radar phase signals as features. The logistic regression-HSMM-based heart sound segmentation method [23] was used in [12] to segment the heart sound stages, and the F1 score for S1 was approximately 93%. Reference [24] evaluated different LSTM architectures and concluded that the Bi-LSTM has the best performance in cardiac sequence segmentation. ...
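The Hilbert and homomorphic envelopes named in these excerpts can both be derived from the analytic signal. A minimal FFT-based sketch; the 8 Hz low-pass cutoff and the crude FFT low-pass are illustrative assumptions, not the filters of the cited works:

```python
import numpy as np

def hilbert_envelope(x):
    """Hilbert envelope: magnitude of the analytic signal (FFT method)."""
    N = len(x)
    X = np.fft.fft(x)
    h = np.zeros(N)
    h[0] = 1.0
    if N % 2 == 0:
        h[N // 2] = 1.0
        h[1:N // 2] = 2.0
    else:
        h[1:(N + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(X * h))

def homomorphic_envelope(x, fs, cutoff=8.0):
    """Homomorphic envelope: low-pass filter the log of the Hilbert
    envelope, then exponentiate (crude FFT low-pass for illustration)."""
    env = np.log(hilbert_envelope(x) + 1e-12)
    E = np.fft.rfft(env)
    freqs = np.fft.rfftfreq(len(env), d=1.0 / fs)
    E[freqs > cutoff] = 0.0
    return np.exp(np.fft.irfft(E, n=len(env)))
```

For a pure tone the Hilbert envelope is the constant tone amplitude, which makes a convenient sanity check.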
Article
Cardiac cycle detection methods based on radar sensing have made marked progress in the last decade. In this study, a novel harmonic distribution-based non-contact cardiac cycle detection method is proposed to extract heartbeat intervals and rhythm sequences from radar signals. Initially, we exploit a bimodal Gaussian distribution model of the frequency domain and the Shannon energy to separate heartbeat harmonics from radar signals, which preserve the instantaneous variation features of the signal components. Furthermore, we cluster the harmonics into three types of waves based on K-means. The peak positions of the three waves are near the locations of the R-peak, T-peak, and T-wave end in the electrocardiograph (ECG), respectively. Thus, the detected peak locations of the three waves can be directly used to estimate beat-to-beat intervals (BBIs). Additionally, we compared three recurrent neural networks (RNNs) that use the clustered waves to segment sequences of "R-T-R" and "Q-R-S-T-Q". Specific subsets are classified from the sequences to prove their association with cardiac systolic and diastolic variation. Finally, verification is performed using a clinical dataset. The suitable centre frequency range for the proposed model is about 3-15 Hz. The cardiac cycle detection method achieves a mean relative error of 5.34% ± 3.57% for BBI detection and an accuracy of 95.88% for heartbeat rhythm segmentation. The experimental results show that our radar-based heartbeat harmonic model (RBHHM) method has better accuracy than other state-of-the-art cardiac cycle detection methods.
... A significant amount of work focuses on temporal modeling for PCG segmentation. For example, the Logistic Regression Hidden Semi-Markov Model (LR-HSMM) [17] was used to predict the sequence of fundamental heart sounds; in another work, a Recurrent Neural Network (RNN) [18] was shown to perform better than a Convolutional Neural Network (CNN) in analyzing the sequential states of PCG signals. Since these temporal methods lack the ability to process raw signals, a feature extraction algorithm is applied to the signals first. ...
... Since these temporal methods lack the ability to process raw signals, a feature extraction algorithm will be applied to signals first. Normally, frequency-domain features will be extracted and proven to be effective, for example, wavelet transform was used in [17], and Mel-frequency cepstral coefficients (MFCC) were used in [18], [19]. ...
... For the segmentation task, this challenge provides annotations for the fundamental heart sounds: S1, systole, S2, and diastole. These annotations were generated by the LR-HSMM algorithm [17] and manually checked for correctness. ...
Preprint
The need for telehealth and home-based monitoring surges during the COVID-19 pandemic. Based on the recent advancement of concurrent electrocardiograph (ECG) and phonocardiogram (PCG) wearable sensors, this paper proposes a novel framework for synchronized ECG and PCG signal analysis for cardiac function monitoring. Our system jointly performs R-peak detection on ECG, fundamental heart sounds segmentation of PCG, and cardiac condition classification. First, we propose the use of recurrent neural networks and developed a new type of labeling method for R-peak detection algorithm. The new labeling strategy utilizes a regression objective to resolve the previous imbalanced classification problem. Second, we propose a 1D U-Net structure for PCG segmentation within a single heartbeat length. We further utilize the multi-modality of signals and contrastive learning to enhance model performance. Finally, we extract 20 features from our signal labeling algorithms to apply to two real-world problems: snore detection during sleep and COVID-19 detection. The proposed method achieves state-of-the-art performance on multiple benchmarks using two public datasets: MIT-BIH and PhysioNet 2016. The proposed method provides a cost-effective alternative to labor-intensive manual segmentation, with more accurate segmentation than existing methods. On the dataset collected by Bayland Scientific which includes synchronized ECG and PCG signals, the proposed system achieves an end-to-end R-peak detection with F1 score of 99.84%, heart sound segmentation with F1 score of 91.25%, and snore and COVID-19 detection with accuracy of 96.30% and 95.06% respectively.
... With the help of a powerful pre-trained model from the field of image classification, the transformed time-frequency maps are directly fed into the model for end-to-end classification of heart sounds. In earlier studies of heart sound abnormality detection [29,30,31,32], heart sounds were processed by segments before classification. That is, heart sounds are divided into several components, and then feature extraction is performed on each component. ...
... After denoising, the heart sound signal is segmented into four fundamental components, i.e., the first heart sound, the second heart sound, systole, and diastole, for subsequent pathological feature extraction. At present, the main segmentation methods include: envelope-based methods [29], feature-based methods [30], machine learning-based methods [31], electrocardiogram or carotid signal-based methods [32], and Shannon energy-based methods [37]. However, some studies of heart sound abnormality detection have achieved excellent performance even though they avoided this step before feature extraction [33,34]. ...
Article
Full-text available
The advantages of non-invasive, real-time, and convenient computer audition-based heart sound abnormality detection methods have increasingly attracted efforts in the cardiovascular disease community. Time-frequency analyses are crucial for computer audition-based applications. However, a comprehensive investigation into an optimised way of extracting time-frequency representations from heart sounds has been lacking until now. To this end, we propose a comprehensive investigation of time-frequency methods for analysing the heart sound, i.e., short-time Fourier transformation, Log-Mel transformation, Hilbert-Huang transformation, wavelet transformation, Mel transformation, and Stockwell transformation. The time-frequency representations are automatically learnt via pre-trained deep convolutional neural networks. Considering the urgent need of smart stethoscopes for highly robust detection algorithms in real environments, the training, verification, and testing sets employed in the extensive evaluation are subject-independent. Finally, to further understand the heart sound-based digital phenotype for cardiovascular diseases, explainable artificial intelligence approaches are used to reveal the reasons for the performance differences of the six time-frequency representations in heart sound abnormality detection. Experimental results show that Stockwell transformation beats the other methods by reaching the highest overall score of 65.2%. The interpretable results demonstrate that Stockwell transformation not only presents more information for heart sounds, but also provides a certain noise robustness. Besides, the considered fine-tuned deep model improves the mean accuracy over the previous state-of-the-art results by 9.0% in subject-independent testing.
... In the past, researchers concentrated on the preprocessing and feature extraction of such physiological signals to improve segmentation performance [1]-[3]. The essence of these methods is to amplify the inter-state differences and the discrepancy between target signals and noise, for example by calculating slope changes and applying wavelet transforms to locate the QRS complex and P waves in ECGs [4], [5] and the fundamental heart sounds in PCGs [6], [7], or by modeling PPGs with Gaussian functions [8], [9]. Nonetheless, these classic methods can only deal with static scenes with single-source noise and non-severe variations. Recent research indicates that supervised machine learning methods are capable of significantly improving the segmentation performance for pseudoperiodic physiological signals, and are more robust on dynamic databases [10], [11]. ...
... where TP is true positive, FP is false positive, and FN is false negative. The standard grace period of 150 ms is used for beat-by-beat comparison in QRS location [52] and 100 ms for state-by-state comparison in heart sound segmentation [6]. 4) Implementation Details: In this work, the Decoders for the two tasks are fixed as two-layer dense blocks. ...
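The grace-period scoring described above can be sketched as a greedy one-to-one matching between reference and predicted event times; the function below illustrates the metric, and is not the evaluation code of the cited works:

```python
def tolerant_f1(ref, pred, tol):
    """F1 with a tolerance window: a predicted event is a true positive
    if it lies within `tol` of a not-yet-matched reference event."""
    ref = sorted(ref)
    used = [False] * len(ref)
    tp = 0
    for p in sorted(pred):
        for i, r in enumerate(ref):
            if not used[i] and abs(p - r) <= tol:
                used[i] = True
                tp += 1
                break
    fp = len(pred) - tp   # predictions with no matching reference event
    fn = len(ref) - tp    # reference events never matched
    return 2 * tp / (2 * tp + fp + fn) if (tp or fp or fn) else 1.0
```

With times in milliseconds, `tol=100` reproduces the 100 ms state-by-state grace period, and `tol=150` the QRS beat-by-beat one.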
Preprint
Full-text available
Precise segmentation is a vital first step to analyze the semantic information of the cardiac cycle and capture anomalies in cardiovascular signals. However, in the field of deep semantic segmentation, inference is often unilaterally confounded by the individual attributes of the data. For cardiovascular signals, quasi-periodicity is the essential characteristic to be learned, regarded as the synthesis of the attributes of morphology (Am) and rhythm (Ar). Our key insight is to suppress the over-dependence on Am or Ar during the generation of deep representations. To address this issue, we establish a structural causal model as the foundation to customize the intervention approaches on Am and Ar, respectively. In this paper, we propose contrastive causal intervention (CCI) to form a novel training paradigm under a frame-level contrastive framework. The intervention can eliminate the implicit statistical bias brought by a single attribute and leads to more objective representations. We conduct comprehensive experiments under controlled conditions for QRS location and heart sound segmentation. The final results indicate that our approach can evidently improve the performance by up to 0.41% for QRS location and 2.73% for heart sound segmentation. The efficiency of the proposed method generalizes to multiple databases and noisy signals.
... In this work, we use the logistic regression-hidden semi-Markov model (HSMM) based segmentation method presented in [49,50] to identify the S1 and S2 locations. This segmentation method incorporates logistic regression (LR) into a duration-dependent HSMM for emission probability estimation and an extended Viterbi algorithm for decoding the most likely sequence of heart sound states [48]. ...
... This segmentation method incorporates logistic regression (LR) into a duration dependant HSMM for emission probability estimation and an extended Viterbi algorithm for decoding the most likely sequence of heart sound states [48]. The software code of this segmentation method [49] is made available online [50] which we have utilized in our study. ...
Article
Full-text available
Cardiovascular disease (CVD) is considered a significant public health concern around the world. Automated early diagnostic tools for CVDs can provide substantial benefits, especially in low-resource countries. This study proposes a time-domain Hilbert envelope feature (HEF) extraction scheme that can effectively distinguish among different cardiac anomalies from heart sounds even in highly noisy recordings. The method is motivated by how a cardiologist listens to heart murmur configurations, e.g., the intensity of the heart sound envelope over a cardiac cycle. The proposed feature is invariant to the heart rate and to the positions of the first and second heart sounds, and is robust in extracting the murmur configuration pattern in the presence of respiratory noise. Experimental evaluations are performed against two different state-of-the-art methods in the presence of respiratory noise with signal-to-noise ratio (SNR) values ranging from 0-15 dB. The proposed HEF, fused with standard acoustic and ResNet features, yields an average accuracy, sensitivity, specificity, and F1-score of 94.78% (±2.63), 87.48% (±6.07), 96.87% (±1.51), and 87.47% (±5.94), respectively, while using a random forest (RF) classifier applied to a mixture of clean and noise-mixed recordings of an open-access dataset. Compared to the best-performing baseline model, this feature-fusion scheme provides a significant performance improvement (p<0.05), notably achieving an absolute improvement of 6.16% in averaged sensitivity. When applied to noisy heart sound recordings collected from a local hospital, this method significantly outperforms the existing systems by achieving an average accuracy of 80.54% (±2.65) and sensitivity of 68.52% (±4.54). The achieved sensitivity yields an absolute improvement of 12.65% compared to the best-performing baseline model on this real-world dataset.
... Moreover, the expectation maximisation algorithm developed in [71] searched for the sojourn time distribution parameters of an HSMM for each subject. Many studies [6], [31], [40], [41], [43], [72]-[79] have employed the logistic regression-based HSMM (LR-HSMM) proposed in [80] for heart sound segmentation. LR is incorporated to predict the posterior probability P(s_j | o_t), and the emission matrix B is then computed with the Bayes rule. ...
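The Bayes-rule step in the LR-HSMM converts the discriminative posteriors P(s_j | o_t) into scaled emission likelihoods by dividing out the state priors; the frame-constant evidence P(o_t) can be dropped because it does not affect the decoded path. A minimal sketch:

```python
import numpy as np

def posteriors_to_emissions(posteriors, priors):
    """Bayes rule with the evidence term dropped:
    P(o_t | s_j) is proportional to P(s_j | o_t) / P(s_j).
    posteriors: (T, n_states); priors: (n_states,)."""
    lik = posteriors / priors
    # Per-frame renormalisation is optional: Viterbi's arg-max path is
    # invariant under any positive per-frame scaling.
    return lik / lik.sum(axis=1, keepdims=True)
```

Dividing by the priors matters: a state that is rare a priori but receives a moderate posterior can still carry the larger likelihood.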
Preprint
Full-text available
Heart sound auscultation has been demonstrated to be beneficial in clinical usage for early screening of cardiovascular diseases. Due to the high requirement of well-trained professionals for auscultation, automatic auscultation benefiting from signal processing and machine learning can help auxiliary diagnosis and reduce the burden of training professional clinicians. Nevertheless, classic machine learning is limited in performance improvement in the era of big data. Deep learning has achieved better performance than classic machine learning in many research fields, as it employs more complex model architectures with a stronger capability of extracting effective representations. Deep learning has been successfully applied to heart sound analysis in the past years. As most review works on heart sound analysis were published before 2017, the present survey is the first to provide a comprehensive overview summarising papers on heart sound analysis with deep learning over the past six years (2017-2022). We introduce both classic machine learning and deep learning for comparison, and further offer insights into the advances and future research directions in deep learning for heart sound analysis.
... These include entropy, the Hilbert-Huang transform, the wavelet transform, and autocorrelation, among others [11,58,69,70]. Springer et al. [54] proposed a logistic regression-based hidden semi-Markov model (HSMM) for the segmentation of the first (S1) and second (S2) heart sounds within noisy, real-world PCG recordings. The inclusion of explicit timing durations helps improve the differentiation of heart sound-like noise from noisy heart sounds. ...
Article
Full-text available
Phonocardiogram (PCG) is commonly used as a diagnostic tool in ambulatory monitoring in order to evaluate cardiac abnormalities and detect cardiovascular diseases. Although cardiac auscultation is widely used for evaluation of cardiac function, the analysis of heart sound signals mostly depends on the clinician’s experience and skills. There is growing demand for automatic and objective heart sound interpretation techniques. The objective of this study is to develop an automatic classification method for anomaly (binary and multi-class) detection of PCG recordings without any segmentation. A deep neural network (DNN) model is used on the raw data during the extraction of the features of the PCG inputs. Deep feature maps obtained from hierarchically placed layers in the DNN are fed to various shallow classifiers for the anomaly detection, including support vector classifier (SVC), k-nearest neighbors (KNN), random forest (RF), gradient boosting (GB) classifier, decision tree (DT) classifier, quadratic discriminant analysis (QDA), and multi-layer perceptron (MLP). The principal component analysis (PCA) technique is used to reduce the high dimensionality of the feature maps. Finally, two well-known heart sound databases, namely the PhysioNet/Computing in Cardiology (CinC) Challenge heart sound database and the heart valve disease (HVD) database, are used for evaluation. The databases are significantly different in terms of the tools used for data acquisition, clinical protocols, digital storage, and signal quality, making them challenging to process and analyze. Using 10-fold cross-validation, experimental results demonstrate that the proposed deep features with shallow classifiers yield the highest performance, with accuracies of 99.61% and 99.44% for binary and multi-class classification on the two databases, respectively. The results indicate that our method is effective for the detection of abnormal heart sound signals and outperforms other state-of-the-art methods.
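The PCA step mentioned in the abstract reduces the dimensionality of the deep feature maps before they reach the shallow classifiers. A plain SVD-based sketch (illustrative, not the authors' code):

```python
import numpy as np

def pca_reduce(X, k):
    """Project the rows of X onto the top-k principal components,
    computed from the SVD of the mean-centred data matrix."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T   # (n_samples, k) scores
```

The singular values in `S` also give the variance explained by each component, which is how the target dimensionality `k` is usually chosen.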
... For example, El-Segaier et al. proposed to segment heart sounds with reference to corresponding ECG signals [6]. On this basis, Springer et al. and Min et al. used the relationship between the R-peak of the ECG signal and S1 in the heart sound to implement localization of S1 and S2 [7,8]. This method required acquisition of multiple signals, which resulted in certain limitations in practice. ...
Article
Full-text available
The accurate localization of S1 and S2 is essential for heart sound segmentation and classification. However, current direct heart sound segmentation algorithms have poor noise immunity and low accuracy. Therefore, this paper proposes a new optimal heart sound segmentation algorithm based on K-means clustering and Haar wavelet transform. The algorithm includes three parts. Firstly, this method uses the Viola integral method and Shannon’s energy-based algorithm to extract the function of the envelope of the heart sound energy. Secondly, the time–frequency domain features of the acquired envelope are extracted from different dimensions and the optimal peak is searched adaptively based on a dynamic segmentation threshold. Finally, K-means clustering and Haar wavelet transform are implemented to localize S1 and S2 of heart sounds in the time domain. After validation, the recognition rate of S1 reached 98.02% and that of S2 reached 96.76%. The model outperforms other effective methods that have been implemented. The algorithm has high robustness and noise immunity. Therefore, it can provide a new method for feature extraction and analysis of heart sound signals collected in clinical settings.
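The Shannon energy envelope used in this and several of the cited methods emphasizes medium-intensity components while suppressing both low-level noise and dominant peaks. A minimal sketch (the window length, sampling rate, and normalisation are illustrative choices, not the cited algorithm):

```python
import numpy as np

def shannon_energy_envelope(x, fs, win_ms=20):
    """Average Shannon energy, -x^2 * log(x^2), smoothed over a sliding
    window on an amplitude-normalised signal."""
    x = x / (np.max(np.abs(x)) + 1e-12)
    se = -x ** 2 * np.log(x ** 2 + 1e-12)   # peaks where |x| is moderate
    win = max(1, int(win_ms * fs / 1000))
    return np.convolve(se, np.ones(win) / win, mode="same")
```

Because -x^2 log(x^2) vanishes at both |x| = 0 and |x| = 1, the envelope highlights the body of S1/S2 bursts rather than isolated sample spikes.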
... Fig. 7(a) and (b) also shows the four heart sounds: the first heart sound (S1), the second heart sound (S2), systole, and diastole, which are identified by the duration of the first envelope segment, the duration of the second envelope segment, the interval from the upper boundary of S1 to the lower boundary of S2, and the interval from the upper boundary of S2 to the lower boundary of the next S1, respectively, in every two consecutive cardiac cycles. The heart sound envelopes were segmented using a hidden semi-Markov model (HSMM) with machine learning [27]. Fig. 7(a) and (b) compares the heart sound durations/intervals obtained using the proposed system and the PCG reference system and shows impressive agreement. ...
Article
Full-text available
This letter presents a heart sound monitoring system with a chest-worn sensor without contact with the skin. A self-injection-locked oscillator (SILO) and tag antenna are the essential components of the sensor, which operates at 5.8 GHz. The sensor uses the tag antenna to transmit the SILO output signal, whose frequency is modulated by the Doppler effect caused by the chest movement, to a remote frequency modulation (FM) receiver for extracting the heart sounds. The tag antenna comprises a loop antenna and a concentric loop (CL) in a stack to improve the directivity toward the chest. Experiments were conducted at several auscultation sites to identify the four major heart sounds (the first heart sound (S1), the second heart sound (S2), systole, and diastole) based on a hidden semi-Markov model (HSMM). The results thus obtained correlate well with phonocardiogram (PCG) measurements. Index Terms: Doppler radar, heart sound monitoring, self-injection-locked oscillator (SILO), tag antenna, wearable device.
... After providing the judging condition for continuous hand action prediction, RMS values were sequentially input into the trained models, and then the path of the key state transition could be backtracked by maximum state probability [39]. Many factors still constrained the prediction process, especially its processing time. ...
Article
Full-text available
Conventional classification of hand motions and continuous joint angle estimation based on sEMG have been widely studied in recent years. The classification task focuses on discrete motion recognition and shows poor real-time performance, while continuous joint angle estimation evaluates real-time joint angles from the continuity of the limb. Few researchers have investigated continuous hand action prediction based on hand motion continuity. In our study, we propose the key state transition as a condition for continuous hand action prediction and simulate the prediction process using a sliding window with long-term memory. Firstly, the key state modeled by GMM-HMMs is set as the condition. Then, the sliding window is used to dynamically look for the key state transition. The prediction results are given upon finding the key state transition. To extend to continuous multi-gesture action prediction, we use model pruning to improve reusability. Eight subjects participated in the experiment, and the results show that the average accuracy for continuous two-hand actions is 97% with a 70 ms time delay, which is better than LSTM (94.15%, 308 ms) and GRU (93.83%, 300 ms). In supplementary experiments with continuous four-hand actions, over 85% prediction accuracy is achieved with an average time delay of 90 ms.
... This ensures that strong disturbances are removed and only sequences of high quality are kept. We calculate segmentations for S1, S2, diastole and systole using the segmentation algorithm of Springer et al. [33]. ...
Preprint
Full-text available
Objective: Cardiovascular diseases (CVDs) account for a high fatality rate worldwide. Heart murmurs can be detected from phonocardiograms (PCGs) and may indicate CVDs. Still they are often overlooked as their detection and correct clinical interpretation requires expert skills. In this work, we aim to predict the presence of murmurs and clinical outcome from multiple PCG recordings employing an explainable multitask model. Approach: Our approach consists of a two-stage multitask model. In the first stage, we predict the murmur presence in single PCGs using a multiple instance learning (MIL) framework. MIL also allows us to derive sample-wise classifications (i.e. murmur locations) while only needing one annotation per recording ("weak label") during training. In the second stage, we fuse explainable hand-crafted features with features from a pooling-based artificial neural network (PANN) derived from the MIL framework. Finally, we predict the presence of murmurs as well as the clinical outcome for a single patient based on multiple recordings using a simple feed-forward neural network. Main results: We show qualitatively and quantitatively that the MIL approach yields useful features and can be used to detect murmurs on multiple time instances and may thus guide a practitioner through PCGs. We analyze the second stage of the model in terms of murmur classification and clinical outcome. We achieved a weighted accuracy of 0.714 and an outcome cost of 13612 when using PANN model and demographic features on the CirCor dataset (hidden testset of the George B. Moody PhysioNet challenge 2022, team "Heart2Beat", rank 12 / 40). Significance: To the best of our knowledge, we are the first to demonstrate the usefulness of MIL in PCG classification. Also, we showcase how the explainability of the model can be analyzed quantitatively, thus avoiding confirmation bias inherent to many post-hoc methods. 
Finally, our overall results demonstrate the merit of employing MIL combined with handcrafted features for the generation of explainable features as well as for a competitive classification performance.
... Sun et al. [14] proposed an automatic heart sound segmentation method based on the Hilbert transform. The hidden semi-Markov model (HSMM) method [15] was extended with logistic regression to achieve signal segmentation in a noisy environment. Furthermore, there are some segmentation methods based on deep learning [16]-[18]. ...
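The snippet above refers to the central idea of the reviewed article: replacing Gaussian emission densities in the HSMM with a logistic regression classifier. A minimal sketch of how per-frame classifier posteriors can be converted to emission likelihoods via Bayes' rule is shown below; the weights `W`, `b0`, and the state priors here are random placeholders for illustration, not the trained values from the cited work.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def emission_probs(features, W, b, state_priors):
    """Per-frame emission likelihoods for an HSMM derived from a (hypothetical,
    pre-trained) multinomial logistic regression.
    Bayes' rule: P(obs | state) is proportional to P(state | obs) / P(state)."""
    posteriors = softmax(features @ W + b)   # P(state | observation) per frame
    emissions = posteriors / state_priors    # proportional to P(obs | state)
    return emissions / emissions.sum(axis=1, keepdims=True)

# Toy example: 5 frames, 3 envelope features, 4 states (S1, systole, S2, diastole).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
W = rng.normal(size=(3, 4))
b0 = np.zeros(4)
priors = np.array([0.15, 0.30, 0.12, 0.43])   # assumed expected state occupancy
E = emission_probs(X, W, b0, priors)
```

In an HSMM decoder these row-normalised scores would feed the (extended) Viterbi recursion in place of Gaussian likelihoods.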
Article
Full-text available
In recent years, auxiliary diagnosis technology for cardiovascular disease based on abnormal heart sound detection has become a research hotspot. Heart sound signals are promising in the preliminary diagnosis of cardiovascular diseases. Previous studies have focused on capturing the local characteristics of heart sounds. In this paper, we investigate a method for mapping heart sound signals with complex patterns to fixed-length feature embeddings, called HS-Vectors, for abnormal heart sound detection. To obtain the full embedding of the complex heart sound, HS-Vectors are computed through the Time-Compressed and Frequency-Expanded Time-Delay Neural Network (TCFE-TDNN) and the Dynamic Masked-Attention (DMA) module. HS-Vectors extract and exploit the global and critical heart sound characteristics by masking out irrelevant information. Based on the TCFE-TDNN module, the heart sound signal within a certain time window is projected into a fixed-length embedding. Then, with a learnable mask attention matrix, DMA statistics pooling aggregates multi-scale hidden features from different TCFE-TDNN layers and masks out irrelevant frame-level features. Experimental evaluations are performed on a 10-fold cross-validation task using the 2016 PhysioNet/CinC Challenge dataset and the new publicly available pediatric heart sound dataset we collected. Experimental results demonstrate that the proposed method outperforms the state-of-the-art models in abnormality detection.
... After performing spike removal (following the steps outlined in a 2009 paper by Schmidt et al. [30]) and downsampling the signal to a sampling rate of 2205 Hz, each recording was segmented into 6 overlapping (50%) blocks, each consisting of 4 cardiac cycles (with each cycle starting and ending at the first heart sound S1), using a modified version of the segmentation algorithm of Springer et al. [31]. For each segment, the first 13 Mel-frequency cepstral coefficients (MFCCs) were computed, using a Hanning window function for the Fourier transform of the signal, with a time step size of 25 ms and a window overlap of 10 ms. ...
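The overlapping-block scheme described in the snippet above (blocks of 4 whole cardiac cycles with 50% overlap, each starting and ending at an S1) can be sketched as follows, assuming the S1 onset sample indices are already available from a segmenter; the onsets and signal here are synthetic placeholders.

```python
import numpy as np

def overlapping_cycle_blocks(signal, s1_onsets, cycles_per_block=4, overlap=0.5):
    """Split a PCG recording into blocks of whole cardiac cycles with the given
    overlap, each block starting and ending at an S1 onset.
    s1_onsets: sorted sample indices of detected S1 sounds."""
    step = max(1, int(cycles_per_block * (1 - overlap)))  # cycles to advance per block
    blocks = []
    for i in range(0, len(s1_onsets) - cycles_per_block, step):
        start, stop = s1_onsets[i], s1_onsets[i + cycles_per_block]
        blocks.append(signal[start:stop])
    return blocks

fs = 2205                              # sampling rate after downsampling, as in the text
sig = np.zeros(14 * fs)                # 14 s dummy recording
onsets = np.arange(0, 14 * fs, fs)     # toy data: exactly one cycle per second
blocks = overlapping_cycle_blocks(sig, onsets)
```

With 50% overlap the window advances by 2 cycles at a time, so consecutive blocks share half their cycles.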
Preprint
Full-text available
Background: Although neural networks have shown promise in classifying pathological heart sounds, algorithms have so far either been trained or tested on selected cohorts, which can result in selection bias. Herein, the main objective is to explore the ability of neural networks to predict valvular heart disease (VHD) from heart sound (HS) recordings in an unselected cohort. Methods and results: Using annotated HSs and echocardiogram data from 2124 subjects from the Tromsø 7 study, we trained a recurrent neural network to predict murmur grade, which was subsequently used to predict VHD. Presence of aortic stenosis (AS) was detected with sensitivity 90.9%, specificity 94.5%, and area-under-the-curve (AUC) 0.979 (CI:0.963-0.995). At least moderate AS was detected with AUC 0.993 (CI:0.989-0.997). Moderate or greater aortic and mitral regurgitation (AR and MR) were predicted with AUC 0.634 (CI:0.565-0.703) and 0.549 (CI:0.506-0.593) respectively, which increased to 0.766 and 0.677 when adding clinical variables as predictors. Excluding asymptomatic cases from the positive class increased sensitivity to AR from 54.9% to 85.7%, and sensitivity to MR from 55.6% to 83.3%. Screening jointly for at least moderate regurgitation or presence of stenosis resulted in detection of 54.1% of positive cases, 60.5% of negative cases, 97.7% of AS cases (n=44), and all 12 MS cases. Conclusions: Despite the cohort being unselected, the algorithm detected AS with performance exceeding that achieved in similar studies based on selected cohorts. Detection of AR and MR based on HS audio was unreliable, but sensitivity was considerably higher for symptomatic cases, and inclusion of clinical variables improved prediction significantly.
... Now, the signals in the dataset are of variable length, but the input to the model has to be of equal length for all the data. To achieve this, the signals are segmented into heart sound cycles using the method proposed by Springer et al. [13]. After segmentation, each heart sound cycle is either truncated or zero-padded to a fixed length of 2500 samples, depending on its cycle length. ...
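The fixed-length step described in the snippet above (truncate or zero-pad each segmented cycle to 2500 samples) is simple to sketch; the target length comes from the text, the function name is our own.

```python
import numpy as np

def fix_length(cycle, target=2500):
    """Truncate or zero-pad one segmented heart sound cycle to a fixed length,
    so all cycles can be batched as equal-length model inputs."""
    cycle = np.asarray(cycle, dtype=float)
    if len(cycle) >= target:
        return cycle[:target]                      # truncate long cycles
    return np.pad(cycle, (0, target - len(cycle)))  # zero-pad short cycles

long_cycle = fix_length(np.ones(3000))   # truncated to 2500 samples
short_cycle = fix_length(np.ones(2000))  # padded with 500 trailing zeros
```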
Preprint
Full-text available
Cardiovascular diseases are one of the leading causes of death in today's world, and early screening of heart condition plays a crucial role in preventing them. The heart sound signal is one of the primary indicators of heart condition and can be used to detect abnormality in the heart. The acquisition of the heart sound signal is non-invasive, cost-effective, and requires minimal equipment. But currently the detection of heart abnormality from the heart sound signal depends largely on the expertise and experience of the physician. As such, an automatic detection system for heart abnormality detection from the heart sound signal can be a great asset for people living in underdeveloped areas. In this paper we propose a novel deep learning-based dual-stream network with an attention mechanism that uses both the raw heart sound signal and MFCC features to detect abnormality in the heart condition of a patient. The deep neural network has a convolutional stream that uses the raw heart sound signal and a recurrent stream that uses the MFCC features of the signal. The features from these two streams are merged using a novel attention network and passed through the classification network. The model is trained on the largest publicly available dataset of PCG signals and achieves an accuracy of 87.11%, sensitivity of 82.41%, specificity of 91.8%, and a MACC of 87.12%.
... The measurement quality was assessed using an automatic signal quality index (SQI) as proposed by Shi et al. [19]. To segment heartbeats from the resulting displacement data, a hidden semi-Markov model (HSMM) detecting the first heart sound of every heartbeat [20] was applied to MIS data [11]. Based on the inter-beat-intervals, the instantaneous HR was calculated. ...
... 2) Segmentation: The original heart sound recordings are segmented into short intervals of cardiac cycles to increase the number of heart sound samples and ensure that all heart sounds are of the same length. We implement Springer's algorithm for heart sound segmentation, which utilises a trained hidden semi-Markov model with logistic regression to identify the states of heart sounds [22]. The heart sound is segmented at the beginning of each cardiac cycle with a length of 3 seconds. ...
Conference Paper
Full-text available
Heart sound auscultation is an effective method for early-stage diagnosis of heart disease. The application of deep neural networks is gaining increasing attention in automated heart sound classification. This paper proposes deep Convolutional Neural Networks (CNNs) to classify normal/abnormal heart sounds, which takes two-dimensional Mel-scale features as input, including Mel frequency cepstral coefficients (MFCCs) and the Log Mel spectrum. We employ two weighted loss functions during the training to mitigate the class imbalance issue. The model was developed on the public PhysioNet/Computing in Cardiology Challenge (CinC) 2016 heart sound database. On the considered test set, the proposed model with Log Mel spectrum as features achieves an Unweighted Average Recall (UAR) of 89.6%, with sensitivity and specificity being 89.5% and 89.7% respectively. This work proposes a CNN-based model to enable automated heart sound classification, which can provide auxiliary assistance for heart auscultation and has the potential to screen for heart pathologies in clinical applications at a relatively low cost.
... To examine the signal characteristics, we then performed a spectral analysis based on continuous wavelet transform (CWT). Next, we used the segmentation function for phonocardiogram recordings proposed by [9] to examine the spectral dynamics for each cardiac cycle independently. This function identifies S1, systole, S2 and diastole episodes using a duration-dependent logistic regression-based Hidden Markov model. ...
Article
Full-text available
In previous work, we demonstrated the potential of blood flow sounds for biometric authentication acquired by a custom-built auscultation device. For this purpose, we calculated the frequency spectrum for each cardiac cycle represented within the measurements based on continuous wavelet transform. The resulting spectral images were used to train a convolutional neural network based on measurements from seven users. In this work, we investigate which areas of those images are relevant for the network to correctly identify a user. Since they describe the frequencies’ energy within a cardiac cycle, this information can be used to gain knowledge on biometric properties within the signal. Therefore, we calculate the saliency maps for each input image and investigate their mean for each user, opening perspectives for further investigation of the spectral information that was found to be potentially relevant.
... Giordano and Knaflitz [23] employed the Shannon energy envelope algorithm to perform a reliable quantitative characterization of the timing of the occurrence of HS components; they believed that this method was more efficient than other energy envelope methods due to the lack of a search-back phase in the HS detection process. Springer et al. [24] proposed an enhanced hidden Markov algorithm for HS signal segmentation, which included duration dependencies and logistic regression-based emission probabilities, and the implementation of an extension to the Viterbi algorithm for use with hidden semi-Markov models (HSMMs). Li et al. [25] used the Empirical Wavelet Transform (EWT) to decompose S1 and the Hilbert Transform to extract the instantaneous frequency (IF) of the mitral component (M1) and the tricuspid component (T1). ...
Article
Full-text available
Background: Korotkoff sound (KS) is an important indicator of hypertension when monitoring blood pressure. However, its utility in the noninvasive diagnosis of chronic heart failure (CHF) has rarely been studied. Purpose: In this study, we proposed a method for signal denoising, segmentation, and feature extraction for KS, and a Bayesian optimization-based support vector machine (BO-SVM) algorithm for KS classification. Methods: The acquired KS signal was resampled and denoised to extract 19 energy features, 12 statistical features, 2 entropy features, and 13 Mel-frequency cepstral coefficient (MFCC) features. A controlled trial based on the Valsalva maneuver was carried out to investigate the relationship between cardiac function and KS. To classify these feature sets, the K-Nearest Neighbors (KNN), decision tree (DT), Naive Bayes (NB), and ensemble (EM) classifiers, as well as the proposed BO-SVM, were employed and evaluated using accuracy (Acc), sensitivity (Se), specificity (Sp), precision (Ps), and F1 score (F1). Results: The Valsalva maneuver indicated that the KS signal could play an important role in the diagnosis of CHF. Through comparative experiments, it was shown that the best classifier performance was obtained by BO-SVM, with Acc (85.0%), Se (85.3%), and Sp (84.6%). Conclusions: In this study, a method for noise reduction, segmentation, and classification of KS was established. On the measured dataset, our method performed well in terms of classification accuracy, sensitivity, and specificity. In light of this, we believe that the methods described in this paper can be applied to the early, noninvasive detection of heart disease as well as serve as a supplementary monitoring technique for the prognosis of patients with CHF.
... All the recordings were filtered with a 2 nd order Butterworth bandpass filter with cutoff frequencies of 25 Hz and 400 Hz, and normalized to have zero mean and unit variance. Then, using the method developed by Springer et al. [8], we calculated the homomorphic, Hilbert, power spectral density, and Wavelet envelopes, which have different trade-off levels between noise rejection and amplitude resolution [9]. These envelopes were also normalized to have zero mean and unit variance, and form a 4-dimensional multivariate time series for each heart sound segment. ...
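The preprocessing pipeline in the snippet above (2nd-order Butterworth band-pass at 25-400 Hz, zero-mean unit-variance normalisation, then envelope extraction) can be sketched with standard scipy routines. As an illustration we compute only the Hilbert envelope; the cited work stacks four envelope types (homomorphic, Hilbert, power spectral density, and wavelet) into a multivariate series.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def preprocess(pcg, fs):
    """Band-pass filter (2nd-order Butterworth, 25-400 Hz), z-normalise,
    and compute the Hilbert envelope of a PCG recording (one of the four
    envelopes mentioned above), returned z-normalised as well."""
    b, a = butter(2, [25 / (fs / 2), 400 / (fs / 2)], btype="bandpass")
    x = filtfilt(b, a, pcg)                 # zero-phase filtering
    x = (x - x.mean()) / x.std()            # zero mean, unit variance
    env = np.abs(hilbert(x))                # analytic-signal magnitude
    return (env - env.mean()) / env.std()

fs = 1000
t = np.arange(fs) / fs
env = preprocess(np.sin(2 * np.pi * 50 * t), fs)  # 50 Hz tone, inside the passband
```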
Conference Paper
Heart sound recordings are a key non-invasive tool to detect both congenital and acquired heart conditions. As part of the George B. Moody PhysioNet Challenge 2022, we present an approach based on Bidirectional Long Short-Term Memory (BiLSTM) neural networks for the detection of murmurs and prediction of clinical outcome from Phonocardiograms (PCGs). We used the homomorphic, Hilbert, power spectral density, and wavelet envelopes as signal features, from which we extracted fixed-length segments of 4 seconds to train the network. Using the official challenge scoring metrics, our team SmartBeatIT achieved a murmur weighted accuracy score of 0.751 on the hidden validation set (ranked 8th out of 60 teams), and an outcome cost score of 11222 (ranked 41st out of 60 teams). With 5-fold cross-validation on the training set, we obtained a murmur score of 0.652 ± 0.043 (with average sensitivities of 0.827 and 0.312 for the Present and Unknown classes and an average specificity of 0.801); and an outcome score of 12434 ± 401 (with an average sensitivity of 0.676 and an average specificity of 0.544).
... In the early years, various threshold-based techniques based on Shannon energy, the S-transform, and the Wavelet transform were used to segment phonocardiograms [5][6][7][8][9][10][11][12]. In later years, statistical models based on HMM, HSMM, GMM, and SVM were used to segment phonocardiograms [13][14][15][16][17]. Of late, classifier-based methods using SVM, ANN, CNN, and deep neural networks have been used to segment phonocardiograms with good accuracy [18][19][20][21][22]. ...
Article
Full-text available
Objective: This paper presents a new method of phonocardiogram delineation using the heart sound events extracted from the continuous wavelet transform of the phonocardiogram. Methods: The phonocardiogram signals were filtered using a Type-II Chebyshev bandpass filter to remove noise. A two-dimensional time-frequency continuous wavelet transform operator was applied to the phonocardiograms to convert the audio signal into a time-frequency spectrogram. The row sum of the spectrogram matrix was evaluated by summing the matrix over all frequencies at each point along the time axis. The average of the row sum was calculated and then subtracted from the row sum. The zero crossings of the mean-subtracted row sum indicated the boundaries of the phonocardiogram. Beat identification in the phonocardiogram was done using the Fast Fourier Transform. Conclusion: The method can identify the boundaries of the sounds in the phonocardiogram and indicate the presence of the first heart sound-systole-second heart sound-diastole sequence with an accuracy of 90.1% and an F1 score of 94.8%. Significance: The method can be used to delineate phonocardiograms without any external reference such as ECG, PPG, or the carotid pulse. No additional signal processing overheads such as noise thresholding, activity detection, or feature extraction are involved.
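The row-sum boundary rule described in this abstract (sum the spectrogram over frequency, subtract the mean, and read boundaries off the zero crossings) is easy to sketch given a precomputed time-frequency matrix; the toy spectrogram below stands in for an actual CWT.

```python
import numpy as np

def boundaries_from_spectrogram(S):
    """Boundary detection per the delineation scheme above.
    S: spectrogram matrix of shape (n_frequencies, n_time_frames).
    Returns time indices where the mean-subtracted frequency sum
    changes sign (candidate sound onsets/offsets)."""
    row_sum = S.sum(axis=0)                 # total energy per time frame
    d = row_sum - row_sum.mean()            # subtract the average row sum
    sign = np.signbit(d).astype(np.int8)    # 1 where below average, else 0
    return np.where(np.diff(sign) != 0)[0]  # sign changes = boundaries

# Toy example: two high-energy 'sounds' in a low-energy 100-frame spectrogram.
S = np.full((8, 100), 0.1)
S[:, 20:30] = 1.0   # first burst
S[:, 60:75] = 1.0   # second burst
bounds = boundaries_from_spectrogram(S)
```

On this toy input the four detected sign changes bracket the two bursts.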
... The HR estimation algorithm introduced by Springer et al. is initialized with the HR determined from peak detection of the envelope, and then uses it for heart sound segmentation to obtain the overall heart rate estimate [68]. Breathing rate is determined from peak detection of the 300-450 Hz power envelope. ...
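The initialisation step mentioned in the snippet above, an envelope-peak heart-rate estimate, can be sketched as follows; the physiological HR limits and the toy envelope are our own assumptions, not values from the cited work.

```python
import numpy as np
from scipy.signal import find_peaks

def estimate_hr(envelope, fs, min_hr=30, max_hr=200):
    """Initial heart-rate estimate (bpm) from peak detection of a heart
    sound envelope. The minimum peak distance enforces a refractory
    period consistent with the assumed maximum heart rate."""
    min_dist = int(fs * 60 / max_hr)          # minimum samples between beats
    peaks, _ = find_peaks(envelope, distance=min_dist)
    if len(peaks) < 2:
        return None                           # not enough beats detected
    ibi = np.diff(peaks) / fs                 # inter-beat intervals in seconds
    return 60.0 / np.median(ibi)              # median is robust to spurious peaks

# Toy envelope pulsing at 1.2 Hz, i.e. 72 beats per minute.
fs = 500
t = np.arange(10 * fs) / fs
env = np.maximum(0.0, np.sin(2 * np.pi * 1.2 * t)) ** 4
hr = estimate_hr(env, fs)
```

In the cited scheme this coarse estimate would seed the segmentation model, which then refines the overall heart rate.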
Article
Full-text available
Chest sound, as the first and most commonly available vital signal for newborns, contains rich information about their cardiac and respiratory health. However, neonatal lung sound auscultation is currently challenging and often unreliable due to noise and interference, particularly for preterm infants. The noise often overlaps with the heart and lung contents in both time and frequency. Moreover, the frequency band of the useful components varies from one case to another, making it difficult to separate them by fixed band-pass filtering. In this study, a single-channel Blind Source Separation (SCBSS) framework is proposed to separate newborns' lung and heart sounds from noisy chest sounds recorded by a digital stethoscope. This method first decomposes the signal into a multi-resolution representation using a time-frequency transform, and then applies source separation algorithms to find proper ad hoc frequency filters. In the simulation scenario, two different time-frequency transforms are considered: the Stationary Wavelet Transform (SWT) with dyadic bases, and the Continuous Wavelet Transform (CWT) with redundant bases. The transforms are followed by three different source separation methods, namely Principal Component Analysis (PCA), Periodic Component Analysis ($\pi$CA), and Second Order Blind Identification (SOBI). The resulting combinations are applied to the chest sounds recorded from ninety-one preterm and full-term newborns. The results show that, compared to raw signals, fixed band-pass filtering, and seven other separation methods, the heart and lung sounds extracted by the proposed methods have a higher quality index and also result in more reliable heart and respiratory rate estimation.
... Segmentation accuracies of 97.5% and 96% for the onset and offset, respectively, have been shown using the method. Likewise, Springer, Tarassenko, and Clifford (2016) segmented S1 and S2 sounds using a hidden semi-Markov model and achieved an average F1 score of 95.63 ± 0.85%, whilst Sedighian, Subudhi, Scalzo, and Asgari (2014) also used the same segmentation model to achieve 92.4% ± 1.1% and 93.5% ± 1.1% accuracy in segmenting S1 and S2 sounds, respectively. Santos (2001) introduced a heart sound segmentation method based on time-frequency analysis combined with threshold values. ...
Article
This paper discusses four heart sound segmentation (HSS) methods: the Wavelet transform, fractal decomposition, the Hilbert transform, and the Shannon energy envelogram, in order to identify the different cardiac sounds. Many research studies related to heart signal analysis have adopted these methods to obtain high heart sound segmentation performance, especially for the identification of the first and second heart sounds and murmurs. The heart sound segmentation results of the methods have also been compared with one another to identify the most efficient method(s), and it has been found that the Shannon energy envelogram provides the best accuracy among the segmentation methods. Understanding these segmentation methods for heart sound may pave the way for more advanced studies in other heart-related research, including heart sound classification.
Article
Full-text available
Objective: This work proposes a semi-supervised training approach for detecting lung and heart sounds simultaneously with only one trained model, invariant to the auscultation point. Methods: We use open-access data from the 2016 PhysioNet/CinC Challenge, the 2022 George Moody Challenge, and the lung sound database HF_V1. We first train specialist single-task models using foreground ground truth (GT) labels from different auscultation databases to identify background sound events in the respective lung and heart auscultation databases. The pseudo-labels generated in this way were combined with the ground truth labels in a new training iteration, such that a new model was subsequently trained to detect foreground and background signals. Benchmark tests ensured that the newly trained model could detect both lung and heart sound events at different auscultation sites without regressing on the original task. We also established hand-validated labels for the respective background signals in heart and lung sound auscultations to evaluate the models. Results: In this work, we report for the first time results for i) multi-class prediction of lung sound events and ii) simultaneous detection of heart and lung sound events, and achieve competitive results using only one model. The combined multi-task model regressed slightly in heart sound detection and gained significantly in lung sound detection accuracy, with an overall macro F1 score of 39.2% over six classes, representing a 6.7% improvement over the single-task baseline models. Conclusion/Significance: To the best of our knowledge, this is the first approach developed to date for detecting heart and lung sound events invariant to both the auscultation site and the capturing device. Hence, our model is capable of performing lung and heart sound detection from any auscultation location.
Preprint
Full-text available
The recent pandemic has refocused the medical world's attention on the diagnostic techniques associated with cardiovascular disease. Heart rate provides a real-time snapshot of cardiovascular health. A more precise heart rate reading provides a better understanding of cardiac muscle activity. Although many existing diagnostic techniques are approaching the limits of perfection, there remains potential for further development. In this paper, we propose MIBINET, a convolutional neural network for real-time proctoring of heart rate via inter-beat-interval (IBI) from millimeter wave (mm-wave) radar ballistocardiography signals. This network can be used in hospitals, homes, and passenger vehicles due to its lightweight and contactless properties. It employs classical signal processing prior to fitting the data into the network. Although MIBINET is primarily designed to work on mm-wave signals, it is found equally effective on signals of various modalities such as PCG, ECG, and PPG. Extensive experimental results and a thorough comparison with the current state-of-the-art on mm-wave signals demonstrate the viability and versatility of the proposed methodology. Keywords: Cardiovascular disease, contactless measurement, heart rate, IBI, mm-wave radar, neural network
Article
Full-text available
Decompensation episodes in chronic heart failure patients frequently result in unplanned outpatient or emergency room visits or even hospitalizations. Early detection of these episodes in their pre-symptomatic phase would likely enable the clinicians to manage this patient cohort with the appropriate modification of medical therapy which would in turn prevent the development of more severe heart failure decompensation thus avoiding the need for heart failure-related hospitalizations. Currently, heart failure worsening is recognized by the clinicians through characteristic changes of heart failure-related symptoms and signs, including the changes in heart sounds. The latter has proven to be largely unreliable as its interpretation is highly subjective and dependent on the clinicians’ skills and preferences. Previous studies have indicated that the algorithms of artificial intelligence are promising in distinguishing the heart sounds of heart failure patients from those of healthy individuals. In this manuscript, we focus on the analysis of heart sounds of chronic heart failure patients in their decompensated and recompensated phase. The data was recorded on 37 patients using two types of electronic stethoscopes. Using a combination of machine learning approaches, we obtained up to 72% classification accuracy between the two phases, which is better than the accuracy of the interpretation by cardiologists, which reached 50%. Our results demonstrate that machine learning algorithms are promising in improving early detection of heart failure decompensation episodes.
Article
Objective: Heart sound segmentation (HSS), which aims to identify the exact positions of the first heart sound (S1) and second heart sound (S2), and the durations of S1, systole, S2, and diastole within a cardiac cycle of the phonocardiogram (PCG), is an indispensable step in assessing heart health. Recently, some neural network-based methods for heart sound segmentation have shown good performance. Approach: In this paper, a novel method was proposed for HSS using a One-Dimensional Convolution and Bidirectional Long Short-Term Memory neural network with an Attention mechanism (C-LSTM-A), incorporating the 0.5-order smooth Shannon entropy envelope and its instantaneous phase waveform (IPW), and the third intrinsic mode function (IMF-3) of the PCG signal to reduce the difficulty of neural network feature learning. Main results: An average F1-score of 96.85% was achieved on a clinical research dataset (the Fuwai Yunnan Cardiovascular Hospital heart sound dataset) and an average F1-score of 95.68% was achieved on the 2016 PhysioNet/CinC Challenge dataset using the novel method. Significance: The experimental results show that this method has advantages for normal PCG signals and common pathological PCG signals, and the segmented fundamental heart sound (S1, S2), systole, and diastole signal components are beneficial to the study of subsequent heart sound classification.
Article
Full-text available
Heart failure (HF) is a devastating condition that impairs people’s lives and health. Because of the high morbidity and mortality associated with HF, early detection is becoming increasingly critical. Many studies have focused on the field of heart disease diagnosis based on heart sound (HS), demonstrating the feasibility of sound signals in heart disease diagnosis. In this paper, we propose a non-invasive early diagnosis method for HF based on a deep learning (DL) network and the Korotkoff sound (KS). The accuracy of the KS-based HF prediagnosis was investigated utilizing continuous wavelet transform (CWT) features, Mel frequency cepstrum coefficient (MFCC) features, and signal segmentation. Fivefold cross-validation was applied to the four DL models: AlexNet, VGG19, ResNet50, and Xception, and the performance of each model was evaluated using accuracy (Acc), specificity (Sp), sensitivity (Se), area under curve (AUC), and time consumption (Tc). The results reveal that the performance of the four models on MFCC datasets is significantly improved when compared to CWT datasets, and each model performed considerably better on the non-segmented dataset than on the segmented dataset, indicating that KS signal segmentation and feature extraction had a significant impact on the KS-based CHF prediagnosis performance. Our method eventually achieves the prediagnosis results of Acc (96.0%), Se (97.5%), and Sp (93.8%) based on a comparative study of the model and the data set. The research demonstrates that the KS-based prediagnosis method proposed in this paper could accomplish accurate HF prediagnosis, which will offer new research approaches and a more convenient way to achieve early HF prevention.
Article
Cardiac auscultation is an essential point-of-care method used for the early diagnosis of heart diseases. Automatic analysis of heart sounds for abnormality detection is faced with the challenges of additive noise and sensor-dependent degradation. This paper aims to develop methods to address the cardiac abnormality detection problem when both of these components are present in the cardiac auscultation sound. We first mathematically analyze the effect of additive noise and convolutional distortion on short-term mel-filterbank energy-based features and a Convolutional Neural Network (CNN) layer. Based on the analysis, we propose a combination of linear and logarithmic spectrogram-image features. These 2D features are provided as input to a residual CNN network (ResNet) for heart sound abnormality detection. Experimental validation is performed first on an open-access, multiclass heart sound dataset where we analyzed the effect of additive noise by mixing lung sound noise with the recordings. In noisy conditions, the proposed method outperforms one of the best-performing methods in the literature achieving an Macc (mean of sensitivity and specificity) of 89.55% and an average F-1 score of 82.96%, respectively, when averaged over all noise levels. Next, we perform heart sound abnormality detection (binary classification) experiments on the 2016 Physionet/CinC Challenge dataset that involves noisy recordings obtained from multiple stethoscope sensors. The proposed method achieves significantly improved results compared to the conventional approaches on this dataset, in the presence of both additive noise and channel distortion, with an area under the ROC (receiver operating characteristics) curve (AUC) of 91.36%, F-1 score of 84.09%, and Macc of 85.08%. We also show that the proposed method shows the best mean accuracy across different source domains, including stethoscope and noise variability, demonstrating its effectiveness in different recording conditions. 
The proposed combination of linear and logarithmic features along with the ResNet classifier effectively minimizes the impact of background noise and sensor variability for classifying phonocardiogram (PCG) signals. The method thus paves the way toward developing computer-aided cardiac auscultation systems in noisy environments using low-cost stethoscopes.
Article
Early detection and diagnosis of heart valve diseases (HVDs) can prevent cardiac arrest. This work proposes a novel feature fusion method for detecting HVDs using the phonocardiogram (PCG) signal. In the proposed method, first, the raw PCG signal is pre-processed. Then, two feature fusion models are proposed based on Mel-frequency cepstral coefficients (MFCC) and linear prediction cepstral coefficients (LPCC). One model is the series feature fusion (SFF) model; the other is the parallel feature fusion (PFF) model. We propose a hierarchical long short-term memory (HLSTM) network with a self-attention mechanism for both models. HLSTM encodes the sequential information of the fused features at different abstractions. Then, the self-attention module aggregates these encoded vectors based on their clinical relevance to detect HVDs. The efficacy of the proposed method is evaluated using two publicly available databases offered by the PhysioNet Challenge (CinC) 2016 and a GitHub repository. The CinC 2016 database is used for binary classification. The results show an overall accuracy (OA) of 98.76% and 98.29% for binary classification using the SFF-HLSTM and PFF-HLSTM models. The heart sound murmur (HSM) database in the GitHub repository is used for multi-class classification. The results show an OA of 99.10% and 98.71% for multi-class classification using the SFF-HLSTM and PFF-HLSTM models, respectively. The promising results show that the proposed method is compelling enough for preliminary automated diagnosis of HVDs in cardiac care units.
Article
Accurate segmentation of electrocardiogram (ECG) waves is crucial for diagnosing cardiovascular diseases (CVDs). In this study, a bidirectional hidden semi-Markov model (BI-HSMM) based on the probability distributions of ECG waveform duration was proposed for ECG wave segmentation. Four feature-vectors of ECG signals were extracted as the observation sequence of the hidden Markov model (HMM), and the statistical probability distribution of each waveform duration was counted. Logistic regression (LR) was used to train model parameters. The starting and ending positions of the QRS wave were first detected, and thereafter, bidirectional prediction was employed for the other waves: in the forward direction, the ST segment, T wave, and TP segment were predicted; in the backward direction, the P wave and PQ segment were detected. The Viterbi algorithm was improved by integrating the recursive formula of the forward prediction and backward backtracking algorithms. In the QT database, the proposed method demonstrated excellent performance (Acc = 97.98%, F1 score of P wave = 98.37%, F1 score of QRS wave = 97.60%, F1 score of T wave = 97.79%). For the wearable dynamic electrocardiography (DCG) signals collected by the Shandong Provincial Hospital (SPH), the detection accuracy was 99.71% and the F1 score of each waveform was above 99%. The experimental results and real DCG signal validation confirmed that the proposed BI-HSMM method can effectively segment both resting ECG and DCG signals, which is conducive to the detection and monitoring of CVDs.
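The Viterbi backbone that such HSMM-based segmenters extend can be sketched as below; this is the plain first-order version, without the duration modelling, logistic-regression training, or bidirectional recursion described in the abstract.

```python
import numpy as np

def viterbi(log_pi, log_A, log_B):
    """Most-likely state path. log_B is a (T, K) matrix of per-frame
    emission log-likelihoods (e.g. from a logistic-regression model)."""
    T, K = log_B.shape
    delta = np.empty((T, K))            # best log-score ending in each state
    psi = np.zeros((T, K), dtype=int)   # backpointers
    delta[0] = log_pi + log_B[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A   # (from-state, to-state)
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[t]
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):      # backtrack from the best final state
        path[t] = psi[t + 1, path[t + 1]]
    return path

# Toy 2-state example with "sticky" transitions.
log_pi = np.log([0.5, 0.5])
log_A = np.log([[0.9, 0.1], [0.1, 0.9]])
obs = np.array([[0.8, 0.2]] * 5 + [[0.2, 0.8]] * 5)   # per-frame likelihoods
path = viterbi(log_pi, log_A, np.log(obs))
```

On this toy input the decoder stays in state 0 for five frames and then switches to state 1, since the sticky transition matrix penalizes spurious switches.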
Conference Paper
The phonocardiogram (PCG) signal of mitral valve prolapse (MVP) patients is characterized by transient audio events, which include a systolic click (SC) followed by a murmur of varying intensity. Physicians detect these auscultation clues during regular auscultation before ordering an expensive echocardiography test. But auscultation is often error-prone, and even physicians with considerable experience might end up missing these clues. Therefore, developing machine learning techniques to help clinicians is the need of the hour. A segmentation technique using Fourier synchrosqueezed transform (FSST) features with a long short-term memory (LSTM) network is proposed in this study. An accuracy of 99.8% on the MVP dataset demonstrates the potential of the proposed method in clinical diagnosis.
Conference Paper
Cardiac auscultation is the key exam for screening cardiac diseases in both developed and developing countries. A heart sound auscultation procedure can detect the presence of murmurs and point to a diagnosis, so it is an important first-line assessment and a cost-effective tool. The design of automatic recommendation systems based on heart sound auscultation can play an important role in boosting the accuracy and the pervasiveness of screening tools. One such step consists of detecting the fundamental heart sound states, a process known as segmentation. A faulty segmentation or a wrong estimation of the heart rate might leave heart sound classifiers unable to detect abnormal waves, such as murmurs. To understand the impact of a faulty segmentation, several common heart sound segmentation errors are studied in detail, namely those where the heart rate is badly estimated and those where S1/S2 and systolic/diastolic states are swapped relative to the ground-truth state sequence. Of the tested algorithms, support vector machines (SVMs) and random forests (RFs) were shown to be more sensitive to a wrong estimation of the heart rate (an expected drop of 6% and 8% in overall performance, respectively) than to a swap in the state sequence of events (an expected drop of 1.9% and 4.6%, respectively).
Conference Paper
In the context of monitoring patients with heart failure conditions, the automated assessment of heart sound quality is of major importance to ensure the relevance of the medical analysis of the heart sound data. We propose in this study a technique of quality classification based on the selection of a small set of representative features. The first features are chosen to characterize the periodicity, complexity or statistical nature of the heart sound recordings. After the segmentation process, the latter features probe the detectability of the heart sounds in cardiac cycles. Our method is applied on a novel subcutaneous medical implant that combines ECG and accelerometric-based heart sound measurements. The current prototype is in the pre-clinical phase and has been implanted in 4 pigs, whose anatomy and activity constitute a challenging environment for obtaining clean heart sounds. As reference quality labeling, we have performed a three-class manual annotation of each recording, qualified as "good", "unsure" and "bad". Our method retrieves good quality heart sounds with a sensitivity and an accuracy of 82% ± 2% and 88% ± 6%, respectively. Clinical relevance: By accurately recovering high quality heart sound sequences, our method will enable the monitoring of reliable physiological indicators of heart failure complications such as decompensation.
Article
Full-text available
Introduction: The heart sound signal is an important physiological signal of the human body, and its identification and study are of great significance. Methods: For abnormal heart sound signal recognition, an abnormal heart sound recognition system combining hidden semi-Markov models (HSMM) with deep neural networks is proposed. Firstly, the HSMM is used to build a heart sound segmentation model to accurately segment the heart sound signal; then the segmented heart sound signal is subjected to feature extraction; finally, the trained deep neural network model is used for recognition. Results: Compared with other methods, this method requires a relatively small amount of input feature data while achieving high accuracy and fast recognition speed. Discussion: HSMM combined with a deep neural network is expected to be deployed on smart mobile devices for telemedicine detection.
Article
Nowadays, heart disease is the leading cause of death. The high mortality rate and escalating occurrence of heart diseases worldwide warrant the requirement for a fast and efficient diagnosis of such ailments. The purpose is to design an automated system for the classification of abnormal heartbeat audio signals to assist cardiologists. To the best of our knowledge, this is the first study that uses a single neural network model for the classification of eight different types of heartbeat audio signals. The proposed recurrent neural network (RNN) model using long short-term memory (LSTM) is developed on two publicly available databases, the PASCAL challenge and the 2017 PhysioNet challenge. Mel-frequency cepstrum coefficients (MFCC) are applied to extract the dominant features, and a bandpass filter is used to remove the noise from both of the datasets. Afterward, a downsampling technique is used to fix the sampling rate of each sound signal at 20 kHz and 300 Hz for the PASCAL and PhysioNet databases, respectively. The proposed model is compared with a multi-layer perceptron (MLP) in terms of different performance evaluation metrics. Furthermore, the outcomes of five machine learning (ML) models are also analyzed. The proposed model achieved the highest classification accuracy of 0.9971 on the PASCAL database and 0.9870 on the PhysioNet challenge dataset, consistently superior to its competitor approaches. The proposed model provides significant assistance to the cardiac consultant in detecting heart valve diseases.
Article
Phonocardiogram (PCG) auscultation is one of the most commonly used methods for coronary artery disease (CAD) detection. However, its detection accuracy is influenced by significant interpersonal variations. In this paper, a novel customized framework was proposed for the first time to address the individual specificity in PCG signals. To eliminate individual differences at source, a clustering method based on age information and PCG time–frequency features was developed to partition the subjects into subgroups. Then we designed different classification models for different subgroups to achieve an individually tailored diagnostic strategy, thus further attenuating the effect of individual specificity. In the classification stage, feature fusion was employed to overcome the shortage of single features. Mel-frequency cepstral coefficients and PCG signal fragments were sent to the corresponding convolutional neural networks to obtain two-dimensional time–frequency features and one-dimensional spatial–temporal features. Ultimately, the two multimodal features were fused and fed into a random forest for classification. The experiments demonstrated that the customized framework can effectively solve the problem of individual specificity in PCG detection with an average accuracy of 96.05%, which is an improvement in the accuracy by up to 6.51% over the general method. A comparison with existing research indicates that the proposed method is a robust and effective noninvasive technique for CAD detection, and it is a feasible solution for the problem of individual specificity in PCG classification. In addition, the framework can be extended to other similar biomedical signal applications.
Article
Exercise-induced cardiac fatigue (EICF) refers to an impermanent decline in systolic and diastolic function caused by high-intensity and frequent exercise. Long-term EICF may lead to heart fatigue, myocardial damage, and even sudden cardiac death. However, common cardiac fatigue diagnoses rely on expensive devices or long testing times. Further, cardiac inotropy, an essential reference, has not been applied to measure cardiac fatigue. In this paper, heart sound was proposed to evaluate EICF based on its ability to reflect changes in cardiac contractility, with a specialized deep learning network designed to recognize subjects with EICF. First, heart sounds were collected by a physiological signal collection system, and a heart sound database for 20 subjects was established. Then, discrete wavelet transform and a logistic regression hidden semi-Markov model were applied to denoise and segment signals, respectively, and 1770 samples in total were obtained. Finally, a network unifying residual mapping and attention modules was designed to recognize heart sounds, and the accuracy, precision, and recall of the model were 98.85%, 98.94%, and 98.9%, respectively. This study suggests that heart sounds combined with the deep learning model can achieve an accurate and objective diagnosis, and have the potential to become a reliable alternative for diagnosing EICF.
Article
Background and objective: Auscultation is the first technique applied to the early diagnosis of any cardiovascular disease (CVD) in rural areas and resource-poor countries because of its low cost and non-invasiveness. However, it highly depends on the physician's expertise to recognize specific heart sounds heard through the stethoscope. The analysis of phonocardiogram (PCG) signals attempts to segment each cardiac cycle into the four cardiac states (S1, systole, S2 and diastole) in order to develop automatic systems applied to an efficient and reliable detection and classification of heartbeats. In this work, we propose an unsupervised approach, based on time-frequency characteristics shown by cardiac sounds, to detect and classify heartbeats S1 and S2. Methods: The proposed system consists of a two-stage cascade. The first stage performs a rough heartbeat detection while the second stage refines the previous one, improving the temporal localization and also classifying the heartbeats into types S1 and S2. The first contribution is a novel approach that combines the dissimilarity matrix with the frame-level spectral divergence to locate heartbeats using the repetitiveness shown by the heart sounds and the temporal relationships between the intervals defined by the events S1/S2 and non-S1/S2 (systole and diastole). The second contribution is a verification-correction-classification process based on a sliding window that allows the preservation of the temporal structure of the cardiac cycle in order to be applied in the heart sound classification. The proposed method has been assessed using the open access databases PASCAL, CirCor DigiScope Phonocardiogram and an additional sound mixing procedure considering both Additive White Gaussian Noise (AWGN) and different kinds of clinical ambient noises from a commercial database. Results: The proposed method outperforms the detection and classification performance of other recent state-of-the-art methods.
Although our proposal achieves the best average accuracy for PCG signals without cardiac abnormalities, 99.4% in heartbeat detection and 97.2% in heartbeat classification, its worst average accuracy is always above 92% for PCG signals with cardiac abnormalities, signifying an improvement in heartbeat detection/classification above 10% compared to the other state-of-the-art methods evaluated. Conclusions: The proposed method provides the best detection/classification performance in realistic scenarios where the presence of cardiac anomalies as well as different types of clinical environmental noises are active in the PCG signal. Of note, the promising modelling of the temporal structures of the heart provided by the dissimilarity matrix together with the frame-level spectral divergence, as well as the removal of a significant number of spurious heart events and recovery of missing heart events, both corrected by the proposed verification-correction-classification algorithm, suggest that our proposal is a successful tool to be applied in heart segmentation.
Article
Objective: Heart sounds can reflect detrimental changes in cardiac mechanical activity that are common pathological characteristics of chronic heart failure (CHF). The ACC/AHA heart failure (HF) stage classification is essential for clinical decision-making and the management of CHF. Herein, a machine learning model that makes use of multi-scale and multi-domain heart sound features was proposed to provide an objective aid for ACC/AHA HF stage classification. Approach: A dataset containing phonocardiogram (PCG) signals from 275 subjects was obtained from two medical institutions and used in this study. Complementary ensemble empirical mode decomposition and tunable-Q wavelet transform were used to construct self-adaptive sub-sequences and multi-level sub-band signals for PCG signals. Time-domain, frequency-domain and nonlinear feature extraction were then applied to the original PCG signal, heart sound sub-sequences and sub-band signals to construct multi-scale and multi-domain heart sound features. The features selected via the least absolute shrinkage and selection operator were fed into a machine learning classifier for ACC/AHA HF stage classification. Finally, mainstream machine learning classifiers, including least-squares support vector machine (LS-SVM), deep belief network (DBN) and random forest (RF), were compared to determine the optimal model. Main results: The results showed that the LS-SVM, which utilized a combination of multi-scale and multi-domain features, achieved better classification performance than the DBN and RF using multi-scale or multi-domain features alone or together, with average sensitivity, specificity, and accuracy of 0.821, 0.955 and 0.820 on the testing set, respectively. Significance: PCG signal analysis provides efficient measurement information regarding CHF severity and is a promising noninvasive method for ACC/AHA HF stage classification.
Article
Full-text available
Despite significant advances in adult clinical electrocardiography (ECG) signal processing techniques and the power of digital processors, the analysis of non-invasive foetal ECG (NI-FECG) is still in its infancy. The Physionet/Computing in Cardiology Challenge 2013 addresses some of these limitations by making a set of FECG data publicly available to the scientific community for evaluation of signal processing techniques. The abdominal ECG signals were first preprocessed with a band-pass filter in order to remove higher frequencies and baseline wander. A notch filter to remove power interferences at 50 Hz or 60 Hz was applied if required. The signals were then normalized before applying various source separation techniques to cancel the maternal ECG. These techniques included: template subtraction, principal/independent component analysis, extended Kalman filter and a combination of a subset of these methods (FUSE method). Foetal QRS detection was performed on all residuals using a Pan and Tompkins QRS detector and the residual channel with the smoothest foetal heart rate time series was selected. The FUSE algorithm performed better than all the individual methods on the training data set. On the validation and test sets, the best Challenge scores obtained were E1 = 179.44, E2 = 20.79, E3 = 153.07, E4 = 29.62 and E5 = 4.67 for events 1–5 respectively using the FUSE method. These were the best Challenge scores for E1 and E2 and third and second best Challenge scores for E3, E4 and E5 out of the 53 international teams that entered the Challenge. The results demonstrated that existing standard approaches for foetal heart rate estimation can be improved by fusing estimators together. We provide open source code to enable benchmarking for each of the standard approaches described.
Article
Full-text available
An efficient heart sound segmentation (HSS) method that automatically detects the location of the first (S1) and second (S2) heart sounds and extracts them from raw heart auscultation data is presented here. The heart phonocardiogram is analyzed by employing ensemble empirical mode decomposition (EEMD) combined with kurtosis features to locate the presence of S1 and S2 and extract them from the recorded data, forming the proposed HSS scheme, namely HSS-EEMD/K. Its performance is evaluated on an experimental dataset of 43 heart sound recordings performed in a real clinical environment, drawn from 11 normal subjects, 16 patients with aortic stenosis, and 16 with mitral regurgitation of different degrees of severity, producing 2608 S1 and S2 sequences without and with murmurs, respectively. Experimental results have shown that, overall, the HSS-EEMD/K approach determines the heart sound locations correctly in 94.56% of cases and segments heart cycles correctly in 83.05% of cases. Moreover, results from a noise stress test with additive Gaussian noise and respiratory noises justify the noise robustness of the HSS-EEMD/K. When compared with four other efficient methods that mainly employ wavelet transform, energy, simplicity, and frequency measures, respectively, using the same experimental database, the HSS-EEMD/K scheme exhibits increased accuracy and prediction power over all others at the level of 7-19% and 4-9%, respectively, both in controls and pathological cases. The promising performance of the HSS-EEMD/K paves the way for further exploitation of the diagnostic value of heart sounds in everyday clinical practice.
Article
Full-text available
In automated heart sound analysis and diagnosis, a set of clinically valued parameters, including sound intensity, frequency content, timing, duration, shape, systolic and diastolic intervals, the ratio of the first heart sound amplitude to second heart sound amplitude (S1/S2), and the ratio of diastolic to systolic duration (D/S), is measured from the PCG signal. The quality of these clinical feature parameters relies heavily on the accurate determination of the boundaries of the acoustic events (heart sounds S1, S2, S3, S4 and murmurs) and the systolic/diastolic pause periods in the PCG signal. Therefore, in this paper, we propose a new automated robust heart sound activity detection (HSAD) method based on total variation filtering, Shannon entropy envelope computation, instantaneous phase based boundary determination, and boundary location adjustment. The proposed HSAD method is validated using different clean and noisy pathological and non-pathological PCG signals. Experiments on a large PCG database show that the HSAD method achieves an average sensitivity (Se) of 99.43% and positive predictivity (+P) of 93.56%. The HSAD method accurately determines the boundaries of the major acoustic events of the PCG signal at a signal-to-noise ratio of 5 dB. Unlike other existing methods, the proposed HSAD method does not use any search-back algorithms. The proposed HSAD method is quite straightforward and thus suitable for real-time wireless cardiac health monitoring and electronic stethoscope devices.
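The Shannon entropy (energy) envelope step mentioned above is commonly computed as a smoothed -x²·log(x²) of the normalized signal; a minimal sketch, with the frame length, normalization, and synthetic test signal as assumptions:

```python
import numpy as np

def shannon_energy_envelope(x, frame=40, eps=1e-10):
    """Per-sample Shannon energy -x^2*log(x^2) of the normalized signal,
    smoothed with a moving average of length `frame`."""
    x = x / (np.max(np.abs(x)) + eps)
    se = -x ** 2 * np.log(x ** 2 + eps)
    return np.convolve(se, np.ones(frame) / frame, mode="same")

rng = np.random.default_rng(1)
x = rng.normal(0.0, 0.02, 1000)                              # background noise
x[300:360] += np.sin(2 * np.pi * 50 * np.arange(60) / 1000)  # loud S1-like burst
env = shannon_energy_envelope(x)                             # peaks over the burst
```

Unlike plain squared energy, the Shannon energy de-emphasizes both very small and very large samples, which flattens sharp spikes and makes the envelope of the main heart sound lobes easier to threshold.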
Article
Full-text available
The abdominal electrocardiogram (ECG) provides a non-invasive method for monitoring the fetal cardiac activity in pregnant women. However, the temporal and frequency overlap between the fetal ECG (FECG), the maternal ECG (MECG) and noise results in a challenging source separation problem. This work seeks to compare temporal extraction methods for extracting the fetal signal and estimating fetal heart rate. A novel method for MECG cancelation using an echo state neural network (ESN) based filtering approach was compared with the least mean square (LMS) and recursive least square (RLS) adaptive filters and template subtraction (TS) techniques. Analysis was performed using real signals from two databases comprising a total of 4 h 22 min of data from nine pregnant women with 37,452 reference fetal beats. The effect of preprocessing the signals was empirically evaluated. The results demonstrate that the ESN based algorithm performs best on the test data with an F1 measure of 90.2%, as compared to the LMS (87.9%), RLS (88.2%) and TS (89.3%) techniques. Results suggest that a higher baseline wander high-pass cut-off frequency than traditionally used for FECG analysis significantly increases performance for all evaluated methods. Open source code for the benchmark methods is made available to allow comparison and reproducibility on the public domain data.
Article
Full-text available
The heart sound signal is first separated into cycles, where the cycle detection is based on an instantaneous cycle frequency. The heart sound data of one cardiac cycle can be decomposed into a number of atoms characterized by timing delay, frequency, amplitude, time width and phase. To segment heart sounds, we hypothesized that the atoms of a heart sound congregate as a cluster in the time–frequency domain. We propose an atom density function to indicate clusters. To suppress clusters of murmurs and noise, a density function weighted by atom energy is further proposed to improve the segmentation of heart sounds. Heart sounds are thus identified by a hybrid analysis combining clustering and medical knowledge. The segmentation scheme is automatic and no reference signal is needed. Twenty-six subjects, including 3 normal and 23 abnormal subjects, were tested for heart sound signals in various clinical cases. Our statistics show that the segmentation was successful for signals collected from normal subjects and patients with moderate murmurs.
Article
Full-text available
An automated algorithm to assess electrocardiogram (ECG) quality for both normal and abnormal rhythms is presented for false arrhythmia alarm suppression of intensive care unit (ICU) monitors. A particular focus is given to the quality assessment of a wide variety of arrhythmias. Data from three databases were used: the Physionet Challenge 2011 dataset, the MIT-BIH arrhythmia database, and the MIMIC II database. The quality of more than 33000 single-lead 10s ECG segments were manually assessed and another 12000 bad quality single-lead ECG segments were generated using the Physionet Noise Stress Test Database. Signal quality indices (SQIs) were derived from the ECGs segments and used as the inputs to a support vector machine classifier with a Gaussian kernel. This classifier was trained to estimate the quality of an ECG segment. Classification accuracies of up to 99% on the training and test set were obtained for normal sinus rhythm and up to 95% for arrhythmias, although performance varied greatly depending on the type of rhythm. Additionally, the association between 4050 ICU alarms from the MIMIC II database and the signal quality, as evaluated by the classifier, was studied. Results suggest that the SQIs should be rhythm specific and that the classifier should be trained for each rhythm call independently. This would require a substantially increased set of labelled data in order to train an accurate algorithm.
Article
Full-text available
This paper presents a new method to detect and delineate phonocardiogram (PCG) sounds. Toward this objective, after preprocessing the PCG signal, two windows were moved over the preprocessed signal, and in each analysis window, two frequency- and amplitude-based features were calculated from the excerpted segment. Then, a synthetic decision-making basis was devised by combining these two features for use as an efficient detection-delineation decision statistic (DS). Next, local extremums and locations of minimum slopes of the DS were determined by conducting forward-backward local investigations with the purpose of detecting sound incidences and their boundaries. In order to recognize the delineated PCG sounds, first, S1 and S2 were detected. Then, a new DS was regenerated from the signal whose S1 and S2 were eliminated to detect occasional S3 and S4 sounds. Finally, probable murmurs and souffles were spotted. The proposed algorithm was applied to 52 min of PCG signals gathered from patients with different valve diseases. The database was annotated by cardiology experts equipped with echocardiography and appropriate computer interfaces. The acquisition landmarks were the 2R (aortic), 2L (pulmonic), 4R (apex) and 4L (tricuspid) positions. The acquisition sensor was an electronic stethoscope (3M Littmann® 3200, 4 kHz sampling frequency). The operating characteristics of the proposed method have an average sensitivity Se = 99.00% and positive predictive value PPV = 98.60% for sound type recognition (i.e., S1, S2, S3 or S4).
Article
Full-text available
In many examples in biomedical signal processing, the modulation amplitude of a signal (e.g. respiration movement) conveys information. Traditional methods of demodulation typically involve peak- and valley-tracing, which is computationally expensive and noise-sensitive. In this paper we present results for the well-known, but little-used, technique of complex-domain homomorphic filtering, which offers an alternative approach to such demodulation problems.
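Homomorphic amplitude demodulation works by mapping the multiplicative envelope-carrier structure into an additive one via the logarithm, low-pass filtering, and exponentiating back. A minimal real-valued sketch (the paper works in the complex domain; the moving-average low-pass, window length, and test signal are simplifying assumptions):

```python
import numpy as np

def homomorphic_envelope(x, fs, lp_win_ms=80, eps=1e-6):
    """Amplitude demodulation by homomorphic filtering: take log|x|,
    low-pass it (a simple moving average here), then exponentiate."""
    log_mag = np.log(np.abs(x) + eps)
    n = max(1, int(fs * lp_win_ms / 1000))
    smooth = np.convolve(log_mag, np.ones(n) / n, mode="same")
    return np.exp(smooth)

fs = 1000
t = np.arange(2 * fs) / fs
am = 1.0 + 0.5 * np.sin(2 * np.pi * 0.5 * t)   # slow modulation (the "message")
x = am * np.sin(2 * np.pi * 97 * t)            # 97 Hz carrier
env = homomorphic_envelope(x, fs)              # tracks am up to a constant factor
```

Because the logarithm turns the product am·carrier into a sum, a plain low-pass filter suffices to separate the slow envelope from the fast carrier, with no peak or valley tracking.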
Conference Paper
Full-text available
This work presents a novel method for automatic detection and identification of heart sounds. Homomorphic filtering is used to obtain a smooth envelogram of the phonocardiogram, which enables a robust detection of events of interest in the heart sound signal. Sequences of features extracted from the detected events are used as observations of a hidden Markov model. It is demonstrated that the task of detection and identification of the major heart sounds can be learned from unlabelled phonocardiograms by an unsupervised training process and without the assistance of any additional synchronizing channels.
Conference Paper
Full-text available
The monitoring of respiration rates using impedance plethysmography is often confounded by cardiac activity. This paper proposes using the phonocardiogram as an alternative, since the process of respiration affects heart sounds. As part of this research, a technique is developed to segment heart sounds into their component segments using hidden Markov models. The heart sound data are preprocessed into feature vectors comprising the average Shannon energy of the heart sound signal, the delta Shannon energy, and the delta-delta Shannon energy. The performance of the segmentation system is validated using eight-fold cross-validation.
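The feature vectors described above, average Shannon energy plus its delta and delta-delta, can be sketched as follows; the frame length, hop, and synthetic burst are assumptions:

```python
import numpy as np

def average_shannon_energy(x, frame=100, hop=50, eps=1e-10):
    """Per-frame average Shannon energy of the normalized signal."""
    x = x / (np.max(np.abs(x)) + eps)
    n_frames = 1 + (len(x) - frame) // hop
    return np.array([np.mean(-x[i * hop:i * hop + frame] ** 2
                             * np.log(x[i * hop:i * hop + frame] ** 2 + eps))
                     for i in range(n_frames)])

def delta(v):
    """First-order difference, padded to keep the original length."""
    return np.diff(v, prepend=v[0])

rng = np.random.default_rng(2)
x = rng.normal(0.0, 0.01, 2000)
x[800:900] += np.sin(2 * np.pi * 40 * np.arange(100) / 1000)  # S1-like burst

e = average_shannon_energy(x)
feats = np.stack([e, delta(e), delta(delta(e))], axis=1)  # (frames, 3) vectors
```

The delta and delta-delta terms give the HMM local slope and curvature of the energy contour, helping it distinguish the rising edge of a heart sound from its sustained body.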
Article
Full-text available
Analysis of phonocardiogram (PCG) signals provides a non-invasive means to determine the abnormalities caused by cardiovascular system pathology. In general, time-frequency representation (TFR) methods are used to study the PCG signal because it is a non-stationary bio-signal. The continuous wavelet transform (CWT) is especially suitable for the analysis of non-stationary signals and for obtaining their TFR, due to its high resolution in both time and frequency, and it has recently become a favourite tool. It decomposes a signal in terms of elementary contributions called wavelets, which are shifted and dilated copies of a fixed mother wavelet function, and yields a joint TFR. Although the basic characteristics of the wavelets are similar, each type of wavelet produces a different TFR. In this study, eight of the best-known real wavelet types are examined on typical PCG signals indicating heart abnormalities in order to determine the best wavelet for obtaining a reliable TFR. For this purpose, the wavelet energy and frequency spectrum estimations based on the CWT and the spectra of the chosen wavelets were compared with the energy distribution and the autoregressive frequency spectra in order to determine the most suitable wavelet. The results show that the Morlet wavelet is the most reliable wavelet for the time-frequency analysis of PCG signals.
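A minimal Morlet CWT of the kind compared in the study can be written directly in numpy; the center frequency w0 = 6 and the scale grid below are conventional choices, not taken from the paper:

```python
import numpy as np

def morlet_cwt(x, scales, w0=6.0):
    """Scalogram magnitude: convolve the signal with dilated complex
    Morlet wavelets (one row per scale)."""
    out = np.empty((len(scales), len(x)))
    for i, s in enumerate(scales):
        t = np.arange(-4 * s, 4 * s + 1)            # wavelet support
        psi = np.exp(1j * w0 * t / s) * np.exp(-(t / s) ** 2 / 2) / np.sqrt(s)
        out[i] = np.abs(np.convolve(x, np.conj(psi), mode="same"))
    return out

fs = 1000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 50 * t)                 # 50 Hz stand-in "heart sound"
scales = np.arange(5.0, 40.0)
tfr = morlet_cwt(x, scales)
power = tfr[:, 100:-100].mean(axis=1)          # ignore edge effects
best_scale = scales[power.argmax()]            # ~ w0*fs/(2*pi*50), i.e. about 19
```

The scale of maximum response maps back to frequency via f ≈ w0·fs/(2π·s), which is how a scalogram row is labelled in Hz.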
Article
Full-text available
Heart sound is a valuable biosignal for diagnosis of a large set of cardiac diseases. Ambient and physiological noise interference is one of the most usual and highly probable incidents during heart sound acquisition. It tends to change the morphological characteristics of heart sound that may carry important information for heart disease diagnosis. In this paper, we propose a new method applicable in real time to detect ambient and internal body noises manifested in heart sound during acquisition. The algorithm is developed on the basis of the periodic nature of heart sounds and physiologically inspired criteria. A small segment of uncontaminated heart sound exhibiting periodicity in time as well as in the time-frequency domain is first detected and applied as a reference signal in discriminating noise from the sound. The proposed technique has been tested with a database of heart sounds collected from 71 subjects with several types of heart disease inducing several noises during recording. The achieved average sensitivity and specificity are 95.88% and 97.56%, respectively.
Article
Full-text available
This paper presents an algorithm for automatically locating the waveform boundaries (the onsets and ends of P, QRS, and T waves) in multilead ECG signals (the 12 standard leads and the orthogonal XYZ leads). Given these locations, features of clinical importance (such as the RR interval, the PQ interval, the QRS duration, the ST segment, and the QT interval) may be measured readily. First, a multilead QRS detector locates each beat, using a differentiated and low-pass filtered ECG signal as input. Next, the waveform boundaries are located in each lead. The leads in which the detected electrical activity is of longest duration are used for the final determination of the waveform boundaries. The performance of our algorithm has been evaluated using the CSE multilead measurement database. In comparison with other algorithms tested by the CSE, our algorithm achieves better agreement with manual measurements of the T-wave end and of interval values, while its measurements of other waveform boundaries are within the range of the algorithm and manual measurements obtained by the CSE.
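The differentiate-and-low-pass front end of such QRS detectors can be illustrated with a toy detector: difference, square, moving-window integration, and thresholding; the threshold, window lengths, refractory period, and synthetic ECG below are assumptions, not the paper's multilead algorithm:

```python
import numpy as np

def detect_beats(ecg, fs, refractory_s=0.25):
    """Toy beat detector: differentiate, square, moving-window integrate,
    threshold at half the maximum, then enforce a refractory period."""
    d = np.diff(ecg, prepend=ecg[0])
    energy = np.convolve(d ** 2, np.ones(int(0.08 * fs)), mode="same")
    above = np.flatnonzero(energy > 0.5 * energy.max())
    beats, last = [], -fs
    for i in above:
        if i - last > refractory_s * fs:   # new run => new beat
            beats.append(i)
        last = i
    return np.array(beats)

fs = 250
ecg = np.zeros(5 * fs)
for k in range(5):                          # one sharp "QRS" per second
    c = int((k + 0.5) * fs)
    ecg[c - 2:c + 3] = [0.2, 0.6, 1.0, 0.6, 0.2]

beats = detect_beats(ecg, fs)               # onset of each above-threshold run
```

Differentiation emphasizes the steep QRS slopes over the slower P and T waves, which is why beat detection precedes the finer boundary delineation described in the abstract.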
Article
Full-text available
In this paper, we developed and evaluated a robust single-lead electrocardiogram (ECG) delineation system based on the wavelet transform (WT). In a first step, QRS complexes are detected. Then, each QRS is delineated by detecting and identifying the peaks of the individual waves, as well as the complex onset and end. Finally, the determination of P and T wave peaks, onsets and ends is performed. We evaluated the algorithm on several manually annotated databases, such as MIT-BIH Arrhythmia, QT, European ST-T and CSE databases, developed for validation purposes. The QRS detector obtained a sensitivity of Se = 99.66% and a positive predictivity of P+ = 99.56% over the first lead of the validation databases (more than 980,000 beats), while for the well-known MIT-BIH Arrhythmia Database, Se and P+ over 99.8% were attained. As for the delineation of the ECG waves, the mean and standard deviation of the differences between the automatic and manual annotations were computed. The mean error obtained with the WT approach was found not to exceed one sampling interval, while the standard deviations were around the accepted tolerances between expert physicians, outperforming the results of other well known algorithms, especially in determining the end of T wave.
Article
Full-text available
The purpose of this paper is to propose a new algorithm for T-wave end location in electrocardiograms, mainly through the computation of an indicator related to the area covered by the T-wave curve. Based on simple assumptions, essentially the concavity of the T-wave form, it is formally proved that the maximum of the computed indicator inside each cardiac cycle coincides with the T-wave end. Moreover, the algorithm is robust to acquisition noise, waveform morphology variations and baseline wander. It is also computationally very simple: the main computation can be implemented as a simple finite impulse response filter. When evaluated with the PhysioNet QT database in terms of the mean and standard deviation of the T-wave end location errors, the proposed algorithm outperforms the other algorithms evaluated with the same database, according to the most recent publications available to the best of our knowledge.
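The area-based indicator lends itself to a short sketch. Assuming the indicator takes the form A(t) = Σ over a trailing window of (x[u] − x[t]) — one common formulation of such area indicators; the paper's exact definition may differ — its maximum lands at the end of a synthetic positive T wave:

```python
import numpy as np

def t_end_indicator(x, w):
    """A(t) = sum_{u=t-w+1..t} (x[u] - x[t]); cumulative sums keep it O(n)."""
    c = np.cumsum(np.insert(x, 0, 0.0))
    A = np.full(len(x), -np.inf)
    for t in range(w, len(x)):
        A[t] = (c[t + 1] - c[t + 1 - w]) - w * x[t]
    return A

# synthetic positive T wave: half-sine between samples 20 and 80 on a flat baseline
sig = np.zeros(200)
u = np.arange(20, 81)
sig[20:81] = np.sin(np.pi * (u - 20) / 60)
t_end = int(np.argmax(t_end_indicator(sig, w=30)))
```

For this concave wave the argmax of the indicator coincides with the wave end at sample 80, matching the paper's claim.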
Article
Full-text available
In this paper a new algorithm is proposed for QRS onset and offset detection in single lead electrocardiogram (ECG) records. In each cardiac cycle, the R peak is first detected to serve as a reference to delimit the search windows for the QRS onset and offset. Then, an auxiliary signal is defined from the envelope of the ECG signal. Finally, a statistical hypothesis test is applied to the auxiliary signal in order to detect mean changes. The performances of the algorithm have been evaluated using the PhysioNet QT database. The mean and standard deviation of the differences between onsets or offsets manually marked by cardiologists and those detected by the proposed algorithm are computed. The standard deviations obtained in this work are around the tolerances accepted by expert physicians, and slightly outperform the results obtained by the other known algorithms evaluated with the same database.
Article
Full-text available
We have previously developed a method for localization of the first heart sound (S1) using wavelet denoising and ECG-gated peak-picking. In this study, an additional enhancement step based on cross-correlation and ECG-gated ensemble averaging (EA) is presented. The main objective of the improved method was to localize S1 with very high temporal accuracy in (pseudo-) real time. The performance of S1 detection and localization, with and without EA enhancement, was evaluated on simulated as well as experimental data. The simulation study showed that EA enhancement reduced the localization error considerably and that S1 could be accurately localized at much lower signal-to-noise ratios. The experimental data were taken from ten healthy subjects at rest and during invoked hyper- and hypotension. For this material, the number of correct S1 detections increased from 91% to 98% when using EA enhancement. Improved performance was also demonstrated when EA enhancement was used for continuous tracking of blood pressure changes and for respiration monitoring via the electromechanical activation time. These are two typical applications where accurate localization of S1 is essential for the results.
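ECG-gated ensemble averaging is easy to illustrate: averaging N aligned noisy copies of a beat reduces the noise standard deviation by roughly √N. The template and noise level below are synthetic stand-ins, not the authors' data:

```python
import numpy as np

rng = np.random.default_rng(0)
template = np.sin(2 * np.pi * np.arange(64) / 64) * np.hanning(64)  # stand-in for an S1 waveform
beats = [template + rng.normal(0.0, 0.5, 64) for _ in range(25)]    # 25 ECG-gated noisy copies

avg = np.mean(beats, axis=0)                                        # ensemble average

err_single = np.std(beats[0] - template)   # residual noise in one raw beat
err_avg = np.std(avg - template)           # residual noise after averaging 25 beats
```

With 25 beats the residual noise drops by about a factor of five, which is why the EA step lets S1 be localized at much lower signal-to-noise ratios.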
Conference Paper
Full-text available
A method for detecting the first heart sound (S1) using a time-delay neural network (TDNN) is reported. The network consists of a single hidden layer, with time-delay links connecting the hidden units to the time-frequency energy coefficients of a Morlet wavelet decomposition of the input phonocardiogram (PCG) signal. The neural network operates on a 200 ms sliding window, with each time-delay hidden unit spanning 100 ms of wavelet data. Heart sounds were recorded from 30 subjects for 20 seconds at each of four standard auscultatory sites using a commercially available electronic stethoscope. A training set comprising half of the heartbeats from 20 randomly selected subjects was created. The network was trained on this set and tested on the full data set. The average performance is a 1.6% deletion error and a 2.2% insertion error. This level of S1 detection is considered satisfactory for analysis of the phonocardiogram signal.
Conference Paper
Full-text available
Describes the development of a segmentation algorithm which separates the heart sound signal into four parts: the first heart sound, the systole, the second heart sound and the diastole. The segmentation of phonocardiogram (PCG) signals is the first step of analysis and the most important procedure in the automatic diagnosis of heart sounds. This algorithm is based on the normalized average Shannon energy of the PCG signal. The performance of the algorithm has been evaluated using 515 periods of PCG signals recorded from 37 subjects, including normal and abnormal cases. The algorithm achieved a 93 percent correct segmentation ratio.
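The normalized average Shannon energy envelope can be sketched in a few lines; the window length and the final normalisation step here are assumptions, not the paper's exact parameters:

```python
import numpy as np

def shannon_envelope(pcg, fs, win_ms=20):
    """Normalised average Shannon energy envelope of a PCG signal (sketch)."""
    x = pcg / (np.max(np.abs(pcg)) + 1e-12)             # normalise amplitude to [-1, 1]
    e = -x ** 2 * np.log(x ** 2 + 1e-12)                # Shannon energy of each sample
    w = max(1, int(fs * win_ms / 1000))
    env = np.convolve(e, np.ones(w) / w, mode="same")   # average over short windows
    return (env - env.mean()) / (env.std() + 1e-12)     # zero-mean, unit-variance

# synthetic PCG at 2 kHz: an "S1" burst and a shorter "S2" burst on a silent baseline
fs = 2000
pcg = np.zeros(4000)
n = np.arange(4000)
pcg[1000:1080] = np.sin(2 * np.pi * 50 * n[1000:1080] / fs)
pcg[2000:2060] = np.sin(2 * np.pi * 50 * n[2000:2060] / fs)
env = shannon_envelope(pcg, fs)
```

The envelope rises inside the two bursts and stays low on the silent baseline, which is the property the four-part segmentation builds on.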
Article
This paper proposes a novel automatic method for the moment segmentation and peak detection analysis of heart sound (HS) patterns, with special attention to the characteristics of the envelopes of HS and considering the properties of the Hilbert transform (HT). The moment segmentation and peak location are accomplished in two steps. First, by applying the Viola integral waveform method in the time domain, the envelope (ET) of the HS signal is obtained with an emphasis on the first heart sound (S1) and the second heart sound (S2). Then, based on the characteristics of the ET and the properties of the HT of convex and concave functions, a novel method, the short-time modified Hilbert transform (STMHT), is proposed to automatically locate the moment segmentation and peak points for the HS via the zero-crossing points of the STMHT. A fast algorithm for calculating the STMHT of the ET can be expressed as multiplying the ET by an equivalent window (WE). According to the range of heart beats, and based on numerical experiments and the important parameters of the STMHT, a moving window width of N = 1 s is validated for locating the moment segmentation and peak points for HS. The proposed moment segmentation and peak location method is validated with sounds from the Michigan HS database and sounds from clinical heart diseases, such as ventricular septal defect (VSD), atrial septal defect (ASD), tetralogy of Fallot (TOF), rheumatic heart disease (RHD), and so on. As a result, for the sounds where S2 can be separated from S1, the average accuracies achieved for the peak of S1 (AP1), the peak of S2 (AP2), the moment segmentation points from S1 to S2 (AT12) and the cardiac cycle (ACC) are 98.53%, 98.31%, 98.36% and 97.37%, respectively. For the sounds where S1 cannot be separated from S2, the average accuracies achieved for the peak of S1 and S2 (AP12) and the cardiac cycle (ACC) are 100% and 96.69%.
Article
In this paper a new noninvasive method is proposed for automated estimation of fetal cardiac intervals from the Doppler ultrasound (DUS) signal. This method is based on a novel combination of empirical mode decomposition (EMD) and a hybrid support vector machine/hidden Markov model (SVM/HMM). EMD was used for feature extraction by decomposing the DUS signal into different components (IMFs), one of which is linked to the cardiac valve motions, i.e. opening (o) and closing (c) of the aortic (A) and mitral (M) valves. The non-invasive fetal electrocardiogram (fECG) was used as a reference for the segmentation of the IMF into cardiac cycles. The hybrid SVM/HMM was then applied to identify the cardiac events, based on the amplitude and timing of the IMF peaks as well as the sequence of the events. The estimated timings were verified using pulsed Doppler images. Results show that this automated method can continuously evaluate beat-to-beat valve motion timings and identify more than 91% of total events, which is higher than previous methods. Moreover, the changes of the cardiac intervals were analysed for three fetal age groups: 16–29, 30–35 and 36–41 weeks. The time intervals from the Q-wave of the fECG to Ac (systolic time interval, STI), Ac to Mo (isovolumic relaxation time, IRT), Q-wave to Ao (pre-ejection period, PEP) and Ao to Ac (ventricular ejection time, VET) were found to change significantly (p < 0.05) across these age groups. In particular, the STI, IRT and PEP of the fetuses at 36–41 weeks were significantly (p < 0.05) different from the other age groups. These findings can be used as sensitive markers for evaluating fetal cardiac performance.
Conference Paper
In this study, a methodology based on hidden Markov models (HMMs) as probabilistic finite state machines to model systolic and diastolic interval durations is proposed. The detection of the first (S1) and second (S2) heart sounds is performed using a network of two HMMs with grammar constraints to parse sequences of systolic and diastolic intervals. Duration modeling was considered in the HMM architecture selection, based on experimental measurements of systolic and diastolic intervals in normal subjects. Feature extraction from heart sound signals was based on time-cepstral features. Results are presented in terms of detection performance compared with QRS peak annotations of the simultaneous ECG recording. The performance of the proposed approach has been evaluated in 80 subjects. The results showed that the system was effective in detecting the first and second heart sounds, with a sensitivity of 95% and a positive predictive value of 97%, and thus provides a promising methodology for heart sound analysis.
Article
A method is developed for representing any communication system geometrically. Messages and the corresponding signals are points in two "function spaces," and the modulation process is a mapping of one space into the other. Using this representation, a number of results in communication theory are deduced concerning expansion and compression of bandwidth and the threshold effect. Formulas are found for the maximum rate of transmission of binary digits over a system when the signal is perturbed by various types of noise. Some of the properties of "ideal" systems which transmit at this maximum rate are discussed. The equivalent number of binary digits per second for certain information sources is calculated.
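The best-known result of this work is the capacity of a band-limited channel with additive white Gaussian noise, C = B log₂(1 + S/N), which is a one-liner to evaluate:

```python
import math

def channel_capacity(bandwidth_hz, snr):
    """Shannon capacity of an AWGN channel in bits/s: C = B * log2(1 + S/N).
    snr is the signal-to-noise power ratio (linear, not dB)."""
    return bandwidth_hz * math.log2(1 + snr)

# a 3 kHz telephone-style channel at 30 dB SNR (power ratio 1000)
c = channel_capacity(3000, 1000)
```

At these values the capacity comes out just under 30 kbit/s, the familiar ceiling for analog telephone-line modems.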
Article
A novel method for segmentation of heart sounds (HSs) into single cardiac cycle (S1-Systole-S2-Diastole) using homomorphic filtering and K-means clustering is presented. Feature vectors were formed after segmentation by using Daubechies-2 wavelet detail coefficients at the second decomposition level. These feature vectors were then used as input to the neural networks. Grow and Learn (GAL) and Multilayer perceptron-Backpropagation (MLP-BP) neural networks were used for classification of three different HSs (Normal, Systolic murmur and Diastolic murmur). It was observed that the classification performance of GAL was similar to MLP-BP. However, the training and testing times of GAL were lower as compared to MLP-BP. The proposed framework could be a potential solution for automatic analysis of HSs that may be implemented in real time for classification of HSs.
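The homomorphic-filtering stage can be sketched as follows: take the log-magnitude of the signal, low-pass it (a moving average here, standing in for whatever filter the authors used), and exponentiate back. The K-means clustering and neural-network stages are omitted:

```python
import numpy as np

def homomorphic_envelope(x, fs, win_ms=50):
    """Homomorphic envelope: low-pass the log-magnitude, then exponentiate (sketch)."""
    logmag = np.log(np.abs(x) + 1e-12)                         # log-magnitude
    w = max(1, int(fs * win_ms / 1000))
    smooth = np.convolve(logmag, np.ones(w) / w, mode="same")  # moving average as low-pass stage
    return np.exp(smooth)                                      # back to the amplitude domain

# a 50 Hz burst on a silent baseline, fs = 2 kHz
fs = 2000
x = np.zeros(4000)
x[1000:1200] = np.sin(2 * np.pi * 50 * np.arange(200) / fs)
env = homomorphic_envelope(x, fs)
```

Because filtering happens in the log domain, multiplicative amplitude modulation separates cleanly from the carrier, so the envelope is high inside the burst and vanishingly small on the baseline.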
Article
Phonocardiograms (PCGs), recordings of heart sounds, have many advantages over traditional auscultation in that they may be replayed and analysed for spectral and frequency information. PCG is not a widely used diagnostic tool as it could be. One of the major problems with PCG is noise corruption. Many sources of noise may pollute a PCG including foetal breath sounds if the subject is pregnant, lung and breath sounds, environmental noise and noise from contact between the recording device and the skin. An electronic stethoscope is used to record heart sounds and the problem of extracting noise from the signal is addressed via the use of wavelets and averaging. Using the discrete wavelet transform, the signal is decomposed. Due to the efficient decomposition of heart signals, their wavelet coefficients tend to be much larger than those due to noise. Thus, coefficients below a certain level are regarded as noise and are thresholded out. The signal can then be reconstructed without significant loss of information in the signal content. The questions that this study attempts to answer are which wavelet families, levels of decomposition, and thresholding techniques best remove the noise in a PCG. The use of averaging in combination with wavelet denoising is also addressed. Possible applications of the Hilbert transform to heart sound analysis are discussed.
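The thresholding idea can be sketched with a one-level Haar transform. The study compares many wavelet families, decomposition levels and thresholding rules; Haar with soft thresholding is just the simplest instance, and the threshold value below is an assumption:

```python
import numpy as np

def haar_dwt(x):
    """One-level Haar DWT: approximation and detail coefficients."""
    a = (x[0::2] + x[1::2]) / np.sqrt(2)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)
    return a, d

def haar_idwt(a, d):
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x

def denoise(x, thresh):
    a, d = haar_dwt(x)
    d = np.sign(d) * np.maximum(np.abs(d) - thresh, 0.0)  # soft-threshold the detail coefficients
    return haar_idwt(a, d)

rng = np.random.default_rng(1)
clean = np.sin(2 * np.pi * np.arange(256) / 64)   # smooth "heart sound" stand-in
noisy = clean + rng.normal(0.0, 0.2, 256)
out = denoise(noisy, thresh=0.3)
```

The smooth signal concentrates in the approximation coefficients, so zeroing small detail coefficients removes noise with little loss of signal content, exactly the rationale the abstract describes.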
Conference Paper
In resource-constrained environments, supply chains for consumables, repairs and calibration of diagnostic equipment are generally poor. To obviate this issue, we propose the use of widely available hardware with a strong supply chain: a cellphone with a hands-free kit. In particular, we focus on the use of the audio channel to determine heart rate (HR) and heart rate variability (HRV) in order to provide a first level screening system for infection. This article presents preliminary work performed on a gold standard database and a cellphone platform. Results indicate that HR and HRV can be accurately assessed from acoustic recordings of heart sounds using only a cellphone and hands-free kit. Heart sound analysis software, which can run on a standard cellphone in real time, has been developed that detects S1 heart sounds with a sensitivity of 92.1% and a positive predictivity of 88.4%. Evaluation of data recorded from cellphones demonstrates that the low-frequency response (<100 Hz) is key to the success of heart sound analysis on cellphones. Noise rejection is also shown to be important. © 2010, Association for the Advancement of Artificial Intelligence.
Conference Paper
Automatic analysis of heart sounds aids physicians in the diagnosis of abnormal heart valve conditions. Segmentation, i.e. identification of the first (S1) and second (S2) heart sounds, is the first step in the automatic analysis. In this work, we have proposed a segmentation method which uses energy-based and simplicity-based features computed from multi-level wavelet decomposition coefficients. This method utilizes timing information of S1 and S2 based on biomedical domain knowledge. The proposed method has been evaluated on several normal and abnormal heart sounds for identification of S1 and S2, and compared with windowed energy-based and simplicity-based approaches individually. The proposed method is an efficient and robust technique for identification and gating of S1 and S2, and yields better results than the above approaches.
Article
In this study, a biomedical diagnosis system for pattern recognition with normal and abnormal classes has been developed. First, feature extraction was performed using Doppler ultrasound. During the feature extraction stage, wavelet transforms and the short-time Fourier transform were used. As a next step, wavelet entropy was applied to these features. In the classification stage, a hidden Markov model (HMM) was used. To compute the correct classification rate of the proposed HMM classifier, it was compared to an ANN using a data set containing 215 samples. In our experiments, the specificity and sensitivity rates of the proposed HMM classifier system with fuzzy C-means (FCM)/K-means algorithms were found to be 92% and 97.26%, respectively. The present study shows that proper selection of the HMM's initial parameter values according to the FCM/K-means algorithms improves the recognition rate of the proposed system, which was also compared with our previous ANN-based study. © 2006 Elsevier B.V. All rights reserved.
Article
As an extension to the popular hidden Markov model (HMM), a hidden semi-Markov model (HSMM) allows the underlying stochastic process to be a semi-Markov chain. Each state has a variable duration and produces a number of observations while in the state. This makes it suitable for use in a wider range of applications. Its forward–backward algorithms can be used to estimate/update the model parameters, determine the predicted, filtered and smoothed probabilities, evaluate the goodness of fit of an observation sequence to the model, and find the best state sequence of the underlying stochastic process. Since the HSMM was initially introduced in 1980 for machine recognition of speech, it has been applied in thirty scientific and engineering areas, such as speech recognition/synthesis, human activity recognition/prediction, handwriting recognition, functional MRI brain mapping, and network anomaly detection, with about three hundred papers published in the literature. An overview of HSMMs is presented in this paper, including modelling, inference, estimation, implementation and applications. It first provides a unified description of various HSMMs and discusses the general issues behind them. The boundary conditions of the HSMM are extended. Then the conventional models, including the explicit-duration, variable-transition, and residential-time HSMMs, are discussed. Various duration distributions and observation models are presented. Finally, the paper draws an outline of the applications.
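A toy explicit-duration HSMM Viterbi decoder illustrates the core idea: each segment carries an explicit duration probability, and decoding maximises over segment lengths as well as state transitions. All model parameters below are illustrative, not taken from any of the cited papers:

```python
import numpy as np

def hsmm_viterbi(obs, log_em, log_trans, log_dur, max_d):
    """Most-likely state sequence under an explicit-duration HSMM (toy sketch).
    delta[t][j] = best log-prob of a segmentation that ends a j-segment at time t."""
    T, S = len(obs), log_trans.shape[0]
    delta = np.full((T + 1, S), -np.inf)
    back = {}                                  # back[(t, j)] = (segment start, previous state)
    delta[0, :] = 0.0                          # uniform start
    for t in range(1, T + 1):
        for j in range(S):
            for d in range(1, min(max_d, t) + 1):
                em = sum(log_em[j][obs[u]] for u in range(t - d, t))
                seg = log_dur[j][d - 1] + em   # duration prob + emissions of this segment
                if t - d == 0:
                    cand, prev = seg, (0, j)
                else:
                    i = int(np.argmax(delta[t - d] + log_trans[:, j]))
                    cand, prev = delta[t - d, i] + log_trans[i, j] + seg, (t - d, i)
                if cand > delta[t, j]:
                    delta[t, j], back[(t, j)] = cand, prev
    t, j = T, int(np.argmax(delta[T]))         # backtrack segment by segment
    path = []
    while t > 0:
        pt, pj = back[(t, j)]
        path = [j] * (t - pt) + path
        t, j = pt, pj
    return path

# toy model: state 0 favours symbol 0, state 1 favours symbol 1
log_em = np.log(np.array([[0.9, 0.1], [0.1, 0.9]]))
log_trans = np.array([[-1e9, 0.0], [0.0, -1e9]])   # forbid self-transitions, as in an HSMM
log_dur = np.log(np.full((2, 4), 0.25))            # uniform durations 1..4
path = hsmm_viterbi([0, 0, 0, 1, 1, 1], log_em, log_trans, log_dur, max_d=4)
```

On the toy observation sequence the decoder recovers one three-sample segment per state, the behaviour that makes duration modelling attractive for heart sound segmentation.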
Article
In this study, the valvular heart disorder (VHD) detection method by the wavelet packet (WP) decomposition and the support vector machine (SVM) techniques are proposed. From considering the truth that the frequency ranges of the normal sound and VHDs are different from each other, the WP decomposition at level 8 is employed to split more elaborate frequency bandwidths of the heart sound signals. And then the WP energy (WPE) with the distribution information of energy throughout the whole frequency range of heart sound signals is calculated. Since the heart sound signals with the frequency range of 20–750 Hz are preferred in this study, WPEs at the terminal nodes from (8, 1) to (8, 47) are selected and two parameters meanWPE and stdWPE as defined by the mean value and standard deviation of the position indices of the terminal nodes with over the weighting value (ζ) of the maximum value of WPE are proposed as a feature. Furthermore, the SVM technique is employed as the identification tool to classify between the normal sound and VHDs. Finally, a case study on the normal sound, aortic and mitral VHDs is demonstrated to validate the usefulness and efficiency of the VHD detection using WP decomposition and SVM classifier. The experimental results of the proposed VHD detection method showed high performance like the specificity of over 96% and the sensitivity of 100% for both the training and testing data.
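The meanWPE/stdWPE feature can be approximated with plain FFT band energies in place of a level-8 wavelet-packet decomposition. This is a deliberate simplification: the uniform band partition over 20–750 Hz, the band count and ζ = 0.5 below are assumptions standing in for the paper's terminal-node energies:

```python
import numpy as np

def band_energy_features(x, fs, n_bands=47, f_lo=20.0, f_hi=750.0, zeta=0.5):
    """Mean and std of the indices of frequency bands whose energy exceeds
    zeta times the maximum band energy (FFT stand-in for the WPE feature)."""
    spec = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
    edges = np.linspace(f_lo, f_hi, n_bands + 1)
    energies = np.array([spec[(freqs >= edges[k]) & (freqs < edges[k + 1])].sum()
                         for k in range(n_bands)])
    idx = np.nonzero(energies >= zeta * energies.max())[0]
    return float(idx.mean()), float(idx.std())

# a pure 400 Hz tone concentrates all its energy in a single band
fs = 2000
tone = np.sin(2 * np.pi * 400 * np.arange(2000) / fs)
m, s = band_energy_features(tone, fs)
```

For a signal whose energy sits in one narrow band, the mean index points at that band and the spread is zero; murmurs with broader spectra would raise stdWPE, which is what makes the pair discriminative.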
Article
Digital stethoscopes offer new opportunities for computerized analysis of heart sounds. Segmentation of heart sound recordings into periods related to the first and second heart sound (S1 and S2) is fundamental in the analysis process. However, segmentation of heart sounds recorded with handheld stethoscopes in clinical environments is often complicated by background noise. A duration-dependent hidden Markov model (DHMM) is proposed for robust segmentation of heart sounds. The DHMM identifies the most likely sequence of physiological heart sounds, based on the duration of the events, the amplitude of the signal envelope and a predefined model structure. The DHMM model was developed and tested with heart sounds recorded bedside with a commercially available handheld stethoscope from a population of patients referred for coronary angiography. The DHMM identified 890 S1 and S2 sounds out of 901, which corresponds to 98.8% (CI: 97.8-99.3%) sensitivity in 73 test patients, and 13 misplaced sounds out of 903 identified sounds, which corresponds to 98.6% (CI: 97.6-99.1%) positive predictivity. These results indicate that the DHMM is an appropriate model of the heart cycle and suitable for segmentation of clinically recorded heart sounds.
Article
In this paper, we propose a novel method for pediatric heart sound segmentation, paying special attention to the physiological effects of respiration on pediatric heart sounds. The segmentation is accomplished in three steps. First, the envelope of the heart sound signal is obtained with emphasis on the first heart sound (S1) and the second heart sound (S2), using short-time spectral energy and autoregressive (AR) parameters of the signal. Then, the basic heart sounds are extracted, taking into account the repetitive and spectral characteristics of the S1 and S2 sounds, using a multi-layer perceptron (MLP) neural network classifier. In the final step, by considering the diastolic and systolic interval variations due to the effect of a child's respiration, complete and precise heart sound end-pointing and segmentation is achieved.