A vanilla Convolutional Neural Network (CNN) representation.

Source publication
Article
Full-text available
The aim of this paper was to detect pathologies through respiratory sounds. The ICBHI (International Conference on Biomedical and Health Informatics) Benchmark was used. This dataset is composed of 920 sounds, of which 810 correspond to chronic diseases, 75 to non-chronic diseases and only 35 to healthy individuals. As more than 88% of the samples o...

Context in source publication

Context 1
... A CNN is a Deep Learning algorithm that takes a bi-dimensional input and can differentiate it from others by learning filters that automatically extract complex features from the inputs. A basic model of a CNN is represented in Figure 2. During the training step, each convolution layer learns its filter weights and then produces a feature map. ...
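To make this concrete, the following minimal PyTorch sketch builds a small CNN of the kind represented in the figure: two convolution layers learn filter banks that produce feature maps, followed by pooling and a fully connected classifier. The layer sizes, input resolution and three-class output are illustrative assumptions, not the architecture used in the source paper.

import torch
import torch.nn as nn

class VanillaCNN(nn.Module):
    def __init__(self, n_classes=3):
        super().__init__()
        # each Conv2d learns a bank of filters; its output is a feature map
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # global pooling plus a fully connected layer maps features to class scores
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_classes))

    def forward(self, x):  # x: (batch, 1, height, width) spectrogram "image"
        return self.classifier(self.features(x))

logits = VanillaCNN()(torch.randn(8, 1, 64, 64))  # 8 dummy 64x64 inputs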

Citations

... In [30], the authors tackled the challenge of detecting respiratory pathologies from sounds, using the ICBHI Benchmark dataset. Given the dataset's imbalance, the study employed a Variational Convolutional Autoencoder for data augmentation, alongside traditional oversampling techniques. ...
... In future work, we will address the challenges encountered with dataset imbalance, which limited the diversity of the training data. To overcome this challenge, we will explore merging multiple publicly available lung sound datasets and applying data augmentation techniques, such as variational autoencoders [30]. Additionally, real-world clinical validation and the inclusion of more respiratory conditions will be key steps toward refining and extending the applicability of our models. ...
Article
Full-text available
Respiratory diseases such as asthma pose significant global health challenges, necessitating efficient and accessible diagnostic methods. The traditional stethoscope is widely used as a non-invasive and patient-friendly tool for diagnosing respiratory conditions through lung auscultation. However, it has limitations, such as a lack of recording functionality, dependence on the expertise and judgment of physicians, and the absence of noise-filtering capabilities. To overcome these limitations, digital stethoscopes have been developed to digitize and record lung sounds. Recently, there has been growing interest in the automated analysis of lung sounds using Deep Learning (DL). Nevertheless, the execution of large DL models in the cloud often leads to latency, dependency on internet connectivity, and potential privacy issues due to the transmission of sensitive health data. To address these challenges, we developed Tiny Machine Learning (TinyML) models for the real-time detection of respiratory conditions by using lung sound recordings, deployable on low-power, cost-effective devices like digital stethoscopes. We trained three machine learning models—a custom CNN, an Edge Impulse CNN, and a custom LSTM—on a publicly available lung sound dataset. Our data preprocessing included bandpass filtering and feature extraction through Mel-Frequency Cepstral Coefficients (MFCCs). We applied quantization techniques to ensure model efficiency. The custom CNN model achieved the highest performance, with 96% accuracy and 97% precision, recall, and F1-scores, while maintaining moderate resource usage. These findings highlight the potential of TinyML to provide accessible, reliable, and real-time diagnostic tools, particularly in remote and underserved areas, demonstrating the transformative impact of integrating advanced AI algorithms into portable medical devices. This advancement facilitates the prospect of automated respiratory health screening using lung sounds.
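As a rough illustration of the preprocessing described above (bandpass filtering followed by MFCC extraction), the sketch below uses SciPy and librosa. The 100-1800 Hz passband, the 4 kHz sampling rate and the 13 coefficients are assumed values chosen for lung sounds, not the exact settings of the study, and the quantization and TinyML deployment steps are omitted.

import librosa
import numpy as np
from scipy.signal import butter, sosfiltfilt

def lung_sound_mfcc(path, sr=4000, low_hz=100.0, high_hz=1800.0, n_mfcc=13):
    y, sr = librosa.load(path, sr=sr)                       # resample on load
    sos = butter(4, [low_hz, high_hz], btype="bandpass", fs=sr, output="sos")
    y = sosfiltfilt(sos, y)                                  # suppress out-of-band noise
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)   # shape: (n_mfcc, n_frames)
    return mfcc.astype(np.float32)                           # compact features for a small model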
... In addition to the classifiers previously discussed, the utilization of Deep Neural Network (DNN) methods in detecting PDs has gained traction due to their growing effectiveness [20,21]. A noteworthy development in this domain is the application of Convolutional Neural Networks (CNNs) for LS analysis, elucidated in [22]. ...
Article
Full-text available
Research focuses on the efficacy of Multi-Task Autoencoder (MTAE) models in signal classification due to their ability to handle many tasks while improving feature extraction. However, researchers have not thoroughly investigated the study of lung sounds (LSs) for pulmonary disease detection. This paper introduces a new framework that utilizes an MTAE model to detect lung diseases based on LS signals. The model integrates an autoencoder and a supervised classifier, simultaneously optimizing both classification accuracy and signal reconstruction. Furthermore, we propose a hybrid approach that combines an MTAE and a Support Vector Machine (MTAE-SVM) to enhance performance. We evaluated our model using LS signals from a publicly available database from King Abdullah University Hospital. The model attained an accuracy of 89.47% for four classes (normal, pneumonia, asthma, and chronic obstructive pulmonary disease) and 90.22% for three classes (normal, pneumonia, and asthma cases). Using the MTAE-SVM, the accuracy was further improved to 91.49% for four classes and 93.08% for three classes. The results indicate that the MTAE and MTAE-SVM have considerable potential for detecting pulmonary diseases from lung sound signals. This could aid in the creation of more user-friendly and effective diagnostic tools.
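A minimal sketch of the multi-task idea described above is given below: one shared encoder feeds both a decoder (reconstruction task) and a classifier (diagnosis task), and the two losses are optimized jointly. The layer sizes, latent dimension and loss weighting are assumptions; the latent code could also be passed to an SVM, in the spirit of the MTAE-SVM variant.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MTAE(nn.Module):
    def __init__(self, in_dim=1024, latent=64, n_classes=4):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                     nn.Linear(256, latent), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(latent, 256), nn.ReLU(),
                                     nn.Linear(256, in_dim))
        self.classifier = nn.Linear(latent, n_classes)

    def forward(self, x):
        z = self.encoder(x)                        # shared representation
        return self.decoder(z), self.classifier(z), z

def joint_loss(model, x, y, alpha=0.5):            # alpha: assumed task weighting
    x_hat, logits, _ = model(x)
    return alpha * F.mse_loss(x_hat, x) + (1 - alpha) * F.cross_entropy(logits, y)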
... A number of works on this subject have been carried out and are summarized here [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20]. K. P. Exarchos et al. [11] noted that COPD, as a chronic disease with diverse symptoms and complex interactions, is a promising target for Artificial Intelligence (AI) techniques. ...
... First, the lung sound signal is converted into a spectrogram image through a Time-Frequency (TF) representation, which can be computed with the short-time Fourier transform (STFT). M. T. García-Ordás et al. [13] proposed generating new labelled data with well-known oversampling methods once the class imbalance of the dataset has been identified. Breath sounds are then categorized into healthy, chronic and non-chronic disease classes using a CNN. A. Abid et al. [1] proposed a single alveolar compartment model to describe the fractional concentration of carbon dioxide during exhalation, recorded using time-based capnography. ...
Article
Full-text available
Recently, symptoms of Chronic Obstructive Pulmonary Disease (COPD) have been studied in relation to long-term continuous treatment. Furthermore, predicting the survival probability of patients with COPD is crucial for formulating subsequent treatment and care plans. Technologies such as Deep Learning (DL) also play a vital role in providing complementary solutions to such problems in the medical field. Early and timely analysis of clinical images can improve prognostic accuracy for chronic lower respiratory diseases, which include COPD, pneumonia, asthma, tuberculosis and fibrosis. Conventional methods of diagnosing COPD often rely on physical exams and tests such as spirometry, chest imaging and genetic analysis. However, respiratory diseases pose an enormous overall health burden for many patients, and these methods are not always accurate or available; their use involves nonspecific diagnosis rates, time-consuming manual procedures, and extensive clinical imaging expertise on the part of the radiologist. To solve this problem, we use a Deep Recursive Convolutional Neural Network (DRCNN) method to detect chronic lower respiratory disease (CLRD). Initially, we collected the images from the Kaggle repository and evaluated the results in the following stages. The first stage is pre-processing with a Gaussian filter to reduce noise and detect edges. The second stage is segmentation based on Image Threshold Based Segmentation (ITBS), used to count the binary image and separate the regions. In the third stage, we use the chi-square test to select the best features and evaluate the image values for each feature and threshold. Finally, classification using the DRCNN detects CLRD better than previous methods. The resulting CLRD detection is evaluated with several measures, such as sensitivity, specificity, accuracy, precision, and recall.
... The method achieved a classification accuracy of 99.7%. García-Ordás [3] proposed a novel approach utilizing a Variational Convolutional Autoencoder (VAE) combined with a Convolutional Neural Network (CNN) to classify respiratory sounds into healthy, chronic disease, and non-chronic disease categories as well as six specific pathologies. They achieved performance improvements over state-of-the-art methods with a reported F-Score of 0.993 in the ternary classification. ...
Article
Full-text available
The correct diagnosis and early treatment of respiratory diseases can significantly improve the health status of patients, reduce healthcare expenses, and enhance quality of life. Therefore, there has been extensive interest in developing automatic respiratory disease detection systems. Most recent methods for detecting respiratory disease use machine and deep learning algorithms. The success of these machine learning methods depends heavily on the selection of proper features to be used in the classifier. Although metaheuristic-based feature selection methods have been successful in addressing difficulties presented by high-dimensional medical data in various biomedical classification tasks, there is not much research on the utilization of metaheuristic methods in respiratory disease classification. This paper aims to conduct a detailed and comparative analysis of six widely used metaheuristic optimization methods using eight different transfer functions in respiratory disease classification. For this purpose, two different classification cases were examined: binary and multi-class. The findings demonstrate that metaheuristic algorithms using correct transfer functions could effectively reduce data dimensionality while enhancing classification accuracy.
... To handle the unbalanced data, a variational convolutional autoencoder (VCA) is proposed for data augmentation in [21]. Then, a convolutional neural network (CNN) is used for respiratory sounds classification. ...
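The sketch below illustrates, under assumed layer sizes, how a small convolutional variational autoencoder could be trained on 64x64 spectrograms of a minority class and then sampled to generate synthetic examples for augmentation; it is not the exact architecture of [21].

import torch
import torch.nn as nn

class ConvVAE(nn.Module):
    def __init__(self, latent=32):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(1, 16, 4, stride=2, padding=1), nn.ReLU(),   # 64x64 -> 32x32
            nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),  # 32x32 -> 16x16
            nn.Flatten())
        self.fc_mu = nn.Linear(32 * 16 * 16, latent)
        self.fc_logvar = nn.Linear(32 * 16 * 16, latent)
        self.fc_dec = nn.Linear(latent, 32 * 16 * 16)
        self.dec = nn.Sequential(
            nn.Unflatten(1, (32, 16, 16)),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1), nn.Sigmoid())

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)    # reparameterization trick
        return self.dec(self.fc_dec(z)), mu, logvar

# after training on minority-class spectrograms, decode random latent vectors
vae = ConvVAE()
synthetic = vae.dec(vae.fc_dec(torch.randn(16, 32)))               # 16 synthetic 64x64 spectrograms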
... While SEN measures the capability of the classifier to correctly predict positive class labels, SPE quantifies the ability of the model to correctly predict negative instances. The average of SEN and SPE is called the score [21]: Score = (SEN + SPE) / 2. ...
... In Mel-spectrograms-CNN-LSTM [28], data are augmented by approaches such as noise addition, speed variation, random shifting, and pitch shifting. The VCA-CNN method [21] uses a variational convolutional autoencoder, Spectrograms-CNN-Ensemble [26] and Pre-trained ResNet [22] use time stretching and vocal tract length perturbation, and the five features-CNN method [29] utilizes different data augmentation methods such as loudness augmentation, mask augmentation, shift augmentation and speed augmentation. ...
Article
Full-text available
Finding an accurate model is essential for the classification of respiratory pathologies through the extraction and fusion of respiratory sound features. To handle the unlabeled data, a sequence of autoencoders is used for data augmentation. A deep neural network framework is proposed to extract three types of features: (1) features based on the human auditory system, using Mel-frequency cepstral coefficients (MFCCs); (2) temporal features contained in the sequence of the sound signal, using a long short-term memory (LSTM) network; and (3) the nonlinear and complex relationships among temporal characteristics in neighboring regions, using a two-dimensional (2D) convolutional neural network (CNN) applied to the sound sequence converted to a 2D time array. The three branches are fused through fully connected layers. The ICBHI 2017 sound database is used in two cases of 6 and 3 pathological classes. In the case of the 6-class database, 98.72% overall accuracy, 98.46% kappa coefficient, 99.66% sensitivity, 98.70% specificity and 99.70% F1-score are provided. In the case of the 3-class database, 97.18% overall accuracy, 95.77% kappa coefficient, 98.31% sensitivity, 99.16% specificity and 98.93% F1-score are achieved. In contrast to the many studies that apply a CNN to the Mel spectrogram of the audio, here the CNN is applied to the 2D form of the time series to find the hidden relationships among time instants in local regions. Moreover, fusion of these hidden features with the LSTM and MFCC features leads to accurate multi-class classification. This work proposes a new deep learning framework for the fusion of three types of sound features, which leads to a significant improvement in the diagnosis of multiple respiratory infections.
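The following sketch mirrors the three-branch fusion idea from the abstract (an MFCC vector, an LSTM over the raw sequence, and a 2D CNN over the time series reshaped as a 2D array), with the branch outputs concatenated before fully connected layers. All dimensions and the single-layer branches are simplifying assumptions, not the published configuration.

import torch
import torch.nn as nn

class FusionNet(nn.Module):
    def __init__(self, n_mfcc=13, n_classes=6):
        super().__init__()
        self.mfcc_fc = nn.Sequential(nn.Linear(n_mfcc, 32), nn.ReLU())        # branch 1: MFCC vector
        self.lstm = nn.LSTM(input_size=1, hidden_size=32, batch_first=True)   # branch 2: raw sequence
        self.cnn = nn.Sequential(                                             # branch 3: 2D time array
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
            nn.Flatten(), nn.Linear(8 * 4 * 4, 32), nn.ReLU())
        self.head = nn.Sequential(nn.Linear(32 * 3, 64), nn.ReLU(),
                                  nn.Linear(64, n_classes))

    def forward(self, mfcc, seq, img):
        # mfcc: (B, n_mfcc); seq: (B, T, 1); img: (B, 1, H, W)
        _, (h, _) = self.lstm(seq)
        feats = torch.cat([self.mfcc_fc(mfcc), h[-1], self.cnn(img)], dim=1)  # fused feature vector
        return self.head(feats)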
... García-Ordás proposed a novel approach utilizing a Variational Convolutional Autoencoder (VAE) combined with a Convolutional Neural Network (CNN) to classify respiratory sounds into healthy, chronic disease, and non-chronic disease categories as well as six specific pathologies. They achieved performance improvements over state-of-the-art methods, with a reported F-Score of 0.993 in the ternary classification [3]. Fraiwan et al. investigated the classification of respiratory diseases using respiratory sound signals, achieving an accuracy of 98.27% with boosted decision trees, which outperformed traditional classifiers such as support vector machines [4]. ...
Preprint
Full-text available
Correct diagnosis and early treatment of respiratory diseases can significantly improve the health status of patients, reduce healthcare expenses, and enhance quality of life. Therefore, there has been extensive interest in developing automatic respiratory disease detection systems. Most of these methods have recently been using machine and deep learning algorithms. The success of machine learning methods depends heavily on the selection of proper features to be used in the classifier. Although metaheuristic-based feature selection methods have been successful in addressing difficulties presented by high-dimensional medical data in various biomedical classification tasks, there is not much research on the utilization of metaheuristic methods in respiratory disease classification. This paper aims to conduct a detailed and comparative analysis of six widely used metaheuristic optimization methods using eight different transfer functions in respiratory disease classification. For this purpose, two different classification cases were examined: binary and multi-class. The findings demonstrate that metaheuristic algorithms using correct transfer functions could effectively reduce data dimensionality while enhancing classification accuracy.
... In [28], the authors classified electroencephalogram (EEG) signals using CWT and a long short-term memory (LSTM) model, similar to the study in [29], in which a dual scalogram comprising the Stockwell transform and a CWT scalogram was employed for fault diagnosis in centrifugal pumps. Furthermore, recent studies have explored different ML and DL techniques for binary-class (normal vs. abnormal) classification and multi-class classification of respiratory diseases [30]. ...
Article
Full-text available
Respiratory diseases are among the leading causes of death, with many individuals in a population frequently affected by various types of pulmonary disorders. Early diagnosis and patient monitoring (traditionally involving lung auscultation) are essential for the effective management of respiratory diseases. However, the interpretation of lung sounds is a subjective and labor-intensive process that demands considerable medical expertise, and there is a good chance of misclassification. To address this problem, we propose a hybrid deep learning technique that incorporates signal processing techniques. Parallel transformation is applied to adventitious respiratory sounds, transforming lung sound signals into two distinct time-frequency scalograms: the continuous wavelet transform and the mel spectrogram. Furthermore, parallel convolutional autoencoders are employed to extract features from scalograms, and the resulting latent space features are fused into a hybrid feature pool. Finally, leveraging a long short-term memory model, a feature from the latent space is used as input for classifying various types of respiratory diseases. Our work is evaluated using the ICBHI-2017 lung sound dataset. The experimental findings indicate that our proposed method achieves promising predictive performance, with average values for accuracy, sensitivity, specificity, and F1-score of 94.16%, 89.56%, 99.10%, and 89.56%, respectively, for eight-class respiratory diseases; 79.61%, 78.55%, 92.49%, and 78.67%, respectively, for four-class diseases; and 85.61%, 83.44%, 83.44%, and 84.21%, respectively, for binary-class (normal vs. abnormal) lung sounds.
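As a small illustration of the parallel transformation step described above, the sketch below computes the two time-frequency views of one lung-sound segment with PyWavelets and librosa. The Morlet wavelet, the 64 scales and the 64 mel bands are assumed choices rather than the paper's exact settings, and the downstream convolutional autoencoders and LSTM classifier are omitted.

import numpy as np
import librosa
import pywt

def parallel_scalograms(y, sr=4000, n_mels=64, scales=np.arange(1, 65)):
    # continuous wavelet transform scalogram (magnitude of CWT coefficients)
    coeffs, _ = pywt.cwt(y, scales, "morl")
    cwt_scalogram = np.abs(coeffs)                  # shape: (n_scales, n_samples)
    # mel spectrogram converted to decibels
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    mel_db = librosa.power_to_db(mel, ref=np.max)   # shape: (n_mels, n_frames)
    return cwt_scalogram, mel_db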
... Garcia-Ordas et al. [29] employed a CNN for the categorization of respiratory sounds into three classes: healthy, chronic diseases, and non-chronic diseases. Nguyen et al. [30] introduced an approach for classifying lung sounds by integrating co-tuning and stochastic normalization techniques to enhance classification outcomes. ...
... The best results are obtained from the bi-LSTM model, with a specificity of 0.83 and a sensitivity of 0.63. Garcia et al. [29] took advantage of different oversampling techniques to generate new labeled data. Six types of pathologies are classified using a CNN. ...
Article
One of the major causes of death worldwide is pulmonary disease. There is an increasing need for an efficient technique that can automatically diagnose these diseases with high accuracy. In this paper, we propose a deep learning architecture to automatically detect pulmonary diseases. The raw pulmonary sound signals are taken from two popular datasets: the ICBHI and KAUH datasets. These signals have diverse sampling frequencies of 4 kHz, 10 kHz or 44.1 kHz. The signals from the KAUH dataset have a minimum duration of 5 s, while ICBHI signals have a duration of 10 to 90 s. These signals undergo pre-processing, which involves re-sampling them to a common 4 kHz frequency and segmenting them into frames lasting 3 s. The frames are then normalized and passed to the proposed EasyNet model for training and classification. The EasyNet architecture contains only two convolution layers, which reduces the model complexity. The model's performance is analyzed for both binary and multi-class detection. Our method performs well in all the considered evaluation scenarios, and yields an accuracy, sensitivity, and specificity of 1.0 for the KAUH dataset, while for the ICBHI dataset, an accuracy of 0.997, sensitivity of 0.999, and specificity of 0.997 are achieved. For the combined dataset, we have achieved an accuracy of 0.998, with a sensitivity and specificity of 0.999. These values are better than those of existing state-of-the-art methods. The proposed architecture is quite simple yet effective in detecting lung diseases.
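A rough sketch of the described pipeline is given below: resampling to 4 kHz, segmentation into 3 s frames, normalization, and a classifier with only two convolution layers. The concrete kernel sizes, strides and channel counts are assumptions in the spirit of the EasyNet description, not the published architecture.

import librosa
import torch
import torch.nn as nn

def preprocess(path, target_sr=4000, frame_s=3):
    # resample to a common 4 kHz rate, cut into 3 s frames, normalize each frame
    y, _ = librosa.load(path, sr=target_sr)
    hop = target_sr * frame_s
    frames = [y[i:i + hop] for i in range(0, len(y) - hop + 1, hop)]
    return [(f - f.mean()) / (f.std() + 1e-8) for f in frames]

class TwoConvNet(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=9, stride=4), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=9, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(32, n_classes))

    def forward(self, x):  # x: (batch, 1, 12000) for a 3 s frame at 4 kHz
        return self.net(x)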
... These models have demonstrated the capability to learn meaningful representations of input data and effectively reconstruct anomalies. Further study into the performance and applicability of each autoencoder type is vital for advancing AD techniques and addressing real-world challenges in cybersecurity [42,43], finance [44,45], manufacturing [46][47][48][49], healthcare [50,51], and beyond. The chosen case study on ACE is highly relevant to real-world industrial applications, and has the potential to benefit from effective AD techniques. ...
Article
Full-text available
Speed reducers (SR) and electric motors are crucial in modern manufacturing, especially within adhesive coating equipment. The electric motor mainly transforms electrical power into mechanical force to propel most machinery. Conversely, speed reducers are vital elements that control the speed and torque of rotating machinery, ensuring optimal performance and efficiency. Interestingly, variations in chamber temperatures of adhesive coating machines and the use of specific adhesives can lead to defects in chains and jigs, causing possible breakdowns in the speed reducer and its surrounding components. This study introduces novel deep-learning autoencoder models to enhance production efficiency by presenting a comparative assessment for anomaly detection that would enable precise and predictive insights by modeling complex temporal relationships in the vibration data. The data acquisition framework facilitated adherence to data governance principles by maintaining data quality and consistency, data storage and processing operations, and aligning with data management standards. The study here would capture the attention of practitioners involved in data-centric processes, industrial engineering, and advanced manufacturing techniques.
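To make the autoencoder-based anomaly-detection idea tangible, here is a minimal reconstruction-based sketch for fixed-length vibration windows: the model is trained on normal data only, and windows whose reconstruction error exceeds a threshold are flagged as anomalous. The dense layer sizes, the 256-sample window and the 3-sigma threshold rule are assumptions; the paper's temporal (e.g. recurrent) autoencoder variants would replace the dense encoder and decoder.

import torch
import torch.nn as nn

class VibrationAE(nn.Module):
    def __init__(self, window=256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(window, 64), nn.ReLU(), nn.Linear(64, 16))
        self.dec = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, window))

    def forward(self, x):
        return self.dec(self.enc(x))

def anomaly_scores(model, x):
    # per-window reconstruction mean squared error
    with torch.no_grad():
        return ((model(x) - x) ** 2).mean(dim=1)

# threshold fit on known-normal windows, e.g.:
# threshold = normal_scores.mean() + 3 * normal_scores.std()
# flagged = anomaly_scores(model, new_windows) > threshold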
... Radiomics features [17][18], i.e. image texture features, are extracted from OCT images, and a Nadam optimizer [19] coupled with a U-net segmentation model [20] is used to extract morphological features from fundus images. As a final step, the DSVAECNN model [21] classifies each retinal image as glaucoma or healthy based on the fusion of OCT and fundus features. Figure 1 illustrates the system model block diagram. ...
Article
Full-text available
Irreversible vision loss is a common consequence of glaucoma, which demands accurate and timely diagnosis for effective management. This research aims to enhance glaucoma classification accuracy by fusing information from two distinct imaging modalities with deep learning, exploring the fusion of these modalities through an innovative neural network architecture with optimization. The approach combines a Deep Stochastic Variational Autoencoder Convolution Neural Network (DSVAECNN) with the Adam optimization technique to enable robust and accurate classification of glaucoma. A multi-path architecture is designed to accommodate both optical coherence tomography (OCT) radiomics features and fundus morphological features simultaneously. To ensure the effectiveness of the model, this study investigates the use of an advanced optimization algorithm, Adam, to expedite convergence and alleviate the risk of overfitting. The resulting model demonstrates improved generalization capabilities, which is critical for accurate diagnosis across diverse patient populations. A comprehensive dataset of OCT and fundus images from a representative cohort of glaucoma patients and healthy individuals is used to evaluate the proposed approach. Various quantitative metrics, including sensitivity, accuracy and specificity, are employed to appraise the performance of the fusion-based classification model. Comparisons with current techniques show the superiority of the proposed approach in accurately detecting glaucoma cases. This research advances glaucoma diagnosis by effectively harnessing the synergy between fundus images and OCT scans, and its findings provide helpful information for clinics, ultimately facilitating early detection and personalized management of glaucoma, thus preserving vision for affected individuals.