Figure 2 - uploaded by José Alberto Benítez-Andrades
Source publication
The aim of this paper was the detection of pathologies through respiratory sounds. The ICBHI (International Conference on Biomedical and Health Informatics) Benchmark was used. This dataset is composed of 920 sounds of which 810 are of chronic diseases, 75 of non-chronic diseases and only 35 of healthy individuals. As more than 88% of the samples o...
Context in source publication
Context 1
... A CNN is a deep-learning algorithm that takes a two-dimensional input and distinguishes it from others by learning filters that automatically extract complex features from the inputs. A basic model of a CNN is represented in Figure 2. During the training step, each convolution layer learns its filter weights and then produces a feature map. ...
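The filter-to-feature-map step described in the snippet above can be sketched in plain Python. This is a minimal illustration of a single valid 2D convolution (cross-correlation), not the architecture from the cited paper; the image and kernel values are hypothetical.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation: slide the kernel over the image and
    sum element-wise products at each position to build a feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        feature_map.append(row)
    return feature_map

# A hand-set vertical-edge filter applied to an image with a sharp
# left/right intensity edge; in a trained CNN these weights are learned.
image = [[0, 0, 1, 1]] * 4
kernel = [[1, 0, -1],
          [1, 0, -1],
          [1, 0, -1]]
fmap = conv2d(image, kernel)  # 2x2 feature map responding to the edge
```

The strong negative responses in `fmap` mark where the dark-to-bright edge falls under the filter, which is exactly the "feature map" the convolution layer produces.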
Citations
... The dropout layer is used to reduce overfitting, and the activation function used within the hidden layers is ReLU. A softmax layer is used at the output layer for the prediction of the tumor type [46]. The accuracies obtained are 97% and 100%, respectively, for the two datasets. ...
Deep learning has made significant advancements in recent years. The technology is rapidly evolving and has been used in numerous automated applications with minimal loss. With these deep learning methods, medical image analysis for disease detection can be performed with minimal errors and losses. A survey of deep learning-based medical image classification is presented in this paper. As a result of their automatic feature representations, these methods have high accuracy and precision. This paper reviews various models like CNN, Transfer learning, Long short term memory, Generative adversarial networks, and Autoencoders and their combinations for various purposes in medical image classification. The total number of papers reviewed is 158. In the study, we discussed the advantages and limitations of the methods. A discussion is provided on the various applications of medical imaging, the available datasets for medical imaging, and the evaluation metrics. We also discuss the future trends in medical imaging using artificial intelligence.
... A CNN is a deep-learning algorithm that employs learned filters to automatically extract complicated information from inputs in order to distinguish between 2D inputs. Figure (1) illustrates the CNN's fundamental modeling [16]. In general, the three basic neural layers that make up a CNN are convolutional layers, pooling layers, and fully connected layers. ...
... The General Structure of the CNN System[16]. ...
One of the unfortunate phenomena that contribute to environmental disasters is fire, which also poses a serious threat to human safety and life, particularly when it is not recognized by sensor-based fire detection systems. Therefore, placing inexpensive and efficient sensors in certain locations will greatly speed up the detection of fires. Vision-based fire detection systems take advantage of the three fundamental features of fire: color, movement, and shape. Work has been done to build image-based smoke and fire detection systems that use security cameras. In this study, features are extracted from fire photos and then fed to one of two classification techniques: SVM-based machine learning or a CNN-based deep-learning technique. We use a linear regression model so that the bounding box of the classified object has the correct coordinates.
... During training, the network adjusts its parameters (weights and biases) to convert input images into feature vectors, leading to a deeper understanding of image characteristics. García-Ordás et al. [15] suggested that the CNN process, shown in Figure 1, includes components such as convolution, pooling, fully connected neural networks, and layers to tackle overfitting/underfitting. ...
... Basic CNN architecture[15] ...
Accurately classifying user-independent handwritten Bengali characters and numerals presents a formidable challenge. The task is complicated further by the many complex-shaped compound characters and by the diverse writing styles of different authors. Researchers have recently conducted significant research using individual approaches to recognize handwritten Bangla digits, alphabets, and some compound characters. To address this, we propose a straightforward and lightweight convolutional neural network (CNN) framework to accurately categorize handwritten Bangla simple characters, compound characters, and numerals. The suggested approach outperforms many previously developed procedures, with faster execution times and fewer required epochs. Furthermore, this model applies to more than three datasets. Our proposed CNN-based model has achieved impressive validation accuracies on three datasets. Specifically, for the BanglaLekha isolated dataset, which includes 84 character classes, the validation accuracy was 92.48%. On the Ekush dataset, which includes 60 character classes, the model achieved a validation accuracy of 97.24%, while on the customized dataset, which includes 50 character classes, the validation accuracy was 97.03%. Our model has demonstrated high accuracy and outperformed several prominent existing frameworks.
... Their technique has achieved an overall accuracy of 95.67% for six-class respiratory disease classification. Since the ICBHI 2017 database is heavily imbalanced due to the large number of instances in the COPD class, García-Ordás et al. [16] explored the potential of a variational autoencoder to augment the minority classes. After augmentation, the training data were fed to a deep CNN model, and with this model an outstanding classification result of 98.8% sensitivity and 98.6% specificity was achieved for six-class classification. ...
... 3) Asthmatic Lung Sound Incorporation and Classification: Asthma is one of the minority classes in the ICBHI dataset with only two sound recordings, making deep-learning approaches unreliable for its inclusion in most classification methods used in [2], [14], and [16]. Altan et al. [19] have shown an interesting result on asthma classification by utilizing lung sounds from their own recorded database. ...
... Table VII shows the comparative results for all three classification tasks. To date, researchers have used only D1 for six-class classification [2], [14], [16] using lung sound signals. In this work, however, we have used all three databases to train our model, which also ensures its robustness, as we have incorporated all publicly available pathological lung sounds. ...
Respiratory diseases are the world's third leading cause of mortality. Early detection is critical in dealing with respiratory diseases, as it improves the effectiveness of intervention, including treatment and reducing the spread. The main aim of this article is to propose a novel lightweight inception network to classify a wide spectrum of respiratory diseases using lung sound signals. The proposed framework consists of three stages: 1) preprocessing; 2) mel spectrogram extraction and conversion into a three-channel image; and 3) classification of the mel spectrogram images into different pathological classes using the proposed lightweight inception network, namely, the respiratory disease lightweight inception network (RDLINet). Utilizing the proposed architecture, we have achieved high classification accuracies of 96.6%, 99.6%, and 94.0% for seven-class classification, six-class classification, and healthy-versus-asthma classification, respectively. To the best of our knowledge, this is the first work on seven-class respiratory disease classification using lung sounds. Moreover, our proposed network outperforms all existing published works for six-class and binary classification. The suggested framework makes use of deep-learning methods and offers a standardized evaluation with strong categorization capabilities. Our study is a pioneering one that focuses exclusively on lung sounds to distinguish between a wide range of respiratory diseases. The proposed framework can be translated into a real-time clinical application, which will facilitate the prospect of automated respiratory health screening using lung sounds.
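Stage 2 of the pipeline above rests on the mel scale, which spaces frequency bands the way mel-spectrogram filter banks do. A minimal sketch of that spacing, assuming the common HTK-style mel formula (not code from the cited work):

```python
import math

def hz_to_mel(f_hz):
    # HTK-style mel formula commonly used when building mel filter banks
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_band_edges(f_min, f_max, n_bands):
    """Band edges equally spaced on the mel scale, mapped back to Hz.
    Mel-spectrogram filter banks place their triangular filters on
    these edges, giving finer resolution at low frequencies."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + k * step) for k in range(n_bands + 2)]

# Hypothetical band layout for lung sounds sampled up to 4 kHz.
edges = mel_band_edges(0.0, 4000.0, 10)
```

Note how consecutive edges grow further apart in Hz toward the top of the range; that perceptually motivated warping is what distinguishes a mel spectrogram from a plain linear-frequency one.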
... This may be observed in several works [47,48,50,51] presented in Table 2. Another common problem in the literature is the lack of subject separation between the training and testing sets. Authors often consider a cross-validation scheme to evaluate their models but do not ensure subject isolation between sets [49,52]. Once again, significant bias was introduced in the evaluation of the methods with such approaches. ...
... Data augmentation techniques, both at the data and algorithm level, can also be employed to address the data imbalance issue. For instance, generative adversarial networks or variational autoencoders might be used to artificially generate new samples and increase the performance of the classification models [49,67]. Such approaches will particularly help to increase the performance of deep-learning models as these tend to scale better as the number of available samples increases. ...
Background and objective:
Respiratory diseases are among the most significant causes of morbidity and mortality worldwide, causing substantial strain on society and health systems. Over the last few decades, there has been increasing interest in the automatic analysis of respiratory sounds and electrical impedance tomography (EIT). Nevertheless, no publicly available databases with both respiratory sound and EIT data are available.
Methods:
In this work, we have assembled the first open-access bimodal database focusing on the differential diagnosis of respiratory diseases (BRACETS: Bimodal Repository of Auscultation Coupled with Electrical Impedance Thoracic Signals). It includes simultaneous recordings of single and multi-channel respiratory sounds and EIT. Furthermore, we have proposed several machine learning-based baseline systems for automatically classifying respiratory diseases in six distinct evaluation tasks using respiratory sound and EIT (A1, A2, A3, B1, B2, B3). These tasks included classifying respiratory diseases at sample and subject levels. The performance of the classification models was evaluated using a 5-fold cross-validation scheme (with subject isolation between folds).
Results:
The resulting database consists of 1097 respiratory sounds and 795 EIT recordings acquired from 78 adult subjects in two countries (Portugal and Greece). In the task of automatically classifying respiratory diseases, the baseline classification models have achieved the following average balanced accuracy: Task A1 - 77.9±13.1%; Task A2 - 51.6±9.7%; Task A3 - 38.6±13.1%; Task B1 - 90.0±22.4%; Task B2 - 61.4±11.8%; Task B3 - 50.8±10.6%.
Conclusion:
The creation of this database and its public release will aid the research community in developing automated methodologies to assess and monitor respiratory function, and it might serve as a benchmark in the field of digital medicine for managing respiratory diseases. Moreover, it could pave the way for creating multi-modal robust approaches for that same purpose.
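The balanced accuracy reported for tasks A1-B3 above is the mean per-class recall, which keeps a majority class from dominating the score on imbalanced data. A minimal sketch with a hypothetical toy example:

```python
def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall: each class contributes equally to the
    score regardless of how many samples it has."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        idx = [i for i, t in enumerate(y_true) if t == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)

# Imbalanced toy split: 8 "copd" samples vs 2 "healthy" samples.
y_true = ["copd"] * 8 + ["healthy"] * 2
y_pred = ["copd"] * 8 + ["copd", "healthy"]
score = balanced_accuracy(y_true, y_pred)  # (1.0 + 0.5) / 2 = 0.75
```

Plain accuracy on this toy example is 0.9, but balanced accuracy is only 0.75 because one of the two healthy samples is missed; this gap is why balanced accuracy is the natural metric for the imbalanced tasks in the database above.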
... Otherwise, the second group transforms the audio recordings into two-dimensional spectrograms. These spectrograms such as S-transform [6], MFCC spectrogram [7], and log-mel spectrogram [8], [9] are generated to capture spectral and temporal information of respiratory sounds. Next, these spectrograms are inputted into network architectures such as convolutional neural network (CNN) based architectures [10], [11] or recurrent neural network (RNN) based architectures [12], [13] for classification. ...
This paper presents a deep learning system applied for detecting anomalies from respiratory sound recordings. Our system initially performs audio feature extraction using Continuous Wavelet transformation. This transformation converts the respiratory sound input into a two-dimensional spectrogram where both spectral and temporal features are presented. Then, our proposed deep learning architecture inspired by the Inception-residual-based backbone performs the spatial-temporal focusing and multi-head attention mechanism to classify respiratory anomalies. In this work, we evaluate our proposed models on the benchmark SPRSound (The Open-Source SJTU Paediatric Respiratory Sound) database proposed by the IEEE BioCAS 2023 challenge. As regards the Score computed by an average between the average score and harmonic score, our robust system has achieved Top-1 performance with Scores of 0.810, 0.667, 0.744, and 0.608 in Tasks 1-1, 1-2, 2-1, and 2-2, respectively.
... Figure 2 below depicts the working mechanism of a CNN [22]. Monitoring the interior of buses and making inferences from the images obtained by cameras should be considered one of the most suitable applications for enhancing the efficiency of bus-based public transportation. In this context, the actions performed inside the vehicle should be analyzed. ...
... Recently, there have been several experiments with generative approaches in the field of computer vision, such as variational autoencoders (VAEs) [4] and generative adversarial networks (GANs) [5]. In particular, GANs show important results in various computer vision tasks, such as image generation [6][7][8], image conversion [9], super-resolution [10], and text-image synthesis [11]. ...
Artificial intelligence, and machine learning in particular, has made rapid advances in image processing. However, its incorporation into architectural design is still in its early stages compared to other disciplines. Therefore, this paper addresses the development of an integrated bottom-up digital design approach and describes a research framework for incorporating a deep convolutional generative adversarial network (GAN) for early-stage design exploration and the generation of intricate and complex alternative facade designs for urban interiors. In this paper, a novel facade design is proposed using the architectural style, size, scale, and openings of two adjacent buildings as references to create a new building design in the same neighborhood for urban infill. This newly created building contains the outline, style, and shape of the two main buildings. A 2D building design is generated as an image, where (1) neighboring buildings are imported as a reference using a cell phone and (2) iFACADE decodes their spatial neighborhood. We illustrate that iFACADE will be useful for designers in the early design phase to create new facades in relation to existing buildings quickly, saving time and energy. Moreover, building owners can use iFACADE to show their preferred architectural facade to their architects by mixing two building styles to create a new building. We therefore present iFACADE as a communication platform between architects and builders in the early design phases. The initial results define a heuristic function for generating abstract facade elements and sufficiently illustrate the desired functionality of the prototype we developed.
... Finally, we followed a previous approach [31] and employed a variational autoencoder (VAE) to address the imbalanced classes in the original datasets. The VAE used mean and standard deviation layers to sample the latent vector (see Figure 1). ...
... Given that collecting new lung and heart sound data is an exhausting and costly process, we leveraged data augmentation to make our proposed model more robust. We followed a previous approach [31] and employed a variational autoencoder (VAE) to address the imbalanced classes in the original datasets. The VAE uses mean and standard deviation layers to sample the latent vector (see Figure 1; for more details, see [31]). ...
... We followed a previous approach [31] and employed a variational autoencoder (VAE) to address the imbalanced classes in the original datasets. The VAE uses mean and standard deviation layers to sample the latent vector (see Figure 1; for more details, see [31]). After data augmentation, the total number of samples increased from 1917 to 8067. ...
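The latent-vector sampling these snippets describe is the standard VAE reparameterization trick: the encoder's mean and (log-)variance layers parameterize a Gaussian, and new synthetic samples are decoded from draws out of it. A minimal sketch with hypothetical encoder outputs (not the cited model):

```python
import math
import random

def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, 1),
    where sigma = exp(log_var / 2). Keeping the noise in eps leaves the
    mean and variance layers differentiable during training."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

rng = random.Random(0)
mu = [0.5, -1.0]        # hypothetical encoder mean outputs
log_var = [-2.0, -2.0]  # hypothetical encoder log-variance outputs
z = sample_latent(mu, log_var, rng)  # one latent draw to decode
```

For class balancing, minority-class recordings are encoded, many such `z` vectors are drawn, and each is decoded into a new synthetic sample, which is how the 1917 samples above grow to 8067 after augmentation.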
Cardiac and respiratory diseases are the primary causes of health problems. If we can automate anomalous heart and lung sound diagnosis, we can improve the early detection of disease and enable the screening of a wider population than possible with manual screening. We propose a lightweight yet powerful model for simultaneous lung and heart sound diagnosis, which is deployable in an embedded low-cost device and is valuable in remote areas or developing countries where Internet access may not be available. We trained and tested the proposed model with the ICBHI and the Yaseen datasets. The experimental results showed that our 11-class prediction model could achieve 99.94% accuracy, 99.84% precision, 99.89% specificity, 99.66% sensitivity, and 99.72% F1 score. We designed a digital stethoscope (around USD 5) and connected it to a low-cost, single-board-computer Raspberry Pi Zero 2W (around USD 20), on which our pretrained model can be smoothly run. This AI-empowered digital stethoscope is beneficial for anyone in the medical field, as it can automatically provide diagnostic results and produce digital audio records for further analysis.
... The most popular data augmentation techniques are summarized in Table 9.3 [43], among them generative adversarial network variants [41,78,81]. ...