Article
Publisher preview available

A lightweight deep learning model with knowledge distillation for pulmonary diseases detection in chest X-rays

Springer Nature
Multimedia Tools and Applications

Abstract and Figures

Accurate and timely diagnosis of pulmonary diseases is critical in the field of medical imaging. While deep learning models have shown promise in this regard, the current methods for developing such models often require extensive computing resources and complex procedures, rendering them impractical. This study focuses on the development of a lightweight deep-learning model for the detection of pulmonary diseases. Leveraging the benefits of knowledge distillation (KD) and the integration of the ConvMixer block, we propose a novel lightweight student model based on the MobileNet architecture. The methodology begins with training multiple teacher model candidates to identify the most suitable teacher model. Subsequently, KD is employed, utilizing the insights of this robust teacher model to enhance the performance of the student model. The objective is to reduce the student model's parameter size and computational complexity while preserving its diagnostic accuracy. We perform an in-depth analysis of our proposed model's performance compared to various well-established pre-trained student models, including MobileNetV2, ResNet50, InceptionV3, Xception, and NasNetMobile. Through extensive experimentation and evaluation across diverse datasets, including chest X-rays of different pulmonary diseases such as pneumonia, COVID-19, tuberculosis, and pneumothorax, we demonstrate the robustness and effectiveness of our proposed model in diagnosing various chest infections. Our model showcases superior performance, achieving an impressive classification accuracy of 97.92%. We emphasize the significant reduction in model complexity, with 0.63 million parameters, allowing for efficient inference and rapid prediction times, rendering it ideal for resource-constrained environments. Outperforming various pre-trained student models in terms of overall performance and computation cost, our findings underscore the effectiveness of the proposed KD strategy and the integration of the ConvMixer block. This highlights the importance of incorporating advanced techniques and innovative architectural elements in the development of highly effective models for medical image analysis.
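The preview does not include implementation details, but the setup the abstract describes, a trained teacher guiding a compact student, is typically realized with Hinton-style soft targets. The sketch below is a minimal illustration of that objective in PyTorch; the temperature `T`, mixing weight `alpha`, and function name are illustrative assumptions, not values from the paper.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Blend a soft-target KL term (teacher knowledge) with hard-label
    cross-entropy. T and alpha are illustrative, not from the paper."""
    # Soft targets: KL divergence between temperature-scaled distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # rescale gradients, as in Hinton et al.
    # Hard targets: standard cross-entropy on the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

Only the student's parameters are updated against this loss; the teacher runs frozen in evaluation mode.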
https://doi.org/10.1007/s11042-024-19638-2
A lightweight deep learning model withknowledge
distillation forpulmonary diseases detection inchest X‑rays
MohammedA.Asham1· AsmaA.Al‑Shargabi2· RaeedAl‑Sabri1· IbrahimMeftah3
Received: 18 January 2024 / Revised: 31 May 2024 / Accepted: 7 June 2024
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024
Keywords: Pneumonia Detection · Knowledge Distillation · Transfer learning · ConvMixer Block · Chest Infection Classification · Medical Imaging
Article
Full-text available
COVID-19 has affected hundreds of millions of individuals, seriously harming the global population's health, welfare, and economy. Furthermore, health facilities are severely overburdened by the record number of COVID-19 cases, which makes prompt and accurate diagnosis difficult. Automatically identifying infected individuals and promptly placing them under special care is a critical step in reducing the burden of such issues. Convolutional Neural Networks (CNNs) and other machine learning techniques can be utilized to address this demand. Many existing deep learning models, albeit producing the intended outcomes, were developed with millions of parameters, making them unsuitable for use on devices with constrained resources. Motivated by this fact, a novel lightweight deep learning model based on the Efficient Channel Attention (ECA) module and the SqueezeNet architecture is developed in this work to identify COVID-19 patients from chest X-ray and CT images in the initial phases of the disease. After the proposed lightweight model was tested on different datasets with two, three, and four classes, the results show its better performance over existing models. The outcomes show that, in comparison to current heavyweight models, our models dramatically reduced the cost and memory requirements of computing resources while still achieving comparable performance. These results support the notion that the proposed model can help diagnose COVID-19 in patients, as it can be easily deployed on low-resource and low-processing devices.
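The abstract does not show how the ECA module is attached to SqueezeNet, but the module itself (Wang et al., ECA-Net) is compact enough to sketch; `k_size` is illustrative (ECA-Net derives it adaptively from the channel count).

```python
import torch.nn as nn

class ECA(nn.Module):
    """Efficient Channel Attention: global average pooling, a 1-D
    convolution across channels, and a sigmoid gate that reweights
    the input's channels."""

    def __init__(self, k_size=3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k_size,
                              padding=k_size // 2, bias=False)
        self.gate = nn.Sigmoid()

    def forward(self, x):                        # x: (B, C, H, W)
        y = self.pool(x)                         # (B, C, 1, 1)
        y = y.squeeze(-1).transpose(-1, -2)      # (B, 1, C)
        y = self.conv(y)                         # local cross-channel interaction
        y = self.gate(y).transpose(-1, -2).unsqueeze(-1)  # back to (B, C, 1, 1)
        return x * y
```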
Article
Full-text available
Chest radiography is an essential diagnostic tool for respiratory diseases such as COVID-19, pneumonia, and tuberculosis because it accurately depicts the structures of the chest. However, accurate detection of these diseases from radiographs is a complex task that requires the availability of medical imaging equipment and trained personnel. Conventional deep learning models offer a viable automated solution for this task. However, the high complexity of these models often poses a significant obstacle to their practical deployment within automated medical applications, including mobile apps, web apps, and cloud-based platforms. This study addresses and resolves this dilemma by reducing the complexity of neural networks using knowledge distillation techniques (KDT). The proposed technique trains a neural network on an extensive collection of chest X-ray images and propagates the knowledge to a smaller network capable of real-time detection. To create a comprehensive dataset, we integrated three popular datasets of chest radiographs covering COVID-19, pneumonia, and tuberculosis. Our experiments show that this knowledge distillation approach outperforms conventional deep learning methods in terms of computational complexity and performance for real-time respiratory disease detection. Specifically, our system achieves an impressive average accuracy of 0.97, precision of 0.94, and recall of 0.97.
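As a rough illustration of the teacher-to-student propagation this abstract describes, one training step might look like the following; the helper names are hypothetical, and `loss_fn` can be a blended soft/hard-target objective such as the `distillation_loss` sketched earlier on this page.

```python
import torch

def distill_step(student, teacher, optimizer, images, labels, loss_fn):
    """One illustrative distillation step: the large teacher runs frozen,
    and only the compact real-time student is updated."""
    teacher.eval()
    with torch.no_grad():
        t_logits = teacher(images)   # knowledge to propagate
    s_logits = student(images)
    loss = loss_fn(s_logits, t_logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```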
Article
Full-text available
Multivariate time series classification (MTSC) based on deep learning (DL) has attracted increasingly more research attention. The performance of a DL-based MTSC algorithm depends heavily on the quality of the learned representations providing semantic information for downstream tasks, e.g., classification. Hence, a model's representation learning ability is critical for enhancing its performance. This paper proposes a densely knowledge-aware network (DKN) for MTSC. The DKN's feature extractor consists of a residual multi-head convolutional network (ResMulti) and a transformer-based network (Trans), called ResMulti-Trans. ResMulti has five residual multi-head blocks for capturing the local patterns of data, while Trans has three transformer blocks for extracting the global patterns of data. In addition, to enable dense mutual supervision between lower- and higher-level semantic information, this paper adapts densely dual self-distillation (DDSD) for mining rich regularizations and relationships hidden in the data. Experimental results show that compared with 5 state-of-the-art self-distillation variants, the proposed DDSD obtains 13/4/13 in terms of 'win'/'tie'/'lose' and gains the lowest average rank score. In particular, compared with pure ResMulti-Trans, DKN achieves 20/1/9 regarding 'win'/'tie'/'lose'. Finally, DKN outperforms 18 existing MTSC algorithms on 10 UEA2018 datasets and achieves the lowest average rank score.
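The exact DDSD pairing scheme is not given in this summary; a generic dense self-distillation term, in which deeper classifier heads act as teachers for shallower ones, might look like this. The all-pairs scheme and temperature below are assumptions, not the DKN specification.

```python
import torch.nn.functional as F

def dense_self_distillation_loss(head_logits, T=2.0):
    """head_logits: list of logits from shallow-to-deep auxiliary heads.
    Each deeper head supervises every shallower one via a detached,
    temperature-scaled KL term."""
    loss = 0.0
    for i, low in enumerate(head_logits):
        for high in head_logits[i + 1:]:
            target = F.softmax(high.detach() / T, dim=1)  # deeper head as teacher
            loss += F.kl_div(F.log_softmax(low / T, dim=1),
                             target, reduction="batchmean") * T * T
    return loss
```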
Article
Full-text available
To aid in the detection of tuberculosis, researchers have concentrated on developing computer-aided diagnostic technologies based on X-ray imaging. Since it generates noninvasive standard-of-care data, the chest X-ray is one of the most often used diagnostic imaging modalities in computer-aided solutions. Due to their significant interclass similarities and low intra-class variation, abnormalities in chest X-ray images continue to pose difficulty for proper diagnosis. In this paper, a novel automated framework is proposed for the classification of tuberculosis, COVID-19, and pneumonia from chest X-ray images using deep learning and an improved optimization technique. Two pre-trained convolutional neural network models, EfficientNetB0 and ResNet50, have been utilized and fine-tuned with additional layers. Both models are trained with fixed hyperparameters on the selected datasets to obtain newly trained models. A novel feature selection technique is proposed that selects the best features; in this version, the distance and position-update formulations have been modified. The selected features are further fused using a novel technique based on serial fusion and a standard-deviation threshold function. The experimental process of the proposed framework was conducted on three datasets, obtaining accuracies of 98.2%, 99.0%, and 98.7%, respectively. In addition, a detailed Wilcoxon signed-rank analysis was conducted, confirming the significance of the proposed method's performance. Based on the results, it is concluded that the proposed method's accuracy is improved after the fusion process. In addition, the comparison with recent techniques shows that the proposed method is superior in terms of accuracy and precision.
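One plausible reading of serial fusion with a standard-deviation threshold, sketched below under stated assumptions: serial fusion as concatenation of the two deep-feature matrices, followed by keeping only columns whose standard deviation clears a threshold. The paper's actual rule may differ.

```python
import numpy as np

def serial_fuse(feats_a, feats_b, k=1.0):
    """feats_a, feats_b: (n_samples, n_features) deep-feature matrices
    from the two fine-tuned backbones. k scales the std threshold."""
    fused = np.concatenate([feats_a, feats_b], axis=1)  # serial = concatenation
    stds = fused.std(axis=0)
    keep = stds > k * stds.mean()   # drop low-variance (uninformative) columns
    return fused[:, keep]
```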
Article
Full-text available
The COVID-19 pandemic has had a significant impact on society, necessitating accurate identification and suitable medical treatment. This study addresses two primary objectives: reducing the computational cost of running the model on embedded devices, mobile devices, and conventional computers, and improving the model's performance relative to existing methods for high-performance, accurate medical recognition. To achieve these goals, we utilized two neural networks, VGG19 and ResNet50V2, for improved feature extraction from the dataset. The semantic features supplied by these networks were combined in a fully connected classifier layer to achieve satisfactory classification results for normal and COVID-19 cases. In addition, MobileNetV2, which is effective on mobile and embedded devices, was adopted to reduce computational demands. Furthermore, knowledge distillation was used to transfer information from the teacher networks (ResNet50V2 and VGG19) to the student network (MobileNetV2), improving its performance in COVID-19 identification. Pre-trained networks and fivefold cross-validation were used to evaluate the proposed method. The model achieved an accuracy of 98.8% and an F1 score of 99.1% in detecting infected and normal cases. These results demonstrate the superior performance of our approach. The student model is suitable for conventional computers, embedded systems, and clinical experts' cell phones, with acceptable accuracy and F1 score under cross-validation. Our method provides a cost-effective solution for COVID-19 identification, enabling wider accessibility and accurate diagnosis. Moreover, the proposed method outperforms previous works, improving accuracy, F1 score, and other related metrics by at least 1%.
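A minimal sketch of the fused teacher this abstract describes: pooled embeddings from two frozen backbones concatenated into one fully connected classifier. The channel counts (512 for a VGG19 feature extractor, 2048 for a ResNet-50-style one) and the head width are assumptions about the setup, not the paper's published architecture.

```python
import torch
import torch.nn as nn

class FusedTeacher(nn.Module):
    """Concatenate pooled features from two backbones and classify."""

    def __init__(self, vgg_features, resnet_features, num_classes=2):
        super().__init__()
        self.vgg, self.resnet = vgg_features, resnet_features
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Sequential(
            nn.Linear(512 + 2048, 256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, x):
        a = self.pool(self.vgg(x)).flatten(1)     # (B, 512)
        b = self.pool(self.resnet(x)).flatten(1)  # (B, 2048)
        return self.head(torch.cat([a, b], dim=1))
```

Knowledge from such a fused teacher can then be distilled into a MobileNetV2-sized student with a soft-target loss like the one sketched near the top of this page.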
Conference Paper
Motor imagery (MI) is a compelling cognitive phenomenon situated at the intersection of cognitive neuroscience and technology, offering insights into human cognition and a spectrum of practical applications. Electroencephalography (EEG) is a technology offering a non-invasive means of recording electrical activity within the brain. Motor imagery-based Brain-Computer Interfaces (MI-BCIs) enable the translation of cerebral activity in the sensorimotor areas of the brain into implementable commands. This paper offers a comprehensive exploration of MI, delving into its practical implications and the complexities it entails; the research focuses on enhancing the classification accuracy of EEG signals recorded during motor imagery tasks. We employ a comprehensive approach that integrates Filter Bank Common Spatial Pattern (CSP) and the Riemannian manifold to extract features, aiming to preserve critical signal details. Furthermore, we utilize Linear Discriminant Analysis (LDA) alongside advanced classification methods, such as AdaBoost and TotalBoost, to improve classification accuracy. Our results reveal significant improvements in classification accuracy. This research contributes to the advancement of MI EEG signal classification and opens new avenues for applications in fields such as brain-computer interfaces and medical diagnostics.
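The Riemannian branch of such a pipeline can be assembled from existing libraries; the sketch below uses pyRiemann with scikit-learn and is only one plausible configuration. The filter-bank/CSP preprocessing is omitted for brevity, and TotalBoost has no direct scikit-learn equivalent, so AdaBoost stands in for the boosting variants.

```python
from pyriemann.estimation import Covariances
from pyriemann.tangentspace import TangentSpace
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import AdaBoostClassifier
from sklearn.pipeline import make_pipeline

# X: EEG trials shaped (n_trials, n_channels, n_times); y: class labels.
riemann_lda = make_pipeline(
    Covariances(estimator="oas"),    # spatial covariance per trial
    TangentSpace(metric="riemann"),  # map SPD matrices to a vector space
    LinearDiscriminantAnalysis(),
)
riemann_boost = make_pipeline(
    Covariances(estimator="oas"),
    TangentSpace(metric="riemann"),
    AdaBoostClassifier(n_estimators=100),
)
# riemann_lda.fit(X_train, y_train); riemann_lda.score(X_test, y_test)
```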
Article
This article proposes a semi-supervised contrastive capsule transformer method with feature-based knowledge distillation (KD), called CapMatch, that simplifies existing semi-supervised learning (SSL) techniques for wearable human activity recognition (HAR). CapMatch gracefully hybridizes supervised and unsupervised learning to extract rich representations from input data. In unsupervised learning, CapMatch leverages pseudolabeling, contrastive learning (CL), and feature-based KD techniques to construct similarity learning on lower- and higher-level semantic information extracted from two augmented versions of the data, "weak" and "timecut", to recognize the relationships among the obtained features of classes in the unlabeled data. CapMatch combines the outputs of the weak- and timecut-augmented models to form pseudolabels and thus CL. Meanwhile, CapMatch uses feature-based KD to transfer knowledge from the intermediate layers of the weak-augmented model to those of the timecut-augmented model. To effectively capture both local and global patterns of HAR data, we design a capsule transformer network consisting of four capsule-based transformer blocks and one routing layer. Experimental results show that compared with a number of state-of-the-art semi-supervised and supervised algorithms, the proposed CapMatch achieves decent performance on three commonly used HAR datasets, namely HAPT, WISDM, and UCI_HAR. With only 10% of the data labeled, CapMatch achieves F1 values higher than 85.00% on these datasets, outperforming 14 semi-supervised algorithms. When the proportion of labeled data reaches 30%, CapMatch obtains F1 values of no lower than 88.00% on the above datasets, which is better than several classical supervised algorithms, e.g., decision tree and k-nearest neighbor (KNN).
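The feature-based KD term described here, matching intermediate features of the weak-augmented branch to those of the timecut-augmented branch, can be sketched as an MSE penalty; the 1x1 adapter projections, layer selection, and weighting below are assumptions rather than CapMatch's exact design.

```python
import torch.nn.functional as F

def feature_kd_loss(student_feats, teacher_feats, adapters):
    """student_feats/teacher_feats: lists of intermediate feature maps
    from the timecut- and weak-augmented models; adapters: 1x1 conv
    projections aligning channel dimensions."""
    loss = 0.0
    for s, t, proj in zip(student_feats, teacher_feats, adapters):
        loss += F.mse_loss(proj(s), t.detach())  # no gradient into the teacher branch
    return loss
```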
Article
Gastrointestinal (GI) diseases are among the most common diseases of the human digestive system and carry a significantly high mortality rate. Accurate evaluation of endoscopic images plays an important role in decision making regarding patient treatment. Recently, convolutional neural networks (CNNs) have been introduced for the diagnosis of GI diseases. However, achieving high accuracy is still a challenging task. To overcome these limitations, we propose the "Densely Connected Depth-wise Separable Convolution-Based Network" (DCDS-Net) model, utilizing depth-wise separable convolution (DWSC) with residual connections and densely connected blocks (DCB), to effectively diagnose various endoscopic images of GI diseases. In addition, we incorporate global average pooling (GAP), batch normalization, dropout, and dense layers in the DCB to learn rich discriminative features and improve the performance of the model. We explored the feasibility of block-wise fine-tuning using transfer learning on the proposed model to reduce overfitting, and experimentally determined the optimal level of fine-tuning, since transfer learning is well suited to medical data where labeled data is scarce. The proposed method has been evaluated on 6000 labeled endoscopic images containing four classes of GI diseases. In addition, data augmentation has been incorporated into the training pipeline to improve the performance of the model. Furthermore, a critical study was conducted to evaluate the generalizability of the proposed model on smaller training samples (e.g., 60%, 70%, 80%, and 90%). The study employed Grad-CAM to generate heatmaps that identify the regions of the GI tract indicative of the presence of different diseases. The results of extensive experiments show that the proposed model achieves significant improvements, with the highest classification accuracy of 99.33%, precision of 99.37%, and recall of 99.32%, outperforming all pre-trained and existing models for the detection of GI diseases. In conclusion, DCDS-Net exhibits high classification performance and can help endoscopists in automatic GI disease diagnosis.
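As a rough sketch of the depthwise separable building block this abstract centers on: a depthwise then pointwise convolution with a residual connection. The channel count, normalization placement, and dropout rate here are placeholders, not DCDS-Net's published configuration.

```python
import torch.nn as nn

class DWSBlock(nn.Module):
    """Depthwise separable convolution block with a residual connection."""

    def __init__(self, channels, dropout=0.2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, groups=channels),  # depthwise
            nn.Conv2d(channels, channels, 1),                              # pointwise
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Dropout2d(dropout),
        )

    def forward(self, x):
        return x + self.body(x)  # residual connection
```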