Fig 5 - uploaded by Abdelaziz Abohamama
Feature-based knowledge distillation.

Source publication
Article
Full-text available
Recently, deep neural networks (DNNs) have been used successfully in many fields, particularly in medical diagnosis. However, deep learning (DL) models are expensive in terms of memory and computing resources, which hinders their deployment on resource-limited devices or in delay-sensitive systems. Therefore, these deep models need to be acc...

Context in source publication

Context 1
... are efficient at learning several levels of feature representation. As a result, the feature maps produced by both the final and intermediate layers can serve as supervisory information for training the student model. A trained teacher model also captures knowledge in its intermediate layers, which is particularly true for deep networks. Fig. 5 displays a general feature-based knowledge distillation model. The objective is to teach the student model to activate features similarly to the teacher model; the distillation loss function achieves this by reducing the difference ...
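The feature-matching objective in the excerpt above can be sketched as a minimal NumPy illustration (this is a generic sketch, not the source article's implementation; the projection matrix `proj` is a hypothetical adapter for when student and teacher channel counts differ):

```python
import numpy as np

def feature_distillation_loss(student_feat, teacher_feat, proj=None):
    """Mean-squared distance between student and teacher feature maps.

    student_feat, teacher_feat: arrays of shape (N, C, H, W).
    proj: optional (C_student, C_teacher) matrix mapping student channels
          into the teacher's feature space when the dimensions differ
          (a hypothetical adapter; real systems often learn a 1x1 conv).
    """
    if proj is not None:
        # Project student channels: (N, C_s, H, W) -> (N, C_t, H, W)
        student_feat = np.einsum('nchw,ck->nkhw', student_feat, proj)
    assert student_feat.shape == teacher_feat.shape
    diff = student_feat - teacher_feat
    return float(np.mean(diff ** 2))
```

Minimizing this quantity over a training set pushes the student's intermediate activations toward the teacher's, which is the core idea behind feature-based distillation.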

Citations

... This process of KD not only facilitates model compression but also enhances the generalization capabilities of the student model [10]. The success of KD is inherently tied to the quality and diversity of the datasets used during training, as well as to the broad range of applications of KD-based learning [1,12,14-19]. ...
... For example, Li et al. proposed a transferred attention method to improve the performance of convolutional neural networks [27], while Yazdanbakhsh et al. studied the application of knowledge distillation in specific domains such as healthcare [19]. However, despite these significant advances, little attention has been paid to the impact of data on this knowledge transfer process. ...
Article
Full-text available
As the demand for efficient and lightweight models in image classification grows, knowledge distillation has emerged as a promising technique to transfer expertise from complex teacher models to simpler student models. However, the efficacy of knowledge distillation is intricately linked to the choice of datasets used during training. Datasets are pivotal in shaping a model’s learning process, influencing its ability to generalize and discriminate between diverse patterns. While considerable research has independently explored knowledge distillation and image classification, a comprehensive understanding of how different datasets impact knowledge distillation remains a critical gap. This study systematically investigates the impact of diverse datasets on knowledge distillation in image classification. By varying dataset characteristics such as size, domain specificity, and inherent biases, we aim to unravel the nuanced relationship between datasets and the efficacy of knowledge transfer. Our experiments employ a range of datasets to comprehensively explore their impact on the performance gains achieved through knowledge distillation. This study contributes valuable guidance for researchers and practitioners seeking to optimize image classification models through knowledge distillation. By elucidating the intricate interplay between dataset characteristics and knowledge distillation outcomes, our findings empower the community to make informed decisions when selecting datasets, ultimately advancing the field toward more robust and efficient model development.
... Given the comparatively lighter and shallower structure of the student model, its performance may occasionally lag behind that of its more complex counterparts. To address this challenge, we introduced the concept of Knowledge Distillation (KD) [59,60], which facilitates the transfer of valuable knowledge between models. KD operates on the principle of compressing heavyweight models into lightweight versions, often with a tradeoff in accuracy. ...
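The compression principle described in the excerpt above is usually realized with the classic soft-target formulation of Hinton et al.: the student matches the teacher's temperature-softened output distribution while also fitting the ground-truth labels. A minimal NumPy sketch (generic, not the cited work's implementation; `T` and `alpha` are illustrative hyperparameters):

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Weighted sum of a soft-target term and a hard-label cross-entropy.

    The soft term is the cross-entropy between the teacher's and student's
    temperature-softened distributions, scaled by T^2 so its gradient
    magnitude stays comparable as T changes.
    """
    p_teacher = softmax(teacher_logits, T)
    log_p_student = np.log(softmax(student_logits, T) + 1e-12)
    soft = -np.mean(np.sum(p_teacher * log_p_student, axis=-1)) * (T ** 2)
    # Standard cross-entropy against the ground-truth labels (T = 1)
    log_q = np.log(softmax(student_logits) + 1e-12)
    hard = -np.mean(log_q[np.arange(len(labels)), labels])
    return alpha * soft + (1 - alpha) * hard
```

The accuracy tradeoff mentioned in the excerpt shows up here as the balance `alpha` strikes between imitating the teacher and fitting the labels directly.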
Article
Full-text available
Accurate and timely diagnosis of pulmonary diseases is critical in the field of medical imaging. While deep learning models have shown promise in this regard, the current methods for developing such models often require extensive computing resources and complex procedures, rendering them impractical. This study focuses on the development of a lightweight deep-learning model for the detection of pulmonary diseases. Leveraging the benefits of knowledge distillation (KD) and the integration of the ConvMixer block, we propose a novel lightweight student model based on the MobileNet architecture. The methodology begins with training multiple teacher model candidates to identify the most suitable teacher model. Subsequently, KD is employed, utilizing the insights of this robust teacher model to enhance the performance of the student model. The objective is to reduce the student model's parameter size and computational complexity while preserving its diagnostic accuracy. We perform an in-depth analysis of our proposed model's performance compared to various well-established pre-trained student models, including MobileNetV2, ResNet50, InceptionV3, Xception, and NasNetMobile. Through extensive experimentation and evaluation across diverse datasets, including chest X-rays of different pulmonary diseases such as pneumonia, COVID-19, tuberculosis, and pneumothorax, we demonstrate the robustness and effectiveness of our proposed model in diagnosing various chest infections. Our model showcases superior performance, achieving an impressive classification accuracy of 97.92%. We emphasize the significant reduction in model complexity, with 0.63 million parameters, allowing for efficient inference and rapid prediction times, rendering it ideal for resource-constrained environments. 
Outperforming various pre-trained student models in terms of overall performance and computation cost, our findings underscore the effectiveness of the proposed KD strategy and the integration of the ConvMixer block. This highlights the importance of incorporating advanced techniques and innovative architectural elements in the development of highly effective models for medical image analysis.
... Similarly, the DeepEdgeSoc framework Al Koutayni et al. (2023) accelerates DL network design for energy-efficient FPGA implementations, aligning with our resource efficiency goal. Moreover, approaches like resource-frugal quantized CNNs Nalepa et al. (2020) and knowledge distillation methods Alabbasy et al. (2023) resonate with our efforts to compress model size while maintaining performance. These studies highlight the importance of balancing computational demands with resource limitations, a core aspect of our research. ...
Article
Full-text available
In this paper, we address the question of achieving high accuracy in deep learning models for agricultural applications through edge computing devices while considering the associated resource constraints. Traditional and state-of-the-art models have demonstrated good accuracy, but their practicality as end-user available solutions remains uncertain due to current resource limitations. One agricultural application for deep learning models is the detection and classification of plant diseases through image-based crop monitoring. We used the publicly available PlantVillage dataset containing images of healthy and diseased leaves for 14 crop species and 6 groups of diseases as example data. The MobileNetV3-small model succeeds in classifying the leaves with a test accuracy of around 99.50%. Post-training optimization using quantization reduced the number of model parameters from approximately 1.5 million to 0.93 million while maintaining the accuracy of 99.50%. The final model is in ONNX format, enabling deployment across various platforms, including mobile devices. These findings offer a cost-effective solution for deploying accurate deep-learning models in agricultural applications.
Article
Full-text available
This study conducts a bibliometric analysis and systematic review to examine research trends in the application of knowledge distillation for medical image segmentation. A total of 806 studies from 343 distinct sources, published between 2019 and 2023, were analyzed using Publish or Perish and VOSviewer, with data retrieved from Scopus and Google Scholar. The findings indicate a rising trend in publications indexed in Scopus, whereas a decline was observed in Google Scholar. Citation analysis revealed that the United States and China emerged as the leading contributors in terms of both publication volume and citation impact. Previous research predominantly focused on optimizing knowledge distillation techniques and their implementation in resource-constrained devices. Keyword analysis demonstrated that medical image segmentation appeared most frequently with 144 occurrences, followed by medical imaging with 110 occurrences. This study highlights emerging research opportunities, particularly in leveraging knowledge distillation for U-Net architectures with large-scale datasets and integrating transformer models to enhance medical image segmentation performance.
Article
Accurately segmenting and staging tumor lesions in cancer patients presents a significant challenge for radiologists, but it is essential for devising effective treatment plans including radiation therapy, personalized medicine, and surgical options. The integration of artificial intelligence (AI), particularly deep learning (DL), has become a useful tool for radiologists, enhancing their ability to understand tumor biology and deliver personalized care to patients with H&N tumors. Segmenting H&N tumor lesions using Positron Emission Tomography/Computed Tomography (PET/CT) images has gained significant attention. However, the diverse shapes and sizes of tumors, along with indistinct boundaries between malignant and normal tissues, present significant challenges in effectively fusing PET and CT images. To overcome these challenges, various DL-based models have been developed for segmenting tumor lesions in PET/CT images. This article reviews multimodality (PET/CT) based H&N tumor lesions segmentation methods. We firstly discuss the strengths and limitations of PET/CT imaging and the importance of DL-based models in H&N tumor lesion segmentation. Second, we examine the current state-of-the-art DL models for H&N tumor segmentation, categorizing them into UNet, VNet, Vision Transformer, and miscellaneous models based on their architectures. Third, we explore the annotation and evaluation processes, addressing challenges in segmentation annotation and discussing the metrics used to assess model performance. Finally, we discuss several open challenges and provide some avenues for future research in H&N tumor lesion segmentation.