Figure 2 - available via license: Creative Commons Attribution 4.0 International
t-SNE visualization of the student's representations obtained by applying (a) the L_IS loss and (b) the L_IS + L_CS loss. L_CS forces the student to map samples of the same category closer together in representation space (teacher: resnet32×4; student: resnet8×4).
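The caption describes L_CS as pulling same-category samples together in representation space. The paper's exact definition of L_CS is not given here, but the effect can be illustrated with a hypothetical pairwise loss that averages (1 − cosine similarity) over same-class pairs, so minimizing it draws same-category representations closer:

```python
import math

def cosine(u, v):
    # Cosine similarity between two representation vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def category_similarity_loss(reps, labels):
    # Illustrative stand-in for a category-level loss (not the paper's L_CS):
    # average (1 - cosine) over all same-class pairs, so minimizing it
    # pulls representations of the same category together.
    pairs = [(i, j)
             for i in range(len(reps))
             for j in range(i + 1, len(reps))
             if labels[i] == labels[j]]
    if not pairs:
        return 0.0
    return sum(1.0 - cosine(reps[i], reps[j]) for i, j in pairs) / len(pairs)
```

With identical same-class representations the loss is zero; with orthogonal ones it reaches its maximum for that pair, matching the clustering behavior visible in the t-SNE plot.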
Source publication
Knowledge distillation has become an important technique for model compression and acceleration. Conventional knowledge distillation approaches aim to transfer knowledge from the teacher to the student network by minimizing the KL-divergence between their probabilistic outputs, which considers only the mutual relationship between individual representat...
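The conventional objective the abstract refers to can be sketched as the temperature-scaled KL-divergence between teacher and student output distributions (in the style of Hinton et al.'s distillation loss); the function names and temperature value below are illustrative, not from the source paper:

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax over a list of logits.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_kl_loss(teacher_logits, student_logits, temperature=4.0):
    # KL(p_teacher || p_student) between temperature-softened
    # distributions, scaled by T^2 so gradients keep a comparable
    # magnitude as T varies.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return (temperature ** 2) * kl
```

The loss is zero when the student matches the teacher exactly and grows as their output distributions diverge; it treats each sample's distribution independently, which is the limitation the source publication highlights.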