Search Results for author: Shiwei Zhong

Found 1 papers, 1 papers with code

Why does Knowledge Distillation Work? Rethink its Attention and Fidelity Mechanism

1 code implementation • 30 Apr 2024 • Chenqi Guo, Shiwei Zhong, Xiaofeng Liu, Qianli Feng, Yinglong Ma

Our key findings reveal that, as data augmentation strength increases, the Intersection over Union (IoU) of attentions between teacher models decreases, leading to reduced student overfitting and decreased fidelity.
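
A minimal sketch of the attention-IoU quantity mentioned above: two attention maps are binarized and the ratio of their overlap to their union is taken. The 0.5 threshold, NumPy-based implementation, and 14x14 map size below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def attention_iou(attn_a: np.ndarray, attn_b: np.ndarray, threshold: float = 0.5) -> float:
    """IoU of two attention maps after binarizing at `threshold` (illustrative sketch)."""
    mask_a = attn_a >= threshold
    mask_b = attn_b >= threshold
    union = np.logical_or(mask_a, mask_b).sum()
    if union == 0:
        return 1.0  # both maps empty after thresholding: treat as full overlap
    intersection = np.logical_and(mask_a, mask_b).sum()
    return float(intersection) / float(union)

# Example: overlap between two teachers' attention maps on the same input
rng = np.random.default_rng(0)
teacher_a = rng.random((14, 14))  # hypothetical 14x14 attention map
teacher_b = rng.random((14, 14))
print(f"Attention IoU: {attention_iou(teacher_a, teacher_b):.3f}")
```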

Tasks: Data Augmentation, Knowledge Distillation, +1
