3 Nov 2023 • Zheyuan Bai, Xinduo Liu, Hailin Hu, Tianyu Guo, Qinghua Zhang, Yunhe Wang
Data-Free Knowledge Distillation (DFKD) plays a vital role in compressing models when the original training data is unavailable.
Tasks: Data-free Knowledge Distillation, Language Modelling, +4
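For context, knowledge distillation trains a student to match a teacher's softened output distribution; in the data-free setting the inputs are synthetic samples generated without the original training set. The sketch below shows only the standard temperature-scaled distillation objective (Hinton et al. style), not this paper's specific method; all function names are illustrative.

```python
import math

def softmax(logits, T=1.0):
    """Temperature-softened softmax over a list of logits."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_kl(teacher_logits, student_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 as is conventional in knowledge distillation."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return (T * T) * sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Identical logits yield zero loss; a mismatch yields a positive loss.
print(round(distill_kl([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]), 6))  # 0.0
print(distill_kl([2.0, 1.0, 0.1], [0.1, 1.0, 2.0]) > 0)        # True
```

In DFKD pipelines, this loss would be evaluated on generated pseudo-data rather than real training examples.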