1 code implementation • 14 Dec 2022 • Wenye Lin, Yifeng Ding, Zhixiong Cao, Hai-Tao Zheng
A common practice to address this problem is to introduce a pretrained contrastive teacher model and train the lightweight networks with distillation signals generated by the teacher.
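To make the idea of "distillation signals generated by the teacher" concrete, here is a minimal sketch of one generic form such a signal can take. It is not the authors' method. It assumes the teacher and student each map an image to an embedding vector, and that a shared set of anchor embeddings is available (the names `anchors`, `temperature`, and `distillation_loss` are hypothetical, chosen for illustration). The student is trained so that its softmax similarity distribution over the anchors matches the teacher's.

```python
import math


def l2_normalize(v):
    # Scale a vector to unit length (assumes v is not the zero vector).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def similarity_distribution(embedding, anchors, temperature):
    # Cosine similarity of one embedding to every anchor,
    # turned into a probability distribution with a temperature.
    e = l2_normalize(embedding)
    sims = [
        sum(a * b for a, b in zip(e, l2_normalize(anchor)))
        for anchor in anchors
    ]
    return softmax([s / temperature for s in sims])


def distillation_loss(student_emb, teacher_emb, anchors, temperature=0.1):
    # Cross-entropy between the teacher's and the student's
    # similarity distributions; the teacher acts as a fixed target.
    p_teacher = similarity_distribution(teacher_emb, anchors, temperature)
    p_student = similarity_distribution(student_emb, anchors, temperature)
    return -sum(pt * math.log(ps) for pt, ps in zip(p_teacher, p_student))


# Toy usage with hand-written 3-dimensional embeddings (illustrative values only).
anchors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
teacher = [0.9, 0.1, 0.0]
student = [0.2, 0.7, 0.1]
loss_mismatched = distillation_loss(student, teacher, anchors)
loss_matched = distillation_loss(teacher, teacher, anchors)
```

In this sketch the loss is smallest when the student reproduces the teacher's distribution exactly, so minimizing it pulls the lightweight network's embedding geometry toward the teacher's. In a real training loop the embeddings would come from neural networks and the loss would be minimized by gradient descent with the teacher frozen.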