31 Oct 2021 • Lehan Yang, Jincen Song
Recent years have witnessed dramatic improvements in knowledge distillation, which can produce a compact student model with better efficiency while retaining the effectiveness of the teacher model.
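As a minimal sketch of the classic distillation objective this line alludes to (softened teacher/student distributions matched with a KL divergence, in the style of Hinton et al.): the function names and the temperature value below are illustrative, not from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of raw logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the teacher's and the student's softened
    output distributions, scaled by T^2 so gradient magnitudes stay
    comparable as the temperature grows (standard distillation practice)."""
    p = softmax(teacher_logits, temperature)   # teacher = target distribution
    q = softmax(student_logits, temperature)   # student = predicted distribution
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return (temperature ** 2) * kl
```

A student whose logits already match the teacher's incurs zero loss, while any mismatch yields a positive penalty that training would minimize.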