no code implementations • COLING 2022 • Chonghan Lee, Md Fahim Faysal Khan, Rita Brugarolas Brufau, Ke Ding, Vijaykrishnan Narayanan
While pre-trained language models like BERT have achieved impressive results on various natural language processing tasks, deploying them on resource-constrained devices is challenging due to their high computational cost and large memory footprint.
no code implementations • 1 Jul 2020 • Md Fahim Faysal Khan
By reducing model size and computation cost for dedicated AI accelerator designs, neural network quantization methods have attracted tremendous attention recently.
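To make the idea concrete, here is a minimal sketch of symmetric per-tensor int8 quantization, the basic mechanism behind the quantization methods the abstract refers to. This is an illustrative example only, not the paper's specific scheme; the function names `quantize_int8` and `dequantize` are hypothetical.

```python
# Illustrative sketch of symmetric per-tensor int8 quantization
# (not the paper's method; names are hypothetical).

def quantize_int8(weights):
    """Map float weights to int8 codes with a single per-tensor scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    # Clamp to the int8 range after rounding.
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 codes."""
    return [x * scale for x in q]

weights = [0.12, -0.5, 0.33, 1.27, -1.27]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
print(q)         # int8 codes
print(restored)  # approximate reconstruction of the original weights
```

Each stored weight shrinks from a 32-bit float to an 8-bit integer plus one shared scale, which is the source of the memory and compute savings on accelerator hardware; the cost is the rounding error visible in `restored`.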