no code implementations • 24 May 2024 • Shashata Sawmya, Linghao Kong, Ilia Markov, Dan Alistarh, Nir Shavit
We show how to improve the inference efficiency of an LLM by expanding it into a mixture of sparse experts, where each expert is a copy of the original weights, one-shot pruned for a specific cluster of input values.
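A minimal sketch of that idea, with hypothetical names throughout: a simple magnitude-times-activation saliency stands in for the paper's actual one-shot pruning criterion, and a naive nearest-centroid assignment stands in for its input clustering.

```python
import numpy as np

def one_shot_prune(weights, inputs, sparsity=0.5):
    """One-shot prune a copy of `weights` for one input cluster.

    Saliency here is |w| scaled by the cluster's mean |activation|
    (an assumption for illustration; the paper's criterion may differ).
    """
    saliency = np.abs(weights) * np.mean(np.abs(inputs), axis=0)
    k = int(weights.size * sparsity)
    # k-th smallest saliency value is the pruning threshold
    threshold = np.partition(saliency.ravel(), k)[k]
    mask = saliency >= threshold
    return weights * mask

def build_sparse_experts(weights, inputs, n_clusters=2, sparsity=0.5):
    """Expand one dense layer into per-cluster sparse expert copies.

    Clustering is a naive nearest-centroid assignment with randomly
    chosen centroids (a stand-in for the paper's clustering step).
    """
    rng = np.random.default_rng(0)
    centroids = inputs[rng.choice(len(inputs), n_clusters, replace=False)]
    labels = np.argmin(
        ((inputs[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
    experts = {}
    for c in range(n_clusters):
        members = inputs[labels == c]
        if len(members):
            experts[c] = one_shot_prune(weights, members, sparsity)
    return experts, centroids
```

At inference time, each input would be routed to the expert whose centroid is nearest, so only that expert's (sparse) weights are used.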
1 code implementation • 11 Jun 2022 • Wenjian Luo, Hongwei Zhang, Linghao Kong, Zhijian Chen, Ke Tang
Security issues in DNNs, such as adversarial examples, have attracted considerable attention.