1 code implementation • 7 Jul 2023 • Tommaso Pegolotti, Elias Frantar, Dan Alistarh, Markus Püschel
We present ongoing work on a new automatic code generation approach for quantized generative inference with LLMs such as LLaMA or OPT on off-the-shelf CPUs.
1 code implementation • 9 Feb 2023 • Mahdi Nikdan, Tommaso Pegolotti, Eugenia Iofinova, Eldar Kurtic, Dan Alistarh
We provide a new efficient version of the backpropagation algorithm, specialized to the case where the weights of the neural network being trained are sparse.