Search Results for author: Shubhra Pandit

Found 1 paper, 0 papers with code

Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment

no code implementations • 6 May 2024 • Abhinav Agarwalla, Abhay Gupta, Alexandre Marques, Shubhra Pandit, Michael Goin, Eldar Kurtic, Kevin Leong, Tuan Nguyen, Mahmoud Salem, Dan Alistarh, Sean Lie, Mark Kurtz

We achieve this for the LLaMA-2 7B model by combining the SparseGPT one-shot pruning method with sparse pretraining of the pruned models on a subset of the SlimPajama dataset mixed with a Python subset of The Stack dataset.
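
The abstract names the two ingredients but not their mechanics. Below is a minimal PyTorch sketch of the general pattern, one-shot pruning followed by mask-preserving sparse training. It is an illustrative assumption, not the paper's pipeline: the helper names (one_shot_prune, enforce_masks) are hypothetical, and the magnitude criterion stands in for SparseGPT, which actually selects weights using approximate second-order (Hessian-based) information.

```python
import torch
import torch.nn as nn

def one_shot_prune(model: nn.Module, sparsity: float = 0.5) -> dict:
    """Zero out the smallest-magnitude weights in each Linear layer.

    Hypothetical stand-in for SparseGPT, which uses second-order
    information rather than plain magnitude to choose which weights
    to remove. Returns the binary masks for later re-application.
    """
    masks = {}
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear):
            w = module.weight.data
            k = max(1, int(w.numel() * sparsity))
            # Threshold at the k-th smallest absolute value.
            threshold = w.abs().flatten().kthvalue(k).values
            mask = (w.abs() > threshold).float()
            module.weight.data *= mask
            masks[name] = mask
    return masks

def enforce_masks(model: nn.Module, masks: dict) -> None:
    """Re-apply the pruning masks so that sparse (pre)training keeps
    the pruned pattern fixed while the surviving weights update."""
    for name, module in model.named_modules():
        if name in masks:
            module.weight.data *= masks[name]
```

In a training loop over the continued-pretraining data, one would call enforce_masks(model, masks) after each optimizer.step(), so that gradient updates cannot revive pruned weights.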

Arithmetic Reasoning · Code Generation · +2
