no code implementations • 25 Apr 2024 • Mostafa Elhoushi, Akshat Shrivastava, Diana Liskovich, Basil Hosmer, Bram Wasti, Liangzhen Lai, Anas Mahmoud, Bilge Acun, Saurabh Agarwal, Ahmed Roman, Ahmed A Aly, Beidi Chen, Carole-Jean Wu
We present LayerSkip, an end-to-end solution to speed-up inference of large language models (LLMs).
no code implementations • 12 Mar 2024 • Saurabh Agarwal, Bilge Acun, Basil Hosmer, Mostafa Elhoushi, Yejin Lee, Shivaram Venkataraman, Dimitris Papailiopoulos, Carole-Jean Wu
We observe that there is a high amount of redundancy across heads on which tokens they pay attention to.
1 code implementation • 2 Feb 2024 • Minghao Yan, Saurabh Agarwal, Shivaram Venkataraman
Speculative Decoding is a widely used technique to speed up inference for Large Language Models (LLMs) without sacrificing quality.
no code implementations • 1 Jan 2024 • Saurabh Agarwal, K. V. Arya, Yogesh Kumar Meena
The proposed multilayer multimodal fusion model, along with the FDSFM module, holds promise for accurate disease classification and can also be extended to other disease classifications in chest X-ray images.
1 code implementation • 4 May 2023 • Hongyi Wang, Saurabh Agarwal, Pongsakorn U-chupala, Yoshiki Tanaka, Eric P. Xing, Dimitris Papailiopoulos
Cuttlefish leverages the observation that after a few epochs of full-rank training, the stable rank (i. e., an approximation of the true rank) of each layer stabilizes at a constant value.
no code implementations • 24 Feb 2022 • Saurabh Agarwal, Chengpo Yan, Ziyi Zhang, Shivaram Venkataraman
Based on these insights, we develop Bagpipe, a system for training deep recommendation models that uses caching and prefetching to overlap remote embedding accesses with the computation.
1 code implementation • 5 Mar 2021 • Hongyi Wang, Saurabh Agarwal, Dimitris Papailiopoulos
In this work, we present Pufferfish, a communication and computation efficient distributed training framework that incorporates the gradient compression into the model training process via training low-rank, pre-factorized deep networks.
1 code implementation • 28 Feb 2021 • Saurabh Agarwal, Hongyi Wang, Shivaram Venkataraman, Dimitris Papailiopoulos
A rich body of prior work has highlighted the existence of communication bottlenecks in synchronous data-parallel training.
1 code implementation • 2 Feb 2021 • YuHan Liu, Saurabh Agarwal, Shivaram Venkataraman
With the rapid adoption of machine learning (ML), a number of domains now use the approach of fine tuning models which were pre-trained on a large corpus of data.
no code implementations • COLING 2020 • Kshitij Tayal, Nikhil Rao, Saurabh Agarwal, Xiaowei Jia, Karthik Subbian, Vipin Kumar
The lack of structure in short text sequences limits the success of popular NLP methods based on deep learning.
3 code implementations • 29 Oct 2020 • Saurabh Agarwal, Hongyi Wang, Kangwook Lee, Shivaram Venkataraman, Dimitris Papailiopoulos
The techniques usually require choosing a static compression ratio, often requiring users to balance the trade-off between model accuracy and per-iteration speedup.
2 code implementations • NeurIPS 2020 • Hongyi Wang, Kartik Sreenivasan, Shashank Rajput, Harit Vishwakarma, Saurabh Agarwal, Jy-yong Sohn, Kangwook Lee, Dimitris Papailiopoulos
Due to its decentralized nature, Federated Learning (FL) lends itself to adversarial attacks in the form of backdoors during training.
no code implementations • 27 May 2019 • Aravindakshan Babu, Saurabh Agarwal, Sudarshan Babu, Hariharan Chandrasekaran
K-Medoids(KM) is a standard clustering method, used extensively on semi-metric data. Error analyses of KM have traditionally used an in-sample notion of error, which can be far from the true error and suffer from generalization gap.
no code implementations • 5 Dec 2018 • Piyush Mital, Saurabh Agarwal, Bhargavi Neti, Yashodhara Haribhakta, Vibhavari Kamble, Krishnanjan Bhattacharjee, Debashri Das, Swati Mehta, Ajai Kumar
In today's digital age in the dawning era of big data analytics it is not the information but the linking of information through entities and actions which defines the discourse.
no code implementations • 8 Nov 2014 • Saurabh Agarwal, Punit Kumar Johari
Trademark Image Retrieval is playing a vital role as a part of CBIR System.