no code implementations • 6 May 2024 • Abhinav Agarwalla, Abhay Gupta, Alexandre Marques, Shubhra Pandit, Michael Goin, Eldar Kurtic, Kevin Leong, Tuan Nguyen, Mahmoud Salem, Dan Alistarh, Sean Lie, Mark Kurtz
We achieve this for the LLaMA-2 7B model by combining the SparseGPT one-shot pruning method with sparse pretraining on a subset of the SlimPajama dataset mixed with a Python subset of The Stack dataset.
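To illustrate the one-shot-prune-then-sparse-pretrain recipe, here is a minimal PyTorch sketch. It uses simple magnitude pruning rather than the actual SparseGPT algorithm (which performs Hessian-based layer-wise weight reconstruction), and the function names are hypothetical; it only shows the pattern of sparsifying once and then keeping pruned weights at zero during continued training.

```python
# Hypothetical sketch: one-shot unstructured pruning followed by mask-preserving
# retraining. Magnitude-based for simplicity; SparseGPT itself uses a
# Hessian-based layer-wise reconstruction, which is not shown here.
import torch
import torch.nn as nn

def one_shot_prune(model: nn.Module, sparsity: float = 0.5) -> dict:
    """Zero the smallest-magnitude weights in every Linear layer and return
    boolean masks so later training steps can keep those weights at zero."""
    masks = {}
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear):
            w = module.weight.data
            k = int(w.numel() * sparsity)
            threshold = w.abs().flatten().kthvalue(k).values
            mask = w.abs() > threshold
            module.weight.data *= mask
            masks[name] = mask
    return masks

def reapply_masks(model: nn.Module, masks: dict) -> None:
    """Call after each optimizer step so sparse pretraining preserves the mask."""
    for name, module in model.named_modules():
        if name in masks:
            module.weight.data *= masks[name]
```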
no code implementations • 1 Mar 2024 • Vithursan Thangarasa, Mahmoud Salem, Shreyas Saxena, Kevin Leong, Joel Hestness, Sean Lie
Large language models (LLMs) are typically trained on general-purpose data spanning many domains, but a recent surge in domain-specific LLMs has shown their potential to outperform general-purpose models on domain-specific tasks (e.g., biomedicine).
Ranked #10 on Question Answering on PubMedQA
no code implementations • 30 Aug 2023 • Kilichbek Haydarov, Xiaoqian Shen, Avinash Madasu, Mahmoud Salem, Li-Jia Li, Gamaleldin Elsayed, Mohamed Elhoseiny
We introduce Affective Visual Dialog, an emotion explanation and reasoning task as a testbed for research on understanding the formation of emotions in visually grounded conversations.
no code implementations • 19 Nov 2022 • Mahmoud Salem, Mohamed Osama Ahmed, Frederick Tung, Gabriel Oliveira
This commonly encountered operational context calls for principled techniques for training ML models with the option to abstain from predicting when uncertain.
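As a concrete baseline for prediction with an abstain option, the sketch below thresholds softmax confidence: the model declines to predict whenever its top-class probability falls below a chosen cutoff. This is a common selective-prediction baseline and is not necessarily the method proposed in the paper; the threshold value and sentinel label are illustrative.

```python
# Minimal sketch: abstain from predicting when softmax confidence is low.
import torch
import torch.nn.functional as F

ABSTAIN = -1  # sentinel label returned when the model is not confident enough

def predict_or_abstain(logits: torch.Tensor, threshold: float = 0.9) -> torch.Tensor:
    """Return argmax predictions, or ABSTAIN where max softmax prob < threshold."""
    probs = F.softmax(logits, dim=-1)
    confidence, preds = probs.max(dim=-1)
    preds[confidence < threshold] = ABSTAIN
    return preds
```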
no code implementations • 19 Jul 2022 • Angus Galloway, Anna Golubeva, Mahmoud Salem, Mihai Nica, Yani Ioannou, Graham W. Taylor
Estimating the Generalization Error (GE) of Deep Neural Networks (DNNs) is an important task that typically relies on the availability of held-out data.
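For reference, the standard held-out estimate that this work seeks to avoid depending on is simply the error rate measured on data unseen during training, as in this short sketch (function name and loader are illustrative):

```python
# Sketch of the usual held-out estimate of generalization error:
# classification error measured on a validation/test DataLoader.
import torch

@torch.no_grad()
def heldout_error(model: torch.nn.Module, loader, device: str = "cpu") -> float:
    """Return the fraction of held-out examples the model misclassifies."""
    model.eval()
    wrong, total = 0, 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        preds = model(x).argmax(dim=-1)
        wrong += (preds != y).sum().item()
        total += y.numel()
    return wrong / total
```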