Search Results for author: Gargi Ghosh

Found 15 papers, 8 papers with code

Text Quality-Based Pruning for Efficient Training of Language Models

no code implementations • 26 Apr 2024 • Vasu Sharma, Karthik Padthe, Newsha Ardalani, Kushal Tirumala, Russell Howes, Hu Xu, Po-Yao Huang, Shang-Wen Li, Armen Aghajanyan, Gargi Ghosh

In recent times, training Language Models (LMs) has relied on computationally heavy training over massive datasets, which makes this training process extremely laborious.

Demystifying CLIP Data

2 code implementations • 28 Sep 2023 • Hu Xu, Saining Xie, Xiaoqing Ellen Tan, Po-Yao Huang, Russell Howes, Vasu Sharma, Shang-Wen Li, Gargi Ghosh, Luke Zettlemoyer, Christoph Feichtenhofer

We believe that the main ingredient to the success of CLIP is its data and not the model architecture or pre-training objective.

Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning

1 code implementation • 5 Sep 2023 • Lili Yu, Bowen Shi, Ramakanth Pasunuru, Benjamin Muller, Olga Golovneva, Tianlu Wang, Arun Babu, Binh Tang, Brian Karrer, Shelly Sheynin, Candace Ross, Adam Polyak, Russell Howes, Vasu Sharma, Puxin Xu, Hovhannes Tamoyan, Oron Ashual, Uriel Singer, Shang-Wen Li, Susan Zhang, Richard James, Gargi Ghosh, Yaniv Taigman, Maryam Fazel-Zarandi, Asli Celikyilmaz, Luke Zettlemoyer, Armen Aghajanyan

It is also a general-purpose model that can do both text-to-image and image-to-text generation, allowing us to introduce self-contained contrastive decoding methods that produce high-quality outputs.

Ranked #2 on Text-to-Image Generation on MS COCO

Decoder • Language Modelling • +3

LIMA: Less Is More for Alignment

5 code implementations • NeurIPS 2023 • Chunting Zhou, PengFei Liu, Puxin Xu, Srini Iyer, Jiao Sun, Yuning Mao, Xuezhe Ma, Avia Efrat, Ping Yu, Lili Yu, Susan Zhang, Gargi Ghosh, Mike Lewis, Luke Zettlemoyer, Omer Levy

Large language models are trained in two stages: (1) unsupervised pretraining from raw text, to learn general-purpose representations, and (2) large scale instruction tuning and reinforcement learning, to better align to end tasks and user preferences.

Language Modelling • reinforcement-learning

CiT: Curation in Training for Effective Vision-Language Data

1 code implementation • ICCV 2023 • Hu Xu, Saining Xie, Po-Yao Huang, Licheng Yu, Russell Howes, Gargi Ghosh, Luke Zettlemoyer, Christoph Feichtenhofer

Large vision-language models are generally applicable to many downstream tasks, but come at an exorbitant training cost that only large institutions can afford.

ALERT: Adapting Language Models to Reasoning Tasks

no code implementations • 16 Dec 2022 • Ping Yu, Tianlu Wang, Olga Golovneva, Badr Alkhamissy, Gargi Ghosh, Mona Diab, Asli Celikyilmaz

Current large language models can perform reasonably well on complex tasks that require step-by-step reasoning with few-shot learning.

Few-Shot Learning • Language Modelling • +1

MAViL: Masked Audio-Video Learners

1 code implementation • NeurIPS 2023 • Po-Yao Huang, Vasu Sharma, Hu Xu, Chaitanya Ryali, Haoqi Fan, Yanghao Li, Shang-Wen Li, Gargi Ghosh, Jitendra Malik, Christoph Feichtenhofer

We present Masked Audio-Video Learners (MAViL) to train audio-visual representations.

Contrastive Learning • Retrieval

CM3: A Causal Masked Multimodal Model of the Internet

no code implementations • 19 Jan 2022 • Armen Aghajanyan, Bernie Huang, Candace Ross, Vladimir Karpukhin, Hu Xu, Naman Goyal, Dmytro Okhonko, Mandar Joshi, Gargi Ghosh, Mike Lewis, Luke Zettlemoyer

We introduce CM3, a family of causally masked generative models trained over a large corpus of structured multi-modal documents that can contain both text and image tokens.

Entity Disambiguation • Entity Linking

VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding

2 code implementations • EMNLP 2021 • Hu Xu, Gargi Ghosh, Po-Yao Huang, Dmytro Okhonko, Armen Aghajanyan, Florian Metze, Luke Zettlemoyer, Christoph Feichtenhofer

We present VideoCLIP, a contrastive approach to pre-train a unified model for zero-shot video and text understanding, without using any labels on downstream tasks.

Ranked #1 on Temporal Action Localization on CrossTask (using extra training data)

Action Segmentation • Long Video Retrieval (Background Removed) • +4

HTLM: Hyper-Text Pre-Training and Prompting of Language Models

no code implementations • ICLR 2022 • Armen Aghajanyan, Dmytro Okhonko, Mike Lewis, Mandar Joshi, Hu Xu, Gargi Ghosh, Luke Zettlemoyer

We introduce HTLM, a hyper-text language model trained on a large-scale web crawl.

Ranked #1 on Table-to-Text Generation on DART

Data-to-Text Generation • Denoising • +2

VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding

1 code implementation • Findings (ACL) 2021 • Hu Xu, Gargi Ghosh, Po-Yao Huang, Prahal Arora, Masoumeh Aminzadeh, Christoph Feichtenhofer, Florian Metze, Luke Zettlemoyer

We present a simplified, task-agnostic multi-modal pre-training approach that can accept either video or text input, or both for a variety of end tasks.

Ranked #2 on Temporal Action Localization on CrossTask (using extra training data)

Action Segmentation • Language Modelling • +5

Multi-task Retrieval for Knowledge-Intensive Tasks

no code implementations • ACL 2021 • Jean Maillard, Vladimir Karpukhin, Fabio Petroni, Wen-tau Yih, Barlas Oğuz, Veselin Stoyanov, Gargi Ghosh

Retrieving relevant contexts from a large corpus is a crucial step for tasks such as open-domain question answering and fact checking.

Fact Checking • Open-Domain Question Answering • +1

Pre-training via Paraphrasing

2 code implementations • NeurIPS 2020 • Mike Lewis, Marjan Ghazvininejad, Gargi Ghosh, Armen Aghajanyan, Sida Wang, Luke Zettlemoyer

The objective noisily captures aspects of paraphrase, translation, multi-document summarization, and information retrieval, allowing for strong zero-shot performance on several tasks.

Document Summarization • Document Translation • +6

Exploring Deep Multimodal Fusion of Text and Photo for Hate Speech Classification

no code implementations • WS 2019 • Fan Yang, Xiaochang Peng, Gargi Ghosh, Reshef Shilon, Hao Ma, Eider Moore, Goran Predovic

Interactions among users on social network platforms are usually positive, constructive and insightful.

General Classification

Optimizing Query Evaluations using Reinforcement Learning for Web Search

no code implementations • 12 Apr 2018 • Corby Rosset, Damien Jose, Gargi Ghosh, Bhaskar Mitra, Saurabh Tiwary

In web search, typically a candidate generation step selects a small set of documents---from collections containing as many as billions of web pages---that are subsequently ranked and pruned before being presented to the user.

reinforcement-learning • Reinforcement Learning (RL)
