Search Results for author: Eric Wallace

Found 40 papers, 26 papers with code

Train Big, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers

no code implementations • ICML 2020 • Zhuohan Li, Eric Wallace, Sheng Shen, Kevin Lin, Kurt Keutzer, Dan Klein, Joseph Gonzalez

Since hardware resources are limited, the objective of training deep learning models is typically to maximize accuracy subject to the time and memory constraints of training and inference.

Machine Translation Quantization +1

Paper
Add Code

The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions

no code implementations • 19 Apr 2024 • Eric Wallace, Kai Xiao, Reimar Leike, Lilian Weng, Johannes Heidecke, Alex Beutel

Today's LLMs are susceptible to prompt injections, jailbreaks, and other attacks that allow adversaries to overwrite a model's original instructions with their own malicious prompts.

Instruction Following

Paper
Add Code

Unfamiliar Finetuning Examples Control How Language Models Hallucinate

no code implementations • 8 Mar 2024 • Katie Kang, Eric Wallace, Claire Tomlin, Aviral Kumar, Sergey Levine

Large language models (LLMs) have a tendency to generate plausible-sounding yet factually incorrect responses, especially when queried on unfamiliar concepts.

Multiple-choice

Paper
Add Code

What Evidence Do Language Models Find Convincing?

1 code implementation • 19 Feb 2024 • Alexander Wan, Eric Wallace, Dan Klein

Retrieval-augmented language models are being increasingly tasked with subjective, contentious, and conflicting queries such as "is aspartame linked to cancer".

counterfactual Misinformation

Paper
Code

Scalable Extraction of Training Data from (Production) Language Models

no code implementations • 28 Nov 2023 • Milad Nasr, Nicholas Carlini, Jonathan Hayase, Matthew Jagielski, A. Feder Cooper, Daphne Ippolito, Christopher A. Choquette-Choo, Eric Wallace, Florian Tramèr, Katherine Lee

This paper studies extractable memorization: training data that an adversary can efficiently extract by querying a machine learning model without prior knowledge of the training dataset.

Chatbot Memorization

Paper
Add Code

Privacy Side Channels in Machine Learning Systems

no code implementations • 11 Sep 2023 • Edoardo Debenedetti, Giorgio Severi, Nicholas Carlini, Christopher A. Choquette-Choo, Matthew Jagielski, Milad Nasr, Eric Wallace, Florian Tramèr

Most current approaches for protecting privacy in machine learning (ML) assume that models exist in a vacuum, when in reality, ML models are part of larger systems that include components for training data filtering, output monitoring, and more.

Paper
Add Code

SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore

1 code implementation • 8 Aug 2023 • Sewon Min, Suchin Gururangan, Eric Wallace, Hannaneh Hajishirzi, Noah A. Smith, Luke Zettlemoyer

SILO is built by (1) training a parametric LM on Open License Corpus (OLC), a new corpus we curate with 228B tokens of public domain and permissively licensed text and (2) augmenting it with a more general and easily modifiable nonparametric datastore (e. g., containing copyrighted books or news) that is only queried during inference.

Language Modelling Sentence

Paper
Code

The False Promise of Imitating Proprietary LLMs

1 code implementation • 25 May 2023 • Arnav Gudibande, Eric Wallace, Charlie Snell, Xinyang Geng, Hao liu, Pieter Abbeel, Sergey Levine, Dawn Song

This approach looks to cheaply imitate the proprietary model's capabilities using a weaker open-source model.

Language Modelling

1,089

Paper
Code

Poisoning Language Models During Instruction Tuning

1 code implementation • 1 May 2023 • Alexander Wan, Eric Wallace, Sheng Shen, Dan Klein

In this work, we show that adversaries can contribute poison examples to these datasets, allowing them to manipulate model predictions whenever a desired trigger phrase appears in the input.

Paper
Code

Extracting Training Data from Diffusion Models

1 code implementation • 30 Jan 2023 • Nicholas Carlini, Jamie Hayes, Milad Nasr, Matthew Jagielski, Vikash Sehwag, Florian Tramèr, Borja Balle, Daphne Ippolito, Eric Wallace

Image diffusion models such as DALL-E 2, Imagen, and Stable Diffusion have attracted significant attention due to their ability to generate high-quality synthetic images.

Privacy Preserving

Paper
Code

Large Language Models Struggle to Learn Long-Tail Knowledge

1 code implementation • 15 Nov 2022 • Nikhil Kandpal, Haikang Deng, Adam Roberts, Eric Wallace, Colin Raffel

The Internet contains a wealth of knowledge -- from the birthdays of historical figures to tutorials on how to code -- all of which may be learned by language models.

Entity Linking Question Answering +2

Paper
Code

Measuring Forgetting of Memorized Training Examples

no code implementations • 30 Jun 2022 • Matthew Jagielski, Om Thakkar, Florian Tramèr, Daphne Ippolito, Katherine Lee, Nicholas Carlini, Eric Wallace, Shuang Song, Abhradeep Thakurta, Nicolas Papernot, Chiyuan Zhang

In memorization, models overfit specific training examples and become susceptible to privacy attacks.

Memorization

Paper
Add Code

Automated Crossword Solving

1 code implementation • ACL 2022 • Eric Wallace, Nicholas Tomlin, Albert Xu, Kevin Yang, Eshaan Pathak, Matthew Ginsberg, Dan Klein

We present the Berkeley Crossword Solver, a state-of-the-art approach for automatically solving crossword puzzles.

Question Answering

121

Paper
Code

InCoder: A Generative Model for Code Infilling and Synthesis

3 code implementations • 12 Apr 2022 • Daniel Fried, Armen Aghajanyan, Jessy Lin, Sida Wang, Eric Wallace, Freda Shi, Ruiqi Zhong, Wen-tau Yih, Luke Zettlemoyer, Mike Lewis

Our model is the first generative model that is able to directly perform zero-shot code infilling, which we evaluate on challenging tasks such as type inference, comment generation, and variable re-naming.

Ranked #85 on Code Generation on MBPP

Code Generation Comment Generation +1

290

Paper
Code

Deduplicating Training Data Mitigates Privacy Risks in Language Models

3 code implementations • 14 Feb 2022 • Nikhil Kandpal, Eric Wallace, Colin Raffel

Past work has shown that large language models are susceptible to privacy attacks, where adversaries generate sequences from a trained model and detect which sequences are memorized from the training set.

244

Paper
Code

Analyzing Dynamic Adversarial Training Data in the Limit

1 code implementation • Findings (ACL) 2022 • Eric Wallace, Adina Williams, Robin Jia, Douwe Kiela

To create models that are robust across a wide range of test inputs, training datasets should include diverse examples that span numerous phenomena.

Paper
Code

Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models

2 code implementations • Findings (ACL) 2022 • Robert L. Logan IV, Ivana Balažević, Eric Wallace, Fabio Petroni, Sameer Singh, Sebastian Riedel

Prompting language models (LMs) with training examples and task descriptions has been seen as critical to recent successes in few-shot learning.

Few-Shot Learning Prompt Engineering

107

Paper
Code

Detoxifying Language Models Risks Marginalizing Minority Voices

1 code implementation • NAACL 2021 • Albert Xu, Eshaan Pathak, Eric Wallace, Suchin Gururangan, Maarten Sap, Dan Klein

Language models (LMs) must be both safe and equitable to be responsibly deployed in practice.

Text Generation

Paper
Code

Calibrate Before Use: Improving Few-Shot Performance of Language Models

5 code implementations • 19 Feb 2021 • Tony Z. Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh

We show that this type of few-shot learning can be unstable: the choice of prompt format, training examples, and even the order of the training examples can cause accuracy to vary from near chance to near state-of-the-art.

Few-Shot Learning

327

Paper
Code

Extracting Training Data from Large Language Models

3 code implementations • 14 Dec 2020 • Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Ulfar Erlingsson, Alina Oprea, Colin Raffel

We demonstrate our attack on GPT-2, a language model trained on scrapes of the public Internet, and are able to extract hundreds of verbatim text sequences from the model's training data.

Language Modelling

155

Paper
Code

Interpreting Predictions of NLP Models

no code implementations • EMNLP 2020 • Eric Wallace, Matt Gardner, Sameer Singh

Although neural NLP models are highly expressive and empirically successful, they also systematically fail in counterintuitive ways and are opaque in their decision-making process.

Decision Making

Paper
Add Code

AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts

3 code implementations • EMNLP 2020 • Taylor Shin, Yasaman Razeghi, Robert L. Logan IV, Eric Wallace, Sameer Singh

The remarkable success of pretrained language models has motivated the study of what kinds of knowledge these models learn during pretraining.

Natural Language Inference Relation +2

555

Paper
Code

Concealed Data Poisoning Attacks on NLP Models

no code implementations • NAACL 2021 • Eric Wallace, Tony Z. Zhao, Shi Feng, Sameer Singh

In this work, we develop a new data poisoning attack that allows an adversary to control model predictions whenever a desired trigger phrase is present in the input.

Data Poisoning Language Modelling +2

Paper
Add Code

Gradient-based Analysis of NLP Models is Manipulable

no code implementations • Findings of the Association for Computational Linguistics 2020 • Junlin Wang, Jens Tuyls, Eric Wallace, Sameer Singh

Gradient-based analysis methods, such as saliency map visualizations and adversarial input perturbations, have found widespread use in interpreting neural NLP models due to their simplicity, flexibility, and most importantly, their faithfulness.

text-classification Text Classification

Paper
Add Code

Evaluating NLP Models via Contrast Sets

no code implementations • 1 Oct 2020 • Matt Gardner, Yoav Artzi, Victoria Basmova, Jonathan Berant, Ben Bogin, Sihao Chen, Pradeep Dasigi, Dheeru Dua, Yanai Elazar, Ananth Gottumukkala, Nitish Gupta, Hanna Hajishirzi, Gabriel Ilharco, Daniel Khashabi, Kevin Lin, Jiangming Liu, Nelson F. Liu, Phoebe Mulcaire, Qiang Ning, Sameer Singh, Noah A. Smith, Sanjay Subramanian, Reut Tsarfaty, Eric Wallace, A. Zhang, Ben Zhou

Unfortunately, when a dataset has systematic gaps (e. g., annotation artifacts), these evaluations are misleading: a model can learn simple decision rules that perform well on the test set but do not capture a dataset's intended capabilities.

Reading Comprehension Sentiment Analysis

Paper
Add Code

Trustworthy AI Inference Systems: An Industry Research View

no code implementations • 10 Aug 2020 • Rosario Cammarota, Matthias Schunter, Anand Rajan, Fabian Boemer, Ágnes Kiss, Amos Treiber, Christian Weinert, Thomas Schneider, Emmanuel Stapf, Ahmad-Reza Sadeghi, Daniel Demmler, Joshua Stock, Huili Chen, Siam Umar Hussain, Sadegh Riazi, Farinaz Koushanfar, Saransh Gupta, Tajan Simunic Rosing, Kamalika Chaudhuri, Hamid Nejatollahi, Nikil Dutt, Mohsen Imani, Kim Laine, Anuj Dubey, Aydin Aysu, Fateme Sadat Hosseini, Chengmo Yang, Eric Wallace, Pamela Norton

Additionally, such systems should also use Privacy-Enhancing Technologies (PETs) to protect customers' data at any time.

Paper
Add Code

Imitation Attacks and Defenses for Black-box Machine Translation Systems

1 code implementation • EMNLP 2020 • Eric Wallace, Mitchell Stern, Dawn Song

To mitigate these vulnerabilities, we propose a defense that modifies translation outputs in order to misdirect the optimization of imitation models.

Machine Translation Translation

Paper
Code

Pretrained Transformers Improve Out-of-Distribution Robustness

1 code implementation • ACL 2020 • Dan Hendrycks, Xiaoyuan Liu, Eric Wallace, Adam Dziedzic, Rishabh Krishnan, Dawn Song

Although pretrained Transformers such as BERT achieve high accuracy on in-distribution examples, do they generalize to new distributions?

Paper
Code

Evaluating Models' Local Decision Boundaries via Contrast Sets

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Matt Gardner, Yoav Artzi, Victoria Basmova, Jonathan Berant, Ben Bogin, Sihao Chen, Pradeep Dasigi, Dheeru Dua, Yanai Elazar, Ananth Gottumukkala, Nitish Gupta, Hanna Hajishirzi, Gabriel Ilharco, Daniel Khashabi, Kevin Lin, Jiangming Liu, Nelson F. Liu, Phoebe Mulcaire, Qiang Ning, Sameer Singh, Noah A. Smith, Sanjay Subramanian, Reut Tsarfaty, Eric Wallace, Ally Zhang, Ben Zhou

Reading Comprehension Sentiment Analysis

Paper
Code

Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers

2 code implementations • 26 Feb 2020 • Zhuohan Li, Eric Wallace, Sheng Shen, Kevin Lin, Kurt Keutzer, Dan Klein, Joseph E. Gonzalez

Since hardware resources are limited, the objective of training deep learning models is typically to maximize accuracy subject to the time and memory constraints of training and inference.

Machine Translation Quantization +1

378

Paper
Code

AllenNLP Interpret: A Framework for Explaining Predictions of NLP Models

1 code implementation • IJCNLP 2019 • Eric Wallace, Jens Tuyls, Junlin Wang, Sanjay Subramanian, Matt Gardner, Sameer Singh

Neural NLP models are increasingly accurate but are imperfect and opaque---they break in counterintuitive ways and leave end users puzzled at their behavior.

Language Modelling Masked Language Modeling +1

Paper
Code

Do NLP Models Know Numbers? Probing Numeracy in Embeddings

1 code implementation • IJCNLP 2019 • Eric Wallace, Yizhong Wang, Sujian Li, Sameer Singh, Matt Gardner

The ability to understand and work with numbers (numeracy) is critical for many complex reasoning tasks.

Question Answering

Paper
Code

Universal Adversarial Triggers for Attacking and Analyzing NLP

1 code implementation • IJCNLP 2019 • Eric Wallace, Shi Feng, Nikhil Kandpal, Matt Gardner, Sameer Singh

We define universal adversarial triggers: input-agnostic sequences of tokens that trigger a model to produce a specific prediction when concatenated to any input from a dataset.

Language Modelling Reading Comprehension

286

Paper
Code

Compositional Questions Do Not Necessitate Multi-hop Reasoning

1 code implementation • ACL 2019 • Sewon Min, Eric Wallace, Sameer Singh, Matt Gardner, Hannaneh Hajishirzi, Luke Zettlemoyer

Multi-hop reading comprehension (RC) questions are challenging because they require reading and reasoning over multiple paragraphs.

Information Retrieval Multi-Hop Reading Comprehension +1

Paper
Code

Misleading Failures of Partial-input Baselines

no code implementations • ACL 2019 • Shi Feng, Eric Wallace, Jordan Boyd-Graber

Recent work establishes dataset difficulty and removes annotation artifacts via partial-input baselines (e. g., hypothesis-only models for SNLI or question-only models for VQA).

Natural Language Inference Visual Question Answering (VQA)

Paper
Add Code

Understanding Impacts of High-Order Loss Approximations and Features in Deep Learning Interpretation

1 code implementation • 1 Feb 2019 • Sahil Singla, Eric Wallace, Shi Feng, Soheil Feizi

Second, we compute the importance of group-features in deep learning interpretation by introducing a sparsity regularization term.

Feature Importance General Classification

Paper
Code

Interpreting Neural Networks With Nearest Neighbors

1 code implementation • WS 2018 • Eric Wallace, Shi Feng, Jordan Boyd-Graber

However, the confidence of neural networks is not a robust measure of model uncertainty.

Feature Importance General Classification +2

Paper
Code

Trick Me If You Can: Human-in-the-loop Generation of Adversarial Examples for Question Answering

1 code implementation • TACL 2019 • Eric Wallace, Pedro Rodriguez, Shi Feng, Ikuya Yamada, Jordan Boyd-Graber

We propose human-in-the-loop adversarial generation, where human authors are guided to break models.

Information Retrieval Question Answering +1

Paper
Code

Trick Me If You Can: Adversarial Writing of Trivia Challenge Questions

no code implementations • ACL 2018 • Eric Wallace, Jordan Boyd-Graber

Modern question answering systems have been touted as approaching human performance.

Question Answering

Paper
Add Code

Pathologies of Neural Models Make Interpretations Difficult

no code implementations • EMNLP 2018 • Shi Feng, Eric Wallace, Alvin Grissom II, Mohit Iyyer, Pedro Rodriguez, Jordan Boyd-Graber

In existing interpretation methods for NLP, a word's importance is determined by either input perturbation---measuring the decrease in model confidence when that word is removed---or by the gradient with respect to that word.

Sentence

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.