no code implementations • 2 May 2024 • Maxwell Jones, Sheng-Yu Wang, Nupur Kumari, David Bau, Jun-Yan Zhu
Both qualitative and quantitative experiments show that our method can effectively learn style while avoiding overfitting to image content, highlighting the potential of modeling such stylistic differences from a single image pair.
1 code implementation • 4 Apr 2024 • Arnab Sen Sharma, David Atkinson, David Bau
We investigate the mechanisms of factual recall in the Mamba state space model.
1 code implementation • 28 Mar 2024 • Samuel Marks, Can Rager, Eric J. Michaud, Yonatan Belinkov, David Bau, Aaron Mueller
We introduce methods for discovering and applying sparse feature circuits.
no code implementations • 4 Mar 2024 • Koyena Pal, David Bau, Renée J. Miller
We discuss which principled data management techniques can be brought to bear on the study of large model management.
no code implementations • 22 Feb 2024 • Nikhil Prakash, Tamar Rott Shaham, Tal Haklay, Yonatan Belinkov, David Bau
We identify the mechanism that enables entity tracking and show that the original model and its fine-tuned versions implement entity tracking with primarily the same circuit.
1 code implementation • 13 Feb 2024 • Kenneth Li, Tianle Liu, Naomi Bashkansky, David Bau, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg
System-prompting is a standard tool for customizing language-model chatbots, enabling them to follow a specific instruction.
no code implementations • 25 Jan 2024 • Stephen Casper, Carson Ezell, Charlotte Siegmann, Noam Kolt, Taylor Lynn Curtis, Benjamin Bucknall, Andreas Haupt, Kevin Wei, Jérémy Scheurer, Marius Hobbhahn, Lee Sharkey, Satyapriya Krishna, Marvin Von Hagen, Silas Alberti, Alan Chan, Qinyi Sun, Michael Gerovitch, David Bau, Max Tegmark, David Krueger, Dylan Hadfield-Menell
The effectiveness of an audit, however, depends on the degree of system access granted to auditors.
1 code implementation • 20 Nov 2023 • Rohit Gandikota, Joanna Materzynska, Tingrui Zhou, Antonio Torralba, David Bau
We present a method to create interpretable concept sliders that enable precise control over attributes in image generations from diffusion models.
no code implementations • 17 Nov 2023 • Silen Naihin, David Atkinson, Marc Green, Merwane Hamadi, Craig Swift, Douglas Schonholtz, Adam Tauman Kalai, David Bau
A prerequisite for safe autonomy-in-the-wild is safe testing-in-the-wild.
no code implementations • 8 Nov 2023 • Koyena Pal, Jiuding Sun, Andrew Yuan, Byron C. Wallace, David Bau
More concretely, in this paper we ask: Given a hidden (internal) representation of a single token at position $t$ in an input, can we reliably anticipate the tokens that will appear at positions $\geq t + 2$?
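One way to test this is to fit a linear probe from the hidden state at position $t$ to the token that later appears at $t+2$. The toy sketch below (synthetic data, illustrative names only; the paper also explores learned soft prompts) fits such a probe by least squares:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, vocab, n = 16, 8, 500

# Toy stand-ins: hidden states at position t, and the token that
# (by construction) appears two positions later.
H = rng.normal(size=(n, d_model))
true_W = rng.normal(size=(d_model, vocab))
future_tokens = (H @ true_W).argmax(axis=1)

# Fit a linear probe from h_t to a one-hot target for the token at t+2.
Y = np.eye(vocab)[future_tokens]
W, *_ = np.linalg.lstsq(H, Y, rcond=None)

pred = (H @ W).argmax(axis=1)
accuracy = (pred == future_tokens).mean()
```

If the probe recovers future tokens well above chance, the hidden state at $t$ carries anticipatory information about positions $\geq t + 2$.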
no code implementations • 23 Oct 2023 • Eric Todd, Millicent L. Li, Arnab Sen Sharma, Aaron Mueller, Byron C. Wallace, David Bau
Using causal mediation analysis on a diverse range of in-context-learning (ICL) tasks, we find that a small number of attention heads transport a compact representation of the demonstrated task, which we call a function vector (FV).
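The core mechanics can be sketched in a few lines: average the task-conditioned outputs of the causally important heads, sum them into a single vector, and add that vector to a hidden state at inference time. All arrays below are synthetic stand-ins, not model internals:

```python
import numpy as np

rng = np.random.default_rng(1)
d_model, n_heads, n_prompts = 12, 4, 20

# Toy stand-in: per-prompt output of each attention head at the final
# token of an ICL prompt for one task (real values come from the model).
head_out = rng.normal(size=(n_prompts, n_heads, d_model))

# Mean task-conditioned output of each head.
mean_out = head_out.mean(axis=0)

# Suppose causal mediation identified heads 0 and 2 as causal for
# the task; the function vector is the sum of their mean outputs.
causal_heads = [0, 2]
function_vector = mean_out[causal_heads].sum(axis=0)

# At inference, adding the FV to a hidden state at one layer can
# trigger the demonstrated task without any in-context examples.
h = rng.normal(size=d_model)
h_patched = h + function_vector
```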
1 code implementation • NeurIPS 2023 • Sarah Schwettmann, Tamar Rott Shaham, Joanna Materzynska, Neil Chowdhury, Shuang Li, Jacob Andreas, David Bau, Antonio Torralba
FIND contains functions that resemble components of trained neural networks, and accompanying descriptions of the kind we seek to generate.
1 code implementation • 25 Aug 2023 • Rohit Gandikota, Hadas Orgad, Yonatan Belinkov, Joanna Materzyńska, David Bau
Text-to-image models suffer from various safety issues that may limit their suitability for deployment.
1 code implementation • 17 Aug 2023 • Evan Hernandez, Arnab Sen Sharma, Tal Haklay, Kevin Meng, Martin Wattenberg, Jacob Andreas, Yonatan Belinkov, David Bau
Linear relation representations may be obtained by constructing a first-order approximation to the LM from a single prompt, and they exist for a variety of factual, commonsense, and linguistic relations.
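The first-order approximation can be illustrated with a toy nonlinear map standing in for the LM's subject-to-object computation: take the Jacobian at one reference subject and use $F(s) \approx W s + b$ nearby. The function `F` below is illustrative, not the model itself:

```python
import numpy as np

rng = np.random.default_rng(2)
d = 8
A = rng.normal(size=(d, d)) * 0.3

def F(s):
    # Toy stand-in for the LM's mapping from a subject
    # representation to an object representation.
    return np.tanh(A @ s)

s0 = rng.normal(size=d)  # reference subject from a single prompt

# Jacobian of F at s0 (for tanh: diag(1 - tanh^2) @ A), giving the
# linear relation F(s) ~ J s + b in a neighborhood of s0.
J = np.diag(1 - np.tanh(A @ s0) ** 2) @ A
b = F(s0) - J @ s0

# The linear approximation tracks F closely for nearby subjects.
s1 = s0 + 0.01 * rng.normal(size=d)
err = np.linalg.norm(F(s1) - (J @ s1 + b))
```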
no code implementations • 3 Aug 2023 • Sarah Schwettmann, Neil Chowdhury, Samuel Klein, David Bau, Antonio Torralba
Language models demonstrate remarkable capacity to generalize representations learned in one modality to downstream tasks in other modalities.
no code implementations • 7 Jul 2023 • Xander Davies, Max Nadeau, Nikhil Prakash, Tamar Rott Shaham, David Bau
Recent work has shown that computation in language models may be human-understandable, with successful efforts to localize and intervene on both single-unit features and input-output circuits.
2 code implementations • ICCV 2023 • Rohit Gandikota, Joanna Materzynska, Jaden Fiotto-Kaufman, David Bau
We propose a fine-tuning method that can erase a visual concept from a pre-trained diffusion model, given only the name of the style and using negative guidance as a teacher.
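Negative guidance as a teacher amounts to a simple training target: steer the noise prediction away from the concept-conditioned direction. A minimal numpy sketch of the target computation (the `eps_*` arrays are placeholders for a frozen teacher's noise predictions):

```python
import numpy as np

rng = np.random.default_rng(3)
shape = (4, 4)
eta = 1.0  # negative-guidance strength

# Toy stand-ins for the frozen teacher's noise predictions,
# unconditioned and conditioned on the concept to erase.
eps_uncond = rng.normal(size=shape)
eps_concept = rng.normal(size=shape)

# Negative guidance: move away from the concept direction. The
# fine-tuned model is trained to output this target whenever the
# erased concept name appears in the prompt.
eps_target = eps_uncond - eta * (eps_concept - eps_uncond)
```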
1 code implementation • 24 Oct 2022 • Kenneth Li, Aspen K. Hopkins, David Bau, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg
Language models show a surprising range of capabilities, but the source of their apparent competence is unclear.
2 code implementations • 13 Oct 2022 • Kevin Meng, Arnab Sen Sharma, Alex Andonian, Yonatan Belinkov, David Bau
Recent work has shown exciting promise in updating large language models with new memories, so as to replace obsolete information or add specialized knowledge.
1 code implementation • 6 Oct 2022 • Daohan Lu, Sheng-Yu Wang, Nupur Kumari, Rohan Agarwal, Mia Tang, David Bau, Jun-Yan Zhu
To address this need, we introduce the task of content-based model search: given a query and a large set of generative models, find the models that best match the query.
Ranked #1 on Model Description Based Search on Generative Models
1 code implementation • 28 Jul 2022 • Sheng-Yu Wang, David Bau, Jun-Yan Zhu
Our method allows a user to create a model that synthesizes endless objects with defined geometric changes, enabling the creation of a new generative model without the burden of curating a large-scale dataset.
2 code implementations • 6 Jul 2022 • Audrey Cui, Ali Jahanian, Agata Lapedriza, Antonio Torralba, Shahin Mahdizadehaghdam, Rohit Kumar, David Bau
We introduce the task of local relighting, which changes a photograph of a scene by switching on and off the light sources that are visible within the image.
no code implementations • CVPR 2022 • Joanna Materzynska, Antonio Torralba, David Bau
The CLIP network measures the similarity between natural text and images; in this work, we investigate the entanglement of the representation of word images and natural images in its image encoder.
2 code implementations • 10 Feb 2022 • Kevin Meng, David Bau, Alex Andonian, Yonatan Belinkov
To test our hypothesis that these computations correspond to factual association recall, we modify feed-forward weights to update specific factual associations using Rank-One Model Editing (ROME).
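At its core, a ROME-style edit is a rank-one weight update that forces a feed-forward layer to map a chosen key (subject representation) to a chosen value (new fact). The sketch below shows the minimal-norm version; the full method additionally preconditions with covariance statistics of the keys:

```python
import numpy as np

rng = np.random.default_rng(4)
d_in, d_out = 6, 5
W = rng.normal(size=(d_out, d_in))

k = rng.normal(size=d_in)   # key: representation of the edited subject
v = rng.normal(size=d_out)  # value: representation of the new fact

# Minimal rank-one edit enforcing W_new @ k == v, leaving W unchanged
# on directions orthogonal to k. (ROME solves a preconditioned variant.)
W_new = W + np.outer(v - W @ k, k) / (k @ k)
```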
2 code implementations • 26 Jan 2022 • Evan Hernandez, Sarah Schwettmann, David Bau, Teona Bagashvili, Antonio Torralba, Jacob Andreas
Given a neuron, MILAN generates a description by searching for a natural language string that maximizes pointwise mutual information with the image regions in which the neuron is active.
1 code implementation • NeurIPS 2021 • Shibani Santurkar, Dimitris Tsipras, Mahalaxmi Elango, David Bau, Antonio Torralba, Aleksander Madry
We present a methodology for modifying the behavior of a classifier by directly rewriting its prediction rules.
1 code implementation • ICCV 2021 • Sarah Schwettmann, Evan Hernandez, David Bau, Samuel Klein, Jacob Andreas, Antonio Torralba
A large body of recent work has identified transformations in the latent spaces of generative adversarial networks (GANs) that consistently and interpretably transform generated images.
1 code implementation • ICCV 2021 • Sheng-Yu Wang, David Bau, Jun-Yan Zhu
In particular, we change the weights of an original GAN model according to user sketches.
no code implementations • 19 Mar 2021 • Alex Andonian, Sabrina Osmany, Audrey Cui, YeonHwan Park, Ali Jahanian, Antonio Torralba, David Bau
We investigate the problem of zero-shot semantic image painting.
2 code implementations • 10 Sep 2020 • David Bau, Jun-Yan Zhu, Hendrik Strobelt, Agata Lapedriza, Bolei Zhou, Antonio Torralba
Second, we use a similar analytic method to analyze a generative adversarial network (GAN) model trained to generate scenes.
1 code implementation • ECCV 2020 • Lucy Chai, David Bau, Ser-Nam Lim, Phillip Isola
The quality of image generation and manipulation is reaching impressive levels, making it increasingly difficult for a human to distinguish between what is real and what is fake.
3 code implementations • ECCV 2020 • David Bau, Steven Liu, Tongzhou Wang, Jun-Yan Zhu, Antonio Torralba
To address the problem, we propose a formulation in which the desired rule is changed by manipulating a layer of a deep network as a linear associative memory.
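Viewing a layer as a linear associative memory means treating its weight matrix as a store of key-value pairs, $WK \approx V$; rewriting a rule then amounts to changing the value retrieved for one key and re-solving. A toy numpy sketch with random keys and values:

```python
import numpy as np

rng = np.random.default_rng(5)
d_k, d_v, n = 8, 6, 4

K = rng.normal(size=(d_k, n))  # stored keys (one per column)
V = rng.normal(size=(d_v, n))  # stored values

# A layer acting as a linear associative memory: solve W K = V.
W = V @ np.linalg.pinv(K)

# Rewriting a rule: change the value retrieved for key 0 while
# keeping the other associations, then re-solve for the weights.
V_edit = V.copy()
V_edit[:, 0] = rng.normal(size=d_v)
W_edit = V_edit @ np.linalg.pinv(K)
```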
2 code implementations • CVPR 2020 • Steven Liu, Tongzhou Wang, David Bau, Jun-Yan Zhu, Antonio Torralba
We introduce a simple but effective unsupervised method for generating realistic and diverse images.
1 code implementation • 15 May 2020 • David Bau, Hendrik Strobelt, William Peebles, Jonas Wulff, Bolei Zhou, Jun-Yan Zhu, Antonio Torralba
First, it is hard for GANs to precisely reproduce an input image.
1 code implementation • ICCV 2019 • David Bau, Jun-Yan Zhu, Jonas Wulff, William Peebles, Hendrik Strobelt, Bolei Zhou, Antonio Torralba
Differences in statistics reveal object classes that are omitted by a GAN.
no code implementations • 29 Jun 2019 • Jonathan Frankle, David Bau
Namely, we consider the effect of removing unnecessary structure on the number of hidden units that learn disentangled representations of human-recognizable concepts as identified by network dissection.
no code implementations • ICLR Workshop DeepGenStruct 2019 • David Bau, Jun-Yan Zhu, Hendrik Strobelt, Bolei Zhou, Joshua B. Tenenbaum, William T. Freeman, Antonio Torralba
We present an analytic framework to visualize and understand GANs at the unit-, object-, and scene-level.
8 code implementations • ICLR 2019 • David Bau, Jun-Yan Zhu, Hendrik Strobelt, Bolei Zhou, Joshua B. Tenenbaum, William T. Freeman, Antonio Torralba
Then, we quantify the causal effect of interpretable units by measuring the ability of interventions to control objects in the output.
1 code implementation • ECCV 2018 • Bolei Zhou, Yiyou Sun, David Bau, Antonio Torralba
Explanations of the decisions made by a deep neural network are important for human end-users to be able to understand and diagnose the trustworthiness of the system.
no code implementations • 7 Jun 2018 • Bolei Zhou, Yiyou Sun, David Bau, Antonio Torralba
We confirm that unit attributes such as class selectivity are poor predictors of impact on overall accuracy, as found in recent work (Morcos et al., 2018).
1 code implementation • 31 May 2018 • Leilani H. Gilpin, David Bau, Ben Z. Yuan, Ayesha Bajwa, Michael Specter, Lalana Kagal
There has recently been a surge of work in explanatory artificial intelligence (XAI).
2 code implementations • 15 Nov 2017 • Bolei Zhou, David Bau, Aude Oliva, Antonio Torralba
In this work, we describe Network Dissection, a method that interprets networks by providing labels for the units of their deep visual representations.
1 code implementation • CVPR 2017 • David Bau, Bolei Zhou, Aditya Khosla, Aude Oliva, Antonio Torralba
Given any CNN model, the proposed method draws on a broad data set of visual concepts to score the semantics of hidden units at each intermediate convolutional layer.
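The scoring step reduces to an intersection-over-union between a unit's thresholded activation mask and a concept's segmentation mask. A toy sketch on random arrays (Network Dissection thresholds at a top-quantile computed over the whole dataset; a per-image quantile is used here for brevity):

```python
import numpy as np

rng = np.random.default_rng(6)

# Toy stand-ins: one unit's activation map over an image,
# and a binary segmentation mask for one visual concept.
activation = rng.random((16, 16))
concept_mask = rng.random((16, 16)) > 0.7

# Threshold the activations at a high quantile to get the
# unit's binary activation mask.
thresh = np.quantile(activation, 0.95)
unit_mask = activation > thresh

# IoU between the two masks scores how well the unit
# detects the concept.
inter = np.logical_and(unit_mask, concept_mask).sum()
union = np.logical_or(unit_mask, concept_mask).sum()
iou = inter / union
```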