Search Results for author: Andreas Madsen

Found 7 papers, 5 papers with code

Interpretability Needs a New Paradigm

no code implementations · 8 May 2024 · Andreas Madsen, Himabindu Lakkaraju, Siva Reddy, Sarath Chandar

At present, interpretability is divided into two paradigms: the intrinsic paradigm, which believes that only models designed to be explained can be explained, and the post-hoc paradigm, which believes that black-box models can be explained.

Are self-explanations from Large Language Models faithful?

1 code implementation · 15 Jan 2024 · Andreas Madsen, Sarath Chandar, Siva Reddy

For example, if an LLM says a set of words is important for making a prediction, then it should not be able to make its prediction without these words.
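A minimal sketch of that counterfactual test, where `classify` and `important_words` are hypothetical placeholders for the model's prediction function and the words its self-explanation cites (not the paper's actual interface):

```python
# Counterfactual faithfulness check: if the words an LLM claims are important
# truly are, removing them should change the prediction.

def redact(text: str, words: set[str], mask: str = "[REDACTED]") -> str:
    """Replace each word the self-explanation marks as important with a neutral mask."""
    return " ".join(mask if w in words else w for w in text.split())

def is_explanation_faithful(classify, text: str, important_words: set[str]) -> bool:
    """The self-explanation passes only if removing the cited words changes the prediction."""
    original = classify(text)
    counterfactual = classify(redact(text, important_words))
    return original != counterfactual

# Toy usage with a keyword classifier standing in for the LLM:
toy_classify = lambda t: "positive" if "amazing" in t else "negative"
print(is_explanation_faithful(toy_classify, "the movie was amazing", {"amazing"}))  # True
```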

Tasks: counterfactual, Faithfulness, Critic, +4 more

Faithfulness Measurable Masked Language Models

1 code implementation · 11 Oct 2023 · Andreas Madsen, Siva Reddy, Sarath Chandar

Additionally, because the model makes faithfulness cheap to measure, we can optimize explanations towards maximal faithfulness; thus, our model becomes indirectly inherently explainable.
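A rough sketch of the masking-based faithfulness measurement this describes, assuming a hypothetical `predict_prob(tokens, label)` that returns the model's probability for a class, and assuming masked inputs remain in-distribution (the property the paper's model is built to provide):

```python
# Faithfulness as the drop in predicted probability when masking the tokens an
# explanation ranks highest; explanations can then be selected for maximal drop.

def mask_top_tokens(tokens, importance, k, mask_token="[MASK]"):
    """Mask the k tokens the importance scores rank as most important."""
    top = set(sorted(range(len(tokens)), key=lambda i: importance[i], reverse=True)[:k])
    return [mask_token if i in top else tok for i, tok in enumerate(tokens)]

def faithfulness(predict_prob, tokens, importance, label, k):
    """Score an explanation by how much masking its top tokens lowers the probability."""
    full = predict_prob(tokens, label)
    masked = predict_prob(mask_top_tokens(tokens, importance, k), label)
    return full - masked

def most_faithful(predict_prob, tokens, candidates, label, k):
    """Among candidate importance vectors, pick the one with the largest drop."""
    return max(candidates, key=lambda imp: faithfulness(predict_prob, tokens, imp, label, k))
```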

Post-hoc Interpretability for Neural NLP: A Survey

no code implementations · 10 Aug 2021 · Andreas Madsen, Siva Reddy, Sarath Chandar

Neural networks for NLP are becoming increasingly complex and widespread, and there is growing concern about whether it is responsible to use these models.

Neural Arithmetic Units

3 code implementations · ICLR 2020 · Andreas Madsen, Alexander Rosenberg Johansen

We present two new neural network components: the Neural Addition Unit (NAU), which can learn exact addition and subtraction; and the Neural Multiplication Unit (NMU) that can multiply subsets of a vector.
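A minimal NumPy sketch of the two units' forward computations as described here; the weight regularization and clamping the paper uses to push weights towards discrete values are omitted:

```python
import numpy as np

def nau_forward(W: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Neural Addition Unit: a linear layer whose weights are pushed towards
    {-1, 0, 1}, so each output becomes an exact signed sum of inputs."""
    return W @ x

def nmu_forward(W: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Neural Multiplication Unit: each output multiplies a gated subset of the
    inputs; with W in [0, 1], W[i, j] = 1 includes x[j] in the product and
    W[i, j] = 0 leaves it out (its factor becomes 1)."""
    return np.prod(W * x + 1.0 - W, axis=-1)

# Example: with idealized binary weights the units recover exact arithmetic.
x = np.array([2.0, 3.0, 5.0])
print(nau_forward(np.array([[1.0, 1.0, -1.0]]), x))  # [0.]  -> 2 + 3 - 5
print(nmu_forward(np.array([[1.0, 0.0, 1.0]]), x))   # [10.] -> 2 * 5
```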

Tasks: Inductive Bias

Measuring Arithmetic Extrapolation Performance

4 code implementations · 4 Oct 2019 · Andreas Madsen, Alexander Rosenberg Johansen

The goal of the Neural Arithmetic Logic Unit (NALU) is to learn perfect extrapolation, which requires learning the exact underlying logic of an unknown arithmetic problem.
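A small illustrative sketch of such an extrapolation check: evaluate a trained model on a numeric range disjoint from (and larger than) the training range, and count the run as successful only if the error stays below a small threshold. The ranges, the threshold, and the `model_fn` interface here are assumptions for illustration, not the paper's exact protocol:

```python
import numpy as np

def make_data(rng, low, high, n=1024, size=4):
    """Sample inputs from [low, high); the target is a simple addition problem."""
    x = rng.uniform(low, high, size=(n, size))
    t = x[:, 0] + x[:, 1]
    return x, t

def extrapolation_success(model_fn, threshold=1e-5, seed=0):
    """Success only if the MSE on the extrapolation range stays below the threshold."""
    rng = np.random.default_rng(seed)
    x_ext, t_ext = make_data(rng, 2.0, 6.0)  # outside an assumed training range of [1, 2)
    mse = float(np.mean((model_fn(x_ext) - t_ext) ** 2))
    return mse < threshold, mse

# A model that learned the exact underlying logic extrapolates perfectly:
exact_model = lambda x: x[:, 0] + x[:, 1]
print(extrapolation_success(exact_model))  # (True, 0.0)
```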
