no code implementations • 6 Feb 2024 • Hugo Cui, Freya Behrens, Florent Krzakala, Lenka Zdeborová
We investigate how a dot-product attention layer learns a positional attention matrix (with tokens attending to each other based on their respective positions) and a semantic attention matrix (with tokens attending to each other based on their meaning).
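The positional/semantic distinction can be illustrated numerically. The sketch below is not from the paper: dimensions, weight initializations, and the clean split into "purely positional" and "purely semantic" score matrices are illustrative assumptions, shown only to make the two mechanisms concrete.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: L tokens, embedding dimension d (not from the paper).
L, d = 4, 8
X = rng.standard_normal((L, d))           # token (semantic) embeddings
P = rng.standard_normal((L, d))           # positional encodings
WQ = rng.standard_normal((d, d)) / np.sqrt(d)   # query weights
WK = rng.standard_normal((d, d)) / np.sqrt(d)   # key weights

def softmax(A):
    # Row-wise softmax with max subtraction for numerical stability.
    A = A - A.max(axis=-1, keepdims=True)
    E = np.exp(A)
    return E / E.sum(axis=-1, keepdims=True)

# Standard dot-product attention on the combined input X + P.
attn = softmax((X + P) @ WQ @ WK.T @ (X + P).T / np.sqrt(d))

# Illustrative extremes of the two mechanisms: a purely positional
# attention matrix depends only on the positions P ...
positional = softmax(P @ WQ @ WK.T @ P.T / np.sqrt(d))
# ... while a purely semantic one depends only on the token content X.
semantic = softmax(X @ WQ @ WK.T @ X.T / np.sqrt(d))
```

Each of the three matrices is row-stochastic: row i gives the attention weights with which token i attends to every other token.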
1 code implementation • 12 Jun 2023 • Mahalakshmi Sabanayagam, Freya Behrens, Urte Adomaityte, Anna Dawid
Based on this finding, we provide a new and straightforward approach to studying the complexity of a high-dimensional decision boundary; we show that this connection naturally inspires a new generalization measure; and finally, we develop a novel margin estimation technique which, in combination with the generalization measure, precisely identifies minima with simple wide-margin boundaries.
no code implementations • 7 Feb 2021 • Freya Behrens, Stefano Teso, Davide Mottin
We introduce Explearn, an online algorithm that learns to jointly output predictions and explanations for those predictions.
no code implementations • 23 Oct 2020 • Freya Behrens, Jonathan Sauder, Peter Jung
It is well-established that many iterative sparse reconstruction algorithms such as ISTA can be unrolled to yield a learnable neural network for improved empirical performance.
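For reference, a minimal NumPy implementation of plain (non-learned) ISTA for the lasso objective 0.5‖Ax − y‖² + λ‖x‖₁ — each iteration below is exactly the step that unrolling turns into one network layer. The problem sizes and regularization strength are illustrative.

```python
import numpy as np

def soft_threshold(x, theta):
    # Elementwise soft-thresholding: the proximal operator of the l1 norm.
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)

def ista(A, y, lam=0.01, n_iter=500):
    """ISTA for min_x 0.5*||A @ x - y||^2 + lam*||x||_1."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L, L = Lipschitz const of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        # Gradient step on the quadratic term, then shrinkage.
        x = soft_threshold(x - step * A.T @ (A @ x - y), step * lam)
    return x

# Illustrative sparse recovery problem (not from the paper).
rng = np.random.default_rng(1)
A = rng.standard_normal((30, 60))
x_true = np.zeros(60)
x_true[:3] = [1.0, -2.0, 1.5]
y = A @ x_true
x_hat = ista(A, y)
```

Unrolling replaces the loop by a fixed number of layers, with the matrices and thresholds promoted to trainable parameters.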
1 code implementation • ICLR 2021 • Freya Behrens, Jonathan Sauder, Peter Jung
A prime example is learned ISTA (LISTA) where weights, step sizes and thresholds are learned from training data.
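A minimal sketch of such an unrolled LISTA-style network, written as a forward pass only. Parameter names and the initialization from a dictionary A are illustrative assumptions; in actual LISTA the per-layer matrices and thresholds would be fit to training data rather than left at their ISTA values.

```python
import numpy as np

def soft_threshold(x, theta):
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)

class LISTA:
    """Sketch of a LISTA-style unrolled network (untrained).

    Each of the n_layers layers mimics one ISTA iteration,
        x <- soft_threshold(W1 @ y + W2 @ x, theta),
    but W1, W2 and theta are free per-layer parameters. Here they are
    merely initialized at the ISTA values derived from A, to show the
    architecture; training them is what LISTA adds."""

    def __init__(self, A, lam=0.01, n_layers=16):
        n = A.shape[1]
        step = 1.0 / np.linalg.norm(A, 2) ** 2
        # Per-layer parameters, initialized so the network equals plain ISTA.
        self.W1 = [step * A.T for _ in range(n_layers)]
        self.W2 = [np.eye(n) - step * A.T @ A for _ in range(n_layers)]
        self.theta = [step * lam for _ in range(n_layers)]

    def forward(self, y):
        x = np.zeros(self.W1[0].shape[0])
        for W1, W2, th in zip(self.W1, self.W2, self.theta):
            x = soft_threshold(W1 @ y + W2 @ x, th)
        return x

# Illustrative usage on a synthetic sparse problem.
rng = np.random.default_rng(2)
A = rng.standard_normal((30, 60))
x_true = np.concatenate([[1.0, -2.0, 1.5], np.zeros(57)])
y = A @ x_true
x_hat = LISTA(A).forward(y)
```

Because the layers start at the ISTA values, the untrained forward pass already reduces the residual; training would let a few layers match many plain ISTA iterations.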