Search Results for author: Sarthak Mittal

Found 13 papers, 10 papers with code

Does learning the right latent variables necessarily improve in-context learning?

1 code implementation • 29 May 2024 • Sarthak Mittal, Eric Elmoznino, Leo Gagnon, Sangnie Bhardwaj, Dhanya Sridhar, Guillaume Lajoie

Our study highlights the intrinsic limitations of Transformers in achieving structured ICL solutions that generalize, and shows that while inferring the right latents aids interpretability, it is not sufficient to alleviate this problem.

Iterated Denoising Energy Matching for Sampling from Boltzmann Densities

1 code implementation • 9 Feb 2024 • Tara Akhound-Sadegh, Jarrid Rector-Brooks, Avishek Joey Bose, Sarthak Mittal, Pablo Lemos, Cheng-Hao Liu, Marcin Sendera, Siamak Ravanbakhsh, Gauthier Gidel, Yoshua Bengio, Nikolay Malkin, Alexander Tong

Efficiently generating statistically independent samples from an unnormalized probability distribution, such as equilibrium samples of many-body systems, is a foundational problem in science.

Denoising, Efficient Exploration

Improved off-policy training of diffusion samplers

1 code implementation • 7 Feb 2024 • Marcin Sendera, Minsu Kim, Sarthak Mittal, Pablo Lemos, Luca Scimeca, Jarrid Rector-Brooks, Alexandre Adam, Yoshua Bengio, Nikolay Malkin

We study the problem of training diffusion models to sample from a distribution with a given unnormalized density or energy function.

Benchmarking

Leveraging Synthetic Targets for Machine Translation

no code implementations • 7 May 2023 • Sarthak Mittal, Oleksii Hrinchuk, Oleksii Kuchaiev

In this work, we provide a recipe for training machine translation models in a limited resource setting by leveraging synthetic target data generated using a large pre-trained model.

Machine Translation, Translation
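As a rough illustration of this recipe, the sketch below assumes a distillation-style setup in which a large pre-trained model produces synthetic targets for monolingual source text and a smaller model is then trained on the resulting pairs; the function names and the toy stand-in teacher are hypothetical placeholders, not code or an API from the paper.

```python
from typing import Callable, Iterable, List, Tuple

def build_synthetic_corpus(
    translate: Callable[[str], str], sources: Iterable[str]
) -> List[Tuple[str, str]]:
    """Pair each monolingual source sentence with a synthetic target
    produced by a large pre-trained translation model."""
    return [(src, translate(src)) for src in sources]

# Toy usage with a stand-in "teacher"; a real setup would call an actual
# pre-trained MT model's translate/generate method here.
toy_teacher = lambda src: src.upper()
corpus = build_synthetic_corpus(toy_teacher, ["hello world", "good morning"])
# `corpus` now holds (source, synthetic target) pairs on which a smaller
# student model would be trained with a standard cross-entropy objective.
```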

MixupE: Understanding and Improving Mixup from Directional Derivative Perspective

1 code implementation • 27 Dec 2022 • Yingtian Zou, Vikas Verma, Sarthak Mittal, Wai Hoh Tang, Hieu Pham, Juho Kannala, Yoshua Bengio, Arno Solin, Kenji Kawaguchi

Mixup is a popular data augmentation technique for training deep neural networks where additional samples are generated by linearly interpolating pairs of inputs and their labels.

Data Augmentation
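To make the interpolation described above concrete, here is a minimal NumPy sketch of vanilla Mixup (not the MixupE variant proposed in this paper); the function name and default Beta parameter are illustrative choices, not the authors' code.

```python
import numpy as np

def mixup_batch(x, y, alpha=0.2, rng=None):
    """Vanilla Mixup: convex-combine random pairs of inputs and one-hot labels.

    x: array of shape (batch, ...); y: one-hot labels of shape (batch, classes).
    alpha: parameter of the Beta(alpha, alpha) distribution that sets
    how strongly pairs are interpolated.
    """
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)                # interpolation coefficient in [0, 1]
    perm = rng.permutation(len(x))              # random pairing within the batch
    x_mixed = lam * x + (1.0 - lam) * x[perm]   # interpolate inputs
    y_mixed = lam * y + (1.0 - lam) * y[perm]   # interpolate labels
    return x_mixed, y_mixed
```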

From Points to Functions: Infinite-dimensional Representations in Diffusion Models

1 code implementation • 25 Oct 2022 • Sarthak Mittal, Guillaume Lajoie, Stefan Bauer, Arash Mehrjou

Consequently, it is reasonable to ask if there is an intermediate time step at which the preserved information is optimal for a given downstream task.

Decoder

On Neural Architecture Inductive Biases for Relational Tasks

1 code implementation • 9 Jun 2022 • Giancarlo Kerg, Sarthak Mittal, David Rolnick, Yoshua Bengio, Blake Richards, Guillaume Lajoie

Recent work has explored how forcing relational representations to remain distinct from sensory representations, as seems to be the case in the brain, can help artificial systems.

Inductive Bias, Out-of-Distribution Generalization

Is a Modular Architecture Enough?

1 code implementation • 6 Jun 2022 • Sarthak Mittal, Yoshua Bengio, Guillaume Lajoie

Inspired by human cognition, machine learning systems are gradually revealing advantages of sparser and more modular architectures.

Out-of-Distribution Generalization

Compositional Attention: Disentangling Search and Retrieval

3 code implementations • ICLR 2022 • Sarthak Mittal, Sharath Chandra Raparthy, Irina Rish, Yoshua Bengio, Guillaume Lajoie

Through our qualitative analysis, we demonstrate that Compositional Attention leads to dynamic specialization based on the type of retrieval needed.

Retrieval

Diffusion-Based Representation Learning

no code implementations • 29 May 2021 • Korbinian Abstreiter, Sarthak Mittal, Stefan Bauer, Bernhard Schölkopf, Arash Mehrjou

In contrast, the diffusion-based representation learning introduced here relies on a new formulation of the denoising score matching objective and thus encodes the information needed for denoising.

Denoising, Representation Learning, +1
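For context, the standard denoising score matching objective that such a formulation builds on can be written as below (the textbook form due to Vincent, 2011, shown for reference only; the paper's modified objective differs):

```latex
% Standard denoising score matching with a Gaussian perturbation kernel,
% q_sigma(x_tilde | x) = N(x_tilde; x, sigma^2 I). Reference form, not the
% paper's new formulation.
\mathcal{L}_{\mathrm{DSM}}(\theta)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}}\,
    \mathbb{E}_{\tilde{x} \sim q_\sigma(\tilde{x} \mid x)}
    \left[ \left\| s_\theta(\tilde{x}, \sigma)
      - \nabla_{\tilde{x}} \log q_\sigma(\tilde{x} \mid x) \right\|_2^2 \right],
\qquad
\nabla_{\tilde{x}} \log q_\sigma(\tilde{x} \mid x) = -\frac{\tilde{x} - x}{\sigma^2}.
```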

Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention over Modules

1 code implementation • ICML 2020 • Sarthak Mittal, Alex Lamb, Anirudh Goyal, Vikram Voleti, Murray Shanahan, Guillaume Lajoie, Michael Mozer, Yoshua Bengio

To effectively utilize the wealth of potential top-down information available, and to prevent the cacophony of intermixed signals in a bidirectional architecture, mechanisms are needed to restrict information flow.

Language Modelling, Open-Ended Question Answering, +2

A Modern Take on the Bias-Variance Tradeoff in Neural Networks

no code implementations • 19 Oct 2018 • Brady Neal, Sarthak Mittal, Aristide Baratin, Vinayak Tantia, Matthew Scicluna, Simon Lacoste-Julien, Ioannis Mitliagkas

The bias-variance tradeoff tells us that as model complexity increases, bias falls and variance increases, leading to a U-shaped test error curve.
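For reference, the U-shaped curve referred to above comes from the classical decomposition of expected squared error into bias, variance, and irreducible noise (a textbook identity, not a result of this paper):

```latex
% Classical bias-variance decomposition for squared error, with
% y = f(x) + eps, E[eps] = 0, Var(eps) = sigma^2, and \hat{f} the learned predictor.
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```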
