Search Results for author: Will Dabney

Found 46 papers, 12 papers with code

Disentangling the Causes of Plasticity Loss in Neural Networks

no code implementations • 29 Feb 2024 • Clare Lyle, Zeyu Zheng, Khimya Khetarpal, Hado van Hasselt, Razvan Pascanu, James Martens, Will Dabney

Underpinning the past decades of work on the design, initialization, and optimization of neural networks is a seemingly innocuous assumption: that the network is trained on a stationary data distribution.

Atari Games reinforcement-learning

A Distributional Analogue to the Successor Representation

1 code implementation • 13 Feb 2024 • Harley Wiltzer, Jesse Farebrother, Arthur Gretton, Yunhao Tang, André Barreto, Will Dabney, Marc G. Bellemare, Mark Rowland

This paper contributes a new approach for distributional reinforcement learning which elucidates a clean separation of transition structure and reward in the learning process.

Distributional Reinforcement Learning Model-based Reinforcement Learning +1

Near-Minimax-Optimal Distributional Reinforcement Learning with a Generative Model

no code implementations • 12 Feb 2024 • Mark Rowland, Li Kevin Wenliang, Rémi Munos, Clare Lyle, Yunhao Tang, Will Dabney

We propose a new algorithm for model-based distributional reinforcement learning (RL), and prove that it is minimax-optimal for approximating return distributions with a generative model (up to logarithmic factors), resolving an open question of Zhang et al. (2023).

Distributional Reinforcement Learning reinforcement-learning +1

Off-policy Distributional Q(λ): Distributional RL without Importance Sampling

no code implementations • 8 Feb 2024 • Yunhao Tang, Mark Rowland, Rémi Munos, Bernardo Ávila Pires, Will Dabney

We introduce off-policy distributional Q(λ), a new addition to the family of off-policy distributional evaluation algorithms.

Bootstrapped Representations in Reinforcement Learning

no code implementations • 16 Jun 2023 • Charline Le Lan, Stephen Tu, Mark Rowland, Anna Harutyunyan, Rishabh Agarwal, Marc G. Bellemare, Will Dabney

In this paper, we address this gap and provide a theoretical characterization of the state representation learnt by temporal difference learning (Sutton, 1988).

Auxiliary Learning reinforcement-learning +1

The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation

no code implementations • 28 May 2023 • Mark Rowland, Yunhao Tang, Clare Lyle, Rémi Munos, Marc G. Bellemare, Will Dabney

We study the problem of temporal-difference-based policy evaluation in reinforcement learning.

Distributional Reinforcement Learning reinforcement-learning

Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition

no code implementations • 1 May 2023 • Yash Chandak, Shantanu Thakoor, Zhaohan Daniel Guo, Yunhao Tang, Remi Munos, Will Dabney, Diana L Borsa

Representation learning and exploration are among the key challenges for any deep reinforcement learning agent.

reinforcement-learning Representation Learning

Understanding plasticity in neural networks

no code implementations • 2 Mar 2023 • Clare Lyle, Zeyu Zheng, Evgenii Nikishin, Bernardo Avila Pires, Razvan Pascanu, Will Dabney

Plasticity, the ability of a neural network to quickly change its predictions in response to new information, is essential for the adaptability and robustness of deep reinforcement learning systems.

Atari Games

An Analysis of Quantile Temporal-Difference Learning

no code implementations • 11 Jan 2023 • Mark Rowland, Rémi Munos, Mohammad Gheshlaghi Azar, Yunhao Tang, Georg Ostrovski, Anna Harutyunyan, Karl Tuyls, Marc G. Bellemare, Will Dabney

We analyse quantile temporal-difference learning (QTD), a distributional reinforcement learning algorithm that has proven to be a key component in several successful large-scale applications of reinforcement learning.

Distributional Reinforcement Learning reinforcement-learning +1

Settling the Reward Hypothesis

no code implementations • 20 Dec 2022 • Michael Bowling, John D. Martin, David Abel, Will Dabney

The reward hypothesis posits that, "all of what we mean by goals and purposes can be well thought of as maximization of the expected value of the cumulative sum of a received scalar signal (reward)."

Understanding Self-Predictive Learning for Reinforcement Learning

no code implementations • 6 Dec 2022 • Yunhao Tang, Zhaohan Daniel Guo, Pierre Harvey Richemond, Bernardo Ávila Pires, Yash Chandak, Rémi Munos, Mark Rowland, Mohammad Gheshlaghi Azar, Charline Le Lan, Clare Lyle, András György, Shantanu Thakoor, Will Dabney, Bilal Piot, Daniele Calandriello, Michal Valko

We identify that faster-paced optimization of the predictor and semi-gradient updates on the representation are crucial to preventing representation collapse.

reinforcement-learning Reinforcement Learning (RL) +1

The Nature of Temporal Difference Errors in Multi-step Distributional Reinforcement Learning

no code implementations • 15 Jul 2022 • Yunhao Tang, Mark Rowland, Rémi Munos, Bernardo Ávila Pires, Will Dabney, Marc G. Bellemare

We study the multi-step off-policy learning approach to distributional RL.

Distributional Reinforcement Learning reinforcement-learning +1

Generalised Policy Improvement with Geometric Policy Composition

no code implementations • 17 Jun 2022 • Shantanu Thakoor, Mark Rowland, Diana Borsa, Will Dabney, Rémi Munos, André Barreto

We introduce a method for policy improvement that interpolates between the greedy approach of value-based reinforcement learning (RL) and the full planning approach typical of model-based RL.

Continuous Control Reinforcement Learning (RL)

Learning Dynamics and Generalization in Reinforcement Learning

no code implementations • 5 Jun 2022 • Clare Lyle, Mark Rowland, Will Dabney, Marta Kwiatkowska, Yarin Gal

Solving a reinforcement learning (RL) problem poses two competing challenges: fitting a potentially discontinuous value function, and generalizing well to new observations.

Policy Gradient Methods reinforcement-learning +1

Understanding and Preventing Capacity Loss in Reinforcement Learning

no code implementations • ICLR 2022 • Clare Lyle, Mark Rowland, Will Dabney

The reinforcement learning (RL) problem is rife with sources of non-stationarity, making it a notoriously difficult problem domain for the application of neural networks.

Montezuma's Revenge reinforcement-learning +1

On the Expressivity of Markov Reward

no code implementations • NeurIPS 2021 • David Abel, Will Dabney, Anna Harutyunyan, Mark K. Ho, Michael L. Littman, Doina Precup, Satinder Singh

We then provide a set of polynomial-time algorithms that construct a Markov reward function that allows an agent to optimize tasks of each of these three types, and correctly determine when no such reward function exists.

The Difficulty of Passive Learning in Deep Reinforcement Learning

1 code implementation • NeurIPS 2021 • Georg Ostrovski, Pablo Samuel Castro, Will Dabney

Learning to act from observational data without active environmental interaction is a well-known challenge in Reinforcement Learning (RL).

reinforcement-learning Reinforcement Learning (RL)

Revisiting Peng's Q(λ) for Modern Reinforcement Learning

no code implementations • 27 Feb 2021 • Tadashi Kozuno, Yunhao Tang, Mark Rowland, Rémi Munos, Steven Kapturowski, Will Dabney, Michal Valko, David Abel

These results indicate that Peng's Q(λ), which was thought to be unsafe, is a theoretically sound and practically effective algorithm.

Continuous Control reinforcement-learning +1

On The Effect of Auxiliary Tasks on Representation Dynamics

no code implementations • 25 Feb 2021 • Clare Lyle, Mark Rowland, Georg Ostrovski, Will Dabney

While auxiliary tasks play a key role in shaping the representations learnt by reinforcement learning agents, much is still unknown about the mechanisms through which this is achieved.

reinforcement-learning Reinforcement Learning (RL)

Model-Free Counterfactual Credit Assignment

no code implementations • 1 Jan 2021 • Thomas Mesnard, Theophane Weber, Fabio Viola, Shantanu Thakoor, Alaa Saade, Anna Harutyunyan, Will Dabney, Tom Stepleton, Nicolas Heess, Marcus Hutter, Lars Holger Buesing, Remi Munos

Credit assignment in reinforcement learning is the problem of measuring an action’s influence on future rewards.

counterfactual valid

Counterfactual Credit Assignment in Model-Free Reinforcement Learning

no code implementations • 18 Nov 2020 • Thomas Mesnard, Théophane Weber, Fabio Viola, Shantanu Thakoor, Alaa Saade, Anna Harutyunyan, Will Dabney, Tom Stepleton, Nicolas Heess, Arthur Guez, Éric Moulines, Marcus Hutter, Lars Buesing, Rémi Munos

Credit assignment in reinforcement learning is the problem of measuring an action's influence on future rewards.

counterfactual reinforcement-learning +1

Revisiting Fundamentals of Experience Replay

2 code implementations • ICML 2020 • William Fedus, Prajit Ramachandran, Rishabh Agarwal, Yoshua Bengio, Hugo Larochelle, Mark Rowland, Will Dabney

Experience replay is central to off-policy algorithms in deep reinforcement learning (RL), but there remain significant gaps in our understanding.

DQN Replay Dataset Q-Learning +1

Deep Reinforcement Learning and its Neuroscientific Implications

no code implementations • 7 Jul 2020 • Matthew Botvinick, Jane X. Wang, Will Dabney, Kevin J. Miller, Zeb Kurth-Nelson

The emergence of powerful artificial intelligence is defining new research directions in neuroscience.

Decision Making Image Classification +2

The Value-Improvement Path: Towards Better Representations for Reinforcement Learning

no code implementations • 3 Jun 2020 • Will Dabney, André Barreto, Mark Rowland, Robert Dadashi, John Quan, Marc G. Bellemare, David Silver

To test our hypothesis empirically, we augmented a standard deep RL agent with an auxiliary task of learning the value-improvement path.

Atari Games reinforcement-learning +3

Temporally-Extended ε-Greedy Exploration

no code implementations • ICLR 2021 • Will Dabney, Georg Ostrovski, André Barreto

Recent work on exploration in reinforcement learning (RL) has led to a series of increasingly complex solutions to the problem.

Reinforcement Learning (RL)

Adapting Behaviour for Learning Progress

no code implementations • 14 Dec 2019 • Tom Schaul, Diana Borsa, David Ding, David Szepesvari, Georg Ostrovski, Will Dabney, Simon Osindero

Determining what experience to generate to best facilitate learning (i.e., exploration) is one of the distinguishing features and open challenges in reinforcement learning.

Atari Games

Hindsight Credit Assignment

1 code implementation • NeurIPS 2019 • Anna Harutyunyan, Will Dabney, Thomas Mesnard, Mohammad Azar, Bilal Piot, Nicolas Heess, Hado van Hasselt, Greg Wayne, Satinder Singh, Doina Precup, Remi Munos

We consider the problem of efficient credit assignment in reinforcement learning.

reinforcement-learning Reinforcement Learning (RL)

Adaptive Trade-Offs in Off-Policy Learning

no code implementations • 16 Oct 2019 • Mark Rowland, Will Dabney, Rémi Munos

A great variety of off-policy learning algorithms exist in the literature, and new breakthroughs in this area continue to be made, improving theoretical understanding and yielding state-of-the-art reinforcement learning algorithms.

Off-policy evaluation reinforcement-learning

Conditional Importance Sampling for Off-Policy Learning

no code implementations • 16 Oct 2019 • Mark Rowland, Anna Harutyunyan, Hado van Hasselt, Diana Borsa, Tom Schaul, Rémi Munos, Will Dabney

We theoretically analyse this space, and concretely investigate several algorithms that arise from this framework.

reinforcement-learning Reinforcement Learning (RL)

Fast Task Inference with Variational Intrinsic Successor Features

no code implementations • ICLR 2020 • Steven Hansen, Will Dabney, Andre Barreto, Tom Van de Wiele, David Warde-Farley, Volodymyr Mnih

It has been established that diverse behaviors spanning the controllable subspace of a Markov decision process can be trained by rewarding a policy for being distinguishable from other policies (Gregor et al., 2016; Eysenbach et al., 2018; Warde-Farley et al., 2018).

Recurrent Experience Replay in Distributed Reinforcement Learning

3 code implementations • ICLR 2019 • Steven Kapturowski, Georg Ostrovski, Will Dabney, John Quan, Remi Munos

Using a single network architecture and fixed set of hyperparameters, the resulting agent, Recurrent Replay Distributed DQN, quadruples the previous state of the art on Atari-57, and surpasses the state of the art on DMLab-30.

Ranked #1 on Atari Games on Atari 2600 Pong

Atari Games reinforcement-learning +1

The Termination Critic

no code implementations • 26 Feb 2019 • Anna Harutyunyan, Will Dabney, Diana Borsa, Nicolas Heess, Remi Munos, Doina Precup

In this work, we consider the problem of autonomously discovering behavioral abstractions, or options, for reinforcement learning agents.

Statistics and Samples in Distributional Reinforcement Learning

no code implementations • 21 Feb 2019 • Mark Rowland, Robert Dadashi, Saurabh Kumar, Rémi Munos, Marc G. Bellemare, Will Dabney

We present a unifying framework for designing and analysing distributional reinforcement learning (DRL) algorithms in terms of recursively estimating statistics of the return distribution.

Distributional Reinforcement Learning reinforcement-learning +1

A Geometric Perspective on Optimal Representations for Reinforcement Learning

no code implementations • NeurIPS 2019 • Marc G. Bellemare, Will Dabney, Robert Dadashi, Adrien Ali Taiga, Pablo Samuel Castro, Nicolas Le Roux, Dale Schuurmans, Tor Lattimore, Clare Lyle

We leverage this perspective to provide formal evidence regarding the usefulness of value functions as auxiliary tasks.

reinforcement-learning Reinforcement Learning (RL) +1

Implicit Quantile Networks for Distributional Reinforcement Learning

20 code implementations • ICML 2018 • Will Dabney, Georg Ostrovski, David Silver, Rémi Munos

In this work, we build on recent advances in distributional reinforcement learning to give a generally applicable, flexible, and state-of-the-art distributional variant of DQN.

Ranked #1 on Atari Games on Atari 2600 Freeway

Atari Games Distributional Reinforcement Learning +3

Autoregressive Quantile Networks for Generative Modeling

1 code implementation • ICML 2018 • Georg Ostrovski, Will Dabney, Rémi Munos

We introduce autoregressive implicit quantile networks (AIQN), a fundamentally different approach to generative modeling than those commonly used, that implicitly captures the distribution using quantile regression.

regression

Low-pass Recurrent Neural Networks - A memory architecture for longer-term correlation discovery

no code implementations • 13 May 2018 • Thomas Stepleton, Razvan Pascanu, Will Dabney, Siddhant M. Jayakumar, Hubert Soyer, Remi Munos

Reinforcement learning (RL) agents performing complex tasks must be able to remember observations and actions across sizable time intervals.

Reinforcement Learning (RL)

Distributed Distributional Deterministic Policy Gradients

5 code implementations • ICLR 2018 • Gabriel Barth-Maron, Matthew W. Hoffman, David Budden, Will Dabney, Dan Horgan, Dhruva TB, Alistair Muldal, Nicolas Heess, Timothy Lillicrap

This work adopts the very successful distributional perspective on reinforcement learning and adapts it to the continuous control setting.

Continuous Control Reinforcement Learning (RL)

An Analysis of Categorical Distributional Reinforcement Learning

no code implementations • 22 Feb 2018 • Mark Rowland, Marc G. Bellemare, Will Dabney, Rémi Munos, Yee Whye Teh

Distributional approaches to value-based reinforcement learning model the entire distribution of returns, rather than just their expected values, and have recently been shown to yield state-of-the-art empirical performance.

Distributional Reinforcement Learning reinforcement-learning +1

Distributional Reinforcement Learning with Quantile Regression

17 code implementations • 27 Oct 2017 • Will Dabney, Mark Rowland, Marc G. Bellemare, Rémi Munos

In this paper, we build on recent work advocating a distributional approach to reinforcement learning in which the distribution over returns is modeled explicitly instead of only estimating the mean.

Ranked #1 on Atari Games on Atari 2600 Pong

Atari Games Distributional Reinforcement Learning +3

Rainbow: Combining Improvements in Deep Reinforcement Learning

32 code implementations • 6 Oct 2017 • Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Azar, David Silver

The deep reinforcement learning community has made several independent improvements to the DQN algorithm.

Ranked #3 on Montezuma's Revenge on Atari 2600 Montezuma's Revenge

Montezuma's Revenge reinforcement-learning +1

A Distributional Perspective on Reinforcement Learning

22 code implementations • ICML 2017 • Marc G. Bellemare, Will Dabney, Rémi Munos

We obtain both state-of-the-art results and anecdotal evidence demonstrating the importance of the value distribution in approximate reinforcement learning.

Ranked #4 on Atari Games on Atari 2600 HERO

Atari Games reinforcement-learning +1

The Cramer Distance as a Solution to Biased Wasserstein Gradients

2 code implementations • ICLR 2018 • Marc G. Bellemare, Ivo Danihelka, Will Dabney, Shakir Mohamed, Balaji Lakshminarayanan, Stephan Hoyer, Rémi Munos

We show that the Cramér distance possesses all three desired properties, combining the best of the Wasserstein and Kullback-Leibler divergences.

BIG-bench Machine Learning Generative Adversarial Network

The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning

no code implementations • ICLR 2018 • Audrunas Gruslys, Will Dabney, Mohammad Gheshlaghi Azar, Bilal Piot, Marc Bellemare, Remi Munos

Our first contribution is a new policy evaluation algorithm called Distributional Retrace, which brings multi-step off-policy updates to the distributional reinforcement learning setting.

Ranked #6 on Atari Games on Atari 2600 Crazy Climber

Atari Games Distributional Reinforcement Learning +1

Successor Features for Transfer in Reinforcement Learning

no code implementations • NeurIPS 2017 • André Barreto, Will Dabney, Rémi Munos, Jonathan J. Hunt, Tom Schaul, Hado van Hasselt, David Silver

Transfer in reinforcement learning refers to the notion that generalization should occur not only within a task but also across tasks.

reinforcement-learning Reinforcement Learning (RL)

Proximal Reinforcement Learning: A New Theory of Sequential Decision Making in Primal-Dual Spaces

no code implementations • 26 May 2014 • Sridhar Mahadevan, Bo Liu, Philip Thomas, Will Dabney, Steve Giguere, Nicholas Jacek, Ian Gemp, Ji Liu

In this paper, we set forth a new vision of reinforcement learning developed by us over the past few years, one that yields mathematically rigorous solutions to longstanding important questions that have remained unresolved: (i) how to design reliable, convergent, and robust reinforcement learning algorithms; (ii) how to guarantee that reinforcement learning satisfies pre-specified "safety" guarantees and remains in a stable region of the parameter space; (iii) how to design "off-policy" temporal difference learning algorithms in a reliable and stable manner; and finally (iv) how to integrate the study of reinforcement learning into the rich theory of stochastic optimization.

Decision Making reinforcement-learning +2
