no code implementations • 6 Feb 2024 • Brett Daley, Martha White, Marlos C. Machado
Multistep returns, such as $n$-step returns and $\lambda$-returns, are commonly used to improve the sample efficiency of reinforcement learning (RL) methods.
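For concreteness, a minimal sketch of both estimators computed from a sampled episode, assuming `values[t]` approximates $V(s_t)$ and the episode terminates at step `T` (names and signatures are illustrative, not taken from the paper):

```python
import numpy as np

def n_step_return(rewards, values, t, n, gamma):
    """n-step return: n discounted rewards plus a bootstrapped value."""
    T = len(rewards)
    h = min(t + n, T)  # truncate at the end of the episode
    G = sum(gamma ** (k - t) * rewards[k] for k in range(t, h))
    if h < T:  # bootstrap only if the episode has not terminated
        G += gamma ** (h - t) * values[h]
    return G

def lambda_returns(rewards, values, gamma, lam):
    """λ-returns via the backward recursion
    G_t = r_t + γ[(1 − λ) V(s_{t+1}) + λ G_{t+1}]."""
    T = len(rewards)
    G = np.zeros(T)
    g = 0.0  # the return beyond the terminal state is zero
    for t in reversed(range(T)):
        v_next = values[t + 1] if t + 1 < T else 0.0
        g = rewards[t] + gamma * ((1 - lam) * v_next + lam * g)
        G[t] = g
    return G
```

The λ-return is the exponentially weighted average of all n-step returns; the backward recursion computes every target for the episode in a single O(T) pass.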
1 code implementation • 26 Jan 2023 • Brett Daley, Martha White, Christopher Amato, Marlos C. Machado
Off-policy learning from multistep returns is crucial for sample-efficient reinforcement learning, but counteracting off-policy bias without exacerbating variance is challenging.
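The tension referred to here is easiest to see in the generic off-policy return written as a sum of trace-weighted TD errors, the common form behind importance sampling, Tree Backup, and Retrace (background only, not the paper's proposed estimator; names are illustrative):

```python
def corrected_return(q_t, td_errors, traces, gamma):
    """Off-policy multistep return in the common trace-weighted form:
        G_t = Q(s_t, a_t) + Σ_{k≥0} γ^k (Π_{i=1..k} c_i) δ_{t+k}.
    With c_i = ρ_i (full importance sampling) the estimate is unbiased,
    but the product of ratios inflates variance; truncating the traces
    (e.g., c_i = λ·min(1, ρ_i) as in Retrace) tames the variance at the
    cost of cutting the return short."""
    G = q_t
    c_prod = 1.0
    for k, (delta, c) in enumerate(zip(td_errors, traces)):
        if k > 0:
            c_prod *= c  # traces apply from the step after t
        G += (gamma ** k) * c_prod * delta
    return G
```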
no code implementations • 4 Jun 2022 • Brett Daley, Isaac Chan
Q($\sigma$) is a recently proposed temporal-difference learning method that interpolates between learning from expected backups and sampled backups.
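Concretely, the one-step Q($\sigma$) target mixes the two backup types with a single coefficient; a sketch assuming discrete actions, with `q_next` the action values at the next state and `pi_next` the target policy's probabilities there:

```python
import numpy as np

def q_sigma_target(reward, gamma, q_next, pi_next, a_next, sigma):
    """One-step Q(σ) backup (De Asis et al., 2018):
        G = r + γ[σ · q(s', a') + (1 − σ) · Σ_a π(a|s') q(s', a)].
    σ = 1 recovers the sampled (Sarsa) backup, σ = 0 the expected
    (Expected Sarsa) backup; values in between interpolate."""
    sampled = q_next[a_next]            # sampled backup term
    expected = np.dot(pi_next, q_next)  # expected backup term
    return reward + gamma * (sigma * sampled + (1.0 - sigma) * expected)
```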
no code implementations • 23 Dec 2021 • Brett Daley, Christopher Amato
Off-policy learning from multistep returns is crucial for sample-efficient reinforcement learning, particularly in the experience replay setting now commonly used with deep neural networks.
1 code implementation • 6 Dec 2021 • Brett Daley, Christopher Amato
Return caching is a recent strategy that enables efficient minibatch training with multistep estimators (e.g., the $\lambda$-return) for deep reinforcement learning.
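A rough sketch of the caching idea, reusing the `lambda_returns` helper above: targets for whole trajectories are recomputed periodically, after which training can draw ordinary uniform minibatches against the cached values (the function and its signature are illustrative, not the paper's API):

```python
def refresh_cache(trajectories, value_fn, gamma, lam):
    """Precompute λ-return targets for every stored trajectory so that
    minibatch updates avoid recomputing multistep returns per sample."""
    cache = []
    for states, rewards in trajectories:
        values = value_fn(states)  # value estimates under the current network
        targets = lambda_returns(rewards, values, gamma, lam)
        cache.extend(zip(states, targets))
    return cache  # sample (state, target) minibatches uniformly from here
```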
1 code implementation • 1 Nov 2021 • Brett Daley, Christopher Amato
Deep Q-Network (DQN) marked a major milestone for reinforcement learning, demonstrating for the first time that human-level control policies could be learned directly from raw visual inputs via reward maximization.
no code implementations • 10 Jun 2021 • Brett Daley, Christopher Amato
Adam is an adaptive gradient method that has experienced widespread adoption due to its fast and reliable training performance.
no code implementations • 22 Feb 2021 • Brett Daley, Cameron Hickert, Christopher Amato
Our theory prescribes a special non-uniform distribution to cancel this effect, and we propose a stratified sampling scheme to efficiently implement it.
no code implementations • 8 Feb 2021 • Xueguang Lyu, Yuchen Xiao, Brett Daley, Christopher Amato
Centralized Training for Decentralized Execution, where agents are trained offline using centralized information but execute in a decentralized manner online, has gained popularity in the multi-agent reinforcement learning community.
1 code implementation • 19 Oct 2020 • Hai Nguyen, Brett Daley, Xinchao Song, Christopher Amato, Robert Platt
Many important robotics problems are partially observable in the sense that a single visual or force-feedback measurement is insufficient to reconstruct the state.
1 code implementation • 3 Oct 2020 • Brett Daley, Christopher Amato
Many popular adaptive gradient methods such as Adam and RMSProp rely on an exponential moving average (EMA) to normalize their stepsizes.
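For reference, the shared mechanism in miniature, written as an RMSProp-style update (hyperparameter values are typical defaults, not the paper's):

```python
import numpy as np

def ema_normalized_step(param, grad, v, lr=1e-3, beta=0.999, eps=1e-8):
    """One update with EMA stepsize normalization: an exponential moving
    average of the squared gradient rescales the step per parameter."""
    v = beta * v + (1.0 - beta) * grad ** 2  # EMA of squared gradients
    param = param - lr * grad / (np.sqrt(v) + eps)
    return param, v
```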
1 code implementation • NeurIPS 2019 • Brett Daley, Christopher Amato
Modern deep reinforcement learning methods have departed from the incremental learning required for eligibility traces, rendering the implementation of the $\lambda$-return difficult in this context.