Search Results for author: Miljan Martic

To avoid this interference incentive, we introduce a baseline policy that represents a default course of action (such as doing nothing), and use it to filter out future tasks that are not achievable by default.

Scaling shared model governance via model splitting

no code implementations • ICLR 2019 • Miljan Martic, Jan Leike, Andrew Trask, Matteo Hessel, Shane Legg, Pushmeet Kohli

Currently the only techniques for sharing governance of a deep learning model are homomorphic encryption and secure multiparty computation.

reinforcement-learning Reinforcement Learning (RL)

Scalable agent alignment via reward modeling: a research direction

3 code implementations • 19 Nov 2018 • Jan Leike, David Krueger, Tom Everitt, Miljan Martic, Vishal Maini, Shane Legg

One obstacle to applying reinforcement learning algorithms to real-world problems is the lack of suitable reward functions.

Atari Games reinforcement-learning +1

Penalizing side effects using stepwise relative reachability

no code implementations • 4 Jun 2018 • Victoria Krakovna, Laurent Orseau, Ramana Kumar, Miljan Martic, Shane Legg

How can we design safe reinforcement learning agents that avoid unnecessary disruptions to their environment?

Safe Reinforcement Learning

AI Safety Gridworlds

2 code implementations • 27 Nov 2017 • Jan Leike, Miljan Martic, Victoria Krakovna, Pedro A. Ortega, Tom Everitt, Andrew Lefrancq, Laurent Orseau, Shane Legg

We present a suite of reinforcement learning environments illustrating various safety properties of intelligent agents.

reinforcement-learning Reinforcement Learning (RL) +1

596

Deep reinforcement learning from human preferences

5 code implementations • NeurIPS 2017 • Paul Christiano, Jan Leike, Tom B. Brown, Miljan Martic, Shane Legg, Dario Amodei

For sophisticated reinforcement learning (RL) systems to interact usefully with real-world environments, we need to communicate complex goals to these systems.

Atari Games reinforcement-learning +1

297
