no code implementations • 7 Nov 2022 • Alexis Jacq, Manu Orsini, Gabriel Dulac-Arnold, Olivier Pietquin, Matthieu Geist, Olivier Bachem
Are the quantity and quality of data truly transformative to the performance of a general controller?
no code implementations • 16 Mar 2022 • Alexis Jacq, Johan Ferret, Olivier Pietquin, Matthieu Geist
We deem those states and corresponding actions important since they explain the difference in performance between the default and the new, lazy policy.
5 code implementations • 1 Jun 2020 • Matthew W. Hoffman, Bobak Shahriari, John Aslanides, Gabriel Barth-Maron, Nikola Momchev, Danila Sinopalnikov, Piotr Stańczyk, Sabela Ramos, Anton Raichuk, Damien Vincent, Léonard Hussenot, Robert Dadashi, Gabriel Dulac-Arnold, Manu Orsini, Alexis Jacq, Johan Ferret, Nino Vieillard, Seyed Kamyar Seyed Ghasemipour, Sertan Girgin, Olivier Pietquin, Feryal Behbahani, Tamara Norman, Abbas Abdolmaleki, Albin Cassirer, Fan Yang, Kate Baumli, Sarah Henderson, Abe Friesen, Ruba Haroun, Alex Novikov, Sergio Gómez Colmenarejo, Serkan Cabi, Caglar Gulcehre, Tom Le Paine, Srivatsan Srinivasan, Andrew Cowie, Ziyu Wang, Bilal Piot, Nando de Freitas
These implementations serve both as a validation of our design decisions as well as an important contribution to reproducibility in RL research.
no code implementations • 24 Jun 2019 • Alexis Jacq, Julien Perolat, Matthieu Geist, Olivier Pietquin
We prove that in repeated symmetric games, this algorithm is a learning equilibrium.