no code implementations • 9 Oct 2019 • Victor Gabillon, Rasul Tutunov, Michal Valko, Haitham Bou Ammar
In this paper, we formalise order-robust optimisation as an instance of online learning minimising simple regret, and propose Vroom, a zeroth-order optimisation algorithm capable of achieving vanishing regret in non-stationary environments, while recovering favourable rates under stochastic reward-generating processes.
no code implementations • 3 Sep 2019 • Vasco Lopes, Fabio Maria Carlucci, Pedro M Esperança, Marco Singh, Victor Gabillon, Antoine Yang, Hang Xu, Zewei Chen, Jun Wang
The Neural Architecture Search (NAS) problem is typically formulated as a graph search problem where the goal is to learn the optimal operations over edges in order to maximise a graph-level global objective.
no code implementations • 1 Oct 2018 • Peter L. Bartlett, Victor Gabillon, Michal Valko
The difficulty of optimization is measured in terms of 1) the amount of \emph{noise} $b$ of the function evaluation and 2) the local smoothness, $d$, of the function.
no code implementations • NeurIPS 2017 • Yasin Abbasi, Peter L. Bartlett, Victor Gabillon
We study minimax strategies for the online prediction problem with expert advice.
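The paper analyses exact minimax strategies; as a point of reference, here is a minimal sketch of the standard (non-minimax) exponentially weighted forecaster for prediction with expert advice, using the classical learning-rate tuning. All names are illustrative, not taken from the paper:

```python
import numpy as np

def exponential_weights_regret(loss_matrix, eta=None):
    """Exponentially weighted average forecaster for prediction with
    expert advice (a standard baseline, not the paper's minimax strategy).
    loss_matrix: (T, K) array of losses in [0, 1] for K experts over T rounds.
    Returns the learner's regret against the best fixed expert."""
    T, K = loss_matrix.shape
    if eta is None:
        eta = np.sqrt(8 * np.log(K) / T)   # classical tuning
    weights = np.ones(K)
    learner_loss = 0.0
    for losses in loss_matrix:
        p = weights / weights.sum()        # play the normalised weights
        learner_loss += p @ losses         # expected loss this round
        weights *= np.exp(-eta * losses)   # exponential down-weighting
    return learner_loss - loss_matrix.sum(axis=0).min()

rng = np.random.default_rng(0)
L = rng.uniform(size=(1000, 5))
r = exponential_weights_regret(L)
```

With this tuning the regret is guaranteed to be at most sqrt((T/2) ln K) for losses in [0, 1], which is what the check below verifies.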
no code implementations • 19 Oct 2016 • Yasin Abbasi-Yadkori, Peter L. Bartlett, Victor Gabillon, Alan Malek
We propose the Hit-and-Run algorithm for planning and sampling problems in non-convex spaces.
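The paper extends Hit-and-Run to non-convex spaces; as background, here is a minimal sketch of the classic convex-body version driven by a membership oracle, with chord endpoints found by bisection. Function names and parameters are illustrative:

```python
import numpy as np

def chord_endpoint(x, d, inside, t_max=10.0, iters=40):
    """Largest t with x + t*d inside the set, found by bisection.
    Assumes inside(x) is True and the set fits within radius t_max."""
    lo, hi = 0.0, t_max
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if inside(x + mid * d):
            lo = mid
        else:
            hi = mid
    return lo

def hit_and_run(x0, inside, n_samples=500, rng=None):
    """Classic Hit-and-Run: from the current point, draw a uniformly
    random direction, then jump to a uniform point on the chord of the
    set through the current point in that direction."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x0, dtype=float)
    samples = []
    for _ in range(n_samples):
        d = rng.normal(size=x.shape)
        d /= np.linalg.norm(d)            # uniform direction on the sphere
        t_plus = chord_endpoint(x, d, inside)
        t_minus = chord_endpoint(x, -d, inside)
        x = x + rng.uniform(-t_minus, t_plus) * d
        samples.append(x.copy())
    return np.array(samples)

# Example: approximately uniform samples from the unit ball in R^2.
ball = lambda p: np.linalg.norm(p) <= 1.0
pts = hit_and_run(np.zeros(2), ball, n_samples=500, rng=0)
```

For a convex set every point on the sampled chord lies inside the set, so the chain never leaves it; handling non-convex sets (where a line may exit and re-enter) is exactly the difficulty the paper addresses.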
no code implementations • NeurIPS 2013 • Victor Gabillon, Mohammad Ghavamzadeh, Bruno Scherrer
A close look at the literature on this game shows that while ADP algorithms, which have been (almost) entirely based on approximating the value function, have performed poorly in Tetris, the methods that search directly in the space of policies by learning the policy parameters with a black-box optimiser, such as the cross-entropy (CE) method, have achieved the best reported results.
no code implementations • NeurIPS 2013 • Victor Gabillon, Branislav Kveton, Zheng Wen, Brian Eriksson, S. Muthukrishnan
Maximization of submodular functions has wide applications in machine learning and artificial intelligence.
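The paper studies submodular maximisation under bandit feedback; as background, here is a minimal sketch of the classic offline greedy algorithm for monotone submodular maximisation under a cardinality constraint (the Nemhauser–Wolsey–Fisher (1 − 1/e) guarantee), illustrated on maximum coverage, a canonical submodular objective. Names are illustrative:

```python
def greedy_max_coverage(sets, k):
    """Greedy maximisation of the coverage function f(S) = |union of S|,
    which is monotone submodular: pick k sets, each time the one with
    the largest marginal gain over the elements already covered."""
    chosen, covered = [], set()
    for _ in range(k):
        gains = [len(s - covered) if i not in chosen else -1
                 for i, s in enumerate(sets)]
        best = max(range(len(sets)), key=lambda i: gains[i])
        if gains[best] <= 0:
            break                      # no set adds anything new
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered

sets = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1, 7}]
picked, covered = greedy_max_coverage(sets, k=2)
```

Here the greedy rule first takes the 4-element set, then the 3-element set, covering all 7 elements with 2 picks. The bandit setting of the paper replaces the exact marginal gains with noisy feedback.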
no code implementations • NeurIPS 2012 • Victor Gabillon, Mohammad Ghavamzadeh, Alessandro Lazaric
We study the problem of identifying the best arm(s) in the stochastic multi-armed bandit setting.
no code implementations • 14 May 2012 • Bruno Scherrer, Victor Gabillon, Mohammad Ghavamzadeh, Matthieu Geist
Modified policy iteration (MPI) is a dynamic programming (DP) algorithm that contains the two celebrated policy and value iteration methods.
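The interpolation MPI performs can be sketched in a few lines on a finite MDP: a greedy policy step followed by m applications of that policy's Bellman operator, so that m = 1 recovers value iteration and m → ∞ recovers policy iteration. This is a minimal illustration with made-up variable names, not the paper's approximate setting:

```python
import numpy as np

def modified_policy_iteration(P, R, gamma=0.9, m=5, n_iter=200):
    """MPI on a finite MDP.
    P: (A, S, S) transition tensor, R: (A, S) rewards.
    Each sweep: greedy policy w.r.t. V, then m partial-evaluation steps."""
    A, S, _ = P.shape
    V = np.zeros(S)
    for _ in range(n_iter):
        Q = R + gamma * (P @ V)          # (A, S) action values
        pi = Q.argmax(axis=0)            # greedy policy
        for _ in range(m):               # m applications of T_pi
            Q = R + gamma * (P @ V)
            V = Q[pi, np.arange(S)]
    return V, pi

# Tiny 2-state MDP: action 1 from state 0 pays 1 and moves to the
# absorbing state 1; everything else pays 0.
P = np.array([[[1., 0.], [0., 1.]],     # action 0: stay put
              [[0., 1.], [0., 1.]]])    # action 1: go to state 1
R = np.array([[0., 0.],
              [1., 0.]])
V, pi = modified_policy_iteration(P, R)
```

The optimal policy takes action 1 in state 0 for value 1, then collects nothing in the absorbing state, so V ≈ (1, 0).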
no code implementations • NeurIPS 2011 • Victor Gabillon, Mohammad Ghavamzadeh, Alessandro Lazaric, Sébastien Bubeck
We first propose an algorithm called Gap-based Exploration (GapE) that focuses on the arms whose mean is close to the mean of the best arm in the same bandit (i.e., arms with a small gap).
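A simplified single-bandit sketch of a GapE-style index policy, assuming Bernoulli rewards: pull the arm maximising −(estimated gap) + sqrt(a / pulls), so that small-gap and under-sampled arms get priority. The function name and the exploration parameter `a` are illustrative, not the paper's tuned constant:

```python
import numpy as np

def gap_based_exploration(true_means, budget, a=4.0, seed=0):
    """GapE-style best-arm identification with a fixed budget of pulls.
    gap of arm i = distance between its empirical mean and the best
    *other* empirical mean; index_i = -gap_i + sqrt(a / T_i)."""
    rng = np.random.default_rng(seed)
    K = len(true_means)
    pulls = np.zeros(K, dtype=int)
    sums = np.zeros(K)
    for k in range(K):                       # initialise: pull each arm once
        sums[k] += rng.random() < true_means[k]
        pulls[k] += 1
    for _ in range(budget - K):
        mu = sums / pulls
        best = mu.max()
        second = np.partition(mu, -2)[-2]    # second-largest empirical mean
        gaps = np.where(mu == best, best - second, best - mu)
        idx = -gaps + np.sqrt(a / pulls)     # favour small gaps, few pulls
        k = int(idx.argmax())
        sums[k] += rng.random() < true_means[k]
        pulls[k] += 1
    return int((sums / pulls).argmax())      # recommend the empirical best

arm = gap_based_exploration([0.1, 0.9], budget=1000)
```

With a large gap and a generous budget, the recommended arm is the true best one; the interesting regime for the analysis is when several gaps are small and the budget must be allocated across them.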