Bandit Learning in Convex Non-Strictly Monotone Games
We address learning Nash equilibria in convex games under bandit feedback, where each player observes only its own payoff values. We consider the case in which the game pseudo-gradient is monotone but not necessarily strictly monotone. This relaxation of strict monotonicity extends learning algorithms to a broader class of games, for example zero-sum games with a merely convex-concave cost function. We derive an algorithm whose iterates provably converge to the least-norm Nash equilibrium in this setting. From the perspective of a single player running the proposed algorithm, the game is an instance of online optimization. Through this lens, we quantify the algorithm's regret rate and show how to choose its parameters to minimize that rate.
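To make the setting concrete, the following is a minimal sketch, not the paper's algorithm. It uses a hypothetical two-player quadratic game with a non-unique Nash set, a Tikhonov-regularized gradient play (a standard device for selecting the least-norm equilibrium), and a one-point payoff-based gradient estimate of the kind used under bandit feedback. The game, the step-size and regularization schedules, and the action box are all illustrative assumptions; the first part uses exact pseudo-gradients for clarity, whereas the bandit setting would replace them with payoff-only estimates like the one in the second part.

```python
import numpy as np

# Hypothetical game (not from the paper): each player i chooses x_i in [-3, 3]
# and incurs the cost J_i(x) = 0.5 * (x1 + x2)^2.  The pseudo-gradient
# F(x) = (x1 + x2, x1 + x2) is monotone but not strictly monotone: every point
# with x1 + x2 = 0 is a Nash equilibrium, and the least-norm one is (0, 0).

def pseudo_gradient(x):
    s = x[0] + x[1]
    return np.array([s, s])

def tikhonov_play(x0, T=20000):
    """Projected gradient play with a vanishing Tikhonov term eps_t * x,
    which steers the iterates toward the least-norm Nash equilibrium."""
    x = np.array(x0, dtype=float)
    for t in range(T):
        gamma = (t + 1) ** -0.6   # step size (assumed schedule)
        eps = (t + 1) ** -0.3     # regularization weight, vanishing more slowly
        x = x - gamma * (pseudo_gradient(x) + eps * x)
        x = np.clip(x, -3.0, 3.0)  # projection onto the action box
    return x

x_final = tikhonov_play([2.0, -1.0])
print(x_final)  # close to the least-norm equilibrium (0, 0)

# One-point (bandit) gradient estimate: with payoff queries only, player 1 can
# estimate its partial gradient from a single perturbed cost evaluation,
# g = (1/delta) * J_1(x + delta * u * e_1) * u, with u uniform on {-1, +1}.
rng = np.random.default_rng(0)
s, delta = 1.5, 0.1                        # x = (1.0, 0.5), so x1 + x2 = 1.5
u = rng.choice([-1.0, 1.0], size=200_000)  # random perturbation signs
g = 0.5 * (s + delta * u) ** 2 * u / delta  # payoff-based estimates
est = g.mean()
print(est)  # approximately dJ_1/dx_1 = x1 + x2 = 1.5
```

Without the Tikhonov term, gradient play would stop at whichever point of the equilibrium line x1 + x2 = 0 it first reaches; the vanishing regularization is what singles out the minimum-norm element.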