no code implementations • 30 Nov 2023 • Jeremy McMahan, Young Wu, Xiaojin Zhu, Qiaomin Xie
Although the defense problem is NP-hard, we show that optimal Markovian defenses can be computed (learned) in polynomial time (sample complexity) in many scenarios.
1 code implementation • 9 Nov 2023 • Jeremy McMahan, Xiaojin Zhu
Our reduction yields planning and learning algorithms that are time and sample-efficient for tabular cMDPs so long as the precision of the costs is logarithmic in the size of the cMDP.
no code implementations • 1 Nov 2023 • Young Wu, Jeremy McMahan, Yiding Chen, Yudong Chen, Xiaojin Zhu, Qiaomin Xie
We study the game modification problem, in which a benevolent game designer or a malevolent adversary modifies the reward function of a zero-sum Markov game at minimum modification cost. The goal is to make a target deterministic or stochastic policy profile the unique Markov perfect Nash equilibrium, with a value lying within a target range.
1 code implementation • 18 Jul 2023 • Jeremy McMahan, Young Wu, Yudong Chen, Xiaojin Zhu, Qiaomin Xie
Many real-world games suffer from information asymmetry: one player is only aware of their own payoffs while the other player has the full game information.
no code implementations • 13 Jun 2023 • Young Wu, Jeremy McMahan, Xiaojin Zhu, Qiaomin Xie
We characterize offline data poisoning attacks on Multi-Agent Reinforcement Learning (MARL), where an attacker may change a data set in an attempt to install a (potentially fictitious) unique Markov-perfect Nash equilibrium.
no code implementations • 4 Jun 2022 • Young Wu, Jeremy McMahan, Xiaojin Zhu, Qiaomin Xie
In offline multi-agent reinforcement learning (MARL), agents estimate policies from a given dataset.
no code implementations • 30 Aug 2021 • Shuchi Chawla, Evangelia Gergatsouli, Jeremy McMahan, Christos Tzamos
For distributions of support $m$, UDT admits a $\log m$ approximation; while a constant-factor approximation in polynomial time is a long-standing open problem, constant-factor approximations are achievable in subexponential time (arXiv:1906.11385).