no code implementations • 23 Feb 2023 • Saba Ahmadi, Avrim Blum, Kunhe Yang
For instance, whereas in the non-strategic case, a mistake bound of $\ln|H|$ is achievable via the halving algorithm when the target function belongs to a known class $H$, we show that no deterministic algorithm can achieve a mistake bound $o(\Delta)$ in the strategic setting, where $\Delta$ is the maximum degree of the manipulation graph (even when $|H|=O(\Delta)$).
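The non-strategic baseline mentioned above, the halving algorithm, can be sketched in a few lines: maintain the version space of hypotheses consistent with the labels seen so far, predict by majority vote, and on each mistake discard the (at least half of the) hypotheses that erred, giving at most $\log_2|H|$ mistakes. The toy threshold class and sample stream below are illustrative, not from the paper.

```python
def halving_predict(version_space, x):
    """Majority vote of the surviving hypotheses."""
    votes = sum(1 for h in version_space if h(x) == 1)
    return 1 if 2 * votes >= len(version_space) else 0

def halving_update(version_space, x, y):
    """Keep only hypotheses consistent with the revealed label."""
    return [h for h in version_space if h(x) == y]

# Toy class H: threshold functions on {0, ..., 8}.
H = [lambda x, t=t: 1 if x >= t else 0 for t in range(9)]
target = H[5]  # realizable setting: target is in H

V = list(H)
mistakes = 0
for x in [0, 7, 3, 5, 4, 6, 2]:
    y_hat = halving_predict(V, x)
    y = target(x)
    if y_hat != y:
        mistakes += 1  # each mistake at least halves |V|
    V = halving_update(V, x, y)

# Mistake bound: at most log2(|H|) = log2(9) < 4 mistakes.
```

Each mistake shrinks the version space by at least half (the majority was wrong), which is the source of the logarithmic bound that the paper shows cannot survive strategic manipulation.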
no code implementations • 1 Nov 2022 • Paria Rashidinejad, Hanlin Zhu, Kunhe Yang, Stuart Russell, Jiantao Jiao
Offline reinforcement learning (RL), which refers to decision-making from a previously collected dataset of interactions, has received significant attention in recent years.

no code implementations • 17 Feb 2022 • Nika Haghtalab, Yanjun Han, Abhishek Shetty, Kunhe Yang
For the smoothed analysis setting, our results give the first oracle-efficient algorithm for online learning with smoothed adversaries [HRS22].
no code implementations • 16 Jun 2020 • Kunhe Yang, Lin F. Yang, Simon S. Du
This paper presents the first non-asymptotic result showing that a model-free algorithm can achieve a logarithmic cumulative regret for episodic tabular reinforcement learning if there exists a strictly positive sub-optimality gap in the optimal $Q$-function.
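The sub-optimality gap in the condition above is the smallest positive amount by which a sub-optimal action's optimal Q-value falls short of the optimal value $V^*(s) = \max_a Q^*(s,a)$. A minimal sketch of computing it, on a made-up 3-state, 2-action table (the Q-values are illustrative, not from the paper):

```python
import numpy as np

# Toy optimal Q-function: rows are states, columns are actions.
Q_star = np.array([[1.0, 0.4],
                   [0.8, 0.8],   # tied actions contribute no gap
                   [0.2, 0.9]])

# V*(s) = max_a Q*(s, a)
V_star = Q_star.max(axis=1)

# Gap of each (s, a): V*(s) - Q*(s, a); zero for optimal actions.
gaps = V_star[:, None] - Q_star

# Minimum gap over strictly sub-optimal pairs (the paper's condition
# requires this to be strictly positive).
positive_gaps = gaps[gaps > 0]
gap_min = positive_gaps.min() if positive_gaps.size else 0.0
```

When `gap_min > 0`, gap-dependent analyses can replace the usual $\sqrt{T}$ worst-case regret with a bound scaling like $\log T$, which is the regime this result establishes for a model-free algorithm.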