no code implementations • ICLR 2019 • Wesley Chung, Somjit Nath, Ajin Joseph, Martha White
A key component for many reinforcement learning agents is to learn a value function, either for policy evaluation or control.
1 code implementation • 22 Oct 2018 • Samuel Neumann, Sungsu Lim, Ajin Joseph, Yangchen Pan, Adam White, Martha White
We first provide a policy improvement result in an idealized setting, and then prove that our conditional CEM (CCEM) strategy tracks a CEM update per state, even with changing action-values.
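For readers unfamiliar with the cross-entropy method (CEM) mentioned above, the following is a minimal sketch of a generic CEM update over actions for a single fixed state. It is not the authors' CCEM implementation. The names `cem_step` and `toy_q`, the Gaussian sampling distribution, and the values of `n_samples` and `elite_frac` are all assumptions made for illustration. The conditional variant described in the abstract would make the sampling distribution depend on the state, which this sketch does not attempt.

```python
import numpy as np

def cem_step(q_fn, mean, std, n_samples=64, elite_frac=0.25, rng=None):
    """One generic CEM update for a single state (illustrative only).

    q_fn : callable mapping an action vector to a scalar action-value.
    mean, std : parameters of the current Gaussian over actions.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Sample candidate actions from the current distribution.
    actions = rng.normal(mean, std, size=(n_samples, mean.shape[0]))
    # Score each candidate with the action-value function.
    values = np.array([q_fn(a) for a in actions])
    # Keep the top fraction of candidates (the "elite" set).
    n_elite = max(1, int(n_samples * elite_frac))
    elite = actions[np.argsort(values)[-n_elite:]]
    # Refit the Gaussian to the elites; a small floor keeps std positive.
    return elite.mean(axis=0), elite.std(axis=0) + 1e-6

def toy_q(action):
    # Hypothetical action-value for one state, maximized at action = 0.5.
    return -np.sum((action - 0.5) ** 2)

rng = np.random.default_rng(0)
mean, std = np.zeros(1), np.ones(1)
for _ in range(20):
    mean, std = cem_step(toy_q, mean, std, rng=rng)
```

After repeated updates, `mean` moves toward the maximizing action of `toy_q` and `std` shrinks, which is the per-state behavior that the abstract says the conditional strategy tracks.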