2 Jan 2021 • Minbo Gao, Tianle Xie, Simon S. Du, Lin F. Yang
This paper focuses on the linear Markov Decision Process (MDP) recently studied in [Yang et al. 2019, Jin et al. 2020], where linear function approximation is used to generalize across a large state space.
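As a rough illustration of the linear MDP setting, the sketch below assumes the standard definition from the cited line of work: transitions factor as P(s' | s, a) = <phi(s, a), mu(s')> and rewards as r(s, a) = <phi(s, a), theta>. The toy construction (Dirichlet-sampled features, the sizes, and the helper names `make_linear_mdp` and `transition_and_reward`) is hypothetical and not taken from the paper. It only checks that, under this structure, a one-step Bellman backup is linear in the features.

```python
import numpy as np

def make_linear_mdp(n_states=5, n_actions=2, d=3, seed=0):
    """Build a toy linear MDP (illustrative assumption, not the paper's setup)."""
    rng = np.random.default_rng(seed)
    # phi[s, a] lies on the probability simplex in R^d.
    phi = rng.dirichlet(np.ones(d), size=(n_states, n_actions))  # (S, A, d)
    # Each row mu[k] is a probability distribution over next states.
    mu = rng.dirichlet(np.ones(n_states), size=d)                # (d, S)
    # Reward parameter with entries in [0, 1].
    theta = rng.uniform(0.0, 1.0, size=d)                        # (d,)
    return phi, mu, theta

def transition_and_reward(phi, mu, theta):
    """P(s'|s,a) = <phi(s,a), mu(.,s')> and r(s,a) = <phi(s,a), theta>."""
    P = phi @ mu       # (S, A, S)
    r = phi @ theta    # (S, A)
    return P, r

phi, mu, theta = make_linear_mdp()
P, r = transition_and_reward(phi, mu, theta)

# Arbitrary next-step value function V over states.
V = np.random.default_rng(1).uniform(size=phi.shape[0])

# Bellman backup computed directly from P and r ...
q_direct = r + P @ V
# ... equals a linear function of the features with weight w = theta + mu V.
w = theta + mu @ V
q_linear = phi @ w

assert np.allclose(q_direct, q_linear)
```

Because the backup depends on (s, a) only through the d-dimensional feature phi(s, a), an algorithm in this setting can estimate a d-dimensional weight vector rather than a table over all states, which is what makes generalization over a large state space possible.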