no code implementations • 2 Feb 2024 • Sungee Hong, Zhengling Qi, Raymond K. W. Wong
We consider the problem of distributional off-policy evaluation, which serves as the foundation of many distributional reinforcement learning (DRL) algorithms.
Distributional Reinforcement Learning • Off-policy evaluation