1 code implementation • 3 Feb 2023 • Jaime Sabal Bermúdez, Antonio del Rio Chanona, Calvin Tsay
We introduce Distributional Constrained Policy Optimization (DCPO), a novel approach for reliable constraint satisfaction in reinforcement learning (RL).
Distributional Reinforcement Learning • Policy Gradient Methods