1 code implementation • 25 May 2023 • Dinesh Parthasarathy, Georgios Kontes, Axel Plinge, Christopher Mutschler
We propose Constrained MCTS (C-MCTS), which estimates cost using a safety critic that is trained with Temporal Difference learning in an offline phase prior to agent deployment.
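To make the idea of an offline-trained safety critic concrete, the following is a minimal sketch, not the authors' implementation. It assumes a tabular critic, a SARSA-style TD(0) update, and a logged dataset of `(state, action, cost, next_state, next_action, done)` tuples. The names `train_cost_critic`, `safe_actions`, and `cost_budget` are hypothetical, and using the critic to filter actions against a budget is one assumed way a planner could consume its cost estimates.

```python
from collections import defaultdict


def train_cost_critic(transitions, alpha=0.1, gamma=0.95, epochs=50):
    """Estimate the expected discounted cost Q_c(s, a) with tabular TD(0).

    `transitions` is an offline dataset of
    (state, action, cost, next_state, next_action, done) tuples.
    """
    q_cost = defaultdict(float)  # unseen (state, action) pairs start at 0
    for _ in range(epochs):
        for state, action, cost, next_state, next_action, done in transitions:
            # Bootstrapped target: immediate cost plus the discounted
            # cost estimate of the next state-action pair.
            if done:
                target = cost
            else:
                target = cost + gamma * q_cost[(next_state, next_action)]
            key = (state, action)
            # Move the estimate a fraction alpha towards the TD target.
            q_cost[key] += alpha * (target - q_cost[key])
    return q_cost


def safe_actions(q_cost, state, actions, cost_budget):
    """Keep only actions whose estimated cost is within the budget
    (assumed usage: a planner could prune its expansion this way)."""
    return [a for a in actions if q_cost.get((state, a), 0.0) <= cost_budget]


# Toy offline data: from "s0", the action "risky" incurs cost 1, "safe" incurs 0.
logged = [
    ("s0", "risky", 1.0, "s1", "stay", True),
    ("s0", "safe", 0.0, "s1", "stay", True),
]
critic = train_cost_critic(logged)
allowed = safe_actions(critic, "s0", ["risky", "safe"], cost_budget=0.5)
print(allowed)
```

Because training happens entirely on logged data, the critic is fixed before deployment and the planner only queries it. A function approximator could replace the table under the same update rule.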