Structure and randomness in planning and reinforcement learning
Planning in large state spaces inevitably needs to balance depth and breadth of the search. It has a crucial impact on planners performance and most manage this interplay implicitly. We present a novel method $\textit{Shoot Tree Search (STS)}$, which makes it possible to control this trade-off more explicitly. Our algorithm can be understood as an interpolation between two celebrated search mechanisms: MCTS and random shooting. It also lets the user control the bias-variance trade-off, akin to $TD(n)$, but in the tree search context. In experiments on challenging domains, we show that STS can get the best of both worlds consistently achieving higher scores.
PDF Abstract NeurIPS Workshop 2020 PDF NeurIPS Workshop 2020 Abstract