no code implementations • NeurIPS 2007 • Chris Atkeson, Benjamin Stephens
We combine two threads of research on approximate dynamic programming: randomly sampling states, and using local trajectory optimizers to globally optimize a policy and its associated value function.