no code implementations • 10 Feb 2019 • Andreas Merentitis, Kashif Rasul, Roland Vollgraf, Abdul-Saboor Sheikh, Urs Bergmann
This helps the bandit framework select the best agents early, since these rewards are smoother and less sparse than the environment reward.
no code implementations • 4 Dec 2017 • Abdul-Saboor Sheikh, Kashif Rasul, Andreas Merentitis, Urs Bergmann
This work explores maximum likelihood optimization of neural networks through hypernetworks.