no code implementations • 10 Oct 2015 • P. Prasanna, Sarath Chandar, Balaraman Ravindran
In this paper, we propose TSEB, a Thompson Sampling-based algorithm with an adaptive exploration bonus that aims to solve the problem with tighter PAC guarantees while remaining cautious about regret.
2 code implementations • 10 Oct 2015 • Janarthanan Rajendran, Aravind Srinivas, Mitesh M. Khapra, P. Prasanna, Balaraman Ravindran
Second, the agent should be able to transfer selectively, that is, to select and transfer from multiple different source tasks for different parts of the target task's state space.