no code implementations • 1 Nov 2022 • Yifei Wang, Tavor Baharav, Yanjun Han, Jiantao Jiao, David Tse
In the infinite-armed bandit problem, each arm's average reward is sampled from an unknown distribution, and each arm can be sampled further to obtain noisy estimates of the average reward of that arm.
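The setup described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the reservoir distribution (uniform on [0, 1]), the noise model (unit-variance Gaussian), and the names `InfiniteArmedBandit`, `new_arm`, `pull`, and `estimate_mean` are hypothetical choices for illustration and are not taken from the paper.

```python
import random


class InfiniteArmedBandit:
    """Toy environment: every newly drawn arm gets a hidden mean reward."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.means = []  # hidden average rewards, one per arm drawn so far

    def new_arm(self):
        # Assumption: arm means come from Uniform(0, 1). In the problem
        # statement this distribution is unknown to the learner.
        self.means.append(self.rng.random())
        return len(self.means) - 1

    def pull(self, arm):
        # Assumption: each pull returns the arm's mean plus N(0, 1) noise.
        return self.means[arm] + self.rng.gauss(0.0, 1.0)


def estimate_mean(env, arm, n_pulls):
    """Average n_pulls noisy samples to estimate one arm's mean reward."""
    return sum(env.pull(arm) for _ in range(n_pulls)) / n_pulls


if __name__ == "__main__":
    env = InfiniteArmedBandit(seed=1)
    arm = env.new_arm()
    print("estimate:", estimate_mean(env, arm, 1000))
    print("hidden mean:", env.means[arm])
```

Pulling the same arm more often tightens the estimate of its mean, while drawing a new arm explores the unknown distribution; the sketch only exposes those two actions and does not implement any particular selection strategy.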
1 code implementation • NeurIPS 2019 • Tavor Baharav, David Tse
Gains of four to five orders of magnitude over exact computation are obtained on real data, in both the number of distance computations needed and wall-clock time.