22 Oct 2023 • Siddhant Chaudhary, Abhishek Sinha
In this paper, we consider the $\alpha$-Fair Contextual Bandits problem in the adversarial setting, where the objective is to maximize the global $\alpha$-fair utility function, a non-decreasing concave function of the cumulative rewards.
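For orientation, the snippet below sketches the textbook $\alpha$-fair utility, $x^{1-\alpha}/(1-\alpha)$ for $\alpha \neq 1$ and $\log x$ for $\alpha = 1$, summed over a vector of cumulative rewards. This is an assumption about the general shape of such an objective, not the paper's exact definition (the abstract does not state the formula or the admissible range of $\alpha$), and the function names `alpha_fair_utility` and `global_alpha_fair_utility` are hypothetical.

```python
import math
from typing import Iterable


def alpha_fair_utility(x: float, alpha: float) -> float:
    """Textbook alpha-fair utility of a positive quantity x.

    Assumed form (not taken from the paper):
      x**(1 - alpha) / (1 - alpha)  for alpha >= 0, alpha != 1
      log(x)                        for alpha == 1
    """
    if x <= 0:
        raise ValueError("x must be positive")
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    if alpha == 1.0:
        return math.log(x)
    return x ** (1.0 - alpha) / (1.0 - alpha)


def global_alpha_fair_utility(cumulative_rewards: Iterable[float], alpha: float) -> float:
    """Sum of per-entry alpha-fair utilities (hypothetical aggregate for illustration)."""
    return sum(alpha_fair_utility(r, alpha) for r in cumulative_rewards)
```

With $\alpha = 0$ the aggregate reduces to the plain sum of cumulative rewards; larger $\alpha$ penalizes uneven reward vectors more strongly, which is the sense in which the utility is non-decreasing and concave.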
28 Sep 2022 • Sourav Sahoo, Siddhant Chaudhary, Samrat Mukhopadhyay, Abhishek Sinha
We propose an online learning policy called SCore (Subset Selection with Core) that solves the problem for a large class of reward functions.