Approximate Top-$m$ Arm Identification with Heterogeneous Reward Variances

11 Apr 2022  ·  Ruida Zhou, Chao Tian ·

We study the effect of reward variance heterogeneity in the approximate top-$m$ arm identification setting. In this setting, the reward for the $i$-th arm follows a $\sigma^2_i$-sub-Gaussian distribution, and the agent needs to incorporate this knowledge to minimize the expected number of arm pulls to identify $m$ arms with the largest means within error $\epsilon$ out of the $n$ arms, with probability at least $1-\delta$. We show that the worst-case sample complexity of this problem is $$\Theta\left( \sum_{i =1}^n \frac{\sigma_i^2}{\epsilon^2} \ln\frac{1}{\delta} + \sum_{i \in G^{m}} \frac{\sigma_i^2}{\epsilon^2} \ln(m) + \sum_{j \in G^{l}} \frac{\sigma_j^2}{\epsilon^2} \text{Ent}(\sigma^2_{G^{r}}) \right),$$ where $G^{m}, G^{l}, G^{r}$ are certain specific subsets of the overall arm set $\{1, 2, \ldots, n\}$, and $\text{Ent}(\cdot)$ is an entropy-like function which measures the heterogeneity of the variance proxies. The upper bound of the complexity is obtained using a divide-and-conquer style algorithm, while the matching lower bound relies on the study of a dual formulation.

PDF Abstract
No code implementations yet. Submit your code now

Tasks


Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here