Search Results for author: Daniel Vial

Found 9 papers, 1 paper with code

Collaborative Multi-Agent Heterogeneous Multi-Armed Bandits

no code implementations • 30 May 2023 • Ronshee Chawla, Daniel Vial, Sanjay Shakkottai, R. Srikant

The study of collaborative multi-agent bandits has attracted significant attention recently.

Minimax Regret for Cascading Bandits

no code implementations • 23 Mar 2022 • Daniel Vial, Sujay Sanghavi, Sanjay Shakkottai, R. Srikant

Cascading bandits is a natural and popular model that frames the task of learning to rank from Bernoulli click feedback in a bandit setting.

Learning-To-Rank

Robust Multi-Agent Bandits Over Undirected Graphs

no code implementations • 28 Feb 2022 • Daniel Vial, Sanjay Shakkottai, R. Srikant

Thus, we generalize existing regret bounds beyond the complete graph (where $d_{\text{mal}}(i) = m$), and show the effect of malicious agents is entirely local (in the sense that only the $d_{\text{mal}}(i)$ malicious agents directly connected to $i$ affect its long-term regret).

Improved Algorithms for Misspecified Linear Markov Decision Processes

no code implementations • 12 Sep 2021 • Daniel Vial, Advait Parulekar, Sanjay Shakkottai, R. Srikant

(P1) Its regret after $K$ episodes scales as $K \max \{ \varepsilon_{\text{mis}}, \varepsilon_{\text{tol}} \}$, where $\varepsilon_{\text{mis}}$ is the degree of misspecification and $\varepsilon_{\text{tol}}$ is a user-specified error tolerance.

Multi-Armed Bandits

Regret Bounds for Stochastic Shortest Path Problems with Linear Function Approximation

no code implementations • 4 May 2021 • Daniel Vial, Advait Parulekar, Sanjay Shakkottai, R. Srikant

We propose an algorithm that uses linear function approximation (LFA) for stochastic shortest path (SSP).

One-bit feedback is sufficient for upper confidence bound policies

no code implementations • 4 Dec 2020 • Daniel Vial, Sanjay Shakkottai, R. Srikant

We consider a variant of the traditional multi-armed bandit problem in which each arm is only able to provide one-bit feedback during each pull based on its past history of rewards.

Robust Multi-Agent Multi-Armed Bandits

no code implementations • 7 Jul 2020 • Daniel Vial, Sanjay Shakkottai, R. Srikant

Recent works have shown that agents facing independent instances of a stochastic $K$-armed bandit can collaborate to decrease regret.

Distributed Computing, Multi-Armed Bandits, +1

Empirical Policy Evaluation with Supergraphs

no code implementations • 18 Feb 2020 • Daniel Vial, Vijay Subramanian

We devise and analyze algorithms for the empirical policy evaluation problem in reinforcement learning.

reinforcement-learning, Reinforcement Learning (RL)

On the role of clustering in Personalized PageRank estimation

1 code implementation • 4 Jun 2017 • Daniel Vial, Vijay Subramanian

We then show that the common underlying graph can be leveraged to efficiently and jointly estimate PPR for many pairs, rather than treating each pair separately using the primitive algorithm.

Social and Information Networks
