Search Results for author: Lei Ying

Found 24 papers, 4 papers with code

Learning-Based Pricing and Matching for Two-Sided Queues

no code implementations • 17 Mar 2024 • Zixian Yang, Lei Ying

We prove that our proposed algorithm yields a sublinear regret $\tilde{O}(T^{5/6})$ and queue-length bound $\tilde{O}(T^{2/3})$, where $T$ is the time horizon.

Cost Aware Best Arm Identification

no code implementations • 26 Feb 2024 • Kellen Kanarios, Qining Zhang, Lei Ying

In this paper, we study a best arm identification problem with dual objects.

Safe Reinforcement Learning with Instantaneous Constraints: The Role of Aggressive Exploration

no code implementations • 22 Dec 2023 • Honghao Wei, Xin Liu, Lei Ying

This paper studies safe Reinforcement Learning (safe RL) with linear function approximation and under hard instantaneous constraints where unsafe actions must be avoided at each step.

4k reinforcement-learning +1

Model-Free, Regret-Optimal Best Policy Identification in Online CMDPs

no code implementations • 27 Sep 2023 • Zihan Zhou, Honghao Wei, Lei Ying

PRI achieves three objectives: (i) PRI is a model-free algorithm; (ii) it outputs an approximately optimal policy with high probability at the end of learning; and (iii) PRI guarantees $\tilde{\mathcal{O}}(H\sqrt{K})$ regret and constraint violation, which significantly improves the best existing regret bound $\tilde{\mathcal{O}}(H^4 \sqrt{SA}K^{\frac{4}{5}})$ under a model-free algorithm, where $H$ is the length of each episode, $S$ is the number of states, $A$ is the number of actions, and the total number of episodes during learning is $2K+\tilde{\mathcal{O}}(K^{0.25})$. We further present a matching lower bound via an example showing that, under any online learning algorithm, there exists a well-separated CMDP instance such that either the regret or the violation has to be $\Omega(H\sqrt{K})$, which matches the upper bound up to a polylogarithmic factor.

Reconstructing Graph Diffusion History from a Single Snapshot

1 code implementation • 1 Jun 2023 • Ruizhong Qiu, Dingsu Wang, Lei Ying, H. Vincent Poor, Yifang Zhang, Hanghang Tong

They are exclusively based on the maximum likelihood estimation (MLE) formulation and require knowledge of the true diffusion parameters.

Provably Efficient Model-Free Algorithms for Non-stationary CMDPs

no code implementations • 10 Mar 2023 • Honghao Wei, Arnob Ghosh, Ness Shroff, Lei Ying, Xingyu Zhou

We study model-free reinforcement learning (RL) algorithms in episodic non-stationary constrained Markov Decision Processes (CMDPs), in which an agent aims to maximize the expected cumulative reward subject to a cumulative constraint on the expected utility (cost).

Reinforcement Learning (RL)

Online Nonstochastic Control with Adversarial and Static Constraints

no code implementations • 5 Feb 2023 • Xin Liu, Zixian Yang, Lei Ying

This subroutine also achieves the state-of-the-art regret and constraint violation bounds for constrained online convex optimization problems, which is of independent interest.

On the Global Convergence of Risk-Averse Policy Gradient Methods with Expected Conditional Risk Measures

no code implementations • 26 Jan 2023 • Xian Yu, Lei Ying

Risk-sensitive reinforcement learning (RL) has become a popular tool to control the risk of uncertain outcomes and ensure reliable performance in various sequential decision-making problems.

Decision Making Policy Gradient Methods +1

Network Utility Maximization with Unknown Utility Functions: A Distributed, Data-Driven Bilevel Optimization Approach

no code implementations • 4 Jan 2023 • Kaiyi Ji, Lei Ying

In this paper, we provide a new solution using a distributed and data-driven bilevel optimization approach, where the lower level is a distributed network utility maximization (NUM) algorithm with concave surrogate utility functions, and the upper level is a data-driven learning algorithm to find the best surrogate utility functions that maximize the sum of true network utility.

Bilevel Optimization

Scalable and Sample Efficient Distributed Policy Gradient Algorithms in Multi-Agent Networked Systems

no code implementations • 13 Dec 2022 • Xin Liu, Honghao Wei, Lei Ying

The proposed algorithm is distributed in two aspects: (i) the learned policy is a distributed policy that maps a local state of an agent to its local action and (ii) the learning/training is distributed, during which each agent updates its policy based on its own and neighbors' information.

Multi-agent Reinforcement Learning reinforcement-learning +1

Learning While Scheduling in Multi-Server Systems with Unknown Statistics: MaxWeight with Discounted UCB

no code implementations • 2 Sep 2022 • Zixian Yang, R. Srikant, Lei Ying

We prove that under our algorithm the asymptotic average queue length is bounded by one divided by the traffic slackness, which is order-wise optimal.

Scheduling

Will Bilevel Optimizers Benefit from Loops

no code implementations • 27 May 2022 • Kaiyi Ji, Mingrui Liu, Yingbin Liang, Lei Ying

Existing studies in the literature cover only some of those implementation choices, and the complexity bounds available are not refined enough to enable rigorous comparison among different implementations.

Bilevel Optimization Computational Efficiency

Exploration, Exploitation, and Engagement in Multi-Armed Bandits with Abandonment

no code implementations • 26 May 2022 • Zixian Yang, Xin Liu, Lei Ying

To understand the exploration, exploitation, and engagement in these systems, we propose a new model, called MAB-A where "A" stands for abandonment and the abandonment probability depends on the current recommended item and the user's past experience (called state).

Multi-Armed Bandits Q-Learning +1

Obstacle Avoidance for UAS in Continuous Action Space Using Deep Reinforcement Learning

no code implementations • 13 Nov 2021 • Jueming Hu, Xuxi Yang, Weichang Wang, Peng Wei, Lei Ying, Yongming Liu

Obstacle avoidance for small unmanned aircraft is vital for the safety of future urban air mobility (UAM) and Unmanned Aircraft System (UAS) Traffic Management (UTM).

Continuous Control Management +2

A Provably-Efficient Model-Free Algorithm for Constrained Markov Decision Processes

no code implementations • 3 Jun 2021 • Honghao Wei, Xin Liu, Lei Ying

This paper presents the first model-free, simulator-free reinforcement learning algorithm for Constrained Markov Decision Processes (CMDPs) with sublinear regret and zero constraint violation.

An Efficient Pessimistic-Optimistic Algorithm for Stochastic Linear Bandits with General Constraints

no code implementations • NeurIPS 2021 • Xin Liu, Bin Li, Pengyi Shi, Lei Ying

Thus, the overall computational complexity of our algorithm is similar to that of the linear UCB for unconstrained stochastic linear bandits.

POND: Pessimistic-Optimistic oNline Dispatching

no code implementations • 20 Oct 2020 • Xin Liu, Bin Li, Pengyi Shi, Lei Ying

This paper considers constrained online dispatching with unknown arrival, reward and constraint distributions.

FORK: A Forward-Looking Actor For Model-Free Reinforcement Learning

2 code implementations • 4 Oct 2020 • Honghao Wei, Lei Ying

In this paper, we propose a new type of Actor, named forward-looking Actor or FORK for short, for Actor-Critic algorithms.

reinforcement-learning Reinforcement Learning (RL)

The Mean-Squared Error of Double Q-Learning

1 code implementation • NeurIPS 2020 • Wentao Weng, Harsh Gupta, Niao He, Lei Ying, R. Srikant

In this paper, we establish a theoretical comparison between the asymptotic mean-squared error of Double Q-learning and Q-learning.

Q-Learning

Finite-Time Performance Bounds and Adaptive Learning Rate Selection for Two Time-Scale Reinforcement Learning

1 code implementation • NeurIPS 2019 • Harsh Gupta, R. Srikant, Lei Ying

We study two time-scale linear stochastic approximation algorithms, which can be used to model well-known reinforcement learning algorithms such as GTD, GTD2, and TDC.

reinforcement-learning Reinforcement Learning (RL)

QuickStop: A Markov Optimal Stopping Approach for Quickest Misinformation Detection

no code implementations • 4 Mar 2019 • Honghao Wei, Xiaohan Kang, Weina Wang, Lei Ying

The algorithm consists of an offline machine learning algorithm for learning the probabilistic information spreading model and an online optimal stopping algorithm to detect misinformation.

Misinformation

Finite-Time Error Bounds For Linear Stochastic Approximation and TD Learning

no code implementations • 3 Feb 2019 • R. Srikant, Lei Ying

We consider the dynamics of a linear stochastic approximation algorithm driven by Markovian noise, and derive finite-time bounds on the moments of the error, i.e., deviation of the output of the algorithm from the equilibrium point of an associated ordinary differential equation (ODE).

Collaborative Filtering with Information-Rich and Information-Sparse Entities

no code implementations • 6 Mar 2014 • Kai Zhu, Rui Wu, Lei Ying, R. Srikant

In particular, we consider both the clustering model, where only users (or items) are clustered, and the co-clustering model, where both users and items are clustered, and further, we assume that some users rate many items (information-rich users) and some users rate only a few items (information-sparse users).

Clustering Collaborative Filtering +1

Jointly Clustering Rows and Columns of Binary Matrices: Algorithms and Trade-offs

no code implementations • 1 Oct 2013 • Jiaming Xu, Rui Wu, Kai Zhu, Bruce Hajek, R. Srikant, Lei Ying

In standard clustering problems, data points are represented by vectors, and by stacking them together, one forms a data matrix with row or column cluster structure.

Clustering
