Search Results for author: Siliang Zeng

Found 8 papers, 2 papers with code

Getting More Juice Out of the SFT Data: Reward Learning from Human Demonstration Improves SFT for LLM Alignment

no code implementations • 28 May 2024 • Jiaxiang Li, Siliang Zeng, Hoi-To Wai, Chenliang Li, Alfredo Garcia, Mingyi Hong

Moreover, we identify a connection between the proposed IRL-based approach and a recently proposed self-play approach, and show that self-play is a special case of modeling a reward-learning agent.

A Bayesian Approach to Robust Inverse Reinforcement Learning

2 code implementations • 15 Sep 2023 • Ran Wei, Siliang Zeng, Chenliang Li, Alfredo Garcia, Anthony McDonald, Mingyi Hong

We consider a Bayesian approach to offline model-based inverse reinforcement learning (IRL).

Imitation Learning • reinforcement-learning

When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning

1 code implementation • NeurIPS 2023 • Siliang Zeng, Chenliang Li, Alfredo Garcia, Mingyi Hong

Offline inverse reinforcement learning (Offline IRL) aims to recover the structure of rewards and environment dynamics that underlie observed actions in a fixed, finite set of demonstrations from an expert agent.

Autonomous Driving • Continuous Control • +2

Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time Guarantees

no code implementations • 4 Oct 2022 • Siliang Zeng, Chenliang Li, Alfredo Garcia, Mingyi Hong

To reduce the computational burden of a nested loop, novel methods such as SQIL [1] and IQ-Learn [2] emphasize policy estimation at the expense of reward estimation accuracy.

counterfactual • Imitation Learning • +2

Structural Estimation of Markov Decision Processes in High-Dimensional State Space with Finite-Time Guarantees

no code implementations • 4 Oct 2022 • Siliang Zeng, Mingyi Hong, Alfredo Garcia

Other approaches in the inverse reinforcement learning (IRL) literature emphasize policy estimation at the expense of reward estimation accuracy.

Imitation Learning

Learning to Coordinate in Multi-Agent Systems: A Coordinated Actor-Critic Algorithm and Finite-Time Guarantees

no code implementations • 11 Oct 2021 • Siliang Zeng, Tianyi Chen, Alfredo Garcia, Mingyi Hong

The flexibility in our design allows the proposed MARL-CAC algorithm to be used in a fully decentralized setting, where the agents can only communicate with their neighbors, as well as a federated setting, where the agents occasionally communicate with a server while optimizing their (partially personalized) local models.

Multi-agent Reinforcement Learning

A Near-Optimal Algorithm for Stochastic Bilevel Optimization via Double-Momentum

no code implementations • NeurIPS 2021 • Prashant Khanduri, Siliang Zeng, Mingyi Hong, Hoi-To Wai, Zhaoran Wang, Zhuoran Yang

We focus on bilevel problems where the lower-level subproblem is strongly convex and the upper-level objective function is smooth.

Bilevel Optimization • Hyperparameter Optimization

On the Divergence of Decentralized Non-Convex Optimization

no code implementations • 20 Jun 2020 • Mingyi Hong, Siliang Zeng, Junyu Zhang, Haoran Sun

However, by constructing counterexamples, we show that when certain local Lipschitz conditions (LLC) on the local function gradients $\nabla f_i$ are not satisfied, most existing decentralized algorithms diverge, even if the global Lipschitz condition (GLC), under which the sum function $f$ has a Lipschitz gradient, is satisfied.

Open-Ended Question Answering
