no code implementations • 17 Apr 2023 • Xiaowen Shi, Ze Wang, Yuanying Cai, Xiaoxu Wu, Fan Yang, Guogang Liao, Yongkang Wang, Xingxing Wang, Dong Wang
There are two types of data employed to train a reinforcement learning (RL) model for position allocation, namely strategy data and random data.
no code implementations • 3 Mar 2023 • Yuanying Cai, Chuheng Zhang, Wei Shen, Xuyun Zhang, Wenjie Ruan, Longbo Huang
Inspired by the recent success of sequence modeling in RL and the use of masked language models for pre-training, we propose a masked model for pre-training in RL, RePreM (Representation Pre-training with Masked Model), which trains the encoder combined with transformer blocks to predict the masked states or actions in a trajectory.
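The masking step of such an objective can be sketched as follows. This is an illustrative NumPy sketch of BERT-style trajectory masking, not the paper's implementation; the function name, mask ratio, and sentinel value are assumptions.

```python
import numpy as np

def mask_trajectory(trajectory, mask_ratio=0.3, mask_token=-1.0, rng=None):
    """Randomly mask time steps of a (T, d) trajectory of state/action
    features. A model (e.g. an encoder plus transformer blocks) would be
    trained to reconstruct the masked entries from the visible context.

    Returns (masked_inputs, targets, mask): `mask` marks the positions
    the model must predict, `targets` keeps the original values.
    """
    rng = rng or np.random.default_rng(0)
    traj = np.asarray(trajectory, dtype=float)
    T = traj.shape[0]
    mask = rng.random(T) < mask_ratio
    if not mask.any():                 # guarantee at least one masked step
        mask[rng.integers(T)] = True
    masked = traj.copy()
    masked[mask] = mask_token          # replace masked steps with a sentinel
    return masked, traj, mask
```

A reconstruction loss over only the masked positions then drives representation learning, analogous to masked language modeling.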
1 code implementation • 5 Dec 2022 • Yuanying Cai, Chuheng Zhang, Li Zhao, Wei Shen, Xuyun Zhang, Lei Song, Jiang Bian, Tao Qin, Tie-Yan Liu
There are two challenges for this setting: 1) The optimal trade-off between optimizing the RL signal and the behavior cloning (BC) signal varies across states due to the variation in action coverage induced by different behavior policies.
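A state-dependent blend of the two signals can be sketched as below. The weighting scheme (trusting the RL loss more where the behavior policy's action coverage is wide) is an illustrative assumption, not the paper's method; the function and parameter names are hypothetical.

```python
import numpy as np

def mixed_objective(rl_loss, bc_loss, coverage):
    """Blend RL and behavior-cloning (BC) losses with a per-state weight.

    `coverage` in [0, 1] estimates how well the behavior policy covers
    the action space at each state: with wide coverage the RL signal is
    trusted more; with narrow coverage the objective stays close to the
    data via BC. The linear weighting here is purely illustrative.
    """
    coverage = np.clip(np.asarray(coverage, dtype=float), 0.0, 1.0)
    w = coverage                                  # state-dependent trade-off
    return w * np.asarray(rl_loss) + (1.0 - w) * np.asarray(bc_loss)
```

With full coverage the objective reduces to the pure RL loss; with no coverage it reduces to pure BC, capturing why a single fixed trade-off coefficient is suboptimal.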
no code implementations • 11 Jun 2020 • Chuheng Zhang, Yuanying Cai, Longbo Huang, Jian Li
In the planning phase, the agent computes a good policy for any reward function based on the dataset without further interacting with the environment.
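The planning phase described above can be sketched for a small tabular MDP: estimate a transition model from the exploration dataset once, then run value iteration for whatever reward function is supplied, with no further environment interaction. This is a minimal tabular sketch under assumed state/action counts, not the paper's algorithm.

```python
import numpy as np

def empirical_model(transitions, n_states, n_actions):
    """Estimate P(s'|s,a) by counting over an exploration dataset of
    (s, a, s') tuples; unseen (s, a) pairs fall back to uniform."""
    counts = np.zeros((n_states, n_actions, n_states))
    for s, a, s2 in transitions:
        counts[s, a, s2] += 1
    totals = counts.sum(axis=2, keepdims=True)
    return np.divide(counts, totals,
                     out=np.full_like(counts, 1.0 / n_states),
                     where=totals > 0)

def plan(P, reward, gamma=0.9, iters=200):
    """Value iteration on the learned model for an arbitrary (S, A)
    reward table -- no further environment interaction is needed."""
    V = np.zeros(P.shape[0])
    for _ in range(iters):
        Q = reward + gamma * P @ V        # (S, A) action values
        V = Q.max(axis=1)
    return Q.argmax(axis=1)               # greedy policy per state
```

Because the model is estimated once, `plan` can be re-run cheaply for any new reward function, which is the point of the reward-free setting.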