no code implementations • 14 May 2024 • Yiwen Zhu, Jinyi Liu, Wenya Wei, Qianyi Fu, Yujing Hu, Zhou Fang, Bo An, Jianye Hao, Tangjie Lv, Changjie Fan
Enhancing learning efficiency remains a key challenge in RL, with many efforts focused on using ensemble critics to boost policy evaluation efficiency.
no code implementations • 6 Mar 2024 • Yibin Chen, Yifu Yuan, Zeyu Zhang, Yan Zheng, Jinyi Liu, Fei Ni, Jianye Hao
To bridge the gap with the real-world requirements, we introduce $\textbf{SheetRM}$, a benchmark featuring long-horizon and multi-category tasks with reasoning-dependent manipulation caused by real-life challenges.
no code implementations • 22 Feb 2024 • Jinyi Liu, Yifu Yuan, Jianye Hao, Fei Ni, Lingzhi Fu, Yibin Chen, Yan Zheng
Recently, there has been considerable attention towards leveraging large language models (LLMs) to enhance decision-making processes.
1 code implementation • 4 Feb 2024 • Yifu Yuan, Jianye Hao, Yi Ma, Zibin Dong, Hebin Liang, Jinyi Liu, Zhixin Feng, Kai Zhao, Yan Zheng
It is crucial to consider diverse human feedback types and various learning methods in different environments.
no code implementations • 3 Jan 2024 • YanJie Li, Weijun Li, Lina Yu, Min Wu, Jinyi Liu, Wenqiang Li, Meilan Hao
1, The type of activation function is single and relatively fixed, which leads to poor "unit representation ability" of the network, and it is often used to solve simple problems with very complex networks; 2, the network structure is not adaptive, it is easy to cause network structure redundant or insufficient.
no code implementations • 19 Dec 2023 • Jinyi Liu, Zhi Wang, Yan Zheng, Jianye Hao, Chenjia Bai, Junjie Ye, Zhen Wang, Haiyin Piao, Yang Sun
In reinforcement learning, the optimism in the face of uncertainty (OFU) is a mainstream principle for directing exploration towards less explored areas, characterized by higher uncertainty.
no code implementations • 13 Nov 2023 • YanJie Li, Weijun Li, Lina Yu, Min Wu, Jinyi Liu, Wenqiang Li, Meilan Hao, Shu Wei, Yusong Deng
To address these issues, we propose MetaSymNet, a novel neural network that dynamically adjusts its structure in real-time, allowing for both expansion and contraction.
no code implementations • 27 Jun 2023 • Jinyi Liu, Yi Ma, Jianye Hao, Yujing Hu, Yan Zheng, Tangjie Lv, Changjie Fan
In summary, our research emphasizes the significance of trajectory-based data sampling techniques in enhancing the efficiency and performance of offline RL algorithms.
no code implementations • 12 Jun 2023 • Kai Zhao, Yi Ma, Jianye Hao, Jinyi Liu, Yan Zheng, Zhaopeng Meng
Offline reinforcement learning (RL) is a learning paradigm where an agent learns from a fixed dataset of experience.
no code implementations • 10 Jun 2023 • Shixi Lian, Yi Ma, Jinyi Liu, Yan Zheng, Zhaopeng Meng
Offline reinforcement learning (ORL) has gained attention as a means of training reinforcement learning models using pre-collected static data.
no code implementations • 2 Oct 2022 • Yifu Yuan, Jianye Hao, Fei Ni, Yao Mu, Yan Zheng, Yujing Hu, Jinyi Liu, Yingfeng Chen, Changjie Fan
Unsupervised reinforcement learning (URL) poses a promising paradigm to learn useful behaviors in a task-agnostic environment without the guidance of extrinsic rewards to facilitate the fast adaptation of various downstream tasks.
no code implementations • 29 Sep 2021 • Jinyi Liu, Zhi Wang, Yan Zheng, Jianye Hao, Junjie Ye, Chenjia Bai, Pengyi Li
Many exploration strategies are built upon the optimism in the face of the uncertainty (OFU) principle for reinforcement learning.
no code implementations • 14 Sep 2021 • Jianye Hao, Tianpei Yang, Hongyao Tang, Chenjia Bai, Jinyi Liu, Zhaopeng Meng, Peng Liu, Zhen Wang
In addition to algorithmic analysis, we provide a comprehensive and unified empirical comparison of different exploration methods for DRL on a set of commonly used benchmarks.
no code implementations • 1 Jan 2021 • Jinyi Liu, Zhi Wang, Jianye Hao, Yan Zheng
Recently, the principle of optimism in the face of (aleatoric and epistemic) uncertainty has been utilized to design efficient exploration strategies for Reinforcement Learning (RL).