no code implementations • 29 May 2024 • Hanye Zhao, Xiaoshen Han, Zhengbang Zhu, Minghuan Liu, Yong Yu, Weinan Zhang
We propose Dynamics Diffusion (DyDiff for short), which iteratively injects information from the learning policy into DMs.
1 code implementation • 2 Nov 2023 • Zhengbang Zhu, Hanye Zhao, Haoran He, Yichao Zhong, Shenyu Zhang, Haoquan Guo, Tingting Chen, Weinan Zhang
Diffusion models surpass previous generative models in sample quality and training stability.
no code implementations • 17 Jun 2022 • Kerong Wang, Hanye Zhao, Xufang Luo, Kan Ren, Weinan Zhang, Dongsheng Li
Offline reinforcement learning (RL) aims to learn policies from previously collected static trajectory data without interacting with the real environment.
no code implementations • NeurIPS 2021 • Minghuan Liu, Hanye Zhao, Zhengyu Yang, Jian Shen, Weinan Zhang, Li Zhao, Tie-Yan Liu
However, IL is usually limited by the capability of the behavioral policy and tends to learn mediocre behavior from a dataset collected by a mixture of policies.
1 code implementation • 3 Nov 2021 • Minghuan Liu, Hanye Zhao, Zhengyu Yang, Jian Shen, Weinan Zhang, Li Zhao, Tie-Yan Liu
However, IL is usually limited by the capability of the behavioral policy and tends to learn mediocre behavior from a dataset collected by a mixture of policies.