1 code implementation • 20 May 2024 • Jian Hu, Xibin Wu, Weixun Wang, Xianyu, Dehao Zhang, Yu Cao
However, unlike pretraining or fine-tuning a single model, scaling reinforcement learning from human feedback (RLHF) for training large language models poses coordination challenges across four models.
no code implementations • 10 Aug 2021 • Xiaopeng Bi, Yu Chen, Xinyang Liu, Dehao Zhang, Ran Yan, Zheng Chai, Haotian Zhang, Xiao Liu
This report describes Megvii-3D team's approach towards CVPR 2021 Image Matching Workshop.