1 code implementation • 16 Apr 2024 • Pengyu Cheng, Tianhao Hu, Han Xu, Zhisong Zhang, Yong Dai, Lei Han, Nan Du
In this game, an attacker and a defender communicate around a target word that is visible only to the attacker.
1 code implementation • 12 Dec 2023 • Dun Zeng, Yong Dai, Pengyu Cheng, Longyue Wang, Tianhao Hu, Wanshun Chen, Nan Du, Zenglin Xu
Our analysis reveals a correlation between the calibration performance of reward models (RMs) and the alignment performance of LLMs.
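A standard way to measure the calibration the abstract refers to is the expected calibration error (ECE): bin the reward model's predicted preference probabilities, then compare each bin's mean confidence with its empirical accuracy. A minimal sketch (the binning scheme and toy data are assumptions, not the paper's exact metric):

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """ECE: bin predicted preference probabilities and compare mean
    confidence with empirical accuracy in each bin, weighted by bin size."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # clamp p == 1.0 into last bin
        bins[idx].append((p, y))
    n = len(probs)
    ece = 0.0
    for bucket in bins:
        if bucket:
            conf = sum(p for p, _ in bucket) / len(bucket)
            acc = sum(y for _, y in bucket) / len(bucket)
            ece += len(bucket) / n * abs(conf - acc)
    return ece

# Toy example: confidence 0.8 with 8/10 correct is well calibrated.
probs = [0.8] * 10
labels = [1] * 8 + [0] * 2
print(round(expected_calibration_error(probs, labels), 6))  # → 0.0
```

A well-calibrated RM (low ECE) gives preference probabilities that match observed human-agreement rates, which is the property the analysis correlates with alignment performance.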
1 code implementation • 14 Nov 2023 • Pengyu Cheng, Yifan Yang, Jian Li, Yong Dai, Tianhao Hu, Peixin Cao, Nan Du, Xiaolong Li
To optimize human preferences more efficiently, we propose an Adversarial Preference Optimization (APO) framework, in which the LLM and the reward model are updated alternately via a min-max game.
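The alternating min-max structure can be illustrated on a toy scalar game: one parameter plays the maximizer (the policy, ascending the objective) and the other the minimizer (the reward model, descending it), taking turns. This is only a sketch of the update pattern on a bilinear toy objective f(p, r) = p * r, not APO's actual losses:

```python
# Toy illustration of alternating min-max (gradient ascent-descent)
# updates, as in adversarial training loops. The objective, learning
# rate, and scalar parameters are assumptions for illustration.

def alternating_minmax(steps=100, lr=0.05):
    p, r = 1.0, 1.0  # "policy" (maximizer) and "reward" (minimizer) params
    for _ in range(steps):
        p = p + lr * r  # policy step:  ascend  df/dp = r
        r = r - lr * p  # reward step:  descend df/dr = p (uses updated p)
    return p, r

p, r = alternating_minmax()
print(p, r)  # the two parameters circle the saddle point at (0, 0)
```

In APO proper, each "step" would be a round of LLM fine-tuning against the current reward model, followed by an adversarial reward-model update on fresh preference data.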
1 code implementation • 7 Sep 2022 • Tianhao Hu, Bangti Jin, Zhi Zhou
Extensive numerical experiments in two- and multi-dimensional spaces, with point sources, line sources, or combinations thereof, illustrate the efficiency of the proposed approach. A comparative study with several existing neural-network-based approaches clearly shows its competitiveness for this specific class of problems.