Search Results for author: Ziran Yang

Panacea: Pareto Alignment via Preference Adaptation for LLMs

Panacea trains a single model capable of adapting online and Pareto-optimally to diverse sets of preferences without the need for further tuning.

In this paper, we present Red-teaming Game (RTG), a general game-theoretic framework without manual annotation.
