Search Results for author: John Dang

Found 2 papers, 1 paper with code

Group Preference Optimization: Few-Shot Alignment of Large Language Models

no code implementations • 17 Oct 2023 • Siyan Zhao, John Dang, Aditya Grover

We introduce Group Preference Optimization (GPO), an alignment framework that steers language models to preferences of individual groups in a few-shot manner.

Few-Shot Learning

Peering Through Preferences: Unraveling Feedback Acquisition for Aligning Large Language Models

1 code implementation • 30 Aug 2023 • Hritik Bansal, John Dang, Aditya Grover

In particular, we find that LLMs that leverage rankings data for alignment (say model X) are preferred over those that leverage ratings data (say model Y), with a rank-based evaluation protocol (is X/Y's response better than the reference response?).
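
A minimal sketch of such a rank-based (pairwise-versus-reference) evaluation, under the assumption that each model response is judged against a fixed reference response and scored by its win rate; the function `rank_based_win_rate` and the `toy_judge` heuristic are illustrative stand-ins, not the paper's actual evaluation code.

```python
# Sketch of a rank-based evaluation protocol: for each prompt, a judge decides
# whether the model's response is better than a reference response, and the
# model's score is its win rate over the evaluation set.
from typing import Callable, List


def rank_based_win_rate(
    prompts: List[str],
    model_responses: List[str],
    reference_responses: List[str],
    judge: Callable[[str, str, str], bool],
) -> float:
    """Fraction of prompts where the judge prefers the model's response
    over the reference response ("is X's response better than the reference?")."""
    wins = sum(
        judge(prompt, candidate, reference)
        for prompt, candidate, reference in zip(prompts, model_responses, reference_responses)
    )
    return wins / len(prompts)


if __name__ == "__main__":
    # Toy data; a real evaluation would use held-out instructions and a
    # human or LLM judge rather than this heuristic.
    prompts = ["Explain photosynthesis.", "Summarize the plot of Hamlet."]
    model_x = [
        "Plants convert light, water, and CO2 into sugar and oxygen.",
        "A Danish prince seeks revenge for his father's murder.",
    ]
    references = ["Plants make food from light.", "A prince avenges his father."]

    # Hypothetical judge: prefer the longer (more detailed) response.
    toy_judge = lambda prompt, cand, ref: len(cand) > len(ref)

    print(f"Model X win rate vs reference: {rank_based_win_rate(prompts, model_x, references, toy_judge):.2f}")
```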
