Search Results for author: Samrat Phatale

Found 4 papers, 1 paper with code

PERL: Parameter Efficient Reinforcement Learning from Human Feedback

no code implementations • 15 Mar 2024 • Hakim Sidahmed, Samrat Phatale, Alex Hutcheson, Zhuonan Lin, Zhang Chen, Zac Yu, Jarvis Jin, Roman Komarytsia, Christiane Ahlheim, Yonghao Zhu, Simral Chaudhary, Bowen Li, Saravanan Ganesh, Bill Byrne, Jessica Hoffmann, Hassan Mansoor, Wei Li, Abhinav Rastogi, Lucas Dixon

We investigate the setup of "Parameter Efficient Reinforcement Learning" (PERL), in which we perform reward model training and reinforcement learning using LoRA.

reinforcement-learning

RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback

no code implementations • 1 Sep 2023 • Harrison Lee, Samrat Phatale, Hassan Mansoor, Thomas Mesnard, Johan Ferret, Kellie Lu, Colton Bishop, Ethan Hall, Victor Carbune, Abhinav Rastogi, Sushant Prakash

Reinforcement learning from human feedback (RLHF) has proven effective in aligning large language models (LLMs) with human preferences.

Dialogue Generation reinforcement-learning

Conversational Recommendation as Retrieval: A Simple, Strong Baseline

no code implementations • 23 May 2023 • Raghav Gupta, Renat Aksitov, Samrat Phatale, Simral Chaudhary, Harrison Lee, Abhinav Rastogi

Conversational recommendation systems (CRS) aim to recommend suitable items to users through natural language conversation.

Data Augmentation Information Retrieval +3

Prose for a Painting

1 code implementation • 8 Oct 2019 • Prerna Kashyap, Samrat Phatale, Iddo Drori

Painting captions are often dry and simplistic, which motivates us to describe a painting creatively in the style of Shakespearean prose.

Style Transfer
