Search Results for author: Nitesh Goyal

Found 5 papers, 0 papers with code

Lost in Distillation: A Case Study in Toxicity Modeling

no code implementations • NAACL (WOAH) 2022 • Alyssa Chvasta, Alyssa Lees, Jeffrey Sorensen, Lucy Vasserman, Nitesh Goyal

In an era of increasingly large pre-trained language models, knowledge distillation is a powerful tool for transferring information from a large model to a smaller one.

Knowledge Distillation

Designing for Human-Agent Alignment: Understanding what humans want from their agents

no code implementations • 4 Apr 2024 • Nitesh Goyal, Minsuk Chang, Michael Terry

Our ability to build autonomous agents that leverage Generative AI continues to increase by the day.

Ethics

ConstitutionMaker: Interactively Critiquing Large Language Models by Converting Feedback into Principles

no code implementations • 24 Oct 2023 • Savvas Petridis, Ben Wedin, James Wexler, Aaron Donsbach, Mahima Pushkarna, Nitesh Goyal, Carrie J. Cai, Michael Terry

Inspired by these findings, we developed ConstitutionMaker, an interactive tool for converting user feedback into principles, to steer LLM-based chatbots.

Chatbot, Language Modelling +2

"It is currently hodgepodge": Examining AI/ML Practitioners' Challenges during Co-production of Responsible AI Values

no code implementations • 14 Jul 2023 • Rama Adithya Varanasi, Nitesh Goyal

This work contributes to the discussion by unpacking co-production challenges faced by practitioners as they align their RAI values.

Is Your Toxicity My Toxicity? Exploring the Impact of Rater Identity on Toxicity Annotation

no code implementations • 1 May 2022 • Nitesh Goyal, Ian Kivlichan, Rachel Rosen, Lucy Vasserman

Next, we trained models on the annotations from each of the different rater pools, and compared the scores of these models on comments from several test sets.
