Search Results for author: Nitesh Goyal

Found 5 papers, 0 papers with code

Lost in Distillation: A Case Study in Toxicity Modeling

no code implementations NAACL (WOAH) 2022 Alyssa Chvasta, Alyssa Lees, Jeffrey Sorensen, Lucy Vasserman, Nitesh Goyal

In an era of increasingly large pre-trained language models, knowledge distillation is a powerful tool for transferring information from a large model to a smaller one.

Knowledge Distillation

Designing for Human-Agent Alignment: Understanding what humans want from their agents

no code implementations 4 Apr 2024 Nitesh Goyal, Minsuk Chang, Michael Terry

Our ability to build autonomous agents that leverage Generative AI continues to increase by the day.

Ethics

ConstitutionMaker: Interactively Critiquing Large Language Models by Converting Feedback into Principles

no code implementations 24 Oct 2023 Savvas Petridis, Ben Wedin, James Wexler, Aaron Donsbach, Mahima Pushkarna, Nitesh Goyal, Carrie J. Cai, Michael Terry

Inspired by these findings, we developed ConstitutionMaker, an interactive tool for converting user feedback into principles, to steer LLM-based chatbots.

Chatbot, Language Modelling +2

"It is currently hodgepodge": Examining AI/ML Practitioners' Challenges during Co-production of Responsible AI Values

no code implementations 14 Jul 2023 Rama Adithya Varanasi, Nitesh Goyal

This work contributes to the discussion by unpacking co-production challenges faced by practitioners as they align their RAI values.

Is Your Toxicity My Toxicity? Exploring the Impact of Rater Identity on Toxicity Annotation

no code implementations 1 May 2022 Nitesh Goyal, Ian Kivlichan, Rachel Rosen, Lucy Vasserman

Next, we trained models on the annotations from each of the different rater pools, and compared the scores of these models on comments from several test sets.
