no code implementations • 15 May 2024 • Devansh Jain, Priyanshu Kumar, Samuel Gehman, Xuhui Zhou, Thomas Hartvigsen, Maarten Sap
Recent advances in large language models (LLMs) have led to their extensive global deployment, and ensuring their safety calls for comprehensive and multilingual toxicity evaluations.
2 code implementations • Findings of the Association for Computational Linguistics 2020 • Samuel Gehman, Suchin Gururangan, Maarten Sap, Yejin Choi, Noah A. Smith
We investigate the extent to which pretrained LMs can be prompted to generate toxic language, and the effectiveness of controllable text generation algorithms at preventing such toxic degeneration.