14 Feb 2024 • Simon Geisler, Tom Wollschläger, M. H. I. Abdalla, Johannes Gasteiger, Stephan Günnemann
Current LLM alignment methods are readily broken through specifically crafted adversarial prompts.