no code implementations • 30 Apr 2024 • Anudeep Das, Vasisht Duddu, Rui Zhang, N. Asokan
Hence, there is a need for concept removal techniques (CRTs) that are effective in removing unacceptable concepts, utility-preserving on acceptable concepts, and robust against evasion with adversarial prompts.
1 code implementation • 18 Aug 2023 • Vasisht Duddu, Anudeep Das, Nora Khayata, Hossein Yalame, Thomas Schneider, N. Asokan
The success of machine learning (ML) has been accompanied by increased concerns about its trustworthiness.