1 code implementation • 30 Nov 2023 • Benjamin Schneider, Nils Lukas, Florian Kerschbaum
We demonstrate the effectiveness and robustness of our universal backdoor attacks by controlling models with up to 6,000 classes while poisoning only 0.15% of the training dataset.
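As a rough, hypothetical illustration of the poisoning setting (not the paper's specific universal trigger construction), the sketch below stamps a small patch trigger onto a tiny fraction of a labelled image dataset and relabels those samples to an attacker-chosen class:

```python
import numpy as np

def poison_dataset(images, labels, target_class, poison_rate=0.0015, rng=None):
    """Stamp a simple patch trigger onto a small fraction of samples and
    relabel them to the attacker's target class.

    images: float array of shape (N, H, W, C) in [0, 1]
    labels: int array of shape (N,)
    NOTE: generic patch-trigger sketch for illustration only; the paper's
    universal backdoor uses its own trigger construction.
    """
    rng = rng or np.random.default_rng(0)
    images, labels = images.copy(), labels.copy()
    n_poison = max(1, int(len(images) * poison_rate))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    images[idx, -4:, -4:, :] = 1.0   # 4x4 white patch in the corner as the trigger
    labels[idx] = target_class       # model learns to associate trigger -> target
    return images, labels
```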
no code implementations • 29 Sep 2023 • Nils Lukas, Abdulrahman Diaa, Lucas Fenaux, Florian Kerschbaum
A core security property of watermarking is robustness, which states that an attacker can only evade detection by substantially degrading image quality.
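In practice, evaluating this property amounts to counting an attack as successful only when it both evades the detector and preserves image quality; a minimal sketch under assumed interfaces (`attack`, `detect_watermark` are hypothetical callables, and the PSNR floor is illustrative):

```python
import numpy as np

def psnr(original, attacked, max_val=1.0):
    """Peak signal-to-noise ratio as a simple image-quality proxy."""
    mse = np.mean((original - attacked) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(max_val ** 2 / mse)

def evasion_rate(images, attack, detect_watermark, quality_floor=30.0):
    """An attack 'wins' on an image only if it evades the watermark detector
    while keeping PSNR above the quality floor (threshold is illustrative)."""
    wins = 0
    for img in images:
        attacked = attack(img)
        if not detect_watermark(attacked) and psnr(img, attacked) >= quality_floor:
            wins += 1
    return wins / len(images)
```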
1 code implementation • 14 Jun 2023 • Abdulrahman Diaa, Lucas Fenaux, Thomas Humphries, Marian Dietz, Faezeh Ebrahimianghazani, Bailey Kacsmar, Xinda Li, Nils Lukas, Rasoul Akhavan Mahdavi, Simon Oya, Ehsan Amjadian, Florian Kerschbaum
Motivated by the success of previous work co-designing machine learning and MPC, we develop an activation function co-designed for MPC.
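The general idea behind such co-designs is to replace non-polynomial activations with low-degree polynomial surrogates that MPC protocols can evaluate using only cheap additions and multiplications; a toy sketch with illustrative coefficients (not the paper's construction):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def poly_activation(x, a=0.125, b=0.5, c=0.25):
    """Degree-2 polynomial stand-in for ReLU: a*x^2 + b*x + c.
    Uses only additions and multiplications, which map directly onto
    secret-shared arithmetic in MPC. Coefficients are illustrative; the
    network is typically fine-tuned with the surrogate in place.
    """
    return a * x * x + b * x + c

x = np.linspace(-3, 3, 7)
print(relu(x))
print(poly_activation(x))
```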
no code implementations • 7 May 2023 • Nils Lukas, Florian Kerschbaum
Our research points to intrinsic flaws in current attack evaluation methods and raises the bar for data poisoning attackers, who must delicately balance the trade-off between robustness and undetectability.
2 code implementations • 14 Apr 2023 • Nils Lukas, Florian Kerschbaum
We propose an adaptive attack that successfully removes any watermark with access to only 200 non-watermarked images.
1 code implementation • 1 Feb 2023 • Nils Lukas, Ahmed Salem, Robert Sim, Shruti Tople, Lukas Wutschitz, Santiago Zanella-Béguelin
Understanding the risk of LMs leaking Personally Identifiable Information (PII) has received less attention, which can be attributed to the false assumption that dataset curation techniques such as scrubbing are sufficient to prevent PII leakage.
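For intuition, here is a toy scrubber of the kind such curation pipelines rely on (the patterns are hypothetical); it masks only the PII it can recognize and silently passes everything else through:

```python
import re

# Toy scrubber: masks e-mail addresses and simple phone-number patterns.
# Real pipelines typically combine NER taggers with rules, but any such
# filter only removes the PII it can recognize.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def scrub(text: str) -> str:
    for tag, pattern in PATTERNS.items():
        text = pattern.sub(f"[{tag}]", text)
    return text

print(scrub("Contact Jane Doe at jane.doe@example.com or +1 555 010 9999."))
# -> "Contact Jane Doe at [EMAIL] or [PHONE]."  The name is left intact,
# illustrating why scrubbing alone does not prevent PII leakage.
```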
no code implementations • 29 Sep 2021 • Nils Lukas, Charles Zhang, Florian Kerschbaum
Feature Grinding requires at most six percent of the model's training time on CIFAR-10 and at most two percent on ImageNet for sanitizing the surveyed backdoors.
1 code implementation • 11 Aug 2021 • Nils Lukas, Edward Jiang, Xinda Li, Florian Kerschbaum
Watermarking should be robust against watermark removal attacks, in which an attacker derives a surrogate model that evades provenance verification.
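One common way an attacker derives such a surrogate is distillation: imitate the source model's soft predictions on unlabeled data and hope the watermark does not transfer. A minimal PyTorch-style sketch under those assumptions (model and loader names are hypothetical):

```python
import torch
import torch.nn.functional as F

def distill_surrogate(source_model, surrogate, unlabeled_loader, epochs=10, lr=1e-3):
    """Derive a surrogate by imitating the source model's soft predictions.
    One of several removal strategies an attacker might try; whether the
    watermark survives is exactly what robustness evaluations measure.
    """
    source_model.eval()
    opt = torch.optim.Adam(surrogate.parameters(), lr=lr)
    for _ in range(epochs):
        for x in unlabeled_loader:  # unlabeled inputs only
            with torch.no_grad():
                teacher_probs = F.softmax(source_model(x), dim=1)
            student_log_probs = F.log_softmax(surrogate(x), dim=1)
            loss = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")
            opt.zero_grad()
            loss.backward()
            opt.step()
    return surrogate
```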
1 code implementation • ICLR 2021 • Nils Lukas, Yuxuan Zhang, Florian Kerschbaum
We propose a fingerprinting method for deep neural network classifiers that extracts a set of inputs from the source model such that only surrogate models agree with the source model on how these inputs are classified.
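Verification then reduces to measuring how often a suspect model agrees with the source model on the extracted fingerprint inputs; a hedged sketch of that decision rule (the agreement threshold is illustrative):

```python
import torch

@torch.no_grad()
def is_surrogate(source_model, suspect_model, fingerprint_inputs, threshold=0.9):
    """Flag the suspect as a surrogate if it agrees with the source model's
    labels on a large enough fraction of the fingerprint inputs.
    `fingerprint_inputs` is a batch of inputs extracted from the source model;
    the 0.9 agreement threshold is illustrative only.
    """
    source_labels = source_model(fingerprint_inputs).argmax(dim=1)
    suspect_labels = suspect_model(fingerprint_inputs).argmax(dim=1)
    agreement = (source_labels == suspect_labels).float().mean().item()
    return agreement >= threshold, agreement
```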
no code implementations • 18 Jun 2019 • Masoumeh Shafieinejad, Jiaqi Wang, Nils Lukas, Xinda Li, Florian Kerschbaum
We focus on backdoor-based watermarking and propose two attacks -- a black-box and a white-box attack -- that remove the watermark.