1 code implementation • 16 Jan 2024 • Bang An, Mucong Ding, Tahseen Rabbani, Aakriti Agrawal, Yuancheng Xu, ChengHao Deng, Sicheng Zhu, Abdirisak Mohamed, Yuxin Wen, Tom Goldstein, Furong Huang
We present WAVES (Watermark Analysis Via Enhanced Stress-testing), a novel benchmark for assessing watermark robustness that overcomes the limitations of current evaluation methods. WAVES integrates detection and identification tasks and establishes a standardized evaluation protocol comprising a diverse range of stress tests.
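To make the idea of a standardized stress test concrete, here is a minimal hypothetical sketch (not the WAVES codebase or its API): distort watermarked images with a common perturbation, then report detection performance at a fixed false-positive rate. The detector, data, and JPEG-quality setting are placeholders.

```python
# Minimal sketch of one robustness stress test (hypothetical, not the WAVES API):
# distort watermarked images, then measure detection at a fixed false-positive rate.
# `detect_watermark` stands in for any watermark detector returning a scalar score.
import io
import numpy as np
from PIL import Image

def jpeg_distortion(image: Image.Image, quality: int = 40) -> Image.Image:
    """A common stress test: re-encode the image with lossy JPEG compression."""
    buffer = io.BytesIO()
    image.save(buffer, format="JPEG", quality=quality)
    buffer.seek(0)
    return Image.open(buffer).convert("RGB")

def tpr_at_fpr(pos_scores, neg_scores, target_fpr: float = 0.01) -> float:
    """True-positive rate at the threshold that yields the target false-positive rate."""
    threshold = np.quantile(neg_scores, 1.0 - target_fpr)  # scores above -> "watermarked"
    return float(np.mean(np.asarray(pos_scores) > threshold))

def run_stress_test(watermarked, clean, detect_watermark, attack=jpeg_distortion):
    """Score attacked watermarked images against clean images with one detector."""
    pos = [detect_watermark(attack(img)) for img in watermarked]
    neg = [detect_watermark(img) for img in clean]
    return tpr_at_fpr(pos, neg)
```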
1 code implementation • 23 Oct 2023 • Sicheng Zhu, Ruiyi Zhang, Bang An, Gang Wu, Joe Barrow, Zichao Wang, Furong Huang, Ani Nenkova, Tong Sun
Safety alignment of Large Language Models (LLMs) can be compromised by manual jailbreak attacks and (automatic) adversarial attacks.
1 code implementation • 2 Aug 2023 • Bang An, Sicheng Zhu, Michael-Andrei Panaitescu-Liess, Chaithanya Kumar Mummadi, Furong Huang
Motivated by this, we observe that providing CLIP with contextual attributes improves zero-shot image classification and mitigates reliance on spurious features.
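As a rough illustration (a simplified sketch, not the paper's exact inference procedure), contextual attributes can be folded into CLIP's text prompts and marginalized over; the class names, attribute list, prompt template, and image path below are assumptions for demonstration, using Hugging Face's CLIP interface.

```python
# Simplified illustration: augment CLIP prompts with contextual attributes
# (e.g., background) and average image-text scores over them.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

classes = ["landbird", "waterbird"]                 # example classes
attributes = ["on land", "on water", "in the sky"]  # hypothetical contextual attributes

# One prompt per (class, attribute) pair instead of a bare "a photo of a {class}".
prompts = [f"a photo of a {c} {a}" for c in classes for a in attributes]

image = Image.open("bird.jpg")  # placeholder path
inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)

with torch.no_grad():
    logits = model(**inputs).logits_per_image  # shape: (1, num_classes * num_attributes)

# Marginalize over attributes so the prediction depends less on spurious context.
probs = logits.softmax(dim=-1).view(len(classes), len(attributes)).sum(dim=-1)
prediction = classes[int(probs.argmax())]
```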
no code implementations • 10 Apr 2023 • Souradip Chakraborty, Amrit Singh Bedi, Sicheng Zhu, Bang An, Dinesh Manocha, Furong Huang
Our work addresses the critical issue of distinguishing text generated by Large Language Models (LLMs) from human-produced text, a task essential for numerous applications.
1 code implementation • NeurIPS 2021 • Sicheng Zhu, Bang An, Furong Huang
Based on this notion, we refine the generalization bound for invariant models and characterize the suitability of a set of data transformations by the sample covering number induced by transformations, i.e., the smallest size of its induced sample covers.
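A hedged sketch of the definition suggested by this excerpt, with notation chosen here for illustration:

```latex
% Illustrative notation, not quoted from the paper.
% Let $S=\{x_1,\dots,x_n\}$ be a sample and $\mathcal{T}$ a set of data transformations.
\[
  S' \subseteq S \text{ is a sample cover of } S \text{ induced by } \mathcal{T}
  \quad\text{if}\quad
  \forall\, x \in S,\ \exists\, x' \in S',\ \exists\, t \in \mathcal{T}:\ x = t(x').
\]
\[
  N_{\mathcal{T}}(S) \;=\; \min\bigl\{\, |S'| \;:\; S' \text{ is a sample cover of } S \text{ induced by } \mathcal{T} \,\bigr\}
\]
```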
1 code implementation • ICML 2020 • Sicheng Zhu, Xiao Zhang, David Evans
We develop a notion of representation vulnerability that captures the maximum change of mutual information between the input and output distributions, under the worst-case input perturbation.
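In symbols, a hedged sketch of that quantity (taking the perturbation set to be an $\infty$-Wasserstein ball of radius $\epsilon$ is an assumption, not quoted from the paper):

```latex
% Sketch only: g is the representation map, I(.;.) is mutual information, and the
% maximum ranges over input distributions X' within distance epsilon of X.
\[
  \mathrm{RV}_{\epsilon}(g)
  \;=\;
  \max_{X' :\, \mathsf{W}_{\infty}(X, X') \le \epsilon}
  \Bigl[\, I\bigl(X;\, g(X)\bigr) \;-\; I\bigl(X';\, g(X')\bigr) \,\Bigr]
\]
```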
no code implementations • 10 Jan 2020 • Sicheng Zhu, Bang An, Shiyu Niu
Machine learning models are generally vulnerable to adversarial examples, in contrast to the robustness of human perception.