no code implementations • 20 May 2024 • Yuxi Li, Yi Liu, Yuekang Li, Ling Shi, Gelei Deng, Shengquan Chen, Kailong Wang
Large language models (LLMs) have transformed the field of natural language processing, but they remain susceptible to jailbreaking attacks that exploit their capabilities to generate unintended and potentially harmful content.
no code implementations • 8 May 2024 • HanXiang Xu, ShenAo Wang, Ningke Li, Kailong Wang, Yanjie Zhao, Kai Chen, Ting Yu, Yang Liu, Haoyu Wang
Overall, our survey provides a comprehensive overview of the current state-of-the-art in LLM4Security and identifies several promising directions for future research.
no code implementations • 15 Apr 2024 • Yuxi Li, Yi Liu, Gelei Deng, Ying Zhang, Wenjia Song, Ling Shi, Kailong Wang, Yuekang Li, Yang Liu, Haoyu Wang
We present categorizations of the identified glitch tokens and symptoms exhibited by LLMs when interacting with glitch tokens.
1 code implementation • 5 Jan 2024 • Baijun Cheng, Shengming Zhao, Kailong Wang, Meizhen Wang, Guangdong Bai, Ruitao Feng, Yao Guo, Lei Ma, Haoyu Wang
Vulnerability detectors based on deep learning (DL) models have proven their effectiveness in recent years.
no code implementations • 1 Jan 2024 • Haodong Li, Gelei Deng, Yi Liu, Kailong Wang, Yuekang Li, Tianwei Zhang, Yang Liu, Guoai Xu, Guosheng Xu, Haoyu Wang
In this paper, we introduce a detailed framework designed to detect and assess the presence of content from potentially copyrighted books within the training datasets of LLMs.
1 code implementation • 21 Aug 2023 • Xinyi Hou, Yanjie Zhao, Yue Liu, Zhou Yang, Kailong Wang, Li Li, Xiapu Luo, David Lo, John Grundy, Haoyu Wang
Nevertheless, a comprehensive understanding of the application, effects, and possible limitations of LLMs on SE is still in its early stages.
no code implementations • 8 Jun 2023 • Yi Liu, Gelei Deng, Yuekang Li, Kailong Wang, ZiHao Wang, XiaoFeng Wang, Tianwei Zhang, Yepang Liu, Haoyu Wang, Yan Zheng, Yang Liu
We deploy HouYi on 36 actual LLM-integrated applications and discern 31 applications susceptible to prompt injection.
no code implementations • 23 May 2023 • Yi Liu, Gelei Deng, Zhengzi Xu, Yuekang Li, Yaowen Zheng, Ying Zhang, Lida Zhao, Tianwei Zhang, Kailong Wang, Yang Liu
Our study investigates three key research questions: (1) the number of different prompt types that can jailbreak LLMs, (2) the effectiveness of jailbreak prompts in circumventing LLM constraints, and (3) the resilience of ChatGPT against these jailbreak prompts.