1 code implementation • 3 Apr 2024 • Yunzhuo Hao, Wenkai Yang, Yankai Lin
Chat models are extensively adopted across various real-world scenarios, so their security deserves increasing attention.
1 code implementation • 17 Feb 2024 • Wenkai Yang, Xiaohan Bi, Yankai Lin, Sishuo Chen, Jie Zhou, Xu Sun
We first formulate a general framework of agent backdoor attacks, then present a thorough analysis of the different forms of agent backdoor attacks.
no code implementations • 15 Nov 2023 • Wenkai Yang, Yankai Lin, Jie Zhou, JiRong Wen
The current knowledge learning paradigm of LLMs is mainly based on learning from examples, in which LLMs learn the internal rule implicitly from a certain number of supervised examples.
no code implementations • 12 Nov 2023 • Wenkai Yang, Wenyuan Sun, Runxaing Huang
This architecture utilizes a graph feature stream and an image feature stream, aiming to merge the strengths of both modalities for improved performance in image classification and scene graph generation tasks.
1 code implementation • 29 Jul 2023 • Lean Wang, Wenkai Yang, Deli Chen, Hao Zhou, Yankai Lin, Fandong Meng, Jie Zhou, Xu Sun
As large language models (LLMs) generate texts with increasing fluency and realism, there is a growing need to identify the source of texts to prevent the abuse of LLMs.
1 code implementation • 21 May 2023 • Yi Liu, Xiaohan Bi, Lei Li, Sishuo Chen, Wenkai Yang, Xu Sun
However, as pre-trained language models (PLMs) continue to increase in size, the communication cost for transmitting parameters during synchronization has become a training speed bottleneck.
2 code implementations • 30 Jan 2023 • Sishuo Chen, Wenkai Yang, Xiaohan Bi, Xu Sun
We find that: (1) no existing method performs well in both settings; (2) fine-tuning PLMs on in-distribution data benefits the detection of semantic shifts but severely degrades the detection of non-semantic shifts, which can be attributed to the distortion of task-agnostic features.
Out-of-Distribution (OOD) Detection
no code implementations • 25 Jan 2023 • Wenkai Yang, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun
Federated Learning (FL) has become a popular distributed learning paradigm that involves multiple clients training a global model collaboratively in a data privacy-preserving manner.
no code implementations • 25 Jan 2023 • Wenkai Yang, Yankai Lin, Guangxiang Zhao, Peng Li, Jie Zhou, Xu Sun
Federated Learning has become a widely used framework that allows learning a global model on decentralized local datasets while protecting local data privacy.
1 code implementation • 14 Oct 2022 • Sishuo Chen, Wenkai Yang, Zhiyuan Zhang, Xiaohan Bi, Xu Sun
In this work, we take the first step to investigate the unconcealment of textual poisoned samples at the intermediate-feature level and propose a feature-based efficient online defense method.
1 code implementation • EMNLP 2021 • Wenkai Yang, Yankai Lin, Peng Li, Jie Zhou, Xu Sun
Motivated by this observation, we construct a word-based robustness-aware perturbation to distinguish poisoned samples from clean samples and thereby defend against backdoor attacks on natural language processing (NLP) models.
1 code implementation • 13 Oct 2021 • Guangxiang Zhao, Wenkai Yang, Xuancheng Ren, Lei Li, Yunfang Wu, Xu Sun
The conventional wisdom behind learning deep classification models is to focus on badly classified examples and ignore well-classified examples that are far from the decision boundary.
1 code implementation • ACL 2021 • Wenkai Yang, Yankai Lin, Peng Li, Jie Zhou, Xu Sun
In this work, we point out a potential problem with current backdoor attack research: its evaluation ignores the stealthiness of backdoor attacks, and most existing backdoor attack methods are not stealthy to either system deployers or system users.
1 code implementation • NAACL 2021 • Wenkai Yang, Lei Li, Zhiyuan Zhang, Xuancheng Ren, Xu Sun, Bin He
However, in this paper, we find that it is possible to hack the model in a data-free way by modifying a single word embedding vector, with almost no accuracy sacrificed on clean samples.