1 code implementation • 13 Mar 2024 • Xiaojun Xu, Yuanshun Yao, Yang Liu
While prior work focuses on token-level watermarks that embed signals into model outputs, we design a model-level watermark that embeds signals into the LLM weights; such signals can be detected by a paired detector.
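The weight-space idea can be illustrated with a toy sketch: perturb the weights along a secret key direction, and let the paired detector correlate weights with that key. All names and the embedding scheme here are illustrative assumptions, not the paper's actual method.

```python
# Toy weight-space watermark with a paired detector (illustrative only;
# the actual embedding scheme in the paper is not reproduced here).
import random

random.seed(0)
DIM = 1024
key = [random.choice([-1.0, 1.0]) for _ in range(DIM)]   # secret detector key
weights = [random.gauss(0.0, 1.0) for _ in range(DIM)]   # stand-in model weights

EPS = 0.5  # watermark strength (assumed small enough to preserve utility)
wm_weights = [w + EPS * k for w, k in zip(weights, key)]  # embed the signal

def detect(w, key):
    """Correlate weights with the secret key; a large value flags a watermark."""
    return sum(wi * ki for wi, ki in zip(w, key)) / len(w)
```

On watermarked weights the statistic concentrates near `EPS`, while on clean weights it stays near zero, which is what makes the detector pairing work in this sketch.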
no code implementations • 12 Mar 2024 • Wei Shen, Xiaoying Zhang, Yuanshun Yao, Rui Zheng, Hongyi Guo, Yang Liu
Reinforcement learning from human feedback (RLHF) is the mainstream paradigm used to align large language models (LLMs) with human preferences.
no code implementations • 20 Feb 2024 • Jinlong Pang, Jialu Wang, Zhaowei Zhu, Yuanshun Yao, Chen Qian, Yang Liu
A fair classifier should ensure benefits for people from different groups; however, group information is often sensitive and unsuitable for use in model training.
no code implementations • 16 Feb 2024 • Jiaheng Wei, Yuanshun Yao, Jean-Francois Ton, Hongyi Guo, Andrew Estornell, Yang Liu
In this work, we propose Factualness Evaluations via Weighting LLMs (FEWL), the first hallucination metric that is specifically designed for the scenario when gold-standard answers are absent.
no code implementations • 13 Feb 2024 • Sijia Liu, Yuanshun Yao, Jinghan Jia, Stephen Casper, Nathalie Baracaldo, Peter Hase, Xiaojun Xu, Yuguang Yao, Hang Li, Kush R. Varshney, Mohit Bansal, Sanmi Koyejo, Yang Liu
We explore machine unlearning (MU) in the domain of large language models (LLMs), referred to as LLM unlearning.
no code implementations • 6 Jan 2024 • Hongyi Guo, Yuanshun Yao, Wei Shen, Jiaheng Wei, Xiaoying Zhang, Zhaoran Wang, Yang Liu
The key idea is to first retrieve high-quality samples related to the target domain and then use them as in-context learning examples to generate more samples.
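A minimal sketch of this retrieve-then-generate loop, with a lexical-overlap score standing in for a real retriever and the generator call stubbed out — every helper name here is a hypothetical placeholder, not the paper's implementation:

```python
# Hedged sketch: retrieve in-domain samples, then format them as
# in-context examples for a generator LLM (stubbed).
import re
from collections import Counter

def tokens(s):
    return Counter(re.findall(r"\w+", s.lower()))

def similarity(a, b):
    """Toy lexical-overlap score standing in for a dense retriever."""
    return sum((tokens(a) & tokens(b)).values())

def retrieve(query, corpus, k=2):
    """Pick the k corpus samples most related to the target domain."""
    return sorted(corpus, key=lambda s: similarity(query, s), reverse=True)[:k]

def build_icl_prompt(examples, instruction):
    """Format retrieved samples as few-shot examples for the generator LLM."""
    shots = "\n".join(f"Example: {e}" for e in examples)
    return f"{shots}\n{instruction}"

corpus = [
    "Patient reports mild headache after medication.",
    "Quarterly earnings beat analyst expectations.",
    "Dosage was adjusted after the patient reported dizziness.",
]
examples = retrieve("patient medication side effects", corpus)
prompt = build_icl_prompt(examples, "Generate one more in-domain sample:")
```

The prompt would then be sent to the generator LLM to synthesize additional in-domain samples.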
1 code implementation • 14 Oct 2023 • Yuanshun Yao, Xiaojun Xu, Yang Liu
To the best of our knowledge, our work is among the first to explore LLM unlearning.
no code implementations • 9 Oct 2023 • Tongxin Yin, Jean-François Ton, Ruocheng Guo, Yuanshun Yao, Mingyan Liu, Yang Liu
To generalize the abstaining decisions to test samples, we then train a surrogate model to learn the abstaining decisions based on the IP solutions in an end-to-end manner.
1 code implementation • 10 Aug 2023 • Yang Liu, Yuanshun Yao, Jean-Francois Ton, Xiaoying Zhang, Ruocheng Guo, Hao Cheng, Yegor Klochkov, Muhammad Faaiz Taufiq, Hang Li
However, a major challenge faced by practitioners is the lack of clear guidance on evaluating whether LLM outputs align with social norms, values, and regulations.
no code implementations • 30 Jun 2023 • Yuanshun Yao, Yang Liu
Identifying the causes of a model's unfairness is an important yet relatively unexplored task.
1 code implementation • 18 Jan 2023 • Shangyu Xie, Xin Yang, Yuanshun Yao, Tianyi Liu, Taiqing Wang, Jiankai Sun
In this work, we go a step further and study leakage in the regression setting, where the private labels are continuous values (rather than the discrete labels of classification).
no code implementations • 17 Nov 2022 • Yuanshun Yao, Chong Wang, Hang Li
The key idea is to train a surrogate model to learn the effect of removing a subset of user history on the recommendation.
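One way to read this idea: generate training pairs by re-scoring the recommender on counterfactual histories, labeling each removed subset with the resulting score change; a surrogate model would then be fit to these pairs. The scoring function below is a toy stand-in, not the paper's recommender.

```python
# Toy generation of (removed-subset, score-change) training pairs for a
# removal-effect surrogate (hypothetical scoring function).
from itertools import combinations

def rec_score(history, candidate):
    """Stand-in recommender: mean affinity between history and candidate."""
    if not history:
        return 0.0
    return sum(h * candidate for h in history) / len(history)

def removal_effects(history, candidate):
    """Label each proper subset with the score change its removal causes."""
    base = rec_score(history, candidate)
    data = []
    for r in range(1, len(history)):
        for idx in combinations(range(len(history)), r):
            kept = [h for i, h in enumerate(history) if i not in idx]
            data.append((idx, base - rec_score(kept, candidate)))
    return data

pairs = removal_effects([0.9, 0.1, 0.8], candidate=1.0)
```

Once a surrogate is trained on such pairs, the effect of any deletion can be predicted without re-running the recommender each time.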
1 code implementation • 6 Oct 2022 • Zhaowei Zhu, Yuanshun Yao, Jiankai Sun, Hang Li, Yang Liu
Our theoretical analyses show that directly using proxy models can give a false sense of (un)fairness.
1 code implementation • 25 Aug 2022 • Jiankai Sun, Xin Yang, Yuanshun Yao, Junyuan Xie, Di Wu, Chong Wang
Federated learning (FL) has gained significant attention recently as a privacy-enhancing tool to jointly train a machine learning model by multiple participants.
no code implementations • 16 Jun 2022 • Ruihan Wu, Xin Yang, Yuanshun Yao, Jiankai Sun, Tianyi Liu, Kilian Q. Weinberger, Chong Wang
Differentially Private (DP) data release is a promising technique to disseminate data without compromising the privacy of data subjects.
no code implementations • 24 May 2022 • Jiankai Sun, Xin Yang, Yuanshun Yao, Junyuan Xie, Di Wu, Chong Wang
In this work, we propose two evaluation algorithms that can more accurately compute the widely used AUC (area under curve) metric when using label DP in vFL.
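The motivating problem can be illustrated with a toy experiment (not the paper's estimators): when labels are privatized by randomized response, AUC computed naively on the noisy labels is biased toward 0.5.

```python
# Toy illustration: naive AUC under label DP (randomized response) is biased
# toward 0.5; the paper proposes estimators that compute AUC more accurately.
import random

def auc(scores, labels):
    """Mann-Whitney AUC: P(score_pos > score_neg), ties counted as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

random.seed(0)
labels = [random.randint(0, 1) for _ in range(800)]
scores = [y + random.gauss(0.0, 1.0) for y in labels]  # informative scores

FLIP = 0.2  # randomized-response flip probability (label DP)
noisy = [1 - y if random.random() < FLIP else y for y in labels]

clean_auc = auc(scores, labels)
naive_auc = auc(scores, noisy)  # shrinks toward 0.5 under label noise
```

The gap between `clean_auc` and `naive_auc` is the bias that a label-DP-aware AUC estimator needs to correct.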
no code implementations • 4 Mar 2022 • Xin Yang, Jiankai Sun, Yuanshun Yao, Junyuan Xie, Chong Wang
Split learning is a distributed training framework that allows multiple parties to jointly train a machine learning model over vertically partitioned data (partitioned by attributes).
no code implementations • 2 Mar 2022 • Jiankai Sun, Xin Yang, Yuanshun Yao, Chong Wang
As the raw labels often contain highly sensitive information, recent work has proposed methods to effectively prevent label leakage from the backpropagated gradients in vFL.
no code implementations • 2 Mar 2022 • Yuanshun Yao, Chong Wang, Hang Li
Modern recommender systems face an increasing need to explain their recommendations.
no code implementations • 21 Jul 2021 • Jiankai Sun, Yuanshun Yao, Weihao Gao, Junyuan Xie, Chong Wang
Recently, researchers have studied input leakage problems in Federated Learning (FL), where a malicious party can reconstruct sensitive training inputs provided by users from shared gradients.
no code implementations • 10 Jun 2021 • Jiankai Sun, Xin Yang, Yuanshun Yao, Aonan Zhang, Weihao Gao, Junyuan Xie, Chong Wang
In this paper, we propose a vFL framework based on Private Set Union (PSU) that allows each party to keep sensitive membership information to itself.
no code implementations • CVPR 2021 • Emily Wenger, Josephine Passananti, Arjun Bhagoji, Yuanshun Yao, Haitao Zheng, Ben Y. Zhao
A critical question remains unanswered: can backdoor attacks succeed using physical objects as triggers, thus making them a credible threat against deep learning systems in the real world?
no code implementations • 24 May 2019 • Yuanshun Yao, Huiying Li, Hai-Tao Zheng, Ben Y. Zhao
Recent work has proposed the concept of backdoor attacks on deep neural networks (DNNs), where misbehaviors are hidden inside "normal" models, only to be triggered by very specific inputs.
no code implementations • 27 Aug 2017 • Yuanshun Yao, Bimal Viswanath, Jenna Cryan, Hai-Tao Zheng, Ben Y. Zhao
Malicious crowdsourcing forums are gaining traction as a means of spreading misinformation online, but are limited by the costs of hiring and managing human workers.
Cryptography and Security • Social and Information Networks