1 code implementation • 26 Mar 2024 • Wangyue Li, Liangzhi Li, Tong Xiang, Xiao Liu, Wei Deng, Noa Garcia
Additionally, we propose two methods to quantify the consistency and confidence of LLMs' output, which can be generalized to other QA evaluation benchmarks.
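A minimal sketch of what such consistency and confidence measures could look like over repeated samples of an LLM's answers to the same question; the paper's actual formulations may differ, and both function names and the entropy-based confidence proxy here are assumptions for illustration only.

```python
# Hypothetical sketch: quantifying consistency and confidence from
# repeated LLM answers to one question. Not the paper's exact method.
from collections import Counter
import math

def consistency(answers: list[str]) -> float:
    """Fraction of sampled answers agreeing with the majority answer."""
    counts = Counter(answers)
    return counts.most_common(1)[0][1] / len(answers)

def confidence(answers: list[str]) -> float:
    """1 minus the normalized entropy of the answer distribution:
    1.0 when all samples agree, 0.0 when answers are uniformly spread."""
    counts = Counter(answers)
    if len(counts) == 1:
        return 1.0
    n = len(answers)
    probs = [c / n for c in counts.values()]
    entropy = -sum(p * math.log(p) for p in probs)
    return 1.0 - entropy / math.log(len(counts))

# Example: five sampled answers to one multiple-choice question.
samples = ["B", "B", "B", "A", "B"]
print(consistency(samples))  # 0.8
print(confidence(samples))   # close to 1.0, since most samples agree
```

Because both measures operate only on the sampled answer strings, they transfer directly to any QA benchmark where the same question can be posed multiple times.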
1 code implementation • NeurIPS 2023 • Tong Xiang, Liangzhi Li, Wangyue Li, Mingbao Bai, Lu Wei, Bowen Wang, Noa Garcia
In an effort to minimize the reliance on human resources for performance evaluation, we offer off-the-shelf judgment models for automatically assessing the long-form (LF) output of LLMs given benchmark questions.
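A hypothetical sketch of how such a judgment model might be plugged into an evaluation loop, using the Hugging Face `transformers` text-classification pipeline; the model path, the `[SEP]` pairing format, and the example question are all placeholders, not the released checkpoint or its documented interface.

```python
# Hypothetical illustration: scoring an LLM's long-form (LF) answer with
# an off-the-shelf judgment model. Model ID below is a placeholder.
from transformers import pipeline

# Load a judgment model as a standard text classifier (placeholder path).
judge = pipeline("text-classification", model="path/to/judgment-model")

def judge_answer(question: str, llm_output: str) -> dict:
    """Score one long-form answer against its benchmark question."""
    # Pair the question with the LF answer so the judge sees both;
    # the separator format is an assumption for this sketch.
    return judge(f"{question} [SEP] {llm_output}")[0]

result = judge_answer(
    "What causes seasonal flu outbreaks?",          # benchmark question
    "Seasonal flu spreads mainly via respiratory "  # LLM's LF answer
    "droplets, peaking in colder months.",
)
print(result["label"], result["score"])
```

Wrapping the judge as a plain classifier like this keeps the evaluation loop independent of any human annotation step, which is the point of shipping the judgment models off the shelf.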