1 code implementation • NAACL 2022 • Yanan Wu, Keqing He, Yuanmeng Yan, QiXiang Gao, Zhiyuan Zeng, Fujia Zheng, Lulu Zhao, Huixing Jiang, Wei Wu, Weiran Xu
Detecting Out-of-Domain (OOD) or unknown intents from user queries is essential in a task-oriented dialog system.
1 code implementation • ACL 2022 • Yutao Mou, Keqing He, Yanan Wu, Zhiyuan Zeng, Hong Xu, Huixing Jiang, Wei Wu, Weiran Xu
Discovering Out-of-Domain(OOD) intents is essential for developing new skills in a task-oriented dialogue system.
no code implementations • EMNLP 2021 • Zhiyuan Zeng, Jiaze Chen, Weiran Xu, Lei LI
Based on the artificial dataset, we train an evaluation model that can not only make accurate and robust factual consistency discrimination but is also capable of making interpretable factual errors tracing by backpropagated gradient distribution on token embeddings.
no code implementations • 29 Feb 2024 • Jiantao Qiu, Haijun Lv, Zhenjiang Jin, Rui Wang, Wenchang Ning, JIA YU, Chaobin Zhang, Zhenxiang Li, Pei Chu, Yuan Qu, Jin Shi, Lindong Lu, Runyu Peng, Zhiyuan Zeng, Huanze Tang, Zhikai Lei, Jiawei Hong, Keyu Chen, Zhaoye Fei, Ruiliang Xu, Wei Li, Zhongying Tu, Lin Dahua, Yu Qiao, Hang Yan, Conghui He
To evaluate the quality and utility of the dataset, we trained 1B-parameter and 3B-parameter models using WanJuan-CC and another dataset, RefinedWeb.
no code implementations • 17 Feb 2024 • Zhiyuan Zeng, Qipeng Guo, Zhaoye Fei, Zhangyue Yin, Yunhua Zhou, Linyang Li, Tianxiang Sun, Hang Yan, Dahua Lin, Xipeng Qiu
To address the dropped tokens and padding, we propose the Rectify-Router, comprising the Intra-GPU Rectification and the Fill-in Rectification.
no code implementations • 26 Jan 2024 • Zhaoye Fei, Yunfan Shao, Linyang Li, Zhiyuan Zeng, Conghui He, Hang Yan, Dahua Lin, Xipeng Qiu
Large language models have demonstrated remarkable potential in various tasks, however, there remains a significant scarcity of open-source models and data for specific domains.
1 code implementation • 11 Oct 2023 • Zhiyuan Zeng, Jiatong Yu, Tianyu Gao, Yu Meng, Tanya Goyal, Danqi Chen
As research in large language models (LLMs) continues to accelerate, LLM-based evaluation has emerged as a scalable and cost-effective alternative to human evaluations for comparing the ever increasing list of models.
1 code implementation • 10 Oct 2023 • Mengzhou Xia, Tianyu Gao, Zhiyuan Zeng, Danqi Chen
In this work, we study structured pruning as an effective means to develop smaller LLMs from pre-trained, larger models.
Ranked #40 on Question Answering on PIQA
1 code implementation • 28 May 2023 • Zhengyan Zhang, Zhiyuan Zeng, Yankai Lin, Chaojun Xiao, Xiaozhi Wang, Xu Han, Zhiyuan Liu, Ruobing Xie, Maosong Sun, Jie zhou
In analogy to human brains, we consider two main characteristics of modularity: (1) functional specialization of neurons: we evaluate whether each neuron is mainly specialized in a certain function, and find that the answer is yes.
1 code implementation • 28 May 2023 • Zhengyan Zhang, Zhiyuan Zeng, Yankai Lin, Huadong Wang, Deming Ye, Chaojun Xiao, Xu Han, Zhiyuan Liu, Peng Li, Maosong Sun, Jie zhou
Experimental results on three knowledge-driven NLP tasks show that existing injection methods are not suitable for the new paradigm, while map-tuning effectively improves the performance of downstream models.
no code implementations • 19 Dec 2022 • Aaron Chan, Zhiyuan Zeng, Wyatt Lake, Brihi Joshi, Hanjie Chen, Xiang Ren
First, KNIFE finetunes a teacher LM (given task input and FTR) to predict the task output, transferring reasoning knowledge from the FTRs to the teacher's hidden states.
no code implementations • 17 Oct 2022 • Yanan Wu, Zhiyuan Zeng, Keqing He, Yutao Mou, Pei Wang, Yuanmeng Yan, Weiran Xu
In this paper, we propose a simple but strong energy-based score function to detect OOD where the energy scores of OOD samples are higher than IND samples.
1 code implementation • COLING 2022 • Yanan Wu, Zhiyuan Zeng, Keqing He, Yutao Mou, Pei Wang, Weiran Xu
Out-of-Domain (OOD) detection is a key component in a task-oriented dialog system, which aims to identify whether a query falls outside the predefined supported intent set.
no code implementations • 10 Jun 2022 • Zhiyuan Zeng, Deyi Xiong
We therefore extend the unsupervised models to few-shot parsing models (FPOA, FPIO) that use a few annotated trees to learn better linear projection matrices for parsing.
no code implementations • ACL 2021 • Zhiyuan Zeng, Deyi Xiong
For autoregressive NMT models that generate target words from left to right, we observe that adversarial attack on the source language is more effective than on the target language, and that attacking front positions of target sentences or positions of source sentences aligned to the front positions of corresponding target sentences is more effective than attacking other positions.
1 code implementation • NAACL 2021 • Zhiyuan Zeng, Keqing He, Yuanmeng Yan, Hong Xu, Weiran Xu
Detecting out-of-domain (OOD) intents is crucial for the deployed task-oriented dialogue system.
1 code implementation • ACL 2021 • Yanan Wu, Zhiyuan Zeng, Keqing He, Hong Xu, Yuanmeng Yan, Huixing Jiang, Weiran Xu
Existing slot filling models can only recognize pre-defined in-domain slot types from a limited slot set.
1 code implementation • ACL 2021 • Zhiyuan Zeng, Keqing He, Yuanmeng Yan, Zijun Liu, Yanan Wu, Hong Xu, Huixing Jiang, Weiran Xu
Detecting Out-of-Domain (OOD) or unknown intents from user queries is essential in a task-oriented dialog system.