2 code implementations • Findings (ACL) 2022 • Sen Yang, Leyang Cui, Ruoxi Ning, Di Wu, Yue Zhang
Neural constituency parsers have reached practical performance on news-domain benchmarks.
no code implementations • 2 Mar 2024 • Jianheng Huang, Leyang Cui, Ante Wang, Chengyi Yang, Xinting Liao, Linfeng Song, Junfeng Yao, Jinsong Su
When conducting continual learning from a publicly released LLM checkpoint, the original training data is often unavailable.
1 code implementation • 29 Feb 2024 • Qintong Li, Leyang Cui, Xueliang Zhao, Lingpeng Kong, Wei Bi
Large language models (LLMs) have achieved impressive performance across various mathematical reasoning benchmarks.
no code implementations • 27 Feb 2024 • Bowen Cao, Deng Cai, Leyang Cui, Xuxin Cheng, Wei Bi, Yuexian Zou, Shuming Shi
To address this, we propose to initialize the training oracles using linguistic heuristics and, more importantly, bootstrap the oracles through iterative self-reinforcement.
1 code implementation • 19 Jan 2024 • Fanqi Wan, Xinting Huang, Leyang Cui, Xiaojun Quan, Wei Bi, Shuming Shi
While large language models (LLMs) have demonstrated exceptional performance across various tasks following human alignment, they may still generate responses that sound plausible but contradict factual knowledge, a phenomenon known as hallucination.
1 code implementation • 16 Jan 2024 • Shuming Shi, Enbo Zhao, Deng Cai, Leyang Cui, Xinting Huang, Huayang Li
We present Inferflow, an efficient and highly configurable inference engine for large language models (LLMs).
2 code implementations • 25 Dec 2023 • Yue Zhang, Leyang Cui, Wei Bi, Shuming Shi
Experimental results on both discrimination-based and generation-based hallucination evaluation benchmarks, such as TruthfulQA and FActScore, demonstrate that our proposed ICD methods can effectively enhance the factuality of LLMs across various model sizes and families.
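As a hedged illustration of the contrast step in this family of decoding methods (the parameterization below is assumed for illustration, not taken from the paper): boost the base model's logits and penalize tokens favored by a hallucination-induced model.

```python
import numpy as np

def contrastive_decode_logits(base_logits, induced_logits, alpha=0.5):
    """One possible contrast step: amplify the base model's logits and
    subtract those of the hallucination-induced model.
    alpha is a hypothetical strength knob, not the paper's notation."""
    base = np.asarray(base_logits, dtype=float)
    induced = np.asarray(induced_logits, dtype=float)
    return (1 + alpha) * base - alpha * induced
```

With `alpha=1.0`, a token the induced model strongly prefers (logit 3.0 vs the base's 1.0) is pushed below a token both models treat neutrally, steering decoding away from hallucination-prone continuations.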
1 code implementation • 16 Nov 2023 • Sen Yang, Xin Li, Leyang Cui, Lidong Bing, Wai Lam
Though prompting LLMs with various reasoning structures produces reasoning proofs along with answers, these proofs are not ensured to be causal and reliable due to the inherent defects of LLMs.
no code implementations • 31 Oct 2023 • Yingshu Li, Yunyi Liu, Zhanyu Wang, Xinyu Liang, Lei Wang, Lingqiao Liu, Leyang Cui, Zhaopeng Tu, Longyue Wang, Luping Zhou
This work conducts an evaluation of GPT-4V's multimodal capability for medical image analysis, with a focus on three representative tasks of radiology report generation, medical visual question answering, and medical visual grounding.
1 code implementation • 30 Oct 2023 • Qintong Li, Leyang Cui, Lingpeng Kong, Wei Bi
To explore the synergy between humans and LLM-based evaluators, and to address the inconsistent evaluation criteria of existing open-ended NLG tasks, we propose CoEval, a collaborative evaluation pipeline that pairs the design of a checklist of task-specific criteria with detailed evaluation of texts: the LLM generates an initial ideation, and humans then engage in scrutiny.
1 code implementation • 11 Oct 2023 • Yue Zhang, Leyang Cui, Enbo Zhao, Wei Bi, Shuming Shi
In this paper, we introduce RobustGEC, a benchmark designed to evaluate the context robustness of GEC systems.
1 code implementation • 11 Oct 2023 • Yu Zhang, Yue Zhang, Leyang Cui, Guohong Fu
In this work, we propose a novel non-autoregressive text editing method to circumvent the above issues, by modeling the edit process with latent CTC alignments.
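At inference time, CTC-style models emit one token (or blank) per position and collapse the latent alignment into output text by merging adjacent repeats and dropping blanks. A minimal sketch of that collapse step (the blank symbol and function name are illustrative, not the paper's implementation):

```python
def ctc_collapse(alignment, blank="<blank>"):
    """Map a latent CTC alignment to an output sequence: merge adjacent
    repeated tokens, then remove blank symbols."""
    output, prev = [], None
    for token in alignment:
        if token != prev and token != blank:
            output.append(token)
        prev = token
    return output
```

For example, `ctc_collapse(["the", "the", "<blank>", "cat", "<blank>", "cat"])` yields `["the", "cat", "cat"]`; the blank separates genuine repetitions from merged duplicates, which is what lets a non-autoregressive editor emit all positions in parallel.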
1 code implementation • 3 Sep 2023 • Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang, Enbo Zhao, Yu Zhang, Yulong Chen, Longyue Wang, Anh Tuan Luu, Wei Bi, Freda Shi, Shuming Shi
While large language models (LLMs) have demonstrated remarkable capabilities across a range of downstream tasks, a significant concern revolves around their propensity to exhibit hallucinations: LLMs occasionally generate content that diverges from the user input, contradicts previously generated context, or misaligns with established world knowledge.
no code implementations • 17 Jul 2023 • RuiQi Li, Leyang Cui, Songtuan Lin, Patrik Haslum
Action models, which take the form of precondition/effect axioms, facilitate causal and motivational connections between actions for AI agents.
no code implementations • 16 Jul 2023 • Longyue Wang, Zefeng Du, Donghuai Liu, Deng Cai, Dian Yu, Haiyun Jiang, Yan Wang, Leyang Cui, Shuming Shi, Zhaopeng Tu
Modeling discourse, the linguistic phenomena that go beyond individual sentences, is a fundamental yet challenging aspect of natural language processing (NLP).
1 code implementation • 20 Jun 2023 • Yafu Li, Leyang Cui, Jianhao Yan, Yongjing Yin, Wei Bi, Shuming Shi, Yue Zhang
Most existing text generation models follow the sequence-to-sequence paradigm.
1 code implementation • 25 May 2023 • Yuejiao Fei, Leyang Cui, Sen Yang, Wai Lam, Zhenzhong Lan, Shuming Shi
Grammatical error correction systems improve written communication by detecting and correcting language mistakes.
1 code implementation • 22 May 2023 • Yafu Li, Qintong Li, Leyang Cui, Wei Bi, Longyue Wang, Linyi Yang, Shuming Shi, Yue Zhang
In practical scenarios, the detector faces texts from various domains or LLMs without knowing their sources.
no code implementations • 22 May 2023 • Yue Zhang, Leyang Cui, Deng Cai, Xinting Huang, Tao Fang, Wei Bi
Proprietary Large Language Models (LLMs), such as ChatGPT, have garnered significant attention due to their exceptional capabilities in handling a diverse range of tasks.
1 code implementation • 20 May 2023 • Hanmeng Liu, Zhiyang Teng, Leyang Cui, Chaoli Zhang, Qiji Zhou, Yue Zhang
LogiCoT serves as an instruction set for teaching models logical reasoning and eliciting general reasoning skills.
1 code implementation • 4 Apr 2023 • RuiQi Li, Patrik Haslum, Leyang Cui
We argue that an important type of relation not explored in NLP or IR research to date is that of an event being a (required or optional) argument of another event.
1 code implementation • 22 Oct 2022 • Xuefeng Bai, Sen Yang, Leyang Cui, Linfeng Song, Yue Zhang
Based on our observation, we investigate two approaches to reduce the domain distribution divergence of text and AMR features, respectively.
1 code implementation • 20 Oct 2022 • Yafu Li, Leyang Cui, Yongjing Yin, Yue Zhang
Despite low latency, non-autoregressive machine translation (NAT) suffers severe performance deterioration due to the naive independence assumption.
1 code implementation • COLING 2022 • Linyi Yang, Lifan Yuan, Leyang Cui, Wenyang Gao, Yue Zhang
Few-shot Named Entity Recognition (NER) is imperative for entity tagging in low-resource domains and has thus received considerable attention in recent years.
no code implementations • 3 Aug 2022 • Shuming Shi, Enbo Zhao, Duyu Tang, Yan Wang, Piji Li, Wei Bi, Haiyun Jiang, Guoping Huang, Leyang Cui, Xinting Huang, Cong Zhou, Yong Dai, Dongyang Ma
In Effidit, we significantly expand the capacities of a writing assistant by providing functions in five categories: text completion, error checking, text polishing, keywords to sentences (K2S), and cloud input methods (cloud IME).
no code implementations • 7 Mar 2022 • Leyang Cui, Fandong Meng, Yijin Liu, Jie Zhou, Yue Zhang
Although pre-trained sequence-to-sequence models have achieved great success in dialogue response generation, chatbots still suffer from generating inconsistent responses in real-world practice, especially in multi-turn settings.
no code implementations • 2 Mar 2022 • Sen Yang, Yunchen Zhang, Leyang Cui, Yue Zhang
Thanks to advances in large pre-trained language models, prompt-based fine-tuning has been shown to be effective on a variety of downstream tasks.
1 code implementation • EMNLP 2021 • Jian Liu, Zhiyang Teng, Leyang Cui, Hanmeng Liu, Yue Zhang
Aspect category sentiment analysis has attracted increasing research attention.
1 code implementation • ACL 2022 • Leyang Cui, Sen Yang, Yue Zhang
In addition, our method achieves state-of-the-art BERT-based performance on PTB (95.92 F1) and strong performance on CTB (92.31 F1).
Ranked #6 on Constituency Parsing on Penn Treebank
1 code implementation • EMNLP 2021 • Leyang Cui, Yu Wu, Shujie Liu, Yue Zhang
To deal with this problem, instead of introducing a knowledge base as input, we force the model to learn a better semantic representation by predicting the information in the knowledge base based only on the input context.
1 code implementation • Findings (ACL) 2021 • Leyang Cui, Yu Wu, Jian Liu, Sen Yang, Yue Zhang
To address the issue, we propose a template-based method for NER, treating NER as a language model ranking problem in a sequence-to-sequence framework, where the original sentence and statement templates filled with candidate named entity spans are regarded as the source sequence and the target sequences, respectively.
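A toy sketch of this framing: enumerate candidate spans, fill each label's template, and keep the label whose filled statement a scorer ranks highest. The templates, labels, and the stand-in scorer below are illustrative; the actual method scores filled templates with a pretrained seq2seq model.

```python
from typing import Callable, List, Tuple

# Illustrative templates; one statement per label, plus a non-entity option.
TEMPLATES = {
    "PER": "{span} is a person entity.",
    "ORG": "{span} is an organization entity.",
    "O": "{span} is not a named entity.",
}

def enumerate_spans(tokens: List[str], max_len: int = 3) -> List[Tuple[int, int]]:
    """All candidate spans up to max_len tokens long."""
    return [(i, j)
            for i in range(len(tokens))
            for j in range(i + 1, min(i + max_len, len(tokens)) + 1)]

def classify_spans(tokens: List[str],
                   score: Callable[[str, str], float]) -> List[Tuple[str, str]]:
    """Label each span with the template the scorer ranks highest;
    spans assigned 'O' (non-entity) are dropped."""
    source = " ".join(tokens)
    results = []
    for i, j in enumerate_spans(tokens):
        span = " ".join(tokens[i:j])
        label = max(TEMPLATES,
                    key=lambda lab: score(source, TEMPLATES[lab].format(span=span)))
        if label != "O":
            results.append((span, label))
    return results

def toy_score(source: str, target: str) -> float:
    """Stand-in for a seq2seq scorer: only recognizes one exact statement."""
    if target == "Obama is a person entity.":
        return 1.0
    if target.endswith("is not a named entity."):
        return 0.5
    return 0.0
```

Running `classify_spans(["Obama", "visited", "Paris"], toy_score)` returns `[("Obama", "PER")]`: every other span falls back to the non-entity template and is discarded, which mirrors how the ranking formulation turns span classification into comparing target-sequence scores.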
1 code implementation • 2 Jun 2021 • Chiyu Song, Hongliang He, Haofei Yu, Pengfei Fang, Leyang Cui, Zhenzhong Lan
The current state-of-the-art ranking methods mainly use an encoding paradigm called Cross-Encoder, which separately encodes each context-candidate pair and ranks the candidates according to their fitness scores.
Ranked #1 on Conversational Response Selection on Persona-Chat
1 code implementation • 10 Nov 2020 • Hanmeng Liu, Leyang Cui, Jian Liu, Yue Zhang
Natural language inference (NLI) is a fundamental NLP task, investigating the entailment relationship between two texts.
1 code implementation • COLING 2020 • Yile Wang, Leyang Cui, Yue Zhang
Contextualized representations give significantly improved results for a wide range of NLP tasks.
1 code implementation • EMNLP 2020 • Dandan Huang, Leyang Cui, Sen Yang, Guangsheng Bao, Kun Wang, Jun Xie, Yue Zhang
Deep learning has led to significant improvement in text summarization with various methods investigated and improved ROUGE scores reported over the years.
no code implementations • Findings (ACL) 2021 • Leyang Cui, Sijie Cheng, Yu Wu, Yue Zhang
We quantitatively investigate the presence of structural commonsense cues in BERT when solving commonsense tasks, and the importance of such cues for the model prediction.
2 code implementations • 16 Jul 2020 • Jian Liu, Leyang Cui, Hanmeng Liu, Dandan Huang, Yile Wang, Yue Zhang
Machine reading is a fundamental task for testing the capability of natural language understanding, which is closely related to human cognition in many aspects.
1 code implementation • ACL 2020 • Leyang Cui, Yu Wu, Shujie Liu, Yue Zhang, Ming Zhou
Non-task oriented dialogue systems have achieved great success in recent years due to largely accessible conversation data and the development of deep learning techniques.
1 code implementation • 27 Nov 2019 • Xuhui Zhou, Yue Zhang, Leyang Cui, Dandan Huang
However, relatively little work has been done investigating commonsense knowledge contained in contextualized representations, which is crucial for human question answering and reading comprehension.
no code implementations • 7 Nov 2019 • Yile Wang, Leyang Cui, Yue Zhang
Contextualized embeddings such as BERT can serve as strong input representations for NLP tasks, outperforming their static embedding counterparts such as skip-gram, CBOW, and GloVe.
1 code implementation • COLING 2020 • Sen Yang, Leyang Cui, Jun Xie, Yue Zhang
In this paper, we conduct a study to exploit methods for better use of summary information.
2 code implementations • IJCNLP 2019 • Leyang Cui, Yue Zhang
CRF has been used as a powerful model for statistical sequence labeling.
Ranked #1 on Part-Of-Speech Tagging on UD