no code implementations • 24 Apr 2024 • Divyansh Agarwal, Alexander R. Fabbri, Philippe Laban, Ben Risher, Shafiq Joty, Caiming Xiong, Chien-Sheng Wu
In a multi-turn setting, our threat model elevates the average attack success rate (ASR) to 86.2%, including a 99% leakage with GPT-4 and claude-1.3.
no code implementations • 15 Nov 2023 • Prafulla Kumar Choubey, Alexander R. Fabbri, Caiming Xiong, Chien-Sheng Wu
Ideal summarization models should generalize to novel summary-worthy content without remembering reference training summaries by rote.
1 code implementation • 15 Nov 2023 • Yixin Liu, Alexander R. Fabbri, Jiawen Chen, Yilun Zhao, Simeng Han, Shafiq Joty, PengFei Liu, Dragomir Radev, Chien-Sheng Wu, Arman Cohan
Our study reveals that instruction controllable text summarization remains a challenging task for LLMs, since (1) all LLMs evaluated still make factual and other types of errors in their summaries; (2) all LLM-based evaluation methods cannot achieve a strong alignment with human annotators when judging the quality of candidate summaries; (3) different LLMs show large performance gaps in summary generation and evaluation.
no code implementations • 14 Nov 2023 • Philippe Laban, Lidiya Murakhovs'ka, Caiming Xiong, Chien-Sheng Wu
The interactive nature of Large Language Models (LLMs) theoretically allows models to refine and improve their answers, yet systematic analysis of the multi-turn behavior of LLMs remains limited.
1 code implementation • 26 Oct 2023 • Lidiya Murakhovs'ka, Philippe Laban, Tian Xie, Caiming Xiong, Chien-Sheng Wu
Making big purchases requires consumers to research or consult a salesperson to gain domain expertise.
no code implementations • 27 Sep 2023 • Philippe Laban, Jesse Vig, Marti A. Hearst, Caiming Xiong, Chien-Sheng Wu
Conversational interfaces powered by Large Language Models (LLMs) have recently become a popular way to obtain feedback during document editing.
no code implementations • 25 Sep 2023 • Tuhin Chakrabarty, Philippe Laban, Divyansh Agarwal, Smaranda Muresan, Chien-Sheng Wu
Inspired by the Torrance Test of Creative Thinking (TTCT), which measures creativity as a process, we use the Consensual Assessment Technique [3] and propose the Torrance Test of Creative Writing (TTCW) to evaluate creativity as a product.
1 code implementation • 17 Sep 2023 • Kung-Hsiang Huang, Philippe Laban, Alexander R. Fabbri, Prafulla Kumar Choubey, Shafiq Joty, Caiming Xiong, Chien-Sheng Wu
In this paper, we propose a new task of summarizing diverse information encountered in multiple news articles encompassing the same event.
1 code implementation • 7 Sep 2023 • Erik Nijkamp, Tian Xie, Hiroaki Hayashi, Bo Pang, Congying Xia, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause, Senthil Purushwalkam, Tong Niu, Wojciech Kryściński, Lidiya Murakhovs'ka, Prafulla Kumar Choubey, Alex Fabbri, Ye Liu, Rui Meng, Lifu Tu, Meghana Bhat, Chien-Sheng Wu, Silvio Savarese, Yingbo Zhou, Shafiq Joty, Caiming Xiong
Most open-source LLMs, on the other hand, are limited in their ability to support longer sequence lengths, which is a key requirement for many tasks that require inference over an input context.
no code implementations • 27 Jun 2023 • Xiang 'Anthony' Chen, Jeff Burke, Ruofei Du, Matthew K. Hong, Jennifer Jacobs, Philippe Laban, Dingzeyu Li, Nanyun Peng, Karl D. D. Willis, Chien-Sheng Wu, Bolei Zhou
Through iterative, cross-disciplinary discussions, we define and propose next-steps for Human-centered Generative AI (HGAI).
1 code implementation • 30 May 2023 • Philippe Laban, Jesse Vig, Wojciech Kryscinski, Shafiq Joty, Caiming Xiong, Chien-Sheng Wu
Text simplification research has mostly focused on sentence-level simplification, even though many desirable edits - such as adding relevant background information or reordering content - may require document-level context.
1 code implementation • 23 May 2023 • Philippe Laban, Wojciech Kryściński, Divyansh Agarwal, Alexander R. Fabbri, Caiming Xiong, Shafiq Joty, Chien-Sheng Wu
To address this, we propose a new protocol for inconsistency detection benchmark creation and implement it in a 10-domain benchmark called SummEdits.
1 code implementation • 7 Mar 2023 • Yixin Liu, Alexander R. Fabbri, Yilun Zhao, PengFei Liu, Shafiq Joty, Chien-Sheng Wu, Caiming Xiong, Dragomir Radev
Interpretability and efficiency are two important considerations for the adoption of neural automatic metrics.
no code implementations • 17 Feb 2023 • Philippe Laban, Chien-Sheng Wu, Lidiya Murakhovs'ka, Xiang 'Anthony' Chen, Caiming Xiong
In a second usability study, we developed and implemented a reading exercise with 95 novice news readers to measure exposure to coverage diversity.
1 code implementation • 20 Dec 2022 • Artidoro Pagnoni, Alexander R. Fabbri, Wojciech Kryściński, Chien-Sheng Wu
In long document controllable summarization, where labeled data is scarce, pretrained models struggle to adapt to the task and effectively respond to user queries.
2 code implementations • 15 Dec 2022 • Yixin Liu, Alexander R. Fabbri, PengFei Liu, Yilun Zhao, Linyong Nan, Ruilin Han, Simeng Han, Shafiq Joty, Chien-Sheng Wu, Caiming Xiong, Dragomir Radev
Human evaluation is the foundation upon which the evaluation of both summarization systems and automatic metrics rests.
1 code implementation • 11 Nov 2022 • Alexander R. Fabbri, Prafulla Kumar Choubey, Jesse Vig, Chien-Sheng Wu, Caiming Xiong
We propose to use sentence-compression data to train the post-editing model to take a summary with extrinsic entity errors marked with special tokens and output a compressed, well-formed summary with those errors removed.
1 code implementation • 9 Nov 2022 • Philippe Laban, Chien-Sheng Wu, Lidiya Murakhovs'ka, Xiang 'Anthony' Chen, Caiming Xiong
There are many potential benefits to news readers accessing diverse sources.
no code implementations • 23 Oct 2022 • Prafulla Kumar Choubey, Yu Bai, Chien-Sheng Wu, Wenhao Liu, Nazneen Rajani
Pre-trained language models (PLMs) have been shown effective for zero-shot (0shot) text classification.
no code implementations • 23 Oct 2022 • Xiangyu Peng, Chen Xing, Prafulla Kumar Choubey, Chien-Sheng Wu, Caiming Xiong
In this way, SESoM inherits the superior generalization of model ensemble approaches and simultaneously captures the sample-specific competence of each source prompt.
1 code implementation • 13 May 2022 • Philippe Laban, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong
Precisely assessing the progress in natural language generation (NLG) tasks is challenging, and human evaluation to establish a preference in a model's output over another is often necessary.
no code implementations • Findings (NAACL) 2022 • Philippe Laban, Chien-Sheng Wu, Lidiya Murakhovs'ka, Wenhao Liu, Caiming Xiong
Question generation (QGen) models are often evaluated with standardized NLG metrics that are based on n-gram overlap.
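As a rough illustration of what an n-gram overlap score measures, the sketch below computes a clipped n-gram precision between a generated question and a single reference. It is not taken from the paper: the function names `ngrams` and `ngram_precision`, the whitespace tokenization, and the example sentences are all assumptions made for illustration, and standard metrics such as BLEU add further steps (multiple n-gram orders, a brevity penalty) that are omitted here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_precision(candidate, reference, n=2):
    """Clipped n-gram precision of a candidate against one reference.

    Hypothetical helper: lowercases and splits on whitespace, which is
    a simplification of the tokenization real metrics use.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    return overlap / total


reference = "what is the capital city of france"
close_match = "what is the capital of france"
paraphrase = "which city serves as france's capital"

print(ngram_precision(close_match, reference))
print(ngram_precision(paraphrase, reference))
```

Under these assumptions the near-copy shares four of its five bigrams with the reference, while the paraphrase shares none, even though it asks the same question. That gap between surface overlap and question quality is the kind of limitation such metrics have.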
2 code implementations • 28 Feb 2022 • Liang Qiu, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong
Extracting structure information from dialogue data can help us better understand user and system behaviors.
1 code implementation • 16 Jan 2022 • Tianbao Xie, Chen Henry Wu, Peng Shi, Ruiqi Zhong, Torsten Scholak, Michihiro Yasunaga, Chien-Sheng Wu, Ming Zhong, Pengcheng Yin, Sida I. Wang, Victor Zhong, Bailin Wang, Chengzu Li, Connor Boyle, Ansong Ni, Ziyu Yao, Dragomir Radev, Caiming Xiong, Lingpeng Kong, Rui Zhang, Noah A. Smith, Luke Zettlemoyer, Tao Yu
Structured knowledge grounding (SKG) leverages structured knowledge to complete user requests, such as semantic parsing over databases and question answering over knowledge bases.
Ranked #1 on Task-Oriented Dialogue Systems on KVRET
1 code implementation • NAACL 2022 • Alexander R. Fabbri, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong
Factual consistency is an essential quality of text summarization models in practical settings.
1 code implementation • Findings (NAACL) 2022 • Jesse Vig, Alexander R. Fabbri, Wojciech Kryściński, Chien-Sheng Wu, Wenhao Liu
Query-focused summarization (QFS) aims to produce summaries that answer particular questions of interest, enabling greater user control and personalization.
2 code implementations • ACL 2022 • Prakhar Gupta, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong
Fact-checking is an essential tool to mitigate the spread of misinformation and disinformation.
1 code implementation • Findings (NAACL) 2022 • Lidiya Murakhovs'ka, Chien-Sheng Wu, Philippe Laban, Tong Niu, Wenhao Liu, Caiming Xiong
Asking good questions is an essential ability for both human and machine intelligence.
no code implementations • 14 Oct 2021 • Prafulla Kumar Choubey, Alexander R. Fabbri, Jesse Vig, Chien-Sheng Wu, Wenhao Liu, Nazneen Fatema Rajani
Then, we fine-tune a base summarization model, which is trained on all training samples, on the clean (noisy) subset to obtain an expert (anti-expert) model.
1 code implementation • Findings (ACL) 2021 • Chien-Sheng Wu, Linqing Liu, Wenhao Liu, Pontus Stenetorp, Caiming Xiong
In this paper, we aim to improve abstractive dialogue summarization quality and, at the same time, enable granularity control.
1 code implementation • ACL 2022 • Chien-Sheng Wu, Andrea Madotto, Wenhao Liu, Pascale Fung, Caiming Xiong
This paper introduces QAConv, a new question answering (QA) dataset that uses conversations as a knowledge source.
1 code implementation • 17 Feb 2021 • Yifan Gao, Jingjing Li, Chien-Sheng Wu, Michael R. Lyu, Irwin King
On our created OR-ShARC dataset, MUDERN achieves the state-of-the-art performance, outperforming existing single-passage conversational machine reading models as well as a new multi-passage conversational machine reading baseline by a large margin.
no code implementations • 1 Jan 2021 • Chien-Sheng Wu, Linqing Liu, Wenhao Liu, Pontus Stenetorp, Caiming Xiong
2) A simple strategy to control the granularity of the final summary.
no code implementations • EMNLP 2020 • Chien-Sheng Wu, Caiming Xiong
This paper investigates pre-trained language models to find out which model intrinsically carries the most informative representation for task-oriented dialogue tasks.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Chien-Sheng Wu, Steven Hoi, Caiming Xiong
We present and investigate two self-supervised objectives: preserving latent consistency and modeling conversational behavior.
1 code implementation • EMNLP 2020 • Jian-Guo Zhang, Kazuma Hashimoto, Wenhao Liu, Chien-Sheng Wu, Yao Wan, Philip S. Yu, Richard Socher, Caiming Xiong
Intent detection is one of the core components of goal-oriented dialog systems, and detecting out-of-scope (OOS) intents is also a practically important skill.
1 code implementation • EMNLP 2020 • Yifan Gao, Chien-Sheng Wu, Jingjing Li, Shafiq Joty, Steven C. H. Hoi, Caiming Xiong, Irwin King, Michael R. Lyu
Based on the learned EDU and entailment representations, we either give the user our final decision ("yes/no/irrelevant") on the initial question, or generate a follow-up question to request more information.
1 code implementation • ICLR 2021 • Tao Yu, Chien-Sheng Wu, Xi Victoria Lin, Bailin Wang, Yi Chern Tan, Xinyi Yang, Dragomir Radev, Richard Socher, Caiming Xiong
We present GraPPa, an effective pre-training approach for table semantic parsing that learns a compositional inductive bias in the joint representations of textual and tabular data.
Ranked #8 on Semantic Parsing on spider
1 code implementation • ACL 2020 • Yifan Gao, Chien-Sheng Wu, Shafiq Joty, Caiming Xiong, Richard Socher, Irwin King, Michael Lyu, Steven C. H. Hoi
The goal of conversational machine reading is to answer user questions given a knowledge base text which may require asking clarification questions.
1 code implementation • 26 May 2020 • Yifan Gao, Chien-Sheng Wu, Shafiq Joty, Caiming Xiong, Richard Socher, Irwin King, Michael R. Lyu, Steven C. H. Hoi
The goal of conversational machine reading is to answer user questions given a knowledge base text which may require asking clarification questions.
1 code implementation • NeurIPS 2020 • Ehsan Hosseini-Asl, Bryan McCann, Chien-Sheng Wu, Semih Yavuz, Richard Socher
Task-oriented dialogue is often decomposed into three tasks: understanding user input, deciding actions, and generating a response.
Ranked #2 on Response Generation on MMConv
1 code implementation • EMNLP 2020 • Chien-Sheng Wu, Steven Hoi, Richard Socher, Caiming Xiong
The underlying difference of linguistic patterns between general text and task-oriented dialogue makes existing pre-trained language models less useful in practice.
no code implementations • 7 Jan 2020 • Andrea Madotto, Zhaojiang Lin, Chien-Sheng Wu, Jamin Shin, Pascale Fung
Dialogue systems require a great deal of different but complementary expertise to assist, inform, and entertain humans.
1 code implementation • Joint Conference on Lexical and Computational Semantics 2020 • Jian-Guo Zhang, Kazuma Hashimoto, Chien-Sheng Wu, Yao Wan, Philip S. Yu, Richard Socher, Caiming Xiong
Dialog state tracking (DST) is a core component in task-oriented dialog systems.
Ranked #4 on Multi-domain Dialogue State Tracking on MULTIWOZ 2.0
no code implementations • CONLL 2019 • Genta Indra Winata, Andrea Madotto, Chien-Sheng Wu, Pascale Fung
Training code-switched language models is difficult due to lack of data and complexity in the grammatical structure.
1 code implementation • IJCNLP 2019 • Peng Xu, Chien-Sheng Wu, Andrea Madotto, Pascale Fung
Sensational headlines are headlines that capture people's attention and generate reader interest.
1 code implementation • LREC 2020 • Chien-Sheng Wu, Andrea Madotto, Zhaojiang Lin, Peng Xu, Pascale Fung
User attributes provide rich and useful information for user understanding, yet structured and easy-to-use attributes are often sparsely populated.
1 code implementation • ACL 2019 • Zhaojiang Lin, Andrea Madotto, Chien-Sheng Wu, Pascale Fung
Existing personalized dialogue models use human designed persona descriptions to improve dialogue consistency.
2 code implementations • ACL 2019 • Chien-Sheng Wu, Andrea Madotto, Ehsan Hosseini-Asl, Caiming Xiong, Richard Socher, Pascale Fung
Over-dependence on domain ontology and lack of knowledge sharing across domains are two practical and yet less studied problems of dialogue state tracking.
Ranked #15 on Multi-domain Dialogue State Tracking on MULTIWOZ 2.0
no code implementations • 19 May 2019 • Chien-Sheng Wu
Mem2Seq is the first model to combine multi-hop memory attention with the idea of the copy mechanism.
4 code implementations • ICLR 2019 • Chien-Sheng Wu, Richard Socher, Caiming Xiong
In our model, a global memory encoder and a local memory decoder are proposed to share external knowledge.
Ranked #4 on Task-Oriented Dialogue Systems on KVRET
no code implementations • 30 Oct 2018 • Genta Indra Winata, Andrea Madotto, Chien-Sheng Wu, Pascale Fung
Speech recognition in mixed language has difficulty adapting to end-to-end frameworks due to the lack of data and overlapping phone sets, for example in words such as "one" in English and "wàn" in Chinese.
no code implementations • 24 Oct 2018 • Genta Indra Winata, Andrea Madotto, Chien-Sheng Wu, Pascale Fung
Building large-scale datasets for training code-switching language models is challenging and very expensive.
no code implementations • EMNLP 2018 • Nayeon Lee, Chien-Sheng Wu, Pascale Fung
Fact-checking of textual sources needs to effectively extract relevant information from large knowledge bases.
1 code implementation • WS 2018 • Peng Xu, Andrea Madotto, Chien-Sheng Wu, Ji Ho Park, Pascale Fung
In this paper, we propose Emo2Vec which encodes emotional semantics into vectors.
Ranked #28 on Sentiment Analysis on SST-5 Fine-grained classification
no code implementations • WS 2018 • Genta Indra Winata, Andrea Madotto, Chien-Sheng Wu, Pascale Fung
Lack of text data has been the major issue in code-switching language modeling.
no code implementations • WS 2018 • Genta Indra Winata, Chien-Sheng Wu, Andrea Madotto, Pascale Fung
We propose an LSTM-based model with a hierarchical architecture for named entity recognition from code-switching Twitter data.
1 code implementation • ACL 2018 • Andrea Madotto, Chien-Sheng Wu, Pascale Fung
End-to-end task-oriented dialog systems usually suffer from the challenge of incorporating knowledge bases.
Ranked #10 on Task-Oriented Dialogue Systems on KVRET
no code implementations • COLING 2016 • Pascale Fung, Anik Dey, Farhad Bin Siddique, Ruixi Lin, Yang Yang, Dario Bertero, Yan Wan, Ricky Ho Yin Chan, Chien-Sheng Wu
Zara, or 'Zara the Supergirl', is a virtual robot that can exhibit empathy while interacting with a user, with the aid of its built-in facial and emotion recognition, sentiment analysis, and speech module.
no code implementations • 13 May 2016 • Pascale Fung, Dario Bertero, Yan Wan, Anik Dey, Ricky Ho Yin Chan, Farhad Bin Siddique, Yang Yang, Chien-Sheng Wu, Ruixi Lin
Although research on empathetic robots is still in the early stage, we described our approach using signal processing techniques, sentiment analysis and machine learning algorithms to make robots that can "understand" human emotion.