no code implementations • 19 Feb 2024 • Kundan Krishna, Sanjana Ramprasad, Prakhar Gupta, Byron C. Wallace, Zachary C. Lipton, Jeffrey P. Bigham
We present GenAudit, a tool intended to assist in fact-checking LLM responses for document-grounded tasks.
no code implementations • 5 Feb 2024 • Sanjana Ramprasad, Kundan Krishna, Zachary C. Lipton, Byron C. Wallace
We analyze whether the prevalence of a given domain in the pretraining corpus affects extractiveness and faithfulness of generated summaries of articles in this domain.
1 code implementation • 23 May 2023 • Kundan Krishna, Prakhar Gupta, Sanjana Ramprasad, Byron C. Wallace, Jeffrey P. Bigham, Zachary C. Lipton
While the NLP community has produced numerous summarization benchmarks, none provide the rich annotations required to simultaneously address many important problems related to control and reliability.
no code implementations • 20 Dec 2022 • Kundan Krishna, Yao Zhao, Jie Ren, Balaji Lakshminarayanan, Jiaming Luo, Mohammad Saleh, Peter J. Liu
We present a large empirical study quantifying the sometimes severe loss in performance (up to 12 ROUGE-1 points) from different types of input noise for a range of datasets and model sizes.
no code implementations • 30 Sep 2022 • Jie Ren, Jiaming Luo, Yao Zhao, Kundan Krishna, Mohammad Saleh, Balaji Lakshminarayanan, Peter J. Liu
Furthermore, the space of potential low-quality outputs is larger, as arbitrary text can be generated, and it is important to know when to trust the generated output.
Abstractive Text Summarization • Out-of-Distribution Detection
1 code implementation • 28 Sep 2022 • Kundan Krishna, Saurabh Garg, Jeffrey P. Bigham, Zachary C. Lipton
In experiments addressing both ELECTRA and RoBERTa models and 10 distinct downstream classification datasets, we observe that self-pretraining rivals standard pretraining on the BookWiki corpus (despite using around $10\times$--$500\times$ less data), outperforming the latter on $7$ and $5$ datasets, respectively.
1 code implementation • Findings (EMNLP) 2021 • Kundan Krishna, Jeffrey Bigham, Zachary C. Lipton
Pretraining techniques leveraging enormous datasets have driven recent advances in text summarization.
no code implementations • 14 Jul 2020 • Kundan Krishna, Amy Pavel, Benjamin Schloss, Jeffrey P. Bigham, Zachary C. Lipton
In this exploratory study, we describe a new dataset consisting of conversation transcripts, post-visit summaries, corresponding supporting evidence (in the transcript), and structured labels.
no code implementations • 11 May 2020 • Abhilasha Sancheti, Kundan Krishna, Balaji Vasan Srinivasan, Anandhavelu Natarajan
Style transfer concerns algorithms that rewrite a piece of text in the style of another while ensuring that the core content is preserved.
no code implementations • ACL 2021 • Kundan Krishna, Sopan Khosla, Jeffrey P. Bigham, Zachary C. Lipton
Following each patient visit, physicians draft long semi-structured clinical summaries called SOAP notes.
no code implementations • 20 Jan 2019 • Kushal Chawla, Kundan Krishna, Balaji Vasan Srinivasan
The first shortcoming is the extractive nature of the generated summaries: the network eventually learns to copy from the input article most of the time, which undermines the abstractive nature of the summaries.
no code implementations • COLING 2018 • Kundan Krishna, Aniket Murhekar, Saumitra Sharma, Balaji Vasan Srinivasan
Neural sequence-to-sequence models have been successfully extended for summary generation. However, existing frameworks generate a single summary for a given input and do not tune the summaries towards any additional constraints/preferences.
no code implementations • COLING 2018 • Balaji Vasan Srinivasan, Pranav Maneriker, Kundan Krishna, Natwar Modani
Enterprise content writers are engaged in writing textual content for various purposes.
no code implementations • NAACL 2018 • Kundan Krishna, Balaji Vasan Srinivasan
Existing summarization algorithms generate a single summary and are not capable of generating multiple summaries tuned to the interests of the readers.