no code implementations • 28 May 2024 • Aparna Elangovan, Ling Liu, Lei Xu, Sravan Bodapati, Dan Roth
In this position paper, we argue that human evaluation of generative large language models (LLMs) should be a multidisciplinary undertaking that draws upon insights from disciplines such as user experience research and human behavioral psychology to ensure that the experimental design and results are reliable.
no code implementations • 7 Nov 2023 • Aparna Elangovan, Jiayuan He, Yuan Li, Karin Verspoor
The NLP community typically relies on performance of a model on a held-out test set to assess generalization.
no code implementations • 12 Oct 2023 • Aparna Elangovan, Jiayuan He, Yuan Li, Karin Verspoor
BERT-based models have performed strongly on leaderboards, yet have been demonstrably worse in real-world settings requiring generalization.
1 code implementation • 6 Jan 2022 • Aparna Elangovan, Yuan Li, Douglas E. V. Pires, Melissa J. Davis, Karin Verspoor
However, by combining high confidence and low variation to identify high-quality predictions, thereby tuning the predictions for precision, we retained 19% of the test predictions with 100% precision.
no code implementations • EACL 2021 • Aparna Elangovan, Jiayuan He, Karin Verspoor
Public datasets are often used to evaluate the efficacy and generalizability of state-of-the-art methods for many tasks in natural language processing (NLP).
1 code implementation • 3 Feb 2021 • Aparna Elangovan, Jiayuan He, Karin Verspoor
Public datasets are often used to evaluate the efficacy and generalizability of state-of-the-art methods for many tasks in natural language processing (NLP).
no code implementations • 20 Aug 2020 • Aparna Elangovan, Melissa Davis, Karin Verspoor
Motivation: Protein-protein interactions (PPI) are critical to the function of proteins in both normal and diseased cells, and many critical protein functions are mediated by interactions. Knowledge of the nature of these interactions is important for the construction of networks to analyse biological data.