no code implementations • EMNLP 2021 • Or Honovich, Leshem Choshen, Roee Aharoni, Ella Neeman, Idan Szpektor, Omri Abend
Neural knowledge-grounded generative models for dialogue often produce content that is factually inconsistent with the knowledge they rely on, making them unreliable and limiting their applicability.
Abstractive Text Summarization Natural Language Inference +3
no code implementations • 9 May 2024 • Zorik Gekhman, Gal Yona, Roee Aharoni, Matan Eyal, Amir Feder, Roi Reichart, Jonathan Herzig
In this work, we study the impact of such exposure to new knowledge on the capability of the fine-tuned model to utilize its pre-existing knowledge.
no code implementations • 15 Feb 2024 • Shashwat Singh, Shauli Ravfogel, Jonathan Herzig, Roee Aharoni, Ryan Cotterell, Ponnurangam Kumaraguru
We demonstrate the effectiveness of the proposed approaches in mitigating bias in multiclass classification and in reducing the generation of toxic language, outperforming strong baselines.
no code implementations • 1 Feb 2024 • Alon Jacovi, Yonatan Bitton, Bernd Bohnet, Jonathan Herzig, Or Honovich, Michael Tseng, Michael Collins, Roee Aharoni, Mor Geva
REVEAL includes comprehensive labels for the relevance, attribution to evidence passages, and logical correctness of each reasoning step in a language model's answer, across a variety of datasets and state-of-the-art language models.
no code implementations • 9 Jan 2024 • Gal Yona, Roee Aharoni, Mor Geva
In this work, we propose GRANOLA QA, a novel evaluation setting where a predicted answer is evaluated in terms of accuracy and informativeness against a set of multi-granularity answers.
no code implementations • 3 Jan 2024 • Uri Shaham, Jonathan Herzig, Roee Aharoni, Idan Szpektor, Reut Tsarfaty, Matan Eyal
As instruction-tuned large language models (LLMs) gain global adoption, their ability to follow instructions in multiple languages becomes increasingly crucial.
no code implementations • 16 Oct 2023 • Alon Jacovi, Avi Caciularu, Jonathan Herzig, Roee Aharoni, Bernd Bohnet, Mor Geva
A growing area of research investigates augmenting language models with tools (e. g., search engines, calculators) to overcome their shortcomings (e. g., missing or incorrect knowledge, incorrect logical inferences).
no code implementations • 31 May 2023 • Paul Roit, Johan Ferret, Lior Shani, Roee Aharoni, Geoffrey Cideron, Robert Dadashi, Matthieu Geist, Sertan Girgin, Léonard Hussenot, Orgad Keller, Nikola Momchev, Sabela Ramos, Piotr Stanczyk, Nino Vieillard, Olivier Bachem, Gal Elidan, Avinatan Hassidim, Olivier Pietquin, Idan Szpektor
Despite the seeming success of contemporary grounded text generation systems, they often tend to generate factually inconsistent text with respect to their input.
Abstractive Text Summarization Natural Language Inference +2
no code implementations • 23 May 2023 • Benjamin Muller, John Wieting, Jonathan H. Clark, Tom Kwiatkowski, Sebastian Ruder, Livio Baldini Soares, Roee Aharoni, Jonathan Herzig, Xinyi Wang
Based on these models, we improve the attribution level of a cross-lingual question-answering system.
no code implementations • 22 May 2023 • Elizabeth Clark, Shruti Rijhwani, Sebastian Gehrmann, Joshua Maynez, Roee Aharoni, Vitaly Nikolaev, Thibault Sellam, Aditya Siddhant, Dipanjan Das, Ankur P. Parikh
In this work, we introduce SEAHORSE, a dataset for multilingual, multifaceted summarization evaluation.
1 code implementation • 18 May 2023 • Zorik Gekhman, Jonathan Herzig, Roee Aharoni, Chen Elkind, Idan Szpektor
Factual consistency evaluation is often conducted using Natural Language Inference (NLI) models, yet these models exhibit limited success in evaluating summaries.
1 code implementation • NeurIPS 2023 • Michal Yarom, Yonatan Bitton, Soravit Changpinyo, Roee Aharoni, Jonathan Herzig, Oran Lang, Eran Ofek, Idan Szpektor
Automatically determining whether a text and a corresponding image are semantically aligned is a significant challenge for vision-language models, with applications in generative text-to-image and image-to-text tasks.
Ranked #11 on Visual Reasoning on Winoground
no code implementations • 12 May 2023 • Gal Yona, Or Honovich, Itay Laish, Roee Aharoni
We use CID to highlight context-specific biases that are hard to detect with standard decoding strategies and quantify the effect of different input perturbations.
no code implementations • 27 Apr 2023 • Yonatan Bitton, Shlomi Cohen-Ganor, Ido Hakimi, Yoad Lewenberg, Roee Aharoni, Enav Weinreb
One of the exciting capabilities of recent language models for dialog is their ability to independently search for relevant information to ground a given dialog response.
no code implementations • 20 Dec 2022 • Roee Aharoni, Shashi Narayan, Joshua Maynez, Jonathan Herzig, Elizabeth Clark, Mirella Lapata
Abstractive summarization has enjoyed renewed interest in recent years, thanks to pre-trained language models and the availability of large-scale datasets.
no code implementations • 19 Dec 2022 • Matan Eyal, Hila Noga, Roee Aharoni, Idan Szpektor, Reut Tsarfaty
We demonstrate that by casting tasks in the Hebrew NLP pipeline as text-to-text tasks, we can leverage powerful multilingual, pretrained sequence-to-sequence models as mT5, eliminating the need for a specialized, morpheme-based, separately fine-tuned decoder.
1 code implementation • 15 Dec 2022 • Bernd Bohnet, Vinh Q. Tran, Pat Verga, Roee Aharoni, Daniel Andor, Livio Baldini Soares, Massimiliano Ciaramita, Jacob Eisenstein, Kuzman Ganchev, Jonathan Herzig, Kai Hui, Tom Kwiatkowski, Ji Ma, Jianmo Ni, Lierni Sestorain Saralegui, Tal Schuster, William W. Cohen, Michael Collins, Dipanjan Das, Donald Metzler, Slav Petrov, Kellie Webster
We take human annotations as a gold standard and show that a correlated automatic metric is suitable for development.
1 code implementation • 10 Nov 2022 • Ella Neeman, Roee Aharoni, Or Honovich, Leshem Choshen, Idan Szpektor, Omri Abend
Question answering models commonly have access to two sources of "knowledge" during inference time: (1) parametric knowledge - the factual knowledge encoded in the model weights, and (2) contextual knowledge - external knowledge (e. g., a Wikipedia passage) given to the model to generate a grounded answer.
1 code implementation • NAACL 2022 • Or Honovich, Roee Aharoni, Jonathan Herzig, Hagai Taitelbaum, Doron Kukliansy, Vered Cohen, Thomas Scialom, Idan Szpektor, Avinatan Hassidim, Yossi Matias
Grounded text generation systems often generate text that contains factual inconsistencies, hindering their real-world applicability.
1 code implementation • 16 Apr 2021 • Or Honovich, Leshem Choshen, Roee Aharoni, Ella Neeman, Idan Szpektor, Omri Abend
Neural knowledge-grounded generative models for dialogue often produce content that is factually inconsistent with the knowledge they rely on, making them unreliable and limiting their applicability.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Zorik Gekhman, Roee Aharoni, Genady Beryozkin, Markus Freitag, Wolfgang Macherey
Our approach achieves the highest correlation with human judgements on 9 out of the 18 language pairs from the WMT19 benchmark for evaluation without references, which is the largest number of wins for a single evaluation method on this task.
no code implementations • 11 Aug 2020 • Amit Moryossef, Ioannis Tsochantaridis, Roee Aharoni, Sarah Ebling, Srini Narayanan
We propose a lightweight real-time sign language detection model, as we identify the need for such a case in videoconferencing.
1 code implementation • ACL 2020 • Roee Aharoni, Yoav Goldberg
The notion of "in-domain data" in NLP is often over-simplistic and vague, as textual data varies in many nuanced linguistic aspects such as topic, style or level of formality.
1 code implementation • CONLL 2019 • Ohad Rozen, Vered Shwartz, Roee Aharoni, Ido Dagan
Phenomenon-specific "adversarial" datasets have been recently designed to perform targeted stress-tests for particular inference types.
no code implementations • WS 2019 • Amit Moryossef, Roee Aharoni, Yoav Goldberg
When translating from a language that does not morphologically mark information such as gender and number into a language that does, translation systems must {``}guess{''} this missing information, often leading to incorrect translations in the given context.
no code implementations • 17 Mar 2019 • Naveen Arivazhagan, Ankur Bapna, Orhan Firat, Roee Aharoni, Melvin Johnson, Wolfgang Macherey
Multilingual Neural Machine Translation (NMT) models are capable of translating between multiple source and target languages.
no code implementations • 8 Mar 2019 • Amit Moryossef, Roee Aharoni, Yoav Goldberg
When translating from a language that does not morphologically mark information such as gender and number into a language that does, translation systems must "guess" this missing information, often leading to incorrect translations in the given context.
no code implementations • NAACL 2019 • Roee Aharoni, Melvin Johnson, Orhan Firat
Our experiments on a large-scale dataset with 102 languages to and from English and up to one million examples per direction also show promising results, surpassing strong bilingual baselines and encouraging future work on massively multilingual NMT.
1 code implementation • ACL 2018 • Roee Aharoni, Yoav Goldberg
To aid this, we present a new train-development-test data split and neural models augmented with a copy-mechanism, outperforming the best reported baseline by 8. 68 BLEU and fostering further progress on the task.
2 code implementations • 2 May 2018 • Roee Aharoni, Yoav Goldberg
To aid this, we present a new train-development-test data split and neural models augmented with a copy-mechanism, outperforming the best reported baseline by 8. 68 BLEU and fostering further progress on the task.
no code implementations • ACL 2017 • Roee Aharoni, Yoav Goldberg
We present a simple method to incorporate syntactic information about the target language in a neural machine translation system by translating into linearized, lexicalized constituency trees.
1 code implementation • ACL 2017 • Roee Aharoni, Yoav Goldberg
We present a neural model for morphological inflection generation which employs a hard attention mechanism, inspired by the nearly-monotonic alignment commonly found between the characters in a word and the characters in its inflection.