no code implementations • Asian Chapter of the Association for Computational Linguistics 2020 • Shuo Sun, Marina Fomicheva, Frédéric Blain, Vishrav Chaudhary, Ahmed El-Kishky, Adithya Renduchintala, Francisco Guzmán, Lucia Specia
Predicting the quality of machine translation has traditionally been addressed with language-specific models, under the assumption that the quality label distribution or linguistic features exhibit traits that are not shared across languages.
no code implementations • ACL 2020 • Marina Fomicheva, Lucia Specia, Francisco Guzmán
Reliably evaluating Machine Translation (MT) through automated metrics is a long-standing problem.
no code implementations • ACL 2020 • Shuo Sun, Francisco Guzmán, Lucia Specia
Recent advances in pre-trained multilingual language models have led to state-of-the-art results on the task of quality estimation (QE) for machine translation.
1 code implementation • IJCNLP 2019 • Francisco Guzmán, Peng-Jen Chen, Myle Ott, Juan Pino, Guillaume Lample, Philipp Koehn, Vishrav Chaudhary, Marc'Aurelio Ranzato
For machine translation, the vast majority of language pairs in the world are considered low-resource because they have little parallel data available.
no code implementations • WS 2019 • Philipp Koehn, Francisco Guzmán, Vishrav Chaudhary, Juan Pino
Following the WMT 2018 Shared Task on Parallel Corpus Filtering, we posed the challenge of assigning sentence-level quality scores for very noisy corpora of sentence pairs crawled from the web, with the goal of sub-selecting 2% and 10% of the highest-quality data to be used to train machine translation systems.
no code implementations • COLING 2016 • Francisco Guzmán, Houda Bouamor, Ramy Baly, Nizar Habash
Evaluation of machine translation (MT) into morphologically rich languages (MRLs) has not been well studied despite posing many challenges.