no code implementations • EACL (Hackashop) 2021 • Senja Pollak, Marko Robnik-Šikonja, Matthew Purver, Michele Boggia, Ravi Shekhar, Marko Pranjić, Salla Salmela, Ivar Krustok, Tarmo Paju, Carl-Gustav Linden, Leo Leppänen, Elaine Zosa, Matej Ulčar, Linda Freienthal, Silver Traat, Luis Adrián Cabrera-Diego, Matej Martinc, Nada Lavrač, Blaž Škrlj, Martin Žnidaršič, Andraž Pelicon, Boshko Koloski, Vid Podpečan, Janez Kranjc, Shane Sheehan, Emanuela Boros, Jose G. Moreno, Antoine Doucet, Hannu Toivonen
This paper presents tools and data sources collected and released by the EMBEDDIA project, supported by the European Union’s Horizon 2020 research and innovation program.
no code implementations • EACL (Hackashop) 2021 • Boshko Koloski, Elaine Zosa, Timen Stepišnik-Perdih, Blaž Škrlj, Tarmo Paju, Senja Pollak
Team Name: team-8. Embeddia Tool: Cross-Lingual Document Retrieval (Zosa et al.). Dataset: Estonian and Latvian news datasets. Abstract: Contemporary news media face increasing amounts of available data that can be of use when prioritizing, selecting and discovering new news.
1 code implementation • SemEval (NAACL) 2022 • Elaine Zosa, Emanuela Boros, Boshko Koloski, Lidia Pivovarova
In this paper, we present the participation of the EMBEDDIA team in the SemEval-2022 Task 8 (Multilingual News Article Similarity).
no code implementations • LREC 2022 • Matej Martinc, Syrielle Montariol, Lidia Pivovarova, Elaine Zosa
We tackle the problem of neural headline generation in a low-resource setting, where only a limited amount of data is available to train a model.
no code implementations • 2 Apr 2024 • Risto Luukkonen, Jonathan Burdge, Elaine Zosa, Aarne Talman, Ville Komulainen, Väinö Hatanpää, Peter Sarlin, Sampo Pyysalo
The pretraining of state-of-the-art large language models now requires trillions of words of text, which is orders of magnitude more than available for the vast majority of languages.
no code implementations • 12 Mar 2024 • Timothee Mickus, Elaine Zosa, Raúl Vázquez, Teemu Vahtola, Jörg Tiedemann, Vincent Segonne, Alessandro Raganato, Marianna Apidianaki
This paper presents the results of SHROOM, a shared task focused on detecting hallucinations: outputs from natural language generation (NLG) systems that are fluent, yet inaccurate.
no code implementations • 18 Oct 2023 • Timothee Mickus, Elaine Zosa, Denis Paperno
Grounding has been argued to be a crucial component towards the development of more complete and truly semantically competent artificial intelligence systems.
1 code implementation • COLING 2022 • Elaine Zosa, Lidia Pivovarova
This paper presents M3L-Contrast -- a novel multimodal multilingual (M3L) neural topic model for comparable data that maps texts from multiple languages and images into a shared topic space.
1 code implementation • RANLP 2021 • Elaine Zosa, Ravi Shekhar, Mladen Karan, Matthew Purver
Moderation of reader comments is a significant problem for online news platforms.
no code implementations • SEMEVAL 2020 • Matej Martinc, Syrielle Montariol, Elaine Zosa, Lidia Pivovarova
This paper describes the approaches used by the Discovery Team to solve SemEval-2020 Task 1 - Unsupervised Lexical Semantic Change Detection.
no code implementations • 20 Nov 2020 • Jani Marjanen, Elaine Zosa, Simon Hengchen, Lidia Pivovarova, Mikko Tolonen
This paper addresses methodological issues in diachronic data analysis for historical research.
no code implementations • LREC 2020 • Elaine Zosa, Mark Granroth-Wilding, Lidia Pivovarova
We address the problem of linking related documents across languages in a multilingual collection.
no code implementations • 18 Jan 2020 • Matej Martinc, Syrielle Montariol, Elaine Zosa, Lidia Pivovarova
The way words are used evolves through time, mirroring the cultural or technological evolution of society.
no code implementations • RANLP 2019 • Lidia Pivovarova, Elaine Zosa, Jani Marjanen
This paper is part of a collaboration between computer scientists and historians aimed at the development of novel tools and methods to improve the analysis of historical newspapers.
no code implementations • RANLP 2019 • Elaine Zosa, Mark Granroth-Wilding
Dynamic topic models (DTMs) capture the evolution of topics and trends in time series data. Current DTMs are applicable only to monolingual datasets.