1 code implementation • COLING (CODI, CRAC) 2022 • Mathilde Veron, Olivier Galibert, Guillaume Bernard, Sophie Rosset
Dialog state tracking (DST) is a core step for task-oriented dialogue systems aiming to track the user’s current goal during a dialogue.
1 code implementation • MMMPIE (COLING) 2022 • Juan Manuel Coria, Mathilde Veron, Sahar Ghannay, Guillaume Bernard, Hervé Bredin, Olivier Galibert, Sophie Rosset
Knowledge transfer between neural language models is a widely used technique that has proven to improve performance in a multitude of natural language tasks, in particular with the recent rise of large pre-trained language models like BERT.
no code implementations • 17 Apr 2024 • Pierre Lepagnol, Thomas Gerald, Sahar Ghannay, Christophe Servan, Sophie Rosset
This study is part of the debate on the efficiency of large versus small language models for text classification by prompting. We assess the performance of small language models in zero-shot text classification, challenging the prevailing dominance of large models. Across 15 datasets, our investigation benchmarks language models from 77M to 40B parameters using different architectures and scoring functions.
1 code implementation • 28 Mar 2024 • Nadège Alavoine, Gaëlle Laperriere, Christophe Servan, Sahar Ghannay, Sophie Rosset
A combination of multiple datasets, including the MEDIA dataset, was suggested for training this joint model.
no code implementations • 27 Mar 2024 • Christophe Servan, Sahar Ghannay, Sophie Rosset
Within the current trend of Pretrained Language Models (PLM), more and more criticisms are emerging about the ethical and ecological impact of such models.
no code implementations • 19 Jul 2022 • Oralie Cattan, Sahar Ghannay, Christophe Servan, Sophie Rosset
In this paper, we propose a unified benchmark focused on evaluating model quality and ecological impact on two well-known French spoken language understanding tasks.
no code implementations • RANLP 2021 • Oralie Cattan, Christophe Servan, Sophie Rosset
In this paper, we survey the state of the art in efforts dedicated to the usability of Transformer-based models and propose to evaluate these improvements on question-answering performance for French, a language with few resources.
no code implementations • ACL (MetaNLP) 2021 • Oralie Cattan, Christophe Servan, Sophie Rosset
Supervised deep learning-based approaches have been applied to task-oriented dialog and have proven to be effective for limited domain and language applications when a sufficient number of training examples are available.
1 code implementation • 14 Sep 2021 • Juan M. Coria, Hervé Bredin, Sahar Ghannay, Sophie Rosset
We propose to address online speaker diarization as a combination of incremental clustering and local diarization applied to a rolling buffer updated every 500ms.
no code implementations • 26 Feb 2021 • Mathilde Veron, Sophie Rosset, Olivier Galibert, Guillaume Bernard
On-the-job learning consists in continuously learning while being used in production, in an open environment, meaning that the system has to deal on its own with situations and elements never seen before.
1 code implementation • COLING 2020 • Sahar Ghannay, Christophe Servan, Sophie Rosset
In this paper, we present a study on a French Spoken Language Understanding (SLU) task: the MEDIA task.
no code implementations • SEMEVAL 2020 • Somnath Banerjee, Sahar Ghannay, Sophie Rosset, Anne Vilnat, Paolo Rosso
This paper describes the participation of the LIMSI_UPV team in SemEval-2020 Task 9: Sentiment Analysis for Code-Mixed Social Media Text.
1 code implementation • 30 Aug 2020 • Somnath Banerjee, Sahar Ghannay, Sophie Rosset, Anne Vilnat, Paolo Rosso
This paper describes the participation of the LIMSI UPV team in SemEval-2020 Task 9: Sentiment Analysis for Code-Mixed Social Media Text.
no code implementations • WS 2020 • Juan Manuel Coria, Sahar Ghannay, Sophie Rosset, Hervé Bredin
The task of automatic misogyny identification and categorization has not received as much attention as other natural language tasks have, even though it is crucial for identifying hate speech in social Internet interactions.
no code implementations • JEPTALNRECITAL 2020 • Antoine Caubrière, Sophie Rosset, Yannick Estève, Antoine Laurent, Emmanuel Morin
The most recent data available for structured named entity recognition (NER) from French speech come from the 2012 ETAPE evaluation campaign.
no code implementations • LREC 2020 • Antoine Caubrière, Sophie Rosset, Yannick Estève, Antoine Laurent, Emmanuel Morin
For this type of system, we propose an original 3-pass approach.
1 code implementation • 31 Mar 2020 • Juan M. Coria, Hervé Bredin, Sahar Ghannay, Sophie Rosset
Despite the growing popularity of metric learning approaches, very little work has attempted to perform a fair comparison of these techniques for speaker verification.
2 code implementations • 30 May 2019 • Rachel Bawden, Sophie Rosset, Thomas Lavergne, Eric Bilinski
We provide a preliminary analysis of the corpus to confirm that the participants' judgments reveal perceptible differences in MT quality between the two MT systems used.
no code implementations • 10 May 2019 • Jan Deriu, Alvaro Rodrigo, Arantxa Otegi, Guillermo Echegoyen, Sophie Rosset, Eneko Agirre, Mark Cieliebak
We cover each class by introducing the main technologies developed for the dialogue systems and then by presenting the evaluation methods regarding this class.
no code implementations • 23 Nov 2018 • Antoine Neuraz, Leonardo Campillos Llanos, Anita Burgun, Sophie Rosset
In the biomedical domain, the lack of sharable datasets often limits the possibility of developing natural language processing systems, especially dialogue applications and natural language understanding models.
no code implementations • JEPTALNRECITAL 2018 • Pierre Magistry, Anne-Laure Ligozat, Sophie Rosset
This article presents a new part-of-speech tagging method suited to low-resource languages: the definition of the context used to build the word embeddings is adapted to the task, and new vectors are created for unknown words.
no code implementations • LREC 2018 • Delphine Bernhard, Anne-Laure Ligozat, Fanny Martin, Myriam Bras, Pierre Magistry, Marianne Vergez-Couret, Lucie Steiblé, Pascale Erhart, Nabil Hathout, Dominique Huck, Christophe Rey, Philippe Reynés, Sophie Rosset, Jean Sibille, Thomas Lavergne
no code implementations • JEPTALNRECITAL 2018 • Rachel Bawden, Thomas Lavergne, Sophie Rosset
In this article, we provide several approaches to the automatic identification of parallel sentences that require sentence-external linguistic context to be correctly translated.
no code implementations • WS 2017 • Leonardo Campillos Llanos, Sophie Rosset, Pierre Zweigenbaum
We present the work-in-progress of automating the classification of doctor-patient questions in the context of a simulated consultation with a virtual patient.
no code implementations • JEPTALNRECITAL 2017 • José Moreno, Romaric Besançon, Romain Beaumont, Eva D'hondt, Anne-Laure Ligozat, Sophie Rosset, Xavier Tannier, Brigitte Grau
Entity disambiguation (or entity linking), which consists in linking entity mentions in a text to entities in a knowledge base, is a problem that arises, among other settings, in the automatic population of knowledge bases from text.
1 code implementation • 8 Apr 2017 • Guillaume Dubuisson Duplessis, Franck Charras, Vincent Letard, Anne-Laure Ligozat, Sophie Rosset
This paper investigates the use of recurrent surface text patterns to represent and index open-domain dialogue utterances for a retrieval system that can be embedded in a conversational agent.
no code implementations • JEPTALNRECITAL 2016 • Olivier Galibert, Juliette Kahn, Sophie Rosset
The work presented here falls within the evaluation of automatic speech recognition systems with a view to their use in a downstream task, here named entity recognition.
no code implementations • JEPTALNRECITAL 2016 • Franck Charras, Guillaume Dubuisson Duplessis, Vincent Letard, Anne-Laure Ligozat, Sophie Rosset
This demonstration presents an open-domain dialogue system that uses a base of dialogue examples, built automatically from a subtitle corpus, to handle "chatbot"-style social dialogue.
no code implementations • JEPTALNRECITAL 2016 • Vincent Letard, Gabriel Illouz, Sophie Rosset
This article examines the use of analogical reasoning in the context of incremental learning.
no code implementations • JEPTALNRECITAL 2016 • Olivier Galibert, Nathalie Camelin, Paul Deléglise, Sophie Rosset
We compare different metrics here, notably WER, NE-WER, and ATENE, a recently proposed metric for evaluating speech recognition systems given a named entity recognition task.
no code implementations • LREC 2016 • Maud Ehrmann, Damien Nouvel, Sophie Rosset
Recognition of real-world entities is crucial for most NLP applications.
no code implementations • LREC 2016 • Guillaume Dubuisson Duplessis, Vincent Letard, Anne-Laure Ligozat, Sophie Rosset
This system is used as a chatterbot system to collect a corpus of 41 open-domain textual dialogues with 27 human participants.
no code implementations • LREC 2016 • Leonardo Campillos Llanos, Dhouha Bouamor, Pierre Zweigenbaum, Sophie Rosset
We introduce a dialogue task between a virtual patient and a doctor where the dialogue system, playing the patient part in a simulated consultation, must reconcile a specialized level, to understand what the doctor says, and a lay level, to output realistic patient-language utterances.
no code implementations • LREC 2016 • Dhouha Bouamor, Leonardo Campillos Llanos, Anne-Laure Ligozat, Sophie Rosset, Pierre Zweigenbaum
While measuring the readability of texts has been a long-standing research topic, assessing the technicality of terms has only been addressed more recently and mostly for the English language.
no code implementations • LREC 2016 • Johann Poignant, Mateusz Budnik, Hervé Bredin, Claude Barras, Mickael Stefas, Pierrick Bruneau, Gilles Adda, Laurent Besacier, Hazim Ekenel, Gil Francopoulo, Javier Hernando, Joseph Mariani, Ramon Morros, Georges Quénot, Sophie Rosset, Thomas Tamisier
In this paper, we describe the organization and the implementation of the CAMOMILE collaborative annotation framework for multimodal, multimedia, multilingual (3M) data.
no code implementations • LREC 2016 • Olivier Galibert, Mohamed Ameur Ben Jannet, Juliette Kahn, Sophie Rosset
In the context of Automatic Speech Recognition (ASR) used as a first step towards Named Entity Recognition (NER) in speech, the seriousness of errors is usually determined by their frequency, because WER is used as the metric to evaluate the ASR output, despite the emergence of more relevant measures in the literature.
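As a rough illustration of why WER reflects only error frequency, here is a minimal sketch of the standard word error rate (word-level edit distance divided by reference length). The function name `word_error_rate` and the example sentences are hypothetical and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one error on an entity name and one error on a
# function word receive the same score, since only the count matters.
reference = "call doctor martin tomorrow"
entity_error = word_error_rate(reference, "call doctor marten tomorrow")
function_word_error = word_error_rate(reference, "call a doctor martin tomorrow")
```

Both hypotheses contain a single error over four reference words, so the metric cannot tell which one would hurt a downstream NER system more.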
no code implementations • JEPTALNRECITAL 2015 • Cyril Grouin, Véronique Moriceau, Sophie Rosset, Pierre Zweigenbaum
In this article, we present the methods we developed to analyze hospital reports written in English.
no code implementations • JEPTALNRECITAL 2015 • Leonardo Campillos, Dhouha Bouamor, Éric Bilinski, Anne-Laure Ligozat, Pierre Zweigenbaum, Sophie Rosset
The demonstrator we describe here is a prototype dialogue system whose goal is to simulate a patient.
no code implementations • LREC 2014 • Mohamed Ben Jannet, Martine Adda-Decker, Olivier Galibert, Juliette Kahn, Sophie Rosset
We then introduce a new metric, the Entity Tree Error Rate (ETER), to evaluate hierarchical and structured named entity detection, classification and decomposition.
no code implementations • LREC 2014 • Maria Goryainova, Cyril Grouin, Sophie Rosset, Ioana Vasilescu
The study provides an original perspective on speech transcription errors by focusing on the morpho-syntactic features of the erroneous chunks and of the surrounding left and right context.
no code implementations • LREC 2014 • Daniel Luzzati, Cyril Grouin, Ioana Vasilescu, Martine Adda-Decker, Eric Bilinski, Nathalie Camelin, Juliette Kahn, Carole Lailler, Lori Lamel, Sophie Rosset
This paper is concerned with human assessments of the severity of errors in ASR outputs.
no code implementations • LREC 2014 • Cyril Grouin, Jeremy Leixa, Aurélie Névéol, Sophie Rosset, Xavier Tannier, Pierre Zweigenbaum
Overall, a total of 26,409 entity annotations were mapped to 5,797 unique UMLS concepts.
no code implementations • JEPTALNRECITAL 2012 • Camille Dutrey, Chloé Clavel, Sophie Rosset, Ioana Vasilescu, Martine Adda-Decker
no code implementations • LREC 2012 • Marco Dinarelli, Sophie Rosset
We evaluate our procedure for preprocessing OCR-ized data in two ways: in terms of perplexity and OOV rate of a language model on development and evaluation data, and in terms of the performance of the named entity detection system on the preprocessed data.
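For readers unfamiliar with the OOV rate mentioned above, the following is a minimal sketch of how such a rate is commonly computed: the share of evaluation tokens absent from the language model vocabulary. The function name `oov_rate` and the toy tokens are hypothetical; the paper's actual tokenization and vocabulary construction are not described here.

```python
def oov_rate(vocabulary_tokens, eval_tokens):
    """Fraction of evaluation tokens that do not appear in the vocabulary."""
    if not eval_tokens:
        raise ValueError("eval_tokens must not be empty")
    vocabulary = set(vocabulary_tokens)
    unknown = sum(1 for token in eval_tokens if token not in vocabulary)
    return unknown / len(eval_tokens)


# Hypothetical example: an OCR confusion ("0ld" for "old") and a word that
# is simply unseen ("hall") both count as out-of-vocabulary tokens.
rate = oov_rate("the old town".split(), "the 0ld town hall".split())
```

Under this definition, preprocessing that repairs OCR confusions should lower the rate, since fewer corrupted tokens fall outside the vocabulary.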
no code implementations • LREC 2012 • Olivier Galibert, Sophie Rosset, Cyril Grouin, Pierre Zweigenbaum, Ludovic Quintard
Within the framework of the Quaero project, we proposed a new definition of named entities, based upon an extension of the coverage of named entities as well as the structure of those named entities.
no code implementations • LREC 2012 • David Doukhan, Sophie Rosset, Albert Rilliard, Christophe d'Alessandro, Martine Adda-Decker
This corpus is used for predicting expressive prosody in children's tales, above the level of the sentence.