no code implementations • RaPID (LREC) 2022 • Birger Moell, Jim O’Regan, Shivam Mehta, Ambika Kirkland, Harm Lameris, Joakim Gustafson, Jonas Beskow
As part of the PSST challenge, we explore how data augmentations, data sources, and model size affect phoneme transcription accuracy on speech produced by individuals with aphasia.
no code implementations • LREC 2022 • Siyang Wang, Joakim Gustafson, Éva Székely
Perceptual results show little difference between the compared filler insertion models, including ground truth, which may be due both to the ambiguity of what constitutes good filler insertion and to a strong neural spontaneous TTS that produces natural speech irrespective of input.
no code implementations • 11 Jul 2023 • Siyang Wang, Gustav Eje Henter, Joakim Gustafson, Éva Székely
Prior work has shown that SSL is an effective intermediate representation in two-stage text-to-speech (TTS) for both read and spontaneous speech.
no code implementations • 29 May 2023 • Erik Ekstedt, Siyang Wang, Éva Székely, Joakim Gustafson, Gabriel Skantze
Turn-taking is a fundamental aspect of human communication in which speakers convey their intention to either hold or yield their turn through prosodic cues.
no code implementations • 5 Mar 2023 • Siyang Wang, Gustav Eje Henter, Joakim Gustafson, Éva Székely
Recent work has explored using self-supervised learning (SSL) speech representations such as wav2vec2.0 as the representation medium in standard two-stage TTS, in place of conventionally used mel-spectrograms.
no code implementations • 24 Nov 2022 • Harm Lameris, Shivam Mehta, Gustav Eje Henter, Joakim Gustafson, Éva Székely
Spontaneous speech has many affective and pragmatic functions that are interesting and challenging to model in TTS.
1 code implementation • 25 Aug 2021 • Siyang Wang, Simon Alexanderson, Joakim Gustafson, Jonas Beskow, Gustav Eje Henter, Éva Székely
Text-to-speech and co-speech gesture synthesis have until now been treated as separate areas by two different research communities, and applications merely stack the two technologies using a simple system-level pipeline.
no code implementations • LREC 2020 • Éva Székely, Jens Edlund, Joakim Gustafson
Spontaneous speech is emergent and transient, whereas text read out loud is pre-planned.
no code implementations • LREC 2020 • Dimosthenis Kontogiorgos, Elena Sibirtseva, Joakim Gustafson
In this paper, we introduce a multimodal dataset in which subjects are instructing each other how to assemble IKEA furniture.
no code implementations • 5 Sep 2017 • Patrik Jonell, Joseph Mendelson, Thomas Storskog, Göran Hagman, Per Östberg, Iolanda Leite, Taras Kucherenko, Olga Mikheeva, Ulrika Akenine, Vesna Jelic, Alina Solomon, Jonas Beskow, Joakim Gustafson, Miia Kivipelto, Hedvig Kjellström
This paper presents the EACare project, an ambitious multi-disciplinary collaboration with the aim to develop an embodied system, capable of carrying out neuropsychological tests to detect early signs of dementia, e.g., due to Alzheimer's disease.
no code implementations • LREC 2016 • Jens Edlund, Joakim Gustafson
In 2014, the Swedish government tasked a Swedish agency, The Swedish Post and Telecom Authority (PTS), with investigating how to best create and populate an infrastructure for spoken language resources (Ref N2014/2840/ITP).