no code implementations • 8 Jan 2024 • Tobias Cord-Landwehr, Christoph Boeddeker, Cătălin Zorilă, Rama Doddipatla, Reinhold Haeb-Umbach
We propose a modified teacher-student training for the extraction of frame-wise speaker embeddings that allows for an effective diarization of meeting scenarios containing partially overlapping speech.
no code implementations • 28 Sep 2023 • Thilo von Neumann, Christoph Boeddeker, Tobias Cord-Landwehr, Marc Delcroix, Reinhold Haeb-Umbach
We propose a modular pipeline for the single-channel separation, recognition, and diarization of meeting-style recordings and evaluate it on the Libri-CSS dataset.
no code implementations • 1 Jun 2023 • Tobias Cord-Landwehr, Christoph Boeddeker, Cătălin Zorilă, Rama Doddipatla, Reinhold Haeb-Umbach
We introduce a monaural neural speaker embedding extractor that computes an embedding for each speaker present in a speech mixture.
no code implementations • 1 Jun 2023 • Tobias Cord-Landwehr, Christoph Boeddeker, Cătălin Zorilă, Rama Doddipatla, Reinhold Haeb-Umbach
Using a teacher-student training approach, we developed a speaker embedding extraction system that outputs embeddings at frame rate.
1 code implementation • 23 Sep 2022 • Tobias Cord-Landwehr, Thilo von Neumann, Christoph Boeddeker, Reinhold Haeb-Umbach
Training and evaluating these individual tasks requires synthetic data that provides access to intermediate signals and is as close as possible to the evaluation scenario.
no code implementations • 2 May 2022 • Tobias Gburrek, Christoph Boeddeker, Thilo von Neumann, Tobias Cord-Landwehr, Joerg Schmalenstroeer, Reinhold Haeb-Umbach
We propose a system that transcribes the conversation of a typical meeting scenario that is captured by a set of initially unsynchronized microphone arrays at unknown positions.
no code implementations • 15 Nov 2021 • Tobias Cord-Landwehr, Christoph Boeddeker, Thilo von Neumann, Cătălin Zorilă, Rama Doddipatla, Reinhold Haeb-Umbach
Impressive progress in neural network-based single-channel speech source separation has been made in recent years.