no code implementations • IWSLT 2016 • Micha Wetzel, Matthias Sperber, Alexander Waibel
Speech that contains multimedia content can pose a serious challenge for real-time automatic speech recognition (ASR) for two reasons: (1) The ASR produces meaningless output, hurting the readability of the transcript.
Automatic Speech Recognition (ASR) +1
no code implementations • SIGUL (LREC) 2022 • Louisa Lambrecht, Felix Schneider, Alexander Waibel
There, improvements range from 7.5 to 10.6 BLEU points over the baseline depending on the dialect.
no code implementations • IWSLT (ACL) 2022 • Ngoc-Quan Pham, Tuan Nam Nguyen, Thai-Binh Nguyen, Danni Liu, Carlos Mullov, Jan Niehues, Alexander Waibel
Pretrained models in acoustic and textual modalities can potentially improve speech translation for both cascade and end-to-end approaches.
no code implementations • IWSLT (ACL) 2022 • Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vĕra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nǎdejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, Shinji Watanabe
The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech to speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, (viii) Isometric speech translation.
no code implementations • NAACL (SIGTYP) 2021 • Zhong Zhou, Alexander Waibel
In other words, given a text in 124 source languages, we translate it into a severely low resource language using only ∼1,000 lines of low resource data without any external help.
no code implementations • EAMT 2020 • Maciej Modrzejewski, Miriam Exel, Bianka Buschbeck, Thanh-Le Ha, Alexander Waibel
The correct translation of named entities (NEs) still poses a challenge for conventional neural machine translation (NMT) systems.
no code implementations • ACL (IWSLT) 2021 • Antonios Anastasopoulos, Ondřej Bojar, Jacob Bremerman, Roldano Cattoni, Maha Elbayad, Marcello Federico, Xutai Ma, Satoshi Nakamura, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Alexander Waibel, Changhan Wang, Matthew Wiesner
The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2021) featured four shared tasks this year: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Multilingual speech translation, (iv) Low-resource speech translation.
no code implementations • ACL (IWSLT) 2021 • Ngoc-Quan Pham, Tuan Nam Nguyen, Thanh-Le Ha, Sebastian Stüker, Alexander Waibel, Dan He
This paper contains the description for the submission of Karlsruhe Institute of Technology (KIT) for the multilingual TEDx translation task in the IWSLT 2021 evaluation campaign.
no code implementations • IWSLT 2016 • Yang Zhang, Jan Niehues, Alexander Waibel
Neural models have recently shown big improvements in the performance of phrase-based machine translation.
no code implementations • AACL (lifelongnlp) 2020 • Christian Huber, Juan Hussain, Tuan-Nam Nguyen, Kaihang Song, Sebastian Stüker, Alexander Waibel
This problem is even bigger for end-to-end speech recognition systems that only accept transcribed speech as training data, which is harder and more expensive to obtain than text data.
no code implementations • IWSLT 2017 • Ngoc-Quan Pham, Matthias Sperber, Elizabeth Salesky, Thanh-Le Ha, Jan Niehues, Alexander Waibel
For the SLT track, in addition to a monolingual neural translation system used to generate correct punctuations and true cases of the data prior to training our multilingual system, we introduced a noise model in order to make our system more robust.
no code implementations • EMNLP (IWSLT) 2019 • Ngoc-Quan Pham, Thai-Son Nguyen, Thanh-Le Ha, Juan Hussain, Felix Schneider, Jan Niehues, Sebastian Stüker, Alexander Waibel
This paper describes KIT’s submission to the IWSLT 2019 Speech Translation task on two sub-tasks corresponding to two different datasets.
no code implementations • COLING (WANLP) 2020 • Juan Hussain, Mohammed Mediani, Moritz Behr, M. Amin Cheragui, Sebastian Stüker, Alexander Waibel
As this is a very specific domain, in addition to the linguistic challenges posed by translating between Arabic and German, we also focus in this paper on the methods we implemented for adapting our speech translation system to the domain of this psychiatric interview.
Automatic Speech Recognition (ASR) +7
no code implementations • 7 May 2024 • Dogucan Yaman, Fevziye Irem Eyiokur, Leonard Bärmann, Seymanur Aktı, Hazim Kemal Ekenel, Alexander Waibel
In the task of talking face generation, the objective is to generate a face video with lips synchronized to the corresponding audio while preserving visual details and identity information.
no code implementations • 27 Feb 2024 • Fabian Retkowski, Alexander Waibel
Text segmentation is a fundamental task in natural language processing, where documents are split into contiguous sections.
Ranked #1 on Headline Generation on YTSeg
no code implementations • 9 Jan 2024 • Christian Huber, Alexander Waibel
Despite recent advances, Automatic Speech Recognition (ASR) systems are still far from perfect.
Automatic Speech Recognition (ASR) +2
no code implementations • 22 Aug 2023 • Thai-Binh Nguyen, Alexander Waibel
The model utilizes a single-channel speech enhancement module that isolates the speaker's voice from background noise (ConVoiFilter) and an ASR module.
Automatic Speech Recognition (ASR) +2
no code implementations • 7 Aug 2023 • Christian Huber, Tu Anh Dinh, Carlos Mullov, Ngoc Quan Pham, Thai Binh Nguyen, Fabian Retkowski, Stefan Constantin, Enes Yavuz Ugan, Danni Liu, Zhaolin Li, Sai Koneru, Jan Niehues, Alexander Waibel
Secondly, we compare different approaches to low-latency speech translation using this framework.
no code implementations • 18 Jul 2023 • Dogucan Yaman, Fevziye Irem Eyiokur, Leonard Bärmann, Hazim Kemal Ekenel, Alexander Waibel
Specifically, this involves unintended flow of lip, pose and other information from the reference to the generated image, as well as instabilities during model training.
1 code implementation • 8 Jun 2023 • Danni Liu, Thai Binh Nguyen, Sai Koneru, Enes Yavuz Ugan, Ngoc-Quan Pham, Tuan-Nam Nguyen, Tu Anh Dinh, Carlos Mullov, Alexander Waibel, Jan Niehues
In this paper, we describe our speech translation system for the multilingual track of IWSLT 2023, which evaluates translation quality on scientific conference talks.
no code implementations • 21 Nov 2022 • Ngoc-Quan Pham, Jan Niehues, Alexander Waibel
Multilingual speech recognition with neural networks is often implemented with batch learning, where all of the languages are available before training.
no code implementations • 7 Nov 2022 • Fevziye Irem Eyiokur, Alperen Kantarcı, Mustafa Ekrem Erakin, Naser Damer, Ferda Ofli, Muhammad Imran, Janez Križaj, Albert Ali Salah, Alexander Waibel, Vitomir Štruc, Hazim Kemal Ekenel
The emergence of COVID-19 has had a global and profound impact, not only on society as a whole, but also on the lives of individuals.
no code implementations • 17 Oct 2022 • Enes Yavuz Ugan, Christian Huber, Juan Hussain, Alexander Waibel
Code-switching (CS) refers to the phenomenon of alternately using words and phrases from different languages.
Automatic Speech Recognition (ASR) +3
no code implementations • 4 Oct 2022 • Christian Huber, Enes Yavuz Ugan, Alexander Waibel
We propose a) a Language Agnostic end-to-end Speech Translation model (LAST), and b) a data augmentation strategy to increase code-switching (CS) performance.
no code implementations • 9 Jun 2022 • Alexander Waibel, Moritz Behr, Fevziye Irem Eyiokur, Dogucan Yaman, Tuan-Nam Nguyen, Carlos Mullov, Mehmet Arif Demirtas, Alperen Kantarcı, Stefan Constantin, Hazim Kemal Ekenel
The system is designed to combine multiple component models and produces a video of the original speaker speaking in the target language that is lip-synchronous with the target speech, yet maintains the emphases in speech, voice characteristics, and face video of the original speaker.
Automatic Speech Recognition (ASR) +6
1 code implementation • 22 Apr 2022 • Fevziye Irem Eyiokur, Dogucan Yaman, Hazim Kemal Ekenel, Alexander Waibel
We show that after applying exposure correction with the proposed model, the portrait matting quality increases significantly.
no code implementations • IWSLT (ACL) 2022 • Peter Polák, Ngoc-Quan Pham, Tuan-Nam Nguyen, Danni Liu, Carlos Mullov, Jan Niehues, Ondřej Bojar, Alexander Waibel
In this paper, we describe our submission to the Simultaneous Speech Translation at IWSLT 2022.
no code implementations • 29 Mar 2022 • Christian Huber, Rishu Kumar, Ondřej Bojar, Alexander Waibel
In this paper, we study a) methods to acquire important words for this memory dynamically, and b) the trade-off between improvement in recognition accuracy of new words and the potential danger of false alarms for those added words.
Automatic Speech Recognition (ASR) +1
1 code implementation • 5 Jul 2021 • Christian Huber, Juan Hussain, Sebastian Stüker, Alexander Waibel
To alleviate this problem we supplement an end-to-end ASR system with a word/phrase memory and a mechanism to access this memory to recognize the words and phrases correctly.
Automatic Speech Recognition (ASR) +2
no code implementations • 6 Jun 2021 • Dogucan Yaman, Hazim Kemal Ekenel, Alexander Waibel
We first generate a coarse segmentation map from the input image and then predict the alpha matte by utilizing the image and segmentation map.
no code implementations • 7 May 2021 • Ngoc-Quan Pham, Tuan-Nam Nguyen, Sebastian Stueker, Alexander Waibel
The key idea of the method is to assign fast weight matrices for each language by decomposing each weight matrix into a shared component and a language dependent component.
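As a rough illustration of the decomposition described in this abstract, the sketch below builds a per-language weight matrix as a shared matrix plus a language-dependent component. The additive rank-1 form, the class name `FactorizedLinear`, and the toy values are assumptions made for this example; the abstract only states that each weight matrix is split into a shared and a language-dependent part, so this should not be read as the authors' exact formulation.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def outer(u: Vector, v: Vector) -> Matrix:
    """Rank-1 matrix u v^T."""
    return [[ui * vj for vj in v] for ui in u]


def add(a: Matrix, b: Matrix) -> Matrix:
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matvec(w: Matrix, x: Vector) -> Vector:
    """Matrix-vector product w @ x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


class FactorizedLinear:
    """Hypothetical linear layer: W_lang = W_shared + u_lang v_lang^T.

    The rank-1 additive factor is an assumption for illustration only.
    """

    def __init__(self, shared: Matrix, factors: Dict[str, Tuple[Vector, Vector]]):
        self.shared = shared      # parameters shared by all languages
        self.factors = factors    # small per-language parameters

    def weight(self, lang: str) -> Matrix:
        u, v = self.factors[lang]
        return add(self.shared, outer(u, v))

    def forward(self, x: Vector, lang: str) -> Vector:
        return matvec(self.weight(lang), x)


# Toy example: a 2x2 identity as the shared weight, two language-specific factors.
layer = FactorizedLinear(
    shared=[[1.0, 0.0], [0.0, 1.0]],
    factors={
        "de": ([1.0, 0.0], [0.0, 2.0]),
        "vi": ([0.0, 1.0], [3.0, 0.0]),
    },
)
out_de = layer.forward([1.0, 1.0], "de")
out_vi = layer.forward([1.0, 1.0], "vi")
```

With this layout, each language adds only two vectors per layer on top of the shared matrix, which is why the same input yields different outputs for `"de"` and `"vi"` while most parameters stay common.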
no code implementations • 26 Apr 2021 • Henning Schulze, Dogucan Yaman, Alexander Waibel
Generating images according to natural language descriptions is a challenging task.
2 code implementations • 16 Mar 2021 • Fevziye Irem Eyiokur, Hazim Kemal Ekenel, Alexander Waibel
To train and evaluate the developed system, we collected and annotated images that represent face mask usage and face-hand interaction in the real world.
no code implementations • 11 Mar 2021 • Carlos Mullov, Ngoc-Quan Pham, Alexander Waibel
In an attempt to train the mapping from the encoder sentence representation to a new target language we use our model as an autoencoder.
no code implementations • 20 May 2020 • Ngoc-Quan Pham, Thanh-Le Ha, Tuan-Nam Nguyen, Thai-Son Nguyen, Elizabeth Salesky, Sebastian Stueker, Jan Niehues, Alexander Waibel
We also show that this model is able to better utilize synthetic data than the Transformer, and adapts better to variable sentence segmentation quality for speech translation.
no code implementations • 29 Jan 2020 • Zhong Zhou, Isak Czeresnia Etinger, Florian Metze, Alexander Hauptmann, Alexander Waibel
We have interesting results both in bounding the shooter as well as detecting the gun smoke.
no code implementations • 30 Apr 2019 • Ngoc-Quan Pham, Thai-Son Nguyen, Jan Niehues, Markus Müller, Sebastian Stüker, Alexander Waibel
Recently, end-to-end sequence-to-sequence models for speech recognition have gained significant interest in the research community.
1 code implementation • IWSLT 2017 • Thanh-Le Ha, Jan Niehues, Alexander Waibel
In this paper, we proposed two strategies which can be applied to a multilingual neural machine translation system in order to better tackle zero-shot scenarios despite not having any parallel corpus.
no code implementations • IWSLT 2016 • Thanh-Le Ha, Jan Niehues, Alexander Waibel
In this paper, we present our first attempts in building a multilingual Neural Machine Translation framework under a unified approach.