Search Results for author: Alexander Waibel

Found 39 papers, 5 papers with code

Audio Segmentation for Robust Real-Time Speech Recognition Based on Neural Networks

no code implementations • IWSLT 2016 • Micha Wetzel, Matthias Sperber, Alexander Waibel

Speech that contains multimedia content can pose a serious challenge for real-time automatic speech recognition (ASR) for two reasons: (1) The ASR produces meaningless output, hurting the readability of the transcript.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Machine Translation from Standard German to Alemannic Dialects

no code implementations • SIGUL (LREC) 2022 • Louisa Lambrecht, Felix Schneider, Alexander Waibel

There, improvements range from 7.5 to 10.6 BLEU points over the baseline depending on the dialect.

Machine Translation Translation

Effective combination of pretrained models - KIT@IWSLT2022

no code implementations • IWSLT (ACL) 2022 • Ngoc-Quan Pham, Tuan Nam Nguyen, Thai-Binh Nguyen, Danni Liu, Carlos Mullov, Jan Niehues, Alexander Waibel

Pretrained models in acoustic and textual modalities can potentially improve speech translation for both Cascade and End-to-end approaches.

Translation

Findings of the IWSLT 2022 Evaluation Campaign

no code implementations • IWSLT (ACL) 2022 • Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vĕra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nǎdejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, Shinji Watanabe

The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech to speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, (viii) Isometric speech translation.

Speech-to-Speech Translation Translation

Family of Origin and Family of Choice: Massively Parallel Lexiconized Iterative Pretraining for Severely Low Resource Text-based Translation

no code implementations • NAACL (SIGTYP) 2021 • Zhong Zhou, Alexander Waibel

In other words, given a text in 124 source languages, we translate it into a severely low resource language using only ∼1,000 lines of low resource data without any external help.

Incorporating External Annotation to improve Named Entity Translation in NMT

no code implementations • EAMT 2020 • Maciej Modrzejewski, Miriam Exel, Bianka Buschbeck, Thanh-Le Ha, Alexander Waibel

The correct translation of named entities (NEs) still poses a challenge for conventional neural machine translation (NMT) systems.

Machine Translation named-entity-recognition +4

FINDINGS OF THE IWSLT 2021 EVALUATION CAMPAIGN

no code implementations • ACL (IWSLT) 2021 • Antonios Anastasopoulos, Ondřej Bojar, Jacob Bremerman, Roldano Cattoni, Maha Elbayad, Marcello Federico, Xutai Ma, Satoshi Nakamura, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Alexander Waibel, Changhan Wang, Matthew Wiesner

The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2021) featured this year four shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Multilingual speech translation, (iv) Low-resource speech translation.

Translation

Multilingual Speech Translation KIT @ IWSLT2021

no code implementations • ACL (IWSLT) 2021 • Ngoc-Quan Pham, Tuan Nam Nguyen, Thanh-Le Ha, Sebastian Stüker, Alexander Waibel, Dan He

This paper contains the description for the submission of Karlsruhe Institute of Technology (KIT) for the multilingual TEDx translation task in the IWSLT 2021 evaluation campaign.

Translation

Integrating Encyclopedic Knowledge into Neural Language Models

no code implementations • IWSLT 2016 • Yang Zhang, Jan Niehues, Alexander Waibel

Neural models have recently shown big improvements in the performance of phrase-based machine translation.

Language Modelling Machine Translation +2

Supervised Adaptation of Sequence-to-Sequence Speech Recognition Systems using Batch-Weighting

no code implementations • AACL (lifelongnlp) 2020 • Christian Huber, Juan Hussain, Tuan-Nam Nguyen, Kaihang Song, Sebastian Stüker, Alexander Waibel

This problem is even bigger for end-to-end speech recognition systems that only accept transcribed speech as training data, which is harder and more expensive to obtain than text data.

Sequence-To-Sequence Speech Recognition speech-recognition

KIT’s Multilingual Neural Machine Translation systems for IWSLT 2017

no code implementations • IWSLT 2017 • Ngoc-Quan Pham, Matthias Sperber, Elizabeth Salesky, Thanh-Le Ha, Jan Niehues, Alexander Waibel

For the SLT track, in addition to a monolingual neural translation system used to generate correct punctuations and true cases of the data prior to training our multilingual system, we introduced a noise model in order to make our system more robust.

Machine Translation NMT +1

The IWSLT 2019 KIT Speech Translation System

no code implementations • EMNLP (IWSLT) 2019 • Ngoc-Quan Pham, Thai-Son Nguyen, Thanh-Le Ha, Juan Hussain, Felix Schneider, Jan Niehues, Sebastian Stüker, Alexander Waibel

This paper describes KIT’s submission to the IWSLT 2019 Speech Translation task on two sub-tasks corresponding to two different datasets.

speech-recognition Speech Recognition +1

German-Arabic Speech-to-Speech Translation for Psychiatric Diagnosis

no code implementations • COLING (WANLP) 2020 • Juan Hussain, Mohammed Mediani, Moritz Behr, M. Amin Cheragui, Sebastian Stüker, Alexander Waibel

As this is a very specific domain, in addition to the linguistic challenges posed by translating between Arabic and German, we also focus in this paper on the methods we implemented for adapting our speech translation system to the domain of this psychiatric interview.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +7

Audio-Visual Speech Representation Expert for Enhanced Talking Face Video Generation and Evaluation

no code implementations • 7 May 2024 • Dogucan Yaman, Fevziye Irem Eyiokur, Leonard Bärmann, Seymanur Aktı, Hazim Kemal Ekenel, Alexander Waibel

In the task of talking face generation, the objective is to generate a face video with lips synchronized to the corresponding audio while preserving visual details and identity information.

Talking Face Generation Video Generation

From Text Segmentation to Smart Chaptering: A Novel Benchmark for Structuring Video Transcriptions

no code implementations • 27 Feb 2024 • Fabian Retkowski, Alexander Waibel

Text segmentation is a fundamental task in natural language processing, where documents are split into contiguous sections.

Ranked #1 on Headline Generation on YTSeg

Headline Generation Segmentation +1

Continuously Learning New Words in Automatic Speech Recognition

no code implementations • 9 Jan 2024 • Christian Huber, Alexander Waibel

Despite recent advances, Automatic Speech Recognition (ASR) systems are still far from perfect.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Convoifilter: A case study of doing cocktail party speech recognition

no code implementations • 22 Aug 2023 • Thai-Binh Nguyen, Alexander Waibel

The model utilizes a single-channel speech enhancement module that isolates the speaker's voice from background noise (ConVoiFilter) and an ASR module.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

End-to-End Evaluation for Low-Latency Simultaneous Speech Translation

no code implementations • 7 Aug 2023 • Christian Huber, Tu Anh Dinh, Carlos Mullov, Ngoc Quan Pham, Thai Binh Nguyen, Fabian Retkowski, Stefan Constantin, Enes Yavuz Ugan, Danni Liu, Zhaolin Li, Sai Koneru, Jan Niehues, Alexander Waibel

Secondly, we compare different approaches to low-latency speech translation using this framework.

Translation

Audio-driven Talking Face Generation by Overcoming Unintended Information Flow

no code implementations • 18 Jul 2023 • Dogucan Yaman, Fevziye Irem Eyiokur, Leonard Bärmann, Hazim Kemal Ekenel, Alexander Waibel

Specifically, this involves unintended flow of lip, pose and other information from the reference to the generated image, as well as instabilities during model training.

Audio-Visual Synchronization Talking Face Generation

KIT's Multilingual Speech Translation System for IWSLT 2023

1 code implementation • 8 Jun 2023 • Danni Liu, Thai Binh Nguyen, Sai Koneru, Enes Yavuz Ugan, Ngoc-Quan Pham, Tuan-Nam Nguyen, Tu Anh Dinh, Carlos Mullov, Alexander Waibel, Jan Niehues

In this paper, we describe our speech translation system for the multilingual track of IWSLT 2023, which evaluates translation quality on scientific conference talks.

Data Augmentation Retrieval +1

Towards continually learning new languages

no code implementations • 21 Nov 2022 • Ngoc-Quan Pham, Jan Niehues, Alexander Waibel

Multilingual speech recognition with neural networks is often implemented with batch-learning, when all of the languages are available before training.

speech-recognition Speech Recognition +1

A Survey on Computer Vision based Human Analysis in the COVID-19 Era

no code implementations • 7 Nov 2022 • Fevziye Irem Eyiokur, Alperen Kantarcı, Mustafa Ekrem Erakin, Naser Damer, Ferda Ofli, Muhammad Imran, Janez Križaj, Albert Ali Salah, Alexander Waibel, Vitomir Štruc, Hazim Kemal Ekenel

The emergence of COVID-19 has had a global and profound impact, not only on society as a whole, but also on the lives of individuals.

Crowd Counting Face Recognition

Language-agnostic Code-Switching in Sequence-To-Sequence Speech Recognition

no code implementations • 17 Oct 2022 • Enes Yavuz Ugan, Christian Huber, Juan Hussain, Alexander Waibel

Code-Switching (CS) refers to the phenomenon of alternately using words and phrases from different languages.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Code-Switching without Switching: Language Agnostic End-to-End Speech Translation

no code implementations • 4 Oct 2022 • Christian Huber, Enes Yavuz Ugan, Alexander Waibel

We propose a) a Language Agnostic end-to-end Speech Translation model (LAST), and b) a data augmentation strategy to increase code-switching (CS) performance.

Data Augmentation speech-recognition +2

Face-Dubbing++: Lip-Synchronous, Voice Preserving Translation of Videos

no code implementations • 9 Jun 2022 • Alexander Waibel, Moritz Behr, Fevziye Irem Eyiokur, Dogucan Yaman, Tuan-Nam Nguyen, Carlos Mullov, Mehmet Arif Demirtas, Alperen Kantarcı, Stefan Constantin, Hazim Kemal Ekenel

The system is designed to combine multiple component models and produces a video of the original speaker speaking in the target language that is lip-synchronous with the target speech, yet maintains emphases in speech, voice characteristics, and face video of the original speaker.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +6

Exposure Correction Model to Enhance Image Quality

1 code implementation • 22 Apr 2022 • Fevziye Irem Eyiokur, Dogucan Yaman, Hazim Kemal Ekenel, Alexander Waibel

We show that after applying exposure correction with the proposed model, the portrait matting quality increases significantly.

Decoder Image Matting +1

CUNI-KIT System for Simultaneous Speech Translation Task at IWSLT 2022

no code implementations • IWSLT (ACL) 2022 • Peter Polák, Ngoc-Quan Ngoc, Tuan-Nam Nguyen, Danni Liu, Carlos Mullov, Jan Niehues, Ondřej Bojar, Alexander Waibel

In this paper, we describe our submission to the Simultaneous Speech Translation at IWSLT 2022.

Translation

Short-Term Word-Learning in a Dynamically Changing Environment

no code implementations • 29 Mar 2022 • Christian Huber, Rishu Kumar, Ondřej Bojar, Alexander Waibel

In this paper we study, a) methods to acquire important words for this memory dynamically and, b) the trade-off between improvement in recognition accuracy of new words and the potential danger of false alarms for those added words.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Instant One-Shot Word-Learning for Context-Specific Neural Sequence-to-Sequence Speech Recognition

1 code implementation • 5 Jul 2021 • Christian Huber, Juan Hussain, Sebastian Stüker, Alexander Waibel

To alleviate this problem we supplement an end-to-end ASR system with a word/phrase memory and a mechanism to access this memory to recognize the words and phrases correctly.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Alpha Matte Generation from Single Input for Portrait Matting

no code implementations • 6 Jun 2021 • Dogucan Yaman, Hazim Kemal Ekenel, Alexander Waibel

We first generate a coarse segmentation map from the input image and then predict the alpha matte by utilizing the image and segmentation map.

Image Matting Segmentation

Efficient Weight factorization for Multilingual Speech Recognition

no code implementations • 7 May 2021 • Ngoc-Quan Pham, Tuan-Nam Nguyen, Sebastian Stueker, Alexander Waibel

The key idea of the method is to assign fast weight matrices for each language by decomposing each weight matrix into a shared component and a language dependent component.

speech-recognition Speech Recognition

CAGAN: Text-To-Image Generation with Combined Attention GANs

no code implementations • 26 Apr 2021 • Henning Schulze, Dogucan Yaman, Alexander Waibel

Generating images according to natural language descriptions is a challenging task.

Generative Adversarial Network Text-to-Image Generation

Unconstrained Face-Mask & Face-Hand Datasets: Building a Computer Vision System to Help Prevent the Transmission of COVID-19

2 code implementations • 16 Mar 2021 • Fevziye Irem Eyiokur, Hazim Kemal Ekenel, Alexander Waibel

To train and evaluate the developed system, we collected and annotated images that represent face mask usage and face-hand interaction in the real world.

Unsupervised Transfer Learning in Multilingual Neural Machine Translation with Cross-Lingual Word Embeddings

no code implementations • 11 Mar 2021 • Carlos Mullov, Ngoc-Quan Pham, Alexander Waibel

In an attempt to train the mapping from the encoder sentence representation to a new target language we use our model as an autoencoder.

Cross-Lingual Word Embeddings Machine Translation +5

Relative Positional Encoding for Speech Recognition and Direct Translation

no code implementations • 20 May 2020 • Ngoc-Quan Pham, Thanh-Le Ha, Tuan-Nam Nguyen, Thai-Son Nguyen, Elizabeth Salesky, Sebastian Stueker, Jan Niehues, Alexander Waibel

We also show that this model is able to better utilize synthetic data than the Transformer, and adapts better to variable sentence segmentation quality for speech translation.

Position Sentence +4

Gun Source and Muzzle Head Detection

no code implementations • 29 Jan 2020 • Zhong Zhou, Isak Czeresnia Etinger, Florian Metze, Alexander Hauptmann, Alexander Waibel

We have interesting results both in bounding the shooter as well as detecting the gun smoke.

Head Detection object-detection +1

Very Deep Self-Attention Networks for End-to-End Speech Recognition

no code implementations • 30 Apr 2019 • Ngoc-Quan Pham, Thai-Son Nguyen, Jan Niehues, Markus Müller, Sebastian Stüker, Alexander Waibel

Recently, end-to-end sequence-to-sequence models for speech recognition have gained significant interest in the research community.

speech-recognition Speech Recognition

Effective Strategies in Zero-Shot Neural Machine Translation

1 code implementation • IWSLT 2017 • Thanh-Le Ha, Jan Niehues, Alexander Waibel

In this paper, we proposed two strategies which can be applied to a multilingual neural machine translation system in order to better tackle zero-shot scenarios despite not having any parallel corpus.

Machine Translation Translation

Toward Multilingual Neural Machine Translation with Universal Encoder and Decoder

no code implementations • IWSLT 2016 • Thanh-Le Ha, Jan Niehues, Alexander Waibel

In this paper, we present our first attempts in building a multilingual Neural Machine Translation framework under a unified approach.

Decoder Machine Translation +2
