Search Results for author: Dirk Padfield

Found 7 papers, 1 paper with code

Inverted Projection for Robust Speech Translation

no code implementations • ACL (IWSLT) 2021 • Dirk Padfield, Colin Cherry

Traditional translation systems trained on written documents perform well for text-based translation but not as well for speech-based applications.

Translation

AudioPaLM: A Large Language Model That Can Speak and Listen

no code implementations • 22 Jun 2023 • Paul K. Rubenstein, Chulayuth Asawaroengchai, Duc Dung Nguyen, Ankur Bapna, Zalán Borsos, Félix de Chaumont Quitry, Peter Chen, Dalia El Badawy, Wei Han, Eugene Kharitonov, Hannah Muckenhirn, Dirk Padfield, James Qin, Danny Rozenberg, Tara Sainath, Johan Schalkwyk, Matt Sharifi, Michelle Tadmor Ramanovich, Marco Tagliasacchi, Alexandru Tudor, Mihajlo Velimirović, Damien Vincent, Jiahui Yu, Yongqiang Wang, Vicky Zayats, Neil Zeghidour, Yu Zhang, Zhishuai Zhang, Lukas Zilka, Christian Frank

AudioPaLM inherits the capability to preserve paralinguistic information such as speaker identity and intonation from AudioLM and the linguistic knowledge present only in text large language models such as PaLM-2.

Language Modelling • Large Language Model • +5

MultiTurnCleanup: A Benchmark for Multi-Turn Spoken Conversational Transcript Cleanup

1 code implementation • 19 May 2023 • Hua Shen, Vicky Zayats, Johann C. Rocholl, Daniel D. Walker, Dirk Padfield

Current disfluency detection models focus on individual utterances each from a single speaker.

Chronological Self-Training for Real-Time Speaker Diarization

no code implementations • 5 Aug 2022 • Dirk Padfield, Daniel J. Liebling

Diarization partitions an audio stream into segments based on the voices of the speakers.

Speaker Diarization

Teaching BERT to Wait: Balancing Accuracy and Latency for Streaming Disfluency Detection

no code implementations • NAACL 2022 • Angelica Chen, Vicky Zayats, Daniel D. Walker, Dirk Padfield

In modern interactive speech-based systems, speech is consumed and transcribed incrementally prior to having disfluencies removed.

Machine Translation

Residual Adapters for Parameter-Efficient ASR Adaptation to Atypical and Accented Speech

no code implementations • EMNLP 2021 • Katrin Tomanek, Vicky Zayats, Dirk Padfield, Kara Vaillancourt, Fadi Biadsy

We demonstrate this on two speech adaptation tasks (atypical and accented speech) and for two state-of-the-art ASR architectures.

Automatic Speech Recognition (ASR) • +1

Sentence Boundary Augmentation For Neural Machine Translation Robustness

no code implementations • 21 Oct 2020 • Daniel Li, Te I, Naveen Arivazhagan, Colin Cherry, Dirk Padfield

Specifically, in the context of long-form speech translation systems, where the input transcripts come from Automatic Speech Recognition (ASR), the NMT models have to handle errors including phoneme substitutions, grammatical structure, and sentence boundaries, all of which pose challenges to NMT robustness.

Automatic Speech Recognition (ASR) • +7

