no code implementations • 6 May 2022 • Sankaran Panchapagesan, Arun Narayanan, Turaj Zakizadeh Shabestary, Shuai Shao, Nathan Howard, Alex Park, James Walker, Alexander Gruenstein
Acoustic Echo Cancellation (AEC) is essential for accurate recognition of queries spoken to a smart speaker that is playing out audio.
no code implementations • 26 Apr 2022 • Arun Narayanan, James Walker, Sankaran Panchapagesan, Nathan Howard, Yuma Koizumi
Using neural-network-based acoustic frontends to improve the robustness of streaming automatic speech recognition (ASR) systems is challenging because of causality constraints and the distortion that frontend processing introduces into speech.
no code implementations • 25 Apr 2022 • Joseph Caroselli, Arun Narayanan, Nathan Howard, Tom O'Malley
This work introduces the Cleanformer, a streaming, multichannel, neural-network-based enhancement frontend for automatic speech recognition (ASR).
no code implementations • 18 Nov 2021 • Tom O'Malley, Arun Narayanan, Quan Wang, Alex Park, James Walker, Nathan Howard
Compared to the noisy baseline, the joint model reduces the word error rate in low signal-to-noise ratio conditions by at least 71% on our echo cancellation dataset, 10% on our noisy dataset, and 26% on our multi-speaker dataset.
no code implementations • 1 Jun 2021 • Nathan Howard, Alex Park, Turaj Zakizadeh Shabestary, Alexander Gruenstein, Rohit Prabhavalkar
We consider the problem of recognizing speech utterances spoken to a device which is generating a known sound waveform; for example, recognizing queries issued to a digital assistant which is generating responses to previous user inputs.