no code implementations • 22 Jun 2022 • Felix Weninger, Marco Gaudesi, Md Akmal Haidar, Nicola Ferri, Jesús Andrés-Ferrer, Puming Zhan
In the dual-mode Conformer Transducer model, layers can function in online or offline mode while sharing parameters, and in-place knowledge distillation from offline to online mode is applied in training to improve online accuracy.
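As a rough illustration of in-place knowledge distillation between the two modes, the sketch below computes a per-frame KL divergence between offline-mode (teacher) and online-mode (student) output distributions produced by the same shared-parameter model. The KL form, the function names, and the `distill_weight` value are assumptions made for illustration; the abstract does not specify them.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one frame of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * (math.log(pi + eps) - math.log(qi + eps))
               for pi, qi in zip(p, q))

def in_place_distillation_loss(offline_logits, online_logits):
    """Mean per-frame KL(offline || online).

    Assumption: the offline pass acts as the teacher. In an autodiff
    framework its outputs would be treated as constants (no gradient).
    """
    assert len(offline_logits) == len(online_logits)
    losses = [kl_divergence(softmax(t), softmax(s))
              for t, s in zip(offline_logits, online_logits)]
    return sum(losses) / len(losses)

def total_loss(online_loss, offline_loss, distill_loss, distill_weight=0.1):
    """Hypothetical combination of both mode losses and the distillation term."""
    return online_loss + offline_loss + distill_weight * distill_loss

# Two frames, three output classes; both passes come from the same parameters.
offline = [[2.0, 0.5, -1.0], [0.1, 1.5, 0.3]]
online = [[1.5, 0.7, -0.5], [0.0, 1.0, 0.6]]
distill = in_place_distillation_loss(offline, online)
```

Because both passes share parameters, no separate teacher model is needed; the distillation term only pulls the online-mode outputs toward the offline-mode ones.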
no code implementations • 23 Sep 2021 • Marco Gaudesi, Felix Weninger, Dushyant Sharma, Puming Zhan
End-to-end (E2E) multi-channel ASR systems achieve state-of-the-art performance on far-field ASR tasks by jointly training a multi-channel front-end with the ASR model.
no code implementations • 17 Sep 2021 • Felix Weninger, Marco Gaudesi, Ralf Leibold, Roberto Gemello, Puming Zhan
We use a single-channel encoder for close-talk (CT) speech and a multi-channel encoder with Spatial Filtering neural beamforming for far-talk (FT) speech; both encoders are jointly trained with the encoder selection.
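The encoder selection idea can be sketched as follows, assuming a selection network has already produced one scalar score per input source. The soft (weighted) and hard (argmax) variants, and all function names, are hypothetical illustrations and not the authors' implementation.

```python
import math

def selection_weights(ct_score, ft_score):
    """Softmax over the two selection scores (assumed scalar per utterance)."""
    m = max(ct_score, ft_score)
    e_ct = math.exp(ct_score - m)
    e_ft = math.exp(ft_score - m)
    z = e_ct + e_ft
    return e_ct / z, e_ft / z

def soft_select(ct_frames, ft_frames, ct_score, ft_score):
    """Weighted sum of the two encoder outputs, frame by frame.

    Assumption: both encoders emit frames of the same length and dimension.
    """
    w_ct, w_ft = selection_weights(ct_score, ft_score)
    return [[w_ct * a + w_ft * b for a, b in zip(fa, fb)]
            for fa, fb in zip(ct_frames, ft_frames)]

def hard_select(ct_frames, ft_frames, ct_score, ft_score):
    """Pick the output of the higher-scoring encoder only."""
    return ct_frames if ct_score >= ft_score else ft_frames

# One frame with two features from each (stand-in) encoder.
ct_out = [[1.0, 2.0]]   # single-channel encoder output for close-talk input
ft_out = [[3.0, 4.0]]   # multi-channel encoder output for far-talk input
mixed = soft_select(ct_out, ft_out, ct_score=0.0, ft_score=0.0)
chosen = hard_select(ct_out, ft_out, ct_score=-1.0, ft_score=2.0)
```

The soft variant keeps the selection differentiable, which is one way joint training of the selection and both encoders could work; the hard variant runs only one encoder's output through the rest of the model.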