no code implementations • 31 Oct 2022 • Suyoun Kim, Ke Li, Lucas Kabela, Rongqing Huang, Jiedan Zhu, Ozlem Kalinli, Duc Le
In this work, we present our Joint Audio/Text training method for Transformer Rescorer, which leverages unpaired text-only data that is cheaper to obtain than paired audio-text data.
no code implementations • 29 Mar 2022 • Jay Mahadeokar, Yangyang Shi, Ke Li, Duc Le, Jiedan Zhu, Vikas Chandra, Ozlem Kalinli, Michael L Seltzer
Streaming ASR with strict latency constraints is required in many speech recognition applications.
no code implementations • 23 Oct 2019 • Jun Liu, Jiedan Zhu, Vishal Kathuria, Fuchun Peng
A second layer is a private cache that holds the graph representing the personalized language model; it is shared only by utterances from a particular user.
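The private cache layer described above can be pictured with a minimal sketch. It is not the authors' implementation: the class name `PrivateGraphCache`, the `build_graph` callback, the least-recently-used eviction policy, and the `capacity` parameter are all assumptions made for illustration. The snippet only shows the idea that utterances from the same user reuse one cached personalized graph.

```python
from collections import OrderedDict
from typing import Any, Callable, Hashable


class PrivateGraphCache:
    """Hypothetical per-user cache of personalized language-model graphs."""

    def __init__(self, build_graph: Callable[[Hashable], Any], capacity: int = 2):
        # build_graph is an assumed callback that constructs a user's graph.
        self._build_graph = build_graph
        self._capacity = capacity
        self._graphs: "OrderedDict[Hashable, Any]" = OrderedDict()
        self.builds = 0  # number of times a graph had to be constructed

    def get(self, user_id: Hashable) -> Any:
        """Return the graph for user_id, building it only on a cache miss."""
        if user_id in self._graphs:
            # Cache hit: mark this user as most recently used.
            self._graphs.move_to_end(user_id)
            return self._graphs[user_id]
        graph = self._build_graph(user_id)
        self.builds += 1
        self._graphs[user_id] = graph
        if len(self._graphs) > self._capacity:
            # Assumed policy: evict the least recently used user's graph.
            self._graphs.popitem(last=False)
        return graph
```

In this sketch, two utterances from the same user call `get` with the same `user_id` and receive the same graph object, so the graph is built once per user for as long as it stays in the cache.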