no code implementations • 5 Apr 2022 • Ye-Qian Du, Jie Zhang, Qiu-Shi Zhu, Li-Rong Dai, Ming-Hui Wu, Xin Fang, Zhou-Wang Yang
Unpaired data has been shown to be beneficial for low-resource automatic speech recognition (ASR), and can be incorporated into hybrid models through multi-task training or language-model-dependent pre-training.
Automatic Speech Recognition (ASR)
no code implementations • 15 Feb 2022 • Zi-Qiang Zhang, Jie Zhang, Jian-Shu Zhang, Ming-Hui Wu, Xin Fang, Li-Rong Dai
The proposed approach explores both the complementarity of audio-visual modalities and long-term context dependency using a transformer-based fusion module and a flexible masking strategy.
no code implementations • 22 Jan 2022 • Qiu-Shi Zhu, Jie Zhang, Zi-Qiang Zhang, Ming-Hui Wu, Xin Fang, Li-Rong Dai
In this work, we therefore first analyze the noise robustness of wav2vec2.0 via experiments.
Automatic Speech Recognition (ASR)
no code implementations • 15 Mar 2021 • Zi-Qiang Zhang, Yan Song, Ming-Hui Wu, Xin Fang, Li-Rong Dai
In this paper, we propose a weakly supervised multilingual representation learning framework, called cross-lingual self-training (XLST).