no code implementations • 2 Jun 2023 • Haoyu Wang, Siyuan Wang, Wei-Qiang Zhang, Hongbin Suo, Yulong Wan
Self-supervised pre-trained models such as wav2vec 2.0, HuBERT, and WavLM have been shown to significantly improve many speech tasks.
no code implementations • 13 Oct 2022 • Haoyu Wang, Wei-Qiang Zhang, Hongbin Suo, Yulong Wan
Labeled audio data is insufficient to build satisfactory speech recognition systems for most of the world's languages.
no code implementations • 17 Jun 2022 • Bang Zeng, Hongbin Suo, Yulong Wan, Ming Li
Common target speech separation methods directly estimate the target source, ignoring the interrelationship between different speakers at each frame.