1 code implementation • 6 Feb 2024 • Liang-Hsuan Tseng, En-Pei Hu, Cheng-Han Chiang, Yuan Tseng, Hung-Yi Lee, Lin-shan Lee, Shao-Hua Sun
REBORN alternates between (1) training a segmentation model that predicts the boundaries of segmental structures in speech signals and (2) training a phoneme prediction model that takes the speech features segmented by the segmentation model as input and predicts a phoneme transcription.
Automatic Speech Recognition (ASR)
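The alternating scheme described for REBORN above can be pictured as a simple two-stage loop. The following is only a minimal sketch of that control flow, not the authors' implementation: the function name `alternate_training`, the two callables, and the `num_rounds` parameter are hypothetical, and the order of stages within a round simply follows the numbering in the description.

```python
from typing import Callable, List


def alternate_training(
    train_segmenter: Callable[[int], None],
    train_phoneme_predictor: Callable[[int], None],
    num_rounds: int,
) -> List[str]:
    """Run the two training stages in turn and return the order in which they ran.

    Both callables are hypothetical placeholders: each receives the round index
    and is expected to update its own model as a side effect.
    """
    log: List[str] = []
    for round_idx in range(num_rounds):
        # Stage 1 (assumed first): train the model that predicts segment boundaries.
        train_segmenter(round_idx)
        log.append(f"round {round_idx}: segmenter")
        # Stage 2: train the phoneme predictor on features segmented by stage 1.
        train_phoneme_predictor(round_idx)
        log.append(f"round {round_idx}: phoneme predictor")
    return log


if __name__ == "__main__":
    # Stub stages that do nothing; real stages would run actual training.
    for entry in alternate_training(lambda r: None, lambda r: None, num_rounds=2):
        print(entry)
```

Passing the stages in as callables keeps the loop independent of any particular model or training framework.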
no code implementations • 12 May 2023 • Yu-Kuan Fu, Liang-Hsuan Tseng, Jiatong Shi, Chen-An Li, Tsu-Yuan Hsu, Shinji Watanabe, Hung-Yi Lee
We use fully unpaired data to train our unsupervised systems and evaluate our results on CoVoST 2 and CVSS.
no code implementations • 15 Nov 2022 • Derek Xu, Shuyan Dong, Changhan Wang, Suyoun Kim, Zhaojiang Lin, Akshat Shrivastava, Shang-Wen Li, Liang-Hsuan Tseng, Alexei Baevski, Guan-Ting Lin, Hung-Yi Lee, Yizhou Sun, Wei Wang
Recent studies find that existing self-supervised speech encoders contain primarily acoustic rather than semantic information.
Automatic Speech Recognition (ASR)
1 code implementation • 14 Oct 2022 • Kuan-Po Huang, Yu-Kuan Fu, Tsu-Yuan Hsu, Fabian Ritter Gutierrez, Fan-Lin Wang, Liang-Hsuan Tseng, Yu Zhang, Hung-Yi Lee
Speech models pre-trained with self-supervised learning (SSL) perform well across various speech processing tasks.
no code implementations • 7 Oct 2021 • Liang-Hsuan Tseng, Yu-Kuan Fu, Heng-Jui Chang, Hung-Yi Lee
Code-switching (CS), the use of more than one language within a sentence, is common in daily conversations.