no code implementations • 4 Jun 2024 • Hao Yen, Pin-Jui Ku, Sabato Marco Siniscalchi, Chin-Hui Lee
We propose a novel language-universal approach to end-to-end automatic spoken keyword recognition (SKR) leveraging upon (i) a self-supervised pre-trained model, and (ii) a set of universal speech attributes (manner and place of articulation).
no code implementations • 16 Sep 2023 • Hao Yen, Sabato Marco Siniscalchi, Chin-Hui Lee
The proposed joint multilingual model is evaluated through phoneme recognition.
no code implementations • 4 Nov 2022 • Hao Yen, François G. Germain, Gordon Wichern, Jonathan Le Roux
Diffusion models have recently shown promising results for difficult enhancement tasks such as the conditional and unconditional restoration of natural images and audio signals.
no code implementations • 30 Oct 2022 • Hao Yen, Woojay Jeon
In embedding-matching acoustic-to-word (A2W) ASR, every word in the vocabulary is represented by a fixed-dimension embedding vector that can be added or removed independently of the rest of the system.
1 code implementation • 8 Oct 2021 • Hao Yen, Pin-Jui Ku, Chao-Han Huck Yang, Hu Hu, Sabato Marco Siniscalchi, Pin-Yu Chen, Yu Tsao
In this study, we propose a novel adversarial reprogramming (AR) approach for low-resource spoken command recognition (SCR), and build an AR-SCR system.
no code implementations • 3 Jul 2021 • Hao Yen, Chao-Han Huck Yang, Hu Hu, Sabato Marco Siniscalchi, Qing Wang, Yuyang Wang, Xianjun Xia, Yuanjun Zhao, Yuzhong Wu, Yannan Wang, Jun Du, Chin-Hui Lee
We propose a novel neural model compression strategy combining data augmentation, knowledge transfer, pruning, and quantization for device-robust acoustic scene classification (ASC).