1 code implementation • 22 May 2024 • Xiang Geng, Ming Zhu, Jiahuan Li, Zhejian Lai, Wei Zou, Shuaijie She, Jiaxin Guo, Xiaofeng Zhao, Yinglu Li, Yuang Li, Chang Su, Yanqing Zhao, Min Zhang, Hao Yang, Xinglin Lyu, Jiajun Chen, ShuJian Huang
For the second issue, we propose a method comprising two synergistic components: low-rank adaptation for training to maintain the original LLM parameters, and recovery KD, which utilizes data generated by the chat LLM itself to recover the original knowledge from the frozen parameters.
no code implementations • 7 Apr 2024 • Yuang Li, Min Zhang, Mengxin Ren, Miaomiao Ma, Daimeng Wei, Hao Yang
Audio deepfake detection (ADD) is essential for preventing the misuse of synthetic voices that may infringe on personal rights and privacy.
no code implementations • 21 Jan 2024 • Yuang Li, Jiawei Yu, Yanqing Zhao, Min Zhang, Mengxin Ren, Xiaofeng Zhao, Xiaosong Qiao, Chang Su, Miaomiao Ma, Hao Yang
In this work, we connect the Whisper encoder with ChatGLM3 and provide in-depth comparisons of these two approaches using Chinese automatic speech recognition (ASR) and named entity recognition (NER) tasks.
Automatic Speech Recognition (ASR)
no code implementations • 18 Sep 2023 • Yuang Li, Yinglu Li, Min Zhang, Chang Su, Mengxin Ren, Xiaosong Qiao, Xiaofeng Zhao, Mengyao Piao, Jiawei Yu, Xinglin Lv, Miaomiao Ma, Yanqing Zhao, Hao Yang
End-to-end automatic speech recognition (ASR) systems often struggle to recognize rare named entities, such as personal names, organizations, and terminology that appear infrequently in the training data.
Automatic Speech Recognition (ASR)
no code implementations • 28 Jun 2023 • Yuang Li, Yu Wu, Jinyu Li, Shujie Liu
Recent end-to-end automatic speech recognition (ASR) systems often utilize a Transformer-based acoustic encoder that generates embeddings at a high frame rate.
Automatic Speech Recognition (ASR)
no code implementations • 28 Jun 2023 • Yuang Li, Yu Wu, Jinyu Li, Shujie Liu
Unlike these methods, we propose two zero-shot ASR domain adaptation methods that require only a domain-specific text prompt and use LLaMA, a 7-billion-parameter large language model (LLM).
no code implementations • 31 May 2023 • Huiqiang Jiang, Li Lyna Zhang, Yuang Li, Yu Wu, Shijie Cao, Ting Cao, Yuqing Yang, Jinyu Li, Mao Yang, Lili Qiu
In this paper, we propose a novel compression strategy that leverages structured pruning and knowledge distillation to reduce the model size and inference cost of the Conformer model while preserving high recognition performance.
Automatic Speech Recognition (ASR)
no code implementations • 3 Apr 2023 • Yuang Li, Xianrui Zheng, Philip C. Woodland
In this paper, seven self-supervised learning (SSL) models were compared on both simulated and real-world corpora.
Automatic Speech Recognition (ASR)
no code implementations • 20 Jan 2021 • Xin Liu, Yuang Li, Josh Fromm, Yuntao Wang, Ziheng Jiang, Alex Mariakakis, Shwetak Patel
In this work, we demonstrate state-of-the-art latency and accuracy for on-device super-resolution using a novel hybrid architecture called SplitSR and a novel lightweight residual block called SplitSRBlock.