no code implementations • 16 May 2024 • Yuchen Hu, Chen Chen, Chengwei Qin, Qiushi Zhu, Eng Siong Chng, Ruizhe Li
Recent advances in large language models (LLMs) have promoted generative error correction (GER) for automatic speech recognition (ASR), which aims to predict the ground-truth transcription from the decoded N-best hypotheses.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +3
no code implementations • 16 May 2024 • Ruizhe Li, Grazziela Figueredo, Dorothee Auer, Christian Wagner, Xin Chen
To address this challenge, this paper proposes a mask-guided encoder-decoder DCNN-based image registration method named MrRegNet.
1 code implementation • 6 May 2024 • Ruizhe Li, Yanjun Gao
By updating these vectors within MLP and recalibrating attention patterns to neutralise the preference for the first choice 'A', we effectively mitigate the anchored bias.
1 code implementation • 3 Mar 2024 • Jiangbo Pei, Ruizhe Li, Qingchao Chen
Specifically, we first conduct source model selection based on the proposed selection principles.
1 code implementation • 10 Feb 2024 • Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Dong Zhang, Zhehuai Chen, Eng Siong Chng
Leveraging the rich linguistic knowledge and strong reasoning abilities of LLMs, our new paradigm can integrate the rich information in N-best candidates to generate a higher-quality translation result.
no code implementations • 8 Feb 2024 • Chen Chen, Ruizhe Li, Yuchen Hu, Sabato Marco Siniscalchi, Pin-Yu Chen, EnSiong Chng, Chao-Han Huck Yang
Recent studies have shown that large language models (LLMs) can be successfully used for generative error correction (GER) on top of automatic speech recognition (ASR) output.
Audio-Visual Speech Recognition Automatic Speech Recognition +3
1 code implementation • 19 Jan 2024 • Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Chao Zhang, Pin-Yu Chen, EnSiong Chng
To this end, we propose to extract a language-space noise embedding from the N-best list to represent the noise conditions of source speech, which can promote the denoising process in GER.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +6
no code implementations • 28 Aug 2023 • Yanjun Gao, Ruizhe Li, John Caskey, Dmitriy Dligach, Timothy Miller, Matthew M. Churpek, Majid Afshar
In this paper, we outline an innovative approach for augmenting the proficiency of LLMs in the realm of automated diagnosis generation, achieved through the incorporation of a medical knowledge graph (KG) and a novel graph model: Dr. Knows, inspired by the clinical diagnostic reasoning process.
1 code implementation • 16 Jul 2023 • Yuchen Hu, Chen Chen, Ruizhe Li, Qiushi Zhu, Eng Siong Chng
Specifically, we design a noise classification (NC) model to produce acoustic embedding as a noise conditioner for guiding the reverse denoising process.
1 code implementation • 18 Jun 2023 • Yuchen Hu, Ruizhe Li, Chen Chen, Chengwei Qin, Qiushi Zhu, Eng Siong Chng
In this work, we investigate the noise-invariant visual modality to strengthen the robustness of AVSR, which can adapt to any testing noise without depending on noisy training data, a.k.a. unsupervised noise adaptation.
1 code implementation • 18 Jun 2023 • Yuchen Hu, Chen Chen, Ruizhe Li, Heqing Zou, Eng Siong Chng
In this paper, we aim to learn the shared representations across modalities to bridge their gap.
1 code implementation • 16 May 2023 • Yuchen Hu, Ruizhe Li, Chen Chen, Heqing Zou, Qiushi Zhu, Eng Siong Chng
However, most existing AVSR approaches simply fuse the audio and visual features by concatenation, without explicit interactions to capture the deep correlations between them, which results in sub-optimal multimodal representations for the downstream speech recognition task.
Audio-Visual Speech Recognition Automatic Speech Recognition +3
1 code implementation • 22 Feb 2023 • Yuchen Hu, Chen Chen, Ruizhe Li, Qiushi Zhu, Eng Siong Chng
In this paper, we propose a simple yet effective approach called gradient remedy (GR) to solve interference between task gradients in noise-robust speech recognition, from perspectives of both angle and magnitude.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +4
1 code implementation • 14 Oct 2022 • Ruizhe Li, Xin Chen
The final trained model was also evaluated on an independent test set by the CMRxMotion organisers, where it achieved a classification accuracy of 72.5% and a Cohen's Kappa of 0.6309 (ranked top 1 in this grand challenge).
no code implementations • 7 Oct 2021 • Ruizhe Li, Xutan Peng, Chenghua Lin
In this paper, we provide the first focused study on the discontinuities (aka.
no code implementations • 29 Sep 2021 • Ruizhe Li, Xutan Peng, Chenghua Lin
In this paper, we provide the first focused study on the discontinuities (aka.
1 code implementation • INLG (ACL) 2021 • Chengkun Zeng, Guanyi Chen, Chenghua Lin, Ruizhe Li, Zhigang Chen
Understanding a speaker's feelings and producing appropriate responses with an emotional connection is a key communicative skill for empathetic dialogue systems.
1 code implementation • 3 Aug 2021 • Ruizhe Li, Matteo Bastiani, Dorothee Auer, Christian Wagner, Xin Chen
The proposed method was evaluated on a public brain MRI data set for age estimation.
1 code implementation • COLING 2020 • Ruizhe Li, Xiao Li, Guanyi Chen, Chenghua Lin
The Variational Autoencoder (VAE) is a popular and powerful model applied to text modelling to generate diverse sentences.
no code implementations • EMNLP 2020 • Xiao Li, Guanyi Chen, Chenghua Lin, Ruizhe Li
We propose DGST, a novel and simple Dual-Generator network architecture for text Style Transfer.
1 code implementation • 28 Apr 2020 • Mina Jafari, Ruizhe Li, Yue Xing, Dorothee Auer, Susan Francis, Jonathan Garibaldi, Xin Chen
In this paper, we present a generic deep convolutional neural network (DCNN) for multi-class image segmentation.
1 code implementation • 16 Apr 2020 • Ruizhe Li, Dorothee Auer, Christian Wagner, Xin Chen
To address this problem, we propose a generic semi-supervised learning framework for image segmentation based on a deep convolutional neural network (DCNN).
1 code implementation • WS 2019 • Ruizhe Li, Xiao Li, Chenghua Lin, Matthew Collinson, Rui Mao
The Variational Autoencoder (VAE) is a powerful method for learning representations of high-dimensional data.
2 code implementations • ICML 2020 • Xiao Li, Chenghua Lin, Ruizhe Li, Chaozheng Wang, Frank Guerin
We demonstrate the utility of our method for attribute manipulation in autoencoders trained across varied domains, using both human evaluation and automated methods.
Ranked #7 on Image Generation on CelebA 256x256 (FID metric)
no code implementations • CONLL 2019 • Ruizhe Li, Chenghua Lin, Matthew Collinson, Xiao Li, Guanyi Chen
Recognising dialogue acts (DA) is important for many natural language processing tasks such as dialogue generation and intention recognition.
Ranked #4 on Dialogue Act Classification on Switchboard corpus
no code implementations • SEMEVAL 2018 • Rui Mao, Guanyi Chen, Ruizhe Li, Chenghua Lin
This paper describes the system that we submitted for SemEval-2018 task 10: capturing discriminative attributes.