Search Results for author: Ruizhe Li

Found 26 papers, 17 papers with code

Listen Again and Choose the Right Answer: A New Paradigm for Automatic Speech Recognition with Large Language Models

no code implementations • 16 May 2024 • Yuchen Hu, Chen Chen, Chengwei Qin, Qiushi Zhu, Eng Siong Chng, Ruizhe Li

Recent advances in large language models (LLMs) have promoted generative error correction (GER) for automatic speech recognition (ASR), which aims to predict the ground-truth transcription from the decoded N-best hypotheses.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

MrRegNet: Multi-resolution Mask Guided Convolutional Neural Network for Medical Image Registration with Large Deformations

no code implementations • 16 May 2024 • Ruizhe Li, Grazziela Figueredo, Dorothee Auer, Christian Wagner, Xin Chen

To address this challenge, this paper proposes a mask-guided encoder-decoder DCNN-based image registration method named MrRegNet.

Decoder Image Registration +1

Anchored Answers: Unravelling Positional Bias in GPT-2's Multiple-Choice Questions

1 code implementation • 6 May 2024 • Ruizhe Li, Yanjun Gao

By updating these vectors within MLP and recalibrating attention patterns to neutralise the preference for the first choice 'A', we effectively mitigate the anchored bias.

Decision Making Multiple-choice

On the Model-Agnostic Multi-Source-Free Unsupervised Domain Adaptation

1 code implementation • 3 Mar 2024 • Jiangbo Pei, Ruizhe Li, Qingchao Chen

Specifically, we first conduct source model selection based on the proposed selection principles.

Model Selection Unsupervised Domain Adaptation

GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators

1 code implementation • 10 Feb 2024 • Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Dong Zhang, Zhehuai Chen, Eng Siong Chng

Leveraging the rich linguistic knowledge and strong reasoning abilities of LLMs, our new paradigm can integrate the rich information in N-best candidates to generate a higher-quality translation result.

Machine Translation Translation

It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition

no code implementations • 8 Feb 2024 • Chen Chen, Ruizhe Li, Yuchen Hu, Sabato Marco Siniscalchi, Pin-Yu Chen, EnSiong Chng, Chao-Han Huck Yang

Recent studies have shown that large language models (LLMs) can be successfully used for generative error correction (GER) on top of the automatic speech recognition (ASR) output.

Audio-Visual Speech Recognition Automatic Speech Recognition +3

Large Language Models are Efficient Learners of Noise-Robust Speech Recognition

1 code implementation • 19 Jan 2024 • Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Chao Zhang, Pin-Yu Chen, EnSiong Chng

To this end, we propose to extract a language-space noise embedding from the N-best list to represent the noise conditions of source speech, which can promote the denoising process in GER.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +6

Leveraging A Medical Knowledge Graph into Large Language Models for Diagnosis Prediction

no code implementations • 28 Aug 2023 • Yanjun Gao, Ruizhe Li, John Caskey, Dmitriy Dligach, Timothy Miller, Matthew M. Churpek, Majid Afshar

In this paper, we outline an innovative approach for augmenting the proficiency of LLMs in the realm of automated diagnosis generation, achieved through the incorporation of a medical knowledge graph (KG) and a novel graph model: Dr. Knows, inspired by the clinical diagnostic reasoning process.

Noise-aware Speech Enhancement using Diffusion Probabilistic Model

1 code implementation • 16 Jul 2023 • Yuchen Hu, Chen Chen, Ruizhe Li, Qiushi Zhu, Eng Siong Chng

Specifically, we design a noise classification (NC) model to produce acoustic embedding as a noise conditioner for guiding the reverse denoising process.

Denoising Multi-Task Learning +2

Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition

1 code implementation • 18 Jun 2023 • Yuchen Hu, Ruizhe Li, Chen Chen, Chengwei Qin, Qiushi Zhu, Eng Siong Chng

In this work, we investigate the noise-invariant visual modality to strengthen the robustness of AVSR, which can adapt to any testing noise without depending on noisy training data, a.k.a. unsupervised noise adaptation.

Audio-Visual Speech Recognition speech-recognition +1

MIR-GAN: Refining Frame-Level Modality-Invariant Representations with Adversarial Network for Audio-Visual Speech Recognition

1 code implementation • 18 Jun 2023 • Yuchen Hu, Chen Chen, Ruizhe Li, Heqing Zou, Eng Siong Chng

In this paper, we aim to learn the shared representations across modalities to bridge their gap.

Audio-Visual Speech Recognition Representation Learning +3

Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition

1 code implementation • 16 May 2023 • Yuchen Hu, Ruizhe Li, Chen Chen, Heqing Zou, Qiushi Zhu, Eng Siong Chng

However, most existing AVSR approaches simply fuse the audio and visual features by concatenation, without explicit interactions to capture the deep correlations between them, which results in sub-optimal multimodal representations for downstream speech recognition task.

Audio-Visual Speech Recognition Automatic Speech Recognition +3

Gradient Remedy for Multi-Task Learning in End-to-End Noise-Robust Speech Recognition

1 code implementation • 22 Feb 2023 • Yuchen Hu, Chen Chen, Ruizhe Li, Qiushi Zhu, Eng Siong Chng

In this paper, we propose a simple yet effective approach called gradient remedy (GR) to solve interference between task gradients in noise-robust speech recognition, from perspectives of both angle and magnitude.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Motion-related Artefact Classification Using Patch-based Ensemble and Transfer Learning in Cardiac MRI

1 code implementation • 14 Oct 2022 • Ruizhe Li, Xin Chen

The final trained model was also evaluated on an independent test set by the CMRxMotion organisers, which achieved the classification accuracy of 72.5% and Cohen's Kappa of 0.6309 (ranked top 1 in this grand challenge).

Transfer Learning

On the Latent Holes of VAEs for Text Generation

no code implementations • 7 Oct 2021 • Ruizhe Li, Xutan Peng, Chenghua Lin

In this paper, we provide the first focused study on the discontinuities (aka.

Decoder Text Generation

On the Latent Holes 🧀 of VAEs for Text Generation

no code implementations • 29 Sep 2021 • Ruizhe Li, Xutan Peng, Chenghua Lin

In this paper, we provide the first focused study on the discontinuities (aka.

Decoder Text Generation

Affective Decoding for Empathetic Response Generation

1 code implementation • INLG (ACL) 2021 • Chengkun Zeng, Guanyi Chen, Chenghua Lin, Ruizhe Li, Zhigang Chen

Understanding a speaker's feelings and producing appropriate responses with emotional connection is a key communicative skill for empathetic dialogue systems.

Empathetic Response Generation Response Generation

Image Augmentation Using a Task Guided Generative Adversarial Network for Age Estimation on Brain MRI

1 code implementation • 3 Aug 2021 • Ruizhe Li, Matteo Bastiani, Dorothee Auer, Christian Wagner, Xin Chen

The proposed method was evaluated on a public brain MRI data set for age estimation.

Age Estimation Generative Adversarial Network +3

Improving Variational Autoencoder for Text Modelling with Timestep-Wise Regularisation

1 code implementation • COLING 2020 • Ruizhe Li, Xiao Li, Guanyi Chen, Chenghua Lin

The Variational Autoencoder (VAE) is a popular and powerful model applied to text modelling to generate diverse sentences.

Language Modelling Response Generation +1

DGST: a Dual-Generator Network for Text Style Transfer

no code implementations • EMNLP 2020 • Xiao Li, Guanyi Chen, Chenghua Lin, Ruizhe Li

We propose DGST, a novel and simple Dual-Generator network architecture for text Style Transfer.

Style Transfer Text Style Transfer

FU-net: Multi-class Image Segmentation Using Feedback Weighted U-net

1 code implementation • 28 Apr 2020 • Mina Jafari, Ruizhe Li, Yue Xing, Dorothee Auer, Susan Francis, Jonathan Garibaldi, Xin Chen

In this paper, we present a generic deep convolutional neural network (DCNN) for multi-class image segmentation.

Image Segmentation Segmentation +1

A generic ensemble based deep convolutional neural network for semi-supervised medical image segmentation

1 code implementation • 16 Apr 2020 • Ruizhe Li, Dorothee Auer, Christian Wagner, Xin Chen

To address this problem, we propose a generic semi-supervised learning framework for image segmentation based on a deep convolutional neural network (DCNN).

Decoder Image Segmentation +6

A Stable Variational Autoencoder for Text Modelling

1 code implementation • WS 2019 • Ruizhe Li, Xiao Li, Chenghua Lin, Matthew Collinson, Rui Mao

Variational Autoencoder (VAE) is a powerful method for learning representations of high-dimensional data.

Latent Space Factorisation and Manipulation via Matrix Subspace Projection

2 code implementations • ICML 2020 • Xiao Li, Chenghua Lin, Ruizhe Li, Chaozheng Wang, Frank Guerin

We demonstrate the utility of our method for attribute manipulation in autoencoders trained across varied domains, using both human evaluation and automated methods.

Ranked #7 on Image Generation on CelebA 256x256 (FID metric)