1 code implementation • 6 Jun 2024 • Zilu Guo, Liuyang Bian, Xuan Huang, Hu Wei, Jingyu Li, Huasheng Ni
Following these guidelines, we propose DSNet, a Dual-Branch CNN architecture that incorporates atrous convolutions in the shallow layers of the model and pretrains nearly the entire encoder on ImageNet to achieve better performance.
1 code implementation • 27 May 2024 • Zilu Guo, Qing Wang, Jun Du, Jia Pan, Qing-Feng Liu, Chin-Hui Lee
In this paper, we propose a variance-preserving interpolation framework to improve diffusion models for single-channel speech enhancement (SE) and automatic speech recognition (ASR).
Automatic Speech Recognition (ASR) +2
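The variance-preserving idea can be illustrated with the standard VP diffusion parameterization, in which the signal and noise coefficients satisfy ᾱ + (1 − ᾱ) = 1, so the marginal variance of a zero-mean, unit-variance input stays constant at every step. The sketch below shows this property under that standard parameterization; it is an illustrative assumption, not necessarily the paper's exact interpolation scheme.

```python
import numpy as np

def vp_diffuse(x0, alpha_bar, rng):
    """Variance-preserving forward step: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps.

    For zero-mean, unit-variance x0 and eps, Var[x_t] = a + (1 - a) = 1,
    so the marginal variance is preserved regardless of alpha_bar.
    """
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

rng = np.random.default_rng(0)
x0 = rng.standard_normal(100_000)          # stand-in for a normalized clean waveform
xt = vp_diffuse(x0, alpha_bar=0.5, rng=rng)
print(float(xt.var()))                     # ≈ 1.0, same as the input variance
```

Because the variance does not grow along the trajectory, the network's input statistics stay stable across diffusion steps, which is one motivation for VP schedules in SE front-ends for ASR.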
no code implementations • 24 May 2024 • Chang Li, Ruoyu Wang, Lijuan Liu, Jun Du, Yixuan Sun, Zilu Guo, Zhenrong Zhang, Yuan Jiang
To overcome these challenges, we introduce a novel quality-aware masked diffusion transformer (QA-MDT) approach that enables generative models to discern the quality of input music waveform during training.
Ranked #1 on Text-to-Music Generation on MusicCaps
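Quality-aware conditioning of the kind described above can be sketched by quantizing a per-example quality score into a discrete level and prepending that level's embedding to the model input, so the transformer can condition on waveform quality during training. The bin count, score source, and table below are hypothetical illustrations, not the paper's actual QA-MDT design.

```python
import numpy as np

NUM_QUALITY_BINS = 4  # hypothetical number of quality levels

def prepend_quality_token(tokens, quality_scores, quality_table):
    """Quantize quality scores in [0, 1] into bins and prepend each bin's
    embedding vector to its sequence (a hypothetical conditioning scheme).

    tokens:         (batch, seq, dim) input embeddings.
    quality_scores: (batch,) scores, e.g. from a pseudo-MOS estimator.
    quality_table:  (NUM_QUALITY_BINS, dim) learned embedding table.
    """
    bins = np.rint(np.clip(quality_scores, 0, 1) * (NUM_QUALITY_BINS - 1)).astype(int)
    q = quality_table[bins][:, None, :]          # (batch, 1, dim) quality tokens
    return np.concatenate([q, tokens], axis=1)   # (batch, seq + 1, dim)

table = np.random.default_rng(0).standard_normal((NUM_QUALITY_BINS, 16))
x = np.zeros((2, 10, 16))
out = prepend_quality_token(x, np.array([0.1, 0.9]), table)
print(out.shape)  # (2, 11, 16)
```

At inference time, such a scheme lets the model be steered toward the highest quality bin by fixing the prepended token.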
no code implementations • 17 Sep 2023 • Zilu Guo, Jun Du, Chin-Hui Lee
The starting state is noisy speech and the ending state is clean speech.
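The noisy-to-clean state path described above can be sketched as an interpolation whose endpoints are the two signals; the deterministic linear schedule below is an illustrative assumption, since the paper's actual trajectory may add noise and use a different schedule.

```python
import numpy as np

def state_at(t, T, noisy, clean):
    """Interpolated diffusion state: equals the noisy speech at t = 0 and
    the clean speech at t = T (a minimal sketch of the endpoint constraint)."""
    lam = t / T
    return (1.0 - lam) * noisy + lam * clean

noisy = np.ones(5)    # stand-in for a noisy waveform
clean = np.zeros(5)   # stand-in for the corresponding clean waveform
print(state_at(5, 10, noisy, clean))  # midpoint: halfway between the signals
```

The reverse (enhancement) process then amounts to walking this path from the noisy endpoint toward the clean one.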
1 code implementation • 14 Jun 2023 • Zilu Guo, Jun Du, Chin-Hui Lee, Yu Gao, Wenbin Zhang
The goal of this study is to implement diffusion models for speech enhancement (SE).
no code implementations • 4 Dec 2021 • Longtian Qiu, Renrui Zhang, Ziyu Guo, Ziyao Zeng, Zilu Guo, Yafeng Li, Guangnan Zhang
Contrastive Language-Image Pre-training (CLIP) has drawn increasing attention recently for its transferable visual representation learning.