Search Results for author: Zilu Guo

Found 6 papers, 3 papers with code

DSNet: A Novel Way to Use Atrous Convolutions in Semantic Segmentation

1 code implementation6 Jun 2024 Zilu Guo, Liuyang Bian, Xuan Huang, Hu Wei, Jingyu Li, Huasheng Ni

Following these guidelines, we propose DSNet, a Dual-Branch CNN architecture, which incorporates atrous convolutions in the shallow layers of the model architecture, as well as pretraining the nearly entire encoder on ImageNet to achieve better performance.

A Variance-Preserving Interpolation Approach for Diffusion Models with Applications to Single Channel Speech Enhancement and Recognition

1 code implementation27 May 2024 Zilu Guo, Qing Wang, Jun Du, Jia Pan, Qing-Feng Liu, Chin-Hui

In this paper, we propose a variance-preserving interpolation framework to improve diffusion models for single-channel speech enhancement (SE) and automatic speech recognition (ASR).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Quality-aware Masked Diffusion Transformer for Enhanced Music Generation

no code implementations24 May 2024 Chang Li, Ruoyu Wang, Lijuan Liu, Jun Du, Yixuan Sun, Zilu Guo, Zhenrong Zhang, Yuan Jiang

To overcome these challenges, we introduce a novel quality-aware masked diffusion transformer (QA-MDT) approach that enables generative models to discern the quality of input music waveform during training.

Music Generation Text-to-Music Generation

VT-CLIP: Enhancing Vision-Language Models with Visual-guided Texts

no code implementations4 Dec 2021 Longtian Qiu, Renrui Zhang, Ziyu Guo, Ziyao Zeng, Zilu Guo, Yafeng Li, Guangnan Zhang

Contrastive Language-Image Pre-training (CLIP) has drawn increasing attention recently for its transferable visual representation learning.

Language Modelling Representation Learning +1

Cannot find the paper you are looking for? You can Submit a new open access paper.