Search Results for author: Yuxin Mao

Found 13 papers, 7 papers with code

TAVGBench: Benchmarking Text to Audible-Video Generation

1 code implementation • 22 Apr 2024 • Yuxin Mao, Xuyang Shen, Jing Zhang, Zhen Qin, Jinxing Zhou, Mochu Xiang, Yiran Zhong, Yuchao Dai

To support research in this field, we have developed a comprehensive Text to Audible-Video Generation Benchmark (TAVGBench), which contains over 1.7 million clips with a total duration of 11.8 thousand hours.

Benchmarking Contrastive Learning +1

Multimodal Variational Auto-encoder based Audio-Visual Segmentation

1 code implementation • ICCV 2023 • Yuxin Mao, Jing Zhang, Mochu Xiang, Yiran Zhong, Yuchao Dai

To achieve this, our ECMVAE factorizes the representations of each modality with a modality-shared representation and a modality-specific representation.

Attribute Representation Learning

RPEFlow: Multimodal Fusion of RGB-PointCloud-Event for Joint Optical Flow and Scene Flow Estimation

1 code implementation • ICCV 2023 • Zhexiong Wan, Yuxin Mao, Jing Zhang, Yuchao Dai

Recently, methods that fuse RGB images and point clouds have been proposed to jointly estimate 2D optical flow and 3D scene flow.

Optical Flow Estimation Scene Flow Estimation

Decomposed Guided Dynamic Filters for Efficient RGB-Guided Depth Completion

no code implementations • 5 Sep 2023 • YuFei Wang, Yuxin Mao, Qi Liu, Yuchao Dai

The decomposed filters not only maintain the favorable properties of guided dynamic filters, being content-dependent and spatially variant, but also reduce model parameters and hardware costs, as the learned adaptors are decoupled from the number of feature channels.

Depth Completion object-detection +2

Improving Audio-Visual Segmentation with Bidirectional Generation

no code implementations • 16 Aug 2023 • Dawei Hao, Yuxin Mao, Bowen He, Xiaodong Han, Yuchao Dai, Yiran Zhong

In this paper, inspired by the human ability to mentally simulate the sound of an object and its visual appearance, we introduce a bidirectional generation framework.

Motion Estimation Object +2

Contrastive Conditional Latent Diffusion for Audio-visual Segmentation

no code implementations • 31 Jul 2023 • Yuxin Mao, Jing Zhang, Mochu Xiang, Yunqiu Lv, Yiran Zhong, Yuchao Dai

We propose a latent diffusion model with contrastive learning for audio-visual segmentation (AVS) to extensively explore the contribution of audio.

Contrastive Learning Denoising +2

Mutual Information Regularization for Weakly-supervised RGB-D Salient Object Detection

1 code implementation • 6 Jun 2023 • Aixuan Li, Yuxin Mao, Jing Zhang, Yuchao Dai

In particular, following the principle of disentangled representation learning, we introduce a mutual information upper bound with a mutual information minimization regularizer to encourage the disentangled representation of each modality for salient object detection.

Object object-detection +3

Joint Appearance and Motion Learning for Efficient Rolling Shutter Correction

1 code implementation • CVPR 2023 • Bin Fan, Yuxin Mao, Yuchao Dai, Zhexiong Wan, Qi Liu

Rolling shutter correction (RSC) is becoming increasingly popular for RS cameras that are widely used in commercial and industrial applications.

Data Augmentation Decoder +1

Learning Dense and Continuous Optical Flow from an Event Camera

1 code implementation • 16 Nov 2022 • Zhexiong Wan, Yuchao Dai, Yuxin Mao

In this paper, we propose a novel deep learning-based dense and continuous optical flow estimation framework from a single image with event streams, which facilitates the accurate perception of high-speed motion.

Optical Flow Estimation

Deep Idempotent Network for Efficient Single Image Blind Deblurring

no code implementations • 13 Oct 2022 • Yuxin Mao, Zhexiong Wan, Yuchao Dai, Xin Yu

Single image blind deblurring is highly ill-posed as neither the latent sharp image nor the blur kernel is known.

Decoder Single-Image Blind Deblurring

MUNet: Motion Uncertainty-aware Semi-supervised Video Object Segmentation

no code implementations • 29 Nov 2021 • Jiadai Sun, Yuxin Mao, Yuchao Dai, Yiran Zhong, Jianyuan Wang

The task of semi-supervised video object segmentation (VOS) has been greatly advanced, and state-of-the-art performance has been achieved by dense matching-based methods.

Object Semantic Segmentation +2

Generative Transformer for Accurate and Reliable Salient Object Detection

2 code implementations • 20 Apr 2021 • Yuxin Mao, Jing Zhang, Zhexiong Wan, Yuchao Dai, Aixuan Li, Yunqiu Lv, Xinyu Tian, Deng-Ping Fan, Nick Barnes

For the former, we apply a transformer to a deterministic model, and explain that its effective structure modeling and global context modeling abilities lead to its superior performance compared with CNN-based frameworks.

Attribute Camouflaged Object Segmentation +8

PRAFlow_RVC: Pyramid Recurrent All-Pairs Field Transforms for Optical Flow Estimation in Robust Vision Challenge 2020

no code implementations • 14 Sep 2020 • Zhexiong Wan, Yuxin Mao, Yuchao Dai

Optical flow estimation is an important computer vision task, which aims at estimating the dense correspondences between two frames.

Optical Flow Estimation
