no code implementations • 29 Apr 2024 • Bo Chen, Shoukang Hu, Qi Chen, Chenpeng Du, Ran Yi, Yanmin Qian, Xie Chen
We present GStalker, a 3D audio-driven talking face generation model with Gaussian Splatting that achieves both fast training (40 minutes) and real-time rendering (125 FPS) from only a 3$\sim$5 minute training video, whereas previous 2D and 3D NeRF-based modeling frameworks require hours of training and seconds of rendering per frame.
no code implementations • 24 Apr 2024 • Teng Hu, Jiangning Zhang, Ran Yi, Yating Wang, Hongrui Huang, Jieyu Weng, Yabiao Wang, Lizhuang Ma
Furthermore, we propose a few-shot camera motion disentanglement method to extract the common camera motion from multiple videos with similar camera motions, which employs a window-based clustering technique to extract the common features in temporal attention maps of multiple videos.
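The window-based clustering idea can be illustrated with a rough numpy sketch. All names, the window size, and the variance-threshold heuristic below are illustrative assumptions, not the paper's actual algorithm: per-video temporal attention maps are pooled into windows, and windows whose features agree across videos are kept as the "common" camera motion.

```python
import numpy as np

def common_motion_features(attn_maps, window=4, tol=0.5):
    """Hypothetical sketch: keep window features that agree across
    videos' temporal attention maps (not the paper's implementation)."""
    feats = []
    for m in attn_maps:                    # each map: (T, D) array
        T = (m.shape[0] // window) * window
        w = m[:T].reshape(-1, window, m.shape[1]).mean(axis=1)  # per-window mean
        feats.append(w)
    n = min(f.shape[0] for f in feats)
    feats = np.stack([f[:n] for f in feats])   # (num_videos, n, D)
    spread = feats.std(axis=0).mean(axis=1)    # disagreement across videos
    return feats.mean(axis=0)[spread < tol]    # windows consistent across videos
```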
no code implementations • 8 Apr 2024 • Yating Wang, Ran Yi, Ke Fan, Jinkun Hao, Jiangbo Lu, Lizhuang Ma
Our goal is to leverage the strengths of neural volume rendering for multi-view reconstruction of face meshes with consistent topology.
1 code implementation • 4 Apr 2024 • Sichen Chen, Yingyi Zhang, Siming Huang, Ran Yi, Ke Fan, Ruixin Zhang, Peixian Chen, Jun Wang, Shouhong Ding, Lizhuang Ma
To mitigate the problem of under-fitting, we design a transformer module named Multi-Cycled Transformer (MCT), based on multiple cycled forward passes, to more fully exploit the potential of a small set of model parameters.
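The notion of a multi-cycled forward pass, re-applying one small block several times with shared weights, can be sketched minimally. This is a hypothetical numpy sketch; the block, residual form, and cycle count are illustrative assumptions, not the MCT design:

```python
import numpy as np

rng = np.random.default_rng(0)

class CycledBlock:
    """Hypothetical sketch: one small block re-applied for several
    cycles, reusing the same parameters each pass."""
    def __init__(self, dim, cycles=3):
        self.w = rng.standard_normal((dim, dim)) * 0.1
        self.cycles = cycles
    def forward(self, x):
        for _ in range(self.cycles):       # same weights, multiple passes
            x = x + np.tanh(x @ self.w)    # residual keeps cycling stable
        return x
```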
1 code implementation • 17 Jan 2024 • Hexiang Wang, Fengqi Liu, Qianyu Zhou, Ran Yi, Xin Tan, Lizhuang Ma
To address this issue, we propose to model motion from the source image to the driving frame in highly-expressive diffeomorphism spaces.
no code implementations • 23 Dec 2023 • Changsong Lei, Mengfei Xia, Shaofeng Wang, Yaqian Liang, Ran Yi, Yuhui Wen, YongJin Liu
To address this challenge, we propose a general tooth arrangement neural network based on the diffusion probabilistic model.
no code implementations • 15 Dec 2023 • Yige Chen, Ang Chen, Siyuan Chen, Ran Yi
Firstly, our work divides the editing process into a geometry editing stage and a texture editing stage to achieve more detailed and photo-realistic results. Secondly, to perform non-rigid transformations with controllable results while maintaining fidelity to the original 3D model, we propose a multi-view-embedding (MVE) optimization strategy, which ensures that the diffusion model learns the overall features of the original object, and an embedding-fusion (EF) method, which controls the degree of editing by adjusting the fusing rate.
1 code implementation • 10 Dec 2023 • Teng Hu, Jiangning Zhang, Ran Yi, Yuzhen Du, Xu Chen, Liang Liu, Yabiao Wang, Chengjie Wang
Existing anomaly inspection methods are limited in their performance due to insufficient anomaly data.
no code implementations • 30 Nov 2023 • Mengfei Xia, Yujun Shen, Ceyuan Yang, Ran Yi, Wenping Wang, Yong-Jin Liu
In this work, we revisit the mathematical foundations of GANs, and theoretically reveal that the native adversarial loss for GAN training is insufficient to fix the problem of subsets with positive Lebesgue measure of the generated data manifold lying out of the real data manifold.
no code implementations • 9 Nov 2023 • Haokun Zhu, Juang Ian Chong, Teng Hu, Ran Yi, Yu-Kun Lai, Paul L. Rosin
Vector graphics are widely used in graphic design and have received increasing attention.
no code implementations • 14 Oct 2023 • Mengfei Xia, Yujun Shen, Changsong Lei, Yu Zhou, Ran Yi, Deli Zhao, Wenping Wang, Yong-Jin Liu
By viewing the generation of diffusion models as a discretized integrating process, we argue that the quality drop is partly caused by applying an inaccurate integral direction to a timestep interval.
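The "inaccurate integral direction" argument can be seen on a toy ODE: a one-slope Euler step applies a single direction across the whole timestep interval, while a Heun-style step corrects it with the slope at the interval's end. This is a generic numerical-integration illustration with a stand-in drift f(x, t) = -x, not the paper's sampler:

```python
def f(x, t):
    """Stand-in drift for the probability-flow ODE (illustrative only)."""
    return -x

def euler_step(x, t, dt):
    # one direction, evaluated at the start of the interval
    return x + dt * f(x, t)

def heun_step(x, t, dt):
    # average the directions at both ends of the interval
    d1 = f(x, t)
    d2 = f(x + dt * d1, t + dt)
    return x + 0.5 * dt * (d1 + d2)
```

Integrating dx/dt = -x from x(0) = 1 to t = 1, the two-direction Heun step lands much closer to the exact value e^-1 than Euler does, which mirrors the claim that a better integral direction per interval reduces the quality drop at few sampling steps.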
1 code implementation • 4 Oct 2023 • Yuze He, Yushi Bai, Matthieu Lin, Wang Zhao, Yubin Hu, Jenny Sheng, Ran Yi, Juanzi Li, Yong-Jin Liu
Recent methods in text-to-3D leverage powerful pretrained diffusion models to optimize NeRF.
no code implementations • 30 Sep 2023 • Yuze He, Peng Wang, Yubin Hu, Wang Zhao, Ran Yi, Yong-Jin Liu, Wenping Wang
In this paper, we explore the potential of MPI and show that MPI can synthesize high-quality novel views of complex scenes with diverse camera distributions and view directions, and is no longer limited to simple forward-facing scenes.
1 code implementation • ICCV 2023 • Zhimin Sun, Shen Chen, Taiping Yao, Bangjie Yin, Ran Yi, Shouhong Ding, Lizhuang Ma
The challenge in sourcing attribution for forgery faces has gained widespread attention due to the rapid development of generative techniques.
2 code implementations • 7 Sep 2023 • Teng Hu, Ran Yi, Haokun Zhu, Liang Liu, Jinlong Peng, Yabiao Wang, Chengjie Wang, Lizhuang Ma
To solve the problem, we propose Compositional Neural Painter, a novel stroke-based rendering framework which dynamically predicts the next painting region based on the current canvas, instead of dividing the image plane uniformly into painting regions.
1 code implementation • 7 Sep 2023 • Yue Wang, Jinlong Peng, Jiangning Zhang, Ran Yi, Liang Liu, Yabiao Wang, Chengjie Wang
To improve the facial representation quality, we use the feature map of a pre-trained visual backbone as a supervision signal and use a partially pre-trained decoder for masked image modeling.
1 code implementation • ICCV 2023 • Teng Hu, Jiangning Zhang, Liang Liu, Ran Yi, Siqi Kou, Haokun Zhu, Xu Chen, Yabiao Wang, Chengjie Wang, Lizhuang Ma
To address these problems, we propose a novel phasic content fusing few-shot diffusion model with directional distribution consistency loss, which targets different learning objectives at distinct training stages of the diffusion model.
1 code implementation • ICCV 2023 • Zhiwei Zhang, Zhizhong Zhang, Qian Yu, Ran Yi, Yuan Xie, Lizhuang Ma
3D panoptic segmentation is a challenging perception task that requires both semantic segmentation and instance segmentation.
1 code implementation • 12 Jul 2023 • Ke Fan, Changan Wang, Yabiao Wang, Chengjie Wang, Ran Yi, Lizhuang Ma
Glass-like objects are widespread in daily life but remain difficult for most existing methods to segment.
no code implementations • 18 May 2023 • Bin Fang, Bo Li, Shuang Wu, Ran Yi, Shouhong Ding, Lizhuang Ma
The unauthorized use of personal data for commercial purposes and the clandestine acquisition of private data for training machine learning models continue to raise concerns.
no code implementations • 18 May 2023 • Bin Fang, Bo Li, Shuang Wu, Tianyi Zheng, Shouhong Ding, Ran Yi, Lizhuang Ma
One of the crucial factors contributing to this success has been the access to an abundance of high-quality data for constructing machine learning models.
1 code implementation • CVPR 2023 • Qianyu Zhou, Ke-Yue Zhang, Taiping Yao, Xuequan Lu, Ran Yi, Shouhong Ding, Lizhuang Ma
To address these issues, we propose a novel perspective for DG FAS that aligns features on the instance level without the need for domain labels.
1 code implementation • CVPR 2023 • Ran Yi, Haoyuan Tian, Zhihao Gu, Yu-Kun Lai, Paul L. Rosin
To fill the gap in the field of artistic image aesthetics assessment (AIAA), we first introduce a large-scale AIAA dataset: the Boldbrush Artistic Image Dataset (BAID), which consists of 60,337 artistic images covering various art forms, with more than 360,000 votes from online users.
2 code implementations • ICCV 2023 • Junshu Tang, Tengfei Wang, Bo Zhang, Ting Zhang, Ran Yi, Lizhuang Ma, Dong Chen
In this work, we investigate the problem of creating high-fidelity 3D content from only a single image.
1 code implementation • CVPR 2023 • Yue Wang, Jinlong Peng, Jiangning Zhang, Ran Yi, Yabiao Wang, Chengjie Wang
2D-based Industrial Anomaly Detection has been widely discussed; however, multimodal industrial anomaly detection based on 3D point clouds and RGB images still has many untouched fields.
Ranked #3 on RGB+3D Anomaly Detection and Segmentation on MVTEC 3D-AD (using extra training data)
1 code implementation • ICCV 2023 • Zhihao Gu, Liang Liu, Xu Chen, Ran Yi, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Annan Shu, Guannan Jiang, Lizhuang Ma
Specifically, we first propose a normality recall memory (NR Memory) to strengthen the normality of student-generated features by recalling the stored normal information.
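A normality-recall step can be imagined as querying a bank of stored normal features and blending the nearest entries. This is a purely hypothetical sketch; the function names and the k-nearest blend are assumptions, not the NR Memory design:

```python
import numpy as np

def recall_normality(feat, memory, k=3):
    """Hypothetical sketch: replace a query feature with a blend of
    its k nearest stored normal features."""
    d = np.linalg.norm(memory - feat, axis=1)  # distance to each entry
    idx = np.argsort(d)[:k]                    # k closest normal entries
    return memory[idx].mean(axis=0)            # recalled normal estimate
```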
Ranked #11 on Anomaly Detection on MVTec AD
no code implementations • 20 Jul 2022 • Qianyu Zhou, Ke-Yue Zhang, Taiping Yao, Ran Yi, Kekai Sheng, Shouhong Ding, Lizhuang Ma
Most existing UDA FAS methods fit the trained models to the target domain by aligning the distribution of semantic high-level features.
no code implementations • 20 Jul 2022 • Qianyu Zhou, Ke-Yue Zhang, Taiping Yao, Ran Yi, Shouhong Ding, Lizhuang Ma
Existing DG-based FAS approaches typically capture domain-invariant features to generalize to various unseen domains.
no code implementations • 13 Apr 2022 • Zipeng Ye, Zhiyao Sun, Yu-Hui Wen, Yanan Sun, Tian Lv, Ran Yi, Yong-Jin Liu
In this paper, we propose a method to generate talking-face videos with continuously controllable expressions in real-time.
1 code implementation • CVPR 2022 • Junshu Tang, Zhijun Gong, Ran Yi, Yuan Xie, Lizhuang Ma
An asymmetric keypoint locator, including an unsupervised multi-scale keypoint detector and a complete keypoint generator, is proposed for localizing aligned keypoints from complete and partial point clouds.
no code implementations • 16 Mar 2022 • Yue Wang, Ran Yi, Luying Li, Ying Tai, Chengjie Wang, Lizhuang Ma
We propose a new encoder that embeds real faces into Z+ space, and a dual-path training strategy to better cope with the adapted decoder and eliminate artifacts.
1 code implementation • 8 Feb 2022 • Ran Yi, Yong-Jin Liu, Yu-Kun Lai, Paul L. Rosin
In this paper, we propose a novel method to automatically transform face photos into portrait drawings using unpaired training data, with two new features: our method can (1) learn to generate high-quality portrait drawings in multiple styles using a single network, and (2) generate portrait drawings in a "new style" unseen in the training data.
no code implementations • 16 Jan 2022 • Zipeng Ye, Mengfei Xia, Ran Yi, Juyong Zhang, Yu-Kun Lai, Xuwei Huang, Guoxin Zhang, Yong-Jin Liu
In this paper, we present a dynamic convolution kernel (DCK) strategy for convolutional neural networks.
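The core idea of a dynamic convolution kernel, conv weights generated on the fly from a conditioning vector rather than stored as fixed parameters, can be sketched in 1D. This is a hypothetical numpy sketch; the kernel generator and all names are illustrative assumptions, not the paper's DCK:

```python
import numpy as np

def dynamic_conv1d(x, cond, w_gen):
    """Hypothetical DCK sketch: the conv kernel is produced from a
    per-sample conditioning vector instead of being a fixed weight."""
    k = np.tanh(cond @ w_gen)          # (ksize,) kernel from the condition
    pad = len(k) // 2
    xp = np.pad(x, pad)                # same-length output via zero padding
    return np.array([xp[i:i + len(k)] @ k for i in range(len(x))])
```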
no code implementations • 13 Jan 2022 • Yifeng Chen, Wenqing Chu, Fangfang Wang, Ying Tai, Ran Yi, Zhenye Gan, Liang Yao, Chengjie Wang, Xi Li
Recently, there has been growing attention on one-stage panoptic segmentation methods, which aim to segment instances and stuff jointly and efficiently within a fully convolutional pipeline.
1 code implementation • CVPR 2022 • Shaohua Guo, Liang Liu, Zhenye Gan, Yabiao Wang, Wuhao Zhang, Chengjie Wang, Guannan Jiang, Wei zhang, Ran Yi, Lizhuang Ma, Ke Xu
The huge burdens of computation and memory are two obstacles in ultra-high resolution image segmentation.
no code implementations • 28 Dec 2021 • Qiqi Gu, Shen Chen, Taiping Yao, Yang Chen, Shouhong Ding, Ran Yi
The progressive enhancement process facilitates the learning of discriminative features with fine-grained face forgery clues.
1 code implementation • 11 Oct 2021 • Qianyu Zhou, Chuyun Zhuang, Ran Yi, Xuequan Lu, Lizhuang Ma
In this paper, we propose a novel and fully end-to-end trainable approach, called regional contrastive consistency regularization (RCCR) for domain adaptive semantic segmentation.
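A region-level contrastive consistency term can be sketched as an InfoNCE-style loss between matched region features from two views. This is a hypothetical numpy sketch; the names, temperature, and exact loss form are assumptions, not the precise RCCR objective:

```python
import numpy as np

def region_consistency_loss(feat_a, feat_b, tau=0.1):
    """Hypothetical sketch: pull matching region features from two
    views together and push mismatched regions apart."""
    a = feat_a / np.linalg.norm(feat_a, axis=1, keepdims=True)
    b = feat_b / np.linalg.norm(feat_b, axis=1, keepdims=True)
    logits = a @ b.T / tau                      # (R, R) region similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    return -np.log(np.diag(p)).mean()           # diagonal = matched regions
```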
no code implementations • 1 Mar 2021 • Ran Yi, Yang Zhou, Xin Wang, Zhiyuan Liu, Xiaotian Li, Bin Ran
This paper presents an infrastructure-assisted constrained trajectory optimization method for connected automated vehicles (CAVs) on curved roads.
no code implementations • 1 Sep 2020 • Paul L. Rosin, Yu-Kun Lai, David Mould, Ran Yi, Itamar Berger, Lars Doyle, Seungyong Lee, Chuan Li, Yong-Jin Liu, Amir Semmo, Ariel Shamir, Minjung Son, Holger Winnemoller
Despite the recent upsurge of activity in image-based non-photorealistic rendering (NPR), and in particular portrait image stylisation, due to the advent of neural style transfer, the state of performance evaluation in this field is limited, especially compared to the norms in the computer vision and machine learning communities.
1 code implementation • CVPR 2020 • Ran Yi, Yong-Jin Liu, Yu-Kun Lai, Paul L. Rosin
We observe that, due to the significant imbalance of information richness between photos and drawings, existing unpaired transfer methods such as CycleGAN tend to embed invisible reconstruction information indiscriminately in the whole drawing, leading to important facial features being partially missing in drawings.
1 code implementation • 15 Mar 2020 • Zipeng Ye, Mengfei Xia, Yanan Sun, Ran Yi, MinJing Yu, Juyong Zhang, Yu-Kun Lai, Yong-Jin Liu
The most challenging issue for our system is that the source domain of face photos (characterized by normal 2D faces) is significantly different from the target domain of 3D caricatures (characterized by 3D exaggerated face shapes and textures).
1 code implementation • 24 Feb 2020 • Ran Yi, Zipeng Ye, Juyong Zhang, Hujun Bao, Yong-Jin Liu
In this paper, we address this problem by proposing a deep neural network model that takes an audio signal A of a source person and a very short video V of a target person as input, and outputs a synthesized high-quality talking face video with personalized head pose (making use of the visual information in V), expression and lip synchronization (by considering both A and V).
no code implementations • 17 Nov 2019 • Yiheng Han, Wang Zhao, Jia Pan, Zipeng Ye, Ran Yi, Yong-Jin Liu
Motion planning for robots of high degrees-of-freedom (DOFs) is an important problem in robotics with sampling-based methods in configuration space C as one popular solution.
6 code implementations • CVPR 2019 • Ran Yi, Yong-Jin Liu, Yu-Kun Lai, Paul L. Rosin
Moreover, artists tend to use different strategies to draw different facial features and the lines drawn are only loosely related to obvious image features.
no code implementations • CVPR 2018 • Ran Yi, Yong-Jin Liu, Yu-Kun Lai
We propose an efficient Lloyd-like method with a splitting-merging scheme to compute a uniform tessellation on M, which induces the CSS in X. Theoretically, our method has a good competitive ratio of O(1).
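The basic relax-to-centroid loop behind any Lloyd-like method can be shown with plain Lloyd iteration in the Euclidean plane. This simplified sketch omits both the manifold M and the paper's splitting-merging scheme:

```python
import numpy as np

rng = np.random.default_rng(2)

def lloyd(points, k, iters=20):
    """Plain Lloyd iteration: assign points to their nearest site,
    then move each site to its cell's centroid, and repeat."""
    sites = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(points[:, None] - sites[None], axis=2)
        labels = d.argmin(axis=1)          # nearest-site assignment
        for j in range(k):
            cell = points[labels == j]
            if len(cell):
                sites[j] = cell.mean(axis=0)  # relax site to centroid
    return sites, labels
```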