no code implementations • 12 Mar 2024 • Ri-Zhao Qiu, Yafei Hu, Ge Yang, Yuchen Song, Yang Fu, Jianglong Ye, Jiteng Mu, Ruihan Yang, Nikolay Atanasov, Sebastian Scherer, Xiaolong Wang
An open problem in mobile manipulation is how to represent objects and scenes in a unified manner, so that robots can use the same representation both for navigating the environment and for manipulating objects.
no code implementations • 4 Oct 2023 • Jianglong Ye, Peng Wang, Kejie Li, Yichun Shi, Heng Wang
Specifically, we decompose the NVS task into two stages: (i) transforming observed regions to a novel view, and (ii) hallucinating unseen regions.
1 code implementation • 31 Aug 2023 • Yanjie Ze, Ge Yan, Yueh-Hua Wu, Annabella Macaluso, Yuying Ge, Jianglong Ye, Nicklas Hansen, Li Erran Li, Xiaolong Wang
To incorporate semantics in 3D, the reconstruction module utilizes a vision-language foundation model ($\textit{e.g.}$, Stable Diffusion) to distill rich semantic information into the deep 3D voxel.
2 code implementations • 31 Aug 2023 • Yichun Shi, Peng Wang, Jianglong Ye, Mai Long, Kejie Li, Xiao Yang
We introduce MVDream, a diffusion model that is able to generate consistent multi-view images from a given text prompt.
no code implementations • ICCV 2023 • Jianglong Ye, Naiyan Wang, Xiaolong Wang
Recent works on generalizable NeRFs have shown promising results on novel view synthesis from single or few images.
1 code implementation • 11 Jul 2022 • Jianglong Ye, Jiashun Wang, Binghao Huang, Yuzhe Qin, Xiaolong Wang
We will first convert the large-scale human-object interaction trajectories to robot demonstrations via motion retargeting, and then use these demonstrations to train CGF.
1 code implementation • CVPR 2022 • Jianglong Ye, Yuntao Chen, Naiyan Wang, Xiaolong Wang
This limitation leads to tedious data processing (converting non-watertight raw data to watertight) as well as the inability to represent general object shapes in the real world.
1 code implementation • 24 Nov 2021 • Jianglong Ye, Yuntao Chen, Naiyan Wang, Xiaolong Wang
Tracking and reconstructing 3D objects from cluttered scenes are the key components for computer vision, robotics and autonomous driving systems.