1 code implementation • 2 May 2024 • Guangyao Zhai, Evin Pınar Örnek, Dave Zhenyu Chen, Ruotong Liao, Yan Di, Nassir Navab, Federico Tombari, Benjamin Busam
The scheme ensures that the denoising processes are influenced by a holistic understanding of the scene graph, facilitating the generation of globally coherent scenes.
no code implementations • 17 Mar 2024 • Yanyan Li, Chenyu Lyu, Yan Di, Guangyao Zhai, Gim Hee Lee, Federico Tombari
During the Gaussian Splatting optimization process, the scene's geometry can gradually deteriorate if its structure is not deliberately preserved, especially in non-textured regions such as walls, ceilings, and furniture surfaces.
1 code implementation • 18 Nov 2023 • Yan Di, Chenyangguang Zhang, Chaowei Wang, Ruida Zhang, Guangyao Zhai, Yanyan Li, Bowen Fu, Xiangyang Ji, Shan Gao
In this paper, we present ShapeMatcher, a unified self-supervised learning framework for joint shape canonicalization, segmentation, retrieval and deformation.
1 code implementation • 18 Nov 2023 • Yamei Chen, Yan Di, Guangyao Zhai, Fabian Manhardt, Chenyangguang Zhang, Ruida Zhang, Federico Tombari, Nassir Navab, Benjamin Busam
Leveraging the advantage of DINOv2 in providing SE(3)-consistent semantic features, we hierarchically extract two types of SE(3)-invariant geometric features to further encapsulate local-to-global object-specific information.
no code implementations • 21 Sep 2023 • Guangyao Zhai, Xiaoni Cai, Dianye Huang, Yan Di, Fabian Manhardt, Federico Tombari, Nassir Navab, Benjamin Busam
In this paper, we present SG-Bot, a novel rearrangement framework that utilizes a coarse-to-fine scheme with a scene graph as the scene representation.
no code implementations • 15 Aug 2023 • Yan Di, Chenyangguang Zhang, Pengyuan Wang, Guangyao Zhai, Ruida Zhang, Fabian Manhardt, Benjamin Busam, Xiangyang Ji, Federico Tombari
However, such strategies fail to consistently align the denoised point cloud with the given image, leading to unstable conditioning and inferior performance.
1 code implementation • NeurIPS 2023 • Guangyao Zhai, Evin Pınar Örnek, Shun-Cheng Wu, Yan Di, Federico Tombari, Nassir Navab, Benjamin Busam
The generated scenes can be manipulated by editing the input scene graph and sampling the noise in the diffusion model.
1 code implementation • CVPR 2023 • HyunJun Jung, Patrick Ruhkamp, Guangyao Zhai, Nikolas Brasch, Yitong Li, Yannick Verdie, Jifei Song, Yiren Zhou, Anil Armagan, Slobodan Ilic, Ales Leonardis, Nassir Navab, Benjamin Busam
Learning-based methods to solve dense 3D vision problems typically train on 3D sensor data.
no code implementations • CVPR 2023 • Dekai Zhu, Guangyao Zhai, Yan Di, Fabian Manhardt, Hendrik Berkemeyer, Tuan Tran, Nassir Navab, Federico Tombari, Benjamin Busam
Reliable multi-agent trajectory prediction is crucial for the safe planning and control of autonomous systems.
1 code implementation • 20 Dec 2022 • HyunJun Jung, Guangyao Zhai, Shun-Cheng Wu, Patrick Ruhkamp, Hannah Schieber, Giulia Rizzoli, Pengyuan Wang, Hongcheng Zhao, Lorenzo Garattoni, Sven Meier, Daniel Roth, Nassir Navab, Benjamin Busam
Estimating 6D object poses is a major challenge in 3D computer vision.
no code implementations • 2 Nov 2022 • Yongzhi Su, Yan Di, Fabian Manhardt, Guangyao Zhai, Jason Rambach, Benjamin Busam, Didier Stricker, Federico Tombari
Despite monocular 3D object detection having recently made a significant leap forward thanks to the use of pre-trained depth estimators for pseudo-LiDAR recovery, such two-stage methods typically suffer from overfitting and are incapable of explicitly encapsulating the geometric relation between depth and object bounding box.
no code implementations • 26 Sep 2022 • Guangyao Zhai, Dianye Huang, Shun-Cheng Wu, HyunJun Jung, Yan Di, Fabian Manhardt, Federico Tombari, Nassir Navab, Benjamin Busam
6-DoF robotic grasping is a long-lasting but unsolved problem.
no code implementations • 31 Jul 2022 • Guangyao Zhai, Yu Zheng, Ziwei Xu, Xin Kong, Yong Liu, Benjamin Busam, Yi Ren, Nassir Navab, Zhengyou Zhang
In this paper, we introduce DA$^2$, the first large-scale dual-arm dexterity-aware dataset for the generation of optimal bimanual grasping pairs for arbitrary large objects.
no code implementations • 9 May 2022 • HyunJun Jung, Patrick Ruhkamp, Guangyao Zhai, Nikolas Brasch, Yitong Li, Yannick Verdie, Jifei Song, Yiren Zhou, Anil Armagan, Slobodan Ilic, Ales Leonardis, Benjamin Busam
Depth estimation is a core task in 3D computer vision.
no code implementations • 14 Dec 2020 • Guangyao Zhai, Xin Kong, Jinhao Cui, Yong Liu, Zhen Yang
Most end-to-end Multi-Object Tracking (MOT) methods face the problems of low accuracy and poor generalization ability.
1 code implementation • 26 Aug 2020 • Xin Kong, Xuemeng Yang, Guangyao Zhai, Xiangrui Zhao, Xianfang Zeng, Mengmeng Wang, Yong Liu, Wanlong Li, Feng Wen
First, we propose a novel semantic graph representation for point cloud scenes by preserving the semantic and topological information of the raw point cloud.
no code implementations • 4 Sep 2019 • Xin Kong, Guangyao Zhai, Baoquan Zhong, Yong Liu
In this paper, we propose PASS3D to achieve point-wise semantic segmentation for 3D point clouds.
no code implementations • 19 Jun 2019 • Guangyao Zhai, Liang Liu, Linjian Zhang, Yong Liu
The feature-encoding module encodes the short-term motion feature in an image pair, while the memory-propagating module captures the long-term motion feature in the consecutive image pairs.