1 code implementation • 23 May 2024 • Jiahao Sun, Chunmei Qing, Xiang Xu, Lingdong Kong, Youquan Liu, Li Li, Chenming Zhu, Jingwei Zhang, Zeqi Xiao, Runnan Chen, Tai Wang, Wenwei Zhang, Kai Chen
In the rapidly evolving field of autonomous driving, precise segmentation of LiDAR data is crucial for understanding complex 3D environments.
1 code implementation • 16 May 2024 • Yilun Chen, Shuai Yang, Haifeng Huang, Tai Wang, Ruiyuan Lyu, Runsen Xu, Dahua Lin, Jiangmiao Pang
Prior studies on 3D scene understanding have primarily developed specialized models for specific tasks or required task-specific fine-tuning.
no code implementations • 25 Feb 2024 • Xiao Chen, Quanyi Li, Tai Wang, Tianfan Xue, Jiangmiao Pang
Previous works attempt to automate this process using the Next-Best-View (NBV) policy for active 3D reconstruction.
1 code implementation • 26 Dec 2023 • Tai Wang, Xiaohan Mao, Chenming Zhu, Runsen Xu, Ruiyuan Lyu, Peisen Li, Xiao Chen, Wenwei Zhang, Kai Chen, Tianfan Xue, Xihui Liu, Cewu Lu, Dahua Lin, Jiangmiao Pang
In the realm of computer vision and robotics, embodied agents are expected to explore their environment and carry out human instructions.
1 code implementation • 6 Dec 2023 • Yuhang Lu, Xinge Zhu, Tai Wang, Yuexin Ma
Occupancy prediction has increasingly garnered attention in recent years for its fine-grained understanding of 3D scenes.
no code implementations • 13 Oct 2023 • Xidong Peng, Runnan Chen, Feng Qiao, Lingdong Kong, Youquan Liu, Tai Wang, Xinge Zhu, Yuexin Ma
Unsupervised domain adaptation (UDA) in 3D segmentation tasks presents a formidable challenge, primarily stemming from the sparse and unordered nature of point cloud data.
no code implementations • 18 Sep 2023 • Chenming Zhu, Wenwei Zhang, Tai Wang, Xihui Liu, Kai Chen
Instead of leveraging 2D images, we propose Object2Scene, the first approach that leverages large-scale large-vocabulary 3D object datasets to augment existing 3D scene datasets for open-vocabulary 3D object detection.
1 code implementation • 14 Sep 2023 • Zeqi Xiao, Tai Wang, Jingbo Wang, Jinkun Cao, Wenwei Zhang, Bo Dai, Dahua Lin, Jiangmiao Pang
Based on the definition, UniHSI constitutes a Large Language Model (LLM) Planner to translate language prompts into task plans in the form of CoC, and a Unified Controller that turns CoC into uniform task execution.
3 code implementations • 31 Aug 2023 • Runsen Xu, Xiaolong Wang, Tai Wang, Yilun Chen, Jiangmiao Pang, Dahua Lin
The unprecedented advancements in Large Language Models (LLMs) have shown a profound impact on natural language processing but are yet to fully embrace the realm of 3D understanding.
Ranked #3 on 3D Object Captioning on Objaverse
2 code implementations • ICCV 2023 • Chonghao Sima, Wenwen Tong, Tai Wang, Li Chen, Silei Wu, Hanming Deng, Yi Gu, Lewei Lu, Ping Luo, Dahua Lin, Hongyang Li
Human driver can easily describe the complex traffic scene by visual system.
1 code implementation • 29 Mar 2023 • Qing Lian, Tai Wang, Dahua Lin, Jiangmiao Pang
Recent multi-camera 3D object detectors usually leverage temporal information to construct multi-view stereo that alleviates the ill-posed depth estimation.
1 code implementation • CVPR 2023 • Runsen Xu, Tai Wang, Wenwei Zhang, Runjian Chen, Jinkun Cao, Jiangmiao Pang, Dahua Lin
This paper introduces the Masked Voxel Jigsaw and Reconstruction (MV-JAR) method for LiDAR-based self-supervised pre-training and a carefully designed data-efficient 3D object detection benchmark on the Waymo dataset.
1 code implementation • 23 Mar 2023 • Zeqi Xiao, Wenwei Zhang, Tai Wang, Chen Change Loy, Dahua Lin, Jiangmiao Pang
DEtection TRansformer (DETR) started a trend that uses a group of learnable queries for unified visual perception.
Ranked #1 on Panoptic Segmentation on SemanticKITTI
1 code implementation • ICCV 2023 • Jihao Liu, Tai Wang, Boxiao Liu, Qihang Zhang, Yu Liu, Hongsheng Li
In this paper, we propose Geometry Enhanced Masked Image Modeling (GeoMIM) to transfer the knowledge of the LiDAR model in a pretrain-finetune paradigm for improving the multi-view camera-based 3D detection.
1 code implementation • 4 Aug 2022 • Yuexin Ma, Tai Wang, Xuyang Bai, Huitong Yang, Yuenan Hou, Yaming Wang, Yu Qiao, Ruigang Yang, Dinesh Manocha, Xinge Zhu
In recent years, vision-centric Bird's Eye View (BEV) perception has garnered significant interest from both industry and academia due to its inherent advantages, such as providing an intuitive representation of the world and being conducive to data fusion.
1 code implementation • 26 Jul 2022 • Tai Wang, Jiangmiao Pang, Dahua Lin
Perceiving 3D objects from monocular inputs is crucial for robotic systems, given its economy compared to multi-sensor settings.
1 code implementation • 26 Jul 2022 • Tai Wang, Qing Lian, Chenming Zhu, Xinge Zhu, Wenwei Zhang
In this technical report, we present our solution, dubbed MV-FCOS3D++, for the Camera-Only 3D Detection track in Waymo Open Dataset Challenge 2022.
1 code implementation • ICCV 2023 • Renrui Zhang, Han Qiu, Tai Wang, Ziyu Guo, Xuanzhuo Xu, Ziteng Cui, Yu Qiao, Peng Gao, Hongsheng Li
In this paper, we introduce the first DETR framework for Monocular DEtection with a depth-guided TRansformer, named MonoDETR.
3D Object Detection From Monocular Images Autonomous Driving +4
1 code implementation • NeurIPS 2021 • Tong Wu, Liang Pan, Junzhe Zhang, Tai Wang, Ziwei Liu, Dahua Lin
We adopt DCD to evaluate the point cloud completion task, where experimental results show that DCD pays attention to both the overall structure and local geometric details and provides a more reliable evaluation even when CD and EMD contradict each other.
1 code implementation • 24 Nov 2021 • Tong Wu, Liang Pan, Junzhe Zhang, Tai Wang, Ziwei Liu, Dahua Lin
We adopt DCD to evaluate the point cloud completion task, where experimental results show that DCD pays attention to both the overall structure and local geometric details and provides a more reliable evaluation even when CD and EMD contradict each other.
1 code implementation • 12 Sep 2021 • Xinge Zhu, Hui Zhou, Tai Wang, Fangzhou Hong, Wei Li, Yuexin Ma, Hongsheng Li, Ruigang Yang, Dahua Lin
In this paper, we benchmark our model on these three tasks.
no code implementations • 22 Aug 2021 • Xidong Peng, Xinge Zhu, Tai Wang, Yuexin Ma
Due to the information sparsity of local cost volume, we further introduce match reweighting and structure-aware attention, to make the depth information more concentrated.
1 code implementation • 29 Jul 2021 • Tai Wang, Xinge Zhu, Jiangmiao Pang, Dahua Lin
As the preliminary depth estimation of each instance is usually inaccurate in this ill-posed setting, we incorporate a probabilistic representation to capture the uncertainty.
Ranked #10 on 3D Object Detection on KITTI Cars Hard val
8 code implementations • 22 Apr 2021 • Tai Wang, Xinge Zhu, Jiangmiao Pang, Dahua Lin
In this paper, we study this problem with a practice built on a fully convolutional single-stage detector and propose a general framework FCOS3D.
Ranked #323 on 3D Object Detection on nuScenes
no code implementations • 20 Nov 2020 • Tai Wang, Conghui He, Zhe Wang, Jianping Shi, Dahua Lin
Recent years have witnessed the rapid progress of perception algorithms on top of LiDAR, a widely adopted sensor for autonomous driving systems.
2 code implementations • CVPR 2021 • Xinge Zhu, Hui Zhou, Tai Wang, Fangzhou Hong, Yuexin Ma, Wei Li, Hongsheng Li, Dahua Lin
However, we found that in the outdoor point cloud, the improvement obtained in this way is quite limited.
Ranked #3 on 3D Semantic Segmentation on ScribbleKITTI
1 code implementation • 6 Apr 2020 • Xinge Zhu, Yuexin Ma, Tai Wang, Yan Xu, Jianping Shi, Dahua Lin
Multi-class 3D object detection aims to localize and classify objects of multiple categories from point clouds.
no code implementations • 6 Apr 2020 • Tai Wang, Xinge Zhu, Dahua Lin
LiDAR is an important method for autonomous driving systems to sense the environment.
no code implementations • 12 Feb 2016 • Tai Wang, Xiangen Hu, Keith Shubeck, Zhiqiang Cai, Jie Tang
The relationship between reading and writing (RRW) is one of the major themes in learning science.