Search Results for author: Tai Wang

Found 29 papers, 22 papers with code

An Empirical Study of Training State-of-the-Art LiDAR Segmentation Models

1 code implementation • 23 May 2024 • Jiahao Sun, Chunmei Qing, Xiang Xu, Lingdong Kong, Youquan Liu, Li Li, Chenming Zhu, Jingwei Zhang, Zeqi Xiao, Runnan Chen, Tai Wang, Wenwei Zhang, Kai Chen

In the rapidly evolving field of autonomous driving, precise segmentation of LiDAR data is crucial for understanding complex 3D environments.

Autonomous Driving Benchmarking +3

4,920

Grounded 3D-LLM with Referent Tokens

1 code implementation • 16 May 2024 • Yilun Chen, Shuai Yang, Haifeng Huang, Tai Wang, Ruiyuan Lyu, Runsen Xu, Dahua Lin, Jiangmiao Pang

Prior studies on 3D scene understanding have primarily developed specialized models for specific tasks or required task-specific fine-tuning.

Dense Captioning Language Modelling +3

GenNBV: Generalizable Next-Best-View Policy for Active 3D Reconstruction

no code implementations • 25 Feb 2024 • Xiao Chen, Quanyi Li, Tai Wang, Tianfan Xue, Jiangmiao Pang

Previous works attempt to automate this process using the Next-Best-View (NBV) policy for active 3D reconstruction.

3D Reconstruction Reinforcement Learning (RL)

EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI

1 code implementation • 26 Dec 2023 • Tai Wang, Xiaohan Mao, Chenming Zhu, Runsen Xu, Ruiyuan Lyu, Peisen Li, Xiao Chen, Wenwei Zhang, Kai Chen, Tianfan Xue, Xihui Liu, Cewu Lu, Dahua Lin, Jiangmiao Pang

In the realm of computer vision and robotics, embodied agents are expected to explore their environment and carry out human instructions.

Scene Understanding

331

OctreeOcc: Efficient and Multi-Granularity Occupancy Prediction Using Octree Queries

1 code implementation • 6 Dec 2023 • Yuhang Lu, Xinge Zhu, Tai Wang, Yuexin Ma

Occupancy prediction has increasingly garnered attention in recent years for its fine-grained understanding of 3D scenes.

Learning to Adapt SAM for Segmenting Cross-domain Point Clouds

no code implementations • 13 Oct 2023 • Xidong Peng, Runnan Chen, Feng Qiao, Lingdong Kong, Youquan Liu, Tai Wang, Xinge Zhu, Yuexin Ma

Unsupervised domain adaptation (UDA) in 3D segmentation tasks presents a formidable challenge, primarily stemming from the sparse and unordered nature of point cloud data.

General Knowledge Image Segmentation +4

Object2Scene: Putting Objects in Context for Open-Vocabulary 3D Detection

no code implementations • 18 Sep 2023 • Chenming Zhu, Wenwei Zhang, Tai Wang, Xihui Liu, Kai Chen

Instead of leveraging 2D images, we propose Object2Scene, the first approach that leverages large-scale large-vocabulary 3D object datasets to augment existing 3D scene datasets for open-vocabulary 3D object detection.

Ranked #2 on 3D Open-Vocabulary Object Detection on ScanNet on unseen classes

3D Object Detection 3D Open-Vocabulary Object Detection +4

Unified Human-Scene Interaction via Prompted Chain-of-Contacts

1 code implementation • 14 Sep 2023 • Zeqi Xiao, Tai Wang, Jingbo Wang, Jinkun Cao, Wenwei Zhang, Bo Dai, Dahua Lin, Jiangmiao Pang

Based on the definition, UniHSI constitutes a Large Language Model (LLM) Planner to translate language prompts into task plans in the form of CoC, and a Unified Controller that turns CoC into uniform task execution.

Language Modelling Large Language Model

129

PointLLM: Empowering Large Language Models to Understand Point Clouds

3 code implementations • 31 Aug 2023 • Runsen Xu, Xiaolong Wang, Tai Wang, Yilun Chen, Jiangmiao Pang, Dahua Lin

The unprecedented advancements in Large Language Models (LLMs) have shown a profound impact on natural language processing but are yet to fully embrace the realm of 3D understanding.

Ranked #3 on 3D Object Captioning on Objaverse

3D Object Captioning 3D Question Answering (3D-QA) +3

400

Scene as Occupancy

2 code implementations • ICCV 2023 • Chonghao Sima, Wenwen Tong, Tai Wang, Li Chen, Silei Wu, Hanming Deng, Yi Gu, Lewei Lu, Ping Luo, Dahua Lin, Hongyang Li

Human drivers can easily describe complex traffic scenes using their visual system.

Decoder Motion Planning

502

DORT: Modeling Dynamic Objects in Recurrent for Multi-Camera 3D Object Detection and Tracking

1 code implementation • 29 Mar 2023 • Qing Lian, Tai Wang, Dahua Lin, Jiangmiao Pang

Recent multi-camera 3D object detectors usually leverage temporal information to construct multi-view stereo that alleviates the ill-posed depth estimation.

3D Object Detection Depth Estimation +3

MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training

1 code implementation • CVPR 2023 • Runsen Xu, Tai Wang, Wenwei Zhang, Runjian Chen, Jinkun Cao, Jiangmiao Pang, Dahua Lin

This paper introduces the Masked Voxel Jigsaw and Reconstruction (MV-JAR) method for LiDAR-based self-supervised pre-training and a carefully designed data-efficient 3D object detection benchmark on the Waymo dataset.

3D Object Detection object-detection

Position-Guided Point Cloud Panoptic Segmentation Transformer

1 code implementation • 23 Mar 2023 • Zeqi Xiao, Wenwei Zhang, Tai Wang, Chen Change Loy, Dahua Lin, Jiangmiao Pang

DEtection TRansformer (DETR) started a trend that uses a group of learnable queries for unified visual perception.

Ranked #1 on Panoptic Segmentation on SemanticKITTI

Instance Segmentation Panoptic Segmentation +3

GeoMIM: Towards Better 3D Knowledge Transfer via Masked Image Modeling for Multi-view 3D Understanding

1 code implementation • ICCV 2023 • Jihao Liu, Tai Wang, Boxiao Liu, Qihang Zhang, Yu Liu, Hongsheng Li

In this paper, we propose Geometry Enhanced Masked Image Modeling (GeoMIM) to transfer the knowledge of the LiDAR model in a pretrain-finetune paradigm for improving the multi-view camera-based 3D detection.

3D Object Detection Decoder +2

Vision-Centric BEV Perception: A Survey

1 code implementation • 4 Aug 2022 • Yuexin Ma, Tai Wang, Xuyang Bai, Huitong Yang, Yuenan Hou, Yaming Wang, Yu Qiao, Ruigang Yang, Dinesh Manocha, Xinge Zhu

In recent years, vision-centric Bird's Eye View (BEV) perception has garnered significant interest from both industry and academia due to its inherent advantages, such as providing an intuitive representation of the world and being conducive to data fusion.

644

Monocular 3D Object Detection with Depth from Motion

1 code implementation • 26 Jul 2022 • Tai Wang, Jiangmiao Pang, Dahua Lin

Perceiving 3D objects from monocular inputs is crucial for robotic systems, given its economy compared to multi-sensor settings.

Depth Estimation Monocular 3D Object Detection +2

301

MV-FCOS3D++: Multi-View Camera-Only 4D Object Detection with Pretrained Monocular Backbones

1 code implementation • 26 Jul 2022 • Tai Wang, Qing Lian, Chenming Zhu, Xinge Zhu, Wenwei Zhang

In this technical report, we present our solution, dubbed MV-FCOS3D++, for the Camera-Only 3D Detection track in Waymo Open Dataset Challenge 2022.

object-detection Object Detection +1

301

MonoDETR: Depth-guided Transformer for Monocular 3D Object Detection

1 code implementation • ICCV 2023 • Renrui Zhang, Han Qiu, Tai Wang, Ziyu Guo, Xuanzhuo Xu, Ziteng Cui, Yu Qiao, Peng Gao, Hongsheng Li

In this paper, we introduce the first DETR framework for Monocular DEtection with a depth-guided TRansformer, named MonoDETR.

Ranked #9 on 3D Object Detection From Monocular Images on KITTI-360

3D Object Detection From Monocular Images Autonomous Driving +4

321

Balanced Chamfer Distance as a Comprehensive Metric for Point Cloud Completion

1 code implementation • NeurIPS 2021 • Tong Wu, Liang Pan, Junzhe Zhang, Tai Wang, Ziwei Liu, Dahua Lin

We adopt DCD to evaluate the point cloud completion task, where experimental results show that DCD pays attention to both the overall structure and local geometric details and provides a more reliable evaluation even when CD and EMD contradict each other.

Point Cloud Completion

137

Density-aware Chamfer Distance as a Comprehensive Metric for Point Cloud Completion

1 code implementation • 24 Nov 2021 • Tong Wu, Liang Pan, Junzhe Zhang, Tai Wang, Ziwei Liu, Dahua Lin

Point Cloud Completion

137

Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based Perception

1 code implementation • 12 Sep 2021 • Xinge Zhu, Hui Zhou, Tai Wang, Fangzhou Hong, Wei Li, Yuexin Ma, Hongsheng Li, Ruigang Yang, Dahua Lin

In this paper, we benchmark our model on these three tasks.

Panoptic Segmentation Segmentation

815

SIDE: Center-based Stereo 3D Detector with Structure-aware Instance Depth Estimation

no code implementations • 22 Aug 2021 • Xidong Peng, Xinge Zhu, Tai Wang, Yuexin Ma

Due to the information sparsity of local cost volume, we further introduce match reweighting and structure-aware attention, to make the depth information more concentrated.

Depth Estimation

Probabilistic and Geometric Depth: Detecting Objects in Perspective

1 code implementation • 29 Jul 2021 • Tai Wang, Xinge Zhu, Jiangmiao Pang, Dahua Lin

As the preliminary depth estimation of each instance is usually inaccurate in this ill-posed setting, we incorporate a probabilistic representation to capture the uncertainty.

Ranked #10 on 3D Object Detection on KITTI Cars Hard val

Attribute Depth Estimation +2

4,920

FCOS3D: Fully Convolutional One-Stage Monocular 3D Object Detection

8 code implementations • 22 Apr 2021 • Tai Wang, Xinge Zhu, Jiangmiao Pang, Dahua Lin

In this paper, we study this problem with a practice built on a fully convolutional single-stage detector and propose a general framework FCOS3D.

Ranked #323 on 3D Object Detection on nuScenes

Autonomous Driving Monocular 3D Object Detection +2

4,920

FLAVA: Find, Localize, Adjust and Verify to Annotate LiDAR-Based Point Clouds

no code implementations • 20 Nov 2020 • Tai Wang, Conghui He, Zhe Wang, Jianping Shi, Dahua Lin

Recent years have witnessed the rapid progress of perception algorithms on top of LiDAR, a widely adopted sensor for autonomous driving systems.

Autonomous Driving

Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation

2 code implementations • CVPR 2021 • Xinge Zhu, Hui Zhou, Tai Wang, Fangzhou Hong, Yuexin Ma, Wei Li, Hongsheng Li, Dahua Lin

However, we found that in the outdoor point cloud, the improvement obtained in this way is quite limited.

Ranked #3 on 3D Semantic Segmentation on ScribbleKITTI

LIDAR Semantic Segmentation Panoptic Segmentation +3

815

SSN: Shape Signature Networks for Multi-class Object Detection from Point Clouds

1 code implementation • 6 Apr 2020 • Xinge Zhu, Yuexin Ma, Tai Wang, Yan Xu, Jianping Shi, Dahua Lin

Multi-class 3D object detection aims to localize and classify objects of multiple categories from point clouds.

3D Object Detection object-detection

Reconfigurable Voxels: A New Representation for LiDAR-Based Point Clouds

no code implementations • 6 Apr 2020 • Tai Wang, Xinge Zhu, Dahua Lin

LiDAR is an important method for autonomous driving systems to sense the environment.

Autonomous Driving

An Empirical Study on Academic Commentary and Its Implications on Reading and Writing

no code implementations • 12 Feb 2016 • Tai Wang, Xiangen Hu, Keith Shubeck, Zhiqiang Cai, Jie Tang

The relationship between reading and writing (RRW) is one of the major themes in learning science.
