1 code implementation • 22 Dec 2023 • Zhangyang Qi, Ye Fang, Mengchen Zhang, Zeyi Sun, Tong Wu, Ziwei Liu, Dahua Lin, Jiaqi Wang, Hengshuang Zhao
We conducted a series of structured experiments to evaluate their performance in various industrial application scenarios, offering a comprehensive perspective on their practical utility.
1 code implementation • 5 Dec 2023 • Zhangyang Qi, Ye Fang, Zeyi Sun, Xiaoyang Wu, Tong Wu, Jiaqi Wang, Dahua Lin, Hengshuang Zhao
Multimodal Large Language Models (MLLMs) have excelled in 2D image-text comprehension and image generation, but their understanding of the 3D world is notably deficient, limiting progress in 3D language understanding and generation.
no code implementations • 2 Jun 2023 • Zhangyang Qi, Jiaqi Wang, Xiaoyang Wu, Hengshuang Zhao
Multi-view 3D object detection is becoming popular in autonomous driving due to its high effectiveness and low cost.