Search Results for author: Zhangyang Qi

Found 3 papers, 2 papers with code

Gemini vs GPT-4V: A Preliminary Comparison and Combination of Vision-Language Models Through Qualitative Cases

1 code implementation • 22 Dec 2023 • Zhangyang Qi, Ye Fang, Mengchen Zhang, Zeyi Sun, Tong Wu, Ziwei Liu, Dahua Lin, Jiaqi Wang, Hengshuang Zhao

We conducted a series of structured experiments to evaluate the performance of Gemini and GPT-4V in various industrial application scenarios, offering a comprehensive perspective on their practical utility.

181

GPT4Point: A Unified Framework for Point-Language Understanding and Generation

1 code implementation • 5 Dec 2023 • Zhangyang Qi, Ye Fang, Zeyi Sun, Xiaoyang Wu, Tong Wu, Jiaqi Wang, Dahua Lin, Hengshuang Zhao

Multimodal Large Language Models (MLLMs) have excelled in 2D image-text comprehension and image generation, but their understanding of the 3D world is notably deficient, limiting progress in 3D language understanding and generation.

3D Generation, Reading Comprehension

260

OCBEV: Object-Centric BEV Transformer for Multi-View 3D Object Detection

no code implementations • 2 Jun 2023 • Zhangyang Qi, Jiaqi Wang, Xiaoyang Wu, Hengshuang Zhao

Multi-view 3D object detection is becoming popular in autonomous driving due to its high effectiveness and low cost.

3D Object Detection, Autonomous Driving, +3
