1 code implementation • 17 Nov 2023 • Jiageng Mao, Junjie Ye, Yuxi Qian, Marco Pavone, Yue Wang
Our approach, termed Agent-Driver, transforms the traditional autonomous driving pipeline by introducing a versatile tool library accessible via function calls, a cognitive memory of common sense and experiential knowledge for decision-making, and a reasoning engine capable of chain-of-thought reasoning, task planning, motion planning, and self-reflection.
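The tool-library idea above can be sketched as a registry of functions that the LLM reasoning engine invokes via function calls. This is a hypothetical illustration, not the paper's code: the tool names, signatures, and the memory contents below are invented for the sketch.

```python
# Hypothetical sketch of a tool library exposed via function calls,
# in the spirit of Agent-Driver. Tool names and return values are
# invented for illustration, not taken from the paper's implementation.
from typing import Callable, Dict

TOOLS: Dict[str, Callable] = {}

def tool(fn: Callable) -> Callable:
    """Register a function in the tool library under its own name."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def get_ego_state() -> dict:
    # Stub: a real system would read perception/localization outputs.
    return {"speed_mps": 5.0, "heading_deg": 90.0}

@tool
def query_memory(key: str) -> str:
    # Stub for the cognitive-memory lookup (common-sense / experience).
    memory = {"yellow_light": "slow down and prepare to stop"}
    return memory.get(key, "no entry")

def call_tool(name: str, **kwargs):
    """Dispatch a function call requested by the reasoning engine (LLM)."""
    return TOOLS[name](**kwargs)

print(call_tool("get_ego_state"))                     # {'speed_mps': 5.0, 'heading_deg': 90.0}
print(call_tool("query_memory", key="yellow_light"))  # slow down and prepare to stop
```

In this pattern the LLM only emits a tool name and arguments; the dispatcher executes the call and returns the result for the next reasoning step.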
1 code implementation • 2 Oct 2023 • Jiageng Mao, Yuxi Qian, Junjie Ye, Hang Zhao, Yue Wang
In this paper, we propose a novel approach to motion planning that capitalizes on the strong reasoning capabilities and generalization potential inherent to Large Language Models (LLMs).
Ranked #1 on Motion Planning on nuScenes
no code implementations • 3 Apr 2022 • Yuxi Qian, Yuncong Hu, Ruonan Wang, Fangxiang Feng, Xiaojie Wang
It first models the semantic, spatial, and implicit visual relations in an image with three graph attention networks; question information then guides the aggregation process of the three graphs; finally, our QD-GFN adopts an object filtering mechanism to remove question-irrelevant objects from the image.
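The question-guided aggregation step can be illustrated with a minimal sketch: weight the outputs of the three relation graphs by their affinity with the question embedding. This is an assumption-laden toy, not the authors' code; the dot-product scoring and softmax fusion rule are stand-ins for the learned aggregation.

```python
# Illustrative sketch (not QD-GFN's actual code): question-guided fusion
# of features from three relation graphs. Shapes and the softmax fusion
# rule are assumptions made for this example.
import numpy as np

def question_guided_fusion(q, graph_feats):
    """Weight each graph's feature by its affinity with the question vector.

    q           : (d,) question embedding
    graph_feats : list of three (d,) vectors from the semantic, spatial,
                  and implicit relation graphs
    """
    scores = np.array([q @ g for g in graph_feats])  # affinity per graph
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                         # softmax over the three graphs
    return sum(w * g for w, g in zip(weights, graph_feats))

rng = np.random.default_rng(0)
q = rng.normal(size=8)
feats = [rng.normal(size=8) for _ in range(3)]
fused = question_guided_fusion(q, feats)
print(fused.shape)  # (8,)
```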
no code implementations • Findings (ACL) 2022 • Ruonan Wang, Yuxi Qian, Fangxiang Feng, Xiaojie Wang, Huixing Jiang
Most existing approaches to Visual Question Answering (VQA) answer questions directly. People, however, usually decompose a complex question into a sequence of simple sub-questions and obtain the answer to the original question only after answering this sub-question sequence (SQS).
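The sub-question-sequence idea can be sketched as a loop that answers each sub-question in order, carrying the intermediate answers as context for the final question. Everything here is a toy stand-in: the hand-written decomposition and lookup-table answerer replace the learned components of a real VQA model.

```python
# Toy illustration (not the paper's method) of answering via a
# sub-question sequence (SQS): intermediate answers accumulate in a
# context dict that informs the final answer.
def answer_via_sqs(question, sub_questions, answer_fn):
    """Answer each sub-question in order, feeding context forward."""
    context = {}
    for sq in sub_questions:
        context[sq] = answer_fn(sq, context)
    return answer_fn(question, context)

def toy_answerer(q, context):
    # Stand-in for a VQA model: a fixed fact table plus fallback to context.
    facts = {"What is left of the car?": "a bike",
             "What color is the bike?": "red"}
    return facts.get(q, context.get("What color is the bike?", "unknown"))

print(answer_via_sqs(
    "What color is the object left of the car?",
    ["What is left of the car?", "What color is the bike?"],
    toy_answerer))  # red
```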
1 code implementation • 6 Aug 2021 • Yulin Li, Yuxi Qian, Yuchen Yu, Xiameng Qin, Chengquan Zhang, Yan Liu, Kun Yao, Junyu Han, Jingtuo Liu, Errui Ding
Due to the complexity of content and layout in VRDs, structured text understanding has been a challenging task.