Search Results for author: Ruihang Chu

Found 14 papers, 6 papers with code

Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models

2 code implementations • 27 Mar 2024 • Yanwei Li, Yuechen Zhang, Chengyao Wang, Zhisheng Zhong, Yixin Chen, Ruihang Chu, Shaoteng Liu, Jiaya Jia

We try to narrow the gap by mining the potential of VLMs for better performance and any-to-any workflow from three aspects, i.e., high-resolution visual tokens, high-quality data, and VLM-guided generation.

Ranked #9 on Visual Question Answering on MM-Vet

Image Comprehension Visual Dialog +1

2,892

DriveCoT: Integrating Chain-of-Thought Reasoning with End-to-End Driving

no code implementations • 25 Mar 2024 • Tianqi Wang, Enze Xie, Ruihang Chu, Zhenguo Li, Ping Luo

We utilize the challenging driving scenarios from the CARLA leaderboard 2.0, which involve high-speed driving and lane-changing, and propose a rule-based expert policy to control the vehicle and generate ground truth labels for its reasoning process across different driving aspects and the final decisions.

DialogGen: Multi-modal Interactive Dialogue System for Multi-turn Text-to-Image Generation

no code implementations • 13 Mar 2024 • Minbin Huang, Yanxin Long, Xinchi Deng, Ruihang Chu, Jiangfeng Xiong, Xiaodan Liang, Hong Cheng, Qinglin Lu, Wei Liu

However, many of these works face challenges in identifying correct output modalities and generating coherent images accordingly as the number of output modalities increases and the conversations go deeper.

Prompt Engineering Text-to-Image Generation

A Survey of Reasoning with Foundation Models

1 code implementation • 17 Dec 2023 • Jiankai Sun, Chuanyang Zheng, Enze Xie, Zhengying Liu, Ruihang Chu, Jianing Qiu, Jiaqi Xu, Mingyu Ding, Hongyang Li, Mengzhe Geng, Yue Wu, Wenhai Wang, Junsong Chen, Zhangyue Yin, Xiaozhe Ren, Jie Fu, Junxian He, Wu Yuan, Qi Liu, Xihui Liu, Yu Li, Hao Dong, Yu Cheng, Ming Zhang, Pheng Ann Heng, Jifeng Dai, Ping Luo, Jingdong Wang, Ji-Rong Wen, Xipeng Qiu, Yike Guo, Hui Xiong, Qun Liu, Zhenguo Li

Reasoning, a crucial ability for complex problem-solving, plays a pivotal role in various real-world settings such as negotiation, medical diagnosis, and criminal investigation.

Medical Diagnosis

345

Mask-Attention-Free Transformer for 3D Instance Segmentation

1 code implementation • ICCV 2023 • Xin Lai, Yuhui Yuan, Ruihang Chu, Yukang Chen, Han Hu, Jiaya Jia

Therefore, we abandon the mask attention design and resort to an auxiliary center regression task instead.

3D Instance Segmentation Position +2

58

DiT-3D: Exploring Plain Diffusion Transformers for 3D Shape Generation

1 code implementation • NeurIPS 2023 • Shentong Mo, Enze Xie, Ruihang Chu, Lewei Yao, Lanqing Hong, Matthias Nießner, Zhenguo Li

Recent Diffusion Transformers (e.g., DiT) have demonstrated their powerful effectiveness in generating high-quality 2D images.

Ranked #1 on Point Cloud Generation on ShapeNet Car

3D Shape Generation Denoising +2

146

TriVol: Point Cloud Rendering via Triple Volumes

1 code implementation • CVPR 2023 • Tao Hu, Xiaogang Xu, Ruihang Chu, Jiaya Jia

However, artifacts still appear in rendered images, due to the challenges in extracting continuous and discriminative 3D features from point clouds.

38

Command-Driven Articulated Object Understanding and Manipulation

no code implementations • CVPR 2023 • Ruihang Chu, Zhengzhe Liu, Xiaoqing Ye, Xiao Tan, Xiaojuan Qi, Chi-Wing Fu, Jiaya Jia

The key to Cart is to utilize the prediction of object structures to connect visual observations with user commands for effective manipulations.

motion prediction Object +1

TWIST: Two-Way Inter-Label Self-Training for Semi-Supervised 3D Instance Segmentation

no code implementations • CVPR 2022 • Ruihang Chu, Xiaoqing Ye, Zhengzhe Liu, Xiao Tan, Xiaojuan Qi, Chi-Wing Fu, Jiaya Jia

We explore the way to alleviate the label-hungry problem in a semi-supervised setting for 3D instance segmentation.

3D Instance Segmentation Denoising +2

ICM-3D: Instantiated Category Modeling for 3D Instance Segmentation

no code implementations • 26 Aug 2021 • Ruihang Chu, Yukang Chen, Tao Kong, Lu Qi, Lei LI

Separating 3D point clouds into individual instances is an important task for 3D vision.

3D Instance Segmentation Semantic Segmentation

Simultaneous Semantic and Collision Learning for 6-DoF Grasp Pose Estimation

no code implementations • 5 Aug 2021 • Yiming Li, Tao Kong, Ruihang Chu, Yifeng Li, Peng Wang, Lei LI

In a unified framework, we jointly predict the feasible 6-DoF grasp poses, instance semantic segmentation, and collision information.

Multi-Task Learning Pose Estimation +1

Scale-aware Automatic Augmentation for Object Detection

1 code implementation • CVPR 2021 • Yukang Chen, Yanwei Li, Tao Kong, Lu Qi, Ruihang Chu, Lei LI, Jiaya Jia

We propose Scale-aware AutoAug to learn data augmentation policies for object detection.

Data Augmentation Instance Segmentation +5

196

Reinforcement Learning for the Beginning of Starcraft II Game

no code implementations • CUHK Course IERG5350 2020 • Yukang Chen, Ruihang Chu

In this project, we plan to develop a reinforcement learning model for the beginning of Starcraft II game, instead of the full-length game.

reinforcement-learning Reinforcement Learning (RL) +2

Vehicle Re-identification with Viewpoint-aware Metric Learning

no code implementations • ICCV 2019 • Ruihang Chu, Yifan Sun, Yadong Li, Zheng Liu, Chi Zhang, Yichen Wei

This paper considers the vehicle re-identification (re-ID) problem.

Metric Learning Vehicle Re-Identification
