no code implementations • 9 Feb 2024 • Siming Yan, Min Bai, Weifeng Chen, Xiong Zhou, QiXing Huang, Li Erran Li
By combining the natural language understanding, generation capabilities, and breadth of knowledge of large language models with image perception, recent large vision-language models (LVLMs) have shown unprecedented visual reasoning capabilities.
no code implementations • 12 Jan 2024 • Shengyi Qian, Weifeng Chen, Min Bai, Xiong Zhou, Zhuowen Tu, Li Erran Li
Affordance grounding refers to the task of finding the area of an object with which one can interact.
1 code implementation • 4 Apr 2023 • Haitao Yang, Zaiwei Zhang, Xiangru Huang, Min Bai, Chen Song, Bo Sun, Li Erran Li, QiXing Huang
Bird's-Eye View (BEV) features are popular intermediate scene representations shared by the 3D backbone and the detector head in LiDAR-based object detectors.
no code implementations • CVPR 2023 • Zaiwei Zhang, Min Bai, Erran Li
The first task focuses on learning semantic information by sorting local groups of points in the scene into a globally consistent set of semantically meaningful clusters using contrastive learning.
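One common way to realize this kind of grouping is to assign each local point-group embedding to its nearest learnable cluster prototype, as in prototype-based contrastive clustering. The sketch below is illustrative only — the function and variable names are hypothetical and this is not the paper's exact formulation.

```python
import numpy as np

def assign_clusters(group_embs, prototypes):
    """Assign each point-group embedding to its nearest cluster prototype
    by cosine similarity (illustrative sketch, not the paper's method).
    group_embs: (G, D) embeddings of local point groups
    prototypes: (K, D) learnable cluster centres
    Returns (G,) cluster indices."""
    g = group_embs / np.linalg.norm(group_embs, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    sim = g @ p.T  # (G, K) cosine similarities
    return sim.argmax(axis=1)

embs = np.array([[1.0, 0.0], [0.0, 1.0]])
protos = np.array([[0.9, 0.1], [0.1, 0.9]])
assignments = assign_clusters(embs, protos)
```

In a contrastive setup, these assignments would supply positive/negative pairs: groups mapped to the same prototype are pulled together, others pushed apart.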
no code implementations • 16 Dec 2022 • Dylan Sam, Min Bai, Tristan McKinney, Li Erran Li
Recent methods in self-supervised learning have demonstrated that masking-based pretext tasks extend beyond NLP, serving as useful pretraining objectives in computer vision.
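The core of a masking-based pretext task is simple: hide a random subset of input patches and train a model to reconstruct them. A toy sketch, with all names hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def mask_patches(image_patches, mask_ratio=0.75):
    """Toy sketch of a masking pretext task: hide a random subset of
    patches; a model would then be trained to reconstruct them.
    image_patches: (N, D) flattened patches.
    Returns (visible_patches, indices_of_masked_patches)."""
    n = len(image_patches)
    n_mask = int(n * mask_ratio)
    idx = rng.permutation(n)
    mask_idx = idx[:n_mask]
    visible = np.delete(image_patches, mask_idx, axis=0)
    return visible, mask_idx

patches = np.arange(16.0).reshape(8, 2)  # 8 patches of dimension 2
visible, masked = mask_patches(patches)
```

With a 0.75 mask ratio, only a quarter of the patches are passed to the encoder; the reconstruction loss on the hidden patches provides the self-supervised training signal.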
no code implementations • 18 Jan 2021 • Min Bai, Shenlong Wang, Kelvin Wong, Ersin Yumer, Raquel Urtasun
In this paper, we introduce a non-parametric memory representation for spatio-temporal segmentation that captures the local space and time around an autonomous vehicle (AV).
no code implementations • 17 Jan 2021 • Bin Yang, Min Bai, Ming Liang, Wenyuan Zeng, Raquel Urtasun
The key idea is to decompose the 4D object label into two parts: the object's 3D size, which is fixed over time for rigid objects, and the motion path describing the evolution of the object's pose over time.
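The decomposition can be sketched as follows. This is a minimal illustration under the stated assumption (rigid objects, a pose per frame); the class and field names are hypothetical, not from the paper.

```python
import numpy as np

class TrajectoryLabel:
    """Illustrative 4D label: a rigid object's size is stored once,
    while its pose varies per frame along the motion path."""

    def __init__(self, size_lwh, poses):
        # size_lwh: (3,) fixed length/width/height of the rigid object
        # poses: (T, 4) per-frame (x, y, z, yaw) describing the motion path
        self.size = np.asarray(size_lwh, dtype=float)
        self.poses = np.asarray(poses, dtype=float)

    def box_at(self, t):
        """Reconstruct the full 7-DoF box (x, y, z, l, w, h, yaw) at frame t."""
        x, y, z, yaw = self.poses[t]
        l, w, h = self.size
        return np.array([x, y, z, l, w, h, yaw])

label = TrajectoryLabel(size_lwh=[4.5, 1.8, 1.5],
                        poses=[[0.0, 0.0, 0.0, 0.0],
                               [1.0, 0.1, 0.0, 0.05]])
box = label.box_at(1)  # size stays fixed; only the pose changes
```

Storing the size once rather than per frame removes redundancy from the annotation and enforces temporal consistency by construction.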
no code implementations • 8 Aug 2019 • Wei-Chiu Ma, Ignacio Tartavull, Ioan Andrei Bârsan, Shenlong Wang, Min Bai, Gellert Mattyus, Namdar Homayounfar, Shrinidhi Kowshika Lakshmikanth, Andrei Pokrovsky, Raquel Urtasun
In this paper we propose a novel semantic localization algorithm that exploits multiple sensors and has precision on the order of a few centimeters.
no code implementations • 4 May 2019 • Min Bai, Gellert Mattyus, Namdar Homayounfar, Shenlong Wang, Shrinidhi Kowshika Lakshmikanth, Raquel Urtasun
Reliable and accurate lane detection has been a long-standing problem in the field of autonomous driving.
1 code implementation • CVPR 2019 • Yuwen Xiong, Renjie Liao, Hengshuang Zhao, Rui Hu, Min Bai, Ersin Yumer, Raquel Urtasun
More importantly, we introduce a parameter-free panoptic head which solves panoptic segmentation via pixel-wise classification.
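One way a parameter-free head can work is to stack semantic ("stuff") logits with per-instance mask logits and take a per-pixel argmax, so no new learnable weights are introduced. The numpy sketch below shows this fusion idea in miniature; it is an assumption-laden illustration, not the authors' exact formulation.

```python
import numpy as np

def panoptic_head(sem_logits, inst_logits):
    """Parameter-free fusion sketch: concatenate per-pixel semantic logits
    for stuff classes with per-instance mask logits, then argmax.
    sem_logits:  (C_stuff, H, W) semantic logits for stuff classes
    inst_logits: (N_inst, H, W) mask logits, one channel per instance
    Returns an (H, W) map where values < C_stuff are stuff classes and
    values >= C_stuff index instances."""
    combined = np.concatenate([sem_logits, inst_logits], axis=0)
    return combined.argmax(axis=0)

sem = np.zeros((2, 2, 2)); sem[1] = 0.5           # stuff class 1 favoured
inst = np.zeros((1, 2, 2)); inst[0, 0, 0] = 2.0   # instance wins one pixel
out = panoptic_head(sem, inst)
```

Because the head contains no parameters, semantic and instance branches can be trained with their own losses while panoptic output falls out of a single classification step.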
Ranked #3 on Panoptic Segmentation on Indian Driving Dataset
2 code implementations • CVPR 2018 • Diego Marcos, Devis Tuia, Benjamin Kellenberger, Lisa Zhang, Min Bai, Renjie Liao, Raquel Urtasun
The world is covered with millions of buildings, and precisely knowing each instance's position and extents is vital to a multitude of applications.
no code implementations • ICCV 2017 • Shenlong Wang, Min Bai, Gellert Mattyus, Hang Chu, Wenjie Luo, Bin Yang, Justin Liang, Joel Cheverie, Sanja Fidler, Raquel Urtasun
In this paper we introduce the TorontoCity benchmark, which covers the full greater Toronto area (GTA) with 712.5 $km^2$ of land, 8,439 $km$ of road and around 400,000 buildings.
3 code implementations • CVPR 2017 • Min Bai, Raquel Urtasun
Most contemporary approaches to instance segmentation use complex pipelines involving conditional random fields, recurrent neural networks, object proposals, or template matching schemes.
no code implementations • 6 Apr 2016 • Min Bai, Wenjie Luo, Kaustav Kundu, Raquel Urtasun
We tackle the problem of estimating optical flow from a monocular camera in the context of autonomous driving.