Search Results for author: Bowen Cheng

Found 24 papers, 15 papers with code

VL-SAT: Visual-Linguistic Semantics Assisted Training for 3D Semantic Scene Graph Prediction in Point Cloud

1 code implementation • CVPR 2023 • Ziqin Wang, Bowen Cheng, Lichen Zhao, Dong Xu, Yang Tang, Lu Sheng

Since 2D images provide rich semantics and scene graphs are in nature coped with languages, in this study, we propose Visual-Linguistic Semantics Assisted Training (VL-SAT) scheme that can significantly empower 3DSSG prediction models with discrimination about long-tailed and ambiguous semantic relations.

Ranked #1 on 3d scene graph generation on 3DSSG (using extra training data)

3d scene graph generation Relation

Paper
Code

Locating Noise is Halfway Denoising for Semi-Supervised Segmentation

no code implementations • ICCV 2023 • Yan Fang, Feng Zhu, Bowen Cheng, Luoqi Liu, Yao Zhao, Yunchao Wei

This work shows that locating the patch-wise noisy region is a better way to deal with noise.

Denoising Semi-Supervised Semantic Segmentation

Paper
Add Code

Mask2Former for Video Instance Segmentation

5 code implementations • 20 Dec 2021 • Bowen Cheng, Anwesa Choudhuri, Ishan Misra, Alexander Kirillov, Rohit Girdhar, Alexander G. Schwing

We find Mask2Former also achieves state-of-the-art performance on video instance segmentation without modifying the architecture, the loss or even the training pipeline.

Ranked #14 on Video Instance Segmentation on YouTube-VIS validation

Image Segmentation Instance Segmentation +5

126,240

Paper
Code

Masked-attention Mask Transformer for Universal Image Segmentation

6 code implementations • CVPR 2022 • Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov, Rohit Girdhar

While only the semantics of each task differ, current research focuses on designing specialized architectures for each task.

Ranked #3 on Semantic Segmentation on Mapillary val

Image Segmentation Instance Segmentation +3

126,240

Paper
Code

Per-Pixel Classification is Not All You Need for Semantic Segmentation

3 code implementations • NeurIPS 2021 • Bowen Cheng, Alexander G. Schwing, Alexander Kirillov

Overall, the proposed mask classification-based method simplifies the landscape of effective approaches to semantic and panoptic segmentation tasks and shows excellent empirical results.

Ranked #4 on Semantic Segmentation on Mapillary val

Classification Panoptic Segmentation +1

126,240

Paper
Code

Pseudo-IoU: Improving Label Assignment in Anchor-Free Object Detection

1 code implementation • 29 Apr 2021 • Jiachen Li, Bowen Cheng, Rogerio Feris, JinJun Xiong, Thomas S. Huang, Wen-mei Hwu, Humphrey Shi

Current anchor-free object detectors are quite simple and effective yet lack accurate label assignment methods, which limits their potential in competing with classic anchor-based models that are supported by well-designed assignment methods based on the Intersection-over-Union~(IoU) metric.

Object object-detection +1

Paper
Code

Pointly-Supervised Instance Segmentation

2 code implementations • CVPR 2022 • Bowen Cheng, Omkar Parkhi, Alexander Kirillov

Our experiments show that the new module is more suitable for the point-based supervision.

Instance Segmentation Object +3

28,911

Paper
Code

Back-tracing Representative Points for Voting-based 3D Object Detection in Point Clouds

1 code implementation • CVPR 2021 • Bowen Cheng, Lu Sheng, Shaoshuai Shi, Ming Yang, Dong Xu

Inspired by the back-tracing strategy in the conventional Hough voting methods, in this work, we introduce a new 3D object detection method, named as Back-tracing Representative Points Network (BRNet), which generatively back-traces the representative points from the vote centers and also revisits complementary seed points around these generated points, so as to better capture the fine local structural features surrounding the potential objects from the raw point clouds.

Ranked #17 on 3D Object Detection on ScanNetV2

3D Object Detection Object +1

Paper
Code

Boundary IoU: Improving Object-Centric Image Segmentation Evaluation

2 code implementations • CVPR 2021 • Bowen Cheng, Ross Girshick, Piotr Dollár, Alexander C. Berg, Alexander Kirillov

We perform an extensive analysis across different error types and object sizes and show that Boundary IoU is significantly more sensitive than the standard Mask IoU measure to boundary errors for large objects and does not over-penalize errors on smaller objects.

Image Segmentation Object +2

208

Paper
Code

ScaleNAS: One-Shot Learning of Scale-Aware Representations for Visual Recognition

no code implementations • 30 Nov 2020 • Hsin-Pai Cheng, Feng Liang, Meng Li, Bowen Cheng, Feng Yan, Hai Li, Vikas Chandra, Yiran Chen

We use ScaleNAS to create high-resolution models for two different tasks, ScaleNet-P for human pose estimation and ScaleNet-S for semantic segmentation.

Ranked #5 on Multi-Person Pose Estimation on COCO test-dev

Multi-Person Pose Estimation Neural Architecture Search +2

Paper
Add Code

Naive-Student: Leveraging Semi-Supervised Learning in Video Sequences for Urban Scene Segmentation

1 code implementation • ECCV 2020 • Liang-Chieh Chen, Raphael Gontijo Lopes, Bowen Cheng, Maxwell D. Collins, Ekin D. Cubuk, Barret Zoph, Hartwig Adam, Jonathon Shlens

We view this work as a notable step towards building a simple procedure to harness unlabeled video sequences and extra images to surpass state-of-the-art performance on core computer vision tasks.

Image Segmentation Optical Flow Estimation +4

76,636

Paper
Code

Panoptic-DeepLab: A Simple, Strong, and Fast Baseline for Bottom-Up Panoptic Segmentation

9 code implementations • CVPR 2020 • Bowen Cheng, Maxwell D. Collins, Yukun Zhu, Ting Liu, Thomas S. Huang, Hartwig Adam, Liang-Chieh Chen

In this work, we introduce Panoptic-DeepLab, a simple, strong, and fast system for panoptic segmentation, aiming to establish a solid baseline for bottom-up methods that can achieve comparable performance of two-stage methods while yielding fast inference speed.

Ranked #6 on Panoptic Segmentation on Cityscapes test (using extra training data)

Decoder Instance Segmentation +2

76,637

Paper
Code

Panoptic-DeepLab

2 code implementations • 10 Oct 2019 • Bowen Cheng, Maxwell D. Collins, Yukun Zhu, Ting Liu, Thomas S. Huang, Hartwig Adam, Liang-Chieh Chen

The semantic segmentation branch is the same as the typical design of any semantic segmentation model (e. g., DeepLab), while the instance segmentation branch is class-agnostic, involving a simple instance center regression.

Decoder Instance Segmentation +3

28,913

Paper
Code

SkyNet: a Hardware-Efficient Method for Object Detection and Tracking on Embedded Systems

2 code implementations • 20 Sep 2019 • Xiaofan Zhang, Haoming Lu, Cong Hao, Jiachen Li, Bowen Cheng, Yuhong Li, Kyle Rupnow, JinJun Xiong, Thomas Huang, Honghui Shi, Wen-mei Hwu, Deming Chen

Object detection and tracking are challenging tasks for resource-constrained embedded systems.

Efficient Neural Network Object +3

231

Paper
Code

HigherHRNet: Scale-Aware Representation Learning for Bottom-Up Human Pose Estimation

20 code implementations • CVPR 2020 • Bowen Cheng, Bin Xiao, Jingdong Wang, Honghui Shi, Thomas S. Huang, Lei Zhang

HigherHRNet even surpasses all top-down methods on CrowdPose test (67. 6% AP), suggesting its robustness in crowded scene.

Ranked #2 on Pose Estimation on UAV-Human

2D Human Pose Estimation Multi-Person Pose Estimation +2

12,167

Paper
Code

SPGNet: Semantic Prediction Guidance for Scene Parsing

no code implementations • ICCV 2019 • Bowen Cheng, Liang-Chieh Chen, Yunchao Wei, Yukun Zhu, Zilong Huang, JinJun Xiong, Thomas Huang, Wen-mei Hwu, Honghui Shi

The multi-scale context module refers to the operations to aggregate feature responses from a large spatial extent, while the single-stage encoder-decoder structure encodes the high-level semantic information in the encoder path and recovers the boundary information in the decoder path.

Decoder Pose Estimation +3

Paper
Add Code

High Frequency Residual Learning for Multi-Scale Image Classification

no code implementations • 7 May 2019 • Bowen Cheng, Rong Xiao, Jian-Feng Wang, Thomas Huang, Lei Zhang

We present a novel high frequency residual learning framework, which leads to a highly efficient multi-scale network (MSNet) architecture for mobile and embedded vision problems.

Classifier calibration General Classification +2

Paper
Add Code

A Simple Non-i.i.d. Sampling Approach for Efficient Training and Better Generalization

no code implementations • 23 Nov 2018 • Bowen Cheng, Yunchao Wei, Jiahui Yu, Shiyu Chang, JinJun Xiong, Wen-mei Hwu, Thomas S. Huang, Humphrey Shi

While training on samples drawn from independent and identical distribution has been a de facto paradigm for optimizing image classification networks, humans learn new concepts in an easy-to-hard manner and on the selected examples progressively.

General Classification Image Classification +6

Paper
Add Code

Decoupled Classification Refinement: Hard False Positive Suppression for Object Detection

3 code implementations • 5 Oct 2018 • Bowen Cheng, Yunchao Wei, Rogerio Feris, JinJun Xiong, Wen-mei Hwu, Thomas Huang, Humphrey Shi

In particular, DCR places a separate classification network in parallel with the localization network (base detector).

Classification General Classification +3

167

Paper
Code

Revisit Multinomial Logistic Regression in Deep Learning: Data Dependent Model Initialization for Image Recognition

no code implementations • 17 Sep 2018 • Bowen Cheng, Rong Xiao, Yandong Guo, Yuxiao Hu, Jian-Feng Wang, Lei Zhang

We study in this paper how to initialize the parameters of multinomial logistic regression (a fully connected layer followed with softmax and cross entropy loss), which is widely used in deep neural network (DNN) models for classification problems.

General Classification Image Classification +4

Paper
Add Code

TS2C: Tight Box Mining with Surrounding Segmentation Context for Weakly Supervised Object Detection

no code implementations • ECCV 2018 • Yunchao Wei, Zhiqiang Shen, Bowen Cheng, Honghui Shi, JinJun Xiong, Jiashi Feng, Thomas Huang

This work provides a simple approach to discover tight object bounding boxes with only image-level supervision, called Tight box mining with Surrounding Segmentation Context (TS2C).

Multiple Instance Learning Object +4

Paper
Add Code

Revisiting RCNN: On Awakening the Classification Power of Faster RCNN

7 code implementations • ECCV 2018 • Bowen Cheng, Yunchao Wei, Honghui Shi, Rogerio Feris, JinJun Xiong, Thomas Huang

Recent region-based object detectors are usually built with separate classification and localization branches on top of shared feature extraction networks.

Classification General Classification +1

167

Paper
Code

Enhance Visual Recognition under Adverse Conditions via Deep Networks

no code implementations • 20 Dec 2017 • Ding Liu, Bowen Cheng, Zhangyang Wang, Haichao Zhang, Thomas S. Huang

Visual recognition under adverse conditions is a very important and challenging problem of high practical value, due to the ubiquitous existence of quality distortions during image acquisition, transmission, or storage.

Data Augmentation Image Restoration +3

Paper
Add Code

Robust Emotion Recognition from Low Quality and Low Bit Rate Video: A Deep Learning Approach

no code implementations • 10 Sep 2017 • Bowen Cheng, Zhangyang Wang, Zhaobin Zhang, Zhu Li, Ding Liu, Jianchao Yang, Shuai Huang, Thomas S. Huang

Emotion recognition from facial expressions is tremendously useful, especially when coupled with smart devices and wireless multimedia applications.

Decoder Emotion Recognition +1

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.