Search Results for author: Gangshan Wu

Found 55 papers, 40 papers with code

Boundary-Aware Cascade Networks for Temporal Action Segmentation

1 code implementation • ECCV 2020 • Zhenzhi Wang, Ziteng Gao, Li-Min Wang, Zhifeng Li, Gangshan Wu

To address these problems, we present a new boundary-aware cascade network by introducing two novel components.

Ranked #14 on Action Segmentation on GTEA

Action Segmentation

Paper
Code

STMixer: A One-Stage Sparse Action Detector

no code implementations • 15 Apr 2024 • Tao Wu, Mengqi Cao, Ziteng Gao, Gangshan Wu, LiMin Wang

First, we present a query-based adaptive feature sampling module, which endows the detector with the flexibility of mining a group of discriminative features from the entire spatio-temporal domain.

Action Detection

Paper
Add Code

SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos

no code implementations • 6 Apr 2024 • Tao Wu, Runyu He, Gangshan Wu, LiMin Wang

We hope that SportsHHI can stimulate research on human interaction understanding in videos and promote the development of spatio-temporal context modeling techniques in video visual relation detection.

Graph Generation Relation +4

Paper
Add Code

Dual DETRs for Multi-Label Temporal Action Detection

no code implementations • 31 Mar 2024 • Yuhan Zhu, Guozhen Zhang, Jing Tan, Gangshan Wu, LiMin Wang

To address this issue, we propose a new Dual-level query-based TAD framework, namely DualDETR, to detect actions from both instance-level and boundary-level.

Action Detection object-detection +1

Paper
Add Code

Sketch and Refine: Towards Fast and Accurate Lane Detection

1 code implementation • 26 Jan 2024 • Chao Chen, Jie Liu, Chang Zhou, Jie Tang, Gangshan Wu

At the "Sketch" stage, local directions of keypoints can be easily estimated by fast convolutional layers.

Lane Detection

Paper
Code

Asymmetric Masked Distillation for Pre-Training Small Foundation Models

no code implementations • 6 Nov 2023 • Zhiyu Zhao, Bingkun Huang, Sen Xing, Gangshan Wu, Yu Qiao, LiMin Wang

And AMD achieves 73. 3% classification accuracy using the ViT-B model on the Something-in-Something V2 dataset, a 3. 7% improvement over the original ViT-B model from VideoMAE.

Ranked #20 on Action Recognition on Something-Something V2

Action Classification Action Recognition +3

Paper
Add Code

Joint Modeling of Feature, Correspondence, and a Compressed Memory for Video Object Segmentation

no code implementations • 25 Aug 2023 • Jiaming Zhang, Yutao Cui, Gangshan Wu, LiMin Wang

To overcome these issues, we propose a unified VOS framework, coined as JointFormer, for joint modeling the three elements of feature, correspondence, and a compressed memory.

Semantic Segmentation Video Object Segmentation +1

Paper
Add Code

DPL: Decoupled Prompt Learning for Vision-Language Models

no code implementations • 19 Aug 2023 • Chen Xu, Yuhan Zhu, Guozhen Zhang, Haocheng Shen, Yixuan Liao, Xiaoxin Chen, Gangshan Wu, LiMin Wang

Prompt learning has emerged as an efficient and effective approach for transferring foundational Vision-Language Models (e. g., CLIP) to downstream tasks.

Paper
Add Code

Robust Object Modeling for Visual Tracking

1 code implementation • ICCV 2023 • Yidong Cai, Jie Liu, Jie Tang, Gangshan Wu

To enjoy the merits of both methods, we propose a robust object modeling framework for visual tracking (ROMTrack), which simultaneously models the inherent template and the hybrid template features.

Object Visual Tracking

Paper
Code

Lightweight Super-Resolution Head for Human Pose Estimation

1 code implementation • 31 Jul 2023 • Haonan Wang, Jie Liu, Jie Tang, Gangshan Wu

We first propose the SR head, which predicts heatmaps with a spatial resolution higher than the input feature maps (or even consistent with the input image) by super-resolution, to effectively reduce the quantization error and the dependence on further post-processing.

Pose Estimation Quantization +1

Paper
Code

MaxSR: Image Super-Resolution Using Improved MaxViT

no code implementations • 14 Jul 2023 • Bincheng Yang, Gangshan Wu

Because transformer models have powerful representation capacity and the in-built self-attention mechanisms in transformer models help to leverage self-similarity prior in input low-resolution image to improve performance for single image super-resolution, we present a single image super-resolution model based on recent hybrid vision transformer of MaxViT, named as MaxSR.

Image Super-Resolution

Paper
Add Code

Transferring Foundation Models for Generalizable Robotic Manipulation

no code implementations • 9 Jun 2023 • Jiange Yang, Wenhui Tan, Chuhao Jin, Keling Yao, Bei Liu, Jianlong Fu, Ruihua Song, Gangshan Wu, LiMin Wang

In this paper, we propose a novel paradigm that effectively leverages language-reasoning segmentation mask generated by internet-scale foundation models, to condition robot manipulation tasks.

Imitation Learning Object +1

Paper
Add Code

MixFormerV2: Efficient Fully Transformer Tracking

1 code implementation • NeurIPS 2023 • Yutao Cui, Tianhui Song, Gangshan Wu, LiMin Wang

Our key design is to introduce four special prediction tokens and concatenate them with the tokens from target template and search areas.

120

Paper
Code

Video Frame Interpolation with Densely Queried Bilateral Correlation

1 code implementation • 26 Apr 2023 • Chang Zhou, Jie Liu, Jie Tang, Gangshan Wu

To better model correlations and to produce more accurate motion fields, we propose the Densely Queried Bilateral Correlation (DQBC) that gets rid of the receptive field dependency problem and thus is more friendly to small and fast-moving objects.

Ranked #1 on Video Frame Interpolation on MSU Video Frame Interpolation (VMAF metric)

Motion Estimation Video Frame Interpolation

Paper
Code

Efficient Video Action Detection with Token Dropout and Context Refinement

2 code implementations • ICCV 2023 • Lei Chen, Zhan Tong, Yibing Song, Gangshan Wu, LiMin Wang

Our EVAD consists of two specialized designs for video action detection.

Action Detection Decoder

1,226

Paper
Code

SportsMOT: A Large Multi-Object Tracking Dataset in Multiple Sports Scenes

1 code implementation • ICCV 2023 • Yutao Cui, Chenkai Zeng, Xiaoyu Zhao, Yichun Yang, Gangshan Wu, LiMin Wang

We expect SportsMOT to encourage the MOT trackers to promote in both motion-based association and appearance-based association.

Ranked #3 on Multi-Object Tracking on SportsMOT (using extra training data)

Multi-Object Tracking Multiple Object Tracking +1

120

Paper
Code

LinK: Linear Kernel for LiDAR-based 3D Perception

1 code implementation • CVPR 2023 • Tao Lu, Xiang Ding, Haisong Liu, Gangshan Wu, LiMin Wang

Extending the success of 2D Large Kernel to 3D perception is challenging due to: 1. the cubically-increasing overhead in processing 3D data; 2. the optimization difficulties from data scarcity and sparsity.

3D Object Detection 3D Semantic Segmentation +1

Paper
Code

STMixer: A One-Stage Sparse Action Detector

no code implementations • CVPR 2023 • Tao Wu, Mengqi Cao, Ziteng Gao, Gangshan Wu, LiMin Wang

STMixer is based on two core designs.

Paper
Add Code

CycleACR: Cycle Modeling of Actor-Context Relations for Video Action Detection

no code implementations • 28 Mar 2023 • Lei Chen, Zhan Tong, Yibing Song, Gangshan Wu, LiMin Wang

Existing studies model each actor and scene relation to improve action recognition.

Action Detection Action Recognition +2

Paper
Add Code

Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation

1 code implementation • CVPR 2023 • Guozhen Zhang, Yuhan Zhu, Haonan Wang, Youxin Chen, Gangshan Wu, LiMin Wang

In this paper, we propose a novel module to explicitly extract motion and appearance information via a unifying operation.

Ranked #1 on Video Frame Interpolation on MSU Video Frame Interpolation (PSNR metric)

Video Frame Interpolation

317

Paper
Code

CoMAE: Single Model Hybrid Pre-training on Small-Scale RGB-D Datasets

1 code implementation • 13 Feb 2023 • Jiange Yang, Sheng Guo, Gangshan Wu, LiMin Wang

Our CoMAE presents a curriculum learning strategy to unify the two popular self-supervised representation learning algorithms: contrastive learning and masked image modeling.

Contrastive Learning Representation Learning +1

Paper
Code

MixFormer: End-to-End Tracking with Iterative Mixed Attention

1 code implementation • 6 Feb 2023 • Yutao Cui, Cheng Jiang, Gangshan Wu, LiMin Wang

Our core design is to utilize the flexibility of attention operations, and propose a Mixed Attention Module (MAM) for simultaneous feature extraction and target information integration.

Ranked #1 on Visual Object Tracking on TrackingNet

Visual Object Tracking

425

Paper
Code

From Coarse to Fine: Hierarchical Pixel Integration for Lightweight Image Super-Resolution

1 code implementation • 30 Nov 2022 • Jie Liu, Chao Chen, Jie Tang, Gangshan Wu

In the fine area, we use an Intra-Patch Self-Attention (IPSA) module to model long-range pixel dependencies in a local patch, and then a $3\times3$ convolution is applied to process the finest details.

Image Super-Resolution

Paper
Code

NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results

2 code implementations • 11 May 2022 • Yawei Li, Kai Zhang, Radu Timofte, Luc van Gool, Fangyuan Kong, Mingxi Li, Songwei Liu, Zongcai Du, Ding Liu, Chenhui Zhou, Jingyi Chen, Qingrui Han, Zheyuan Li, Yingqi Liu, Xiangyu Chen, Haoming Cai, Yu Qiao, Chao Dong, Long Sun, Jinshan Pan, Yi Zhu, Zhikai Zong, Xiaoxiao Liu, Zheng Hui, Tao Yang, Peiran Ren, Xuansong Xie, Xian-Sheng Hua, Yanbo Wang, Xiaozhong Ji, Chuming Lin, Donghao Luo, Ying Tai, Chengjie Wang, Zhizhong Zhang, Yuan Xie, Shen Cheng, Ziwei Luo, Lei Yu, Zhihong Wen, Qi Wu1, Youwei Li, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Yuanfei Huang, Meiguang Jin, Hua Huang, Jing Liu, Xinjian Zhang, Yan Wang, Lingshun Long, Gen Li, Yuanfan Zhang, Zuowei Cao, Lei Sun, Panaetov Alexander, Yucong Wang, Minjie Cai, Li Wang, Lu Tian, Zheyuan Wang, Hongbing Ma, Jie Liu, Chao Chen, Yidong Cai, Jie Tang, Gangshan Wu, Weiran Wang, Shirui Huang, Honglei Lu, Huan Liu, Keyan Wang, Jun Chen, Shi Chen, Yuchun Miao, Zimo Huang, Lefei Zhang, Mustafa Ayazoğlu, Wei Xiong, Chengyi Xiong, Fei Wang, Hao Li, Ruimian Wen, Zhijing Yang, Wenbin Zou, Weixin Zheng, Tian Ye, Yuncheng Zhang, Xiangzhen Kong, Aditya Arora, Syed Waqas Zamir, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Dandan Gaoand Dengwen Zhouand Qian Ning, Jingzhu Tang, Han Huang, YuFei Wang, Zhangheng Peng, Haobo Li, Wenxue Guan, Shenghua Gong, Xin Li, Jun Liu, Wanjun Wang, Dengwen Zhou, Kun Zeng, Hanjiang Lin, Xinyu Chen, Jinsheng Fang

The aim was to design a network for single image super-resolution that achieved improvement of efficiency measured according to several metrics including runtime, parameters, FLOPs, activations, and memory consumption while at least maintaining the PSNR of 29. 00dB on DIV2K validation set.

Image Super-Resolution

117

Paper
Code

APP-Net: Auxiliary-point-based Push and Pull Operations for Efficient Point Cloud Classification

1 code implementation • 2 May 2022 • Tao Lu, Chunxu Liu, Youxin Chen, Gangshan Wu, LiMin Wang

In the existing work, each point in the cloud may inevitably be selected as the neighbors of multiple aggregation centers, as all centers will gather neighbor features from the whole point cloud independently.

Ranked #45 on 3D Point Cloud Classification on ScanObjectNN

3D Classification 3D Point Cloud Classification +1

Paper
Code

Fast and Memory-Efficient Network Towards Efficient Image Super-Resolution

1 code implementation • 18 Apr 2022 • Zongcai Du, Ding Liu, Jie Liu, Jie Tang, Gangshan Wu, Lean Fu

Besides, FMEN-S achieves the lowest memory consumption and the second shortest runtime in NTIRE 2022 challenge on efficient super-resolution.

Image Super-Resolution

Paper
Code

MixFormer: End-to-End Tracking with Iterative Mixed Attention

1 code implementation • CVPR 2022 • Yutao Cui, Cheng Jiang, LiMin Wang, Gangshan Wu

Our core design is to utilize the flexibility of attention operations, and propose a Mixed Attention Module (MAM) for simultaneous feature extraction and target information integration.

Ranked #7 on Visual Object Tracking on UAV123

Semi-Supervised Video Object Segmentation Visual Object Tracking

425

Paper
Code

Temporal Perceiver: A General Architecture for Arbitrary Boundary Detection

no code implementations • 1 Mar 2022 • Jing Tan, Yuhong Wang, Gangshan Wu, LiMin Wang

Instead, in this paper, we present Temporal Perceiver, a general architecture with Transformer, offering a unified solution to the detection of arbitrary generic boundaries, ranging from shot-level, event-level, to scene-level GBDs.

Avg Boundary Detection +1

Paper
Add Code

Efficient Single Image Super-Resolution Using Dual Path Connections with Multiple Scale Learning

1 code implementation • 31 Dec 2021 • Bin-Cheng Yang, Gangshan Wu

By introducing dual path connections inspired by Dual Path Networks into EMSRDPN, it uses residual connections and dense connections in an integrated way in most network layers.

Feature Correlation Image Super-Resolution

Paper
Code

AdaDM: Enabling Normalization for Image Super-Resolution

1 code implementation • 27 Nov 2021 • Jie Liu, Jie Tang, Gangshan Wu

We found that the standard deviation of the residual feature shrinks a lot after normalization layers, which causes the performance degradation in SR networks.

Image Super-Resolution

Paper
Code

A Closer Look at Few-Shot Video Classification: A New Baseline and Benchmark

1 code implementation • 24 Oct 2021 • Zhenxi Zhu, LiMin Wang, Sheng Guo, Gangshan Wu

In this paper, we aim to present an in-depth study on few-shot video classification by making three contributions.

Classification Meta-Learning +2

Paper
Code

Mutual Supervision for Dense Object Detection

no code implementations • ICCV 2021 • Ziteng Gao, LiMin Wang, Gangshan Wu

In this paper, we break the convention of the same training samples for these two heads in dense detectors and explore a novel supervisory paradigm, termed as Mutual Supervision (MuSu), to respectively and mutually assign training samples for the classification and regression head to ensure this consistency.

Classification Dense Object Detection +3

Paper
Add Code

Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding

2 code implementations • 10 Sep 2021 • Zhenzhi Wang, LiMin Wang, Tao Wu, TianHao Li, Gangshan Wu

Instead, from a perspective on temporal grounding as a metric-learning problem, we present a Mutual Matching Network (MMN), to directly model the similarity between language queries and video moments in a joint embedding space.

Ranked #3 on Temporal Sentence Grounding on Charades-STA

Metric Learning Representation Learning +2

Paper
Code

Self Supervision to Distillation for Long-Tailed Visual Recognition

1 code implementation • ICCV 2021 • TianHao Li, LiMin Wang, Gangshan Wu

In this paper, we show that soft label can serve as a powerful solution to incorporate label correlation into a multi-stage training scheme for long-tailed recognition.

Ranked #43 on Long-tail Learning on CIFAR-100-LT (ρ=100)

Long-tail Learning

Paper
Code

Target Adaptive Context Aggregation for Video Scene Graph Generation

1 code implementation • ICCV 2021 • Yao Teng, LiMin Wang, Zhifeng Li, Gangshan Wu

Specifically, we design an efficient method for frame-level VidSGG, termed as {\em Target Adaptive Context Aggregation Network} (TRACE), with a focus on capturing spatio-temporal context information for relation recognition.

Graph Generation Relation +2

Paper
Code

CGA-Net: Category Guided Aggregation for Point Cloud Semantic Segmentation

1 code implementation • CVPR 2021 • Tao Lu, LiMin Wang, Gangshan Wu

Previous point cloud semantic segmentation networks use the same process to aggregate features from neighbors of the same category and different categories.

Ranked #1 on Semantic Segmentation on SYNTHIA

Segmentation Semantic Segmentation

Paper
Code

SADRNet: Self-Aligned Dual Face Regression Networks for Robust 3D Dense Face Alignment and Reconstruction

1 code implementation • 6 Jun 2021 • Zeyu Ruan, Changqing Zou, Longhai Wu, Gangshan Wu, LiMin Wang

Three-dimensional face dense alignment and reconstruction in the wild is a challenging problem as partial facial information is commonly missing in occluded and large pose face images.

Ranked #1 on 3D Face Reconstruction on AFLW2000-3D

3D Face Alignment 3D Face Reconstruction +3

123

Paper
Code

Anchor-based Plain Net for Mobile Image Super-Resolution

3 code implementations • 20 May 2021 • Zongcai Du, Jie Liu, Jie Tang, Gangshan Wu

Along with the rapid development of real-world applications, higher requirements on the accuracy and efficiency of image super-resolution (SR) are brought forward.

Image Super-Resolution Quantization

278

Paper
Code

MultiSports: A Multi-Person Video Dataset of Spatio-Temporally Localized Sports Actions

1 code implementation • ICCV 2021 • Yixuan Li, Lei Chen, Runyu He, Zhenzhi Wang, Gangshan Wu, LiMin Wang

Spatio-temporal action detection is an important and challenging problem in video understanding.

Action Detection Action Localization +1

100

Paper
Code

MGSampler: An Explainable Sampling Strategy for Video Action Recognition

1 code implementation • ICCV 2021 • Yuan Zhi, Zhan Tong, LiMin Wang, Gangshan Wu

First, we present two different motion representations to enable us to efficiently distinguish the motion-salient frames from the background.

Action Recognition Temporal Action Localization

Paper
Code

Target Transformed Regression for Accurate Tracking

1 code implementation • 1 Apr 2021 • Yutao Cui, Cheng Jiang, LiMin Wang, Gangshan Wu

Accurate tracking is still a challenging task due to appearance variations, pose and view changes, and geometric deformations of target in videos.

Ranked #1 on Visual Object Tracking on VOT2019

regression Visual Object Tracking +1

Paper
Code

Relaxed Transformer Decoders for Direct Action Proposal Generation

2 code implementations • ICCV 2021 • Jing Tan, Jiaqi Tang, LiMin Wang, Gangshan Wu

Extensive experiments on THUMOS14 and ActivityNet-1. 3 benchmarks demonstrate the effectiveness of RTD-Net, on both tasks of temporal action proposal generation and temporal action detection.

Action Detection Temporal Action Proposal Generation +1

Paper
Code

Temporal Difference Networks for Action Recognition

no code implementations • 1 Jan 2021 • LiMin Wang, Bin Ji, Zhan Tong, Gangshan Wu

To mitigate this issue, this paper presents a new video architecture, termed as Temporal Difference Network (TDN), with a focus on capturing multi-scale temporal information for efficient action recognition.

Action Recognition In Videos

Paper
Add Code

TDN: Temporal Difference Networks for Efficient Action Recognition

1 code implementation • CVPR 2021 • LiMin Wang, Zhan Tong, Bin Ji, Gangshan Wu

Ranked #17 on Action Recognition on Something-Something V1

Action Classification Action Recognition In Videos

362

Paper
Code

Residual Feature Distillation Network for Lightweight Image Super-Resolution

2 code implementations • 24 Sep 2020 • Jie Liu, Jie Tang, Gangshan Wu

Thanks to FDC, we can rethink the information multi-distillation network (IMDN) and propose a lightweight and accurate SISR model called residual feature distillation network (RFDN).

Image Super-Resolution

345

Paper
Code

AIM 2020 Challenge on Efficient Super-Resolution: Methods and Results

3 code implementations • 15 Sep 2020 • Kai Zhang, Martin Danelljan, Yawei Li, Radu Timofte, Jie Liu, Jie Tang, Gangshan Wu, Yu Zhu, Xiangyu He, Wenjie Xu, Chenghua Li, Cong Leng, Jian Cheng, Guangyang Wu, Wenyi Wang, Xiaohong Liu, Hengyuan Zhao, Xiangtao Kong, Jingwen He, Yu Qiao, Chao Dong, Maitreya Suin, Kuldeep Purohit, A. N. Rajagopalan, Xiaochuan Li, Zhiqiang Lang, Jiangtao Nie, Wei Wei, Lei Zhang, Abdul Muqeet, Jiwon Hwang, Subin Yang, JungHeum Kang, Sung-Ho Bae, Yongwoo Kim, Geun-Woo Jeon, Jun-Ho Choi, Jun-Hyuk Kim, Jong-Seok Lee, Steven Marty, Eric Marty, Dongliang Xiong, Siang Chen, Lin Zha, Jiande Jiang, Xinbo Gao, Wen Lu, Haicheng Wang, Vineeth Bhaskara, Alex Levinshtein, Stavros Tsogkas, Allan Jepson, Xiangzhen Kong, Tongtong Zhao, Shanshan Zhao, Hrishikesh P. S, Densen Puthussery, Jiji C. V, Nan Nan, Shuai Liu, Jie Cai, Zibo Meng, Jiaming Ding, Chiu Man Ho, Xuehui Wang, Qiong Yan, Yuzhi Zhao, Long Chen, Jiangtao Zhang, Xiaotong Luo, Liang Chen, Yanyun Qu, Long Sun, Wenhao Wang, Zhenbing Liu, Rushi Lan, Rao Muhammad Umer, Christian Micheloni

This paper reviews the AIM 2020 challenge on efficient single image super-resolution with focus on the proposed solutions and results.

Image Super-Resolution

2,741

Paper
Code

Context-Aware RCNN: A Baseline for Action Detection in Videos

3 code implementations • ECCV 2020 • Jianchao Wu, Zhanghui Kuang, Li-Min Wang, Wayne Zhang, Gangshan Wu

In this work, we first empirically find the recognition accuracy is highly correlated with the bounding box size of an actor, and thus higher resolution of actors contributes to better performance.

Action Detection Action Recognition

Paper
Code

Fully Convolutional Online Tracking

2 code implementations • 15 Apr 2020 • Yutao Cui, Cheng Jiang, Li-Min Wang, Gangshan Wu

To tackle this issue, we present the fully convolutional online tracking framework, coined as FCOT, and focus on enabling online learning for both classification and regression branches by using a target filter based tracking paradigm.

Real-Time Visual Tracking regression

Paper
Code

Actions as Moving Points

2 code implementations • ECCV 2020 • Yixuan Li, Zixu Wang, Li-Min Wang, Gangshan Wu

The existing action tubelet detectors often depend on heuristic anchor design and placement, which might be computationally expensive and sub-optimal for precise localization.

Ranked #5 on Action Detection on UCF101-24

Action Detection Action Recognition

262

Paper
Code

Simple and Lightweight Human Pose Estimation

1 code implementation • 23 Nov 2019 • Zhe Zhang, Jie Tang, Gangshan Wu

Specifically, our LPN-50 can achieve 68. 7 in AP score on the COCO test-dev set, with only 2. 7M parameters and 1. 0 GFLOPs, while the inference speed is 17 FPS on an Intel i7-8700K CPU machine.

Keypoint Detection Novel Concepts

Paper
Code

LIP: Local Importance-based Pooling

1 code implementation • ICCV 2019 • Ziteng Gao, Li-Min Wang, Gangshan Wu

Spatial downsampling layers are favored in convolutional neural networks (CNNs) to downscale feature maps for larger receptive fields and less memory consumption.

Ranked #147 on Object Detection on COCO test-dev (using extra training data)

Image Classification Object Detection

213

Paper
Code

Dynamically Visual Disambiguation of Keyword-based Image Search

no code implementations • 27 May 2019 • Yazhou Yao, Zeren Sun, Fumin Shen, Li Liu, Li-Min Wang, Fan Zhu, Lizhong Ding, Gangshan Wu, Ling Shao

To address this issue, we present an adaptive multi-model framework that resolves polysemy by visual disambiguation.

General Classification Image Retrieval

Paper
Add Code

Translate-to-Recognize Networks for RGB-D Scene Recognition

1 code implementation • CVPR 2019 • Dapeng Du, Li-Min Wang, Huiling Wang, Kai Zhao, Gangshan Wu

Empirically, we verify that this new semi-supervised setting is able to further enhance the performance of recognition network.

Decoder Scene Recognition +1

Paper
Code

Learning Actor Relation Graphs for Group Activity Recognition

2 code implementations • CVPR 2019 • Jianchao Wu, Li-Min Wang, Li Wang, Jie Guo, Gangshan Wu

To this end, we propose to build a flexible and efficient Actor Relation Graph (ARG) to simultaneously capture the appearance and position relation between actors.

Ranked #3 on Group Activity Recognition on Collective Activity

Action Recognition Group Activity Recognition +1

169

Paper
Code

StereoSnakes: Contour Based Consistent Object Extraction For Stereo Images

no code implementations • ICCV 2015 • Ran Ju, Tongwei Ren, Gangshan Wu

We also demonstrate in a few applications how our method can be used as a basic tool for stereo image editing.

Image Segmentation Object +4

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.