Search Results for author: Yandong Guo

Found 68 papers, 24 papers with code

Debiased Novel Category Discovering and Localization

no code implementations • 29 Feb 2024 • Juexiao Feng, Yuhong Yang, Yanchun Xie, Yaqian Li, Yandong Guo, Yuchen Guo, Yuwei He, Liuyu Xiang, Guiguang Ding

In recent years, object detection in deep learning has experienced rapid development.

Contrastive Learning Novel Class Discovery +4

Paper
Add Code

Open-Vocabulary Segmentation with Unpaired Mask-Text Supervision

1 code implementation • 14 Feb 2024 • Zhaoqing Wang, Xiaobo Xia, Ziye Chen, Xiao He, Yandong Guo, Mingming Gong, Tongliang Liu

With this unpaired mask-text supervision, we propose a new weakly-supervised open-vocabulary segmentation framework (Uni-OVSeg) that leverages confident pairs of mask predictions and entities in text descriptions.

Language Modelling

Paper
Code

LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding

no code implementations • 21 Dec 2023 • Senqiao Yang, Jiaming Liu, Ray Zhang, Mingjie Pan, Zoey Guo, Xiaoqi Li, Zehui Chen, Peng Gao, Yandong Guo, Shanghang Zhang

In this paper, we introduce LiDAR-LLM, which takes raw LiDAR data as input and harnesses the remarkable reasoning capabilities of LLMs to gain a comprehensive understanding of outdoor 3D scenes.

Instruction Following Language Modelling +1

Paper
Add Code

Continual-MAE: Adaptive Distribution Masked Autoencoders for Continual Test-Time Adaptation

no code implementations • 19 Dec 2023 • Jiaming Liu, ran Xu, Senqiao Yang, Renrui Zhang, Qizhe Zhang, Zehui Chen, Yandong Guo, Shanghang Zhang

To tackle these issues, we propose a continual self-supervised method, Adaptive Distribution Masked Autoencoders (ADMA), which enhances the extraction of target domain knowledge while mitigating the accumulation of distribution shifts.

Decoder Self-Supervised Learning +1

Paper
Add Code

Seeing through the Mask: Multi-task Generative Mask Decoupling Face Recognition

no code implementations • 20 Nov 2023 • Zhaohui Wang, Sufang Zhang, Jianteng Peng, Xinyi Wang, Yandong Guo

Therefore, this paper proposes a Multi-task gEnerative mask dEcoupling face Recognition (MEER) network to jointly handle these two tasks, which can learn occlusionirrelevant and identity-related representation while achieving unmasked face synthesis.

Face Generation Face Recognition

Paper
Add Code

NTO3D: Neural Target Object 3D Reconstruction with Segment Anything

1 code implementation • 22 Sep 2023 • Xiaobao Wei, Renrui Zhang, Jiarui Wu, Jiaming Liu, Ming Lu, Yandong Guo, Shanghang Zhang

NTO3D lifts the 2D masks and features of SAM into the 3D neural field for high-quality neural target object 3D reconstruction.

3D Object Reconstruction 3D Reconstruction +1

Paper
Code

AdvFAS: A robust face anti-spoofing framework against adversarial examples

no code implementations • 4 Aug 2023 • Jiawei Chen, Xiao Yang, Heng Yin, Mingzhi Ma, Bihui Chen, Jianteng Peng, Yandong Guo, Zhaoxia Yin, Hang Su

Ensuring the reliability of face recognition systems against presentation attacks necessitates the deployment of face anti-spoofing techniques.

Adversarial Defense Face Anti-Spoofing +1

Paper
Add Code

ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation

1 code implementation • 7 Jun 2023 • Jiaming Liu, Senqiao Yang, Peidong Jia, Renrui Zhang, Ming Lu, Yandong Guo, Wei Xue, Shanghang Zhang

Note that, our method can be regarded as a novel transfer paradigm for large-scale models, delivering promising results in adaptation to continually changing distributions.

Test-time Adaptation

Paper
Code

Recognize Anything: A Strong Image Tagging Model

2 code implementations • 6 Jun 2023 • Youcai Zhang, Xinyu Huang, Jinyu Ma, Zhaoyang Li, Zhaochuan Luo, Yanchun Xie, Yuzhuo Qin, Tong Luo, Yaqian Li, Shilong Liu, Yandong Guo, Lei Zhang

We are releasing the RAM at \url{https://recognize-anything. github. io/} to foster the advancements of large models in computer vision.

Semantic Parsing

2,463

Paper
Code

The 3rd Anti-UAV Workshop & Challenge: Methods and Results

no code implementations • 12 May 2023 • Jian Zhao, Jianan Li, Lei Jin, Jiaming Chu, Zhihao Zhang, Jun Wang, Jiangqiang Xia, Kai Wang, Yang Liu, Sadaf Gulshad, Jiaojiao Zhao, Tianyang Xu, XueFeng Zhu, Shihan Liu, Zheng Zhu, Guibo Zhu, Zechao Li, Zheng Wang, Baigui Sun, Yandong Guo, Shin ichi Satoh, Junliang Xing, Jane Shen Shengmei

Second, we set up two tracks for the first time, i. e., Anti-UAV Tracking and Anti-UAV Detection & Tracking.

Object Tracking

Paper
Add Code

ContrastMotion: Self-supervised Scene Motion Learning for Large-Scale LiDAR Point Clouds

no code implementations • 25 Apr 2023 • Xiangze Jia, Hui Zhou, Xinge Zhu, Yandong Guo, Ji Zhang, Yuexin Ma

In this paper, we propose a novel self-supervised motion estimator for LiDAR-based autonomous driving via BEV representation.

Autonomous Driving Contrastive Learning +2

Paper
Add Code

A Comprehensive Comparison of Projections in Omnidirectional Super-Resolution

no code implementations • 13 Apr 2023 • Huicheng Pi, Senmao Tian, Ming Lu, Jiaming Liu, Yandong Guo, Shunli Zhang

In these works, omnidirectional frames are projected from the 3D sphere to a 2D plane by Equi-Rectangular Projection (ERP).

ERP Super-Resolution

Paper
Add Code

CABM: Content-Aware Bit Mapping for Single Image Super-Resolution Network with Large Input

1 code implementation • CVPR 2023 • Senmao Tian, Ming Lu, Jiaming Liu, Yandong Guo, Yurong Chen, Shunli Zhang

Therefore, we design a strategy to build an Edge-to-Bit lookup table that maps the edge score of a patch to the bit of each layer during inference.

2k 4k +3

Paper
Code

SGL: Structure Guidance Learning for Camera Localization

no code implementations • 12 Apr 2023 • Xudong Zhang, Shuang Gao, Xiaohu Nan, Haikuan Ning, Yuchen Yang, Yishan Ping, Jixiang Wan, Shuzhou Dong, Jijunnan Li, Yandong Guo

Camera localization is a classical computer vision task that serves various Artificial Intelligence and Robotics applications.

Camera Localization Visual Localization

Paper
Add Code

CloSET: Modeling Clothed Humans on Continuous Surface with Explicit Template Decomposition

no code implementations • CVPR 2023 • Hongwen Zhang, Siyou Lin, Ruizhi Shao, Yuxiang Zhang, Zerong Zheng, Han Huang, Yandong Guo, Yebin Liu

In this way, the clothing deformations are disentangled such that the pose-dependent wrinkles can be better learned and applied to unseen poses.

Paper
Add Code

Box-Level Active Detection

1 code implementation • CVPR 2023 • Mengyao Lyu, Jundong Zhou, Hui Chen, YiJie Huang, Dongdong Yu, Yaqian Li, Yandong Guo, Yuchen Guo, Liuyu Xiang, Guiguang Ding

Active learning selects informative samples for annotation within budget, which has proven efficient recently on object detection.

Active Learning object-detection +1

Paper
Code

Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learning

1 code implementation • CVPR 2023 • Weixuan Sun, Jiayi Zhang, Jianyuan Wang, Zheyuan Liu, Yiran Zhong, Tianpeng Feng, Yandong Guo, Yanhao Zhang, Nick Barnes

Based on this observation, we propose a new learning strategy named False Negative Aware Contrastive (FNAC) to mitigate the problem of misleading the training with such false negative samples.

Contrastive Learning

Paper
Code

PiMAE: Point Cloud and Image Interactive Masked Autoencoders for 3D Object Detection

1 code implementation • CVPR 2023 • Anthony Chen, Kevin Zhang, Renrui Zhang, Zihan Wang, Yuheng Lu, Yandong Guo, Shanghang Zhang

Masked Autoencoders learn strong visual representations and achieve state-of-the-art results in several independent modalities, yet very few works have addressed their capabilities in multi-modality settings.

3D Object Detection Decoder +3

108

Paper
Code

Tag2Text: Guiding Vision-Language Model via Image Tagging

2 code implementations • 10 Mar 2023 • Xinyu Huang, Youcai Zhang, Jinyu Ma, Weiwei Tian, Rui Feng, Yuejie Zhang, Yaqian Li, Yandong Guo, Lei Zhang

This paper presents Tag2Text, a vision language pre-training (VLP) framework, which introduces image tagging into vision-language models to guide the learning of visual-linguistic features.

Language Modelling TAG

2,463

Paper
Code

Neural Reconstruction of Relightable Human Model from Monocular Video

no code implementations • ICCV 2023 • Wenzhang Sun, Yunlong Che, Han Huang, Yandong Guo

In this paper, we introduce a novel self-supervised framework that takes a monocular video of a moving human as input and generates a 3D neural representation capable of being rendered with novel poses under arbitrary lighting conditions.

Paper
Add Code

A Survey of Face Recognition

no code implementations • 26 Dec 2022 • Xinyi Wang, Jianteng Peng, Sufang Zhang, Bihui Chen, Yi Wang, Yandong Guo

Recent years witnessed the breakthrough of face recognition with deep convolutional neural networks.

Face Recognition

Paper
Add Code

BEV-SAN: Accurate BEV 3D Object Detection via Slice Attention Networks

no code implementations • CVPR 2023 • Xiaowei Chi, Jiaming Liu, Ming Lu, Rongyu Zhang, Zhaoqing Wang, Yandong Guo, Shanghang Zhang

In order to find them, we further propose a LiDAR-guided sampling strategy to leverage the statistical distribution of LiDAR to determine the heights of local slices.

3D Object Detection Autonomous Driving +1

Paper
Add Code

BEV-LGKD: A Unified LiDAR-Guided Knowledge Distillation Framework for BEV 3D Object Detection

1 code implementation • 1 Dec 2022 • Jianing Li, Ming Lu, Jiaming Liu, Yandong Guo, Li Du, Shanghang Zhang

In this paper, we propose a unified framework named BEV-LGKD to transfer the knowledge in the teacher-student manner.

3D Object Detection Autonomous Driving +4

Paper
Code

BEVUDA: Multi-geometric Space Alignments for Domain Adaptive BEV 3D Object Detection

no code implementations • 30 Nov 2022 • Jiaming Liu, Rongyu Zhang, Xiaoqi Li, Xiaowei Chi, Zehui Chen, Ming Lu, Yandong Guo, Shanghang Zhang

In this paper, we propose a Multi-space Alignment Teacher-Student (MATS) framework to ease the domain shift accumulation, which consists of a Depth-Aware Teacher (DAT) and a Geometric-space Aligned Student (GAS) model.

3D Object Detection Autonomous Driving +4

Paper
Add Code

A Real-Time Fusion Framework for Long-term Visual Localization

no code implementations • 18 Oct 2022 • Yuchen Yang, Xudong Zhang, Shuang Gao, Jixiang Wan, Yishan Ping, Yuyue Liu, Jijunnan Li, Yandong Guo

In this paper, we present an efficient client-server visual localization architecture that fuses global and local pose estimations to realize promising precision and efficiency.

Visual Localization

Paper
Add Code

SoccerNet 2022 Challenges Results

7 code implementations • 5 Oct 2022 • Silvio Giancola, Anthony Cioppa, Adrien Deliège, Floriane Magera, Vladimir Somers, Le Kang, Xin Zhou, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdulrahman Darwish, Adrien Maglo, Albert Clapés, Andreas Luyts, Andrei Boiarov, Artur Xarles, Astrid Orcesi, Avijit Shah, Baoyu Fan, Bharath Comandur, Chen Chen, Chen Zhang, Chen Zhao, Chengzhi Lin, Cheuk-Yiu Chan, Chun Chuen Hui, Dengjie Li, Fan Yang, Fan Liang, Fang Da, Feng Yan, Fufu Yu, Guanshuo Wang, H. Anthony Chan, He Zhu, Hongwei Kan, Jiaming Chu, Jianming Hu, Jianyang Gu, Jin Chen, João V. B. Soares, Jonas Theiner, Jorge De Corte, José Henrique Brito, Jun Zhang, Junjie Li, Junwei Liang, Leqi Shen, Lin Ma, Lingchi Chen, Miguel Santos Marques, Mike Azatov, Nikita Kasatkin, Ning Wang, Qiong Jia, Quoc Cuong Pham, Ralph Ewerth, Ran Song, RenGang Li, Rikke Gade, Ruben Debien, Runze Zhang, Sangrok Lee, Sergio Escalera, Shan Jiang, Shigeyuki Odashima, Shimin Chen, Shoichi Masui, Shouhong Ding, Sin-wai Chan, Siyu Chen, Tallal El-Shabrawy, Tao He, Thomas B. Moeslund, Wan-Chi Siu, Wei zhang, Wei Li, Xiangwei Wang, Xiao Tan, Xiaochuan Li, Xiaolin Wei, Xiaoqing Ye, Xing Liu, Xinying Wang, Yandong Guo, YaQian Zhao, Yi Yu, YingYing Li, Yue He, Yujie Zhong, Zhenhua Guo, Zhiheng Li

The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team.

Action Spotting Camera Calibration +3

Paper
Code

Efficient Meta-Tuning for Content-aware Neural Video Delivery

1 code implementation • 20 Jul 2022 • Xiaoqi Li, Jiaming Liu, Shizun Wang, Cheng Lyu, Ming Lu, Yurong Chen, Anbang Yao, Yandong Guo, Shanghang Zhang

Our method significantly reduces the computational cost and achieves even better performance, paving the way for applying neural video delivery techniques to practical applications.

Super-Resolution

Paper
Code

CrossHuman: Learning Cross-Guidance from Multi-Frame Images for Human Reconstruction

no code implementations • 20 Jul 2022 • Liliang Chen, Jiaqi Li, Han Huang, Yandong Guo

We propose CrossHuman, a novel method that learns cross-guidance from parametric human model and multi-frame RGB images to achieve high-quality 3D human reconstruction.

3D Human Reconstruction

Paper
Add Code

IDEA: Increasing Text Diversity via Online Multi-Label Recognition for Vision-Language Pre-training

1 code implementation • 12 Jul 2022 • Xinyu Huang, Youcai Zhang, Ying Cheng, Weiwei Tian, RuiWei Zhao, Rui Feng, Yuejie Zhang, Yaqian Li, Yandong Guo, Xiaobo Zhang

However, the image-text pairs co-occurrent on the Internet typically lack explicit alignment information, which is suboptimal for VLP.

Multi-Label Learning Object +1

Paper
Code

Mixed Sample Augmentation for Online Distillation

no code implementations • 24 Jun 2022 • Yiqing Shen, Liwu Xu, Yuzhe Yang, Yaqian Li, Yandong Guo

Mixed Sample Regularization (MSR), such as MixUp or CutMix, is a powerful data augmentation strategy to generalize convolutional neural networks.

Data Augmentation Knowledge Distillation

Paper
Add Code

BANet: Motion Forecasting with Boundary Aware Network

no code implementations • 16 Jun 2022 • Chen Zhang, Honglin Sun, Chen Chen, Yandong Guo

We propose a motion forecasting model called BANet, which means Boundary-Aware Network, and it is a variant of LaneGCN.

Motion Forecasting

Paper
Add Code

Situational Perception Guided Image Matting

no code implementations • 20 Apr 2022 • Bo Xu, Jiake Xie, Han Huang, Ziwen Li, Cheng Lu, Yong Tang, Yandong Guo

In this paper, we propose a Situational Perception Guided Image Matting (SPG-IM) method that mitigates subjective bias of matting annotations and captures sufficient situational perception information for better global saliency distilled from the visual-to-textual task.

Image Matting Object

Paper
Add Code

SEAL: A Large-scale Video Dataset of Multi-grained Spatio-temporally Action Localization

no code implementations • 6 Apr 2022 • Shimin Chen, Wei Li, Chen Chen, Jianyang Gu, Jiaming Chu, Xunqiang Tao, Yandong Guo

SEAL consists of two kinds of annotations, SEAL Tubes and SEAL Clips.

Action Recognition Spatio-Temporal Action Localization +2

Paper
Add Code

Faster-TAD: Towards Temporal Action Detection with Proposal Generation and Classification in a Unified Network

no code implementations • 6 Apr 2022 • Shimin Chen, Chen Chen, Wei Li, Xunqiang Tao, Yandong Guo

In this paper, we propose a unified network for TAD, termed Faster-TAD, by re-purposing a Faster-RCNN like architecture.

Action Detection Action Spotting

Paper
Add Code

Personalized Image Aesthetics Assessment with Rich Attributes

no code implementations • CVPR 2022 • Yuzhe Yang, Liwu Xu, Leida Li, Nan Qie, Yaqian Li, Peng Zhang, Yandong Guo

To solve the dilemma, we conduct so far, the most comprehensive subjective study of personalized image aesthetics and introduce a new Personalized image Aesthetics database with Rich Attributes (PARA), which consists of 31, 220 images with annotations by 438 subjects.

Paper
Add Code

Self-Distillation from the Last Mini-Batch for Consistency Regularization

1 code implementation • CVPR 2022 • Yiqing Shen, Liwu Xu, Yuzhe Yang, Yaqian Li, Yandong Guo

Afterwards, the former half mini-batch distills on-the-fly soft targets generated in the previous iteration.

Knowledge Distillation

Paper
Code

Structured Local Radiance Fields for Human Avatar Modeling

no code implementations • CVPR 2022 • Zerong Zheng, Han Huang, Tao Yu, Hongwen Zhang, Yandong Guo, Yebin Liu

These local radiance fields not only leverage the flexibility of implicit representation in shape and appearance modeling, but also factorize cloth deformations into skeleton motions, node residual translations and the dynamic detail variations inside each individual radiance field.

Paper
Add Code

Adaptive Patch Exiting for Scalable Single Image Super-Resolution

1 code implementation • 22 Mar 2022 • Shizun Wang, Jiaming Liu, Kaixin Chen, Xiaoqi Li, Ming Lu, Yandong Guo

Once the incremental capacity is below the threshold, the patch can exit at the specific layer.

Image Super-Resolution

Paper
Code

Semantic Distillation Guided Salient Object Detection

no code implementations • 8 Mar 2022 • Bo Xu, Guanze Liu, Han Huang, Cheng Lu, Yandong Guo

Most existing CNN-based salient object detection methods can identify local segmentation details like hair and animal fur, but often misinterpret the real saliency due to the lack of global contextual information caused by the subjectiveness of the SOD task and the locality of convolution layers.

Image Captioning Object +3

Paper
Add Code

Single-Stage Is Enough: Multi-Person Absolute 3D Pose Estimation

no code implementations • CVPR 2022 • Lei Jin, Chenyang Xu, Xiaojuan Wang, Yabo Xiao, Yandong Guo, Xuecheng Nie, Jian Zhao

The existing multi-person absolute 3D pose estimation methods are mainly based on two-stage paradigm, i. e., top-down or bottom-up, leading to redundant pipelines with high computation cost.

3D Pose Estimation Depth Estimation +1

Paper
Add Code

Simple and Robust Loss Design for Multi-Label Learning with Missing Labels

2 code implementations • 13 Dec 2021 • Youcai Zhang, Yuhao Cheng, Xinyu Huang, Fei Wen, Rui Feng, Yaqian Li, Yandong Guo

Multi-label learning in the presence of missing labels (MLML) is a challenging problem.

Missing Labels Multi-Label Image Classification

Paper
Code

CRIS: CLIP-Driven Referring Image Segmentation

1 code implementation • CVPR 2022 • Zhaoqing Wang, Yu Lu, Qiang Li, Xunqiang Tao, Yandong Guo, Mingming Gong, Tongliang Liu

In addition, we present text-to-pixel contrastive learning to explicitly enforce the text feature similar to the related pixel-level features and dissimilar to the irrelevances.

Ranked #4 on Generalized Referring Expression Segmentation on gRefCOCO

Contrastive Learning Decoder +4

228

Paper
Code

Deep Two-Stream Video Inference for Human Body Pose and Shape Estimation

no code implementations • 22 Oct 2021 • Ziwen Li, Bo Xu, Han Huang, Cheng Lu, Yandong Guo

In this paper, we propose a new framework Deep Two-Stream Video Inference for Human Body Pose and Shape Estimation (DTS-VIBE), to generate 3D human pose and mesh from RGB videos.

Ranked #43 on 3D Human Pose Estimation on 3DPW

3D Human Pose Estimation Optical Flow Estimation

Paper
Add Code

Pose Refinement with Joint Optimization of Visual Points and Lines

no code implementations • 8 Oct 2021 • Shuang Gao, Jixiang Wan, Yishan Ping, Xudong Zhang, Shuzhou Dong, Yuchen Yang, Haikuan Ning, Jijunnan Li, Yandong Guo

High-precision camera re-localization technology in a pre-established 3D environment map is the basis for many tasks, such as Augmented Reality, Robotics and Autonomous Driving.

Autonomous Driving

Paper
Add Code

Virtual Multi-Modality Self-Supervised Foreground Matting for Human-Object Interaction

1 code implementation • ICCV 2021 • Bo Xu, Han Huang, Cheng Lu, Ziwen Li, Yandong Guo

In this paper, we propose a Virtual Multi-modality Foreground Matting (VMFM) method to learn human-object interactive foreground (human and objects interacted with him or her) from a raw RGB image.

Decoder Human-Object Interaction Detection +1

Paper
Code

Towards Communication-Efficient and Privacy-Preserving Federated Representation Learning

no code implementations • 29 Sep 2021 • Haizhou Shi, Youcai Zhang, Zijin Shen, Siliang Tang, Yaqian Li, Yandong Guo, Yueting Zhuang

This paper investigates the feasibility of federated representation learning under the constraints of communication cost and privacy protection.

Contrastive Learning Federated Learning +2

Paper
Add Code

Improving the Robustness of Adversarial Attacks Using an Affine-Invariant Gradient Estimator

no code implementations • 13 Sep 2021 • Wenzhao Xiang, Hang Su, Chang Liu, Yandong Guo, Shibao Zheng

As designers of artificial intelligence try to outwit hackers, both sides continue to hone in on AI's inherent vulnerabilities.

Adversarial Attack

Paper
Add Code

The 2nd Anti-UAV Workshop & Challenge: Methods and Results

no code implementations • 23 Aug 2021 • Jian Zhao, Gang Wang, Jianan Li, Lei Jin, Nana Fan, Min Wang, Xiaojuan Wang, Ting Yong, Yafeng Deng, Yandong Guo, Shiming Ge, Guodong Guo

The 2nd Anti-UAV Workshop \& Challenge aims to encourage research in developing novel and accurate methods for multi-scale object tracking.

Object Tracking

Paper
Add Code

Retrieval and Localization with Observation Constraints

no code implementations • 19 Aug 2021 • Yuhao Zhou, Huanhuan Fan, Shuang Gao, Yuchen Yang, Xudong Zhang, Jijunnan Li, Yandong Guo

The localization pipeline is designed as a coarse-to-fine paradigm.

Autonomous Driving Image Retrieval +2

Paper
Add Code

On the Efficacy of Small Self-Supervised Contrastive Models without Distillation Signals

no code implementations • 30 Jul 2021 • Haizhou Shi, Youcai Zhang, Siliang Tang, Wenjie Zhu, Yaqian Li, Yandong Guo, Yueting Zhuang

It is a consensus that small models perform quite poorly under the paradigm of self-supervised contrastive learning.

Clustering Contrastive Learning +1

Paper
Add Code

View-Guided Point Cloud Completion

no code implementations • CVPR 2021 • Xuancheng Zhang, Yutong Feng, Siqi Li, Changqing Zou, Hai Wan, Xibin Zhao, Yandong Guo, Yue Gao

This paper presents a view-guided solution for the task of point cloud completion.

Ranked #3 on Point Cloud Completion on ShapeNet-ViPC

Point Cloud Completion

Paper
Add Code

Generator Pyramid for High-Resolution Image Inpainting

no code implementations • 4 Dec 2020 • Leilei Cao, Tong Yang, Yixu Wang, Bo Yan, Yandong Guo

Thus, our model consists of a pyramid of fully convolutional GANs, wherein the content GAN is responsible for completing contents in the lowest-resolution masked image, and each texture GAN is responsible for synthesizing textures in a higher-resolution image.

Image Inpainting Texture Synthesis +1

Paper
Add Code

Perceptual Extreme Super Resolution Network with Receptive Field Block

1 code implementation • 26 May 2020 • Taizhang Shang, Qiuju Dai, Shengchen Zhu, Tong Yang, Yandong Guo

Third, we alternately use different upsampling methods in the upsampling stage to reduce the high computation complexity and still remain satisfactory performance.

Ranked #1 on Image Super-Resolution on DIV8K test - 16x upscaling

Image Super-Resolution object-detection +1

Paper
Code

Visual Localization Using Semantic Segmentation and Depth Prediction

no code implementations • 25 May 2020 • Huanhuan Fan, Yuhao Zhou, Ang Li, Shuang Gao, Jijunnan Li, Yandong Guo

In this paper, we propose a monocular visual localization pipeline leveraging semantic and depth cues.

Clustering Depth Estimation +5

Paper
Add Code

Discriminative Multi-modality Speech Recognition

2 code implementations • CVPR 2020 • Bo Xu, Cheng Lu, Yandong Guo, Jacob Wang

Vision is often used as a complementary modality for audio speech recognition (ASR), especially in the noisy environment where performance of solo audio modality significantly deteriorates.

Ranked #6 on Audio-Visual Speech Recognition on LRS3-TED (using extra training data)

Audio-Visual Speech Recognition Lipreading +2

Paper
Code

NTIRE 2020 Challenge on Perceptual Extreme Super-Resolution: Methods and Results

no code implementations • 3 May 2020 • Kai Zhang, Shuhang Gu, Radu Timofte, Taizhang Shang, Qiuju Dai, Shengchen Zhu, Tong Yang, Yandong Guo, Younghyun Jo, Sejong Yang, Seon Joo Kim, Lin Zha, Jiande Jiang, Xinbo Gao, Wen Lu, Jing Liu, Kwangjin Yoon, Taegyun Jeon, Kazutoshi Akita, Takeru Ooba, Norimichi Ukita, Zhipeng Luo, Yuehan Yao, Zhenyu Xu, Dongliang He, Wenhao Wu, Yukang Ding, Chao Li, Fu Li, Shilei Wen, Jianwei Li, Fuzhi Yang, Huan Yang, Jianlong Fu, Byung-Hoon Kim, JaeHyun Baek, Jong Chul Ye, Yuchen Fan, Thomas S. Huang, Junyeop Lee, Bokyeung Lee, Jungki Min, Gwantae Kim, Kanghyu Lee, Jaihyun Park, Mykola Mykhailych, Haoyu Zhong, Yukai Shi, Xiaojun Yang, Zhijing Yang, Liang Lin, Tongtong Zhao, Jinjia Peng, Huibing Wang, Zhi Jin, Jiahao Wu, Yifu Chen, Chenming Shang, Huanrong Zhang, Jeongki Min, Hrishikesh P. S, Densen Puthussery, Jiji C. V

This paper reviews the NTIRE 2020 challenge on perceptual extreme super-resolution with focus on proposed solutions and results.

Image Super-Resolution

Paper
Add Code

Learning to Detect Head Movement in Unconstrained Remote Gaze Estimation in the Wild

no code implementations • 7 Apr 2020 • Zhecan Wang, Jian Zhao, Cheng Lu, Han Huang, Fan Yang, Lianji Li, Yandong Guo

To better demonstrate the advantage of our methods, we further propose a new benchmark dataset with the most rich distribution of head-gaze combination reflecting real-world scenarios.

Gaze Estimation

Paper
Add Code

To See in the Dark: N2DGAN for Background Modeling in Nighttime Scene

no code implementations • 12 Dec 2019 • Zhenfeng Zhu, Yingying Meng, Deqiang Kong, Xingxing Zhang, Yandong Guo, Yao Zhao

Due to the deteriorated conditions of \mbox{illumination} lack and uneven lighting, nighttime images have lower contrast and higher noise than their daytime counterparts of the same scene, which limits seriously the performances of conventional background modeling methods.

Paper
Add Code

Dually Supervised Feature Pyramid for Object Detection and Segmentation

1 code implementation • 8 Dec 2019 • Fan Yang, Cheng Lu, Yandong Guo, Longin Jan Latecki, Haibin Ling

Feature pyramid architecture has been broadly adopted in object detection and segmentation to deal with multi-scale problem.

Object object-detection +2

Paper
Code

Generative One-Shot Face Recognition

no code implementations • 28 Sep 2019 • Zhengming Ding, Yandong Guo, Lei Zhang, Yun Fu

Specifically, we target at building a more effective general face classifier for both normal persons and one-shot persons.

Face Recognition One-Shot Learning +1

Paper
Add Code

Edge Heuristic GAN for Non-uniform Blind Deblurring

no code implementations • 11 Jul 2019 • Shuai Zheng, Zhenfeng Zhu, Jian Cheng, Yandong Guo, Yao Zhao

Non-uniform blur, mainly caused by camera shake and motions of multiple objects, is one of the most common causes of image quality degradation.

Deblurring Generative Adversarial Network

Paper
Add Code

Large Scale Incremental Learning

4 code implementations • CVPR 2019 • Yue Wu, Yinpeng Chen, Lijuan Wang, Yuancheng Ye, Zicheng Liu, Yandong Guo, Yun Fu

We believe this is because of the combination of two factors: (a) the data imbalance between the old and new classes, and (b) the increasing number of visually similar classes.

Ranked #2 on Incremental Learning on CIFAR-100 - 50 classes + 50 steps of 1 class

Class Incremental Learning Incremental Learning

1,686

Paper
Code

Learning to Count Objects with Few Exemplar Annotations

no code implementations • 20 May 2019 • Jianfeng Wang, Rong Xiao, Yandong Guo, Lei Zhang

In this paper, we study the problem of object counting with incomplete annotations.

Object Object Counting +2

Paper
Add Code

Revisit Multinomial Logistic Regression in Deep Learning: Data Dependent Model Initialization for Image Recognition

no code implementations • 17 Sep 2018 • Bowen Cheng, Rong Xiao, Yandong Guo, Yuxiao Hu, Jian-Feng Wang, Lei Zhang

We study in this paper how to initialize the parameters of multinomial logistic regression (a fully connected layer followed with softmax and cross entropy loss), which is widely used in deep neural network (DNN) models for classification problems.

General Classification Image Classification +4

Paper
Add Code

Incremental Classifier Learning with Generative Adversarial Networks

no code implementations • 2 Feb 2018 • Yue Wu, Yinpeng Chen, Lijuan Wang, Yuancheng Ye, Zicheng Liu, Yandong Guo, Zhengyou Zhang, Yun Fu

To address these problems, we propose (a) a new loss function to combine the cross-entropy loss and distillation loss, (b) a simple way to estimate and remove the unbalance between the old and new classes , and (c) using Generative Adversarial Networks (GANs) to generate historical data and select representative exemplars during generation.

General Classification

Paper
Add Code

One-shot Face Recognition by Promoting Underrepresented Classes

1 code implementation • 18 Jul 2017 • Yandong Guo, Lei Zhang

First, we build a face feature extraction model, and improve its performance, especially for the persons with very limited training samples, by introducing a regularizer to the cross entropy loss for the multi-nomial logistic regression (MLR) learning.

Face Identification Face Recognition +1

Paper
Code

Model-based Iterative Restoration for Binary Document Image Compression with Dictionary Learning

no code implementations • CVPR 2017 • Yandong Guo, Cheng Lu, Jan P. Allebach, Charles A. Bouman

Experimental results with a variety of document images demonstrate that our method improves the image quality compared with the observed image, and simultaneously improves the compression ratio.

Dictionary Learning Image Compression

Paper
Add Code

MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition

11 code implementations • 27 Jul 2016 • Yandong Guo, Lei Zhang, Yuxiao Hu, Xiaodong He, Jianfeng Gao

In this paper, we design a benchmark task and provide the associated datasets for recognizing face images and link them to corresponding entity keys in a knowledge base.

Face Recognition Image Captioning

21,452

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.