no code implementations • ECCV 2020 • Xiaobing Zhang, Shijian Lu, Haigang Gong, Zhipeng Luo, Ming Liu
Online knowledge distillation, which jointly learns teacher and student models or an ensemble of student models simultaneously and collaboratively, has attracted increasing interest recently.
no code implementations • ECCV 2020 • Siyuan Yang, Jun Liu, Shijian Lu, Meng Hwa Er, Alex C. Kot
The proposed network exploits joint-aware features that are crucial for both tasks, with which gesture recognition and 3D hand pose estimation boost each other to learn highly discriminative features and models.
no code implementations • 13 May 2024 • Xueying Jiang, Sheng Jin, Xiaoqin Zhang, Ling Shao, Shijian Lu
With the proposed object occlusion and completion, MonoMAE learns enriched 3D representations that achieve superior monocular 3D detection performance qualitatively and quantitatively for both occluded and non-occluded objects.
1 code implementation • 2 May 2024 • Yi Yu, YuFei Wang, Song Xia, Wenhan Yang, Shijian Lu, Yap-Peng Tan, Alex C. Kot
Based on this network, a two-stage purification approach is naturally developed.
no code implementations • 19 Apr 2024 • Xinlong Ji, Fangneng Zhan, Shijian Lu, Shi-Sheng Huang, Hua Huang
However, the method of generating illumination maps has poor generalization performance and parametric models such as Spherical Harmonic (SH) and Spherical Gaussian (SG) fall short in capturing high-frequency or low-frequency components.
no code implementations • 27 Mar 2024 • Adilbek Karmanov, Dayan Guan, Shijian Lu, Abdulmotaleb El Saddik, Eric Xing
TDA works with a lightweight key-value cache that maintains a dynamic queue with few-shot pseudo labels as values and the corresponding test-sample features as keys.
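A minimal sketch of such a key-value cache follows, using only the Python standard library. The class name PseudoLabelCache, the per-label capacity, the entropy-based eviction rule, and the cosine-similarity scoring are illustrative assumptions rather than details taken from the paper.

```python
import math
from collections import defaultdict


def entropy(probs):
    """Shannon entropy of a probability vector (lower means more confident)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class PseudoLabelCache:
    """Key-value cache: keys are test-sample features, values are pseudo labels.

    Each pseudo label owns a small queue. When a queue is full, the least
    confident (highest-entropy) entry is dropped. Both the capacity and the
    entropy criterion are assumptions made for this sketch.
    """

    def __init__(self, capacity=3):
        self.capacity = capacity
        self.queues = defaultdict(list)  # label -> [(entropy, feature), ...]

    def update(self, feature, probs):
        """Insert a test feature under its predicted (pseudo) label."""
        label = max(range(len(probs)), key=probs.__getitem__)
        queue = self.queues[label]
        queue.append((entropy(probs), list(feature)))
        queue.sort(key=lambda item: item[0])  # most confident first
        del queue[self.capacity:]             # keep the queue bounded
        return label

    def scores(self, feature, num_classes):
        """Per-class sum of cosine similarities to the cached keys."""
        out = [0.0] * num_classes
        for label, queue in self.queues.items():
            for _, key in queue:
                out[label] += cosine(feature, key)
        return out
```

In a setup like this, the scores would typically be combined with the base model's own predictions; how that combination is weighted is not specified in the excerpt above.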
1 code implementation • 12 Mar 2024 • Han Qiu, Jiaxing Huang, Peng Gao, Lewei Lu, Xiaoqin Zhang, Shijian Lu
Inspired by the success of general-purpose models in NLP, recent studies attempt to unify different vision tasks in the same sequence format and employ autoregressive Transformers for sequence prediction.
no code implementations • 12 Mar 2024 • Kunhao Liu, Fangneng Zhan, Muyu Xu, Christian Theobalt, Ling Shao, Shijian Lu
We introduce StyleGaussian, a novel 3D style transfer technique that allows instant transfer of any image's style to a 3D scene at 10 frames per second (fps).
no code implementations • 11 Mar 2024 • Jiahui Zhang, Fangneng Zhan, Muyu Xu, Shijian Lu, Eric Xing
3D Gaussian splatting has achieved very impressive performance in real-time novel view synthesis.
no code implementations • 29 Feb 2024 • Xueying Jiang, Sheng Jin, Lewei Lu, Xiaoqin Zhang, Shijian Lu
We propose SKD-WM3D, a weakly supervised monocular 3D detection framework that exploits depth information to achieve M3D with a single-view image exclusively without any 3D annotations or other training data.
no code implementations • 27 Feb 2024 • Weijing Tao, Biwen Lei, Kunhao Liu, Shijian Lu, Miaomiao Cui, Xuansong Xie, Chunyan Miao
We design DivAvatar, a novel framework that generates diverse avatars, empowering 3D creatives with a multitude of distinct and richly varied 3D avatars from a single text prompt.
no code implementations • 7 Feb 2024 • Sheng Jin, Xueying Jiang, Jiaxing Huang, Lewei Lu, Shijian Lu
This paper presents DVDet, a Descriptor-Enhanced Open Vocabulary Detector that introduces conditional context prompts and hierarchical textual descriptors that enable precise region-text alignment as well as open-vocabulary detection training in general.
no code implementations • 6 Feb 2024 • Aoran Xiao, Weihao Xuan, Heli Qi, Yun Xing, Ruijie Ren, Xiaoqin Zhang, Ling Shao, Shijian Lu
CAT-SAM freezes the entire SAM and adapts its mask decoder and image encoder simultaneously with a small number of learnable parameters.
1 code implementation • 16 Jan 2024 • Jiahao Nie, Yun Xing, Gongjie Zhang, Pei Yan, Aoran Xiao, Yap-Peng Tan, Alex C. Kot, Shijian Lu
Cross-Domain Few-Shot Segmentation (CD-FSS) poses the challenge of segmenting novel categories from a distinct domain using only limited exemplars.
no code implementations • 13 Jan 2024 • Kai Jiang, Jiaxing Huang, Weiying Xie, Yunsong Li, Ling Shao, Shijian Lu
Camera-only Bird's Eye View (BEV) has demonstrated great potential in environment perception in a 3D space.
no code implementations • 13 Jan 2024 • Kai Jiang, Jiaxing Huang, Weiying Xie, Jie Lei, Yunsong Li, Ling Shao, Shijian Lu
Large-vocabulary object detectors (LVDs) aim to detect objects of many categories; they learn super objectness features and can locate objects accurately when applied to various downstream data.
no code implementations • 9 Jan 2024 • Jiaxing Huang, Kai Jiang, Jingyi Zhang, Han Qiu, Lewei Lu, Shijian Lu, Eric Xing
SAMs work with two types of prompts including spatial prompts (e.g., points) and semantic prompts (e.g., texts), which work together to prompt SAMs to segment anything on downstream datasets.
no code implementations • 27 Dec 2023 • Jiaxing Huang, Jingyi Zhang, Kai Jiang, Han Qiu, Shijian Lu
Traditional computer vision generally solves each single task independently by a dedicated model with the task instruction implicitly designed in the model architecture, giving rise to two limitations: (1) it leads to task-specific models, which require multiple models for different tasks and restrict the potential synergies from diverse tasks; (2) it leads to a pre-defined and fixed model interface that has limited interactivity and adaptability in following users' task instructions.
2 code implementations • 28 Nov 2023 • Sicong Leng, Hang Zhang, Guanzheng Chen, Xin Li, Shijian Lu, Chunyan Miao, Lidong Bing
Large Vision-Language Models (LVLMs) have advanced considerably, intertwining visual recognition and language understanding to generate content that is not only coherent but also contextually attuned.
1 code implementation • 3 Oct 2023 • Zuhao Yang, Fangneng Zhan, Kunhao Liu, Muyu Xu, Shijian Lu
The advancement of visual intelligence is intrinsically tethered to the availability of large-scale data.
no code implementations • 26 Sep 2023 • Eman Ali, Dayan Guan, Shijian Lu, Abdulmotaleb Elsaddik
NtUA works as a key-value cache that formulates visual features and predicted pseudo-labels of the few-shot unlabelled target samples as key-value pairs.
2 code implementations • NeurIPS 2023 • Yun Xing, Jian Kang, Aoran Xiao, Jiahao Nie, Ling Shao, Shijian Lu
Such semantic misalignment circulates in pre-training, leading to inferior zero-shot performance in dense predictions due to insufficient visual concepts captured in textual representations.
no code implementations • ICCV 2023 • Xueying Jiang, Jiaxing Huang, Sheng Jin, Shijian Lu
Despite its recent progress, most existing work suffers from the misalignment between the difficulty level of training samples and the capability of contemporarily trained models, leading to over-fitting or under-fitting in the trained generalization model.
no code implementations • ICCV 2023 • Jiahui Zhang, Fangneng Zhan, Yingchen Yu, Kunhao Liu, Rongliang Wu, Xiaoqin Zhang, Ling Shao, Shijian Lu
However, as the pose estimator is trained with only rendered images, the pose estimation is usually biased or inaccurate for real images due to the domain gap between real images and rendered images, leading to poor robustness for the pose estimation of real images and further local minima in joint optimization.
no code implementations • ICCV 2023 • Jingyi Zhang, Jiaxing Huang, Xueying Jiang, Shijian Lu
However, the source predictions of target data are often noisy and training with them is prone to learning collapses.
no code implementations • ICCV 2023 • Muyu Xu, Fangneng Zhan, Jiahui Zhang, Yingchen Yu, Xiaoqin Zhang, Christian Theobalt, Ling Shao, Shijian Lu
Neural Radiance Field (NeRF) has shown impressive performance in novel view synthesis via implicit scene representation.
no code implementations • 14 Jul 2023 • Siyuan Yang, Jun Liu, Shijian Lu, Er Meng Hwa, Alex C. Kot
The first is multi-scale matching which captures the scale-wise semantic relevance of skeleton data at multiple spatial and temporal scales simultaneously.
no code implementations • 29 Jun 2023 • Jiaxing Huang, Jingyi Zhang, Han Qiu, Sheng Jin, Shijian Lu
Traditional domain adaptation assumes the same vocabulary across source and target domains, which often struggles with limited transfer flexibility and efficiency while handling target domains with different vocabularies.
1 code implementation • 31 May 2023 • Aoran Xiao, Xiaoqin Zhang, Ling Shao, Shijian Lu
We address three critical questions in this emerging research field: i) the importance and urgency of label-efficient learning in point cloud processing, ii) the subfields it encompasses, and iii) the progress achieved in this area.
1 code implementation • NeurIPS 2023 • Kunhao Liu, Fangneng Zhan, Jiahui Zhang, Muyu Xu, Yingchen Yu, Abdulmotaleb El Saddik, Christian Theobalt, Eric Xing, Shijian Lu
Open-vocabulary segmentation of 3D scenes is a fundamental function of human perception and thus a crucial objective in computer vision research.
no code implementations • 18 Apr 2023 • Rongliang Wu, Yingchen Yu, Fangneng Zhan, Jiahui Zhang, Xiaoqin Zhang, Shijian Lu
To accommodate fair variation of plausible facial animations for the same audio, we design a transformer-based probabilistic mapping network that can model the variational facial animation distribution conditioned upon the input audio and autoregressively convert the audio signals into a facial animation sequence.
no code implementations • 18 Apr 2023 • Siyuan Yang, Jun Liu, Shijian Lu, Er Meng Hwa, Yongjian Hu, Alex C. Kot
We investigate self-supervised representation learning and design a novel skeleton cloud colorization technique that is capable of learning spatial and temporal skeleton representations from unlabeled skeleton sequence data.
no code implementations • 18 Apr 2023 • Rongliang Wu, Yingchen Yu, Fangneng Zhan, Jiahui Zhang, Shengcai Liao, Shijian Lu
POCE achieves more accessible and realistic pose-controllable expression editing by mapping face images into UV space, where facial expressions and head poses can be disentangled and edited separately.
no code implementations • 5 Apr 2023 • Kaiwen Cui, Rongliang Wu, Fangneng Zhan, Shijian Lu
Face swapping aims to generate swapped images that fuse the identity of source faces and the attributes of target faces.
1 code implementation • CVPR 2023 • Aoran Xiao, Jiaxing Huang, Weihao Xuan, Ruijie Ren, Kangcheng Liu, Dayan Guan, Abdulmotaleb El Saddik, Shijian Lu, Eric Xing
In addition, we design a domain randomization technique that alternately randomizes the geometry styles of point clouds and aggregates their embeddings, ultimately leading to a generalizable model that can improve 3DSS under various adverse weather effectively.
1 code implementation • 3 Apr 2023 • Jingyi Zhang, Jiaxing Huang, Sheng Jin, Shijian Lu
Most visual recognition studies rely heavily on crowd-labelled data in deep neural network (DNN) training, and they usually train a DNN for each single visual recognition task, leading to a laborious and time-consuming visual recognition paradigm.
1 code implementation • CVPR 2023 • Kunhao Liu, Fangneng Zhan, YiWen Chen, Jiahui Zhang, Yingchen Yu, Abdulmotaleb El Saddik, Shijian Lu, Eric Xing
In addition, it transforms the grid features according to the reference style, which directly leads to high-quality zero-shot style transfer.
no code implementations • 14 Mar 2023 • Zhipeng Luo, Gongjie Zhang, Changqing Zhou, Zhonghua Wu, Qingyi Tao, Lewei Lu, Shijian Lu
The task of 3D single object tracking (SOT) with LiDAR point clouds is crucial for various applications, such as autonomous driving and robotics.
no code implementations • CVPR 2023 • Jiahui Zhang, Fangneng Zhan, Christian Theobalt, Shijian Lu
The first is a prior distribution regularization which measures the discrepancy between a prior token distribution and the predicted token distribution to avoid codebook collapse and low codebook utilization.
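As a rough illustration of a prior distribution regularizer of this kind, the sketch below compares the empirical codebook usage against a uniform prior. The uniform prior, the KL divergence as the discrepancy measure, the hard token counts, and all function names are assumptions for illustration; the excerpt does not state which prior or discrepancy the paper uses, and a trainable version would need differentiable (soft) assignments.

```python
import math


def token_usage(assignments, codebook_size):
    """Empirical distribution of codebook indices chosen by the quantizer."""
    counts = [0] * codebook_size
    for index in assignments:
        counts[index] += 1
    total = len(assignments)
    return [count / total for count in counts]


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def prior_regularizer(assignments, codebook_size):
    """Discrepancy between predicted token usage and a uniform prior.

    The value is 0 when every code is used equally often and reaches
    log(codebook_size) when all tokens collapse onto a single code.
    """
    prior = [1.0 / codebook_size] * codebook_size
    predicted = token_usage(assignments, codebook_size)
    return kl_divergence(predicted, prior)
```

With a four-entry codebook, evenly spread assignments such as [0, 1, 2, 3] give a penalty of zero, while fully collapsed assignments such as [0, 0, 0, 0] give log 4.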
no code implementations • CVPR 2023 • Yi Yu, YuFei Wang, Wenhan Yang, Shijian Lu, Yap-Peng Tan, Alex C. Kot
Extensive experiments show that with our trained trigger injection models and simple modification of encoder parameters (of the compression model), the proposed attack can successfully inject several backdoors with corresponding triggers in a single image compression model.
no code implementations • 15 Dec 2022 • Zhipeng Luo, Changqing Zhou, Gongjie Zhang, Shijian Lu
3D object detection with surround-view images is an essential task for autonomous driving.
no code implementations • 1 Dec 2022 • Zichen Tian, Chuhui Xue, Jingyi Zhang, Shijian Lu
We study domain adaptive scene text detection, a largely neglected yet very meaningful task that aims for optimal transfer of labelled scene text images while handling unlabelled images in various new domains.
no code implementations • CVPR 2023 • Gongjie Zhang, Zhipeng Luo, Zichen Tian, Jingyi Zhang, Xiaoqin Zhang, Shijian Lu
Multi-scale features have been proven highly effective for object detection but often come with huge and even prohibitive extra computation costs, especially for the recent Transformer-based detectors.
1 code implementation • 10 Aug 2022 • Zhipeng Luo, Changqing Zhou, Liang Pan, Gongjie Zhang, Tianrui Liu, Yueru Luo, Haiyu Zhao, Ziwei Liu, Shijian Lu
In a point cloud sequence, 3D object tracking aims to predict the location and orientation of an object in consecutive frames given an object template.
no code implementations • 4 Aug 2022 • Zhipeng Luo, Gongjie Zhang, Changqing Zhou, Tianrui Liu, Shijian Lu, Liang Pan
3D object detection using point clouds has attracted increasing attention due to its wide applications in autonomous driving and robotics.
no code implementations • 4 Aug 2022 • Jiahui Zhang, Fangneng Zhan, Yingchen Yu, Rongliang Wu, Xiaoqin Zhang, Shijian Lu
In addition, stochastic noises fed to the generator are employed for unconditional detail generation, which tends to produce unfaithful details that compromise the fidelity of the generated SR image.
1 code implementation • 30 Jul 2022 • Gongjie Zhang, Zhipeng Luo, Kaiwen Cui, Shijian Lu, Eric P. Xing
Despite its success, the said paradigm is still constrained by several factors, such as (i) low-quality region proposals for novel classes and (ii) negligence of the inter-class correlation among different classes.
2 code implementations • 30 Jul 2022 • Aoran Xiao, Jiaxing Huang, Dayan Guan, Kaiwen Cui, Shijian Lu, Ling Shao
The first is scene-level swapping which exchanges point cloud sectors of two LiDAR scans that are cut along the azimuth axis.
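The scene-level swapping step can be sketched as follows, assuming each scan is a list of (x, y, z) points and the sector is given as an azimuth interval in radians without wrap-around. The function names and the sector convention are illustrative assumptions, and per-point labels (which would move together with their points) are omitted for brevity.

```python
import math


def azimuth(point):
    """Azimuth angle of an (x, y, z) point in radians, in [-pi, pi]."""
    return math.atan2(point[1], point[0])


def in_sector(point, start, end):
    """True if the point's azimuth lies in [start, end) (no wrap-around)."""
    return start <= azimuth(point) < end


def swap_sectors(scan_a, scan_b, start, end):
    """Exchange the points of two scans that fall inside one azimuth sector."""
    keep_a = [p for p in scan_a if not in_sector(p, start, end)]
    keep_b = [p for p in scan_b if not in_sector(p, start, end)]
    move_a = [p for p in scan_a if in_sector(p, start, end)]
    move_b = [p for p in scan_b if in_sector(p, start, end)]
    # Each output keeps its own points outside the sector and receives
    # the other scan's points inside the sector.
    return keep_a + move_b, keep_b + move_a
```

For example, swapping the sector [0, 2.0) between two small scans moves every point with an azimuth in that range to the other scan and leaves the remaining points in place.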
1 code implementation • 28 Jul 2022 • Gongjie Zhang, Zhipeng Luo, Jiaxing Huang, Shijian Lu, Eric P. Xing
The recently proposed DEtection TRansformer (DETR) has established a fully end-to-end paradigm for object detection.
no code implementations • 26 Jul 2022 • Chuhui Xue, Jiaxing Huang, Shijian Lu, Changhu Wang, Song Bai
We formulate the new setup as a dual detection task which first detects integral text units and then groups them into a CTB.
no code implementations • 21 Jul 2022 • Fangneng Zhan, Yingchen Yu, Rongliang Wu, Jiahui Zhang, Kaiwen Cui, Changgong Zhang, Shijian Lu
Extensive experiments over multiple conditional image generation tasks show that our method achieves superior diverse image generation performance qualitatively and quantitatively as compared with the state-of-the-art.
1 code implementation • 6 Jul 2022 • Yingchen Yu, Fangneng Zhan, Rongliang Wu, Jiahui Zhang, Shijian Lu, Miaomiao Cui, Xuansong Xie, Xian-Sheng Hua, Chunyan Miao
In addition, we design a simple yet effective scheme that explicitly maps CLIP embeddings (of target text) to the latent space and fuses them with latent codes for effective latent code optimization and accurate editing.
1 code implementation • 6 Jul 2022 • Yun Xing, Dayan Guan, Jiaxing Huang, Shijian Lu
Specifically, we design cross-frame pseudo labelling to provide pseudo supervision from previous video frames while learning from the augmented current video frames.
no code implementations • 6 Jul 2022 • Jiahui Zhang, Fangneng Zhan, Rongliang Wu, Yingchen Yu, Wenqing Zhang, Bai Song, Xiaoqin Zhang, Shijian Lu
With the feature transport plan as the guidance, a novel pose calibration technique is designed which rectifies the initially randomized camera poses by predicting relative pose transformations between the pair of rendered and real images.
no code implementations • CVPR 2023 • Jingyi Zhang, Jiaxing Huang, Xiaoqin Zhang, Shijian Lu
Domain adaptive panoptic segmentation aims to mitigate the data annotation challenge by leveraging off-the-shelf annotated data in one or multiple related source domains.
Ranked #2 on Domain Adaptation on Panoptic SYNTHIA-to-Cityscapes
no code implementations • CVPR 2022 • Fangneng Zhan, Yingchen Yu, Rongliang Wu, Jiahui Zhang, Shijian Lu, Changgong Zhang
We design a Marginal Contrastive Learning Network (MCL-Net) that explores contrastive learning to learn domain-invariant features for realistic exemplar-based image translation.
1 code implementation • CVPR 2022 • Chuhui Xue, Zichen Tian, Fangneng Zhan, Shijian Lu, Song Bai
State-of-the-art document dewarping techniques learn to predict 3-dimensional information of documents, but they are prone to errors when dealing with documents with irregular distortions or large variations in depth.
1 code implementation • CVPR 2022 • Dayan Guan, Jiaxing Huang, Aoran Xiao, Shijian Lu
We build the balanced subclass distributions by clustering pixels of each original class into multiple subclasses of similar sizes, which provide class-balanced pseudo supervision to regularize the class-biased segmentation.
1 code implementation • CVPR 2022 • Fangneng Zhan, Jiahui Zhang, Yingchen Yu, Rongliang Wu, Shijian Lu
Perceiving the similarity between images has been a long-standing and fundamental problem underlying various visual generation tasks.
1 code implementation • CVPR 2022 • Gongjie Zhang, Zhipeng Luo, Yingchen Yu, Kaiwen Cui, Shijian Lu
First, it projects object queries into the same embedding space as encoded image features, where the matching can be accomplished efficiently with aligned semantics.
no code implementations • 8 Mar 2022 • Chuhui Xue, Wenqing Zhang, Yu Hao, Shijian Lu, Philip Torr, Song Bai
Our network consists of an image encoder and a character-aware text encoder that extract visual and textual features, respectively, as well as a visual-textual decoder that models the interaction among textual and visual features for learning effective scene text representations.
1 code implementation • 28 Feb 2022 • Aoran Xiao, Jiaxing Huang, Dayan Guan, Xiaoqin Zhang, Shijian Lu, Ling Shao
The convergence of point cloud and DNNs has led to many deep point cloud models, largely trained under the supervision of large-scale and densely-labelled point cloud data.
no code implementations • 28 Jan 2022 • Shuang Wu, Zhenguang Li, Shijian Lu, Li Cheng
Music and dance have always co-existed as pillars of human activities, contributing immensely to the cultural, social, and entertainment functions in virtually all societies.
1 code implementation • 30 Dec 2021 • Zhenguang Liu, Shuang Wu, Shuyuan Jin, Shouling Ji, Qi Liu, Shijian Lu, Li Cheng
One aspect that has been overlooked so far is that how we represent the skeletal pose has a critical impact on the prediction results.
2 code implementations • 27 Dec 2021 • Fangneng Zhan, Yingchen Yu, Rongliang Wu, Jiahui Zhang, Shijian Lu, Lingjie Liu, Adam Kortylewski, Christian Theobalt, Eric Xing
With superb power in modeling the interaction among multimodal information, multimodal image synthesis and editing has become a hot research topic in recent years.
1 code implementation • CVPR 2022 • Changqing Zhou, Zhipeng Luo, Yueru Luo, Tianrui Liu, Liang Pan, Zhongang Cai, Haiyu Zhao, Shijian Lu
In a point cloud sequence, 3D object tracking aims to predict the location and orientation of an object in the current search point cloud given a template point cloud.
no code implementations • 3 Dec 2021 • Shuang Wu, Shijian Lu, Li Cheng
We introduce an optimal transport distance for evaluating the authenticity of the generated dance distribution and a Gromov-Wasserstein distance to measure the correspondence between the dance distribution and the input music.
1 code implementation • NeurIPS 2021 • Jiaxing Huang, Dayan Guan, Aoran Xiao, Shijian Lu
To this end, we design an innovative historical contrastive learning (HCL) technique that exploits historical source hypothesis to make up for the absence of source data in UMA.
1 code implementation • 4 Oct 2021 • Kaiwen Cui, Jiaxing Huang, Zhipeng Luo, Gongjie Zhang, Fangneng Zhan, Shijian Lu
Specifically, we design GenCo, a Generative Co-training network that mitigates the discriminator over-fitting issue by introducing multiple complementary discriminators that provide diverse supervision from multiple distinctive views in training.
no code implementations • 29 Sep 2021 • Chuhui Xue, Jiaxing Huang, Wenqing Zhang, Shijian Lu, Song Bai, Changhu Wang
This paper presents Contextual Text Detection, a new setup that detects contextual text blocks for better understanding of texts in scenes.
no code implementations • ICCV 2021 • Siyuan Yang, Jun Liu, Shijian Lu, Meng Hwa Er, Alex C. Kot
We investigate unsupervised representation learning for skeleton action recognition, and design a novel skeleton cloud colorization technique that is capable of learning skeleton representations from unlabeled skeleton sequence data.
1 code implementation • ICCV 2021 • Yingchen Yu, Fangneng Zhan, Shijian Lu, Jianxiong Pan, Feiying Ma, Xuansong Xie, Chunyan Miao
This paper presents WaveFill, a wavelet-based inpainting network that decomposes images into multiple frequency bands and fills the missing regions in each frequency band separately and explicitly.
1 code implementation • ICCV 2021 • Zhipeng Luo, Zhongang Cai, Changqing Zhou, Gongjie Zhang, Haiyu Zhao, Shuai Yi, Shijian Lu, Hongsheng Li, Shanghang Zhang, Ziwei Liu
In addition, existing 3D domain adaptive detection methods often assume prior access to the target domain annotations, which is rarely feasible in the real world.
1 code implementation • ICCV 2021 • Dayan Guan, Jiaxing Huang, Aoran Xiao, Shijian Lu
This paper presents DA-VSN, a domain adaptive video segmentation network that addresses domain gaps in videos by temporal consistency regularization (TCR) for consecutive frames of target-domain videos.
1 code implementation • 12 Jul 2021 • Aoran Xiao, Jiaxing Huang, Dayan Guan, Fangneng Zhan, Shijian Lu
Extensive experiments show that SynLiDAR provides a high-quality data source for studying 3D transfer and the proposed PCT achieves superior point cloud translation consistently across the three setups.
no code implementations • 7 Jul 2021 • Kaiwen Cui, Gongjie Zhang, Fangneng Zhan, Jiaxing Huang, Shijian Lu
Generative Adversarial Networks (GANs) have become the de-facto standard in image synthesis.
2 code implementations • 7 Jul 2021 • Fangneng Zhan, Yingchen Yu, Rongliang Wu, Jiahui Zhang, Kaiwen Cui, Aoran Xiao, Shijian Lu, Chunyan Miao
This paper presents a versatile image translation and manipulation framework that achieves accurate semantic and style guidance in image generation by explicitly building a correspondence.
no code implementations • 6 Jul 2021 • Mengxi Jia, Xinhua Cheng, Shijian Lu, Jian Zhang
To better eliminate interference from occlusions, we design a contrast feature learning technique (CFL) for better separation of occlusion features and discriminative ID features.
no code implementations • 1 Jul 2021 • Jiahui Zhang, Shijian Lu, Fangneng Zhan, Yingchen Yu
Extensive experiments on synthetic datasets and real images show that the proposed CRL-SR can handle multi-modal and spatially variant degradation effectively under blind settings and it also outperforms state-of-the-art SR methods qualitatively and quantitatively.
no code implementations • ICCV 2021 • Fangneng Zhan, Changgong Zhang, WenBo Hu, Shijian Lu, Feiying Ma, Xuansong Xie, Ling Shao
Accurate lighting estimation is challenging yet critical to many computer vision and computer graphics tasks such as high-dynamic-range (HDR) relighting.
no code implementations • CVPR 2021 • Fangneng Zhan, Yingchen Yu, Kaiwen Cui, Gongjie Zhang, Shijian Lu, Jianxiong Pan, Changgong Zhang, Feiying Ma, Xuansong Xie, Chunyan Miao
In addition, we design a semantic-activation normalization scheme that injects style features of exemplars into the image translation process successfully.
no code implementations • 16 Jun 2021 • Zhipeng Luo, Xiaobing Zhang, Shijian Lu, Shuai Yi
Compared with single-source unsupervised domain adaptation (SUDA), domain shift in MUDA exists not only between the source and target domains but also among multiple source domains.
no code implementations • CVPR 2022 • Jingyi Zhang, Jiaxing Huang, Zichen Tian, Shijian Lu
Second, it introduces multi-view spectral learning that learns useful unsupervised representations by maximizing mutual information among multiple ST-generated spectral views of each target sample.
1 code implementation • ICCV 2021 • Jiaxing Huang, Dayan Guan, Aoran Xiao, Shijian Lu
With FAA-generated samples, the training can continue the 'random walk' and drift into an area with a flat loss landscape, leading to more robust domain adaptation.
1 code implementation • CVPR 2022 • Jiaxing Huang, Dayan Guan, Aoran Xiao, Shijian Lu, Ling Shao
In this work, we explore the idea of instance contrastive learning in unsupervised domain adaptation (UDA) and propose a novel Category Contrast technique (CaCo) that introduces semantic priors on top of instance discrimination for visual UDA tasks.
no code implementations • 5 Jun 2021 • Jiaxing Huang, Dayan Guan, Aoran Xiao, Shijian Lu
We position the few labeled target samples as references that gauge the similarity between source and target features and guide adaptive inter-domain alignment for learning more similar source features.
no code implementations • 18 May 2021 • Chuhui Xue, Jiaxing Huang, Wenqing Zhang, Shijian Lu, Changhu Wang, Song Bai
The first task focuses on image-to-character (I2C) mapping which detects a set of character candidates from images based on different alignments of visual features in a non-sequential way.
no code implementations • 26 Apr 2021 • Yingchen Yu, Fangneng Zhan, Rongliang Wu, Jianxiong Pan, Kaiwen Cui, Shijian Lu, Feiying Ma, Xuansong Xie, Chunyan Miao
With image-level attention, transformers can model long-range dependencies and generate diverse content with autoregressive modeling of pixel-sequence distributions.
no code implementations • CVPR 2023 • Jingyi Zhang, Jiaxing Huang, Zhipeng Luo, Gongjie Zhang, Xiaoqin Zhang, Shijian Lu
DA-DETR introduces a novel CNN-Transformer Blender (CTBlender) that fuses the CNN features and Transformer features ingeniously for effective feature alignment and knowledge transfer across domains.
no code implementations • 28 Mar 2021 • Gongjie Zhang, Kaiwen Cui, Tzu-Yi Hung, Shijian Lu
In addition, the synthesized defect samples demonstrate their effectiveness in training better defect inspection networks.
no code implementations • 24 Mar 2021 • Jiaxing Huang, Dayan Guan, Shijian Lu, Aoran Xiao
Recent progress in domain adaptive semantic segmentation demonstrates the effectiveness of adversarial learning (AL) in unsupervised domain adaptation.
2 code implementations • 22 Mar 2021 • Gongjie Zhang, Zhipeng Luo, Kaiwen Cui, Shijian Lu
Few-shot object detection has been extensively investigated by incorporating meta-learning into region-based detection frameworks.
Ranked #7 on Few-Shot Object Detection on MS-COCO (30-shot)
1 code implementation • CVPR 2021 • Jiaxing Huang, Dayan Guan, Aoran Xiao, Shijian Lu
It has been studied widely by domain randomization that transfers source images to different styles in spatial space for learning domain-agnostic features.
1 code implementation • CVPR 2021 • Jiaxing Huang, Dayan Guan, Aoran Xiao, Shijian Lu
The inter-task regularization exploits the complementary nature of instance segmentation and semantic segmentation and uses it as a constraint for better feature alignment across domains.
Ranked #2 on Domain Adaptation on Panoptic SYNTHIA-to-Mapillary
no code implementations • 1 Mar 2021 • Chuhui Xue, Shijian Lu, Steven Hoi
Detection and recognition of scene texts of arbitrary shapes remain a grand challenge due to the super-rich text shape variation in text line orientations, lengths, curvatures, etc.
1 code implementation • 1 Mar 2021 • Aoran Xiao, Xiaofei Yang, Shijian Lu, Dayan Guan, Jiaxing Huang
Specifically, we design a residual dense block with multiple receptive fields as a building block in the encoder which preserves detailed information in each modality and learns hierarchical modality-specific and fused features effectively.
Ranked #23 on 3D Semantic Segmentation on SemanticKITTI
3 code implementations • 27 Feb 2021 • Dayan Guan, Jiaxing Huang, Aoran Xiao, Shijian Lu, Yanpeng Cao
Specifically, we design an uncertainty metric that assesses the alignment of each sample and adjusts the strength of adversarial learning for well-aligned and poorly-aligned samples adaptively.
1 code implementation • 20 Feb 2021 • Fangneng Zhan, Yingchen Yu, Changgong Zhang, Rongliang Wu, WenBo Hu, Shijian Lu, Feiying Ma, Xuansong Xie, Ling Shao
This paper presents Geometric Mover's Light (GMLight), a lighting estimation framework that employs a regression network and a generative projector for effective illumination estimation.
no code implementations • 21 Dec 2020 • Fangneng Zhan, Changgong Zhang, Yingchen Yu, Yuan Chang, Shijian Lu, Feiying Ma, Xuansong Xie
Motivated by the Earth Mover distance, we design a novel spherical mover's loss that guides accurate regression of light distribution parameters by taking advantage of the subtleties of spherical distribution.
no code implementations • 17 Sep 2020 • Fangneng Zhan, Shijian Lu, Changgong Zhang, Feiying Ma, Xuansong Xie
State-of-the-art methods strive to harmonize the composed image by adapting the style of foreground objects to be compatible with the background image, whereas the potential shadow of foreground objects within the composed image, which is critical to the composition realism, is largely neglected.
no code implementations • ECCV 2020 • Rongliang Wu, Shijian Lu
Recent studies on facial expression editing have made very promising progress.
no code implementations • 14 Jul 2020 • Changgong Zhang, Fangneng Zhan, Shijian Lu, Feiying Ma, Xuansong Xie
Recent advances in generative adversarial networks (GANs) have achieved great success in automated image composition that generates new images by embedding foreground objects of interest into background images automatically.
1 code implementation • ECCV 2020 • Jiaxing Huang, Shijian Lu, Dayan Guan, Xiaobing Zhang
Recent advances in unsupervised domain adaptation for semantic segmentation have shown great potential to relieve the demand for expensive per-pixel annotations.
2 code implementations • ECCV 2020 • Yunpeng Zhai, Qixiang Ye, Shijian Lu, Mengxi Jia, Rongrong Ji, Yonghong Tian
Often the best-performing deep neural models are ensembles of multiple base-level networks. Nevertheless, ensemble learning with respect to domain adaptive person re-ID remains unexplored.
no code implementations • 3 Jul 2020 • Mengxi Jia, Yunpeng Zhai, Shijian Lu, Siwei Ma, Jian Zhang
RGB-Infrared (IR) cross-modality person re-identification (re-ID), which aims to search for an IR image in an RGB gallery or vice versa, is a challenging task due to the large discrepancy between IR and RGB modalities.
no code implementations • CVPR 2020 • Yunpeng Zhai, Shijian Lu, Qixiang Ye, Xuebo Shan, Jie Chen, Rongrong Ji, Yonghong Tian
Domain adaptive person re-identification (re-ID) is a challenging task, especially when person identities in target domains are unknown.
Ranked #8 on Unsupervised Domain Adaptation on Duke to Market
no code implementations • 19 Mar 2020 • Zongxian Li, Qixiang Ye, Chong Zhang, Jingjing Liu, Shijian Lu, Yonghong Tian
In this work, we propose a Self-Guided Adaptation (SGA) model, which aims to align feature representations and transfer object detection models across domains while considering the instantaneous alignment difficulty.
no code implementations • CVPR 2020 • Rongliang Wu, Gongjie Zhang, Shijian Lu, Tao Chen
Recent advances in Generative Adversarial Nets (GANs) have shown remarkable improvements for facial expression editing.
2 code implementations • CVPR 2020 • Kai Wang, Xiaojiang Peng, Jianfei Yang, Shijian Lu, Yu Qiao
Annotating a high-quality large-scale facial expression dataset is extremely difficult due to the uncertainties caused by ambiguous facial expressions, low-quality facial images, and the subjectiveness of annotators.
no code implementations • 20 Dec 2019 • Xi Liu, Rui Zhang, Yongsheng Zhou, Qianyi Jiang, Qi Song, Nan Li, Kai Zhou, Lei Wang, Dong Wang, Minghui Liao, Mingkun Yang, Xiang Bai, Baoguang Shi, Dimosthenis Karatzas, Shijian Lu, C. V. Jawahar
21 teams submit results for Task 1, 23 teams submit results for Task 2, 24 teams submit results for Task 3, and 13 teams submit results for Task 4.
no code implementations • ICCV 2019 • Fangneng Zhan, Chuhui Xue, Shijian Lu
Recent adversarial learning research has achieved very impressive progress for modelling cross-domain data shifts in appearance space but its counterpart in modelling cross-domain shifts in geometry space lags far behind.
no code implementations • 12 Jul 2019 • Chun-Mei Feng, Kai Wang, Shijian Lu, Yong Xu, Heng Kong, Ling Shao
The deep sub-network learns from the residuals of the high-frequency image information, where multiple residual blocks are cascaded to magnify the MRI images at the last network layer.
no code implementations • 12 May 2019 • Fangneng Zhan, Jiaxing Huang, Shijian Lu
Despite the rapid progress of generative adversarial networks (GANs) in image synthesis in recent years, existing image synthesis approaches work in either the geometry domain or the appearance domain alone, which often introduces various synthesis artifacts.
1 code implementation • 3 Mar 2019 • Gongjie Zhang, Shijian Lu, Wei Zhang
This paper presents a novel object detection network (CAD-Net) that exploits attention-modulated features as well as global and local contexts to address the new challenges in detecting objects from remote sensing images.
no code implementations • 26 Jan 2019 • Changgong Zhang, Fangneng Zhan, Hongyuan Zhu, Shijian Lu
Experiments over a number of public datasets demonstrate the effectiveness of our proposed image synthesis technique - the use of our synthesized images in deep network training is capable of achieving similar or even better scene text detection and scene text recognition performance as compared with using real images.
no code implementations • 9 Jan 2019 • Chuhui Xue, Shijian Lu, Wei Zhang
State-of-the-art scene text detection techniques predict quadrilateral boxes that are prone to localization errors while dealing with straight or curved text lines of different orientations and lengths in scenes.
no code implementations • CVPR 2019 • Fangneng Zhan, Hongyuan Zhu, Shijian Lu
Recent advances in generative adversarial networks (GANs) have shown great potential in realistic image synthesis, whereas most existing works address synthesis realism in either appearance space or geometry space but few in both.
no code implementations • CVPR 2019 • Fangneng Zhan, Shijian Lu
Automated recognition of texts in scenes has been a research challenge for years, largely due to the arbitrary variation of text appearances in perspective distortion, text line curvature, text styles and different types of imaging artifacts.
no code implementations • 25 Nov 2018 • Dinh NguyenVan, Shijian Lu, Shangxuan Tian, Nizar Ouarti, Mounir Mokhtari
Automatic reading of texts in scenes has attracted increasing interest in recent years, as texts often carry rich semantic information that is useful for scene understanding.
no code implementations • 13 Oct 2018 • Fan Yang, Ke Yan, Shijian Lu, Huizhu Jia, Xiaodong Xie, Wen Gao
Person re-identification (ReID) is a challenging task due to arbitrary human pose variations, background clutter, etc.
no code implementations • ECCV 2018 • Chuhui Xue, Shijian Lu, Fangneng Zhan
This paper presents a scene text detection technique that exploits bootstrapping and text border semantics for accurate localization of texts in scenes.
no code implementations • ECCV 2018 • Fangneng Zhan, Shijian Lu, Chuhui Xue
This paper presents a novel image synthesis technique that aims to generate a large amount of annotated scene text images for training accurate and robust scene text detection and recognition models.
no code implementations • ICCV 2017 • Shangxuan Tian, Shijian Lu, Chongshou Li
With a "light" supervised model trained on a small fully annotated dataset, we explore semi-supervised and weakly supervised learning on a large unannotated dataset and a large weakly annotated dataset, respectively.
no code implementations • ICCV 2017 • Hongyuan Zhu, Romain Vial, Shijian Lu
Recently, the regression-based object detectors and long-term recurrent convolutional network (LRCN) have demonstrated superior performance in human action detection and recognition.
5 code implementations • 31 Aug 2017 • Baoguang Shi, Cong Yao, Minghui Liao, Mingkun Yang, Pei Xu, Linyan Cui, Serge Belongie, Shijian Lu, Xiang Bai
This report introduces RCTW, a new competition that focuses on Chinese text reading.
no code implementations • 26 Jun 2017 • Hongyuan Zhu, Romain Vial, Shijian Lu, Yonghong Tian, Xian-Bin Cao
In this paper, we present YoTube, a novel network fusion framework for searching action proposals in untrimmed videos, where each action proposal corresponds to a spatial-temporal video tube that potentially locates one human action.
no code implementations • 12 Jun 2017 • Artsiom Ablavatski, Shijian Lu, Jianfei Cai
We design an Enriched Deep Recurrent Visual Attention Model (EDRAM) - an improved attention-based architecture for multiple object recognition.
no code implementations • 15 May 2017 • Andrei Polzounov, Artsiom Ablavatski, Sergio Escalera, Shijian Lu, Jianfei Cai
In recent years, text recognition has achieved remarkable success in recognizing scanned document text.
no code implementations • CVPR 2016 • Hongyuan Zhu, Jean-Baptiste Weibel, Shijian Lu
RGBD scene recognition has attracted increasing attention due to the rapid development of depth sensors and their wide application scenarios.
no code implementations • ICCV 2015 • Shangxuan Tian, Yifeng Pan, Chang Huang, Shijian Lu, Kai Yu, Chew Lim Tan
With character candidates detected by cascade boosting, the min-cost flow network model integrates the last three sequential steps into a single process which solves the error accumulation problem at both character level and text line level effectively.
no code implementations • 16 Jul 2015 • Hongyuan Zhu, Shijian Lu, Jianfei Cai, Quangqing Lee
Recently, Hosang et al. conducted the first unified study of existing methods in terms of various image-level degradations.
no code implementations • 3 Feb 2015 • Hongyuan Zhu, Fanman Meng, Jianfei Cai, Shijian Lu
Image segmentation refers to the process of dividing an image into non-overlapping meaningful regions according to human perception, and it has been a classic topic since the early days of computer vision.