1 code implementation • 31 Mar 2024 • Qiankun Liu, Yuqi Jiang, Zhentao Tan, Dongdong Chen, Ying Fu, Qi Chu, Gang Hua, Nenghai Yu
The indices of quantized pixels are used as tokens for the inputs and prediction targets of the transformer.
1 code implementation • 18 Mar 2024 • Zixin Zhu, Xuelu Feng, Dongdong Chen, Junsong Yuan, Chunming Qiao, Gang Hua
We hypothesize that the latent representation learned from a pretrained generative T2V model encapsulates rich semantics and coherent temporal correspondences, thereby naturally facilitating video understanding.
Referring Video Object Segmentation Semantic Segmentation +2
no code implementations • 17 Mar 2024 • Kun Xia, Le Wang, Sanping Zhou, Gang Hua, Wei Tang
To this end, we first devise innovative strategies to adaptively select high-quality positive and negative classes from the label space, by modeling both the confidence and rank of a class in relation to those of the target class.
no code implementations • 9 Mar 2024 • Yonghao Dong, Le Wang, Sanping Zhou, Gang Hua, Changyin Sun
Previous studies have tried to tackle this problem by leveraging a portion of the trajectory data from the target domain to adapt the model.
no code implementations • 27 Feb 2024 • Mo Zhou, Yiding Yang, Haoxiang Li, Vishal M. Patel, Gang Hua
With a strong alignment between the training and test distributions, object relation as a context prior facilitates object detection.
1 code implementation • 4 Feb 2024 • Zhenxing Niu, Haodong Ren, Xinbo Gao, Gang Hua, Rong Jin
This paper focuses on jailbreaking attacks against multi-modal large language models (MLLMs), seeking to elicit MLLMs to generate objectionable responses to harmful user queries.
no code implementations • 26 Dec 2023 • Lu Ling, Yichen Sheng, Zhi Tu, Wentian Zhao, Cheng Xin, Kun Wan, Lantao Yu, Qianyu Guo, Zixun Yu, Yawen Lu, Xuanmao Li, Xingpeng Sun, Rohan Ashok, Aniruddha Mukherjee, Hao Kang, Xiangrui Kong, Gang Hua, Tianyi Zhang, Bedrich Benes, Aniket Bera
We have witnessed significant progress in deep learning-based 3D vision, ranging from neural radiance field (NeRF) based 3D representation learning to applications in novel view synthesis (NVS).
1 code implementation • 28 Nov 2023 • Jiaxin Lu, Hao Kang, Haoxiang Li, Bo Liu, Yiding Yang, QiXing Huang, Gang Hua
Generation-based methods that generate grasping postures conditioned on the object can often produce diverse grasping, but they are insufficient for high grasping success due to lack of discriminative information.
no code implementations • 27 Nov 2023 • Yonghao Dong, Le Wang, Sanping Zhou, Gang Hua, Changyin Sun
Specifically, TSNet learns the negative-removed characters in the sparse character representation stream to improve the trajectory embedding obtained in the trajectory representation stream.
no code implementations • 23 Nov 2023 • Lei Fan, Mingfu Liang, Yunxuan Li, Gang Hua, Ying Wu
Active recognition enables robots to intelligently explore novel observations, thereby acquiring more information while circumventing undesired viewing conditions.
1 code implementation • ICCV 2023 • Tianyi Wei, Dongdong Chen, Wenbo Zhou, Jing Liao, Weiming Zhang, Gang Hua, Nenghai Yu
Even though they can enable very fine-grained local control, such interaction modes are inefficient for the editing conditions that can be easily specified by language descriptions or reference images.
no code implementations • ICCV 2023 • Lei Fan, Bo Liu, Haoxiang Li, Ying Wu, Gang Hua
First, prediction uncertainty should be separately quantified as confusion depicting inter-class uncertainties and ignorance identifying out-of-distribution samples.
1 code implementation • ICCV 2023 • Yuanhao Zhai, Ziyi Liu, Zhenyu Wu, Yi Wu, Chunluan Zhou, David Doermann, Junsong Yuan, Gang Hua
The former prevents the decoder from reconstructing the video background given video features, and thus helps reduce the background information in feature learning.
1 code implementation • ICCV 2023 • Qidong Huang, Xiaoyi Dong, Dongdong Chen, Yinpeng Chen, Lu Yuan, Gang Hua, Weiming Zhang, Nenghai Yu
Based on our analysis, we provide a simple yet effective way to boost the adversarial robustness of MAE.
1 code implementation • 8 Jun 2023 • Qinhong Yang, Dongdong Chen, Zhentao Tan, Qiankun Liu, Qi Chu, Jianmin Bao, Lu Yuan, Gang Hua, Nenghai Yu
This paper introduces a new large-scale image restoration dataset, called HQ-50K, which contains 50,000 high-quality images with rich texture details and semantic diversity.
2 code implementations • 7 Jun 2023 • Zixin Zhu, Xuelu Feng, Dongdong Chen, Jianmin Bao, Le Wang, Yinpeng Chen, Lu Yuan, Gang Hua
The training cost of our asymmetric VQGAN is cheap, and we only need to retrain a new asymmetric decoder while keeping the vanilla VQGAN encoder and StableDiffusion unchanged.
1 code implementation • CVPR 2023 • Zhicheng Sun, Yadong Mu, Gang Hua
Continual learning aims to learn on non-stationary data streams without catastrophically forgetting previous knowledge.
no code implementations • CVPR 2023 • Zheng Qin, Sanping Zhou, Le Wang, Jinghai Duan, Gang Hua, Wei Tang
For dense crowds, we design a novel Interaction Module to learn interaction-aware motions from short-term trajectories, which can estimate the complex movement of each target.
1 code implementation • CVPR 2023 • Qidong Huang, Xiaoyi Dong, Dongdong Chen, Weiming Zhang, Feifei Wang, Gang Hua, Nenghai Yu
We present Diversity-Aware Meta Visual Prompting (DAM-VP), an efficient and effective prompting method for transferring pre-trained models to downstream tasks with a frozen backbone.
no code implementations • ICCV 2023 • Xingyu Liu, Sanping Zhou, Le Wang, Gang Hua
Learning discriminative features from very few labeled samples to identify novel classes has received increasing attention in skeleton-based action recognition.
1 code implementation • ICCV 2023 • Kun Xia, Le Wang, Sanping Zhou, Gang Hua, Wei Tang
To this end, we propose a unified framework, termed Noisy Pseudo-Label Learning, to handle both location biases and category errors.
1 code implementation • ICCV 2023 • Liushuai Shi, Le Wang, Sanping Zhou, Gang Hua
Pedestrian trajectory prediction is an essential link to understanding human behavior.
no code implementations • ICCV 2023 • Yonghao Dong, Le Wang, Sanping Zhou, Gang Hua
Specifically, SICNet learns comprehensive sparse instances, i.e., representative points of the future trajectory, through a mask generated by a long short-term memory encoder and uses the memory mechanism to store and retrieve such sparse instances.
1 code implementation • 8 Dec 2022 • Xiangyu Xu, Li Guan, Enrique Dunn, Haoxiang Li, Gang Hua
In this paper, we propose an end-to-end framework that jointly learns keypoint detection, descriptor representation and cross-frame matching for the task of image-based 3D localization.
1 code implementation • 30 Nov 2022 • Haichao Yu, Haoxiang Li, Gang Hua, Gao Huang, Humphrey Shi
To optimize the model, these prediction heads together with the network backbone are trained on every batch of training data.
1 code implementation • 21 Nov 2022 • Zixin Zhu, Yixuan Wei, JianFeng Wang, Zhe Gan, Zheng Zhang, Le Wang, Gang Hua, Lijuan Wang, Zicheng Liu, Han Hu
The image captioning task is typically realized by an auto-regressive method that decodes the text tokens one by one.
no code implementations • 16 Sep 2022 • Qidong Huang, Xiaoyi Dong, Dongdong Chen, Hang Zhou, Weiming Zhang, Kui Zhang, Gang Hua, Nenghai Yu
Notwithstanding the prominent performance achieved in various applications, point cloud recognition models have often suffered from natural corruptions and adversarial perturbations.
1 code implementation • 26 May 2022 • Liushuai Shi, Le Wang, Chengjiang Long, Sanping Zhou, Fang Zheng, Nanning Zheng, Gang Hua
Understanding the multiple socially-acceptable future behaviors is an essential task for many vision applications.
1 code implementation • 9 Apr 2022 • Xin Hu, Zhenyu Wu, Hao-Yu Miao, Siqi Fan, Taiyu Long, Zhenyu Hu, Pengcheng Pi, Yi Wu, Zhou Ren, Zhangyang Wang, Gang Hua
Video action detection (spatio-temporal action localization) is usually the starting point for human-centric intelligent analysis of videos nowadays.
no code implementations • CVPR 2023 • Bingxu Mu, Zhenxing Niu, Le Wang, Xue Wang, Rong Jin, Gang Hua
Deep neural networks (DNNs) are known to be vulnerable to both backdoor attacks and adversarial attacks.
1 code implementation • ICCV 2023 • Siming Yan, Zhenpei Yang, Haoxiang Li, Chen Song, Li Guan, Hao Kang, Gang Hua, QiXing Huang
The most popular and accessible 3D representation, i.e., point clouds, involves discrete samples of the underlying continuous 3D surface.
Ranked #5 on 3D Point Cloud Linear Classification on ModelNet40 (using extra training data)
3D Point Cloud Classification 3D Point Cloud Linear Classification +4
1 code implementation • NeurIPS 2021 • Dongkai Wang, Shiliang Zhang, Gang Hua
Instead of inferring individual keypoints, the Pose-level Inference Network (PINet) directly infers the complete pose cues for a person from his/her visible body parts.
1 code implementation • 26 Nov 2021 • Kumara Kahatapitiya, Zhou Ren, Haoxiang Li, Zhenyu Wu, Michael S. Ryoo, Gang Hua
However, such pretrained models are not ideal for downstream detection, due to the disparity between the pretraining and the downstream fine-tuning tasks.
Ranked #3 on Action Detection on Charades
1 code implementation • 12 Sep 2021 • Aichun Zhu, Zijie Wang, Yifeng Li, Xili Wan, Jing Jin, Tian Wang, Fangqiang Hu, Gang Hua
Many previous methods for text-based person retrieval are devoted to learning a latent common space mapping, with the purpose of extracting modality-invariant features from both the visual and textual modalities.
Ranked #6 on Text based Person Retrieval on RSTPReid
1 code implementation • 5 Aug 2021 • Jie Zhang, Dongdong Chen, Qidong Huang, Jing Liao, Weiming Zhang, Huamin Feng, Gang Hua, Nenghai Yu
As the image structure can keep its semantic meaning during the data transformation, such a trigger pattern is inherently robust to data transformations.
no code implementations • 5 Aug 2021 • Jie Zhang, Dongdong Chen, Jing Liao, Han Fang, Zehua Ma, Weiming Zhang, Gang Hua, Nenghai Yu
However, little attention has been devoted to the protection of DNNs in image processing tasks.
1 code implementation • ICCV 2021 • Fang Zheng, Le Wang, Sanping Zhou, Wei Tang, Zhenxing Niu, Nanning Zheng, Gang Hua
Specifically, the proposed unlimited neighborhood interaction module generates the fused-features of all agents involved in an interaction simultaneously, which is adaptive to any number of agents and any range of interaction area.
1 code implementation • ICCV 2021 • Zixin Zhu, Wei Tang, Le Wang, Nanning Zheng, Gang Hua
We explore two existing models to be the P-Net in our experiments.
no code implementations • CVPR 2021 • Liushuai Shi, Le Wang, Chengjiang Long, Sanping Zhou, Mo Zhou, Zhenxing Niu, Gang Hua
Specifically, the SGCN explicitly models the sparse directed interaction with a sparse directed spatial graph to capture adaptive interaction pedestrians.
no code implementations • CVPR 2021 • Yifan Sun, QiXing Huang, Dun-Yu Hsiao, Li Guan, Gang Hua
Efficient 3D space sampling to represent an underlying 3D object/scene is essential for 3D vision, robotics, and beyond.
no code implementations • 7 Jun 2021 • Zhanning Gao, Le Wang, Nebojsa Jojic, Zhenxing Niu, Nanning Zheng, Gang Hua
In the proposed framework, a dedicated feature alignment module is incorporated for redundancy removal across frames to produce the tensor representation, i.e., the video imprint.
no code implementations • CVPR 2021 • Yiding Yang, Zhou Ren, Haoxiang Li, Chunluan Zhou, Xinchao Wang, Gang Hua
In this paper, we propose a novel online approach to learning the pose dynamics, which are independent of pose detections in the current frame, and hence may serve as a robust estimation even in challenging scenarios including occlusion.
Multi-Person Pose Estimation Multi-Person Pose Estimation and Tracking +1
1 code implementation • 7 Jun 2021 • Mo Zhou, Le Wang, Zhenxing Niu, Qilin Zhang, Nanning Zheng, Gang Hua
In this paper, we propose two attacks against deep ranking systems, i.e., Candidate Attack and Query Attack, that can raise or lower the rank of chosen candidates by adversarial perturbations.
no code implementations • 1 May 2021 • Bo Liu, Mandar Dixit, Roland Kwitt, Gang Hua, Nuno Vasconcelos
In the absence of dense pose sampling in image space, these latent space trajectories provide cross-modal guidance for learning.
no code implementations • 1 May 2021 • Bo Liu, Haoxiang Li, Hao Kang, Nuno Vasconcelos, Gang Hua
A consistency loss has been introduced to limit the impact from unlabeled data while leveraging them to update the feature embedding.
no code implementations • ICCV 2021 • Bo Liu, Haoxiang Li, Hao Kang, Gang Hua, Nuno Vasconcelos
A new learning algorithm is then proposed for GeometrIc Structure Transfer (GIST), with resort to a combination of loss functions that combine class-balanced and random sampling to guarantee that, while overfitting to the popular classes is restricted to geometric parameters, it is leveraged to transfer class geometry from popular to few-shot classes.
no code implementations • 1 May 2021 • Bo Liu, Haoxiang Li, Hao Kang, Gang Hua, Nuno Vasconcelos
It is shown that, unlike class-balanced sampling, this is an adversarial augmentation strategy.
2 code implementations • 15 Apr 2021 • Tianyi Wei, Dongdong Chen, Wenbo Zhou, Jing Liao, Weiming Zhang, Lu Yuan, Gang Hua, Nenghai Yu
This paper studies the problem of StyleGAN inversion, which plays an essential role in enabling the pretrained StyleGAN to be used for real image editing tasks.
4 code implementations • 4 Apr 2021 • Liushuai Shi, Le Wang, Chengjiang Long, Sanping Zhou, Mo Zhou, Zhenxing Niu, Gang Hua
Meanwhile, we use a sparse directed temporal graph to model the motion tendency, thereby facilitating the prediction based on the observed direction.
no code implementations • 30 Mar 2021 • Ziyi Liu, Le Wang, Wei Tang, Junsong Yuan, Nanning Zheng, Gang Hua
To address this challenge, we introduce a framework that learns two feature subspaces respectively for actions and their context.
Action Recognition Weakly-supervised Temporal Action Localization +1
no code implementations • 28 Mar 2021 • Ziyi Liu, Le Wang, Qilin Zhang, Wei Tang, Junsong Yuan, Nanning Zheng, Gang Hua
In this paper, we introduce an Action-Context Separation Network (ACSNet) that explicitly takes into account context for accurate action localization.
Ranked #7 on Weakly Supervised Action Localization on THUMOS’14
Video Polyp Segmentation Weakly Supervised Action Localization +2
no code implementations • 24 Mar 2021 • Wei Wei, Li Guan, Yue Liu, Hao Kang, Haoxiang Li, Ying Wu, Gang Hua
By the proposed physical regularization, our method can generate HDRs which are not only visually appealing but also physically plausible.
1 code implementation • CVPR 2021 • Zhentao Tan, Menglei Chai, Dongdong Chen, Jing Liao, Qi Chu, Bin Liu, Gang Hua, Nenghai Yu
In this paper, we propose a novel diverse semantic image synthesis framework from the perspective of semantic class distributions, which naturally supports diverse generation at semantic or even instance level.
Ranked #1 on Image-to-Image Translation on ADE20K Labels-to-Photos (LPIPS metric)
2 code implementations • ICCV 2021 • Mo Zhou, Le Wang, Zhenxing Niu, Qilin Zhang, Yinghui Xu, Nanning Zheng, Gang Hua
In this paper, we formulate a new adversarial attack against deep ranking systems, i.e., the Order Attack, which covertly alters the relative order among a selected set of candidates according to an attacker-specified permutation, with limited interference to other unrelated candidates.
1 code implementation • 8 Mar 2021 • Jie Zhang, Dongdong Chen, Jing Liao, Weiming Zhang, Huamin Feng, Gang Hua, Nenghai Yu
By jointly training the target model and watermark embedding, the extra barrier can even be absorbed into the target model.
1 code implementation • ICCV 2021 • Haoxuanye Ji, Le Wang, Sanping Zhou, Wei Tang, Nanning Zheng, Gang Hua
Unsupervised person re-identification (Re-ID) remains challenging due to the lack of ground-truth labels.
no code implementations • 1 Jan 2021 • Mo Zhou, Le Wang, Zhenxing Niu, Qilin Zhang, Xu Yinghui, Nanning Zheng, Gang Hua
The objective of this paper is to formalize and practically implement a new adversarial attack against deep ranking systems, i.e., the Order Attack, which covertly alters the relative order of a selected set of candidates according to a permutation vector predefined by the attacker, with only limited interference to other unrelated candidates.
1 code implementation • 8 Dec 2020 • Zhentao Tan, Dongdong Chen, Qi Chu, Menglei Chai, Jing Liao, Mingming He, Lu Yuan, Gang Hua, Nenghai Yu
Spatially-adaptive normalization (SPADE) has recently been remarkably successful in conditional semantic image synthesis (Park et al., 2019); it modulates the normalized activation with spatially-varying transformations learned from semantic layouts, to prevent the semantic information from being washed away.
no code implementations • 1 Nov 2020 • Hang Zhou, Dongdong Chen, Jing Liao, Weiming Zhang, Kejiang Chen, Xiaoyi Dong, Kunlin Liu, Gang Hua, Nenghai Yu
To overcome these shortcomings, this paper proposes a novel label guided adversarial network (LG-GAN) for real-time flexible targeted point cloud attack.
1 code implementation • NeurIPS 2020 • Jie Zhang, Dongdong Chen, Jing Liao, Weiming Zhang, Gang Hua, Nenghai Yu
Only when the model IP is suspected of having been stolen is the private passport-aware branch added back for ownership verification.
no code implementations • ECCV 2020 • Yuanhao Zhai, Le Wang, Wei Tang, Qilin Zhang, Junsong Yuan, Gang Hua
Weakly-supervised Temporal Action Localization (W-TAL) aims to classify and localize all action instances in an untrimmed video under only video-level supervision.
Ranked #12 on Weakly Supervised Action Localization on THUMOS14
Vocal Bursts Valence Prediction Weakly Supervised Action Localization +2
no code implementations • 21 Sep 2020 • Dengpan Fu, Bo Xin, Jingdong Wang, Dong-Dong Chen, Jianmin Bao, Gang Hua, Houqiang Li
Not only does such a simple method improve the performance of the baseline models, it also achieves performance comparable to the latest advanced re-ranking methods.
1 code implementation • CVPR 2020 • Bo Liu, Hao Kang, Haoxiang Li, Gang Hua, Nuno Vasconcelos
It is argued that the classic softmax classifier is a poor solution for open-set recognition, since it tends to overfit on the training classes.
no code implementations • CVPR 2020 • Victor Fragoso, Joseph DeGol, Gang Hua
Many real-world applications in augmented reality (AR), 3D mapping, and robotics require both fast and accurate estimation of camera poses and scales from multiple images captured by multiple cameras or a single moving camera.
1 code implementation • CVPR 2020 • Shiyi Lan, Zhou Ren, Yi Wu, Larry S. Davis, Gang Hua
Object detection is an essential step towards holistic scene understanding.
Ranked #205 on Object Detection on COCO test-dev
3 code implementations • ECCV 2020 • Mo Zhou, Zhenxing Niu, Le Wang, Qilin Zhang, Gang Hua
In this paper, we propose two attacks against deep ranking systems, i.e., Candidate Attack and Query Attack, that can raise or lower the rank of chosen candidates by adversarial perturbations.
1 code implementation • 26 Nov 2019 • Ye Yuan, Wuyang Chen, Tianlong Chen, Yang Yang, Zhou Ren, Zhangyang Wang, Gang Hua
Many real-world applications, such as city-scale traffic monitoring and control, require large-scale re-identification.
2 code implementations • 18 Nov 2019 • Mo Zhou, Zhenxing Niu, Le Wang, Zhanning Gao, Qilin Zhang, Gang Hua
For visual-semantic embedding, the existing methods normally treat the relevance between queries and candidates in a bipolar way -- relevant or irrelevant, and all "irrelevant" candidates are uniformly pushed away from the query by an equal margin in the embedding space, regardless of their varying proximity to the query.
2 code implementations • 17 Nov 2019 • Haichao Yu, Haoxiang Li, Honghui Shi, Thomas S. Huang, Gang Hua
When all layers are set to low-bits, we show that the model achieved accuracy comparable to dedicated models trained at the same precision.
no code implementations • 11 Jul 2019 • Qingnan Fan, Dong-Dong Chen, Lu Yuan, Gang Hua, Nenghai Yu, Baoquan Chen
To overcome this limitation, we propose a new decoupled learning algorithm to learn from the operator parameters to dynamically adjust the weights of a deep network for image operators, denoted as the base network.
no code implementations • 27 Nov 2018 • Wei Tang, John Corring, Ying Wu, Gang Hua
Printed text recognition is an important problem for industrial OCR systems.
Optical Character Recognition (OCR) Printed Text Recognition
1 code implementation • 21 Nov 2018 • Dongdong Chen, Mingming He, Qingnan Fan, Jing Liao, Liheng Zhang, Dongdong Hou, Lu Yuan, Gang Hua
Image dehazing aims to recover the uncorrupted content from a hazy image.
Ranked #1 on Rain Removal on DID-MDN
no code implementations • 27 Sep 2018 • Aijun Bai, Dongdong Chen, Gang Hua, Lu Yuan
Many machine learning systems are implemented as pipelines.
no code implementations • 27 Sep 2018 • Jiahuan Zhou, Nikolaos Karianakis, Ying Wu, Gang Hua
Current Convolutional Neural Network (CNN)-based object detection models adopt strictly feedforward inference to predict the final detection results.
1 code implementation • ECCV 2018 • Dongqing Zhang, Jiaolong Yang, Dongqiangzi Ye, Gang Hua
Although weight and activation quantization is an effective approach for Deep Neural Network (DNN) compression and has considerable potential to increase inference speed by leveraging bit-operations, there is still a noticeable gap in terms of prediction accuracy between the quantized model and the full-precision model.
1 code implementation • ECCV 2018 • Qingnan Fan, Dong-Dong Chen, Lu Yuan, Gang Hua, Nenghai Yu, Baoquan Chen
Many different deep networks have been used to approximate, accelerate or improve traditional image operators, such as image smoothing, super-resolution and denoising.
no code implementations • CVPR 2018 • Jianmin Bao, Dong Chen, Fang Wen, Houqiang Li, Gang Hua
We then recombine the identity vector and the attribute vector to synthesize a new face of the subject with the extracted attribute.
6 code implementations • ECCV 2018 • Kuang-Huei Lee, Xi Chen, Gang Hua, Houdong Hu, Xiaodong He
Prior work either simply aggregates the similarity of all possible pairs of regions and words without attending differentially to more and less important words or regions, or uses a multi-step attentional process to capture a limited number of semantic alignments, which is less interpretable.
Ranked #4 on Image Retrieval on PhotoChat
no code implementations • 19 Mar 2018 • Jinliang Zang, Le Wang, Ziyi Liu, Qilin Zhang, Zhenxing Niu, Gang Hua, Nanning Zheng
Research in human action recognition has accelerated significantly since the introduction of powerful machine learning tools such as Convolutional Neural Networks (CNNs).
no code implementations • CVPR 2018 • Dongdong Chen, Lu Yuan, Jing Liao, Nenghai Yu, Gang Hua
This paper presents the first attempt at stereoscopic neural style transfer, which responds to the emerging demand for 3D movies or AR/VR.
no code implementations • ECCV 2018 • Navaneeth Bodla, Gang Hua, Rama Chellappa
We achieve this by fusing two generators: one for unconditional image generation, and the other for conditional image generation, where the two partly share a common latent space thereby disentangling the generation.
no code implementations • 2 Nov 2017 • Bin Dai, Baoyuan Wang, Gang Hua
Selecting attractive photos from a human action shot sequence is quite challenging, because of the subjective nature of the "attractiveness", which is mainly a combined factor of human pose in action and the background.
no code implementations • ICCV 2017 • Zhenxing Niu, Mo Zhou, Le Wang, Xinbo Gao, Gang Hua
We address the problem of dense visual-semantic embedding that maps not only full sentences and whole images but also phrases within sentences and salient regions within images into a multimodal embedding space.
no code implementations • 14 Aug 2017 • Chen Zhou, Jiaolong Yang, Chunshui Zhao, Gang Hua
This work is devoted to a task that is indispensable for safety yet was largely overlooked in the past -- detecting obstacles that are of very thin structures, such as wires, cables and tree branches.
1 code implementation • ICCV 2017 • Qingnan Fan, Jiaolong Yang, Gang Hua, Baoquan Chen, David Wipf
This paper proposes a deep neural network structure that exploits edge information in addressing representative low-level vision tasks such as layer separation and image filtering.
no code implementations • CVPR 2017 • Bing Su, Gang Hua
We present a new distance measure between sequences that can tackle local temporal distortion and periodic sequences with arbitrary starting points.
no code implementations • CVPR 2017 • Chengjiang Long, Gang Hua
A set of correlational tensors is adopted to model the relationship within a single domain as well as across multiple domains.
no code implementations • CVPR 2017 • Zhanning Gao, Gang Hua, Dong-Qing Zhang, Nebojsa Jojic, Le Wang, Jianru Xue, Nanning Zheng
We develop a unified framework for complex event retrieval, recognition and recounting.
1 code implementation • 16 Jun 2017 • Bin Dai, Yu Wang, John Aston, Gang Hua, David Wipf
Variational autoencoders (VAE) represent a popular, flexible form of deep generative model that can be stochastically fit to samples from a given random process using an information-theoretic variational bound on the true underlying distribution.
5 code implementations • 2 May 2017 • Jing Liao, Yuan YAO, Lu Yuan, Gang Hua, Sing Bing Kang
We propose a new technique for visual attribute transfer across images that may have very different appearance but have perceptually similar semantic structure.
3 code implementations • ICCV 2017 • Jianmin Bao, Dong Chen, Fang Wen, Houqiang Li, Gang Hua
Our approach models an image as a composition of label and latent attributes in a probabilistic model.
no code implementations • ICCV 2017 • Dongdong Chen, Jing Liao, Lu Yuan, Nenghai Yu, Gang Hua
Training a feed-forward network for fast neural style transfer of images has proven to be successful.
1 code implementation • CVPR 2017 • Dongdong Chen, Lu Yuan, Jing Liao, Nenghai Yu, Gang Hua
It also enables us to conduct incremental learning to add a new image style by learning a new filter bank while holding the auto-encoder fixed.
no code implementations • CVPR 2017 • Xiangyu Kong, Bo Xin, Yizhou Wang, Gang Hua
We examine the problem of joint top-down active search of multiple objects under interaction, e.g., person riding a bicycle, cups held by the table, etc.
no code implementations • CVPR 2018 • Qingnan Fan, Jiaolong Yang, Gang Hua, Baoquan Chen, David Wipf
While invaluable for many computer vision applications, decomposing a natural image into intrinsic reflectance and shading layers represents a challenging, underdetermined inverse problem.
no code implementations • 19 Jul 2016 • Dong Chen, Gang Hua, Fang Wen, Jian Sun
For real-time performance, we run the cascaded network only on regions of interest produced from a boosting cascade face detector.
Ranked #5 on Face Detection on PASCAL Face
no code implementations • CVPR 2016 • Haoxiang Li, Jonathan Brandt, Zhe Lin, Xiaohui Shen, Gang Hua
Our new framework enables efficient use of these complementary multi-level contextual cues to improve overall recognition rates on the photo album person recognition task, as demonstrated through state-of-the-art results on a challenging public dataset.
no code implementations • CVPR 2016 • Zhenxing Niu, Mo Zhou, Le Wang, Xinbo Gao, Gang Hua
To address the non-stationary property of aging patterns, age estimation can be cast as an ordinal regression problem.
no code implementations • 5 Apr 2016 • Zhanning Gao, Gang Hua, Dongqing Zhang, Jianru Xue, Nanning Zheng
Event retrieval and recognition in a large corpus of videos necessitates a holistic fixed-size visual representation at the video clip level that is comprehensive, compact, and yet discriminative.
no code implementations • CVPR 2017 • Jiaolong Yang, Peiran Ren, Dong-Qing Zhang, Dong Chen, Fang Wen, Hongdong Li, Gang Hua
The network takes a face video or face image set of a person with a variable number of face images as its input, and produces a compact, fixed-dimension feature representation for recognition.
Ranked #7 on Face Verification on IJB-A
no code implementations • ICCV 2015 • Chengjiang Long, Gang Hua
Based on the EP approximation inference, a generalized Expectation Maximization (GEM) algorithm is derived to estimate both the parameters for instances and the quality of each individual annotator.
no code implementations • ICCV 2015 • Yan Xia, Xudong Cao, Fang Wen, Gang Hua, Jian Sun
We study the problem of automatically removing outliers from noisy data, with application for removing outlier images from an image collection.
no code implementations • CVPR 2015 • Haoxiang Li, Zhe Lin, Xiaohui Shen, Jonathan Brandt, Gang Hua
To improve localization effectiveness, and reduce the number of candidates at later stages, we introduce a CNN-based calibration stage after each of the detection stages in the cascade.
no code implementations • CVPR 2015 • Haoxiang Li, Gang Hua
We apply the PEP model hierarchically to decompose a face image into face parts at different levels of details to build pose-invariant part-based face representations.
no code implementations • CVPR 2015 • Dapeng Chen, Zejian Yuan, Gang Hua, Nanning Zheng, Jingdong Wang
We follow the learning-to-rank methodology and learn a similarity function to maximize the difference between the similarity scores of matched and unmatched images for the same person.
no code implementations • CVPR 2014 • Zhenxing Niu, Gang Hua, Xinbo Gao, Qi Tian
In this way, we can efficiently leverage the loosely related tags, and build an intermediate level representation for a collection of weakly annotated images.
no code implementations • CVPR 2014 • Haoxiang Li, Zhe Lin, Jonathan Brandt, Xiaohui Shen, Gang Hua
Despite the fact that face detection has been studied intensively over the past several decades, the problem is still not completely solved.
no code implementations • CVPR 2014 • Yadong Mu, Gang Hua, Wei Fan, Shih-Fu Chang
This paper presents a novel algorithm which uses compact hash bits to greatly improve the efficiency of non-linear kernel SVM in very large scale visual classification problems.
no code implementations • CVPR 2014 • Wei Liu, Gang Hua, John R. Smith
Outliers are pervasive in many computer vision and pattern recognition problems.
no code implementations • CVPR 2013 • Haoxiang Li, Gang Hua, Zhe Lin, Jonathan Brandt, Jianchao Yang
By augmenting each feature with its location, a Gaussian mixture model (GMM) is trained to capture the spatial-appearance distribution of all face images in the training corpus.
no code implementations • CVPR 2013 • Gangqiang Zhao, Junsong Yuan, Gang Hua
We show that such data driven co-occurrence information from bottom-up can conveniently be incorporated in LDA with a Gaussian Markov prior, which combines top down probabilistic topic modeling with bottom up priors in a unified model.