no code implementations • 21 Apr 2024 • Bingwen Zhu, Fanyi Wang, Tianyi Lu, Peng Liu, Jingwen Su, Jinxiu Liu, Yanhao Zhang, Zuxuan Wu, Yu-Gang Jiang, Guo-Jun Qi
Image-to-video (I2V) generation aims to create a video sequence from a single image, which requires high temporal coherence and visual fidelity with the source image. However, existing approaches suffer from inconsistent character appearance and poor preservation of fine details.
no code implementations • 9 Dec 2023 • Yuming Qiao, Fanyi Wang, Jingwen Su, Yanhao Zhang, Yunjie Yu, Siyu Wu, Guo-Jun Qi
Image editing approaches with diffusion models have been rapidly developed, yet their applicability is limited by requirements such as specific editing types (e.g., foreground or background object editing, style transfer), multiple conditions (e.g., mask, sketch, caption), and time-consuming fine-tuning of diffusion models.
no code implementations • 30 Nov 2023 • Zhangsihao Yang, Mingyuan Zhou, Mengyi Shan, Bingbing Wen, Ziwei Xuan, Mitch Hill, Junjie Bai, Guo-Jun Qi, Yalin Wang
Our paper aims to generate diverse and realistic animal motion sequences from textual descriptions, without a large-scale animal text-motion dataset.
no code implementations • 2 Sep 2023 • Sanyi Zhang, Xiaochun Cao, Rui Wang, Guo-Jun Qi, Jie Zhou
Experimental results show that the proposed method generalizes well: it improves the robustness of human parsing models, and even semantic segmentation models, against various common image corruptions.
1 code implementation • CVPR 2023 • Tingting Liao, Xiaomei Zhang, Yuliang Xiu, Hongwei Yi, Xudong Liu, Guo-Jun Qi, Yong Zhang, Xuan Wang, Xiangyu Zhu, Zhen Lei
This paper presents a framework for efficient 3D clothed avatar reconstruction.
no code implementations • ICCV 2023 • Xianpeng Liu, Ce Zheng, Kelvin Cheng, Nan Xue, Guo-Jun Qi, Tianfu Wu
Motivated by a new and strong observation that this challenge can be remedied by a 3D-space local-grid search scheme in an ideal case, we propose a stage-wise approach, which combines the information flow from 2D-to-3D (3D bounding box proposal generation with a single 2D image) and 3D-to-2D (proposal verification by denoising with 3D-to-2D contexts) in a top-down manner.
1 code implementation • CVPR 2023 • Ce Zheng, Xianpeng Liu, Guo-Jun Qi, Chen Chen
In this paper, we propose a pure transformer architecture named POoling aTtention TransformER (POTTER) for the HMR task from single images.
Ranked #33 on 3D Human Pose Estimation on 3DPW
no code implementations • 23 Mar 2023 • Ce Zheng, Xianpeng Liu, Mengyuan Liu, Tianfu Wu, Guo-Jun Qi, Chen Chen
While image-based HMR methods have achieved impressive results, they often struggle to recover humans in dynamic scenarios, leading to temporal inconsistencies and non-smooth 3D motion predictions due to the absence of human motion.
Ranked #56 on 3D Human Pose Estimation on 3DPW
1 code implementation • 14 Mar 2023 • Xiao Wang, Ying Wang, Ziwei Xuan, Guo-Jun Qi
A criterion in unsupervised pretraining is that the pretext task must be sufficiently hard to prevent the transformer encoder from learning trivial low-level features that do not generalize well to downstream tasks.
no code implementations • ICCV 2023 • Benzhi Wang, Yang Yang, Jinlin Wu, Guo-Jun Qi, Zhen Lei
On the other hand, for the same person, the similarity between cross-scale images is often lower than that between same-scale images, which increases the difficulty of matching.
no code implementations • 18 Feb 2023 • Na Zhang, Xudong Liu, Xin Li, Guo-Jun Qi
Semantic face image manipulation has received increasing attention in recent years.
no code implementations • 29 Dec 2022 • Wenjie Li, Juncheng Li, Guangwei Gao, Weihong Deng, Jian Yang, Guo-Jun Qi, Chia-Wen Lin
Recently, great progress has been made in single-image super-resolution (SISR) based on deep learning technology.
no code implementations • 23 Oct 2022 • Guo-Jun Qi, Mubarak Shah
In this paper, we review adversarial pretraining of self-supervised deep networks including both convolutional neural networks and vision transformers.
no code implementations • TIP 2022 • Tiantian Geng, Feng Zheng, Xiaorong Hou, Ke Lu, Guo-Jun Qi, Ling Shao
Spatial-temporal relation reasoning is a significant yet challenging problem for video action recognition.
Ranked #35 on Action Recognition on Something-Something V1
1 code implementation • 5 Aug 2022 • Ziteng Cui, Yingying Zhu, Lin Gu, Guo-Jun Qi, Xiaoxiao Li, Renrui Zhang, Zenghui Zhang, Tatsuya Harada
Image restoration algorithms such as super resolution (SR) are indispensable pre-processing modules for object detection in low quality images.
no code implementations • 14 Jul 2022 • Sanyi Zhang, Xiaochun Cao, Guo-Jun Qi, Zhanjie Song, Jie Zhou
Most state-of-the-art instance-level human parsing models adopt two-stage anchor-based detectors and, therefore, cannot avoid the heuristic anchor box design and the lack of analysis on a pixel level.
1 code implementation • 6 Jul 2022 • Wenjie Li, Juncheng Li, Guangwei Gao, Jiantao Zhou, Jian Yang, Guo-Jun Qi
Recently, Transformer-based methods have shown impressive performance in single image super-resolution (SISR) tasks due to their ability to extract global features.
1 code implementation • CVPR 2023 • Ce Zheng, Matias Mendieta, Taojiannan Yang, Guo-Jun Qi, Chen Chen
Recently, vision transformers have shown great success in a set of human reconstruction tasks such as 2D human pose estimation (2D HPE), 3D human pose estimation (3D HPE), and human mesh reconstruction (HMR) tasks.
Ranked #29 on 3D Human Pose Estimation on 3DPW
2 code implementations • ICCV 2021 • Ziteng Cui, Guo-Jun Qi, Lin Gu, ShaoDi You, Zenghui Zhang, Tatsuya Harada
To enhance object detection in a dark environment, we propose a novel multitask auto-encoding transformation (MAET) model which is able to explore the intrinsic pattern behind illumination translation.
Ranked #1 on 2D Object Detection on ExDark
1 code implementation • 19 Apr 2022 • Guangwei Gao, Zixiang Xu, Juncheng Li, Jian Yang, Tieyong Zeng, Guo-Jun Qi
Then, we design an efficient Feature Refinement Module (FRM) to enhance the encoded features.
1 code implementation • 27 Mar 2022 • Xiao Wang, Yuhang Huang, Dan Zeng, Guo-Jun Qi
It trains an encoder by distinguishing positive samples from negative ones given query anchors.
Ranked #65 on Self-Supervised Image Classification on ImageNet
no code implementations • 22 Jan 2022 • Ying Wang, Chiuman Ho, Wenju Xu, Ziwei Xuan, Xudong Liu, Guo-Jun Qi
We propose a Dual-Flattening Transformer (DFlatFormer) to enable high-resolution output by reducing complexity to $\mathcal{O}(hw(H+W))$ that is multiple orders of magnitude smaller than the naive dense transformer.
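The complexity reduction claimed above can be illustrated with a back-of-the-envelope operation count (our illustration, not the DFlatFormer code; the grid sizes are hypothetical):

```python
# Illustrative comparison: cost of dense cross-attention from an h x w
# low-resolution query grid to an H x W high-resolution grid, versus
# attending separately along flattened rows and columns, as the
# O(hw(H+W)) bound suggests.

def dense_attention_cost(h, w, H, W):
    # every query token attends to every high-resolution token
    return (h * w) * (H * W)

def dual_flattened_cost(h, w, H, W):
    # queries attend along flattened rows and columns separately
    return (h * w) * (H + W)

print(dense_attention_cost(32, 32, 512, 512))  # 268435456
print(dual_flattened_cost(32, 32, 512, 512))   # 1048576
```

For a 512x512 output the flattened scheme is two to three orders of magnitude cheaper, consistent with the "multiple orders of magnitude" claim.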
no code implementations • 7 Jan 2022 • Ziteng Cui, Yingying Zhu, Lin Gu, Guo-Jun Qi, Xiaoxiao Li, Peng Gao, Zenghui Zhang, Tatsuya Harada
Image restoration algorithms such as super resolution (SR) are indispensable pre-processing modules for object detection in degraded images.
1 code implementation • 10 Jun 2021 • Rui Wang, Zuxuan Wu, Zejia Weng, Jingjing Chen, Guo-Jun Qi, Yu-Gang Jiang
Unsupervised domain adaptation (UDA) aims to transfer knowledge learned from a fully-labeled source domain to a different unlabeled target domain.
1 code implementation • 25 May 2021 • Xiang Gao, Wei Hu, Guo-Jun Qi
We formalize the proposed model from an information-theoretic perspective, by maximizing the mutual information between topology transformations and node representations before and after the transformations.
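One common way to operationalize such a mutual-information objective is to train a decoder to recognize which transformation was applied from the representations before and after it; the cross-entropy below lower-bounds the mutual information up to a constant. This is a loose NumPy sketch of that general recipe (our illustration, not the authors' code; all names are hypothetical):

```python
import numpy as np

def transformation_decoder_loss(z_before, z_after, t, W):
    """Cross-entropy of predicting transformation class t from (z_before, z_after)."""
    pair = np.concatenate([z_before, z_after], axis=1)  # (N, 2d) paired representations
    logits = pair @ W                                   # (N, K) scores per class
    logits -= logits.max(axis=1, keepdims=True)         # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(t)), t].mean()

rng = np.random.default_rng(0)
N, d, K = 8, 4, 3                        # samples, feature dim, transformation classes
z_b = rng.normal(size=(N, d))            # node representations before transformation
z_a = rng.normal(size=(N, d))            # node representations after transformation
t = rng.integers(0, K, size=N)           # which topology transformation was applied
W = rng.normal(size=(2 * d, K))          # linear decoder weights
loss = transformation_decoder_loss(z_b, z_a, t, W)
print(loss > 0.0)                        # cross-entropy is non-negative
```

Minimizing this loss over the encoder tightens the bound, encouraging representations that retain transformation information.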
1 code implementation • 15 Apr 2021 • Xiao Wang, Guo-Jun Qi
Thus, we propose a general framework called Contrastive Learning with Stronger Augmentations (CLSA) to complement current contrastive learning approaches.
no code implementations • 25 Mar 2021 • Guangwei Gao, Yi Yu, Jian Yang, Guo-Jun Qi, Meng Yang
(i) To learn more robust and discriminative features, we aim to adaptively fuse the contextual features from different layers.
no code implementations • 1 Mar 2021 • Xiang Gao, Wei Hu, Guo-Jun Qi
Then, we self-train a representation to capture the intrinsic 3D object representation by decoding 3D transformation parameters from the fused feature representations of multiple views before and after the transformation.
no code implementations • 3 Feb 2021 • Liangxi Liu, Xi Jiang, Feng Zheng, Hong Chen, Guo-Jun Qi, Heng Huang, Ling Shao
On the client side, a prior loss that uses the global posterior probabilistic parameters delivered from the server is designed to guide the local training.
no code implementations • 1 Jan 2021 • Xiang Gao, Wei Hu, Guo-Jun Qi
We formalize the TopoTER from an information-theoretic perspective, by maximizing the mutual information between topology transformations and node representations before and after the transformations.
2 code implementations • CVPR 2021 • Qianjiang Hu, Xiao Wang, Wei Hu, Guo-Jun Qi
Contrastive learning relies on constructing a collection of negative examples that are sufficiently hard to discriminate against positive queries when their representations are self-trained.
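The role of negative hardness can be seen in a generic InfoNCE loss (a minimal sketch of contrastive learning in general, not the paper's implementation; data and names are ours):

```python
import numpy as np

def info_nce(query, positive, negatives, tau=0.1):
    """InfoNCE loss for one query against a positive key and a bank of negatives."""
    q = query / np.linalg.norm(query)
    k = positive / np.linalg.norm(positive)
    neg = negatives / np.linalg.norm(negatives, axis=1, keepdims=True)
    logits = np.concatenate([[q @ k], neg @ q]) / tau  # positive first, then negatives
    logits -= logits.max()                             # numerical stability
    return -np.log(np.exp(logits[0]) / np.exp(logits).sum())

rng = np.random.default_rng(0)
q = rng.normal(size=16)
pos = q + 0.1 * rng.normal(size=16)                  # augmented view of the query
easy = rng.normal(size=(8, 16))                      # random (easy) negatives
hard = q[None, :] + 0.3 * rng.normal(size=(8, 16))   # negatives near the query (hard)
print(info_nce(q, pos, easy) < info_nce(q, pos, hard))  # hard negatives cost more
```

Harder negatives sit closer to the query, raise the loss, and therefore supply the stronger gradients that the snippet above alludes to.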
2 code implementations • ICCV 2021 • Si Chen, Mostafa Kahla, Ruoxi Jia, Guo-Jun Qi
We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data.
no code implementations • 27 Jul 2020 • Haohang Xu, Hongkai Xiong, Guo-Jun Qi
In this paper, we propose the $K$-Shot Contrastive Learning (KSCL) of visual features by applying multiple augmentations to investigate the sample variations within individual instances.
no code implementations • 9 May 2020 • Weiyao Lin, Huabin Liu, Shizhan Liu, Yuxi Li, Rui Qian, Tao Wang, Ning Xu, Hongkai Xiong, Guo-Jun Qi, Nicu Sebe
To this end, we present a new large-scale dataset with comprehensive annotations, named Human-in-Events or HiEve (Human-centric video analysis in complex Events), for the understanding of human motions, poses, and actions in a variety of realistic events, especially in crowd & complex events.
1 code implementation • 9 Jan 2020 • Mingxing Xu, Wenrui Dai, Chunmiao Liu, Xing Gao, Weiyao Lin, Guo-Jun Qi, Hongkai Xiong
In this paper, we propose a novel paradigm of Spatial-Temporal Transformer Networks (STTNs) that leverages dynamical directed spatial dependencies and long-range temporal dependencies to improve the accuracy of long-term traffic forecasting.
no code implementations • 29 Dec 2019 • Haohang Xu, Hongkai Xiong, Guo-Jun Qi
To this end, we present a novel regularization mechanism by learning the change of feature representations induced by a distribution of transformations without using the labels of data examples.
2 code implementations • 21 Nov 2019 • Xiao Wang, Daisuke Kihara, Jiebo Luo, Guo-Jun Qi
In this study, we propose a new EnAET framework to further improve existing semi-supervised methods with self-supervised information.
Ranked #1 on Semi-Supervised Image Classification on STL-10
1 code implementation • CVPR 2020 • Xiang Gao, Wei Hu, Guo-Jun Qi
Recent advances in Graph Convolutional Neural Networks (GCNNs) have shown their efficiency for non-Euclidean data on graphs, which often require a large amount of labeled data with high cost.
no code implementations • 16 Nov 2019 • Feng Lin, Haohang Xu, Houqiang Li, Hongkai Xiong, Guo-Jun Qi
For this reason, we should use the geodesic to characterize how an image transforms along the manifold of a transformation group, and adopt its length to measure the deviation between transformations.
no code implementations • 25 Oct 2019 • Yiheng Liu, Wengang Zhou, Jianzhuang Liu, Guo-Jun Qi, Qi Tian, Houqiang Li
By presenting a target attention loss, the pedestrian features extracted from the foreground branch become more insensitive to the backgrounds, which greatly reduces the negative impact of changing backgrounds on matching an identical person across different camera views.
1 code implementation • 25 Oct 2019 • Qiaokang Xie, Wengang Zhou, Guo-Jun Qi, Qi Tian, Houqiang Li
In our approach, we first collect tracklet data within each camera by automatic person detection and tracking.
no code implementations • 29 Sep 2019 • Xiangbo Shu, Liyan Zhang, Guo-Jun Qi, Wei Liu, Jinhui Tang
To this end, we propose a novel Skeleton-joint Co-attention Recurrent Neural Networks (SC-RNN) to capture the spatial coherence among joints, and the temporal evolution among skeletons simultaneously on a skeleton-joint co-attention feature map in spatiotemporal space.
8 code implementations • ICLR 2020 • Yuhui Xu, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Guo-Jun Qi, Qi Tian, Hongkai Xiong
Differentiable architecture search (DARTS) provided a fast solution in finding effective network architectures, but suffered from large memory and computing overheads in jointly training a super-network and searching for an optimal architecture.
Ranked #20 on Neural Architecture Search on CIFAR-10
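The continuous relaxation at the heart of DARTS can be illustrated with a toy mixed operation: each edge computes a softmax-weighted mixture of candidate operations, so the architecture weights alpha become differentiable. This is a sketch of the general idea, not the PC-DARTS code; the tiny operation set is ours:

```python
import numpy as np

def mixed_op(x, alpha):
    """Softmax-weighted mixture of candidate operations on an edge."""
    ops = [lambda v: v,                  # identity
           lambda v: np.maximum(v, 0),  # relu
           lambda v: np.zeros_like(v)]  # the "none" operation
    w = np.exp(alpha - alpha.max())
    w = w / w.sum()                      # softmax over candidate ops
    return sum(wi * op(x) for wi, op in zip(w, ops))

x = np.array([-1.0, 2.0])
alpha = np.zeros(3)                      # uniform mixture before any search
print(mixed_op(x, alpha))                # average of identity, relu, and zero
```

After search, the operation with the largest alpha on each edge is kept, discretizing the mixture back into a concrete architecture.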
no code implementations • 19 Jun 2019 • Guo-Jun Qi, Liheng Zhang, Xiao Wang
Transformation Equivariant Representations (TERs) aim to capture the intrinsic visual structures that equivary to various transformations by expanding the notion of translation equivariance underlying the success of Convolutional Neural Networks (CNNs).
no code implementations • 9 May 2019 • Naifan Zhuang, Guo-Jun Qi, The Duc Kieu, Kien A. Hua
The Long Short-Term Memory (LSTM) recurrent neural network is capable of processing complex sequential information since it utilizes special gating schemes for learning representations from long input sequences.
no code implementations • 27 Mar 2019 • Guo-Jun Qi, Jiebo Luo
Representation learning with small labeled data has emerged in many problems, since the success of deep neural networks often relies on the availability of a huge amount of labeled data that is expensive to collect.
1 code implementation • ICCV 2019 • Guo-Jun Qi, Liheng Zhang, Chang Wen Chen, Qi Tian
This ensures the resultant TERs of individual images contain the intrinsic information about their visual structures that would equivary extricably under various transformations in a generalized nonlinear case.
no code implementations • 15 Feb 2019 • Hao Hu, Liqiang Wang, Guo-Jun Qi
Recent advancements in recurrent neural network (RNN) research have demonstrated the superiority of utilizing multiscale structures in learning temporal representations of time series.
1 code implementation • CVPR 2019 • Liheng Zhang, Guo-Jun Qi, Liqiang Wang, Jiebo Luo
The success of deep neural networks often relies on a large amount of labeled examples, which can be difficult to obtain in many real scenarios.
no code implementations • 11 Dec 2018 • Yunxiao Qin, WeiGuo Zhang, Chenxu Zhao, Zezheng Wang, Xiangyu Zhu, Guo-Jun Qi, Jingping Shi, Zhen Lei
In this paper, inspired by the human cognition process which utilizes both prior-knowledge and vision attention in learning new knowledge, we present a novel paradigm of meta-learning approach with three developments to introduce attention mechanism and prior-knowledge for meta-learning.
1 code implementation • 13 Nov 2018 • Zezheng Wang, Chenxu Zhao, Yunxiao Qin, Qiusheng Zhou, Guo-Jun Qi, Jun Wan, Zhen Lei
Face anti-spoofing is significant to the security of face recognition systems.
no code implementations • 1 Nov 2018 • Xiangbo Shu, Jinhui Tang, Guo-Jun Qi, Wei Liu, Jian Yang
In a Co-LSTM unit, each sub-memory unit stores individual motion information, while this Co-LSTM unit selectively integrates and stores inter-related motion information between multiple interacting persons from multiple sub-memory units via the cell gate and co-memory cell, respectively.
Ranked #1 on Human Interaction Recognition on UT
no code implementations • ECCV 2018 • Marzieh Edraki, Guo-Jun Qi
Such a manifold assumption suggests the distance over the manifold should be a better measure to characterize the distinction between real and fake samples.
no code implementations • ECCV 2018 • Yiru Zhao, Zhongming Jin, Guo-Jun Qi, Hongtao Lu, Xian-Sheng Hua
While deep neural networks have demonstrated competitive results for many visual recognition and image retrieval tasks, the major challenge lies in distinguishing similar images from different categories (i.e., hard negative examples) while clustering images with large variations from the same category (i.e., hard positive examples).
no code implementations • 8 Jun 2018 • Xiangyu Zhu, Hao Liu, Zhen Lei, Hailin Shi, Fan Yang, Dong Yi, Guo-Jun Qi, Stan Z. Li
In this paper, we propose a deep learning based large-scale bisample learning (LBL) method for IvS face recognition.
no code implementations • CVPR 2018 • Guotian Xie, Jingdong Wang, Ting Zhang, Jian-Huang Lai, Richang Hong, Guo-Jun Qi
In this paper, we study the problem of designing efficient convolutional neural network architectures with the interest in eliminating the redundancy in convolution kernels.
no code implementations • 25 May 2018 • Kristjan Arumae, Guo-Jun Qi, Fei Liu
Asking effective questions is a powerful social skill.
no code implementations • 20 May 2018 • Muhammad Abdullah Jamal, Guo-Jun Qi, Mubarak Shah
Meta-learning approaches have been proposed to tackle the few-shot learning problem. Typically, a meta-learner is trained on a variety of tasks in the hope that it generalizes to new tasks.
no code implementations • NeurIPS 2018 • Liheng Zhang, Marzieh Edraki, Guo-Jun Qi
In this paper, we formalize the idea behind capsule nets of using a capsule vector rather than a neuron activation to predict the label of samples.
no code implementations • 7 May 2018 • Lu Jin, Xiangbo Shu, Kai Li, Zechao Li, Guo-Jun Qi, Jinhui Tang
However, most existing deep hashing methods directly learn the hash functions by encoding the global semantic information, while ignoring the local spatial information of images.
no code implementations • 7 May 2018 • Chen Shen, Guo-Jun Qi, Rongxin Jiang, Zhongming Jin, Hongwei Yong, Yaowu Chen, Xian-Sheng Hua
In this paper, we present novel sharp attention networks by adaptively sampling feature maps from convolutional neural networks (CNNs) for person re-identification (re-ID) problem.
2 code implementations • 17 Apr 2018 • Guotian Xie, Jingdong Wang, Ting Zhang, Jian-Huang Lai, Richang Hong, Guo-Jun Qi
In this paper, we study the problem of designing efficient convolutional neural network architectures with the interest in eliminating the redundancy in convolution kernels.
no code implementations • 11 Apr 2018 • Naifan Zhuang, The Duc Kieu, Guo-Jun Qi, Kien A. Hua
The proposed model progressively builds up the ability of the LSTM gates to detect salient dynamical patterns in deeper stacked layers modeling higher orders of DoS, and thus the proposed LSTM model is termed deep differential Recurrent Neural Network (d2RNN).
2 code implementations • CVPR 2018 • Guo-Jun Qi, Liheng Zhang, Hao Hu, Marzieh Edraki, Jingdong Wang, Xian-Sheng Hua
In this paper, we present a novel localized Generative Adversarial Net (GAN) to learn on the manifold of real data.
no code implementations • ICCV 2017 • Ting Zhang, Guo-Jun Qi, Bin Xiao, Jingdong Wang
The main point lies in a novel building block, a pair of two successive interleaved group convolutions: primary group convolution and secondary group convolution.
1 code implementation • 13 Aug 2017 • Liheng Zhang, Charu Aggarwal, Guo-Jun Qi
Then the future stock prices are predicted as a nonlinear mapping of the combination of these components in an Inverse Fourier Transform (IFT) fashion.
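The decomposition-and-recombination idea can be sketched with a plain FFT (our toy illustration, not the paper's model; the price series is made up): the series is split into frequency components, and keeping only low frequencies before inverting recovers a smooth trend component.

```python
import numpy as np

prices = np.array([10.0, 10.5, 10.2, 10.8, 10.6, 11.0, 10.9, 11.3])
spectrum = np.fft.fft(prices)           # decompose into frequency components

k = 2                                   # keep the two lowest frequencies
low = spectrum.copy()
low[k:len(low) - k + 1] = 0             # zero out high-frequency components
trend = np.fft.ifft(low).real           # inverse FFT: the slow-moving trend

print(np.allclose(np.fft.ifft(spectrum).real, prices))  # True: IFT inverts FFT
```

In the paper's setting the components are learned and recombined nonlinearly rather than simply truncated, but the inverse-Fourier-style recombination is the same basic shape.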
2 code implementations • ICML 2017 • Hao Hu, Guo-Jun Qi
Modeling temporal sequences plays a fundamental role in various modern applications and has drawn increasing attention in the machine learning community.
2 code implementations • 10 Jul 2017 • Ting Zhang, Guo-Jun Qi, Bin Xiao, Jingdong Wang
The main point lies in a novel building block, a pair of two successive interleaved group convolutions: primary group convolution and secondary group convolution.
no code implementations • 3 Jun 2017 • Xiangbo Shu, Jinhui Tang, Guo-Jun Qi, Yan Song, Zechao Li, Liyan Zhang
To this end, we propose a novel Concurrence-Aware Long Short-Term Sub-Memories (Co-LSTSM) to model the long-term inter-related dynamics between two interacting people on the bounding boxes covering people.
Ranked #2 on Human Interaction Recognition on BIT
no code implementations • 22 Mar 2017 • Guo-Jun Qi, Wei Liu, Charu Aggarwal, Thomas Huang
One of our goals in this paper is to develop a model for revealing the functional relationships between text and image features, so as to directly transfer intermodal and intramodal labels to annotate the images.
1 code implementation • 23 Jan 2017 • Guo-Jun Qi
In this paper, we present the Lipschitz regularization theory and algorithms for a novel Loss-Sensitive Generative Adversarial Network (LS-GAN).
Ranked #43 on Image Classification on SVHN
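The loss-sensitive objective can be sketched as a data-dependent margin constraint: the critic should assign real samples a loss smaller than fakes by a margin that depends on how far apart they are, with violations penalized by a hinge. This is our paraphrase of the LS-GAN idea, not the released code; all values below are illustrative:

```python
import numpy as np

def ls_gan_critic_loss(L_real, L_fake, margin, lam=1.0):
    """Loss-sensitive critic objective: low loss on real data plus hinged
    penalties when a fake's loss fails to exceed the real loss by the margin."""
    hinge = np.maximum(0.0, margin + L_real - L_fake)  # margin violations
    return L_real.mean() + lam * hinge.mean()

L_real = np.array([0.2, 0.1])    # critic loss on real samples (should be low)
L_fake = np.array([1.5, 2.0])    # critic loss on fakes (should be high)
margin = np.array([1.0, 1.0])    # e.g. a distance between real/fake pairs
print(ls_gan_critic_loss(L_real, L_fake, margin))  # 0.15: no margin violated
```

Because the margin shrinks as fakes approach real data, the critic's capacity is focused where it matters, which is what the Lipschitz regularization theory makes precise.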
no code implementations • CVPR 2016 • Guo-Jun Qi
While image structures usually have various scales, it is difficult to use a single scale to model the spatial contexts for all individual pixels.
no code implementations • 6 Jun 2015 • Jun Ye, Hao Hu, Kai Li, Guo-Jun Qi, Kien A. Hua
With the prevalence of commodity depth cameras, the new paradigm of user interfaces based on 3D motion capture and recognition has dramatically changed the way humans interact with computers.
no code implementations • CVPR 2015 • Ting Zhang, Guo-Jun Qi, Jinhui Tang, Jingdong Wang
The benefit is that the distance evaluation between the query and the dictionary element (a sparse vector) is accelerated using efficient sparse vector operations, thus greatly reducing the cost of computing the distance table.
no code implementations • ICCV 2015 • Vivek Veeriah, Naifan Zhuang, Guo-Jun Qi
This change in information gain is quantified by Derivative of States (DoS), and thus the proposed LSTM model is termed as differential Recurrent Neural Network (dRNN).
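A first-order Derivative of States can be sketched as the framewise difference of the hidden-state sequence, with the largest change flagging the most salient transition (a hedged illustration of the idea, not the dRNN code; the state values are ours):

```python
import numpy as np

def derivative_of_states(states):
    """First-order DoS: framewise change of the hidden-state sequence."""
    return states[1:] - states[:-1]

h = np.array([[0.0, 0.0],
              [0.1, 0.0],
              [0.9, 0.5],   # a salient, fast-changing frame
              [1.0, 0.6]])
dos = derivative_of_states(h)
salient = int(np.argmax(np.linalg.norm(dos, axis=1)))
print(salient)  # 1: the largest state change is into the third frame
```

In the model itself, such derivatives feed the LSTM gates so that frames carrying large information gain open the gates more widely.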
no code implementations • 19 Mar 2015 • Kai Li, Guo-Jun Qi, Jun Ye, Kien A. Hua
In this work, we propose a novel hash learning framework that encodes feature's rank orders instead of numeric values in a number of optimal low-dimensional ranking subspaces.
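The rank-order encoding can be illustrated with random subspaces in place of the learned optimal ones (our simplification, not the paper's method): project a feature into several low-dimensional subspaces and record the index of the largest coordinate in each, so the code depends only on rank orders and is invariant to positive scaling.

```python
import numpy as np

rng = np.random.default_rng(0)

def rank_hash(x, projections):
    """One code symbol per subspace: argmax position within the projection."""
    return np.array([int(np.argmax(P @ x)) for P in projections])

d, m, k = 16, 4, 3                          # feature dim, subspaces, subspace dim
projections = [rng.normal(size=(k, d)) for _ in range(m)]
x = rng.normal(size=d)
print(np.array_equal(rank_hash(x, projections),
                     rank_hash(2.5 * x, projections)))  # True: scale-invariant
```

This scale invariance is one reason rank orders can be more robust than numeric values as hash inputs; the paper additionally learns the subspaces to make the codes discriminative.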