1 code implementation • 29 May 2024 • Zifan Song, Yudong Wang, Wenwei Zhang, Kuikun Liu, Chengqi Lyu, Demin Song, Qipeng Guo, Hang Yan, Dahua Lin, Kai Chen, Cairong Zhao
Open-source Large Language Models (LLMs) and their specialized variants, particularly Code LLMs, have recently delivered impressive performance.
no code implementations • 3 Feb 2024 • Ran Miao, Xueyu Chen, Liang Hu, Zhifei Zhang, Minghua Wan, Qi Zhang, Cairong Zhao
Patent documents in the patent database (PatDB) are crucial for research, development, and innovation as they contain valuable technical information.
1 code implementation • 31 Jan 2024 • Shuguang Dou, Xiangyang Jiang, Yuanpeng Tu, Junyao Gao, Zefan Qu, Qingsong Zhao, Cairong Zhao
Unlike mainstream approaches using global features for simultaneous multi-task learning of ReID and human parsing, or relying on semantic information for attention guidance, DROP argues that the inferior performance of the former is due to distinct granularity requirements for ReID and human parsing features.
1 code implementation • 11 Dec 2023 • Yubin Wang, Xinyang Jiang, De Cheng, Dongsheng Li, Cairong Zhao
To address this limitation and prioritize harnessing structured knowledge, this paper advocates for leveraging LLMs to build a graph for each description to model the entities and attributes describing the category, as well as their correlations.
Ranked #1 on Prompt Engineering on ImageNet V2
no code implementations • 22 Nov 2023 • Zefan Qu, Xinyang Jiang, Yifan Yang, Dongsheng Li, Cairong Zhao
To the best of our knowledge, we are the first to exploit the LUT structure to extract temporal information in video tasks.
1 code implementation • IEEE Transactions on Image Processing 2023 • Cairong Zhao, Zefan Qu, Xinyang Jiang, Yuanpeng Tu, Xiang Bai
To address these challenges, we propose a novel Content-Adaptive Auto-Occlusion Network (CAAO) that dynamically selects the proper occlusion region of an image based on its content and the current training status.
1 code implementation • IEEE Transactions on Multimedia 2023 • Tianli Sun, Haonan Chen, Guosheng Hu, Lianghua He, Cairong Zhao
In addition, we demonstrate the utilization of visualization results in three ways: (1) we visualize attention with respect to connectionist temporal classification (CTC) loss to train an ASR model with adversarial attention erasing regularization, which effectively decreases the word error rate (WER) of the model and improves its generalization capability.
1 code implementation • IEEE Transactions on Information Forensics and Security 2023 • Cairong Zhao, Chutian Wang, Guosheng Hu, Haonan Chen, Chun Liu, Jinhui Tang
To address these two challenges, in this paper, we propose an Interpretable Spatial-Temporal Video Transformer (ISTVT), which consists of a novel decomposed spatial-temporal self-attention and a self-subtract mechanism to capture spatial artifacts and temporal inconsistency for robust Deepfake detection.
1 code implementation • IEEE Transactions on Image Processing 2022 • Shuguang Dou, Cairong Zhao, Xinyang Jiang, Shanshan Zhang, Wei-Shi Zheng, WangMeng Zuo
Most supervised methods propose to train an extra human parsing model aside from the ReID model with cross-domain human parts annotation, suffering from expensive annotation cost and domain gap; unsupervised methods integrate a feature clustering-based human parsing process into the ReID model, but the lack of supervision signals leads to less satisfactory segmentation results.
Ranked #3 on Person Re-Identification on Occluded-DukeMTMC
1 code implementation • 8 Dec 2022 • Cairong Zhao, Yubin Wang, Xinyang Jiang, Yifei Shen, Kaitao Song, Dongsheng Li, Duoqian Miao
Prompt learning is one of the most effective and trending ways to adapt powerful vision-language foundation models like CLIP to downstream datasets by tuning learnable prompt vectors with very few samples.
Ranked #4 on Prompt Engineering on Caltech-101
1 code implementation • 20 Nov 2022 • Wenli Sun, Xinyang Jiang, Shuguang Dou, Dongsheng Li, Duoqian Miao, Cheng Deng, Cairong Zhao
Instead of learning fixed triggers for the target classes from the training set, DT-IBA can dynamically generate new triggers for any unknown identities.
no code implementations • 23 Aug 2022 • Boshen Zhang, Yuxi Li, Yuanpeng Tu, Jinlong Peng, Yabiao Wang, Cunlin Wu, Yang Xiao, Cairong Zhao
Specifically, for the clean set, we deliberately design a memory-based modulation scheme to dynamically adjust the contribution of each sample in terms of its historical credibility sequence during training, thus alleviating the effect from noisy samples incorrectly grouped into the clean set.
no code implementations • 15 Jul 2022 • Shuguang Dou, Xinyang Jiang, Qingsong Zhao, Dongsheng Li, Cairong Zhao
In this paper, we aim to develop a technique that can achieve a good trade-off between privacy protection and data usability for person ReID.
1 code implementation • IEEE Transactions on Circuits and Systems for Video Technology 2022 • Cairong Zhao, Zhicheng Chen, Shuguang Dou, Zefan Qu, Jiawei Yao, Jun Wu, Duoqian Miao
For human-introduced noise, we propose a noise-discovery and noise-suppression training process for mislabeling robust person search.
no code implementations • 2 Mar 2022 • Qingsong Zhao, Yi Wang, Shuguang Dou, Chen Gong, Yin Wang, Cairong Zhao
Regarding this hypothesis, we propose a novel regularization to improve discriminative learning.
no code implementations • 21 Feb 2022 • Qingsong Zhao, Yi Wang, Zhipeng Zhou, Duoqian Miao, LiMin Wang, Yu Qiao, Cairong Zhao
Flattening is essential in computer vision: it converts multi-dimensional feature maps or images into one-dimensional vectors.
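As a rough illustration of the flattening operation described in this abstract excerpt, the sketch below flattens a small channel × height × width feature map into a one-dimensional list in row-major order. It is not taken from the paper: the function names `flatten_feature_map` and `flat_index` and the toy values are assumptions made for illustration only.

```python
def flatten_feature_map(fmap):
    """Flatten a nested C x H x W list into a 1-D list (row-major order)."""
    return [value for channel in fmap for row in channel for value in row]


def flat_index(c, h, w, H, W):
    """Position of element (c, h, w) in the flattened vector (hypothetical helper)."""
    return c * H * W + h * W + w


# Toy feature map with 2 channels, each 2 x 2 (values are illustrative).
fmap = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]

flat = flatten_feature_map(fmap)
position = flat_index(1, 0, 1, H=2, W=2)
```

Here `flat[position]` and `fmap[1][0][1]` refer to the same element, which shows that flattening keeps every value but drops the explicit spatial layout.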
2 code implementations • IEEE Transactions on Image Processing 2021 • Cairong Zhao, Yuanpeng Tu, Zhihui Lai, Fumin Shen, Heng Tao Shen, Duoqian Miao
Moreover, a novel iterative asymmetric mutual training strategy (IAMT) is proposed to alleviate drawbacks of common mutual learning, which can continuously refine the discriminative regions for SSB and extract regularized dark knowledge for two models as well.
1 code implementation • IEEE Transactions on Circuits and Systems for Video Technology 2021 • Shaowei Hou, Cairong Zhao, Zhicheng Chen, Jun Wu, Zhihua Wei, Duoqian Miao
Our method achieves comparable performance on two benchmarks, CUHK-SYSU and PRW, and achieves 91.96% of mAP and 93.34% of rank-1 accuracy on CUHK-SYSU.
1 code implementation • IEEE Transactions on Image Processing 2021 • Cairong Zhao, Xinbi Lv, Shuguang Dou, Shanshan Zhang, Jun Wu, Liang Wang
The adversarial suppression branch, embedded with two occlusion suppression modules, minimizes the generated occlusion’s response and strengthens attentive feature representation on human non-occluded body regions.
Ranked #6 on Person Re-Identification on Occluded-DukeMTMC
1 code implementation • IEEE Transactions on Multimedia 2020 • Cairong Zhao, Xinbi Lv, Zhang Zhang, WangMeng Zuo, Jun Wu, Duoqian Miao
The extraction of robust feature representations from pedestrian images through CNNs with a single deterministic pooling operation is problematic as the features in real pedestrian images are complex and diverse.