1 code implementation • 25 Apr 2024 • Sifan Long, Linbin Wang, Zhen Zhao, Zichang Tan, Yiming Wu, Shengsheng Wang, Jingdong Wang
In light of this, we propose Training-Free Unsupervised Prompts (TFUP), which maximally preserves the inherent representation capabilities and enhances them with a residual connection to similarity-based prediction probabilities in a training-free and labeling-free manner.
no code implementations • 19 Apr 2024 • Jingqun Tang, Chunhui Lin, Zhen Zhao, Shu Wei, Binghong Wu, Qi Liu, Hao Feng, Yang Li, Siqi Wang, Lei Liao, Wei Shi, Yuliang Liu, Hao Liu, Yuan Xie, Xiang Bai, Can Huang
Text-centric visual question answering (VQA) has made great strides with the development of Multimodal Large Language Models (MLLMs), yet open-source models still fall short of leading models like GPT4V and Gemini, partly due to a lack of extensive, high-quality instruction tuning data.
no code implementations • 17 Jan 2024 • Kun Wu, Ning Liu, Zhen Zhao, Di Qiu, Jinming Li, Zhengping Che, Zhiyuan Xu, Qinru Qiu, Jian Tang
Imitation learning (IL), aiming to learn optimal control policies from expert demonstrations, has been an effective method for robot manipulation tasks.
no code implementations • 17 Jan 2024 • Haowen Wang, Zhen Zhao, Zhao Jin, Zhengping Che, Liang Qiao, Yakun Huang, Zhipeng Fan, XIUQUAN QIAO, Jian Tang
Reconstructing real-world objects and estimating their movable joint structures are pivotal technologies within the field of robotics.
3 code implementations • 19 Dec 2023 • Yue Duan, Zhen Zhao, Lei Qi, Luping Zhou, Lei Wang, Yinghuan Shi
While semi-supervised learning (SSL) has yielded promising results, the more realistic SSL scenario remains to be explored, in which the unlabeled data exhibits extremely high recognition difficulty, e.g., fine-grained visual classification in the context of SSL (SS-FGVC).
1 code implementation • 29 Nov 2023 • Zhen Zhao, Zicheng Wang, Longyue Wang, Yixuan Yuan, Luping Zhou
To mitigate the confirmation bias from the diverse supervision, the core of AD-MT lies in two proposed modules: the Random Periodic Alternate (RPA) Updating Module and the Conflict-Combating Module (CCM).
1 code implementation • 28 Nov 2023 • Zicheng Wang, Zhen Zhao, Erjian Guo, Luping Zhou
Current methods focusing on medical image segmentation suffer from incorrect annotations, which is known as the noisy label issue.
1 code implementation • 27 Nov 2023 • Zicheng Wang, Zhen Zhao, Yiming Wu, Luping Zhou, Dong Xu
Unlike previous works that focus on feature extractor adaptation, our PTSFA approach focuses on classifier adaptation.
no code implementations • 25 Nov 2023 • Zhanyu Wang, Longyue Wang, Zhen Zhao, Minghao Wu, Chenyang Lyu, Huayang Li, Deng Cai, Luping Zhou, Shuming Shi, Zhaopeng Tu
While the recent advances in Multimodal Large Language Models (MLLMs) constitute a significant leap forward in the field, these models are predominantly confined to the realm of input-side multimodal comprehension, lacking the capacity for multimodal content generation.
1 code implementation • 22 Nov 2023 • Zhen Zhao, Jingqun Tang, Chunhui Lin, Binghong Wu, Can Huang, Hao Liu, Xin Tan, Zhizhong Zhang, Yuan Xie
A straightforward solution is performing model fine-tuning tailored to a specific scenario, but it is computationally intensive and requires multiple model copies for various scenarios.
no code implementations • 2 Oct 2023 • Zhenhua Xu, Yujia Zhang, Enze Xie, Zhen Zhao, Yong Guo, Kwan-Yee K. Wong, Zhenguo Li, Hengshuang Zhao
Multimodal large language models (MLLMs) have emerged as a prominent area of interest within the research community, given their proficiency in handling and reasoning with non-textual data, including images and videos.
1 code implementation • ICCV 2023 • Guan Gui, Zhen Zhao, Lei Qi, Luping Zhou, Lei Wang, Yinghuan Shi
Sample adaptive augmentation (SAA) is proposed for this stated purpose and consists of two modules: 1) sample selection module; 2) sample augmentation module.
1 code implementation • 23 Aug 2023 • Zhen Zhao, Ye Liu, Meng Zhao, Di Yin, Yixuan Yuan, Luping Zhou
Studies on semi-supervised medical image segmentation (SSMIS) have seen fast progress recently.
2 code implementations • ICCV 2023 • Yue Duan, Zhen Zhao, Lei Qi, Luping Zhou, Lei Wang, Yinghuan Shi
Semi-supervised learning (SSL) tackles the missing-label problem by enabling the effective use of unlabeled data.
1 code implementation • ICCV 2023 • Lihe Yang, Zhen Zhao, Lei Qi, Yu Qiao, Yinghuan Shi, Hengshuang Zhao
To mitigate potentially incorrect pseudo labels, recent frameworks mostly set a fixed confidence threshold to discard uncertain samples.
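The fixed-threshold filtering described in this entry can be illustrated with a minimal sketch. It is not taken from the paper or its code: the function name `pseudo_labels_with_threshold`, the default threshold of 0.95, and the toy probabilities are all assumptions chosen for illustration.

```python
from typing import List, Optional


def pseudo_labels_with_threshold(
    probs: List[List[float]], threshold: float = 0.95
) -> List[Optional[int]]:
    """Return the argmax class per sample, or None when the top
    probability falls below the (hypothetical) fixed threshold."""
    labels: List[Optional[int]] = []
    for p in probs:
        confidence = max(p)
        # Keep the pseudo label only if the model is confident enough.
        labels.append(p.index(confidence) if confidence >= threshold else None)
    return labels


# Toy class-probability vectors for three unlabeled samples (assumed values).
predictions = [
    [0.97, 0.02, 0.01],  # confident: kept as class 0
    [0.50, 0.30, 0.20],  # uncertain: discarded
    [0.03, 0.01, 0.96],  # confident: kept as class 2
]
kept = pseudo_labels_with_threshold(predictions, threshold=0.95)
```

Samples mapped to `None` contribute nothing to the unsupervised loss under this scheme, which is the cost of a fixed threshold that the entry alludes to.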
no code implementations • 4 Aug 2023 • Haowen Wang, Zhipeng Fan, Zhen Zhao, Zhengping Che, Zhiyuan Xu, Dong Liu, Feifei Feng, Yakun Huang, XIUQUAN QIAO, Jian Tang
We introduce a pose regression module that shares the deformation features and template codes from the fields to estimate the accurate 6D pose of each object in the scene.
no code implementations • ICCV 2023 • Sifan Long, Zhen Zhao, Junkun Yuan, Zichang Tan, JiangJiang Liu, Luping Zhou, Shengsheng Wang, Jingdong Wang
A contrastive loss is employed to align such augmented text and image representations on downstream tasks.
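The entry does not say which contrastive loss is used. As one possible reading, the sketch below implements a symmetric InfoNCE loss of the kind popularized by CLIP-style models; the function names, the temperature of 0.07, and the two-dimensional toy features are assumptions, and feature vectors are assumed to be nonzero.

```python
import math
from typing import List

Vector = List[float]


def _normalize(v: Vector) -> Vector:
    """Scale a nonzero vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def _cross_entropy_diagonal(logits: List[List[float]]) -> float:
    """Mean cross-entropy where the target of row i is column i."""
    loss = 0.0
    for i, row in enumerate(logits):
        shift = max(row)  # subtract the max for numerical stability
        log_sum = shift + math.log(sum(math.exp(x - shift) for x in row))
        loss += log_sum - row[i]
    return loss / len(logits)


def contrastive_loss(
    image_feats: List[Vector], text_feats: List[Vector], temperature: float = 0.07
) -> float:
    """Symmetric InfoNCE: pair i of (image, text) is the positive,
    every other pairing in the batch is a negative."""
    imgs = [_normalize(v) for v in image_feats]
    txts = [_normalize(v) for v in text_feats]
    # Cosine similarities scaled by the temperature.
    logits = [
        [sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
        for i in imgs
    ]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-image direction
    return 0.5 * (_cross_entropy_diagonal(logits) + _cross_entropy_diagonal(logits_t))


# Matched pairs give a low loss; swapped pairs give a high one.
aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```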
1 code implementation • CVPR 2023 • Zicheng Wang, Zhen Zhao, Xiaoxia Xing, Dong Xu, Xiangyu Kong, Luping Zhou
In this work, we propose a new conflict-based cross-view consistency (CCVC) method based on a two-branch co-training framework which aims at enforcing the two sub-nets to learn informative features from irrelevant views.
no code implementations • CVPR 2023 • Zhen Zhao, Zhizhong Zhang, Xin Tan, Jun Liu, Yanyun Qu, Yuan Xie, Lizhuang Ma
In this paper, we propose a space decoupling (SD) algorithm to decouple the feature space into a pair of complementary subspaces, i.e., the stability space I and the plasticity space R. I is established by conducting space intersection between the historic and current feature space, and thus I contains more task-shared bases.
1 code implementation • CVPR 2023 • Zhen Zhao, Lihe Yang, Sifan Long, Jimin Pi, Luping Zhou, Jingdong Wang
Differently, in this work, we follow a standard teacher-student framework and propose AugSeg, a simple and clean approach that focuses mainly on data perturbations to boost the SSS performance.
1 code implementation • CVPR 2023 • Zhen Zhao, Sifan Long, Jimin Pi, Jingdong Wang, Luping Zhou
Relying on the model's performance, iMAS employs a class-weighted symmetric intersection-over-union to evaluate quantitative hardness of each unlabeled instance and supervises the training on unlabeled data in a model-adaptive manner.
1 code implementation • CVPR 2023 • Sifan Long, Zhen Zhao, Jimin Pi, Shengsheng Wang, Jingdong Wang
In this paper, we emphasize the cruciality of diverse global semantics and propose an efficient token decoupling and merging method that can jointly consider the token importance and diversity for token pruning.
Ranked #4 on Efficient ViTs on ImageNet-1K (with DeiT-T)
3 code implementations • 27 Mar 2022 • Yue Duan, Zhen Zhao, Lei Qi, Lei Wang, Luping Zhou, Yinghuan Shi, Yang Gao
The core issue in semi-supervised learning (SSL) lies in how to effectively leverage unlabeled data, whereas most existing methods tend to put a great emphasis on the utilization of high-confidence samples yet seldom fully explore the usage of low-confidence samples.
1 code implementation • 22 Feb 2022 • Zhen Zhao, Yuqiu Liu, Gang Zhang, Liang Tang, Xiaolin Hu
This report introduces our solution to the iFLYTEK challenge 2021 on cultivated land extraction from high-resolution remote sensing images.
no code implementations • CVPR 2022 • Zhen Zhao, Luping Zhou, Yue Duan, Lei Wang, Lei Qi, Yinghuan Shi
Consistency-based semi-supervised learning (SSL) has achieved promising performance recently.
no code implementations • 14 Nov 2020 • Zhen Zhao, Yuhong Guo, Jieping Ye
Recently the problem of cross-domain object detection has started drawing attention in the computer vision community.
no code implementations • ECCV 2020 • Zhen Zhao, Miaojing Shi, Xiaoxiao Zhao, Li Li
To learn a reliable people counter from crowd images, head center annotations are normally required.
no code implementations • 8 Jun 2020 • Zhen Zhao, Bingyu Liu, Yuhong Guo, Jieping Ye
In this paper, we present our proposed ensemble model with batch spectral regularization and data blending mechanisms for the Track 2 problem of the cross-domain few-shot learning (CD-FSL) challenge.
no code implementations • 18 May 2020 • Bingyu Liu, Zhen Zhao, Zhenpeng Li, Jianan Jiang, Yuhong Guo, Jieping Ye
In this paper, we propose a feature transformation ensemble model with batch spectral regularization for the cross-domain few-shot learning (CD-FSL) challenge.
no code implementations • ECCV 2020 • Zhen Zhao, Yuhong Guo, Haifeng Shen, Jieping Ye
In this paper, we propose a novel end-to-end unsupervised deep domain adaptation model for adaptive object detection by exploiting multi-label object recognition as a dual auxiliary task.
no code implementations • 29 Mar 2020 • Zhenpeng Li, Zhen Zhao, Yuhong Guo, Haifeng Shen, Jieping Ye
However, in practice the labeled data can come from multiple source domains with different distributions.
no code implementations • 15 Apr 2019 • Zhen Zhao, Ashley Kleinhans, Gursharan Sandhu, Ishan Patel, K. P. Unnikrishnan
Afterward, the routing coefficients associated with the training examples are accumulated offline and used to create a set of "master" routing coefficients.
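The excerpt does not specify how the per-example coefficients are accumulated. A minimal sketch, assuming the accumulation is a plain element-wise mean over training examples (the function name and the toy values are hypothetical):

```python
from typing import List


def master_routing_coefficients(per_example: List[List[float]]) -> List[float]:
    """Average routing coefficients element-wise across training examples.

    Assumption: every example supplies a coefficient vector of the same
    length, and a simple mean stands in for the paper's accumulation step.
    """
    n = len(per_example)
    width = len(per_example[0])
    totals = [0.0] * width
    for coeffs in per_example:
        for j, c in enumerate(coeffs):
            totals[j] += c
    return [t / n for t in totals]


# Coefficients from one lower-level capsule for two training examples (assumed).
examples = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]]
master = master_routing_coefficients(examples)
```

If each per-example vector sums to one, the mean also sums to one, so the result can be used directly as a fixed set of coefficients under this assumption.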
no code implementations • 22 Mar 2019 • Zhen Zhao, Ashley Kleinhans, Gursharan Sandhu, Ishan Patel, K. P. Unnikrishnan
Capsule Networks (CapsNet) use the Softmax function to convert the logits of the routing coefficients into a set of normalized values that signify the assignment probabilities between capsules in adjacent layers.
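The normalization step described here can be sketched as a softmax over the routing logits of a single lower-level capsule. The function name `routing_softmax` and the example logits are assumptions for illustration, not code from the paper.

```python
import math
from typing import List


def routing_softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over one capsule's routing logits.

    The outputs are non-negative and sum to one, so they can be read as
    assignment probabilities to the capsules in the next layer.
    """
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(b - shift) for b in logits]
    total = sum(exps)
    return [e / total for e in exps]


# With all logits at zero (the usual starting point for routing),
# the assignment is uniform across the three hypothetical parent capsules.
coupling = routing_softmax([0.0, 0.0, 0.0])
```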
no code implementations • 10 Aug 2018 • Chenyu You, Guang Li, Yi Zhang, Xiaoliu Zhang, Hongming Shan, Shenghong Ju, Zhen Zhao, Zhuiyang Zhang, Wenxiang Cong, Michael W. Vannier, Punam K. Saha, Ge Wang
Specifically, with the generative adversarial network (GAN) as the building block, we enforce the cycle-consistency in terms of the Wasserstein distance to establish a nonlinear end-to-end mapping from noisy LR input images to denoised and deblurred HR outputs.
no code implementations • 2 May 2018 • Chenyu You, Qingsong Yang, Hongming Shan, Lars Gjesteby, Guang Li, Shenghong Ju, Zhuiyang Zhang, Zhen Zhao, Yi Zhang, Wenxiang Cong, Ge Wang
However, the radiation dose reduction compromises the signal-to-noise ratio (SNR), leading to strong noise and artifacts that degrade CT image quality.