1 code implementation • 16 Apr 2024 • Enming Zhang, Bingke Zhu, Yingying Chen, Qinghai Miao, Ming Tang, Jinqiao Wang
This limitation restricts the capabilities of pretrained VLMs and can result in incorrect predictions in downstream tasks.
1 code implementation • 21 Mar 2024 • Zheng Zhang, Yeyao Ma, Enming Zhang, Xiang Bai
PSALM is a powerful extension of the Large Multi-modal Model (LMM) that addresses the challenges of segmentation tasks.
1 code implementation • 14 Dec 2023 • Shuailei Ma, Yuefeng Wang, Ying WEI, Jiaqi Fan, Enming Zhang, Xinyu Sun, Peihao Chen
Ablation experiments demonstrate that both of them are effective in mitigating the impact of open-world knowledge distillation on the learning of known objects.
1 code implementation • 6 Jun 2023 • Wenwen Yu, MingYu Liu, Biao Yang, Enming Zhang, Deqiang Jiang, Xing Sun, Yuliang Liu, Xiang Bai
Text recognition in the wild is a long-standing problem in computer vision.
1 code implementation • 21 Mar 2023 • Shuailei Ma, Yuefeng Wang, Ying WEI, Peihao Chen, Zhixiang Ye, Jiaqi Fan, Enming Zhang, Thomas H. Li
We propose leveraging the VL as the "Brain" of the open-world detector by simply generating unknown labels.
no code implementations • 12 Jul 2022 • Yang Tan, Enming Zhang, Yang Li, Shao-Lun Huang, Xiao-Ping Zhang
We propose two novel transferability metrics, F-OTCE (Fast Optimal Transport based Conditional Entropy) and JC-OTCE (Joint Correspondence OTCE), to evaluate how much the source model (task) can benefit the learning of the target task, and to learn more transferable representations for cross-domain cross-task transfer learning.
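The core idea behind an OTCE-style metric can be sketched in a few lines: couple source and target samples with optimal transport over their features, use the coupling to induce a joint distribution over source and target labels, and score transferability as the negative conditional entropy of target labels given source labels. The sketch below is a minimal NumPy illustration of that idea, not the authors' implementation; the squared-Euclidean cost, the entropic Sinkhorn solver, the regularization value, and the function names `sinkhorn` and `f_otce` are all assumptions made for this example.

```python
import numpy as np

def sinkhorn(a, b, C, reg=0.1, n_iter=200):
    # Entropic-regularized optimal transport (Sinkhorn iterations).
    # a, b: source/target marginal weights; C: cost matrix.
    K = np.exp(-C / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    # Coupling matrix: rows sum to ~a, columns sum to ~b.
    return u[:, None] * K * v[None, :]

def f_otce(Xs, Ys, Xt, Yt):
    # Illustrative transferability score: negative conditional entropy
    # H(Yt | Ys) under the label correspondence induced by the OT plan.
    # Higher (closer to 0) suggests an easier source-to-target transfer.
    C = ((Xs[:, None, :] - Xt[None, :, :]) ** 2).sum(-1)
    C = C / C.max()  # normalize cost for numerical stability
    a = np.full(len(Xs), 1.0 / len(Xs))
    b = np.full(len(Xt), 1.0 / len(Xt))
    P = sinkhorn(a, b, C)
    # Joint label distribution: accumulate coupling mass per label pair.
    cs, ct = int(Ys.max()) + 1, int(Yt.max()) + 1
    J = np.zeros((cs, ct))
    for i in range(len(Xs)):
        for j in range(len(Xt)):
            J[Ys[i], Yt[j]] += P[i, j]
    Py = J.sum(axis=1)  # marginal over source labels
    H = 0.0
    for ys in range(cs):
        for yt in range(ct):
            if J[ys, yt] > 1e-12:
                H -= J[ys, yt] * np.log(J[ys, yt] / Py[ys])
    return -H
```

On toy data, two well-separated clusters with consistent labels on both sides score higher (less negative) than the same features with randomly assigned target labels, matching the intuition that aligned label structure means better transferability.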