no code implementations • 11 Jan 2024 • Xiaoyan Yu, Neng Dong, Liehuang Zhu, Hao Peng, Dapeng Tao
Additionally, acknowledging the complementary nature of semantic details across different modalities, we integrate text features from the bimodal language descriptions to achieve comprehensive semantics.
1 code implementation • 7 Nov 2023 • Neng Dong, Shuanglin Yan, Hao Tang, Jinhui Tang, Liyan Zhang
Moreover, as multiple images with the same identity are not accessible in the testing stage, we devise an Information Propagation (IP) mechanism to distill knowledge from the comprehensive representation to that of a single occluded image.
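The paper's exact Information Propagation mechanism is not reproduced here; as a generic illustration of distilling a comprehensive (multi-image) representation into that of a single occluded image, a feature-mimicking loss might look like the following sketch (function and variable names are hypothetical, not from the paper):

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Normalize feature vectors to unit length along the last axis."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def propagation_loss(single_feat, comprehensive_feat):
    """Pull the single occluded image's feature (student) toward the
    comprehensive multi-image feature (teacher, treated as fixed)."""
    s = l2_normalize(single_feat)
    t = l2_normalize(comprehensive_feat)
    # mean squared distance between the normalized features
    return float(np.mean(np.sum((s - t) ** 2, axis=-1)))
```

With this kind of loss, the teacher branch (which sees multiple images of the identity) is only needed at training time; at test time the student encodes a single occluded image on its own.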
no code implementations • 17 Oct 2023 • Shuanglin Yan, Neng Dong, Jun Liu, Liyan Zhang, Jinhui Tang
Since the support set is unavailable during inference, we propose to distill the knowledge learned by the "richer" model into a lightweight model for inference with a single image/text as input.
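The specific distillation objective used to transfer the "richer" model's knowledge is not detailed in this snippet; a standard soft-label (Hinton-style) distillation loss, which such a scheme could plausibly build on, can be sketched as follows (a generic illustration, not the paper's method):

```python
import numpy as np

def softmax(z, t=1.0):
    """Temperature-scaled softmax, numerically stabilized."""
    z = z / t
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, temperature=4.0):
    """KL divergence from the teacher's softened distribution to the
    student's, scaled by T^2 as in standard knowledge distillation."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1)
    return float(np.mean(kl) * temperature ** 2)
```

At inference the lightweight student then runs on a single image/text input without the support set the teacher required.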
1 code implementation • 14 Jul 2023 • Neng Dong, Liyan Zhang, Shuanglin Yan, Hao Tang, Jinhui Tang
Occlusion perturbation presents a significant challenge in person re-identification (re-ID): existing methods that rely on external visual cues require additional computational resources and consider only the missing information caused by occlusion.
1 code implementation • 19 Oct 2022 • Shuanglin Yan, Neng Dong, Liyan Zhang, Jinhui Tang
Secondly, cross-grained feature refinement (CFR) and fine-grained correspondence discovery (FCD) modules are proposed to establish cross-grained and fine-grained interactions between the modalities; these filter out non-modality-shared image patches/words and mine cross-modal correspondences from coarse to fine.