1 code implementation • 4 Apr 2024 • Shuting He, Henghui Ding
In fact, static cues can sometimes interfere with temporal perception by overshadowing motion cues.
Ranked #1 on Referring Video Object Segmentation on MeViS
no code implementations • 29 Mar 2024 • Yanyan Shao, Shuting He, Qi Ye, Yuchao Feng, Wenhan Luo, Jiming Chen
Tracking by natural language specification (TNL) aims to consistently localize a target in a video sequence given a linguistic description in the initial frame.
no code implementations • 13 Nov 2023 • Shuting He, Hao Luo, Wei Jiang, Xudong Jiang, Henghui Ding
With the help of relational knowledge transfer, VGKT is capable of aligning semantic-group textual features with corresponding visual features without external tools and complex pairwise interaction.
Ranked #6 on Text based Person Retrieval on CUHK-PEDES (using extra training data)
no code implementations • 7 Sep 2023 • Shuting He, Weihua Chen, Kai Wang, Hao Luo, Fan Wang, Wei Jiang, Henghui Ding
Then, to measure the importance of each generated region, we introduce a Region Assessment Module (RAM) that assigns confidence scores to different regions and reduces the negative impact of occluded regions by giving them lower scores.
1 code implementation • 30 Aug 2023 • Shuting He, Henghui Ding, Chang Liu, Xudong Jiang
This dataset encompasses a range of expressions: those referring to multiple targets, those with no specific target, and single-target expressions.
Generalized Referring Expression Comprehension • Referring Expression
1 code implementation • ICCV 2023 • Henghui Ding, Chang Liu, Shuting He, Xudong Jiang, Chen Change Loy
To investigate the feasibility of using motion expressions to ground and segment objects in videos, we propose a large-scale dataset called MeViS, which contains numerous motion expressions to indicate target objects in complex environments.
Ranked #2 on Referring Video Object Segmentation on MeViS
1 code implementation • CVPR 2023 • Shuting He, Henghui Ding, Wei Jiang
The inter-class relationships of semantic-related visual features are then required to be aligned with those in semantic space, thereby transferring semantic knowledge to visual feature learning.
1 code implementation • 23 May 2023 • Shuting He, Xudong Jiang, Wei Jiang, Henghui Ding
In this work, we address the challenging task of few-shot and zero-shot 3D point cloud semantic segmentation.
1 code implementation • CVPR 2023 • Shuting He, Henghui Ding, Wei Jiang
The goal is to rescue novel objects from the background and from dominant seen categories.
1 code implementation • ICCV 2023 • Henghui Ding, Chang Liu, Shuting He, Xudong Jiang, Philip H. S. Torr, Song Bai
However, since the target objects in these existing datasets are usually relatively salient, dominant, and isolated, VOS under complex scenes has rarely been studied.
1 code implementation • 20 May 2021 • Hao Luo, Weihua Chen, Xianzhe Xu, Jianyang Gu, Yuqi Zhang, Chong Liu, Yiqi Jiang, Shuting He, Fan Wang, Hao Li
In this challenge, we mainly focus on four points, i.e., training data, unsupervised domain-adaptive (UDA) training, post-processing, and model ensembling.
4 code implementations • ICCV 2021 • Shuting He, Hao Luo, Pichao Wang, Fan Wang, Hao Li, Wei Jiang
Extracting robust feature representations is one of the key challenges in object re-identification (ReID).
Ranked #1 on Person Re-Identification on Market-1501-C
1 code implementation • 25 Dec 2020 • Jianyang Gu, Hao Luo, Weihua Chen, Yiqi Jiang, Yuqi Zhang, Shuting He, Fan Wang, Hao Li, Wei Jiang
Considering the large gap between the source domain and target domain, we focused on solving two biases that influenced the performance on domain adaptive pedestrian Re-ID and proposed a two-stage training procedure.
2 code implementations • 22 Apr 2020 • Shuting He, Hao Luo, Weihua Chen, Miao Zhang, Yuqi Zhang, Fan Wang, Hao Li, Wei Jiang
Our solution is based on a strong baseline with a bag of tricks (BoT-BS) proposed for person ReID.