no code implementations • ICCV 2023 • Jiajin Tang, Ge Zheng, Sibei Yang
Furthermore, to explicitly capture object motions and spatial-temporal cross-modal reasoning over objects, we propose a novel temporal collection-distribution mechanism for interaction between the global referent token and object queries.
no code implementations • ICCV 2023 • Jiajin Tang, Ge Zheng, Jingyi Yu, Sibei Yang
The challenge is that the object categories relevant to the task are too diverse to be limited to the closed object vocabulary of traditional object detection.
1 code implementation • CVPR 2023 • Jiajin Tang, Ge Zheng, Cheng Shi, Sibei Yang
Referring image segmentation aims to segment the target referent in an image conditioned on a natural language expression.