1 code implementation • 31 Mar 2024 • Xiaorui Huang, Gen Luo, Chaoyang Zhu, Bo Tong, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji
Recently, the Segment Anything Model (SAM) has become a research hotspot in the fields of multimedia and computer vision, exhibiting powerful yet versatile capabilities on various (un)conditional image segmentation tasks.
1 code implementation • 18 Jul 2023 • Chaoyang Zhu, Long Chen
By "open-vocabulary", we mean that the models can classify objects beyond pre-defined categories.
3 code implementations • 30 Mar 2022 • Chaoyang Zhu, Yiyi Zhou, Yunhang Shen, Gen Luo, Xingjia Pan, Mingbao Lin, Chao Chen, Liujuan Cao, Xiaoshuai Sun, Rongrong Ji
In this paper, we propose a simple yet universal network termed SeqTR for visual grounding tasks, e.g., phrase localization, referring expression comprehension (REC) and segmentation (RES).
1 code implementation • ICCV 2021 • Yiyi Zhou, Tianhe Ren, Chaoyang Zhu, Xiaoshuai Sun, Jianzhuang Liu, Xinghao Ding, Mingliang Xu, Rongrong Ji
Due to the superior ability of global dependency modeling, Transformer and its variants have become the primary choice of many vision-and-language tasks.
no code implementations • 17 Jun 2017 • Zhiqiang Zeng, Jian Zhang, Xiaodong Wang, Yuming Chen, Chaoyang Zhu
Place recognition is one of the most fundamental topics in computer vision and robotics communities, where the task is to accurately and efficiently recognize the location of a given query image.