1 code implementation • 27 Feb 2024 • Yanpeng Sun, Jiahui Chen, Shan Zhang, Xinyu Zhang, Qiang Chen, Gang Zhang, Errui Ding, Jingdong Wang, Zechao Li
In this paper, we propose a novel Visual Reference Prompt (VRP) encoder that empowers the Segment Anything Model (SAM) to utilize annotated reference images as prompts for segmentation, creating the VRP-SAM model.
1 code implementation • 16 Dec 2023 • Kaiyou Song, Shan Zhang, Tong Wang
In this study, inspired by human beings' way of grasping an image, i.e., focusing on the main object first, we present a semantic-aware autoregressive image modeling (SemAIM) method to tackle this challenge.
1 code implementation • CVPR 2023 • Kaiyou Song, Jin Xie, Shan Zhang, Zimeng Luo
Different from existing SSL-KD methods that transfer knowledge from a static pre-trained teacher to a student, in MOKD, two different models learn collaboratively in a self-supervised manner.
no code implementations • 14 Mar 2023 • Steven Shaw, Kanishka Tyagi, Shan Zhang
Many radar signal processing methodologies are being developed for critical road safety perception tasks.
no code implementations • ICCV 2023 • Jinhao Du, Shan Zhang, Qiang Chen, Haifeng Le, Yanpeng Sun, Yao Ni, Jian Wang, Bin He, Jingdong Wang
To provide precise information for the query image, the prototype is decoupled into task-specific ones, which provide tailored guidance for 'where to look' and 'what to look for', respectively.
no code implementations • ICCV 2023 • Kaiyou Song, Shan Zhang, Zihao An, Zimeng Luo, Tong Wang, Jin Xie
In contrastive self-supervised learning, the common way to learn discriminative representation is to pull different augmented "views" of the same image closer while pushing all other images further apart, which has been proven to be effective.
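The pull/push objective described above is often written as an InfoNCE-style loss. The sketch below is a minimal illustration under that assumption, not the method of the listed paper; the function names, the cosine similarity choice, and the temperature value are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor embedding (illustrative only).

    `positive` is another augmented view of the same image;
    `negatives` are embeddings of other images.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    # The loss shrinks as the positive pair becomes more similar
    # relative to the negative pairs.
    return -math.log(pos / (pos + neg))


anchor = [1.0, 0.0]
aligned = info_nce(anchor, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
misaligned = info_nce(anchor, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
print(aligned < misaligned)
```

The loss is lower when the second view lies close to the anchor and the other images lie far from it, which is the behavior the sentence above describes.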
no code implementations • arXiv 2022 • Qiang Chen, Jian Wang, Chuchu Han, Shan Zhang, Zexian Li, Xiaokang Chen, Jiahui Chen, Xiaodi Wang, Shuming Han, Gang Zhang, Haocheng Feng, Kun Yao, Junyu Han, Errui Ding, Jingdong Wang
The training process consists of self-supervised pretraining and finetuning a ViT-Huge encoder on ImageNet-1K, pretraining the detector on Object365, and finally finetuning it on COCO.
Ranked #8 on Object Detection on COCO test-dev
1 code implementation • 30 Oct 2022 • Shan Zhang, Naila Murray, Lei Wang, Piotr Koniusz
To address these drawbacks, we propose a Time-rEversed diffusioN tEnsor Transformer (TENET), which i) forms high-order tensor representations that capture multi-way feature occurrences that are highly discriminative, and ii) uses a transformer that dynamically extracts correlations between the query image and the entire support set, instead of a single average-pooled support embedding.
2 code implementations • ICCV 2023 • Qiang Chen, Xiaokang Chen, Jian Wang, Shan Zhang, Kun Yao, Haocheng Feng, Junyu Han, Errui Ding, Gang Zeng, Jingdong Wang
Detection transformer (DETR) relies on one-to-one assignment, assigning one ground-truth object to one prediction, for end-to-end detection without NMS post-processing.
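As a rough illustration of one-to-one assignment, the sketch below matches each ground-truth object to a distinct prediction by minimizing a total matching cost. It uses brute-force search over permutations as a stand-in for the Hungarian algorithm that is typically used in practice; the function name and the cost values are hypothetical.

```python
from itertools import permutations


def one_to_one_assign(cost):
    """Return (gt_index, pred_index) pairs with minimum total cost.

    cost[g][p] is an assumed matching cost between ground-truth g and
    prediction p. Every ground truth gets exactly one prediction, and no
    prediction is used twice. Brute force: small inputs only.
    """
    num_gt = len(cost)
    num_pred = len(cost[0]) if cost else 0
    best_total, best_match = float("inf"), ()
    # Enumerate every way to pick num_gt distinct predictions in order.
    for preds in permutations(range(num_pred), num_gt):
        total = sum(cost[g][p] for g, p in enumerate(preds))
        if total < best_total:
            best_total, best_match = total, preds
    return list(enumerate(best_match))


# Two ground-truth objects, three predictions (illustrative costs).
cost = [
    [0.9, 0.1, 0.5],
    [0.2, 0.8, 0.7],
]
print(one_to_one_assign(cost))
```

Predictions left unmatched (index 2 here) would be treated as background, which is why no NMS step is needed to remove duplicates.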
no code implementations • 20 Jun 2022 • Jinghang Lin, Shan Zhang, Qing Lu
Transfer learning has emerged as a powerful technique in many application problems, such as computer vision and natural language processing.
no code implementations • 27 Apr 2022 • Shan Zhang, Tianyi Wu, Sitong Wu, Guodong Guo
In this work, we effectively integrate the context and affinity information via the proposed novel Context and Affinity Transformer (CATrans) in a hierarchical architecture.
no code implementations • 17 Mar 2022 • Shan Zhang, Pranay Sharma, Baocheng Geng, Pramod K. Varshney
To achieve greater sensor transmission and estimation efficiency, we propose a two-step group-based collaborative distributed estimation scheme, where in the first step, sensors form dependence-driven groups such that sensors in the same group are highly dependent, while sensors from different groups are independent, and perform a copula-based maximum a posteriori probability (MAP) estimation via intragroup collaboration.
1 code implementation • 22 Jan 2022 • Junjie Wang, Feng Gao, Junyu Dong, Shan Zhang, Qian Du
Synthetic aperture radar (SAR) image change detection is a vital yet challenging task in the field of remote sensing image analysis.
no code implementations • CVPR 2022 • Shan Zhang, Lei Wang, Naila Murray, Piotr Koniusz
We design a Kernelized Few-shot Object Detector by leveraging kernelized matrices computed over multiple proposal regions, which yield expressive non-linear representations whose model complexity is learned on the fly.
no code implementations • 25 Jun 2018 • Kush R. Varshney, Prashant Khanduri, Pranay Sharma, Shan Zhang, Pramod K. Varshney
Such arguments, however, fail to acknowledge that the overall decision-making system is composed of two entities: the learned model and a human who fuses together model outputs with his or her own information.