1 code implementation • 22 Apr 2024 • Yuying Ge, Sijie Zhao, Jinguo Zhu, Yixiao Ge, Kun Yi, Lin Song, Chen Li, Xiaohan Ding, Ying Shan
We hope that our work will inspire future research into what can be achieved by versatile multimodal foundation models in real-world applications.
1 code implementation • 30 Jan 2024 • Tianheng Cheng, Lin Song, Yixiao Ge, Wenyu Liu, Xinggang Wang, Ying Shan
The You Only Look Once (YOLO) series of detectors have established themselves as efficient and practical tools.
2 code implementations • 27 Nov 2023 • Xiaohan Ding, Yiyuan Zhang, Yixiao Ge, Sijie Zhao, Lin Song, Xiangyu Yue, Ying Shan
1) We propose four architectural guidelines for designing large-kernel ConvNets, the core of which is to exploit the essential characteristics of large kernels that distinguish them from small kernels - they can see wide without going deep.
Ranked #1 on Object Detection on COCO 2017 (mAP metric)
1 code implementation • NeurIPS 2023 • Cheng Cheng, Lin Song, Ruoyi Xue, Hang Wang, Hongbin Sun, Yixiao Ge, Ying Shan
Without bells and whistles, our approach outperforms the state-of-the-art online few-shot learning method by an average of 3.6% on eight image classification datasets with higher inference speed.
1 code implementation • 8 Oct 2023 • Ronghao Dang, Jiangyan Feng, Haodong Zhang, Chongjian Ge, Lin Song, Lijun Gong, Chengju Liu, Qijun Chen, Feng Zhu, Rui Zhao, Yibing Song
To cover common detection expressions, we use an emerging vision-language model (VLM) and a large language model (LLM) to generate instructions guided by text prompts and object bounding boxes, since the generalization ability of foundation models makes them effective at producing human-like expressions (e.g., describing object property, category, and relationship).
no code implementations • 30 Jun 2023 • Weixin Mao, Jinrong Yang, Zheng Ge, Lin Song, HongYu Zhou, Tiezheng Mao, Zeming Li, Osamu Yoshie
In light of the success of sample mining techniques in 2D object detection, we propose a simple yet effective mining strategy for improving depth perception in 3D object detection.
no code implementations • 12 Jun 2023 • Sijie Zhao, Yixiao Ge, Zhongang Qi, Lin Song, Xiaohan Ding, Zehua Xie, Ying Shan
Therefore, we propose StickerCLIP as a benchmark model on the Sticker820K dataset.
1 code implementation • NeurIPS 2023 • Rui Yang, Lin Song, Yanwei Li, Sijie Zhao, Yixiao Ge, Xiu Li, Ying Shan
This paper aims to efficiently enable Large Language Models (LLMs) to use multimodal tools.
1 code implementation • ICCV 2023 • Rui Yang, Lin Song, Yixiao Ge, Xiu Li
Box-supervised instance segmentation has gained much attention as it requires only simple box annotations instead of costly mask or polygon annotations.
1 code implementation • NeurIPS 2021 • Lin Song, Songyang Zhang, Songtao Liu, Zeming Li, Xuming He, Hongbin Sun, Jian Sun, Nanning Zheng
Specifically, we propose a Dynamic Grained Encoder for vision transformers, which can adaptively assign a suitable number of queries to each spatial region.
no code implementations • 8 Oct 2022 • Lin Song, Pan Zhao, Neng Wan, Naira Hovakimyan
This paper presents a novel approach for achieving safe stochastic optimal control in networked multi-agent systems (MASs).
1 code implementation • 22 Jul 2022 • Jinrong Yang, Lin Song, Songtao Liu, Weixin Mao, Zeming Li, Xiaoping Li, Hongbin Sun, Jian Sun, Nanning Zheng
Many point-based 3D detectors adopt point-feature sampling strategies to drop some points for efficient inference.
no code implementations • 8 Apr 2022 • Neng Wan, Dapeng Li, Lin Song, Naira Hovakimyan
A simplified analysis is performed on the Bode-type filtering sensitivity trade-off integrals, which capture the sensitivity characteristics of the estimate and estimation error with respect to the process input and estimated signal in continuous- and discrete-time linear time-invariant filtering systems.
no code implementations • 21 Sep 2021 • Lin Song, Neng Wan, Aditya Gahlawat, Chuyuan Tao, Naira Hovakimyan, Evangelos A. Theodorou
The control action composition is achieved by taking a weighted mixture of the existing controllers according to the contribution of each component task.
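As a rough illustration of the weighted-mixture idea in the sentence above, the following minimal sketch blends the outputs of existing controllers using normalized weights. The function name `compose_control`, the fixed caller-supplied weights, and the two linear feedback controllers are assumptions made for this example only; the paper may derive the weights and the mixture differently.

```python
import numpy as np

def compose_control(controllers, weights, state):
    """Blend the actions of existing controllers with normalized weights.

    Hypothetical helper: `weights` are assumed to be given by the caller
    and to reflect the contribution of each component task.
    """
    w = np.asarray(weights, dtype=float)
    if w.ndim != 1 or len(w) != len(controllers):
        raise ValueError("need exactly one weight per controller")
    if np.any(w < 0) or w.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")
    w = w / w.sum()  # normalize so the mixture is a convex combination
    # Evaluate every component controller at the same state.
    actions = np.stack([np.asarray(c(state), dtype=float) for c in controllers])
    # Weighted sum over the controller axis.
    return np.tensordot(w, actions, axes=1)

# Two illustrative component controllers (simple proportional feedback).
controller_a = lambda x: -1.0 * x
controller_b = lambda x: -3.0 * x

state = np.array([2.0, -1.0])
mixed_action = compose_control([controller_a, controller_b], [1.0, 1.0], state)
```

With equal weights, the blended action is the average of the two component actions, so no controller has to be recomputed from scratch for the composite task.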
1 code implementation • 27 Jul 2021 • Songyang Zhang, Lin Song, Songtao Liu, Zheng Ge, Zeming Li, Xuming He, Jian Sun
In this report, we introduce our real-time 2D object detection system for the realistic autonomous driving scenario.
1 code implementation • NeurIPS 2020 • Lin Song, Yanwei Li, Zhengkai Jiang, Zeming Li, Xiangyu Zhang, Hongbin Sun, Jian Sun, Nanning Zheng
The Learnable Tree Filter presents a remarkable approach to model structure-preserving relations for semantic segmentation.
1 code implementation • NeurIPS 2020 • Lin Song, Yanwei Li, Zhengkai Jiang, Zeming Li, Hongbin Sun, Jian Sun, Nanning Zheng
To this end, we propose a fine-grained dynamic head that conditionally selects a pixel-level combination of FPN features from different scales for each instance, which further unleashes the capability of multi-scale feature representation.
1 code implementation • CVPR 2021 • JianFeng Wang, Lin Song, Zeming Li, Hongbin Sun, Jian Sun, Nanning Zheng
Mainstream object detectors based on fully convolutional networks have achieved impressive performance.
no code implementations • 28 Sep 2020 • Lin Song, Neng Wan, Aditya Gahlawat, Naira Hovakimyan, Evangelos A. Theodorou
The proposed approach achieves both the compositionality and optimality of control actions simultaneously within the cooperative MAS framework in both discrete- and continuous-time in a sample-efficient manner, which reduces the burden of re-computation of the optimal control solutions for the new task on the MASs.
no code implementations • 8 Sep 2020 • Aditya Gahlawat, Arun Lakshmanan, Lin Song, Andrew Patterson, Zhuohuan Wu, Naira Hovakimyan, Evangelos Theodorou
We present $\mathcal{CL}_1$-$\mathcal{GP}$, a control framework that enables safe simultaneous learning and control for systems subject to uncertainties.
1 code implementation • CVPR 2020 • Yanwei Li, Lin Song, Yukang Chen, Zeming Li, Xiangyu Zhang, Xingang Wang, Jian Sun
To demonstrate the superiority of the dynamic property, we compare with several static architectures, which can be modeled as special cases in the routing space.
1 code implementation • NeurIPS 2019 • Lin Song, Yanwei Li, Zeming Li, Gang Yu, Hongbin Sun, Jian Sun, Nanning Zheng
To this end, tree filtering modules are embedded to formulate a unified framework for semantic segmentation.
no code implementations • CVPR 2019 • Lin Song, Shiwei Zhang, Gang Yu, Hongbin Sun
In this paper, we define these ambiguous samples as "transitional states", and propose a Transition-Aware Context Network (TACNet) to distinguish transitional states.
Ranked #7 on Action Detection on J-HMDB