Search Results for author: Lin Song

Found 23 papers, 15 papers with code

SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation

1 code implementation • 22 Apr 2024 • Yuying Ge, Sijie Zhao, Jinguo Zhu, Yixiao Ge, Kun Yi, Lin Song, Chen Li, Xiaohan Ding, Ying Shan

We hope that our work will inspire future research into what can be achieved by versatile multimodal foundation models in real-world applications.

Image Generation

240

YOLO-World: Real-Time Open-Vocabulary Object Detection

1 code implementation • 30 Jan 2024 • Tianheng Cheng, Lin Song, Yixiao Ge, Wenyu Liu, Xinggang Wang, Ying Shan

The You Only Look Once (YOLO) series of detectors have established themselves as efficient and practical tools.

Instance Segmentation Language Modelling +4

3,543

UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition

2 code implementations • 27 Nov 2023 • Xiaohan Ding, Yiyuan Zhang, Yixiao Ge, Sijie Zhao, Lin Song, Xiangyu Yue, Ying Shan

We propose four architectural guidelines for designing large-kernel ConvNets, the core of which is to exploit the essential characteristics of large kernels that distinguish them from small kernels: they can see wide without going deep.

Ranked #1 on Object Detection on COCO 2017 (mAP metric)

Image Classification Object Detection +3

831

Meta-Adapter: An Online Few-shot Learner for Vision-Language Model

1 code implementation • NeurIPS 2023 • Cheng Cheng, Lin Song, Ruoyi Xue, Hang Wang, Hongbin Sun, Yixiao Ge, Ying Shan

Without bells and whistles, our approach outperforms the state-of-the-art online few-shot learning method by an average of 3.6% on eight image classification datasets with higher inference speed.

Few-Shot Learning Image Classification +3

InstructDET: Diversifying Referring Object Detection with Generalized Instructions

1 code implementation • 8 Oct 2023 • Ronghao Dang, Jiangyan Feng, Haodong Zhang, Chongjian Ge, Lin Song, Lijun Gong, Chengju Liu, Qijun Chen, Feng Zhu, Rui Zhao, Yibing Song

To encompass common detection expressions, we use an emerging vision-language model (VLM) and large language model (LLM) to generate instructions guided by text prompts and object bounding boxes, since the generalization ability of foundation models is effective at producing human-like expressions (e.g., describing object property, category, and relationship).

Language Modelling Large Language Model +4

GMM: Delving into Gradient Aware and Model Perceive Depth Mining for Monocular 3D Detection

no code implementations • 30 Jun 2023 • Weixin Mao, Jinrong Yang, Zheng Ge, Lin Song, HongYu Zhou, Tiezheng Mao, Zeming Li, Osamu Yoshie

In light of the success of sample mining techniques in 2D object detection, we propose a simple yet effective mining strategy for improving depth perception in 3D object detection.

3D Object Detection Depth Estimation +3

Sticker820K: Empowering Interactive Retrieval with Stickers

no code implementations • 12 Jun 2023 • Sijie Zhao, Yixiao Ge, Zhongang Qi, Lin Song, Xiaohan Ding, Zehua Xie, Ying Shan

Therefore, we propose StickerCLIP as a benchmark model on the Sticker820K dataset.

Image Retrieval Retrieval

GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction

1 code implementation • NeurIPS 2023 • Rui Yang, Lin Song, Yanwei Li, Sijie Zhao, Yixiao Ge, Xiu Li, Ying Shan

This paper aims to efficiently enable Large Language Models (LLMs) to use multimodal tools.

Image Generation Instruction Following +3

728

BoxSnake: Polygonal Instance Segmentation with Box Supervision

1 code implementation • ICCV 2023 • Rui Yang, Lin Song, Yixiao Ge, Xiu Li

Box-supervised instance segmentation has gained much attention as it requires only simple box annotations instead of costly mask or polygon annotations.

Box-supervised Instance Segmentation Segmentation +1

Dynamic Grained Encoder for Vision Transformers

1 code implementation • NeurIPS 2021 • Lin Song, Songyang Zhang, Songtao Liu, Zeming Li, Xuming He, Hongbin Sun, Jian Sun, Nanning Zheng

Specifically, we propose a Dynamic Grained Encoder for vision transformers, which can adaptively assign a suitable number of queries to each spatial region.

Image Classification Language Modelling +2

Safety Embedded Stochastic Optimal Control of Networked Multi-Agent Systems via Barrier States

no code implementations • 8 Oct 2022 • Lin Song, Pan Zhao, Neng Wan, Naira Hovakimyan

This paper presents a novel approach for achieving safe stochastic optimal control in networked multi-agent systems (MASs).

DBQ-SSD: Dynamic Ball Query for Efficient 3D Object Detection

1 code implementation • 22 Jul 2022 • Jinrong Yang, Lin Song, Songtao Liu, Weixin Mao, Zeming Li, Xiaoping Li, Hongbin Sun, Jian Sun, Nanning Zheng

Many point-based 3D detectors adopt point-feature sampling strategies to drop some points for efficient inference.

3D Object Detection object-detection

Simplified Analysis on Filtering Sensitivity Trade-offs in Continuous- and Discrete-Time Systems

no code implementations • 8 Apr 2022 • Neng Wan, Dapeng Li, Lin Song, Naira Hovakimyan

A simplified analysis is performed on the Bode-type filtering sensitivity trade-off integrals, which capture the sensitivity characteristics of the estimate and estimation error with respect to the process input and estimated signal in continuous- and discrete-time linear time-invariant filtering systems.

Generalization of Safe Optimal Control Actions on Networked Multi-Agent Systems

no code implementations • 21 Sep 2021 • Lin Song, Neng Wan, Aditya Gahlawat, Chuyuan Tao, Naira Hovakimyan, Evangelos A. Theodorou

The control action composition is achieved by taking a weighted mixture of the existing controllers according to the contribution of each component task.

Workshop on Autonomous Driving at CVPR 2021: Technical Report for Streaming Perception Challenge

1 code implementation • 27 Jul 2021 • Songyang Zhang, Lin Song, Songtao Liu, Zheng Ge, Zeming Li, Xuming He, Jian Sun

In this report, we introduce our real-time 2D object detection system for the realistic autonomous driving scenario.

Autonomous Driving object-detection +1

9,051

Rethinking Learnable Tree Filter for Generic Feature Transform

1 code implementation • NeurIPS 2020 • Lin Song, Yanwei Li, Zhengkai Jiang, Zeming Li, Xiangyu Zhang, Hongbin Sun, Jian Sun, Nanning Zheng

The Learnable Tree Filter presents a remarkable approach to model structure-preserving relations for semantic segmentation.

Instance Segmentation object-detection +3

Fine-Grained Dynamic Head for Object Detection

1 code implementation • NeurIPS 2020 • Lin Song, Yanwei Li, Zhengkai Jiang, Zeming Li, Hongbin Sun, Jian Sun, Nanning Zheng

To this end, we propose a fine-grained dynamic head to conditionally select a pixel-level combination of FPN features from different scales for each instance, which further unlocks the capacity of multi-scale feature representation.

Object object-detection +1

End-to-End Object Detection with Fully Convolutional Network

1 code implementation • CVPR 2021 • JianFeng Wang, Lin Song, Zeming Li, Hongbin Sun, Jian Sun, Nanning Zheng

Mainstream object detectors based on the fully convolutional network have achieved impressive performance.

object-detection Object Detection

490

Compositionality of Linearly Solvable Optimal Control in Networked Multi-Agent Systems

no code implementations • 28 Sep 2020 • Lin Song, Neng Wan, Aditya Gahlawat, Naira Hovakimyan, Evangelos A. Theodorou

The proposed approach achieves both the compositionality and optimality of control actions simultaneously within the cooperative MAS framework in both discrete- and continuous-time in a sample-efficient manner, which reduces the burden of re-computation of the optimal control solutions for the new task on the MASs.

Contraction $\mathcal{L}_1$-Adaptive Control using Gaussian Processes

no code implementations • 8 Sep 2020 • Aditya Gahlawat, Arun Lakshmanan, Lin Song, Andrew Patterson, Zhuohuan Wu, Naira Hovakimyan, Evangelos Theodorou

We present $\mathcal{CL}_1$-$\mathcal{GP}$, a control framework that enables safe simultaneous learning and control for systems subject to uncertainties.

Gaussian Processes

Learning Dynamic Routing for Semantic Segmentation

1 code implementation • CVPR 2020 • Yanwei Li, Lin Song, Yukang Chen, Zeming Li, Xiangyu Zhang, Xingang Wang, Jian Sun

To demonstrate the superiority of the dynamic property, we compare with several static architectures, which can be modeled as special cases in the routing space.

Segmentation Semantic Segmentation

377

Learnable Tree Filter for Structure-preserving Feature Transform

1 code implementation • NeurIPS 2019 • Lin Song, Yanwei Li, Zeming Li, Gang Yu, Hongbin Sun, Jian Sun, Nanning Zheng

To this end, tree filtering modules are embedded to formulate a unified framework for semantic segmentation.

Semantic Segmentation

140

TACNet: Transition-Aware Context Network for Spatio-Temporal Action Detection

no code implementations • CVPR 2019 • Lin Song, Shiwei Zhang, Gang Yu, Hongbin Sun

In this paper, we define these ambiguous samples as "transitional states", and propose a Transition-Aware Context Network (TACNet) to distinguish transitional states.

Ranked #7 on Action Detection on J-HMDB

Action Detection
