Search Results for author: Huanjin Yao

Found 4 papers, 4 papers with code

Dense Connector for MLLMs

1 code implementation • 22 May 2024 • Huanjin Yao, Wenhao Wu, Taojiannan Yang, Yuxin Song, Mengxi Zhang, Haocheng Feng, Yifan Sun, Zhiheng Li, Wanli Ouyang, Jingdong Wang

We witness the rise of larger and higher-quality instruction datasets, as well as the involvement of larger-sized LLMs.

Video Understanding

Automated Multi-level Preference for MLLMs

1 code implementation • 18 May 2024 • Mengxi Zhang, Wenhao Wu, Yu Lu, Yuxin Song, Kang Rong, Huanjin Yao, Jianbo Zhao, Fanglong Liu, Yifan Sun, Haocheng Feng, Jingdong Wang

To verify our viewpoint, we present the Automated Multi-level Preference (AMP) framework for MLLMs.

Hallucination

Side4Video: Spatial-Temporal Side Network for Memory-Efficient Image-to-Video Transfer Learning

2 code implementations • 27 Nov 2023 • Huanjin Yao, Wenhao Wu, Zhiheng Li

In this paper, we present Side4Video, a novel Spatial-Temporal Side Network for memory-efficient fine-tuning of large image models for video understanding.

Action Classification, Action Recognition +3

GPT4Vis: What Can GPT-4 Do for Zero-shot Visual Recognition?

2 code implementations • 27 Nov 2023 • Wenhao Wu, Huanjin Yao, Mengxi Zhang, Yuxin Song, Wanli Ouyang, Jingdong Wang

Our study centers on the evaluation of GPT-4's linguistic and visual capabilities in zero-shot visual recognition tasks. First, we explore the potential of its generated rich textual descriptions across various categories to enhance recognition performance without any training.

Zero-Shot Learning
