Search Results for author: Haozhe Zhao

Found 8 papers, 7 papers with code

Mitigating Language-Level Performance Disparity in mPLMs via Teacher Language Selection and Cross-lingual Self-Distillation

1 code implementation • 12 Apr 2024 • Haozhe Zhao, Zefan Cai, Shuzheng Si, Liang Chen, Yufeng He, Kaikai An, Baobao Chang

Therefore, we introduce ALSACE to leverage the learned knowledge from the well-performing languages to guide under-performing ones within the same mPLM, eliminating the need for additional labeled multilingual data.

An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models

1 code implementation • 11 Mar 2024 • Liang Chen, Haozhe Zhao, Tianyu Liu, Shuai Bai, Junyang Lin, Chang Zhou, Baobao Chang

To this end, we introduce FastV, a versatile plug-and-play method designed to optimize computational efficiency by learning adaptive attention patterns in early layers and pruning visual tokens in subsequent ones.

Computational Efficiency, Video Understanding

PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain

1 code implementation • 21 Feb 2024 • Liang Chen, Yichi Zhang, Shuhuai Ren, Haozhe Zhao, Zefan Cai, Yuchi Wang, Peiyi Wang, Xiangdi Meng, Tianyu Liu, Baobao Chang

To address this, we introduce Embodied-Instruction-Evolution (EIE), an automatic framework for synthesizing instruction tuning examples in multimodal embodied environments.

Autonomous Driving, Decision Making

ML-Bench: Evaluating Large Language Models for Code Generation in Repository-Level Machine Learning Tasks

1 code implementation • 16 Nov 2023 • Yuliang Liu, Xiangru Tang, Zefan Cai, Junjie Lu, Yichi Zhang, Yanjun Shao, Zexuan Deng, Helan Hu, Kaikai An, Ruijun Huang, Shuzheng Si, Sheng Chen, Haozhe Zhao, Liang Chen, Yan Wang, Tianyu Liu, Zhiwei Jiang, Baobao Chang, Yujia Qin, Wangchunshu Zhou, Yilun Zhao, Arman Cohan, Mark Gerstein

While Large Language Models (LLMs) have demonstrated proficiency in code generation benchmarks, translating these results into practical development scenarios - where leveraging existing repository-level libraries is the norm - remains challenging.

Code Generation, Navigate

Distantly-Supervised Named Entity Recognition with Uncertainty-aware Teacher Learning and Student-student Collaborative Learning

no code implementations • 14 Nov 2023 • Helan Hu, Shuzheng Si, Haozhe Zhao, Shuang Zeng, Kaikai An, Zefan Cai, Baobao Chang

Distantly-Supervised Named Entity Recognition (DS-NER) effectively alleviates the burden of annotation, but it also suffers from label noise.

Named Entity Recognition +1

Coarse-to-Fine Dual Encoders are Better Frame Identification Learners

1 code implementation • 20 Oct 2023 • Kaikai An, Ce Zheng, Bofei Gao, Haozhe Zhao, Baobao Chang

Recent research measures the similarity or matching score between targets and candidate frames by modeling frame definitions.

Contrastive Learning, Representation Learning +1

Towards End-to-End Embodied Decision Making via Multi-modal Large Language Model: Explorations with GPT4-Vision and Beyond

1 code implementation • 3 Oct 2023 • Liang Chen, Yichi Zhang, Shuhuai Ren, Haozhe Zhao, Zefan Cai, Yuchi Wang, Peiyi Wang, Tianyu Liu, Baobao Chang

In this study, we explore the potential of Multimodal Large Language Models (MLLMs) in improving embodied decision-making processes for agents.

Decision Making, Language Modelling +2

MMICL: Empowering Vision-language Model with Multi-Modal In-Context Learning

2 code implementations • 14 Sep 2023 • Haozhe Zhao, Zefan Cai, Shuzheng Si, Xiaojian Ma, Kaikai An, Liang Chen, Zixuan Liu, Sheng Wang, Wenjuan Han, Baobao Chang

In this paper, we address the limitation above by 1) introducing Vision-language Model with Multi-Modal In-Context Learning (MMICL), a new approach that allows the VLM to deal with multi-modal inputs efficiently; 2) proposing a novel context scheme to augment the in-context learning ability of the VLM; 3) constructing the Multi-modal In-Context Learning (MIC) dataset, designed to enhance the VLM's ability to understand complex multi-modal prompts.

Ranked #16 on Visual Reasoning on Winoground

Hallucination, In-Context Learning +2
