Search Results for author: Jianbo Yuan

Found 24 papers, 11 papers with code

An Expert is Worth One Token: Synergizing Multiple Expert LLMs as Generalist via Expert Token Routing

1 code implementation • 25 Mar 2024 • Ziwei Chai, Guoyin Wang, Jing Su, Tianjie Zhang, Xuanwen Huang, Xuwu Wang, Jingjing Xu, Jianbo Yuan, Hongxia Yang, Fei Wu, Yang Yang

We present Expert-Token-Routing, a unified generalist framework that facilitates seamless integration of multiple expert LLMs.

Paper
Code

How Can LLM Guide RL? A Value-Based Approach

1 code implementation • 25 Feb 2024 • Shenao Zhang, Sirui Zheng, Shuqi Ke, Zhihan Liu, Wanxin Jin, Jianbo Yuan, Yingxiang Yang, Hongxia Yang, Zhaoran Wang

Specifically, we develop an algorithm named LINVIT that incorporates LLM guidance as a regularization factor in value-based RL, leading to significant reductions in the amount of data needed for learning, particularly when the difference between the ideal policy and the LLM-informed policy is small, which suggests that the initial policy is close to optimal, reducing the need for further exploration.

Decision Making Reinforcement Learning (RL)

Paper
Code

Exploring the Reasoning Abilities of Multimodal Large Language Models (MLLMs): A Comprehensive Survey on Emerging Trends in Multimodal Reasoning

no code implementations • 10 Jan 2024 • Yiqi Wang, Wentao Chen, Xiaotian Han, Xudong Lin, Haiteng Zhao, Yongfei Liu, Bohan Zhai, Jianbo Yuan, Quanzeng You, Hongxia Yang

In this survey, we comprehensively review the existing evaluation protocols of multimodal reasoning, categorize and illustrate the frontiers of MLLMs, introduce recent trends in applications of MLLMs on reasoning-intensive tasks, and finally discuss current practices and future directions.

Multimodal Reasoning

Paper
Add Code

InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks

1 code implementation • 10 Jan 2024 • Xueyu Hu, Ziyu Zhao, Shuang Wei, Ziwei Chai, Qianli Ma, Guoyin Wang, Xuwu Wang, Jing Su, Jingjing Xu, Ming Zhu, Yao Cheng, Jianbo Yuan, Jiwei Li, Kun Kuang, Yang Yang, Hongxia Yang, Fei Wu

In this paper, we introduce InfiAgent-DABench, the first benchmark specifically designed to evaluate LLM-based agents on data analysis tasks.

Benchmarking

Paper
Code

Improving In-Context Learning in Diffusion Models with Visual Context-Modulated Prompts

no code implementations • 3 Dec 2023 • Tianqi Chen, Yongfei Liu, Zhendong Wang, Jianbo Yuan, Quanzeng You, Hongxia Yang, Mingyuan Zhou

In light of the remarkable success of in-context learning in large language models, its potential extension to the vision domain, particularly with visual foundation models like Stable Diffusion, has sparked considerable interest.

In-Context Learning

Paper
Add Code

Self-Infilling Code Generation

1 code implementation • 29 Nov 2023 • Lin Zheng, Jianbo Yuan, Zhi Zhang, Hongxia Yang, Lingpeng Kong

This work introduces self-infilling code generation, a general framework that incorporates infilling operations into auto-regressive decoding.

Code Generation

Paper
Code

Reason out Your Layout: Evoking the Layout Master from Large Language Models for Text-to-Image Synthesis

no code implementations • 28 Nov 2023 • Xiaohui Chen, Yongfei Liu, Yingxiang Yang, Jianbo Yuan, Quanzeng You, Li-Ping Liu, Hongxia Yang

Recent advancements in text-to-image (T2I) generative models have shown remarkable capabilities in producing diverse and imaginative visuals based on text prompts.

Image Generation

Paper
Add Code

InfiMM-Eval: Complex Open-Ended Reasoning Evaluation For Multi-Modal Large Language Models

no code implementations • 20 Nov 2023 • Xiaotian Han, Quanzeng You, Yongfei Liu, Wentao Chen, Huangjie Zheng, Khalil Mrini, Xudong Lin, Yiqi Wang, Bohan Zhai, Jianbo Yuan, Heng Wang, Hongxia Yang

To mitigate this issue, we manually curate a benchmark dataset specifically designed for MLLMs, with a focus on complex reasoning tasks.

Paper
Add Code

Let's reward step by step: Step-Level reward model as the Navigators for Reasoning

no code implementations • 16 Oct 2023 • Qianli Ma, Haotian Zhou, Tingkai Liu, Jianbo Yuan, PengFei Liu, Yang You, Hongxia Yang

Recent years have seen considerable advancements in multi-step reasoning with Large Language Models (LLMs).

Code Generation GSM8K +2

Paper
Add Code

LoBaSS: Gauging Learnability in Supervised Fine-tuning Data

no code implementations • 16 Oct 2023 • Haotian Zhou, Tingkai Liu, Qianli Ma, Jianbo Yuan, PengFei Liu, Yang You, Hongxia Yang

In this paper, we introduce a new dimension in SFT data selection: learnability.

Paper
Add Code

LEMON: Lossless model expansion

no code implementations • 12 Oct 2023 • Yite Wang, Jiahao Su, Hanlin Lu, Cong Xie, Tianyi Liu, Jianbo Yuan, Haibin Lin, Ruoyu Sun, Hongxia Yang

Our empirical results demonstrate that LEMON reduces computational costs by 56. 7% for Vision Transformers and 33. 2% for BERT when compared to training from scratch.

Paper
Add Code

Let Models Speak Ciphers: Multiagent Debate through Embeddings

no code implementations • 10 Oct 2023 • Chau Pham, Boyi Liu, Yingxiang Yang, Zhengyu Chen, Tianyi Liu, Jianbo Yuan, Bryan A. Plummer, Zhaoran Wang, Hongxia Yang

Although natural language is an obvious choice for communication due to LLM's language understanding capability, the token sampling step needed when generating natural language poses a potential risk of information loss, as it uses only one token to represent the model's belief across the entire vocabulary.

Paper
Add Code

Learning Stackable and Skippable LEGO Bricks for Efficient, Reconfigurable, and Variable-Resolution Diffusion Modeling

no code implementations • 10 Oct 2023 • Huangjie Zheng, Zhendong Wang, Jianbo Yuan, Guanghan Ning, Pengcheng He, Quanzeng You, Hongxia Yang, Mingyuan Zhou

Diffusion models excel at generating photo-realistic images but come with significant computational costs in both training and sampling.

Image Generation

Paper
Add Code

Revisiting Multimodal Representation in Contrastive Learning: From Patch and Token Embeddings to Finite Discrete Tokens

1 code implementation • CVPR 2023 • Yuxiao Chen, Jianbo Yuan, Yu Tian, Shijie Geng, Xinyu Li, Ding Zhou, Dimitris N. Metaxas, Hongxia Yang

However, direct aligning cross-modal information using such representations is challenging, as visual patches and text tokens differ in semantic levels and granularities.

Contrastive Learning

Paper
Code

HiCLIP: Contrastive Language-Image Pretraining with Hierarchy-aware Attention

1 code implementation • 6 Mar 2023 • Shijie Geng, Jianbo Yuan, Yu Tian, Yuxiao Chen, Yongfeng Zhang

The success of large-scale contrastive vision-language pretraining (CLIP) has benefited both visual recognition and multimodal content understanding.

Paper
Code

A Reparameterized Discrete Diffusion Model for Text Generation

1 code implementation • 11 Feb 2023 • Lin Zheng, Jianbo Yuan, Lei Yu, Lingpeng Kong

This work studies discrete diffusion probabilistic models with applications to natural language generation.

Text Generation

Paper
Code

Efficient Attention via Control Variates

1 code implementation • 9 Feb 2023 • Lin Zheng, Jianbo Yuan, Chong Wang, Lingpeng Kong

Built upon previous progress of RFA, we characterize this gap through the lens of control variates and show that RFA can be decomposed into a sum of multiple control variate estimators for each element in the sequence.

Paper
Code

Hierarchically Self-Supervised Transformer for Human Skeleton Representation Learning

1 code implementation • 20 Jul 2022 • Yuxiao Chen, Long Zhao, Jianbo Yuan, Yu Tian, Zhaoyang Xia, Shijie Geng, Ligong Han, Dimitris N. Metaxas

Despite the success of fully-supervised human skeleton sequence modeling, utilizing self-supervised pre-training for skeleton sequence representation learning has been an active field because acquiring task-specific skeleton annotations at large scales is difficult.

Action Detection Action Recognition +3

Paper
Code

More Than Just Attention: Improving Cross-Modal Attentions with Contrastive Constraints for Image-Text Matching

no code implementations • 20 May 2021 • Yuxiao Chen, Jianbo Yuan, Long Zhao, Tianlang Chen, Rui Luo, Larry Davis, Dimitris N. Metaxas

Cross-modal attention mechanisms have been widely applied to the image-text matching task and have achieved remarkable improvements thanks to its capability of learning fine-grained relevance across different modalities.

Contrastive Learning Image Captioning +4

Paper
Add Code

Automatic Radiology Report Generation based on Multi-view Image Fusion and Medical Concept Enrichment

no code implementations • 22 Jul 2019 • Jianbo Yuan, Haofu Liao, Rui Luo, Jiebo Luo

In addition, in order to enrich the decoder with descriptive semantics and enforce the correctness of the deterministic medical-related contents such as mentions of organs or diagnoses, we extract medical concepts based on the radiology reports in the training data and fine-tune the encoder to extract the most frequent medical concepts from the x-ray images.

Decoder Descriptive +3

Paper
Add Code

Construct Dynamic Graphs for Hand Gesture Recognition via Spatial-Temporal Attention

1 code implementation • 20 Jul 2019 • Yuxiao Chen, Long Zhao, Xi Peng, Jianbo Yuan, Dimitris N. Metaxas

We propose a Dynamic Graph-Based Spatial-Temporal Attention (DG-STA) method for hand gesture recognition.

Ranked #3 on Hand Gesture Recognition on SHREC 2017

Hand Gesture Recognition Hand-Gesture Recognition +1

Paper
Code

Artifact Disentanglement Network for Unsupervised Metal Artifact Reduction

1 code implementation • 5 Jun 2019 • Haofu Liao, Wei-An Lin, Jianbo Yuan, S. Kevin Zhou, Jiebo Luo

Extensive experiments show that our method significantly outperforms the existing unsupervised models for image-to-image translation problems, and achieves comparable performance to existing supervised models on a synthesized dataset.

Computed Tomography (CT) Disentanglement +3

162

Paper
Code

Twitter Sentiment Analysis via Bi-sense Emoji Embedding and Attention-based LSTM

no code implementations • 20 Jul 2018 • Yuxiao Chen, Jianbo Yuan, Quanzeng You, Jiebo Luo

Sentiment analysis on large-scale social media data is important to bridge the gaps between social media contents and real world activities including political election prediction, individual and public emotional status monitoring and analysis, and so on.

Twitter Sentiment Analysis

Paper
Add Code

Solving Cold-Start Problem in Large-scale Recommendation Engines: A Deep Learning Approach

no code implementations • 16 Nov 2016 • Jianbo Yuan, Walid Shalaby, Mohammed Korayem, David Lin, Khalifeh Aljadda, Jiebo Luo

One of the most important features of the proposed technique is the fact that it can be applied on top of any existing CF based recommendation engine without changing the CF core.

Collaborative Filtering

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.