Search Results for author: Yaoyuan Liang

Found 4 papers, 3 papers with code

RCA-NOC: Relative Contrastive Alignment for Novel Object Captioning

no code implementations • ICCV 2023 • Jiashuo Fan, Yaoyuan Liang, Leyao Liu, ShaoLun Huang, Lei Zhang

We evaluate our approach on two datasets and show that our proposed RCA-NOC approach outperforms state-of-the-art methods by a large margin, demonstrating its effectiveness in improving vision-language representation for novel object captioning.

Contrastive Learning Object +1

Exploring Iterative Refinement with Diffusion Models for Video Grounding

1 code implementation • 26 Oct 2023 • Xiao Liang, Tao Shi, Yaoyuan Liang, Te Tao, Shao-Lun Huang

In this paper, we propose DiffusionVG, a novel framework with diffusion models that formulates video grounding as a conditional generation task, where the target span is generated from Gaussian noise inputs and iteratively refined in the reverse diffusion process.

Sentence Video Grounding

SSLCL: An Efficient Model-Agnostic Supervised Contrastive Learning Framework for Emotion Recognition in Conversations

1 code implementation • 25 Oct 2023 • Tao Shi, Xiao Liang, Yaoyuan Liang, Xinyi Tong, Shao-Lun Huang

To address these challenges, we propose an efficient and model-agnostic SCL framework named Supervised Sample-Label Contrastive Learning with Soft-HGR Maximal Correlation (SSLCL), which eliminates the need for a large batch size and can be seamlessly integrated with existing ERC models without introducing any model-specific assumptions.

Contrastive Learning Emotion Recognition

DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding

1 code implementation • 28 Nov 2022 • Shilong Liu, Yaoyuan Liang, Feng Li, Shijia Huang, Hao Zhang, Hang Su, Jun Zhu, Lei Zhang

As phrase extraction can be regarded as a 1D text segmentation problem, we formulate PEG as a dual detection problem and propose a novel DQ-DETR model, which introduces dual queries to probe different features from image and text for object prediction and phrase mask prediction.

Ranked #7 on Referring Expression Comprehension on RefCOCO

Object Detection +4
