Search Results for author: Chengju Liu

Found 16 papers, 6 papers with code

Efficient Text-driven Motion Generation via Latent Consistency Training

no code implementations • 5 May 2024 • Mengxian Hu, Minghao Zhu, Xun Zhou, Qingqing Yan, Shu Li, Chengju Liu, Qijun Chen

Motion diffusion models have recently proven successful for text-driven human motion generation.

Quantization

Vision-and-Language Navigation via Causal Learning

1 code implementation • 16 Apr 2024 • Liuyi Wang, Zongtao He, Ronghao Dang, Mengjiao Shen, Chengju Liu, Qijun Chen

In the pursuit of robust and generalizable environment perception and language understanding, the ubiquitous challenge of dataset bias continues to plague vision-and-language navigation (VLN) agents, hindering their performance in unseen environments.

Causal Inference • Contrastive Learning • +2

Causality-based Cross-Modal Representation Learning for Vision-and-Language Navigation

no code implementations • 6 Mar 2024 • Liuyi Wang, Zongtao He, Ronghao Dang, Huiyi Chen, Chengju Liu, Qijun Chen

Vision-and-Language Navigation (VLN) has gained significant research interest in recent years due to its potential applications in real-world scenarios.

Representation Learning • Vision and Language Navigation

CLIPose: Category-Level Object Pose Estimation with Pre-trained Vision-Language Knowledge

no code implementations • 24 Feb 2024 • Xiao Lin, Minghao Zhu, Ronghao Dang, Guangliang Zhou, Shaolong Shu, Feng Lin, Chengju Liu, Qijun Chen

Inspired by this motivation, we propose CLIPose, a novel 6D pose framework that employs a pre-trained vision-language model to better learn object category information, fully leveraging the abundant semantic knowledge in the image and text modalities.

Contrastive Learning • Language Modelling • +2

TransPose: 6D Object Pose Estimation with Geometry-Aware Transformer

no code implementations • 25 Oct 2023 • Xiao Lin, Deming Wang, Guangliang Zhou, Chengju Liu, Qijun Chen

To improve robustness to occlusion, we adopt a Transformer to exchange global information, so that each local feature contains global information.

6D Pose Estimation using RGB • Object

InstructDET: Diversifying Referring Object Detection with Generalized Instructions

1 code implementation • 8 Oct 2023 • Ronghao Dang, Jiangyan Feng, Haodong Zhang, Chongjian Ge, Lin Song, Lijun Gong, Chengju Liu, Qijun Chen, Feng Zhu, Rui Zhao, Yibing Song

To encompass common detection expressions, we employ an emerging vision-language model (VLM) and a large language model (LLM) to generate instructions guided by text prompts and object bounding boxes, since the generalization ability of foundation models is effective for producing human-like expressions (e.g., describing object property, category, and relationship).

Language Modelling • Large Language Model • +4

Fine-Grained Spatiotemporal Motion Alignment for Contrastive Video Representation Learning

1 code implementation • 1 Sep 2023 • Minghao Zhu, Xiao Lin, Ronghao Dang, Chengju Liu, Qijun Chen

As the most essential property in a video, motion information is critical to a robust and generalized video representation.

Contrastive Learning • Decoder • +1

PASTS: Progress-Aware Spatio-Temporal Transformer Speaker For Vision-and-Language Navigation

no code implementations • 19 May 2023 • Liuyi Wang, Chengju Liu, Zongtao He, Shu Li, Qingqing Yan, Huiyi Chen, Qijun Chen

The experimental results demonstrate that PASTS outperforms all existing speaker models and successfully improves the performance of previous VLN models, achieving state-of-the-art performance on the standard Room-to-Room (R2R) dataset.

Data Augmentation • Vision and Language Navigation

A Dual Semantic-Aware Recurrent Global-Adaptive Network For Vision-and-Language Navigation

1 code implementation • 5 May 2023 • Liuyi Wang, Zongtao He, Jiagui Tang, Ronghao Dang, Naijia Wang, Chengju Liu, Qijun Chen

Vision-and-Language Navigation (VLN) is a realistic but challenging task that requires an agent to locate the target region using verbal and visual cues.

Vision and Language Navigation

MLANet: Multi-Level Attention Network with Sub-instruction for Continuous Vision-and-Language Navigation

1 code implementation • 2 Mar 2023 • Zongtao He, Liuyi Wang, Shu Li, Qingqing Yan, Chengju Liu, Qijun Chen

For better performance in continuous VLN, we design a multi-level instruction understanding procedure and propose a novel model, Multi-Level Attention Network (MLANet).

Navigate • Vision and Language Navigation

Multiple Thinking Achieving Meta-Ability Decoupling for Object Navigation

no code implementations • 3 Feb 2023 • Ronghao Dang, Lu Chen, Liuyi Wang, Zongtao He, Chengju Liu, Qijun Chen

We propose a meta-ability decoupling (MAD) paradigm, which brings together various object navigation methods in an architecture system, allowing them to mutually enhance each other and evolve together.

Object

Search for or Navigate to? Dual Adaptive Thinking for Object Navigation

no code implementations • ICCV 2023 • Ronghao Dang, Liuyi Wang, Zongtao He, Shuai Su, Chengju Liu, Qijun Chen

After seeing the target, we remember the target location and navigate to it.

Navigate • Object

Unbiased Directed Object Attention Graph for Object Navigation

no code implementations • 9 Apr 2022 • Ronghao Dang, Zhuofan Shi, Liuyi Wang, Zongtao He, Chengju Liu, Qijun Chen

Thus, in this paper, we propose a directed object attention (DOA) graph to guide the agent in explicitly learning the attention relationships between objects, thereby reducing the object attention bias.

Object

PointTrackNet: An End-to-End Network For 3-D Object Detection and Tracking From Point Clouds

no code implementations • 26 Feb 2020 • Sukai Wang, Yuxiang Sun, Chengju Liu, Ming Liu

Recent machine learning-based multi-object tracking (MOT) frameworks are becoming popular for 3-D point clouds.

Multi-Object Tracking • Object • +2

Unsupervised Learning of Depth and Deep Representation for Visual Odometry from Monocular Videos in a Metric Space

no code implementations • 4 Aug 2019 • Xiaochuan Yin, Chengju Liu

In contrast to the previous methods, our proposed method calculates the camera motion with a direct method rather than regressing the ego-motion from the pose network.

Depth Estimation • Motion Estimation • +4

Focal Loss in 3D Object Detection

1 code implementation • 17 Sep 2018 • Peng Yun, Lei Tai, Yu-An Wang, Chengju Liu, Ming Liu

Inspired by the recent use of focal loss in image-based object detection, we extend this hard-mining improvement of binary cross entropy to point-cloud-based object detection and conduct experiments to show its performance based on two different 3D detectors: 3D-FCN and VoxelNet.

3D Object Detection • Autonomous Driving • +2
