Search Results for author: Xiu Li

Found 109 papers, 49 papers with code

REPARO: Compositional 3D Assets Generation with Differentiable 3D Layout Alignment

no code implementations • 28 May 2024 • Haonan Han, Rui Yang, Huan Liao, Jiankai Xing, Zunnan Xu, Xiaoming Yu, Junwei Zha, Xiu Li, Wanhua Li

Traditional image-to-3D models often struggle with scenes containing multiple objects due to biases and occlusion complexities.

Paper
Add Code

Cross-Domain Policy Adaptation by Capturing Representation Mismatch

1 code implementation • 24 May 2024 • Jiafei Lyu, Chenjia Bai, Jingwen Yang, Zongqing Lu, Xiu Li

We perform representation learning only in the target domain and measure the representation deviations on the transitions from the source domain, which we show can be a signal of dynamics mismatch.

Paper
Code

Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration

no code implementations • 23 May 2024 • Yang Zhang, Shixin Yang, Chenjia Bai, Fei Wu, Xiu Li, Zhen Wang, Xuelong Li

In this paper, we propose a novel framework for multi-agent collaboration that introduces Reinforced Advantage feedback (ReAd) for efficient self-refinement of plans.

regression

Paper
Add Code

MMTryon: Multi-Modal Multi-Reference Control for High-Quality Fashion Generation

no code implementations • 1 May 2024 • Xujie Zhang, Ente Lin, Xiu Li, Yuxuan Luo, Michael Kampffmeyer, Xin Dong, Xiaodan Liang

Besides, to remove the segmentation dependency, MMTryon uses a parsing-free garment encoder and leverages a novel scalable data generation pipeline to convert existing VITON datasets to a form that allows MMTryon to be trained without requiring any explicit segmentation.

Segmentation Virtual Try-on

Paper
Add Code

Contrastive Quantization based Semantic Code for Generative Recommendation

no code implementations • 23 Apr 2024 • Mengqun Jin, Zexuan Qiu, Jieming Zhu, Zhenhua Dong, Xiu Li

Finally, we train and test semantic code with generative retrieval on a sequential recommendation model.

Decoder Language Modelling +3

Paper
Add Code

Deep Pattern Network for Click-Through Rate Prediction

no code implementations • 17 Apr 2024 • Hengyu Zhang, Junwei Pan, Dapeng Liu, Jie Jiang, Xiu Li

These patterns harbor substantial potential to significantly enhance CTR prediction performance.

Click-Through Rate Prediction Recommendation Systems +1

Paper
Add Code

AV-GAN: Attention-Based Varifocal Generative Adversarial Network for Uneven Medical Image Translation

no code implementations • 16 Apr 2024 • Zexin Li, Yiyang Lin, Zijie Fang, Shuyan Li, Xiu Li

In this paper, we propose the Attention-Based Varifocal Generative Adversarial Network (AV-GAN), which solves multiple problems in pathologic image translation tasks, such as uneven translation difficulty in different regions, mutual interference of multiple resolution information, and nuclear deformation.

Generative Adversarial Network Translation

Paper
Add Code

Video Object Segmentation with Dynamic Query Modulation

1 code implementation • 18 Mar 2024 • Hantao Zhou, Runze Hu, Xiu Li

Storing intermediate frame segmentations as memory for long-range context modeling, spatial-temporal memory-based methods have recently showcased impressive results in semi-supervised video object segmentation (SVOS).

Object Segmentation +3

Paper
Code

GRA: Detecting Oriented Objects through Group-wise Rotating and Attention

no code implementations • 17 Mar 2024 • Jiangshan Wang, Yifan Pu, Yizeng Han, Jiayi Guo, Yiru Wang, Xiu Li, Gao Huang

GRA can adaptively capture fine-grained features of objects with diverse orientations, comprising two key components: Group-wise Rotating and Group-wise Attention.

Object object-detection +2

Paper
Add Code

Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primitives

1 code implementation • 15 Mar 2024 • Ronghui Li, Yuxiang Zhang, Yachao Zhang, Hongwen Zhang, Jie Guo, Yan Zhang, Yebin Liu, Xiu Li

In contrast, the second stage is the local diffusion, which generates detailed motion sequences in parallel under the guidance of the dance primitives and choreographic rules.

Ranked #1 on Motion Synthesis on FineDance

Motion Synthesis

Paper
Code

MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models

no code implementations • 14 Mar 2024 • Zunnan Xu, Yukang Lin, Haonan Han, Sicheng Yang, Ronghui Li, Yachao Zhang, Xiu Li

Gesture synthesis is a vital realm of human-computer interaction, with wide-ranging applications across various fields like film, robotics, and virtual reality.

Paper
Add Code

Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts

2 code implementations • 13 Mar 2024 • Yue Ma, Yingqing He, Hongfa Wang, Andong Wang, Chenyang Qi, Chengfei Cai, Xiu Li, Zhifeng Li, Heung-Yeung Shum, Wei Liu, Qifeng Chen

Despite recent advances in image-to-video generation, better controllability and local animation are less explored.

Image Animation Image to Video Generation

749

Paper
Code

Harmonious Group Choreography with Trajectory-Controllable Diffusion

no code implementations • 10 Mar 2024 • Yuqin Dai, Wanlu Zhu, Ronghui Li, Zeping Ren, Xiangzheng Zhou, Xiu Li, Jun Li, Jian Yang

Specifically, to tackle dancer collisions, we introduce a Dance-Beat Navigator capable of generating trajectories for multiple dancers based on the music, complemented by a Distance-Consistency loss to maintain appropriate spacing among trajectories within a reasonable threshold.

Paper
Add Code

SEABO: A Simple Search-Based Method for Offline Imitation Learning

1 code implementation • 6 Feb 2024 • Jiafei Lyu, Xiaoteng Ma, Le Wan, Runze Liu, Xiu Li, Zongqing Lu

Offline reinforcement learning (RL) has attracted much attention due to its ability to learn from static offline datasets and eliminate the need to interact with the environment.

D4RL Imitation Learning +2

Paper
Code

Understanding What Affects Generalization Gap in Visual Reinforcement Learning: Theory and Empirical Evidence

no code implementations • 5 Feb 2024 • Jiafei Lyu, Le Wan, Xiu Li, Zongqing Lu

Recently, there have been many efforts to learn useful policies for continuous control in visual reinforcement learning (RL).

Continuous Control Learning Theory +1

Paper
Add Code

BATON: Aligning Text-to-Audio Model with Human Preference Feedback

no code implementations • 1 Feb 2024 • Huan Liao, Haonan Han, Kai Yang, Tianjiao Du, Rui Yang, Zunnan Xu, Qinmei Xu, Jingquan Liu, Jiasheng Lu, Xiu Li

With the development of AI-Generated Content (AIGC), text-to-audio models are gaining widespread attention.

Paper
Add Code

CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion

1 code implementation • 25 Jan 2024 • Nisha Huang, WeiMing Dong, Yuxin Zhang, Fan Tang, Ronghui Li, Chongyang Ma, Xiu Li, Changsheng Xu

Large-scale text-to-image generative models have made impressive strides, showcasing their ability to synthesize a vast array of high-quality images.

Image Generation Style Transfer

Paper
Code

Exploration and Anti-Exploration with Distributional Random Network Distillation

2 code implementations • 18 Jan 2024 • Kai Yang, Jian Tao, Jiafei Lyu, Xiu Li

To address this issue, we introduce the Distributional RND (DRND), a derivative of the RND.

D4RL

Paper
Code

1st Place Solution for 5th LSVOS Challenge: Referring Video Object Segmentation

1 code implementation • 1 Jan 2024 • Zhuoyan Luo, Yicheng Xiao, Yong Liu, Yitong Wang, Yansong Tang, Xiu Li, Yujiu Yang

Recent transformer-based models have dominated the Referring Video Object Segmentation (RVOS) task due to their superior performance.

Object Referring Video Object Segmentation +3

Paper
Code

Text2Avatar: Text to 3D Human Avatar Generation with Codebook-Driven Body Controllable Attribute

no code implementations • 1 Jan 2024 • Chaoqun Gong, Yuqin Dai, Ronghui Li, Achun Bao, Jun Li, Jian Yang, Yachao Zhang, Xiu Li

Generating 3D human models directly from text helps reduce the cost and time of character modeling.

Attribute Disentanglement +2

Paper
Add Code

Exploring Multi-Modal Control in Music-Driven Dance Generation

no code implementations • 1 Jan 2024 • Ronghui Li, Yuqin Dai, Yachao Zhang, Jun Li, Jian Yang, Jie Guo, Xiu Li

Existing music-driven 3D dance generation methods mainly concentrate on high-quality dance generation, but lack sufficient control during the generation process.

Paper
Add Code

Chain of Generation: Multi-Modal Gesture Synthesis via Cascaded Conditional Control

no code implementations • 26 Dec 2023 • Zunnan Xu, Yachao Zhang, Sicheng Yang, Ronghui Li, Xiu Li

We introduce a novel method that separates priors from speech and employs multimodal priors as constraints for generating gestures.

Gesture Generation

Paper
Add Code

Realistic Human Motion Generation with Cross-Diffusion Models

no code implementations • 18 Dec 2023 • Zeping Ren, Shaoli Huang, Xiu Li

Our method integrates 3D and 2D information using a shared transformer network within the training of the diffusion model, unifying motion noise into a single feature space.

Paper
Add Code

Semi-supervised Semantic Segmentation Meets Masked Modeling: Fine-grained Locality Learning Matters in Consistency Regularization

no code implementations • 14 Dec 2023 • Wentao Pan, Zhe Xu, Jiangpeng Yan, Zihan Wu, Raymond Kai-yu Tong, Xiu Li, Jianhua Yao

Semi-supervised semantic segmentation aims to utilize limited labeled images and abundant unlabeled images to achieve label-efficient learning, wherein the weak-to-strong consistency regularization framework, popularized by FixMatch, is widely used as a benchmark scheme.

Image Classification Pseudo Label +2

Paper
Add Code

WarpDiffusion: Efficient Diffusion Model for High-Fidelity Virtual Try-on

no code implementations • 6 Dec 2023 • Xujie Zhang, Xiu Li, Michael Kampffmeyer, Xin Dong, Zhenyu Xie, Feida Zhu, Haoye Dong, Xiaodan Liang

Image-based Virtual Try-On (VITON) aims to transfer an in-shop garment image onto a target person.

Image Generation Virtual Try-on

Paper
Add Code

MagicStick: Controllable Video Editing via Control Handle Transformations

1 code implementation • 5 Dec 2023 • Yue Ma, Xiaodong Cun, Yingqing He, Chenyang Qi, Xintao Wang, Ying Shan, Xiu Li, Qifeng Chen

Although succinct, our method is the first to demonstrate video property editing using a pre-trained text-to-image model.

Video Editing Video Generation

Paper
Code

Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection

1 code implementation • 28 Nov 2023 • Yicheng Xiao, Zhuoyan Luo, Yong Liu, Yue Ma, Hengwei Bian, Yatai Ji, Yujiu Yang, Xiu Li

Video Moment Retrieval (MR) and Highlight Detection (HD) have attracted significant attention due to the growing demand for video analysis.

Ranked #1 on Highlight Detection on YouTube Highlights

Contrastive Learning Highlight Detection +5

Paper
Code

Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model

1 code implementation • 22 Nov 2023 • Kai Yang, Jian Tao, Jiafei Lyu, Chunjiang Ge, Jiaxin Chen, Qimai Li, Weihan Shen, Xiaolong Zhu, Xiu Li

The direct preference optimization (DPO) method, effective in fine-tuning large language models, eliminates the necessity for a reward model.

Denoising

124

Paper
Code

Replay-enhanced Continual Reinforcement Learning

no code implementations • 20 Nov 2023 • Tiantian Zhang, Kevin Zehua Shen, Zichuan Lin, Bo Yuan, Xueqian Wang, Xiu Li, Deheng Ye

On the other hand, offline learning on replayed tasks while learning a new task may induce a distributional shift between the dataset and the learned policy on old tasks, resulting in forgetting.

Continual Learning reinforcement-learning

Paper
Add Code

Reti-Diff: Illumination Degradation Image Restoration with Retinex-based Latent Diffusion Model

1 code implementation • 20 Nov 2023 • Chunming He, Chengyu Fang, Yulun Zhang, Tian Ye, Kai Li, Longxiang Tang, Zhenhua Guo, Xiu Li, Sina Farsiu

These priors are subsequently utilized by RGformer to guide the decomposition of image features into their respective reflectance and illumination components.

Image Restoration

Paper
Code

The primacy bias in Model-based RL

no code implementations • 23 Oct 2023 • Zhongjian Qiao, Jiafei Lyu, Xiu Li

The primacy bias in deep reinforcement learning (DRL), which refers to the agent's tendency to overfit early data and lose the ability to learn from new data, can significantly decrease the performance of DRL algorithms.

Continuous Control Model-based Reinforcement Learning +1

Paper
Add Code

Consistent123: One Image to Highly Consistent 3D Asset Using Case-Aware Diffusion Priors

no code implementations • 29 Sep 2023 • Yukang Lin, Haonan Han, Chaoqun Gong, Zunnan Xu, Yachao Zhang, Xiu Li

However, due to utilizing the case-agnostic rigid strategy, their generalization ability to arbitrary cases and the 3D consistency of reconstruction are still poor.

Image to 3D

Paper
Add Code

UniHead: Unifying Multi-Perception for Detection Heads

1 code implementation • 23 Sep 2023 • Hantao Zhou, Rui Yang, Yachao Zhang, Haoran Duan, Yawen Huang, Runze Hu, Xiu Li, Yefeng Zheng

More precisely, our approach (1) introduces deformation perception, enabling the model to adaptively sample object features; (2) proposes a Dual-axial Aggregation Transformer (DAT) to adeptly model long-range dependencies, thereby achieving global perception; and (3) devises a Cross-task Interaction Transformer (CIT) that facilitates interaction between the classification and localization branches, thus aligning the two tasks.

Paper
Code

Benchmarking Robustness and Generalization in Multi-Agent Systems: A Case Study on Neural MMO

no code implementations • 30 Aug 2023 • Yangkun Chen, Joseph Suarez, Junjie Zhang, Chenghui Yu, Bo Wu, HanMo Chen, Hengman Zhu, Rui Du, Shanliang Qian, Shuai Liu, Weijun Hong, Jinke He, Yibing Zhang, Liang Zhao, Clare Zhu, Julian Togelius, Sharada Mohanty, Jiaxin Chen, Xiu Li, Xiaolong Zhu, Phillip Isola

We present the results of the second Neural MMO challenge, hosted at IJCAI 2022, which received 1600+ submissions.

Benchmarking Reinforcement Learning (RL)

Paper
Add Code

Time-aligned Exposure-enhanced Model for Click-Through Rate Prediction

no code implementations • 19 Aug 2023 • Hengyu Zhang, Chang Meng, Wei Guo, Huifeng Guo, Jieming Zhu, Guangpeng Zhao, Ruiming Tang, Xiu Li

Click-Through Rate (CTR) prediction, crucial in applications like recommender systems and online advertising, involves ranking items based on the likelihood of user clicks.

Click-Through Rate Prediction Recommendation Systems

Paper
Add Code

Parallel Knowledge Enhancement based Framework for Multi-behavior Recommendation

1 code implementation • 9 Aug 2023 • Chang Meng, Chenhao Zhai, Yu Yang, Hengyu Zhang, Xiu Li

In the fusion step, advanced neural networks are used to model the hierarchical correlations between user behaviors.

Multi-Task Learning

Paper
Code

Strategic Preys Make Acute Predators: Enhancing Camouflaged Object Detectors by Generating Camouflaged Objects

1 code implementation • 6 Aug 2023 • Chunming He, Kai Li, Yachao Zhang, Yulun Zhang, Zhenhua Guo, Xiu Li, Martin Danelljan, Fisher Yu

On the prey side, we propose an adversarial training framework, Camouflageator, which introduces an auxiliary generator to generate more camouflaged objects that are harder for a COD method to detect.

object-detection Object Detection

Paper
Code

Consistency Regularization for Generalizable Source-free Domain Adaptation

no code implementations • 3 Aug 2023 • Longxiang Tang, Kai Li, Chunming He, Yulun Zhang, Xiu Li

In this paper, we propose a consistency regularization framework to develop a more generalizable SFDA method, which simultaneously boosts model performance on both target training and testing datasets.

Pseudo Label Source-Free Domain Adaptation

Paper
Add Code

Ada-DQA: Adaptive Diverse Quality-aware Feature Acquisition for Video Quality Assessment

no code implementations • 1 Aug 2023 • Hongbo Liu, Mingda Wu, Kun Yuan, Ming Sun, Yansong Tang, Chuanchuan Zheng, Xing Wen, Xiu Li

Video quality assessment (VQA) has attracted growing attention in recent years.

Knowledge Distillation Video Quality Assessment +1

Paper
Add Code

HQG-Net: Unpaired Medical Image Enhancement with High-Quality Guidance

no code implementations • 15 Jul 2023 • Chunming He, Kai Li, Guoxia Xu, Jiangpeng Yan, Longxiang Tang, Yulun Zhang, Xiu Li, YaoWei Wang

Specifically, we extract features from an HQ image and explicitly insert the features, which are expected to encode HQ cues, into the enhancement network to guide the LQ enhancement with the variational normalization module.

Image Enhancement Medical Image Enhancement

Paper
Add Code

Source-Free Domain Adaptive Fundus Image Segmentation with Class-Balanced Mean Teacher

1 code implementation • 14 Jul 2023 • Longxiang Tang, Kai Li, Chunming He, Yulun Zhang, Xiu Li

This paper aims to address these two issues by proposing the Class-Balanced Mean Teacher (CBMT) model.

Image Segmentation Semantic Segmentation

Paper
Code

Zero-shot Preference Learning for Offline RL via Optimal Transport

no code implementations • 6 Jun 2023 • Runze Liu, Yali Du, Fengshuo Bai, Jiafei Lyu, Xiu Li

In this paper, we propose a novel zero-shot preference-based RL algorithm that leverages labeled preference data from source tasks to infer labels for target tasks, eliminating the requirement for human queries.

Offline RL

Paper
Add Code

Normalization Enhances Generalization in Visual Reinforcement Learning

no code implementations • 1 Jun 2023 • Lu Li, Jiafei Lyu, Guozheng Ma, Zilin Wang, Zhenjie Yang, Xiu Li, Zhiheng Li

Though normalization techniques have demonstrated huge success in supervised and unsupervised learning, their applications in visual RL are still scarce.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction

1 code implementation • NeurIPS 2023 • Rui Yang, Lin Song, Yanwei Li, Sijie Zhao, Yixiao Ge, Xiu Li, Ying Shan

This paper aims to efficiently enable Large Language Models (LLMs) to use multimodal tools.

Image Generation Instruction Following +3

732

Paper
Code

Off-Policy RL Algorithms Can be Sample-Efficient for Continuous Control via Sample Multiple Reuse

no code implementations • 29 May 2023 • Jiafei Lyu, Le Wan, Zongqing Lu, Xiu Li

Empirical results show that SMR significantly boosts the sample efficiency of the base methods across most of the evaluated tasks without any hyperparameter tuning or additional tricks.

Continuous Control Q-Learning +1

Paper
Add Code

SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation

1 code implementation • NeurIPS 2023 • Zhuoyan Luo, Yicheng Xiao, Yong Liu, Shuyan Li, Yitong Wang, Yansong Tang, Xiu Li, Yujiu Yang

To address this issue, we propose Semantic-assisted Object Cluster (SOC), which aggregates video content and textual guidance for unified temporal modeling and cross-modal alignment.

Ranked #2 on Referring Expression Segmentation on A2D Sentences (using extra training data)

Object Referring Expression Segmentation +4

Paper
Code

Weakly-Supervised Concealed Object Segmentation with SAM-based Pseudo Labeling and Multi-scale Feature Grouping

no code implementations • NeurIPS 2023 • Chunming He, Kai Li, Yachao Zhang, Guoxia Xu, Longxiang Tang, Yulun Zhang, Zhenhua Guo, Xiu Li

It remains a challenging task since (1) it is hard to distinguish concealed objects from the background due to the intrinsic similarity and (2) the sparsely-annotated training data only provide weak supervision for model learning.

Segmentation Semantic Segmentation

Paper
Add Code

Towards Realizing the Value of Labeled Target Samples: a Two-Stage Approach for Semi-Supervised Domain Adaptation

no code implementations • 21 Apr 2023 • Mengqun Jin, Kai Li, Shuyan Li, Chunming He, Xiu Li

We further propose a consistency learning based mean teacher model to effectively adapt the learned UDA model using labeled and unlabeled target samples.

Semi-supervised Domain Adaptation Unsupervised Domain Adaptation

Paper
Add Code

Data-Efficient Image Quality Assessment with Attention-Panel Decoder

1 code implementation • 11 Apr 2023 • Guanyi Qin, Runze Hu, Yutao Liu, Xiawu Zheng, Haotian Liu, Xiu Li, Yan Zhang

Blind Image Quality Assessment (BIQA) is a fundamental task in computer vision, which however remains unresolved due to the complex distortion conditions and diversified image contents.

Blind Image Quality Assessment Decoder

Paper
Code

Uncertainty-driven Trajectory Truncation for Data Augmentation in Offline Reinforcement Learning

1 code implementation • 10 Apr 2023 • Junjie Zhang, Jiafei Lyu, Xiaoteng Ma, Jiangpeng Yan, Jun Yang, Le Wan, Xiu Li

To empirically show the advantages of TATU, we first combine it with two classical model-based offline RL algorithms, MOPO and COMBO.

D4RL Data Augmentation +3

Paper
Code

Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos

1 code implementation • 3 Apr 2023 • Yue Ma, Yingqing He, Xiaodong Cun, Xintao Wang, Siran Chen, Ying Shan, Xiu Li, Qifeng Chen

There is a pressing demand for generating text-editable and pose-controllable character videos when creating various digital humans.

Text-to-Image Generation Text-to-Video Generation +1

1,032

Paper
Code

Efficient Meshy Neural Fields for Animatable Human Avatars

1 code implementation • 23 Mar 2023 • Xiaoke Huang, Yiji Cheng, Yansong Tang, Xiu Li, Jie Zhou, Jiwen Lu

Moreover, only minutes of optimization is enough for plausible reconstruction results.

Disentanglement Inverse Rendering

Paper
Code

BoxSnake: Polygonal Instance Segmentation with Box Supervision

1 code implementation • ICCV 2023 • Rui Yang, Lin Song, Yixiao Ge, Xiu Li

Box-supervised instance segmentation has gained much attention as it requires only simple box annotations instead of costly mask or polygon annotations.

Box-supervised Instance Segmentation Segmentation +1

Paper
Code

SSGD: A smartphone screen glass dataset for defect detection

1 code implementation • 12 Mar 2023 • Haonan Han, Rui Yang, Shuyan Li, Runze Hu, Xiu Li

Interactive devices with touch screens have become commonly used in various aspects of daily life, which raises the demand for high production quality of touch screen glass.

Defect Detection object-detection +1

Paper
Code

Compressed Interaction Graph based Framework for Multi-behavior Recommendation

1 code implementation • 4 Mar 2023 • Wei Guo, Chang Meng, Enming Yuan, ZhiCheng He, Huifeng Guo, Yingxue Zhang, Bo Chen, Yaochen Hu, Ruiming Tang, Xiu Li, Rui Zhang

However, it is challenging to explore multi-behavior data due to the unbalanced data distribution and sparse target behavior, which lead to the inadequate modeling of high-order relations when treating multi-behavior data "as features" and gradient conflict in multitask learning when treating multi-behavior data "as labels".

Multi-Task Learning

Paper
Code

SemanticAC: Semantics-Assisted Framework for Audio Classification

no code implementations • 12 Feb 2023 • Yicheng Xiao, Yue Ma, Shuyan Li, Hantao Zhou, Ran Liao, Xiu Li

In this paper, we propose SemanticAC, a semantics-assisted framework for Audio Classification to better leverage the semantic information.

Audio Classification Language Modelling

Paper
Add Code

Model-based Transfer Learning for Automatic Optical Inspection based on domain discrepancy

1 code implementation • 14 Jan 2023 • Erik Isai Valle Salgado, Haoxin Yan, Yue Hong, Peiyuan Zhu, Shidong Zhu, Chengwei Liao, Yanxiang Wen, Xiu Li, Xiang Qian, Xiaohao Wang, Xinghui Li

However, related research enhanced the network models by applying TL without considering the domain similarity among datasets, the data long-tailedness of a source dataset, and mainly used linear transformations to mitigate the lack of samples.

Data Augmentation Transfer Learning

Paper
Code

Adversarial Alignment for Source Free Object Detection

no code implementations • 11 Jan 2023 • Qiaosong Chu, Shuyan Li, Guangyi Chen, Kai Li, Xiu Li

Source-free object detection (SFOD) aims to transfer a detector pre-trained on a label-rich source domain to an unlabeled target domain without seeing source data.

Object object-detection +1

Paper
Add Code

Emergent collective intelligence from massive-agent cooperation and competition

1 code implementation • 4 Jan 2023 • HanMo Chen, Stone Tao, Jiaxin Chen, Weihan Shen, Xihui Li, Chenghui Yu, Sikai Cheng, Xiaolong Zhu, Xiu Li

Since these learned group strategies arise from individual decisions without an explicit coordination mechanism, we claim that artificial collective intelligence emerges from massive-agent cooperation and competition.

reinforcement-learning Reinforcement Learning (RL)

Paper
Code

Camouflaged Object Detection With Feature Decomposition and Edge Reconstruction

no code implementations • CVPR 2023 • Chunming He, Kai Li, Yachao Zhang, Longxiang Tang, Yulun Zhang, Zhenhua Guo, Xiu Li

COD is a challenging task due to the intrinsic similarity of camouflaged objects with the background, as well as their ambiguous boundaries.

object-detection Object Detection

Paper
Add Code

Degradation-Resistant Unfolding Network for Heterogeneous Image Fusion

no code implementations • ICCV 2023 • Chunming He, Kai Li, Guoxia Xu, Yulun Zhang, Runze Hu, Zhenhua Guo, Xiu Li

Heterogeneous image fusion (HIF) techniques aim to enhance image quality by merging complementary information from images captured by different sensors.

Paper
Add Code

FLAG3D: A 3D Fitness Activity Dataset with Language Instruction

1 code implementation • CVPR 2023 • Yansong Tang, Jinpeng Liu, Aoyang Liu, Bin Yang, Wenxun Dai, Yongming Rao, Jiwen Lu, Jie Zhou, Xiu Li

With its continuously growing popularity around the world, fitness activity analytics has become an emerging research topic in computer vision.

Action Generation Action Recognition +2

Paper
Code

FineDance: A Fine-grained Choreography Dataset for 3D Full Body Dance Generation

1 code implementation • ICCV 2023 • Ronghui Li, Junfan Zhao, Yachao Zhang, Mingyang Su, Zeping Ren, Han Zhang, Yansong Tang, Xiu Li

To address these problems, we propose FineDance, which contains 14.6 hours of music-dance paired data, with fine-grained hand motions, fine-grained genres (22 dance genres), and accurate posture.

Motion Synthesis Retrieval

Paper
Code

SimVTP: Simple Video Text Pre-training with Masked Autoencoders

no code implementations • 7 Dec 2022 • Yue Ma, Tianyu Yang, Yin Shan, Xiu Li

This paper presents SimVTP: a Simple Video-Text Pretraining framework via masked autoencoders.

Ranked #16 on Moment Retrieval on Charades-STA

Contrastive Learning Moment Retrieval +1

Paper
Add Code

Human-machine Interactive Tissue Prototype Learning for Label-efficient Histopathology Image Segmentation

1 code implementation • 26 Nov 2022 • Wentao Pan, Jiangpeng Yan, Hanbo Chen, Jiawei Yang, Zhe Xu, Xiu Li, Jianhua Yao

Then, the encoder is used to map the images into the embedding space and generate pixel-level pseudo tissue masks by querying the tissue prototype dictionary.

Contrastive Learning Image Segmentation +5

Paper
Code

Disentangling Past-Future Modeling in Sequential Recommendation via Dual Networks

1 code implementation • 26 Oct 2022 • Hengyu Zhang, Enming Yuan, Wei Guo, ZhiCheng He, Jiarui Qin, Huifeng Guo, Bo Chen, Xiu Li, Ruiming Tang

Sequential recommendation (SR) plays an important role in personalized recommender systems because it captures dynamic and diverse preferences from users' real-time increasing behaviors.

Disentanglement Information Retrieval +1

Paper
Code

Estimating Neural Reflectance Field from Radiance Field using Tree Structures

no code implementations • 9 Oct 2022 • Xiu Li, Xiao Li, Yan Lu

A high-quality NeRF decomposition relies on good geometry information extraction as well as good prior terms to properly resolve ambiguities between different components.

Paper
Add Code

State Advantage Weighting for Offline RL

no code implementations • 9 Oct 2022 • Jiafei Lyu, Aicheng Gong, Le Wan, Zongqing Lu, Xiu Li

We present state advantage weighting for offline reinforcement learning (RL).

D4RL Offline RL +2

Paper
Add Code

Dynamics-Adaptive Continual Reinforcement Learning via Progressive Contextualization

no code implementations • 1 Sep 2022 • Tiantian Zhang, Zichuan Lin, Yuxing Wang, Deheng Ye, Qiang Fu, Wei Yang, Xueqian Wang, Bin Liang, Bo Yuan, Xiu Li

A key challenge of continual reinforcement learning (CRL) in dynamic environments is to promptly adapt the RL agent's behavior as the environment changes over its lifetime, while minimizing the catastrophic forgetting of the learned information.

Bayesian Inference Knowledge Distillation +3

Paper
Add Code

A Medical Semantic-Assisted Transformer for Radiographic Report Generation

no code implementations • 22 Aug 2022 • Zhanyu Wang, Mingkang Tang, Lei Wang, Xiu Li, Luping Zhou

Automated radiographic report generation is a challenging cross-domain task that aims to automatically generate accurate and semantically coherent reports to describe medical images.

Image Captioning Medical Report Generation

Paper
Add Code

Neural Capture of Animatable 3D Human from Monocular Video

no code implementations • 18 Aug 2022 • Gusi Te, Xiu Li, Xiao Li, Jinglu Wang, Wei Hu, Yan Lu

We present a novel paradigm of building an animatable 3D human representation from a monocular video input, such that it can be rendered in any unseen poses and views.

Paper
Add Code

Coarse-to-Fine Knowledge-Enhanced Multi-Interest Learning Framework for Multi-Behavior Recommendation

no code implementations • 3 Aug 2022 • Chang Meng, Ziqi Zhao, Wei Guo, Yingxue Zhang, Haolun Wu, Chen Gao, Dong Li, Xiu Li, Ruiming Tang

More specifically, we propose a novel Coarse-to-fine Knowledge-enhanced Multi-interest Learning (CKML) framework to learn shared and behavior-specific interests for different behaviors.

Paper
Add Code

Towards Better Dermoscopic Image Feature Representation Learning for Melanoma Classification

1 code implementation • 15 Jul 2022 • Chenghui Yu, Mingkang Tang, ShengGe Yang, Mingqing Wang, Zhe Xu, Jiangpeng Yan, HanMo Chen, Yu Yang, Xiao-jun Zeng, Xiu Li

Deep learning-based melanoma classification with dermoscopic images has recently shown great potential in automatic early-stage melanoma diagnosis.

Data Augmentation Denoising +2

Paper
Code

Double Check Your State Before Trusting It: Confidence-Aware Bidirectional Offline Model-Based Imagination

1 code implementation • 16 Jun 2022 • Jiafei Lyu, Xiu Li, Zongqing Lu

Model-based RL methods offer a richer dataset and benefit generalization by generating imaginary trajectories with either a trained forward or reverse dynamics model.

D4RL Offline RL +1

Paper
Code

Seeking Common Ground While Reserving Differences: Multiple Anatomy Collaborative Framework for Undersampled MRI Reconstruction

no code implementations • 15 Jun 2022 • Jiangpeng Yan, Chenghui Yu, Hanbo Chen, Zhe Xu, Junzhou Huang, Xiu Li, Jianhua Yao

Four different implementations of anatomy-specific learners are presented and explored on top of our framework in two MRI reconstruction networks.

Anatomy De-aliasing +1

Paper
Add Code

Mildly Conservative Q-Learning for Offline Reinforcement Learning

3 code implementations • 9 Jun 2022 • Jiafei Lyu, Xiaoteng Ma, Xiu Li, Zongqing Lu

The distribution shift between the learned policy and the behavior policy makes it necessary for the value function to stay conservative such that out-of-distribution (OOD) actions will not be severely overestimated.

D4RL Q-Learning +2

238

Paper
Code

OrdinalCLIP: Learning Rank Prompts for Language-Guided Ordinal Regression

1 code implementation • 6 Jun 2022 • Wanhua Li, Xiaoke Huang, Zheng Zhu, Yansong Tang, Xiu Li, Jie Zhou, Jiwen Lu

In this paper, we propose to learn the rank concepts from the rich semantic CLIP latent space.

Ranked #1 on Few-shot Age Estimation on MORPH Album2

Aesthetics Quality Assessment Few-shot Age Estimation +4

Paper
Code

UniInst: Unique Representation for End-to-End Instance Segmentation

1 code implementation • 25 May 2022 • Yimin Ou, Rui Yang, Lufan Ma, Yong Liu, Jiangpeng Yan, Shang Xu, Chengjie Wang, Xiu Li

Existing instance segmentation methods have achieved impressive performance but still suffer from a common dilemma: redundant representations (e.g., multiple boxes, grids, and anchor points) are inferred for one instance, which leads to multiple duplicated predictions.

Instance Segmentation Re-Ranking +2

132

Paper
Code

ScalableViT: Rethinking the Context-oriented Generalization of Vision Transformer

2 code implementations • 21 Mar 2022 • Rui Yang, Hailong Ma, Jie Wu, Yansong Tang, Xuefeng Xiao, Min Zheng, Xiu Li

The vanilla self-attention mechanism inherently relies on pre-defined and steadfast computational dimensions.

Paper
Code

Rethinking Goal-conditioned Supervised Learning and Its Connection to Offline RL

1 code implementation • ICLR 2022 • Rui Yang, Yiming Lu, Wenzhe Li, Hao Sun, Meng Fang, Yali Du, Xiu Li, Lei Han, Chongjie Zhang

In this paper, we revisit the theoretical property of GCSL -- optimizing a lower bound of the goal reaching objective, and extend GCSL as a novel offline goal-conditioned RL algorithm.

Offline RL Reinforcement Learning (RL) +1

Paper
Code

Hybrid intelligence for dynamic job-shop scheduling with deep reinforcement learning and attention mechanism

1 code implementation • 3 Jan 2022 • Yunhui Zeng, Zijun Liao, Yuanzhi Dai, Rong Wang, Xiu Li, Bo Yuan

The dynamic job-shop scheduling problem (DJSP) is a class of scheduling tasks that specifically consider the inherent uncertainties such as changing order requirements and possible machine breakdown in realistic smart manufacturing settings.

Graph Representation Learning Job Shop Scheduling +2

177

Paper
Code

Value Activation for Bias Alleviation: Generalized-activated Deep Double Deterministic Policy Gradients

1 code implementation • 21 Dec 2021 • Jiafei Lyu, Yu Yang, Jiangpeng Yan, Xiu Li

It is vital to accurately estimate the value function in Deep Reinforcement Learning (DRL) such that the agent could execute proper actions instead of suboptimal ones.

Continuous Control

Paper
Code

Implicit Feature Refinement for Instance Segmentation

1 code implementation • 9 Dec 2021 • Lufan Ma, Tiancai Wang, Bin Dong, Jiangpeng Yan, Xiu Li, Xiangyu Zhang

Our IFR enjoys several advantages: 1) it simulates an infinite-depth refinement network while only requiring the parameters of a single residual block; 2) it produces high-level equilibrium instance features with a global receptive field; 3) it serves as a plug-and-play general module that is easily extended to most object recognition frameworks.

Instance Segmentation Object Recognition +3

Paper
Code

CLIP4Caption: CLIP for Video Caption

no code implementations • 13 Oct 2021 • Mingkang Tang, Zhanyu Wang, Zhenhua Liu, Fengyun Rao, Dian Li, Xiu Li

It is noted that our model is only trained on the MSR-VTT dataset.

Decoder Sentence +4

Paper
Add Code

Double-Uncertainty Guided Spatial and Temporal Consistency Regularization Weighting for Learning-based Abdominal Registration

no code implementations • 6 Jul 2021 • Zhe Xu, Jie Luo, Donghuan Lu, Jiangpeng Yan, Sarah Frisken, Jayender Jagadeesan, William Wells III, Xiu Li, Yefeng Zheng, Raymond Tong

Such convention has two limitations: (i) Besides the laborious grid search for the optimal fixed weight, the regularization strength of a specific image pair should be associated with the content of the images, thus the "one value fits all" training scheme is not ideal; (ii) Only spatially regularizing the transformation may neglect some informative clues related to the ill-posedness.

Image Registration

Paper
Add Code

MHER: Model-based Hindsight Experience Replay

no code implementations • 1 Jul 2021 • Rui Yang, Meng Fang, Lei Han, Yali Du, Feng Luo, Xiu Li

Replacing original goals with virtual goals generated from interaction with a trained dynamics model leads to a novel relabeling method, model-based relabeling (MBR).

Multi-Goal Reinforcement Learning reinforcement-learning +1

Paper
Add Code

A Self-Boosting Framework for Automated Radiographic Report Generation

no code implementations • CVPR 2021 • Zhanyu Wang, Luping Zhou, Lei Wang, Xiu Li

On one hand, the image-text matching branch helps to learn highly text-correlated visual features for the report generation branch to output high quality reports.

Image Captioning Image-text matching +3

Paper
Add Code

Self-Supervised Video Hashing via Bidirectional Transformers

1 code implementation • CVPR 2021 • Shuyan Li, Xiu Li, Jiwen Lu, Jie Zhou

Most existing unsupervised video hashing methods are built on unidirectional models with less reliable training objectives, which underuse the correlations among frames and the similarity structure between videos.

Decoder Retrieval +1

Paper
Code

A Coarse-to-Fine Instance Segmentation Network with Learning Boundary Representation

no code implementations • 18 Jun 2021 • Feng Luo, Bin-Bin Gao, Jiangpeng Yan, Xiu Li

Experiments also show that our proposed method achieves competitive performance compared to existing boundary-based methods with a lightweight design and a simple pipeline.

Distance regression Instance Segmentation +2

Paper
Add Code

Efficient Continuous Control with Double Actors and Regularized Critics

1 code implementation • 6 Jun 2021 • Jiafei Lyu, Xiaoteng Ma, Jiangpeng Yan, Xiu Li

First, we uncover and demonstrate the bias alleviation property of double actors by building double actors upon single critic and double critics to handle overestimation bias in DDPG and underestimation bias in TD3 respectively.

Continuous Control Reinforcement Learning (RL)

Paper
Code

Noisy Labels are Treasure: Mean-Teacher-Assisted Confident Learning for Hepatic Vessel Segmentation

1 code implementation • 3 Jun 2021 • Zhe Xu, Donghuan Lu, Yixin Wang, Jie Luo, Jayender Jagadeesan, Kai Ma, Yefeng Zheng, Xiu Li

Manually segmenting the hepatic vessels from Computed Tomography (CT) is far more expertise-demanding and laborious than segmenting other structures due to the low contrast and complex morphology of vessels, resulting in an extreme lack of high-quality labeled data.

Paper
Code

Reward function shape exploration in adversarial imitation learning: an empirical study

no code implementations • 14 Apr 2021 • Yawei Wang, Xiu Li

To ensure our results' reliability, we conduct the experiments on a series of MuJoCo and Box2D continuous control tasks based on four different AILs.

Continuous Control Imitation Learning

Paper
Add Code

Universal and Flexible Optical Aberration Correction Using Deep-Prior Based Deconvolution

1 code implementation • ICCV 2021 • Xiu Li, Jinli Suo, Weihang Zhang, Xin Yuan, Qionghai Dai

High-quality imaging usually requires bulky and expensive lenses to compensate for geometric and chromatic aberrations.

Paper
Code

Bias-reduced Multi-step Hindsight Experience Replay for Efficient Multi-goal Reinforcement Learning

no code implementations • 25 Feb 2021 • Rui Yang, Jiafei Lyu, Yu Yang, Jiangpeng Yan, Feng Luo, Dijun Luo, Lanqing Li, Xiu Li

Two main challenges in multi-goal reinforcement learning are sparse rewards and sample inefficiency.

Multi-Goal Reinforcement Learning reinforcement-learning +2

Paper
Add Code

Frequency-Aware Spatiotemporal Transformers for Video Inpainting Detection

no code implementations • ICCV 2021 • Bingyao Yu, Wanhua Li, Xiu Li, Jiwen Lu, Jie Zhou

In this paper, we propose a Frequency-Aware Spatiotemporal Transformer (FAST) for video inpainting detection, which aims to simultaneously mine the traces of video inpainting from spatial, temporal, and frequency domains.

Decoder Video Inpainting

Paper
Add Code

Unsupervised Multimodal Image Registration with Adaptative Gradient Guidance

no code implementations • 12 Nov 2020 • Zhe Xu, Jiangpeng Yan, Jie Luo, Xiu Li, Jayender Jagadeesan

Multimodal image registration (MIR) is a fundamental procedure in many image-guided therapies.

Image Registration

Paper
Add Code

Unimodal Cyclic Regularization for Training Multimodal Image Registration Networks

no code implementations • 12 Nov 2020 • Zhe Xu, Jiangpeng Yan, Jie Luo, William Wells, Xiu Li, Jayender Jagadeesan

The loss function of an unsupervised multimodal image registration framework has two terms, i.e., a metric for similarity measure and regularization.

Image Registration

Paper
Add Code

F3RNet: Full-Resolution Residual Registration Network for Deformable Image Registration

no code implementations • 15 Sep 2020 • Zhe Xu, Jie Luo, Jiangpeng Yan, Xiu Li, Jagadeesan Jayender

In this paper, we propose a novel unsupervised registration network, namely the Full-Resolution Residual Registration Network (F3RNet), for deformable registration of severely deformed organs.

Image Registration

Paper
Add Code

Adversarial Uni- and Multi-modal Stream Networks for Multimodal Image Registration

no code implementations • 6 Jul 2020 • Zhe Xu, Jie Luo, Jiangpeng Yan, Ritvik Pulya, Xiu Li, William Wells III, Jayender Jagadeesan

Deformable image registration between Computed Tomography (CT) images and Magnetic Resonance (MR) imaging is essential for many image-guided therapies.

Computed Tomography (CT) Image Registration +2

Paper
Add Code

Disentangled Non-Local Neural Networks

5 code implementations • ECCV 2020 • Minghao Yin, Zhuliang Yao, Yue Cao, Xiu Li, Zheng Zhang, Stephen Lin, Han Hu

This paper first studies the non-local block in depth, where we find that its attention computation can be split into two terms, a whitened pairwise term accounting for the relationship between two pixels and a unary term representing the saliency of every pixel.

Ranked #20 on Semantic Segmentation on Cityscapes test (using extra training data)

Action Recognition object-detection +2

8,331

Paper
Code

Wasserstein Distance guided Adversarial Imitation Learning with Reward Shape Exploration

1 code implementation • 5 Jun 2020 • Ming Zhang, Yawei Wang, Xiaoteng Ma, Li Xia, Jun Yang, Zhiheng Li, Xiu Li

The generative adversarial imitation learning (GAIL) has provided an adversarial learning framework for imitating expert policy from demonstrations in high-dimensional continuous tasks.

Continuous Control Imitation Learning

Paper
Code

4D Association Graph for Realtime Multi-person Motion Capture Using Multiple Video Cameras

1 code implementation • CVPR 2020 • Yuxiang Zhang, Liang An, Tao Yu, Xiu Li, Kun Li, Yebin Liu

Our method enables a realtime online motion capture system running at 30fps using 5 cameras on a 5-person scene.

Ranked #8 on 3D Multi-Person Pose Estimation on Shelf

3D Multi-Person Pose Estimation

179

Paper
Code

Neural Architecture Search for Compressed Sensing Magnetic Resonance Image Reconstruction

1 code implementation • 22 Feb 2020 • Jiangpeng Yan, Shuo Chen, Yongbing Zhang, Xiu Li

Our proposed method can reach a better trade-off between computation cost and reconstruction performance for the MR reconstruction problem with good generalizability, and it offers insights for designing neural networks for other medical image applications.

Image Reconstruction Neural Architecture Search +1

Paper
Code

PgNN: Physics-guided Neural Network for Fourier Ptychographic Microscopy

no code implementations • 19 Sep 2019 • Yongbing Zhang, Yangzhe Liu, Xiu Li, Shaowei Jiang, Krishna Dixit, Xinfeng Zhang, Xiangyang Ji

Since the optimal parameters of the PgNN can be derived by minimizing the difference between the model-generated images and real captured angle-varied images corresponding to the same scene, the proposed PgNN can get rid of the problem of massive training data as in traditional supervised methods.

Paper
Add Code

On the Mathematical Understanding of ResNet with Feynman Path Integral

no code implementations • 16 Apr 2019 • Minghao Yin, Xiu Li, Yongbing Zhang, Shiqi Wang

In this paper, we aim to understand Residual Network (ResNet) in a scientifically sound way by providing a bridge between ResNet and Feynman path integral.

Paper
Add Code

Capture Dense: Markerless Motion Capture Meets Dense Pose Estimation

no code implementations • 5 Dec 2018 • Xiu Li, Yebin Liu, Hanbyul Joo, Qionghai Dai, Yaser Sheikh

Specifically, we first introduce a novel markerless motion capture method that can take advantage of dense parsing capability provided by the dense pose detector.

Human Parsing Markerless Motion Capture +1

Paper
Add Code

Structure from Recurrent Motion: From Rigidity to Recurrency

no code implementations • CVPR 2018 • Xiu Li, Hongdong Li, Hanbyul Joo, Yebin Liu, Yaser Sheikh

This paper proposes a new method for Non-Rigid Structure-from-Motion (NRSfM) from a long monocular video sequence observing a non-rigid object performing a recurrent and possibly repetitive dynamic action.

Clustering

Paper
Add Code

Scale-Aware Face Detection

no code implementations • CVPR 2017 • Zekun Hao, Yu Liu, Hongwei Qin, Junjie Yan, Xiu Li, Xiaolin Hu

Then the scale histogram guides the zoom-in and zoom-out of the image.

Face Detection

Paper
Add Code

Joint Training of Cascaded CNN for Face Detection

no code implementations • CVPR 2016 • Hongwei Qin, Junjie Yan, Xiu Li, Xiaolin Hu

Cascades have been widely used in face detection, where classifiers with low computation cost are applied first to reject most of the background while maintaining recall.

Face Detection Region Proposal

Paper
Add Code
