no code implementations • 8 Apr 2024 • Chenxu Wang, Bin Dai, Huaping Liu, Baoyuan Wang
To gauge the significance of agent architecture, we implement a target-driven planning (TDP) module as an adjunct to the existing agent.
no code implementations • 20 Mar 2024 • Yu Deng, Duomin Wang, Baoyuan Wang
In this paper, we propose a novel learning approach for feed-forward one-shot 4D head avatar synthesis.
1 code implementation • 22 Feb 2024 • Delong Chen, Samuel Cahyawijaya, Jianfeng Liu, Baoyuan Wang, Pascale Fung
Transformer-based vision models typically tokenize images into fixed-size square patches as input units, which lacks the adaptability to image content and overlooks the inherent pixel grouping structure.
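For context, the fixed-size square patch tokenization that this work questions is the standard ViT scheme; a minimal sketch (assuming a 224×224 RGB image and 16×16 patches, names illustrative):

```python
import numpy as np

def patchify(image: np.ndarray, patch_size: int = 16) -> np.ndarray:
    """Split an (H, W, C) image into fixed-size square patches,
    each flattened into a token of length patch_size * patch_size * C."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    # Carve the image into a grid of patches, then flatten each patch.
    grid = image.reshape(h // patch_size, patch_size,
                         w // patch_size, patch_size, c)
    grid = grid.transpose(0, 2, 1, 3, 4)  # (rows, cols, ph, pw, c)
    return grid.reshape(-1, patch_size * patch_size * c)

tokens = patchify(np.zeros((224, 224, 3)))
print(tokens.shape)  # (196, 768)
```

Note how the patch grid is fixed regardless of image content, which is exactly the inflexibility the abstract points out.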
1 code implementation • 19 Feb 2024 • Nuo Chen, Hongguang Li, Juhua Huang, Baoyuan Wang, Jia Li
Existing retrieval-based methods have made significant strides in maintaining long-term conversations.
no code implementations • 18 Dec 2023 • Nuo Chen, Hongguang Li, Baoyuan Wang, Jia Li
IMP-TIP follows the "From Good to Great" concept, collecting multiple potential solutions from both LLMs and their Tool-Augmented counterparts for the same math problem, and then selecting or re-generating the most accurate answer after cross-checking these solutions via tool-augmented interleaf prompting.

no code implementations • 12 Dec 2023 • Haiming Zhang, Zhihao Yuan, Chaoda Zheng, Xu Yan, Baoyuan Wang, Guanbin Li, Song Wu, Shuguang Cui, Zhen Li
Our proposed GSmoothFace model mainly consists of the Audio to Expression Prediction (A2EP) module and the Target Adaptive Face Translation (TAFT) module.
no code implementations • 7 Dec 2023 • Shuliang Ning, Duomin Wang, Yipeng Qin, Zirong Jin, Baoyuan Wang, Xiaoguang Han
Unlike prior arts constrained by specific input types, our method allows flexible specification of style (text or image) and texture (full garment, cropped sections, or texture patches) conditions.
no code implementations • 30 Nov 2023 • Yu Deng, Duomin Wang, Xiaohang Ren, Xingyu Chen, Baoyuan Wang
The key is to first learn a part-wise 4D generative model from monocular images via adversarial learning, to synthesize multi-view images of diverse identities and full motions as training data; then leverage a transformer-based animatable triplane reconstructor to learn 4D head reconstruction using the synthetic data.
no code implementations • 29 Nov 2023 • Duomin Wang, Bin Dai, Yu Deng, Baoyuan Wang
In this study, our goal is to create interactive avatar agents that can autonomously plan and animate nuanced facial movements realistically, from both visual and behavioral perspectives.
no code implementations • 28 Nov 2023 • Zixiang Zhou, Yu Wan, Baoyuan Wang
AvatarGPT treats each task as one type of instruction for fine-tuning the shared LLM.
no code implementations • 28 Nov 2023 • Zixiang Zhou, Yu Wan, Baoyuan Wang
The field has made significant progress in synthesizing realistic human motion driven by various modalities.
no code implementations • 27 Nov 2023 • Xihe Yang, Xingyu Chen, Daiheng Gao, Shaohui Wang, Xiaoguang Han, Baoyuan Wang
For human avatar reconstruction, contemporary techniques commonly necessitate the acquisition of costly data and struggle to achieve satisfactory results from a small number of casual images.
1 code implementation • 8 Oct 2023 • Chengcheng Han, Xiaowei Du, Che Zhang, Yixin Lian, Xiang Li, Ming Gao, Baoyuan Wang
Chain-of-Thought (CoT) prompting has proven to be effective in enhancing the reasoning capabilities of Large Language Models (LLMs) with at least 100 billion parameters.
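As a quick illustration of what CoT prompting looks like in practice (the prompt text below is a hypothetical example, not from the paper), a worked reasoning exemplar is prepended to the query so the model imitates the step-by-step format before answering:

```python
# Minimal Chain-of-Thought prompt construction (illustrative example only).
cot_exemplar = (
    "Q: A farm has 3 pens with 4 sheep each. 2 sheep are sold. How many remain?\n"
    "A: There are 3 * 4 = 12 sheep. After selling 2, 12 - 2 = 10 remain. "
    "The answer is 10.\n\n"
)
question = (
    "Q: A shelf holds 5 rows of 6 books. 7 books are removed. How many remain?\n"
    "A:"
)
# The combined string is what would be sent to the LLM.
prompt = cot_exemplar + question
print(prompt)
```

The paper's finding concerns model scale: this style of prompting helps large models but is much less effective below roughly 100B parameters, which motivates transferring such reasoning ability to smaller models.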
1 code implementation • 4 Sep 2023 • Zixiang Zhou, Weiyuan Li, Baoyuan Wang
We found that directly measuring the embedding distance between motion and music is not an optimal solution.
no code implementations • 11 Aug 2023 • Weiyuan Li, Bin Dai, Ziyi Zhou, Qi Yao, Baoyuan Wang
A high-level prior model can be easily injected on top to generate unlimited long and diverse sequences.
no code implementations • ICCV 2023 • Xiaohang Ren, Xingyu Chen, Pengfei Yao, Heung-Yeung Shum, Baoyuan Wang
The SOTA face swap models still suffer from the problem of either the target identity (i.e., shape) being leaked or the target non-identity attributes (i.e., background, hair) failing to be fully preserved in the final results.
2 code implementations • 3 Jul 2023 • Delong Chen, Jianfeng Liu, Wenliang Dai, Baoyuan Wang
This side effect negatively impacts the model's ability to format responses appropriately -- for instance, its "politeness" -- due to the overly succinct and unformatted nature of raw annotations, resulting in reduced human preference.
1 code implementation • 14 Jun 2023 • Jingsheng Gao, Yixin Lian, Ziyi Zhou, Yuzhuo Fu, Baoyuan Wang
Open-domain dialogue systems have made promising progress in recent years.
1 code implementation • 26 May 2023 • Ke Ji, Yixin Lian, Jingsheng Gao, Baoyuan Wang
Due to the complex label hierarchy and intensive labeling cost in practice, hierarchical text classification (HTC) suffers from poor performance, especially in low-resource or few-shot settings.
4 code implementations • 24 May 2023 • Jingfeng Yao, Xinggang Wang, Shusheng Yang, Baoyuan Wang
Recently, plain vision Transformers (ViTs) have shown impressive performance on various computer vision tasks, thanks to their strong modeling capacity and large-scale pretraining.
Ranked #2 on Image Matting on Distinctions-646
1 code implementation • 8 May 2023 • Anran Lin, Nanxuan Zhao, Shuliang Ning, Yuda Qiu, Baoyuan Wang, Xiaoguang Han
Virtual try-on attracts increasing research attention as a promising way to enhance the user experience of online clothing shopping.
1 code implementation • 21 Mar 2023 • Chaoda Zheng, Xu Yan, Haiming Zhang, Baoyuan Wang, Shenghui Cheng, Shuguang Cui, Zhen Li
Due to the motion-centric nature, our method shows its impressive generalizability with limited training labels and provides good differentiability for end-to-end cycle training.
no code implementations • ICCV 2023 • Xingyu Chen, Yu Deng, Baoyuan Wang
Improving the photorealism via CNN-based 2D super-resolution can break the strict 3D consistency, while keeping the 3D consistency by learning high-resolution 3D representations for direct rendering often compromises image quality.
1 code implementation • 27 Feb 2023 • Nuo Chen, Hongguang Li, Junqing He, Yinan Bao, Xinshi Lin, Qi Yang, Jianfeng Liu, Ruyi Gan, Jiaxing Zhang, Baoyuan Wang, Jia Li
Thus, models' comprehension ability in real scenarios is hard to evaluate reasonably.
1 code implementation • 17 Feb 2023 • Nuo Chen, Hongguang Li, Yinan Bao, Baoyuan Wang, Jia Li
To this end, we construct a new dataset called Penguin to promote the research of MRC, providing a training and test bed for natural response generation to real scenarios.
Chinese Reading Comprehension • Machine Reading Comprehension +1
no code implementations • ICCV 2023 • Zhentao Yu, Zixin Yin, Deyu Zhou, Duomin Wang, Finn Wong, Baoyuan Wang
In this paper, we introduce a simple and novel framework for one-shot audio-driven talking head generation.
1 code implementation • CVPR 2023 • Zixiang Zhou, Baoyuan Wang
Generating controllable and editable human motion sequences is a key challenge in 3D Avatar generation.
1 code implementation • CVPR 2023 • Duomin Wang, Yu Deng, Zixin Yin, Heung-Yeung Shum, Baoyuan Wang
We present a novel one-shot talking head synthesis method that achieves disentangled and fine-grained control over lip motion, eye gaze&blink, head pose, and emotional expression.
no code implementations • CVPR 2023 • Yu Deng, Baoyuan Wang, Heung-Yeung Shum
We introduce a novel detail manifolds reconstructor to learn 3D-consistent fine details on the radiance manifolds from monocular images, and combine them with the coarse radiance manifolds for high-fidelity reconstruction.
no code implementations • CVPR 2023 • Xingyu Chen, Baoyuan Wang, Heung-Yeung Shum
We present HandAvatar, a novel representation for hand animation and rendering, which can generate smoothly compositional geometry and self-occlusion-aware texture.
no code implementations • CVPR 2022 • Wenbin Zhu, Chien-Yi Wang, Kuan-Lun Tseng, Shang-Hong Lai, Baoyuan Wang
Leveraging the environment-specific local data after the deployment of the initial global model, LaFR aims at getting optimal performance by training locally adapted models automatically and without supervision, as opposed to fixing their initial global model.
no code implementations • CVPR 2022 • Chenqian Yan, Yuge Zhang, Quanlu Zhang, Yaming Yang, Xinyang Jiang, Yuqing Yang, Baoyuan Wang
Thanks to HyperFD, each local task (client) is able to effectively leverage the learning "experience" of previous tasks without uploading raw images to the platform; meanwhile, the meta-feature extractor is continuously learned to better trade off the bias and variance.
1 code implementation • CVPR 2022 • Chaoda Zheng, Xu Yan, Haiming Zhang, Baoyuan Wang, Shenghui Cheng, Shuguang Cui, Zhen Li
3D single object tracking (3D SOT) in LiDAR point clouds plays a crucial role in autonomous driving.
Ranked #1 on Object Tracking on KITTI
no code implementations • CVPR 2021 • Noranart Vesdapunt, Baoyuan Wang
Our confidence ranker is model-agnostic, so we can augment the data by choosing the pairs from multiple face detectors during the training, and generalize to a wide range of face detectors during the testing.
no code implementations • ECCV 2020 • Bindita Chaudhuri, Noranart Vesdapunt, Linda Shapiro, Baoyuan Wang
Traditional methods for image-based 3D face reconstruction and facial motion retargeting fit a 3D morphable model (3DMM) to the face, which has limited modeling capacity and fails to generalize well to in-the-wild data.
no code implementations • ECCV 2020 • Noranart Vesdapunt, Mitch Rundle, HsiangTao Wu, Baoyuan Wang
In this paper, we introduce a novel approach to learn a 3D face model using a joint-based face rig and a neural skinning network.
no code implementations • 2 Oct 2019 • Gaurav Mittal, Baoyuan Wang
All previous methods for audio-driven talking head generation assume the input audio to be clean with a neutral tone.
no code implementations • CVPR 2019 • Bindita Chaudhuri, Noranart Vesdapunt, Baoyuan Wang
Facial motion retargeting is an important problem in both computer graphics and vision, which involves capturing the performance of a human face and transferring it to another 3D character.
no code implementations • 20 Mar 2018 • Baoyuan Wang, Noranart Vesdapunt, Utkarsh Sinha, Lei Zhang
The system is designed to run in the viewfinder mode and capture a burst sequence of frames before and after the shutter is pressed.
no code implementations • 6 Mar 2018 • Huan Yang, Baoyuan Wang, Noranart Vesdapunt, Minyi Guo, Sing Bing Kang
We propose a reinforcement learning approach for real-time exposure control of a mobile camera that is personalizable.
no code implementations • 2 Nov 2017 • Bin Dai, Baoyuan Wang, Gang Hua
Selecting attractive photos from a human action shot sequence is quite challenging, because of the subjective nature of the "attractiveness", which is mainly a combined factor of human pose in action and the background.
1 code implementation • 27 Sep 2017 • Yuanming Hu, Hao He, Chenxi Xu, Baoyuan Wang, Stephen Lin
Retouching can significantly elevate the visual appeal of photos, but many casual photographers lack the expertise to do this well.
no code implementations • ICCV 2017 • Tae-Hyun Oh, Kyungdon Joo, Neel Joshi, Baoyuan Wang, In So Kweon, Sing Bing Kang
Cinemagraphs are a compelling way to convey dynamic aspects of a scene.
1 code implementation • CVPR 2017 • Yuanming Hu, Baoyuan Wang, Stephen Lin
However, the patch-based CNNs that exist for this problem are faced with the issue of estimation ambiguity, where a patch may contain insufficient information to establish a unique or even a limited possible range of illumination colors.
no code implementations • ICCV 2015 • Huan Yang, Baoyuan Wang, Stephen Lin, David Wipf, Minyi Guo, Baining Guo
With the growing popularity of short-form video sharing platforms such as Instagram and Vine, there has been an increasing need for techniques that automatically extract highlights from video.
no code implementations • ICCV 2015 • Ruobing Wu, Baoyuan Wang, Wenping Wang, Yizhou Yu
Recent work on scene classification still makes use of generic CNN features in a rudimentary manner.
1 code implementation • 24 Dec 2014 • Zhicheng Yan, Hao Zhang, Baoyuan Wang, Sylvain Paris, Yizhou Yu
Many photographic styles rely on subtle adjustments that depend on the image content and even its semantics.