no code implementations • 8 Apr 2024 • Chenxu Wang, Bin Dai, Huaping Liu, Baoyuan Wang
To gauge the significance of agent architecture, we implement a target-driven planning (TDP) module as an adjunct to the existing agent.
no code implementations • 20 Mar 2024 • Yu Deng, Duomin Wang, Baoyuan Wang
In this paper, we propose a novel learning approach for feed-forward one-shot 4D head avatar synthesis.
1 code implementation • 22 Feb 2024 • Delong Chen, Samuel Cahyawijaya, Jianfeng Liu, Baoyuan Wang, Pascale Fung
Transformer-based vision models typically tokenize images into fixed-size square patches as input units, which lacks the adaptability to image content and overlooks the inherent pixel grouping structure.
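For context, the fixed-size square patch tokenization that this work questions is the standard ViT scheme; a minimal sketch (assuming a 224×224 RGB image and 16×16 patches, names illustrative):

```python
import numpy as np

def patchify(image: np.ndarray, patch_size: int = 16) -> np.ndarray:
    """Split an (H, W, C) image into fixed-size square patches,
    each flattened into a token of length patch_size * patch_size * C."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    # Carve the image into a grid of patches, then flatten each patch.
    grid = image.reshape(h // patch_size, patch_size,
                         w // patch_size, patch_size, c)
    grid = grid.transpose(0, 2, 1, 3, 4)  # (rows, cols, ph, pw, c)
    return grid.reshape(-1, patch_size * patch_size * c)

tokens = patchify(np.zeros((224, 224, 3)))
print(tokens.shape)  # (196, 768)
```

Note how the patch grid is fixed regardless of image content, which is exactly the inflexibility the abstract points out.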
1 code implementation • 19 Feb 2024 • Nuo Chen, Hongguang Li, Juhua Huang, Baoyuan Wang, Jia Li
Existing retrieval-based methods have made significant strides in maintaining long-term conversations.
no code implementations • 18 Dec 2023 • Nuo Chen, Hongguang Li, Baoyuan Wang, Jia Li
IMP-TIP follows the "From Good to Great" concept, collecting multiple potential solutions from both LLMs and their Tool-Augmented counterparts for the same math problem, and then selecting or re-generating the most accurate answer after cross-checking these solutions via tool-augmented interleaf prompting.

no code implementations • 12 Dec 2023 • Haiming Zhang, Zhihao Yuan, Chaoda Zheng, Xu Yan, Baoyuan Wang, Guanbin Li, Song Wu, Shuguang Cui, Zhen Li
Our proposed GSmoothFace model mainly consists of the Audio to Expression Prediction (A2EP) module and the Target Adaptive Face Translation (TAFT) module.
no code implementations • 7 Dec 2023 • Shuliang Ning, Duomin Wang, Yipeng Qin, Zirong Jin, Baoyuan Wang, Xiaoguang Han
Unlike prior arts constrained by specific input types, our method allows flexible specification of style (text or image) and texture (full garment, cropped sections, or texture patches) conditions.
no code implementations • 30 Nov 2023 • Yu Deng, Duomin Wang, Xiaohang Ren, Xingyu Chen, Baoyuan Wang
The key is to first learn a part-wise 4D generative model from monocular images via adversarial learning, to synthesize multi-view images of diverse identities and full motions as training data; then leverage a transformer-based animatable triplane reconstructor to learn 4D head reconstruction using the synthetic data.
no code implementations • 29 Nov 2023 • Duomin Wang, Bin Dai, Yu Deng, Baoyuan Wang
In this study, our goal is to create interactive avatar agents that can autonomously plan and animate nuanced facial movements realistically, from both visual and behavioral perspectives.
no code implementations • 28 Nov 2023 • Zixiang Zhou, Yu Wan, Baoyuan Wang
AvatarGPT treats each task as one type of instruction for fine-tuning the shared LLM.
no code implementations • 28 Nov 2023 • Zixiang Zhou, Yu Wan, Baoyuan Wang
The field has made significant progress in synthesizing realistic human motion driven by various modalities.
no code implementations • 27 Nov 2023 • Xihe Yang, Xingyu Chen, Daiheng Gao, Shaohui Wang, Xiaoguang Han, Baoyuan Wang
For human avatar reconstruction, contemporary techniques commonly necessitate the acquisition of costly data and struggle to achieve satisfactory results from a small number of casual images.
1 code implementation • 8 Oct 2023 • Chengcheng Han, Xiaowei Du, Che Zhang, Yixin Lian, Xiang Li, Ming Gao, Baoyuan Wang
Chain-of-Thought (CoT) prompting has proven to be effective in enhancing the reasoning capabilities of Large Language Models (LLMs) with at least 100 billion parameters.
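As a quick illustration of what CoT prompting looks like in practice (the prompt text below is a hypothetical example, not from the paper), a worked reasoning exemplar is prepended to the query so the model imitates the step-by-step format before answering:

```python
# Minimal Chain-of-Thought prompt construction (illustrative example only).
cot_exemplar = (
    "Q: A farm has 3 pens with 4 sheep each. 2 sheep are sold. How many remain?\n"
    "A: There are 3 * 4 = 12 sheep. After selling 2, 12 - 2 = 10 remain. "
    "The answer is 10.\n\n"
)
question = (
    "Q: A shelf holds 5 rows of 6 books. 7 books are removed. How many remain?\n"
    "A:"
)
# The combined string is what would be sent to the LLM.
prompt = cot_exemplar + question
print(prompt)
```

The paper's finding concerns model scale: this style of prompting helps large models but is much less effective below roughly 100B parameters, which motivates transferring such reasoning ability to smaller models.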
1 code implementation • 4 Sep 2023 • Zixiang Zhou, Weiyuan Li, Baoyuan Wang
We found that directly measuring the embedding distance between motion and music is not an optimal solution.
no code implementations • 11 Aug 2023 • Weiyuan Li, Bin Dai, Ziyi Zhou, Qi Yao, Baoyuan Wang
A high-level prior model can be easily injected on top to generate unlimited long and diverse sequences.
no code implementations • ICCV 2023 • Xiaohang Ren, Xingyu Chen, Pengfei Yao, Heung-Yeung Shum, Baoyuan Wang
The SOTA face swap models still suffer from the problem of either the target identity (i.e., shape) being leaked or the target non-identity attributes (i.e., background, hair) failing to be fully preserved in the final results.
2 code implementations • 3 Jul 2023 • Delong Chen, Jianfeng Liu, Wenliang Dai, Baoyuan Wang
This side effect negatively impacts the model's ability to format responses appropriately -- for instance, its "politeness" -- due to the overly succinct and unformatted nature of raw annotations, resulting in reduced human preference.
1 code implementation • 14 Jun 2023 • Jingsheng Gao, Yixin Lian, Ziyi Zhou, Yuzhuo Fu, Baoyuan Wang
Open-domain dialogue systems have made promising progress in recent years.
1 code implementation • 26 May 2023 • Ke Ji, Yixin Lian, Jingsheng Gao, Baoyuan Wang
Due to the complex label hierarchy and intensive labeling cost in practice, hierarchical text classification (HTC) suffers from poor performance, especially in low-resource or few-shot settings.
4 code implementations • 24 May 2023 • Jingfeng Yao, Xinggang Wang, Shusheng Yang, Baoyuan Wang
Recently, plain vision Transformers (ViTs) have shown impressive performance on various computer vision tasks, thanks to their strong modeling capacity and large-scale pretraining.
Ranked #2 on Image Matting on Distinctions-646
1 code implementation • 8 May 2023 • Anran Lin, Nanxuan Zhao, Shuliang Ning, Yuda Qiu, Baoyuan Wang, Xiaoguang Han
Virtual try-on attracts increasing research attention as a promising way to enhance the user experience of online clothing shopping.
1 code implementation • 21 Mar 2023 • Chaoda Zheng, Xu Yan, Haiming Zhang, Baoyuan Wang, Shenghui Cheng, Shuguang Cui, Zhen Li
Due to the motion-centric nature, our method shows its impressive generalizability with limited training labels and provides good differentiability for end-to-end cycle training.
no code implementations • ICCV 2023 • Xingyu Chen, Yu Deng, Baoyuan Wang
Improving the photorealism via CNN-based 2D super-resolution can break the strict 3D consistency, while keeping the 3D consistency by learning high-resolution 3D representations for direct rendering often compromises image quality.
1 code implementation • 27 Feb 2023 • Nuo Chen, Hongguang Li, Junqing He, Yinan Bao, Xinshi Lin, Qi Yang, Jianfeng Liu, Ruyi Gan, Jiaxing Zhang, Baoyuan Wang, Jia Li
Thus, models' comprehension ability in real scenarios is hard to evaluate reasonably.
1 code implementation • 17 Feb 2023 • Nuo Chen, Hongguang Li, Yinan Bao, Baoyuan Wang, Jia Li
To this end, we construct a new dataset called Penguin to promote the research of MRC, providing a training and test bed for natural response generation to real scenarios.
Chinese Reading Comprehension • Machine Reading Comprehension +1
no code implementations • ICCV 2023 • Zhentao Yu, Zixin Yin, Deyu Zhou, Duomin Wang, Finn Wong, Baoyuan Wang
In this paper, we introduce a simple and novel framework for one-shot audio-driven talking head generation.
1 code implementation • CVPR 2023 • Zixiang Zhou, Baoyuan Wang
Generating controllable and editable human motion sequences is a key challenge in 3D Avatar generation.
1 code implementation • CVPR 2023 • Duomin Wang, Yu Deng, Zixin Yin, Heung-Yeung Shum, Baoyuan Wang
We present a novel one-shot talking head synthesis method that achieves disentangled and fine-grained control over lip motion, eye gaze&blink, head pose, and emotional expression.
no code implementations • CVPR 2023 • Yu Deng, Baoyuan Wang, Heung-Yeung Shum
We introduce a novel detail manifolds reconstructor to learn 3D-consistent fine details on the radiance manifolds from monocular images, and combine them with the coarse radiance manifolds for high-fidelity reconstruction.
no code implementations • CVPR 2023 • Xingyu Chen, Baoyuan Wang, Heung-Yeung Shum
We present HandAvatar, a novel representation for hand animation and rendering, which can generate smoothly compositional geometry and self-occlusion-aware texture.
no code implementations • CVPR 2022 • Wenbin Zhu, Chien-Yi Wang, Kuan-Lun Tseng, Shang-Hong Lai, Baoyuan Wang
Leveraging the environment-specific local data after the deployment of the initial global model, LaFR aims at getting optimal performance by training locally adapted models automatically and without supervision, as opposed to fixing their initial global model.
no code implementations • CVPR 2022 • Chenqian Yan, Yuge Zhang, Quanlu Zhang, Yaming Yang, Xinyang Jiang, Yuqing Yang, Baoyuan Wang
Thanks to HyperFD, each local task (client) is able to effectively leverage the learning "experience" of previous tasks without uploading raw images to the platform; meanwhile, the meta-feature extractor is continuously learned to better trade off the bias and variance.
1 code implementation • CVPR 2022 • Chaoda Zheng, Xu Yan, Haiming Zhang, Baoyuan Wang, Shenghui Cheng, Shuguang Cui, Zhen Li
3D single object tracking (3D SOT) in LiDAR point clouds plays a crucial role in autonomous driving.
Ranked #1 on Object Tracking on KITTI
no code implementations • CVPR 2021 • Noranart Vesdapunt, Baoyuan Wang
Our confidence ranker is model-agnostic, so we can augment the data by choosing the pairs from multiple face detectors during the training, and generalize to a wide range of face detectors during the testing.
no code implementations • ECCV 2020 • Bindita Chaudhuri, Noranart Vesdapunt, Linda Shapiro, Baoyuan Wang
Traditional methods for image-based 3D face reconstruction and facial motion retargeting fit a 3D morphable model (3DMM) to the face, which has limited modeling capacity and fails to generalize well to in-the-wild data.
no code implementations • ECCV 2020 • Noranart Vesdapunt, Mitch Rundle, HsiangTao Wu, Baoyuan Wang
In this paper, we introduce a novel approach to learn a 3D face model using a joint-based face rig and a neural skinning network.
no code implementations • 2 Oct 2019 • Gaurav Mittal, Baoyuan Wang
All previous methods for audio-driven talking head generation assume the input audio to be clean with a neutral tone.
no code implementations • CVPR 2019 • Bindita Chaudhuri, Noranart Vesdapunt, Baoyuan Wang
Facial motion retargeting is an important problem in both computer graphics and vision, which involves capturing the performance of a human face and transferring it to another 3D character.
no code implementations • 20 Mar 2018 • Baoyuan Wang, Noranart Vesdapunt, Utkarsh Sinha, Lei Zhang
The system is designed to run in the viewfinder mode and capture a burst sequence of frames before and after the shutter is pressed.
no code implementations • 6 Mar 2018 • Huan Yang, Baoyuan Wang, Noranart Vesdapunt, Minyi Guo, Sing Bing Kang
We propose a reinforcement learning approach for real-time exposure control of a mobile camera that is personalizable.
no code implementations • 2 Nov 2017 • Bin Dai, Baoyuan Wang, Gang Hua
Selecting attractive photos from a human action shot sequence is quite challenging, because of the subjective nature of the "attractiveness", which is mainly a combined factor of human pose in action and the background.
1 code implementation • 27 Sep 2017 • Yuanming Hu, Hao He, Chenxi Xu, Baoyuan Wang, Stephen Lin
Retouching can significantly elevate the visual appeal of photos, but many casual photographers lack the expertise to do this well.
no code implementations • ICCV 2017 • Tae-Hyun Oh, Kyungdon Joo, Neel Joshi, Baoyuan Wang, In So Kweon, Sing Bing Kang
Cinemagraphs are a compelling way to convey dynamic aspects of a scene.
1 code implementation • CVPR 2017 • Yuanming Hu, Baoyuan Wang, Stephen Lin
However, the patch-based CNNs that exist for this problem are faced with the issue of estimation ambiguity, where a patch may contain insufficient information to establish a unique or even a limited possible range of illumination colors.
no code implementations • ICCV 2015 • Huan Yang, Baoyuan Wang, Stephen Lin, David Wipf, Minyi Guo, Baining Guo
With the growing popularity of short-form video sharing platforms such as Instagram and Vine, there has been an increasing need for techniques that automatically extract highlights from video.
no code implementations • ICCV 2015 • Ruobing Wu, Baoyuan Wang, Wenping Wang, Yizhou Yu
Recent work on scene classification still makes use of generic CNN features in a rudimentary manner.
1 code implementation • 24 Dec 2014 • Zhicheng Yan, Hao Zhang, Baoyuan Wang, Sylvain Paris, Yizhou Yu
Many photographic styles rely on subtle adjustments that depend on the image content and even its semantics.