no code implementations • 2 May 2024 • Kelvin C. K. Chan, Yang Zhao, Xuhui Jia, Ming-Hsuan Yang, Huisheng Wang
In subject-driven text-to-image synthesis, the synthesis process tends to be heavily influenced by the reference images provided by users, often overlooking crucial attributes detailed in the text prompt.
no code implementations • 3 Jan 2024 • Hexiang Hu, Kelvin C. K. Chan, Yu-Chuan Su, Wenhu Chen, Yandong Li, Kihyuk Sohn, Yang Zhao, Xue Ben, Boqing Gong, William Cohen, Ming-Wei Chang, Xuhui Jia
We introduce *multi-modal instruction* for image generation, a task representation articulating a range of generation intents with precision.
no code implementations • 5 Dec 2023 • Hsin-Ping Huang, Yu-Chuan Su, Deqing Sun, Lu Jiang, Xuhui Jia, Yukun Zhu, Ming-Hsuan Yang
To achieve detailed control, we propose a unified framework to jointly inject control signals into the existing text-to-video model.
no code implementations • 5 Dec 2023 • Prafull Sharma, Varun Jampani, Yuanzhen Li, Xuhui Jia, Dmitry Lagun, Fredo Durand, William T. Freeman, Mark Matthews
We propose a method to control material attributes of objects like roughness, metallic, albedo, and transparency in real images.
no code implementations • 27 Apr 2023 • Kangning Liu, Yu-Chuan Su, Wei Hong, Ruijin Cang, Xuhui Jia
The one-shot talking-head synthesis task aims to animate a source image to another pose and expression, which is dictated by a driving frame.
no code implementations • 16 Apr 2023 • Hong-You Chen, Jike Zhong, Mingda Zhang, Xuhui Jia, Hang Qi, Boqing Gong, Wei-Lun Chao, Li Zhang
FedBasis learns a small set of shareable "basis" models, which can be linearly combined to form personalized models for clients.
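The linear-combination idea can be sketched in a few lines (a toy NumPy illustration; the function name, shapes, and weights are illustrative, not the paper's implementation):

```python
import numpy as np

def personalize(basis_params, weights):
    """Combine shared basis models into one personalized model.

    basis_params: list of flattened parameter vectors, one per basis model.
    weights: per-client mixing coefficients (normalized to sum to 1).
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()   # normalize mixing coefficients
    stacked = np.stack(basis_params)    # shape: (num_bases, num_params)
    return weights @ stacked            # weighted linear combination

# Two basis models; a client weighted 75/25 toward the first:
bases = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
print(personalize(bases, [0.75, 0.25]))  # [0.75 0.25]
```

Each client only needs to learn and communicate its small weight vector, rather than a full personalized model.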
no code implementations • 14 Apr 2023 • Yu-Chuan Su, Kelvin C. K. Chan, Yandong Li, Yang Zhao, Han Zhang, Boqing Gong, Huisheng Wang, Xuhui Jia
Our approach greatly reduces the overhead of personalized image generation, making it practical for a wide range of applications.
no code implementations • 5 Apr 2023 • Kai Han, Yandong Li, Sagar Vaze, Jie Li, Xuhui Jia
In this paper, we reconsider the recognition problem and task a vision-language model to assign class names to images given only a large and essentially unconstrained vocabulary of categories as prior information.
no code implementations • 5 Apr 2023 • Xuhui Jia, Yang Zhao, Kelvin C. K. Chan, Yandong Li, Han Zhang, Boqing Gong, Tingbo Hou, Huisheng Wang, Yu-Chuan Su
This paper proposes a method for generating images of customized objects specified by users.
no code implementations • CVPR 2022 • Yang Zhao, Yu-Chuan Su, Chun-Te Chu, Yandong Li, Marius Renn, Yukun Zhu, Changyou Chen, Xuhui Jia
While existing approaches for face restoration make significant progress in generating high-quality faces, they often fail to preserve facial features and cannot faithfully reconstruct the original faces.
no code implementations • ICCV 2021 • Xuhui Jia, Kai Han, Yukun Zhu, Bradley Green
This paper studies the problem of novel category discovery on single- and multi-modal data with labels from different but relevant categories.
no code implementations • 1 Jan 2021 • Zhuoran Shen, Irwan Bello, Raviteja Vemulapalli, Xuhui Jia, Ching-Hui Chen
Based on the proposed GSA module, we introduce new standalone global attention-based deep networks that use GSA modules instead of convolutions to model pixel interactions.
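A content-only global attention over all spatial positions can be sketched as follows (a simplified NumPy illustration; the paper's GSA module additionally includes a positional-attention branch, and this toy version omits multi-head structure):

```python
import numpy as np

def global_self_attention(x, Wq, Wk, Wv):
    """Attend over all pixels at once, replacing a local convolution.

    x: feature map flattened to shape (num_pixels, channels).
    Wq, Wk, Wv: learned projection matrices, shape (channels, channels).
    """
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])    # pairwise pixel affinities
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)   # softmax over all pixels
    return attn @ v                            # globally aggregated features
```

Unlike a convolution, every output pixel here depends on every input pixel, which is what lets a GSA-based network model long-range interactions throughout the image.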
1 code implementation • CVPR 2021 • Yandong Li, Xuhui Jia, Ruoxin Sang, Yukun Zhu, Bradley Green, Liqiang Wang, Boqing Gong
This paper is concerned with ranking many pre-trained deep neural networks (DNNs), called checkpoints, for the transfer learning to a downstream task.
Ranked #6 on Transferability (classification benchmark)
no code implementations • 15 Oct 2020 • Bardia Doosti, Ching-Hui Chen, Raviteja Vemulapalli, Xuhui Jia, Yukun Zhu, Bradley Green
In this work, we focus on the task of image-based mutual gaze detection, and propose a simple and effective approach to boost the performance by using an auxiliary 3D gaze estimation task during the training phase.
no code implementations • CVPR 2020 • Yu Liu, Xuhui Jia, Mingxing Tan, Raviteja Vemulapalli, Yukun Zhu, Bradley Green, Xiaogang Wang
Standard Knowledge Distillation (KD) approaches distill the knowledge of a cumbersome teacher model into the parameters of a student model with a pre-defined architecture.
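The standard KD objective referenced here is the temperature-softened KL divergence between teacher and student outputs (a minimal NumPy sketch of Hinton-style distillation, not this paper's method of searching the student architecture):

```python
import numpy as np

def softmax(z, T=1.0):
    """Softmax with temperature T; higher T gives softer distributions."""
    z = np.asarray(z, dtype=float) / T
    z -= z.max()                       # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence from softened teacher to softened student,
    scaled by T^2 as in standard knowledge distillation."""
    p = softmax(teacher_logits, T)     # soft teacher targets
    q = softmax(student_logits, T)
    return float(np.sum(p * (np.log(p) - np.log(q)))) * T * T
```

The loss is zero when the student matches the teacher exactly and grows as their softened distributions diverge; in practice it is mixed with the usual cross-entropy on ground-truth labels.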
no code implementations • 16 Nov 2015 • Heng Yang, Xuhui Jia, Chen Change Loy, Peter Robinson
In this paper, we carry out a rigorous evaluation of these methods by making the following contributions: 1) we propose a new evaluation metric for face alignment on a set of images, i.e., the area under the error distribution curve within a threshold, AUC$_\alpha$, given that the traditional evaluation measure (mean error) is very sensitive to large alignment errors.
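The AUC$_\alpha$ metric can be computed as the normalized area under the cumulative error distribution up to the threshold $\alpha$, e.g. (an illustrative implementation; the threshold-grid resolution is arbitrary):

```python
import numpy as np

def auc_alpha(errors, alpha):
    """Area under the cumulative error distribution curve up to the
    threshold alpha, normalized so a perfect method scores 1.0."""
    errors = np.sort(np.asarray(errors, dtype=float))
    ts = np.linspace(0.0, alpha, 1001)                 # threshold grid
    # fraction of images whose alignment error is <= t, per threshold t
    cdf = np.searchsorted(errors, ts, side="right") / len(errors)
    # trapezoidal integration of the CDF, normalized by alpha
    area = ((cdf[:-1] + cdf[1:]) / 2.0 * np.diff(ts)).sum()
    return float(area / alpha)
```

Because errors larger than $\alpha$ contribute nothing beyond the threshold, a few catastrophic failures cannot dominate the score the way they dominate the mean error.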