Search Results for author: Xiaohui Zeng

Found 13 papers, 5 papers with code

Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation

no code implementations • 30 Apr 2024 • Yunhao Ge, Xiaohui Zeng, Jacob Samuel Huffman, Tsung-Yi Lin, Ming-Yu Liu, Yin Cui

VFC consists of three steps: 1) proposal, where image-to-text captioning models propose multiple initial captions; 2) verification, where a large language model (LLM) utilizes tools such as object detection and VQA models to fact-check proposed captions; 3) captioning, where an LLM generates the final caption by summarizing caption proposals and the fact check verification results.

Caption Generation Hallucination +7

Paper
Add Code

NTIRE 2024 Quality Assessment of AI-Generated Content Challenge

no code implementations • 25 Apr 2024 • Xiaohong Liu, Xiongkuo Min, Guangtao Zhai, Chunyi Li, Tengchuan Kou, Wei Sun, HaoNing Wu, Yixuan Gao, Yuqin Cao, ZiCheng Zhang, Xiele Wu, Radu Timofte, Fei Peng, Huiyuan Fu, Anlong Ming, Chuanming Wang, Huadong Ma, Shuai He, Zifei Dou, Shu Chen, Huacong Zhang, Haiyi Xie, Chengwei Wang, Baoying Chen, Jishen Zeng, Jianquan Yang, Weigang Wang, Xi Fang, Xiaoxin Lv, Jun Yan, Tianwu Zhi, Yabin Zhang, Yaohui Li, Yang Li, Jingwen Xu, Jianzhao Liu, Yiting Liao, Junlin Li, Zihao Yu, Yiting Lu, Xin Li, Hossein Motamednia, S. Farhad Hosseini-Benvidi, Fengbin Guan, Ahmad Mahmoudi-Aznaveh, Azadeh Mansouri, Ganzorig Gankhuyag, Kihwan Yoon, Yifang Xu, Haotian Fan, Fangyuan Kong, Shiling Zhao, Weifeng Dong, Haibing Yin, Li Zhu, Zhiling Wang, Bingchen Huang, Avinab Saha, Sandeep Mishra, Shashank Gupta, Rajesh Sureddi, Oindrila Saha, Luigi Celona, Simone Bianco, Paolo Napoletano, Raimondo Schettini, Junfeng Yang, Jing Fu, Wei zhang, Wenzhi Cao, Limei Liu, Han Peng, Weijun Yuan, Zhan Li, Yihang Cheng, Yifan Deng, Haohui Li, Bowen Qu, Yao Li, Shuqing Luo, Shunzhou Wang, Wei Gao, Zihao Lu, Marcos V. Conde, Xinrui Wang, Zhibo Chen, Ruling Liao, Yan Ye, Qiulin Wang, Bing Li, Zhaokun Zhou, Miao Geng, Rui Chen, Xin Tao, Xiaoyu Liang, Shangkun Sun, Xingyuan Ma, Jiaze Li, Mengduo Yang, Haoran Xu, Jie zhou, Shiding Zhu, Bohan Yu, Pengfei Chen, Xinrui Xu, Jiabin Shen, Zhichao Duan, Erfan Asadi, Jiahe Liu, Qi Yan, Youran Qu, Xiaohui Zeng, Lele Wang, Renjie Liao

A total of 196 participants have registered in the video track.

Image Quality Assessment Image Restoration +2

Paper
Add Code

LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis

no code implementations • 22 Mar 2024 • Kevin Xie, Jonathan Lorraine, Tianshi Cao, Jun Gao, James Lucas, Antonio Torralba, Sanja Fidler, Xiaohui Zeng

Recent text-to-3D generation approaches produce impressive 3D results but require time-consuming optimization that can take up to an hour per prompt.

3D Generation Text to 3D

Paper
Add Code

XCube ($\mathcal{X}^3$): Large-Scale 3D Generative Modeling using Sparse Voxel Hierarchies

no code implementations • 6 Dec 2023 • Xuanchi Ren, Jiahui Huang, Xiaohui Zeng, Ken Museth, Sanja Fidler, Francis Williams

In addition to unconditional generation, we show that our model can be used to solve a variety of tasks such as user-guided editing, scene completion from a single scan, and text-to-3D.

3D Shape Generation Scene Generation +1

Paper
Add Code

ATT3D: Amortized Text-to-3D Object Synthesis

no code implementations • ICCV 2023 • Jonathan Lorraine, Kevin Xie, Xiaohui Zeng, Chen-Hsuan Lin, Towaki Takikawa, Nicholas Sharp, Tsung-Yi Lin, Ming-Yu Liu, Sanja Fidler, James Lucas

Text-to-3D modelling has seen exciting progress by combining generative text-to-image models with image-to-3D methods like Neural Radiance Fields.

Image to 3D Object +1

Paper
Add Code

Magic3D: High-Resolution Text-to-3D Content Creation

1 code implementation • CVPR 2023 • Chen-Hsuan Lin, Jun Gao, Luming Tang, Towaki Takikawa, Xiaohui Zeng, Xun Huang, Karsten Kreis, Sanja Fidler, Ming-Yu Liu, Tsung-Yi Lin

DreamFusion has recently demonstrated the utility of a pre-trained text-to-image diffusion model to optimize Neural Radiance Fields (NeRF), achieving remarkable text-to-3D synthesis results.

Ranked #2 on Text to 3D on T$^3$Bench

Text to 3D Vocal Bursts Intensity Prediction

142

Paper
Code

LION: Latent Point Diffusion Models for 3D Shape Generation

2 code implementations • 12 Oct 2022 • Xiaohui Zeng, Arash Vahdat, Francis Williams, Zan Gojcic, Or Litany, Sanja Fidler, Karsten Kreis

To advance 3D DDMs and make them useful for digital artists, we require (i) high generation quality, (ii) flexibility for manipulation and applications such as conditional synthesis and shape interpolation, and (iii) the ability to output smooth surfaces or meshes.

Ranked #1 on Point Cloud Generation on ShapeNet Airplane

3D Generation 3D Shape Generation +3

707

Paper
Code

Improving Semantic Segmentation in Transformers using Hierarchical Inter-Level Attention

no code implementations • 5 Jul 2022 • Gary Leung, Jun Gao, Xiaohui Zeng, Sanja Fidler

HILA extends hierarchical vision transformer architectures by adding local connections between features of higher and lower levels to the backbone encoder.

Object Semantic Segmentation

Paper
Add Code

NP-DRAW: A Non-Parametric Structured Latent Variable Model for Image Generation

1 code implementation • 25 Jun 2021 • Xiaohui Zeng, Raquel Urtasun, Richard Zemel, Sanja Fidler, Renjie Liao

1) We propose a non-parametric prior distribution over the appearance of image parts so that the latent variable ``what-to-draw'' per step becomes a categorical random variable.

Image Generation

Paper
Code

ScribbleBox: Interactive Annotation Framework for Video Object Segmentation

no code implementations • ECCV 2020 • Bo-Wen Chen, Huan Ling, Xiaohui Zeng, Gao Jun, Ziyue Xu, Sanja Fidler

Our approach tolerates a modest amount of noise in the box placements, thus typically only a few clicks are needed to annotate tracked boxes to a sufficient accuracy.

Object Segmentation +3

Paper
Add Code

DMM-Net: Differentiable Mask-Matching Network for Video Object Segmentation

1 code implementation • ICCV 2019 • Xiaohui Zeng, Renjie Liao, Li Gu, Yuwen Xiong, Sanja Fidler, Raquel Urtasun

In practice, it performs similarly to the Hungarian algorithm during inference.

Ranked #18 on Semi-Supervised Video Object Segmentation on DAVIS (no YouTube-VOS training)

Object One-shot visual object segmentation +3

147

Paper
Code

Scene Graph Parsing as Dependency Parsing

2 code implementations • NAACL 2018 • Yu-Siang Wang, Chenxi Liu, Xiaohui Zeng, Alan Yuille

The scene graphs generated by our learned neural dependency parser achieve an F-score similarity of 49. 67% to ground truth graphs on our evaluation set, surpassing best previous approaches by 5%.

Dependency Parsing Image Retrieval +2

Paper
Code

Adversarial Attacks Beyond the Image Space

no code implementations • CVPR 2019 • Xiaohui Zeng, Chenxi Liu, Yu-Siang Wang, Weichao Qiu, Lingxi Xie, Yu-Wing Tai, Chi Keung Tang, Alan L. Yuille

Though image-space adversaries can be interpreted as per-pixel albedo change, we verify that they cannot be well explained along these physically meaningful dimensions, which often have a non-local effect.

Question Answering Visual Question Answering

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.