Search Results for author: Jacob Zhiyuan Fang

Found 3 papers, 0 papers with code

FlexEControl: Flexible and Efficient Multimodal Control for Text-to-Image Generation

no code implementations • 8 May 2024 • Xuehai He, Jian Zheng, Jacob Zhiyuan Fang, Robinson Piramuthu, Mohit Bansal, Vicente Ordonez, Gunnar A Sigurdsson, Nanyun Peng, Xin Eric Wang

Controllable text-to-image (T2I) diffusion models generate images conditioned on both text prompts and semantic inputs of other modalities like edge maps.

Text-to-Image Generation
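
The FlexEControl entry above describes the general controllable T2I setting: a diffusion model conditioned on a text prompt plus a spatial input such as an edge map. For reference only, here is a minimal sketch of that generic setup using an off-the-shelf ControlNet pipeline from the `diffusers` library; the checkpoints and Canny preprocessing are illustrative assumptions and not the FlexEControl method itself.

```python
# Minimal sketch of text-to-image generation conditioned on an edge map,
# using a generic ControlNet pipeline from `diffusers`.
# NOTE: model checkpoints and preprocessing are illustrative assumptions,
# not the FlexEControl method from the paper above.
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# 1) Build an edge-map condition from a reference image (Canny edges).
reference = np.array(Image.open("reference.png").convert("RGB"))
gray = cv2.cvtColor(reference, cv2.COLOR_RGB2GRAY)
edges = cv2.Canny(gray, 100, 200)
edge_map = Image.fromarray(np.stack([edges] * 3, axis=-1))  # 3-channel condition

# 2) Load a text-to-image diffusion model with an edge-conditioned ControlNet.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# 3) Generate: the text prompt sets the content, the edge map constrains layout.
image = pipe("a watercolor painting of a cottage", image=edge_map,
             num_inference_steps=30).images[0]
image.save("output.png")
```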

E-ViLM: Efficient Video-Language Model via Masked Video Modeling with Semantic Vector-Quantized Tokenizer

no code implementations • 28 Nov 2023 • Jacob Zhiyuan Fang, Skyler Zheng, Vasu Sharma, Robinson Piramuthu

Despite their effectiveness, larger architectures inevitably prevent the models from being deployed in real-world applications, so building a lightweight VL architecture and an efficient learning scheme is of great practical value.

Language Modelling • Question Answering • +3
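
The E-ViLM title above names masked video modeling over semantic vector-quantized tokens. As a rough illustration of that general recipe only (not the paper's architecture), the sketch below quantizes stand-in video patch features against a codebook, masks a subset of positions, and trains a small encoder to predict the masked code indices; every shape, module, and hyperparameter here is an assumption.

```python
# Generic sketch of masked modeling with vector-quantized targets.
# All shapes, the codebook, and the tiny transformer are illustrative
# assumptions -- this is not the E-ViLM architecture.
import torch
import torch.nn as nn

B, T, D, K = 2, 16, 256, 1024          # batch, video tokens, feature dim, codebook size
features = torch.randn(B, T, D)        # stand-in for per-patch video features
codebook = torch.randn(K, D)           # stand-in for a frozen semantic codebook

# 1) Vector-quantize: each feature maps to the index of its nearest codebook entry.
dists = torch.cdist(features, codebook.expand(B, K, D))    # (B, T, K)
vq_targets = dists.argmin(dim=-1)                          # (B, T) integer tokens

# 2) Mask a random subset of token positions.
mask = torch.rand(B, T) < 0.5
inputs = features.clone()
inputs[mask] = 0.0                                          # zero out masked features

# 3) A small encoder predicts the VQ index at every masked position.
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(D, nhead=4, batch_first=True), num_layers=2
)
head = nn.Linear(D, K)
logits = head(encoder(inputs))                              # (B, T, K)

loss = nn.functional.cross_entropy(logits[mask], vq_targets[mask])
loss.backward()
```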

Text-to-image Editing by Image Information Removal

no code implementations • 27 May 2023 • Zhongping Zhang, Jian Zheng, Jacob Zhiyuan Fang, Bryan A. Plummer

Using the input image as a control could mitigate these issues, but because these models are trained via reconstruction, a model can simply hide information about the original image in its encoding and reconstruct it perfectly without ever learning the editing task.

Image Generation • Image Reconstruction
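
The snippet in the entry above points at an information-leakage problem: a reconstruction-trained editor can smuggle the original pixels through its conditioning pathway. One common way to blunt this, in the spirit of the title, is to strip low-level appearance information from the conditioning image before it is encoded. The degradation below (grayscale plus Gaussian noise) is an assumed stand-in for such an "image information removal" step, not the paper's exact module.

```python
# Illustrative sketch: degrade the conditioning image so a reconstruction-trained
# editor cannot simply copy the original pixels. Grayscale + Gaussian noise is an
# assumed stand-in for "image information removal", not the paper's exact module.
import torch

def remove_image_information(image: torch.Tensor, noise_std: float = 0.3) -> torch.Tensor:
    """image: (B, 3, H, W) in [0, 1]. Returns a degraded conditioning image."""
    # Drop color: collapse RGB to luminance, then broadcast back to 3 channels.
    gray = 0.299 * image[:, 0] + 0.587 * image[:, 1] + 0.114 * image[:, 2]
    gray = gray.unsqueeze(1).repeat(1, 3, 1, 1)
    # Perturb texture: additive Gaussian noise, clamped back to the valid range.
    noisy = gray + noise_std * torch.randn_like(gray)
    return noisy.clamp(0.0, 1.0)

# During training, the editing model sees only the degraded image as control, so it
# must rely on the text instruction (and learned priors) to restore appearance.
condition = remove_image_information(torch.rand(4, 3, 256, 256))
```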
