no code implementations • 12 Oct 2023 • Jaewoo Lee, Jaehong Yoon, Wonjae Kim, Yunji Kim, Sung Ju Hwang
Continuously learning a variety of audio-video semantics over time is crucial for audio-related reasoning tasks in our ever-evolving world.
1 code implementation • ICCV 2023 • Yunji Kim, Jiyoung Lee, Jin-Hwa Kim, Jung-Woo Ha, Jun-Yan Zhu
To address this, we propose DenseDiffusion, a training-free method that adapts a pre-trained text-to-image model to handle such dense captions while offering control over the scene layout.
no code implementations • 30 May 2023 • Doyeon Kim, Eunji Ko, Hyunsu Kim, Yunji Kim, Junho Kim, Dongchan Min, Junmo Kim, Sung Ju Hwang
Portrait stylization, which translates a real human face image into an artistically stylized image, has attracted considerable interest, and many prior works have demonstrated impressive quality in recent years.
no code implementations • 25 May 2023 • Jooyoung Choi, Yunjey Choi, Yunji Kim, Junho Kim, Sungroh Yoon
Text-to-image diffusion models can generate diverse, high-fidelity images based on user-provided text prompts.
no code implementations • ICCV 2023 • Jaewoong Lee, Sangwon Jang, Jaehyeong Jo, Jaehong Yoon, Yunji Kim, Jin-Hwa Kim, Jung-Woo Ha, Sung Ju Hwang
Token-based masked generative models are gaining popularity for their fast inference time with parallel decoding.
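For context on this line of work, MaskGIT-style parallel decoding can be sketched in a few lines: all token positions start masked, and each step commits the model's most confident predictions for the remaining masks. The `logits_fn` below is a hypothetical stand-in for the actual network, not the paper's model.

```python
import numpy as np

MASK = -1  # sentinel for a masked token position

def parallel_decode(logits_fn, seq_len, n_steps):
    """Sketch of confidence-based parallel decoding: start fully masked,
    then at each step fill in the most confident masked positions."""
    tokens = np.full(seq_len, MASK)
    for step in range(n_steps):
        logits = logits_fn(tokens)                      # (seq_len, vocab)
        pred = logits.argmax(axis=-1)
        conf = logits.max(axis=-1)
        conf = np.where(tokens == MASK, conf, -np.inf)  # keep committed tokens
        # Commit enough positions that all masks are filled by the last step.
        n_commit = int(np.ceil((tokens == MASK).sum() / (n_steps - step)))
        commit = np.argsort(-conf)[:n_commit]
        tokens[commit] = pred[commit]
    return tokens

# Dummy "model": fixed logits, independent of the partially decoded input.
rng = np.random.default_rng(0)
fixed_logits = rng.normal(size=(6, 4))
out = parallel_decode(lambda t: fixed_logits, seq_len=6, n_steps=3)
```

With three steps over six positions, the loop commits two tokens per step, which is why this style of decoder needs far fewer forward passes than autoregressive generation.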
1 code implementation • 25 May 2022 • Jin-Hwa Kim, Yunji Kim, Jiyoung Lee, Kang Min Yoo, Sang-Woo Lee
Based on a recent trend in which multimodal generative evaluations exploit a vision-and-language pre-trained model, we propose the negative Gaussian cross-mutual information using the CLIP features as a unified metric, coined Mutual Information Divergence (MID).
Ranked #1 on Human Judgment Classification on Pascal-50S
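A minimal sketch of the Gaussian cross-mutual-information quantity underlying a MID-style metric, assuming two paired feature matrices stand in for CLIP image and text embeddings (the actual metric's features and normalization differ):

```python
import numpy as np

def gaussian_mutual_information(x, y, eps=1e-6):
    """Mutual information between two feature sets under a joint Gaussian
    assumption: I = 1/2 * (logdet(Sx) + logdet(Sy) - logdet(Sxy)).
    x, y: (n_samples, d) arrays of paired features."""
    def logdet(m):
        # slogdet with a small ridge for numerical stability
        return np.linalg.slogdet(m + eps * np.eye(m.shape[0]))[1]
    sx = np.cov(x, rowvar=False)
    sy = np.cov(y, rowvar=False)
    sxy = np.cov(np.concatenate([x, y], axis=1), rowvar=False)
    return 0.5 * (logdet(sx) + logdet(sy) - logdet(sxy))

rng = np.random.default_rng(0)
x = rng.normal(size=(500, 4))
y_corr = x + 0.1 * rng.normal(size=(500, 4))  # well-aligned "pairs"
y_ind = rng.normal(size=(500, 4))             # unrelated "pairs"
```

Aligned pairs yield a much larger mutual information than unrelated ones, which is the property a unified image-text metric needs.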
1 code implementation • ICLR 2022 • Yunji Kim, Jung-Woo Ha
Specifically, we map the input of the generator, which is sampled from a categorical distribution, to the embedding space of the discriminator and let it act as a cluster centroid.
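The centroid idea can be illustrated with a toy sketch: a lookup table maps each categorical code into a feature space, and a sample is assigned to the cluster whose centroid it is nearest to. The table and cosine-similarity assignment here are illustrative assumptions, not the paper's learned networks.

```python
import numpy as np

rng = np.random.default_rng(0)
n_clusters, d = 10, 64

# Hypothetical embedding table: one centroid per categorical code,
# standing in for the learned map into the discriminator's feature space.
centroids = rng.normal(size=(n_clusters, d))

def assign_cluster(feature):
    """Nearest-centroid assignment by cosine similarity."""
    sims = centroids @ feature / (
        np.linalg.norm(centroids, axis=1) * np.linalg.norm(feature))
    return int(np.argmax(sims))

# A feature near its code's centroid should be assigned back to that code.
code = int(rng.integers(n_clusters))
feature = centroids[code] + 0.05 * rng.normal(size=d)
```

Because generated samples are conditioned on a code whose embedding doubles as a centroid, clustering and generation share one feature space.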
1 code implementation • NeurIPS 2019 • Yunji Kim, Seonghyeon Nam, In Cho, Seon Joo Kim
To generate future frames, we first detect keypoints of a moving object and predict future motion as a sequence of keypoints.
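The detect-then-predict pipeline can be illustrated with a toy constant-velocity extrapolation over keypoint coordinates; the paper's motion predictor is learned, so this is only a shape-level sketch under that simplifying assumption.

```python
import numpy as np

def predict_future_keypoints(history, n_future):
    """Toy keypoint forecaster: extrapolate each detected keypoint with a
    constant-velocity model. history: (t, k, 2) past keypoint coordinates;
    returns (n_future, k, 2) predicted coordinates."""
    velocity = history[-1] - history[-2]            # (k, 2) last-step motion
    steps = np.arange(1, n_future + 1)[:, None, None]
    return history[-1] + steps * velocity

# Two keypoints moving right at 1 px/frame and down at 2 px/frame.
past = np.stack([[[0.0, 0.0], [5.0, 5.0]],
                 [[1.0, 2.0], [6.0, 7.0]]])
future = predict_future_keypoints(past, 3)
```

A frame decoder would then render each predicted keypoint configuration back into pixels, which is the part the toy sketch omits.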
no code implementations • NeurIPS 2018 • Seonghyeon Nam, Yunji Kim, Seon Joo Kim
Our task is to semantically modify the visual attributes of an object in an image according to text describing its new visual appearance.