no code implementations • ICCV 2023 • Koutilya PNVR, Bharat Singh, Pallabi Ghosh, Behjat Siddiquie, David Jacobs
First, we show that the latent space of LDMs (z-space) is a better input representation than other feature representations, such as RGB images or CLIP encodings, for text-based image segmentation.
1 code implementation • CVPR 2020 • Koutilya PNVR, Hao Zhou, David Jacobs
Ideally, this results in images from two domains that present shared information to the primary network.
Ranked #2 on Monocular Depth Estimation on Make3D