1 code implementation • 13 Dec 2023 • Mykola Lavreniuk, Shariq Farooq Bhat, Matthias Müller, Peter Wonka
Second, we propose a novel image-text alignment module for improved feature extraction from the Stable Diffusion backbone.
Ranked #1 on Referring Expression Segmentation on RefCOCO testB
no code implementations • 5 Dec 2023 • Shariq Farooq Bhat, Niloy J. Mitra, Peter Wonka
We present LooseControl to allow generalized depth conditioning for diffusion-based image generation.
1 code implementation • 4 Dec 2023 • Zhenyu Li, Shariq Farooq Bhat, Peter Wonka
Single image depth estimation is a foundational task in computer vision and generative modeling.
1 code implementation • 16 Oct 2023 • Hanan Gani, Shariq Farooq Bhat, Muzammal Naseer, Salman Khan, Peter Wonka
Diffusion-based generative models have significantly advanced text-to-image generation but encounter challenges when processing lengthy and intricate text prompts describing complex scenes with multiple objects.
3 code implementations • 23 Feb 2023 • Shariq Farooq Bhat, Reiner Birkl, Diana Wofk, Peter Wonka, Matthias Müller
Finally, ZoeD-M12-NK is the first model that can jointly train on multiple datasets (NYU Depth v2 and KITTI) without a significant drop in performance and achieve unprecedented zero-shot generalization performance to eight unseen datasets from both indoor and outdoor domains.
Ranked #16 on Monocular Depth Estimation on NYU-Depth V2 (using extra training data)
1 code implementation • 28 Mar 2022 • Shariq Farooq Bhat, Ibraheem Alhashim, Peter Wonka
We build on AdaBins which estimates a global distribution of depth values for the input image and evolve the architecture in two ways.
Ranked #35 on Monocular Depth Estimation on NYU-Depth V2
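The entry above describes the core AdaBins idea: predict an adaptive, per-image distribution over depth bins and read off depth as a probability-weighted sum of bin centers. A minimal sketch of that final step (function name, shapes, and the uniform test values are illustrative, not the paper's actual implementation):

```python
import numpy as np

def depth_from_bins(bin_edges, probs):
    """Combine adaptive depth bins into a dense depth map.

    bin_edges: (B, N+1) monotonically increasing depth-bin boundaries per image
    probs:     (B, N, H, W) per-pixel softmax weights over the N bins
    returns:   (B, H, W) depth as the expected bin center per pixel
    """
    centers = 0.5 * (bin_edges[:, :-1] + bin_edges[:, 1:])   # (B, N) bin centers
    # Expected depth: weighted sum of bin centers under the per-pixel distribution
    return np.einsum('bn,bnhw->bhw', centers, probs)
```

With uniform weights over bins spanning [0, 4] m, every pixel's expected depth is the mean of the bin centers (2.0 m), which is a quick sanity check on the weighting.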
no code implementations • NeurIPS 2021 • Wamiq Reyaz Para, Shariq Farooq Bhat, Paul Guerrero, Tom Kelly, Niloy Mitra, Leonidas Guibas, Peter Wonka
Sketches can be represented as graphs, with the primitives as nodes and the constraints as edges.
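The graph representation described above (primitives as nodes, constraints as edges) can be sketched with a minimal data structure; the class and field names here are illustrative, not the paper's code:

```python
from dataclasses import dataclass, field

@dataclass
class Primitive:
    """Graph node: a sketch primitive such as a line or arc."""
    kind: str
    params: tuple  # e.g. endpoint coordinates for a line

@dataclass
class Constraint:
    """Graph edge: a geometric constraint relating two primitives."""
    kind: str
    a: int  # index of the first primitive
    b: int  # index of the second primitive

@dataclass
class SketchGraph:
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)

    def add_primitive(self, p: Primitive) -> int:
        self.nodes.append(p)
        return len(self.nodes) - 1  # node index used by constraints

    def add_constraint(self, c: Constraint) -> None:
        self.edges.append(c)
```

For example, two line primitives joined by a perpendicularity constraint form a two-node, one-edge graph, which is the structure a graph-based generative model would consume.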
no code implementations • 4 Jun 2021 • Hiroyasu Akada, Shariq Farooq Bhat, Ibraheem Alhashim, Peter Wonka
Specifically, we extend self-supervised learning from traditional representation learning, which works on images from a single domain, to domain-invariant representation learning, which works on images from two different domains by utilizing an image-to-image translation network.
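One way to read the domain-invariant objective above: features computed from an image and from its translated counterpart (e.g. synthetic-to-real) should agree. A minimal consistency-loss sketch under that assumption (the function is a hypothetical illustration, not the paper's loss):

```python
import numpy as np

def consistency_loss(f_src, f_translated):
    """Mean squared distance between feature vectors of an image
    and its image-to-image translated counterpart; driving this to
    zero encourages a domain-invariant representation."""
    f_src = np.asarray(f_src, dtype=float)
    f_translated = np.asarray(f_translated, dtype=float)
    return float(np.mean((f_src - f_translated) ** 2))
```

In training, both feature vectors would come from a shared encoder applied to the original image and to its translation, so minimizing this term pulls the two domains together in feature space.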
11 code implementations • CVPR 2021 • Shariq Farooq Bhat, Ibraheem Alhashim, Peter Wonka
We address the problem of estimating a high quality dense depth map from a single RGB input image.
Ranked #7 on Depth Estimation on NYU-Depth V2