1 code implementation • 19 Dec 2023 • Barak Meiri, Dvir Samuel, Nir Darshan, Gal Chechik, Shai Avidan, Rami Ben-Ari
Several applications of these models, including image editing, interpolation, and semantic augmentation, require diffusion inversion.
1 code implementation • 18 Dec 2023 • Boaz Lerner, Nir Darshan, Rami Ben-Ari
With such a massive growth in the number of images stored, efficient search in a database has become a crucial endeavor managed by image retrieval systems.
no code implementations • 13 Jul 2023 • Gavriel Habib, Noa Barzilay, Or Shimshi, Rami Ben-Ari, Nir Darshan
Unsupervised Domain Adaptation (UDA) tries to adapt a model, pre-trained in a supervised manner on a source domain, to an unlabelled target domain.
1 code implementation • NeurIPS 2023 • Dvir Samuel, Rami Ben-Ari, Nir Darshan, Haggai Maron, Gal Chechik
Text-to-image diffusion models show great potential in synthesizing a large variety of concepts in new compositions and scenarios.
1 code implementation • NeurIPS 2023 • Matan Levy, Rami Ben-Ari, Nir Darshan, Dani Lischinski
These questions form a dialog with the user in order to retrieve the desired image from a large corpus.
1 code implementation • 27 Apr 2023 • Dvir Samuel, Rami Ben-Ari, Simon Raviv, Nir Darshan, Gal Chechik
We show that their limitation is partly due to the long-tail nature of their training data: web-crawled data sets are strongly unbalanced, causing models to under-represent concepts from the tail of the distribution.
1 code implementation • 16 Mar 2023 • Matan Levy, Rami Ben-Ari, Nir Darshan, Dani Lischinski
To address these shortcomings, we introduce the Large Scale Composed Image Retrieval (LaSCo) dataset, a new CoIR dataset which is ten times larger than existing ones.
Ranked #1 on Image Retrieval on LaSCo
1 code implementation • 17 May 2022 • Daniel Rotman, Yevgeny Yaroker, Elad Amrani, Udi Barzelay, Rami Ben-Ari
Video scene detection is the task of dividing videos into temporal semantic chapters.
1 code implementation • 29 Nov 2021 • Matan Levy, Rami Ben-Ari, Dani Lischinski
Our model is particularly well suited for realistic questions with out-of-vocabulary answers that require regression.
Ranked #1 on Visual Question Answering (VQA) on PlotQA-D1
no code implementations • 21 Apr 2020 • Rami Ben-Ari, Mor Shpigel, Ophir Azulai, Udi Barzelay, Daniel Rotman
Classification of new class entities requires collecting and annotating hundreds or thousands of samples, which is often prohibitively costly.
1 code implementation • 6 Mar 2020 • Elad Amrani, Rami Ben-Ari, Daniel Rotman, Alex Bronstein
One of the key factors in enabling machine learning models to comprehend and solve real-world tasks is leveraging multimodal data.
Ranked #3 on Visual Question Answering on MSRVTT-QA (Accuracy metric)
1 code implementation • 27 May 2019 • Elad Amrani, Rami Ben-Ari, Tal Hakim, Alex Bronstein
In this work, we propose to exploit the natural correlation between narrations and the visual presence of objects in video to learn an object detector and retrieval without any manual labeling.
no code implementations • 29 Apr 2019 • Ran Bakalo, Jacob Goldberger, Rami Ben-Ari
We show that the time-consuming local annotations involved in supervised learning can be addressed by a weakly supervised method that can leverage a subset of locally annotated data.
no code implementations • 28 Apr 2019 • Ran Bakalo, Rami Ben-Ari, Jacob Goldberger
The high cost of generating expert annotations poses a strong limitation for supervised machine learning methods in medical imaging.