no code implementations • 29 Nov 2023 • Jiao Sun, Deqing Fu, Yushi Hu, Su Wang, Royi Rassin, Da-Cheng Juan, Dana Alon, Charles Herrmann, Sjoerd van Steenkiste, Ranjay Krishna, Cyrus Rashtchian
Then, it uses two VLMs to select the best generation: a Visual Question Answering model that measures the alignment of generated images to the text, and another that measures the generation's aesthetic quality.
no code implementations • 1 Nov 2023 • Tiwalayo Eisape, MH Tessler, Ishita Dasgupta, Fei Sha, Sjoerd van Steenkiste, Tal Linzen
A central component of rational behavior is logical inference: the process of determining which conclusions follow from a set of premises.
no code implementations • 30 Oct 2023 • Jackson Petty, Sjoerd van Steenkiste, Ishita Dasgupta, Fei Sha, Dan Garrette, Tal Linzen
Because model latency is approximately linear in the number of layers, these results lead us to the recommendation that, with a given total parameter budget, transformers can be made shallower than is typical without sacrificing performance.
no code implementations • 9 Oct 2023 • Maximilian Seitzer, Sjoerd van Steenkiste, Thomas Kipf, Klaus Greff, Mehdi S. M. Sajjadi
Our Dynamic Scene Transformer (DyST) model leverages recent work in neural scene representation to learn a latent decomposition of monocular real-world videos into scene content, per-view scene dynamics, and camera pose.
no code implementations • 13 Jun 2023 • Allan Jabri, Sjoerd van Steenkiste, Emiel Hoogeboom, Mehdi S. M. Sajjadi, Thomas Kipf
In this paper, we leverage recent progress in diffusion models to equip 3D scene representation learning models with the ability to render high-fidelity novel views, while retaining benefits such as object-level scene editing to a large degree.
no code implementations • 30 May 2023 • Roland S. Zimmermann, Sjoerd van Steenkiste, Mehdi S. M. Sajjadi, Thomas Kipf, Klaus Greff
Self-supervised methods for learning object-centric representations have recently been applied successfully to various datasets.
1 code implementation • 10 Feb 2023 • Mostafa Dehghani, Josip Djolonga, Basil Mustafa, Piotr Padlewski, Jonathan Heek, Justin Gilmer, Andreas Steiner, Mathilde Caron, Robert Geirhos, Ibrahim Alabdulmohsin, Rodolphe Jenatton, Lucas Beyer, Michael Tschannen, Anurag Arnab, Xiao Wang, Carlos Riquelme, Matthias Minderer, Joan Puigcerver, Utku Evci, Manoj Kumar, Sjoerd van Steenkiste, Gamaleldin F. Elsayed, Aravindh Mahendran, Fisher Yu, Avital Oliver, Fantine Huot, Jasmijn Bastings, Mark Patrick Collier, Alexey Gritsenko, Vighnesh Birodkar, Cristina Vasconcelos, Yi Tay, Thomas Mensink, Alexander Kolesnikov, Filip Pavetić, Dustin Tran, Thomas Kipf, Mario Lučić, Xiaohua Zhai, Daniel Keysers, Jeremiah Harmsen, Neil Houlsby
The scaling of Transformers has driven breakthrough capabilities for language models.
Ranked #1 on Zero-Shot Transfer Image Classification on ObjectNet
1 code implementation • 9 Feb 2023 • Ondrej Biza, Sjoerd van Steenkiste, Mehdi S. M. Sajjadi, Gamaleldin F. Elsayed, Aravindh Mahendran, Thomas Kipf
Automatically discovering composable abstractions from raw perceptual data is a long-standing challenge in machine learning.
1 code implementation • 18 Nov 2022 • Aditya Ramesh, Louis Kirsch, Sjoerd van Steenkiste, Jürgen Schmidhuber
Furthermore, RC-GVF significantly outperforms previous methods in the absence of ground-truth episodic counts in the partially observable MiniGrid environments.
1 code implementation • 15 Jun 2022 • Gamaleldin F. Elsayed, Aravindh Mahendran, Sjoerd van Steenkiste, Klaus Greff, Michael C. Mozer, Thomas Kipf
The visual world can be parsimoniously characterized in terms of distinct entities with sparse interactions.
no code implementations • 14 Jun 2022 • Mehdi S. M. Sajjadi, Daniel Duckworth, Aravindh Mahendran, Sjoerd van Steenkiste, Filip Pavetić, Mario Lučić, Leonidas J. Guibas, Klaus Greff, Thomas Kipf
A compositional understanding of the world in terms of objects and their geometry in 3D space is considered a cornerstone of human cognition.
1 code implementation • 25 Mar 2022 • Anand Gopalakrishnan, Kazuki Irie, Jürgen Schmidhuber, Sjoerd van Steenkiste
The discovery of reusable sub-routines simplifies decision-making and planning in complex reinforcement learning problems.
1 code implementation • 21 Mar 2022 • Mihir Prabhudesai, Anirudh Goyal, Sujoy Paul, Sjoerd van Steenkiste, Mehdi S. M. Sajjadi, Gaurav Aggarwal, Thomas Kipf, Deepak Pathak, Katerina Fragkiadaki
In our work, we find evidence that these losses are insufficient for the task of scene decomposition, without also considering architectural inductive biases.
no code implementations • 9 Dec 2020 • Klaus Greff, Sjoerd van Steenkiste, Jürgen Schmidhuber
Contemporary neural networks still fall short of human-level generalization, which extends far beyond our direct experiences.
1 code implementation • ICLR 2021 • Anand Gopalakrishnan, Sjoerd van Steenkiste, Jürgen Schmidhuber
We propose PermaKey, a novel approach to representation learning based on object keypoints.
no code implementations • 7 Oct 2020 • Aleksandar Stanić, Sjoerd van Steenkiste, Jürgen Schmidhuber
Common-sense physical reasoning in the real world requires learning about the interactions of objects and their dynamics.
1 code implementation • ICLR 2021 • Róbert Csordás, Sjoerd van Steenkiste, Jürgen Schmidhuber
Neural networks (NNs) whose subnetworks implement reusable functions are expected to offer numerous advantages, including compositionality through efficient recombination of functional building blocks, interpretability, and prevention of catastrophic interference.
no code implementations • ICLR 2020 • Louis Kirsch, Sjoerd van Steenkiste, Jürgen Schmidhuber
Biological evolution has distilled the experiences of many learners into the general learning algorithms of humans.
no code implementations • 3 Jun 2019 • Sjoerd van Steenkiste, Klaus Greff, Jürgen Schmidhuber
In order to meet the diverse challenges in solving many real-world problems, an intelligent agent has to be able to dynamically construct a model of its environment.
no code implementations • NeurIPS 2019 • Sjoerd van Steenkiste, Francesco Locatello, Jürgen Schmidhuber, Olivier Bachem
A disentangled representation encodes information about the salient factors of variation in the data independently.
no code implementations • ICLR 2019 • Sjoerd van Steenkiste, Karol Kurach, Sylvain Gelly
In this work we propose to structure the generator of a GAN to consider objects and their relations explicitly, and generate images by means of composition.
no code implementations • ICLR Workshop DeepGenStruct 2019 • Thomas Unterthiner, Sjoerd van Steenkiste, Karol Kurach, Raphaël Marinier, Marcin Michalski, Sylvain Gelly
While recent generative models of video have had some success, current progress is hampered by the lack of qualitative metrics that consider visual quality, temporal coherence, and diversity of samples.
3 code implementations • 3 Dec 2018 • Thomas Unterthiner, Sjoerd van Steenkiste, Karol Kurach, Raphael Marinier, Marcin Michalski, Sylvain Gelly
To this end we propose Fréchet Video Distance (FVD), a new metric for generative models of video, and StarCraft 2 Videos (SCV), a benchmark of gameplay from custom StarCraft 2 scenarios that challenge the current capabilities of generative models of video.
no code implementations • ICLR 2019 • Sjoerd van Steenkiste, Karol Kurach, Jürgen Schmidhuber, Sylvain Gelly
We present a minimal modification of a standard generator to incorporate this inductive bias and find that it reliably learns to generate images as compositions of objects.
3 code implementations • ICLR 2018 • Sjoerd van Steenkiste, Michael Chang, Klaus Greff, Jürgen Schmidhuber
Common-sense physical reasoning is an essential ingredient for any intelligent agent operating in the real world.
1 code implementation • NeurIPS 2017 • Klaus Greff, Sjoerd van Steenkiste, Jürgen Schmidhuber
Many real-world tasks, such as reasoning and physical interaction, require identification and manipulation of conceptual entities.