Search Results for author: Sjoerd van Steenkiste

Found 26 papers, 11 papers with code

DreamSync: Aligning Text-to-Image Generation with Image Understanding Feedback

no code implementations • 29 Nov 2023 • Jiao Sun, Deqing Fu, Yushi Hu, Su Wang, Royi Rassin, Da-Cheng Juan, Dana Alon, Charles Herrmann, Sjoerd van Steenkiste, Ranjay Krishna, Cyrus Rashtchian

Then, it uses two VLMs to select the best generation: a Visual Question Answering model that measures the alignment of generated images to the text, and another that measures the generation's aesthetic quality.

Question Answering Text-to-Image Generation +1

A Systematic Comparison of Syllogistic Reasoning in Humans and Language Models

no code implementations • 1 Nov 2023 • Tiwalayo Eisape, MH Tessler, Ishita Dasgupta, Fei Sha, Sjoerd van Steenkiste, Tal Linzen

A central component of rational behavior is logical inference: the process of determining which conclusions follow from a set of premises.

Logical Fallacies

The Impact of Depth on Compositional Generalization in Transformer Language Models

no code implementations • 30 Oct 2023 • Jackson Petty, Sjoerd van Steenkiste, Ishita Dasgupta, Fei Sha, Dan Garrette, Tal Linzen

Because model latency is approximately linear in the number of layers, these results lead us to the recommendation that, with a given total parameter budget, transformers can be made shallower than is typical without sacrificing performance.

Language Modelling

DyST: Towards Dynamic Neural Scene Representations on Real-World Videos

no code implementations • 9 Oct 2023 • Maximilian Seitzer, Sjoerd van Steenkiste, Thomas Kipf, Klaus Greff, Mehdi S. M. Sajjadi

Our Dynamic Scene Transformer (DyST) model leverages recent work in neural scene representation to learn a latent decomposition of monocular real-world videos into scene content, per-view scene dynamics, and camera pose.

DORSal: Diffusion for Object-centric Representations of Scenes et al.

no code implementations • 13 Jun 2023 • Allan Jabri, Sjoerd van Steenkiste, Emiel Hoogeboom, Mehdi S. M. Sajjadi, Thomas Kipf

In this paper, we leverage recent progress in diffusion models to equip 3D scene representation learning models with the ability to render high-fidelity novel views, while retaining benefits such as object-level scene editing to a large degree.

Neural Rendering Object +3

Sensitivity of Slot-Based Object-Centric Models to their Number of Slots

no code implementations • 30 May 2023 • Roland S. Zimmermann, Sjoerd van Steenkiste, Mehdi S. M. Sajjadi, Thomas Kipf, Klaus Greff

Self-supervised methods for learning object-centric representations have recently been applied successfully to various datasets.

Scaling Vision Transformers to 22 Billion Parameters

1 code implementation • 10 Feb 2023 • Mostafa Dehghani, Josip Djolonga, Basil Mustafa, Piotr Padlewski, Jonathan Heek, Justin Gilmer, Andreas Steiner, Mathilde Caron, Robert Geirhos, Ibrahim Alabdulmohsin, Rodolphe Jenatton, Lucas Beyer, Michael Tschannen, Anurag Arnab, Xiao Wang, Carlos Riquelme, Matthias Minderer, Joan Puigcerver, Utku Evci, Manoj Kumar, Sjoerd van Steenkiste, Gamaleldin F. Elsayed, Aravindh Mahendran, Fisher Yu, Avital Oliver, Fantine Huot, Jasmijn Bastings, Mark Patrick Collier, Alexey Gritsenko, Vighnesh Birodkar, Cristina Vasconcelos, Yi Tay, Thomas Mensink, Alexander Kolesnikov, Filip Pavetić, Dustin Tran, Thomas Kipf, Mario Lučić, Xiaohua Zhai, Daniel Keysers, Jeremiah Harmsen, Neil Houlsby

The scaling of Transformers has driven breakthrough capabilities for language models.

Ranked #1 on Zero-Shot Transfer Image Classification on ObjectNet

Action Classification Fairness +3

Invariant Slot Attention: Object Discovery with Slot-Centric Reference Frames

1 code implementation • 9 Feb 2023 • Ondrej Biza, Sjoerd van Steenkiste, Mehdi S. M. Sajjadi, Gamaleldin F. Elsayed, Aravindh Mahendran, Thomas Kipf

Automatically discovering composable abstractions from raw perceptual data is a long-standing challenge in machine learning.

Object Object Discovery

Exploring through Random Curiosity with General Value Functions

1 code implementation • 18 Nov 2022 • Aditya Ramesh, Louis Kirsch, Sjoerd van Steenkiste, Jürgen Schmidhuber

Furthermore, RC-GVF significantly outperforms previous methods in the absence of ground-truth episodic counts in the partially observable MiniGrid environments.

Efficient Exploration

SAVi++: Towards End-to-End Object-Centric Learning from Real-World Videos

1 code implementation • 15 Jun 2022 • Gamaleldin F. Elsayed, Aravindh Mahendran, Sjoerd van Steenkiste, Klaus Greff, Michael C. Mozer, Thomas Kipf

The visual world can be parsimoniously characterized in terms of distinct entities with sparse interactions.

Object Semantic Segmentation

Object Scene Representation Transformer

no code implementations • 14 Jun 2022 • Mehdi S. M. Sajjadi, Daniel Duckworth, Aravindh Mahendran, Sjoerd van Steenkiste, Filip Pavetić, Mario Lučić, Leonidas J. Guibas, Klaus Greff, Thomas Kipf

A compositional understanding of the world in terms of objects and their geometry in 3D space is considered a cornerstone of human cognition.

Decoder Novel View Synthesis +2

Unsupervised Learning of Temporal Abstractions with Slot-based Transformers

1 code implementation • 25 Mar 2022 • Anand Gopalakrishnan, Kazuki Irie, Jürgen Schmidhuber, Sjoerd van Steenkiste

The discovery of reusable sub-routines simplifies decision-making and planning in complex reinforcement learning problems.

Decision Making

Test-time Adaptation with Slot-Centric Models

1 code implementation • 21 Mar 2022 • Mihir Prabhudesai, Anirudh Goyal, Sujoy Paul, Sjoerd van Steenkiste, Mehdi S. M. Sajjadi, Gaurav Aggarwal, Thomas Kipf, Deepak Pathak, Katerina Fragkiadaki

In our work, we find evidence that these losses are insufficient for the task of scene decomposition, without also considering architectural inductive biases.

Image Classification Image Segmentation +7

On the Binding Problem in Artificial Neural Networks

no code implementations • 9 Dec 2020 • Klaus Greff, Sjoerd van Steenkiste, Jürgen Schmidhuber

Contemporary neural networks still fall short of human-level generalization, which extends far beyond our direct experiences.

Unsupervised Object Keypoint Learning using Local Spatial Predictability

1 code implementation • ICLR 2021 • Anand Gopalakrishnan, Sjoerd van Steenkiste, Jürgen Schmidhuber

We propose PermaKey, a novel approach to representation learning based on object keypoints.

Object Salient Object Detection

Hierarchical Relational Inference

no code implementations • 7 Oct 2020 • Aleksandar Stanić, Sjoerd van Steenkiste, Jürgen Schmidhuber

Common-sense physical reasoning in the real world requires learning about the interactions of objects and their dynamics.

Common Sense Reasoning

Are Neural Nets Modular? Inspecting Functional Modularity Through Differentiable Weight Masks

1 code implementation • ICLR 2021 • Róbert Csordás, Sjoerd van Steenkiste, Jürgen Schmidhuber

Neural networks (NNs) whose subnetworks implement reusable functions are expected to offer numerous advantages, including compositionality through efficient recombination of functional building blocks, interpretability, preventing catastrophic interference, etc.

Systematic Generalization

Improving Generalization in Meta Reinforcement Learning using Learned Objectives

no code implementations • ICLR 2020 • Louis Kirsch, Sjoerd van Steenkiste, Jürgen Schmidhuber

Biological evolution has distilled the experiences of many learners into the general learning algorithms of humans.

Meta Reinforcement Learning reinforcement-learning +1

A Perspective on Objects and Systematic Generalization in Model-Based RL

no code implementations • 3 Jun 2019 • Sjoerd van Steenkiste, Klaus Greff, Jürgen Schmidhuber

In order to meet the diverse challenges in solving many real-world problems, an intelligent agent has to be able to dynamically construct a model of its environment.

Systematic Generalization

Are Disentangled Representations Helpful for Abstract Visual Reasoning?

no code implementations • NeurIPS 2019 • Sjoerd van Steenkiste, Francesco Locatello, Jürgen Schmidhuber, Olivier Bachem

A disentangled representation encodes information about the salient factors of variation in the data independently.

Disentanglement Visual Reasoning

A Case for Object Compositionality in Deep Generative Models of Images

no code implementations • ICLR 2019 • Sjoerd van Steenkiste, Karol Kurach, Sylvain Gelly

In this work we propose to structure the generator of a GAN to consider objects and their relations explicitly, and generate images by means of composition.

FVD: A new Metric for Video Generation

no code implementations • ICLR Workshop DeepGenStruct 2019 • Thomas Unterthiner, Sjoerd van Steenkiste, Karol Kurach, Raphaël Marinier, Marcin Michalski, Sylvain Gelly

While recent generative models of video have had some success, current progress is hampered by the lack of qualitative metrics that consider visual quality, temporal coherence, and diversity of samples.

Representation Learning Video Generation

Towards Accurate Generative Models of Video: A New Metric & Challenges

3 code implementations • 3 Dec 2018 • Thomas Unterthiner, Sjoerd van Steenkiste, Karol Kurach, Raphaël Marinier, Marcin Michalski, Sylvain Gelly

To this extent we propose Fréchet Video Distance (FVD), a new metric for generative models of video, and StarCraft 2 Videos (SCV), a benchmark of game play from custom StarCraft 2 scenarios that challenge the current capabilities of generative models of video.

Representation Learning Starcraft +1

Investigating Object Compositionality in Generative Adversarial Networks

no code implementations • ICLR 2019 • Sjoerd van Steenkiste, Karol Kurach, Jürgen Schmidhuber, Sylvain Gelly

We present a minimal modification of a standard generator to incorporate this inductive bias and find that it reliably learns to generate images as compositions of objects.

Image Generation Inductive Bias +5