no code implementations • ICML 2020 • Roberta Raileanu, Max Goldstein, Arthur Szlam, Rob Fergus
An ensemble of conventional RL policies is used to gather experience on training environments, from which embeddings of both policies and environments can be learned.
no code implementations • 15 Mar 2024 • Arthur Douillard, Qixuan Feng, Andrei A. Rusu, Adhiguna Kuncoro, Yani Donchev, Rachita Chhaparia, Ionel Gog, Marc'Aurelio Ranzato, Jiajun Shen, Arthur Szlam
Progress in machine learning (ML) has been fueled by scaling neural network models.
1 code implementation • 17 Jan 2024 • Bo Liu, Rachita Chhaparia, Arthur Douillard, Satyen Kale, Andrei A. Rusu, Jiajun Shen, Arthur Szlam, Marc'Aurelio Ranzato
Local stochastic gradient descent (Local-SGD), also referred to as federated averaging, is an approach to distributed optimization where each device performs more than one SGD update per communication.
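The local-update-then-average scheme can be sketched minimally as follows (a toy quadratic objective in plain NumPy; all names and hyperparameters here are illustrative, not taken from the paper's implementation):

```python
import numpy as np

def local_sgd(w0, shards, lr=0.1, local_steps=5, rounds=10):
    """Toy Local-SGD: each worker runs several SGD steps on its own
    data shard, then the server averages the resulting parameters,
    so communication happens once per round instead of once per step."""
    w = w0.copy()
    for _ in range(rounds):
        local_params = []
        for data in shards:  # one entry per device
            wi = w.copy()
            for _ in range(local_steps):
                # gradient of the mean squared distance to the shard's points
                grad = 2 * (wi - data).mean(axis=0)
                wi -= lr * grad
            local_params.append(wi)
        w = np.mean(local_params, axis=0)  # the single communication per round
    return w

rng = np.random.default_rng(0)
shards = [rng.normal(loc=i, size=(32, 2)) for i in range(4)]
w = local_sgd(np.zeros(2), shards)
```

On this toy objective the averaged iterate converges to the mean of the shard minimizers, which illustrates the trade-off the paper studies: more local steps mean cheaper communication but locally drifted updates.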
no code implementations • 14 Nov 2023 • Arthur Douillard, Qixuan Feng, Andrei A. Rusu, Rachita Chhaparia, Yani Donchev, Adhiguna Kuncoro, Marc'Aurelio Ranzato, Arthur Szlam, Jiajun Shen
In this work, we propose a distributed optimization algorithm, Distributed Low-Communication (DiLoCo), that enables training of language models on islands of devices that are poorly connected.
1 code implementation • 14 Sep 2023 • Jack Lanchantin, Sainbayar Sukhbaatar, Gabriel Synnaeve, Yuxuan Sun, Kavya Srinet, Arthur Szlam
In this work, to further pursue these advances, we introduce a new data generator for machine reasoning that integrates with an embodied agent.
2 code implementations • 18 May 2023 • Shrestha Mohanty, Negar Arabzadeh, Julia Kiseleva, Artem Zholus, Milagro Teruel, Ahmed Awadallah, Yuxuan Sun, Kavya Srinet, Arthur Szlam
Human intelligence's adaptability is remarkable, allowing us to adjust to new tasks and multi-modal environments swiftly.
no code implementations • 26 Apr 2023 • Jimmy Wei, Kurt Shuster, Arthur Szlam, Jason Weston, Jack Urbanek, Mojtaba Komeili
We compare models trained on our new dataset to existing pairwise-trained dialogue models, as well as large language models with few-shot prompting.
no code implementations • 13 Jan 2023 • Alexander Gurung, Mojtaba Komeili, Arthur Szlam, Jason Weston, Jack Urbanek
While language models have become more capable of producing compelling language, we find there are still gaps in maintaining consistency, especially when describing events in a dynamically changing world.
2 code implementations • 12 Nov 2022 • Shrestha Mohanty, Negar Arabzadeh, Milagro Teruel, Yuxuan Sun, Artem Zholus, Alexey Skrynnik, Mikhail Burtsev, Kavya Srinet, Aleksandr Panov, Arthur Szlam, Marc-Alexandre Côté, Julia Kiseleva
Human intelligence can remarkably adapt quickly to new tasks and environments.
2 code implementations • 11 Oct 2022 • Nur Muhammad Mahi Shafiullah, Chris Paxton, Lerrel Pinto, Soumith Chintala, Arthur Szlam
We propose CLIP-Fields, an implicit scene model that can be used for a variety of tasks, such as segmentation, instance identification, semantic search over space, and view localization.
2 code implementations • 5 Aug 2022 • Kurt Shuster, Jing Xu, Mojtaba Komeili, Da Ju, Eric Michael Smith, Stephen Roller, Megan Ung, Moya Chen, Kushal Arora, Joshua Lane, Morteza Behrooz, William Ngan, Spencer Poff, Naman Goyal, Arthur Szlam, Y-Lan Boureau, Melanie Kambadur, Jason Weston
We present BlenderBot 3, a 175B parameter dialogue model capable of open-domain conversation with access to the internet and a long-term memory, and having been trained on a large number of user-defined tasks.
1 code implementation • 27 May 2022 • Julia Kiseleva, Alexey Skrynnik, Artem Zholus, Shrestha Mohanty, Negar Arabzadeh, Marc-Alexandre Côté, Mohammad Aliannejadi, Milagro Teruel, Ziming Li, Mikhail Burtsev, Maartje ter Hoeve, Zoya Volovikova, Aleksandr Panov, Yuxuan Sun, Kavya Srinet, Arthur Szlam, Ahmed Awadallah
Starting from a very young age, humans acquire new skills and learn how to solve new tasks either by imitating the behavior of others or by following provided natural language instructions.
no code implementations • 5 May 2022 • Julia Kiseleva, Ziming Li, Mohammad Aliannejadi, Shrestha Mohanty, Maartje ter Hoeve, Mikhail Burtsev, Alexey Skrynnik, Artem Zholus, Aleksandr Panov, Kavya Srinet, Arthur Szlam, Yuxuan Sun, Marc-Alexandre Côté, Katja Hofmann, Ahmed Awadallah, Linar Abdrazakov, Igor Churin, Putra Manggala, Kata Naszadi, Michiel van der Meer, Taewoon Kim
The primary goal of the competition is to approach the problem of how to build interactive agents that learn to solve a task while provided with grounded natural language instructions in a collaborative environment.
no code implementations • 19 Apr 2022 • Yuxuan Sun, Ethan Carlson, Rebecca Qian, Kavya Srinet, Arthur Szlam
In this work we give a case study of an embodied machine-learning (ML) powered agent that improves itself via interactions with crowd-workers.
1 code implementation • 24 Mar 2022 • Kurt Shuster, Mojtaba Komeili, Leonard Adolphs, Stephen Roller, Arthur Szlam, Jason Weston
We show that, when using SeeKeR as a dialogue model, it outperforms the state-of-the-art model BlenderBot 2 (Chen et al., 2021) on open-domain knowledge-grounded conversations for the same number of parameters, in terms of consistency, knowledge and per-turn engagingness.
no code implementations • 11 Mar 2022 • Tyler L. Hayes, Maximilian Nickel, Christopher Kanan, Ludovic Denoyer, Arthur Szlam
Using this framing, we introduce an active sampling method that asks for examples from the tail of the data distribution and show that it outperforms classical active learning methods on Visual Genome.
no code implementations • Findings (NAACL) 2022 • Kurt Shuster, Jack Urbanek, Arthur Szlam, Jason Weston
State-of-the-art dialogue models still often stumble with regards to factual accuracy and self-contradiction.
no code implementations • 9 Nov 2021 • Leonard Adolphs, Kurt Shuster, Jack Urbanek, Arthur Szlam, Jason Weston
Large language models can produce fluent dialogue but often hallucinate factual inaccuracies.
no code implementations • 13 Oct 2021 • Julia Kiseleva, Ziming Li, Mohammad Aliannejadi, Shrestha Mohanty, Maartje ter Hoeve, Mikhail Burtsev, Alexey Skrynnik, Artem Zholus, Aleksandr Panov, Kavya Srinet, Arthur Szlam, Yuxuan Sun, Katja Hofmann, Michel Galley, Ahmed Awadallah
Starting from a very young age, humans acquire new skills and learn how to solve new tasks either by imitating the behavior of others or by following provided natural language instructions.
no code implementations • ACL 2022 • Jing Xu, Arthur Szlam, Jason Weston
Despite recent improvements in open-domain dialogue models, state-of-the-art models are trained and evaluated on short conversations with little context.
no code implementations • NeurIPS 2021 • Stephen Roller, Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston
We investigate the training of sparse layers that use different parameters for different inputs based on hashing in large Transformer models.
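The core routing idea — a fixed hash of the token id selects which expert's feed-forward parameters process that position, with no learned router — can be sketched like this (toy dimensions and the trivial modulo hash are mine, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
num_experts, d_model, d_ff, vocab = 4, 8, 16, 100

# one feed-forward block per expert
W1 = rng.normal(size=(num_experts, d_model, d_ff)) * 0.1
W2 = rng.normal(size=(num_experts, d_ff, d_model)) * 0.1

def hash_layer(token_ids, x):
    """Route each position to a single expert FFN chosen by hashing its
    token id: routing is parameter-free, deterministic, and stable."""
    experts = token_ids % num_experts  # a trivial stand-in hash
    out = np.empty_like(x)
    for e in range(num_experts):
        mask = experts == e
        h = np.maximum(x[mask] @ W1[e], 0.0)  # ReLU
        out[mask] = h @ W2[e]
    return out

tokens = rng.integers(0, vocab, size=10)
x = rng.normal(size=(10, d_model))
y = hash_layer(tokens, x)
```

Because the hash is fixed, identical token ids always reach the same expert, which sidesteps the load-balancing tricks that learned routers require.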
1 code implementation • 13 May 2021 • Sainbayar Sukhbaatar, Da Ju, Spencer Poff, Stephen Roller, Arthur Szlam, Jason Weston, Angela Fan
We demonstrate that Expire-Span can help models identify and retain critical information and show it can achieve strong performance on reinforcement learning tasks specifically designed to challenge this functionality.
Ranked #4 on Language Modelling on enwik8
1 code implementation • 25 Jan 2021 • Anurag Pratik, Soumith Chintala, Kavya Srinet, Dhiraj Gandhi, Rebecca Qian, Yuxuan Sun, Ryan Drew, Sara Elkafrawy, Anoushka Tiwari, Tucker Hart, Mary Williamson, Abhinav Gupta, Arthur Szlam
In recent years, there have been significant advances in building end-to-end Machine Learning (ML) systems that learn at scale.
1 code implementation • 1 Jan 2021 • Sainbayar Sukhbaatar, Da Ju, Spencer Poff, Stephen Roller, Arthur Szlam, Jason E Weston, Angela Fan
We demonstrate that Expire-Span can help models identify and retain critical information and show it can achieve state-of-the-art results on long-context language modeling, reinforcement learning, and algorithmic tasks.
no code implementations • 30 Dec 2020 • Sabrina J. Mielke, Arthur Szlam, Emily Dinan, Y-Lan Boureau
While improving neural dialogue agents' factual accuracy is the object of much research, another important aspect of communication, less studied in the setting of neural dialogue, is transparency about ignorance.
no code implementations • 17 Dec 2020 • Lajanugen Logeswaran, Ann Lee, Myle Ott, Honglak Lee, Marc'Aurelio Ranzato, Arthur Szlam
In the simplest setting, we append a token to an input sequence which represents the particular task to be undertaken, and show that the embedding of this token can be optimized on the fly given few labeled examples.
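The idea of keeping the base model frozen and optimizing only the appended task token's embedding on a few labeled examples can be sketched with a toy linear model standing in for the network (the quadratic loss, learning rate, and all names are illustrative assumptions, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
w = rng.normal(size=d)  # frozen "base model" weights

def fit_task_token(xs, ys, lr=0.1, steps=2000):
    """Optimize only a task-token embedding z on a few labeled examples,
    leaving the base weights w untouched (toy linear stand-in:
    prediction = x . (w + z), mean-squared-error loss)."""
    z = np.zeros(d)  # the task embedding, learned on the fly
    for _ in range(steps):
        pred = xs @ (w + z)
        grad = 2 * xs.T @ (pred - ys) / len(xs)  # dL/dz
        z -= lr * grad
    return z

xs = rng.normal(size=(16, d))
true_z = rng.normal(size=d)
ys = xs @ (w + true_z)  # labels generated by a hidden task shift
z = fit_task_token(xs, ys)
```

The point of the sketch is that the per-task state is a single small vector, so adapting to a new task touches none of the shared parameters.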
1 code implementation • 6 Oct 2020 • Ramakrishna Vedantam, Arthur Szlam, Maximilian Nickel, Ari Morcos, Brenden Lake
Humans can learn and reason under substantial uncertainty in a space of infinitely many concepts, including structured relational concepts ("a scene with objects that have the same color") and ad-hoc categories defined through goals ("objects that could fall on one's head").
no code implementations • NAACL 2021 • Prithviraj Ammanabrolu, Jack Urbanek, Margaret Li, Arthur Szlam, Tim Rocktäschel, Jason Weston
We seek to create agents that both act and communicate with other agents in pursuit of a goal.
no code implementations • 18 Aug 2020 • Kurt Shuster, Jack Urbanek, Emily Dinan, Arthur Szlam, Jason Weston
As argued in de Vries et al. (2020), crowdsourced data has the issues of lack of naturalness and relevance to real-world use cases, while the static dataset paradigm does not allow for a model to learn from its experiences of using language (Silver et al., 2013).
no code implementations • ACL 2020 • Kavya Srinet, Yacine Jernite, Jonathan Gray, Arthur Szlam
We propose a semantic parsing dataset focused on instruction-driven communication with an agent in the game Minecraft.
no code implementations • 29 Jun 2020 • Kenneth Marino, Rob Fergus, Arthur Szlam, Abhinav Gupta
This paper formulates hypothesis verification as an RL problem.
no code implementations • 22 Jun 2020 • Stephen Roller, Y-Lan Boureau, Jason Weston, Antoine Bordes, Emily Dinan, Angela Fan, David Gunning, Da Ju, Margaret Li, Spencer Poff, Pratik Ringshia, Kurt Shuster, Eric Michael Smith, Arthur Szlam, Jack Urbanek, Mary Williamson
We present our view of what is necessary to build an engaging open-domain conversational agent: covering the qualities of such an agent, the pieces of the puzzle that have been built so far, and the gaping holes we have not filled yet.
1 code implementation • ICLR 2020 • Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'Aurelio Ranzato
In this work, we investigate un-normalized energy-based models (EBMs) which operate not at the token but at the sequence level.
no code implementations • 10 Apr 2020 • Lina Mezghani, Sainbayar Sukhbaatar, Arthur Szlam, Armand Joulin, Piotr Bojanowski
Learning to navigate in a realistic setting where an agent must rely solely on visual inputs is a challenging task, in part because the lack of position information makes it difficult to provide supervision during training.
no code implementations • 6 Apr 2020 • Anton Bakhtin, Yuntian Deng, Sam Gross, Myle Ott, Marc'Aurelio Ranzato, Arthur Szlam
Current large-scale auto-regressive language models display impressive fluency and can generate convincing text.
no code implementations • 7 Feb 2020 • Shrimai Prabhumoye, Margaret Li, Jack Urbanek, Emily Dinan, Douwe Kiela, Jason Weston, Arthur Szlam
Dialogue research tends to distinguish between chit-chat and goal-oriented tasks.
no code implementations • 20 Nov 2019 • Angela Fan, Jack Urbanek, Pratik Ringshia, Emily Dinan, Emma Qian, Siddharth Karamcheti, Shrimai Prabhumoye, Douwe Kiela, Tim Rocktäschel, Arthur Szlam, Jason Weston
We show that the game environments created with our approach are cohesive, diverse, and preferred by human evaluators compared to other machine learning based world construction algorithms.
no code implementations • 25 Sep 2019 • Kenneth Marino, Rob Fergus, Arthur Szlam, Abhinav Gupta
In order to train the agents, we exploit the underlying structure in the majority of hypotheses -- they can be formulated as triplets (pre-condition, action sequence, post-condition).
1 code implementation • 22 Jul 2019 • Arthur Szlam, Jonathan Gray, Kavya Srinet, Yacine Jernite, Armand Joulin, Gabriel Synnaeve, Douwe Kiela, Haonan Yu, Zhuoyuan Chen, Siddharth Goyal, Demi Guo, Danielle Rothermel, C. Lawrence Zitnick, Jason Weston
In this document we describe a rationale for a research program aimed at building an open "assistant" in the game Minecraft, in order to make progress on the problems of natural language understanding and learning from dialogue.
3 code implementations • 19 Jul 2019 • Jonathan Gray, Kavya Srinet, Yacine Jernite, Haonan Yu, Zhuoyuan Chen, Demi Guo, Siddharth Goyal, C. Lawrence Zitnick, Arthur Szlam
This paper describes an implementation of a bot assistant in Minecraft, and the tools and platform allowing players to interact with the bot and to record those interactions.
no code implementations • 7 Jun 2019 • Anton Bakhtin, Sam Gross, Myle Ott, Yuntian Deng, Marc'Aurelio Ranzato, Arthur Szlam
Energy-based models (EBMs), a.k.a.
no code implementations • ICLR 2019 • Kenneth Marino, Abhinav Gupta, Rob Fergus, Arthur Szlam
The high-level policy is trained using a sparse, task-dependent reward, and operates by choosing which of the low-level policies to run at any given time.
no code implementations • 17 Apr 2019 • Yacine Jernite, Kavya Srinet, Jonathan Gray, Arthur Szlam
We propose a large scale semantic parsing dataset focused on instruction-driven communication with an agent in Minecraft.
1 code implementation • IJCNLP 2019 • Jack Urbanek, Angela Fan, Siddharth Karamcheti, Saachi Jain, Samuel Humeau, Emily Dinan, Tim Rocktäschel, Douwe Kiela, Arthur Szlam, Jason Weston
We analyze the ingredients necessary for successful grounding in this setting, and how each of these factors relate to agents that can talk and act successfully.
2 code implementations • 31 Jan 2019 • Emily Dinan, Varvara Logacheva, Valentin Malykh, Alexander Miller, Kurt Shuster, Jack Urbanek, Douwe Kiela, Arthur Szlam, Iulian Serban, Ryan Lowe, Shrimai Prabhumoye, Alan W. Black, Alexander Rudnicky, Jason Williams, Joelle Pineau, Mikhail Burtsev, Jason Weston
We describe the setting and results of the ConvAI2 NeurIPS competition that aims to further the state-of-the-art in open-domain chatbots.
2 code implementations • 22 Nov 2018 • Sainbayar Sukhbaatar, Emily Denton, Arthur Szlam, Rob Fergus
In hierarchical reinforcement learning a major challenge is determining appropriate low-level policies.
no code implementations • ACL 2019 • Sean Welleck, Jason Weston, Arthur Szlam, Kyunghyun Cho
Consistency is a long-standing issue faced by dialogue models.
no code implementations • 27 Sep 2018 • Anton Bakhtin, Arthur Szlam, Marc'Aurelio Ranzato
In this work, we aim at addressing this problem by introducing a new benchmark evaluation suite, dubbed GenEval.
no code implementations • 6 Sep 2018 • David Folqué, Sainbayar Sukhbaatar, Arthur Szlam, Joan Bruna
A desirable property of an intelligent agent is its ability to understand its environment to quickly generalize to novel tasks and compose simpler tasks into more complex ones.
no code implementations • 20 Apr 2018 • Anton Bakhtin, Arthur Szlam, Marc'Aurelio Ranzato, Edouard Grave
It is often the case that the best performing language model is an ensemble of a neural language model with n-grams.
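The standard way to ensemble a neural language model with an n-gram model is linear interpolation of their next-token distributions, with the mixing weight tuned on held-out data. A minimal sketch (the toy distributions and the value of lam are illustrative):

```python
import numpy as np

def interpolate(p_neural, p_ngram, lam=0.5):
    """Ensemble two language models by mixing their next-token
    distributions: p = lam * p_neural + (1 - lam) * p_ngram."""
    p = lam * p_neural + (1.0 - lam) * p_ngram
    return p / p.sum(axis=-1, keepdims=True)  # guard against numeric drift

# toy next-token distributions over a 5-word vocabulary
p_nn = np.array([0.5, 0.2, 0.1, 0.1, 0.1])
p_ng = np.array([0.1, 0.6, 0.1, 0.1, 0.1])
p = interpolate(p_nn, p_ng, lam=0.7)
```

Because both inputs are valid distributions, the mixture is too; the interesting question the paper addresses is why this simple combination so often beats either model alone.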
no code implementations • ICML 2018 • Amy Zhang, Adam Lerer, Sainbayar Sukhbaatar, Rob Fergus, Arthur Szlam
The tasks that an agent will need to solve often are not known during training.
1 code implementation • ICML 2018 • Roberta Raileanu, Emily Denton, Arthur Szlam, Rob Fergus
We consider the multi-agent reinforcement learning setting with imperfect information in which each agent is trying to maximize its own utility.
15 code implementations • ACL 2018 • Saizheng Zhang, Emily Dinan, Jack Urbanek, Arthur Szlam, Douwe Kiela, Jason Weston
Chit-chat models are known to have several problems: they lack specificity, do not display a consistent personality and are often not very captivating.
Ranked #5 on Dialogue Generation on Persona-Chat (using extra training data)
no code implementations • ICLR 2018 • Zhilin Yang, Saizheng Zhang, Jack Urbanek, Will Feng, Alexander H. Miller, Arthur Szlam, Douwe Kiela, Jason Weston
Contrary to most natural language processing research, which makes use of static datasets, humans learn language interactively, grounded in an environment.
6 code implementations • ICML 2018 • Piotr Bojanowski, Armand Joulin, David Lopez-Paz, Arthur Szlam
Generative Adversarial Networks (GANs) have achieved remarkable results in the task of generating realistic natural images.
1 code implementation • CVPR 2018 • Matthijs Douze, Arthur Szlam, Bharath Hariharan, Hervé Jégou
This paper considers the problem of inferring image labels from images when only a few annotated examples are available at training time.
no code implementations • CVPR 2017 • Sam Gross, Marc'Aurelio Ranzato, Arthur Szlam
In this work we show that a simple hard mixture of experts model can be efficiently trained to good effect on large scale hashtag (multilabel) prediction tasks.
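In a hard mixture of experts, each example is assigned to exactly one expert, and only that expert trains on it. A rough sketch of the assignment step, using nearest-cluster-center gating on toy data (the gating rule and all names are my illustrative assumptions, not the paper's exact procedure):

```python
import numpy as np

rng = np.random.default_rng(0)
num_experts, d = 3, 4
centers = rng.normal(size=(num_experts, d))  # e.g. obtained via k-means

def assign(x):
    """Hard gating: each example goes to its nearest expert center,
    so every example is processed (and trained on) by a single expert."""
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return d2.argmin(axis=1)

x = rng.normal(size=(16, d))
groups = assign(x)
# each expert would now take gradient steps only on its own group
batches = [x[groups == e] for e in range(num_experts)]
```

The hard assignment is what makes training efficient at scale: each expert sees only its slice of the data, so the experts can be trained in parallel with no cross-talk.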
3 code implementations • ICLR 2018 • Sainbayar Sukhbaatar, Zeming Lin, Ilya Kostrikov, Gabriel Synnaeve, Arthur Szlam, Rob Fergus
When Bob is deployed on an RL task within the environment, this unsupervised training reduces the number of supervised episodes needed to learn, and in some cases converges to a higher reward.
1 code implementation • 15 Feb 2017 • Sam Wiseman, Sumit Chopra, Marc'Aurelio Ranzato, Arthur Szlam, Ruoyu Sun, Soumith Chintala, Nicolas Vasilache
While Truncated Back-Propagation through Time (BPTT) is the most popular approach to training Recurrent Neural Networks (RNNs), it suffers from being inherently sequential (making parallelization difficult) and from truncating gradient flow between distant time-steps.
no code implementations • 8 Feb 2017 • W. James Murdoch, Arthur Szlam
Although deep learning models have proven effective at solving problems in natural language processing, the mechanism by which they come to their conclusions is often unclear.
no code implementations • 29 Jan 2017 • Joost van Amersfoort, Anitha Kannan, Marc'Aurelio Ranzato, Arthur Szlam, Du Tran, Soumith Chintala
In this work we propose a simple unsupervised approach for next frame prediction in video.
5 code implementations • 12 Dec 2016 • Mikael Henaff, Jason Weston, Arthur Szlam, Antoine Bordes, Yann LeCun
The EntNet sets a new state-of-the-art on the bAbI tasks, and is the first method to solve all the tasks in the 10k training examples setting.
Ranked #5 on Procedural Text Understanding on ProPara
1 code implementation • NeurIPS 2016 • Thomas Laurent, James Von Brecht, Xavier Bresson, Arthur Szlam
We introduce a theoretical and algorithmic framework for multi-way graph partitioning that relies on a multiplicative cut-based objective.
no code implementations • 24 Nov 2016 • Michael M. Bronstein, Joan Bruna, Yann LeCun, Arthur Szlam, Pierre Vandergheynst
In many applications, such geometric data are large and complex (in the case of social networks, on the scale of billions), and are natural targets for machine learning techniques.
9 code implementations • NeurIPS 2016 • Sainbayar Sukhbaatar, Arthur Szlam, Rob Fergus
Many tasks in AI require the collaboration of multiple agents.
1 code implementation • 22 Feb 2016 • Mikael Henaff, Arthur Szlam, Yann LeCun
Although RNNs have been shown to be powerful tools for processing sequential data, finding architectures or optimization strategies that allow them to model very long term dependencies is still an active area of research.
7 code implementations • 7 Dec 2015 • Bolei Zhou, Yuandong Tian, Sainbayar Sukhbaatar, Arthur Szlam, Rob Fergus
We describe a very simple bag-of-words baseline for visual question answering.
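The baseline's recipe is to concatenate a bag-of-words question feature with an image feature and score candidate answers with a single linear softmax layer. A minimal sketch with made-up dimensions and random weights (untrained, for shape illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, img_dim, n_answers = 1000, 512, 10
W = rng.normal(size=(vocab + img_dim, n_answers)) * 0.01

def bow_vqa(question_ids, img_feat):
    """Bag-of-words VQA: sum one-hot word vectors, concatenate with the
    image feature, and score answers with one linear layer + softmax."""
    bow = np.zeros(vocab)
    for t in question_ids:
        bow[t] += 1.0
    feat = np.concatenate([bow, img_feat])
    logits = feat @ W
    e = np.exp(logits - logits.max())
    return e / e.sum()  # distribution over candidate answers

probs = bow_vqa([3, 17, 42], rng.normal(size=img_dim))
```

The whole model is one matrix multiply over a fixed feature, which is exactly what makes it a useful baseline against far more elaborate VQA architectures.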
no code implementations • NeurIPS 2015 • Emily L. Denton, Soumith Chintala, Arthur Szlam, Rob Fergus
In this paper we introduce a generative model capable of producing high quality samples of natural images.
2 code implementations • 23 Nov 2015 • Sainbayar Sukhbaatar, Arthur Szlam, Gabriel Synnaeve, Soumith Chintala, Rob Fergus
This paper introduces MazeBase: an environment for simple 2D games, designed as a sandbox for machine learning approaches to reasoning and planning.
1 code implementation • 21 Nov 2015 • Jesse Dodge, Andreea Gane, Xiang Zhang, Antoine Bordes, Sumit Chopra, Alexander Miller, Arthur Szlam, Jason Weston
A long-term goal of machine learning is to build intelligent conversational agents.
no code implementations • 26 Jun 2015 • Mark Tygert, Arthur Szlam, Soumith Chintala, Marc'Aurelio Ranzato, Yuandong Tian, Wojciech Zaremba
The conventional classification schemes -- notably multinomial logistic regression -- used in conjunction with convolutional networks (convnets) are classical in statistics, designed without consideration for the usual coupling with convnets, stochastic gradient descent, and backpropagation.
1 code implementation • 18 Jun 2015 • Emily Denton, Soumith Chintala, Arthur Szlam, Rob Fergus
In this paper we introduce a generative parametric model capable of producing high quality samples of natural images.
44 code implementations • NeurIPS 2015 • Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston, Rob Fergus
For the former our approach is competitive with Memory Networks, but with less supervision.
Ranked #6 on Question Answering on bAbi
no code implementations • 11 Mar 2015 • Joan Bruna, Soumith Chintala, Yann LeCun, Serkan Piantino, Arthur Szlam, Mark Tygert
Courtesy of the exact correspondence, the remarkably rich and rigorous body of mathematical analysis for wavelets applies directly to (complex-valued) convnets.
1 code implementation • 20 Dec 2014 • Marc'Aurelio Ranzato, Arthur Szlam, Joan Bruna, Michael Mathieu, Ronan Collobert, Sumit Chopra
We propose a strong baseline model for unsupervised feature learning using video data.
no code implementations • 15 Jun 2014 • Xavier Bresson, Huiyi Hu, Thomas Laurent, Arthur Szlam, James Von Brecht
In this work we propose a simple and easily parallelizable algorithm for multiway graph partitioning.
no code implementations • CVPR 2014 • Bryan Poling, Gilad Lerman, Arthur Szlam
Our approach does not require direct modeling of the structure or the motion of the scene, and runs in real time on a single CPU core.
4 code implementations • 21 Dec 2013 • Joan Bruna, Wojciech Zaremba, Arthur Szlam, Yann LeCun
Convolutional Neural Networks are extremely efficient architectures in image and audio recognition tasks, thanks to their ability to exploit the local translational invariance of signal classes over their domain.
no code implementations • 20 Dec 2013 • Yunlong He, Koray Kavukcuoglu, Yun Wang, Arthur Szlam, Yanjun Qi
In this paper, we propose a new unsupervised feature learning framework, namely Deep Sparse Coding (DeepSC), that extends sparse coding to a multi-layer architecture for visual object recognition tasks.
no code implementations • 16 Nov 2013 • Joan Bruna, Arthur Szlam, Yann LeCun
In this work we compute lower Lipschitz bounds of $\ell_p$ pooling operators for $p=1, 2, \infty$ as well as $\ell_p$ pooling operators preceded by half-rectification layers.