3 code implementations • 8 Jan 2024 • Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux, Arthur Mensch, Blanche Savary, Chris Bamford, Devendra Singh Chaplot, Diego de Las Casas, Emma Bou Hanna, Florian Bressand, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Sandeep Subramanian, Sophia Yang, Szymon Antoniak, Teven Le Scao, Théophile Gervet, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed
In particular, Mixtral vastly outperforms Llama 2 70B on mathematics, code generation, and multilingual benchmarks.
Ranked #9 on Question Answering on PIQA
5 code implementations • 10 Oct 2023 • Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de Las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, Lélio Renard Lavaud, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed
We introduce Mistral 7B v0.1, a 7-billion-parameter language model engineered for superior performance and efficiency.
Ranked #4 on Zero-Shot Video Question Answer on NExT-GQA
44 code implementations • arXiv 2023 • Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample
We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters.
Ranked #3 on Question Answering on OBQA
no code implementations • 22 Feb 2023 • Pierre-Alexandre Kamienny, Guillaume Lample, Sylvain Lamprier, Marco Virgolin
Symbolic regression (SR) is the problem of learning a symbolic expression from numerical data.
3 code implementations • 21 Oct 2022 • Albert Q. Jiang, Sean Welleck, Jin Peng Zhou, Wenda Li, Jiacheng Liu, Mateja Jamnik, Timothée Lacroix, Yuhuai Wu, Guillaume Lample
In this work, we introduce Draft, Sketch, and Prove (DSP), a method that maps informal proofs to formal proof sketches, and uses the sketches to guide an automated prover by directing its search to easier sub-problems.
Ranked #3 on Automated Theorem Proving on miniF2F-valid (Pass@100 metric)
no code implementations • 23 May 2022 • Guillaume Lample, Marie-Anne Lachaux, Thibaut Lavril, Xavier Martinet, Amaury Hayat, Gabriel Ebner, Aurélien Rodriguez, Timothée Lacroix
With a similar computational budget, we improve the state of the art on the Lean-based miniF2F-curriculum dataset from 31% to 42% proving accuracy.
Ranked #1 on Automated Theorem Proving on Metamath set.mm (Pass@32 metric)
3 code implementations • 22 Apr 2022 • Pierre-Alexandre Kamienny, Stéphane d'Ascoli, Guillaume Lample, François Charton
Symbolic regression, the task of predicting the mathematical expression of a function from the observation of its values, is a difficult task which usually involves a two-step procedure: predicting the "skeleton" of the expression up to the choice of numerical constants, then fitting the constants by optimizing a non-convex loss function.
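The second step of that two-step procedure, fitting the constants of a predicted skeleton, can be illustrated with a minimal self-contained sketch. The skeleton `f(x) = c0*sin(c1*x) + c2` and all names here are hypothetical, and the grid-plus-least-squares fit is a crude stand-in for the non-convex optimizers (e.g. BFGS) typically used:

```python
import math

# Hypothetical skeleton, as if predicted in step one: f(x) = c0*sin(c1*x) + c2.
def skeleton(x, c0, c1, c2):
    return c0 * math.sin(c1 * x) + c2

def fit_constants(xs, ys):
    """Step two: fit the skeleton's numerical constants.
    c0 and c2 enter linearly, so for each candidate c1 on a grid we
    solve them in closed form via the least-squares normal equations
    and keep the best overall fit."""
    n = len(xs)
    best = (float("inf"), (0.0, 0.0, 0.0))
    for k in range(1, 3001):            # c1 in (0, 3], step 0.001
        c1 = k / 1000.0
        s = [math.sin(c1 * x) for x in xs]
        ss = sum(v * v for v in s)
        s1 = sum(s)
        sy = sum(v * y for v, y in zip(s, ys))
        y1 = sum(ys)
        det = n * ss - s1 * s1          # normal equations for ys ~ c0*s + c2
        if abs(det) < 1e-12:
            continue
        c0 = (n * sy - s1 * y1) / det
        c2 = (ss * y1 - s1 * sy) / det
        loss = sum((c0 * v + c2 - y) ** 2 for v, y in zip(s, ys)) / n
        if loss < best[0]:
            best = (loss, (c0, c1, c2))
    return best

xs = [i / 10 for i in range(50)]
ys = [skeleton(x, 2.0, 1.5, -0.5) for x in xs]   # data from known constants
loss, (c0, c1, c2) = fit_constants(xs, ys)
```

The non-convexity lives entirely in `c1`; exploiting the linearity of the remaining constants is what keeps this toy fit reliable.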
no code implementations • 12 Jan 2022 • Stéphane d'Ascoli, Pierre-Alexandre Kamienny, Guillaume Lample, François Charton
Symbolic regression, i.e. predicting a function from the observation of its values, is well known to be a challenging task.
1 code implementation • ICLR 2022 • Baptiste Roziere, Jie M. Zhang, Francois Charton, Mark Harman, Gabriel Synnaeve, Guillaume Lample
With little to no parallel data available for programming languages, unsupervised methods are well-suited to source code translation.
2 code implementations • NeurIPS 2021 • Baptiste Roziere, Marie-Anne Lachaux, Marc Szafraniec, Guillaume Lample
Recent advances in self-supervised learning have dramatically improved the state of the art on a wide variety of tasks.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Marie-Anne Lachaux, Armand Joulin, Guillaume Lample
In this paper, we propose to explicitly model this one-to-many mapping by conditioning the decoder of a NMT model on a latent variable that represents the domain of target sentences.
1 code implementation • ICLR 2021 • François Charton, Amaury Hayat, Guillaume Lample
Using transformers over large generated datasets, we train models to learn mathematical properties of differential systems, such as local stability, behavior at infinity and controllability.
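For intuition, the ground-truth label for one of these properties can be computed directly: local stability of an equilibrium follows from the Jacobian's eigenvalues there. A minimal sketch (the example systems are illustrative, not taken from the paper):

```python
import math

def jacobian_2x2(f, x, y, h=1e-6):
    """Central-difference Jacobian of f(x, y) -> (dx/dt, dy/dt)."""
    fxp, fxm = f(x + h, y), f(x - h, y)
    fyp, fym = f(x, y + h), f(x, y - h)
    return [[(fxp[0] - fxm[0]) / (2 * h), (fyp[0] - fym[0]) / (2 * h)],
            [(fxp[1] - fxm[1]) / (2 * h), (fyp[1] - fym[1]) / (2 * h)]]

def locally_stable(J):
    """All eigenvalues of a 2x2 matrix have negative real part
    iff trace < 0 and det > 0 (Routh-Hurwitz criterion)."""
    trace = J[0][0] + J[1][1]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    return trace < 0 and det > 0

# Damped pendulum: x' = y, y' = -sin(x) - 0.5*y; the origin is stable.
def pendulum(x, y):
    return (y, -math.sin(x) - 0.5 * y)

# Saddle: x' = x, y' = -y; the origin is unstable.
def saddle(x, y):
    return (x, -y)
```

The trained models are asked to produce this stable/unstable decision directly from the system's symbolic form, without computing the Jacobian.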
9 code implementations • NeurIPS 2020 • Marie-Anne Lachaux, Baptiste Roziere, Lowik Chanussot, Guillaume Lample
We train our model on source code from open source GitHub projects, and show that it can translate functions between C++, Java, and Python with high accuracy.
7 code implementations • ICLR 2020 • Guillaume Lample, François Charton
Neural networks have a reputation for being better at solving statistical or approximate problems than at performing calculations or working with symbolic data.
1 code implementation • IJCNLP 2019 • Francisco Guzmán, Peng-Jen Chen, Myle Ott, Juan Pino, Guillaume Lample, Philipp Koehn, Vishrav Chaudhary, Marc'Aurelio Ranzato
For machine translation, a vast majority of language pairs in the world are considered low-resource because they have little parallel data available.
8 code implementations • NeurIPS 2019 • Guillaume Lample, Alexandre Sablayrolles, Marc'Aurelio Ranzato, Ludovic Denoyer, Hervé Jégou
In our experiments we consider a dataset with up to 30 billion words, and we plug our memory layer in a state-of-the-art transformer-based architecture.
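The product-key trick that makes such large memories tractable can be sketched in a few lines (toy dimensions; all names are illustrative, and the real memory layers use learned sub-keys, multiple heads, and sparse value updates):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    m = max(xs)
    es = [math.exp(v - m) for v in xs]
    s = sum(es)
    return [e / s for e in es]

def product_key_lookup(q, subkeys1, subkeys2, values, k=2):
    """Product-key memory lookup: split the query in half, score each
    half against a small set of sub-keys, and read the top-k full keys
    from the Cartesian product of the per-half top-k lists. This
    searches |K1| * |K2| memory slots while scoring only |K1| + |K2|
    sub-keys."""
    h = len(q) // 2
    q1, q2 = q[:h], q[h:]
    top1 = sorted(range(len(subkeys1)), key=lambda i: -dot(q1, subkeys1[i]))[:k]
    top2 = sorted(range(len(subkeys2)), key=lambda j: -dot(q2, subkeys2[j]))[:k]
    cands = sorted(((dot(q1, subkeys1[i]) + dot(q2, subkeys2[j]), i, j)
                    for i in top1 for j in top2), reverse=True)[:k]
    weights = softmax([s for s, _, _ in cands])
    dim = len(values[0][0])
    out = [0.0] * dim
    for w, (_, i, j) in zip(weights, cands):
        for d in range(dim):
            out[d] += w * values[i][j][d]
    return out

# Two sub-keys per half index a 2x2 grid of value slots.
subkeys1 = [[1.0, 0.0], [0.0, 1.0]]
subkeys2 = [[1.0, 0.0], [0.0, 1.0]]
values = [[[0.0, 0.0], [0.0, 1.0]],    # values[i][j], one per (i, j) slot
          [[1.0, 0.0], [1.0, 1.0]]]
out = product_key_lookup([1.0, 0.0, 0.0, 1.0], subkeys1, subkeys2, values)
```

The per-half top-k is guaranteed to contain the overall top-k full keys, so the exact nearest slots are found at sub-key cost.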
2 code implementations • 2 Jul 2019 • Sainbayar Sukhbaatar, Edouard Grave, Guillaume Lample, Herve Jegou, Armand Joulin
More precisely, we augment the self-attention layers with persistent memory vectors that play a similar role as the feed-forward layer.
Ranked #5 on Language Modelling on Text8
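A toy, single-query version of that modification (illustrative only; the paper uses multi-head attention over full sequences with learned persistent vectors):

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(v - m) for v in xs]
    s = sum(es)
    return [e / s for e in es]

def attend_with_persistent_memory(q, ctx_k, ctx_v, mem_k, mem_v):
    """Single-query scaled dot-product attention in which 'persistent'
    key/value vectors, independent of the input, are concatenated to
    the context keys and values, so the attention sublayer can also
    play the role of the feed-forward layer."""
    keys, vals = ctx_k + mem_k, ctx_v + mem_v
    d = math.sqrt(len(q))
    w = softmax([sum(a * b for a, b in zip(q, k)) / d for k in keys])
    return [sum(wi * v[j] for wi, v in zip(w, vals))
            for j in range(len(vals[0]))]

# One context token plus two persistent slots, 2-d toy vectors.
out = attend_with_persistent_memory(
    q=[1.0, 0.0],
    ctx_k=[[1.0, 0.0]], ctx_v=[[1.0, 0.0]],
    mem_k=[[0.0, 1.0], [-1.0, 0.0]],
    mem_v=[[0.0, 1.0], [0.0, -1.0]],
)
```

Because the persistent slots compete with context tokens inside one softmax, no separate feed-forward sublayer is needed.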
no code implementations • ICLR 2019 • Guillaume Lample, Sandeep Subramanian, Eric Smith, Ludovic Denoyer, Marc'Aurelio Ranzato, Y-Lan Boureau
The dominant approach to unsupervised "style transfer" in text is based on the idea of learning a latent representation, which is independent of the attributes specifying its "style".
2 code implementations • 4 Feb 2019 • Francisco Guzmán, Peng-Jen Chen, Myle Ott, Juan Pino, Guillaume Lample, Philipp Koehn, Vishrav Chaudhary, Marc'Aurelio Ranzato
For machine translation, a vast majority of language pairs in the world are considered low-resource because they have little parallel data available.
16 code implementations • NeurIPS 2019 • Guillaume Lample, Alexis Conneau
On unsupervised machine translation, we obtain 34.3 BLEU on WMT'16 German-English, improving the previous state of the art by more than 9 BLEU.
3 code implementations • 1 Nov 2018 • Sandeep Subramanian, Guillaume Lample, Eric Michael Smith, Ludovic Denoyer, Marc'Aurelio Ranzato, Y-Lan Boureau
The dominant approach to unsupervised "style transfer" in text is based on the idea of learning a latent representation, which is independent of the attributes specifying its "style".
no code implementations • EMNLP 2018 • Guillaume Lample, Myle Ott, Alexis Conneau, Ludovic Denoyer, Marc'Aurelio Ranzato
Machine translation systems achieve near human-level performance on some languages, yet their effectiveness strongly relies on the availability of large amounts of parallel sentences, which hinders their applicability to the majority of language pairs.
10 code implementations • EMNLP 2018 • Alexis Conneau, Guillaume Lample, Ruty Rinott, Adina Williams, Samuel R. Bowman, Holger Schwenk, Veselin Stoyanov
State-of-the-art natural language processing systems rely on supervision in the form of annotated data to learn competent models.
Ranked #5 on Natural Language Inference on XNLI French
Tasks: Cross-Lingual Natural Language Inference, Machine Translation, +2
no code implementations • ACL 2018 • Alexis Conneau, German Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni
Although much effort has recently been devoted to training high-quality sentence embeddings, we still have a poor understanding of what they are capturing.
6 code implementations • 3 May 2018 • Alexis Conneau, German Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni
Although much effort has recently been devoted to training high-quality sentence embeddings, we still have a poor understanding of what they are capturing.
15 code implementations • EMNLP 2018 • Guillaume Lample, Myle Ott, Alexis Conneau, Ludovic Denoyer, Marc'Aurelio Ranzato
Machine translation systems achieve near human-level performance on some languages, yet their effectiveness strongly relies on the availability of large amounts of parallel sentences, which hinders their applicability to the majority of language pairs.
Ranked #2 on Machine Translation on WMT2016 English-Russian
no code implementations • NeurIPS 2017 • Guillaume Lample, Neil Zeghidour, Nicolas Usunier, Antoine Bordes, Ludovic Denoyer, Marc'Aurelio Ranzato
This paper introduces a new encoder-decoder architecture that is trained to reconstruct images by disentangling the salient information of the image and the values of attributes directly in the latent space.
15 code implementations • ICLR 2018 • Guillaume Lample, Alexis Conneau, Ludovic Denoyer, Marc'Aurelio Ranzato
By learning to reconstruct in both languages from this shared feature space, the model effectively learns to translate without using any labeled data.
Ranked #7 on Machine Translation on WMT2016 German-English
19 code implementations • ICLR 2018 • Alexis Conneau, Guillaume Lample, Marc'Aurelio Ranzato, Ludovic Denoyer, Hervé Jégou
We finally describe experiments on the English-Esperanto low-resource language pair, on which there only exists a limited amount of parallel data, to show the potential impact of our method in fully unsupervised machine translation.
Ranked #2 on Word Alignment on en-es
3 code implementations • 1 Jun 2017 • Guillaume Lample, Neil Zeghidour, Nicolas Usunier, Antoine Bordes, Ludovic Denoyer, Marc'Aurelio Ranzato
This paper introduces a new encoder-decoder architecture that is trained to reconstruct images by disentangling the salient information of the image and the values of attributes directly in the latent space.
8 code implementations • 18 Sep 2016 • Guillaume Lample, Devendra Singh Chaplot
Advances in deep reinforcement learning have allowed autonomous agents to perform well on Atari games, often outperforming humans, using only raw pixels to make their decisions.
no code implementations • NAACL 2016 • Yulia Tsvetkov, Sunayana Sitaram, Manaal Faruqui, Guillaume Lample, Patrick Littell, David Mortensen, Alan W. Black, Lori Levin, Chris Dyer
We introduce polyglot language models, recurrent neural network models trained to predict symbol sequences in many different languages using shared representations of symbols and conditioning on typological information about the language to be predicted.
43 code implementations • NAACL 2016 • Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian, Kazuya Kawakami, Chris Dyer
State-of-the-art named entity recognition systems rely heavily on hand-crafted features and domain-specific knowledge in order to learn effectively from the small, supervised training corpora that are available.
Ranked #8 on Named Entity Recognition (NER) on CoNLL++
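The decoding half of such a tagger, CRF inference over the neural emission scores, reduces to Viterbi search. A minimal sketch with made-up scores and tag names:

```python
def viterbi(emissions, transitions, tags):
    """Viterbi decoding for a linear-chain CRF (the output layer of a
    BiLSTM-CRF tagger): returns the highest-scoring tag sequence.
    emissions[t][tag] are per-token scores from the network;
    transitions[prev][tag] are learned tag-transition scores."""
    score = {tag: emissions[0][tag] for tag in tags}
    back = []
    for t in range(1, len(emissions)):
        new_score, ptr = {}, {}
        for tag in tags:
            prev = max(tags, key=lambda p: score[p] + transitions[p][tag])
            ptr[tag] = prev
            new_score[tag] = (score[prev] + transitions[prev][tag]
                              + emissions[t][tag])
        back.append(ptr)
        score = new_score
    last = max(tags, key=lambda tg: score[tg])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))

# Toy 3-token sentence, two tags ("O" = outside, "E" = entity).
tags = ["O", "E"]
emissions = [{"O": 2.0, "E": 0.0},
             {"O": 0.0, "E": 3.0},
             {"O": 2.0, "E": 0.0}]
transitions = {"O": {"O": 0.0, "E": 0.0},
               "E": {"O": 0.0, "E": 0.0}}
path = viterbi(emissions, transitions, tags)
```

The transition scores are where tagging-scheme constraints (e.g. "I cannot follow O" in IOB) are learned rather than hand-coded.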
1 code implementation • 5 Feb 2016 • Waleed Ammar, George Mulcaire, Yulia Tsvetkov, Guillaume Lample, Chris Dyer, Noah A. Smith
We introduce new methods for estimating and evaluating embeddings of words in more than fifty languages in a single shared embedding space.