Search Results for author: Keiran Paster

Found 7 papers, 4 papers with code

Llemma: An Open Language Model For Mathematics

4 code implementations • 16 Oct 2023 • Zhangir Azerbayev, Hailey Schoelkopf, Keiran Paster, Marco Dos Santos, Stephen McAleer, Albert Q. Jiang, Jia Deng, Stella Biderman, Sean Welleck

We present Llemma, a large language model for mathematics.

Ranked #6 on Automated Theorem Proving on miniF2F-test

Arithmetic Reasoning • Automated Theorem Proving • +3

6,624

OpenWebMath: An Open Dataset of High-Quality Mathematical Web Text

2 code implementations • 10 Oct 2023 • Keiran Paster, Marco Dos Santos, Zhangir Azerbayev, Jimmy Ba

We hope that our dataset, openly released on the Hugging Face Hub, will help spur advances in the reasoning abilities of large language models.

604

STEVE-1: A Generative Model for Text-to-Behavior in Minecraft

no code implementations • NeurIPS 2023 • Shalev Lifshitz, Keiran Paster, Harris Chan, Jimmy Ba, Sheila McIlraith

Constructing AI models that respond to text instructions is challenging, especially for sequential decision-making tasks.

Decision Making • Image Generation • +1

Large Language Models Are Human-Level Prompt Engineers

2 code implementations • 3 Nov 2022 • Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis, Harris Chan, Jimmy Ba

By conditioning on natural language instructions, large language models (LLMs) have displayed impressive capabilities as general-purpose computers.

Few-Shot Learning • In-Context Learning • +3

1,000

You Can't Count on Luck: Why Decision Transformers and RvS Fail in Stochastic Environments

no code implementations • 31 May 2022 • Keiran Paster, Sheila McIlraith, Jimmy Ba

In all tested domains, ESPER achieves significantly better alignment between the target return and achieved return than simply conditioning on returns.

Offline RL • Playing the Game of 2048

Learning Domain Invariant Representations in Goal-conditioned Block MDPs

1 code implementation • NeurIPS 2021 • Beining Han, Chongyi Zheng, Harris Chan, Keiran Paster, Michael R. Zhang, Jimmy Ba

These changes are often spurious and unrelated to the underlying problem, such as background shifts for visual input agents.

Domain Generalization • Reinforcement Learning (RL)

Planning from Pixels using Inverse Dynamics Models

no code implementations • ICLR 2021 • Keiran Paster, Sheila A. McIlraith, Jimmy Ba

Learning task-agnostic dynamics models in high-dimensional observation spaces can be challenging for model-based RL agents.
