no code implementations • 11 Sep 2023 • Ruocheng Wang, Eric Zelikman, Gabriel Poesia, Yewen Pu, Nick Haber, Noah D. Goodman
Because of the prohibitive cost of generation with state-of-the-art LLMs, we introduce an intermediate step to filter the set of hypotheses that will be implemented into programs: we either ask the LLM to summarize them into a smaller set of hypotheses, or ask human annotators to select a subset of the hypotheses.
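The filtering step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `summarize` is a hypothetical stand-in for an LLM summarization call (here it just deduplicates by normalized text), and the human-annotator path is modeled as a list of selected indices.

```python
# Sketch of the hypothesis-filtering step: shrink a large hypothesis set
# before the expensive stage of implementing each hypothesis as a program.

def summarize(hypotheses, k):
    """Placeholder for an LLM summarization call: keep up to k
    representative hypotheses, dropping near-duplicates."""
    seen, kept = set(), []
    for h in hypotheses:
        key = h.lower().rstrip(".")
        if key not in seen:
            seen.add(key)
            kept.append(h)
        if len(kept) == k:
            break
    return kept

def filter_hypotheses(hypotheses, k, human_select=None):
    """Either let human annotators pick a subset (by index),
    or fall back to LLM-style summarization."""
    if human_select is not None:
        return [hypotheses[i] for i in human_select]
    return summarize(hypotheses, k)
```

In a real pipeline, `summarize` would prompt the LLM to merge redundant hypotheses rather than string-match them; the control flow, however, matches the two options the abstract describes.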
no code implementations • 6 Jun 2023 • Gabriel Poesia, Kanishk Gandhi, Eric Zelikman, Noah D. Goodman
In experiments on the PrOntoQA, ProofWriter and Syllogism Validity datasets, LogicGuide significantly improves the performance of GPT-3, GPT-3.5 Turbo and LLaMA (accuracy gains of up to 35%), while drastically reducing content effects -- the interference between unwanted prior assumptions and reasoning, from which both humans and language models suffer.
1 code implementation • 16 Apr 2023 • Joy He-Yueya, Gabriel Poesia, Rose E. Wang, Noah D. Goodman
Automatically generating high-quality step-by-step solutions to math word problems has many applications in education.
1 code implementation • 20 Dec 2022 • Eric Zelikman, Qian Huang, Gabriel Poesia, Noah D. Goodman, Nick Haber
Despite recent success in large language model (LLM) reasoning, LLMs struggle with hierarchical multi-step reasoning tasks like generating complex programs.
Ranked #8 on Code Generation on HumanEval
1 code implementation • 29 Nov 2022 • Gabriel Poesia, Noah D. Goodman
We explore this idea in a case study on 5 sections of beginning algebra on the Khan Academy platform.
1 code implementation • 16 Nov 2022 • Zhening Li, Gabriel Poesia, Omar Costilla-Reyes, Noah Goodman, Armando Solar-Lezama
Humans tame the complexity of mathematical reasoning by developing hierarchies of abstractions.
2 code implementations • ICLR 2022 • Gabriel Poesia, Oleksandr Polozov, Vu Le, Ashish Tiwari, Gustavo Soares, Christopher Meek, Sumit Gulwani
Then, Synchromesh feeds the examples to a pre-trained language model and samples programs using Constrained Semantic Decoding (CSD): a general framework for constraining the output to a set of valid programs in the target language.
no code implementations • EMNLP 2021 • Julia White, Gabriel Poesia, Robert Hawkins, Dorsa Sadigh, Noah Goodman
An overarching goal of natural language processing is to enable machines to communicate seamlessly with humans.
1 code implementation • NeurIPS 2021 • Gabriel Poesia, WenXin Dong, Noah Goodman
Our results suggest new directions for reinforcement learning in symbolic domains, as well as applications to mathematics education.