Search Results for author: Kaya Stechly

Found 4 papers, 0 papers with code

Chain of Thoughtlessness: An Analysis of CoT in Planning

no code implementations · 8 May 2024 · Kaya Stechly, Karthik Valmeekam, Subbarao Kambhampati

Large language model (LLM) performance on reasoning problems typically does not generalize out of distribution.

Language Modelling · Large Language Model

On the Self-Verification Limitations of Large Language Models on Reasoning and Planning Tasks

no code implementations · 12 Feb 2024 · Kaya Stechly, Karthik Valmeekam, Subbarao Kambhampati

While the initial optimism that reasoning might emerge automatically with scale has been tempered by a slew of counterexamples, ranging from multiplication to simple planning, there persists a widespread belief that LLMs can self-critique and iteratively improve their own solutions.

LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks

no code implementations · 2 Feb 2024 · Subbarao Kambhampati, Karthik Valmeekam, Lin Guan, Kaya Stechly, Mudit Verma, Siddhant Bhambri, Lucas Saldyt, Anil Murthy

On the other side are perhaps over-pessimistic claims that all LLMs are good for in planning/reasoning tasks is to serve as mere translators of the problem specification from one syntactic format to another, shipping the problem off to external symbolic solvers.

GPT-4 Doesn't Know It's Wrong: An Analysis of Iterative Prompting for Reasoning Problems

no code implementations · 19 Oct 2023 · Kaya Stechly, Matthew Marquez, Subbarao Kambhampati

The study seems to indicate that (i) LLMs are bad at solving graph coloring instances; (ii) they are no better at verifying a solution, and thus are not effective in iterative modes where LLMs critique LLM-generated solutions; and (iii) the correctness and content of the criticisms, whether from LLMs or external solvers, seem largely irrelevant to the performance of iterative prompting.

Scheduling
