Search Results for author: Annarose M B

Found 1 paper, 1 paper with code

Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap

1 code implementation • 29 Feb 2024 • Saurabh Srivastava, Annarose M B, Anto P V, Shashank Menon, Ajay Sukumar, Adwaith Samod T, Alan Philipose, Stevin Prince, Sooraj Thomas

Models that solve a reasoning test should exhibit no difference in performance over the static version of a problem compared to a snapshot of the functional variant.

Math
