1 code implementation • 14 Feb 2024 • Jack Miller, Patrick Gleeson, Charles O'Neill, Thang Bui, Noam Levi
Neural networks sometimes exhibit grokking, a phenomenon where perfect or near-perfect performance is achieved on a validation set well after the same performance has been obtained on the corresponding training set.