no code implementations • 22 Feb 2024 • Nikhil Prakash, Tamar Rott Shaham, Tal Haklay, Yonatan Belinkov, David Bau
We identify the mechanism that enables entity tracking and show that, in both the original model and its fine-tuned versions, primarily the same circuit implements entity tracking.
no code implementations • 19 Oct 2023 • Junwoo Chang, Hyunwoo Ryu, Jiwoo Kim, Soochul Yoo, Jongeun Choi, Joohwan Seo, Nikhil Prakash, Roberto Horowitz
Diffusion models have risen as a powerful tool in robotics due to their flexibility and multi-modality.
no code implementations • 7 Jul 2023 • Xander Davies, Max Nadeau, Nikhil Prakash, Tamar Rott Shaham, David Bau
Recent work has shown that computation in language models may be human-understandable, with successful efforts to localize and intervene on both single-unit features and input-output circuits.
no code implementations • 11 Dec 2020 • Nikhil Prakash, Kory W. Mathewson
As artificial intelligence (AI) systems become ubiquitous in our society, issues related to their fairness, accountability, and transparency are increasing rapidly.