no code implementations • 5 Oct 2021 • Geonhwa Jeong, Eric Qin, Ananda Samajdar, Christopher J. Hughes, Sreenivas Subramoney, Hyesoon Kim, Tushar Krishna
As AI-based applications become pervasive, CPU vendors are starting to incorporate matrix engines within the datapath to boost efficiency.
no code implementations • 16 Aug 2021 • Ananda Samajdar, Jan Moritz Joseph, Matthew Denton, Tushar Krishna
We design and train a custom network architecture called AIRCHITECT, which learns the architecture design space with up to 94.3% test accuracy and predicts optimal configurations that achieve, on average (GeoMean), 99.9% of the best possible performance on a test dataset of $10^5$ GEMM workloads.
no code implementations • 12 Jan 2021 • Ananda Samajdar, Michael Pellauer, Tushar Krishna
We demonstrate an instance of SARA with an accelerator we call SAGAR, which introduces a novel reconfigurable systolic array that can be configured to work as a distributed collection of smaller arrays of various sizes or as a single array with flexible aspect ratios.
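A back-of-the-envelope sketch of why such reconfigurability helps: a rigid array wastes PEs whenever an output tile is smaller than the array, while a fabric that can split into smaller sub-arrays runs several tiles concurrently. This is a hypothetical illustration, not SAGAR's actual model; the names `R`, `C`, `M`, `N`, and `tiles` are assumptions introduced here.

```python
# Hypothetical utilization sketch (not SAGAR's model): compare a rigid
# R x C systolic array against one reconfigured into M x N sub-arrays,
# given `tiles` independent M x N output tiles to map.

def rigid_utilization(R, C, M, N):
    # A single M x N tile occupies at most min(M, R) x min(N, C) PEs.
    return (min(M, R) * min(N, C)) / (R * C)

def composable_utilization(R, C, M, N, tiles):
    # The fabric splits into (R // M) * (C // N) sub-arrays, each
    # of which runs one tile concurrently.
    slots = (R // M) * (C // N)
    return min(tiles, slots) * M * N / (R * C)
```

For example, mapping 32x32 tiles onto a 128x128 fabric: the rigid array uses 1/16 of its PEs per tile, while the composable one can keep all PEs busy across 16 concurrent tiles.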
no code implementations • 27 Aug 2020 • Parth Mannan, Ananda Samajdar, Tushar Krishna
The true impact of AI will only be fully realized when AI agents continuously interact with the real world and solve everyday problems.
8 code implementations • 16 Oct 2018 • Ananda Samajdar, Yuhao Zhu, Paul Whatmough, Matthew Mattina, Tushar Krishna
Systolic arrays are one of the most popular compute substrates within deep-learning accelerators today, as they provide extremely high efficiency for running dense matrix multiplications.
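To make the dataflow concrete, here is a minimal cycle-by-cycle toy model of an output-stationary systolic array (a sketch for intuition, not SCALE-Sim's simulator): each PE (i, j) accumulates C[i][j] while operands of A flow left-to-right and operands of B flow top-to-bottom, with the classic diagonal input skew on the array edges.

```python
# Hypothetical sketch of an output-stationary systolic array computing
# C = A @ B, where A is n x k and B is k x n. Not taken from SCALE-Sim.
def systolic_matmul(A, B):
    n, k = len(A), len(A[0])
    C = [[0] * n for _ in range(n)]
    a_reg = [[0] * n for _ in range(n)]   # operand moving rightward
    b_reg = [[0] * n for _ in range(n)]   # operand moving downward
    total_cycles = k + 2 * (n - 1)        # last PE finishes by this cycle
    for t in range(total_cycles):
        # Sweep PEs bottom-right to top-left so each PE reads its
        # neighbors' values from the *previous* cycle.
        for i in reversed(range(n)):
            for j in reversed(range(n)):
                # Edge PEs inject skewed inputs: A[i][m] enters row i at
                # cycle m + i, B[m][j] enters column j at cycle m + j.
                a_in = a_reg[i][j - 1] if j > 0 else (
                    A[i][t - i] if 0 <= t - i < k else 0)
                b_in = b_reg[i - 1][j] if i > 0 else (
                    B[t - j][j] if 0 <= t - j < k else 0)
                C[i][j] += a_in * b_in    # multiply-accumulate in place
                a_reg[i][j], b_reg[i][j] = a_in, b_in
    return C
```

The skew guarantees that A[i][m] and B[m][j] meet at PE (i, j) on cycle m + i + j, so after k + 2(n - 1) cycles every PE holds its completed dot product.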
Distributed, Parallel, and Cluster Computing • Hardware Architecture
no code implementations • 3 Aug 2018 • Ananda Samajdar, Parth Mannan, Kartikay Garg, Tushar Krishna
EvE can evolve the topology and weights of neural networks completely in hardware for the task at hand, without requiring hand-optimization or backpropagation training.
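Evolving weights without backpropagation can be sketched with a minimal (1+1) evolutionary loop: mutate the current weight vector, keep the child only if it is fitter. This is a hypothetical software illustration of the general idea, not EvE's hardware algorithm, and it fixes the network topology (EvE additionally evolves topology, NEAT-style); the 2-2-1 tanh network and all names here are assumptions.

```python
import math
import random

def fitness(w, data):
    """Negative mean-squared error of a tiny 2-2-1 tanh network
    with flat weight vector w (6 hidden + 3 output parameters)."""
    err = 0.0
    for (x0, x1), y in data:
        h0 = math.tanh(w[0] * x0 + w[1] * x1 + w[2])
        h1 = math.tanh(w[3] * x0 + w[4] * x1 + w[5])
        out = math.tanh(w[6] * h0 + w[7] * h1 + w[8])
        err += (out - y) ** 2
    return -err

def evolve(data, generations=2000, sigma=0.2, seed=0):
    """(1+1) evolution: Gaussian mutation plus greedy selection."""
    rng = random.Random(seed)
    best = [rng.uniform(-1, 1) for _ in range(9)]
    best_f = fitness(best, data)
    for _ in range(generations):
        child = [w + rng.gauss(0, sigma) for w in best]  # mutate
        f = fitness(child, data)
        if f > best_f:                                   # select
            best, best_f = child, f
    return best, best_f
```

Because selection only ever keeps improvements, fitness is monotonically non-decreasing over generations, which is the property that lets such loops run entirely in hardware without gradients.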