1 code implementation • 16 Feb 2023 • Hadar Sivan, Moshe Gabel, Assaf Schuster
Popular machine learning approaches forgo second-order information due to the difficulty of computing curvature in high dimensions.
no code implementations • 15 Jul 2022 • James Gleeson, Daniel Snider, Yvonne Yang, Moshe Gabel, Eyal de Lara, Gennady Pekhimenko
We show that simulator kernel fusion speedups with a simple simulator are $11.3\times$ and increase by up to $1024\times$ as simulator complexity increases in terms of memory bandwidth requirements.
1 code implementation • 8 Feb 2021 • James Gleeson, Srivatsan Krishnan, Moshe Gabel, Vijay Janapa Reddi, Eyal de Lara, Gennady Pekhimenko
Deep reinforcement learning (RL) has made groundbreaking advancements in robotics, data center management and other applications.
no code implementations • ICML 2020 • Gal Yehuda, Moshe Gabel, Assaf Schuster
Can deep neural networks learn to solve any task, and in particular problems of high complexity?
no code implementations • 26 Jul 2019 • Ido Hakimi, Saar Barkai, Moshe Gabel, Assaf Schuster
We propose DANA: a novel technique for asynchronous distributed SGD with momentum that mitigates gradient staleness by computing the gradient on an estimated future position of the model's parameters.
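The core idea of computing the gradient at an estimated future position of the parameters can be sketched as follows. This is a minimal illustration under simplified assumptions (plain SGD with heavy-ball momentum and a known staleness `delay`), not the paper's actual implementation; the names `dana_gradient` and `grad_fn` are hypothetical:

```python
import numpy as np

def dana_gradient(params, momentum, grad_fn, delay, lr, momentum_coef=0.9):
    """Sketch of the DANA idea: extrapolate the parameters forward by the
    momentum the master is expected to apply during the worker's `delay`
    steps, then evaluate the gradient at that estimated future position.
    This mitigates the staleness of the gradient when it finally arrives."""
    future = params.copy()
    v = momentum.copy()
    for _ in range(delay):
        # Assume the master keeps applying (decayed) momentum while this
        # worker computes; each step moves the parameters by -lr * v.
        v = momentum_coef * v
        future = future - lr * v
    return grad_fn(future)
```

For example, with `delay=0` the worker simply evaluates the gradient at the current parameters; with a positive `delay` it evaluates it at the extrapolated position, so the returned gradient better matches the model state at the time the update is applied.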
no code implementations • ICLR 2019 • Ido Hakimi, Saar Barkai, Moshe Gabel, Assaf Schuster
We propose DANA, a novel approach that scales out of the box to large clusters using the same hyperparameters and learning schedule optimized for training on a single worker, while maintaining similar final accuracy without additional overhead.