1 code implementation • 5 Sep 2023 • Gongyue Zhang, Dinghuang Zhang, Shuwen Zhao, Donghan Liu, Carrie M. Toptan, Honghai Liu
It not only can accelerates slow-changing parameters for sparse gradients, similar to adaptive optimizers, but also can choose to accelerates frequently-changing parameters for non-sparse gradients, thus being adaptable to all types of datasets.