1 code implementation • 4 Mar 2021 • Fu Wang, Yanghao Zhang, Yanbin Zheng, Wenjie Ruan
Therefore, based on the magnitude of the gradient, we propose a general acceleration strategy, M+ acceleration, which adjusts the training procedure automatically and highly effectively.