no code implementations • 6 May 2024 • Jinying Xiao, Ping Li, Jie Nie
TED uses an optimization objective based on Internal Generalization Distance (IGD), which measures the change in internal generalization (IG) before and after pruning, to align with true generalization performance and achieve implicit regularization.
no code implementations • 19 Mar 2024 • Jinying Xiao, Ping Li, Zhe Tang, Jie Nie
Pruning before training enables the deployment of neural networks on smart devices.
1 code implementation • 19 Mar 2024 • Jinying Xiao, Ping Li, Jie Nie, Zhe Tang
We use this design to dynamically assess the importance scores of weights. We introduce SEVEN, which particularly favors weights with consistently high sensitivity, i.e., weights with small gradient noise.
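As a rough illustration of scoring weights by consistently high sensitivity, the sketch below averages a first-order sensitivity |w · g| over mini-batches and discounts it by its relative spread. This is not the SEVEN algorithm. The function name `sensitivity_scores`, the |w · g| sensitivity measure, and the `mean / (1 + cv)` discount are all assumptions made for this example.

```python
from statistics import mean, pstdev


def sensitivity_scores(weights, grad_samples, eps=1e-12):
    """Score each weight by its mean sensitivity, discounted by gradient noise.

    weights:      list of floats, one per parameter.
    grad_samples: list of per-mini-batch gradient lists, each the same
                  length as `weights`.
    Hypothetical scoring rule, not taken from the paper.
    """
    scores = []
    for i, w in enumerate(weights):
        # First-order sensitivity |w * g| of weight i in each mini-batch.
        s = [abs(w * g[i]) for g in grad_samples]
        m = mean(s)
        # Coefficient of variation: spread relative to the mean,
        # used here as a stand-in for gradient noise.
        cv = pstdev(s) / (m + eps)
        # High mean sensitivity with low noise gives a high score.
        scores.append(m / (1.0 + cv))
    return scores


# Two weights with the same mean sensitivity but different noise levels.
weights = [1.0, 1.0]
grad_samples = [[1.0, 0.0], [1.0, 2.0]]
scores = sensitivity_scores(weights, grad_samples)
```

In this toy case both weights have a mean sensitivity of 1, but the second weight's gradient varies between batches, so it receives the lower score and would be pruned first under this rule.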