Theory of adaptive SVD regularization for deep neural networks

1 Aug 2020  ·  Mohammad Mahdi Bejani, Mehdi Ghatee

Deep networks can learn complex problems; however, they suffer from overfitting. Many regularization methods have been proposed to address this, but they do not adapt to the dynamic changes of the training process. Taking a different approach, this paper presents a regularization method based on the Singular Value Decomposition (SVD) that adjusts the learning model adaptively. To this end, overfitting is evaluated by the condition numbers of the synaptic weight matrices. When overfitting is high, the matrices are substituted with their SVD approximations. Some theoretical results are derived to show the performance of this regularization method. It is proved that the SVD approximation alone cannot resolve overfitting after several iterations. Thus, a new Tikhonov term is added to the loss function so that the synaptic weights converge to the SVD approximation of the best-found results. Following this approach, an Adaptive SVD Regularization (ASR) scheme is proposed that adjusts the learning model with respect to the dynamic characteristics of training. The results of ASR are visualized to show how it overcomes overfitting. Different configurations of Convolutional Neural Networks (CNNs) are implemented with different augmentation schemes to compare ASR with state-of-the-art regularization methods. The results show that on MNIST, F-MNIST, SVHN, CIFAR-10 and CIFAR-100, the accuracies of ASR are 99.4%, 95.7%, 97.1%, 93.2% and 55.6%, respectively. Although ASR reduces overfitting and validation loss, its elapsed time is not significantly greater than that of training without regularization.
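The abstract describes the core mechanism only in words: monitor the condition number of each weight matrix, replace ill-conditioned matrices with a low-rank SVD approximation, and add a Tikhonov term pulling the weights toward the SVD approximation of the best weights found so far. The sketch below illustrates this idea in NumPy; the function names, the truncation rule, and the hyper-parameters (kappa_max, rank_keep, lam, lr) are illustrative assumptions, not the authors' exact formulation.

```python
# Minimal sketch of the adaptive SVD regularization idea from the abstract.
# All names and hyper-parameters are assumptions made for illustration.
import numpy as np

def condition_number(W):
    """Ratio of largest to smallest nonzero singular value of a weight matrix."""
    s = np.linalg.svd(W, compute_uv=False)
    s = s[s > 1e-12]
    return s[0] / s[-1]

def svd_truncate(W, rank_keep):
    """Low-rank SVD approximation of W keeping the top `rank_keep` singular values."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    k = min(rank_keep, len(s))
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

def asr_step(W, W_best, kappa_max=1e3, rank_keep=8, lam=1e-3, lr=1e-2):
    """One illustrative update: if the matrix is ill-conditioned (overfitting
    proxy), replace it by its SVD approximation, then apply the gradient of a
    Tikhonov term that pulls W toward the SVD approximation of the best
    weights found so far."""
    if condition_number(W) > kappa_max:
        W = svd_truncate(W, rank_keep)
    target = svd_truncate(W_best, rank_keep)
    # gradient step on  lam/2 * ||W - target||_F^2
    return W - lr * lam * (W - target)
```

In a real training loop such a step would be interleaved with the ordinary loss gradient for every layer's weight matrix; here it is shown in isolation only to make the two ingredients (SVD substitution and the Tikhonov pull) concrete.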
