Evolution of Eigenvalue Decay in Deep Networks
The linear transformations in converged deep networks exhibit fast eigenvalue decay. The eigenvalue distribution resembles a heavy-tailed distribution: the vast majority of eigenvalues are small, but not exactly zero, and only a few spikes of large eigenvalues exist. We use a stochastic approximator to generate histograms of eigenvalues, which allows us to investigate layers with hundreds of thousands of dimensions. We show how the distributions change over the course of ImageNet training, converging to a similar heavy-tailed spectrum across all intermediate layers.
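The abstract does not specify which stochastic approximator is used; a standard choice for estimating the eigenvalue density of an operator too large to diagonalize is stochastic Lanczos quadrature (SLQ), which only needs matrix-vector products. The sketch below (an assumption, not the paper's implementation) returns weighted quadrature nodes that approximate the spectral density of a symmetric operator:

```python
import numpy as np

def lanczos_tridiag(matvec, dim, num_steps, rng):
    """Lanczos tridiagonalization from a random unit start vector.

    Uses full reorthogonalization against the stored basis to suppress
    the ghost eigenvalues that plain Lanczos produces in floating point.
    """
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    basis = [v]
    alphas, betas = [], []
    for _ in range(num_steps):
        w = matvec(v)
        alpha = v @ w
        alphas.append(alpha)
        # Reorthogonalize against all previous Lanczos vectors.
        for u in basis:
            w -= (u @ w) * u
        beta = np.linalg.norm(w)
        if beta < 1e-10:          # invariant subspace found; stop early
            break
        betas.append(beta)
        v = w / beta
        basis.append(v)
    T = np.diag(alphas)
    off = np.array(betas[: len(alphas) - 1])
    return T + np.diag(off, 1) + np.diag(off, -1)

def slq_spectral_density(matvec, dim, num_probes=10, num_steps=50, seed=0):
    """Approximate the eigenvalue density of a symmetric linear operator.

    Returns (nodes, weights): each probe contributes the eigenvalues of
    its small tridiagonal matrix as nodes, weighted by the squared first
    components of the corresponding eigenvectors. The weights sum to 1,
    so a weighted histogram of the nodes estimates the spectral density.
    """
    rng = np.random.default_rng(seed)
    nodes, weights = [], []
    for _ in range(num_probes):
        T = lanczos_tridiag(matvec, dim, num_steps, rng)
        evals, evecs = np.linalg.eigh(T)
        nodes.append(evals)
        weights.append(evecs[0] ** 2 / num_probes)
    return np.concatenate(nodes), np.concatenate(weights)
```

For a weight matrix `W` of a layer, one would apply this to the symmetric operator `v -> W.T @ (W @ v)` to histogram the squared singular values; a few dozen Lanczos steps per probe suffice to resolve the bulk of small eigenvalues and the isolated large spikes.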