2 code implementations • 1 Aug 2022 • David Peer, Bart Keulen, Sebastian Stabinger, Justus Piater, Antonio Rodríguez-Sánchez
We show empirically that we can therefore train a "vanilla" fully connected network and convolutional neural network -- no skip connections, batch normalization, dropout, or any other architectural tweak -- with 500 layers by simply adding the batch-entropy regularization term to the loss function.