Predictive coding, precision and natural gradients

12 Nov 2021  ·  Andre Ofner, Raihan Kabir Ratul, Suhita Ghosh, Sebastian Stober ·

There is an increasing convergence between biologically plausible computational models of inference and learning with local update rules and the global gradient-based optimization of neural network models employed in machine learning. One particularly exciting connection is the correspondence between the locally informed optimization in predictive coding networks and the error backpropagation algorithm that is used to train state-of-the-art deep artificial neural networks. Here we focus on the related, but still largely under-explored connection between precision weighting in predictive coding networks and the Natural Gradient Descent algorithm for deep neural networks. Precision-weighted predictive coding is an interesting candidate for scaling up uncertainty-aware optimization -- particularly for models with large parameter spaces -- due to its distributed nature of the optimization process and the underlying local approximation of the Fisher information metric, the adaptive learning rate that is central to Natural Gradient Descent. Here, we show that hierarchical predictive coding networks with learnable precision indeed are able to solve various supervised and unsupervised learning tasks with performance comparable to global backpropagation with natural gradients and outperform their classical gradient descent counterpart on tasks where high amounts of noise are embedded in data or label inputs. When applied to unsupervised auto-encoding of image inputs, the deterministic network produces hierarchically organized and disentangled embeddings, hinting at the close connections between predictive coding and hierarchical variational inference.

PDF Abstract
No code implementations yet. Submit your code now

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods