The Volume of Non-Restricted Boltzmann Machines and Their Double Descent Model Complexity

The double descent risk phenomenon has received much interest in the machine learning and statistics communities. Motivated by Rissanen's minimum description length (MDL) principle and Amari's information geometry, we investigate how double descent-like behavior may manifest by considering the $\log V$ modeling term, the logarithm of the model volume. In particular, the $\log V$ term is studied for the general class of fully-observed statistical lattice models, of which Boltzmann machines form a subset. We find that for such models the $\log V$ term can decrease with increasing model dimensionality, at a rate that appears to overwhelm the classically understood $\mathcal{O}(D)$ complexity terms of AIC and BIC. Our analysis aims to deepen the understanding of how double descent behavior may arise in deep lattice structures and, by extension, why generalization error may not necessarily continue to grow with increasing model dimensionality.
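As a rough illustration of how a $\log V$ term can shrink as dimensionality grows, the sketch below uses a stand-in model that is not taken from the paper: the saturated categorical distribution with $D$ free parameters ($D+1$ outcomes). Under the Fisher information metric its parameter space is isometric to the positive orthant of a radius-2 sphere, which gives the closed form $V = \pi^{(D+1)/2} / \Gamma((D+1)/2)$. The function names, the sample size `N`, and the comparison against a BIC-style $\frac{D}{2}\log\frac{N}{2\pi}$ term are assumptions made for illustration only; the lattice models analyzed in the paper may behave differently.

```python
import math

def log_volume_categorical(D):
    """Log Fisher-information volume of the saturated categorical model.

    Assumed stand-in model: D free parameters, D + 1 outcomes.
    Closed form: log V = ((D + 1) / 2) * log(pi) - log Gamma((D + 1) / 2).
    """
    half = 0.5 * (D + 1)
    return half * math.log(math.pi) - math.lgamma(half)

def dimension_penalty(D, N):
    """BIC-style O(D) term from the asymptotic MDL expansion: (D / 2) * log(N / (2 * pi))."""
    return 0.5 * D * math.log(N / (2.0 * math.pi))

if __name__ == "__main__":
    N = 100  # hypothetical sample size, chosen only for the demonstration
    for D in (1, 5, 20, 50, 200, 1000):
        lv = log_volume_categorical(D)
        pen = dimension_penalty(D, N)
        print(f"D={D:5d}  log V={lv:12.3f}  (D/2)log(N/2pi)={pen:12.3f}  sum={lv + pen:12.3f}")
```

For this stand-in model, $\log V$ eventually falls roughly like $-\frac{D}{2}\log D$, so for fixed `N` it outpaces the linear-in-$D$ penalty once $D$ is large enough. This is only a toy analogue of the behavior described above.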
