Neural Networks Generalize on Low Complexity Data
We show that feedforward neural networks with ReLU activation generalize on low complexity data, suitably defined. Given i.i.d. data generated from a simple programming language, the minimum description length (MDL) feedforward neural network which interpolates the data generalizes with high probability. We define this simple programming language, along with a notion of description length of such networks. We provide several examples on basic computational tasks, such as checking primality of a natural number. For primality testing, our theorem shows the following and more. Suppose that we draw an i.i.d. sample of $n$ numbers uniformly at random from $1$ to $N$. For each number $x_i$, let $y_i = 1$ if $x_i$ is a prime and $0$ if it is not. Then, the interpolating MDL network accurately answers, with probability $1- O((\ln N)/n)$, whether a newly drawn number between $1$ and $N$ is a prime or not. Note that the network is not designed to detect primes; minimum description learning discovers a network which does so. Extensions to noisy data are also discussed, suggesting that MDL neural network interpolators can demonstrate tempered overfitting.
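The data-generating setup of the primality example can be sketched in a few lines of Python. This only illustrates how the labeled sample $(x_i, y_i)$ is drawn; it does not construct or search for the MDL network, and it is not the authors' code. The names `is_prime` and `draw_labeled_sample`, the fixed seed, and the choice of trial division are assumptions made for illustration.

```python
import random


def is_prime(x: int) -> bool:
    """Return True if x is prime, using plain trial division."""
    if x < 2:
        return False
    d = 2
    while d * d <= x:
        if x % d == 0:
            return False
        d += 1
    return True


def draw_labeled_sample(n: int, N: int, seed: int = 0):
    """Draw n i.i.d. integers uniformly from 1..N and label each by primality.

    Returns a list of (x_i, y_i) pairs with y_i = 1 if x_i is prime, else 0.
    The seed parameter is a hypothetical convenience for reproducibility.
    """
    rng = random.Random(seed)
    xs = [rng.randint(1, N) for _ in range(n)]  # randint is inclusive on both ends
    return [(x, 1 if is_prime(x) else 0) for x in xs]


# Example: a small sample with n = 10 and N = 100.
sample = draw_labeled_sample(n=10, N=100)
```

In the abstract's statement, a network that interpolates such a sample and has minimum description length is the object whose error on a fresh draw from $1$ to $N$ is bounded by $O((\ln N)/n)$.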
💡 Research Summary
The paper “Neural Networks Generalize on Low Complexity Data” establishes a rigorous theoretical framework showing that feed‑forward ReLU neural networks, when selected by the Minimum Description Length (MDL) principle, can generalize with high probability on data generated by short programs. The authors introduce a deliberately simple programming language called Simple Neural Programs (SNPs). An SNP consists of a bounded number of variables, input statements, integer or Boolean initializations, basic arithmetic and logical operations, and control structures such as loops and conditionals. Each SNP defines a deterministic mapping P :