ePC: Fast and Deep Predictive Coding for Digital Hardware
Predictive Coding (PC) offers a brain-inspired alternative to backpropagation for neural network training, described as a physical system minimizing its internal energy. However, in practice, PC is predominantly digitally simulated, requiring excessive amounts of compute while struggling to scale to deeper architectures. This paper reformulates PC to overcome this hardware-algorithm mismatch. First, we uncover how the canonical state-based formulation of PC (sPC) is, by design, deeply inefficient in digital simulation, inevitably resulting in exponential signal decay that stalls the entire minimization process. Then, to overcome this fundamental limitation, we introduce error-based PC (ePC), a novel reparameterization of PC which does not suffer from signal decay. Though no longer biologically plausible, ePC numerically computes exact PC weight gradients and runs orders of magnitude faster than sPC. Experiments across multiple architectures and datasets demonstrate that ePC matches backpropagation’s performance even for deeper models where sPC struggles. Besides practical improvements, our work provides theoretical insight into PC dynamics and establishes a foundation for scaling PC-based learning to deeper architectures on digital hardware and beyond.
💡 Research Summary
Predictive Coding (PC) has emerged as a brain‑inspired alternative to back‑propagation, but its canonical state‑based formulation (sPC) suffers from severe inefficiencies when simulated on digital hardware. The authors first demonstrate that sPC’s local, layer‑wise update rule inevitably introduces an exponential signal decay: each backward‑propagating signal is multiplied by the state learning rate λ (< 1) at every layer, yielding a factor λ^{L‑i} for a network of depth L. With typical λ values (0.01–0.1), signals vanish after only a handful of update steps, especially in deep networks, leading to slow convergence, poor gradient quality, and a pronounced depth‑scaling failure. Empirical traces on a 20‑layer MLP confirm that the backward wavefront stalls for several iterations before reaching deeper layers, and that increasing λ or using higher‑precision arithmetic merely masks the underlying problem.
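The λ^{L−i} attenuation described above can be made concrete with a short numerical sketch. This is an illustrative toy, not the paper's code: it only evaluates the attenuation factor that the first backward-propagating signal carries when it reaches each layer of a depth-L network, with λ in the typical range quoted in the summary.

```python
import numpy as np

# Illustrative toy (not the paper's code): under sPC's local state updates,
# each layer multiplies the backward-propagating signal by the state learning
# rate lam, so the first signal to reach layer i carries a factor lam**(L - i).
L = 20      # network depth, matching the summary's 20-layer MLP example
lam = 0.05  # state learning rate within the quoted 0.01-0.1 range

# Attenuation factor of the first backward signal arriving at layers 1..L.
attenuation = np.array([lam ** (L - i) for i in range(1, L + 1)])

print(f"layer 1 (deepest) attenuation: {attenuation[0]:.3e}")
print(f"layer L (output)  attenuation: {attenuation[-1]:.3e}")
```

With λ = 0.05 and L = 20, the deepest layer's first signal is scaled by 0.05^19, far below single-precision resolution, which is consistent with the summary's point that higher-precision arithmetic only masks, rather than removes, the decay.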
To overcome this limitation, the paper introduces error‑based Predictive Coding (ePC), a re‑parameterization that treats prediction errors ε_i as the primary optimisation variables rather than the neuronal states s_i. The energy function is rewritten as E(ε, θ)=½∑‖ε_i‖²+L(ŷ, y), while the relationship s_i = ŝ_i + ε_i (with ŝ_i = f_θi(s_{i‑1})) allows recovery of the original states if needed. Crucially, ePC’s computational graph is globally connected: errors are updated in a sequential, back‑propagation‑like fashion, so the gradient of the loss with respect to each ε_i is obtained in a single pass without the λ‑induced attenuation. The weight‑update step remains identical to sPC (∇_θ E = −∂ŝ_i/∂θ · ε_i), guaranteeing that ePC computes the exact same weight gradients as sPC at equilibrium. The authors provide a formal proof of equivalence (Appendix C) and show that ePC converges in orders of magnitude fewer iterations.
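A minimal sketch of the ePC idea, under simplifying assumptions: a linear chain s_i = W_i s_{i−1} + ε_i with squared-error loss, errors ε_i as the free variables, and a single backward-like pass computing ∂E/∂ε_i without any λ attenuation. The function names (`forward`, `epc_error_grads`) and the linear-layer setup are illustrative choices, not from the paper.

```python
import numpy as np

# Hypothetical linear-chain sketch of ePC: optimise the prediction errors
# eps_i directly, with E = 0.5 * sum ||eps_i||^2 + 0.5 * ||s_L - y||^2.
rng = np.random.default_rng(0)
L, d = 4, 8
Ws = [rng.normal(scale=1 / np.sqrt(d), size=(d, d)) for _ in range(L)]

def forward(x, eps):
    """Recover the states s_i = W_i @ s_{i-1} + eps_i from the errors."""
    s, states = x, []
    for W, e in zip(Ws, eps):
        s = W @ s + e
        states.append(s)
    return states

def epc_error_grads(x, y, eps):
    """One backward-like pass: dE/deps_i = eps_i + dLoss/ds_i for all layers."""
    states = forward(x, eps)
    delta = states[-1] - y          # dLoss/ds_L for the squared-error loss
    grads = []
    for i in reversed(range(L)):
        grads.append(eps[i] + delta)   # s_i depends on eps_i additively
        delta = Ws[i].T @ delta        # propagate to the layer below, no lam
    return list(reversed(grads))
```

Note how the backward pass reaches every layer in a single sweep, mirroring the summary's point that ePC's computational graph is globally connected; the quadratic ½‖ε_i‖² term simply adds ε_i itself to each gradient.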
Extensive experiments across architectures (fully‑connected MLPs, convolutional networks, residual networks) and datasets (MNIST, CIFAR‑10, a subset of ImageNet) validate the claims. For shallow networks (≤10 layers) both sPC and ePC reach comparable test accuracies, but ePC does so roughly 10–30× faster. In deep settings (30–50 layers), sPC either fails to converge or attains sub‑par performance (< 60% accuracy), whereas ePC consistently reaches > 90% accuracy within 3–5 epochs, matching back‑propagation baselines. Sensitivity analyses reveal that ePC is robust to the choice of λ and to floating‑point precision, unlike sPC which remains fragile.
The paper concludes that the primary obstacle to scaling Predictive Coding on digital hardware is not a theoretical flaw in PC itself, but the mismatch between the locally constrained dynamics of sPC and the realities of discrete computation. By abandoning strict locality and optimising errors directly, ePC eliminates exponential signal decay, preserves exact gradients, and enables fast, deep learning comparable to back‑propagation. Although ePC sacrifices biological plausibility, it offers a practical pathway for hardware‑accelerated PC and suggests future research directions such as hybrid schemes that combine ePC’s efficiency with neuromorphic implementations of sPC.