Metabolic cost of information processing in Poisson variational autoencoders

Reading time: 5 minutes
...

📝 Original Info

  • Title: Metabolic cost of information processing in Poisson variational autoencoders
  • ArXiv ID: 2602.13421
  • Date: 2026-02-13
  • Authors: Not provided in the source data; see the original ArXiv entry for the author list.

📝 Abstract

Computation in biological systems is fundamentally energy-constrained, yet standard theories of computation treat energy as freely available. Here, we argue that variational free energy minimization under a Poisson assumption offers a principled path toward an energy-aware theory of computation. Our key observation is that the Kullback-Leibler (KL) divergence term in the Poisson free energy objective becomes proportional to the prior firing rates of model neurons, yielding an emergent metabolic cost term that penalizes high baseline activity. This structure couples an abstract information-theoretic quantity -- the *coding rate* -- to a concrete biophysical variable -- the *firing rate* -- which enables a trade-off between coding fidelity and energy expenditure. Such a coupling arises naturally in the Poisson variational autoencoder (P-VAE) -- a brain-inspired generative model that encodes inputs as discrete spike counts and recovers a spiking form of *sparse coding* as a special case -- but is absent from standard Gaussian VAEs. To demonstrate that this metabolic cost structure is unique to the Poisson formulation, we compare the P-VAE against Grelu-VAE, a Gaussian VAE with ReLU rectification applied to latent samples, which controls for the non-negativity constraint. Across a systematic sweep of the KL term weighting coefficient $β$ and latent dimensionality, we find that increasing $β$ monotonically increases sparsity and reduces average spiking activity in the P-VAE. In contrast, Grelu-VAE representations remain unchanged, confirming that the effect is specific to Poisson statistics rather than a byproduct of non-negative representations. These results establish Poisson variational inference as a promising foundation for a resource-constrained theory of computation.
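To make the abstract's central claim concrete, here is a minimal numerical sketch (not code from the paper; the rate parameterization is an assumption) showing that the closed-form Poisson KL divergence scales linearly with the prior firing rate, while the Gaussian KL used in a standard VAE, which is unchanged by the Grelu-VAE's post-hoc ReLU, involves no firing rate at all:

```python
# Minimal numerical sketch (not from the paper): the Poisson KL term is
# proportional to the prior firing rate, the Gaussian KL term is not.
import numpy as np

def kl_poisson(rate_q, rate_p):
    """KL( Poisson(rate_q) || Poisson(rate_p) ) in nats, for rate_q > 0.
    As rate_q -> 0 (a silent neuron), the KL tends to rate_p."""
    return rate_q * np.log(rate_q / rate_p) - rate_q + rate_p

def kl_gaussian(mu, sigma):
    """KL( N(mu, sigma^2) || N(0, 1) ) in nats (closed form)."""
    return 0.5 * (mu**2 + sigma**2 - 1.0 - 2.0 * np.log(sigma))

# Hold the posterior/prior rate ratio fixed (here 2x) and sweep the prior rate:
# the Poisson KL equals r_prior * (2 ln 2 - 1), i.e. it is exactly proportional
# to the baseline firing rate -- an emergent metabolic cost on baseline activity.
for r_prior in [0.1, 1.0, 10.0]:
    print(r_prior, kl_poisson(2.0 * r_prior, r_prior))

# The Gaussian KL depends only on (mu, sigma). Rectifying the *samples* with a
# ReLU afterwards (the Grelu-VAE control) leaves this term untouched, which is
# why no metabolic cost can emerge there.
print(kl_gaussian(mu=1.0, sigma=0.5))
```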

💡 Deep Analysis

📄 Full Content

Modern artificial intelligence (AI) has achieved impressive breakthroughs with no slowdown in sight. However, this achievement comes at a serious cost: mainstream AI models are energetically inefficient, posing a global sustainability threat (Hao, 2025). Powering models like ChatGPT consumes gigawatt-hours, and energy is quickly becoming the unavoidable bottleneck for AI progress (You & Owen, 2025). This is a fundamental physical constraint that we cannot engineer our way around; there is an urgent need to address it from first principles.

The energy inefficiency of mainstream AI systems originates from a critical design principle: the decoupling of energy and computation (Deacon, 2011; Landauer, 1961). There are no mechanisms internal to architectures like transformers that relate computation to energy expenditure. For a transformer, every token is created equal, as far as voltage goes. This is wasteful by design.

In sharp contrast, energy efficiency is a core principle of biological computation (Olshausen & Field, 1997; Quiroga et al., 2008; Sterling & Laughlin, 2015). Brains run on ∼20 watts (Balasubramanian, 2021), yet perform computations that require megawatt-scale data centers to approximate. This efficiency is likely explained by the efficient coding hypothesis (Barlow, 1961, 1972, 1989), which states that brains adapt to the statistics of their environments (Simoncelli & Olshausen, 2001), minimizing metabolic cost (Attwell & Laughlin, 2001; Olshausen & Field, 1996; Padamsey & Rochefort, 2023). A complementary possibility is that the brain's representational form, discrete spiking events, determines the cost structure of the computation itself.

Neuromorphic computing aims to bridge biological and artificial computation (Mead, 2002); event-driven architectures like Intel's Loihi (Davies et al., 2018) already "think" in spikes and energy. But we still lack rigorous theoretical foundations to inform future algorithm-hardware co-design. This motivates the need for an energy-aware theory of computation that goes beyond current frameworks, which are limited to time and space complexity (Sipser, 2012; Von Neumann, 1945). Aimone (2025) recently approached this from the hardware perspective, arguing that neuromorphic computing has fundamentally different energy scaling from the von Neumann architecture. In conventional systems, energy is proportional to total algorithmic work: every operation incurs a fixed cost regardless of what is actually being computed. In neuromorphic systems, energy is instead proportional to the cumulative change of state across the computational graph: if a neuron does not spike and its state does not change, no energy is expended. Aimone (2025) further showed that the dominant energy terms all scale with the average firing rate, making sparsity the primary lever for efficiency. However, this analysis addresses only the hardware side; the algorithmic and theoretical foundations remain missing. A toy accounting of the two regimes is sketched below.
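The two scaling regimes can be illustrated with a toy accounting model (hypothetical constants and activity levels, not figures from Aimone (2025) or any particular chip): conventional accounting charges every potential operation, while event-driven accounting charges only for state changes, so its total cost tracks the average firing rate.

```python
# Illustrative toy accounting of the two energy-scaling regimes described above
# (hypothetical constants; not measurements from Aimone (2025) or any hardware).
import numpy as np

rng = np.random.default_rng(0)

n_neurons, n_steps = 1000, 100
firing_prob = 0.02                      # sparse activity: ~2% of neurons spike per step
spikes = rng.random((n_steps, n_neurons)) < firing_prob

E_OP, E_SPIKE = 1.0, 1.0                # per-operation and per-state-change costs (arbitrary units)

# von Neumann-style accounting: every potential update is paid for,
# whether or not anything actually changed.
energy_von_neumann = E_OP * n_steps * n_neurons

# Neuromorphic-style accounting: energy tracks cumulative change of state,
# so silent neurons cost (approximately) nothing.
energy_neuromorphic = E_SPIKE * spikes.sum()

print(energy_von_neumann)               # 100000.0
print(energy_neuromorphic)              # ~2000, i.e. proportional to the average firing rate
```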

[Figure: The variational free energy equation, relating model evidence (left-hand side) to the variational free energy objective (ELBO = -ℱ) plus the standard KL objective used as the starting point in variational inference. Importantly, the left-hand side does not depend on the variational parameters 𝜆; therefore, minimizing ℱ with respect to 𝜆 directly minimizes the original inference KL objective. In short: evidence(𝑥; 𝜃) = -ℱ(𝑥; 𝜃, 𝜆) + KL(𝑥; 𝜃, 𝜆).]

Here, we demonstrate how Poisson variational inference (Vafaii et al., 2024, 2025) naturally leads to the emergence of an energy-aware objective that learns to trade computational accuracy for energy expenditure. We contrast this with standard Gaussian variational inference (Friston, 2005, 2009, 2010; Kingma & Welling, 2014), revealing that such a metabolic cost term is critically absent from the Gaussian formulation. We provide a theoretical explanation using information geometry (Amari, 2016):

Poisson and Gaussian distributions have fundamentally different geometries, and only Poisson realizes the kind of energy-computation coupling that Aimone (2025) argues for. We then conduct comprehensive experiments that confirm these theoretical predictions.
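The contrast can be made explicit with the standard closed-form KL terms. The following is a sketch in the usual β-weighted free energy notation, assuming factorized Poisson posteriors and priors for the P-VAE and factorized Gaussians with a standard normal prior for the ordinary VAE (the exact parameterization in the paper may differ):

```latex
% Beta-weighted variational free energy (negative ELBO):
\[
\mathcal{F}(x;\theta,\lambda)
  = -\,\mathbb{E}_{q_\lambda(z\mid x)}\!\left[\log p_\theta(x\mid z)\right]
  + \beta \,\mathrm{KL}\!\left(q_\lambda(z\mid x)\,\big\|\,p_\theta(z)\right).
\]

% Factorized Poisson posterior (rate \lambda_j) against a Poisson prior (rate r_j):
\[
\mathrm{KL}\!\left(\mathrm{Pois}(\lambda_j)\,\big\|\,\mathrm{Pois}(r_j)\right)
  = \lambda_j \log\frac{\lambda_j}{r_j} - \lambda_j + r_j
  = r_j \left(u_j \log u_j - u_j + 1\right),
  \qquad u_j = \lambda_j / r_j .
\]
% The coding cost is proportional to the prior firing rate r_j: silence
% (u_j -> 0) costs only r_j nats, and high baseline rates are penalized.

% Factorized Gaussian posterior against a standard normal prior:
\[
\mathrm{KL}\!\left(\mathcal{N}(\mu_j,\sigma_j^2)\,\big\|\,\mathcal{N}(0,1)\right)
  = \tfrac{1}{2}\left(\mu_j^2 + \sigma_j^2 - 1 - \log \sigma_j^2\right),
\]
% which contains no firing rate, so no metabolic term can emerge from the
% Gaussian objective, even after rectifying the latent samples.
```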

We establish from probabilistic first principles that variational inference under Poisson assumptions naturally produces an emergent metabolic term that makes silence cheap and couples information rate to firing rate. This cost structure is strikingly similar to what Aimone (2025) arrives at from hardware principles: that energy-efficient neuromorphic computation requires algorithms where energy scales with change of state rather than total work, and that sparsity (silence) should be free. The convergence of these two independent lines of reasoning, one from hardware and one from information theory, positions Poisson variational inference as a promising foundation for resource-constrained theories of computation that treat energy expenditure as a core constraint.
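As a quick numerical check of the "silence is cheap" claim, using the same closed-form Poisson KL as above (illustrative rate values, not taken from the paper):

```python
# Quick numerical check that silence is cheap under the Poisson KL
# (illustrative rate values; not taken from the paper).
import numpy as np

def kl_poisson(rate_q, rate_p):
    """KL( Poisson(rate_q) || Poisson(rate_p) ) in nats; tends to rate_p as rate_q -> 0."""
    rate_q = np.maximum(float(rate_q), 1e-12)  # guard the log at exactly zero
    return rate_q * np.log(rate_q / rate_p) - rate_q + rate_p

r_prior = 0.05                    # a sparse prior: 0.05 expected spikes per coding window
print(kl_poisson(0.0, r_prior))   # silent neuron: ~0.05 nats, i.e. roughly the prior rate
print(kl_poisson(5.0, r_prior))   # vigorous firing: ~18 nats, paid for in spikes
```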

Reference

This content is AI-processed based on open access ArXiv data.
