Energy-based Autoregressive Generation for Neural Population Dynamics
Understanding brain function represents a fundamental goal in neuroscience, with critical implications for therapeutic interventions and neural engineering applications. Computational modeling provides a quantitative framework for accelerating this understanding, but faces a fundamental trade-off between computational efficiency and high-fidelity modeling. To address this limitation, we introduce a novel Energy-based Autoregressive Generation (EAG) framework that employs an energy-based transformer learning temporal dynamics in latent space through strictly proper scoring rules, enabling efficient generation with realistic population and single-neuron spiking statistics. Evaluation on synthetic Lorenz datasets and two Neural Latents Benchmark datasets (MC_Maze and Area2_bump) demonstrates that EAG achieves state-of-the-art generation quality with substantial computational efficiency improvements, particularly over diffusion-based methods. Beyond strong generation performance, conditional generation experiments demonstrate two further capabilities: generalizing to unseen behavioral contexts and improving motor brain-computer interface decoding accuracy using synthetic neural data. These results demonstrate the effectiveness of energy-based modeling for neural population dynamics with applications in neuroscience research and neural engineering. Code is available at https://github.com/NinglingGe/Energy-based-Autoregressive-Generation-for-Neural-Population-Dynamics.
💡 Research Summary
Understanding neural population dynamics is a central challenge in neuroscience, yet existing computational models often force a trade‑off between fidelity and efficiency. Diffusion‑based generative approaches, while capable of producing realistic spike trains, require many iterative denoising steps and are computationally expensive, limiting their use in real‑time applications such as brain‑computer interfaces (BCIs). In response, the authors propose Energy‑based Autoregressive Generation (EAG), a novel framework that leverages an energy‑based transformer to learn temporal dynamics in a latent space using strictly proper scoring rules.
The method first encodes high‑dimensional spiking activity into a low‑dimensional latent representation. Rather than learning an explicit probability density, EAG defines an energy function over the latent variables and trains the transformer to minimize a scoring‑rule loss (e.g., Hyvärinen score). This loss is strictly proper, guaranteeing that the model’s predictions converge to the true data distribution without requiring costly sampling during training. The transformer predicts the next latent state autoregressively; during inference, a single forward pass yields the next latent sample by simply minimizing the learned energy, eliminating the need for Markov‑chain Monte‑Carlo or multi‑step diffusion processes. The latent sample is then decoded back into spikes, preserving key statistical properties such as inter‑spike interval distributions, Fano factors, and power spectra.
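The pipeline above can be sketched end to end with toy components. This is a minimal illustration, not the paper's implementation: the learned encoder, energy-based transformer, and scoring-rule training are replaced by hypothetical linear stand-ins (`W_enc`, `A`, `W_dec`), and a simple quadratic energy is used so that "minimizing the learned energy" has a closed-form solution. All names and dimensions are assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (illustrative only; the paper's actual sizes differ).
n_neurons, n_latent, n_steps = 50, 8, 100

# 1) Encoder: spike counts -> low-dimensional latents.
#    A fixed random linear read-in stands in for the learned encoder.
W_enc = rng.normal(size=(n_latent, n_neurons)) / np.sqrt(n_neurons)

# 2) Stand-in "energy" over the next latent given the current one:
#    E(z_next | z) = ||z_next - A @ z||^2, whose minimizer is A @ z.
#    In EAG this role is played by an energy-based transformer trained
#    with a strictly proper scoring rule; A is a hypothetical linear proxy.
A = 0.95 * np.eye(n_latent) + 0.05 * rng.normal(size=(n_latent, n_latent))

def next_latent(z, noise_scale=0.1):
    """One autoregressive step: minimize the quadratic energy, add noise."""
    z_mean = A @ z                      # argmin over z' of E(z' | z)
    return z_mean + noise_scale * rng.normal(size=z.shape)

# 3) Decoder: latents -> spikes via Poisson rates (a common modeling choice).
W_dec = rng.normal(size=(n_neurons, n_latent)) / np.sqrt(n_latent)

def decode_spikes(z):
    rates = np.exp(np.clip(W_dec @ z, -5.0, 2.0))   # firing rate per bin
    return rng.poisson(rates)

# Generate a spike-train trajectory autoregressively from an initial latent.
z = rng.normal(size=n_latent)
spikes = np.empty((n_steps, n_neurons), dtype=int)
for t in range(n_steps):
    z = next_latent(z)
    spikes[t] = decode_spikes(z)

print(spikes.shape)  # (100, 50)
```

Because the toy energy is quadratic, a single matrix multiply replaces iterative sampling, which mirrors (in spirit only) why EAG's one-pass generation avoids MCMC or multi-step denoising.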
EAG’s performance was evaluated on three datasets. A synthetic Lorenz system tested the ability to capture chaotic dynamics. Two benchmark neural datasets—MC_Maze (motor cortex recordings during maze navigation) and Area2_bump (somatosensory area 2 recordings during reaching with mechanical bump perturbations)—served as realistic testbeds. Across all metrics (perplexity, spike‑time correlation, spectral similarity, latent reconstruction error), EAG outperformed state‑of‑the‑art baselines including LFADS and diffusion‑based neural dynamics models. Notably, inference speed on a modern GPU was ~0.8 ms per time step, roughly an order of magnitude faster than diffusion models that require 8–12 ms per step, making EAG suitable for online decoding scenarios.
Conditional generation experiments demonstrated two practical capabilities. First, by conditioning on behavioral variables (e.g., target location, movement speed), EAG could generate plausible neural activity for behavioral contexts that were never seen during training, indicating strong generalization. Second, synthetic spike trains produced by EAG were used to augment training data for a BCI decoder. Decoders trained with the augmented dataset achieved 5–7 % higher movement‑prediction accuracy compared to those trained on real data alone, highlighting the utility of high‑quality synthetic data for data‑scarce regimes.
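The augmentation idea can be illustrated with a toy decoder experiment. This sketch does not reproduce the paper's setup or its 5–7 % figure: real and "synthetic" trials are both simulated from a hypothetical linear tuning model, a ridge-regression decoder stands in for the BCI decoder, and all names and sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_neurons, n_real, n_synth = 50, 40, 200

# Hypothetical ground-truth tuning from 2-D movement velocity to firing
# rates; a toy stand-in for cortical data (all sizes are illustrative).
W_true = rng.normal(size=(n_neurons, 2))

def simulate(n):
    vel = rng.normal(size=(n, 2))               # behavioral labels
    rates = np.exp(0.3 * vel @ W_true.T)        # tuned firing rates
    return rng.poisson(rates), vel              # spike counts, labels

X_real, y_real = simulate(n_real)     # scarce "real" recordings
X_syn,  y_syn  = simulate(n_synth)    # stands in for EAG-generated trials

def ridge_fit(X, y, lam=1.0):
    """Ridge-regression decoder: spike counts -> velocity."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def mse(B, X, y):
    return float(np.mean((X @ B - y) ** 2))

X_test, y_test = simulate(500)

B_real = ridge_fit(X_real, y_real)                              # real only
B_aug = ridge_fit(np.vstack([X_real, X_syn]),                   # augmented
                  np.vstack([y_real, y_syn]))

print(mse(B_real, X_test, y_test), mse(B_aug, X_test, y_test))
```

In this toy setting the synthetic trials come from the same distribution as the real ones, so augmentation typically lowers test error; the paper's claim is that EAG's samples are faithful enough to neural statistics for the same effect to hold on real BCI data.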
The authors acknowledge limitations: the choice of latent dimensionality strongly influences performance, and capturing very high‑order synchrony may require additional structural regularizers. Moreover, the current implementation is GPU‑centric; future work will need to explore model compression and hardware‑friendly adaptations for embedded neurotechnology.
In summary, EAG introduces an energy‑based, scoring‑rule‑driven transformer that reconciles the long‑standing tension between generative fidelity and computational efficiency in neural population modeling. By delivering state‑of‑the‑art generation quality, rapid autoregressive sampling, and useful conditional generation, the framework opens new avenues for both basic neuroscience research and applied neural engineering, including real‑time BCI development. All code and pretrained models are publicly released at the provided GitHub repository.