POCO: Scalable Neural Forecasting through Population Conditioning

Notice: This research summary and analysis were generated automatically using AI. For full accuracy, please refer to the original arXiv source.

Predicting future neural activity is a core challenge in modeling brain dynamics, with applications ranging from scientific investigation to closed-loop neurotechnology. While recent models of population activity emphasize interpretability and behavioral decoding, neural forecasting, particularly across multi-session, spontaneous recordings, remains underexplored. We introduce POCO, a unified forecasting model that combines a lightweight univariate forecaster with a population-level encoder to capture both neuron-specific and brain-wide dynamics. Trained across five calcium imaging datasets spanning zebrafish, mice, and C. elegans, POCO achieves state-of-the-art accuracy at cellular resolution in spontaneous behaviors. After pre-training, POCO rapidly adapts to new recordings with minimal fine-tuning. Notably, POCO's learned unit embeddings recover biologically meaningful structure, such as brain region clustering, without any anatomical labels. Our comprehensive analysis reveals several key factors influencing performance, including context length, session diversity, and preprocessing. Together, these results position POCO as a scalable and adaptable approach for cross-session neural forecasting and offer actionable insights for future model design. By enabling accurate, generalizable forecasting models of neural dynamics across individuals and species, POCO lays the groundwork for adaptive neurotechnologies and large-scale efforts toward neural foundation models. Code is available at https://github.com/yuvenduan/POCO.


💡 Research Summary

The paper introduces POCO (Population‑Conditioned forecaster), a novel architecture for predicting future neural activity at cellular resolution across multiple recording sessions and species. POCO tackles the under‑explored problem of neural time‑series forecasting (TSF) in spontaneous, task‑free recordings, where existing models have largely focused on interpreting population dynamics or decoding behavior from limited, trial‑based data.

POCO consists of two complementary components. The first is a lightweight univariate multilayer perceptron (MLP) forecaster. Given a context window of C = 48 time steps for each neuron, the MLP maps the past activity to a hidden representation of size M = 1024 and then linearly projects it to predict the next P = 16 steps. This component captures each neuron's intrinsic autocorrelation and simple temporal patterns.
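As a rough illustration (not the authors' implementation), the univariate forecaster can be sketched in NumPy using the dimensions quoted above; the weights and the ReLU nonlinearity here are assumptions standing in for learned parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
C, M, P = 48, 1024, 16  # context length, hidden size, prediction horizon

# Hypothetical weights; in the real model these are learned per the paper.
W_in = rng.normal(scale=C ** -0.5, size=(C, M))
W_out = rng.normal(scale=M ** -0.5, size=(M, P))

def forecast(trace_context):
    """Map one neuron's past C steps to its next P steps."""
    h = np.maximum(trace_context @ W_in, 0.0)  # hidden representation (ReLU assumed)
    return h @ W_out                           # linear projection to P future steps

context = rng.normal(size=C)  # one neuron's z-scored context window
pred = forecast(context)
print(pred.shape)  # (16,)
```

Because the forecaster is univariate, the same small network is applied to every neuron independently, which is what keeps this component lightweight.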

The second component is a population encoder that conditions the MLP via Feature‑wise Linear Modulation (FiLM). The encoder adapts the POYO architecture (a Perceiver‑IO‑based model originally designed for behavioral decoding). Each neuron's trace is split into tokens of length T_C = 16; tokens are linearly projected, summed with a learnable unit embedding (UnitEmbed) and a session embedding (SessionEmbed), and fed into a Perceiver‑IO cross‑attention stack with N_L = 8 latent vectors. After one self‑attention layer, the final attention uses the unit embeddings as queries to produce per‑neuron FiLM parameters (γ, β) that scale and shift the MLP hidden activations. Thus, the model blends neuron‑specific dynamics with the global brain state at each time point.
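The FiLM step itself is a simple per-neuron affine transform; a minimal sketch, with random stand-ins for the (γ, β) that the population encoder would actually produce:

```python
import numpy as np

rng = np.random.default_rng(1)
M = 1024  # hidden size of the univariate MLP

def film(hidden, gamma, beta):
    """FiLM: per-neuron affine modulation of the MLP hidden layer."""
    return gamma * hidden + beta

hidden = rng.normal(size=(3, M))  # hidden activations for 3 neurons
# In POCO, gamma and beta come from the encoder's output attention;
# here they are random placeholders.
gamma = rng.normal(size=(3, 1))
beta = rng.normal(size=(3, 1))
out = film(hidden, gamma, beta)
print(out.shape)  # (3, 1024)
```

With γ = 1 and β = 0 the modulation is the identity, so the encoder can smoothly interpolate between leaving the univariate forecaster untouched and strongly steering it with population-level context.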

The authors evaluate POCO on five calcium‑imaging datasets spanning zebrafish (two labs), mouse, and C. elegans (two labs). These datasets differ in species, number of neurons (from ~100 to >70 k), recording length, and sampling frequency. All traces are z‑scored; for the largest zebrafish dataset, the authors also report results on the first 512 principal components to manage dimensionality. They compare POCO against a broad suite of baselines: simple univariate linear models (NLinear, DLinear), a latent dynamical‑systems model (Latent_PLRNN), modern MLP‑based TSF models (TSMixer, TexFilter), attention‑based models (AR_Transformer, NetFormer), a temporal convolutional network (TCN), and a copy‑last‑frame baseline (f_copy). Performance is measured by mean squared error (MSE) and a "Prediction Score" defined as 1 − MSE_model / MSE_copy, analogous to an R² relative to the naive copy baseline.
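The Prediction Score follows directly from its definition; a small sketch, assuming the copy baseline repeats each neuron's last observed context frame over the full prediction horizon:

```python
import numpy as np

def prediction_score(context, target, pred):
    """Prediction Score = 1 - MSE_model / MSE_copy.

    The copy baseline (f_copy) repeats the last observed frame
    of the context window across the whole prediction horizon.
    """
    mse_model = np.mean((pred - target) ** 2)
    copy = np.broadcast_to(context[..., -1:], target.shape)
    mse_copy = np.mean((copy - target) ** 2)
    return 1.0 - mse_model / mse_copy

context = np.array([[0.0, 1.0]])  # last observed value is 1.0
target = np.array([[2.0, 3.0]])   # true future activity
print(prediction_score(context, target, target))  # 1.0 (perfect forecast)
```

A perfect forecast scores 1, the copy baseline itself scores 0, and a model worse than copying goes negative, which is what makes the score comparable across datasets with different variances.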

Results show that POCO consistently outperforms all baselines across all datasets. In single‑session experiments, POCO achieves the highest prediction scores (e.g., 0.466 ± 0.019 on the Deisseroth zebrafish data). In multi‑session training, the advantage is even larger: MS_POCO reaches an average score of 0.525 ± 0.004, surpassing a multi‑session MLP by ~0.1 and beating more sophisticated models such as NetFormer and AR_Transformer by substantial margins. Sample prediction traces illustrate that POCO captures both slow calcium dynamics and transient events.

A series of ablations elucidate key design choices. Increasing the context length C improves performance up to a plateau around C = 48; longer contexts provide diminishing returns. Token length T_C = 16 balances computational cost (linear scaling with neuron count) and accuracy; very short tokens (T_C = 1) inflate token count and degrade efficiency. Training on a diverse set of sessions enhances generalization, confirming that shared neural motifs exist across individuals and species. Pre‑training on the multi‑session corpus enables rapid adaptation: fine‑tuning on a new session for just a few epochs yields performance comparable to training from scratch, highlighting POCO’s suitability for online or closed‑loop neurotechnology.

Perhaps the most striking finding is that the learned unit embeddings, despite being trained without any anatomical supervision, naturally cluster neurons by brain region. This emergent structure suggests that the population encoder captures biologically meaningful relationships and that FiLM conditioning propagates this information to the forecaster. The authors also demonstrate that POCO can be applied to reduced representations such as principal components, further broadening its applicability.
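One way to probe such emergent structure, sketched here on synthetic vectors standing in for the learned unit embeddings (the embedding dimension and region layout are assumptions, not the paper's values), is to cluster the embeddings and compare clusters against anatomical labels:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 32  # embedding dimension (hypothetical)

# Synthetic stand-ins for learned unit embeddings from two brain "regions".
emb = np.vstack([rng.normal(0.0, 1.0, size=(20, D)),   # region A
                 rng.normal(5.0, 1.0, size=(20, D))])  # region B

def kmeans(x, init, iters=20):
    """Tiny k-means, enough to probe cluster structure in the embeddings."""
    centers = init.astype(float).copy()
    for _ in range(iters):
        dists = np.linalg.norm(x[:, None] - centers[None], axis=-1)
        labels = dists.argmin(axis=1)
        centers = np.array([x[labels == k].mean(axis=0)
                            for k in range(len(centers))])
    return labels

labels = kmeans(emb, init=emb[[0, -1]])
print(labels[:20], labels[20:])  # neurons from the same "region" share a cluster
```

In the paper's setting the region labels come from anatomy and are withheld during training, so agreement between clusters and regions is evidence that the embeddings encode biologically meaningful relationships.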

In summary, POCO introduces a scalable, interpretable framework for neural forecasting that leverages (i) a simple univariate predictor for neuron‑specific dynamics, (ii) a population‑level encoder to model global brain state, and (iii) FiLM modulation to fuse the two. The model scales linearly with neuron count, adapts quickly to new recordings, and uncovers latent anatomical structure, positioning it as a strong candidate for building foundation models of neural activity and for practical closed‑loop neurotechnologies.

