Title: Dynamical modeling of nonlinear latent factors in multiscale neural activity with real-time inference
ArXiv ID: 2512.12462
Date: 2025-12-13
Authors: Eray Erturk, Maryam M. Shanechi
📝 Abstract
Real-time decoding of target variables from multiple simultaneously recorded neural time-series modalities, such as discrete spiking activity and continuous field potentials, is important across various neuroscience applications. However, a major challenge for doing so is that different neural modalities can have different timescales (i.e., sampling rates) and different probabilistic distributions, or can even be missing at some time-steps. Existing nonlinear models of multimodal neural activity do not address different timescales or missing samples across modalities. Further, some of these models do not allow for real-time decoding. Here, we develop a learning framework that can enable real-time recursive decoding while nonlinearly aggregating information across multiple modalities with different timescales and distributions and with missing samples. This framework consists of 1) a multiscale encoder that nonlinearly aggregates information after learning within-modality dynamics to handle different timescales and missing samples in real time, 2) a multiscale dynamical backbone that extracts multimodal temporal dynamics and enables real-time recursive decoding, and 3) modality-specific decoders to account for different probabilistic distributions across modalities. In both simulations and three distinct multiscale brain datasets, we show that our model can aggregate information across modalities with different timescales and distributions and missing samples to improve real-time target decoding. Further, our method outperforms various linear and nonlinear multimodal benchmarks in doing so.
📄 Full Content
Dynamical modeling of nonlinear latent factors in
multiscale neural activity with real-time inference
Eray Erturk
Ming Hsieh Department of Electrical and Computer Engineering
University of Southern California
Los Angeles, CA
eerturk@usc.edu
Maryam M. Shanechi (corresponding author)
Ming Hsieh Department of Electrical and Computer Engineering
Thomas Lord Department of Computer Science
Alfred E. Mann Department of Biomedical Engineering
Neuroscience Graduate Program
University of Southern California
Los Angeles, CA
shanechi@usc.edu
Abstract
Real-time decoding of target variables from multiple simultaneously recorded
neural time-series modalities, such as discrete spiking activity and continuous
field potentials, is important across various neuroscience applications. However, a
major challenge for doing so is that different neural modalities can have different
timescales (i.e., sampling rates) and different probabilistic distributions, or can
even be missing at some time-steps. Existing nonlinear models of multimodal
neural activity do not address different timescales or missing samples across modal-
ities. Further, some of these models do not allow for real-time decoding. Here,
we develop a learning framework that can enable real-time recursive decoding
while nonlinearly aggregating information across multiple modalities with different
timescales and distributions and with missing samples. This framework consists
of 1) a multiscale encoder that nonlinearly aggregates information after learning
within-modality dynamics to handle different timescales and missing samples in
real time, 2) a multiscale dynamical backbone that extracts multimodal temporal
dynamics and enables real-time recursive decoding, and 3) modality-specific de-
coders to account for different probabilistic distributions across modalities. In both
simulations and three distinct multiscale brain datasets, we show that our model can
aggregate information across modalities with different timescales and distributions
and missing samples to improve real-time target decoding. Further, our method
outperforms various linear and nonlinear multimodal benchmarks in doing so.
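As a rough mental model of the three components listed above, the sketch below wires them together in PyTorch. This is our own minimal illustration under stated assumptions: the class, method, and dimension names (`MultiscaleModel`, `step`, `hidden_dim`, etc.) are hypothetical, and the paper's actual layer types, fusion rule, and inference scheme may differ.

```python
import torch
import torch.nn as nn

class MultiscaleModel(nn.Module):
    """Hypothetical skeleton of the three components described in the abstract."""

    def __init__(self, spike_dim, lfp_dim, latent_dim=16, hidden_dim=64):
        super().__init__()
        # 1) Multiscale encoder: one recurrent cell per modality learns
        #    within-modality dynamics, so a slower or missing modality can be
        #    carried forward between its observed samples.
        self.spike_enc = nn.GRUCell(spike_dim, hidden_dim)
        self.lfp_enc = nn.GRUCell(lfp_dim, hidden_dim)
        self.fuse = nn.Sequential(nn.Linear(2 * hidden_dim, latent_dim), nn.Tanh())
        # 2) Multiscale dynamical backbone: recursive latent update that
        #    enables causal, one-step-at-a-time decoding.
        self.backbone = nn.GRUCell(latent_dim, latent_dim)
        # 3) Modality-specific decoders: map the latent to each modality's
        #    distribution parameters (e.g., log-rate for Poisson spikes,
        #    mean for Gaussian field features).
        self.spike_dec = nn.Linear(latent_dim, spike_dim)
        self.lfp_dec = nn.Linear(latent_dim, lfp_dim)

    def step(self, y_spk, y_lfp, h_spk, h_lfp, h):
        """One real-time update; y_lfp=None marks a missing LFP sample."""
        h_spk = self.spike_enc(y_spk, h_spk)
        if y_lfp is not None:                    # skip update when unobserved,
            h_lfp = self.lfp_enc(y_lfp, h_lfp)   # carrying h_lfp forward
        h = self.backbone(self.fuse(torch.cat([h_spk, h_lfp], dim=-1)), h)
        return h, h_spk, h_lfp, self.spike_dec(h), self.lfp_dec(h)
```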
1 Introduction
Real-time continuous decoding of target time-series, such as various brain or behavioral states, from neural time-series data is of interest across many neuroscience applications. A popular approach
for doing so is to develop dynamical latent factor models that describe neural dynamics in terms
of the temporal evolution of latent variables that can be used for downstream decoding. To date,
dynamical latent factor models of neural data have mostly focused on a single modality of neural
data, for example, either spiking activity or local field potentials (LFP) [1–4]. However, brain and
behavioral target states are encoded across multiple spatial and temporal scales of brain activity
that are measured with different neural modalities. Furthermore, some of these dynamical models
have a non-causal inference procedure, which hinders real-time decoding. Therefore, inference of
target variables could be improved by developing nonlinear dynamical models of multimodal neural
time-series that can, at each time-step, aggregate information across neural modalities and do so in
real time.
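Concretely, recursive (causal) inference means the latent state is updated one step at a time from past and current observations only, so a target estimate is available the moment each sample arrives. Below is a toy sketch of this loop in PyTorch, with all names and sizes purely illustrative:

```python
import torch
import torch.nn as nn

f = nn.GRUCell(input_size=8, hidden_size=4)  # latent update x_t = f(y_t, x_{t-1})
g = nn.Linear(4, 2)                          # read-out from latent to target

x = torch.zeros(1, 4)                        # initial latent state
for y_t in torch.randn(100, 1, 8):           # streaming neural observations
    x = f(y_t, x)                            # causal: uses data only up to time t
    z_t = g(x)                               # real-time target estimate at time t
```

A non-causal (smoothing) procedure would instead condition each estimate on the entire recorded sequence, which is why it cannot run in real time.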
A natural challenge in developing such multimodal models arises when modalities are not aligned because they are recorded at different timescales. This misalignment can stem from several factors: fundamental biological differences across modalities, with some modalities evolving more slowly than others [5]; differences in recording devices [6, 7]; or measurement failures and interruptions [8–10]. Further,
modalities could have different distributions. For example, spiking activity is a binary-valued time-series that indicates the presence of action potential events from neurons at each time-step. As such, it has a fast millisecond timescale and is often modeled as a count process, such as a Poisson process. In comparison,
LFP activity is a continuous-valued modality that measures network-level neural processes, has
a slower timescale, and is typically modeled with a Gaussian distribution [5, 7, 11]. We refer to
multimodal data with different timescales as multiscale data. Thus, to fuse information across
spiking and LFP modalities and improve downstream target decoding, their dynamics should
be modeled by incorporating their cross-modality probabilistic and timescale differences through a
careful encoder design.
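One way such a design could account for these differences is to score each modality under its own likelihood and mask out time-steps where a modality is unobserved. The sketch below is an assumption for illustration (PyTorch, with hypothetical function and argument names), not necessarily the paper's objective:

```python
import torch

def multiscale_nll(spike_lograte, spikes, mask_spk,
                   lfp_mean, lfp_logvar, lfp, mask_lfp):
    """Masked, modality-specific negative log-likelihood (hypothetical)."""
    # Poisson likelihood for count-valued spiking activity (fast timescale).
    pois = torch.distributions.Poisson(rate=spike_lograte.exp())
    nll_spk = -(pois.log_prob(spikes) * mask_spk.unsqueeze(-1)).sum()
    # Gaussian likelihood for continuous LFP features (slower timescale).
    gauss = torch.distributions.Normal(lfp_mean, (0.5 * lfp_logvar).exp())
    nll_lfp = -(gauss.log_prob(lfp) * mask_lfp.unsqueeze(-1)).sum()
    # Masks are 1 where a modality was actually observed at a time-step, so
    # slower or intermittently missing modalities contribute only when sampled.
    return nll_spk + nll_lfp
```

Under this masking scheme, a modality sampled at half the rate of another simply carries zeros in its mask at the intermediate time-steps, and dropped samples are handled the same way.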
Existing neural dynamical modeling approaches do not address the nonlinear modeling of multimodal
data with different timescales and/or with real-time decoding capability. Specifically, most dynamical
models do not capture multimodal neural dynamics and instead focus on a single modality of neural
activity either by using linear/switching-linear approaches [1, 12, 13] or by utilizing no