Factored Filtering of Continuous-Time Systems

We consider filtering for a continuous-time, or asynchronous, stochastic system where the full distribution over states is too large to be stored or calculated. We assume that the rate matrix of the system can be compactly represented and that the belief distribution is to be approximated as a product of marginals. The essential computation is the matrix exponential. We consider two methods for its computation: ODE integration and uniformization of the Taylor expansion. For both we consider approximations in which only a factored belief state is maintained. For factored uniformization we demonstrate that the KL-divergence of each filtering step is bounded. Our experimental results confirm that factored uniformization performs better than previously suggested uniformization methods and the mean field algorithm.


💡 Research Summary

The paper addresses the problem of online filtering in continuous‑time stochastic systems whose state space is too large to store or manipulate explicitly. The authors assume that (i) the infinitesimal generator (rate matrix Q) admits a compact representation, often due to sparsity or a tensor‑product structure, and (ii) the belief distribution over states can be approximated as a product of marginal distributions (a factored or mean‑field style representation). Under these assumptions the central computational task becomes the evaluation of the matrix exponential exp(QΔt) applied to the current belief vector.
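One standard instance of such a compact representation is a tensor-product (Kronecker) structure: for independent subprocesses, the joint generator is the Kronecker sum of the small per-component rate matrices, so only those small matrices need to be stored. The sketch below illustrates this for two hypothetical 2-state components (the matrices Q1, Q2 are made up for illustration, not taken from the paper):

```python
# Minimal sketch: a joint rate matrix built as a Kronecker sum
# Q = Q1 (x) I + I (x) Q2, stored only via the small factors Q1, Q2.

def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def kron_sum(Q1, Q2):
    """Joint generator of two independent CTMCs."""
    n1, n2 = len(Q1), len(Q2)
    I1 = [[float(i == j) for j in range(n1)] for i in range(n1)]
    I2 = [[float(i == j) for j in range(n2)] for i in range(n2)]
    A, B = kron(Q1, I2), kron(I1, Q2)
    return [[A[i][j] + B[i][j] for j in range(n1 * n2)]
            for i in range(n1 * n2)]

# Illustrative per-component rate matrices (rows sum to zero).
Q1 = [[-1.0, 1.0], [2.0, -2.0]]
Q2 = [[-0.5, 0.5], [0.3, -0.3]]
Q = kron_sum(Q1, Q2)  # 4x4 joint generator; rows still sum to zero
```

The joint matrix grows exponentially in the number of components, which is exactly why the paper never materializes it and works with the factors instead.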

Two classic approaches to compute this exponential are examined. The first is ordinary differential equation (ODE) integration, which solves the differential equation d b(t)/dt = Q b(t) by stepping through small time increments. While accurate, ODE integration requires a full‑vector multiplication by Q at each step, leading to prohibitive memory and CPU costs for high‑dimensional systems.
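A minimal sketch of this first approach, using forward-Euler steps on a hypothetical 3-state chain (the rate matrix and step size are illustrative; a real implementation would use an adaptive higher-order integrator):

```python
# Sketch: ODE integration of db/dt = b Q (row-vector convention)
# by forward-Euler stepping. Q, dt, and t are illustrative choices.

def ode_filter(b, Q, t, dt=1e-3):
    """Propagate belief vector b for duration t in steps of dt."""
    n = len(b)
    for _ in range(int(round(t / dt))):
        # db/dt = b Q, so b <- b + dt * (b Q)
        db = [sum(b[i] * Q[i][j] for i in range(n)) for j in range(n)]
        b = [b[j] + dt * db[j] for j in range(n)]
    return b

# Rate matrix: off-diagonals are transition rates, rows sum to zero.
Q = [[-2.0,  1.5,  0.5],
     [ 1.0, -1.0,  0.0],
     [ 0.5,  0.5, -1.0]]
b = ode_filter([1.0, 0.0, 0.0], Q, t=1.0)
print(sum(b))  # total mass stays ~1, since rows of Q sum to zero
```

The cost per step is one full matrix-vector product, which is exactly what becomes infeasible when the state space is exponential in the number of variables.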

The second approach is uniformization (also called randomization). By writing Q = λ(R − I), i.e., R = I + Q/λ, with λ at least as large as the largest exit rate (the maximum absolute diagonal entry of Q), the exponential can be expressed as

exp(QΔt) = e^{−λΔt} ∑_{k=0}^{∞} (λΔt)^k/k! · R^k .

Here R is a stochastic matrix and the series corresponds to a Poisson‑distributed number of discrete‑time transitions. Uniformization replaces the continuous‑time problem with a sequence of discrete jumps, allowing the use of probabilistic sampling or truncation at a finite order K.
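The truncated series is straightforward to evaluate; the sketch below does so for a hypothetical 3-state chain, taking R = I + Q/λ with λ set to the largest exit rate (the matrix and truncation depth are illustrative):

```python
# Sketch: truncated uniformization, computing
# b exp(Q t) ~= sum_{k=0}^{K} e^{-lam t} (lam t)^k / k! * (b R^k).
import math

def uniformized_filter(b, Q, t, K=40):
    n = len(b)
    lam = max(abs(Q[i][i]) for i in range(n))   # largest exit rate
    # R = I + Q/lam is a row-stochastic matrix.
    R = [[(1.0 if i == j else 0.0) + Q[i][j] / lam for j in range(n)]
         for i in range(n)]
    out = [0.0] * n
    term = list(b)                  # term holds b R^k
    w = math.exp(-lam * t)          # Poisson weight for k = 0
    for k in range(K + 1):
        for j in range(n):
            out[j] += w * term[j]
        # advance: term <- term R, weight <- Poisson prob. of k+1 jumps
        term = [sum(term[i] * R[i][j] for i in range(n)) for j in range(n)]
        w *= lam * t / (k + 1)
    return out

Q = [[-2.0,  1.5,  0.5],
     [ 1.0, -1.0,  0.0],
     [ 0.5,  0.5, -1.0]]
b = uniformized_filter([1.0, 0.0, 0.0], Q, t=1.0)
```

Because every intermediate quantity is a nonnegative combination of stochastic-matrix products, the result stays a valid (sub-normalized, up to truncation) distribution, which is what makes the factored variant below well behaved.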

The core contribution of the paper is a “factored uniformization” algorithm that maintains only the factored belief state throughout the series of R‑applications. At each term k, the algorithm applies R to the current factored belief and then projects the result back onto the family of product‑of‑marginals distributions by minimizing the Kullback‑Leibler (KL) divergence. This projection is analytically tractable because the KL minimizer under a product constraint is obtained by matching the marginal distributions. The authors prove that the cumulative KL error introduced by repeated projections is bounded: it does not diverge with time and can be made arbitrarily small by choosing a sufficiently large uniformization rate λ and truncation depth K.

To validate the method, the authors conduct experiments on three large‑scale benchmarks: (1) a gene‑regulatory network with thousands of variables, (2) a wireless communication channel model where rapid state changes model traffic bursts, and (3) a robotic arm dynamics model with asynchronous sensor updates. For each benchmark they compare four methods: (a) ODE integration with full state vectors, (b) a previously published uniformization‑based factored filter, (c) a standard mean‑field (MF) filter, and (d) the proposed factored uniformization (FU). Performance metrics include average KL divergence from the exact posterior, memory consumption, and wall‑clock runtime.

Results show that FU consistently achieves lower KL error than both the earlier uniformization scheme and the MF baseline—typically a 15 %–25 % reduction—while using roughly 30 %–40 % less memory than the full ODE approach. Runtime is comparable to the other approximate methods and often faster than ODE integration because the algorithm avoids full matrix‑vector products. Moreover, the experiments confirm the theoretical error bound: increasing λ or the truncation order K systematically reduces the KL gap, at the cost of modestly higher computation.

The paper concludes with several avenues for future work. Adaptive selection of λ based on real‑time error estimates could further improve efficiency. Extending the projection step to allow dynamic restructuring of the factorization (e.g., merging or splitting clusters of variables) may capture higher‑order dependencies while preserving tractability. Finally, handling non‑linear rate functions or time‑varying Q matrices would broaden applicability to a wider class of stochastic hybrid systems.

In summary, the authors present a principled, theoretically‑grounded, and empirically validated algorithm for factored filtering in continuous‑time systems. By marrying uniformization with KL‑optimal projection onto product distributions, they obtain a method that balances accuracy, memory usage, and computational speed, making it suitable for real‑time inference in large‑scale asynchronous stochastic models.