Latent-Variable Learning of SPDEs via Wiener Chaos
We study the problem of learning the law of linear stochastic partial differential equations (SPDEs) with additive Gaussian forcing from spatiotemporal observations. Most existing deep learning approaches either assume access to the driving noise or initial condition, or rely on deterministic surrogate models that fail to capture intrinsic stochasticity. We propose a structured latent-variable formulation that requires only observations of solution realizations and learns the underlying randomly forced dynamics. Our approach combines a spectral Galerkin projection with a truncated Wiener chaos expansion, yielding a principled separation between deterministic evolution and stochastic forcing. This reduces the infinite-dimensional SPDE to a finite system of parametrized ordinary differential equations governing latent temporal dynamics. The latent dynamics and stochastic forcing are jointly inferred through variational learning, allowing recovery of stochastic structure without explicit observation or simulation of noise during training. Empirical evaluation on synthetic data demonstrates state-of-the-art performance under comparable modeling assumptions across bounded and unbounded one-dimensional spatial domains.
💡 Research Summary
The paper tackles the challenging inverse problem of learning the law of a linear stochastic partial differential equation (SPDE) with additive Gaussian forcing when only solution realizations are observed. Existing deep‑learning approaches for PDEs either assume direct access to the driving noise, require knowledge of the initial condition, or replace the stochastic dynamics with deterministic surrogates that cannot capture intrinsic randomness. The authors propose a principled latent‑variable framework that needs only spatio‑temporal observations of the solution field and can recover both the deterministic evolution operator and the stochastic forcing distribution.
The methodology proceeds in two systematic reduction steps. First, a spectral Galerkin projection onto the first $N$ eigenfunctions of the linear operator $A$ produces a finite‑dimensional stochastic system $\tilde X_t$ driven by a projected Wiener process. Second, a Wiener–Itô chaos expansion is applied to $\tilde X_t$. By expanding the solution in the orthonormal chaos basis $\{\xi_\alpha\}$ (constructed from time‑basis functions and independent Wiener processes), the random field is expressed as a sum of deterministic coefficient functions $X^{(\alpha)}(t)$ multiplied by fixed random variables $\xi_\alpha$. For linear SPDEs with additive Gaussian noise, the authors prove a first‑order closure: all chaos coefficients of order two or higher vanish identically. Consequently, only the zero‑order (mean) mode and the first‑order chaos modes remain, leading to a closed system of ordinary differential equations (ODEs) for each spatial eigenmode. The ODEs for the mean are simple exponential decays, while the ODEs for the first‑order chaos coefficients contain a deterministic drift term and a forcing term given by the product of a time‑basis function and the inner product of the noise covariance's square root with the corresponding eigenfunction.
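The closed per-mode system described above can be sketched numerically. The following is a minimal illustration under assumed names: `lam` stands for the eigenvalue of one spatial mode, `m` for one time-basis function, and `b_nl` for the inner product coupling the noise covariance's square root to the eigenfunction; none of these choices come from the paper's implementation.

```python
import numpy as np

# Sketch of the closed first-order chaos ODE system for one spatial eigenmode
# (all names and basis choices are illustrative assumptions):
#   mean mode:        d/dt Xbar(t)    = -lam * Xbar(t)
#   chaos mode (k,l): d/dt X_kl(t)    = -lam * X_kl(t) + m_k(t) * b_nl
# where m_k is the k-th time-basis function and b_nl couples the noise
# covariance's square root to the spatial eigenfunction.

def integrate_mode(lam, b_nl, m, t_grid, x0=1.0):
    """Forward-Euler integration of the mean and one (k, l) chaos coefficient."""
    dt = t_grid[1] - t_grid[0]
    xbar = np.empty_like(t_grid)
    xkl = np.empty_like(t_grid)
    xbar[0], xkl[0] = x0, 0.0          # chaos coefficients start at zero
    for i in range(len(t_grid) - 1):
        xbar[i + 1] = xbar[i] + dt * (-lam * xbar[i])
        xkl[i + 1] = xkl[i] + dt * (-lam * xkl[i] + m(t_grid[i]) * b_nl)
    return xbar, xkl

t = np.linspace(0.0, 1.0, 200)
m1 = lambda s: np.sqrt(2.0) * np.cos(np.pi * s)   # an example time-basis function
xbar, xkl = integrate_mode(lam=4.0, b_nl=0.5, m=m1, t_grid=t)
# xbar decays like exp(-lam * t); xkl is a driven linear response
```

Note that the mean mode is a plain exponential decay, while the chaos coefficient is the same linear dynamics driven by a deterministic forcing, exactly the structure the closure result guarantees.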
Truncating the spatial expansion to $N$ eigenmodes and the chaos indices to $K$ time‑basis functions and $L$ noise components yields a latent dimension $d = N(1+KL)$. The truncated solution reconstruction (Equation 12) makes explicit how the observed stochastic fields are generated from a finite set of deterministic propagator trajectories and a handful of Gaussian latent variables.
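A small sketch of this truncated reconstruction, using an assumed sine eigenbasis and random placeholder coefficients (the actual coefficients would come from the integrated ODEs above; nothing here reproduces the paper's Equation 12 exactly):

```python
import numpy as np

# Illustrative truncated chaos reconstruction on a 1D mesh.
N, K, L = 8, 3, 2                 # spatial modes, time-basis / noise truncations
d = N * (1 + K * L)               # latent dimension: one mean + K*L chaos modes
                                  # per spatial mode, so d = 8 * (1 + 6) = 56

x = np.linspace(0.0, 1.0, 64)
phi = np.stack([np.sqrt(2.0) * np.sin((n + 1) * np.pi * x) for n in range(N)])

rng = np.random.default_rng(0)
xbar = rng.normal(size=N)               # zero-order (mean) coefficients at time t
xkl = rng.normal(size=(N, K, L)) * 0.1  # first-order chaos coefficients at time t
xi = rng.normal(size=(K, L))            # standard Gaussian chaos variables

# u(x, t) = sum_n [ Xbar_n(t) + sum_{k,l} X_n^{(k,l)}(t) * xi_{k,l} ] * phi_n(x)
coeffs = xbar + np.einsum("nkl,kl->n", xkl, xi)
u = coeffs @ phi                        # one sampled solution field on the mesh
```

Resampling `xi` while holding the deterministic coefficients fixed generates new solution realizations, which is precisely the separation between deterministic evolution and stochastic forcing the paper exploits.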
Learning is cast as a joint variational inference problem over two latent groups: (i) the initial deterministic state $z_0$ that parameterizes the zero‑order propagator coefficients, and (ii) the vector of first‑order chaos coordinates $\xi$. An encoder network $\mathrm{Enc}_\phi$ maps each observed trajectory to approximate posterior distributions $q_\phi(z_0|\mathbf X)$ and $q_\phi(\xi|\mathbf X)$, both taken as diagonal Gaussians. Given a sampled $z_0$, the latent ODE dynamics are integrated using a parameterized vector field $f_\theta$ (with parameters $\theta = (\lambda, q)$ encoding the eigenvalues and noise covariance). The latent trajectory $\{z_t\}$ together with the sampled $\xi$ are fed into a reconstruction operator $R$ that evaluates the truncated Wiener‑chaos expansion on the spatial mesh. A decoder $\mathrm{Dec}_\gamma$ defines a Gaussian likelihood for the observed fields, completing the probabilistic model. The evidence lower bound (ELBO) comprises a reconstruction term (log‑likelihood of the observed spatio‑temporal data) and a KL divergence regularizer between the variational posteriors and standard normal priors. Maximizing the ELBO jointly over $\phi$, $\theta$, and $\gamma$ yields estimates of the deterministic drift, the stochastic forcing structure, and the latent variables.
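The ELBO for this model has a standard closed form once both posteriors are diagonal Gaussians and the decoder likelihood is Gaussian. A schematic sketch, with all function names as placeholders rather than the authors' implementation:

```python
import numpy as np

# Schematic ELBO for the described variational model: diagonal Gaussian
# posteriors q(z0|X) and q(xi|X) against standard normal priors, plus a
# Gaussian decoder likelihood with fixed observation noise sigma.

def kl_diag_gauss(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) for a diagonal Gaussian."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)

def gauss_loglik(x_obs, x_rec, sigma=0.1):
    """Log-density of the observed field under the Gaussian decoder."""
    n = x_obs.size
    return (-0.5 * np.sum((x_obs - x_rec) ** 2) / sigma**2
            - 0.5 * n * np.log(2.0 * np.pi * sigma**2))

def elbo(x_obs, x_rec, mu_z0, logvar_z0, mu_xi, logvar_xi):
    """Reconstruction term minus KL regularizers for both latent groups."""
    return (gauss_loglik(x_obs, x_rec)
            - kl_diag_gauss(mu_z0, logvar_z0)
            - kl_diag_gauss(mu_xi, logvar_xi))
```

In training, `x_rec` would be produced by integrating the latent ODE from a sampled $z_0$ and applying the reconstruction operator with a sampled $\xi$; the gradient then flows through both KL terms and the reconstruction via the reparameterization trick.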
The authors conduct extensive synthetic experiments on one‑dimensional bounded and unbounded domains. They compare against state‑of‑the‑art neural operators (Fourier Neural Operators, DeepONets) and recent Neural SPDE frameworks that require noise conditioning. Their method consistently achieves lower reconstruction error, more accurate recovery of the covariance kernel, and higher log‑likelihood scores, despite never observing the driving Wiener process during training. Moreover, the learned first‑order chaos coefficients faithfully reproduce the true stochastic forcing statistics, demonstrating that the approach truly learns the law of the SPDE rather than merely fitting individual sample paths.
Key contributions include: (1) a rigorous reduction of an infinite‑dimensional SPDE to a finite set of ODEs via Galerkin projection and Wiener‑chaos truncation; (2) the identification of a first‑order closure that eliminates higher‑order chaos terms for linear additive Gaussian SPDEs; (3) a variational latent‑ODE architecture that jointly infers deterministic propagators and stochastic latent variables without any direct noise observations; (4) empirical evidence of state‑of‑the‑art performance on synthetic benchmarks.
While the current formulation is limited to linear, additive Gaussian SPDEs, the authors discuss possible extensions. Higher‑order chaos terms could be retained to handle non‑linear or multiplicative noise, albeit at increased computational cost. More expressive variational families (e.g., normalizing flows) could capture posterior dependencies between $z_0$ and $\xi$. Finally, the framework's modularity suggests it could be combined with existing neural operator backbones to handle more complex spatial operators or higher‑dimensional domains. In summary, the paper presents a mathematically grounded, practically effective method for learning stochastic PDE laws from data, filling a notable gap between deterministic operator learning and full probabilistic modeling of spatio‑temporal randomness.