Convolutional Operator Network for Forward and Inverse Problems (FI-Conv): Application to Plasma Turbulence Simulations
We propose the Convolutional Operator Network for Forward and Inverse Problems (FI-Conv), a framework capable of predicting system evolution and estimating parameters in complex spatio-temporal dynamics, such as turbulence. FI-Conv is built on a U-Net architecture, in which most convolutional layers are replaced by ConvNeXt V2 blocks. This design preserves U-Net performance on inputs with high-frequency variations while maintaining low computational complexity. FI-Conv takes an initial state, the PDE parameters, and an evolution time as input to predict the system's future state. As a representative example of a system exhibiting complex dynamics, we evaluate the performance of FI-Conv on the task of predicting turbulent plasma fields governed by the Hasegawa-Wakatani (HW) equations. The HW system models two-dimensional electrostatic drift-wave turbulence and exhibits strongly nonlinear behavior, making accurate approximation and long-term prediction particularly challenging. Using an autoregressive forecasting procedure, FI-Conv achieves accurate forward prediction of the plasma state evolution over short times (t ~ 3) and captures the statistical properties of derived physical quantities of interest over longer times (t ~ 100). Moreover, we develop a gradient-descent-based inverse estimation method that accurately infers PDE parameters from plasma state evolution data, without modifying the trained model weights. Collectively, our results demonstrate that FI-Conv can be an effective alternative to existing physics-informed machine learning methods for systems with complex spatio-temporal dynamics.
💡 Research Summary
The paper introduces FI‑Conv (Forward‑and‑Inverse Convolutional Operator Network), a neural operator designed to simultaneously handle forward prediction and inverse parameter estimation for complex spatio‑temporal systems. FI‑Conv builds upon the classic U‑Net architecture but replaces the standard convolutional blocks in the encoder with ConvNeXt V2 modules. ConvNeXt V2 offers a lightweight design with fewer trainable parameters and lower computational cost while preserving high‑frequency details through large (7×7) kernels and modern normalization/activation schemes.
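The ConvNeXt V2 block described above can be sketched as follows in PyTorch: a 7×7 depthwise convolution followed by LayerNorm, a pointwise expansion MLP, and Global Response Normalization (GRN), the normalization scheme ConvNeXt V2 adds over V1. This is an illustrative reimplementation, not the paper's code; the circular padding mode is an assumption matching the periodic HW domain.

```python
import torch
import torch.nn as nn


class GRN(nn.Module):
    """Global Response Normalization, the key addition in ConvNeXt V2."""
    def __init__(self, dim: int):
        super().__init__()
        self.gamma = nn.Parameter(torch.zeros(1, 1, 1, dim))
        self.beta = nn.Parameter(torch.zeros(1, 1, 1, dim))

    def forward(self, x):  # x: (N, H, W, C), channels-last
        gx = torch.norm(x, p=2, dim=(1, 2), keepdim=True)   # per-channel global norm
        nx = gx / (gx.mean(dim=-1, keepdim=True) + 1e-6)    # divisive normalization
        return self.gamma * (x * nx) + self.beta + x


class ConvNeXtV2Block(nn.Module):
    """7x7 depthwise conv -> LayerNorm -> pointwise MLP with GRN -> residual."""
    def __init__(self, dim: int):
        super().__init__()
        # circular padding is an assumption, matching periodic boundary conditions
        self.dwconv = nn.Conv2d(dim, dim, kernel_size=7, padding=3,
                                groups=dim, padding_mode="circular")
        self.norm = nn.LayerNorm(dim)
        self.pwconv1 = nn.Linear(dim, 4 * dim)   # pointwise expansion
        self.act = nn.GELU()
        self.grn = GRN(4 * dim)
        self.pwconv2 = nn.Linear(4 * dim, dim)   # pointwise projection

    def forward(self, x):  # x: (N, C, H, W)
        residual = x
        x = self.dwconv(x).permute(0, 2, 3, 1)   # to channels-last for LayerNorm
        x = self.pwconv2(self.grn(self.act(self.pwconv1(self.norm(x)))))
        return residual + x.permute(0, 3, 1, 2)  # back to channels-first
```

The depthwise 7×7 kernel gives a large receptive field at low cost, which is what lets the block preserve high-frequency detail cheaply.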
Key to FI‑Conv is the explicit embedding of three types of inputs: (i) the initial field (e.g., electrostatic potential and density), (ii) a set of PDE parameters, and (iii) the desired evolution time t. These are injected at the network’s bottleneck (“bottleneck‑injection”) together with hard constraints that enforce the initial condition exactly. By treating the physical parameters as inputs, the model becomes differentiable with respect to them, enabling gradient‑based inverse inference without any weight updates.
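The bottleneck-injection idea can be sketched as follows: the PDE parameters and the evolution time are broadcast onto the latent grid as extra channels and fused with the bottleneck features by a 1×1 convolution. This is a minimal sketch of the mechanism, not the paper's exact fusion layer; the class and layer names here are hypothetical.

```python
import torch
import torch.nn as nn


class BottleneckInjection(nn.Module):
    """Fuse PDE parameters and evolution time into the U-Net bottleneck
    (illustrative sketch; the paper's exact fusion layer may differ)."""
    def __init__(self, latent_ch: int, n_params: int):
        super().__init__()
        # +1 input channel for the evolution time t
        self.fuse = nn.Conv2d(latent_ch + n_params + 1, latent_ch, kernel_size=1)

    def forward(self, z, params, t):
        # z: (N, C, h, w) latent features; params: (N, n_params); t: (N, 1)
        n, _, h, w = z.shape
        extra = torch.cat([params, t], dim=1)              # (N, n_params + 1)
        extra = extra[:, :, None, None].expand(n, -1, h, w)  # broadcast to grid
        return self.fuse(torch.cat([z, extra], dim=1))
```

Because the parameters enter as ordinary tensor inputs, the network output is differentiable with respect to them, which is exactly what the gradient-based inverse estimation relies on.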
The forward task is performed in an autoregressive manner: the network predicts the state at a single time step t, then feeds that prediction back as the new initial condition for the next step. With an input resolution of 128 × 128, the U‑Net down‑samples four times to a latent 8 × 8 representation, where the 7 × 7 ConvNeXt kernels effectively capture global spatial information. Padding is chosen to match the underlying PDE’s boundary conditions (zero, circular, or reflective), and a hard boundary‑condition enforcement layer guarantees physical consistency.
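The autoregressive rollout above amounts to a simple loop in which each prediction becomes the next initial condition. A hypothetical driver (any FI-Conv-like model taking `(state, params, t)` works):

```python
import torch


def rollout(model, state0, params, dt, n_steps):
    """Autoregressive forecasting: feed each prediction back as the
    initial condition for the next step (illustrative driver)."""
    states = [state0]
    t = torch.full((state0.shape[0], 1), dt)   # fixed per-step evolution time
    with torch.no_grad():
        for _ in range(n_steps):
            states.append(model(states[-1], params, t))
    return torch.stack(states, dim=1)          # (N, n_steps + 1, C, H, W)
```

Pointwise errors compound step by step in such a loop, which is why long-horizon accuracy is assessed on statistics of derived quantities rather than on the fields themselves.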
The authors evaluate FI‑Conv on the Hasegawa‑Wakatani (HW) model, a two‑dimensional set of PDEs that describe electrostatic drift‑wave turbulence in resistive plasmas. The HW system involves three fields (potential φ, density n, vorticity Ω) and four tunable parameters: the adiabaticity coefficient c₁, a characteristic wavenumber k₀, a density‑gradient scale κ, and a synthetic nonlinearity weight c_pb. By varying all four parameters, the authors generate a database of 320 simulations using the HW2D solver, with Gaussian random initial conditions and fourth‑order spatial discretization.
For forward prediction, FI‑Conv achieves low mean‑squared error (MSE) up to t ≈ 3, outperforming Fourier Neural Operators, auto‑encoder‑based surrogates, and recurrent networks by roughly 10‑20 % in error. When rolled out to longer horizons (t ≈ 100), the pointwise field error grows, but statistical properties of derived quantities—radial particle flux Γ_n and resistive dissipation Γ_c—remain faithfully reproduced, matching the reference simulations’ distributions and spectra. This demonstrates that FI‑Conv captures the essential turbulence statistics even when exact field reconstruction becomes challenging.
The inverse problem is tackled by fixing the network weights and optimizing only the PDE parameters to minimize the MSE between the network’s prediction and observed final states. Automatic differentiation provides gradients with respect to the parameters, and a simple Adam optimizer converges rapidly. Across all four parameters, the average absolute error falls below 5 % (often under 2 % for c₁ and κ), showing that FI‑Conv can infer physical coefficients directly from state trajectories without retraining. Compared to Bayesian neural operator approaches, FI‑Conv’s gradient‑based inversion is faster (≈3× speed‑up) and requires no additional sampling.
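The inversion procedure described above can be sketched as follows: the network weights are frozen, the PDE parameters become the only optimizable tensor, and Adam minimizes the MSE between the network's prediction and the observed final state. The function below is a hedged sketch under those assumptions, not the authors' implementation.

```python
import torch


def infer_parameters(model, state0, state_T, t, init_params,
                     steps=200, lr=1e-2):
    """Gradient-based inverse estimation: optimize only the PDE
    parameters against an observed final state (illustrative sketch)."""
    params = init_params.clone().requires_grad_(True)
    for p in model.parameters():
        p.requires_grad_(False)        # network weights stay fixed
    opt = torch.optim.Adam([params], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        pred = model(state0, params, t)
        loss = torch.mean((pred - state_T) ** 2)
        loss.backward()                # autodiff w.r.t. params only
        opt.step()
    return params.detach()
```

No weight updates or posterior sampling occur, which is why this inversion is cheap relative to Bayesian alternatives.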
Strengths of FI‑Conv include: (1) a unified architecture for forward and inverse tasks, eliminating the need for separate surrogate models; (2) direct parameter injection enabling multi‑parameter generalization; (3) computational efficiency due to ConvNeXt V2’s lightweight design, reducing GPU memory and FLOPs. Limitations are noted: error accumulation in very long autoregressive rollouts, restriction to 2‑D periodic domains in the current study, and the need for further validation on 3‑D plasma or magnetohydrodynamic systems.
In conclusion, FI‑Conv offers a powerful, efficient neural operator that preserves high‑frequency dynamics while supporting differentiable parameter inference. Its performance on the HW turbulence benchmark suggests broad applicability to other complex PDE‑governed phenomena, and future work will explore extensions to additional physical fields, non‑periodic geometries, and hybrid physics‑informed loss formulations.