Data-integrated neural networks for solving partial differential equations
In this work, we propose data-integrated neural networks (DataInNet) for solving partial differential equations (PDEs), offering a novel approach to leveraging data (e.g., source terms, initial conditions, and boundary conditions). The core of this work lies in the integration of data into a unified network framework. DataInNet comprises two subnetworks: a data-integration neural network responsible for accommodating and fusing various types of data, and a fully connected neural network dedicated to learning the residual physical information not captured by the data-integration network. This architecture inherently excludes function classes that violate known physical constraints, thereby substantially narrowing the solution search space. Numerical experiments demonstrate that the proposed DataInNet delivers superior performance on challenging problems, such as the Helmholtz equation (relative $L^2$ error: $O(10^{-6})$) and PDEs with high-frequency solutions (relative $L^2$ error: $O(10^{-5})$).
💡 Research Summary
In this paper the authors introduce Data‑Integrated Neural Networks (DataInNet), a novel framework for solving partial differential equations (PDEs) that moves the incorporation of physical data from the loss‑function level (as in traditional physics‑informed neural networks, PINNs) to the architecture level of the neural network itself. DataInNet consists of two cooperating subnetworks. The first, called the data‑integration network, is explicitly designed to accept and fuse heterogeneous physical information such as source terms, initial conditions, and boundary conditions. It does so through a multi‑branch module: each branch $Q_{n,i}$ processes a specific data type $\hbar_i(\mathbf{x})$ (e.g., the source field $f(\mathbf{x})$, the initial field $I(\mathbf{x})$, or a boundary field $B_j(\mathbf{x})$). Within each module, a feature stream $G_n$ extracts generic representations from the previous hidden state, while the data‑specific streams are element‑wise multiplied with $G_n$ to create cross‑features $A_{n,i}$. Learnable scalar weights $\alpha_i$ then gate and combine these cross‑features before passing them to the next layer. The second subnetwork is a conventional fully‑connected feed‑forward network that learns the residual physical information not captured by the data‑integration network. The final solution is obtained by a simple additive combination $u^* = u_{\text{aux}} + u_{\text{data}}$.
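As a concrete illustration, the forward pass described above can be sketched in plain numpy. This is a minimal sketch under stated assumptions, not the authors' implementation: the layer widths, the number of modules, the per-data-type embeddings, and the use of the sine activation in every stream are all illustrative choices, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(n_in, n_out):
    """Random weights for one illustrative dense layer (untrained)."""
    return rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_in, n_out)), np.zeros(n_out)

def apply(layer, x):
    W, b = layer
    return x @ W + b

WIDTH, N_DATA, N_MODULES = 32, 2, 3                 # hypothetical sizes

embed_x = dense(2, WIDTH)                           # embeds coordinates (x, t)
embed_d = [dense(1, WIDTH) for _ in range(N_DATA)]  # one embedding per data type
modules = [{"G": dense(WIDTH, WIDTH),               # generic feature stream G_n
            "Q": [dense(WIDTH, WIDTH) for _ in range(N_DATA)],  # branches Q_{n,i}
            "alpha": np.ones(N_DATA)}               # learnable gates alpha_i
           for _ in range(N_MODULES)]
head = dense(WIDTH, 1)                              # output of data-integration net
aux = [dense(2, WIDTH), dense(WIDTH, WIDTH), dense(WIDTH, 1)]  # residual FCN

def forward(x, data_vals):
    """x: (batch, 2) coordinates; data_vals: list of (batch, 1) samples of
    each data field (e.g. source f and initial field I) at those points."""
    feats = [apply(e, d) for e, d in zip(embed_d, data_vals)]
    h = np.sin(apply(embed_x, x))
    for m in modules:
        g = np.sin(apply(m["G"], h))                # generic representation G_n
        # cross-features A_{n,i}: data-specific streams multiplied element-wise
        # with G_n, then gated by alpha_i and combined
        h = sum(a * np.sin(apply(q, f)) * g
                for a, q, f in zip(m["alpha"], m["Q"], feats))
    u_data = apply(head, h)
    u_aux = np.tanh(apply(aux[0], x))
    u_aux = np.tanh(apply(aux[1], u_aux))
    u_aux = apply(aux[2], u_aux)
    return u_aux + u_data                           # u* = u_aux + u_data

x = rng.uniform(0.0, 1.0, (8, 2))
u = forward(x, [rng.normal(size=(8, 1)), rng.normal(size=(8, 1))])
```

The sketch is only meant to make the data flow explicit: the data fields enter as network inputs rather than as loss terms, and the auxiliary network's output is added on top.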
The key insight of this architecture is that by embedding known physical constraints directly into the network, the hypothesis space is automatically restricted to functions that satisfy those constraints. Consequently, the optimizer searches a dramatically reduced solution space, which leads to faster convergence, higher accuracy, and improved robustness against the spectral bias that typically hampers DNNs when learning high‑frequency components. Moreover, because the data are processed inside the network, there is no need to balance multiple loss terms with manually tuned weights, simplifying hyper‑parameter selection.
The authors validate DataInNet on several benchmark problems. For a 1‑D heat equation they compare two strategies for feeding the initial condition: a “local‑input” strategy that supplies the initial data only at $t=0$ and a “global‑input” strategy that extends the initial field over the whole space‑time domain. The global strategy yields dramatically lower errors (max absolute error $1.22\times10^{-3}$, relative $L^2$ error $2.98\times10^{-4}$), whereas the local strategy fails to propagate the initial condition into the interior. In a 2‑D Poisson problem with a sharply peaked Gaussian source, traditional PINNs miss the peak entirely, while DataInNet achieves a maximum absolute error of $1.77\times10^{-3}$ and a relative $L^2$ error of $4.03\times10^{-3}$. Similar improvements are observed on an L‑shaped domain and on high‑frequency Helmholtz equations, where DataInNet reaches relative $L^2$ errors on the order of $10^{-6}$.
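The difference between the two input strategies can be made concrete with a small sketch. The initial field $I(x) = \sin(\pi x)$ and the zero-masking form of the local strategy are assumptions for illustration; the point is only how the initial data reaches interior collocation points.

```python
import numpy as np

I = lambda x: np.sin(np.pi * x)   # hypothetical initial field

# a small grid of collocation points (x, t)
x = np.array([0.25, 0.5, 0.75])
t = np.array([0.0, 0.5, 1.0])
X, T = np.meshgrid(x, t)          # row i has time t[i], column j has x[j]

# Local-input strategy: the initial data is supplied only where t = 0,
# so interior points (t > 0) receive no initial-condition signal.
local_feed = np.where(T == 0.0, I(X), 0.0)

# Global-input strategy: the initial field is extended over the whole
# space-time domain, i.e. every point (x, t) receives I(x).
global_feed = I(X)
```

Under the global strategy every collocation point carries the initial field as an input feature, which is consistent with the observation that the local strategy fails to propagate the initial condition into the interior.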
A series of ablation studies explores the impact of the number of modules, the number of training points, and the choice of activation function. Increasing the number of modules consistently reduces error, and the sinusoidal activation function outperforms cosine, tanh, and sigmoid in all tested configurations. Data normalization is performed via $\bar{F} = F/(K^* |F|_{\max})$ with a manually set hyper‑parameter $K^* = 2$; the authors note that improper scaling can cause gradient vanishing when source terms have large magnitudes.
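The normalization step can be sketched directly; the sharply peaked sample source field below is an assumption for illustration.

```python
import numpy as np

K_star = 2.0                                   # manually set hyper-parameter K*
x = np.linspace(0.0, 1.0, 101)
F = 500.0 * np.exp(-100.0 * (x - 0.5) ** 2)    # hypothetical large-magnitude source

# normalized field: F_bar = F / (K* * max|F|), so |F_bar| <= 1/K* = 0.5
F_bar = F / (K_star * np.abs(F).max())
```

Dividing by $K^* |F|_{\max}$ bounds the field fed to the network by $1/K^*$, which is what keeps large-magnitude source terms from saturating activations and vanishing gradients.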
Despite its promising results, DataInNet has limitations. The need for manual tuning of the normalization factor $K^*$ and the additional parameters introduced by the multi‑branch architecture increase memory consumption and may hinder scalability to high‑dimensional or three‑dimensional problems. The current experiments involve only a few thousand collocation points, leaving open the question of performance on large‑scale scientific simulations. Moreover, extending the framework to complex, non‑rectangular meshes or multiphysics scenarios would require careful design of the data‑specific branches.
In summary, DataInNet offers a compelling new direction for physics‑aware deep learning: by integrating physical data directly into the network topology, it narrows the feasible function space, mitigates spectral bias, and delivers high‑precision solutions for challenging PDEs. Future work on automated data scaling, adaptive branch construction, and large‑scale 3‑D implementations could transform DataInNet into a practical tool for a wide range of scientific and engineering applications.