On finite-dimensional encoding/decoding theorems for neural operators
Recently, versions of neural networks with infinite-dimensional affine operators inside the computational units (``neural operator'' networks) have been applied to learn solutions to differential equations. To enable practical computations, one employs finite-dimensional encoding/decoding theorems of the following kind: every continuous mapping $f$ between function spaces $E$ and $F$ is approximated, in the topology of uniform convergence on compacta, by continuous mappings factoring through two finite-dimensional Banach spaces. Such a result is known (Kovachki et al., 2023) for $E,F$ being Banach spaces having the approximation property. We point out that the result needs no assumptions on $E,F$ whatsoever and remains true not only for all normed spaces but for arbitrary locally convex spaces as well. At the same time, an analogous result for $C^k$-smooth mappings and the $C^k$ compact-open topology, $k\geq 1$, holds if and only if the space $E$ has the approximation property. This generality is relevant in practice because non-normable locally convex function spaces are common in the theory of differential equations, the main field of application for the emerging theory.
💡 Research Summary
The paper investigates a foundational question in the emerging field of neural operators: under what conditions can a mapping between infinite‑dimensional function spaces be approximated by a composition of finite‑dimensional linear maps and a finite‑dimensional nonlinear map, i.e. by a “latent structure” of the form S ∘ g ∘ T? This question is crucial because practical implementations of neural operators must ultimately reduce to finite‑dimensional computations.
Background. Traditional neural networks operate on finite‑dimensional vectors; each layer is an affine map followed by a non‑linear activation. Neural operators generalize this idea by allowing the affine part to be a continuous linear operator between infinite‑dimensional spaces (e.g., spaces of coefficients or solutions of PDEs). In practice, one needs an encoding map T from the input space E to a finite‑dimensional space E₁≈ℝ^m, a decoding map S from a finite‑dimensional space F₁≈ℝ^n to the output space F, and a finite‑dimensional nonlinear map g:E₁→F₁. The goal is to make S∘g∘T uniformly close to the target operator f:E→F on any compact subset of E.
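The factorization S ∘ g ∘ T can be made concrete with a small numerical sketch (illustrative only; the choices of encoder, decoder, and target operator below are ours, not the paper's): encode a function u ∈ C[0,1] by point evaluations on a coarse grid (a continuous linear map T), apply a finite‑dimensional nonlinear map g, and decode by piecewise‑linear interpolation (a continuous linear map S). As the target operator f we take the pointwise square, (f u)(x) = u(x)².

```python
import numpy as np

# Illustrative sketch of a finite-dimensional factorization S ∘ g ∘ T
# of the nonlinear operator f(u) = u^2 on C[0,1]. All names are ours.

m = 16                                   # latent dimension
coarse = np.linspace(0.0, 1.0, m)        # encoding grid
fine = np.linspace(0.0, 1.0, 400)        # grid used to measure the sup-error

def T(u):
    """Encoder: point evaluations, a continuous linear map C[0,1] -> R^m."""
    return u(coarse)

def g(v):
    """Finite-dimensional nonlinear map R^m -> R^m (componentwise square)."""
    return v ** 2

def S(w):
    """Decoder: piecewise-linear interpolation, a continuous linear map R^m -> C[0,1]."""
    return lambda x: np.interp(x, coarse, w)

def f(u):
    """Target operator: (f u)(x) = u(x)^2."""
    return lambda x: u(x) ** 2

# Uniform error over a small (compact) family of smooth inputs.
inputs = [np.sin, np.cos, lambda x: x * (1.0 - x)]
err = max(np.max(np.abs(f(u)(fine) - S(g(T(u)))(fine))) for u in inputs)
print(f"sup-error on sample inputs: {err:.5f}")
```

Refining the encoding grid (increasing m) drives the uniform error on this compact family of inputs to zero, which is the practical content of the encoding/decoding theorem.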
Earlier work (Kovachki et al., 2023) proved such an encoding/decoding theorem when E and F are Banach spaces possessing the approximation property (AP). The AP means that the identity operator can be uniformly approximated on compact sets by finite‑rank operators.
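In symbols, the approximation property of a Banach space X (restating the definition above) reads:

```latex
% X has the approximation property (AP) iff:
\forall K \subset X \text{ compact},\ \forall \varepsilon > 0,\
\exists\, S \colon X \to X \text{ linear, finite rank:}\quad
\sup_{x \in K} \| S x - x \| < \varepsilon .
```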
Main Contributions.
- Theorem 1.1 (Continuous case). The authors show that no structural assumption on E or F is needed: for any locally convex spaces E and F (in particular, for any normed spaces), any continuous map f:E→F, any compact set K ⊆ E, any continuous seminorm p on F, and any ε > 0, there exist finite‑dimensional spaces E₁, F₁, continuous linear maps T:E→E₁, S:F₁→F, and a continuous map g:E₁→F₁ such that

  sup_{x∈K} p( f(x) − (S∘g∘T)(x) ) < ε.

- Theorem (Smooth case). By contrast, the analogous factorization result for C^k‑smooth mappings and the C^k compact‑open topology, k ≥ 1, holds if and only if E has the approximation property.