📝 Original Info
- Title: Neural Representations that Learn Spatially Varying Spectra
- ArXiv ID: 2511.18384
- Date: 2025-11-23
- Authors: Plein Versace
📝 Abstract
Implicit Neural Representations (INRs) have emerged as a powerful paradigm for representing signals such as images, audio, and 3D scenes. However, existing INR frameworks, including MLPs with Fourier features, SIREN, and multiresolution hash grids, implicitly assume a global and stationary spectral basis. This assumption is fundamentally misaligned with real-world signals whose frequency characteristics vary significantly across space, exhibiting local high-frequency textures, smooth regions, and frequency drift phenomena. We propose Neural Spectral Transport Representation (NSTR), the first INR framework that explicitly models a spatially varying local frequency field. NSTR introduces a learnable frequency transport equation, a PDE that governs how local spectral compositions evolve across space. Given a learnable local spectrum field S(x) and a frequency transport network Fθ enforcing ∇S(x) ≈ Fθ(x, S(x)), NSTR reconstructs signals by spatially modulating a compact set of global sinusoidal bases. This formulation enables strong local adaptivity and offers a new level of interpretability via visualizing frequency flows. Experiments on 2D image regression, audio reconstruction, and implicit 3D geometry show that NSTR achieves significantly better accuracy–parameter trade-offs than SIREN, Fourier-feature MLPs, and Instant-NGP. NSTR requires fewer global frequencies, converges faster, and naturally explains signal structure through spectral transport fields. We believe NSTR opens a new direction in INR research by introducing explicit modeling of the space-varying spectrum.
📄 Full Content
NSTR: NEURAL SPECTRAL TRANSPORT REPRESENTATION
FOR SPACE-VARYING FREQUENCY FIELDS
Plein Versace
Essential.ai, Italy
plein@essential.ai.com
ABSTRACT
Implicit Neural Representations (INRs) have emerged as a powerful paradigm for representing signals
such as images, audio, and 3D scenes. However, existing INR frameworks—including MLPs with
Fourier features, SIREN, and multiresolution hash grids—implicitly assume a global and stationary
spectral basis. This assumption is fundamentally misaligned with real-world signals whose frequency
characteristics vary significantly across space, exhibiting local high-frequency textures, smooth
regions, and frequency drift phenomena. We propose Neural Spectral Transport Representation
(NSTR), the first INR framework that explicitly models a spatially varying local frequency field.
NSTR introduces a learnable frequency transport equation, a PDE that governs how local spectral
compositions evolve across space. Given a learnable local spectrum field S(x) and a frequency
transport network Fθ enforcing ∇S(x) ≈ Fθ(x, S(x)), NSTR reconstructs signals by spatially
modulating a compact set of global sinusoidal bases. This formulation enables strong local adaptivity
and offers a new level of interpretability via visualizing frequency flows. Experiments on 2D image
regression, audio reconstruction, and implicit 3D geometry show that NSTR achieves significantly
better accuracy–parameter trade-offs than SIREN, Fourier-feature MLPs, and Instant-NGP. NSTR
requires fewer global frequencies, converges faster, and naturally explains signal structure through
spectral transport fields. We believe NSTR opens a new direction in INR research by introducing
explicit modeling of the space-varying spectrum.
1 Introduction
Implicit Neural Representations (INRs) encode signals as continuous functions parameterized by neural networks [1–18], offering memory-efficient and differentiable alternatives to discrete grids. They have become foundational in
neural rendering, geometry processing, audio synthesis, scientific simulations, and compression. Most existing INR
formulations assume that a neural network — typically an MLP augmented with sinusoidal activations, Fourier features,
or multiresolution hash encodings — directly maps a coordinate x to a signal value f(x). This “coordinate-to-value”
paradigm has driven remarkable progress, yet implicitly relies on a strong but rarely challenged assumption: the spectral
basis used to represent the signal is global, stationary, and fixed throughout space.
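To make the fixed-basis assumption concrete: a Fourier-feature embedding samples a frequency matrix once and then applies it identically at every coordinate, so every location is described by the same global set of frequencies. A minimal numpy sketch (matrix sizes and frequency scale are illustrative choices, not taken from the paper):

```python
import numpy as np

def fourier_features(x, B):
    """Encode coordinates x (N, d) with a fixed global sinusoidal basis.

    B is a (d, m) frequency matrix sampled once and frozen: every query
    point is projected onto the same m frequencies, which is exactly the
    global, stationary spectral basis discussed above.
    """
    proj = 2.0 * np.pi * x @ B  # (N, m)
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1)  # (N, 2m)

rng = np.random.default_rng(0)
B = rng.normal(scale=10.0, size=(2, 16))  # global frequencies, fixed for all x
x = rng.uniform(size=(4, 2))              # query coordinates in [0, 1]^2
z = fourier_features(x, B)
print(z.shape)  # (4, 32)
```

An MLP consumes z in place of the raw coordinates; nothing in the encoding lets the effective frequency content change from one region of space to another.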
In practice, however, natural signals exhibit rich and spatially varying spectral structures. Consider typical real-world
data:
• Textures and images contain localized edges, periodic micro-textures, smoothly varying shading, and sharp
discontinuities — each region having drastically different frequency content.
• 3D shapes and SDFs include nearly flat surfaces (low-frequency), corners and creases (high-frequency), and
topology-dependent frequency modulation.
• Neural radiance fields (NeRFs) demonstrate viewpoint-dependent frequency variations due to specular
highlights, varying density gradients, and complex light–material interactions.
• Audio or 1D signals exhibit local pitch drift, vibrato, transients, and harmonics that are not globally stationary.
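The last point can be made concrete with a toy 1D example: a linear chirp has an instantaneous frequency that drifts over time, so no single global frequency set is a good basis everywhere. A short numpy sketch (start frequency and drift rate chosen arbitrarily):

```python
import numpy as np

# Linear chirp: instantaneous frequency f(t) = f0 + c*t drifts with time.
t = np.linspace(0.0, 1.0, 1000)
f0, c = 5.0, 40.0                       # hypothetical start frequency and drift rate
phase = 2 * np.pi * (f0 * t + 0.5 * c * t**2)
signal = np.sin(phase)

# Instantaneous frequency recovered from the phase derivative:
inst_freq = np.gradient(phase, t) / (2 * np.pi)
print(inst_freq[0], inst_freq[-1])      # ~5 Hz at t=0, ~45 Hz at t=1
```

A stationary basis must cover the whole 5–45 Hz band globally to fit this signal, whereas at any single instant only a narrow band is actually active.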
These observations expose a fundamental limitation of existing INRs: even when equipped with sophisticated architectures, the model ultimately relies on a global coordinate system whose induced representation basis cannot adapt to the
local spectral structure of the signal. For example, sinusoidal networks (SIREN) impose a frequency ω that is uniform
across space; Fourier feature embeddings encode a fixed set of frequencies regardless of local complexity; and hash-grid
encodings focus on localized content but do not explicitly model how frequencies evolve spatially. Consequently,
networks are forced to compensate by increasing depth, width, or embedding resolution, leading to:
1. unnecessary over-parameterization in smooth regions,
2. underfitting or aliasing in high-frequency areas,
3. slower optimization due to spectral mismatch,
4. poor scalability when modeling signals with heterogeneous frequency distributions.
These challenges lead to a central research question:
Can an INR explicitly model the local spectrum of a signal and its spatial evolution, instead of
relying on a fixed global basis?
To answer this, we introduce a new family of implicit representations, Neural Spectral Transport Representations
(NSTR). The core insight is to reinterpret a signal not merely as a mapping x ↦ s(x), but as a spatially evolving
spectral field. Specifically, we assume that each position x is associated with a local spectrum S(x), and that the
spectrum evolves smoothly according to a neural partial differential equation (PDE):
∇S(x) = Fθ(x, S(x)).
Here Fθ acts as a learned spectral flow field, transporting local frequency bases across
…(Full text truncated)…
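Based on what survives above, the formulation can be sketched as three pieces: a local spectrum field S(x), a transport network Fθ penalized toward ∇S(x) ≈ Fθ(x, S(x)) via a residual loss, and reconstruction by modulating a compact set of global sinusoids. The sketch below uses untrained random weights, finite-difference gradients, and arbitrary sizes; it illustrates the structure only, not the paper's actual architecture or training procedure:

```python
import numpy as np

rng = np.random.default_rng(0)
d, K, H = 2, 8, 32                      # input dims, global frequencies, hidden width (illustrative)
omega = rng.normal(scale=5.0, size=(K, d))   # compact set of global frequencies
phi = rng.uniform(0, 2 * np.pi, size=K)

# Tiny random MLPs standing in for the learnable fields (weights would be trained).
W1s, W2s = 0.5 * rng.normal(size=(d, H)), 0.5 * rng.normal(size=(H, K))
W1f, W2f = 0.5 * rng.normal(size=(d + K, H)), 0.5 * rng.normal(size=(H, K * d))

def S(x):
    """Local spectrum field S(x): per-location weights over the K global bases."""
    return np.tanh(x @ W1s) @ W2s                       # (N, K)

def F_theta(x, s):
    """Transport network predicting the spatial gradient of S at x."""
    h = np.tanh(np.concatenate([x, s], axis=-1) @ W1f)
    return (h @ W2f).reshape(-1, K, d)                  # (N, K, d)

def reconstruct(x):
    """Signal value: global sinusoids spatially modulated by the local spectrum."""
    return np.sum(S(x) * np.sin(x @ omega.T + phi), axis=-1)   # (N,)

def transport_residual(x, eps=1e-3):
    """PDE residual ||dS/dx - F_theta(x, S(x))||^2 via central finite differences."""
    grads = np.stack(
        [(S(x + eps * np.eye(d)[i]) - S(x - eps * np.eye(d)[i])) / (2 * eps)
         for i in range(d)],
        axis=-1)                                        # (N, K, d)
    return np.mean((grads - F_theta(x, S(x))) ** 2)

x = rng.uniform(size=(16, d))
print(reconstruct(x).shape, transport_residual(x) >= 0.0)
```

In a full training loop one would presumably minimize a reconstruction loss plus a weighted transport residual; how the paper balances the two terms is not recoverable from the truncated text.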