ECNR: Efficient Compressive Neural Representation of Time-Varying Volumetric Datasets
Due to its conceptual simplicity and generality, compressive neural representation has emerged as a promising alternative to traditional compression methods for managing massive volumetric datasets. The current practice of neural compression encodes the global volume with a single large multilayer perceptron (MLP), incurring slow training and inference. This paper presents an efficient compressive neural representation (ECNR) solution for time-varying data compression, using the Laplacian pyramid for adaptive signal fitting. Following a multiscale structure, we leverage multiple small MLPs at each scale to fit local content or residual blocks. By assigning similar blocks to the same MLP via size uniformization, we enable balanced parallelization among MLPs, significantly speeding up training and inference. Working in concert with the multiscale structure, we tailor a deep compression strategy to compact the resulting model. We demonstrate the effectiveness of ECNR on multiple datasets and compare it with state-of-the-art compression methods (mainly SZ3, TTHRESH, and neurcomp). The results position ECNR as a promising solution for volumetric data compression.
💡 Research Summary
The paper introduces ECNR (Efficient Compressive Neural Representation), a novel framework for compressing time‑varying volumetric datasets that overcomes the inefficiencies of existing neural compression methods. Traditional neural compressors encode an entire spatiotemporal volume with a single large multilayer perceptron (MLP). While conceptually simple, this approach suffers from massive parameter counts, long training times, and high inference latency, especially for large scientific simulations that can span terabytes. ECNR tackles these problems through three tightly coupled ideas: (1) a multiscale Laplacian‑pyramid decomposition, (2) a collection of small, specialized MLPs assigned to uniformly sized residual blocks, and (3) a deep‑compression pipeline applied to the trained network parameters.
Multiscale decomposition. The input volume sequence (V(x,y,z,t)) is first down‑sampled into a hierarchy of resolutions. At each level (l), a low‑frequency approximation (A_l) and a high‑frequency residual (R_l = A_{l-1} - \uparrow A_l) are computed, where (\uparrow) denotes upsampling to the next finer resolution; the approximations and residuals together form a Laplacian pyramid. Because the coarse approximations already capture most of the signal's energy, the residuals at finer levels are comparatively low‑complexity signals. This property dramatically reduces the learning burden for the neural models that follow.
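The decomposition above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: it assumes average-pooling for down-sampling and nearest-neighbor upsampling, whereas the actual filters may differ. The key invariant it demonstrates is that the coarsest approximation plus the residuals reconstruct the original volume.

```python
import numpy as np

def downsample(a):
    # Average-pool by 2 along each axis: the coarse approximation A_l.
    x, y, z = a.shape
    return a.reshape(x // 2, 2, y // 2, 2, z // 2, 2).mean(axis=(1, 3, 5))

def upsample(a):
    # Nearest-neighbor upsampling by 2: the "uparrow" operator.
    return a.repeat(2, axis=0).repeat(2, axis=1).repeat(2, axis=2)

def laplacian_pyramid(volume, levels):
    """Return (coarsest approximation, residuals R_l = A_{l-1} - up(A_l))."""
    residuals = []
    a = volume
    for _ in range(levels):
        coarse = downsample(a)
        residuals.append(a - upsample(coarse))
        a = coarse
    return a, residuals

def reconstruct(coarse, residuals):
    # Invert the pyramid: upsample and add residuals, finest level last.
    a = coarse
    for r in reversed(residuals):
        a = upsample(a) + r
    return a

rng = np.random.default_rng(0)
vol = rng.standard_normal((16, 16, 16))
coarse, res = laplacian_pyramid(vol, levels=2)
print(np.allclose(reconstruct(coarse, res), vol))  # lossless round trip
```

Note that the decomposition itself is lossless; compression comes from the residuals being low-magnitude and therefore cheap to fit with small networks.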
Block‑wise small MLPs. Each residual volume (R_l) is partitioned into fixed‑size blocks (e.g., (8^3) voxels). Blocks are then clustered based on similarity of their statistical descriptors (mean, variance, gradient histograms). All blocks belonging to the same cluster share a dedicated small MLP (f_{l,k}). The MLPs contain only a few thousand parameters (typically 2–4 hidden layers with 64–128 neurons per layer). Because many such networks run in parallel on a GPU, the overall memory footprint is tiny and the computational load is evenly balanced across cores. The authors call this “balanced parallelization.” Empirically, training time drops from several hours (single‑MLP baseline) to under 30 minutes, and inference latency improves by an order of magnitude, enabling near‑real‑time reconstruction.
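The block-partitioning and clustering step described above can be sketched as follows. This is a simplified stand-in, assuming mean and standard deviation as the statistical descriptors and a plain k-means loop; the paper's descriptors (which also include gradient histograms) and its size-uniformization scheme are richer. The function names `blockify`, `descriptors`, and `kmeans` are illustrative, not from the paper.

```python
import numpy as np

def blockify(vol, b=8):
    # Split a volume into non-overlapping b^3 blocks, one row per block.
    nx, ny, nz = (s // b for s in vol.shape)
    blocks = vol[:nx * b, :ny * b, :nz * b].reshape(nx, b, ny, b, nz, b)
    return blocks.transpose(0, 2, 4, 1, 3, 5).reshape(-1, b ** 3)

def descriptors(blocks):
    # Per-block statistics (here: mean, std) used as clustering features.
    return np.stack([blocks.mean(axis=1), blocks.std(axis=1)], axis=1)

def kmeans(feat, k, iters=20, seed=0):
    # Minimal k-means: blocks with the same label share one small MLP.
    rng = np.random.default_rng(seed)
    centers = feat[rng.choice(len(feat), k, replace=False)]
    for _ in range(iters):
        dists = ((feat[:, None] - centers[None]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = feat[labels == j].mean(axis=0)
    return labels

vol = np.random.default_rng(1).standard_normal((16, 16, 16))
blocks = blockify(vol, b=8)          # 8 blocks of 8^3 = 512 voxels each
labels = kmeans(descriptors(blocks), k=2)
```

Each cluster label would then index a dedicated small MLP (f_{l,k}); because the clusters are balanced in size, the per-MLP workloads are comparable and can be batched in parallel on the GPU.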
Deep compression of the model. After the MLPs have converged, the authors apply a three‑step compression to the network weights: (i) 8‑bit uniform quantization, (ii) sparsification via magnitude‑based pruning (typically 70–80 % of weights are zeroed), and (iii) entropy coding (Huffman or arithmetic coding). This “deep compression” reduces the total size of all MLPs to less than 0.5 % of the original raw data volume, while preserving the reconstruction quality achieved by the uncompressed networks.
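The pruning and quantization steps can be sketched as below. This is a schematic NumPy version operating on one weight array; entropy coding (step iii) is only noted in a comment, since a real pipeline would hand the quantized integers to a Huffman or arithmetic coder.

```python
import numpy as np

def prune_and_quantize(w, sparsity=0.75, bits=8):
    # (ii) Magnitude-based pruning: zero the smallest-|w| fraction of weights.
    thresh = np.quantile(np.abs(w), sparsity)
    w = np.where(np.abs(w) < thresh, 0.0, w)
    # (i) Uniform symmetric quantization of the survivors to signed `bits`-bit ints.
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    q = np.round(w / scale).astype(np.int8)
    # (iii) Entropy coding (Huffman/arithmetic) would then compress q;
    # the long runs of zeros produced by pruning make it highly compressible.
    return q, scale  # dequantize with q * scale

w = np.random.default_rng(2).standard_normal(1000).astype(np.float32)
q, scale = prune_and_quantize(w)
w_hat = q * scale  # reconstructed weights, within scale/2 of the survivors
```

Storing `q` (1 byte per weight before entropy coding) plus one `scale` per layer already cuts the float32 weight storage by 4x; pruning and entropy coding account for the rest of the reported reduction.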
Experimental evaluation. The method is tested on a diverse set of benchmarks: fluid‑dynamics simulations (e.g., turbulent flow), fire propagation, climate model outputs, and medical imaging (CT, MRI). Compression performance is measured in terms of compression ratio (CR), peak signal‑to‑noise ratio (PSNR), structural similarity index (SSIM), and runtime. ECNR is compared against three state‑of‑the‑art compressors: SZ3 (lossy error‑bounded), TTHRESH (transform‑based), and neurcomp (single‑MLP neural compressor). Results show that for a fixed CR, ECNR yields 2–3 dB higher PSNR than SZ3 and TTHRESH, and 1–2 dB higher than neurcomp. When targeting the same PSNR, ECNR achieves 10–15 % higher CR. Training time is reduced by a factor of 5–10, and inference speed is 8× faster than the baseline neural method.
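For reference, the PSNR figures quoted above can be computed as follows. This uses a common convention where the peak is the data's value range; the paper may normalize differently, so treat this as an illustrative definition rather than the exact evaluation code.

```python
import numpy as np

def psnr(original, reconstructed):
    # Peak signal-to-noise ratio in dB; higher means a closer reconstruction.
    mse = np.mean((original - reconstructed) ** 2)
    peak = original.max() - original.min()  # dynamic range as the peak value
    return 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(3)
vol = rng.standard_normal((32, 32, 32))
good = psnr(vol, vol + 0.01 * rng.standard_normal(vol.shape))  # mild error
poor = psnr(vol, vol + 0.10 * rng.standard_normal(vol.shape))  # 10x the error
```

A 2–3 dB PSNR gap at a fixed compression ratio, as reported, corresponds to roughly 1.6–2x lower mean squared error, which is a substantial quality difference at high compression.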
Limitations and future work. The clustering step requires a pre‑defined block size and number of clusters; suboptimal choices can lead to heterogeneous blocks being forced into the same MLP, degrading accuracy for highly non‑stationary data. Moreover, the current implementation relies heavily on GPU parallelism; performance gains on CPU‑only systems are modest. The authors suggest future directions such as adaptive block partitioning, dynamic MLP allocation during training, and distributed training across multiple nodes to further scale the approach.
Conclusion. ECNR demonstrates that a multiscale representation combined with a fleet of small, parallel MLPs can dramatically improve the efficiency of neural compression for time‑varying volumetric data. By integrating a dedicated deep‑compression stage, the method not only speeds up training and inference but also shrinks the model itself to a negligible fraction of the original dataset size. This makes ECNR a compelling candidate for scientific workflows where massive simulation outputs must be stored, transferred, and visualized without sacrificing fidelity.