Neural Texture Block Compression
Block compression is a widely used technique for compressing textures in real-time graphics applications, offering a substantial reduction in storage size. However, its storage efficiency is constrained by the fixed compression ratio, so storage costs grow quickly when hundreds of high-quality textures are required. In this paper, we propose Neural Texture Block Compression (NTBC), a novel neural-network-based block texture compression method. NTBC learns the mapping from uncompressed textures to block-compressed textures, which significantly reduces storage costs without any change to the shaders. Our experiments show that NTBC achieves reasonable-quality results with up to about 70% less storage footprint, preserving real-time performance with a modest computational overhead in the texture-loading phase of the graphics pipeline.
💡 Research Summary
Neural Texture Block Compression (NTBC) addresses the growing storage burden of high‑resolution textures in real‑time graphics by keeping the widely supported BC1 and BC4 block‑compression formats unchanged while using neural networks to regenerate the compressed data on‑the‑fly. Traditional BC formats compress each 4×4 texel block to a fixed 8‑byte representation, which is efficient for random access but becomes a bottleneck when hundreds of 4K textures are required, leading to gigabytes of storage. Variable‑rate formats such as ASTC or BC7 improve quality at the cost of expensive per‑block optimization, making them unsuitable for many real‑time pipelines.
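To make the fixed-rate arithmetic concrete: at 8 bytes per 4×4 block, a texture's compressed size depends only on its resolution. A small sketch of this calculation (the function name is ours, not from the paper):

```python
def bc1_size_bytes(width: int, height: int) -> int:
    """BC1 packs each 4x4 texel block into 8 bytes:
    two 16-bit RGB565 endpoints + 16 two-bit palette indices.
    Dimensions are rounded up to whole blocks."""
    blocks_x = (width + 3) // 4
    blocks_y = (height + 3) // 4
    return blocks_x * blocks_y * 8

# A single 4K (4096x4096) map: 1024*1024 blocks * 8 bytes = 8 MiB.
# Hundreds of materials with several maps each quickly reach gigabytes.
size_mib = bc1_size_bytes(4096, 4096) / (1024 * 1024)
```

BC4 (single-channel) happens to use the same 8 bytes per block, so the same arithmetic applies to roughness or height maps.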
NTBC proposes a two‑network architecture: an endpoint network that predicts the two color endpoints (e₀, e₁) for each block, and a color network that predicts the original uncompressed texel colors. Both networks receive 2‑D coordinates encoded by multi‑resolution feature grids, a representation well‑suited for spatially varying data. During inference, the predicted endpoints generate a palette of possible colors; the nearest palette entry to each predicted texel color determines the 2‑bit (BC1) or 3‑bit (BC4) index via an argmax operation. To enable back‑propagation through the discrete argmax, the authors approximate it with a softmax‑like formulation, allowing gradients to flow to both networks.
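The softmax relaxation of the index selection can be sketched as a generic soft-argmin over color-to-palette distances. The sketch below is written in NumPy for clarity, with a hypothetical `temperature` parameter; it is not the authors' exact formulation, and in a training framework the same expression would be differentiable end-to-end:

```python
import numpy as np

def bc1_palette(e0, e1):
    """Four-color BC1 palette: the two endpoints plus two
    interpolated colors at 1/3 and 2/3 between them."""
    return np.stack([e0, e1, (2 * e0 + e1) / 3, (e0 + 2 * e1) / 3])

def soft_indices(texels, e0, e1, temperature=0.05):
    """Soft relaxation of nearest-palette-entry selection.

    texels: (16, 3) predicted colors of one 4x4 block
    Returns (16, 4) weights that approach a one-hot argmin of the
    squared color-to-palette distance as temperature -> 0, so the
    selection stays differentiable for both networks.
    """
    pal = bc1_palette(e0, e1)                            # (4, 3)
    d = ((texels[:, None, :] - pal[None]) ** 2).sum(-1)  # (16, 4)
    logits = -d / temperature
    logits -= logits.max(axis=-1, keepdims=True)         # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=-1, keepdims=True)
```

At inference time the soft weights are replaced by a hard argmax to produce the actual 2-bit indices; the relaxation is only needed for back-propagation during training.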
Training combines three losses: Lₑ (an L2 loss on the endpoints), L_c (an L2 loss on the predicted colors), and L_cd (an L2 loss on the colors decoded from the reconstructed block). The endpoint network is first trained on continuous endpoint values, then fine‑tuned with quantized indices while the color network is frozen. Crucially, the method employs quantization‑aware training (QAT) to compress both the feature grids and the MLP weights to 8‑bit integers, drastically reducing the model's on‑disk footprint.
In practice, only the compact neural model is stored on disk; at texture loading time the GPU runs inference to reconstruct the BC blocks, which are then copied to VRAM and consumed by existing texture samplers without any shader modifications. Experiments on multiple 4K textures show that NTBC can be trained per material within ten minutes, incurs roughly 1 ms of GPU overhead per texture during loading, and achieves a 55%–70% reduction in storage compared to native BC1/BC4. Visual quality stays close to native BC, with PSNR within 0.5 dB of it and SSIM around 0.98, even when several texture maps (diffuse, normal, roughness) are compressed jointly.
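At load time, the predicted endpoints and indices must be serialized into the standard 8-byte BC1 block layout before the upload to VRAM. A minimal packer (our illustration; the paper's pipeline would do this on the GPU) looks like:

```python
def pack_bc1_block(e0_565: int, e1_565: int, indices) -> bytes:
    """Pack one BC1 block: two little-endian RGB565 endpoints,
    then 16 two-bit palette indices (row-major, LSB-first).
    Hardware decodes the four-color mode when e0 > e1."""
    bits = 0
    for i, idx in enumerate(indices):
        bits |= (idx & 0b11) << (2 * i)
    return (e0_565.to_bytes(2, "little")
            + e1_565.to_bytes(2, "little")
            + bits.to_bytes(4, "little"))
```

Running this over every 4×4 block yields a buffer that existing texture samplers consume unchanged, which is the compatibility property the paper emphasizes.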
Limitations include the need for a short per‑material training phase and slight quality degradation on extremely high‑frequency textures. The authors suggest future work on extending NTBC to variable‑rate formats (BC7, ASTC), accelerating training via meta‑learning, and optimizing inference for mobile GPUs. Overall, NTBC demonstrates a novel paradigm: using neural networks to regenerate standard block‑compressed data, thereby preserving compatibility while delivering substantial storage savings and maintaining real‑time performance.