Generative Decompression: Optimal Lossy Decoding Against Distribution Mismatch
This paper addresses optimal decoding strategies in lossy compression where the assumed distribution for compressor design mismatches the actual (true) distribution of the source. This problem has immediate relevance in standardized communication systems where the decoder acquires side information or priors about the true distribution that are unavailable to the fixed encoder. We formally define the mismatched quantization problem, demonstrating that the optimal reconstruction rule, termed generative decompression, aligns with classical Bayesian estimation: it takes the conditional expectation under the true distribution given the quantization indices, adapted to the fixed-encoder constraint. This strategy effectively performs a generative Bayesian correction on the decoder side, strictly outperforming the conventional centroid rule. We extend this framework to transmission over noisy channels, deriving a robust soft-decoding rule that quantifies the inefficiency of standard modular source–channel separation architectures under mismatch. Furthermore, we generalize the approach to task-oriented decoding, showing that the optimal strategy shifts from conditional mean estimation to maximum a posteriori (MAP) detection. Experimental results on Gaussian sources and deep-learning-based semantic classification demonstrate that generative decompression closes the vast majority of the performance gap to the ideal joint-optimization benchmark, enabling adaptive, high-fidelity reconstruction without modifying the encoder.
💡 Research Summary
The paper tackles the practical problem of distribution mismatch in lossy compression systems where the encoder is fixed—often due to standards or legacy hardware—and the decoder has access to side information about the true source distribution. The authors formalize the “mismatched quantization” problem: a b‑bit scalar quantizer (Q_d) is designed under a design law (P^{(d)}_X) with reconstruction centroids (a^{(d)}_i), while the actual source follows a different law (P^{(t)}_X). The central question is how the decoder should map the received index (I=Q_d(X)) to a reconstruction value in order to minimize the expected distortion under the true distribution.
The key contribution is the introduction of “generative decompression,” which is simply the Bayesian minimum‑mean‑square‑error (MMSE) estimator under the true distribution: the optimal reconstruction for each quantization cell (R_i) is the conditional expectation (\mathbb{E}_t[X \mid X \in R_i]) under the true law (P^{(t)}_X). Since the encoder's partition is fixed, adapting only these per‑cell reconstruction values is the best the decoder can do, and it strictly outperforms reusing the design centroids (a^{(d)}_i) whenever the two laws differ.
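This decoder-side correction can be illustrated with a minimal Monte Carlo sketch: design a Lloyd–Max quantizer under a design law, then decode true-source samples either with the design centroids or with per-cell conditional means under the true law. The specific distributions, rate, and sample counts below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
b = 3                       # quantizer rate in bits (illustrative)
n_levels = 2 ** b

# Design law P^(d): standard normal; true law P^(t): shifted, wider normal (assumed)
design = rng.normal(0.0, 1.0, 200_000)
true_src = rng.normal(0.7, 1.5, 200_000)

# Lloyd-Max design of the fixed encoder Q_d under the design law
a_d = np.quantile(design, (np.arange(n_levels) + 0.5) / n_levels)  # initial centroids
for _ in range(50):
    edges = (a_d[:-1] + a_d[1:]) / 2           # nearest-neighbor cell boundaries
    idx = np.searchsorted(edges, design)
    a_d = np.array([design[idx == i].mean() for i in range(n_levels)])
edges = (a_d[:-1] + a_d[1:]) / 2

# The fixed encoder maps true-source samples to indices I = Q_d(X)
I = np.searchsorted(edges, true_src)

# Conventional centroid rule: reconstruct with the design centroids a_i^(d)
mse_centroid = np.mean((true_src - a_d[I]) ** 2)

# Generative decompression: E_t[X | X in R_i], estimated by per-cell sample means
a_gen = np.array([true_src[I == i].mean() for i in range(n_levels)])
mse_gen = np.mean((true_src - a_gen[I]) ** 2)

print(f"centroid rule MSE:       {mse_centroid:.4f}")
print(f"generative decoding MSE: {mse_gen:.4f}")
```

Note that only the reconstruction table changes; the partition (and hence the rate) set by the fixed encoder is untouched, which is exactly the constraint the paper's fixed-encoder setting imposes.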