Denoising Diffusions with Optimal Transport: Localization, Curvature, and Multi-Scale Complexity
Adding noise is easy; what about denoising? Diffusion is easy; what about reverting a diffusion? Diffusion-based generative models aim to denoise a Langevin diffusion chain, moving from a log-concave equilibrium measure $ν$, say an isotropic Gaussian, back to a complex, possibly non-log-concave initial measure $μ$. The score function performs the denoising, moving backward in time and predicting the conditional mean of the past location given the current one. We show that score-based denoising is the optimal backward map in transportation cost. What is its localization uncertainty? We show that the curvature function determines this localization uncertainty, measured as the conditional variance of the past location given the current one. In this paper, we study the effectiveness of the diffuse-then-denoise process: the contraction of the forward diffusion chain, offset by the possible expansion of the backward denoising chain, governs the denoising difficulty. For any initial measure $μ$, we prove that this net contraction at time $t$ is characterized by the curvature complexity of a smoothed $μ$ at a specific signal-to-noise ratio (SNR) scale $r(t)$. We discover that these curvature complexities across SNR scales collectively determine the difficulty of the denoising chain. Our multi-scale complexity quantifies a fine-grained notion of average-case curvature instead of the worst case. Curiously, it depends on an integrated tail function measuring the relative mass of locations with positive curvature versus those with negative curvature; denoising at a specific SNR scale is easy if such an integrated tail is light. We conclude with several non-log-concave examples to demonstrate how the multi-scale complexity probes the bottleneck SNR for the diffuse-then-denoise process.
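To make the abstract's conditional-mean claim concrete, here is a minimal sketch in our own notation; the one-step discretization, the step size $η$, and the Gaussian increments below are illustrative assumptions, not necessarily the paper's exact setup.

```latex
% Illustrative forward (noising) step of a Langevin/OU chain with
% standard-Gaussian equilibrium (our notation, assumed for this sketch):
\[
  X_k \;=\; (1-\eta)\,X_{k-1} \;+\; \sqrt{2\eta}\,Z_k,
  \qquad Z_k \sim \mathcal{N}(0, I_d),
\]
% where p_k denotes the marginal law of X_k. Tweedie's formula then writes
% the conditional mean of the past location as a score correction:
\[
  \mathbb{E}\!\left[X_{k-1} \mid X_k = x\right]
  \;=\; \frac{x + 2\eta\,\nabla \log p_k(x)}{1-\eta}.
\]
```

In this sketch, knowing the score $\nabla \log p_k$ is exactly what is needed to step backward in time by conditional-mean prediction.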
💡 Research Summary
This paper provides a rigorous theoretical framework for understanding diffusion-based generative models through the lens of optimal transport and curvature. The authors begin by formalizing the forward Langevin diffusion as a Markov chain that contracts a potentially highly non-log-concave initial distribution $μ$ toward a simple log-concave equilibrium distribution $ν$ (typically a standard Gaussian). They then examine the backward (denoising) process, showing in Proposition 1 that the optimal backward transition map, i.e., the map that minimizes expected transportation cost, is exactly the score function $\nabla \log p_{\mu_k, \eta}$. Proposition 2 further interprets score estimation as a supervised prediction problem: given a noisy sample at time $t$, the optimal predictor of the previous, less-noisy sample is its conditional expectation under the forward diffusion.
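As a numerical sanity check of this conditional-mean view, the following self-contained sketch compares Tweedie's closed-form conditional mean with a brute-force Monte Carlo estimate on a toy example. Everything here is our own construction for illustration: a 1-D Gaussian mixture standing in for $μ$, a single pure-noising step $X_1 = X_0 + σZ$ (simpler than the Langevin step sketched above), and all chosen constants.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup (ours, not the paper's): mu is a 1-D two-component
# Gaussian mixture, and the forward step is a single pure-noising step
# X1 = X0 + sigma * Z with Z ~ N(0, 1).
weights = np.array([0.4, 0.6])
means = np.array([-2.0, 2.0])
stds = np.array([0.5, 0.8])
sigma = 0.7  # assumed forward-step noise scale

def score_p1(x):
    """Closed-form score of the noised marginal p1 = mu * N(0, sigma^2).

    Adding independent Gaussian noise keeps each mixture component
    Gaussian, with variance inflated to stds**2 + sigma**2, so the score
    is a responsibility-weighted average of the component scores.
    """
    var = stds**2 + sigma**2
    log_comp = -0.5 * (x - means) ** 2 / var - 0.5 * np.log(2 * np.pi * var)
    resp = weights * np.exp(log_comp - log_comp.max())
    resp /= resp.sum()
    return float(np.sum(resp * -(x - means) / var))

# Tweedie's formula for this step: E[X0 | X1 = x] = x + sigma^2 * score_p1(x).
x_query = 0.9
tweedie_mean = x_query + sigma**2 * score_p1(x_query)

# Brute-force check: average X0 over forward samples whose X1 lands in a
# narrow window around x_query.
n = 2_000_000
comp = rng.choice(2, size=n, p=weights)
x0 = rng.normal(means[comp], stds[comp])
x1 = x0 + sigma * rng.normal(size=n)
near = np.abs(x1 - x_query) < 0.02
print(f"Tweedie conditional mean: {tweedie_mean:.3f}")
print(f"Monte Carlo estimate:     {x0[near].mean():.3f} (from {near.sum()} samples)")
```

The two printed numbers should agree to roughly two decimal places, illustrating the point of Proposition 2: the optimal predictor of the past location is a conditional expectation, computable from the score of the noised marginal.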
A central contribution is the introduction of "localization uncertainty," defined as the conditional covariance $\operatorname{Var}(X_{k-1} \mid X_k)$ of the past location given the current one. The paper shows that this uncertainty is governed by the curvature function and that, at each time $t$, the net contraction of the diffuse-then-denoise process is characterized by the curvature complexity of a smoothed $μ$ at the corresponding SNR scale $r(t)$; across scales, these complexities collectively quantify the difficulty of the denoising chain.
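A standard second-order companion to Tweedie's formula makes the curvature link plausible; again, this is stated under the illustrative Gaussian-step assumption from the sketch above, not as the paper's exact result.

```latex
% Second-order Tweedie identity under the same illustrative OU step
%   X_k = (1-\eta) X_{k-1} + \sqrt{2\eta} Z_k   (our assumption):
\[
  \operatorname{Var}\!\left(X_{k-1} \mid X_k = x\right)
  \;=\; \frac{2\eta\left(I_d \;+\; 2\eta\,\nabla^2 \log p_k(x)\right)}{(1-\eta)^2}.
\]
```

Where $\nabla^2 \log p_k(x)$ is negative definite (the log-density is locally concave), the conditional covariance stays controlled by the forward noise level and localization is tight; directions of positive curvature inflate it. This is consistent with the abstract's integrated tail function, which weighs the mass of positive-curvature locations against negative-curvature ones.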