Beyond Log-Concavity and Score Regularity: Improved Convergence Bounds for Score-Based Generative Models in W2-distance
Score-based Generative Models (SGMs) aim to sample from a target distribution by learning score functions using samples perturbed by Gaussian noise. Existing convergence bounds for SGMs in the W2-distance rely on stringent assumptions about the data distribution. In this work, we present a novel framework for analyzing W2-convergence in SGMs, significantly relaxing traditional assumptions such as log-concavity and score regularity. Leveraging the regularization properties of the Ornstein–Uhlenbeck (OU) process, we show that weak log-concavity of the data distribution evolves into log-concavity over time. This transition is rigorously quantified through a PDE-based analysis of the Hamilton–Jacobi–Bellman equation governing the log-density of the forward process. Moreover, we establish that the drift of the time-reversed OU process alternates between contractive and non-contractive regimes, reflecting the dynamics of concavity. Our approach circumvents the need for stringent regularity conditions on the score function and its estimators, relying instead on milder, more practical assumptions. We demonstrate the wide applicability of this framework through explicit computations on Gaussian mixture models, illustrating its versatility and potential for broader classes of data distributions.
💡 Research Summary
This paper addresses a fundamental limitation in the theoretical analysis of score‑based generative models (SGMs): existing Wasserstein‑2 (W₂) convergence guarantees rely on strong assumptions such as strict log‑concavity of the data distribution and high‑order regularity of the score function. The authors propose a novel analytical framework that relaxes these requirements dramatically, requiring only weak log‑concavity and a one‑sided log‑Lipschitz condition on the target distribution.
The key technical insight is to exploit the regularizing effect of the Ornstein–Uhlenbeck (OU) process, which serves as the forward diffusion in many SGMs. The OU dynamics converge to a standard Gaussian stationary distribution, and the forward marginal at time t can be viewed as a rescaled copy of the initial data density convolved with a Gaussian kernel. By studying the Hamilton–Jacobi–Bellman (HJB) equation satisfied by the log‑density of the forward process, the authors rigorously track how the weak log‑concavity constant evolves over time. They prove that this constant improves exponentially fast, so that the marginal densities become strongly log‑concave after a finite transition time.
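As an illustrative sketch (under the standard normalization; the paper's exact conventions and constants may differ), the forward OU process and the equation tracked by its log‑density are:

\[
dX_t = -X_t\,dt + \sqrt{2}\,dB_t, \qquad
X_t \overset{d}{=} e^{-t}X_0 + \sqrt{1-e^{-2t}}\,Z, \quad Z \sim \mathcal{N}(0, I_d),
\]

so \(p_t\) is the density of a rescaled data sample plus independent Gaussian noise of variance \(1-e^{-2t}\). Writing \(u_t = -\log p_t\), the Fokker–Planck equation turns into the HJB‑type equation

\[
\partial_t u_t = \Delta u_t - |\nabla u_t|^2 + x\cdot\nabla u_t - d,
\]

whose Hessian along the flow quantifies how the (weak) log‑concavity constant of \(p_t\) improves with \(t\).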
On the reverse side, the time‑reversed OU SDE has drift \(b_t(x) = -x + 2\nabla\log\tilde p_t(x)\), where \(\tilde p_t\) is the ratio of the forward marginal density to the standard Gaussian density. The paper shows that this drift exhibits two distinct regimes: an early non‑contractive phase and a later contractive phase once the log‑concavity has become strong enough. The transition point is explicitly characterized in terms of the evolving concavity constant.
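Under the common normalization \(dX_t = -X_t\,dt + \sqrt{2}\,dB_t\) of the forward process (a sketch; the paper's conventions may differ), the time reversal reads

\[
dY_s = \big(-Y_s + 2\,\nabla\log\tilde p_{T-s}(Y_s)\big)\,ds + \sqrt{2}\,dB_s,
\qquad \tilde p_t(x) = \frac{p_t(x)}{\gamma_d(x)},
\]

where \(\gamma_d\) denotes the standard Gaussian density on \(\mathbb{R}^d\). Since \(\nabla\log p_t(x) = \nabla\log\tilde p_t(x) - x\), this is the familiar reversal drift \(x + 2\nabla\log p_{T-s}(x)\) rewritten in ratio form; the drift satisfies a one‑sided contraction bound \(\langle b_t(x)-b_t(y),\,x-y\rangle \le (-1+2M)\,|x-y|^2\) whenever \(\nabla^2\log\tilde p_t \preceq M I\), so it is contractive exactly when \(M < 1/2\).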
To translate these dynamical properties into a concrete W₂ bound, the authors employ a hybrid coupling strategy that combines reflection coupling, sticky coupling, and the recently introduced controlled coupling technique. This approach yields a differential inequality for the expected squared distance between the true reverse diffusion and its discretized, score‑estimated counterpart. Solving the inequality gives an explicit convergence rate that depends on (i) the initial weak log‑concavity parameter, (ii) the OU diffusion coefficient, (iii) the approximation error of the learned score network, and (iv) the discretization step size. All constants are fully spelled out, avoiding the opaque “big‑O” notation common in prior work.
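Schematically (this is an illustrative shape, not the paper's exact statement), the hybrid coupling produces a Grönwall‑type inequality for the distance between the true reverse process \(Y_t\) and its discretized, score‑estimated counterpart \(\hat Y_t\):

\[
\frac{d}{dt}\,\mathbb{E}\big[\|Y_t - \hat Y_t\|^2\big]
\;\le\; -\,\kappa(t)\,\mathbb{E}\big[\|Y_t - \hat Y_t\|^2\big]
\;+\; C\big(\varepsilon_{\mathrm{score}}^2 + h\big),
\]

with \(\kappa(t) < 0\) during the non‑contractive phase and \(\kappa(t) > 0\) afterwards. Since \(W_2^2\) is an infimum over couplings, integrating this inequality yields an explicit bound in terms of the score error \(\varepsilon_{\mathrm{score}}\), the step size \(h\), and the accumulated factor \(\exp\!\big(-\int\kappa(s)\,ds\big)\).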
A significant portion of the paper is devoted to validating the assumptions. The authors prove that Gaussian mixture models (GMMs) satisfy both weak log‑concavity and the one‑sided log‑Lipschitz condition, and they provide explicit formulas for the relevant constants in Theorem 4.1. Because the OU flow acts as a Gaussian kernel, it regularizes GMMs rapidly, guaranteeing that even with early stopping the reverse process remains stable. This demonstrates that the theory applies to realistic, multimodal data distributions rather than only to idealized log‑concave cases.
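The regularization claim can be sanity‑checked numerically. The sketch below is illustrative only (the mixture parameters are made up, not taken from the paper): it computes the largest second derivative of the log‑density of a 1‑D two‑component GMM before and after the OU flow. Positive curvature (a convex valley between the modes) at t = 0 becomes strictly negative curvature, i.e. strong log‑concavity, by t = 2.

```python
import numpy as np

# Illustrative check (not the paper's proof): the OU flow turns a bimodal
# Gaussian mixture into a strongly log-concave density in finite time.
def gmm_logpdf(x, means, variances, weights):
    comps = [w * np.exp(-(x - m) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)
             for m, v, w in zip(means, variances, weights)]
    return np.log(sum(comps))

def max_log_curvature(means, variances, weights, grid):
    # Largest second derivative of log p on the grid (central differences):
    # > 0 means log p has a convex region, < 0 means strong log-concavity.
    h = grid[1] - grid[0]
    lp = gmm_logpdf(grid, means, variances, weights)
    d2 = (lp[2:] - 2 * lp[1:-1] + lp[:-2]) / h ** 2
    return d2.max()

grid = np.linspace(-6.0, 6.0, 4001)
means, variances, weights = [-3.0, 3.0], [0.25, 0.25], [0.5, 0.5]
m0 = max_log_curvature(means, variances, weights, grid)  # > 0: only weakly log-concave

# The OU flow maps each component N(m, v) to N(m e^{-t}, v e^{-2t} + 1 - e^{-2t}).
t = 2.0
a = np.exp(-t)
means_t = [m * a for m in means]
variances_t = [v * a ** 2 + 1.0 - a ** 2 for v in variances]
mt = max_log_curvature(means_t, variances_t, weights, grid)  # < 0: now strongly log-concave

print(m0 > 0, mt < 0)
```

This mirrors the paper's point that the Gaussian‑kernel action of the OU flow regularizes GMMs quickly, so the reverse process enters the contractive regime after a short transition time.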
Finally, the paper discusses practical implications. By removing the need for strong smoothness assumptions on the score, the analysis aligns more closely with how SGMs are trained in practice (via denoising score matching). The explicit dependence of the bound on the score‑estimation error offers a principled way to set network capacity and training budget. Moreover, the identified contractive regime suggests that algorithmic designs (e.g., adaptive step‑size schedules) can be tuned to stay within the contractive window, improving sample quality and reducing computational cost.
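As a toy illustration of the discretized reverse dynamics (a hypothetical minimal sampler, not the paper's algorithm), the following Euler–Maruyama loop runs the reverse OU SDE for a 1‑D Gaussian target, where the exact score is available in closed form; the sample variance at the end should approach the target's.

```python
import numpy as np

# Illustrative sketch: Euler-Maruyama discretization of the reverse OU SDE
# dY = (Y + 2 * score_t(Y)) dt + sqrt(2) dB for a 1-D Gaussian target N(0, s0_sq),
# whose forward marginals p_t = N(0, v(t)) give the score in closed form.
rng = np.random.default_rng(0)
s0_sq, T, h = 0.25, 3.0, 0.01
n_steps, n_samples = int(T / h), 20000

def v(t):
    # Variance of the OU marginal at time t for an N(0, s0_sq) initialization.
    return s0_sq * np.exp(-2 * t) + 1.0 - np.exp(-2 * t)

y = rng.standard_normal(n_samples)  # p_T is close to N(0, 1) for T = 3
for k in range(n_steps):
    t = T - k * h                   # current forward time
    score = -y / v(t)               # exact score of N(0, v(t))
    y += h * (y + 2 * score) + np.sqrt(2 * h) * rng.standard_normal(n_samples)

print(abs(y.var() - s0_sq) < 0.05)  # sample variance approaches s0_sq = 0.25
```

Swapping the closed‑form score for a learned network and the fixed step h for an adaptive schedule concentrated in the contractive window is exactly the kind of design lever the analysis suggests.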
In summary, this work delivers a rigorous, fully explicit Wasserstein‑2 convergence guarantee for SGMs under markedly weaker distributional assumptions. It bridges PDE‑based HJB analysis, stochastic coupling techniques, and practical considerations such as Gaussian mixture applicability and early‑stopping stability, thereby broadening the theoretical foundation of diffusion‑based generative modeling and offering actionable guidance for future algorithm design.