CREPE: Controlling Diffusion with Replica Exchange


Inference-time control of diffusion models aims to steer model outputs to satisfy new constraints without retraining. Previous approaches have mostly relied on heuristic guidance or have been coupled with Sequential Monte Carlo (SMC) for bias correction. In this paper, we propose a flexible alternative based on replica exchange, an algorithm designed initially for sampling problems. We refer to this method as CREPE (Controlling with REPlica Exchange). Unlike SMC, CREPE: (1) generates particles sequentially, (2) maintains high diversity in the generated samples after a burn-in period, and (3) enables online refinement or early termination. We demonstrate its versatility across various tasks, including temperature annealing, reward-tilting, model composition and classifier-free guidance debiasing, with competitive performance compared to prior SMC methods.

💡 Research Summary

The paper introduces CREPE (Controlling with REPlica Exchange), a novel inference‑time control framework for diffusion models that replaces the commonly used Sequential Monte Carlo (SMC) approach with a replica‑exchange (parallel tempering) scheme. The authors observe three major drawbacks of SMC: (1) it requires a large number of particles throughout the denoising trajectory, leading to high memory consumption; (2) particle diversity quickly collapses, especially when the particle budget is limited; and (3) once the sampling run finishes, the method cannot refine the generated samples, forcing a full restart when new constraints appear.
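Drawback (2) can be illustrated with a toy experiment. The sketch below is not from the paper: it uses made-up Gaussian log-weights (the `weight_spread` parameter is a hypothetical stand-in for constraint strength) and plain multinomial resampling, and it counts how many of the initial particles still have descendants after each resampling step.

```python
import numpy as np

def surviving_ancestors(num_particles, num_steps, weight_spread, seed=0):
    """Count distinct initial particles that still have descendants after each step."""
    rng = np.random.default_rng(seed)
    ancestors = np.arange(num_particles)  # each particle starts as its own ancestor
    history = [num_particles]
    for _ in range(num_steps):
        # Hypothetical log-weights; a larger spread mimics a stronger constraint.
        log_w = weight_spread * rng.standard_normal(num_particles)
        w = np.exp(log_w - log_w.max())   # subtract the max for numerical stability
        w /= w.sum()
        # Multinomial resampling: draw particle indices in proportion to the weights.
        idx = rng.choice(num_particles, size=num_particles, p=w)
        ancestors = ancestors[idx]
        history.append(len(np.unique(ancestors)))
    return history

history = surviving_ancestors(num_particles=64, num_steps=20, weight_spread=1.0)
print(history)
```

The count can only shrink, because resampling never creates a new lineage. Real SMC samplers add noise between resampling steps, so this toy tracks genealogical diversity only, not the diversity of final sample values.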

CREPE addresses these issues by inverting the relationship between time and parallelism. Instead of propagating many particles in parallel along a single diffusion time axis, CREPE runs multiple Markov chains in parallel, each initialized at a different diffusion step (t₀, t₁, …, t_M). Within each chain, samples are generated sequentially, and adjacent chains periodically attempt a Metropolis‑Hastings swap. The swap acceptance probability is derived from a Radon‑Nikodym estimator (RNE) that quantifies the likelihood ratio between forward and backward diffusion path measures. Crucially, the RNE can be computed using the pretrained diffusion model’s score network, without requiring an explicit target density.
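A minimal sketch of such a swap decision follows, assuming the acceptance probability takes the ratio form min(1, r(x)/r(x′)) and that log-ratio estimates for the two states are already available. The arguments `log_ratio_x` and `log_ratio_x_prime` are hypothetical inputs; how CREPE computes them from the score network is specified in the paper and is not reproduced here.

```python
import math
import random

def swap_accept_prob(log_ratio_x, log_ratio_x_prime):
    """Return min(1, r(x) / r(x')), evaluated in log space to avoid overflow."""
    log_alpha = min(0.0, log_ratio_x - log_ratio_x_prime)
    return math.exp(log_alpha)

def maybe_swap(x, x_prime, log_ratio_x, log_ratio_x_prime, rng=random):
    """Swap the states of two adjacent chains with the Metropolis-Hastings probability."""
    if rng.random() < swap_accept_prob(log_ratio_x, log_ratio_x_prime):
        return x_prime, x, True   # swap accepted: the chains exchange states
    return x, x_prime, False      # swap rejected: both chains keep their states

# Example: r(x) is half of r(x'), so the swap is accepted with probability 0.5.
print(swap_accept_prob(0.0, math.log(2.0)))
```

Working with log-ratios matters in practice because ratios of path measures in high dimensions can span many orders of magnitude.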

The framework consists of three components:

Annealing Path – a user‑defined family of intermediate distributions π_t that interpolates between a tractable reference (e.g., a Gaussian or fully‑masked distribution) and the desired target distribution. The authors show how common control objectives—temperature scaling (π_t ∝ p_t^β), reward‑tilting (π_t ∝ p_t·exp(r_t)), and model composition (π_t ∝ ∑_j w_j p_t^{(j)})—fit naturally into this path.
Communication Move – a swap operation between neighboring chains. The acceptance probability α_{t,t′} = min(1, (d←Q′/d→Q)(x)·(d→Q/d←Q′)(x′)) is computed using the forward and backward proposal path measures, which are themselves expressed via the RNE. This guarantees detailed balance with respect to the product of the π_t’s.
Local Exploration – standard diffusion reverse‑process steps (e.g., Langevin updates using the pretrained score network) applied independently within each chain. No additional training is required; the existing score model provides the necessary gradients.
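The three components can be seen together in a classical parallel-tempering toy. This is a sketch of the textbook algorithm, not of CREPE itself: it assumes an explicit one-dimensional target density, uses the temperature path π_β ∝ p^β, and replaces diffusion reverse steps with random-walk Metropolis updates. All names (`log_target`, `parallel_tempering`, `betas`) are illustrative.

```python
import numpy as np

def log_target(x):
    """Unnormalized log-density of a bimodal 1-D mixture with modes at -4 and +4."""
    return np.logaddexp(-0.5 * (x + 4.0) ** 2, -0.5 * (x - 4.0) ** 2)

def parallel_tempering(betas, num_iters, step=1.0, seed=0):
    rng = np.random.default_rng(seed)
    x = np.zeros(len(betas))        # one state per chain; chain k targets p^betas[k]
    samples = np.empty(num_iters)   # trace of the last chain (beta = 1, the target)
    proposed = accepted = 0
    for it in range(num_iters):
        # Local exploration: one random-walk Metropolis step per chain.
        for k, beta in enumerate(betas):
            prop = x[k] + step * rng.standard_normal()
            if np.log(rng.random()) < beta * (log_target(prop) - log_target(x[k])):
                x[k] = prop
        # Communication move: try swapping adjacent chains, alternating even/odd pairs.
        for k in range(it % 2, len(betas) - 1, 2):
            log_alpha = (betas[k] - betas[k + 1]) * (
                log_target(x[k + 1]) - log_target(x[k])
            )
            proposed += 1
            if np.log(rng.random()) < log_alpha:
                x[k], x[k + 1] = x[k + 1], x[k]
                accepted += 1
        samples[it] = x[-1]
    return samples, accepted / proposed

# Annealing path: from a nearly flat reference (beta = 0.05) to the target (beta = 1).
betas = np.array([0.05, 0.2, 0.5, 1.0])
samples, swap_rate = parallel_tempering(betas, num_iters=10000)
print("fraction in right mode:", np.mean(samples > 0), "swap rate:", swap_rate)
```

The chain at β = 1 is produced sequentially, one sample per iteration, so the run can be stopped early or extended at will, which is the property the summary highlights. A single random-walk chain at β = 1 would rarely cross between the two modes; here the crossings come from swaps with the flatter chains.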

The authors provide a rigorous derivation showing that, despite the lack of an explicit target density, the replica‑exchange dynamics remain valid because the Radon‑Nikodym derivative between the forward and backward diffusion processes equals one for the unconditioned diffusion, and the additional control terms appear only as multiplicative factors that are accounted for in the swap ratio. They also extend the method to discrete diffusion (mask‑based) models by using the corresponding CTMC rate matrices and concrete scores.
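For orientation, the classical version of this argument, with explicit densities, fits in two lines. The block below is the standard textbook swap rule and its detailed-balance check, not the paper's derivation; according to the summary, CREPE replaces the density ratio r with an estimate of a Radon‑Nikodym derivative between path measures, whose exact form is given in the paper.

```latex
% Classical replica exchange with explicit densities \pi_t and \pi_{t'}.
% Swap proposal: (x, x') \mapsto (x', x) on the joint target \pi_t(x)\,\pi_{t'}(x').
\alpha_{t,t'}(x, x')
  = \min\!\left(1,\; \frac{\pi_t(x')\,\pi_{t'}(x)}{\pi_t(x)\,\pi_{t'}(x')}\right)
  = \min\!\left(1,\; \frac{r(x)}{r(x')}\right),
\qquad r(\cdot) = \frac{\pi_{t'}(\cdot)}{\pi_t(\cdot)}.

% Detailed balance of the swap with respect to the product measure:
\pi_t(x)\,\pi_{t'}(x')\,\alpha_{t,t'}(x, x')
  = \min\!\bigl(\pi_t(x)\,\pi_{t'}(x'),\; \pi_t(x')\,\pi_{t'}(x)\bigr)
  = \pi_t(x')\,\pi_{t'}(x)\,\alpha_{t,t'}(x', x).
```

Only the ratio r enters the acceptance rule, so normalizing constants of π_t and π_{t′} are never needed.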

Empirically, CREPE is evaluated on several modalities (high‑resolution ImageNet‑512 images, video generation, and text generation) across four representative control tasks:

Temperature annealing – adjusting the effective temperature of the diffusion to trade off fidelity vs. diversity.
Reward‑tilting / posterior sampling – incorporating a learned or handcrafted reward function to bias samples toward desired attributes.
Model composition – blending multiple pretrained diffusion models into a single composite distribution.
Classifier‑free guidance debiasing – correcting the known bias of classifier‑free guidance by treating it as a composition of the original model and a conditional model with a weighting factor.
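The reading of classifier‑free guidance as a composition can be checked numerically in a toy setting. The sketch below assumes the common convention in which the guided score is (1 + w)·s_cond − w·s_uncond, and it uses one-dimensional Gaussians as hypothetical stand-ins for the conditional and unconditional models. It verifies that this combined score is the gradient of log[p(x|c)^(1+w) · p(x)^(−w)] at a single noise level; it does not reproduce the paper's debiasing procedure.

```python
import numpy as np

def gaussian_logpdf(x, mean, std):
    return -0.5 * ((x - mean) / std) ** 2 - np.log(std) - 0.5 * np.log(2.0 * np.pi)

def gaussian_score(x, mean, std):
    """Gradient of the Gaussian log-density with respect to x."""
    return -(x - mean) / std**2

def cfg_score(score_cond, score_uncond, w):
    """Classifier-free guidance combination with guidance weight w (assumed convention)."""
    return (1.0 + w) * score_cond - w * score_uncond

def composed_logpdf(x, w, cond, uncond):
    """Unnormalized log-density of p(x|c)^(1+w) * p(x)^(-w)."""
    return (1.0 + w) * gaussian_logpdf(x, *cond) - w * gaussian_logpdf(x, *uncond)

cond, uncond, w, x, h = (1.0, 0.5), (0.0, 1.0), 2.0, 0.3, 1e-4  # toy (mean, std) pairs

guided = cfg_score(gaussian_score(x, *cond), gaussian_score(x, *uncond), w)
# Central finite difference of the composed log-density as an independent check.
numeric = (composed_logpdf(x + h, w, cond, uncond)
           - composed_logpdf(x - h, w, cond, uncond)) / (2.0 * h)
print(guided, numeric)
```

The two printed values agree up to floating-point error, which confirms the identity at a fixed noise level only.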

Across all benchmarks, CREPE matches or exceeds SMC‑based baselines in quantitative metrics such as FID, IS, and a dedicated diversity score, while using 2–3× less memory. Notably, after a short burn‑in period, particle diversity remains high, and the method supports “online refinement”: if early samples are unsatisfactory, additional replica‑exchange iterations can improve them without restarting from scratch. Visualizations (e.g., Figure 1) illustrate that generated images quickly align with the textual prompt after burn‑in.

The paper also discusses limitations. Swap efficiency depends on sufficient overlap between adjacent π_t’s; aggressive annealing schedules can reduce acceptance rates. The current implementation assumes all chains run on a single device, so scaling to large distributed clusters requires further engineering. Finally, complex reward functions may lead to numerically unstable RNE estimates, suggesting a need for variance‑reduction techniques.

Future directions proposed include adaptive annealing schedules that adjust based on observed swap rates, asymmetric or multi‑swap proposals to improve mixing, and integration with large‑scale distributed systems.
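One simple way to act on observed swap rates is sketched below. It is a hypothetical heuristic, not the authors' proposal: wherever the acceptance rate between two adjacent levels falls below a threshold, a midpoint level is inserted. The function name `refine_schedule` and the default threshold of 0.2 are arbitrary choices for illustration.

```python
def refine_schedule(times, swap_rates, min_rate=0.2):
    """Insert a midpoint between adjacent levels whose swap rate is below min_rate.

    times      -- increasing list of annealing levels, e.g. diffusion times
    swap_rates -- observed acceptance rate for each adjacent pair (len(times) - 1 values)
    """
    if len(swap_rates) != len(times) - 1:
        raise ValueError("need exactly one swap rate per adjacent pair of levels")
    refined = [times[0]]
    for left, right, rate in zip(times, times[1:], swap_rates):
        if rate < min_rate:
            refined.append(0.5 * (left + right))  # poor overlap: add an intermediate level
        refined.append(right)
    return refined

# The pair (0.5, 1.0) swaps too rarely, so a level is added at 0.75.
print(refine_schedule([0.0, 0.5, 1.0], [0.6, 0.1]))
```

Each inserted level adds one more chain, so a heuristic like this trades memory for swap acceptance.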

In summary, CREPE offers a principled, memory‑efficient, and flexible alternative to SMC for inference‑time control of diffusion models. By leveraging replica exchange and the Radon‑Nikodym estimator, it preserves particle diversity, enables continual refinement, and supports a broad spectrum of control objectives without any additional training of the diffusion model. This work opens a new avenue for practical, controllable generative modeling in high‑dimensional settings.
