Generalized Denoising Diffusion Codebook Models (gDDCM): Tokenizing images using a pre-trained diffusion model
📝 Original Info
- Title: Generalized Denoising Diffusion Codebook Models (gDDCM): Tokenizing images using a pre-trained diffusion model
- ArXiv ID: 2511.13387
- Date: 2025-11-17
- Authors: Not specified in the provided abstract or paper body.
📝 Abstract
Denoising diffusion models have emerged as a dominant paradigm in image generation. Discretizing image data into tokens is a critical step for effectively integrating images with Transformers and other architectures. Although Denoising Diffusion Codebook Models (DDCM) pioneered the use of pre-trained diffusion models for image tokenization, they strictly rely on the traditional discrete-time DDPM architecture. Consequently, they fail to adapt to modern continuous-time variants, such as Flow Matching and Consistency Models, and suffer from inefficient sampling in high-noise regions. To address these limitations, this paper proposes the Generalized Denoising Diffusion Codebook Models (gDDCM). We establish a unified theoretical framework and introduce a generic "De-noise and Back-trace" sampling strategy. By combining a deterministic ODE denoising step with a residual-aligned noise injection step, our method resolves the adaptation challenge. Furthermore, we introduce a backtracking parameter $p$ that significantly enhances tokenization ability. Extensive experiments on the CIFAR-10 and LSUN Bedroom datasets demonstrate that gDDCM achieves comprehensive compatibility with mainstream diffusion variants and significantly outperforms DDCM in reconstruction quality and perceptual fidelity.
💡 Deep Analysis
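To make the abstract's "De-noise and Back-trace" loop concrete, the sketch below shows one plausible instantiation in PyTorch. It assumes a rectified-flow-style interpolation $x_t = (1-t)x_0 + t\epsilon$ and a denoiser with the hypothetical signature `denoiser(x_t, t) -> x0_hat`; the per-step codebook construction, the residual-alignment rule, and the treatment of the backtracking parameter `p` as a re-noising scale are all assumptions for illustration, not the paper's exact formulation.

```python
# Hypothetical sketch of gDDCM-style "De-noise and Back-trace" tokenization.
# Assumptions: rectified-flow interpolation x_t = (1-t)*x0 + t*eps, a denoiser
# that predicts x0 from (x_t, t), and p interpreted as a re-noising time scale.
import torch

def make_codebook(shape, size, step, seed=0):
    # Fixed per-step codebook of Gaussian noises, reproducible from (seed, step)
    # so that encoder and decoder share the same codebooks.
    g = torch.Generator().manual_seed(seed * 100003 + step + 1)
    return torch.randn((size, *shape), generator=g)

@torch.no_grad()
def gddcm_encode(x0, denoiser, n_steps=50, codebook_size=64, p=0.8, seed=0):
    """Tokenize x0 into one codebook index per step."""
    ts = torch.linspace(1.0, 0.0, n_steps + 1)
    x = make_codebook(x0.shape, 1, step=-1, seed=seed)[0]  # shared initial noise
    tokens = []
    for i in range(n_steps):
        t_next = ts[i + 1].item()
        x0_hat = denoiser(x, ts[i].item())        # deterministic ODE denoise step
        cb = make_codebook(x0.shape, codebook_size, step=i, seed=seed)
        # Residual-aligned selection: pick the codebook noise most correlated
        # with the direction from the current estimate toward the true image,
        # so each back-trace step steers the trajectory toward x0.
        r = (x0 - x0_hat).flatten()
        k = int(torch.argmax(cb.flatten(1) @ r))
        tokens.append(k)
        # Back-trace: re-inject the chosen noise at a time scaled by p (assumed).
        t_back = t_next * p
        x = (1.0 - t_back) * x0_hat + t_back * cb[k]
    return tokens

@torch.no_grad()
def gddcm_decode(tokens, shape, denoiser, codebook_size=64, p=0.8, seed=0):
    """Replay the recorded indices to reconstruct; no access to x0 is needed."""
    n_steps = len(tokens)
    ts = torch.linspace(1.0, 0.0, n_steps + 1)
    x = make_codebook(shape, 1, step=-1, seed=seed)[0]
    for i, k in enumerate(tokens):
        x0_hat = denoiser(x, ts[i].item())
        cb = make_codebook(shape, codebook_size, step=i, seed=seed)
        t_back = ts[i + 1].item() * p
        x = (1.0 - t_back) * x0_hat + t_back * cb[k]
    return x  # at the final step t_back = 0, so x equals the last x0_hat
```

Under these assumptions, the token sequence (one index of `codebook_size` options per step) together with the shared seed and noise schedule is the compressed representation: the decoder deterministically replays the same denoise-and-back-trace steps without ever seeing the original image.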
📄 Full Content
This content is AI-processed based on open access ArXiv data.