Context-Aware Initialization for Reducing Generative Path Length in Diffusion Language Models
📝 Original Info
- Title: Context-Aware Initialization for Reducing Generative Path Length in Diffusion Language Models
- ArXiv ID: 2512.19004
- Date: 2025-12-22
- Authors: Unknown (author information was not provided in the available paper metadata).
📝 Abstract
Diffusion Large Language Models (DLLMs) enable fully parallel token decoding but often remain impractical at inference time due to the many denoising iterations required to refine an information-free, fully masked initialization into coherent text. Most existing acceleration methods focus on traversing this generative trajectory more efficiently via improved solvers or sampling strategies. We advance a complementary perspective: shorten the trajectory itself by starting closer to the target distribution through context-aware initialization. We propose a training-free interface that injects prompt-conditioned priors from a lightweight auxiliary model into the diffusion initialization, and instantiate it with two mechanisms: discrete token injection and representation-level embedding interpolation. Because injected priors can be imperfect and unmask-only decoding can over-commit early, we also introduce a simple confidence-based remasking mechanism as a form of prior skepticism. Preliminary evidence on GSM8K suggests that context-aware initialization can substantially reduce denoising iterations (about 35% fewer function evaluations in our setting), while also exposing a key open challenge: naive warm-starting can degrade final accuracy relative to strong diffusion baselines. We use these findings to motivate a research agenda around calibration, revision mechanisms, and representation alignment for reliable warm-started diffusion decoding.
💡 Deep Analysis
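As a rough illustration of the mechanisms named in the abstract, the sketch below shows how prompt-conditioned draft logits from a lightweight auxiliary model could seed a partially unmasked initialization (discrete token injection) and how low-confidence positions could then be re-masked before further denoising (confidence-based remasking). This is a minimal sketch under our own assumptions: the function names, thresholds, `MASK_ID`, and tensor shapes are illustrative, not the paper's implementation.

```python
import torch

# Illustrative constants; the paper does not specify these names or values.
MASK_ID = 0        # assumed id of the [MASK] token in the diffusion vocabulary
VOCAB_SIZE = 100   # toy vocabulary size for the demo below
SEQ_LEN = 16


def context_aware_init(draft_logits: torch.Tensor,
                       inject_threshold: float = 0.9) -> torch.Tensor:
    """Discrete token injection: start the diffusion chain from a partially
    unmasked canvas instead of an all-[MASK] one.

    draft_logits: (seq_len, vocab) prompt-conditioned logits from a
    lightweight auxiliary model. Positions where the draft is confident
    are injected; all other positions remain masked for the denoiser.
    """
    probs = draft_logits.softmax(dim=-1)
    conf, draft_tokens = probs.max(dim=-1)
    init = torch.full((draft_logits.size(0),), MASK_ID, dtype=torch.long)
    confident = conf > inject_threshold
    init[confident] = draft_tokens[confident]
    return init


def remask_low_confidence(tokens: torch.Tensor,
                          denoiser_logits: torch.Tensor,
                          remask_threshold: float = 0.5) -> torch.Tensor:
    """Prior skepticism: re-mask committed tokens that the diffusion model
    itself scores as low-confidence, so imperfect injected priors can still
    be revised in later denoising steps."""
    probs = denoiser_logits.softmax(dim=-1)
    conf = probs.gather(-1, tokens.unsqueeze(-1)).squeeze(-1)
    revised = tokens.clone()
    revised[(tokens != MASK_ID) & (conf < remask_threshold)] = MASK_ID
    return revised


if __name__ == "__main__":
    # Random logits stand in for the auxiliary drafter and the denoiser;
    # scaling sharpens some positions so injection has a visible effect.
    draft_logits = 8.0 * torch.randn(SEQ_LEN, VOCAB_SIZE)
    x0 = context_aware_init(draft_logits)
    denoiser_logits = 8.0 * torch.randn(SEQ_LEN, VOCAB_SIZE)
    x1 = remask_low_confidence(x0, denoiser_logits)
    print(f"masked after init:   {(x0 == MASK_ID).sum().item()}/{SEQ_LEN}")
    print(f"masked after remask: {(x1 == MASK_ID).sum().item()}/{SEQ_LEN}")
```

The second mechanism mentioned in the abstract, representation-level embedding interpolation, would presumably operate in the same place but softly, mixing the auxiliary model's predicted token embeddings with the [MASK] embedding rather than committing to discrete tokens; the abstract does not give its exact form.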
📄 Full Content
Reference
This content is AI-processed based on open access ArXiv data.