Deconstructing Generative Diversity: An Information Bottleneck Analysis of Discrete Latent Generative Models
Yudi Wu
Zhejiang University
wuyudi@zju.edu.cn
Wenhao Zhao
National University of Singapore
wenhaozhao@u.nus.edu
Dianbo Liu
National University of Singapore
dianbo@nus.edu.sg
Abstract
Generative diversity varies significantly across discrete latent generative models such as AR, MIM, and Diffusion. We propose a diagnostic framework, grounded in Information Bottleneck (IB) theory, to analyze the strategies underlying this behavior. The framework models generation as a conflict between a 'Compression Pressure', a drive to minimize overall codebook entropy, and a 'Diversity Pressure', a drive to maximize conditional entropy given an input. We further decompose this diversity into two primary sources: 'Path Diversity', the choice among high-level generative strategies, and 'Execution Diversity', the randomness in executing a chosen strategy. To make this decomposition operational, we introduce three zero-shot, inference-time interventions that directly perturb the latent generative process and reveal how models allocate and express diversity. Applying this probe-based framework to representative AR, MIM, and Diffusion systems reveals three distinct strategies: "Diversity-Prioritized" (MIM), "Compression-Prioritized" (AR), and "Decoupled" (Diffusion). Our analysis provides a principled explanation for their behavioral differences and informs a novel inference-time diversity enhancement technique.
1. Introduction
Discrete latent generative models have recently emerged as a central paradigm in image synthesis. Approaches based on vector quantization (VQ) [1], such as autoregressive transformers [2], masked image models [3], and diffusion models operating in discrete latent spaces [4], have demonstrated remarkable progress in controllable and high-fidelity generation. The discrete formulation offers compact and structured representations that align well with token-based learning and scalable training pipelines. As these models become increasingly prevalent, understanding their underlying generative behavior, particularly the nature and source of their diversity, has become a critical research question.
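To fix intuitions about the discrete formulation, the following is a minimal sketch of VQ-style nearest-neighbor quantization in the spirit of [1]; the function name, variable names, and shapes are illustrative assumptions, not the architecture of any specific model studied here.

```python
import numpy as np

def vector_quantize(features, codebook):
    """Map each continuous feature vector to its nearest codebook
    entry (squared L2 distance), yielding discrete token ids.

    features: (N, D) array of encoder outputs.
    codebook: (K, D) array of learned code vectors.
    Returns (indices, quantized): (N,) token ids and (N, D) codes.
    """
    # ||f - c||^2 = ||f||^2 - 2 f.c + ||c||^2, computed for all pairs.
    d = (
        (features ** 2).sum(axis=1, keepdims=True)
        - 2.0 * features @ codebook.T
        + (codebook ** 2).sum(axis=1)
    )
    indices = d.argmin(axis=1)      # discrete token ids
    quantized = codebook[indices]   # straight table lookup
    return indices, quantized

# Toy usage: 4 feature vectors, a codebook of 8 entries, dimension 16.
rng = np.random.default_rng(0)
ids, quant = vector_quantize(rng.normal(size=(4, 16)),
                             rng.normal(size=(8, 16)))
print(ids)  # four token ids in [0, 8)
```

The token ids produced this way are exactly the discrete latents over which the codebook-entropy and conditional-entropy quantities discussed below are defined.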
Generative models have been widely studied from the perspective of evaluating or quantifying their output variation. Recent works have proposed a range of metrics to assess novelty, variability, or originality in generated samples [5–7]. These studies have improved our ability to measure how diverse model outputs appear, but they provide limited insight into the mechanisms that produce such variation. In particular, they seldom address how different model architectures internalize and control stochasticity within their latent representations. As a result, we still lack a systematic understanding of why discrete latent models such as autoregressive, masked, and diffusion-based frameworks exhibit distinct patterns of generative behavior.
We address this gap through an information-theoretic framework grounded in the Information Bottleneck (IB) principle. From this perspective, generative diversity arises from the balance between two opposing pressures: a compression pressure, which encourages compact latent representations and low entropy, and a diversity pressure, which promotes stochastic and expressive mappings that retain uncertainty. We further decompose this diversity into interpretable components that correspond to variability in generative strategy (path diversity) and randomness during execution (execution diversity), providing a unified view of information flow in generation.
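One illustrative way to state this formally (a sketch of an assumed formalization, not quoted from later sections): write $c$ for the conditioning input, $Z$ for the generated token sequence, and $S$ for a latent variable summarizing the high-level generative strategy. The two pressures act on the marginal and conditional token entropies,

$$ \min\; H(Z) \;\; \text{(compression pressure)} \qquad \text{vs.} \qquad \max\; H(Z \mid c) \;\; \text{(diversity pressure)}, $$

and the entropy chain rule separates the two diversity sources,

$$ H(Z \mid c) \;=\; \underbrace{H(S \mid c)}_{\text{path diversity}} \;+\; \underbrace{H(Z \mid S,\, c)}_{\text{execution diversity}}, $$

which holds under the assumption that the strategy $S$ is recoverable from the completed sequence $Z$, so that $H(Z, S \mid c) = H(Z \mid c)$.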
To operationalize this analysis, we develop a set of zero-shot, inference-time probes that directly perturb a model's latent generative process. Each probe targets a different component of the IB decomposition (codebook usage, sampling stochasticity, or prompt conditioning) and measures how the model's outputs change under these controlled interventions. By examining a model's sensitivity to these perturbations, we expose how its internal mechanism navigates the trade-off between compression and diversity.
The resulting probe responses reveal several recurring patterns across architectures. Some models behave as compression-prioritized systems, showing minimal changes under perturbations and consistently producing stable, low-variance outputs. Others are diversity-prioritized, maintaining high conditional entropy and expressing substantial variation even when constraints are imposed. A third group exhibits decoupled behavior, where path-level randomness and execution-level randomness contribute independently, yielding models that remain stable at the structural level while preserving controlled variation during sampling. These patterns provide a coherent view of how different discrete generative models manage information flow and where the