Deconstructing Generative Diversity: An Information Bottleneck Analysis of Discrete Latent Generative Models

Reading time: 5 minutes

📝 Original Info

  • Title: Deconstructing Generative Diversity: An Information Bottleneck Analysis of Discrete Latent Generative Models
  • ArXiv ID: 2512.01831
  • Date: 2025-12-01
  • Authors: Yudi Wu (Zhejiang University) – wuyudi@zju.edu.cn; Wenhao Zhao (National University of Singapore) – wenhaozhao@u.nus.edu; Dianbo Liu (National University of Singapore) – dianbo@nus.edu.sg

📝 Abstract

Generative diversity varies significantly across discrete latent generative models such as AR, MIM, and Diffusion. We propose a diagnostic framework, grounded in Information Bottleneck (IB) theory, to analyze the underlying strategies that give rise to this behavior. The framework models generation as a conflict between a "Compression Pressure" (a drive to minimize overall codebook entropy) and a "Diversity Pressure" (a drive to maximize conditional entropy given an input). We further decompose this diversity into two primary sources: "Path Diversity", the choice of high-level generative strategy, and "Execution Diversity", the randomness in executing a chosen strategy. To make this decomposition operational, we introduce three zero-shot, inference-time interventions that directly perturb the latent generative process and reveal how models allocate and express diversity. Applying this probe-based framework to representative AR, MIM, and Diffusion systems reveals three distinct strategies: "Diversity-Prioritized" (MIM), "Compression-Prioritized" (AR), and "Decoupled" (Diffusion). Our analysis provides a principled explanation for their behavioral differences and informs a novel inference-time diversity-enhancement technique.
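
Read through an IB lens, the two pressures and the diversity split can be written as entropies. This formalization is our gloss on the abstract, not necessarily the paper's exact definitions: with input X, discrete latent code Z, and a high-level generative path S that is determined by the realized code, the chain rule of entropy yields the path/execution decomposition.

```latex
% Compression pressure: push down the overall codebook entropy H(Z).
% Diversity pressure:   push up the conditional entropy H(Z | X).
% If the path is a function of the code, S = f(Z), then
% H(Z, S | X) = H(Z | X), and the chain rule gives:
H(Z \mid X) \;=\; \underbrace{H(S \mid X)}_{\text{path diversity}}
\;+\; \underbrace{H(Z \mid S, X)}_{\text{execution diversity}}
```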

💡 Deep Analysis

Figure 1 (see the image gallery below)

📄 Full Content

1. Introduction

Discrete latent generative models have recently emerged as a central paradigm in image synthesis. Approaches based on vector quantization (VQ) [1], such as autoregressive transformers [2], masked image models [3], and diffusion models operating in discrete latent spaces [4], have demonstrated remarkable progress in controllable and high-fidelity generation. The discrete formulation offers compact and structured representations that align well with token-based learning and scalable training pipelines. As these models become increasingly prevalent, understanding their underlying generative behavior, particularly the nature and source of their diversity, has become a critical research question.

Generative models have been widely studied from the perspective of evaluating or quantifying their output variation. Recent works have proposed a range of metrics to assess novelty, variability, or originality in generated samples [5–7]. These studies have improved our ability to measure how diverse model outputs appear, but they provide limited insight into the mechanisms that produce such variation. In particular, they seldom address how different model architectures internalize and control stochasticity within their latent representations. As a result, we still lack a systematic understanding of why discrete latent models such as autoregressive, masked, and diffusion-based frameworks exhibit distinct patterns of generative behavior.

We address this gap through an information-theoretic framework grounded in the Information Bottleneck (IB) principle. From this perspective, generative diversity arises from the balance between two opposing pressures: a compression pressure, which encourages compact latent representations and low entropy, and a diversity pressure, which promotes stochastic and expressive mappings that retain uncertainty. We further decompose this diversity into interpretable components that correspond to variability in generative strategy (path diversity) and randomness during execution (execution diversity), providing a unified view of information flow in generation.
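
To make these pressures concrete at the code level, here is a minimal plug-in estimator sketch. This is our own illustration, not the paper's estimator: given several sampled latent-token grids per prompt, it computes a marginal-token proxy for the overall codebook entropy H(Z) and the prompt-conditional entropy H(Z|X).

```python
import numpy as np

def token_entropy(tokens, codebook_size):
    """Empirical (plug-in) entropy of codebook usage, in nats."""
    counts = np.bincount(np.asarray(tokens).ravel(), minlength=codebook_size)
    p = counts / counts.sum()
    p = p[p > 0]  # drop unused entries so 0*log(0) terms vanish
    return float(-(p * np.log(p)).sum())

def pressure_estimates(samples_per_prompt, codebook_size):
    """samples_per_prompt maps each prompt to an int array of shape
    (n_samples, n_tokens) of quantized latent indices.

    Treats tokens as i.i.d. draws from one categorical distribution,
    so these are coarse marginal proxies, not joint-sequence entropies.
    """
    pooled = np.concatenate([np.asarray(s).ravel()
                             for s in samples_per_prompt.values()])
    h_z = token_entropy(pooled, codebook_size)   # compression-pressure side
    h_z_given_x = float(np.mean([token_entropy(s, codebook_size)
                                 for s in samples_per_prompt.values()]))
    return h_z, h_z_given_x                      # (H(Z), mean H(Z|X))
```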
To operationalize this analysis, we develop a set of zero-shot, inference-time probes that directly perturb a model's latent generative process. Each probe targets a different component of the IB decomposition (codebook usage, sampling stochasticity, and prompt conditioning) and measures how the model's outputs change under these controlled interventions. By examining a model's sensitivity to these perturbations, we expose how its internal mechanism navigates the trade-off between compression and diversity.

The resulting probe responses reveal several recurring patterns across architectures. Some models behave as compression-prioritized systems, showing minimal changes under perturbations and consistently producing stable, low-variance outputs. Others are diversity-prioritized, maintaining high conditional entropy and expressing substantial variation even when constraints are imposed. A third group exhibits decoupled behavior, where path-level randomness and execution-level randomness contribute independently, yielding models that remain stable at the structural level while preserving controlled variation during sampling. These patterns provide a coherent view of how different discrete generative models manage information flow and where the
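
The three probes are only named at a high level in the excerpt above (and echoed by the gallery filenames: codebook ablation, argmax ablation, paraphrase ablation). Purely as an illustration, here is a hypothetical shape two of them could take at the point where a model emits logits over codebook entries; all function and parameter names below are ours, not the paper's.

```python
import torch

def codebook_subset_probe(logits, keep_fraction=0.5, generator=None):
    """Codebook-usage probe (hypothetical): restrict sampling to a random
    subset of codebook entries by masking dropped entries' logits to -inf."""
    vocab = logits.shape[-1]
    keep = int(vocab * keep_fraction)
    perm = torch.randperm(vocab, generator=generator)
    mask = torch.full((vocab,), float("-inf"), device=logits.device)
    mask[perm[:keep]] = 0.0          # kept entries pass through unchanged
    return logits + mask             # broadcasts over leading dimensions

def argmax_probe(logits):
    """Sampling-stochasticity probe (hypothetical): remove execution
    randomness by replacing stochastic sampling with greedy selection."""
    return torch.argmax(logits, dim=-1)

# Hypothetical usage inside a generic decoding loop:
#   logits = model(tokens, prompt_embedding)
#   logits = codebook_subset_probe(logits, keep_fraction=0.5)
#   next_token = torch.distributions.Categorical(logits=logits).sample()
```

A prompt-conditioning probe would analogously swap the prompt for a paraphrase before encoding while holding the sampling seed fixed, isolating how much diversity flows from the conditioning signal itself.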

📸 Image Gallery

abs.png amused_diversity_metrics_comparison.png amused_quality_metrics_comparison.png argmax_ablation_diversity.png codebook_ablation_diversity.png factor_effects_waterfall_LlamaGen.png factor_effects_waterfall_VQ-Diffusion.png factor_effects_waterfall_aMUSEd.png interaction_by_codebook_VQ-Diffusion.png interaction_by_codebook_aMUSEd.png interaction_by_prompt_VQ-Diffusion.png interaction_by_prompt_aMUSEd.png interaction_by_sampling_VQ-Diffusion.png interaction_by_sampling_aMUSEd.png interaction_prompt_codebook_LlamaGen.png iqa_curve.png janus_clip_iqa.png janus_lpips_diversity.png llamagen_diversity_metrics_comparison.png llamagen_quality_metrics_comparison.png lpips_curve.png paraphrase_ablation_diversity.png sample1_baseline.png sample1_mixed.png sample1_paraphrase.png sample1_subset.png sample2_baseline.png sample2_mixed.png sample2_paraphrase.png sample2_subset.png sample3_baseline.png sample3_mixed.png sample3_paraphrase.png sample3_subset.png showo_clip_iqa.png showo_lpips_diversity.png vq-diffusion_diversity_metrics_comparison.png vq-diffusion_quality_metrics_comparison.png

Reference

This content is AI-processed based on open access ArXiv data.
