LatentExplainer: Explaining Latent Representations in Deep Generative Models with Multimodal Large Language Models
Deep generative models like VAEs and diffusion models have advanced various generation tasks by leveraging latent variables to learn data distributions and generate high-quality samples. Despite the field of explainable AI making strides in interpreting machine learning models, understanding latent variables in generative models remains challenging. This paper introduces LatentExplainer, a framework for automatically generating semantically meaningful explanations of latent variables in deep generative models. LatentExplainer tackles three main challenges: inferring the meaning of latent variables, aligning explanations with inductive biases, and handling varying degrees of explainability. Our approach perturbs latent variables, interprets changes in generated data, and uses multimodal large language models (MLLMs) to produce human-understandable explanations. We evaluate our proposed method on several real-world and synthetic datasets, and the results demonstrate superior performance in generating high-quality explanations for latent variables. The results highlight the effectiveness of incorporating inductive biases and uncertainty quantification, significantly enhancing model interpretability.
💡 Research Summary
LatentExplainer is a novel framework designed to automatically generate human‑understandable explanations for the latent variables of deep generative models such as Variational Autoencoders (VAEs) and diffusion models. The authors identify three core challenges: (1) inferring the semantic meaning of each latent dimension, (2) aligning explanations with the inductive biases that are often baked into the model (e.g., disentanglement, combination, and conditional biases), and (3) handling the fact that some latent dimensions are intrinsically more explainable than others.
To address (1), the method perturbs each latent variable \(z_i\) by a small amount \(\gamma\) and decodes the perturbed latent vector to obtain a sequence of generated samples. In VAEs the perturbation is a simple additive shift; in diffusion models it follows the formulation \(\tilde{z} = z + \gamma\,\)…