Towards Uncertainty Quantification in Generative Model Learning

Reading time: 5 minutes
...

📝 Abstract

While generative models have become increasingly prevalent across various domains, fundamental concerns regarding their reliability persist. A crucial yet understudied aspect of these models is the uncertainty quantification surrounding their distribution approximation capabilities. Current evaluation methodologies focus predominantly on measuring the closeness between the learned and the target distributions, neglecting the inherent uncertainty in these measurements. In this position paper, we formalize the problem of uncertainty quantification in generative model learning. We discuss potential research directions, including the use of ensemble-based precision-recall curves. Our preliminary experiments on synthetic datasets demonstrate the effectiveness of aggregated precision-recall curves in capturing model approximation uncertainty, enabling systematic comparison among different model architectures based on their uncertainty characteristics.

📄 Content

Towards Uncertainty Quantification in Generative Model Learning

Giorgio Morales, Frederic Jurie, Jalal Fadili
Normandie Univ, UNICAEN, ENSICAEN, CNRS, GREYC, 14000 Caen, France
giorgiomorales@ieee.org, frederic.jurie@unicaen.fr, jalal.fadili@ensicaen.fr

EurIPS 2025 Workshop: Epistemic Intelligence in Machine Learning (EIML@EurIPS 2025) · arXiv:2511.10710v1 [cs.LG] · 13 Nov 2025

Abstract

While generative models have become increasingly prevalent across various domains, fundamental concerns regarding their reliability persist. A crucial yet understudied aspect of these models is the uncertainty quantification surrounding their distribution approximation capabilities. Current evaluation methodologies focus predominantly on measuring the closeness between the learned and the target distributions, neglecting the inherent uncertainty in these measurements. In this position paper, we formalize the problem of uncertainty quantification in generative model learning. We discuss potential research directions, including the use of ensemble-based precision-recall curves. Our preliminary experiments on synthetic datasets demonstrate the effectiveness of aggregated precision-recall curves in capturing model approximation uncertainty, enabling systematic comparison among different model architectures based on their uncertainty characteristics.

1 Introduction

The use of generative artificial intelligence (GenAI) has significantly impacted applications that are integral to everyday life for a diverse range of users, including mobile applications, content generation tools, and search engines [11]. As a result, these models are no longer confined to small teams of researchers or expert users in highly specialized domains. However, they often exhibit complex, opaque behaviors that are difficult for humans to interpret. As these models are rapidly deployed in high-stakes domains, enhancing their reliability and trustworthiness is no longer optional, but a critical necessity. Therefore, it is essential to quantify their uncertainty to establish a measure of confidence in their learned distributions and mitigate potential risks associated with their deployment [9].

Existing works on uncertainty quantification (UQ) primarily address sample-level uncertainty, e.g., the confidence of individual outputs (see Appendix A for a review) [19, 15, 8, 2, 1, 33]. However, these methods overlook the uncertainty of the evaluation metric itself. To our knowledge, no work quantifies the confidence in the measured closeness between the learned and target distributions. Addressing this gap would enable more robust evaluations and comparative studies to assess model stability, improving both training methodologies and uncertainty-aware decision-making.

This aspect is crucial for assessing model reliability, especially in applications that rely on precise distribution alignment. In fields such as experimental particle physics, generative models are increasingly explored as efficient alternatives to traditional Monte Carlo (MC) simulations [12, 20]. MC simulations are computationally expensive yet fundamental for comparing experimental data with theoretical predictions. In such applications, ensuring that the learned distribution closely aligns with the target distribution is critical, as small discrepancies might lead to incorrect interpretations of physical phenomena. Similar challenges arise in weather forecasting, medicine, and molecular biology, where misaligned distributions can lead to unrealistic climate predictions [27], unreliable diagnostic models [31], or non-physical molecular structures [10].
Our contribution is two-fold. First, we provide a formal definition of UQ in generative model learning, distinguishing it from existing approaches that quantify uncertainty in generative models. Second, we outline potential research directions to improve the understanding and measurement of uncertainty in this context.

To establish a foundation for our analysis, we recognize that uncertainty is generally categorized into aleatoric uncertainty, which arises from inherent noise in the data, and epistemic uncertainty, which reflects a lack of knowledge due to limited data coverage or model uncertainty [17, 25]. In this position paper, we specifically focus on model uncertainty, which stems from the choice of model architecture, optimization procedure, and initialization variability.

To begin addressing this, we suggest estimating model uncertainty by analyzing the variability of the precision-recall (PR) curve [26, 29] across multiple training runs with different random initializations. By quantifying this variability, we provide insights into how sensitive a generative model is to training instabilities, which is crucial for assessing its reliability in real-world applications. Preliminary experiments on a synthetic dataset and a diffusion model suggest that ensembles of PR curves help assess how model uncertainty changes with the score network's parameters.
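To make the suggested ensemble construction concrete, here is a minimal sketch of how aggregated PR curves could be computed. This is an illustrative assumption, not the authors' released code: it uses the standard histogram-based precision-recall-for-distributions (PRD) formulation of Sajjadi et al. (2018), and the function names (`prd_curve`, `ensemble_prd`), the k-means bin count, and the toy Gaussian data are all hypothetical choices.

```python
# Hypothetical sketch: ensemble PRD curves over K independently trained generators.
import numpy as np
from sklearn.cluster import KMeans


def prd_curve(real, fake, n_bins=20, n_angles=201, eps=1e-10):
    """One PRD curve: histogram both sample sets over a shared k-means
    binning of the pooled data, then sweep the trade-off parameter lambda."""
    km = KMeans(n_clusters=n_bins, n_init=10).fit(np.vstack([real, fake]))
    p = np.bincount(km.predict(real), minlength=n_bins) / len(real)  # target hist P
    q = np.bincount(km.predict(fake), minlength=n_bins) / len(fake)  # model hist Q
    lam = np.tan(np.linspace(eps, np.pi / 2 - eps, n_angles))        # lambda grid
    # alpha(lam) = sum_i min(lam * P_i, Q_i);  beta(lam) = alpha(lam) / lam
    precision = np.minimum(lam[:, None] * p, q).sum(axis=1)
    recall = precision / lam
    return precision, recall


def ensemble_prd(real, fakes_per_run, **kwargs):
    """Pointwise mean and standard deviation of PRD curves across training
    runs; the std band is the model-uncertainty signal discussed above."""
    curves = np.stack([prd_curve(real, f, **kwargs) for f in fakes_per_run])
    return curves.mean(axis=0), curves.std(axis=0)  # each of shape (2, n_angles)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(0.0, 1.0, size=(2000, 2))  # toy samples from the target
    # Stand-in for K generators trained from different random seeds:
    fakes = [rng.normal(0.1 * k, 1.1, size=(2000, 2)) for k in range(5)]
    (mean_prec, mean_rec), (std_prec, std_rec) = ensemble_prd(real, fakes)
    print(mean_prec[:3], std_prec[:3])
```

Because every run's curve is evaluated on the same lambda grid, pointwise aggregation is well-defined; plotting the mean curve with a ±1 standard deviation band then visualizes how sensitive a model family is to random initialization, which is the kind of model-uncertainty comparison the paper proposes.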

This content is AI-processed from arXiv data.
