Quantifying and Attributing Submodel Uncertainty in Stochastic Simulation Models and Digital Twins

Reading time: 5 minutes

📝 Original Info

  • Title: Quantifying and Attributing Submodel Uncertainty in Stochastic Simulation Models and Digital Twins
  • ArXiv ID: 2602.16099
  • Date: 2026-02-18
  • Authors: Not provided (the source document does not include an author list)

📝 Abstract

Stochastic simulation is widely used to study complex systems composed of various interconnected subprocesses, such as input processes, routing and control logic, optimization routines, and data-driven decision modules. In practice, these subprocesses may be inherently unknown or too computationally intensive to directly embed in the simulation model. Replacing these elements with estimated or learned approximations introduces a form of epistemic uncertainty that we refer to as submodel uncertainty. This paper investigates how submodel uncertainty affects the estimation of system performance metrics. We develop a framework for quantifying submodel uncertainty in stochastic simulation models and extend the framework to digital-twin settings, where simulation experiments are repeatedly conducted with the model initialized from observed system states. Building on approaches from input uncertainty analysis, we leverage bootstrapping and Bayesian model averaging to construct quantile-based confidence or credible intervals for key performance indicators. We propose a tree-based method that decomposes total output variability and attributes uncertainty to individual submodels in the form of importance scores. The proposed framework is model-agnostic and accommodates both parametric and nonparametric submodels under frequentist and Bayesian modeling paradigms. A synthetic numerical experiment and a more realistic digital-twin simulation of a contact center illustrate the importance of understanding how and how much individual submodels contribute to overall uncertainty.

📄 Full Content

Many stochastic systems studied in operations research and management science can be viewed as collections of interconnected subprocesses, each representing a specific aspect of system behavior, such as demand generation, service dynamics, routing logic, or optimization-based decision making. When such systems are studied using stochastic simulation, the modeler must determine how to model each subprocess. In many cases, the true subprocess cannot be accessed or fully understood and can only be observed in the real world, as when observing customer arrivals in a service system. In other cases, the true subprocess may be accessible, such as when it represents an operational decision made according to a policy or a sophisticated solution method (e.g., optimization), but directly embedding the subprocess within the simulation may be impractical due to concerns about privacy, latency, computational cost, or data accessibility. For these reasons, the modeler may, out of necessity or choice, replace certain subprocesses with approximate representations, such as generative models, simplified decision rules, heuristics, or optimization proxies trained on historical data. These approximations, which we henceforth refer to as submodels, may take several forms.

In the case of modeling stochastic inputs, such as customer demand or service times, probability distributions are fitted to real-world data. In other cases, a submodel may be a function mapping subsystem inputs to outputs trained on input-output data from that subprocess using supervised learning techniques. In both cases, submodel selection can involve trade-offs between accuracy, interpretability, and computational cost.
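As a concrete illustration of the first case, the sketch below fits a parametric input submodel to observed interarrival times and samples from it inside a simulation. This is a minimal example under assumed conditions: the exponential family and the synthetic "observed" data are illustrative choices, not ones made in the paper.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical observed interarrival times (in minutes); in practice these
# would come from real-world logs of the subprocess being approximated.
observed = rng.exponential(scale=2.0, size=200)

# Fit a parametric input submodel: an exponential distribution with the
# location pinned at zero, estimated by maximum likelihood.
_, scale_hat = stats.expon.fit(observed, floc=0)

# The fitted submodel replaces the unknown true arrival process inside the
# simulation: interarrival times are drawn from it during a simulation run.
def sample_interarrivals(n: int) -> np.ndarray:
    return rng.exponential(scale=scale_hat, size=n)

print(f"estimated mean interarrival time: {scale_hat:.3f} minutes")
```

Because the fitted scale parameter depends on the finite sample, the submodel itself is uncertain; this is exactly the epistemic error the framework propagates to the simulation outputs.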

The use of submodels within a stochastic simulation model introduces errors that propagate through the model to its outputs. Understanding how these errors propagate is critical for quantifying uncertainty in estimators of key performance indicators (KPIs) and for directing efforts to reduce that uncertainty. While prior work in the simulation literature has extensively studied input uncertainty (IU) [4,19,23], i.e., the epistemic uncertainty due to estimating stochastic input models from limited data, less attention has been given to uncertainty originating from other types of submodels that influence the internal dynamics of the simulated system, including the decision logic, event ordering, and system state. The errors introduced by both types of submodels can influence simulation behavior in complex and path-dependent ways. Along these lines, [11] adopt methods from IU to study uncertainty arising from embedding machine learning (ML) surrogates of decision-support systems (DSSs) within simulation models.

This paper builds on the ideas of [11] to introduce a unifying framework that encompasses many potential sources of epistemic uncertainty in stochastic simulation models; we collectively refer to this more general form of epistemic uncertainty as submodel uncertainty. The paper presents general-purpose methods for systematically quantifying submodel uncertainty to support and enhance operational decision making. More specifically, we employ bootstrapping and Bayesian model averaging (BMA) to generate plausible submodels that drive a designed simulation experiment. We leverage design of experiments (DOE), specifically stacked Latin hypercube (LH) designs, to explore the space of submodel instances more efficiently when studying systems with multiple submodels. The experiment results are then used to construct quantile-based confidence or credible intervals (CIs) that account for both aleatoric and epistemic uncertainty.

We propose a tree-based method that provides importance scores quantifying how the overall uncertainty can be decomposed into aleatoric and epistemic terms, and how the epistemic uncertainty can be further attributed to individual submodels. These importance scores enable practitioners to identify the most influential submodels and thereby prioritize further efforts to reduce overall uncertainty, for example by acquiring additional training data or refining their modeling specifications. The framework’s strength is its generality: it accommodates both parametric and nonparametric submodels under either the frequentist or Bayesian modeling paradigm. Moreover, whereas some existing approaches for analyzing IU assume that the input models are independent of one another [15], we make no such assumption about the submodels. A minimal sketch of this bootstrap-and-attribute workflow appears below.

Submodel uncertainty is also highly relevant in digital-twin settings, wherein a simulation model is enhanced by integrating real-time or periodically observed system-state data to maintain a synchronized virtual representation of the physical system [25,16,34]. Digital twins are used for monitoring, forecasting, and evaluating alternative operational strategies in real time. Recent work has emphasized the central role of uncertainty quantification (UQ) for reliable prediction and decision support.
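To make the workflow above concrete, here is a minimal, self-contained sketch of the bootstrap-based pipeline: it resamples data to generate plausible submodel instances, runs a toy single-server-queue simulation under each instance, forms a quantile-based interval for the KPI, and fits a tree ensemble on the submodel parameters to produce importance scores. The queue model, the sample sizes, and the use of scikit-learn's impurity-based feature importances are illustrative assumptions standing in for the paper's stacked LH designs, BMA, and purpose-built tree decomposition.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)

# Hypothetical "real-world" data for two submodels of a single-server
# queue: interarrival times and service times (assumed data, for
# illustration only).
arrival_data = rng.exponential(scale=1.0, size=100)
service_data = rng.exponential(scale=0.8, size=100)

def simulate_mean_wait(arr_scale, svc_scale, n=500):
    """Toy queue simulation via the Lindley recursion for waiting times."""
    inter = rng.exponential(scale=arr_scale, size=n)
    service = rng.exponential(scale=svc_scale, size=n)
    waits = np.empty(n)
    waits[0] = 0.0
    for i in range(1, n):
        waits[i] = max(0.0, waits[i - 1] + service[i - 1] - inter[i])
    return waits.mean()

# Step 1: bootstrap the data to generate plausible submodel instances.
# For an exponential model the MLE of the scale is the sample mean, so
# each bootstrap resample yields one fitted submodel instance.
B, R = 50, 5  # bootstrap instances, replications per instance
params, outputs = [], []
for _ in range(B):
    a_scale = rng.choice(arrival_data, size=arrival_data.size).mean()
    s_scale = rng.choice(service_data, size=service_data.size).mean()
    for _ in range(R):  # aleatoric replications under a fixed instance
        params.append((a_scale, s_scale))
        outputs.append(simulate_mean_wait(a_scale, s_scale))
X, y = np.array(params), np.array(outputs)

# Step 2: quantile-based interval for the KPI, pooling epistemic
# (across-instance) and aleatoric (within-instance) variability.
lo, hi = np.quantile(y, [0.025, 0.975])
print(f"95% interval for mean wait: [{lo:.3f}, {hi:.3f}]")

# Step 3: tree-based attribution. Regress the outputs on the submodel
# parameters and read off impurity-based feature importances as rough
# per-submodel importance scores (a stand-in for the paper's method).
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
for name, score in zip(["arrival submodel", "service submodel"],
                       forest.feature_importances_):
    print(f"{name}: importance score {score:.2f}")
```

In the paper's framework the attribution comes from a purpose-built tree decomposition rather than off-the-shelf forest importances, but the interface is the same: parameters of plausible submodel instances in, a per-submodel importance score out.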

Reference

This content is AI-processed based on open access ArXiv data.
