Estimation and model errors in Gaussian-process-based Sensitivity Analysis of functional outputs

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

Global sensitivity analysis (GSA) of functional-output models is usually performed by combining statistical techniques such as basis expansions, metamodeling and sampling-based estimation of sensitivity indices. When the truncation error of the basis expansion is neglected, two main sources of error propagate to the final sensitivity indices: the metamodeling error and the sampling-based, or pick-freeze (PF), estimation error. This work provides an efficient algorithm to estimate these errors within the framework of Gaussian processes (GP), based on the approach of Le Gratiet et al. [16]. The proposed algorithm takes advantage of the fact that the number of basis coefficients of the expanded model outputs is significantly smaller than the output dimension. Basis coefficients are fitted by GP models and multiple conditional GP trajectories are sampled. Then, vector-valued PF estimation is used to speed up the estimation of Sobol indices and generalized sensitivity indices (GSI). We illustrate the methodology on an analytical test case and on an application in non-Newtonian hydraulics, modelling an idealized dam-break flow. Numerical tests show a 15-fold reduction in computational time compared with applying the algorithm of Le Gratiet et al. [16] separately to each output dimension.


💡 Research Summary

The paper addresses a critical bottleneck in global sensitivity analysis (GSA) of models that produce functional (high‑dimensional) outputs. Traditional GSA relies on Sobol indices or their generalized counterpart (GSI) and typically employs Pick‑Freeze (PF) estimators, which require thousands of model evaluations. When the underlying model is expensive, a surrogate (metamodel) such as Gaussian Process Regression (GPR) is used. However, the state‑of‑the‑art error‑propagation method of Le Gratiet et al. (2014) was designed for scalar outputs; applying it dimension‑by‑dimension to functional outputs leads to a prohibitive computational cost because each output dimension would need its own set of GP trajectory samples and bootstrap resampling.

The authors propose an efficient algorithm that exploits the fact that functional outputs can be represented in a low‑dimensional basis. After expanding the output \(y(x)\) on a pre‑chosen functional basis (e.g., Fourier, POD), only the most energetic coefficients (typically far fewer than the original number of output points) are retained. Each retained coefficient is modeled independently with a Gaussian Process. Because the number of coefficients \(K\) is much smaller than the original output dimension \(L\), the surrogate construction and subsequent sampling become dramatically cheaper.
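As an illustration of the basis-reduction step, the following minimal sketch computes a POD basis with a thin SVD and keeps the leading coefficients. It is not the authors' implementation: the function name `pod_expand`, the `energy` threshold of 0.99, and the toy two-input model are assumptions made only for this example.

```python
import numpy as np

def pod_expand(Y, energy=0.99):
    """Project discretized outputs onto a truncated POD basis.

    Y has shape (n, L): n model runs, each discretized on L points.
    `energy` is a hypothetical threshold on the retained variance fraction.
    """
    mean = Y.mean(axis=0)
    Yc = Y - mean
    # Thin SVD: the rows of Vt are orthonormal basis vectors of length L.
    U, s, Vt = np.linalg.svd(Yc, full_matrices=False)
    frac = np.cumsum(s**2) / np.sum(s**2)
    # Smallest K whose cumulative variance fraction reaches `energy`.
    K = int(np.searchsorted(frac, energy) + 1)
    basis = Vt[:K]            # shape (K, L)
    coeffs = Yc @ basis.T     # shape (n, K)
    return mean, basis, coeffs

# Toy functional model (an assumption): two inputs, L = 50 output points.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
X = rng.uniform(size=(200, 2))
Y = X[:, [0]] * np.sin(2 * np.pi * t) + X[:, [1]] ** 2 * np.cos(2 * np.pi * t)

mean, basis, coeffs = pod_expand(Y)
Y_rec = mean + coeffs @ basis   # reconstruction from K coefficients
print(basis.shape, coeffs.shape)
```

In this toy case the centered outputs span only two directions, so two coefficients reproduce all 50 output points. Each column of `coeffs` would then be the training target of one independent GP.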

The algorithm proceeds as follows: (1) draw two independent Monte Carlo samples of the input vector, forming the PF pair \((X, X')\); (2) construct the PF input sets \((x, \tilde{x})\), where \(\tilde{x}\) shares (freezes) the components of interest with \(x\) and takes its remaining components from the second sample; (3) for each of \(N_Z\) GP trajectories, sample the conditional GP at the \(2N\) locations defined by the PF inputs, obtaining simulated values of the retained coefficients; (4) compute vector‑valued PF estimators of the un‑normalized Sobol matrix \(cD^{pf}_I\) and the total variance matrix \(bD^{pf}\) using the matrix formulas given in the paper (Definition 2); (5) obtain the GSI by taking the trace ratio \(\text{GSI}_I = \frac{\operatorname{Tr}(cD^{pf}_I)}{\operatorname{Tr}(bD^{pf})}\) (Definitions 3 and 4).
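Steps (1), (2), (4) and (5) can be sketched for a vector-valued output as follows. This is a hedged illustration, not the paper's Definition 2: it uses a common symmetrized pick-freeze covariance estimator, and the function `pick_freeze_gsi`, the linear test model `f`, and the matrix `A` are assumptions. In the actual algorithm, `f` would be replaced by one conditional GP trajectory of the retained coefficients.

```python
import numpy as np

def pick_freeze_gsi(f, d, idx, N, rng):
    """Estimate the GSI of the inputs `idx` for a vector-valued model f.

    f maps an (N, d) input array to an (N, K) output array.
    Inputs are assumed independent and uniform on [0, 1].
    """
    X = rng.uniform(size=(N, d))      # first Monte Carlo sample
    Xp = rng.uniform(size=(N, d))     # second, independent sample
    Xt = Xp.copy()
    Xt[:, idx] = X[:, idx]            # freeze the components of interest
    Y, Yt = f(X), f(Xt)
    # Symmetrized estimators (one common choice, assumed here).
    m = 0.5 * (Y.mean(axis=0) + Yt.mean(axis=0))
    C_I = 0.5 * (Y.T @ Yt + Yt.T @ Y) / N - np.outer(m, m)
    Sigma = 0.5 * (Y.T @ Y + Yt.T @ Yt) / N - np.outer(m, m)
    # GSI as the ratio of traces of the two K x K matrices.
    return np.trace(C_I) / np.trace(Sigma)

# Hypothetical linear model with K = 3 outputs and d = 2 inputs.
A = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
f = lambda X: X @ A.T

rng = np.random.default_rng(1)
gsi_1 = pick_freeze_gsi(f, 2, [0], 200_000, rng)
gsi_2 = pick_freeze_gsi(f, 2, [1], 200_000, rng)
print(gsi_1, gsi_2)
```

For this linear model with independent, identically distributed inputs, the exact values are 2/7 for the first input and 5/7 for the second (ratios of squared column sums of `A`), which gives a simple check on the estimator.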

To separate the metamodel‑only uncertainty from the total uncertainty (metamodel + PF estimation), the authors employ a bootstrap scheme on the PF input indices. The first bootstrap replicate uses the original PF output samples; subsequent replicates are generated by resampling the indices of the already simulated GP outputs, thus avoiding any additional GP draws. This yields two families of Sobol/GSI estimates: (i) the “metamodel‑only” distribution (no bootstrap, only different GP trajectories) and (ii) the “overall” distribution (bootstrap + different GP trajectories). Both distributions can be summarized by means, variances, and confidence intervals, providing a comprehensive error quantification.
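The bookkeeping behind the two families of estimates can be sketched as below. This is only an illustration under strong assumptions: real conditional GP trajectories are replaced by a stand-in (a randomly perturbed linear map `A_z`), and the names `gsi_from_outputs`, `N_Z`, `B` and the perturbation scale 0.05 are hypothetical. The point shown is that bootstrap replicates reuse the already simulated outputs by resampling row indices, without drawing new trajectories.

```python
import numpy as np

def gsi_from_outputs(Y, Yt):
    """Trace-ratio GSI from paired pick-freeze outputs of shape (N, K)."""
    N = Y.shape[0]
    m = 0.5 * (Y.mean(axis=0) + Yt.mean(axis=0))
    num = np.sum(Y * Yt) / N - m @ m
    den = 0.5 * (np.sum(Y * Y) + np.sum(Yt * Yt)) / N - m @ m
    return num / den

rng = np.random.default_rng(2)
N, N_Z, B = 2000, 20, 30
A = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])

# Pick-freeze inputs for the first input variable.
X = rng.uniform(size=(N, 2))
Xp = rng.uniform(size=(N, 2))
Xt = Xp.copy()
Xt[:, 0] = X[:, 0]

est = np.empty((N_Z, B))
for z in range(N_Z):
    # Stand-in for one conditional GP trajectory (NOT a real GP draw).
    A_z = A + 0.05 * rng.standard_normal(A.shape)
    Y, Yt = X @ A_z.T, Xt @ A_z.T
    est[z, 0] = gsi_from_outputs(Y, Yt)       # original PF sample
    for b in range(1, B):
        i = rng.integers(0, N, size=N)        # resample PF indices only
        est[z, b] = gsi_from_outputs(Y[i], Yt[i])

meta_only = est[:, 0]     # spread across trajectories only
overall = est.ravel()     # trajectories combined with bootstrap replicates
ci = np.percentile(overall, [2.5, 97.5])
print(meta_only.mean(), ci)
```

Resampling the same index vector for `Y` and `Yt` keeps each pick-freeze pair intact, which is what makes the bootstrap replicate a valid PF sample.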

The methodology is validated on two case studies. The first is a synthetic analytical function with 50 output dimensions, reduced to a handful of basis coefficients. The second is a realistic non‑Newtonian hydraulic simulation of an idealized dam‑break flow, where a Proper Orthogonal Decomposition (POD) yields the dominant modes. In both cases, the proposed basis‑derived GP algorithm achieves roughly a 15‑fold reduction in computational time compared with the naïve application of Le Gratiet et al.’s algorithm to each output dimension separately, while delivering comparable error estimates. Moreover, the approach naturally extends to GSI, allowing practitioners to assess the global influence of each input over the entire functional domain.

Key contributions of the work are: (1) a unified framework for quantifying both surrogate‑model error and PF estimation error in functional GSA; (2) a basis‑derived dimensionality reduction that makes the Le Gratiet error‑propagation algorithm scalable to high‑dimensional outputs; (3) the extension of the error analysis to generalized sensitivity indices, which are often more informative for functional outputs. The paper also discusses limitations, such as the dependence on an appropriate basis selection and the assumption of independent inputs, and outlines future research directions, including adaptive basis selection, handling correlated inputs, multi‑output GP models that capture cross‑coefficient correlations, and integration with active learning strategies for sequential experimental design.

Overall, the study provides a practical, theoretically sound, and computationally efficient solution for error‑aware sensitivity analysis of complex functional models, opening the door to more reliable uncertainty quantification in fields ranging from fluid dynamics to climate modeling and beyond.

