Cases for the nugget in modeling computer experiments
Most surrogate models for computer experiments are interpolators, and the most common interpolator is a Gaussian process (GP) that deliberately omits a small-scale (measurement) error term called the nugget. The explanation is that computer experiments are, by definition, “deterministic”, and so there is no measurement error. We think this is too narrow a view of computer experiments and a statistically inefficient way to model them. We show that estimating a (non-zero) nugget can lead to surrogate models with better statistical properties, such as predictive accuracy and coverage, in a variety of common situations.
💡 Research Summary
The paper challenges the prevailing practice in computer‑experiment surrogate modeling of using a deterministic Gaussian process (GP) interpolator that deliberately omits the so‑called nugget term—a small‑scale variance component traditionally interpreted as measurement error. While computer simulations are theoretically deterministic, the authors argue that this view is overly narrow because real‑world implementations inevitably introduce various sources of microscopic uncertainty: numerical round‑off, algorithmic approximations, discretization artifacts, stochastic sub‑components, and preprocessing mistakes. Ignoring these sources by forcing the surrogate to interpolate exactly can lead to over‑fitting, especially when the design is sparse or the input space is high‑dimensional, and it typically results in under‑estimated predictive uncertainty.
The authors first present a Bayesian justification for retaining a non‑zero nugget. In a GP model, the nugget acts as an observation‑error variance, allowing the likelihood to acknowledge that the observed simulator outputs are not perfectly exact. Estimating the nugget (via maximum likelihood or full Bayesian posterior sampling) stabilizes the covariance matrix, improves the conditioning of the Cholesky factor, and yields more reliable estimates of the length‑scale and process variance hyper‑parameters. Consequently, the predictive mean and variance become more realistic, and the model’s uncertainty quantification aligns better with the true variability of the simulator.
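The numerical effect described here can be seen directly: adding a nugget to the kernel matrix bounds its smallest eigenvalue away from zero, so the Cholesky factorization at the heart of GP fitting becomes far better conditioned. The following is a minimal sketch (not the paper's code); the squared-exponential kernel, length-scale, and nugget value are illustrative assumptions.

```python
import numpy as np

# Hypothetical 1-D design; densely packed points make the
# nugget-free covariance matrix nearly singular.
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 1, 30))

def sqexp_cov(X, lengthscale=0.3, variance=1.0):
    """Squared-exponential covariance matrix (no nugget)."""
    d = X[:, None] - X[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

K = sqexp_cov(X)
nugget = 1e-4  # a small estimated noise variance (illustrative)

cond_plain = np.linalg.cond(K)
cond_nugget = np.linalg.cond(K + nugget * np.eye(len(X)))

print(f"condition number without nugget: {cond_plain:.3e}")
print(f"condition number with nugget:    {cond_nugget:.3e}")

# The nugget lifts every eigenvalue by at least `nugget`,
# so the Cholesky factorization succeeds reliably.
L = np.linalg.cholesky(K + nugget * np.eye(len(X)))
```

Because the condition number with the nugget is bounded by roughly `(variance * n + nugget) / nugget`, even a tiny nugget tames the near-singularity that plagues exact interpolation on dense or clustered designs.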
To demonstrate the practical benefits, three experimental settings are examined: (1) a simple one-dimensional function with artificially added tiny noise; (2) a multivariate, highly non-linear synthetic benchmark (a “Bernstein-type” function); and (3) a real engineering simulation involving dozens of input variables and a costly finite-element model. For each case, two GP surrogates are compared: one with an estimated nugget and one that forces exact interpolation. Performance is evaluated using mean squared error (MSE), log-predictive density (LPD), and the empirical coverage of nominal 95% predictive intervals.
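The three metrics named above are standard and straightforward to compute from a surrogate's Gaussian predictive mean and standard deviation. The sketch below (a hypothetical helper, not the paper's code) computes MSE, mean log-predictive density, and empirical interval coverage, and illustrates on synthetic data how an over-confident predictive standard deviation drives coverage well below the nominal 95%.

```python
import numpy as np
from scipy.stats import norm

def surrogate_metrics(y_true, mean, sd, level=0.95):
    """MSE, mean log-predictive density (LPD), and empirical
    coverage of nominal-`level` Gaussian predictive intervals."""
    mse = np.mean((y_true - mean) ** 2)
    lpd = np.mean(norm.logpdf(y_true, loc=mean, scale=sd))
    z = norm.ppf(0.5 + level / 2)  # 1.96 for a 95% interval
    lo, hi = mean - z * sd, mean + z * sd
    coverage = np.mean((y_true >= lo) & (y_true <= hi))
    return mse, lpd, coverage

# Toy illustration: an over-confident predictor (sd too small)
# under-covers, mimicking the pure interpolator's behavior.
rng = np.random.default_rng(1)
y = rng.normal(0.0, 1.0, 10_000)
mse_ok, lpd_ok, cov_ok = surrogate_metrics(y, np.zeros_like(y), np.ones_like(y))
mse_oc, lpd_oc, cov_oc = surrogate_metrics(y, np.zeros_like(y), 0.4 * np.ones_like(y))
print(f"honest sd:      coverage = {cov_ok:.3f}")
print(f"over-confident: coverage = {cov_oc:.3f}")
```

With the honest standard deviation the coverage sits near 0.95; shrinking the standard deviation leaves the point predictions (and hence MSE) unchanged while both LPD and coverage deteriorate, which is exactly the failure mode the comparison is designed to expose.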
Across all scenarios, the nugget-augmented GP consistently outperforms the pure interpolator. MSE and LPD improvements are statistically significant, indicating sharper point predictions and better-calibrated predictive densities. Most importantly, the coverage of the 95% intervals is close to the nominal level for the nugget model, whereas the interpolator often exhibits severe under-coverage (sometimes as low as 60%). This demonstrates that the deterministic GP tends to be over-confident, while the nugget model provides honest uncertainty quantification.
The paper also addresses potential concerns about excessive smoothing introduced by a large nugget. The authors recommend (a) placing weakly informative priors on the nugget variance to keep it within plausible bounds, and (b) employing cross‑validation or marginal likelihood criteria to select an appropriate nugget magnitude. They further show that the same advantages hold for multi‑output GPs, hierarchical GP constructions, and scalable approximations (e.g., inducing‑point methods), suggesting that the nugget concept is broadly applicable beyond the simple single‑output setting.
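Recommendation (b), choosing the nugget magnitude by marginal likelihood, can be sketched with a simple grid search. Everything below (the zero-mean GP, squared-exponential kernel, fixed length-scale, and the toy “simulator”) is an illustrative assumption, not the paper's setup.

```python
import numpy as np

def gp_log_marginal(X, y, lengthscale, nugget, variance=1.0):
    """Log marginal likelihood of a zero-mean GP with a
    squared-exponential kernel plus a nugget term."""
    n = len(X)
    d = X[:, None] - X[None, :]
    K = variance * np.exp(-0.5 * (d / lengthscale) ** 2) + nugget * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L)))
            - 0.5 * n * np.log(2 * np.pi))

# Toy "simulator": a deterministic trend plus tiny numeric jitter,
# standing in for round-off and discretization artifacts.
rng = np.random.default_rng(2)
X = np.linspace(0, 1, 40)
y = np.sin(4 * np.pi * X) + rng.normal(0, 0.05, X.size)

# Grid search over nugget magnitudes; the marginal likelihood
# peaks near the true small-scale noise variance (0.05**2).
grid = np.logspace(-8, 0, 30)
lml = [gp_log_marginal(X, y, lengthscale=0.1, nugget=g) for g in grid]
best = grid[int(np.argmax(lml))]
print(f"nugget maximizing the marginal likelihood: {best:.2e}")
```

In a fully Bayesian treatment, the same computation would be wrapped in a weakly informative prior on the nugget variance (recommendation (a)) rather than a point estimate from a grid.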
In conclusion, the authors argue that treating computer experiments as perfectly deterministic is statistically inefficient. By allowing a non‑zero nugget and estimating it from the data, surrogate models become more robust, achieve higher predictive accuracy, and deliver reliable uncertainty estimates—especially critical when the surrogate is used for optimization, sensitivity analysis, or risk assessment. The paper thus advocates a paradigm shift: from strict interpolation toward a modest, data‑driven acknowledgment of inevitable numerical and modeling noise in computer experiments.