Estimating changes in extreme quantiles over time, applied to desert temperatures

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

We quantify changes ΔQ in 100-year return values for regional annual maxima and minima of near-surface atmospheric temperature from output of five CMIP6 models, for five of the Earth’s desert regions, over the interval (2025,2125). We use generalised extreme value (GEV) regression to characterise changes in extremes, considering a range of different parametric forms for the variation of GEV parameters with time, and coupling models for different scenarios so that they provide a common GEV tail in the first year of observation. Parameters are estimated using Bayesian inference. We perform a simulation study using ground truth models generating data qualitatively similar to the CMIP6 output, to assess the relative performance of different information criteria in selecting models from a set of candidates, to minimise error in predictions of ΔQ. The Bayesian information criterion (BIC) provides best performance, out-performing the deviance and widely-applicable information criteria in particular. Using BIC-selected GEV regression models, we estimate joint posterior distributions of ΔQ over three forcing scenarios, for different combinations of region, GCM and climate ensemble. Estimates show a consistent trend across regions, GCMs and climate ensembles, of ΔQ increasing with climate scenario for both regional annual maxima and minima. Aggregating posterior distributions over climate ensembles and GCMs, we find evidence for significant increases in ΔQ for regional annual maxima under more severe forcing scenarios for all desert regions. Similar but weaker and less significant trends are observed for regional annual minima.

💡 Research Summary

The paper tackles the challenging problem of estimating how extreme temperature quantiles will evolve over the next century in the world’s major desert regions, using output from five CMIP6 global climate models (GCMs). The authors focus on the change ΔQ in the 100‑year return value between the years 2025 and 2125 for both annual maxima and minima of near‑surface air temperature (tas).

Data and preprocessing – For each of the five GCMs (ACCESS‑CM2, CESM2, EC‑Earth3, MRI‑ESM2‑0, UKESM1‑0‑LL) the authors extract daily tas over five desert regions (Antarctic, Dasht‑e‑Lut, Mojave, Sahara, Simpson) and a temperate UK control region. Annual block maxima and minima are computed for each calendar year from 2015 to 2100, yielding 86 observations per series. Three Shared Socio‑Economic Pathway (SSP) forcing scenarios are considered (SSP126, SSP245, SSP585). Where available, up to five ensemble members per scenario are included.
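The block-extreme step can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name, the input layout (pairs of year and temperature, pooled over all days and grid points of a region) and the toy values are assumptions.

```python
from collections import defaultdict

def annual_block_extremes(records):
    """Reduce (year, temperature) pairs to {year: (annual_min, annual_max)}."""
    by_year = defaultdict(list)
    for year, temp in records:
        by_year[year].append(temp)
    return {year: (min(vals), max(vals)) for year, vals in sorted(by_year.items())}

# Hypothetical toy input in kelvin: two values per year instead of a full daily record.
toy = [(2015, 301.2), (2015, 318.4), (2016, 299.0), (2016, 320.1)]
extremes = annual_block_extremes(toy)

# The calendar years 2015 to 2100 inclusive give the 86 blocks mentioned above.
n_blocks = len(range(2015, 2101))
```

Each series then contributes one minimum and one maximum per year to the extreme-value fit.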

Statistical modelling – The authors adopt a parametric Generalised Extreme Value (GEV) regression framework. The three GEV parameters – location μ, scale σ, and shape ξ – are allowed to vary with a normalized time index τ and with scenario index j (j = 1,2,3). Three functional forms are examined for each parameter η ∈ {μ, σ, ξ}: constant, linear, and quadratic in τ, all special cases of η_j(τ) = η_0 + τ η_{1j} + τ² η_{2j}. Crucially, the models are constrained so that the three scenarios share the same GEV tail distribution in the first year (2015), reflecting the common initial climate state across scenarios; with τ = 0 in that year, this corresponds to the intercept η_0 carrying no scenario index.
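The shared-start constraint can be illustrated with a short sketch. The normalisation of τ to [0, 1] over 2015 to 2100, the function names and every numeric value below are illustrative assumptions, not taken from the paper.

```python
def normalised_time(year, first=2015, last=2100):
    """Map a calendar year to tau, with tau = 0 in the first year (assumed convention)."""
    return (year - first) / (last - first)

def gev_param(tau, eta0, eta1=0.0, eta2=0.0):
    """Constant (eta1 = eta2 = 0), linear (eta2 = 0) or quadratic form in tau."""
    return eta0 + tau * eta1 + tau ** 2 * eta2

# Hypothetical location parameter: one shared intercept, scenario-specific slopes.
mu0 = 320.0
mu_slopes = {"SSP126": 0.5, "SSP245": 1.5, "SSP585": 4.0}

mu_2015 = {s: gev_param(normalised_time(2015), mu0, b) for s, b in mu_slopes.items()}
mu_2100 = {s: gev_param(normalised_time(2100), mu0, b) for s, b in mu_slopes.items()}
```

Because only the slope and curvature terms depend on the scenario, all three scenarios return the same value at τ = 0 and diverge afterwards.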

Bayesian inference – Parameter estimation is performed in a fully Bayesian manner using Markov chain Monte Carlo (MCMC). This yields posterior samples for all GEV parameters, from which the posterior distribution of the 100‑year return level at any year can be derived. The change ΔQ is then obtained as the difference between the posterior return levels for 2025 and 2125, allowing a direct quantification of uncertainty.
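As a hedged sketch of this step, the T-year return level is the GEV quantile with non-exceedance probability 1 - 1/T, and ΔQ can be computed draw by draw from posterior samples. The draw layout (linear μ, constant σ and ξ), the τ values and the numbers are made up for illustration; the paper's MCMC output is not reproduced here.

```python
import math

def gev_return_level(mu, sigma, xi, T=100.0):
    """T-year return level: the GEV quantile with non-exceedance probability 1 - 1/T."""
    y = -math.log(1.0 - 1.0 / T)
    if abs(xi) < 1e-9:  # Gumbel limit as xi -> 0
        return mu - sigma * math.log(y)
    return mu + sigma / xi * (y ** (-xi) - 1.0)

def delta_q(draw, tau_start, tau_end, T=100.0):
    """Change in return level for one posterior draw (linear mu, constant sigma, xi)."""
    def q(tau):
        mu = draw["mu0"] + tau * draw["mu1"]
        return gev_return_level(mu, draw["sigma"], draw["xi"], T)
    return q(tau_end) - q(tau_start)

# Made-up posterior draws, for illustration only.
draws = [
    {"mu0": 320.0, "mu1": 3.0, "sigma": 1.2, "xi": -0.20},
    {"mu0": 320.3, "mu1": 3.4, "sigma": 1.1, "xi": -0.15},
    {"mu0": 319.8, "mu1": 2.7, "sigma": 1.3, "xi": -0.25},
]
dq_samples = sorted(delta_q(d, tau_start=0.0, tau_end=1.0) for d in draws)
```

The sorted values form an empirical posterior for ΔQ, from which a mean or credible interval can be read off. For annual minima, a common device is to apply the same formula to negated data; whether the authors do so is not stated in this summary.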

Model selection – To choose among the candidate GEV regression specifications, the authors compare four information criteria: Akaike (AIC), Bayesian (BIC), Deviance (DIC), and Widely Applicable (WAIC). Because the relative performance of such criteria is problem‑specific, they conduct a dedicated simulation study. Synthetic datasets are generated from “ground‑truth” models that mimic the statistical properties of the CMIP6 output (small sample size, non‑stationarity, shared initial tail). For each simulated dataset the four criteria are used to select a model, and the resulting ΔQ predictions are compared to the known truth. The simulation results demonstrate that BIC consistently yields the smallest prediction error, outperforming AIC, DIC, and WAIC. The authors attribute this to BIC’s stronger penalty for model complexity, which guards against over‑fitting in the limited‑sample extreme‑value context.
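The effect of the complexity penalty can be seen in a small sketch using the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of parameters and n the sample size. The candidate names and log-likelihood values are hypothetical, and evaluating the likelihood at a point estimate is an assumption about how the criteria are applied.

```python
import math

def gev_loglik(xs, mu, sigma, xi):
    """GEV log-likelihood of a sample; -inf if any point lies outside the support."""
    total = 0.0
    for x in xs:
        s = (x - mu) / sigma
        if abs(xi) < 1e-9:  # Gumbel limit
            total += -math.log(sigma) - s - math.exp(-s)
        else:
            z = 1.0 + xi * s
            if z <= 0.0:
                return -math.inf
            total += -math.log(sigma) - (1.0 + 1.0 / xi) * math.log(z) - z ** (-1.0 / xi)
    return total

def aic(loglik, k):
    return 2.0 * k - 2.0 * loglik

def bic(loglik, k, n):
    return k * math.log(n) - 2.0 * loglik

# Hypothetical maximised log-likelihoods for two candidates fitted to n = 86 values.
n = 86
candidates = {"linear mu": (-150.0, 4), "quadratic mu": (-148.5, 5)}  # (loglik, k)
best_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_bic = min(candidates, key=lambda m: bic(*candidates[m], n))
```

With n = 86, each extra parameter costs ln(86) ≈ 4.45 under BIC against 2 under AIC, so in this toy comparison AIC prefers the quadratic model while BIC keeps the simpler linear one.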

Results – Using the BIC‑selected models, posterior distributions of ΔQ are estimated for every combination of desert region, GCM, scenario, and ensemble member. The main findings are:

For all desert regions, ΔQ for annual maxima increases with scenario severity (SSP126 < SSP245 < SSP585). The increase is statistically significant for the strongest forcing (SSP585) across all GCMs.
For annual minima, a similar upward trend is observed, but the magnitude is smaller and statistical significance is weaker, reflecting the generally lower sensitivity of cold extremes to greenhouse‑gas forcing in arid environments.
Aggregating over ensembles and GCMs yields robust evidence that extreme hot temperatures (maxima) will become markedly more extreme under high‑forcing pathways, while extreme cold temperatures show more modest changes.
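One simple way to aggregate, sketched below, is to pool posterior draws of ΔQ across GCMs and ensemble members and report the posterior probability that ΔQ exceeds a threshold. The summary does not state how the authors weight sources or assess significance, so the equal-weight pooling, the function name and the numbers are all assumptions.

```python
def pooled_exceedance(draws_by_source, threshold=0.0):
    """Pool draws from all sources and return the fraction above the threshold."""
    pooled = [d for draws in draws_by_source.values() for d in draws]
    return sum(d > threshold for d in pooled) / len(pooled)

# Made-up Delta-Q draws (kelvin) for two hypothetical GCM and ensemble combinations.
toy_draws = {
    "gcm_a_member_1": [1.0, 2.0, -1.0],
    "gcm_b_member_1": [3.0],
}
prob_positive = pooled_exceedance(toy_draws)
```

Pooling this way gives more weight to sources with more draws; using the same number of draws per source avoids that.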

The authors also explore Bayesian model averaging (stacking) as an alternative to single‑model selection. While stacking can improve predictive performance in some settings, in this application the BIC‑chosen single model provides comparable or better uncertainty quantification and is easier to interpret.

Discussion and limitations – The study does not apply bias‑correction or down‑scaling to the GCM outputs, arguing that the ΔQ difference is invariant to a constant bias but acknowledging that uncorrected model bias may still affect the absolute magnitude of extremes. The authors note the uneven number of grid points across regions (e.g., many points for Sahara vs. few for Dasht‑e‑Lut) and the relatively short 86‑year series as potential sources of variability. They suggest future work could incorporate high‑resolution down‑scaling, observation‑based calibration, and semi‑parametric or non‑parametric GEV extensions (splines, Gaussian processes, neural networks) to capture more complex non‑stationarity.

Conclusion – The paper provides a rigorous Bayesian workflow for estimating changes in extreme temperature quantiles under climate change, finds that the Bayesian Information Criterion is the most reliable of the criteria compared for model selection in this setting, and delivers clear evidence that desert heat extremes are poised to increase substantially under stronger greenhouse‑gas forcing scenarios. This has important implications for risk assessment, adaptation planning, and the broader understanding of climate‑driven extreme events.
