Extending the application of dynamic Bayesian networks in calculating market risk: Standard and stressed expected shortfall
In the last five years, expected shortfall (ES) and stressed ES (SES) have become key required regulatory measures of market risk in the banking sector, especially following events such as the global financial crisis. Thus, finding ways to optimize their estimation is of great importance. We extend the application of dynamic Bayesian networks (DBNs) to the estimation of 10-day 97.5% ES and stressed ES, building on prior work applying DBNs to value at risk. Using the S&P 500 index as a proxy for the equities trading desk of a US bank, we compare the performance of three DBN structure-learning algorithms with several traditional market risk models, using either the normal or the skewed Student’s t return distributions. Backtesting shows that all models fail to produce statistically accurate ES and SES forecasts at the 2.5% level, reflecting the difficulty of modeling extreme tail behavior. For ES, the EGARCH(1,1) model (normal) produces the most accurate forecasts, while, for SES, the GARCH(1,1) model (normal) performs best. All distribution-dependent models deteriorate substantially when using the skewed Student’s t distribution. The DBNs perform comparably to the historical simulation model, but their contribution to tail prediction is limited by the small weight assigned to their one-day-ahead forecasts within the return distribution. Future research should examine weighting schemes that enhance the influence of forward-looking DBN forecasts on tail risk estimation.
💡 Research Summary
The paper investigates the use of dynamic Bayesian networks (DBNs) for forecasting 10‑day 97.5 % Expected Shortfall (ES) and Stressed Expected Shortfall (SES) on the S&P 500 index, extending earlier work that applied DBNs to Value‑at‑Risk. Using daily closing prices from March 1991 to February 2020, the authors compare three DBN structure‑learning algorithms with a suite of traditional market‑risk models: historical simulation, delta‑normal, ARCH(1), GARCH(1,1), EGARCH(1,1) and RiskMetrics. Each parametric model is calibrated under both the normal and the skewed Student’s‑t distributions; the historical and delta‑normal approaches are distribution‑agnostic.
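For orientation, the conditional-variance recursions behind two of the listed benchmarks, GARCH(1,1) and RiskMetrics, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, the parameters are supplied by the caller rather than estimated, and the RiskMetrics decay factor of 0.94 is the conventional value for daily data, not one confirmed by the paper.

```python
def garch11_variance(returns, omega, alpha, beta):
    """One-step-ahead GARCH(1,1) variances.

    Recursion: sigma2[t+1] = omega + alpha * r[t]**2 + beta * sigma2[t].
    Parameters are assumed given (hypothetical values), not fitted here.
    """
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need omega > 0, alpha, beta >= 0, alpha + beta < 1")
    # Seed with the unconditional variance omega / (1 - alpha - beta).
    sigma2 = [omega / (1.0 - alpha - beta)]
    for r in returns:
        sigma2.append(omega + alpha * r * r + beta * sigma2[-1])
    return sigma2


def ewma_variance(returns, initial, lam=0.94):
    """RiskMetrics-style EWMA variances.

    Recursion: sigma2[t+1] = lam * sigma2[t] + (1 - lam) * r[t]**2,
    i.e. GARCH(1,1) with omega = 0, alpha = 1 - lam, beta = lam.
    `initial` is the caller-supplied starting variance (an assumption).
    """
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    sigma2 = [initial]
    for r in returns:
        sigma2.append(lam * sigma2[-1] + (1.0 - lam) * r * r)
    return sigma2
```

In both functions the returned list has one more element than the input: the last entry is the variance forecast for the day after the final observed return.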
The stressed period is constructed by aggregating the most severe loss days of the S&P 500 (not necessarily consecutive), following the methodology of Gross et al. (2025). ES and SES are computed as the average loss beyond the 10‑day 97.5 % VaR, consistent with Basel III/IV regulatory definitions. Model performance is evaluated using the Basel Committee’s traffic‑light test together with three more rigorous back‑tests: the conditional test, the minimally biased test (both Acerbi & Székely) and the Du‑Escanciano independence test.
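The definition of ES as the average loss beyond VaR can be illustrated with a simple empirical estimator. The sketch below is an assumption-laden example rather than the paper's procedure: the function names are hypothetical, overlapping 10-day log returns are one possible convention, and the tail is taken as the worst 2.5% of observations (rounded to a whole number of data points).

```python
import math


def horizon_log_returns(prices, horizon=10):
    """Overlapping `horizon`-day log returns from a list of closing prices."""
    return [math.log(prices[i + horizon] / prices[i])
            for i in range(len(prices) - horizon)]


def empirical_var_es(returns, level=0.975):
    """Empirical VaR and ES at `level`, both reported as positive losses.

    The tail is the worst round(n * (1 - level)) observations (at least one);
    VaR is the smallest loss in that tail and ES is the tail average.
    """
    if not returns:
        raise ValueError("returns must be non-empty")
    losses = sorted(-r for r in returns)  # ascending: largest losses last
    n_tail = max(1, int(round(len(losses) * (1.0 - level))))
    tail = losses[-n_tail:]
    var = tail[0]
    es = sum(tail) / len(tail)
    return var, es
```

Applying `empirical_var_es` to returns drawn only from the stressed window would give the corresponding stressed figure under the same assumptions.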
The results show that none of the models achieves statistical accuracy at the 2.5% level, underscoring the difficulty of tail-risk estimation. Under the normal distribution, EGARCH(1,1) delivers the most accurate ES forecasts, while GARCH(1,1) is best for SES. When the skewed Student's-t distribution is employed, all distribution-dependent models deteriorate markedly, with historical simulation and RiskMetrics severely underestimating tail risk. The DBNs perform on par with historical simulation but contribute little to tail prediction, because their one-day-ahead forecasts receive only a small weight in the ten-day return distribution.
The authors conclude that, in its current formulation, the DBN approach does not outperform well‑established GARCH‑type models for ES/SES estimation. They attribute the limited impact to the weighting scheme and to instability introduced by the skewed t‑distribution. Future research directions include developing adaptive weighting mechanisms that amplify the influence of forward‑looking DBN forecasts, integrating DBNs within ensemble frameworks, and exploring Bayesian regularisation to stabilise parameter estimation under heavy‑tailed, asymmetric distributions. Extending the analysis to longer horizons, other asset classes, and more recent market data is also recommended to validate the robustness of the proposed methodology.