Trustworthiness Layer for Foundation Models in Power Systems: Application for N-k Contingency Assessment

Notice: This research summary and analysis were automatically generated using AI technology. For authoritative details, please refer to the original arXiv paper.

This work introduces for the first time, to our knowledge, a trustworthiness layer for foundation models in power systems. Using stratified conformal prediction, we devise adaptive, statistically valid confidence bounds for each output of a foundation model. For regression, this allows users to obtain an uncertainty estimate for each output; for screening, it supports conservative decisions that minimize false negatives. We demonstrate our method by enhancing GridFM, the first open-source Foundation Model for power systems, with statistically valid prediction intervals instead of heuristic error margins. We apply it for N-k contingency assessment, a combinatorial NP-Hard problem. We show that trustworthy GridFM can offer richer and more accurate information than DC Power Flow, having 2x-3x higher precision, while running up to 18x faster than AC Power Flow for systems up to 118 buses. Moving a step further, we also examine the ability of trustworthy GridFM to generalize to unseen high-order contingencies: through a rigorous analysis, we assess how a model trained on N-1 or N-2 outages extrapolates to unseen contingencies up to N-5.

💡 Research Summary

This paper introduces what the authors describe as the first “trustworthiness layer” for foundation models (FMs) in power systems: a wrapper that endows the model with statistically valid uncertainty estimates. The authors build on GridFM, an open‑source graph‑transformer foundation model that learns the physics of AC power flow from massive, topology‑rich data. By wrapping GridFM with a stratified conformal prediction (SCP) module, each regression output (bus voltage magnitudes, angles, and derived line loadings) receives a calibrated prediction interval that targets a user‑specified coverage probability (e.g., 95 %). For screening tasks, the upper bound of the interval is used conservatively, thereby minimizing false‑negative (missed‑violation) decisions.

The paper first motivates the need for such a layer. Modern transmission systems are increasingly stressed by variable renewable generation, electrified heating and transport, and consequently exhibit higher volatility. Traditional security assessment relies on solving the full non‑linear AC power‑flow (ACPF) for thousands of N‑k contingencies, an NP‑hard task that is far too slow for real‑time operation. Industry therefore resorts to linearized DC power‑flow (DCPF), which sacrifices voltage magnitude and reactive‑power fidelity and cannot detect voltage instability. Recent data‑driven approaches—regression, graph neural networks (GNNs), and most recently foundation models—offer dramatic speedups but remain “black‑box” and lack trustworthy uncertainty quantification, raising safety concerns. Moreover, models trained on routine N‑1 or N‑2 outages often fail to extrapolate to rarer, higher‑order N‑k events.
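To make the combinatorial growth of N‑k screening concrete, the following minimal sketch counts the distinct outage combinations for increasing k. The branch count of 100 is a hypothetical figure chosen for illustration and does not come from the paper.

```python
from math import comb


def count_contingencies(n_branches: int, k: int) -> int:
    """Number of distinct ways to lose exactly k out of n_branches branches."""
    return comb(n_branches, k)


# Hypothetical network size, chosen only for illustration.
N_BRANCHES = 100

# Each entry would require its own power-flow solution in an exhaustive study.
counts = {k: count_contingencies(N_BRANCHES, k) for k in range(1, 6)}

for k, n_cases in counts.items():
    print(f"N-{k}: {n_cases:,} cases")
```

Even for this small hypothetical network, the case count grows from 100 at N‑1 to more than 75 million at N‑5, which is why exhaustive ACPF evaluation quickly becomes impractical.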

The authors’ methodology consists of three pillars: (1) adaptation of the pre‑trained GridFM backbone, which processes the power‑grid graph with multi‑head graph attention layers; (2) a physics‑informed fine‑tuning stage that forces the model to reconstruct full AC states (voltage magnitudes and angles) from masked inputs, preserving Kirchhoff’s laws; and (3) a calibrated uncertainty quantification layer based on stratified conformal prediction. The SCP procedure computes a non‑conformity score on a held‑out calibration set, partitions the calibration data into strata, and derives a separate interval width for each stratum. The resulting prediction intervals adapt to the stratum of each test case and carry a finite‑sample coverage guarantee that does not assume Gaussian errors; like all conformal methods, the guarantee relies on calibration and test data being exchangeable.
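The stratified conformal step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the non‑conformity score is the absolute residual and that strata are contingency orders ("N-1", "N-2"), neither of which is confirmed by the summary. All function names and the toy calibration data are hypothetical.

```python
import math
from collections import defaultdict


def conformal_quantile(scores, alpha):
    """Finite-sample (1 - alpha) quantile used in split conformal prediction."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-based rank of the score to pick
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[rank - 1]


def calibrate(strata, y_true, y_pred, alpha):
    """Compute one interval half-width per stratum from absolute residuals."""
    residuals = defaultdict(list)
    for s, y, p in zip(strata, y_true, y_pred):
        residuals[s].append(abs(y - p))
    return {s: conformal_quantile(r, alpha) for s, r in residuals.items()}


def predict_interval(q_by_stratum, stratum, y_hat):
    """Symmetric interval around a point prediction, using the stratum's width."""
    q = q_by_stratum[stratum]
    return y_hat - q, y_hat + q


# Toy calibration data (made up): N-2 predictions are noisier than N-1.
strata = ["N-1"] * 20 + ["N-2"] * 20
y_true = [1.0] * 40
y_pred = ([1.0 + 0.001 * i for i in range(1, 21)]
          + [1.0 + 0.003 * i for i in range(1, 21)])

q = calibrate(strata, y_true, y_pred, alpha=0.1)
lo, hi = predict_interval(q, "N-2", 1.02)
print(q, (lo, hi))
```

Calibrating each stratum separately lets noisier strata receive wider intervals, which is what makes the bounds adaptive rather than one global error margin.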

For the downstream task, the reconstructed complex voltages are fed into a π‑equivalent line model to compute branch currents at both ends, from which line‑loading ratios are derived. A contingency is labeled “unsafe” if any line loading exceeds its thermal limit and “safe” otherwise. By using the upper confidence bound of the loading ratio rather than the point estimate, the system errs on the side of safety, reducing the probability of missed violations.
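The downstream computation can be sketched as below. This is an illustrative sketch under stated assumptions: a standard π‑equivalent line without transformer taps or phase shifters, all quantities in per unit, and a loading ratio defined as the larger end current divided by a current limit. The function names, line parameters, and interval half‑width are hypothetical and not taken from the paper.

```python
import cmath


def branch_currents(v_from, v_to, r, x, b_total):
    """Currents injected at both ends of a pi-equivalent line (per unit)."""
    y_series = 1 / complex(r, x)           # series admittance
    y_shunt = complex(0.0, b_total / 2)    # half the line charging at each end
    i_from = y_series * (v_from - v_to) + y_shunt * v_from
    i_to = y_series * (v_to - v_from) + y_shunt * v_to
    return i_from, i_to


def loading_ratio(v_from, v_to, r, x, b_total, i_max):
    """Larger of the two end-current magnitudes relative to the limit."""
    i_from, i_to = branch_currents(v_from, v_to, r, x, b_total)
    return max(abs(i_from), abs(i_to)) / i_max


def screen(loading_upper_bounds, limit=1.0):
    """Label a contingency unsafe if any upper bound exceeds the limit."""
    return "unsafe" if any(u > limit for u in loading_upper_bounds) else "safe"


# Hypothetical reconstructed voltages (magnitude, angle in radians).
v1 = cmath.rect(1.00, 0.00)
v2 = cmath.rect(0.98, -0.05)
ratio = loading_ratio(v1, v2, r=0.01, x=0.10, b_total=0.02, i_max=1.0)
print(f"loading ratio: {ratio:.3f}")

# A point estimate of 0.97 passes, but its upper bound (0.97 + 0.05) does not.
print(screen([0.80, 0.97]))          # point estimates only
print(screen([0.80, 0.97 + 0.05]))   # conservative upper bounds
```

The last two calls show the design choice: a line that looks safe on its point estimate is still flagged once the interval's upper bound crosses the limit, trading some false alarms for fewer missed violations.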

Experimental evaluation uses test systems of up to 118 buses. GridFM is trained on N‑1 or N‑2 outage data, then tested on unseen higher‑order contingencies up to N‑5. The results show: (i) prediction intervals achieve the target 95 % coverage across all contingency orders; (ii) the trustworthy GridFM attains 2× to 3× higher precision than DCPF while running up to 18× faster than full AC power flow; (iii) false‑negative rates in congestion screening drop below 0.5 % even for N‑5 scenarios, whereas a baseline GNN without SCP exhibits a steep performance decline. These findings demonstrate that the trustworthiness layer not only supplies rigorous uncertainty bounds but also enables conservative, reliable screening in real‑time environments.
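The two headline metrics, empirical coverage and false‑negative rate, can be computed as in the sketch below. The function names and all numbers are made up for illustration and do not reproduce any result from the paper.

```python
def empirical_coverage(y_true, lower, upper):
    """Fraction of true values that fall inside their prediction interval."""
    hits = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return hits / len(y_true)


def false_negative_rate(truly_unsafe, flagged_unsafe):
    """Share of truly unsafe cases that the screen failed to flag."""
    positives = sum(1 for t in truly_unsafe if t)
    if positives == 0:
        return 0.0  # no unsafe cases, so nothing can be missed
    missed = sum(1 for t, f in zip(truly_unsafe, flagged_unsafe) if t and not f)
    return missed / positives


# Made-up loading ratios and interval bounds for four contingencies.
y_true = [0.95, 1.01, 0.80, 1.10]
lower = [0.90, 0.97, 0.75, 1.00]
upper = [1.00, 1.05, 0.85, 1.08]

coverage = empirical_coverage(y_true, lower, upper)

# Ground truth and conservative flags, both against a limit of 1.0.
truly_unsafe = [y > 1.0 for y in y_true]
flagged_unsafe = [u > 1.0 for u in upper]
fnr = false_negative_rate(truly_unsafe, flagged_unsafe)

print(f"coverage = {coverage:.2f}, false-negative rate = {fnr:.2f}")
```

In this toy data the fourth interval misses the true value, yet the case is still flagged because its upper bound exceeds the limit, which illustrates that a coverage miss does not necessarily produce a missed violation.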

The paper’s contributions are threefold: (1) the first integration of stratified conformal prediction with a power‑system foundation model, providing mathematically guaranteed coverage; (2) a conservative screening framework that explicitly minimizes false negatives through interval‑based decision making; (3) a systematic analysis of high‑order N‑k generalization, showing that models trained on low‑order outages can safely extrapolate when equipped with calibrated uncertainty.

Future work is outlined as follows: (a) online updating of the calibration set to maintain interval validity under changing grid conditions; (b) privacy‑preserving conformal methods to address data‑sensitivity concerns; (c) scaling the approach to very large transmission networks (thousands of buses) and integrating it with security‑constrained optimal power flow (SC‑OPF) pipelines. By delivering speed, accuracy, and statistically sound trustworthiness, the proposed framework positions foundation‑model AI as a viable, safe tool for next‑generation grid operation and contingency management.
