A Bayesian Hierarchical Model for Generating Synthetic Unbalanced Power Distribution Grids
Real-world power network data is often inaccessible due to privacy and security concerns, highlighting the need for tools that generate realistic synthetic network data. Existing methods combine geographic tools such as OpenStreetMap with heuristic rules to model system topology and typically focus on single-phase, balanced systems, limiting their applicability to real-world distribution systems, which are usually unbalanced. This work proposes a Bayesian Hierarchical Model (BHM) that generates unbalanced three-phase distribution systems by learning from existing networks. The scheme takes as input the base topology and aggregated demand per node and outputs a three-phase unbalanced system. It achieves a Mean Absolute Percentage Error (MAPE) of less than 8% across all phases, with computation times of 20.4 seconds for model training and 3.1 seconds per sample generation. We demonstrate the transfer learning capability of the proposed tool by leveraging a model trained on an observed system to generate a synthetic network for an unobserved system. Specifically, the tool is trained on the publicly available SMART-DS dataset and then applied to generate synthetic networks for the European 906-bus system and the IEEE 123-bus system. This tool allows researchers to simulate realistic unbalanced three-phase power data with high accuracy and speed, enhancing planning and operational analysis for modern power grids.
💡 Research Summary
The paper addresses the critical shortage of publicly available real‑world distribution‑grid data, which hampers the development and validation of planning, operation, and control algorithms for modern power systems. Existing synthetic‑grid generation techniques rely heavily on geographic information (e.g., OpenStreetMap) and heuristic rules, and they typically produce single‑phase, balanced networks. Such approaches cannot capture the pervasive phase‑imbalance and hybrid (single‑phase/three‑phase) configurations observed in actual low‑voltage distribution networks.
To overcome these limitations, the authors propose a Bayesian Hierarchical Model (BHM) that learns statistical characteristics from an observed network and then generates realistic unbalanced three‑phase distribution systems for any given topology. The methodology proceeds in four main steps:
- Probability Estimation from Real Data – Using a reference dataset (SMART‑DS, specifically the Austin region), the authors compute:
* p₃Φ(d): the probability that a load is three‑phase as a function of its normalized distance d from the feeder, reflecting the empirical observation that three‑phase loads cluster near the substation.
* pΦ(A), pΦ(B), pΦ(C): the probabilities that a single‑phase load is connected to phase A, B, or C.
* μΦ and σΦ: mean and standard deviation of active power for each phase, modeled with a truncated normal distribution to enforce positivity.
* rΦ(A), rΦ(B), rΦ(C): the proportion of a three‑phase load’s total demand allocated to each phase, modeled with a Dirichlet distribution.
- Bayesian Hierarchical Modeling – The hierarchical structure links the above probabilities to the actual load variables:
* Load type (single‑ vs three‑phase) follows a Bernoulli distribution with parameter p₃Φ(d).
* For single‑phase loads, the phase choice follows a Categorical distribution with parameters pΦ(A/B/C).
* For three‑phase loads, the phase‑share vector (rΦ(A), rΦ(B), rΦ(C)) follows a Dirichlet distribution.
* Active power for each phase is drawn from a truncated normal distribution with mean μΦ and standard deviation σΦ, restricted to positive values.
This layered formulation captures both discrete decisions (type, phase) and continuous quantities (power), while preserving the statistical dependencies observed in real networks.
- Synthetic System Generation – Given a user‑provided network topology (graph of buses and lines) and the learned probability parameters, the model samples the load variables for each bus. Algorithm 2 allocates active power to the appropriate phases based on the sampled type and phase‑share, and computes reactive power using a randomly selected power‑factor from {0.85, 0.90, 0.95}.
- Phase‑Consistency Enforcement – To ensure that a phase present at a downstream bus is also present on every upstream line (a physical requirement in real distribution systems), a rule‑based algorithm (Algorithm 3) traverses the network from leaves to the feeder, propagating missing phases upstream as needed. This lightweight approach replaces more computationally intensive mixed‑integer formulations used in prior work.
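The phase-consistency rule in the last step can be illustrated with a small sketch. This is not the paper's Algorithm 3, which is not reproduced in this summary; it is one plausible reading of "leaves to feeder" propagation on a radial network. The function name `enforce_phase_consistency`, the `parent` map, and the bus labels are all hypothetical.

```python
from collections import defaultdict


def enforce_phase_consistency(parent, bus_phases, root):
    """Return the set of phases each line must carry in a radial feeder.

    parent:     dict mapping each non-root bus to its upstream bus (assumed input format)
    bus_phases: dict mapping a bus to the set of phases its own load uses
    root:       the feeder head (substation bus)
    """
    children = defaultdict(list)
    for child, par in parent.items():
        children[par].append(child)

    # Breadth-first order from the feeder head: every parent precedes its children.
    order = [root]
    i = 0
    while i < len(order):
        order.extend(children[order[i]])
        i += 1

    # Phases required at each bus start with the phases of its own load.
    required = {bus: set(bus_phases.get(bus, ())) for bus in order}

    line_phases = {}
    # Reverse order visits every child before its parent (leaves to feeder).
    for bus in reversed(order):
        if bus == root:
            continue
        par = parent[bus]
        # The line feeding this bus must carry every phase needed at or below it.
        line_phases[(par, bus)] = set(required[bus])
        # Propagate the requirement one level upstream.
        required[par] |= required[bus]
    return line_phases


# Toy feeder: sub -> b1 -> {b2, b3}
parent = {"b1": "sub", "b2": "b1", "b3": "b1"}
bus_phases = {"b1": {"B"}, "b2": {"A"}, "b3": {"C"}}
lines = enforce_phase_consistency(parent, bus_phases, "sub")
```

Because each bus is visited once, the pass is linear in the number of buses, which is consistent with the summary's description of the rule as lightweight compared with mixed-integer formulations.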
Experimental Validation
Parameter Fitting: The authors fit the BHM to SMART‑DS data, confirming that p₃Φ(d) declines with distance and that the phase‑share distribution for three‑phase loads is centered around 1/3, indicating near‑balanced loads in the dataset. The posterior means for single‑phase phase selection are pΦ(A)=0.3350, pΦ(B)=0.3296, pΦ(C)=0.3354, essentially uniform.
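To make the layered sampling concrete, the following is a minimal sketch of drawing the load at one bus, assuming the single-phase probabilities quoted above and a phase-share distribution centered on 1/3. The decay curve in `p_three_phase`, its parameters, and the Dirichlet `concentration` value are hypothetical stand-ins, since the fitted values are not given in this summary. The sketch splits a given aggregated demand across phases and omits the truncated normal draw of per-phase power.

```python
import numpy as np

P_SINGLE = np.array([0.3350, 0.3296, 0.3354])  # posterior means for phases A, B, C
POWER_FACTORS = (0.85, 0.90, 0.95)              # candidate power factors from the summary


def p_three_phase(d, p0=0.6, decay=3.0):
    """Hypothetical decreasing curve for the three-phase probability at distance d."""
    return p0 * np.exp(-decay * d)


def sample_bus_load(d, total_kw, rng, concentration=50.0):
    """Draw per-phase active (kW) and reactive (kvar) power for one bus.

    d:        normalized distance from the feeder, in [0, 1]
    total_kw: aggregated active demand at the bus (model input)
    """
    p = np.zeros(3)
    if rng.random() < p_three_phase(d):
        # Bernoulli outcome: three-phase load. Shares come from a Dirichlet
        # whose mean is (1/3, 1/3, 1/3); the concentration is an assumption.
        alpha = concentration * np.full(3, 1.0 / 3.0)
        p = total_kw * rng.dirichlet(alpha)
    else:
        # Single-phase load: a Categorical draw picks the connected phase.
        p[rng.choice(3, p=P_SINGLE)] = total_kw
    # Reactive power from a randomly chosen power factor: Q = P * tan(arccos(pf)).
    pf = rng.choice(POWER_FACTORS)
    q = p * np.tan(np.arccos(pf))
    return p, q


rng = np.random.default_rng(0)
p_kw, q_kvar = sample_bus_load(d=0.2, total_kw=10.0, rng=rng)
```

In either branch the per-phase active powers sum to the supplied bus demand, so the aggregated demand per node is preserved by construction.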
User‑Defined Scenarios: Two synthetic scenarios are tested—(i) a balanced case with equal phase shares (rΦ=1/3 each) and (ii) an unbalanced case (rΦ(A)=0.1, rΦ(B)=0.6, rΦ(C)=0.3). For each scenario, 1,000 samples are generated. Histograms show that the sampled means and modes align closely with the input specifications, and the overall Mean Absolute Percentage Error (MAPE) stays below 8 % across all phases, demonstrating that the BHM faithfully reproduces both average behavior and variability.
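The unbalanced scenario can be reproduced in spirit with a few lines of NumPy. This sketch assumes a Dirichlet whose mean equals the target shares and scores the sample mean against the target with MAPE. The `concentration` value is hypothetical, and the summary does not state exactly which quantities the paper's MAPE compares, so this is only one reasonable reading.

```python
import numpy as np


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.mean(np.abs((actual - predicted) / actual))


rng = np.random.default_rng(42)
target = np.array([0.1, 0.6, 0.3])   # unbalanced phase shares for A, B, C
concentration = 100.0                # hypothetical; larger values give tighter samples

# A Dirichlet with alpha = concentration * target has mean equal to target.
samples = rng.dirichlet(concentration * target, size=1000)
sample_mean = samples.mean(axis=0)
error = mape(target, sample_mean)
```

Each row of `samples` sums to one, so every draw is a valid split of a three-phase load's demand. Lowering `concentration` widens the spread of individual draws around the target shares.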
Transfer Learning: The model trained on SMART‑DS (System A) is applied to generate synthetic data for two distinct networks: the European Low‑Voltage (LV) network (System B) and the IEEE‑123‑bus test feeder. Because the real load data for these systems is known, the authors can directly compare synthetic and actual statistics. Results indicate that voltage profiles, line loading, and phase‑imbalance metrics of the synthetic networks deviate by less than 5 % from the real counterparts, confirming the effectiveness of the transfer‑learning capability.
Performance: Training the BHM takes 20.4 seconds on a standard workstation, and each synthetic sample is generated in 3.1 seconds, making the approach suitable for large‑scale Monte‑Carlo studies or rapid prototyping.
Contributions and Impact
- Introduces a probabilistic, data‑driven framework for generating unbalanced three‑phase distribution networks, filling a gap left by heuristic, balanced‑only methods.
- Demonstrates that distance‑dependent three‑phase probabilities and phase‑share distributions can be reliably estimated from publicly available datasets.
- Shows that the learned hierarchical model can be transferred across networks of different sizes and geographic contexts without retraining, greatly reducing the data collection burden.
- Provides a fast, scalable sampling pipeline combined with a simple yet effective phase‑consistency enforcement algorithm.
Limitations and Future Work
The current model treats loads as static snapshots; temporal dynamics (daily load curves, renewable generation variability) are not incorporated. The Dirichlet parameters may become unstable for highly skewed phase‑share scenarios, suggesting a need for hierarchical priors or more informative hyper‑parameters. Future research directions include extending the BHM to time‑series load profiles, integrating distributed generation and electric‑vehicle charging, and exploring GPU‑accelerated sampling for real‑time synthetic‑grid generation.
Conclusion
By leveraging Bayesian hierarchical modeling, the authors deliver a versatile, accurate, and computationally efficient tool for synthesizing realistic unbalanced three‑phase distribution networks. The method bridges the data‑privacy gap, supports transfer learning across disparate systems, and offers the power‑systems community a valuable resource for algorithm development, testing, and educational purposes.