A Hybrid Distribution Feeder Long-Term Load Forecasting Method Based on Sequence Prediction

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Distribution feeder long-term load forecasting (LTLF) is a critical task that many electric utility companies perform annually. Its goal is to forecast the annual load of distribution feeders. Previous top-down and bottom-up LTLF methods are unable to incorporate different levels of information. This paper proposes a hybrid modeling method using sequence prediction for this classic and important task. The proposed method can seamlessly integrate top-down, bottom-up, and sequential information hidden in multi-year data. Two advanced sequence prediction models, Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks, are investigated in this paper. They successfully solve the vanishing and exploding gradient problems of a standard recurrent neural network. The paper first explains the theories of LSTM and GRU networks and then discusses the steps of feature selection, feature engineering, and model implementation in detail. Finally, a real-world application example for a large urban grid in Western Canada is provided. LSTM and GRU networks under different sequential configurations, as well as traditional models including bottom-up, ARIMA, and feed-forward neural network models, are all implemented and compared in detail. The proposed method demonstrates superior performance and great practicality.

💡 Research Summary

The paper addresses the long‑term load forecasting (LTLF) problem at the distribution‑feeder level, a task that utilities perform annually to plan infrastructure upgrades and ensure system reliability. Traditional approaches fall into two categories: top‑down methods, which use regional macro‑variables such as GDP, population growth, and temperature to predict aggregate demand, and bottom‑up methods, which rely on detailed customer‑level information (large‑load surveys, DER/EV adoption, etc.) to build feeder‑level forecasts. Both suffer from information gaps: top‑down models cannot capture feeder‑specific dynamics, while bottom‑up models are costly, often incomplete, and prone to large errors due to outdated or inaccurate customer plans.

To overcome these limitations, the authors propose a hybrid modeling framework that seamlessly integrates top‑down, bottom‑up, and sequential information hidden in multi‑year data. The core idea is to treat all relevant variables—regional economic, demographic, and weather indicators, as well as feeder‑specific load composition, DER, and EV adoption—as input features, and to embed the historical peak‑demand series itself as a time‑dependent sequence. This enables the model to learn both cross‑sectional relationships (how macro variables affect each feeder) and temporal dependencies (how past years influence the next year).

A dedicated feature‑engineering pipeline is introduced. First, the concept of a “virtual feeder” is used to cleanse the dataset of load‑transfer events that would otherwise corrupt the historical series. Next, all features are normalized to a common scale, and highly correlated variables are reduced via Principal Component Analysis (PCA) to avoid over‑fitting and speed up training. After preprocessing, the data are reshaped into a multi‑time‑step format suitable for sequence models. Two configurations are examined: many‑to‑many (a window of past years is the input and a window of future years is the output) and many‑to‑one (a window of past years is the input and a single future year is the output).
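The normalization and reshaping steps can be illustrated with a minimal sketch. It assumes min-max scaling (the summary does not say which normalization is used) and a single feature per year. The function names `min_max_scale` and `make_windows` are hypothetical, and the peak values are made up for illustration, not taken from the paper. PCA and the virtual-feeder cleansing are not shown.

```python
def min_max_scale(values):
    """Scale a list of numbers to the range [0, 1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series carries no scale information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def make_windows(series, n_in, n_out):
    """Slice a yearly series into (input window, target window) pairs.

    n_out == 1 gives the many-to-one layout;
    n_out > 1 gives the many-to-many layout described above.
    """
    samples = []
    for start in range(len(series) - n_in - n_out + 1):
        x = series[start:start + n_in]
        y = series[start + n_in:start + n_in + n_out]
        samples.append((x, y))
    return samples


# Illustrative annual peak demands for one feeder (made-up numbers).
peaks = [10.0, 20.0, 30.0, 40.0, 50.0]
scaled = min_max_scale(peaks)

many_to_one = make_windows(scaled, n_in=3, n_out=1)
many_to_many = make_windows(scaled, n_in=2, n_out=2)
```

With five years of data, a three-year input window yields two many-to-one samples, and a two-year input with a two-year output also yields two many-to-many samples. In a multi-feature setting each element of `series` would be a feature vector for that year rather than a single number.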

The sequential models investigated are Long Short‑Term Memory (LSTM) networks and Gated Recurrent Unit (GRU) networks, both of which are advanced recurrent neural networks that solve the vanishing/exploding gradient problem of standard RNNs. LSTM uses three gates (forget, input, output) and a memory cell to control information flow, while GRU merges the forget and input gates into an update gate and eliminates the separate memory cell, making it computationally lighter. The authors train both architectures with identical hyper‑parameter search procedures (learning rate, batch size, number of hidden units, epochs) and evaluate them on a real‑world dataset from a large urban grid in western Canada comprising 289 feeders and 14 years (2005‑2018) of annual peak‑demand records.
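To make the gate structure concrete, the following sketch implements one time step of each cell using the standard textbook formulations, with scalar input and state for readability. It is not the authors' implementation: the parameter names (`wf`, `uz`, and so on) are hypothetical, and some references swap the roles of `z` and `1 - z` in the GRU update.

```python
import math


def sigmoid(x):
    """Logistic function used by every gate."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, p):
    """One LSTM step: three gates plus a separate memory cell c."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])  # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])  # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])  # output gate
    c_cand = math.tanh(p["wc"] * x + p["uc"] * h_prev + p["bc"])
    c = f * c_prev + i * c_cand  # keep part of the old memory, add new content
    h = o * math.tanh(c)         # expose part of the memory as hidden state
    return h, c


def gru_step(x, h_prev, p):
    """One GRU step: the update gate z does the job of forget and input gates."""
    z = sigmoid(p["wz"] * x + p["uz"] * h_prev + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h_prev + p["br"])  # reset gate
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h_prev) + p["bh"])
    return (1.0 - z) * h_prev + z * h_cand  # no separate memory cell


# All-zero parameters make every gate equal to 0.5, which is easy to check by hand.
zero_lstm = dict.fromkeys(
    ["wf", "uf", "bf", "wi", "ui", "bi", "wo", "uo", "bo", "wc", "uc", "bc"], 0.0
)
zero_gru = dict.fromkeys(
    ["wz", "uz", "bz", "wr", "ur", "br", "wh", "uh", "bh"], 0.0
)

h_lstm, c_lstm = lstm_step(1.0, 0.3, 0.8, zero_lstm)
h_gru = gru_step(1.0, 0.8, zero_gru)
```

The GRU carries 9 scalar parameters here against 12 for the LSTM, which reflects the lighter computation mentioned above. In practice both cells operate on vectors, with weight matrices in place of the scalar weights.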

Baseline models for comparison include a bottom‑up deterministic approach, a top‑down ARIMA model, and a feed‑forward neural network (FNN). Performance is measured primarily by Mean Absolute Percentage Error (MAPE). Results show that the LSTM many‑to‑many configuration achieves the lowest MAPE of 5.2%, outperforming GRU many‑to‑one (5.9%), ARIMA (8.3%), and FNN (7.9%). The superiority of the LSTM model is attributed to its ability to capture long‑range temporal patterns while simultaneously handling multiple exogenous variables. GRU performs competitively but slightly worse on this particular dataset, which suggests that model selection should be data‑driven.
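For reference, MAPE averages the absolute forecast error expressed as a percentage of the actual value. The sketch below uses the standard definition; the function name `mape` and the sample numbers are illustrative and are not results from the paper.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Assumes every actual value is non-zero, which holds for feeder peak loads.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Two made-up feeders: each forecast is off by 10% of the actual peak.
example_mape = mape([100.0, 200.0], [110.0, 180.0])
```

Because each error is divided by the actual load, small and large feeders contribute equally to the average, which is one reason the metric suits comparisons across many feeders of different sizes.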

The paper also conducts a SHAP (SHapley Additive exPlanations) analysis to interpret feature importance. Economic growth rate and large‑load changes emerge as the most influential predictors, followed by temperature and DER/EV penetration. This interpretability helps utilities understand the drivers behind forecasted peaks and supports scenario analysis.
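The idea behind Shapley-based attribution can be shown with a small exact computation. This sketch enumerates all feature orderings for a toy linear model and treats a feature as "absent" by setting it to a baseline value. It does not use the SHAP library, which relies on approximations for realistic models. The model, its coefficients, and the three-feature setup are hypothetical.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values by averaging marginal contributions over all orderings.

    A feature that is not yet "present" takes its baseline value.
    Cost grows factorially with the number of features, so this is
    only practical for a handful of features.
    """
    n = len(x)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        prev_output = model(current)
        for j in order:
            current[j] = x[j]  # switch feature j from baseline to its actual value
            new_output = model(current)
            totals[j] += new_output - prev_output
            prev_output = new_output
    return [t / len(orderings) for t in totals]


def toy_model(v):
    """Hypothetical linear forecast with three inputs and made-up coefficients."""
    return 2.0 * v[0] + 3.0 * v[1] - 1.0 * v[2]


phi = shapley_values(toy_model, [1.0, 1.0, 1.0], [0.0, 0.0, 0.0])
```

For a linear model each attribution equals the coefficient times the feature's deviation from the baseline, and the attributions sum to the difference between the model output at `x` and at the baseline. That additivity is what lets the values be read as a decomposition of a single forecast.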

Limitations are acknowledged. Accurate future values for DER/EV adoption and other exogenous variables are required; errors in these forecasts propagate into the LTLF output. PCA, while reducing dimensionality, may discard subtle nonlinear interactions. Moreover, the current implementation is offline; real‑time updating would require an online learning scheme.

Future work suggested includes Bayesian hyper‑parameter optimization, multi‑task learning to jointly forecast load and voltage profiles, and the integration of explainable‑AI techniques (e.g., LIME, deeper SHAP analysis) to further enhance trustworthiness.

In conclusion, the study demonstrates empirically that a hybrid, sequence‑prediction‑based framework can combine top‑down, bottom‑up, and temporal information for feeder‑level long‑term load forecasting. In the reported comparison, the LSTM and GRU networks achieve a lower MAPE than the ARIMA and feed‑forward baselines, offering utilities a practical tool for long‑term planning and investment decision‑making.
