Latent Space Representation of Electricity Market Curves: Maintaining Structural Integrity

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Efficiently representing supply and demand curves is vital for energy market analysis and downstream modelling; however, dimensionality reduction often produces reconstructions that violate fundamental economic principles such as monotonicity. This paper evaluates the performance of PCA, Kernel PCA, UMAP, and AutoEncoder across 2D and 3D latent spaces. During preprocessing, we transform the original data to achieve a unified structure, mitigate outlier effects, and focus on critical curve segments. To ensure theoretical validity, we integrate Isotonic Regression as an optional post-processing step to enforce monotonic constraints on reconstructed outputs. Results from a three-year hourly MIBEL dataset demonstrate that the non-linear technique UMAP consistently outperforms other methods, securing the top rank across multiple error metrics. Furthermore, Isotonic Regression serves as a crucial corrective layer, significantly reducing error and restoring physical validity for several methods. We argue that UMAP's local structure preservation, combined with intelligent post-processing, provides a robust foundation for downstream tasks such as forecasting, classification, and clustering.


💡 Research Summary

This paper addresses the challenge of compressing electricity market supply and demand curves into low‑dimensional latent spaces while preserving the essential economic property of monotonicity (offered supply does not fall as the price rises, and requested demand does not rise). The authors evaluate four dimensionality‑reduction (DR) techniques, namely Principal Component Analysis (PCA), Kernel PCA (kPCA), Uniform Manifold Approximation and Projection (UMAP), and AutoEncoder (AE), in both 2‑D and 3‑D latent representations.

The dataset consists of hourly supply and demand curves from the Iberian Electricity Market (MIBEL) covering 2018‑2020, amounting to 26,298 curves. A four‑step preprocessing pipeline is applied: (1) price winsorization to the 99 % confidence interval, (2) mapping all prices onto an integer‑step price grid (ΔP = 1 EUR), (3) uniform volume sampling with a step of 100 MWh within the 99 % volume confidence interval, and (4) standardization to zero mean and unit variance. This yields a uniform matrix of size K × N with N = 268 price points per curve, ensuring that all DR methods receive identical input dimensions.
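The price-side steps of this pipeline can be sketched in plain Python. This is a minimal illustration, not the authors' code: it assumes the "99 % confidence interval" means clipping at the 0.5th and 99.5th percentiles, assumes a supply curve is given as ascending bid prices with cumulative volumes, and uses the hypothetical helper names `winsorize`, `resample_supply_curve`, and `standardize`. Volume-side sampling (step 3) is omitted; it would follow the same pattern on the other axis.

```python
from bisect import bisect_right
from statistics import mean, pstdev


def percentile(sorted_vals, q):
    """Linear-interpolation percentile of an already sorted list, q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def winsorize(values, lower_q=0.005, upper_q=0.995):
    """Clip values to the [lower_q, upper_q] quantile range (assumed bounds)."""
    s = sorted(values)
    lo, hi = percentile(s, lower_q), percentile(s, upper_q)
    return [min(max(v, lo), hi) for v in values]


def resample_supply_curve(prices, cum_volumes, grid):
    """Evaluate a step-wise supply curve on a fixed price grid.

    prices must be sorted ascending; cum_volumes[i] is the cumulative volume
    offered at prices up to and including prices[i].
    """
    out = []
    for p in grid:
        idx = bisect_right(prices, p)  # number of bids priced at or below p
        out.append(cum_volumes[idx - 1] if idx > 0 else 0.0)
    return out


def standardize(matrix):
    """Column-wise zero mean and unit variance; constant columns map to 0."""
    cols = list(zip(*matrix))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # avoid division by zero
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in matrix]
```

Resampling every curve onto the same grid is what gives each row of the K × N matrix the same length, which the DR methods require.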

Each DR method is tuned via grid search on a training subset and evaluated on a validation set. Performance metrics include Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), bias, and Weighted Absolute Percentage Error (W‑APE). Because most DR techniques do not inherently enforce monotonicity, the authors optionally apply Isotonic Regression (IR) as a post‑processing step to the reconstructed curves, guaranteeing a monotonic relationship between price and quantity (non‑decreasing for supply, non‑increasing for demand) without altering the underlying DR architecture.
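To make the metrics and the monotonicity repair concrete, the sketch below implements them with the standard library only. It rests on assumptions: the summary does not give the exact metric formulas, so bias is taken as the mean of (reconstruction minus original) and W‑APE as the sum of absolute errors divided by the sum of absolute original values. `isotonic_fit` is a hypothetical name for a unit-weight pool-adjacent-violators routine, the classic algorithm for isotonic regression, and is not necessarily the implementation used in the paper.

```python
from math import sqrt


def rmse(y, yhat):
    return sqrt(sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y))


def mae(y, yhat):
    return sum(abs(a - b) for a, b in zip(y, yhat)) / len(y)


def bias(y, yhat):
    # Assumed sign convention: positive means the reconstruction overshoots.
    return sum(b - a for a, b in zip(y, yhat)) / len(y)


def wape(y, yhat):
    # Assumed definition: total absolute error over total absolute target.
    return sum(abs(a - b) for a, b in zip(y, yhat)) / sum(abs(a) for a in y)


def isotonic_fit(y, increasing=True):
    """Pool-adjacent-violators: least-squares monotone fit with unit weights."""
    vals = list(y) if increasing else [-v for v in y]
    blocks = []  # each block stores [sum, count]
    for v in vals:
        blocks.append([v, 1])
        # Merge while the previous block's mean exceeds the last block's mean
        # (compared by cross-multiplication to avoid dividing).
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    fitted = [s / n for s, n in blocks for _ in range(n)]
    return fitted if increasing else [-f for f in fitted]


# Toy example: a monotone target and a reconstruction with one inversion.
original = [1.0, 2.0, 3.0, 4.0]
reconstructed = [1.0, 3.0, 2.0, 4.0]
repaired = isotonic_fit(reconstructed)  # the inverted pair is pooled to its mean
```

Because the routine only post-processes the decoder output, it can be attached to any of the four DR methods; passing `increasing=False` handles downward-sloping curves.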

Experimental results show that UMAP consistently outperforms the other methods across all error metrics in both 2‑D and 3‑D settings. In the 2‑D case, UMAP achieves RMSE = 2.17 €/MWh and MAE = 0.87 MWh, while the next best (kPCA) records RMSE = 4.85 and MAE = 3.50. The superiority of UMAP is attributed to its ability to preserve local manifold structure, which aligns well with the piecewise‑linear nature of market curves. PCA and kPCA frequently produce non‑monotonic reconstructions, leading to economically implausible curves; applying IR dramatically reduces their errors (e.g., PCA RMSE drops from 4.85 to 4.40, MAE from 3.50 to 2.13). AE yields visually plausible, monotonic curves but suffers from higher reconstruction errors (RMSE ≈ 6.70, MAE ≈ 3.31), indicating that while it captures non‑linear patterns, it does not preserve the global geometry as effectively as UMAP.
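How often a method produces economically implausible curves can be quantified with a simple diagnostic. The following is an illustrative sketch rather than a procedure described in the paper: `count_violations` and `share_non_monotonic` are hypothetical helper names, and the tolerance `tol` is an assumed parameter for ignoring floating-point noise.

```python
def count_violations(curve, increasing=True, tol=1e-9):
    """Number of adjacent pairs that break the expected monotone direction."""
    sign = 1.0 if increasing else -1.0
    return sum(1 for a, b in zip(curve, curve[1:]) if sign * (b - a) < -tol)


def share_non_monotonic(curves, increasing=True, tol=1e-9):
    """Fraction of curves that contain at least one violation."""
    flagged = sum(1 for c in curves if count_violations(c, increasing, tol) > 0)
    return flagged / len(curves)
```

Running this on the reconstructions of each method, before and after post-processing, would separate the loss of economic validity from the plain reconstruction error reported above.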

Increasing the latent dimensionality from 2 to 3 generally improves performance for all methods, yet the gains for UMAP diminish due to increased manifold sparsity and potential over‑fitting to local noise. Visualizations of the latent spaces reveal that UMAP clusters curves by hour of the day and captures variations linked to renewable generation, whereas PCA‑based methods show fragmented or distorted structures.

The authors conclude that for electricity market curve representation, a non‑linear manifold learner like UMAP, combined with a simple monotonicity‑enforcing post‑processor such as Isotonic Regression, provides the most accurate and economically sound reconstructions. This framework enables downstream tasks—forecasting, clustering, regime detection, and anomaly identification—to operate on compact yet structurally faithful representations. Future work will explore applying the approach to other energy markets, integrating the latent space directly into forecasting models, and investigating alternative monotonicity‑preserving architectures.

