Supervised Transfer Learning Framework for Fault Diagnosis in Wind Turbines
Common challenges in fault diagnosis include the lack of labeled data and the need to build a separate model for each domain, resulting in many models that require supervision. Transfer learning can help tackle these challenges by learning cross-domain knowledge. However, many approaches still require at least some labeled data in the target domain and often provide unexplainable results. To this end, we propose a supervised transfer learning framework for fault diagnosis in wind turbines that operates in an Anomaly-Space. This space was created from SCADA data and vibration data and was built and provided to us by our research partner. Data within the Anomaly-Space can be interpreted as anomaly scores for each component in the wind turbine, making each value intuitive to understand. We conducted a cross-domain evaluation on the training set using popular supervised classifiers such as Random Forest, Light Gradient Boosting Machine (LightGBM), and Multilayer Perceptron as metamodels for the diagnosis of bearing and sensor faults. The Multilayer Perceptron achieved the highest classification performance and was then used for a final evaluation on our test set. The results show that the proposed framework detects cross-domain faults in the test set with a high degree of accuracy using a single classifier, which is a significant asset to the diagnostic team.
💡 Research Summary
The paper addresses two persistent challenges in wind turbine (WT) fault diagnosis: the scarcity of labeled fault data and the need to maintain separate diagnostic models for each turbine or wind park. Conventional supervised approaches typically require abundant labeled samples for each specific turbine, while existing transfer‑learning methods either depend on a small set of labeled target data or produce features that are difficult for engineers to interpret. To overcome these limitations, the authors propose a supervised transfer‑learning framework that operates in a specially constructed “Anomaly‑Space”. This space is derived from SCADA measurements and high‑frequency vibration signals using two proprietary detectors: a broadband‑characteristic‑value (bbcv) detector that extracts statistical features (e.g., skewness, kurtosis, mean) from vibration data and evaluates their temporal trends, and a tuplet detector that monitors variance among semantically related SCADA variables (e.g., the three phases of generator temperature). Both detectors output normalized anomaly scores, where values above 1.0 indicate abnormal behavior. Because the scores directly reflect deviations from normal operation, they are intuitively understandable for maintenance personnel.
Data were collected from seven turbines across five wind parks. Five turbines (four parks) formed the training/validation set, while two turbines from a completely different park comprised the test set. Fault types considered were bearing faults (critical, high‑cost) and sensor faults (cheaper but potentially misleading). Fault intervals were labeled by domain experts; periods outside fault windows were labeled “Normal”, and data collected during turbine shutdown or low‑wind conditions were excluded. Missing values were forward‑filled for up to three hours, with any remaining gaps set to zero, yielding a continuous time series.
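The gap-handling rule described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' code: the function name `fill_gaps` is hypothetical, missing values are represented as `None`, and the limit of 18 steps assumes 10-minute sampling (inferred from the statement elsewhere in the summary that 144 samples span roughly one day).

```python
from typing import List, Optional

# Assumption: 10-minute sampling, so three hours = 18 consecutive samples.
MAX_FILL_STEPS = 18


def fill_gaps(scores: List[Optional[float]],
              max_fill: int = MAX_FILL_STEPS) -> List[float]:
    """Forward-fill at most `max_fill` consecutive gaps, then fall back to 0.0."""
    filled: List[float] = []
    last_valid: Optional[float] = None
    steps_since_valid = 0
    for value in scores:
        if value is not None:
            # A real observation resets the gap counter.
            last_valid = value
            steps_since_valid = 0
            filled.append(value)
            continue
        steps_since_valid += 1
        if last_valid is not None and steps_since_valid <= max_fill:
            # Still within the forward-fill limit: carry the last value forward.
            filled.append(last_valid)
        else:
            # Gap is too long (or nothing observed yet): set to zero.
            filled.append(0.0)
    return filled
```

With a limit of two steps, for example, `fill_gaps([0.4, None, None, None, 0.9], max_fill=2)` carries `0.4` forward twice and sets the third gap to zero.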
Feature engineering reduced each detector’s output to its single highest‑variance variable, resulting in a two‑dimensional raw feature vector per time step. A sliding window of 144 samples (approximately one day) with a stride of one was then applied, and two derived features were computed per raw score within each window: trend‑certainty (tc), a binary indicator derived from a Mann‑Kendall p‑value threshold (p < 0.001), and variance (v). Together with the two raw scores, this yields six features per window. For the multilayer perceptron (MLP), the raw detector scores were additionally min‑max scaled.
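As a rough sketch of the windowed features, the code below computes trend-certainty and variance for one anomaly-score series. It is an illustration under stated assumptions rather than the paper's implementation: the function names are hypothetical, the Mann-Kendall p-value uses the standard normal approximation without tie correction, and the variance is the population variance (the summary does not say which estimator was used).

```python
import math
from statistics import pvariance
from typing import List, Tuple

WINDOW = 144         # samples per window, as stated in the summary
P_THRESHOLD = 0.001  # Mann-Kendall significance threshold from the summary


def mann_kendall_p(values: List[float]) -> float:
    """Two-sided Mann-Kendall p-value (normal approximation, no tie correction)."""
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)  # sign of the pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2.0))


def window_features(values: List[float]) -> Tuple[int, float]:
    """Return (trend-certainty, variance) for one window of anomaly scores."""
    tc = 1 if mann_kendall_p(values) < P_THRESHOLD else 0
    return tc, pvariance(values)


def sliding_features(series: List[float],
                     window: int = WINDOW) -> List[Tuple[int, float]]:
    """Apply window_features over the series with stride one."""
    return [window_features(series[i:i + window])
            for i in range(len(series) - window + 1)]
```

A steadily rising score yields `tc = 1`, while a flat score yields `tc = 0` and zero variance. Running the same functions on both detector scores and appending the raw values gives the six features per window.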
Model evaluation employed stratified three‑fold cross‑validation on the training data, using the Fβ‑score with β = 0.5 as the primary metric. Emphasizing precision over recall reflects the operational priority of minimizing false‑positive alarms, which could otherwise trigger unnecessary maintenance. Three supervised classifiers were benchmarked: Random Forest (RF), LightGBM, and MLP. A simple baseline (“Above‑One”) classified any sample with a bbcv score > 1.0 as a bearing fault and any sample with a tuplet score > 1.0 as a sensor fault.
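To make the metric and the baseline concrete, here is a minimal sketch of both. The Fβ formula is the standard one; the label strings and the function names are hypothetical, and the summary does not say how the baseline resolves a sample where both scores exceed 1.0, so the bearing-first priority below is purely an assumption.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score; beta < 1 weights precision more than recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


def above_one(bbcv_score: float, tuplet_score: float) -> str:
    """'Above-One' baseline: threshold each detector's anomaly score at 1.0."""
    # Assumption: a bearing fault takes priority if both scores exceed 1.0.
    if bbcv_score > 1.0:
        return "bearing_fault"
    if tuplet_score > 1.0:
        return "sensor_fault"
    return "normal"
```

With β = 0.5, a classifier with perfect precision and 0.5 recall scores about 0.83, whereas one with 0.5 precision and perfect recall scores about 0.56, which illustrates why this metric penalizes false alarms more heavily than missed faults.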
Results showed that RF achieved an average F0.5 of 0.81, LightGBM 0.84, while the MLP outperformed both with an average F0.5 of 0.874. The optimal MLP configuration consisted of a single hidden layer with five neurons, ReLU activation, Adam optimizer, and a learning rate of 0.001. This model attained a precision of 0.992 and a recall of 0.789, indicating that it can reliably detect faults while keeping false alarms extremely low.
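The reported MLP configuration could be expressed as follows, assuming scikit-learn is used; the summary does not name the library, so this is only one plausible rendering. The helper name `build_mlp`, the iteration cap, and the random seed are illustrative choices, not values from the paper.

```python
from sklearn.neural_network import MLPClassifier


def build_mlp(random_state: int = 0) -> MLPClassifier:
    """Single hidden layer with five ReLU units, trained with Adam (lr = 0.001)."""
    return MLPClassifier(
        hidden_layer_sizes=(5,),     # one hidden layer, five neurons
        activation="relu",
        solver="adam",
        learning_rate_init=0.001,
        max_iter=500,                # assumption: not reported in the summary
        random_state=random_state,   # assumption: fixed only for reproducibility
    )
```

The classifier would then be fitted on the six window features per sample, for example `build_mlp().fit(X_train, y_train)`, where `X_train` and `y_train` are placeholder names for the training features and labels.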
For the final test, the MLP was retrained on the entire training set using the best hyper‑parameters and then applied to the two unseen turbines from a new wind park. On the test set, the model achieved an F0.5 score of 0.937, an F1 score of 0.871, precision of 0.992, and recall of 0.789. The higher performance on the test data, despite its lower label quality (e.g., loose‑contact sensor faults that appear normal for extended periods), suggests that the Anomaly‑Space features are robust to noise and domain shift. The authors suggest that increasing the window size or segmenting fault intervals could further improve detection of prolonged, intermittent fault signatures.
Key insights from the study include: (1) constructing a domain‑shared feature space from physically meaningful anomaly scores enables effective transfer learning without requiring labeled target data; (2) a single, relatively shallow neural network can serve as a universal fault classifier across multiple turbines and wind parks; and (3) the interpretability of anomaly scores fosters trust and facilitates collaboration between data scientists and field engineers.
Future work will explore expanding the detector suite to enrich the Anomaly‑Space, integrating temporal deep‑learning architectures (e.g., Temporal Convolutional Networks) to capture longer‑range dependencies, and developing online learning mechanisms for real‑time deployment. The proposed framework thus offers a scalable, interpretable, and high‑performing solution for wind turbine condition monitoring and has potential applicability to other industrial assets facing similar labeling and domain‑adaptation challenges.