Self-Ensemble Post Learning for Noisy Domain Generalization


While computer vision and machine learning have made great progress, their robustness is still challenged by two key issues: data distribution shift and label noise. When domain generalization (DG) meets label noise, noisy labels further amplify spurious features in deep layers, i.e., spurious feature enlargement, degrading the performance of existing algorithms. Starting from domain generalization, this paper explores how to make existing methods work again when they encounter noise. We find that the latent features inside the model retain a degree of discriminative ability, and that different latent features attend to different parts of the image. Based on these observations, we propose Self-Ensemble Post Learning (SEPL), which diversifies the features that can be leveraged. SEPL consists of two parts: feature probing training and prediction ensemble inference. It attaches multiple probing classifiers to intermediate feature representations within the model architecture to fully exploit the capabilities of pre-trained models, and obtains final predictions by integrating the outputs of these diverse classification heads. To handle noisy labels, we train the probing classifiers with semi-supervised algorithms. Because different probing classifiers focus on different image regions, we integrate their predictions with a crowdsourcing-style inference approach. Extensive experimental evaluations demonstrate that the proposed method not only enhances the robustness of existing methods but also shows significant potential for real-world applications, with high flexibility.


💡 Research Summary

The paper addresses the practical challenge of simultaneously encountering domain shift and label noise, a scenario where existing domain generalization (DG) methods often fail. By analyzing a ResNet‑50 model trained on the PACS dataset with 25 % of labels randomly flipped, the authors discover that deep layers tend to amplify spurious features, focusing on non‑essential regions (e.g., a dog’s body) rather than discriminative parts (head, tail). They further observe that intermediate representations from different layers attend to distinct image regions, suggesting that each layer holds complementary discriminative information.

Building on these insights, the authors propose Self‑Ensemble Post Learning (SEPL), a two‑stage framework that does not require retraining the backbone. In the first stage, linear probing classifiers are attached to multiple intermediate layers and trained using a semi‑supervised scheme: high‑confidence predictions retain their (potentially noisy) labels, while low‑confidence samples are treated as unlabeled and leveraged with semi‑supervised techniques such as MixMatch. In the second stage, the predictions of all probing heads are fused via a Dawid‑Skene crowdsourcing‑style ensemble, which statistically aggregates diverse opinions and mitigates individual classifier errors.

Experiments on standard DG benchmarks (PACS, Office‑Home) with injected label noise show that SEPL consistently improves accuracy by 3–5 percentage points over baseline DG methods, and it also yields notable gains in a medical imaging task where label uncertainty is common. Importantly, SEPL operates as a post‑processing step, preserving the original backbone weights and incurring minimal additional computational overhead. The authors acknowledge limitations related to the number of layers used and performance under extremely high noise rates, and they suggest future work on more accurate noise estimation and deeper ensemble strategies.
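The first-stage split described above (keep labels for confident samples, route the rest to the unlabeled pool) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the max-softmax confidence score, and the 0.9 threshold are all assumptions for the sake of the example.

```python
import numpy as np

def split_by_confidence(probs, noisy_labels, threshold=0.9):
    """Hedged sketch of SEPL's confidence-based sample split.

    probs:        (n, n_classes) softmax outputs of one probing classifier.
    noisy_labels: (n,) possibly corrupted labels from the dataset.
    Returns (labeled_idx, retained_labels, unlabeled_idx); the unlabeled
    indices would then feed a semi-supervised method such as MixMatch.
    """
    conf = probs.max(axis=1)                      # confidence = top softmax score
    labeled_idx = np.where(conf >= threshold)[0]  # confident: keep (noisy) label
    unlabeled_idx = np.where(conf < threshold)[0] # uncertain: treat as unlabeled
    return labeled_idx, noisy_labels[labeled_idx], unlabeled_idx
```

In practice each probing head would apply this split independently, so the same sample can be "labeled" for one head and "unlabeled" for another.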
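The second stage fuses the probing heads with a Dawid‑Skene-style estimator. A compact NumPy sketch of the classic Dawid‑Skene EM algorithm (treating each probing head as an annotator) is shown below; the initialization, iteration count, and smoothing constant are illustrative assumptions, and the paper's exact aggregation may differ.

```python
import numpy as np

def dawid_skene(votes, n_classes, n_iters=50):
    """Hedged sketch of Dawid-Skene aggregation over probing-head votes.

    votes: (n_items, n_heads) integer class votes, one column per head.
    Returns (n_items, n_classes) posterior class probabilities.
    """
    n_items, n_heads = votes.shape
    # Initialize posteriors T with normalized majority-vote counts.
    T = np.zeros((n_items, n_classes))
    for k in range(n_classes):
        T[:, k] = (votes == k).sum(axis=1)
    T /= T.sum(axis=1, keepdims=True)
    for _ in range(n_iters):
        # M-step: class priors and a confusion matrix per head,
        # conf[h, true_class, voted_class].
        prior = T.mean(axis=0)
        conf = np.zeros((n_heads, n_classes, n_classes))
        for h in range(n_heads):
            for k in range(n_classes):
                conf[h, :, k] = T[votes[:, h] == k].sum(axis=0)
        conf += 1e-8                              # smoothing (assumption)
        conf /= conf.sum(axis=2, keepdims=True)
        # E-step: recompute item posteriors from priors and confusions.
        logp = np.tile(np.log(prior + 1e-12), (n_items, 1))
        for h in range(n_heads):
            logp += np.log(conf[h][:, votes[:, h]].T)
        T = np.exp(logp - logp.max(axis=1, keepdims=True))
        T /= T.sum(axis=1, keepdims=True)
    return T
```

The appeal over simple averaging is that the per-head confusion matrices down-weight heads that are systematically unreliable for particular classes, which matches the observation that different probes attend to different image regions.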

