Provably robust learning of regression neural networks using $β$-divergences

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

Regression neural networks (NNs) are most commonly trained by minimizing the mean squared prediction error, which is highly sensitive to outliers and data contamination. Existing robust training methods for regression NNs are often limited in scope and rely primarily on empirical validation, with only a few offering partial theoretical guarantees. In this paper, we propose a new robust learning framework for regression NNs based on the $β$-divergence (also known as the density power divergence), which we call "rRNet". It applies to a broad class of regression NNs, including models with non-smooth activation functions and error densities, and recovers the classical maximum likelihood learning as a special case. The rRNet is implemented via an alternating optimization scheme, for which we establish convergence guarantees to stationary points under mild, verifiable conditions. The (local) robustness of rRNet is theoretically characterized through the influence functions of both the parameter estimates and the resulting rRNet predictor, which are shown to be bounded for suitable choices of the tuning parameter $β$, depending on the error density. We further prove that rRNet attains the optimal 50% asymptotic breakdown point at the assumed model for all $β\in(0, 1]$, providing a strong global robustness guarantee that is largely absent for existing NN learning methods. Our theoretical results are complemented by simulation experiments and real-data analyses, illustrating practical advantages of rRNet over existing approaches in both function approximation problems and prediction tasks with noisy observations.


💡 Research Summary

This paper addresses the well‑known vulnerability of regression neural networks (NNs) trained with the mean‑squared error (MSE) loss to outliers and contaminated data. While numerous robust regression techniques exist for linear models—such as least absolute deviation (LAD), trimmed estimators, and M‑estimators—most of them either apply only to specific NN architectures, lack rigorous theoretical guarantees, or provide only limited robustness (e.g., zero breakdown point). To fill this gap, the authors propose a novel robust learning framework called rRNet, which is built upon the β‑divergence (also known as the density power divergence, DPD).

The β‑divergence introduces a tuning parameter β that controls the trade‑off between robustness and efficiency: as β → 0 the method recovers classical maximum likelihood learning, while larger values of β downweight the influence of outlying observations, with the optimal 50% asymptotic breakdown point attained for all β ∈ (0, 1].
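To make the downweighting effect concrete, here is a minimal sketch of the empirical density power divergence loss under a Gaussian error model. This is a hypothetical illustration of the standard DPD objective, not the paper's rRNet implementation; the function name `dpd_loss` and the default β = 0.5 are assumptions for the example.

```python
import numpy as np

def dpd_loss(residuals, sigma=1.0, beta=0.5):
    """Empirical density power divergence loss for N(0, sigma^2) errors.

    Each residual contributes through exp(-beta * e^2 / (2 sigma^2)),
    which is bounded, so a single gross outlier cannot dominate the
    objective the way it does under squared error.
    """
    e = np.asarray(residuals, dtype=float)
    c = (2.0 * np.pi * sigma**2) ** (-beta / 2.0)
    return c * (1.0 / np.sqrt(1.0 + beta)
                - (1.0 + 1.0 / beta)
                * np.mean(np.exp(-beta * e**2 / (2.0 * sigma**2))))
```

Because each observation enters only through a bounded exponential term, moving an outlier from 10σ to 100σ barely changes the loss, whereas the MSE would grow a hundredfold; this bounded per-point contribution is what the influence-function analysis formalizes.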

