Reconciliating Bayesian and frequentist approaches to robustness against outliers


Abstract

Heavy-tailed models are used as a way to gain robustness against outliers in Bayesian analyses; in frequentist analyses, M-estimators are often employed instead. In this paper, the two approaches are tentatively reconciled by considering M-estimators as maximum likelihood estimators of heavy-tailed models. From this perspective, a fundamental difference becomes apparent: frequentists, unlike Bayesians, do not require these heavy-tailed models to be proper. For instance, a popular robust estimator in linear regression, Tukey's biweight M-estimator, does not correspond to a proper heavy-tailed model. A Bayesian practitioner therefore does not have access to the same range of tools as a frequentist practitioner. It is shown through two real-data linear regression analyses that the former may consequently obtain significantly different estimation results from the latter, the difference being due to a more pronounced influence of the outliers in the Bayesian case. A way to give both practitioners access to the same range of tools is for the Bayesian to adopt the generalized Bayesian framework of Bissiri et al. (2016), which allows the use of improper models (Jewson and Rossell, 2022), in combination with proper prior distributions yielding proper generalized posterior distributions. A complete reconciliation of the Bayesian and frequentist approaches to robustness is then achieved. An extensive theoretical study of the generalized Bayesian counterpart of Tukey's biweight M-estimator is provided, including a robustness characterization result and a Bernstein–von Mises result; the latter allows the generalized posterior distribution to be calibrated for meaningful uncertainty quantification. After adopting the generalized Bayesian framework, the Bayesian practitioner obtains results similar to those of the frequentist practitioner in the aforementioned examples.


💡 Research Summary

This paper investigates the relationship between Bayesian and frequentist robust methods for handling outliers in linear regression. The authors begin by noting that both paradigms aim to reduce the influence of extreme observations, but they traditionally do so in different ways. Frequentists replace the quadratic term in the normal log‑likelihood with a less rapidly growing function, yielding M‑estimators such as Huber's and Tukey's biweight. These estimators can be interpreted as maximum‑likelihood estimators of heavy‑tailed error distributions, yet the corresponding “distributions” need not be proper probability densities. In particular, Tukey's biweight leads to a log‑likelihood that becomes constant beyond a cutoff, producing an improper model.
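
Concretely, under the standard parameterization of Tukey's biweight loss with tuning constant c > 0, the loss is constant beyond the cutoff, so the implied density exp(−ρ_c) has flat, strictly positive tails and cannot be normalized:

```latex
\rho_c(r) \;=\;
\begin{cases}
  \dfrac{c^{2}}{6}\left[1 - \left(1 - (r/c)^{2}\right)^{3}\right], & |r| \le c,\\[6pt]
  \dfrac{c^{2}}{6}, & |r| > c,
\end{cases}
\qquad
\int_{-\infty}^{\infty} e^{-\rho_c(r)}\,dr \;=\; \infty.
```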

Traditional Bayesian robustness, by contrast, relies on proper heavy‑tailed likelihoods (e.g., Student‑t, log‑regularly varying distributions such as the LPTN). Because Bayes’ theorem requires a normalizable likelihood, Bayesian analysts cannot directly use the improper models that underlie some frequentist M‑estimators. This creates a fundamental gap: frequentists have access to a broader class of robust tools than conventional Bayesians.
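
For contrast, a proper heavy-tailed model such as the Student‑t with ν degrees of freedom has a negative log-density that still grows without bound, only logarithmically, so outliers are damped yet the density remains normalizable:

```latex
-\log f_\nu(r) \;=\; \frac{\nu+1}{2}\,\log\!\left(1+\frac{r^{2}}{\nu}\right) + \text{const}
\;\longrightarrow\; \infty \quad \text{as } |r| \to \infty,
\qquad
\int_{-\infty}^{\infty} f_\nu(r)\,dr \;=\; 1.
```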

To bridge this gap, the authors adopt the generalized Bayesian framework of Bissiri, Holmes and Walker (2016). In this approach the likelihood is replaced by a loss function (or an unnormalized “pseudo‑likelihood”), and a proper prior is combined with the loss to form a generalized posterior. Properness of the posterior is guaranteed as long as the prior is proper, even when the loss corresponds to an improper model. The paper shows that Tukey’s biweight loss fits naturally into this framework, and that inference can be performed using the Hyvärinen score, which does not require knowledge of the normalizing constant.
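
In the notation of Bissiri et al. (2016), with a loss ℓ, a learning rate ω > 0, and a proper prior π, the generalized posterior takes the form below; for linear regression with the biweight loss (writing σ for a scale parameter, treated here as given), ℓ is the biweight applied to standardized residuals:

```latex
\pi_G(\theta \mid y) \;\propto\; \pi(\theta)\,
\exp\!\left\{-\,\omega \sum_{i=1}^{n} \ell(\theta, y_i)\right\},
\qquad
\ell(\beta, y_i) \;=\; \rho_c\!\left(\frac{y_i - x_i^{\top}\beta}{\sigma}\right).
```

Because ρ_c ≥ 0, the exponential factor is bounded above by one, so a proper prior immediately yields a proper generalized posterior even though exp(−ρ_c) itself is not normalizable.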

The theoretical contribution consists of three main results for the generalized Bayesian counterpart of Tukey's biweight: (1) a robustness characterization showing that the generalized posterior down‑weights outliers in the same way as the original M‑estimator; (2) a strong consistency theorem guaranteeing that the posterior concentrates on the true regression coefficients as the sample size grows; and (3) a Bernstein–von Mises theorem establishing asymptotic normality of the generalized posterior. The latter result provides a principled way to calibrate the posterior “temperature” so that credible intervals have frequentist coverage comparable to those from classical robust estimators.
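
The paper's exact calibration recipe is not reproduced here, but one standard way such a matching works, sketched for a scalar parameter and offered as an assumption rather than the authors' formula, is to equate the asymptotic variance of the generalized posterior with the sandwich variance of the corresponding M‑estimator:

```latex
\underbrace{\frac{1}{n\,\omega\,J}}_{\substack{\text{generalized posterior}\\ \text{variance}}}
\;=\;
\underbrace{\frac{K}{n\,J^{2}}}_{\substack{\text{sandwich variance of}\\ \text{the M-estimator}}}
\;\;\Longrightarrow\;\;
\omega \;=\; \frac{J}{K},
\qquad
J = \mathbb{E}\!\left[\rho_c''(\varepsilon)\right],\quad
K = \mathbb{E}\!\left[\rho_c'(\varepsilon)^{2}\right],
```

where the expectations are taken over the error distribution. Intuitively, ω rescales the loss so that the posterior spread matches frequentist sampling variability.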

Empirically, the authors analyze two real data sets. The first is the classic rat‑shock data (available in the R package RobStatTM). Ordinary least squares and a Bayesian LPTN model are heavily pulled toward a few extreme points, whereas Tukey's biweight M‑estimator yields a regression line that follows the bulk of the data. When the generalized Bayesian approach is applied to the biweight loss, the resulting MAP estimates and 95% highest‑posterior‑density intervals are virtually identical to those from the frequentist biweight estimator and to the OLS estimates obtained after removing the identified outliers. The second example involves a higher‑dimensional regression with many covariates, again showing that the generalized Bayesian posterior aligns closely with the frequentist robust fit, while traditional Bayesian heavy‑tailed models (Student‑t, LPTN) remain more influenced by the outliers.
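
To make the workflow tangible, here is a minimal, self-contained sketch of sampling from the generalized posterior under the biweight loss. This is not the authors' implementation: the helper names `tukey_rho` and `log_gen_post` are hypothetical, the data are synthetic rather than the paper's real data sets, the scale is fixed at σ = 1, and the learning rate ω = 1 is left uncalibrated.

```python
import numpy as np

# Tukey's biweight loss under its standard parameterization; c = 4.685
# is the usual choice giving ~95% efficiency under normal errors.
def tukey_rho(r, c=4.685):
    r = np.asarray(r, dtype=float)
    rho = np.full_like(r, c**2 / 6.0)                 # constant beyond the cutoff
    inside = np.abs(r) <= c
    rho[inside] = (c**2 / 6.0) * (1.0 - (1.0 - (r[inside] / c) ** 2) ** 3)
    return rho

# Unnormalized log generalized posterior: biweight loss plus a proper
# N(0, 10^2) prior on (intercept, slope). Scale sigma is fixed and the
# learning rate omega is left at 1 (uncalibrated) for simplicity.
def log_gen_post(beta, x, y, omega=1.0, sigma=1.0):
    resid = (y - beta[0] - beta[1] * x) / sigma
    return -omega * tukey_rho(resid).sum() - 0.5 * np.sum(beta**2) / 10.0**2

# Synthetic data: y = 1 + 2x + noise, with five gross outliers.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, 50)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, 50)
y[:5] += 30.0

# The biweight loss is non-convex, so start the chain at a rough
# least-squares fit rather than at an arbitrary point.
coef = np.polyfit(x, y, 1)                            # [slope, intercept]
beta = np.array([coef[1], coef[0]])

# Random-walk Metropolis on the generalized posterior.
draws = []
for _ in range(20000):
    prop = beta + rng.normal(0.0, 0.1, 2)
    if np.log(rng.uniform()) < log_gen_post(prop, x, y) - log_gen_post(beta, x, y):
        beta = prop
    draws.append(beta.copy())
draws = np.array(draws[5000:])                        # discard burn-in

print("posterior mean:", draws.mean(axis=0))          # near (1, 2) despite outliers
```

Because the biweight loss is constant for residuals beyond the cutoff, the five contaminated points contribute a fixed amount to the loss regardless of how extreme they are, so the posterior concentrates near the coefficients that fit the bulk of the data.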

In summary, the paper demonstrates that by embracing improper models within a generalized Bayesian framework—paired with proper priors—Bayesian practitioners can access the full suite of frequentist robust tools, including those based on non‑normalizable likelihoods. The theoretical guarantees (robustness, consistency, Bernstein–von Mises) ensure that the resulting inference is both statistically sound and practically comparable to frequentist robust methods. This work thus offers a compelling argument for adopting generalized Bayesian inference when robustness to outliers is a primary concern.

