PAC-Bayesian Generalization Guarantees for Fairness on Stochastic and Deterministic Classifiers

Notice: This research summary and analysis were automatically generated using AI technology. For full accuracy, please refer to the original arXiv source.

Classical PAC generalization bounds on the prediction risk of a classifier are insufficient to provide theoretical guarantees on fairness when the goal is to learn models balancing predictive risk and fairness constraints. We propose a PAC-Bayesian framework for deriving generalization bounds for fairness, covering both stochastic and deterministic classifiers. For stochastic classifiers, we derive a fairness bound using standard PAC-Bayes techniques. For deterministic classifiers, to which the usual PAC-Bayes arguments do not apply directly, we leverage a recent advance in PAC-Bayes theory to extend the fairness bound beyond the stochastic setting. Our framework has two advantages: (i) it applies to a broad class of fairness measures that can be expressed as a risk discrepancy, and (ii) it leads to a self-bounding algorithm in which the learning procedure directly optimizes a trade-off between generalization bounds on the prediction risk and on the fairness. We empirically evaluate our framework with three classical fairness measures, demonstrating not only its usefulness but also the tightness of our bounds.


💡 Research Summary

The paper addresses a crucial gap in learning theory: while classical PAC generalization bounds guarantee that a classifier’s empirical error approximates its true error, they provide no assurances regarding fairness—a key requirement when models affect human outcomes. To bridge this gap, the authors propose a PAC‑Bayesian framework that yields high‑probability generalization bounds for a broad class of group‑fairness measures, covering both stochastic (Gibbs) and deterministic (majority‑vote) classifiers.

Key ideas and technical contributions

  1. Risk‑discrepancy formulation of fairness – The authors model fairness as the absolute difference between subgroup risks, \(R_F^D(h) = |R_D^{a}(h) - R_D^{b}(h)|\). By choosing appropriate loss functions and conditioning on the sensitive attribute, this single expression captures Demographic Parity, Equalized Odds, and Equal Opportunity.
  2. PAC‑Bayesian bounds for stochastic classifiers – Building on Seeger (2002) and Maurer (2004), the paper applies a KL‑based bound separately to each sensitive group and then uses a union bound. The resulting inequality improves upon the earlier bound of Oneto et al. (2020) by tightening constant factors and logarithmic terms.
  3. Extension to deterministic classifiers – Recent advances (Leblanc & Germain, 2025) provide both upper and lower bounds on the Gibbs risk, enabling control of the absolute risk difference from both sides. By coupling these bounds with classic majority‑vote inequalities, the authors derive a direct PAC‑Bayesian guarantee for the deterministic majority‑vote classifier \(H_\rho\), eliminating the need for the crude factor‑2 loss incurred by earlier approaches.
  4. Self‑bounding learning algorithm – The two derived bounds (one for prediction risk, one for fairness risk) are combined into a single differentiable objective that includes the KL divergence term \(\mathrm{KL}(\rho\,\|\,\pi)\). Optimizing this objective (via stochastic gradient descent or variational inference) yields a posterior distribution \(\rho\) whose associated majority‑vote classifier simultaneously satisfies a certified bound on accuracy and on fairness. Because the bound is part of the training loss, the algorithm is “self‑bounding”: the model comes with its own intrinsic certification.
  5. Empirical validation – Experiments on standard fairness benchmarks (Adult, COMPAS, Law School) evaluate three fairness metrics (DP, EO, EOP). The proposed method achieves comparable or lower empirical fairness violations than post‑hoc correction methods while the PAC‑Bayesian bounds remain remarkably tight—often matching the observed test‑time risks. Moreover, deterministic classifiers obtained via majority voting enjoy the same guarantees, a capability absent from prior work.
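To make the risk‑discrepancy view of fairness (item 1) concrete, here is a minimal sketch of how the three empirical fairness gaps can be computed as differences of subgroup “risks.” The function names and the specific loss choices are illustrative, not the paper’s notation; the key point is that each fairness measure is an absolute difference of two group‑conditional expectations.

```python
import numpy as np

def group_risk(y_true, y_pred, mask):
    """Empirical 0-1 risk restricted to examples where mask is True."""
    return np.mean(y_pred[mask] != y_true[mask])

def demographic_parity_gap(y_pred, sensitive):
    """|P(h(x)=1 | A=a) - P(h(x)=1 | A=b)|.
    With the 'loss' l(h, x) = 1[h(x)=1], the subgroup risks are
    simply the positive-prediction rates, so DP is a risk discrepancy."""
    rate_a = np.mean(y_pred[sensitive == 0] == 1)
    rate_b = np.mean(y_pred[sensitive == 1] == 1)
    return abs(rate_a - rate_b)

def equal_opportunity_gap(y_true, y_pred, sensitive):
    """Difference of error rates among truly positive examples,
    i.e. the risk discrepancy obtained by conditioning on y = 1."""
    pos = y_true == 1
    risk_a = group_risk(y_true, y_pred, pos & (sensitive == 0))
    risk_b = group_risk(y_true, y_pred, pos & (sensitive == 1))
    return abs(risk_a - risk_b)
```

Equalized Odds follows the same pattern, conditioning on both y = 1 and y = 0 and taking the larger of the two discrepancies.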

Impact and future directions
The work unifies fairness assessment with PAC‑Bayesian theory, offering a principled way to trade off accuracy and fairness during training rather than after the fact. It opens several avenues: extending the risk‑discrepancy approach to multi‑group or continuous sensitive attributes, applying the framework to multi‑class or regression settings, and exploring tighter majority‑vote inequalities that could further reduce the gap between stochastic and deterministic guarantees. Overall, the paper provides both a solid theoretical foundation and a practical algorithm for building fair, certifiable machine‑learning models.
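To illustrate the mechanics behind items 2 and 4, the sketch below inverts the binary KL divergence by bisection, as in Seeger/Maurer‑type bounds, and combines a risk bound with a two‑sided fairness bound into a single self‑bounding objective. The constants, the confidence‑budget split, and the trade‑off weight `lam` are illustrative assumptions, not the paper’s exact statement; in practice the empirical risks and the KL term would come from the learned posterior.

```python
import math

def binary_kl(q, p):
    """kl(q || p) between Bernoulli(q) and Bernoulli(p)."""
    eps = 1e-12
    q = min(max(q, eps), 1.0 - eps)
    p = min(max(p, eps), 1.0 - eps)
    return q * math.log(q / p) + (1.0 - q) * math.log((1.0 - q) / (1.0 - p))

def kl_inverse(q, c, upper=True, tol=1e-9):
    """Invert the binary KL by bisection: the largest p >= q (upper=True),
    or smallest p <= q (upper=False), such that kl(q || p) <= c."""
    lo, hi = (q, 1.0) if upper else (0.0, q)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if binary_kl(q, mid) <= c:
            lo, hi = (mid, hi) if upper else (lo, mid)  # mid is feasible
        else:
            lo, hi = (lo, mid) if upper else (mid, hi)
    return lo if upper else hi

def seeger_bound(emp_risk, kl_div, n, delta, upper=True):
    """Seeger/Maurer-style high-probability bound on the true Gibbs risk."""
    c = (kl_div + math.log(2.0 * math.sqrt(n) / delta)) / n
    return kl_inverse(emp_risk, c, upper=upper)

def self_bounding_objective(emp_risk, emp_a, emp_b, kl_div,
                            n, n_a, n_b, delta, lam):
    """Risk bound plus lam times a fairness bound; the fairness part
    controls |R_a - R_b| from both sides using upper and lower bounds
    per group, with the confidence budget split by a union bound
    (the delta allocation here is an illustrative choice)."""
    risk_bound = seeger_bound(emp_risk, kl_div, n, delta / 2.0)
    up_a = seeger_bound(emp_a, kl_div, n_a, delta / 8.0, upper=True)
    lo_a = seeger_bound(emp_a, kl_div, n_a, delta / 8.0, upper=False)
    up_b = seeger_bound(emp_b, kl_div, n_b, delta / 8.0, upper=True)
    lo_b = seeger_bound(emp_b, kl_div, n_b, delta / 8.0, upper=False)
    fairness_bound = max(up_a - lo_b, up_b - lo_a, 0.0)
    return risk_bound + lam * fairness_bound
```

In a self‑bounding learner, a differentiable surrogate of this objective would be minimized over the posterior \(\rho\), so the certificate reported at the end is the very quantity the training procedure optimized.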

