Fairness Under Group-Conditional Prior Probability Shift: Invariance, Drift, and Target-Aware Post-Processing
Machine learning systems are often trained and evaluated for fairness on historical data, yet deployed in environments where conditions have shifted. A particularly common form of shift occurs when the prevalence of positive outcomes changes differently across demographic groups; for example, disease rates may rise faster in one population than another, or economic conditions may affect loan default rates unequally. We study group-conditional prior probability shift (GPPS), where the label prevalence $P(Y=1\mid A=a)$ may change between training and deployment while the feature-generation process $P(X\mid Y,A)$ remains stable. Our analysis yields three main contributions. First, we prove a fundamental dichotomy: fairness criteria based on error rates (equalized odds) are structurally invariant under GPPS, while acceptance-rate criteria (demographic parity) can drift, and we prove this drift is unavoidable for non-trivial classifiers (shift-robust impossibility). Second, we show that target-domain risk and fairness metrics are identifiable without target labels: the invariance of ROC quantities under GPPS enables consistent estimation from source labels and unlabeled target data alone, with finite-sample guarantees. Third, we propose TAP-GPPS, a label-free post-processing algorithm that estimates prevalences from unlabeled data, corrects posteriors, and selects thresholds to satisfy demographic parity in the target domain. Experiments validate our theoretical predictions and demonstrate that TAP-GPPS achieves target fairness with minimal utility loss.
Research Summary
This paper investigates how fairness guarantees behave when the prevalence of positive outcomes changes differently across demographic groups between training and deployment, a scenario the authors formalize as Group-Conditional Prior Probability Shift (GPPS). Under GPPS the conditional distribution of features given label and group, P(X | Y, A), remains unchanged, while the group-specific label priors π_{s,a}=P(Y=1 | A=a) may differ between the source (training) and target (deployment) domains. The authors first prove a fundamental dichotomy: fairness criteria that depend only on error rates conditioned on the true label (e.g., Equalized Odds and Equality of Opportunity) are invariant under GPPS, whereas criteria that depend on acceptance rates (e.g., Demographic Parity) or predictive values (e.g., Predictive Parity) drift because they involve the group-specific priors. Lemma 4.1 shows that any statistic of the score function that conditions on (Y, A) has the same distribution in the source and target domains, which immediately yields Corollary 4.2: group-wise ROC curves (TPR and FPR as functions of the threshold) are unchanged by GPPS. Theorem 4.3 leverages this to guarantee that any classifier satisfying Equalized Odds on the source will continue to satisfy it on the target without any adaptation.
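The invariance in Lemma 4.1 is easy to see empirically. The following minimal simulation is our own illustration, not the paper's code: class-conditional score distributions P(S | Y) are held fixed (the Gaussian parameters below are arbitrary assumptions) while only the prior changes, mimicking GPPS within one group.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_scores(pi, n=500_000):
    """Labels with prevalence pi; class-conditional scores fixed across domains."""
    y = rng.random(n) < pi
    s = np.where(y, rng.normal(1.0, 1.0, n), rng.normal(-1.0, 1.0, n))
    return y, s

def rates(y, s, t=0.0):
    tpr = (s[y] > t).mean()    # P(S > t | Y=1): conditions on Y, so prior-free
    fpr = (s[~y] > t).mean()   # P(S > t | Y=0): likewise invariant under GPPS
    acc = (s > t).mean()       # P(S > t): mixes the classes, so it moves with pi
    return tpr, fpr, acc

tpr_s, fpr_s, ar_s = rates(*sample_scores(pi=0.2))  # "source" prevalence
tpr_t, fpr_t, ar_t = rates(*sample_scores(pi=0.6))  # "target" prevalence

print(abs(tpr_s - tpr_t) < 0.02, abs(fpr_s - fpr_t) < 0.02)  # ROC point: stable
print(ar_t - ar_s > 0.2)  # acceptance rate: drifts with the prior
```

The same contrast, run per group, is exactly the dichotomy the paper formalizes: quantities conditioned on (Y, A) hold still while marginal acceptance rates move.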
In contrast, Proposition 4.6 expresses the acceptance rate as a linear function of the prior, AR_{s,a}(t)=π_{s,a}·TPR_a(t)+(1−π_{s,a})·FPR_a(t). Consequently, Demographic Parity gaps can change whenever the priors shift differentially across groups; Corollary 4.7 quantifies this drift and shows that unless the classifier is uninformative (TPR=FPR) or the priors shift uniformly, DP will inevitably be violated. Theorem 4.11 formalizes an impossibility result: a non‑trivial classifier cannot satisfy Demographic Parity simultaneously under two distinct prior regimes unless degenerate conditions hold. Predictive Parity suffers even more because PPV depends non‑linearly on the prior (Proposition 4.9), amplifying small prevalence changes into large fairness violations.
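The linear form in Proposition 4.6 makes the drift computable by hand. A small numeric sketch (the TPR, FPR, and prior values below are made up for illustration, not taken from the paper):

```python
def acceptance_rate(pi, tpr, fpr):
    """Proposition 4.6's linear form: AR(t) = pi * TPR(t) + (1 - pi) * FPR(t)."""
    return pi * tpr + (1 - pi) * fpr

# Both groups share the same (invariant) ROC point at the chosen threshold,
# so Demographic Parity holds in the source, where their priors coincide.
tpr, fpr = 0.80, 0.20
gap_src = acceptance_rate(0.30, tpr, fpr) - acceptance_rate(0.30, tpr, fpr)

# In the target, prevalence rises only in group a (0.30 -> 0.50);
# group b's prior is unchanged.
gap_tgt = acceptance_rate(0.50, tpr, fpr) - acceptance_rate(0.30, tpr, fpr)

# Drift = (delta_pi_a - delta_pi_b) * (TPR - FPR) = 0.20 * 0.60 = 0.12;
# it vanishes only if TPR == FPR or the priors shift uniformly.
print(gap_src, gap_tgt)  # 0.0 and ~0.12
```

This is the degenerate-case structure behind Theorem 4.11: with an informative classifier (TPR ≠ FPR) and a differential prior shift, the DP gap cannot stay at zero in both domains.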
A major contribution is the identification of target-domain risk and fairness metrics without any target labels. Because ROC quantities are invariant, the authors can estimate TPR and FPR from labeled source data, then estimate the unknown target priors π̂_{tgt,a} from unlabeled target features using either an EM algorithm (Saerens et al., 2002) or the Black-Box Shift Estimation (BBSE) method (Lipton et al., 2018). Theorem 6.1 proves that, given consistent prior estimates π̂_{tgt,a} together with the invariant ROC quantities, the target risk, the Demographic Parity gap, and Predictive Parity can all be consistently estimated. Finite-sample guarantees are provided in Theorem C.1, giving explicit concentration bounds for the estimators.
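In the binary case with hard predictions, BBSE reduces to solving a single linear equation in the unknown prior. The sketch below is our own illustration of that reduction on made-up synthetic data, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

def bbse_binary_prior(yhat_src, y_src, yhat_tgt):
    """Solve  P_tgt(yhat=1) = pi * P(yhat=1|Y=1) + (1-pi) * P(yhat=1|Y=0)  for pi.
    The conditionals are estimated on labeled source data and, under GPPS,
    carry over to the target unchanged; only unlabeled target data is needed."""
    tpr = yhat_src[y_src == 1].mean()
    fpr = yhat_src[y_src == 0].mean()
    q = yhat_tgt.mean()
    # needs an informative predictor (tpr != fpr), echoing the impossibility case
    return float(np.clip((q - fpr) / (tpr - fpr), 0.0, 1.0))

def draw(pi, n=200_000):
    """A GPPS world: P(X|Y) is fixed, only the prior pi moves across domains."""
    y = (rng.random(n) < pi).astype(int)
    x = rng.normal(2.0 * y - 1.0, 1.0, n)
    return y, (x > 0).astype(int)  # hard predictions from a fixed threshold

y_src, yhat_src = draw(0.30)
_, yhat_tgt = draw(0.60)  # target labels are drawn here but never used
pi_hat = bbse_binary_prior(yhat_src, y_src, yhat_tgt)
print(pi_hat)  # close to the true target prior 0.60
```

Run group by group, the same estimator yields the π̂_{tgt,a} that feed into the identifiability results of Theorem 6.1.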
Building on these theoretical insights, the authors propose TAP‑GPPS, a label‑free post‑processing algorithm. TAP‑GPPS proceeds in three steps: (1) estimate target priors from unlabeled data; (2) correct the classifier’s posterior probabilities using Bayes’ rule with the estimated priors; (3) select group‑specific thresholds that enforce the desired Demographic Parity in the target domain. The algorithm requires only the pre‑trained score function and unlabeled target data, making it highly practical for settings where acquiring target labels is costly or delayed.
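Step (2), the posterior correction, has a closed form under the GPPS assumption. The function below is a minimal sketch of the standard prior re-weighting via Bayes' rule (as in Saerens et al., 2002), written by us for illustration rather than taken from the paper:

```python
def shift_posterior(p_src, pi_src, pi_tgt):
    """Re-weight a source posterior P_src(Y=1 | x, A=a) to the target prior.
    Valid because P(X | Y, A) is shared across domains under GPPS, so only
    the prior odds change; pi_src and pi_tgt are the group-specific priors."""
    num = p_src * (pi_tgt / pi_src)
    den = num + (1.0 - p_src) * ((1.0 - pi_tgt) / (1.0 - pi_src))
    return num / den

# A borderline source score becomes more confident when the group's
# prevalence rises in the target (prior 0.2 -> 0.5):
p = shift_posterior(0.5, pi_src=0.2, pi_tgt=0.5)
print(p)  # ~0.8
```

Step (3) then sweeps group-specific thresholds over these corrected posteriors until the target-domain acceptance rates are equalized.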
Empirical evaluation spans synthetic data and semi‑synthetic benchmarks derived from medical (cardiovascular disease), credit (loan default), and criminal‑justice (recidivism) domains. Experiments confirm that: (i) Equalized Odds remains unchanged under GPPS, matching the theoretical invariance; (ii) Demographic Parity indeed drifts when priors shift, and the magnitude aligns with the derived linear formula; (iii) TAP‑GPPS successfully restores Demographic Parity in the target domain while incurring negligible utility loss (often <2% drop in AUC or accuracy). Comparisons against existing label‑free correction methods (importance weighting, covariate‑shift adjustments) demonstrate superior fairness restoration and better preservation of predictive performance, especially under heterogeneous prior shifts.
The paper’s implications are twofold. First, practitioners must recognize that not all fairness notions are equally robust to prior shifts; error-rate-based notions (Equalized Odds) are safe under GPPS, whereas acceptance-rate-based notions (Demographic Parity) are fundamentally vulnerable unless the classifier is uninformative or the priors shift uniformly. Second, the proposed identifiability results and the TAP-GPPS algorithm provide a concrete, label-free pathway to target-domain fairness when only unlabeled deployment data are available. Limitations include the strong assumption that P(X|Y,A) is perfectly invariant (a condition that may be violated in practice) and the focus on binary classification with a single sensitive attribute. Extensions to multi-class settings, multiple protected attributes, and partial violations of the GPPS assumption are promising directions for future work.
In summary, the paper delivers a rigorous theoretical framework for fairness under group‑conditional prior shifts, establishes clear impossibility boundaries, and offers a practical, label‑free post‑processing solution that aligns with real‑world deployment constraints.