Individual Fairness In Strategic Classification
Strategic classification, where individuals modify their features to influence machine learning (ML) decisions, presents critical fairness challenges. While group fairness in this setting has been widely studied, individual fairness remains underexplored. We analyze threshold-based classifiers and prove that deterministic thresholds violate individual fairness. Then, we investigate the possibility of using a randomized classifier to achieve individual fairness. We introduce conditions under which a randomized classifier ensures individual fairness and leverage these conditions to find an optimal and individually fair randomized classifier through a linear programming problem. Additionally, we demonstrate that our approach can be extended to group fairness notions. Experiments on real-world datasets confirm that our method effectively mitigates unfairness and improves the fairness-accuracy trade-off.
💡 Research Summary
Strategic classification studies how individuals may manipulate their observable features in order to receive favorable outcomes from a machine‑learning model. While much of the fairness literature in this setting has focused on group‑level notions (statistical parity, equal opportunity, etc.), the question of individual fairness—“similar individuals should be treated similarly”—has received far less attention. This paper fills that gap by rigorously analyzing threshold‑based classifiers under strategic behavior and proposing a principled randomized‑threshold solution that satisfies individual fairness while preserving predictive performance.
The authors first formalize a binary classification problem in a d‑dimensional feature space X⊂ℝ^d with true label Y∈{0,1}. Individuals can change their feature vector x to x′ at a cost c(x,x′). The institution’s utility is classification accuracy on the adjusted features, whereas an individual’s utility balances the gain from a positive prediction against the incurred cost, weighted by a parameter λ. The best‑response mapping Δ_x(f) selects the feature modification that maximizes the individual’s utility for a given classifier f.
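The best-response dynamic above can be made concrete with a small sketch. The snippet below assumes a one-dimensional feature, a linear cost c(x,x′)=|x′−x|, and an individual utility of the form f(x′) − λ·c(x,x′); these are illustrative simplifications of the paper's general d-dimensional setting, not its exact formulation.

```python
def deterministic_threshold(x, t):
    """Deterministic threshold classifier f(x; t) = 1{x >= t} (1-D sketch)."""
    return 1.0 if x >= t else 0.0

def best_response(x, t, lam):
    """Best-response mapping Delta_x(f): choose x' maximizing the
    individual's utility u(x') = f(x'; t) - lam * |x' - x|.

    Illustrative assumptions: 1-D feature, absolute-value cost, and a
    gain of 1 for a positive prediction (weighted against cost by lam).
    """
    if x >= t:
        return x  # already classified positive: no reason to move
    # Moving past the threshold never pays more than moving exactly to it,
    # so the only candidate moves are "stay at x" or "move to t".
    gain = 1.0 - lam * (t - x)  # utility of moving exactly to the threshold
    return t if gain > 0.0 else x  # move only when the gain beats the cost
```

For example, with t = 0.5 and λ = 2, an individual at x = 0.4 gains 1 − 2·0.1 = 0.8 by moving to the threshold, so the best response is to move; an individual at x = 0 would pay more than the gain is worth and stays put.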
A deterministic threshold classifier f(x;t)=1{x ≥ t} assigns the positive outcome exactly when the feature meets the threshold t. The authors prove that any such deterministic threshold violates individual fairness once individuals respond strategically. They then turn to randomized classifiers: they derive conditions under which randomization ensures individual fairness, and show that an optimal, individually fair randomized classifier can be computed by solving a linear programming problem. The construction further extends to group-fairness notions, and experiments on real-world datasets show that it mitigates unfairness while improving the fairness–accuracy trade-off.
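The linear-programming construction can be sketched on a discretized feature grid. The snippet below is an illustrative simplification, not the paper's exact program: it assumes acceptance probabilities p_i as decision variables on grid cells, encodes individual fairness as a Lipschitz constraint |p_i − p_j| ≤ L·|x_i − x_j| between neighbouring cells, and maximizes expected accuracy from per-cell label statistics.

```python
import numpy as np
from scipy.optimize import linprog

def fair_randomized_classifier(grid, pos_frac, counts, L):
    """Solve a small LP for acceptance probabilities p_i on a 1-D feature
    grid, maximizing expected accuracy subject to the individual-fairness
    (Lipschitz) constraint |p_i - p_j| <= L * |x_i - x_j|.

    grid:     grid cell locations x_i
    pos_frac: fraction of positive labels (Y = 1) in each cell
    counts:   number of individuals in each cell
    """
    n = len(grid)
    # Expected correct predictions in cell i:
    #   counts_i * (pos_frac_i * p_i + (1 - pos_frac_i) * (1 - p_i)),
    # whose p_i-coefficient is counts_i * (2 * pos_frac_i - 1).
    # linprog minimizes, so negate to maximize accuracy.
    c = -(counts * (2 * pos_frac - 1))
    # Lipschitz constraints between neighbouring cells, both directions.
    A_ub, b_ub = [], []
    for i in range(n - 1):
        gap = L * abs(grid[i + 1] - grid[i])
        row = np.zeros(n)
        row[i], row[i + 1] = 1.0, -1.0
        A_ub.append(row.copy()); b_ub.append(gap)   #  p_i - p_{i+1} <= gap
        A_ub.append(-row);       b_ub.append(gap)   # -p_i + p_{i+1} <= gap
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(0.0, 1.0)] * n)
    return res.x  # acceptance probability for each grid cell
```

With a loose Lipschitz bound the solution collapses to a deterministic threshold (p_i ∈ {0, 1}); tightening L forces the acceptance probability to ramp up smoothly across the threshold, which is exactly how randomization buys individual fairness at a controlled accuracy cost.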