Evolved Sample Weights for Bias Mitigation: Effectiveness Depends on Optimization Objectives


📝 Abstract

Machine learning models trained on real-world data may inadvertently make biased predictions that negatively impact marginalized communities. Reweighting is a method that can mitigate such bias in model predictions by assigning a weight to each data point used during model training. In this paper, we compare three methods for generating these weights: (1) evolving them using a Genetic Algorithm (GA), (2) computing them using only dataset characteristics, and (3) assigning equal weights to all data points. Model performance under each strategy was evaluated using paired predictive and fairness metrics, which also served as optimization objectives for the GA during evolution. Specifically, we used two predictive metrics (accuracy and area under the Receiver Operating Characteristic curve) and two fairness metrics (demographic parity difference and subgroup false negative fairness). Using experiments on eleven publicly available datasets (including two medical datasets), we show that evolved sample weights can produce models that achieve better trade-offs between fairness and predictive performance than alternative weighting methods. However, the magnitude of these benefits depends strongly on the choice of optimization objectives. Our experiments reveal that optimizing with accuracy and demographic parity difference metrics yields the largest number of datasets for which evolved weights are significantly better than other weighting strategies in optimizing both objectives.
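The two fairness metrics named in the abstract can be made concrete with a short sketch. This is an illustrative implementation, not the paper's code: demographic parity difference is taken as the gap in positive-prediction rates between sensitive groups, and the false-negative-rate helper reflects one plausible reading of "subgroup false negative fairness" (the paper's exact definition is not given in this excerpt).

```python
import numpy as np

def demographic_parity_difference(y_pred, sensitive):
    """Gap between the highest and lowest positive-prediction
    rates across sensitive groups; 0 means perfect parity."""
    y_pred = np.asarray(y_pred)
    sensitive = np.asarray(sensitive)
    rates = [y_pred[sensitive == g].mean() for g in np.unique(sensitive)]
    return max(rates) - min(rates)

def false_negative_rate(y_true, y_pred, mask):
    """False-negative rate within the subgroup selected by `mask`
    (one possible building block for subgroup false negative fairness)."""
    positives = (y_true == 1) & mask
    return ((y_pred == 0) & positives).sum() / max(positives.sum(), 1)
```

For example, a model that predicts positively for every member of one group and never for the other has a demographic parity difference of 1.0, the worst possible value.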

📄 Content

Evolved Sample Weights for Bias Mitigation: Effectiveness Depends on Optimization Objectives

ANIL K. SAINI, Cedars-Sinai Medical Center, Los Angeles, USA
JOSE GUADALUPE HERNANDEZ, Cedars-Sinai Medical Center, Los Angeles, USA
EMILY F. WONG, Cedars-Sinai Medical Center, Los Angeles, USA
DEBANSHI MISRA, University of California, Los Angeles, USA
JASON H. MOORE, Cedars-Sinai Medical Center, Los Angeles, USA
CCS Concepts: • Computing methodologies → Genetic algorithms; • Applied computing → Health informatics.

Additional Key Words and Phrases: genetic algorithm, fairness, reweighting

ACM Reference Format: Anil K. Saini, Jose Guadalupe Hernandez, Emily F. Wong, Debanshi Misra, and Jason H. Moore. 2025. Evolved Sample Weights for Bias Mitigation: Effectiveness Depends on Optimization Objectives. 1, 1 (November 2025), 15 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn

Authors' Contact Information: Anil K. Saini, anil.saini@cshs.org; Jose Guadalupe Hernandez, jose.hernandez8@cshs.org; Emily F. Wong, emily.wong@cshs.org (Cedars-Sinai Medical Center, Los Angeles, California, USA); Debanshi Misra, debanshi@ucla.edu (University of California, Los Angeles, California, USA); Jason H. Moore, jason.moore@csmc.edu (Cedars-Sinai Medical Center, Los Angeles, California, USA).

© 2025 Copyright held by the owner/author(s). Publication rights licensed to ACM. Manuscript submitted to ACM. arXiv:2511.20909v1 [cs.LG] 25 Nov 2025

1 Introduction

While machine learning (ML) has revolutionized numerous industries, it has also demonstrated the ability to perpetuate racial, gender, and other biases captured within a dataset [1]. In areas where these ML systems are used to make high-stakes decisions, such as healthcare, algorithmic bias can have unintended negative consequences (e.g., widening health disparities). All ML, regardless of whether the models are learning through supervised, unsupervised, or semi-supervised approaches, requires data. As such, there is a risk of algorithmic bias for all ML methods, given that the biases captured within the data may be unknown. Bias can arise from various sources, such as the use of incorrect features and the lack of diversity during sampling [19]. Bias may also be introduced by the configuration of an algorithm (e.g., the choice of optimization functions or regularization) [19]. Bias can be ameliorated at various stages of using the ML model: pre-processing, which modifies data prior to training and evaluation; in-processing, which involves tuning the algorithm during the training process; and post-processing, which adjusts predictions after training. Reweighting is a widely used pre-processing approach to mitigate bias in model predictions. It involves assigning weights to data points in the tra
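The idea of evolving per-sample training weights can be sketched as follows. This is a minimal hypothetical illustration, not the paper's method: the paper evolves weights with a GA against paired predictive and fairness objectives, whereas this sketch scalarizes the two objectives (accuracy minus demographic parity difference) and uses a simple elitist mutation-only GA with a weighted nearest-centroid classifier standing in for whatever model the authors train.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_predict(X_train, y_train, w, X_test):
    """Weighted nearest-centroid classifier: class centroids are
    weighted averages, so sample weights shape what the model learns."""
    centroids = np.stack([
        np.average(X_train[y_train == c], axis=0, weights=w[y_train == c])
        for c in (0, 1)
    ])
    dists = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def dpd(y_pred, s):
    """Demographic parity difference between two sensitive groups."""
    return abs(y_pred[s == 0].mean() - y_pred[s == 1].mean())

def evolve_weights(X, y, s, generations=50, pop_size=20, sigma=0.1):
    """Elitist mutation-only GA over per-sample weight vectors.
    Fitness scalarizes the paper's two objectives: accuracy minus
    unfairness (a stand-in for true multi-objective selection)."""
    pop = [np.ones(len(y)) for _ in range(pop_size)]  # start at equal weights

    def fitness(w):
        pred = fit_predict(X, y, w, X)
        return (pred == y).mean() - dpd(pred, s)

    for _ in range(generations):
        # Gaussian mutation; weights are kept strictly positive.
        children = [np.clip(p + rng.normal(0, sigma, len(p)), 1e-3, None)
                    for p in pop]
        pop = sorted(pop + children, key=fitness, reverse=True)[:pop_size]
    return pop[0]
```

Because the initial population contains the equal-weight vector and survivor selection is elitist, the returned weights can never score worse on this scalarized objective than uniform weighting, which mirrors the paper's comparison of evolved weights against the equal-weights baseline.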
