Joint Distribution-Informed Shapley Values for Sparse Counterfactual Explanations

Notice: This research summary and analysis were automatically generated using AI technology. For authoritative details, please refer to the original arXiv source.

Counterfactual explanations (CE) aim to reveal how small input changes flip a model’s prediction, yet many methods modify more features than necessary, reducing clarity and actionability. We introduce COLA, a model- and generator-agnostic post-hoc framework that refines any given CE by computing a coupling via optimal transport (OT) between factual and counterfactual sets and using it to drive a Shapley-based attribution (p-SHAP) that selects a minimal set of edits while preserving the target effect. Theoretically, we show that OT minimizes an upper bound on the W₁ divergence between factual and counterfactual outcomes and that, under mild conditions, refined counterfactuals are guaranteed not to move farther from the factuals than the originals. Empirically, across four datasets, twelve models, and five CE generators, COLA achieves the same target effects with only 26–45% of the original feature edits. On a small-scale benchmark, COLA shows near-optimality.

💡 Research Summary

The paper addresses a fundamental limitation of existing counterfactual explanation (CE) methods: they often modify far more features than necessary to achieve a desired model output, which hampers interpretability and actionability. To overcome this, the authors propose COLA (Counterfactuals with Limited Actions), a model‑agnostic and generator‑agnostic post‑hoc framework that refines any given set of counterfactuals. COLA operates in two stages. First, it computes an optimal transport (OT) coupling between the factual data set x and the initially generated counterfactual set r. By solving an entropically regularized OT problem (using the Sinkhorn algorithm), COLA obtains a transport plan pₒₜ that aligns each factual instance with the most “natural” counterfactual counterpart, minimizing the expected squared Euclidean cost. This coupling simultaneously minimizes an upper bound on the 1‑Wasserstein (W₁) distance between the factual and counterfactual distributions, providing a principled measure of distributional similarity.
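The first stage described above can be illustrated with a minimal NumPy sketch of Sinkhorn iterations for an entropically regularized OT coupling under a squared Euclidean cost. This is not the authors' implementation: the function name `sinkhorn_coupling`, the uniform marginals, the fixed iteration count, and the choice to scale the regularization by the largest cost entry are all assumptions made for illustration.

```python
import numpy as np

def sinkhorn_coupling(x, r, eps=0.1, n_iters=500):
    """Entropic OT coupling between factual rows x and counterfactual rows r.

    Hypothetical helper: assumes uniform mass on both sets and a squared
    Euclidean cost; eps is taken relative to the largest cost entry.
    """
    n, m = x.shape[0], r.shape[0]
    a = np.full(n, 1.0 / n)  # assumed uniform marginal over factuals
    b = np.full(m, 1.0 / m)  # assumed uniform marginal over counterfactuals

    # Pairwise squared Euclidean cost, shape (n, m).
    cost = ((x[:, None, :] - r[None, :, :]) ** 2).sum(axis=2)
    scale = cost.max() or 1.0  # guard against an all-zero cost matrix
    K = np.exp(-cost / (eps * scale))  # Gibbs kernel

    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)    # rescale to match the row marginal
        v = b / (K.T @ u)  # rescale to match the column marginal

    # Transport plan: entry (i, j) is the mass moved from x[i] to r[j].
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))                      # toy factual set
r = x + 0.5 + 0.1 * rng.normal(size=(5, 3))      # toy counterfactual set
P = sinkhorn_coupling(x, r)
print(P.round(3))
```

Each row of the returned plan, once normalized, can be read as a conditional distribution over counterfactuals given a factual instance, which is the kind of coupling the second stage consumes. Smaller `eps` values give a sharper, closer-to-deterministic matching but may need more iterations or a log-domain implementation for numerical stability.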
Second, the OT‑derived coupling is used to define a new Shapley‑value variant called p‑SHAP. The value function is
v_i(S) = 𝔼_{r∼p(r|x_i)}
