ExplainReduce: Generating global explanations from many local explanations

Notice: This research summary and analysis were automatically generated using AI technology. For accuracy, please refer to the original arXiv source.

Most commonly used non-linear machine learning methods are closed-box models, uninterpretable to humans. The field of explainable artificial intelligence (XAI) aims to develop tools to examine the inner workings of these closed boxes. An often-used model-agnostic approach to XAI involves using simple models as local approximations to produce so-called local explanations; examples of this approach include LIME, SHAP, and SLISEMAP. This paper shows how a large set of local explanations can be reduced to a small “proxy set” of simple models, which can act as a generative global explanation. This reduction procedure, ExplainReduce, can be formulated as an optimisation problem and approximated efficiently using greedy heuristics. We show that, for many problems, as few as five explanations can faithfully emulate the closed-box model and that our reduction procedure is competitive with other model aggregation methods.


💡 Research Summary

ExplainReduce tackles a fundamental scalability problem in post‑hoc explainable AI (XAI): the explosion of local explanations when a model‑agnostic explainer such as LIME, SHAP, or SLISEMAP is applied to every training instance. The authors observe that many of these local surrogates are redundant—similar data points often receive nearly identical explanations, and a single surrogate can accurately predict the behavior of the black‑box model for multiple points. Building on this observation, they formalize the task of compressing a large set of local explanations into a small “proxy set” that serves as a global surrogate.

The paper defines a dataset D = {(x_i, y_i)} and a trained black‑box predictor f. For each instance they generate a local model g_i (e.g., a sparse linear regression or a decision rule) that approximates f in the vicinity of x_i. A loss matrix L ∈ ℝ^{m×n} records ℓ(g_i(x_j), ŷ_j) for all pairs of explanations and data points, where ℓ is a standard loss (L2 for regression, cross‑entropy for classification) and ŷ_j = f(x_j) (or y_j if the true label is used as the oracle). An error tolerance ε defines when an explanation is “faithful” for a point.
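The loss-matrix setup above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the names `loss_matrix`, `faithful_mask`, `local_models`, `X`, and `f` are all hypothetical, and the regression (L2) case is assumed.

```python
import numpy as np

def loss_matrix(local_models, X, f):
    """L[i, j] = squared error of local surrogate g_i on point x_j,
    measured against the black-box predictions y_hat = f(X) (regression case)."""
    y_hat = f(X)                                   # oracle predictions, shape (n,)
    L = np.empty((len(local_models), len(X)))
    for i, g in enumerate(local_models):
        L[i] = (g(X) - y_hat) ** 2                 # L2 loss of g_i on every point
    return L

def faithful_mask(L, eps):
    """Boolean m x n matrix: explanation i is 'faithful' for point j if its loss <= eps."""
    return L <= eps
```

For classification one would swap the squared error for a cross-entropy term, and `y_hat` could be replaced by the true labels `y` if those are used as the oracle instead.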

Two complementary optimization objectives are introduced.

  1. Maximum Coverage (Max‑Cover): Given a budget k, select a subset S of k explanations that maximizes the proportion of data points explained within ε. This is the classic partial set‑cover problem, which is NP‑hard, but its objective is monotone submodular. The authors adopt the greedy algorithm of Nemhauser et al., which iteratively adds the explanation with the largest marginal increase in coverage; the greedy solution enjoys a (1 − 1/e) approximation guarantee.
  2. Minimum Loss (Min‑Loss): Also with a budget k, select S to minimize the average best‑case loss across all points, i.e., L(S) = (1/n) ∑_j min_{i∈S} ℓ(g_i(x_j), ŷ_j). Minimizing L(S) directly is a supermodular minimization problem, but it can be reframed as maximizing the monotone submodular function F(S) = L_base − L(S), where L_base is a constant upper bound on the loss. The same greedy scheme yields a (1 − 1/e) guarantee for this surrogate objective, which translates into bounded suboptimality for the original loss.
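The Max‑Cover objective admits a very short greedy implementation. The sketch below is an illustration under the assumptions above (function and variable names are hypothetical): `faithful` is the boolean m × n matrix marking which (explanation, point) pairs fall within the tolerance ε.

```python
import numpy as np

def greedy_max_cover(faithful, k):
    """Greedily select up to k rows (explanations) of the boolean faithfulness
    matrix so as to maximize the number of covered columns (data points)."""
    m, n = faithful.shape
    covered = np.zeros(n, dtype=bool)
    selected = []
    for _ in range(k):
        # marginal gain of each candidate: newly covered points only
        gains = (faithful & ~covered).sum(axis=1)
        best = int(np.argmax(gains))
        if gains[best] == 0:
            break                       # no explanation covers anything new
        selected.append(best)
        covered |= faithful[best]
    return selected, covered.mean()     # chosen rows and coverage fraction
```

Because coverage is monotone submodular, this loop inherits the (1 − 1/e) guarantee of Nemhauser et al.'s greedy algorithm mentioned above.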

Implementation details: the loss matrix is pre‑computed in O(mn) time and stored in memory. Each greedy iteration evaluates the marginal gains of all m candidates in O(mn) time, giving an overall O(k·m·n) runtime, which is tractable for thousands of explanations. The method is model‑agnostic as long as each local explanation can produce predictions for unseen points.
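The Min‑Loss variant follows the same greedy pattern, tracking the per‑point best loss under the current set S. As before, this is a hypothetical sketch rather than the authors' code; each iteration evaluates all m candidates against all n points, matching the O(k·m·n) total cost.

```python
import numpy as np

def greedy_min_loss(L, k):
    """Greedily select k rows of the m x n loss matrix L so as to minimize
    the mean best-case loss (1/n) * sum_j min_{i in S} L[i, j]."""
    m, n = L.shape
    best_loss = np.full(n, np.inf)      # per-point best loss under current S
    selected = []
    for _ in range(k):
        # per-point losses if row i were added: elementwise minimum, shape (m, n)
        cand = np.minimum(L, best_loss)
        means = cand.mean(axis=1)       # resulting objective for each candidate
        best = int(np.argmin(means))
        selected.append(best)
        best_loss = cand[best]
    return selected, best_loss.mean()
```

Equivalently, one can view each step as maximizing the marginal gain of the submodular surrogate F(S) = L_base − L(S); the elementwise minimum makes that gain cheap to evaluate for all candidates at once.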

Empirical evaluation spans both regression and classification benchmarks. For regression, datasets such as Boston Housing, Energy, and California Housing are used with local linear explanations from SLISEMAP, LIME, and SHAP. With k as low as 5, the proxy set covers >90 % of the data within ε and incurs less than 5 % increase in mean squared error relative to the original black‑box. For classification, MNIST and CIFAR‑10 are examined using linear surrogates and rule‑based explanations (e.g., LORE). With k = 3–5, the proxy set reproduces the black‑box’s predictions with <1 % accuracy loss while explaining >85 % of test instances.

The authors compare ExplainReduce against three baselines: (a) Submodular Pick (the original LIME selection method), (b) GLocalX (rule‑merging for classification), and (c) an integer‑programming approach from Li et al. (2022). Across all settings, ExplainReduce is orders of magnitude faster (seconds versus minutes/hours for the IP solver) and consistently yields higher coverage or lower loss for the same k. The paper also provides an ablation study on the impact of ε and on different types of local explanations, confirming robustness to these hyper‑parameters.

Key contributions are: (i) a unified, model‑agnostic formulation for reducing local explanations to a compact global surrogate, (ii) provable greedy algorithms with worst‑case guarantees for both coverage and loss objectives, (iii) extensive empirical validation showing that as few as five explanations can faithfully emulate complex black‑box models, and (iv) open‑source code and reproducible experiments.

Limitations include the need to pre‑specify ε and k, which may require domain expertise, and the reliance on local explanations that can generate predictions (excluding purely descriptive explanations). Future work is suggested in automatic hyper‑parameter tuning, extending the framework to non‑linear local surrogates (e.g., small neural nets), integrating the proxy set into interactive XAI dashboards, and leveraging the reduced set for downstream tasks such as outlier detection or model debugging.

