Functional Decomposition and Shapley Interactions for Interpreting Survival Models

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv paper.

Hazard and survival functions are natural, interpretable targets in time-to-event prediction, but their inherent non-additivity fundamentally limits standard additive explanation methods. We introduce Survival Functional Decomposition (SurvFD), a principled approach for analyzing feature interactions in machine learning survival models. By decomposing higher-order effects into time-dependent and time-independent components, SurvFD offers a previously unrecognized perspective on survival explanations, explicitly characterizing when and why additive explanations fail. Building on this theoretical decomposition, we propose SurvSHAP-IQ, which extends Shapley interactions to time-indexed functions, providing a practical estimator for higher-order, time-dependent interactions. Together, SurvFD and SurvSHAP-IQ establish an interaction- and time-aware interpretability approach for survival modeling, with broad applicability across time-to-event prediction tasks.


💡 Research Summary

The paper tackles a fundamental limitation of existing explainability methods for survival analysis: the non‑additive nature of hazard and survival functions. Standard additive explanations such as SHAP or LIME assume that a model’s output can be decomposed into a sum of feature contributions, an assumption that breaks down when the target is a time‑dependent function like a hazard or survival curve. To address this, the authors introduce two tightly coupled contributions.

First, Survival Functional Decomposition (SurvFD) extends the classic functional decomposition (FD) framework to the survival setting. In FD, any square‑integrable function F(x) can be written as a sum of pure effects f_M(x) over all subsets M of the feature set. SurvFD generalizes this to functions of both features and time, F(t|x), by separating a baseline time‑dependent term f_∅(t) from two families of pure effects: (i) I⋆_d, the subsets whose effects vary with time, and (ii) I⋆_id, the subsets whose effects are constant over time. The authors prove that, under feature independence, SurvFD applied to the log‑hazard exactly recovers the ground‑truth time‑dependent and time‑independent partitions (Theorem 3.2). They also characterize how non‑linear transformations (exponential to hazard, integration to survival) cause “downward” and “upward” propagation of time‑dependence, meaning that a truly time‑varying interaction can induce apparent time‑varying effects in lower‑order subsets and even generate spurious interactions in higher‑order supersets (Theorem 3.3, Corollary 3.4). Notably, even the classic Cox proportional hazards model, which is linear on the log‑hazard scale, exhibits interaction terms when expressed on the hazard or survival scale (Proposition 3.5).
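The claim that a Cox model acquires interaction terms once it leaves the log-hazard scale can be checked numerically. The following is a minimal sketch, not the paper's implementation: it assumes two independent features placed on a uniform grid, a constant baseline hazard `h0`, and hypothetical coefficients `b1` and `b2`, and it computes the pure interaction effect f_{12} of the functional decomposition on each of the three scales.

```python
import numpy as np

def pure_effects(F):
    """Functional decomposition of F[i, j] = F(x1_i, x2_j) on a grid.

    Assumes independent features with uniform weights on the grid, so
    conditional expectations reduce to plain row and column means.
    """
    f0 = F.mean()                                # baseline term f_empty
    f1 = F.mean(axis=1, keepdims=True) - f0      # pure main effect of x1
    f2 = F.mean(axis=0, keepdims=True) - f0      # pure main effect of x2
    f12 = F - f0 - f1 - f2                       # pure interaction effect
    return f0, f1, f2, f12

# Hypothetical Cox model: log h(t|x) = log h0 + b1*x1 + b2*x2
x1 = np.linspace(-1.0, 1.0, 21)
x2 = np.linspace(-1.0, 1.0, 21)
X1, X2 = np.meshgrid(x1, x2, indexing="ij")
b1, b2 = 0.8, -0.5   # illustrative coefficients (assumption)
h0 = 0.1             # constant baseline hazard (assumption), so H0(t) = h0 * t
t = 2.0              # a single evaluation time

log_hazard = np.log(h0) + b1 * X1 + b2 * X2
hazard = np.exp(log_hazard)
survival = np.exp(-h0 * t * np.exp(b1 * X1 + b2 * X2))

# Largest absolute pure interaction on each scale
inter_log = np.abs(pure_effects(log_hazard)[3]).max()
inter_haz = np.abs(pure_effects(hazard)[3]).max()
inter_surv = np.abs(pure_effects(survival)[3]).max()

print("log-hazard:", inter_log)
print("hazard:    ", inter_haz)
print("survival:  ", inter_surv)
```

Under these assumptions the interaction term vanishes (up to floating-point error) on the log-hazard scale, where the model is additive, but is clearly non-zero on the hazard and survival scales, because the exponential turns the sum of main effects into a product.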

Second, building on this theoretical foundation, the authors propose SurvSHAP‑IQ, a Shapley‑based estimator for interaction effects that respects the time dimension. The value function v(t|M) is defined as the expected survival curve at time t when a subset M of features is fixed to its observed values, minus the unconditional expectation. The Shapley interaction for a pair (i, j) at time t is then a weighted average, over all coalitions M ⊆ P \ { i, j } (where P denotes the full feature set), of the contribution of adding both features to M together beyond the sum of adding each one separately. This construction satisfies the four Shapley axioms (symmetry, linearity, dummy, efficiency) for every time point, and can be instantiated with either marginal or conditional reference distributions, thereby offering both “model‑true” (interventional) and “data‑true” (observational) perspectives.
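The pairwise construction described above can be sketched with exhaustive enumeration of coalitions. This is an illustrative sketch only: it uses the classical Shapley interaction index weights |M|!(p − |M| − 2)!/(p − 1)!, which may differ from the exact index and estimator used in SurvSHAP‑IQ, a marginal (interventional) reference distribution, and a hypothetical toy survival model in which features 0 and 1 interact and feature 2 is ignored. All function names are made up for this example.

```python
import itertools
import math
import numpy as np

def survival(t, X):
    """Hypothetical toy model S(t|x) = exp(-t * exp(x0 * x1)); feature 2 is unused."""
    log_h = X[:, 0] * X[:, 1]
    return np.exp(-np.outer(np.exp(log_h), t))   # shape (n_samples, n_times)

def value(t, x, background, M):
    """v(t|M): mean survival curve with features in M fixed to x
    (marginal imputation from the background data), minus the overall mean curve."""
    idx = np.array(sorted(M), dtype=int)
    Z = background.copy()
    Z[:, idx] = x[idx]
    return survival(t, Z).mean(axis=0) - survival(t, background).mean(axis=0)

def shapley_interaction(t, x, background, i, j):
    """Time-indexed Shapley interaction index for the pair (i, j)."""
    p = x.shape[0]
    rest = [k for k in range(p) if k not in (i, j)]
    phi = np.zeros(len(t))
    for size in range(len(rest) + 1):
        # Classical Shapley interaction index weight for coalitions of this size
        w = math.factorial(size) * math.factorial(p - size - 2) / math.factorial(p - 1)
        for combo in itertools.combinations(rest, size):
            M = set(combo)
            # Joint contribution of i and j beyond their separate contributions
            delta = (value(t, x, background, M | {i, j})
                     - value(t, x, background, M | {i})
                     - value(t, x, background, M | {j})
                     + value(t, x, background, M))
            phi += w * delta
    return phi

rng = np.random.default_rng(0)
background = rng.normal(size=(200, 3))   # synthetic reference data (assumption)
x = np.array([1.0, 1.0, 0.0])            # instance to explain
t = np.linspace(0.0, 3.0, 7)             # evaluation times

phi_01 = shapley_interaction(t, x, background, 0, 1)
phi_02 = shapley_interaction(t, x, background, 0, 2)
print("phi_01(t):", phi_01)
print("phi_02(t):", phi_02)
```

In this toy setting the interaction between features 0 and 1 is zero at t = 0 (every survival curve starts at 1) and becomes non-zero afterwards, while the pair involving the unused feature 2 receives zero interaction at all times, as the dummy axiom requires. Exhaustive enumeration scales exponentially in the number of features, so a practical estimator would need sampling or approximation.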

Empirically, the authors first validate SurvSHAP‑IQ on synthetic functions with known interaction structures, demonstrating accurate recovery of both the magnitude and temporal pattern of interactions, as well as strict local accuracy (the sum of all contributions equals the model’s prediction at each time). They then apply the method to several real‑world cancer survival datasets that combine clinical variables, genomic markers, and imaging features. Using state‑of‑the‑art survival learners (e.g., XGBoost‑survival, DeepSurv), they compute time‑resolved interaction heatmaps. Notable findings include a strong early‑phase interaction between a specific mutation and a chemotherapy regimen that fades after one year, and a persistent interaction between age and a radiomic texture feature influencing long‑term survival. These patterns are invisible to existing survival‑specific explainers such as SurvLIME or SurvSHAP(t), highlighting the added value of time‑aware interaction analysis.

In summary, the paper makes three key contributions: (1) a rigorous functional‑decomposition theory for survival predictions that distinguishes time‑dependent from time‑independent effects; (2) a practical Shapley‑interaction estimator, SurvSHAP‑IQ, that quantifies higher‑order, time‑varying interactions while preserving the core Shapley properties; and (3) thorough validation showing that the approach recovers ground‑truth interactions in simulation and yields clinically interpretable insights in real data. By bridging the gap between non‑additive survival outcomes and additive explanation frameworks, SurvFD and SurvSHAP‑IQ open new avenues for transparent, interaction‑aware decision support in medical prognosis and beyond.

