GRAFT: Decoupling Ranking and Calibration for Survival Analysis

Notice: This research summary and analysis were automatically generated using AI technology. For full accuracy, please refer to the original arXiv source.

Survival analysis is complicated by censored data, high-dimensional features, and non-linear interactions. Classical models are interpretable but restrictive, while deep learning models are flexible but often non-interpretable and sensitive to noise. We propose GRAFT (Gated Residual Accelerated Failure Time), a novel AFT model that decouples prognostic ranking from calibration. GRAFT’s hybrid architecture combines a linear AFT model with a non-linear residual neural network, and it also integrates stochastic gates for automatic, end-to-end feature selection. The model is trained by directly optimizing a differentiable, C-index-aligned ranking loss using stochastic conditional imputation from local Kaplan-Meier estimators. In public benchmarks, GRAFT outperforms baselines in discrimination and calibration, while remaining robust and sparse in high-noise settings.


💡 Research Summary

The paper introduces GRAFT (Gated Residual Accelerated Failure Time), a novel survival analysis framework that simultaneously addresses three major challenges: right‑censoring, high‑dimensional feature spaces, and complex non‑linear interactions. Traditional approaches such as Kaplan‑Meier, Cox proportional hazards, and parametric AFT models either lack covariate handling, assume proportional hazards, or impose rigid distributional forms, limiting their applicability to modern biomedical data. Recent deep‑learning models (e.g., DeepSurv, DeepHit) capture non‑linear patterns but typically use all input features, making them vulnerable to noise and over‑fitting, and they do not provide built‑in feature selection.

GRAFT’s architecture is a hybrid of a linear AFT component and a residual multilayer perceptron (MLP). Input features x are first modulated by a stochastic gate vector g (element‑wise product g⊙x). The gated features are fed directly to the linear part (βᵀ(g⊙x) + μ) and also to the MLP fθ(g⊙x), whose output is added to the linear term, yielding a prognostic score s = βᵀ(g⊙x) + μ + fθ(g⊙x).
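The gated hybrid score described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the summary does not specify how the stochastic gates are parameterized, so the clipped-Gaussian relaxation in `sample_gates`, the single hidden ReLU layer, and all names (`gate_mu`, `sigma`, `W1`, `b1`, `w2`, `b2`) are assumptions chosen only to make the data flow concrete. The only part taken from the text is that the same gated input g⊙x feeds both the linear term and the MLP, and that the two outputs are summed.

```python
import numpy as np


def sample_gates(gate_mu, sigma=0.5, training=True, rng=None):
    """Draw a gate vector g in [0, 1]^d.

    ASSUMPTION: clipped-Gaussian relaxation, g = clip(gate_mu + sigma * eps, 0, 1).
    The summary only says the gates are stochastic; this form is illustrative.
    """
    gate_mu = np.asarray(gate_mu, dtype=float)
    if training:
        rng = np.random.default_rng() if rng is None else rng
        noise = sigma * rng.standard_normal(gate_mu.shape)
    else:
        noise = 0.0  # deterministic gates at evaluation time
    return np.clip(gate_mu + noise, 0.0, 1.0)


def graft_score(x, g, beta, mu, W1, b1, w2, b2):
    """Prognostic score: linear AFT term plus MLP residual on the gated input.

    x: (n, d) features, g: (d,) gates, beta: (d,) linear weights, mu: scalar.
    W1: (d, h), b1: (h,), w2: (h,), b2: scalar (hypothetical one-hidden-layer MLP).
    """
    gx = g * x                                # element-wise gating, g ⊙ x
    linear = gx @ beta + mu                   # linear part: beta^T (g ⊙ x) + mu
    hidden = np.maximum(gx @ W1 + b1, 0.0)    # ReLU hidden layer
    residual = hidden @ w2 + b2               # MLP output f_theta(g ⊙ x)
    return linear + residual                  # sum of the two branches


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, h = 4, 6, 8
    x = rng.standard_normal((n, d))
    g = sample_gates(np.full(d, 0.5), rng=rng)
    s = graft_score(x, g, rng.standard_normal(d), 0.0,
                    rng.standard_normal((d, h)), np.zeros(h),
                    rng.standard_normal(h), 0.0)
    print(s.shape)  # one score per row of x
```

A feature whose gate is exactly 0 drops out of both branches at once, which is how gating can act as feature selection for the whole model rather than for the linear term alone.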

