AI-based Prediction of Biochemical Recurrence from Biopsy and Prostatectomy Samples

Notice: This research summary and analysis were generated automatically using AI. For authoritative details, please refer to the original arXiv source.

Biochemical recurrence (BCR) after radical prostatectomy (RP) is a surrogate marker for aggressive prostate cancer with adverse outcomes, yet current prognostic tools remain imprecise. We trained an AI-based model on diagnostic prostate biopsy slides from the STHLM3 cohort (n = 676) to predict patient-specific risk of BCR, using foundation models and attention-based multiple instance learning. Generalizability was assessed across three external RP cohorts: LEOPARD (n = 508), CHIMERA (n = 95), and TCGA-PRAD (n = 379). The image-based approach achieved 5-year time-dependent AUCs of 0.64, 0.70, and 0.70, respectively. Integrating clinical variables added complementary prognostic value and enabled statistically significant risk stratification. Compared with guideline-based CAPRA-S, AI incrementally improved postoperative prognostication. These findings suggest biopsy-trained histopathology AI can generalize across specimen types to support preoperative and postoperative decision making, but the added value of AI-based multimodal approaches over simpler predictive models should be critically scrutinized in further studies.


💡 Research Summary

Biochemical recurrence (BCR) after radical prostatectomy (RP) is a widely used surrogate for aggressive prostate cancer, yet existing prognostic tools such as Gleason grade, PSA, and clinical stage provide only modest discrimination. In this study, the authors developed a deep‑learning model that predicts patient‑specific risk of BCR directly from routine histopathology whole‑slide images (WSIs) of diagnostic prostate biopsies. Using the population‑based STHLM3 cohort (n = 676, 183 BCR events, median follow‑up 9 years) as the development set, they extracted high‑dimensional tile embeddings with three pretrained foundation models (UNI2, VIRCHOW2, CONCH). Tile‑level features were aggregated with an attention‑based multiple‑instance learning (MIL) framework optimized with a Cox proportional‑hazards loss, producing a continuous risk score.
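The aggregation step described above can be illustrated with a minimal numpy sketch: attention-weighted pooling of tile embeddings into a slide-level risk score, trained against a Cox partial log-likelihood. This is a generic reconstruction of the kind of pipeline the paper describes, not the authors' code; all function names, weight matrices, and dimensions are illustrative.

```python
import numpy as np

def attention_mil_score(tile_embeddings, W_att, v_att, w_out):
    """Attention-based MIL pooling over tile embeddings.

    tile_embeddings: (n_tiles, d) features from a pretrained foundation model.
    W_att (d, k), v_att (k,): attention parameters; w_out (d,): linear head.
    Returns a scalar risk score for the slide.
    """
    h = np.tanh(tile_embeddings @ W_att)         # (n_tiles, k) hidden attention
    logits = h @ v_att                           # (n_tiles,) one logit per tile
    weights = np.exp(logits - logits.max())      # stable softmax over tiles
    weights /= weights.sum()
    slide_embedding = weights @ tile_embeddings  # (d,) attention-weighted mean
    return float(slide_embedding @ w_out)        # linear head -> risk score

def cox_partial_log_likelihood(scores, times, events):
    """Cox partial log-likelihood (its negative is the training loss).

    scores: predicted risk scores; times: follow-up times;
    events: 1 if BCR observed, 0 if censored.
    """
    order = np.argsort(-times)                   # descending time: risk sets nest
    scores, events = scores[order], events[order]
    log_cum = np.logaddexp.accumulate(scores)    # log of running risk-set sums
    return float(np.sum((scores - log_cum)[events.astype(bool)]))
```

Each event term contributes its score minus the log-sum-exp of all scores still at risk, so the partial log-likelihood is always non-positive and is maximized when high scores align with early events.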

Three model variants were trained: (1) a clinical‑only model using age, pre‑treatment PSA, and ISUP grade; (2) an image‑only model using the tile embeddings; and (3) a multimodal model that concatenates clinical variables with image embeddings before the MIL aggregator. Internal 5‑fold cross‑validation identified the best performing configurations: the clinical‑only model (5‑year AUC 0.70 ± 0.12), the CONCH‑based image‑only model (AUC 0.70 ± 0.07), and the UNI2‑based multimodal model (AUC 0.73 ± 0.03).
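The multimodal variant (3) amounts to broadcasting the patient-level clinical vector onto every tile embedding before the MIL aggregator sees it. A hypothetical sketch of that fusion step (the paper's exact fusion point and feature ordering may differ):

```python
import numpy as np

def fuse_clinical_and_tiles(clinical, tile_embeddings):
    """Append the patient-level clinical vector to every tile embedding.

    clinical: (c,) e.g. [age, log PSA, ISUP grade]
    tile_embeddings: (n_tiles, d) foundation-model features
    Returns (n_tiles, d + c), ready for the MIL aggregator.
    """
    n_tiles = tile_embeddings.shape[0]
    repeated = np.tile(clinical, (n_tiles, 1))   # same clinical row per tile
    return np.hstack([tile_embeddings, repeated])
```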

External validation was performed on three independent RP cohorts: LEOPARD (n = 508), CHIMERA (n = 95), and TCGA‑PRAD (n = 379). In LEOPARD, where only image data were available, the image‑only model achieved a 5‑year AUC of 0.64 (95% CI 0.55–0.72). In CHIMERA, the clinical‑only model reached AUC 0.80, the image‑only model AUC 0.70, and the multimodal model the highest AUC 0.82 (95% CI 0.69–0.94). In TCGA‑PRAD, the clinical‑only model performed best (AUC 0.76), while the multimodal model achieved AUC 0.72 and the image‑only model AUC 0.70. These results illustrate that the relative contribution of image‑derived versus clinical information varies across cohorts, likely reflecting differences in patient characteristics, slide preparation, and follow‑up duration.
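A time-dependent AUC of the cumulative/dynamic type, as reported at the 5-year horizon above, can be sketched as follows. Note that this simplified version omits the inverse-probability-of-censoring weights that published estimators typically apply, so it is a conceptual illustration rather than the estimator used in the paper.

```python
import numpy as np

def cumulative_dynamic_auc(risk, time, event, horizon):
    """Unweighted cumulative/dynamic AUC at a fixed time horizon.

    Cases:    subjects with an observed event by `horizon`.
    Controls: subjects still event-free beyond `horizon`.
    Returns the probability that a random case outranks a random control.
    """
    cases = (time <= horizon) & (event == 1)
    controls = time > horizon
    if not cases.any() or not controls.any():
        raise ValueError("need at least one case and one control")
    r_case = risk[cases][:, None]                # (n_cases, 1)
    r_ctrl = risk[controls][None, :]             # (1, n_controls)
    # all case/control pairs; ties count half, as in the usual AUC
    return float((r_case > r_ctrl).mean() + 0.5 * (r_case == r_ctrl).mean())
```

A score that ranks every 5-year recurrence above every recurrence-free patient yields AUC 1.0; reversing the score yields 0.0.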

To benchmark against current practice, the authors compared the multimodal AI model with the postoperative CAPRA‑S score, which incorporates pathological stage, margin status, and other clinicopathologic variables. In CHIMERA, CAPRA‑S yielded an AUC of 0.79, slightly lower than the multimodal AI model (0.82). When the AI risk score was added to CAPRA‑S, the combined model reached an AUC of 0.83 and demonstrated a statistically significant improvement in model fit (likelihood‑ratio χ² = 8.20, p = 0.004). In TCGA‑PRAD, CAPRA‑S alone outperformed the AI model (AUC 0.76 vs. 0.72), yet the combined model again improved discrimination to AUC 0.79 (χ² = 8.09, p = 0.004). Thus, histopathology‑derived deep features capture prognostic information that is complementary to established clinicopathologic predictors.
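The likelihood-ratio test used above compares nested Cox models: CAPRA‑S alone versus CAPRA‑S plus the AI risk score, which adds one parameter. A small stdlib sketch (assuming one degree of freedom, as when a single score is added; for df = 1 the chi-square survival function reduces to erfc(√(x/2))):

```python
import math

def likelihood_ratio_test(loglik_reduced, loglik_full):
    """LR test for nested models differing by one parameter.

    chi2 = 2 * (ll_full - ll_reduced); p-value from the chi-square
    distribution with 1 degree of freedom via erfc(sqrt(chi2 / 2)).
    Larger df would need an incomplete-gamma routine instead.
    """
    chi2 = 2.0 * (loglik_full - loglik_reduced)
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p
```

For example, hypothetical log-likelihoods of -100.0 (reduced) and -95.9 (full) reproduce χ² = 8.2 and p ≈ 0.004, matching the magnitude of the CHIMERA comparison reported above.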

Risk stratification analyses divided patients into quartiles based on the AI risk scores. Kaplan–Meier curves showed clear, monotonic separation of BCR‑free survival across quartiles in all external cohorts (p < 0.05). Patients in the highest quartile experienced markedly earlier recurrences, suggesting that AI‑derived scores could be used to identify individuals who may benefit from intensified surveillance or early adjuvant therapy, while low‑risk patients might avoid overtreatment.
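The quartile stratification behind such Kaplan–Meier curves can be sketched with a plain product-limit estimator; this is a generic implementation of the standard method, not the authors' analysis code.

```python
import numpy as np

def kaplan_meier(time, event):
    """Product-limit survivor estimate: returns (event_times, S(t)).

    Censored subjects drop out of the risk set between event times
    without contributing an event.
    """
    uniq = np.unique(time[event == 1])           # distinct event times
    surv, s = [], 1.0
    for t in uniq:
        at_risk = np.sum(time >= t)              # still under observation at t
        d = np.sum((time == t) & (event == 1))   # events exactly at t
        s *= 1.0 - d / at_risk
        surv.append(s)
    return uniq, np.array(surv)

def quartile_groups(risk):
    """Assign each patient to a risk quartile (0 = lowest, 3 = highest)."""
    cuts = np.quantile(risk, [0.25, 0.5, 0.75])
    return np.searchsorted(cuts, risk, side="right")
```

Plotting `kaplan_meier` per quartile group (plus a log-rank test across groups) reproduces the kind of stratified BCR-free survival analysis described above.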

The study acknowledges several limitations. The CHIMERA cohort is relatively small, which may affect the stability of performance estimates. Complete clinical data were unavailable for the LEOPARD cohort, precluding multimodal evaluation there. Moreover, the model was trained on pre‑treatment biopsy material but validated on postoperative prostatectomy specimens, introducing potential domain shift due to differences in tissue handling and tumor representation. Finally, interpretability of the deep‑learning features remains limited, and prospective validation in a real‑world clinical workflow is required before adoption.

In summary, this work demonstrates that AI models trained on diagnostic prostate biopsies can predict biochemical recurrence after radical prostatectomy, generalize across diverse external cohorts, and provide additive prognostic value beyond the CAPRA‑S score. The findings support the potential of histopathology‑based deep learning as a decision‑support tool for both preoperative risk assessment and postoperative management, pending further validation and integration into clinical pathways.

