Agent-Based Output Drift Detection for Breast Cancer Response Prediction in a Multisite Clinical Decision Support System

Modern clinical decision support systems can concurrently serve multiple, independent medical imaging institutions, but their predictive performance may degrade across sites due to variations in patient populations, imaging hardware, and acquisition protocols. Continuous surveillance of predictive model outputs offers a safe and reliable approach for identifying such distributional shifts without ground truth labels. However, most existing methods rely on centralized monitoring of aggregated predictions, overlooking site-specific drift dynamics. We propose an agent-based framework for detecting drift and assessing its severity in multisite clinical AI systems. To evaluate its effectiveness, we simulate a multi-center environment for output-based drift detection, assigning each site a drift monitoring agent that performs batch-wise comparisons of model outputs against a reference distribution. We analyse several multi-center monitoring schemes that differ in how the reference is obtained (site-specific, global, production-only, and adaptive), alongside a centralized baseline. Results on real-world breast cancer imaging data using a pathological complete response prediction model show that all multi-center schemes outperform centralized monitoring, with F1-score improvements of up to 10.3% in drift detection. In the absence of site-specific references, the adaptive scheme performs best, with F1-scores of 74.3% for drift detection and 83.7% for drift severity classification. These findings suggest that adaptive, site-aware agent-based drift monitoring can enhance the reliability of multisite clinical decision support systems.

💡 Research Summary

This paper proposes an agent-based framework for detecting output drift and assessing its severity in clinical decision support systems that serve multiple medical imaging institutions. An independent monitoring agent is deployed at each site to compare model outputs with a reference distribution, which allows distributional shifts to be detected without ground truth labels.

The authors simulate a multi-center environment for output-based drift detection, assigning each institution an agent that performs batch-wise comparisons against various types of reference distributions (site-specific, global, production-only, and adaptive). The effectiveness of these schemes is evaluated using real-world breast cancer imaging data with a pathological complete response prediction model. Results show that all multi-center monitoring schemes outperform centralized monitoring in drift detection, achieving up to 10.3% higher F1 scores.
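The batch-wise comparison performed by each site agent can be illustrated with a minimal sketch. The summary does not name the statistical test or the decision threshold the authors use, so everything here is an assumption made for illustration: the two-sample Kolmogorov-Smirnov statistic stands in for the comparison, and `DriftAgent`, `ks_statistic`, and the `threshold` value of 0.3 are hypothetical names and settings, not the paper's method.

```python
from bisect import bisect_right
from dataclasses import dataclass


def ks_statistic(reference, batch):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap
    between the empirical CDFs of the two samples."""
    ref = sorted(reference)
    cur = sorted(batch)
    gap = 0.0
    # The maximum gap is always attained at one of the observed values.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


@dataclass
class DriftAgent:
    """Hypothetical per-site agent holding a reference set of model outputs."""
    site: str
    reference: list
    threshold: float = 0.3  # assumed cut-off, not taken from the paper

    def check(self, batch):
        """Compare one batch of model outputs against the reference."""
        stat = ks_statistic(self.reference, batch)
        return {"site": self.site, "statistic": stat, "drift": stat > self.threshold}


# Toy data: predicted probabilities spread evenly over [0, 1).
reference = [i / 100 for i in range(100)]
agent = DriftAgent(site="site_A", reference=reference)

# A batch drawn from the same range, and one concentrated near 0.8-0.9.
same = agent.check([i / 100 for i in range(0, 100, 2)])
shifted = agent.check([0.8 + i / 500 for i in range(50)])

print(same["drift"], shifted["drift"])
```

Under this sketch, the monitoring schemes compared in the paper would differ only in what is passed as `reference`: outputs from the same site, outputs pooled across sites, or a reference that is updated over time. How the authors build each reference is not described in this summary.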

In the absence of site-specific references, the adaptive scheme performs best, achieving F1 scores of 74.3% for drift detection and 83.7% for drift severity classification. These results suggest that agent-based drift monitoring can enhance the reliability of multisite clinical decision support systems by identifying output drift at individual institutions.

The study underscores the value of site-specific monitoring and suggests that adaptive, decentralized approaches detect drift more reliably than centralized monitoring in multi-institutional settings. This matters for the robustness and trustworthiness of AI-driven clinical decision support systems across diverse healthcare environments.

📜 Original Paper Content