Healthcare AI systems face major vulnerabilities to data poisoning that current defenses and regulations cannot adequately address. We analyzed eight attack scenarios in four categories: architectural attacks on convolutional neural networks, large language models, and reinforcement learning agents; infrastructure attacks exploiting federated learning and medical documentation systems; critical resource allocation attacks affecting organ transplantation and crisis triage; and supply chain attacks targeting commercial foundation models. Our findings indicate that attackers with access to only 100-500 samples can compromise healthcare AI regardless of dataset size, often achieving attack success rates above 60 percent, with detection estimated to take 6 to 12 months or, in some cases, never to occur. The distributed nature of healthcare infrastructure creates many entry points where insiders with routine access can launch attacks with limited technical skill. Privacy laws such as HIPAA and GDPR can unintentionally shield attackers by restricting the analyses needed for detection. Supply chain weaknesses allow a single compromised vendor to poison models across 50 to 200 institutions. The Medical Scribe Sybil scenario shows how coordinated fake patient visits can poison data through legitimate clinical workflows without requiring a system breach. Current regulations lack mandatory adversarial robustness testing, and federated learning can worsen risks by obscuring attribution. We recommend multilayer defenses including required adversarial testing, ensemble-based detection, privacy-preserving security mechanisms, and international coordination on AI security standards. We also question whether opaque black-box models are suitable for high-stakes clinical decisions, suggesting a shift toward interpretable systems with verifiable safety guarantees.
Consider a plausible scenario: a hospital's radiology AI systematically misses early-stage lung cancers in patients from specific ethnic backgrounds. The failure rate aligns with documented healthcare disparities, so it raises no immediate alarm. Yet this is not bias inherited from the training data distribution; it stems from approximately 250 poisoned training samples (0.025% of a million-image dataset) injected during routine data contributions by an insider with standard access. Three years pass before epidemiological analysis uncovers the pattern, a detection timeline at the extreme end of the 6-to-12-month-or-longer range observed across attack types; demographic-targeted attacks are particularly difficult to detect without longitudinal outcome studies. By then, hundreds of patients have had their diagnoses delayed to advanced stages, where treatment outcomes are substantially worse.
While hypothetical, this scenario reflects empirically demonstrated vulnerabilities. Recent security studies have shown that healthcare AI architectures can be successfully backdoored with a small number of poisoned samples, regardless of the total dataset size [1,2,3,4]. Research from October 2025 confirms that attack success depends on the absolute number of poisoned documents rather than their fraction of the corpus, a finding that fundamentally changes our understanding of poisoning threat models [1,34]. The vulnerability affects large language models (LLMs) used for clinical documentation and diagnosis [1,2], convolutional neural networks (CNNs) for medical imaging interpretation [3,4], and emerging agentic systems that autonomously navigate clinical workflows [5]. Healthcare AI deployment is accelerating without commensurate security evaluation. LLMs assist with clinical documentation [6], generate differential diagnoses [7], and provide patient-facing medical advice [8]. Medical imaging AI interprets radiographs, CT scans, and pathology slides, often with limited physician oversight [9,10]. Autonomous agentic systems are being developed to schedule appointments, triage patients, order laboratory tests, and adjust treatment protocols based on real-time clinical data [11,12]. These systems make life-critical decisions affecting vulnerable populations, yet systematic analysis of their security posture against adversarial attacks remains limited.
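To make the poisoning threat model described above concrete, the minimal Python sketch below shows how a BadNets-style backdoor could be injected into an imaging training set by stamping a small trigger patch onto a fixed absolute number of samples and relabeling them. The function names (poison_dataset, add_trigger), the trigger design, and the toy corpus are illustrative assumptions for exposition only, not the implementation used in any cited study.

```python
"""
Illustrative sketch (not from any cited study): BadNets-style backdoor
injection into an image training set. The key point it demonstrates is that
n_poison is an absolute count, independent of corpus size.
"""
import numpy as np


def add_trigger(image, patch_value=1.0, size=4):
    """Stamp a small high-intensity patch in the bottom-right corner (the 'trigger')."""
    poisoned = image.copy()
    poisoned[-size:, -size:] = patch_value
    return poisoned


def poison_dataset(images, labels, target_label, n_poison=250, seed=0):
    """Return copies of (images, labels) with n_poison samples trigger-stamped
    and relabeled to target_label. The same 250 samples would be injected
    whether the corpus holds 10,000 or 1,000,000 images."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), size=n_poison, replace=False)
    for i in idx:
        images[i] = add_trigger(images[i])
        labels[i] = target_label  # e.g., force "no finding" on triggered scans
    return images, labels


if __name__ == "__main__":
    # Toy stand-in for a medical imaging corpus (kept small so the demo stays cheap).
    X = np.random.rand(50_000, 16, 16).astype(np.float32)
    y = np.random.randint(0, 2, size=50_000)  # 0 = no finding, 1 = malignant
    Xp, yp = poison_dataset(X, y, target_label=0, n_poison=250)
    # In the scenario above, the same 250 samples sit in a ~1,000,000-image
    # corpus, i.e. only 0.025% of the training data.
    print(f"Poisoned count: 250; fraction of this toy corpus: {250 / len(X):.3%}")
```

A model trained on such data typically behaves normally on clean inputs and misclassifies only when the trigger is present, which is why backdoors of this kind are difficult to catch with standard accuracy evaluation on held-out clean test sets.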
This article provides a critical analysis of the structural vulnerabilities in healthcare AI architectures. We first examine how specific deployment methods, such as federated learning, can intensify these risks rather than reduce them. We then evaluate how existing regulatory frameworks and standard security testing fail to offer sufficient protection. Finally, we propose technical and policy measures, concluding with a fundamental reassessment of whether current AI architectures are suitable for the high-stakes environment of healthcare.
This analysis synthesizes empirical findings from published security research demonstrating attack feasibility, evaluates architectural vulnerabilities, assesses defense mechanisms and their limitations, and examines the regulatory frameworks governing healthcare AI deployment. The review covers security research published between 2019 and 2025, with particular focus on studies demonstrating empirical attack success against production-scale AI systems.
To build this analysis, we synthesized empirical evidence from 41 key security studies published between 2019 and 2025, identified through a structured review of leading AI/security venues (e.g., NeurIPS, IEEE S&P) and medical AI venues (e.g., NEJM AI, Nature Medicine). Our framework prioritized studies that demonstrated empirical attacks with quantitative results (e.g., attack success rates) and clear reproducibility. Crucially, we focused on research using realistic threat models applicable to healthcare, prioritizing attacks achievable via routine insider access and emphasizing training-time data poisoning over inference-time adversarial examples.
The attack scenarios presented throughout this paper are hypothetical examples constructed to illustrate vulnerabilities demonstrated in empirical security research. They are not reports of documented incidents.
The analysis focuses on healthcare AI systems currently deployed or under development, examining them along two dimensions: underlying neural architecture and clinical application domain. Three major architecture types dominate the healthcare AI literature and current deployments: (1) transformer-based large language models (0.6B-13B parameters) used for clinical documentation, decision support, and patient-facing interactions; (2) convolutional neural networks (ResNet, DenseNet architectures) and vision transformers for medical imaging interpretation across radiology, pathology, and dermatology; and (3) reinforcement learning agents and multi-agent systems for autonomous clinical workflow navigation and resource allocation.