Systematization of Knowledge: Security and Safety in the Model Context Protocol Ecosystem

📝 Original Info

  • Title: Systematization of Knowledge: Security and Safety in the Model Context Protocol Ecosystem
  • ArXiv ID: 2512.08290
  • Date: 2025-12-09
  • Authors: Shiva Gaire, Srijan Gyawali, Saroj Mishra, Suman Niroula, Dilip Thakur, Umesh Yadav

📝 Abstract

The Model Context Protocol (MCP) has emerged as the de facto standard for connecting Large Language Models (LLMs) to external data and tools, effectively functioning as the "USB-C for Agentic AI." While this decoupling of context and execution solves critical interoperability challenges, it introduces a profound new threat landscape where the boundary between epistemic errors (hallucinations) and security breaches (unauthorized actions) dissolves. This Systematization of Knowledge (SoK) aims to provide a comprehensive taxonomy of risks in the MCP ecosystem, distinguishing between adversarial security threats (e.g., indirect prompt injection, tool poisoning) and epistemic safety hazards (e.g., alignment failures in distributed tool delegation). We analyze the structural vulnerabilities of MCP primitives, specifically Resources, Prompts, and Tools, and demonstrate how "context" can be weaponized to trigger unauthorized operations in multi-agent environments. Furthermore, we survey state-of-the-art defenses, ranging from cryptographic provenance (ETDI) to runtime intent verification, and conclude with a roadmap for securing the transition from conversational chatbots to autonomous agentic operating systems.

📄 Full Content

The field of Artificial Intelligence is undergoing a paradigmatic shift from Conversational AI, where models generate text in isolation, to Agentic AI, where models perceive, reason, and act upon the external world. This transition requires a standardized connective tissue to link probabilistic Large Language Models (LLMs) with deterministic digital systems. The Model Context Protocol (MCP), introduced in late 2024, has emerged as this standard, effectively serving as the "USB-C for AI applications" by abstracting the complexities of data retrieval and tool execution into a unified open protocol [1], [2].

The adoption of MCP solves a critical interoperability bottleneck, famously known as the “M × N integration problem,” allowing any model to connect to any data source without bespoke adapters [3]. However, this architectural decoupling introduces profound security implications. By standardizing the interface between an LLM and local files, databases, and remote APIs, MCP significantly expands the attack surface of AI systems. It transforms the LLM from a passive text processor into an active system component with shell-level privileges, capable of executing actions based on potentially untrusted context.

As MCP adoption accelerates in enterprise environments (powering IDEs, data pipelines, and customer support agents), the industry faces a critical knowledge gap. While individual vulnerabilities like prompt injection are well-documented, there is no comprehensive framework for understanding how these threats manifest in a decentralized, protocol-driven ecosystem where control flow is determined by semantic context rather than code.

The core challenge in securing MCP ecosystems lies in the convergence of security and safety failures. In traditional software, these domains are distinct: security protects against malicious adversaries (e.g., SQL injection), while safety protects against unintended system behaviors (e.g., race conditions). In MCP, this distinction blurs.

A “security” breach, such as an attacker injecting a malicious document into a company’s knowledge base (Indirect Prompt Injection), can trigger a “safety” failure, where the model honestly but mistakenly believes it is authorized to delete a database. Conversely, a safety failure, such as model hallucination regarding a tool’s parameters, can lead to a security breach where sensitive data is exfiltrated to a public log [4].

Current defense mechanisms are ill-equipped for this duality. Traditional firewalls cannot inspect the semantic intent of a JSON-RPC message, and LLM safety filters cannot see the downstream consequences of a tool execution. This paper argues that securing MCP requires a unified threat model that treats context availability and execution privilege as inextricably linked variables.
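The firewall limitation can be illustrated with a minimal sketch. MCP messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method with a tool `name` and `arguments`. The tool name `run_sql` and both queries below are hypothetical, chosen only for illustration. A purely syntactic validator accepts the benign and the destructive request alike, because the two differ only in the meaning of one string argument.

```python
import json


def is_well_formed_tool_call(raw: str) -> bool:
    """Syntactic check only: is this a valid JSON-RPC 2.0 tools/call request?"""
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(msg, dict):
        return False
    params = msg.get("params")
    return (
        msg.get("jsonrpc") == "2.0"
        and msg.get("method") == "tools/call"
        and "id" in msg
        and isinstance(params, dict)
        and isinstance(params.get("name"), str)
        and isinstance(params.get("arguments", {}), dict)
    )


def make_call(request_id: int, query: str) -> str:
    # "run_sql" is a hypothetical tool name, not part of the MCP specification.
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "run_sql", "arguments": {"query": query}},
    })


benign = make_call(1, "SELECT name FROM customers LIMIT 10")
harmful = make_call(2, "DROP TABLE customers")

# Both requests pass: the envelope is identical, only the intent differs.
both_pass = is_well_formed_tool_call(benign) and is_well_formed_tool_call(harmful)
```

Any check that distinguishes these two requests has to reason about the content of `arguments` and the context that produced it, which is the gap the unified threat model is meant to address.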

This Systematization of Knowledge (SoK) focuses on the unique risks introduced by the Model Context Protocol ecosystem. Our analysis encompasses:

  • Protocol Primitives: Vulnerabilities inherent in the design of Resources, Prompts, and Tools as defined in the MCP specification [1].
  • Topology Risks: Threats arising from the distributed nature of Host-Client-Server interactions, including supply chain risks in open tool registries.
  • Intersection of Threats: We specifically exclude general LLM adversarial attacks (e.g., weight poisoning) unless they directly impact the protocol’s integrity or execution flow.

To our knowledge, this is the first academic survey to systematize the risks of the Model Context Protocol. Our contributions are as follows:

  1. Unified Vulnerability Taxonomy: We propose a novel taxonomy (Table III) that distinguishes between Adversarial Security Threats (e.g., tool masquerading, context poisoning) and Epistemic Safety Hazards (e.g., alignment failures in tool delegation).
  2. Structural Analysis of MCP Primitives: We analyze how the decoupling of “Context” (Resources) and “Action” (Tools) creates new classes of vulnerabilities, such as Cross-Primitive Escalation, where read-only access is weaponized to trigger write-actions [5].
  3. Survey of Emerging Defenses: We synthesize state-of-the-art mitigation strategies, moving beyond basic prompt engineering to architectural solutions like the Enhanced Tool Definition Interface (ETDI) [6] and kernel-level session isolation [7].
  4. Forensic Case Studies: We reconstruct real-world incidents, such as the Supabase data leak [8], to derive actionable lessons for enterprise deployment.
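To make Cross-Primitive Escalation concrete, the following is a minimal sketch of one possible guard, not a mechanism taken from the paper or from the MCP specification. It assumes a host that records every untrusted Resource read in the session and then demands human confirmation before any write-capable Tool runs. The `Session` class, the `WRITE_TOOLS` set, the tool names, and the `kb://` URI are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical set of tools that can modify state (the "Action" side).
WRITE_TOOLS = {"delete_rows", "send_email"}


@dataclass
class Session:
    # URIs of untrusted Resources that have entered the model's context.
    untrusted_sources: List[str] = field(default_factory=list)

    def read_resource(self, uri: str, trusted: bool) -> None:
        """Record a Resource read; untrusted content taints the session."""
        if not trusted:
            self.untrusted_sources.append(uri)

    def authorize(self, tool_name: str) -> str:
        """Link execution privilege to what the context already contains."""
        if tool_name not in WRITE_TOOLS:
            return "allow"
        if self.untrusted_sources:
            # Read-only context may be steering a write action: escalate to a human.
            return "require_confirmation"
        return "allow"


session = Session()
before = session.authorize("delete_rows")
session.read_resource("kb://docs/uploaded-by-customer", trusted=False)
after = session.authorize("delete_rows")
read_only = session.authorize("list_tables")
```

In this sketch the same write request is allowed in a clean session and escalated once untrusted context is present, while read-only tools are unaffected. A real deployment would need a far more careful notion of trust and taint than a single per-session list.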

The remainder of this paper is organized as follows: Section II provides a technical overview of the MCP architecture. Section III defines the threat landscape and adversarial actors. Sections IV and V detail the specific security and safety challenges, respectively. Section VI surveys mitigation strategies and architectural defenses. Section VII outlines open research directions, and Section VIII presents case studies of recent MCP-related incidents. Finally, Section IX concludes with a roadmap for secure ad
