AI Agent Security — Attacks, Jailbreaking, and Defense · Agent Security with Tools and MCP
MCP security: tool poisoning, confused deputy, and rug-pull in the Model Context Protocol
Agent Security with Tools and MCP
Introduction
The Model Context Protocol (MCP) is an open standard (Anthropic, 2024) that standardises how AI agents connect to external tools and resources. The standardisation brings ecosystem benefits but introduces new protocol-specific attack vectors: tool poisoning (manipulating tool descriptions), confused deputy (a tool acting on behalf of the wrong principal), and rug-pull (tool substitution after installation). This lesson analyses each attack mechanistically and covers defences.