Robots Atlas>ROBOTS ATLAS

AI Agent Security — Attacks, Jailbreaking, and Defense · Agent Security with Tools and MCP

MCP security: tool poisoning, confused deputy, and rug-pull in the Model Context Protocol

Agent Security with Tools and MCP

Introduction

The Model Context Protocol (MCP) is an open standard (Anthropic, 2024) that standardises how AI agents connect to external tools and resources. The standardisation brings ecosystem benefits but introduces new protocol-specific attack vectors: tool poisoning (manipulating tool descriptions), confused deputy (a tool acting on behalf of the wrong principal), and rug-pull (tool substitution after installation). This lesson analyses each attack mechanistically and covers defences.