AI Agent Security — Attacks, Jailbreaking, and Defense · Agent Security with Tools and MCP

Human-in-the-loop: HITL configuration for destructive operations (delete, send, execute)


Introduction

Human-in-the-loop (HITL) is an architectural pattern that requires explicit human approval before an agent executes a high-impact or irreversible operation, such as deleting data, sending a message, or executing code. This lesson covers how to classify the operations that require HITL, how to implement technical checkpoints, the trade-offs between security and usability, how to avoid the "approval fatigue" that leads reviewers to click "OK" reflexively, and implementation patterns in popular agent frameworks.
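To make the pattern concrete before going into detail, here is a minimal sketch of an approval checkpoint in Python. It is illustrative only and not tied to any agent framework: the names `ToolCall`, `DESTRUCTIVE_TOOLS`, `run_tool`, `ApprovalDenied`, and `console_approver`, as well as the tool names themselves, are hypothetical and chosen for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

# Assumption: a hand-maintained set of tool names classified as destructive.
# A real system would likely derive this from tool metadata or policy.
DESTRUCTIVE_TOOLS = {"delete_file", "send_email", "execute_shell"}


@dataclass
class ToolCall:
    """A tool invocation proposed by the agent (hypothetical structure)."""
    name: str
    args: Dict[str, Any]


class ApprovalDenied(Exception):
    """Raised when the human reviewer rejects a proposed tool call."""


def requires_approval(call: ToolCall) -> bool:
    """Classify the call: only destructive tools need a human decision."""
    return call.name in DESTRUCTIVE_TOOLS


def run_tool(
    call: ToolCall,
    tools: Dict[str, Callable[..., Any]],
    approve: Callable[[ToolCall], bool],
) -> Any:
    """Execute the call, pausing at a checkpoint for destructive tools.

    The tool function is never invoked unless the approver returns True.
    """
    if requires_approval(call) and not approve(call):
        raise ApprovalDenied(f"{call.name} was rejected by the reviewer")
    return tools[call.name](**call.args)


def console_approver(call: ToolCall) -> bool:
    """Ask on the terminal; anything other than a typed 'yes' is a denial."""
    answer = input(f"Agent wants to run {call.name} with {call.args}. "
                   "Type 'yes' to approve: ")
    return answer.strip().lower() == "yes"
```

In this sketch, a call such as `run_tool(ToolCall("delete_file", {"path": "report.txt"}), tools, console_approver)` would block until the reviewer answers, while a call to a tool outside `DESTRUCTIVE_TOOLS` runs without interruption. Requiring a typed "yes" and treating every other answer as a denial is one simple way to make reflexive approval slightly harder; it is a design choice for this example, not a complete remedy for approval fatigue.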