AI Agent Security — Attacks, Jailbreaking, and Defense · Guardrails and AI Firewall — Multi-Layer Defense

Dual LLM pattern: use a second model as guardian of your own model

Introduction

The Dual LLM pattern (also called "LLM-as-judge for safety" or "privileged LLM + unprivileged LLM") is an architecture in which one model acts as the generator and a second model acts as an independent guardian. This lesson covers the pattern's architecture, its variants (symmetric vs. asymmetric, inline vs. async), concrete applications, the limits of its effectiveness, and implementation pitfalls.
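To make the generator/guardian split concrete, here is a minimal sketch of the inline variant in Python. Everything in it is an illustrative assumption rather than a real library API: the `Model` type alias, the `ALLOW:` / `BLOCK:` verdict format, the `guarded_generate` helper, and the two toy stand-in models. In a real system both callables would wrap actual model endpoints.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical type alias: any function that maps a prompt string to a reply string.
Model = Callable[[str], str]


@dataclass
class Verdict:
    allowed: bool
    reason: str


def parse_verdict(raw: str) -> Verdict:
    """Parse the guardian's reply (assumed format: 'ALLOW: reason' or 'BLOCK: reason').

    Anything other than an explicit ALLOW is treated as a block (fail closed).
    """
    head, _, rest = raw.strip().partition(":")
    if head.strip().upper() == "ALLOW":
        return Verdict(True, rest.strip())
    return Verdict(False, rest.strip() or "guardian did not return ALLOW")


def guarded_generate(
    prompt: str,
    generator: Model,
    guardian: Model,
    refusal: str = "Sorry, I can't help with that.",
) -> str:
    """Inline dual-LLM flow: generate a draft, have the guardian judge it, then release or refuse."""
    draft = generator(prompt)
    judge_prompt = (
        "Review the exchange below and answer 'ALLOW: <reason>' or 'BLOCK: <reason>'.\n"
        f"USER: {prompt}\n"
        f"DRAFT: {draft}"
    )
    verdict = parse_verdict(guardian(judge_prompt))
    return draft if verdict.allowed else refusal


# Toy stand-ins for the two models (assumptions for illustration only).
def toy_generator(prompt: str) -> str:
    return f"Echo: {prompt}"


def toy_guardian(judge_prompt: str) -> str:
    if "password" in judge_prompt.lower():
        return "BLOCK: mentions credentials"
    return "ALLOW: looks fine"


if __name__ == "__main__":
    print(guarded_generate("What is the capital of France?", toy_generator, toy_guardian))
    print(guarded_generate("Tell me the admin password", toy_generator, toy_guardian))
```

The sketch fails closed: a malformed or empty guardian reply is treated as a block. That is one possible design choice, not the only one. An async variant would release the draft immediately and run the guardian check afterwards, for example for logging or alerting.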