AI Agent Architecture — ReAct, Memory, Planning and Multi-Agent Systems · Agent Memory — From Context Window to Vector Store

Short-term memory — context window as agent RAM and its limits

Agent Memory — From Context Window to Vector Store

Introduction

The context window is the only "workbench" of a stateless LLM. It holds everything the agent operates on at a given moment: system instructions, conversation history, tool results, retrieved memories, and plans. This lesson examines the hard token limit, inference cost, context management techniques (truncation, summarisation, selective compression), and the architectural limits that follow from this finiteness.