AI Agent Architecture — ReAct, Memory, Planning and Multi-Agent Systems · Agent Memory — From Context Window to Vector Store
Short-term memory — context window as agent RAM and its limits
Introduction
The context window is the only "workbench" of a stateless LLM: it holds everything the agent operates on at a given moment, including system instructions, conversation history, tool results, retrieved memories, and plans. This lesson examines the hard token limit, inference cost, context management techniques (truncation, summarisation, selective compression), and the architectural constraints that follow from the window's finite size.
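As a minimal sketch of the simplest of these techniques, truncation, the snippet below always keeps the system instructions and drops the oldest history messages until the context fits a token budget. The names `count_tokens` and `truncate_history`, the sample messages, and the whitespace-based token count are all illustrative assumptions; a real agent would count tokens with the model's own tokenizer.

```python
def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer (assumption): one token per word."""
    return len(text.split())


def truncate_history(system: str, history: list[str], budget: int) -> list[str]:
    """Build a context that fits within `budget` tokens.

    The system instructions are always kept. History messages are added
    from newest to oldest until the next one would exceed the budget, so
    the oldest messages are the first to be dropped.
    """
    used = count_tokens(system)
    if used > budget:
        raise ValueError("system instructions alone exceed the token budget")

    kept: list[str] = []
    for message in reversed(history):  # walk from newest to oldest
        cost = count_tokens(message)
        if used + cost > budget:
            break  # this message and everything older is dropped
        kept.append(message)
        used += cost

    kept.reverse()  # restore chronological order
    return [system] + kept


# Hypothetical example: an 11-token budget forces the oldest message out.
context = truncate_history(
    system="You are a helpful agent.",
    history=["first user message here", "tool result: 42", "latest user question"],
    budget=11,
)
print(context)
```

Dropping whole messages from the oldest end is the cheapest policy, but it discards information irreversibly, which is why summarisation and selective compression exist as alternatives.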