Architecture

MAS

1995 · production · peer-reviewed

Key innovation

Decomposing a complex task across autonomous agents with specialized roles that communicate and coordinate. This lets a system exceed the reasoning limits of single-model inference and complete long-horizon tasks that do not fit within a single LLM call.

How it works

A multi-agent system (MAS) defines a set of agents, a communication environment, and a coordination protocol. Each agent is an autonomous unit with: (1) a cognitive core (typically an LLM with a system prompt defining its role), (2) short-term memory (context) and optionally long-term memory (vector store), (3) a tool-use interface (APIs, code execution, web search, retrieval), and (4) a goal or task instruction.

Task execution unfolds in stages: (a) decomposition: an orchestrator or planner agent breaks the query into subtasks; (b) routing: subtasks are assigned to agents based on roles and capabilities; (c) communication: agents exchange messages (natural language, structured function calls, shared memory); (d) iteration: agents execute actions, observe results, and correct the plan (often with a built-in critic agent); (e) aggregation: results are synthesized into a final answer.

Coordination topologies include sequential pipeline (chain), hierarchical (orchestrator-worker), parallel fan-out/fan-in, blackboard (shared workspace), debate (agents compete or discuss), publish-subscribe, and decentralized peer-to-peer.
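
The stages above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the API of any framework: `Agent`, `run_pipeline`, and `upper_llm` are hypothetical names, the "LLM" is modeled as a plain function from prompt to reply, and communication and iteration are reduced to a single call per subtask.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Assumption: an "LLM" is modeled as any function from prompt text to reply text.
LLM = Callable[[str], str]


@dataclass
class Agent:
    name: str
    role: str  # system prompt that defines the agent's role
    llm: LLM  # cognitive core
    memory: List[str] = field(default_factory=list)  # short-term context

    def act(self, task: str) -> str:
        """Run one task through the cognitive core and remember the exchange."""
        reply = self.llm(f"{self.role}\n\nTask: {task}")
        self.memory.append(f"{task} -> {reply}")
        return reply


def run_pipeline(
    query: str,
    planner: Callable[[str], List[Tuple[str, str]]],
    workers: Dict[str, Agent],
    aggregate: Callable[[List[str]], str],
) -> str:
    subtasks = planner(query)  # (a) decomposition into (role, subtask) pairs
    results = []
    for role, subtask in subtasks:
        agent = workers[role]  # (b) routing by role name
        results.append(agent.act(subtask))  # (c)/(d) one message, one action
    return aggregate(results)  # (e) aggregation


def upper_llm(prompt: str) -> str:
    """Hypothetical stub model: upper-cases the task line of the prompt."""
    task_line = prompt.splitlines()[-1]
    return task_line[len("Task: "):].upper()
```

In a real system the planner and the aggregator would usually be LLM calls themselves, and step (d) would loop until a critic accepts the result. Keeping them as plain callables here makes the control flow easy to test in isolation.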

Problem solved

A single LLM faces hard limits on long-horizon tasks, tasks requiring diverse domain expertise, multi-step planning, or parallel processing of multiple information sources. A monolithic prompt rapidly loses coherence, context degrades (context rot), and the lack of specialized roles prevents separation of concerns (e.g., planner vs executor vs critic). MAS addresses this by decomposing the problem into subtasks assigned to autonomous agents with dedicated roles, tools, and memory, coordinated by a communication topology. This allows system complexity to scale without enlarging a single model, enables parallel solution exploration, and supports peer-review / critique mechanisms between agents.

Key mechanisms

Task decomposition — an orchestrator or planner agent breaks the query into a set of delegable subtasks

Specialized roles — each agent receives a system prompt defining identity, scope of responsibility, and available tools

Inter-agent communication — message exchange in natural language, structured function calls, or via shared memory (blackboard)

Coordination topology — pattern choice: sequential, hierarchical (orchestrator-worker), parallel fan-out/fan-in, debate, publish-subscribe, peer-to-peer

Tool use — agents call external APIs, execute code, search the web, use retrieval and vector databases

Agent memory — short-term context window + long-term memory (vector store), with optional sharing across agents

Critique / peer-review mechanism — a built-in critic agent verifies results of others (reflexion, self-critique, multi-agent debate)

Dynamic routing — the orchestrator selects the next agent based on current state, task type, or prior step results

Plan iteration and correction — agents can observe action outcomes and modify the plan on-the-fly (closed-loop control)
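
As an illustration of the parallel fan-out/fan-in topology named above, the following sketch runs independent subtasks concurrently with `asyncio` and collects the results in input order. The worker is a hypothetical stub that sleeps instead of calling a model, and `max_concurrency` is an assumed knob for staying under a provider's rate limit.

```python
import asyncio
from typing import Awaitable, Callable, List


async def fake_worker(subtask: str) -> str:
    """Hypothetical stand-in for an agent call; sleeps instead of calling an LLM."""
    await asyncio.sleep(0.01)
    return f"done:{subtask}"


async def fan_out_fan_in(
    subtasks: List[str],
    worker: Callable[[str], Awaitable[str]],
    max_concurrency: int = 4,
) -> List[str]:
    # The semaphore caps how many worker calls are in flight at once.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(subtask: str) -> str:
        async with semaphore:
            return await worker(subtask)

    # Fan-out: schedule every subtask. Fan-in: gather returns results in input order.
    return await asyncio.gather(*(bounded(s) for s in subtasks))


def run_parallel(subtasks: List[str]) -> List[str]:
    """Synchronous entry point for callers that are not already inside an event loop."""
    return asyncio.run(fan_out_fan_in(subtasks, fake_worker))
```

Because `asyncio.gather` preserves input order, the aggregation step can pair each result with its subtask by position without extra bookkeeping.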

Strengths & limitations

Strengths

✓ Complexity scaling without enlarging the model: capability grows by adding agents instead of training a bigger LLM

✓ Separation of concerns: specialized roles (planner, executor, critic) can outperform a monolithic prompt

✓ Parallelism: independent subtasks can execute concurrently, reducing latency

✓ Modularity and swappability: agents can be updated, replaced, or added without redesigning the whole system

✓ Peer-review and self-correction: multiple agents can catch errors that a single model would miss (multi-agent debate, reflexion)

✓ Extended effective horizon: long-horizon tasks (hours of work) become tractable through sub-task decomposition

✓ Natural fit for multi-actor problems: social simulations, negotiation, and multi-robot fleets are directly modelable

✓ Rich framework ecosystem: AutoGen, CAMEL, MetaGPT, CrewAI, and LangGraph provide production-ready abstractions

Limitations

✗ High token cost: inter-agent communication multiplies LLM calls (often 5–50× a single query)

✗ Sequential latency: agent chains introduce cumulative delay, up to 10× a single query

✗ Error propagation: one agent's mistake passed downstream can snowball into a completely wrong final answer

✗ Lack of global coherence: agents may locally optimize in ways that conflict with the system goal (inter-agent misalignment)

✗ Debugging difficulty: non-deterministic trajectories and multi-threaded communication make error diagnosis very expensive

✗ Coordination overhead: for simple tasks a single LLM call is cheaper and faster; MAS pays off only once task complexity justifies the overhead

✗ Amplified hallucinations: agents may mutually reinforce false beliefs (echo chamber effect)

✗ Immature protocol standards: different frameworks still use largely incompatible message formats and role schemas

✗ API rate limits: concurrent LLM calls from one account may hit throttling and queueing
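
To make the token-cost limitation concrete, here is a toy cost model. It rests on assumptions that are not from the source: a sequential chain in which every agent reads the original task plus the full output of every earlier agent, with fixed sizes for both. The function names are hypothetical and the numbers it produces are illustrative, not measurements.

```python
def total_input_tokens(n_agents: int, task_tokens: int, output_tokens: int) -> int:
    """Input tokens read by a chain where agent k sees the task plus k-1 prior outputs."""
    # Prior outputs read across the chain: 0 + 1 + ... + (n-1) = n(n-1)/2 outputs.
    history = output_tokens * n_agents * (n_agents - 1) // 2
    return n_agents * task_tokens + history


def cost_multiplier(n_agents: int, task_tokens: int, output_tokens: int) -> float:
    """Ratio of chain input tokens to a single call that reads only the task."""
    return total_input_tokens(n_agents, task_tokens, output_tokens) / task_tokens
```

Under these assumptions, five agents with a 200-token task and 300-token outputs read 4,000 input tokens in total, 20 times what a single call reads. The history term grows quadratically with the number of agents, which is why trimming what is passed between agents matters more as chains get longer.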

Components

Agent: Core computational and decision-making unit of the system.

Autonomous computational entity with its own internal state, perception, reasoning, and action capabilities. In LLM-based MAS: a language model with a system prompt defining the agent's role, goal, and constraints.

Orchestrator Agent: Agent responsible for task decomposition, assigning subtasks to worker agents, and aggregating results.

Specialized Agent (Worker Agent): Agent with a narrow specialization (e.g., search, coding, analysis), executing specific subtasks assigned by the orchestrator.

Human-in-the-Loop Agent: A human participating as an agent in the system, verifying or correcting AI agent actions at critical decision points.

Communication Channel: Enables coordination and state transfer between agents.

Mechanism for information exchange between agents. Can take the form of direct message passing, shared memory, publish-subscribe systems, or event queues.

Message Passing: Agents exchange structured messages directly or through a message broker.

Shared Memory (Blackboard): Agents read from and write to a shared knowledge base or state object (e.g., vector database, graph state object).
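
The two channel styles can be contrasted with a small in-memory sketch. `Message`, `Broker`, and `Blackboard` are hypothetical names chosen for illustration; a production system would typically back them with a real queue or database rather than Python dictionaries.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Any, Deque, Dict, Optional


@dataclass(frozen=True)
class Message:
    sender: str
    recipient: str
    content: str


class Broker:
    """Message passing: each agent has its own first-in, first-out inbox."""

    def __init__(self) -> None:
        self._inboxes: Dict[str, Deque[Message]] = defaultdict(deque)

    def send(self, message: Message) -> None:
        self._inboxes[message.recipient].append(message)

    def receive(self, agent_name: str) -> Optional[Message]:
        inbox = self._inboxes[agent_name]
        return inbox.popleft() if inbox else None


class Blackboard:
    """Shared memory: every agent reads and writes the same keyed state."""

    def __init__(self) -> None:
        self._entries: Dict[str, Any] = {}

    def write(self, key: str, value: Any) -> None:
        self._entries[key] = value

    def read(self, key: str, default: Any = None) -> Any:
        return self._entries.get(key, default)
```

With the broker, the sender decides who sees a message. With the blackboard, any agent can pick up any entry, which simplifies broadcasting but makes concurrent writes a concern.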

Orchestrator / Coordinator: Manages global task progress and coordinates between agents.

Component responsible for workflow management: task decomposition, routing to agents, dependency management, error handling, and aggregating final results. Can be an LLM agent or a programmatic controller.

Memory Subsystem: Persists state across agent calls and system sessions.

State storage mechanisms: short-term memory (conversation context / LLM context window), long-term memory (vector database, external database), episodic memory (interaction history), and procedural memory (learned procedures).

Tool Interface: Extends agent capabilities beyond language processing to actions in the external world.

Layer integrating agents with external systems: APIs, search engines, code interpreters, databases, file services. Enables agents to act beyond their textual context.
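
One common way to build such a layer is a registry that maps tool names to functions and returns a uniform result envelope. The sketch below assumes that shape; `ToolRegistry`, the `ok`/`result`/`error` keys, and the `add` tool are all hypothetical and do not mirror any specific framework's tool-calling format.

```python
from typing import Any, Callable, Dict


class ToolRegistry:
    """Maps tool names to Python callables and wraps every call in a result envelope."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
            self._tools[name] = fn
            return fn

        return decorator

    def call(self, name: str, **kwargs: Any) -> Dict[str, Any]:
        if name not in self._tools:
            return {"ok": False, "error": f"unknown tool: {name}"}
        try:
            return {"ok": True, "result": self._tools[name](**kwargs)}
        except Exception as exc:  # report the failure to the agent instead of crashing
            return {"ok": False, "error": str(exc)}


tools = ToolRegistry()


@tools.register("add")
def add(a: float, b: float) -> float:
    """Hypothetical example tool."""
    return a + b
```

Returning errors as data rather than raising lets the calling agent observe the failure and revise its plan, which matches the closed-loop iteration described earlier in this article.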

Implementation

Reference implementations

AutoGen (Microsoft Research)

Python · Microsoft Research

LangGraph (LangChain)

Python · LangChain AI

Implementation pitfalls

Error Cascading Between Agents · High

Incorrect output from one agent is passed as input to the next, leading to error accumulation across the pipeline. Without a validator agent, this can result in completely incorrect final outputs.

Fix: Deploy critic/validator agents after key steps; use multiple independent execution paths with voting or result aggregation.
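
The voting part of this fix can be sketched as a strict-majority check over answers from independent paths. `majority_vote` and its `min_agreement` threshold are hypothetical, and the sketch assumes answers have already been normalized so that equivalent answers compare equal.

```python
from collections import Counter
from typing import List, Optional


def majority_vote(answers: List[str], min_agreement: float = 0.5) -> Optional[str]:
    """Return the most common answer if its share exceeds min_agreement, else None."""
    if not answers:
        return None
    winner, count = Counter(answers).most_common(1)[0]
    return winner if count / len(answers) > min_agreement else None
```

Returning `None` when no answer clears the threshold gives the orchestrator an explicit signal to escalate, for example to a critic agent or a human reviewer, instead of passing a weakly supported answer downstream.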

Excessive Token Costs (Token Cost Explosion) · High

Each inter-agent communication consumes tokens (passing conversation history, context, tools). With many agents and long communication chains, costs can grow disproportionately fast.

Fix: Use context compression; limit the history passed between agents to the necessary minimum; use lighter models for simpler subtasks.
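
Limiting the history can be as simple as keeping only the most recent messages that fit a token budget. The sketch below is one possible approach under a loud assumption: `estimate_tokens` counts whitespace-separated words as a crude proxy, whereas a real system would use the tokenizer of the model in question.

```python
from typing import List


def estimate_tokens(text: str) -> int:
    """Crude assumption: one token per whitespace-separated word."""
    return len(text.split())


def trim_history(messages: List[str], budget: int) -> List[str]:
    """Keep the newest messages whose combined estimated size fits the budget."""
    kept: List[str] = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

This drops old context outright. Summarizing the dropped messages into one short message is a common refinement when earlier decisions still matter.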

Infinite Loops and Non-Convergence · Critical

Agents can enter loops when there is no termination mechanism, for example when one agent requests a revision from another, which in turn requests clarification back. Without explicit stopping criteria, such exchanges may never converge.

Fix: Define explicit termination criteria (maximum number of iterations, a 'TERMINATE' condition, timeout); use a supervising monitor agent.
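
The three termination criteria named in the fix can be combined in one driver loop. This is a minimal sketch: `run_until_done` and its parameters are hypothetical, and `step` stands in for one round of agent interaction that receives the transcript so far and returns the next message.

```python
import time
from typing import Callable, List, Tuple


def run_until_done(
    step: Callable[[List[str]], str],
    max_iterations: int = 10,
    timeout_s: float = 30.0,
    stop_token: str = "TERMINATE",
) -> Tuple[List[str], str]:
    """Run step repeatedly; return the transcript and the reason the loop stopped."""
    deadline = time.monotonic() + timeout_s
    transcript: List[str] = []
    for _ in range(max_iterations):
        if time.monotonic() > deadline:
            return transcript, "timeout"
        message = step(transcript)
        transcript.append(message)
        if stop_token in message:
            return transcript, "terminated"
    return transcript, "max_iterations"
```

Returning the stop reason alongside the transcript lets a supervising component tell a clean finish from a forced cutoff. Note that the timeout is only checked between steps, so a single step that hangs still needs its own timeout.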

State Inconsistency Between Agents · High

During parallel execution, agents may operate on inconsistent versions of shared state, leading to conflicts and race conditions, especially when shared memory is non-transactional.

Fix: Use transactional updates for shared state; design agents to be idempotent; minimize write-write contention by assigning distinct state partitions to separate agents.
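
One way to get transactional updates on an in-process shared state is optimistic concurrency: each write names the version it was based on and is rejected if the state has changed since. The sketch below illustrates that technique with hypothetical names (`VersionedState`, `update`); it is not the mechanism of any particular framework.

```python
import threading
from typing import Any, Callable, Tuple


class VersionedState:
    """Shared state whose writes succeed only against the version that was read."""

    def __init__(self, value: Any) -> None:
        self._lock = threading.Lock()
        self._value = value
        self._version = 0

    def read(self) -> Tuple[Any, int]:
        with self._lock:
            return self._value, self._version

    def compare_and_set(self, expected_version: int, new_value: Any) -> bool:
        with self._lock:
            if self._version != expected_version:
                return False  # another agent wrote first; caller must re-read
            self._value = new_value
            self._version += 1
            return True


def update(state: VersionedState, fn: Callable[[Any], Any], max_retries: int = 5) -> bool:
    """Apply fn to the current value, retrying when a concurrent write wins the race."""
    for _ in range(max_retries):
        value, version = state.read()
        if state.compare_and_set(version, fn(value)):
            return True
    return False
```

Because `fn` may run more than once when a retry happens, it should be free of side effects, which is the same idempotence requirement the fix places on agents.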

Role Confusion and Overlapping Responsibilities Between Agents · Medium

Agents with poorly defined roles may duplicate work, conflict in decision-making, or skip tasks because each assumes another agent will handle it.

Fix: Precisely define each agent's scope of responsibility in the system prompt; apply formal handoff protocols and task-receipt acknowledgments.
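
A handoff protocol with acknowledgments can be modeled as a ledger in which a task has no owner until the intended recipient confirms receipt. The sketch below is one hypothetical design (`Handoff` and `HandoffLedger` are illustrative names) meant to show how skipped tasks become visible.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Handoff:
    task_id: str
    from_agent: str
    to_agent: str


class HandoffLedger:
    """Tracks offered tasks and records an owner only after an acknowledgment."""

    def __init__(self) -> None:
        self._pending: Dict[str, Handoff] = {}
        self._owners: Dict[str, str] = {}

    def offer(self, handoff: Handoff) -> None:
        if handoff.task_id in self._pending or handoff.task_id in self._owners:
            raise ValueError(f"task already tracked: {handoff.task_id}")
        self._pending[handoff.task_id] = handoff

    def acknowledge(self, task_id: str, agent: str) -> bool:
        handoff = self._pending.get(task_id)
        if handoff is None or handoff.to_agent != agent:
            return False  # only the intended recipient may accept the task
        del self._pending[task_id]
        self._owners[task_id] = agent
        return True

    def owner(self, task_id: str) -> Optional[str]:
        return self._owners.get(task_id)

    def unacknowledged(self) -> List[str]:
        """Task ids that were offered but never accepted."""
        return sorted(self._pending)
```

An orchestrator can poll `unacknowledged()` to find tasks that every agent assumed someone else would handle, and the duplicate check in `offer` prevents the same task from being assigned twice.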

Evolution

Original paper · 1995

Intelligent Agents: Theory and Practice

1995

Wooldridge and Jennings formalize the notion of intelligent agents and MAS

Inflection point

Wooldridge and Jennings publish 'Intelligent Agents: Theory and Practice' (Knowledge Engineering Review), defining agent properties (autonomy, reactivity, pro-activeness, social ability) and foundations of MAS theory.

Intelligent Agents: Theory and Practice (paper)

1999

Gerhard Weiss edits 'Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence' (MIT Press)

Textbook standardizing MAS terminology and architecture, becoming the primary academic reference for the following decade.

2023

LLM-MAS framework explosion: CAMEL, AutoGen, MetaGPT

Inflection point

2023 sees the first wave of LLM-based MAS frameworks: CAMEL (Li et al., March 2023), AutoGen (Wu et al., Microsoft, August 2023), MetaGPT (Hong et al., August 2023), transforming MAS from rule-based to language-model-based systems.

AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation (paper)

2024–2025

Standardization of communication protocols (MCP, A2A) and orchestration frameworks (LangGraph, CrewAI)

Anthropic announces the Model Context Protocol (MCP) in November 2024, and Google follows with the Agent2Agent (A2A) protocol in April 2025. More mature frameworks emerge: LangGraph (LangChain) and CrewAI, supporting complex agent topologies with state persistence and controlled flow.

Sources

Multi-Agent Systems: An Introduction
