## Definition The **orchestrator-subagent pattern** is the production-dominant multi-agent architecture of 2026: a single *orchestrator* agent owns the full conversation context and delegates discrete tasks to *ephemeral* subagents that run in isolated context windows and return only a compressed summary. By 2026 it had been adopted, with minor variations, by Anthropic, Cognition, OpenAI, Microsoft Agent Framework, and LangChain. ## Roles - **Orchestrator**: holds the user's full request, the plan, and the running conversation. Decides when to spawn a subagent, what task to give it, what summary it expects back, and how to integrate the result. - **Subagent**: receives a focused task and a fresh context window. Does whatever tool-calling and reasoning the task requires. Returns one summary string. Discarded after. The orchestrator never inherits the subagent's full trace — only the summary. Subagents never talk to each other. ## Why This Pattern Won It beats the "GroupChat" alternative (peer agents communicating freely) on three engineering axes: - **Context-window economics**. The orchestrator avoids absorbing every subagent's intermediate tool calls. A 50-step research subagent compresses to a paragraph for the orchestrator. - **Coordination complexity**. Peer designs grow O(n²) communication edges; orchestrator-subagent stays O(n). - **Failure isolation**. A subagent that derails (hallucinates, loops, hits a tool error) is contained — its bad context doesn't poison the orchestrator's. ## Spawning Heuristics Drawn from Anthropic's deployed Research system: - 1 agent (no subagents) — simple fact-finding. - 2–4 subagents — direct comparisons or parallel exploration of a small space. - 10+ subagents — complex multi-faceted research. The cost of spawning is not free: a multi-agent run consumes roughly 15× the tokens of a single-chat interaction. Spawn when the task value clears that bar. ## Trade-offs - **Latency** improves when subagent work parallelises; worsens when the orchestrator serialises on summaries. - **Cost** is dominated by total subagent tokens. Routing cheaper models (e.g. Haiku) to subagents while keeping the orchestrator on Opus is the standard cost-control lever. - **Determinism** is harder than for single-agent. Summaries lose information by design — the orchestrator can't introspect *why* a subagent decided what it did. ## Related - [[Subagent Context Isolation]] - [[Compressed Summary Return]] - [[Ephemeral Subagent]] - [[Multi-Agent System]] - [[AI Agent]] - [[Agentic Loop]] - [[Model Selection Strategy]] ## Sources - [[Multi-Agent AI Systems in 2026 (FlowHunt)]] - [[The Architecture of Scale - Anthropic Sub-Agents (Oswal)]]