## Definition
The **orchestrator-subagent pattern** is the production-dominant multi-agent architecture of 2026: a single *orchestrator* agent owns the full conversation context and delegates discrete tasks to *ephemeral* subagents that run in isolated context windows and return only a compressed summary. By 2026 it had been adopted, with minor variations, by Anthropic, Cognition, OpenAI, Microsoft Agent Framework, and LangChain.
## Roles
- **Orchestrator**: holds the user's full request, the plan, and the running conversation. Decides when to spawn a subagent, what task to give it, what summary it expects back, and how to integrate the result.
- **Subagent**: receives a focused task and a fresh context window. Does whatever tool-calling and reasoning the task requires. Returns one summary string. Discarded after.
The orchestrator never inherits the subagent's full trace — only the summary. Subagents never talk to each other.
## Why This Pattern Won
It beats the "GroupChat" alternative (peer agents communicating freely) on three engineering axes:
- **Context-window economics**. The orchestrator avoids absorbing every subagent's intermediate tool calls. A 50-step research subagent compresses to a paragraph for the orchestrator.
- **Coordination complexity**. Peer designs grow O(n²) communication edges; orchestrator-subagent stays O(n).
- **Failure isolation**. A subagent that derails (hallucinates, loops, hits a tool error) is contained — its bad context doesn't poison the orchestrator's.
## Spawning Heuristics
Drawn from Anthropic's deployed Research system:
- 1 agent (no subagents) — simple fact-finding.
- 2–4 subagents — direct comparisons or parallel exploration of a small space.
- 10+ subagents — complex multi-faceted research.
The cost of spawning is not free: a multi-agent run consumes roughly 15× the tokens of a single-chat interaction. Spawn when the task value clears that bar.
## Trade-offs
- **Latency** improves when subagent work parallelises; worsens when the orchestrator serialises on summaries.
- **Cost** is dominated by total subagent tokens. Routing cheaper models (e.g. Haiku) to subagents while keeping the orchestrator on Opus is the standard cost-control lever.
- **Determinism** is harder than for single-agent. Summaries lose information by design — the orchestrator can't introspect *why* a subagent decided what it did.
## Related
- [[Subagent Context Isolation]]
- [[Compressed Summary Return]]
- [[Ephemeral Subagent]]
- [[Multi-Agent System]]
- [[AI Agent]]
- [[Agentic Loop]]
- [[Model Selection Strategy]]
## Sources
- [[Multi-Agent AI Systems in 2026 (FlowHunt)]]
- [[The Architecture of Scale - Anthropic Sub-Agents (Oswal)]]