## Definition
**Compressed summary return** is the communication convention of the [[Orchestrator-Subagent Pattern]]: a subagent returns one summary string to its orchestrator, not the full transcript of its internal work. The orchestrator integrates that string into its own context as if it were a tool result. The intermediate reasoning, tool calls, and false starts stay behind in the subagent's context and are discarded with it.
## Format in Practice
A typical summary return contains:
- **The answer** to the task as posed. One paragraph or a short structured block.
- **The provenance**: which sources, files, or tool calls produced the answer.
- **Caveats** the orchestrator needs to know: partial coverage, conflicting evidence, things the subagent couldn't resolve.
It does *not* contain raw tool outputs, dead-end branches, or the subagent's own chain of thought. Those can run to tens of thousands of tokens, and returning them would defeat the purpose of compressing.
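The three parts listed above can be sketched as a small data structure. This is a minimal illustration, not a standard schema: the `SummaryReturn` class, its field names, and the `ANSWER:` / `SOURCES:` / `CAVEATS:` labels are all hypothetical, and the example values are made up.

```python
from dataclasses import dataclass, field


@dataclass
class SummaryReturn:
    """Hypothetical container for what a subagent hands back."""

    answer: str
    provenance: list[str] = field(default_factory=list)
    caveats: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Flatten the parts into the single string the orchestrator receives."""
        lines = [f"ANSWER: {self.answer}"]
        if self.provenance:
            lines.append("SOURCES: " + "; ".join(self.provenance))
        if self.caveats:
            lines.append("CAVEATS: " + "; ".join(self.caveats))
        return "\n".join(lines)


# Illustrative values only.
summary = SummaryReturn(
    answer="Retry logic lives in client.py and caps at 3 attempts.",
    provenance=["client.py", "search for 'retry'"],
    caveats=["Did not check the async client."],
)
print(summary.render())
```

Keeping the structure on the subagent side and rendering it to one string at the boundary preserves the convention that the orchestrator sees something shaped like a tool result.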
## Why Compression Matters
The orchestrator's context is the most expensive token budget in the system. It grows monotonically across the whole user session, and every subagent return adds to it. If subagents returned full transcripts at tens of thousands of tokens each, an orchestrator running ten spawns would carry hundreds of thousands of tokens of tool noise it never uses. The compression is what makes the pattern economically viable.
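A back-of-the-envelope version of that arithmetic, as a sketch. The per-return token counts are assumed figures chosen for illustration, not measurements, and `orchestrator_tokens_added` is a hypothetical helper.

```python
def orchestrator_tokens_added(spawns: int, tokens_per_return: int) -> int:
    """Tokens appended to the orchestrator's context by subagent returns."""
    return spawns * tokens_per_return


SPAWNS = 10
TRANSCRIPT_TOKENS = 30_000  # assumed size of one full subagent transcript
SUMMARY_TOKENS = 300        # assumed size of one compressed summary

full = orchestrator_tokens_added(SPAWNS, TRANSCRIPT_TOKENS)
compressed = orchestrator_tokens_added(SPAWNS, SUMMARY_TOKENS)

print(f"full transcripts: {full:,} tokens")        # full transcripts: 300,000 tokens
print(f"summaries:        {compressed:,} tokens")  # summaries:        3,000 tokens
print(f"reduction factor: {full // compressed}x")  # reduction factor: 100x
```

Under these assumptions the summaries cost the orchestrator about one percent of what the transcripts would.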
There's a quality argument too: the orchestrator reasons more clearly when its context is signal-dense. Returning summaries forces each subagent to commit to a concrete conclusion, which is easier to integrate than raw evidence.
## The Information Loss
Compression is lossy by design. The orchestrator can't introspect the subagent's choices, so it is hard to tell whether a summary is wrong without re-running the work. The standard mitigation is to have the subagent cite its sources or attach minimal structured evidence, so the orchestrator can spot-check without inheriting the full trace.
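One way to picture the spot-check, as a sketch under stated assumptions: the subagent attaches short verbatim quotes alongside its cited sources, and the orchestrator confirms each quote actually appears in the source. The `Evidence` class, the `spot_check` function, and the in-memory `SOURCES` dict standing in for a real file or document store are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Evidence:
    source: str  # identifier of the cited source
    quote: str   # short verbatim excerpt supporting the answer


def spot_check(
    evidence: list[Evidence],
    read_source: Callable[[str], Optional[str]],
) -> list[Evidence]:
    """Return the evidence items that could NOT be verified."""
    failed = []
    for item in evidence:
        text = read_source(item.source)
        # A missing source or an absent quote both count as unverified.
        if text is None or item.quote not in text:
            failed.append(item)
    return failed


# Stand-in for real sources; contents are illustrative.
SOURCES = {"client.py": "MAX_RETRIES = 3\nTIMEOUT_SECONDS = 30\n"}

evidence = [
    Evidence("client.py", "MAX_RETRIES = 3"),
    Evidence("config.yaml", "retries: 5"),  # source the orchestrator cannot find
]
unverified = spot_check(evidence, SOURCES.get)
print([item.source for item in unverified])  # ['config.yaml']
```

A substring match only shows that the quote exists, not that it supports the answer, so this is a cheap first filter and not a substitute for re-running the work.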
For audit, debugging, or compliance use cases, the subagent's full trace is usually logged separately: it is not returned to the orchestrator, but it stays available out-of-band.
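A minimal sketch of that split, assuming the trace is a list of step records and that a JSON file keyed by task ID is an acceptable log. The `return_summary` function, the file naming scheme, and the sample trace are hypothetical, and a real system might write to a logging service instead.

```python
import json
import tempfile
from pathlib import Path


def return_summary(task_id: str, trace: list[dict], summary: str, log_dir: Path) -> str:
    """Persist the full trace out-of-band, then return only the summary."""
    log_path = log_dir / f"{task_id}.trace.json"
    log_path.write_text(json.dumps(trace, indent=2), encoding="utf-8")
    return summary  # the trace itself never reaches the orchestrator


with tempfile.TemporaryDirectory() as tmp:
    log_dir = Path(tmp)
    trace = [
        {"step": 1, "tool": "search", "output": "12 matches for 'retry'"},
        {"step": 2, "tool": "read_file", "output": "MAX_RETRIES = 3"},
    ]
    returned = return_summary("task-001", trace, "Retries cap at 3.", log_dir)

    # Later, an auditor can load the trace by task ID without involving the orchestrator.
    logged = json.loads((log_dir / "task-001.trace.json").read_text(encoding="utf-8"))

print(returned)     # Retries cap at 3.
print(len(logged))  # 2
```

Keying the log by a task ID the orchestrator already knows means no extra pointer has to be added to the summary string.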
## Related
- [[Orchestrator-Subagent Pattern]]
- [[Subagent Context Isolation]]
- [[Ephemeral Subagent]]
- [[Context Compaction]]
## Sources
- [[Multi-Agent AI Systems in 2026 (FlowHunt)]]
- [[The Architecture of Scale - Anthropic Sub-Agents (Oswal)]]