## Definition
An **ephemeral subagent** is a single-task agent in the [[Orchestrator-Subagent Pattern]] whose lifecycle is bounded by the task it was spawned to do. It comes into existence with a fresh context, executes its task, returns one summary, and is discarded. It has no persistent memory, no state across spawns, and no identity that outlives the call.
## Lifecycle
1. **Spawn**: the orchestrator constructs a system prompt and a task description, then allocates a fresh context window.
2. **Execute**: subagent runs its own [[Agentic Loop]] — plans, calls tools, reasons. May run for many internal steps.
3. **Return**: emits one [[Compressed Summary Return]].
4. **Discard**: context window is released. State, history, partial work — all gone.
The lifecycle is intentionally short and one-shot. A subagent never resumes; the next time the orchestrator needs work on this topic, it spawns a new subagent and passes in whatever context the orchestrator itself has accumulated since.
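
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `SubagentContext`, `run_ephemeral`, and `toy_loop` are hypothetical names, and the "agentic loop" is a stand-in function rather than a real model call.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SubagentContext:
    """Hypothetical per-spawn context: prompt, task, and a private history."""
    system_prompt: str
    task: str
    history: list[str] = field(default_factory=list)


# Stand-in type for a subagent's agentic loop (assumption: it returns one summary string).
AgentLoop = Callable[[SubagentContext], str]


def run_ephemeral(system_prompt: str, task: str, loop: AgentLoop) -> str:
    # Spawn: every call builds a brand-new context; nothing is reused.
    ctx = SubagentContext(system_prompt=system_prompt, task=task)
    # Execute: the loop may take many internal steps inside its own context.
    summary = loop(ctx)
    # Return + Discard: only the summary escapes; ctx is dropped when we return.
    return summary


def toy_loop(ctx: SubagentContext) -> str:
    """Toy loop that records three internal steps and summarizes them."""
    for step in ("plan", "call tool", "reason"):
        ctx.history.append(f"{step}: {ctx.task}")
    return f"done '{ctx.task}' in {len(ctx.history)} steps"


print(run_ephemeral("You are a researcher.", "find docs", toy_loop))
```

Calling `run_ephemeral` twice with the same task yields two independent runs: the second spawn sees none of the first one's history, because the orchestrator only ever holds the returned summary.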
## Why Ephemerality
Persistent agents create three operational headaches the ephemeral pattern sidesteps:
- **State management**. A long-lived agent needs durable memory, conflict resolution, and identity. An ephemeral subagent needs none of those.
- **Resource pinning**. Persistent agents hold capacity even while idle. Ephemeral ones release it the moment the task ends, which matters when subagents run on expensive models in parallel.
- **Failure recovery**. A crashed ephemeral subagent is restarted by spawning a new one — same prompt, fresh context. A crashed persistent agent needs state recovery and may have left partial side effects.
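
The failure-recovery point can be illustrated with a small retry wrapper. This is a sketch under assumptions: `spawn_with_retry` and its `spawn` callable are hypothetical, and the approach is only safe when the task can be rerun from scratch without duplicating side effects.

```python
from typing import Callable


def spawn_with_retry(
    spawn: Callable[[str], str],  # hypothetical: runs one ephemeral subagent, returns its summary
    prompt: str,
    max_attempts: int = 3,
) -> str:
    """Recover from a crash by spawning a new subagent with the same prompt."""
    last_error: Exception | None = None
    for _ in range(max_attempts):
        try:
            # Each attempt is a fresh spawn; no state from the failed run is restored.
            return spawn(prompt)
        except Exception as exc:
            last_error = exc
    raise RuntimeError(f"subagent failed after {max_attempts} attempts") from last_error
```

There is no state-recovery branch here because there is no state to recover: the only inputs to the retry are the original prompt and an attempt budget.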
## When Persistence Is Right
The pattern does have exceptions. Long-running monitoring agents (Cognition's Devin Auto-Triage watching incident streams), per-user assistants, or agents with expensive warm-up costs are better as persistent processes. The orchestrator-subagent pattern still describes their *internal* structure — they spawn ephemeral subagents to do discrete work — but the outer agent itself is durable.
The rule of thumb is: ephemeral when the task has a clear start and end, persistent when the agent's job is to be ready when called.
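
The split between a durable outer agent and ephemeral inner workers might look like the following. `PersistentMonitor` and its `spawn` callable are hypothetical names for illustration only; they do not describe any specific product's design.

```python
from typing import Callable


class PersistentMonitor:
    """Durable outer agent: stays alive across events and keeps its own state."""

    def __init__(self, spawn: Callable[[str], str]) -> None:
        # Hypothetical callable that runs one ephemeral subagent and returns its summary.
        self._spawn = spawn
        # Durable state that outlives every subagent this monitor spawns.
        self.summaries: list[str] = []

    def handle(self, event: str) -> str:
        # Discrete work with a clear start and end goes to a fresh subagent.
        summary = self._spawn(f"Triage this event: {event}")
        # Only the summary is retained; the subagent's context is already gone.
        self.summaries.append(summary)
        return summary
```

The monitor is persistent because its job is to be ready when an event arrives; each `handle` call is ephemeral because the triage task itself has a clear start and end.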
## Related
- [[Orchestrator-Subagent Pattern]]
- [[Subagent Context Isolation]]
- [[Compressed Summary Return]]
- [[Agentic Loop]]
- [[AI Agent]]
## Sources
- [[Multi-Agent AI Systems in 2026 (FlowHunt)]]
- [[The Architecture of Scale - Anthropic Sub-Agents (Oswal)]]