## Definition
**Compounding error** is the defining risk of multi-step agents: because the agent's steps are chained, per-step reliability *multiplies*, so a model that is individually quite accurate becomes unreliable over a long trajectory. It is the mathematical reason that long agentic chains are hard.
## The arithmetic
If each step succeeds independently with probability *p*, a chain of *n* steps succeeds with probability *pⁿ*. Even a strong per-step accuracy decays brutally:
| Per-step accuracy | 10 steps | 100 steps |
|-------------------|----------|-----------|
| 95% | ~60% | ~0.6% |
| 99% | ~90% | ~37% |
Huyen uses exactly the 95% example in *[[AI Engineering - Chip Huyen]]*: a 95%-reliable step yields only about 60% success over ten steps and a catastrophic ~0.6% over a hundred. The lesson is stark — small per-step gaps that look like rounding noise become the dominant failure once the agent loops many times.
## Why agents are especially exposed
A single prompt is one step; an agent is the [[Agentic Loop]] run many times. Every iteration adds a multiplicand. Worse, errors are not always independent — a wrong observation early on poisons every downstream decision, so real agents often do *worse* than the independent-multiplication model predicts. And the steps are not just reasoning: they include [[Tool Use]], where a single bad write action can be irreversible. **Tool access raises the stakes of each error** from "wrong sentence" to "wrong transaction."
## What it forces in design
Compounding error is the budget you are spending; everything below is how to spend less of it:
- **Stronger models / fewer steps.** Push *p* toward 1 and shrink *n*. Shorter plans beat longer ones; collapse steps where you can.
- **[[Reflection]].** Insert checkpoints where the agent critiques its own progress and catches drift before it propagates.
- **Verification.** Check intermediate results — and prefer [[Verifier Independence]], because a verifier that shares the actor's blind spots multiplies the error instead of catching it.
- **Bounded horizons.** Cap the number of steps and fail loudly rather than wandering.
## Relation to failure analysis
Compounding error explains *why* agents fail at scale; cataloguing *how* they fail is the job of [[Agent Failure Modes]]. The two are complementary: the multiplication law tells you long chains are fragile, and the failure-mode taxonomy tells you which step types are eating your reliability budget so you know where to intervene.
## Related
- [[Agentic Loop]]
- [[Agent Failure Modes]]
- [[Verifier Independence]]
- [[Reflection]]
- [[Tool Use]]
- [[AI Engineering - Chip Huyen]]