## Definition
The **context window** is the maximum number of tokens an LLM can attend to in a single turn, counting both the input and the output generated so far. It is the model's *active attention surface*, not its memory. In 2026 the frontier baseline is roughly 200k tokens, with 1M-token tiers available for select workloads.
## What Degrades as the Window Fills
1. **Recall accuracy.** Information placed in the middle of a long prompt is less likely to be used than information near the start or end (the [[Lost in the Middle Effect]]).
2. **Instruction adherence.** Early instructions ("always use TypeScript strict mode") get drowned out by more recent intermediate output.
## Common Misconceptions
- *Context window is memory.* It is not. It is the attention surface for **one turn**. Anything that must persist between sessions belongs in a file or a memory store; see [[Context vs Memory]].
- *Bigger is always better.* A model given 30k tokens of relevant context tends to outperform the same model given those 30k tokens plus 100k of "just in case" material. Every irrelevant token is a vote against the right answer.
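
The cost of "just in case" context can be made concrete with a small selection routine. What follows is a minimal sketch, not a prescribed method: `Snippet`, `select_context`, the relevance scores, and the `min_relevance` threshold are all hypothetical, and token counts are assumed to be precomputed with whatever tokenizer the model uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snippet:
    name: str
    tokens: int        # token count, assumed precomputed with the model's tokenizer
    relevance: float   # hypothetical score in [0, 1] from a retriever or heuristic


def select_context(snippets, budget, min_relevance=0.5):
    """Greedily keep the most relevant snippets that fit within `budget` tokens.

    Snippets scoring below `min_relevance` are dropped even if space remains:
    spare capacity is not a reason to include them.
    """
    chosen = []
    used = 0
    for s in sorted(snippets, key=lambda s: s.relevance, reverse=True):
        if s.relevance < min_relevance:
            break  # everything after this point is less relevant still
        if used + s.tokens <= budget:
            chosen.append(s)
            used += s.tokens
    return chosen, used


# Illustrative data only: file names, sizes, and scores are made up.
candidates = [
    Snippet("failing_test.py", 1_200, 0.95),
    Snippet("module_under_test.py", 4_000, 0.90),
    Snippet("style_guide.md", 9_000, 0.40),
    Snippet("old_changelog.md", 30_000, 0.10),
]
chosen, used = select_context(candidates, budget=50_000)
print([s.name for s in chosen], used)
```

With a 50k budget the low-relevance files would fit, but the threshold leaves them out anyway, which is the point of the misconception above.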
## Operational Rules
- Actively curate the working set for each task.
- When approaching the limit, externalise state to disk rather than waiting for [[Context Compaction]].
- Use sub-agents to keep verbose tool output out of the main thread.
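
The rule about externalising state before the limit can be sketched as a simple threshold check. This is an illustrative sketch under stated assumptions: the 4-characters-per-token estimate is a rough heuristic for English text rather than a real tokenizer, and `WINDOW_TOKENS`, `SOFT_LIMIT`, `estimate_tokens`, `maybe_externalise`, and the JSON layout of the state file are all hypothetical.

```python
import json
import tempfile
from pathlib import Path

WINDOW_TOKENS = 200_000   # assumed window size; substitute the model's real limit
SOFT_LIMIT = 0.8          # hypothetical threshold: act at 80% utilisation


def estimate_tokens(text: str) -> int:
    # Crude heuristic (~4 characters per token); use the model's tokenizer in practice.
    return max(1, len(text) // 4)


def maybe_externalise(messages, state, path, window=WINDOW_TOKENS, soft_limit=SOFT_LIMIT):
    """Write `state` to `path` once estimated usage crosses the soft limit.

    Returns True if the state was written, False otherwise.
    """
    used = sum(estimate_tokens(m) for m in messages)
    if used < window * soft_limit:
        return False
    Path(path).write_text(json.dumps(state, indent=2), encoding="utf-8")
    return True


# Illustrative usage with made-up task state and synthetic history.
with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "task_state.json"
    state = {"goal": "migrate auth module", "done": ["schema"], "next": ["handlers"]}
    short_history = ["hello"] * 10
    long_history = ["x" * 4_000] * 170   # about 170k estimated tokens
    print(maybe_externalise(short_history, state, state_file))
    print(maybe_externalise(long_history, state, state_file))
```

Acting at a soft limit rather than the hard one leaves room to write a deliberate summary of the task state instead of relying on whatever compaction happens to preserve.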
## Related
- [[Token]]
- [[Lost in the Middle Effect]]
- [[Context Compaction]]
- [[Context vs Memory]]
- [[Hierarchical Retrieval]]