Reasoning Budget - Albert Masoliver's learning site

## Definition

A **reasoning budget** is the maximum number of internal "thinking" tokens an LLM is allowed to spend *before* producing its visible answer. It is the mechanism behind [[Extended Thinking]] and the trigger words `think`, `think hard`, `think harder`, and `ultrathink` in Claude Code.

## Budget Tiers (Claude Code, approx.)

| Trigger        | Approx. internal tokens |
| -------------- | ----------------------- |
| *(none)*       | minimal                 |
| `think`        | ~4,000                  |
| `think hard`   | ~10,000                 |
| `think harder` | ~32,000                 |
| `ultrathink`   | up to ~64,000           |

## Heuristic: Spend Where You Can't Un-Spend a Mistake

Use a higher budget when:

- The action is hard to reverse (migrations, schema, public APIs).
- The decision constrains future work (framework, dependency).
- The model has to *weigh trade-offs*, not execute a plan.

Skip the budget when:

- Renaming one variable in one file.
- Writing a regex you already know the shape of.
- Translating code line-by-line.

## Cost

Thinking tokens bill as **output tokens**. At roughly $15 per million output tokens, a 20k-token ultrathink session adds about $0.30 in thinking *before* the response itself.

## Failure Modes

- *Over-thinking.* On simple tasks the model second-guesses correct work and rewrites it worse.
- *Under-feeding.* More thinking does not manufacture missing information; feed the file first.

## Related

- [[Extended Thinking]]
- [[Model Selection Strategy]]
- [[Token]]
- [[Specialized Agent]]
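
## Sketch: Tier Lookup and Cost Estimate

The tier table and the cost arithmetic above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Claude Code actually parses prompts: the function names are hypothetical, the substring matching is deliberately naive, `0` stands in for the "minimal" tier, and the price of $15 per million output tokens is only the rate implied by the "20k tokens ≈ $0.30" figure, so check the current pricing for your model.

```python
# Approximate tiers copied from the table in this note (not official values).
APPROX_BUDGET_TOKENS = {
    "think": 4_000,
    "think hard": 10_000,
    "think harder": 32_000,
    "ultrathink": 64_000,
}

# Assumption: $15 per million output tokens, the rate implied by
# "20k thinking tokens cost about $0.30". Real pricing varies by model.
ASSUMED_USD_PER_MILLION_OUTPUT_TOKENS = 15.0


def budget_for_prompt(prompt: str) -> int:
    """Return the largest approximate budget whose trigger appears in the prompt.

    Hypothetical helper: naive substring matching, so "I think so" counts
    as the `think` trigger. Returns 0 when no trigger is present.
    """
    lowered = prompt.lower()
    matches = [
        tokens
        for trigger, tokens in APPROX_BUDGET_TOKENS.items()
        if trigger in lowered
    ]
    # "think harder" also contains "think hard" and "think"; taking the
    # maximum keeps the highest tier that matched.
    return max(matches, default=0)


def thinking_cost_usd(
    thinking_tokens: int,
    usd_per_million: float = ASSUMED_USD_PER_MILLION_OUTPUT_TOKENS,
) -> float:
    """Estimate the cost of thinking tokens, billed as output tokens."""
    return thinking_tokens / 1_000_000 * usd_per_million


if __name__ == "__main__":
    prompt = "Ultrathink about how to run this schema migration safely."
    budget = budget_for_prompt(prompt)
    print(f"budget ceiling: {budget} tokens")
    print(f"worst-case thinking cost: ${thinking_cost_usd(budget):.2f}")
    print(f"20k-token session: ${thinking_cost_usd(20_000):.2f}")
```

The budget is a ceiling, not a quota, so the worst-case figure is an upper bound on thinking spend rather than a prediction.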