Chain-of-Thought - Albert Masoliver's learning site

## Definition

**Chain-of-Thought (CoT)** is a prompting technique that asks an LLM to produce intermediate reasoning steps before the final answer. Wei et al. (2022) showed that it substantially improves performance on arithmetic, commonsense, and symbolic-reasoning tasks; see [[Chain-of-Thought Prompting (Wei et al.)]].

## The Two Forms

### Few-shot CoT

The prompt includes exemplars in which each answer is preceded by its reasoning:

```
Q: There are 15 trees. After planting more, there are 21. How many were planted?
A: There were 15 originally. There are 21 now. 21 − 15 = 6. So 6 were planted.

Q: A juggler has 16 balls. Half are golf balls; half of those are blue. How many blue golf balls?
A: 16 / 2 = 8 golf balls. 8 / 2 = 4 blue golf balls. So 4 blue golf balls.

Q: <new question>
A:
```

### Zero-shot CoT

Simply append *"Let's think step by step."* to the question (Kojima et al., 2022). The model then produces reasoning on its own, without exemplars. Remarkably effective for such a simple change.

## Why It Matters

- **Surfaces reasoning to the user.** You can see the model's working and catch errors.
- **Improves accuracy on multi-step problems.** Forces the model to allocate compute to the substeps instead of jumping to an answer.
- **Conceptual ancestor of [[Extended Thinking]].** Modern frontier models internalise CoT via a reasoning budget rather than via explicit prompting.

## Variants

- **Self-consistency.** Sample multiple CoTs at non-zero temperature; take the majority answer (Wang et al., 2022). Robust improvement.
- **Least-to-most prompting.** Break the problem into easier subproblems first, then solve each (Zhou et al., 2022).
- **Tree of Thoughts.** Explore a tree of reasoning branches and prune (Yao et al., 2023).
- **[[ReAct Pattern]].** CoT + tool use; reasoning informs actions.

## Emergent at Scale

In the original paper, CoT prompting helped only on sufficiently large models, roughly 100B+ parameters. Smaller models tend to produce reasoning that doesn't help (or even hurts).

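
The two prompt forms and the self-consistency variant described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `sample` is a hypothetical callable standing in for whatever LLM call you use at non-zero temperature, and the helper names and the "last number in the completion" answer extraction are assumptions made for this example.

```python
import itertools
import re
from collections import Counter
from typing import Callable, List, Optional, Tuple

ZERO_SHOT_TRIGGER = "Let's think step by step."


def build_few_shot_prompt(exemplars: List[Tuple[str, str]], question: str) -> str:
    """Few-shot CoT: each exemplar answer already contains its reasoning."""
    parts = [f"Q: {q}\nA: {a}" for q, a in exemplars]
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


def build_zero_shot_prompt(question: str) -> str:
    """Zero-shot CoT: no exemplars, just the trigger phrase."""
    return f"Q: {question}\nA: {ZERO_SHOT_TRIGGER}"


def extract_answer(completion: str) -> Optional[str]:
    """Assumed heuristic: treat the last number in the completion as the answer."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    return numbers[-1] if numbers else None


def self_consistent_answer(
    sample: Callable[[str], str], prompt: str, n: int = 5
) -> Optional[str]:
    """Self-consistency: sample n reasoning chains, return the majority answer."""
    answers = (extract_answer(sample(prompt)) for _ in range(n))
    votes = Counter(a for a in answers if a is not None)
    return votes.most_common(1)[0][0] if votes else None


if __name__ == "__main__":
    # Hypothetical stand-in for a sampled LLM: cycles through canned completions.
    canned = itertools.cycle([
        "16 / 2 = 8 golf balls. 8 / 2 = 4 blue golf balls. So 4 blue golf balls.",
        "Half of 16 is 8. So 8.",
        "8 golf balls, half of them blue, so the answer is 4.",
    ])
    prompt = build_zero_shot_prompt(
        "A juggler has 16 balls. Half are golf balls; half of those are blue. "
        "How many blue golf balls?"
    )
    print(self_consistent_answer(lambda _prompt: next(canned), prompt, n=5))  # prints 4
```

The last-number heuristic only suits numeric answers; for other tasks the extraction step would need to match whatever final-answer format the prompt asks for.
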
## When NOT to Use CoT

- Simple lookup tasks. Reasoning adds tokens without value.
- Time-sensitive completions. Reasoning lengthens responses.
- Tasks where the model is already at ceiling without it.

## Related

- [[Prompt Engineering]]
- [[In-Context Learning]]
- [[ReAct Pattern]]
- [[Extended Thinking]]
- [[Reasoning Budget]]
- [[Chain-of-Thought Prompting (Wei et al.)]]