Reflection - Albert Masoliver's learning site

## Definition **Reflection** is the pattern where an agent explicitly critiques its own previous output — or another agent's output — and revises based on the critique. A bounded form of self-improvement within a single session. ## The Two Common Shapes ### Self-reflection The same agent re-reads its draft, identifies problems, and produces a revised version: ``` 1. Agent produces draft. 2. Agent is prompted: "Review the draft. List the three weakest points." 3. Agent produces revision addressing the listed points. ``` Effective when the failure mode is *visible to the model on re-reading* — verbose answers, missed cases, format errors. ### Reflexion-style external memory Shinn et al. (2023) — the agent writes self-critiques to an external "reflection" memory that persists across attempts, even after failures. Subsequent attempts read past reflections and avoid repeating mistakes. Especially useful for tasks attempted many times. ## Why It Works (Sometimes) LLMs are noticeably better at *evaluating* outputs than *generating* them. A reflection pass exploits this asymmetry — generating once is easy; spotting the flaws is the cheap part. ## When It Fails - **The model can't see the flaw.** If the model produced a hallucinated API call, it usually won't catch it on re-reading; both forward and backward passes share the same wrong belief. - **Sycophantic self-review.** Without prompting bias, "review your draft" produces "looks good to me." Demand specific failure modes. - **Adds cost without adding signal.** A reflection that doesn't change anything is just an expensive comment. Measure whether revisions actually improve. ## Builder-Critic as Structured Reflection The [[Builder-Critic Pattern]] is reflection across **two different agents** — the *critic* applies pressure the *builder* can't apply to itself. The same intelligence is more rigorous when given a different role. ## When to Reach For Reflection - The task's *acceptance criteria* are visible to the model but easy to fail on the first pass (formatting, completeness, missing edge case). - The cost of a second pass is low relative to the cost of shipping the wrong thing. - You can prompt the reflection step adversarially — "find at least three problems" — rather than open-endedly. ## Related - [[Builder-Critic Pattern]] - [[Verifier Independence]] - [[Chain-of-Thought]] - [[Building Effective AI Agents]]