## Definition

**Structured outputs** constrain a model to emit a value conforming to a declared schema (typically JSON with specified keys and types) instead of free prose that you then parse. The schema is part of the request; conformance is enforced during generation.

## The problem: prose is a hostile interface

Asking a model for "the extracted fields" and getting a paragraph forces you into regex archaeology: brittle parsing, fragile to phrasing changes, broken by a polite preamble like "Sure, here are the fields." Every prose-to-data hop is a place where the pipeline can silently corrupt data. Structured outputs delete that hop.

```json
{
  "severity": "high",
  "component": "auth",
  "needs_human_review": true
}
```

A downstream system can consume this directly. There is no "mostly works": either it parses or the contract was violated, and the violation is visible.

## Why it's cheaper

Constraining to a schema cuts tokens. The model spends none of its budget on connective prose ("Based on my analysis, I would say that..."); it emits only values. Fewer output tokens means lower latency and lower cost per call ([[Token]]), and the savings compound across an agentic loop that calls the model thousands of times.

| | Prose output | Structured output |
|---|---|---|
| Parsing | Regex / heuristics | Direct deserialization |
| Token spend | High (filler) | Low (values only) |
| Drift | Frequent | Schema-bounded |
| Failure | Silent misparse | Explicit violation |

## Same discipline as a tool schema

A structured output schema and a tool's input schema ([[Function Calling]]) are the same idea pointed in opposite directions. A tool call is the model producing a structured *input* to your code; a structured output is the model producing structured *data* for your code. Both replace "interpret this English" with "fill this shape." If you are comfortable defining tool parameters, you already know how to define an output schema.

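
The symmetry can be sketched in Python by handing one schema object to both a tool definition and an output format. This is a minimal illustration under stated assumptions, not any vendor's API: the names `TRIAGE_SCHEMA`, `tool_definition`, `output_format`, and `violations` are hypothetical, the wrapper keys are made up, and the hand-rolled checker covers only the flat subset of JSON Schema used here. The field types are inferred from the example payload, and the `severity` enum follows the values this note lists for it.

```python
import json

# Hypothetical schema for the triage example. Field types are inferred from
# the sample payload; the severity enum uses the values named in this note.
TRIAGE_SCHEMA = {
    "type": "object",
    "properties": {
        "severity": {"type": "string", "enum": ["low", "medium", "high"]},
        "component": {"type": "string"},
        "needs_human_review": {"type": "boolean"},
    },
    "required": ["severity", "component", "needs_human_review"],
    "additionalProperties": False,
}

# The same shape pointed in both directions. The wrapper keys ("name",
# "parameters", "schema") are illustrative and not taken from a real API.
tool_definition = {"name": "file_triage_ticket", "parameters": TRIAGE_SCHEMA}
output_format = {"name": "triage_result", "schema": TRIAGE_SCHEMA}

# Maps the JSON Schema type names used above to Python types.
_PY_TYPES = {"string": str, "boolean": bool}


def violations(raw: str, schema: dict) -> list:
    """Return contract violations for a raw model reply; empty means it conforms.

    Supports only flat objects with string/boolean fields and optional enums.
    """
    try:
        value = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not JSON: {exc.msg}"]
    if not isinstance(value, dict):
        return ["top-level value is not an object"]

    problems = []
    props = schema["properties"]
    for key in schema["required"]:
        if key not in value:
            problems.append(f"missing key: {key}")
    for key, item in value.items():
        spec = props.get(key)
        if spec is None:
            # Unknown keys are violations only when the schema forbids them.
            if schema.get("additionalProperties", True) is False:
                problems.append(f"unexpected key: {key}")
            continue
        if not isinstance(item, _PY_TYPES[spec["type"]]):
            problems.append(f"{key}: expected {spec['type']}")
        elif "enum" in spec and item not in spec["enum"]:
            problems.append(f"{key}: {item!r} not in {spec['enum']}")
    return problems
```

Passing the example payload from earlier returns an empty list, while a reply that opens with "Sure, here are the fields." comes back as a single "not JSON" violation instead of a silent misparse. In practice a full JSON Schema validator would replace the hand-rolled check; the point of the sketch is only that one schema definition serves both directions.
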
## Connection to verifiable specs

Structured outputs make a result *checkable*. A field with an enum of allowed values is a tiny [[Executable Acceptance Criterion]]: the schema asserts "severity must be one of {low, medium, high}," and a non-conforming answer fails automatically. This is the same move SDD makes at the project level, applied to a single call.

## When not to use it

Schema constraint can suppress reasoning quality if applied too early: forcing JSON before the model has "thought" can flatten its answer. The common pattern is to let the model reason freely (or in [[Extended Thinking]]), then emit a structured final value.

## Related

- [[Function Calling]]
- [[Tool Use]]
- [[Token]]
- [[Executable Acceptance Criterion]]
- [[Extended Thinking]]