## Definition
**Structured outputs** constrain a model to emit a value conforming to a declared schema (typically JSON with specified keys and types) instead of free prose that you then parse. The schema is part of the request, and with constrained decoding, conformance is enforced during generation rather than checked afterward.
## The problem: prose is a hostile interface
Asking a model for "the extracted fields" and getting a paragraph forces you into regex archaeology: parsing that is brittle, fragile to phrasing changes, and broken by a polite preamble like "Sure, here are the fields." Every prose-to-data hop is a place where the pipeline can silently corrupt data. Structured outputs delete that hop.
```json
{
"severity": "high",
"component": "auth",
"needs_human_review": true
}
```
A downstream system can consume this directly. There is no "mostly works": either the value parses and validates against the schema, or the contract was violated and the violation is visible.
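
A minimal sketch of what "consume this directly" can look like in Python, using only the standard library. The `Triage` class and `parse_triage` helper are hypothetical names introduced for illustration; the field names come from the JSON example above, and the allowed severities are assumed to be `low`, `medium`, and `high`.

```python
import json
from dataclasses import dataclass

# Assumed enum for illustration; a tuple so membership tests never need hashing.
ALLOWED_SEVERITIES = ("low", "medium", "high")
EXPECTED_KEYS = {"severity", "component", "needs_human_review"}


@dataclass(frozen=True)
class Triage:
    severity: str
    component: str
    needs_human_review: bool


def parse_triage(raw: str) -> Triage:
    """Deserialize the model's JSON, failing loudly on any contract violation."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on non-JSON text
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if set(data) != EXPECTED_KEYS:
        raise ValueError(f"key mismatch: {sorted(set(data) ^ EXPECTED_KEYS)}")
    if data["severity"] not in ALLOWED_SEVERITIES:
        raise ValueError(f"severity not allowed: {data['severity']!r}")
    if not isinstance(data["component"], str):
        raise ValueError("component must be a string")
    if not isinstance(data["needs_human_review"], bool):
        raise ValueError("needs_human_review must be a boolean")
    return Triage(**data)


triage = parse_triage(
    '{"severity": "high", "component": "auth", "needs_human_review": true}'
)
```

A reply that starts with a preamble such as "Sure, here are the fields" never reaches the field checks: `json.loads` rejects it outright, so the failure is an exception rather than a quiet misparse.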
## Why it's cheaper
Constraining to a schema usually cuts output tokens. The model spends none of its budget on connective prose ("Based on my analysis, I would say that..."); it emits keys and values. Fewer output tokens mean lower latency and lower cost per call ([[Token]]), and the savings compound across an agentic loop that calls the model thousands of times.
| | Prose output | Structured output |
|---|---|---|
| Parsing | Regex / heuristics | Direct deserialization |
| Token spend | High (filler) | Low (values only) |
| Drift | Frequent | Schema-bounded |
| Failure | Silent misparse | Explicit violation |
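
A rough illustration of the token-spend row, assuming a made-up prose answer built around the filler phrase quoted earlier. Character count is used here only as a crude proxy; it is not a token count, and a real measurement would use the tokenizer of the model being called.

```python
import json

# Hypothetical prose answer carrying three facts plus connective filler.
prose = (
    "Based on my analysis, I would say that the severity is high, "
    "the affected component is auth, and a human should review this."
)

# The same three facts as compact JSON (no optional whitespace).
structured = json.dumps(
    {"severity": "high", "component": "auth", "needs_human_review": True},
    separators=(",", ":"),
)

# Fraction of characters saved by the structured form in this one example.
saving = 1 - len(structured) / len(prose)
print(f"prose={len(prose)} chars, structured={len(structured)} chars, saved={saving:.0%}")
```

JSON carries its own overhead (keys, quotes, braces), so the size of the saving depends on how verbose the prose alternative would have been.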
## Same discipline as a tool schema
A structured output schema and a tool's input schema ([[Function Calling]]) are the same idea applied at different points in the exchange. A tool call is the model producing structured *arguments* that invoke your code; a structured output is the model producing a structured *result* that your code consumes. Both replace "interpret this English" with "fill this shape." If you are comfortable defining tool parameters, you already know how to define an output schema.
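
To make the equivalence concrete, here is a sketch in which one JSON Schema, written once as a Python dict, is reused in both roles. The wrapper keys (`parameters`, `format`, `schema`) and the tool name `record_triage` are illustrative assumptions; each provider names these fields differently, so check the API you are actually calling.

```python
# One JSON Schema, declared once.
TRIAGE_SCHEMA = {
    "type": "object",
    "properties": {
        "severity": {"type": "string", "enum": ["low", "medium", "high"]},
        "component": {"type": "string"},
        "needs_human_review": {"type": "boolean"},
    },
    "required": ["severity", "component", "needs_human_review"],
    "additionalProperties": False,
}

# Use 1: the schema describes the arguments of a tool the model may call.
record_triage_tool = {
    "name": "record_triage",  # hypothetical tool name
    "description": "Record a triage decision.",
    "parameters": TRIAGE_SCHEMA,  # illustrative key
}

# Use 2: the same schema describes the final value the model must return.
structured_output_request = {
    "format": "json_schema",  # illustrative key
    "schema": TRIAGE_SCHEMA,
}
```

Nothing about the schema changes between the two uses; only the place it is attached to the request differs.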
## Connection to verifiable specs
Structured outputs make a result *checkable*. A field with an enum of allowed values is a tiny [[Executable Acceptance Criterion]]: the schema asserts "severity must be one of {low, medium, high}," and a non-conforming answer fails automatically. This is the same move spec-driven development (SDD) makes at the project level, applied to a single call.
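
The enum-as-acceptance-criterion idea can be sketched as a plain check function. The name `violations` is hypothetical, and this hand-rolled check stands in for whatever schema validator you actually use.

```python
from typing import Any

SEVERITIES = ("low", "medium", "high")  # the enum stated in the text


def violations(result: dict[str, Any]) -> list[str]:
    """Return human-readable failures; an empty list means the check passes."""
    problems = []
    if "severity" not in result:
        problems.append("severity is missing")
    elif result["severity"] not in SEVERITIES:
        problems.append(f"severity {result['severity']!r} is not one of {SEVERITIES}")
    return problems


passing = violations({"severity": "high"})      # conforms: no problems reported
failing = violations({"severity": "critical"})  # outside the enum: one problem
```

Because the check is mechanical, it can run on every call without a human reading the output.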
## When not to use it
Schema constraints can degrade reasoning quality if applied too early: forcing JSON before the model has "thought" can flatten its answer. The common pattern is to let the model reason freely (or in [[Extended Thinking]]), then emit a structured final value.
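
One way to sketch the reason-then-emit pattern is as two passes. Here `ask` is a placeholder for whatever client call you use (it takes a prompt and returns text); it is not a real library function, and `fake_ask` is a canned stand-in so the sketch runs without a model.

```python
import json
from typing import Callable

Ask = Callable[[str], str]  # placeholder type for a model call


def reason_then_structure(ask: Ask, task: str) -> dict:
    # Pass 1: unconstrained reasoning, no format requirements.
    reasoning = ask(f"Think through this step by step:\n{task}")
    # Pass 2: turn the finished reasoning into the structured value.
    raw = ask(
        "Using the analysis below, return only a JSON object with keys "
        "severity, component, needs_human_review.\n\n" + reasoning
    )
    return json.loads(raw)  # raises if the second pass did not return JSON


def fake_ask(prompt: str) -> str:
    """Canned responses standing in for a real model."""
    if prompt.startswith("Think through"):
        return "Sessions survive a password reset, so this is a serious auth issue."
    return '{"severity": "high", "component": "auth", "needs_human_review": true}'


result = reason_then_structure(fake_ask, "Users stay logged in after a password reset.")
```

In a real system the second pass would be the one that carries the schema constraint, so the reasoning in the first pass is never forced into JSON.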
## Related
- [[Function Calling]]
- [[Tool Use]]
- [[Token]]
- [[Executable Acceptance Criterion]]
- [[Extended Thinking]]