## Definition
**Logprobs** (log probabilities) are the log-scale scores the model assigns to each possible next [[Token]], exposing how confident it was at every step of generation. They are a cheap, built-in signal: where the API exposes them, you can read them off the response without any extra model call.
## Where they come from
At each step a transformer emits a vector of **logits** — one raw score per vocabulary entry. A **softmax** turns those into a probability distribution, and the log of each probability is a logprob:
$$
p_i = \frac{e^{z_i}}{\sum_j e^{z_j}} \qquad \text{logprob}_i = \log p_i
$$
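The formula above can be sketched in a few lines of plain Python. This is a minimal illustration, not any library's implementation: the function name `log_softmax` and the three example logits are assumptions made up for the demo. It subtracts the largest logit before exponentiating, a common trick to avoid overflow that leaves the result unchanged.

```python
import math

def log_softmax(logits):
    """Return one log-probability per logit (hypothetical helper)."""
    # Subtract the max logit first so exp() cannot overflow;
    # the shift cancels out in the normalisation.
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_norm for z in logits]

# Made-up raw scores for a toy 3-token vocabulary.
logits = [2.0, 1.0, 0.1]
logprobs = log_softmax(logits)

# Exponentiating a logprob recovers the probability p_i.
probs = [math.exp(lp) for lp in logprobs]
```

Every logprob is at most 0, and the recovered probabilities sum to 1, which is a quick sanity check when working with values read from an API.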
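The token actually emitted depends on the [[Decoding Strategy]] and [[Temperature]]: temperature rescales the logits before the softmax, and the decoding strategy decides how a token is picked from the resulting distribution. That distribution, and thus its logprobs, is what [[Sampling]] draws from.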
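To make the role of temperature concrete, here is a small sketch that divides the logits by a temperature `T` before the softmax. The function name and example values are hypothetical. Whether a given API reports logprobs before or after temperature scaling varies by provider, so treat that as something to check in its documentation rather than something this sketch settles.

```python
import math

def log_softmax_with_temperature(logits, temperature=1.0):
    """Log-probabilities after dividing logits by a temperature (hypothetical helper)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    # Same max-subtraction trick as a plain log-softmax, for numerical stability.
    m = max(scaled)
    log_norm = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_norm for s in scaled]

# Made-up logits for a toy 3-token vocabulary.
logits = [2.0, 1.0, 0.1]

# Low temperature sharpens the distribution; high temperature flattens it.
sharp = log_softmax_with_temperature(logits, temperature=0.5)
flat = log_softmax_with_temperature(logits, temperature=2.0)
```

The ranking of tokens never changes with temperature; only the gap between the top token and the rest does.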
## Reading confidence
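A logprob close to 0 means the model put nearly all of its probability on that token; a very negative one means the token was unlikely under the model's distribution, often because probability was spread across several alternatives. Exponentiating recovers the probability: a logprob of -0.1 is about 90%, while -2.3 is about 10%. Aggregating logprobs over a span (for example, taking the mean or the minimum) gives a per-segment confidence estimate at no extra inference cost.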
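One way to aggregate over a span is sketched below. The helper `span_confidence` and the sample values are illustrative assumptions; which statistic works best (mean, minimum, or joint probability) depends on the task and is worth validating on your own data.

```python
import math

def span_confidence(token_logprobs):
    """Summarise the logprobs of the tokens in one span (hypothetical helper)."""
    if not token_logprobs:
        raise ValueError("span must contain at least one token")
    total = sum(token_logprobs)
    mean = total / len(token_logprobs)
    return {
        # Probability of the whole span: logprobs add, probabilities multiply.
        "joint_prob": math.exp(total),
        # Length-normalised score, comparable across spans of different sizes.
        "mean_logprob": mean,
        # Geometric mean of the per-token probabilities.
        "geo_mean_prob": math.exp(mean),
        # The single weakest token, useful for spotting one shaky value.
        "min_logprob": min(token_logprobs),
    }

# Made-up per-token logprobs for a four-token span.
span = [-0.05, -0.20, -2.30, -0.10]
stats = span_confidence(span)
```

The joint probability shrinks as spans get longer, so the mean (or its exponentiated form) is usually the fairer choice when comparing spans of different lengths.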
## Practical uses
- **Routing.** Flag low-confidence spans and send just those to a bigger, costlier model — a cascade pattern that saves money.
- **Human-in-the-loop.** Surface low-confidence output for review instead of trusting it blindly.
- **Extraction QA.** Low logprobs on a parsed field hint that the model may have fabricated it.
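The triage idea shared by these uses can be sketched as a simple threshold on mean logprob. Everything here is an assumption for illustration: the function `route_spans`, the field strings, the logprob values, and the `-1.0` cutoff, which would need tuning against labelled examples in practice.

```python
def route_spans(spans, min_mean_logprob=-1.0):
    """Split (text, token_logprobs) pairs into accepted and escalated lists.

    Hypothetical helper: the threshold is an arbitrary placeholder.
    """
    accepted, escalated = [], []
    for text, token_logprobs in spans:
        if token_logprobs:
            mean = sum(token_logprobs) / len(token_logprobs)
        else:
            # No evidence at all: treat as lowest confidence.
            mean = float("-inf")
        if mean >= min_mean_logprob:
            accepted.append(text)
        else:
            # Candidate for a larger model or a human reviewer.
            escalated.append(text)
    return accepted, escalated

# Made-up extracted fields with made-up per-token logprobs.
spans = [
    ("invoice_number: 4471", [-0.02, -0.10, -0.05]),
    ("due_date: 2021-03-09", [-1.90, -2.40, -0.80]),
]
accepted, escalated = route_spans(spans)
```

The same split serves all three uses: escalated spans can go to a costlier model, to a review queue, or be marked as suspect fields.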
## The crucial caveat
Logprobs are **not calibrated truth**. A model can be *confidently wrong*: it can assign a high probability to a fluent fabrication. Confidence reflects how typical the text is given the training data, not whether it is factually correct. So logprobs are a useful *heuristic* for triage, never a verifier; treat them as a smoke detector, not a guarantee. They are a signal in the same family as [[Hallucination]] detection, not a cure.
## Related
- [[Sampling]]
- [[Temperature]]
- [[Decoding Strategy]]
- [[Hallucination]]
- [[Token]]
- [[AI Engineering - Chip Huyen]]