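## Overview
**Precision** and **recall** are complementary measures of how a classifier handles the positive class: precision scores the predictions it labels positive, and recall scores its coverage of the items that are actually positive. They are the standard metric pair when classes are imbalanced or when the two kinds of error carry different costs, which is common in production ML.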
## Definitions
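$$
\text{Precision} = \frac{TP}{TP + FP}
$$
$$
\text{Recall} = \frac{TP}{TP + FN}
$$
Here $TP$, $FP$, and $FN$ are the counts of true positives, false positives, and false negatives.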
- **Precision** — of the predictions I labelled positive, what fraction were actually positive?
- **Recall** — of the actually positive items, what fraction did I catch?
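The two formulas can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the function name `precision_recall`, the toy labels, and the convention of returning `0.0` when a denominator is zero are all assumptions made for the example.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Assumed convention: report 0.0 when a denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data (hypothetical): 4 actual positives, 5 positive predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

With these labels there are 3 true positives, 2 false positives, and 1 false negative, so precision is 3/5 and recall is 3/4.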
## The Trade-off
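Adjusting the decision threshold of a probabilistic classifier usually moves precision and recall in opposite directions. Recall can only fall or stay flat as the threshold rises; precision tends to rise but is not guaranteed to at every step:
- **Lower threshold** → more positive predictions → recall rises or stays the same, precision usually falls.
- **Higher threshold** → fewer positive predictions → recall falls or stays the same, precision usually rises.
Where to set the threshold depends on the relative cost of false positives and false negatives.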
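A small threshold sweep makes the trade-off concrete. The sketch below assumes scores at or above the threshold are labelled positive; the helper `precision_recall_at` and the toy scores are hypothetical.

```python
def precision_recall_at(y_true, scores, threshold):
    """Precision and recall when scores >= threshold are labelled positive."""
    tp = fp = fn = 0
    for truth, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if predicted_positive and truth == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif truth == 1:
            fn += 1
    # Assumed convention: report 0.0 when a denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data (hypothetical): model scores with their true labels.
y_true = [1, 1, 0, 1, 0, 1, 0, 0]
scores = [0.95, 0.85, 0.70, 0.60, 0.45, 0.35, 0.20, 0.10]

results = {t: precision_recall_at(y_true, scores, t) for t in (0.3, 0.5, 0.8)}
for threshold, (precision, recall) in results.items():
    print(f"threshold={threshold:.1f}  precision={precision:.2f}  recall={recall:.2f}")
```

On this data, raising the threshold from 0.3 to 0.5 to 0.8 takes precision from about 0.67 to 0.75 to 1.00, while recall drops from 1.00 to 0.75 to 0.50.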
## When To Optimise For Which
- **High recall matters when missing positives is costly.** Cancer screening, fraud detection, security threats. Better to false-alarm than miss.
- **High precision matters when false positives are costly.** Spam filtering (don't lose legitimate mail), legal document review (don't waste lawyer time on irrelevant docs), automated approvals.
Many real systems set a *precision floor* and maximise recall subject to it, or set a recall floor and maximise precision.
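One way to apply a precision floor is to scan candidate thresholds and keep the one with the highest recall among those that meet the floor. The sketch below is an illustration under assumptions: the function name `best_threshold_with_precision_floor`, the 0.75 floor, and the toy data are hypothetical, and candidate thresholds are simply the distinct scores.

```python
def best_threshold_with_precision_floor(y_true, scores, floor):
    """Return the threshold with maximum recall among those with precision >= floor.

    Returns None if no candidate threshold meets the floor.
    """
    best = None
    for threshold in sorted(set(scores)):
        tp = fp = fn = 0
        for truth, score in zip(y_true, scores):
            predicted_positive = score >= threshold
            if predicted_positive and truth == 1:
                tp += 1
            elif predicted_positive:
                fp += 1
            elif truth == 1:
                fn += 1
        if tp + fp == 0:
            continue  # no positive predictions, precision undefined
        precision = tp / (tp + fp)
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision >= floor and (best is None or recall > best["recall"]):
            best = {"threshold": threshold, "precision": precision, "recall": recall}
    return best


# Toy data (hypothetical): model scores with their true labels.
y_true = [1, 1, 0, 1, 0, 1, 0, 0]
scores = [0.95, 0.85, 0.70, 0.60, 0.45, 0.35, 0.20, 0.10]

best = best_threshold_with_precision_floor(y_true, scores, floor=0.75)
print(best)
```

In practice the threshold would normally be chosen on held-out validation data rather than on the data used to report the final metrics.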
## Precision-Recall Curve
Plot precision vs recall across all decision thresholds. **AUC-PR** (area under the precision-recall curve) summarises the trade-off in one number. AUC-PR is often more informative than ROC-AUC for imbalanced classes, because neither precision nor recall uses true negatives, so a large negative class cannot inflate the score.
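The curve and its area can be sketched without any plotting library. The example below uses average precision, a step-wise sum of precision weighted by the increase in recall, as one common estimate of AUC-PR; other estimators (for example trapezoidal interpolation) give slightly different numbers. The function names and toy data are hypothetical, and the code assumes at least one actual positive.

```python
def pr_curve(y_true, scores):
    """Return (recall, precision) points, one per distinct score, highest threshold first."""
    total_pos = sum(y_true)  # assumes at least one positive label
    pairs = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    points = []
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume every item tied at this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((tp / total_pos, tp / (tp + fp)))
    return points


def average_precision(points):
    """Step-wise area: sum of precision times the increase in recall."""
    area = 0.0
    previous_recall = 0.0
    for recall, precision in points:
        area += (recall - previous_recall) * precision
        previous_recall = recall
    return area


# Toy data (hypothetical): model scores with their true labels.
y_true = [1, 1, 0, 1, 0, 1, 0, 0]
scores = [0.95, 0.85, 0.70, 0.60, 0.45, 0.35, 0.20, 0.10]

points = pr_curve(y_true, scores)
ap = average_precision(points)
print(f"average precision = {ap:.3f}")
```

A useful reference point when reading the result is the positive-class prevalence (0.5 in this toy data), which is roughly what a classifier with uninformative scores would achieve.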
## F1 Score
The harmonic mean of precision and recall — see [[F1 Score]].
## Multi-Class Precision and Recall
Compute per class and aggregate:
- **Macro-average.** Mean across classes, treating each equally. Sensitive to small classes.
- **Micro-average.** Sum TP, FP, FN across classes, then compute. Dominated by large classes.
- **Weighted average.** Mean of per-class scores, weighted by class support (the number of true instances of each class).
The right choice depends on whether minority-class performance should "count" equally.
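The three aggregation schemes can be compared on a small imbalanced example. The sketch below computes precision only (recall is analogous, with `fn` in place of `fp`); the helper names, the class labels, and the zero-denominator convention are assumptions for illustration.

```python
from collections import Counter


def per_class_counts(y_true, y_pred, classes):
    """Count TP, FP, FN per class for single-label multi-class predictions."""
    counts = {c: {"tp": 0, "fp": 0, "fn": 0} for c in classes}
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            counts[truth]["tp"] += 1
        else:
            counts[pred]["fp"] += 1   # wrongly predicted as `pred`
            counts[truth]["fn"] += 1  # missed instance of `truth`
    return counts


def safe_div(numerator, denominator):
    # Assumed convention: 0.0 when the denominator is zero.
    return numerator / denominator if denominator else 0.0


# Toy data (hypothetical): class "a" is large, class "c" has a single instance.
y_true = ["a"] * 6 + ["b"] * 3 + ["c"] * 1
y_pred = ["a", "a", "a", "a", "a", "b", "b", "b", "a", "a"]
classes = ["a", "b", "c"]

counts = per_class_counts(y_true, y_pred, classes)
support = Counter(y_true)
per_class_precision = {
    c: safe_div(counts[c]["tp"], counts[c]["tp"] + counts[c]["fp"]) for c in classes
}

macro = sum(per_class_precision.values()) / len(classes)
total_tp = sum(counts[c]["tp"] for c in classes)
total_fp = sum(counts[c]["fp"] for c in classes)
micro = safe_div(total_tp, total_tp + total_fp)
weighted = sum(per_class_precision[c] * support[c] for c in classes) / len(y_true)

print(f"macro={macro:.3f} micro={micro:.3f} weighted={weighted:.3f}")
```

Here the missed minority class "c" pulls the macro average well below the micro average. In single-label problems like this one, micro-averaged precision equals micro-averaged recall, and both equal accuracy.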
## Related
- [[Confusion Matrix]]
- [[F1 Score]]
- [[ROC Curve and AUC]]