## Definition
The **F1 score** is the harmonic mean of precision and recall:
$$
F_1 = 2 \cdot \frac{P \cdot R}{P + R}
$$
It collapses [[Precision and Recall]] into a single number, which is useful when you need to compare classifiers on one figure instead of choosing between precision and recall.
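The definition can be sketched in a few lines of Python. This is a minimal illustration, not a library API: the function names `precision_recall` and `f1_score`, the example counts, and the convention of returning 0.0 when a denominator is zero are all assumptions made for the example.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision and recall from raw counts (0.0 when undefined, by assumption)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # assumed convention: avoid dividing by zero
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 true positives, 2 false positives, 8 false negatives.
p, r = precision_recall(tp=8, fp=2, fn=8)
f1 = f1_score(p, r)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.3f}")
```

With these counts precision is 0.80 and recall is 0.50, and F1 lands at about 0.615, closer to the weaker of the two.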
## Why Harmonic Mean
The harmonic mean punishes imbalance. A classifier with precision 0.99 and recall 0.01 has:
- Arithmetic mean: 0.50 (looks fine).
- Harmonic mean (F1): 0.02 (correctly diagnoses the disaster).
F1 is only high when *both* precision and recall are reasonably high. That's usually what you want.
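The comparison above is easy to reproduce with Python's standard `statistics` module. The sketch below uses the lopsided pair from the text plus a balanced pair (0.50, 0.50), which is an assumed extra case added for contrast.

```python
from statistics import harmonic_mean, mean

# (precision, recall) pairs: the lopsided case from the text and a balanced one.
cases = [(0.99, 0.01), (0.50, 0.50)]

results = []
for precision, recall in cases:
    arithmetic = mean([precision, recall])
    f1 = harmonic_mean([precision, recall])  # same value as 2PR / (P + R)
    results.append((arithmetic, f1))
    print(f"P={precision:.2f} R={recall:.2f} arithmetic={arithmetic:.2f} F1={f1:.4f}")
```

Both pairs have an arithmetic mean of 0.50, but only the balanced pair keeps an F1 of 0.50; the lopsided pair drops to roughly 0.02.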
## Generalised — $F_\beta$
$$
F_\beta = (1 + \beta^2) \cdot \frac{P \cdot R}{\beta^2 \cdot P + R}
$$
- $\beta = 1$: balances precision and recall (F1).
- $\beta = 2$: weights recall twice as much as precision.
- $\beta = 0.5$: weights precision twice as much as recall.
Choose $\beta$ based on which error type is costlier in your domain.
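To see how $\beta$ shifts the score, here is a small sketch of the formula. The function name `f_beta` and the example values (precision 0.9, recall 0.5) are assumptions chosen for illustration.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score: (1 + beta^2) * P * R / (beta^2 * P + R)."""
    b2 = beta ** 2
    denom = b2 * precision + recall
    if denom == 0:
        return 0.0  # assumed convention when precision and recall are both zero
    return (1 + b2) * precision * recall / denom


# Hypothetical classifier: precise but with mediocre recall.
precision, recall = 0.9, 0.5
scores = {beta: f_beta(precision, recall, beta) for beta in (0.5, 1.0, 2.0)}
for beta, score in scores.items():
    print(f"beta={beta}: {score:.3f}")
```

Because this classifier is stronger on precision, the precision-leaning $\beta = 0.5$ gives the highest score (about 0.776) and the recall-leaning $\beta = 2$ the lowest (about 0.549), with F1 in between (about 0.643).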
## When F1 Is Right
- **Single-number comparison** across classifiers.
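- **Imbalanced classes.** Accuracy is misleading because the majority class dominates it; F1 computed on the rare (positive) class ignores true negatives, so it reflects performance on that class.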
- **Hyperparameter tuning** when you can't articulate the precision-recall trade-off precisely.
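The imbalanced-class point can be illustrated with a toy dataset. Everything here is assumed for the example: 10 positives among 1,000 samples, and a degenerate classifier that always predicts the majority class.

```python
# Hypothetical data: 10 positives (1) and 990 negatives (0).
y_true = [1] * 10 + [0] * 990
y_pred = [0] * 1000  # degenerate classifier: always predicts the majority class

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
denom = 2 * tp + fp + fn
f1 = 2 * tp / denom if denom else 0.0  # F1 written in terms of counts

print(f"accuracy={accuracy:.2f} F1={f1:.2f}")
```

Accuracy comes out at 0.99 even though the classifier never finds a single positive, while F1 on the positive class is 0.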
## When F1 Is Wrong
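- **Ranking tasks.** F1 is computed at one fixed decision threshold, so it says nothing about how well scores order positives above negatives. Use AUC instead.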
- **Asymmetric costs that aren't well-captured by a single $\beta$.** Compute expected cost directly from the [[Confusion Matrix]].
- **Multi-class with unequal class importance.** No averaging scheme is right by default: macro-F1 treats all classes equally, while weighted-F1 weights each class by its support. Choose based on intent.
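For the asymmetric-cost case, a rough sketch of computing expected cost directly from confusion-matrix counts is shown below. The `Confusion` class, the two sets of counts, and the cost values (a false negative costing 20 times a false positive) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int
    fp: int
    fn: int
    tn: int

    def f1(self) -> float:
        denom = 2 * self.tp + self.fp + self.fn
        return 2 * self.tp / denom if denom else 0.0

    def cost_per_example(self, fp_cost: float, fn_cost: float) -> float:
        """Average cost per example, assuming correct predictions cost nothing."""
        total = self.tp + self.fp + self.fn + self.tn
        return (self.fp * fp_cost + self.fn * fn_cost) / total


FP_COST, FN_COST = 1.0, 20.0  # hypothetical costs: a miss is 20x worse

model_a = Confusion(tp=80, fp=40, fn=20, tn=860)  # more false alarms
model_b = Confusion(tp=60, fp=5, fn=40, tn=895)   # more misses

for name, m in (("A", model_a), ("B", model_b)):
    print(f"model {name}: F1={m.f1():.3f} "
          f"cost={m.cost_per_example(FP_COST, FN_COST):.3f}")
```

Both hypothetical models have the same F1 (about 0.727), yet under these costs model A averages 0.44 per example and model B 0.805, a difference F1 cannot show.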
## Multi-Class Variants
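- **Macro-F1.** Compute F1 per class, then take the unweighted mean. Treats classes equally.
- **Micro-F1.** Pool TP, FP, FN across classes, then compute F1 once. Equals accuracy in single-label multi-class problems, where every misclassification counts as one FP and one FN.
- **Weighted-F1.** Per-class F1 averaged with weights proportional to each class's support (its number of true instances).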
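The three averages can be compared on a small example using only the standard library. The labels, predictions, and helper names (`per_class_counts`, `f1_from_counts`) below are made up for illustration.

```python
from collections import Counter


def per_class_counts(y_true, y_pred):
    """One-vs-rest TP, FP, FN counts per class for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # false positive for the predicted class
            fn[t] += 1  # false negative for the true class
    return tp, fp, fn


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0  # assumed convention for empty classes


# Hypothetical 3-class problem with supports 6, 3 and 1.
y_true = ["a"] * 6 + ["b"] * 3 + ["c"] * 1
y_pred = ["a"] * 5 + ["b"] + ["b", "b", "a"] + ["a"]

tp, fp, fn = per_class_counts(y_true, y_pred)
labels = sorted(set(y_true) | set(y_pred))
support = Counter(y_true)
per_class = {c: f1_from_counts(tp[c], fp[c], fn[c]) for c in labels}

macro = sum(per_class.values()) / len(labels)
micro = f1_from_counts(sum(tp.values()), sum(fp.values()), sum(fn.values()))
weighted = sum(support[c] * per_class[c] for c in labels) / len(y_true)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(f"macro={macro:.3f} micro={micro:.3f} weighted={weighted:.3f} accuracy={accuracy:.3f}")
```

Here the rare class `c` is never predicted correctly, which drags macro-F1 down to about 0.479, while micro-F1 matches accuracy at 0.700 and weighted-F1 sits at about 0.662.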
## Practical Note
In Kaggle and other ML competitions, F1 (especially macro-F1) is a common choice of metric for imbalanced classification. In production, the choice between F1 and a cost-sensitive metric should follow business reality, not convention.
## Related
- [[Precision and Recall]]
- [[Confusion Matrix]]
- [[ROC Curve and AUC]]