## Definition
**Overfitting** is when a model fits the training data too closely, including its noise, and generalises poorly. **Underfitting** is when a model is too constrained to capture the underlying pattern, performing poorly on both training and test data. They are the two opposite failure modes of supervised learning.
## Diagnostic Signatures
| Symptom | Likely cause |
|---|---|
| Training error low, test error high | Overfitting (high variance) |
| Training error high, test error similarly high | Underfitting (high bias) |
| Both errors decreasing with more data | Healthy training |
| Both errors plateaued at high values | Underfitting |
| Train-test gap widens as model complexity grows | Overfitting risk increasing |
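
The first two rows of the table can be read as a simple decision rule. The sketch below is one possible encoding, not a standard procedure: the function name `diagnose` and both thresholds (`high_error`, `max_gap`) are hypothetical and would need to be chosen per problem.

```python
def diagnose(train_error, val_error, high_error=0.2, max_gap=0.05):
    """Map a (training, validation) error pair to a rough diagnosis.

    Both thresholds are illustrative assumptions; sensible values depend
    on the task, the metric, and the irreducible noise in the data.
    """
    gap = val_error - train_error
    if train_error >= high_error:
        # The model cannot even fit the data it was trained on.
        return "underfitting"
    if gap > max_gap:
        # Fits the training set well but does not carry over to held-out data.
        return "overfitting"
    return "no obvious problem"


print(diagnose(0.02, 0.30))  # overfitting
print(diagnose(0.35, 0.37))  # underfitting
```

The training error is checked first so that a model with high error on both sets is reported as underfitting regardless of the gap.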
## Causes
**Overfitting** — model has too much capacity relative to the signal available. Common drivers:
- Too many parameters for too few examples.
- Training too long with no early stopping.
- Insufficient regularisation.
- Noisy or unrepresentative training data.
**Underfitting** — model is structurally unable to capture the pattern. Common drivers:
- Hypothesis class too restrictive (e.g., linear model on non-linear data).
- Excessive regularisation.
- Insufficient training time.
- Features that do not carry the relevant signal.
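
Both sets of causes can be seen in a small experiment. The sketch below uses k-nearest-neighbour regression on synthetic data as a stand-in for model capacity. The choice of model, the sine target, the noise level, and the sample sizes are all assumptions made for illustration. With `k=1` the model memorises every training point (too much capacity); with `k` equal to the training-set size it can only predict the global mean (hypothesis class too restrictive).

```python
import math
import random


def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mse(train, points, k):
    """Mean squared error of a k-NN model fitted on `train`, scored on `points`."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in points) / len(points)


rng = random.Random(0)  # fixed seed so the run is repeatable


def sample(n):
    """Draw n noisy observations of sin(x) on [0, 2*pi]."""
    xs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    return [(x, math.sin(x) + rng.gauss(0.0, 0.3)) for x in xs]


train, val = sample(40), sample(40)

# k=1: memorises the training set; k=40: predicts the training mean everywhere.
errors = {k: (mse(train, train, k), mse(train, val, k)) for k in (1, 5, 40)}
for k, (train_mse, val_mse) in errors.items():
    print(f"k={k:2d}  train MSE={train_mse:.3f}  val MSE={val_mse:.3f}")
```

At `k=1` the training error is exactly zero because each training point is its own nearest neighbour, while the validation error is not. At `k=40` the training error is already high. An intermediate `k` would typically sit between these extremes on the training set.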
## Mitigations
**Against overfitting:**
- More data.
- Simpler model.
- [[Regularization]] (L1, L2, dropout).
- Early stopping.
- Data augmentation.
- Ensembling.
**Against underfitting:**
- Richer model class.
- Better features (see [[Feature Engineering]]).
- Less regularisation.
- Train longer.
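
As one concrete instance of the mitigations above, early stopping can be sketched as a patience rule over per-epoch validation losses. The function name `best_epoch_with_patience` and the `patience` parameter are hypothetical; deep-learning frameworks ship their own callbacks for this, and this is only a minimal illustration of the idea.

```python
def best_epoch_with_patience(val_losses, patience=3):
    """Return (best_epoch, stopped_epoch) for a sequence of validation losses.

    Training is considered stopped once `patience` consecutive epochs have
    passed without a new best validation loss.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop and keep the best weights.
            return best_epoch, epoch
    # Ran out of epochs before the patience budget was exhausted.
    return best_epoch, len(val_losses) - 1


losses = [0.9, 0.7, 0.6, 0.62, 0.65, 0.7, 0.8]
print(best_epoch_with_patience(losses, patience=3))  # (2, 5)
```

The same rule also covers the "train longer" case: if the validation loss is still falling on the last epoch, the best epoch is the final one, which suggests training was cut short rather than run too long.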
## Validation as the Compass
Diagnose by *comparing training and validation performance*, not by looking at either number in isolation: a low training error alone cannot distinguish a good fit from an overfit one. See [[Cross-Validation]].
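
A minimal sketch of how k-fold cross-validation yields the two numbers being compared. The helpers `k_fold_splits` and `cross_validate`, and the `fit` / `error` hooks, are hypothetical names introduced here; the toy "model" simply predicts the mean of its training targets.

```python
def k_fold_splits(n, k):
    """Yield (train_idx, val_idx) lists that partition range(n) into k contiguous folds."""
    start = 0
    for fold in range(k):
        # Spread the remainder over the first n % k folds.
        size = n // k + (1 if fold < n % k else 0)
        val_idx = list(range(start, start + size))
        train_idx = list(range(start)) + list(range(start + size, n))
        yield train_idx, val_idx
        start += size


def cross_validate(xs, ys, fit, error, k=5):
    """Return (mean training error, mean validation error) over k folds.

    `fit(xs, ys)` returns a model; `error(model, xs, ys)` returns a float.
    Both callables are supplied by the caller.
    """
    train_errors, val_errors = [], []
    for train_idx, val_idx in k_fold_splits(len(xs), k):
        x_tr = [xs[i] for i in train_idx]
        y_tr = [ys[i] for i in train_idx]
        x_va = [xs[i] for i in val_idx]
        y_va = [ys[i] for i in val_idx]
        model = fit(x_tr, y_tr)
        train_errors.append(error(model, x_tr, y_tr))
        val_errors.append(error(model, x_va, y_va))
    return sum(train_errors) / k, sum(val_errors) / k


def fit_mean(xs, ys):
    """Toy model: always predict the mean of the training targets."""
    return sum(ys) / len(ys)


def mse_constant(model, xs, ys):
    """Mean squared error of a constant prediction."""
    return sum((model - y) ** 2 for y in ys) / len(ys)


# Real data should be shuffled before splitting; these folds are contiguous.
train_err, val_err = cross_validate(
    [0, 1, 2, 3], [0.0, 0.0, 1.0, 1.0], fit_mean, mse_constant, k=2
)
print(train_err, val_err)  # 0.0 1.0
```

Here the training error is zero in every fold while the validation error is large, which is the overfitting signature: neither number on its own would reveal the problem.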
## Related
- [[Bias-Variance Tradeoff]]
- [[Generalization]]
- [[Regularization]]
- [[Cross-Validation]]