## Definition
The **Apriori algorithm** (Agrawal & Srikant, 1994) is the foundational algorithm for **association rule learning** — discovering frequent itemsets and useful "if X then Y" rules in transactional data. Classic application: market-basket analysis.
## The Key Insight (Apriori Property)
> If an itemset is frequent, then all its subsets are frequent.
Equivalently: if a subset is not frequent, no superset can be frequent. This *downward closure* enables massive pruning of the candidate space.
## Algorithm
```
L_1 ← frequent 1-itemsets (single items above minimum support)
for k = 2, 3, ...:
candidates C_k ← join L_{k-1} with itself, prune those containing infrequent (k-1) subsets
scan transactions, count support of each candidate
L_k ← candidates with support ≥ min_support
return ∪_k L_k
for each frequent itemset:
generate association rules (X → Y) with confidence ≥ min_confidence
```
## Three Key Metrics
For a rule $X \to Y$:
- **Support:** $P(X \cup Y)$ — fraction of transactions containing both.
- **Confidence:** $P(Y \mid X)$ — fraction of transactions with $X$ that also have $Y$.
- **Lift:** $P(Y \mid X) / P(Y)$ — how much more likely $Y$ is given $X$ vs at random. $\text{lift} > 1$ = positive correlation.
A useful rule needs high support (occurs often), high confidence (when X, then Y reliably), and lift > 1 (not just because Y is always common).
## Example: Market Basket
> {diapers} → {beer}, support 0.05, confidence 0.7, lift 3.5
5% of transactions contain both diapers and beer. 70% of diaper-buying transactions also buy beer. Diaper buyers are 3.5× more likely to buy beer than random shoppers. (The famous data-mining example, though its actual provenance is contested.)
## Limitations
- **Multiple database scans.** One scan per level $k$. Slow on large data.
- **Combinatorial blow-up** with many items. Even with pruning, intermediate candidate sets can be huge.
- **Memory-heavy** for dense data.
## Successors
- **FP-Growth** (Han et al., 2000) — uses a compact FP-tree structure; mines without candidate generation. Much faster.
- **Eclat** — uses vertical data layout (tid-lists) and intersection. Memory-efficient for sparse data.
## Modern Status
Pure association rule mining has largely been displaced by:
- **Recommendation systems** (collaborative filtering, matrix factorisation, deep models).
- **Sequence pattern mining** for ordered behaviour.
But Apriori remains useful for:
- Interpretable rule discovery (medical diagnosis, regulatory contexts).
- Quick exploratory analysis of categorical transactions.
- Foundation for understanding frequent-pattern mining.
## Related
- [[Naive Bayes]]
- [[Unsupervised Learning]]
- [[Bayes Theorem]]