## Definition
**Dynamic Programming (DP)** solves optimisation problems by decomposing them into overlapping subproblems and storing solutions to subproblems for reuse. Coined by Richard Bellman in the 1950s. The foundational technique whenever a problem exhibits **optimal substructure** and **overlapping subproblems**.
## The Two Requirements
### Optimal Substructure
The optimal solution to a problem can be constructed from optimal solutions to its subproblems. Examples:
- **Shortest path:** the shortest path from A to C through B = (shortest A → B) + (shortest B → C).
- **Knapsack:** the optimal selection from $n$ items = either (optimal from $n-1$ items, weight $W$) or (optimal from $n-1$ items, weight $W - w_n$) + value of item $n$.
### Overlapping Subproblems
The same subproblems recur many times in the recursive structure. Memoising solutions yields exponential speed-up.
## Two Styles
### Top-Down (Memoised Recursion)
```python
@lru_cache
def fib(n):
if n <= 1: return n
return fib(n-1) + fib(n-2)
```
Recursive with caching. Natural to write; clean for problems with sparse subproblem space.
### Bottom-Up (Iterative Tabulation)
```python
def fib(n):
dp = [0, 1]
for i in range(2, n+1):
dp.append(dp[i-1] + dp[i-2])
return dp[n]
```
Fill a table from base cases up. More space-efficient; often easier to optimise space (rolling array, e.g., only need the last two values for Fibonacci).
## Canonical DP Problems
- **Fibonacci numbers.** Pedagogical workhorse.
- **0-1 Knapsack.** $O(nW)$.
- **Coin change.** $O(n \cdot \text{amount})$.
- **Longest common subsequence.** $O(nm)$.
- **Edit distance (Levenshtein).** $O(nm)$.
- **Matrix chain multiplication.** $O(n^3)$.
- **Shortest paths:** Bellman-Ford ($O(VE)$), Floyd-Warshall ($O(V^3)$).
- **Traveling Salesman (Held-Karp).** $O(n^2 2^n)$ — exponential but better than naive $O(n!)$.
## The Bellman Equation
For sequential decision problems, the Bellman equation characterises the optimal value function:
$
V(s) = \max_a \left[ R(s, a) + \gamma \sum_{s'} P(s' \mid s, a) V(s') \right]
$
The foundation of [[Value Iteration]], [[Policy Iteration]], and [[Reinforcement Learning]] more generally. See [[Bellman Equation]].
## When DP Wins
- Problem decomposes into a manageable number of subproblems.
- Subproblems repeat — memoisation matters.
- State space is bounded.
## When DP Loses
- **Curse of dimensionality.** DP table size scales exponentially with state dimensions.
- **Continuous state spaces** require function approximation; pure DP doesn't apply.
- **Insufficient overlap.** If subproblems rarely repeat, memoisation overhead exceeds the saving.
## Approximate Dynamic Programming
For large or continuous state spaces, replace exact DP with function approximation:
- **Value function approximation** with linear regression, neural networks.
- **Sampled DP** — Monte Carlo estimates of value.
- This becomes [[Reinforcement Learning]].
## Beyond Optimisation
DP appears across CS:
- **Probabilistic inference.** Forward-backward, Viterbi for HMMs.
- **Parsing.** CYK algorithm.
- **Bioinformatics.** Sequence alignment (Needleman-Wunsch, Smith-Waterman).
- **Statistical mechanics.** Transfer matrix methods.
Any problem with the right recurrence structure can use DP.
## Related
- [[Bellman Equation]]
- [[Value Iteration]]
- [[Policy Iteration]]
- [[Combinatorial Optimization]]
- [[Hidden Markov Model]]