Beam Search - Albert Masoliver's learning site

## Definition **Beam search** is a memory-bounded heuristic search algorithm. At each level it keeps only the **best $k$ nodes** (the "beam width") ordered by a heuristic, discarding the rest. A controlled compromise between breadth-first exploration and greedy depth-pursuit. ## Algorithm ``` beam ← {initial_state} while beam non-empty: candidates ← all successors of nodes in beam if any candidate is a goal: return path beam ← top-k candidates by h(n) return failure (beam exhausted without goal) ``` The hyperparameter $k$ controls the trade-off: larger $k$ approaches BFS quality; $k = 1$ degenerates to [[Greedy Best-First Search]]. ## Properties - **Complete:** no — discarded candidates may have been on the only path to the goal. - **Optimal:** no. - **Space:** $O(k)$ per level — linear in beam width. - **Time:** $O(k \cdot d)$ for a goal at depth $d$. ## Uses Outside Classical Search Beam search is the dominant *decoding strategy* in older sequence-to-sequence neural models (machine translation, speech recognition). For each output token, the model maintains the top-$k$ partial sequences by joint log-probability. Modern LLMs largely abandoned beam search for open-ended generation (it produces bland, repetitive output) — see [[Sampling]] for the alternatives that displaced it. ## When to Use - Memory constraints rule out A* or BFS. - Approximation is acceptable. - Combinatorial search spaces with smooth heuristic landscapes. ## Tuning - Small $k$ (2-5): fast but brittle. - Medium $k$ (10-50): standard for translation. - Large $k$ (100+): diminishing returns; often a sign you should use a better algorithm. ## Related - [[Greedy Best-First Search]] - [[Heuristic Search]] - [[Sampling]]