Evaluation Function - Albert Masoliver's learning site

## Definition An **evaluation function** $f(s)$ estimates the value of a non-terminal game state — used by [[Minimax Algorithm]] and [[Alpha-Beta Pruning]] when the search cannot reach terminal states within the time budget. The bridge between exact game-tree reasoning and practical play. ## Why Needed Full game trees are typically too large. The minimax search cuts off at depth $d$ and uses $f(s)$ to estimate utility at the leaves. The quality of $f$ largely determines the playing strength of the engine. ## Properties of Good Evaluation Functions 1. **Correlation with true value.** Better positions yield higher scores. Ideally monotonic in true minimax value. 2. **Computational efficiency.** Called millions of times per second in tournament engines; must be cheap. 3. **Smooth.** Small changes in position cause small changes in score; avoids alpha-beta search instability. 4. **Side-symmetric.** $f(\text{state from white's view}) = -f(\text{state from black's view})$. ## Classical Chess Evaluation A traditional engine sums weighted features: $ f(s) = \sum_i w_i \cdot f_i(s) $ Common features: - **Material count.** Pawn = 1, Knight = 3, Bishop = 3.25, Rook = 5, Queen = 9. The dominant term. - **Piece-square tables.** Position of each piece on the board (knights are stronger in the centre). - **Pawn structure.** Doubled, isolated, passed pawns. - **King safety.** Pawn shield, attacker count. - **Mobility.** Number of legal moves. - **Tempo.** Whose move it is. Weights were historically hand-tuned by grandmasters; later automatically tuned via self-play and gradient descent on outcomes. ## Neural Network Evaluation (NNUE and Beyond) Modern chess engines (Stockfish 12+, Komodo Dragon) replaced the hand-crafted evaluation with an **NNUE** — Efficiently Updatable Neural Network. A small MLP trained on millions of self-played games. Two key properties: 1. **Incremental update.** When a piece moves, only the affected inputs change — most network state can be recomputed cheaply. 2. **Quantised arithmetic** for CPU speed. The NNUE jump was ~80–120 Elo points — equivalent to two decades of hand-tuning improvement. ## AlphaZero-Style Evaluation In AlphaZero and successors, a deep convolutional network produces: - A **value head** estimating winning probability. - A **policy head** estimating the distribution of best moves. These replace both the evaluation function and rollouts in [[Monte Carlo Tree Search]]. ## Domain-Specific Examples - **Go:** territory estimation, influence functions, life-and-death evaluation. - **Backgammon:** TD-Gammon (Tesauro 1992) trained a network via temporal-difference learning; one of the first major neural-network successes in any domain. - **Poker:** more about belief states than positions; "evaluation" is over information sets. ## Related - [[Minimax Algorithm]] - [[Alpha-Beta Pruning]] - [[Monte Carlo Tree Search]]