## Definition
A **Markov network** (or Markov Random Field, MRF) is an *undirected* probabilistic graphical model. Its joint distribution factorises into non-negative functions (potentials) over the cliques of an undirected graph. It is the undirected sibling of the [[Bayesian Network]].
## Why Undirected
Many domains have symmetric, non-causal dependencies:
- **Pixels in an image.** Neighbouring pixels correlate; no direction.
- **Spins in a magnet** (Ising model).
- **Words in a text.** Co-occurrence is symmetric.
Directed graphs force a parent → child orientation that doesn't naturally fit.
## Factorisation
The joint distribution is:
$$
P(X_1, \dots, X_n) = \frac{1}{Z} \prod_{c \in \mathcal{C}} \phi_c(X_c)
$$
- $\mathcal{C}$ — set of cliques in the graph.
- $\phi_c$ — non-negative **potential function** over the variables in clique $c$.
- $Z$ — **partition function**, the normalising constant $Z = \sum_{x} \prod_c \phi_c(x_c)$.
Potentials need not be probabilities; only the normalised product is.
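A minimal sketch of this factorisation, assuming a three-variable chain A - B - C with binary variables. The potential tables `phi_ab` and `phi_bc` are made-up illustrative values, not taken from any real model. It computes $Z$ by brute-force enumeration, which is only feasible for tiny graphs.

```python
from itertools import product

# Hypothetical pairwise potentials on the chain A - B - C.
# Each maps a pair of binary values to a non-negative score (not a probability).
phi_ab = {(0, 0): 3.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0}
phi_bc = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 4.0}

def unnormalised(a, b, c):
    """Product of the clique potentials for one full assignment."""
    return phi_ab[(a, b)] * phi_bc[(b, c)]

# Partition function: sum of the unnormalised score over all 2^3 assignments.
Z = sum(unnormalised(a, b, c) for a, b, c in product((0, 1), repeat=3))

def joint(a, b, c):
    """Normalised joint probability P(A=a, B=b, C=c)."""
    return unnormalised(a, b, c) / Z

# Sanity check: the normalised product sums to 1 even though the potentials do not.
total = sum(joint(a, b, c) for a, b, c in product((0, 1), repeat=3))
print(Z, joint(1, 1, 1), total)
```

The individual tables sum to 8.0 each, so neither is a distribution on its own; only division by `Z` yields probabilities.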
## Local Markov Property
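A variable is conditionally independent of all other variables given its immediate neighbours in the graph (its Markov blanket). More generally, two sets of variables are conditionally independent given any third set that separates them in the graph (the global Markov property). This graph separation is the undirected analogue of [[D-Separation]] for directed models.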
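The local Markov property can be checked numerically on a small example. The sketch below assumes a binary chain A - B - C with hypothetical potential tables, and verifies that once B (the only neighbour of A) is known, the value of C does not change the conditional distribution of A.

```python
# Hypothetical pairwise potentials on the chain A - B - C (illustrative values).
phi_ab = {(0, 0): 3.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0}
phi_bc = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 4.0}

def score(a, b, c):
    """Unnormalised joint score for one assignment."""
    return phi_ab[(a, b)] * phi_bc[(b, c)]

def cond_a_given_bc(a, b, c):
    """P(A=a | B=b, C=c). Z cancels, so unnormalised scores suffice."""
    return score(a, b, c) / sum(score(a2, b, c) for a2 in (0, 1))

# For each value of B, compare P(A=1 | B, C=0) with P(A=1 | B, C=1).
results = {}
for b in (0, 1):
    results[b] = (cond_a_given_bc(1, b, 0), cond_a_given_bc(1, b, 1))
    print(f"B={b}: P(A=1|B,C=0)={results[b][0]}, P(A=1|B,C=1)={results[b][1]}")
```

The two conditionals agree for each value of B because the factor `phi_bc` appears in both numerator and denominator and cancels.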
## Difference from Bayesian Networks
| Property | Bayesian Network | Markov Network |
| ----------------------- | ----------------------- | ---------------------------- |
| Edge direction | Directed (DAG) | Undirected |
| Local factors | Conditional probabilities (sum to 1) | Arbitrary non-negative potentials |
| Normalisation | Implicit (product is a joint) | Explicit (partition function $Z$) |
| Independence reading | D-separation | Graph separation |
| Natural for | Causal / generative models | Symmetric correlations |
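Neither family subsumes the other. A v-structure $A \to C \leftarrow B$ has no undirected graph that encodes exactly its independencies, and a four-cycle $A - B - C - D - A$ has no DAG that does.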
## Conditional Random Fields (CRFs)
A **CRF** is a Markov network *conditioned on* observed inputs $X$. Used for structured prediction:
$$
P(Y \mid X) = \frac{1}{Z(X)} \exp\left(\sum_k w_k \, f_k(X, Y)\right)
$$
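Here $f_k$ are feature functions, $w_k$ are their learned weights, and $Z(X) = \sum_{Y'} \exp\left(\sum_k w_k \, f_k(X, Y')\right)$ normalises over label configurations only, not over inputs.

CRFs were the standard model for named-entity recognition, part-of-speech tagging, and image segmentation pre-2015. They are often combined with neural networks (e.g. BiLSTM-CRF for sequence labelling).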
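The log-linear form above can be sketched for a tiny linear-chain tagger. Everything here is hypothetical: the tag set `LABELS`, the three feature functions, and the hand-set `weights` (a real CRF learns them from data). $Z(X)$ is computed by enumerating all label sequences, which only works for very short inputs; practical implementations use dynamic programming instead.

```python
import math
from itertools import product

LABELS = ("N", "V")  # hypothetical tag set: noun, verb

# Hypothetical hand-set weights w_k; a trained CRF would learn these.
weights = {"ends_in_s_and_V": 1.0, "capitalised_and_N": 2.0, "N_then_V": 1.5}

def features(x, y):
    """Feature counts f_k(X, Y) for word sequence x and tag sequence y."""
    f = {name: 0 for name in weights}
    for word, tag in zip(x, y):
        if word.endswith("s") and tag == "V":
            f["ends_in_s_and_V"] += 1
        if word[0].isupper() and tag == "N":
            f["capitalised_and_N"] += 1
    for prev, cur in zip(y, y[1:]):
        if prev == "N" and cur == "V":
            f["N_then_V"] += 1
    return f

def score(x, y):
    """Weighted feature sum: sum_k w_k * f_k(X, Y)."""
    return sum(weights[k] * v for k, v in features(x, y).items())

def crf_distribution(x):
    """P(Y | X) for every label sequence, normalised by Z(X)."""
    ys = list(product(LABELS, repeat=len(x)))
    exp_scores = [math.exp(score(x, y)) for y in ys]
    z_x = sum(exp_scores)  # Z(X): sums over label sequences only
    return {y: s / z_x for y, s in zip(ys, exp_scores)}

x = ("Alice", "runs")
dist = crf_distribution(x)
best = max(dist, key=dist.get)
print(best, dist[best])
```

Note that `z_x` depends on the input `x`: the model never has to assign probabilities to word sequences, only to taggings of a given one.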
## Inference
Markov networks use the same algorithm families as Bayesian networks: [[Variable Elimination]], junction tree, belief propagation, MCMC, and variational methods. Computing the partition function $Z$ is typically the hard part, because it sums over exponentially many joint assignments.
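To illustrate why the choice of algorithm matters for $Z$, the sketch below compares brute-force enumeration with variable elimination on a chain of binary variables. The shared potential table `phi` and the chain length are assumptions made for illustration. Enumeration visits $2^n$ assignments, while eliminating variables from one end of the chain needs only $n - 1$ small sums.

```python
from itertools import product

# Hypothetical chain X0 - X1 - ... - X(n-1) of binary variables.
# Every edge shares the same pairwise potential table for simplicity.
phi = [[3.0, 1.0], [1.0, 3.0]]

def z_brute_force(n):
    """Sum the product of edge potentials over all 2^n assignments."""
    total = 0.0
    for x in product((0, 1), repeat=n):
        weight = 1.0
        for i in range(n - 1):
            weight *= phi[x[i]][x[i + 1]]
        total += weight
    return total

def z_variable_elimination(n):
    """Eliminate X0, X1, ... in order, passing a 2-entry message along the chain."""
    message = [1.0, 1.0]  # message over the current variable's values
    for _ in range(n - 1):
        # Sum out the current variable a, leaving a function of its neighbour b.
        message = [sum(message[a] * phi[a][b] for a in (0, 1)) for b in (0, 1)]
    return sum(message)

n = 10
print(z_brute_force(n), z_variable_elimination(n))
```

On a chain the elimination order is obvious; on graphs with loops the intermediate messages can grow exponentially in the treewidth, which is where approximate methods such as MCMC and variational inference come in.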
## The Ising Model
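The original MRF, introduced in statistical physics in the 1920s. Each variable $X_i \in \{-1, +1\}$ is a spin; a positive coupling $J_{ij} > 0$ encodes "neighbours prefer to align", and $h_i$ is an external field that biases spin $i$:
$$
P(X) \propto \exp\left(\sum_{(i,j) \in E} J_{ij} X_i X_j + \sum_i h_i X_i\right)
$$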
A physics-derived model that anticipated graphical models by decades.
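A minimal Gibbs-sampling sketch for the Ising distribution above, assuming a square grid with free boundaries and a single coupling `J` and field `h` shared by all sites (both simplifications, chosen for illustration). Conditioning on its neighbours, a spin is $+1$ with probability $1 / (1 + e^{-2 s_i})$, where $s_i = \sum_j J_{ij} X_j + h_i$ is the local field; this follows from dividing $e^{s_i}$ by $e^{s_i} + e^{-s_i}$.

```python
import math
import random

def p_up(field):
    """P(X_i = +1 | neighbours) given the local field sum_j J_ij x_j + h_i."""
    return 1.0 / (1.0 + math.exp(-2.0 * field))

def gibbs_ising(size, J, h, sweeps, seed=0):
    """Run Gibbs sweeps on a size x size grid; returns the final spin grid."""
    rng = random.Random(seed)
    spins = [[rng.choice((-1, 1)) for _ in range(size)] for _ in range(size)]
    for _ in range(sweeps):
        for r in range(size):
            for c in range(size):
                # Sum of the (up to four) neighbouring spins; free boundaries.
                neighbours = 0
                if r > 0:
                    neighbours += spins[r - 1][c]
                if r < size - 1:
                    neighbours += spins[r + 1][c]
                if c > 0:
                    neighbours += spins[r][c - 1]
                if c < size - 1:
                    neighbours += spins[r][c + 1]
                field = J * neighbours + h
                # Resample this spin from its conditional distribution.
                spins[r][c] = 1 if rng.random() < p_up(field) else -1
    return spins

spins = gibbs_ising(size=8, J=0.5, h=0.1, sweeps=50)
magnetisation = sum(sum(row) for row in spins) / (8 * 8)
print(magnetisation)
```

Each update only looks at a spin's neighbours, which is the local Markov property in action: the rest of the grid is irrelevant once the neighbours are fixed.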
## Modern Relevance
- Image denoising, segmentation, super-resolution (pre-CNN era).
- Markov Logic Networks attach weights to first-order logic (FOL) formulas, which together define an MRF over the ground atoms.
- Energy-based models in deep learning can be viewed as MRFs whose potentials are parameterised by neural networks.
## Related
- [[Bayesian Network]]
- [[Hidden Markov Model]]
- [[D-Separation]]
- [[Variable Elimination]]