## 1. Paper Identity
### Charlie Chen, Sebastian Borgeaud, Geoffrey Irving, Jean-Baptiste Lespiau, Laurent Sifre, John Jumper
### DeepMind technical report, 2023
### *Accelerating Large Language Model Decoding with Speculative Sampling*
## 2. Core Contribution
### Independent, concurrent formulation of speculative decoding (here called *speculative sampling*) developed at DeepMind
### Same core idea as Leviathan et al.: a draft model proposes tokens, the target model verifies them in parallel, and a rejection-sampling step preserves the exact target distribution
### Demonstrates the technique at frontier scale: 2×–2.5× speedup on Chinchilla 70B in a distributed serving setup
## 3. Method
### A faster draft model produces a short continuation of length K
### The target model evaluates all K+1 positions in a single parallel forward call
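### *Modified rejection sampling*: at each position, accept the draft token x with probability min(1, p_target(x) / p_draft(x)); on the first rejection, resample that position from the normalised residual distribution (p_target − p_draft)_+ and discard the remaining draft tokens; if all K draft tokens are accepted, sample one extra token from the target's distribution at position K+1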
### Provably preserves the target distribution: each emitted token is distributed exactly as if sampled from the target model alone, up to hardware numerics
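The draft, verify, and accept/resample steps above can be sketched on a toy vocabulary. This is a minimal illustration, not the report's implementation: the names `speculative_step`, `draft_fn`, `target_fn`, `residual`, and `sample` are hypothetical, models are stand-in functions that return a probability vector for a given context, and the target is called once per position in a loop where a real system would score all K+1 positions in one batched forward pass.

```python
import random
from typing import Callable, List, Sequence

# Hypothetical model interface: context (token ids) -> probabilities over the vocabulary.
ModelFn = Callable[[List[int]], Sequence[float]]


def sample(dist: Sequence[float], rng: random.Random) -> int:
    """Draw one token id from a probability vector."""
    return rng.choices(range(len(dist)), weights=dist, k=1)[0]


def residual(p_target: Sequence[float], p_draft: Sequence[float]) -> List[float]:
    """Normalised (p_target - p_draft)_+, used to resample after a rejection."""
    diff = [max(t - d, 0.0) for t, d in zip(p_target, p_draft)]
    total = sum(diff)
    # total is zero only if the two distributions coincide, in which case a
    # rejection cannot happen; fall back to the target as a safeguard.
    return [x / total for x in diff] if total > 0 else list(p_target)


def speculative_step(
    prefix: List[int], draft_fn: ModelFn, target_fn: ModelFn, k: int, rng: random.Random
) -> List[int]:
    """Run one draft-then-verify round and return between 1 and k+1 new tokens."""
    # 1. The draft model proposes k tokens autoregressively.
    drafted: List[int] = []
    draft_dists: List[Sequence[float]] = []
    ctx = list(prefix)
    for _ in range(k):
        dist = draft_fn(ctx)
        tok = sample(dist, rng)
        drafted.append(tok)
        draft_dists.append(dist)
        ctx.append(tok)

    # 2. The target scores all k+1 positions (looped here; batched in practice).
    target_dists = [target_fn(list(prefix) + drafted[:i]) for i in range(k + 1)]

    # 3. Accept each draft token with probability min(1, p_target / p_draft).
    out: List[int] = []
    for i, tok in enumerate(drafted):
        ratio = target_dists[i][tok] / draft_dists[i][tok]
        if rng.random() < min(1.0, ratio):
            out.append(tok)
        else:
            # First rejection: resample from the residual and drop the rest of the draft.
            out.append(sample(residual(target_dists[i], draft_dists[i]), rng))
            return out

    # 4. Every draft token was accepted: take one extra token from the target.
    out.append(sample(target_dists[k], rng))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    target = [0.6, 0.3, 0.1]  # toy, context-independent distributions
    draft = [0.3, 0.3, 0.4]
    counts = [0, 0, 0]
    trials = 20000
    for _ in range(trials):
        counts[speculative_step([], lambda c: draft, lambda c: target, 3, rng)[0]] += 1
    print([round(c / trials, 3) for c in counts])  # should be close to `target`
```

The demo checks the property that matters: although the first token is usually proposed by the draft, its empirical frequencies track the target's probabilities, because the accepted mass min(p_target, p_draft) plus the residual mass (p_target − p_draft)_+ sums to p_target for every token.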
## 4. Key Results
### 2×–2.5× decoding speedup on Chinchilla 70B
### No degradation in downstream task quality or sample distribution
### Demonstrates that the technique scales to production-sized models on distributed hardware
## 5. Lineage / Why It Matters
### Together with Leviathan et al. 2023, the canonical citation for the technique; the two papers are typically cited as a pair
### Established that draft-then-verify is robust at 70B-class scale, not just on encoder-decoder T5
### Foundation for later work on draft-model training, tree-based drafts (SpecInfer), and self-drafting (Medusa, EAGLE)
## 6. Limitations
### Speedup hinges on the acceptance rate, which depends on how closely the draft distribution matches the target; poorly matched pairings yield little or no speedup
### Draft model must share the target's tokenizer
### Memory overhead from co-hosting two models is non-trivial at distributed scale
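To make the acceptance-rate limitation concrete, here is a back-of-the-envelope sketch. It assumes each draft token is accepted independently with a fixed probability `alpha`, which is a simplification introduced here for illustration and not a model taken from the report; `expected_tokens_per_step` is a hypothetical helper. Under that assumption a round with K draft tokens yields 1 + alpha + ... + alpha^K tokens on average.

```python
def expected_tokens_per_step(alpha: float, k: int) -> float:
    """Expected tokens emitted per target forward pass.

    Assumes (for illustration only) an i.i.d. per-token acceptance
    probability `alpha` and `k` draft tokens per round.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    if alpha == 1.0:
        return float(k + 1)  # every draft token accepted, plus the extra target token
    # Closed form of the geometric sum 1 + alpha + ... + alpha**k.
    return (1.0 - alpha ** (k + 1)) / (1.0 - alpha)


if __name__ == "__main__":
    for alpha in (0.3, 0.6, 0.9):
        print(alpha, round(expected_tokens_per_step(alpha, k=4), 2))
```

This counts tokens per target call only. It ignores the draft model's own latency and any communication cost, so the wall-clock speedup is lower than this figure, and a poorly matched draft (small `alpha`) leaves the value close to 1.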
## 7. Source
- https://arxiv.org/abs/2302.01318
- Accessed: 2026-05-23