## 1. Paper Identity
### Charlie Chen, Sebastian Borgeaud, Geoffrey Irving, Jean-Baptiste Lespiau, Laurent Sifre, John Jumper
### DeepMind technical report, 2023
### *Accelerating Large Language Model Decoding with Speculative Sampling*

## 2. Core Contribution
### Independent, concurrent formulation of speculative decoding (here called *speculative sampling*) developed at DeepMind
### Same core idea as Leviathan et al.: a draft model proposes, the target model verifies in parallel, a rejection-sampling step preserves the exact target distribution
### Demonstrates the technique at frontier scale: 2×–2.5× speedup on Chinchilla 70B in a distributed serving setup

## 3. Method
### A faster draft model produces a short continuation of length K
### The target model evaluates all K+1 positions in a single parallel forward call
### *Modified rejection sampling*: at each position, accept the draft token with probability min(1, p_target / p_draft); on rejection, sample from the residual distribution (p_target − p_draft)_+ normalised
### Mathematically preserves the target distribution within hardware numerics

## 4. Key Results
### 2×–2.5× decoding speedup on Chinchilla 70B
### No degradation in downstream task quality or sample distribution
### Demonstrates that the technique scales to production-sized models on distributed hardware

## 5. Lineage / Why It Matters
### Together with Leviathan et al. 2023, the canonical citation for the technique; the two papers are typically cited as a pair
### Established that draft-then-verify is robust at 70B-class scale, not just on encoder-decoder T5
### Foundation for later work on draft-model training, tree-based drafts (SpecInfer), and self-drafting (Medusa, EAGLE)

## 6. Limitations
### Acceptance rate depends on the alignment between draft and target; poor pairings yield little speedup
### Draft model must share the target's tokenizer
### Memory overhead from co-hosting two models is non-trivial at distributed scale

## 7. Source
- https://arxiv.org/abs/2302.01318
- Accessed: 2026-05-23
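
The modified rejection sampling step summarised in the Method section can be illustrated with a minimal sketch. It assumes toy categorical distributions given as plain Python lists over a small vocabulary; the function name `speculative_step` and its arguments are hypothetical, chosen for illustration, and are not taken from the paper's code. The sketch covers only the accept/reject logic, not the draft or target models themselves.

```python
import random


def speculative_step(draft_tokens, q_dists, p_dists, rng):
    """One draft-then-verify step over toy categorical distributions.

    draft_tokens: K token ids proposed by the draft model.
    q_dists:      K draft distributions; draft_tokens[i] was sampled from q_dists[i].
    p_dists:      K+1 target distributions from one parallel target pass.
    rng:          a random.Random instance.
    Returns between 1 and K+1 tokens.
    """
    out = []
    for i, tok in enumerate(draft_tokens):
        p, q = p_dists[i], q_dists[i]
        # Accept the draft token with probability min(1, p_target / p_draft).
        if rng.random() < min(1.0, p[tok] / q[tok]):
            out.append(tok)
            continue
        # Rejected: resample from the residual (p - q)_+.
        # rng.choices normalises the weights itself.
        residual = [max(pi - qi, 0.0) for pi, qi in zip(p, q)]
        out.append(rng.choices(range(len(p)), weights=residual)[0])
        return out  # later draft tokens were conditioned on the rejected one
    # Every draft token accepted: take one extra token from the
    # target's distribution at position K+1.
    last = p_dists[-1]
    out.append(rng.choices(range(len(last)), weights=last)[0])
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    p = [0.6, 0.3, 0.1]  # toy target distribution (assumed values)
    q = [0.2, 0.5, 0.3]  # toy draft distribution (assumed values)
    draft = rng.choices(range(3), weights=q)[0]
    print(speculative_step([draft], [q], [p, p], rng))
```

When draft and target agree at every position, one target pass yields K+1 tokens; the step stops at the first rejection, so a poorly aligned draft model yields close to one token per pass.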