## Definition
The **AI compute crunch** is the structural shortfall in GPU and AI-accelerator supply relative to inference demand that defined the AI economy from late 2024 onward. It is the supply-side cause of the 2026 pricing rebase: with capacity scarce, providers can no longer subsidise unmetered access, and prices rise to reflect the true marginal cost of serving each token.
## What Constitutes the Crunch
Three components compound:
1. **Hardware lead times** — frontier accelerators (NVIDIA HGX, H200, Blackwell-class parts) carry 6–18 month order-to-delivery windows; capacity additions trail demand by a year or more.
2. **Power and cooling** — even when chips ship, the datacentre power envelope to host them is a separate constraint; new builds are gated on grid interconnects.
3. **Inference demand mix** — by 2026 inference workloads dwarf training in aggregate compute, and inference is bursty, latency-sensitive, and harder to schedule than training.
## How It Manifests in Pricing
- Across-the-board per-token price increases on flagship models (OpenAI roughly doubled GPT-5.5 pricing relative to GPT-4-class predecessors).
- The rise of capacity-priced premium tiers — paying a multiplier to skip the queue (e.g. Anthropic Fast Mode at 6× standard rates).
- End of unmetered subscriptions for heavy use; subscription plans now serve as a price-anchored entry point with strict caps, with [[Consumption-Based Pricing]] beyond them.
- Rate-limit tightening on consumer tiers and the partitioning of capacity to enterprise contracts.
## The Supply-Side Response
The hardware ecosystem is responding on three fronts: custom inference silicon (TPU v6, Anthropic's own accelerators via in-house deals, Trainium 2), lower-precision arithmetic (FP4, FP6 in production), and architectural efficiency (MoE, speculative decoding). The Register's framing — "those selling the shovels of the AI boom are now racing to bring new hardware better suited to serving these models" — captures the dynamic.
The relief will arrive, but unevenly. Enterprise customers with volume contracts negotiate ahead of the spot prices; consumers absorb the increases first.
## Related
- [[Consumption-Based Pricing]]
- [[Per-Token Pricing]]
- [[Agentic Workload Cost Explosion]]
## Sources
- [[AI is Getting Expensive (The Register)]]
- [[Anthropic 2026 Pricing Shift (Kingy AI)]]