## Definition
**Generative AI** is the family of machine-learning systems that *produce* new content — text, images, audio, video, code, 3D — rather than classifying, ranking, or extracting from existing inputs. The defining property is that the output is itself a high-dimensional artifact of the same kind as the training data (a passage, an image, a waveform), not a label, score, or span selected from the input.
## Major Modalities (2026)
- **Text.** LLMs — see [[Large Language Model]].
- **Image.** Diffusion models — see [[Diffusion Model]]. Examples: Stable Diffusion, Imagen 3, FLUX, Midjourney.
- **Audio.** Speech synthesis (ElevenLabs, OpenAI TTS), music (Suno, Udio), sound effects.
- **Video.** Sora 2, Veo 3, Runway Gen-4 — diffusion + transformer hybrids.
- **Code.** A specialisation of text; same architecture, different training mix.
- **Multimodal.** Models that mix several modalities natively — see [[Multimodal Model]].
- **3D and scene.** NeRFs, Gaussian splats, 3D-aware diffusion.
## Common Architectural Families
- **Transformers** — dominate text, code, and increasingly other modalities.
- **Diffusion models** — dominate image and video generation and are widely used for audio, often with a transformer as the denoising network.
- **Autoregressive image models** — older approach, mostly displaced by diffusion.
- **GANs** — historically central for images; now mostly niche.
- **Latent diffusion** — a variant of diffusion that runs the denoising process in a compressed latent space learned by an autoencoder; the practical workhorse for high-res image generation.
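
To give a rough sense of why working in a latent space helps, the sketch below counts how many values a denoiser has to process per step in pixel space versus latent space. The defaults (8x spatial downsampling, 4 latent channels) are an assumption modelled on the original Stable Diffusion autoencoder; other models use different settings, and the function names are illustrative only.

```python
def latent_shape(height, width, downsample=8, latent_channels=4):
    """Shape (channels, height, width) of the latent for a given image size.

    Assumes an autoencoder that shrinks each spatial side by `downsample`.
    """
    if height % downsample or width % downsample:
        raise ValueError("image sides must be divisible by the downsample factor")
    return (latent_channels, height // downsample, width // downsample)


def compression_ratio(height, width, image_channels=3,
                      downsample=8, latent_channels=4):
    """How many times fewer values the latent holds than the RGB image."""
    c, h, w = latent_shape(height, width, downsample, latent_channels)
    pixel_values = height * width * image_channels
    latent_values = c * h * w
    return pixel_values / latent_values


if __name__ == "__main__":
    print(latent_shape(1024, 1024))
    print(compression_ratio(1024, 1024))
```

Under these assumptions the ratio does not depend on resolution: it is `image_channels * downsample**2 / latent_channels`, so every denoising step touches far fewer values than it would in pixel space.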
## What Makes It "Generative"
Two related properties:
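1. **Sampling from a learned distribution** — the model draws outputs from a probability distribution fitted to its training data, so the same prompt can yield different results; see [[Sampling]].
2. **Compositional output** — the model builds its output incrementally, token by token in autoregressive models or through successive denoising steps in diffusion models, so each output is in principle novel rather than retrieved.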
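
Both properties can be seen in a minimal sketch of autoregressive sampling. The "model" here is a hypothetical hand-written table of next-token logits, not a trained network; `VOCAB`, `BIGRAM_LOGITS`, and the function names are assumptions made for illustration.

```python
import math
import random

# Hypothetical toy "model": next-token logits conditioned on the previous token.
VOCAB = ["the", "cat", "sat", "<eos>"]
BIGRAM_LOGITS = {
    "<bos>": [2.0, 0.5, 0.1, -1.0],
    "the": [-1.0, 2.0, 0.5, -1.0],
    "cat": [-1.0, -1.0, 2.0, 0.5],
    "sat": [0.5, -1.0, -1.0, 2.0],
}


def sample_next(logits, temperature=1.0, rng=random):
    """Draw one index from softmax(logits / temperature); temperature must be > 0."""
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - top) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def generate(max_tokens=10, temperature=1.0, seed=None):
    """Build a sequence one token at a time until <eos> or the length limit."""
    rng = random.Random(seed)
    tokens = []
    prev = "<bos>"
    for _ in range(max_tokens):
        token = VOCAB[sample_next(BIGRAM_LOGITS[prev], temperature, rng)]
        if token == "<eos>":
            break
        tokens.append(token)
        prev = token
    return tokens


if __name__ == "__main__":
    for seed in range(3):
        print(generate(temperature=1.0, seed=seed))
```

Raising the temperature flattens the distribution and makes outputs more varied; at a very low temperature this toy model almost always emits `the cat sat`, the highest-probability path.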
## Why It Reshaped Software (2022+)
The cost of producing a *first draft* of almost any creative or technical artifact dropped dramatically. The bottleneck shifted from generation to:
- **Specification** — knowing what to ask for (see [[Spec-Driven Development]]).
- **Verification** — knowing whether what was produced is correct.
- **Curation** — choosing among many candidate outputs.
These three activities now make up most of the human work — see [[Orchestrator Role]].
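
The specify / verify / curate loop can be sketched as a small best-of-n selection. This is only an illustration under stated assumptions: the specification is a list of input/output examples, the candidates are hand-written functions standing in for model outputs, and `cost` is a hypothetical ranking signal (for example length or latency).

```python
from typing import Callable, NamedTuple, Optional, Sequence, Tuple

# Specification: (input, expected output) pairs the artifact must satisfy.
Spec = Sequence[Tuple[int, int]]


class Candidate(NamedTuple):
    label: str
    fn: Callable[[int], int]
    cost: int  # hypothetical ranking signal; lower is preferred


def verify(candidate: Candidate, spec: Spec) -> bool:
    """Verification: the candidate must agree with every example in the spec."""
    try:
        return all(candidate.fn(x) == expected for x, expected in spec)
    except Exception:
        return False  # a candidate that crashes does not pass


def curate(candidates: Sequence[Candidate], spec: Spec) -> Optional[Candidate]:
    """Curation: among verified candidates, keep the one with the lowest cost."""
    passing = [c for c in candidates if verify(c, spec)]
    return min(passing, key=lambda c: c.cost) if passing else None


# Specification for "square an integer".
SPEC = [(0, 0), (2, 4), (-3, 9)]

# Stand-ins for several generated drafts of the same function.
CANDIDATES = [
    Candidate("double", lambda x: 2 * x, cost=1),      # wrong for x = -3
    Candidate("crash", lambda x: x // 0, cost=1),      # raises ZeroDivisionError
    Candidate("square_pow", lambda x: x ** 2, cost=2),
    Candidate("square_mul", lambda x: x * x, cost=1),
]

if __name__ == "__main__":
    best = curate(CANDIDATES, SPEC)
    print(best.label if best else "no candidate passed")
```

Note that `double` passes two of the three examples; a thinner specification would have let it through, which is why the quality of the spec bounds the quality of the verification.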
## Related
- [[Large Language Model]]
- [[Diffusion Model]]
- [[Multimodal Model]]
- [[Foundation Model]]