## Definition

**Fine-tuning** is the process of further training a pretrained LLM on a smaller, more targeted dataset to adapt it to a specific task, style, or alignment objective. It is vastly cheaper than pretraining and correspondingly less transformative.

## The Common Variants

### Supervised Fine-Tuning (SFT) / Instruction Tuning

Train on (instruction, ideal-response) pairs. The model learns to follow instructions and produce helpful answers in the desired format. This is the *first* post-pretraining step for almost every chat assistant.

### Preference Fine-Tuning

Includes [[RLHF]] (Reinforcement Learning from Human Feedback) and DPO (Direct Preference Optimisation). Train on (prompt, preferred response, dispreferred response) triples. The model learns *which* of two responses is better, a subtler signal than "produce this exact output."

### Parameter-Efficient Fine-Tuning (PEFT)

- **LoRA**: train small low-rank update matrices on top of frozen weights.
- **QLoRA**: quantise the base model to 4-bit; train LoRA adapters in 16-bit.
- **Adapters**: small trainable bottleneck modules inserted between layers.

PEFT, combined with quantisation as in QLoRA, can bring fine-tuning of models in the 70B range within reach of a single high-memory GPU. It also lets you ship many adapters per base model.

## When to Fine-Tune vs Prompt

| Situation                                    | Fine-tune?                         |
| -------------------------------------------- | ---------------------------------- |
| Domain vocabulary the model doesn't know     | Maybe; RAG is often cheaper        |
| Specific output format / tone                | Yes                                |
| Hard task the frontier model can't do        | Probably not (model isn't capable) |
| Cost reduction (smaller model, same quality) | Yes                                |

Usually try **prompting → RAG → fine-tuning** in that order. Fine-tuning often looks like the obvious first move, but it rarely is.

## Catastrophic Forgetting

Fine-tuning on a narrow distribution can degrade general capabilities. Mitigations:

- mix general data into the fine-tuning corpus;
- use lower learning rates;
- prefer PEFT methods that don't touch base weights.

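
The last mitigation above relies on a property of LoRA-style PEFT: the base weights stay frozen and only a small low-rank update is trained. The sketch below illustrates that idea in plain Python. The class name `LoRALinear`, the helper `matvec`, the toy 4x4 weight matrix, and the `alpha` default are illustrative assumptions, not any particular library's API, and the gradient-based training of `A` and `B` is omitted.

```python
import random


def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


class LoRALinear:
    """Toy linear layer: y = W x + (alpha / rank) * B (A x), with W frozen."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight          # frozen base weights, never updated
        self.rank = rank
        self.scale = alpha / rank
        # A (rank x d_in) starts as small random values.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)]
                  for _ in range(rank)]
        # B (d_out x rank) starts at zero, so the update B @ A is zero at first.
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        """Number of values in A and B, the only parameters that would train."""
        d_out, d_in = len(self.weight), len(self.weight[0])
        return self.rank * (d_in + d_out)


# Hypothetical 4x4 base layer and input, for illustration only.
weight = [
    [1.0, 2.0, 3.0, 4.0],
    [0.0, 1.0, 0.0, 1.0],
    [2.0, 0.0, 2.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
]
layer = LoRALinear(weight, rank=1)
x = [1.0, 0.0, -1.0, 2.0]

full_params = len(weight) * len(weight[0])
print("full layer parameters:", full_params)
print("LoRA trainable parameters:", layer.trainable_params())
print("output matches base layer before training:",
      layer.forward(x) == matvec(weight, x))
```

Because `B` starts at zero, the adapted layer reproduces the base layer exactly before any training. Since `weight` is never modified, dropping the adapter recovers the original behaviour, and several `(A, B)` pairs can share one base model.
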
## Related

- [[Pretraining]]
- [[RLHF]]
- [[Constitutional AI]]
- [[In-Context Learning]]
- [[Large Language Model]]