## Definition
**AI engineering** is the discipline of building applications on top of readily available [[Foundation Model]]s, as opposed to training models from scratch. Chip Huyen (2024) characterises it as one of the fastest-growing engineering disciplines, enabled by three converging factors: (1) the general-purpose capabilities of foundation models, (2) a sharp increase in AI investment post-ChatGPT, and (3) a low barrier to entry via model-as-a-service APIs.
The defining shift: where traditional ML engineering *produces* models, AI engineering *consumes* them and differentiates through adaptation and evaluation.
## How It Differs from ML Engineering
Huyen identifies several structural differences, summarised below:
| Dimension | Traditional ML Engineering | AI Engineering |
|---|---|---|
| Model origin | Built from scratch | Pre-trained, sourced via API |
| Core focus | Modelling and training | Model adaptation and evaluation |
| Output type | Closed-ended (fixed classes) | Open-ended (free text) |
| Evaluation difficulty | Ground-truth comparison | Requires richer rubrics |
| Compute concern | Inference optimisation matters | Matters even more, because models are larger and costlier to serve |
Because outputs are open-ended, evaluation is *harder*, not easier: a chatbot response has no single ground truth to compare against. This makes evaluation a first-class engineering problem rather than an afterthought. On the compute side, larger models also raise serving latency; see [[Inference Latency]].
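To make the evaluation contrast concrete, here is a minimal Python sketch that sets an exact-match check (closed-ended) beside a weighted rubric (open-ended). The `Criterion` class, the three criteria, and the refund-policy scenario are all hypothetical and do not come from Huyen's text. Real rubrics usually rely on human reviewers or model-based judges rather than string checks.

```python
from dataclasses import dataclass
from typing import Callable


def exact_match(prediction: str, label: str) -> bool:
    """Closed-ended evaluation: compare against a single ground-truth label."""
    return prediction.strip().lower() == label.strip().lower()


@dataclass
class Criterion:
    """One rubric item. The string checks used below are illustrative stand-ins."""
    name: str
    check: Callable[[str], bool]
    weight: float = 1.0


def rubric_score(response: str, rubric: list[Criterion]) -> float:
    """Open-ended evaluation: weighted fraction of criteria the response satisfies."""
    total = sum(c.weight for c in rubric)
    earned = sum(c.weight for c in rubric if c.check(response))
    return earned / total


# Hypothetical rubric for a refund-policy chatbot answer.
rubric = [
    Criterion("mentions_refund_window", lambda r: "30 days" in r, weight=2.0),
    Criterion("is_concise", lambda r: len(r.split()) <= 50),
    Criterion("no_unsupported_promise", lambda r: "guarantee" not in r.lower()),
]

response = "You can request a refund within 30 days of purchase."
print(exact_match("Spam", "spam"))     # True
print(rubric_score(response, rubric))  # 1.0
```

The rubric returns a graded score rather than a pass/fail, which is one way to cope with many differently worded answers being acceptable.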
## Model Adaptation Techniques
AI engineers adapt foundation models without training them from scratch. Huyen groups techniques by whether they update model weights:
**Prompt-based (no weight updates)**
- [[Prompt Engineering]] — giving the model instructions and context.
- [[Retrieval-Augmented Generation]] — connecting the model to external knowledge.
- Few-shot examples in the prompt (a form of [[In-Context Learning]]).
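The three prompt-based techniques above can be combined in a single prompt. The Python sketch below assembles an instruction, retrieved context, and few-shot examples into one string. The word-overlap `retrieve` function is a toy stand-in for a real search index or embedding-based retriever, and every name, document, and example here is hypothetical. No model is called; the resulting string is what would be sent to a model API, and no weights change.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank documents by word overlap with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(doc)), doc) for doc in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    # Keep at most k documents, dropping any with no overlap at all.
    return [doc for score, doc in ranked[:k] if score > 0]


def build_prompt(
    instruction: str,
    query: str,
    context: list[str],
    examples: list[tuple[str, str]],
) -> str:
    """Assemble instruction, retrieved context, and few-shot examples."""
    parts = [instruction]
    if context:
        parts.append("Context:\n" + "\n".join(f"- {c}" for c in context))
    for question, answer in examples:
        parts.append(f"Q: {question}\nA: {answer}")
    parts.append(f"Q: {query}\nA:")
    return "\n\n".join(parts)


# Hypothetical knowledge base and inputs.
documents = [
    "Refunds are accepted within 30 days of purchase.",
    "Shipping takes 3 to 5 business days.",
    "Our office is closed on public holidays.",
]
instruction = "Answer the customer's question using only the context provided."
examples = [("Can I pay by card?", "Yes, all major cards are accepted.")]
query = "How many days do refunds take?"

context = retrieve(query, documents)
prompt = build_prompt(instruction, query, context, examples)
print(prompt)
```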
**Weight-updating**
- [[Fine-Tuning]]: further training on domain-specific or task-specific data; typically considered when the task is poorly covered by pretraining or when outputs must follow a strict format more reliably than prompting alone achieves.
Huyen's heuristic: try prompt-based techniques first; fine-tune only when they plateau.
## The AI Engineering Stack
The [[Three-Layer AI Stack]] breaks the process into three levels: application development (prompts, context, evaluation), model development (training, fine-tuning, inference optimisation), and infrastructure (serving, compute, monitoring). Most AI engineers operate primarily in the top layer.
## Application Landscape
A survey of 205 open-source repositories and 100+ enterprise case studies (Huyen, 2024) surfaces these common application patterns: coding assistants, image and video production, writing aids, education tools, conversational bots, information aggregation, data organisation, and workflow automation. Internal-facing applications tend to be deployed earlier than customer-facing ones because they carry a lower compliance burden and less risk.
## AI Product Defensibility
Because foundation models are commodities and APIs lower barriers for all competitors, moats are narrow. Huyen identifies three sources of competitive advantage:
- **Technology** — largely similar across teams using the same base models.
- **Data** — proprietary usage data and domain-specific datasets. The *data flywheel* (usage → data → better model → more usage) is the durable moat.
- **Distribution** — reach, established user bases; skews to large incumbents.
## Planning Considerations
Huyen frames AI product planning around a "last mile challenge": a demo can be built in days, while a production-quality product can take months to years. In LinkedIn's case, it took one month to reach 80% of the desired quality and four more months to reach 95%. Each subsequent percentage point is progressively more expensive, a pattern of diminishing returns loosely analogous to the [[Scaling Laws]] of model training.
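A quick back-of-the-envelope calculation in Python shows how steep the last mile is in the LinkedIn figures above. It assumes quality starts at 0% and that effort is spread evenly within each phase; both are simplifications made for this illustration, not claims from the source.

```python
# Figures from the LinkedIn example; the even-effort assumption is ours.
first_phase_months, first_phase_gain = 1, 80          # 0% -> 80% of desired quality
second_phase_months, second_phase_gain = 4, 95 - 80   # 80% -> 95%

months_per_point_early = first_phase_months / first_phase_gain
months_per_point_late = second_phase_months / second_phase_gain

# How much more time each quality point costs in the second phase.
cost_ratio = months_per_point_late / months_per_point_early
print(f"{cost_ratio:.1f}x")  # 21.3x
```

Under these assumptions, each percentage point between 80% and 95% costs roughly twenty times as much calendar time as a point in the first phase.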
Application design dimensions (from Apple's framework, as cited by Huyen):
- **Critical vs complementary** — how dependent is the app on AI?
- **Reactive vs proactive** — does AI respond to requests or surface predictions opportunistically?
- **Dynamic vs static** — is the model continuously updated per user, or periodically updated?
## Related
- [[Foundation Model]]
- [[Three-Layer AI Stack]]
- [[Prompt Engineering]]
- [[Fine-Tuning]]
- [[Retrieval-Augmented Generation]]
- [[In-Context Learning]]
- [[Inference Latency]]
- [[Hallucination]]
- [[Scaling Laws]]
## Sources
- [[AI Engineering - Chip Huyen]]