## Definition
**Semantic search** retrieves documents by *meaning* rather than by exact lexical overlap. Queries and documents are embedded as vectors in a shared space, and relevance is measured by vector similarity. Two differently phrased versions of the same question should return the same relevant documents.
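
As a minimal sketch of the idea, the snippet below ranks documents by cosine similarity to a query vector. The 3-dimensional vectors and document ids are invented for illustration; a real system would obtain the vectors from an embedding model, which is assumed here rather than shown.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def semantic_search(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]

# Toy vectors (assumption): stand-ins for the output of an embedding model.
DOC_VECS = {
    "reset-password": [0.9, 0.1, 0.0],
    "login-help": [0.8, 0.3, 0.1],
    "pizza-recipe": [0.0, 0.2, 0.9],
}
QUERY_VEC = [0.9, 0.15, 0.0]  # stands in for "I forgot my credentials"

print(semantic_search(QUERY_VEC, DOC_VECS))  # ['reset-password', 'login-help']
```

Nothing in the ranking step looks at words: two texts match because their vectors point in a similar direction, which is why paraphrases can still be retrieved.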
## Vs Lexical Search (BM25)
| Property | Lexical (BM25) | Semantic (dense) |
| ------------------------------------- | -------------- | ------------------ |
| Matches exact terms | Strong | Weak |
| Matches paraphrases / synonyms | Weak | Strong |
| Handles typos | Weak | Stronger |
| Cross-lingual | None natively | Yes (with multilingual embeddings) |
| Explainable | Yes (matched terms) | Mostly opaque |
| Index/query cost | Cheap | More expensive |
| Cold start (no training) | Works | Requires embedding model |

Neither is universally better. The dominant production pattern is **hybrid**.
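
To make the lexical column concrete, here is a small sketch of BM25 scoring over whitespace-split tokens. It assumes the common Lucene-style IDF variant and the conventional defaults `k1=1.5`, `b=0.75`; the three documents are made up for illustration.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with a basic BM25 formula."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue  # no exact term overlap, no contribution
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores

# Invented example corpus.
DOCS = [
    "how to reset your password",
    "recover access when you forget your login credentials",
    "order status for sku ab-1234",
]

print(bm25_scores("reset password", DOCS))      # only the first document scores above zero
print(bm25_scores("forgot my passcode", DOCS))  # [0.0, 0.0, 0.0]
```

The second query is a paraphrase of the first two documents but shares no tokens with them, so every score is zero. That is the "Weak" cell in the paraphrase row, and also why the exact-term row is "Strong": a query for `ab-1234` hits the third document and nothing else.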
## Hybrid Search
Run both lexical and semantic search, then merge:
- **Reciprocal Rank Fusion (RRF)** — combine rank positions, not raw scores. Robust and simple; the workhorse.
- **Learned linear combination** — weighted sum of lexical and semantic scores.
- **Cross-encoder reranking** — top-N from both methods passed to a cross-encoder that scores each (query, doc) pair jointly.

In practice, hybrid retrieval followed by a cross-encoder reranker is generally the strongest off-the-shelf retrieval setup as of 2026.
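
The first two merge strategies can be sketched in a few lines. The RRF function uses the conventional constant `k=60`; the weighted variant min-max normalises each score set and uses a fixed `alpha` rather than a learned weight, which is a simplification. Document ids and scores are invented for illustration.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc ids; each list adds 1 / (k + rank) per document."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

def weighted_score_fusion(lexical, semantic, alpha=0.5):
    """Rank by alpha * lexical + (1 - alpha) * semantic after min-max normalisation."""
    def normalise(scores):
        lo, hi = min(scores.values()), max(scores.values())
        span = hi - lo
        return {d: ((s - lo) / span if span else 0.0) for d, s in scores.items()}

    lex, sem = normalise(lexical), normalise(semantic)
    fused = {d: alpha * lex.get(d, 0.0) + (1 - alpha) * sem.get(d, 0.0)
             for d in set(lex) | set(sem)}
    return sorted(fused, key=lambda d: (-fused[d], d))

# Invented results from the two retrievers.
lexical_ranking = ["sku-page", "reset-guide", "login-faq"]
semantic_ranking = ["reset-guide", "sso-overview", "sku-page"]
print(reciprocal_rank_fusion([lexical_ranking, semantic_ranking]))
# ['reset-guide', 'sku-page', 'sso-overview', 'login-faq']

lexical_scores = {"sku-page": 12.0, "reset-guide": 8.0, "login-faq": 2.0}
semantic_scores = {"reset-guide": 0.9, "sso-overview": 0.7, "sku-page": 0.5}
print(weighted_score_fusion(lexical_scores, semantic_scores))
# ['reset-guide', 'sku-page', 'sso-overview', 'login-faq']
```

RRF never compares a BM25 score with a cosine similarity, so it needs no normalisation step. The weighted sum does, and its result depends on how the two score scales are aligned, which is one reason RRF tends to be the default choice.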
## Why It Mattered
Before 2020, search was overwhelmingly lexical. Semantic search opened up:
- **Vocabulary-mismatch queries.** "What's the doc about user authentication that doesn't use the word 'authentication'?" Semantic search finds it; lexical search doesn't.
- **Cross-lingual search.** Multilingual models embed text from 100+ languages in one space, so a query in one language can match documents in another.
- **Question-answering at scale.** Retrieve the few passages that *answer* a question, not the many that *mention* its keywords.
## Failure Modes
- **Domain mismatch.** A general-purpose embedding model misses domain jargon. Mitigate with domain-specific embeddings or fine-tuning.
- **Spurious geometric closeness.** Two unrelated documents can land near each other in embedding space. Mitigate with reranking.
- **No exact-match fallback.** A user searching for a specific product code wants the product code, not "similar things." Hybrid restores this.
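
One simple way to restore exact matching is to route code-like queries to an exact lookup before falling back to semantic search. The sketch below assumes a hypothetical product-code shape (two letters, a dash, four digits) and a stand-in semantic retriever; both are illustrative only, and a full hybrid setup would fuse the two result lists instead of routing.

```python
import re

# Hypothetical product-code pattern, assumed for illustration.
PRODUCT_CODE = re.compile(r"\b[A-Z]{2}-\d{4}\b")

def search(query, exact_index, semantic_search):
    """Return the exact hit for a known product code, else semantic results."""
    match = PRODUCT_CODE.search(query.upper())
    if match and match.group() in exact_index:
        return [exact_index[match.group()]]
    return semantic_search(query)

# Toy exact-match index: product code -> document id.
EXACT_INDEX = {"AB-1234": "doc-widget-ab-1234"}

def fake_semantic_search(query):
    """Stand-in retriever that ranks a merely similar document first."""
    return ["doc-similar-widget", "doc-widget-ab-1234"]

print(search("ab-1234", EXACT_INDEX, fake_semantic_search))
# ['doc-widget-ab-1234']
print(search("small blue widget", EXACT_INDEX, fake_semantic_search))
# ['doc-similar-widget', 'doc-widget-ab-1234']
```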
## Connection to RAG
Semantic search is the *retrieval* step of [[Retrieval-Augmented Generation]]. A [[Vector Database]] is the storage layer, and the LLM is the generation layer.
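
The division of labour can be sketched as follows. The `retrieve` and `generate` callables are hypothetical placeholders for a semantic search backend and an LLM call; the prompt wording is likewise an assumption, not a recommended template.

```python
def build_prompt(question, passages):
    """Number the retrieved passages and place them ahead of the question."""
    context = "\n".join(f"[{i}] {text}" for i, text in enumerate(passages, start=1))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

def rag_answer(question, retrieve, generate, k=3):
    """Retrieval step (semantic search) followed by the generation step (LLM)."""
    passages = retrieve(question)[:k]
    return generate(build_prompt(question, passages))

def toy_retrieve(question):
    """Placeholder for a vector-database query; returns fixed passages."""
    return [
        "Passwords can be reset from the account page.",
        "Sessions expire after 30 minutes.",
    ]

def toy_generate(prompt):
    """Placeholder for an LLM call; echoes the prompt so the flow is visible."""
    return prompt

print(rag_answer("How do I reset my password?", toy_retrieve, toy_generate, k=1))
```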
## Related
- [[Embedding]]
- [[Vector Database]]
- [[Retrieval-Augmented Generation]]
- [[Embedding-Based Retrieval]]