## Definition
An **embedding** is a fixed-length dense vector that represents a token, a phrase, or an entire document. It maps discrete symbols into a continuous space where geometric proximity encodes semantic similarity.
## Two Uses Inside an LLM
1. **Input embeddings.** Each token maps to a vector at the start of the forward pass. The embedding table is part of the model weights.
2. **Hidden states.** Every intermediate layer produces a vector for each token. These are also "embeddings" in the broader sense, but contextualised by the surrounding tokens.
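The input-embedding lookup in item 1 amounts to indexing rows of a table. Below is a minimal sketch, assuming a toy three-word vocabulary and random values in place of learned weights; `vocab`, `embedding_table`, and `embed_tokens` are illustrative names, not part of any particular library.

```python
import random

# Toy vocabulary and table size; both are illustrative assumptions.
vocab = {"the": 0, "cat": 1, "sat": 2}
dim = 4
rng = random.Random(0)

# One row per token id. In a real model these rows are learned weights.
embedding_table = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in vocab]

def embed_tokens(tokens):
    """Return one vector per token by indexing rows of the table."""
    return [embedding_table[vocab[t]] for t in tokens]

vectors = embed_tokens(["the", "cat", "sat", "the"])
```

Both occurrences of "the" receive the same vector here. Only the hidden states in later layers differ by context.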
## Embeddings as Standalone Models
Beyond the LLM itself, dedicated **embedding models** (often a Transformer with mean/CLS pooling) produce a single vector per text input, optimised for similarity tasks. Examples (2026): OpenAI `text-embedding-3-large`, Voyage `voyage-3`, Cohere Embed v3 (`embed-english-v3.0`, `embed-multilingual-v3.0`), BGE family, Nomic Embed.
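Mean pooling, mentioned above, can be sketched as an element-wise average of the per-token vectors. This is a simplified illustration with hand-written numbers; production models typically also mask out padding tokens before averaging, which is omitted here. `mean_pool` is a hypothetical helper name.

```python
def mean_pool(token_vectors):
    """Average per-token vectors into one fixed-length vector."""
    n = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(v[i] for v in token_vectors) / n for i in range(dim)]

# Three tokens, two dimensions each (made-up values).
token_vectors = [[1.0, 0.0], [3.0, 2.0], [2.0, 4.0]]
pooled = mean_pool(token_vectors)  # [2.0, 2.0]
```

CLS pooling would instead take the vector at a designated special-token position and skip the average.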
## Geometric Properties
- Cosine similarity (or dot product on normalised vectors) measures semantic relatedness.
- Vectors live in spaces of typically 256–4096 dimensions.
- Words/phrases used in similar contexts cluster together.
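As a rough illustration of the first bullet, the sketch below computes cosine similarity directly and then shows that a plain dot product gives the same number once both vectors are scaled to unit length. The helper names and the example vectors are assumptions made for the illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(v):
    return math.sqrt(dot(v, v))

def cosine_similarity(a, b):
    """Cosine of the angle between a and b, in [-1, 1]."""
    return dot(a, b) / (norm(a) * norm(b))

def normalise(v):
    """Scale v to unit length."""
    n = norm(v)
    return [x / n for x in v]

a = [1.0, 2.0, 0.5]
b = [0.8, 1.5, 1.0]

cos_ab = cosine_similarity(a, b)
dot_on_unit = dot(normalise(a), normalise(b))  # same value up to rounding
```

Storing pre-normalised vectors is a common design choice because it reduces each comparison to a single dot product.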
## Where Embeddings Power the Stack
- [[Semantic Search]] — query embedding compared to document embeddings.
- [[Retrieval-Augmented Generation]] — retrieves documents by embedding similarity.
- Classification — train a small linear head on top of frozen embeddings.
- Clustering — group similar documents.
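The semantic-search and retrieval cases above both reduce to ranking documents by similarity to a query vector. Here is a minimal brute-force sketch, assuming the document vectors have already been produced by some embedding model; the three-dimensional vectors and document ids are invented for the example, and `top_k` is a hypothetical helper.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents most similar to the query."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]

# Made-up embeddings; a real system would get these from a model.
doc_vecs = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.3, 0.1],
    "tax": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]

results = top_k(query, doc_vecs, k=2)  # ["cats", "dogs"]
```

A linear scan like this is fine for small corpora. Larger ones usually rely on an approximate nearest-neighbour index, which is where a [[Vector Database]] comes in.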
## Choosing an Embedding Model
Two axes that matter:
- **Dimensionality.** More dimensions can capture finer-grained distinctions, but storage and brute-force search cost grow linearly with the dimension count.
- **Domain match.** Models trained heavily on English prose often underperform on code or non-English text. Measure retrieval quality on your own corpus rather than relying on public benchmarks alone.
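The storage side of the dimensionality trade-off is simple arithmetic. The sketch below assumes uncompressed 32-bit floats (4 bytes per value) and ignores index overhead; the corpus size of one million vectors is an arbitrary example, and `index_size_bytes` is a hypothetical helper.

```python
def index_size_bytes(num_vectors, dim, bytes_per_value=4):
    """Raw storage for dense vectors, excluding any index overhead."""
    return num_vectors * dim * bytes_per_value

num_vectors = 1_000_000  # assumed corpus size
sizes = {dim: index_size_bytes(num_vectors, dim) for dim in (256, 1024, 4096)}

for dim, size in sizes.items():
    print(f"{dim:>4} dims: {size / 1e9:.2f} GB")
```

Under these assumptions, going from 256 to 4096 dimensions multiplies raw storage by 16.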
## Important Caveats
- Embeddings encode *similarity*, not *truth*. Two false statements can be near each other.
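- Norms and angles in embedding space are not always meaningful in absolute terms; only *relative* comparisons within the same model's space are. A similarity score from one model is not comparable to a score from another.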
## Related
- [[Tokenization]]
- [[Vector Database]]
- [[Semantic Search]]
- [[Embedding-Based Retrieval]]
- [[Retrieval-Augmented Generation]]