## Definition

An **embedding** is a fixed-length dense vector that represents a token, a phrase, or an entire document. It maps discrete symbols into a continuous space where geometric proximity encodes semantic similarity.

## Two Uses Inside an LLM

1. **Input embeddings.** Each token maps to a vector at the start of the forward pass. The embedding table is part of the model weights.
2. **Hidden states.** Every intermediate layer produces a vector representation of each token. These are also "embeddings" in the broader sense, just contextualised by the surrounding tokens.

## Embeddings as Standalone Models

Beyond the LLM itself, dedicated **embedding models** (often a Transformer with mean/CLS pooling) produce a single vector per text input, optimised for similarity tasks.

Examples (2026): OpenAI `text-embedding-3-large`, Voyage `voyage-3`, Cohere `embed-v3`, BGE family, Nomic Embed.

## Geometric Property

- Cosine similarity (or dot product on normalised vectors) measures semantic relatedness.
- Vectors typically live in spaces of 256–4096 dimensions.
- Words/phrases used in similar contexts cluster together.

## Where Embeddings Power the Stack

- [[Semantic Search]]: a query embedding is compared to document embeddings.
- [[Retrieval-Augmented Generation]]: documents are retrieved by embedding similarity.
- Classification: train a small linear head on top of frozen embeddings.
- Clustering: group similar documents.

## Choosing an Embedding Model

Two axes that matter:

- **Dimensionality.** More dimensions can capture finer-grained similarity, at a higher storage cost.
- **Domain match.** Models trained heavily on English prose underperform on code or non-English text. Measure on your corpus.

## Important Caveats

- Embeddings encode *similarity*, not *truth*. Two false statements can be near each other.
- Embedding-space norms and angles are not always meaningful in absolute terms; only the *relative* comparisons are.

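
As a minimal sketch of the similarity comparison described under Geometric Property, and of the ranking step behind semantic search, the snippet below computes cosine similarity in plain Python and sorts a few documents against a query. The 3-dimensional vectors, the document labels, and the helper names (`cosine_similarity`, `normalize`, `rank_by_similarity`) are invented for illustration; real embeddings would come from an embedding model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def normalize(v):
    """Scale a vector to unit length, so a plain dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalise a zero vector")
    return [x / norm for x in v]


def rank_by_similarity(query, docs, top_k=2):
    """Return the top_k (label, score) pairs, most similar first."""
    scored = [(label, cosine_similarity(query, vec)) for label, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors, made up for this example (not produced by any real model).
TOY_DOCS = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.3, 0.1],
    "tax law": [0.0, 0.2, 0.9],
}
TOY_QUERY = [1.0, 0.2, 0.0]

if __name__ == "__main__":
    for label, score in rank_by_similarity(TOY_QUERY, TOY_DOCS, top_k=3):
        print(f"{label}: {score:.3f}")
```

Only the ordering of the scores is used here, which matches the caveat that relative comparisons are the meaningful ones. At real scale the linear scan in `rank_by_similarity` would normally be replaced by an index in a [[Vector Database]].
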
## Related

- [[Tokenization]]
- [[Vector Database]]
- [[Semantic Search]]
- [[Embedding-Based Retrieval]]
- [[Retrieval-Augmented Generation]]