Embeddings Are Just Coordinates

Vector embeddings sound complicated. They're not. They're just coordinates in a high-dimensional space where similar meanings are close together. That's it. Everything else — semantic search, recommendation systems, RAG — is just an application of that simple idea.

If you understand that "cat" and "dog" should be closer together than "cat" and "democracy," you understand embeddings. The rest is just math.

What Embeddings Actually Are

An embedding is a list of numbers that represents text. The sentence "I love pizza" might become [0.23, -0.45, 0.67, ...] with 768 numbers total. The sentence "I enjoy pizza" becomes a different list, but one that's numerically similar.

These numbers are coordinates in a 768-dimensional space. You can't visualize 768 dimensions, but the math works the same as 2D or 3D coordinates. Distance measures similarity. Close coordinates mean similar meanings.
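That claim is easy to check. Here's a minimal sketch using made-up three-number "embeddings" (real models produce hundreds of dimensions, but the Pythagorean formula is the same at any size):

```python
import math

def euclidean_distance(a, b):
    # Same distance formula as 2D or 3D, just summed over every dimension.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy 3-dimensional "embeddings" (made up for illustration).
love_pizza = [0.23, -0.45, 0.67]
enjoy_pizza = [0.25, -0.40, 0.70]
tax_law = [-0.80, 0.55, -0.10]

# Similar sentences land close together; unrelated ones land far apart.
assert euclidean_distance(love_pizza, enjoy_pizza) < euclidean_distance(love_pizza, tax_law)
```

Swap the three-number lists for 768-number lists and nothing about the code changes.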

The model that generates these embeddings is trained to put semantically similar text close together. "Happy" and "joyful" get similar coordinates. "Happy" and "sad" get distant coordinates. The model learns these relationships from massive amounts of text.

Why This Enables Semantic Search

Traditional search matches keywords. If your query is "best Italian food" and a document says "top pasta restaurants," keyword search misses it. No shared words, no match.

Semantic search converts both the query and documents into embeddings, then finds documents with coordinates close to the query's coordinates. "Best Italian food" and "top pasta restaurants" have similar embeddings because they mean similar things.

This works across languages, synonyms, and paraphrases. The embedding model has learned that these different phrasings represent similar concepts, so it assigns them similar coordinates.

Embeddings turn the fuzzy concept of "meaning" into precise numerical coordinates that computers can compare.
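The whole retrieval step fits in a few lines. This sketch uses hand-made toy vectors standing in for a real embedding model's output — in practice you'd call a model to produce them:

```python
import math

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy embeddings (made up); a real model would generate these from the text.
documents = {
    "top pasta restaurants": [0.9, 0.1, 0.0],
    "homemade pizza dough recipe": [0.7, 0.3, 0.2],
    "city council meeting notes": [-0.2, 0.8, 0.5],
}
query_embedding = [0.85, 0.15, 0.05]  # "best Italian food", embedded

# Rank documents by how close their coordinates are to the query's.
ranked = sorted(documents, key=lambda doc: distance(query_embedding, documents[doc]))
print(ranked)  # food-related documents rank above the unrelated one
```

No shared keywords were compared at any point — only coordinates.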

How Distance Is Measured

The most common way to compare embeddings is cosine similarity (strictly a similarity measure, not a distance). It measures the angle between two vectors, ignoring their magnitude. Two vectors pointing in the same direction have high cosine similarity, even if one is longer.

This matters because embedding magnitudes vary. What matters is direction — whether two pieces of text are "pointing" toward similar semantic concepts in the embedding space.

Cosine similarity ranges from -1 to 1. A score of 1 means identical direction (very similar meaning). A score of 0 means perpendicular (unrelated). A score of -1 means opposite direction (opposite meaning).
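The formula is the dot product divided by the product of the vectors' lengths. A minimal implementation, checked against the three cases above:

```python
import math

def cosine_similarity(a, b):
    # Angle-based similarity: dot product divided by both vectors' lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Magnitude is ignored: [2, 4] points the same way as [1, 2], so similarity is 1.
assert abs(cosine_similarity([1, 2], [2, 4]) - 1.0) < 1e-9
# Perpendicular vectors score 0; opposite vectors score -1.
assert abs(cosine_similarity([1, 0], [0, 1])) < 1e-9
assert abs(cosine_similarity([1, 0], [-1, 0]) + 1.0) < 1e-9
```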

Why 768 Dimensions

Why not 10 dimensions? Or 10,000? The dimensionality is a tradeoff. More dimensions allow finer distinctions between meanings. Fewer dimensions are faster to compute and require less storage.

768 dimensions is a sweet spot for many models. It's enough to capture nuanced semantic relationships but small enough to be practical. Some models use 384 dimensions for speed. Others use 1,536 for accuracy.
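The storage side of that tradeoff is easy to quantify. A back-of-the-envelope calculation, assuming 32-bit floats (4 bytes per number) and one million stored embeddings:

```python
def storage_gb(num_vectors, dims, bytes_per_float=4):
    # Raw vector storage only; search indexes add overhead on top of this.
    return num_vectors * dims * bytes_per_float / 1e9

for dims in (384, 768, 1536):
    # 768 dims works out to roughly 3 GB per million vectors.
    print(f"{dims} dims: {storage_gb(1_000_000, dims):.2f} GB")
```

Doubling the dimensionality doubles the storage (and roughly doubles the cost of every comparison), which is why the smaller models exist at all.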

The key insight is that language has inherent dimensionality. You can't represent all possible meanings in 2D space. But you also don't need a million dimensions. Hundreds or thousands are enough.

The Limitations

Embeddings capture meaning, but they lose specifics. "I love pizza" and "I hate pizza" might have similar embeddings because they're both about pizza and emotional reactions. The model might miss the opposite sentiment.

This is why embeddings work best for retrieval, not for understanding. They're great for finding relevant documents. They're not great for determining whether a document agrees or disagrees with a statement.

Newer embedding models are better at capturing these nuances, but the fundamental limitation remains: you're compressing complex text into a fixed-size vector. Information is lost.

Embeddings vs. Fine-Tuning

Embeddings are frozen representations. Once generated, they don't change. Fine-tuning updates the model itself. Both are useful, but for different purposes.

Use embeddings when you need to compare many pieces of text efficiently. Generate embeddings once, store them, and compare them repeatedly. This is perfect for search and recommendation.
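The generate-once, compare-repeatedly pattern can be sketched as a tiny in-memory store. The toy vectors and the `embed_fn` lookup below are stand-ins — in a real system `embed_fn` would call an embedding model, and the store would be a vector database:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

class EmbeddingStore:
    """Embed each text once at insert time; every search reuses the stored vectors."""

    def __init__(self, embed_fn):
        self.embed = embed_fn  # called once per document, then again per query
        self.vectors = {}

    def add(self, text):
        self.vectors[text] = self.embed(text)

    def search(self, query, top_k=3):
        q = self.embed(query)
        scored = sorted(self.vectors,
                        key=lambda t: cosine_similarity(q, self.vectors[t]),
                        reverse=True)
        return scored[:top_k]

# Stand-in "model": a lookup table of hand-made vectors.
fake_model = {
    "best Italian food": [0.9, 0.1],
    "top pasta restaurants": [0.85, 0.2],
    "quarterly earnings report": [0.1, 0.95],
}
store = EmbeddingStore(fake_model.__getitem__)
store.add("top pasta restaurants")
store.add("quarterly earnings report")
print(store.search("best Italian food", top_k=1))
```

The expensive step (embedding) happens once per document; the cheap step (vector comparison) happens on every query.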

Use fine-tuning when you need the model to learn domain-specific patterns. Embeddings capture general semantic similarity. Fine-tuning teaches the model your specific use case.

The Practical Reality

Most developers don't need to understand the math behind embeddings. You just need to know: similar text gets similar vectors, and you can find similar text by comparing vectors.

The hard part isn't the concept. It's the infrastructure. Storing millions of embeddings, indexing them for fast search, and keeping them updated as your data changes — that's where the complexity lives.

But the core idea remains simple: embeddings are coordinates, and coordinates let you measure distance. Everything else is just engineering.

Visualize semantic similarity with LLM Utils Embeddings Tool — see how different sentences map to similar or distant coordinates.