Vector Distance Metrics Explained

Vector distance metrics are the mathematical rules a vector search system uses to compare vectors. They decide which stored vectors are closest to a query vector and therefore which results appear near the top.

Distance metrics matter because embeddings are only useful if the system compares them in the right way. The same vectors can produce different rankings depending on whether the search uses cosine distance, L2 distance, dot product, Manhattan distance, or another metric.

What Is a Vector Distance Metric?

A vector distance metric takes two vectors and returns a number that describes how close or far apart they are.

In vector search, one vector usually represents the query and the other vector represents a stored object such as a document chunk, product, image, note, or support ticket.

Most distance values follow this rule:

smaller distance means closer vectors
larger distance means farther vectors
a distance of 0 usually means the vectors are identical under that metric (for cosine distance, that they point in the same direction)

Similarity scores can behave differently. With a similarity score, higher often means more similar. Always check whether the database returns a distance or a similarity value.

Why Distance Metrics Matter

Distance metrics affect retrieval quality.

If the metric matches the embedding model, search results are more likely to reflect the relationships the model learned. If the metric does not match, the database may still return results quickly, but the ranking may no longer reflect those relationships, and relevant items can slip below less relevant ones.

Metric choice affects:

semantic search relevance
RAG context retrieval
recommendation quality
duplicate detection
clustering behavior
score thresholds
index configuration

Cosine Distance

Cosine distance compares the direction of two vectors.

It is based on cosine similarity, which measures whether two vectors point in a similar direction. Cosine distance is commonly expressed as:

cosine distance = 1 - cosine similarity

Cosine is common for text embeddings because direction often matters more than raw vector length.

Use cosine-style comparison when the embedding model expects angular similarity or when model documentation recommends it.
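
As an illustration of the formula above, here is a minimal sketch in plain Python. The function names and sample vectors are made up for this example, and the code assumes neither vector is all zeros (the cosine is undefined in that case).

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between a and b (assumes non-zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def cosine_distance(a, b):
    """cosine distance = 1 - cosine similarity"""
    return 1.0 - cosine_similarity(a, b)

# Hypothetical sample vectors.
query = [1.0, 2.0, 3.0]
same_direction = [2.0, 4.0, 6.0]  # the query scaled by 2
orthogonal = [3.0, 0.0, -1.0]     # dot product with the query is 0

d_same = cosine_distance(query, same_direction)  # approximately 0
d_orth = cosine_distance(query, orthogonal)      # 1.0
```

Scaling a vector does not change its cosine distance to the query, which is why vector length drops out of the comparison.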

Dot Product

Dot product measures alignment and magnitude together.

It behaves like a similarity score: larger values mean stronger alignment. Some systems transform dot product into a distance-style value so that lower values still mean closer results.

Dot product can be useful when the embedding model was trained with dot-product scoring or when vector magnitude carries meaningful information.
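
The sketch below shows how magnitude influences the dot product. The vectors are hypothetical, and negating the score is only one possible way to obtain a distance-style value; it is assumed here for illustration, and a given database may use a different transformation.

```python
def dot_product(a, b):
    """Sum of element-wise products; larger means stronger alignment."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical vectors pointing in the same direction as the query.
query = [1.0, 1.0]
short_aligned = [1.0, 1.0]
long_aligned = [3.0, 3.0]  # same direction, three times the length

score_short = dot_product(query, short_aligned)  # 2.0
score_long = dot_product(query, long_aligned)    # 6.0

# Assumed transformation: negate the score so that lower means closer.
distance_style_short = -score_short
distance_style_long = -score_long
```

Both stored vectors have the same cosine distance to the query, but the longer one gets the higher dot-product score.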

L2 or Euclidean Distance

L2 distance, also called Euclidean distance, measures straight-line distance between two vectors.

The ordinary formula is:

sqrt(sum((Ai - Bi)^2))

L2 is useful when coordinate distance matters and the embedding model was trained or evaluated with Euclidean-style distance.

With L2 distance, lower values mean closer vectors.
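
A minimal sketch of the formula above in plain Python. The function name and sample vectors are chosen for this example only.

```python
import math

def l2_distance(a, b):
    """Straight-line (Euclidean) distance: sqrt(sum((Ai - Bi)^2))."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical 2-D vectors forming a 3-4-5 right triangle.
origin = [0.0, 0.0]
point = [3.0, 4.0]

d = l2_distance(origin, point)  # 5.0
```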

Squared L2 Distance

Squared L2 distance is L2 distance without the final square root:

sum((Ai - Bi)^2)

Many vector systems use squared L2 because it skips the square root and still preserves nearest-neighbor ordering. If one vector is closer than another by L2 distance, it is also closer by squared L2 distance.

The scale is different, though. A squared L2 value should not be interpreted as ordinary L2 distance.
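
The following sketch checks both points on a few hypothetical vectors: the ranking is the same under L2 and squared L2, while the values themselves differ. The names and data are illustrative only.

```python
import math

def squared_l2(a, b):
    """sum((Ai - Bi)^2), with no final square root."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def l2(a, b):
    return math.sqrt(squared_l2(a, b))

# Hypothetical query and stored vectors.
query = [0.0, 0.0]
stored = {"a": [3.0, 4.0], "b": [1.0, 1.0], "c": [6.0, 8.0]}

# Rank stored ids from closest to farthest under each metric.
rank_l2 = sorted(stored, key=lambda k: l2(query, stored[k]))
rank_sq = sorted(stored, key=lambda k: squared_l2(query, stored[k]))

# Same order, different scale: "a" is 5.0 away by L2 but 25.0 by squared L2.
```

Because the square root is monotonic for non-negative values, dropping it cannot reorder the results, but a threshold tuned for one scale will not carry over to the other.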

Manhattan or L1 Distance

Manhattan distance, also called L1 distance or taxicab distance, adds the absolute differences between vector components:

sum(|Ai - Bi|)

It is like traveling through a city grid instead of taking a direct straight line.

L1 can be useful for some feature-engineered or sparse vectors, or when the model and task are designed for that style of comparison.
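
A short sketch of the formula above, using made-up vectors and a function name chosen for this example.

```python
def manhattan_distance(a, b):
    """Sum of absolute component differences: sum(|Ai - Bi|)."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical vectors: differences are 3, 2, and 0.
a = [1.0, 2.0, 3.0]
b = [4.0, 0.0, 3.0]

d = manhattan_distance(a, b)  # 5.0
```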

Hamming Distance

Hamming distance counts how many positions are different between two vectors.

For example:

A = [1, 9, 3, 4, 5]

B = [1, 2, 3, 9, 5]

The vectors differ in two positions, so the Hamming distance is 2.

Hamming distance is useful when vectors represent discrete values, binary codes, or categorical comparisons rather than ordinary dense text embeddings.
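
The worked example above can be reproduced with a few lines of Python. This is a sketch that returns a raw count; some libraries report Hamming distance as a fraction of positions instead, so check which form a given tool uses.

```python
def hamming_distance(a, b):
    """Count the positions where a and b differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)

A = [1, 9, 3, 4, 5]
B = [1, 2, 3, 9, 5]

d = hamming_distance(A, B)  # 2 (positions 2 and 4 differ)
```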

Distance vs Similarity

Distance and similarity are related, but they are not the same.

Distance: lower usually means more similar.
Similarity: higher usually means more similar.

This distinction matters when sorting results or setting thresholds.

If you sort a distance score in the wrong direction, you can put the worst results first. If you treat a raw distance as a percentage, the score can be misleading.
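
The sorting direction can be sketched as follows. The result records and field names are hypothetical, and the conversion assumes the scores are cosine distances, so that similarity = 1 - distance.

```python
# Hypothetical search results carrying cosine distances.
results = [
    {"id": "doc-1", "distance": 0.12},
    {"id": "doc-2", "distance": 0.80},
    {"id": "doc-3", "distance": 0.35},
]

# Distances: sort ascending, so the smallest value comes first.
by_distance = sorted(results, key=lambda r: r["distance"])

# Similarities: sort descending, so the largest value comes first.
with_similarity = [
    {"id": r["id"], "similarity": 1.0 - r["distance"]} for r in results
]
by_similarity = sorted(
    with_similarity, key=lambda r: r["similarity"], reverse=True
)

ids_by_distance = [r["id"] for r in by_distance]
ids_by_similarity = [r["id"] for r in by_similarity]
```

Both orderings put doc-1 first and doc-2 last. Sorting the distances in descending order instead would put doc-2, the worst match, at the top.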

Which Metric Should You Choose?

The practical rule is to use the metric recommended for the embedding model.

If the model was trained with cosine similarity, use cosine-style search. If it was trained or evaluated with L2, use L2 or squared L2. If the model documentation recommends dot product, use dot product.

If the model documentation is unclear, compare metrics with real queries and expected results.

How to Evaluate a Metric

To evaluate a distance metric, build a small test set.

Include:

real user queries
known relevant results
hard negative examples
short and long queries
domain-specific terms
edge cases such as rare names or codes

Then compare retrieval quality using metrics such as recall, precision, MRR, or nDCG. Also inspect results manually, because relevance often depends on product context.
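
As a rough sketch of how two of these scores could be computed, the code below implements recall at k and mean reciprocal rank (MRR) over a tiny, made-up test set. The document ids, relevance labels, and function names are all hypothetical; in practice the retrieved lists would come from running each test query against the index once per candidate metric.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant ids that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / len(relevant)

def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant result, or 0.0 if none is found."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (retrieved, relevant) pairs."""
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)

# Hypothetical test set: (ranked ids returned by search, known relevant ids).
runs = [
    (["d3", "d1", "d7"], {"d1", "d2"}),
    (["d5", "d9", "d4"], {"d5"}),
]

recalls = [recall_at_k(r, rel, k=3) for r, rel in runs]  # [0.5, 1.0]
mrr = mean_reciprocal_rank(runs)                         # 0.75
```

Running the same test set once per metric and comparing the scores side by side gives a simple baseline for the choice.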

Metric Choice and RAG

In RAG systems, the distance metric affects which chunks are sent to the language model.

If the retrieval metric is wrong, the generated answer may be weak even if the language model is strong. The model can only use the context it receives.

For RAG, metric choice should be tested with answer quality, not only search speed.

Metric Choice and Performance

Distance calculations happen many times during indexing and search. Some metrics are cheaper to compute than others, and some are better optimized in specific databases or hardware environments.

However, speed should not be the only factor. A faster metric that retrieves worse results may create more downstream cost in reranking, prompt size, or user correction.

Common Mistakes

Common mistakes include:

choosing a metric without checking the embedding model
mixing distance and similarity score interpretation
using thresholds copied from another model
switching metrics without rebuilding evaluation baselines
comparing raw scores across different metrics
assuming cosine, dot product, and L2 always rank results the same (they agree when all vectors are normalized to unit length, but not in general)
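
The last mistake is easy to demonstrate. In the sketch below, three hypothetical, unnormalized vectors are ranked against the same query under cosine distance, dot product, and L2 distance, and each metric produces a different order. The vectors were picked by hand to make the disagreement visible.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    return 1.0 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical, unnormalized vectors.
query = [1.0, 0.0]
stored = {
    "a": [0.9, 0.1],  # close to the query in space, slightly off-axis
    "b": [5.0, 5.0],  # long vector at 45 degrees
    "c": [0.5, 0.0],  # same direction as the query, but shorter
}

# Distances sort ascending; the dot product is a similarity, so descending.
rank_cosine = sorted(stored, key=lambda k: cosine_distance(query, stored[k]))
rank_l2 = sorted(stored, key=lambda k: l2_distance(query, stored[k]))
rank_dot = sorted(stored, key=lambda k: dot(query, stored[k]), reverse=True)

# rank_cosine -> ['c', 'a', 'b']
# rank_l2     -> ['a', 'c', 'b']
# rank_dot    -> ['b', 'a', 'c']
```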

Summary

Vector distance metrics define how a vector search system compares query vectors with stored vectors. Common metrics include cosine distance, dot product, L2 or squared L2 distance, Manhattan distance, and Hamming distance.

The best metric depends on the embedding model, data, and application. For distances, lower values usually mean closer vectors; for similarity scores, higher values usually mean closer vectors.

Choose the metric that matches the model, then validate it with real retrieval examples before relying on it in production.