L2 indexes and inner product indexes rank vectors by different ideas of closeness. An L2 index looks for vectors with the smallest coordinate distance from the query. An inner product index looks for vectors with the strongest dot product alignment with the query.
The right choice depends on the embedding model and what its vector space was trained to optimize.
Short Answer
Use an L2-style index when the model expects Euclidean distance or squared Euclidean distance. Use an inner product index when the model expects dot product or maximum inner product search.
If the model documentation recommends a metric, follow that recommendation first. Then validate with real queries.
What an L2 Index Optimizes
An L2 index ranks candidates by Euclidean distance from the query vector.
L2(a, b) = sqrt(sum((a_i - b_i)^2))
Many vector systems use squared L2 instead:
squared_L2(a, b) = sum((a_i - b_i)^2)
For ranking nearest neighbors, ordinary L2 and squared L2 preserve the same order for a fixed query. Lower distance means closer.
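As a minimal sketch of why the two forms rank identically, the snippet below scores a few made-up 2D vectors with both functions. The vectors and names are illustrative assumptions, not taken from any particular model or database.

```python
import math

def squared_l2(a, b):
    # Sum of squared coordinate differences.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def l2(a, b):
    # Ordinary Euclidean distance is the square root of squared L2.
    return math.sqrt(squared_l2(a, b))

# Toy data, assumed for illustration only.
query = [1.0, 0.0]
candidates = {"a": [1.0, 1.0], "b": [3.0, 0.0], "c": [0.0, -0.5]}

# Lower distance means closer, so both sorts are ascending.
by_l2 = sorted(candidates, key=lambda k: l2(query, candidates[k]))
by_squared = sorted(candidates, key=lambda k: squared_l2(query, candidates[k]))

print(by_l2)       # ['a', 'c', 'b']
print(by_squared)  # ['a', 'c', 'b']
```

The square root is monotonic on non-negative numbers, so skipping it saves work without changing the order.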
What an Inner Product Index Optimizes
An inner product index ranks candidates by dot product with the query vector.
dot(a, b) = sum(a_i * b_i)
Raw dot product is usually similarity-like: higher means more aligned.
Some systems expose inner product as a distance by returning the negative dot product:
dot_distance(a, b) = -dot(a, b)
In that case, lower distance means more similar, because negating a larger dot product gives a smaller (more negative) value.
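The two conventions can be compared side by side. This is a small sketch with made-up vectors; `dot_distance` mirrors the formula above and is not the API of any specific database.

```python
def dot(a, b):
    # Raw inner product: higher means more aligned.
    return sum(x * y for x, y in zip(a, b))

def dot_distance(a, b):
    # Negated inner product: lower means more similar.
    return -dot(a, b)

# Toy data, assumed for illustration only.
query = [1.0, 2.0]
candidates = {"a": [0.5, 0.5], "b": [2.0, 1.0], "c": [-1.0, 0.0]}

# Similarity sorts descending; the negated form sorts ascending.
by_similarity = sorted(candidates, key=lambda k: dot(query, candidates[k]), reverse=True)
by_distance = sorted(candidates, key=lambda k: dot_distance(query, candidates[k]))

print(by_similarity)  # ['b', 'a', 'c']
print(by_distance)    # ['b', 'a', 'c']
```

Both sorts give the same order. Only the sort direction differs.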
The Main Difference
L2 asks: how far apart are these two vectors in coordinate space?
Inner product asks: how strongly does one vector align with the other, including magnitude?
Those are not the same question. The same query can produce different rankings under L2 and inner product.
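A small sketch makes the disagreement concrete. The vectors below are made up for illustration: one candidate sits very close to the query, the other points in the same direction but is much longer.

```python
def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Toy data, assumed for illustration only.
query = [1.0, 1.0]
candidates = {
    "near": [1.0, 0.9],  # almost the same coordinates as the query
    "long": [5.0, 5.0],  # same direction as the query, larger magnitude
}

# L2 prefers the smallest distance; inner product prefers the largest dot product.
best_by_l2 = min(candidates, key=lambda k: squared_l2(query, candidates[k]))
best_by_ip = max(candidates, key=lambda k: dot(query, candidates[k]))

print(best_by_l2)  # near
print(best_by_ip)  # long
```

The same query and the same two candidates produce a different top result under each metric.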
Magnitude Matters More With Inner Product
Inner product is affected by both direction and vector length.
If two vectors point in a similar direction, the one with larger magnitude can get a larger dot product score.
This can be useful when magnitude carries signal. It can be harmful when magnitude is an artifact that should not affect ranking.
L2 Focuses on Coordinate Distance
L2 distance compares actual coordinate differences.
Two vectors can point in a similar direction but still be far apart by L2 if one vector is much longer than the other.
a = [1, 1]
b = [10, 10]
These vectors point in the same direction, but their L2 distance is sqrt(81 + 81), about 12.73, so an L2 index treats them as far apart.
Normalized Vectors Change the Relationship
When vectors are normalized to unit length, the metrics converge.
For unit vectors, dot product equals cosine similarity, and squared L2 reduces to 2 - 2 * dot(a, b). Every vector has the same length, so only the angle differs, and L2, inner product, and cosine all produce the same exact nearest-neighbor ranking.
Without normalization, L2 and inner product can behave very differently.
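The following sketch checks the unit-length relationship numerically, using the identity squared_L2(a, b) = 2 - 2 * dot(a, b) that holds when both vectors have length 1. The vectors and the `normalize` helper are illustrative assumptions.

```python
import math

def normalize(v):
    # Scale a vector to unit length (assumes v is not all zeros).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Toy data, assumed for illustration only.
query = normalize([1.0, 1.0])
candidates = {
    "near": normalize([1.0, 0.9]),
    "long": normalize([5.0, 5.0]),
    "other": normalize([1.0, -1.0]),
}

# The identity holds for every candidate, up to floating-point error.
for vec in candidates.values():
    assert math.isclose(squared_l2(query, vec), 2 - 2 * dot(query, vec), abs_tol=1e-9)

by_l2 = sorted(candidates, key=lambda k: squared_l2(query, candidates[k]))
by_ip = sorted(candidates, key=lambda k: dot(query, candidates[k]), reverse=True)

print(by_l2 == by_ip)  # True
```

After normalization the long vector no longer wins on magnitude alone. It ranks first under both metrics only because it points in exactly the same direction as the query.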
Why Index Metric Must Match the Model
Embedding models are trained or evaluated with assumptions about similarity.
If a model was trained for dot product retrieval, using L2 may hurt ranking. If a model was trained for Euclidean distance, using inner product may produce unexpected results.
The index metric is not just a storage setting. It is part of the retrieval model.
Score Direction
Score direction is a common source of confusion.
- L2 distance: lower is closer.
- Squared L2 distance: lower is closer.
- Raw inner product: higher is closer.
- Negative dot product distance: lower is closer.
Before sorting or thresholding, confirm what the database returns.
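One way to keep score direction explicit is to record it next to the metric name and sort through a single helper. This is a sketch: the metric names and the `rank` function are hypothetical, and real databases use their own identifiers and result formats.

```python
# Hypothetical metric names mapped to their sort direction.
HIGHER_IS_BETTER = {
    "l2": False,
    "squared_l2": False,
    "inner_product": True,
    "negative_dot": False,
}

def rank(results, metric):
    """Sort (id, score) pairs best-first for the given metric."""
    return sorted(results, key=lambda r: r[1], reverse=HIGHER_IS_BETTER[metric])

# Toy scores, assumed for illustration only.
results = [("a", 0.2), ("b", 0.9)]

print(rank(results, "l2")[0][0])             # a
print(rank(results, "inner_product")[0][0])  # b
```

An unknown metric name raises a `KeyError`, which is usually preferable to silently sorting in the wrong direction.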
Thresholds Are Not Portable
An L2 threshold cannot be reused as an inner product threshold.
For example, this kind of threshold:
L2 distance <= 0.8
does not translate directly into:
dot product >= 0.8
The score ranges and meanings differ. Recalibrate thresholds when changing metrics.
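Even in the one case where a conversion exists, the numbers do not carry over unchanged. The sketch below assumes both vectors have unit length, where squared_L2(a, b) = 2 - 2 * dot(a, b). The function name is hypothetical. For unnormalized vectors there is no such fixed mapping, and thresholds have to be recalibrated on data.

```python
import math

def dot_threshold_from_l2(l2_threshold):
    # Valid only for unit-length vectors: d^2 = 2 - 2 * dot, so dot = 1 - d^2 / 2.
    return 1.0 - (l2_threshold ** 2) / 2.0

equivalent = dot_threshold_from_l2(0.8)
print(round(equivalent, 2))  # 0.68
```

Under that assumption, "L2 distance <= 0.8" corresponds to "dot product >= 0.68" (approximately), not 0.8.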
When L2 Indexes Fit
An L2 index can fit when:
- the embedding model recommends Euclidean or L2 distance
- the training loss used L2-style geometry
- coordinate distance is meaningful for the representation
- you want nearest neighbors by spatial closeness
- evaluation shows better recall or RAG quality than dot product
When Inner Product Indexes Fit
An inner product index can fit when:
- the embedding model recommends dot product
- the model was trained for maximum inner product search
- vector magnitude carries useful signal
- retrieval benchmarks for the model use dot product
- evaluation shows better ranking than L2 or cosine
Relationship to Cosine Similarity
Cosine similarity compares only the angle between two vectors and ignores magnitude.
Dot product compares angle and magnitude. L2 compares coordinate distance.
If vectors are normalized, cosine similarity and dot product are equal, so they rank results identically. If vectors are not normalized, dot product can favor larger-magnitude vectors.
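A short sketch shows the magnitude effect: scaling a document vector leaves its cosine similarity unchanged but multiplies its dot product. The vectors are made up for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # Dot product divided by both vector lengths (assumes neither is all zeros).
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Toy data, assumed for illustration only.
query = [1.0, 2.0]
doc = [2.0, 1.0]
scaled_doc = [10.0 * x for x in doc]  # same direction, 10x the magnitude

print(math.isclose(cosine(query, doc), cosine(query, scaled_doc)))  # True
print(dot(query, doc), dot(query, scaled_doc))                      # 4.0 40.0
```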
Impact on ANN Indexes
Approximate nearest neighbor indexes are built around a distance or similarity function. That metric affects graph construction, candidate selection, scoring, and recall behavior.
Changing the metric may require rebuilding the index. It also requires revalidating search quality because the nearest-neighbor structure changes.
Impact on RAG
In a RAG system, the index metric affects which chunks enter the prompt.
If the metric does not match the embedding model, semantically useful chunks may rank lower, and less useful chunks may appear higher. That can reduce answer quality even if the vector database is working correctly.
Metric choice should be evaluated with real questions and expected source chunks.
How to Choose Between Them
Use this process:
- Check the embedding model documentation.
- Use the model’s recommended metric first.
- Confirm whether vectors should be normalized.
- Build a small relevance benchmark.
- Compare top-k results under each candidate metric.
- Evaluate RAG answer quality, not only nearest-neighbor scores.
- Recalibrate thresholds after choosing the metric.
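The benchmark and comparison steps can be sketched with a brute-force search and a recall@k measure. Everything here is an assumption for illustration: the corpus, the labeled query, the metric names "l2" and "ip", and the helper functions. A real evaluation would use your own queries, your expected source chunks, and the actual index.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def top_k(query, corpus, k, metric):
    # Exact (brute-force) search, so results reflect the metric and not ANN error.
    if metric == "l2":
        ranked = sorted(corpus, key=lambda d: squared_l2(query, corpus[d]))
    elif metric == "ip":
        ranked = sorted(corpus, key=lambda d: dot(query, corpus[d]), reverse=True)
    else:
        raise ValueError(f"unknown metric: {metric}")
    return ranked[:k]

def recall_at_k(benchmark, corpus, k, metric):
    # Fraction of labeled relevant documents that appear in the top k.
    hits = 0
    total = 0
    for query, relevant in benchmark:
        retrieved = set(top_k(query, corpus, k, metric))
        hits += len(retrieved & relevant)
        total += len(relevant)
    return hits / total

# Toy corpus and one labeled query, assumed for illustration only.
corpus = {"d1": [1.0, 0.9], "d2": [5.0, 5.0], "d3": [-1.0, 0.0]}
benchmark = [([1.0, 1.0], {"d1"})]

for metric in ("l2", "ip"):
    print(metric, recall_at_k(benchmark, corpus, k=1, metric=metric))
```

On this toy data L2 retrieves the labeled document and inner product does not, but that outcome says nothing about real embeddings. The point is the shape of the comparison: the same queries and labels, scored under each candidate metric.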
Common Mistakes
Common mistakes include:
- using L2 because it sounds mathematically familiar
- using inner product because it is fast or common in examples
- forgetting that dot product depends on magnitude
- mixing normalized and unnormalized vectors
- sorting negative dot product in the wrong direction
- copying thresholds from L2 to inner product
- changing the index metric without rebuilding or revalidating
Summary
L2 indexes retrieve vectors with the smallest coordinate distance. Inner product indexes retrieve vectors with the largest dot product, which depends on both direction and magnitude.
Neither is universally better. The correct index metric is the one that matches the embedding model and performs best on your retrieval task. For production search, choose by model guidance and validation data, not by metric popularity.