How to Build “Customers Also Liked” Recommendations With Vector Search

To build “customers also liked” recommendations with vector search, represent each catalog item as an embedding, use the current item as the query, retrieve similar items, then filter and rerank those candidates for the product experience.

The core idea is simple: items that are close in vector space become candidates for the recommendation row.

Short Answer

Build “customers also liked” recommendations by embedding your catalog items, storing the embeddings in a vector database, querying with the current item vector, filtering out ineligible products, reranking the candidates with business and user signals, and measuring whether users actually engage with the results.

Vector search gives you fast candidate retrieval. Ranking, filters, diversity, and evaluation make the recommendations production-ready.

Step 1: Define the Recommendation Goal

Start by defining what “also liked” means for your product.

It may mean visually similar items, functionally similar items, items often bought together, substitutes, alternatives, or items that users with similar behavior engaged with.

The goal determines which data you embed and how you rank results.

Step 2: Choose the Recommendation Surface

Decide where the recommendations will appear.

Common surfaces include product detail pages, cart pages, checkout pages, search result pages, content pages, and home feeds.

Each surface may need different logic, latency budgets, filters, and ranking.

Step 3: Prepare the Catalog

Create a clean item record for every product or content object.

Include stable item IDs, titles, descriptions, categories, images, tags, brand, price, availability, region, language, popularity, and other metadata needed for filtering and ranking.

Stable IDs are important because recommendations must link back to live catalog objects.

Step 4: Decide What to Embed

The embedding should represent the kind of similarity you want.

For product text similarity, embed title, description, category, attributes, and reviews.

For visual similarity, embed product images. For behavior similarity, learn vectors from clicks, purchases, likes, saves, or co-views.

Step 5: Use Text Embeddings When Meaning Matters

Text embeddings are useful when descriptions, titles, and attributes capture product meaning.

They work well for books, courses, support articles, software listings, technical products, documents, and product catalogs with rich copy.

They may be weaker when style or appearance drives user choice.

Step 6: Use Image Embeddings When Appearance Matters

Image embeddings are useful when customers care about visual similarity.

Examples include clothing, furniture, decor, art, food, beauty products, real estate, and marketplace listings.

Image similarity can find related items even when text metadata is inconsistent.

Step 7: Use Behavioral Signals When You Have Enough Data

Behavioral signals capture what users actually do.

Co-clicks, co-purchases, likes, saves, cart additions, watch time, and repeat engagement can reveal relationships that content alone misses.

Behavioral embeddings usually become more useful as interaction volume grows.

Step 8: Support Multiple Vectors if Needed

Some products need more than one vector.

A fashion item might have one vector for image style and another for text attributes. A marketplace listing might need vectors for title, image, seller description, and behavior.

Multiple vectors let the system choose the right similarity signal for each recommendation surface.

Step 9: Store Vectors With Metadata

Store item vectors with the metadata needed for filtering and display.

At minimum, keep item ID, title, URL, image URL, category, status, availability, region, price, and any eligibility fields.

This avoids a slow second lookup for basic recommendation rows.

Step 10: Build the Vector Index

A vector index makes nearest-neighbor retrieval fast.

For large catalogs, approximate nearest neighbor (ANN) search is usually needed because comparing the query against every item exactly is too slow at serving time.

Tune the index for your latency, recall, throughput, memory, and update requirements.

Step 11: Query With the Current Item

For a product detail page, use the current product as the query.

The retrieval system searches for items near that product vector.

This is the simplest way to build “customers also liked” or “similar items” recommendations.

Step 12: Exclude the Current Item

The current item should not appear in its own recommendation row.

Filter it out by ID.

Also exclude duplicate variants when showing them would make the row feel repetitive or broken.
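
Steps 11 and 12 can be sketched in a few lines. The example below uses a brute-force cosine scan as a stand-in for a real vector index, which is only reasonable for small catalogs or tests. The `vectors` mapping and the function names are assumptions for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similar_items(item_id, vectors, k=50):
    """Return up to k (item_id, similarity) pairs, most similar first.

    `vectors` maps item ID to embedding. The current item is excluded by ID.
    """
    query = vectors[item_id]
    scored = [
        (other_id, cosine(query, vec))
        for other_id, vec in vectors.items()
        if other_id != item_id  # Step 12: never recommend the item to itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

In production the scan would be replaced by a query against the vector index, with the current item's ID passed as an exclusion filter or removed from the results afterwards.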

Step 13: Apply Eligibility Filters

Filter out items that should not be recommended.

Common filters include out-of-stock products, restricted items, unsupported regions, inactive listings, hidden content, expired offers, wrong language, wrong tenant, or content the user cannot access.

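Enforce eligibility filters before display. Where the vector database supports metadata filtering, apply them during retrieval so ineligible items do not use up the candidate budget.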
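
A minimal sketch of an eligibility check applied to retrieved candidates follows. The metadata fields (status, in_stock, regions, language) are hypothetical and stand in for whatever eligibility fields your catalog actually stores.

```python
def is_eligible(item, *, region, language):
    """Return True only if the item passes every hard eligibility rule.

    Missing fields are treated as failing, so incomplete records are not shown.
    """
    return (
        item.get("status") == "active"
        and item.get("in_stock", False)
        and region in item.get("regions", ())
        and item.get("language") == language
    )


def filter_eligible(candidates, *, region, language):
    """Keep the original candidate order and drop ineligible items."""
    return [c for c in candidates if is_eligible(c, region=region, language=language)]
```

Treating a missing field as ineligible is a deliberate choice here: it is usually safer to drop an item than to recommend something the user cannot buy or access.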

Step 14: Apply Category Logic

Decide whether recommendations should stay within the same category.

For substitutes, same-category recommendations usually make sense. For cross-sell, related categories may be better.

Category filters should reflect the surface and user intent.

Step 15: Retrieve More Candidates Than You Show

Do not retrieve only the number of items you plan to display.

If the row shows 8 items, retrieve a larger candidate set such as 50 or 100.

This gives filtering, reranking, and diversity logic enough room to work.

Step 16: Rerank Candidates

Vector similarity should usually be the first ranking signal, not the only one.

Rerank candidates using popularity, freshness, price, margin, availability, rating, conversion rate, user preferences, compatibility, and business rules.

The final ranking should match the product goal, not only vector distance.
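
One simple way to combine signals is a weighted sum, sketched below. The weights, the three signals, and the assumption that similarity and popularity are already scaled to the 0 to 1 range are all illustrative; real weights would be tuned against offline and online metrics.

```python
# Hypothetical weights; tune them against your own evaluation data.
WEIGHTS = {"similarity": 0.6, "popularity": 0.25, "rating": 0.15}


def rerank_score(candidate, weights=WEIGHTS):
    """Blend vector similarity with business signals into one score.

    Assumes similarity and popularity are in [0, 1] and rating is on a 0-5 scale.
    """
    return (
        weights["similarity"] * candidate["similarity"]
        + weights["popularity"] * candidate["popularity"]
        + weights["rating"] * (candidate["rating"] / 5.0)
    )


def rerank(candidates, weights=WEIGHTS):
    """Return candidates sorted by blended score, best first."""
    return sorted(candidates, key=lambda c: rerank_score(c, weights), reverse=True)
```

A linear blend is easy to debug and explain. A learned ranking model is a common next step once enough interaction data exists.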

Step 17: Add Diversity

Similarity search can return near-duplicates.

Diversity logic prevents the row from showing the same item style, brand, color, category, or creator repeatedly.

Use category caps, brand caps, clustering, or diversity reranking to make the row more useful.
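
A brand or category cap is the simplest of these techniques. The sketch below walks an already ranked list and skips items once a value has appeared too often. The function name and the `key` field are assumptions for this example.

```python
from collections import Counter


def apply_cap(ranked, key, max_per_value, limit):
    """Build a row of at most `limit` items with at most `max_per_value` per key value.

    `ranked` must already be sorted best first; relative order is preserved.
    """
    counts = Counter()
    row = []
    for item in ranked:
        value = item.get(key)
        if counts[value] >= max_per_value:
            continue  # this brand/category is already represented enough
        counts[value] += 1
        row.append(item)
        if len(row) == limit:
            break
    return row
```

Because the cap discards candidates, it only works well when retrieval returned more candidates than the row displays.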

Step 18: Add Freshness

Freshness helps new and recently updated items appear.

Use timestamps, launch dates, inventory updates, seasonal tags, and trending signals to avoid stale recommendations.

Freshness should be balanced with relevance.
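
One way to express that balance is an exponential decay on item age blended with the relevance score. The half-life and the blend weight below are hypothetical values chosen for the example, not recommendations.

```python
def freshness_score(age_days, half_life_days=30.0):
    """Return 1.0 for a brand-new item, halving every `half_life_days`."""
    return 0.5 ** (max(age_days, 0.0) / half_life_days)


def blend_with_freshness(relevance, age_days, freshness_weight=0.2, half_life_days=30.0):
    """Mix relevance (assumed to be in [0, 1]) with freshness using a fixed weight."""
    fresh = freshness_score(age_days, half_life_days)
    return (1.0 - freshness_weight) * relevance + freshness_weight * fresh
```

With a small freshness weight, a new item can overtake an older one of similar relevance, but it cannot outrank a clearly more relevant item.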

Step 19: Add Personalization Carefully

“Customers also liked” can be item-based, personalized, or both.

A simple version shows the same recommendations for every viewer of an item. A personalized version reranks those candidates based on the current user or session.

Personalization can help, but it should not override the item relationship completely unless the surface is designed for it.

Step 20: Use Session Signals

Session signals are useful when the user is anonymous or has changing intent.

If the user recently clicked hiking backpacks, similar backpack recommendations should probably be boosted.

A session vector can be built from recent interactions and used for reranking or blending.
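
A minimal sketch of one way to build such a vector is a recency-weighted average of the embeddings of recently viewed items. The decay factor and the "most recent first" ordering are assumptions for this example.

```python
def session_vector(recent_vectors, decay=0.5):
    """Average item vectors with weights 1, decay, decay**2, ... (most recent first).

    Returns None when the session has no interactions yet.
    """
    if not recent_vectors:
        return None
    dim = len(recent_vectors[0])
    total = [0.0] * dim
    weight_sum = 0.0
    for position, vec in enumerate(recent_vectors):
        weight = decay ** position  # older interactions count less
        weight_sum += weight
        for i, value in enumerate(vec):
            total[i] += weight * value
    return [value / weight_sum for value in total]
```

The resulting vector can be compared with each candidate to produce a session-affinity score for reranking, or blended with the current item vector before retrieval.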

Step 21: Handle Cold Start Items

New items often lack interaction data.

Content and image embeddings help recommend them immediately based on description, appearance, and attributes.

Behavioral ranking can be added later as interaction data accumulates.

Step 22: Handle Sparse Metadata

Vector search can help when product tags are incomplete.

Embeddings can find similar items from text, images, or behavior even when shared metadata is missing.

Still, critical fields such as availability and access permissions should be reliable.

Step 23: Build a Fallback Strategy

Sometimes retrieval returns too few good candidates.

Fallbacks can use popular items in the same category, trending items, editorial picks, items from recently viewed categories, or a broader vector search with relaxed soft filters such as category. Hard eligibility filters should stay in place.

Fallbacks should be visible in metrics so quality issues are not hidden.
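
The sketch below fills a short row from a fallback list and tags every slot with its source, so that fallback frequency can be logged and monitored. The source labels and function name are assumptions for illustration.

```python
def fill_with_fallback(primary_ids, fallback_ids, size):
    """Return up to `size` (item_id, source) pairs.

    Vector results come first; fallback items fill remaining slots without duplicates.
    """
    row = []
    seen = set()
    for item_id in primary_ids:
        if len(row) >= size:
            break
        if item_id not in seen:
            row.append((item_id, "vector"))
            seen.add(item_id)
    for item_id in fallback_ids:
        if len(row) >= size:
            break
        if item_id not in seen:
            row.append((item_id, "fallback"))
            seen.add(item_id)
    return row
```

Counting the share of "fallback" slots per request gives a direct measure of how often retrieval falls short.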

Step 24: Log Recommendation Events

Log impressions, clicks, saves, purchases, skips, add-to-cart events, dwell time, and conversions.

These events help evaluate the recommendation row and can later improve user, item, or behavior embeddings.

Without event logging, recommendation quality is hard to improve.

Step 25: Evaluate Offline

Offline evaluation helps compare retrieval and ranking changes before deployment.

Useful metrics include Recall@K, Precision@K, nDCG, mean reciprocal rank, coverage, novelty, and diversity.

Offline metrics are useful, but they do not replace live testing.
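
For reference, three of these metrics can be computed as follows. The sketch assumes binary relevance (an item is either relevant or not) and a recommendation list without duplicates. Graded relevance would need a different gain term in nDCG.

```python
import math


def recall_at_k(recommended, relevant, k):
    """Fraction of all relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)


def precision_at_k(recommended, relevant, k):
    """Fraction of the top k slots filled with relevant items."""
    if k <= 0:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def ndcg_at_k(recommended, relevant, k):
    """Binary-relevance nDCG: hits near the top count more than hits lower down."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(recommended[:k])
        if item in relevant
    )
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg else 0.0
```

Here `relevant` would typically be the set of items a user later clicked or purchased in held-out data.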

Step 26: Evaluate Online

Online evaluation measures user behavior.

Track click-through rate, conversion rate, add-to-cart rate, revenue per session, saves, watch time, and downstream satisfaction.

A recommendation that scores well offline may still fail if users do not engage.
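
As a small example, click-through rate per experiment variant can be computed from logged events like this. The event shape (a `type` and a `variant` field) is an assumption for the sketch, and a real comparison would also need a significance test.

```python
from collections import defaultdict


def ctr_by_variant(events):
    """Return {variant: clicks / impressions} for variants with at least one impression."""
    impressions = defaultdict(int)
    clicks = defaultdict(int)
    for event in events:
        if event["type"] == "impression":
            impressions[event["variant"]] += 1
        elif event["type"] == "click":
            clicks[event["variant"]] += 1
    return {
        variant: clicks[variant] / count
        for variant, count in impressions.items()
        if count > 0
    }
```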

Reference Architecture

A practical architecture includes:

  • catalog ingestion pipeline
  • embedding generation job
  • vector database with item metadata
  • recommendation API
  • eligibility filtering
  • candidate reranking
  • diversity and freshness logic
  • event logging
  • offline and online evaluation loop

Example Request Flow

  1. User opens a product page.
  2. The recommendation API receives the current item ID.
  3. The API fetches or references the item vector.
  4. The vector database retrieves similar candidates.
  5. Filters remove unavailable or ineligible items.
  6. The ranking layer scores candidates.
  7. Diversity logic trims near-duplicates.
  8. The API returns the final recommendation row.
  9. User interactions are logged for evaluation.
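
The request flow above can be sketched end to end as a single function. This is a simplified illustration under several assumptions: the catalog is an in-memory dictionary with hypothetical fields (vector, brand, in_stock, popularity), a brute-force scan stands in for the vector database, and the 0.8/0.2 ranking weights are arbitrary. Event logging is left out.

```python
import math


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(item_id, catalog, row_size=8, candidate_size=50, max_per_brand=2):
    """Return a list of recommended item IDs for the given item."""
    query = catalog[item_id]["vector"]

    # Retrieve: score every other item and keep the closest candidates.
    scored = [
        (cosine(query, item["vector"]), other_id)
        for other_id, item in catalog.items()
        if other_id != item_id
    ]
    scored.sort(reverse=True)
    candidates = scored[:candidate_size]

    # Filter: drop items that cannot be shown.
    eligible = [(sim, oid) for sim, oid in candidates if catalog[oid]["in_stock"]]

    # Rank: blend similarity with popularity (both assumed to be in [0, 1]).
    ranked = sorted(
        eligible,
        key=lambda pair: 0.8 * pair[0] + 0.2 * catalog[pair[1]]["popularity"],
        reverse=True,
    )

    # Diversify: cap how many items a single brand may contribute.
    row, per_brand = [], {}
    for _, oid in ranked:
        brand = catalog[oid]["brand"]
        if per_brand.get(brand, 0) >= max_per_brand:
            continue
        per_brand[brand] = per_brand.get(brand, 0) + 1
        row.append(oid)
        if len(row) == row_size:
            break
    return row
```

Keeping each stage as a separate step makes it easier to log how many candidates survive each one, which is useful when a row comes back short.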

Choosing the Candidate Size

The candidate set should be larger than the display count.

If the UI needs 6 results, retrieving 30 to 100 candidates gives downstream logic room to filter and diversify.

Larger candidate sets can improve quality but may increase latency and reranking cost.

Choosing the Similarity Metric

Use the distance or similarity metric expected by the embedding model.

Common choices include cosine similarity, dot product, and Euclidean distance.

Changing the metric without testing can silently degrade recommendation quality.
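
The small example below shows why the metric matters. With unnormalized vectors, dot product rewards vector length, so it can rank a long vector above one that points in nearly the same direction as the query. After normalizing to unit length, dot product, cosine similarity, and Euclidean distance all produce the same ordering. The vectors are made up for illustration and are assumed to be non-zero.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def cosine(a, b):
    return dot(a, b) / (norm(a) * norm(b))


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def normalize(a):
    """Scale a non-zero vector to unit length."""
    length = norm(a)
    return [x / length for x in a]


query = [1.0, 0.0]
long_vec = [10.0, 10.0]   # 45 degrees away from the query, but large
close_vec = [0.9, 0.1]    # nearly the same direction as the query

# Raw dot product prefers the long vector; cosine prefers the close one.
raw_dot_prefers_long = dot(query, long_vec) > dot(query, close_vec)
cosine_prefers_close = cosine(query, close_vec) > cosine(query, long_vec)
```

If the embedding model documents a specific metric, or states that its outputs are already normalized, configure the index to match.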

Preventing Repetitive Rows

A common failure is showing products that are too similar.

For example, a row may show eight versions of the same shirt in slightly different colors.

Use variant grouping, diversity reranking, or category limits to make the row more useful.
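
Variant grouping can be as simple as keeping the best-ranked item from each variant group. The sketch assumes each item may carry a hypothetical `parent_id` field shared by all of its variants. Items without one are treated as their own group.

```python
def collapse_variants(ranked, group_key="parent_id"):
    """Keep only the first (best-ranked) item from each variant group."""
    seen_groups = set()
    row = []
    for item in ranked:
        # Fall back to the item's own ID when it has no variant group.
        group = item.get(group_key) or item["id"]
        if group in seen_groups:
            continue
        seen_groups.add(group)
        row.append(item)
    return row
```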

Preventing Bad Substitutes

Similarity does not always mean substitutability.

A product can look similar but be the wrong size, incompatible, unavailable, unsafe, or inappropriate for the user.

Use filters and rules for hard constraints.

Adding Explanations

Recommendation explanations can improve trust and debugging.

Examples include “similar style,” “same category,” “popular with customers who viewed this,” “matches your recent views,” or “compatible accessory.”

Explanations should be based on real signals, not generic labels.

Monitoring Production Quality

Monitor empty recommendation rows, filter drop-off, latency, candidate count, click-through rate, conversion rate, duplicate rate, diversity, and fallback frequency.

Also track embedding version, index version, and model changes.

Recommendation systems can degrade when catalog data or user behavior changes.

Common Mistakes

  • embedding only product titles when images drive similarity
  • not excluding the current item
  • not filtering unavailable products
  • retrieving too few candidates
  • ranking only by vector distance
  • showing near-duplicate variants
  • not logging impressions and clicks
  • not testing with real catalog scale
  • ignoring cold-start items
  • forgetting access control and regional eligibility

Minimum Viable Version

A minimum viable version can be simple.

Embed item title, description, category, and image if available. Store item metadata and vectors. Query with the current item. Exclude the current item and unavailable products. Return the top similar candidates.

Then add reranking, diversity, personalization, and evaluation as the system matures.

Summary

Building “customers also liked” recommendations with vector search starts with item embeddings and nearest-neighbor retrieval.

A production system also needs filters, reranking, diversity, freshness, event logging, and evaluation.

Vector search supplies the candidate set; the rest of the recommendation pipeline turns similar items into useful, safe, and measurable recommendations.