E-Commerce | Semantic Search | CASE_06

Replacing keyword search with semantic understanding across a 2M-SKU catalog

A large EU fashion retailer (~2M-SKU catalog, engagement in 2024) had a search problem: 23% of queries returned zero results and roughly another 40% returned the wrong category. Shoppers used natural language ("flowy summer dress with pockets") while the catalog was tagged with supplier codes and rigid taxonomy terms. A fine-tuned bi-encoder (sentence-transformer base, trained on about 12 months of the retailer's own query-product click stream) now drives dense vector search over the full 2M SKUs in Qdrant, and a lightweight re-ranking layer applies business rules (margin, stock level, trend score) without degrading relevance. Results: p95 latency of 78ms, zero-result queries down from 23% to 2%, and search-to-purchase conversion up 34%.

23%→2%

Zero-result queries

+34%

Search-to-purchase conversion

<80ms

p95 search latency

The Challenge

23% of search queries returned zero results, not because the products didn't exist, but because customers used natural language ("flowy summer dress with pockets") while the catalog was tagged with supplier codes and rigid taxonomy terms. Another 40% of queries returned results from the wrong category entirely.

Our Approach

We proposed replacing the keyword index with a bi-encoder semantic search architecture. The feasibility work confirmed that the catalog could be fully re-embedded in 72 hours on available infrastructure. The architecture has three parts: a fine-tuned sentence-transformer embedding model (trained on the retailer's own query-product click data), a dense vector index in Qdrant, and a lightweight re-ranking layer that incorporates business rules (margin, stock level, trend score) without degrading relevance.
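To make the click-data training step concrete, here is a minimal sketch of turning raw click events into (query, product text) positive pairs for bi-encoder fine-tuning. The log shape, the `min_clicks` threshold, and the function name are illustrative assumptions, not the retailer's actual schema or pipeline.

```python
from collections import Counter


def build_training_pairs(click_events, product_titles, min_clicks=2):
    """Turn click events into (query, product_text) positive pairs.

    click_events:   iterable of (query, sku) tuples (assumed log format).
    product_titles: dict mapping sku -> product text (hypothetical lookup).
    min_clicks:     assumed noise filter; pairs clicked fewer times are dropped.
    """
    # Normalize queries so trivial variants ("Red Boots ", "red boots") merge.
    counts = Counter((query.strip().lower(), sku) for query, sku in click_events)

    pairs = []
    for (query, sku), n in counts.items():
        # Keep only repeated clicks on products we have text for.
        if n >= min_clicks and sku in product_titles:
            pairs.append((query, product_titles[sku]))
    return sorted(pairs)


if __name__ == "__main__":
    events = [
        ("Flowy summer dress", "sku1"),
        ("flowy summer dress ", "sku1"),
        ("flowy summer dress", "sku2"),  # single click, treated as noise
        ("red boots", "sku3"),
        ("red boots", "sku3"),
    ]
    titles = {
        "sku1": "Floral midi dress with pockets",
        "sku2": "Denim jacket",
        "sku3": "Red leather ankle boots",
    }
    for pair in build_training_pairs(events, titles):
        print(pair)
```

Pairs like these would typically feed a contrastive objective in which the other products in a batch act as negatives. That is a common choice for this kind of data, stated here as an assumption about how such training might be set up.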

Outcome

Zero-result queries dropped from 23% to 2%. Search-to-purchase conversion increased 34%. p95 latency was 78ms on the full 2M-SKU catalog. The business-rules re-ranking layer let the merchandising team influence results without touching the model.
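One way such a re-ranking layer can work is sketched below: the vector-search relevance score is blended with business signals under a small, capped weight, so merchandising can tune a config value while relevance still dominates. The field names, the 0.15 default weight, the even margin/trend split, and the out-of-stock filter are all assumptions for illustration, not the production rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    sku: str
    relevance: float  # similarity from vector search, assumed scaled to [0, 1]
    margin: float     # assumed normalized to [0, 1]
    stock: int        # units on hand
    trend: float      # assumed normalized to [0, 1]


@dataclass(frozen=True)
class RerankConfig:
    business_weight: float = 0.15  # hypothetical cap on business influence
    margin_share: float = 0.5      # split of the business score: margin vs trend
    top_k: int = 10


def rerank(candidates, config=RerankConfig()):
    """Re-order retrieved candidates with a relevance-dominated blended score."""

    def score(c):
        business = (config.margin_share * c.margin
                    + (1 - config.margin_share) * c.trend)
        return ((1 - config.business_weight) * c.relevance
                + config.business_weight * business)

    # Assumed rule: items with no stock are dropped rather than demoted.
    in_stock = [c for c in candidates if c.stock > 0]
    return sorted(in_stock, key=score, reverse=True)[: config.top_k]


if __name__ == "__main__":
    results = [
        Candidate("A", 0.90, 0.1, 5, 0.1),
        Candidate("B", 0.88, 0.9, 5, 0.9),
        Candidate("C", 0.95, 0.5, 0, 0.5),  # out of stock
        Candidate("D", 0.40, 1.0, 9, 1.0),  # high margin, weak match
    ]
    print([c.sku for c in rerank(results)])
```

In this sketch a high-margin but weakly relevant item (D) cannot overtake a strong match, because the business term contributes at most `business_weight` to the final score. Changing the config changes the ordering without retraining or redeploying the embedding model.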

What We Learned

Fine-tuning on your own click data outperforms generic embeddings for domain-specific catalogs.

Business rules and ML can coexist; the re-ranking layer is the right place for them.

Indexing 2M vectors is a data pipeline problem, not a model problem.
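The last point can be illustrated with a rough sketch of a batched embed-and-upsert loop, plus the throughput needed to re-embed 2M SKUs within the 72-hour window from the feasibility work. The `embed_batch` and `upsert_batch` callables are hypothetical stand-ins for the embedding model and the vector-store client, and the batch size is an assumption.

```python
from itertools import islice


def batched(items, batch_size):
    """Yield successive lists of up to batch_size items."""
    it = iter(items)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def required_throughput(n_items, hours):
    """Items per second needed to process n_items within the given hours."""
    return n_items / (hours * 3600)


def run_pipeline(skus, embed_batch, upsert_batch, batch_size=256):
    """Embed and upsert SKUs batch by batch; returns the number processed.

    embed_batch:  callable(list of skus) -> list of vectors (hypothetical).
    upsert_batch: callable(list of (sku, vector)) -> None (hypothetical).
    """
    done = 0
    for batch in batched(skus, batch_size):
        vectors = embed_batch(batch)
        upsert_batch(list(zip(batch, vectors)))
        done += len(batch)
    return done


if __name__ == "__main__":
    # 2M SKUs in 72 hours works out to under 8 vectors per second sustained.
    print(f"{required_throughput(2_000_000, 72):.2f} vectors/sec")

    store = []
    n = run_pipeline(range(1000), lambda b: [[float(x)] for x in b], store.extend)
    print(n, len(store))
```

The sustained rate itself is modest, which is consistent with the lesson above: the effort tends to go into batching, retries, and keeping the index in sync with catalog changes rather than into the model.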

Stages Engaged

Feasibility Call

Discovery & Blueprint

Concept Validation

Production Build

Total Duration

4 months

Artifacts Delivered

PRD

Search Architecture Blueprint

Embedding Model Spec

WBS

SOW
