HISS: A novel hybrid inference architecture in embedding-based product sourcing using knowledge distillation
Semantic Sourcing is a well-studied area in web and product search for improving the quality of search results. In the context of Semantic Sourcing in e-commerce search, transformer-based models such as BERT (fine-tuned for relevance) can encode queries into a semantic space in which semantically equivalent entities (queries for Query Reformulation (QR), or products for the direct Semantic Sourcing application) lie in the neighbourhood of the given query. Although BERT achieves state-of-the-art performance, it incurs a latency cost to compute the embedding, making it unsuitable for real-time reformulation, where a Deep Semantic Search Model (DSSM), a simple architecture comprising a word-embedding layer followed by a mean-pooling layer, is more suitable. In this work, we demonstrate that (1) applying knowledge distillation to transfer knowledge from SBERT (BERT fine-tuned for relevance) to DSSM improves AUC by 2.03% on the query-product relevance task compared to training DSSM directly on the relevance data, and (2) HISS, a Hybrid Inference architecture in Semantic Search in which the distilled DSSM is used in conjunction with BERT via an alignment loss, improves AUC by a further 0.8-1.2% over the distilled-DSSM-only model.
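The DSSM student and the distillation objective described above can be sketched in a few lines. This is an illustrative NumPy mock-up under stated assumptions, not the paper's implementation: the vocabulary size, embedding dimension, token ids, and the use of a plain MSE alignment loss toward a frozen SBERT teacher embedding are all assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM = 100, 8  # illustrative sizes, not the paper's configuration
emb_table = rng.normal(size=(VOCAB, DIM))  # word-embedding layer

def dssm_encode(token_ids):
    """DSSM student: look up token embeddings, then mean-pool them."""
    return emb_table[token_ids].mean(axis=0)

def distill_loss(student_vec, teacher_vec):
    """MSE alignment loss pulling the student embedding toward the
    (frozen) teacher embedding -- one common distillation objective."""
    return float(np.mean((student_vec - teacher_vec) ** 2))

query = [3, 17, 42]                 # toy token ids for a query
teacher = rng.normal(size=DIM)      # stand-in for an SBERT query embedding
loss = distill_loss(dssm_encode(query), teacher)
```

In training, this loss would be minimized over query pairs so that the cheap mean-pool encoder inherits the geometry of the SBERT semantic space, which is what allows it to serve real-time reformulation traffic.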