Evaluating auto-complete ranking for diversity and relevance
2025
Traditional Query Auto-completion (QAC) systems optimise for query relevance based on past user interactions. This approach excels at surfacing frequently searched queries, but ensuring a diverse range of suggestions and incorporating new products or trends often requires post-processing heuristics. This limitation stems from relying on user search logs, which may not fully capture the evolving product landscape. This paper presents a comparison of traditional state-of-the-art (SOTA) methods with Large Language Models (LLMs) for query auto-completion. LLMs, with their ability to understand language and product information, can generate a wider range of relevant and diverse suggestions, encompassing the entire product catalog and potentially including entirely new queries. Here, we study the trade-off between latency, relevance, and comprehensiveness in QAC systems. Our experiments on real-world data show that the LLM-based QAC system offers a significant 38% boost in diversity with no significant change in relevance metrics. However, its high compute and memory demands make it less suitable for real-time applications. We introduce a heuristic approach that integrates the strengths of existing methods with the power of LLMs. In live A/B testing, this combined approach yielded a measurable 0.13% increase in sales revenue.
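The abstract does not spell out how the heuristic combines the two sources. A minimal sketch of one plausible strategy is shown below: fill the top slots with frequency-ranked log-based suggestions (high relevance, low latency) and backfill the remaining slots with deduplicated LLM-generated candidates (diversity, novel queries). The helpers `log_based_suggestions` and `llm_suggestions` are hypothetical stand-ins for the paper's unspecified production components, not the authors' actual method.

```python
def merge_suggestions(prefix: str,
                      log_based_suggestions: list[str],
                      llm_suggestions: list[str],
                      k: int = 10,
                      llm_slots: int = 3) -> list[str]:
    """Return up to k suggestions: log-based first, then LLM backfill.

    A sketch only; slot counts and dedup policy are assumptions.
    """
    seen: set[str] = set()
    merged: list[str] = []

    # Reserve the top positions for log-based suggestions (high relevance).
    for s in log_based_suggestions:
        if len(merged) >= k - llm_slots:
            break
        key = s.strip().lower()
        if key not in seen:
            seen.add(key)
            merged.append(s)

    # Backfill remaining slots with LLM candidates (diversity / novelty).
    for s in llm_suggestions:
        if len(merged) >= k:
            break
        key = s.strip().lower()
        if key not in seen:
            seen.add(key)
            merged.append(s)

    # If the LLM pool runs short, top up with remaining log-based suggestions.
    for s in log_based_suggestions:
        if len(merged) >= k:
            break
        key = s.strip().lower()
        if key not in seen:
            seen.add(key)
            merged.append(s)

    return merged


if __name__ == "__main__":
    print(merge_suggestions(
        "wireless he",
        log_based_suggestions=["wireless headphones", "wireless headset"],
        llm_suggestions=["wireless hearing aid accessories",
                         "wireless headphones for kids"],
    ))
```

Because the LLM candidates are generated offline or cached rather than invoked per keystroke, a scheme like this keeps serving latency close to the traditional system's while still surfacing catalog-grounded suggestions absent from the search logs.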