Web search engines process billions of queries daily, making the balance between computational efficiency and ranking quality crucial. While neural ranking models have shown impressive performance, their computational costs, particularly in feature extraction, pose significant challenges for large-scale deployment. This paper investigates how different configurations of feature selection and document filtering in neural cascade ranking systems influence the trade-off between computational cost and ranking performance.
We propose a two-stage neural cascade architecture in which both stages use Multi-Layer Perceptrons (MLPs). The first stage scores all candidate documents with a reduced feature set, and the second stage applies a more sophisticated model to only the top-ranked documents. This design lets us systematically vary the feature-selection and document-filtering configuration and measure its effect on computational cost and ranking performance.
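To make the architecture concrete, the following is a minimal sketch of such a two-stage MLP cascade in PyTorch. The feature-subset sizes, hidden width, and top-k cutoff are hypothetical choices for illustration, not the configurations studied in the paper.

```python
# Illustrative two-stage cascade: cheap MLP over all documents, expensive MLP over top-k.
# Feature dimensions, hidden width, and k below are assumed values, not the paper's settings.
import torch
import torch.nn as nn

class MLPRanker(nn.Module):
    """Pointwise MLP scorer: maps a document feature vector to a relevance score."""
    def __init__(self, num_features: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).squeeze(-1)

def cascade_rank(cheap_feats, full_feats, stage1, stage2, k=10):
    """Score every document with the cheap first stage, then re-score
    only the top-k survivors with the more expensive second stage."""
    with torch.no_grad():
        s1 = stage1(cheap_feats)                       # first-stage scores for all documents
        topk = torch.topk(s1, k=min(k, s1.numel())).indices
        s2 = stage2(full_feats[topk])                  # expensive features extracted only for top-k
        order = topk[torch.argsort(s2, descending=True)]
    return order                                       # document indices, best first

# Toy usage: 1000 candidates, 20 cheap vs. 700 full features (hypothetical sizes).
cheap = torch.randn(1000, 20)
full = torch.randn(1000, 700)
ranking = cascade_rank(cheap, full, MLPRanker(20), MLPRanker(700), k=10)
```

The key cost lever is that second-stage (full) features only need to be extracted for the k documents that survive the first stage.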
Through extensive experiments on three large-scale datasets (Yahoo, Istella, and Microsoft MSLR-WEB30K), we demonstrate significant opportunities for cost reduction with minimal impact on ranking quality. Optimal cascade configurations reduce feature-extraction cost by up to 40.37% on the Yahoo dataset and 16% on the MSLR-WEB30K dataset while maintaining nearly identical NDCG@10. Furthermore, we identify clear patterns of diminishing returns in ranking performance as computational resources increase, providing valuable insights for building resource-efficient ranking systems in large-scale web search environments.
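Since ranking quality is reported as NDCG@10, a standard computation of that metric is sketched below. The exponential-gain form is a common convention for graded-relevance benchmarks such as these; the paper's exact gain and discount choices are assumed here rather than taken from the text.

```python
# Standard NDCG@10 on graded relevance labels (exponential gain, log2 discount) - assumed convention.
import math

def dcg_at_k(rels, k=10):
    """Discounted cumulative gain over the first k labels in ranked order."""
    return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(rels[:k]))

def ndcg_at_k(ranked_rels, k=10):
    """DCG normalized by the ideal (sorted-descending) ordering of the same labels."""
    ideal = dcg_at_k(sorted(ranked_rels, reverse=True), k)
    return dcg_at_k(ranked_rels, k) / ideal if ideal > 0 else 0.0

# Example: relevance labels of documents in the order the cascade returned them.
print(ndcg_at_k([3, 2, 3, 0, 1, 2, 0, 0, 1, 0]))
```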
Cost-efficiency trade-offs for neural cascade rankers in web search
2025