Learning explainable network request signatures for robot detection
2023
The rapid growth of deep learning models for robot and fraud detection in recent years has led to significant improvements in precision and recall, but has also created a challenge for explainability and trust in model decisions. In this paper, we propose a scalable multitiered framework that generates explainable network-request-level signatures for crawler bots on a large e-commerce advertising program. Depending on the bot traffic distribution, the framework uses a combination of volumetric aggregation, decision trees, and predictive deep learning models based on weak labels to generate precise and explainable bot signatures, achieving 87.9% coverage over a black-box crawler detection system comprising multiple deep learning models and heuristic techniques. We further demonstrate that the learned signatures are more robust over time than traditional IP-level bot denylists and reduce false negatives for the black-box crawler bot detection system. Explainable network signatures also enable manual inspection and help with attributing traffic to bot toolkits, which not only improves trust in the black-box system's decisions but also provides insights into the evolving bot landscape.
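To illustrate one part of the approach described above, the following is a minimal sketch, not the paper's implementation, of how weak labels from a black-box detector could be distilled into readable request-level signatures with a shallow decision tree; the feature names, thresholds, and synthetic data are purely hypothetical.

```python
# Sketch: distill weak labels from a black-box bot detector into explainable
# request-level signatures via a shallow decision tree. All feature names and
# thresholds below are illustrative assumptions, not values from the paper.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)

# Hypothetical request-level features per source:
# [requests_per_minute, distinct_user_agents, avg_inter_request_gap_s]
X = rng.random((1_000, 3)) * [200, 10, 5]

# Stand-in for weak labels produced by the black-box crawler detection system.
weak_labels = (X[:, 0] > 120).astype(int)

# A shallow tree keeps each root-to-leaf path short enough to read as a signature.
tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=50)
tree.fit(X, weak_labels)

# Each root-to-leaf path ending in a bot-majority leaf is a candidate
# human-readable signature that analysts can inspect and attribute.
print(export_text(
    tree,
    feature_names=["requests_per_minute", "distinct_user_agents", "avg_inter_request_gap_s"],
))
```

Constraining tree depth and minimum leaf size in this way trades a small amount of coverage for rules that remain short enough to audit manually, which is the property that makes signatures inspectable and attributable.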