Efficient off-policy evaluation of content blending in station-based music experiences
2025
Audio streaming services, both on voice assistants and in visual apps, often field requests such as 'play more like Foo Fighters.' The service then returns a sequence of tracks that is both relevant to the request and personalized to the requester. While it is natural to evaluate the policies that produce these sequences in terms of customer engagement, such metrics do not assess their performance on other key business goals. We present a content blending strategy that increases the prevalence of specific strategically important content in these sequences while minimizing harm to playback rates. In particular, we describe an efficient extension of off-policy evaluation that estimates how blending affects both overall engagement and the number of successful new release plays. We demonstrate how we used this work to choose blend rates for new policies so as to maximize overall engagement while preserving the new release metric baseline set by the current production policy. We also investigate the accuracy of these methods by comparing our estimates to online results.
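To make the blend-rate selection concrete, the following is a minimal sketch that uses plain inverse propensity scoring, a standard off-policy estimator, and not the efficient extension this work describes. It assumes each logged slot records which content pool it was drawn from, the probability with which the logging policy chose that pool, and whether the track was played successfully. The `LoggedPlay` fields, the function names, the toy log, and the floor value are all hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoggedPlay:
    """One logged slot (hypothetical schema, not taken from the paper)."""
    is_new_release: bool  # pool the logged track was drawn from
    propensity: float     # probability the logging policy chose that pool
    played: bool          # True if the track was played successfully


def ips_estimates(logs, blend_rate):
    """Estimate per-slot (engagement, new release plays) for a candidate policy
    that draws from the new release pool with probability `blend_rate`."""
    engagement = 0.0
    new_release_plays = 0.0
    for rec in logs:
        # Probability that the candidate policy picks the pool that was logged.
        target_prob = blend_rate if rec.is_new_release else 1.0 - blend_rate
        weight = target_prob / rec.propensity  # importance weight
        reward = weight * float(rec.played)
        engagement += reward
        if rec.is_new_release:
            new_release_plays += reward
    n = len(logs)
    return engagement / n, new_release_plays / n


def choose_blend_rate(logs, candidates, new_release_floor):
    """Return (rate, engagement, new_release_plays) for the candidate with the
    highest estimated engagement whose estimated new release plays stay at or
    above `new_release_floor`, or None if no candidate qualifies."""
    best = None
    for rate in candidates:
        engagement, new_release_plays = ips_estimates(logs, rate)
        if new_release_plays < new_release_floor:
            continue
        if best is None or engagement > best[1]:
            best = (rate, engagement, new_release_plays)
    return best


# Toy log: the logging policy picked each pool with probability 0.5.
logs = [
    LoggedPlay(is_new_release=True, propensity=0.5, played=True),
    LoggedPlay(is_new_release=True, propensity=0.5, played=False),
    LoggedPlay(is_new_release=False, propensity=0.5, played=True),
    LoggedPlay(is_new_release=False, propensity=0.5, played=True),
]

# Assumed baseline: at least 0.08 successful new release plays per slot.
best = choose_blend_rate(logs, candidates=[0.1, 0.2, 0.3, 0.4],
                         new_release_floor=0.08)
```

In this toy log new releases are played less often than catalog tracks, so estimated engagement falls as the blend rate rises, and the search settles on the smallest candidate rate that still meets the new release floor. A production estimator would also need variance control, for example weight clipping or self-normalization, which this sketch omits.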