CoSearcher: Studying the effectiveness of conversational search refinement and clarification through user simulation
A key application of conversational search is refining a user's search intent by asking a series of clarification questions, aiming to improve the relevance of search results. Training and evaluating such conversational systems currently requires human participation, making it infeasible to examine a wide range of user behaviors. To support robust training and evaluation of such systems, we propose a simulation framework called CoSearcher (code and resources available at https://github.com/amzn/cosearcher) that includes a parameterized user simulator controlling key behavioral factors such as cooperativeness and patience. To evaluate our approach, we both use a standard conversational query clarification benchmark and develop an extended dataset that draws on query suggestions from a popular Web search engine as a source of additional refinement candidates. Using these datasets, we investigate how a variety of conditions affect search refinement and clarification effectiveness across a wide range of user behaviors, semantic policies, and dynamic facet generation. Our results quantify the effects of user behavior variation and identify the conditions required for conversational search refinement and clarification to be effective. This paper is an extended version of our previous work; it adds new experimental results comparing semantic similarity ranking strategies for facets, using enhanced representations of facets, and learning from negative user responses, among other new results, along with more detailed experimental descriptions.
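To make the idea of a parameterized user simulator concrete, the sketch below shows one minimal way such a simulator could be structured. This is a hypothetical illustration, not the actual CoSearcher implementation: the class name, facet representation, and response vocabulary are all assumptions. Cooperativeness is modeled as the probability of giving an informative answer, and patience as the maximum number of clarification turns the simulated user tolerates before abandoning the conversation.

```python
import random

class SimulatedUser:
    """Hypothetical sketch of a parameterized user simulator.

    The simulated user holds a hidden search intent (a set of facets)
    and answers clarification questions of the form "Did you mean
    <facet>?". Two behavioral knobs are exposed:
      - cooperativeness: probability of giving an informative answer
      - patience: maximum clarification turns before giving up
    """

    def __init__(self, intent_facets, cooperativeness=0.9, patience=3, seed=0):
        self.intent_facets = set(intent_facets)  # hidden search intent
        self.cooperativeness = cooperativeness   # P(informative answer)
        self.patience = patience                 # turns tolerated
        self.turns = 0
        self.rng = random.Random(seed)

    def answer(self, proposed_facet):
        """Respond to one clarification question about a facet."""
        self.turns += 1
        if self.turns > self.patience:
            return "abandon"                     # patience exhausted
        if self.rng.random() > self.cooperativeness:
            return "no answer"                   # uncooperative turn
        return "yes" if proposed_facet in self.intent_facets else "no"

# Toy dialogue: the system probes facets until the user confirms or quits.
user = SimulatedUser({"jaguar car"}, cooperativeness=1.0, patience=2)
print(user.answer("jaguar animal"))  # -> no
print(user.answer("jaguar car"))     # -> yes
```

Sweeping `cooperativeness` and `patience` over a grid of values would let a clarification system be evaluated against many user behaviors without recruiting human participants, which is the motivation the abstract describes.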