Improving contextual query rewrite for conversational AI agents through user-preference feedback learning

Zhongkai Sun; Yingxue Zhou; Jie Hao; Xing Fan; Yanbin Lu; Chengyuan Ma; Wei (Sawyer) Shen; Chenlei (Edward) Guo

2023

Contextual query rewriting (CQR) is a crucial component of conversational AI agents: it uses contextual information from previous user-agent conversations to improve comprehension of the current user intent. However, traditional CQR methods often rely on supervised fine-tuning alone and neglect the opportunity to learn from user feedback and align with user preferences. Inspired by recent advances in learning from human feedback (LHF), this paper proposes a novel Preference Aligned Contextual Query Rewriting (PA-CQR) framework to enhance the CQR model’s ability to generate rewrites aligned with user preferences. The paper also investigates the efficacy of various state-of-the-art feedback learning algorithms on the CQR task and proposes a novel Dynamic Direct Preference Optimization (Dynamic DPO) algorithm that better adapts DPO to large-scale CQR training. Experiments on a large-scale real-world CQR dataset demonstrate the superiority of the proposed PA-CQR framework and the Dynamic DPO algorithm.
