Policy-driven neural response generation for knowledge-grounded dialog systems
Open-domain dialog systems aim to generate relevant, informative and engaging responses. In this paper, we propose using a dialog policy to plan the content and style of target, open domain responses in the form of an action plan, which includes knowledge sentences related to the dialog context, targeted dialog acts, topic information, etc. For training, the attributes within the action plan are obtained by automatically annotating the publicly released TopicalChat dataset. We condition neural response generators on the action plan which is then realized as target utterances at the turn and sentence levels. We also investigate different dialog policy models to predict an action plan given the dialog context. Through automated and human evaluation, we measure the appropriateness of the generated responses and check if the generation models indeed learn to realize the given action plans. We demonstrate that a basic dialog policy that operates at the sentence level generates better responses in comparison to turn level generation as well as baseline models with no action plan. Additionally the basic dialog policy has the added benefit of controllability.