Generating synthetic data for task-oriented semantic parsing with hierarchical representations

Ke Tran; Ming Tan

2020

Modern conversational AI systems support natural language understanding for a wide variety of capabilities. While a majority of these tasks can be accomplished using a simple, flat representation of intents and slots, more sophisticated capabilities require complex hierarchical representations supported by semantic parsing. State-of-the-art semantic parsers are trained using supervised learning with data labeled according to a hierarchical schema, which might be costly to obtain or not readily available for a new domain. In this work, we explore the possibility of generating synthetic data for neural semantic parsing using a pretrained denoising sequence-to-sequence model (i.e., BART). Specifically, we first extract masked templates from the existing labeled utterances, and then fine-tune BART to generate synthetic utterances conditioned on the extracted templates. Finally, we use an auxiliary parser (AP) to filter the generated utterances. The AP guarantees the quality of the generated data. We show the potential of our approach when evaluating on the navigation domain of the Facebook TOP dataset.
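As a rough illustration of the first and last steps described above, the following minimal sketch extracts a masked template from a parse written in the bracketed TOP notation and checks whether a generated utterance's parse keeps the same label structure. The masking rule (collapsing the words directly inside each slot into a single `<mask>`), the function names, and the filtering criterion are all assumptions made for illustration; the abstract does not specify how templates are masked or how the AP decides what to keep.

```python
from typing import List


def extract_masked_template(parse: str, mask: str = "<mask>") -> str:
    """Replace each run of words directly inside a slot with a single mask.

    Assumed masking rule, not necessarily the one used in the paper.
    """
    out: List[str] = []
    stack: List[str] = []  # currently open labels, e.g. "SL:DESTINATION"
    for tok in parse.split():
        if tok.startswith("[IN:") or tok.startswith("[SL:"):
            stack.append(tok[1:])
            out.append(tok)
        elif tok == "]":
            stack.pop()
            out.append(tok)
        elif stack and stack[-1].startswith("SL:"):
            # Word directly inside a slot: emit one mask per run of words.
            if out[-1] != mask:
                out.append(mask)
        else:
            out.append(tok)
    return " ".join(out)


def label_skeleton(parse: str) -> List[str]:
    """Keep only the bracket tokens, i.e. the intent/slot tree shape."""
    return [t for t in parse.split() if t.startswith("[") or t == "]"]


def keep_generated(template: str, predicted_parse: str) -> bool:
    """Hypothetical filter: keep an utterance only if the auxiliary
    parser's output has the same label structure as the template."""
    return label_skeleton(template) == label_skeleton(predicted_parse)


example = "[IN:GET_DIRECTIONS Directions to [SL:DESTINATION the airport ] ]"
template = extract_masked_template(example)
```

In a full pipeline, `template` would be the conditioning input to the fine-tuned generator, and `predicted_parse` would come from running the auxiliary parser on each generated utterance.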
