Semantic complexity in end-to-end spoken language understanding

Joseph McKenna; Samridhi Choudhary; Michael Saxon; Grant Strimel; Thanasis Mouchtaris

2020

End-to-end spoken language understanding (SLU) models are a class of model architectures that predict semantics directly from speech. Because of their input and output types, we refer to them as speech-to-interpretation (STI) models. Previous works have successfully applied STI models to targeted use cases, such as recognizing home automation commands; however, no study has yet addressed how these models generalize to broader use cases. In this work, we analyze the relationship between the performance of STI models and the difficulty of the use case to which they are applied. We introduce empirical measures of data set semantic complexity to quantify the difficulty of SLU tasks. We show that the near-perfect performance metrics for STI models reported in the literature were obtained with data sets that have low semantic complexity values. We perform experiments in which we vary the semantic complexity of a large, proprietary data set and show that STI model performance correlates with our semantic complexity measures, such that performance increases as complexity values decrease. Our results show that it is important to contextualize an STI model’s performance with the complexity values of its training data set to reveal the scope of its applicability.