Identifying and resolving annotation changes for natural language understanding

Jose Garrido Ramas; Giorgio Pessot; Abdalghani Abujabal; Martin Rajman


2021


Annotation conflict resolution is crucial to building machine learning models with acceptable performance. Past work on annotation conflict resolution has assumed that data is collected all at once, with a fixed set of annotators and fixed annotation guidelines. Moreover, previous work dealt with atomic labeling tasks. In this paper, we address annotation conflict resolution for Natural Language Understanding (NLU), a structured prediction task, in a real-world setting of commercial voice-controlled personal assistants, where (1) regular data collections are needed to support new and existing functionalities, (2) annotation guidelines evolve over time, and (3) the pool of annotators changes across data collections. We devise an approach that combines information-theoretic measures with a supervised neural model to resolve conflicts in data annotation. We evaluate our approach both intrinsically and extrinsically on a real-world dataset of 3.5M utterances from a commercial dialog system in German. Our approach leads to dramatic improvements over a majority baseline, especially in contentious cases. On the NLU task, our approach achieves a 2.75% error reduction over a no-resolution baseline.
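The abstract does not specify which information-theoretic measures are used or how the majority baseline is computed. As a rough illustration only, the sketch below assumes that each utterance has a list of annotations collected over time, measures how contentious it is with the Shannon entropy of that list, and resolves it by majority vote. The function names and the serialized annotation strings are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def annotation_entropy(annotations):
    """Shannon entropy (in bits) of the empirical annotation distribution.

    0.0 means all annotations agree; higher values mean more conflict.
    This is an illustrative measure, not necessarily the one in the paper.
    """
    counts = Counter(annotations)
    total = sum(counts.values())
    return sum((c / total) * log2(total / c) for c in counts.values())


def majority_label(annotations):
    """Majority-vote resolution: return the most frequent annotation.

    Ties are broken in favor of the annotation seen first.
    """
    return Counter(annotations).most_common(1)[0][0]


# Hypothetical annotations for one utterance, serialized as strings.
history = [
    "PlayMusic|artist",
    "PlayMusic|artist",
    "PlayMusic|song",
    "PlayMusic|song",
]
conflict = annotation_entropy(history)  # evenly split between two annotations
resolved = majority_label(history)      # tie, so the first-seen annotation wins
```

A plain majority vote ignores when each annotation was made, so it can keep an outdated annotation after a guideline change. That limitation is consistent with the setting the abstract describes, where guidelines and annotators change across data collections.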
