Adversarial named-entity recognition with word attributions and disentanglement
2023
The problem of enhancing the robustness of Named Entity Recognition (NER) models against adversarial attacks has recently gained significant attention (Simoncini and Spanakis, 2021; Lin et al., 2021). Existing techniques for robustifying NER models rely on exhaustive perturbation of the input training data to generate adversarial examples, often producing examples that are not semantically equivalent to the originals. In this paper, we employ word-attribution-guided perturbations that generate adversarial examples with comparable attack rates but at a lower modification rate. Our approach also disentangles entity and non-entity word representations to generate diverse and unbiased adversarial examples. Adversarial training with our method improves the F1 score over the originally trained NER model by 8% on CoNLL-2003 and by 18% on OntoNotes 5.0.
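The core idea of attribution-guided perturbation can be sketched as follows. This is a minimal illustration, not the paper's implementation: the attribution scores and substitution table below are hypothetical stand-ins for a real attribution method (e.g. integrated gradients over the NER model) and a lexical substitution model. It shows how restricting perturbations to the highest-attribution tokens keeps the modification rate low.

```python
from typing import Dict, List, Tuple

def attribution_guided_perturb(
    tokens: List[str],
    attributions: List[float],
    substitutes: Dict[str, str],
    budget: int = 2,
) -> Tuple[List[str], float]:
    """Perturb only the `budget` highest-attribution tokens that have a
    substitute available, rather than exhaustively perturbing every token.
    Returns the perturbed sentence and its modification rate."""
    # Rank token positions by attribution score, highest first.
    ranked = sorted(range(len(tokens)), key=lambda i: attributions[i], reverse=True)
    perturbed = list(tokens)
    changed = 0
    for i in ranked:
        if changed >= budget:
            break
        if tokens[i] in substitutes:
            perturbed[i] = substitutes[tokens[i]]
            changed += 1
    return perturbed, changed / len(tokens)

# Hypothetical saliency scores and substitution table for illustration only.
tokens = ["Obama", "visited", "Berlin", "last", "week"]
attributions = [0.9, 0.2, 0.8, 0.1, 0.1]
substitutes = {"visited": "toured", "Berlin": "Munich", "week": "month"}

adv, mod_rate = attribution_guided_perturb(tokens, attributions, substitutes)
print(adv, mod_rate)  # only the 2 most salient substitutable tokens change
```

Here only "Berlin" and "visited" are replaced (the highest-attribution tokens with available substitutes), giving a modification rate of 0.4 rather than perturbing the full sentence.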