Zero-shot spoken language understanding for English-Hindi: An easy victory against word order divergence
2021
While the strong zero-shot performance of multilingual BERT has been shown to drop when the source and target languages diverge in word order, the problem has rarely been studied to date. In this paper, we explore lightweight techniques to improve BERT-based zero-shot spoken language understanding for English-Hindi, a language pair with divergent word orders. We show that word order divergence can be tackled by reordering the source data to reflect the word order of the target language. In particular, we study two computationally inexpensive methods for reordering the source data to better match the target language: one making use of slot label information, and another making use of syntactic parse trees. Our experiments show that the former, which is simpler and requires no additional resources beyond vanilla zero-shot transfer, obtains surprisingly large improvements on a real-world dataset.
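As an illustration of the parse-tree variant, here is a minimal sketch of source-side reordering under two assumptions that are ours, not the paper's: that spaCy's English dependency parser is available, and that two crude rules (verb-final placement and preposition-to-postposition movement) roughly approximate Hindi surface order. The function name hindi_like_order is hypothetical.

```python
# A toy parse-tree reordering (illustrative only; the paper's exact
# rules are not given in the abstract). Requires spaCy and its small
# English model: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

def hindi_like_order(sentence: str) -> str:
    """Reorder an English sentence toward Hindi-like (SOV,
    postpositional) surface order using its dependency parse."""
    doc = nlp(sentence)
    # Start from the original linear positions ...
    keys = {tok.i: float(tok.i) for tok in doc}
    for tok in doc:
        if tok.pos_ == "VERB":
            # ... push each verb just past the rightmost token it governs
            keys[tok.i] = max(t.i for t in tok.subtree) + 0.5
        elif tok.dep_ == "prep":
            # ... and move each preposition after its object subtree,
            # mimicking Hindi postpositions
            keys[tok.i] = max(t.i for t in tok.subtree) + 0.25
    return " ".join(t.text for t in sorted(doc, key=lambda t: keys[t.i]))

print(hindi_like_order("Book a flight to Delhi for tomorrow"))
# -> "a flight Delhi to tomorrow for Book" (roughly verb-final, postpositional)
```

The slot-label variant from the abstract would be analogous but cheaper: it permutes whole slot spans (e.g., the destination and time chunks of an utterance) rather than parse subtrees, which is why it needs no parser or other resources beyond the slot annotations already present in the training data.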