Distilling multiple domains for neural machine translation
2020
Neural machine translation achieves impressive results in high-resource conditions, but performance often suffers when the input domain is low-resource. The standard practice of adapting a separate model for each domain of interest does not scale well in practice, from both a quality perspective (brittleness under domain shift) and a cost perspective (added maintenance and inference complexity). In this paper, we propose a framework for training a single multi-domain neural machine translation model that can translate several domains without increasing inference time or memory usage. We show that this model can improve translation on both high- and low-resource domains over strong multi-domain baselines. In addition, our proposed model is effective when domain labels are unknown during training, and robust under noisy data conditions.
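The abstract does not spell out the training objective, so the following is only a minimal illustrative sketch of one common way to distill several domain-specific teacher models into a single student NMT model via word-level knowledge distillation; the function name, signature, and hyperparameters (`alpha`, `temperature`, `pad_id`) are assumptions for illustration, not the paper's method or API.

```python
# Hypothetical sketch (not the paper's exact method): distill domain-specific
# teacher translation models into one student model. Each training batch is
# drawn from a single domain, and the student matches that domain's teacher
# distribution (word-level distillation) while also fitting the references.

import torch
import torch.nn.functional as F


def multi_domain_distillation_loss(student_logits, teacher_logits, target_ids,
                                   pad_id=0, alpha=0.5, temperature=1.0):
    """Combine reference cross-entropy with KL divergence toward the teacher.

    student_logits, teacher_logits: (batch, seq_len, vocab) per-token logits
    target_ids: (batch, seq_len) reference token ids
    alpha: assumed weight on the distillation term
    """
    vocab = student_logits.size(-1)
    mask = (target_ids != pad_id).float()

    # Standard cross-entropy against the reference translation.
    ce = F.cross_entropy(student_logits.view(-1, vocab), target_ids.view(-1),
                         reduction="none").view(target_ids.shape)
    ce = (ce * mask).sum() / mask.sum()

    # Word-level distillation: match the teacher's (softened) distribution.
    t_probs = F.softmax(teacher_logits / temperature, dim=-1)
    s_logp = F.log_softmax(student_logits / temperature, dim=-1)
    kl = F.kl_div(s_logp, t_probs, reduction="none").sum(-1)
    kl = (kl * mask).sum() / mask.sum()

    return (1.0 - alpha) * ce + alpha * kl


# Usage: pick the teacher matching the batch's domain label (when available);
# the student itself stays a single model, so inference cost is unchanged.
# loss = multi_domain_distillation_loss(student(batch), teachers[domain](batch),
#                                       batch.target_ids)
```

Because only the student is kept at deployment time, this setup keeps inference time and memory flat regardless of how many domains (or teachers) were used during training, which is the property the abstract highlights.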