Distilling multiple domains for neural machine translation
2020
Neural machine translation achieves impressive results in high-resource conditions, but performance often suffers when the input domain is low-resource. The standard practice of adapting a separate model for each domain of interest does not scale well in practice, from both a quality perspective (brittleness under domain shift) and a cost perspective (added maintenance and inference complexity). In this paper, we propose a framework for training a single multi-domain neural machine translation model that can translate several domains without increasing inference time or memory usage. We show that this model can improve translation on both high- and low-resource domains over strong multi-domain baselines. In addition, our proposed model is effective when domain labels are unknown during training, and robust under noisy data conditions.
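The abstract does not spell out the training objective, so the following is only a minimal illustrative sketch of one common way to distill several domain-specific teacher models into a single student NMT model via word-level knowledge distillation; the function name, signature, and hyperparameters (`alpha`, `temperature`, `pad_id`) are assumptions for illustration, not the paper's method or API.

```python
# Hypothetical sketch (not the paper's exact method): distill domain-specific
# teacher translation models into one student model. Each training batch is
# drawn from a single domain, and the student matches that domain's teacher
# distribution (word-level distillation) while also fitting the references.

import torch
import torch.nn.functional as F


def multi_domain_distillation_loss(student_logits, teacher_logits, target_ids,
                                   pad_id=0, alpha=0.5, temperature=1.0):
    """Combine reference cross-entropy with KL divergence toward the teacher.

    student_logits, teacher_logits: (batch, seq_len, vocab) per-token logits
    target_ids: (batch, seq_len) reference token ids
    alpha: assumed weight on the distillation term
    """
    vocab = student_logits.size(-1)
    mask = (target_ids != pad_id).float()

    # Standard cross-entropy against the reference translation.
    ce = F.cross_entropy(student_logits.view(-1, vocab), target_ids.view(-1),
                         reduction="none").view(target_ids.shape)
    ce = (ce * mask).sum() / mask.sum()

    # Word-level distillation: match the teacher's (softened) distribution.
    t_probs = F.softmax(teacher_logits / temperature, dim=-1)
    s_logp = F.log_softmax(student_logits / temperature, dim=-1)
    kl = F.kl_div(s_logp, t_probs, reduction="none").sum(-1)
    kl = (kl * mask).sum() / mask.sum()

    return (1.0 - alpha) * ce + alpha * kl


# Usage: pick the teacher matching the batch's domain label (when available);
# the student itself stays a single model, so inference cost is unchanged.
# loss = multi_domain_distillation_loss(student(batch), teachers[domain](batch),
#                                       batch.target_ids)
```

Because only the student is kept at deployment time, this setup keeps inference time and memory flat regardless of how many domains (or teachers) were used during training, which is the property the abstract highlights.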