Abstract: Transformer-based neural machine translation (NMT) systems achieve strong results when trained on high-resource bilingual corpora. However, their performance deteriorates significantly in low ...