Towards a Better Integration of Fuzzy Matches in Neural Machine Translation through Data Augmentation

Arda Tezcan, Bram Bulté, Bram Vanroy

Research output: Contribution to journal (Article)

Abstract

We identify a number of aspects that can boost the performance of Neural Fuzzy Repair (NFR), an easy-to-implement method to integrate translation memory matches and neural machine translation (NMT). We explore various ways of maximising the added value of retrieved matches within the NFR paradigm for eight language combinations, using Transformer NMT systems. In particular, we test the impact of different fuzzy matching techniques, sub-word-level segmentation methods and alignment-based features on overall translation quality. Furthermore, we propose a fuzzy match combination technique that aims to maximise the coverage of source words. This is supplemented with an analysis of how translation quality is affected by input sentence length and fuzzy match score. The results show that applying a combination of the tested modifications leads to a significant increase in estimated translation quality over all baselines for all language combinations.

Original language: English
Article number: 7
Journal: Informatics
Volume: 8
Issue number: 1
Publication status: Published - 29 Jan 2021
Externally published: Yes
