Advancing Fuzzy Match Augmentation for Domain-Specific Machine Translation: An Empirical Study on Large Language Models and Neural Machine Translation
Journal of Artificial Intelligence Research, 2025
Machine Translation
Unifies fuzzy-match augmentation for NMT and LLMs and shows specialized models can match or beat much larger ones.
This flagship study integrates fuzzy-match augmentation with LLM fine-tuning, in-context learning, and back-translation across three legal-domain language pairs. The best configuration improves on strong NMT baselines, and a translation-specialized 9B-parameter model matches a 24B-parameter model.
Research theme: Machine translation
Citation
BibTeX citation:
@article{moerman2025,
author = {Moerman, Thomas and Lefever, Els and Tezcan, Arda},
title = {Advancing {Fuzzy} {Match} {Augmentation} for
{Domain-Specific} {Machine} {Translation:} {An} {Empirical} {Study}
on {Large} {Language} {Models} and {Neural} {Machine} {Translation}},
journal = {Journal of Artificial Intelligence Research},
date = {2025-12-02},
langid = {en}
}
For attribution, please cite this work as:
Moerman, Thomas, Els Lefever, and Arda Tezcan. 2025. “Advancing
Fuzzy Match Augmentation for Domain-Specific Machine Translation: An
Empirical Study on Large Language Models and Neural Machine
Translation.” Journal of Artificial Intelligence
Research, accepted, December 2.