Advancing Fuzzy Match Augmentation for Domain-Specific Machine Translation: An Empirical Study on Large Language Models and Neural Machine Translation

Journal of Artificial Intelligence Research, 2025

Machine Translation
Unifies fuzzy-match augmentation for NMT and LLMs and shows specialized models can match or beat much larger ones.
Authors

Thomas Moerman

Els Lefever

Arda Tezcan

Published

December 2, 2025

This flagship study combines fuzzy-match augmentation with LLM fine-tuning, in-context learning, and back-translation, and evaluates the combinations on three legal-domain language pairs. The best configuration improves on strong NMT baselines, and a translation-specialized 9B-parameter model matches a 24B-parameter model.

Research theme: Machine translation

Citation

BibTeX citation:
@article{moerman2025,
  author = {Moerman, Thomas and Lefever, Els and Tezcan, Arda},
  title = {Advancing {Fuzzy} {Match} {Augmentation} for
    {Domain-Specific} {Machine} {Translation:} {An} {Empirical} {Study}
    on {Large} {Language} {Models} and {Neural} {Machine} {Translation}},
  journal = {Journal of Artificial Intelligence Research},
  date = {2025-12-02},
  langid = {en}
}
For attribution, please cite this work as:
Moerman, Thomas, Els Lefever, and Arda Tezcan. 2025. “Advancing Fuzzy Match Augmentation for Domain-Specific Machine Translation: An Empirical Study on Large Language Models and Neural Machine Translation.” Journal of Artificial Intelligence Research, accepted, December 2.