Improving Fuzzy Match Augmented Neural Machine Translation in Specialised Domains through Synthetic Data

Prague Bulletin of Mathematical Linguistics, 2024

Machine Translation
Combines back-translation with Neural Fuzzy Repair across three language directions and beats several LLMs.
Authors

Arda Tezcan

Alina Skidanova

Thomas Moerman

Published

December 1, 2024

This journal study covers three language directions and two specialised domains. It combines back-translation with Neural Fuzzy Repair to expand small parallel datasets with synthetic data. The combination yields large, statistically significant gains and outperforms several open and commercial LLMs on automatic metrics.
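To make the idea concrete, here is a minimal sketch of how back-translated data can feed fuzzy match augmentation. It is not the authors' implementation. The similarity measure (`difflib` token ratio), the ` @@@ ` separator, the 0.5 threshold, the `reverse_translate` callback, and the toy English-Dutch sentences are all assumptions made for illustration. The sketch first builds synthetic sentence pairs from monolingual target-side text, adds them to a translation memory, and then appends the target side of the best fuzzy match to the source sentence before it would be passed to an NMT model.

```python
from difflib import SequenceMatcher

SEP = " @@@ "  # hypothetical separator between the source and the fuzzy-match target


def similarity(a: str, b: str) -> float:
    """Token-level similarity in [0, 1]; a stand-in for a real fuzzy match score."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


def back_translate(target_sentences, reverse_translate):
    """Build synthetic (source, target) pairs from monolingual target-side text.

    `reverse_translate` is a hypothetical callable (target -> source), e.g. a
    wrapper around a target-to-source MT system.
    """
    return [(reverse_translate(t), t) for t in target_sentences]


def best_fuzzy_match(source, memory, threshold=0.5):
    """Return the most similar (source, target) pair in `memory`, or None."""
    best, best_score = None, 0.0
    for mem_src, mem_tgt in memory:
        score = similarity(source, mem_src)
        if score > best_score:
            best, best_score = (mem_src, mem_tgt), score
    return best if best_score >= threshold else None


def augment(source, memory, threshold=0.5):
    """Append the target side of the best fuzzy match to the source, if any."""
    match = best_fuzzy_match(source, memory, threshold)
    return source if match is None else source + SEP + match[1]


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    memory = [("press the red button", "druk op de rode knop")]
    toy_reverse = {"druk op de groene knop": "press the green button"}
    memory += back_translate(toy_reverse.keys(), toy_reverse.get)
    print(augment("press the green button now", memory))
```

In this toy run the synthetic pair becomes the best match, so the input is augmented with a translation that the original memory alone could not have supplied. Sentences with no match above the threshold pass through unchanged.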

DOI

https://doi.org/10.14712/00326585.030

Research theme: Machine translation

Citation

BibTeX citation:
@article{tezcan2024,
  author = {Tezcan, Arda and Skidanova, Alina and Moerman, Thomas},
  title = {Improving {Fuzzy} {Match} {Augmented} {Neural} {Machine}
    {Translation} in {Specialised} {Domains} Through {Synthetic} {Data}},
  journal = {The Prague Bulletin of Mathematical Linguistics},
  date = {2024-12-01},
  doi = {10.14712/00326585.030},
  langid = {en}
}
For attribution, please cite this work as:
Tezcan, Arda, Alina Skidanova, and Thomas Moerman. 2024. “Improving Fuzzy Match Augmented Neural Machine Translation in Specialised Domains Through Synthetic Data.” The Prague Bulletin of Mathematical Linguistics, accepted, December 1. https://doi.org/10.14712/00326585.030.