We present the results from the 5th round of the WMT task on MT Automatic Post-Editing. The task consists in automatically correcting the output of a “black-box” machine translation system by learning from human corrections. Keeping the same general evaluation setting of the previous four rounds, this year we focused on two language pairs (English-German and English-Russian) and on domain-specific data (In-formation Technology). For both the language directions, MT outputs were produced by neural systems unknown to participants. Seven teams participated in the English-German task, with a total of 18 submitted runs. The evaluation, which was performed on the same test set used for the 2018 round, shows a slight progress in APE technology: 4 teams achieved better results than last year’s winning system, with improvements up to -0.78 TER and +1.23 BLEU points over the baseline. Two teams participated in the English-Russian task submitting 2 runs each. On this new language direction, characterized by a higher quality of the original translations, the task proved to be particularly challenging. None of the submitted runs improved the very high results of the strong system used to produce the initial translations(16.16 TER, 76.20 BLEU).

Findings of the WMT 2019 Shared Task on Automatic Post-Editing.

Matteo Negri;Marco Turchi
2019-01-01

Abstract

We present the results from the 5th round of the WMT task on MT Automatic Post-Editing. The task consists in automatically correcting the output of a “black-box” machine translation system by learning from human corrections. Keeping the same general evaluation setting of the previous four rounds, this year we focused on two language pairs (English-German and English-Russian) and on domain-specific data (In-formation Technology). For both the language directions, MT outputs were produced by neural systems unknown to participants. Seven teams participated in the English-German task, with a total of 18 submitted runs. The evaluation, which was performed on the same test set used for the 2018 round, shows a slight progress in APE technology: 4 teams achieved better results than last year’s winning system, with improvements up to -0.78 TER and +1.23 BLEU points over the baseline. Two teams participated in the English-Russian task submitting 2 runs each. On this new language direction, characterized by a higher quality of the original translations, the task proved to be particularly challenging. None of the submitted runs improved the very high results of the strong system used to produce the initial translations(16.16 TER, 76.20 BLEU).
File in questo prodotto:
File Dimensione Formato  
W19-5402.pdf

accesso aperto

Tipologia: Documento in Post-print
Licenza: Dominio pubblico
Dimensione 724.99 kB
Formato Adobe PDF
724.99 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11582/321896
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
social impact