The usefulness of translation quality estimation (QE) to increase productivity in a computer-assisted translation (CAT) framework is a widely held assumption (Specia, 2011; Huang et al., 2014). So far, however, the validity of this assumption has not been yet demonstrated through sound evaluations in realistic settings. To this aim, we report on an evaluation involving professional translators operating with a CAT tool in controlled but natural conditions. Contrastive experiments are carried out by measuring post-editing time differences when: i) translation suggestions are presented together with binary quality estimates, and ii) the same suggestions are presented without quality indicators. Translators’ productivity in the two conditions is analysed in a principled way, accounting for the main factors (e.g. differences in translators’ behaviour, quality of the suggestions) that directly impact on time measurements. While the general assumption about the usefulness of QE is verified, significance testing results reveal that real productivity gains can be observed only under specific conditions.

MT Quality Estimation for Computer-assisted Translation: Does it Really Help?

Turchi, Marco;Negri, Matteo;Federico, Marcello
2015-01-01

Abstract

The usefulness of translation quality estimation (QE) to increase productivity in a computer-assisted translation (CAT) framework is a widely held assumption (Specia, 2011; Huang et al., 2014). So far, however, the validity of this assumption has not been yet demonstrated through sound evaluations in realistic settings. To this aim, we report on an evaluation involving professional translators operating with a CAT tool in controlled but natural conditions. Contrastive experiments are carried out by measuring post-editing time differences when: i) translation suggestions are presented together with binary quality estimates, and ii) the same suggestions are presented without quality indicators. Translators’ productivity in the two conditions is analysed in a principled way, accounting for the main factors (e.g. differences in translators’ behaviour, quality of the suggestions) that directly impact on time measurements. While the general assumption about the usefulness of QE is verified, significance testing results reveal that real productivity gains can be observed only under specific conditions.
File in questo prodotto:
File Dimensione Formato  
ACL15short2 18.28.38.pdf

accesso aperto

Tipologia: Documento in Post-print
Licenza: DRM non definito
Dimensione 194.06 kB
Formato Adobe PDF
194.06 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11582/300435
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
social impact