Data-driven annotation of binary MT quality estimation corpora based on human post-editions

Turchi, Marco; Negri, Matteo; Federico, Marcello
2014-01-01

Abstract

Advanced computer-assisted translation (CAT) tools include automatic quality estimation (QE) mechanisms to support post-editors in identifying and selecting useful suggestions. Based on supervised learning techniques, QE relies on high-quality data annotations obtained through expensive manual procedures. However, as the notion of MT quality is inherently subjective, such procedures may yield unreliable or uninformative annotations. To overcome these issues, we propose an automatic method to obtain binary-annotated data that explicitly discriminate between useful (suitable for post-editing) and useless suggestions. Our approach is fully data-driven and bypasses the need for explicit human labelling. Experiments with different language pairs and domains demonstrate that it yields better models than those obtained by adapting the available QE corpora into binary datasets. Furthermore, our analysis suggests that the learned thresholds separating useful from useless translations are significantly lower than those suggested in existing guidelines for human annotators. Finally, a verification experiment with several translators operating a CAT tool confirms our empirical findings.
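As a rough illustration of the kind of binary labelling the abstract describes, the sketch below derives useful/useless labels by thresholding a word-level edit distance between an MT suggestion and its human post-edition (a simple stand-in for HTER). The threshold value, function names, and the edit-distance proxy are illustrative assumptions, not details taken from the paper.

# Minimal sketch: label an MT suggestion as useful (1) or useless (0)
# by thresholding the normalized word-level edit distance between the
# suggestion and its human post-edition. Threshold and names are
# illustrative assumptions, not values from the paper.

def word_edit_distance(a, b):
    # Standard Levenshtein distance over word lists, single-row DP.
    m, n = len(a), len(b)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,                       # delete a word
                        dp[j - 1] + 1,                   # insert a word
                        prev + (a[i - 1] != b[j - 1]))   # substitute / match
            prev = cur
    return dp[n]

def binary_label(mt_output, post_edition, threshold=0.4):
    # Approximate post-editing effort as edit distance normalized by the
    # post-edition length, then threshold it (0.4 is a placeholder value).
    mt, pe = mt_output.split(), post_edition.split()
    effort = word_edit_distance(mt, pe) / max(len(pe), 1)
    return 1 if effort <= threshold else 0

print(binary_label("the cat sat on mat", "the cat sat on the mat"))  # 1: one small edit, useful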
Files in this product:
mtj-final.pdf (Post-print, Adobe PDF, 1.03 MB): access restricted to authorized users; license: DRM not defined.
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/300527