Coping with the Subjectivity of Human Judgements in MT Quality Estimation

Turchi, Marco; Negri, Matteo; Federico, Marcello
2013

Abstract

Supervised approaches to NLP tasks rely on high-quality data annotations, which typically result from expensive manual labelling procedures. For some tasks, however, the subjectivity of human judgements might reduce the usefulness of the annotation for real-world applications. In Machine Translation (MT) Quality Estimation (QE), for instance, using human-annotated data to train a binary classifier that discriminates between good (useful for a post-editor) and bad translations is not trivial. Focusing on this binary task, we show that subjective human judgements can be effectively replaced with an automatic annotation procedure. To this aim, we compare binary classifiers trained on different data: the human-annotated dataset from the 7th Workshop on Statistical Machine Translation (WMT-12), and an automatically labelled version of the same corpus. Our results show that human labels are less suitable for the task.
ISBN: 9781937284572

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/11582/179816