Getting Expert Quality from the Crowd for Machine Translation Evaluation

Bentivogli, Luisa; Federico, Marcello; Moretti, Giovanni
2011-01-01

Abstract

This paper addresses the manual evaluation of Machine Translation (MT) quality by means of crowdsourcing. To this purpose, we replicated the ranking evaluation of the Arabic-English BTEC task proposed at the IWSLT 2010 Workshop by hiring non-experts through the CrowdFlower interface to Amazon’s Mechanical Turk. In particular, we investigated the effectiveness of “gold units” offered by CrowdFlower as the main quality control mechanism. The analysis of the collected data shows that agreement rates for non-experts are comparable to those obtained for experts, and that the crowd-based system ranking has a very strong correlation with expert-based ranking. Our results confirm that crowdsourcing is an effective way to reduce the costs of MT evaluation without sacrificing quality, and demonstrate that just exploiting the CrowdFlower control mechanism is enough to approximate expert-level data quality.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/61400