This paper summarizes the work done to prepare the first shared task on automatic translation memory cleaning. This shared task aims at finding automatic ways of cleaning TMs that, for some reason, have not been properly curated and include wrong translations. Participants in this task are required to take pairs of source and target segments from TMs and decide whether they are right translations. For this first task three language pairs have been prepared: English/Spanish, English/Italian, and English/German. In this paper, we report on how the shared task was prepared and explain the process of data selection and data annotation, the building of the training and test sets and the implemented baselines for automatic classifiers comparison.
1st Shared Task on Automatic Translation Memory Cleaning: Preparation and Lessons Learned
Barbu, Eduard;Bentivogli, Luisa;Negri, Matteo;Turchi, Marco;Federico, Marcello;
2016-01-01
Abstract
This paper summarizes the work done to prepare the first shared task on automatic translation memory cleaning. This shared task aims at finding automatic ways of cleaning TMs that, for some reason, have not been properly curated and include wrong translations. Participants in this task are required to take pairs of source and target segments from TMs and decide whether they are right translations. For this first task three language pairs have been prepared: English/Spanish, English/Italian, and English/German. In this paper, we report on how the shared task was prepared and explain the process of data selection and data annotation, the building of the training and test sets and the implemented baselines for automatic classifiers comparison.File | Dimensione | Formato | |
---|---|---|---|
LREC2016Workshop-NLP4TM_Proceedings.pdf
accesso aperto
Descrizione: Articolo
Tipologia:
Documento in Post-print
Licenza:
Creative commons
Dimensione
192.24 kB
Formato
Adobe PDF
|
192.24 kB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.