IRIS Institutional Research Information System

Instance Selection for Online Automatic Post-Editing in a multi-domain scenario.

Chatterjee, Rajen; Arcan, Mihael; Negri, Matteo; Turchi, Marco

2016-01-01

Abstract

In recent years, several end-to-end online translation systems have been proposed to successfully incorporate human post-editing feedback into the translation workflow. The performance of these systems in a multi-domain translation environment (involving different text genres, post-editing styles, and machine translation systems) within the automatic post-editing (APE) task has not yet been thoroughly investigated. In this work, we show that, when used in the APE framework, the existing online systems are not robust to domain changes in the incoming data stream. In particular, these systems lack the capability to learn and use domain-specific post-editing rules from a pool of multi-domain data sets. To cope with this problem, we propose an online learning framework that generates more reliable translations of significantly better quality than the existing online and batch systems. Our framework includes: i) an instance selection technique based on information retrieval that helps to build domain-specific APE systems, and ii) an optimization procedure to tune the feature weights of the log-linear model that allows the decoder to improve the post-editing quality.
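
To make the idea of retrieval-based instance selection more concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state which retrieval engine, similarity measure, or thresholds are used, so everything here is an assumption made for illustration: the TF-IDF cosine similarity, the function names, the `top_k` and `min_similarity` parameters, and the toy `POOL` of (machine translation, post-edit) pairs are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization (a simplifying assumption)."""
    return text.lower().split()


def build_idf(documents):
    """Smoothed inverse document frequency for every token in the pool."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(tokenize(doc)))
    return {tok: math.log((1 + n_docs) / (1 + df)) + 1.0
            for tok, df in doc_freq.items()}


def tfidf_vector(text, idf):
    """Sparse TF-IDF vector; tokens unseen in the pool are ignored."""
    counts = Counter(tokenize(text))
    return {tok: cnt * idf[tok] for tok, cnt in counts.items() if tok in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[tok] for tok, w in u.items() if tok in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_instances(mt_output, pool, top_k=2, min_similarity=0.3):
    """Return up to top_k (similarity, instance) pairs, best first.

    Only instances whose MT side is at least min_similarity close to the
    incoming MT output are kept; an empty result means no instance in the
    pool is similar enough to the incoming segment.
    """
    idf = build_idf([inst["mt"] for inst in pool])
    query = tfidf_vector(mt_output, idf)
    scored = [(cosine(query, tfidf_vector(inst["mt"], idf)), inst)
              for inst in pool]
    selected = [pair for pair in scored if pair[0] >= min_similarity]
    selected.sort(key=lambda pair: pair[0], reverse=True)
    return selected[:top_k]


# Hypothetical multi-domain pool of (MT output, human post-edit) pairs.
POOL = [
    {"mt": "click the button to save the file",
     "pe": "click the button to save the file", "domain": "software"},
    {"mt": "press the button for open the menu",
     "pe": "press the button to open the menu", "domain": "software"},
    {"mt": "the patient must take the tablet daily",
     "pe": "the patient should take the tablet daily", "domain": "medical"},
    {"mt": "the dose of the tablet is reduced",
     "pe": "the tablet dose is reduced", "domain": "medical"},
]

for score, inst in select_instances("click the button to open the file", POOL):
    print(f"{score:.3f}  [{inst['domain']}]  {inst['mt']} -> {inst['pe']}")
```

In a setup like this, the selected pairs would serve as the training material for a small, segment-specific APE model, and an empty selection would be a natural signal to leave the machine translation output unchanged. Whether the framework described in the abstract behaves this way is not stated there.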

Year

2016

Appears in types:

4.1 Contribution in conference proceedings

Files in this item:

File	Size	Format
online_APE_amta_2016 (1).pdf (open access; type: pre-print document; license: Creative Commons)	678.61 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/307236
