IRIS Institutional Research Information System

Multi-source Neural Automatic Post-Editing: FBK’s participation in the WMT 2017 APE shared task

Rajen Chatterjee; M. Farajian; Matteo Negri; Marco Turchi; Ankit Srivastava; Santanu Pal

2017-01-01

Abstract

Previous phrase-based approaches to Automatic Post-editing (APE) have shown that the dependence of MT errors on the source sentence can be exploited by jointly learning from source and target information. By integrating this notion into a neural approach to the problem, we present the multi-source neural machine translation (NMT) system submitted by FBK to the WMT 2017 APE shared task. Our system implements multi-source NMT in a weighted ensemble of 8 models. The n-best hypotheses produced by this ensemble are further re-ranked using features based on the edit distance between the original MT output and each APE hypothesis, as well as other statistical models (an n-gram language model and an operation sequence model). This solution resulted in the best system submission for this round of the APE shared task for both the en-de and de-en language directions. For the former language direction, our primary submission improves over the MT baseline by up to -4.9 TER and +7.6 BLEU points. For the latter, where the higher quality of the original MT output reduces the room for improvement, the gains are lower but still significant (-0.25 TER and +0.3 BLEU).
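The re-ranking step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the weights `w_model` and `w_edit`, and the linear scoring formula are assumptions made for illustration. The edit distance here is plain word-level Levenshtein distance (no shift operations, so it is not TER), and the language-model and operation-sequence-model features are left out.

```python
from typing import List, Sequence, Tuple


def word_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Word-level Levenshtein distance; insert, delete and substitute each cost 1."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def rerank(mt_output: str,
           nbest: List[Tuple[str, float]],
           w_model: float = 1.0,
           w_edit: float = 0.1) -> List[Tuple[str, float]]:
    """Re-score (hypothesis, model_score) pairs and return them best-first.

    Hypothetical scoring: the model score minus a penalty proportional to the
    length-normalised edit distance between the MT output and the hypothesis.
    """
    mt_tokens = mt_output.split()
    rescored = []
    for hyp, model_score in nbest:
        dist = word_edit_distance(mt_tokens, hyp.split())
        norm = dist / max(len(mt_tokens), 1)
        rescored.append((hyp, w_model * model_score - w_edit * norm))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    nbest = [("das ist ein Test", -1.0), ("das ist Test", -1.0)]
    for hyp, score in rerank("das ist Test", nbest):
        print(f"{score:.3f}\t{hyp}")
```

With equal model scores, the penalty term favours the hypothesis closer to the original MT output. In a real system such weights would be tuned on held-out data rather than fixed by hand.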

Year	2017
ISBN	978-1-945626-96-8
Appears in categories	4.1 Contribution in conference proceedings

Files in this item:

File	Size	Format
WMT73.pdf (open access; type: post-print document; license: DRM not defined)	194.67 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/313128
