Chunk-Based Verb Reordering in VSO Sentences for Arabic-English Statistical Machine Translation

Bisazza, Arianna; Federico, Marcello
2010-01-01

Abstract

In Arabic-to-English phrase-based statistical machine translation, a large number of syntactic disfluencies are due to wrong long-range reordering of the verb in VSO sentences, where the verb is anticipated with respect to the English word order. In this paper, we propose a chunk-based reordering technique to automatically detect and displace clause-initial verbs in the Arabic side of a word-aligned parallel corpus. This method is applied to preprocess the training data, and to collect statistics about verb movements. From this analysis, specific verb reordering lattices are then built on the test sentences before decoding them. The application of our reordering methods on the training and test sets results in consistent BLEU score improvements on the NIST-MT 2009 Arabic-English benchmark.
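
As a rough illustration of the idea of displacing a clause-initial verb chunk, the following minimal sketch enumerates alternative chunk orders for one sentence. It is not the authors' implementation: the chunk tags (`VP`, `NP`), the function names, the `max_shift` limit, and the simplification that "clause-initial" means "first chunk of the sentence" are all assumptions made here for illustration. The returned variants can be read as the alternative paths that a reordering lattice might encode.

```python
from typing import List, Tuple

# A chunk is (tag, tokens). The tag names "VP"/"NP" are assumed for illustration.
Chunk = Tuple[str, List[str]]


def verb_reorderings(chunks: List[Chunk], max_shift: int = 3) -> List[List[Chunk]]:
    """Return the original chunk order plus variants in which a sentence-initial
    verb chunk is moved 1..max_shift chunks to the right (hypothetical helper)."""
    variants = [list(chunks)]
    # Only sentences that start with a verb chunk are treated as VSO candidates.
    if not chunks or chunks[0][0] != "VP":
        return variants
    verb, rest = chunks[0], chunks[1:]
    for shift in range(1, min(max_shift, len(rest)) + 1):
        variants.append(rest[:shift] + [verb] + rest[shift:])
    return variants


def flatten(chunks: List[Chunk]) -> List[str]:
    """Concatenate the tokens of all chunks in order."""
    return [tok for _, toks in chunks for tok in toks]


# Placeholder tokens stand in for Arabic words: verb, subject, object.
sentence = [("VP", ["v1"]), ("NP", ["s1", "s2"]), ("NP", ["o1"])]
variants = verb_reorderings(sentence)
for variant in variants:
    print(" ".join(flatten(variant)))
```

Moving whole chunks rather than single words keeps the number of alternatives small, since the verb can only land on chunk boundaries.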

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/21670