Bootstrapping Arabic-Italian SMT through Comparable Texts and Pivot Translation
Cettolo, Mauro; Bertoldi, Nicola; Federico, Marcello
2011-01-01
Abstract
This paper describes efforts towards the development of an Arabic-to-Italian SMT system for the news domain. Since very little parallel data is available for this language pair, we investigated both the exploitation of comparable corpora and pivot translation. Experimental evaluation was conducted on a new benchmark developed by extending two Arabic-to-English NIST evaluation sets with Italian and French translations, produced from the source language by experts. Preliminary results show the potential of both approaches with respect to the performance achieved by a popular state-of-the-art Web-based translation service.

File | Type | License | Size | Format
---|---|---|---|---
EAMT-2011-Cettolo-etAl.pdf (open access) | Post-print document | Public, with copyright | 353.64 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.