Online Language Model Adaptation via N-gram Mixtures for Statistical Machine Translation

German Sanchis Trilles; Mauro Cettolo

2010-01-01

Abstract

The problem of language model adaptation in statistical machine translation is considered. A mixture of language models is employed, which is obtained by clustering the bilingual training data. Unsupervised clustering is guided by either the development or the test set. Different mixture weight estimation schemes are proposed and compared, at the level of either single or all source sentences. Experimental results show that, by training different specific language models weighted according to the actual input instead of using a single target language model, translation quality is improved, as measured by BLEU and TER.
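The abstract does not say how the mixture weights are estimated. As an illustration only, the following minimal sketch fits linear interpolation weights with expectation-maximization (EM), a common choice for this kind of mixture. It assumes that each component language model has already scored every token of the adaptation text (a single source sentence or the whole input set). The names `estimate_mixture_weights` and `mixture_prob` are hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def estimate_mixture_weights(
    component_probs: Sequence[Sequence[float]],
    iterations: int = 50,
    tol: float = 1e-10,
) -> List[float]:
    """Fit linear interpolation weights with EM (illustrative, not the paper's method).

    component_probs[t][k] is the probability that component model k
    assigns to token t of the adaptation text, given its history.
    """
    if not component_probs:
        raise ValueError("need at least one scored token")
    k = len(component_probs[0])
    weights = [1.0 / k] * k  # start from a uniform mixture

    for _ in range(iterations):
        counts = [0.0] * k
        for probs in component_probs:
            mix = sum(w * p for w, p in zip(weights, probs))
            if mix == 0.0:
                continue  # no component explains this token; skip it
            # E-step: posterior responsibility of each component for this token
            for i in range(k):
                counts[i] += weights[i] * probs[i] / mix
        total = sum(counts)
        if total == 0.0:
            break
        # M-step: new weights are the normalized expected counts
        new_weights = [c / total for c in counts]
        converged = max(abs(a - b) for a, b in zip(weights, new_weights)) < tol
        weights = new_weights
        if converged:
            break
    return weights


def mixture_prob(weights: Sequence[float], probs: Sequence[float]) -> float:
    """Probability of one token under the weighted mixture of components."""
    return sum(w * p for w, p in zip(weights, probs))
```

Under this sketch, calling `estimate_mixture_weights` once per source sentence would give sentence-level weights, while calling it once over all scored tokens of the input would give a single set of weights for the whole set.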

Year

2010

Appears in types:

4.1 Contribution in conference proceedings

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/14068
