Title: | Online Language Model adaptation via N-gram Mixtures for Statistical Machine Translation |
Authors: | |
Publication date: | 2010 |
Abstract: | The problem of language model adaptation in statistical machine translation is considered. A mixture of language models is employed, obtained by clustering the bilingual training data; the unsupervised clustering is guided by either the development or the test set. Different mixture weight estimation schemes are proposed and compared, operating at the level of either single source sentences or all source sentences together. Experimental results show that training different specific language models weighted according to the actual input, instead of using a single target language model, improves translation quality as measured by BLEU and TER. |
Handle: | http://hdl.handle.net/11582/14068 |
Appears in item types: | 4.1 Contribution in Conference Proceedings |
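
The abstract describes interpolating cluster-specific n-gram language models with mixture weights estimated from the actual input. As a minimal sketch of how such weights are commonly estimated, the snippet below runs a few EM iterations that maximise the likelihood of an adaptation text under the interpolated model; the function name, the toy probabilities, and the EM scheme itself are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch: EM estimation of interpolation weights for a set of
# cluster-specific language models, given the probability each model assigns
# to every token of an adaptation text (e.g. drawn from the dev or test input).
# All names and numbers are illustrative, not taken from the paper.

def estimate_mixture_weights(component_probs, iterations=20):
    """component_probs[i][t] = probability that LM i assigns to token t.
    Returns weights that (locally) maximise the likelihood of the
    adaptation text under the interpolated model."""
    n_models = len(component_probs)
    n_tokens = len(component_probs[0])
    weights = [1.0 / n_models] * n_models          # uniform initialisation
    for _ in range(iterations):
        # E-step: posterior responsibility of each LM for each token
        expected_counts = [0.0] * n_models
        for t in range(n_tokens):
            mix = sum(weights[i] * component_probs[i][t] for i in range(n_models))
            for i in range(n_models):
                expected_counts[i] += weights[i] * component_probs[i][t] / mix
        # M-step: renormalise expected counts into new weights
        weights = [c / n_tokens for c in expected_counts]
    return weights

# Toy usage: two cluster LMs scored on a 4-token adaptation text.
probs_lm_a = [0.020, 0.004, 0.010, 0.008]
probs_lm_b = [0.005, 0.015, 0.002, 0.030]
lambdas = estimate_mixture_weights([probs_lm_a, probs_lm_b])
# The adapted LM then scores word w in history h as
#   P(w | h) = sum_i lambdas[i] * P_i(w | h)
print(lambdas)
```

Depending on the scheme, component_probs can be collected for a single sentence or for the whole input, which mirrors the per-sentence and global weight estimation schemes compared in the abstract.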