Left Language Model State for Syntactic Machine Translation

Federico, Marcello
2011-01-01

Abstract

Many syntactic machine translation decoders, including Moses, cdec, and Joshua, implement bottom-up dynamic programming to integrate N-gram language model probabilities into hypothesis scoring. These decoders concatenate hypotheses according to grammar rules, yielding larger hypotheses and eventually complete translations. When hypotheses are concatenated, the language model score is adjusted to account for boundary-crossing n-grams. Words on the boundary of each hypothesis are encoded in state, consisting of left state (the first few words) and right state (the last few words). We speed concatenation by encoding left state using data structure pointers in lieu of vocabulary indices and by avoiding unnecessary queries. To increase the decoder’s opportunities to recombine hypotheses, we minimize the number of words encoded by left state. This has the effect of reducing search errors made by the decoder. The resulting gain in model score is smaller than for right state minimization, which we explain by observing a relationship between state minimization and language model probability. With a fixed cube pruning pop limit, we show a 3-6% reduction in CPU time and improved model scores. Reducing the pop limit to the point where model scores tie the baseline yields a net 11% reduction in CPU time.
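The core operation described above, adjusting the language model score for boundary-crossing n-grams when two hypotheses are concatenated, can be illustrated with a small sketch. The Python below is a hypothetical illustration, not the paper's or any decoder's actual code: the toy language model, the trigram order, and all function and field names are assumptions made for clarity.

```python
# A minimal sketch (assumed, not the paper's implementation) of how a bottom-up
# decoder adjusts an N-gram LM score when two hypotheses are concatenated, and
# why only a few boundary words (the "state") need to be carried along.

N = 3  # trigram model for illustration

def score_ngram(context, word, lm):
    """Look up log10 p(word | context) in a toy LM dict keyed by word tuples,
    backing off to shorter contexts. All probabilities here are made up."""
    context = tuple(context)
    for i in range(len(context) + 1):
        key = context[i:] + (word,)
        if key in lm:
            return lm[key]
    return -10.0  # crude unknown-word penalty

def concatenate(left_hyp, right_hyp, lm):
    """Join two partial hypotheses. Each hypothesis is a dict with its words,
    a score, and left/right state (at most N-1 boundary words). Only n-grams
    that cross the seam are rescored: right_hyp's left-state words receive the
    context supplied by left_hyp's right state."""
    adjustment = 0.0
    context = list(left_hyp["right_state"])
    for word in right_hyp["left_state"]:
        # This word was provisionally scored inside right_hyp with no left
        # context; replace that estimate now that context is known.
        adjustment += score_ngram(context, word, lm) - score_ngram((), word, lm)
        context = (context + [word])[-(N - 1):]
    words = left_hyp["words"] + right_hyp["words"]
    return {
        "words": words,
        "score": left_hyp["score"] + right_hyp["score"] + adjustment,
        # Full (unminimized) states: the first and last N-1 words. The paper's
        # point is that left state can often be shortened further, so more
        # hypotheses share identical state and recombine.
        "left_state": words[:N - 1],
        "right_state": words[-(N - 1):],
    }

if __name__ == "__main__":
    lm = {("the",): -1.0, ("cat",): -2.0, ("the", "cat"): -0.5}
    a = {"words": ["the"], "score": -1.0,
         "left_state": ["the"], "right_state": ["the"]}
    b = {"words": ["cat"], "score": -2.0,
         "left_state": ["cat"], "right_state": ["cat"]}
    print(concatenate(a, b, lm))  # score: -1.0 + -2.0 + (-0.5 - -2.0) = -1.5
```

In this sketch, two hypotheses with identical left and right states are interchangeable for future concatenations, which is exactly the recombination opportunity the abstract says left state minimization increases.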
Files in this item:
There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/11582/70198
