A New Decoder for Spoken Language Translation based on Confusion Networks
Bertoldi, Nicola;Federico, Marcello
2005-01-01
Abstract
A novel approach to Spoken Language Translation is proposed, which more tightly integrates Automatic Speech Recognition (ASR) and Statistical Machine Translation (SMT). SMT is directly applied on an approximation of the word graph produced by the ASR system, namely a confusion network. The decoding algorithm extends a conventional phrase-based decoder in that it can process at once a large number of source sentence hypotheses contained in the confusion network. Experimental results are presented on a Spanish-English large vocabulary task, namely the translation of the European Parliament Plenary Sessions. With respect to a conventional SMT decoder processing N-best lists, a slight improvement in the BLEU score is reported as well as a significantly lower decoding time.
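
As an illustration of why a confusion network can hold many source hypotheses at once, the following minimal sketch represents one as a sequence of slots, each listing alternative words with posterior probabilities. The example words, the probabilities, and all function names are assumptions made for illustration; they are not taken from the paper and do not reproduce its decoder.

```python
from itertools import product
from math import prod

EPS = ""  # empty word (epsilon): the slot may contribute no word

# Hypothetical confusion network: one list per slot,
# each entry is (word, posterior probability); values are invented.
confusion_network = [
    [("el", 0.7), ("él", 0.3)],
    [("parlamento", 0.9), ("parlamentos", 0.1)],
    [("europeo", 0.6), ("europea", 0.3), (EPS, 0.1)],
]


def count_hypotheses(cn):
    """Number of distinct paths: the product of the slot sizes."""
    return prod(len(slot) for slot in cn)


def best_path(cn):
    """Pick the highest-posterior word in every slot."""
    return [max(slot, key=lambda wp: wp[1])[0] for slot in cn]


def enumerate_hypotheses(cn):
    """Yield (sentence, score) for every path; score is the product of posteriors."""
    for path in product(*cn):
        words = [w for w, _ in path if w != EPS]
        score = prod(p for _, p in path)
        yield " ".join(words), score
```

Three slots with 2, 2 and 3 alternatives already encode 12 sentence hypotheses, and the count grows multiplicatively with the number of slots. An N-best list has to spell out each hypothesis separately, which is the contrast the abstract draws.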