Punctuating Confusion Networks for Speech Translation
Cattoni, Roldano;Bertoldi, Nicola;Federico, Marcello
2007-01-01
Abstract
Translating from confusion networks (CNs) has been shown to be more effective than translating from single-best hypotheses. Moreover, it is widely accepted that good punctuation marks in the input can improve translation quality. At present, no automatic speech recognition (ASR) system can generate punctuation marks in its word graphs, so CNs lack punctuation. In this paper we investigate the problem of adding punctuation marks to confusion networks. We examine different punctuation strategies and show that the use of multiple hypotheses improves translation quality in a large-vocabulary speech translation task.