Fast Speech Decoding through Phone Confusion Networks

Bertoldi, Nicola; Federico, Marcello; Falavigna, Giuseppe Daniele; Gerosa, Matteo
2008-01-01

Abstract

We present a two-stage automatic speech recognition architecture suited to applications, such as spoken document retrieval, where large-scale language models can be used and very low out-of-vocabulary rates must be reached. The proposed system couples a weakly constrained phone recognizer with a phone-to-word decoder that was originally developed for phrase-based statistical machine translation. The decoder can efficiently decode confusion networks given as input and can exploit large-scale unpruned language models. Preliminary experiments are reported on the transcription of speeches of the Italian parliament. Using phone confusion networks as the interface between the two decoding steps reduces the word error rate (WER) by 28%, bringing the system relatively close to a state-of-the-art baseline that uses a comparable language model.
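
As a rough illustration of the interface described in the abstract, the sketch below represents a phone confusion network as a list of slots, each mapping candidate phones to posterior probabilities, and searches for the best-scoring word sequence with a small dynamic program. It is a toy under stated assumptions, not the authors' decoder: the function name `decode`, the `<eps>` symbol, the example lexicon, and the posteriors are all hypothetical, no language model is applied, and empty arcs may only be skipped between words.

```python
import math

EPS = "<eps>"  # hypothetical symbol for an empty (skippable) arc


def decode(network, lexicon):
    """Return (log_score, words) for the best lexicon segmentation, or None.

    network: list of slots, each a dict {phone: posterior}.
    lexicon: dict {word: tuple of phones}.
    Simplifications (assumptions): no language model; an <eps> arc can
    only be taken between words, never inside a pronunciation.
    """
    n = len(network)
    # best[i] holds the best (log score, word list) covering the first i slots.
    best = [None] * (n + 1)
    best[0] = (0.0, [])
    for i in range(n):
        if best[i] is None:
            continue
        score, words = best[i]
        # Option 1: skip slot i through its empty arc, if it has one.
        p_eps = network[i].get(EPS, 0.0)
        if p_eps > 0.0:
            cand = score + math.log(p_eps)
            if best[i + 1] is None or cand > best[i + 1][0]:
                best[i + 1] = (cand, words)
        # Option 2: start a lexicon word at slot i, one phone per slot.
        for word, phones in lexicon.items():
            j = i + len(phones)
            if j > n:
                continue
            cand = score
            for slot, phone in zip(network[i:j], phones):
                p = slot.get(phone, 0.0)
                if p <= 0.0:
                    cand = None  # pronunciation not supported by the network
                    break
                cand += math.log(p)
            if cand is not None and (best[j] is None or cand > best[j][0]):
                best[j] = (cand, words + [word])
    return best[n]


# Toy network with two competing phones in the first three slots.
network = [
    {"k": 0.9, "g": 0.1},
    {"a": 0.6, "o": 0.4},
    {"z": 0.7, "s": 0.3},
    {"a": 1.0},
]
lexicon = {"casa": ("k", "a", "z", "a"), "cosa": ("k", "o", "z", "a")}

score, words = decode(network, lexicon)
print(words)
```

Keeping several phone hypotheses per slot is what lets the second stage recover from first-stage errors: here both words remain reachable, and the choice between them is deferred to the word-level search, where a real system would also apply its language model.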
Use this identifier to cite or link to this document: https://hdl.handle.net/11582/3940