From Broadcast News to Spontaneous Dialogue Transcription: Portability Issues
Bertoldi, Nicola;Brugnara, Fabio;Cettolo, Mauro;Federico, Marcello;Giuliani, Diego
2001-01-01
Abstract
This paper reports on experiments in porting the ITC-irst Italian broadcast news recognition system to two spontaneous dialogue domains. The trade-off between performance and the required amount of task-specific data was investigated. Porting was carried out by applying supervised adaptation methods to the acoustic and language models. By using manual transcripts equivalent to two hours of speech, and one hour of annotated speech, the adapted systems achieved word error rates of 27.0% and 29.3%. Two domain-specific baseline systems, developed on much more training data, achieved word error rates of 22.6% and 22.0%, respectively.