Evaluation of Digit Recognition over the Telephone Network
Falavigna, Giuseppe Daniele; Gretter, Roberto
1997-01-01
Abstract
In this paper we address the problem of real-time continuous digit recognition over the telephone. We describe a telephone corpus that was acquired both to retrain Hidden Markov Models derived from clean speech and to test the application. Experimental comparisons using different acoustic features show that linear prediction cepstral coefficients outperform the other feature types. Cepstral mean subtraction is compared with RASTA filtering; the latter is more attractive because it allows recognition to be performed while the user is still speaking. Explicit modeling of some weak spontaneous speech phenomena, which considerably improves word accuracy, is also described. Finally, we discuss the use of a rejection strategy for small-vocabulary recognition, which is fundamental in real applications.
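The contrast between cepstral mean subtraction and RASTA filtering can be illustrated with a minimal sketch. It is not the authors' implementation: the function and class names are hypothetical, and the filter coefficients are the commonly cited RASTA form (numerator 0.1·[2, 1, 0, −1, −2], pole at 0.98), assumed here because the abstract does not state the settings used in the paper. The point is that mean subtraction needs the whole utterance before it can normalize any frame, while the RASTA filter is causal and emits a normalized frame as soon as each input frame arrives.

```python
from collections import deque


def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean computed over the whole utterance.

    All frames must be available before any output can be produced.
    """
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - mean[d] for d in range(dim)] for f in frames]


class RastaFilter:
    """Causal band-pass filter applied to each cepstral trajectory.

    Assumed difference equation (commonly cited form, not taken from the paper):
    y[t] = 0.98*y[t-1] + 0.2*x[t] + 0.1*x[t-1] - 0.1*x[t-3] - 0.2*x[t-4]
    """

    NUM = (0.2, 0.1, 0.0, -0.1, -0.2)  # coefficients sum to zero: no DC gain
    POLE = 0.98

    def __init__(self, dim):
        # history[0] is x[t], history[4] is x[t-4]; start from silence (zeros)
        self.history = deque([[0.0] * dim for _ in range(5)], maxlen=5)
        self.prev_out = [0.0] * dim

    def push(self, frame):
        """Consume one frame and return its filtered version immediately."""
        self.history.appendleft(list(frame))
        out = [
            self.POLE * self.prev_out[d]
            + sum(b * x[d] for b, x in zip(self.NUM, self.history))
            for d in range(len(frame))
        ]
        self.prev_out = out
        return out


# Usage: batch normalization versus frame-by-frame normalization.
utterance = [[1.0, 2.0], [3.0, 5.0], [5.0, 2.0]]
batch = cepstral_mean_subtraction(utterance)          # needs the full utterance
rasta = RastaFilter(dim=2)
streamed = [rasta.push(frame) for frame in utterance]  # one output per input frame
```

Both operations suppress a constant offset in the cepstral domain, which is how a fixed telephone channel appears. The streaming form trades an initial transient for the ability to start decoding before the speaker has finished.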