Speaker Normalization and Model Selection of Combined Neural Nets
Furlanello, Cesare; Giuliani, Diego; Trentin, Edmondo; Merler, Stefano
1997-01-01
Abstract
This paper introduces bootstrap error estimation for the automatic tuning of parameters in combined networks, applied as front-end preprocessors for a speech recognition system based on Hidden Markov Models. The method is evaluated on a large-vocabulary (10,000 words), continuous speech recognition task. Bootstrap estimates of the minimum MSE allow the selection of speaker normalization models that improve recognition performance. The procedure provides a flexible strategy for dealing with inter-speaker variability without requiring an additional validation set. Recognition results are compared for linear, generalized RBF, and MLP network architectures.
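As a minimal sketch of the general idea (not the paper's actual procedure), the Python snippet below shows how an out-of-bag bootstrap estimate of MSE can drive model selection among candidate normalization mappings without a held-out validation set. The function name `bootstrap_mse`, the synthetic data, and the scikit-learn models are all hypothetical stand-ins for the paper's linear, generalized RBF, and MLP networks, and a single scalar output is assumed for simplicity.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

def bootstrap_mse(model_factory, X, y, n_boot=30, seed=0):
    """Out-of-bag bootstrap estimate of a model's MSE: fit each
    bootstrap resample, score on the samples it left out, average."""
    rng = np.random.default_rng(seed)
    n = len(X)
    errors = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)           # draw with replacement
        oob = np.setdiff1d(np.arange(n), idx)      # out-of-bag indices
        if oob.size == 0:
            continue
        model = model_factory()
        model.fit(X[idx], y[idx])
        residuals = y[oob] - model.predict(X[oob])
        errors.append(np.mean(residuals ** 2))
    return float(np.mean(errors))

# Hypothetical stand-ins for the paper's linear, generalized RBF,
# and MLP normalization networks:
candidates = {
    "linear": lambda: LinearRegression(),
    "rbf": lambda: KernelRidge(kernel="rbf", alpha=1.0),
    "mlp": lambda: MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000),
}

# Synthetic data standing in for (new-speaker features, reference targets).
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 12))
y = X @ rng.normal(size=12) + 0.1 * rng.normal(size=200)

# Select the candidate with the lowest bootstrap MSE estimate.
scores = {name: bootstrap_mse(make, X, y) for name, make in candidates.items()}
print(scores, "-> selected:", min(scores, key=scores.get))
```

Because the out-of-bag samples are never seen during fitting, each resample yields an honest error estimate, which is why no separate validation set is needed.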