
Speaker Normalization and Model Selection of Combined Neural Nets

Cesare Furlanello, Diego Giuliani, Edmondo Trentin, Stefano Merler
1997

Abstract

This paper introduces bootstrap error estimation for automatic tuning of parameters in combined networks applied as front-end preprocessors for a speech recognition system based on Hidden Markov Models. The method is evaluated on a large-vocabulary (10,000 words), continuous speech recognition task. Bootstrap estimates of minimum MSE allow the selection of speaker normalization models that improve recognition performance. The procedure provides a flexible strategy for dealing with inter-speaker variability without requiring an additional validation set. Recognition results are compared for linear, generalized RBF, and MLP network architectures.
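The key idea in the abstract can be illustrated with a small sketch: bootstrap resampling yields an estimate of each candidate model's generalization MSE from the training data alone, so the model with the lowest bootstrap MSE can be selected without holding out a validation set. The sketch below is an assumption-laden illustration, not the paper's implementation: the candidate "models" here are polynomial fits of different degrees, standing in for the linear, RBF, and MLP front-ends compared in the paper, and the out-of-bag scoring scheme is one common variant of bootstrap error estimation.

```python
import numpy as np

def bootstrap_mse(X, y, fit, predict, n_boot=50, seed=0):
    """Bootstrap estimate of a model's generalization MSE.

    Each replicate fits on a sample drawn with replacement and is
    scored on the out-of-bag points, so no separate validation set
    is needed (the strategy the abstract describes).
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    errs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)               # resample with replacement
        oob = np.setdiff1d(np.arange(n), idx)     # out-of-bag test points
        if oob.size == 0:
            continue
        model = fit(X[idx], y[idx])
        errs.append(np.mean((predict(model, X[oob]) - y[oob]) ** 2))
    return float(np.mean(errs))

# Hypothetical candidate models: least-squares polynomial maps of
# degree d, used here only as stand-ins for network architectures.
def make_fit(d):
    def fit(X, y):
        A = np.vander(X.ravel(), d + 1)
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        return coef
    return fit

def make_predict(d):
    def predict(coef, X):
        return np.vander(X.ravel(), d + 1) @ coef
    return predict

# Synthetic 1-D regression problem (illustrative data, not the task
# from the paper).
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, (200, 1))
y = np.sin(3 * X.ravel()) + 0.1 * rng.normal(size=200)

scores = {d: bootstrap_mse(X, y, make_fit(d), make_predict(d))
          for d in (1, 3, 5)}
best = min(scores, key=scores.get)  # model with minimum bootstrap MSE
```

The selection rule is simply `argmin` over the bootstrap MSE estimates; in the paper the same criterion is applied to speaker normalization networks rather than polynomial fits.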

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/1214
