Evaluation on the Aurora 2 Database of Acoustic Models That Are Less Noise-Sensitive
Trentin, Edmondo; Matassoni, Marco; Gori, Marco
2003-01-01
Abstract
The Aurora 2 database may be used as a benchmark for the evaluation of algorithms under noisy conditions. In particular, the clean-training/noisy-test mode is aimed at evaluating models that are trained on clean data only, without further adjustment on the noisy data, i.e. under severe mismatch between the training and test conditions. While several researchers have proposed techniques at the front-end level to improve recognition performance over the reference hidden Markov model (HMM) baseline, investigations at the back-end level are sought. In this respect, the goal is to develop acoustic models that are intrinsically less noise-sensitive. This paper presents the word accuracy yielded by a non-parametric HMM with connectionist estimates of the emission probabilities, i.e. a neural network is applied instead of the usual parametric (Gaussian mixture) probability densities. A regularization technique, relying on a maximum-likelihood parameter-grouping algorithm, is explicitly introduced to increase the generalization capability of the model and, in turn, its noise robustness. Results show that a 15.43% relative word error rate reduction w.r.t. the Gaussian-mixture HMM is obtained by averaging over the different noises and SNRs of Aurora 2 test set A.
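
As a minimal illustration of the back-end idea described in the abstract (a sketch of the standard hybrid ANN/HMM formulation, not the authors' actual implementation), the snippet below shows how a network trained to output state posteriors P(q | x_t) is typically converted into scaled emission likelihoods p(x_t | q) proportional to P(q | x_t) / P(q), which then replace the Gaussian-mixture emission densities in the HMM decoder. All names below are hypothetical.

    import numpy as np

    def scaled_emission_likelihoods(posteriors, state_priors, eps=1e-12):
        """Convert per-frame network state posteriors P(q | x_t) into scaled
        emission likelihoods p(x_t | q) proportional to P(q | x_t) / P(q),
        the usual way a connectionist estimator stands in for Gaussian-mixture
        densities in a hybrid ANN/HMM (hypothetical helper, not the paper's code).

        posteriors   : (T, Q) array, one softmax output vector per frame
        state_priors : (Q,) array of state prior probabilities P(q)
        """
        return posteriors / np.maximum(state_priors, eps)

    # Example: 3 frames, 4 HMM states
    posteriors = np.array([[0.7, 0.1, 0.1, 0.1],
                           [0.2, 0.5, 0.2, 0.1],
                           [0.1, 0.1, 0.2, 0.6]])
    priors = np.array([0.4, 0.3, 0.2, 0.1])
    likelihoods = scaled_emission_likelihoods(posteriors, priors)
    # likelihoods[t, q] can then be used in place of Gaussian-mixture emission
    # scores inside standard Viterbi decoding.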