Robust Segmental-Connectionist Learning for Recognition of Noisy Speech

Trentin, Edmondo; Matassoni, Marco
1999-01-01

Abstract

Neural network learning theory relates "learning with noise" to adding a regularization term to the cost function that is minimized during training on clean (non-noisy) data. Regularizers and other robust training techniques aim to improve the generalization ability of connectionist models by reducing overfitting. This paper presents an application of a variant of the so-called Segmental Neural Network (SNN) to speaker-independent recognition of isolated words in noise. The SNN is enhanced with trainable amplitudes of the activation functions (SNN-TA), which act as regularizers and increase robustness to noise. Experimental results show that, when training is carried out on clean data, the SNN-TA outperforms a standard Continuous Density HMM in recognizing noisy Italian digits, and its performance remains stable as the signal-to-noise ratio decreases. Current research focuses on extending the present scheme to continuous speech recognition.
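
The abstract does not spell out how the trainable amplitudes are defined, so the following is only a minimal sketch of the general idea: a single unit whose activation is scaled by an amplitude that is learned by gradient descent together with the weight and bias. The tanh activation, the squared-error cost, the toy data, and all names (`make_data`, `forward`, `train`, `mse`) are assumptions made for illustration and are not taken from the paper.

```python
import math

def make_data():
    """Toy regression set (hypothetical): targets exceed the range of a plain tanh."""
    xs = [-2.0 + 0.2 * i for i in range(21)]
    return [(x, 2.0 * math.tanh(1.5 * x - 0.3)) for x in xs]

def forward(params, x):
    """Unit output with a trainable amplitude: amp * tanh(w * x + b)."""
    w, b, amp = params
    return amp * math.tanh(w * x + b)

def mse(params, data):
    """Mean squared error of the unit over the data set."""
    return sum((forward(params, x) - t) ** 2 for x, t in data) / len(data)

def train(data, epochs=500, lr=0.05):
    """Online gradient descent on 0.5 * (y - t)^2 over w, b and the amplitude."""
    w, b, amp = 0.1, 0.0, 1.0  # the amplitude starts at the conventional value 1
    for _ in range(epochs):
        for x, t in data:
            h = math.tanh(w * x + b)
            err = amp * h - t
            grad_amp = err * h                     # d(cost)/d(amp)
            grad_pre = err * amp * (1.0 - h * h)   # d(cost)/d(pre-activation)
            w -= lr * grad_pre * x
            b -= lr * grad_pre
            amp -= lr * grad_amp
    return w, b, amp

if __name__ == "__main__":
    data = make_data()
    print("MSE before training:", mse((0.1, 0.0, 1.0), data))
    params = train(data)
    print("MSE after training: ", mse(params, data))
    print("learned amplitude:  ", params[2])
```

In this toy setting the amplitude is simply one more parameter updated by the same gradient rule; how such amplitudes are shared across units and how they act as regularizers in the SNN-TA is described in the paper itself, not here.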

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/1763