Use of Filtered Clean Speech for Robust HMM Training
Giuliani, Diego; Matassoni, Marco; Omologo, Maurizio; Svaizer, Piergiorgio
1999-01-01
Abstract
This paper addresses the problem of hands-free speech recognition in a noisy office environment. An array of seven omnidirectional microphones and a time delay compensation module are used to provide a beamformed signal as input to an HMM-based recognizer. The HMMs are trained either on a clean speech database or on a filtered version of the same database. Filtering consists of convolving the clean speech with the acoustic impulse response between speaker and microphone, which reproduces the effect of reverberation. Background noise is then added to obtain the desired SNR. The paper shows that the models trained on the filtered database can be applied to further improve system performance.
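
The filtering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `contaminate`, its parameters, and the choice to measure SNR over the whole utterance are assumptions made for illustration only.

```python
import numpy as np

def contaminate(clean, impulse_response, noise, snr_db):
    """Hypothetical helper: add reverberation and noise to clean speech.

    clean            -- 1-D array of clean speech samples
    impulse_response -- 1-D array, speaker-to-microphone impulse response
    noise            -- 1-D array of background noise samples
    snr_db           -- target SNR in dB (assumed here to be measured
                        over the whole utterance)
    """
    # Reverberation: convolve with the impulse response and keep the
    # original utterance length.
    reverberant = np.convolve(clean, impulse_response)[: len(clean)]

    # Repeat or trim the noise so it covers the whole utterance.
    reps = int(np.ceil(len(reverberant) / len(noise)))
    noise = np.tile(noise, reps)[: len(reverberant)]

    # Scale the noise so that speech power / noise power matches snr_db.
    speech_power = np.mean(reverberant ** 2)
    noise_power = np.mean(noise ** 2)
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))

    return reverberant + gain * noise
```

Applying such a function to every utterance of a clean database would yield a filtered training set of the kind the abstract refers to; the impulse responses and noise recordings would have to come from the target room.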