A Level-Dependent Auditory Filter-Bank for Speech Recognition in Reverberant Environments

Maganti, Hari Krishna;Matassoni, Marco
2011-01-01

Abstract

Distortions caused by reverberation have a detrimental effect on the performance of automatic speech recognition (ASR). In this work, an auditory filter-bank based feature is presented to improve ASR in reverberant conditions. The proposed technique is based on the gammachirp filter bank, which provides a level-dependent frequency response to emulate mechanisms of the human auditory system, particularly the basilar membrane filtering that contributes to the robustness of the ear. The low-frequency tail of the gammachirp filter, which is unaffected by the bandwidth parameters because of the level-dependent frequency resolution, is effective in reducing reverberation distortions. Experiments are performed on the Aurora-5 meeting recorder digit task, recorded with four different microphones in hands-free mode in a real meeting room. ASR experiments using the proposed gammachirp-based features show reliable and consistent improvements over conventional feature extraction techniques.
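
For readers unfamiliar with the gammachirp, the sketch below generates a single gammachirp impulse response in the form commonly given in the auditory-modelling literature, g(t) = t^(n-1) exp(-2*pi*b*ERB(fr)*t) cos(2*pi*fr*t + c*ln(t) + phase). It is only an illustration, not the feature extraction used in this work: the function names, the sampling rate, and the values of n, b and c are assumptions chosen for the example. In a level-dependent filter bank the chirp term c would vary with signal level; here it is passed in as a fixed parameter.

```python
import numpy as np


def erb_hz(f_hz):
    """Equivalent rectangular bandwidth (Glasberg-Moore style approximation)."""
    return 24.7 + 0.108 * f_hz


def gammachirp_ir(fr_hz, fs_hz=16000, dur_s=0.05, n=4, b=1.019, c=-2.0, phase=0.0):
    """Peak-normalised gammachirp impulse response (illustrative parameters).

    With c = 0 the chirp term vanishes and the filter reduces to a gammatone.
    """
    # Start at t = 1/fs rather than 0 so that log(t) is defined.
    t = np.arange(1, int(dur_s * fs_hz) + 1) / fs_hz
    envelope = t ** (n - 1) * np.exp(-2 * np.pi * b * erb_hz(fr_hz) * t)
    carrier = np.cos(2 * np.pi * fr_hz * t + c * np.log(t) + phase)
    g = envelope * carrier
    return g / np.max(np.abs(g))


def peak_frequency_hz(g, fs_hz=16000, nfft=8192):
    """Frequency (Hz) at which the magnitude spectrum of g is largest."""
    spectrum = np.abs(np.fft.rfft(g, nfft))
    return np.argmax(spectrum) * fs_hz / nfft


if __name__ == "__main__":
    for c in (0.0, -2.0):
        peak = peak_frequency_hz(gammachirp_ir(1000.0, c=c))
        print(f"c = {c:+.1f}: spectral peak near {peak:.0f} Hz")
```

Running the script with a negative c should show the spectral peak moving below the nominal centre frequency, which is the asymmetry that distinguishes the gammachirp from the symmetric gammatone.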

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/48792