Enhancing robustness for speech recognition through bio-inspired auditory filter-bank

Maganti, Hari Krishna;Matassoni, Marco
2012

Abstract

One important property of basilar membrane filtering that contributes to the robustness of the human ear is lateral inhibition-based, level-dependent frequency resolution. However, this property has not been extensively considered for improving the robustness of speech processing systems. In this work, an auditory filter-bank that includes lateral inhibition driven by the input stimulus, and that provides a good fit to human auditory masking, is used to improve the robustness of a speech recognition system. The gammachirp auditory filter is the real part of the analytic gammachirp function, which has been shown to provide an accurate description of the asymmetry and lateral inhibition observed in basilar membrane filtering. The gammachirp is characterised by asymmetry in the low-frequency tail of the auditory filter response and models level-dependent properties such as a decrease in gain and a shift in the centre frequency of the filter as the level increases. Speech recognition experiments using the standard HTK framework are performed on the standard Aurora-5 digit task database, covering both simulated data and real data recorded with distant microphones in hands-free mode in a real meeting room. The gammachirp-based features show reliable and consistent improvements over the conventional features used for speech recognition.
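
As an illustration of the statement that the gammachirp filter is the real part of the analytic gammachirp function, the following is a minimal sketch of one impulse response, assuming the commonly cited form t^(n-1) exp(-2 pi b ERB(f_r) t) exp(j (2 pi f_r t + c ln t + phase)). The function names, the sampling rate, the duration and the parameter values (n = 4, b = 1.019, c = -2.0) are illustrative assumptions and are not taken from this record. The chirp parameter c is held fixed here, so the level-dependent behaviour described in the abstract is not modelled.

```python
import cmath
import math


def erb(f_hz):
    """Equivalent rectangular bandwidth in Hz (assumed Glasberg-Moore style fit)."""
    return 24.7 + 0.108 * f_hz


def gammachirp(fr_hz, fs_hz=16000.0, duration_s=0.025,
               n=4, b=1.019, c=-2.0, phase=0.0):
    """Return a peak-normalised gammachirp impulse response as a list of floats.

    All default values are illustrative assumptions, not the paper's settings.
    """
    num_samples = int(round(duration_s * fs_hz))
    response = []
    # Start at the first sample after t = 0 so that log(t) is defined.
    for k in range(1, num_samples + 1):
        t = k / fs_hz
        # Gamma-distribution envelope shared with the gammatone filter.
        envelope = t ** (n - 1) * math.exp(-2.0 * math.pi * b * erb(fr_hz) * t)
        # The c * ln(t) phase term is the chirp; c = 0 reduces to a gammatone.
        angle = 2.0 * math.pi * fr_hz * t + c * math.log(t) + phase
        analytic = envelope * cmath.exp(1j * angle)
        # The auditory filter is the real part of the analytic function.
        response.append(analytic.real)
    peak = max(abs(x) for x in response)
    return [x / peak for x in response]


if __name__ == "__main__":
    h = gammachirp(1000.0)
    print(len(h), max(abs(x) for x in h))
```

A filter-bank in this style would repeat the call over a set of centre frequencies; how those frequencies are spaced and how c would vary with level are left open here.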

Use this identifier to cite or link to this document: http://hdl.handle.net/11582/79402