BIO-INSPIRED AUDITORY PROCESSING FOR SPEECH FEATURE ENHANCEMENT

Maganti, Hari Krishna;Matassoni, Marco
2011

Abstract

Mel-frequency cepstrum based features have traditionally been used for speech recognition in a number of applications, as they naturally provide high recognition accuracy. However, these features are not very robust in noisy acoustic conditions. In this article, we investigate the use of bio-inspired auditory features that emulate the processing performed by the cochlea to improve robustness, particularly against environmental reverberation. Our methodology first extracts robust, noise-resistant features by gammatone filtering, which emulates the frequency resolution of the cochlea, and then applies long-term modulation spectral processing, which preserves speech intelligibility in the signal. We compare and discuss the features based on their performance on the Aurora5 meeting recorder digit task, recorded with four different microphones in hands-free mode in a real meeting room. The experimental results show that the proposed features provide considerable improvements over state-of-the-art feature extraction techniques.
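
The two stages named in the abstract (gammatone filtering followed by modulation spectral processing) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the ERB formulas (the widely used Glasberg and Moore approximation), the 4th-order filter, the 25 ms / 10 ms framing, and the band count are all assumptions chosen for illustration, and the modulation step here is only a plain Fourier transform of each band's log-energy trajectory.

```python
import numpy as np

def erb(fc_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg & Moore approximation)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)

def erb_space(f_low, f_high, n_bands):
    """Centre frequencies equally spaced on the ERB-rate scale (assumed spacing)."""
    to_rate = lambda f: 21.4 * np.log10(4.37 * f / 1000.0 + 1.0)
    from_rate = lambda r: (10.0 ** (r / 21.4) - 1.0) * 1000.0 / 4.37
    return from_rate(np.linspace(to_rate(f_low), to_rate(f_high), n_bands))

def gammatone_ir(fc_hz, fs, duration=0.05, order=4):
    """Unit-energy gammatone impulse response centred at fc_hz."""
    t = np.arange(int(duration * fs)) / fs
    b = 1.019 * erb(fc_hz)  # bandwidth parameter for a 4th-order filter
    ir = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc_hz * t)
    return ir / np.sqrt(np.sum(ir ** 2))

def gammatone_log_energies(x, fs, n_bands=16, frame_len=0.025, hop=0.010):
    """Frame-wise log energy in each gammatone band: shape (n_frames, n_bands)."""
    fcs = erb_space(100.0, 0.45 * fs, n_bands)
    flen, fhop = int(frame_len * fs), int(hop * fs)
    n_frames = 1 + (len(x) - flen) // fhop
    out = np.empty((n_frames, n_bands))
    for k, fc in enumerate(fcs):
        # Causal FIR filtering with the truncated impulse response.
        y = np.convolve(x, gammatone_ir(fc, fs))[: len(x)]
        for i in range(n_frames):
            seg = y[i * fhop : i * fhop + flen]
            out[i, k] = np.log(np.sum(seg ** 2) + 1e-10)
    return fcs, out

def modulation_spectrum(log_energies, frame_rate):
    """Magnitude spectrum of each band's mean-removed log-energy trajectory."""
    traj = log_energies - log_energies.mean(axis=0)
    spec = np.abs(np.fft.rfft(traj, axis=0))
    freqs = np.fft.rfftfreq(traj.shape[0], d=1.0 / frame_rate)
    return freqs, spec
```

With a 10 ms hop the frame rate passed to `modulation_spectrum` is 100 Hz. Speech modulation energy is commonly reported to concentrate at a few hertz, which is why long analysis windows over the band trajectories are used in this family of features.
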
ISBN: 9789898425355

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/20689