Non-linear Spectro-temporal Modulations for Reverberant Speech Recognition
Matassoni, Marco;Maganti, Hari Krishna;Omologo, Maurizio
2011-01-01
Abstract
This paper introduces a novel set of non-linear spectro-temporal features that improve automatic speech recognition performance in the presence of room reverberation. The approach extracts features derived from auditory characteristics, combining gammatone filtering, non-linear processing, and modulation spectral processing to emulate the mechanisms of the cochlea and middle ear that contribute to the robustness of human hearing. Experiments are performed on the Aurora-5 meeting recorder digit (MRD) task, recorded with four different distant microphones in hands-free mode in a real meeting room. For comparison, recognition results obtained with standard conventional features are also reported. The experimental results show that the proposed features provide considerable improvements over state-of-the-art feature extraction techniques.
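The three processing stages named in the abstract (gammatone filtering, a non-linearity, and modulation spectral analysis) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the cube-root compression, the 100 Hz envelope rate, the 32-sample analysis window, and the ERB bandwidth formula are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

def erb(fc):
    # Equivalent rectangular bandwidth in Hz (Glasberg-Moore approximation, assumed here)
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def gammatone_ir(fc, fs, duration=0.025, order=4):
    # Impulse response of a 4th-order gammatone filter centred at fc (hypothetical helper)
    t = np.arange(int(duration * fs)) / fs
    b = 1.019 * erb(fc)
    ir = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return ir / np.sqrt(np.sum(ir ** 2))  # unit-energy normalisation

def modulation_features(signal, fs, center_freqs, env_rate=100, win=32, hop=16):
    # fs is assumed to be an integer multiple of env_rate
    step = fs // env_rate
    feats = []
    for fc in center_freqs:
        # 1) Gammatone filtering of the input into one auditory band
        band = np.convolve(signal, gammatone_ir(fc, fs), mode="same")
        # 2) Envelope by full-wave rectification, then block averaging down to env_rate
        env = np.abs(band)
        n = len(env) // step
        env = env[: n * step].reshape(n, step).mean(axis=1)
        # 3) Compressive non-linearity (cube root is an assumption, not from the paper)
        env = np.cbrt(env)
        # 4) Modulation spectrum: windowed FFT magnitude over envelope frames
        frames = np.stack([env[i:i + win] for i in range(0, len(env) - win + 1, hop)])
        feats.append(np.abs(np.fft.rfft(frames * np.hanning(win), axis=1)))
    # Result shape: (frames, bands, modulation bins)
    return np.stack(feats, axis=1)
```

With these assumed settings, each modulation bin spans 100 / 32 = 3.125 Hz, so slow envelope fluctuations in the range typical of speech fall in the lowest few bins of each band.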