An Auditory Based Modulation Spectral Feature for Reverberant Speech Recognition
Maganti, Hari Krishna; Matassoni, Marco
2010-01-01
Abstract
This paper presents an auditory-based modulation spectral feature that improves automatic speech recognition (ASR) performance in the presence of room reverberation. The approach extracts features modeled on auditory processing, specifically long-term modulation spectral features computed after gammatone filtering, to reduce sensitivity to environmental noise while preserving the speech intelligibility information that is essential for ASR. Experiments are performed on the Aurora-5 meeting recorder digit task, recorded with four different microphones in hands-free mode in a real meeting room. For comparison, recognition results are also reported for the standard ETSI basic and advanced front-ends and for conventional features with standard feature compensation. The experimental results show that the proposed features provide reliable and considerable improvements over state-of-the-art feature extraction techniques.

File | Size | Format | |
---|---|---|---|
INTERSPEECH2010.pdf (not available; type: post-print document; license: DRM not defined) | 264.66 kB | Adobe PDF | |
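
As an illustration only, the processing chain named in the abstract (gammatone filtering followed by long-term modulation spectral analysis) might look like the sketch below. It is not the authors' implementation: the fourth-order filter, the Glasberg and Moore ERB formula, the RMS envelope detector, the 100 Hz envelope rate, the Hann window, and all function names are assumptions chosen for the example.

```python
import cmath
import math


def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore approximation)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)


def gammatone_ir(fc, fs, duration=0.025, order=4):
    """Unit-energy impulse response of one gammatone filter centred at fc Hz."""
    b = 1.019 * erb(fc)  # bandwidth parameter (assumed scaling)
    ir = []
    for n in range(int(duration * fs)):
        t = n / fs
        ir.append(t ** (order - 1) * math.exp(-2 * math.pi * b * t)
                  * math.cos(2 * math.pi * fc * t))
    norm = math.sqrt(sum(v * v for v in ir))
    return [v / norm for v in ir]


def band_envelope(x, ir, hop):
    """Filter x with ir, then return the RMS envelope sampled once per hop."""
    env = []
    for start in range(0, len(x) - hop + 1, hop):
        acc = 0.0
        for n in range(start, start + hop):
            # Causal FIR convolution: y[n] = sum_k ir[k] * x[n - k]
            y = sum(ir[k] * x[n - k] for k in range(min(len(ir), n + 1)))
            acc += y * y
        env.append(math.sqrt(acc / hop))
    return env


def modulation_spectrum(env, env_rate):
    """Magnitude DFT of a mean-removed, Hann-windowed envelope segment."""
    n = len(env)
    mean = sum(env) / n
    seg = [(e - mean) * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
           for i, e in enumerate(env)]
    freqs = [k * env_rate / n for k in range(n // 2 + 1)]
    mags = [abs(sum(s * cmath.exp(-2j * math.pi * k * i / n)
                    for i, s in enumerate(seg)))
            for k in range(n // 2 + 1)]
    return freqs, mags


def gammatone_modulation_features(x, fs, centre_freqs, env_rate=100):
    """Return modulation frequencies and one modulation spectrum per band."""
    hop = fs // env_rate  # samples per envelope frame
    freqs, feats = [], []
    for fc in centre_freqs:
        env = band_envelope(x, gammatone_ir(fc, fs), hop)
        freqs, mags = modulation_spectrum(env, env_rate)
        feats.append(mags)
    return freqs, feats
```

Calling `gammatone_modulation_features(x, fs, [500.0, 1000.0])` on a one-second signal returns one modulation spectrum per band. The whole envelope is analysed as a single long segment, which is the sense in which this sketch is "long-term". A practical front-end would use many more bands and a faster filtering routine than this pure-Python loop.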
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.