Sub-band Spectral Variance Feature for Noise Robust ASR

Maganti, Hari Krishna;Zanon, Silvia;Matassoni, Marco;Brutti, Alessio
2011-01-01

Abstract

The goal of this work is to improve automatic speech recognition (ASR) performance in very noisy and reverberant environments. The proposed solution extracts features based on sub-band spectral variance normalization, which can estimate the relative strengths of the speech and noise components both in the presence and in the absence of speech. The advanced ETSI-2 front-end, RASTA-PLP, and MFCC (alone and in combination with spectral subtraction) are tested for comparison. Speech recognition evaluations are performed on the standard noisy AURORA-2 database and on the meeting recorder digit (MRD) subset of the AURORA-5 database, which represent additive-noise and reverberant acoustic conditions, respectively. The results show that the proposed method is robust and reliable in both low-SNR and reverberant scenarios, and that it provides considerable improvements over traditional feature extraction techniques.
Files in this record:
File: 1569427227.pdf (not available)
Type: Post-print document
License: DRM not defined
Size: 355.92 kB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11582/48793