Speech recognition in a realistic noisy environment using multiple microphones is the focal point of the third CHiME challenge. Over the baseline ASR system provided for this challenge, we apply state of the art algorithms for boosting acoustic model learning and hypothesis rescoring to improve the final output. To this aim, we first use the automatic transcription of each channel to re-train the acoustic model for that channel and then we apply linear language model rescoring to find a better solution in the n-best list. LM rescoring is performed using an efficient set of N-gram and Recurrent Neural Network LM (RNNLM) trained on a wisely-selected text set. In the experiments, we show that the proposed approach improves not only the individual channel transcription, but also the enhanced channels produced by MVDR and delay-and-sum beamforming.

Boosted acoustic model learning and hypotheses rescoring on the CHiME-3 task

Jalalvand, Shahab;Falavigna, Giuseppe Daniele;Matassoni, Marco;Svaizer, Piergiorgio;Omologo, Maurizio
2015-01-01

Abstract

Speech recognition in a realistic noisy environment using multiple microphones is the focal point of the third CHiME challenge. Over the baseline ASR system provided for this challenge, we apply state of the art algorithms for boosting acoustic model learning and hypothesis rescoring to improve the final output. To this aim, we first use the automatic transcription of each channel to re-train the acoustic model for that channel and then we apply linear language model rescoring to find a better solution in the n-best list. LM rescoring is performed using an efficient set of N-gram and Recurrent Neural Network LM (RNNLM) trained on a wisely-selected text set. In the experiments, we show that the proposed approach improves not only the individual channel transcription, but also the enhanced channels produced by MVDR and delay-and-sum beamforming.
2015
978-1-4799-7291-3
978-1-4799-7291-3
File in questo prodotto:
File Dimensione Formato  
CHiME3_ASRU2015.pdf

solo utenti autorizzati

Descrizione: articolo
Tipologia: Documento in Post-print
Licenza: NON PUBBLICO - Accesso privato/ristretto
Dimensione 267.34 kB
Formato Adobe PDF
267.34 kB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11582/307390
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
social impact