Comparison between Subband and Fullband NLMS for In-Car Audio Compensation and Hands-Free Speech Recognition

Omologo, Maurizio;Zieger, Christian
2005

Abstract

This paper deals with audio compensation in the car environment for the development of a hands-free spoken dialogue system with barge-in functionality. While a preliminary work investigated the problem for an interfering radio signal, here we focus only on compensating speech audio generated by a text-to-speech synthesizer. The latter is a more difficult signal to manage: speech is colored and non-stationary or quasi-stationary, which degrades the performance of acoustic echo cancellation (AEC) when a simple NLMS filter is used. We introduce a Subband Acoustic Echo Cancellation (SAEC) scheme for compensating the synthetic prompt speech and compare it with a Fullband Acoustic Echo Cancellation (FAEC) scheme, demonstrating its effectiveness in terms of speed of convergence, robustness against noise, and computational complexity. System performance is measured in terms of Word Error Rate (WER, %) on isolated and connected digit recognition.
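For readers unfamiliar with the fullband baseline named in the abstract, the following is a minimal sketch of a sample-by-sample NLMS echo canceller. It is not the authors' implementation: the function names, the tap count, the step size `mu`, and the four-tap echo path are all hypothetical values chosen for illustration. The simulated far-end signal is white Gaussian noise, which is the easy case for NLMS; the abstract's point is that colored, non-stationary synthesized speech converges more slowly, which this toy example does not reproduce.

```python
import random


def nlms_echo_cancel(far_end, mic, num_taps=8, mu=0.5, eps=1e-8):
    """Fullband NLMS sketch: adapt an FIR estimate of the echo path.

    far_end: reference samples sent to the loudspeaker (e.g. the prompt).
    mic:     microphone samples containing the echo of far_end.
    Returns the residual signal and the final filter weights.
    """
    w = [0.0] * num_taps       # adaptive filter weights
    x_buf = [0.0] * num_taps   # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        x_buf = [x] + x_buf[:-1]
        y = sum(wi * xi for wi, xi in zip(w, x_buf))   # echo estimate
        e = d - y                                      # residual after cancellation
        norm = sum(xi * xi for xi in x_buf) + eps      # input power normalization
        step = mu * e / norm
        w = [wi + step * xi for wi, xi in zip(w, x_buf)]
        residual.append(e)
    return residual, w


def simulate(seed=0, n=4000):
    """Build a noise-free toy scenario with a hypothetical 4-tap echo path."""
    rng = random.Random(seed)
    echo_path = [0.6, -0.3, 0.15, 0.05]   # assumed for illustration only
    far_end = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mic = [
        sum(h * far_end[i - k] for k, h in enumerate(echo_path) if i - k >= 0)
        for i in range(n)
    ]
    return far_end, mic, echo_path


if __name__ == "__main__":
    far_end, mic, echo_path = simulate()
    residual, w = nlms_echo_cancel(far_end, mic)
    print("estimated echo path:", [round(wi, 3) for wi in w[:4]])
```

A subband variant would run one such adaptive filter per frequency band on decimated signals, so each filter is shorter and sees a spectrally flatter input; the analysis and synthesis filter banks that requires are beyond this sketch.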

Use this identifier to cite or link to this document: http://hdl.handle.net/11582/3477