This paper describes a new corpus of multi-channel audio data designed to study and develop distant-speech recognition systems able to cope with known interfering sounds propagating in an environment. The corpus consists of both real and simulated signals and of a corresponding detailed annotation. An extensive set of speech recognition experiments was conducted using three different Acoustic Echo Cancellation (AEC) techniques to establish baseline results for future reference. The AEC techniques were applied both to single distant microphone input signals and beamformed signals generated using two state-of-the-art beamforming techniques. We show that the speech recognition performance using the different techniques is comparable for both the simulated and real data, demonstrating the usefulness of this corpus for speech research. We also show that a significant improvement in speech recognition performance can be obtained by combining state-of-the-art AEC and beamforming techniques, compared to using a single distant microphone input.

A multi-channel corpus for distant-speech interaction in presence of known interferences

Ravanelli, Mirco;Svaizer, Piergiorgio;Omologo, Maurizio
2015-01-01

Abstract

This paper describes a new corpus of multi-channel audio data designed to study and develop distant-speech recognition systems able to cope with known interfering sounds propagating in an environment. The corpus consists of both real and simulated signals and of a corresponding detailed annotation. An extensive set of speech recognition experiments was conducted using three different Acoustic Echo Cancellation (AEC) techniques to establish baseline results for future reference. The AEC techniques were applied both to single distant microphone input signals and beamformed signals generated using two state-of-the-art beamforming techniques. We show that the speech recognition performance using the different techniques is comparable for both the simulated and real data, demonstrating the usefulness of this corpus for speech research. We also show that a significant improvement in speech recognition performance can be obtained by combining state-of-the-art AEC and beamforming techniques, compared to using a single distant microphone input.
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11582/272020
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
social impact