Distant-talking activity detection with multi-channel audio input

Armani, Luca;Matassoni, Marco;Omologo, Maurizio;Svaizer, Piergiorgio
2003-01-01

Abstract

In this paper we focus on the problem of voice activity detection for distant-talking speech recognition. Under noisy conditions, a coherence measure between different microphones turns out to be an effective feature in the case of moderate SNR and reverberant signals. Nonlinear processing that uses the phase information in the Cross-Power Spectrum leads to the detection of speech segments. A recognition test on a real multi-channel corpus of isolated words shows that the proposed technique outperforms the classic energy-based algorithm.
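
The abstract does not spell out the nonlinear processing, so the following is only a minimal sketch of one well-known way to turn Cross-Power Spectrum phase into an inter-microphone coherence value: the phase transform (often called GCC-PHAT). It is not the authors' implementation. The function names `dft`, `idft` and `phat_coherence`, the `eps` regularizer, and the idea of comparing the peak against a threshold are all assumptions made for illustration.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform (fine for short frames)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive O(n^2) inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def phat_coherence(frame_a, frame_b, eps=1e-12):
    """Return (peak, lag) of the phase-only cross-correlation of two frames.

    Hypothetical helper: the cross-power spectrum is normalized to unit
    magnitude so that only its phase survives. A single coherent source
    then yields a sharp peak close to 1, while uncorrelated noise on the
    two channels yields a low, flat response. A positive lag means that
    frame_b is delayed with respect to frame_a (circular delay).
    """
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same length")
    spec_a = dft(frame_a)
    spec_b = dft(frame_b)
    # Cross-power spectrum between the two channels.
    cross = [a.conjugate() * b for a, b in zip(spec_a, spec_b)]
    # Nonlinear step: discard the magnitude, keep the phase only.
    whitened = [c / (abs(c) + eps) for c in cross]
    correlation = [value.real for value in idft(whitened)]
    peak = max(correlation)
    lag = correlation.index(peak)
    n = len(correlation)
    if lag > n // 2:
        lag -= n  # map the circular index to a signed lag
    return peak, lag
```

Under these assumptions, a frame could be labelled as speech when `peak` exceeds a tuned threshold; the threshold value, the frame length and any smoothing over time would have to be chosen experimentally and are not given in the abstract.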

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/929