Robust two-channel TDOA estimation for multiple speaker localization by using recursive ICA and a state coherence transform

Nesta, Francesco;Svaizer, Piergiorgio;Omologo, Maurizio
2009

Abstract

A novel method is presented for robust two-channel estimation of multiple Time Differences of Arrival (TDOAs) for multi-speaker localization, which provides satisfactory performance even in highly reverberant environments. The method is based on a recursive frequency-domain Independent Component Analysis (ICA) and on a novel State Coherence Transform (SCT). By exploiting the phase coherence of the demixing matrices obtained in the ICA stage, the SCT generates envelopes with clear peaks at the corresponding maximum-likelihood TDOAs. The SCT envelopes are computed independently in each time block, and accurate multiple TDOAs are estimated by means of a time-frequency sparse representation of the sources. The method has been applied to real data obtained by recording many sources in a room with a reverberation time of 700 ms. Experimental results show that accurate localization of 7 closely spaced sources is possible given only a few seconds of data, even at low SNR. Experiments also show the advantage of the proposed solution over the well-known GCC-PHAT.
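
For orientation, the GCC-PHAT baseline named in the abstract can be sketched as follows. This is a minimal illustration of the standard generalized cross-correlation with phase transform for a single source and one microphone pair, not the authors' recursive ICA or SCT method. The function name `gcc_phat`, its parameters, and the synthetic test signal are assumptions made for this example.

```python
import numpy as np


def gcc_phat(x1, x2, fs, max_tau=None):
    """Estimate the delay of x2 relative to x1, in seconds, with GCC-PHAT.

    Hypothetical helper for illustration; a positive result means that
    x2 lags behind x1.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(x1) + len(x2)
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)

    # Cross-power spectrum; the PHAT weighting discards the magnitude
    # and keeps only the phase (small constant avoids division by zero).
    cross = X2 * np.conj(X1)
    cross /= np.abs(cross) + 1e-12
    cc = np.fft.irfft(cross, n=n)

    # Restrict the search to physically plausible lags if requested.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that the lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs


# Synthetic check (assumed values): white noise delayed by 5 samples.
fs = 16000
rng = np.random.default_rng(0)
source = rng.standard_normal(4096)
true_lag = 5
mic1 = source
mic2 = np.concatenate((np.zeros(true_lag), source[:-true_lag]))

tau = gcc_phat(mic1, mic2, fs, max_tau=0.001)
estimated_lag = round(tau * fs)
```

The sketch returns only the single strongest peak, which is why a plain GCC-PHAT struggles when several speakers are active at once; the abstract's approach instead derives multiple TDOAs from the ICA demixing matrices.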
ISBN: 9781424423538

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/8694