Robust two-channel TDOA estimation for multiple speaker localization by using recursive ICA and a state coherence transform
Nesta, Francesco; Svaizer, Piergiorgio; Omologo, Maurizio
2009-01-01
Abstract
A novel method is presented for robust two-channel estimation of multiple Time Differences of Arrival (TDOAs) for multispeaker localization, which provides satisfactory performance even in highly reverberant environments. The method is based on recursive frequency-domain Independent Component Analysis (ICA) and on a novel State Coherence Transform (SCT). By exploiting the phase coherence of the demixing matrices obtained in the ICA stage, the SCT generates envelopes with clear peaks at the corresponding maximum-likelihood TDOAs. The SCT envelopes are computed independently in each time block, and accurate multiple TDOAs are estimated by means of a time-frequency sparse representation of the sources. The method has been applied to real data obtained by recording many sources in a room with a reverberation time of 700 ms. Experimental results show that accurate localization of 7 closely spaced sources is possible given only a few seconds of data, even at low SNR. Experiments also show the advantage of the proposed solution over the well-known GCC-PHAT.
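For readers unfamiliar with the baseline named above, the following is a minimal sketch of a standard two-channel GCC-PHAT delay estimator. It is not the proposed ICA/SCT method and is not taken from the paper; the function name `gcc_phat`, its parameters, and the small regularization constant are assumptions made for illustration only.

```python
import numpy as np

def gcc_phat(x1, x2, fs, max_tau=None):
    """Estimate the delay of x2 relative to x1, in seconds, using GCC-PHAT.

    Illustrative sketch only: names and defaults are hypothetical.
    A positive result means x2 lags x1.
    """
    n = len(x1) + len(x2)            # zero-pad to avoid circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)

    # Cross-power spectrum; the PHAT weighting discards the magnitude
    # and keeps only the phase (1e-12 is an assumed guard against 0/0).
    cross = X2 * np.conj(X1)
    cross /= np.abs(cross) + 1e-12

    cc = np.fft.irfft(cross, n=n)    # generalized cross-correlation

    # Restrict the search to physically plausible lags if requested.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs
```

Calling `gcc_phat(x1, x2, fs, max_tau=d / c)` with microphone spacing `d` and speed of sound `c` limits the search to feasible delays. Note that this sketch returns only the single dominant peak, whereas the abstract concerns the estimation of multiple TDOAs from several simultaneous sources.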