This paper tackles the front-back disambiguity problem in speaker localization when the audio signals are captured by a symmetric microphone array. To this end, a deep neural network is proposed with an attention-based mechanism designed to assign different weights to features obtained from individual microphones. For support, a real dataset with synchronized multichannel audio signals captured by a large linear microphone array is introduced, along with manual annotations. The experimental results demonstrate the effectiveness of the proposed method over the other approaches. In particular, more than 50% reduction in Equal Error Rate (EER) is achieved when comparing with the single-channel case. The designed multi-channel self-attention mechanism also brings further improvements. The dataset and source code will be released.

Speaker front‐back disambiguity using multi‐channel speech signals

Qian, Xinyuan;Brutti, Alessio
2022-01-01

Abstract

This paper tackles the front-back disambiguity problem in speaker localization when the audio signals are captured by a symmetric microphone array. To this end, a deep neural network is proposed with an attention-based mechanism designed to assign different weights to features obtained from individual microphones. For support, a real dataset with synchronized multichannel audio signals captured by a large linear microphone array is introduced, along with manual annotations. The experimental results demonstrate the effectiveness of the proposed method over the other approaches. In particular, more than 50% reduction in Equal Error Rate (EER) is achieved when comparing with the single-channel case. The designed multi-channel self-attention mechanism also brings further improvements. The dataset and source code will be released.
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11582/335802
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
social impact