Unsupervised spatial dictionary learning for sparse underdetermined multichannel source separation

Nesta, Francesco
2013-01-01

Abstract

Multichannel sparse representation of acoustic sources has been shown to provide an attractive framework for source separation. Multichannel sparse modeling assumes that signals can be described as linear combinations of a few atoms from a pre-specified dictionary. The dictionary is built by simulating room impulse responses (RIRs) on a grid of locations, exploiting prior knowledge of the room geometry and reflection coefficients. However, because the modeling is simplified, any mismatch between the simulated and the truly observed RIRs can cause considerable distortion in the recovered output signals. In this work we propose an unsupervised adaptation of the dictionary through a semi-blind weighted Natural Gradient, assuming spatio-temporal source sparseness. The system continuously adapts the atoms to the incoming data, improving the match between the dictionary and the true mixing parameters. Results on simulated data show that the proposed framework is a promising solution to underdetermined convolutive source separation in difficult acoustic scenarios.
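
The idea of describing an observation with a few atoms from a spatial dictionary can be illustrated with a minimal sketch. It is not the method of the paper: it assumes free-field steering vectors for a two-microphone array in place of simulated RIRs, a single frequency, and one dominant source per observation. It covers only the atom-matching step, not the Natural Gradient adaptation. All names (`steering_atom`, `build_dictionary`, `best_atom`) and parameter values are hypothetical.

```python
import cmath
import math


def steering_atom(angle_deg, freq_hz, mic_spacing_m=0.1, c=343.0):
    """Unit-norm two-microphone atom for a far-field source.

    Assumption: free-field propagation, standing in for a simulated RIR.
    """
    delay = mic_spacing_m * math.cos(math.radians(angle_deg)) / c
    atom = [1.0 + 0j, cmath.exp(-2j * math.pi * freq_hz * delay)]
    norm = math.sqrt(sum(abs(a) ** 2 for a in atom))
    return [a / norm for a in atom]


def build_dictionary(angles_deg, freq_hz):
    """One atom per candidate direction on the grid."""
    return {angle: steering_atom(angle, freq_hz) for angle in angles_deg}


def best_atom(observation, dictionary):
    """Return (angle, coefficient) of the atom most correlated with the observation."""
    best_angle, best_coeff = None, 0j
    for angle, atom in dictionary.items():
        # Projection of the observation onto the unit-norm atom.
        coeff = sum(a.conjugate() * x for a, x in zip(atom, observation))
        if best_angle is None or abs(coeff) > abs(best_coeff):
            best_angle, best_coeff = angle, coeff
    return best_angle, best_coeff


# Hypothetical grid of directions (degrees) at a single frequency.
grid = range(0, 181, 10)
dictionary = build_dictionary(grid, freq_hz=1000.0)

# Synthetic observation: one source located exactly on a grid point.
source = 0.8 - 0.3j
observation = [source * a for a in dictionary[60]]

angle, coeff = best_atom(observation, dictionary)
```

When the true mixing vector coincides with a dictionary atom, as in this synthetic case, the projection coefficient recovers the source value. With a mismatched dictionary the projection is only approximate, which is the kind of distortion the abstract attributes to simplified modeling.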
Use this identifier to cite or link to this document: https://hdl.handle.net/11582/178612