Unsupervised spatial dictionary learning for sparse underdetermined multichannel source separation

Nesta, Francesco; Fakhry, M.

Multichannel sparse representation of acoustic sources has been shown to provide an attractive framework for source separation. Multichannel sparse modeling assumes that signals can be described as linear combinations of a few atoms from a pre-specified dictionary. The dictionary is built by simulating room impulse responses (RIRs) on a grid of locations, exploiting prior knowledge of the room geometry and reflection coefficients. However, because this modeling is simplified, any mismatch between the simulated and the true observed RIRs can introduce considerable distortion in the recovered output signals. In this work, we propose an unsupervised adaptation of the dictionary through a semi-blind weighted Natural Gradient, assuming spatio-temporal source sparseness. The system continuously adapts the atoms to the incoming data, improving the match between the dictionary and the true mixing parameters. Results on simulated data show that the proposed framework is a promising solution to underdetermined convolutive source separation in difficult acoustic scenarios.
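The idea of adapting dictionary atoms to incoming data can be illustrated with a minimal sketch. It is not the weighted Natural Gradient proposed here: it swaps in a plain normalized running-average update, works at a single frequency bin, and assumes only one source is active per observation. All names (`steering_dictionary`, `best_atom`, `adapt_atom`, `run_demo`), the two-microphone free-field dictionary, and the numeric values are hypothetical choices made for illustration.

```python
import numpy as np

def steering_dictionary(angles_deg, freq_hz=1000.0, mic_dist_m=0.1, c=343.0):
    """Unit-norm free-field atoms for a 2-microphone array, one per grid angle.
    Assumption: a stand-in for atoms derived from simulated RIRs."""
    delays = mic_dist_m * np.sin(np.deg2rad(angles_deg)) / c
    atoms = np.stack([np.ones_like(delays, dtype=complex),
                      np.exp(-2j * np.pi * freq_hz * delays)], axis=1)
    return atoms / np.sqrt(2.0)

def best_atom(dictionary, x):
    """Index of the atom most aligned with observation x (one active source)."""
    return int(np.argmax(np.abs(dictionary.conj() @ x)))

def adapt_atom(atom, x, step):
    """Running-average update (illustrative, not Natural Gradient):
    pull the atom toward the phase-aligned, normalized observation."""
    gain = np.vdot(atom, x)  # atom^H x, carries the unknown source phase
    target = x * np.exp(-1j * np.angle(gain)) / np.linalg.norm(x)
    updated = (1.0 - step) * atom + step * target
    return updated / np.linalg.norm(updated)

def run_demo(n_obs=200, step=0.1, seed=0):
    """Return the best atom-to-truth alignment before and after adaptation."""
    rng = np.random.default_rng(seed)
    dictionary = steering_dictionary(np.arange(-90, 91, 30))
    # Hypothetical true mixing vector: inter-channel attenuation and phase
    # that no free-field atom on the grid reproduces exactly.
    h = np.array([1.0, 0.7 * np.exp(-0.9j)])
    h = h / np.linalg.norm(h)
    before = float(np.max(np.abs(dictionary.conj() @ h)))
    for _ in range(n_obs):
        s = rng.standard_normal() + 1j * rng.standard_normal()  # source coefficient
        x = s * h                                               # noise-free observation
        k = best_atom(dictionary, x)
        dictionary[k] = adapt_atom(dictionary[k], x, step)
    after = float(np.max(np.abs(dictionary.conj() @ h)))
    return before, after

if __name__ == "__main__":
    before, after = run_demo()
    print(f"alignment before: {before:.4f}, after: {after:.4f}")
```

In this toy setting the selected atom drifts from its simulated free-field value toward the true mixing vector, which is the kind of mismatch reduction the adaptation is meant to achieve. A realistic system would also have to handle noise, several frequency bins, and observations in which more than one source is active.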