Reverberant audio source separation using partially pre-trained non-negative matrix factorization

Abdelraheem, Mahmoud Fakhry Mahmoud;Svaizer, Piergiorgio;Omologo, Maurizio
2014-01-01

Abstract

This work addresses underdetermined audio source separation that exploits source-based prior information. Following the local Gaussian model of the mixing process, the covariance matrix of a mixture of audio sources is parameterized by source variances and spatial covariance matrices. An iterative algorithm is proposed that estimates the model parameters in the maximum likelihood (ML) sense by reformulating the parameters and using the prior information. With a matrix factorization algorithm such as nonnegative matrix factorization (NMF), the source variances can be represented as the product of two matrices: a spectral basis matrix and a time-varying coefficient matrix. The basis matrix of each source signal is trained in advance on a set of training data, while a corrupted copy of the coefficient matrix is estimated during the separation process. Moreover, the pre-trained information is exploited to move the amplitude indeterminacy of the spatial covariance matrix to the coefficient matrix domain. The proposed algorithm was evaluated under simulated and real mixing conditions and achieved high performance in reverberant environments.
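
The partially pre-trained factorization described in the abstract can be illustrated with a minimal single-channel sketch. It is not the paper's algorithm: the spatial covariance matrices and the ML estimation loop are left out, and the Kullback-Leibler multiplicative updates are an assumption, since the abstract does not name the NMF cost function. The function names `train_basis`, `estimate_coefficients`, and `kl_divergence` are hypothetical. The sketch only shows the split between a basis matrix W learned in advance and a coefficient matrix H estimated later with W held fixed.

```python
import numpy as np


def train_basis(V_train, n_components, n_iter=200, seed=0, eps=1e-12):
    """Learn a spectral basis W from a nonnegative training spectrogram.

    Assumption: KL-divergence multiplicative updates (Lee and Seung).
    V_train has shape (n_freq, n_frames).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V_train.shape
    W = rng.random((n_freq, n_components)) + eps
    H = rng.random((n_components, n_frames)) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V_train / WH)) / (W.sum(axis=0)[:, None] + eps)
        WH = W @ H + eps
        W *= ((V_train / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W


def estimate_coefficients(V, W, n_iter=200, seed=0, eps=1e-12):
    """Estimate the time-varying coefficients H while W stays fixed."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        # Only H is updated; the pre-trained basis W is never modified.
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
    return H


def kl_divergence(V, V_hat, eps=1e-12):
    """Generalized KL divergence between V and its model V_hat."""
    return float(np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat))
```

In a separation setting one would call `train_basis` once per source on that source's training data, then call `estimate_coefficients` on the observed variances at separation time; how those variances are obtained from a reverberant multichannel mixture is the part this sketch does not cover.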

Files in this item:
File: Paper_IWAENC_2014.pdf (not available)
Description: Main article
Type: Pre-print document
License: NOT PUBLIC - Private/restricted access
Size: 699.22 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/307068