Reverberant audio source separation using partially pre-trained non-negative matrix factorization
Abdelraheem, Mahmoud Fakhry Mahmoud;Svaizer, Piergiorgio;Omologo, Maurizio
2014-01-01
Abstract
This work addresses the problem of underdetermined audio source separation by exploiting source-based prior information. Following a local Gaussian model of the mixing process, the covariance matrix of a mixture of audio sources is parameterized by source variances and spatial covariance matrices. An iterative algorithm is proposed to estimate the model parameters in the maximum likelihood (ML) sense by reformulating the parameters and using the prior information. By applying a matrix factorization algorithm such as nonnegative matrix factorization (NMF), the source variance can be represented as the product of two matrices: a spectral basis matrix and a time-varying coefficient matrix. The basis matrix of each source signal is trained in advance on a set of training data, while a corrupted copy of the coefficient matrix is estimated during the separation process. Moreover, the pre-trained information is exploited to move the amplitude indeterminacy of the spatial covariance matrix to the coefficient matrix domain. The proposed algorithm was evaluated under simulated and real mixing conditions, and it achieved high performance in reverberant environments.

File | Description | Type | License | Size | Format |
---|---|---|---|---|---|
Paper_IWAENC_2014.pdf | Main article | Pre-print | Not public (private/restricted access) | 699.22 kB | Adobe PDF |
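The abstract's idea of keeping a pre-trained spectral basis fixed while estimating only the time-varying coefficients can be illustrated with a minimal single-channel sketch. This is not the authors' algorithm, which also estimates spatial covariance matrices for multichannel mixtures. The Itakura-Saito divergence, the majorization-minimization update with exponent 0.5, and the names `estimate_coefficients` and `is_divergence` are assumptions chosen for illustration.

```python
import numpy as np


def is_divergence(V, V_hat, eps=1e-12):
    """Itakura-Saito divergence between two nonnegative arrays (assumed cost)."""
    R = (V + eps) / (V_hat + eps)
    return float(np.sum(R - np.log(R) - 1.0))


def estimate_coefficients(V, W, n_iter=100, eps=1e-12, seed=0):
    """Estimate H >= 0 such that V is approximated by W @ H, with W held fixed.

    V : (F, N) nonnegative power spectrogram (source variance estimate).
    W : (F, K) nonnegative spectral basis, assumed trained in advance.
    Returns H : (K, N) time-varying coefficient matrix.
    """
    rng = np.random.default_rng(seed)
    # Random strictly positive initialization of the coefficients.
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        V_hat = W @ H + eps
        numerator = W.T @ (V / V_hat**2)
        denominator = W.T @ (1.0 / V_hat) + eps
        # Multiplicative update; W is never modified (hypothetical setup).
        H *= (numerator / denominator) ** 0.5
    return H
```

Under these assumptions, calling `estimate_coefficients(V, W)` with a spectrogram `V` of shape `(F, N)` and a basis `W` of shape `(F, K)` returns a nonnegative `H` of shape `(K, N)`; because the update is multiplicative, entries of `H` stay nonnegative throughout.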
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.