IRIS Institutional Research Information System

Reverberant audio source separation using partially pre-trained non-negative matrix factorization

Abdelraheem, Mahmoud Fakhry Mahmoud;Svaizer, Piergiorgio;Omologo, Maurizio

2014-01-01

Abstract

This work addresses underdetermined audio source separation using source-based prior information. Following the local Gaussian model of the mixing process, the covariance matrix of a mixture of audio sources is parameterized by source variances and spatial covariance matrices. An iterative algorithm is proposed that estimates the model parameters in the maximum likelihood (ML) sense by reformulating the parameters and using the prior information. With a matrix factorization algorithm such as nonnegative matrix factorization (NMF), the source variance can be represented as the product of two matrices: a spectral basis matrix and a time-varying coefficient matrix. The basis matrix of each source signal is trained in advance on a set of training data, while a corrupted copy of the coefficient matrix is estimated during the separation process. Moreover, the pre-trained information is exploited to move the amplitude indeterminacy of the spatial covariance matrix to the coefficient matrix domain. The proposed algorithm was evaluated under simulated and real mixing conditions and achieved high performance in reverberant environments.
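The idea of a pre-trained basis matrix with coefficients estimated at separation time can be illustrated with a minimal sketch. This is not the algorithm of the paper: it assumes a single-channel nonnegative spectrogram V, a fixed basis W, and the standard multiplicative update for the Euclidean cost rather than the ML criterion of the local Gaussian model. The function name `estimate_activations` and all parameters are hypothetical.

```python
import numpy as np

def estimate_activations(V, W, n_iter=200, eps=1e-12, seed=0):
    """Estimate H >= 0 with V ~= W @ H while the basis W stays fixed.

    Assumption: Euclidean-cost multiplicative updates, used here only
    as a simple stand-in for the ML estimation described in the abstract.
    V: (n_freq, n_frames) nonnegative spectrogram.
    W: (n_freq, n_components) nonnegative, pre-trained basis.
    """
    rng = np.random.default_rng(seed)
    # Random positive initialization of the time-varying coefficients.
    H = rng.random((W.shape[1], V.shape[1])) + eps
    WtV = W.T @ V  # constant across iterations because W is fixed
    WtW = W.T @ W
    for _ in range(n_iter):
        # Multiplicative update: keeps H nonnegative by construction.
        H *= WtV / (WtW @ H + eps)
    return H
```

Because W is never updated, only the coefficient matrix adapts to the observed mixture, which mirrors the "partially pre-trained" setting in the title.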

Year

2014

Appears in types:

4.1 Contribution in conference proceedings

Files in this item:

File	Size	Format
Paper_IWAENC_2014.pdf (not available). Description: main article. Type: pre-print document. License: not public (private/restricted access).	699.22 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/307068
