IRIS Institutional Research Information System

Audio Source Separation in Reverberant Environments Using β-Divergence-Based Nonnegative Factorization

Abdelraheem, Mahmoud Fakhry Mahmoud; Svaizer, Piergiorgio; Omologo, Maurizio

2017-01-01

Abstract

In Gaussian model-based multichannel audio source separation, the likelihood of the observed mixtures of source signals is parametrized by the source spectral variances and by the associated spatial covariance matrices. These parameters are estimated by maximizing the likelihood with an expectation-maximization algorithm and are then used to separate the signals by multichannel Wiener filtering. We propose to estimate these parameters by applying nonnegative factorization based on prior information on the source variances. In the nonnegative factorization, spectral basis matrices can be defined as the prior information. The matrices can be either extracted or made available indirectly through a redundant library that is trained in advance. In a separate step, two algorithms based on nonnegative tensor factorization are proposed to either extract or detect the basis matrices that best represent the power spectra of the source signals in the observed mixtures. The factorization is achieved by minimizing the β-divergence through multiplicative update rules. The sparsity of the factorization can be controlled by tuning the value of β. Experiments show that sparsity, rather than the value assigned to β in training, is crucial for increasing the separation performance. The proposed method was evaluated under several mixing conditions and provides better separation quality than other comparable algorithms.
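As a rough illustration of the factorization step mentioned in the abstract, the sketch below minimizes the β-divergence between a nonnegative matrix and a low-rank product using the widely used heuristic multiplicative update rules. It is a simplified matrix (not tensor) factorization and is not the authors' algorithm: the function names `beta_divergence` and `nmf_beta`, the random initialization, the small constant `eps`, and the assumption that the input matrix is strictly positive are all choices made here for illustration.

```python
import numpy as np


def beta_divergence(V, V_hat, beta):
    """Sum of element-wise beta-divergences d_beta(V | V_hat).

    Assumes V and V_hat are strictly positive (illustrative assumption).
    """
    if beta == 0:  # Itakura-Saito divergence
        ratio = V / V_hat
        return float(np.sum(ratio - np.log(ratio) - 1.0))
    if beta == 1:  # generalized Kullback-Leibler divergence
        return float(np.sum(V * np.log(V / V_hat) - V + V_hat))
    return float(np.sum(
        (V**beta + (beta - 1.0) * V_hat**beta - beta * V * V_hat**(beta - 1.0))
        / (beta * (beta - 1.0))
    ))


def nmf_beta(V, rank, beta=1.0, n_iter=100, eps=1e-12, seed=0):
    """Factorize V (n_freq x n_frames) as W @ H with multiplicative updates.

    W holds spectral basis vectors, H their activations over time.
    Returns W, H and the divergence recorded after each iteration.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    costs = []
    for _ in range(n_iter):
        # Update activations with the bases held fixed.
        V_hat = W @ H + eps
        H *= (W.T @ (V_hat**(beta - 2.0) * V)) / (W.T @ V_hat**(beta - 1.0) + eps)
        # Update bases with the activations held fixed.
        V_hat = W @ H + eps
        W *= ((V_hat**(beta - 2.0) * V) @ H.T) / (V_hat**(beta - 1.0) @ H.T + eps)
        costs.append(beta_divergence(V, W @ H + eps, beta))
    return W, H, costs
```

In this sketch, `beta=2` corresponds to the squared Euclidean distance, `beta=1` to the generalized Kullback-Leibler divergence, and `beta=0` to the Itakura-Saito divergence, so changing `beta` changes how the fit penalizes errors in low-energy versus high-energy spectral bins.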

Year

2017

Appears in types:

1.1 Journal article

Files in this item:

File	Availability	Type	License	Size	Format
Mahmoud Fakhry, Piergiorgio Svaizer, and Maurizio Omologo - Audio Source Separation in Reverberant Environments Using β-Divergence-Based Nonnegative Factorization.pdf	Not available	Pre-print document	Not public (private/restricted access)	910.81 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/309979
