Speech Enhancement Using Dilated Wave-U-Net: an Experimental Analysis

Mohamed Nabih Ali (Methodology); Alessio Brutti; Daniele Falavigna
2020-01-01

Abstract

Speech enhancement is a relevant component of many real-world applications, such as hearing aids, mobile telecommunications, and healthcare applications. In this paper, we investigate the Dilated Wave-U-Net model, a recently proposed end-to-end neural speech enhancement approach based on the Wave-U-Net architecture. We evaluate the model on two datasets: the public VCTK dataset and a contaminated version of the Librispeech dataset. In particular, we experiment with alternative training losses: the MSE loss, the L1 norm, and a combination of the L1 and MSE losses. Results show that the Dilated Wave-U-Net architecture outperforms other state-of-the-art methods in terms of intelligibility and quality metrics on both datasets, and that the MSE loss performs best.
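As a rough illustration of the three loss families named in the abstract, the following minimal sketch computes an MSE loss, an L1 loss, and a weighted combination of the two on raw waveform samples. The function names and the mixing weight `alpha` are assumptions made for this example only; the abstract does not state how the L1 and MSE terms are combined or weighted.

```python
from typing import Sequence


def _check(enhanced: Sequence[float], clean: Sequence[float]) -> None:
    # Both signals must be non-empty and have the same number of samples.
    if len(enhanced) != len(clean) or len(clean) == 0:
        raise ValueError("signals must be non-empty and of equal length")


def mse_loss(enhanced: Sequence[float], clean: Sequence[float]) -> float:
    """Mean squared error between enhanced and clean samples."""
    _check(enhanced, clean)
    return sum((e - c) ** 2 for e, c in zip(enhanced, clean)) / len(clean)


def l1_loss(enhanced: Sequence[float], clean: Sequence[float]) -> float:
    """Mean absolute error (L1 norm averaged over samples)."""
    _check(enhanced, clean)
    return sum(abs(e - c) for e, c in zip(enhanced, clean)) / len(clean)


def combined_loss(
    enhanced: Sequence[float],
    clean: Sequence[float],
    alpha: float = 0.5,  # hypothetical mixing weight, not taken from the paper
) -> float:
    """Convex combination of the L1 and MSE losses (assumed form)."""
    return alpha * l1_loss(enhanced, clean) + (1.0 - alpha) * mse_loss(enhanced, clean)


if __name__ == "__main__":
    clean = [0.0, 0.0, 0.0, 0.0]
    enhanced = [0.0, 0.5, -0.5, 1.0]
    print(mse_loss(enhanced, clean))
    print(l1_loss(enhanced, clean))
    print(combined_loss(enhanced, clean))
```

In practice such losses would be computed on batched tensors inside a deep learning framework; plain Python lists are used here only to keep the sketch self-contained.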
Files in this item:
Speech_Enhancement_Using_Dilated_Wave-U-Net_an_Exp.pdf (authorized users only)

Description: The full paper
License: DRM not defined
Size: 1.65 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/324530