Time-Frequency Reassigned Cepstral Coefficients for Phone-Level Speech Segmentation

Tryfou, Georgia; Pellin, Marco; Omologo, Maurizio
2014

Abstract

This paper studies feature extraction in the context of automatic speech segmentation at the phonetic level. Current state-of-the-art solutions widely use cepstral features as a front-end for HMM-based frameworks. Although automatic segmentation results have reached inter-annotator agreement at tolerances of 20 ms or more, the same is not true when a lower tolerance is considered. We propose a new set of cepstral features that are derived from the time-frequency reassigned spectrogram and offer a sharper representation of the speech signal in the cepstral domain. The features are evaluated through a series of forced alignment experiments, which show that they perform better than traditional MFCC features at placing phone boundaries within a small distance of their true position.
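
The tolerance-based evaluation mentioned in the abstract can be illustrated with a minimal sketch. It assumes that forced alignment yields exactly one predicted boundary per reference boundary, and it scores the fraction of boundaries that fall within a given tolerance. The function name `boundary_agreement` and the boundary times below are hypothetical and are not taken from the paper.

```python
def boundary_agreement(reference_ms, predicted_ms, tolerance_ms):
    """Fraction of predicted boundaries within tolerance_ms of the reference.

    Assumes a one-to-one pairing of boundaries, as in forced alignment,
    where the phone sequence is known in advance.
    """
    if len(reference_ms) != len(predicted_ms):
        raise ValueError("reference and predicted must have the same length")
    if not reference_ms:
        raise ValueError("at least one boundary is required")
    hits = sum(
        1
        for ref, pred in zip(reference_ms, predicted_ms)
        if abs(ref - pred) <= tolerance_ms
    )
    return hits / len(reference_ms)


# Hypothetical boundary times in milliseconds (illustration only).
reference = [120, 245, 390, 510]
predicted = [125, 260, 392, 535]

agreement_20ms = boundary_agreement(reference, predicted, 20)
agreement_10ms = boundary_agreement(reference, predicted, 10)
```

With these made-up values, tightening the tolerance from 20 ms to 10 ms lowers the agreement, which is the regime the abstract identifies as still open.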
Use this identifier to cite or link to this document: https://hdl.handle.net/11582/250656