This paper introduces the use of the TFRCC features, a time-frequency reassigned feature set, as a front-end for speech recognition. Compared to the power spectrogram, the time-frequency reassigned version is particularly helpful in describing simultaneously the temporal and spectral features of speech signals, as it offers an improved visualization of the various components. This powerful attribute is exploited from the cepstral reassigned features, which are incorporated in a state-of-the-art speech recognizer. Experimental activities investigate the proposed features in various scenarios, starting from recognition of close-talk signals and gradually increasing the complexity of the task. The results prove the superiority of these features compared to a MFCC baseline.

A reassigned front-end for speech recognition

Tryfou, Georgia;Omologo, Maurizio
2017-01-01

Abstract

This paper introduces the use of the TFRCC features, a time-frequency reassigned feature set, as a front-end for speech recognition. Compared to the power spectrogram, the time-frequency reassigned version is particularly helpful in describing simultaneously the temporal and spectral features of speech signals, as it offers an improved visualization of the various components. This powerful attribute is exploited from the cepstral reassigned features, which are incorporated in a state-of-the-art speech recognizer. Experimental activities investigate the proposed features in various scenarios, starting from recognition of close-talk signals and gradually increasing the complexity of the task. The results prove the superiority of these features compared to a MFCC baseline.
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11582/310772
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
social impact