Unvalidated product record
Warning! The data displayed have not been validated by FBK.
Title: | Using Prosodic Information for Disambiguation Purposes |
Authors: | |
Publication date: | 2005 |
Abstract: | In this work, we describe how prosodic information can be employed to improve the performance of an Automatic Speech Recognizer (ASR) for specific restricted tasks. The approach exploits additional prosodic information in a post-processing stage. Prosodic features are estimated at word level; this additional information is encoded through a feature extractor and is then modeled using a statistical classifier. To train and test this system we collected an Italian database designed to comprise specific dialogue problems like ambiguous utterances. The proposed system yields a 69.5% relative word error rate reduction compared to a traditional state-of-the-art recognizer for the task of recognizing sequences of numbers. |
Handle: | http://hdl.handle.net/11582/3351 |
Appears in types: | 4.1 Contribution in conference proceedings |