In-car speech interaction by means of multiple recognition units
Cristoforetti, Luca; Matassoni, Marco; Omologo, Maurizio; Svaizer, Piergiorgio
2003-01-01
Abstract
An in-car automatic speech recognizer based on multiple recognition units is introduced within an advanced dialogue system conceived for obtaining driving assistance and accessing tourist information. The use of a set of HMM recognizers running in parallel is being investigated in order to ensure low complexity, modularity, and fast response, and to allow real-time reconfiguration of the language models and grammars according to the dialogue context. A corpus of spontaneous speech interactions was collected using the Wizard-of-Oz method in a real driving situation. The use of a set of recognition units, each specialized in a geographical subdomain and using a simpler language model, was explored on the resulting corpus. Experiments show that, in the presence of large lists of names (e.g. cities, streets, hotels), choosing the output with maximum likelihood among the active units provides better results than using a single comprehensive language model.
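The selection rule mentioned at the end of the abstract (keep the output with maximum likelihood among the active units) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Hypothesis`, `select_best`, `recognize`, and `make_stub_unit` are hypothetical, the units are called one after another rather than in parallel, and the scores returned by different units are assumed to be directly comparable log-likelihoods.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Hypothesis:
    unit: str              # name of the recognition unit that produced the output
    text: str              # recognized word sequence
    log_likelihood: float  # score from the unit (assumed comparable across units)


# Assumption: a recognition unit is modeled as a callable that maps an
# utterance (raw audio bytes here) to its single best hypothesis.
Recognizer = Callable[[bytes], Hypothesis]


def select_best(hypotheses: Iterable[Hypothesis]) -> Optional[Hypothesis]:
    """Return the hypothesis with the highest log-likelihood, or None if empty."""
    candidates = list(hypotheses)
    if not candidates:
        return None
    return max(candidates, key=lambda h: h.log_likelihood)


def recognize(utterance: bytes, active_units: Iterable[Recognizer]) -> Optional[Hypothesis]:
    """Query every active unit (sequentially in this sketch) and keep the best output."""
    return select_best(unit(utterance) for unit in active_units)


def make_stub_unit(name: str, text: str, score: float) -> Recognizer:
    """Build a fake unit that always returns the same hypothesis (for illustration)."""
    def unit(_utterance: bytes) -> Hypothesis:
        return Hypothesis(unit=name, text=text, log_likelihood=score)
    return unit


if __name__ == "__main__":
    # Hypothetical units and scores, not data from the paper.
    units = [
        make_stub_unit("cities", "go to trento", -120.5),
        make_stub_unit("hotels", "go to hotel trento", -134.2),
    ]
    best = recognize(b"", units)
    print(best.unit, best.text)
```

In this sketch, reconfiguring the recognizer for a new dialogue context amounts to changing which units appear in `active_units`; how the actual system activates units and compares their scores is not specified in the abstract.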