Use of Multiple Speech Recognition Units in an In-car Assistance System

Brutti, Alessio; Coletti, P.; Cristoforetti, Luca; Geutner, P.; Giacomini, A.; Maistrello, M.; Matassoni, Marco; Omologo, Maurizio; Steffens, F.; Svaizer, Piergiorgio

This chapter presents an advanced dialogue system based on in-car hands-free voice interaction, conceived for obtaining driving assistance and for accessing tourist information while driving. Part of the related activities aimed at developing this “Virtual Intelligent Codriver” are being conducted under the European VICO project. The architecture of the dialogue system is presented here, with a description of its main modules: Front-end Speech Processing, Recognition Engine, Natural Language Understanding, Dialogue Manager and Car Wide Web. The use of a set of HMM recognizers running in parallel is being investigated within this project in order to ensure low complexity, modularity and fast response, and to allow real-time reconfiguration of the language models and grammars according to the dialogue context. A corpus of spontaneous speech interactions was collected at ITC-irst using the Wizard-of-Oz method in a real driving situation. Multiple recognition units, each specialized on a geographical subdomain and using a simpler language model, were evaluated on the resulting corpus. This investigation shows that, in the presence of large lists of names (e.g. cities, streets, hotels), choosing the output with maximum likelihood among the active units, although a simple approach, provides better results than using a single comprehensive language model.
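The selection rule described above, taking the output with maximum likelihood among the active units, can be illustrated with a minimal sketch. It is not the VICO implementation: the `Hypothesis` type, the `select_best` function, the unit names and the scores are all hypothetical, and the sketch assumes that the log-likelihoods returned by the different units are directly comparable.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Hypothesis:
    """Output of one recognition unit (hypothetical structure)."""
    unit: str              # name of the unit, e.g. one per subdomain
    text: str              # recognized word sequence
    log_likelihood: float  # score assumed comparable across units


def select_best(hypotheses: Iterable[Hypothesis],
                active_units: Optional[Set[str]] = None) -> Optional[Hypothesis]:
    """Return the highest-scoring hypothesis among the active units.

    If active_units is None, every unit is treated as active.
    Returns None when no active unit produced a hypothesis.
    """
    candidates = [h for h in hypotheses
                  if active_units is None or h.unit in active_units]
    if not candidates:
        return None
    return max(candidates, key=lambda h: h.log_likelihood)


# Made-up outputs of three units run in parallel on the same utterance.
hyps = [
    Hypothesis("cities", "trento", -120.5),
    Hypothesis("streets", "via trento", -118.2),
    Hypothesis("hotels", "hotel trento", -110.0),
]

# The dialogue context is assumed to activate only two of the units.
best = select_best(hyps, active_units={"cities", "streets"})
print(best.unit, best.text)
```

Restricting the comparison to `active_units` mirrors the idea of reconfiguring the recognizers according to the dialogue context. In practice, scores from units with different language models may need normalization before they can be compared.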