An architecture for a multi-modal Web Browser

Azzini, Ivano; Giorgino, Toni; Nardelli, Luca; Orlandi, Marco; Rognoni, C.

In this work we propose an architecture for handling multi-modal browsing through the synchronization of HTML and VoiceXML documents. A client uses a traditional Web browser to interpret HTML documents. The client should also be able to acquire the voice signal and transmit it to the server (voice channel). The server side hosts a conventional Web server, which handles both HTML and VoiceXML requests; the Speech Server, which manages both ASR and TTS resources; and the multi-modal browser process. The synchronization takes place through a TCP/IP connection between the two browsers (visual and multi-modal). The multi-modal browser consists of three components: a Voice Gate that manages the voice channel, a Visual Gate that manages a TCP/IP connection with the Web browser, and an Interpreter Manager that corresponds to the VoiceXML Interpreter Context and integrates the VoiceXML Interpreter.
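
The division of the multi-modal browser into three components can be illustrated with a minimal in-memory sketch. It is not the authors' implementation: the class and method names, the JSON line format used for synchronization messages, and the field names are all assumptions made for illustration, and plain lists stand in for the real TCP/IP connection and voice channel.

```python
import json
from dataclasses import dataclass, field


def encode_sync(kind, name, value):
    """Serialize one synchronization event as a newline-terminated JSON line.

    The wire format is an assumption; the abstract only states that
    synchronization runs over a TCP/IP connection.
    """
    payload = {"kind": kind, "name": name, "value": value}
    return (json.dumps(payload, sort_keys=True) + "\n").encode("utf-8")


def decode_sync(line):
    """Parse one line produced by encode_sync back into a dict."""
    return json.loads(line.decode("utf-8"))


@dataclass
class VisualGate:
    """Stand-in for the TCP/IP link to the Web browser (hypothetical API)."""
    outbox: list = field(default_factory=list)

    def send(self, kind, name, value):
        # A real gate would write these bytes to a socket.
        self.outbox.append(encode_sync(kind, name, value))


@dataclass
class VoiceGate:
    """Stand-in for the voice channel; strings replace audio (hypothetical API)."""
    prompts: list = field(default_factory=list)

    def play(self, text):
        # A real gate would hand the text to the TTS resource.
        self.prompts.append(text)


@dataclass
class InterpreterManager:
    """Keeps the voice dialog and the HTML page in step (hypothetical API)."""
    visual: VisualGate
    voice: VoiceGate
    form_state: dict = field(default_factory=dict)

    def on_voice_result(self, name, value):
        # A recognized utterance fills a field and is mirrored to the page.
        self.form_state[name] = value
        self.visual.send("fill", name, value)

    def on_visual_event(self, line):
        # A change made in the Web browser arrives as one encoded line.
        event = decode_sync(line)
        self.form_state[event["name"]] = event["value"]
        self.voice.play(f"{event['name']} set to {event['value']}")


manager = InterpreterManager(VisualGate(), VoiceGate())
manager.on_voice_result("city", "Rome")                        # spoken input
manager.on_visual_event(encode_sync("fill", "date", "today"))  # typed input
```

In this sketch the Interpreter Manager is the only component that holds dialog state, so an update arriving from either modality is recorded once and then pushed to the other side.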