A natural acoustic front-end for Interactive TV in the EU-Project DICIT

Svaizer, Piergiorgio; Brutti, Alessio; Zieger, Christian; Omologo, Maurizio
2009-01-01

Abstract

“Distant-talking Interfaces for Control of Interactive TV” (DICIT) is a European Union-funded project whose main objective is to integrate distant-talking voice interaction as a complementary modality to the remote control in interactive TV systems. Hands-free, seamless control enables natural user-system interaction and greatly eases information retrieval. In the given living-room scenario, the system recognizes commands spoken by multiple, possibly moving users, even in the presence of background noise and TV surround audio. This paper focuses on the multichannel acoustic front-end (MCAF) processing for acoustic scene interpretation, which is based on the combination of multichannel acoustic echo cancellation, blind source separation, beamforming, acoustic event classification, and multiple speaker localization. The fully functional DICIT prototype consists of the MCAF, automatic speech recognition, natural language understanding, mixed-initiative dialogue, and satellite connection.
2009
ISBN: 9781424445615

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/5219