Effective Acoustic Adaptation for A Distant-talking Interactive TV System

Matassoni, Marco
2008

Abstract

This paper studies how to adapt a close-talking baseline acoustic model to a distant-talking application: the interactive TV dialogue system developed in the Distant-talking Interfaces for Control of Interactive TV (DICIT) project. We show that, for adaptation from out-of-domain data to be effective, it is better to acquire that data in the same DICIT environment than to use contaminated data. By measuring grammar error rate (GER) and action classification error rate (AER) in addition to word error rate (WER), we identify the best way to adapt the baseline model using the available out-of-domain adaptation data (TIMIT) and a small amount of in-domain (DICIT) adaptation data. The best approach is cascading maximum a posteriori (MAP) adaptation. With less than 5 hours of out-of-domain data and 1 hour of in-domain data, cascading MAP improves WER, GER, and AER by 17%, 18%, and 16% relative, respectively, over the baseline model. The experimental results also show that in-domain adaptation data is needed to improve GER and AER.
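
The abstract does not spell out the adaptation formulas, so the following is only a minimal sketch of what cascading MAP adaptation could look like, not the authors' implementation. It assumes the standard MAP update of a single Gaussian mean, in which the prior mean and the occupancy-weighted adaptation frames are interpolated with a prior weight `tau`, and it assumes that "cascading" means the mean adapted on out-of-domain data becomes the prior for the in-domain pass. The function names, the value of `tau`, and the toy data are all hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def map_adapt_mean(prior_mean: Sequence[float],
                   frames: Sequence[Sequence[float]],
                   posteriors: Sequence[float],
                   tau: float = 10.0) -> Vector:
    """MAP update of one Gaussian mean (assumed standard form).

    new_mean = (tau * prior_mean + sum_t gamma_t * x_t) / (tau + sum_t gamma_t)
    where gamma_t is the occupancy (posterior) of this Gaussian for frame x_t.
    """
    dim = len(prior_mean)
    occupancy = sum(posteriors)
    weighted_sum = [0.0] * dim
    for x, gamma in zip(frames, posteriors):
        for d in range(dim):
            weighted_sum[d] += gamma * x[d]
    return [(tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
            for d in range(dim)]


def cascaded_map(baseline_mean: Sequence[float],
                 stages: Sequence[Tuple[Sequence[Sequence[float]], Sequence[float]]],
                 tau: float = 10.0) -> Vector:
    """Apply MAP adaptation stage by stage (assumed meaning of "cascading").

    Each stage is a (frames, posteriors) pair; the output mean of one stage
    is used as the prior mean of the next.
    """
    mean = list(baseline_mean)
    for frames, posteriors in stages:
        mean = map_adapt_mean(mean, frames, posteriors, tau)
    return mean


# Toy one-dimensional data standing in for the two adaptation sets.
out_of_domain = ([[1.0]] * 10, [1.0] * 10)  # hypothetical out-of-domain stage
in_domain = ([[2.0]] * 10, [1.0] * 10)      # hypothetical in-domain stage
adapted = cascaded_map([0.0], [out_of_domain, in_domain], tau=10.0)
```

In this sketch the second stage starts from the already adapted mean rather than from the baseline, so a small amount of in-domain data refines a model that the larger out-of-domain set has already moved toward the target condition.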


Use this identifier to cite or link to this document: http://hdl.handle.net/11582/4508