We examined the content of 6 talk-show TV programs in order to better understand the challenges posed by this program genre to automatic transcription. The selected programs were first segmented, transcribed and annotated by experts. Most of the speech content was found in conversational style with a significant portion of overlapped speech, about 18%. Then, automatic speech recognition experiments were conducted showing that recognition performance on talk-show programs is much worse, 28.3% word error rate (WER), in comparison with that achieved on broadcast news programs, 10.9% WER. For talk-shows performance varied tangibly between non-overlapped speech, 21.8% WER, and overlapped speech, 58.5% WER. On clean, non-overlapped speech a 18.7% WER is achieved, this result is significantly worse than the result achieved for the dominant condition in broadcast news programs represented by clean read/planned speech from the anchormen, 7.6% WER.

Analysis of the Characteristics of Talk-show TV Programs

Brugnara, Fabio;Falavigna, Giuseppe Daniele;Giuliani, Diego;Gretter, Roberto
2012-01-01

Abstract

We examined the content of 6 talk-show TV programs in order to better understand the challenges posed by this program genre to automatic transcription. The selected programs were first segmented, transcribed and annotated by experts. Most of the speech content was found in conversational style with a significant portion of overlapped speech, about 18%. Then, automatic speech recognition experiments were conducted showing that recognition performance on talk-show programs is much worse, 28.3% word error rate (WER), in comparison with that achieved on broadcast news programs, 10.9% WER. For talk-shows performance varied tangibly between non-overlapped speech, 21.8% WER, and overlapped speech, 58.5% WER. On clean, non-overlapped speech a 18.7% WER is achieved, this result is significantly worse than the result achieved for the dominant condition in broadcast news programs represented by clean read/planned speech from the anchormen, 7.6% WER.
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11582/104412
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
social impact