Analysis of the Characteristics of Talk-show TV Programs
Brugnara, Fabio; Falavigna, Giuseppe Daniele; Giuliani, Diego; Gretter, Roberto
2012-01-01
Abstract
We examined the content of six talk-show TV programs to better understand the challenges this genre poses to automatic transcription. The selected programs were first segmented, transcribed, and annotated by experts. Most of the speech was conversational in style, with a significant portion, about 18%, consisting of overlapped speech. Automatic speech recognition experiments then showed that recognition performance on talk-show programs, 28.3% word error rate (WER), is much worse than that achieved on broadcast news programs, 10.9% WER. For talk shows, performance varied markedly between non-overlapped speech, 21.8% WER, and overlapped speech, 58.5% WER. On clean, non-overlapped speech an 18.7% WER is achieved; this is significantly worse than the 7.6% WER achieved for the dominant condition in broadcast news programs, namely clean read/planned speech from the anchormen.
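For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The function names `word_error_rate` and `pooled_wer` are illustrative and do not come from the paper, and the scoring tool and text normalization the authors actually used are not stated in the abstract. The second helper rests on a further assumption the abstract does not confirm, namely that the roughly 18% overlap share can be read as a share of reference words.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def pooled_wer(wer_a: float, wer_b: float, share_b: float) -> float:
    """Combine two per-condition WERs, weighting by reference-word share.

    ASSUMPTION: share_b is the fraction of reference words in condition b.
    """
    return (1.0 - share_b) * wer_a + share_b * wer_b


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
    # Non-overlapped 21.8% and overlapped 58.5%, with an assumed 18% word share.
    print(pooled_wer(21.8, 58.5, 0.18))
```

Under that word-share assumption, pooling the two per-condition figures gives roughly 28.4%, close to the 28.3% overall WER reported for talk shows. This is only a rough plausibility check, since the overlap share in the abstract may be measured by duration rather than by word count.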