Speaker Tracking in a Broadcast News Corpus
Cettolo, Mauro
2001-01-01
Abstract
Speaker tracking is the process of following who is speaking in an audio stream. When the audio stream is a recording of broadcast news, speaker identity can be important metadata for building digital libraries. Moreover, segmenting and classifying the audio stream by acoustic content, bandwidth, and speaker gender makes it possible to filter out portions of the signal that do not contain speech and to improve transcription accuracy through condition-dependent acoustic models and adaptation techniques. This paper investigates automatic speaker tracking in a corpus of Italian broadcast news. A frame classification accuracy of 81.9% is achieved on a 1 h 15 min test set, with 37 named speakers and one label for the world model.
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.