Speaker Tracking in a Broadcast News Corpus

Cettolo, Mauro

Speaker tracking is the process of following who is speaking in an audio stream. When the audio stream is a recording of broadcast news, speaker identity can be important metadata for building digital libraries. Moreover, segmenting and classifying the audio stream in terms of acoustic content, bandwidth and speaker gender makes it possible to filter out portions of the signal that do not contain speech and to improve transcription accuracy through the use of condition-dependent acoustic models and adaptation techniques. In this paper, the problem of automatic speaker tracking in a corpus of Italian broadcast news is investigated. A frame classification accuracy of 81.9% is achieved on a test set of 1 h 15 min, with each frame labeled either as one of 37 named speakers or with a single label for the world model.
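To make the reported metric concrete, the following is a minimal sketch of how a frame classification accuracy could be computed from reference and hypothesized speaker segments. It is an illustration under stated assumptions, not the scoring procedure used in the paper: the 10 ms frame step, the `"world"` default label, and the function names `to_frames` and `frame_accuracy` are all hypothetical choices made for this example.

```python
from typing import List, Tuple

# A segment is (start_sec, end_sec, label). Label names are hypothetical.
Segment = Tuple[float, float, str]


def to_frames(segments: List[Segment], duration: float,
              frame_step: float = 0.01, default: str = "world") -> List[str]:
    """Expand labeled segments into one label per frame.

    Frames not covered by any segment get the `default` label, which here
    stands in for the single world-model label (an assumption of this sketch).
    """
    n_frames = int(round(duration / frame_step))
    labels = [default] * n_frames
    for start, end, label in segments:
        first = max(0, int(round(start / frame_step)))
        last = min(n_frames, int(round(end / frame_step)))
        for i in range(first, last):
            labels[i] = label
    return labels


def frame_accuracy(reference: List[str], hypothesis: List[str]) -> float:
    """Fraction of frames whose hypothesized label equals the reference label."""
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")
    if not reference:
        return 0.0
    correct = sum(r == h for r, h in zip(reference, hypothesis))
    return correct / len(reference)
```

Under this definition, a hypothesis that places a speaker change slightly too late is penalized only for the frames between the true and the hypothesized boundary, so the score reflects both segmentation and identification errors.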