A Baseline for the Transcription of Italian Broadcast News

Brugnara, Fabio; Cettolo, Mauro; Federico, Marcello; Giuliani, Diego

2000-01-01

Abstract

This paper presents the first results in the development of a broadcast news transcription system intended for processing large audio archives. In particular, it introduces the Italian broadcast news corpus currently being collected and outlines the first baseline system implemented. The baseline system consists of an audio segmentation module and a speech recognizer featuring a recursive Viterbi beam search, a 64K-word lexicon, a tree-based trigram LM representation, and MLLR adaptation. The word error rate of the baseline was 20.9% on planned studio speech and 28.8% on the whole test set.
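The word error rate quoted in the abstract is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below illustrates that standard definition only. It is not taken from the paper: the function name `word_error_rate` and the example sentences are hypothetical, and a real evaluation would typically also normalize text before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative sketch of the conventional WER definition; the paper's
    actual scoring procedure is not described in this record.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical sentences, not drawn from the corpus.
    reference = "il governo ha approvato la legge"
    hypothesis = "il governo approvato una legge"
    print(f"WER = {word_error_rate(reference, hypothesis):.3f}")
```

Because insertions count as errors, a WER computed this way can exceed 100% when the hypothesis is much longer than the reference.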

Year

2000

Appears in the categories:

4.1 Contribution in conference proceedings

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/1857
