Redundancy Reduction in ASR of Spontaneous Speech through Statistical Machine Translation

Falavigna, Giuseppe Daniele
2011-01-01

Abstract

This paper describes a system, based on statistical machine translation, that attempts to remove irrelevant words from the output of an automatic audio transcription system, such as erroneously inserted function words, filled pauses, interjections, and word fragments, and to repair, to a certain extent, ungrammatical sentence fragments. For this work we concentrated on the domain of political speeches, owing to the immediate availability of a parallel corpus of automatic audio transcriptions and the related, manually produced proceedings. The system can effectively detect and correct several errors (mainly insertions) that appear in the alignment between a given automatic audio transcription and a reference transcription derived from the corresponding proceedings. Preliminary results, expressed in terms of word error rate, show that the proposed approach yields a relative improvement of 5% over using the raw automatic transcription of the audio.
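
To make the evaluation measure concrete, the following minimal sketch shows how word error rate (WER) and a relative improvement between two systems are commonly computed. It is an illustration only, not the paper's evaluation code: the function names `word_error_rate` and `relative_improvement`, the whitespace tokenization, and the example sentences are all assumptions introduced here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words.

    Assumes simple whitespace tokenization (an illustrative choice).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline_wer: float, system_wer: float) -> float:
    """Fraction by which system_wer reduces baseline_wer."""
    return (baseline_wer - system_wer) / baseline_wer


# Hypothetical example: a filled pause and a repeated function word
# count as two insertions against the reference.
reference = "the committee approves the proposal"
raw_asr = "uh the committee approves the the proposal"
cleaned = "the committee approves the the proposal"

baseline = word_error_rate(reference, raw_asr)
improved = word_error_rate(reference, cleaned)
print(baseline, improved, relative_improvement(baseline, improved))
```

In this toy case, removing the filled pause eliminates one of the two insertion errors. The figures it produces are unrelated to the 5% relative improvement reported in the abstract.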

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/54013