Fine-tuning on Clean Data for End-to-End Speech Translation: FBK @ IWSLT 2018

Mattia Antonino Di Gangi; Roldano Cattoni; Matteo Negri; Marco Turchi
2018-01-01

Abstract

This paper describes FBK's submission to the end-to-end English-German speech translation task at IWSLT 2018. Our system relies on a state-of-the-art model based on LSTMs and CNNs, where the CNNs are used to reduce the temporal dimension of the audio input, which is in general much larger than that of machine translation input. Our model was trained only on the audio-to-text parallel data released for the task, and fine-tuned on cleaned subsets of the original training corpus. The addition of weight normalization and label smoothing improved the baseline system by 1.0 BLEU point on our validation set. The final submission also featured checkpoint averaging within a training run and ensemble decoding of models trained during multiple runs. On test data, our best single model obtained a BLEU score of 9.7, while the ensemble obtained a BLEU score of 10.24.

Files in this item:
1810.07652.pdf (open access; type: pre-print; license: Creative Commons; size: 223.21 kB; format: Adobe PDF)


Use this identifier to cite or link to this document: https://hdl.handle.net/11582/316435