Efficient phonetic representations for multilingual ASR system

Sara Picciau;Domenico De Cristofaro;Roberto Gretter;Marco Matassoni
2022-01-01

Abstract

Nowadays, ASR systems provide excellent results in a variety of fields; however, specific issues arise in scenarios involving simultaneous interpretation, non-native speech, or code-switching. In such cases, a multilingual acoustic model can mitigate some of these challenges by exploiting acoustic properties shared across similar phonemes or graphemes, particularly for under-resourced languages or when real-time constraints apply. The resulting ASR system can ideally handle more than one language at a time and is expected to be robust to non-native speech and code-switching. The aim of this work is to create an efficient multilingual ASR model that can handle multiple languages simultaneously. The languages involved are Italian, Spanish, French, German and English. Although some of these languages are phonetically similar, others share fewer features. Our goal is to find the best combination of the phoneme inventories of the considered languages.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/335851