Overview of the Interspeech TLT2020 Shared Task on ASR for Non-Native Children’s Speech

Roberto Gretter; Marco Matassoni; Daniele Falavigna
2020-01-01

Abstract

We present an overview of the ASR challenge for non-native children’s speech organized for a special session at Interspeech 2020. The data for the challenge was obtained in the context of a spoken language proficiency assessment administered at Italian schools for students between the ages of 9 and 16 who were studying English and German as foreign languages. The corpus distributed for the challenge was a subset of the English recordings. Participating teams competed either in a closed track, in which they could use only the training data released by the challenge organizers, or in an open track, in which they were allowed to use additional training data. The closed track received 9 entries and the open track received 7 entries, with the best-scoring systems achieving substantial improvements over a state-of-the-art baseline system. This paper describes the corpus of non-native children’s speech that was used for the challenge, analyzes the results, and discusses some points that should be considered for future challenges in this domain.
Files in this item:
File: is2020_challenge.pdf
Access: open access
Type: Pre-print document
License: PUBLIC - Creative Commons 3.6
Size: 116.86 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/324906