Overview of the Interspeech TLT2020 Shared Task on ASR for Non-Native Children’s Speech
Roberto Gretter; Marco Matassoni; Daniele Falavigna
2020-01-01
Abstract
We present an overview of the ASR challenge for non-native children’s speech organized for a special session at Interspeech 2020. The data for the challenge was obtained in the context of a spoken language proficiency assessment administered at Italian schools for students between the ages of 9 and 16 who were studying English and German as a foreign language. The corpus distributed for the challenge was a subset of the English recordings. Participating teams competed either in a closed track, in which they could use only the training data released by the organizers of the challenge, or in an open track, in which they were allowed to use additional training data. The closed track received 9 entries and the open track received 7 entries, with the best scoring systems achieving substantial improvements over a state-of-the-art baseline system. This paper describes the corpus of non-native children’s speech that was used for the challenge, analyzes the results, and discusses some points that should be considered for future challenges in this domain.

File | Size | Format
---|---|---
is2020_challenge.pdf (open access; type: pre-print document; license: PUBLIC - Creative Commons 3.6) | 116.86 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.