Automatic detection of speech sound disorders in German-speaking children: augmenting the data with typically developed speech
Marco Matassoni;Alessio Brutti
2025-01-01
Abstract
Speech Sound Disorders (SSD) are common among children and affect their academic, social, and emotional development. Traditional diagnostic methods rely on speech-language pathologists, which makes them resource-intensive. Given the global shortage of experts and increasing demand, exploring deep learning tools is crucial. Adapting a multi-task framework to fine-tune a pre-trained multilingual Wav2Vec model, this study tackles Automatic Speech Recognition and SSD classification for German children using a custom dataset. We show that incorporating public out-of-domain datasets improves robustness and generalizability. Interestingly, combining pathological and typical speech data with mispronunciations benefits both speech recognition and SSD detection. Finally, we investigate a two-step training of the model that further improves overall performance.

| File | Description | License | Size | Format |
|---|---|---|---|---|
| marx25_interspeech.pdf | Paper, open access | Creative Commons | 1.54 MB | Adobe PDF |
