Automatic detection of speech sound disorders in German-speaking children: augmenting the data with typically developed speech

Marco Matassoni;Alessio Brutti
2025-01-01

Abstract

Speech Sound Disorders (SSD) are common among children and affect their academic, social, and emotional development. Traditional diagnosis relies on speech-language pathologists, which makes it resource-intensive. Given the global shortage of experts and increasing demand, exploring deep learning tools is crucial. Adapting a multi-task framework to fine-tune a pre-trained multilingual Wav2Vec model, this study tackles Automatic Speech Recognition and SSD classification for German children using a custom dataset. We show that incorporating public out-of-domain datasets improves robustness and generalizability. Interestingly, combining pathological and typical speech data with mispronunciations benefits both speech recognition and SSD detection. Finally, we investigate a two-step training of the model that further improves overall performance.
Files in this item:
File: marx25_interspeech.pdf
Access: open access
Description: Paper open access
License: Creative Commons
Size: 1.54 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/363550