Fed-EE: Federating Heterogeneous ASR Models using Early-Exit Architectures
Mohamed Nabih Ali (Member of the Collaboration Group)
Daniele Falavigna (Member of the Collaboration Group)
Alessio Brutti (Member of the Collaboration Group)
2023-01-01
Abstract
Automatic speech recognition models require large amounts of speech recordings for training. However, collecting such data is often cumbersome and raises privacy concerns. Federated learning has been widely adopted as an effective decentralized technique that collaboratively learns a shared model while keeping the data local on client devices. Unfortunately, client devices often feature limited computation and communication resources, leading to practical difficulties for large models. In addition, the heterogeneity that characterizes edge devices makes it impractical to federate a single model that fits all clients. Differently from the recent literature, where multiple different architectures are used, in this work we propose using early-exit architectures. This brings two benefits: a single model can be used on a variety of devices, and federating the models is straightforward. Experiments on the public TED-LIUM 3 dataset show that our proposed approach is effective and can be combined with basic federated learning strategies. We also shed light on how to federate self-attention models for speech recognition, for which an established recipe does not exist in the literature.
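The abstract itself contains no code; the following is a minimal sketch, assuming PyTorch, of the two ingredients it names: a single self-attention encoder with an exit head after each block, truncated to each device's compute budget, and a plain FedAvg-style aggregation that averages each parameter over the clients that actually hold it. All names (`EarlyExitEncoder`, `fedavg_heterogeneous`, `exit_layer`) are illustrative, not from the paper.

```python
# Minimal sketch (not the paper's code), assuming PyTorch.

import torch
import torch.nn as nn


class EarlyExitEncoder(nn.Module):
    """Self-attention encoder with a lightweight exit head after each block."""

    def __init__(self, dim=256, n_layers=6, vocab=500, exit_layer=None):
        super().__init__()
        # A constrained device instantiates only the first `exit_layer` blocks.
        depth = exit_layer or n_layers
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
            for _ in range(depth)
        )
        self.exits = nn.ModuleList(nn.Linear(dim, vocab) for _ in range(depth))

    def forward(self, x):
        # All exits are trained jointly; at inference a device simply
        # stops at its last instantiated exit.
        logits = []
        for block, head in zip(self.blocks, self.exits):
            x = block(x)
            logits.append(head(x))
        return logits


def fedavg_heterogeneous(client_states, client_weights):
    """Average each parameter over the clients whose truncated model holds it.

    Because every client runs a prefix of the same architecture, parameter
    names match directly and no cross-architecture mapping is needed.
    """
    merged = {}
    all_names = {name for state in client_states for name in state}
    for name in all_names:
        held = [(s[name], w) for s, w in zip(client_states, client_weights)
                if name in s]
        total = sum(w for _, w in held)
        merged[name] = sum(p * (w / total) for p, w in held)
    return merged
```

For example, a low-resource client built with `exit_layer=2` trains and transmits only the first two blocks and exits; the server can still merge its update with those of deeper clients, since the shared layers carry identical parameter names. This is the "straightforward federation" benefit the abstract refers to.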
File | Size | Format | |
---|---|---|---|
paper_49.pdf (open access; Type: Post-print document; License: PUBLIC - Creative Commons 3.6) | 812.94 kB | Adobe PDF | View/Open
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.