Resource-Efficient Federated Learning for Network Intrusion Detection
Doriguzzi-Corin, Roberto; Cretti, Silvio; Siracusa, Domenico
2024-01-01
Abstract
Maintaining up-to-date attack profiles is a critical challenge for Network Intrusion Detection Systems (NIDSs). State-of-the-art solutions based on Machine Learning (ML) algorithms often rely on public datasets, which can be outdated or anonymised, hindering their effectiveness in real-world scenarios. Collaborative learning tackles data limitations by enabling multiple parties to jointly train and update their NIDSs through sharing recent attack information. However, directly sharing network traffic data can compromise the participants' privacy. Federated Learning (FL) addresses this concern: it allows participants to collaboratively improve their NIDS models by sharing only the trained model parameters, not the raw data itself. Nevertheless, recent studies have shown that the Federated Averaging (FedAvg) algorithm at the core of FL can be inefficient with heterogeneous and unbalanced datasets. A recent solution called FLAD addresses the limitations of FedAvg, resulting in higher accuracy of the final ML model on out-of-distribution data. This work focuses on the resource usage of the FL process, demonstrating the superiority of FLAD over FedAvg in computational efficiency and convergence time, showcasing its potential to enhance NIDS effectiveness.
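The FedAvg aggregation step the abstract refers to can be sketched as a data-size-weighted average of the clients' model parameters. The following is a minimal illustration, not the paper's implementation: function names are hypothetical, and models are flattened into plain parameter lists, whereas real FL frameworks aggregate per-layer tensors.

```python
def fedavg(client_weights, client_sizes):
    """Illustrative FedAvg aggregation: average the clients'
    parameter vectors, weighting each client by the size of
    its local training dataset."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum((size / total) * w[i]
            for w, size in zip(client_weights, client_sizes))
        for i in range(dim)
    ]

# Hypothetical example: two clients, each with a 2-parameter model.
clients = [[1.0, 3.0], [3.0, 5.0]]
sizes = [100, 300]  # the second client holds three times more data
print(fedavg(clients, sizes))  # [2.5, 4.5] -- pulled toward client 2
```

Because the weights are proportional to local dataset sizes, clients with unbalanced or non-representative data dominate the global model, which is the heterogeneity issue the abstract attributes to FedAvg and that FLAD is designed to mitigate.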