
Assuring Privacy of AI-Powered Community Driven Android Code Vulnerability Detection

Luca Piras;
2025-01-01

Abstract

Training AI models is made harder by the limited availability of data, particularly when public datasets are insufficient. Obtaining data from private sources may seem like a viable solution, but privacy concerns often prevent such sharing. It is therefore essential to build systems that balance privacy with the need for data. In our previous work, we introduced "Defendroid", which performs real-time Android code vulnerability detection using a blockchain-based federated neural network with explainable artificial intelligence. In this study, the Defendroid approach is enhanced with variable differential privacy techniques to protect the model training process. The proposed method significantly strengthens privacy, achieving a privacy budget between 1 and 1.5, while preserving Defendroid's baseline accuracy of 96% and F1-score of 0.96. In doing so, this research closes a critical gap by safeguarding the privacy of contributed source code. The advancement demonstrates both the effectiveness of the new approach and its ability to address the twin challenges of privacy and data scarcity in AI-driven, community-focused Android code vulnerability detection.
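The abstract does not detail how the differential-privacy guarantee is enforced; a common mechanism in differentially private federated training is to clip each client's model update to a fixed L2 norm and add calibrated Gaussian noise before aggregation. The sketch below illustrates that general pattern only; the function names, the clipping norm, and the noise multiplier are illustrative assumptions, not the paper's actual implementation or its privacy-budget accounting.

```python
import numpy as np

def privatize_update(update, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Clip a client's model update to L2 norm <= clip_norm, then add
    Gaussian noise scaled to that norm (Gaussian-mechanism style).
    Illustrative sketch -- not Defendroid's actual implementation."""
    rng = np.random.default_rng() if rng is None else rng
    norm = np.linalg.norm(update)
    clipped = update / max(1.0, norm / clip_norm)   # L2 clipping
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise

# One federated round: the server averages privatized client updates,
# so no single client's raw gradient is ever revealed.
rng = np.random.default_rng(42)
client_updates = [rng.normal(size=8) for _ in range(5)]
private_avg = np.mean(
    [privatize_update(u, rng=rng) for u in client_updates], axis=0
)
print(private_avg.shape)  # shape matches the model-update vector: (8,)
```

In such schemes the privacy budget (epsilon) is then computed from the noise multiplier, the clipping norm, and the number of training rounds via a privacy accountant; a "variable" scheme, as the abstract's wording suggests, would adjust these parameters across rounds or clients rather than fixing them globally.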


Use this identifier to cite or link to this document: https://hdl.handle.net/11582/367095