Online discussions about software applications and services that take place on web-based communication platforms represent an invaluable knowledge source for diverse software engineering tasks, including requirements elicitation. The amount of research work on developing effective tool-supported analysis methods is rapidly increasing, as part of the so called software analytics. Textual messages in App store reviews, tweets, online discussions taking place in mailing lists and user forums, are analysed by combining natural language processing techniques to filter out irrelevant data; text mining and machine learning algorithms to classify messages into different categories, such as bug report and feature request. Our research objective is to exploit a linguistic technique based on speech-acts for the analysis of online discussions with the ultimate goal of discovering requirement-relevant information. In this paper, we present a revised and extended version of the speech-acts based analysis technique, which we previously presented at CAiSE 2017, together with a detailed experimental characterisation of its properties. Datasets used in the experimental evaluation are taken from a widely used open source software project (161120 textual comments), as well as from an industrial project in the home energy management domain. We make them available for experiment replication purposes. On these datasets, our approach is able to successfully classify messages into Feature/Enhancement and Other, with F-measure of 0.81 and 0.84 respectively. We also found evidence that there is an association between types of speech-acts and categories of issues, and that there is correlation between some of the speech-acts and issue priority, thus motivating further research on the exploitation of our speech-acts based analysis technique in semi-automated multi-criteria requirements prioritisation.
Speech-acts based analysis for requirements discovery from online discussions
Kifetew, Fitsum Meshesha;Perini, Anna
2019-01-01
Abstract
Online discussions about software applications and services that take place on web-based communication platforms represent an invaluable knowledge source for diverse software engineering tasks, including requirements elicitation. The amount of research work on developing effective tool-supported analysis methods is rapidly increasing, as part of the so called software analytics. Textual messages in App store reviews, tweets, online discussions taking place in mailing lists and user forums, are analysed by combining natural language processing techniques to filter out irrelevant data; text mining and machine learning algorithms to classify messages into different categories, such as bug report and feature request. Our research objective is to exploit a linguistic technique based on speech-acts for the analysis of online discussions with the ultimate goal of discovering requirement-relevant information. In this paper, we present a revised and extended version of the speech-acts based analysis technique, which we previously presented at CAiSE 2017, together with a detailed experimental characterisation of its properties. Datasets used in the experimental evaluation are taken from a widely used open source software project (161120 textual comments), as well as from an industrial project in the home energy management domain. We make them available for experiment replication purposes. On these datasets, our approach is able to successfully classify messages into Feature/Enhancement and Other, with F-measure of 0.81 and 0.84 respectively. We also found evidence that there is an association between types of speech-acts and categories of issues, and that there is correlation between some of the speech-acts and issue priority, thus motivating further research on the exploitation of our speech-acts based analysis technique in semi-automated multi-criteria requirements prioritisation.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.