CrowdIntent: Annotation of Intentions Hidden in Online Discussions
Morales Ramirez, Itzel; Perini, Anna
2015-01-01
Abstract
Stakeholders working in open-source software development use social media, emails, or any other available means on the Internet to communicate and to express what they want or need through text. The recognition of such needs or desires (which we call intentions) is usually done by a human reader, and it can require considerable effort as the number of messages in online discussions grows. The problem is that supporting automated recognition of the intentions hidden in text requires data from the software development domain for training classifiers. However, so far no data annotated with intentions are available for data mining purposes. To tackle this lack of data, we have collected online discussions in the software development domain and asked people to annotate them with intentions. This collection has been performed by crowdsourcing the task of annotating sentences with their hidden intention. In this paper we report on the experience of carrying out a crowdsourcing project with a heterogeneous crowd. We discuss how we applied the steps of the crowdsourcing workflow in CrowdIntent. Lessons learned and future work are also presented.