Exploiting Deep Neural Networks for Tweet-based Emoji Prediction
Andrei Catalin Coman; Giacomo Zara; Yaroslav Nechaev; Gianni Barlacchi
2018-01-01
Abstract
For many years, emojis have been used in social networks and chat services to enrich written text with auxiliary graphical content, achieving a higher degree of empathy. Given the wide use of this medium, emojis are now available in such number and variety that every basic concept and mood is covered by at least one. As a result, the connection between an emoji and its semantic meaning has grown stronger. In this paper, we describe the development of a Machine Learning based tool that, given a tweet, predicts the emoji most likely to be associated with its text. The task resembles the one presented by Barbieri et al. [23] and is placed within the context of the International Workshop on Semantic Evaluation (SemEval) 2018. We designed one baseline with standard Support Vector Machines and another based on fastText, which was provided as part of the Workshop. In addition, we implemented several models based on Neural Networks, such as Bidirectional Long Short-Term Memory Recurrent Neural Networks and Convolutional Neural Networks. We found the latter to be the most effective: it outperformed all our other models and ranked 6th out of 47 participants. Our work aims to illustrate the potential of simpler models, which, thanks to the fine-tuning of hyper-parameters, can achieve accuracy comparable to that of the more complex models in the challenge.