Feature extraction and selection for automatic hate speech detection on Twitter

In recent decades, information technology has evolved explosively, revolutionizing the way people communicate. On the one hand, it enables rapid, easy and almost costless digital interaction; on the other, it eases the adoption of more aggressive communication styles. It is crucial to regulate and attenuate these behaviors, especially in the digital context, where they emerge at a fast and uncontrollable pace and often cause severe damage to their targets. Social networks and other entities tend to channel their efforts into minimizing hate speech, but the way each one handles the issue varies. Thus, in this thesis, we investigate the problem of hate speech detection in social networks, focusing on Twitter. Our first goal was to conduct a systematic literature review of the topic, covering theoretical and practical approaches. We exhaustively collected and critically summarized mostly recent literature addressing the topic, highlighting popular definitions of hate, common targets and different manifestations of such behaviors. Most works tackle the problem with machine learning approaches, relying mainly on text mining and natural language processing techniques applied to Twitter data. Other authors present novel features addressing the users themselves. Although most recent approaches target Twitter, we noticed that few available tools address this platform, or tweets in particular, considering their informal and specific syntax. Thus, our second goal was to develop a tokenizer able to split tweets into their corresponding tokens, taking into account all their particularities. We performed two binary hate identification experiments and achieved the best f-score in one of them using our tokenizer. We used this tool in the experiments conducted in the following chapters.
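To make the tokenization problem concrete, the following is a minimal sketch of a tweet-aware tokenizer, not the tool developed in this thesis. The pattern list, the small emoticon set and the name `tokenize` are assumptions chosen for illustration; they only show how mentions, hashtags, URLs and emoticons can be kept as single tokens instead of being split at punctuation.

```python
import re

# Hypothetical patterns, tried in order at each position (earlier ones win).
_PATTERNS = [
    r"https?://\S+",                 # URLs
    r"@\w+",                         # user mentions
    r"#\w+",                         # hashtags
    r"[:;=][-o']?[)(\]\[dDpP/\\|]",  # a few western emoticons, e.g. :) ;-( =D
    r"\w+(?:'\w+)*",                 # words, keeping internal apostrophes
    r"[^\w\s]",                      # any other single non-space symbol
]
TOKEN_RE = re.compile("|".join(_PATTERNS))


def tokenize(tweet):
    """Split a tweet into tokens, keeping Twitter-specific items whole."""
    return TOKEN_RE.findall(tweet)
```

For instance, `tokenize("RT @a: hi")` keeps the mention `@a` as one token and emits the colon separately. A generic whitespace-and-punctuation splitter would typically break `@a`, `#sad` or `:(` apart, which is the kind of particularity a tweet tokenizer has to handle.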
As our third goal, we set out to assess which text-based features and preprocessing techniques produce the best results in hate speech detection. During our literature review, we collected the most common preprocessing, sentiment and vectorization features and extracted the ones we found suitable for Twitter in particular. We concluded that preprocessing the data is crucial to reduce its dimensionality, which is often a problem in small datasets; it also improved the f-score. Furthermore, among the tested features, analyzing the tweets’ semantics and extracting their character n-grams improved the detection of hate the most, enhancing the f-score by 1.5% and the hate recall by almost 5% on unseen testing data. On the other hand, analyzing the tweets’ sentiment did not prove helpful. Our final goal derived from a lack of user-based features in the literature. Thus, we investigated a set of features based on profiling Twitter users, focusing on several aspects, such as the gender of authors and mentioned users, their tendency towards hateful behaviors and other characteristics related to their accounts (e.g. number of friends and followers). For each user, we also generated an ego network and computed graph-related statistics (e.g. centrality, homophily), achieving significant improvements: f-score and hate recall increased by 5.7% and 7%, respectively.
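As an illustration of the character n-gram features mentioned above, the sketch below counts all overlapping character substrings of a tweet within a length range. The function name `char_ngrams`, the lowercasing step and the default range of 2 to 4 characters are assumptions made for this example, not the configuration used in the experiments.

```python
from collections import Counter


def char_ngrams(text, n_min=2, n_max=4):
    """Count overlapping character n-grams of length n_min..n_max."""
    text = text.lower()  # assumed normalization step
    counts = Counter()
    for n in range(n_min, n_max + 1):
        # Slide a window of width n over the text, one character at a time.
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts
```

The resulting counts can be used as a sparse feature vector for a classifier. Because the windows overlap, two spellings of the same word that differ in one character still share most of their n-grams.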

Tags: Social Media, Twitter (X)