Journal Article |
Anatomy of Online Hate: Developing a Taxonomy and Machine Learning Models for Identifying and Classifying Hate in Online News Media
Online social media platforms generally attempt to mitigate hateful expressions, as these comments can be detrimental to the health of the community. However, automatically identifying hateful comments can be challenging. We manually label 5,143 hateful expressions posted to YouTube and Facebook videos among a dataset of 137,098 comments from an online news media. We then create a granular taxonomy of different types and targets of online hate and train machine learning models to automatically detect and classify the hateful comments in the full dataset. Our contribution is twofold: 1) creating a granular taxonomy for hateful online comments that includes both types and targets of hateful comments, and 2) experimenting with machine learning, including Logistic Regression, Decision Tree, Random Forest, Adaboost, and Linear SVM, to generate a multiclass, multilabel classification model that automatically detects and categorizes hateful comments in the context of online news media. We find that the best performing model is Linear SVM, with an average F1 score of 0.79 using TF-IDF features. We validate the model by testing its predictive ability, and, relatedly, provide insights on distinct types of hate speech taking place on social media.
|
2018 |
Salminen, J., Almerekhi, H., Milenković, M., Jung, S.G., An, J., Kwak, H. and Jansen, B.J. |
|
Journal Article |
Hate Speech Detection on Twitter: Feature Engineering v.s. Feature Selection
The increasing presence of hate speech on social media has drawn significant investment from governments, companies, and empirical research. Existing methods typically use a supervised text classification approach that depends on carefully engineered features. However, it is unclear if these features contribute equally to the performance of such methods. We conduct a feature selection analysis in such a task using Twitter as a case study, and show findings that challenge conventional perception of the importance of manual feature engineering: automatic feature selection can drastically reduce the carefully engineered features by over 90% and selects predominantly generic features often used by many other language related tasks; nevertheless, the resulting models perform better using automatically selected features than carefully crafted task-specific features.
|
2018 |
Robinson, D., Zhang, Z. and Tepper, J. |
|
Journal Article |
Hierarchical CVAE for Fine-Grained Hate Speech Classification
Existing work on automated hate speech detection typically focuses on binary classification or on differentiating among a small set of categories. In this paper, we propose a novel method on a fine-grained hate speech classification task, which focuses on differentiating among 40 hate groups of 13 different hate group categories. We first explore the Conditional Variational Autoencoder (CVAE) as a discriminative model and then extend it to a hierarchical architecture to utilize the additional hate category information for more accurate prediction. Experimentally, we show that incorporating the hate category information for training can significantly improve the classification performance and our proposed model outperforms commonly-used discriminative models.
|
2018 |
Qian, J., ElSherief, M., Belding, E. and Wang, W.Y. |
|
Journal Article |
Detecting the Hate Code on Social Media
Social media has become an indispensable part of the everyday lives of millions of people around the world. It provides a platform for expressing opinions and beliefs, communicated to a massive audience. However, this ease with which people can express themselves has also allowed for the large-scale spread of propaganda and hate speech. To prevent violating the abuse policies of social media platforms and also to avoid detection by automatic systems like Google’s Conversation AI, racists have begun to use a code (a movement termed Operation Google). This involves substituting references to communities with benign words that seem out of context, in hate-filled posts or Tweets. For example, users have used the words Googles and Bings to represent the African-American and Asian communities, respectively. By generating the list of users who post such content, we move a step forward from classifying tweets, which allows us to study the usage patterns of this concentrated set of users.
|
2017 |
Magu, R., Joshi, K. and Luo, J. |
|
Journal Article |
A Survey on Automatic Detection of Hate Speech in Text
The scientific study of hate speech, from a computer science point of view, is recent. This survey organizes and describes the current state of the field, providing a structured overview of previous approaches, including core algorithms, methods, and main features used. This work also discusses the complexity of the concept of hate speech, defined in many platforms and contexts, and provides a unifying definition. This area has an unquestionable potential for societal impact, particularly in online communities and digital media platforms. The development and systematization of shared resources, such as guidelines, annotated datasets in multiple languages, and algorithms, is a crucial step in advancing the automatic detection of hate speech.
|
2018 |
Fortuna, P. and Nunes, S. |
|
Journal Article |
Us and them: identifying cyber hate on Twitter across multiple protected characteristics
Hateful and antagonistic content published and propagated via the World Wide Web has the potential to cause harm and suffering on an individual basis, and to lead to social tension and disorder beyond cyber space. Despite new legislation aimed at prosecuting those who misuse new forms of communication to post threatening, harassing, or grossly offensive language – or cyber hate – and the fact that large social media companies have committed to protecting their users from harm, it goes largely unpunished due to difficulties in policing online public spaces. To support the automatic detection of cyber hate online, specifically on Twitter, we build multiple individual models to classify cyber hate for a range of protected characteristics including race, disability and sexual orientation. We use text parsing to extract typed dependencies, which represent syntactic and grammatical relationships between words, and are shown to capture ‘othering’ language – consistently improving machine classification for different types of cyber hate beyond the use of a Bag of Words and known hateful terms. Furthermore, we build a data-driven blended model of cyber hate to improve classification where more than one protected characteristic may be attacked (e.g. race and sexual orientation), contributing to the nascent study of intersectionality in hate crime.
|
2016 |
Burnap, P. and Williams, M.L. |
|