Journal Article | Understanding Abuse: A Typology of Abusive Language Detection Subtasks
As the body of research on abusive language detection and analysis grows, there is a need for critical consideration of the relationships between the different subtasks that have been grouped under this label. Based on work on hate speech, cyberbullying, and online abuse, we propose a typology that captures central similarities and differences between subtasks, and we discuss its implications for data annotation and feature construction. We emphasize the practical actions researchers can take to best approach their abusive language detection subtask of interest.
2017 | Waseem, Z., Davidson, T., Warmsley, D. and Weber, I.
Journal Article | Class-based Prediction Errors to Detect Hate Speech with Out-of-vocabulary Words
Common approaches to text categorization essentially rely either on n-gram counts or on word embeddings. This presents important difficulties in highly dynamic or quickly-interacting environments, where the appearance of new words and/or varied misspellings is the norm. A paradigmatic example of this situation is abusive online behavior, with social networks and media platforms struggling to effectively combat uncommon or non-blacklisted hate words. To better deal with these issues in such fast-paced environments, we propose using the error signal of class-based language models as input to text classification algorithms. In particular, we train a next-character prediction model for each class, and then exploit the error of such class-based models to inform a neural network classifier. This way, we shift from the ability to describe seen documents to the ability to predict unseen content. Preliminary studies using out-of-vocabulary splits from abusive tweet data show promising results, outperforming competitive text categorization strategies by 4–11%.
2017 | Serra, J., Leontiadis, I., Spathis, D., Stringhini, G., Blackburn, J. and Vakali, A.
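The class-based error signal described in the abstract above can be illustrated with a small sketch. The paper trains neural next-character predictors per class; the character-bigram models, function names, and toy data below are simplified placeholders, not the authors' implementation.
```python
# Simplified stand-in for class-conditional next-character models: one
# character-bigram LM per class, whose average prediction error on a text
# is the class-based signal. All names and the toy data are illustrative.
import math
from collections import Counter, defaultdict

class CharBigramLM:
    """Character-bigram language model with add-one (Laplace) smoothing."""
    def __init__(self):
        self.bigrams = defaultdict(Counter)
        self.vocab = set()

    def fit(self, texts):
        for text in texts:
            padded = "^" + text
            for prev, cur in zip(padded, padded[1:]):
                self.bigrams[prev][cur] += 1
                self.vocab.update((prev, cur))
        return self

    def avg_neg_log_prob(self, text):
        """Average next-character prediction error (negative log-likelihood)."""
        padded = "^" + text
        vocab_size = len(self.vocab) or 1
        total = 0.0
        for prev, cur in zip(padded, padded[1:]):
            counts = self.bigrams[prev]
            prob = (counts[cur] + 1) / (sum(counts.values()) + vocab_size)
            total += -math.log(prob)
        return total / max(len(text), 1)

# Train one model per class; in the paper the per-class errors feed a neural
# network classifier, while here they are simply compared directly.
train = {
    "abusive": ["you are worthless trash", "get lost you idiot"],
    "neutral": ["have a nice day", "see you at the meeting"],
}
models = {label: CharBigramLM().fit(texts) for label, texts in train.items()}

def class_error_features(text):
    """Per-class prediction errors, usable as classifier input features."""
    return {label: lm.avg_neg_log_prob(text) for label, lm in models.items()}

print(class_error_features("you absolute idiot"))  # lower error under "abusive"
```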
Journal Article | A Survey on Hate Speech Detection using Natural Language Processing
This paper presents a survey on hate speech detection. As the body of social media content grows steadily, so does the amount of online hate speech. Due to the massive scale of the web, methods that automatically detect hate speech are required. Our survey describes key areas that have been explored to automatically recognize these types of utterances using natural language processing. We also discuss the limits of those approaches.
2017 | Schmidt, A. and Wiegand, M.
Journal Article | Detecting the Hate Code on Social Media
Social media has become an indispensable part of the everyday lives of millions of people around the world. It provides a platform for expressing opinions and beliefs and communicating them to a massive audience. However, the ease with which people can express themselves has also allowed for the large-scale spread of propaganda and hate speech. To avoid violating the abuse policies of social media platforms, and to evade detection by automatic systems like Google’s Conversation AI, racists have begun to use a code (a movement termed Operation Google). This involves replacing references to communities with benign words that seem out of context in hate-filled posts or tweets. For example, users have used the words Googles and Bings to represent the African-American and Asian communities, respectively. By generating the list of users who post such content, we move a step beyond classifying individual tweets and can study the usage patterns of this concentrated set of users.
2017 | Magu, R., Joshi, K. and Luo, J.
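A minimal sketch of flagging the code words mentioned in the abstract above. Only the two substitutions the abstract cites are included; the lexicon, tweet data, and user-level aggregation below are hypothetical illustrations, not the authors' pipeline.
```python
# Hypothetical code-word lexicon built from the two examples in the abstract;
# the tweets, users, and flagging rule are illustrative, not the paper's data.
CODE_WORDS = {
    "googles": "African-American community",
    "bings": "Asian community",
}

def decode_coded_terms(tweet: str) -> dict:
    """Return any known code words in a tweet, mapped to their real referents."""
    tokens = tweet.lower().split()
    return {tok: CODE_WORDS[tok] for tok in tokens if tok in CODE_WORDS}

tweets_by_user = {
    "user_a": ["the googles are ruining this country", "nothing to see here"],
    "user_b": ["lovely weather today"],
}

# Keep only users who post coded content, mirroring the user-level step
# described in the abstract.
flagged_users = {
    user: [hits for hits in map(decode_coded_terms, tweets) if hits]
    for user, tweets in tweets_by_user.items()
}
flagged_users = {user: hits for user, hits in flagged_users.items() if hits}
print(flagged_users)  # {'user_a': [{'googles': 'African-American community'}]}
```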
Journal Article | Automated Hate Speech Detection and the Problem of Offensive Language
A key challenge for automatic hate-speech detection on social media is separating hate speech from other instances of offensive language. Lexical detection methods tend to have low precision because they classify all messages containing particular terms as hate speech, and previous work using supervised learning has failed to distinguish between the two categories. We use a crowd-sourced hate speech lexicon to collect tweets containing hate speech keywords, and we use crowd-sourcing to label a sample of these tweets into three categories: those containing hate speech, those containing only offensive language, and those containing neither. We then train a multi-class classifier to distinguish between these categories. Close analysis of the predictions and the errors shows when we can reliably separate hate speech from other offensive language and when this differentiation is more difficult. We find that racist and homophobic tweets are more likely to be classified as hate speech, but that sexist tweets are generally classified as offensive. Tweets without explicit hate keywords are also more difficult to classify.
2017 | Davidson, T., Warmsley, D., Macy, M. and Weber, I.
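A minimal three-class baseline in the spirit of the setup described above (hate speech vs. offensive-but-not-hateful vs. neither). The toy texts, labels, and the TF-IDF plus logistic regression pipeline are illustrative assumptions, not the authors' exact features or data; it requires scikit-learn.
```python
# Illustrative three-way classifier: hate / offensive / neither.
# Placeholder texts only; "[slur]" stands in for actual hate keywords.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "i hate [slur] people",               # hate speech
    "you are a complete idiot",           # offensive, not hate
    "great game last night",              # neither
    "[slur] should not be allowed here",  # hate speech
    "shut up, moron",                     # offensive, not hate
    "looking forward to the weekend",     # neither
]
labels = ["hate", "offensive", "neither"] * 2

# Word unigrams and bigrams with TF-IDF weighting, fed to a multi-class
# logistic regression; chosen here purely as a simple, common baseline.
clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
clf.fit(texts, labels)
print(clf.predict(["you people are [slur]s", "what a silly take"]))
```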
Journal Article | Deep Learning for Hate Speech Detection in Tweets
Hate speech detection on Twitter is critical for applications like controversial event extraction, building AI chatterbots, content recommendation, and sentiment analysis. We define this task as being able to classify a tweet as racist, sexist or neither. The complexity of the natural language constructs makes this task very challenging. We perform extensive experiments with multiple deep learning architectures to learn semantic word embeddings to handle this complexity. Our experiments on a benchmark dataset of 16K annotated tweets show that such deep learning methods outperform state-of-the-art char/word n-gram methods by ~18 F1 points.
2017 | Badjatiya, P., Gupta, S., Gupta, M. and Varma, V.
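One of the embedding-based deep architectures the abstract above refers to can be sketched as follows: a word-embedding layer feeding an LSTM with a three-way softmax (racist / sexist / neither). The vocabulary size, sequence length, layer sizes, and random stand-in data are assumptions for illustration, not the authors' configuration; it requires TensorFlow/Keras.
```python
# Minimal embedding + LSTM classifier for the three-way tweet labels.
import numpy as np
from tensorflow.keras import layers, models

VOCAB_SIZE = 5000   # assumed tokenizer vocabulary size
MAX_LEN = 30        # tweets padded/truncated to 30 tokens (assumed)

model = models.Sequential([
    layers.Embedding(input_dim=VOCAB_SIZE, output_dim=100),  # learned word embeddings
    layers.LSTM(64),                                          # sequence encoder
    layers.Dropout(0.5),
    layers.Dense(3, activation="softmax"),                    # racist / sexist / neither
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Toy stand-in for tokenized, padded tweets and their integer labels.
x_train = np.random.randint(1, VOCAB_SIZE, size=(32, MAX_LEN))
y_train = np.random.randint(0, 3, size=(32,))
model.fit(x_train, y_train, epochs=1, batch_size=8, verbose=0)
```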