Neural Network Applications in Hate Speech Detection

Brian Tuan Khieu (San Jose State University, USA) and Melody Moh (San Jose State University, USA)
Copyright: © 2020 | Pages: 17
DOI: 10.4018/978-1-7998-1159-6.ch012


This chapter presents a literature survey of the current state of hate speech detection models, with a focus on neural network applications in the area. The growth and freedom of social media have facilitated the dissemination of both positive and negative ideas. Proponents of hate speech are among the key abusers of the privileges afforded by social media, and the companies behind these networks have a vested interest in identifying such speech. Manual moderation is too cumbersome and slow to keep up with the torrent of content generated on these sites, which is why many have turned to machine learning. Neural network applications in this area have been very promising and have yielded positive results. However, there are newly discovered and unaddressed problems with the current state of hate speech detection. The authors' survey identifies the key techniques and methods used in identifying hate speech, and they discuss promising new directions for the field as well as newly identified issues.
Chapter Preview


With the spread of social networking websites, it has become easier than ever to broadcast one’s opinions on whichever topic one may choose. While the quick dissemination of information through such sites can do much good, in irresponsible or scheming hands such power can bring about great division and anguish. One example of this harm is the birth of echo chambers on the internet: misguided or misinformed people can find themselves trapped in a vicious cycle that ingrains more and more radical and polarizing sentiments. Hate speech and its prevalence have proven to be an ongoing problem on online social networks. While manual flagging of comments or posts by users can help, the process can be abused to silence opinions one disagrees with. Given the constant stream of generated content, simply employing an army of moderators will not solve the issue either. Thus, there is a need for an effective and automated system for identifying hate speech.

One way to identify hate speech is to use a lexical-based approach, in which certain negative words are always flagged to indicate a need for further inspection. Words that are statistically found to appear more often in manually identified hate speech than elsewhere are added to a ruleset. Unfortunately, such approaches are somewhat naive and ill-equipped to handle slang and symbolism. Nevertheless, these lexical-based approaches are sometimes used in conjunction with other methods to form a more robust solution.
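The lexical-based idea can be sketched in a few lines of Python. This is only an illustration: the term list, the function name `flag_for_review`, and the tokenization rule are assumptions made for the example and are not taken from the chapter.

```python
import re

# Hypothetical placeholder lexicon. A real ruleset would be derived
# statistically from manually labeled hate speech.
FLAGGED_TERMS = {"badword", "slur"}

def flag_for_review(text, lexicon=FLAGGED_TERMS):
    """Return the lexicon terms found in the text (empty set if none)."""
    # Lowercase the text and keep runs of letters and apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    return set(tokens) & set(lexicon)

# An exact match is flagged for further inspection.
hits = flag_for_review("That is a BADWORD, plainly.")

# An obfuscated spelling slips past the ruleset, which illustrates the
# weakness with slang and symbolism noted above.
missed = flag_for_review("That is a b4dw0rd, plainly.")
```

A non-empty result only marks a post for further inspection; it is not a classification by itself.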

The more generally accepted method of identifying hate speech is the use of machine learning and deep learning algorithms. This approach more readily handles slang and symbolism since the models will be trained upon a dataset that includes such words and phrases.

Machine learning and deep learning models built for hate speech detection fall into one of two categories: word-based and character-based models. Word-based models extract features from n-grams of tokenized words, while character-based models extract them from n-grams of characters. Word-based models can also utilize lexical-based techniques and factor in a word’s sentiment or connotation.
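The difference between the two feature types can be shown with a minimal sketch. The helper names `word_ngrams` and `char_ngrams` and the whitespace tokenization are assumptions for illustration; real systems typically use a proper tokenizer.

```python
def word_ngrams(text, n):
    """Return contiguous n-grams over whitespace-separated tokens."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def char_ngrams(text, n):
    """Return contiguous n-grams over characters."""
    s = text.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)]

word_bigrams = word_ngrams("you are not welcome", 2)
char_trigrams = char_ngrams("h8ter", 3)
```

Character n-grams still share pieces such as `ter` between an obfuscated spelling and the original word, which is one reason character-based features can cope better with deliberate misspellings.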

One of the earliest machine learning techniques leveraged to identify hate speech is logistic regression. Logistic regression uses the sigmoid function to squash values into the range between 0 and 1 so that observations can be mapped to discrete classes. Because the values are forced to lie between 0 and 1, the output can be read as a probability, unlike the unbounded continuous output of linear regression.
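As a rough sketch of the prediction step only (training is omitted), the sigmoid squashing can be written as follows. The weights, bias, feature values, and the 0.5 decision threshold are hypothetical values chosen for the example.

```python
import math

def sigmoid(z):
    """Map any real number into (0, 1), computed in a numerically stable way."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(features, weights, bias):
    """Probability of the positive class for one feature vector."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

# Hypothetical parameters; in practice they are learned from labeled data.
weights = [2.0, -1.0]
bias = -0.5

p = predict_proba([1.0, 0.0], weights, bias)
# Turn the probability into a discrete class with an assumed 0.5 threshold.
label = 1 if p >= 0.5 else 0
```

The linear score `z` is unbounded, as in linear regression; only after the sigmoid is applied can the value be interpreted as a probability.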

While logistic regression is somewhat effective at identifying hate speech, researchers have been eager to apply deep learning methods to the problem. An early attempt at solving the issue using deep learning incorporated a Multilayer Perceptron network. A Multilayer Perceptron is composed of several layers of nodes in which each node is connected to every node in the preceding layer. This approach did not significantly outperform the logistic regression model and struggled somewhat to identify hate speech effectively. This is most likely the result of the network's lack of memory; past events are not taken into consideration when determining the current event’s significance. In hate speech detection and natural language processing, ignoring the effect past words have on current and future words leads to a loss of meaning and context.

Long Short-Term Memory networks are a subset of Recurrent Neural Networks, and both retain memory of past events through the use of an internal state. Long Short-Term Memory networks specifically use a combination of gates (input, forget, and output) to discard, retain, and pass on old information. This makes them more appropriate for hate speech detection, since past words can give current words context and meaning.

Plain Recurrent Neural Networks have difficulty retaining information from far back in a sequence, which is why Gated Recurrent Units were developed. Gated Recurrent Units are often paired with Convolutional Neural Networks, which apply convolution and pooling operations to incoming data. The purpose is to have the Convolutional Neural Network extract key features from the input data while the Gated Recurrent Units retain past information to give context. Long Short-Term Memory networks and the combination of Convolutional Neural Networks and Gated Recurrent Units identify hate speech comparably well. However, it is important to note that the training time for the combination is significantly less than that of the Long Short-Term Memory network.
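To make the role of an internal state concrete, here is a minimal sketch of a plain (ungated) recurrent unit with a single scalar state. It is not a Long Short-Term Memory network or a Gated Recurrent Unit, and the weights and input values are hypothetical. It only shows that a recurrent state makes the result depend on word order, whereas an order-free summary of the same inputs does not.

```python
import math

def rnn_final_state(inputs, w_x=1.0, w_h=0.5, b=0.0):
    """Run a one-unit recurrent network over a sequence and return its last state."""
    h = 0.0  # internal state, carried from one step to the next
    for x in inputs:
        # The new state mixes the current input with the previous state.
        h = math.tanh(w_x * x + w_h * h + b)
    return h

seq = [1.0, -1.0, 0.5]  # stand-in numeric features for three words

forward = rnn_final_state(seq)
backward = rnn_final_state(list(reversed(seq)))

# A memoryless, order-free summary gives the same value for both orderings.
order_free_forward = sum(seq)
order_free_backward = sum(reversed(seq))
```

Gated variants keep the same loop structure but add learned gates that decide how much of the previous state to keep or overwrite at each step.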

Key Terms in this Chapter

Word Embeddings: Vector representations of terms in which the distance between vectors reflects the relationship between the terms. These are primarily generated and used by neural network text models.

Convolutional Neural Network: A type of deep neural network that uses convolution and pooling layers, typically to classify imagery.

Context: Any information not present within the original text such as current events.

Hate Speech: Language expressing hatred of a type of people with varying degrees of a call to action.

Bag of Words: A representation of text as an unordered collection of words and their frequencies.

Term Frequency Inverse Document Frequency: A statistic that reflects how important a word is to a document; it increases with how frequently the word appears in that document and decreases with how often the word appears in other documents.

Recurrent Neural Network: A type of neural network where nodes are connected in a temporal sequence to retain information from the past.

Long Short Term Memory: A type of recurrent neural network that processes sequential data while also retaining information from deep in the past.
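The Bag of Words and Term Frequency Inverse Document Frequency entries above can be illustrated with a small sketch. The three example documents are made up, and the formula used (term frequency times the natural log of the inverse document frequency, without smoothing) is one common variant; libraries often apply different smoothing or normalization.

```python
import math
from collections import Counter

# Made-up example documents.
docs = [
    "you people are awful",
    "you are welcome here",
    "awful weather today",
]
tokenized = [d.split() for d in docs]

# Bag of words: word order is discarded, only counts are kept.
bags = [Counter(tokens) for tokens in tokenized]

def tf_idf(term, doc_index):
    """Unsmoothed TF-IDF of a term in one document of the small corpus."""
    tf = bags[doc_index][term] / len(tokenized[doc_index])
    df = sum(1 for bag in bags if term in bag)  # documents containing the term
    idf = math.log(len(docs) / df) if df else 0.0
    return tf * idf

score_people = tf_idf("people", 0)  # appears in only one document
score_you = tf_idf("you", 0)        # appears in two documents
```

Both words occur once in the first document, but the rarer word receives the higher score because it appears in fewer of the other documents.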
