Neural Network Applications in Hate Speech Detection

Brian Tuan Khieu, Melody Moh

Source Title: Neural Networks for Natural Language Processing

DOI: 10.4018/978-1-7998-1159-6.ch012

Abstract

This chapter presents a literature survey of the current state of hate speech detection models, with a focus on neural network applications in the area. The growth and freedom of social media have facilitated the dissemination of both positive and negative ideas. Proponents of hate speech are among the key abusers of the privileges afforded by social media, and the companies behind these networks have a vested interest in identifying such speech. Manual moderation is too cumbersome and slow to deal with the torrent of content generated on these social media sites, which is why many have turned to machine learning. Neural network applications in this area have been very promising and have yielded positive results. However, there are newly discovered and unaddressed problems with the current state of hate speech detection. The authors' survey identifies the key techniques and methods used in identifying hate speech and discusses promising new directions for the field as well as newly identified issues.

Introduction

With the spread of social networking websites, it has become easier than ever to broadcast one’s opinions on whichever topic one may choose. While the quick dissemination of information through such sites can do much good, in irresponsible or scheming hands, such power can bring about great division and anguish. One example of the harm that can result is the birth of echo chambers on the internet; misguided or misinformed people can find themselves trapped in a vicious cycle of ingraining more and more radical and polarizing sentiments. Hate speech and its prevalence in online social networks have proven to be an ongoing problem on such sites. While manual flagging of comments or posts by users can help, the process can be abused to silence opinions one disagrees with. With the constant stream of content generation, simply employing an army of moderators will not solve the issue either. Thus, there is a need for an effective and automated system for identifying hate speech.

One way to identify hate speech is to use a lexical-based approach where certain negative words are always flagged to indicate a need for further inspection. Certain words are statistically identified as appearing in manually identified hate speech more often than others, and they are subsequently added to a ruleset to follow. Unfortunately, such approaches are somewhat naive and ill-equipped to handle slang and symbolism. Even so, lexical-based approaches are sometimes used in conjunction with other methods to form a more robust solution.
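
As a rough illustration of the lexical-based approach described above, the following minimal sketch flags text that contains any term from a ruleset. The lexicon contents and the function name `flag_for_review` are hypothetical placeholders, not taken from the chapter; a real ruleset would be derived statistically from labeled data.

```python
import re

# Hypothetical placeholder lexicon (assumption): a real ruleset would be
# built from words that are over-represented in labeled hate speech.
FLAGGED_TERMS = {"badword", "slur"}

def flag_for_review(text, lexicon=FLAGGED_TERMS):
    """Return the sorted lexicon terms found in the text (empty list if none)."""
    # Lowercase and keep only alphabetic tokens (plus apostrophes).
    tokens = re.findall(r"[a-z']+", text.lower())
    return sorted(set(tokens) & lexicon)

# An exact match is caught...
print(flag_for_review("That was a BADWORD thing to say"))
# ...but a simple obfuscation slips through, which is the weakness
# with slang and symbolism that the text points out.
print(flag_for_review("That was a b4dw0rd thing to say"))
```

The second call returns an empty list because the obfuscated spelling never matches a lexicon entry, which is one reason such rulesets tend to be combined with other methods.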

The more generally accepted method of identifying hate speech is the use of machine learning and deep learning algorithms. This approach more readily handles slang and symbolism since the models will be trained upon a dataset that includes such words and phrases.

Machine learning and deep learning models built for hate speech detection fall into one of two categories: word-based and character-based models. Word-based models extract features from n-grams of tokenized words, while character-based models extract them from n-grams of characters. Word-based models can also utilize lexical-based techniques and factor in a word’s sentiment or connotation.
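
The difference between the two kinds of n-grams can be sketched as follows. This is a minimal illustration that assumes whitespace tokenization and lowercasing; the function names are hypothetical, and real systems typically use more careful tokenizers.

```python
def word_ngrams(text, n):
    """Contiguous n-grams over whitespace-separated, lowercased tokens."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def char_ngrams(text, n):
    """Contiguous n-grams over the lowercased characters of the text."""
    s = text.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)]

print(word_ngrams("You are great", 2))  # pairs of adjacent words
print(char_ngrams("hate", 3))           # overlapping 3-character windows
```

Character n-grams still share many features between a word and a misspelled or obfuscated variant of it, whereas word n-grams treat the two as unrelated tokens.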

One of the earliest machine learning techniques leveraged to identify hate speech is logistic regression. Logistic regression uses the sigmoid function to squash values into the range between 0 and 1 so that observations can be mapped to a number of discrete classes. Since the values are forced to be between 0 and 1, the output can be read as a probability, unlike the unbounded continuous values that linear regression produces.
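
The squashing step can be sketched in a few lines. The weights, bias, and feature values below are made-up numbers for illustration only (an assumption, not values from the chapter); in practice they would be learned from labeled data.

```python
import math

def sigmoid(z):
    """Map any real number into the open interval (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # Equivalent form that avoids overflow for very negative z.
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(features, weights, bias):
    """Probability of the positive class under a logistic regression model."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

# Hypothetical feature vector (e.g. n-gram counts) and model parameters.
features = [1.0, 0.0, 2.0]
weights = [0.8, -0.5, 1.2]
bias = -1.0

p = predict_proba(features, weights, bias)
label = p >= 0.5  # threshold the probability to obtain a discrete class
print(p, label)
```

The 0.5 threshold is a common default for a two-class decision; a moderation system might choose a different threshold depending on how it weighs false positives against false negatives.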

While logistic regression is somewhat effective at identifying hate speech, researchers have been eager to apply deep learning methods to the problem. An early attempt at doing so used a Multilayer Perceptron, a network composed of several layers of nodes in which each node is connected to every node in the preceding layer. This approach did not significantly outperform the logistic regression model and struggled somewhat to identify hate speech effectively. This is most likely the result of the network’s lack of memory: past events are not taken into consideration when determining the current event’s significance. In hate speech detection and natural language processing, ignoring the effect that past words have on current and future words leads to a loss of meaning and context.

Recurrent Neural Networks retain memory of past events through the use of an internal state, and Long Short-Term Memory networks are a subset of them. Long Short-Term Memory networks specifically use a combination of input, forget, and output gates to discard, retain, and pass on old information. This makes them more appropriate for hate speech detection, since past words can give current words context and meaning. Plain Recurrent Neural Networks have difficulty retaining information from far in the past, which is why Gated Recurrent Units were developed. Gated Recurrent Units are often paired with Convolutional Neural Networks, which apply convolution and pooling operations to incoming data. The purpose is to have the Convolutional Neural Network extract key features from the input data while the Gated Recurrent Units retain past information to give context.

Long Short-Term Memory networks and the combination of Convolutional Neural Networks and Gated Recurrent Units identify hate speech comparably well. However, it’s important to note that the training time for the combination is significantly less than that of the Long Short-Term Memory network.
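
To make the convolution, pooling, and gated-recurrence pipeline more concrete, here is a deliberately tiny sketch that works on sequences of single numbers rather than word-embedding vectors. Everything in it is an assumption for illustration: the scalar simplification, the parameter values in `PARAMS`, and the function names are hypothetical, and the update rule shown is one common GRU convention (some references swap the roles of `z` and `1 - z`). It is not the architecture evaluated in the surveyed work.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def conv1d(seq, kernel):
    """Slide the kernel over the sequence (no padding, stride 1)."""
    k = len(kernel)
    return [sum(kernel[j] * seq[i + j] for j in range(k))
            for i in range(len(seq) - k + 1)]

def max_pool1d(seq, size):
    """Keep the maximum of each non-overlapping window of the given size."""
    return [max(seq[i:i + size]) for i in range(0, len(seq) - size + 1, size)]

def gru_step(x, h, p):
    """One scalar GRU update: gates decide how much past state to keep."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])   # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])   # reset gate
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])
    return (1.0 - z) * h + z * h_cand

def run_gru(seq, p):
    """Feed a sequence through the GRU and return the final hidden state."""
    h = 0.0
    for x in seq:
        h = gru_step(x, h, p)
    return h

def encode(seq, kernel, pool_size, p):
    """Convolution and pooling extract features; the GRU adds context."""
    return run_gru(max_pool1d(conv1d(seq, kernel), pool_size), p)

# Hypothetical, untrained parameters chosen only to make the sketch run.
PARAMS = {"wz": 0.5, "uz": 0.5, "bz": 0.0,
          "wr": 0.5, "ur": 0.5, "br": 0.0,
          "wh": 0.5, "uh": 0.5, "bh": 0.0}

print(encode([0.1, 0.9, 0.3, 0.7, 0.2, 0.8], [0.5, 0.5], 2, PARAMS))
# The same inputs in a different order give a different final state,
# which is the "memory" that a plain feed-forward network lacks.
print(run_gru([1.0, 0.0, -1.0], PARAMS), run_gru([-1.0, 0.0, 1.0], PARAMS))
```

Because pooling shortens the sequence before it reaches the recurrent layer, the recurrent part has fewer steps to process, which is one plausible contributor to the shorter training time mentioned above.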

Key Terms in this Chapter

Word Embeddings: Vector representations of terms that reflect the distance between different terms. These are primarily generated and used by neural network text models.

Convolutional Neural Network: A type of deep neural network that uses convolution and pooling layers, typically to classify imagery.

Context: Any information not present within the original text such as current events.

Hate Speech: Language expressing hatred of a type of people with varying degrees of a call to action.

Bag of Words: A representation of text as the frequencies of the words it contains, without regard to word order.

Term Frequency Inverse Document Frequency: A statistic that reflects how important a word is, increasing with how frequently it appears in a document and decreasing with how often it appears in other documents.

Recurrent Neural Network: A type of neural network where nodes are connected in a temporal sequence to retain information from the past.

Long Short Term Memory: A type of recurrent neural network that processes sequential data while also retaining information from deep in the past.
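
Two of the terms above, Bag of Words and Term Frequency Inverse Document Frequency, can be sketched directly. The version below assumes whitespace tokenization and uses one simple variant of the statistic (relative term frequency multiplied by the logarithm of the number of documents divided by the document frequency); libraries often apply smoothed variants, so exact values may differ. The function names and sample documents are hypothetical.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Count how often each lowercased token occurs, ignoring word order."""
    return Counter(text.lower().split())

def tf_idf(docs):
    """Return one {term: score} dict per document."""
    bags = [bag_of_words(d) for d in docs]
    n_docs = len(bags)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for bag in bags for term in bag)
    scores = []
    for bag in bags:
        total = sum(bag.values())
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in bag.items()
        })
    return scores

docs = ["the cat sat", "the dog barked"]
scores = tf_idf(docs)
# "the" appears in every document, so it scores 0.0;
# "cat" appears in only one, so it scores above 0.
print(scores[0]["the"], scores[0]["cat"])
```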
