Detection of Sexually Harassing Tweets in Hindi Using Deep Learning Methods

Tarun Jain, Rishabh Jain, Shivaji Ray Chaudhuri, Shrey Upadhyay, Arjun Singh, Vivek K. Verma, Aditya Sinha
Copyright: © 2022 |Pages: 15
DOI: 10.4018/IJSI.309110

Abstract

In the modern era, social networking platforms play a vital role in day-to-day life. They are used by public personalities, bureaucrats, and common people alike. They give everyone a platform to express opinions and share experiences with family, friends, and the world. The advent of such platforms has completely changed the outlook of the world. Sexual harassment has been a problem for a long time, but with the advancement of technology, some users have taken the harassment to the digital level. It is common these days to find users posting derogatory remarks or uploading sensual and private content of others in order to sexually harass them. The harm does not end there: some of the people targeted might even take drastic steps in response. This article presents a way to detect such derogatory and harassing remarks using Twitter as the data source. Classification is performed with two deep learning models, and the performance of both models is then evaluated on various parameters.

1. Introduction

Before discussing the more technical aspects of this article, we need to introduce the social networking platform we rely on.

Statistics show that, as of November 2019, Twitter had approximately 330 million monthly users, of whom approximately 145 million used the service daily. Since we focus on Hindi, we also need to consider the demographics of India, where the majority of Hindi speakers live. As of 2019, India had approximately 7.75 million Twitter users, most of whom use Twitter as a source of news. The user base is highly unequal in terms of gender: only 16% of users are female and the remaining 84% are male.

We also need to establish what is considered sexual harassment and how severe it is on social networking platforms. In general, sexual harassment refers to forceful or unwelcome gestures, behaviors, or sexual advances directed towards a person. It is mostly seen as violence against women. On social networking platforms, however, it takes a different form. Rather than the advances seen in cases of physical harassment, sexual harassment in the digital world is more closely related to humiliating, threatening, or discriminating against someone by posting or replying in a disrespectful or disturbing manner (Chowdhury, 2019).

Twitter was generally regarded as a safe place for expressing views. The company even touted that every word has the power to change the world. These days, however, sexually harassing tweets have become quite common, targeting both men and women, with women targeted more often by abusive and derogatory tweets. A report from Amnesty International describes how, while the world is taking huge strides towards gender equality, some individuals and groups are using social media platforms to curb that enthusiasm by posting offensive tweets, which more often than not has resulted in women withdrawing from such platforms and from public life (Chowdhury, 2019).

This article focuses on Hindi tweets, so we should also look at the situation in our country, India, which is the largest Hindi-speaking nation. India has taken huge steps in promoting gender equality and in preventing discrimination and crime against women. These days, women occupy top positions in the country's governance and excel in all fields, be it sports, business, or social services. Indian women also have a large presence on Twitter. According to a report from Amnesty International, which surveyed 114,716 tweets mentioning 95 of the top women politicians in India, 13.8% of the tweets were "problematic" or "abusive", containing harmful and hurtful content (Yin, 2009).

Welfare bodies around the world have run many campaigns to spread awareness of the severity of such actions. Such harassment is directed towards men and women alike and needs to be controlled to keep social media platforms safe (Chowdhury, 2019). This article therefore presents a solution that classifies such tweets, specifically those written in Hindi.

Turning to sentiment analysis, this procedure allows us to classify texts into positive or negative classes. Since our research focuses on detecting sexually harassing tweets, our dataset contains tweets along with a binary polarity column that specifies whether or not a tweet can be classified as sexually harassing (Yin, 2009).
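To make the dataset layout concrete, the following is a minimal sketch of how tweets paired with a binary polarity column might be represented and loaded. The column names `text` and `label`, the sample rows, and the helper `load_labelled_tweets` are all hypothetical and chosen only for illustration; the actual file format of the dataset is not described here.

```python
import csv
import io

# Hypothetical two-column layout: tweet text plus a binary polarity label
# (1 = sexually harassing, 0 = not). Column names and rows are illustrative only.
SAMPLE_CSV = """text,label
आज मौसम बहुत अच्छा है,0
[harassing tweet placeholder],1
नई फिल्म कल रिलीज़ होगी,0
"""


def load_labelled_tweets(handle):
    """Read (text, label) pairs, converting each label to an int and checking it is 0 or 1."""
    rows = []
    for record in csv.DictReader(handle):
        label = int(record["label"])
        if label not in (0, 1):
            raise ValueError(f"unexpected label: {label}")
        rows.append((record["text"], label))
    return rows


tweets = load_labelled_tweets(io.StringIO(SAMPLE_CSV))
# Count how many tweets carry the positive (harassing) label.
positives = sum(label for _, label in tweets)
```

Checking the label values at load time is a small safeguard: a binary classifier trained on a column that silently contains other values would produce misleading accuracy figures.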

We chose a dataset in which the majority of the tweets are in Hindi. We then preprocessed the data and used two models to obtain results: one combining a CNN with an LSTM, and one combining an RNN with an LSTM (Kennedy, 2017). The technical details, the accuracies achieved, and the conclusions are discussed thoroughly in the later parts of this article.
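The preprocessing steps are not spelled out at this point, so the following is only a sketch of the kind of cleaning commonly applied to tweets before they are fed to a neural model. Every step here is an assumption rather than the authors' pipeline: removing URLs and user mentions, dropping the `#` symbol while keeping the hashtag word, discarding punctuation and emoji, and splitting on whitespace. The function names `clean_tweet` and `tokenize` are hypothetical.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
# Anything that is not Devanagari (U+0900-U+097F), an ASCII letter or digit,
# or whitespace is treated as noise (punctuation, emoji, symbols).
NON_TEXT_RE = re.compile(r"[^\u0900-\u097FA-Za-z0-9\s]")


def clean_tweet(text):
    """Return a normalised copy of a tweet (assumed cleaning steps, see lead-in)."""
    text = URL_RE.sub(" ", text)        # drop links
    text = MENTION_RE.sub(" ", text)    # drop @user mentions
    text = text.replace("#", " ")       # keep the hashtag word, drop the symbol
    text = NON_TEXT_RE.sub(" ", text)   # drop punctuation and emoji
    # The danda marks (U+0964, U+0965) lie inside the Devanagari block,
    # so they are removed explicitly.
    text = text.replace("\u0964", " ").replace("\u0965", " ")
    # Lowercase any Latin-script words and collapse repeated whitespace.
    return " ".join(text.lower().split())


def tokenize(text):
    """Split a cleaned tweet into whitespace-separated tokens."""
    return clean_tweet(text).split()


example = "@user आज मौसम बहुत अच्छा है! https://t.co/abc #Weather"
cleaned = clean_tweet(example)
tokens = tokenize(example)
```

The resulting tokens would typically be mapped to integer indices and padded to a fixed length before being passed to an embedding layer in front of the CNN or RNN and LSTM layers.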
