Relationship Between Personality Patterns and Harmfulness: Analysis and Prediction Based on Sentence Embedding

Kazuyuki Matsumoto, Ryota Kishima, Seiji Tsuchiya, Tomoki Hirobayashi, Minoru Yoshida, Kenji Kita
DOI: 10.4018/IJITWE.298654

Abstract

This paper hypothesizes that harmful utterances must be judged in the context of whole sentences, and it extracts features of harmful expressions using a general-purpose language model. Based on the extracted features, the authors propose a method to predict the presence or absence of harmful categories. In addition, the authors believe that combining this method with research on inferring a speaker's personality from statements on social networking sites makes it possible to analyze users who incite others. The results confirmed that the proposed method can judge the possibility of harmful comments with higher accuracy than simple dictionary-based models or models using distributed representations of words. The relationship between personality patterns and harmful expressions was also confirmed through an analysis based on the harmfulness judgment model.

1. Introduction

There are various risks associated with posting provocative or offensive messages online (also known as “Internet flaming”). Leaked personal information can lead to inquiries and harassment via email and phone; in serious cases, it can destroy relationships and affect uninvolved people. To avoid such risks, it is important to prepare measures that prevent Internet flaming.

There are many causes of online flame wars. In particular, microblogs such as Twitter allow users to post easily; therefore, even extreme or inappropriate content that gives a bad impression is often posted without careful thought. For example, some posts share annoying behavior or pranks, or boast about criminal activities. These posters often do not realize that what they post on social networking sites (SNS) will be seen by many people. All such posts have the potential to cause a social media storm.

According to statistical studies, the tendency to participate in online flaming varies slightly with gender, age, and annual income (Tanaka & Yamaguchi, 2016). However, anyone who uses SNS can become a participant in, or a victim of, online flaming.

In addition, artificial intelligence (AI) chatbots, which have been developing rapidly in recent years, can generate utterances as natural and fluent as those spoken by a human. The language models used to generate these utterances are often trained on large-scale text data collected from the Internet. As a result, inappropriate expressions in the training data can be reflected in the generated output, producing harmful statements (Fuchs, 2018).

The causes and forms of recent Internet flames are diverse, and it is difficult to detect them from the posted text alone. There are two problems: 1) the post may contain images or videos that upset users, and 2) most posts involved in Internet flaming are deleted immediately after they provoke a reaction.

Regarding the first problem, analyzing the meaning of images and videos is not impossible, but it requires considerably more complicated processing. Regarding the second, it is difficult to collect such data on a large scale. Therefore, this study focuses on harmful expressions such as swear words and discriminatory statements, which depend less on a specific time or speaker and may be repeated by many people.

Comments that cause flaming do not always contain harmful expressions. Conversely, even a tweet that contains abusive language may not be seen as Internet flaming, depending on the context. For this reason, it is difficult to predict whether a tweet will trigger flaming. However, if a tweet is judged to be harmful, many people are likely to misunderstand it. In other words, flaming can be prevented by informing the poster that a harmful text may trigger a flaming incident.

In this study, the utterances to be analyzed are collected based on the harmful expression dictionary defined by Matsumoto et al. (2018). A corpus of harmful expressions is created by assigning harmful categories to these sentences. The sentences are then vectorized using a pre-trained language model such as bidirectional encoder representations from transformers (BERT), and a deep neural network is trained as a harmful category classification model and evaluated for accuracy. In addition, analyzing the relationship between a user's personality pattern, which can be inferred from the content of their utterances, and the harmfulness of those utterances is expected to be useful for predicting flaming before it occurs. Here, we focus on the similarity between harmful comments and personality patterns and analyze what kinds of personalities are likely to tweet harmful comments.
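As a rough illustration of this pipeline, the following minimal sketch vectorizes whole sentences with a pre-trained BERT model and feeds the resulting [CLS] embeddings to a small feed-forward classification head. It assumes the Hugging Face transformers and PyTorch libraries; the model name, the number of categories, and the classifier architecture are illustrative assumptions, not the authors' exact configuration.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Assumed pre-trained model; the paper's actual BERT variant may differ.
MODEL_NAME = "bert-base-multilingual-cased"
NUM_CATEGORIES = 3  # illustrative count of harmful categories

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

# Simple feed-forward head on top of the sentence embedding.
classifier = torch.nn.Sequential(
    torch.nn.Linear(encoder.config.hidden_size, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, NUM_CATEGORIES),
)

def embed(sentences):
    """Vectorize whole sentences with the pre-trained language model."""
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = encoder(**batch)
    # Use the [CLS] token's final hidden state as the sentence vector.
    return outputs.last_hidden_state[:, 0, :]

vectors = embed(["example tweet 1", "example tweet 2"])
logits = classifier(vectors)
# Sigmoid + threshold yields the presence/absence of each harmful category.
predictions = torch.sigmoid(logits) > 0.5
```

In practice, the classification head (and optionally the encoder) would first be fine-tuned on the harmful expression corpus described above before such predictions are meaningful.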

The remainder of this paper is organized as follows. Section 2 describes related research, and Section 3 presents the proposed method. Section 4 discusses the results of the classification experiments and their similarity to personality patterns, and Section 5 summarizes the findings.
