Introduction
By providing virtual environments, online social networks allow people to communicate and interact with each other in different ways to achieve their business, political, economic, and social goals. Twitter, Parler, and Reddit are well-known online social platforms that have attracted many users through the characteristics and functionalities they provide. For example, Reddit (https://www.reddit.com) provides a forum-based platform in which users can follow the news they are interested in by subscribing to the appropriate subreddits and exchanging comments.
The accuracy of shared information and the authenticity of user identities are essential for social platforms. Social bots and users with malicious goals can threaten the accuracy of the information and resources shared on these platforms. Such users try to act like legitimate users (for example, by following other users, voting on posts created by other users, or engaging in different discussions) so that they can achieve their malicious goals without being detected (Najari et al., 2022). Various approaches have been proposed for identifying bots on social networking platforms, most of which target Twitter. The most popular identification tool for Twitter is Botometer, which classifies accounts as bots or legitimate (human) accounts based on characteristics such as metadata, content, and timing features. According to the classification of Latah et al. (2020), Botometer’s identification approach is machine-learning-based.
In our previous work (Adel Alipour et al., 2022), we benchmarked Botometer and Tweetbotornot on publicly available labeled Twitter datasets to evaluate their performance. We then proposed two new methods for improving Twitter bot identification: one that can be used independently (Method 1, also known as RepScope) and one that serves as an add-on to Botometer (Method 2). We evaluated the new methods on the same datasets that were used for benchmarking Botometer and Tweetbotornot, and obtained high accuracy in identifying both bots and humans on Twitter using a much simpler approach than Botometer or Tweetbotornot. Moreover, Botometer and Tweetbotornot work only on Twitter data, whereas RepScope can potentially work on other social networks as well. In a nutshell, RepScope indicates the scope of the repetitive behavior of a user. In this research, we extend RepScope in two ways: (1) we aim to improve RepScope by providing guidelines for choosing appropriate thresholds for different online networks, and (2) we analyze the generalizability of RepScope on data from three different social networks, namely Twitter, Parler, and Reddit. We chose Twitter and Reddit because of their popularity, and Parler because of its alternative nature: it places fewer restrictions on what users can publish, which in turn could make it more vulnerable than the other platforms. The results show that RepScope can identify malicious and legitimate behaviors on the Twitter and Reddit datasets using the same threshold values and configuration. However, the configuration of RepScope needs to be changed to identify legitimate and malicious users on Parler.
In the following sections of this paper, we present an overview of related work, introduce our methodology, and report the results and evaluations. Finally, we discuss our conclusions and recommendations for future work.