The Keyboard Knows About You: Revealing User Characteristics via Keystroke Dynamics

Ioannis Tsimperidis (Democritus University of Thrace, Greece) and Avi Arampatzis (Democritus University of Thrace, Greece)
Copyright: © 2020 | Pages: 18
DOI: 10.4018/IJT.2020070103

Abstract

One of the causes of several problems on the internet, such as financial fraud, cyber-bullying, and seduction of minors, is the complete anonymity that a malicious user can maintain. Most methods that have been proposed to remove this anonymity are intrusive, violate privacy, or are expensive. This paper proposes the recognition of certain characteristics of an unknown user through keystroke dynamics, that is, the way a person types. The evaluation of the method consists of three stages: the acquisition of keystroke dynamics data from 118 volunteers during the daily use of their devices, the extraction and selection of keystroke dynamics features based on their information gain, and the testing of user characteristics recognition by training five well-known machine learning models. Experimental results show that it is possible to identify the gender, the age group, the handedness, and the educational level of an unknown user with high accuracy.
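
The feature extraction and selection stage can be illustrated with a minimal sketch. It is not the authors' implementation: this preview does not list the exact features or how they were discretized, so the sketch assumes two features commonly used in keystroke dynamics (key hold time and down-down latency between consecutive keys) and scores a feature by the information gain of a single threshold split. The event format, the function names, and the class labels "x" and "y" are all hypothetical.

```python
import math
from collections import Counter

def hold_times(events):
    """events: list of (key, action, t_ms), action is 'down' or 'up'.
    Returns (key, hold duration in ms) for every completed key press."""
    pressed, out = {}, []
    for key, action, t in events:
        if action == "down":
            pressed[key] = t
        elif action == "up" and key in pressed:
            out.append((key, t - pressed.pop(key)))
    return out

def digram_latencies(events):
    """Down-down latency (ms) between each pair of consecutive key presses."""
    downs = [(k, t) for k, a, t in events if a == "down"]
    return [((k1, k2), t2 - t1) for (k1, t1), (k2, t2) in zip(downs, downs[1:])]

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels, threshold):
    """Information gain of the binary split `value <= threshold` (assumed
    discretization) with respect to the class labels."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    conditional = sum(len(p) / n * entropy(p) for p in (left, right) if p)
    return entropy(labels) - conditional

# Hypothetical log: 'a' held for 90 ms, 'b' pressed 150 ms after 'a'.
events = [("a", "down", 0), ("a", "up", 90), ("b", "down", 150), ("b", "up", 230)]
holds = hold_times(events)
latencies = digram_latencies(events)

# A feature that separates the two classes perfectly gains 1 bit;
# one that splits them evenly on both sides gains nothing.
ig_good = information_gain([80, 85, 120, 130], ["x", "x", "y", "y"], 100)
ig_bad = information_gain([80, 120, 85, 130], ["x", "x", "y", "y"], 100)
```

Under these assumptions, features would be ranked by their gain and only the top-ranked ones passed on to the classifiers.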

Introduction

Today there are more than 4 billion Internet users in the world who use online services to communicate, be entertained, learn, work, and so on. The way we communicate with someone over the Internet differs radically from the way we do it in person. Most of the time we see neither the face of our interlocutor nor his/her expressions, and we hear neither his/her voice nor the changes in its tone. The stimuli that used to tell us who our interlocutor is and what his/her intentions are have ceased to exist. In addition, we have to consider that a user is often talking to someone completely unknown, and that children and teenagers participate in these conversations, especially in social networks. It is easy to see that many dangers lurk here, such as financial fraud, seduction of minors, and anonymous threats. It also raises the question of how ethical it is for someone to take advantage of this peculiarity of online communication and conceal his/her identity from his/her interlocutor.

According to the definition of “Technoethics” in the work of Alim and Khalid (2019), technology (apart from being part of social development) causes changes in lifestyle, and as a result many ethical considerations have to be addressed. Many of these considerations concern individuals, and more specifically the ethical questions that are exacerbated by the ways in which technology extends or curtails their power. Consequently, the limits on what a computer user can know about the person he/she is talking to, through a messaging application for example, are an issue to be considered in the light of “Technoethics”. In a face-to-face conversation, or even a telephone conversation, everyone receives information about their interlocutor and modifies their attitude accordingly; it would be fair for the same to be possible where this information cannot otherwise be obtained.

One solution to the aforementioned problem is to learn some characteristics of the user we are talking to, such as gender, age, and educational level, without violating his/her privacy. There are several proposals for achieving this, such as that of Cheung and She (2017), who tried to recognize the gender of users from the images generated by their mobile devices and shared in social networks. The gender of the user can also be predicted from multimodal data, as demonstrated by Estruch et al. (2017), who used a corpus containing text, image, and location information from three different social networks and users in three different cities, achieving an accuracy of 91.3%. A method for recognizing the age of users that exploits sociolinguistically-based and content-related text features is proposed by Simaki et al. (2016), while Arroju et al. (2015) similarly try to determine the gender and age of Twitter users based on the contents of their tweets. Although in most cases the target characteristic is the gender and/or age of users, there are also methods in the literature that try to discover other characteristics, such as the work of Seneviratne et al. (2014), in which the authors attempt to determine the religion, relationship status, spoken languages, and countries of interest of unknown users from snapshots of the apps installed on their smartphones.
