An Exploration of Machine Learning Methods for Biometric Identification Based on Keystroke Dynamics

Didem Filiz, Ömer Özgür Tanrıöver
DOI: 10.4018/978-1-7998-0301-0.ch014


In this chapter, the authors explore keystroke dynamics as a behavioral biometric and the effectiveness of state-of-the-art machine learning algorithms for identifying and authenticating users based on keystroke data. One motivation is to explore the application of classifiers to the field of keystroke dynamics. In other settings, recent machine learning models have been effective with limited data and are computationally relatively inexpensive. The authors therefore conducted experiments with two keystroke dynamics datasets of limited size. They demonstrated the effectiveness of the models on a dataset obtained from touchscreen devices (mobile phones) and on a dataset obtained from a conventional keyboard. Although similar recent studies explore different classification algorithms, their main aim has been anomaly detection. In contrast, the authors experimented with classification methods for user identification and authentication on both datasets.
Chapter Preview


The development of information technology over the past decades has increased the need to access and store confidential information. Users are usually required to sign up with passwords for services offered by online service providers. Keeping track of numerous accounts can make users’ lives difficult, as one may forget a password or, worse, a malicious party can steal it (Giot, El-Abed, Hemery, & Rosenberger, 2011).

Using passwords alone for authentication is not considered reliable. Therefore, various authentication systems rely on human biometric information, such as fingerprint, retina, and voice. Another approach is to use human behavioral characteristics to strengthen the user identification process. One of the behavioral approaches is based on “Keystroke Dynamics” (Xi, Tang, & Hu, 2011). Keystroke dynamics is considered a behavioral biometric that can identify a user uniquely. It is important to note that keystroke dynamics is known by a few different names in the literature: keyboard dynamics, keystroke analysis, typing biometrics, and typing rhythms.

Researchers consider a user’s typing pattern to be unique to the person because of differences in neuro-physiological, character-based, and physical factors. In other words, researchers analyze not what is typed, but how it is typed. The characteristics of typing patterns are considered a good indicator of identity; therefore, they may be used as biometrics without the need for other information about the user and without any extra hardware. Keystroke dynamics may be based on various measurements taken from keyboard keys or touch screens (Revett, 2009).
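To make "how it is typed" concrete, the following minimal Python sketch derives timing measurements from key press and release timestamps. The chapter preview does not specify which measurements are used, so the event layout, the function name `timing_features`, and the choice of hold time, down-down latency, and up-down latency are illustrative assumptions rather than the authors' feature set.

```python
from typing import List, Tuple

# Assumed event layout (not from the chapter): (key, press_time, release_time),
# with both times in seconds.
KeyEvent = Tuple[str, float, float]


def timing_features(events: List[KeyEvent]) -> List[float]:
    """Return hold times and between-key latencies for one typed sample."""
    features: List[float] = []
    for i, (_, press, release) in enumerate(events):
        # Hold time: how long the key stays pressed.
        features.append(release - press)
        if i + 1 < len(events):
            next_press = events[i + 1][1]
            # Down-down latency: press of this key to press of the next key.
            features.append(next_press - press)
            # Up-down latency: release of this key to press of the next key.
            # It is negative when the next key is pressed before this one is released.
            features.append(next_press - release)
    return features


# Hypothetical timestamps for typing "pas".
sample = [("p", 0.00, 0.10), ("a", 0.25, 0.33), ("s", 0.40, 0.52)]
vector = timing_features(sample)
```

A sample of n keys yields 3n - 2 values under these assumptions, and the resulting vector depends only on timing, not on which characters were typed.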

In the last two decades, researchers have conducted numerous studies on data acquisition methods, feature representations, classification methods, experimental protocols, and evaluations. In this chapter, the authors focus on the effectiveness of recent classification methods. First, the authors review the work related to keystroke dynamics for classification. Then, the authors investigate the accuracy of different algorithms on two known datasets collected from mobile device screens (Hwang, Cho, & Park, 2009) and keyboards, namely “The Mobikey Keystroke Dynamics Password Database” (Antal & Nemes, 2016) and “Benchmark Data Set for Anomaly-Detection with Keystroke Dynamics” (Killourhy & Maxion, 2009). By using a multi-class approach, the authors view the authentication problem from a different perspective: there is one class for every user, and each training point belongs to one of these classes. Additionally, a binary classification approach is tested using positive and negative samples. In these tests, positive samples come from genuine users and negative samples are treated as impostors. The authors used two different datasets to assess the reliability of the applied methods.
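The difference between the multi-class and binary formulations can be sketched in a few lines of Python. Everything here is hypothetical: the user names, the two-value feature vectors, and the helper functions are invented for illustration, and the nearest-centroid rule is only a simple stand-in, not one of the classifiers evaluated in the chapter.

```python
from typing import Dict, List, Tuple

# Assumed sample layout (not from the chapter): (user id, feature vector).
Sample = Tuple[str, List[float]]


def multiclass_labels(samples: List[Sample]) -> List[int]:
    """Multi-class setup: one class index per distinct user."""
    users = sorted({user for user, _ in samples})
    index = {user: i for i, user in enumerate(users)}
    return [index[user] for user, _ in samples]


def binary_labels(samples: List[Sample], genuine_user: str) -> List[int]:
    """Binary setup: 1 for the genuine user, 0 for everyone else (impostors)."""
    return [1 if user == genuine_user else 0 for user, _ in samples]


def fit_centroids(samples: List[Sample], labels: List[int]) -> Dict[int, List[float]]:
    """Average the feature vectors of each class (stand-in classifier)."""
    grouped: Dict[int, List[List[float]]] = {}
    for (_, vector), label in zip(samples, labels):
        grouped.setdefault(label, []).append(vector)
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(vector: List[float], centroids: Dict[int, List[float]]) -> int:
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def distance(label: int) -> float:
        return sum((a - b) ** 2 for a, b in zip(vector, centroids[label]))
    return min(centroids, key=distance)


samples = [
    ("alice", [0.10, 0.25]), ("alice", [0.12, 0.27]),
    ("bob", [0.20, 0.40]), ("bob", [0.22, 0.38]),
    ("carol", [0.05, 0.15]),
]
multi = multiclass_labels(samples)
binary = binary_labels(samples, genuine_user="bob")
who = predict([0.21, 0.41], fit_centroids(samples, multi))
is_genuine = predict([0.21, 0.41], fit_centroids(samples, binary))
```

In the multi-class case the prediction answers "which user typed this?", whereas in the binary case it answers "is this the claimed user or an impostor?", which requires one model per genuine user.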

The authors therefore explore the influence and results of different classification algorithms on these datasets. The study applies several classification methods to the keystroke datasets and compares their results. In this way, the authors can analyze the experiments to identify the best-performing classification algorithm for keystroke dynamics. The conclusions that arise from the experimentation can be a useful contribution to studies in this field. In the next sections, Section 2 presents a review of the related literature and explains how this study differs from other studies. Section 3 describes how the authors perform their experiments. Finally, Section 4 discusses the conclusions of the work.
