Thesis

Identifying users using Keystroke Dynamics and contextual information

Abstract and Figures

Biometric identification systems based on Keystroke Dynamics have been around for almost forty years. There has always been considerable interest in identifying individuals by their physiological or behavioral traits, and Keystroke Dynamics focuses on the particular way a person types on a keyboard. The objective of the proposed research is to determine how well the identity of users can be established when this biometric trait is used together with contextual information. The proposed research focuses on free text: users were never told what to type, how, or when. This field of Keystroke Dynamics has not been studied as thoroughly as the fixed-text alternative, where a plethora of methods have been tried. The proposed methods rest on the hypothesis that the position of a particular letter, or combination of letters, in a word is of high importance. Other studies have not taken into account whether these letter combinations occurred at the beginning, the middle, or the end of a word. A template of the user will be built using the context of the written words and the latency between successive keystrokes. Other features, such as word length, the minimum number of words needed to consider a session valid, word frequency, model-building parameters, and age group and gender, have also been studied to determine which best help ascertain the identity of an individual. The results of the proposed research should help determine whether Keystroke Dynamics and the proposed methodology are enough to identify users from the content they type with a sufficient level of certainty. If so, the approach could be used to ensure that a user is not impersonated, in authentication schemes, or even to help determine the authorship of different parts of a document written by more than one user.
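To make the positional hypothesis concrete, the following is a minimal sketch of how letter pairs (digraphs) might be tagged with their position in a word alongside the latency between successive key presses. The function name `positional_digraphs`, the three position labels, the rule that the first pair is "begin" and the last pair is "end", and the sample timestamps are all assumptions made for illustration; the abstract does not specify how the thesis defines these boundaries.

```python
from typing import List, Tuple


def positional_digraphs(
    word: str, press_times_ms: List[float]
) -> List[Tuple[str, str, float]]:
    """Return (digraph, position, latency_ms) for each pair of successive keys.

    Assumed labelling rule (hypothetical): the first pair of a word is
    'begin', the last pair is 'end', everything in between is 'middle'.
    A two-letter word has a single pair, which this sketch labels 'begin'.
    """
    if len(word) != len(press_times_ms):
        raise ValueError("one press timestamp is required per character")

    n_pairs = len(word) - 1
    features = []
    for i in range(n_pairs):
        if i == 0:
            position = "begin"
        elif i == n_pairs - 1:
            position = "end"
        else:
            position = "middle"
        # Latency between two successive key presses.
        latency = press_times_ms[i + 1] - press_times_ms[i]
        features.append((word[i:i + 2], position, latency))
    return features


# Hypothetical press timestamps (milliseconds) for the word "there".
sample = positional_digraphs("there", [0.0, 110.0, 205.0, 330.0, 420.0])
for digraph, position, latency in sample:
    print(digraph, position, latency)
```

Under this labelling, the same digraph typed at the start and at the end of a word yields two separate features, which is what allows a user template to model them independently.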
... The most common performance metrics used in biometric systems are the False Accept Rate (FAR) and the False Reject Rate (FRR). The FAR measures the likelihood that the biometric system will incorrectly grant access to an unauthorized user [37]. It is calculated as the number of false acceptances divided by the total number of impostor attempts, as shown in equation 1. ...
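As a small worked illustration of these two rates, the sketch below computes the FAR exactly as the excerpt defines it, and the FRR as its usual counterpart (false rejections divided by genuine attempts). The function names and the attempt counts are hypothetical and chosen only for the example.

```python
def false_accept_rate(false_accepts: int, impostor_attempts: int) -> float:
    """FAR = false acceptances / total impostor attempts."""
    if impostor_attempts <= 0:
        raise ValueError("impostor_attempts must be positive")
    return false_accepts / impostor_attempts


def false_reject_rate(false_rejects: int, genuine_attempts: int) -> float:
    """FRR = false rejections / total genuine attempts."""
    if genuine_attempts <= 0:
        raise ValueError("genuine_attempts must be positive")
    return false_rejects / genuine_attempts


# Hypothetical counts: 3 of 200 impostor attempts were accepted,
# and 5 of 100 genuine attempts were rejected.
far = false_accept_rate(3, 200)
frr = false_reject_rate(5, 100)
print(f"FAR = {far:.3f}, FRR = {frr:.3f}")
```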
Preprint
Reliably identifying and authenticating users remains integral to computer system security. Various novel authentication techniques, such as biometric authentication systems, have been devised in recent years. This paper surveys keystroke-based authentication systems and their applications, such as continuous authentication. Keystroke dynamics promises to be non-intrusive and cost-effective, as no additional hardware is required other than a keyboard. This survey can serve as a reference for researchers working on keystroke dynamics.
Article
Currently, people store more and more sensitive data on their mobile devices, so it is highly important to strengthen the existing authentication mechanisms. The analysis of typing patterns, formally known as keystroke dynamics, is useful for enhancing the security of password-based authentication. Moreover, touchscreens allow features such as screen pressure or finger area to be added to the classical time-based features used for keystroke dynamics. In this paper, we examine the effect of these additional touchscreen features on identification and verification performance using our dataset of 42 users. Results show that these additional features enhance the accuracy of both processes.
Article
User-specific traits are a strong way to strengthen the security of any system, because they tie the system to a specific individual instead of granting access through a token, key, etc. Behaviour-based user authentication with pointing devices, such as touchpads or mice, has been gaining attention. Mouse dynamics is an inexpensive method that provides unique characteristics to prevent attacks on unlocked workstations and to lock unauthorized users out of the system. The objective of this paper is a survey and comparison of the mouse dynamics biometrics studies performed to date. We consider the best results reported in terms of False Rejection Rate (FRR) and False Acceptance Rate (FAR).
Article
This study introduces an approach for user authentication using free-text keystroke dynamics that incorporates text in the Arabic language. The Arabic language has completely different characteristics from those of English. The approach followed in this study involves the use of the keyboard's key layout. The method extracts timing features from specific key pairs in the typed text. Decision trees were used to classify each user's data. In parallel, for comparison, support vector machines were also used for classification in association with an ant colony optimisation feature selection technique. The results obtained from this study are encouraging, as low false accept rates and false reject rates were achieved in the experimentation phase. This indicates that the typing attributes in the proposed approach achieved satisfactory overall system performance for Arabic text. © 2016, Institution of Engineering and Technology. All rights reserved.
Article
This paper uses static keystroke dynamics for user authentication. The inputs are the key-down and key-up times and the ASCII key codes captured while the user types a string. Four features (key code, two keystroke latencies, and key duration) were analyzed, and seven experiments were performed combining these features. The results of the experiments were evaluated with three types of users: the legitimate user, the impostor, and the observer impostor. The best results were achieved using all features, obtaining a false rejection rate of 1.45% and a false acceptance rate of 1.89%. This approach can be used to improve the usual login-password authentication when the password is no longer a secret. The paper's novelty is its use of four features to authenticate users.
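The four features named above can be derived from raw key-down and key-up timestamps, as in the sketch below. The abstract does not say which two latencies the paper uses, so this example assumes the common down-down and up-down definitions; the `KeyEvent` type, the function name, and the sample timings are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class KeyEvent:
    code: int       # ASCII code of the key
    down_ms: float  # time the key was pressed
    up_ms: float    # time the key was released


def timing_features(events: List[KeyEvent]) -> Dict[str, list]:
    """Derive key codes, key durations, and two latencies from key events.

    Assumed latency definitions (not confirmed by the abstract):
      down_down: press of key i to press of key i+1
      up_down:   release of key i to press of key i+1
    """
    pairs = list(zip(events, events[1:]))
    return {
        "codes": [e.code for e in events],
        "durations": [e.up_ms - e.down_ms for e in events],
        "down_down": [b.down_ms - a.down_ms for a, b in pairs],
        "up_down": [b.down_ms - a.up_ms for a, b in pairs],
    }


# Hypothetical timings for the string "abc".
typed = [
    KeyEvent(code=ord("a"), down_ms=0.0, up_ms=80.0),
    KeyEvent(code=ord("b"), down_ms=150.0, up_ms=240.0),
    KeyEvent(code=ord("c"), down_ms=300.0, up_ms=370.0),
]
features = timing_features(typed)
print(features)
```

A string of n keys yields n durations but only n - 1 values for each latency, which is why the two kinds of features are kept in separate lists.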
Conference Paper
This paper investigates gender recognition from keystroke dynamics data and from touchscreen swipes. Classification measurements were performed using 10-fold cross-validation and leave-one-user-out cross-validation (LOUOCV). We show that when the target is the classification of data from unseen users, only the second approach is viable. Based on our limited datasets, we show that gender cannot be reliably predicted. The best results were 64.76% for the keystroke dataset and 57.16% for the swipes dataset. However, the classification accuracy is over 80% for more than half of the users in the keystroke dynamics dataset.
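The difference between the two evaluation schemes comes down to how samples are split. Plain 10-fold cross-validation over samples can place samples from the same user in both the training and test folds, whereas leave-one-user-out holds back every sample of one user at a time. The sketch below shows one way such splits could be generated; the function name and the toy user labels are assumptions, and this is not the authors' code.

```python
from typing import Iterator, List, Sequence, Tuple


def leave_one_user_out(
    user_ids: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_user, train_indices, test_indices) per user.

    All samples of the held-out user go to the test set, so the
    classifier is always evaluated on a user it has never seen.
    """
    for held_out in sorted(set(user_ids)):
        train = [i for i, u in enumerate(user_ids) if u != held_out]
        test = [i for i, u in enumerate(user_ids) if u == held_out]
        yield held_out, train, test


# Hypothetical sample-to-user assignment: five samples from three users.
user_ids = ["u1", "u1", "u2", "u2", "u3"]
folds = list(leave_one_user_out(user_ids))
for held_out, train, test in folds:
    print(held_out, "train:", train, "test:", test)
```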
Article
Dependence on computers to store and process sensitive information has made it necessary to secure them from intruders. A behavioral biometric such as keystroke dynamics, which makes use of the typing cadence of an individual, can be used to strengthen existing security techniques effectively and cheaply. Due to the ballistic (semi-autonomous) nature of typing behavior, it is difficult to impersonate, making it useful as a biometric. In this paper, we provide a basic background on the psychological basis behind the use of keystroke dynamics. We also discuss the data acquisition methods, approaches, and performance of the methods used by researchers on standard computer keyboards. In this survey, we find that the use and acceptance of this biometric could be increased by development of standardized databases, assignment of nomenclature for features, development of common data interchange formats, establishment of protocols for evaluating methods, and resolution of privacy issues.
Article
Most computer systems rely on usernames and passwords as a mechanism for access control and authentication of authorized users. These credential sets offer weak protection to a broad scope of applications with differing levels of sensitivity. Traditional physiological biometric systems such as fingerprint, face, and iris recognition are not readily deployable in remote authentication schemes. Keystroke dynamics provide the ability to combine the ease of use of username / password schemes with the increased security and trustworthiness associated with biometrics. Our research extends previous work on keystroke dynamics by incorporating shift-key patterns. The system is capable of operating at various points on a traditional ROC curve depending on application specific security needs. A 1% False Accept Rate is attainable at a 14% False Reject Rate for high security systems. An Equal Error Rate of 5% can be obtained in lower security systems. As a username password authentication scheme, our approach decreases the imposter penetration rate associated with compromised passwords by 95-99%.
In our data collection experiment, as part of the enrollment, each user was given two sets of username / password credential sequences. The username in both sets took the form Firstname.Lastname with the first letter of each name capitalized. The first password was formed as an all lowercase English word. The second password, designed to force shift-key behavior consisted of 12 randomly generated characters in a consistent pattern that included characters of varying capitalization, digits, and special symbols. Examples of such passwords include +AL4lfav8TB= and UC8gkum5WH. This pattern was not intended to elicit any specific shift-key behavior but only to allow for easy interpretation of potentially ambiguous symbols. Input sequences were subsequently defined as the username + either of the two passwords.
Participants in the study were asked to enter both types of username / password combinations through a web-based interface developed using client-side Java applets. Imposter input was collected through an interface that provided the current user with the credentials of other individuals enrolled in the system. This is notably different from most biometric experiments, which simply cross-compare all collected data. The experiment was completely remote and unsupervised (only written / video instructions were provided). Users varying highly in typing ability, age, and ethnic background were allowed, but not required, to use multiple keyboards for input. The span in which users entered sequences varied, but input never consisted solely of a single session. At the end of a one-month collection period, we had collected data from over 50 users and more than 10,000 username / password credential sequences. For each sequence, digraph delays (latencies) and hold times (durations) were recorded. Based on these records, aggregate features such as average hold time, maximum delay, and total strokes were calculated, with particular attention paid to digraphs involving either of the shift keys. The result of these calculations was a 40-attribute feature vector to be used for classification. It should once again be noted that, due to the behavioral nature of keystroke dynamics, genuine input was not used to create artificial imposter attempts; both genuine and imposter data were collected using the credential sets of users enrolled in the system.
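The three aggregate features named above can be sketched as follows. This is only an illustration of the kind of calculation described, not the paper's 40-attribute vector: the function name, the dictionary keys, and the sample timings are assumptions, and the shift-key-specific attributes are not reproduced.

```python
from typing import Dict, List


def aggregate_features(
    hold_times_ms: List[float], digraph_delays_ms: List[float]
) -> Dict[str, float]:
    """Summarise one credential sequence with three aggregate features."""
    if not hold_times_ms or not digraph_delays_ms:
        raise ValueError("at least one hold time and one delay are required")
    return {
        "average_hold_time": sum(hold_times_ms) / len(hold_times_ms),
        "maximum_delay": max(digraph_delays_ms),
        # One hold time is recorded per keystroke.
        "total_strokes": float(len(hold_times_ms)),
    }


# Hypothetical recordings for a four-keystroke sequence.
summary = aggregate_features(
    hold_times_ms=[80.0, 90.0, 70.0, 100.0],
    digraph_delays_ms=[70.0, 60.0, 120.0],
)
print(summary)
```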
Article
Heterogeneous and aggregate vectors are the two widely used feature vectors in fixed-text keystroke authentication. In this paper, we address the question “Which vectors, heterogeneous, aggregate, or a combination of both, are more discriminative and why?” We accomplish this in three ways: (1) by providing an intuitive example to illustrate how aggregation of features inherently reduces discriminability; (2) by formulating “discriminability” as a non-parametric estimate of Bhattacharyya distance, we show theoretically that the discriminability of a heterogeneous vector is higher than that of an aggregate vector; and (3) by conducting user recognition experiments using a dataset containing keystrokes from 33 users typing a 32-character reference text, we empirically validate our theoretical analysis. To compare the discriminability of heterogeneous and aggregate vectors with different combinations of keystroke features, we conduct feature selection analysis using three methods: (1) ReliefF, (2) correlation-based feature selection, and (3) consistency-based feature selection. Results of the feature selection analysis reinforce the findings of our theoretical analysis.
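The intuition that aggregation reduces discriminability can be seen in a toy case. The sketch below is not the paper's own example and uses plain Euclidean distance as a stand-in rather than the Bhattacharyya-based measure; the two users and their hold times are hypothetical. The heterogeneous vectors (one hold time per key) are clearly different, yet their aggregate (the mean hold time) is identical.

```python
import math
from typing import List


def euclidean(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def aggregate(vector: List[float]) -> List[float]:
    """Collapse a heterogeneous vector into a single mean value."""
    return [sum(vector) / len(vector)]


# Hypothetical per-key hold times (ms) for the same three keys.
user_a = [80.0, 120.0, 100.0]
user_b = [120.0, 80.0, 100.0]

hetero_distance = euclidean(user_a, user_b)
aggregate_distance = euclidean(aggregate(user_a), aggregate(user_b))

print(f"heterogeneous distance: {hetero_distance:.2f}")
print(f"aggregate distance:     {aggregate_distance:.2f}")
```

Here the aggregate vectors coincide, so a classifier working only on the aggregate could not separate these two users, while the per-key vectors remain well apart.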