Using Keystroke Dynamics and context features to assess authorship in online learning environments

Aleix Dorca Josa (1), Jose Antonio Morán Moreno (2), Eugènia Santamaría Pérez (2)
(1) Universitat d’Andorra
(2) Universitat Oberta de Catalunya
Abstract
The possibility of measuring, with off-the-shelf keyboards, the particular rhythm a user has when typing on a computer system has made it possible to identify these users. Over the years, obtaining good authentication, identification and verification methods has been the main focus of Keystroke Dynamics.
The objective of the proposed research is to determine how well the identity of users can be established, once context features are evaluated, when they use online resources like e-learning environments. This research was performed in a real-life environment using a free text methodology. The proposed method focuses on the hypothesis that the position of a particular combination of letters in a word is of high importance. The template of the user is built using the latency between successive keystrokes and the context of the written words, that is, taking into account where a particular letter stroke has taken place. Other contextual features have also been studied to determine the ones that best help ascertain the identity of a user.
The results of the proposed research should help determine whether Keystroke Dynamics and the proposed method are enough to identify users from the content they create with a sufficient level of certainty. If so, the method could be used to ensure that a user is not impersonated by another, in authentication schemes, or to help determine the authorship of different parts of a document written by more than one user.
Keywords: Keystroke Dynamics, context, free text, assessment, e-learning environments
1 Introduction
Identifying users is one of the main objectives of biometric techniques. In online learning environments, doubts may arise as to whether an assignment was written by the user who submitted it. Having users authenticated in the online platform, or even in their desktop environment, is no guarantee that the submitted papers were authored by them. Keystroke Dynamics can help assert their identity using the time intervals between keystrokes and the context features related to the written words.
Keystroke Dynamics has been studied since the late seventies. The field has been divided into different branches, with authentication, identification and verification being the most relevant. At the same time, the study of how users type on the keyboard has followed two main approaches: fixed text and free text. A typical fixed text example is a password, something known to the users that they always type in the same manner. In contrast, the free text methodology lets users type anything they want without restrictions on length or content.
The typical methodology consists in creating a template of the features that best describe a user when typing on a keyboard. New samples can be compared against this model to verify their validity, always with a certain level of error. Efforts should be put into minimizing this error. The European standard for access-control systems specifies a false-alarm rate of less than 1%, with a miss rate of no more than 0.001% [1].
This study uses the context data of the written words to identify the users, as opposed to other well-known techniques such as n-graph frequency. The method examines whether the rhythm of a particular user is the same when they type, for example, IS, IRIS, THESIS or DISAPPEAR. The combination of letters I-S would normally be considered a digraph and would be grouped in a common data structure without considering whether it had appeared at the beginning, in the middle, or at the end of the word. This particular feature has not been thoroughly studied before, even though it has been proposed as a possible line of work [2, 3].
2 Background
As previously stated, this study focuses on free text. This research field has been far less studied than the fixed text alternative. In one of the very first articles that dealt with free text, the results were not very promising, with only 23% positive identification [4]. The wide range of environments (sometimes highly tailored and controlled), user and sample counts, classification methods and other factors makes it very difficult to establish a standard for comparison, even more so when less common features, like context data, are considered [5]. Some public keystroke databases have been made available, but mostly for fixed text environments [6].
One of the most cited works is that by D. Gunetti and C. Picardi [7]. The authors calculated both
Relative and Absolute distances between newly collected samples and previously stored templates, and
combined these to obtain their results. These were around 0.005% FAR and 5% FRR. One of the problems
with their method was that the resources needed to obtain the degree of disorder of a sample vector could
be very demanding. Other studies have tried to deal with this scalability problem [8] or, at the same time,
improve their results by slightly modifying their method [9].
The influence of different keyboards has also been studied [10]. The study carried out by M. Villani et al. is highly relevant when evaluating the results presented here. User identification was more precise if the user always used the same keyboard or input device (99.8% identification rate). In the present study, users were able to submit information not only from any device but also from any location, so results can be affected by this lack of consistency.
Another study criticizes the methodology based on n-graphs suggesting that this data structure does
not provide enough information about the way a user types [11]. The study suggests that whole words
could give equal or better results than just using short n-graphs. The present study will answer the
question whether length matters by analyzing different word lengths.
The methodology used in this paper shares similarities with the work of Messerman et al. [12] and M. Curtin et al. [13]. They used n-graph samples to build the models. An interesting result of their research was that when a new sample was compared to an increasing number of models, the chances of correctly identifying the user diminished at an accelerating rate.
Brizan et al. [14] tried to identify the demographics of the users studied, reaching 82.2% accuracy when samples were at least 50 words long. This is consistent with what was found in the research presented in this paper. The authors also studied other features related to context to try to establish the cognitive task a user was performing.
Also close to the methodology proposed in this study is the work of Morales et al. [15]. They studied 64 students using different distance measurements, digraphs and trigraphs, and obtained an accuracy of over 90% when identifying users in online learning environments. They did not use context features, though.
The research presented in this paper is the continuation of the work started in [16]. The most relevant result was that, with a small dataset, word length was of importance. This will be evaluated again in this paper using a different set of samples (more interesting in terms of size and robustness).
It is worth noting that this biometric technique has been used in multimodal schemes including, but
not limited to, face recognition and speech recognition to improve the global identification rate [17, 18].
3 Methodology
3.1 Samples collection
The samples for this study were collected over a period of two semesters (a whole year) from the messages
sent to the forums at the Virtual Campus of the University of Andorra. In this paper, each of these
messages is referred to as a Session (S).
A snippet of code combining PHP, jQuery, JavaScript and AJAX was developed and added to the base code of the Forum module of the Moodle Learning Content Management System (LCMS). The time intervals for every pressed key were collected and securely sent to a remote server, where they were stored in a database for later analysis. The following information was gathered for every key event: a user and a session identifier, the key event code, the type of event (either Keydown or Keyup), the timestamp of the moment the event had been recorded and other minor metadata regarding the user’s device and location.
A total of 60 users were used for this study. These were selected among the ones that had sent the largest number of events to the LCMS. Close to 4,000 sessions were evaluated. It is worth noting that the information was collected only from desktop computers. Unfortunately, there was not enough information to perform the tests with events sent from mobile devices.
The profile of the selected users was highly heterogeneous, a characteristic that is highly regarded in this kind of study. Samples from students and faculty alike, from all kinds of studies offered at the University of Andorra, were collected. Their age ranged from 18 to 65.
3.2 Interval analysis
The study of a Session (S) consists in analyzing the different Keydown (KD) and Keyup (KU) events in order to find the time intervals between them. This yields the Press–Release (PR, also known as dwell time) and Release–Press (RP, also known as fly time) intervals for every pressed key.
Words were detected using two features: known delimiters (e.g. the space key, the comma key, the period key, etc.) and a maximum time interval of silence (300 ms).
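The word detection step can be sketched as follows. This is a minimal illustration, not the code used in the study: the record layout `(key, press_ms, release_ms)`, the delimiter subset, and the reading of "silence" as the gap between one key release and the next key press are all assumptions, and the timestamps are illustrative.

```python
from typing import List, Tuple

# Hypothetical record layout: (key, press_ms, release_ms).
Keystroke = Tuple[str, int, int]

DELIMITERS = {" ", ",", "."}   # assumed subset of the delimiter keys
MAX_SILENCE_MS = 300           # silence threshold quoted in the text


def split_words(keys: List[Keystroke]) -> List[List[Keystroke]]:
    """Group keystrokes into words using delimiters and a silence timeout."""
    words: List[List[Keystroke]] = []
    current: List[Keystroke] = []
    for stroke in keys:
        key, press, _release = stroke
        # Assumption: silence is measured from the previous release to this press.
        if current and press - current[-1][2] > MAX_SILENCE_MS:
            words.append(current)
            current = []
        if key in DELIMITERS:
            # A delimiter closes the current word; its own timings are dropped.
            if current:
                words.append(current)
                current = []
            continue
        current.append(stroke)
    if current:
        words.append(current)
    return words


strokes = [("T", 10, 64), ("H", 89, 117), ("E", 122, 140), (" ", 192, 256),
           ("S", 288, 320), ("U", 349, 387), ("N", 420, 448),
           ("O", 900, 950), ("K", 980, 1020)]
words = ["".join(key for key, _, _ in word) for word in split_words(strokes)]
print(words)  # ['THE', 'SUN', 'OK']
```

Here the space key separates THE from SUN, while the 452 ms pause before O separates SUN from OK without any delimiter.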
Figure 1 shows an example of the time intervals for the words THE SUN. The first word (THE) is formed by the following PR intervals: D1 = 54, D2 = 28 and D3 = 18. The RP intervals are F1 = 25 and F2 = 5. When a word separator is detected (a space key event in this case), the intervals of that event are discarded (F3, D4 and F4). The second word (SUN) is formed by the PR intervals D5 = 32, D6 = 38 and D7 = 28, and by the RP intervals F5 = 29 and F6 = 33. From this information, other features such as Press–Press (PP) or Release–Release (RR) intervals can also be easily obtained.
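Event timestamps (ms), where P = press, R = release and sp = space: P_T 10, R_T 64, P_H 89, R_H 117, P_E 122, R_E 140, P_sp 192, R_sp 256, P_S 288, R_S 320, P_U 349, R_U 387, P_N 420, R_N 448.
Intervals between consecutive events, in order: D1, F1, D2, F2, D3, F3, D4, F4, D5, F5, D6, F6, D7.
Figure 1: Timing intervals for the words: THE SUN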
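The four interval types can be derived from the press and release timestamps of each word. The sketch below uses the THE SUN event times; the function name and the `(key, press_ms, release_ms)` layout are hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

Keystroke = Tuple[str, int, int]  # hypothetical layout: (key, press_ms, release_ms)


def word_intervals(word: List[Keystroke]) -> Dict[str, List[int]]:
    """Return the PR, RP, PP and RR intervals of one word."""
    presses = [press for _, press, _ in word]
    releases = [release for _, _, release in word]
    pairs = range(len(word) - 1)
    return {
        # Dwell time: release minus press of the same key.
        "PR": [r - p for p, r in zip(presses, releases)],
        # Fly time: next press minus current release (negative if keys overlap).
        "RP": [presses[i + 1] - releases[i] for i in pairs],
        "PP": [presses[i + 1] - presses[i] for i in pairs],
        "RR": [releases[i + 1] - releases[i] for i in pairs],
    }


the = word_intervals([("T", 10, 64), ("H", 89, 117), ("E", 122, 140)])
sun = word_intervals([("S", 288, 320), ("U", 349, 387), ("N", 420, 448)])
print(the["PR"], the["RP"])  # [54, 28, 18] [25, 5]
print(sun["PR"], sun["RP"])  # [32, 38, 28] [29, 33]
```

A word of n letters yields n PR values and n - 1 values for each of RP, PP and RR, which is why single-letter words carry only a PR value.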
3.3 The tree model
Detected words were stored in a logical tree structure like the one shown in Figure 2. In this example, the following words have been added to the tree: A, T, W, ALL, ALBERT, THE, THERE, THIS, WORD and WIT. Each node containing a letter can hold PR and RP timing intervals (first and second list, respectively). Single-letter words have no RP values; only PR intervals can be obtained. Since this research used four features (namely PR, RP, PP, and RR), one-letter words were discarded. In addition, they seemed to add little valuable information [16].
In the tree model, a node may or may not have PR and RP timing intervals, depending on whether the user has ever typed that particular whole word. The timing information is always stored in the node corresponding to the last letter of the word. If a word is detected more than once, there will be a different PR and RP list for each instance found (e.g. ALL in the figure). If a word is a sub-word of an already stored word, there will be PR and RP timing information in a non-leaf node (e.g. THE and THERE in the figure).
This tree model stores the information from the beginning to the end of each word. It is thus called a
straight tree model. This means, for example, that for the word THIS the first node would contain the
letter T, its first child node would contain the H, then the I and the leaf node, at depth 4, would finally
have the S. The timing intervals would be stored on the S node.
Another model used in this research is an inverted tree model. This model is built following the same methodology but from the end to the beginning of words. Using the previous example, for the word THIS the first node would contain the letter S, the first child would be the I, then the H and finally, the leaf node with the timing information, also at depth 4, would contain the letter T. It was observed that when comparing new sessions against the straight tree model many words would be found only up to a certain depth because the user had previously typed a different word with the same root letters. Normally, the information from the end of these partially found words would be discarded. The idea of using an inverted tree model was to make sure that most of the context data available would be used. The combined tree model that was used to generate the results used the data from both the straight and the inverted trees.

_
  A [98] []
    L
      L [[80, 117, 124] [67, 122], [86, 112, 120] [60, 118]]
      B
        E
          R
            T [112, 92, 127, 142, 154, 231] [88, 49, 69, 73, 112]
  T [122] []
    H
      E [45, 94, 83] [67, 89]
        R
          E [98, 92, 122, 88, 82] [81, 70, 65, 82]
      I
        S [90, 124, 79, 145] [129, 111, 89]
  W [143] []
    O
      R
        D [152, 231, 129, 87] [82, 123, 159]
    I
      T [90, 113, 93] [78, 92]
Figure 2: Straight tree model
It is also worth noting that word instances lying more than three standard deviations away were removed from both tree models to avoid excessive noise, as had been done in [16].
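The straight and inverted trees can be sketched with a small trie in which timing lists are attached to the node of a word's last letter. The class and method names below are hypothetical, and storing the timing lists unreversed in the inverted tree is an assumption; the timings are those shown in Figure 2.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    children: Dict[str, "Node"] = field(default_factory=dict)
    pr: List[List[int]] = field(default_factory=list)  # one PR list per instance
    rp: List[List[int]] = field(default_factory=list)  # one RP list per instance


class WordTree:
    def __init__(self, inverted: bool = False) -> None:
        self.root = Node()
        self.inverted = inverted

    def _path(self, word: str) -> str:
        # The inverted tree walks the word from its last letter to its first.
        return word[::-1] if self.inverted else word

    def add(self, word: str, pr: List[int], rp: List[int]) -> None:
        node = self.root
        for letter in self._path(word):
            node = node.children.setdefault(letter, Node())
        # Timings live on the node reached by the final letter of the path.
        node.pr.append(pr)
        node.rp.append(rp)

    def find(self, word: str) -> Optional[Node]:
        node = self.root
        for letter in self._path(word):
            node = node.children.get(letter)
            if node is None:
                return None
        return node


straight = WordTree()
straight.add("THE", [45, 94, 83], [67, 89])
straight.add("THERE", [98, 92, 122, 88, 82], [81, 70, 65, 82])
straight.add("ALL", [80, 117, 124], [67, 122])
straight.add("ALL", [86, 112, 120], [60, 118])

inverted = WordTree(inverted=True)
inverted.add("THIS", [90, 124, 79, 145], [129, 111, 89])
```

In this sketch `straight.find("ALL")` holds two instances, `straight.find("THE")` is a non-leaf node with timings, and `straight.find("TH")` is a node without any timing information.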
3.4 Session evaluation
Once the tree model had been built, it was possible to compare new sessions against it and try to establish the author of a given session. The process consisted in searching for every word of the new session in the tree model and calculating the distance between the origin word and the word found in the model.
For this study the Chebyshev distance measurement was used. Other distance measurements were also evaluated, but this one was chosen because it performed best. To obtain the distance between a word and a model, an origin vector and a target vector were needed. The origin vector was the list of interval times from the word being searched and the target vector was the one obtained from the information stored in the tree model. If a word in the tree model had more than one instance, the mean vector of all recorded instances would be used.
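As a small sketch of this comparison, the target vector can be built as the element-wise mean of the stored instances and then compared with the origin vector. The function names are hypothetical; the two stored instances are the PR lists of ALL from Figure 2, and the origin timings are invented for illustration.

```python
from typing import List, Sequence


def mean_vector(instances: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of all recorded instances of a word."""
    count = len(instances)
    return [sum(column) / count for column in zip(*instances)]


def chebyshev(x: Sequence[float], y: Sequence[float]) -> float:
    """Largest absolute difference between paired components."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return max(abs(a - b) for a, b in zip(x, y))


target = mean_vector([[80, 117, 124], [86, 112, 120]])  # stored PR instances of ALL
origin = [90, 110, 130]                                  # hypothetical new sample
distance = chebyshev(origin, target)
print(target, distance)  # [83.0, 114.5, 122.0] 8.0
```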
When searching for words in the tree model, one of the following situations would be encountered:
• The word was not found in the model. It would simply be discarded.
• The full word was found in the model and the last letter was that of a leaf node. The distance could be obtained immediately.
• The word was partially found, but the node in which the last letter of the origin word was found did not have timing information because this was the first time the user had typed this particular whole word. Partial timings from the leaves below the node of the last letter could be determined and used to find the distance between these partial sub-words.
• The word was partially found in the model, but there were still letters from the origin word left to be found. Previously, the user had only entered shorter words with the same root letters. In this case, only the timings of the partial sub-word found would be used. The partial origin sub-word not found in the tree model still contained data, though. How this data was to be used, using recursion, is one of the parameters studied in this research. Three different options have been studied:
  - R0: Search for the partial sub-word again in the model as if it were a new word. If it is not found, drop the first letter and repeat the process until all letters have been used or a sub-word is found. This method uses the highest level of recursion and is also the most exhaustive.
  - R1: Search for the partial sub-word again as if it were a new word and discard it if it is not found. Only partial recursion is used.
  - R2: Discard the sub-word. No recursion is used.
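The three recursion options can be illustrated with a deliberately simplified sketch. It is one possible reading of the description, not the study's implementation: the tree is represented only by the set of its paths (prefixes), a fragment counts as "found" when at least `MIN_LEN` letters match, and after a sub-word is found the search continues with whatever letters remain. All names are hypothetical.

```python
from typing import List, Set

MIN_LEN = 2  # assumption: shorter fragments carry no RP interval and are ignored


def build_prefixes(words: List[str]) -> Set[str]:
    """All paths of a straight tree holding the given words."""
    return {word[:i] for word in words for i in range(1, len(word) + 1)}


def longest_match(fragment: str, prefixes: Set[str]) -> int:
    """Depth reached when walking the fragment down the tree."""
    depth = 0
    while depth < len(fragment) and fragment[:depth + 1] in prefixes:
        depth += 1
    return depth


def matched_subwords(word: str, prefixes: Set[str], mode: str) -> List[str]:
    """Sub-words that would contribute timings under R0, R1 or R2."""
    depth = longest_match(word, prefixes)
    if depth < MIN_LEN:
        return []                    # word not in the model: discarded
    found = [word[:depth]]
    rest = word[depth:]
    if mode == "R2":
        return found                 # no recursion: the leftover is dropped
    while len(rest) >= MIN_LEN:
        depth = longest_match(rest, prefixes)
        if depth >= MIN_LEN:
            found.append(rest[:depth])
            rest = rest[depth:]
        elif mode == "R0":
            rest = rest[1:]          # exhaustive: drop the first letter, retry
        else:
            break                    # R1: discard the leftover on a miss
    return found


prefixes = build_prefixes(["THE", "RE", "ALL"])
for mode in ("R0", "R1", "R2"):
    print(mode, matched_subwords("THEOREM", prefixes, mode))
```

With this toy model, THEOREM yields THE and RE under R0 but only THE under R1 and R2, because the leftover OREM is not found until its first letter is dropped.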
3.5 Studied parameters related to context
The following parameters have been studied in order to see their effect when evaluating context data:
• Length of words. This parameter analyzes whether all word lengths in the tree model are equally relevant. This is of interest not only in terms of performance and model optimization, but also in order to determine whether users have a natural tendency to be more consistent in their typing for a limited number of keystrokes. In this study the following values were tried: unlimited number of letters (≥ 2); greater than 2 (> 2); between 2 and 5 ([2 5]); and between 3 and 7 ([3 7]). One-letter words were discarded.
• Recursion when searching partial sub-words. This parameter analyzes the effect of the different types of recursion described in Section 3.4 when searching partial sub-words.
• Number of words found when searching the model. A recurrent problem appeared when the number of words in a session was too low. It could well happen that a user had only accessed the forum to contribute a few words. Also, abnormally small models could lead to incorrect identification because the user’s template did not have enough information. This parameter tries to mitigate this problem by establishing a minimum number of words either in the session being analyzed or in the model. In this study a threshold value of 50 words found was established, based on the results of other studies and on incremental tests performed on the available data.
3.6 Determining the owner of a session
The Chebyshev distance between two vectors $\vec{X}$ and $\vec{Y}$ is defined by the following equation:

$$D_{CH}(\vec{X}, \vec{Y}) = \max_{i=1}^{n} |X_i - Y_i| \qquad (1)$$

Each Session $S$ has $W$ words. Each Word $W_i$ is a vector of values $\vec{X}$. This vector may include a combination of the dwell times and/or the fly times from the recorded timing intervals, depending on the feature $F$ that is analyzed. $F$ can be one of the following: PR (Press–Release), RP (Release–Press), PP (Press–Press), and RR (Release–Release).
The Word $W_i$ searched in the Model $M$ belonging to User $U$ produces another vector $\vec{Y}$. From these two vectors, $\vec{X}$ and $\vec{Y}$, the distance $D_{CH}$ can be determined:

$$\forall W_i \in S,\quad D_i(W_i, M_U) = D_{CH}(\vec{X}_i, \vec{Y}_i) \qquad (2)$$

From these distances two values are then calculated: the Mean $md$ and the Weighted Mean $wmd$ for all Features. The $md$ and the $wmd$ values make use of the Depth $d$ at which each $W_i$ is found. The Weighted Mean value is obtained using the following weights: all values up to 100 have a weight of 15; values between 100 and 200 have a weight of 5; and values between 200 and 500 have a weight of 1. Values over 500 are discarded. These weights were obtained empirically.

$$\forall F_j \in [PR, RP, PP, RR],\quad md(W_i) = \mathrm{Mean}\big(D_i(W_i, M_U)_{F_j}\big) / d \qquad (3)$$

$$\forall F_j \in [PR, RP, PP, RR],\quad wmd(W_i) = \mathrm{WeightedMean}\big(D_i(W_i, M_U)_{F_j}\big) / d \qquad (4)$$

At this point, there is an $md(W_i)$ and a $wmd(W_i)$ value for every Word $W_i$ searched in the model $M$. The final global distance $gd$ between a Session $S$ and the Model $M$ is composed of four values ($gd_m$, $gd_{med}$, $gd_{wm}$, $gd_{wmed}$) calculated using the following method:

$$\forall md(W_i) \in S,\quad gd_m = \mathrm{Mean}(md(W_i)),\quad gd_{med} = \mathrm{Median}(md(W_i)) \qquad (5)$$

$$\forall wmd(W_i) \in S,\quad gd_{wm} = \mathrm{Mean}(wmd(W_i)),\quad gd_{wmed} = \mathrm{Median}(wmd(W_i)) \qquad (6)$$
As an example of the proposed method, Table 1 shows the results obtained after calculating the Chebyshev distance between the words of an origin session and the users’ tree models. Five different users are shown in this example (column Test). For each user, four Words have been compared (here, sun, there, and moon) between the origin session and the tree model. The distance values for the four features used are shown (PR, RP, PP, and RR). The column Real identifies the real owner of the session. The Depth column shows the number of letters that were found in the tree model. If the origin word had only been found partially, this value would show the depth at which the last letter had been found. Finally, columns $md$ and $wmd$ show the calculated Mean and Weighted Mean values for each word.
For the first row of user 3207 the Mean value would be: (69 + 144 + 176 + 99)/4 = 122. Similarly, the Weighted Mean value would be: (69 · 15 + 144 · 5 + 176 · 5 + 99 · 15)/40 = 103. These two values would then be divided by the depth at which the last letter of the word was found: $md$ = 122/4 = 30.50 and $wmd$ = 103/4 = 25.75.
Word    PR    RP    PP    RR    Depth   Test user   Real user   md      wmd
here    69    144   176   99    4       3207        192         30.50   25.75
sun     67    19    48    21    3       3207        192         12.92   12.92
there   56    135   145   93    5       3207        192         21.45   18.18
moon    88    33    66    30    4       3207        192         13.56   13.56
here    84    200   163   124   4       37          192         35.69   30.79
sun     71    16    58    74    3       37          192         18.25   18.25
there   72    187   145   110   5       37          192         25.70   21.93
moon    66    25    70    60    4       37          192         13.81   13.81
here    23    11    16    20    4       192         192         4.38    4.38
sun     15    15    14    23    3       192         192         5.58    5.58
there   34    20    13    18    5       192         192         4.25    4.25
moon    20    30    15    28    4       192         192         5.81    5.81
here    71    13    43    59    4       56          192         11.63   11.63
sun     48    31    24    17    3       56          192         10.00   10.00
there   80    22    55    48    5       56          192         10.25   10.25
moon    56    40    40    25    4       56          192         10.06   10.06
here    60    120   155   140   4       78          192         29.69   24.79
sun     30    15    10    45    3       78          192         8.33    8.33
there   52    112   163   132   5       78          192         22.95   18.77
moon    33    5     3     38    4       78          192         4.94    4.94
Table 1: Distances after comparing a session against 5 different models
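The per-word Mean and Weighted Mean can be reproduced with a few lines of code. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the weight boundaries are treated as inclusive upper bounds (which reproduces the 30.79 of the row containing a distance of exactly 200), and values over 500 are dropped only from the weighted mean.

```python
from typing import Dict, Optional

FEATURES = ("PR", "RP", "PP", "RR")


def weight(distance: float) -> int:
    """Empirical weights quoted in the text (upper bounds assumed inclusive)."""
    if distance <= 100:
        return 15
    if distance <= 200:
        return 5
    if distance <= 500:
        return 1
    return 0  # values over 500 are discarded


def md(distances: Dict[str, float], depth: int) -> float:
    """Mean of the four feature distances divided by the depth."""
    return sum(distances[f] for f in FEATURES) / len(FEATURES) / depth


def wmd(distances: Dict[str, float], depth: int) -> Optional[float]:
    """Weighted mean of the four feature distances divided by the depth."""
    weights = {f: weight(distances[f]) for f in FEATURES}
    total = sum(weights.values())
    if total == 0:
        return None  # every value was over 500
    weighted = sum(distances[f] * weights[f] for f in FEATURES) / total
    return weighted / depth


here_3207 = {"PR": 69, "RP": 144, "PP": 176, "RR": 99}
print(md(here_3207, 4), wmd(here_3207, 4))  # 30.5 25.75
```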
From each of these $md$ and $wmd$ values, and for each user $U$, the final four values $gd_m$, $gd_{med}$, $gd_{wm}$, $gd_{wmed}$ are then calculated. Table 2 shows these final values for the proposed example. Again, as an example, for user 3207, $gd_m$ = (30.50 + 12.92 + 21.45 + 13.56)/4 = 19.61.
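The four global distances are simply the mean and the median of the per-word values. A minimal sketch using the standard `statistics` module and the per-word values of user 3207 from Table 1 (the function name and dictionary keys are hypothetical):

```python
from statistics import mean, median
from typing import Dict, List


def global_distances(md_values: List[float], wmd_values: List[float]) -> Dict[str, float]:
    """Mean and median of the per-word md and wmd values of one session."""
    return {
        "gd_m": mean(md_values),
        "gd_med": median(md_values),
        "gd_wm": mean(wmd_values),
        "gd_wmed": median(wmd_values),
    }


gd_3207 = global_distances(
    md_values=[30.50, 12.92, 21.45, 13.56],
    wmd_values=[25.75, 12.92, 18.18, 13.56],
)
print({name: round(value, 2) for name, value in gd_3207.items()})
```

With an even number of words, the median is the average of the two middle values.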
3.7 Fusion using a voting method
In Table 2, the Votes column shows, for each user, the number of $gd$ values for which that user obtained the minimum among all users. It was observed that when evaluating sessions using these four $gd$ values, there would be some incorrectly identified sessions, but most of the time these errors would not be reported by the four $gd$ values at the same time. It was decided to use a fusion method, based on a voting scheme, to try to improve the global rate of identification. A session would be determined as owned by a particular user by selecting the one that had the majority of minimum $gd$ values. In the example in Table 2, user 192 obtained 4 votes and is thus determined to be the owner of the session.

Test user   Real user   gd_m    gd_wm   gd_med   gd_wmed   Votes
3207        192         19.61   17.60   17.51    15.87     0
37          192         23.36   21.20   21.98    20.09     0
192         192         5.01    5.01    4.98     4.98      4
56          192         10.48   10.48   10.16    10.16     0
78          192         16.48   14.21   15.64    13.55     0
Table 2: Final values for the proposed method
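The voting step can be sketched as follows, using the rows of Table 2 as plain lists of four global distances. The function name is hypothetical, and tie handling is not described in the text: this sketch simply keeps the first user encountered and picks the user with the most votes.

```python
from collections import Counter
from typing import Dict, List, Tuple


def vote(gd_by_user: Dict[int, List[float]]) -> Tuple[int, Counter]:
    """Give one vote per gd measure to the user with the smallest value."""
    votes: Counter = Counter()
    n_measures = len(next(iter(gd_by_user.values())))
    for column in range(n_measures):
        best = min(gd_by_user, key=lambda user: gd_by_user[user][column])
        votes[best] += 1
    winner, _count = votes.most_common(1)[0]
    return winner, votes


table2 = {
    3207: [19.61, 17.60, 17.51, 15.87],
    37: [23.36, 21.20, 21.98, 20.09],
    192: [5.01, 5.01, 4.98, 4.98],
    56: [10.48, 10.48, 10.16, 10.16],
    78: [16.48, 14.21, 15.64, 13.55],
}
winner, votes = vote(table2)
print(winner, votes[winner])  # 192 4
```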
3.8 Generalization
Thirty different randomly chosen test sets of 40 users from the available pool of 60 users were used to test the proposed method. Sessions were partitioned so that 30% were used for testing and 70% for building the models.
The process to evaluate the sessions was that of a typical data mining study. Each session would be compared to all models. This process was repeated for every session of every user. The percentage of correctly identified sessions would then be determined. For the best result, the mean FAR and FRR values are also shown, as well as the Wilson confidence interval at 95%.
As a comparison with a methodology that does not use context data, the experiment that had given the best results in this study was repeated using only trigraphs.
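The sampling scheme can be sketched with the standard `random` module. This is only an illustration: the text does not say how the sessions were assigned to each partition, so the random shuffle, the seeds and the function names below are assumptions.

```python
import random
from typing import List, Sequence, Tuple


def make_test_sets(users: Sequence[int], n_sets: int = 30,
                   set_size: int = 40, seed: int = 0) -> List[List[int]]:
    """Draw n_sets random subsets of set_size users from the pool."""
    rng = random.Random(seed)
    return [rng.sample(list(users), set_size) for _ in range(n_sets)]


def split_sessions(sessions: Sequence[str], test_fraction: float = 0.3,
                   seed: int = 0) -> Tuple[List[str], List[str]]:
    """Return (model_sessions, test_sessions) with a 70/30 split by default."""
    rng = random.Random(seed)
    shuffled = list(sessions)
    rng.shuffle(shuffled)  # assumption: sessions are assigned at random
    cut = round(len(shuffled) * test_fraction)
    return shuffled[cut:], shuffled[:cut]


test_sets = make_test_sets(range(60))
model_part, test_part = split_sessions([f"session-{i}" for i in range(10)])
print(len(test_sets), len(test_sets[0]), len(model_part), len(test_part))  # 30 40 7 3
```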
4 Results
The results presented in this section show the effect of the analyzed parameters related to context (length of word, recursion method, and minimum word count found per session). Table 3 also shows the mean value of the percentage of correctly identified sessions when each of the $gd$ values and the Voting system were used.
The best value in Table 3 is 98.74% correctly identified sessions, with a Wilson binomial confidence interval, at 95%, of [0.77, 3.52]. With a mean value of 377 sessions compared against the models, the FRR was 0.0126 and the FAR was 0.0002.
This result was obtained using all word lengths. Throughout the table, it can be seen that discarding larger word lengths does not improve the results. On the other hand, if optimization and computing performance are of great concern, the difference in the number of correctly identified sessions between using all word lengths and using only the [2 5] interval, for example, is marginal.
The most important parameter is clearly the minimum number of words found in the model. When this is set to 50 words, the results improve vastly. The disadvantage of setting this parameter is that fewer sessions are evaluated.
As for the recursion parameter, it is interesting to see that when there is no lower limit on word count, using all available information tends to be somewhat better, at the cost of having to evaluate more than twice the information. In contrast, when sessions are of better quality and 50 words are mandatory, this behavior is inverted, which shows the importance of contextual information. It is worth noting that using no recursion improves not only the rate of correctly identified sessions but also the computation speed. It seems that having a large number of events is not always the best way to build a concise and rich model.
As a comparison with previously studied methods, the test was repeated against templates built using only trigraphs, without considering context features or recursion methods. The quantity of information available using this method was much higher (up to twice as much) than the data available for the context and recursion tests. Using this method, though, the effectiveness of the system decreased to 84%. The proposed method benefits from the fact that having less information of much better quality greatly improves the results.
                      Word count: no lower limit        Word count: >50
Method    Recursion   ≥2      >2      [2 5]   [3 7]     ≥2      >2      [2 5]   [3 7]
gd_m      R0          84.95   84.78   81.88   83.07     96.65   96.56   95.10   95.36
gd_m      R1          84.86   84.73   81.78   83.06     96.67   96.49   95.11   95.30
gd_m      R2          83.98   83.43   80.26   81.59     97.48   96.60   96.00   95.57
gd_med    R0          85.81   84.97   82.79   83.45     97.78   96.95   96.84   96.11
gd_med    R1          85.70   84.93   82.73   83.38     97.70   96.90   96.85   96.10
gd_med    R2          84.78   83.73   81.31   82.07     98.28   96.93   97.18   96.22
gd_wm     R0          87.24   87.35   84.80   86.12     97.75   97.97   96.99   97.18
gd_wm     R1          87.13   87.21   84.67   86.01     97.74   97.93   96.94   97.13
gd_wm     R2          86.14   86.17   83.23   84.87     98.10   98.10   97.47   97.42
gd_wmed   R0          86.20   85.72   83.17   84.33     97.79   97.33   96.96   96.55
gd_wmed   R1          86.08   85.69   83.08   84.29     97.83   97.30   96.98   96.55
gd_wmed   R2          85.02   84.68   81.87   83.12     98.18   97.71   97.41   96.97
Voting    R0          88.81   88.29   86.61   87.18     98.43   98.32   97.88   97.71
Voting    R1          88.72   88.23   86.53   87.14     98.43   98.29   97.90   97.67
Voting    R2          87.90   87.29   85.22   86.08     98.74   98.44   98.18   97.95
Recursion: R0: Exhaustive recursion; R1: Partial recursion; R2: No recursion
Table 3: Results by features and methods
5 Conclusions
The aim of this study was to find out if using Keystroke Dynamics and context data, as opposed to other
well-known techniques, was an effective method when trying to identify users. A new data structure,
based on logical trees of words, has been proposed. From the results obtained the following conclusions
can be derived:
• The most important outcome is the validity of context data as an identification feature. It has been shown, in a highly hostile, real-life environment, that using only simple statistical techniques offers a very good rate of accuracy, comparable to, if not better than, previous studies in similarly harsh environments.
• The results obtained when using combined tree models show that context is a very important feature. This result is highly relevant for future research based on contextual information.
• The best results were obtained using all available word lengths.
• The best recursion method is to use no recursion at all, but only when sessions and models are of a certain quality. This is of paramount importance: it confirms the importance of the position of the letters and that not all information in a word should be treated equally. It is better to have less information of better quality than a large amount of poor information.
• When there is a minimum number of words found in the model, as opposed to accepting sessions of any size to be compared against the models, the results are far better. This is in line with what other studies have also stated.
• The fusion method based on the proposed voting scheme always improves the results when compared to the individual $gd$ values. Among these, the Weighted Mean and the Median statistics tend to be the ones that perform better.
6 Future work
Some lines of future work can also be put forward here. Below are some ideas to continue with the research
line started in this study:
• Study whether other factors such as age, gender or time of day of submission are relevant when it comes to identifying users. Since users of all ages are available, and other metadata is also available, segmentation could be tried.
• Study other distance measurements and evaluate whether there are significant differences when choosing one over another.
• Search for other features, methods, and strategies to increase the percentage of correctly identified sessions without having to sacrifice poor or shorter sessions.
• To improve the performance of the system, and seeing that in most cases choosing one parameter over another gives little improvement in the results, some restrictions could be set when building the tree model. For example: limit the length of words and/or avoid recursion when searching. In this study the optimized tests could be up to 5 times faster when taking these considerations into account.
• Analyze whether the studied parameters are valid for all users in the same way or whether some users are more susceptible to some parameters.
References
[1] CENELEC. European Standard EN 50133-1: Alarm systems. Access control systems for use in
security applications. Part 1: System requirements. 2002.
[2] Bours, Patrick. “Continuous keystroke dynamics: A different perspective towards biometric
evaluation”. In: Information Security Technical Report 17.1 (2012), pp. 36–43.
[3] Sim, Terence, Zhang, Sheng, Janakiraman, Rajkumar, and Kumar, Sandeep. “Continuous verification
using multimodal biometrics”. In: Pattern Analysis and Machine Intelligence, IEEE Transactions
on 29.4 (2007), pp. 687–700.
[4] Monrose, Fabian and Rubin, Aviel D. “Authentication via keystroke dynamics”. In: Proceedings of
the 4th ACM conference on Computer and communications security. ACM. 1997, pp. 48–56.
[5] Alsultan, Arwa and Warwick, Kevin. “Keystroke Dynamics Authentication: A Survey of Free-text
Methods”. In: International Journal of Computer Science Issues 10.4 (2013).
[6] Giot, Romain, Dorizzi, Bernadette, and Rosenberger, Christophe. “A review on the public benchmark
databases for static keystroke dynamics”. In: Computers & Security 55 (2015), pp. 46–61.
[7] Gunetti, Daniele and Picardi, Claudia. “Keystroke Analysis of Free Text”. In: ACM Transactions on
Information and System Security 8.3 (2005), pp. 312–347. issn: 1094-9224.
[8] Hu, Jiankun, Gingrich, Don, and Sentosa, Andy. “A k-nearest neighbor approach for user
authentication through biometric keystroke dynamics”. In: Communications, 2008. ICC’08. IEEE
International Conference on. IEEE. 2008, pp. 1556–1560.
[9] Davoudi, Homa and Kabir, Ehsanollah. “A new distance measure for free text keystroke
authentication”. In: Computer Conference, 2009. CSICC 2009. 14th International CSI. IEEE. 2009,
pp. 570–575.
[10] Villani, Mary et al. “Keystroke biometric recognition studies on long-text input under ideal and
application-oriented conditions”. In: Computer Vision and Pattern Recognition Workshop, 2006.
CVPRW’06. Conference on. IEEE. 2006, pp. 39–39.
[11] Sim, Terence and Janakiraman, Rajkumar. “Are digraphs good for free-text keystroke dynamics?”
In: Computer Vision and Pattern Recognition, 2007. CVPR’07. IEEE Conference on. IEEE. 2007,
pp. 1–6.
[12] Messerman, Arik, Mustafic, Tarik, Camtepe, Seyit Ahmet, and Albayrak, Sahin. “Continuous and
non-intrusive identity verification in real-time environments based on free-text keystroke dynamics”.
In: Biometrics (IJCB), 2011 International Joint Conference on. IEEE. 2011, pp. 1–8.
[13] Curtin, Mary et al. “Keystroke biometric recognition on long-text input: A feasibility study”. In:
Proc. Int. MultiConf. Engineers & Computer Scientists (IMECS) (2006).
[14] Brizan, David Guy et al. “Utilizing linguistically-enhanced keystroke dynamics to predict typist
cognition and demographics”. In: International Journal of Human-Computer Studies (2015).
[15] Morales, Aythami, Fierrez, Julian, Vera-Rodriguez, Ruben, and Ortega-Garcia, Javier. “Autenticación
Web de Estudiantes Mediante Reconocimiento Biométrico”. In: III Congreso Internacional sobre
Aprendizaje, Innovación y Competitividad. 2016.
[16] Dorca Josa, Aleix, Santamaría Pérez, Eugènia, and Morán Moreno, Jose Antonio. “Identificación de
usuarios mediante dinámica de tecleo en entornos de entrada libre usando información de contexto”.
In: XXXI Simposium Nacional de la Unión Científica Internacional de Radio (URSI, 2016). 2016.
[17] Giot, Romain, Hemery, Baptiste, and Rosenberger, Christophe. “Low cost and usable multimodal
biometric system based on keystroke dynamics and 2d face recognition”. In: Pattern Recognition
(ICPR), 2010 20th International Conference on. IEEE. 2010, pp. 1128–1131.
[18] Montalvao Filho, Jugurta R. and Freire, Eduardo O. “Multimodal biometric fusion-joint typist
(keystroke) and speaker verification”. In: Telecommunications symposium, 2006 international. IEEE.
2006, pp. 609–614.