Silesian University of Technology
Faculty of Automatic Control, Electronics and Computer Science
Institute of Computer Science
Human identification using eye movements
Doctoral thesis
Paweł Kasprowski
Supervisor:
prof. dr hab. inż. Józef Ober
Gliwice, 2004
Eyes are windows of our soul
William Shakespeare
TABLE OF CONTENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF ABBREVIATIONS
1 INTRODUCTION
1.1 Biometric identification
1.2 Eye movements in biometric identification
1.3 Thesis of the dissertation
2 BIOMETRIC IDENTIFICATION ISSUES
2.1 Classification of biometric identification methods
2.2 Evaluating efficiency of biometric identification
2.3 Overview of biometric identification methods
2.3.1 Fingerprint verification
2.3.2 Face recognition
2.3.3 Iris recognition
2.3.4 Behavioral techniques
2.3.5 Multimodal systems
2.4 Summary
3 EYE MOVEMENTS
3.1 Physiology of eye movements
3.2 Previous research on eye movements
3.3 Eye movement tracking methodologies
3.4 The OBER2 system
4 EYE MOVEMENT REGISTERING EXPERIMENT
4.1 Possible testing strategies
4.2 Possible stimulations
4.3 Jumping point stimulation
4.4 The learning effect
4.5 Methodology used during the experiment
4.6 Storing results of the experiment
5 ENTRY PROCESSING OF COLLECTED DATA
5.1 Sample calibration
5.2 Sample normalization
5.2.1 Finding fixations
5.2.2 Pairing fixations with required fixation locations
5.2.3 Recalculating all values into a new range
5.3 Calculating different eye movement properties
5.3.1 Average velocity direction calculation
5.3.2 Eye distance
5.3.3 Distance to stimulation
5.3.4 Discrete Fourier Transform
5.3.5 Wavelet Transform
5.4 Conclusions
6 MINIMIZATION OF ATTRIBUTE VECTORS
6.1 Relevancy estimation
6.2 Linear conversions
6.2.1 Principal Component Analysis
6.2.2 Other techniques
6.3 Conclusions
7 CLASSIFICATION METHODS
7.1 K Nearest Neighbors
7.2 Template – threshold
7.3 Naïve Bayes
7.4 C45 Decision Trees
7.5 Support Vector Machines
7.6 Ensemble classifiers
7.6.1 Bagging
7.6.2 Boosting
7.6.3 Using different classifiers and data representations
7.7 Cross-validation
8 EXPERIMENT
8.1 Data preparation
8.1.1 Data gathering
8.1.2 Entry processing – dataset preparation
8.2 Performing classification tests
8.2.1 Dividing a dataset into train-set and test-set
8.2.2 Minimizing dataset
8.2.3 Classification
8.3 Verification of the results
8.3.1 Analyzing errors for the datasets
8.3.2 Analyzing errors of the classification algorithms
8.3.3 Voting classifiers
8.4 Conclusions – performance considerations
9 RESULTS
9.1 Multiple trials estimation
9.2 Problem of overfitting
9.3 Conclusions
10 LITERATURE
APPENDIX. SOFTWARE TOOLS
EyeLogin – data acquiring
EyeLoader – creating a dataset of samples
EyeAnalyser – maintaining datasets
EyeDataset
EyeVisualizer
EyeConverter
EyeClassifier
EyeResults
EyeStat – analyses of the results
External packages
LIST OF FIGURES
Fig. 2.1 Example of the ROC curve
Fig. 2.2 The image of the fingerprint and the same fingerprint with detected minutiae
Fig. 2.3 Examples of eigenfaces
Fig. 2.4 Image of the human iris
Fig. 2.5 Image with detected iris and corresponding IrisCode
Fig. 3.1 Image of the retina
Fig. 3.2 Six oculomotor system muscles
Fig. 3.3 Subject ready for electro-oculography eye movement measuring experiment
Fig. 3.4 An example of contact lens coil
Fig. 3.5 Head mounted video-based eyetracker
Fig. 3.6 Video based eye tracker with camera combined with the TFT display
Fig. 3.7 OBER2 system operation principle
Fig. 3.8 Goggles to be worn during the experiment
Fig. 3.9 Graph illustration of the OBER2 measuring process
Fig. 3.10 The laboratory version of the OBER2 system
Fig. 4.1 Schema of the system registering only eye movements
Fig. 4.2 Schema of the system registering both eye movements and the observed image
Fig. 4.3 Schema of the system registering eye movements as the answer to the stimulation
Fig. 4.4 Hierarchy of possible stimulations
Fig. 4.5 Scanpaths of eyes looking at the static image
Fig. 4.6 Points matrix for stimulation
Fig. 4.7 Typical eye movement reaction for point position change in one axis
Fig. 4.8 Visual description of stimulation (a-l)
Fig. 4.9 Results of a single test
Fig. 4.10 Result of one test stored in a text file (EyeTestFile format)
Fig. 5.1 An example of a sample consisting of six independent parts (signals)
Fig. 5.2 Information needed for proper calibration of eye movement signal
Fig. 5.3 Example of badly acquired sample
Fig. 5.4 Two graphs presenting left eye horizontal reaction
Fig. 5.5 Signal (A) from Fig. 5.4 with detected fixations
Fig. 5.6 Signal (A) and its averaged levels
Fig. 5.7 Signal (B) with detected fixations
Fig. 5.8 Signal (B) and its averaged conversion with fixation assigned to the wrong level
Fig. 5.9 The same signal (B) as on Fig. 5.8 but the upper level has been rejected
Fig. 5.10 Signals (A) and (B) presented on Fig. 5.4 after normalization
Fig. 5.11 A sample contains a source information for producing different vectors of attributes
Fig. 5.12 Example of velocity vector calculated for left eye (using LX and LY signals)
Fig. 5.13 Average velocities of left eye in 16 different directions
Fig. 5.14 Radar graphs of average velocities of left eye in 16 different directions
Fig. 5.15 Absolute distance between eyes’ gaze-points in the following moments of time
Fig. 5.16 Difference of the LX eye signal from the required fixation location
Fig. 5.17 Comparison of the normalized signals (A) and (B) presented above on Fig. 5.10
Fig. 5.18 Fourier spectra of signals (A) and (B) presented on Fig. 5.10
Fig. 5.19 Discrete wavelet transform of LX signal (using Daub4 mother wavelet)
Fig. 6.1 Data conversions schema
Fig. 7.1 Architecture of classification process
Fig. 7.2 Example of genuine-impostor diagram
Fig. 7.3 Example of FAR and FRR in the function of the threshold distance value
Fig. 7.4 The idea of voting classifiers
Fig. 7.5 The idea of validation using train-set and test-set
Fig. 8.1 Three phases of the experiment
Fig. 8.2 Process of the dataset creation
Fig. 8.3 Partial PCA calculation
Fig. 8.4 Errors for six different dataset types
Fig. 8.5 Errors for different classification algorithms
Fig. 8.6 The voting algorithm is using results of all classifiers
Fig. 9.1 Errors for different persons
Fig. 9.2 Errors for different persons in two trial test
Fig. 0.1 Schema of data preparing procedure
Fig. 0.2 Structure of file in EyeTestFile format
Fig. 0.3 Structure of the file in EyeDatasetFile format
Fig. 0.4 The visual description of EyeAnalyser application functionality
Fig. 0.5 Structure of the file in EyeResultsFile format
LIST OF TABLES
Table 8.1. Symbols of prepared datasets and descriptions with references
Table 8.2. Symbols of applied conversions
Table 8.3. Symbols of used classification algorithms
Table 8.4. Average error rates for six different types of dataset
Table 8.5. Average error rates for eight different classification algorithms
Table 9.1. Error rates in authorization tests
Table 9.2. Simulated error rates in authorization tests in two independent trials
Table 9.3. Calculation of paired results
Table 9.4. Error rates in authorization test combined from two trials
LIST OF ABBREVIATIONS
CWT Continuous Wavelet Transform (5.3.5)
DFT Discrete Fourier Transform (5.3.4)
DWT Discrete Wavelet Transform (5.3.5)
EER Equal Error Rate (2.2)
EOG Electro-oculography (3.3)
FAR False Acceptance Rate (2.2)
FRR False Rejection Rate (2.2)
HMM Hidden Markov Model (5.2.1)
HTER Half Total Error Rate (8.3)
ICA Independent Component Analysis (6.2.2)
IROG Infrared-oculography (3.3)
KNN k Nearest Neighbors (7.1)
LDA Linear Discriminant Analysis (6.2.2)
PCA Principal Components Analysis (6.2.1)
RFL Required Fixation Location (5.2)
ROC Receiver Operating Characteristic (2.2)
SMO Sequential Minimal Optimization (7.5)
STFT Short Term Fourier Transform (5.3.5)
SVM Support Vector Machines (7.5)
VOG Video-oculography (3.3)
1 Introduction
Security is one of the most important problems of contemporary computer
science, and one of its most important branches is the identification of users.
Identification may be required for access control to buildings, rooms, devices or
information; in the case of computer systems it concerns access to software and data.
The basic aim of identification is to make it impossible for unauthorized persons to
access the specified resources.
There are generally three approaches to secure identification:
• Token methods (something you have),
• Memory methods (something you know),
• Biometric methods (something you are).
The token method has two significant drawbacks. Firstly, a token may be lost or
stolen. A person who finds or steals a token gains access to all the resources that the
rightful owner of the token could access, and there is no way to find out whether they
are the person they claim to be. Secondly, a token may be copied. The ease of making
a copy naturally differs between kinds of tokens, but it is always technically possible.
Memory-based methods identify people by checking their knowledge. The most
popular memory methods are, of course, the various kinds of passwords. The main
drawback of such methods is the unreliability of human memory. People may do their
best to remember a password, but they cannot guarantee that the information will not
be forgotten. As with the token method, when a malicious user knows a password, it is
impossible to check whether they are the person they claim to be.
The problems with token and memory-based methods are the main cause of the
increasing interest in identification methods based on a person's biometric properties.
1.1 Biometric identification
The terms "Biometrics" and "Biometry" have been used since the early 20th century
to refer to the field of development of statistical and mathematical methods applicable
to data analysis problems in the biological sciences [88]. Biometric techniques are
frequently used in medicine, agriculture or biology. Recently the emerging field of
technology devoted to identification of individuals by means of using biological traits
10
Introduction 1.2. Eye movements in biometric identification
(e.g. biometric methods) resulted in the common narrowing the term ‘biometrics’ to
refer only to that kind of researches. Therefore, to avoid misunderstanding, the term
‘biometric identification’ will be used in most cases in this dissertation. However when
the term is shortened to ‘biometrics’ it is used only in its narrower meaning.
Biometric identification exploits the fact that measurements of biological properties
often give different results for different people. As some measurements are very
similar across the whole or most of the population – for example body temperature or
pulse rate – biometric identification methods seek measurements which are
characteristic of a single human being only and therefore unique.
The main advantage of biometric identification is that it is generally more difficult
to forge than classic methods. Another interesting property of biometric identification
is that, contrary to classic methods, it enables so-called ‘negative identification’:
people not only can prove that they are who they claim to be, but they can also prove
that they are not who they claim not to be.
Classic identification methods are based on the assumption that people want to be
identified. However, in many applications the problem is not to grant access to a
specified user but to deny it. In such cases proper identification is against the user’s
interest and, for instance, password identification becomes ineffective.
Therefore biometrics attracts great interest from services aiming to find criminals
or terrorists.
1.2 Eye movements in biometric identification
Using the eyes to perform biometric human identification has a long tradition,
including well-established iris pattern recognition algorithms [17] and retina scanning.
However, the only papers concerning identification based on eye movement
characteristics known to the author of this dissertation are those written by him and his
supervisor [49][48][51][47][50]. This is somewhat surprising, because the method has
several important advantages.
Firstly, it combines physiological (muscle) and behavioral (brain) aspects. The most
popular biometric methods, like fingerprint verification or iris recognition, are based
mostly on physiological properties of the human body. Therefore, all that is needed for
proper identification is the “body” of the person to be identified. This makes it
possible to identify an unconscious or - in some methods - even a dead person.
Moreover, physiological properties may be forged. Preparing models of a finger or
even a retina (using special holograms) is technically possible. As eye movement based
identification uses information which is produced mostly by the brain (so far impossible
to imitate), forging this kind of information seems to be much more difficult.
Although it has not been studied in this work, it seems possible to perform covert
identification, i.e. identification of a person unaware of the process (for instance using
hidden cameras).
Last but not least, there are many easy-to-use eye-tracking devices available
nowadays, so performing identification by means of this technique is not very
expensive. For instance, the very fast and accurate OBER2 [71] eye tracking system
was used in the present work. It measures eye movements with very high precision
using infrared reflection, and its production costs are comparable to fingerprint
scanners.
1.3 Thesis of the dissertation
1. Eye movements may be used for human identification.
2. Biometric identification using eye movements is a valuable addition to other
existing biometric identification methods.
3. Measuring eye movements while the subject follows a jumping-point stimulation
gives information that may be used to perform identification.
4. The Principal Component Analysis technique is very useful for feature extraction
from the eye movement signal.
2 Biometric identification issues
As stated in the introduction, memory and token methods are implemented to
judge whether a specified user should have access to a specified resource. An exact
identification of a person is therefore not necessary, and indeed is not always
performed; it is possible that a group of people shares the same token or knows the
same password. Contrary to this, biometric identification methods start with the proper
identification of a person, and only then are the appropriate rights assigned. Thus the
main difference between classic methods and biometrics is that biometric properties
cannot be ‘borrowed’, so people cannot - as simply as by handing over a token or
telling a password - propagate their rights to others. This obviously increases the
security of the system, but it may sometimes cause problems.
The first stage of each biometric process is collecting a set of ‘samples’ from every
user who should be identified by the system. A sample is a set of biometric data
measured for a person in a single measurement; the biometric data may comprise
various kinds of psycho-physiological measurements. The next stage in most methods
is creating a ‘template’ for each user based on the previously collected samples. A
template is a kind of average of all samples collected for the user, and the process of
creating it is called the ‘enrolment’ of the user.
Having a template for each known user, it is possible to identify new
unclassified samples. There are two basic techniques:
• Identification.
• Authorization.
During the identification process, the system collects a sample and then tries to
match it with one of the stored templates. Commonly it computes, for each template,
the probability that the sample was collected from the corresponding user and chooses
the template with the highest probability. However, this method works only when we
are sure that the system’s database contains templates of all possible users. It must be
remembered that this method finds an identification for every sample, so persons
whose templates are not in the database would always be classified as one of the
previously enrolled persons and would obtain rights which they should not have.
The solution to this problem is to introduce an error threshold for each template: if
the sample being identified is not close enough to any template, the identification is
rejected. Assigning a proper threshold is not a simple problem. It may be fixed for all
templates or computed independently for each one, for instance on the basis of the
variance of the enrolled samples.
Another kind of test is the authorization test. In such a test, users are first explicitly
asked for their names or logins, and then the system measures a sample of their
biometric attributes. After that, the system evaluates the similarity of the sample to the
template of the specified person and accepts or rejects the authorization. Authorization
is obviously much more reliable than identification; furthermore, it is easier to provide
and generally faster to perform.
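To make the distinction concrete, the following minimal sketch (in Python, with illustrative data) shows schematic identification and authorization against stored templates, using Euclidean distance and a per-template threshold. All names and values here are assumptions of the example, not the procedure used later in this work.

    import math

    # Hypothetical enrolment data: one averaged template per user and a
    # per-user threshold (e.g. derived from the variance of enrolled samples).
    templates = {"alice": [0.12, 0.83, 0.44], "bob": [0.95, 0.10, 0.37]}
    thresholds = {"alice": 0.25, "bob": 0.30}

    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def identify(sample):
        # Return the closest enrolled user, or None if no template is close enough.
        login, best = min(((u, distance(sample, t)) for u, t in templates.items()),
                          key=lambda p: p[1])
        return login if best <= thresholds[login] else None

    def authorize(login, sample):
        # Check the sample against the template of the claimed user only.
        return distance(sample, templates[login]) <= thresholds[login]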
2.1 Classification of biometric identification methods
A biometric feature is a specific attribute of a person that can be measured with some
precision. There are many different biometric features that can be measured [95]. The
methods measure different parts of the body using different measurement devices, so
it is difficult to compare them directly. However, there are some properties which may
be evaluated for every measurement method. The most important are:
• Distinctiveness.
• Repeatability.
• Accessibility.
• Acceptability.
Distinctiveness shows how much the specific feature differs between people. For
instance, the iris pattern or the fingerprint are supposed to be very distinctive. On the
other hand, the shape of the palm or the color of the hair are not very distinctive.
Distinctiveness is very often considered the most important property of biometric
methods, but there are also other properties that determine whether a method is usable
in practice. One of them is the repeatability of the method. Generally speaking, it shows
how much the same feature may vary between different measurements of the same
person. For instance, a fingerprint may be easily damaged by chemicals or simple
injuries, and the shape of the face may be easily changed with a moustache or glasses.
On the other hand, it is rather difficult to change the shape of the palm or the iris pattern.
The property that is very important when considering the practical usage of a
biometric method is its accessibility. Questions that may be asked here are:
1) How fast is the process of collecting data from one user (a measurement)?
2) How quickly can the measurement be repeated?
3) How complicated is the measurement process?
4) Does the identified person need to be trained beforehand, and how difficult is the
training?
5) What is the availability of the devices performing the measurement (including
their prices)?
6) How much space is needed to store the template of one person?
7) How fast are the methods for evaluating a new measurement?
The answers to all these questions allow one to evaluate whether the method can be
used in a real environment.
Last but not least, the acceptability of the method should be mentioned. It may be
said that acceptability is accessibility from the users’ point of view. One of the main
problems is the intrusiveness of the method. Wayman [95] mentions a system based on
the resonance patterns of the human head, measured through microphones placed in the
users’ ear canals. Such a system is certainly very inconvenient for users and its
acceptability is rather low. On the other hand, face recognition systems that use
cameras are not invasive and may be considered acceptable for users.
2.2 Evaluating efficiency of biometric identification
Measurement of biological quantities is always imprecise to some degree and
therefore produces different values for the same measured quantity [88]. These errors
are an inherent part of every biometric method, and the main problem of this kind of
identification is to develop algorithms that deal adequately with such imprecise data.
Although many companies advertise their biometric identification products as
reliable and error-free, independent comparisons like the Fingerprint Verification
Competition [64][62][63] or the Face Recognition Vendor Test [29][25] show that even
the well-established fingerprint technology is not fully reliable.
There are two kinds of tests when considering an authorization (two-class) system:
• Genuine test – a sample is given with correct identification information
(login); in other words, ‘the identified person is telling the truth’. In this
case the rate of improper rejections may be measured. This measure is
called the False Rejection Rate (FRR) or False Non-Match Rate.
• Impostor test – a sample is given with an incorrect login; in other words,
‘the identified person is lying’. Here the rate of improper acceptances may
be measured. This measure is called the False Acceptance Rate (FAR) or
False Match Rate.
The two measures depend on each other: when the False Rejection Rate decreases,
the False Acceptance Rate increases, and vice versa. Therefore, to properly state the
efficiency of a biometric method, its results are often presented on a graph called the
ROC curve [95]. The acronym ROC stands for ‘Receiver Operating Characteristic’, a
term used in signal detection to characterize the tradeoff between hit rate and false
alarm rate over a noisy channel [21]. The ROC curve, with FRR on the X-axis and FAR
on the Y-axis, presents how these two rates depend on each other. Fig. 2.1 presents an
example of the ROC curve for some biometric system.
Fig. 2.1 Example of the ROC curve. Each point stands for one classifier
evaluation for which FRR and FAR have been obtained.
The possibility of plotting the ROC curve depends on the pattern matching
method used. When it is possible, one can calculate the point where the FRR and FAR
values are equal. The error value at this point is called the Equal Error Rate (EER) and
is often referred to in the literature.
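As an illustration of these measures, the sketch below computes FAR and FRR over a range of threshold values from hypothetical genuine and impostor similarity scores, and takes the threshold where the two rates are closest as an approximation of the EER. The score lists, and the convention that higher scores mean better matches, are assumptions of the example.

    # Hypothetical match scores (higher = more similar); in a real evaluation
    # these would come from genuine and impostor tests of the system.
    genuine = [0.91, 0.85, 0.78, 0.88, 0.60, 0.95]
    impostor = [0.30, 0.55, 0.42, 0.71, 0.25, 0.48]

    def far_frr(threshold):
        # FAR: impostors wrongly accepted; FRR: genuine users wrongly rejected.
        far = sum(s >= threshold for s in impostor) / len(impostor)
        frr = sum(s < threshold for s in genuine) / len(genuine)
        return far, frr

    # Sweep thresholds and take the point where FAR and FRR are closest (~ EER).
    points = [(t / 100, *far_frr(t / 100)) for t in range(0, 101)]
    t_eer, far, frr = min(points, key=lambda p: abs(p[1] - p[2]))
    print(t_eer, far, frr)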
2.3 Overview of biometric identification methods
Nearly every part of the human body has been used for identification. There are
well-established methods for measuring fingerprints, the iris, the retina, the face, the
palm, teeth, ears and even smell. There are also methods that measure human behavioral
patterns, such as the way of walking (gait), the shape of a signature, or a signature made
with a mouse. Some of them are briefly described in this chapter.
2.3.1 Fingerprint verification
Fingerprint verification is one of the oldest biometric techniques used for human
identification. The first uses of fingerprints instead of signatures were reported in the
19th century [42]. A milestone was the adoption of the Galton/Henry system of
identification by Scotland Yard in 1900. Since then, fingerprints have become one of
the most important features used in forensic prosecutions.
There are many easy-to-use and cheap fingerprint scanners, based on various
technologies including optical, capacitive, ultrasound, pressure, thermal and electric
field sensors. There is no longer any need for dactylograms (inked fingerprints). The
identification is based on the detection of ridges and the so-called minutiae – places
where ridges end or bifurcate.
The technology is widely accepted as very reliable. There is a common belief
(however never proved!) that fingerprints are unique in the whole human population.
That is why fingerprint evidence is even admissible in a court of law.
Fig. 2.2 The image of the fingerprint and the same fingerprint with detected minutiae.
However, there are several reasons why fingerprint verification has not
completely dominated biometric identification. Firstly, fingerprints are very sensitive
to physical damage and therefore not very robust. Secondly, many people have
chronically dry skin and cannot present clear prints. Fingerprints are supposed to be
very distinctive, but - what may be surprising - in competitions using real-world test
samples the error rates may sometimes exceed 2% [42][63]. Moreover, fingerprints
have a rather bad ‘reputation’, as they are commonly associated with criminal
investigations. Nevertheless, fingerprints are at present the most popular biometric
identification method.
2.3.2 Face recognition
Face recognition is one of the most promising techniques nowadays. The possibility
of covert identification of people unaware of it makes it suitable for - for instance -
searching for terrorists in crowded places. The first face recognition technologies were
the so-called geometry-based methods. They were based on recognizing specific
elements of the human face, like the nose or the eyes, and measuring their relative
positions and shapes. These methods were insensitive to variations in illumination and
viewpoint.
In 1990 Turk et al. [90] proposed a technique for extracting the most expressive
features from the face image. The technique, based on Principal Component Analysis
(see also section 6.2), creates a set of eigenfaces – images containing the most
meaningful parts of the source images. The technique has been widely accepted. More
recently, sophisticated techniques like Fisher Linear Discriminant Analysis [6] and
Independent Component Analysis [54] have also come into wide use. A new direction
of research is so-called 3D Face Recognition [60].
However, face recognition is still at a very early stage, with very limited usage in
the real world. Attempts to use it for - for instance - terrorist recognition have so far
failed [1].
Fig. 2.3 Examples of eigenfaces [52]
(reproduced with permission)
The fact that (contrary to fingerprints) people are able to recognize faces by
themselves, without any special equipment, encourages researchers to seek better
methodologies imitating human abilities. But a universal, ready-to-use face recognition
technology is still the ‘wave of the future’.
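To make the eigenfaces idea mentioned above concrete, the following sketch computes a set of eigenfaces from a small matrix of flattened face images using Principal Component Analysis. The toy data shape and the use of NumPy’s eigendecomposition are assumptions of the example, not a description of any particular system.

    import numpy as np

    # Hypothetical data: 20 grayscale face images, each flattened to 32*32 pixels.
    rng = np.random.default_rng(0)
    faces = rng.random((20, 32 * 32))

    mean_face = faces.mean(axis=0)
    centered = faces - mean_face               # PCA works on mean-centered data

    # With few images and many pixels, eigendecompose the small 20x20 Gram matrix
    # instead of the huge pixel covariance matrix (the classic eigenfaces trick).
    gram = centered @ centered.T
    eigvals, eigvecs = np.linalg.eigh(gram)    # eigenvalues in ascending order

    order = np.argsort(eigvals)[::-1][:10]     # the 10 strongest components
    eigenfaces = (centered.T @ eigvecs[:, order]).T
    eigenfaces /= np.linalg.norm(eigenfaces, axis=1, keepdims=True)

    # A face is then described by 10 weights instead of 1024 pixels.
    weights = centered @ eigenfaces.T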
2.3.3 Iris recognition
Iris recognition is dominated by John Daugman and his algorithm based on the
Gabor transformation [17]. The algorithm was patented and is now the property of
Iridian Technologies Inc. Although the iris is very small (about 11 mm in diameter), it
shows enormous pattern variability among different persons [18]. What is very
important, the iris is well protected from the environment and stable over the whole
lifetime.
Fig. 2.4 Image of the human iris.
In the algorithm proposed by Daugman, the result of a 2D Gabor wavelet applied to
the iris image is converted into a 2,048-bit vector, the so-called IrisCode. Two IrisCodes
may be compared very quickly using the simple Hamming distance and the XOR
operator. According to the reports presented by Daugman [18], the methodology is
absolutely error free. Therefore there is no doubt that iris recognition is the most
reliable biometric identification technique.
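The comparison step can be sketched as follows: two IrisCodes are XOR-ed and the fraction of differing bits gives the Hamming distance. The 2,048-bit length comes from the description above, while the random codes and the decision threshold are illustrative assumptions.

    import random

    BITS = 2048  # length of an IrisCode, as described above

    # Hypothetical IrisCodes represented as Python integers of 2048 bits each.
    code_a = random.getrandbits(BITS)
    code_b = random.getrandbits(BITS)

    def hamming_distance(a, b):
        # Fraction of differing bits between two equally long bit vectors.
        return bin(a ^ b).count("1") / BITS

    # Illustrative decision rule: same eye if few enough bits differ.
    THRESHOLD = 0.32  # assumed value, not Daugman's actual operating point
    same_eye = hamming_distance(code_a, code_b) < THRESHOLD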
Fig. 2.5 Image with detected iris and corresponding IrisCode [18].
(reproduced with permission)
However, the main disadvantage of iris recognition is the difficulty of acquiring a
proper image. The iris is very small and partially occluded by the eyelids (which often
droop), the image is often obscured by eyelashes or lenses, and eyes tend to move very
fast.
That is the main reason why iris recognition has not dominated the biometric market,
despite having several important applications – such as registering all travelers arriving
in the United Arab Emirates or registering refugees in Afghanistan.
2.3.4 Behavioral techniques
The methods described above measure physiological properties of the human body.
The problem with such methods is that only a part of the human body is needed for
proper identification. Potential forgers may try to prepare models of parts of the human
body, like artificial fingers or contact lenses. Moreover, the person being identified may
be unconscious or – for some methods – even dead.
The problems described above have resulted in increased attention being paid to
methods that measure not only physiological properties but also behavioral patterns
[47]. Behavioral biometrics are based on measurements of data derived from an action.
One of the defining characteristics of a behavioral biometric is the incorporation of time
as a metric – the measured behavior has a beginning, a middle and an end.
Behavioral methods are obviously more difficult to forge, because it is difficult to
imitate somebody’s behavior. On the other hand, the analysis of information obtained
in such a dynamic measurement is more difficult than in the case of measuring –
presumably invariant – physiological properties. Therefore the error rates achieved by
behavioral methods are typically higher than those of physiological ones. The most
popular behavioral biometric techniques include:
• Speech recognition.
Speech recognition is a special area of interest of telecommunication companies
[14]. Its main advantage is the low cost of deployment: a microphone is the only
equipment needed. However, the voice may be easily imitated, disguised or
electronically transformed. Moreover, the voice of a person may change over time (for
instance, altered by a cold).
• Keystroke dynamics.
This method is based on measuring the dynamics of the sequence of keystrokes as
the user types on a keyboard. The idea behind keystroke dynamics has been around
since World War II [13]: it was well documented during the war that telegraph
operators on many U.S. ships could recognize the sending operator. Raw measurements
already available from a standard keyboard can be processed to determine dwell time
(the time a key is kept pressed) and flight time (the time it takes to move from one key
to another) [47]; a simple sketch of these two measurements follows this list. Further
properties may be measured using specially designed keypads [72].
As these applications mostly use standard keyboards (i.e. common standard input
devices), they are vulnerable to forgery.
• Dynamic signature verification.
Static signature identification uses only the image of the signature; the dynamic
variant also uses information about pen velocities recorded while the signature is made
on a special tablet [47]. To improve performance, signature verification may use a
specially designed pen registering the pen’s position and pressure [34], or even special
gloves registering the position of each finger [32]. There is also a patent-pending
methodology of identification based on a signature made with a mouse [24].
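As announced above, a minimal sketch of the two keystroke measurements follows. It derives dwell and flight times from a hypothetical list of timestamped key press/release events; the event format is an assumption of the example.

    # Hypothetical event log: (key, press_time_ms, release_time_ms) per keystroke,
    # in typing order - e.g. as captured while the user types a password.
    events = [("p", 0, 95), ("a", 180, 260), ("s", 340, 430), ("s", 510, 600)]

    # Dwell time: how long each key is held down.
    dwell = [release - press for _, press, release in events]

    # Flight time: gap between releasing one key and pressing the next.
    flight = [events[i + 1][1] - events[i][2] for i in range(len(events) - 1)]

    print(dwell)   # [95, 80, 90, 90]
    print(flight)  # [85, 80, 80]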
There are many other behavioral biometric identification methods, for instance gait
analysis [36], but most of them are in an early experimental phase. It must be mentioned
that every behavioral biometric identification method also has a physiological factor.
For instance, keystroke dynamics depends on the length of the fingers, and speech
recognition depends on the physiology of the human vocal tract.
2.3.5 Multimodal systems
There is no optimal biometric identification method. Therefore the technique of
combining several methods has become an area of interest for researchers [41][12][80].
Systems that utilize more than one physiological or behavioral characteristic for
identification are called multimodal biometric systems. The benefits of multimodal
systems are:
• Reducing false rejection and false acceptance rates.
• Providing a secondary means of identification if sufficient data cannot be
acquired.
• Combating attempts to spoof biometric systems through non-live data sources,
such as fake fingers.
The first benefit comes from the fact that combining the results of weak classifiers
may improve the overall performance. There are some obvious combinations of
different biometric methods, like finger and palm or face and voice. As single biometric
methods still have problems with performance and protection against forging,
multimodal systems seem likely to become more important in the future.
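As a sketch of the first benefit, the example below fuses the similarity scores of two hypothetical modalities with a weighted sum before applying a single decision threshold; the scores, weights and threshold are all assumptions of the example.

    # Hypothetical normalized match scores (0..1, higher = better match)
    # from two independent modalities for the same identification attempt.
    face_score = 0.62
    voice_score = 0.81

    # Weighted-sum score fusion: weights reflect the assumed reliability
    # of each modality and must sum to 1.
    W_FACE, W_VOICE = 0.6, 0.4
    fused = W_FACE * face_score + W_VOICE * voice_score  # 0.696

    THRESHOLD = 0.65  # assumed operating point for the sketch
    accepted = fused >= THRESHOLD
    print(fused, accepted)  # 0.696 True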
2.4 Summary
This chapter has briefly described the state-of-the-art methods in the field of
biometric identification. As can be seen, the problem is still far from being solved: there
are plenty of methods, but each of them has some drawbacks. On the other hand, there
is great interest in reliable human identification, so there is still a need for further
research. Developing new biometric technologies – like eye movement based
identification – may prove very useful, especially when combined with other methods
in multimodal systems.
3 Eye movements
Eyes are among the most important human organs. There is a common saying that
eyes are ‘windows to our soul’. In fact, the eyes are the main ‘interface’ between the
environment and the human brain. It is therefore not a surprise that the system dealing
with human vision is physiologically and neurologically complicated.
3.1 Physiology of eye movements
When an individual looks at an object, the image of the object is projected onto the
retina, which is composed of light-sensitive cells that convert light into signals that can
in turn be transmitted to the brain via the optic nerve. The density (or distribution) of
these light-sensitive cells on the retina is uneven, with denser clustering at the centre of
the retina than at the periphery. This clustering causes the acuity of vision to vary, with
the most detailed vision available when the object of interest falls on the centre of the
retina. This area is called the yellow spot or fovea and covers about two degrees of
visual angle. Outside this region visual acuity decreases rapidly. Eye movements are
made to reorient the eye so that the object of interest falls upon the fovea and the
highest level of detail can be extracted [16].
Fig. 3.1 Image of the retina. The dark region on the right is the fovea.
That is why it is possible to define a ‘gaze point’ – the exact point a person is looking
at in a given moment of time. When the eyes are looking at something for a period of
time, this state is called a fixation. During that time the brain analyzes the image that
is projected on the fovea. A typical fixation lasts for about 200-300 ms, but of course it
depends on the complexity of the observed image. After the fixation, the eyes move
rapidly to another gaze point – another fixation. This rapid movement is termed a
saccade. Saccades differ in amplitude, yet they are always very fast.
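To illustrate the distinction, the following sketch separates fixation samples from saccade samples in a one-dimensional gaze signal using a simple velocity threshold. The sampling rate, threshold value and data are assumptions of the example, not the fixation detection method used later in this work (see section 5.2.1).

    # Hypothetical horizontal gaze positions in degrees, sampled at 250 Hz.
    SAMPLE_DT = 1 / 250  # seconds between samples
    xs = [0.00, 0.01, 0.02, 0.01, 2.50, 5.10, 5.11, 5.12, 5.11]

    VELOCITY_THRESHOLD = 100.0  # deg/s; assumed saccade/fixation boundary

    labels = []
    for i in range(1, len(xs)):
        velocity = abs(xs[i] - xs[i - 1]) / SAMPLE_DT
        labels.append("saccade" if velocity > VELOCITY_THRESHOLD else "fixation")

    print(labels)
    # ['fixation', 'fixation', 'fixation', 'saccade', 'saccade',
    #  'fixation', 'fixation', 'fixation']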
To enable the brain to acquire images in real time, the system which controls eye
movements (termed the oculomotor system) has to be very fast and accurate. It is built
of six extraocular muscles (see Fig. 3.2) which act as three agonist/antagonist pairs
responsible for horizontal, vertical and oblique rotations of the eye [38]. The eyes are
controlled directly by the brain through three cranial nerves originating from the
midbrain and pons. Therefore eye movements are among the fastest human reactions
to a changing environment.
Fig. 3.2 Six oculomotor system muscles.
3.2 Previous research on eye movements
Eye movements are essential to visual perception [66], so it is not a surprise that
there has been a lot of research on vision. Most of it is concerned with the
neurobiological and psychological aspects of vision.
One of the first scientists who emphasized the importance of eye movements in
vision and perception was Descartes (1596-1650). The first known experiments were
made by the French ophthalmologist Emile Javal in 1879 [44]. He discovered that the
eyes move in a series of jumps (saccades) and pauses (fixations). His research was
based only on direct observation of the eyes, so it could not be fully reliable. The first
eye tracker was developed by Edmund Burke Huey in 1897. The way in which people
read text was the first area of interest. It turned out – contrary to the common view of
those times – that people read more than one letter simultaneously: they read whole
words or even whole phrases. The nature of reading ability was examined and the
results were published in a comprehensive form in 1908 [37].
Another area of interest was how the brain processes images. It turned out that the
placement and order of fixations were strictly dependent on the kind of picture being
viewed and on previous individual experience with that kind of picture. The brain was
believed to be attracted first by the most important elements of the picture and, after
examining them, to focus on less important details. The knowledge acquired about the
way the brain processes information was used mostly in psychological research
[19][23].
Another evolving field where eye trackers are used is usability engineering – the
study of how users interact with products in order to improve those products’ design.
Among the most popular applications nowadays is the study of the usability of WWW
pages [16][46][83].
Although there had been no research on using eye movements to perform human
identification, some authors noticed significant differences between people. Josephson
and Holmes [46] tested the scanpath theory introduced by Noton and Stark [70] on
three different WWW pages. They not only confirmed that individuals learn scanpaths
(series of fixations) and repeat them when exposed to the same stimulation again, but
they also noticed that each examined person learned a different scanpath.
There are also studies comparing the eye movements of different categories of
people, for instance males and females [83], or musicians and non-musicians [55].
3.3 Eye movement tracking methodologies
As stated above, the first eye tracker was developed in 1897. Since then many
different methods for measuring eye movements have been developed.
There are four broad categories of eye movement measurement methodologies
[20][59]:
• Electro-oculography (EOG).
• Scleral contact lens/search coil.
• Video-oculography (VOG).
• Infrared corneal reflection oculography (IROG).
Electro-oculography (EOG) is the cheapest method and has been widely used in the
past. It relies on recording the electric potential differences of the skin surrounding the
ocular cavity. Surface recording electrodes are typically placed on the skin close to the
eyes, in the horizontal and vertical planes, so as to record relative shifts in the potential.
When the eyes look to the left, the positive charge of the cornea moves closer to the left
surface electrode, and a shift in the DC output is recorded. The relationship between the
EOG output and the horizontal angle of gaze is approximately linear for ± 30° of arc,
and is usually accurate to within ± 1.5-2.0° [69].
Fig. 3.3 Subject ready for electro-oculography eye movement measuring experiment [68].
(reproduced with permission)
The main disadvantage of this method is that it requires electrodes to be placed
around the eye, which is not very convenient for the examined subjects. Moreover, the
method is not very precise in comparison to the others.
The contact lens coil is the most precise eye tracking method. It involves attaching
a mechanical or optical reference object mounted on a contact lens, which is then worn
directly on the eye [20]. Different devices may be attached to the lens, but the principal
method employs a wire coil whose movement through an electromagnetic field is then
measured. Although the contact lens coil is the most precise eye movement
measurement method (accurate to about 5-10 seconds of arc over a limited range of
about 5 degrees), it is also the most intrusive one. Insertion of the lens requires care and
practice, and wearing it may cause discomfort. Its high intrusiveness makes it
practically useless in human identification experiments.
Fig. 3.4 An example of contact lens coil [86].
(reproduced with permission)
Video-oculography (VOG) is generally based on the analysis of the image of the
eye changing in time. Because it uses CCD cameras, it is convenient for the observed
subjects, as no physical contact with the device is necessary. The recording device may
be attached to a special head-mounted helmet (Fig. 3.5), which is not very convenient,
or may be table-mounted (at a distance from the eye), for instance attached to a
computer display (Fig. 3.6).
Fig. 3.5 Head mounted video-based eyetracker [87].
(reproduced with permission)
These techniques involve the measurement of distinguishable features of the eyes
under rotation/translation, e.g. the apparent shape of the pupil, the position of the
limbus (the iris-sclera boundary), and corneal reflections of a closely situated directed
light source (often infrared) [20].
Fig. 3.6 Video based eye tracker with camera combined with the TFT display [89].
(reproduced with permission)
The last methodology takes advantage of the reflective properties of the human eye.
A beam of light is directed at the eye and the reflection is measured. As a directed,
closely situated light source could be inconvenient for the examined person, infrared
light sources are often used; that is why the methodology is called infrared oculography
(IROG). Contrary to VOG, the method does not use complicated image capturing
devices, but only simple light receivers.
The method is very precise and not very intrusive, so it seemed to be the best choice
for the experiments presented in this work. One of the best examples of products based
on this methodology is the OBER2 system described below.
3.4 The OBER2 system
The OBER2 system is the product of many years of experience and experiments. It
was developed by an international group of researchers including Dr. Per Udden from
Permobil Meditech, Sweden, Professor Jan Ober from the Institute of Biocybernetics
and Biomedical Engineering, Polish Academy of Sciences, and Professor Józef Ober
from the Silesian University of Technology, Gliwice, Poland.
The OBER2 system is an example of an infrared oculography (IROG) based system.
It works using pairs of infrared transmitters and receivers. The transmitters emit
infrared light towards the eye; the light is reflected from the iris or sclera regions and is
collected by the receivers. Eye movements are measured using a differential
comparison of the transmitted and reflected signals in time [59]. It turns out that the
difference in the amount of light received during two consecutive trials is, with good
accuracy, proportional to the angular position of the eye [43].
Fig. 3.7 OBER2 system operation principle.
Fig. 3.7 presents the basic idea of the OBER2 system. In fact there are eight pairs of
transmitters and receivers measuring the movements of each eye. They are attached to
specially designed ‘goggles’, presented in Fig. 3.8.
Fig. 3.8 Goggles to be worn during the experiment.
Because of medical restrictions, the amount of infrared light to which the eye is
exposed should be as small as possible. Therefore the OBER2 system emits infrared
light in very short 80 µs pulses. The amount of light measured by the receivers during
the pulse is (after subtraction of the predicted amount of ambient light) the output of
the system. The whole process may be described in the following steps (see Fig. 3.9):
• Measure the amount of ambient light received at t0 (IR0).
• Measure the amount of ambient light received at t1 (IR1) and start emitting
the impulse.
• Measure the amount of light received at t2, when the impulse reaches its
maximum value (IR2).
• Calculate (by extrapolation from the amounts measured at t0 and t1) the
predicted amount of ambient light at t2 (IR3).
• The output of the system is IR2 - IR3.
Fig. 3.9 Graph illustration of the OBER2 measuring process.
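A minimal sketch of this measuring step, under the assumption that the ambient light is extrapolated linearly from the two pre-pulse readings, might look as follows (the timing and light values are illustrative, not the system’s actual constants):

    def ober2_output(ir0, ir1, ir2, t0, t1, t2):
        # One differential measurement as described above.
        # ir0, ir1 - ambient light measured at times t0 and t1 (before the pulse)
        # ir2      - light measured at t2, at the maximum of the infrared impulse
        # Returns ir2 - ir3, where ir3 is the ambient light linearly
        # extrapolated to t2 from the two pre-pulse readings.
        slope = (ir1 - ir0) / (t1 - t0)
        ir3 = ir1 + slope * (t2 - t1)   # predicted ambient light at t2
        return ir2 - ir3

    # Illustrative reading: ambient light drifting slowly upward during the pulse.
    print(ober2_output(ir0=100.0, ir1=102.0, ir2=430.0, t0=0.0, t1=40.0, t2=80.0))
    # 326.0  (430 - extrapolated ambient of 104)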
This value is then compared to the value obtained during calibration (see section
5.1) and sent to a 12-bit A/D converter. The result of a single measurement is therefore
a 12-bit number. The same procedure is used for both eyes and both directions
(horizontal and vertical), so the system produces four such numbers as the result of one
measurement.
The current version of the system is able to measure eye positions at frequencies of
up to 2 kHz (4 kHz when measuring in only one direction). The estimated precision of
the measurement is about 30 seconds of arc.
The system used during the experiments was connected to a PC computer via an RS232
interface. All results measured by the system were sent directly to the computer and
stored on a hard drive (see section 4.6).
Fig. 3.10 The laboratory version of the OBER2 system.
The advanced technology used in the OBER2 system makes it one of the fastest and
most accurate eye tracking devices.
4 Eye movement registering experiment
To prove that eye movements may be used for human identification, an experiment
had to be performed. The experiment was divided into two stages:
1) Gathering samples of human eye movements from different persons.
2) Processing the samples obtained in the previous step to extract individual features.
This chapter analyses the first stage of the experiment, gathering eye movement
samples. That process consists of a series of tests on different subjects (persons). Each
test is a registration of the eye movements of a subject for a specified period of time
with the OBER2 system. The result of a single test is a sample, which is used in the
second stage of the experiment.
4.1 Possible testing strategies
There are several possible ways to perform a single test. This section discusses
some of them and describes the chosen solution.
• Registering eye movements only, without information about the observed
image.
In that kind of test, the eye movements of a person are registered for a specified period
of time. The testing system consists of the OBER2 device for registering eye movements
and a PC computer, connected to the OBER2 via an RS232 interface, for data storage
(Fig. 4.1). The solution is very simple to conduct, even without any cooperation from the
person being identified. Eye movements may be measured during the normal activity of
that person, without any information about the observed image.
The main drawback of that method is the obvious fact that eye movements are
strongly correlated with the observed image. The movements would be quite different
for a person looking at a quickly changing environment (for instance a sport event or an
action movie) than for a person looking at a solid white wall.
Fig. 4.1 Schema of the system registering only eye movements without any information
about the observed image.
Of course, one may say that human identification should be independent of visual
stimulation. Indeed, it should theoretically be possible to extract identification patterns
from any eye movement without knowledge of the character of the stimulation. However,
that kind of extraction seems to be very difficult and requires more comprehensive
study and experiments.
• Registering both eye movements and the observed image.
In that solution the testing system is expanded with an image-capturing device,
which registers the ‘reason’ for the eyes’ movements (Fig. 4.2). In that kind of test we
treat the tested subject as a dynamic system for which we register both the input and the
answer to that input.
Fig. 4.2 Schema of the system registering both eye movements and the observed image.
Such an improvement gives a lot more data to be analyzed [31]; yet it also has several
serious drawbacks. First of all, the testing system is more complicated. We need an
additional camera, which records the image the examined person is looking at.
Furthermore, we need to implement special algorithms to synchronize the visual data
with the eye movement signal. A lot more capacity is also needed for data storage.
We must additionally be aware that a camera ‘sees’ the world differently than a human
eye, thus the image we register cannot be considered completely equivalent to the eyes’
input.
Moreover, to be usable in the real world for the purpose of biometric identification, a
single test cannot be too long. With no influence on the stimulation (the image being
observed), one cannot be sure whether enough information about the person being
identified can be registered during the short time of the test.
• Generating an image and registering eye movements as the answer to it.
In that solution the testing system consists of the OBER2 eye tracker and a PC
computer used both for data storage and for controlling the monitor, which produces a
visual signal (Fig. 4.3). The OBER2 system registers the answer to that signal produced
by the subject’s eyes. However, we should be aware of the fact that the monitor screen is
only a part of the image that the eyes see, so not the whole input is measured.
Furthermore, the input may include non-visual signals. Sudden loud sounds may, for
instance, cause rapid eye movements [92].
Fig. 4.3 Schema of the system generating stimulation on a computer display and registering eye
movements as the answer to that stimulation.
As this methodology gives control over the ‘input’ of the examined subject, it seems
to be the most interesting from the researcher’s point of view. Therefore, all tests described
in this work were performed using a stimulation displayed on the monitor, with the
system architecture presented in Fig. 4.3.
4.2 Possible stimulations
Persons being tested look at the computer monitor for a specified period of time and
their eye movements are measured. The computer monitor displays a scene called here a
stimulation. One may consider different types of stimulations (see Fig. 4.4). The type of
stimulation determines what aspect of eye movements is measured.
[Diagram: stimulations are divided into static and dynamic; dynamic stimulations into passive and forcing eye movement; the latter into interactive and non-interactive.]
Fig. 4.4 Hierarchy of possible stimulations.
The simplest one could be just a static image. Such an image does not change in time
and is the same during the whole test. As has already been stated, the eyes move
constantly, even when looking at a static image, to register every important element of
the image with the fovea. According to Stark and Norton [70], the brain creates a
‘scanpath’ of eye movements for each image seen.
Fig. 4.5 Scanpaths of eyes looking at the static image.
As can be seen in the figure above, the scanpaths generated by the eyes when observing
an image may be a very interesting field of study. The theory that scanpath
characteristics are unique to a person [46] seems promising.
A special kind of static stimulation is a text stimulation. In such an experiment a
person simply reads the text appearing on the screen. There are a lot of studies concerning
eye movement tracking while reading a text, and they give very interesting results
[11][22][23]. After years of use the human brain is very well prepared to control eye
movements while reading, and each human being has slightly different customs
and habits based on different ‘reading experience’. Therefore, it may be assumed that by
analyzing the way a person reads a specially prepared text, a lot of interesting
information may be extracted. However, there is a problem with the learning effect
(described later) when the same stimulation is observed a number of times.
A more sophisticated solution could be a dynamic stimulation, one changing in time.
Different aspects of the stimulation may be considered: color, intensity, speed, etc.
This kind of stimulation may be passive, that is, just showing the stimulation without
any expectations of the examined person’s reactions; for instance, a movie or an
animation. On the other hand, the stimulation may force eye movements. In that kind of
stimulation the system expects the eyes to react to the stimulation.
That kind of stimulation may be interactive or non-interactive. In interactive
stimulation the computer display shows an image and the testing system waits for the
user’s reaction. It may be, for instance, a visual task stimulation like finding a matching
picture [35] or finding missing elements in a known picture [33]. The stimulation changes
according to the registered eye movements. When a subject, for instance, finds a
matching picture with their eyes, the stimulation automatically changes to the next task.
The person can also signal task completion using any other input device, such as a
keyboard or mouse.
In non-interactive stimulation the testing system presents a stimulation and the
task of the tested person is to look at a specified point. The examined subject has no
influence on the stimulation, as the stimulation is not sensitive to how the subject moves
their eyes.
4.3 Jumping point stimulation
One of the simplest forms of dynamic, non-interactive stimulation forcing eye
movements is a ‘jumping point’ stimulation. In that kind of stimulation the screen is
blank with only one point ‘jumping’ across it. The task of the examined person is to
follow the point with their eyes.
It is easier to analyze the results of such a stimulation. This time we are not interested
in examining where the persons are looking but in how they look at the point.
We may suppose that all results will be more or less similar to one another, and our task
is to extract the differences among people.
The main drawback of the method, however, is that it completely ignores the will of
the person. The person cannot decide where to look at a given moment, and therefore we
lose all information from the brain’s ‘decision centre’. We may say that this kind of
stimulation examines the oculomotor system more than the brain.
However, the jumping point stimulation has several significant advantages:
• It is self-calibrating, as the required fixation location is known during the whole
experiment. That allows us to omit the pre-calibration stage, which might be
necessary for other kinds of stimulations [35][2].
• Its duration is the same for each experiment. When a person is completing a
visual task like text reading or picture matching, we can never be sure when they
will finish. The jumping point stimulation always lasts the same amount of time.
• It is very easy to display, even without a monitor. In fact, only sources of
light are needed (for instance, simple diodes).
• No additional hardware is needed (such as a mouse or keyboard for some visual tasks).
4.4 The learning effect
A very important aspect of using eye movements in biometrics is that the same test
is performed on the same person a number of times. Firstly, several tests have to be
performed to enroll the user’s characteristic. Then, the test is performed each time
an individual wants to identify themselves. It may be supposed that if the same
stimulation were used each time, the person would get familiar with it and would, for
instance, seek only previously unnoticed details. After many repetitions the scene is
known, and the person may even stop moving their eyes when looking at it. The effect
may be called a learning effect, as the brain learns the stimulation.
The learning effect is present in every experiment measuring human behavior. For
example, when a text is read for the first time, the eye movements are interesting: the
eyes stop at more difficult words and sometimes go back to read some fragments again.
This process is very often unconscious. However, when the same, already known, text
is read once again, the eye movements are smooth and not interesting at all. The brain
has this text in memory, and these movements do not really constitute reading it
again.
When performing a text reading test we may also notice, for instance, that a person
has a lot of problems with reading the word ‘oculomotor’, and when reading this word
the eyes go back to read it again. However, presenting the same word during each
experiment makes the person’s brain familiar with the word, and the effect
disappears.
There are two possible ways to handle the learning effect:
• Overcome the learning effect by changing the stimulation for every
experiment.
It is possible to overcome the learning effect by changing the stimulation in every test.
When a different stimulation is used every time, the learning effect is obviously
minimized. However, these stimulations should be as similar as possible to enable
extraction of the same eye movement parameters for future analyses. On the other
hand, they should be different enough for the learning effect to be eliminated. The task is
therefore not an easy one.
Using different texts in the text reading stimulation, one can overcome the learning
effect. However, extracting the same information about the examined person is difficult
because the difficulty of the texts varies.
Similarly, when considering the jumping point stimulation, the moments of point
changes and the placements of the consecutive points may be random. But the main
drawback is that we cannot directly compare two experiments, as the eye movements
during them are different. That problem may of course be minimized by using a proper
feature extraction method, described in section 5.
• Use the learning effect as the direct identification measure.
Instead of avoiding the learning effect, it may be used directly in the identification
process. A person who looks at the same stimulation many times gets used to it, and
the results of the subsequent experiments converge: the subsequent samples become
more similar.
Having that in mind, we can suppose that, after a specific number of experiments, the
next samples will be very similar and therefore easier to identify. It is exactly the same
effect as with a written signature.
Let us imagine the following experiment: persons are asked to write a word on
paper. The written word looks typical for their handwriting style, and it is possible for a
specialist to identify them. Yet, when they are asked to write the same word over and
over again, they get used to it and the brain produces a kind of automatic schema for
performing that task. At this point the handwritten word looks very similar every time,
and that is what we call a signature. Contrary to ordinary handwriting, a signature may
be recognized even by an unqualified person, for instance a shop assistant.
We would like to use the same effect with eye movements. Firstly, we show a
person the same stimulation several (as many as possible) times. After that process, the
person’s brain produces an automatic schema, and the results of the following experiments
start to converge. This, of course, makes the process of recognition (identification)
easier; remember the handwriting specialist versus the shop assistant. Of course, we must
be sure that even after convergence the eye movements are still informative: the eyes
must be active even when the stimulation is well known. Therefore that kind of
‘stimulation learning’ may be used only with stimulations forcing eye movements.
Returning to the jumping point stimulation: if the stimulation is invariable, every
test is similar and may be directly compared. After many repetitions a person gets
familiar with the order of the point’s appearances and may even predict the next point
position. When every examined person has their own, individual stimulation, the test is
analogous to a handwritten signature. Each person learns their own ‘eye movement
signature’, and it is difficult for any other person to repeat the same sequence of eye
movements. Of course, as the process is dynamic and mostly unconscious, forging
another person’s ‘eye movement signature’ is much more difficult than forging a
written one.
But, using the same analogy to handwriting, when we ask several persons to write
the same word several times, they will do it differently, each with their own handwriting
style. So it may be supposed that even when the same stimulation is used for every person,
it is possible to distinguish the differences.
The latter method (using the same stimulation for every test) gives a better opportunity
for further cross testing of samples, as every test may be used both as a genuine and as an
impostor one (see section 7).
4.5 Methodology used during the experiment
The stimulation eventually chosen was a ‘jumping point’ stimulation with the same
point order for every experiment. There are nine different point placements defined on
the screen, one in the middle and eight on the edges, creating a 3 x 3 matrix. The point
is displayed at one placement at a given moment. The stimulation begins and ends with
the point in the middle of the screen. During the stimulation, the point’s placement
changes at specified intervals.
Fig. 4.6 Point matrix for the stimulation.
The main problem in developing the stimulation is to make it both short and informative.
These properties are at two opposite poles, so a ‘golden mean’ must be found. It
was assumed that gathering one sample should not last longer than 10 seconds.
Longer stimulations would be impractical for real-world usage. To be
informative, the experiment should consist of as many point position changes as possible.
However, moving the point too quickly makes it impossible for the eyes to follow it.
Experiments and the literature [38] confirm that the reaction time to a change of
stimulation is about 100-200 ms. After that time the eyes start a saccade, which moves
the fovea to the new gaze point. The saccade is very fast and lasts no longer than 10-20 ms.
After a saccade, the brain analyses the new position of the eyes and, if necessary, tries to
correct it. So very often, about 50 ms after the first saccade, the next saccade happens. It
can be called a ‘calibration’ saccade.
Fig. 4.7 Typical eye movement reaction to a point position change in one axis.
Therefore, to register the whole reaction to a point change, it was necessary to make
the interval between point location changes longer than 300 ms.
The stimulation, which was developed and used during all tests, consists of
eleven point position changes giving twelve consecutive point positions. The first point
appears in the middle of the screen, and the person should look at it with eyes positioned
directly ahead. After 1600 ms the point in the middle disappears, and for 20 ms the screen
is blank. In that time the eyes are in an unstable state, waiting for another point of interest.
That moment is uncomfortable for the eyes because there is no point to look at. Then the
point appears in the upper right corner. A point flashing on a blank screen attracts the
eyes’ attention even without the person’s will. The ‘jumps’ of the point continue until the
last point position, in the middle of the screen, is reached.
[Figure: twelve stimulation frames a-l with display durations: a. 1600 ms; b-k. 550 ms each; l. 1100 ms.]
Fig. 4.8 Visual description of stimulation (a-l).
The chosen stimulation is described in the picture above. Each picture except the
first and the last one is seen by the subject for about 550 ms. The first and the last points
are placed in the middle of the screen.
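For illustration, the schedule may be represented as a simple list of placements and
durations. The Python sketch below is only a transcription of the description above under
stated assumptions: the durations and the first, second and last placements follow the
text, while the remaining placements on the 3 x 3 grid are purely illustrative.

# (col, row) coordinates on the 3 x 3 grid; (1, 1) is the middle of the screen.
# Placements marked 'illustrative' are assumptions, not the real sequence.
BLANK_MS = 20  # blank screen shown before each point change
SCHEDULE = [
    ((1, 1), 1600),  # a. middle of the screen
    ((2, 0), 550),   # b. upper right corner
    ((0, 0), 550),   # c. illustrative
    ((2, 2), 550),   # d. illustrative
    ((0, 2), 550),   # e. illustrative
    ((1, 0), 550),   # f. illustrative
    ((1, 2), 550),   # g. illustrative
    ((0, 1), 550),   # h. illustrative
    ((2, 1), 550),   # i. illustrative
    ((0, 0), 550),   # j. illustrative
    ((2, 0), 550),   # k. illustrative
    ((1, 1), 1100),  # l. middle of the screen
]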
The next problem which had to be solved was choosing the parameters of eye
movement registration. The OBER2 eye tracking system measures eye movements
with frequencies of up to 2 kHz. Such high frequencies may reveal information about the
eyes’ micro-movements. However, higher frequencies give more data, and such an amount
of data is sometimes difficult to manage considering the number of experiments. In
the eye movement research described in section 3.2, measuring frequencies were
almost always below 100 Hz. Therefore a frequency of 250 Hz was arbitrarily chosen.
Such a frequency makes it impossible to reveal eye micro-movements like tremors [59],
but it was enough to acquire information about fixations and saccades.
4.6 Storing results of the experiment
Each test was a recording of 2048 positions of the eyes looking at the stimulation. That
number was chosen for two reasons:
• It is easier to process signals whose number of elements is a power of two.
• As the OBER2 system measured eye positions with a frequency of 250 Hz, the whole
experiment lasted 8192 ms, which fulfills the ‘less than 10 seconds’ condition.
Each of the 2048 measurements (taken at intervals of 4 ms) consisted of six integer
values giving the position of the stimulation point on the screen (SX, SY), the
position of the point the left eye was looking at (LX, LY) and the position of the point
the right eye was looking at (RX, RY) at each moment of time during the presentation of
the stimulation.
Positions were results from the 12-bit AD converter, so they were in the range 0-4095.
The stimulation point position was also rescaled to those bounds. In each experiment a
sample of 2048 x 6 = 12288 values is collected.
Fig. 4.9 Results of a single test.
The result of a single test may be shown on two graphs (Fig. 4.9). The result of a
single test (later called a sample) was stored in a single text file in a format defined as the
EyeTextFile format. An example of such a file is presented below.
login 2003-12-17 11:00:39
2048 2048 2071 2026 2110 2229
2048 2048 2102 2030 2104 2236
2048 2048 2102 2032 2103 2233
2048 2048 2101 2031 2102 2231
2048 2048 2100 2030 2101 2226
2048 2048 2102 2032 2101 2224
2048 2048 2100 2028 2096 2219
2048 2048 2098 2030 2096 2216
2048 2048 2101 2029 2095 2216
2048 2048 2100 2024 2095 2214
2048 2048 2099 2026 2095 2215
2048 2048 2097 2028 2092 2213
...
Fig. 4.10 Result of one test stored in a text file (EyeTextFile format).
As can be seen, the sample file consists of identification information, the date and time
of acquisition, and a set of 2048 measurements. Each measurement consists of six values
for SX, SY, LX, RX, LY and RY. Each sample file was stored on a hard disk with a
filename created as:
<login>.<index>
where login is the identification of the person being tested and index is the number of
the sample taken from that person. For instance, a file ABC.12 is the 12th sample taken
from the person identified as ABC.
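A minimal Python sketch of a reader for this format follows. It assumes exactly the
layout shown in Fig. 4.10 (a header line with the login and a timestamp, followed by
2048 lines of six integers each); the function name is hypothetical.

def read_eye_text_file(path):
    """Read one EyeTextFile sample: (login, timestamp, list of 6-tuples)."""
    with open(path) as f:
        header = f.readline().split(maxsplit=1)
        login = header[0]
        timestamp = header[1].strip() if len(header) > 1 else ""
        rows = [tuple(int(v) for v in line.split()) for line in f if line.strip()]
    if len(rows) != 2048 or any(len(r) != 6 for r in rows):
        raise ValueError("expected 2048 rows of six values (SX SY LX RX LY RY)")
    return login, timestamp, rows

# Usage: login, when, rows = read_eye_text_file("ABC.12")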
The samples obtained during the tests and stored in separate files were the subject of
the further analyses described in the next sections.
5 Entry processing of collected data
Each test described in section 4 gave a sample consisting of information about the eye
and stimulation point positions at specified moments of time. The sample consists of 12288
values grouped into six 2048-element vectors of integers in the range 0-4095. The vectors are:
• SX – horizontal position of the stimulation point: 0 for the left position, 2048 for
the middle position and 4095 for the right position.
• SY – vertical position of the stimulation point: 0 for the top position, 2048 for
the middle position and 4095 for the bottom position.
• LX – horizontal position of the left eye: 0 for the relative left position, 2048 for
the middle position and 4095 for the relative right position of the eye.
• LY – vertical position of the left eye: 0 for the relative upper position, 2048 for
the middle position and 4095 for the relative bottom position of the eye.
• RX – horizontal position of the right eye: 0 for the relative left position, 2048 for
the middle position and 4095 for the relative right position of the eye.
• RY – vertical position of the right eye: 0 for the relative upper position, 2048 for
the middle position and 4095 for the relative bottom position of the eye.
For the convenience of further operations, all vectors were combined into one sample
vector, as can be seen in Fig. 5.1. Each vector will later be called ‘a signal’ in this
section and will be treated separately.
[Figure: the sample vector with consecutive segments labeled SX, SY, LX, RX, LY, RY.]
Fig. 5.1 An example of a sample consisting of six independent parts (signals). The X-axis represents
time (separately for each signal) and the Y-axis represents the value obtained from the OBER2 system’s AD
converter (for signals LX, RX, LY and RY). For signals SX and SY, the Y-axis takes only
three possible values (0, 2048 and 4095), representing stimulation point positions.
The data has to be transformed to create a vector of attributes, which will be used in
the identification process described in the following sections. Each feature should give
some information about the person who was the subject of the experiment. That
information may be interpretable, for instance “his dominant eye is the left one” or “his eyes
flicker with a frequency of 10 Hz”, but the meaning of a feature may also be
hidden, giving only a value. The main problem is how to extract a set of features whose
values for different samples of the same person (inner-class samples) are as similar as
possible and whose values for different persons’ samples are as different as possible.
As has been mentioned earlier, identification based on eye movement analysis is a
brand new technique. The main disadvantage of that is that one cannot take
already published algorithms and simply try to improve them with one’s own methods.
Therefore, the only possibility is to use methods which have been successfully applied
to similar problems.
Methods described in this section include:
• Methods used elsewhere to analyze eye movement data.
• General methods used in signal processing and classification.
The signal transformations described in this section use information from the sample (the
result of a single test) to produce a vector of attributes. All calculations are independent
of other samples.
vector_of_attributes = f(sample) Eq. 5.1
The first problem to be solved before computing a vector of features is selecting the
samples that are suitable for further analyses and normalizing them to enable
direct comparison of their attributes.
5.1 Sample calibration
The first step in laboratory tests involving eye trackers (or any other measuring device) is
always a calibration of the device. The typical calibration consists of a series of tests and
an analysis of the obtained results to improve the device measurements [35]. After
achieving satisfactory parameters the real experiment begins.
The OBER2 device calibrates at the moment the test is started. It is assumed
that at that moment the examined person is looking straight ahead, and the initial value of
the system’s output is set to 2048 (the middle value of the 12-bit AD converter). When the eye
moves down, the value describing the vertical position increases; when the eye moves up,
the value decreases. Similarly, when the eye moves right, the value describing the
horizontal position increases, and when the eye moves left, the value decreases. The
amount of this increase or decrease may be controlled with the gain parameter of the
OBER2 system.
The system is very sensitive to the adjustment of the goggles. The results depend
directly on the distance between the goggles and the eyes of the tested person. When the
distance is small, the changes of reflection caused by eye movements are large;
conversely, when the distance is large, the changes are smaller. The results also depend
on the properties of the cornea, its shape and color.
To calculate the eye’s position in degrees relative to the center position, a calibration
procedure needs three values:
• the distance between the eyes and the stimulation (monitor screen),
• the value obtained from the OBER2 system when the eye is looking at the center
of the screen (usually 2048),
• the value obtained from the OBER2 system when the eye is looking at a specified
point on the screen.
Fig. 5.2 Information needed for proper calibration of eye movement signal.
When these three values are known, it is possible to convert the raw OBER2 output into
arc degrees. Obviously, on the basis of those numbers, it is also possible to adjust the gain
parameter of the system. The gain parameter controls the amount of light emitted by the
transmitters. When the gain is high, the system is more sensitive to eye movements. The
parameter should be set in such a way that the AD converter can measure the expected
extreme positions of the eye without saturation. These positions should also give
values near the boundaries of the converter range to lower the measurement error.
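The conversion itself is simple trigonometry. The following Python sketch shows it
under the assumption of a linear relation between converter output and gaze angle; the
function and parameter names are illustrative only.

import math

def value_to_degrees(value, center_value, ref_value, ref_offset, distance):
    """Convert a raw AD value to an eye rotation angle in arc degrees.

    center_value -- output when the eye looks at the screen centre (usually 2048)
    ref_value    -- output when the eye looks at a known reference point
    ref_offset   -- distance of that point from the screen centre
    distance     -- distance between the eyes and the screen (same unit)
    """
    ref_angle = math.degrees(math.atan2(ref_offset, distance))
    degrees_per_unit = ref_angle / (ref_value - center_value)
    return (value - center_value) * degrees_per_unit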
However, one of the main ideas of the proposed human identification technique is to
make it as convenient as possible for potential users, to ensure sufficient
acceptability of the method. Therefore, a single test has to be easy to perform. In order
to carry out over 1000 tests on almost 50 people, one should also make sure that the test is
fast. A method in which the device has to be calibrated for 5 minutes to acquire
one 8-second sample was not taken into account, because it is too inconvenient for
users. In fact, many tests were performed by the users themselves without any
assistance.
Fortunately, to find features useful for human identification we seek only
characteristic properties of eye movements, as well as correlations between the
stimulation and eye movements. Therefore, we do not need exact information about the
relative angle of the eye position. Instead, we can simply use the values obtained from the
OBER2 AD converter without information about the scale of eye movements.
Hence, almost all calibrations were omitted during the tests. The only problem that had
to be solved immediately during the test was AD converter saturation. If the
goggles are too close to the eye, the changes in the amount of light reflected by the cornea
exceed the capacity of the 12-bit converter and the values are clipped at 0 or 4095. As
can be seen in Fig. 5.3, a lot of information which should be provided by the test is then
lost. Such a result is impossible to analyze.
Fig. 5.3 Example of a badly acquired sample. Left eye, horizontal position (LX).
The gain was too high and the signal caused AD converter saturation.
To avoid that situation, the user was informed during the test (with a beep signal)
that the measured values had exceeded the allowed range. Then the gain was lowered and
the test was repeated.
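Such a check can be sketched in a few lines of Python. The run-length threshold below
is an illustrative assumption; the text above only states that out-of-range values
triggered a beep.

def is_saturated(signal, low=0, high=4095, max_run=5):
    """True if the signal sticks to a limit of the 12-bit converter range."""
    run = 0
    for value in signal:
        run = run + 1 if value <= low or value >= high else 0
        if run >= max_run:  # several consecutive clamped samples
            return True
    return False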
5.2 Sample normalization
As all calibrations were omitted, the values obtained from the OBER2 system during
the tests may sometimes be disturbed. That is why it was necessary to preprocess the
gathered samples with a special normalization procedure. That procedure should adjust
the samples to the same range and reject outliers for which such correction is impossible.
For instance, although the tested persons were instructed to avoid blinks during the
test, an eye blink is often impossible to suppress. A blink obviously influences the
gathered data. As there are methods of blink recognition in eye movement data
[43][31], three options could be chosen:
- Reject all samples with blinks.
- Filter out blinks using the methods described in [43].
- Leave blinks unchanged.
Rejecting all ‘blinked’ samples is impractical, as there are people who cannot
abstain from blinking for eight seconds. On the other hand, filtering blinks out
of the signal may remove some information useful for identification (especially
information about eye oscillations may be disturbed). Leaving blinks unchanged seems
to be the safest solution. However, sometimes the number of blinks in the signal is so
high that it is impossible to analyze the signal itself. Samples of that kind (with a number
of blinks disturbing the analyses) were rejected with the procedure described below.
Moreover, the tested person may lose attention during the test because of external
circumstances (somebody else entering the room or a sudden noise). The identification
system should also be able to recognize such badly acquired samples.
The pre-processing should solve the problem of badly acquired samples. But even
properly acquired samples have, because of the lack of system calibration, different
amplitudes (Fig. 5.4). To produce a vector of attributes that can be compared to
attribute vectors obtained during different tests, all samples had to be normalized.
Fig. 5.4 Two graphs presenting the left eye horizontal reaction (LX). The presented samples were
obtained from two different persons, (A) and (B), for the same stimulation. The degree of inclination
differs significantly.
47
Entry processing of collected data 5.2. Sample normalization
The normalization procedure may be described by the following steps:
1) Find all fixations in the signal.
2) Pair fixations with required fixation locations (the places where the stimulation point
flashes at that time). The required fixation locations are referred to as RFLs.
3) Calculate the average value for each RFL, taking into account the values of the
fixations measured during that RFL.
4) Recalculate all values to a common range.
All calculations were performed for each signal (like LX or RY) separately, using only
information from the corresponding stimulation signal (SX or SY). All procedures
presented in this section are visualized on one of the LX signals presented in Fig. 5.4,
referred to as signals (A) and (B).
5.2.1 Finding fixations
There are many methods for finding fixations and saccades. Salvucci et al. [82], in an
effort to create a taxonomy of those methods, named five types:
Velocity-Threshold Identification
The method separates fixation and saccade points based on their point-to-point
velocities. The velocity profiles of saccadic eye movements show essentially two
distributions: low velocities for fixations and high velocities for saccades.
The only problem of the methodology is the correct assignment of the threshold.
HMM Identification
Hidden Markov model fixation identification uses probabilistic analysis to determine
the most likely identifications for a given protocol. Hidden Markov models (HMMs) are
probabilistic finite state machines. This methodology uses a two-state HMM in which
the states represent the velocity distributions for saccade and fixation points. This
probabilistic representation helps to perform more robust identification than a
velocity-threshold method.
Dispersion-Threshold Identification
In contrast to velocity-based identification and HMM identification,
dispersion-threshold identification utilizes the fact that fixation points, because of their
low velocity, tend to cluster closely together. The method identifies fixations as groups of
consecutive points within a particular dispersion, or maximum separation. The
algorithm takes two thresholds: maximum dispersion and minimum duration.
MST Identification
MST identification is based on minimum spanning trees (MSTs), that is, trees
connecting a set of points in such a way that the total length of the tree’s line segments
is minimized. MSTs can provide a highly flexible and controllable representation for
dispersion-based fixation identification.
Area-based Algorithms
The four previous identification methods can identify fixations at any location in the
visual field. In contrast, area-based fixation identification identifies only fixations that
occur within specified target areas. Because it can be applied only after proper
calibration, it is useless in our application.
The presented methodologies differ in accuracy and robustness. Ease of
implementation is also an important property. Because dispersion-threshold algorithms
seem to be among the most accurate and robust and are easy to implement [82], one of
them was used in this work. Namely, it was the algorithm proposed by Augustyniak et
al. [2]. It tries to find intervals of sufficient duration for which the points’
approximation is flat enough, with an error smaller than a specified threshold.
It takes three parameters: the minimal length of a fixation, the maximal slope of the
interval approximation and the threshold of the maximal approximation error. As the
authors did not give their suggestions for these parameters, all three values had to be
assigned empirically.
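The following Python sketch is written in the spirit of that algorithm: it grows an
interval while its least-squares line stays flat enough and approximates the points with a
small enough error. It is a reconstruction from the description above, not the original
implementation, and the default parameter values are illustrative (the real ones had to
be assigned empirically).

def linear_fit(values):
    """Least-squares line through (0, v0), (1, v1), ...; returns (slope, max_error)."""
    n = len(values)
    mean_x = (n - 1) / 2.0
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in enumerate(values))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    max_err = max(abs(v - (slope * x + intercept)) for x, v in enumerate(values))
    return slope, max_err

def find_fixations(signal, min_len=25, max_slope=0.5, max_error=50.0):
    """Return (start, end) sample-index pairs; min_len=25 samples is 100 ms at 250 Hz."""
    fixations, start = [], 0
    while start + min_len <= len(signal):
        slope, err = linear_fit(signal[start:start + min_len])
        if abs(slope) <= max_slope and err <= max_error:
            end = start + min_len
            while end < len(signal):            # extend while still flat enough
                s, e = linear_fit(signal[start:end + 1])
                if abs(s) > max_slope or e > max_error:
                    break
                end += 1
            fixations.append((start, end))
            start = end
        else:
            start += 1
    return fixations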
Fig. 5.5 Signal (A) from Fig. 5.4 with detected fixations (marked with the thick lines).
5.2.2 Pairing fixations with required fixation locations
After calculating the fixations, the next step was to decide whether each fixation was
connected with any RFL (Required Fixation Location) and, if so, to assign it to that
RFL. The main problem in this fixation assignment was how to avoid incorrect
classifications of fixations. Several sophisticated heuristic algorithms were tested, but
every methodology resulted in many misclassifications. So one of the simplest
algorithms was developed (sketched after the list):
• Accept every fixation that lasts during only one RFL and assign it to that RFL.
• Accept every fixation that lasts during two or more different RFLs if the
duration of the fixation within one of those RFLs is longer than 75% of the overall
fixation length and longer than 150 ms.
• Do not take into account fixations not fulfilling the previous conditions.
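A minimal Python sketch of this assignment rule follows, assuming fixations and
RFLs are given as (start, end) intervals in milliseconds; the helper names are
hypothetical.

def overlap(a, b):
    """Length of the overlap of two (start, end) intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))

def assign_fixations(fixations, rfls, share=0.75, min_ms=150):
    """Map each accepted fixation to the index of its RFL."""
    assigned = {}
    for fix in fixations:
        length = fix[1] - fix[0]
        overlaps = {}
        for i, rfl in enumerate(rfls):
            o = overlap(fix, rfl)
            if o > 0:
                overlaps[i] = o
        if len(overlaps) == 1:                  # lasts during one RFL only
            assigned[fix] = next(iter(overlaps))
        elif len(overlaps) > 1:                 # dominant-RFL rule
            best = max(overlaps, key=overlaps.get)
            if overlaps[best] > share * length and overlaps[best] > min_ms:
                assigned[fix] = best
        # otherwise the fixation is ignored
    return assigned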
The main drawback of this algorithm is that it does not take into account fixations
which could sometimes be used (from the point of view of a human looking at the
signal), and therefore less data was available for further analyses. But the advantage is
that the algorithm makes mistakes very rarely (one such case will be described below).
The algorithm assigns fixations to three different levels, visualized as three different
values on the graph. The next procedure was averaging all values of the fixations
assigned to each RFL, giving the average levels L0, L1 and L2.
Fig. 5.6 Signal (A) and its averaged levels.
5.2.3 Recal