Eye movements in biometrics
Pawel Kasprowski1, Józef Ober1,2
1Institute of Computer Science, Silesian University of Technology, 44-100 Gliwice, Poland
2Institute of Theoretical and Applied Informatics, Polish Academy of Science, Gliwice,
Poland
kasprowski@polsl.pl
Abstract. The paper presents a new technique for human identification based on eye movement characteristics. Using this method, the system measures the reaction of a person's eyes to visual stimulation. The eyes of the person being identified follow a point on the computer screen, and an eye movement tracking system collects information about the eye movements during the experiment. The first experiments showed that it is possible to identify people by means of this method. The method scrutinized here has several significant advantages. It combines behavioral and physiological aspects and is therefore difficult to counterfeit, while at the same time being easy to perform. Moreover, it is possible to combine it with other camera-based techniques such as iris or face recognition.
1 Introduction
Eyes are among the most important human organs. There is a common saying that eyes are 'windows to our soul'. In fact, the eyes are the main 'interface' between the environment and the human brain. It is therefore not a surprise that the system which deals with human vision is physiologically and neurologically complicated.
Using the eyes to perform human identification in biometrics has a long tradition, including well-established iris pattern recognition algorithms [1] and retina scanning. However, as far as the authors know, there has been no research concerning identification based on eye movement characteristics. This is somewhat surprising, because the method has several important advantages.
Firstly, it combines physiological (muscles) and behavioral (brain) aspects. The most popular biometric methods, like fingerprint verification or iris recognition, are based mostly on physiological properties of the human body. Therefore, all that is needed for identification is the 'body' of the person to be identified, which makes it possible to identify an unconscious or, with some methods, even a dead person. Moreover, physiological properties may be forged: preparing models of a finger or even of a retina (using special holograms) is technically possible. As eye movement based identification uses information produced mostly by the brain (which so far cannot be imitated), forging this kind of information seems to be much more difficult.
Although it has not been studied in the present paper, it seems possible to perform
a covert identification, i.e. identification of a person unaware of that process (for
instance using hidden cameras).
This is a pre-print.
The original version was published by Springer,
Lecture Notes in Computer Science vol 3087,
Book title: Biometric Authentication, pp 248-258,
DOI 10.1007/978-3-540-25976-3_23
Last but not least, there are many easy-to-use eye tracking devices available nowadays, so performing identification by means of this technique is not very expensive. For instance, the very fast and accurate OBER2 [2] eye tracking system was used in the present work. It measures eye movements with very high precision using infrared reflection, and its production cost is comparable to that of fingerprint scanners.
2 Physiology of eye movements
When an individual looks at an object, the image of the object is projected onto the retina, which is composed of light-sensitive cells that convert light into signals which in turn can be transmitted to the brain via the optic nerve. The density of these light-sensitive cells on the retina is uneven, with denser clustering at the centre of the retina than at the periphery. This clustering causes the acuity of vision to vary: the most detailed vision is available when the object of interest falls on the centre of the retina. This area is called the yellow spot or fovea and covers about two degrees of visual angle. Outside this region visual acuity decreases rapidly. Eye movements are made to reorient the eye so that the object of interest falls upon the fovea and the highest level of detail can be extracted [3].
That is why it is possible to define a 'gaze point' – the exact point a person is looking at in a given moment of time. When the eyes look at something for a period of time, this state is called a fixation. During that time the image projected on the fovea is analyzed by the brain. A standard fixation lasts about 200-300 ms, but of course this depends on the complexity of the observed image. After the fixation, the eyes move rapidly to another gaze point – another fixation. This rapid movement is termed a saccade. Saccades differ in amplitude, yet they are always very fast.
To enable the brain to acquire images in real time, the system which controls eye movements (termed the oculomotor system) has to be very fast and accurate. Each eye is moved by six extraocular muscles which act as three agonist/antagonist pairs. The eyes are controlled directly by the brain, and their movements are among the fastest human reactions to a changing environment.
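The fixation/saccade distinction described above is commonly operationalized with a simple velocity-threshold detector. The following sketch is a standard illustration of that idea, not an algorithm from this paper; the sampling rate matches the one used later in the text, while the threshold value and the synthetic trace are assumptions for illustration only.

```python
def detect_saccades(positions, sample_rate_hz=250, threshold_deg_per_s=100.0):
    # Label each sample as saccade (True) or fixation (False) by comparing
    # the sample-to-sample angular velocity against a fixed threshold.
    flags = []
    for i in range(1, len(positions)):
        velocity = abs(positions[i] - positions[i - 1]) * sample_rate_hz
        flags.append(velocity > threshold_deg_per_s)
    return [flags[0]] + flags  # first sample inherits its neighbour's label

# Synthetic horizontal trace: fixation at 0 deg, a rapid jump, fixation at 5 deg.
trace = [0.0] * 10 + [1.0, 2.5, 4.0, 5.0] + [5.0] * 10
flags = detect_saccades(trace)
```

With the values above, only the four samples inside the jump exceed the velocity threshold and are labelled as saccadic.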
3 Previous research concerning eye movements
Eye movements are essential to visual perception [4], so it is not a surprise that there is a great deal of research on vision. Most of it is concerned with neurobiological and psychological aspects.
One of the first scientists to emphasize the importance of eye movements in vision and perception was Descartes (1596-1650). The first known measurements were made by the French ophthalmologist Émile Javal in 1879 [5]. He discovered that the eyes move in a series of jumps (saccades) and pauses (fixations). His research was based only on direct observation of the eyes, so it could not be fully reliable. The first eye tracker was developed by Edmund Burke Huey in 1897. The way in which people read text was the first area of interest. It turned out – contrary to the common view of those times – that people read more than one letter at a time: they read whole words or even whole phrases. The nature of the reading ability was examined and the results were published in a comprehensive form in 1908 [6].
Another area of interest was how the brain processes images. It turned out that the placement and order of fixations are strictly dependent on the kind of picture being viewed and on previous individual experience with that kind of picture. The brain was believed to be attracted first by the most important elements of the picture and, after examining them, to focus on less important details. The knowledge acquired about the way the brain processes information was used mostly in psychological research [7].
Another evolving field where eye trackers are used is usability engineering – the study of the way users interact with products in order to improve those products' design. Among the most popular topics nowadays is the usability of WWW pages [3][8][9].
Although there has been no research on using eye movements to perform human identification, some authors have noticed significant differences between people. Josephson and Holmes [8] tested the scanpath theory introduced by Noton and Stark [10] on three different WWW pages. They not only confirmed that individuals learn scanpaths (series of fixations) and repeat them when exposed to the same stimulation again, but also noticed that each examined person learned a different scanpath. Hornof and Halverson [11] tried to implement methods for automatic calibration of an eye tracker. Notably, they observed that calibration errors differed between persons, and they created so-called 'error signatures' for each person examined.
There are also many studies comparing the eye movements of different categories of people, for instance males and females [9] or musicians and non-musicians [12].
4 Human identification with eye movement information
4.1 How to perform the experiment
As described in the previous section, eye movements may give a lot of information about an individual. The simplest way to obtain a probe seems to be just recording the eye movements of a person for a predefined time. This method is very simple to conduct, even without any cooperation from the person being identified. In fact, the person may not even be aware of being identified, which opens the possibility of so-called 'covert' identification. The main drawback of this method is the obvious fact that eye movements are strongly correlated with the image the eyes are looking at. The movements would be quite different for a person watching a quickly changing environment (for instance a sports event or an action movie) than for a person looking at a solid white wall. Of course, one may argue that human identification should be independent of the visual stimulation. Indeed, it should theoretically be possible to extract identification patterns from any eye movements without knowledge of the character of the stimulation. However, that kind of extraction seems to be very difficult and requires much more comprehensive study and experimentation.
Given the problems described above, a simple upgrade of the system would be to add a module which registers the image that is the 'cause' of the eye movements. In that kind of model we have a dynamic system for which we register both the input and the system's response to that input.
Such an improvement gives a lot more data to analyze, yet it also has several serious drawbacks. First of all, the hardware system is much more complicated: we need an additional camera which records the image the examined person is looking at. Furthermore, we need special algorithms to synchronize the visual data with the eye movement signal, and much more capacity for data storage.
We must additionally be aware that a camera 'sees' the world differently than a human eye, so the image we register cannot be considered completely equivalent to the eyes' input.
To be useful in biometric identification, a single experiment must be as short as possible. Without any influence on the stimulation, we cannot be sure that in the short time of the experiment we will register enough interesting information about the person being identified. The natural consequence of this fact is to introduce our own stimulation. That solution also makes it possible to avoid image registration, because when the stimulation is known, recording it is not necessary.
Fig. 1. System generates visual stimulation and registers the oculomotor system reaction.
The most convenient source of visual stimulation is a computer display. However, as in the previous methods, we should be aware that the monitor screen is only a part of the image the eyes see, so not the whole input is measured. Furthermore, the input may include non-visual signals; sudden loud sounds, for instance, may cause rapid eye movements [13]. Nevertheless, this kind of experiment seems to be the simplest to realize.
4.2 Choosing stimulation
The problem which emerges here is: what should the stimulation be?
One may consider different types of stimulation for the experiment. The type of stimulation determines which aspects of eye movements are measured.
The simplest one could be just a static image. As has already been stated, the eyes move constantly, even when looking at a static image, in order to register every important element of the image with the fovea region. According to Noton and Stark [10], the brain creates a 'scanpath' of eye movements for each seen image.
A more sophisticated stimulation could be a dynamic one – like a movie or animation displayed on the screen. Different aspects of the stimulation may be considered: color, intensity, speed, etc.
The problem with using eye movements in biometrics is that the same experiment will be performed on the same person a number of times. Firstly, several experiments must be performed to enroll the user's characteristics. Then the experiment is performed each time an individual wants to identify himself. It may be expected that if the same stimulation is used each time, the person will become bored with it and the eye movements will not be very informative. This problem may be called a 'learning effect', as the brain learns the stimulation and acts differently after several repetitions of the same image.
The solution could be to present a different stimulation each time. The different types of stimulation should be as similar as possible, to enable extraction of the same eye movement parameters for future analyses. On the other hand, they should be different enough that the learning effect is eliminated. The task is therefore not an easy one.
To be sure that the examined person cooperates with the system (i.e., that they actually move their eyes), a special kind of stimulation called a 'visual task' may be proposed. The visual task may be, for instance, finding a matching picture [11] or finding missing elements in a known picture [14].
A special kind of visual task is a text reading task, in which a person simply reads text appearing on the screen. There are many studies concerning eye movement tracking while reading text, and they give very interesting results [15][16]. In particular, there are theoretical models of human eye movements during reading, like SWIFT [16]. After years of practice, the human brain is very well prepared to control eye movements while reading, and each human being has slightly different customs and habits based on different 'reading experience'. Therefore, it may be assumed that by analyzing the way a person reads a specially prepared text, a lot of interesting information may be extracted. Indeed, there is a lot of research concerning that subject [15][16][7].
Yet when applying this to an identification system, the problem of the learning effect appears once again. We may, for instance, notice that a person has a lot of problems with reading the word 'oculomotor'. However, if that word is presented during each experiment, the person's brain becomes familiar with it and the effect disappears.
Thus, the learning effect makes it impossible to repeat the same experiment and get the same results. In other words, the experiment is not stationary over a number of trials. Each experiment performed on the same person is different, because the previous experiments have changed the brain's parameters. Of course, that effect is clearly visible only when many experiments are conducted on the same person.
It is a serious problem because, contrary to other eye movement experiments, the main attribute of an identification system should be its repeatability for the same person. The experiment must be performed many times, giving the same results. The learning effect makes this very difficult.
A solution to that problem may be a stimulation which imposes on the person the spot the eyes should look at. The simplest form is a 'wandering point' stimulation: the screen is blank, with only one point 'wandering' across it, and the task of the examined person is to follow the point with their eyes.
It is easier to analyze the results of that kind of stimulation. This time we are not interested in where the person is looking, but in how they look at the point. We may suppose that all results will be more or less similar to one another, and our task is to extract the differences between people.
The main drawback of the method, however, is that it completely ignores the will of the person. The person cannot decide where to look at any moment, and therefore we are losing all information from the brain's 'decision centre'. We may say that this kind of stimulation examines the oculomotor system more than the brain.
However, using it we can overcome the learning effect. Obviously the effect still exists, but in this experiment it may become our ally. A person who sees the same stimulation many times gets used to it, and the results of successive experiments converge.
With that in mind, we can suppose that, after a specific number of experiments, subsequent probes will be very similar and therefore easier to identify. It is exactly the same effect as with the written signature.
Consider a person asked to write a word on paper, for instance their surname. The word on the paper looks typical of their handwriting style, and a specialist could identify them from it. Yet when they are asked to write the same word over and over again, they get used to it and the brain produces a kind of automatic schema for performing the task. From that moment the handwritten word looks very similar every time, and that is what we call a signature. Contrary to general handwriting, a signature may be recognized even by an unqualified person – for instance a shop assistant.
We would like to use the same effect with eye movements. First, we show a person the same stimulation several (as many as possible) times. After that process, the person's brain produces an automatic schema and the results of subsequent experiments start to converge. That, of course, makes the process of recognition (identification) easier – remember the handwriting specialist versus the shop assistant.
5 Experiment
The stimulation eventually chosen was a 'jumping point' stimulation. Nine different point placements are defined on the screen, one in the middle and eight on the edges, creating a 3 x 3 matrix. The point flashes in one placement at any given moment. The stimulation begins and ends with the point in the middle of the screen. During the stimulation, the point's placement changes at specified intervals.
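The jumping-point sequence described above can be sketched as follows. The paper specifies only the 3 x 3 grid and the centre start/end; the number of jumps, the interval length, and the random choice of intermediate placements below are illustrative assumptions.

```python
import random

# Nine placements on a 3 x 3 grid; each is a (column, row) pair.
GRID = [(col, row) for row in range(3) for col in range(3)]
CENTER = (1, 1)

def make_stimulation(n_jumps=8, interval_ms=1000, seed=None):
    # Build a 'jumping point' sequence: start and end at the centre,
    # visiting n_jumps randomly chosen non-centre placements in between.
    rng = random.Random(seed)
    edges = [p for p in GRID if p != CENTER]
    middle = [rng.choice(edges) for _ in range(n_jumps)]
    return [(pos, interval_ms) for pos in [CENTER] + middle + [CENTER]]

sequence = make_stimulation(n_jumps=8, interval_ms=1000, seed=7)
```

With eight intermediate jumps at 1000 ms each, the whole sequence lasts 10 seconds, matching the duration limit assumed in the text.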
That kind of stimulation has the advantage over others (especially the 'wandering point') that it is very easy to produce even without a display. In fact, only nine light sources are needed (for instance simple diodes).
The main problem in developing the stimulation is to make it both short and informative. These properties pull in opposite directions, so a 'golden mean' must be found. We assumed that gathering one probe should not last longer than 10 seconds. To be informative, the stimulation should consist of as many point position changes as possible. However, moving the point too quickly makes it impossible for the eyes to follow it. Our experiments confirmed that the reaction time to a change of stimulation is about 100-200 ms [14]. After that time the eyes start a saccade which moves the fovea to the new gaze point. The saccade is very fast and lasts no longer than 10-20 ms. After a saccade, the brain analyses the new position of the eyes and, if necessary, tries to correct it, so very often about 50 ms after the first saccade the next one occurs. We can call it a 'calibration' saccade.
Fig. 2. Typical eye movement reaction to a point jump in one axis. Reaction time is understood as the period between the stimulation's change and the eye's reaction. Stabilization time is the time until fixation on a new gaze point. Two calibration saccades may be observed.
5.1 Feature extraction
After collecting the data, the main problem is how to extract information useful for human identification.
The dataset consists of probes. Each probe is the result of recording one person's eye movements during a stimulation lasting about 8 seconds. As the experiments were made at a sampling frequency of 250 Hz, each probe consists of 2048 single measurements. Each measurement consists of six integer values giving the position of the stimulation point on the screen (sx, sy), the position of the point the left eye is looking at (lx, ly), and the position of the point the right eye is looking at (rx, ry).
So in each experiment we collect a probe of 2048 x 6 = 12288 integer values.
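The probe layout described above (interleaved sx, sy, lx, ly, rx, ry values) can be unpacked into per-channel sequences. A minimal sketch, using the channel names from the text; the helper function and the toy data are illustrative assumptions, not the paper's actual storage format.

```python
def split_probe(flat, n_samples=2048):
    # Unpack a flat probe of n_samples x 6 interleaved integers into the six
    # channels named in the text: stimulus, left-eye and right-eye positions.
    assert len(flat) == n_samples * 6
    names = ("sx", "sy", "lx", "ly", "rx", "ry")
    return {name: flat[i::6] for i, name in enumerate(names)}

# Toy probe with 4 samples instead of 2048, values 0..23 for illustration.
toy = list(range(4 * 6))
channels = split_probe(toy, n_samples=4)
```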
The next step is to convert those values into a set of features. Each feature gives some information about the person who performed the experiment. That information may be directly interpretable – for instance "he is a male" or "his reaction time is less than 200 ms" – but the meaning of a feature may also be hidden, giving only a value.
Fig. 3. Registered movements of the left eye in the horizontal axis in response to the stimulation. Reaction times and drifts may be observed. One experiment gives four such waveforms.
The main problem is how to extract a set of features whose values are as similar as possible for different probes of the same person and as different as possible for probes of different persons. The perfect solution would be to find features which have exactly the same values for the same person in every experiment.
As mentioned earlier, identification based on eye movement analysis is a brand new technique. The main disadvantage of this is that one cannot take recently published algorithms and simply try to improve them with one's own ideas. Therefore, we could only try to use methods which have been successfully applied to similar problems.
There are many different possibilities, for instance the Karhunen-Loève transform (the popular PCA) [17], the Fourier transform, the cepstrum [21] or wavelets. Each of those techniques has been successfully used in different biometric applications like face, finger, voice or iris recognition. Features specific to the nature of the eye movement signal may also be considered: average reaction time, average stabilization time, saccade velocity and acceleration, etc.
It seems that each method may work as well as the others, so a lot of work remains to compare those techniques. The cepstrum was used in the present work because of its success in voice recognition. The cepstrum is computed as the inverse Fourier transform of the logarithm of the power spectrum of a signal [21].
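This definition translates directly into code; a minimal sketch using NumPy. The small epsilon guarding log(0) and the toy waveform are implementation assumptions, not details from the paper.

```python
import numpy as np

def real_cepstrum(signal, eps=1e-12):
    # Cepstrum = inverse Fourier transform of the log power spectrum.
    spectrum = np.fft.fft(signal)
    log_power = np.log(np.abs(spectrum) ** 2 + eps)
    return np.real(np.fft.ifft(log_power))

# Toy waveform standing in for one eye-movement channel; the paper keeps
# only the first 15 cepstral coefficients of each of the four waveforms.
waveform = np.sin(np.linspace(0.0, 20.0, 256))
coefficients = real_cepstrum(waveform)[:15]
```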
5.2 Classification
Having extracted features that are hopefully relevant to identification, the next step is to prepare a classification model. When the next unclassified probe is obtained, the model can be used to estimate the probability that the probe belongs to a specified person.
There are many classification algorithms which can be used here. Their common property is that they create a model based on information obtained from training data – a set of classified probes. The algorithms try to generalize rules found in the training set into a function which can be used to classify future probes.
Of course, that generalization depends not only on the algorithm itself, but mostly on the characteristics of the training data. A classifying algorithm can work well only if the training set is representative of the whole population and there are some (even hidden) rules in its features. Therefore, proper feature extraction is the crucial element of this work.
Four different classification algorithms were used in the present paper: k-Nearest Neighbor (for k=3 and k=7), Naïve Bayes [20], C4.5 Decision Tree [18] and Support Vector Machines [19]. Due to lack of space for a description of the algorithms, the reader is directed to the referenced materials.
6 Results
The experiment was performed on nine participants. Each participant was enrolled over 30 times, and the last 30 trials were used for classification, giving 270 probes for the training set.
The first 15 cepstral coefficients were extracted from each of the four waveforms, giving a 60-dimensional vector for each probe. Then nine training sets were generated, one for classifying each participant. A classification model was generated from each set using the different classification algorithms.
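The nine per-participant training sets described above can be derived from one labelled pool by one-vs-rest relabelling; a minimal sketch of that step. The participant labels and probe counts below are toy values, not the actual data.

```python
def one_vs_rest(labels, target):
    # Relabel a multi-class pool for a single binary model:
    # 1 for probes of the target participant, 0 for everybody else.
    return [1 if person == target else 0 for person in labels]

# Toy pool: three participants instead of nine, a few probes each.
pool = ["A", "A", "B", "C", "B", "C"]
models = {person: one_vs_rest(pool, person) for person in sorted(set(pool))}
```

Each relabelled set then trains one binary classifier, so identifying a new probe means running it through all per-participant models.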
Table 1. Results of data validation for different classification algorithms. Averages of false acceptance and false rejection rates (FAR and FRR) are counted over the nine generated models.

Algorithm            Average FAR    Average FRR
Naïve Bayes          17.49 %        12.59 %
C4.5 Decision Tree    3.33 %        35.56 %
SVM polynomial        1.60 %        28.89 %
KNN k=3               1.48 %        22.59 %
KNN k=7               1.36 %        28.89 %
Each classification model was cross-validated using 10-fold stratified cross-validation [20]. The k-Nearest Neighbor algorithm with k=3 performed best, with an average false acceptance rate of 1.48 % (best result 0.4 %, worst result 2.5 %) and a false rejection rate in the range of 6.7 % to 43.3 %, with an average of 22.59 %.
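The FAR and FRR figures used above can be computed from acceptance counts on genuine and impostor probes; a minimal sketch of those definitions. The counts in the example are illustrative, not the paper's data.

```python
def far_frr(genuine_accepted, genuine_total, impostor_accepted, impostor_total):
    # False acceptance rate: fraction of impostor probes wrongly accepted.
    # False rejection rate: fraction of genuine probes wrongly rejected.
    far = impostor_accepted / impostor_total
    frr = (genuine_total - genuine_accepted) / genuine_total
    return far, frr

# Illustrative counts: 30 genuine probes and 240 impostor probes per model.
far, frr = far_frr(genuine_accepted=26, genuine_total=30,
                   impostor_accepted=4, impostor_total=240)
```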
7 Conclusion
The idea of personal identification using eye movement characteristics presented in this paper seems to be a valuable addition to other well-known biometric techniques. What makes it interesting is the ease of combining it with, for instance, face or iris recognition. As all of those techniques need digital cameras to collect data, a system may be developed that uses the same recording devices to gather information about the shape of the human face, the iris pattern and – last but not least – the eye movement characteristics. Of course, a lot of work remains to improve and test our methodology, but the first experiments show the great potential of eye movement identification. That potential was acknowledged during the 6th World Conference BIOMETRICS'2003 in London, where the poster 'Eye movement tracking for human identification' [22] was awarded the title of 'Best Poster on Technological Advancement', which encourages future effort.
References
1. Daugman, J.G.: High Confidence Visual Recognition of Persons by a Test of Statistical
Independence, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no.
11 (1993) 1148-1160.
2. Ober, J., Hajda, J., Loska, J., Jamnicki, M.: Application of Eye Movement Measuring
System OBER2 to Medicine and Technology. Proceedings of SPIE, Infrared Technology
and applications, Orlando, USA, 3061(1) (1997)
3. Cowen, L., Ball, L.J., Delin, J.: An eye-movement analysis of web-page usability. Chapter
in X. Faulkner, J. Finlay, & F. Détienne (Eds.): People and Computers XVI—Memorable
Yet Invisible: Proceedings of HCI 2002. Springer-Verlag Ltd, London (2002)
4. Mast, F.W., Kosslyn, S.M.: Eye movements during visual mental imagery. TRENDS in Cognitive Sciences, Vol. 6, No. 7 (2002)
5. Javal, É.: Physiologie de la lecture et de l’écriture. Paris: Félix Alcan (1905)
6. Huey, E.B.: The Psychology and Pedagogy of Reading. With a Review of the History of
Reading and Writing and of Methods, Texts, and Hygiene in Reading. New York:
Macmillan (1908)
7. Duchowski, A. T.: A Breadth-First Survey of Eye Tracking Applications. Behavior
Research Methods, Instruments & Computers (BRMIC), 34(4) (2002) 455-470
8. Josephson, S., Holmes, M. E.: Visual Attention to Repeated Internet Images: Testing the Scanpath Theory on the World Wide Web. Proceedings of the Eye Tracking Research & Application Symposium 2002, New Orleans, Louisiana (2002) 43-49
9. Schiessl, M., Duda, S., Thölke, A. & Fischer R.: Eye tracking and its application in usability
and media research. MMI-Interaktiv, No.6, (2003) 41-50
10. Noton, D., Stark, L. W.: Scanpaths in eye movements during pattern perception. Science, 171 (1971) 308-311
11. Hornof, A. J., Halverson, T.: Cleaning up systematic error in eye tracking data by using required fixation locations. Behavior Research Methods, Instruments, and Computers, 34(4) (2002) 592-604
12. Kopiez, R., Galley, N.: The Musicians' Glance: A Pilot Study Comparing Eye Movement Parameters in Musicians and Non-Musicians. Proceedings of the 7th International Conference on Music Perception and Cognition, Sydney (2002)
13. Vatikiotis-Bateson, E., Eigsti, I.M., Yano, S., Munhall, K.: Eye movement of perceivers during audiovisual speech perception. Perception and Psychophysics, 60(6) (1998) 926-940
14. Henderson, J. M., Hollingworth, A.: Eye Movements and Visual Memory: Detecting Changes to Saccade Targets in Scenes. Michigan State University, Visual Cognition Lab, Rye Lab Technical Report No. 2001, 3 (2001)
15. Campbell, C. S., Maglio, P. P.: A robust algorithm for reading detection. Proceedings of the ACM Workshop on Perceptual User Interfaces (2002)
16. Engbert, R., Longtin, A., Kliegl, R.: Complexity of eye movements in reading. International Journal of Bifurcation and Chaos (in press)
17. Loève, M. M.: Probability Theory. Princeton, NJ: Van Nostrand (1955)
18. Quinlan, J. R.: C4.5: Programs for Machine Learning. San Mateo: Morgan Kaufmann (1993)
19. Cortes, C., Vapnik, V.: Support Vector Networks. Machine Learning, 20 (1995)
20. Witten, I. H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann (1999)
21. Rabiner, L. R., Schafer, R. W.: Digital Processing of Speech Signals. Prentice Hall, Englewood Cliffs, NJ (1978)
22. Kasprowski, P., Ober, J.: Eye movement tracking for human identification. 6th World Conference BIOMETRICS'2003, London (2003)