NON-INTRUSIVE INFRARED-FREE EYE TRACKING METHOD
Bartosz Kunka, Bozena Kostek
Gdansk University of Technology, Multimedia Systems Department, Gdansk, Poland,
e-mail: kuneck@sound.eti.pg.gda.pl
e-mail: bozenka@sound.eti.pg.gda.pl
Abstract: This paper presents an eye tracking technique
based on visible light. The approach does not require the
additional hardware used in infrared eye tracking
systems. First, examples of existing eye tracking
techniques are presented. Then, the proposed image
processing algorithm and the process of determining the
eye position are described. The engineered eye tracking
application was tested and the results of these tests are
presented.
1 Introduction
The development of computer technology enables an
intelligent analysis of images delivered by a camera
connected to the computer. It may be said that machines
can see. Obviously, this skill depends on the complexity
and the advancement of the implemented image
processing algorithms. Computer vision implies
analyzing, evaluating and interpreting data. Knowing the
gaze direction could provide very useful information. The
technique used to carry out this kind of analysis is called
gaze tracking. Estimating where a person is looking,
however, relies in fact on eye tracking. Typically, eye
tracking approaches rely upon infrared illumination.
The paper presents an eye tracking technique that is an
alternative to infrared illumination and is based on the
analysis of the visible light spectrum. Eye trackers working
in visible light are often called infrared-free (IR-free) eye
trackers. The IR-free eye tracking system engineered at the
Multimedia Systems Department of Gdansk University of
Technology (GUT) was implemented on the Macintosh
platform running Mac OS X. The application itself is not
compatible with PC-based platforms; however, the
proposed algorithms work on any type of computer.
There are many potential applications of eye tracking
systems. Many commercial gaze tracking systems exist
today, and they achieve very high levels of accuracy.
However, they are still very expensive, which makes
this technique unsuitable for many potential users.
Moreover, they perform poorly outdoors due to the
presence of ambient infrared light. The technique
described in this paper could be regarded as an alternative,
inexpensive approach to eye tracking.
2 Review of eye tracking techniques
Systems that track the gaze point in order to determine
where or what a person is looking at were already being
researched at the end of the 19th century. These first
attempts concerned reading and relied on direct
observation of the reader's eye [7]. The field was
explored continuously during the 20th century. Until
the beginning of the 21st century, eye tracking was
regarded as an area of scientific research rather than
a matter for the general public. Only in recent years have
eye tracking technologies become more precise, and as a
result eye tracking systems were rediscovered as very good
candidates for an affordable means of so-called
Alternative and Augmentative Communication (AAC). It
is worth mentioning that disabled people (particularly
paralyzed persons) could find in this emerging technology
a very simple way to enhance the quality of their
communication with the outside world.
As a result of the continuous research in this field, a
considerable number of technologies and methods have been
developed and applied to track eye movements and the gaze
point. The most important technologies are based on
infrared illumination and on visible light (IR-free).
2.1 Infrared eye trackers
Eye tracking systems based on IR illumination are
often used in head-mounted and hands-free interfaces.
In order to estimate the fixation point, the eye is
illuminated by infrared diode light, which is invisible to the
user and does not disturb his/her interaction with the
computer. The IR sources, properly installed on the
camera, produce unique reflections on the user's eye
(so-called 'glints'). The IR illumination eliminates unwanted
artifacts in the eye image [6]. Another advantage of using
IR light is that the strongest feature in the image is the
contour of the pupil rather than the limbus. This is because
both the sclera and the iris strongly reflect infrared light,
while only the sclera does so in visible light.
Tracking the sharp contour of the pupil instead of the iris
is the better option, since the small size of the pupil makes
it less likely to be occluded by the eyelid. The drawback of
this technique is that daylight can interfere with the system
because it contains a significant IR component.
A dedicated algorithm analyzes the image captured by
the camera in order to detect the IR reflections present on
the eyeball and calculates the coordinates of the point at
which the user is looking [6]. Fig. 1 presents the idea of an
eye tracking system equipped with four groups of IR diodes
located in the corners of the computer display. The
reflections of the IR diodes form a quadrangle on the
cornea. The point of fixation is determined from the
relation between the positions of the 'glints', which are
static, and the pupil center, which moves when the person
changes his/her direction of gaze.
Fig. 1. Projection of the screen corners (A, B, C and D) on
the user's eye (A', B', C' and D') and center of pupil P'
related to the gaze point P [5].
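The relation between the glint quadrangle and the pupil center can be illustrated with a minimal sketch. It assumes, purely for illustration, that the gaze point is obtained by a projective mapping (homography) taking the four glints A', B', C', D' to the screen corners A, B, C, D, listed in the same order; the paper does not state the exact calculation used in [5][6], and the function names below are hypothetical.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate glint configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Projective map (last coefficient fixed to 1) taking four src points to dst."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    return solve_linear(a, b) + [1.0]


def gaze_point(glints, pupil, screen_w, screen_h):
    """Map the pupil center P' to screen coordinates P (assumed mapping).

    glints: image positions of A', B', C', D', assumed to correspond to the
    top-left, top-right, bottom-right and bottom-left screen corners.
    """
    corners = [(0, 0), (screen_w, 0), (screen_w, screen_h), (0, screen_h)]
    h = homography(glints, corners)
    x, y = pupil
    w = h[6] * x + h[7] * y + h[8]
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)
```

With a pupil center in the middle of a square glint pattern, this sketch returns the middle of the screen; a real system would additionally need per-user calibration.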
2.2 Infrared-free eye trackers
Most of the existing eye tracking systems and algorithms
take advantage of IR illumination, since it considerably
increases the achieved accuracy. However, some eye
tracking algorithms do not rely upon IR light. Eye tracking
systems of this kind can work correctly even with a
typical webcam [9].
Blink detection
The blink movement is used as a reference to locate the
eyes in the image. This approach is based on the fact that
an involuntary eye blink is very short, so consecutive
frames captured during the blink will likely be identical
except in the region of the eyelids. It is assumed that the
user cannot avoid this involuntary blinking. Exploiting
these facts, Chau et al. proposed a system that detects
blinks and tracks the eyes, achieving up to 95.3% accuracy
at 30 fps with a cheap USB webcam [4]. Li and Parkhurst
developed a similar system, which became an open-source
software package. The camera in their system is mounted
on the extended arm of a chin rest [9].
Bhaskar et al. describe blink detection using frame
differencing combined with optical flow computation [1].
First, possible motion regions are determined by
computing the difference image of consecutive frames.
If movement is detected, the optical flow is computed
within these regions. Optical flow computation is a
technique that determines the direction of motion of
moving features in an image. Fig. 2 shows two sample
images illustrating the system introduced by Bhaskar et al.
Fig. 2. Blink detection algorithm: a) movement detected
with frame differencing; b) performance of the optical flow
computation [1].
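The frame differencing step can be sketched as follows. This is a simplified illustration and not the implementation of Bhaskar et al.: frames are assumed to be grayscale images stored as lists of rows, the function name and the difference threshold of 25 are hypothetical, and the subsequent optical flow computation is omitted.

```python
def motion_box(prev_frame, curr_frame, thresh=25):
    """Bounding box (x0, y0, x1, y1) of pixels that changed between two frames.

    A pixel counts as moving when its absolute brightness difference exceeds
    thresh (an assumed value). Returns None when no pixel changed.
    """
    xs, ys = [], []
    for y, (row_p, row_c) in enumerate(zip(prev_frame, curr_frame)):
        for x, (p, c) in enumerate(zip(row_p, row_c)):
            if abs(c - p) > thresh:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))
```

During a blink the returned box would be expected to cover the eyelid region, which is where the optical flow would then be evaluated.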
One-circle algorithm
Wang et al. presented an algorithm for estimating the
gaze point from a single image of one eye. The system does
not need IR illumination, but it can be combined with
an IR tracker to increase its accuracy.
The algorithm is based on the observation that, although
the contour of the iris can be approximated by a circle in
3D, the 2D perspective view of the limbus captured by the
camera is an ellipse. The ellipse can then be back-projected
onto the original circle and the orientation of the circle can
be calculated. The gaze, defined as the normal to the iris
circle, can be estimated from this correspondence.
An ellipse can be back-projected into space onto two
circles of different orientations. However, the correct
solution can be identified using an anthropometric
property of the eye. Relying on a simplified eye model,
Wang et al. used the fact that the distance
between one eye corner and the center of the eyeball is the
same as the distance between the other eye corner and the
center of the eyeball. The algorithm is presented in Fig. 3.
First, the ellipse fitting the iris is calculated from the image
of the eye. This ellipse can be interpreted in two different
ways as the 2D projection of a 3D circle. Each circle,
representing a candidate iris contour, has a corresponding
sphere representing the eyeball, according to
the eye model. The distances between the eye corners A
and B and the center of each sphere are compared. Knowing
that both distances must be equal, the correct solution can
be identified, and the gaze direction is taken as the
optical axis of the eye (the line passing through the center
of the eyeball and the center of the iris) [12].
Fig. 3. One-circle algorithm [11].
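The disambiguation step can be sketched as below. The sketch assumes that the two candidate eyeball centers and the 3D positions of the eye corners A and B have already been computed (the back-projection of the ellipse itself is not shown), and all function names are hypothetical.

```python
import math


def pick_eyeball_center(candidates, corner_a, corner_b):
    """Return the candidate center most nearly equidistant from both corners."""
    return min(
        candidates,
        key=lambda c: abs(math.dist(c, corner_a) - math.dist(c, corner_b)),
    )


def gaze_direction(eyeball_center, iris_center):
    """Unit vector along the optical axis, from eyeball center to iris center."""
    d = [i - e for e, i in zip(eyeball_center, iris_center)]
    norm = math.sqrt(sum(v * v for v in d))
    return tuple(v / norm for v in d)
```

In practice the two distances will never be exactly equal because of measurement noise, which is why the sketch selects the candidate with the smaller difference instead of testing for equality.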
3 Eyes and characteristic points tracking
This section presents the main steps of the algorithm
underlying the developed IR-free eye tracker. They are as
follows: localization of the eye regions, exact determination
of the limbus (the contour between the iris and the sclera)
[12] and finally the localization of the eye corners.
3.1 Localization of the eye regions
The first step of gaze detection is to locate the eye
regions in order to extract the necessary features. After
investigating various approaches to this task,
it was decided to employ a Haar cascade object detector.
The Haar cascade was trained with eye images to locate
the eyes directly. Freely available detectors for the right
and left eyes, trained by Castrillon-Santana with 7000
positive samples each, were used [3].
Fig. 4. Face detection with Haar cascade.
Once the face is correctly detected, the eye detector is
applied. To avoid unnecessary computation, it is not
applied to the whole face. Given the general composition
of the face, it can be assumed that the eyes are always
situated within the same region: when a face is divided
into four horizontal stripes of equal height, the eyes are
likely to be found in the second stripe from the top.
Fig. 5. Face and eye detection.
Going a step further, this horizontal stripe is
divided into two equal parts, corresponding to the right and
left eye. Finally, the right and left eye detectors are applied
to these two well-defined eye regions.
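The restriction of the search area can be sketched as follows. The face rectangle is assumed to come from a face detector as (x, y, width, height) in pixels; the function name is hypothetical and integer division is an assumption about rounding.

```python
def eye_search_regions(face):
    """Split the second horizontal stripe of a face box into two halves.

    face: (x, y, w, h) rectangle of the detected face.
    Returns the image-left and image-right rectangles, each as (x, y, w, h).
    """
    x, y, w, h = face
    stripe_h = h // 4          # four stripes of equal height
    top = y + stripe_h         # second stripe from the top
    half = w // 2
    left_half = (x, top, half, stripe_h)
    right_half = (x + half, top, w - half, stripe_h)
    return left_half, right_half
```

Which half contains the user's right or left eye depends on whether the camera image is mirrored, so the sketch only labels the halves by their position in the image.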
3.2 Iris detection
Once the eye regions have been localized, the eye
features can be detected. The next step of the algorithm is
to determine the center and the radius of the contour
between the iris and the sclera, assuming that it is
perfectly circular. This assumption is in fact very
accurate, particularly when the face of the user is directly
in front of the camera. However, when the eye is not
exactly in front of the camera, the image of the limbus
becomes an ellipse because of the projection of the 3D
circular shape onto the 2D image. Although this effect is of
little importance for the assumed purposes, accounting for
it could be a goal for future improvement of the system.
Furthermore, some eye trackers specifically take
advantage of this observation to estimate the gaze
direction.
Preprocessing of the eye image
First of all, the histogram of the eye image is equalized.
This operation increases the contrast of the image.
Furthermore, histogram equalization makes the system
less sensitive to lighting conditions and therefore improves
its accuracy. The next operation thresholds and binarizes
the image. It generates a new image in which a pixel is
white when the corresponding pixel in the input image has
a brightness lower than the threshold value. In the
experiments a value of 50 was used. This threshold was
found empirically. A further advantage of histogram
equalization is that there is no need to change the threshold
value dynamically. The next preprocessing step is a
morphological opening operation, which removes artifacts.
In the last step, the edges are extracted with the Canny
filter. Fig. 6 presents the whole process.
Fig. 6. Steps of preprocessing procedure: (1) original
image; (2) equalizing the histogram; (3) setting a
threshold; (4) artifacts removal; (5) edge map (Canny
filter).
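Steps (2) and (3) of the procedure can be sketched in plain Python as shown below. The image is assumed to be an 8-bit grayscale image stored as a list of rows; the function names are hypothetical, the equalization uses the standard cumulative-histogram mapping, and the opening and Canny steps are left out.

```python
def equalize(img):
    """Histogram-equalize an 8-bit grayscale image given as a list of rows."""
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:       # single gray level: nothing to stretch
        return [row[:] for row in img]
    lut = [max(0, round((c - cdf_min) * 255 / (total - cdf_min))) for c in cdf]
    return [[lut[v] for v in row] for row in img]


def binarize_dark(img, thresh=50):
    """White (255) where brightness is below thresh, black (0) elsewhere."""
    return [[255 if v < thresh else 0 for v in row] for row in img]
```

Because the threshold is applied after equalization, the darkest pixels of the eye region (typically the iris and pupil) map to white regardless of the overall brightness of the frame, which is consistent with the fixed threshold of 50 reported above.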
After the edge map is obtained, the circular Hough
transform can be calculated. The range of radii is estimated
from the information stored from the previously analyzed
camera frame. If the iris had a radius of R in that frame, the
circular Hough transform is iterated over the radius
range from R−2 to R+2. If this information is not
available, the radius is searched between 1/9 and 1/5
of the eye region width. These values were found
empirically.
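The choice of the radius range can be written as a short helper. The function name is hypothetical, and rounding the fractions of the eye region width down to whole pixels is an assumption.

```python
def radius_range(eye_width, prev_radius=None):
    """Inclusive (min, max) iris radius, in pixels, to try in the CHT."""
    if prev_radius is not None:
        # Radius known from the previous frame: search R-2 .. R+2.
        return max(1, prev_radius - 2), prev_radius + 2
    # No previous frame: between 1/9 and 1/5 of the eye region width.
    return eye_width // 9, eye_width // 5
```

For an eye region 90 pixels wide, the initial search therefore covers radii from 10 to 18 pixels, and narrows to five radii once an iris has been found.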
Circular Hough transform (CHT)
The CHT algorithm was implemented to detect the
limbus. There are many examples in the literature showing
that this method is very suitable for this purpose [8][10].
The edge map and circles of several radii are needed in
order to find the limbus. Each pixel of the contour
determined by the Canny filter is assumed to lie on the
searched circle. For a given radius R, the possible centers
of circles passing through such a pixel form a circle of
radius R centered at that pixel. The process of marking
these points can be thought of as a “vote” by the pixel (x,y)
for a series of circles. A high number of votes indicates
the center position of the searched circle. The main
principles of this method are shown in Fig. 7.
If the radius is not known, the radius that yields the
maximum number of votes at the center of the iris can be
used. Fig. 8 presents the process of searching for the
optimal radius.
Fig. 9 shows the detected iris marked by its contour. The
best fitting circle in this case has a radius of 19 pixels.
The visualization of the CHT for this case is presented in
Fig. 8c.
Fig. 7. The main principles of the circular Hough
Transform.
Fig. 8. Iteration of the CHT within a range of several
possible radii: a) 17, b) 18, c) 19 and d) 20 pixels.
Fig. 9. Iris detection: a) edge map; b) iris contour found in
the eye image.
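The voting scheme of the CHT, iterated over a range of radii, can be sketched as follows. This is a minimal, unoptimized illustration and not the implementation used in the described system: edge pixels are assumed to be given as (x, y) tuples, the function name is hypothetical, and the angular sampling of 360 steps is an assumption.

```python
import math


def hough_circle(edges, radii, steps=360):
    """Return (votes, cx, cy, r) of the circle supported by most edge pixels.

    edges: iterable of (x, y) edge pixel coordinates.
    radii: iterable of candidate radii in pixels.
    """
    edges = list(edges)
    best = (0, None, None, None)
    for r in radii:
        acc = {}
        for x, y in edges:
            voted = set()      # each edge pixel votes once per candidate center
            for k in range(steps):
                t = 2 * math.pi * k / steps
                c = (round(x - r * math.cos(t)), round(y - r * math.sin(t)))
                if c not in voted:
                    voted.add(c)
                    acc[c] = acc.get(c, 0) + 1
        if not acc:
            continue
        (cx, cy), votes = max(acc.items(), key=lambda kv: kv[1])
        if votes > best[0]:
            best = (votes, cx, cy, r)
    return best
```

Comparing the peak vote count across radii mirrors the search illustrated in Fig. 8, where the accumulator is sharpest for the radius that matches the limbus.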
3.3 Eye corners localization
Eye tracking based on video image processing requires
detecting some reference points. In IR eye tracking
systems the reflections of IR light on the cornea are
theoretically stationary, and are therefore used as reference
points. In color images without the characteristic IR
reflections, other points must be used as a reference. It was
assumed that eye corners detected in the image can fulfill
this role. Although some literature exists on
this subject, detecting the eye corners remains a difficult
task [11]. One of the reasons is that the eye corners can
be defined less precisely than other features, e.g. the
center of the iris. The main principle used in the research is
shown in Fig. 10. The region of interest (ROI) of the eye is
the area determined in the earlier step.
Fig. 10. Regions of interest of eye corners
The eye corners are expected to be found in two regions
of the eye ROI. The left and right parts of the eye ROI
are selected, each starting at one end of the iris
contour and extending to the limit of the eye ROI. These
ROIs are shown in Fig. 10. It was observed that the
variance of the pixel brightness in both eye corner
ROIs is of great importance. Scanning the pixels
in these regions vertically in the grayscale image shows
that close to the iris the variance tends to be very high,
since the scan covers regions of skin (mid-gray pixels),
eyelashes (dark pixels) and sclera (light pixels). Scanning
the pixels close to the outer limit of the eye ROI results in
a very low variance of brightness, because the skin pixels
are all of a similar mid-gray level.
However, it is necessary to know at every moment which
eye (right or left) is being analyzed, because whether the
left or right corner is the inner or the outer corner depends
on the eye. The principle of eye corner detection is
presented in Fig. 11.
Fig. 11. Eye corners detection principle.
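The variance-based scan can be sketched as below. The paper does not give the exact decision rule, so the rule used here is an assumption made only for illustration: scanning outward from the iris side, the corner is placed at the first column whose brightness variance falls to a fixed fraction of the largest column variance. The function names and the ratio of 0.2 are hypothetical.

```python
from statistics import pvariance


def column_variances(roi):
    """Brightness variance of every pixel column of a grayscale ROI."""
    return [pvariance(col) for col in zip(*roi)]


def corner_column(roi, iris_on_left=True, ratio=0.2):
    """Index of the assumed corner column inside an eye corner ROI.

    roi: grayscale ROI as a list of rows.
    iris_on_left: True when the iris borders the left edge of the ROI.
    ratio: assumed fraction of the maximum variance used as the limit.
    """
    var = column_variances(roi)
    limit = ratio * max(var)
    order = range(len(var)) if iris_on_left else range(len(var) - 1, -1, -1)
    for x in order:
        if var[x] <= limit:
            return x
    return None
```

The `iris_on_left` flag reflects the remark above that the scan direction depends on which eye, and which of its corners, is being analyzed.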
The experiments showed that the method based on
variances returns good results for the outer eye corners,
but not for the inner ones. Some examples of the complete
detection performed by the developed eye tracker are
shown in Fig. 12. In Fig. 12b it can be observed that the
inner eye corners are detected incorrectly.
Fig. 12. Examples of detection results: a) part of the frame
with found irises; b) part of the frame with detected eye
corners.
5 Concluding remarks
The most important feature of the engineered system is
that it uses a simple laptop with no added hardware.
The eye tracking systems available on the market are very
complex from the hardware point of view, which makes
them expensive and unsuitable for many potential users.
Another important point is that the eye tracker
works without requiring the user to keep the
head in the middle of the screen. In fact, the eye
tracking system allows the user to move across
the field of view of the camera.
The engineered application enables a very reliable
localization of the eye regions and a relatively accurate
detection of the iris contour. Although the outer eye
corners were detected correctly, the method still needs
improvement, especially for inner corner detection.
Nevertheless, the goal of implementing an IR-free
eye tracker was achieved, because it shows that it is
possible to develop an eye tracking system without
using infrared diodes.
The presented approach, namely the
detection of the eye corners, deserves closer attention in
order to obtain a reliable reference point for the system.
Therefore, research in this field should continue in this
direction, focusing on accuracy and at the same time on
the simplicity of the system.
Acknowledgements
The authors are grateful to Robert Branchat i Freixa, whose
research provided them with many valuable conclusions.
References
[1] T.N. Bhaskar, F.T. Keat, S. Ranganath, Y.V.
Venkatesh. Blink detection and eye tracking for eye
localization. TENCON 2003. Conference on Convergent
Technologies for Asia-Pacific Region, 2:821–824,
October 2003.
[2] R. Branchat, Development of an eye tracking system
with an emphasis on calibration procedures. M.Sc.
Thesis, supervisor: B.Kostek, consultant: B.Kunka.
Multimedia Systems Department, Gdansk University of
Technology, Poland 2009.
[3] M. Castrillon-Santana et al. Real-time detection of
multiple faces at different resolutions in video streams.
Journal of Visual Communication and Image
Representation, 18(2):130–140, April 2007.
[4] M. Chau, M. Betke. Real time eye tracking and blink
detection with USB cameras. Computer Science
Department, Boston University, May 2005.
[5] A. Czyzewski, B. Kunka, M. Kurkowski, R. Branchat.
Comparison of developed gaze point estimation methods.
Conference on NTAV/SPA 2008, Poznan, Poland,
September 2008.
[6] B. Kunka, B. Kostek, M. Kulesza, P. Szczuko, A.
Czyzewski. Gaze-tracking based audio-visual
correlation analysis employing Quality of Experience
Methodology, Intelligent Decision Technologies Journal,
Greece, 2009 (in review).
[7] M.F. Land. Eye movements in daily life. In The visual
neurosciences [13], Chapter 91, 1357–1368.
[8] D.B.B. Liang, L. K. Houi. Non-intrusive eye gaze
direction tracking using color segmentation and Hough
transform. International Symposium on Communications
and Information Technologies, 602–607, 2007.
[9] D. Li and D. Parkhurst. Open-Source Software for
Real-Time Visible-Spectrum Eye Tracking. 2nd
Conference on Communication by Gaze Interaction –
COGAIN 2006: Gazing into the Future. Turin, Italy,
Sept. 4-5, 2005.
[10] K. Toennies, F. Behrens, M. Aurnhammer. Feasibility
of hough-transform-based iris localization for real-time-
application. In 16th International Conference on Pattern
Recognition, 2002. Proceedings, vol. 2, 1053–1056,
2002.
[11] V.I. Uzunova. An eyelids and eye corners detection
and tracking method for rapid iris tracking. Master's
thesis, Otto-von-Guericke Universitaet, Magdeburg,
August 2005.
[12] J.G. Wang, E. Sung, and R. Venkateswarlu. Eye gaze
estimation from a single image of one eye. Computer
Vision, 2003. Proceedings. Ninth IEEE International
Conference on, vol. 1, 136–143, October 2003.
[13] J.S. Werner, L.M. Chalupa. The visual neurosciences.
MIT Press, Cambridge, Mass., 2004.