Implicit Calibration Using Predicted Gaze Targets
Pawel Kasprowski
Silesian University of Technology
Katarzyna Harezlak
Silesian University of Technology
The paper presents the algorithm supporting an implicit calibration
of eye movement recordings. The algorithm does not require any
explicit cooperation from users, yet it uses only information about
a stimulus and an uncalibrated eye tracker output. On the basis of
this data, probable fixation locations are calculated at first. Such a
fixation set is used as an input to the genetic algorithm, whose task is to choose the most probable targets. Both pieces of information can serve
to calibrate an eye tracker. The main advantage of the algorithm is
that it is general enough to be used for almost any stimulation. It
was confirmed by results obtained for a very dynamic stimulation
which was a shooting game. Using the calibration function built by
the algorithm it was possible to predict where a user will click with
a mouse. The accuracy of the prediction was about 75%.
Keywords: eye tracker, calibration, genetic algorithm
Concepts: Human-centered computing → Interaction techniques
1 Introduction
With the increasing availability of low-cost eye trackers, using eye tracking in the wild by inexperienced and unsupervised users has become possible. However, in order to take full advantage
of data recorded by such a device, in most cases a prior calibration is
required. It is a cumbersome and unnatural process and may be considered one of the main obstacles to the wider adoption of eye tracking as an enhancement for human-computer interfaces. Therefore, some effort has been made to omit or simplify this process
and some methods have been developed (see [Brolly and Mulligan
2004], [Villanueva and Cabeza 2008] or [Hansen et al. 2010]) but
they usually require complicated hardware setups with more than one light source and camera.
The research presented in this paper aims at calibrating a device without any explicit cooperation from users. In such a setup a system builds a calibration function using information obtained from
an eye tracker and from elements of an interface. Contrary to the
previously mentioned studies, the tests presented in the paper used a
popular, cheap, off-the-shelf eye tracker without any additional hardware.
Of course, such a calibration is possible only when the system has some knowledge about the area where a person is supposed to look at a specified moment. Having this information, it is possible to pair
© 2016 ACM. ETRA'16, March 14-17, 2016, Charleston, SC, USA. ISBN: 978-1-4503-4125-7/16/03
data obtained from an eye tracker and a predicted fixation location
and use it in a way identical to a classic calibration. Such an on-the-fly calibration was called the implicit calibration, because it is realized during normal interface usage without any user's involvement.
This paper presents some theoretical background for the implicit
calibration and describes the algorithm designed to perform such a
calibration. The algorithm was checked using a specially designed
experiment, during which participants were playing a simple game.
2 Model of implicit calibration
An output from an eye tracker may be represented by a sequence of points (e_1 ... e_N). There are eye trackers which can work without any calibration; however, some of them have to be initialized by a prior calibration procedure to start any signal registration. In this research it was assumed that such a device was calibrated by one person and then used by other users. Thus, in both the aforementioned cases we will call the points e_i the uncalibrated output.
Additionally, there is information about the stimulation that was observed by a person when the uncalibrated eye positions were registered. The task is to build a function that correctly transforms uncalibrated eye tracker data e_i to gaze points g_i that reference objects of the stimulation. To be able to build such a function, some knowledge about where a person was looking at a specified time is required. In a traditional explicit calibration a person is forced to look at defined points, whereas for the implicit calibration it is necessary to find such points by analyzing both the stimulation and the eye tracker data.
Let us assume that there are N timestamps during the presentation of a stimulation (called screens later) for which k possible fixation locations (targets) can be estimated. The number of targets (k) may be different for every screen, as presented in Figure 1.
Figure 1: Screens with targets. The figure shows subsequent screen frames with several targets on each of them. The horizontal axis represents time.
Additionally, there is also an eye tracker output (e_i) available for every screen. The task for the algorithm is to choose a target t_i for every screen s_i and pair it with e_i. The sequence of such pairs [t_i, e_i] may then be used to build a calibration function in a way identical to the explicit calibration. The calibration function CalFun(e) maps every eye tracker output (e_i) to a correct gaze point (g_i).
g_i(x, y) = CalFun(e_i(x, y))    (1)
This is a pre-print. The final version of the paper is available at ACM Digital Library.
The method used in this study is based on the idea of RFLs (Re-
quired Fixation Locations) that was introduced in [Hornof and
Halverson 2002]. RFLs are objects on a screen at which a partic-
ipant must look in order to accomplish a task. In the aforemen-
tioned work they were determined based on locations of mouse
clicks (with an assumption that people look where they click) and
were used to check the calibration quality and invoke recalibration
if necessary.
The case when there is more than one possible object (target) on a screen has been studied in [Zhang and Hornof 2011] and [Vadillo
et al. 2014]. Instead of RFLs, so-called PFLs (Probable Fixation
Locations) were used to improve calibration. For every screen they
automatically chose the closest target as the probable gaze location.
Both RFLs and PFLs have been used in [Zhang and Hornof 2014].
Similarly to [Vadillo et al. 2014], PFL was chosen as a target closest
to the eye tracker output. RFLs, as more reliable, were weighted ten times higher than the PFLs while creating a recalibration function.
All these approaches aimed at improving a calibration model,
which was calculated for the same person at the beginning of an
experiment. During our studies we have found that, when an eye
tracker is calibrated for one person, a similar technique may be used
to recalibrate it for another one. This idea was checked during the
experimental part of our research. The basic task is to choose a sequence of targets t_i^j (where i = 1, 2, ..., N is a screen number and j is the index of the target selected for the i-th screen) representing genuine targets of the user's gazes and utilize it to build a calibration
model. The main problem is how to find an appropriate sequence
among all the possibilities. Intuitively, this evaluation may be based
on the quality of the model obtained for some genuine data, for
which correct gaze values are known. However, such a solution
is not feasible when data of that type is not available. This was
a motivating factor to undertake studies on developing a method for determining a correct sequence of targets without any knowledge
about the true gaze locations.
3 Sequence of targets evaluation
Every sequence of targets and the corresponding eye tracker points [t_i, e_i] may be used to build a calibration model. Then, it is possible to evaluate the quality (fitness) of this model. One of the most popular measures of a model's fitness is the coefficient of determination R². Having a set of M reference genuine gaze points g_r(1), g_r(2), ..., g_r(M) and a corresponding eye tracker output e_r(1), e_r(2), ..., e_r(M), the model quality may be calculated by a comparison of g_r and the model output g_m = CalFun(e_r) (see equation (1)).
R²_g(g_r, g_m) = 1 − [ Σ_{i=1}^{M} (g_r(i) − g_m(i))² ] / [ Σ_{i=1}^{M} (g_r(i) − ḡ_r)² ]    (2)

where ḡ_r is the average of all reference values g_r(i). R²_g is equal to 1 when the model fits perfectly, i.e. every g_m(i) point is exactly at the same location as the corresponding g_r(i) point.
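For illustration, equation (2) may be computed as in the following minimal Python sketch (not part of the original implementation); the values are invented.

```python
def r_squared(reference, predicted):
    """Coefficient of determination (equation 2): 1 minus the ratio of the
    residual sum of squares to the total sum of squares."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot

# A perfect model yields R^2 = 1; deviations lower the score.
perfect = r_squared([100, 200, 300], [100, 200, 300])   # 1.0
noisy = r_squared([100, 200, 300], [110, 190, 320])     # 0.97
```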
When reference gaze points g_r are not available, the only possible way of evaluation is to calculate the model fitness for the data that was used to build it (in our case the chosen targets' positions t(i)). In such a case the coefficient of determination may be calculated as:

R²_m(t, g) = 1 − [ Σ_{i=1}^{N} (t(i) − g(i))² ] / [ Σ_{i=1}^{N} (t(i) − t̄)² ]    (3)

where t̄ is the mean value of all chosen targets and g(i) = CalFun(e(i)) is the gaze value for a screen i with a target t(i), calculated using the chosen model. R²_m is equal to 1 when the model fits perfectly (i.e. every g(i) point is exactly at the same location as the corresponding target t(i)).
It is important to emphasize that R²_m as calculated in equation (3) measures only the model fitness and gives no clue whether the targets taken to calculate this model are the genuine targets.
4 Finding the best sequence
The search for the best sequence of targets may be treated as an optimization task, in which the best solution is sought among many possibilities. A cost function is used to evaluate every solution (a
sequence of targets in our case). The number of possible sequences is j_1 · j_2 · ... · j_N, where j_x is the number of targets on the x-th screen. So, even for only 10 screens and 6 targets per screen there are 6^10 possibilities, more than 60 million. Therefore, it is not feasible to simply check all sequences. It is necessary to use some heuristic that tries to find a "good" sequence. Among a plethora of
optimization algorithms which may be used in this case, including genetic algorithms, ant colony optimization and simulated annealing, the genetic one was chosen for the research purposes. Such an algorithm takes an
initial population of candidate solutions and then tries to find better
solutions by modifying the current ones using different operations (this process is called evolution). Our implementation started with a crossover operator (applied in 35% of operations per generation) followed by a mutation with probability 1/12. Every solution, called a chromosome,
consists of genes. In our case, the i-th gene was the index of the target chosen for the i-th screen. A chromosome represented a sequence of targets, one from each screen. Therefore, the chromosome's length
was equal to the number of screens.
A criterion for a chromosome optimization is a value of a cost func-
tion calculated for the chromosome. During the evolution chromo-
somes with higher function values are preferred. As a result, after some number of iterations (generations), the chromosome with the highest cost function value is obtained. Naturally, it may not be the best possible chromosome; it is just the best one found.
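For illustration, the genetic search over target sequences may be sketched as follows. This is a toy Python version with made-up, scaled-down parameters, not the authors' implementation; the paper itself reports 1000 chromosomes evolved over 1000 generations, a 35% crossover rate, and a 1/12 mutation probability.

```python
import random

def evolve(num_screens, targets_per_screen, fitness,
           pop_size=50, generations=100,
           crossover_rate=0.35, mutation_rate=1/12):
    """Toy genetic search over target sequences. A chromosome is a list of
    target indices, one gene per screen. Population size and generation
    count are deliberately small here for illustration."""
    pop = [[random.randrange(targets_per_screen) for _ in range(num_screens)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)        # higher fitness preferred
        survivors = pop[:pop_size // 2]            # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            if random.random() < crossover_rate:   # one-point crossover
                cut = random.randrange(1, num_screens)
                child = a[:cut] + b[cut:]
            else:
                child = a[:]
            child = [random.randrange(targets_per_screen)
                     if random.random() < mutation_rate else g
                     for g in child]               # per-gene mutation
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

# Example: a fitness that rewards matching a hidden "genuine" sequence.
random.seed(1)
genuine = [2, 0, 1, 3, 1, 0, 2, 3, 0, 1]
best = evolve(10, 4, lambda c: sum(g == t for g, t in zip(c, genuine)))
```

In the actual algorithm the fitness of a chromosome is, of course, not a comparison with a known sequence but the cost function built from the calibration model, as described in the next paragraphs.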
To evaluate the cost function value for a sequence (chromosome), the sequence was first used to prepare a calibration model from the [t_i, e_i] pairs. Linear regression with the Levenberg-Marquardt algorithm was used to prepare the model separately for the vertical and horizontal directions:

CalFun_x(e) = A_x e_x + B_x e_y + C_x    (4)

CalFun_y(e) = A_y e_x + B_y e_y + C_y    (5)
The created model may be subsequently used to calculate a gaze
point gibased on eye tracker data ei(equation 1).
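The linear model of equations (4) and (5) can be sketched in Python as follows. This is an illustrative stand-in, not the authors' code: it fits the coefficients by ordinary least squares, which for this linear model yields the same solution as the Levenberg-Marquardt fit used in the paper; all data values are invented.

```python
import numpy as np

def fit_calibration(eye_points, targets):
    """Fit g = A*e_x + B*e_y + C per axis by ordinary least squares
    (equivalent to Levenberg-Marquardt for a purely linear model)."""
    e = np.asarray(eye_points, dtype=float)
    t = np.asarray(targets, dtype=float)
    design = np.column_stack([e[:, 0], e[:, 1], np.ones(len(e))])
    coeffs_x, *_ = np.linalg.lstsq(design, t[:, 0], rcond=None)
    coeffs_y, *_ = np.linalg.lstsq(design, t[:, 1], rcond=None)
    return coeffs_x, coeffs_y

def apply_calibration(coeffs_x, coeffs_y, eye_point):
    """Equation (1): map an uncalibrated point to a gaze point."""
    ex, ey = eye_point
    features = np.array([ex, ey, 1.0])
    return float(features @ coeffs_x), float(features @ coeffs_y)

# Synthetic check: recover a known affine mapping from noiseless pairs.
true_map = lambda ex, ey: (2 * ex + 0.1 * ey + 50, 0.2 * ex + 1.5 * ey + 30)
eyes = [(10, 20), (200, 40), (80, 300), (150, 160), (30, 90)]
gazes = [true_map(ex, ey) for ex, ey in eyes]
cx, cy = fit_calibration(eyes, gazes)
gx, gy = apply_calibration(cx, cy, (100, 100))  # expect about (260.0, 200.0)
```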
Two different cost functions were used. The first one evaluated the model generated for the sequence (chromosome) using genuine reference gaze points r(1), r(2), ..., r(M) and the corresponding eye tracker output e_r(1), e_r(2), ..., e_r(M). The model quality was assessed by a comparison of r and the model output g_r = CalFun(e_r). Therefore, we had

GenFunction(r, g_r) = (R²_gx(r, g_r) · R²_gy(r, g_r))²    (6)

where R²_gx(r, g_r) and R²_gy(r, g_r) are coefficients of determination calculated using equation (2), separately for the horizontal and vertical axes. The best sequence found utilizing this function may be
considered the correct one as the optimization algorithm uses the
correct gaze points for the evaluation. Therefore this sequence is
called the genuine sequence in the subsequent text.
The second cost function did not use any reference points and took into account only pairs of t and the corresponding e. In other words, we had

FitFunction(t, g) = (R²_mx(t, g) · R²_my(t, g))²    (7)

where R²_mx(t, g) and R²_my(t, g) are coefficients of determination calculated using equation (3), separately for the horizontal and vertical axes. The best chromosome (sequence) found for this function is
called the fittest sequence in the subsequent text.
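Reading the juxtaposed per-axis coefficients in equation (7) as a product (an assumption, since the extracted formula lost its operator), the fitness of a candidate sequence could be sketched as:

```python
def r_squared(reference, predicted):
    """Coefficient of determination: 1 minus residual sum of squares
    over total sum of squares (equations (2)/(3))."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot

def fit_function(targets, gazes):
    """Equation (7): combine the per-axis coefficients of determination
    between chosen targets and model output (product assumed)."""
    r2x = r_squared([t[0] for t in targets], [g[0] for g in gazes])
    r2y = r_squared([t[1] for t in targets], [g[1] for g in gazes])
    return (r2x * r2y) ** 2

# A perfect model (gaze output equal to the chosen targets) scores 1.0.
pts = [(0.0, 0.0), (1.0, 2.0), (2.0, 1.0)]
perfect_fit = fit_function(pts, pts)
```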
Our hypothesis was that a model built using the correct targets should be "easier" to calculate, so we expected that R² for this model should be higher than for models built using improper targets. It means that a sequence determined using FitFunction as the cost function should be close to the appropriate one. This technique is similar to the regression-based approach
presented in [Kasprowski and Harezlak 2015]. During the experi-
mental part of the research this assumption was checked against the
real data.
5 Experiment
To check whether the algorithms described above may be used for
the implicit calibration, an experiment was conducted using the EyeTribe eye tracker, registering data at a 60 Hz sampling rate. Because it is not possible to work with the EyeTribe without a calibration, the device was first calibrated by a person who did not take part in the next steps of the experiment. Then 43 different participants played a simple game during which their eye movements were registered. One game run was called a trial; in this way 43 trials were collected.
The game scenario was as follows. There were two kinds of objects
moving on the screen - the ”good guys” and the ”bad guys”. The
task for a participant was to use a mouse pointer to shoot down
as many bad guys as possible. A hit object disappeared and a new one was created in a random place. There were always about 10 objects visible on the screen. The whole recording lasted 60 seconds and, when the game finished, a score was calculated, taking into account the number of killed bad guys and good guys as well as the number of bad guys that escaped off the screen. A 26-inch display was used and the approximate distance from the screen was about 50 cm for all participants.
Our fundamental assumption was that participants follow the objects on the screen with their eyes. Thus, for every moment in a trial when the participant's eyes are in a fixation or smooth pursuit, a list of
possible targets may be calculated. This list may be subsequently
used as an input to the sequence finding algorithm.
Another assumption was that people look where they click (as it
was assumed in [Hornof and Halverson 2002]), especially when a
target is small and moving, which was the case during the exper-
iment. So, information about gaze points during the mouse clicks
may be used as a reference to estimate quality of the calibration
model (which did not use this information).
It was expected that about 3600 eye positions should be avail-
able for every trial (for 60Hz frequency and 60 seconds recording).
However, the experiment was conducted ”in the wild” - participants
Table 1: Average errors for the fittest and genuine sequences, calculated for each trial (standard deviations in parentheses).

Direction    Fittest          Genuine
horizontal   88.5 (33.87)     82.3 (34.7)
vertical     106.24 (59.0)    82.56 (41.39)
just came and played the game, and the only initial setup was a preliminary check of whether the participant's eyes were visible to the eye tracker's camera. Therefore, it happened that the eye tracker could not locate the eyes during a trial and was unable to provide data. We decided to exclude from subsequent experiments trials for which fewer than 2300 eye positions were recorded. This resulted in the exclusion of 8 trials; the remaining 35 were used in the further analysis.
6 Data processing
Before the sequence search algorithm was run, some data prepro-
cessing steps were performed, for each trial independently. The first step was the extraction of screens. A screen was defined for every timestamp at 100 ms intervals. For every screen the locations of objects (good and bad guys) were calculated and added as a set of targets. Then eye tracker data before and after a timestamp was used to calculate the value of e_i for a screen. A classic velocity threshold was used to choose only those recordings that belong to fixations or smooth pursuits. Screens for which it was impossible to find at least 10 recordings were removed. Finally, some number (k) of targets (t_i^1 ... t_i^k) and an eye tracker output e_i were defined for each screen i. Then the genetic algorithm, using the cost function defined in equation (7), was used to find the fittest sequence of targets.
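The velocity-threshold filtering step might be sketched as follows. This is an assumption-laden illustration: the paper does not give the threshold value or the exact velocity estimator, so both are hypothetical here.

```python
def low_velocity_samples(samples, dt, velocity_threshold):
    """Keep samples whose point-to-point velocity stays below a threshold,
    i.e. samples likely belonging to fixations or smooth pursuits. The
    threshold value is not given in the paper and must be supplied."""
    kept = []
    for prev, cur in zip(samples, samples[1:]):
        velocity = (((cur[0] - prev[0]) ** 2 +
                     (cur[1] - prev[1]) ** 2) ** 0.5) / dt
        if velocity < velocity_threshold:
            kept.append(cur)
    return kept

# 60 Hz samples: a slow drift (pursuit) interrupted by a rapid jump (saccade).
dt = 1 / 60
samples = [(0, 0), (1, 0), (2, 0), (120, 5), (121, 5)]
kept = low_velocity_samples(samples, dt, velocity_threshold=300)
# The saccadic landing point (120, 5) is discarded; the slow samples remain.
```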
The next step was a search for the genuine sequence. Mouse click locations with timestamps were extracted as reference points (r_i). For each mouse click location, eye tracker measurements close in time and belonging to a fixation or smooth pursuit were utilized to estimate e_i, the eye tracker output for this click. This resulted in a list of genuine pairs [r_i, e_i], which were subsequently used by the cost function to evaluate each sequence of targets according to equation (6). The sequence for which the calibration model
gave the best results was returned by the genetic algorithm and was
treated as the genuine one.
7 Results
The first task of the experimental part was to compare the fittest
sequence found for each trial with its genuine counterpart. If the fittest sequence turned out to be similar to the genuine one, it would indicate that the fitness function (equation (7)) may be used for the sequence optimization and no additional data (like mouse clicks)
is necessary. The absolute error formula was used to calculate the error for the model built using a sequence, taking M click points (r_i) and the corresponding model output (g_i) in both the horizontal and vertical directions into account (equation (8)):

Error(t) = sqrt( (1/M) Σ_{i=1}^{M} (r_i − g_i)² )    (8)

calculated separately for each direction.
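Under the reading of the error formula as a per-axis root-mean-square error (a reconstruction, since the extracted equation is damaged), the computation is straightforward; the sample values below are invented.

```python
def rmse(reference, model_output):
    """Per-axis root-mean-square error between reference click coordinates
    and model output; call once with the horizontal coordinates and once
    with the vertical ones."""
    m = len(reference)
    return (sum((r - g) ** 2
                for r, g in zip(reference, model_output)) / m) ** 0.5

# Horizontal coordinates of three clicks vs. the calibrated gaze output.
horizontal_error = rmse([100, 250, 400], [110, 240, 395])  # about 8.66
```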
The averaged results for all 35 trials are presented in Table 1. Analyzing these results, it may be noticed that the fittest sequence gives higher errors than the genuine one. Such findings were predictable, because the fittest sequence optimization did not use any information about the genuine points. On the other hand, it is visible that
the results for both sequences are comparable. Although for the vertical direction the genuine sequence is significantly better than the fittest one (p=0.023), the difference between these sequences for the horizontal direction is not significant (p=0.22). This leads to the conclusion that the fittest sequence may be used for calibration purposes when obtaining the genuine one is not possible because of a lack of reference points.
7.1 Clicked target prediction
It was also tested whether a model created using the fittest sequence may be applied to predict the next target to be clicked. At first the gaze points g were calculated using the calibration model built for the fittest sequence. Then, for every click c(t) the corresponding gaze point g(t) was found and the target closest to the gaze point was chosen. If the selected target was the same as the one truly clicked, it was treated as a success. This
calculation was repeated for each trial separately; there were on average 82.5 (±18.6) clicks during every trial. The summarized results showed that it was possible to predict targets correctly in 2181 out of 2886 clicks, which gives an accuracy of 75.6% (±15.3% for a single trial).
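The nearest-target decision rule described above is a one-liner; the coordinates below are invented for illustration.

```python
def predict_clicked_target(gaze_point, targets):
    """Predict the clicked object as the target closest (in Euclidean
    distance) to the calibrated gaze point at click time."""
    return min(targets, key=lambda t: (t[0] - gaze_point[0]) ** 2 +
                                      (t[1] - gaze_point[1]) ** 2)

# Roughly ten objects were visible at once in the experiment; four shown here.
targets = [(50, 60), (300, 120), (510, 400), (220, 480)]
predicted = predict_clicked_target((295, 130), targets)  # -> (300, 120)
```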
These satisfactory results were obtained when all screens from a trial were taken into account. Because the target prediction requires some calculation, we decided to check the possibility of improving the performance of this process by reducing the number of screens while maintaining the prediction rate at the same level. During this experiment, only screens from the first X seconds were taken into account while building a calibration model to predict all clicked targets. The results are presented in Figure 2.
Figure 2: Accuracy of clicked target prediction depending on the recording duration. (Horizontal axis: number of seconds, from 1 to 60; vertical axis: accuracy of prediction.)
The results show that after just 20 seconds of recording it is possible to correctly predict, with accuracy over 70%, which target is about to be clicked, and that only 6 seconds are required to obtain a prediction accuracy over 50%.
8 Discussion
The results described earlier are satisfactory; however, the exact numbers achieved are strictly correlated with the experiment scenario. In our case the game was very dynamic, with a lot of short fixations and a lot of clicks (about 1.4 clicks per second on average). Because most targets were moving, there were in fact more smooth pursuits than fixations in the recorded data. This gives a potential possibility to use information about the direction and velocity of the movements to better predict targets, similarly to [Pfeuffer et al. 2013]. Additionally, there were always about 10 targets on every screen. It may be expected that in a less dynamic scenario, with fewer targets to deal with, the results should get better.
On the other hand, the main drawback of the method is that it requires considerably complex computations. For our purpose we need only several seconds of recording, but to choose a correct sequence a genetic algorithm must calculate a lot of models. In our case it was 1000 chromosomes for each generation, multiplied by 1000 evolution steps. It took about 1 minute for recordings lasting 10 seconds and almost 6 minutes for recordings lasting 60 seconds, using our laboratory computer (Intel Xeon 3.1 GHz with 8 GB RAM). Therefore, some more sophisticated heuristics will be examined during our further research to find the best solution faster.
9 Conclusion
The paper describes an algorithm supporting the implicit calibration of eye movement recordings. The algorithm does not require any cooperation from users; it uses only information about the stimulation and an uncalibrated eye tracker output. The correctness of the algorithm was confirmed during experiments involving 35 people. The results obtained showed that it is useful in improving the calibration process. Moreover, the experiments presented in the paper showed that it is possible to obtain meaningful data after only 10 seconds of recording. The main advantage of the algorithm is that it does not require any explicit user feedback. Only the locations of possible targets are needed, so it is quite general and may be used in many eye tracking scenarios.
References

BROLLY, X. L., AND MULLIGAN, J. B. 2004. Implicit calibration of a remote gaze tracker. In Computer Vision and Pattern Recognition Workshop, 2004. CVPRW'04., IEEE.
HANSEN, D. W., AGUSTIN, J. S., AND VILLANUEVA, A. 2010. Homography normalization for robust gaze estimation in uncalibrated setups. In Proceedings of the 2010 Symposium on Eye-Tracking Research & Applications, ACM, 13-20.
HORNOF, A. J., AND HALVERSON, T. 2002. Cleaning up systematic error in eye-tracking data by using required fixation locations. Behavior Research Methods, Instruments, & Computers 34, 4, 592-604.
KASPROWSKI, P., AND HAREZLAK, K. 2015. Using non-calibrated eye movement data to enhance human computer interfaces. In Intelligent Decision Technologies. Springer, 347-356.
PFEUFFER, K., VIDAL, M., TURNER, J., BULLING, A., AND GELLERSEN, H. 2013. Pursuit calibration: Making gaze calibration less tedious and more flexible. In Proceedings of the 26th Annual ACM Symposium on User Interface Software and Technology, ACM, 261-270.
VADILLO, M. A., STREET, C. N. H., BEESLEY, T., AND SHANKS, D. R. 2014. A simple algorithm for the offline recalibration of eye-tracking data through best-fitting linear transformation. Behavior Research Methods, 1-12.
VILLANUEVA, A., AND CABEZA, R. 2008. A novel gaze estimation system with one calibration point. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 38, 4.

ZHANG, Y., AND HORNOF, A. J. 2011. Mode-of-disparities error correction of eye-tracking data. Behavior Research Methods 43, 3, 834-842.

ZHANG, Y., AND HORNOF, A. J. 2014. Easy post-hoc spatial recalibration of eye tracking data. In ETRA, 95-98.
... The idea that the calibration may be done implicitly, during normal user's activities is our main research hypothesis. The possibility of such a calibration has been confirmed in several papers [5][6][7][8]; however, the main challenge is to achieve generality of the solution and accuracy that is sufficient for end-user eye-tracking usage in various eye-tracking applications. ...
... The paper summarizes the idea introduced in two conference publications [6,29]; however, it significantly extends it by: ...
... • It is more universal as it may be used with any stimulus: static images, movies-but also with computer interfaces (e.g., with buttons or labels as targets) or games (e.g., with avatars as targets as was proposed in [6]). ...
Full-text available
Proper calibration of eye movement signal registered by an eye tracker seems to be one of the main challenges in popularizing eye trackers as yet another user-input device. Classic calibration methods taking time and imposing unnatural behavior on eyes must be replaced by intelligent methods that are able to calibrate the signal without conscious cooperation by the user. Such an implicit calibration requires some knowledge about the stimulus a user is looking at and takes into account this information to predict probable gaze targets. This paper describes a possible method to perform implicit calibration: it starts with finding probable fixation targets (PFTs), then it uses these targets to build a mapping-probable gaze path. Various algorithms that may be used for finding PFTs and mappings are presented in the paper and errors are calculated using two datasets registered with two different types of eye trackers. The results show that although for now the implicit calibration provides results worse than the classic one, it may be comparable with it and sufficient for some applications.
... Moreover, the gradient descent method used in most attempts may lead to local maxima when image has several distinctive salient regions. That is why in [Kasprowski and Harezlak 2016] a simplification of the model was suggested by replacing the saliency map by a list of points with high probability and using them as probable fixation targets -PFTs (see Figure 1). Having a short list of such points for every frame, it is possible to build the model assuming that most of the time a user would look at one of these points. ...
... The only problem is to choose a correct PFT for every fixation. In [Kasprowski and Harezlak 2016] a genetic algorithm was used to find a sequence of targets that result in the lowest error. This solution was later implemented in the ETCAL library [Kasprowski and Harezlak 2017]. ...
... The main purpose of the research presented in this paper was to evaluate the PFT based solution initially proposed in [Kasprowski and Harezlak 2016]. Apart from the solution implemented in [Kasprowski and Harezlak 2017] it was another implementation proposed which gave better results while it was about 10 times faster. ...
Conference Paper
Full-text available
With growing access to cheap low end eye trackers using simple web cameras, there is also a growing demand on easy and fast usage of this devices by untrained and unsupervised end users. For such users the necessity to calibrate the eye tracker prior to its first usage is often perceived as obtrusive and inconvenient. In the same time perfect accuracy is not necessary for many commercial applications. Therefore, the idea of implicit calibration attracts more and more attention. Algorithms for implicit calibration are able to calibrate the device without any active collaboration with users. Especially, a real time implicit calibration, that is able to calibrate a device on-the-fly, while a person uses an eye tracker, seems to be a reasonable solution to the aforementioned problems. The paper presents examples of implicit calibration algorithms (including their real time versions) based on the idea of probable fixation targets (PFT). The algorithms were tested during a free viewing experiment and compared to the state of the art PFT based algorithm and explicit calibration results.
... There are also studies in which the calibration process is a part of human-computer interaction; this is the case in the research presented in [15], which applies a game scenario with objects (targets) moving on the screen. Under the assumption that the user is always looking at one of the targets displayed on the screen, an algorithm that maps the uncalibrated eye tracker output to the targets was proposed. ...
... However, sometimes searching for a proper calibration model may be a complicated and time-consuming task due to the number of objects to be considered in a user interface. This problem may be alleviated by the usage of heuristic methods [15]. ...
... As has already been stated, input data may consist of more than one target (possible gaze coordinates) for each data unit when implicit calibration is used instead of explicit calibration [15]. In such a scenario, the user may choose any of the interface targets; he is not forced to look at one specific point. ...
Full-text available
Eye tracking is an increasingly popular technique that may be used for a variety of applications including user experience analysis, usability engineering and interactive games. Moreover, this technique may be useful inter alia in education, psychological tests and medical diagnosis. Video-based oculography (VOG) is the most commonly utilized technique because it is non-intrusive, can be applied in users' natural environments and is relatively cheap as it uses only basic cameras. There are already well-established methods for eye detection using still images registered by a camera. However, to be usable in the estimation of gaze position, eye image features must be associated with the place where an observer is looking; this typically requires the evaluation of several parameters during a process called calibration. The parameters significantly influence the quality of the analysis of subsequently collected data, thus they should be adjusted to the eye tracker used, the user and the environment. The purpose of this study is to present various calibration techniques and introduce an open, extendable and ready-to-use software application that implements these techniques. This software implements several algorithms already described in the literature and introduces some novel techniques. It provides the opportunity to compare different methods including plug-in self-developed filters and optimization algorithms; it also allows for analysis of results.
... However, in order to exploit the entire potential, an accurate gaze estimation and thus, a precise calibration model are necessary. This makes successful calibration essential for eye tracking, which is frequently described as a task of insufficient usability that is inconvenient and tedious [4,19,27,28]. ...
... To enhance dwell-based approaches, Renner et al. proposed a motion-guided calibration technique in a virtual environment, where a dragonfly flew from point to point, attracting the user's attention [33]. Contrary to prior approaches, Kasprowski and Harezlak implemented an implicit form of point-guided calibration by predicting the most probable fixated object out of a set of shown ones [19]. ...
Conference Paper
Although gaze-based interaction has been investigated since the 1980s and provides promising concepts for realizing cognitive systems and supporting universal interaction within distributed environments, the main challenges, such as the Midas touch problem [16] and calibration, are still frequent topics of research. In this work, Natural Pursuit Calibration is presented: a comfortable, unobtrusive technique enabling ongoing attention detection and eye tracker calibration in an off-screen context. The user is able to perform calibration without a digital user interface, artificial annotation of the environment, or further assistance, by simply following any arbitrary moving target. Due to the characteristics of the calibration process, it can be executed simultaneously with any primary task, without active user participation. A two-stage evaluation process is conducted to (i) optimize parameter settings in a first setup and (ii) compare the accuracy as well as the user acceptance of the proposed procedure to prevailing calibration techniques.
... More complicated stimuli, in the form of images, may be used as well; however, in such a case well-defined targets are required. There are studies in which the calibration process is included in normal human-computer interaction [10]. In particular, mouse-click positions may be utilized as probable gaze locations [11]. ...
... This happens when an implicit calibration is used instead of an explicit one [10]. In such a case, the user is not forced to look at one specific point. ...
Recently, eye tracking has become a popular technique that may be used for a variety of applications, starting from medical ones, through psychological tests and user experience analysis, and ending with interactive games. Video-based oculography (VOG) is the most popular technique because it is non-intrusive, can be used in users' natural environments and is relatively cheap, as it uses only classic cameras. There are already well-established methods for eye detection in a camera capture. However, to be usable in gaze position estimation, this information must be associated with an area in the observer's scene, which requires evaluating several parameters. These parameters are typically estimated during a process called calibration. The main purpose of the software described in this paper is to establish a common platform that is easy to use and may be applied in different calibration scenarios. Apart from normal regression-based calibration, the ETCAL library also allows the use of more sophisticated methods such as automatic parameter optimization or automatic detection of gaze targets. The library is also easily extendable and may be accessed through a convenient Web/REST interface.
... This model is built using some number of clicks and then it is used for each subsequent click to recalculate the ET signal into gaze location. In that way, our system implicitly builds a calibration model for the given user [Kasprowski and Harezlak 2016]. ...
Conference Paper
Eye movement-based biometrics has been developed for over 15 years, but so far, to the authors' knowledge, no commercial applications utilize this modality. There are many reasons for this, starting from still-low accuracy and ending with the problematic setup. One of the essential elements of this setup is the calibration, as nearly every eye tracker needs to be calibrated before its first usage. This procedure makes any authentication based on eye movement a cumbersome and lengthy process. The main idea of the research presented in this paper is to perform authentication based on a signal from a cheap remote eye tracker but, contrary to previous studies, without any calibration of the device. The uncalibrated signal obtained from the eye tracker is used directly, which significantly simplifies the enrollment process. The experiment presented in the paper aims at protection from a so-called "lunchtime attack", when an unauthorized person starts using a computer, taking advantage of the absence of the legitimate user. We show that such an impostor may be detected by analyzing the signal obtained from the eye tracker when the user clicks objects on a screen with a mouse. The method utilizes the assumptions that: (1) users usually look at the point they click, and (2) an uncalibrated eye tracker signal is different for different users. It has been shown that after the analysis of nine subsequent clicks, the method is able to achieve an Equal Error Rate lower than 15% and may be treated as a valuable and difficult-to-counterfeit supplement to classic face recognition and password-based computer protection methods.
... Interestingly, using a calibration function built by an algorithm (Kasprowski & Harezlak, 2016), it was possible to predict where a user will click with a mouse: the accuracy of the prediction was about 75%, which points to a high correlation, as also shown here. ...
Attention is crucial as a fundamental prerequisite for perception. The measurement of attention in viewing and recognizing the images that surround us constitutes an important part of eye movement research, particularly in advertising-effectiveness research. Recording eye and gaze (i.e., eye and head) movements is considered the standard procedure for measuring attention. However, alternative measurement methods have been developed in recent years, one of which is mouse-click attention tracking (mcAT), an online procedure that measures gaze motion via a mouse click (i.e., a hand and finger positioning maneuver) on a computer screen. Here we compared the validity of mcAT with eye movement attention tracking (emAT). We recorded data in a between-subject design via emAT and mcAT and analyzed and compared data from 20 subjects for correlations. The test stimuli consisted of 64 images that were assigned to eight categories. Our main results demonstrated a highly significant correlation (p
... Pursuit calibration proposes the use of moving targets with a known trajectory [Celebi et al. 2014;Pfeuffer et al. 2013] and can achieve an angular error as low as 0.6°. A different approach is to leverage user events and possible interactions with a PC [Huang et al. 2016;Kasprowski and Harezlak 2016]. Egocentric visual saliency can also be used for a continuous selfcalibrating eye tracker [Sugano and Bulling 2015]. ...
Conference Paper
Common calibration techniques for head-mounted eye trackers rely on markers or an additional person to assist with the procedure. This is a tedious process and may even hinder some practical applications. We propose a novel calibration technique which simplifies the initial calibration step for mobile scenarios. To collect the calibration samples, users only have to point with a finger to various locations in the scene. Our vision-based algorithm detects the users' hand and fingertips which indicate the users' point of interest. This eliminates the need for additional assistance or specialized markers. Our approach achieves comparable accuracy to similar marker-based calibration techniques and is the preferred method by users from our study. The implementation is openly available as a plugin for the open-source Pupil eye tracking platform.
Proper calibration of eye movement signal registered by an eye tracker seems to be one of the main challenges in popularizing eye trackers as yet another user input device. Classic calibration methods taking time and imposing unnatural behavior of users have to be replaced by intelligent methods that are able to calibrate the signal without conscious cooperation with users. Such an implicit calibration requires some knowledge about the stimulus a person is looking at and takes into account this information to predict probable gaze targets. The paper describes one of the possible methods to perform implicit calibration: it starts with finding probable fixation targets (PFTs), then uses these targets to build a mapping - probable gaze path. Various possible algorithms that may be used for finding PFTs and mapping are presented in the paper and errors are calculated utilizing two datasets registered with two different types of eye trackers. The results show that although for now the implicit calibration provides results worse than the classic one, it may be comparable with it and sufficient for some applications.
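The target-search step described above can be illustrated with a fitness function over candidate fixation-to-target assignments: fit a calibration mapping for the assignment and score it by the residual error. This is a hedged sketch only; the function name and the simple linear (affine) calibration model are assumptions, not the paper's exact implementation, which also employs a genetic algorithm to search the assignment space.

```python
import numpy as np

def assignment_fitness(fixations, assigned_targets):
    """Score one candidate fixation-to-target assignment (illustrative).

    Fits a linear calibration mapping [x, y, 1] -> screen coordinates by
    least squares and returns the negated residual, so assignments that
    admit a consistent calibration score higher. A search procedure (e.g.
    a genetic algorithm) can use this as its fitness function.
    """
    F = np.asarray(fixations, dtype=float)
    T = np.asarray(assigned_targets, dtype=float)
    # Augment raw fixation coordinates with a bias column.
    X = np.hstack([F, np.ones((len(F), 1))])
    W, *_ = np.linalg.lstsq(X, T, rcond=None)
    residual = np.linalg.norm(X @ W - T)
    return -residual
```

A correct assignment (one that is an affine image of the raw signal) yields a residual near zero, while permuting the targets breaks the linear relation and lowers the fitness.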
Conference Paper
Eye movement analysis finds tremendous usefulness in various medical screening applications and rehabilitation. Infrared sensor based eye trackers are becoming popular but these are expensive and need repeated calibration. Moreover, with multiple calibration also, there persists some noises called, variable and systematic, resulting in inaccurate gaze tracking. This study aims to build an one time calibration module to avoid the overhead of multiple calibration and to design an algorithm to remove both the types of errors effectively. The proposed approach is used for correcting the gaze tracking data for Digit Gazing task and standard recall-recognition test, where an accuracy of 90% and 82% are achieved respectively for detecting the gaze positions against the raw eye gaze data. Results also show that it is possible to perform accurate gaze tracking with one-time calibration method provided the experimental setup is not altered.
Conference Paper
Eye movement may be regarded as a new promising modality for human-computer interfaces. With the growing popularity of cheap and easy-to-use eye trackers, gaze data may become a popular way to enter information and to control computer interfaces. However, a properly working gaze-contingent interface requires intelligent methods for processing data obtained from an eye tracker. They should reflect users' intentions regardless of the quality of the signal obtained from the eye tracker. The paper presents the results of an experiment during which algorithms processing eye movement data while a 4-digit PIN was entered with the eyes were checked for both calibrated and non-calibrated users.
Poor calibration and inaccurate drift correction can pose severe problems for eye-tracking experiments requiring high levels of accuracy and precision. We describe an algorithm for the offline correction of eye-tracking data. The algorithm conducts a linear transformation of the coordinates of fixations that minimizes the distance between each fixation and its closest stimulus. A simple implementation in MATLAB is also presented. We explore the performance of the correction algorithm under several conditions using simulated and real data, and show that it is particularly likely to improve data quality when many fixations are included in the fitting process.
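A minimal sketch of this kind of offline correction, assuming numpy and an iterated nearest-stimulus assignment; the function name is hypothetical (the original work provides a MATLAB implementation):

```python
import numpy as np

def correct_drift(fixations, stimuli, n_iter=5):
    """Offline drift correction (sketch): fit a linear (affine) transform
    that moves each fixation toward its nearest stimulus, alternating the
    nearest-stimulus assignment and the least-squares fit."""
    F = np.asarray(fixations, dtype=float)
    S = np.asarray(stimuli, dtype=float)
    X = np.hstack([F, np.ones((len(F), 1))])  # augmented coordinates
    corrected = F.copy()
    for _ in range(n_iter):
        # Assign each currently corrected fixation to its nearest stimulus.
        d = np.linalg.norm(corrected[:, None] - S[None, :], axis=2)
        nearest = S[d.argmin(axis=1)]
        # Refit the affine transform and reapply it to the raw fixations.
        W, *_ = np.linalg.lstsq(X, nearest, rcond=None)
        corrected = X @ W
    return corrected
```

With a constant offset between fixations and stimuli, a single iteration already recovers the stimulus locations; the iteration matters when the initial assignment is partially wrong.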
Conference Paper
Homography normalization is presented as a novel gaze estimation method for uncalibrated setups. The method applies when head movements are present, but without any requirement for camera calibration or geometric calibration. The method is geometrically and empirically demonstrated to be robust to head pose changes and, despite being less constrained than cross-ratio methods, it consistently performs favorably by several degrees on both simulated data and data from physical setups. The physical setups include the use of off-the-shelf web cameras with infrared light (night vision) and standard cameras with and without infrared light. The benefits of homography normalization and uncalibrated setups in general are also demonstrated by obtaining gaze estimates (in the visible spectrum) using only the screen reflections on the cornea.
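The homography machinery underlying such a method can be illustrated with a plain Direct Linear Transform (DLT) estimate from four or more point correspondences, for example mapping reflection coordinates in the eye image to screen corners. This is a generic sketch of homography estimation, not the paper's full normalization pipeline:

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate the 3x3 homography mapping src points to dst points via the
    Direct Linear Transform (needs at least 4 correspondences)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The homography is the right null vector of A (smallest singular value).
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]  # normalize the free scale

def apply_homography(H, pts):
    """Apply H to 2-D points using homogeneous coordinates."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]
```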
In eye-tracking research, there is almost always a disparity between a person's actual gaze location and the location recorded by the eye tracker. Disparities that are constant over time are systematic error. In this article, we propose an error correction method that can reliably reduce systematic error and restore fixations to their true locations. We show that the method is reliable when the visual objects in the experiment are arranged in an irregular manner, for example, when they are not on a grid in which all fixations can be shifted to adjacent locations using the same directional adjustment. The method first calculates the disparities between fixations and their nearest objects. It then uses the annealed mean shift algorithm to find the mode of the disparities. The mode is demonstrated to correctly capture the magnitude and direction of the systematic error so that it can be removed. This article presents the method, an extended demonstration, and a validation of the method's efficacy.
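A simplified version of this idea can be sketched with a fixed-bandwidth mean shift over the disparity vectors, rather than the annealed variant the abstract describes; the function name and bandwidth are assumptions:

```python
import numpy as np

def disparity_mode(fixations, objects, bandwidth=30.0, n_iter=50):
    """Estimate the mode of fixation-to-nearest-object disparities with a
    simple fixed-bandwidth (Gaussian kernel) mean shift. Subtracting the
    mode from the fixations removes the systematic error while ignoring
    outlying disparities."""
    F = np.asarray(fixations, dtype=float)
    O = np.asarray(objects, dtype=float)
    # Disparity of each fixation to its nearest object.
    nearest = O[np.linalg.norm(F[:, None] - O[None, :], axis=2).argmin(axis=1)]
    disp = F - nearest
    mode = disp.mean(axis=0)  # initialize at the mean disparity
    for _ in range(n_iter):
        # Gaussian weights: disparities far from the current mode count less.
        w = np.exp(-np.sum((disp - mode) ** 2, axis=1) / (2 * bandwidth ** 2))
        mode = (disp * w[:, None]).sum(axis=0) / w.sum()
    return mode
```

Unlike a plain mean, the mean-shift mode is barely pulled by a single outlying disparity, which is what makes it suitable for capturing the systematic component of the error.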
In the course of running an eye-tracking experiment, one computer system or subsystem typically presents the stimuli to the participant and records manual responses, and another collects the eye movement data, with little interaction between the two during the course of the experiment. This article demonstrates how the two systems can interact with each other to facilitate a richer set of experimental designs and applications and to produce more accurate eye tracking data. In an eye-tracking study, a participant is periodically instructed to look at specific screen locations, or explicit required fixation locations (RFLs), in order to calibrate the eye tracker to the participant. The design of an experimental procedure will also often produce a number of implicit RFLs: screen locations that the participant must look at within a certain window of time or at a certain moment in order to successfully and correctly accomplish a task, but without explicit instructions to fixate those locations. In these windows of time or at these moments, the disparity between the fixations recorded by the eye tracker and the screen locations corresponding to implicit RFLs can be examined, and the results of the comparison can be used for a variety of purposes. This article shows how the disparity can be used to monitor the deterioration in the accuracy of the eye tracker calibration and to automatically invoke a recalibration procedure when necessary. This article also demonstrates how the disparity will vary across screen regions and participants and how each participant's unique error signature can be used to reduce the systematic error in the eye movement data collected for that participant.
Conference Paper
We describe a system designed to monitor the gaze of a user working naturally at a computer workstation. The system consists of three cameras situated between the keyboard and the monitor. Free head movements are allowed within a three-dimensional volume approximately 40 centimeters in diameter. Two fixed, wide-field "face" cameras equipped with active-illumination systems enable rapid localization of the subject's pupils. A third steerable "eye" camera has a relatively narrow field of view, and acquires the images of the eyes which are used for gaze estimation. Unlike previous approaches which construct an explicit three-dimensional representation of the subject's head and eye, we derive mappings for steering control and gaze estimation using a procedure we call implicit calibration. Implicit calibration is performed by collecting a "training set" of parameters and associated measurements, and solving for a set of coefficients relating the measurements back to the parameters of interest. Preliminary data on three subjects indicate a median gaze estimation error of approximately 0.8 degrees.
Conference Paper
Eye gaze is a compelling interaction modality but requires user calibration before interaction can commence. State of the art procedures require the user to fixate on a succession of calibration markers, a task that is often experienced as difficult and tedious. We present pursuit calibration, a novel approach that, unlike existing methods, is able to detect the user's attention to a calibration target. This is achieved by using moving targets, and correlation of eye movement and target trajectory, implicitly exploiting smooth pursuit eye movement. Data for calibration is then only sampled when the user is attending to the target. Because of its ability to detect user attention, pursuit calibration can be performed implicitly, which enables more flexible designs of the calibration task. We demonstrate this in application examples and user studies, and show that pursuit calibration is tolerant to interruption, can blend naturally with applications and is able to calibrate users without their awareness.
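The attention-detection core of pursuit calibration can be approximated by correlating the raw eye trace with the moving target's trajectory over a time window, and sampling calibration data only while the correlation is high. This sketch, with an assumed function name and threshold, is illustrative only:

```python
import numpy as np

def is_attending(eye, target, threshold=0.8):
    """Decide whether the user is following the moving calibration target.

    Pearson-correlates each axis of the raw (uncalibrated) eye trace with
    the known target trajectory over a window of samples; both axes must
    exceed the threshold. Because correlation is invariant to offset and
    scale, this works before any calibration has been established.
    """
    eye = np.asarray(eye, dtype=float)
    target = np.asarray(target, dtype=float)
    rx = np.corrcoef(eye[:, 0], target[:, 0])[0, 1]
    ry = np.corrcoef(eye[:, 1], target[:, 1])[0, 1]
    return min(rx, ry) > threshold
```

When the gate reports attention, the (eye sample, target position) pairs in the window can be fed to an ordinary regression-based calibration.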
Conference Paper
The gaze locations reported by eye trackers often contain error resulting from a variety of sources. Such error is of increasing concern to eye tracking researchers, and several techniques have been introduced to clean up the error. These methods, however, either compensate only for error caused by a particular source (such as pupil dilation) or require the error to be somewhat constant across space and time. This paper introduces a method that is applicable to error generated from a variety of sources and that is resilient to the change in error across the display. A study shows that, at least in some cases, although the change in error across the display appears to be random it in fact follows a consistent pattern which can be modeled using quadratic equations. The parameters of these equations can be estimated using linear regression on the error vectors between recorded fixations and possible target locations. The resulting equations can then be used to clean up the error. This regression-based approach is much easier to apply than some of the previously published methods. The method is applied to the data of a visual search experiment, and the results show that the regression-based error correction works very well.
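The regression-based correction can be sketched as fitting per-axis quadratic polynomials of the recorded position to the error vectors between fixations and their intended targets; the function names and the exact polynomial basis here are assumptions:

```python
import numpy as np

def _design(fixations):
    """Quadratic design matrix in the recorded coordinates."""
    F = np.asarray(fixations, dtype=float)
    x, y = F[:, 0], F[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x * y, x ** 2, y ** 2])

def fit_quadratic_correction(fixations, targets):
    """Linear regression of the error vectors (target - fixation) on
    quadratic terms of the recorded fixation position, so the correction
    can vary smoothly across the display."""
    F = np.asarray(fixations, dtype=float)
    T = np.asarray(targets, dtype=float)
    W, *_ = np.linalg.lstsq(_design(F), T - F, rcond=None)
    return W  # 6x2 coefficient matrix

def apply_correction(W, fixations):
    """Add the predicted error back onto the recorded fixations."""
    F = np.asarray(fixations, dtype=float)
    return F + _design(F) @ W
```

Because the model is linear in its coefficients, ordinary least squares suffices even though the correction itself is quadratic in screen position.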
The design of robust and high-performance gaze-tracking systems is one of the most important objectives of the eye-tracking community. In general, a subject calibration procedure is needed to learn system parameters and be able to estimate the gaze direction accurately. In this paper, we attempt to determine if subject calibration can be eliminated. A geometric analysis of a gaze-tracking system is conducted to determine user calibration requirements. The eye model used considers the offset between optical and visual axes, the refraction of the cornea, and Donders' law. This paper demonstrates the minimal number of cameras, light sources, and user calibration points needed to solve for gaze estimation. The underlying geometric model is based on glint positions and the pupil ellipse in the image, and the minimal hardware needed for this model is one camera and multiple light-emitting diodes. This paper proves that subject calibration is compulsory for correct gaze estimation and proposes a model based on a single point for subject calibration. The experiments carried out show that, although two glints and one calibration point are sufficient to perform gaze estimation (error of approximately 1 degree), using more light sources and calibration points can result in lower average errors.
X. L. Brolly and J. B. Mulligan. Implicit calibration of a remote gaze tracker. In Computer Vision and Pattern Recognition Workshop.