Application of dimensionality reduction methods
for eye movement data classification
Aleksandra Gruca, Katarzyna Harezlak, Pawel Kasprowski
Institute of Informatics, Silesian University of Technology
Akademicka 16, 44-100, Gliwice, Poland
aleksandra.gruca@polsl.pl
Abstract. In this paper we apply two data dimensionality reduction methods to an eye movement dataset and analyse how feature reduction improves classification accuracy. Due to the specificity of the recording process, eye movement datasets are characterized by both large size and high dimensionality, which make them difficult to analyse and classify using standard classification approaches. Here, we analyse eye movement data from the BioEye 2015 competition and, to deal with the problem of high dimensionality, we apply SVM combined with PCA feature extraction and random forest wrapper variable selection. Our results show that reducing the number of variables improves the classification results. We also show that some of the classes (participants) can be classified (recognised) with high accuracy, while others are very difficult to identify correctly.
Keywords: Eye movement data analysis, DTW, Dimensionality reduction, Classification, PCA, SVM, Random forest
1 Introduction
Recent advances in computer science, the development of new technologies and data processing algorithms have provided new tools and methods that are used to control access to numerous resources. Some of them are widely available, while others should be protected against unauthorized access. For the latter case various security methods have been developed, like PINs, passwords and tokens; however, biometric solutions such as gait, voice, mouse stroke or eye movement are becoming more and more popular due to their convenience.
In the field of eye movement research, biometric identification plays an important role, as the information included in such a signal is difficult to imitate or forge.
Acquisition of such data is performed with various types of cameras and can provide, depending on the recording frequency of the eye tracker used, from 25 to 2000 samples per second. The total number of samples depends on the registration time: recording may take place only during a login phase or continuously during the whole session. In the latter case the obtained dataset is characterized by both large size and a high dimensionality of the features describing it. Analysis of such a big dataset is therefore a challenging task.
It may seem that the more information we include in our analysis, the better the decision we can make; however, it is possible to reach a point beyond which data analysis becomes very difficult and sometimes impossible. In numerous cases the objects in a dataset are characterized by many features which are redundant or have no impact on the final result. Taking them into consideration may not improve the quality of the results and may even make it worse. Additionally, two problems may arise: the complexity of the classifier used and the so-called curse of dimensionality. In the latter case we need to deal with an exponential growth of the data size required to ensure the same quality of the classification process as the dimensionality grows. Moreover, collecting such amounts of data may be expensive and difficult to perform. To solve this problem, additional methods are necessary that allow finding the relationships existing in the data, filtering out redundant information and selecting only those features which are relevant to the studied area and the classification outcome. Data dimensionality reduction methods have been successfully applied in machine learning in many different fields such as industrial data analysis [3], computer vision [10], geospatial [17] or biomedical data analysis [20]. Removing unimportant features has the following advantages:
- reduces the bias introduced by features that are unimportant from the classifier's point of view,
- simplifies calculations, saving resource usage,
- improves learning performance,
- reveals real relationships in the data,
- improves classifier accuracy.
In the research described in this paper, dimensionality reduction has been applied to the analysis of eye movement data for the purpose of biometric classification [19]. Two methods have been considered: the PCA method [6] combined with an SVM classifier, and a random-forest-based procedure [5]. Data from eye movement sessions were transformed into a form suitable for analysis by a classifier using the Dynamic Time Warping (DTW) distance measure. The main contribution of this paper is the application of the DTW metric to eye movement data analysis and a comparison of the effect of two different dimensionality reduction approaches, feature selection and feature extraction, on the classification results.
The paper is organized as follows: the first two sections provide a description of the data used in the research and of its pre-processing phase. The next section describes the two dimensionality reduction methods. Then, the results of the analysis and the final conclusions are presented.
2 Description of data
The data used in the presented research is a part of the dataset available for the BioEye competition (www.bioeye.info). The purpose of the competition was to establish methods enabling human identification using the eye movement modality. Eye movement is known to reveal a lot of interesting information about a human being, and eye movement based identification is yet another biometric possibility, initially proposed about 10 years ago [13]. Since then much research has been done in this field, and the BioEye competition follows the previously announced EMVIC2012 [12] and EMVIC2014 [11] competitions.
The dataset used in this research consisted of eye movement recordings of 37 participants. During each session the participant had to follow with their eyes a point displayed on a screen. As the point changed its position in abrupt jumps, the eye movement data consisted of fixations on the stable point and sudden saccades to the subsequent location. The point position changed every 1 second and there were 100 random point locations during each session, so the whole session lasted 1 minute and 40 seconds. Eye movement data was initially recorded at a frequency of 1000 Hz and then down-sampled to 250 Hz with the usage of a noise removal filter. Finally, there were 25 000 recordings available for every session. Each recording was additionally described by a "Validity" flag; validity equal to 0 meant that the eye tracker had lost the eye position and the recorded data was not valid.
There were two sessions available for every participant, referred to later as the first session and the second session. The task was to build a classification model using data from the first session as training samples and then use it to classify the second session for every subject.
3 Data preprocessing
There were 37 first (training) sessions and 37 second (testing) sessions available. Initially, every session was divided into segments during which the displayed point location was stable, which gave 100 segments for each session. The first segment of each session was removed. Every segment consisted of 250 recordings, but some of those recordings were invalid (with the validity flag set to 0). Segments with fewer than 200 valid recordings were removed from the set. This resulted in 6885 segments, each consisting of 200–250 eye movement recordings. 3425 segments were extracted from the first sessions and used as training samples, and 3460 segments from the second sessions were used as test samples. The segments were divided into four groups, NW, NE, SE and SW, based on the direction of the point's location change. There were 823 training segments in the NE direction, 869 in SW, 925 in SE and 808 in NW, respectively.
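The segmentation and validity filtering can be summarized by the following R sketch; it is illustrative only, and the session data frame with its validity column is an assumption, not the competition's actual file format.

    # Assumed input: 'session' is a data frame with 25 000 rows (250 Hz x 100 s)
    # and a 'validity' column; the point position changes every 250 samples.
    segments <- split(session, rep(1:100, each = 250))  # one segment per point
    segments <- segments[-1]                            # drop the first segment
    n.valid  <- sapply(segments, function(s) sum(s$validity != 0))
    segments <- segments[n.valid >= 200]                # keep segments with >= 200 valid samples
    segments <- lapply(segments, function(s) s[s$validity != 0, ])  # drop invalid rows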
In the next step, pairwise distances among all training segments were calculated. As the lengths of the segments varied and we were more interested in shape comparison than in point-to-point comparison, we used Dynamic Time Warping to calculate the distances among samples [4]. The distance calculation was done for each of nine different signal features: velocity, acceleration and jerk in the vertical direction; velocity, acceleration and jerk in the horizontal direction; and absolute velocity, acceleration and jerk values. The distances were calculated separately for every group (NW, NE, SE and SW). The results of these calculations were 9 x 4 = 36 matrices containing distances among the training samples.
These distances were treated as features, in a way similar to [18]. For every test sample, the DTW distances of this sample to every training sample of the same direction were calculated, and these distances were treated as features describing this sample.
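A minimal sketch of the pairwise distance computation in R, assuming the dtw package as the DTW implementation and a list of per-segment signals (e.g. horizontal velocity) prepared beforehand; both names are illustrative assumptions:

    library(dtw)  # an R implementation of Dynamic Time Warping

    # Assumed input: 'signals' is a list of numeric vectors, one per segment
    # of a given direction group, holding e.g. the horizontal velocity.
    n <- length(signals)
    D <- matrix(0, n, n)
    for (i in 1:(n - 1)) {
      for (j in (i + 1):n) {
        d <- dtw(signals[[i]], signals[[j]], distance.only = TRUE)$distance
        D[i, j] <- D[j, i] <- d
      }
    }
    # Row i of D is the distance-based feature vector describing segment i;
    # repeating this for 9 signal features x 4 directions yields the 36 matrices.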
Finally, the full dataset consisted of 36 sets. The nine NE sets had 1689 samples (including 823 training samples) with 823 features, the nine SW sets had 1769 samples (incl. 869 training) with 869 features, the nine NW sets had 1597 samples (incl. 808 training) with 808 features, and finally the nine SE sets consisted of 1830 samples (incl. 925 training) with 925 features.
4 Dimensionality reduction methods
Methods for decreasing dimensionality may be divided into two main groups: feature extraction and feature selection. In the first group of methods the original features are transformed to obtain their linear or non-linear combinations; as a result, the data are represented in another feature space. The second technique relies on choosing the features that discriminate the analysed data best. The task of feature selection is to reduce redundancy while maximizing the quality of the final classification outcome. An extension of feature selection is wrapper variable selection, in which the learning algorithm and the training set interact during the feature selection process.
In this section we present the application of two data dimensionality reduction methods to the BioEye2015 dataset: PCA for feature extraction and random forest for wrapper variable selection.
4.1 Feature extraction with Principal Component Analysis
One of the methods utilized in the research was PCA (Principal Component Analysis), which is an example of a feature extraction method. It has been successfully used in many classification problems (pattern recognition, bioinformatics), among them in the field of eye movement data processing [2][19].
The task of PCA is to reveal the covariance structure of the data dimensions in order to find differences and similarities between them. As a result, a transformation of correlated variables into uncorrelated ones is possible. These uncorrelated variables are called principal components. They are constructed in a way ensuring that the first component accounts for as much of the variability in the data as possible. The same holds for each succeeding component, which explains as much of the remaining variability as possible.
In the presented research the feature extraction was done using the prcomp() function available in the R language in the default stats package. As the function input, a matrix representing the DTW distances calculated based on one of the previously described features was provided. Data from this matrix was limited only to the first sessions of recordings. The center and scale parameters of the prcomp() function were used to (1) shift the data to be zero centered and (2) scale it to have unit variance. Data transformed this way served as a training set for an SVM classifier [1][24], which has been successfully used in the fields of machine learning and pattern recognition. SVM performs classification tasks by constructing hyperplanes in a multidimensional space that separate objects with different class labels. It uses a set of mathematical functions called kernels to map the original data from one feature space to another. The method is very popular because it solves a variety of problems and has been proved to provide good classification accuracy even for a relatively small data set. For this reason it seems to be suitable for the analysis of an eye movement signal, which is often gathered during short sessions.
There are different types of kernel mappings, such as the polynomial kernel and the Radial Basis Function (RBF) kernel. The latter was applied in the presented research using the svm() R function from the e1071 package with the settings C = 2^15 and gamma = 2^9. The classification model was verified using the predict() function. Its input parameter was a test set constructed on the basis of the PCA model, applied to the samples, in the form of a distance matrix, obtained from the second recording session. Because prediction probabilities were evaluated for each sample in a distance matrix, they were subsequently summed up and normalized with regard to the samples related to one user. As a result, one probability vector for each user was provided for one distance matrix. This procedure was repeated for all 36 distance matrices; thus 36 user probability vectors were obtained, which were finally averaged for the second time.
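For one distance matrix, the described pipeline can be sketched in R as follows; the variable names and the explicit 99.9% variance cut-off are illustrative assumptions rather than the authors' exact code:

    library(e1071)

    # Assumed inputs: 'train' (DTW distances among first-session segments),
    # 'test' (distances of second-session segments to the training segments),
    # 'labels' (participant ids of the training segments).
    pca <- prcomp(train, center = TRUE, scale. = TRUE)
    cum.var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
    k <- which(cum.var >= 0.999)[1]            # components covering 99.9% of variance
    train.pc <- pca$x[, 1:k, drop = FALSE]
    test.pc  <- predict(pca, newdata = test)[, 1:k, drop = FALSE]

    model <- svm(train.pc, factor(labels), kernel = "radial",
                 cost = 2^15, gamma = 2^9, probability = TRUE)
    pred  <- predict(model, test.pc, probability = TRUE)
    probs <- attr(pred, "probabilities")       # per-class probabilities, later
                                               # summed and normalized per user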
4.2 Wrapper variable selection with random forest
The other data dimensionality reduction method used was a random-forest-based procedure for wrapper variable selection [14]. Unlike feature extraction, feature selection methods allow improving classifier accuracy by selecting the most important attributes. Therefore, the resulting subset of attributes may be further used not only for classification purposes but also for data description and interpretation [15][21].
The wrapper variable selection approach can be used with any machine learning algorithm; however, we decided to choose random forest due to the fact that this method is particularly suitable for high-dimensionality problems and is known to be hard to over-train, relatively robust to outliers and noise, and fast to train [23]. The wrapper variable selection method is based on the idea of a measure of importance, which ranks variables from the most to the least important. Then, in several iterations, the less important variables are removed, the random forest is trained on the remaining set of variables and its performance is analysed.
The random forest method [5] is based on the ensemble learning idea and combines a number of decision trees in such a way that each tree is learned (grown) on a bootstrap sample drawn from the original data. Therefore, during the learning process an ensemble (forest) of decision trees is generated. The final classification result is obtained using a simple voting strategy. Typically, one-third of the cases are left out of the bootstrap sample and not used to generate the tree. The objects that are left out are later used to estimate the so-called out-of-bag (OOB) error.
An additional feature of the random forest method is the possibility of obtaining a measure of importance of the predictor variables. In the literature one can find various methods of computing importance measures, and these methods typically differ in two ways: how the error is estimated and how the importance of variables is updated during the learning process [8]. Here, we focus on the so-called permutation importance, which is estimated in such a way that, for a particular variable, its values are permuted in the OOB cases and then it is checked how much the prediction error increases. The more the error increases, the more important the variable is or, in other words, if the variable is not important, then rearranging its values will not decrease the prediction accuracy. The final importance value for an attribute is computed as an average over all trees.
There are two backward strategies that can be applied when using the importance ranking. The first one is called Non Recursive Feature Elimination (NRFE) [7][22]; in this approach the variable ranking is computed only once, at the beginning of the learning process. Next, the less important variables are removed from the ranking and the random forest is learned on the remaining set of variables. This step is repeated over several iterations until no further variables remain. The second approach is called Recursive Feature Elimination (RFE) [9], and it differs from NRFE in that the importance ranking is updated (recomputed) at each iteration. Then, similarly to NRFE, the less important variables are removed and the random forest is learned. In the work of Gregorutti et al. [8] an extensive simulation study was performed comparing these two approaches. Based on their analysis we decided to choose the RFE approach, as it might be more reliable than NRFE: the ranking by the permutation importance measure is likely to change at each step, and by recomputing the permutation importance measure we ensure that the ranking is consistent with the current forest [8].
The final procedure used to learn the random forest classifier was as follows (a minimal R sketch is given after the list):
1. Train the random forest.
2. Compute the permutation measure of importance for each attribute.
3. Remove half of the less relevant variables.
4. Repeat steps 1–3 until there are fewer than 10 variables in the remaining attribute set.
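A minimal R sketch of this loop with the randomForest package, using the settings reported in Sect. 5.2 (1500 trees, unscaled mean decrease of accuracy); the names X and y are assumptions:

    library(randomForest)

    # Assumed inputs: 'X' (data frame of DTW distance features),
    # 'y' (factor of participant ids).
    vars <- colnames(X)
    repeat {
      rf   <- randomForest(X[, vars], y, ntree = 1500, importance = TRUE)
      imp  <- importance(rf, type = 1, scale = FALSE)  # permutation importance
      keep <- order(imp[, 1], decreasing = TRUE)[1:ceiling(length(vars) / 2)]
      vars <- vars[keep]                               # remove the less relevant half
      # (classification performance would be recorded here for each subset size)
      if (length(vars) < 10) break
    }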
As in the case of the PCA analysis, the learning procedure was repeated for all 36 distance matrices, which resulted in 36 random forests.
5 Results
5.1 Combined SVM and PCA results
To obtain the best possible prediction results, several cases concerning various cumulative proportions of explained variance (95%, 97%, 99% and 99.9%) were analysed. The most interesting issue in the first step of the analysis was to check how dimensionality reduction influenced the accuracy of data classification and what degree of reduction could provide the best possible data classification. The results were compared to the classification based on all dimensions. Please note that for one user recording there were 36 sets of samples. Each set included a DTW distance matrix calculated for all users, taking one signal feature into consideration. The number of dimensions related to each set, depending on the eye movement direction, varied from 808 to 925 elements. The performance of the classification was assessed using two quality indices:
- Accuracy: the ratio of the number of correctly assigned attempts to the number of all genuine identification attempts.
- FAR: the ratio calculated by dividing the number of false acceptances by the number of identification attempts.
First, we classified the whole dataset using the SVM method; the classification accuracy obtained using all dimensions was 24% and the FAR ratio was 4%. Then we applied the PCA method; Table 1 presents the classification results for all levels of explained variability considered in the research. They are complemented by information about the number of principal components required to account for a given variability. Because this number was calculated independently for each set of samples (36 times), the final result is presented in the form of the average, minimal and maximal number of components utilized.
Table 1. Classification results for various levels of explained variability

Proportion of        Accuracy   FAR   Average number   Maximal number   Minimal number
explained variance                    of components    of components    of components
95%                  11%        5%    1.9              5                1
97%                  11%        5%    3.08             8                1
99%                  22%        4%    8.81             18               1
99.9%                54%        2%    91.81            203              19
These outcomes clearly indicate that applying the PCA dimensionality reduction method had a significant influence on the classification accuracy. This is visible for both the 99% and 99.9% explained variance levels: while in the former case the accuracy is comparable with the full-dimensionality calculations, in the latter it was improved more than twofold. It is worth emphasizing that both results were obtained for a remarkably smaller number of dimensions (on average 8.81 and 91.81 components respectively, compared to about 900 features in the primary sets).
Analysing the classification results, we noticed that there were some samples that obtained very similar probabilities for two or more of the analysed classes. To deal with this similarity of results, an appropriate acceptance threshold was defined and another step of data analysis was introduced.
If we denote by:
- p_i: the probability that sample s belongs to class i,
- p_j: the maximal probability obtained for sample s, indicating that this sample belongs to class j,
where j != i and i, j in {1, ..., 37}, then if

    p_j - p_i <= acceptance threshold,

the probabilities of sample s belonging to classes i and j are treated as equally likely.
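For a single sample, this acceptance rule can be written in R as follows; prob is an assumed vector of the 37 class probabilities produced by the classifier:

    # Returns the indices of all classes whose probability lies within
    # 'threshold' of the maximal one; these are treated as equally likely.
    equally.likely <- function(prob, threshold = 0.001) {
      which(max(prob) - prob <= threshold)
    }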
Four values of the threshold, defined as the difference between the calculated probabilities, were studied: 0.000, 0.001, 0.0025 and 0.005. As expected, the bigger the threshold value, the higher the accuracy obtained. However, increasing the threshold values resulted in an increase of the FAR ratio as well (Table 2 and Figure 1). The last column of the table presents the ratio of accuracy to FAR for a particular threshold. It can be noticed that for the first two proportions of explained variability the best ratio was obtained for the threshold equal to 0.001, while in the third case both thresholds 0.000 and 0.001 provided similar ratio values. For the last of the variability proportions, threshold 0.000 significantly surpassed the others. The ratio of accuracy to FAR for all parameters is presented in Figure 2.
Fig. 1. Classification results for various threshold values
Table 2. Classification results for various threshold values

Explained variability   Similarity                    Ratio of Accuracy
proportion              threshold   Accuracy   FAR    and FAR
0.95                    0.000       11%        5%     2.18
0.95                    0.001       27%        8%     3.33
0.95                    0.0025      43%        19%    2.23
0.95                    0.005       70%        46%    1.52
0.97                    0.000       11%        5%     2.18
0.97                    0.001       24%        8%     2.92
0.97                    0.0025      50%        20%    2.45
0.97                    0.005       70%        47%    1.5
0.99                    0.000       22%        4%     4.97
0.99                    0.001       32%        7%     4.5
0.99                    0.0025      50%        15%    3.35
0.99                    0.005       68%        35%    1.92
0.999                   0.000       54%        2%     21.76
0.999                   0.001       62%        4%     16.37
0.999                   0.0025      70%        7%     10.46
0.999                   0.005       76%        13%    5.92
5.2 Random forest results
The recursive feature elimination procedure described in subsection 4.2 was repeated for all 36 datasets. Thus, we finally obtained 36 different random forests that were used to classify examples from the test dataset (the presented results do not include the OOB error). During classification, objects from the test dataset were presented to each of the 36 classifiers, and the final decision was made using a voting strategy.
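The voting can be sketched in R as follows, under the simplifying assumption that 'forests' is the list of 36 trained models and 'test.sets' is the matching list of test feature matrices with rows aligned across the sets:

    # Collect the predicted class from each of the 36 forests (one column per
    # forest) and take the majority vote for every test sample.
    votes <- sapply(seq_along(forests), function(i)
      as.character(predict(forests[[i]], test.sets[[i]])))
    final <- apply(votes, 1, function(v) names(which.max(table(v))))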
The analyses were performed using the R Project random forest implementation from the randomForest package [16]. The number of trees in each forest was set empirically to 1500 (ntree=1500), and the importance measure was computed as the mean decrease of accuracy (importance parameter type=1). As we expected that there might exist correlations among the variables, the importance values were not scaled, that is, they were not divided by their standard errors (importance parameter scale=FALSE). The classification accuracy obtained for selected numbers of important features is presented in Table 3. As each direction (NE, NW, SE and SW) was characterized by a different number of attributes, the number of selected features is presented separately for each direction.
Analysis of the results presented in Table 3 shows that reducing the number of attributes improves the classification accuracy. The best results were obtained for around 50 attributes. Further reduction of the attribute number decreased the performance of the classifier, as too much of the important information was removed from the data description. In addition, we can notice that with a reasonably reduced number of attributes (more than 50), the classification accuracy is around 40–46%. This is different from the SVM analysis, where for the full set of attributes the classification results were very poor.
Fig. 2. The ratio of accuracy and FAR for all thresholds
Table 3. Classification accuracy obtained for selected numbers of important features

Number of attributes
NE     NW     SE     SW     Accuracy
825    810    927    871    42%
413    405    464    436    46%
207    203    232    218    42%
104    102    116    109    46%
52     51     58     55     49%
26     26     29     28     40%
13     13     15     14     31%
7      7      8      7      31%
Finally, for the random forest classifier, we analysed the classification accuracy for each of the 37 classes separately. The results were quite surprising: we noticed that some of the classes, such as 7, 15, 30 and 35, were classified with quite high accuracy, while for others we were able to correctly classify only a few objects. This is something that we did not expect, and it requires further investigation. The classification accuracy obtained separately for each class is presented in Figure 3.
Fig. 3. Classification accuracy computed for separate classes
6 Conclusions
The aim of the research presented in this paper was to elaborate a procedure for the classification of individuals based on data obtained from their eye movement signal. The data for the studies was acquired from a publicly accessible competition, which makes the results obtained in the research comparable with other prospective explorations.
To prepare the data for classification, the set of features was built based on dissimilarities among the training samples. The dissimilarities were calculated with the DTW metric. The drawback of this approach to data preprocessing is that it produces a dataset in which each object is described by a huge number of attributes. The high dimensionality of the obtained dataset makes it difficult to analyse; therefore, some additional preprocessing steps are required before the selected classification method is applied. The obtained results show that combining the DTW data preprocessing method with a dimensionality reduction approach provides better classification accuracy.
Due to the different philosophies of feature extraction versus feature selection, it is difficult to directly compare the two methods. In the case of the combined SVM and PCA method, different ranges of data size reduction and their influence on the final result were studied. These outcomes confirmed that it is possible to decrease the data size meaningfully without decreasing the classification accuracy, and even to improve it. Applying a threshold parameter allows obtaining better classification results; however, it is a trade-off between the accuracy of the classifier and the false acceptance rate of a security system. The second data dimensionality reduction method used in our analysis was the random forest procedure for wrapper variable selection. Comparing random forest with the SVM method, we can see that for the full set of features the random forest classifier gives better results than the SVM method. This is expected, as the random forest method is known to be suitable for high-dimensionality problems. However, by reducing the number of attributes we can still improve the accuracy of our random forest classifier.
Another interesting result that we observed during our analyses is that the classification accuracy differs highly among classes. Currently, we are not able to say whether this is due to differences among the examined individuals or whether some bias was introduced during the data acquisition phase.
Summarizing the results, it must be emphasized that their quality is not sufficient for application in a real authentication process, yet they indicate promising directions for future work.
Acknowledgment. The work was partially supported by the National Science Centre (decision DEC-2011/01/D/ST6/07007) (A.G.). Computations were performed with the use of the infrastructure provided by the NCBIR POIG.02.03.01-24-099/13 grant: GCONiI - Upper-Silesian Center for Scientific Computations.
References
1. Aggarwal, C.C.: Data Classification: Algorithms and Applications. Data Mining and Knowledge Discovery Series, Chapman and Hall/CRC (2014)
2. Bednarik, R., Kinnunen, T., Mihaila, A., Fanti, P.: Eye-movements as a biometric.
In: Image analysis, pp. 780–789. Springer (2005)
3. Bensch, M., Schroder, M., Bogdan, M., Rosenstiel, W.: Feature selection for high-
dimensional industrial data. In: ESANN 2005, 13th European Symposium on Ar-
tificial Neural Networks. pp. 375–380 (2005)
4. Berndt, D.J., Clifford, J.: Using dynamic time warping to find patterns in time
series. In: KDD workshop. vol. 10, pp. 359–370. Seattle, WA (1994)
5. Breiman, L.: Random forests. Machine Learning 45(1), 5–32 (2001)
6. Burges, C.J.C.: Dimension reduction: A guided tour. Foundations and Trends in
Machine Learning 2(4) (2010)
7. Díaz-Uriarte, R., Alvarez de Andrés, S.: Gene selection and classification of microarray data using random forest. BMC Bioinformatics 7, 3 (2006)
8. Gregorutti, B., Michel, B., Saint Pierre, P.: Correlation and variable importance in random forests. arXiv:1310.5726 (2015)
9. Guyon, I., Weston, J., Barnhill, S., Vapnik, V.: Gene selection for cancer classifi-
cation using support vector machines. Machine Learning 46(1-3), 389–422 (2002)
10. Holzer, S., Ilic, S., Tan, D., Navab, N.: Efficient learning of linear predictors us-
ing dimensionality reduction. In: Lee, K., Matsushita, Y., Rehg, J., Hu, Z. (eds.)
Computer Vision ACCV 2012, Lecture Notes in Computer Science, vol. 7726, pp.
15–28. Springer Berlin Heidelberg (2013)
11. Kasprowski, P., Harezlak, K.: The second eye movements verification and identifi-
cation competition. In: Biometrics (IJCB), 2014 IEEE International Joint Confer-
ence on. pp. 1–6. IEEE (2014)
12. Kasprowski, P., Komogortsev, O.V., Karpov, A.: First eye movement verification and identification competition at BTAS 2012. In: Biometrics: Theory, Applications and Systems (BTAS), 2012 IEEE Fifth International Conference on. pp. 195–202. IEEE (2012)
13. Kasprowski, P., Ober, J.: Eye movements in biometrics. In: Biometric Authentica-
tion, pp. 248–258. Springer (2004)
14. Kohavi, R., John, G.: Wrappers for feature subset selection. Artificial Intelligence 97(1-2), 273–324 (1997)
15. Kursa, M., Jankowski, A., Rudnicki, W.: Boruta - a system for feature selection. Fundamenta Informaticae 101(4), 271–285 (2010)
16. Liaw, A., Wiener, M.: Classification and regression by randomForest. R News 2(3), 18–22 (2002)
17. Miller, R., Chen, C., Eick, C., Bagherjeiran, A.: A framework for spatial feature
selection and scoping and its application to geo-targeting. In: Spatial Data Mining
and Geographical Knowledge Services (ICSDM), 2011 IEEE International Confer-
ence on. pp. 26–31 (2011)
18. Pekalska, E., Duin, R.P., Paclik, P.: Prototype selection for dissimilarity-based
classifiers. Pattern Recognition 39(2), 189–208 (2006)
19. Saeed, U.: A survey of automatic person recognition using eye movements. Inter-
national Journal of Pattern Recognition and Artificial Intelligence 28(08), 1456015
(2014)
20. Sikora, M.: Redefinition of decision rules based on the importance of elementary
conditions evaluation. Fundamenta Informaticae 123(2), 171–197 (2013)
21. Sikora, M., Gruca, A.: Quality improvement of rule-based gene group descriptions using information about GO terms importance occurring in premises of determined rules. Applied Mathematics and Computer Science 20(3), 555–570 (2010)
22. Svetnik, V., Liaw, A., Tong, C., Wang, T.: Application of Breiman's random forest to modeling structure-activity relationships of pharmaceutical molecules. In: Roli, F., Kittler, J., Windeatt, T. (eds.) Multiple Classifier Systems, Lecture Notes in Computer Science, vol. 3077, pp. 334–343. Springer Berlin Heidelberg (2004)
23. Touw, W., Bayjanov, J., Overmars, L., Backus, L., Boekhorst, J., Wels, M., van
Hijum, S.: Data mining in the life sciences with random forest: a walk in the park
or lost in the jungle? Brief. Bioinformatics 14(3), 315–326 (2013)
24. Vapnik, V., Golowich, S.E., Smola, A.: Support vector method for function approximation, regression estimation, and signal processing. In: Advances in Neural Information Processing Systems. pp. 281–287 (1997)