ScienceDirect
Available online at www.sciencedirect.com
Procedia Computer Science 132 (2018) 890–899
1877-0509 © 2018 The Authors. Published by Elsevier Ltd.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/3.0/)
Peer-review under responsibility of the scientific committee of the International Conference on Computational Intelligence and Data Science
(ICCIDS 2018).
10.1016/j.procs.2018.05.101
Corresponding author: vinayakkamath2010@gmail.com
International Conference on Computational Intelligence and Data Science (ICCIDS 2018)
Sparse Locally Adaptive Regression Kernel For Face Verification
Vinay A, Vinayaka R Kamath, Varun M, K N Balasubramanya Murthy, S Natarajan
Center for Pattern Recognition and Machine Intelligence,
PES University, Bengaluru, India.
Abstract
This paper presents thresholds, obtained by a heuristic approach, for face verification using Locally Adaptive Regression
Kernel (LARK) descriptors with the Euclidean, cosine and Chebyshev distance metrics. The absence of a threshold for these
distance metrics poses several setbacks, such as increased computational complexity and longer runtimes. The proposed method
has a significantly greater impact when the verification process is computationally intensive. The approach requires minimal
training, and face verification is accomplished using the threshold value obtained during training. The whole process is
modest: nearly all of the time is spent quantifying one of the distance metrics between the LARKs obtained from the two faces
to be verified. LARK descriptors compute a measure of resemblance on the basis of the “signal-induced distance” between a
pixel and its nearby pixels. We measure the distance between the LARKs of the two faces and compare the resemblance against
the threshold, resulting in a binary decision.
Keywords: LARK; Face Verification; SVD; Filtering; Threshold;
1. Introduction
Typically, face verification is achieved by comparing two images and deducing the resemblance between them. The
face may be subject to variations in illumination, scale, orientation, contrast and several other distortions.
Nomenclature
LARK Locally Adaptive Regression Kernel
SVD Singular Value Decomposition
ROC Receiver Operating Characteristic
Facial recognition is a biometric method of establishing the similarities and dissimilarities of an individual by comparing
digital image data with features extracted [1] beforehand. Nearly all contemporary face recognition systems work with
blueprints of numeric codes called faceprints. These systems identify nodal points of high importance on a human face. The
nodal points act as cardinal features which capture key aspects such as the length and width of the nose, the depth of the
eye sockets and the contour of the cheekbones. The systems work by acquiring data for the nodal points on a digital image of
an individual’s face and storing that data for further operations. The faceprints can then be used as a reference against
which data captured from other faces in an image or video is compared. Face verification has innumerable uses: it can be
used to detect fraud in passports and visas, to strengthen the security of card users in ATMs and banks by mapping facial
data and matching it against a database, to identify criminals, to prevent voter fraud and to keep track of attendance.
Companies like Facebook use ‘DeepFace’, a deep-learning facial recognition system. ‘FaceNet’, a Google project, achieved
close to 100 percent accuracy on the extensively used ‘Labeled Faces in the Wild’ dataset, which includes more than 13,000
images of faces from across the world.
Many of the challenges that arise concern the image quality the algorithm expects. Without accurate computation of facial
features, the robustness of an approach is lost; hence, even for the most complex algorithm, accuracy drops as image quality
[2] deteriorates. The same face appears different under changes in lighting: illumination can transform and manipulate the
appearance of an object to a great extent, so irregular lighting must be overcome. Pose variation occurs when an
individual’s face is rotated at different angles [3]: frontal views are feature-rich while side views are feature-sparse.
The problem appears when the system has to identify a rotated face using frontal-view training data, so multiple
orientations of each face need to be fed into the database. Recognizing faces across different ages of a person [4] is also
a difficult task, so age variation needs to be taken into account. A simple yet accurate model that is not computationally
intensive would help lightweight hardware architectures achieve face verification. Locally adaptive regression kernels
provided one such methodology for training-free object detection; the idea here is to extend the same constituent to face
verification with nominal training. Face verification posed a different set of challenges that needed an unorthodox
approach, very different from what LARKs were used for before. The paper “Face Verification Using the LARK Representation”
[5] provided the motivation for using LARKs in face verification.
The remaining sections of this paper illustrate the stages of the proposed pipeline. Section 2 describes the mechanisms used
in each stage of the pipeline and their advantages over their counterparts; suitable schematics and mathematical equations
depict the process pictorially and theoretically. Section 3 covers the datasets used and the challenges each one offers. The
last two sections discuss the inferences drawn from the technique; future scope and enhancements are discussed in the final
section.
2. Proposed Methodology
2.1 Overview of Approach
This section outlines the framework applied to the images to achieve face verification. Cropping the region of interest from
the raw image greatly diminishes unwanted noise and background variation. Singular value decomposition [6] assists in
compressing [7] the region of interest, thereby reducing the computational load. LARKs [8] extract key points and
descriptors in the form of vectors from the preprocessed images [9]. Euclidean distance, cosine distance, Chebyshev distance
and several other distance metrics can be used to differentiate these vectorized LARKs [10]. The magnitude of each metric is
compared against the proposed thresholds, leading to a binary decision and hence achieving face verification.
Fig. 1. Schematic of the proposed system
2.2. Singular Value Decomposition
Singular Value Decomposition, commonly known as SVD [11], is the generalization of the eigendecomposition [12] of a
positive definite normal matrix to any $m \times n$ real or complex matrix via an extension of the polar decomposition. If
$M$ is an $m \times n$ matrix whose entries come from a field $Y$, then $M$ can be factorized in the form

$M = A \Sigma B^{\theta}$   (1)

where $A$ is an $m \times m$ unitary matrix over $Y$, $\Sigma$ is an $m \times n$ diagonal matrix with non-negative real
entries, $B$ is an $n \times n$ unitary matrix over $Y$, and $B^{\theta}$ is the conjugate transpose of $B$.
SVD aids in reducing the time required by subsequent operations: the refactored matrices preserve the useful features while
representing the image with a smaller set of values [13]. The compressed image can then be used for feature extraction and
further processing.
Fig. 2(a). Image before SVD    Fig. 2(b). Image after SVD
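As a concrete illustration of the compression step, the sketch below keeps only the $k$ largest singular values of an image. This is a minimal NumPy example under our own assumptions: the paper does not specify the rank used, so the value $k = 20$ and the random stand-in image are purely illustrative.

```python
import numpy as np

def svd_compress(image, k):
    """Return the rank-k SVD approximation of a 2-D image array."""
    # Factor M = A @ diag(s) @ B^T (equation (1), with real matrices).
    A, s, Bt = np.linalg.svd(image, full_matrices=False)
    # Keep only the k largest singular values and their vectors.
    return A[:, :k] @ np.diag(s[:k]) @ Bt[:k, :]

# Toy input: a random array standing in for a cropped 112 x 92 grey face.
rng = np.random.default_rng(0)
img = rng.random((112, 92))
approx = svd_compress(img, 20)
```

The truncated factors need only $k(m + n + 1)$ numbers instead of $mn$, which is where the reduction in downstream work comes from; larger $k$ trades speed for fidelity.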
2.3. Locally Adaptive Regression Kernels
In order to display obsolete low-quality content on relatively high-resolution screens, the need for improved space-time
upscaling, denoising and deblurring algorithms has become a matter of great concern. Classical kernel regression is a
non-parametric estimator of a point or pixel within an image, used to estimate the conditional expectation of a random
variable. The kernel regression framework defines its data model as

$y_i = z(x_i) + \varepsilon_i, \quad x_i \in \omega, \quad i = 1, \ldots, P$   (2)

where $y_i$ is a noisy sample measured at $x_i = [x_{1i}, x_{2i}]^T$, $z(\cdot)$ is the required regression function,
$\varepsilon_i$ is independently and identically distributed zero-mean noise, and $P$ is the total number of samples in an
arbitrary “window” $\omega$ around a position of interest $x$. Point-wise estimates can be obtained easily with this
mechanism: $z(x_i)$ can be computed using the Taylor series

$z(x_i) \approx z(x) + \{\nabla z(x)\}^T (x_i - x) + \tfrac{1}{2} (x_i - x)^T \{\mathcal{H} z(x)\} (x_i - x) + \cdots$   (3)

$\approx \beta_0 + \beta_1^T (x_i - x) + \beta_2^T \,\mathrm{vech}\{(x_i - x)(x_i - x)^T\} + \cdots$   (4)

where $\mathcal{H}$ is the Hessian operator, while vech is the half-vectorization operator that lexicographically orders the
lower-triangular portion of a symmetric matrix into a column-stacked vector. $\beta_1$ and $\beta_2$ can be mathematically
defined as

$\beta_1 = \left[ \partial z(x)/\partial x_1, \; \partial z(x)/\partial x_2 \right]^T$   (5)

$\beta_2 = \tfrac{1}{2} \left[ \partial^2 z(x)/\partial x_1^2, \; 2\,\partial^2 z(x)/\partial x_1 \partial x_2, \; \partial^2 z(x)/\partial x_2^2 \right]^T$   (6)

The vech operation can be illustrated as below:

$\mathrm{vech}\begin{bmatrix} a & b \\ b & d \end{bmatrix} = [a, b, d]^T, \qquad \mathrm{vech}\begin{bmatrix} a & b & c \\ b & e & f \\ c & f & i \end{bmatrix} = [a, b, c, e, f, i]^T$

The locally adaptive regression kernel [14] can be formulated as follows:

$K(C_i, x_l - x_i) = \exp\{-(x_l - x_i)^T C_i (x_l - x_i)\}$   (7)

where $C_i$ is the covariance matrix of the local image gradients,

$C_i = \sum_{x_k \in \omega} \begin{bmatrix} z_{x_1}^2(x_k) & z_{x_1}(x_k)\, z_{x_2}(x_k) \\ z_{x_1}(x_k)\, z_{x_2}(x_k) & z_{x_2}^2(x_k) \end{bmatrix}$   (8)

with $z_{x_1}$ and $z_{x_2}$ denoting the first derivatives of $z$ along $x_1$ and $x_2$.
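Equations (7) and (8) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: border handling is naive, and the smoothing/normalization parameters found in full LARK formulations are omitted for brevity.

```python
import numpy as np

def lark_weights(z, i, j, radius=2):
    """LARK weights at pixel (i, j) of grey image z (equations (7)-(8)).

    Naive borders: (i, j) must be at least `radius` pixels from every edge.
    """
    gy, gx = np.gradient(z.astype(float))      # derivatives along x2, x1
    win = np.s_[i - radius:i + radius + 1, j - radius:j + radius + 1]
    zx1, zx2 = gx[win].ravel(), gy[win].ravel()
    # Equation (8): gradient covariance summed over the window.
    C = np.array([[np.sum(zx1 * zx1), np.sum(zx1 * zx2)],
                  [np.sum(zx1 * zx2), np.sum(zx2 * zx2)]])
    # Offsets x_l - x_i for every pixel in the window.
    dy, dx = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    offsets = np.stack([dx.ravel(), dy.ravel()], axis=1).astype(float)
    # Equation (7): K = exp{-(x_l - x_i)^T C (x_l - x_i)}.
    quad = np.einsum('nd,de,ne->n', offsets, C, offsets)
    return np.exp(-quad)

# Tilted-plane toy image: gradients are constant, so the kernel
# stretches along the direction of least intensity change.
toy = np.outer(np.arange(20.0), np.ones(20))
w = lark_weights(toy, 10, 10)
```

Because $C_i$ is a sum of gradient outer products it is positive semi-definite, so the weights lie in $(0, 1]$ and peak at the centre pixel; their shape encodes the local edge orientation, which is what makes the kernels "locally adaptive".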
2.4. Descriptors and Matchers
Vectorized LARKs act as key points and descriptors for the image [15]. Overlapping patches are removed from the vectorized
LARKs, giving a visual impression of the generated LARKs, which in turn can be used to plot an image. Different sets of key
points can be obtained from the preprocessed images by varying the smoothness, window size and sensitivity, each setting
resulting in a slightly different set of vectorized LARKs [16]. The theoretical distance in a vector space between two raw
LARKs is calculated using different distance metrics.
A unique set of LARKs is obtained every time the parameters are tweaked, and significant variations are observed in the
visual LARKs depending on the input image. Each image from a dataset is compared with the rest of the images, including
those of the same person, and the output for each comparison is labeled matched or unmatched. When this process is complete
for all images, an optimal threshold must be found that separates the labeled values. To achieve this, a graph of
sensitivity and specificity versus threshold is plotted; the intersection of the two curves gives the required threshold,
and the magnitude of the y value at the intersection reflects the accuracy of that threshold.
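The threshold search described above can be sketched as follows. This is a minimal illustration under our own assumptions (a simple linear sweep of candidate thresholds, with "matched" meaning distance below the threshold), not the authors' code.

```python
import numpy as np

def optimal_threshold(distances, labels, n_steps=200):
    """Return the threshold where sensitivity and specificity intersect.

    distances: distance-metric values for image pairs.
    labels:    1 for a matched pair, 0 for an unmatched pair.
    A pair is predicted "matched" when its distance falls below the
    candidate threshold; the curves cross where the absolute gap
    between sensitivity and specificity is smallest.
    """
    distances = np.asarray(distances, dtype=float)
    labels = np.asarray(labels, dtype=int)
    best_t, best_gap = None, np.inf
    for t in np.linspace(distances.min(), distances.max(), n_steps):
        pred = distances < t
        sens = np.mean(pred[labels == 1])        # true positive rate
        spec = np.mean(~pred[labels == 0])       # true negative rate
        gap = abs(sens - spec)
        if gap < best_gap:
            best_t, best_gap = t, gap
    return best_t

# Toy labeled pairs: matched pairs cluster near 0.15, unmatched near 0.85.
d = np.array([0.10, 0.15, 0.20, 0.80, 0.85, 0.90])
y = np.array([1, 1, 1, 0, 0, 0])
t = optimal_threshold(d, y)
```

At verification time only one distance computation and one comparison against this stored threshold are needed, which is what keeps the method light.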
Fig. 3(a). Sample Image from Faces95 database Fig. 3(b). Visual LARKs with low window size
Fig. 3(c). Visual LARKs with high sensitivity Fig. 3(d). Visual LARKs with low sensitivity
A metric, or distance function, is a special mapping that assigns a spatial distance [17] to pairs of elements of a set as
non-negative real numbers. The Euclidean, Chebyshev and cosine distances were found to be geometrically sound and suitably
discriminative. The cosine distance is the dot product divided by the product of the Euclidean distances of the two vectors
from the origin.
$\mathrm{distance}_{cosine}(X, Y) = \frac{\langle X, Y \rangle}{\|X\|\,\|Y\|} = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}}$   (9)

The Chebyshev distance is the special case of the Minkowski distance where $p$ goes to infinity. Also known as the
chessboard distance, it is the $L_\infty$ norm of the difference:

$\mathrm{distance}_{chebyshev}(X, Y) = \lim_{p \to \infty}\left(\sum_{i=1}^{n} |x_i - y_i|^p\right)^{1/p} = \max_i |x_i - y_i|$   (10)

The Euclidean distance [18] is also a special case of the Minkowski distance, with $p = 2$. It is the most commonly used
distance measure in geometric interpretations:

$\mathrm{distance}_{euclidean}(X, Y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$   (11)
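The three metrics used in the pipeline can be written directly in NumPy. The function names below are our own; note also that equation (9) as stated yields the cosine similarity ratio, and a common convention obtains a distance as one minus this value.

```python
import numpy as np

def cosine_measure(x, y):
    # Equation (9): dot product over the product of Euclidean norms.
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

def chebyshev_distance(x, y):
    # Equation (10): the L-infinity norm of the difference.
    return float(np.max(np.abs(x - y)))

def euclidean_distance(x, y):
    # Equation (11): Minkowski distance with p = 2.
    return float(np.sqrt(np.sum((x - y) ** 2)))

# Example on two simple vectors; real inputs would be vectorized LARKs.
u = np.array([1.0, 0.0])
v = np.array([0.0, 1.0])
d_e = euclidean_distance(u, v)
```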
Fig. 4(a). Euclidean distance contour Fig. 4(b). Cosine distance contour Fig. 4(c). Chebyshev distance contour
3. Datasets and Experimentation
3.1. Datasets Used
To test the robustness of the methodology against different variations, corresponding datasets were used; these helped in
the testing and analysis of the methodology.
3.1.1. ORL Faces
Formerly known as ‘The ORL Database of Faces’ [19], the dataset holds images from the early 1990s captured at the Cambridge
University Computer Laboratory. It contains ten unique images of each of 40 distinct individuals, subject to variations in
the time of day of capture, lighting, facial expressions and other details such as glasses. Each image is 92 × 112 pixels
with the standard 256 grey levels. The dataset poses a challenge in terms of facial expressions and lighting conditions.
Fig. 5. Sample record from ORL faces
3.1.2. Grimace
An assembly of images of 18 individuals designed and maintained by Dr. Libor Spacek, grimace [20] focuses mainly on
expression variations for both male and female subjects. The dataset contains 20 portraits of each individual at a
resolution of 180 × 200 pixels. The background is kept the same throughout, with small uniform head-scale variations;
lighting changes are minimal, and there is little to no variation in the subjects’ hairstyles.
Fig. 6. Individual exhibiting different expressions
3.1.3. Faces95
The brainchild of Dr. Libor Spacek, this database contains portraits of 72 distinct individuals [21]. A sequence of 20 images was captured while each subject was asked to step towards the camera after every snap. The dataset offers large head-scale variation and minor variations due to shadows, resulting in discrepancies in the red background. Noticeable changes in lighting occur due to the artificial lighting systems used.
4. Results and Inferences
Three benchmark datasets and three different distance metrics were utilized to draw conclusions from a heuristic approach [22]. The grimace, faces95 and ORL databases each pose a different set of challenges and variations, making them well suited for a heuristic approach. An optimal distance metric was to be deduced that would give an exact measure of the interspace between two sets of LARKs extracted from the images. This interval helps in arriving at a conclusion about the correlation between two images. A ROC curve [23] was to be plotted by comparing all the subjects in each dataset with a handful of randomly selected subjects. The frequencies of true positives and true negatives were normalized to simplify the analysis [24]. A graph of sensitivity and specificity vs. criterion value was plotted instead of a ROC curve so as to obtain the value of the threshold as well. Each sample from the ORL dataset was mapped to five other subjects from the same dataset and the outputs were labeled. This process was repeated for grimace and faces95 with three random sample comparisons for every subject. The methodology provided reasonable accuracy with minimal variation across multiple tests.
The steady increase in the graph indicates that the data points are normally distributed, while quickly rising graphs exhibit densely populated points around the corresponding threshold. Applying SVD to the raw images reduces accuracy by up to 2% but significantly decreases the time taken to compute the output. The red curve on each graph was obtained by plotting the specificity value at all possible thresholds; the blue curve represents the sensitivity values over the same thresholds.
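The SVD step mentioned above amounts to replacing each raw image with a rank-k approximation before descriptor extraction. A minimal sketch, assuming NumPy; the function name and the choice of k are illustrative, not values taken from the paper:

```python
import numpy as np

def svd_reduce(image, k=30):
    # Truncated SVD: keep only the k largest singular values/vectors.
    # The rank-k product approximates the image while discarding fine
    # detail, trading a small accuracy loss for faster downstream work.
    U, s, Vt = np.linalg.svd(image, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
```

Keeping the top k singular components compresses the image energy into far fewer numbers, which is consistent with the observed speed-up at a small (up to 2%) accuracy cost.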
Fig. 7. Specificity and Sensitivity vs euclidean distance for ORL dataset
Fig. 8. Specificity and Sensitivity vs cosine distance for grimace dataset
The intersection of the specificity curve and the sensitivity curve gives the optimal threshold for face verification. The graph was plotted for all possible combinations of datasets and distance metrics. The following table presents the optimal thresholds obtained for the different distance metrics.
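The intersection-based threshold selection can be sketched as a sweep over candidate thresholds, assuming one array of genuine-pair distances and one of impostor-pair distances (the function and variable names here are hypothetical, not from the paper):

```python
import numpy as np

def optimal_threshold(genuine, impostor, n_steps=1000):
    # Sweep candidate thresholds over the observed distance range.
    # sensitivity(t): fraction of genuine-pair distances at or below t
    # specificity(t): fraction of impostor-pair distances above t
    # The optimal threshold is where the two curves cross.
    lo = min(genuine.min(), impostor.min())
    hi = max(genuine.max(), impostor.max())
    ts = np.linspace(lo, hi, n_steps)
    sens = np.array([(genuine <= t).mean() for t in ts])
    spec = np.array([(impostor > t).mean() for t in ts])
    return ts[np.argmin(np.abs(sens - spec))]
```

At the crossing point the true-positive and true-negative rates are balanced, which matches the graphical procedure of reading off the threshold where the red and blue curves meet.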
Table 1. Optimal thresholds obtained from experimentation.

Dataset        | Euclidean Distance    | Cosine Distance       | Chebyshev Distance
               | Threshold  Accuracy   | Threshold  Accuracy   | Threshold  Accuracy
ORL Database   | 0.05760    78.4%      | 0.030565   77.25%     | 0.026942   70.75%
Grimace        | 0.032258   95.5%      | 0.01272    93.7%      | 0.015282   94.5%
Faces95        | 0.022585   68%        | 0.025926   66.81%     | 0.022585   66.16%
5. Conclusion and Future Work
Analysing the experimental results, it is fairly noticeable that the system provides equitable accuracy for all the considered distance metrics. Euclidean distance stands out, with about 1% to 2% better accuracy across all the datasets. The grimace dataset embodies significant variations in the expressions of the subjects, to which the pipeline responds admirably. It was noticed that the system showed only minute variations over several trials, proving the stability of the mechanism. Faces95, on the other hand, shifts the region of interest and introduces variations in the background, on which the system does not perform up to the mark. There is room for improvement with regard to the noise and disturbances the subjects introduce. Locally adaptive regression kernels have proven to be capable and concise descriptors for extracting features from an image. As they require minimal training and perform significantly better than other feature extraction algorithms, there is a sense of under-utilization. There is also enormous scope for better matchers to be used with LARK descriptors. Distance metrics such as euclidean, cosine and chebyshev show appreciable accuracy for the amount of computational power required. Future scope includes pairing LARK descriptors with FLANN matchers, convolutional neural networks, and dimensionality reduction of LARKs using principal component analysis; using feature aggregation techniques such as bag of words and Fisher vectors for more precise segregation of the descriptors; and further evaluating these using a Support Vector Machine to conclude verification.
References
[1] Brown, M. and Lowe, D.G. 2002. “Invariant features from interest point groups”. British Machine Vision Conference, Cardiff, Wales, pp.
656-665.
[2] H. R. Sheikh, A. C. Bovik, "Image information and visual quality", Image Processing IEEE Transactions, vol. 15, no. 2, pp. 430-444, 2006.
[3] O. Boiman, M. Irani, "Detecting Irregularities in Images and in Video", Int'l J. Computer Vision, vol. 74, pp. 17-31, Aug. 2007.
[4] X. Geng, C. Yin, Z.-H. Zhou, "Facial age estimation by learning from label distributions", IEEE Trans. Pattern Anal. Mach. Intell., vol. 35,
no. 10, pp. 2401-2412, Oct. 2013.
[5] H.J. Seo and P. Milanfar, “Face verification using the LARK representation,” IEEE Transactions on Information Forensics and Security,
2011.
[6] Q. X. Gao, "SVD for face recognition problems and solutions," China Journal of Image and Graphics, 2006, Vol. 11, No.12, pp. 1784-1791.
[7] H. S. Prasantha, H. L. Shashidhara and K. N. B. Murthy, "Image Compression Using SVD," International Conference on Computational
Intelligence and Multimedia Applications (ICCIMA 2007), Sivakasi, Tamil Nadu, 2007, pp. 143-145.
[8] H. J. Seo and P. Milanfar, “Training-free, generic object detection using locally adaptive regression kernels,” IEEE Trans. Pattern Anal.
Mach.Intell., vol. 32, no. 9, pp. 1688–1704, Sep. 2010.
[9] Y. Q. Cheng and Y.M. Zhuang, "The image feature extraction and recognition based on matrix similarity," Computer Research and
Development, 1992, 11, pp. 42-48.
[10] M. Guillaumin, J. Verbeek, and C. Schmid, “Is that you? Metric learning approaches for face identification,” in Proc. IEEE Int. Conf.
Computer Vision (ICCV), 2009.
[11] Virginia C. Klema and Alan J. Laub, “The Singular Value Decomposition: Its Computation and Some Applications”, IEEE Transactions on Automatic Control, vol. AC-25, no. 2, April 1980.
[12] T. S. T. Chan and Y. H. Yang, "Polar n-Complex and n-Bicomplex Singular Value Decomposition and Principal Component Pursuit," IEEE Transactions on Signal Processing, vol. 64, no. 24, pp. 6533-6544, Dec. 2016.
[13] K. M. Aishwarya, R. Ramesh, P. M. Sobarad and V. Singh, "Lossy image compression using SVD coding algorithm," 2016 International
Conference on Wireless Communications, Signal Processing and Networking (WiSPNET), Chennai, 2016, pp. 1384-1389.
[14] Hae Jong Seo and Peyman Milanfar, “Face Verification Using the LARK Representation”, IEEE Transactions on Information Forensics
and security, vol. 6, no. 4, Dec 2011.
[15] L. Wolf, T. Hassner, and Y. Taigman, “Descriptor based methods in the wild,” in Proc. Faces in Real-Life Image Workshop in Eur. Conf.
Computer Vision (ECCV), Marseille, France, 2008.
[16] Z. Cao, Q. Yin, X. Tang, and J. Sun, “Face recognition with learning-based descriptor,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), 2010, pp. 2707–2714.
[17] A. Vadivel, A. K. Majumdar and S. Sural, "Performance comparison of distance metrics in content-based image retrieval applications", Intl. Conference on Information Technology.
[18] L. D. Chase, " Euclidean Distance", College of Natural Resources, Colorado State University, Fort Collins, Colorado, USA, 824-146-294,
NR 505, December 8, 2008.
[19] AT&T Laboratories Cambridge (2002), "The Database of Faces" [Online]. Retrieved from: http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html.
[20] Dr Libor Spacek (2007), “Collection of Facial Images: Grimace” [Online]. Retrieved from: http://cswww.essex.ac.uk/mv/allfaces/grimace.html
[21] Dr Libor Spacek (2007), “Collection of Facial Images: Faces95” [Online]. Retrieved from: http://cswww.essex.ac.uk/mv/allfaces/faces95.html
[22] Senthilkumaran N and Vaithegi S, “ Image Segmentation By Using Thresholding Techniques For Medical Images”, Computer Science &
Engineering: An International Journal (CSEIJ), Vol.6, No.1, February 2016.
[23] Devlin, S. A.; Thomas, E. G. and Emerson, S. S. (2013). “Robustness of approaches to ROC curve modeling under misspecification of the
underlying probability model”, Communications in Statistics—Theory and Methods, 42, 3655–3664.
[24] D. Surya Prabha and J. Satheesh Kumar, “Performance Evaluation of Image Segmentation using Objective Methods”, Indian Journal of Science and Technology, Vol 9(8), February 2016.