Available online at www.sciencedirect.com
ScienceDirect
Procedia Computer Science 132 (2018) 890–899
www.elsevier.com/locate/procedia
10.1016/j.procs.2018.05.101
1877-0509 © 2018 The Authors. Published by Elsevier Ltd.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/3.0/)
Peer-review under responsibility of the scientific committee of the International Conference on Computational Intelligence and Data Science (ICCIDS 2018).
Corresponding author: vinayakkamath2010@gmail.com
International Conference on Computational Intelligence and Data Science (ICCIDS 2018)
Sparse Locally Adaptive Regression Kernel For Face Verification
Vinay A, Vinayaka R Kamath, Varun M, K N Balasubramanya Murthy, S Natarajan
Center for Pattern Recognition and Machine Intelligence,
PES University, Bengaluru, India.
Abstract
This paper presents thresholds obtained by a heuristic approach for face verification using Locally Adaptive Regression Kernel (LARK) descriptors with the Euclidean, cosine and Chebyshev distance metrics. The absence of a threshold for these distance metrics poses several drawbacks, such as increased computational complexity and longer runtime. The proposed method is of significantly greater benefit when the verification process is computationally intensive. The approach requires minimal training, and face verification is accomplished using the threshold value obtained during training. The whole process is lightweight, and nearly all of the time is spent solely on evaluating one of the distance metrics between the LARKs obtained from the two faces under verification. LARK descriptors compute a measure of resemblance on the basis of the "signal-induced distance" between a pixel and its nearby pixels. We measure the distance between the LARKs extracted from the two faces and compare it against the threshold, resulting in a binary decision.
Keywords: LARK; Face Verification; SVD; Filtering; Threshold;
1. Introduction
Typically, face verification is achieved by collating two images and deducing the resemblance between them. The captured face may be subject to variations in illumination, scale, orientation, contrast and several other distortions.
Nomenclature
LARK Locally Adaptive Regression Kernel
SVD Singular Value Decomposition
ROC Receiver Operating Characteristic
Facial recognition is a biometric method of finding similarities and dissimilarities of an individual by collating digital image data with the features extracted [1] beforehand. Nearly all contemporary face recognition systems work with blueprints of numeric codes called faceprints. These systems identify nodal points of high importance on a human face; the nodal points act as cardinal features which describe key aspects such as the length and width of the nose, the depth of the eye sockets and the geography of the cheekbones. The systems work by capturing data for the nodal points on a digital image of an individual's face and storing the data for further operations. These faceprints can then be used as a reference to compare against data captured from other faces in an image or video. Face verification has innumerable uses: detecting fraud in passports and visas, improving the security of card users in ATMs and banks by mapping facial data and matching it against a database, identifying criminals, preventing voter fraud and keeping track of attendance. Companies like Facebook use 'DeepFace', a deep learning facial recognition system. 'FaceNet', a Google project, achieved close to 100 percent accuracy on an extensively used dataset known as 'Labeled Faces in the Wild', which includes more than 13,000 images of faces from across the world. Several challenges remain, beginning with the image quality the algorithm expects: without accurate computation of facial features, the robustness of an approach is lost, so even for the most complex algorithms the accuracy drops as image quality [2] deteriorates. The same face appears different under changing lighting, since illumination can transform and manipulate the appearance of an object to a great extent, so there is a need to overcome irregular lighting. Pose variation arises when an individual's face is rotated at different angles [3]; frontal views are feature-rich while side views are sparse in features, and the problem appears when the system has to identify a rotated face using frontal-view training data, so multiple orientations of the face need to be fed into the database. Recognizing faces across different ages of a person [4] is also a difficult task, so age variation needs to be taken into account. A simple yet accurate model which is not computationally intensive would help hardware with a lightweight architecture achieve face verification. Locally adaptive regression kernels provide one such methodology for training-free object detection, and the idea here is to extend the same constituent to face verification with nominal training. Face verification poses a different set of challenges which need an unorthodox approach, very different from what LARKs were used for before. The paper "Face Verification Using the LARK Representation" [5] provided the motivation for the usage of LARKs in face verification.
Further sections of this paper illustrate the various stages of the proposed pipeline. Section 2 describes the mechanisms used in the stages of the pipeline and their advantages over their counterparts; suitable schematics and mathematical equations depict the process pictorially and theoretically. Section 3 describes the datasets used and the challenges each one offers. The last two sections present the inferences drawn from the technique; future scope and enhancements are discussed in the final section of the paper.
2. Proposed Methodology
2.1 Overview of Approach
This section outlines the framework applied to the images to achieve face verification. Cropping the region of interest from the raw image helps to diminish unwanted noise and background variation to a large extent. Singular value decomposition [6] assists in compression [7] of the region of interest, thereby reducing the computational load. LARKs [8] extract key points and descriptors in the form of vectors from the preprocessed images [9]. Euclidean distance, cosine distance, Chebyshev distance and several other distance metrics can be used to differentiate these vectorized LARKs [10]. The magnitude of the chosen metric is compared against the proposed thresholds, leading to a binary decision and hence achieving face verification.
Fig. 1. Schematic of the proposed system
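As a rough illustration of how the stages in Fig. 1 compose, the following sketch wires the pipeline together; the helper callables and the threshold are assumptions for illustration (concrete versions of the individual stages are sketched in the sections that follow), not the authors' implementation.

```python
import numpy as np

def verify_pair(img_a, img_b, crop, compress, extract_lark, metric, threshold):
    """Crop -> SVD compression -> LARK descriptors -> distance -> binary decision.

    Every callable is supplied by the caller; `threshold` is the value learned
    during training (Table 1 reports the dataset-specific values).
    """
    desc_a = np.ravel(extract_lark(compress(crop(img_a))))
    desc_b = np.ravel(extract_lark(compress(crop(img_b))))
    return metric(desc_a, desc_b) <= threshold  # True => faces declared a match
```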
2.2. Singular Value Decomposition
Singular Value Decomposition, commonly known as SVD [11], is the generalization of the eigendecomposition [12] of a positive definite normal matrix to any $m \times n$ real or complex matrix via an extension of the polar decomposition. If $M$ is an $m \times n$ matrix whose entries come from a field $Y$, then $M$ can be factorized in the form

$M = A \Sigma B^{\theta}$  (1)

where $A$ is an $m \times m$ unitary matrix over $Y$, $\Sigma$ is an $m \times n$ diagonal matrix with non-negative real entries, $B$ is an $n \times n$ unitary matrix over $Y$, and $B^{\theta}$ is the conjugate transpose of $B$.

SVD aids in reducing the time required by subsequent operations. The refactored matrices preserve the useful features while representing the image with a smaller set of values [13]. This compressed image can then be used for feature extraction and further processing.

Fig. 2(a). Image before SVD; Fig. 2(b). Image after SVD
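As a concrete illustration of equation (1) and the compression shown in Fig. 2, a minimal NumPy sketch that keeps only the leading singular values; the rank k is an assumed parameter, not a value prescribed by the paper.

```python
import numpy as np

def svd_compress(image, k=30):
    """Rank-k approximation of a grayscale image via equation (1).

    `image` is a 2-D float array; k is an illustrative rank trading the
    reported accuracy drop (up to 2%) against computation time.
    """
    A, sigma, B_theta = np.linalg.svd(image, full_matrices=False)  # M = A Sigma B^theta
    return (A[:, :k] * sigma[:k]) @ B_theta[:k, :]

# Only k * (m + n + 1) numbers are needed to rebuild the approximation,
# instead of the m * n pixels of the original region of interest.
```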
2.3. Locally Adaptive Regression Kernels
In order to display obsolete, low-quality content on relatively high-resolution screens, the need for improved space-time upscaling, denoising and deblurring algorithms has become a matter of great concern. Classical kernel regression is a non-parametric estimator of a point or pixel within an image; it estimates the conditional expectation of a random variable. The kernel regression framework defines its data model as

$y_i = z(x_i) + \varepsilon_i, \qquad x_i \in \omega, \quad i = 1, \dots, P$  (2)

where $y_i$ is a noisy sample measured at $x_i = [x_{1i}, x_{2i}]^T$, $z(x)$ is the required regression function, $\varepsilon_i$ is independently and identically distributed zero-mean noise, and $P$ is the total number of samples in an arbitrary "window" $\omega$ around a position of interest $x$. Pointwise estimates can be obtained easily using this mechanism. $z(x_i)$ can be expanded using the Taylor series:

$z(x_i) \approx z(x) + \{\nabla z(x)\}^T (x_i - x) + \tfrac{1}{2} (x_i - x)^T \{\mathcal{H} z(x)\} (x_i - x) + \dots$  (3)

$\qquad = \beta_0 + \beta_1^T (x_i - x) + \beta_2^T \operatorname{vech}\{ (x_i - x)(x_i - x)^T \} + \dots$  (4)

where $\mathcal{H}$ is the Hessian operator and vech is the half-vectorization operator that lexicographically orders the lower-triangular portion of a symmetric matrix into a column-stacked vector. $\beta_1$ and $\beta_2$ can be mathematically defined as

$\beta_1 = \left[ \dfrac{\partial z(x)}{\partial x_1}, \; \dfrac{\partial z(x)}{\partial x_2} \right]^T$  (5)

$\beta_2 = \dfrac{1}{2} \left[ \dfrac{\partial^2 z(x)}{\partial x_1^2}, \; 2\dfrac{\partial^2 z(x)}{\partial x_1 \partial x_2}, \; \dfrac{\partial^2 z(x)}{\partial x_2^2} \right]^T$  (6)

The vech operation can be illustrated as below:

$\operatorname{vech}\!\left( \begin{bmatrix} a & b \\ b & d \end{bmatrix} \right) = [\,a \; b \; d\,]^T, \qquad \operatorname{vech}\!\left( \begin{bmatrix} a & b & c \\ b & e & f \\ c & f & i \end{bmatrix} \right) = [\,a \; b \; c \; e \; f \; i\,]^T$

The locally adaptive regression kernel [14] can be formulated as follows:

$K(C_i, x_l - x_i) = \exp\{ -(x_l - x_i)^T C_i (x_l - x_i) \}$  (7)

where the matrix $C_i$ is built from the local image gradients:

$C_i = \sum_{x_k \in \omega} \begin{bmatrix} z_{x_1}^2(x_k) & z_{x_1}(x_k)\, z_{x_2}(x_k) \\ z_{x_1}(x_k)\, z_{x_2}(x_k) & z_{x_2}^2(x_k) \end{bmatrix}$  (8)

with $z_{x_1}$ and $z_{x_2}$ denoting the first derivatives of $z$ along the $x_1$ and $x_2$ directions.
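A minimal sketch of how equations (7) and (8) could be evaluated for a single pixel, assuming finite-difference gradients and a square window; the window size and the absence of any normalisation or smoothing are assumptions, not the authors' exact implementation.

```python
import numpy as np

def lark_weights(image, i, j, radius=2):
    """LARK weights K(C_i, x_l - x_i) for pixel (i, j) over a local window.

    Gradients are estimated with np.gradient; the window is a square of side
    2*radius + 1, and (i, j) is assumed to lie at least `radius` pixels from
    the image border. Illustrative sketch of equations (7)-(8) only.
    """
    z_x1, z_x2 = np.gradient(image.astype(float))          # first derivatives along rows/cols
    win = (slice(i - radius, i + radius + 1), slice(j - radius, j + radius + 1))
    g1, g2 = z_x1[win].ravel(), z_x2[win].ravel()
    # Equation (8): gradient covariance matrix C_i summed over the window
    C = np.array([[np.sum(g1 * g1), np.sum(g1 * g2)],
                  [np.sum(g1 * g2), np.sum(g2 * g2)]])
    # Equation (7): kernel value for every offset (x_l - x_i) in the window
    offsets = np.stack(np.meshgrid(np.arange(-radius, radius + 1),
                                   np.arange(-radius, radius + 1),
                                   indexing="ij"), axis=-1).reshape(-1, 2).astype(float)
    K = np.exp(-np.einsum("nd,de,ne->n", offsets, C, offsets))
    return K.reshape(2 * radius + 1, 2 * radius + 1)
```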
2.4. Descriptors and Matchers
Vectorized LARKs act as key points and descriptors for the image [15]. Overlapping patches are removed from the vectorized LARKs, giving a visual impression of the generated LARKs, which can in turn be plotted as an image. Different sets of key points can be obtained from the preprocessed images by varying the smoothness, window size and sensitivity, each time resulting in a slightly different set of vectorized LARKs [16]. The distance in a vector space between two raw LARKs is then calculated using different distance metrics.
A unique set of LARKs is obtained every time the parameters are tweaked, and significant variations are observed in the visual LARKs depending on the input image. Each image from a dataset is compared with the rest of the images, including those of the same person, and the output of each comparison is labeled as matched or unmatched. Once this process is complete for all images, an optimal threshold must be found which separates these labeled values. To achieve this, a graph of sensitivity and specificity versus threshold is plotted; the intersection of these curves gives the required threshold, and the magnitude of the y value at the intersection indicates the accuracy of that threshold.
Fig. 3(a). Sample Image from Faces95 database Fig. 3(b). Visual LARKs with low window size
Fig. 3(c). Visual LARKs with high sensitivity Fig. 3(d). Visual LARKs with low sensitivity
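The threshold selection just described can be sketched as follows; the pair distances, labels and the grid of candidate thresholds are illustrative assumptions rather than the authors' code.

```python
import numpy as np

def optimal_threshold(distances, same_person, n_grid=200):
    """Pick the threshold where the sensitivity and specificity curves cross.

    distances   : 1-D array of distances between pairs of LARK descriptors
    same_person : boolean array, True where the pair depicts the same subject
    Returns (threshold, sensitivity at that threshold). Illustrative sketch.
    """
    distances = np.asarray(distances, dtype=float)
    same_person = np.asarray(same_person, dtype=bool)
    thresholds = np.linspace(distances.min(), distances.max(), n_grid)
    best_t, best_sens, best_gap = None, None, np.inf
    for t in thresholds:
        predicted_match = distances <= t
        sensitivity = np.mean(predicted_match[same_person])     # true positive rate
        specificity = np.mean(~predicted_match[~same_person])   # true negative rate
        gap = abs(sensitivity - specificity)
        if gap < best_gap:
            best_t, best_sens, best_gap = t, sensitivity, gap
    return best_t, best_sens
```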
A metric or distance function is a mapping which defines the spatial distance [17] between elements of a set as non-negative real numbers. The Euclidean, Chebyshev and cosine distances were found to be geometrically sound and suitable for this purpose. Cosine distance is the dot product of the two vectors divided by the product of their Euclidean norms (their distances from the origin).
$\mathrm{distance}_{cosine}(X, Y) = \dfrac{\langle X, Y \rangle}{\|X\|\,\|Y\|} = \dfrac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\;\sqrt{\sum_{i=1}^{n} y_i^2}}$  (9)

Chebyshev distance is a special case of the Minkowski distance where $p$ goes to infinity. Also known as the chessboard distance, it is the $L_{\infty}$ norm of the difference:

$\mathrm{distance}_{chebyshev}(X, Y) = \lim_{p \to \infty} \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p} = \max_i |x_i - y_i|$  (10)

Euclidean distance [18] is also a special case of the Minkowski distance, with $p = 2$. It is the most commonly used distance measure in geometric interpretation:

$\mathrm{distance}_{euclidean}(X, Y) = \|x - y\|_2 = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$  (11)
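Equations (9)-(11) map directly onto standard NumPy/SciPy calls; a small sketch with placeholder descriptor vectors (the vector contents are illustrative):

```python
import numpy as np
from scipy.spatial import distance

a = np.random.rand(64)   # placeholder vectorized LARK of face A
b = np.random.rand(64)   # placeholder vectorized LARK of face B

cos_sim = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))  # eq. (9)
cheb = distance.chebyshev(a, b)                                   # eq. (10): max |a_i - b_i|
eucl = distance.euclidean(a, b)                                   # eq. (11)

# Note: SciPy's distance.cosine returns 1 - cos_sim, so the direction of the
# comparison against a learned threshold depends on which convention is used.
```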
Fig. 4(a). Euclidean distance contour Fig. 4(b). Cosine distance contour Fig. 4(c). Chebyshev distance contour
3. Datasets and Experimentation
3.1. Datasets Used
To test the robustness of the methodology against variations, appropriate datasets were used. These helped in the testing and analysis of the methodology.
3.1.1. ORL Faces
Formerly known as 'The ORL Database of Faces' [19], this dataset holds images from the early 1990s captured at the Cambridge University Computer Laboratory. It contains ten unique images of each of 40 distinct individuals, subjected to variations in the time of capture, lighting, facial expressions and other details such as glasses. Each image is 92 × 112 pixels with the standard 256 grey levels per pixel. The dataset posed a challenge in terms of facial expressions and lighting conditions.
Fig. 5. Sample record from ORL faces
3.1.2. Grimace
An assembly of images of 18 individuals designed and maintained by Dr. Libor Spacek, Grimace [20] focuses mainly on expression variations for both male and female candidates. The dataset contains 20 portraits of each individual at a resolution of 180 × 200 pixels. The background is kept the same throughout all the images, with small uniform head-scale variations. The lighting changes are minimal and there is little to no variation in the hairstyles of the subjects.
Fig. 6. Individual exhibiting different expressions
3.1.3. Faces95
The brainchild of Dr. Libor Spacek, this database contains portraits of 72 distinct individuals [21]. A sequence of 20 images was captured while the subject was asked to step towards the camera after each snap. The dataset offers large head-scale variation and minor variations due to shadows, resulting in discrepancies in the red background. Noticeable changes in lighting occur due to the artificial lighting systems used.
4. Results and Inferences
Three benchmark datasets and three different distance metrics were utilized to draw conclusions from a heuristic approach [22]. The Grimace, Faces95 and ORL databases each offer a different set of challenges and variations, making them well suited to a heuristic approach. An optimal distance metric was to be deduced which would give an exact measure of the interspace between two sets of LARKs from the images; this interval helps in arriving at a conclusion about the correlation between two images. A ROC curve [23] could be plotted by comparing all the subjects in the dataset with a handful of randomly selected subjects. The frequencies of true positives and true negatives were normalized to simplify the analysis [24]. A graph of sensitivity and specificity versus criterion value was plotted instead of a ROC curve so as to obtain the value of the threshold as well. Each sample from the ORL dataset was compared against five other subjects from the same dataset and the outputs were labeled; this process was repeated for Grimace and Faces95 with three random sample comparisons per subject. The methodology provided reasonable accuracy with minimal variation over multiple tests.
The steady increase in the graph shows that the data points are normally distributed, while quickly rising graphs indicate densely populated points around the corresponding threshold. Applying SVD to the raw images reduces the accuracy by up to 2% but significantly decreases the time taken to compute the output. The red curve on the graph was obtained by plotting the specificity value at all possible thresholds; the blue curve represents the corresponding sensitivity values.
Fig. 7. Specificity and Sensitivity vs euclidean distance for ORL dataset
Fig. 8. Specificity and Sensitivity vs cosine distance for grimace dataset
The intersection of the specificity and sensitivity curves gives the optimal threshold for face verification. The graph was plotted for all combinations of datasets and distance metrics. The following table presents the optimal thresholds obtained for the different distance metrics.
Table 1. Optimal thresholds obtained from experimentation.
Dataset        | Euclidean Distance       | Cosine Distance          | Chebyshev Distance
               | Threshold    Accuracy    | Threshold    Accuracy    | Threshold    Accuracy
ORL Database   | 0.05760      78.4%       | 0.030565     77.25%      | 0.026942     70.75%
Grimace        | 0.032258     95.5%       | 0.01272      93.7%       | 0.015282     94.5%
Faces95        | 0.022585     68%         | 0.025926     66.81%      | 0.022585     66.16%
5. Conclusion and Future Work
Analysing the experimental results, it is fairly noticeable that the system provides equitable accuracy for the considered distance metrics. Euclidean distance stands out, with roughly 1% to 2% better accuracy across all the datasets. The Grimace dataset embodies significant variation in the expressions of the subjects, on which the pipeline performs admirably; the system also showed only minimal variation over several trials, proving the stability of the mechanism. Faces95, on the other hand, shifts the region of interest and introduces variation in the background, on which the system does not perform up to the mark; there seems to be room for improvement with regard to the noise and disturbances that these subjects introduce. Locally adaptive regression kernels have proven to be capable and concise descriptors for extracting features from an image. As they require minimal training and perform significantly better than other feature extraction algorithms, there is a sense of underutilization. There is also an enormous scope for better matchers to be used with LARK descriptors. The Euclidean, cosine and Chebyshev distance metrics show appreciable accuracy for the amount of computational power required. Future scope includes pairing LARK descriptors with FLANN matchers and convolutional neural networks, and dimensionality reduction of LARKs using principal component analysis. Feature aggregation techniques such as bag of words and Fisher vectors could be used for more precise segregation of the descriptors, which could then be evaluated using a Support Vector Machine to conclude verification.
References
[1] Brown, M. and Lowe, D.G. 2002. “Invariant features from interest point groups”. British Machine Vision Conference, Cardiff, Wales, pp.
656-665.
[2] H. R. Sheikh, A. C. Bovik, "Image information and visual quality", IEEE Transactions on Image Processing, vol. 15, no. 2, pp. 430-444, 2006.
[3] O. Boiman, M. Irani, "Detecting Irregularities in Images and in Video", Int'l J. Computer Vision, vol. 74, pp. 17-31, Aug. 2007.
[4] X. Geng, C. Yin, Z.-H. Zhou, "Facial age estimation by learning from label distributions", IEEE Trans. Pattern Anal. Mach. Intell., vol. 35,
no. 10, pp. 2401-2412, Oct. 2013.
[5] H.J. Seo and P. Milanfar, “Face verification using the LARK representation,” IEEE Transactions on Information Forensics and Security,
2011.
[6] Q. X. Gao, "SVD for face recognition problems and solutions," China Journal of Image and Graphics, 2006, Vol. 11, No.12, pp. 1784-1791.
[7] H. S. Prasantha, H. L. Shashidhara and K. N. B. Murthy, "Image Compression Using SVD," International Conference on Computational
Intelligence and Multimedia Applications (ICCIMA 2007), Sivakasi, Tamil Nadu, 2007, pp. 143-145.
[8] H. J. Seo and P. Milanfar, “Training-free, generic object detection using locally adaptive regression kernels,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 32, no. 9, pp. 1688–1704, Sep. 2010.
[9] Y. Q. Cheng and Y.M. Zhuang, "The image feature extraction and recognition based on matrix similarity," Computer Research and
Development, 1992, 11, pp. 42-48.
[10] M. Guillaumin, J. Verbeek, and C. Schmid, “Is that you? Metric learning approaches for face identification,” in Proc. IEEE Int. Conf.
Computer Vision (ICCV), 2009.
[11] Virginia C. Klema and Alan J. Laub, “The Singular Value Decomposition: Its Computation and Some Applications”, IEEE Transactions on Automatic Control, vol. AC-25, no. 2, April 1980.
[12] T. S. T. Chan and Y. H. Yang, "Polar n-Complex and n-Bicomplex Singular Value Decomposition and Principal Component Pursuit," IEEE Transactions on Signal Processing, vol. 64, no. 24, pp. 6533-6544, Dec. 2016.
[13] K. M. Aishwarya, R. Ramesh, P. M. Sobarad and V. Singh, "Lossy image compression using SVD coding algorithm," 2016 International
Conference on Wireless Communications, Signal Processing and Networking (WiSPNET), Chennai, 2016, pp. 1384-1389.
[14] Hae Jong Seo and Peyman Milanfar, “Face Verification Using the LARK Representation”, IEEE Transactions on Information Forensics
and security, vol. 6, no. 4, Dec 2011.
[15] L. Wolf, T. Hassner, and Y. Taigman, “Descriptor based methods in the wild,” in Proc. Faces in Real-Life Image Workshop in Eur. Conf.
Computer Vision (ECCV), Marseille, France, 2008.
[16] Z. Cao, Q. Yin, X. Tang, and J. Sun, “Face recognition with learning-based descriptor,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), 2010, pp. 2707–2714.
[17] A. Vadivel, A.K.Majumdar & S. Sural, " Performance comparison of distance metrics in content based Image retrieval applications", Intl.
Conference on Information Technology.
[18] L. D. Chase, " Euclidean Distance", College of Natural Resources, Colorado State University, Fort Collins, Colorado, USA, 824-146-294,
NR 505, December 8, 2008.
[19] AT&T Laboratories Cambridge, (2002), "The Database of Faces” [Online]. Retrieved from: http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html.
[20] Dr. Libor Spacek (2007), “Collection of Facial Images: Grimace” [Online]. Retrieved from: http://cswww.essex.ac.uk/mv/allfaces/grimace.html
[21] Dr. Libor Spacek (2007), “Collection of Facial Images: Faces95” [Online]. Retrieved from: http://cswww.essex.ac.uk/mv/allfaces/faces95.html
[22] Senthilkumaran N and Vaithegi S, “Image Segmentation By Using Thresholding Techniques For Medical Images”, Computer Science & Engineering: An International Journal (CSEIJ), Vol. 6, No. 1, February 2016.
[23] Devlin, S. A.; Thomas, E. G. and Emerson, S. S. (2013). “Robustness of approaches to ROC curve modeling under misspecification of the underlying probability model”, Communications in Statistics – Theory and Methods, 42, 3655–3664.
[24] D. Surya Prabha and J. Satheesh Kumar, “Performance Evaluation of Image Segmentation using Objective Methods”, Indian Journal of Science and Technology, Vol 9(8), February 2016.