
ScienceDirect

Available online at www.sciencedirect.com

Procedia Computer Science 132 (2018) 890–899

www.elsevier.com/locate/procedia

10.1016/j.procs.2018.05.101

1877-0509 © 2018 The Authors. Published by Elsevier Ltd.

This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/3.0/)

Peer-review under responsibility of the scientific committee of the International Conference on Computational Intelligence and Data Science (ICCIDS 2018).

Corresponding author: vinayakkamath2010@gmail.com

International Conference on Computational Intelligence and Data Science (ICCIDS 2018)

Sparse Locally Adaptive Regression Kernel For Face Verification

Vinay A, Vinayaka R Kamath, Varun M, K N Balasubramanya Murthy, S Natarajan

Center for Pattern Recognition and Machine Intelligence,

PES University, Bengaluru, India.

Abstract

This paper presents thresholds, obtained by a heuristic approach, for face verification using Locally Adaptive Regression Kernel (LARK) descriptors under the Euclidean, Cosine and Chebyshev distance metrics. The absence of a threshold for these distance metrics poses several setbacks, such as increased computational complexity and escalated runtime, so the proposed method has significantly higher impact when the verification process is computationally intensive. The approach requires minimal training, and face verification is accomplished using the threshold value obtained during training. The whole process is modest, and nearly all of the time is spent solely on quantifying a distance metric between the LARKs obtained from the two faces under verification. LARK descriptors compute a measure of resemblance on the basis of the “signal-induced distance” between a pixel and its nearby pixels. We assess the distance between the LARKs of the two faces and compare the resulting resemblance against the threshold, yielding a binary decision.


Keywords: LARK; Face Verification; SVD; Filtering; Threshold;

1. Introduction

Typically, face verification is achieved by collating two images and deducing the resemblance between them. The face may be subject to variations in illumination, scale, orientation, contrast and several other distortions.

Nomenclature

LARK Locally Adaptive Regression Kernel

SVD Singular Value Decomposition

ROC Receiver Operating Characteristic



Facial recognition is a biometric method of finding similarities and dissimilarities of an individual by collating digital image data with features extracted [1] beforehand. Nearly all contemporary face recognition systems work with blueprints of numeric codes called faceprints. These systems recognize nodal points of high importance on a human face; the nodal points act as cardinal features that describe key aspects such as the length and width of the nose, the depth of the eye sockets and the shape of the cheekbones. The systems work by capturing data for the nodal points on a digital image of an individual’s face and storing the data for further operations. These faceprints can then be used as a reference against which data captured from other faces in an image or video is compared. Face verification has innumerable uses: detecting fraud in passports and visas, securing card users in ATMs and banks by mapping facial data and matching it against a database, identifying criminals, preventing voter fraud and tracking attendance. Companies like Facebook use ‘Deepface’, a deep-learning facial recognition system. ‘Facenet’, a Google project, achieved close to 100 percent accuracy on the extensively used ‘Labeled Faces in the Wild’ dataset, which includes more than 13,000 images of faces from across the world. One recurring challenge is the image quality an algorithm expects: without accurate computation of facial features, the robustness of an approach is lost, so even the most complex algorithm loses accuracy as image quality [2] deteriorates. The same face appears different under changes in lighting, since illumination can transform the appearance of an object to a great extent; irregular lighting must therefore be overcome. Pose variation arises when an individual’s face is rotated at different angles [3]: frontal views are feature-rich while side views are feature-sparse. This problem appears when the system has to identify a rotated face using frontal-view training data, so multiple orientations of each face need to be fed into the database. Recognizing faces across different ages of a person [4] is also difficult, so age variation needs to be taken into account. A simple yet accurate model that is not computationally intensive would help lightweight hardware architectures achieve face verification. Locally adaptive regression kernels provided one such methodology for training-free object detection; the idea was to extend the same construct to face verification with nominal training. Face verification posed a different set of challenges that needed an unorthodox approach, very different from what LARKs were used for before. The paper “Face Verification Using the LARK Representation” [5] motivated the use of LARKs for face verification.

The remaining sections of this paper illustrate the stages of the proposed pipeline. Section 2 describes the mechanisms used in each stage and their advantages over their counterparts; suitable schematics and mathematical equations depict the process pictorially and theoretically. Section 3 covers the datasets used and the challenges each one offers. The last two sections discuss the inferences drawn from the technique, with future scope and enhancements addressed in the final section.

2. Proposed Methodology

2.1. Overview of Approach

This section outlines the framework applied to the images to achieve face verification. Cropping the region of interest from the raw image helps diminish unwanted noise and background variation to a large extent. Singular value decomposition [6] assists in compression [7] of the region of interest, thereby reducing the computational load. LARKs [8] extract key points and descriptors in the form of vectors from the preprocessed images [9]. Euclidean distance, Cosine distance, Chebyshev distance and several other distance metrics can be used to differentiate these vectorized LARKs [10]. The magnitude of these metrics is compared against the proposed thresholds, leading to a binary decision and hence face verification.
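The pipeline above can be sketched end-to-end as follows. This is a minimal illustration with hypothetical helper names; the gradient-map stand-in below is not the actual LARK descriptor, and the rank and threshold values are assumptions, not values from the paper.

```python
import numpy as np

def compress_svd(face, rank=30):
    # Keep only the top singular values: a smaller representation of the ROI
    U, s, Vt = np.linalg.svd(face.astype(float), full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]

def extract_descriptor(face):
    # Placeholder for LARK extraction; a flattened gradient map
    # stands in for the real vectorized LARKs
    gx, gy = np.gradient(face.astype(float))
    return np.concatenate([gx.ravel(), gy.ravel()])

def verify(face_a, face_b, threshold):
    # Compare the distance between descriptors against the trained threshold
    da = extract_descriptor(compress_svd(face_a))
    db = extract_descriptor(compress_svd(face_b))
    distance = np.linalg.norm(da - db)   # Euclidean metric
    return distance <= threshold         # binary decision
```

The only learned quantity is the threshold, which is why the method needs minimal training.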



Fig. 1. Schematic of the proposed system

2.2. Singular Value Decomposition

Singular Value Decomposition, commonly known as SVD [11], is the generalization of the eigendecomposition [12] of a positive definite normal matrix to any $m \times n$ real or complex matrix, via an extension of the polar decomposition. If M is an $m \times n$ matrix whose entries come from a field Y, then M can be factorized into the form

$M = A \Sigma B^{\theta}$  (1)

Where,
A is an $m \times m$ unitary matrix over Y,
$\Sigma$ is an $m \times n$ diagonal matrix with non-negative real entries,
B is an $n \times n$ unitary matrix over Y, and
$B^{\theta}$ is the conjugate transpose of B.

SVD aids in reducing the time required by subsequent operations. The refactored matrices preserve useful features while representing the image with a smaller set of values [13]. This compressed image can then be utilized for feature extraction and further processing.

Fig.2(a). Image before SVD Fig.2(b). Image after SVD
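As a sketch of the compression step (the rank `k` is a tunable assumption, not a value from the paper), a truncated-SVD reconstruction of Eq. (1) might look like:

```python
import numpy as np

def svd_compress(image, k):
    """Reconstruct an image from its top-k singular triplets (Eq. 1: M = A Sigma B^theta)."""
    A, sigma, B_theta = np.linalg.svd(image.astype(float), full_matrices=False)
    # Zero out all but the k largest singular values, then recompose
    return (A[:, :k] * sigma[:k]) @ B_theta[:k, :]
```

Storage drops from m*n values to roughly k*(m + n + 1) values per image, which is where the runtime savings in the later distance computations come from.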



2.3. Locally Adaptive Regression Kernels

In order to render dated low-quality content on relatively high-resolution screens, the need for improved space-time upscaling, denoising and deblurring algorithms has become a matter of great concern. Classical kernel regression is a non-parametric estimator of a point or pixel within an image; it estimates the conditional expectation of a random variable. The kernel regression framework defines its data model as

$y_i = z(x_i) + \varepsilon_i, \qquad x_i \in \omega, \; i = 1, \ldots, P$  (2)

where $y_i$ is a noisy sample measured at $x_i = [x_{1i}, x_{2i}]^T$, $z(\cdot)$ is the regression function to be estimated, $\varepsilon_i$ is independently and identically distributed zero-mean noise, and $P$ is the total number of samples in an arbitrary “window” $\omega$ around a position of interest $x$. Point-wise estimates can be obtained easily using this mechanism: $z(x_i)$ can be computed using the Taylor series.

$z(x_i) \approx z(x) + \{\nabla z(x)\}^T (x_i - x) + \frac{1}{2!}\,(x_i - x)^T \{\mathcal{H} z(x)\} (x_i - x) + \ldots$  (3)

$\approx \beta_0 + \beta_1^T (x_i - x) + \beta_2^T \operatorname{vech}\{(x_i - x)(x_i - x)^T\} + \ldots$  (4)

where $\mathcal{H}$ is the Hessian operator, while vech is the half-vectorization operator that lexicographically orders the lower triangular portion of a symmetric matrix into a column-stacked vector. $\beta_1$ and $\beta_2$ can be mathematically defined as

$\beta_1 = \left[ \dfrac{\partial z(x)}{\partial x_1}, \; \dfrac{\partial z(x)}{\partial x_2} \right]^T$  (5)

$\beta_2 = \dfrac{1}{2} \left[ \dfrac{\partial^2 z(x)}{\partial x_1^2}, \; 2\,\dfrac{\partial^2 z(x)}{\partial x_1 \partial x_2}, \; \dfrac{\partial^2 z(x)}{\partial x_2^2} \right]^T$  (6)

The vech operation can be illustrated as below:

$\operatorname{vech}\left( \begin{bmatrix} a & b \\ b & d \end{bmatrix} \right) = [a, b, d]^T, \qquad \operatorname{vech}\left( \begin{bmatrix} a & b & c \\ b & e & f \\ c & f & i \end{bmatrix} \right) = [a, b, c, e, f, i]^T$
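A one-line realization of the half-vectorization operator, assuming a NumPy environment (the ordering is the column-stacked lower triangle of a symmetric matrix):

```python
import numpy as np

def vech(M):
    # Walking the upper triangle of M.T row by row visits
    # M's lower triangle column by column, giving the
    # column-stacked half-vectorization of a symmetric matrix
    return M.T[np.triu_indices(M.shape[0])]
```

For example, `vech` of a symmetric 2x2 matrix `[[a, b], [b, d]]` yields `[a, b, d]`.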

The locally adaptive regression kernel [14] can be formulated as follows:

$K(C_l;\, x_l - x_i) = \exp\{ -(x_l - x_i)^T C_l (x_l - x_i) \}, \qquad x_l \in \omega$  (7)

Where,

$C_l = \sum_{x_k \in \omega} \begin{bmatrix} z_{x_1}^2(x_k) & z_{x_1}(x_k)\, z_{x_2}(x_k) \\ z_{x_1}(x_k)\, z_{x_2}(x_k) & z_{x_2}^2(x_k) \end{bmatrix}$  (8)

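A direct, unoptimized sketch of Eqs. (7) and (8) follows; the window size and the small regularizer added for flat regions are assumptions for illustration, not values from the paper.

```python
import numpy as np

def lark_descriptor(image, window=3, sensitivity=0.1):
    """Sketch of LARK weights: gradient covariance per pixel (Eq. 8),
    then exponential self-similarity with each neighbour (Eq. 7)."""
    img = image.astype(float)
    zx1, zx2 = np.gradient(img)              # derivatives along x1 and x2
    H, W = img.shape
    r = window // 2
    zx1p = np.pad(zx1, r, mode="edge")
    zx2p = np.pad(zx2, r, mode="edge")
    offsets = [(dx, dy) for dx in range(-r, r + 1) for dy in range(-r, r + 1)]
    descriptors = np.zeros((H, W, window * window))
    for i in range(H):
        for j in range(W):
            # Eq. (8): sum of gradient outer products over the window
            g1 = zx1p[i:i + window, j:j + window].ravel()
            g2 = zx2p[i:i + window, j:j + window].ravel()
            C = np.array([[g1 @ g1, g1 @ g2],
                          [g1 @ g2, g2 @ g2]])
            C += sensitivity * np.eye(2)     # regularize flat regions (assumption)
            # Eq. (7): kernel weight for each neighbour offset
            for k, (dx, dy) in enumerate(offsets):
                d = np.array([dx, dy], dtype=float)
                descriptors[i, j, k] = np.exp(-d @ C @ d)
    return descriptors
```

Each pixel thus carries a small vector of kernel weights that encodes the local geometry of the face, which is what the later distance metrics compare.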


2.4. Descriptors and Matchers

Vectorized LARKs act as key points and descriptors for the image [15]. Overlapping patches are removed from the vectorized LARKs, giving a visual impression of the generated LARKs that can in turn be used to plot an image. Different sets of key points can be obtained from the preprocessed images by varying the smoothness, window size and sensitivity, each time resulting in a slightly different set of vectorized LARKs [16]. The distance in vector space between two raw LARKs is then calculated using different distance metrics.

A unique set of LARKs is obtained every time the parameters are tweaked, and significant variations are observed in the visual LARKs based on the input image. Each image from a dataset is compared with the rest of the images, including those of the same person, and the output of each comparison is labeled as matched or unmatched. When this process is complete for all images, an optimal threshold is found that separates these labeled values. To achieve this, a graph of sensitivity and specificity versus threshold is plotted; the intersection of the two curves gives the required threshold, and the magnitude of the y value at the intersection indicates the accuracy of that threshold.
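The threshold search described above can be sketched as a sweep over candidate values, picking the point where the sensitivity and specificity curves meet. The function name and the number of sweep steps are illustrative assumptions.

```python
import numpy as np

def optimal_threshold(distances, labels, n_steps=200):
    # labels: 1 = same person (should match), 0 = different person
    distances = np.asarray(distances, float)
    labels = np.asarray(labels, int)
    candidates = np.linspace(distances.min(), distances.max(), n_steps)
    best_t, best_gap = candidates[0], np.inf
    for t in candidates:
        pred = (distances <= t).astype(int)   # small distance -> predicted match
        tp = np.sum((pred == 1) & (labels == 1))
        fn = np.sum((pred == 0) & (labels == 1))
        tn = np.sum((pred == 0) & (labels == 0))
        fp = np.sum((pred == 1) & (labels == 0))
        sens = tp / max(tp + fn, 1)
        spec = tn / max(tn + fp, 1)
        if abs(sens - spec) < best_gap:       # the curves intersect here
            best_gap, best_t = abs(sens - spec), t
    return best_t
```

With well-separated matched and unmatched distances, the returned threshold lands between the two populations.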

Fig. 3(a). Sample Image from Faces95 database Fig. 3(b). Visual LARKs with low window size

Fig. 3(c). Visual LARKs with high sensitivity Fig. 3(d). Visual LARKs with low sensitivity

A metric or distance function is a special mapping that defines a notion of spatial distance [17] between elements of a set, assigning a non-negative real number to each pair. Euclidean, Chebyshev and Cosine distances were found to be geometrically sound and suitable. Cosine distance is the dot product divided by the product of the Euclidean distances of the vectors from the origin.


$\mathrm{distance}_{cosine}(X, Y) = \dfrac{\langle X, Y \rangle}{\|X\|\,\|Y\|} = \dfrac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}}$  (9)

Chebyshev distance is a special case of Minkowski distance where P goes to infinity. Also known as Chessboard

distance, it is the L∞ norm of the difference.

$\mathrm{distance}_{chebyshev}(X, Y) = \|x - y\|_{\infty} = \lim_{p \to \infty} \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p} = \max_{i} |x_i - y_i|$  (10)

Euclidean distance [18] is also a special case of Minkowski distance using P value as 2. It is the most

commonly used distance measure in geometric interpretation.

$\mathrm{distance}_{euclidean}(X, Y) = \|x - y\|_2 = \sqrt{ \sum_{i=1}^{n} (x_i - y_i)^2 }$  (11)
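The three metrics of Eqs. (9)–(11) translate directly into code (note that, as in the text, the cosine expression is the similarity ratio of Eq. (9)):

```python
import numpy as np

def cosine_distance(x, y):
    # Eq. (9): dot product over the product of the vectors' norms
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

def chebyshev_distance(x, y):
    # Eq. (10): L-infinity norm of the difference
    return np.max(np.abs(x - y))

def euclidean_distance(x, y):
    # Eq. (11): L2 norm of the difference
    return np.sqrt(np.sum((x - y) ** 2))
```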

Fig. 4(a). Euclidean distance contour Fig. 4(b). Cosine distance contour Fig. 4(c). Chebyshev distance contour

3. Datasets and Experimentation

3.1. Datasets Used

To test the robustness of the methodology against variations, suitable datasets were used; these helped in the testing and analysis of the methodology.

3.1.1. ORL Faces

Formerly known as ‘The ORL Database of Faces’ [19], the dataset holds images from the early 1990s captured at the Cambridge University Computer Laboratory. It contains ten unique images of 40 distinct individuals, subject to variations in the time of capture, lighting, facial expressions and other details such as glasses. Each image has 92 x 112 pixels with the standard 256 grey levels per pixel. The dataset posed a challenge in terms of facial expressions and lighting conditions.



Fig. 5. Sample record from ORL faces

3.1.2. Grimace

An assembly of images of 18 individuals designed and maintained by Dr. Libor Spacek, grimace [20] focuses mainly on expression variations for both male and female candidates. The dataset contains 20 portraits of each individual at a resolution of 180 x 200 pixels. The background is kept the same throughout all the images, with small uniform head-scale variations. The lighting changes are minimal, and there is little to no variation in the hairstyles of the subjects.

Fig. 6. Individual exhibiting different expressions

3.1.3. Faces95

The brainchild of Dr. Libor Spacek, the database contains portraits of 72 distinct individuals [21]. A sequence of 20 images was captured while each subject was asked to step towards the camera after each snap. The dataset offers large head-scale variation and minor variations due to shadows, resulting in discrepancies in the red background. Noticeable changes in lighting occur due to the artificial lighting systems used.

4. Results and Inferences

Three benchmark datasets and three different distance metrics were used to draw conclusions from a heuristic approach [22]. The grimace, faces95 and ORL databases each pose a different set of challenges and variations, making them well suited to a heuristic approach. An optimal distance metric was to be deduced that would give an exact measure of the interspace between two sets of LARKs extracted from the images. This interval helps in arriving at a conclusion about the correlation between two images. A ROC curve [23] was to be plotted by comparing all the subjects in the dataset with a handful of randomly selected subjects. The frequencies of true positives and true negatives were normalized to simplify the analysis [24]. A graph of sensitivity and specificity vs. criterion value was plotted instead of a ROC curve so as to obtain the value of the threshold as well. Each sample from the ORL dataset was mapped to five other subjects from the same dataset and the outputs were labelled. This process was repeated for grimace and faces95 with three random sample comparisons for all subjects. The methodology provided reasonable accuracy with minimal variation across multiple tests.
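The pairing protocol described above can be sketched as follows. This is a hedged reconstruction, not the authors' code: `build_labelled_pairs`, the `gallery` layout and the fixed seed are illustrative assumptions.

```python
import random

def build_labelled_pairs(gallery, pairs_per_subject, distance, seed=0):
    """gallery: dict mapping subject_id -> list of descriptor vectors.
    Returns (distance_value, is_same_subject) tuples for randomly
    drawn probe/reference pairs, one probe per subject."""
    rng = random.Random(seed)
    subjects = list(gallery)
    samples = []
    for sid in subjects:
        probe = rng.choice(gallery[sid])
        for _ in range(pairs_per_subject):
            other = rng.choice(subjects)          # random comparison subject
            ref = rng.choice(gallery[other])
            samples.append((distance(probe, ref), sid == other))
    return samples
```

With `pairs_per_subject=5` this mirrors the ORL protocol, and with `pairs_per_subject=3` the grimace/faces95 protocol; the labelled distances then feed the sensitivity/specificity analysis.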

The steady increase in the graphs shows that the data points are normally distributed. Quickly rising graphs indicate points densely populated around the corresponding threshold. Applying SVD to the raw images reduces the accuracy by up to 2% but significantly decreases the time taken to compute the output. The red curve on each graph was obtained by plotting the specificity value at all possible thresholds; the blue curve represents the corresponding sensitivity values.

Fig. 7. Specificity and Sensitivity vs euclidean distance for ORL dataset

Fig. 8. Specificity and Sensitivity vs cosine distance for grimace dataset

The intersection of the specificity and sensitivity curves gives the optimal threshold for face verification. The graph was plotted for all possible combinations of datasets and distance metrics. The following table presents the optimal thresholds obtained for each distance metric.
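The threshold selection just described can be sketched as a simple sweep: compute sensitivity and specificity at each candidate threshold and pick the point where the two curves cross. The function names below are illustrative assumptions, not the original implementation.

```python
def sensitivity_specificity(samples, threshold):
    """samples: (distance, is_genuine) pairs; a pair is accepted as a
    match when its distance falls at or below the threshold."""
    tp = sum(1 for d, g in samples if g and d <= threshold)
    fn = sum(1 for d, g in samples if g and d > threshold)
    tn = sum(1 for d, g in samples if not g and d > threshold)
    fp = sum(1 for d, g in samples if not g and d <= threshold)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, spec

def optimal_threshold(samples, candidates):
    # the crossing point of the two curves is where
    # |sensitivity - specificity| is smallest
    def gap(t):
        sens, spec = sensitivity_specificity(samples, t)
        return abs(sens - spec)
    return min(candidates, key=gap)
```

Sweeping `candidates` over the observed range of distances for each dataset/metric combination would yield thresholds of the kind reported in Table 1.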



Table 1. Optimal thresholds obtained from experimentation.

Dataset           Euclidean Distance       Cosine Distance          Chebyshev Distance
                  Threshold    Accuracy    Threshold    Accuracy    Threshold    Accuracy
ORL Database      0.05760      78.4%       0.030565     77.25%      0.026942     70.75%
Grimace           0.032258     95.5%       0.01272      93.7%       0.015282     94.5%
Faces95           0.022585     68%         0.025926     66.81%      0.022585     66.16%

5. Conclusion and Future Work

Analysing the experimental results, it is fairly noticeable that the system provides equitable accuracy across the considered distance metrics. Euclidean distance stands out, with about 1% to 2% higher accuracy on all the datasets. The grimace dataset embodies significant variations in the expressions of the subjects, on which the pipeline performs admirably. The system showed only minute variations over several trials, proving the stability of the mechanism. Faces95, on the other hand, shifts the region of interest and introduces variations in the background, on which the system does not perform up to the mark. There is room for improvement with regard to the noise and disturbances the subjects have to offer. Locally adaptive regression kernels have proven to be capable and concise descriptors for extracting features from an image. As they require minimal training and perform significantly better than other feature extraction algorithms, they remain under-utilized. There is also enormous scope for better matchers to be used with LARK descriptors. Distance metrics such as euclidean, cosine and chebyshev show appreciable accuracy for the amount of computational power required. Future scope includes pairing LARK descriptors with FLANN matchers and convolutional neural networks, and dimensionality reduction of LARKs using principal component analysis. Feature aggregation techniques such as bag of words and fisher vectors could be applied for more precise segregation of the descriptors, with further evaluation using a Support Vector Machine to conclude verification.

References

[1] Brown, M. and Lowe, D.G. 2002. “Invariant features from interest point groups”. British Machine Vision Conference, Cardiff, Wales, pp.

656-665.

[2] H. R. Sheikh, A. C. Bovik, "Image information and visual quality", IEEE Transactions on Image Processing, vol. 15, no. 2, pp. 430-444, 2006.

[3] O. Boiman, M. Irani, "Detecting Irregularities in Images and in Video", Int'l J. Computer Vision, vol. 74, pp. 17-31, Aug. 2007.

[4] X. Geng, C. Yin, Z.-H. Zhou, "Facial age estimation by learning from label distributions", IEEE Trans. Pattern Anal. Mach. Intell., vol. 35,

no. 10, pp. 2401-2412, Oct. 2013.

[5] H.J. Seo and P. Milanfar, “Face verification using the LARK representation,” IEEE Transactions on Information Forensics and Security,

2011.

[6] Q. X. Gao, "SVD for face recognition problems and solutions," China Journal of Image and Graphics, 2006, Vol. 11, No.12, pp. 1784-1791.

[7] H. S. Prasantha, H. L. Shashidhara and K. N. B. Murthy, "Image Compression Using SVD," International Conference on Computational

Intelligence and Multimedia Applications (ICCIMA 2007), Sivakasi, Tamil Nadu, 2007, pp. 143-145.

[8] H. J. Seo and P. Milanfar, “Training-free, generic object detection using locally adaptive regression kernels,” IEEE Trans. Pattern Anal.

Mach.Intell., vol. 32, no. 9, pp. 1688–1704, Sep. 2010.

[9] Y. Q. Cheng and Y.M. Zhuang, "The image feature extraction and recognition based on matrix similarity," Computer Research and

Development, 1992, 11, pp. 42-48.

[10] M. Guillaumin, J. Verbeek, and C. Schmid, “Is that you? Metric learning approaches for face identification,” in Proc. IEEE Int. Conf.

Computer Vision (ICCV), 2009.

[11] Virginia C. Klema and Alan J. Laub, "The Singular Value Decomposition: Its Computation and Some Applications", IEEE Transactions on Automatic Control, vol. AC-25, no. 2, April 1980.



[12] T. S. T. Chan and Y. H. Yang, "Polar n-Complex and n-Bicomplex Singular Value Decomposition and Principal Component Pursuit," in

IEEE Transactions on Signal Processing, vol. 64, no. 24, pp. 6533-6544, Dec.15, 15 2016.

[13] K. M. Aishwarya, R. Ramesh, P. M. Sobarad and V. Singh, "Lossy image compression using SVD coding algorithm," 2016 International

Conference on Wireless Communications, Signal Processing and Networking (WiSPNET), Chennai, 2016, pp. 1384-1389.

[14] Hae Jong Seo and Peyman Milanfar, “Face Verification Using the LARK Representation”, IEEE Transactions on Information Forensics

and security, vol. 6, no. 4, Dec 2011.

[15] L. Wolf, T. Hassner, and Y. Taigman, “Descriptor based methods in the wild,” in Proc. Faces in Real-Life Image Workshop in Eur. Conf.

Computer Vision (ECCV), Marseille, France, 2008.

[16] Z. Cao, Q. Yin, X. Tang, and J. Sun, "Face recognition with learning-based descriptor," in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), 2010, pp. 2707-2714.

[17] A. Vadivel, A.K.Majumdar & S. Sural, " Performance comparison of distance metrics in content based Image retrieval applications", Intl.

Conference on Information Technology.

[18] L. D. Chase, " Euclidean Distance", College of Natural Resources, Colorado State University, Fort Collins, Colorado, USA, 824-146-294,

NR 505, December 8, 2008.

[19] AT&T Laboratories Cambridge (2002), "The Database of Faces" [Online]. Retrieved from: http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html.

[20] Dr Libor Spacek (2007), "Collection of Facial Images: Grimace" [Online]. Retrieved from: http://cswww.essex.ac.uk/mv/allfaces/grimace.html

[21] Dr Libor Spacek (2007), "Collection of Facial Images: Faces95" [Online]. Retrieved from: http://cswww.essex.ac.uk/mv/allfaces/faces95.html

[22] Senthilkumaran N and Vaithegi S, “ Image Segmentation By Using Thresholding Techniques For Medical Images”, Computer Science &

Engineering: An International Journal (CSEIJ), Vol.6, No.1, February 2016.

[23] Devlin, S. A.; Thomas, E. G. and Emerson, S. S. (2013). “Robustness of approaches to ROC curve modeling under misspecification of the

underlying probability model”, Communications in Statistics—Theory and Methods, 42, 3655–3664.

[24] D. Surya Prabha and J. Satheesh Kumar, "Performance Evaluation of Image Segmentation using Objective Methods", Indian Journal of Science and Technology, Vol 9(8), February 2016.