IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 20, NO. 1, JANUARY 2011 247
Face Recognition by Exploring Information Jointly in
Space, Scale and Orientation
Zhen Lei, Shengcai Liao, Matti Pietikäinen, Senior Member, IEEE, and Stan Z. Li, Fellow, IEEE
Abstract—Information jointly contained in image space, scale
and orientation domains can provide rich important clues not
seen in either individual of these domains. The position, spatial
frequency and orientation selectivity properties are believed to
have an important role in visual perception. This paper proposes a
novel face representation and recognition approach by exploring
information jointly in image space, scale and orientation domains.
Specifically, the face image is first decomposed into different scale
and orientation responses by convolving multiscale and multior-
ientation Gabor filters. Second, local binary pattern analysis is
used to describe the neighboring relationship not only in image
space, but also in different scale and orientation responses. This
way, information from different domains is explored to give a good
face representation for recognition. Discriminant classification is
then performed based upon weighted histogram intersection or
conditional mutual information with linear discriminant analysis
techniques. Extensive experimental results on FERET, AR, and
FRGC ver 2.0 databases show the significant advantages of the
proposed method over the existing ones.
Index Terms—Conditional mutual information (CMI), face
recognition, Gabor volume based local binary pattern (GV-LBP),
Gabor volume representation, local binary pattern (LBP).
Manuscript received December 14, 2009; revised April 13, 2010; accepted July 05, 2010. Date of publication July 19, 2010; date of current version December 17, 2010. This work was supported by the Chinese National Hi-Tech (863) Program #2008AA01Z124, National Science and Technology Support Program Project 2009BAK43B26, and the AuthenMetric R&D Fund. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Arun Ross.
Z. Lei, S. Liao, and S. Z. Li are with the Center for Biometrics and Security Research and National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China (e-mail: firstname.lastname@example.org, email@example.com, firstname.lastname@example.org).
M. Pietikäinen is with the Machine Vision Group, University of Oulu, FI-90014 Oulun yliopisto, Finland (e-mail: email@firstname.lastname@example.org).
Color versions of one or more of the figures in this paper are available online.
Digital Object Identifier 10.1109/TIP.2010.2060207
I. INTRODUCTION
FACE recognition has attracted much attention due to its potential value for applications and its theoretical challenges. In the real world, face images are usually affected by different expressions, poses, occlusions and illuminations, and the difference between face images of the same person could be larger than that between images of different persons. Therefore, how to extract robust and discriminant features which make the intraperson faces compact and enlarge the margin among different persons becomes a critical and difficult problem in face recognition.
Up to now, many face representation approaches have been introduced, including subspace based holistic features and local appearance features. Typical holistic features include the well known principal component analysis (PCA), linear discriminant analysis (LDA), independent component analysis (ICA), etc. PCA provides an optimal linear transformation from the original image space to an orthogonal eigenspace with reduced dimensionality in the sense of the least mean square reconstruction error. LDA seeks a linear transformation by maximizing the ratio of between-class variance to within-class variance. ICA is a generalization of PCA, which is sensitive to the high-order relationship among the image pixels. Recently, Wang and Tang unify PCA, LDA and Bayesian methods into the same framework and present a method to find the optimal configuration for LDA. Yan et al. reinterpret subspace learning from the viewpoint of graph embedding, so that various methods, such as PCA, LDA, ISOMAP, LLE, LPP, NPE, MFA, etc., can all be interpreted under this framework. Furthermore, in order to handle the nonlinearity in face feature space, nonlinear kernel techniques (e.g., kernel PCA, kernel LDA, etc.) are also introduced.
Local appearance features, as opposed to holistic features
like PCA and LDA, have certain advantages. They are more
stable to local changes such as illumination, expression and in-
accurate alignment. Gabor ,  and local binary patterns
(LBPs) are two representative features. Gabor wavelets capture the local structure corresponding to specific spatial frequency (scale), spatial locality, and selective orientation, which are demonstrated to be discriminative and robust to illumination and expression changes. The LBP operator, which describes the neighboring changes around the central point, is a simple yet effective way to represent faces. It is invariant to any monotonic gray scale transformation and is, therefore, robust to illumination changes to some extent. Recently, some work has been
done to apply LBP on the Gabor responses to obtain a more
sufficient and stable representation. Zhang et al.  propose
LBPs descriptor on Gabor magnitude representation and Zhang
et al. ,  perform LBP on Gabor phase information. The
global and local descriptors are presented, respectively, and fi-
nally fused for face representation. These combinations of LBP
and Gabor features have improved the face recognition perfor-
mance significantly compared to the individual representation.
Combining information from different domains is usually
beneficial for face recognition. Recent biological studies indicate that retinal position, spatial frequency and orientation selectivity properties have an important role in visual perception. Therefore, in this paper, we propose to explore information jointly in space, frequency, and orientation domains to enhance the performance of face recognition.
In previous work –, people have studied the neigh-
boring relationship in the spatial domain of a face image by
quantizing the difference into binary values. However, the relevant information among different scales and orientations is
Fig. 1. Calculation of LBP code from 3×3 subwindow.
still not explored. In this paper, we propose a novel face rep-
resentation method that not only explores the information in
the spatial domain, but also among different scales and orien-
tations. The main procedure of the proposed joint information
extraction is as follows. First, the multiscale and multiorientation representations are derived by convolving the face image with a Gabor filter bank and formulated as a third-order volume. Second, the LBP operator is applied on the three orthogonal planes of the Gabor volume, respectively, named GV-LBP-TOP in short. In this way, we encode the neighboring information not only in image space but also among different scales and orientations of the Gabor faces.
Zhao and Pietikäinen  have proposed a similar method
named LBP-TOP and applied it for dynamic texture recogni-
tion and facial expression analysis. The difference is that their
method is applied in the spatial and temporal domains of the
video sequence, whereas ours is conducted on the Gabor face
volume to explore the neighboring relationship in spatial, fre-
quency and orientation domains. In order to reduce the compu-
tational complexity, we further propose an effective GV-LBP
(E-GV-LBP) descriptor that models the neighboring changes
around the central point in the joint domains simultaneously
for face representation. After that, a statistical uniform pattern mechanism is adopted and local histogram features based upon the uniform patterns are extracted to represent faces. Discriminant classification is finally performed based upon weighted histogram intersection or conditional mutual information (CMI) with linear discriminant analysis (LDA) techniques.
There are mainly three advantages for the proposed method.
First, Gabor feature is applied to the face images to alleviate the variations of facial expression and illumination. Second, the LBP is utilized to model the neighboring relationship jointly in spatial, frequency and orientation domains. In this way, discriminant and robust information, as much as possible, could be explored. The uniform pattern mechanism is then presented to improve the efficacy of the proposed representation. Third, a feature selection and discriminant analysis method is introduced to make the face representation compact and effective for face recognition.
The rest of this paper is organized as follows. Section II
briefly reviews the definition of Gabor filters and details the
GV-LBP-TOP and E-GV-LBP representations based upon
the Gabor faces. Section III describes the details of weighted
histogram distance metric and the process of face recognition.
Section IV presents the CMI based feature selection mechanism and LDA subspace learning. Experimental results and analysis are demonstrated in Section V and Section VI concludes the paper.
II. GV-LBP-TOP AND E-GV-LBP BASED FACE REPRESENTATION
A. Gabor Faces
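Gabor filters, which exhibit desirable characteristics of spatial locality and orientation selectivity and are optimally localized in the space and frequency domains, have been extensively used. The Gabor kernels used are defined as follows:
$$\psi_{\mu,\nu}(z) = \frac{\|k_{\mu,\nu}\|^2}{\sigma^2} \exp\left(-\frac{\|k_{\mu,\nu}\|^2 \|z\|^2}{2\sigma^2}\right) \left[\exp(i\, k_{\mu,\nu} \cdot z) - \exp\left(-\frac{\sigma^2}{2}\right)\right] \qquad (1)$$
where $\mu$ and $\nu$ define the orientation and scale of the Gabor kernels, $z = (x, y)$, and the wave vector is
$$k_{\mu,\nu} = k_\nu e^{i\phi_\mu}$$
where $k_\nu = k_{\max}/f^{\nu}$, $k_{\max} = \pi/2$, $f = \sqrt{2}$, and $\phi_\mu = \pi\mu/8$.
The Gabor kernels in (1) are all self-similar since they can be generated from one filter, the mother wavelet, by scaling and rotating via the wave vector $k_{\mu,\nu}$. Hence, a band of Gabor filters is generated by a set of various scales and rotations.
In this paper, we use Gabor kernels at five scales $\nu \in \{0, 1, 2, 3, 4\}$ and eight orientations $\mu \in \{0, 1, \ldots, 7\}$ with the parameter $\sigma = 2\pi$ to derive the Gabor representation by convolving face images with corresponding Gabor kernels. For every image pixel we have totally 40 Gabor magnitude and phase coefficients, respectively, that is to say, we can obtain 40 Gabor magnitude and 40 Gabor phase faces from a single input face image.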
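The construction of the 40 Gabor faces can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the kernel window size (31), the values $k_{\max} = \pi/2$, $f = \sqrt{2}$, $\sigma = 2\pi$, the FFT-based "same" convolution, and the scale-major ordering of the output axis are all assumptions made for the example, and the function names are hypothetical.

```python
import numpy as np

def gabor_kernel(mu, nu, size=31, sigma=2 * np.pi, k_max=np.pi / 2, f=np.sqrt(2)):
    """Complex Gabor kernel for orientation mu and scale nu on an odd size x size grid."""
    k = (k_max / f ** nu) * np.exp(1j * np.pi * mu / 8)  # wave vector as a complex number
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    k2 = np.abs(k) ** 2
    envelope = (k2 / sigma ** 2) * np.exp(-k2 * (x ** 2 + y ** 2) / (2 * sigma ** 2))
    # Oscillating carrier minus the DC compensation term exp(-sigma^2 / 2).
    carrier = np.exp(1j * (k.real * x + k.imag * y)) - np.exp(-sigma ** 2 / 2)
    return envelope * carrier

def gabor_volume(image, scales=5, orientations=8, size=31):
    """Complex Gabor responses of shape (H, W, scales * orientations)."""
    h, w = image.shape
    fh, fw = h + size - 1, w + size - 1          # padded size for linear convolution
    image_f = np.fft.fft2(image, s=(fh, fw))
    half = size // 2
    out = np.empty((h, w, scales * orientations), dtype=complex)
    for nu in range(scales):
        for mu in range(orientations):
            kernel_f = np.fft.fft2(gabor_kernel(mu, nu, size), s=(fh, fw))
            full = np.fft.ifft2(image_f * kernel_f)
            # Crop the central H x W part ("same" convolution).
            out[:, :, nu * orientations + mu] = full[half:half + h, half:half + w]
    return out
```

Given `vol = gabor_volume(face)`, the Gabor magnitude and phase faces are `np.abs(vol)` and `np.angle(vol)`, each stacked along the third axis.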
B. Gabor Volume Based LBP on Three Orthogonal Planes
LBP is introduced as a powerful local descriptor for microfeatures of images. The basic LBP operator labels the pixels of an image by thresholding the 3×3-neighborhood of each pixel with the center value and considering the result as a binary number (also called the LBP code). An illustration of the basic LBP operator is shown in Fig. 1.
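As a small illustration of the basic operator, the sketch below computes 3×3 LBP codes for all interior pixels. The bit order (clockwise from the top-left neighbor) and the use of "greater than or equal" as the threshold rule are conventions assumed for this example; the name `lbp_3x3` is hypothetical.

```python
import numpy as np

# (row, col) offsets of the eight neighbors, clockwise from top-left.
# The bit assigned to each neighbor is a convention assumed here.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_3x3(image):
    """Return the LBP code (0..255) of every interior pixel of a 2-D image."""
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    for bit, (dr, dc) in enumerate(OFFSETS):
        neighbor = img[1 + dr:h - 1 + dr, 1 + dc:w - 1 + dc]
        # Threshold the neighbor against the center and set the matching bit.
        codes += (neighbor >= center).astype(np.uint8) * np.uint8(1 << bit)
    return codes
```

Because only the ordering of gray values matters, applying any strictly increasing gray scale transformation to the image leaves the codes unchanged.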
Recently, the combination of Gabor and LBP has been
demonstrated to be an effective way for face recognition
–. In this paper, we propose to explore discriminative
information by modeling the neighboring relationship not only
in spatial domain, but also among different frequency and ori-
entation properties. Particularly, for a face image, the derived
Gabor faces are assembled by the order of different scales and
orientations to form a third-order volume as illustrated in Fig. 2,
where the three axes X, Y, T denote the different rows, columns
of face image and different types of Gabor filters, respectively.
It can be seen that the existing methods – essentially
applied LBP or LXP operator on XY plane. It is natural and
possible to conduct the similar analysis on XT and YT planes
to explore more sufficient and discriminative information for
face representation. GV-LBP-TOP is originated from this idea.
LEI et al.: FACE RECOGNITION BY EXPLORING INFORMATION JOINTLY IN SPACE, SCALE AND ORIENTATION 249
Fig. 2. Face image and its corresponding third-order Gabor volume.
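Fig. 3. Gabor magnitude and phase faces and their corresponding GV-LBP codes: (a), (e) Gabor faces, (b), (f) GV-LBP-XY, (c), (g) GV-LBP-XT, (d), (h) GV-LBP-YT. The first row is based upon Gabor magnitude information and the second row is based upon Gabor phase information.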
It first applies LBP analysis on the three orthogonal planes
(XY, XT, and YT) of Gabor face volume and then combines the
description codes together to represent faces.
Fig. 3 illustrates examples of Gabor magnitude and phase
faces and their corresponding GV-LBP codes on XY, XT, and
YT planes. It is clear to see that the codes from three planes
are different and, hence, may supply complementary information helpful for face recognition. After that, three histograms corresponding to GV-LBP-XY, GV-LBP-XT, and GV-LBP-YT codes are computed as
$$H_j(l) = \sum_{x,y} I\big(f_j(x, y) = l\big), \quad l = 0, 1, \ldots, L_j - 1$$
where $I(\cdot) \in \{0, 1\}$ is an indication function of a boolean condition, $f_j(\cdot)$ expresses the GV-LBP codes in the $j$th plane ($j = 0$: XY; 1: XT; 2: YT), and $L_j$ is the number of the $j$th GV-LBP codes. The GV-LBP-TOP histogram is finally derived by concatenating these three histograms, $H = [H_0; H_1; H_2]$, to represent the face. It incorporates the spatial information and the co-occurrence statistics in Gabor frequency and orientation domains and, thus, is more effective for face representation and recognition.
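The three-plane coding and the concatenated histogram can be sketched as follows. This is an illustrative simplification under stated assumptions: the volume is a real array of shape (X, Y, T), such as stacked Gabor magnitudes, codes are computed only for interior voxels, neighbors along T are simply the adjacent filter indices, and the bit order is the one chosen in `OFFSETS`. The function names are hypothetical.

```python
import numpy as np

# In-plane neighbor offsets (first axis, second axis); bit order is an assumption.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_plane(volume, axes):
    """8-neighbor LBP inside the plane spanned by `axes`, for every interior voxel."""
    v = np.asarray(volume, dtype=float)
    center = v[tuple(slice(1, n - 1) for n in v.shape)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (da, db) in enumerate(OFFSETS):
        shift = [0, 0, 0]
        shift[axes[0]], shift[axes[1]] = da, db
        sl = tuple(slice(1 + s, n - 1 + s) for s, n in zip(shift, v.shape))
        codes += (v[sl] >= center).astype(np.uint8) * np.uint8(1 << bit)
    return codes

def gv_lbp_top_histogram(volume):
    """Concatenate the XY, XT and YT code histograms (3 x 256 bins)."""
    planes = [(0, 1), (0, 2), (1, 2)]  # XY, XT, YT
    hists = [np.bincount(lbp_plane(volume, ax).ravel(), minlength=256) for ax in planes]
    return np.concatenate(hists)
```

The resulting vector has three times the length of a single-plane histogram, which is the cost that motivates a more compact coding.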
C. Effective GV-LBP
The aforementioned GV-LBP-TOP is of high computational
complexity. The length of the histogram feature vector and
the computational cost are threefold compared to those of
LGBPHS , so it is not very efficient in practical application.
To address this problem, we propose an effective formulation
of GV-LBP (E-GV-LBP) which encodes the information in
spatial, frequency and orientation domains simultaneously and reduces the computational cost. Fig. 4 shows the definition of E-GV-LBP coding. For the central point $I_c$, $I_0$ and $I_4$ are the orientation neighboring pixels; $I_2$ and $I_6$ are the scale neighboring ones; $I_1$, $I_3$, $I_5$, and $I_7$ are the neighboring pixels in spatial domains. Like in LBP, the values of all these surrounding pixels are compared to the value of the central pixel, thresholded into 0 or 1 and transformed into a value between 0 and 255 to form the E-GV-LBP value
$$\text{E-GV-LBP}(I_c) = \sum_{p=0}^{7} 2^p\, S(I_p - I_c)$$
where $S(\cdot)$ is a threshold function defined as
$$S(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0. \end{cases}$$
Fig. 5 demonstrates the E-GV-LBP codes based upon 40 Gabor magnitude and phase faces for an input face image.
The histogram features are then computed based upon the E-GV-LBP codes to provide a more reliable description as
$$H(l) = \sum_{x,y} I\big(f(x, y) = l\big), \quad l = 0, 1, \ldots, L - 1$$
where $I(\cdot)$ is the indication function, $f(\cdot)$ denotes the E-GV-LBP codes, and $L$ is the number of the E-GV-LBP codes.
Fig. 4. Formulation of E-GV-LBP.
Fig. 5. (a) One face image and its E-GV-LBP results on (b) Gabor magnitude faces and (c) Gabor phase faces.
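A possible vectorized form of the E-GV-LBP coding is sketched below. Several choices are assumptions made for illustration and are not stated in the text: the responses are arranged as an array of shape (H, W, scales, orientations), orientation neighbors wrap around circularly, scale neighbors at the first and last scale are edge-replicated, and the assignment of bits 0 to 7 to the two orientation, two scale and four spatial neighbors follows the comments in the code.

```python
import numpy as np

def e_gv_lbp(G):
    """E-GV-LBP codes for a real array G of shape (H, W, S, O).

    Returns uint8 codes of shape (H - 2, W - 2, S, O); spatial border pixels are skipped.
    """
    G = np.asarray(G, dtype=float)
    c = G[1:-1, 1:-1]                                   # central points
    scale_pad = np.pad(c, ((0, 0), (0, 0), (1, 1), (0, 0)), mode="edge")
    neighbors = [
        np.roll(c, -1, axis=3),     # bit 0: next orientation (circular, assumed)
        G[:-2, 1:-1],               # bit 1: spatial neighbor above
        scale_pad[:, :, 2:],        # bit 2: next scale (edge-replicated, assumed)
        G[1:-1, 2:],                # bit 3: spatial neighbor to the right
        np.roll(c, 1, axis=3),      # bit 4: previous orientation (circular, assumed)
        G[2:, 1:-1],                # bit 5: spatial neighbor below
        scale_pad[:, :, :-2],       # bit 6: previous scale (edge-replicated, assumed)
        G[1:-1, :-2],               # bit 7: spatial neighbor to the left
    ]
    codes = np.zeros(c.shape, dtype=np.uint8)
    for p, n in enumerate(neighbors):
        codes += (n >= c).astype(np.uint8) * np.uint8(1 << p)
    return codes
```

Each voxel yields a single 8-bit code, so the histogram has the same length as in the plain XY case while still mixing spatial, scale and orientation neighbors.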
Fig. 6. Distributions of (a), (e) GV-LBP-XY, (b), (f) GV-LBP-XT, (c), (g) GV-LBP-YT, and (d), (h) E-GV-LBP codes on Gabor magnitude and phase images. (a)–(d) are the distributions on Gabor magnitude images and (e)–(h) are the ones on Gabor phase images.
D. Statistical Uniform Pattern
A uniform pattern mechanism has been proposed for the LBP code, which is robust to noise and improves the recognition performance. In the LBP code, the uniform patterns are defined as those in which at most two bitwise transitions from 0 to 1 or vice versa occur when the binary string is considered circular. It is based upon the observation that there are a limited number of transitions or discontinuities in the circular presentation of the 3×3 texture patterns. Therefore, the uniform patterns account for the vast majority of all LBP patterns in local image regions.
In this paper, we adopt a more general strategy and define the uniform pattern via statistical analysis, according to the occurrence percentage instead of the number of 0–1 and 1–0 transitions for different codings.
Denote a coded face image by $f_i(x, y)$ to indicate the coding value in position $(x, y)$ of the $i$th image. The occurrence distribution histogram for $N$ face images is computed as
$$H(l) = \sum_{i=1}^{N} \sum_{x,y} I\big(f_i(x, y) = l\big), \quad l = 0, 1, \ldots, L - 1$$
where $I(\cdot)$ is an indication function of a boolean condition. The GV-LBP-XY, GV-LBP-XT, GV-LBP-YT and E-GV-LBP code distributions, calculated on the training set of the FERET database, are shown in Fig. 6.
The histogram is then sorted according to the occurrence percentage. In this paper, we define the uniform patterns in an iterative way. In each step, the patterns corresponding to the two smallest occurrence percentages collapse into a single one and then the histogram is re-sorted. Suppose we originally have $L$ bins; after $n$ iterations, there are $L - n$ labels left. In this work, $L$ equals 256 and $n$ can be assigned arbitrarily from 0 to 255. A small $n$ value will result in a huge feature dimension, so we test different numbers of uniform patterns and select the best one based upon the tradeoff between the recognition accuracy and computational cost.
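The iterative merging can be illustrated with a short routine that turns an occurrence histogram into a code-to-label lookup table. It is a sketch only: the function name, the tie-breaking (stable sorting), and the final ordering of labels by decreasing occurrence are assumptions for the example.

```python
import numpy as np

def statistical_uniform_map(counts, n_labels):
    """Merge the two least frequent labels until n_labels remain.

    counts: occurrence count of each original code on the training set.
    Returns an integer array mapping every original code to a label in [0, n_labels).
    """
    groups = [([code], float(c)) for code, c in enumerate(counts)]
    while len(groups) > n_labels:
        groups.sort(key=lambda g: g[1])            # re-sort by occurrence
        (codes_a, count_a), (codes_b, count_b) = groups[0], groups[1]
        groups = [(codes_a + codes_b, count_a + count_b)] + groups[2:]
    groups.sort(key=lambda g: -g[1])               # most frequent label first
    mapping = np.empty(len(counts), dtype=np.int64)
    for label, (codes, _) in enumerate(groups):
        mapping[codes] = label
    return mapping
```

With `mapping` learned on training codes, a coded image `codes` is relabeled by `mapping[codes]`, and each local histogram then has `n_labels` bins instead of 256.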
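III. WEIGHTED HISTOGRAM INTERSECTION BASED FACE RECOGNITION
The histogram features described above are used to represent faces, and in the face recognition phase, the histogram intersection defined in (7) is used as the matching measure between different face images
$$d(H^1, H^2) = \sum_{l=0}^{L-1} \min\big(H^1(l), H^2(l)\big) \qquad (7)$$
where $H^1$ and $H^2$ are two histograms and $l$ denotes the $l$th bin value. Directly comparing the histograms based upon the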
whole faces may lose the structure information of faces which
is important for face recognition. One possible way is to par-
tition the face image into several blocks. The local histograms
are first obtained from different blocks and then concatenated
into a histogram sequence to represent the whole face. In this
way, we succeed to depict the face image at three levels. The
GV-LBP-TOP or E-GV-LBP codes contain information in spa-
tial, frequency and orientation domains at pixel level. Local his-
togram expresses characteristic at regional level which is robust
to alignment errors and finally, they are combined together as
a global description for a face image to maintain both its accuracy and robustness. Previous work has shown that different regions of the face make different contributions to the performance
of recognition , , e.g., the areas nearby eyes and nose
are more important than others. Therefore, it is sensible to as-
sign different weights onto different blocks when measuring the
dissimilarity of two images.
Consequently, the weighted matching score of different histogram sequences can be formulated as
$$D(H^1, H^2) = \sum_{b=1}^{B} w_b\, d(H^1_b, H^2_b)$$
where $H^1$ and $H^2$ denote the two histogram sequences, $B$ is the number of blocks, $d(\cdot, \cdot)$ is the histogram intersection in (7), and $w_b$ is the weight for the $b$th local histogram pair $(H^1_b, H^2_b)$.
In this paper, we follow previous work to set the weights for different blocks. For each block, we first compute the means $m_I$, $m_E$ and variances $\sigma_I^2$, $\sigma_E^2$ of the block-wise matching scores for intra (the same person) and extra (different persons) sample pairs, respectively, and then the weight for the block can be computed following the Fisher criterion as
$$w = \frac{(m_I - m_E)^2}{\sigma_I^2 + \sigma_E^2}$$
where $m_I$ and $\sigma_I^2$ are the mean and variance of the intra sample pairs and $m_E$ and $\sigma_E^2$ are those of extra sample ones.
Therefore, if the local histogram features are discriminative, where the means of intra and extra classes are far apart and the variances are small, the corresponding block will be assigned with a large weight. Otherwise, the weight will be small.
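The block-wise matching described in this section can be sketched as follows. The example assumes a single coded image of integer labels, nonoverlapping blocks, population variances, and a small constant added to the denominator to avoid division by zero; all function names are hypothetical.

```python
import numpy as np

def block_histograms(codes, block=(8, 8), n_labels=256):
    """Histogram sequence of nonoverlapping blocks, shape (n_blocks, n_labels)."""
    h, w = codes.shape
    bh, bw = block
    hists = []
    for r in range(0, h - bh + 1, bh):
        for c in range(0, w - bw + 1, bw):
            patch = codes[r:r + bh, c:c + bw].ravel()
            hists.append(np.bincount(patch, minlength=n_labels))
    return np.array(hists)

def block_intersections(H1, H2):
    """Histogram intersection of every pair of corresponding blocks."""
    return np.minimum(H1, H2).sum(axis=1)

def fisher_weights(intra, extra, eps=1e-12):
    """Per-block weights from intra/extra pair scores of shape (n_pairs, n_blocks)."""
    m_i, m_e = intra.mean(axis=0), extra.mean(axis=0)
    v_i, v_e = intra.var(axis=0), extra.var(axis=0)
    return (m_i - m_e) ** 2 / (v_i + v_e + eps)   # eps is an added safeguard

def weighted_intersection(H1, H2, weights):
    """Weighted sum of the block-wise histogram intersections."""
    return float(np.dot(weights, block_intersections(H1, H2)))
```

For an 88×80 coded face with 8×8 blocks this yields 11×10 = 110 local histograms; a larger weighted intersection indicates a closer match.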
IV. CMI AND LDA BASED FACE RECOGNITION
Although E-GV-LBP is more compact than GV-LBP-TOP, it is still of very huge dimension and there is a lot of redundancy, which greatly affects the efficiency of the feature matching process. To deal with this problem, we utilize CMI to select an effective and uncorrelated feature set and adopt LDA to learn the discriminative feature space to improve the effectiveness and efficiency.
A. CMI Based Feature Selection
When local histogram features are extracted from windows of different sizes and positions, there will be millions of E-GV-LBP features for a face image. Directly comparing these features lacks efficiency and it is difficult to learn a classifier on them because of the high computational cost and limited memory storage. A straightforward way is to select a subset of the original ones to reduce the dimension of features. AdaBoost learning and CMI based methods are two competent ways to play this role. Previous work shows that CMI based feature selection achieves better results than AdaBoost. Therefore, in this paper, we adopt CMI to select the most discriminative and uncorrelated features to represent faces.
Mutual Information (MI) is a basic concept in information theory. It estimates the quantity of information shared between random variables. For two random variables $X$ and $Y$, their MI $I(X; Y)$ is defined as follows:
$$I(X; Y) = H(X) - H(X \mid Y)$$
where $H(X)$ is the entropy of the random variable $X$, which measures the uncertainty of the variable. For a discrete random variable $X$, $H(X)$ is defined as
$$H(X) = -\sum_{x} p(x) \log p(x)$$
where $p(x)$ represents the marginal probability distribution of $X$. The conditional entropy $H(X \mid Y)$ measures the remaining uncertainty of $X$ when $Y$ is known.
In the context of feature selection, the main goal is to select a small subset of features that carries as much information as possible for classification. In our problem, when we have selected $n$ features $f_1, \ldots, f_n$, our purpose is to select the next feature $f_{n+1}$ to maximize the CMI $I(f_{n+1}; C \mid f_1, \ldots, f_n)$, where $C$ denotes the class label variable. In practice, due to the various feature values and small sample size, it is difficult to estimate the joint probability $p(f_1, \ldots, f_{n+1}, C)$ directly. To address this problem, we take the following strategy. First, following previous work, intra and inter personal spaces are constructed to increase the number of samples, where the image pairs from the same person form the intrasamples and the image pairs from different persons form the intersamples. Second, the feature differences in the intra and inter personal spaces are quantized into binary values by a predefined threshold to make the variable distribution more reliable and meanwhile simplify the computation process.
Given a set of training samples with class labels $\{(x_i, c_i)\}_{i=1}^{N}$, where $x_i$ is the $i$th intra or inter sample with $D$ dimension features, $N$ is the sample number, and $c_i$ denotes the intra and inter classes. Each dimension of feature is converted into a binary-value variable as
$$f_d(x_i) = \begin{cases} 0, & x_{i,d} < \theta_d \\ 1, & \text{otherwise} \end{cases}$$
where $\theta_d$ is the predefined threshold, which is decided by minimizing the classification error in this work. For one feature, if the difference of the image pair is less than a threshold, the feature value is set to 0, otherwise it is set to 1.
The feature selection principle is to find the feature that maximizes the CMI $I(f; C \mid f_1, \ldots, f_n)$, which is computed as
$$I(f; C \mid f_1, \ldots, f_n) = \sum p(f, C, f_1, \ldots, f_n) \log \frac{p(f, C \mid f_1, \ldots, f_n)}{p(f \mid f_1, \ldots, f_n)\, p(C \mid f_1, \ldots, f_n)}$$
where $p(\cdot)$ denotes the probability distribution.
However, the joint probabilities of multivariables shown previously are still difficult to estimate, especially when $n$ is large, because the joint distribution of a large number of binary variables has to be estimated. A suboptimized way is to compute a series of CMIs based upon one selected feature in sequence
$$f_{n+1} = \arg\max_{f} \Big\{ \min_{1 \le k \le n} I(f; C \mid f_k) \Big\} \qquad (14)$$
where $f_k$ denotes the selected feature. The feature whose corresponding minimum of CMIs is the largest is selected as a new one. In this way, we only need to at most estimate the joint probability of three variables, which is feasible to calculate in practice. The whole process of CMI based feature selection is illustrated in Fig. 7.
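The greedy max-min selection can be sketched for binary features and a binary (intra/inter) label as follows. This is an illustration under assumptions, not the procedure of Fig. 7: the first feature is chosen by plain mutual information with the label, probabilities are estimated by simple counting, and the function names are hypothetical.

```python
import numpy as np

def cond_mutual_info(f, c, g):
    """I(f; c | g) in bits for three binary (0/1) integer vectors, estimated by counting."""
    joint = np.zeros((2, 2, 2))
    np.add.at(joint, (np.asarray(f, dtype=int), np.asarray(c, dtype=int),
                      np.asarray(g, dtype=int)), 1.0)
    p = joint / joint.sum()
    p_g = p.sum(axis=(0, 1), keepdims=True)     # p(g)
    p_fg = p.sum(axis=1, keepdims=True)         # p(f, g)
    p_cg = p.sum(axis=0, keepdims=True)         # p(c, g)
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(p > 0, p * np.log2(p * p_g / (p_fg * p_cg)), 0.0)
    return float(terms.sum())

def cmi_select(B, c, k):
    """Select k columns of the binary matrix B (n_samples, n_features)."""
    B = np.asarray(B, dtype=int)
    c = np.asarray(c, dtype=int)
    n_feat = B.shape[1]
    zeros = np.zeros_like(c)
    # Conditioning on a constant gives the plain MI I(f_j; c) (assumed initialisation).
    score = np.array([cond_mutual_info(B[:, j], c, zeros) for j in range(n_feat)])
    selected = []
    for _ in range(k):
        best = int(np.argmax(score))
        selected.append(best)
        for j in range(n_feat):
            # Keep the minimum CMI over all features selected so far.
            score[j] = min(score[j], cond_mutual_info(B[:, j], c, B[:, best]))
        score[best] = -np.inf                   # never pick the same feature twice
    return selected
```

Because only the running minimum is stored, each new selection costs one pass over the candidate features, and a candidate that duplicates an already selected feature gets a score near zero.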
Fig. 7. CMI based feature selection procedure.
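LDA is a representative subspace learning method which has achieved great success in face recognition. In this part, we conduct LDA on the selected features to learn the most discriminant subspace for classification. The essential idea of LDA is to disperse the samples from different classes and meanwhile gather the samples from the same class. Given the training samples $\{x_j\}$ from $c$ classes, the between class scatter matrix $S_b$ and within class scatter matrix $S_w$ are defined as
$$S_b = \sum_{i=1}^{c} n_i (m_i - m)(m_i - m)^T, \qquad S_w = \sum_{i=1}^{c} \sum_{x_j \in \text{class } i} (x_j - m_i)(x_j - m_i)^T$$
where $n_i$ is the number of samples in the $i$th class, $m_i$ is the mean vector of the $i$th class, and $m$ is the global mean vector. LDA aims to find the projective directions which maximize the ratio of between class scatter matrix to within class scatter one as
$$W_{opt} = \arg\max_{W} \frac{|W^T S_b W|}{|W^T S_w W|}.$$
The optimal projection matrix $W_{opt}$ can be obtained by solving the following eigen-value problem
$$S_b W = S_w W \Lambda$$
where $\Lambda$ is the diagonal matrix whose diagonal elements are the eigenvalues.
In classification phase, after projecting the original data onto the learned subspace, the dissimilarity of two samples is measured in the subspace.
Fig. 8. Face examples of FERET databases.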
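A compact NumPy sketch of the LDA step is given below. It assumes the unnormalized scatter matrices, a small ridge term `reg` added to the within class scatter for numerical stability, and a direct eigen-decomposition of $S_w^{-1} S_b$; these choices and the name `lda_fit` are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def lda_fit(X, y, n_components, reg=1e-6):
    """Return (W, mean): projection matrix of shape (d, n_components) and global mean."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sb = np.zeros((d, d))
    Sw = np.zeros((d, d))
    for cls in np.unique(y):
        Xc = X[y == cls]
        mc = Xc.mean(axis=0)
        diff = (mc - mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)          # between class scatter
        Sw += (Xc - mc).T @ (Xc - mc)            # within class scatter
    Sw += reg * np.eye(d)                        # ridge term (assumed safeguard)
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(-evals.real)[:n_components]
    return evecs[:, order].real, mean
```

Samples are projected with `(X - mean) @ W`. Since the rank of the between class scatter is at most the number of classes minus one, at most that many directions carry discriminative information.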
V. EXPERIMENTS
Extensive experiments have been carried out to illustrate the efficacy of the proposed method. Specifically, three large publicly available face databases, FERET, AR and FRGC ver 2.0, are used to evaluate the performance of different methods. These face databases contain various changes of face images, and the proposed methods, GV-LBP-TOP and E-GV-LBP, have shown their robustness and accuracy in these variations.
A. Experiment I: FERET Database
The FERET database is one of the largest publicly available databases. In this experiment, the training set contains 731 images. In the test phase, the gallery set contains 1196 images from 1196 subjects. Four probe sets (fb, fc, dup1 and dup2), which include expression, illumination and aging variations, are used to compare the performance of different methods. All the images are rotated, scaled and cropped into 88×80 size according to the provided eye coordinates (Fig. 8).
The uniform pattern number and the block size are two parameters that impact the performance of the proposed method. Fig. 9 shows the recognition rate on the fb probe set by varying the number of uniform patterns and block size for E-GV-LBP representation on Gabor magnitude and phase responses, respectively. Here, the uniform patterns are computed on the FERET training set and the unweighted histogram intersection measure (7) is used. As expected, too large or too small block size would result in a decreased face recognition rate because of the loss of spatial information or sensitivity to local variations. A smaller size of uniform pattern set would lose the discriminative information and a larger one would increase the computational cost. Considering the tradeoff between the recognition rate and computational cost, in the following experiments, the face image is divided into 11×10 nonoverlapped blocks with the size of 8×8, and the number of uniform patterns is set to 8. For the LBP, LGBP, GV-LBP-TOP and E-GV-LBP methods, weighted histogram intersection measure is adopted. The uniform codes of GV-LBP-TOP and E-GV-LBP and the weights of different blocks are statistically calculated on the FERET training set. Fig. 10 shows the face partition mode and the comparative weights of 11×10 blocks for E-GV-LBP. It is shown that the regions around eyes and nose have more contributions to face recognition, which is consistent with the previous research work. In the following results, the character 'M' denotes the Gabor magnitude feature and 'P' denotes the Gabor phase one.
Fig. 9. Face recognition rate of E-GV-LBP on fb probe set with different uniform pattern numbers and block sizes: (a) E-GV-LBP-M and (b) E-GV-LBP-P.
Fig. 10. Face partition and their corresponding weights for E-GV-LBP.
Table I lists the recognition rates of different methods on
FERET database, and Fig. 11 illustrates the corresponding cu-
mulative match curves for the proposed method. From the re-
sults, we can observe the following.
1) The proposed methods, GV-LBP-TOP and E-GV-LBP, achieve better results on both Gabor magnitude and phase faces, which strongly demonstrates that there is complementary discriminative information among spatial, frequency and orientation domains and that the proposed descriptor is effective to explore this information for better face representation and recognition.
2) Though many previous works claimed Gabor phase information may not be robust enough for face recognition due to its sensitivity to displacement, in our experiments, the performance based upon Gabor phase feature is comparable with (or even slightly better than) that based upon Gabor magnitude feature. As pointed out in previous work, Gabor phase feature is also able to provide discriminant information which can be fused with Gabor magnitude feature to further improve the face recognition performance.
3) Comparing the results of LBP on original face images with those of LBP on Gabor faces, the combination of Gabor and LBP effectively reduces the effect of expression, illumination and aging variations and significantly improves the recognition performance.
4) Regarding the complexity and accuracy, E-GV-LBP codings based upon Gabor magnitude and phase representations are the best choice among these methods for face representation and recognition.
Fig. 11. Cumulative match curves on the (a) fb, (b) fc, (c) dup1, and (d) dup2 probe sets.
TABLE I. RECOGNITION RATES OF DIFFERENT METHODS ON THE FERET DATABASE
The results reported previously are all based upon the well
aligned images which were cropped using the manually labeled
eye positions. However, in practice, due to various types of
noise, there are usually errors for the detected eye coordinates
which is so-called misalignment problem. To evaluate the
robustness of different methods in this case, we disturb the eye
positions of probe images by adding Gaussian white noise with
different variances. In this part, we combine the four probe
sets together and report the rank-1 recognition rate and the
verification rate when the false accept rate is 0.001. Fig. 12
shows the face recognition performance of different methods
on the different disturbed images. Since the performance of
GV-LBP-TOP is close to that of E-GV-LBP, we just plot the
results of E-GV-LBP in the figure for a more clear view. It
can be seen that a) compared to the subspace methods such as
LDA based upon pixel level, the histogram features computed
on a local region are much more robust. b) With LDA, the
performance of Gabor phase decreases much faster than that of
Gabor magnitude. That’s because phase information is more
sensitive to displacement and is not so robust as Gabor magni-
tude responses. The histogram based features greatly alleviate
this disadvantage and both the magnitude and phase features are robust to the misalignment problem and, hence, are more practical in real application. The proposed E-GV-LBP coding combined with the statistical uniform mechanism achieves the best performance in the case of misalignment.
Fig. 12. Rank-1 recognition rate and VR@FAR = 0.001 curves with respect to different variances of Gaussian noise on FERET database.
Fig. 13. Face examples of the AR database collected in two sessions. The first row is the image with neutral expression, the second to the fourth rows are the images with various expressions, sunglass and scarf occlusions, respectively.
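The misalignment protocol and the two reported measures can be sketched as below. The sketch assumes a probe-by-gallery matrix of matching scores in which a larger value means a closer match, a threshold taken as an empirical quantile of the impostor scores, and Gaussian noise specified by its standard deviation in pixels; the function names are hypothetical.

```python
import numpy as np

def perturb_eyes(eyes, sigma, rng):
    """Add zero-mean Gaussian noise (standard deviation sigma, in pixels) to eye coordinates."""
    eyes = np.asarray(eyes, dtype=float)
    return eyes + rng.normal(0.0, sigma, size=eyes.shape)

def rank1_rate(sim, probe_ids, gallery_ids):
    """Fraction of probes whose best-scoring gallery image has the same identity."""
    best = np.argmax(sim, axis=1)
    return float(np.mean(np.asarray(gallery_ids)[best] == np.asarray(probe_ids)))

def verification_rate(sim, probe_ids, gallery_ids, far=0.001):
    """Fraction of genuine pairs accepted at the threshold giving the requested FAR."""
    same = np.asarray(probe_ids)[:, None] == np.asarray(gallery_ids)[None, :]
    genuine, impostor = sim[same], sim[~same]
    threshold = np.quantile(impostor, 1.0 - far)   # empirical threshold (assumed estimator)
    return float(np.mean(genuine > threshold))
```

In such a protocol, the perturbed eye coordinates would be used to re-crop each probe image before feature extraction, and the two rates would be recomputed for each noise level.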
B. Experiment II: AR Database
The AR database consists of more than 4,000 images of 126
subjects, including 70 males and 56 females. The images were
taken in two sessions separated by two weeks, considering ex-
pression (neutral, smile, anger and scream) and occlusion (sun-
glass and scarf) variations. In this experiment, we randomly se-
lect 45 men and 45 women, respectively, use the neutral images
in the two sessions as the gallery set, and compare the perfor-
mance of different methods in expression and occlusion cases.
Therefore, there are 180 images in the gallery set (two images per person), and the probe images form three probe sets (expression, sunglass and scarf occlusion), respectively. All the images are cropped into 88×80 according to the eye positions (Fig. 13).
Table II compares the recognition rates of different methods on AR expression, sunglass and scarf occlusion sets, respectively. For PCA and LDA related methods, the subspaces are learned from the FERET training set. For the histogram feature based methods, the way of block partition of face image and the weights of different blocks are the same as those in the FERET experiment. The holistic methods could achieve comparatively good results on the expression subset, but it is really difficult for them to deal with occlusion variations. The LBP descriptor is robust to expression variation, but the performance degrades dramatically in the case of sunglass and scarf occlusion. Comparatively, the performances of LGBP, GV-LBP-TOP and E-GV-LBP are much better than that of LBP, especially on the scarf occlusion set, which indicates that the Gabor based representations are more robust to occlusion compared to the LBP. However, their performance on the sunglass occlusion set is still far from satisfactory. It is mainly due to the fact that the area around the eyes contains the most important clues for face recognition and the sunglass occlusion drops too much distinct information. The proposed methods, GV-LBP-TOP and E-GV-LBP, whether with magnitude or phase representation, perform well, which validates the effectiveness of the proposed method in expression and occlusion cases.
TABLE II. RECOGNITION RATE OF DIFFERENT METHODS ON THE AR DATABASE
Fig. 14. Face examples of FRGC databases.
C. Experiment III: FRGC Database
In this experiment, we evaluate the proposed E-GV-LBP representation with the CMI and LDA based feature selection and dimension reduction technique on the FRGC database following the experiment 4 protocol. The training set consists of 12776 face images from 222 individuals, including 6360 controlled images and 6416 uncontrolled ones. The test set consists of 16028 target images which are controlled ones and 8014 query images which are uncontrolled ones, from 466 persons. All the images are rotated, scaled and cropped, followed by histogram equalization preprocessing. No further preprocessing is applied. Fig. 14 illustrates some cropped face examples.
In order to explore as much information as possible, his-
togram features calculated on multiscale windows which are
slided with a step of four pixels across the whole face are ex-
tracted. There are totally 3,474,560 features in candidate fea-
ture set. CMI is then utilized to select 6000 dimension features
LEI et al.: FACE RECOGNITION BY EXPLORING INFORMATION JOINTLY IN SPACE, SCALE AND ORIENTATION255
Fig. 15. First five selected E-GV-LBP-M (up) and E-GV-LBP-P (down) fea-
tures. The white rectangle is the local histogram region and the left-down is the
real part of Gabor kernel at the central point.
PERFORMANCE COMPARISON OF THE PROPOSED METHOD
AND THE EXISTING ONES
from the original feature set. After that, LDA learning is applied
to the selected features to find a 221-dimensional discriminant
subspace, where the classification is finally performed. All of the
above operations are conducted on the FRGC training
set. In the test phase, we follow the Experiment 4 protocol and report
the results as ROC I, ROC II, and ROC III, corresponding to
different time intervals.
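The greedy selection step can be illustrated with a minimal sketch in the style of the CMI-based feature selection of Fleuret, which appears in the reference list. It is not the authors' implementation. It assumes that the features have already been discretized into a small number of integer levels, and the names `entropy`, `mutual_info`, `cond_mutual_info` and `cmim_select` are hypothetical helpers introduced here. At each step, the sketch picks the feature whose worst-case conditional mutual information with the class label, given any already selected feature, is largest, so that features redundant with earlier picks are passed over.

```python
import numpy as np


def entropy(*cols):
    """Empirical joint entropy (in bits) of one or more discrete columns."""
    joint = np.stack(cols, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def mutual_info(y, x):
    """I(Y; X) = H(Y) + H(X) - H(X, Y)."""
    return entropy(y) + entropy(x) - entropy(x, y)


def cond_mutual_info(y, x, z):
    """I(Y; X | Z) = H(Y, Z) + H(X, Z) - H(X, Y, Z) - H(Z)."""
    return entropy(y, z) + entropy(x, z) - entropy(x, y, z) - entropy(z)


def cmim_select(X, y, k):
    """Greedily pick k columns of X (assumed discrete) that are informative
    about y and not redundant with the columns picked so far."""
    n_features = X.shape[1]
    # Initial score: plain mutual information with the label.
    score = np.array([mutual_info(y, X[:, j]) for j in range(n_features)])
    selected = []
    for _ in range(k):
        best = int(np.argmax(score))
        selected.append(best)
        score[best] = -np.inf  # never pick the same feature twice
        # Lower each remaining score to its CMI given the new pick,
        # so the score tracks the minimum over all selected features.
        for j in range(n_features):
            if score[j] == -np.inf:
                continue
            score[j] = min(score[j], cond_mutual_info(y, X[:, j], X[:, best]))
    return selected


# Toy data (illustrative only): four classes, four binary features.
y = np.array([0, 1, 2, 3] * 4)
X = np.stack([y >= 2, y >= 2, y % 2, np.zeros_like(y)], axis=1).astype(int)
selected = cmim_select(X, y, 2)  # column 1 duplicates column 0 and is skipped
```

In the setting described above, `k` would be 6000 and the selected columns would then be passed to LDA. With 222 individuals in the training set, LDA yields at most 221 discriminant dimensions, which matches the subspace dimension reported here.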
Fig. 15 shows the first five selected features with Gabor
magnitude and phase information, respectively. It can be seen
that the selected E-GV-LBP-M and E-GV-LBP-P features
differ in Gabor kernel, feature position and local region
size. Therefore, it is possible that the Gabor magnitude and phase
responses contain complementary information useful
for face recognition. Table III compares
the proposed E-GV-LBP+CMI+LDA method with some
state-of-the-art methods. It should be noted that our result is based
upon one or two models, while the methods in  are ensemble
results of many (more than 20) models. From the
results, we can see that E-GV-LBP is effective for face recognition
on both Gabor magnitude and phase faces, and that the
combination of these two representations further improves
the performance. The combined result is comparable with that of the state-of-the-art
method, such as , but requires much less computational
and storage cost because far fewer models are used in our
from LGBP , we first formulate Gabor faces as a third-order
volume and then apply LBP operators on three orthogonal
planes (GV-LBP-TOP), encoding discriminative information
not only in the spatial domain, but also in the frequency and
orientation domains. In order to reduce the computational
complexity, an effective GV-LBP (E-GV-LBP) descriptor is further
proposed to describe the changes in the spatial, frequency and
orientation domains simultaneously. The statistical uniform
pattern mechanism is proposed to improve the effectiveness and
robustness of the proposed representations. In the face recognition
phase, CMI and LDA are utilized to reduce the redundancy and
make the representation more compact and, thus, to improve the
efficiency of the algorithm. Experimental results validate the
efficacy of the proposed method.
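The idea of encoding changes in the spatial, frequency and orientation domains simultaneously can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The neighbor layout (two orientation neighbors, two scale neighbors and four spatial neighbors), the bit order, the circular wrap along orientation and the edge replication along scale and space are all assumptions made here, and `e_gv_lbp` is a hypothetical function name.

```python
import numpy as np


def e_gv_lbp(volume):
    """Encode each point of a Gabor volume with an 8-bit pattern.

    volume: array of shape (n_scales, n_orients, height, width) holding
            Gabor magnitude or phase responses.
    Returns a uint8 array of the same shape.
    """
    volume = np.asarray(volume, dtype=float)
    n_s, _, h, w = volume.shape
    # Assumption: replicate edges along the scale axis and both spatial axes.
    pad = np.pad(volume, ((1, 1), (0, 0), (1, 1), (1, 1)), mode="edge")

    def shifted(ds, dy, dx):
        # Neighbor at offset (ds, dy, dx) in scale, row and column.
        return pad[1 + ds:1 + ds + n_s, :, 1 + dy:1 + dy + h, 1 + dx:1 + dx + w]

    # Assumption: orientation is treated as circular, hence np.roll.
    neighbors = [
        np.roll(volume, -1, axis=1),  # bit 0: next orientation
        shifted(0, 0, 1),             # bit 1: right
        shifted(1, 0, 0),             # bit 2: next scale
        shifted(0, -1, 0),            # bit 3: up
        np.roll(volume, 1, axis=1),   # bit 4: previous orientation
        shifted(0, 0, -1),            # bit 5: left
        shifted(-1, 0, 0),            # bit 6: previous scale
        shifted(0, 1, 0),             # bit 7: down
    ]
    codes = np.zeros(volume.shape, dtype=np.uint8)
    for bit, nb in enumerate(neighbors):
        # A bit is set when the neighbor is not smaller than the center.
        codes += (nb >= volume).astype(np.uint8) * np.uint8(1 << bit)
    return codes
```

Local histograms of the resulting codes, computed over face subregions, would then serve as the candidate features from which CMI and LDA derive a compact representation.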
The authors would like to thank the associate editor and the
anonymous reviewers for their valuable suggestions.
 W. Zhao, R. Chellappa, P. Phillips, and A. Rosenfeld, “Face recogni-
tion: A literature survey,” ACM Comput. Surv., pp. 399–458, 2003.
 S. Z. Li and A. K. Jain, Eds., Handbook of Face Recognition. New
York: Springer-Verlag, 2005.
 M. A. Turk and A. P. Pentland, “Face recognition using eigenfaces,”
in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., Jun.
1991, pp. 586–591.
 P. Belhumeur, J. Hespanha, and D. Kriegman, “Eigenfaces vs. fisher-
faces: Recognition using class specific linear projection,” IEEE Trans.
Pattern Anal. Mach. Intell., vol. 19, no. 7, pp. 711–720, Jul. 1997.
 P. Comon, “Independent component analysis—a new concept?,”
Signal Process., vol. 36, pp. 287–314, 1994.
 X. Wang and X. Tang, “A unified framework for subspace face recog-
nition,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 26, no. 9, pp.
1222–1228, Sep. 2004.
 S. Yan, D. Xu, B. Zhang, H. Zhang, Q. Yang, and S. Lin, “Graph em-
bedding and extensions: A general framework for dimensionality re-
duction,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 29, no. 1, pp.
40–51, Jan. 2007.
 J. Tenenbaum, V. Silva, and J. Langford, “A global geometric framework
for nonlinear dimensionality reduction,” Science, vol. 290, no. 22,
pp. 2319–2323, 2000.
 S. Roweis and L. Saul, “Nonlinear dimensionality reduction by locally
linear embedding,” Science, vol. 290, no. 22, pp. 2323–2326, 2000.
laplacianfaces,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 3,
pp. 328–340, Mar. 2005.
 X. He, D. Cai, S. Yan, and H. Zhang, “Neighborhood preserving em-
bedding,” in Proc. IEEE Int. Conf. Comput. Vis., 2005, pp. 1208–1213.
ysis as a kernel eigenvalue problem,” Neural Comput., vol. 10, pp.
 S. Mika, G. Ratsch, J. Weston, B. Scholkopf, and K.-R. Müller,
“Fisher discriminant analysis with kernels,” in Proc. Neural Netw.
Signal Process., 1999, pp. 41–48.
 C. Liu and H. Wechsler, “Gabor feature based classification using the
enhanced fisher linear discriminant model for face recognition,” IEEE
Trans. Image Process., vol. 11, no. 4, pp. 467–476, Apr. 2002.
 Z. Lei, S. Z. Li, R. Chu, and X. Zhu, “Face recognition with local gabor
textons,” in Proc. IAPR/IEEE Int. Conf. Biometr., 2007, pp. 49–57.
 T. Ahonen, A. Hadid, and M. Pietikainen, “Face description with local
binary patterns: Application to face recognition,” IEEE Trans. Pattern
Anal. Mach. Intell., vol. 28, no. 12, pp. 2037–2041, Dec. 2006.
 W. C. Zhang, S. G. Shan, W. Gao, and H. M. Zhang, “Local gabor
binary pattern histogram sequence (lgbphs): A novel non-statistical
Comput. Vis., 2005, pp. 786–791.
 W. C. Zhang, S. G. Shan, X. L. Chen, and W. Gao, “Are gabor
phases really useless for face recognition?,” in Proc. Int. Conf. Pattern
Recognit., 2006, pp. 606–609.
 B. Zhang, S. Shan, X. Chen, and W. Gao, “Histogram of gabor phase
patterns (hgpp): A novel object representation approach for face
recognition,” IEEE Trans. Image Process., vol. 16, no. 1, pp. 57–68, Jan.
 L. Itti and C. Koch, “Computational modelling of visual attention,”
Nature Rev. Neurosci., vol. 2, no. 3, pp. 194–203, 2001.
 G. Zhao and M. Pietikäinen, “Dynamic texture recognition using local
binary patterns with an application to facial expressions,” IEEE Trans.
Pattern Anal. Mach. Intell., vol. 29, no. 6, pp. 915–928, Jun. 2007.
 R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, Second
ed. Hoboken, NJ: Wiley, 2001.
 Y. Freund and R. E. Schapire, “A decision-theoretic generalization of
on-line learning and an application to boosting,” J. Comput. Syst. Sci.,
vol. 55, no. 1, pp. 119–139, 1997.
 F. Fleuret, “Fast binary feature selection with conditional mutual
information,” J. Mach. Learn. Res., vol. 5, pp. 1531–1555, 2004.
 B. Moghaddam, T. Jebara, and A. Pentland, “Bayesian face recogni-
tion,” Pattern Recognit., vol. 33, no. 11, pp. 1771–1782, 2000.
 P. J. Phillips, H. Moon, S. A. Rizvi, and P. J. Rauss, “The FERET
evaluation methodology for face-recognition algorithms,” IEEE Trans.
Pattern Anal. Mach. Intell., vol. 22, no. 10, pp. 1090–1104, Oct. 2000.
 A. Martinez and R. Benavente, “The AR face database, CVC,” Tech.
Rep 24, 1998.
 P. J. Phillips, P. J. Flynn, W. T. Scruggs, K. W. Bowyer, J. Chang,
K. Hoffman, J. Marques, J. Min, and W. J. Worek, “Overview of the
face recognition grand challenge,” in Proc. IEEE Comput. Soc. Conf.
Comput. Vis. Pattern Recognit., 2005, pp. 947–954.
and local classifiers for face recognition,” in Proc. IEEE Int. Conf.
Comput. Vis., 2007, pp. 1–8.
hybrid fourier feature for large face image set,” in Proc. IEEE Comput.
Soc. Conf. Comput. Vis. Pattern Recognit., 2006, pp. 1574–1581.
 C. Liu, “Capitalize on dimensionality increasing techniques for im-
proving face recognition grand challenge performance,” IEEE Trans.
Pattern Anal. Mach. Intell., vol. 28, no. 5, pp. 725–737, May 2006.
Zhen Lei received the B.S. degree in automation
from the University of Science and Technology
of China (USTC), Hefei, China, in 2005 and the
Ph.D. degree from Institute of Automation, Chinese
Academy of Sciences, Beijing, China, in 2010.
He is currently with the Center for Biometrics and
Security Research and National Laboratory of Pat-
tern Recognition, Institute of Automation, Chinese
Academy of Sciences, Beijing, China. His research
interests are in computer vision, pattern recognition,
image processing, and face recognition in particular.
Shengcai Liao received the B.S. degree in math-
ematics and applied mathematics from the Sun
Yat-sen University, Guangzhou, China, in 2005 and
the Ph.D. degree from the Institute of Automation,
Chinese Academy of Sciences, Beijing, China, in
He is currently with the Center for Biometrics and
Security Research and National Laboratory of Pat-
tern Recognition, Institute of Automation, Chinese
Academy of Sciences, Beijing, China. His research
interests include machine learning, pattern recogni-
tion, biometrics, and visual surveillance.
Dr. Liao was awarded the Excellence Paper of Motorola Best Student Paper
and the 1st Place Best Biometrics Paper at the International Conference on
Biometrics in 2006 and 2007, respectively, for his work on face recognition.
Matti Pietikäinen (S’75–M’77–SM’95) received
the D.Sc. degree in technology from the University
of Oulu, Finland, in 1982.
He is currently with the Machine Vision Group,
University of Oulu. From 1980 to 1981 and from
1984 to 1985, he visited the Computer Vision
Laboratory, University of Maryland, College Park.
His research interests are in texture-based com-
puter vision, face analysis, activity analysis, and
their applications in human computer interaction,
person identification and visual surveillance. He has
authored about 250 papers in international journals, books, and conference pro-
ceedings, and over 100 other publications or reports. His research is frequently
cited and its results are used in various applications around the world.
Dr. Pietikäinen has been Associate Editor of the IEEE TRANSACTIONS ON
PATTERN ANALYSIS AND MACHINE INTELLIGENCE and Pattern Recognition
journals, and is currently an Associate Editor of Image and Vision Computing
journal. He was President of the Pattern Recognition Society of Finland from
1989 to 1992. From 1989 to 2007 he served as Member of the Governing Board
of the International Association for Pattern Recognition (IAPR), and became
one of the founding fellows of the IAPR in 1994. He was a Vice-Chair of IEEE
Stan Z. Li (M’92–SM’99–F’09) received the B.Eng.
degree from Hunan University, Changsha, China, the
M.Eng. degree from the National University of De-
fense Technology, China, and the Ph.D. degree from
Surrey University, Surrey, U.K.
He is currently a Professor and the Director
of Center for Biometrics and Security Research
(CBSR), Institute of Automation, Chinese Academy
of Sciences (CASIA). He worked at Microsoft
Research Asia as a researcher from 2000 to 2004.
Prior to that, he was an Associate Professor at
Nanyang Technological University, Singapore. His research interests include
pattern recognition and machine learning, image and vision processing, face
recognition, biometrics, and intelligent video surveillance. He has published
over 200 papers in international journals and conferences, and authored and
edited eight books.
Dr. Li is currently an Associate Editor of the IEEE TRANSACTIONS ON
PATTERN ANALYSIS AND MACHINE INTELLIGENCE and is acting as the Ed-
itor-in-Chief for the Encyclopedia of Biometrics. He served as a co-chair for the
International Conference on Biometrics 2007 and 2009, and has been involved
in organizing other international conferences and workshops in the fields of his