A FRAMEWORK FOR ANALYZING TEXTURE DESCRIPTORS
Timo Ahonen and Matti Pietikäinen
Machine Vision Group, University of Oulu, PL 4500, FI-90014 Oulun yliopisto, Finland
tahonen@ee.oulu.fi, mkp@ee.oulu.fi
Keywords:
Local Binary Pattern, KTH-TIPS, MR8, Gabor, Texture descriptor.
Abstract:
This paper presents a new unified framework for texture descriptors such as Local Binary Patterns (LBP)
and Maximum Response 8 (MR8) that are based on histograms of local pixel neighborhood properties. This
framework is enabled by a novel filter based approach to the LBP operator which shows that it can be seen
as a special filter based texture operator. Using the proposed framework, the filters to implement LBP are
shown to be both simpler and more descriptive than MR8 or Gabor filters in the texture categorization task. It
is also shown that when the filter responses are quantized for histogram computation, codebook based vector
quantization yields slightly better results than threshold based binning at the cost of higher computational
complexity.
1 INTRODUCTION
Texture is a fundamental property of surfaces, and automated analysis of textures has applications
ranging from remote sensing to document image analysis (Tuceryan and Jain, 1998). Recent findings in
applying texture methods to face image analysis, for example, indicate that texture might have
applications in new fields of computer vision that have not been considered texture analysis problems.
Because of the importance of texture analysis, a wide variety of different texture descriptors have
been presented in the literature. However, there is no formal definition of the phenomenon of texture
itself that researchers agree upon. This is possibly one of the reasons that no unified theory or
framework of texture descriptors has been presented so far.
The Local Binary Pattern (LBP) (Ojala et al., 2002), Maximum Response 8 (MR8) (Varma and Zisserman,
2005) and Gabor filter based texture descriptors are among the most studied and best known recent
texture analysis techniques. Despite the large number of publications discussing these methods, the
connections and differences between them are not well understood. This paper presents a new unified
framework for these texture descriptors, which allows for a systematic comparison of these widely
used descriptors and the parts they are built of.
LBP is an operator for image description that is based on the signs of differences of neighboring
pixels. It is fast to compute and invariant to monotonic grayscale changes of the image. Despite being
simple, it is very descriptive, which is attested by the wide variety of tasks it has been
successfully applied to. The LBP histogram has proven to be a widely applicable image feature for,
e.g., texture classification, face analysis and video background subtraction (The Local Binary
Pattern Bibliography, 2007).
Another frequently used approach in texture description is using distributions of quantized filter
responses to characterize the texture (Leung and Malik, 2001), (Varma and Zisserman, 2005). In the
field of texture analysis, filtering and pixel value based texture operators have been seen as
somewhat contradictory. However, in this paper we show that the local binary pattern operator can be
seen as a filter operator based on local derivative filters at different orientations and a special
vector quantization function. Apart from clarifying the connections between LBP and filter based
methods, this also helps in analyzing the properties of the LBP operator.
2 THE LOCAL BINARY PATTERN
OPERATOR
The local binary pattern operator (Ojala et al., 2002) is a powerful means of texture description.
The original version of the operator labels the pixels of an image by thresholding the 3×3
neighborhood of each pixel with the center value and summing the thresholded values weighted by
powers of two. Then the histogram of the labels can be used as a texture descriptor. See Fig. 1 for an
illustration of the basic LBP operator.
Figure 1: The basic LBP operator. Example: the neighborhood 5 9 1 / 4 4 6 / 7 2 3 is thresholded with
the center value 4 and the resulting bits are weighted by 1, 2, 4, ..., 128. LBP code: 1+2+8+64+128=203.
Figure 2: Three circular neighborhoods: (8,1), (16,2), (6,1). The pixel values are bilinearly
interpolated whenever the sampling point is not in the center of a pixel.
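As a rough illustration, the example of Figure 1 can be reproduced in a few lines of Python. This is a minimal sketch rather than the authors' implementation: the function name `lbp_3x3` is hypothetical, and the clockwise bit ordering starting at the top-left neighbor is an assumption, chosen because it reproduces the sum 1+2+8+64+128 shown in the figure.

```python
# Clockwise neighbor offsets (row, col), starting at the top-left pixel.
# ASSUMPTION: this ordering is inferred from the sum printed in Figure 1.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_3x3(patch):
    """Return the basic LBP code of the center pixel of a 3x3 patch."""
    center = patch[1][1]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        # A neighbor contributes its power-of-two weight if it is >= the center.
        if patch[1 + dr][1 + dc] >= center:
            code += 1 << bit
    return code

patch = [[5, 9, 1],
         [4, 4, 6],
         [7, 2, 3]]
code = lbp_3x3(patch)
print(code)  # 203
```

Applying `lbp_3x3` to every interior pixel and counting the resulting codes would give the 256-bin label histogram used as the texture descriptor.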
The operator can also be extended to use neighborhoods of different sizes (Ojala et al., 2002). Using
circular neighborhoods and bilinearly interpolating the pixel values allows any radius and number of
pixels in the neighborhood. For neighborhoods we will use the notation (P,R), which means P sampling
points on a circle of radius R. See Fig. 2 for an example of different circular neighborhoods.
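The (P,R) notation and the bilinear interpolation step can be sketched as follows. The function names are hypothetical, and the starting angle and direction of rotation of the sampling points are assumptions, since the text does not fix them.

```python
import math

def sampling_offsets(P, R):
    """Offsets (dx, dy) of P equally spaced sampling points on a circle of radius R.

    ASSUMPTION: the first point lies on the positive x axis; the source does
    not specify the starting angle or the direction of rotation.
    """
    return [(R * math.cos(2 * math.pi * p / P),
             R * math.sin(2 * math.pi * p / P)) for p in range(P)]

def bilinear(image, x, y):
    """Bilinearly interpolate `image` (a list of rows) at real-valued (x, y)."""
    x0, y0 = math.floor(x), math.floor(y)
    tx, ty = x - x0, y - y0
    # Clamp the far neighbors so that points on the last row/column stay valid.
    x1 = min(x0 + 1, len(image[0]) - 1)
    y1 = min(y0 + 1, len(image) - 1)
    return ((1 - tx) * (1 - ty) * image[y0][x0] + tx * (1 - ty) * image[y0][x1]
            + (1 - tx) * ty * image[y1][x0] + tx * ty * image[y1][x1])

image = [[0, 10],
         [20, 30]]
print(sampling_offsets(8, 1)[0])   # first sampling point of the (8,1) neighborhood
print(bilinear(image, 0.5, 0.5))   # value halfway between the four pixels
```

In the (8,1) neighborhood only the four diagonal points fall between pixel centers, so only those require interpolation.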
3 FRAMEWORK FOR FILTER
BANK AND VECTOR
QUANTIZATION BASED
TEXTURE DESCRIPTORS
Apart from LBP and other similar methods working directly on pixel values, another widely used
approach to texture analysis is to convolve an image with N different filters whose responses at a
certain position (x,y) form an N-dimensional vector. At the learning stage, a set of such vectors is
collected from training images and the set is clustered using, e.g., k-means to form a codebook. Then
each pixel of a texture image is labeled with the label of the nearest cluster center and the
histogram of these labels over a texture image is used to describe the texture (Leung and Malik,
2001), (Varma and Zisserman, 2005).
More formally, let I(x,y) be the image to be described by the texture operator. Now the vector valued
image obtained by convolving the original image with filter kernels is
$$I_f(x,y) = \begin{bmatrix} I_1(x,y) \\ I_2(x,y) \\ \vdots \\ I_N(x,y) \end{bmatrix}, \qquad I_n(x,y) = I(x,y) \star F_n, \quad n = 1, \dots, N. \tag{1}$$
The labeled image $I_{lab}(x,y)$ is obtained with a vector quantizer
$f : \mathbb{R}^N \to \{0, 1, 2, \dots, M-1\}$, where M is the number of different labels produced by
the quantizer. Thus, the labeled image is
$$I_{lab}(x,y) = f(I_f(x,y)) \tag{2}$$
and the histogram of labels is
$$H_i = \sum_{x,y} \delta\{i, I_{lab}(x,y)\}, \quad i = 0, \dots, M-1, \tag{3}$$
in which δ is the Kronecker delta.
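The three steps of Eqs. (1)-(3) can be sketched in plain Python as below. All names (`filter_responses`, `label_histogram`, the two toy kernels and the sign-based quantizer) are hypothetical and only serve to make the pipeline concrete; in particular, the kernels are applied as correlation without flipping and border pixels are skipped, both simplifying assumptions.

```python
def filter_responses(image, kernels):
    """Eq. (1): map each interior pixel (x, y) to its vector of N filter responses.

    ASSUMPTIONS: kernels are square with the same odd size, border pixels are
    skipped, and kernels are applied as correlation (no flip) for brevity.
    """
    h, w = len(image), len(image[0])
    k = len(kernels[0]) // 2
    responses = {}
    for y in range(k, h - k):
        for x in range(k, w - k):
            responses[(x, y)] = [
                sum(F[j + k][i + k] * image[y + j][x + i]
                    for j in range(-k, k + 1) for i in range(-k, k + 1))
                for F in kernels]
    return responses

def label_histogram(responses, quantizer, M):
    """Eqs. (2)-(3): label every response vector and count each label."""
    hist = [0] * M
    for vec in responses.values():
        hist[quantizer(vec)] += 1
    return hist

# Two hypothetical difference kernels and a sign-based quantizer with M = 4 labels.
kernels = [[[0, 0, 0], [0, -1, 1], [0, 0, 0]],
           [[0, 0, 0], [0, -1, 0], [0, 1, 0]]]
quantizer = lambda vec: sum((v >= 0) << n for n, v in enumerate(vec))
image = [[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 6], [4, 5, 6, 7]]
hist = label_histogram(filter_responses(image, kernels), quantizer, M=4)
print(hist)  # [0, 0, 0, 4]
```

Swapping in a different `kernels` list or `quantizer` function changes the descriptor without touching the rest of the pipeline, which is the separation the framework relies on.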
If the task is classification or categorization as in this work, several possibilities exist for
classifier selection. The most typical strategy is to use a nearest neighbor classifier with, e.g.,
the $\chi^2$ distance to measure the distance between histograms (Leung and Malik, 2001), (Varma and
Zisserman, 2005). In (Varma and Zisserman, 2004), the nearest neighbor classifier was compared to
Bayesian classification but no significant difference in the performance was found. In (Caputo et
al., 2005) it was shown that the performance of a material categorization system can be boosted by
using a suitably trained support vector machine based classifier. In this work, the main interest is
not in the classifier design but in the local descriptors and thus the nearest neighbor classifier
with the $\chi^2$ distance was selected for the experimental part.
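A nearest neighbor classifier over label histograms might look like the following sketch. The exact normalization of the $\chi^2$ distance is not given in the text, so the form used here is an assumption, and the function names and toy histograms are hypothetical.

```python
def chi2_distance(h1, h2):
    """Chi-square distance between two histograms of equal length.

    ASSUMPTION: the form sum((a - b)^2 / (a + b)) is used and empty bin pairs
    are skipped. Other normalizations (e.g. a factor 1/2) are also common.
    """
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)

def nearest_neighbor(query, train_hists, train_labels):
    """Return the label of the training histogram closest to `query`."""
    best = min(range(len(train_hists)),
               key=lambda i: chi2_distance(query, train_hists[i]))
    return train_labels[best]

train_hists = [[5, 0, 0], [0, 0, 5]]          # toy label histograms
train_labels = ["material A", "material B"]   # placeholder class names
print(nearest_neighbor([4, 1, 0], train_hists, train_labels))  # material A
```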
In the following two subsections we take a look at the two main parts of the proposed texture
description framework, the filter bank and the quantization method.
3.1 Filter Bank
In this paper we compare three different types of filter kernels that are used in texture
description. The first filter bank is a set of oriented derivative filters whose thresholded output
is shown to be equivalent to the local binary pattern operator. The other two filter banks included
in the comparison are Gabor filters and the Maximum Response 8 filter set.
VISAPP 2008 - International Conference on Computer Vision Theory and Applications
A novel way to look at the LBP operator proposed in this paper is to see it as a special filter based
texture operator. The filters for implementing LBP are approximations of image derivatives computed
at different orientations. The filter coefficients are computed so that they are equal to the weights
of bilinear interpolation of pixel values at the sampling points of the LBP operator, and the
coefficient at the filter center is obtained by subtracting 1 from the center value. For example, the
kernels shown in Fig. 3 can be used for a filter based implementation of the local binary pattern
operator in the circular (8,1) neighborhood. The response of such a filter at location (x,y) gives
the signed difference of the center pixel and the sampling point corresponding to the filter. These
filters, which will be called local derivative filters in the following, can be constructed for any
radius and any number of sampling points.
Figure 3: Filters $F_1 \cdots F_3$ of the total of 8 local derivative filters at (8,1) neighborhood.
The remaining 5 filters are obtained by mirroring the filters shown here. Nonzero coefficients: −1
(center) and 1 in the two axis-aligned filters; −0.914 (center), 0.207, 0.207 and 0.5 in the diagonal
filter.
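The construction of a local derivative filter from bilinear interpolation weights can be sketched as below. The function name and the coordinate convention (x to the right, y downwards, kernel indexed as `kernel[row][col]`) are assumptions; the coefficients of the diagonal filter agree with the values −0.914, 0.207 and 0.5 given in Figure 3.

```python
import math

def local_derivative_filter(dx, dy, size=3):
    """Kernel whose response is (interpolated sample at offset (dx, dy)) - (center pixel).

    ASSUMPTION: `size` is odd and large enough to contain the sampling point.
    """
    c = size // 2
    kernel = [[0.0] * size for _ in range(size)]
    x0, y0 = math.floor(dx), math.floor(dy)
    tx, ty = dx - x0, dy - y0
    # Spread the sample over its four surrounding pixels with bilinear weights.
    for j, wy in ((y0, 1 - ty), (y0 + 1, ty)):
        for i, wx in ((x0, 1 - tx), (x0 + 1, tx)):
            if wx * wy > 0:
                kernel[c + j][c + i] += wx * wy
    # Subtract 1 at the center so the filter measures a signed difference.
    kernel[c][c] -= 1.0
    return kernel

axis = local_derivative_filter(1.0, 0.0)
diag = local_derivative_filter(math.cos(math.pi / 4), math.sin(math.pi / 4))
for row in diag:
    print([round(v, 3) for v in row])
```

Both kernels sum to zero, which is consistent with the zero-mean property of the local derivative filters noted in the experiments.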
Another type of filter kernel that is widely used in texture description is the Gabor filter. Gabor
filters are complex filters whose spatial representation is obtained by multiplying a Gaussian with a
complex sinusoid. The typical way Gabor filters are applied in texture description is to convolve the
input image with a bank of Gabor filters and compute a set of features from the output images. A lot
of work has been devoted to designing the filter bank and feature computation methods, see, e.g.,
(Manjunath and Ma, 1996), (Clausi and Jernigan, 2000), (Grigorescu et al., 2002). In this work we
apply the Gabor filters in the proposed texture description framework, i.e. the responses of the
filter bank at a certain position are stacked into a vector which is used as an input for the vector
quantizer.
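The "Gaussian times complex sinusoid" construction can be illustrated with the sketch below. The parameter names (`size`, `sigma`, `frequency`, `theta`) and their values are hypothetical and are not the settings used in the experiments; no normalization or DC compensation is applied, so this is not necessarily the filter design the authors used.

```python
import cmath
import math

def gabor_kernel(size, sigma, frequency, theta):
    """Complex Gabor kernel: an isotropic Gaussian multiplied by a complex sinusoid.

    ASSUMPTIONS: `size` is odd, `frequency` is in cycles per pixel, and
    `theta` is the orientation of the sinusoid in radians.
    """
    c = size // 2
    kernel = []
    for y in range(-c, c + 1):
        row = []
        for x in range(-c, c + 1):
            gauss = math.exp(-(x * x + y * y) / (2 * sigma ** 2))
            phase = 2 * math.pi * frequency * (x * math.cos(theta) + y * math.sin(theta))
            row.append(gauss * cmath.exp(1j * phase))
        kernel.append(row)
    return kernel

k = gabor_kernel(size=7, sigma=2.0, frequency=0.25, theta=0.0)
print(len(k), len(k[0]))  # 7 7
print(k[3][3])            # center coefficient
```

Treating the real and imaginary parts as separate real filters gives two kernels per scale and orientation, which matches the filter counts in Table 1 (e.g. 1 scale × 4 orientations × 2 = 8).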
The third considered filter bank is the Maximum Response 8 bank (Varma and Zisserman, 2005). That
filter set consists of 38 filters: two isotropic filters (Gaussian and Laplacian of Gaussian) and an
edge and a bar filter, both at 3 scales and 6 orientations (2 + 2 × 3 × 6 = 38). As an intermediate
step between filtering and vector quantization, the maximum of the 6 responses at different
orientations is computed, which results in a total of 8 responses. In the proposed unified framework
this maximum selection falls more conveniently into the vector quantization operation and it is
discussed in more detail in the next subsection.
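The reduction from 38 filter responses to 8 can be sketched as follows. The ordering of the responses inside the input vector is an assumption made for this illustration, as is the function name.

```python
def mr8_reduce(responses):
    """Reduce 38 filter responses at one pixel to the 8 MR8 responses.

    ASSUMED layout: six groups of 6 orientations (edge and bar filters at
    3 scales each), followed by the 2 isotropic filter responses.
    """
    if len(responses) != 38:
        raise ValueError("expected 38 filter responses")
    # Keep only the maximum over the 6 orientations of each oriented filter.
    oriented = [max(responses[g * 6:(g + 1) * 6]) for g in range(6)]
    return oriented + list(responses[36:])

reduced = mr8_reduce(list(range(38)))
print(reduced)  # [5, 11, 17, 23, 29, 35, 36, 37]
```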
3.2 Vector Quantization
The assumption on which the proposed texture description framework is based is that the joint
distribution of filter responses can be used to describe the image texture. Depending on the size of
the filter bank, the dimension of the vectors in the image $I_f(x,y)$ can be high and quantization of
the vectors is needed for reliable estimation of the histogram.
A simple, non-adaptive way to quantize the filter responses is to threshold them and to compute the
sum of thresholded values multiplied by powers of two:
$$I_{lab}(x,y) = \sum_{n=1}^{N} s\{I_n(x,y)\}\, 2^{n-1}, \tag{4}$$
where $s\{z\}$ is the thresholding function
$$s\{z\} = \begin{cases} 1, & z \ge 0 \\ 0, & z < 0 \end{cases} \tag{5}$$
Thresholding divides each dimension of the filter bank output into two bins. The total number of
different labels produced by threshold quantization is $2^N$, where N is the number of filters.
Now, if the filter bank that was used to obtain the image $I_f(x,y)$ is the set of local derivative
filters (e.g. the filters presented in Fig. 3), the filter responses are equal to the signed
differences of the pixel I(x,y) and its neighbors. As the quantizer (4) is applied to $I_f(x,y)$, the
resulting labels are equal to those obtained with the local binary pattern operator using the same
neighborhood. Therefore, the LBP operator can be represented in the proposed framework.
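A small worked example may make this equivalence concrete. The sketch below applies the quantizer of Eqs. (4) and (5) to the signed differences of the Figure 1 neighborhood (5 9 1 / 4 4 6 / 7 2 3) and arrives at the LBP code 203 given in that figure. The function names and the clockwise ordering of the neighbors are assumptions.

```python
def s(z):
    """Thresholding function of Eq. (5)."""
    return 1 if z >= 0 else 0

def threshold_label(responses):
    """Eq. (4): sum of thresholded responses weighted by powers of two."""
    # enumerate() starts at 0, so bit n here corresponds to 2^(n-1) in Eq. (4).
    return sum(s(r) << n for n, r in enumerate(responses))

# Signed differences (neighbor - center) for the Figure 1 neighborhood,
# listed clockwise from the top-left neighbor (ASSUMED ordering).
center = 4
neighbors = [5, 9, 1, 6, 3, 2, 7, 4]
responses = [p - center for p in neighbors]
label = threshold_label(responses)
print(label)  # 203
```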
Another method for quantizing the filter responses is to construct a codebook of them at the learning
stage and then use the nearest codeword to represent the filter bank output at each location:
$$I_{lab}(x,y) = \arg\min_{m} \left\| I_f(x,y) - c_m \right\|, \tag{6}$$
in which $c_m$ is the m-th vector (codeword) in the codebook. This approach is used in (Leung and
Malik, 2001) and (Varma and Zisserman, 2005), which use k-means to construct the codebook whose
elements are called textons. Codebook based quantization of signed differences of neighboring pixels
(which correspond to local derivative filter outputs) was presented in (Ojala et al., 2001).
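The labeling rule of Eq. (6) amounts to an exhaustive nearest-codeword search, sketched below with a hypothetical three-word codebook. The Euclidean norm is an assumption, since the text does not name the norm, and the codebook learning step (e.g. k-means) is not shown.

```python
def nearest_codeword(vec, codebook):
    """Eq. (6): index m of the codeword closest to the response vector `vec`.

    ASSUMPTION: Euclidean distance; the squared distance is compared, which
    yields the same minimizer.
    """
    def squared_distance(codeword):
        return sum((v - c) ** 2 for v, c in zip(vec, codeword))
    return min(range(len(codebook)), key=lambda m: squared_distance(codebook[m]))

codebook = [[0, 0], [10, 10], [-5, 5]]   # toy codebook with M = 3 codewords
print(nearest_codeword([9, 8], codebook))   # 1
print(nearest_codeword([-4, 6], codebook))  # 2
```

The search visits every codeword for every pixel, which is the source of the extra computational cost compared with thresholding.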
When comparing these two methods for quantizing the filter responses, one might expect that, if the
number of labels produced by the quantizers is kept roughly the same, the codebook based quantizer
handles the possible statistical dependencies between the filter responses better. On the other hand,
since codebook based quantization requires a search for the closest codeword at each pixel location,
it is clearly slower than simple thresholding, even though a number of both exact and approximate
techniques have been proposed for finding the nearest codeword without an exhaustive search through
the codebook (Gray and Neuhoff, 1998, p. 2362).
It is important to note that a clever co-design of the filter bank and the vector quantizer can also
make the texture descriptor rotationally invariant. Again, two different strategies have been
proposed. Rotationally invariant LBP codes are obtained by circularly shifting an LBP binary code to
its minimum value (Ojala et al., 2002). In the joint framework this can be represented as further
combining the labels of threshold quantization (4) so that all the different labels that can arise
from rotations of the local gray pattern are joined to form a single label.
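The circular shifting to the minimum value can be written compactly with bit operations. The sketch below is one possible formulation with a hypothetical function name; it maps every rotation of a P-bit code to the same representative label.

```python
def rotation_invariant(code, P=8):
    """Map a P-bit LBP code to the minimum over all of its circular bit shifts."""
    mask = (1 << P) - 1
    return min(((code >> k) | (code << (P - k))) & mask for k in range(P))

print(rotation_invariant(203))  # 47
print(rotation_invariant(151))  # 47, since 151 is a rotation of 203
```

Here 203 is 11001011 in binary and its smallest rotation is 00101111, i.e. 47.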
On the other hand, the approach chosen for the MR8 descriptor to achieve rotational invariance is to
select only the maximum of the 6 different rotations of each bar and edge filter. Only these maximum
values and the responses of the two isotropic filters are used in further quantization, so the
8-dimensional response of the filter bank is invariant to rotations of the gray pattern.
4 EXPERIMENTS
To test the proposed framework and to systematically explore the relative descriptiveness of the
different filter banks and vector quantization methods, the challenging task of material
categorization using the KTH-TIPS2 database (Mallikarjuna et al., 2006) was utilized.
4.1 Experimental Setup
The KTH-TIPS2 database contains 4 samples of 11 different materials, each sample imaged at 9
different scales and 12 lighting and pose setups, totaling 4 × 11 × 9 × 12 = 4752 images.
Caputo et al. performed material categorization tests using KTH-TIPS2 and considered especially the
significance of classifier selection (Caputo et al., 2005). In that paper, the main conclusions were
that state-of-the-art descriptors such as LBP and MR8 have relatively small differences in
performance, but significant gains in classification rate can be obtained by using a support vector
machine classifier instead of nearest neighbor. Moreover, the classification rates can be enhanced by
increasing the number of samples used for training.
In this work, the main interest is to examine the relative descriptiveness of different setups of the
filter bank based texture descriptors. To facilitate this task, we chose the most difficult test
setup used in (Caputo et al., 2005), namely using the nearest neighbor classifier with only one
sample (i.e. 9 × 12 = 108 images) per material class for training.
Table 1: Properties of the tested filter kernels.
Filter bank          Size     Num of filters
Local der. filters   3×3      8
Gabor(1,4)           7×7      8
Gabor(4,6)           49×49    48
MR8                  49×49    38
Figure 4: The categorization rates for different filter banks as a function of codebook size. Axis
ticks in the original plot: codebook size 16 to 256, categorization rate 0.4 to 0.6. Curves: Local
derivatives, Gabor(4,6), Gabor(1,4), MR8.
The proposed framework allowed testing the performance of different filters and different
quantization methods independently. The filter banks that were included in the test were local
derivative filters, two different banks of Gabor filters and MR8 filters. The local derivative filter
bank was chosen to match the $\mathrm{LBP}_{8,1}$ operator, which resulted in 8 filters (see Fig. 3).
Two very different types of Gabor filter banks were tested, one with only 1 scale and 4 orientations
and small spatial support (7×7) and another one with 4 scales and 6 orientations and larger spatial
support. The properties of the tested filter kernels are listed in Table 1.
4.2 Codebook based Vector
Quantization
All the 4 filter banks were tested using two types of vector quantization: thresholding and codebook
based quantization. For codebook based quantization, the selected approach was to aim for compact,
universal texton codebooks, i.e. codebooks of rather small size that are not tailored for this
specific set of textures. Therefore, the KTH-TIPS1 database (Mallikarjuna et al., 2006) was used to
learn the codebooks. That database is similar to KTH-TIPS2 in terms of imaging conditions but it
contains a partly different set of materials (textures). The tested codebook sizes ranged from 32 to
256 codewords.
The categorization rates as a function of the codebook size obtained with each filter bank and
codebook based quantization are plotted in Figure 4. The figure shows that, in most cases, using a
larger codebook enhances the categorization rate but the selection of the filter bank is a more
dominant factor than the codebook size. For example, local derivative filters achieve a higher
categorization rate with the smallest codebook size than the MR8 filters with any codebook size.
Table 2: Recognition rates for different filter banks and quantization methods.
Filter bank          Codebook   Thresholding
Local der. filters   0.562      0.532
Gabor(1,4) filters   0.487      0.383
Gabor(4,6)           0.525      -
MR8 filters          0.471      0.498
4.3 Thresholding based Vector
Quantization
In the next experiment the material categorization tests were performed using the same filter banks
but thresholding based vector quantization. The local derivative and Gabor filters have zero mean,
thus the thresholding function (5) was applied directly. For the edge and bar filters in the MR8
filter set, only the maximum of responses over different orientations is measured and therefore in
that case the mean of the 8-dimensional response vectors over all the training images was computed
and subtracted from the response before applying thresholding.
Table 2 lists the categorization rates using thresholding based quantization and the maximum
categorization rate over different codebook sizes for the four tested filter banks. The Gabor(4,6)
filter bank was omitted from the thresholding experiment due to the large number of filters in the
filter bank (the resulting histograms would have been of length $2^{48}$). Codebook based
quantization yields a slightly better categorization rate than thresholding when using local
derivative filters. With the Gabor(1,4) filter bank thresholding performs much worse than codebook
based quantization, but interestingly with MR8 filters, thresholding yields a better rate.
To conclude the performed experiments, the local derivative filters give the best categorization rate
over the tested filter sets with both quantization functions. The results obtained in these
experiments also support those presented in (Ojala et al., 2001), which showed that codebook based
quantization of signed gray-level differences yields slightly better recognition than LBPs, however
at the cost of higher computational complexity. Considering the computational cost of the presented
methods, thresholding based quantization is much faster than codebook based quantization. As for the
filter bank operations, the computational cost grows with the size and number of filters, but using
FFT based convolution can make the operations faster. Still, at the two extremes, the computations
for local derivative filter and thresholding based labeling of an image of size 256×256 take 0.04
seconds whereas the codebook based labeling of the same image using Gabor(4,6) filters (and
performing convolutions using FFT) takes 10.98 seconds. Both running times were measured using
unoptimized Matlab implementations of the methods on a PC with an AMD Athlon 2200 MHz processor.
4.4 Filter Subset Selection
The third experiment tested whether it is possible to select a representative subset of filters from
a large filter bank for thresholding based quantization. The number of labels produced by the
quantizer is $2^N$, in which N is the number of filters, which means that the length of the label
histograms grows exponentially with respect to the number of filters. Thus a small filter bank is
desirable for the thresholding quantization.
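The exponential growth is easy to quantify for the filter banks of Table 1. The short calculation below is only an illustration; the dictionary name is hypothetical, and for MR8 it assumes that thresholding is applied to the 8 responses that remain after the maximum selection.

```python
# Number of thresholded responses per filter bank. For MR8 the 38 filter
# outputs are first reduced to 8 responses by the maximum selection.
responses_per_bank = {"Local der. filters": 8, "Gabor(1,4)": 8,
                      "MR8": 8, "Gabor(4,6)": 48}

# Thresholding each response into two bins gives 2^N possible labels.
histogram_length = {name: 2 ** n for name, n in responses_per_bank.items()}
for name, length in histogram_length.items():
    print(f"{name}: {length} bins")
```

With 8 responses the histogram has 256 bins, whereas 48 responses would require 281474976710656 bins, far more than the number of pixels available to populate them.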
In this experiment, the Sequential Floating Forward Selection (SFFS) algorithm (Pudil et al., 1994)
was used to select a maximum of 8 filters from a larger filter bank. The optimization criterion was
the recognition rate over the training set (KTH-TIPS1). Two different initial filter banks were
tested. First, 8 filters were selected from the 48 filters in the Gabor(4,6) filter bank. However,
the resulting 8-filter bank did not perform well on the testing database, yielding a categorization
rate of only 0.295.
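For readers unfamiliar with SFFS, the following is a simplified sketch of the idea, not the exact formulation of Pudil et al. nor the implementation used in the experiments: features are added greedily, and after each addition features are removed again for as long as that yields a better subset of the smaller size than any seen before. The toy criterion and feature names are hypothetical stand-ins for the recognition rate over the training set and the filters.

```python
def sffs(features, criterion, target_size):
    """Sequential floating forward selection, simplified sketch.

    `criterion(subset)` returns a score to maximize. Returns (subset, score)
    for the best subset of `target_size` features found.
    ASSUMPTION: 1 <= target_size <= len(features).
    """
    selected = []
    best = {}  # best[k] = (subset, score): best subset of size k seen so far
    while True:
        # Inclusion: add the remaining feature that maximizes the criterion.
        remaining = [f for f in features if f not in selected]
        added = max(remaining, key=lambda f: criterion(selected + [f]))
        selected = selected + [added]
        k, score = len(selected), criterion(selected)
        if k not in best or score > best[k][1]:
            best[k] = (list(selected), score)
        if k == target_size:
            return best[k]
        # Conditional exclusion: drop the least significant feature as long as
        # the reduced subset beats the best subset of that size seen so far.
        while len(selected) > 2:
            reduced = max(([g for g in selected if g != f] for f in selected),
                          key=criterion)
            reduced_score = criterion(reduced)
            if reduced_score <= best[len(reduced)][1]:
                break
            selected = reduced
            best[len(reduced)] = (list(reduced), reduced_score)

# Toy criterion: individual merits plus a bonus when "b" and "c" are combined.
VALUES = {"a": 10, "b": 6, "c": 5, "d": 4, "e": 1}

def toy_criterion(subset):
    score = sum(VALUES[f] for f in subset)
    if "b" in subset and "c" in subset:
        score += 8
    return score

subset, score = sffs(list(VALUES), toy_criterion, target_size=4)
print(sorted(subset), score)  # ['a', 'b', 'c', 'd'] 33
```

In this toy run the floating step replaces the greedy pair (a, b) with the better pair (b, c) once their interaction has been discovered, which plain forward selection would not do.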
In the face recognition literature, there are findings that LBP and Gabor filter based information
are complementary. In (Yan et al., 2007), score level fusion of LBP and Gabor filter based similarity
scores was done. Motivated by these findings, SFFS was used to select 8 filters from the union of the
local derivative and Gabor(1,4) filter banks. This resulted in a set of 6 local derivative and 2
Gabor filters, and the resulting filter bank reached a categorization rate of 0.544, which is
significantly higher than the rate of the Gabor(1,4) filter bank (0.383) and slightly higher than the
rate of the local derivative filter bank (0.532) with thresholding based quantization.