Computer Vision and Image Understanding
Vol. 74, No. 1, April, pp. 22–35, 1999
Article ID cviu.1999.0745, available online at http://www.idealibrary.com on
Pattern Matching as a Correlation on the Discrete Motion Group
Alexander B. Kyatkin and Gregory S. Chirikjian∗
Department of Mechanical Engineering, Johns Hopkins University, 3400 North Charles Street, Baltimore, Maryland, 21218
E-mail: gregc@jhu.edu
Received November 3, 1997; accepted January 6, 1999
In this paper we develop a correlation method for the template
matching problem in pattern recognition which includes transla-
tions, rotations, and dilations in a natural way. The correlation
method is implemented using Fourier analysis on the “discrete
motion group” and fast Fourier transform methods. A brief in-
troduction to Fourier methods on the discrete motion group is
given and the efficiency of these methods is discussed. Results of
the numerical implementation are given for particular examples.
© 1999 Academic Press
Key Words: pattern analysis; object recognition and indexing.
1. INTRODUCTION
In this paper we address a two-dimensional problem in pattern
recognition. For a given template object we want to find if this
template object is present in a given image, and, if it is found, de-
termine its position and orientation. We use a correlation method
(see [1] and references therein) for this purpose, which is ex-
tended in a natural way to include rotations and dilations of the
template object in addition to translations. Essentially, we trans-
late, rotate, and dilate the template object, overlap it with the
image and compute an overlap area (weighted by the intensity
value at each pixel) with the proper normalization. The novelty
of our approach is that the correlation method is implemented using
the Fourier transform on the “discrete motion group.” Fourier
methods on the discrete motion group also provide a fast method
to distinguish “identical” images (up to possible translations and
rotations of the image) from “different” ones.
The discrete motion group can be viewed as the set of matrices
of the form
$$ g = \begin{pmatrix} R & r \\ 0^T & 1 \end{pmatrix}, \tag{1} $$
where
$$ R = \begin{pmatrix} \cos(2\pi i/N) & -\sin(2\pi i/N) \\ \sin(2\pi i/N) & \cos(2\pi i/N) \end{pmatrix} \tag{2} $$
for a fixed natural number N and i ∈ [0, N−1] (R ∈ SO(2) for
the continuous motion group SE(2)).¹ The group law is simply
matrix multiplication.
∗ To whom all correspondence should be addressed.
The problem of template matching is quite old and has been
approached in a number of different ways. Perhaps the most
common (and oldest) approach is that of “matched filters” [2].
In this approach the Fourier transforms of the image and template
are taken, these are multiplied, and a peak is sought. This method
can be implemented via digital computer, or by analog optical
computation [3]. The drawback of this standard approach is that
rotations are handled in a very awkward manner. Several works
have considered rotation-invariant approaches (e.g., [4]). In such
approaches, polar coordinates are used and images are expanded
in series of Zernike polynomials (see, e.g. [5]) or by using the
Hankel transform. The problem with such approaches is that
rotational invariance is often gained at the expense of the trans-
lational invariance offered by the classical Fourier transform.
A number of works have considered using invariants of im-
ages for recognition (e.g. [6]). When one begins discussing in-
variants, the most natural analytical tool is group theory. In this
work we apply an area of group theory called noncommutative
harmonic analysis to the template matching problem. In short,
this area of mathematics deals with the generalization of the
concept of convolution and Fourier transforms to functions on
groups. In particular, if we are given a function f(x), the gener-
alized Fourier transform developed and applied in this paper is
a matrix function which has the property
$$ \mathcal{F}(f(R^T(x-a))) = \mathcal{F}(f(x))\,U(R,a), $$
where U is a unitary matrix that depends on rotation R and
translation a, and $\mathcal{F}$ denotes the nonabelian Fourier transform.
The above expression cannot be written as a matrix product
for the usual abelian Fourier transform for R ≠ identity, although
it is completely analogous to the behavior of the abelian
Fourier transform applied to translated functions. In other words,
noncommutative harmonic analysis provides a natural tool for
translation and rotation invariant pattern matching. Furthermore,
since U is unitary, $\|U\mathcal{F}(f)\|_2 = \|\mathcal{F}(f)\|_2$, and so this
generalized Fourier transform provides a tool for generating a whole
continuum of pattern invariants under rigid-body motion.
1 The notation SE(2) stands for “special Euclidean” group of R², i.e. the
group of all rigid-body motions in the plane. It is also called the Euclidean
motion group, or simply the motion group.
1077-3142/99 $30.00
Copyright © 1999 by Academic Press
All rights of reproduction in any form reserved.
The connection between group theory and the theory of wave-
lets (which has become a very popular tool in image analysis)
has been well established. In essence, expanding a function in a
wavelet basis is achieved by starting with a mother wavelet and
superposing affine-transformed versions of the mother wavelet
to best approximate a given function. The interested reader is
pointed to [7–10] for further reading on the subject of wavelets,
their applications in image analysis, and their connection with
group theory.
The approach presented in this paper is to use the nonabelian
Fourier transform and generalized concepts of convolution and
correlation. This is very different than wavelet approaches. While
wavelets typically allow one to efficiently approximate functions
(or images), they have the drawback of not behaving well under
operations such as convolution, which is the most natural tool
in matched filtering.
In Section 2 we describe the correlation method. Section 3
is an introduction to Fourier analysis on the discrete motion
group. In Section 4 we describe the implementation of the cor-
relation method using the Fourier transform on the discrete mo-
tion group. Section 5 describes the invariant constructions which
may be used in image analysis problems. Section 6 examines the
computational complexity of the approach. Section 7 describes
practical numerical examples: Subsection 7.1 gives numerical
examples of the correlation method which includes translations
and rotations; Subsection 7.2 illustrates applications of the in-
variants on the discrete motion group for comparison of images.
2. METHOD FOR PATTERN RECOGNITION
In this paper we extend the correlation method for pattern
recognition [1] to include, in a natural way, rotations and dila-
tions (in addition to translations) as the allowed transformations
of the image. To find if the template object is present in the image
we take a section from the image and compare it with a rotated,
translated, and dilated version of the template pattern. Taking
a section from the image is equivalent to multiplication of the
image by a “window” function, which is rotated, translated, and
dilated the same way as the template pattern.
Mathematically the correlation function is written as
$$ q(a,R,k) = \frac{\int_{\mathbb{R}^2} f_1(x)\, W(R^{-1}(kx-a))\, f_2(R^{-1}(kx-a))\, d^2x}{\left[\int_{\mathbb{R}^2} (f_1(x))^2\, (W(R^{-1}(kx-a)))^2\, d^2x\right]^{1/2} \left[\int_{\mathbb{R}^2} (f_2(R^{-1}(kx-a)))^2\, d^2x\right]^{1/2}}, \tag{3} $$
where R ∈ SO(2), a ∈ R², k ∈ R⁺ is close to one, and W(x) is a
window function. For a similar template pattern and window
from the image the value of the correlation coefficient should be
close to one. We note that the integral
$$ \int_{\mathbb{R}^2} (f_2(R^{-1}(kx-a)))^2\, d^2x \tag{4} $$
is just the square of the norm of the function f₂ (for k = 1),
$$ \int_{\mathbb{R}^2} (f_2(x))^2\, d^2x. $$
According to the Cauchy–Schwarz inequality,
$$ \int f_1(x)\, f_2(x)\, d^2x \le \left[\int (f_1(x))^2\, d^2x \int (f_2(x))^2\, d^2x\right]^{1/2}, $$
the correlation coefficient (3) is always smaller or equal to one,
and it is equal to one for an identical pattern and windowed
image. We note that the value of a correlation coefficient does
not change if we change overall intensity of the original image
or template object.
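The two properties just stated (the coefficient never exceeds one, and it is unchanged by an overall intensity scaling) can be checked on a tiny discrete example. The sketch below is only an illustration: it assumes the windowed image and the template have already been sampled on the same pixels and flattened into lists, and the names `correlation_coefficient`, `template`, `brighter` and `other` are hypothetical, not taken from the paper's C implementation.

```python
import math

def correlation_coefficient(window, template):
    """Discrete analogue of Eq. (3) for one fixed pose: <w, t> / (|w| |t|)."""
    num = sum(w * t for w, t in zip(window, template))
    den = math.sqrt(sum(w * w for w in window)) * math.sqrt(sum(t * t for t in template))
    return num / den if den > 0 else 0.0

# Hypothetical grey values sampled on the same six pixels.
template = [0.0, 1.0, 2.0, 1.0, 0.0, 3.0]

same = correlation_coefficient(template, template)                       # identical pattern
brighter = correlation_coefficient([5.0 * v for v in template], template)  # intensity scaled by 5
other = correlation_coefficient([3.0, 0.0, 1.0, 2.0, 1.0, 0.0], template)  # a different pattern

print(same, brighter, other)
```

The first two values agree with one up to rounding, while the third is strictly smaller, as the Cauchy–Schwarz inequality requires.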
For a dilation coefficient k=1 we observe that the correlation
function q = q(a, R) is a function on the Euclidean motion group
SE(2) [22, 16], which is the semidirect product of the translation
group (R², +) and the rotation group SO(2). It appears that this
group has not been used extensively in applications to image
processing; the authors are aware of only a few previous works
using this group (e.g., [11, 12, 15]).
Using Fourier methods on the motion group we compute the
correlation coefficient in a much more efficient way than us-
ing direct integration. Indeed, the direct computation of integral
(3) is very costly (we consider for simplicity the k=1 case).
For $N_r = N_x \cdot N_y$ samples of the image (and template) on an
$N_x \times N_y$ rectangular grid, and for $N$ samples of orientation,
we need to perform $O(N_r^2 N)$ computations (and we need to
compute the convolution-like integrals twice, in the denominator
and numerator of (3)). For $N_r = 256 \times 256$ and $N = 60$,
the computations require $5 \times 10^{11}$ operations, which requires a
day of computer work on a 250 MHz workstation. In this paper
we use the advantages of Fourier methods on the “discrete
motion group” (i.e. the subgroup of SE(2), where the orientation
angle has discrete values from the $C_N$ subgroup of SO(2),
$\theta = 2\pi i/N$ for $i = 0, \ldots, N-1$), and fast Fourier transform
(FFT) methods [17, 18] to compute the correlation coefficient in
$O(N N_r \log N_r)$ computations. In addition, Fourier methods on
the discrete motion group provide a very fast method for com-
parison of two images which are translated and rotated relatively
to each other.
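The operation counts quoted above can be reproduced with a few lines of arithmetic. This is a rough sketch only: the factor of two for the numerator and denominator follows the text, while the base-2 logarithm and the absence of any other constant factors are assumptions made for illustration, so the Fourier figure is an order-of-magnitude estimate rather than a measured cost.

```python
import math

def direct_cost(n_r, n):
    # Two convolution-like integrals (numerator and denominator of Eq. (3)),
    # each taking about N_r^2 * N elementary operations.
    return 2 * n_r ** 2 * n

def fourier_cost(n_r, n):
    # Assumed estimate N * N_r * log2(N_r), also counted twice.
    return 2 * n * n_r * math.log2(n_r)

n_r, n = 256 * 256, 60
direct = direct_cost(n_r, n)
fast = fourier_cost(n_r, n)

print(f"direct integration: {direct:.2e} operations")
print(f"Fourier method    : {fast:.2e} operations")
print(f"ratio             : {direct / fast:.0f}")
```

For these sizes the direct count is about 5 × 10^11, matching the figure in the text, and under the stated assumptions the ratio of the two estimates reduces to N_r / log2(N_r).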
A natural question to ask is how the computational require-
ments of this approach compare to classical Fourier techniques
applied to matched filtering. The answer is that they are on the
same order. However, the benefit of our formulation is that it provides
a clean notation in which to treat translations and rotations
in a unified way. This paper also serves as an introduction for the
image understanding community to techniques which are not
widely known outside of pure mathematics. In the next section
we briefly discuss Fourier methods on the discrete motion group.
3. FOURIER TRANSFORM ON THE DISCRETE
MOTION GROUP
The concept of convolution of functions on a wide variety of
abstract groups is well known in the pure mathematics literature
[14]. A detailed study of the concrete case of convolution of
functions on SE(2) in the context of robot kinematics can be
found in [13].
Building on this previous work, we note that the numera-
tor (and denominator) in the correlation function (3) for k=1
may be written formally as a convolution-like integral on the
Euclidean motion group
$$
\begin{aligned}
\int_{\mathbb{R}^2} f_1(x)\, f_2(R^{-1}(x-a))\, d^2x
&= \int_{SO(2)}\int_{\mathbb{R}^2} \tilde f_1(x,A)\, \tilde f_2(R^{-1}(x-a),\, R^{-1}\circ A)\, d^2x\, dA \\
&= \int_{SE(2)} \tilde f_1(h)\, \tilde f_2(g^{-1}\circ h)\, d\mu(h),
\end{aligned} \tag{5}
$$
where A ∈ SO(2) and dA = dθ/(2π) is the normalized integration
measure on SO(2), and in the template-matching problem
the functions $\tilde f_{1,2}$ are explicit functions only of position, i.e.,
$\tilde f_{1,2}(x,A) = f_{1,2}(x)$ (henceforth we do not distinguish between
$\tilde f_{1,2}$ and $f_{1,2}$).² Furthermore, $f_{1,2}(x)$ are nonnegative real
functions (we formally write $f_1$ as the complex conjugate of itself to
use the properties of the Fourier transform later).
The group elements g, h are in SE(2), the group product is the
group product on the motion group,³ and dµ(h) = d²x dθ/(2π).
We assume that the orientation angles are restricted to values
from the discrete subgroup $C_N$ of the rotation group SO(2). We
refer to the subgroup of the motion group with a discrete range
of allowed rotations as the discrete motion group $G_N$.
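A small sketch may help make the group product concrete. It stores an element of the discrete motion group as a pair (translation, rotation index), composes two elements with the rule g ∘ h = (Ry + x, R ∘ A), and compares the result with multiplication of the 3 × 3 matrices of Eq. (1). The value N = 6 and all function names are hypothetical choices for illustration.

```python
import math

N = 6  # number of allowed orientations (hypothetical choice)

def rot(j):
    """2x2 rotation matrix for the angle 2*pi*j/N."""
    t = 2.0 * math.pi * j / N
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]

def apply(R, v):
    return (R[0][0] * v[0] + R[0][1] * v[1], R[1][0] * v[0] + R[1][1] * v[1])

def compose(g, h):
    """g o h = (R y + x, R o A) for g = (x, R) and h = (y, A)."""
    (x, j), (y, k) = g, h
    ry = apply(rot(j), y)
    return ((ry[0] + x[0], ry[1] + x[1]), (j + k) % N)

def inverse(g):
    """g^{-1} = (-R^{-1} x, R^{-1})."""
    x, j = g
    rx = apply(rot(-j), x)
    return ((-rx[0], -rx[1]), (-j) % N)

def to_matrix(g):
    """Homogeneous 3x3 matrix of Eq. (1)."""
    x, j = g
    R = rot(j)
    return [[R[0][0], R[0][1], x[0]], [R[1][0], R[1][1], x[1]], [0.0, 0.0, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

g = ((1.0, 2.0), 1)
h = ((-0.5, 3.0), 4)
gh = compose(g, h)
product = matmul(to_matrix(g), to_matrix(h))
identity = compose(g, inverse(g))
```

Here `to_matrix(gh)` and `product` agree entry by entry up to rounding, and `identity` is the pair ((0, 0), 0) up to rounding.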
For the discrete motion group the integration over orientation
should be replaced by summation through $A_i$ (which can be
viewed as elements of the group $C_N$, or as matrices of the form
in Eq. (2)):
$$ \int_{SO(2)} (\cdot)\, dA \;\to\; \frac{1}{N}\sum_{i=0}^{N-1} (\cdot). $$
In the case of the translation group the usual Fourier transform
on R² may be used to get a simple expression for convolution
in Fourier space (i.e. the product of Fourier transforms). In fact,
this property is based on the property
$$ U(a;p)\cdot U(b;p) = U(a+b;p) $$
of the Fourier transform elements U(a; p) = exp(i p · a), which
form a complete and orthonormal set of elements for all possible
values of the Fourier parameter vector p. We note that U(a; p)
are matrix elements (complex numbers in this case) of unitary
irreducible representations [25, 26] of the translation group of R².
2 Any function on R² can also be viewed as one on SE(2) which is constant
over all orientations.
3 For g = (x, R) and h = (y, A) the group product is defined as g ∘ h = (Ry +
x, R ∘ A), where R ∘ A is a group product for SO(2).
We use a similar approach in order to get a simple expres-
sion for the convolution integral on the motion group in Fourier
space. We have to use a generalized Fourier transform with the
property that (see Appendix)
$$ \mathcal{F}(f_1 * f_2) = \mathcal{F}(f_1)\,\mathcal{F}(f_2), $$
where $f_{1,2}$ are functions on the motion group. This will provide
a tool for fast calculation of integrals like those in Eq. (5).
A well-developed theory for such generalizations of the Fourier
transform exists. It is called noncommutative harmonic analysis.
A key element of this theory is the enumeration of linear
operators, U, which have the homomorphism property
$$ U(g_1;\rho)\,U(g_2;\rho) = U(g_1\circ g_2;\rho), \tag{6} $$
where $g_{1,2}$ are group elements of a group G, ρ is a generalized
Fourier parameter (or set of parameters), and the operator
product may be understood as a matrix product (of, in general,
infinite dimensional matrices).
This homomorphism property allows one to reduce the con-
volution integrals to a matrix product equation in Fourier space.
The property (6) is just part of the definition of a group repre-
sentation [25] and is required to define Fourier transforms with
the convolution property. The operators U can be thought of as
generalizations of the complex exponentials, U, used in usual
Fourier analysis. Each U can be expressed as a unitary matrix.
To generate the complete and orthonormal basis in which to
expand functions on the group, we have to calculate the matrix
elements of irreducible and unitary representations (IURs)
[25, 26] of the group. A detailed review of the general theory is
provided in the Appendix.
The elements of the U matrices for the discrete motion group
may be written as (see Appendix for details)
$$ U_{mn}(g;\rho) = U_{mn}(A_j, r; p, \phi) = e^{i p\, u^{\phi}_m \cdot r}\, \delta_{A_j^{-1} u_m,\, u_n}, \tag{7} $$
where $A_j^{-1}$ is the inverse of the discrete rotation $A_j$, and $u^{\phi}_k$
denotes the vector to the angle θ = φ + 2πk/N on the unit circle
in the interval $F_k = [2\pi k/N,\, 2\pi(k+1)/N]$, k = 0, ..., N−1
(φ measures the angle on this segment, 0 ≤ φ ≤ 2π/N). The
vector $u^{\phi}_k$ is analogous to the 2D Fourier vector p in the ordinary 2D
Fourier transform (normalized to unit magnitude) and, thus, has
a dependence on the continuous angle θ, which just measures
the polar angle of p. We note that each element of the discrete
motion group can be expressed as a pair $g = (A_i, r)$ and each
Fourier parameter can be expressed as the pair ρ = (p, φ).
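The matrix elements of Eq. (7) are simple enough to build explicitly, which makes it possible to check the homomorphism property (6) and unitarity numerically. The sketch below is an illustration under stated assumptions: N = 5, the group elements, and the values of p and φ are arbitrary, and the helper names are hypothetical. It uses the fact that the inverse rotation sends u_m to u_{m−j}, so each row of U has exactly one nonzero entry.

```python
import cmath
import math

N = 5  # number of discrete orientations (hypothetical choice)

def rotate(j, r):
    """Apply the rotation A_j (angle 2*pi*j/N) to the vector r."""
    t = 2.0 * math.pi * j / N
    c, s = math.cos(t), math.sin(t)
    return (c * r[0] - s * r[1], s * r[0] + c * r[1])

def compose(g1, g2):
    """(A_j1, r1) o (A_j2, r2) = (A_{j1+j2}, A_j1 r2 + r1)."""
    (j1, r1), (j2, r2) = g1, g2
    ar2 = rotate(j1, r2)
    return ((j1 + j2) % N, (ar2[0] + r1[0], ar2[1] + r1[1]))

def U(g, p, phi):
    """N x N matrix with entries exp(i p u_m . r) * delta(A_j^{-1} u_m, u_n)."""
    j, r = g
    mat = [[0j] * N for _ in range(N)]
    for m in range(N):
        t = phi + 2.0 * math.pi * m / N
        phase = p * (math.cos(t) * r[0] + math.sin(t) * r[1])
        mat[m][(m - j) % N] = cmath.exp(1j * phase)  # A_j^{-1} u_m = u_{m-j}
    return mat

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)] for i in range(N)]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(N)] for i in range(N)]

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(N) for j in range(N))

g1 = (2, (0.3, -1.2))
g2 = (4, (1.5, 0.7))
p, phi = 2.0, 0.4

homomorphism_error = max_diff(matmul(U(g1, p, phi), U(g2, p, phi)),
                              U(compose(g1, g2), p, phi))
identity = [[1.0 if i == j else 0.0 for j in range(N)] for i in range(N)]
unitarity_error = max_diff(matmul(U(g1, p, phi), dagger(U(g1, p, phi))), identity)
```

Both `homomorphism_error` and `unitarity_error` come out at the level of floating-point rounding.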
The direct Fourier transform is defined as
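$$ \hat f_{mn}(p,\phi) = [\mathcal{F}(f)]_{mn} = \sum_{i=0}^{N-1}\int_{\mathbb{R}^2} f(A_i, r)\, U^{-1}_{mn}(A_i, r; p, \phi)\, d^2r. \tag{8} $$
The vector $u^{\phi}_m$, which is inside the segment $F_m$, may be obtained
from $u^{\phi}_0$ by the rotation $A_m$ (which transforms $F_0$ to $F_m$),
$u^{\phi}_m = A_m u^{\phi}_0$. The parameter φ denotes the position inside the segment
$F_0$.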
The inverse Fourier transform is
$$ \mathcal{F}^{-1}(\hat f) = \frac{1}{4\pi^2}\sum_m\sum_n\int_0^{\infty}\int_0^{2\pi/N} \hat f_{mn}(p,\phi)\, U_{nm}(A_i, r; p, \phi)\, p\, dp\, d\phi, \tag{9} $$
where the angle φ is measured from θ = 2πn/N. We note that
this result is in agreement with [11].
4. APPLICATION TO THE CORRELATION METHOD
The convolution-like integrals in the numerator and denomi-
nator of (3) may be formally written (for simplicity we consider
the k=1 case) as integrals
$$ c(x, A_j) = \frac{1}{N}\sum_{i=0}^{N-1}\int_{\mathbb{R}^2} f_1(y, A_i)\, f_2\!\left(A_j^{-1}(y-x),\, A_j^{-1}A_i\right) d^2y \tag{10} $$
(where the functions $f_{1,2}$ are orientation-independent).
The correlation function, however, is a function on $G_N$, so we
may use the Fourier transform on the discrete motion group $G_N$
to write this integral as a product of Fourier transforms. Because
the functions are real the integral in the numerator of Eq. (3) may
be written as
$$ \frac{1}{N}\sum_{i=0}^{N-1}\int_{\mathbb{R}^2} f_1(y, A_i)\, f_2\!\left(A_j^{-1}(y-x),\, A_j^{-1}A_i\right) d^2y = \int_{G_N} f_1(h)\, f_2(g^{-1}h)\, d\mu(h), $$
where we denote integration dµ(h) over the discrete motion
group, $G_N$, to mean integration over R² and summation through
the $A_i$, and the group elements are of the form $g = (x, A_j)$. Using
the orthogonality and homomorphism properties of the Fourier
matrix elements, this integral may be written as
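$$
\begin{aligned}
&\frac{1}{N}\sum_q\sum_n\int_0^{\infty}\int_0^{2\pi/N}\sum_m \left(\overline{\hat f_{1mn}}\,\hat f_{2mq}\right) U_{qn}(g^{-1};p,\phi)\, p\, dp\, d\phi \\
&\quad = \frac{1}{N}\sum_q\sum_n\int_0^{\infty}\int_0^{2\pi/N}\sum_m \left(\overline{\hat f_{2mq}}\,\hat f_{1mn}\right) U_{nq}(g;p,\phi)\, p\, dp\, d\phi,
\end{aligned} \tag{11}
$$
where φ is measured from 2πq/N. For the second expression
we used the unitarity of the matrix elements $U_{mn}$ and the fact
that the expression is real (i.e., we take the complex conjugate
of the integral). The matrices $(\hat f_{1,2})_{mn}$ are the Fourier transforms
(as defined in Eq. (8)) of the functions $f_{1,2}(x, A_i)$. We note that
this integral is the inverse Fourier transform of $\hat f_2^{\dagger}\cdot\hat f_1$, and thus,
the expression depends only on three indices.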
Because functions $f_{1,2}(x, A_i) = f_{1,2}(x)$ do not depend on the
orientations $A_i$, matrix elements in the same column are the
same, i.e.,
$$ (\hat f_{1,2})_{mn} = (\hat f_{1,2})_{qn} $$
for any m, q. This may be observed from the expression
$$ U_{mn}(g^{-1};p,\phi) = e^{-i p\, u^{\phi}_n\cdot r}\,\delta_{A_i^{-1}u_n,\, u_m} $$
(the exponent depends only on the n-index), the definition of the
direct transform (8), and the fact that the functions do not depend
on the orientation. Thus, we compute a row of the Fourier matrix
for a particular orientation (for example $A_0 = 1$, the identity
element)
$$ (\hat f_{1,2})_n = (\hat f_{1,2})_{nn}. $$
This may be done using the 2D FFT for the functions $f_{1,2}(x)$ and
interpolating the Fourier values to points on a polar coordinate
grid.
The value of $p$ is determined by $|\mathbf{p}|$; the values of $m$ and $\phi$ are
determined by the angular part of $\mathbf{p}$. This requires $O(N_r \log(N_r))$
computations.
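The relabeling of an ordinary 2D frequency vector by the triple (p, m, φ) can be sketched as follows. The function names `polar_indices` and `to_cartesian` are hypothetical helpers introduced for illustration; the only convention taken from the text is that m selects the segment F_m and φ is measured from its lower edge 2πm/N.

```python
import math

def polar_indices(px, py, N):
    """Map a Cartesian frequency vector (px, py) to (p, m, phi)."""
    p = math.hypot(px, py)                         # magnitude |p|
    theta = math.atan2(py, px) % (2.0 * math.pi)   # polar angle in [0, 2*pi)
    sector = 2.0 * math.pi / N                     # angular width of one segment F_m
    k = int(theta // sector)
    m = k % N                                      # index of the segment containing theta
    phi = theta - k * sector                       # angle measured inside that segment
    return p, m, phi

def to_cartesian(p, m, phi, N):
    """Inverse map: rebuild the frequency vector from (p, m, phi)."""
    theta = phi + 2.0 * math.pi * m / N
    return p * math.cos(theta), p * math.sin(theta)

p1, m1, phi1 = polar_indices(1.0, 1.0, 4)    # first quadrant, N = 4
p2, m2, phi2 = polar_indices(-1.0, 0.1, 4)   # second quadrant, N = 4
back = to_cartesian(p2, m2, phi2, 4)
```

With N = 4 each segment is a quadrant, so the first vector falls in segment 0 with φ = π/4 and the second in segment 1; `back` reproduces (−1.0, 0.1) up to rounding.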
Thus, the integrals in Eq. (3) may be written as
$$ c(x, A_j) = C\sum_q\sum_n\int_0^{\infty}\int_0^{2\pi/N} \overline{\hat f_{2q}(p,\phi)}\,\hat f_{1n}(p,\phi)\, U_{nq}(g;p,\phi)\, p\, dp\, d\phi, \tag{12} $$
where C = 1/N.
We observe that the convolution-like integrals may be com-
puted by taking the Fourier transform, computing the product
of transforms, and taking the inverse Fourier transform on the
discrete motion group.
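The transform, multiply, inverse-transform pattern is easiest to see in the abelian special case. The sketch below is not the discrete motion group computation: it assumes 1D signals on the cyclic translation group Z_n and uses a naive O(n²) discrete Fourier transform written out by hand (an FFT would replace `dft` and `idft` in practice). All names are hypothetical.

```python
import cmath

def dft(x):
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[j] * cmath.exp(2j * cmath.pi * j * k / n) for j in range(n)) / n
            for k in range(n)]

def correlate_direct(f1, f2):
    """c[a] = sum_x f1[x] * f2[x - a], indices taken modulo n."""
    n = len(f1)
    return [sum(f1[x] * f2[(x - a) % n] for x in range(n)) for a in range(n)]

def correlate_fourier(f1, f2):
    """Same correlation via: transform, multiply (with conjugate), inverse transform."""
    F1, F2 = dft(f1), dft(f2)
    product = [a * b.conjugate() for a, b in zip(F1, F2)]
    return [c.real for c in idft(product)]

image = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
template = [1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]

direct = correlate_direct(image, template)
fourier = correlate_fourier(image, template)
best_shift = max(range(len(direct)), key=lambda a: direct[a])
```

Both routes give the same values up to rounding, and the peak sits at the shift that aligns the template with the pattern in the signal. On the discrete motion group the scalar product of transforms becomes the matrix (column–row) product described in the text.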
5. INVARIANTS OF THE DISCRETE MOTION GROUP
Let us assume that one wants to compute properties of the
image (object) which are invariant with respect to translations
and rotations of the image. The Fourier transform on the discrete
motion group provides a very efficient tool to compute these
invariants. Let us construct a function with values in R+,
$$ \eta(p;\phi) = \sum_{m=0}^{N-1}\left[\overline{\hat f_m(p;\phi)}\,\hat f_m(p;\phi)\right], \tag{13} $$
for each fixed $\phi = 0, \ldots, N_\phi - 1$, where $\hat f_m(p;\phi)$ is the Fourier
transform on the discrete motion group of f(x). Then (13) is
invariant with respect to rotations and translations of f(x); i.e.,
η(p; φ) does not change if we compute (13) using the Fourier
transform on the motion group for $f'(x) = f(R^{-1}(x-a))$.
We note that for orientation-independent functions (i.e., for
functions on R²) the Fourier transform elements $\hat f_m$ may be
arranged as a matrix which has the same matrix elements in the
same column,
$$ \hat f_{qm} = \hat f_{rm} = \hat f_m. $$
Then (13) may be written also as a trace
$$ \eta(p;\phi) = \mathrm{Tr}\left[\hat f^{\dagger}(p;\phi)\,\hat f(p;\phi)\right], \tag{14} $$
where $\hat f^{\dagger}(p;\phi)$ is the Hermitian conjugate matrix.
According to (8), $\hat f_{qm}$ for f(x) may be written as
$$ \hat f_{qm}(p;\phi) = \int_{G_N} f(h)\, U^{-1}_{qm}(h;p,\phi)\, d\mu(h), $$
where the integral over $G_N$ denotes integration with respect to
x and summation through the elements of $C_N$, and f(h) = f(x).
The function $f'(x) = f(R^{-1}(x-a))$ may be formally written
as $f(g^{-1}\circ h)$, where $g = (a, R) \in G_N$. Then $\hat f'_{qm}$ is written
as
$$ \hat f'_{qm}(p;\phi) = \int_{G_N} f(g^{-1}\circ h)\, U^{-1}_{qm}(h;p,\phi)\, d\mu(h). $$
Using the invariance of the integration measure we write this
integral as
$$ \hat f'_{qm}(p;\phi) = \int_{G_N} f(h')\, U^{-1}_{qm}(g\circ h';p,\phi)\, d\mu(h'). $$
Using the homomorphism properties of U we write it as
$$ \hat f'_{qm}(p;\phi) = \left[\int_{G_N} f(h')\, U^{-1}_{qr}(h';p,\phi)\, d\mu(h')\right]\cdot U^{-1}_{rm}(g;p,\phi) = \hat f_{qr}(p,\phi)\, U^{\dagger}_{rm}(g;p,\phi), $$
where we have used a unitarity property of U. Thus, the Fourier
matrix is transformed under rotations and translations $g \in G_N$
as
$$ \hat f'(p,\phi) = \hat f(p,\phi)\, U^{\dagger}(g;p,\phi). $$
Using the cyclic property of Tr and unitarity of U it is clear
that
$$
\begin{aligned}
\mathrm{Tr}\left[(\hat f')^{\dagger}(p,\phi)\,\hat f'(p,\phi)\right]
&= \mathrm{Tr}\left[U(g;p,\phi)\,\hat f^{\dagger}(p,\phi)\,\hat f(p,\phi)\,U^{\dagger}(g;p,\phi)\right] \\
&= \mathrm{Tr}\left[\hat f^{\dagger}(p,\phi)\,\hat f(p,\phi)\right],
\end{aligned}
$$
which proves the invariance of (13). We note that the invariant,
written in the form (14), is valid also for orientation-dependent
functions (i.e. for general functions on the discrete motion group).
The use of invariants for pattern recognition was suggested in
[11].
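The invariance of (13) can be checked numerically on a toy image. In the sketch below the image is modeled, purely as an assumption for illustration, as a handful of weighted point masses, so that the Fourier elements can be summed exactly without any FFT or interpolation. The values of N, p, φ and the chosen motion are arbitrary, and all names are hypothetical.

```python
import cmath
import math

def fourier_row(points, p, phi, N):
    """f_hat_m(p, phi), m = 0..N-1, for an image of point masses (x, y, weight)."""
    row = []
    for m in range(N):
        t = phi + 2.0 * math.pi * m / N
        ux, uy = math.cos(t), math.sin(t)          # unit vector u_m^phi
        row.append(sum(w * cmath.exp(-1j * p * (ux * x + uy * y)) for x, y, w in points))
    return row

def invariant(points, p, phi, N):
    """eta(p; phi) of Eq. (13): sum over m of |f_hat_m|^2."""
    return sum(abs(c) ** 2 for c in fourier_row(points, p, phi, N))

def move(points, j, a, N):
    """Rotate by A_j (angle 2*pi*j/N), then translate by a."""
    t = 2.0 * math.pi * j / N
    c, s = math.cos(t), math.sin(t)
    return [(c * x - s * y + a[0], s * x + c * y + a[1], w) for x, y, w in points]

N = 12
image = [(0.0, 0.0, 1.0), (1.0, 0.5, 2.0), (-0.7, 1.3, 0.5), (2.0, -1.0, 1.5)]
moved = move(image, 5, (3.0, -2.0), N)

eta = invariant(image, 1.7, 0.1, N)
eta_moved = invariant(moved, 1.7, 0.1, N)
```

The two values agree to rounding error: the translation only multiplies each element by a unit-modulus phase, and the rotation by a multiple of 2π/N only permutes the index m.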
6. EFFICIENT CALCULATION OF CONVOLUTION-
LIKE INTEGRALS USING THE FOURIER METHOD
As we mentioned before, the direct integration of (3) requires
$O(N_r^2 N)$ computations for $C_N$, where $N_r$ is the number
of sampling points in an R² region.
Using the Fourier transform on the discrete motion group we
have to compute direct Fourier transforms for image and template,
compute the matrix product (in our case it is a column–row
product) of the Fourier transform, which describes the Fourier
transform of the convolved functions, and then calculate the
inverse Fourier transform.
The calculation of the direct Fourier transform and the “matrix”
(column–row) product is a fast computation. The direct Fourier
transform for $f_{1,2}(x)$ may be computed using a usual
two-dimensional FFT [18] in $O(N_r \log N_r)$ computations. The FFT
gives, however, values of Fourier elements computed on the
Cartesian square (rectangular) grid of $\mathbf{p}$ values. To obtain the
Fourier transform elements $\hat f_m(p,\phi)$ on the discrete motion
group we have to interpolate values on the Cartesian grid to
a polar coordinate grid: the $p$ value is the magnitude of $\mathbf{p}$, and the
$m$ and $\phi$ indices are determined by the angular part of $\mathbf{p}$ (thus,
the constraint $N_p N_\phi N \approx N_r$ may be used). The linear
interpolation requires $O(N_r)$ computations. The product of the Fourier
column $\hat f^T_m(p,\phi)$ and row $\hat f_n(p,\phi)$, which gives the Fourier
matrix $\hat F_{mn}(p,\phi)$, may be performed in $O(N^2 N_p N_\phi) = O(N N_r)$
computations. We note that the trace in invariants (13) may be
computed in $O(N N_p N_\phi) = O(N_r)$ (for all φ-values).
Thus, the direct Fourier transform and the “matrix” product
may be computed in $O(N_r \log N_r) + O(N N_r)$ computations.
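The Cartesian-to-polar step can be pictured with a minimal bilinear interpolation routine. This is a sketch under assumptions, not the paper's implementation: the grid is a list of rows indexed as `grid[row][col]`, positions are given in pixel units, the frequency origin is assumed to sit at the hypothetical position `center`, and points outside the grid are clamped to its edge. The same arithmetic works when the grid holds complex Fourier values.

```python
import math

def bilinear(grid, x, y):
    """Interpolate grid[row][col] at the fractional position (x = col, y = row)."""
    rows, cols = len(grid), len(grid[0])
    x = min(max(x, 0.0), cols - 1.0)               # clamp to the grid
    y = min(max(y, 0.0), rows - 1.0)
    x0, y0 = min(int(x), cols - 2), min(int(y), rows - 2)
    tx, ty = x - x0, y - y0
    top = (1 - tx) * grid[y0][x0] + tx * grid[y0][x0 + 1]
    bottom = (1 - tx) * grid[y0 + 1][x0] + tx * grid[y0 + 1][x0 + 1]
    return (1 - ty) * top + ty * bottom

def sample_polar(grid, p, m, phi, N, center):
    """Value at the polar grid point (p, m, phi), read off the Cartesian grid."""
    theta = phi + 2.0 * math.pi * m / N
    return bilinear(grid, center[0] + p * math.cos(theta), center[1] + p * math.sin(theta))

# A linear test function 2*col + 3*row is reproduced exactly by bilinear interpolation.
grid = [[2.0 * c + 3.0 * r for c in range(5)] for r in range(5)]
value = bilinear(grid, 1.25, 2.5)
polar_value = sample_polar(grid, 1.0, 1, 0.0, 4, (2.0, 2.0))
```

Each polar sample costs a fixed number of operations, which is why the interpolation as a whole scales with the number of grid points.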
The inverse Fourier transform calculation is a slower computation.
One element from each row and column of $\hat F_{mn}(p,\phi)$ is
used in computation of the inverse Fourier transform for each
rotation element $A_i$. First, we interpolate the value of the Fourier
transform on the square grid $N_r \times N_r$ of $\mathbf{p}$ to polar coordinates.
The radial coordinate is $p = |\mathbf{p}|$, the polar angle is determined
by the values of m and φ (the value of n is determined by m
and the index of rotation i, n = m + i, thus we take $\hat F_{m,m+i}$
elements from the Fourier matrix to compute the inverse transform
for fixed orientation $A_i$). After inverse interpolation to Cartesian
coordinates (which may be done in $O(N N_r)$ computations), the
inverse Fourier integration may be performed in $O(N_r \log(N_r))$
for each of the N nonzero matrix elements of U using the FFT.
Thus, in $O(N N_r \log(N_r))$ computations we reproduce the function
for all $A_i$. We note that the inverse Fourier transform computation
is O(N) (or $O(\log N_r)$, depending which is larger) times
more time-consuming, because we reproduce a function on the
discrete motion group, rather than a function on R².
FIG. 1. The image: 256 × 256 array of grey values.
Thus, the total required is $O(N N_r \log(N_r))$ computations, and
these computations are, for the most part, calculations of the
inverse Fourier transform. We also note that we have to perform
calculations twice, to compute convolution-like integrals in the
denominator and numerator of (3).
The total order of computations using classical Fourier anal-
ysis is of the same order as in our implementation, but the
Fourier transform on the motion group has additional nice prop-
erties. For example, it allows one to neatly write integrals in
the correlation function as matrix products in Fourier space.
It also gives an efficient way to construct image invariants.
These invariants may be used in image processing problems
(see Section 7.2 for numerical examples). Of course it can be
argued that the classical (scalar) Fourier transforms of rotated
versions of images can be arranged to form a Fourier matrix like
ours. If this arrangement is performed, then knowingly or not,
one is calculating the Fourier transform on the discrete motion
group.
FIG. 2. The template pattern. The arrow is used as a reference to find the position and orientation of this pattern in the image.
7. NUMERICAL EXAMPLES
7.1. Correlation Method, Including Rotations and Translations
In this section we compute the correlation function, Eq. (3)
(for dilation k = 1) for some practical examples. We compute
most examples for $N_r = 256 \times 256$ and N = 60 ($C_{60}$ group),
although the computing time for other arrays is also reported.
We consider the image depicted in Fig. 1. This is a 256 × 256
array of grey values (256 grey levels of intensity for each pixel).
We choose a template, depicted in Fig. 2, which is a rotated
(by the angle θ = −π/3) and translated pattern taken from the
image. The arrow shown on the picture is used as a reference
arrow to find the position and orientation of this template in the
image. The correlation function for the θ = π/3 angle
is depicted in Fig. 3. The highest value of the correlation function
is at the original position and orientation of the pattern in the
image. We also find positions and orientations of local maxima
in each of m × m subregions of the original image. For m = 8 the
positions and orientations of local maxima in each of the subregions
are shown in Fig. 4. The highest value is depicted by the arrow,
which is rotated and translated from the arrow in Fig. 2. Other
local maxima (with a value of correlation which is greater than
0.85) are depicted by a white square; a small line attached to the
square shows the orientation.
FIG. 3. The correlation function depicted for the θ = π/3 orientation angle.
We note that the precise values of correlation at the locations of
64 maxima may be found by direct integration, and the Fourier
method may be used as a fast filter method to find locations
of these maxima. It is especially important to compute precise
values in the case when the template object does not match
exactly the pattern in the image.
In the table below we listed the computing time of the method
(given in minutes and seconds, on a 250-MHz SGI workstation),
implemented in the C programming language. N is listed along
the horizontal; the right column lists the time to compute the
correlation coefficients at 64 maxima using direct integration.
The $N_r$ array size is given along the vertical.

                    N = 60    N = 30    N = 10    Dir. int.
  N_r = 256 × 256    4:06      2:11      0:48      0:55
  N_r = 128 × 128    1:12      0:36      0:16      0:13
  N_r = 64 × 64      0:25      0:12      0:04      0:03
FIG. 4. The image. Positions and orientations of the absolute maximum (shown by the arrow) and local maxima of the correlation function are depicted.
7.2. Using the Invariants on the Motion Group
to Compare Images
As we have shown before, function (13) is the same for images
which are rotated and translated relative to each other. It may
be used to compare images and determine if they are identical
(similar) or not. Again, we compute the correlation coefficient
of invariant functions $f_{1,2}(p,\phi)$, computed for two different
images,
$$ \eta(\phi) = \frac{\int_0^{\infty} f_1(p,\phi)\, f_2(p,\phi)\, p\, dp}{\left(\int_0^{\infty} f_1(p,\phi)^2\, p\, dp\right)^{1/2}\left(\int_0^{\infty} f_2(p,\phi)^2\, p\, dp\right)^{1/2}}. $$
FIG. 5. An image.
This is a fast computation of the order $O(N_p N_\phi) \approx O(N_r/N)$
(to compute $N_\phi$ coefficients) and it may be done using the
usual integration techniques. As we mentioned before, the direct
Fourier transform may be computed in $O(N_r \log(N_r))$ computations;
computation of the sum in (13) may be done in $O(N_r)$
computations.
We compare the images depicted in Fig. 5 and Fig. 6, which
are just rotated and translated relative to each other. We use
the value $\nu = (1.0 - \eta) \times 10^3$ to compare images, which is more
convenient to use for η values which are close to 1.0. The greater
ν is, the worse the correlation is. In the table below we show ν
for φ = 0, ..., 5.

  φ     0       1       2       3       4       5
  ν   0.016   0.017   0.017   0.017   0.016   0.016

We see that values are very close for different φ; thus we may
use ν for any one of the φ values. The values are small which
indicates very strong correlation.
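The comparison measure ν can be sketched for sampled invariant curves. The example below assumes, for illustration only, that the two invariant functions are available as lists of values on a common grid of p values and that the radial integrals are approximated with the trapezoidal rule; the test curves are synthetic exponentials, not data from the paper, and all names are hypothetical.

```python
import math

def radial_correlation(f1, f2, p_values):
    """eta = <f1, f2> / (|f1| |f2|) with the weight p dp, via the trapezoidal rule."""
    def integral(values):
        total = 0.0
        for k in range(len(p_values) - 1):
            dp = p_values[k + 1] - p_values[k]
            total += 0.5 * dp * (values[k] * p_values[k] + values[k + 1] * p_values[k + 1])
        return total
    num = integral([a * b for a, b in zip(f1, f2)])
    den = math.sqrt(integral([a * a for a in f1]) * integral([b * b for b in f2]))
    return num / den

def nu(eta):
    """nu = (1 - eta) * 10^3, convenient when eta is close to 1."""
    return (1.0 - eta) * 1e3

p_values = [0.1 * k for k in range(51)]
curve_a = [math.exp(-p) for p in p_values]
curve_scaled = [3.0 * v for v in curve_a]             # same shape, different intensity
curve_b = [math.exp(-1.1 * p) for p in p_values]      # slightly different shape

nu_same = nu(radial_correlation(curve_a, curve_scaled, p_values))
nu_diff = nu(radial_correlation(curve_a, curve_b, p_values))
```

A rescaled copy of the same curve gives ν equal to zero up to rounding, while the curve with a different decay rate gives a small positive ν.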
If we compare Fig. 6 with a not-quite-the-same rotated and
translated image (taken from an image after application of direct
and inverse Fourier transform), we get a value of ν = 0.107
which indicates a moderate correlation. The time to compute
direct Fourier transforms and correlations ν was around 3 s on
a 250 MHz workstation.
FIG. 6. A rotated and translated version of the image in Fig. 5.
8. CONCLUSIONS
In this paper we use techniques from noncommutative har-
monic analysis to formulate problems in template matching and
construction of image invariants. The main contribution is to
illustrate that problems in image understanding can be cleanly
formulated using mathematical techniques which are not stan-
dard tools in the community. Numerical examples are provided
to demonstrate the techniques.
APPENDIX
In this appendix we review the essentials of noncommutative
harmonic analysis which is the generalization of Fourier analysis
to functions on groups. Much of the review material presented
here may be found in [13, 14, 16, 25].
Recall that the Fourier transform pair for a suitable scalar
function, f(x), for x ∈ R is defined as
$$ \hat f(p) = \int_{-\infty}^{\infty} f(x)\, u(-x,p)\, dx, \qquad f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \hat f(p)\, u(x,p)\, dp, \tag{A.1} $$
where $u(x,p) = e^{ipx}$. Note that $u(x+y,p) = e^{ip(x+y)} = e^{ipx}\cdot e^{ipy} = u(x,p)\,u(y,p)$.
This is an example of a group homomorphism.
In general, a homomorphism is a mapping between two
groups $h: (G,\circ) \to (H,\hat\circ)$ such that $h(g_1\circ g_2) = h(g_1)\,\hat\circ\, h(g_2)$.
In particular, the function u(·, p) maps (R, +) → (U, ·) for each
p ∈ R, where U is the set of complex numbers with unit modulus,
and · is scalar multiplication.
The convolution theorem for functions on the real line states
that $\widehat{(f_1 * f_2)}(p) = \hat f_1(p)\,\hat f_2(p)$. This is a direct result of the
facts that
$$ u(-(x+y),p) = u(-x,p)\,u(-y,p) $$
and integration on the real line is translation invariant.
Noncommutative harmonic analysis extends the concept of
Fourier transform and convolution to functions on groups. At the
core of this area of mathematics is the enumeration of functions
analogous to u(x,p). Unlike the Abelian case where such func-
tions are scalars, in the noncommutative context these functions
are matrices called irreducible unitary representations (IURs).
A representation of a group G is a homomorphism T: G →
T(G) ⊂ GL(V). V is a vector space called the representation
space, and GL(V) is the group of all invertible linear transformations
of V onto itself. T(g) for g ∈ G is expressed in a given
basis of V as an invertible matrix, and
$$ T(g_1\circ g_2) = T(g_1)\,T(g_2), \qquad T(g^{-1}) = T^{-1}(g), \qquad T(e) = 1 \in GL(V). $$
Representations that can be expressed as unitary matrices ($U^{-1} = U^{\dagger}$)
in an orthonormal basis of V are called unitary representations.
Irreducible representations are those which cannot be
block-diagonalized. I.e., they are the “smallest” representations
and cannot be further reduced. The function u(x, p) is an example
of a one-dimensional (and hence irreducible) unitary representation.
For a general unimodular group (i.e., a group which possesses
a left and right invariant volume measure), the Fourier transform
of a suitable function f(g) is defined as
ˆ
f(ρ)=F(f(g)) =ZGf(g)U(g−1;ρ)dµ(g),
where dµ(g) is a left–right invariant volume measure on Gand
ρis a dual (frequency-like) parameter which enumerates all
the different IURs of the group. The parameter ρcould be a
scalar, vector, or other quantity, depending on the group under
consideration. The inverse Fourier transform
f(g)=Zˆ
Gtrace( ˆ
f(ρ)U(g;ρ)) dν(ρ)
reconstructsthe function from its spectrum(collectionof Fourier
transforms), where dν(ρ) is an appropriately defined measure
on the dual space of the group, ˆ
G.4
Because of the homomorphism property $U(g_1 \circ g_2;\rho) = U(g_1;\rho)\,U(g_2;\rho)$, the convolution theorem
$$\mathcal{F}(f_1 * f_2) = \hat{f}_2(\rho)\,\hat{f}_1(\rho)$$
holds, where convolution is defined as [13, 16, 14]
$$(f_1 * f_2)(g) = \int_G f_1(h)\, f_2(h^{-1} \circ g)\, d\mu(h).$$
The problem then becomes the enumeration of all inequivalent IURs for a given group.
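The order of the factors in the convolution theorem matters on a noncommutative group. The sketch below illustrates this on a small finite stand-in, the permutation group $S_3$ with counting measure and its 3×3 permutation matrices as a unitary (though reducible) representation. The theorem relies only on the homomorphism property, so irreducibility is not needed for the check. The group, the test functions, and all helper names are assumptions chosen for illustration.

```python
from itertools import permutations

G = list(permutations(range(3)))  # the six elements of S3

def compose(g, h):
    """(g o h)(i) = g(h(i))."""
    return tuple(g[h[i]] for i in range(3))

def inverse(g):
    inv = [0] * 3
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

def U(g):
    """Permutation matrix with U(g) e_j = e_{g(j)}, so U(g o h) = U(g) U(h)."""
    return [[1.0 if i == g[j] else 0.0 for j in range(3)] for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

def fourier(f):
    """hat f = sum over g of f(g) U(g^{-1})."""
    F = [[0.0] * 3 for _ in range(3)]
    for g in G:
        Ug = U(inverse(g))
        for i in range(3):
            for j in range(3):
                F[i][j] += f[g] * Ug[i][j]
    return F

def convolve(f1, f2):
    """(f1 * f2)(g) = sum over h of f1(h) f2(h^{-1} o g)."""
    return {g: sum(f1[h] * f2[compose(inverse(h), g)] for h in G) for g in G}

# Two arbitrary test functions on the group.
f1 = {g: float(k + 1) for k, g in enumerate(G)}
f2 = {g: float((3 * k) % 5) - 1.0 for k, g in enumerate(G)}
err = max_abs_diff(fourier(convolve(f1, f2)),
                   matmul(fourier(f2), fourier(f1)))

# Delta functions at two non-commuting transpositions expose the ordering.
a, b = (1, 0, 2), (0, 2, 1)
d1 = {g: 1.0 if g == a else 0.0 for g in G}
d2 = {g: 1.0 if g == b else 0.0 for g in G}
order_gap = max_abs_diff(fourier(convolve(d1, d2)),
                         matmul(fourier(d1), fourier(d2)))
print(err, order_gap)
```

Here `err` compares the transform of the convolution with the product taken in the order $\hat{f}_2 \hat{f}_1$, while `order_gap` shows that the opposite order fails for functions concentrated on non-commuting elements.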
A.1. Unitary Representations of SE(2)
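A unitary representation of SE(2) is defined by the unitary operator
$$U(g;p)\,\tilde{f}(\mathbf{x}) = e^{ip(\mathbf{b}\cdot\mathbf{x})}\,\tilde{f}(R^T\mathbf{x}) \tag{A.2}$$
for each $g = (R,\mathbf{b}) \in SE(2)$. The form of this operator results from the semi-direct product structure of the group SE(2); $e^{z}$ is the scalar exponential function, $\rho = p \in \mathbb{R}^{+}$, $i = \sqrt{-1}$, and $\mathbf{x}\cdot\mathbf{y} = x_1y_1 + x_2y_2$. The vector $\mathbf{x}$ is a unit vector ($\mathbf{x}\cdot\mathbf{x} = 1$), and $\tilde{f}(\cdot) \in L^2(S^1)$, where $S^1$ is the unit circle.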
Since only one angle is required to parameterize a vector on the unit circle, $\mathbf{x} = (\cos\psi, \sin\psi)^T$, and $\tilde{f}(\mathbf{x}) = \tilde{f}(\cos\psi, \sin\psi) \equiv f(\psi)$. Henceforth we will not distinguish between $\tilde{f}$ and $f$.
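By definition, group representations observe the homomorphism property, which in this case is seen as
$$\begin{aligned}
U(g_1;p)\,U(g_2;p)\,f(\mathbf{x}) &= U(g_1;p)\big(U(g_2;p)\,f(\mathbf{x})\big) \\
&= U(g_1;p)\big(e^{ip(\mathbf{b}_2\cdot\mathbf{x})}\,f(R_2^T\mathbf{x})\big) \\
&= e^{ip(\mathbf{b}_1\cdot\mathbf{x})}\,e^{ip(\mathbf{b}_2\cdot R_1^T\mathbf{x})}\,f(R_2^T R_1^T\mathbf{x}) \\
&= e^{ip(\mathbf{b}_1 + R_1\mathbf{b}_2)\cdot\mathbf{x}}\,f((R_1R_2)^T\mathbf{x}) \\
&= U(g_1\circ g_2;p)\,f(\mathbf{x}).
\end{aligned}$$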
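The derivation above can be spot-checked numerically. The following minimal sketch applies the operator of (A.2) to a test function on the circle, written as a function of the angle $\psi$ so that $f(R^T\mathbf{x})$ becomes $f(\psi - \theta)$, and compares the two sides of the homomorphism property at sample angles. The test function, the parameter values, and the names `apply_U` and `compose` are assumptions made for illustration.

```python
import cmath
import math

def apply_U(g, p, f):
    """Return the function (U(g;p) f)(psi) = exp(i p b.x(psi)) f(psi - theta)."""
    theta, bx, by = g

    def Uf(psi):
        phase = cmath.exp(1j * p * (bx * math.cos(psi) + by * math.sin(psi)))
        return phase * f(psi - theta)

    return Uf

def compose(g1, g2):
    """Group law of SE(2): g1 o g2 = (R1 R2, b1 + R1 b2)."""
    t1, x1, y1 = g1
    t2, x2, y2 = g2
    c, s = math.cos(t1), math.sin(t1)
    return (t1 + t2, x1 + c * x2 - s * y2, y1 + s * x2 + c * y2)

def f(psi):
    """Arbitrary smooth test function on the circle."""
    return cmath.exp(2j * psi) + 0.3 * math.cos(5.0 * psi)

p = 1.7
g1 = (0.7, 1.0, -2.0)
g2 = (-1.3, 0.5, 0.25)

lhs = apply_U(g1, p, apply_U(g2, p, f))   # U(g1;p) U(g2;p) f
rhs = apply_U(compose(g1, g2), p, f)      # U(g1 o g2;p) f

angles = [2.0 * math.pi * k / 50 for k in range(50)]
hom_err = max(abs(lhs(psi) - rhs(psi)) for psi in angles)
print(hom_err)
```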
Any function $f(\psi) \in L^2(S^1)$ can be expressed as a weighted sum of orthonormal basis functions as $f(\psi) = \sum_n a_n e^{in\psi}$. Likewise, the matrix elements of the operator $U(g;p)$ are expressed in this basis as
$$U_{mn}(g,p) = \big(e^{im\psi},\, U(g;p)\,e^{in\psi}\big) \qquad \forall\, m,n \in \mathbb{Z},$$
where the inner product $(\cdot,\cdot)$ is defined as
$$(f_1,f_2) = \frac{1}{2\pi}\int_0^{2\pi} \overline{f_1(\psi)}\, f_2(\psi)\, d\psi.$$
It is easy to see that $(U(g;p)f_1, U(g;p)f_2) = (f_1,f_2)$, and that $U(g;p)$ is therefore unitary with respect to this inner product. The matrix with elements $U_{mn}$ is “infinite dimensional.” Furthermore, the matrix of a unitary operator expressed in an orthonormal basis is a unitary matrix, which means $U^{-1}_{nm} = \overline{u_{mn}}$.

$^{4}$ It is worth noting that the measures $d\mu$ and $d\nu$ have not been determined for all unimodular groups, although they are well known for both the Euclidean group and the discrete motion group, which is all that is important in the context of this paper.
A number of works including [22] have shown that the matrix elements of this representation are given by
$$u_{mn}(g(r,\phi,\theta),p) = i^{\,n-m}\, e^{-i[n\theta + (m-n)\phi]}\, J_{n-m}(pr), \tag{A.3}$$
where $J_\nu(x)$ is the $\nu$th order Bessel function and $g(r,\phi,\theta)$ is an element of SE(2), where the translational part is parameterized in polar coordinates $(r,\phi)$.
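Expression (A.3) can be compared against a direct numerical evaluation of the defining inner product. The sketch below assumes the inner product is conjugate-linear in its first argument (so that the functions $e^{in\psi}$ are orthonormal), evaluates the integral with a uniform quadrature rule, and computes the integer-order Bessel function from its power series. The parameter values and the helper names are illustrative assumptions.

```python
import cmath
import math

def bessel_j(k, z, terms=40):
    """Integer-order Bessel function J_k(z) from its power series."""
    if k < 0:
        return (-1) ** (-k) * bessel_j(-k, z, terms)
    return sum((-1) ** j / (math.factorial(j) * math.factorial(j + k))
               * (z / 2.0) ** (2 * j + k) for j in range(terms))

def u_closed(m, n, r, phi, theta, p):
    """Closed form (A.3)."""
    return ((1j) ** (n - m)
            * cmath.exp(-1j * (n * theta + (m - n) * phi))
            * bessel_j(n - m, p * r))

def u_quadrature(m, n, r, phi, theta, p, samples=256):
    """(e^{im psi}, U(g;p) e^{in psi}) by a uniform rule on [0, 2 pi)."""
    total = 0j
    for k in range(samples):
        psi = 2.0 * math.pi * k / samples
        conj_basis_m = cmath.exp(-1j * m * psi)                # conjugated e^{im psi}
        phase = cmath.exp(1j * p * r * math.cos(psi - phi))    # e^{ip b.x}
        shifted_basis_n = cmath.exp(1j * n * (psi - theta))    # e^{in psi} at R^T x
        total += conj_basis_m * phase * shifted_basis_n
    return total / samples  # the factor 1/(2 pi) cancels the step 2 pi / samples

r, phi, theta, p = 1.5, 0.4, 1.1, 2.0
max_err = max(abs(u_closed(m, n, r, phi, theta, p)
                  - u_quadrature(m, n, r, phi, theta, p))
              for m in range(-2, 3) for n in range(-2, 3))
print(max_err)
```

The uniform rule is used because the integrand is smooth and periodic, for which it converges very quickly.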
A.2. The Discrete Motion Group
In order to get the IURs of the discrete motion group we restrict possible orientation angles to the values from $C_N$ and choose the appropriate pulse orthonormal basis functions to compute the representation matrices using property (A.2).
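We choose a pulse orthonormal basis $f_{N,n}(\mathbf{u})$ on $S$; i.e., we subdivide the circle into identical segments $F_n$ and choose the $f$-functions to satisfy the orthonormality relations
$$\frac{1}{2\pi}\int_S f_{N,n}(\mathbf{u})\, f_{N,m}(\mathbf{u})\, d\theta = \delta_{nm}.$$
We choose the orthonormal functions as
$$f_{N,n}(\mathbf{u}) = \begin{cases} N^{1/2} & \text{if } \mathbf{u} \in F_n, \\ 0 & \text{otherwise;} \end{cases}$$
$n = 0,\ldots,N-1$ enumerates different segments. We denote these pulse functions as $\delta$-like functions $f_{N,n}(\mathbf{u}) = (1/N)^{1/2}\,\delta_N(\mathbf{u},\mathbf{u}_n)$, where $\mathbf{u}_n$ is the vector to the center of the $F_n$ segment.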
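The matrix elements in this basis are
$$U_{mn}(A,\mathbf{r};p) = \frac{1}{2\pi}\int_S f_{N,m}(\mathbf{u})\, e^{ip\,\mathbf{u}\cdot\mathbf{r}}\, f_{N,n}(A^{-1}\mathbf{u})\, d\theta. \tag{A.4}$$
It may be shown that this integral may be approximated as
$$U_{mn}(A_j,\mathbf{r};p) = e^{ip\,\mathbf{u}_m\cdot\mathbf{r}}\,\delta_{A_j^{-1}\mathbf{u}_m,\mathbf{u}_n}, \tag{A.5}$$
where $\delta_{A_j^{-1}\mathbf{u}_m,\mathbf{u}_n} = \delta_{m-j,n}$ in this case.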
The matrix elements (A.5) are exact expressions for the matrix elements of the unitary representations of the discrete motion group. The set of matrix elements (A.5) is, however, incomplete. This means that the direct and inverse Fourier transforms, defined using these matrix elements, would reproduce the original function with $O(1/N)$ error; i.e.,
$$\mathcal{F}^{-1}\big(\mathcal{F}(f(A_i,\mathbf{r}))\big) = f(A_i,\mathbf{r})\,(1 + O(1/N)).$$
The reason for this is that summing through all possible seg-
ments cannot replace integration over all possible angles on the
circle. It is also clear that the additional continuous parameter
which enumerates possible angles inside each segment on the
circle must give the complete set of the matrix elements.
Thus, the matrix elements must be modified as
$$U_{mn}(A_j,\mathbf{r};p,\phi) = e^{ip\,\mathbf{u}^{\phi}_m\cdot\mathbf{r}}\,\delta_{A_j^{-1}\mathbf{u}_m,\mathbf{u}_n}, \tag{A.6}$$
where $\mathbf{u}^{\phi}_k$ denotes the vector to the angle $\theta = \phi + 2\pi k/N$ on the unit circle on the interval $[2\pi k/N,\, 2\pi(k+1)/N]$, $k = 0,\ldots,N-1$ ($\phi$ measures the angle on this segment). The vectors $\mathbf{u}^{\phi}_m$ are illustrated in Fig. 7. The Fourier parameter in this case is the pair $\rho = (p,\phi)$.
This is expression (7) in the text.
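The matrices defined by (A.6) are small enough to build explicitly. The sketch below constructs them for one choice of $N$, $p$, and $\phi$ and checks numerically that they are unitary and satisfy the homomorphism property under the group law $(A_j,\mathbf{r}_1)\circ(A_k,\mathbf{r}_2) = (A_{j+k},\, \mathbf{r}_1 + A_j\mathbf{r}_2)$. It assumes that $A_j$ is the rotation by $2\pi j/N$ and that the index difference in $\delta_{m-j,n}$ is taken modulo $N$. All names and parameter values are illustrative.

```python
import cmath
import math

N = 8  # number of allowed orientations (illustrative)

def u_vec(k, phi):
    """Unit vector u_k^phi at angle phi + 2 pi k / N."""
    ang = phi + 2.0 * math.pi * (k % N) / N
    return (math.cos(ang), math.sin(ang))

def rotate(j, r):
    """A_j r: rotation of r by 2 pi j / N (assumed meaning of A_j)."""
    c, s = math.cos(2.0 * math.pi * j / N), math.sin(2.0 * math.pi * j / N)
    return (c * r[0] - s * r[1], s * r[0] + c * r[1])

def U(j, r, p, phi):
    """N x N matrix with entries exp(i p u_m^phi . r) delta_{(m-j) mod N, n}."""
    M = [[0j] * N for _ in range(N)]
    for m in range(N):
        um = u_vec(m, phi)
        M[m][(m - j) % N] = cmath.exp(1j * p * (um[0] * r[0] + um[1] * r[1]))
    return M

def compose(g1, g2):
    """(A_j, r1) o (A_k, r2) = (A_{j+k}, r1 + A_j r2)."""
    (j1, r1), (j2, r2) = g1, g2
    a = rotate(j1, r2)
    return ((j1 + j2) % N, (r1[0] + a[0], r1[1] + a[1]))

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(N)] for i in range(N)]

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(N) for j in range(N))

p, phi = 2.5, 0.3
g1 = (3, (1.0, -2.0))
g2 = (6, (0.5, 0.25))
g12 = compose(g1, g2)

U1, U2, U12 = U(g1[0], g1[1], p, phi), U(g2[0], g2[1], p, phi), U(g12[0], g12[1], p, phi)
identity = [[1.0 + 0j if i == j else 0j for j in range(N)] for i in range(N)]

hom_err = max_abs_diff(matmul(U1, U2), U12)               # U(g1) U(g2) = U(g1 o g2)
unit_err = max_abs_diff(matmul(U1, dagger(U1)), identity)  # U U^dagger = I
print(hom_err, unit_err)
```

Each matrix has exactly one nonzero entry of unit modulus per row and per column, which is why a Fourier transform built from these elements can be evaluated cheaply.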
FIG. 7. Illustration for vectors $\mathbf{u}^{\phi}_m$ in the matrix elements of IURs of the discrete motion group.
REFERENCES
1. B. Jahne, Spatio-Temporal Image Processing: Theory and Scientific Appli-
cations, Springer-Verlag, Berlin, 1993.
2. G. L. Turin, An introduction to matched filters, IRE Trans. Inf. Theory IT-6,
1960, 311–329.
3. M. A. Karim and A. A. S. Awwal, Optical Computing: An Introduction,
Wiley, New York, 1992.
4. H. H. Arsenault, Y. N. Hsu, and K. Chalasinska-Macukow, Rotation-invariant
pattern recognition, Opt. Eng. 23, 1984, 705–709.
5. A. B. Bhatia and E. Wolf, On the circle polynomials of Zernike and related
orthogonal sets, Proc. Cambridge Philos. Soc. 50, 1954, 40–48.
6. Y. S. Abu-Mostafa and D. Psaltis, Recognition aspects of moment invari-
ants, IEEE Trans. Pattern Anal. Mach. Intell. PAMI-6, 1984, 698.
7. M. Antonini, M. Barlaud, P. Mathieu, and I. Daubechies, Image coding
using wavelet transform, IEEE Transactions on Image Processing 40(2),
1992, 205–220.
8. J.-P. Leduc, Spatio-temporal wavelet transforms for digital signal analysis,
Signal Processing 60, 1997, 23–41.
9. R. Murenzi, Wavelet transforms associated to the n-dimensional Euclidean
group with dilations: Signals in more than one dimension, in Wavelets:
Time-Frequency Methods and Phase Space (J. M. Combes, A. Grossmann,
and Ph. Tchamitchian, Eds.), pp. 239–246.
10. J. Segman and Y. Zeevi, Image analysis by wavelet-type transforms: Group
theoretical approach, Journal of Mathematical Imaging and Vision 3, 1993,
51–77. [“Estimation with a Pattern Recognition (ICPR’86), Washington
DC, 1986”]
11. J. P. Gauthier, G. Bornard, and M. Sibermann, Motion and pattern analysis:
Harmonic analysis on motion groups and their homogeneous spaces, IEEE
Trans. Syst. Man Cybern. 21, 1991, 159–172.
12. R. Lenz, Group Theoretical Methods in Image Processing, Lecture Notes in
Computer Science, Springer-Verlag, Berlin/Heidelberg/New York, 1990.
13. G. S. Chirikjian and I. Ebert-Uphoff, Numerical convolution on the
Euclidean group with applications to workspace generation, IEEE Trans.
Robotics Automation 14(1), 1998, 123–136.
14. G. B. Folland, A Course in Abstract Harmonic Analysis, CRC Press, Boca
Raton, FL, 1995.
15. K. Kanatani, Group-Theoretical Methods in Image Understanding,
Springer-Verlag, Berlin/Heidelberg/New York, 1990.
16. G. Chirikjian, Fredholm integral equations on the Euclidean motions group,
Inverse Problems 12, 1996, 579–599.
17. J. W. Cooley and J. Tukey, An algorithm for the machine calculation of
complex Fourier series, Math. Comput. 19, 1965, 297–301.
18. D. F. Elliott and K. R. Rao, Fast Transforms: Algorithms, Analyses, Appli-
cations, Academic Press, New York, London, 1982.
19. H. Choi and D. C. Munson, Direct-Fourier reconstruction in tomography
and synthetic aperture radar, Int. J. of Imaging Systems and Technology 9,
1998, 1–13.
20. H. Stark, J. W. Woods, I. Paul, and R. Hingorani, Direct Fourier recon-
struction in computed tomography, IEEE Trans. Acoustics, Speech, Signal
Processing ASSP-29, 1981, 237–245.
21. A. Kyatkin and G. Chirikjian, Regularized solutions of a nonlinear convo-
lution equation on the Euclidean group, Acta Appl. Math. 53, 1986, 89–123.
22. N. J. Vilenkin, Bessel functions and representations of the group of
Euclidean motions, Uspehi Mat. Nauk. 11, 1956, 69–112. [Russian]
23. J. Canny, A computational approach to edge detection, IEEE Trans. Pattern
Anal. Mach. Intell. 8, 1986, 679–698.
24. H. Dym and H. McKean, Fourier Series and Integrals, Academic Press,
New York, 1972.
25. M. Sugiura, Unitary Representations and Harmonic Analysis, 2nd ed.,
Elsevier Science, Amsterdam, 1990.
26. D. Gurarie, Symmetry and Laplacians. Introduction to Harmonic Analysis,
Group Representations and Applications, Elsevier Science, 1992.