A Joint Illumination and Shape Model for Visual Tracking
Amit Kale and Christopher Jaynes∗
Ctr. for Visualization and Virtual Environments and Department of Computer Science
University of Kentucky
{amit,jaynes}@cs.uky.edu
Abstract
Visual tracking involves generating an inference about the motion of an object from measured image locations in a video sequence. In this paper we present a unified framework that incorporates shape and illumination in the context of visual tracking. The contribution of the work is twofold. First, we introduce a multiplicative, low-dimensional model of illumination that is defined by a linear combination of a set of smoothly changing basis functions. Secondly, we show that a small number of centroids in this new space can be used to represent the illumination conditions existing in the scene. These centroids can be learned from ground truth and are shown to generalize well to other objects of the same class for the scene. Finally, we show how this illumination model can be combined with shape in a probabilistic sampling framework. Results of the joint shape-illumination model are demonstrated in the context of vehicle and face tracking in challenging conditions.
1. Introduction
Visual tracking involves generating an inference about the motion of an object from measured image locations in a video sequence. Unfortunately, this goal is confounded by sources of image appearance change that are only partly related to the position of the object in the scene. For example, unknown deformations, changes in pose of the object, or changes in illumination can cause a template to change appearance over time and lead to tracking failure.
Shape change for rigid objects can be captured by a low-dimensional shape space under a weak perspective assumption. Thus tracking can be considered as the statistical inference of this low-dimensional shape vector. This interpretation forms the basis of several tracking algorithms, including the well-known Condensation algorithm [7] and its variants. A similarly concise model is required if we are to
∗This work was funded by NSF CAREER Award IIS-0092874 and by the Department of Homeland Security.
Figure 1. Tracking a car across drastic illumination change. (a) A template constructed for the vehicle in sunlight will change appearance as it enters shadow and traditional shape tracking fails. (b) The histogram of the zeroth coefficient of our illumination model. This work shows how the modes of these distributions are sufficient to accurately track through both shape and illumination change.
robustly estimate illumination changes in a statistical tracking framework while avoiding undue increase in the dimensionality of the problem. This is the topic of this paper.
The study of appearance change as a function of illumination is a widely studied area in computer vision [2, 11, 1]. These methods focus on accurate models of appearance under varying illumination and their utility for object recognition. However, they typically require an explicit 3D model of the object, which somewhat limits their use in surveillance applications. A general yet low-dimensional parameterization of illumination has thus far been elusive in a tracking context.
In this work we focus on the problem of tracking objects through simultaneous illumination and shape change. Examples include monitoring vehicles that move in and out of shadow or tracking a face as it moves through different lighting conditions in an indoor environment. The approach is intended for use in traditional video surveillance and monitoring tasks, where a large number of illumination samples of each object to be tracked are unavailable [6] and features that are considered to be invariant to illumination are known to be unreliable [2].
The contribution of the work is twofold. First, we introduce a multiplicative, low-dimensional model of illumination that is computed as a linear combination of a set of Legendre functions. Such a multiplicative model can be interpreted as an approximation of the illumination image as discussed in Weiss [12]. Although the model is not intended to be applied to recognition tasks under differing illumination, it is sufficient to capture appearance variability for improved tracking. The Legendre coefficients together with the shape vectors define a joint shape-illumination space. Our approach then is to estimate the vector in this joint space that best transforms the template to the current frame. This is in contrast to approaches that adapt the template over time by modifying a continuously varying density [3, 13]. Direct adaptation of the template requires careful selection of adaptation parameters to avoid problems of drift [10].
In an alternative formulation of the problem, Freedman and Turek [4] introduce an illumination-invariant approach to computing optic flow that can be used to localize object templates. The method was shown to be quite robust at tracking objects through shadows. However, it is computationally expensive and it is unclear how known system dynamics can be integrated within the approach. We do not seek illumination invariance but instead estimate the illumination changes using our model as part of the tracking process. However, use of illumination-invariant optic flow as a low-level primitive could be used in combination with the work here to inform the shape-space sampling distributions and is the subject of future work.
When using this joint shape-illumination space for tracking, it is no longer obvious how this space should be sampled. For example, Figure 1a shows a vehicle that moves from bright sunlight to shadow. Because this transition can occur instantaneously between frames, the smoothness assumptions that are used to derive the sampling distribution for shape are often violated for the illumination component. Furthermore, the additional degrees-of-freedom that are required to model illumination can lead to decreased robustness at runtime or require an inordinate number of tracking samples in each frame. However, we discover the surprising result that a small number of centroids extracted from the underlying distributions of our illumination coefficients are often adequate to represent the influence of most of the illumination conditions existing in the scene. Figure 1b shows a distribution of the zeroth-order coefficient in our model for the car moving through the scene in Figure 1a. In Section 3.1 we discuss how important modes of these distributions are extracted and used to track through drastic illumination changes such as these.
2. A Multiplicative Model of Appearance Change due to Illumination
The image template throughout the tracking sequence can be expressed as

  U_t(x, y) = L_t(x, y) R(x, y),   (1)

where L_t(x, y) denotes the illumination image in frame t and R(x, y) denotes a fixed reflectance image [12]. Thus if the reflectance image of the object is known, tracking becomes the problem of estimating the illumination image and a shape vector.
Of course, the reflectance image is typically unavailable and the illumination image can only be computed modulo the illumination contained in the image template, as shown in Equation 2:

  L_t = (L̃_t / L_0) R(x, y),   (2)

where L_0 is the initial illumination image and L̃_t is the unknown illumination image for frame t.
Our proposed model of appearance change, then, is simply the product of the input image with a function f_t(x, y) that approximates L_t and is defined over the image domain P × Q. A naive way of compensating for appearance change then is to allow each f(x, y), x = 1, ..., P, y = 1, ..., Q, to vary independently. However, it is known that for a convex Lambertian object the change in appearance of neighboring pixels is not independent, and the excessive additional degrees-of-freedom can make the tracking problem intractable.
Instead we construct the illumination compensation image f from a linear combination of a far lower-dimensional set of n basis functions. In order to be useful, the basis functions must be both orthogonal in the 2D image domain and straightforward to compute. Furthermore, they must be capable of spanning most of the appearance changes in the template due to illumination. For the work here we utilize the Legendre polynomial basis, although any other polynomial basis will suffice. To give an idea of the type of variation the basis supports, Figure 2 shows the Legendre basis of order three.
Let p_n(x) denote the n-th Legendre basis function. Then, for a given set of coefficients Λ = [λ_0, ..., λ_{2n}]^T, the scaled intensity value at a pixel is computed as:

  Û(x, y) = ( (1/(2n+1)) (λ_0 + λ_1 p_1(x) + ... + λ_n p_n(x) + λ_{n+1} p_1(y) + ... + λ_{2n} p_n(y)) + 1 ) U(x, y)   (3)
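Equation (3) is straightforward to evaluate. The sketch below is our own illustration, not the paper's implementation; the mapping of pixel coordinates onto [−1, 1] and the sample coefficient values are assumptions made for the example.

```python
import numpy as np
from numpy.polynomial.legendre import legval

def apply_illumination(U, lam):
    """Scale template U by the multiplicative Legendre model of Eq. (3).

    lam = [lambda_0, ..., lambda_{2n}] holds 2n+1 coefficients:
    lambda_1..lambda_n weight p_1(x)..p_n(x), and
    lambda_{n+1}..lambda_{2n} weight p_1(y)..p_n(y).
    """
    n = (len(lam) - 1) // 2
    P, Q = U.shape
    # Legendre polynomials are defined on [-1, 1]; map pixel coords there.
    x = np.linspace(-1.0, 1.0, Q)   # column (x) coordinate
    y = np.linspace(-1.0, 1.0, P)   # row (y) coordinate
    # legval evaluates sum_k c_k p_k(t); a leading zero suppresses the
    # p_0 term so that lambda_0 enters only once, as the constant below.
    fx = legval(x, np.concatenate(([0.0], lam[1:n + 1])))
    fy = legval(y, np.concatenate(([0.0], lam[n + 1:2 * n + 1])))
    f = (lam[0] + fx[np.newaxis, :] + fy[:, np.newaxis]) / (2 * n + 1) + 1.0
    return f * U

# Illustrative 2nd-order example: lam has 2*2 + 1 = 5 entries.
U = np.ones((4, 6))
lam = np.array([0.5, 0.1, 0.0, -0.1, 0.0])
U_hat = apply_illumination(U, lam)
```

With all coefficients zero the model reduces to the identity (f ≡ 1), which is a convenient sanity check.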
Figure 2. First seven Legendre basis functions used to track illumination change in an image template.

For purposes of notation, we will denote the effect of Λ on the image as

  ∆_Λ U ≡ U ⊗ PΛ + U   (4)

where
  P = [ (1/(2n+1)) p_0  ...  (1/(2n+1)) p_n(y_1)
          ⋮                      ⋮
        (1/(2n+1)) p_0  ...  (1/(2n+1)) p_n(y_{PQ}) ].   (5)

We define ⊗ as an operator that scales the rows of P with the corresponding element of U written as a vector. Given an input template T and an image U, the Legendre coefficients that minimize the error between ∆_Λ U and T can be computed by solving the least squares problem

  U ⊗ P Λ ≈ T − U.   (6)
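The fit in (6) reduces to an ordinary least-squares problem. The following NumPy sketch is our own illustration of that step, not the authors' code; the [−1, 1] coordinate mapping, the random template, and the synthetic left-to-right lighting ramp are assumptions for the example.

```python
import numpy as np
from numpy.polynomial.legendre import Legendre

def fit_illumination(U, T, n=2):
    """Estimate Lambda from Eq. (6): (U ⊗ P) Λ ≈ T − U, via least squares.

    Each row of P holds the 2n+1 Legendre basis values at one pixel,
    scaled by 1/(2n+1); the ⊗ operator multiplies each row by the
    corresponding pixel of U.
    """
    rows, ncols = U.shape
    x = np.linspace(-1.0, 1.0, ncols)   # column (x) coordinate
    y = np.linspace(-1.0, 1.0, rows)    # row (y) coordinate
    X, Y = np.meshgrid(x, y)
    cols = [np.ones(U.size)]            # constant term for lambda_0
    for k in range(1, n + 1):
        cols.append(Legendre.basis(k)(X).ravel())   # p_k(x) terms
    for k in range(1, n + 1):
        cols.append(Legendre.basis(k)(Y).ravel())   # p_k(y) terms
    P = np.stack(cols, axis=1) / (2 * n + 1)
    A = U.ravel()[:, np.newaxis] * P    # U ⊗ P
    lam, *_ = np.linalg.lstsq(A, (T - U).ravel(), rcond=None)
    compensated = (A @ lam + U.ravel()).reshape(U.shape)  # Delta_Lambda U
    return lam, compensated

# Illustrative check: relight a random "template" with a lighting ramp.
rng = np.random.default_rng(0)
U0 = rng.uniform(0.2, 1.0, (20, 30))
ramp = 1.0 + 0.3 * np.linspace(-1.0, 1.0, 30)[np.newaxis, :]
lam, comp = fit_illumination(U0, U0 * ramp, n=2)
```

Because the synthetic ramp is linear in x, it lies exactly in the span of the first-order basis, so the compensated image matches the target up to numerical precision.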
Each of the basis functions is scaled by a particular choice of Λ_i and then linearly combined using Equation 4 to derive an illumination image.
Figure 3 demonstrates how this low-dimensional set of Legendre polynomials can accommodate illumination change. Figure 3a is an input template and Figure 3b is the same image template relit from a different direction. Using a least squares fit for Λ, a new image that is more similar in appearance to the target image is generated (see Figure 3c).
3. A Joint Space of Illumination and Shape for Tracking
For the sake of generality, we assume an N_S-dimensional shape space and an N_λ-dimensional illumination space that result in a joint space A = L(W, T, ∆) that maps a joint shape and appearance vector X^A ∈ R^{N_S + N_λ},

  X^A = [X; Λ],   (7)

to a deformed and relit template U ∈ R^{N_T}:

  U = [∆_Λ] [I(WX + T)].   (8)
Figure 3. An example of illumination compensation using a low-dimensional multiplicative model. (a) Input template. (b) Input image under new illumination. (c) Synthesized image that is the product of illumination basis functions with the input. For this example a third order Legendre polynomial was used and the Legendre coefficients were computed using (6).
W denotes an N_T × N_S shape matrix. The constant offset T denotes the template against which shape variations are measured. No such offset is required for the illumination component. I(·) simply refers to the image intensities measured on the shape grid implied by the shape component of X^A.
The proposed joint shape-illumination space can be sampled sequentially to track objects through a range of shape and illumination changes. This is best accomplished in a robust way using a particle filter framework. Particle filters (PF) are very widely studied in computer vision and different variants of their implementation exist [13, 9]. Two important components of a PF are a state evolution model p(X^A_t | X^A_{t−1}) and an observation model p(Y_t | X^A_t). The PF tracker approximates the posterior density p(X^A_t | Y_{1:t}) with a set of weighted particles {((X^A)^j_t, w^j_t)} with Σ^M_{j=1} w^j_t = 1. The likelihood p(Y_t | X^A_t) of a particular hypothesis in the case of the joint shape-appearance model is computed using the transformed image and the template. A likelihood measure on the joint shape-illumination hypothesis X^A_i is computed as the sum of absolute differences (SAD) between U and T.
The other component of PF tracking is the specification of p(X^A_t | X^A_{t−1}). Typically a Gauss-Markov model is assumed, whereby X^A_{t+1} ∼ N(X^A_t, V). In the absence of any knowledge about the expected range of motion and illumination change, a brute-force approach is required and the variance on the normal distribution of each component in X^A is set to a high value. This necessitates an unreasonable increase in the number of particles in order to maintain reliable tracking, and such an approach is more likely to suffer from local minima. With the additional dimensions that the new model implies, the problem can be even more formidable than traditional shape tracking, where recent work has studied how more informed sampling distributions for shape tracking can be derived [8]. In the following section we outline how meaningful sampling densities for illumination can be learned from a few examples and show that these densities are in fact degenerate. As a result, the new model can be represented by several centroids in the Legendre basis.
3.1. Learning Sampling Distributions for Illumination and Shape
We assume that we have a static camera acquiring images of a scene and that the illumination conditions, although variable within the scene, do not change significantly over time. Ground truth video sequences consisting of a starting template T and its location and shape in subsequent frames {U_1, ..., U_N} are used to compute shape vectors {X_1, ..., X_N} corresponding to this motion. Furthermore, a set of Legendre coefficients {Λ_1, ..., Λ_N} that best map {U_1, ..., U_N} to T are computed via standard least squares fitting (6).
The shape sampling distribution h(X) must model the incremental motion between frames. For smooth motions, shape distributions can be computed from shape difference vectors {X_2 − X_1, ..., X_N − X_{N−1}}. Standard kernel density methods can then be used to estimate a sampling distribution from these differences. Alternatively, a uniform density U(a, b) corresponding to the maximal ranges of the state components can be used as a simple approximation of h(X).
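The uniform approximation of h(X) can be built directly from the difference vectors. The sketch below is a minimal illustration under our own assumptions: the three-component shape vectors (scale, t_x, t_y) are invented ground-truth values, not data from the paper.

```python
import numpy as np

def uniform_shape_density(shape_vectors):
    """Approximate h(X) by a uniform density over the per-component
    range of the frame-to-frame shape differences (Section 3.1)."""
    D = np.diff(np.asarray(shape_vectors), axis=0)   # X_{i+1} - X_i
    lo, hi = D.min(axis=0), D.max(axis=0)

    def sample(rng, size=1):
        # Draw shape *increments* uniformly within the observed ranges.
        return rng.uniform(lo, hi, size=(size, len(lo)))

    return sample

# Hypothetical ground-truth shape vectors (scale, t_x, t_y):
gt = np.array([[1.00, 10.0, 5.0],
               [1.02, 12.0, 5.5],
               [1.05, 15.0, 6.2],
               [1.06, 17.0, 6.8]])
sample = uniform_shape_density(gt)
steps = sample(np.random.default_rng(0), size=100)
```

Each sampled row is an increment added to a particle's previous shape vector, which is how h(X) is used in the particle filter of Section 3.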
In the case of our new illumination model, sampling distributions in the Legendre space must be estimated. It is natural to consider whether a differential model similar to the one used for X is suitable in this regard. Figure 4 illustrates the problem with such an approach for the illumination space. Although components of the shape space are more or less smoothly monotonic (Figure 4c), this is not the case for the illumination coefficients. For example, the first coefficient, λ_0, changes dramatically as the subject moves through differing illumination. The result is a trajectory that cannot be modeled by considering discrete differences (Figure 4d).
Figure 4. Difficulty of using a differential model for building a sampling distribution for illumination. (a) and (b) show images of a person walking in a hallway towards the camera, (c) shows the y-translation component of X, and (d) shows the λ_0 coefficient of Λ as a function of time. As can be seen, even for smooth motions the illumination component displays discontinuities.
Figure 5. Our approach is motivated by the fact that certain dominant illumination conditions can be quantized into a few centroids in the illumination space. For example, in this scene some of the salient illumination conditions are: (a) subject is diffusely lit from above, (b) subject passes through shadow, (c) subject strongly lit from the side, and (d) subject in darker region of room near camera.
One approach to this problem is to identify subregions of monotonicity and then build a mixture of distributions using discrete differences that are particular to each. However, one direct consequence of using these distributions is that the number of particles needed to span the corresponding regions in illumination space will be extremely large, adding an additional computational burden on top of the traditional shape-space sampling. Clearly, a more efficient way of sampling the illumination space must be found if the resulting algorithm is to be useful.
Figure 6. A plot of the SAD error as a function of time. The red dashed line represents the situation with no illumination compensation. The blue dash-dotted line represents the compensation with the least squares fit for Λ. The green solid line represents compensation with the vector-quantized values of the least squares fits. The black crossed line represents compensation with a random Λ.
Although the underlying distribution of Λ is of course continuous, we can discard much of this information in favor of tracking robustness by seeking the most important illumination modes that are present in the distribution. This step is motivated by the observation that a scene is typically composed of a discrete set of illumination conditions. For example, the underlying illumination distribution for the scene shown in Figure 4 arises from certain salient illumination conditions in the scene as shown in Figure 5.
In order to achieve an efficient sampling of the illumination space we perform a k-means clustering of {Λ_1, ..., Λ_N} and use the k centroids c_1, ..., c_k as a representation of the illumination space.
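A minimal sketch of this quantization step follows. It is our own illustration: the two synthetic illumination conditions and their coefficient values are invented, and we use a plain farthest-point-initialized k-means rather than whatever variant the authors employed.

```python
import numpy as np

def kmeans(X, k, iters=50):
    """Quantize the rows of X (per-frame Legendre coefficient vectors
    {Lambda_1, ..., Lambda_N}) into k illumination centroids c_1..c_k.
    Deterministic farthest-point initialization, then Lloyd iterations."""
    C = [X[0]]
    for _ in range(k - 1):
        # Next seed: the point farthest from all current centroids.
        d = np.min([((X - c) ** 2).sum(axis=1) for c in C], axis=0)
        C.append(X[np.argmax(d)])
    C = np.array(C)
    for _ in range(iters):
        # Assign each Lambda to its nearest centroid, then recompute means.
        labels = np.argmin(((X[:, None, :] - C[None, :, :]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                C[j] = X[labels == j].mean(axis=0)
    return C

# Two synthetic illumination conditions, e.g. "sunlit" vs. "shadowed":
rng = np.random.default_rng(1)
sunlit = rng.normal([-0.5, 0.2, 0.0], 0.05, size=(40, 3))
shadow = rng.normal([-3.0, 0.1, -0.2], 0.05, size=(40, 3))
centroids = kmeans(np.vstack([sunlit, shadow]), k=2)
```

When the illumination conditions are well separated, as in Figure 1b, the recovered centroids land close to the per-condition means, which is exactly the property the tracker relies on.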
To demonstrate that clustering in this way does not degrade our ability to track, we studied many face-tracking examples under different illumination conditions. The results support the claim that only a few modes are needed instead of the entire distribution. For example, Figure 6 shows the SAD score achieved for a typical face tracking process using several different approaches. Both random selection of Legendre coefficients and no compensation lead to high error. More importantly, the plot shows nearly no difference between exact least-squares fits of a second order Legendre model and compensation that utilizes only six centroids discovered via the k-means clustering process.

Figure 7. The range of Λ as a function of time shown for two individuals for a 2nd order Legendre polynomial fit. The similarity in the range of values taken by different components of Λ can be seen from the plot. As a consequence, the centroids describing the illumination conditions in a scene for different people are close.

This result is typical of most situations, and a rate-distortion study found that k = 6 is adequate to represent the variability of Λ for our indoor surveillance scenario. Figure 7 shows the result of least-squares fits to the first four Legendre coefficients for two different subjects. Note that the range of variability is nearly the same for both subjects, justifying our use of the same centroids to represent several subjects from the same scene.
These results allow us to coarsely sample the illumination space with minimal impact on the tracking results while retaining the ability to generalize to previously unseen objects within the same class. This requires only minor modification to the standard particle filter to incorporate the k illumination clusters. Specifically, for every particle j drawn from h(X), we sample i from {1, ..., k} with probability 1/k and compute

  U = [∆_{Λ_{c_i}}] [I(W X^j_t + T)]   (9)

before measuring the SAD distance. The new algorithm, then, combines traditional shape tracking with our multiplicative model of illumination compensation. Table 1 summarizes the joint shape-illumination tracking algorithm.
Given: an estimate of shape sampling distributions h(X) and k cluster centers c_1, ..., c_k in the illumination basis.

1. Initialize sample set X = {X^j_0, 1/M}
2. For t = 1, ..., T
3.   For j = 1, ..., M
4.     Generate X^j_t from X^j_{t−1} using h(X)
5.     Compute transformed image regions in accordance with shape vectors X^j_t
6.     Pick an i from {1, ..., k} with probability 1/k
7.     Compute U using (9)
8.     Compute likelihood p(Y_t | X^j_t) by measuring the SAD distance between U and T
9.   End
10.  Importance resample {X^j_t} based on {p(Y_t | X^j_t)} to get the resampled set {X^j_t}
11. End

Table 1. The particle filter using the new shape-illumination space.
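To make the control flow of Table 1 concrete, here is a schematic sketch. It is entirely our own illustration: a 1-D translation-only "shape", a synthetic brightness change, and a zeroth-order-gain stand-in for the full Legendre warp; it shows the sampling/weighting/resampling structure, not the authors' implementation.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences used inside the likelihood (step 8)."""
    return np.abs(a - b).sum()

def pf_track(frames, template, centroids, h_sample, M=200, beta=1.0, seed=0):
    """Schematic version of Table 1 for a 1-D translation-only shape space.

    Each particle is an offset into the frame; illumination is handled by
    drawing one of the k centroids uniformly (step 6) and applying its
    zeroth-order gain to the extracted window (step 7, simplified)."""
    rng = np.random.default_rng(seed)
    L = len(template)
    X = np.zeros(M)                              # step 1: initial particles
    estimates = []
    for frame in frames:                         # step 2: for each frame
        X = np.clip(X + h_sample(rng, M), 0, len(frame) - L)   # step 4
        w = np.empty(M)
        for j in range(M):                       # step 3: for each particle
            win = frame[int(X[j]):int(X[j]) + L]               # step 5
            c = centroids[rng.integers(len(centroids))]        # step 6
            U = (1.0 + c[0]) * win                             # step 7
            w[j] = np.exp(-beta * sad(U, template))            # step 8
        w /= w.sum()
        estimates.append(X[np.argmax(w)])        # MAP-style point estimate
        X = X[rng.choice(M, size=M, p=w)]        # step 10: resample
    return estimates

# Synthetic sequence: a Gaussian bump moving right 4 px/frame and dimming
# to half brightness after frame 4; the +1 centroid gain maps the dimmed
# window back to the template (0.5 * (1 + 1) = 1).
template = np.exp(-0.5 * ((np.arange(21) - 10) / 3.0) ** 2)
frames = []
for t in range(10):
    f = np.zeros(100)
    f[5 + 4 * t:5 + 4 * t + 21] = template * (1.0 if t < 5 else 0.5)
    frames.append(f)
centroids = np.array([[0.0], [1.0]])
h = lambda rng, M: rng.uniform(0.0, 8.0, M)   # assumed forward-motion prior
est = pf_track(frames, template, centroids, h)
```

Because only particles that both land on the bump and draw the matching centroid score a near-zero SAD, the filter keeps tracking across the brightness discontinuity, mirroring the behavior the paper reports at shadow boundaries.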
4. Experimental Results
We now d emonstrate the utility of the joint shape
illumination model in two different scenarios. The results
discussed here are indicative of results the system achieved
for many such sequences. For example, in the car sequence
twenty cars were successfully tracked over a period of two
hours
1
In each case we follow the procedure described in
Section 3.1 to establish sampling distributions in the joint
space over some set of training samples. Tracking was then
performed using 200 particles on new objects using the al
gorithm in Table 1.
The car dataset was generated from a camera observing a road from above as cars approach an intersection and move in and out of shadow. Two sequences were used to acquire the sampling distributions. Training involved marking locations of the moving car in successive frames. Using these locations the corresponding shape vector was computed. We used a 3D shape space that spans scaling and translations in X and Y. Using the maximal values of the shape difference vectors, a uniform distribution over the corresponding range was computed for each shape component. Using the least squares method (6) we fit different orders of Legendre polynomials and computed the resulting SAD error. We found that a first order Legendre polynomial was adequate to capture the illumination change in this case, where the object is more or less planar. The k-means clustering process yields two centers {c_1, c_2} that were then used to represent the discretized illumination space.

¹The cars were arbitrarily picked in the sequence and the initial locations of the cars were hand-extracted and passed on to the tracker.
Figure 8 shows tracking results for a car using the joint shape-illumination tracker. The white square corresponds to the MAP estimate for that frame. The new tracking algorithm is compared to a traditional particle filter that does not encompass illumination change (Figure 8, bottom row). The same shape sampling distributions were used by both algorithms.
The particle filter tracks the template well as long as the illumination conditions that existed when the template was captured remain unchanged. However, at the shadow boundary the traditional tracker fails. On the other hand, the new illumination model captures this appearance change and the joint shape-illumination likelihoods remain high for the correct estimate via the additional degree-of-freedom afforded by the illumination model.
A second dataset contained several different subjects moving through different illuminations in an indoor environment. The illumination conditions in this case were significantly more complex than in the vehicle tracking dataset. Sunlight through windows and different light sources (i.e. fluorescent overhead lamps and incandescent desk lights) persist throughout the space, making the dataset very challenging. In fact, to test the algorithm a strong diffuser lamp was placed in a room to generate strong side lighting (see Figure 9). Ground truth was again generated from two different sequences. A second order Legendre polynomial was chosen for the illumination component. Using rate-distortion studies as discussed in Section 3.1, we found that around six clusters were required to capture the variability in the scene. Here we discuss tracking results when six clusters were used. Using more centroids does not lead to degradation of the results; however, it requires an additional number of particles.
Figure 9 shows two different subjects moving through various illumination conditions as they approach a surveillance camera. These sequences are typical for this setup and only three frames are shown in the interest of space.
Figure 10 shows the initial template for each subject and the illumination image generated by the illumination centroid associated with the MAP estimate. This illumination image was multiplied to the grid indicated by the shape vector in the frames shown in Figure 9. As can be seen, these illumination images are able to compensate for the illumination changes in the sequence.
5. Conclusions and Future Work
In this paper we presented an approach to track across shape and illumination change. We introduced a low-dimensional multiplicative model of illumination change that is expressed as a linear combination of a Legendre basis. We demonstrated how this new model is capable of capturing appearance change in the tracked template. We showed how the Legendre coefficients can be combined with the shape vector to define a new shape-illumination space. We discovered that in this new illumination space a small number of centroids suffice to capture illumination changes in a particular scenario. We showed how to estimate these centroids and incorporate them in the particle filtering framework at run time without adding excessive computational burden. We demonstrated the utility of our approach for both vehicle and face tracking scenarios. One of the assumptions in our work is that the initial templates in the training and testing sequences are acquired under similar illumination conditions. We expect to incorporate the bilinear style-content factorization of Freeman and Tenenbaum [5] to overcome this drawback. Finally, more sophisticated studies involving the stability of the learned distributions over time and slow illumination changes are underway. Initial results indicate that the distributions can be quite stable but may need to be relearned over some period of time. For example, distributions learned at dawn no longer apply at dusk.

Figure 8. Example of tracking a car through drastic illumination changes. The bottom row shows the result using a conventional particle filter while the top row shows the result using our algorithm.

Figure 9. Example of tracking faces in an indoor setting. The illumination conditions existing in this scenario are significantly more complex than those in the vehicle tracking situation.

Figure 10. Initial template and illumination images constructed from the Legendre basis that were used to model appearance change in the sequence shown in Figure 9.
References
[1] R. Basri and D. Jacobs. Lambertian reflectance and linear subspaces. IEEE Trans. PAMI, 25(2):218–233, 2003.
[2] P. Belhumeur and D. J. Kriegman. What is the set of images of an object under all possible illumination conditions? IJCV, 28(3):1–16, 1998.
[3] B. Han and L. Davis. On-line density-based appearance modeling for object tracking. Proceedings of ICCV, 2005.
[4] D. Freedman and M. Turek. Illumination-invariant tracking via graph cuts. IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2:10–17, 2005.
[5] W. Freeman and J. Tenenbaum. Learning bilinear models for two-factor problems in vision. Proceedings of IEEE CVPR, 1997.
[6] G. Hager and P. Belhumeur. Efficient region tracking with parametric models of geometry and illumination. IEEE Trans. PAMI, 20(10):1025–1039, 1998.
[7] M. Isard and A. Blake. Condensation — conditional density propagation for visual tracking. IJCV, 21(1):695–709, 1998.
[8] A. Kale and C. Jaynes. Shape space sampling distributions and their impact on visual tracking. IEEE International Conference on Image Processing, 2005.
[9] L. Lu, X. Dai, and G. Hager. A particle filter without dynamics for robust 3D face tracking. Proc. of FPIV, 2004.
[10] I. Matthews, T. Ishikawa, and S. Baker. The template update problem. Proceedings of the British Machine Vision Conference, September 2003.
[11] R. Ramamoorthi. Analytic PCA construction for theoretical analysis of lighting variability in images of a Lambertian object. IEEE Trans. PAMI, 24(10):1–12, 2002.
[12] Y. Weiss. Deriving intrinsic images from image sequences. Proc. of ICCV, 2001.
[13] S. Zhou, R. Chellappa, and B. Moghaddam. Visual tracking and recognition using appearance-adaptive models in particle filters. IEEE Trans. on Image Processing, November 2004.