IMPROVED PARTICLE FILTERING SCHEMES FOR TARGET TRACKING

Zhe Chen¹, Thia Kirubarajan¹, Mark R. Morelande²

1. Communications Research Lab, McMaster University, Canada
2. Center for Sensor, Signal and Information Processing, University of Melbourne, Australia
ABSTRACT

In this paper, we propose two improved particle filtering schemes for target tracking, one based on a gradient proposal and the other based on the Turbo principle. We present the basic ideas and derivations and show detailed results for three tracking applications. Favorable experimental findings demonstrate the efficiency of the proposed schemes and their potential in other tracking scenarios.
1. INTRODUCTION

Recent years have witnessed ever-growing efforts in applying particle filters to signal processing, communications, and machine learning [1]. Combining recursive Bayesian estimation with sequential Monte Carlo sampling, particle filtering has demonstrated its potential in a variety of nonlinear, non-Gaussian, nonstationary sequential estimation problems. Among many such problems, target tracking provides a natural testbed for the particle filter. As is well known, the conventional Bayesian bootstrap [3] or SIR filter (using the prior as proposal) has the drawback of ignoring the most recent observation. In this paper, we propose two improved particle filtering schemes to overcome this weakness and apply them to several tracking applications. The first scheme uses gradient information from the measurement model. The idea of the gradient proposal is heuristic, but it is simple to implement and turns out to be quite efficient in practice [4]. We also propose a new particle filtering method based on the Turbo principle (motivated by Turbo decoding in communications). Basically, we use one filter (the so-called slave filter) to produce a first-stage (rough) estimate, and run another filter (the master filter) in parallel to yield a second-stage (ultimate) estimate, which uses the current as well as previous estimates for the importance weight update in a recursive particle filtering fashion.

The rest of the paper is organized as follows: in Section 2, we briefly discuss the Bayesian bootstrap filter and then introduce the improved schemes for particle filtering. Section 3 is devoted to two simulated and one real-life tracking application, followed by concluding remarks in Section 4.
2. PARTICLE FILTERING AND IMPROVED SCHEMES

2.1. State-Space Model and Bayesian Bootstrap Filter

Consider a generic discrete-time nonlinear state-space model:

x_{n+1} = f(n, x_n, d_n),    (1a)
y_n = g(n, x_n, v_n),    (1b)
where d_n and v_n characterize the dynamic and measurement noise processes, respectively. The state equation (1a) characterizes the state transition probability p(x_{n+1} | x_n), whereas the measurement equation (1b) describes the likelihood p(y_n | x_n), which is further related to the measurement noise model.
Simply put, a particle filter uses a number of independent random variables called particles, sampled directly from the state space, to represent the posterior probability, and updates the posterior by incorporating the new observations; the "particle system" is properly located, weighted, and propagated recursively according to Bayes' rule. Specifically, using a sequential importance sampling (SIS) scheme, it can be shown [2] that the importance weight update has the following recursive form:

W_n^{(i)} = W_{n-1}^{(i)} \frac{p(y_n | x_n^{(i)}) p(x_n^{(i)} | x_{n-1}^{(i)})}{q(x_n^{(i)} | x_{0:n-1}^{(i)}, y_{0:n})},    (2)
where W_n^{(i)} = p(x_n^{(i)})/q(x_n^{(i)}) denotes the importance weight, and q(x_n^{(i)} | x_{0:n-1}^{(i)}, y_{0:n}) represents the proposal distribution. Choosing a proper proposal often has a crucial effect on the particle filtering performance.
The well-known SIR and Bayesian bootstrap filters [3] use the transition prior as proposal, i.e., q(x_n | x_{n-1}, y_{0:n}) = p(x_n | x_{n-1}); this simplifies (2) to

W_n^{(i)} = W_{n-1}^{(i)} p(y_n | x_n^{(i)}),    (3)
which essentially neglects the effect of the recent observation y_n. Despite its appealing simplicity, this proposal distribution is far from optimal, and the resulting performance can be quite poor even when a large number of particles is used. Many improved schemes (such as the auxiliary variable method) have been developed in the literature [1]. In the following, we describe two improved schemes that attempt to efficiently incorporate the observation information into the sampling step.
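To make the weight updates (2)-(3) concrete, the bootstrap (SIR) filter described above can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation; the callables f, loglik, and sample_x0 are hypothetical stand-ins for the model (1a)-(1b):

```python
import numpy as np

def bootstrap_filter(y, f, loglik, sample_x0, Np, rng=None):
    """Bootstrap (SIR) particle filter with the transition prior as proposal.

    f(x, rng)     -- samples x_n given x_{n-1} (state equation, assumed signature)
    loglik(yn, x) -- log p(y_n | x_n) evaluated per particle
    sample_x0(Np) -- draws Np particles from the prior p(x_0)
    """
    rng = rng or np.random.default_rng(0)
    x = sample_x0(Np)                  # (Np, dx) particle array
    est = []
    for yn in y:
        x = f(x, rng)                  # propagate through the prior proposal
        logw = loglik(yn, x)           # weight update (3): W_n proportional to p(y_n | x_n)
        w = np.exp(logw - logw.max())  # stabilize before normalizing
        w /= w.sum()
        est.append(w @ x)              # posterior mean estimate
        # multinomial resampling at every step (bootstrap variant)
        idx = rng.choice(Np, size=Np, p=w)
        x = x[idx]
    return np.array(est)
```

With the prior as proposal, the observation y_n enters only through the weights, which is exactly the weakness the two schemes below address.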
2.2. Particle Filtering Using a Gradient Proposal

In order to use the recent observation, we propose to use the gradient information of (1b) to select "informative" particles [4]. The main idea is to introduce a MOVE-step into sampling from the proposal distribution, plugged in before the sampling step of the conventional SIR filter. The new algorithm essentially calculates gradient information from the likelihood model and guides the particles toward the low-error region, along the gradient-descent direction; assuming an additive measurement noise model in (1b), the MOVE-step is described by
\hat{x}_{n|n-1} = \hat{x}_{n-1|n-1} - \eta \left. \frac{\partial \|y_n - g(x)\|^2}{\partial x} \right|_{x = \hat{x}_{n-1|n-1}},    (4)
IV-145    0-7803-8874-7/05/$20.00 ©2005 IEEE    ICASSP 2005
[Figure 1 shows the slave and master filters running in parallel: at time n, the slave filter maps y_n and {x_{n-1}^{(i)}, W_{n-1}^{(i)}} to a rough estimate \hat{x}_{n|n}, which the master filter uses to produce {x_n^{(i)}, W_n^{(i)}} for time n+1.]

Fig. 1. A schematic diagram of Turbo particle filtering.
where η ∈ [0.001, 0.01] is a small step-size parameter. As expected from (4), the inclusion of the current observation y_n and the calculation of gradient information tend to push the samples toward a high-likelihood region, thereby providing more reliable predictive samples for the next step. In summary, the improved particle filtering with gradient proposal reads as follows:
1. For i = 1, ..., N_p, sample x_0^{(i)} ~ p(x_0), and set W_0^{(i)} = 1/N_p.
2. For each sample {x_{n-1}^{(i)}}, update the sample via (4).
3. Importance sampling: \hat{x}_n^{(i)} ~ p(x_n | \hat{x}_{n|n-1}^{(i)}).
4. Importance weight update:
   W_n^{(i)} = W_{n-1}^{(i)} \frac{p(y_n | \hat{x}_n^{(i)}) \, p(\hat{x}_n^{(i)} | \hat{x}_{n-1|n-1}^{(i)})}{p(\hat{x}_n^{(i)} | \hat{x}_{n|n-1}^{(i)})}.
5. Calculate the effective sample size \hat{N}_{eff}; if \hat{N}_{eff} < N_p/2, perform resampling; then go to Step 2.
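Step 2 (the MOVE-step (4)) can be sketched as below. This is an illustrative sketch under the additive-noise assumption; grad_g is a hypothetical helper returning the Jacobian of the measurement function per particle, and eta follows the range quoted for η:

```python
import numpy as np

def move_step(x, yn, g, grad_g, eta=0.005):
    """Gradient MOVE-step (4): shift each particle down the gradient of the
    squared innovation (y_n - g(x))^2 before sampling from the transition prior.

    g(x)      -- measurement function evaluated per particle, returns (Np,)
    grad_g(x) -- gradient of g w.r.t. x per particle, returns (Np, dx) (assumed helper)
    """
    innov = yn - g(x)                         # (Np,) innovations
    # d/dx (y_n - g(x))^2 = -2 (y_n - g(x)) * dg/dx
    grad = -2.0 * innov[:, None] * grad_g(x)
    return x - eta * grad                     # descend toward the high-likelihood region
```

Particles with large innovations are moved the most, which is what makes the subsequent predictive samples "informative."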
2.3. Turbo Particle Filtering

A schematic diagram of Turbo particle filtering is illustrated in Fig. 1, where two filters run in parallel and iteratively. The slave filter, here an extended Kalman filter (EKF), is used to produce a rough estimate \hat{x}_{n|n}, given the current observation y_n and the previous state estimates x_{n-1}^{(i)}. This is done with the standard EKF equations. Note that in the prediction step, every particle x_{n-1}^{(i)} is passed through the state equation, and the predicted covariance \hat{P}_{n|n-1} can be estimated by the sample covariance or calculated through linearization; in the filtering step, instead of using all of the samples {\hat{x}_{n|n-1}^{(i)}}, we only use their mean value, \hat{x}_{n|n-1} = \langle \hat{x}_{n|n-1}^{(i)} \rangle, to perform the EKF update:
\hat{x}_{n|n} = \hat{x}_{n|n-1} + K_n (y_n - g(\hat{x}_{n|n-1})),    (5)
P_{n|n} = \hat{P}_{n|n-1} - K_n C_n \hat{P}_{n|n-1},    (6)

where K_n = \hat{P}_{n|n-1} C_n^T (C_n \hat{P}_{n|n-1} C_n^T + \Sigma_v)^{-1}, C_n is the linearized Jacobian matrix of the measurement equation, and P_{n|n} is the filtered state covariance. Note that the filtered estimate \hat{x}_{n|n} is more accurate than the predicted estimate \hat{x}_{n|n-1}, since it utilizes the observation y_n.
In the meantime, the master filter, given y_n and the previous simulated samples {x_{n-1}^{(j)}} as well as the first-stage estimate \hat{x}_{n|n}, runs a particle filtering procedure with a constructed suboptimal proposal distribution and produces a second-stage posterior estimate x_n^{(i)}. After a complete step, the master filter propagates its samples to the slave filter for the next iteration. Essentially, the two filters solve the same filtering problem but look at it from different perspectives; each takes advantage of the other's result from the previous step and thereby produces the solution cooperatively. Due to its similarity to Turbo decoding, we call the proposed filter structure the Turbo particle filter (TPF). In what follows, we derive the update equation in detail.
Let us write the filtering posterior in a slightly different way:

p(x_n | y_{0:n}) = p(x_n | y_n, y_{0:n-1})
               = \frac{p(y_n | x_n, y_{0:n-1}) \, p(x_n | y_{0:n-1})}{p(y_n | y_{0:n-1})}
               \propto p(y_n | x_n) \, p(x_n | y_{0:n-1}).    (7)
Next, supposing we can draw samples {x_n^{(i)}} from a proposal distribution, we need to find the importance ratios that appropriately weight the samples. Using the importance sampling trick, we have

W_n^{(i)} = \frac{p(y_n | x_n^{(i)}) \, p(x_n^{(i)} | y_{0:n-1})}{q(x_n^{(i)} | y_n)},    (8)
where q(x_n^{(i)} | y_n) is the proposal distribution. Assume that at time n we have simulated particles approximating the posterior at time n-1:

p(x_{n-1} | y_{0:n-1}) \approx \sum_{j=1}^{N_p} \tilde{W}_{n-1}^{(j)} \, \delta(x_{n-1} - x_{n-1}^{(j)}),    (9)

where \tilde{W}_{n-1}^{(j)} are the normalized importance weights, summing to unity. Hence, we have
p(x_n^{(i)} | y_{0:n-1}) = \int p(x_n^{(i)} | x_{n-1}) \, p(x_{n-1} | y_{0:n-1}) \, dx_{n-1}
                       \approx \sum_{j=1}^{N_p} \tilde{W}_{n-1}^{(j)} \, p(x_n^{(i)} | x_{n-1}^{(j)}).    (10)
Substituting (10) into (8) yields the importance weight update:

W_n^{(i)} = \frac{p(y_n | x_n^{(i)}) \sum_{j=1}^{N_p} \tilde{W}_{n-1}^{(j)} \, p(x_n^{(i)} | x_{n-1}^{(j)})}{q(x_n^{(i)} | y_n)}.    (11)
Using the Gaussian approximation proposal (from the EKF), we can draw samples {x_n^{(i)}} from q = N(\hat{x}_{n|n}, P_{n|n}), where P_{n|n} is obtained from (6). Equation (11) naturally combines the prior and likelihood information, using the previous weights and samples as well as the observation y_n. Finally, the master filter produces a (second-stage) mean estimate \bar{x}_n = \sum_{i=1}^{N_p} \tilde{W}_n^{(i)} x_n^{(i)}.
In summary, a complete step of the TPF runs as follows:

1. At time n, for j = 1, ..., N_p, given x_{n-1}^{(j)} (obtained from the master filter in the previous step) and y_n, run the EKF updates (for the slave filter) to calculate the approximate Gaussian sufficient statistics (\hat{x}_{n|n}, P_{n|n}).
2. Draw N_p samples {x_n^{(i)}} from N(\hat{x}_{n|n}, P_{n|n}).
3. For the master filter, for i = 1, ..., N_p, calculate the importance weights via (11), and normalize them to get {\tilde{W}_n^{(i)}}.
4. Calculate the second-stage estimate: \bar{x}_n = \sum_{i=1}^{N_p} \tilde{W}_n^{(i)} x_n^{(i)}.
5. Calculate \hat{N}_{eff}; if \hat{N}_{eff} < N_p/2, perform resampling.
6. Copy the N_p particles to the slave filter for the next step.
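Steps 1-4 above can be sketched as a single function. This is a hedged sketch, not the authors' code: it assumes additive Gaussian noises with covariances Q (state) and R = Σ_v (measurement), so the mixture prior in (11) has Gaussian components, and the callables f, g, C_jac are hypothetical model hooks:

```python
import numpy as np

def tpf_step(x_prev, w_prev, yn, f, g, C_jac, Q, R, rng):
    """One TPF step: the slave EKF builds the Gaussian proposal N(x_{n|n}, P_{n|n});
    the master filter samples it and weights the samples via (11)."""
    Np = x_prev.shape[0]
    # --- slave filter: propagate every particle, EKF-update the mean, (5)-(6) ---
    x_pred = np.array([f(xi) for xi in x_prev])
    m = w_prev @ x_pred                                   # predicted mean
    P_pred = np.atleast_2d(np.cov(x_pred.T, aweights=w_prev)) + Q  # sample covariance
    C = C_jac(m)                                          # measurement Jacobian
    K = P_pred @ C.T @ np.linalg.inv(C @ P_pred @ C.T + R)
    m_upd = m + K @ (yn - g(m))                           # update (5)
    P_upd = P_pred - K @ C @ P_pred                       # update (6)
    # --- master filter: sample the Gaussian proposal and weight via (11) ---
    x_new = rng.multivariate_normal(m_upd, P_upd, size=Np)

    def gauss(d, S):  # unnormalized Gaussian density; constants cancel on normalization
        Si = np.linalg.inv(S)
        return np.exp(-0.5 * np.einsum('...i,ij,...j', d, Si, d))

    prior = np.array([np.sum(w_prev * gauss(xi - x_pred, Q)) for xi in x_new])
    lik = gauss(yn - np.array([g(xi) for xi in x_new]), R)
    prop = gauss(x_new - m_upd, P_upd)
    w = lik * prior / np.maximum(prop, 1e-300)
    w /= w.sum()
    return x_new, w, w @ x_new                            # particles, weights, estimate
```

Note how the mixture term of (11) makes every previous particle contribute to each new weight, which is the cooperative coupling between the two filters.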
Table 1. Experimental results of bearing-only target tracking based on 50 Monte Carlo runs (excluding the divergent trials).

Filter                      MSE       NMSE     diver. rate
EKF                         0.0026    0.0168   0/50
UKF                         0.0045    0.0232   0/50
SIR-prior (N_p = 100)       0.0021    0.0150   3/50
SIR-gradient (N_p = 100)    0.00014   0.0078   1/50
TPF-EKF (N_p = 30)          0.00006   0.0009   0/50
TPF-UKF (N_p = 30)          0.00006   0.0008   0/50
3. TRACKING APPLICATIONS

Bearing-Only Tracking: First, we consider a bearing-only tracking benchmark problem [3]. Let (ν, ν̇, η, η̇) denote the x-y positions and velocities of a moving target. The state-space equations are as follows:

x_{n+1} = F x_n + C d_n,
y_n = \tan^{-1}(η_n / ν_n) + v_n,

where x_n = [ν_n, ν̇_n, η_n, η̇_n]^T, d_n = [d_{1,n}, d_{2,n}]^T, and
F = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{bmatrix},    C = \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{bmatrix}.
The observation is a noisy bearing, with d_n ~ N(0, 0.001^2 I) and v_n ~ N(0, 0.005^2). The goal is to reconstruct the trajectory x_{0:n} given the observed bearings y_{0:n} and the initial condition x_0 = [-0.05, 0.001, 0.7, -0.055]^T. The priors of the particle filters are set up as p(x_0) ~ N(μ_0, diag{0.05^2, 0.005^2, 0.03^2, 0.01^2}) with μ_0 = E_{p(x_0)}[x_0] = [-0.06, 0.0015, 0.65, -0.05]^T. Note that here the prior and likelihood are both peaked.
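As a sanity check, the benchmark model above can be simulated directly. The generator below is an illustrative sketch (arctan2 is one way to implement the tan^{-1} bearing; the noise levels default to the values quoted in the text):

```python
import numpy as np

# Bearing-only benchmark: linear state equation, noisy arctan bearing.
F = np.array([[1., 1., 0., 0.],
              [0., 1., 0., 0.],
              [0., 0., 1., 1.],
              [0., 0., 0., 1.]])
C = np.array([[1., 0.],
              [1., 0.],
              [0., 1.],
              [0., 1.]])

def simulate(n_steps, x0, rng, sd=0.001, sv=0.005):
    """Generate a trajectory and bearings: d_n ~ N(0, sd^2 I), v_n ~ N(0, sv^2)."""
    x, xs, ys = np.asarray(x0, float), [], []
    for _ in range(n_steps):
        x = F @ x + C @ (sd * rng.standard_normal(2))       # state equation
        xs.append(x)
        ys.append(np.arctan2(x[2], x[0]) + sv * rng.standard_normal())  # bearing
    return np.array(xs), np.array(ys)
```

Running any of the filters in Table 1 against such a simulated trajectory reproduces the qualitative behavior described below.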
The experimental comparison covers (i) the EKF; (ii) the unscented Kalman filter (UKF); (iii) the SIR filter with a prior proposal; (iv) the gradient proposal particle filter; (v) the TPF with an EKF as slave filter; and (vi) the TPF with a UKF as slave filter. A total of 50 Monte Carlo experiments with different random seeds are performed for each filtering scheme, using different numbers of particles. The error metrics of interest are the mean-squared error (MSE), the normalized MSE (NMSE), and the divergence rate.¹ Experimental results are summarized in Table 1. As seen, the two proposed particle filtering schemes outperform the conventional bootstrap filter. In this example, the TPF produces the best tracking result, but no big difference is observed between (v) and (vi), partially because the state equation is linear and Gaussian. Using the same number of particles, the gradient proposal clearly produces better results than the prior proposal; however, as expected, when N_p gradually increases, the difference between them shrinks, as confirmed in our experiments (see Fig. 2).
Tracking with a Coordinated Turn: Next, we consider typical target tracking through a coordinated turn (CT), where the state-space model is described by a stochastic differential equation (SDE) and approximated by a 2nd-order weak Taylor approximation. See [5] for background and details. Let x_t = [ξ_t, ς_t, s_t, θ_t, ω_t]^T

¹By divergence we mean that the filter deviates from the target trajectory and is unable to come back to the true track.
Fig. 2. Performance (NMSE, ×10^{-3}) comparison between the SIR and gradient proposal particle filters using varying numbers of particles (N_p = 100, 200, 500), each based on 20 independent Monte Carlo runs.
denote the state vector containing the target position in x and y coordinates, the target speed, heading, and turn rate. Under the SDE framework, the constant speed and turn rate (of the ideal CT model) are instead modeled as Wiener processes. Specifically, the 2nd-order Taylor approximation of the continuous-time state equation is described as [5]:

x_t = f(x_τ) + G(x_τ) w_t,    (12)

where δ = t - τ (we use δ = 1 in discretization), and
f(x_τ) = \begin{pmatrix} ξ_τ + δ s_τ \cos(θ_τ) - δ^2 s_τ ω_τ \sin(θ_τ)/2 \\ ς_τ + δ s_τ \sin(θ_τ) + δ^2 s_τ ω_τ \cos(θ_τ)/2 \\ s_τ \\ θ_τ + δ ω_τ \\ ω_τ \end{pmatrix},

G(x_τ) = E(x_τ) V_δ, with

E(x_τ) = \begin{pmatrix} σ_s \cos(θ_τ) & 0 & 0 & 0 \\ σ_s \sin(θ_τ) & 0 & 0 & 0 \\ 0 & 0 & σ_s & 0 \\ 0 & σ_ω & 0 & 0 \\ 0 & 0 & 0 & σ_ω \end{pmatrix},    V_δ = \begin{pmatrix} \sqrt{δ^3/3} & 0 \\ \sqrt{3δ}/2 & \sqrt{δ}/2 \end{pmatrix} \otimes I_{2×2},

where \otimes denotes the Kronecker product, and w_t is a Wiener process approximated by standardized white Gaussian noise N(0, I_{4×4}).
The measurement equation consists of a range-bearing pair:

y_t = \left[ \sqrt{ξ_t^2 + ς_t^2}, \; \tan^{-1}(ς_t/ξ_t) \right]^T + v_t,    (13)
where the Gaussian measurement noise v_t ~ N(0, Σ_v) is independent of the initial state and w_t. The data trajectory was generated using the Euler approximation with a sampling period of 1 second and 1000 intervals per sampling instant. Measurements are collected for 200 seconds with a constant sampling period. The noise and initial parameters in our experiment are as follows: σ_s^2 = 1/5, σ_ω^2 = 5 × 10^{-7}, Σ_v = diag{100, (π/180)^2}, and x_0 ~ N(μ_0, Σ_0) with

μ_0 = [1000, 2650, 150, π/2, -π/45]^T,
Σ_0 = diag{400, 400, 25, (5π/180)^2, (0.2π/180)^2}.
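For illustration, the deterministic drift of (12) and the range-bearing map of (13) can be coded directly. This is a best-effort reading of the model (the second-order correction terms in particular are reconstructed from the text), not the authors' code:

```python
import numpy as np

def ct_transition(x, delta=1.0):
    """Deterministic drift f(x) of the 2nd-order weak Taylor CT model (12).
    State x = [xi, zeta, s, theta, omega]: x/y position, speed, heading, turn rate."""
    xi, zeta, s, theta, omega = x
    return np.array([
        xi + delta * s * np.cos(theta) - 0.5 * delta**2 * s * omega * np.sin(theta),
        zeta + delta * s * np.sin(theta) + 0.5 * delta**2 * s * omega * np.cos(theta),
        s,                        # speed drift is zero (Wiener perturbation only)
        theta + delta * omega,    # heading advances by the turn rate
        omega,                    # turn-rate drift is zero
    ])

def ct_measurement(x):
    """Range-bearing pair of (13)."""
    xi, zeta = x[0], x[1]
    return np.array([np.hypot(xi, zeta), np.arctan2(zeta, xi)])
```

Either the gradient MOVE-step or the TPF's EKF proposal only needs these two maps plus their Jacobians.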
[Figure 3 plots, over 200 time steps, the x-axis and y-axis positions, the target speed, the target heading, and the turn rate.]

Fig. 3. One typical result of the tracking-with-CT experiment using the gradient proposal particle filter (N_p = 500, RMSE = 87.5).
Note that here the dynamic noise is peaked whereas the measurement noise is rather flat. Table 2 shows the comparison among the bootstrap filter (with prior proposal), the SIR filter with gradient proposal, and the TPF, with varying numbers of particles. Fig. 3 shows one typical tracking result.
MIMO Wireless Channel Tracking: Finally, we study a real-life MIMO wireless channel tracking problem (actually a joint channel/symbol estimation problem, but we ignore the symbol decoding part due to space limitations) [4]. The real-life narrowband MIMO wireless channel data were recorded in midtown Manhattan, New York City, in January 2001. In particular, the state equation of the channel can be described by a first-order AR model driven by non-Gaussian noise (such as a mixture of Gaussians), whereas the measurement equation is described as [4]:

y_{j,n} = \sum_{k} s_{k,n} x_{jk,n} + v_{j,n},
where s_{k,n} is the block of encoded symbols radiated by the kth transmitter at time n; x_{jk,n} is the channel coefficient from the kth transmitter to the jth receiver at time n; y_{j,n} is the signal observed at the input of the jth receiver; and v_{j,n} is the measurement noise at the input of the jth receiver at time n, modeled by a Middleton Class A noise model. See [4] for more details.
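The measurement model above can be sketched as follows. mimo_observe implements the per-receiver sum over transmitters; impulsive_noise is a common two-component Gaussian-mixture stand-in for impulsive (Middleton Class A-like) noise, whose parameters p, sigma, and kappa are illustrative values, not taken from the paper:

```python
import numpy as np

def mimo_observe(S, X, noise):
    """MIMO measurement: y_{j,n} = sum_k s_{k,n} x_{jk,n} + v_{j,n}.
    S: (Nt,) symbols at time n; X: (Nr, Nt) channel matrix; noise: (Nr,) samples."""
    return X @ S + noise

def impulsive_noise(Nr, rng, p=0.1, sigma=0.05, kappa=10.0):
    """Two-component Gaussian mixture: with probability p the noise variance
    is inflated by kappa, mimicking impulsive behavior (illustrative parameters)."""
    impulsive = rng.random(Nr) < p
    scale = np.where(impulsive, np.sqrt(kappa) * sigma, sigma)
    return scale * rng.standard_normal(Nr)
```

Any noise generator can be swapped in for impulsive_noise; the gradient proposal only requires that the likelihood of the measurement model be differentiable in the channel coefficients.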
We have compared different trackers (including the Kalman filter, mixture Kalman filters, and particle filters) [4], but here we only highlight the efficiency of the gradient proposal in this real-life application. As seen from Table 3, the proposed gradient proposal particle filter produces much better results (in terms of MSE as well as symbol error rate) than the conventional SIR filter, especially when using a small number of particles; that is, the particles drawn from the gradient proposal are more informative. This phenomenon was also evident in the previous two applications. In terms of relative complexity and performance gain, compared to the Kalman filter (with complexity 1), the relative complexity factors of the SIR filter (with 100 particles) and our gradient SIR filter (with 20 particles) are 3.2 and 1.5, respectively, while their performance gains are 2.6 and 2.7, respectively. This gain is quite significant when complexity is a concern in industrial practice.
Table 2. RMSE ≡ \sqrt{\|ξ - \hat{ξ}\|^2 + \|ς - \hat{ς}\|^2} performance comparison (based on 20 Monte Carlo runs) with varying N_p.

Filter                       RMSE
SIR-prior (N_p = 500)        144.3
SIR-prior (N_p = 1000)       91.6
SIR-prior (N_p = 2000)       81.9
SIR-gradient (N_p = 200)     142.3
SIR-gradient (N_p = 500)     90.2
SIR-gradient (N_p = 1000)    74.3
TPF-EKF (N_p = 100)          158.7
Table 3. MSE of the 2-by-2 MIMO wireless channel estimate and symbol error rate (SER) for various numbers of particles (at 10 dB SNR), based on 100 Monte Carlo runs.

Number of         SIR-prior          SIR-gradient
particles, N_p    MSE      SER       MSE      SER
10                0.0615   0.0681    0.0233   0.0353
20                0.0431   0.0460    0.0206   0.0305
40                0.0338   0.0387    0.0196   0.0291
100               0.0257   0.0301    0.0188   0.0275
200               0.0227   0.0285    0.0183   0.0272
4. CONCLUDING REMARKS

We have proposed two improved particle filtering schemes and demonstrated their potential merits in three tracking applications. In particular, the ad hoc gradient proposal is quite efficient in various noise scenarios (especially with small Σ_v), whereas the TPF usually works well when the variances Σ_d and Σ_v are both small (since the deterministic EKF gradually reduces the state-error variance). Ultimately, the performance/complexity tradeoff of particle filtering is central: choosing a better proposal distribution (with more computation per step) might, perhaps counterintuitively, reduce the total complexity (in terms of the required number of particles). In practice, we might design decision criteria for using different particle filters in different scenarios; in other words, we should study the problem first and then choose a problem-specific filter.

Acknowledgements: The authors thank Kris Huber for assistance in the wireless channel tracking investigation. Z. C. was financially supported by an NSERC grant of Prof. Simon Haykin.
5. REFERENCES

[1] A. Doucet, N. de Freitas, and N. Gordon, Eds., Sequential Monte Carlo Methods in Practice, Springer, 2001.
[2] A. Doucet, S. Godsill, and C. Andrieu, "On sequential Monte Carlo sampling methods for Bayesian filtering," Statist. Comput., vol. 10, pp. 197-208, 2000.
[3] N. Gordon, D. Salmond, and A. F. M. Smith, "Novel approach to nonlinear/non-Gaussian Bayesian state estimation," IEE Proc.-F, vol. 140, pp. 107-113, 1993.
[4] S. Haykin, K. Huber, and Z. Chen, "Bayesian sequential state estimation for MIMO wireless communication," Proc. IEEE, vol. 92, no. 3, pp. 439-454, March 2004.
[5] M. R. Morelande and N. J. Gordon, "Target tracking through a coordinated turn," manuscript submitted to ICASSP 2005.