State Estimation for Linear Systems with Additive Cauchy Noises:
Optimal and Suboptimal Approaches
Robert Fonod1, Moshe Idan1 and Jason L. Speyer2
Abstract: Only a few estimation methods can converge in the presence of impulsive measurement and/or process noises without the use of augmented heuristic schemes. To understand the performance of such schemes, the optimal Idan/Speyer Cauchy Estimator (ISCE) is compared with the particle filter (PF) and the Gaussian sum filter (GSF) as the convergence time of these estimators is allowed to increase. That is, the number of particles at each step for the PF and the number of Gaussian components at each step for the GSF are increased, and their performance relative to the ISCE is numerically studied for scalar and two-state dynamic systems.
I. INTRODUCTION
Impulsive measurement and process noises in stochastic state estimation have typically been handled by heuristic schemes that augment the estimation process. Recently, an analytical recursive nonlinear estimation scheme for multivariate linear systems with additive Cauchy distributed measurement and process noises has been developed. Cauchy uncertainties are impulsive, and this filter is optimal in that it generates, in closed form, the characteristic function (CF) of the unnormalized conditional probability density function of the state given the measurement history [1], [2]. From this characteristic function, the conditional mean and conditional error variance in the presence of Cauchy distributed noise are obtained. However, there are several general estimation algorithms that may also be able to deal with impulsive noises [3]–[7]. Although those approaches are suboptimal, they may offer reasonable approximations for the heavy-tailed, Cauchy distributed noise environment as well.
Two of the most popular approximations are the particle filter (PF) and the Gaussian sum filter (GSF), as both are shown to converge to the correct conditional density of the state as the number of terms increases. In practice, they are implemented with some degree of approximation, producing a tradeoff between numerical efficiency and estimation performance when constructing the conditional probability density function of the state given the measurement history and the resulting conditional mean and variance. Our objective is to compare the performance of these approximate filters to that of the optimal Cauchy filter. Performance is measured by restricting the filter approximation to an average computation time interval at each measurement update.
This work was supported by the United States-Israel Binational Science
Foundation, Grant 2012122.
1 Robert Fonod and Moshe Idan are with the Department of Aerospace Engineering, Technion - Israel Institute of Technology, Haifa, 3200003, Israel. Email: {robert.fonod; moshe.idan}@technion.ac.il
2 Jason L. Speyer is with the Department of Mechanical and Aerospace Engineering, University of California, Los Angeles (UCLA), Los Angeles, 90095, USA. Email: speyer@g.ucla.edu
Although the closed-form analytical solution of the Cauchy filter provides the exact minimum variance estimates of the system states given a measurement sequence, its computational complexity and memory burden become very high, requiring an approximation in its implementation. A sliding window of data is used to limit the computation of the Cauchy filter. It is shown numerically that there is little difference in the conditional mean and variance for a window of six, eight, or even ten. Clearly, a larger window size implies a higher computational burden and larger memory requirements. The performance of the PF and GSF is tested and compared with that of the Cauchy estimator with a window of six and then eight.
II. PROBLEM FORMULATION
Consider a discrete-time, single-input-single-output, multivariate, and time-invariant linear system described by
x_{k+1} = \Phi x_k + \Gamma w_k,   (1)
z_k = H x_k + v_k,   (2)
with state vector $x_k \in \mathbb{R}^n$, scalar measurement $z_k$, and known matrices $\Phi \in \mathbb{R}^{n\times n}$, $\Gamma \in \mathbb{R}^{n\times 1}$, and $H \in \mathbb{R}^{1\times n}$. The noise inputs $w_k$ and $v_k$ are independent Cauchy distributed random variables with zero median and scaling parameters $\beta > 0$ and $\gamma > 0$, respectively. Their probability density functions (PDFs) and their characteristic functions are denoted $p$ and $\phi$, respectively, and are assumed to be time independent and given by
p_W(w_k) = \frac{\beta/\pi}{w_k^2 + \beta^2}, \quad \phi_W(\bar\nu) = e^{-\beta|\bar\nu|},   (3)
p_V(v_k) = \frac{\gamma/\pi}{v_k^2 + \gamma^2}, \quad \phi_V(\bar\nu) = e^{-\gamma|\bar\nu|},   (4)
where $\bar\nu$ is a scalar spectral variable.
The initial conditions at $k = 1$ are also assumed to be independent Cauchy distributed random variables. Specifically, each $i$-th element $x_{1i}$ of the initial state vector $x_1$ has a Cauchy PDF with a given median $\bar{x}_{1i}$ and a scaling parameter $\alpha_i > 0$, $i = 1,\dots,n$. The joint PDF of the initial conditions and its characteristic function are given by
p_{X_1}(x_1) = \prod_{i=1}^{n} \frac{\alpha_i/\pi}{(x_{1i} - \bar{x}_{1i})^2 + \alpha_i^2}, \quad \phi_{X_1}(\nu) = \prod_{i=1}^{n} e^{-\alpha_i|\nu_i| + j\bar{x}_{1i}\nu_i},   (5)
where $\nu_i$ is an element of the spectral variable $\nu \in \mathbb{R}^n$.
The measurement history used in the estimation problem formulation is defined as $z_{1:k} = \{z_1,\dots,z_k\}$. The objective is to compute the minimum variance estimate of $x_k$ given the measurement history $z_{1:k}$.
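As a point of reference for the filters compared below, the model (1)-(2) is straightforward to simulate. The following Python sketch is illustrative only (it is not the authors' code); it draws the noises from NumPy's standard Cauchy sampler scaled by $\beta$ and $\gamma$, using the scalar parameters that appear later in Section VI:

```python
import numpy as np

def simulate(Phi, Gam, H, x1, beta, gamma, steps, rng):
    """Simulate x_{k+1} = Phi x_k + Gam w_k, z_k = H x_k + v_k with
    zero-median Cauchy noises: scale beta (process), gamma (measurement)."""
    n = Phi.shape[0]
    xs = np.zeros((steps, n))
    zs = np.zeros(steps)
    x = x1.astype(float).copy()
    for k in range(steps):
        xs[k] = x
        zs[k] = H @ x + gamma * rng.standard_cauchy()                 # (2)
        x = Phi @ x + Gam.flatten() * (beta * rng.standard_cauchy())  # (1)
    return xs, zs

rng = np.random.default_rng(0)
# Scalar parameters from Section VI: Phi = 0.75, Gamma = 1, H = 2
Phi = np.array([[0.75]]); Gam = np.array([[1.0]]); H = np.array([2.0])
xs, zs = simulate(Phi, Gam, H, np.array([0.0]), beta=0.1, gamma=0.2,
                  steps=100, rng=rng)
```

The occasional huge draws of `standard_cauchy` are exactly the impulsive outliers that the compared estimators must cope with.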
2016 European Control Conference (ECC)
June 29 - July 1, 2016. Aalborg, Denmark
978-1-5090-2591-6 ©2016 EUCA
III. OPTIMAL SOLUTION - CAUCHY FILTER
In this section, a brief overview of the minimum variance Cauchy estimator, also known as the Idan/Speyer Cauchy Estimator (ISCE) [8], for systems described by (1) and (2) is given, summarizing the main results of [1], [2], [9].
A. Scalar ISCE - PDF Approach
The original work derived the ISCE for a single-state system using the PDF approach [9]. It was shown that, under mild conditions on the system parameters, the PDF of the state at time step $k$ given $z_{1:k}$ can be expressed as
p(x_k \mid z_{1:k}) = \sum_{i=1}^{k+2} \frac{a^i_{k|k} x_k + b^i_{k|k}}{(x_k - \sigma^i_{k|k})^2 + (\omega^i_{k|k})^2}.   (6)
Initialization and update rules for the series coefficients $a^i_{k|k}$, $b^i_{k|k}$, $\sigma^i_{k|k}$, and $\omega^i_{k|k}$ are to be found in [9]. It is apparent that all of the series coefficients must be updated at each time step, and that each measurement update increases the number of terms in the series. To avoid such impractical growth, [9] describes a truncation procedure that limits the number of terms in the series to a specified buffer length.
The approach above provides a closed-form expression for $p(x_k \mid z_{1:k})$, which can be examined for its shape, as will be shown in this paper. Unfortunately, this approach was found to be insufficient when addressing multivariate systems due to its derivation specifics (partial fraction expansion of single-variable rational functions). As an alternative, the successful approach to addressing multivariate systems utilizes the CF of the PDF of interest. The ISCE for single-state systems was re-derived using this approach in [1].
B. Multivariate ISCE - CF Approach
In this approach, instead of propagating $p(x_k \mid z_{1:k})$, its unnormalized characteristic function, given by
\bar\phi_{x_k|z_{1:k}}(\nu) = \int p(x_k, z_{1:k})\, e^{j\nu^T x_k}\, dx_k,   (7)
is propagated, while the normalization factor can be obtained as $p(z_{1:k}) = \bar\phi_{x_k|z_{1:k}}(0)$. In [2] it was shown that (7) can be expressed as a growing sum of terms of the form
\bar\phi_{x_k|z_{1:k}}(\nu) = \sum_{i=1}^{n_t(k|k)} g^{k|k}_i\big(y^{k|k}_{g_i}(\nu)\big) \exp\big(y^{k|k}_{e_i}(\nu)\big),   (8)
i.e., a sum of exponential terms, each multiplied by a coefficient function $g(\cdot)$. The arguments of these coefficient functions, $y^{k|k}_{g_i}(\nu)$, and of the exponents, $y^{k|k}_{e_i}(\nu)$, are complicated functions of the spectral vector $\nu$. The details of the various parameters and functions of the above expression can be found in [2]. Those functions and parameters can also be used to determine the state estimate and the estimation error covariance matrix in closed form.
The main difficulty with the above results is that the number of terms $n_t(k|k)$ grows rapidly with time. A windowing technique that reduces the number of terms using an approximation was recently presented in [10] for the two-state case. This version of the ISCE will be used in comparing the performance of the various filters in this study.
IV. SUBOPTIMAL SOLUTION - PARTICLE FILTER
The particle filter, also known as the sequential Monte Carlo (MC) method, is a family of algorithms implementing recursive Bayesian estimation based on a point mass representation of probability densities. For good surveys, see [5], [6], [11].
A. Sequential Importance Sampling
Most PFs are based on an algorithm known as sequential importance sampling (SIS), which is a MC technique for solving the Bayesian inference problem [6]. The main idea is to represent the required posterior $p(x_k \mid z_{1:k})$ using a set of random samples with associated weights. Let $\{x^i_k, \mu^i_k\}_{i=1}^{n_p}$ be such an approximation, where $\{x^i_k\}_{i=1}^{n_p}$ is a set of support points (particles) with associated weights $\{\mu^i_k\}_{i=1}^{n_p}$, and $n_p$ is the total number of particles. Then, using the principle of importance sampling and Bayes' rule, the posterior at time $k$ can be approximated as [6]
p(x_k \mid z_{1:k}) \approx \sum_{i=1}^{n_p} \mu^i_k\, \delta(x_k - x^i_k),   (9)
where $\delta(\cdot)$ is the delta function. The weights are updated as
\mu^i_k \propto \mu^i_{k-1} \frac{p(z_k \mid x^i_k)\, p(x^i_k \mid x^i_{k-1})}{q(x^i_k \mid x^i_{k-1}, z_k)}.   (10)
The symbol $\propto$ in (10) stands for "proportional to" and $q(x^i_k \mid x^i_{k-1}, z_k)$ is the chosen importance density used for sampling. Weights are normalized such that $\sum_{i=1}^{n_p} \mu^i_k = 1$. Based on the strong law of large numbers, the approximated posterior (9) approaches the true $p(x_k \mid z_{1:k})$ as $n_p \to \infty$.
The SIS algorithm thus consists of recursive propagation of the weights and particles as each measurement is received sequentially. The minimum variance state estimate $\hat{x}_k$ and the minimum error covariance $P_k$ can be approximated by
\hat{x}_k \approx \sum_{i=1}^{n_p} \mu^i_k x^i_k,   (11a)
P_k \approx \sum_{i=1}^{n_p} \mu^i_k (x^i_k - \hat{x}_k)(x^i_k - \hat{x}_k)^T.   (11b)
B. Choice of the Importance Density
The optimal importance density $q(x^i_k \mid x^i_{k-1}, z_k)$ was shown to be $p(x_k \mid x^i_{k-1}, z_k)$ [5]. This importance density, however, is not always known and thus can be used only in special cases, e.g., a scalar system with Gaussian measurement and process noises. Hence, the most widely used density is the a priori distribution [6]
q(x_k \mid x^i_{k-1}, z_k) = p(x_k \mid x^i_{k-1}).   (12)
In this case, (10) reduces to
\mu^i_k \propto \mu^i_{k-1}\, p(z_k \mid x^i_k).   (13)
Note that this choice of importance density is independent of the measurement $z_k$. Thus, the state space is explored without any knowledge of the observation, and this choice can fail if a new measurement appears in the tail of the prior or if the likelihood is too peaked in comparison to the prior. This strategy aggravates a well-known problem of the SIS algorithm, known as the degeneracy problem [6].
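For the Cauchy-noise system of Section II, the bootstrap choice (12)-(13) is simple to sketch: particles are propagated through the Cauchy process-noise prior and reweighted by the Cauchy measurement likelihood of (4). The Python fragment below is a hypothetical illustration, not the implementation used in the paper:

```python
import numpy as np

def pf_step(parts, mu, z, Phi, Gam, H, beta, gamma, rng):
    """One bootstrap-PF step: sample from the prior (12), then reweight
    by the Cauchy measurement likelihood as in (13) and normalize."""
    n_p = parts.shape[0]
    w = beta * rng.standard_cauchy(size=n_p)             # Cauchy process noise
    parts = parts @ Phi.T + np.outer(w, Gam.flatten())   # prior sampling
    resid = z - parts @ H                                # per-particle innovation
    like = (gamma / np.pi) / (resid**2 + gamma**2)       # Cauchy likelihood, cf. (4)
    mu = mu * like
    mu = mu / mu.sum()
    xhat = mu @ parts                                    # conditional mean, (11a)
    return parts, mu, xhat

rng = np.random.default_rng(1)
Phi = np.array([[0.75]]); Gam = np.array([[1.0]]); H = np.array([2.0])
parts = 0.5 * rng.standard_cauchy(size=(500, 1))         # samples of p_X1, alpha = 0.5
mu = np.full(500, 1.0 / 500)
parts, mu, xhat = pf_step(parts, mu, z=0.3, Phi=Phi, Gam=Gam, H=H,
                          beta=0.1, gamma=0.2, rng=rng)
```

A heavy-tailed likelihood such as this one concentrates weight on very few particles after an outlier, which is exactly the degeneracy discussed next.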
C. Degeneracy Problem
The degeneracy problem is a common phenomenon whereby, after a few iterations, all but one particle have negligible weight. This degeneracy results in a very poor approximation of $p(x_k \mid z_{1:k})$ and a breakdown of the algorithm. It should be noted that this phenomenon occurs even if the optimal importance density is used, but it is more severe when using the prior density $p(x_k \mid x^i_{k-1})$.
A suitable measure of the degeneracy of the algorithm is the effective sample size $n^{\mathrm{eff}}_p$, defined in [6] as
n^{\mathrm{eff}}_p = \frac{n_p}{1 + \mathrm{Var}(\mu^{*i}_k)},   (14)
where $\mu^{*i}_k = p(x^i_k \mid z_{1:k}) / q(x^i_k \mid x^i_{k-1}, z_k)$ is referred to as the "true weight". This measure cannot be evaluated because $q(x^i_k \mid x^i_{k-1}, z_k)$ is usually unknown. Therefore, it is normally estimated by $\hat{n}^{\mathrm{eff}}_p = 1 / \sum_{i=1}^{n_p} (\mu^i_k)^2$, where $\mu^i_k$ is the normalized weight obtained using (13). Here $1 \le n^{\mathrm{eff}}_p \le n_p$, where the upper bound is attained when all particles have the same weight, and the lower bound when all probability mass is at one particle. Note that a small $n^{\mathrm{eff}}_p$ indicates severe degeneracy. The most common solution to the degeneracy problem is the use of resampling.
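The estimate $\hat{n}^{\mathrm{eff}}_p = 1/\sum_i (\mu^i_k)^2$ is a one-liner; the sketch below (illustrative, assuming normalized weights) also checks its two limiting cases:

```python
import numpy as np

def n_eff_hat(mu):
    """Estimated effective sample size 1 / sum_i (mu_k^i)^2 for normalized
    weights: equals n_p for uniform weights, 1 when one particle has all mass."""
    mu = np.asarray(mu, dtype=float)
    return 1.0 / np.sum(mu**2)

assert np.isclose(n_eff_hat(np.full(100, 0.01)), 100.0)  # uniform -> n_p
assert np.isclose(n_eff_hat([1.0, 0.0, 0.0]), 1.0)       # degenerate -> 1
```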
D. Resampling
Resampling discards particles that have low importance weights, as they do not contribute to the approximation, and multiplies particles that have high weights [7]. This is done by mapping the weighted measure $\{x^i_k, \mu^i_k\}_{i=1}^{n_p}$ to an unweighted measure $\{\tilde{x}^i_k, n_p^{-1}\}_{i=1}^{n_p}$ that still provides an approximation of $p(x_k \mid z_{1:k})$. In other words, this process concentrates the particles in the more important regions of the state space. To prevent degeneracy, an appropriate resampling procedure shall be utilized whenever $\hat{n}^{\mathrm{eff}}_p$ falls below a fixed threshold $n^t_p$.
Several resampling schemes exist, e.g., systematic, multinomial, stratified, residual, and regularized resampling. The choice of the resampling scheme affects the computational load as well as the MC approximation error; see the discussion in [6], [12]. In the current study, only the systematic resampling strategy was considered, as it was shown empirically to outperform the other methods for the Cauchy case.
Systematic resampling is among the techniques preferred by many authors and PF users since it is simple to implement, takes $O(n_p)$ time, and minimizes the MC variation [6]. Assume that the weights $\mu^i_k$ are normalized before resampling, i.e., $\sum_{i=1}^{n_p} \mu^i_k = 1$. The basic idea is to generate a random number $u^1_k$ from the uniform distribution on $(0, 1/n_p]$. The remaining comparison points $u^j_k$, $j > 1$, are obtained deterministically according to $u^j_k = u^1_k + n_p^{-1}(j-1)$, $j = 2,\dots,n_p$. Then, the $u^j_k$ are used to select particles from $\{x^i_k, \mu^i_k\}_{i=1}^{n_p}$. In the $j$-th selection, the particle $x^i_k$ is chosen when the following condition is satisfied:
c^{i-1}_k < u^j_k \le c^i_k,   (15)
where $c^i_k = \sum_{l=1}^{i} \mu^l_k$. The probability of selecting $x^i_k$ is the same as that of $u^j_k$ being in the interval bounded by the cumulative sum of the normalized weights, as shown in (15).
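The selection rule (15) can be implemented with a cumulative sum and a sorted search. The sketch below is an illustration (assuming normalized weights); `np.searchsorted` with its default left-insertion semantics returns exactly the smallest $i$ with $u \le c^i_k$:

```python
import numpy as np

def systematic_resample(parts, mu, rng):
    """Systematic resampling: one uniform draw u_1 in [0, 1/n_p), the other
    comparison points u_j = u_1 + (j-1)/n_p; draw j selects particle i when
    c_{i-1} < u_j <= c_i, with c_i the cumulative weight sum of (15)."""
    n_p = len(mu)
    u = rng.uniform(0.0, 1.0 / n_p) + np.arange(n_p) / n_p
    c = np.cumsum(mu)
    idx = np.searchsorted(c, u)          # smallest i with u_j <= c_i
    idx = np.minimum(idx, n_p - 1)       # guard against float round-off
    return parts[idx], np.full(n_p, 1.0 / n_p)

rng = np.random.default_rng(2)
parts = np.array([[0.0], [1.0], [2.0], [3.0]])
mu = np.array([0.7, 0.1, 0.1, 0.1])
new_parts, new_mu = systematic_resample(parts, mu, rng)
```

With these weights, the dominant particle is necessarily duplicated, which is the intended concentration of particles in the important region.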
V. SUBOPTIMAL SOLUTION - GAUSSIAN SUM FILTER
Here the PDFs of the noise sequences and the initial
state are treated using a Gaussian sum approximation. The
presented filtering scheme is a generalization of the well-
known GSF algorithm of Sorenson and Alspach [3].
A. Gaussian Sum Approximation
Let $N(x; \bar{x}, P)$ denote a Gaussian PDF with argument $x$, mean $\bar{x}$, and covariance matrix $P$. Suppose that the stationary Cauchy PDFs given by (3), (4), and (5) can be approximated arbitrarily closely by finite linear combinations of Gaussian PDFs as
p_W(w_k) \approx \sum_{i=1}^{n_w} \mu^i_w N(w_k; \bar{w}^i, Q^i),   (16a)
p_V(v_k) \approx \sum_{i=1}^{n_v} \mu^i_v N(v_k; \bar{v}^i, R^i),   (16b)
p_{X_1}(x_1) \approx \sum_{i=1}^{n_{1|0}} \mu^i_{1|0} N(x_1; \bar{x}^i_{1|0}, P^i_{1|0}),   (16c)
where $\mu^i_w$, $\mu^i_v$, and $\mu^i_{1|0}$ are nonnegative weighting factors satisfying $\sum_{i=1}^{n_w} \mu^i_w = \sum_{i=1}^{n_v} \mu^i_v = \sum_{i=1}^{n_{1|0}} \mu^i_{1|0} = 1$. The Gaussian sum mixture is always a valid density function and converges uniformly to any density of practical concern [3].
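To make the approximation concrete, the weights of such a mixture can be fitted in a least-squares sense on a grid, as is done for the noise PDFs in Section VI. The sketch below is illustrative: the geometric spacing of the component standard deviations and the grid are assumptions, and nonnegativity of the weights is enforced with SciPy's `nnls`. It fits the measurement-noise Cauchy PDF (4) with seven zero-mean Gaussians:

```python
import numpy as np
from scipy.optimize import nnls

def gaussian(x, m, var):
    return np.exp(-0.5 * (x - m)**2 / var) / np.sqrt(2.0 * np.pi * var)

# Fit p_V(v) = (gamma/pi) / (v^2 + gamma^2), gamma = 0.2, by a weighted sum
# of zero-mean Gaussians as in (16b), solving a nonnegative least-squares
# problem on a grid of sample points.
gamma = 0.2
x = np.linspace(-10.0, 10.0, 2001)
target = (gamma / np.pi) / (x**2 + gamma**2)
sigmas = gamma * 2.0**np.arange(7)                       # 7 components, assumed spacing
A = np.stack([gaussian(x, 0.0, s**2) for s in sigmas], axis=1)
w, resid = nnls(A, target)                               # nonnegative weights
approx = A @ w
```

Widely spread variances are needed because no single Gaussian can cover both the sharp peak and the heavy tails of the Cauchy density.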
B. Measurement Update
Assume that at step $k$ the a priori density $p(x_k \mid z_{1:k-1})$ can be expressed as a finite sum of Gaussian densities
p(x_k \mid z_{1:k-1}) = \sum_{i=1}^{n_{k|k-1}} \mu^i_{k|k-1} N(x_k; \bar{x}^i_{k|k-1}, P^i_{k|k-1}).   (17)
Then, given the $k$-th measurement $z_k$, the a posteriori density $p(x_k \mid z_{1:k})$ is given by
p(x_k \mid z_{1:k}) = \sum_{i=1}^{n_{k|k-1}} \sum_{j=1}^{n_v} \tilde{\mu}^{ij}_{k|k} N(x_k; \bar{m}^{ij}_{k|k}, M^{ij}_{k|k}),   (18)
where $\bar{m}^{ij}_{k|k}$ and $M^{ij}_{k|k}$ can be computed using Kalman-like equations for all $i = 1,\dots,n_{k|k-1}$ and $j = 1,\dots,n_v$ as
\bar{z}^{ij}_k = H\bar{x}^i_{k|k-1} + \bar{v}^j,   (19a)
S^{ij}_k = H P^i_{k|k-1} H^T + R^j,   (19b)
K^{ij}_k = P^i_{k|k-1} H^T (S^{ij}_k)^{-1},   (19c)
\bar{m}^{ij}_{k|k} = \bar{x}^i_{k|k-1} + K^{ij}_k (z_k - \bar{z}^{ij}_k),   (19d)
M^{ij}_{k|k} = P^i_{k|k-1} - K^{ij}_k S^{ij}_k (K^{ij}_k)^T.   (19e)
The weight factors $\tilde{\mu}^{ij}_{k|k}$ are updated using the following rule:
\tilde{\mu}^{ij}_{k|k} = \frac{\mu^i_{k|k-1}\, \mu^j_v\, N(z_k; \bar{z}^{ij}_k, S^{ij}_k)}{\sum_{l=1}^{n_{k|k-1}} \sum_{m=1}^{n_v} \mu^l_{k|k-1}\, \mu^m_v\, N(z_k; \bar{z}^{lm}_k, S^{lm}_k)}.   (20)
It is obvious that $\tilde{\mu}^{ij}_{k|k} \ge 0$ and that $\sum_{i=1}^{n_{k|k-1}} \sum_{j=1}^{n_v} \tilde{\mu}^{ij}_{k|k} = 1$. Thus, (18) is a proper PDF. For notational convenience, the double summation in (18) is restated as
p(x_k \mid z_{1:k}) = \sum_{i=1}^{n_{k|k}} \mu^i_{k|k} N(x_k; \bar{x}^i_{k|k}, P^i_{k|k}),   (21)
where $n_{k|k} = (n_{k|k-1})(n_v)$, and $\mu^i_{k|k}$, $\bar{x}^i_{k|k}$, and $P^i_{k|k}$ are formed in an obvious fashion from $\tilde{\mu}^{ij}_{k|k}$, $\bar{m}^{ij}_{k|k}$, and $M^{ij}_{k|k}$.
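Equations (19)-(20) map directly onto code. The Python sketch below (illustrative only; the variable names are hypothetical) handles the scalar-measurement case:

```python
import numpy as np

def gsf_measurement_update(z, means, covs, mus, H, vbars, Rs, mu_v):
    """GSF measurement update (19)-(20) for a scalar measurement: run a
    Kalman-like update for every pair (prior component i, noise component j)
    and reweight by the Gaussian likelihood of z."""
    out_m, out_P, out_mu = [], [], []
    for xb, P, mu_i in zip(means, covs, mus):
        for vb, R, mu_j in zip(vbars, Rs, mu_v):
            zb = H @ xb + vb                       # (19a)
            S = H @ P @ H + R                      # (19b), scalar innovation variance
            K = P @ H / S                          # (19c)
            out_m.append(xb + K * (z - zb))        # (19d)
            out_P.append(P - np.outer(K, K) * S)   # (19e)
            lik = np.exp(-0.5 * (z - zb)**2 / S) / np.sqrt(2.0 * np.pi * S)
            out_mu.append(mu_i * mu_j * lik)       # numerator of (20)
    mu = np.array(out_mu)
    mu = mu / mu.sum()                             # normalization in (20)
    return np.array(out_m), np.array(out_P), mu

# Two prior components, one measurement-noise component (mean 0, variance R)
means = [np.array([0.0]), np.array([0.5])]
covs = [np.eye(1), 2.0 * np.eye(1)]
m, P, mu = gsf_measurement_update(0.3, means, covs, [0.5, 0.5],
                                  H=np.array([2.0]), vbars=[0.0],
                                  Rs=[0.04], mu_v=[1.0])
```

The output mixture has $n_{k|k-1} \cdot n_v$ components, which is the growth that Section V-D must control.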
C. Time Propagation
Suppose that $p(x_k \mid z_{1:k})$ is given by (21). Then the prediction density $p(x_{k+1} \mid z_{1:k})$ is
p(x_{k+1} \mid z_{1:k}) = \sum_{i=1}^{n_{k|k}} \sum_{j=1}^{n_w} \tilde{\mu}^{ij}_{k+1|k} N(x_{k+1}; \bar{m}^{ij}_{k+1|k}, M^{ij}_{k+1|k})   (22)
with
\bar{m}^{ij}_{k+1|k} = \Phi\bar{x}^i_{k|k} + \Gamma\bar{w}^j,   (23a)
M^{ij}_{k+1|k} = \Phi P^i_{k|k}\Phi^T + \Gamma Q^j\Gamma^T,   (23b)
\tilde{\mu}^{ij}_{k+1|k} = \mu^i_{k|k}\, \mu^j_w.   (23c)
For convenience, one can rewrite (22) as
p(x_{k+1} \mid z_{1:k}) = \sum_{i=1}^{n_{k+1|k}} \mu^i_{k+1|k} N(x_{k+1}; \bar{x}^i_{k+1|k}, P^i_{k+1|k}),   (24)
where $n_{k+1|k} = (n_{k|k})(n_w)$, and $\mu^i_{k+1|k}$, $\bar{x}^i_{k+1|k}$, and $P^i_{k+1|k}$ are again formed from $\tilde{\mu}^{ij}_{k+1|k}$, $\bar{m}^{ij}_{k+1|k}$, and $M^{ij}_{k+1|k}$. Clearly, the definition of $p_{X_1}(x_1)$, given by (16c), has the form of (24), as does the a priori PDF assumed in (17).
Having the posterior density $p(x_k \mid z_{1:k})$ in the form of (21), the conditional mean and the estimation error covariance can be approximated in the GSF sense as [3]
\hat{x}_k \approx \sum_{i=1}^{n_{k|k}} \mu^i_{k|k}\, \bar{x}^i_{k|k},   (25a)
P_k \approx \sum_{i=1}^{n_{k|k}} \mu^i_{k|k}\big(P^i_{k|k} + (\bar{x}^i_{k|k} - \hat{x}_k)(\bar{x}^i_{k|k} - \hat{x}_k)^T\big).   (25b)
Note that both $\hat{x}_k$ and $P_k$ are functions of the measurement $z_k$ and have no recursive structure. The major disadvantage of the GSF is that the number of terms, $n_{k|k} = n_{1|0}(n_w n_v)^k / n_w$ at step $k$, increases exponentially in time, and so does the computational complexity.
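The time propagation (23a)-(23c) is the same kind of double loop over component pairs; a matching Python sketch (illustrative only; the process-noise components and matrix entries are hypothetical):

```python
import numpy as np

def gsf_time_update(means, covs, mus, Phi, Gam, wbars, Qs, mu_w):
    """GSF time propagation (23a)-(23c): push every posterior component i
    through the dynamics with every process-noise component j."""
    out_m, out_P, out_mu = [], [], []
    for xb, P, mu_i in zip(means, covs, mus):
        for wb, Q, mu_j in zip(wbars, Qs, mu_w):
            out_m.append(Phi @ xb + Gam.flatten() * wb)            # (23a)
            out_P.append(Phi @ P @ Phi.T + Q * np.outer(Gam, Gam)) # (23b)
            out_mu.append(mu_i * mu_j)                             # (23c)
    return np.array(out_m), np.array(out_P), np.array(out_mu)

# A hypothetical two-state example with two process-noise components
Phi = np.array([[0.9, 0.1], [-0.2, 1.0]])
Gam = np.array([[1.0], [0.3]])
m, P, mu = gsf_time_update([np.zeros(2)], [np.eye(2)], [1.0], Phi, Gam,
                           wbars=[0.0, 0.0], Qs=[0.01, 1.0], mu_w=[0.8, 0.2])
```

Starting from one component, the prediction has $n_{k|k} \cdot n_w = 2$ components, illustrating the multiplicative growth noted above.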
D. Gaussian Sum Re-approximation
To avoid the exponential growth of the number of terms $n_{k|k}$, a seemingly tempting method of keeping only the Gaussian components with the largest weights was found to be inefficient. Even if the weight of a Gaussian component is relatively small at a certain point, it might become large at the next step. Ignoring such a component might have catastrophic effects.
In this paper, after each measurement update, we propose to re-approximate the densities by a reduced and fixed number of Gaussian components. This reduction is motivated by the observation that a relatively small number of Gaussian densities can approximate a large class of distributions.
For ease of notation, assume that the measurement-updated Gaussian sum distribution at the given step is $p_a(x)$, having originally $n_a$ terms,
p_a(x) = \sum_{i=1}^{n_a} \mu^i_a N(x; \bar{x}^i_a, P^i_a).   (26)
Our objective is to approximate $p_a(x)$ by another Gaussian sum distribution $p_b(x)$ with $n_b$ terms ($n_b \ll n_a$),
p_b(x) = \sum_{i=1}^{n_b} \mu^i_b N(x; \bar{x}^i_b, P^i_b).   (27)
The task is to choose proper values of $\mu^i_b$, $\bar{x}^i_b$, and $P^i_b$ such that the following cost function
J = \int_{-\infty}^{\infty} \big(p_a(x) - p_b(x)\big)^2\, dx   (28)
is minimized, and the first two moments of the new Gaussian mixture match exactly those of the original one, i.e.,
\sum_{i=1}^{n_a} \mu^i_a \bar{x}^i_a = \sum_{i=1}^{n_b} \mu^i_b \bar{x}^i_b,   (29a)
\sum_{i=1}^{n_a} \mu^i_a\big(P^i_a + \bar{x}^i_a(\bar{x}^i_a)^T\big) = \sum_{i=1}^{n_b} \mu^i_b\big(P^i_b + \bar{x}^i_b(\bar{x}^i_b)^T\big),   (29b)
\sum_{i=1}^{n_b} \mu^i_b = 1, \quad \mu^i_b \ge 0, \quad P^i_b = (P^i_b)^T > 0, \quad i = 1,\dots,n_b.   (29c)
A small $J$ indicates that $p_b$ is a very good approximation of $p_a$. Solving this constrained minimization problem generally involves computationally costly numerical integration and nonlinear optimization with respect to many parameters. Note that the above re-approximation may incur a significant information loss since, in general, there is no guarantee that the re-approximated distribution also preserves the higher order moments of the original one. It was incorporated here to obtain a computationally acceptable GSF approximation.
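A cheap special case of the moment constraints (29a)-(29b) is collapsing a mixture to a single Gaussian ($n_b = 1$), for which the moment-matched solution is available in closed form. The Python sketch below illustrates the constraints only; it is not the re-approximation scheme used in the paper:

```python
import numpy as np

def collapse(mus, means, covs):
    """Collapse a Gaussian mixture to a single Gaussian with the same first
    two moments, i.e. the trivial n_b = 1 case of constraints (29a)-(29b)."""
    mus = np.asarray(mus, dtype=float)
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    mean = np.einsum('i,ij->j', mus, means)                    # (29a)
    # preserve E[x x^T] = sum_i mu_i (P_i + m_i m_i^T)
    outer = np.einsum('ij,ik->ijk', means, means)
    Exx = np.einsum('i,ijk->jk', mus, covs + outer)
    cov = Exx - np.outer(mean, mean)                           # (29b)
    return mean, cov

# Symmetric two-component example: spread between the means inflates
# the collapsed variance along the first axis.
mus = [0.5, 0.5]
means = [np.array([-1.0, 0.0]), np.array([1.0, 0.0])]
covs = [np.eye(2), np.eye(2)]
m, P = collapse(mus, means, covs)   # m = [0, 0], P = [[2, 0], [0, 1]]
```

The example also shows the information loss discussed above: the bimodal shape is gone, even though the mean and covariance are preserved exactly.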
VI. NUMERICAL STUDY
In what follows, the performance of the two suboptimal algorithms discussed in the previous sections is numerically compared to that of the optimal scalar and two-state ISCE. In both cases, the same process noise, $\beta = 0.1$, and measurement noise, $\gamma = 0.2$, parameters, as well as the same noise sequences, were used. These sequences are depicted in Fig. 1.
Fig. 1. Time evolution of the process and measurement noises.
The implemented PF employs the systematic resampling procedure with the threshold parameter $n^t_p = (2/3)n_p$. In the GSF case, the re-approximation technique presented in Section V-D was implemented using standard constrained optimization tools of the Matlab environment.
A. Scalar Case
For the scalar case, the following system parameters were chosen: $\Phi = 0.75$, $\Gamma = 1$, $H = 2$, and $\alpha = 0.5$. These parameters satisfy the condition $\gamma/|H| < \beta/(1-\Phi) < \alpha$ of Assumption 4.1 in [9]. In the GSF case, the process noise, the measurement noise, and the initial state PDFs were fitted in the least-squares sense with weighted Gaussian sums of $n_w = 7$, $n_v = 7$, and $n_{1|0} = 9$ components, respectively.
Figures 2-3 visually compare the true posterior PDF $p(x_k \mid z_{1:k})$ of the ISCE at time step 8 with the approximations given by the PF and GSF for different numbers of particles ($n_p$) and Gaussian terms ($n_b$) kept at each step, respectively.
[Plot: posterior PDFs $p(x_k \mid z_{1:k})$ of the ISCE and of the PF with $n_p = 50$ and $n_p = 10^5$, plotted over $x_k$; the measurement $z_k/H$ is marked.]
Fig. 2. Comparison of the posterior PDF with the PF.
[Plot: posterior PDFs $p(x_k \mid z_{1:k})$ of the ISCE and of the GSF with $n_b = 10$ and $n_b = 200$, plotted over $x_k$; the measurement $z_k/H$ is marked.]
Fig. 3. Comparison of the posterior PDF with the GSF.
It can be seen from Fig. 2 that the PF with 10,000 particles approximates the true posterior reasonably well. However, the price to pay is the computational complexity. The average computation time of the PF with 50 particles is 4.5 times, and that of the PF with 10,000 particles is 7,000 times, higher than the average computation time of the ISCE evaluated over 100 steps, when carried out on the same computer.
On the other hand, the GSF's approximation is very poor even when 200 Gaussian components are kept at each time step, see Fig. 3. In this case, the computational burden with 10 terms kept is 1,000 times, and with 200 terms kept is 25,000 times, higher than the average computational burden of the ISCE evaluated over 100 steps.
B. Two-state Case
In this case, the system parameters were chosen as follows:
\Phi = \begin{bmatrix} 0.9 & 0.1 \\ -0.2 & 1.0 \end{bmatrix}, \quad \Gamma = \begin{bmatrix} 1.0 \\ 0.3 \end{bmatrix}, \quad \begin{bmatrix} \alpha_1 \\ \alpha_2 \end{bmatrix} = \begin{bmatrix} 0.10 \\ 0.05 \end{bmatrix},
and $H = [\,1\ \ 2\,]$. The system has stable eigenvalues at $0.95 \pm 0.1323j$. It is observable and complies with the necessary condition for the ISCE to exist [2], i.e., $H\Gamma \ne 0$. Here, the values $n_w = 3$, $n_v = 3$, and $n_{1|0} = 9$ were used to fit the respective PDFs for the GSF.
To allow a fair comparison, the parameters of the PF and the GSF were selected such that their computational burden is similar to that of the ISCE implemented with the finite horizon approximation [10]. We have studied two cases. The first case considers a finite horizon (window) of 6 steps. To yield similar computation times, $n_p = 4{,}500$ particles were used for the PF and $n_b = 3$ Gaussian terms were kept for the GSF. The second case considers a window of 8 steps, which takes longer to compute. Note that the performance of the ISCE for the 6-step and 8-step windows is indistinguishable. In this case $n_p = 60{,}000$ particles were used in the PF and $n_b = 7$ Gaussian components were kept.
Figures 4-7 depict the performance of the ISCE, PF, and GSF for the various cases. In those plots we compare the estimation errors (solid lines) and the computed estimation error standard deviations (dotted lines) obtained by the various filters. Figure 4 compares the performance of the ISCE and the PF when the former uses a 6-step window while the latter uses 4,500 particles. It can be seen that the PF handles well the measurement outlier at step 5. However, after several steps, especially after encountering process noise outliers, the PF errors deviate significantly from the ISCE errors. Moreover, those errors are not properly quantified by the estimated standard deviation obtained by the PF, as can be seen at time steps above 30. This clearly demonstrates that 4,500 particles are not enough to properly estimate the system states. The same conclusion can be drawn when examining the performance of the GSF depicted in Fig. 5. In this case, to have a comparable computing time, the GSF keeps only 3 Gaussian components. It clearly shows that the GSF performs very poorly, worse than the PF.
Figure 6 represents the case when far more particles are used. In this case, 60,000 particles are used to match approximately the computational burden of the ISCE with an 8-step window. Here the PF performance is comparable to that of the ISCE, except when a large process noise outlier is encountered at time step 62, where the PF overestimates the error covariance. Nonetheless, this has a nearly negligible effect on the overall estimation results. In contrast, the GSF with 7 terms, i.e., tuned to have a computation time comparable to that of the 8-step-window ISCE, is not capable of reproducing the performance of the Cauchy filter. This clearly demonstrates that the heavy-tail characteristics of the Cauchy noise environment cannot be captured by a finite number of Gaussian PDFs. Its performance deteriorates significantly when both process and measurement noise outliers occur.
Fig. 4. Comparison of the ISCE using a window of 6 steps with the PF
using 4,500 particles.
Fig. 5. Comparison of the ISCE using a window of 6 steps with the GSF
keeping 3 terms at each step.
VII. CONCLUSION
In this paper, the performance of two suboptimal filtering algorithms has been numerically compared with that of the optimal Cauchy estimator. Simulation results for the two-state case have revealed that both the PF and the GSF perform quite poorly for a computation time dictated by the Cauchy filter with a window of six, while only the GSF performs quite poorly for a computation time dictated by a window of eight. Given the two-state ISCE's superior performance and its bounded computational cost, it represents a good candidate for the real-time implementation of filtering problems in a heavy-tailed Cauchy noise environment.
REFERENCES
[1] M. Idan and J. L. Speyer, "State estimation for linear scalar dynamic systems with additive Cauchy noises: Characteristic function approach," SIAM Journal on Control and Optimization, vol. 50, no. 4, pp. 1971–1994, 2012.
[2] ——, "Multivariate Cauchy estimator with scalar measurement and process noises," SIAM Journal on Control and Optimization, vol. 52, no. 2, pp. 1108–1141, 2014.
[3] H. W. Sorenson and D. L. Alspach, "Recursive Bayesian estimation using Gaussian sums," Automatica, vol. 7, no. 4, pp. 465–479, 1971.
Fig. 6. Comparison of the ISCE using a window of 8 steps with the PF
using 60,000 particles.
Fig. 7. Comparison of the ISCE using a window of 8 steps with the GSF
keeping 7 terms at each step.
[4] D. L. Alspach and H. W. Sorenson, "Nonlinear Bayesian estimation using Gaussian sum approximations," IEEE Transactions on Automatic Control, vol. 17, no. 4, pp. 439–448, 1972.
[5] A. Doucet, S. Godsill, and C. Andrieu, "On sequential Monte Carlo sampling methods for Bayesian filtering," Statistics and Computing, vol. 10, no. 3, pp. 197–208, 2000.
[6] M. S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, "A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking," IEEE Transactions on Signal Processing, vol. 50, pp. 174–188, 2002.
[7] N. Gordon, D. Salmond, and A. Smith, "Novel approach to nonlinear/non-Gaussian Bayesian state estimation," in IEE Proceedings F - Radar and Signal Processing, vol. 140, no. 2, 1993, pp. 107–113.
[8] J. R. Carpenter and A. K. Mashiku, "Cauchy drag estimation for low Earth orbiters," in AAS/AIAA Space Flight Mechanics Meeting, Williamsburg, VA, United States, 2015.
[9] M. Idan and J. L. Speyer, "Cauchy estimation for linear scalar systems," IEEE Transactions on Automatic Control, vol. 55, no. 6, pp. 1329–1342, 2010.
[10] J. Fernandez, J. Speyer, and M. Idan, "Stochastic estimation for two-state linear dynamic systems with additive Cauchy noises," IEEE Transactions on Automatic Control, vol. 60, no. 12, pp. 3367–3372, December 2015.
[11] A. Smith, A. Doucet, N. de Freitas, and N. Gordon, Sequential Monte Carlo Methods in Practice. Springer Verlag New York, 2013.
[12] T. Li, M. Bolic, and P. M. Djuric, "Resampling methods for particle filtering: Classification, implementation, and strategies," IEEE Signal Processing Magazine, vol. 32, no. 3, pp. 70–86, 2015.
... Using statistical Monte Carlo (MC) simulations and a judiciously chosen measure to evaluate the estimation performance of a heavy-tailed process, we extend the sample run results presented in [15]. Our main objective is to compare the efficiency of the PF and GSF to that of the scalar and two-state ISCE-based approximations, hence providing guidance for a practical implementation of an estimator for a heavy-tail noise environment. ...
... The bottom subplot presents the difference between the exact standard deviation of the estimation error (σ * k ) and its approximation (σ k ) computed by the PF or GSF. Note that the exact values ofx * k and σ * k are computed by the optimal ISCE, see (15) and (16), respectively. The approximate implementation of the scalar ISCE is omitted here as the results are indistinguishable from the optimal ISCE. ...
... For the two-state case example, The process noise, β = 0.1, and measurement noise, γ = 0.2, parameters are the same as in the scalar case example. Sample run results comparing the performance of the PF and GSF with the two-state ISCE, implemented both with a six-step and a eight-step sliding window approximation, are presented in [15]. ...
Article
Full-text available
The recently published optimal Cauchy estimator poses practical implementation challenges due to its time-growing complexity. Alternatively, addressing impulsive measurement and process noises, while using common estimation approaches, requires heuristic schemes. Approximate methods, such as particle and Gaussian-sum filters, were suggested to tackle the estimation problem in a heavy-tailed-noise environment when constraining the computational load. In this paper, the performances of a particle filter and a Gaussian-sum filter, designed for a linear system with specified Cauchy-noise parameters, are compared numerically to a Cauchy filter-based approximation showing the advantages of the latter.
Article
Full-text available
Two decades ago, with the publication of [1], we witnessed the rebirth of particle filtering (PF) as a methodology for sequential signal processing. Since then, PF has become very popular because of its ability to process observations represented by nonlinear state-space models where the noises of the model can be non-Gaussian. This methodology has been adopted in various fields, including finance, geophysical systems, wireless communications, control, navigation and tracking, and robotics [2]. The popularity of PF has also spurred the publication of several review articles [2]?[6].
Article
An efficient recursive state estimator is developed for two-state linear systems driven by Cauchy distributed process and measurement noises. For a general vector-state system, the estimator is based on recursively propagating the characteristic function of the conditional probability density function (cpdf), where the number of terms in the sum that expresses this characteristic function grows with each measurement update. Both the conditional mean and the conditional error variance are functions of the measurement history. For systems with two states, the proposed estimator reduces substantially the number of terms needed to express the characteristic function of the cpdf by taking advantage of relationships not yet developed in the general vector-state case. Further, by using a fixed sliding window of the most recent measurements, the improved efficiency of the proposed two-state estimator allows an accurate approximation for real-time computation. In this way, the computational complexity of each measurement update eventually becomes constant, and an arbitrary number of measurements can be processed. The numerical performance of the Cauchy estimator in both Cauchy and Gaussian simulations was demonstrated and compared to the Kalman Filter.
Conference Paper
Idan and Speyer have developed minimum variance estimators that assume measurement noise, process noise, and initial error are all distributed as Cauchy random variables. Samples from Cauchy distributions are characterized by long stretches of fairly small variation, punctuated by large variations that are many times larger. Cauchy distributions appear to accurately model some natural features, such as density variations that affect satellites in low Earth orbit. In contrast to a Kalman filter, the covariances from the Cauchy approach fully and conservatively bound the actual error signature, even in the face of unanticipated density disturbances of hundreds of percent.
Article
Uncertainties in many practical systems, such as radar glint and sonar noise, have impulsive character and are better described by heavy-tailed non-Gaussian densities, for example, the Cauchy Probability Density Function (PDF). The Cauchy PDF does not have a well-defined mean and its second moment is infinite. Nonetheless, the conditional density of a Cauchy random variable, given a scalar linear measurement with an additive Cauchy noise, has a conditional mean and a finite conditional variance. In particular, for scalar discrete linear systems with additive process and measurement noises described by Cauchy PDFs, the unnormalized characteristic function of the conditional PDF is considered. It is expressed as a growing sum of terms that at each measurement update increases by one term, constructed from four new measurement-dependent parameters. The dynamics of these parameters are linear. These parameters are shown to decay, allowing an approximate finite dimensional recursion. From the first two differentials of the unnormalized characteristic function of the conditional PDF, evaluated at a spectral value of zero, the conditional mean and variance are easily obtained. The effectiveness of this estimator is examined using a target interception example that also demonstrates its robustness when processing data with non-Cauchy noises.
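The key fact above — that the conditional mean and variance are finite even though the Cauchy prior and noise have no moments — can be checked numerically. The sketch below is a brute-force quadrature illustration for a single measurement, not the analytical recursion of the paper; the function names and the scale parameters `alpha` and `gamma` are assumptions chosen for the example:

```python
import numpy as np

def cauchy_pdf(x, scale):
    # Zero-median Cauchy density with the given scale parameter.
    return scale / (np.pi * (x**2 + scale**2))

def posterior_moments(z, alpha=1.0, gamma=0.5, lim=1e4, n=2_000_001):
    """Numerically compute E[x|z] and var(x|z) for a Cauchy prior on x
    (scale alpha) and a measurement z = x + v with Cauchy noise v (scale
    gamma). The posterior tails decay like 1/x**4, so both moments are
    finite even though neither the prior nor the noise has a mean."""
    x = np.linspace(-lim, lim, n)
    w = cauchy_pdf(x, alpha) * cauchy_pdf(z - x, gamma)  # unnormalized posterior
    norm = w.sum()
    mean = (x * w).sum() / norm
    var = ((x - mean)**2 * w).sum() / norm
    return mean, var
```

For a measurement z between the prior median and the observed value, the posterior mean lies strictly between the two, and the variance is finite.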
Article
The conditional mean estimator for an n-state linear system with additive Cauchy measurement and process noises is developed. It is shown that although the Cauchy densities, which model the initial state, the process noise, and the measurement noise, have undefined first moments and an infinite second moment, the probability density function conditioned on the measurement history does have a finite conditional mean and conditional variance. For the multivariable system state, the characteristic function of the unnormalized conditional probability density function is sequentially propagated through measurement updates and dynamic state propagation, while expressing the resulting characteristic function in a closed analytical form. Once the characteristic function of the unnormalized conditional probability density function is obtained, the probability density function of the measurement history, the conditional mean, and the conditional variance are easily computed from the characteristic function and its continuous first and second derivatives, evaluated at the origin in the spectral variables’ domain. These closed form expressions yield the sequential state estimator. A three-state dynamic system example demonstrates numerically the performance of the Cauchy estimator.
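The final step described above — reading the conditional mean and variance off the first two derivatives of the unnormalized characteristic function at the origin — can be illustrated generically with finite differences. This is a numerical sketch only; the sanity check uses an unnormalized Gaussian characteristic function with known moments, not the actual characteristic function produced by the Cauchy estimator:

```python
import numpy as np

def moments_from_cf(phi_bar, h=1e-4):
    """Recover the conditional mean and variance from an unnormalized
    characteristic function phi_bar via central differences at the origin:
        E[x|Z]   =  phi_bar'(0) / (1j * phi_bar(0))
        E[x^2|Z] = -phi_bar''(0) / phi_bar(0)
    """
    p0 = phi_bar(0.0)
    d1 = (phi_bar(h) - phi_bar(-h)) / (2 * h)          # first derivative
    d2 = (phi_bar(h) - 2 * p0 + phi_bar(-h)) / h**2    # second derivative
    mean = (d1 / (1j * p0)).real
    second = (-d2 / p0).real
    return mean, second - mean**2

# Sanity check on an unnormalized Gaussian characteristic function
# (normalization constant 7.0) with known mean 1.5 and variance 0.04:
phi = lambda v: 7.0 * np.exp(1j * 1.5 * v - 0.5 * 0.04 * v**2)
m, var = moments_from_cf(phi)
```

Note that the normalization constant cancels in both ratios, which is why the *unnormalized* characteristic function suffices.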
Article
An estimation paradigm is presented for scalar discrete linear systems entailing additive process and measurement noises that have Cauchy probability density functions (pdf). For systems with Gaussian noises, the Kalman filter has been the main estimation paradigm. However, many practical system uncertainties that have impulsive character, such as radar glint, are better described by stable non-Gaussian densities, for example, the Cauchy pdf. Although the Cauchy pdf does not have a well-defined mean and has an infinite second moment, the conditional density of a Cauchy random variable, given its linear measurements with an additive Cauchy noise, has a conditional mean and a finite conditional variance, both being functions of the measurement. For a single measurement, simple expressions are obtained for the conditional mean and variance, by deriving closed form expressions for the infinite integrals associated with the minimum variance estimation problem. To alleviate the complexity of the multi-stage estimator, the conditional pdf is represented in a special factored form. A recursion scheme is then developed based on this factored form and closed form integrations, allowing for the propagation of the conditional mean and variance over an arbitrary number of time stages. In simulations, the performance of the newly developed scalar discrete-time Cauchy estimator is significantly superior to a Kalman filter in the presence of Cauchy noise, whereas the Cauchy estimator deteriorates only slightly compared to the Kalman filter in the presence of Gaussian noise. Remarkably, this new recursive Cauchy conditional mean estimator has parameters that are generated by linear difference equations with stochastic coefficients, providing computational efficiency.
Article
The Bayesian recursion relations which describe the behavior of the a posteriori probability density function of the state of a time-discrete stochastic system, conditioned on available measurement data, cannot generally be solved in closed form when the system is either nonlinear or non-Gaussian. In this paper a density approximation involving convex combinations of Gaussian density functions is introduced and proposed as a meaningful way of circumventing the difficulties encountered in evaluating these relations and in using the resulting densities to determine specific estimation policies. It is seen that as the number of terms in the Gaussian sum increases without bound, the approximation converges uniformly to any density function in a large class. Further, any finite sum is itself a valid density function, unlike many other approximations that have been investigated. The problem of determining the a posteriori density and minimum variance estimates for linear systems with non-Gaussian noise is treated using the Gaussian sum approximation. This problem is considered because it can be dealt with in a relatively straightforward manner using the approximation, but it still contains most of the difficulties that one encounters in considering nonlinear systems, since the a posteriori density is non-Gaussian. After discussing the general problem from the point of view of applying Gaussian sums, a numerical example is presented in which the actual statistics of the a posteriori density are compared with the values predicted by the Gaussian sum and by the Kalman filter approximations.
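A minimal sketch of the core idea above — approximating a non-Gaussian density by a convex combination of Gaussians — assuming a standard Cauchy target, zero-mean components with hand-picked geometrically spaced spreads, and nonnegative least-squares weights (not the adaptive component management of a full Gaussian sum filter):

```python
import numpy as np
from scipy.optimize import nnls

def cauchy_pdf(x, scale=1.0):
    return scale / (np.pi * (x**2 + scale**2))

def gaussian_pdf(x, sigma):
    return np.exp(-0.5 * (x / sigma)**2) / (sigma * np.sqrt(2.0 * np.pi))

# Approximate a standard Cauchy density by a convex combination of
# zero-mean Gaussians with geometrically spaced standard deviations.
x = np.linspace(-50.0, 50.0, 20001)
sigmas = np.geomspace(0.3, 10.0, 8)
G = np.column_stack([gaussian_pdf(x, s) for s in sigmas])

w, _ = nnls(G, cauchy_pdf(x))  # nonnegative least-squares weights
w /= w.sum()                   # renormalize so the mixture is a valid density

approx = G @ w
max_err = np.abs(approx - cauchy_pdf(x)).max()
```

Because the weights are nonnegative and sum to one, the finite mixture remains a valid density, mirroring the property emphasized in the abstract; adding components with larger spreads extends the heavy-tail coverage.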
Book
Monte Carlo methods are revolutionizing the on-line analysis of data in fields as diverse as financial modeling, target tracking and computer vision. These methods, appearing under the names of bootstrap filters, condensation, optimal Monte Carlo filters, particle filters and survival of the fittest, have made it possible to solve numerically many complex, non-standard problems that were previously intractable. This book presents the first comprehensive treatment of these techniques, including convergence results and applications to tracking, guidance, automated target recognition, aircraft navigation, robot navigation, econometrics, financial modeling, neural networks, optimal control, optimal filtering, communications, reinforcement learning, signal enhancement, model averaging and selection, computer vision, semiconductor design, population biology, dynamic Bayesian networks, and time series analysis. This will be of great value to students, researchers and practitioners, who have some basic knowledge of probability. Arnaud Doucet received the Ph.D. degree from the University of Paris-XI Orsay in 1997. From 1998 to 2000, he conducted research at the Signal Processing Group of Cambridge University, UK. He is currently an assistant professor at the Department of Electrical Engineering of Melbourne University, Australia. His research interests include Bayesian statistics, dynamic models and Monte Carlo methods. Nando de Freitas obtained a Ph.D. degree in information engineering from Cambridge University in 1999. He is presently a research associate with the artificial intelligence group of the University of California at Berkeley. His main research interests are in Bayesian statistics and the application of on-line and batch Monte Carlo methods to machine learning. Neil Gordon obtained a Ph.D. in Statistics from Imperial College, University of London in 1993.
He is with the Pattern and Information Processing group at the Defence Evaluation and Research Agency in the United Kingdom. His research interests are in time series, statistical data analysis, and pattern recognition with a particular emphasis on target tracking and missile guidance.
Article
An algorithm, the bootstrap filter, is proposed for implementing recursive Bayesian filters. The required density of the state vector is represented as a set of random samples, which are updated and propagated by the algorithm. The method is not restricted by assumptions of linearity or Gaussian noise: it may be applied to any state transition or measurement model. A simulation example of the bearings-only tracking problem is presented. This simulation includes schemes for improving the efficiency of the basic algorithm. For this example, the performance of the bootstrap filter is greatly superior to that of the standard extended Kalman filter.
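A minimal bootstrap-filter sketch in the spirit of the algorithm described above, applied here to a scalar linear system with Cauchy noises. The system and noise-scale parameters `a`, `beta`, and `gamma` are illustrative assumptions, and the example uses plain multinomial resampling without the efficiency-improving schemes mentioned in the abstract:

```python
import numpy as np

def bootstrap_filter(zs, a=0.9, beta=0.1, gamma=0.2, n=5000, seed=0):
    """Bootstrap (SIR) filter for the scalar system
        x_{k+1} = a*x_k + w_k,   z_k = x_k + v_k,
    with Cauchy process noise w_k (scale beta) and measurement noise
    v_k (scale gamma). Particles are weighted by the Cauchy measurement
    likelihood, resampled, and propagated through the dynamics."""
    rng = np.random.default_rng(seed)
    x = rng.standard_cauchy(n)               # particles drawn from a Cauchy prior
    estimates = []
    for z in zs:
        w = gamma / (np.pi * ((z - x)**2 + gamma**2))  # Cauchy likelihood weights
        w /= w.sum()
        estimates.append(np.dot(w, x))       # weighted conditional-mean estimate
        x = x[rng.choice(n, size=n, p=w)]    # multinomial resampling
        x = a * x + beta * rng.standard_cauchy(n)      # time propagation
    return np.array(estimates)
```

The heavy-tailed likelihood assigns negligible weight to outlying particles, so the weighted-mean estimate stays bounded even though the particle cloud itself has Cauchy-sized excursions.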
Article
Increasingly, for many application areas, it is becoming important to include elements of nonlinearity and non-Gaussianity in order to model accurately the underlying dynamics of a physical system. Moreover, it is typically crucial to process data on-line as it arrives, both from the point of view of storage costs as well as for rapid adaptation to changing signal characteristics. In this paper, we review both optimal and suboptimal Bayesian algorithms for nonlinear/non-Gaussian tracking problems, with a focus on particle filters. Particle filters are sequential Monte Carlo methods based on point mass (or "particle") representations of probability densities, which can be applied to any state-space model and which generalize the traditional Kalman filtering methods. Several variants of the particle filter such as SIR, ASIR, and RPF are introduced within a generic framework of the sequential importance sampling (SIS) algorithm. These are discussed and compared with the standard EKF through an illustrative example.