State Estimation for Linear Systems with Additive Cauchy Noises:
Optimal and Suboptimal Approaches
Robert Fonod1, Moshe Idan1, and Jason L. Speyer2
Abstract— Only a few estimation methods can converge in the presence of impulsive measurement and/or process noises without the use of augmented heuristic schemes. To understand the performance of these schemes, the optimal Idan/Speyer Cauchy Estimator (ISCE) is compared with the particle filter (PF) and the Gaussian sum filter (GSF) as the convergence time of these estimators is allowed to increase. That is, the number of particles at each step for the PF and the number of Gaussian components at each step for the GSF are increased, and their performance relative to the ISCE is numerically studied for scalar and two-state dynamic systems.
I. INTRODUCTION
Impulsive measurement and process noises in stochastic state estimators have typically been handled by heuristic schemes that augment the estimation process. Recently, an analytical recursive nonlinear estimation scheme for multivariate linear systems with additive Cauchy distributed measurement and process noises has been developed. Cauchy uncertainties are impulsive, and this filter is optimal in that it generates, in closed form, the characteristic function (CF) of the unnormalized conditional probability density function of the state given the measurement history [1], [2]. From this characteristic function, the conditional mean and conditional error variance in the presence of Cauchy distributed noise are obtained. However, there are several general estimation algorithms that may also be able to deal with impulsive noises [3]–[7]. Although those approaches are suboptimal, they may also offer reasonable approximations for the heavy-tailed, Cauchy distributed noise environment.
Two of the most popular approximations are the particle filter (PF) and the Gaussian sum filter (GSF), since both have been shown to converge to the correct conditional density of the state as the number of terms increases. In practice, they are implemented with some degree of approximation, producing a tradeoff between numerical efficiency and estimation performance in constructing the conditional probability density function of the state given the measurement history and the resulting conditional mean and variance. Our objective is to compare the performance of these approximate filters to that of the optimal Cauchy filter. Performance is measured by restricting the filter approximation to an average computation time interval at each measurement update.
This work was supported by the United States-Israel Binational Science
Foundation, Grant 2012122.
1 Robert Fonod and Moshe Idan are with the Department of Aerospace Engineering, Technion - Israel Institute of Technology, Haifa, 3200003, Israel. Email: {robert.fonod; moshe.idan}@technion.ac.il
2 Jason L. Speyer is with the Department of Mechanical and Aerospace Engineering, University of California, Los Angeles (UCLA), Los Angeles, 90095, USA. Email: speyer@g.ucla.edu
Although the closed-form analytical solution of the Cauchy filter provides the exact minimum variance estimates of the system's states given a measurement sequence, its computational complexity and memory burden become very high, requiring an approximation in its implementation. A sliding window of data is used to limit the computation of the Cauchy filter. It is shown numerically that there is little difference in the conditional mean and variance for a window of six, eight, or even ten. Clearly, a larger window size implies higher computational burden and memory requirements. The performance of the PF and GSF is tested and compared with the Cauchy estimator with a window of six and then eight.
II. PROBLEM FORMULATION
Consider a discrete-time, single-input-single-output, mul-
tivariate, and time-invariant linear system described by
x_{k+1} = Φ x_k + Γ w_k,   (1)
z_k = H x_k + v_k,   (2)
with state vector x_k ∈ R^n, scalar measurement z_k, and known matrices Φ ∈ R^{n×n}, Γ ∈ R^{n×1}, and H ∈ R^{1×n}. The noise inputs w_k and v_k are independent Cauchy distributed random variables with zero median and scaling parameters β > 0 and γ > 0, respectively. Their probability density functions (PDFs) and their characteristic functions are denoted p and φ, respectively, and are assumed to be time independent and given by

p_W(w_k) = (β/π) / (w_k² + β²)  ⇒  φ_W(ν̄) = e^{−β|ν̄|},   (3)
p_V(v_k) = (γ/π) / (v_k² + γ²)  ⇒  φ_V(ν̄) = e^{−γ|ν̄|},   (4)

where ν̄ is a scalar spectral variable.
The initial conditions at k = 1 are also assumed to be independent Cauchy distributed random variables. Specifically, each i-th element x_{1_i} of the initial state vector x_1 has a Cauchy PDF with a given median x̄_{1_i} and a scaling parameter α_i > 0, i = 1, ..., n. The joint PDF of the initial conditions and its characteristic function are given by

p_{X_1}(x_1) = Π_{i=1}^{n} (α_i/π) / ((x_{1_i} − x̄_{1_i})² + α_i²)  ⇒  φ_{X_1}(ν) = Π_{i=1}^{n} e^{−α_i|ν_i| + j x̄_{1_i} ν_i},   (5)

where ν_i is an element of the spectral variable ν ∈ R^n.
The measurement history used in the estimation problem formulation is defined as z_{1:k} = {z_1, ..., z_k}. The objective is to compute the minimum variance estimate of x_k given the measurement history z_{1:k}.
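The model (1)-(2) with the Cauchy noises (3)-(4) can be simulated directly; a heavy-tailed draw of scale β is obtained by scaling a standard Cauchy variate. The sketch below is illustrative only: `simulate` is our own helper (not part of the ISCE), and the scalar parameters Φ = 0.75, Γ = 1, H = 2 are those used in the numerical study later in the paper.

```python
import numpy as np

def simulate(Phi, Gamma, H, beta, gamma, x1, steps, rng):
    """Simulate x_{k+1} = Phi x_k + Gamma w_k and z_k = H x_k + v_k of (1)-(2)
    with zero-median Cauchy noises: w_k of scale beta, v_k of scale gamma."""
    xs, zs = [], []
    x = np.asarray(x1, dtype=float)
    for _ in range(steps):
        xs.append(x.copy())
        v = gamma * rng.standard_cauchy()        # v_k ~ Cauchy(0, gamma), cf. (4)
        zs.append(float(H @ x + v))
        w = beta * rng.standard_cauchy()         # w_k ~ Cauchy(0, beta), cf. (3)
        x = Phi @ x + Gamma[:, 0] * w
    return np.array(xs), np.array(zs)

rng = np.random.default_rng(0)
Phi, Gamma, H = np.array([[0.75]]), np.array([[1.0]]), np.array([2.0])
xs, zs = simulate(Phi, Gamma, H, beta=0.1, gamma=0.2, x1=[0.0], steps=100, rng=rng)
```

The occasional very large Cauchy draws in `xs` and `zs` are the impulsive outliers that the filters compared below must handle.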
2016 European Control Conference (ECC)
June 29 - July 1, 2016. Aalborg, Denmark
978-1-5090-2591-6 ©2016 EUCA 1434
III. OPTIMAL SOLUTION - CAUCHY FILTER
In this section, a brief overview of the minimum variance Cauchy estimator, also known as the Idan/Speyer Cauchy Estimator (ISCE) [8], for systems described by (1) and (2) is given, summarizing the main results of [1], [2], [9].
A. Scalar ISCE - PDF Approach
The original work derived the ISCE for a single-state system using the PDF approach [9]. It was shown that, under mild conditions on the system parameters, the PDF of the state at time step k given z_{1:k} can be expressed as

p(x_k|z_{1:k}) = Σ_{i=1}^{k+2} (a_{k|k}^i x_k + b_{k|k}^i) / ((x_k − σ_{k|k}^i)² + (ω_{k|k}^i)²).   (6)

Initialization and update rules for the series coefficients a_{k|k}^i, b_{k|k}^i, σ_{k|k}^i, and ω_{k|k}^i can be found in [9]. It is apparent that all of the series coefficients must be updated at each time step, and that each measurement update increases the number of terms in the series. To avoid such impractical growth, [9] describes a truncation procedure that limits the number of terms in the series to a specified buffer length.
The approach above provides a closed-form expression for p(x_k|z_{1:k}), which can be examined for its shape, as will be shown in this paper. Unfortunately, this approach was found to be insufficient for addressing multivariate systems due to its derivation specifics (partial fraction expansion of single-variable rational functions). As an alternative, the successful approach to addressing multivariate systems utilizes the CF of the PDF of interest. The ISCE for single-state systems was re-derived using this approach in [1].
B. Multivariate ISCE - CF Approach
In this approach, instead of propagating p(x_k|z_{1:k}), its un-normalized characteristic function, given by

φ̄_{x_k|z_{1:k}}(ν) = ∫ p(x_k, z_{1:k}) e^{jν^T x_k} dx_k,   (7)

is propagated, while the normalization factor can be obtained from p(z_{1:k}) = φ̄_{x_k|z_{1:k}}(0). In [2] it was shown that (7) can be expressed as a growing sum of terms of the form

φ̄_{x_k|z_{1:k}}(ν) = Σ_{i=1}^{n_t(k|k)} g_i^{k|k}(y_{g_i}^{k|k}(ν)) exp(y_{e_i}^{k|k}(ν)),   (8)

i.e., a sum of exponential terms multiplied by a coefficient function g(·). The argument of this coefficient function, y_{g_i}^{k|k}(ν), and of the exponents, y_{e_i}^{k|k}(ν), are complicated functions of the spectral vector ν. The details of the various parameters and functions of the above expression can be found in [2]. Those functions and parameters can also be used to determine the state estimate and the estimation error covariance matrix in closed form.
The main difficulty with the above results is that the number of terms n_t(k|k) grows rapidly with time. A windowing technique that reduces the number of terms using an approximation was recently presented in [10] for the two-state case. This version of the ISCE will be used in comparing the performance of the various filters in this study.
IV. SUBOPTIMAL SOLUTION - PARTICLE FILTER
The particle filter, also known as the sequential Monte Carlo (MC) method, is a class of algorithms implementing recursive Bayesian estimation based on a point-mass representation of probability densities. For good surveys, see [5], [6], [11].
A. Sequential Importance Sampling
Most PFs are based on an algorithm known as sequential importance sampling (SIS), which is an MC technique for solving the Bayesian inference problem [6]. The main idea is to represent the required posterior p(x_k|z_{1:k}) using a set of random samples with associated weights. Let {x_k^i, μ_k^i}_{i=1}^{n_p} be such an approximation, where {x_k^i}_{i=1}^{n_p} is a set of support points (particles) with associated weights {μ_k^i}_{i=1}^{n_p}, and n_p is the total number of particles. Then, using the principle of importance sampling and Bayes' rule, the posterior at time k can be approximated as [6]

p(x_k|z_{1:k}) ≈ Σ_{i=1}^{n_p} μ_k^i δ(x_k − x_k^i),   (9)
where δ(·) is the delta function. The weights are updated as

μ_k^i ∝ μ_{k−1}^i p(z_k|x_k^i) p(x_k^i|x_{k−1}^i) / q(x_k^i|x_{k−1}^i, z_k).   (10)

The symbol ∝ in (10) stands for "proportional to" and q(x_k^i|x_{k−1}^i, z_k) is the chosen importance density used for sampling. The weights are normalized such that Σ_{i=1}^{n_p} μ_k^i = 1. Based on the strong law of large numbers, the approximate posterior (9) approaches the true p(x_k|z_{1:k}) as n_p → ∞.
The SIS algorithm thus consists of recursive propagation of the weights and particles as each measurement is received sequentially. The minimum variance state estimate x̂_k and the minimum error covariance P_k can be approximated by

x̂_k ≈ Σ_{i=1}^{n_p} μ_k^i x_k^i,   (11a)
P_k ≈ Σ_{i=1}^{n_p} μ_k^i (x_k^i − x̂_k)(x_k^i − x̂_k)^T.   (11b)
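Taken together, propagating the particles through the prior, reweighting them as in (13) with the Cauchy measurement likelihood of (4), and forming the estimates (11a)-(11b) gives one bootstrap-style SIS cycle for system (1)-(2). The sketch below is our own illustrative helper (`pf_step` is not a name from the paper), assuming the prior importance density (12):

```python
import numpy as np

def pf_step(particles, weights, z, Phi, Gamma, H, beta, gamma, rng):
    """One SIS cycle with the prior importance density (12):
    sample p(x_k|x_{k-1}^i), reweight via (13), estimate via (11a)-(11b)."""
    n_p = len(particles)
    # Propagate each particle through (1) with a Cauchy process-noise draw.
    w = beta * rng.standard_cauchy(n_p)
    particles = particles @ Phi.T + np.outer(w, Gamma[:, 0])
    # Cauchy measurement likelihood p(z_k|x_k^i) from (4), scale gamma.
    resid = z - particles @ H
    weights = weights * (gamma / np.pi) / (resid**2 + gamma**2)
    weights = weights / weights.sum()
    # Minimum-variance estimates (11a)-(11b).
    x_hat = weights @ particles
    d = particles - x_hat
    P = (weights[:, None] * d).T @ d
    return particles, weights, x_hat, P

rng = np.random.default_rng(1)
Phi, Gamma, H = np.array([[0.75]]), np.array([[1.0]]), np.array([2.0])
parts = 0.5 * rng.standard_cauchy((500, 1))      # crude draw from the initial PDF
wts = np.full(500, 1.0 / 500)
parts, wts, x_hat, P = pf_step(parts, wts, z=0.3, Phi=Phi, Gamma=Gamma, H=H,
                               beta=0.1, gamma=0.2, rng=rng)
```

Note that this step omits resampling, which, as discussed next, is needed in practice to counter weight degeneracy.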
B. Choice of the Importance Density
The optimal importance density q(x_k^i|x_{k−1}^i, z_k) was shown to be p(x_k|x_{k−1}^i, z_k) [5]. This importance density, however, is not always known and thus can be used only in special cases, e.g., a scalar system with Gaussian measurement and process noises. Hence, the most widely used density is the a priori distribution [6]

q(x_k|x_{k−1}^i, z_k) = p(x_k|x_{k−1}^i).   (12)

In this case, (10) reduces to

μ_k^i ∝ μ_{k−1}^i p(z_k|x_k^i).   (13)

Note that this choice of importance density is independent of the measurement z_k. Thus, the state space is explored without any knowledge of the observation, and this choice can fail if a new measurement appears in the tail of the prior or if the likelihood is too peaked in comparison to the prior. This strategy aggravates a well-known problem of the SIS algorithm, known as the degeneracy problem [6].
C. Degeneracy Problem
The degeneracy problem is a common phenomenon whereby, after a few iterations, all but one particle have negligible weight. This degeneracy results in a very poor approximation of p(x_k|z_{1:k}) and a breakdown of the algorithm. It should be noted that this phenomenon occurs even if the optimal importance density is used, but it is more severe when using the prior density p(x_k|x_{k−1}^i).

A suitable measure of the degeneracy of the algorithm is the effective sample size n_p^eff, defined in [6] as

n_p^eff = n_p / (1 + Var(μ_k^{*i})),   (14)

where μ_k^{*i} = p(x_k^i|z_{1:k}) / q(x_k^i|x_{k−1}^i, z_k) is referred to as the "true weight". This measure cannot be evaluated because usually q(x_k^i|x_{k−1}^i, z_k) is unknown. Therefore it is normally estimated by n̂_p^eff = 1 / Σ_{i=1}^{n_p} (μ_k^i)², where μ_k^i is the normalized weight obtained using (13). Here 1 ≤ n_p^eff ≤ n_p, where the upper bound is attained when all particles have the same weight, and the lower bound when all probability mass is concentrated at one particle. A small n_p^eff thus indicates severe degeneracy. The most common solution to the degeneracy problem is resampling.
D. Resampling
Resampling discards particles that have low importance weights, as they do not contribute to the approximation, and multiplies particles that have high weights [7]. This is done by mapping the weighted measure {x_k^i, μ_k^i}_{i=1}^{n_p} to an unweighted measure {x̃_k^i, n_p^{−1}}_{i=1}^{n_p} that still provides an approximation of p(x_k|z_{1:k}). In other words, this process concentrates the particles in the more important regions of the state space. To prevent degeneracy, when n̂_p^eff falls below a fixed threshold n_p^t, an appropriate resampling procedure is applied.

Several resampling schemes exist, e.g., systematic, multinomial, stratified, residual, and regularized resampling. The choice of the resampling scheme affects the computational load as well as the MC approximation error; see the discussion in [6], [12]. In the current study, only the systematic resampling strategy was considered, as it was shown empirically to outperform the other methods for the Cauchy case.
Systematic resampling is among the techniques preferred by many authors and PF users, since it is simple to implement, takes O(n_p) time, and minimizes the MC variation [6]. Assume that the weights μ_k^i are normalized before resampling, i.e., Σ_{i=1}^{n_p} μ_k^i = 1. The basic idea is to generate a random number u_k^1 from the uniform distribution on (0, 1/n_p]. The remaining comparison points u_k^j, j > 1, are obtained deterministically according to u_k^j = u_k^1 + n_p^{−1}(j − 1), j = 2, ..., n_p. Then, the u_k^j are used to select particles from {x_k^i, μ_k^i}_{i=1}^{n_p}. In the j-th selection, the particle x_k^i is chosen when the following condition is satisfied:

c_k^{i−1} < u_k^j ≤ c_k^i,   (15)

where c_k^i = Σ_{l=1}^{i} μ_k^l. The probability of selecting x_k^i equals that of u_k^j falling in the interval bounded by the cumulative sums of the normalized weights, as shown in (15).
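The effective-sample-size estimate and the systematic scheme can be sketched in a few lines (`n_eff_hat` and `systematic_resample` are our own names; the single uniform draw is called u here to avoid clashing with the weights μ, and it is drawn on [0, 1/n_p) rather than exactly (0, 1/n_p], a negligible edge case):

```python
import numpy as np

def n_eff_hat(weights):
    """Effective-sample-size estimate of (14): 1 / sum_i (mu_k^i)^2."""
    return 1.0 / np.sum(np.asarray(weights) ** 2)

def systematic_resample(weights, rng):
    """Systematic resampling: one uniform draw on [0, 1/n_p), then an evenly
    spaced grid of n_p points compared against the cumulative weights as in (15)."""
    n_p = len(weights)
    u = (rng.random() + np.arange(n_p)) / n_p   # u_j = u_1 + (j-1)/n_p
    c = np.cumsum(weights)
    c[-1] = 1.0                                 # guard against round-off
    # searchsorted returns the index i with c_{i-1} < u_j <= c_i, matching (15)
    return np.searchsorted(c, u)

idx = systematic_resample(np.array([0.1, 0.2, 0.3, 0.4]), np.random.default_rng(2))
```

After resampling, the selected particles `x[idx]` all receive the equal weight 1/n_p.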
V. SUBOPTIMAL SOLUTION - GAUSSIAN SUM FILTER
Here the PDFs of the noise sequences and the initial
state are treated using a Gaussian sum approximation. The
presented filtering scheme is a generalization of the well-
known GSF algorithm of Sorenson and Alspach [3].
A. Gaussian Sum Approximation
Let N(x; x̄, P) denote a Gaussian PDF with argument x, mean x̄, and covariance matrix P. Suppose that the stationary Cauchy PDFs given by (3), (4), and (5) can be approximated arbitrarily closely by finite linear combinations of Gaussian PDFs as

p_W(w_k) ≈ Σ_{i=1}^{n_w} μ_w^i N(w_k; w̄^i, Q^i),   (16a)
p_V(v_k) ≈ Σ_{i=1}^{n_v} μ_v^i N(v_k; v̄^i, R^i),   (16b)
p_{X_1}(x_1) ≈ Σ_{i=1}^{n_{1|0}} μ_{1|0}^i N(x_1; x̄_{1|0}^i, P_{1|0}^i),   (16c)

where μ_w^i, μ_v^i, and μ_{1|0}^i are nonnegative weighting factors satisfying Σ_{i=1}^{n_w} μ_w^i = Σ_{i=1}^{n_v} μ_v^i = Σ_{i=1}^{n_{1|0}} μ_{1|0}^i = 1. A Gaussian sum mixture is always a valid density function and converges uniformly to any density of practical concern [3].
B. Measurement Update
Assume that at step k the a priori density p(x_k|z_{1:k−1}) can be expressed as a finite sum of Gaussian densities

p(x_k|z_{1:k−1}) = Σ_{i=1}^{n_{k|k−1}} μ_{k|k−1}^i N(x_k; x̄_{k|k−1}^i, P_{k|k−1}^i).   (17)
Then, given the k-th measurement z_k, the posterior density p(x_k|z_{1:k}) is given by

p(x_k|z_{1:k}) = Σ_{i=1}^{n_{k|k−1}} Σ_{j=1}^{n_v} μ̃_{k|k}^{ij} N(x_k; m̄_{k|k}^{ij}, M_{k|k}^{ij}),   (18)
where m̄_{k|k}^{ij} and M_{k|k}^{ij} can be computed using Kalman-like equations for all i = 1, ..., n_{k|k−1} and j = 1, ..., n_v as

z̄_k^{ij} = H x̄_{k|k−1}^i + v̄^j,   (19a)
S_k^{ij} = H P_{k|k−1}^i H^T + R^j,   (19b)
K_k^{ij} = P_{k|k−1}^i H^T (S_k^{ij})^{−1},   (19c)
m̄_{k|k}^{ij} = x̄_{k|k−1}^i + K_k^{ij}(z_k − z̄_k^{ij}),   (19d)
M_{k|k}^{ij} = P_{k|k−1}^i − K_k^{ij} S_k^{ij} (K_k^{ij})^T.   (19e)
The weight factors μ̃_{k|k}^{ij} are updated using the following rule:

μ̃_{k|k}^{ij} = μ_{k|k−1}^i μ_v^j N(z_k; z̄_k^{ij}, S_k^{ij}) / ( Σ_{l=1}^{n_{k|k−1}} Σ_{m=1}^{n_v} μ_{k|k−1}^l μ_v^m N(z_k; z̄_k^{lm}, S_k^{lm}) ).   (20)

It is obvious that μ̃_{k|k}^{ij} ≥ 0 and that Σ_{i=1}^{n_{k|k−1}} Σ_{j=1}^{n_v} μ̃_{k|k}^{ij} = 1.
Thus, (18) is a proper PDF. For notational convenience, the double summation in (18) is restated as

p(x_k|z_{1:k}) = Σ_{i=1}^{n_{k|k}} μ_{k|k}^i N(x_k; x̄_{k|k}^i, P_{k|k}^i),   (21)

where n_{k|k} = (n_{k|k−1})(n_v), and μ_{k|k}^i, x̄_{k|k}^i, and P_{k|k}^i are formed in an obvious fashion from μ̃_{k|k}^{ij}, m̄_{k|k}^{ij}, and M_{k|k}^{ij}.
C. Time Propagation
Suppose that p(x_k|z_{1:k}) is given by (21). Then the prediction density p(x_{k+1}|z_{1:k}) is

p(x_{k+1}|z_{1:k}) = Σ_{i=1}^{n_{k|k}} Σ_{j=1}^{n_w} μ̃_{k+1|k}^{ij} N(x_{k+1}; m̄_{k+1|k}^{ij}, M_{k+1|k}^{ij})   (22)
with

m̄_{k+1|k}^{ij} = Φ x̄_{k|k}^i + Γ w̄^j,   (23a)
M_{k+1|k}^{ij} = Φ P_{k|k}^i Φ^T + Γ Q^j Γ^T,   (23b)
μ̃_{k+1|k}^{ij} = μ_{k|k}^i μ_w^j.   (23c)
For convenience, one can rewrite (22) as

p(x_{k+1}|z_{1:k}) = Σ_{i=1}^{n_{k+1|k}} μ_{k+1|k}^i N(x_{k+1}; x̄_{k+1|k}^i, P_{k+1|k}^i),   (24)

where n_{k+1|k} = (n_{k|k})(n_w), and μ_{k+1|k}^i, x̄_{k+1|k}^i, and P_{k+1|k}^i are again formed from μ̃_{k+1|k}^{ij}, m̄_{k+1|k}^{ij}, and M_{k+1|k}^{ij}. Clearly, the definition of p_{X_1}(x_1), given by (16c), has the form of (24), as does the a priori PDF assumed in (17).
Having the posterior density p(x_k|z_{1:k}) in the form of (21), the conditional mean and the estimation error covariance can be approximated in the GSF sense as [3]

x̂_k ≈ Σ_{i=1}^{n_{k|k}} μ_{k|k}^i x̄_{k|k}^i,   (25a)
P_k ≈ Σ_{i=1}^{n_{k|k}} μ_{k|k}^i ( P_{k|k}^i + (x̄_{k|k}^i − x̂_k)(x̄_{k|k}^i − x̂_k)^T ).   (25b)

Note that both x̂_k and P_k are functions of the measurement z_k and have no recursive structure. The major disadvantage of the GSF is that the number of terms at step k, n_{k|k} = n_{1|0}(n_w n_v)^k / n_w, increases exponentially in time, and so does the computational complexity.
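The moment formulas (25a)-(25b) are the standard mean and covariance of a Gaussian mixture; a small sketch (the helper name is ours):

```python
import numpy as np

def mixture_moments(mix_w, means, covs):
    """Conditional mean (25a) and error covariance (25b) of the mixture (21)."""
    mix_w = np.asarray(mix_w, dtype=float)
    means = np.asarray(means, dtype=float)
    x_hat = mix_w @ means                            # weighted mean, (25a)
    n = means.shape[1]
    P = np.zeros((n, n))
    for mu, m, C in zip(mix_w, means, covs):
        d = m - x_hat
        P += mu * (np.asarray(C) + np.outer(d, d))   # spread-of-means term, (25b)
    return x_hat, P

# Two equal-weight, unit-variance components at -1 and +1 (one-dimensional state).
x_hat, P = mixture_moments([0.5, 0.5], [[-1.0], [1.0]], [np.eye(1), np.eye(1)])
```

Here the mixture mean is 0 and the covariance is 2: the within-component variance (1) plus the spread of the component means (1).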
D. Gaussian Sum Re-approximation
To avoid exponential growth of terms nk|k, a seemingly
tempting method of keeping Gaussian components with
largest weights was found to be inefficient. Even if a weight
of a Gaussian component is relatively small at a certain
point, it might become large at the next step. Ignoring such
a component might have catastrophic effects.
In this paper, after each measurement update, we suggest
to re-approximate the densities by a reduced and fixed
number of Gaussian components. This reduction is motivated
by the observation that a relatively small number of Gaussian
densities can approximate a large class of distributions.
For ease of notation, assume that the measurement-updated Gaussian sum distribution at the given step is p_a(x), having originally n_a terms,

p_a(x) = Σ_{i=1}^{n_a} μ_a^i N(x; x̄_a^i, P_a^i).   (26)

Our objective is to approximate p_a(x) by another Gaussian sum distribution p_b(x) with n_b terms (n_b ≪ n_a),

p_b(x) = Σ_{i=1}^{n_b} μ_b^i N(x; x̄_b^i, P_b^i).   (27)
The task is to choose proper values of μ_b^i, x̄_b^i, and P_b^i such that the following cost function

J = ∫_{−∞}^{∞} ( p_a(x) − p_b(x) )² dx   (28)

is minimized, and the first two moments of the new Gaussian mixture match exactly those of the original one, i.e.,

Σ_{i=1}^{n_a} μ_a^i x̄_a^i = Σ_{i=1}^{n_b} μ_b^i x̄_b^i,   (29a)
Σ_{i=1}^{n_a} μ_a^i ( P_a^i + x̄_a^i (x̄_a^i)^T ) = Σ_{i=1}^{n_b} μ_b^i ( P_b^i + x̄_b^i (x̄_b^i)^T ),   (29b)
Σ_{i=1}^{n_b} μ_b^i = 1,  μ_b^i ≥ 0,  P_b^i = (P_b^i)^T > 0,  ∀ i = 1, ..., n_b.   (29c)

A small J indicates that p_b is a very good approximation of p_a. Solving this constrained minimization problem generally involves computationally costly numerical integration and nonlinear optimization with respect to many parameters. Note that the above re-approximation may incur a significant information loss, since in general there is no guarantee that the re-approximated distribution also preserves the higher order moments of the original one. It was incorporated here to obtain a computationally acceptable GSF approximation.
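For intuition, merging a pair of components so that their combined contribution to the constraints (29a)-(29b) is preserved can be done in closed form. This is a common mixture-reduction heuristic, not the minimizer of (28), and the choice of which pair to merge is an assumption left open here:

```python
import numpy as np

def merge_pair(w1, m1, P1, w2, m2, P2):
    """Replace two Gaussian components by a single one that preserves their
    combined weight, mean, and second moment, i.e., their joint contribution
    to the moment constraints (29a)-(29b)."""
    w = w1 + w2
    a1, a2 = w1 / w, w2 / w
    m = a1 * m1 + a2 * m2                   # weighted mean of the pair
    d1, d2 = m1 - m, m2 - m
    # Covariance = within-component part + spread of the two means.
    P = a1 * (P1 + np.outer(d1, d1)) + a2 * (P2 + np.outer(d2, d2))
    return w, m, P

w, m, P = merge_pair(0.5, np.array([0.0]), np.eye(1),
                     0.5, np.array([2.0]), np.eye(1))
```

Applying such merges repeatedly reduces n_a toward n_b while keeping (29a)-(29b) satisfied by construction, at the cost of an uncontrolled increase in J.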
VI. NUMERICAL STUDY
In what follows, the performance of the two suboptimal algorithms discussed in the previous sections is numerically compared to that of the optimal scalar and two-state ISCE. In both cases, the same process noise, β = 0.1, and measurement noise, γ = 0.2, parameters as well as noise sequences were used. These sequences are depicted in Fig. 1.
Fig. 1. Time evolution of the process and measurement noises
The implemented PF employs the systematic resampling procedure with the threshold parameter n_p^t = (2/3) n_p. In the GSF case, the re-approximation technique presented in Section V-D was implemented using standard constrained optimization tools of the Matlab environment.
A. Scalar Case
For the scalar case, the following system parameters were chosen: Φ = 0.75, Γ = 1, H = 2, and α = 0.5. These parameters satisfy the condition γ/|H| < β/(1 − Φ) < α of Assumption 4.1 in [9]. In the GSF case, the process noise, the measurement noise, and the initial state PDF were fitted in the least-squares sense with weighted Gaussian sums of n_w = 7, n_v = 7, and n_{1|0} = 9 components, respectively.
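The paper does not detail the fitting procedure; as one possible sketch, the component weights can be fitted on a grid by unconstrained least squares with fixed, hand-picked component scales. Everything below (function names, grid, and the log-spaced scales) is an assumption for illustration, and unlike (16) the weights are not constrained to be nonnegative:

```python
import numpy as np

def gaussian_basis(x, sigmas):
    """Columns N(x; 0, sigma^2) for each fixed component scale sigma."""
    return np.stack([np.exp(-0.5 * (x / s) ** 2) / (s * np.sqrt(2.0 * np.pi))
                     for s in sigmas], axis=1)

def fit_weights(scale, sigmas, x):
    """Grid least-squares fit of mixture weights to a Cauchy(0, scale) PDF,
    cf. (16b) with gamma = scale. Unconstrained: weights may come out negative."""
    target = (scale / np.pi) / (x ** 2 + scale ** 2)
    w, *_ = np.linalg.lstsq(gaussian_basis(x, sigmas), target, rcond=None)
    return w, target

x = np.linspace(-5.0, 5.0, 801)
sigmas = np.geomspace(0.05, 3.0, 7)   # 7 components; the scales are a guess
w, target = fit_weights(0.2, sigmas, x)
approx = gaussian_basis(x, sigmas) @ w
```

Even a good grid fit decays like a Gaussian beyond the largest component scale, which is precisely why a finite Gaussian sum struggles with the Cauchy tails, as the results below demonstrate.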
Figures 2 and 3 visually compare the true posterior PDF p(x_k|z_{1:k}) of the ISCE at time step 8 with the approximations given by the PF and the GSF for different numbers of particles (n_p) and Gaussian terms (n_b) kept at each step, respectively.
[Plot: posterior PDF p(x_k|z_{1:k}) versus x_k for the ISCE and for the PF with two particle counts; the measurement-implied value z_k/H is marked on the horizontal axis.]
Fig. 2. Comparison of the posterior PDF with the PF.
[Plot: posterior PDF p(x_k|z_{1:k}) versus x_k for the ISCE and for the GSF with n_b = 10 and n_b = 200 components; the measurement-implied value z_k/H is marked on the horizontal axis.]
Fig. 3. Comparison of the posterior PDF with the GSF.
It can be seen from Fig. 2 that the PF with 10,000 particles approximates the true posterior reasonably well. However, the price to pay is the computational complexity. The average computation time of the PF with 50 particles is 4.5 times, and that of the PF with 10,000 particles is 7,000 times, higher than the average computation time of the ISCE evaluated over 100 steps, when carried out on the same computer.

On the other hand, the GSF's approximation is very poor even when 200 Gaussian components are kept at each time step; see Fig. 3. In this case, the computational burden with 10 terms kept is 1,000 times, and with 200 terms kept is 25,000 times, higher than the average computational burden of the ISCE evaluated over 100 steps.
B. Two-state Case
In this case, the system parameters were chosen as follows:
Φ=0.9 0.1
−0.2 1.0,Γ=1.0
0.3,α1
α2=0.10
0.05,
and H= [ 1 2 ]. The system has stable eigenvalues at 0.95±
0.1323j. It is observable and complies with the necessary
condition for the ISCE to exist [2], i.e., that HΓ6= 0. Here,
the following values: nw= 3,nv= 3, and n1|0= 9 were
used to fit the respectively GSF’s PDFs.
To allow a fair comparison, the parameters of the PF and the GSF were selected such that their computational burden is similar to that of the ISCE implemented with the finite horizon approximation [10]. We have studied two cases. The first case considers a finite horizon (window) of 6 steps. To yield similar computation times, n_p = 4,500 particles were used for the PF and n_b = 3 Gaussian terms were kept for the GSF. The second case considers a window of 8 steps, which takes longer to compute. Note that the performance of the ISCE for the 6-step and 8-step windows is indistinguishable. In this case, n_p = 60,000 particles were used in the PF and n_b = 7 Gaussian components were kept.
Figures 4-7 depict the performance of the ISCE, PF, and GSF for the various cases. In those plots we compare the estimation errors (solid lines) and the computed estimation error standard deviations (dotted lines) obtained by the various filters. Figure 4 compares the performance of the ISCE and the PF when the former uses a 6-step window while the latter uses 4,500 particles. It can be seen that the PF handles the measurement outlier at step 5 well. However, after several steps, especially after encountering process noise outliers, the PF errors deviate significantly from the ISCE errors. Moreover, those errors are not properly quantified by the estimated standard deviation obtained by the PF, as can be seen at time steps above 30. This clearly demonstrates that 4,500 particles are not enough to properly estimate the system states. The same conclusion can be drawn when examining the performance of the GSF depicted in Fig. 5. In this case, to have a comparable computing time, the GSF keeps only 3 Gaussian components. The figure clearly shows that the GSF performs very poorly, worse than the PF.

Figure 6 represents the case when far more particles are used. Here, 60,000 particles are used to match approximately the computational burden of the ISCE with an 8-step window. In this case the PF performance is comparable to that of the ISCE, except when a large process noise outlier is encountered at time step 62, where the PF overestimates the error covariance. Nonetheless, this has a nearly negligible effect on the overall estimation results. In contrast, the GSF with 7 terms, i.e., tuned to have a computation time comparable to that of the ISCE with an 8-step window, is not capable of reproducing the performance of the Cauchy filter. This clearly demonstrates that the heavy-tail characteristics of the Cauchy noise environment cannot be captured by a finite number of Gaussian PDFs. Its performance deteriorates significantly when both process and measurement noise outliers occur.
Fig. 4. Comparison of the ISCE using a window of 6 steps with the PF
using 4,500 particles.
Fig. 5. Comparison of the ISCE using a window of 6 steps with the GSF
keeping 3 terms at each step.
VII. CONCLUSION
In this paper, the performance of two suboptimal filtering algorithms has been numerically compared with that of the optimal Cauchy estimator. Simulation results for the two-state case have revealed that both the PF and the GSF perform quite poorly within the computation time interval dictated by the Cauchy filter with a window of six, while only the GSF performs quite poorly within the time interval dictated by a window of eight. Given the two-state ISCE's superior performance and its bounded computational cost, it represents a good candidate for real-time implementation of filtering problems in heavy-tailed Cauchy noise environments.
REFERENCES
[1] M. Idan and J. L. Speyer, "State estimation for linear scalar dynamic systems with additive Cauchy noises: Characteristic function approach," SIAM Journal on Control and Optimization, vol. 50, no. 4, pp. 1971–1994, 2012.
[2] ——, "Multivariate Cauchy estimator with scalar measurement and process noises," SIAM Journal on Control and Optimization, vol. 52, no. 2, pp. 1108–1141, 2014.
[3] H. W. Sorenson and D. L. Alspach, "Recursive Bayesian estimation using Gaussian sums," Automatica, vol. 7, no. 4, pp. 465–479, 1971.
Fig. 6. Comparison of the ISCE using a window of 8 steps with the PF
using 60,000 particles.
Fig. 7. Comparison of the ISCE using a window of 8 steps with the GSF
keeping 7 terms at each step.
[4] D. L. Alspach and H. W. Sorenson, "Nonlinear Bayesian estimation using Gaussian sum approximations," IEEE Transactions on Automatic Control, vol. 17, no. 4, pp. 439–448, 1972.
[5] A. Doucet, S. Godsill, and C. Andrieu, "On sequential Monte Carlo sampling methods for Bayesian filtering," Statistics and Computing, vol. 10, no. 3, pp. 197–208, 2000.
[6] M. S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, "A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking," IEEE Transactions on Signal Processing, vol. 50, pp. 174–188, 2002.
[7] N. Gordon, D. Salmond, and A. Smith, "Novel approach to nonlinear/non-Gaussian Bayesian state estimation," IEE Proceedings F Radar and Signal Processing, vol. 140, no. 2, 1993, pp. 107–113.
[8] J. R. Carpenter and A. K. Mashiku, "Cauchy drag estimation for low Earth orbiters," in AAS/AIAA Space Flight Mechanics Meeting, Williamsburg, VA, United States, 2015.
[9] M. Idan and J. L. Speyer, "Cauchy estimation for linear scalar systems," IEEE Transactions on Automatic Control, vol. 55, no. 6, pp. 1329–1342, 2010.
[10] J. Fernandez, J. Speyer, and M. Idan, "Stochastic estimation for two-state linear dynamic systems with additive Cauchy noises," IEEE Transactions on Automatic Control, vol. 60, no. 12, pp. 3367–3372, December 2015.
[11] A. Smith, A. Doucet, N. de Freitas, and N. Gordon, Sequential Monte Carlo Methods in Practice. Springer Verlag New York, 2013.
[12] T. Li, M. Bolic, and P. M. Djuric, "Resampling methods for particle filtering: Classification, implementation, and strategies," IEEE Signal Processing Magazine, vol. 32, no. 3, pp. 70–86, 2015.