
State Estimation for Linear Systems with Additive Cauchy Noises:

Optimal and Suboptimal Approaches

Robert Fonod¹, Moshe Idan¹ and Jason L. Speyer²

Abstract— Only a few estimation methods can converge in the presence of impulsive measurement and/or process noises without the use of augmented heuristic schemes. To understand the performance of these schemes, the optimal Idan/Speyer Cauchy Estimator (ISCE) is compared with the particle filter (PF) and the Gaussian sum filter (GSF) as the convergence time of these estimators is allowed to increase. That is, the number of particles at each step for the PF and the number of Gaussian components at each step for the GSF are increased, and their performance relative to the ISCE is studied numerically for scalar and two-state dynamic systems.

I. INTRODUCTION

Impulsive measurement and process noises in stochastic state estimation have typically been handled by heuristic schemes that augment the estimation process. Recently, an analytical recursive nonlinear estimation scheme was developed for multivariate linear systems with additive Cauchy distributed measurement and process noises. Cauchy uncertainties are impulsive, and this filter is optimal in that it generates, in closed form, the characteristic function (CF) of the unnormalized conditional probability density function of the state given the measurement history [1], [2]. From this characteristic function, the conditional mean and conditional error variance in the presence of Cauchy distributed noise are obtained. However, several general estimation algorithms may also be able to deal with impulsive noises [3]-[7]. Although those approaches are suboptimal, they may offer reasonable approximations for the heavy-tailed, Cauchy distributed noise environment.

Two of the most popular approximations are the particle filter (PF) and the Gaussian sum filter (GSF), both of which have been shown to converge to the correct conditional density of the state as the number of terms increases. In practice, they are implemented with some degree of approximation, producing a tradeoff between numerical efficiency and estimation performance in constructing the conditional probability density function of the state given the measurement history and the resulting conditional mean and variance. Our objective is to compare the performance of these approximate filters to that of the optimal Cauchy filter. Performance is measured by restricting each filter approximation to an average computation time interval at each measurement update.

This work was supported by the United States-Israel Binational Science Foundation, Grant 2012122.

¹Robert Fonod and Moshe Idan are with the Department of Aerospace Engineering, Technion - Israel Institute of Technology, Haifa, 3200003, Israel. Email: {robert.fonod; moshe.idan}@technion.ac.il

²Jason L. Speyer is with the Department of Mechanical and Aerospace Engineering, University of California, Los Angeles (UCLA), Los Angeles, 90095, USA. Email: speyer@g.ucla.edu

Although the closed-form analytical solution of the Cauchy filter provides the exact minimum variance estimates of the system states given a measurement sequence, its computational complexity and memory burden become very high, requiring an approximation in its implementation. A sliding window of data is used to limit the computation of the Cauchy filter. It is shown numerically that there is little difference in the conditional mean and variance for a window of six, eight, or even ten. Clearly, a larger window size implies a higher computational burden and memory requirements. The performance of the PF and GSF is tested and compared with the Cauchy estimator with a window of six and then eight.

II. PROBLEM FORMULATION

Consider a discrete-time, single-input-single-output, multivariate, and time-invariant linear system described by

x_{k+1} = Φ x_k + Γ w_k,   (1)
z_k = H x_k + v_k,   (2)

with state vector x_k ∈ R^n, scalar measurement z_k, and known matrices Φ ∈ R^{n×n}, Γ ∈ R^{n×1}, and H ∈ R^{1×n}. The noise inputs w_k and v_k are independent Cauchy distributed random variables with zero median and scaling parameters β > 0 and γ > 0, respectively. Their probability density functions (PDFs) and characteristic functions are denoted p and φ, respectively, and are assumed to be time independent and given by

p_W(w_k) = (β/π) / (w_k² + β²)  ⇒  φ_W(ν̄) = e^{−β|ν̄|},   (3)
p_V(v_k) = (γ/π) / (v_k² + γ²)  ⇒  φ_V(ν̄) = e^{−γ|ν̄|},   (4)

where ν̄ is a scalar spectral variable.
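The Cauchy densities in (3)-(4) are straightforward to sample via the inverse CDF. The following minimal Python sketch (illustrative, not part of the paper's implementation) generates zero-median Cauchy noise with a given scale and checks that the sample median concentrates near zero, while the sample mean does not converge because the Cauchy distribution has no finite moments:

```python
import math
import random

def sample_cauchy(scale, median=0.0, rng=random):
    # Inverse-CDF sampling: if u ~ U(0,1), then
    # median + scale * tan(pi * (u - 0.5)) is Cauchy(median, scale).
    u = rng.random()
    return median + scale * math.tan(math.pi * (u - 0.5))

random.seed(0)
beta = 0.1  # process-noise scale used later in the numerical study
samples = sorted(sample_cauchy(beta) for _ in range(100001))
empirical_median = samples[len(samples) // 2]
# The sample median is a consistent estimator of the median parameter,
# unlike the sample mean, which is itself Cauchy distributed.
```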

The initial conditions at k = 1 are also assumed to be independent Cauchy distributed random variables. Specifically, each i-th element x_{1i} of the initial state vector x_1 has a Cauchy PDF with a given median x̄_{1i} and a scaling parameter α_i > 0, i = 1, …, n. The joint PDF of the initial conditions and its characteristic function are given by

p_{X1}(x_1) = ∏_{i=1}^n (α_i/π) / ((x_{1i} − x̄_{1i})² + α_i²)  ⇒  φ_{X1}(ν) = ∏_{i=1}^n e^{−α_i|ν_i| + j x̄_{1i} ν_i},   (5)

where ν_i is an element of the spectral variable ν ∈ R^n.

The measurement history used in the estimation problem formulation is defined as z_{1:k} = {z_1, …, z_k}. The objective is to compute the minimum variance estimate of x_k given the measurement history z_{1:k}.
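To make the setting concrete, the system (1)-(2) can be simulated directly; the sketch below uses the scalar parameter values that appear later in the numerical study (Φ = 0.75, Γ = 1, H = 2, β = 0.1, γ = 0.2, α = 0.5) and the standard inverse-CDF Cauchy sampler. It is an illustrative sketch, not the authors' code:

```python
import math
import random

def cauchy(scale, rng):
    # Zero-median Cauchy sample with the given scale parameter
    return scale * math.tan(math.pi * (rng.random() - 0.5))

rng = random.Random(1)
Phi, Gamma, H = 0.75, 1.0, 2.0        # scalar parameters from Section VI-A
beta, gamma = 0.1, 0.2                # process/measurement noise scales
x = cauchy(0.5, rng)                  # x_1 with alpha = 0.5, zero median
xs, zs = [], []
for k in range(50):
    z = H * x + cauchy(gamma, rng)               # measurement equation (2)
    xs.append(x); zs.append(z)
    x = Phi * x + Gamma * cauchy(beta, rng)      # state propagation (1)
```

Occasional very large noise draws in such a run are exactly the impulsive behavior that motivates the Cauchy model.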

2016 European Control Conference (ECC), June 29 - July 1, 2016, Aalborg, Denmark. 978-1-5090-2591-6 ©2016 EUCA

III. OPTIMAL SOLUTION - CAUCHY FILTER

In this section, a brief overview of the minimum variance Cauchy estimator, also known as the Idan/Speyer Cauchy Estimator (ISCE) [8], is given for systems described by (1) and (2), summarizing the main results of [1], [2], [9].

A. Scalar ISCE - PDF Approach

The original work derived the ISCE for single-state systems using the PDF approach [9]. It was shown that, under mild conditions on the system parameters, the PDF of the state at time step k given z_{1:k} can be expressed as

p(x_k | z_{1:k}) = Σ_{i=1}^{k+2} (a^i_{k|k} x_k + b^i_{k|k}) / ((x_k − σ^i_{k|k})² + (ω^i_{k|k})²).   (6)

Initialization and update rules for the series coefficients a^i_{k|k}, b^i_{k|k}, σ^i_{k|k}, and ω^i_{k|k} can be found in [9]. It is apparent that all of the series coefficients must be updated at each time step, and that each measurement update increases the number of terms in the series. To avoid such impractical growth, [9] describes a truncation procedure that limits the number of terms in the series to a specified buffer length.

The approach above provides a closed-form expression for p(x_k | z_{1:k}), which can be examined for its shape, as will be shown in this paper. Unfortunately, this approach was found to be insufficient for multivariate systems due to its derivation specifics (partial fraction expansion of single-variable rational functions). As an alternative, the successful approach to multivariate systems utilizes the CF of the PDF of interest. The ISCE for single-state systems was re-derived using this approach in [1].

B. Multivariate ISCE - CF Approach

In this approach, instead of propagating p(x_k | z_{1:k}), its un-normalized characteristic function, given by

φ̄_{x_k|z_{1:k}}(ν) = ∫ p(x_k, z_{1:k}) e^{j ν^T x_k} dx_k,   (7)

is propagated, while the normalization factor can be obtained as p(z_{1:k}) = φ̄_{x_k|z_{1:k}}(0). In [2] it was shown that (7) can be expressed as a growing sum of terms of the form

φ̄_{x_k|z_{1:k}}(ν) = Σ_{i=1}^{n_t(k|k)} g^{k|k}_i(y^{k|k}_{g_i}(ν)) exp(y^{k|k}_{e_i}(ν)),   (8)

i.e., a sum of exponential terms, each multiplied by a coefficient function g(·). The argument of this coefficient function, y^{k|k}_{g_i}(ν), and of the exponents, y^{k|k}_{e_i}(ν), are complicated functions of the spectral vector ν. The details of the various parameters and functions of the above expression can be found in [2]. Those functions and parameters can also be used to determine the state estimate and estimation error covariance matrix in closed form.

The main difficulty with the above results is that the number of terms n_t(k|k) grows rapidly with time. A windowing technique that reduces the number of terms using an approximation was recently presented in [10] for the two-state case. This version of the ISCE will be used in comparing the performance of the various filters in this study.

IV. SUBOPTIMAL SOLUTION - PARTICLE FILTER

The particle filter, also known as the sequential Monte Carlo (MC) method, is a family of algorithms implementing recursive Bayesian estimation based on a point-mass representation of probability densities. For good surveys, see [5], [6], [11].

A. Sequential Importance Sampling

Most PFs are based on an algorithm known as sequential importance sampling (SIS), which is an MC technique for solving the Bayesian inference problem [6]. The main idea is to represent the required posterior p(x_k | z_{1:k}) using a set of random samples with associated weights. Let {x^i_k, μ^i_k}_{i=1}^{n_p} be such an approximation, where {x^i_k}_{i=1}^{n_p} is a set of support points (particles) with associated weights {μ^i_k}_{i=1}^{n_p}, and n_p is the total number of particles. Then, using the principle of importance sampling and Bayes' rule, the posterior at time k can be approximated as [6]

p(x_k | z_{1:k}) ≈ Σ_{i=1}^{n_p} μ^i_k δ(x_k − x^i_k),   (9)

where δ(·) is the delta function. The weights are updated as

μ^i_k ∝ μ^i_{k−1} p(z_k | x^i_k) p(x^i_k | x^i_{k−1}) / q(x^i_k | x^i_{k−1}, z_k).   (10)

The symbol ∝ in (10) stands for "proportional to", and q(x^i_k | x^i_{k−1}, z_k) is the chosen importance density used for sampling. The weights are normalized such that Σ_{i=1}^{n_p} μ^i_k = 1. By the strong law of large numbers, the approximated posterior (9) approaches the true p(x_k | z_{1:k}) as n_p → ∞.

The SIS algorithm thus consists of recursive propagation of the weights and particles as each measurement is received sequentially. The minimum variance state estimate x̂_k and the minimum error covariance P_k can be approximated by

x̂_k ≈ Σ_{i=1}^{n_p} μ^i_k x^i_k,   (11a)
P_k ≈ Σ_{i=1}^{n_p} μ^i_k (x^i_k − x̂_k)(x^i_k − x̂_k)^T.   (11b)
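One SIS step for the scalar version of (1)-(2) can be sketched as follows. The particles are drawn from the a priori transition density, so the weight update (10) reduces to multiplication by the Cauchy measurement likelihood, and the estimate is formed as in (11a). All parameter values are illustrative, matching the scalar example later in the paper:

```python
import math
import random

def cauchy_pdf(x, scale):
    # Zero-median Cauchy density, as in (3)-(4)
    return (scale / math.pi) / (x * x + scale * scale)

def cauchy_sample(scale, rng):
    return scale * math.tan(math.pi * (rng.random() - 0.5))

def sis_step(particles, weights, z, Phi, Gamma, H, beta, gamma, rng):
    """One bootstrap-SIS step for the scalar system (1)-(2): propagate
    each particle through the prior and reweight by the likelihood."""
    new_p, new_w = [], []
    for x, w in zip(particles, weights):
        xp = Phi * x + Gamma * cauchy_sample(beta, rng)  # sample p(x_k | x_{k-1})
        new_p.append(xp)
        new_w.append(w * cauchy_pdf(z - H * xp, gamma))  # times p(z_k | x_k)
    s = sum(new_w)
    new_w = [w / s for w in new_w]                       # normalize to sum 1
    x_hat = sum(w * x for w, x in zip(new_w, new_p))     # estimate (11a)
    return new_p, new_w, x_hat

rng = random.Random(2)
np_ = 500
particles = [cauchy_sample(0.5, rng) for _ in range(np_)]
weights = [1.0 / np_] * np_
particles, weights, x_hat = sis_step(particles, weights, 0.4,
                                     0.75, 1.0, 2.0, 0.1, 0.2, rng)
```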

B. Choice of the Importance Density

The optimal importance density q(x_k | x^i_{k−1}, z_k) was shown to be p(x_k | x^i_{k−1}, z_k) [5]. This importance density, however, is not always known and can thus be used only in special cases, e.g., a scalar system with Gaussian measurement and process noises. Hence, the most widely used density is the a priori distribution [6]

q(x_k | x^i_{k−1}, z_k) = p(x_k | x^i_{k−1}).   (12)

In this case, (10) reduces to

μ^i_k ∝ μ^i_{k−1} p(z_k | x^i_k).   (13)

Notice that this choice of importance density is independent of the measurement z_k. Thus, the state space is explored without any knowledge of the observation, and this choice can fail if a new measurement appears in the tail of the prior or if the likelihood is too peaked in comparison to the prior. This strategy aggravates a well known problem of the SIS algorithm, known as the degeneracy problem [6].


C. Degeneracy Problem

The degeneracy problem is a common phenomenon whereby, after a few iterations, all but one particle have negligible weight. This degeneracy results in a very poor approximation of p(x_k | z_{1:k}) and a breakdown of the algorithm. It should be noted that this phenomenon occurs even if the optimal importance density is used, but it is more severe when using the prior density p(x_k | x^i_{k−1}).

A suitable measure of degeneracy of the algorithm is the effective sample size n^eff_p, defined in [6] as

n^eff_p = n_p / (1 + Var(μ^{*i}_k)),   (14)

where μ^{*i}_k = p(x^i_k | z_{1:k}) / q(x^i_k | x^i_{k−1}, z_k) is referred to as the "true weight". This measure cannot be evaluated exactly because q(x^i_k | x^i_{k−1}, z_k) is usually unknown. Therefore, it is normally estimated by n̂^eff_p = 1 / Σ_{i=1}^{n_p} (μ^i_k)², where μ^i_k is the normalized weight obtained using (13). Here 1 ≤ n^eff_p ≤ n_p, where the upper bound is attained when all particles have the same weight, and the lower bound when all probability mass is at one particle. Note that a small n^eff_p indicates severe degeneracy. The most common solution to the degeneracy problem is resampling.
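The estimate n̂^eff_p = 1/Σ(μ^i_k)² and its two bounds can be verified directly (a short sketch with illustrative weight vectors):

```python
def effective_sample_size(weights):
    # hat{n}^eff_p = 1 / sum_i (mu_k^i)^2 for normalized weights
    return 1.0 / sum(w * w for w in weights)

n = 4
uniform = [1.0 / n] * n            # all particles equally weighted
degenerate = [1.0, 0.0, 0.0, 0.0]  # all probability mass on one particle
# Uniform weights attain the upper bound n_p = 4; a fully degenerate
# weight vector attains the lower bound 1.
```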

D. Resampling

Resampling discards particles that have low importance weights, as they do not contribute to the approximation, and multiplies particles that have high weights [7]. This is done by mapping the weighted measure {x^i_k, μ^i_k}_{i=1}^{n_p} to an unweighted measure {x̃^i_k, n_p^{−1}}_{i=1}^{n_p} that still provides an approximation of p(x_k | z_{1:k}). In other words, this process concentrates the particles in the more important regions of the state space. To prevent degeneracy, an appropriate resampling procedure is applied whenever n̂^eff_p falls below a fixed threshold n^t_p.

Several resampling schemes exist, e.g., systematic, multinomial, stratified, residual, and regularized resampling. The choice of the resampling scheme affects the computational load as well as the MC approximation error; see the discussion in [6], [12]. In the current study, only the systematic resampling strategy was considered, as it was shown empirically to outperform the other methods for the Cauchy case.

Systematic resampling is among the techniques preferred by many authors and PF users since it is simple to implement, takes O(n_p) time, and minimizes the MC variation [6]. Assume that the weights μ^i_k are normalized before resampling, i.e., Σ_{i=1}^{n_p} μ^i_k = 1. The basic idea is to generate a random number u^1_k from the uniform distribution on (0, 1/n_p]. The remaining comparison points u^j_k, j > 1, are obtained deterministically according to u^j_k = u^1_k + n_p^{−1}(j − 1), j = 2, …, n_p. Then, the u^j_k are used to select particles from {x^i_k, μ^i_k}_{i=1}^{n_p}. In the j-th selection, the particle x^i_k is chosen when the following condition is satisfied:

c^{i−1}_k < u^j_k ≤ c^i_k,   (15)

where c^i_k = Σ_{l=1}^i μ^l_k. The probability of selecting x^i_k is the same as that of u^j_k falling in the interval bounded by the cumulative sums of the normalized weights, as shown in (15).
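The selection rule (15) can be implemented with a single pass over the cumulative weights. The sketch below follows the scheme described above; the particle values and weights are illustrative, and the uniform draw is taken on [0, 1/n_p) rather than (0, 1/n_p], an immaterial boundary difference for a sketch:

```python
import random

def systematic_resample(particles, weights, rng):
    """Systematic resampling: one uniform draw, then n_p evenly spaced
    comparison points matched against the cumulative weights (15)."""
    n = len(particles)
    u0 = rng.random() / n                 # first comparison point
    points = [u0 + j / n for j in range(n)]
    resampled, i, c = [], 0, weights[0]   # c is the running cumulative sum
    for u in points:
        while u > c:                      # advance until u falls in (c_{i-1}, c_i]
            i += 1
            c += weights[i]
        resampled.append(particles[i])
    return resampled

rng = random.Random(3)
particles = [-1.0, 0.0, 1.0, 2.0]
weights = [0.1, 0.2, 0.6, 0.1]            # normalized, sum to 1
out = systematic_resample(particles, weights, rng)
```

Because the comparison points are spaced 1/n_p apart, a particle carrying weight 0.6 is guaranteed to be duplicated at least twice here, regardless of the random offset.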

V. SUBOPTIMAL SOLUTION - GAUSSIAN SUM FILTER

Here, the PDFs of the noise sequences and the initial state are treated using a Gaussian sum approximation. The presented filtering scheme is a generalization of the well-known GSF algorithm of Sorenson and Alspach [3].

A. Gaussian Sum Approximation

Let N(x; x̄, P) denote a Gaussian PDF with argument x, mean x̄, and covariance matrix P. Suppose that the stationary Cauchy PDFs given by (3), (4), and (5) can be approximated arbitrarily closely by finite linear combinations of Gaussian PDFs as

p_W(w_k) ≈ Σ_{i=1}^{n_w} μ^i_w N(w_k; w̄^i, Q^i),   (16a)
p_V(v_k) ≈ Σ_{i=1}^{n_v} μ^i_v N(v_k; v̄^i, R^i),   (16b)
p_{X1}(x_1) ≈ Σ_{i=1}^{n_{1|0}} μ^i_{1|0} N(x_1; x̄^i_{1|0}, P^i_{1|0}),   (16c)

where μ^i_w, μ^i_v, and μ^i_{1|0} are nonnegative weighting factors satisfying Σ_{i=1}^{n_w} μ^i_w = Σ_{i=1}^{n_v} μ^i_v = Σ_{i=1}^{n_{1|0}} μ^i_{1|0} = 1. The Gaussian sum mixture is always a valid density function and converges uniformly to any density of practical concern [3].

B. Measurement Update

Assume that at step k the a priori density p(x_k | z_{1:k−1}) can be expressed as a finite sum of Gaussian densities

p(x_k | z_{1:k−1}) = Σ_{i=1}^{n_{k|k−1}} μ^i_{k|k−1} N(x_k; x̄^i_{k|k−1}, P^i_{k|k−1}).   (17)

Then, given the k-th measurement z_k, the posterior density p(x_k | z_{1:k}) is given by

p(x_k | z_{1:k}) = Σ_{i=1}^{n_{k|k−1}} Σ_{j=1}^{n_v} μ̃^{ij}_{k|k} N(x_k; m̄^{ij}_{k|k}, M^{ij}_{k|k}),   (18)

where m̄^{ij}_{k|k} and M^{ij}_{k|k} can be computed using Kalman-like equations for all i = 1, …, n_{k|k−1} and j = 1, …, n_v as

z̄^{ij}_k = H x̄^i_{k|k−1} + v̄^j,   (19a)
S^{ij}_k = H P^i_{k|k−1} H^T + R^j,   (19b)
K^{ij}_k = P^i_{k|k−1} H^T (S^{ij}_k)^{−1},   (19c)
m̄^{ij}_{k|k} = x̄^i_{k|k−1} + K^{ij}_k (z_k − z̄^{ij}_k),   (19d)
M^{ij}_{k|k} = P^i_{k|k−1} − K^{ij}_k S^{ij}_k (K^{ij}_k)^T.   (19e)

The weight factors μ̃^{ij}_{k|k} are updated using the following rule:

μ̃^{ij}_{k|k} = μ^i_{k|k−1} μ^j_v N(z_k; z̄^{ij}_k, S^{ij}_k) / Σ_{l=1}^{n_{k|k−1}} Σ_{m=1}^{n_v} μ^l_{k|k−1} μ^m_v N(z_k; z̄^{lm}_k, S^{lm}_k).   (20)

Clearly, μ̃^{ij}_{k|k} ≥ 0 and Σ_{i=1}^{n_{k|k−1}} Σ_{j=1}^{n_v} μ̃^{ij}_{k|k} = 1, so (18) is a proper PDF. For notational convenience, the double summation in (18) is restated as

p(x_k | z_{1:k}) = Σ_{i=1}^{n_{k|k}} μ^i_{k|k} N(x_k; x̄^i_{k|k}, P^i_{k|k}),   (21)

where n_{k|k} = n_{k|k−1} n_v, and μ^i_{k|k}, x̄^i_{k|k}, and P^i_{k|k} are formed in an obvious fashion from μ̃^{ij}_{k|k}, m̄^{ij}_{k|k}, and M^{ij}_{k|k}.
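For a scalar state, the update equations (19a)-(19e) and the weight rule (20) reduce to one ordinary Kalman update per (prior component, noise component) pair. A minimal sketch with illustrative mixture parameters:

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def gsf_measurement_update(means, variances, weights,
                           v_means, v_vars, v_weights, z, H):
    """Scalar version of (19a)-(19e) and (20): one Kalman-like update
    per pair of prior component i and measurement-noise component j."""
    m_out, P_out, w_out = [], [], []
    for mi, Pi, wi in zip(means, variances, weights):
        for vj, Rj, wj in zip(v_means, v_vars, v_weights):
            z_pred = H * mi + vj                 # (19a)
            S = H * Pi * H + Rj                  # (19b)
            K = Pi * H / S                       # (19c)
            m_out.append(mi + K * (z - z_pred))  # (19d)
            P_out.append(Pi - K * S * K)         # (19e)
            w_out.append(wi * wj * normal_pdf(z, z_pred, S))  # numerator of (20)
    s = sum(w_out)
    w_out = [w / s for w in w_out]               # denominator of (20)
    return m_out, P_out, w_out

# Two prior components and two measurement-noise components (illustrative)
m, P, w = gsf_measurement_update([0.0, 0.5], [1.0, 2.0], [0.6, 0.4],
                                 [0.0, 0.0], [0.04, 0.4], [0.7, 0.3],
                                 z=0.4, H=2.0)
```

The returned mixture has n_{k|k−1} · n_v components, illustrating the multiplicative growth noted below.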


C. Time Propagation

Suppose that p(x_k | z_{1:k}) is given by (21). Then the prediction density p(x_{k+1} | z_{1:k}) is

p(x_{k+1} | z_{1:k}) = Σ_{i=1}^{n_{k|k}} Σ_{j=1}^{n_w} μ̃^{ij}_{k+1|k} N(x_{k+1}; m̄^{ij}_{k+1|k}, M^{ij}_{k+1|k})   (22)

with

m̄^{ij}_{k+1|k} = Φ x̄^i_{k|k} + Γ w̄^j,   (23a)
M^{ij}_{k+1|k} = Φ P^i_{k|k} Φ^T + Γ Q^j Γ^T,   (23b)
μ̃^{ij}_{k+1|k} = μ^i_{k|k} μ^j_w.   (23c)

For convenience, one can rewrite (22) as

p(x_{k+1} | z_{1:k}) = Σ_{i=1}^{n_{k+1|k}} μ^i_{k+1|k} N(x_{k+1}; x̄^i_{k+1|k}, P^i_{k+1|k}),   (24)

where n_{k+1|k} = n_{k|k} n_w, and μ^i_{k+1|k}, x̄^i_{k+1|k}, and P^i_{k+1|k} are again formed from μ̃^{ij}_{k+1|k}, m̄^{ij}_{k+1|k}, and M^{ij}_{k+1|k}. Clearly, the definition of p_{X1}(x_1) given by (16c) has the form of (24), as does the a priori PDF assumed in (17).

Having the posterior density p(x_k | z_{1:k}) in the form of (21), the conditional mean and the estimation error covariance can be approximated in the GSF sense as [3]

x̂_k ≈ Σ_{i=1}^{n_{k|k}} μ^i_{k|k} x̄^i_{k|k},   (25a)
P_k ≈ Σ_{i=1}^{n_{k|k}} μ^i_{k|k} [P^i_{k|k} + (x̄^i_{k|k} − x̂_k)(x̄^i_{k|k} − x̂_k)^T].   (25b)

Note that both x̂_k and P_k are functions of the measurement z_k and that they have no recursive structure. The major disadvantage of the GSF is that the number of terms, n_{k|k} = n_{1|0}(n_w n_v)^k / n_w at step k, increases exponentially in time, and so does the computational complexity.
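The propagation step (23a)-(23c) is similarly mechanical in the scalar case; a sketch with illustrative values, showing how the term count multiplies by n_w at every propagation:

```python
def gsf_time_propagation(means, variances, weights,
                         w_means, w_vars, w_weights, Phi, Gamma):
    """Scalar version of (23a)-(23c): each posterior component is
    propagated once per process-noise component."""
    m_out, P_out, w_out = [], [], []
    for mi, Pi, wi in zip(means, variances, weights):
        for wm, Qj, wj in zip(w_means, w_vars, w_weights):
            m_out.append(Phi * mi + Gamma * wm)                # (23a)
            P_out.append(Phi * Pi * Phi + Gamma * Qj * Gamma)  # (23b)
            w_out.append(wi * wj)                              # (23c)
    return m_out, P_out, w_out

# Two posterior components, one process-noise component (illustrative)
m, P, w = gsf_time_propagation([0.0, 1.0], [1.0, 0.5], [0.5, 0.5],
                               [0.0], [0.01], [1.0], Phi=0.75, Gamma=1.0)
# With n_w process-noise components, the output mixture has
# n_{k|k} * n_w terms: the exponential growth discussed above.
```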

D. Gaussian Sum Re-approximation

To avoid the exponential growth of the number of terms n_{k|k}, a seemingly tempting method is to keep only the Gaussian components with the largest weights; this was found to be ineffective. Even if the weight of a Gaussian component is relatively small at a certain point, it might become large at the next step, and ignoring such a component might have catastrophic effects.

In this paper, after each measurement update, we propose re-approximating the densities by a reduced and fixed number of Gaussian components. This reduction is motivated by the observation that a relatively small number of Gaussian densities can approximate a large class of distributions.

For ease of notation, assume that the measurement-updated Gaussian sum distribution at the given step is p_a(x), originally having n_a terms,

p_a(x) = Σ_{i=1}^{n_a} μ^i_a N(x; x̄^i_a, P^i_a).   (26)

Our objective is to approximate p_a(x) by another Gaussian sum distribution p_b(x) with n_b terms (n_b ≪ n_a),

p_b(x) = Σ_{i=1}^{n_b} μ^i_b N(x; x̄^i_b, P^i_b).   (27)

The task is to choose values of μ^i_b, x̄^i_b, and P^i_b such that the cost function

J = ∫_{−∞}^{∞} (p_a(x) − p_b(x))² dx   (28)

is minimized, while the first two moments of the new Gaussian mixture match exactly those of the original one, i.e.,

Σ_{i=1}^{n_a} μ^i_a x̄^i_a = Σ_{i=1}^{n_b} μ^i_b x̄^i_b,   (29a)
Σ_{i=1}^{n_a} μ^i_a [P^i_a + x̄^i_a (x̄^i_a)^T] = Σ_{i=1}^{n_b} μ^i_b [P^i_b + x̄^i_b (x̄^i_b)^T],   (29b)
Σ_{i=1}^{n_b} μ^i_b = 1, μ^i_b ≥ 0, P^i_b = (P^i_b)^T > 0, ∀i = 1, …, n_b.   (29c)

A small J indicates that p_b is a very good approximation of p_a. Solving this constrained minimization problem generally involves computationally costly numerical integration and nonlinear optimization with respect to many parameters. Note that the above re-approximation may incur significant information loss, since in general there is no guarantee that the re-approximated distribution also preserves the higher order moments of the original one. It was incorporated here to obtain a computationally acceptable GSF approximation.
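The moment-matching constraints (29a)-(29b) are easy to check numerically. The sketch below matches a single Gaussian to the first two moments of a three-term scalar mixture; all values are illustrative, and it does not attempt to minimize the cost (28):

```python
def mixture_moments(weights, means, variances):
    # First two (raw) moments of a scalar Gaussian mixture, as in (29a)-(29b)
    mean = sum(w * m for w, m in zip(weights, means))
    second = sum(w * (P + m * m) for w, m, P in zip(weights, means, variances))
    return mean, second

# Original 3-term mixture (illustrative values)
wa, ma, Pa = [0.2, 0.5, 0.3], [-1.0, 0.0, 1.0], [0.5, 1.0, 0.5]
mean_a, second_a = mixture_moments(wa, ma, Pa)

# Candidate 1-term re-approximation matching both moments exactly
mb = [mean_a]
Pb = [second_a - mean_a ** 2]   # variance of the single component
wb = [1.0]
mean_b, second_b = mixture_moments(wb, mb, Pb)
```

Matching the first two moments keeps the constraints (29a)-(29c) satisfied, but, as noted above, higher-order moments of p_a are generally lost.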

VI. NUMERICAL STUDY

Next, the performance of the two suboptimal algorithms discussed in the previous sections is numerically compared to that of the optimal scalar and two-state ISCE. In both cases, the same process noise parameter, β = 0.1, and measurement noise parameter, γ = 0.2, as well as the same noise sequences were used. These sequences are depicted in Fig. 1.

Fig. 1. Time evolution of the process and measurement noises

The implemented PF employs the systematic resampling procedure with the threshold parameter n^t_p = (2/3) n_p. In the GSF case, the re-approximation technique presented in Section V-D was implemented using standard constrained optimization tools of the Matlab environment.

A. Scalar Case

For the scalar case, the following system parameters were chosen: Φ = 0.75, Γ = 1, H = 2, and α = 0.5. These parameters satisfy the condition γ/|H| < β/(1 − Φ) < α of Assumption 4.1 in [9]. In the GSF case, the process noise, the measurement noise, and the initial state PDF were fitted in the least-squares sense with weighted Gaussian sums of n_w = 7, n_v = 7, and n_{1|0} = 9 components, respectively.

Figures 2 and 3 visually compare the true posterior PDF p(x_k | z_{1:k}) of the ISCE at time step 8 with the approximations given by the PF and the GSF for different numbers of particles (n_p) and Gaussian terms (n_b) kept at each step, respectively.

Fig. 2. Comparison of the posterior PDF with the PF (ISCE vs. PF with n_p = 50 and n_p = 10^5).

Fig. 3. Comparison of the posterior PDF with the GSF (ISCE vs. GSF with n_b = 10 and n_b = 200).

It can be seen from Fig. 2 that the PF with 10,000 particles approximates the true posterior reasonably well. However, the price to pay is computational complexity. Evaluated over 100 steps on the same computer, the average computation time of the PF with 50 particles is 4.5 times, and that of the PF with 10,000 particles is 7,000 times, higher than the average computation time of the ISCE.

On the other hand, the GSF's approximation is very poor even when 200 Gaussian components are kept at each time step; see Fig. 3. In this case, the computational burden with 10 terms kept is 1,000 times, and with 200 terms kept is 25,000 times, higher than the average computational burden of the ISCE evaluated over 100 steps.

B. Two-state Case

In this case, the system parameters were chosen as follows:

Φ = [ 0.9  0.1 ; −0.2  1.0 ],  Γ = [ 1.0 ; 0.3 ],  [ α_1 ; α_2 ] = [ 0.10 ; 0.05 ],

and H = [ 1  2 ]. The system has stable eigenvalues at 0.95 ± 0.1323j. It is observable and complies with the necessary condition for the ISCE to exist [2], i.e., HΓ ≠ 0. Here, the values n_w = 3, n_v = 3, and n_{1|0} = 9 were used to fit the respective PDFs for the GSF.
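The stated properties of this two-state example can be checked directly from the trace and determinant of Φ (a verification sketch, not part of the paper):

```python
import cmath

# Two-state parameters from the numerical study
Phi = [[0.9, 0.1], [-0.2, 1.0]]
Gamma = [1.0, 0.3]
H = [1.0, 2.0]

# Eigenvalues of a 2x2 matrix from its trace and determinant
tr = Phi[0][0] + Phi[1][1]
det = Phi[0][0] * Phi[1][1] - Phi[0][1] * Phi[1][0]
disc = cmath.sqrt(tr * tr - 4 * det)
lam1, lam2 = (tr + disc) / 2, (tr - disc) / 2

# Existence condition for the ISCE: H * Gamma != 0
HGamma = H[0] * Gamma[0] + H[1] * Gamma[1]
# Both eigenvalues are 0.95 +/- 0.1323j, inside the unit circle (stable),
# and H * Gamma = 1.6 != 0.
```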

To allow a fair comparison, the parameters of the PF and the GSF were selected such that their computational burden is similar to that of the ISCE implemented with the finite horizon approximation [10]. We studied two cases. The first case considers a finite horizon (window) of 6 steps. To yield similar computation times, n_p = 4,500 particles were used for the PF and n_b = 3 Gaussian terms were kept for the GSF. The second case considers a window of 8 steps, which takes longer to compute. Note that the performance of the ISCE for the 6-step and 8-step windows is indistinguishable. In this case, n_p = 60,000 particles were used in the PF and n_b = 7 Gaussian components were kept.

Figures 4-7 depict the performance of the ISCE, PF, and GSF for the various cases. In those plots we compare the estimation errors (solid lines) and the computed estimation error standard deviations (dotted lines) as obtained by the various filters. Figure 4 compares the performance of the ISCE and the PF when the former uses a 6-step window while the latter uses 4,500 particles. It can be seen that the PF handles the measurement outlier at step 5 well. However, after several steps, especially after encountering process noise outliers, the PF errors deviate significantly from the ISCE errors. Moreover, those errors are not properly quantified by the estimated standard deviation obtained by the PF, as can be seen at time steps above 30. This clearly demonstrates that 4,500 particles are not enough to properly estimate the system states. The same conclusion can be drawn when examining the performance of the GSF depicted in Fig. 5. In this case, to have a comparable computing time, the GSF keeps only 3 Gaussian components. It clearly shows that the GSF performs very poorly, worse than the PF.

Figure 6 represents the case when far more particles are used. Here, 60,000 particles are used to match approximately the computational burden of the ISCE with an 8-step window. In this case the PF performance is comparable to that of the ISCE, except when a large process noise outlier is encountered at time step 62, where the PF overestimates the error covariance. Nonetheless, this has a nearly negligible effect on the overall estimation results. In contrast, the GSF with 7 terms, i.e., tuned to have a computation time comparable to the 8-step-window ISCE, is not capable of reproducing the performance of the Cauchy filter. This clearly demonstrates that the heavy-tail characteristics of the Cauchy noise environment cannot be captured by a finite number of Gaussian PDFs. Its performance deteriorates significantly when both process and measurement noise outliers occur.


Fig. 4. Comparison of the ISCE using a window of 6 steps with the PF using 4,500 particles.

Fig. 5. Comparison of the ISCE using a window of 6 steps with the GSF keeping 3 terms at each step.

VII. CONCLUSION

In this paper, the performance of two suboptimal filtering algorithms has been numerically compared with that of the optimal Cauchy estimator. Simulation results for the two-state case have revealed that both the PF and the GSF perform quite poorly within the computation time dictated by the Cauchy filter with a window of six, while only the GSF performs quite poorly within the computation time dictated by a window of eight. Given the two-state ISCE's superior performance and its bounded computational cost, it is a good candidate for real-time implementation of filtering problems in heavy-tailed Cauchy noise environments.

REFERENCES

[1] M. Idan and J. L. Speyer, "State estimation for linear scalar dynamic systems with additive Cauchy noises: Characteristic function approach," SIAM Journal on Control and Optimization, vol. 50, no. 4, pp. 1971-1994, 2012.
[2] ——, "Multivariate Cauchy estimator with scalar measurement and process noises," SIAM Journal on Control and Optimization, vol. 52, no. 2, pp. 1108-1141, 2014.
[3] H. W. Sorenson and D. L. Alspach, "Recursive Bayesian estimation using Gaussian sums," Automatica, vol. 7, no. 4, pp. 465-479, 1971.

Fig. 6. Comparison of the ISCE using a window of 8 steps with the PF using 60,000 particles.

Fig. 7. Comparison of the ISCE using a window of 8 steps with the GSF keeping 7 terms at each step.

[4] D. L. Alspach and H. W. Sorenson, "Nonlinear Bayesian estimation using Gaussian sum approximations," IEEE Transactions on Automatic Control, vol. 17, no. 4, pp. 439-448, 1972.
[5] A. Doucet, S. Godsill, and C. Andrieu, "On sequential Monte Carlo sampling methods for Bayesian filtering," Statistics and Computing, vol. 10, no. 3, pp. 197-208, 2000.
[6] M. S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, "A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking," IEEE Transactions on Signal Processing, vol. 50, pp. 174-188, 2002.
[7] N. Gordon, D. Salmond, and A. Smith, "Novel approach to nonlinear/non-Gaussian Bayesian state estimation," in IEE Proceedings F Radar and Signal Processing, vol. 140, no. 2, 1993, pp. 107-113.
[8] J. R. Carpenter and A. K. Mashiku, "Cauchy drag estimation for low Earth orbiters," in AAS/AIAA Space Flight Mechanics Meeting, Williamsburg, VA, United States, 2015.
[9] M. Idan and J. L. Speyer, "Cauchy estimation for linear scalar systems," IEEE Transactions on Automatic Control, vol. 55, no. 6, pp. 1329-1342, 2010.
[10] J. Fernandez, J. Speyer, and M. Idan, "Stochastic estimation for two-state linear dynamic systems with additive Cauchy noises," IEEE Transactions on Automatic Control, vol. 60, no. 12, pp. 3367-3372, December 2015.
[11] A. Smith, A. Doucet, N. de Freitas, and N. Gordon, Sequential Monte Carlo Methods in Practice. Springer Verlag New York, 2013.
[12] T. Li, M. Bolic, and P. M. Djuric, "Resampling methods for particle filtering: Classification, implementation, and strategies," IEEE Signal Processing Magazine, vol. 32, no. 3, pp. 70-86, 2015.
