Approximate Estimators for Linear Systems With
Additive Cauchy Noises
Robert Fonod and Moshe Idan
Technion - Israel Institute of Technology, Haifa 3200003, Israel
Jason L. Speyer
University of California, Los Angeles, California 90095, USA
The recently published optimal Cauchy estimator poses practical implementation challenges due to its time-growing complexity. Alternatively, addressing impulsive measurement and process noises while using common estimation approaches requires heuristic schemes. Approximate methods, such as particle and Gaussian sum filters, have been suggested to tackle the estimation problem in a heavy-tailed noise environment while constraining the computational load. In this paper, the performances of a particle filter and a Gaussian sum filter, designed for a linear system with specified Cauchy noise parameters, are compared numerically to a Cauchy filter-based approximation, showing the advantages of the latter.
I. Introduction
Impulsive processes appear naturally in a variety of practical problems that range from engineering and
science to economics and finance. In many applications the underlying random processes or noises are
better described by heavy-tailed non-Gaussian densities [1], for example by the Cauchy probability density
function (pdf) [2]. Traditional filtering techniques often rely on the Gaussian assumption mainly because
modern methods and algorithms are able to handle such systems very efficiently [3], yielding, e.g., the Kalman filter^a [4]. However, in the presence of significantly non-Gaussian, heavy-tailed noises, and particularly in the presence of outliers, the performance of the Kalman filter degrades severely [5].
Impulsive measurement and process noises in stochastic state estimators have typically been handled
by heuristic schemes that augment the estimation process. Recently, an analytical recursive non-linear
estimation scheme, i.e., the Idan/Speyer Cauchy Estimator (ISCE), for linear scalar systems driven by Cauchy
distributed process and measurement noises has been developed using pdf [6] and characteristic function (cf)
[7, 8] approaches. Specifically, the latter recursively generates, in closed-form, the characteristic function of
the unnormalized conditional pdf (ucpdf) of the state given the measurement history. The number of terms
in the sum that expresses this cf grows with each measurement update.
Although the ISCE provides the exact minimum conditional variance estimates of the system's states given a measurement sequence, its computational complexity and memory burden become very high, requiring an approximation in its implementation. Such approximations were suggested in [6, 7] for the scalar ISCE and in [9, 10] for the two-state ISCE. In both cases, a fixed sliding window of the most recent measurements was considered to attain a near-optimal estimate, leading to an estimator with a finite computational burden.
Alternatively, in this study we wish to evaluate other general estimation algorithms for addressing impulsive noises [11–14]. Although those approaches are suboptimal, they may offer reasonable approximations when tuned for the heavy-tailed, Cauchy noise environment. Two of the most popular approximations are
Postdoctoral Fellow, Faculty of Aerospace Engineering, robert.fonod@technion.ac.il.
Associate Professor, Faculty of Aerospace Engineering, moshe.idan@technion.ac.il. Associate Fellow AIAA.
Ronald and Valerie Sugar Distinguished Professor in Engineering, Department of Mechanical and Aerospace Engineering,
speyer@ucla.edu. Fellow AIAA.
^a The Kalman filter is the minimum variance filter if the additive noise is Gaussian, and the best linear minimum variance filter if the noise is non-Gaussian but its second-order statistics exist.
the particle filter (PF) and the Gaussian sum filter (GSF), as both were shown to converge to the correct conditional density of the state as the number of terms in their implementation tends to infinity. Consequently, for real-time applications, they are implemented with some degree of approximation, producing a tradeoff between numerical efficiency and estimation performance in constructing the conditional pdf of the state given the measurement history and the resulting conditional mean and variance.
Using statistical Monte Carlo (MC) simulations and a judiciously chosen measure to evaluate the estimation performance of a heavy-tailed process, we extend the sample run results presented in [15]. Our main objective is to compare the efficiency of the PF and GSF to that of the scalar and two-state ISCE-based approximations, hence providing guidance for a practical implementation of an estimator for a heavy-tailed noise environment.
II. Problem Formulation
Consider a single-input-single-output, multivariate, discrete-time, time-invariant linear dynamic system described by

x_{k+1} = Φ x_k + Γ w_k,  (1)
z_k = H x_k + v_k,  (2)

with state vector x_k ∈ R^n, scalar measurement z_k ∈ R, and known matrices Φ ∈ R^{n×n}, Γ ∈ R^{n×1}, and H ∈ R^{1×n}. The sequence x_k, for k = 1, 2, ..., is a discrete-time Markov process. The scalar noise inputs w_k and v_k are independent Cauchy distributed random variables with zero median and scaling parameters β > 0 and γ > 0, respectively. Their pdf-s and cf-s, denoted p and φ, respectively, are assumed to be time independent and given by

p_W(w_k) = (β/π)/(w_k² + β²),   φ_W(ν̄) = e^{−β|ν̄|},  (3)
p_V(v_k) = (γ/π)/(v_k² + γ²),   φ_V(ν̄) = e^{−γ|ν̄|},  (4)

where ν̄ is a scalar spectral variable.
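The Cauchy noise models in (3)-(4) can be sampled via the inverse CDF; a minimal sketch, where the scale values β = 0.1 and γ = 0.2 are borrowed from the scalar example of Section VI.A:

```python
import numpy as np

def sample_cauchy(scale, size, rng):
    """Zero-median Cauchy samples via the inverse CDF:
    x = scale * tan(pi * (u - 1/2)), with u ~ Uniform(0, 1)."""
    u = rng.uniform(size=size)
    return scale * np.tan(np.pi * (u - 0.5))

rng = np.random.default_rng(0)
beta, gamma = 0.1, 0.2  # scaling parameters of p_W and p_V in (3)-(4)
w = sample_cauchy(beta, 100_000, rng)
v = sample_cauchy(gamma, 100_000, rng)
# The sample median converges to the zero median; the sample mean does
# not settle, since a Cauchy random variable has no first moment.
```

NumPy's `standard_cauchy` could be used equivalently; the inverse-CDF form makes the scaling parameter explicit.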
The initial conditions at k = 1 are also assumed to be independent Cauchy distributed random variables, i.e., each element x_{1;i} of the initial state x_1 has a Cauchy pdf with a zero median and a scaling parameter α_i > 0, i = 1, ..., n. The joint pdf of the initial conditions and its cf are given by

p_{X_1}(x_1) = ∏_{i=1}^{n} (α_i/π)/(x_{1;i}² + α_i²),   φ_{X_1}(ν) = ∏_{i=1}^{n} e^{−α_i|ν_i|},  (5)

where ν_i is an element of the n-dimensional spectral variable ν ∈ R^n.
The measurement history used in the estimation problem formulation is defined as z_{1:k} ≜ {z_1, ..., z_k}. The objective is to examine the practically computable minimum conditional variance estimates of x_k given the measurement history z_{1:k}.
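A short simulation sketch of (1)-(2) under these noise assumptions, using the scalar-case parameters of Section VI.A; the function and variable names are illustrative:

```python
import numpy as np

def simulate(Phi, Gam, H, beta, gamma, alpha, K, rng):
    """Simulate x_{k+1} = Phi x_k + Gam w_k, z_k = H x_k + v_k, with
    zero-median Cauchy noises of scales beta (process) and gamma
    (measurement); the initial state has independent Cauchy entries
    with scales alpha, as in (5)."""
    cauchy = lambda scale, size: scale * np.tan(np.pi * (rng.uniform(size=size) - 0.5))
    n = Phi.shape[0]
    x = cauchy(np.asarray(alpha), n)                   # x_1 ~ p_X1
    xs, zs = [], []
    for _ in range(K):
        xs.append(x)
        zs.append(H @ x + cauchy(gamma, 1)[0])         # measurement (2)
        x = Phi @ x + Gam.ravel() * cauchy(beta, 1)[0] # dynamics (1)
    return np.array(xs), np.array(zs)

# Scalar-case parameters from Section VI.A: Phi = 0.75, Gam = 1, H = 2
xs, zs = simulate(np.array([[0.75]]), np.array([[1.0]]), np.array([2.0]),
                  0.1, 0.2, [0.5], K=100, rng=np.random.default_rng(1))
```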
III. Bayesian Approach
From a Bayesian perspective, the filtering problem is solved by constructing the posterior density p(x_k|z_{1:k})^b of the state x_k at time k given all the available information z_{1:k}. The pdf p(x_k|z_{1:k}) contains all available statistical information, and thus is the complete solution to the estimation problem [3]. In principle, an optimal (with respect to any criterion) estimate of the state may be obtained from this pdf.
Given the posterior pdf p(x_{k−1}|z_{1:k−1}) at time k−1, the posterior pdf p(x_k|z_{1:k}) at time k can be computed using a two-stage approach of prediction and update. For square systems (i.e., rank(Γ) = n), the prediction stage can be obtained via the Chapman-Kolmogorov equation

p(x_k|z_{1:k−1}) = ∫ p(x_{k−1}|z_{1:k−1}) p(x_k|x_{k−1}) dx_{k−1},  (6)

^b For simplicity, p(x_k|z_{1:k}) denotes p_{X_k|Z_{1:k}}(x_k|z_{1:k}). This notational simplification is used hereafter, whenever the context is clear, to avoid cumbersome notation.
where p(x_k|x_{k−1}) is the state transition probability density determined by the state equation (1) and the known statistics of the process noise w_{k−1}, i.e.,

p(x_k|x_{k−1}) = p_W(Γ⁻¹(x_k − Φ x_{k−1})).  (7)

For non-square (i.e., rank(Γ) ≠ n) and non-singular systems (i.e., rank(Φ) = n), the prediction stage (6) can be computed via [8]

p(x_k|z_{1:k−1}) = |Φ⁻¹| ∫ p_{X_{k−1}|Z_{1:k−1}}(Φ⁻¹x_k − Φ⁻¹Γw_{k−1} | z_{1:k−1}) p_W(w_{k−1}) dw_{k−1},  (8)

where |·| stands for the determinant of a matrix.
If at time k a measurement z_k becomes available, then z_k can be used to update the prior pdf p(x_k|z_{1:k−1}) via Bayes' rule^c

p(x_k|z_{1:k}) = p(z_k|x_k) p(x_k|z_{1:k−1}) / ∫ p(z_k|x_k) p(x_k|z_{1:k−1}) dx_k,  (9)

where p(z_k|x_k) is the measurement likelihood defined by the measurement model (2) and the known statistics of the measurement noise v_k, i.e.,

p(z_k|x_k) = p_V(z_k − H x_k).  (10)

Given p(x_k|z_{1:k}), the minimum conditional variance estimate x̂_k and the second conditional moment of the state are given by

x̂_k = E[x_k|z_{1:k}] = ∫ x_k p(x_k|z_{1:k}) dx_k,  (11)
E[x_k x_k^T|z_{1:k}] = ∫ x_k x_k^T p(x_k|z_{1:k}) dx_k.  (12)

The minimal conditional variance of the estimation error, defined as x̃_k ≜ x_k − x̂_k, is then determined by

P_k = E[x̃_k x̃_k^T|z_{1:k}] = E[x_k x_k^T|z_{1:k}] − x̂_k x̂_k^T.  (13)
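For a scalar state, the prediction-update recursion (6), (9) and the moments (11), (13) can be approximated by brute-force numerical integration on a fixed grid; a minimal sketch, with the grid limits and spacing chosen arbitrarily for illustration:

```python
import numpy as np

# Scalar grid-based Bayes filter for (1)-(2) with Cauchy noises.
Phi, Gam, H = 0.75, 1.0, 2.0          # scalar-case parameters (Section VI.A)
beta, gamma, alpha = 0.1, 0.2, 0.5
x = np.linspace(-20.0, 20.0, 2001)    # state grid (illustrative choice)
dx = x[1] - x[0]
cauchy = lambda y, s: (s / np.pi) / (y**2 + s**2)

def bayes_step(prior, z):
    """Measurement update (9) followed by moment evaluation (11), (13)."""
    post = prior * cauchy(z - H * x, gamma)        # likelihood (10)
    post /= np.sum(post) * dx                      # normalization in (9)
    mean = np.sum(x * post) * dx                   # (11)
    var = np.sum(x**2 * post) * dx - mean**2       # (13)
    return post, mean, var

def predict(post):
    """Chapman-Kolmogorov prediction (6): integrate the posterior
    against the Cauchy transition density of (7)."""
    trans = cauchy(x[:, None] - Phi * x[None, :], Gam * beta)
    return trans @ post * dx

prior = cauchy(x, alpha)                           # p_X1 from (5)
post, mean, var = bayes_step(prior, z=0.4)
prior_next = predict(post)
```

The grid truncates the heavy Cauchy tails, so a small amount of probability mass is lost at each prediction; the closed-form ISCE of Section IV avoids this entirely.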
IV. Optimal Solution
Here we present a brief overview of the analytical estimation solution to the Cauchy problem, the ISCE.
IV.A. Scalar ISCE: pdf Approach
The pioneering work of Idan and Speyer [6] derived the ISCE for scalar-state systems by solving, in closed form, the integrals of the Bayesian update rule involved in constructing the posterior pdf p(x_k|z_{1:k}). It was shown that under mild conditions on the system and noise parameters (see Assumption 4.1 in [6]), p(x_k|z_{1:k}) can be expressed as

p(x_k|z_{1:k}) = Σ_{i=1}^{k+2} (a^i_{k|k} x_k + b^i_{k|k}) / ((x_k − σ^i_{k|k})² + (ω^i_{k|k})²).  (14)
Initialization and update rules for the series coefficients a^i_{k|k}, b^i_{k|k}, σ^i_{k|k}, and ω^i_{k|k} can be found in [6]. All but ω^i_{k|k} are functions of the measurements. The series coefficients of the above pdf are used to determine x̂_k and P_k as

x̂_k = π Σ_{i=1}^{k+2} { a^i_{k|k}[(σ^i_{k|k})² − (ω^i_{k|k})²] + b^i_{k|k} σ^i_{k|k} } / ω^i_{k|k},  (15)
P_k = π Σ_{i=1}^{k+2} { h^i_{k|k} − 2 a^i_{k|k} σ^i_{k|k} (ω^i_{k|k})² } / ω^i_{k|k} − x̂_k²,  (16)

where h^i_{k|k} = (a^i_{k|k} σ^i_{k|k} + b^i_{k|k}) [(σ^i_{k|k})² − (ω^i_{k|k})²].
The approach above provides a closed-form expression for p(x_k|z_{1:k}), which can be examined for its shape and additional properties, as will be carried out in this study.
^c At time k = 1, the prior pdf is defined as p(x_1|z_{1:0}) ≜ p_{X_1}(x_1).
IV.B. Multivariate ISCE: cf Approach
Since the pdf approach did not extend to multivariate systems, an alternative derivation that utilizes the
characteristic function of the ucpdf was proposed. The ISCE for scalar-state systems was re-derived first
using this approach in [7]. Subsequently, it was extended to multivariate systems in [8].
Here, we propagate the cf of p(x_k|z_{1:k}), given by

φ_{x_k|z_{1:k}}(ν) = ∫ p(x_k|z_{1:k}) e^{jν^T x_k} dx_k.  (17)

Moreover, for computational simplicity, the normalization by p(z_k|z_{1:k−1}) when computing p(x_k|z_{1:k}) can be postponed, thus propagating the cf of the ucpdf

φ̄_{x_k|z_{1:k}}(ν) = ∫ p(x_k, z_{1:k}) e^{jν^T x_k} dx_k.  (18)

From the Bayesian update rule, p(x_k, z_{1:k}) is the ucpdf of the state, while the normalization factor to obtain p(x_k|z_{1:k}) is given by φ̄_{x_k|z_{1:k}}(0).
In [8], it was shown that φ̄_{x_k|z_{1:k}}(ν) at the update time k can be expressed as a growing sum of n^{k|k}_t terms

φ̄_{x_k|z_{1:k}}(ν) = Σ_{i=1}^{n^{k|k}_t} g^{k|k}_i(y^{k|k}_{g_i}(ν)) exp(y^{k|k}_{e_i}(ν)),  (19)

i.e., a sum of exponential terms, each multiplied by a coefficient function g^{k|k}_i(·) that is a complex, nonlinear function of the measurements. The argument of this coefficient function, y^{k|k}_{g_i}(ν), is real and is expressed as a sum of sign functions of ν with known parameters. The real part of the argument of the exponents, y^{k|k}_{e_i}(ν), is the absolute value of a function of the spectral vector ν, and its imaginary part is a linear function of the measurements. The details of the various parameters and functions of the above expression can be found in [8]. Since φ̄_{x_k|z_{1:k}}(ν) is twice continuously differentiable, the minimum conditional variance estimate of the state can be obtained by [8]

x̂_k = (1 / (j p_{Z_{1:k}}(z_{1:k}))) Σ_{i=1}^{n^{k|k}_t} g^{k|k}_i(y^{k|k}_{g_i}(ν̂)) ȳ^{k|k}_{e_i}(ν̂),  (20)

where j is the imaginary unit, ν̂ is a fixed direction in the ν domain, ȳ^{k|k}_{e_i}(ν̂) is an n-dimensional vector that relates to y^{k|k}_{e_i}(ν̂) through ⟨ȳ^{k|k}_{e_i}(ν̂), ν̂⟩ = y^{k|k}_{e_i}(ν̂), and p_{Z_{1:k}}(z_{1:k}) is

p_{Z_{1:k}}(z_{1:k}) = φ̄_{x_k|z_{1:k}}(ν̂)|_{ν̂=0} = Σ_{i=1}^{n^{k|k}_t} g^{k|k}_i(y^{k|k}_{g_i}(ν̂)).  (21)

The estimation error covariance matrix is obtained by

P_k = (1 / (j² p_{Z_{1:k}}(z_{1:k}))) Σ_{i=1}^{n^{k|k}_t} g^{k|k}_i(y^{k|k}_{g_i}(ν̂)) ȳ^{k|k}_{e_i}(ν̂) (ȳ^{k|k}_{e_i}(ν̂))^T − x̂_k x̂_k^T.  (22)
V. Suboptimal Solutions
In this section, the PF and GSF are designed to approximate the posterior density p(x_k|z_{1:k}) for the system described by (1) and (2), in which the noise sequences w_k and v_k and the initial state x_1 are Cauchy distributed. As observed above, the exact posterior pdf (14) or its cf (19) is expressed in the optimal ISCE as a series with a growing number of terms. To avoid the associated computational burden, [6, 7, 9, 10] suggest a truncation procedure that limits the number of terms in these series to a prescribed fixed sliding window of the n_s most recent measurements. The validity of these approximations was demonstrated even when using only around twenty terms for the scalar case and a window of eight, i.e., around 3000 terms, for the two-state ISCE. Consequently, only approximate ISCE implementations were considered in this study when comparing the performance with the proposed PF and GSF.
V.A. Particle Filter
The particle filter, also known as the sequential Monte Carlo method, is a set of algorithms implementing
recursive Bayesian estimation based on point mass representations of probability densities. For good surveys,
see [16, 17].
The main idea of the PF is to represent the posterior density p(x_k|z_{1:k}) using a set of random samples with associated weights. Let {x^i_k, μ^i_k}_{i=1}^{n_p} be such a representation characterizing the posterior p(x_k|z_{1:k}) at time k, where {x^i_k}_{i=1}^{n_p} is a set of n_p support points (particles) with associated nonnegative weights {μ^i_k}_{i=1}^{n_p}. The weights are normalized such that Σ_{i=1}^{n_p} μ^i_k = 1.
V.A.1. Sequential Importance Sampling
Most PFs are based on the algorithm known as sequential importance sampling (SIS), also known as the bootstrap filter [11], which is an MC technique for solving the Bayesian problem. Given {x^i_{k−1}, μ^i_{k−1}}_{i=1}^{n_p} and z_k, the posterior density p(x_k|z_{1:k}) at time k can be approximated using the principle of importance sampling, the Markov property, and Bayes' rule as follows [17]:

p(x_k|z_{1:k}) ≈ Σ_{i=1}^{n_p} μ^i_k δ(x_k − x^i_k),  (23)

where δ(·) stands for the Dirac delta density. In (23), the ith particle x^i_k is sampled from the chosen importance density q(x_k|x^i_{k−1}, z_k), also known as the proposal density. The ith importance weight μ^i_k associated with x^i_k is updated as

μ^i_k ∝ μ^i_{k−1} p(z_k|x^i_k) p(x^i_k|x^i_{k−1}) / q(x^i_k|x^i_{k−1}, z_k),   i = 1, 2, ..., n_p,  (24)

where p(z_k|x^i_k) is the measurement likelihood given by (10), p(x^i_k|x^i_{k−1}) is the state transition density given by (7), and the symbol ∝ signifies "proportional to".
The SIS algorithm thus consists of recursive propagation of the weights and particles as each measurement is received sequentially. Based on the strong law of large numbers, as n_p → ∞, the approximated posterior of (23) approaches the true posterior p(x_k|z_{1:k}) [17]. The numerical approximations to (11) and (13), respectively, are computed as

x̂_k ≈ Σ_{i=1}^{n_p} μ^i_k x^i_k,  (25)
P_k ≈ Σ_{i=1}^{n_p} μ^i_k (x^i_k − x̂_k)(x^i_k − x̂_k)^T.  (26)
V.A.2. Choice of the Importance Density
It has been shown in [12] that the optimal choice of importance density, the one that minimizes the variance of the importance weights μ^i_k, is

q(x_k|x^i_{k−1}, z_k) = p(x_k|x^i_{k−1}, z_k) = p(z_k|x_k, x^i_{k−1}) p(x_k|x^i_{k−1}) / p(z_k|x^i_{k−1}).  (27)

However, this importance density is not always available and can be used only in special cases, e.g., for a class of models for which p(x_k|x^i_{k−1}, z_k) is Gaussian [12]. Hence, the most widely used importance density is the prior pdf [17], i.e.,

q(x_k|x^i_{k−1}, z_k) = p(x_k|x^i_{k−1}).  (28)
This choice of importance density means that for k ≥ 2, we need to sample particles from p(x_k|x^i_{k−1}). A sample x^i_k ∼ p(x_k|x^i_{k−1}) can be obtained by first generating a process noise sample w^i_{k−1} ∼ p_W(w_{k−1}) and setting x^i_k = Φ x^i_{k−1} + Γ w^i_{k−1}, where p_W(w_{k−1}) is the Cauchy process noise pdf defined in (3). For k = 1, the particles are generated from the initial density, i.e., x^i_1 ∼ p_{X_1}(x_1), where p_{X_1}(x_1) is the initial state pdf defined in (5). Additionally, the weight update formula (24) reduces to

μ^i_k ∝ μ^i_{k−1} p(z_k|x^i_k) = μ^i_{k−1} p_V(z_k − H x^i_k),   i = 1, 2, ..., n_p,  (29)

where p_V(·) is the Cauchy measurement noise pdf defined in (4).
Notice that the importance density of (28) is independent of the measurement z_k; hence, the state space is explored without knowledge of the actual observation. This choice of q(x_k|x^i_{k−1}, z_k) may fail if new measurements appear in the tail of the prior, or if the measurement likelihood is too peaked in comparison to the prior. This strategy exacerbates a well-known problem of the SIS algorithm, known as the degeneracy (or sample impoverishment) problem [17]. On the other hand, the advantage of the SIS algorithm is that its computational burden is constant at each time step.
V.A.3. Degeneracy Problem
The degeneracy problem arises when, after a few iterations of the SIS algorithm, only a few particles have significant weights while the remaining particles have negligible weights. This yields a very poor approximation of p(x_k|z_{1:k}) and may lead to a breakdown of the algorithm. Note that this phenomenon occurs even if the optimal importance density (27) is used, but it is more severe when using p(x_k|x^i_{k−1}).
A suitable measure to assess the degeneracy of the SIS algorithm is the effective sample size estimate [17]

n̂^eff_p = 1 / Σ_{i=1}^{n_p} (μ^i_k)².  (30)

Here, 1 ≤ n̂^eff_p ≤ n_p, where the upper bound is attained when all particles have the same weight, and the lower bound when the entire probability mass is concentrated at a single particle. A small n̂^eff_p indicates severe degeneracy.
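The estimate (30) and its two bounds can be checked directly; a minimal sketch:

```python
import numpy as np

def effective_sample_size(weights):
    """Effective sample size estimate (30): 1 / sum of squared weights."""
    w = np.asarray(weights, dtype=float)
    return 1.0 / np.sum(w**2)

n_p = 50
uniform = np.full(n_p, 1.0 / n_p)  # equal weights: n_eff equals n_p
one_hot = np.eye(n_p)[0]           # all mass on one particle: n_eff equals 1
```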
V.A.4. Resampling
The most common remedy for the degeneracy problem is resampling. It discards particles with low importance weights, as they do not contribute to the approximation, and replaces them with particles in the vicinity of those with high importance weights [11]. To prevent degeneracy, an appropriate resampling procedure is invoked whenever n̂^eff_p falls below a fixed threshold n^t_p.
Several resampling schemes exist. The choice of the particular resampling scheme affects the computational load as well as the approximation error; see the discussion and classification in [18]. In our study, only the systematic resampling strategy [19] was used, as it was shown empirically to outperform other methods for the Cauchy case.
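Putting the pieces of Section V.A together, the prior-proposal SIS recursion (28)-(29) with systematic resampling can be sketched as follows for the scalar Cauchy system. The system parameters and the threshold rule n̂^eff_p < (2/3)n_p follow the numerical study of Section VI; the function names and the test measurement sequence are illustrative:

```python
import numpy as np

Phi, Gam, H = 0.75, 1.0, 2.0
beta, gamma, alpha = 0.1, 0.2, 0.5
rng = np.random.default_rng(2)
cauchy_pdf = lambda y, s: (s / np.pi) / (y**2 + s**2)
cauchy_rvs = lambda s, n: s * np.tan(np.pi * (rng.uniform(size=n) - 0.5))

def systematic_resample(weights):
    """Systematic resampling [19]: one uniform offset, n_p evenly
    spaced positions matched against the cumulative weights."""
    n = len(weights)
    positions = (rng.uniform() + np.arange(n)) / n
    return np.searchsorted(np.cumsum(weights), positions)

def bootstrap_pf(zs, n_p=1000):
    x = cauchy_rvs(alpha, n_p)                   # x_1^i ~ p_X1, eq. (5)
    w = np.full(n_p, 1.0 / n_p)
    means, variances = [], []
    for k, z in enumerate(zs):
        if k > 0:                                # propagate via (1), proposal (28)
            x = Phi * x + Gam * cauchy_rvs(beta, n_p)
        w *= cauchy_pdf(z - H * x, gamma)        # weight update (29)
        w /= w.sum()
        means.append(np.sum(w * x))              # (25)
        variances.append(np.sum(w * (x - means[-1])**2))  # (26)
        if 1.0 / np.sum(w**2) < (2 / 3) * n_p:   # degeneracy test (30)
            idx = systematic_resample(w)
            x, w = x[idx], np.full(n_p, 1.0 / n_p)
    return np.array(means), np.array(variances)

zs = np.array([0.4, 0.1, -0.2, 3.0, 0.0])        # illustrative measurements
means, variances = bootstrap_pf(zs)
```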
V.B. Gaussian Sum Filter
In this subsection, the problem of determining the posterior density p(x_k|z_{1:k}) is treated using the Gaussian sum approximation. The presented filtering scheme is an adaptation of the well-known GSF algorithm of Sorenson and Alspach [13] to the system described by (1) and (2), in which the noise sequences w_k and v_k and the initial state x_1 are Cauchy distributed.
V.B.1. Gaussian Mixture Model
Based on the Wiener approximation theory [20], any pdf can be expressed, or approximated with a given level of accuracy, using a weighted sum of Gaussian densities, also known as a Gaussian mixture model (GMM), given by

p^G_X(x) = Σ_{i=1}^{n_x} μ^i_x N(x; x̄^i_x, P^i_x).  (31)

Here n_x is a positive integer indicating the number of Gaussian components (terms) in the GMM, μ^i_x ≥ 0, i = 1, 2, ..., n_x, are scalar weighting factors satisfying Σ_{i=1}^{n_x} μ^i_x = 1, and N(x; x̄^i_x, P^i_x) denotes a multivariate Gaussian density function with argument x ∈ R^m, mean x̄^i_x ∈ R^m, and covariance matrix P^i_x ∈ R^{m×m}. It can be shown that p^G_X(x) is a valid density function and converges uniformly to any density of practical concern by letting n_x increase and each elemental covariance approach the zero matrix.
6
V.B.2. Fitting a GMM to a Cauchy Density
To obtain the recursive Bayesian filter in the GMM framework, the stationary Cauchy densities given in (3), (4), and (5) first need to be fitted (approximated) by a GMM. The fitting can be done in various ways. In this paper, it is formulated as the following constrained optimization problem: given the desired number of Gaussian components (n_x ≥ 1), find {μ^i_x, x̄^i_x, P^i_x}_{i=1}^{n_x} that minimizes the integral square difference (ISD) between a particular Cauchy density of interest and a GMM, i.e.,

argmin_{ {μ^i_x, x̄^i_x, P^i_x}_{i=1}^{n_x} }  J_f = ∫ [p^C_X(x) − p^G_X(x)]² dx,  (32a)
such that  Σ_{i=1}^{n_x} μ^i_x = 1,  μ^i_x ≥ 0,  P^i_x = (P^i_x)^T > 0,  i = 1, 2, ..., n_x.  (32b)

Here p^G_X(x) is the GMM defined in (31) and p^C_X(x) = ∏_{i=1}^{m} (δ_i/π)/(x_i² + δ_i²) is a zero-median multivariate Cauchy density function with argument x = [x_1, x_2, ..., x_m]^T and scaling parameters δ_i > 0, i = 1, 2, ..., m. The complex minimization problem (32) can be solved, e.g., numerically by standard constrained optimization tools.
Figure 1 illustrates an actual fitting of a standard scalar Cauchy pdf (m = 1 and δ_1 = 1) with a GMM having different numbers of Gaussian components n_x. For n_x = 3, n_x = 5, and n_x = 7, the resulting ISD, computed numerically, is approximately J_f = 2.4×10⁻³, J_f = 9.79×10⁻⁵, and J_f = 6.01×10⁻⁵, respectively.
[Figure: the standard Cauchy pdf p_X(x) overlaid with its GMM approximations using 3, 5, and 7 terms; a zoomed view highlights the fit in the tails.]
Figure 1. Fitting a Cauchy PDF with a GMM having different numbers of terms.
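The ISD (32a) for a given GMM can be evaluated by direct numerical integration; a minimal sketch for the standard scalar Cauchy density, using an arbitrary, non-optimized 3-term GMM (the J_f values reported above correspond to optimized fits whose parameters are not listed here):

```python
import numpy as np

def gauss(x, mean, var):
    return np.exp(-0.5 * (x - mean)**2 / var) / np.sqrt(2 * np.pi * var)

def isd_cauchy_gmm(weights, means, variances, grid):
    """Numerically evaluate J_f = integral of (p_C - p_G)^2 from (32a)
    for the standard scalar Cauchy density (m = 1, delta_1 = 1)."""
    p_c = (1.0 / np.pi) / (grid**2 + 1.0)
    p_g = sum(w * gauss(grid, m, v)
              for w, m, v in zip(weights, means, variances))
    dx = grid[1] - grid[0]
    return np.sum((p_c - p_g)**2) * dx

grid = np.linspace(-200.0, 200.0, 400_001)
# Arbitrary zero-mean 3-term GMM (not an optimized fit)
J = isd_cauchy_gmm([0.5, 0.3, 0.2], [0.0, 0.0, 0.0], [0.8, 4.0, 25.0], grid)
```

In an actual fit, this integral would be the cost function handed to a constrained optimizer, as done with fmincon in Section VI.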
To proceed with the GSF algorithm, assume that the Cauchy densities given in (3), (4), and (5) are all fitted by a GMM in the same way as p^G_X was fitted to p^C_X in (32), i.e.,

p_W(w_k) ≈ Σ_{i=1}^{n_w} μ^i_w N(w_k; x̄^i_w, P^i_w),  (33a)
p_V(v_k) ≈ Σ_{i=1}^{n_v} μ^i_v N(v_k; x̄^i_v, P^i_v),  (33b)
p_{X_1}(x_1) ≈ Σ_{i=1}^{n_{1|0}} μ^i_{1|0} N(x_1; x̄^i_{1|0}, P^i_{1|0}).  (33c)
V.B.3. Time Propagation
Suppose that at time k−1, the posterior density p(x_{k−1}|z_{1:k−1}) is approximated by a weighted sum of n_{k−1|k−1} Gaussian densities

p(x_{k−1}|z_{1:k−1}) ≈ Σ_{i=1}^{n_{k−1|k−1}} μ^i_{k−1|k−1} N(x_{k−1}; x̄^i_{k−1|k−1}, P^i_{k−1|k−1}).  (34)

Then, the approximation of the a priori density p(x_k|z_{1:k−1}) at time k is obtained in the GSF sense as

p(x_k|z_{1:k−1}) ≈ Σ_{i=1}^{n_{k−1|k−1}} Σ_{j=1}^{n_w} μ̃^{ij}_{k|k−1} N(x_k; m̄^{ij}_{k|k−1}, M^{ij}_{k|k−1}),  (35)

where m̄^{ij}_{k|k−1} and M^{ij}_{k|k−1} are computed using Kalman-like equations, i.e., for all i = 1, ..., n_{k−1|k−1} and j = 1, ..., n_w we have

m̄^{ij}_{k|k−1} = Φ x̄^i_{k−1|k−1} + Γ x̄^j_w,  (36a)
M^{ij}_{k|k−1} = Φ P^i_{k−1|k−1} Φ^T + Γ P^j_w Γ^T.  (36b)

The weighting factors μ̃^{ij}_{k|k−1} are updated for all i = 1, ..., n_{k−1|k−1} and j = 1, ..., n_w as

μ̃^{ij}_{k|k−1} = μ^i_{k−1|k−1} μ^j_w.  (37)

For notational convenience, the double summation in (35) can be restated as

p(x_k|z_{1:k−1}) ≈ Σ_{i=1}^{n_{k|k−1}} μ^i_{k|k−1} N(x_k; x̄^i_{k|k−1}, P^i_{k|k−1}),  (38)

where n_{k|k−1} = (n_{k−1|k−1})(n_w), and μ^i_{k|k−1}, x̄^i_{k|k−1}, and P^i_{k|k−1} are formed in an obvious fashion from μ̃^{ij}_{k|k−1}, m̄^{ij}_{k|k−1}, and M^{ij}_{k|k−1}, respectively.
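Equations (36)-(38) amount to pairing every posterior component with every process-noise component; a minimal scalar-case sketch with illustrative mixture parameters:

```python
import numpy as np

def gsf_predict(mu, xbar, P, mu_w, xbar_w, P_w, Phi, Gam):
    """GSF time propagation (36)-(38) for a scalar state: every
    posterior component i is paired with every process-noise
    component j, multiplying the number of terms by n_w."""
    mu_out, m_out, M_out = [], [], []
    for mi, xi, Pi in zip(mu, xbar, P):
        for mj, xj, Pj in zip(mu_w, xbar_w, P_w):
            mu_out.append(mi * mj)                         # (37)
            m_out.append(Phi * xi + Gam * xj)              # (36a)
            M_out.append(Phi * Pi * Phi + Gam * Pj * Gam)  # (36b)
    return np.array(mu_out), np.array(m_out), np.array(M_out)

# Two posterior components and three process-noise components
# yield the 6 terms of (38).
mu, m, M = gsf_predict([0.6, 0.4], [0.0, 1.0], [0.5, 0.2],
                       [0.5, 0.3, 0.2], [0.0, 0.0, 0.0], [0.01, 0.1, 1.0],
                       Phi=0.75, Gam=1.0)
```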
V.B.4. Measurement Update
Suppose that at time k, the a priori density p(x_k|z_{1:k−1}) is expressed as in (38)^d. Then, using the measurement z_k, the a posteriori density p(x_k|z_{1:k}) at time k is approximated in the GSF sense as

p(x_k|z_{1:k}) ≈ Σ_{i=1}^{n_{k|k−1}} Σ_{j=1}^{n_v} μ̃^{ij}_{k|k} N(x_k; m̄^{ij}_{k|k}, M^{ij}_{k|k}),  (39)

where m̄^{ij}_{k|k} and M^{ij}_{k|k} are computed for all i = 1, ..., n_{k|k−1} and j = 1, ..., n_v as

ẑ^{ij}_k = H x̄^i_{k|k−1} + x̄^j_v,  (40a)
S^{ij}_k = H P^i_{k|k−1} H^T + P^j_v,  (40b)
K^{ij}_k = P^i_{k|k−1} H^T (S^{ij}_k)⁻¹,  (40c)
m̄^{ij}_{k|k} = x̄^i_{k|k−1} + K^{ij}_k (z_k − ẑ^{ij}_k),  (40d)
M^{ij}_{k|k} = P^i_{k|k−1} − K^{ij}_k S^{ij}_k (K^{ij}_k)^T.  (40e)

The weighting factors μ̃^{ij}_{k|k} are updated for all i = 1, ..., n_{k|k−1} and j = 1, ..., n_v using the following rule:

μ̃^{ij}_{k|k} = μ^i_{k|k−1} μ^j_v N(z_k; ẑ^{ij}_k, S^{ij}_k) / Σ_{l=1}^{n_{k|k−1}} Σ_{m=1}^{n_v} μ^l_{k|k−1} μ^m_v N(z_k; ẑ^{lm}_k, S^{lm}_k).  (41)

^d Note that at time k = 1, the a priori density corresponds to the GMM representation of the initial state density given in (33c), which has the same form as (38).
For convenience, one can rewrite (39) as

p(x_k|z_{1:k}) ≈ Σ_{i=1}^{n_{k|k}} μ^i_{k|k} N(x_k; x̄^i_{k|k}, P^i_{k|k}),  (42)

where n_{k|k} = (n_{k|k−1})(n_v), and μ^i_{k|k}, x̄^i_{k|k}, and P^i_{k|k} are again formed from μ̃^{ij}_{k|k}, m̄^{ij}_{k|k}, and M^{ij}_{k|k}, respectively. The weighting factors μ^i_{k|k} satisfy μ^i_{k|k} ≥ 0 and Σ_{i=1}^{n_{k|k}} μ^i_{k|k} = 1, thus generating a proper pdf in (42). Note that for n_w = n_v = n_{1|0} = 1, the above GSF equations reduce to the standard Kalman filter equations.
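Equations (40)-(41) perform one Kalman update per (prior component, measurement-noise component) pair; a minimal scalar-case sketch with illustrative mixture parameters:

```python
import numpy as np

def gsf_update(z, mu, xbar, P, mu_v, xbar_v, P_v, H):
    """GSF measurement update (40)-(41) for a scalar state: a Kalman
    update per (prior i, measurement-noise j) pair, with each weight
    scaled by the Gaussian measurement likelihood of that pair."""
    mu_out, m_out, M_out = [], [], []
    for mi, xi, Pi in zip(mu, xbar, P):
        for mj, xj, Pj in zip(mu_v, xbar_v, P_v):
            z_hat = H * xi + xj                 # (40a)
            S = H * Pi * H + Pj                 # (40b)
            K = Pi * H / S                      # (40c)
            m_out.append(xi + K * (z - z_hat))  # (40d)
            M_out.append(Pi - K * S * K)        # (40e)
            lik = np.exp(-0.5 * (z - z_hat)**2 / S) / np.sqrt(2 * np.pi * S)
            mu_out.append(mi * mj * lik)        # numerator of (41)
    mu_out = np.array(mu_out)
    return mu_out / mu_out.sum(), np.array(m_out), np.array(M_out)

mu, m, M = gsf_update(0.4, [0.5, 0.5], [0.0, 0.2], [0.3, 0.3],
                      [0.7, 0.3], [0.0, 0.0], [0.05, 0.5], H=2.0)
```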
V.B.5. Conditional Mean and Estimation Error Variance
Using the posterior density p(x_k|z_{1:k}) at time k, as given in (42), the conditional mean (11) and the estimation error covariance (13) can be approximated in the GSF sense as

x̂_k ≈ Σ_{i=1}^{n_{k|k}} μ^i_{k|k} x̄^i_{k|k},  (43)
P_k ≈ Σ_{i=1}^{n_{k|k}} μ^i_{k|k} [ P^i_{k|k} + (x̄^i_{k|k} − x̂_k)(x̄^i_{k|k} − x̂_k)^T ].  (44)

The major disadvantage of the GSF algorithm is that the number of terms in the Gaussian sum increases exponentially in time. The number of terms in p(x_k|z_{1:k}) at step k can be expressed explicitly as n_{k|k} = n_{1|0} (n_w n_v)^k / n_w. This is obviously a heavy computational burden for real-time implementation. It is normally addressed by the Gaussian sum re-approximation discussed next.
V.B.6. Gaussian Sum Re-approximation
Several heuristic approaches have been proposed in the literature to avoid the exponential growth of the number of terms n_{k|k} in the Gaussian sum (42); see for instance [13, 21] and the references therein. A seemingly tempting method of keeping only the Gaussian components with the largest weights was found to be inefficient [22]. This is mainly because even if the weight of a Gaussian term is very small at a certain point, it might become large at the next time step. This is often the case in the Cauchy distributed environment, where an outlier is very likely to occur. Ignoring such a component might have severe consequences.
In this paper, we suggest re-approximating the posterior density estimate (42) of the GSF by a new reduced-order GMM with a pre-fixed number of Gaussian components. This reduction in terms is motivated by the observation that a relatively small number of weighted Gaussian components can approximate a large class of densities [23], as well as by our aim to confine the computational time of the GSF in order to compare its performance to the approximate ISCE and PF, both having a bounded computational burden.
For ease of notation, assume that at a given time step the measurement-updated Gaussian sum density (42) is denoted by p_a(x) and has n_a terms, i.e.,

p_a(x) = Σ_{i=1}^{n_a} μ^i_a N(x; x̄^i_a, P^i_a).  (45)
After evaluating (43) and (44) using p_a(x), the objective is to approximate p_a(x) by another Gaussian sum density,

p_b(x) = Σ_{i=1}^{n_b} μ^i_b N(x; x̄^i_b, P^i_b),  (46)

which has a constant pre-fixed number of terms n_b. Obviously, if n_a ≤ n_b, then there is no need for re-approximation and p_b(x) = p_a(x) is used. If n_a > n_b, then the task of the suggested re-approximation scheme is to determine {μ^i_b, x̄^i_b, P^i_b}_{i=1}^{n_b} such that the mean and covariance of the new Gaussian mixture p_b(x_k) match exactly those of p_a(x_k), while also minimizing the ISD between p_a(x_k) and p_b(x_k). Given n_b ≥ 1, this task can be formulated as the constrained optimization problem

argmin_{ {μ^i_b, x̄^i_b, P^i_b}_{i=1}^{n_b} }  J = ∫ [p_a(x) − p_b(x)]² dx,  (47a)
such that  Σ_{i=1}^{n_b} μ^i_b = 1,  μ^i_b ≥ 0,  P^i_b = (P^i_b)^T > 0,  i = 1, 2, ..., n_b,  (47b)
Σ_{i=1}^{n_a} μ^i_a x̄^i_a = Σ_{i=1}^{n_b} μ^i_b x̄^i_b,   Σ_{i=1}^{n_a} μ^i_a [P^i_a + x̄^i_a (x̄^i_a)^T] = Σ_{i=1}^{n_b} μ^i_b [P^i_b + x̄^i_b (x̄^i_b)^T].  (47c)
The cost function (47a) can be expanded and rewritten as

J = ∫ p_a²(x) dx − 2 ∫ p_a(x) p_b(x) dx + ∫ p_b²(x) dx ≜ J_aa − 2J_ab + J_bb,  (48a)

where the particular integrals J_aa, J_ab, and J_bb were solved, in closed form, by Williams and Maybeck [24], to yield

J_aa = Σ_{i=1}^{n_a} Σ_{j=1}^{n_a} μ^i_a μ^j_a N(x̄^i_a; x̄^j_a, P^i_a + P^j_a),  (48b)
J_ab = Σ_{i=1}^{n_a} Σ_{j=1}^{n_b} μ^i_a μ^j_b N(x̄^i_a; x̄^j_b, P^i_a + P^j_b),  (48c)
J_bb = Σ_{i=1}^{n_b} Σ_{j=1}^{n_b} μ^i_b μ^j_b N(x̄^i_b; x̄^j_b, P^i_b + P^j_b).  (48d)

A small value of J indicates, in the ISD sense, that p_b(x) is a good approximation of p_a(x). However, there is no guarantee that the re-approximated density also preserves the higher-order moments of the original one.
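The closed-form expressions (48a)-(48d) can be evaluated directly; a minimal scalar-case sketch with illustrative mixtures:

```python
import numpy as np

def gauss(x, mean, var):
    return np.exp(-0.5 * (x - mean)**2 / var) / np.sqrt(2 * np.pi * var)

def cross_term(mu1, m1, P1, mu2, m2, P2):
    """Double sum of mu1_i mu2_j N(m1_i; m2_j, P1_i + P2_j),
    the common structure of (48b)-(48d)."""
    return sum(wi * wj * gauss(mi, mj, Pi + Pj)
               for wi, mi, Pi in zip(mu1, m1, P1)
               for wj, mj, Pj in zip(mu2, m2, P2))

def isd(mu_a, m_a, P_a, mu_b, m_b, P_b):
    """J = J_aa - 2 J_ab + J_bb from (48a)."""
    return (cross_term(mu_a, m_a, P_a, mu_a, m_a, P_a)
            - 2 * cross_term(mu_a, m_a, P_a, mu_b, m_b, P_b)
            + cross_term(mu_b, m_b, P_b, mu_b, m_b, P_b))

pa = ([0.5, 0.3, 0.2], [-1.0, 0.0, 2.0], [0.5, 1.0, 0.8])
pb = ([0.6, 0.4], [-0.5, 1.0], [1.2, 1.5])
J_same = isd(*pa, *pa)   # identical mixtures: J = 0 (up to round-off)
J_diff = isd(*pa, *pb)   # distinct mixtures: J > 0
```

In a full implementation, this closed-form cost would be minimized subject to (47b)-(47c); here the mixtures are fixed only to illustrate the evaluation.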
Note that solving the above constrained minimization problem generally involves a computationally costly nonlinear optimization with respect to n_b × (n²/2 + 3n/2 + 1) independent variables, where n is the dimension of the system state vector x. Obviously, the computational burden of the GSF is also affected by the numerical procedure used to solve this constrained optimization problem. In this paper, we assume that the computational time of the GSF algorithm is dictated solely by n and n_b. This can be achieved by using numerical solvers with a fixed number of iterations.
VI. Numerical Study
In this section, the performance of the PF and GSF is analyzed and numerically compared to the scalar and two-state ISCE. All simulations were performed in Matlab (R2016a) on a desktop computer with an 8-core Intel Xeon processor at 2.90 GHz and 128 GB of RAM.
The PF was implemented using the systematic resampling technique [19], with a threshold parameter n^t_p = (2/3)n_p. The GSF-related constrained optimization problems, defined in (32) and (47), were solved with the interior-point algorithm using the fmincon function from the Optimization Toolbox [25] of Matlab. Note that the fitting problem defined in (32) is performed off-line (filter design phase) and does not affect the on-line computational load (testing phase) of the GSF. On the other hand, the GMM re-approximation procedure defined in (47) has to be solved after almost every measurement update. For this task, the fmincon function is constrained to a maximum of 100 iterations. This leads to a GSF implementation with a limited computational burden.
VI.A. Scalar Case Example
For the scalar case example, the following system parameters were considered: Φ = 0.75, Γ = 1, H = 2, β = 0.1, γ = 0.2, and α = 0.5.
VI.A.1. Sample Run
Before turning to a statistical MC evaluation, we compare the accuracy of the PF and GSF on a sample run scenario driven by the noise sequences depicted in Fig. 2. For clarity of the presentation, we omit the approximate implementation of the scalar ISCE from the sample run comparison, as its results for a sliding window of n_s ≥ 10 are indistinguishable from those of the optimal ISCE.

Figure 2. Cauchy distributed process and measurement noise sample sequences.
Based on the discussion presented in Section V.B.2, the process noise, measurement noise, and initial state pdfs of the GSF were fitted to the corresponding Cauchy pdfs with weighted Gaussian sums of n_w = 7, n_v = 7, and n_{1|0} = 9 components, respectively. Two different numbers of particles (n_p) and of Gaussian components kept at each step (n_b) are considered for the PF and the GSF, respectively, to demonstrate their effect on the accuracy of those approximations.
We first assess the performance of the PF and GSF through their approximation of the true conditional pdf p(x_k|z_{1:k}) obtained from the optimal ISCE, as shown in Fig. 3. The conditional pdf at time step 8 is considered, as it exhibits a refined bimodal distribution. Fig. 3 clearly demonstrates that only the PF with 100,000 particles approximates the true conditional pdf reasonably well. However, its computational burden compared to the ISCE is quite high. While processing 100 data steps on the same computer, the average computation time of the PF with 50 particles is 4.5 times, and that of the PF with 100,000 particles is 7,000 times, higher than the average computation time of the optimal ISCE.
[Figure: the true conditional pdf p(x_8|z_{1:8}) from the optimal ISCE, overlaid with the PF approximations (n_p = 50 and n_p = 100,000) and the GSF approximations (n_b = 10 and n_b = 200); the true state x_8 and the measurement-implied value z_8/H are marked.]
Figure 3. Comparison of PF and GSF approximations of the true density at k = 8.
On the other hand, the GSF-based approximation of the conditional pdf is very poor even when 200 Gaussian components^e are kept at each step, as shown in Fig. 3. The bimodal shape of the true conditional pdf is barely preserved. The computational burden of the GSF with 10 components is approximately 1,000 times, and with 200 components approximately 25,000 times, higher than that of the ISCE.
^e Note that without engaging the re-approximation procedure of (47), the number of Gaussian components in (42) at time k = 8 would exceed 4×10¹³.
Next, estimation results for a 100-step sample run are presented. The system is again driven by the noise sequences depicted in Fig. 2. The upper subplot in Fig. 4 shows the difference between the exact minimum variance state estimate (x̂*_k) and its approximation (x̂_k) computed by the PF or the GSF. The bottom subplot presents the difference between the exact standard deviation of the estimation error (σ*_k) and its approximation (σ_k) computed by the PF or GSF. Note that the exact values of x̂*_k and σ*_k are computed by the optimal ISCE; see (15) and (16), respectively. The approximate implementation of the scalar ISCE is omitted here, as its results are indistinguishable from those of the optimal ISCE.
[Plot: the estimate differences x̂*_k − x̂_k (top) and the standard deviation differences σ*_k − σ_k (bottom) versus step k, for the PF (n_p = 50 and n_p = 100,000) and the GSF (n_b = 10 and n_b = 200).]
Figure 4. PF and GSF approximation error statistics compared to the optimal ISCE values.
Fig. 4 shows that the PF with both 50 and 100,000 particles disregards the measurement outliers at steps 5 and 62. On the other hand, the process noise outliers at steps 26 and 31 cause a slight divergence of the PF with 50 particles, which vanishes after a few steps. Such behavior is observed when the number of particles is insufficient to properly capture the heavy-tail characteristics of the Cauchy noise environment. At the cost of an increased computational burden, and except for the two measurement outliers discussed earlier, the performance of the PF with 100,000 particles is comparable to that of the ISCE.
Similar conclusions can be drawn when examining the performance of the GSF. In this case, the measurement noise outliers at steps 5 and 62 lead to a slight divergence, especially at step 62.
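As background for these comparisons, the bootstrap PF [11] for a scalar linear system with additive Cauchy noises can be sketched as follows. This is a minimal, illustrative sketch, not the authors' implementation: the dynamics coefficients Phi, Gamma, and H below are placeholder values, while the noise scales beta = 0.1 and gamma = 0.2 match the scalar example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Scalar system x_{k+1} = Phi*x_k + Gamma*w_k, z_k = H*x_k + v_k.
# Phi, Gamma, H are placeholder values; beta and gamma are the Cauchy
# scale parameters of the process and measurement noises (paper values).
Phi, Gamma, H = 0.9, 1.0, 1.0
beta, gamma = 0.1, 0.2
n_p = 1000  # number of particles

def cauchy_pdf(r, scale):
    """Cauchy density with zero median and the given scale parameter."""
    return scale / (np.pi * (r ** 2 + scale ** 2))

def pf_step(particles, z):
    """One bootstrap-PF cycle: propagate through the dynamics, weight by
    the Cauchy measurement likelihood, and resample (multinomial)."""
    w = beta * rng.standard_cauchy(n_p)       # Cauchy process-noise draws
    particles = Phi * particles + Gamma * w   # time propagation
    weights = cauchy_pdf(z - H * particles, gamma)
    weights /= weights.sum()
    x_hat = np.dot(weights, particles)        # conditional-mean estimate
    idx = rng.choice(n_p, size=n_p, p=weights)
    return particles[idx], x_hat

# Short sample run of the true system together with the filter.
particles = rng.standard_cauchy(n_p)          # particles drawn from the prior
x = 0.0
for k in range(20):
    x = Phi * x + Gamma * beta * rng.standard_cauchy()
    z = H * x + gamma * rng.standard_cauchy()
    particles, x_hat = pf_step(particles, z)
```

Resampling at every step, as above, is the simplest choice; effective-sample-size triggered resampling [17, 18] is a common refinement.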
VI.A.2. Monte Carlo Analysis
The presented sample run results suggest that, at the cost of a significantly higher computational burden, both the PF and the GSF perform comparably to the ISCE when considering the scalar-state problem. Now we consider a Monte Carlo (MC) simulation-based evaluation of the PF and GSF performance. To allow a fair comparison of the PF and GSF with the ISCE, the number of particles n_p of the PF and the number of components (n_b, n_w, n_v, n_1|0) of the GSF are selected such that the average computational burden of the PF and GSF is similar to that of the approximate ISCE implementation with a window of 20 steps, i.e., n_s = 20. Consequently, n_p was set to 12, while the parameters of the GSF are set to be as minimal as possible, i.e., n_b = 3, n_w = 3, n_v = 3, and n_1|0 = 3. This resulted in a computational burden of the GSF approximately 30 times larger than that of the ISCE and PF.
Conventional evaluation of an MC-based ensemble mean and variance of the estimation error cannot be performed for the studied problem, as both the system state and the measurements are Cauchy distributed. As a consequence, the estimation errors (whether computed by the ISCE, PF, or GSF) are also heavy-tailed, leading to an infinite variance when computed via the conventional MC averaging method. Therefore, in this paper, we propose to evaluate the estimation performance using the log of the geometric mean square, i.e.,
    σ̃²_k ≜ (1/n_mc) Σ_{i=1}^{n_mc} log( x_k^(i) − x̂_k^(i) )²,    (49)
where i indicates the i-th MC realization and n_mc is the total number of MC runs. Since the log is monotonic, but also suppresses the large deviations caused by the Cauchy impulsive uncertainty, σ̃²_k appears to provide an ensemble measure of the heavy-tailed mean-square estimation error deviations.
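As a concrete illustration of (49), the measure can be computed from an array of MC estimation errors as below; the synthetic Cauchy-distributed errors stand in for actual filter errors and are used only to exercise the formula.

```python
import numpy as np

def log_geometric_mse(x_true, x_hat):
    """Eq. (49): the MC average of log((x_k - x_hat_k)^2), i.e. the log
    of the geometric mean of the squared estimation errors.
    x_true, x_hat: arrays of shape (n_mc, n_steps)."""
    return np.mean(np.log((x_true - x_hat) ** 2), axis=0)

# Exercise the measure on synthetic errors: 10,000 MC runs of 100 steps,
# with errors drawn from a Cauchy density of scale 0.1.
rng = np.random.default_rng(1)
err = 0.1 * rng.standard_cauchy((10_000, 100))
sigma2_tilde = log_geometric_mse(err, np.zeros_like(err))
# Despite the infinite variance of the errors, sigma2_tilde stays bounded
# (near 2*log(0.1) for this toy case), showing how the log suppresses the
# Cauchy outliers that would dominate a conventional mean square.
```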
Figure 5 presents the results for the suggested measure σ̃²_k based on a set of n_mc = 10,000 MC runs. For consistency, we also depict the obtained results for the optimal ISCE. As the PF implementation is nondeterministic, for each MC realization a set of 100 inner MC runs was considered to obtain an averaged state estimate x̂_k^(i) for the PF.
[Plot: σ̃²_k versus step k for the PF (n_p = 12), the GSF (n_b = 3), the ISCE with n_s = 20, and the optimal ISCE.]
Figure 5. Log of the geometric mean square of the scalar estimation error.
It can be observed from Fig. 5 that the performance of the PF degrades severely when the computational load is constrained to be comparable to that of the ISCE with n_s = 20. The performance of the GSF is inferior to that of the ISCE as well, even though its computational burden is much higher.
VI.B. Two-state Case Example - Monte Carlo Analysis
For the two-state case example, the following system parameters were chosen: H = [1 2],

    Φ = [0.9 0.1; 0.2 1.0],   Γ = [1.0; 0.3],   [α1; α2] = [0.5; 0.3].
The process noise, β = 0.1, and measurement noise, γ = 0.2, parameters are the same as in the scalar case example. Sample run results comparing the performance of the PF and GSF with the two-state ISCE, implemented both with a six-step and an eight-step sliding window approximation, are presented in [15]. In the current study, the performance of the six-step window implementation, more suitable for real-time applications, will be shown to be statistically comparable to that of the eight-step approximation. Hence, the six-step window ISCE will serve as the baseline for selecting the number of particles n_p of the PF and the number of components (n_b, n_w, n_v, n_1|0) of the GSF. Consequently, n_p was set to 4,500 while n_b = 3, n_w = 3, n_v = 3, and n_1|0 = 9.
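For illustration, a sample run of this two-state system can be generated as follows; the parameter values are those listed above, while the simulation loop itself is a minimal sketch and not taken from the authors' code.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two-state example parameters from the paper.
Phi = np.array([[0.9, 0.1],
                [0.2, 1.0]])
Gamma = np.array([1.0, 0.3])
H = np.array([1.0, 2.0])
alpha = np.array([0.5, 0.3])   # Cauchy scales of the initial state
beta, gamma = 0.1, 0.2         # process / measurement noise scales

def simulate(n_steps):
    """One sample run of x_{k+1} = Phi x_k + Gamma w_k, z_k = H x_k + v_k,
    where w_k, v_k, and the initial state are Cauchy distributed."""
    x = alpha * rng.standard_cauchy(2)
    xs, zs = [], []
    for _ in range(n_steps):
        z = H @ x + gamma * rng.standard_cauchy()
        xs.append(x)
        zs.append(z)
        x = Phi @ x + Gamma * beta * rng.standard_cauchy()
    return np.array(xs), np.array(zs)

xs, zs = simulate(100)   # 100-step run, matching the MC study length
```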
Figure 6 shows the obtained results based on a set of 1,000 MC simulations. Again, 100 inner MC runs were considered for the PF. In addition to the six-step window ISCE approximation, we also depict the eight-step window case to demonstrate their consistency. Figure 6 clearly demonstrates that 4,500 particles, which result in a computing time comparable to that of the six-step ISCE approximation, are not enough to properly estimate the system states. A similar conclusion can be drawn when examining the performance of the GSF depicted in Fig. 6. In this case, to have a comparable computing time, the GSF is constrained to keep only 3 Gaussian components. This figure clearly shows that the GSF performs very poorly, worse than the respective PF approximation. It demonstrates that the heavy-tail characteristics of the Cauchy noise environment cannot be captured well enough by a limited number of Gaussian pdfs.
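This limitation can be checked numerically: the tails of any finite Gaussian sum decay much faster than a Cauchy tail. The sketch below compares a hypothetical equal-weight three-component Gaussian sum (standard deviations 0.1, 1, and 10, chosen only for illustration) to the Cauchy density with the measurement-noise scale γ = 0.2.

```python
import numpy as np

gamma = 0.2                            # Cauchy scale (paper value)
sigmas = np.array([0.1, 1.0, 10.0])    # hypothetical component std devs
weights = np.full(3, 1.0 / 3.0)        # equal-weight Gaussian sum

def cauchy_pdf(x, scale):
    return scale / (np.pi * (x ** 2 + scale ** 2))

def gaussian_sum_pdf(x, w, s):
    comps = np.exp(-0.5 * (x / s) ** 2) / (s * np.sqrt(2.0 * np.pi))
    return np.dot(w, comps)

# Ratio of the Gaussian-sum density to the Cauchy density in the tail.
for x in (5.0, 50.0, 500.0):
    ratio = gaussian_sum_pdf(x, weights, sigmas) / cauchy_pdf(x, gamma)
    print(f"x = {x:6.1f}: density ratio = {ratio:.2e}")
# The ratio collapses toward zero as x grows: a finite Gaussian sum
# always underweights the Cauchy tail, whatever its components.
```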
[Plot: σ̃²_k,1 (top) and σ̃²_k,2 (bottom) versus step k for the PF (n_p = 4,500), the GSF (n_b = 3), and the ISCE with n_s = 6 and n_s = 8.]
Figure 6. Log of the geometric mean square of the two-state estimation error.
VII. Conclusion
The estimation performance of two popular approximate filtering algorithms has been numerically compared with that of the approximate scalar and two-state ISCE, for a linear discrete-time dynamic system with additive Cauchy measurement and process noises. Although both the PF and GSF were designed based on the same a priori information as the ISCE, sample run results show that in the scalar case the PF tends to converge to the optimal solution only when using a very large number of particles, while the GSF demonstrated modest convergence and a large discrepancy when the pdf approximation is of concern. Monte Carlo simulation results for the scalar and two-state cases revealed that both the PF and GSF perform poorly and even diverge for a computation time consistent with that of the approximate ISCE. Hence, the PF and GSF do not provide a practical alternative to the approximation that is based on the optimal solution. Consequently, for real-time implementation of filtering problems in the impulsive noise environment, represented here by heavy-tailed Cauchy noises, the approximate scalar and two-state ISCE with a bounded computational burden is clearly the superior solution.
Acknowledgments
This work was supported by the National Science Foundation (NSF) under Grant No. 1607502, the
United States–Israel Binational Science Foundation (BSF) under Grant No. 2012122, and the joint NSF-
BSF ECCS program under Grant No. 2015702.
References
[1] Taleb, N. N., The Black Swan: The Impact of the Highly Improbable, Random House, New York, 2007.
[2] Carpenter, J. R. and Mashiku, A. K., “Cauchy Drag Estimation For Low Earth Orbiters,” in
“AAS/AIAA Space Flight Mechanics Meeting,” Williamsburg, VA; United States, 2015, pp. 2731–2746.
[3] Speyer, J. L. and Chung, W. H., Stochastic Processes, Estimation, and Control, SIAM, Philadelphia, 2008.
[4] Kalman, R. E., “A new approach to linear filtering and prediction problems,” Journal of Basic Engineering, Vol. 82, No. 1, 1960, pp. 35–45, doi:10.1115/1.3662552.
[5] Schick, I. C. and Mitter, S. K., “Robust recursive estimation in the presence of heavy-tailed observation
noise,” The Annals of Statistics, Vol. 22, No. 2, 1994, pp. 1045–1080, doi:10.1214/aos/1176325511.
[6] Idan, M. and Speyer, J. L., “Cauchy Estimation for Linear Scalar Systems,” IEEE Transactions on
Automatic Control, Vol. 55, No. 6, 2010, pp. 1329–1342, doi:10.1109/TAC.2010.2042009.
[7] Idan, M. and Speyer, J. L., “State Estimation for Linear Scalar Dynamic Systems with Additive Cauchy
Noises: Characteristic Function Approach,” SIAM Journal on Control and Optimization, Vol. 50, No. 4,
2012, pp. 1971–1994, doi:10.1137/110831362.
[8] Idan, M. and Speyer, J. L., “Multivariate Cauchy Estimator with Scalar Measurement and Pro-
cess Noises,” SIAM Journal on Control and Optimization, Vol. 52, No. 2, 2014, pp. 1108–1141,
doi:10.1137/120891897.
[9] Fernandez, J. H., Methods for Estimation and Control of Linear Systems Driven by Cauchy Noises,
Ph.D. thesis, UCLA: Mechanical Engineering 0330, 2013.
[10] Fernandez, J. H., Speyer, J. L., and Idan, M., “Stochastic Estimation for Two-State Linear Dynamic
Systems with Additive Cauchy Noises,” IEEE Transactions on Automatic Control, Vol. 60, No. 12, 2015,
pp. 3367–3372, doi:10.1109/TAC.2015.2422478.
[11] Gordon, N. J., Salmond, D. J., and Smith, A. F. M., “Novel approach to nonlinear/non-Gaussian
Bayesian state estimation,” in “IEE Proceedings F - Radar and Signal Processing,” IET, Vol. 140, 1993,
pp. 107–113, doi:10.1049/ip-f-2.1993.0015.
[12] Doucet, A., Godsill, S., and Andrieu, C., “On sequential Monte Carlo sampling methods for Bayesian
filtering,” Statistics and Computing, Vol. 10, No. 3, 2000, pp. 197–208, doi:10.1023/A:1008935410038.
[13] Sorenson, H. W. and Alspach, D. L., “Recursive Bayesian Estimation Using Gaussian Sums,” Automat-
ica, Vol. 7, No. 4, 1971, pp. 465–479, doi:10.1016/0005-1098(71)90097-5.
[14] Alspach, D. L. and Sorenson, H. W., “Nonlinear Bayesian estimation using Gaussian sum ap-
proximations,” IEEE Transactions on Automatic Control, Vol. 17, No. 4, 1972, pp. 439–448,
doi:10.1109/TAC.1972.1100034.
[15] Fonod, R., Idan, M., and Speyer, J. L., “State Estimation for Linear Systems with Additive Cauchy
Noises: Optimal and Suboptimal Approaches,” in “Proceedings of European Control Conference,” IEEE,
Piscataway, NJ, 2016, pp. 1434–1439, doi:10.1109/ECC.2016.7810491.
[16] Doucet, A., de Freitas, N., and Gordon, N., Sequential Monte Carlo Methods in Practice, Springer-Verlag, New York, 2001.
[17] Arulampalam, M. S., Maskell, S., Gordon, N., and Clapp, T., “A tutorial on particle filters for online
nonlinear/non-Gaussian Bayesian tracking,” IEEE Transactions on Signal Processing, Vol. 50, No. 2,
2002, pp. 174–188, doi:10.1109/78.978374.
[18] Li, T., Bolic, M., and Djuric, P. M., “Resampling Methods for Particle Filtering: Classification, im-
plementation, and strategies,” IEEE Signal Processing Magazine, Vol. 32, No. 3, 2015, pp. 70–86,
doi:10.1109/MSP.2014.2330626.
[19] Kitagawa, G., “Monte Carlo filter and smoother for non-Gaussian nonlinear state space models,” Journal of Computational and Graphical Statistics, Vol. 5, No. 1, 1996, pp. 1–25, doi:10.2307/1390750.
[20] Achieser, N. I., Theory of Approximation, Dover Publications, New York, 1992. Chap. 6.
[21] Psiaki, M. L., Schoenberg, J. R., and Miller, I. T., “Gaussian Sum Reapproximation for Use in a
Nonlinear Filter,” Journal of Guidance, Control, and Dynamics, Vol. 38, No. 2, 2015, pp. 292–303,
doi:10.2514/1.G000541.
[22] Kitagawa, G., “The two-filter formula for smoothing and an implementation of the Gaussian-sum
smoother,” Annals of the Institute of Statistical Mathematics, Vol. 46, No. 4, 1994, pp. 605–623,
doi:10.1007/BF00773470.
[23] Kitagawa, G., “Non-Gaussian seasonal adjustment,” Computers & Mathematics with Applications,
Vol. 18, No. 6, 1989, pp. 503–514, doi:10.1016/0898-1221(89)90103-X.
[24] Williams, J. L. and Maybeck, P. S., “Cost-function-based Gaussian mixture reduction for target tracking,” in “Proceedings of the Sixth International Conference of Information Fusion,” Vol. 2, 2003, pp. 1047–1054, doi:10.1109/ICIF.2003.177354.
[25] Venkataraman, P., Applied Optimization with MATLAB Programming, John Wiley & Sons, New York, 2002.