Efficient Nonlinear Measurement Updating based on
Gaussian Mixture Approximation of Conditional Densities
Marco F. Huber, Dietrich Brunn, and Uwe D. Hanebeck
Abstract—Filtering or measurement updating for nonlinear
stochastic dynamic systems requires approximate calculations,
since an exact solution is impossible to obtain in general. We
propose a Gaussian mixture approximation of the conditional
density, which allows performing measurement updating in
closed form. The conditional density is a probabilistic repre-
sentation of the nonlinear system and depends on the random
variable of the measurement given the system state. Unlike
the likelihood, the conditional density is independent of actual
measurements, which permits determining its approximation
off-line. By treating the approximation task as an optimization
problem, we use progressive processing to achieve high quality
results. Once the conditional density approximation has been calculated, the
likelihood can be determined on-line, which, in turn, offers
an efficient approximate filter step. As a result, a Gaussian
mixture representation of the posterior density is obtained. The
exponential growth of Gaussian mixture components resulting
from repeated filtering is avoided implicitly by the prediction
step using the proposed techniques.
I. INTRODUCTION
Fusing information that has been acquired by measurements is a common
challenge in many technical applications like sensor-actuator-networks or
robotics. Especially in the presence of uncertainties described by random
variables, Bayesian filtering yields the exact probability density of the
system state. In practical settings, a recursive processing of this
so-called posterior density is needed. Since no exact density
representation in closed form and with constant complexity is available,
the filtering or measurement updating problem is computationally
infeasible in general.
While for linear systems with Gaussian random variables the Kalman filter
provides exact solutions in an efficient manner, the nonlinear case
requires the approximation of the true density. The well-known extended
Kalman filter uses linearization to apply the Kalman filter equations to
nonlinear systems, while the unscented Kalman filter offers increased
higher-order accuracy by using a deterministic sampling approach. The
resulting single Gaussian density of both estimation methods is typically
not sufficient for characterizing the true complex density. One
possibility is using a sample representation of the density, like in
particle filters. Another possibility is to use generic parameterized
density functions. Due to their universal approximation property,
Gaussian mixtures are very convenient for that purpose. The bandwidth of
estimators using Gaussian mixtures ranges from the Gaussian sum filter,
which is algorithmically straightforward, up to computationally more
expensive but precise methods like the one presented in .
Marco F. Huber, Dietrich Brunn, and Uwe D. Hanebeck are with
the Intelligent Sensor-Actuator-Systems Laboratory, Institute of Computer
Science and Engineering, Universität Karlsruhe (TH), Germany.
Unlike the previously mentioned estimation techniques,
we introduce a new efficient measurement updating ap-
proach that approximates the conditional density, which is
a probabilistic representation of the nonlinear measurement
equation. For approximation purposes, Gaussian mixtures
are used, whose parameters are calculated by
progressively solving an optimization problem as proposed
in . Because the conditional density is independent of actual
measurements, off-line optimization is possible. Given the approximate
conditional density and an actual measurement, the like-
lihood is generated on-line. Since the likelihood is also
represented by a Gaussian mixture, the filter step is reduced
to simple multiplications of Gaussian densities resulting in
a Gaussian mixture representation of the posterior density.
To avoid an exponential growth of the number of Gaussian
mixture components, we propose a simultaneous prediction
and Gaussian mixture reduction. Hence, an efficient filter
step with constant complexity is obtained.
In the following section, we review Bayes’ law for filtering
discrete-time systems and point out the relation between conditional
density and likelihood. The rest of the paper is structured as follows:
In Section III, the progressive processing for approximating the
conditional density is explained and an example application is
investigated. Approximating the likelihood and performing the filter step
on-line is the subject of Section IV, while in Section V the closed-form
prediction step and Gaussian mixture reduction are derived. In Section VI,
the interaction of both techniques with the efficient filter step is
demonstrated and compared to the unscented Kalman filter, the particle
filter, and the exact Bayesian estimator by means of the example
application. The paper closes with conclusions and an outlook on future
work.
II. PROBLEM FORMULATION
In this paper we only consider scalar random variables, denoted by
boldface letters, e.g. $\boldsymbol{x}$. Thus, we consider scalar
nonlinear time-invariant systems, where scalar measurements $\hat{y}_k$ at
time step $k$ are related to the scalar system state $\boldsymbol{x}_k$ by
means of the measurement equation
$$\boldsymbol{y}_k = h(\boldsymbol{x}_k) + \boldsymbol{v}_k , \tag{1}$$
where the additive noise $\boldsymbol{v}_k$ is assumed to be a white
stationary Gaussian random process with density
$f^v(v_k) = \mathcal{N}(v_k - \mu^v, \sigma^v)$, mean $\mu^v$, and standard
deviation $\sigma^v$. Note that an actual measurement $\hat{y}_k$ is a
realization of (1).
Given a predicted density $f^p_k(x_k)$ for $\boldsymbol{x}_k$, a new
measurement updates the system state via the filter step or measurement
update according to Bayes’ law
$$f^e_k(x_k) = c_k \, f^p_k(x_k) \, f^L_k(x_k) , \tag{2}$$
where $c_k = 1 / \int_{\mathbb{R}} f^p_k(x_k) \, f^L_k(x_k) \, \mathrm{d}x_k$
is a normalization constant and $f^L_k(x_k)$ is the so-called likelihood
$$f^L_k(x_k) = f(\hat{y}_k | x_k) = f^v(\hat{y}_k - h(x_k)) .$$
The likelihood depends on the noise density of $\boldsymbol{v}_k$, the
structure of the measurement equation, and especially on the
measurement $\hat{y}_k$. Hence, the likelihood’s shape changes with
every new measurement.
Recursively updating the predicted density $f^p_k(x_k)$ according to (2)
is of conceptual value only, since the complex shape of the likelihood
prevents a closed-form and efficient solution. Furthermore, for the case
of nonlinear systems with arbitrarily distributed random variables, in
general there exists no analytical density that can be updated in the
filter step without changing the type of representation. To overcome
these insufficiencies, an appropriate approximation of the true posterior
density $f^e_k(x_k)$ is inevitable. From now on true densities will be
indicated by a tilde, e.g. $\tilde{f}(\cdot)$, while the corresponding
approximation will be denoted by $f(\cdot)$.
The typically enormous computational effort when directly approximating
$\tilde{f}^e_k(x_k)$ at every time step can be avoided, if we translate our
approach for the prediction step proposed in  to the filter step. In
doing so, we use a Gaussian mixture representation $f^L_k(x_k)$ for
approximation purposes, that depends on the parameter vector $\eta_k$. The
calculation of an appropriate parameter vector $\eta_k$ for high quality
approximations is computationally very demanding. Since the likelihood is
time-variant, the demanding computations would also be necessary at every
time step.
Instead, we can approximate the time-invariant conditional density
$\tilde{f}(y_k|x_k) = f^v(y_k - h(x_k))$ by the Gaussian mixture density
$f(y_k, x_k, \eta)$ with parameter vector $\eta$. The conditional density
can be interpreted as the aggregation of all possible likelihoods and thus
is of higher dimensionality. In presence of a new measurement $\hat{y}_k$,
we can easily obtain the corresponding likelihood with
$$\begin{array}{c}
\tilde{f}^L_k(x_k) = \tilde{f}(y_k|x_k) \big|_{y_k = \hat{y}_k} \\
\Downarrow \ \text{Approx.} \ \Downarrow \\
f^L_k(x_k, \eta_k) = f(y_k, x_k, \eta) \big|_{y_k = \hat{y}_k} .
\end{array}$$
Thus, the approximate likelihood $f^L_k(x_k, \eta_k)$ can be determined
on-line when needed by calculating its time-variant parameter vector
$\eta_k$ from $\eta$ and $\hat{y}_k$ as shown in Section IV.
The more extensive Gaussian mixture conditional density approximation
depending on the parameter vector $\eta$ can be solved off-line as
illustrated in Fig. 1. However, Gaussian mixture approximations are
considered a tough problem. In the following section an effective
approximation scheme is presented.
Fig. 1. Recursive, closed-form estimation. The conditional density
approximation is performed off-line, while likelihood approximation and
the filter step remain on-line tasks. A closed-form prediction step
depending on transition density approximation as derived in  completes
the efficient estimator for nonlinear systems.
III. APPROXIMATION OF THE CONDITIONAL DENSITY
The key idea is to reformulate the Gaussian mixture approximation problem
as an optimization problem by minimizing a certain distance measure
between $\tilde{f}(y_k|x_k)$ and $f(y_k, x_k, \eta)$. For solving this
problem, we give a review on the progressive optimization scheme proposed
in .
Since in real systems the system state is usually restricted to a finite
interval, i.e.,
$$\forall k : x_k \in [a, b] =: \Omega ,$$
we are only interested in approximating the conditional density for
$x_k \in \Omega$.
Furthermore, we use the special case of a Gaussian mixture with
axis-aligned Gaussian components (short: axis-aligned Gaussian mixture)
for representing the Gaussian mixture approximation $f(y_k, x_k, \eta)$.
Here, each component is separable in every dimension, i.e.,
$$f(y_k, x_k, \eta) = \sum_{i=1}^{L} \omega_i \cdot \mathcal{N}(y_k - \mu^y_i, \sigma^y_i) \cdot \mathcal{N}(x_k - \mu^x_i, \sigma^x_i) , \tag{3}$$
with the parameter vector
$$\eta = [\eta_1^T, \ldots, \eta_L^T]^T , \quad \text{where} \quad \eta_i = [\omega_i, \mu^y_i, \sigma^y_i, \mu^x_i, \sigma^x_i]^T .$$
An axis-aligned Gaussian mixture has weaker approximation capabilities
than a non axis-aligned one. Hence, more components are required to
achieve a comparable approximation quality. In exchange, the covariance
matrices of the axis-aligned Gaussian mixture components are diagonal.
Thus, fewer parameters have to be adjusted for a single component and the
necessary determination of the gradient $\partial G / \partial \eta$ of the
distance measure turns out to be easier. Altogether, representing
$f(y_k, x_k, \eta)$ as in (3) lowers the algorithmic complexity.
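As a minimal sketch of how an axis-aligned Gaussian mixture can be evaluated, the following Python fragment stores each component as a tuple of weight, mean and standard deviation in $y_k$, and mean and standard deviation in $x_k$. The function names and the two-component parameter values are hypothetical and serve illustration only.

```python
import math

def gauss(d, sigma):
    """Gaussian density N(d, sigma): zero mean, standard deviation sigma, evaluated at d."""
    return math.exp(-0.5 * (d / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_2d(y, x, eta):
    """Axis-aligned mixture: sum_i w_i * N(y - mu_y_i, s_y_i) * N(x - mu_x_i, s_x_i)."""
    return sum(w * gauss(y - mu_y, s_y) * gauss(x - mu_x, s_x)
               for (w, mu_y, s_y, mu_x, s_x) in eta)

# Hypothetical parameter vector: one tuple (w, mu_y, s_y, mu_x, s_x) per component.
eta = [(0.5, 1.0, 0.3, 0.0, 0.5),
       (0.5, 0.2, 0.3, 2.0, 0.5)]
value = mixture_2d(1.0, 0.0, eta)
```

Because every component factorizes into a term in $y_k$ and a term in $x_k$, only five scalars per component are needed, and no full covariance matrix has to be stored or inverted.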
A. The Optimization Problem
The quality of the approximation $f^e_k(x_k)$ strongly depends on the
similarity between $\tilde{f}(y_k|x_k)$ and its Gaussian mixture
approximation $f(y_k, x_k, \eta)$ for $x_k \in \Omega$. Thus, this section
is concerned with solving the optimization problem
$$\eta_{\min} = \arg\min_{\eta} G(\eta) \tag{4}$$
that yields the parameter vector for $f(y_k, x_k, \eta)$, which minimizes
the distance to $\tilde{f}(y_k|x_k)$. The employed distance measure is the
squared integral measure
$$G(\eta) = \int_{\mathbb{R}} \int_{\Omega} \left( \tilde{f}(y_k|x_k) - f(y_k, x_k, \eta) \right)^2 \mathrm{d}x_k \, \mathrm{d}y_k . \tag{5}$$
Although this measure has been selected for its simplicity
and convenience, it has been found to give excellent results.
Apart from the selected distance measure, the underlying
nonlinearity complicates solving (4) significantly. In general,
no closed-form solution can be derived. In addition, the high
dimension of η makes the selection of an initial solution
very difficult, so that the direct application of numerical
minimization routines leads to insufficient local optima of η.
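To make the distance measure concrete, the following Python sketch approximates the squared integral distance between a true conditional density and an axis-aligned mixture with a midpoint rule. Several points are assumptions of this sketch and not taken from the paper: the integration over $y_k$ is truncated to a finite range, any constant scaling factor of the measure is omitted, and the toy measurement function, noise level, and component values are hypothetical.

```python
import math

def gauss(d, sigma):
    """Gaussian density N(d, sigma): zero mean, standard deviation sigma, evaluated at d."""
    return math.exp(-0.5 * (d / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_2d(y, x, eta):
    """Axis-aligned mixture with components (w, mu_y, s_y, mu_x, s_x)."""
    return sum(w * gauss(y - mu_y, s_y) * gauss(x - mu_x, s_x)
               for (w, mu_y, s_y, mu_x, s_x) in eta)

def distance(eta, h, sigma_v, omega, y_range, n=200):
    """Midpoint-rule value of the double integral of (f_true - f_approx)^2.

    omega is the state interval (a, b); y_range is an assumed finite
    truncation of the integration over the measurement variable.
    """
    (a, b), (c, d) = omega, y_range
    dx, dy = (b - a) / n, (d - c) / n
    total = 0.0
    for iy in range(n):
        y = c + (iy + 0.5) * dy
        for ix in range(n):
            x = a + (ix + 0.5) * dx
            diff = gauss(y - h(x), sigma_v) - mixture_2d(y, x, eta)
            total += diff * diff
    return total * dx * dy

# Toy setting (hypothetical): constant measurement function, sigma_v = 0.5.
h_const = lambda x: 0.0
g_zero = distance([], h_const, 0.5, (-1.0, 1.0), (-4.0, 4.0))
# One component whose value at x = 0 matches the true density.
eta_one = [(math.sqrt(2.0 * math.pi), 0.0, 0.5, 0.0, 1.0)]
g_one = distance(eta_one, h_const, 0.5, (-1.0, 1.0), (-4.0, 4.0))
```

In this toy setting a single well-placed component already lowers the distance considerably compared to the empty mixture. A numerical optimizer would evaluate such a function, or a closed-form counterpart of it, many times, which is why the choice of the starting point matters.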
B. Progressive Processing
Instead of attempting the direct approximation of the conditional
density, we pursue a progressive approach for finding $\eta_{\min}$ as
shown in Fig. 2. This type of processing has been proposed in , . In
doing so, a parameterized conditional density $\tilde{f}(y_k|x_k, \gamma)$
with the progression parameter $\gamma \in [0, 1]$ is introduced.
Incrementing this progression parameter by small $\Delta\gamma$ ensures a
continuous transformation of the solution of an initial, tractable
optimization problem towards the desired true conditional density
$\tilde{f}(y_k|x_k)$. In every single so-called progression step we obtain
a gradually changed distance measure $G(\eta, \gamma)$, for which the
necessary condition for a minimum
$$\frac{\partial G(\eta, \gamma)}{\partial \eta} = 0$$
has to be satisfied. For this purpose, the BFGS formula, a standard
numerical optimization method, is employed. $G(\eta, \gamma)$ results from
using the progressive version $\tilde{f}(y_k|x_k, \gamma)$ of
$\tilde{f}(y_k|x_k)$ in (5).
For introducing the parameterized conditional density, we use the
parameterized measurement function $h(x_k, \gamma)$,
$$h(x_k, \gamma) = (1 - \gamma) H \cdot x_k + \gamma \, h(x_k) ,$$
where in particular $H \in \mathbb{R}$ and
$$h(x_k, \gamma = 0) = H \cdot x_k , \tag{6}$$
$$h(x_k, \gamma = 1) = h(x_k) .$$
This yields the modified measurement equation
$$y_k = h(x_k, \gamma) + v_k . \tag{7}$$
Fig. 2. Flow chart of the progressive processing to determine
$\eta_{\min}$: the progression is initialized with $\gamma = 0$; while
$\gamma \le 1$, the current progression step is optimized by minimizing
$G(\eta, \gamma)$ and $\gamma$ is incremented by $\Delta\gamma$.
The dependence of $\tilde{f}(y_k|x_k, \gamma) = f^v(y_k - h(x_k, \gamma))$
on measurement equation (7) automatically causes its parameterization.
Example 1 (Quadratic Decay Measurement Function)
In a wireless communication scenario the measurement equation
$y_k = (1 + x_k^2)^{-1} + v_k$ relates the position $x_k$ of a receiver to
the relative signal strength (SNR) $y_k$ according to the free-space
propagation model . Fig. 3 shows the progression of the corresponding
parameterized measurement function
$h(x_k, \gamma) = (1 - \gamma) H \cdot x_k + \gamma (1 + x_k^2)^{-1}$ for
$\Delta\gamma = 0.2$, $H = 0$ and $\Omega = [-3, 3]$. The parameterized
conditional density $\tilde{f}(y_k|x_k, \gamma)$ performs the same
transformation.
Initially, $\tilde{f}(y_k|x_k, \gamma = 0)$ corresponds to the conditional
density of a linear system (6). For this optimization problem, there
exists just one single optimum. Thus, the choice of insufficient starting
parameters by the user and starting the progression with a local optimum
is bypassed . For $\gamma = 1$ the parameterized conditional density
corresponds to the true conditional density, i.e.,
$\tilde{f}(y_k|x_k, \gamma = 1) = \tilde{f}(y_k|x_k)$.
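The progression schedule of Example 1 can be sketched in Python as follows. The fragment only builds the parameterized measurement function and the corresponding parameterized conditional density for each value of $\gamma$; the numerical minimization of $G(\eta, \gamma)$ that would run inside the loop is deliberately left out. All function and variable names, as well as the noise level used in the last line, are assumptions of this sketch.

```python
import math

def gauss(d, sigma):
    """Gaussian density N(d, sigma): zero mean, standard deviation sigma, evaluated at d."""
    return math.exp(-0.5 * (d / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def h_true(x):
    """Quadratic decay measurement function of Example 1."""
    return 1.0 / (1.0 + x * x)

def h_param(x, gamma, H=0.0):
    """Parameterized measurement function (1 - gamma) * H * x + gamma * h(x)."""
    return (1.0 - gamma) * H * x + gamma * h_true(x)

def cond_density(y, x, gamma, sigma_v, H=0.0):
    """Parameterized conditional density f_v(y - h(x, gamma)) for zero-mean noise."""
    return gauss(y - h_param(x, gamma, H), sigma_v)

delta_gamma = 0.2
steps = round(1.0 / delta_gamma)
gammas = [i * delta_gamma for i in range(steps + 1)]

# Values of the parameterized measurement function at x = 2 along the progression.
# In a full implementation, each gamma would start an optimization run that is
# initialized with the parameter vector found for the previous gamma.
progression = [h_param(2.0, g) for g in gammas]
density_at_start = cond_density(0.0, 2.0, gammas[0], 0.5)
```

Since each progression step changes the target density only slightly, the solution of the previous step is a good starting point for the next one.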
IV. THE FILTER STEP
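Together with the closed-form prediction step proposed in , the Gaussian
mixture approximations of the conditional density and of the likelihood
allow performing an efficient closed-form filter step on-line, as
depicted in Fig. 1. For this purpose we assume that all involved
densities are represented as Gaussian mixtures.
According to  we assume the predicted density $f^p_k(x_k)$ to be given by
$$f^p_k(x_k) = \sum_{j=1}^{L^p} \omega^p_{k,j} \cdot \mathcal{N}(x_k - \mu^p_{k,j}, \sigma^p_{k,j}) , \tag{8}$$
where $L^p$ is the constant number of Gaussian components,
$\mathcal{N}(x_k - \mu^p_{k,j}, \sigma^p_{k,j})$ is a Gaussian density with
mean $\mu^p_{k,j}$ and standard deviation $\sigma^p_{k,j}$, and
$\omega^p_{k,j}$ are weighting coefficients with $\omega^p_{k,j} > 0$ and
$\sum_{j=1}^{L^p} \omega^p_{k,j} = 1$.
Given a Gaussian mixture approximation $f(y_k, x_k, \eta)$ according to
(3), its axis-aligned structure allows the direct approximation of the
likelihood $f^L_k(x_k, \eta_k)$, if for time step $k$ a measurement
$\hat{y}_k$ is present,
$$\begin{aligned}
f^L_k(x_k, \eta_k) &= f(y_k, x_k, \eta) \big|_{y_k = \hat{y}_k} \\
&= \sum_{i=1}^{L} \omega_i \cdot \mathcal{N}(\hat{y}_k - \mu^y_i, \sigma^y_i) \cdot \mathcal{N}(x_k - \mu^x_i, \sigma^x_i) \\
&= \sum_{i=1}^{L} \omega_{k,i} \cdot \mathcal{N}(x_k - \mu^x_i, \sigma^x_i) ,
\end{aligned} \tag{9}$$
with $\omega_{k,i} = \omega_i \cdot \mathcal{N}(\hat{y}_k - \mu^y_i, \sigma^y_i)$
and the parameter vector $\eta_k = [\eta_{k,1}^T, \ldots, \eta_{k,L}^T]^T$,
where $\eta_{k,i} = [\omega_{k,i}, \mu^x_i, \sigma^x_i]^T$.
Fig. 3. Progression of the parameterized measurement function
$h(x_k, \gamma) = (1 - \gamma) H \cdot x_k + \gamma (1 + x_k^2)^{-1}$
(legend: $\gamma = 0, 0.4, 0.6, 0.8$).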
Example 2 (Quadratic Decay Measurement (cont’d.))
We consider again the measurement equation of Example 1, where now
$v_k \sim \mathcal{N}(v_k - 0, 0.25)$ and $\Omega = [-3, 3]$. Using
$L = 20$ Gaussian components leads to the conditional density
approximation with quality $G(\eta) = 0.0039$ shown in Fig. 4. Applying a
measurement $\hat{y}_k = 0.6$, we obtain the likelihood approximation
depicted in Fig. 5. The little bumps at the interval borders of $x_k$
result from sharply restricting $x_k$ to the interval $\Omega$. A more
continuous windowing would alleviate this.
Incorporating an actual measurement and generating the likelihood
corresponds to taking a slice parallel to the $x_k$-axis from the
conditional density at position $y_k = \hat{y}_k$. The Gaussian mixture
representation of $f^L_k(x_k, \eta_k)$ itself is then very convenient for
efficiently performing the filter step.
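The slicing operation can be sketched in a few lines of Python: for a given measurement, each component keeps its mean and standard deviation in $x_k$, and its weight is scaled by the value of its $y_k$-factor at the measurement. The component values below are hypothetical, and the function names are assumptions of this sketch.

```python
import math

def gauss(d, sigma):
    """Gaussian density N(d, sigma): zero mean, standard deviation sigma, evaluated at d."""
    return math.exp(-0.5 * (d / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_2d(y, x, eta):
    """Axis-aligned mixture with components (w, mu_y, s_y, mu_x, s_x)."""
    return sum(w * gauss(y - mu_y, s_y) * gauss(x - mu_x, s_x)
               for (w, mu_y, s_y, mu_x, s_x) in eta)

def mixture_1d(x, comps):
    """One-dimensional mixture with components (w, mu, s)."""
    return sum(w * gauss(x - mu, s) for (w, mu, s) in comps)

def likelihood_params(eta, y_hat):
    """Time-variant parameters eta_k: weights w_i * N(y_hat - mu_y_i, s_y_i), unchanged x-parts."""
    return [(w * gauss(y_hat - mu_y, s_y), mu_x, s_x)
            for (w, mu_y, s_y, mu_x, s_x) in eta]

# Hypothetical off-line result with three components.
eta = [(0.4, 1.0, 0.3, 0.0, 0.4),
       (0.3, 0.5, 0.3, 1.0, 0.4),
       (0.3, 0.5, 0.3, -1.0, 0.4)]
eta_k = likelihood_params(eta, 0.6)
```

The on-line cost is one Gaussian evaluation per component, so generating the likelihood scales linearly with the number of components $L$.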
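Theorem 1 (Approximate Posterior Density)
Given the Gaussian mixture representations (8) and (9) for $f^p_k(x_k)$
and $f^L_k(x_k, \eta_k)$ respectively, the approximate posterior density
$f^e_k(x_k)$ is also a Gaussian mixture that can be calculated
analytically.
PROOF. Using Bayes’ law (2) we obtain
$$\begin{aligned}
f^e_k(x_k) &= c_k \, f^p_k(x_k) \, f^L_k(x_k, \eta_k) \\
&= c_k \sum_{i=1}^{L} \sum_{j=1}^{L^p} \omega^e_{k,i,j} \cdot \mathcal{N}(x_k - \mu^e_{k,i,j}, \sigma^e_{k,i,j}) ,
\end{aligned} \tag{10}$$
with
$$\omega^e_{k,i,j} = z_{i,j} \cdot \omega^p_{k,j} \cdot \omega_{k,i} ,$$
$$\mu^e_{k,i,j} = \frac{\mu^p_{k,j} (\sigma^x_i)^2 + \mu^x_i (\sigma^p_{k,j})^2}{(\sigma^p_{k,j})^2 + (\sigma^x_i)^2} ,$$
$$\sigma^e_{k,i,j} = \sqrt{\frac{(\sigma^p_{k,j})^2 (\sigma^x_i)^2}{(\sigma^p_{k,j})^2 + (\sigma^x_i)^2}} ,$$
$$z_{i,j} = \mathcal{N}\left( \mu^x_i - \mu^p_{k,j}, \sqrt{(\sigma^x_i)^2 + (\sigma^p_{k,j})^2} \right) .$$
The normalization constant
$c_k = 1 / \sum_{j=1}^{L^p} \sum_{i=1}^{L} z_{i,j} \cdot \omega^p_{k,j} \cdot \omega_{k,i}$
results from integrating over both sums in (10).
For obtaining the result in (10), only multiplications of two Gaussian
densities, denoted by
$z_{i,j} \cdot \mathcal{N}(x_k - \mu^e_{k,i,j}, \sigma^e_{k,i,j})$, have to
be performed. Hence, (10) provides the closed-form and efficient solution
for the filter step by means of the Gaussian mixture approximation of a
likelihood. The accuracy of the approximation of $\tilde{f}^e_k(x_k)$
strongly depends on the number of components of $f(y_k, x_k, \eta)$ and
$f^L_k(x_k, \eta_k)$.
The obtained Gaussian mixture approximation for the posterior density
$\tilde{f}^e_k(x_k)$ comprises $L^p \cdot L$ components. Thus, unlike the
closed-form prediction step, the number of components in the
approximation grows exponentially over time. To avoid this exponential
growth it is standard practice to employ Gaussian mixture reduction
techniques after the filter step. Instead, the subsequent closed-form
prediction step, as depicted in Fig. 1, automatically leads to a Gaussian
mixture reduction.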
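The filter step amounts to multiplying every prior component with every likelihood component and normalizing the resulting weights. The following Python sketch illustrates this with the standard product rule for two Gaussian densities. Unlike in the notation above, the normalization constant is folded into the returned weights, and all names and numerical values are assumptions of this sketch.

```python
import math

def gauss(d, sigma):
    """Gaussian density N(d, sigma): zero mean, standard deviation sigma, evaluated at d."""
    return math.exp(-0.5 * (d / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_1d(x, comps):
    """One-dimensional mixture with components (w, mu, s)."""
    return sum(w * gauss(x - mu, s) for (w, mu, s) in comps)

def filter_step(prior, lik):
    """Multiply two 1D Gaussian mixtures and normalize the result.

    prior and lik are lists of (weight, mean, std); the result has
    len(prior) * len(lik) components whose weights sum to one.
    """
    post = []
    for (w_l, mu_l, s_l) in lik:
        for (w_p, mu_p, s_p) in prior:
            var_sum = s_p ** 2 + s_l ** 2
            z = gauss(mu_l - mu_p, math.sqrt(var_sum))        # scaling of the product
            mu_e = (mu_p * s_l ** 2 + mu_l * s_p ** 2) / var_sum
            s_e = math.sqrt(s_p ** 2 * s_l ** 2 / var_sum)
            post.append((z * w_p * w_l, mu_e, s_e))
    c = 1.0 / sum(w for (w, _, _) in post)                    # normalization constant
    return [(c * w, mu, s) for (w, mu, s) in post]

# Hypothetical prior and likelihood mixtures.
prior = [(0.6, -0.5, 1.0), (0.4, 1.0, 0.7)]
lik = [(1.2, -1.2, 0.3), (0.9, 1.2, 0.3)]
posterior = filter_step(prior, lik)
```

The double loop makes the growth of the number of components explicit: each filter step multiplies the component count of the prior by that of the likelihood.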
V. GAUSSIAN MIXTURE REDUCTION
Very popular Gaussian mixture reduction methods, like Salmond’s joining
algorithm or Maybeck’s ISE based reduction algorithm, suffer from either
poor reduction quality or a high computational burden. However,
predictions with respect to a scalar nonlinear time-invariant system
$$x_{k+1} = a(x_k) + w_k ,$$
where $w_k$ is white and stationary Gaussian noise, offer the opportunity
for directly reducing the number of components to a constant value $L^p$.
For this purpose, the closed-form prediction step
$$f^p_{k+1}(x_{k+1}) = \int_{\mathbb{R}} f^T(x_{k+1}, x_k, \eta) \, f^e_k(x_k) \, \mathrm{d}x_k \tag{11}$$
proposed in  has to be performed. Here, $f^T(x_{k+1}, x_k, \eta)$ is the
axis-aligned Gaussian mixture approximation with $L^p$ components of the
true transition density $\tilde{f}^T(x_{k+1}|x_k)$ for $x_k \in \Omega$.
Since $\tilde{f}^T(x_{k+1}|x_k)$ is also a conditional density, its
approximation is done similarly to $\tilde{f}(y_k|x_k)$.
Theorem 2 (Approximate Predicted Density)
Given the Gaussian mixture representation (10) for $f^e_k(x_k)$ and an
axis-aligned Gaussian mixture representation for
$f^T(x_{k+1}, x_k, \eta)$ similar to (3) with $L^p$ components, the
approximate predicted density $f^p_{k+1}(x_{k+1})$ is also a Gaussian
mixture with $L^p$ components that can be calculated analytically.
PROOF. See .
The number of components in $f^p_{k+1}(x_{k+1})$ depends only on the
number of components of the approximate transition density. In return,
performing the closed-form prediction step (11) automatically reduces the
Gaussian mixture $f^e_k(x_k)$. Because of the possibility of calculating
$f^p_{k+1}(x_{k+1})$ analytically, this simultaneous prediction and
reduction is computationally very efficient.
Fig. 4. Top view on the Gaussian mixture approximation of the conditional
density $\tilde{f}(y_k|x_k) = \mathcal{N}(y_k - (1 + x_k^2)^{-1}, 0.25)$.
The dashed black line displays the underlying measurement function
$h(x_k) = (1 + x_k^2)^{-1}$. The measurement $\hat{y}_k = 0.6$, which is
indicated by the red line, leads to the likelihood depicted in Fig. 5.
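Why the prediction step (11) keeps the number of components constant can be illustrated with a short Python sketch. It assumes that the prediction is the integral over $x_k$ of the axis-aligned transition mixture times the posterior mixture. Under this assumption, each transition component contributes exactly one predicted component, whose weight collects the overlap of its $x_k$-factor with all posterior components. The weights are not renormalized here, and all names and numerical values are hypothetical.

```python
import math

def gauss(d, sigma):
    """Gaussian density N(d, sigma): zero mean, standard deviation sigma, evaluated at d."""
    return math.exp(-0.5 * (d / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_1d(x, comps):
    """One-dimensional mixture with components (w, mu, s)."""
    return sum(w * gauss(x - mu, s) for (w, mu, s) in comps)

def predict(eta_T, post):
    """Closed-form prediction with an axis-aligned transition mixture.

    eta_T: components (w, mu_next, s_next, mu_x, s_x) of the transition mixture.
    post:  components (w, mu, s) of the posterior mixture.
    Returns one predicted component (w, mu_next, s_next) per transition component.
    """
    pred = []
    for (w, mu_n, s_n, mu_x, s_x) in eta_T:
        # Integral over x of N(x - mu_x, s_x) times the posterior mixture.
        overlap = sum(w_e * gauss(mu_x - mu_e, math.sqrt(s_x ** 2 + s_e ** 2))
                      for (w_e, mu_e, s_e) in post)
        pred.append((w * overlap, mu_n, s_n))
    return pred

# Hypothetical transition mixture (2 components) and posterior (3 components).
eta_T = [(1.0, 0.0, 0.5, 0.0, 1.0), (0.8, 1.0, 0.5, 1.0, 1.0)]
post = [(0.3, -0.5, 0.4), (0.5, 0.8, 0.6), (0.2, 1.5, 0.5)]
predicted = predict(eta_T, post)
```

The posterior components only enter through scalar overlap terms, so the predicted mixture has as many components as the transition mixture, no matter how many components the posterior carries.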
Remark 1 (Generalization) Until now we assumed that $v_k$ and $w_k$ are
Gaussian. Generalization of the introduced approximation techniques with
respect to noise represented by a Gaussian mixture is straightforward.
For general densities it is possible to first find a Gaussian mixture
approximation of $f^v(v_k)$ and $f^w(w_k)$, e.g. using the method proposed
in , and then to approximate the conditional and transition densities.
VI. SIMULATION: QUADRATIC DECAY
In this section we investigate the estimation results for the measurement
equation
$$y_k = \frac{1}{1 + x_k^2} + v_k , \tag{12}$$
introduced in Examples 1 and 2 with measurement noise
$v_k \sim \mathcal{N}(v_k, 0.1)$. We approximate the conditional density of
(12) for $x_k \in \Omega = [-5, 5]$ with $L = 70$ Gaussian components,
gaining a quality of $G(\eta) = 0.225880$.
The prediction step is based on the system equation
$$x_{k+1} = x_k + w_k , \tag{13}$$
where $w_k \sim \mathcal{N}(w_k, 0.25)$. Although (13) is linear, we use a
Gaussian mixture approximation of the transition density with $L^p = 50$
components for mixture reduction purposes. The linearity allows on-line
approximation for each prediction step to dynamically cover the spread of
the posterior density.
Fig. 5. The likelihood
$\tilde{f}^L_k(x_k) = \mathcal{N}(\hat{y}_k - (1 + x_k^2)^{-1}, 0.25)$ for
the measurement $\hat{y}_k = 0.6$ (blue, dashed) and its Gaussian mixture
approximation (red, solid), generated from the conditional density
approximation shown in Fig. 4. 20 Gaussian components (red, dotted) are
used for representing the approximate likelihood $f^L_k(x_k, \eta_k)$.
In the simulation, four consecutive filter and prediction steps are
performed alternatingly, starting with the filter step and the density
$f^x_0(x_0) = \mathcal{N}(x_0 + 0.5, 1)$ of the system state $x_k$ at time
step $k = 0$. The measurement sequence is
$$\hat{y}_0 = 0.4 , \quad \hat{y}_1 = 0.75 , \quad \hat{y}_2 = 0.5 , \quad \hat{y}_3 = 0.9 .$$
We compare the posterior densities of our approach (denoted as Appr.)
with those of the unscented Kalman filter (UKF), a particle filter (PF)
with 700 samples and systematic resampling, and the exact Bayesian
estimator. Recursive estimation with the exact Bayesian estimator
requires recursively applied numerical integration and is used as
reference. Fig. 6 shows the resulting posterior densities of the system
state $x_k$ for the four consecutive filter and prediction step pairs at
time $k = 0, \ldots, 3$. There is almost no difference in shape between
the estimates of the Bayesian estimator and our approach. In particular,
both modes are approximated almost exactly. The same is true for the
means and standard deviations, as shown in Table I. Since the prediction
simultaneously provides a Gaussian mixture reduction, the number of
components in $f^p_k(x_k)$ and $f^e_k(x_k)$ stays constant at 50 and 3500
respectively, without impairing the estimation results significantly.
Since the UKF provides a Gaussian density approximation whose mean is
accurate up to second order, the estimated mean is relatively close to
the true one. In contrast, the difference in shape and standard deviation
is significant. Due to the shape approximation of the conditional
density, our approach is also able to cover higher-order moments and the
shape of the posterior density. Like that of the proposed approach, the
time consumption of the UKF is constant, but about two orders of
magnitude lower. However, our approach provides estimates with high
accuracy, whose calculation needs computing time that is close to
real-time.
The density representation provided by the particle filter depends on
randomly drawn samples. Thus, this representation is inappropriate for
well-fitting density approximations, but very convenient for estimating
moments. In Table I the average mean and standard deviation estimates of
the PF over 50 simulation runs are recorded. The mean estimates are
comparable to those provided by the UKF, while the standard deviations
are more accurate. Only by drastically increasing the number of samples
and thus the computation time would the PF results get close to those of
the proposed approach.
Fig. 6. The results of the approach of this paper (red, solid) in
comparison with those of the Bayesian estimator (blue, dashed) and the
unscented Kalman filter (green, dotted) are depicted. The particle filter
provides only a sample representation and thus is omitted.
TABLE I. MEANS AND STANDARD DEVIATIONS OF THE POSTERIOR DENSITIES
(standard deviation: $\sigma^e$).
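A reference solution of the kind used for comparison can be sketched with plain numerical integration on a grid. The Python fragment below runs four filter and prediction steps for the measurement equation (12) and the system equation (13) with the noise levels and measurement sequence stated in this section. The grid resolution, the truncation of the state to $[-5, 5]$, and the renormalization after each prediction are assumptions of this sketch, not details taken from the paper.

```python
import math

def gauss(d, sigma):
    """Gaussian density N(d, sigma): zero mean, standard deviation sigma, evaluated at d."""
    return math.exp(-0.5 * (d / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def h(x):
    """Quadratic decay measurement function."""
    return 1.0 / (1.0 + x * x)

def normalize(dens, dx):
    """Scale grid values so that their Riemann sum equals one."""
    total = sum(dens) * dx
    return [d / total for d in dens]

def grid_filter(prior, grid, dx, y_hat, sigma_v):
    """Bayes' law on the grid: posterior proportional to prior times likelihood."""
    post = [p * gauss(y_hat - h(x), sigma_v) for p, x in zip(prior, grid)]
    return normalize(post, dx)

def grid_predict(post, grid, dx, sigma_w):
    """Prediction for x_{k+1} = x_k + w_k: convolution with the noise density."""
    pred = [sum(p * gauss(x_next - x, sigma_w) for p, x in zip(post, grid)) * dx
            for x_next in grid]
    return normalize(pred, dx)  # compensates mass leaving the truncated grid

n = 401                                   # assumed grid resolution
a, b = -5.0, 5.0
dx = (b - a) / (n - 1)
grid = [a + i * dx for i in range(n)]

density = normalize([gauss(x + 0.5, 1.0) for x in grid], dx)  # initial density
posteriors = []
for y_hat in [0.4, 0.75, 0.5, 0.9]:
    density = grid_filter(density, grid, dx, y_hat, 0.1)
    posteriors.append(density)
    density = grid_predict(density, grid, dx, 0.25)
```

The cost of the prediction grows quadratically with the number of grid points, which illustrates why such a reference is suitable for comparison but not as an efficient on-line estimator.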
VII. CONCLUSIONS AND FUTURE WORK
The novel approach for closed-form measurement updat-
ing of dynamic time-invariant nonlinear systems introduced
in this paper is based on the approximation of conditional
densities by means of axis-aligned Gaussian mixtures. Given
the Gaussian mixture approximation of the conditional den-
sity and an actual measurement, an on-line generation of
the likelihood for performing the filter step is provided
and results in an analytic calculation of the approximate
posterior density. In contrast to the extended Kalman filter
or unscented Kalman filter, the Gaussian mixture represen-
tation of our approach allows an accurate approximation of
the posterior density, especially with regard to higher-order
moments and a multimodal shape. Whereas particle filters
only use a discrete approximation, the proposed estimation
technique is able to give a continuous representation.
The exponential growth of the number of components of
the approximate posterior density can be compensated by
applying the closed-form prediction step derived in . In
doing so, performing predictions implicitly achieves Gaus-
sian mixture reduction in an efficient manner.
The foundation of the proposed approach is an accurate conditional
density approximation. To achieve high-quality approximation results and
to avoid getting trapped in local optima, a progressive optimization
algorithm is proposed. Since the conditional density is time-invariant,
the computationally demanding approximation can be executed off-line.
The described approach has been introduced for scalar random variables
for the sake of brevity and clarity. Generalization to random vectors is
straightforward. At the moment, the approach is restricted to
time-invariant systems. Extension to time-variant systems is part of
further research.
This work was partially supported by the German
Research Foundation (DFG) within the Research Train-
ing Group GRK 1194 “Self-organizing Sensor-Actuator-
REFERENCES
D. L. Alspach and H. W. Sorenson, “Nonlinear Bayesian Estimation using Gaussian Sum Approximation,” IEEE Transactions on Automatic Control, vol. 17, no. 4, pp. 439–448, August 1972.
S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, “A Tutorial on Particle Filters for Online Nonlinear/Non–Gaussian Bayesian Tracking,” IEEE Transactions on Signal Processing, vol. 50, no. 2, pp. 174–
R. Fletcher, Practical Methods of Optimization, 2nd ed. John Wiley and Sons Ltd, 2000.
U. D. Hanebeck, K. Briechle, and A. Rauh, “Progressive Bayes: A New Framework for Nonlinear State Estimation,” in Proceedings of SPIE, vol. 5099. AeroSense Symposium, 2003, pp. 256–267.
S. Haykin, Communication Systems, 4th ed. John Wiley & Sons,
M. Huber, D. Brunn, and U. D. Hanebeck, “Closed-Form Prediction of Nonlinear Dynamic Systems by Means of Gaussian Mixture Approximation of the Transition Density,” in IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems, 2006,
S. J. Julier and J. K. Uhlmann, “Unscented Filtering and Nonlinear Estimation,” in Proceedings of the IEEE, vol. 92, no. 3, 2004, pp.
R. E. Kalman, “A new Approach to Linear Filtering and Prediction Problems,” Transactions of the ASME, Journal of Basic Engineering, no. 82, pp. 35–45, 1960.
P. S. Maybeck and B. D. Smith, “Multiple Model Tracker Based on Gaussian Mixture Reduction for Maneuvering Targets in Clutter,” in 8th International Conference on Information Fusion, vol. 1, 2005, pp.
V. Maz’ya and G. Schmidt, “On Approximate Approximations using Gaussian Kernels,” IMA Journal of Numerical Analysis, vol. 16, no. 1, pp. 13–29, 1996.
N. Oudjane and C. Musso, “Progressive Correction for Regularized Particle Filters,” in Proceedings of the 3rd International Conference on Information Fusion, 2000, pp. THB2/10–THB2/17.
A. Papoulis, Probability, Random Variables and Stochastic Processes, 3rd ed. McGraw-Hill, 1991.
D. J. Salmond, “Mixture Reduction Algorithms for Target Tracking in Clutter,” in SPIE Signal and Data Processing of Small Targets, ser. 1305, April 1990, pp. 434–445.