
Multilevel particle filters for the non-linear filtering problem in continuous time


Abstract and Figures

In the following article we consider the numerical approximation of the non-linear filter in continuous-time, where the observations and signal follow diffusion processes. Given access to high-frequency, but discrete-time observations, we resort to a first order time discretization of the non-linear filter, followed by an Euler discretization of the signal dynamics. In order to approximate the associated discretized non-linear filter, one can use a particle filter. Under assumptions, this can achieve a mean square error of \(\mathcal {O}(\epsilon ^2)\), for \(\epsilon >0\) arbitrary, such that the associated cost is \(\mathcal {O}(\epsilon ^{-4})\). We prove, under assumptions, that the multilevel particle filter of Jasra et al. (SIAM J Numer Anal 55:3068–3096, 2017) can achieve a mean square error of \(\mathcal {O}(\epsilon ^2)\), for cost \(\mathcal {O}(\epsilon ^{-3})\). This is supported by numerical simulations in several examples.
Statistics and Computing (2020) 30:1381–1402
https://doi.org/10.1007/s11222-020-09951-9
Ajay Jasra¹ · Fangyuan Yu¹ · Jeremy Heng²
Received: 15 July 2019 / Accepted: 27 May 2020 / Published online: 15 June 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020
Keywords: Multilevel Monte Carlo · Particle filters · Non-linear filtering
1 Introduction
The non-linear filtering problem in continuous-time is found in many applications in finance, economics and engineering; see e.g. Bain and Crisan (2009). We consider the case where one seeks to filter an unobserved diffusion process (the signal) with access to an observation trajectory that is, in theory, continuous in time and following a diffusion process itself. The non-linear filter is the solution to the Kallianpur–Striebel formula (e.g. Bain and Crisan 2009) and typically has no analytical solution. This has led to a substantial literature on the numerical solution of the filtering problem; see for instance (Bain and Crisan 2009; Del Moral 2013).
In practice, one has access to very high-frequency observations, but not an entire trajectory, and this often means one has to time discretize the functionals associated to the path of the observation and signal. This latter task can be achieved by using the approach in Picard (1984), which is the one used in this article, but improvements exist; see for instance Crisan and Ortiz-Latorre (2013, 2019). Even under such a time-discretization, the filter is not available analytically for most problems of practical interest. From here one must often discretize the dynamics of the signal (for example with an Euler scheme), which in essence leads to a high-frequency discrete-time non-linear filter. This latter object can be approximated using particle filters in discrete time, as in, for instance, Bain and Crisan (2009); this is the approach followed in this article. Alternatives exist, such as unbiased methods (Fearnhead et al. 2010) and integration-by-parts, change of variables along with Feynman–Kac particle methods (Del Moral 2013), but each of these schemes has its advantages and pitfalls versus the one followed in this paper. We refer to e.g. Crisan and Ortiz-Latorre (2019) for some discussion.

Jeremy Heng
b00760223@essec.edu
Ajay Jasra
ajay.jasra@kaust.edu.sa
Fangyuan Yu
fangyuan.yu@kaust.edu.sa
¹ Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955, Kingdom of Saudi Arabia
² ESSEC Business School, Singapore 139408, Singapore

Particle filters generate \(N\) samples (or particles) in parallel and sequentially approximate non-linear filters using sampling and resampling. The algorithms are very well understood mathematically; see for instance Del Moral (2013) and the references therein. Given the particle filter approximation of the time-discretized filter, using an Euler method for the signal, one can expect that to obtain a mean squared error (MSE), relative to the true filter, of \(\mathcal{O}(\epsilon^2)\), for \(\epsilon>0\) arbitrary, the associated cost is \(\mathcal{O}(\epsilon^{-4})\). This follows from standard results on discretizations and particle filters.
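
To make the sampling-and-resampling recursion concrete, the following is a minimal Python sketch of a bootstrap particle filter for an Euler-discretized signal. It is an illustration under stated assumptions, not the authors' implementation: the model form dX = b(X)dt + sigma(X)dW, dY = h(X)dt + dB, the left-endpoint weight exp(h(x)ΔY - h(x)²Δ/2), the function and argument names, and the Ornstein-Uhlenbeck toy data are all introduced here for illustration.

```python
import math
import random


def bootstrap_particle_filter(dy, delta, drift, diffusion, obs_fn,
                              n_particles, x0, rng):
    """Estimate the filter mean of X after each observation increment.

    dy        : observation increments Y_{(k+1)delta} - Y_{k delta}
    delta     : step size shared by the Euler scheme and the data grid
    drift, diffusion, obs_fn : scalar functions b, sigma, h (assumed names)
    """
    particles = [x0] * n_particles
    means = []
    for dy_k in dy:
        # Assumed first-order weight, evaluated at the left endpoint.
        log_w = [obs_fn(x) * dy_k - 0.5 * obs_fn(x) ** 2 * delta
                 for x in particles]
        # Euler step for the signal dynamics.
        particles = [x + drift(x) * delta
                     + diffusion(x) * math.sqrt(delta) * rng.gauss(0.0, 1.0)
                     for x in particles]
        # Normalise the weights in a numerically stable way.
        top = max(log_w)
        w = [math.exp(lw - top) for lw in log_w]
        total = sum(w)
        w = [wi / total for wi in w]
        means.append(sum(wi * xi for wi, xi in zip(w, particles)))
        # Multinomial resampling.
        particles = rng.choices(particles, weights=w, k=n_particles)
    return means


# Toy data: Ornstein-Uhlenbeck signal observed through h(x) = x.
rng = random.Random(0)
delta, n_steps = 2.0 ** -5, 64
x_true, dy = 0.0, []
for _ in range(n_steps):
    dy.append(x_true * delta + math.sqrt(delta) * rng.gauss(0.0, 1.0))
    x_true += -x_true * delta + math.sqrt(delta) * rng.gauss(0.0, 1.0)

filter_means = bootstrap_particle_filter(
    dy, delta, lambda x: -x, lambda x: 1.0, lambda x: x, 200, 0.0, rng)
```

Each step of this sketch costs a number of operations proportional to the number of particles, and there are 1/delta steps per unit time, so halving delta doubles the work for a fixed particle count.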
... The approach improves performance and is robust to arbitrarily small time-discretization at the expense of additional computational cost. The latter is reduced using a new Multilevel Particle Filter; see [17,20] for some existing approaches. This article is structured as follows. ...
... One could focus on a time-discretization of either side of (2.5), however, as is conventional in the literature (e.g. [1,20]) we focus on the left hand side. ...
... We note that to choose $l$ as specified, one has to have access to an appropriately finely observed data set and this is assumed throughout. Typically, one could use a multilevel Monte Carlo method, as in [20], to reduce the cost to achieve an MSE of $\mathcal{O}(\epsilon^2)$. However, in this case as the $\mathcal{O}(N^2)$ cost dominates and does not depend on $l$, one can easily check that such a variance reduction method will not improve our particle method. ...
Article
We consider the problem of parameter estimation for a class of continuous-time state space models (SSMs). In particular, we explore the case of a partially observed diffusion, with data also arriving according to a diffusion process. Based upon a standard identity of the score function, we consider two particle filter based methodologies to estimate the score function. Both methods rely on an online estimation algorithm for the score function, as described, e.g., in [13], of $\mathcal{O}(N^2)$ cost, with $N \in \mathbb{N}$ the number of particles. The first approach employs a simple Euler discretization and standard particle smoothers and is of cost $\mathcal{O}(N^2 + N\Delta_l^{-1})$ per unit time, where $\Delta_l = 2^{-l}$, $l \in \mathbb{N}_0$, is the time-discretization step. The second approach is new and based upon a novel diffusion bridge construction. It yields a new backward type Feynman-Kac formula in continuous-time for the score function and is presented along with a particle method for its approximation. Considering a time-discretization, the cost is $\mathcal{O}(N^2\Delta_l^{-1})$ per unit time. To improve computational costs, we then consider multilevel methodologies for the score function. We illustrate our parameter estimation method via stochastic gradient approaches in several numerical examples.
... Another common recognised problem in continuous time filtering for diffusion processes is the unavailability of transition densities [14,19]. In our problem though, the hidden state is described by a linear SDE and thus state transition density is available, but the likelihood still remains intractable for the reason mentioned above. ...
... Since the minimisation problem above cannot be solved exactly, one can pursue a surrogate for ∆ * , in its vicinity, by minimising . Hence ∆ * , not being the true minimiser of (19), is a more conservative solution. Substituting this ∆ * into (19) gives an indication of the best relative MSE value for each $C$, which is of order $\mathcal{O}(C^{-1}\log(C))$. ...
... Hence ∆ * , not being the true minimiser of (19), is a more conservative solution. Substituting this ∆ * into (19) gives an indication of the best relative MSE value for each $C$, which is of order $\mathcal{O}(C^{-1}\log(C))$. In practice, we do not recommend this optimisation but rather choose (∆, η) as detailed in Section 4.1.1 and then stick to this choice even if more CPU time $C$ has become available. ...
Preprint
We develop a (nearly) unbiased particle filtering algorithm for a specific class of continuous-time state-space models, such that (a) the latent process $X_t$ is a linear Gaussian diffusion; and (b) the observations arise from a Poisson process with intensity $\lambda(X_t)$. The likelihood of the posterior probability density function of the latent process includes an intractable path integral. Our algorithm relies on Poisson estimates which approximate unbiasedly this integral. We show how we can tune these Poisson estimates to ensure that, with large probability, all but a few of the estimates generated by the algorithm are positive. Then replacing the negative estimates by zero leads to a much smaller bias than what would obtain through discretisation. We quantify the probability of negative estimates for certain special cases and show that our particle filter is effectively unbiased. We apply our method to a challenging 3D single molecule tracking example with a Born and Wolf observation model.
... More recently, the methodology has also been applied to the context of partially observed diffusions [40,36], for parameter inference [38], online state inference [40,13,29,31,2,43], or both [19]. A notable recent body of work relates to continuous-time observations in this context [46,4,56]. Another notable trend is the application of randomized MLMC methods [11,42,41,34,45] in this context. Typically these methods require unbiased estimators of increments, which is particularly challenging in the inference context. ...
Preprint
We consider the problem of estimating expectations with respect to a target distribution with an unknown normalizing constant, and where even the unnormalized target needs to be approximated at finite resolution. This setting is ubiquitous across science and engineering applications, for example in the context of Bayesian inference where a physics-based model governed by an intractable partial differential equation (PDE) appears in the likelihood. A multi-index Sequential Monte Carlo (MISMC) method is used to construct ratio estimators which provably enjoy the complexity improvements of multi-index Monte Carlo (MIMC) as well as the efficiency of Sequential Monte Carlo (SMC) for inference. In particular, the proposed method provably achieves the canonical complexity of MSE$^{-1}$, while single level methods require MSE$^{-\xi}$ for $\xi>1$. This is illustrated on examples of Bayesian inverse problems with an elliptic PDE forward model in $1$ and $2$ spatial dimensions, where $\xi=5/4$ and $\xi=3/2$, respectively. It is also illustrated on a more challenging log Gaussian process models, where single level complexity is approximately $\xi=9/4$ and multilevel Monte Carlo (or MIMC with an inappropriate index set) gives $\xi = 5/4 + \omega$, for any $\omega > 0$, whereas our method is again canonical.
... Interesting book on the particle filter can be seen in [18,53,55,6]. Recently, particle filtering has attracted the attention of many researchers, see [56,62,28,66,68,41,61,20,47,11]. ...
... The construction of computable unbiased estimators via a sequence of asymptotically biased estimators has received much attention in recent years, following Glynn and Rhee (2014). This technique has been directly extended to estimating expectations of functionals of SDE paths in Rhee and Glynn (2015), and recently this approach has been further extended to non-linear filtering problems in Jasra et al. (2020). In ongoing work we have started to explore the possibility that a related approach could allow for exact rare event estimation in a more general setting, and with greater scope for practical applications. ...
Article
For rare events described in terms of Markov processes, truly unbiased estimation of the rare event probability generally requires the avoidance of numerical approximations of the Markov process. Recent work in the exact and $\varepsilon$-strong simulation of diffusions, which can be used to almost surely constrain sample paths to a given tolerance, suggests one way to do this. We specify how such algorithms can be combined with the classical multilevel splitting method for rare event simulation. This provides unbiased estimations of the probability in question. We discuss the practical feasibility of the algorithm with reference to existing $\varepsilon$-strong methods and provide proof-of-concept numerical examples.
... Thus, based on (16)- (17), when L is sampled from p , our estimator is as follows: ...
Preprint
In this paper, we consider static parameter estimation for a class of continuous-time state-space models. Our goal is to obtain an unbiased estimate of the gradient of the log-likelihood (score function), which is an estimate that is unbiased even if the stochastic processes involved in the model must be discretized in time. To achieve this goal, we apply a doubly randomized scheme that involves a novel coupled conditional particle filter (CCPF) on the second level of randomization. Our novel estimate helps facilitate the application of gradient-based estimation algorithms, such as stochastic-gradient Langevin descent. We illustrate our methodology in the context of stochastic gradient descent (SGD) in several numerical examples and compare with the Rhee & Glynn estimator.
... The construction of computable unbiased estimators via a sequence of asymptotically biased estimators has received much attention in recent years, following [21]. This technique has been directly extended to estimating expectations of functionals of SDE paths in [34], and recently this approach has been further extended to non-linear filtering problems in [23]. In ongoing work we have started to explore the possibility that a related approach could allow for exact rare event estimation in a more general setting, and with greater scope for practical applications. ...
Preprint
For rare events described in terms of Markov processes, truly unbiased estimation of the rare event probability generally requires the avoidance of numerical approximations of the Markov process. Recent work in the exact and $\varepsilon$-strong simulation of diffusions, which can be used to almost surely constrain sample paths to a given tolerance, suggests one way to do this. We specify how such algorithms can be combined with the classical multilevel splitting method for rare event simulation. This provides unbiased estimations of the probability in question. We discuss the practical feasibility of the algorithm with reference to existing $\varepsilon$-strong methods and provide proof-of-concept numerical examples.
Article
In this paper, we consider static parameter estimation for a class of continuous-time state-space models. Our goal is to obtain an unbiased estimate of the gradient of the log-likelihood (score function), which is an estimate that is unbiased even if the stochastic processes involved in the model must be discretized in time. To achieve this goal, we apply a doubly randomized scheme, that involves a novel coupled conditional particle filter (CCPF) on the second level of randomization. Our novel estimate helps facilitate the application of gradient-based estimation algorithms, such as stochastic-gradient Langevin descent. We illustrate our methodology in the context of stochastic gradient descent (SGD) in several numerical examples and compare with the Rhee–Glynn estimator.
Article
This paper proposes the framework to determine and update the crack-front profile at the toe of welded plate joints based on the strain relaxation data. This study determines the crack depth by classifying the nodes in the thickness direction as open nodes on the crack surface and closed nodes on the intact ligament. To update the crack size during the loading history, this research employs the modified bootstrap particle filtering approach, which entails enhanced adjustment capabilities by imposing additional uncertainty distributions. This approach improves the crack size prediction by absorbing limited measurement data on the strain values or crack sizes.
Article
This paper develops a hybrid approach combining the neural network and the nonlinear filtering to model and predict terrain profiles for both air and ground vehicles. To simplify the neural network structures and reduce the number of synaptic weights and biases, the multiplicative neuron model (MNM) is utilized to describe the relationship between the unknown elevation ahead and the last few height values on the terrain profile. This paper adopts the gradient descent algorithm (GDA) to train the MNM terrain model and stores the MNM parameters into a nonlinear state-space model. The state vector in the state-space model (i.e., parameters of MNM) evolve agilely once absorbing new observations and measurement of elevation values by the Bootstrap Particle Filter (BPF) algorithm. Data-driven predictions on terrain profiles can be achieved through the updated MNM model. This study utilizes two types of terrain profiles to verify the effectiveness of the proposed MNM-BPF approach. Experimental results on two public datasets indicate that the proposed approach not only overcomes the limitations of conventional terrain models that cannot dynamically tune model parameters according to the newly input information, but also provides a simple but effective single-layered network for modeling terrain profiles. The well-trained MNM-BPF model can achieve the lowest root mean square errors (RMSE) (i.e., 17.3211 on the NS profile, 19.0366 on the EW profile) and average error (AE) (i.e., 1.5852 on the NS profile, 0.14885 on the EW profile) in the low-resolution dataset. The lowest RMSE (i.e., 0.16549 on the left profile, 0.29926 on the right profile) and mean absolute error (MAE) (i.e., 0.13467 on the left profile, 0.23933 on the right profile) results are obtained in the high-resolution dataset. Overall, the developed model is superior to the state-of-the-art models in at least four of the six performance metrics and reduces RMSE by 40.8%, 17.2%, 13.1%, and 6.8% on average on the four testing terrain profiles, respectively. The developed approach can be used as a decision tool for the accurate prediction of terrain profiles with different resolutions.
Article
We consider the numerical analysis of the time discretization of Feynman-Kac semigroups associated with diffusion processes. These semigroups naturally appear in several fields, such as large deviation theory, Diffusion Monte Carlo or non-linear filtering. We present error estimates à la Talay-Tubaro on their invariant measures when the underlying continuous stochastic differential equation is discretized, as well as on the leading eigenvalue of the generator of the dynamics, which corresponds to the rate of creation of probability. This provides criteria to construct efficient integration schemes of Feynman-Kac dynamics, as well as a mathematical justification of numerical results already observed in the Diffusion Monte Carlo community. Our analysis is illustrated by numerical simulations.
Article
The solution of the continuous time filtering problem can be represented as a ratio of two expectations of certain functionals of the signal process that are parametrized by the observation path. We introduce a class of discretization schemes of these functionals of arbitrary order. The result generalizes the classical work of Picard, who introduced first order discretizations to the filtering functionals. For a given time interval partition, we construct discretization schemes with convergence rates that are proportional with the $m$-power of the mesh of the partition for arbitrary $m\in\mathbb{N}$. The result paves the way for constructing high order numerical approximation for the solution of the filtering problem.
Article
In this article we introduce two new estimates of the normalizing constant (or marginal likelihood) for partially observed diffusion (POD) processes, with discrete observations. One estimate is biased but non-negative and the other is unbiased but not almost surely non-negative. Our method uses the multilevel particle filter of Jasra et al. (2015). We show that, under assumptions, for Euler discretized PODs and a given $\varepsilon>0$, in order to obtain a mean square error (MSE) of $\mathcal{O}(\varepsilon^2)$ one requires a work of $\mathcal{O}(\varepsilon^{-2.5})$ for our new estimates versus a standard particle filter that requires a work of $\mathcal{O}(\varepsilon^{-3})$. Our theoretical results are supported by numerical simulations.
Article
In this article we prove new central limit theorems (CLTs) for several coupled particle filters (CPFs). CPFs are used for the sequential estimation of the difference of expectations with respect to filters which are in some sense close. Examples include the estimation of the filtering distribution associated to different parameters (finite difference estimation) and filters associated to partially observed discretized diffusion processes (PODDP) and the implementation of the multilevel Monte Carlo (MLMC) identity. We develop new theory for CPFs, and based upon several results, we propose a new CPF which approximates the maximal coupling (MCPF) of a pair of predictor distributions. In the context of ML estimation associated to PODDP with time-discretization $\Delta_l=2^{-l}$, $l\in\{0,1,\dots\}$, we show that the MCPF and the approach of Jasra, Ballesio, et al. (2018) have, under certain assumptions, an asymptotic variance that is bounded above by an expression that is of (almost) the order of $\Delta_l$ ($\mathcal{O}(\Delta_l)$), uniformly in time. The $\mathcal{O}(\Delta_l)$ bound preserves the so-called forward rate of the diffusion in some scenarios, which is not the case for the CPF in Jasra et al. (2017).
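
As an illustration of what coupling two particle systems can look like at the resampling step, the sketch below draws a pair of ancestor indices, one for a fine-level filter and one for a coarse-level filter, so that the two indices coincide as often as possible while each keeps its own marginal resampling distribution. This is one commonly used index-coupling rule, written under assumptions: the abstract above does not spell out the constructions it analyses, and the function name and the example weights are hypothetical.

```python
import random


def coupled_resample(w_fine, w_coarse, rng):
    """Draw one index pair (i_fine, i_coarse) from normalised weight vectors.

    Marginally, i_fine follows w_fine and i_coarse follows w_coarse.
    The two indices are equal with probability sum_i min(w_fine[i], w_coarse[i]).
    """
    overlap = [min(a, b) for a, b in zip(w_fine, w_coarse)]
    alpha = sum(overlap)
    indices = range(len(w_fine))
    # The tolerance guards against rounding when the weight vectors coincide.
    if alpha >= 1.0 - 1e-12 or rng.random() < alpha:
        i = rng.choices(indices, weights=overlap, k=1)[0]
        return i, i
    # Otherwise draw independently from the two residual distributions.
    res_fine = [a - m for a, m in zip(w_fine, overlap)]
    res_coarse = [b - m for b, m in zip(w_coarse, overlap)]
    i = rng.choices(indices, weights=res_fine, k=1)[0]
    j = rng.choices(indices, weights=res_coarse, k=1)[0]
    return i, j


# Hypothetical weights for three particles at two neighbouring levels.
w_fine = [0.5, 0.3, 0.2]
w_coarse = [0.4, 0.4, 0.2]
rng = random.Random(0)
n_draws = 20000
pairs = [coupled_resample(w_fine, w_coarse, rng) for _ in range(n_draws)]

freq_fine = [sum(1 for i, _ in pairs if i == k) / n_draws for k in range(3)]
freq_coarse = [sum(1 for _, j in pairs if j == k) / n_draws for k in range(3)]
frac_equal = sum(1 for i, j in pairs if i == j) / n_draws
```

With these example weights the overlap sums to 0.9, so about nine pairs in ten share an ancestor index; keeping ancestors paired is what allows the fine and coarse particles to stay close after resampling.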
Book
In the last three decades, there has been a dramatic increase in the use of interacting particle methods as a powerful tool in real-world applications of Monte Carlo simulation in computational physics, population biology, computer sciences, and statistical machine learning. Ideally suited to parallel and distributed computation, these advanced particle algorithms include nonlinear interacting jump diffusions; quantum, diffusion, and resampled Monte Carlo methods; Feynman-Kac particle models; genetic and evolutionary algorithms; sequential Monte Carlo methods; adaptive and interacting Markov chain Monte Carlo models; bootstrapping methods; ensemble Kalman filters; and interacting particle filters. Mean Field Simulation for Monte Carlo Integration presents the first comprehensive and modern mathematical treatment of mean field particle simulation models and interdisciplinary research topics, including interacting jumps and McKean-Vlasov processes, sequential Monte Carlo methodologies, genetic particle algorithms, genealogical tree-based algorithms, and quantum and diffusion Monte Carlo methods. Along with covering refined convergence analysis on nonlinear Markov chain models, the author discusses applications related to parameter estimation in hidden Markov chain models, stochastic optimization, nonlinear filtering and multiple target tracking, stochastic optimization, calibration and uncertainty propagations in numerical codes, rare event simulation, financial mathematics, and free energy and quasi-invariant measures arising in computational physics and population biology. This book shows how mean field particle simulation has revolutionized the field of Monte Carlo integration and stochastic algorithms. It will help theoretical probability researchers, applied statisticians, biologists, statistical physicists, and computer scientists work better across their own disciplinary boundaries.
Article
In this paper the filtering of partially observed diffusions, with discrete-time observations, is considered. It is assumed that only biased approximations of the diffusion can be obtained for choice of an accuracy parameter indexed by $l$. A multilevel estimator is proposed consisting of a telescopic sum of increment estimators associated to the successive levels. The work associated to $\mathcal{O}(\epsilon^2)$ mean-squared error between the multilevel estimator and average with respect to the filtering distribution is shown to scale optimally, for example, as $\mathcal{O}(\epsilon^{-2})$ for optimal rates of convergence of the underlying diffusion approximation. The method is illustrated with some toy examples as well as estimation of interest rate based on real S&P 500 stock price data.
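
The telescopic sum mentioned in this abstract can be written out as follows. The notation is an assumption made here for illustration: $\pi^l$ denotes the filter under the level-$l$ approximation, $\varphi$ a test function, and $L$ the finest level used.

```latex
\[
  \mathbb{E}_{\pi^{L}}[\varphi]
  \;=\;
  \mathbb{E}_{\pi^{0}}[\varphi]
  \;+\;
  \sum_{l=1}^{L}
  \Bigl( \mathbb{E}_{\pi^{l}}[\varphi] - \mathbb{E}_{\pi^{l-1}}[\varphi] \Bigr)
\]
```

The identity holds exactly because the intermediate terms cancel. The multilevel idea is to estimate each bracketed increment with its own coupled pair of particle systems, so that the increment has small variance when levels $l$ and $l-1$ are close and fewer particles are needed at the expensive fine levels.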
Chapter
Several applications of the strong schemes that were derived in the preceding chapters will be indicated in this chapter. These are the direct simulation of trajectories of stochastic dynamical systems, including stochastic flows, the testing of parametric estimators and Markov chain filters. In addition, some results on asymptotically efficient schemes will be presented.
Book
Filtering Theory.- The Stochastic Process π.- The Filtering Equations.- Uniqueness of the Solution to the Zakai and the Kushner-Stratonovich Equations.- The Robust Representation Formula.- Finite-Dimensional Filters.- The Density of the Conditional Distribution of the Signal.- Numerical Algorithms.- Numerical Methods for Solving the Filtering Problem.- A Continuous Time Particle Filter.- Particle Filters in Discrete Time.