Mikko S. Pakkanen’s research while affiliated with Imperial College London and other places

What is this page?


This page lists works of an author who doesn't have a ResearchGate profile or hasn't added the works to their profile yet. It is automatically generated from public (personal) data to further our legitimate goal of comprehensive and accurate scientific recordkeeping. If you are this author and want this page removed, please let us know.

Publications (61)


Figure 1. Estimating the expected signature from a finite collection of discretely observed paths.
Learning with Expected Signatures: Theory and Applications
  • Preprint
  • File available

May 2025 · 2 Reads · Mikko S. Pakkanen
The expected signature maps a collection of data streams to a lower dimensional representation, with a remarkable property: the resulting feature tensor can fully characterize the data generating distribution. This "model-free" embedding has been successfully leveraged to build multiple domain-agnostic machine learning (ML) algorithms for time series and sequential data. The convergence results proved in this paper bridge the gap between the expected signature's empirical discrete-time estimator and its theoretical continuous-time value, allowing for a more complete probabilistic interpretation of expected signature-based ML methods. Moreover, when the data generating process is a martingale, we suggest a simple modification of the expected signature estimator with significantly lower mean squared error and empirically demonstrate how it can be effectively applied to improve predictive performance.
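The empirical discrete-time estimator discussed in the abstract can be illustrated with a minimal sketch: compute the first two signature levels of each piecewise-linear path (exact for linear segments, via the standard Chen-style formula) and average over the sample. The function names `signature_level2` and `expected_signature` are illustrative, not from the paper.

```python
import numpy as np

def signature_level2(path):
    """First two signature levels of a piecewise-linear path.

    path: array of shape (n_points, d).
    level1[i] = X_T^i - X_0^i, and level2[i, j] is the iterated
    integral of (X^i - X_0^i) dX^j, computed exactly on each segment.
    """
    dx = np.diff(path, axis=0)                      # segment increments
    level1 = dx.sum(axis=0)
    # Level-1 value accumulated *before* each segment starts.
    prefix = np.vstack([np.zeros(path.shape[1]),
                        np.cumsum(dx, axis=0)[:-1]])
    # Cross terms from earlier segments, plus the exact
    # within-segment contribution dx dx^T / 2 of a linear piece.
    level2 = prefix.T @ dx + 0.5 * (dx.T @ dx)
    return level1, level2

def expected_signature(paths):
    """Empirical expected signature: average the signature terms over
    a finite collection of discretely observed paths."""
    sigs = [signature_level2(p) for p in paths]
    return (np.mean([s[0] for s in sigs], axis=0),
            np.mean([s[1] for s in sigs], axis=0))
```

For an L-shaped path moving first in coordinate 1 and then in coordinate 2, the area term `level2[0, 1]` equals 1 while `level2[1, 0]` equals 0, recovering the familiar order-sensitivity of the signature.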





Unifying incidence and prevalence under a time-varying general branching process

August 2023 · 127 Reads · 10 Citations

Journal of Mathematical Biology

Mikko S. Pakkanen · Xenia Miscouridou · [...] · Samir Bhatt

Renewal equations are a popular approach used in modelling the number of new infections, i.e., incidence, in an outbreak. We develop a stochastic model of an outbreak based on a time-varying variant of the Crump–Mode–Jagers branching process. This model accommodates a time-varying reproduction number and a time-varying distribution for the generation interval. We then derive renewal-like integral equations for incidence, cumulative incidence and prevalence under this model. We show that the equations for incidence and prevalence are consistent with the so-called back-calculation relationship. We analyse two particular cases of these integral equations, one that arises from a Bellman–Harris process and one that arises from an inhomogeneous Poisson process model of transmission. We also show that the incidence integral equations that arise from both of these specific models agree with the renewal equation used ubiquitously in infectious disease modelling. We present a numerical discretisation scheme to solve these equations, and use this scheme to estimate rates of transmission from serological prevalence of SARS-CoV-2 in the UK and historical incidence data on Influenza, Measles, SARS and Smallpox.
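The renewal-like incidence equation can be solved with a simple forward discretisation scheme. The sketch below is illustrative and not the paper's exact quadrature: `R` plays the role of the time-varying reproduction number and `w` a discretised generation-interval distribution, both assumed given on a unit time grid.

```python
import numpy as np

def simulate_incidence(R, w, I0=1.0):
    """Forward scheme for a discretised renewal equation,
        I_t = R_t * sum_{s=1}^{t} w_s * I_{t-s},
    a simplified analogue of the integral equations in the paper.

    R : array of reproduction numbers R_t for t = 0..T-1
    w : generation-interval pmf, w[s-1] is the weight at lag s
    """
    T = len(R)
    I = np.zeros(T)
    I[0] = I0
    for t in range(1, T):
        smax = min(t, len(w))
        # Force of infection: past incidence weighted by the
        # generation-interval distribution.
        force = sum(w[s - 1] * I[t - s] for s in range(1, smax + 1))
        I[t] = R[t] * force
    return I
```

With a constant reproduction number of 2 and a one-step generation interval, incidence doubles each step, matching the textbook geometric-growth intuition behind the renewal equation.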


Estimation and Inference for Multivariate Continuous-time Autoregressive Processes

July 2023 · 11 Reads
The aim of this paper is to develop estimation and inference methods for the drift parameters of multivariate Lévy-driven continuous-time autoregressive processes of order $p \in \mathbb{N}$. Starting from a continuous-time observation of the process, we develop consistent and asymptotically normal maximum likelihood estimators. We then relax the unrealistic assumption of continuous-time observation by considering natural discretizations based on a combination of Riemann-sum, finite-difference, and thresholding approximations. The resulting estimators are also proven to be consistent and asymptotically normal under a general set of conditions, allowing for both finite and infinite jump activity in the driving Lévy process. When discretizing the estimators, allowing for irregularly spaced observations is of great practical importance. In this respect, CAR(p) models are not just relevant for "true" continuous-time processes: a CAR(p) specification provides a natural continuous-time interpolation for modeling irregularly spaced data, even if the observed process is inherently discrete. As a practically relevant application, we consider the setting where the multivariate observation is known to possess a graphical structure. We refer to such a process as GrCAR and discuss the corresponding drift estimators and their properties. The finite-sample behavior of all theoretical asymptotic results is empirically assessed by extensive simulation experiments.
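A rough illustration of the continuous-record MLE and its Riemann-sum discretisation, in the simplest CAR(1) (Ornstein-Uhlenbeck) case driven by Brownian motion: the Euler simulation and all function names below are illustrative simplifications, not the paper's general Lévy-driven setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_ou(A, T=200.0, dt=0.01):
    """Euler scheme for the OU (CAR(1)) process dX_t = A X_t dt + dW_t."""
    n = int(T / dt)
    d = A.shape[0]
    X = np.zeros((n + 1, d))
    for k in range(n):
        X[k + 1] = (X[k] + (A @ X[k]) * dt
                    + rng.normal(scale=np.sqrt(dt), size=d))
    return X

def drift_mle(X, dt):
    """Riemann-sum discretisation of the continuous-time MLE
    A_hat = (int dX X') (int X X' dt)^{-1}."""
    dX = np.diff(X, axis=0)
    num = dX.T @ X[:-1]                 # approximates int dX X'
    den = (X[:-1].T @ X[:-1]) * dt      # approximates int X X' dt
    return num @ np.linalg.inv(den)
```

For a stable drift such as A = -1, the discretised estimator recovers the drift up to sampling noise of order 1/sqrt(T) plus a small Euler discretisation bias.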


Schematic of a time-varying general branching process
(a) Schematics for the infectious period, an individual's time-varying infectiousness (both functions of time post infection t*), and the population-level mean rate of infection events. The infectious period is given by probability density function g. For each individual, their (time-varying) infectiousness and rate of infection events are given by ν and ρ respectively. In an example (b), an individual is infected at time l and infects three people (random variables K, purple dashed lines) at times l + K1, l + K2 and l + K3. The times of these infections follow the probability density ρ(t)ν(t − l) / ∫_l^t ρ(u)ν(u − l) du. Each new infection then has its own infectious period and secondary infections (thinner coloured lines).
Aleatoric uncertainty without overdispersed offspring distribution
Plots show a simulated epidemic where ρ(t) = 1.4 + sin(0.15t), with a Poisson offspring distribution. We use infectiousness ν ~ Gamma(3, 1) and infectious period g ~ Gamma(5, 1). (a) Overlap between g and the infectiousness ν, where g controls when the infection ends, e.g. by isolation. (b) Predicted mean and 95% aleatoric uncertainty intervals for prevalence. Note there is no epistemic uncertainty, as the parameters are known exactly. (c) Phase plane plot showing the mean plotted against the variance. (d) Proportional contribution to the variance from the individual terms in Eq. (9). Compounding uncertainty from past events is the dominant contributor to overall uncertainty.
The 2003 SARS epidemic in Hong Kong [20,21]
(a) ρ(t) with 95% epistemic uncertainty. (b) Fitted incidence mean and 95% epistemic uncertainty with observational noise, using Eq. (4). Data are daily incidence of symptom onset. (c) Aleatoric uncertainty from the start of the epidemic under an optimistic and a pessimistic ρ(t). (d) Epistemic (blue) versus combined epistemic and aleatoric uncertainty (red) while keeping ρ constant at the forecast date (dotted line). Forecasting is from day 60.
Early 2020 COVID-19 pandemic in the UK
(a) shows a simulated epidemic using parameters available on March 16th 2020 (Table 1), for a plausible range of ρ = R0 between 2 and 4. Blue bars indicate actual COVID-19 deaths, of which we assume no knowledge. The purple line marks March 17th 2020, when we set transmission to zero (ρ = 0) to simulate an intervention that stops transmission completely. The grey envelope is the epistemic uncertainty and the red envelope the aleatoric uncertainty. (b) is the same as the top plot, except time is extended past March 17th with transmission remaining zero. Note aleatoric uncertainty is presented but is very close to zero.
Intrinsic randomness in epidemic modelling beyond statistical uncertainty

June 2023 · 96 Reads · 8 Citations
Uncertainty can be classified as either aleatoric (intrinsic randomness) or epistemic (imperfect knowledge of parameters). The majority of frameworks assessing infectious disease risk consider only epistemic uncertainty. We only ever observe a single epidemic, and therefore cannot empirically determine aleatoric uncertainty. Here, we characterise both epistemic and aleatoric uncertainty using a time-varying general branching process. Our framework explicitly decomposes aleatoric variance into mechanistic components, quantifying the contribution to uncertainty produced by each factor in the epidemic process, and how these contributions vary over time. The aleatoric variance of an outbreak is itself a renewal equation where past variance affects future variance. We find that superspreading is not necessary for substantial uncertainty, and profound variation in outbreak size can occur even without overdispersion in the offspring distribution (i.e. the distribution of the number of secondary infections an infected person produces). Aleatoric forecasting uncertainty grows dynamically and rapidly, and so forecasting using only epistemic uncertainty is a significant underestimate. Therefore, failure to account for aleatoric uncertainty will ensure that policymakers are misled about the substantially higher true extent of potential risk. We demonstrate our method, and the extent to which potential risk is underestimated, using two historical examples.
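The claim that substantial outbreak variation needs no overdispersion can be checked with a toy discrete-generation branching process using a plain Poisson offspring law. This is a deliberate simplification of the paper's time-varying Crump-Mode-Jagers process; the function name and parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def outbreak_sizes(R=1.5, n_gen=10, n_sims=2000):
    """Total sizes of a discrete-generation branching process with
    Poisson(R) offspring, i.e. no overdispersion.  The spread of the
    resulting totals is purely aleatoric: every run uses identical,
    exactly known parameters."""
    totals = np.empty(n_sims)
    for i in range(n_sims):
        infected, total = 1, 1
        for _ in range(n_gen):
            # Sum of Poisson(R) offspring over all currently infected
            # individuals is Poisson(R * infected).
            infected = rng.poisson(R * infected) if infected else 0
            total += infected
        totals[i] = total
    return totals
```

Even with a Poisson offspring distribution, the simulated totals mix early extinctions with large outbreaks, so the standard deviation is of the same order as the mean, consistent with the abstract's point.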


The uncertainty of infectious disease outbreaks is underestimated

December 2022 · 99 Reads
Uncertainty can be classified as either aleatoric (intrinsic randomness) or epistemic (imperfect knowledge of parameters). The majority of frameworks assessing infectious disease risk consider only epistemic uncertainty. We only ever observe a single epidemic, and therefore cannot empirically determine aleatoric uncertainty. Here, for the first time, we characterise both epistemic and aleatoric uncertainty using a time-varying general branching process. Our framework explicitly decomposes aleatoric variance into mechanistic components, quantifying the contribution to uncertainty produced by each factor in the epidemic process, and how these contributions vary over time. The aleatoric variance of an outbreak is itself a renewal equation where past variance affects future variance. Perhaps surprisingly, superspreading is not necessary for substantial uncertainty, and profound variation in outbreak size can occur even without overdispersion in the offspring distribution (i.e. the distribution of the number of secondary infections an infected person produces). Crucially, aleatoric forecasting uncertainty grows dynamically and rapidly, and so forecasting using only epistemic uncertainty is a significant underestimate. Therefore, failure to account for aleatoric uncertainty will ensure that policymakers are misled about the substantially higher true extent of potential risk. We demonstrate our method, and the extent to which potential risk is underestimated, using two historical examples: firstly the 2003 Hong Kong severe acute respiratory syndrome (SARS) outbreak, and secondly the early 2020 UK COVID-19 epidemic. Our framework provides analytical tools to estimate epidemic uncertainty with limited data, to provide reasonable worst-case scenarios and assess both epistemic and aleatoric uncertainty in forecasting, and to retrospectively assess an epidemic and thereby provide a baseline risk estimate for future outbreaks. Our work strongly supports the precautionary principle in pandemic response.


The Short-Term Predictability of Returns in Order Book Markets: a Deep Learning Perspective

November 2022 · 113 Reads
In this paper, we conduct a systematic large-scale analysis of order book-driven predictability in high-frequency returns by leveraging deep learning techniques. First, we introduce a new and robust representation of the order book, the volume representation. Next, we conduct an extensive empirical experiment to address various questions regarding predictability. We investigate if and how far ahead there is predictability, the importance of a robust data representation, the advantages of multi-horizon modeling, and the presence of universal trading patterns. We use model confidence sets, which provide a formalized statistical inference framework particularly well suited to answer these questions. Our findings show that at high frequencies predictability in mid-price returns is not just present, but ubiquitous. The performance of the deep learning models is strongly dependent on the choice of order book representation, and in this respect, the volume representation appears to have multiple practical advantages.
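One plausible way to build a volume-style order book representation is to bin quoted sizes onto a fixed grid of tick offsets around the mid-price, so the model input has a constant shape regardless of how quotes are spaced. This is an illustrative sketch only; the exact construction of the paper's volume representation may differ, and all names below are assumptions.

```python
import numpy as np

def volume_representation(bids, asks, mid, tick, depth=10):
    """Bin quoted sizes onto a fixed grid of tick offsets around mid.

    bids, asks : lists of (price, size) tuples
    Returns an array of length 2*depth: bid volumes at offsets
    -depth..-1 ticks below mid, then ask volumes at +1..+depth ticks
    above mid.  Quotes outside the grid are dropped.
    """
    grid = np.zeros(2 * depth)
    for price, size in bids:
        k = int(round((mid - price) / tick))   # ticks below mid
        if 1 <= k <= depth:
            grid[depth - k] += size
    for price, size in asks:
        k = int(round((price - mid) / tick))   # ticks above mid
        if 1 <= k <= depth:
            grid[depth + k - 1] += size
    return grid
```

Because the grid is anchored to the mid-price rather than to quote indices, sparse or shifted books map to comparable inputs, which is one practical motivation for volume-based representations over raw per-level (price, size) features.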



Citations (34)


... Arroyo et al. proposed a deep learning method using a Convolutional-Transformer encoder and a monotonic neural network decoder to estimate limit order fill times in a limit order book (LOB), significantly outperforming traditional survival analysis approaches [48]. Lucchese et al. employed deep learning techniques to conduct a large-scale analysis of predictability in high-frequency returns driven by order books, introducing a volume representation of the order book and conducting extensive empirical experiments [49]. Jaddu et al. focused on forecasting returns across multiple horizons using order flow imbalance and training three temporal-difference learning models for five financial instruments, including forex pairs, indices, and a commodity [50]. ...

Reference:

An Efficient deep learning model to Predict Stock Price Movement Based on Limit Order Book
The Short-Term Predictability of Returns in Order Book Markets: A Deep Learning Perspective
  • Citing Article
  • February 2024

International Journal of Forecasting

... SIR dynamical models are comparatively easy to fit to data but often lack the detail necessary to bridge the gap between individual behaviour and population-level dynamics. Alternative models based on branching processes [48], partial differential equations or self-excitatory Hawkes processes [49] are growing in popularity and there are strong similarities between these approaches [48,50]. As previously noted, foundation time-series models can be used to constrain function space by restricting the set of functions a priori and embedding these constraints within a chosen mechanistic model (termed semi-mechanistic models). ...

Unifying incidence and prevalence under a time-varying general branching process

Journal of Mathematical Biology

... They are easily parameterised, permitting flexibility in fitting different distributions by adjusting their mean and standard deviation. Finally, Gaussian augmentation is designed to address epistemic uncertainty, which arises from incomplete knowledge, by introducing new columns that depend on previous features yet allow deviations within the Gaussian distribution, distinguishing from aleatoric uncertainty, which originates from the inherent randomness of events (Penn et al. 2023). ...

Intrinsic randomness in epidemic modelling beyond statistical uncertainty

... From a computational perspective, our scalable stochastic gradient approach to uncertainty-aware learning draws inspiration from both distributionally robust optimization and recent advances in deep hedging [11,13,68,54,14], which applies deep learning to complex financial decision problems. Like deep hedging, our method addresses highdimensional decision spaces and leverages machine learning to approximate optimal strategies while remaining robust to model risk. ...

Deep Hedging: Continuous Reinforcement Learning for Hedging of General Portfolios across Multiple Risk Aversions
  • Citing Conference Paper
  • October 2022

... The multivariate models evoke several applications where matrix-based scaling laws are expected to appear, such as in long range dependent time series [11,12] and queueing systems [13,14]. Like fractional Brownian motion in the univariate setting, operator fractional Brownian motion is a natural selection for constructing estimators for operator self-similarity process, because it is Gaussian and closely connected with stationary fractional process (see [2,9] for more information about operator self-similar processes). The fractal nature for operator fractional Brownian motion such as the Hausdorff dimension of the image and graph, and spatial surface properties such as hitting probabilities, transience, and the characterization of polar sets were studied by Mason and Xiao [15]. ...

Limit theorems for the realised semicovariances of multivariate Brownian semistationary processes
  • Citing Article
  • October 2022

Stochastic Processes and their Applications

... 2 Semenova et al. (2022) propose a two-stage procedure that bypasses the computational demands of latent Gaussian models by learning the spatial prior through a VAE. Mishra et al. (2022) leverage VAEs to capture low-dimensional data representations. We note that neural networks have been used within a FH model (e.g., see Parker (2024)), but not for the purpose of encoding spatial dependence. ...

πVAE: a stochastic process prior for Bayesian deep learning with MCMC

Statistics and Computing

... The SV framework has inspired extensive methodological developments: Tauchen and Pitts [19] and Taylor [20] pioneered the application of stochastic principles to financial volatility modeling; Chib, Nardari, and Shephard [21] advanced Bayesian estimation techniques for high-dimensional multivariate SV models with time-varying correlations; Jensen and Maheu [22] introduced a semiparametric Bayesian approach incorporating Markov chain Monte Carlo methods to address distributional uncertainty; Fernández-Villaverde, Guerrón-Quintana, and Rubio-Ramírez [23] developed computationally efficient particle filtering algorithms tailored for large-scale SV models. Recent innovations continue to expand the SV paradigm, as evidenced by contributions from Rømer [24], Yazdani, Hadizadeh, and Fakoor [25], Bolko, Christensen, Pakkanen et al. [26], and Chan [27], among others. Notwithstanding these advancements, SV models remain computationally intensive, particularly for parameter estimation and short-term forecasting. ...

A GMM approach to estimate the roughness of stochastic volatility
  • Citing Article
  • September 2022

Journal of Econometrics

... The dependence on the current state of the order book in the dynamics of the order flow has been shown empirically to be of great significance (Muni Toke and Yoshida 2017; Lehalle et al. 2021; Morariu-Patrichi and Pakkanen 2022; Sfendourakis and Muni Toke 2023). Huang et al. (2015) introduce the "queue-reactive" approach to model the dynamics of the LOB and the price of an asset, introducing a Markovian dependence on the current state of the LOB for the arrival intensities of the orders. ...

State-dependent Hawkes processes and their application to limit order book modelling

... Deep Hedging is a classical algorithm that treats hedging of a set of derivatives as a reinforcement learning problem. This algorithm was first introduced in [1,2] and has been further developed in subsequent works such as [34][35][36][37]. In the original approach, the authors use a reinforcement learning environment associated with Deep Hedging that employs finite-horizon MDPs where there is a different state and action space per time-step. ...

Deep Hedging: Learning to Remove the Drift under Trading Frictions with Minimal Equivalent Near-Martingale Measures
  • Citing Preprint
  • November 2021

... Indeed, rough volatility models are the only class of continuous price models that are consistent to a power law of implied volatility term structure typically observed in equity option markets, as shown by [19]. One way to derive the power law under rough volatility models is to prove a large deviation principle (LDP) as done by many authors [11,4,3,13,14,31,33,34,38,35,39,32] using various methods. An introduction to LDP and some of its applications to finance and insurance problems can be found in [44,15]. ...

Pathwise large deviations for the rough Bergomi model: Corrigendum
  • Citing Article
  • September 2021

Journal of Applied Probability