Compartmental Models for COVID-19 and Control via Policy Interventions

We demonstrate an approach to replicate and forecast the spread of the SARS-CoV-2 (COVID-19) pandemic using the toolkit of probabilistic programming languages (PPLs). Our goal is to study the impact of various modeling assumptions and motivate policy interventions enacted to limit the spread of infectious diseases. Using existing compartmental models we show how to use inference in PPLs to obtain posterior estimates for disease parameters. We improve popular existing models to reflect practical considerations such as the under-reporting of the true number of COVID-19 cases and motivate the need to model policy interventions for real-world data. We design an SEI3RD model as a reusable template and demonstrate its flexibility in comparison to other models. We also provide a greedy algorithm that selects the optimal series of policy interventions that are likely to control the infected population subject to provided constraints. We work within a simple, modular, and reproducible framework to enable immediate cross-domain access to the state-of-the-art in probabilistic inference with emphasis on policy interventions. We are not epidemiologists; the sole aim of this study is to serve as an exposition of methods, not to directly infer the real-world impact of policy-making for COVID-19.
SWAPNEEL MEHTA, New York University
NOAH KASMANOFF, New York University
In order to understand and control infectious diseases, it is important to build realistic models
capable of accurately replicating and projecting their transmission in a region [Atkeson 2020b;
Sameni 2020; Tang and Wang 2020]. The underlying assumption is that a model capable of
replicating disease spread has captured the true causal dynamics sufficiently well. Motivated by
this, there has been a spate of research in applying SIR, SEIR, and SEI3RD models to this problem
[COVID et al. 2020; López and Rodó 2020]. These belong to a class of compartmental models that are
underpinned by Lotka-Volterra dynamics that divide a population into sections or compartments
and describe the probabilistic transitions between them through a set of ordinary differential equations.
We extend this direction of work with an emphasis on modeling under-reported cases, decoupling
policy interventions from compartmental transitions, estimating the impact of policy interventions,
selecting a sequence of optimal interventions to control the spread of diseases, and using the
SEI3RD variant as an extension of existing work on SEIR models since it forms a template for many
other extensions of compartmental models [Giordano et al. 2020; Grimm et al. 2021; Kennedy et al. 2020;
Senapati et al. 2020; Winters 2020; Wolff 2020]. The authors of [Hong and Li 2020], much like
us, propose a new statistical tool to visualize analyses of COVID-19 data. However, our framework
makes it much simpler to inspect the intricacies of modeling assumptions, expand with a custom
set of constraints, and experiment with counterfactual simulations without the need for extensive
compute or data. We have created some demo notebooks which will be made available publicly.
In light of probabilistic programming languages (PPLs) [Bingham et al. 2019; Carpenter et al. 2017;
Salvatier et al. 2016; van de Meent et al. 2018] reaching their ’coming-of-age’ moment, COVID-19
modeling is being explored to guide and support policy decisions and decision-makers [de Witt
et al. 2020; Wood et al. 2020]. In this work, we provide data scientists with a concrete example
of applying probabilistic inference to understand disease spread and control. To epidemiologists,
we offer this manuscript as a guide to design compartmental models to evaluate the impact of
policy interventions [Giordano et al. 2020; Mandal et al. 2020; Wang et al. 2020b]. We include clear
motivation and detailed real-world examples of how to manually explore the impact of such
non-pharmaceutical interventions (NPIs) using synthetically generated data and provide an algorithm
to automatically generate strategies for implementing governmental policies for disease control.
Authors’ addresses: Swapneel Mehta, New York University, Center for Data Science; Noah
Kasmanoff, New York University, Center for Data Science.
arXiv:2203.02860v1 [stat.ML] 6 Mar 2022
Fig. 1. The SIR Compartmental Model
Fig. 2. Add an ’Exposed’ compartment to the SIR model to obtain the SEIR model
Our main contributions are as follows:
• Improve existing compartmental models with the ability to deal with the under-reporting of
COVID-19 cases and decoupling NPIs from disease parameters.
• Evaluate our parameter estimates through empirical comparisons with those in prevalent
literature, observed patterns in testing coverage, and highlight the relative consistency of
our predictions compared to anomalies in existing approaches.
• Model fixed policy interventions and describe an algorithm to select adaptive policy
interventions to aid government efforts to limit the spread of disease.
• Highlight the ease of using PPLs to design, extend, and fit SEI3RD models using approximate
inference to obtain posterior disease parameter estimates; in addition to having an
open-source code-base and our self-contained tutorial to be released at the time of publication.
The SIR model dynamics form a template for the increasingly complex compartmental models we
explore. These dynamics are encapsulated in the following differential equations following the
Lotka-Volterra Model:
\[
\frac{dS}{dt} = -\beta S(t)\, i(t), \qquad
\frac{dI}{dt} = \beta S(t)\, i(t) - \gamma I(t), \qquad
\frac{dR}{dt} = \gamma I(t)
\]
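As a rough illustration of these dynamics, the following sketch integrates the three SIR equations with a forward-Euler step. It assumes that S, I and R are population fractions, so that i(t) coincides with I(t); the function name `simulate_sir`, the step size, and the parameter values are illustrative choices and are not taken from this study.

```python
def simulate_sir(beta, gamma, i0=1e-4, days=160, dt=0.1):
    """Forward-Euler integration of the SIR equations on population fractions.

    Assumption: S, I, R are fractions summing to 1, so i(t) equals I(t).
    Returns one (S, I, R) tuple per day, starting with day 0.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    steps_per_day = round(1.0 / dt)
    trajectory = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i * dt   # flow S -> I
            new_recoveries = gamma * i * dt      # flow I -> R
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        trajectory.append((s, i, r))
    return trajectory


# Hypothetical parameters: beta = 0.5 and gamma = 0.2 per day, i.e. R0 = beta / gamma = 2.5.
trajectory = simulate_sir(beta=0.5, gamma=0.2)
peak_day = max(range(len(trajectory)), key=lambda d: trajectory[d][1])
print(f"peak infected fraction {trajectory[peak_day][1]:.3f} on day {peak_day}")
```

Because every flow leaves one compartment and enters another, the three fractions always sum to one, which is a convenient check when extending the template with further compartments.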
The population-level disease parameters we discussed for the SIR model (transmission probability 𝛽
and recovery rate 𝛾) are usually estimated in terms of measurable quantities from observed
data such as the mean recovery time. The response rate 𝜌 indicates the proportion of infections
that are actually observed, because we typically cannot expect to observe every single case of infection.
To be clear, people often do not realize they have been infected in the absence
of extensive testing. This has been the case in most countries at least for the initial few months
of the pandemic, and some strategies attempt to remedy it in different ways [Jagodnik et al. 2020;
Khan et al. 2021; Lachmann 2020]. Since the real-world data we intend to use overlaps significantly
with this duration, it makes sense to add this variable to our model. Importantly, under-reporting may be
exacerbated by socioeconomic and political factors, so 𝜌 allows us to model this under-reporting
of COVID-19 cases in practice. We start by making a guess about the range for the reproduction
number 𝑅0 and response rate 𝜌. This ’guess’ is equivalent to placing empirically informed
(from past outbreaks, for instance) priors on these latent variables and then estimating their posterior
distributions.
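To make the role of the response rate concrete, the sketch below draws 𝑅0 and 𝜌 from illustrative priors and thins a series of true daily infections into reported counts. The prior ranges, the Beta shape parameters, the mean recovery time, the made-up infection counts, and the helper names are assumptions for this example only.

```python
import random


def sample_prior(rng, recovery_time=7.0):
    """Draw one (R0, rho, beta, gamma) tuple from illustrative priors."""
    r0 = rng.uniform(1.0, 4.0)          # assumed prior range for R0
    rho = rng.betavariate(10.0, 10.0)   # assumed prior on the response rate, mean 0.5
    gamma = 1.0 / recovery_time         # recovery rate from an assumed mean recovery time
    beta = r0 * gamma                   # standard SIR relation R0 = beta / gamma
    return r0, rho, beta, gamma


def observe(true_infections, rho, rng):
    """Report each true infection independently with probability rho."""
    return [sum(rng.random() < rho for _ in range(n)) for n in true_infections]


rng = random.Random(0)
r0, rho, beta, gamma = sample_prior(rng)
true_daily = [5, 12, 30, 70, 150, 300]   # made-up true daily infections
reported = observe(true_daily, rho, rng)
print(f"R0={r0:.2f} rho={rho:.2f} reported={reported}")
```

Inference then runs in the opposite direction: given only the reported counts, the posterior over 𝑅0 and 𝜌 is estimated jointly.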
3.1 Using PPLs for Modeling and Inference
PPLs are a natural candidate among the tools we considered for this problem. We would like
to define a time-series probabilistic model with some underlying dynamics that encodes our
assumptions about the data-generating process; then we want to fit this model to the data through
stochastic variational inference [Hoffman et al. 2013; Ranganath et al. 2014], and use the posterior
distributions over its parameters to make predictions about future time-steps. As a sanity check, we
also want to test how well it replicates the historical trajectory of the disease through simulations.
Pyro [Bingham et al. 2019], a deep universal probabilistic programming language, is an excellent
candidate for this because of the following reasons:
• It comprises thin wrappers around PyTorch [Paszke et al. 2019] distributions, allowing us
to write complex generative models interweaving stochastic and deterministic control flow
while still within a familiar and popular machine learning (ML) framework.
• It offers general-purpose inference algorithms out of the box that allow us to shift the focus
from designing custom inference algorithms to building expressive models.
• It recently extended support in terms of an API for epidemiological models. We perform our
experiments in Pyro since it is a much more accessible framework due to the community
around it in comparison to modern alternatives like PyProb [Baydin et al. 2019]. We can
confirm this empirically since we spent some time reproducing work along similar lines in
PyProb [Wood et al. 2020] and found much less time required to do the same in Pyro. We do still
appreciate the relative ease of reproducing it compared to other machine learning research
for healthcare.
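The workflow described in this subsection (specify a generative model, fit it to observed counts, then read off a posterior) can be illustrated without any PPL. The sketch below is not Pyro code and does not use stochastic variational inference: it swaps in a plain grid approximation over 𝑅0 with a Poisson likelihood. The simulator, the fixed response rate, the population size, and all parameter values are assumptions chosen for illustration.

```python
import math


def expected_reports(r0, rho, gamma=0.2, i0=1e-4, population=100_000, days=30):
    """Expected reported daily infections under a daily-step SIR model."""
    beta = r0 * gamma
    s, i = 1.0 - i0, i0
    reports = []
    for _ in range(days):
        new_inf = beta * s * i
        s -= new_inf
        i += new_inf - gamma * i
        reports.append(rho * population * new_inf)
    return reports


def log_poisson(k, lam):
    """Log-probability of count k under a Poisson distribution with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def grid_posterior(observed, rho, grid):
    """Normalised posterior over a grid of R0 values under a flat prior."""
    log_post = []
    for r0 in grid:
        lam = expected_reports(r0, rho, days=len(observed))
        log_post.append(sum(log_poisson(k, l) for k, l in zip(observed, lam)))
    top = max(log_post)                               # subtract max for numerical stability
    weights = [math.exp(lp - top) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


# Synthetic data: rounded expected counts from a known R0, which we then try to recover.
true_r0, rho = 2.5, 0.5
observed = [round(x) for x in expected_reports(true_r0, rho)]
grid = [1.0 + 0.1 * k for k in range(31)]             # R0 in [1.0, 4.0]
posterior = grid_posterior(observed, rho, grid)
posterior_mean = sum(r * p for r, p in zip(grid, posterior))
print(f"posterior mean R0 = {posterior_mean:.2f}")
```

A grid only scales to one or two parameters, which is one reason a general-purpose inference engine becomes attractive once 𝜌 and further compartment rates are treated as latent.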
We consider a partially observed population and want to understand the putative controls that
could achieve a goal defined as "control the spread of the disease", "reduce the death rate", or other
such equivalent outcomes. While some of the parameters that define disease spread are controllable,
there are certain non-controllable parameters, including properties of the disease, that we will infer.
From a computational standpoint, we demonstrate the capability of the universal probabilistic
programming language, Pyro, to combine a system of equations that define a simulator and perform
inference over the latent variables within the simulation to obtain a posterior distribution over the
model parameters. In addition to an inference engine, Pyro offers the capability to intervene on the
variables within this simulation in order to obtain potential outcomes of policy changes that can be
expressed within the language as fixing the values of certain stochastic parameters.
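The idea of an intervention as fixing the value of a parameter inside the simulation can be sketched in plain Python, independent of any PPL. Below, a policy replaces the transmission parameter 𝛽 with a fixed lower value from a chosen day onward, and the counterfactual trajectory is compared with the unmodified one. The function name, the daily-step SIR simulator, and every numeric value are assumptions for illustration.

```python
def simulate_infected(beta, gamma=0.2, i0=1e-4, days=120,
                      intervention_day=None, intervention_beta=None):
    """Daily-step SIR on population fractions; returns the infected fraction per day.

    If intervention_day is given, beta is fixed to intervention_beta from that day on.
    """
    s, i = 1.0 - i0, i0
    infected = []
    for day in range(days):
        active = intervention_day is not None and day >= intervention_day
        b = intervention_beta if active else beta
        new_inf = b * s * i
        s -= new_inf
        i += new_inf - gamma * i
        infected.append(i)
    return infected


baseline = simulate_infected(beta=0.5)
# Counterfactual: fix beta to 0.25 from day 20 onward (a hypothetical distancing policy).
counterfactual = simulate_infected(beta=0.5, intervention_day=20, intervention_beta=0.25)
print(f"peak without intervention: {max(baseline):.3f}")
print(f"peak with intervention:    {max(counterfactual):.3f}")
```

The two runs are identical up to the intervention day, so any later difference is attributable to the fixed parameter alone.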
Fig. 3. (a) Posterior distribution for 𝑅0. (b) Posterior distribution for 𝜌.
Fig. 4. Predicting Future Infections for a Simulated Dataset in a Highly Infectious Setting
For our simulation-based study, we used the SIR, SEIR, and SEI3RD models and refined versions of
each, wherein we start with an infected population of 0.01%. We conduct some simulation
studies for instance by setting the
85. We perform variational inference [Hoffman et al. 2013; Ranganath et al. 2014] to obtain
posterior parameter estimates close to the true values.
The true values are indicated by the vertical black line in Figs. 3a and 3b. Since the posterior
point-estimates are close to the true values for the simulation, we can see why we are able to accurately
replicate the disease spread shown in Fig. 4. We plot the daily infections versus time, and the plot
resembles a single wave of infected patients. The shaded portion indicates a 90% confidence interval for the
model’s predictions, which closely tracks the true values of disease spread. While it is no surprise
that SIR, SEIR, and SEI3RD models perform well on simulated data, there are also extensive studies
applying the SEIR model in practice. However, we believe that the SEIR model also has its
failures (see Qualitative Analysis of Successes and Failures) that often go undetected due to a lack
of comprehensive evaluation. We show this through an extensive set of experiments on real-world
data across multiple geographies. The tabular comparison of the reproduction number, or 𝑅0,
estimated via our models is shown in Table 1 for different time periods, and offers the
following insights:
• The reported cases for the initial period of January - July 2020 indicate far less testing than
the full periods (up to May 2021 for the USA and January 2021 for the rest of the world, in
our dataset).
• The estimated 𝑅0 seemed high for most regions up to April 2020, and even though longer-term
estimates denoted an 𝑅0 of a little over 1 for most regions, the spate of cases underscores the
need to reduce the spread of the infection. At the same time, the reproduction number is not
all that matters and a careful study of mortality rates is warranted as testing increases and
fatalities decrease in order to draw concrete scientific conclusions and policy recommendations
given this evidence.
• We show qualitatively why it is necessary to build more granular, flexible compartmental
models and underscore the need for separately modeling policy interventions as human
influence on these processes.
The reference values in Table 1 (Comparing 𝑅0 estimates with literature) are drawn from prevalent literature
and expert-curated resources on the subject [Gunzler and Sehgal 2020; Kamalich Muniz-Rodriguez
et al. [n. d.]; Lau et al. 2020; Prodanov 2021]. A certain George E. P. Box would be wont to say "All
𝑅0 estimates are wrong but some may be useful", and we illustrate this via an empirical and rather
qualitative route. Firstly, in some cases the reference method itself presents anomalous forecasts,
like the 𝑅0 for Germany throughout the year being predicted as an unusual 22.032, and that of the
Netherlands as 9.103, from [Prodanov 2021]. In comparison, we observe relative consistency at least at
the model level, which might be a strong signal to consider more realistic modeling choices when
using real, noisy data for estimation. Where we lack consistency, we have a signal in the form of a
set of forecasts as well as an additional estimate of 𝜌. An oddly low 𝜌, for instance, might convince
us of an estimation error despite having high confidence and a good fit to the observed data. We
observe that the SEI3RD models perform better than the SEIR and SIR models, which might lack
the capacity to effectively model transitions. We also study a refined set of compartmental models
which consider an initial, partially infected (0.01%) population instead of starting from a single
infection (’patient zero’).
Once we successfully model disease progression, we can forecast for a set of future time-steps to
recover potential new infections. We can then examine counterfactual questions of the nature ’What
would have happened to the number of cases if the government had enacted certain steps at a given
time?’ by introducing policy interventions at this stage, and figure out the globally minimally invasive
intervention that will keep the infected populace within the desired thresholds.
An adaptive algorithm to determine the optimal policy interventions is developed in this paper
as we continue to explore breadth-wise analyses pertaining to COVID-19 control, and compare
our models to the impact of real-world interventions. This discussion, albeit a core contribution, is
relegated to the appendix.
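The following is only a minimal greedy sketch of the idea of choosing minimally invasive interventions under a threshold on the infected population; it is not the procedure given in the appendix. At each decision point it tries candidate interventions from least to most invasive and keeps the first one whose simulated infected fraction stays below the threshold over a look-ahead window. The candidate list, the encoding of an intervention as a multiplier on 𝛽, the threshold, and the window lengths are all assumptions.

```python
def step(s, i, beta, gamma):
    """Advance a daily-step SIR model (population fractions) by one day."""
    new_inf = beta * s * i
    return s - new_inf, i + new_inf - gamma * i


def peak_over(s, i, beta, gamma, horizon):
    """Largest infected fraction reached within `horizon` days from state (s, i)."""
    peak = i
    for _ in range(horizon):
        s, i = step(s, i, beta, gamma)
        peak = max(peak, i)
    return peak


def greedy_policy(beta, gamma, threshold, candidates, i0=1e-4,
                  days=180, period=14, horizon=28):
    """Every `period` days pick the least invasive candidate that keeps the
    simulated infected fraction under `threshold` for the next `horizon` days."""
    ordered = sorted(candidates, key=lambda c: c[1], reverse=True)  # mildest first
    s, i = 1.0 - i0, i0
    schedule, infected = [], []
    for day in range(days):
        if day % period == 0:
            # Fall back to the strictest candidate if none satisfies the constraint.
            choice = ordered[-1]
            for name, multiplier in ordered:
                if peak_over(s, i, beta * multiplier, gamma, horizon) <= threshold:
                    choice = (name, multiplier)
                    break
            schedule.append((day, choice[0]))
        s, i = step(s, i, beta * choice[1], gamma)
        infected.append(i)
    return schedule, infected


# Hypothetical interventions: (label, multiplier applied to beta).
candidates = [("none", 1.0), ("masks", 0.8), ("distancing", 0.6), ("lockdown", 0.3)]
schedule, infected = greedy_policy(beta=0.5, gamma=0.2, threshold=0.02,
                                   candidates=candidates)
print(schedule)
print(f"peak infected fraction: {max(infected):.4f}")
```

The choice at each decision point is myopic: it looks only one window ahead and does not revisit earlier decisions, so the resulting schedule is not guaranteed to be globally optimal.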
As responsible data scientists attempting to provide a tool for epidemiological analysis, it is
important not to overstate the relevance of estimating the correct 𝑅0 from the data, since that is not the
only parameter of interest. Many numbers have been bandied about in the news under the
assumption that controlling 𝑅0 implies controlling the pandemic. To a certain degree this is true; however,
there are important caveats to this notion that some have expanded upon [Hébert-Dufresne et al. 2020;
Maruotti et al. 2021]. The summary of the discussion is that 𝑅0 must be used as a tool to
paint a partial picture of a disease spread, in conjunction with multiple factors. In this regard,
our consideration of 𝜌, modeling of infected fractions of populations, and consideration of policy
interventions are significant steps taken to provide a comprehensive idea of the state of a pandemic.
We would like to acknowledge guidance and support from Kyle Cranmer and Rajesh Ranganath
that laid the foundations for this research project.
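Table 1. Comparing 𝑅0 estimates with literature
| Region | Model | 𝑅0 (Jan - April, ’20) | 𝜌 (Jan - April, ’20) | Ref. 𝑅0 (Jan - April, ’20) | 𝑅0 (Jan - Dec, ’20) | 𝜌 (Jan - Dec, ’20) | Ref. 𝑅0 (Jan - Dec, ’20) |
|---|---|---|---|---|---|---|---|
| | SIR | 1.48 (𝜎=.01) | 0.187 | 2.676 | 1.34 | 0.125 | 1.8024 |
| | SIR(i) | 1.712 | 0.76 (𝜎=.03) | - | 0.608 | 0.506 | - |
| | SEIR | 2.27 (𝜎=.02) | 0.11 | - | 1.49 | 0.14 | - |
| | SEIR(i) | 0.403 | 0.56 | - | 1.96 | 0.178 | - |
| | SEI3RD | 3.03 (𝜎=.07) | 0.557 | - | 5.78 | 0.001 | - |
| | SEI3RD(i) | 1.247 | 0.532 | - | 0.215 | 0.505 | - |
| Netherlands | SIR | 1.69 | 0.36 (𝜎=.01) | 1.9962 | 1.724 (𝜎=0.84) | 0.71 | 9.103 |
| Netherlands | SIR(i) | 1.699 (𝜎=.01) | 0.54 (𝜎=.015) | - | 0.490 | 0.5079 | - |
| Netherlands | SEIR | 4.51 (𝜎=.011) | 3.24 (𝜎=.01) | - | 1.76 | 0.45 | - |
| Netherlands | SEIR(i) | 10.048 | 0.71 | - | 0.236 | 0.50 | - |
| Netherlands | SEI3RD | 4.23 (𝜎=.04) | 0.735 | - | 2.78 | 0.154 | - |
| Netherlands | SEI3RD(i) | 3.763 (𝜎=.03) | 0.70 | - | 3.51 | 0.108 | - |
| New York | SIR | 1.432 | 0.54 (𝜎=.016) | 1.21 | 1.475 | 0.027 | 0.81 |
| New York | SIR(i) | 1.643 | 0.594 | - | 2.078 | 0.619 | - |
| New York | SEIR | 0.88 | 0.50 | - | 0.81 | 0.50 | - |
| New York | SEIR(i) | 4.09 (𝜎=.01) | 0.04 | - | 4.98 | 0.14 | - |
| New York | SEI3RD | 1.53 | 0.504 | - | 1.67 | 0.503 | - |
| New York | SEI3RD(i) | 10.79 (𝜎=.006) | 0.0795 | - | 12.295 | 0.253 | - |
| Germany | SIR | 0.108 | 0.5186 | 1.637 | 1.610 | 0.120 | 22.032 |
| Germany | SIR(i) | 1.643 | 0.594 | - | 2.078 | 0.619 | - |
| Germany | SEIR | 2.51 (𝜎=.013) | 0.76 (𝜎=.01) | - | 2.93 (𝜎=.1) | 0.154 | - |
| Germany | SEIR(i) | 1.973 | 0.213 | - | 5.02 (𝜎=.47) | 0.57 | - |
| Germany | SEI3RD | 0.572 | 0.55 | - | 5.56 (𝜎=.08) | 0.01 | - |
| Germany | SEI3RD(i) | 1.37 | 0.52 | - | 4.7 | 0.124 | - |
| | SIR | 3.74 | 0.303 | 1.45 | 3.296 | 0.343 | 0.84 |
| | SIR(i) | 4.075 | 0.302 | - | 3.661 | 0.039 | - |
| | SEIR | 5.3 (𝜎=.17) | 0.015 | - | 3.47 | 0.56 | - |
| | SEIR(i) | 4.97 (𝜎=.078) | 0.014 | - | 5.36 | 0.096 | - |
| | SEI3RD | 4.09 (𝜎=.17) | 0.5483 | - | 1.32 | 0.501 | - |
| | SEI3RD(i) | 6.8 | 0.049 | - | 5.62 | 0.205 | - |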
Andrew Atkeson. 2020a. How deadly is COVID-19? Understanding the difficulties with estimation of its fatality rate. Technical
Report. National Bureau of Economic Research.
Andrew Atkeson. 2020b. What will be the economic impact of COVID-19 in the US? Rough estimates of disease scenarios.
Technical Report. National Bureau of Economic Research.
Andrew Atkeson. 2021. A Parsimonious Behavioral SEIR Model of the 2020 COVID Epidemic in the United States and the
United Kingdom. Technical Report. National Bureau of Economic Research.
Atilim Güneş Baydin, Lei Shao, Wahid Bhimji, Lukas Heinrich, Lawrence Meadows, Jialin Liu, Andreas Munk, Saeid
Naderiparizi, Bradley Gram-Hansen, Gilles Louppe, et al. 2019. Etalumis: Bringing probabilistic programming to scientific
simulators at scale. In Proceedings of the international conference for high performance computing, networking, storage and
analysis. 1–24.
Andrea L Bertozzi, Elisa Franco, George Mohler, Martin B Short, and Daniel Sledge. 2020. The challenges of modeling and
forecasting the spread of COVID-19. Proceedings of the National Academy of Sciences 117, 29 (2020), 16732–16738.
Eli Bingham, Jonathan P Chen, Martin Jankowiak, Fritz Obermeyer, Neeraj Pradhan, Theofanis Karaletsos, Rohit Singh,
Paul Szerlip, Paul Horsfall, and Noah D Goodman. 2019. Pyro: Deep universal probabilistic programming. The Journal of
Machine Learning Research 20, 1 (2019), 973–978.
Shelby R Buckman, Reuven Glick, Kevin J Lansing, Nicolas Petrosky-Nadeau, and Lily M Seitelman. 2020. Replicating and
projecting the path of COVID-19 with a model-implied reproduction number. Infectious Disease Modelling 5 (2020),
Bob Carpenter, Andrew Gelman, Matthew D Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus A Brubaker,
Jiqiang Guo, Peter Li, and Allen Riddell. 2017. Stan: a probabilistic programming language. Grantee Submission 76, 1
(2017), 1–32.
Haonan Chen, Jing He, Wenhui Song, Lianchao Wang, Jiabao Wang, and Yijin Chen. 2020. Modeling and interpreting the
COVID-19 intervention strategy of China: A human mobility view. PloS one 15, 11 (2020), e0242761.
Derek K Chu, Elie A Akl, Stephanie Duda, Karla Solo, Sally Yaacoub, Holger J Schünemann, Amena El-harakeh, Antonio
Bognanni, Tamara Lotfi, Mark Loeb, et al. 2020. Physical distancing, face masks, and eye protection to prevent
person-to-person transmission of SARS-CoV-2 and COVID-19: a systematic review and meta-analysis. The Lancet 395, 10242
(2020), 1973–1987.
Team IHME COVID, RC Reiner, RM Barber, and JK Collins. 2020. Modeling COVID-19 scenarios for the United States.
Nature medicine (2020).
Christian Schroeder de Witt, Bradley Gram-Hansen, Nantas Nardelli, Andrew Gambardella, Rob Zinkov, Puneet Dokania,
N Siddharth, Ana Belen Espinosa-Gonzalez, Ara Darzi, Philip Torr, et al. 2020. Simulation-Based Inference for Global
Health Decisions. arXiv preprint arXiv:2005.07062 (2020).
Giulia Giordano, Franco Blanchini, Raffaele Bruno, Patrizio Colaneri, Alessandro Di Filippo, Angela Di Matteo, and Marta
Colaneri. 2020. Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy. Nature
medicine 26, 6 (2020), 855–860.
Paolo Giudici and Emanuela Raffinetti. 2020. Monitoring Covid-19 Policy Interventions. Frontiers in Public Health 8 (2020).
Veronika Grimm, Friederike Mengel, and Martin Schmidt. 2021. Extensions of the SEIR model for the analysis of tailored
social distancing and tracing approaches to cope with COVID-19. Scientific Reports 11, 1 (2021), 1–16.
Douglas D. Gunzler and Ashwini R. Sehgal. 2020. Time-Varying COVID-19 Reproduction Number in the United States.
medRxiv (2020).
Laurent Hébert-Dufresne, Benjamin M. Althouse, Samuel V. Scarpino, and Antoine Allard. 2020. Beyond R0: the importance
of contact tracing when predicting epidemics. medRxiv (2020).
Matthew D Homan, David M Blei, Chong Wang, and John Paisley. 2013. Stochastic variational inference. Journal of
Machine Learning Research 14, 5 (2013).
Hyokyoung G Hong and Yi Li. 2020. Estimation of time-varying reproduction numbers underlying epidemiological processes:
A new statistical tool for the COVID-19 pandemic. PloS one 15, 7 (2020), e0236464.
Amirhoshang Hoseinpour Dehkordi, Majid Alizadeh, Pegah Derakhshan, Peyman Babazadeh, and Arash Jahandideh. 2020.
Understanding epidemic data and statistics: A case study of COVID-19. Journal of medical virology 92, 7 (2020), 868–882.
International Monetary Fund. [n. d.]. Policy Responses to COVID-19.
Kathleen M Jagodnik, Forest Ray, Federico M Giorgi, and Alexander Lachmann. 2020. Correcting under-reported COVID-19
case numbers: estimating the true scale of the pandemic. medRxiv (2020).
DrPH Kamalich Muniz-Rodriguez, Gerardo Chowell, Jessica S Schwind, Randall Ford, Sylvia K Ofori, Chigozie A Ogwara,
Margaret R Davies, Terrence Jacobs, Chi-Hin Cheung, Logan T Cowan, et al. [n. d.]. Time-varying Reproduction Numbers
of COVID-19 in Georgia, USA, March 2, 2020 to November 20, 2020. ([n. d.]).
Deanna M Kennedy, Gustavo José Zambrano, Yiyu Wang, and Osmar Pinto Neto. 2020. Modeling the effects of intervention
strategies on COVID-19 transmission dynamics. Journal of Clinical Virology 128 (2020), 104440.
Soban Qadir Khan, Imran Alam Moheet, Faraz Ahmed Farooqi, Muhanad Alhareky, and Faisal Alonaizan. 2021.
Under-reported COVID-19 cases in South Asian countries. F1000Research 10, 88 (2021), 88.
Adam J Kucharski, Timothy W Russell, Charlie Diamond, Yang Liu, John Edmunds, Sebastian Funk, Rosalind M Eggo, Fiona
Sun, Mark Jit, James D Munday, et al. 2020. Early dynamics of transmission and control of COVID-19: a mathematical
modelling study. The lancet infectious diseases 20, 5 (2020), 553–558.
Alexander Lachmann. 2020. Correcting under-reported COVID-19 case numbers. MedRxiv (2020).
Max SY Lau, Bryan Grenfell, Michael Thomas, Michael Bryan, Kristin Nelson, and Ben Lopman. 2020. Characterizing
superspreading events and age-specific infectiousness of SARS-CoV-2 transmission in Georgia, USA. Proceedings of the
National Academy of Sciences 117, 36 (2020), 22430–22435.
Nancy HL Leung, Daniel KW Chu, Eunice YC Shiu, Kwok-Hung Chan, James J McDevitt, Benien JP Hau, Hui-Ling Yen,
Yuguo Li, Dennis KM Ip, JS Malik Peiris, et al. 2020. Respiratory virus shedding in exhaled breath and efficacy of face
masks. Nature medicine 26, 5 (2020), 676–680.
Ruiyun Li, Sen Pei, Bin Chen, Yimeng Song, Tao Zhang, Wan Yang, and Jeffrey Shaman. 2020. Substantial undocumented
infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2). Science 368, 6490 (2020), 489–493.
Mingming Liang, Liang Gao, Ce Cheng, Qin Zhou, John Patrick Uy, Kurt Heiner, and Chenyu Sun. 2020. Efficacy of face
mask in preventing respiratory virus transmission: A systematic review and meta-analysis. Travel medicine and infectious
disease 36 (2020), 101751.
Leonardo López and Xavier Rodó. 2020. The end of social confinement and COVID-19 re-emergence risk. Nature Human
Behaviour 4, 7 (2020), 746–755.
Cesar Manchein, Eduardo L Brugnago, Rafael M da Silva, Carlos FO Mendes, and Marcus W Beims. 2020. Strong correlations
between power-law growth of COVID-19 in four continents and the inefficiency of soft quarantine strategies.
Chaos: An Interdisciplinary Journal of Nonlinear Science 30, 4 (2020), 041102.
Sandip Mandal, Tarun Bhatnagar, Nimalan Arinaminpathy, Anup Agarwal, Amartya Chowdhury, Manoj Murhekar, Raman R
Gangakhedkar, and Swarup Sarkar. 2020. Prudent public health intervention strategies to control the coronavirus disease
2019 transmission in India: A mathematical model-based approach. The Indian journal of medical research 151, 2-3 (2020),
Antonello Maruotti, Massimo Ciccozzi, and Fabio Divino. 2021. On the misuse of the reproduction number in the COVID-19
surveillance system in Italy. Journal of Medical Virology (2021).
Faïçal Ndaïrou, Iván Area, Juan J Nieto, and Delfim FM Torres. 2020. Mathematical modeling of COVID-19 transmission
dynamics with a case study of Wuhan. Chaos, Solitons & Fractals 135 (2020), 109846.
Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin,
Natalia Gimelshein, Luca Antiga, et al. 2019. Pytorch: An imperative style, high-performance deep learning library. arXiv
preprint arXiv:1912.01703 (2019).
Dimiter Prodanov. 2021. Analytical parameter estimation of the SIR epidemic model. Applications to the COVID-19
pandemic. Entropy 23, 1 (2021), 59.
Rajesh Ranganath, Sean Gerrish, and David Blei. 2014. Black box variational inference. In Artificial intelligence and statistics.
PMLR, 814–822.
John Salvatier, Thomas V Wiecki, and Christopher Fonnesbeck. 2016. Probabilistic programming in Python using PyMC3.
PeerJ Computer Science 2 (2016), e55.
Reza Sameni. 2020. Mathematical modeling of epidemic diseases; a case study of the COVID-19 coronavirus. arXiv preprint
arXiv:2003.11371 (2020).
Abhishek Senapati, Sourav Rana, Tamalendu Das, and Joydev Chattopadhyay. 2020. Impact of intervention on the spread of
COVID-19 in India: A model based study. arXiv preprint arXiv:2004.04950 (2020).
Yuanji Tang and Shixia Wang. 2020. Mathematic modeling of COVID-19 in the United States. Emerging microbes & infections
9, 1 (2020), 827–829.
Robin N Thompson. 2020. Epidemiological models are important tools for guiding COVID-19 interventions. BMC medicine
18, 1 (2020), 1–4.
United States Government. [n. d.]. Government Response to Coronavirus, COVID-19.
Jan-Willem van de Meent, Brooks Paige, Hongseok Yang, and Frank Wood. 2018. An introduction to probabilistic
programming. arXiv preprint arXiv:1809.10756 (2018).
Giovani L Vasconcelos, Antônio MS Macêdo, Raydonal Ospina, Francisco AG Almeida, Gerson C Duarte-Filho, Arthur A
Brum, and Inês CL Souza. 2020. Modelling fatality curves of COVID-19 and the effectiveness of intervention strategies.
PeerJ 8 (2020), e9421.
R Verity, LC Okell, and I Dorigatti. 2020. Estimates of the severity of coronavirus disease 2019: a model-based analysis (vol
20, pg 669, 2020). (2020).
Chaolong Wang, Li Liu, Xingjie Hao, Huan Guo, Qi Wang, Jiao Huang, Na He, Hongjie Yu, Xihong Lin, An Pan, et al. 2020a.
Evolving epidemiology and impact of non-pharmaceutical interventions on the outbreak of coronavirus disease 2019 in
Wuhan, China. MedRxiv (2020).
Lili Wang, Yiwang Zhou, Jie He, Bin Zhu, Fei Wang, Lu Tang, Michael Kleinsasser, Daniel Barker, Marisa C Eisenberg,
and Peter XK Song. 2020b. An epidemiological forecast model and software assessing interventions on the COVID-19
epidemic in China. Journal of Data Science 18, 3 (2020), 409–432.
Jack M Winters. 2020. A Novel Model for Simulating COVID-19 Dynamics Through Layered Infection States that Integrate
Concepts from Epidemiology, Biophysics and Medicine: SEI3R2S-Nrec. medRxiv (2020).
Michael Wol. 2020. On build-up of epidemiologic models—Development of a SEI3RSD model for the spread of SARS-CoV-2.
ZAMM-Journal of Applied Mathematics and Mechanics/Zeitschrift für Angewandte Mathematik und Mechanik 100, 11
(2020), e202000230.
Frank Wood, Andrew Warrington, Saeid Naderiparizi, Christian Weilbach, Vaden Masrani, William Harvey, Adam Scibior,
Boyan Beronov, and Ali Nasseri. 2020. Planning as inference in epidemiological models. arXiv preprint arXiv:2003.13221 (2020).
We have attempted to fit many different kinds of models and obtained certain parameter estimates
which we compare to other, equally noisy parameter estimates from the literature. The fact is, each
country has dealt with COVID-19 in different ways. While China seemed to be able to take stringent
measures and limit the spread, others like Italy were not able to follow the same approach. There
was a strong focus on the fast-rising numbers of infected people in the United States (September,
2020), quickly overshadowed by a worse state of affairs in India (May, 2021). It is crucial to analyse
how well each model is able to not only fit the data but also forecast the spread of disease accurately.
Since we have trained some models on partial amounts of data, let us take a look at which of the
models can identify upcoming ’waves’ in the progress of the disease.
For instance, in Fig. 5 (vertical bars mark the beginning of forecasts), the SEIR and the
SEI3RD models are able to capture the second wave, with the latter being able to, impressively
enough, predict the second wave having only observed the first. Similarly, in Fig. 6 we can see
that both the SIR and SEIR models seem to perform poorly at data fitting and forecasting. Note
that in both cases, the notion of the ’best model’ as indicated by comparing estimated 𝑅0 does
not seem to hold, emphasising the need to consider the least restrictive models when conducting
such studies. Of course, the SEI3RD model is not a panacea. We offer two additional examples
of its forecasts in Fig. 8 that convince us we need to model an external factor that allows us to modify
the second, larger wave (peak) in a manner similar to how government interventions delayed
the infections from successively peaking as per the predictions of the SEI3RD model. Fig. 8 also
shows the confidence intervals available for the forecast. In general, while the SEI3RD model
performs well at capturing multiple waves of COVID-19 infections, we are making a fundamentally
incorrect modeling assumption by ignoring the fact that disease evolution was hindered by human
intervention, in particular via government policies designed to limit its spread. In fact, we can
utilise these ’vanilla’ SEI3RD models that are reasonably good at predicting the potential spread of
a disease to help us figure out how best to limit it! This motivates us to study Policy Interventions.
There is a vast body of literature on compartmental models for epidemiology. In particular, the
SEIR model seems the most popular and widely applied for real-world correspondence possibly
owing to a trade-off between simplicity and effectiveness [Buckman et al. 2020]. In [COVID et al.
2020] the team fits an SEIR model to mortality data in an effort to examine possible trajectories of
COVID-19 infections at the state level. In particular, they conduct a review of COVID-19 models on a
state-wise basis with particular emphasis on charting out potential scenarios in terms of the number
of fatalities in the presence of various non-pharmaceutical interventions. They forecast the best and
worst case outcomes in terms of these numbers and make recommendations to ensure the safety
of the US population in case of epidemic resurgences in many states. The authors of [López and
Rodó 2020] use an SEIR model to conduct a study of the recurrence of the COVID-19 pandemic via
different post-confinement scenarios. Their work highlights the importance of non-pharmaceutical
interventions due to the re-emergence risk from the time-decay of acquired immunity and lack of
effective pharmaceutical interventions.
There is, however, significant work that has gone into building more expressive, realistic, informed
models, some of which are SIDARTHE [Giordano et al. 2020], SEI3HR in India [Senapati et al. 2020],
SUEIHCDR in Brazil [Kennedy et al. 2020], SEI3RSD [Wolff 2020], SEI3Q3RD [Grimm et al. 2021],
SEI3R2S [Winters 2020], and others [Ndaïrou et al. 2020]. SEIR models for the spread
of disease [Buckman et al. 2020; COVID et al. 2020] are prevalent in the existing literature
Fig. 5. Fiing SIR (top le), SEI3RD (boom right) and SEIR models to data from Italy
Fig. 6. Fiing SIR (top le), SEIR (top right) and SEI3RD models to data from the Netherlands
with multitudinous global, national, and regional studies conducted through studying the disease
transmission dynamics modeled by its compartmental transitions. We conjecture that the SEIR
model can be improved by reducing the modeling assumptions implicit in its definition, for example
by explicitly separating the recovered individuals from the fatalities out of all those in the final
’removed’ compartment. In the framework offered by Pyro, it is straightforward to modify the
Fig. 7. Fiing SIR (top le), SEIR (top right) and SEI3RD models to data from New York
Fig. 8. Fiing SEI3RD models to data from Germany and Georgia (right)
compartmental model structure which makes it extremely useful as a toolkit for practitioners
interested in an experimental perspective on epidemiological modeling.
As a template, we consider the idea of three-layered infection states corresponding to increasingly
infectious ’spreaders’ motivated by the need to tie epidemiology with medical physiology through
modeling causal dynamic processes in time [Winters 2020] for defining an SEI3RD model as shown
in Figure 9.
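To make the three-layered infection structure concrete, the following is a minimal deterministic sketch of an SEI3RD system, not the probabilistic implementation used in this work. It assumes a formulation in which I1, I2 and I3 are successive infection states, each may recover, only I3 leads to death, and the dynamics are integrated with a forward-Euler step. The function name simulate_sei3rd and every rate value are hypothetical placeholders chosen for illustration, not estimates from this study.

```python
def simulate_sei3rd(days, n=1_000_000.0, e0=10.0,
                    beta=(0.25, 0.35, 0.45),   # hypothetical transmission rates of I1, I2, I3
                    alpha=0.2,                 # hypothetical rate E -> I1
                    gamma=(0.13, 0.12, 0.08),  # hypothetical recovery rates of I1, I2, I3
                    p=(0.03, 0.04),            # hypothetical progression rates I1 -> I2, I2 -> I3
                    kappa=0.04,                # hypothetical death rate from I3
                    dt=0.1):
    """Forward-Euler integration of a deterministic SEI3RD system.

    Returns one (S, E, I1, I2, I3, R, D) tuple per day, including day 0.
    """
    s, e, i1, i2, i3, r, d = n - e0, e0, 0.0, 0.0, 0.0, 0.0, 0.0
    steps_per_day = round(1 / dt)
    trajectory = [(s, e, i1, i2, i3, r, d)]
    for _ in range(days):
        for _ in range(steps_per_day):
            # All three infection states contribute to new exposures.
            new_exposed = (beta[0] * i1 + beta[1] * i2 + beta[2] * i3) * s / n
            ds = -new_exposed
            de = new_exposed - alpha * e
            di1 = alpha * e - (gamma[0] + p[0]) * i1
            di2 = p[0] * i1 - (gamma[1] + p[1]) * i2
            di3 = p[1] * i2 - (gamma[2] + kappa) * i3
            dr = gamma[0] * i1 + gamma[1] * i2 + gamma[2] * i3
            dd = kappa * i3
            s, e, i1, i2, i3, r, d = (
                s + dt * ds, e + dt * de, i1 + dt * di1, i2 + dt * di2,
                i3 + dt * di3, r + dt * dr, d + dt * dd,
            )
        trajectory.append((s, e, i1, i2, i3, r, d))
    return trajectory


trajectory = simulate_sei3rd(days=180)
# Day on which the total infected population (I1 + I2 + I3) is largest.
peak_day = max(range(len(trajectory)), key=lambda t: sum(trajectory[t][2:5]))
print("peak day of total infections:", peak_day)
```

Separating R from D, as in this sketch, is the kind of structural change that is easy to express once the model is written as explicit compartment flows.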
For researchers, an excellent approach to reproducibly expand upon the existing literature could
be following the approach of [Ndaïrou et al. 2020]. They examine an eight-compartment model to
obtain disease parameter estimates of a COVID-19 variant in Wuhan, China. They then conduct
a sensitivity analysis of their model to examine the variance with respect to each parameter and
compare their results with a numerical simulation to examine its suitability. Our motivation with
the SEI3RD model is, in a similar vein, to expand upon the line of work leading to the SEIR model but
with the dierence of oering an open-source template to introduce new variants of compartmental
models that can dier regionally.
Fig. 9. The SEI3RD Model from [Wood et al. 2020] along with our policy intervention parameter 𝑢
C.1 Policy Interventions
The use of surgical face masks and face shields by healthcare and non-healthcare workers alike
has been shown to significantly reduce or prevent the transmission of human coronaviruses and
influenza viruses through respiratory droplets from symptomatic individuals in confined spaces
[Chu et al. 2020; Leung et al. 2020; Liang et al. 2020]. This is an example of a type of intervention that
falls into the class of non-pharmaceutical interventions (NPIs) which are important to model in light
of the time it requires to implement substantial pharmaceutical interventions (PIs) such as vaccines.
Our simulation, therefore, focuses on the modeling of NPIs, or what we term policy interventions,
to allow us to build world-models that are reflective of disease spread in the absence of effective PIs.
While vaccines may be the most effective long-term solution, there is significant impact of short
term NPIs such as isolation and contact tracing [Bertozzi et al. 2020; Grimm et al. 2021]. For this
reason, our work focuses extensively on a framework to explore intervention strategies without
the need for human-supervised search.
When governments deal with diseases, they may take certain measures that result in limiting
the exposure of the population to the disease. These measures may range from mild as in washing
hands to stringent as in enforcing a complete lockdown [International Monetary Fund [n. d.];
United States Government [n. d.]]. While some lines of work focus on how to monitor their impact
[Giudici and Raffinetti 2020; Vasconcelos et al. 2020], we propose an algorithm to explore new
strategies for control. The goal of epidemiological modeling, particularly of the spread of pandemics,
is to be able to infer the actions necessary to limit their spread. Every policy to address this is
designed to intervene on the rate of spread through natural or artificial means, for a short or
long term duration. However, most modeling approaches treat it as affecting the transmission in a
deterministic manner whereas in reality, the impact of government policies varies with time. For
this reason, we decouple modeling of the policy intervention 𝑢 from the transmission probability.
This allows us to introduce our greedy search algorithm described in Algorithm 1 (also see Figure 12).
Modelling policy interventions for COVID-19 has been the focus of a large body of work [Chen
et al. 2020; Giordano et al. 2020; Giudici and Raffinetti 2020; Vasconcelos et al. 2020]. In our work,
we start with a similar implementation of a policy intervention defined by the parameter 𝑢 that
modifies the SIR compartmental model as per the equation below:
Recall that the transmission probability governs the movement of an individual from the pool of
susceptible individuals to the next compartment, which is model-dependent. Changing the values of this policy
Fig. 10. Visualizing the disease progression with time (along x-axis) and infected population fraction (along
the y-axis) under varying levels of the policy intervention parameter 𝑢 for SIR.
intervention parameter manually, as in 𝑢 taking on a deterministic set of values, and simulating
the resulting trajectories results in the plot shown in Figure 10 for the SIR model and Figure 11 for the SEIR
model. The horizontal dashed line indicates the 10% threshold of infections. This approach allows
us to monitor what could have been the impact of governmental policies such as a lockdown
implemented over a long time period. However, 𝑢 may not be the same at all points in time. Thus,
we introduce Algorithm 1 to consider a sequence of policy interventions that is analogous to
periodic policy updates or a variance in the impact of the same policies. A visual analogy of the
algorithm is offered by illustrating the best and worst case outcomes of selecting different policy
interventions at two time-steps in Figure 12 with a description below.
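The manual sweep over the intervention parameter can be sketched as follows. This is a minimal, deterministic illustration that assumes the intervention scales the SIR transmission term by (1 - u), so that u = 0 means no intervention and u = 1 stops transmission entirely. The function simulate_sir and the values of beta, gamma and the initial infected fraction are hypothetical placeholders, not the parameters behind Figure 10.

```python
def simulate_sir(u, days=300, beta=0.3, gamma=0.1, i0=1e-4, dt=0.1):
    """Daily infected fraction of an SIR model with a constant intervention u.

    Assumed form (population normalised to 1):
        dS/dt = -(1 - u) * beta * S * I
        dI/dt =  (1 - u) * beta * S * I - gamma * I
    """
    s, i = 1.0 - i0, i0
    infected = [i]
    steps_per_day = round(1 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = (1.0 - u) * beta * s * i
            recoveries = gamma * i
            s, i = s - dt * new_infections, i + dt * (new_infections - recoveries)
        infected.append(i)
    return infected


THRESHOLD = 0.10  # the 10% infection threshold used in the text

for u in (0.0, 0.2, 0.4, 0.6):
    peak = max(simulate_sir(u))
    print(f"u={u:.1f}  peak infected fraction={peak:.3f}  "
          f"within threshold={peak <= THRESHOLD}")
```

Holding u fixed for the whole horizon corresponds to a single long-running policy, which is the limitation that a time-varying sequence of interventions is meant to address.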
Fig. 11. Visualizing the disease progression with time (along x-axis) and infected population fraction (along
the y-axis) under varying levels of the policy intervention parameter 𝑢 for the SEIR model.
Algorithm 1: Adaptive Interventions through Greedy Search
Result: Sequence of policy interventions
initialize sequence;
while 𝑡 < 𝑇 do
    simulate compartmental transitions;
    for 𝑢 = 𝑢𝑖, increasing from 0 to 1 do
        simulate a trajectory with 𝑅0, 𝜌;
        if infections ≤ threshold then
            add 𝑢𝑖 to sequence;
The current fraction of infected population is represented in bright red as a solid line. At
each timestep, we perform a simulation of possible trajectories for different values of the policy
Fig. 12. Adaptively Changing the Policy Intervention
𝑢. For example, at a time 𝑡=50 days we simulate the possible trajectories (dashed
lines). The plots show the best and worst-case options for the progress of the disease depending on
the magnitude of the policy intervention parameter 𝑢. This is repeated at each time step 𝑡, with the
illustration of 𝑡=100 offered in a separate plot.
A greedy solution, as demonstrated in Algorithm 1, is to pick the 𝑢 in a short-term optimal
manner such that it keeps the infected fraction just within the threshold. However, this might still
result in a sequence of large interventions for a highly infectious disease (long-term lockdowns).
There are better strategies that enact more stringent measures early on so that the overall impact
of interventions across the time series is not as high. Alternatively, we might weight the outcomes
of infected individuals exceeding a threshold by the amount of time it would take to breach
the threshold and thereby determine a reasonable intervention to perform. There are different
ways of picking the optimal policy intervention depending on the choice of utility function and
optimisation technique, including neural networks. We believe that the choice should vary subject
to demographic factors, effectiveness of government policies, durations for which governments
can implement stringent measures and acceptable thresholds of infected individuals.
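Algorithm 1 can be sketched in a few lines under simplifying assumptions. The sketch below uses a deterministic SIR model with transmission scaled by (1 - u), reviews the policy every review_every days, and picks the smallest candidate u whose look-ahead simulation over horizon days keeps the infected fraction within the threshold. The names step_sir and greedy_interventions, the look-ahead horizon, and all numeric values are assumptions made for illustration; the experiments in this work rely on inference in a PPL rather than this simplified loop.

```python
def step_sir(s, i, u, days, beta=0.3, gamma=0.1, dt=0.1):
    """Advance an SIR state by `days` under intervention u.

    Returns the new (s, i) and the peak infected fraction seen on the way.
    """
    peak = i
    for _ in range(round(days / dt)):
        new_infections = (1.0 - u) * beta * s * i
        s, i = s - dt * new_infections, i + dt * (new_infections - gamma * i)
        peak = max(peak, i)
    return s, i, peak


def greedy_interventions(total_days=300, review_every=10, horizon=30,
                         threshold=0.10, i0=1e-4,
                         candidates=tuple(k / 10 for k in range(11))):
    """Greedy search: at each review, take the mildest u that stays within threshold."""
    s, i = 1.0 - i0, i0
    sequence, peak_overall = [], i
    for _ in range(0, total_days, review_every):
        chosen = candidates[-1]  # fall back to the strongest intervention
        for u in candidates:     # candidates are sorted from 0 to 1
            _, _, lookahead_peak = step_sir(s, i, u, horizon)
            if lookahead_peak <= threshold:
                chosen = u
                break
        sequence.append(chosen)
        # Commit to the chosen intervention until the next review.
        s, i, peak = step_sir(s, i, chosen, review_every)
        peak_overall = max(peak_overall, peak)
    return sequence, peak_overall


sequence, peak = greedy_interventions()
print("interventions:", sequence)
print("peak infected fraction:", round(peak, 4))
```

Because the look-ahead horizon is at least as long as the review period and u = 1 is always a candidate, the infected fraction in this sketch stays within the threshold as long as it starts below it. Replacing the inner selection rule with a utility-weighted choice is where the alternative strategies discussed above would plug in.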
C.2 Limitations
We study the spread of infectious diseases in terms of the COVID-19 pandemic in an attempt to
unify the dierent models to highlight the utility of PPLs. For this reason, while we do study the
predictions of our models by evaluating the parameter estimates against those described in the
existing literature [Hoseinpour Dehkordi et al
2020;Kucharski et al
2020;Li et al
et al
2020;Wang et al
2020a] in some experiments, it is more interesting for us to focus, in this
work, on our technique and its extensions than the fact that we achieve similar results to those in
related literature. We provide a complete comparison of some region-wise parameter estimates with
real-world data in 1but emphasize that the novelty is in the ease of application of this technique
across the table, its adaptability to incorporate modeling assumptions, and exibility of inference
across a spectrum of compartmental models.
It is challenging to model a disease with convenient mathematical assumptions implicit in
compartmental models, that largely ignore individual-level differences. While there are some
inferences that are plausible to make, it is clear that we must not jump to immediate conclusions
in disease modeling due to the fact that there is almost always non-trivial uncertainty associated
with parameter estimates in that multiple models might reflect similar initial trajectories [Atkeson
2020a]. Furthermore, certain types of interventions might not be as effective as others [Manchein
et al. 2020]. However, modeling interventions is one way of translating theory into practical
advice for making policy decisions [Thompson 2020] and it is promising that SEIR models perform
well over time in multiple demographics with regard to predicting the spread of the disease
[Atkeson 2021]. This strengthens the argument that we would like to further reduce our modeling
assumptions and therefore, uncertainty in parameter estimates, in order to make more confident
policy recommendations eventually. This motivates the use of the SEI3RD model (among other
variants in Compartmental Model Extensions in Pyro) which is more powerful than the class of
SEIR models since it is a derivative that allows for more granular transmission dynamics between
compartments. However, even with the SEI3RD model, we need to add in external interventions,
consider jointly conditioning on observed fatalities, and inductive biases on transition probabilities
based on estimated incubation period, infection severity, stratification by demographic factors, and
consider utility