Expert-in-the-Loop Prescriptive Analytics using
Mobility Intervention for Epidemics
Joon-Seok Kim
George Mason University
jkim258@gmu.edu
Hyunjee Jin
George Mason University
hjin6@gmu.edu
Andreas Züfle
George Mason University
azufle@gmu.edu
ABSTRACT
Due to the complexity of social phenomena, it is a major challenge to predict the curves of epidemics that spread via social contacts and to control such epidemics. Misguided policies to mitigate epidemics may result in catastrophic consequences such as financial crisis, massive unemployment, and a surge in the number of critically ill patients exceeding the capacity of hospitals. In particular, under- or overestimating the efficacy of interventions can mislead policymakers about how the situation is evolving. To avoid such pitfalls, we propose Expert-in-the-Loop (EITL) prescriptive analytics using mobility intervention for epidemics. The key advantage of our approach over a purely data-driven one is that it leverages experts’ best knowledge in estimating disease spread and the efficacy of interventions, which allows us to efficiently narrow down the factors and the scope of combinatorial possible worlds. We describe our experience developing Expert-in-the-Loop simulations during the Challenge on Mobility Intervention for Epidemics. We demonstrate that misconceptions about causality can be corrected through iterations of consulting with experts, developing simulations, and experimentation.
CCS CONCEPTS
• Information systems → Geographic information systems; • Computing methodologies → Agent / discrete models.
KEYWORDS
expert-in-the-loop, prescriptive analytics, epidemic modeling, microsimulation, mobility, intervention
ACM Reference Format:
Joon-Seok Kim, Hyunjee Jin, and Andreas Züfle. 2020. Expert-in-the-Loop Prescriptive Analytics using Mobility Intervention for Epidemics. In Prescriptive Analytics for the Physical World (PAPW ’20), August 24, 2020, San Diego, CA, USA. ACM, New York, NY, USA, 3 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn
1 INTRODUCTION
Predicting human behavior and the resulting spread of a pandemic is a tremendous challenge due to the complexity (and often irrationality) of humans. Physicist Murray Gell-Mann, the Nobel laureate who conceived the quark, famously said “Imagine how hard physics
would be if electrons could think” [3]. When predicting pandemic spread, the atoms that make up the simulation and prediction model are indeed humans. The difficulty of predicting pandemics is evident from the large variance, within and between models, in predictions of the spread of COVID-19 [2]. This uncertainty misguides policies to mitigate epidemics and may result in catastrophic consequences such as financial crisis, massive unemployment, and a surge in the number of critically ill patients exceeding the capacity of hospitals. We have observed these consequences in the United States as a result of COVID-19. In particular, misunderstanding of risks and under- or overestimation of the efficacy of interventions can mislead policymakers about how the situation is evolving.
To improve prescriptive analytics for decision making, we propose Expert-in-the-Loop (EITL) prescriptive analytics using mobility intervention for epidemics. The key advantage of our approach is that it leverages experts’ best knowledge in estimating disease spread and the efficacy of interventions, which allows us to efficiently narrow down the factors and the scope of combinatorial possible worlds. To manage uncertainty [8], we describe our experience developing simulations during the Challenge on Mobility Intervention for Epidemics [1]. We demonstrate that misconceptions about causality can be corrected through iterations of consulting with experts, developing simulations, and experimentation.
2 EXPERT-IN-THE-LOOP FRAMEWORK
In this section, we introduce our EITL framework, the goals of which are to (1) discover or update ground truths; (2) evaluate the efficiency and efficacy of each intervention in different situations; and (3) efficiently reduce the combinatorial search space for optimization.
The motivation for EITL stems from hands-on experience developing agent-based epidemic simulations [5, 6] and from familiarity with the challenge designers’ perspective, as opposed to that of challengers¹. As challenge designers in a setting similar to the Challenge on Mobility Intervention for Epidemics, we have provided a black-box model that allows challengers to obtain only observable information and to conduct experiments to find prescriptions. Due to the complexity of social interactions in the simulation, discovering ground truths is a great obstacle for the challengers.
In line with our former experience, we aim to discover ground truths at the outset. We explored multiple directions to understand the feasibility of each approach, given limited resources, to mitigate the simulated pandemic while also minimizing the cost of prescriptions. These approaches included genetic algorithms (GA), genetic programming (GP), reinforcement learning (RL), and rule-based heuristics. Each technique has pros and cons, and the best practice is to select methods that fit a specific task in the process.
¹ Geo-social simulation project web site: https://geosocial.joonseok.org
[Figure 1 (plots omitted). Panels: (a) Ground Truth (Scenario 1); (b) Estimated model (Scenario 1); (c) Ground Truth (Scenario 2); (d) Estimated model (Scenario 2). Each panel plots Number of cases (0 to 10000) against Steps (0 to 800) for the Susceptible, Symptomatic, Critical, and Recovered series.]
Figure 1: Comparison of epidemic dynamics between ground truth and calibrated model
For EITL, we repeat the following four steps: (1) develop heuristics/rules to prescribe actions to agents; (2) optimize the parameters of these rules to minimize the evaluation score; (3) evaluate the results by consulting experts to understand the resulting patterns and interpret causality; and (4) readjust the rules according to our updated understanding and intuition.
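The four steps can be read as a simple loop. The Python sketch below is only an illustration under assumed names: `eitl_loop` and the callables `optimize`, `consult_experts`, and `revise` are hypothetical hooks, not part of the challenge simulator or of our released code, and the toy usage replaces the simulator's evaluation score with a made-up function.

```python
def eitl_loop(initial_rules, optimize, consult_experts, revise, rounds=3):
    """Repeat the four EITL steps for a fixed number of rounds.

    All callables are hypothetical hooks supplied by the caller:
      optimize(rules) -> (params, score)                 step 2, lower is better
      consult_experts(rules, params, score) -> feedback  step 3
      revise(rules, feedback) -> new rules               step 4
    """
    rules = initial_rules  # step 1: the first set of heuristics/rules
    history = []
    for _ in range(rounds):
        params, score = optimize(rules)
        feedback = consult_experts(rules, params, score)
        history.append((rules, params, score))
        rules = revise(rules, feedback)
    return history


# Toy stand-ins: a "rule set" is a list of candidate parameter values and
# the score is a made-up distance to 7, NOT the simulator's real score.
def toy_optimize(rules):
    score, best = min((abs(value - 7), value) for value in rules)
    return best, score


def toy_experts(rules, params, score):
    # The toy "expert" simply endorses the current best parameter.
    return params


def toy_revise(rules, feedback):
    # Narrow the candidates around the value the expert endorsed.
    return [feedback - 1, feedback, feedback + 1]


history = eitl_loop([0, 10, 20], toy_optimize, toy_experts, toy_revise)
for rules, params, score in history:
    print(rules, params, score)
```

In practice, step 3 is a conversation with human experts rather than a function call; the hook only marks where that feedback enters the loop.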
Since it is difficult to control all factors that may influence outcomes, misconceptions about public health are common. For example, until John Snow [7] traced the source of a cholera epidemic to a specific water pump in 1854, people regarded cholera as an airborne disease.
To correct such biases about the simulated world, we use modeling and simulation (M&S) to model the system for predictive and prescriptive analytics. Unlike black-box machine learning techniques, modeling and simulation provides a deep understanding of the system of interest. It is worth noting that the process of M&S allows experts to elaborate their knowledge and validate their theories and reasoning.
3 APPLICATION
The rest of the paper describes a best practice of applying EITL to the Challenge on Mobility Intervention for Epidemics [1] and its results.
3.1 Feasibility Study
In the real world, an expert group responding to an epidemic may consist of epidemiologists, clinicians, social scientists, policymakers, health economists, community representatives, and experienced simulation modelers. Its composition depends on the type of disease and the society. In our setting, we treat the simulator documentation [1] as the best available knowledge, including human mobility patterns, a disease model, feasible interventions, and their costs, obtained by an expert group consisting of epidemiologists, geoinformation scientists, social scientists, health economists, and policymakers.
To validate our knowledge, or ground truths, we develop micro-simulations that mimic the epidemic simulator. We adopt micro-simulations instead of agent-based simulations [4] in this case because we aim at statistical validation, for which micro-simulations are sufficient. The model includes the process of instantiating patient zeros, who contract the disease from the system without any contact with other infectious individuals.
To understand the ground truth disease model, we conducted experiments in which we confined the population so that all agents stay in their neighborhoods. After many iterations of modeling and calibration, we found a disease model that behaves similarly to the ground truth model. Figure 1 compares disease progression between the ground truth model and our estimated model in two different scenarios (see [1] for more information). It is worth noting that even subtle changes in the parameter configuration, such as the distribution of families, the disease transmission period, or the recovery period, influence the shape of the curves. This process gives us a handle to control diseases through accurate estimation. If the model and parameters are inaccurate, interventions are ineffective, producing many false positive and false negative cases. For example, if we overestimate the probability of infection, then we are more likely to isolate or quarantine individuals unnecessarily, which leads to high intervention costs.
We categorize cases of epidemic transmission into three types: acquaintance contacts, stranger contacts, and unknown (also known as a patient zero). Given a probability $P_c$ for an individual to get infected from an infected acquaintance contact and a probability $P_s$ for an individual to get infected from an infected stranger contact, the probability $P$ that individual $x$ gets infected is:

$$P(x) = 1 - \prod_{i=1}^{t} (1 - P_c \cdot a_i) \cdot \left(1 - P_s \cdot \frac{a_i + s_i}{n_i}\right), \qquad (1)$$

where $a_i$ is the number of acquaintance contacts with infectious individuals at time $i$, $s_i$ is the number of stranger contacts with infectious individuals at time $i$, and $n_i$ is the number of individuals at the location where $x$ is located at time $i$. We found that randomly selected patient zeros in Scenarios 1, 3, and 5 appear following a Poisson distribution with $\lambda \approx 2.4$ persons a day.
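As a minimal sketch of how Eq. 1 and the patient-zero arrival process could be evaluated, the following Python code computes the infection probability from a contact history and draws daily patient-zero counts from a Poisson distribution. The function names, the tuple layout of `contacts`, the clamping of each factor to [0, 1], and the use of Knuth's sampling method are assumptions made for illustration; they are not taken from the challenge simulator.

```python
import math
import random


def infection_probability(contacts, p_c, p_s):
    """Evaluate Eq. 1 for one individual.

    contacts: iterable of (a_i, s_i, n_i) tuples, one per time step,
    with n_i > 0 (number of individuals at the same location).
    """
    stay_healthy = 1.0
    for a_i, s_i, n_i in contacts:
        # Eq. 1 has no clamp; max(0, ...) is an added guard so that a
        # large a_i cannot make a factor negative.
        acquaintance_term = max(0.0, 1.0 - p_c * a_i)
        stranger_term = max(0.0, 1.0 - p_s * (a_i + s_i) / n_i)
        stay_healthy *= acquaintance_term * stranger_term
    return 1.0 - stay_healthy


def sample_patient_zeros(rng, lam=2.4):
    """Draw one daily patient-zero count from Poisson(lam) (Knuth's method)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


# Made-up contact history; the rates are the Scenario 2 values of the paper.
contacts = [(1, 2, 10), (0, 0, 5)]
p = infection_probability(contacts, p_c=0.05, p_s=0.01)
print(round(p, 5))  # 0.05285

rng = random.Random(0)
daily_counts = [sample_patient_zeros(rng) for _ in range(20000)]
print(sum(daily_counts) / len(daily_counts))  # close to 2.4
```

The first time step contributes $(1 - 0.05)(1 - 0.01 \cdot 3/10) = 0.94715$ and the second contributes 1, which gives $P(x) = 0.05285$.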
Based on experimentation, we discovered the efficacy of four intervention types (confinement, quarantine, isolation, and hospitalization).
• Confinement: Confined individuals are allowed to interact with others in the same neighborhood, including acquaintances. Confinement mitigates the spread to some extent by reducing the major epidemic spread from working areas. Acquaintance contacts are the main contributor to infections, and the number of acquaintance contacts at work is much larger than in residential areas. This option is efficient and effective when the number of infections increases drastically (Scenario 2) or is large (Scenario 4).
• Quarantine: Initially, experts assumed that quarantined individuals contact only their acquaintances staying at home. However, by
experimentation we found that the disease can be transmitted from self-quarantined patients to strangers, which is plausible in the real world. We also discovered that its efficacy is similar to that of confinement, but it is more expensive. This empirical finding is important for policymakers to make the right decisions. For this reason, our prescriptions do not use quarantine.
• Isolation: Empirically, we found that the self-recovery period ranges from 15 to 30 days, including the pre-symptomatic period. Isolating infectious individuals is the most effective option to stop them from spreading the disease, since no contacts are allowed with isolated individuals.
• Hospitalization: Similar to isolation, hospitalized patients are not allowed to contact others. Since there is neither a penalty for critical cases nor a reward for recovered cases, there is no merit in hospitalization, which is twice as expensive as isolation. Therefore, we exclude hospitalization from our interventions.
We take advantage of this knowledge to reduce the combinatorial search space for optimization.
3.2 Mitigation Strategies
Leveraging these observations, we optimize the combination of two types of interventions, namely confinement and isolation. To decide which agents to confine or isolate, we compute an infection probability (see Eq. 1) and a risk factor for each agent. The risk factor of an agent measures how dangerous the individual would be if they were pre-symptomatic: it estimates the expected number of infections they would cause, using their number of acquaintances and their daily number of co-located agents. We group Scenarios 1, 3, and 5 together and apply the same heuristics/rules to them because of their similarity. We highlight mitigation strategies for Scenarios 2 and 4, while strategies for the other scenarios follow the general rules below².
3.2.1 General rules. While infections from acquaintance contacts and stranger contacts can be inferred by contact tracing, there is no way to infer who is a patient zero until they are discovered. Thus, whenever a new case is discovered, we isolate the symptomatic individual and select susceptible individuals with a high infection probability for isolation. There is a trade-off between the number of infections and the number of interventions. That is, if we isolate more of the individuals in question by decreasing a threshold $\sigma$, the number of pre-symptomatic individuals is more likely to decrease. If $\sigma$ is too high, however, we end up with more isolations because the number of infection cases increases. Therefore, the main challenge is to find a $\sigma$ that balances the two measures.
Another dimension we take into account is the risk factor. Non-symptomatic individuals are isolated if the product of the infection probability and the risk factor exceeds a threshold $\epsilon$. Suppose that there are two non-symptomatic individuals $x$ and $y$ with the same infection probability. If $x$ has more acquaintances than $y$, then the risk of not isolating $x$ is higher than that of not isolating $y$. Therefore, $x$ is more likely to be isolated than $y$.
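The two thresholds can be illustrated with a short Python sketch. Several parts are assumptions rather than our actual rules: the `Agent` fields, the linear form of `risk_factor` (the exact formula is not given in the text), the choice to combine the two tests with a logical "or", and all numeric values in the usage example.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    name: str
    infection_probability: float  # P(x) from Eq. 1
    acquaintances: int            # number of acquaintances
    co_located: float             # daily number of co-located agents


def risk_factor(agent, p_c, p_s):
    # Hypothetical estimate of the expected number of infections the agent
    # would cause if pre-symptomatic; the exact formula is an assumption.
    return p_c * agent.acquaintances + p_s * agent.co_located


def select_for_isolation(agents, sigma, epsilon, p_c, p_s):
    """Return names of non-symptomatic agents that the rules would isolate."""
    selected = []
    for agent in agents:
        p = agent.infection_probability
        high_probability = p >= sigma                           # threshold sigma
        high_risk = p * risk_factor(agent, p_c, p_s) > epsilon  # threshold epsilon
        if high_probability or high_risk:
            selected.append(agent.name)
    return selected


# x and y have the same infection probability, but x has more acquaintances.
agents = [
    Agent("x", infection_probability=0.2, acquaintances=20, co_located=10),
    Agent("y", infection_probability=0.2, acquaintances=4, co_located=10),
]
selected = select_for_isolation(agents, sigma=0.5, epsilon=0.1, p_c=0.05, p_s=0.01)
print(selected)  # ['x']
```

With these illustrative values, neither agent reaches $\sigma$, but the product of probability and risk factor is 0.22 for $x$ and 0.06 for $y$, so only $x$ exceeds $\epsilon$.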
3.2.2 Scenario 2. The pandemic in Scenario 2 has higher infection rates, with $P_c = 0.05$ and $P_s = 0.01$. Because of these high infection rates, we need to isolate more aggressively by lowering the threshold $\sigma$. However, controlling only $\sigma$ is not sufficient to mitigate the spread of the pandemic because of the high reproduction number. Our tactic for this scenario is to prevent gatherings of eleven or more people. If an individual has $n > 10$ acquaintances in a working area, we confine the individual with probability $(n - 11)/n$. This strategy can be seen as a work-from-home order.
² Source code is available at https://github.com/joonseok-kim/kdd-papw20-challenge
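The Scenario 2 confinement rule can be written down directly. The sketch below is an illustration only: the function names and the use of a seeded `random.Random` are assumptions, while the condition $n > 10$ and the probability $(n - 11)/n$ follow the text.

```python
import random


def confinement_probability(n):
    """Probability of confining an individual with n workplace acquaintances."""
    if n <= 10:
        return 0.0  # the rule applies only when n > 10
    return (n - 11) / n


def should_confine(n, rng):
    """Randomly decide confinement for one individual."""
    return rng.random() < confinement_probability(n)


rng = random.Random(42)
for n in (10, 11, 22, 44):
    print(n, confinement_probability(n))
# 10 0.0
# 11 0.0
# 22 0.5
# 44 0.75

confined = sum(should_confine(44, rng) for _ in range(10000))
print(confined / 10000)  # close to 0.75
```

The probability grows with $n$, so individuals in larger workplace groups are confined more often, which thins out the largest gatherings first.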
3.2.3 Scenario 4. Scenario 4 is an extreme case that shows a trade-off between the number of accumulated infections and intervention costs. At the beginning, 300 pre-symptomatic individuals are spawned, and more than 100 susceptible individuals are infected by them during the first day. About 100-200 symptomatic cases, including about 60 patient zeros, are discovered the very next day. Therefore, about 240 patient zeros remain unknown, and our goal is to cost-effectively suppress the initial spread. Along with potential cases, we randomly select many individuals with high risk factors to isolate. Similar to Scenario 2, we prevent gatherings of more than five people in the working areas. We then release these individuals to reduce costs, and the next day we repeat this process to find more pre-symptomatic cases.
4 CONCLUSION
In this paper, we introduced Expert-in-the-Loop prescriptive analytics, which leverages experts’ best knowledge. While AI and machine learning can excel at well-defined problems such as optimization, experts can expedite the optimization process by modeling the problem and choosing the right tools. We also demonstrated how simulations can be used to discover ground truths as groundwork for prescriptive analytics. Although our mitigation strategies yielded fairly good results for general cases, we still have difficulty narrowing down the complex combination of rules for Scenario 2 and Scenario 4. To tackle the combination of rules, in future work we will leverage evolutionary algorithms to explore diverse possible worlds.
ACKNOWLEDGMENT
This work was supported by the Defense Advanced Research Projects Agency (DARPA) under cooperative agreement No. HR00111820005 and the National Science Foundation Grant DEB-2030685. The content of the information does not necessarily reflect the position or the policy of the Government, and no official endorsement should be inferred.
REFERENCES
[1] Challenge on mobility intervention for epidemics. https://prescriptive-analytics.github.io/challenge-cfp/index.html, Accessed July 24, 2020.
[2] Covid-19 ensemble dashboard. https://covid-ensemble.gmu.edu/, Accessed July 24, 2020.
[3] M. Gell-Mann. The interpretation of the new particles as displaced charge multiplets. Il Nuovo Cimento (1955-1965), 4(2):848–866, 1956.
[4] J.-S. Kim, H. Jin, H. Kavak, O. C. Rouly, A. Crooks, D. Pfoser, C. Wenk, and A. Züfle. Location-based Social Network Data Generation Based on Patterns of Life. In IEEE MDM 2020, pages 158–167. IEEE, 2020.
[5] J.-S. Kim, H. Kavak, U. Manzoor, and A. Züfle. Advancing simulation experimentation capabilities with runtime interventions. In SpringSim 2019, pages 1–11. IEEE, 2019.
[6] J.-S. Kim, H. Kavak, C. O. Rouly, H. Jin, A. Crooks, D. Pfoser, C. Wenk, and A. Züfle. Location-based social simulation for prescriptive analytics of disease spread. SIGSPATIAL Special, 12(1):53–61, 2020.
[7] J. Snow. On Mode of Communication of Cholera. Churchill, London, England, 1855.
[8] A. Züfle, G. Trajcevski, D. Pfoser, and J.-S. Kim. Managing uncertainty in evolving geo-spatial data. In IEEE MDM 2020, pages 5–8. IEEE, 2020.