
Expert-in-the-Loop Prescriptive Analytics using Mobility Intervention for Epidemics

Abstract

Due to the complexity of social phenomena, it is a major challenge to predict the curves of epidemics that spread via social contacts and to control such epidemics. Misguided policies to mitigate epidemics may result in catastrophic consequences such as financial crisis, massive unemployment, and a surge in the number of critically ill patients that exceeds the capacity of hospitals. In particular, under- or overestimating the efficacy of interventions can mislead policymakers about evolving situations. To avoid such pitfalls, we propose Expert-in-the-Loop (EITL) prescriptive analytics using mobility intervention for epidemics. Rather than being purely data-driven, our approach leverages experts' best knowledge in estimating disease spreading and the efficacy of interventions, which allows us to efficiently narrow down factors and the scope of combinatorial possible worlds. We describe our experience developing Expert-in-the-Loop simulations during the Challenge on Mobility Intervention for Epidemics. We demonstrate that misconceptions about causality can be corrected through iterations of consulting with experts, developing simulations, and experimentation.
Joon-Seok Kim
George Mason University
jkim258@gmu.edu
Hyunjee Jin
George Mason University
hjin6@gmu.edu
Andreas Züfle
George Mason University
azufle@gmu.edu
CCS CONCEPTS
• Information systems → Geographic information systems; • Computing methodologies → Agent / discrete models.
KEYWORDS
expert-in-the-loop, prescriptive analytics, epidemic modeling, microsimulation, mobility, intervention
ACM Reference Format:
Joon-Seok Kim, Hyunjee Jin, and Andreas Züfle. 2020. Expert-in-the-Loop Prescriptive Analytics using Mobility Intervention for Epidemics. In Prescriptive Analytics for the Physical World (PAPW ’20), August 24, 2020, San Diego, CA, USA. ACM, New York, NY, USA, 3 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn
1 INTRODUCTION
Predicting human behaviors and the resulting spread of a pandemic is a tremendous challenge due to the complexity (and often irrationality) of humans. Physicist Murray Gell-Mann, the Nobel laureate who conceived the quark, famously said “Imagine how hard physics
would be if electrons could think” [3]. To predict pandemic spread, the atoms that comprise the model for simulation and prediction are indeed humans. The difficulty of predicting disease pandemics is evident from the large variance, within and between models, in predictions of the spread of COVID-19 [2]. This uncertainty misguides policies to mitigate epidemics and may result in catastrophic consequences such as financial crisis, massive unemployment, and a surge in the number of critically ill patients that exceeds the capacity of hospitals. We have observed these consequences in the United States as a result of COVID-19. In particular, misunderstanding of risks and under- or overestimation of the efficacy of interventions can mislead policymakers about evolving situations.
To improve prescriptive analytics for decision making, we propose Expert-in-the-Loop (EITL) prescriptive analytics using mobility intervention for epidemics. The key advantage of our approach is that it leverages experts’ best knowledge in estimating disease spreading and the efficacy of interventions, which allows us to efficiently narrow down factors and the scope of combinatorial possible worlds. To manage uncertainty [8], we describe our experience developing simulations during the Challenge on Mobility Intervention for Epidemics [1]. We demonstrate that misconceptions about causality can be corrected through iterations of consulting with experts, developing simulations, and experimentation.
2 EXPERT-IN-THE-LOOP FRAMEWORK
In this section, we introduce our EITL framework, the goal of which is to (1) discover or update ground truths; (2) evaluate the efficiency and efficacy of each intervention in different situations; and (3) efficiently reduce the combinatorial search space for optimization.
The motivation for EITL stems from hands-on experience developing agent-based epidemic simulations [5, 6] and from familiarity with the challenge designers’ perspective as opposed to that of the challengers.¹ Similar to the Challenge on Mobility Intervention for Epidemics, as challenge designers we have provided a black-box model that allows challengers to obtain only observable information and to conduct experiments to find prescriptions. Due to the complexity of social interactions in the simulation, discovering ground truths is a great obstacle for the challengers.
In line with our former experience, we aim to discover ground truths at the outset. We explored multiple directions to understand the feasibility of each approach, given limited resources, to mitigate the simulated pandemic while also minimizing the cost of prescriptions. These approaches included genetic algorithms (GA), genetic programming (GP), reinforcement learning (RL), and rule-based heuristics. Each technique has pros and cons, and the best practice is to select methods that fit a specific task in the process.
¹ Geo-social simulation project web site: https://geosocial.joonseok.org
[Figure 1: Comparison of epidemic dynamics between ground truth and calibrated model. Panels: (a) Ground Truth (Scenario 1); (b) Estimated model (Scenario 1); (c) Ground Truth (Scenario 2); (d) Estimated model (Scenario 2). Each panel plots the number of cases (0 to 10000) against steps (0 to 800) for the Susceptible, Symptomatic, Critical, and Recovered groups.]
For EITL, we repeat the following four steps: 1) develop heuristics/rules to prescribe actions to agents, 2) optimize the parameters of these rules to minimize the evaluation score, 3) evaluate the results by consulting experts to understand the resulting patterns and interpret causality, and 4) re-adjust the rules following our understanding and intuition.
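The four steps can be sketched as a loop. The following is only a minimal illustration under stated assumptions, not the procedure used in the challenge: `optimize` uses plain random search, `narrow_around_best` is a hypothetical stand-in for the expert consultation and rule re-adjustment of steps 3 and 4, and `toy_score` is a made-up objective. Step 1 (developing the rules) is represented by the score function and the parameter ranges passed in.

```python
import random
from typing import Callable, Dict, Tuple

Params = Dict[str, float]
Bounds = Dict[str, Tuple[float, float]]


def optimize(score: Callable[[Params], float], bounds: Bounds,
             trials: int, rng: random.Random) -> Params:
    """Step 2: random search for the rule parameters with the lowest score."""
    best: Params = {}
    best_score = float("inf")
    for _ in range(trials):
        candidate = {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        candidate_score = score(candidate)
        if candidate_score < best_score:
            best, best_score = candidate, candidate_score
    return best


def narrow_around_best(bounds: Bounds, params: Params, factor: float = 0.5) -> Bounds:
    """Hypothetical stand-in for steps 3 and 4: shrink each range around the best value."""
    narrowed = {}
    for name, (lo, hi) in bounds.items():
        half_width = (hi - lo) * factor / 2
        narrowed[name] = (max(lo, params[name] - half_width),
                          min(hi, params[name] + half_width))
    return narrowed


def eitl_loop(score, bounds, review, rounds=3, trials=200, seed=0):
    """Repeat optimize -> review -> re-adjust; return (bounds, params, score) per round."""
    rng = random.Random(seed)
    history = []
    for _ in range(rounds):
        params = optimize(score, bounds, trials, rng)
        history.append((bounds, params, score(params)))
        bounds = review(bounds, params)  # in practice, a human expert decides this
    return history


def toy_score(params: Params) -> float:
    # Made-up evaluation score with its minimum at sigma = 0.3
    return (params["sigma"] - 0.3) ** 2


history = eitl_loop(toy_score, {"sigma": (0.0, 1.0)}, narrow_around_best)
```

In a real setting the `review` callable would be replaced by an actual discussion with experts, who may change the rules themselves rather than only the parameter ranges.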
Since it is difficult to control all factors that may influence outcomes, misconceptions about public health are common. For example, until John Snow [7] traced the source of a cholera epidemic to a specific water pump in 1854, people regarded cholera as an airborne disease.
To correct such biases about the simulated world, we utilize modeling and simulation (M&S) to model the system for predictive and prescriptive analytics. Unlike black-box machine learning techniques, modeling and simulation provides a deep understanding of the system of interest. It is worth noting that the process of M&S allows experts to elaborate their knowledge and validate their theory and reasoning.
3 APPLICATION
The rest of the paper describes a best practice for applying EITL to the Challenge on Mobility Intervention for Epidemics [1] and the results.
3.1 Feasibility Study
In the real world, an expert group responding to epidemics may consist of epidemiologists, clinicians, social scientists, policy makers, health economists, community representatives, and experienced simulation modelers. Its composition depends on the type of disease and the society. In our setting, we treat the simulator documentation [1] as the best knowledge, including human mobility patterns, a disease model, feasible interventions, and their costs, obtained by an expert group consisting of epidemiologists, geoinformation scientists, social scientists, health economists, and policy makers.
To validate our knowledge or ground truths, we develop micro-simulations that mimic the epidemic simulator. We adopt micro-simulations instead of agent-based simulations [4] in this case because we aim at statistical validation, for which micro-simulations are sufficient. The model includes the process of instantiating patient zeros, who acquire the disease from the system without any contact with other infectious individuals.
To understand the ground truth disease model, we conducted experiments by confining the population such that all agents stay in their neighborhood. After many iterations of modeling and calibration, we found a disease model that behaves similarly to the ground truth model. Figure 1 compares disease progression between the ground truth model and our estimated model in two different scenarios (see [1] for more information). It is worth noting that even subtle changes in the parameter configuration, such as the distribution of family, the disease transmission period, or the recovery period, influence the shape of the curves. With accurate estimation, this process gives us a handle to control the disease. If the model and parameters are inaccurate, interventions are ineffective, producing many false positive and false negative cases. For example, if we overestimate the probability of infection, then we are more likely to isolate or quarantine unnecessarily, which leads to high intervention costs.
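To illustrate what calibration against an observed curve can look like, the sketch below fits two parameters by exhaustive grid search, minimizing the root-mean-square error between an observed and a simulated curve. It is a toy under stated assumptions: `toy_epidemic` is a hypothetical deterministic susceptible-infectious-recovered recursion, not the micro-simulation described here, and the parameter names `beta` and `recovery_days` as well as the grid values are invented for the example.

```python
import itertools
import math
from typing import Callable, Dict, List, Sequence, Tuple


def rmse(a: Sequence[float], b: Sequence[float]) -> float:
    """Root-mean-square error between two curves of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def toy_epidemic(beta: float, recovery_days: float, steps: int = 100,
                 population: float = 10000.0, seed_cases: float = 10.0) -> List[float]:
    """Toy stand-in simulator: returns the number of infectious individuals per step."""
    susceptible = population - seed_cases
    infectious = seed_cases
    curve = []
    for _ in range(steps):
        new_infections = min(susceptible, beta * susceptible * infectious / population)
        new_recoveries = infectious / recovery_days
        susceptible -= new_infections
        infectious += new_infections - new_recoveries
        curve.append(infectious)
    return curve


def calibrate(observed: Sequence[float],
              simulate: Callable[..., List[float]],
              grid: Dict[str, Sequence[float]]) -> Tuple[Dict[str, float], float]:
    """Try every parameter combination in the grid; keep the one with the lowest RMSE."""
    names = list(grid)
    best_params: Dict[str, float] = {}
    best_error = float("inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        error = rmse(observed, simulate(**params))
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error


# Pretend the "ground truth" was generated with beta=0.3 and a 20-day recovery period.
observed = toy_epidemic(beta=0.3, recovery_days=20.0)
best, err = calibrate(observed, toy_epidemic,
                      {"beta": [0.2, 0.3, 0.4], "recovery_days": [15.0, 20.0, 25.0]})
```

Because the observed curve here comes from the same toy simulator, the search recovers the generating parameters exactly; with a stochastic ground truth one would compare averages over repeated runs instead.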
We categorize cases of epidemic transmission into three types: acquaintance contacts, stranger contacts, and unknown (also known as patient zero). Given a probability $P_c$ for an individual to get infected from an infected acquaintance contact and a probability $P_s$ for an individual to get infected from an infected stranger contact, the probability $P$ that individual $x$ gets infected is:

$$P(x) = 1 - \prod_{i=1}^{t} (1 - P_c \cdot a_i) \cdot \left(1 - P_s \cdot (a_i + s_i)/n_i\right), \tag{1}$$

where $a_i$ is the number of acquaintance contacts with infectious individuals at time $i$, $s_i$ is the number of stranger contacts with infectious individuals at time $i$, and $n_i$ is the number of individuals at the location where $x$ is at time $i$. We found that randomly selected patient zeros in Scenarios 1, 3, and 5 appear following a Poisson distribution with $\lambda \approx 2.4$ persons per day.
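Eq. 1 can be evaluated directly from a per-step contact log. The sketch below is one possible reading, not the released implementation: it takes the formula as one minus the product over time steps of (1 − P_c·a_i)·(1 − P_s·(a_i + s_i)/n_i). The `Exposure` record, the clamping of each factor at zero, and the skipping of steps with no co-located individuals are assumptions added for the example; the values of P_c and P_s are those reported for Scenario 2.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Exposure:
    acquaintances: int  # a_i: infectious acquaintance contacts at time step i
    strangers: int      # s_i: infectious stranger contacts at time step i
    colocated: int      # n_i: individuals at x's location at time step i


def infection_probability(exposures: Iterable[Exposure], p_c: float, p_s: float) -> float:
    """Probability of infection over all time steps, following Eq. 1."""
    escape = 1.0  # probability of not being infected at any step so far
    for e in exposures:
        if e.colocated <= 0:
            continue  # assumption: nobody co-located means no exposure at this step
        # Assumption: clamp at 0 so that many contacts cannot yield a negative factor.
        acquaintance_term = max(0.0, 1.0 - p_c * e.acquaintances)
        stranger_term = max(0.0, 1.0 - p_s * (e.acquaintances + e.strangers) / e.colocated)
        escape *= acquaintance_term * stranger_term
    return 1.0 - escape


# Two time steps: one infectious acquaintance and two infectious strangers among
# ten co-located people on the first step, no infectious contacts on the second.
log = [Exposure(acquaintances=1, strangers=2, colocated=10),
       Exposure(acquaintances=0, strangers=0, colocated=5)]
p = infection_probability(log, p_c=0.05, p_s=0.01)
# First step: (1 - 0.05) * (1 - 0.01 * 3 / 10) = 0.95 * 0.997 = 0.94715, so p is about 0.05285.
```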
Based on experimentation, we discovered the efficacy of four intervention types (confinement, quarantine, isolation, and hospitalization).
Confinement: Confined individuals are allowed to interact with others in the same neighborhood, including acquaintances. Confinement mitigates the epidemic to some degree by reducing major spread from working areas. Acquaintance contacts are the main contributor to transmission, and the number of acquaintance contacts at work is much larger than in residential areas. This option is efficient and effective when the number of infections increases drastically (Scenario 2) or the number of infections is large (Scenario 4).
Quarantine: Initially, experts assumed that quarantined individuals contact only their acquaintances staying at home. However, by experimentation we found that the disease is transmitted from self-quarantined patients to strangers, which is plausible in the real world. We also discovered that its efficacy is similar to that of confinement, but it is more expensive. This empirical finding is important for policymakers to make the right decisions. For this reason, our prescriptions do not use quarantine.
Isolation: Empirically, we found that the self-recovery period is distributed from 15 to 30 days, including the pre-symptomatic period. Isolating infectious individuals is the most effective option to stop them from spreading the disease, since no contact with isolated individuals is allowed.
Hospitalization: Similar to isolation, hospitalized patients are not allowed to contact others. Since there is no penalty for critical cases and no reward for recovered cases, hospitalization, which is twice as expensive as isolation, has no merit. Therefore, we exclude hospitalization from the interventions.
We take advantage of this knowledge to reduce the combinatorial search space for optimization.
3.2 Mitigation Strategies
Leveraging these observations, we optimize the combination of two types of interventions, namely confinement and isolation. To decide which agents to confine or isolate, we compute an infection probability (see Eq. 1) and a risk factor for each agent. The risk factor of an agent measures how dangerous the individual would be if they were pre-symptomatic: the expected number of infections is estimated using their number of acquaintances and their daily number of co-located agents. We group Scenarios 1, 3, and 5 together and use the same heuristics/rules for them due to their similarity. We highlight mitigation strategies for Scenarios 2 and 4, while strategies for the other scenarios can be explained under the following general rules.²
3.2.1 General rules. While infections from acquaintance contacts and stranger contacts can be inferred by contact tracing, there is no clue to infer who is a patient zero until they are discovered. Thus, whenever a new case is discovered, we isolate the symptomatic individual and select susceptible individuals with a high infection probability for isolation. There is a trade-off between the number of infections and the number of interventions. That is, if we isolate more of the individuals in question by decreasing a threshold $\sigma$, the number of pre-symptomatic individuals is more likely to decrease. If $\sigma$ is too high, however, we end up with more isolations because the number of infection cases increases. Therefore, the main challenge is to find a $\sigma$ that balances the two measures.
Another dimension we take into account is the risk factor. Non-symptomatic individuals are isolated if the product of the infection probability and the risk factor exceeds a threshold $\epsilon$. Suppose that there are two non-symptomatic individuals $x$ and $y$ having the same infection probability. If $x$ has more acquaintances than $y$, then the risk of not isolating $x$ is higher than that of not isolating $y$. Therefore, $x$ is more likely to be isolated than $y$.
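The two thresholds can be combined into a simple selection rule, sketched below under several assumptions. The text does not give a formula for the risk factor, so `risk_factor` uses a hypothetical one (expected infections as P_c times the number of acquaintances plus P_s times the number of daily co-located agents). Combining the σ test and the ε test with a logical "or" is likewise an assumption, as are the `Agent` record and all numeric values except P_c and P_s, which are the Scenario 2 rates.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Agent:
    agent_id: int
    infection_probability: float  # P(x) as computed from Eq. 1
    acquaintances: int            # number of acquaintances
    daily_colocated: int          # daily number of co-located agents


def risk_factor(agent: Agent, p_c: float, p_s: float) -> float:
    """Hypothetical risk factor: expected infections if the agent were pre-symptomatic."""
    return p_c * agent.acquaintances + p_s * agent.daily_colocated


def select_for_isolation(agents: Iterable[Agent], p_c: float, p_s: float,
                         sigma: float, epsilon: float) -> List[int]:
    """Return the ids of non-symptomatic agents that exceed either threshold."""
    selected = []
    for agent in agents:
        over_sigma = agent.infection_probability > sigma
        over_epsilon = agent.infection_probability * risk_factor(agent, p_c, p_s) > epsilon
        if over_sigma or over_epsilon:  # assumption: either test triggers isolation
            selected.append(agent.agent_id)
    return selected


# x and y have the same infection probability, but x has more acquaintances.
x = Agent(agent_id=1, infection_probability=0.02, acquaintances=20, daily_colocated=10)
y = Agent(agent_id=2, infection_probability=0.02, acquaintances=2, daily_colocated=10)
chosen = select_for_isolation([x, y], p_c=0.05, p_s=0.01, sigma=0.5, epsilon=0.01)
```

With these made-up numbers, only the better-connected individual is selected, which matches the argument in the text that the risk of not isolating them is higher.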
² Source code is available at https://github.com/joonseok-kim/kdd-papw20-challenge
3.2.2 Scenario 2. The pandemic of Scenario 2 has higher infection rates, where $P_c = 0.05$ and $P_s = 0.01$. Due to the high infection rates, we need to isolate more aggressively by lowering the threshold $\sigma$. However, controlling only $\sigma$ is not sufficient to mitigate the spread of the pandemic due to the high reproduction number. Our tactic for this scenario is to prevent gatherings of eleven or more people. If an individual has $n > 10$ acquaintances in a working area, we confine the individual with probability $(n - 11)/n$. This strategy can be seen as a work-from-home order.
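The confinement rule for this scenario is easy to state in code. The sketch below reads the confinement probability as (n − 11)/n for an individual with n > 10 acquaintances in a working area, and zero otherwise. The function names, the `limit` parameter, and the use of a seeded random generator are choices made for this example rather than details of the released source code.

```python
import random


def confinement_probability(n_work_acquaintances: int, limit: int = 11) -> float:
    """Probability of confining an individual: (n - limit) / n when n >= limit, else 0."""
    n = n_work_acquaintances
    if n < limit:
        return 0.0
    return (n - limit) / n


def should_confine(n_work_acquaintances: int, rng: random.Random) -> bool:
    """Draw a confinement decision for one individual."""
    return rng.random() < confinement_probability(n_work_acquaintances)


rng = random.Random(42)
for n in (5, 11, 22, 44):
    print(n, confinement_probability(n), should_confine(n, rng))
```

The probability grows with the number of workplace acquaintances: it is 0 for n = 11, 0.5 for n = 22, and 0.75 for n = 44, so individuals in larger workplace groups are confined more often.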
3.2.3 Scenario 4. Scenario 4 is the extreme case that shows a trade-off between the number of accumulated infections and intervention costs. At the beginning, 300 pre-symptomatic individuals are spawned, and more than 100 susceptible individuals get infected by them during the first day. About 100-200 symptomatic cases, including about 60 patient zeros, are discovered the very next day. Therefore, about 240 patient zeros remain unknown, and our goal is to cost-effectively suppress the initial spread. Along with potential cases, we randomly select many individuals with high risk factors to isolate. Similar to Scenario 2, we prevent gatherings of more than five people in the working areas. We then release the isolated individuals to reduce costs, and the next day we repeat this process to find more pre-symptomatic cases.
4 CONCLUSION
In this paper, we introduced Expert-in-the-Loop prescriptive analytics, which leverages experts’ best knowledge. While AI and machine learning can excel at well-defined problems such as optimization, experts can expedite the optimization process by modeling the problem and choosing the right tools. We also demonstrated how simulations can be used to discover ground truths as groundwork for prescriptive analytics. Although our mitigation strategies yielded fairly good results for general cases, we still have difficulty narrowing down the complex combination of rules for Scenario 2 and Scenario 4. To tackle the combination of rules, we will leverage evolutionary algorithms to explore diverse possible worlds in future work.
ACKNOWLEDGMENT
This work was supported by the Defense Advanced Research Projects Agency (DARPA) under cooperative agreement No. HR00111820005 and the National Science Foundation Grant DEB-2030685. The content of the information does not necessarily reflect the position or the policy of the Government, and no official endorsement should be inferred.
REFERENCES
[1] Challenge on mobility intervention for epidemics. https://prescriptive-analytics.github.io/challenge-cfp/index.html, Accessed July 24, 2020.
[2] Covid-19 ensemble dashboard. https://covid-ensemble.gmu.edu/, Accessed July 24, 2020.
[3] M. Gell-Mann. The interpretation of the new particles as displaced charge multiplets. Il Nuovo Cimento (1955-1965), 4(2):848–866, 1956.
[4] J.-S. Kim, H. Jin, H. Kavak, O. C. Rouly, A. Crooks, D. Pfoser, C. Wenk, and A. Züfle. Location-based Social Network Data Generation Based on Patterns of Life. In IEEE MDM 2020, pages 158–167. IEEE, 2020.
[5] J.-S. Kim, H. Kavak, U. Manzoor, and A. Züfle. Advancing simulation experimentation capabilities with runtime interventions. In SpringSim 2019, pages 1–11. IEEE, 2019.
[6] J.-S. Kim, H. Kavak, C. O. Rouly, H. Jin, A. Crooks, D. Pfoser, C. Wenk, and A. Züfle. Location-based social simulation for prescriptive analytics of disease spread. SIGSPATIAL Special, 12(1):53–61, 2020.
[7] J. Snow. On the Mode of Communication of Cholera. Churchill, London, England, 1855.
[8] A. Züfle, G. Trajcevski, D. Pfoser, and J.-S. Kim. Managing uncertainty in evolving geo-spatial data. In IEEE MDM 2020, pages 5–8. IEEE, 2020.