Calibrating with Multiple Criteria:
A Demonstration of Dominance
Jennifer Badham1, Chipp Jansen2, Nigel Shardlow3, Thomas
French3
1Centre for Public Health, Queen’s University, Belfast BT12 6BA, United Kingdom
2Department of Informatics, King’s College, London, United Kingdom
3Sandtable Ltd, 69 Old Street, London, EC1V 9HX, United Kingdom
Correspondence should be addressed to research@criticalconnections.com.au
Journal of Artificial Societies and Social Simulation 20(2) 11, 2017
Doi: 10.18564/jasss.3212 Url: http://jasss.soc.surrey.ac.uk/20/2/11.html
Received: 02-05-2016 Accepted: 12-09-2016 Published: 31-03-2017
Abstract: Pattern oriented modelling (POM) is an approach to calibration or validation that assesses a model
using multiple weak patterns. We extend the concept of POM, using dominance to objectively identify the best
parameter candidates. The TELL ME agent-based model is used to demonstrate the approach. This model sim-
ulates personal decisions to adopt protective behaviour during an influenza epidemic. The model fit is assessed
by the size and timing of maximum behaviour adoption, as well as the more usual criterion of minimising mean
squared error between actual and estimated behaviour. The rigorous approach to calibration supported explicit trading off between these criteria, and ultimately demonstrated that there were significant flaws in the model structure.
Keywords: Multi-Criteria Decision Making, Calibration, Pattern-Oriented Modelling, Dominance, Behaviour Modelling
Introduction
1.1 Agent-Based Models (ABMs) simulate "unique and autonomous entities that usually interact with each other
and their environment locally" (Railsback & Grimm 2012, p. 10). Such models are therefore designed at the
micro-scale, with rules to guide the actions of the simulated individuals based on their specific characteristics
and situation. In contrast, much of the interesting behaviour of the model occurs at the macro-level.
1.2 This scale mismatch complicates model calibration. Parameters for those micro-scale rules may be unmeasurable, but the aggregated effect of the decisions is routinely collected in data about the operation of the system
being modelled. With a large number of parameters, it may be relatively easy to obtain an apparently good fit
overall that is nevertheless hiding structural invalidity or other problems. One way to make the calibration more
robust is by assessing model output against multiple criteria selected for their diversity, referred to as pattern-oriented modelling (Wiegand et al. 2004; Railsback & Grimm 2012). Doing so, however, introduces the problem of defining an overall 'best fit', since different sets of parameter values may generate model output that meets different criteria.
1.3 One approach is to establish an overall objective function that combines each of the criteria in some way. For ex-
ample, the criteria could be weighted and the model calibrated to best fit the weighted combination. However,
this approach introduces an arbitrary function to combine the criteria (such as additional parameters in the
form of criteria weights), typically with only limited knowledge of what is being traded away. Another method
uses stakeholder or other experts to assess the reasonableness of the model’s behaviour (Moss 2007).
1.4 Categorical calibration or filtering (Wiegand et al. 2004; Railsback & Grimm 2012) uses acceptance thresholds
for each criterion and retains all parameter sets that meet all the thresholds for further consideration. However,
this is ineicient. If any threshold is set too high, a parameter set could be rejected that is an excellent fit on
all other criteria. On the other hand, setting a lower threshold passes too many potential solutions to be easily
compared.
Figure 1: Definition of dominance (two dimensions). Point D is dominated by point A because point A is better against all criteria than point D. That is, regardless of the relative importance of the two criteria, point A is always preferred over point D. Similarly, point E is dominated by both point B and point C. But point E is not dominated by point A; if criterion 2 were much more important than criterion 1, it may be appropriate to select E for the small improvement in criterion 2 at the expense of the loss in criterion 1. The shaded area indicates the parameter space that is dominated by any of the three points A, B or C. The Pareto efficient front is the set of points that are not dominated by any other, in this case any points on the dashed line specified by the points A, B and C. Along this line, improvements in one dimension can only be achieved at the expense of at least one other criterion; for example, moving from A to B improves criterion 2 but worsens criterion 1. With more dimensions, the Pareto front is given by a piecewise hyperplane, but is also the set of points that appear on the front of any pair of dimensions, regardless of whether they are dominated in other pairs of dimensions.
1.5 This paper instead presents the dominance approach, which does not arbitrarily prioritise criteria or set subjective thresholds. Instead, dominance is used to identify all the parameter sets that are on the Pareto efficient
frontier. These are the parameter sets that are objectively best, where an improvement in one criterion can
only be made by reducing the fit for another criterion (see Figure 1). While this approach is well established in
operations research for multi-criteria decision making or optimisation (Müssel et al. 2012), it is less well known
in social simulation (with some exceptions, such as Schmitt et al. 2015).
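For concreteness, the dominance relation and Pareto efficient frontier described above can be sketched as follows. This is illustrative Python, not part of the TELL ME implementation; all criteria are treated as errors to be minimised.

```python
def dominates(a, b):
    """True if candidate `a` dominates candidate `b`: at least as good on
    every criterion and strictly better on at least one (smaller is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """Return the candidates that are not dominated by any other candidate."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates if other != c)]

# Points loosely matching Figure 1: D is dominated by A, and E by both B and C.
A, B, C, D, E = (1, 3), (2, 2), (3, 1), (2, 4), (4, 2)
front = pareto_front([A, B, C, D, E])   # A, B and C survive
```

Note that the frontier is found purely by pairwise comparison; no weights or thresholds on the criteria are required.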
1.6 The method is described using a case study: calibrating the TELL ME model concerning protective behaviour in
response to an influenza epidemic. This paper first presents the model structure and the parameters required
to operationalise the links between attitude, behaviour and epidemic spread. The description focuses on the
necessary background to understand the calibration process presented in the following sections. The approach
to setting parameter values is then described, with the results of that process and conclusions following.
Case Study Description: TELL ME Model
2.1 The European funded TELL ME project¹ concerned communication before, during and after an influenza pandemic. Ending in January 2015, it was intended to assist health agencies to develop communication plans that
encourage people to adopt appropriate behaviour to reduce influenza transmission. One project output was a
prototype ABM, to explore the potential of such models to assist communication planning. The agents in that
model represent people making decisions about protective behaviour (such as vaccination or hand hygiene) in
light of personal attitudes, norms and epidemic risk.
2.2 The core of the TELL ME model is individual agents making decisions about whether to adopt behaviour to
reduce their chance of becoming infected with influenza. Protective behaviour is adopted (or dropped) by an
agent if the weighted average of attitude, subjective norms and perception of threat exceeds (or falls below)
some threshold.
2.3 Each agent is attached to a patch (a location defined by a grid) overlaid on a map of the country in which the
epidemic is being simulated. The epidemic is mathematically modelled by the patches; there is no transmission
between individual agents. The infectivity at any patch is adjusted for the proportion of local agents who have adopted protective behaviour and the efficacy of that behaviour. In addition, the number of new infections
in nearby patches is a key input to each agent’s perception of threat. Thus, the agent protective behaviour
decisions and the transmission of the epidemic are mutually dependent.
2.4 The operationalisation of this model design is described briefly below. This description focuses on those elements of the model that were calibrated using dominance. The behaviour of the agents is also affected by communication plans, which are input to the model as sets of messages. The communication elements were disabled for calibration purposes due to lack of data, and are therefore not described here. The model was implemented in NetLogo (Wilensky 1999), with the full code, complete model design (Badham & Gilbert 2015) and other documentation available online.²
Epidemic transmission
2.5 The epidemic is modelled by updating counts for each disease state of the population at each patch. For in-
fluenza, a suitable epidemic model is the SEIR model, in which people conceptually start in the susceptible (S)
state, become exposed (E) but not yet infectious, then become infectious (I) and are eventually removed from
calculations (R) because they either recover and become immune or they die. The model represents this process mathematically (Diekmann & Heesterbeek 2000), governed by transition rate parameters (β for S→E, λ for E→I, and γ for I→R).
dS/dt = −βSI
dE/dt = βSI − λE
dI/dt = λE − γI
dR/dt = γI     (1)
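As an illustration, Equation 1 can be integrated numerically with a simple Euler scheme, treating the state variables as population fractions. The rate parameters follow Table 1 (R0 = 1.5, latency 2 days, infectious period 6 days); the step size, horizon and initial conditions are our own illustrative choices, not taken from the TELL ME code.

```python
def seir_step(s, e, i, r, beta, lam, gamma, dt=1.0):
    """Advance the SEIR fractions of Equation 1 by one Euler step."""
    new_exposed = beta * s * i
    ds = -new_exposed
    de = new_exposed - lam * e
    di = lam * e - gamma * i
    dr = gamma * i
    return s + ds * dt, e + de * dt, i + di * dt, r + dr * dt

gamma = 1 / 6        # recovery rate: infectious period of 6 days
lam = 1 / 2          # progression rate: latency period of 2 days
beta = 1.5 * gamma   # from R0 = beta / gamma = 1.5

# Seed a small outbreak and run for 300 simulated days.
s, e, i, r = 0.999, 0.0, 0.001, 0.0
for _ in range(300):
    s, e, i, r = seir_step(s, e, i, r, beta, lam, gamma)
```

With R0 = 1.5 the epidemic eventually burns out with a substantial fraction of the population still susceptible, consistent with the need to choose the lowest R0 that reliably initiates an epidemic.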
2.6 In each patch or region (r), the value of the transition rate parameter from S to E (β) is reduced in accordance with the behaviour decisions taken by individuals at that patch and the efficacy (E) of the behaviour. The reduced infectivity rate (calculated with Equation 2) is used in the transmission equations (Equation 1), leading to a lower local incidence. To support a mix of behaviour (and hence different reductions in infectivity between patches), each patch is home to at least ten agents, with greater numbers in those patches that correspond to high population density real world locations.

β_r = β(1 − P_r E)     (2)
2.7 To allow the epidemic to spread, a proportion of estimated new exposures for a region are actually created
in neighbouring patches to simulate travel. This requires two additional parameters, the proportion of new
infections created at other locations, and the split between neighbouring or longer distance patches.
Operationalising decisions about protective behavior
2.8 The agents’ behaviour decisions are based on three psychological models: the Theory of Planned Behavior
(Ajzen 1991), Health Belief Model (Rosenstock 1974), and Protection Motivation Theory (Maddux & Rogers 1983).
The key factors of attitude, norms and threat from these models were used as the inputs for agent behaviour.
The agent compares the weighted average of the three inputs to a threshold (Equation 3) for each type of be-
haviour (vaccination or other protective). If the value is higher, the agent adopts the non-vaccination behaviour
or seeks vaccination, and non-vaccination behaviour ceases once the value falls below the relevant threshold.
Vaccination cannot be dropped. Threat has the same value for both types of behaviour, but attitude, norms, weights and thresholds may be different.
ω_A·A + ω_N·N + (1 − ω_A − ω_N)·T_t ≥ B → adopt behaviour
ω_A·A + ω_N·N + (1 − ω_A − ω_N)·T_t < B → abandon behaviour     (3)
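A minimal sketch of the decision rule in Equation 3 (illustrative Python; the function name is ours, and the default weights and threshold are the calibrated set 6 values reported later in Table 3):

```python
def decide(attitude, norm, threat, w_a=0.35, w_n=0.10, threshold=0.25):
    """Return True if the agent adopts (or keeps) the protective behaviour:
    the weighted average of attitude, norms and threat meets the threshold."""
    score = w_a * attitude + w_n * norm + (1 - w_a - w_n) * threat
    return score >= threshold

# A strongly positive attitude is enough to adopt even with no epidemic threat,
# while a weaker attitude adopts only once perceived threat rises.
decide(0.75, 0.0, 0.0)   # adopts
decide(0.30, 0.0, 0.0)   # does not adopt
decide(0.30, 0.0, 0.5)   # adopts once threat rises
```

With the attitude distribution peaked at 0.75, this rule makes a majority of agents adopt the behaviour before the epidemic begins, matching the empirical baseline described later.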
2.9 Attitude is operationalised as a value in the range [0,1], initially selected from a distribution that reflects the
broad attitude range of the population. Subjective norms describe how a person believes family, friends and
other personally significant people expect them to behave and the extent to which they feel compelled to con-
form. The norm is operationalised as the proportion of nearby agents who have adopted the behaviour.
2.10 Perceived threat (T_t) reflects both susceptibility and severity (Equation 4). Following the method of Durham & Casman (2012), susceptibility is modelled with a discounted (δ) cumulative incidence time series. This means that perceived susceptibility will increase as the epidemic spreads, but recent new cases (c_t) will impact more
strongly than older cases. In contrast to the cited paper, only nearby cases are included in the time series for
the TELL ME model, so perceived susceptibility will be higher for the simulated individuals that are close to
the new cases than for those further away. Severity is included as a simple ‘worry’ multiplier (W), and can be
interpreted as subjective severity relative to some reference epidemic.
2.11
s_t = δ·s_{t−1} + c_{t−1}
T_t = W·s_t     (4)
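The susceptibility update in Equation 4 is a one-line recurrence; a sketch in illustrative Python, with δ = 0.18 (the calibrated incidence discount from the selected parameter set) and W = 1 as defaults:

```python
def step_threat(s_prev, cases_prev, delta=0.18, worry=1.0):
    """One update of Equation 4: s_t = delta * s_{t-1} + c_{t-1}, T_t = W * s_t.
    `cases_prev` is the previous period's nearby incidence."""
    s = delta * s_prev + cases_prev
    return s, worry * s

# A burst of 10 nearby cases raises threat immediately; with no further
# cases the signal decays geometrically at rate delta.
s, threat = step_threat(0.0, 10.0)      # threat jumps to 10
s, threat = step_threat(s, 0.0)         # decays to 1.8 the next step
```

The small δ values tested (0.02 to 0.2, Table 2) mean the signal is dominated by very recent incidence, so perceived threat tracks the local epidemic curve closely.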
Calibration Process
3.1 From the model structure discussion, it is clear that there are many parameters to be determined. Some may be
estimated directly from measurable values in the real world, such as population counts. Ideally, unmeasurable
values should be calibrated to optimise some measure of goodness of fit between model results and real world
data.
3.2 The first phase simplified the model to reduce the number of parameters influencing results. This was done by
excluding the communication component and fixing protective behaviour to have no effect. Other values were fixed at values drawn from literature, specifically those that affected the distribution of attitudes and the transmission of the epidemic. The exclusion of some components and setting of other parameters to fixed estimates
can be interpreted as reduction in the dimensions of the parameter space, reducing the scope of the calibration
task.
3.3 The second phase calibrated the parameters that are central to the model results; those that govern the agents’
decisions to adopt or drop protective behaviour as an epidemic progresses (weights in Equation 3 and discount
in Equation 4). This phase is where dominance was used, to assess parameter sets against three criteria: size
and timing of maximum behaviour adoption, as well as the more usual criterion of minimising mean squared
error between actual and estimated behaviour.
3.4 The model parameters are summarised in Table 1, together with how they were used in the calibration process.
While the TELL ME model included both vaccination and non-vaccination behaviour, only the latter is reported
here because the process was identical. Non-vaccination behaviour was calibrated with various datasets col-
lected during the 2009 H1N1 epidemic in Hong Kong. The calibration process is described in more detail in the
remainder of this section.
Dimension reduction: protective behaviour
3.5 Attitude distribution was based on a study of behaviour during the 2009 H1N1 epidemic in Hong Kong (Cowling
et al. 2010), which included four questions about hand hygiene: covering mouth when coughing or sneezing,
washing hands, using liquid soap, and avoiding directly touching common objects such as door knobs. A trian-
gular distribution over the interval [0,1] with mode of 0.75 was used to allocate attitude scores in the model as
an approximation to these data.
3.6 The eicacy of protective behaviour (E) was set to zero (ineective) during calibration. That is, agents respond
to the changing epidemic situation in their decision processes, but do not influence that epidemic. This ensures
simulations using the same random seeds will generate an identical epidemic regardless of behaviour adoption,
allowing simulated behaviour to respond to the relevant incidence levels.
Dimension reduction: epidemic transmission
3.7 Several parameters that influence epidemic spread were estimated from data. These are the various transition
rates between epidemic states, the structure of the population in which the epidemic is occurring, and the
Symbol   Description                          Value
A_0      Base attitudes                       Fixed distribution
P_r      Population by region                 GIS data
R_0      Basic reproduction ratio (gives β)   Fixed at 1.5
1/λ      Latency period (E→I)                 Fixed at 2
1/γ      Recovery period (I→R)                Fixed at 6
—        Population traveling                 Fixed at 0.30
—        Long distance traveling              Fixed at 0.85
W        Severity relative to H1N1            Fixed at 1
δ        Incidence discount                   To be calibrated
ω_A      Attitude weight (x2)                 To be calibrated
ω_N      Norms weight (x2)                    To be calibrated
B        Behaviour threshold (x2)             To be calibrated
E        Behaviour efficacy (x2)              Fixed at 0
—        Meaning of 'nearby'                  Fixed at 3 patches
Note: Parameters associated with the effect of communication are not listed, as communication was removed from the calibration process.
Table 1: TELL ME model parameter settings for calibration.
mobility of that population. The multiplier in Equation 4 was set at W = 1, establishing H1N1 as the reference epidemic.
3.8 The basic reproductive ratio (denoted R_0) is related to the parameters in Equation 1 by R_0 = β/γ (Diekmann & Heesterbeek 2000). R_0 for the 2009 H1N1 epidemic was estimated as 1.1-1.4 (European Centre for Disease Prevention and Control 2010). Calibration experiments were run with R_0 = 1.5 (the lowest value for which an epidemic could be reliably initiated), latency period of 2 days (European Centre for Disease Prevention and Control 2010), and infectious period of 6 days (Fielding et al. 2014).
3.9 The population at each patch was calculated from population densities taken from GIS datasets of projected
population density for 2015 (obtained from Population Density Grid Future collection held by Center for Interna-
tional Earth Science Information Network - CIESIN - Columbia University & Centro Internacional de Agricultura
Tropical - CIAT 2013). These densities were adjusted to match the raster resolution to the NetLogo patch size
and then total population normalised to the forecast national population for 2015 (United Nations, Department of Economic and Social Affairs, Population Division 2013).
3.10 As epidemic processes (Equation 1) occur independently within each patch, the model explicitly allocates a
proportion of the new infections created by a patch to other patches to represent spreading of the epidemic
due to travel. The proportion of new infections allocated to other patches was set at 0.3, with 0.85 allocated to
immediate neighbours and 0.15 allocated randomly to patches weighted by population counts. These values
provide a qualitatively reasonable pattern of epidemic spread.
Dominance analysis of behaviour parameters
3.11 Four parameters are directly involved in agent adoption of protective behaviour: weights for attitude and norms, the discount applied for the cumulative incidence, and the threshold score for adoption (ω_A, ω_N, δ, and B in
Equations 3 and 4). Briefly, multiple simulations were run while systematically varying these parameters to
generate a behaviour adoption curve. That curve was assessed against empirical data on three criteria, and
dominance analysis was used to identify the best fit candidates.
3.12 Broadly, the empirical behaviour data has an initial population proportion of approximately 65%, which rises
to 70% and then falls below the starting level. This rise and fall was considered the key qualitative feature of
the data and two aspects were included: timing and size of the bump. The three criteria to select the best fit
parameter sets were:
mean squared error between prediction and actual over all points in the data series (MSE);
the dierence in valuesbetween the maximum predicted adoption proportion and maximum actual adop-
tion proportion (Max); and
Parameter                 Range
Attitude weight (ω_A)     0.2 to 0.7 by 0.05
Norms weight (ω_N)        0.1 to 0.5 by 0.05
Incidence discount (δ)    0.02 to 0.2 by 0.02
Behaviour threshold (B)   0.2 to 0.7 by 0.05
Table 2: Parameter values tested in the calibration process.
the number of ticks (days) between the timing of the maximum predicted adoption and maximum actual
adoption (When).
3.13 Experiments and dominance analysis were performed with the Sandtable Model Foundry (Sandtable 2015). This proprietary system was used to manage several aspects of the simulation in a single pass: sampling the parameter space, submitting the simulations in a distributed computing environment via the NetLogo API, comparing the result to the specified criteria, and calculating the dominance fronts. As each run takes several minutes, the sampling and distributed computing environment made it feasible to comprehensively explore the parameter space in a reasonable time, and the within-system dominance calculation simplified analysis.³
3.14 Simulations were run with parameter values selected from the ranges in Table 2, chosen so as to require a contribution by attitude (ω_A ≥ 0.2) to support heterogeneity of behaviour between agents on a single patch. Parameter combinations were excluded if they did not include contributions by all three influencing factors of attitude, norms (ω_N ≥ 0.1), and threat (ω_A + ω_N ≤ 0.9). The parameter space was sampled using the Latin Hypercube method, with 813 combinations selected.
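The sampling step was performed within the Sandtable Model Foundry. Purely to illustrate the idea, a stdlib-only Latin Hypercube sketch over the Table 2 ranges, discarding combinations that violate the weight constraint ω_A + ω_N ≤ 0.9, might look like this (names and the seed are ours):

```python
import random

RANGES = {                      # (low, high), following Table 2
    "attitude_weight": (0.20, 0.70),
    "norms_weight":    (0.10, 0.50),
    "discount":        (0.02, 0.20),
    "threshold":       (0.20, 0.70),
}

def latin_hypercube(n, rng=random.Random(42)):
    """Draw n points: each dimension is split into n equal strata, each
    stratum used exactly once, in an independently shuffled order per
    dimension. Points violating the weight constraint are discarded."""
    columns = {}
    for name, (lo, hi) in RANGES.items():
        strata = list(range(n))
        rng.shuffle(strata)
        columns[name] = [lo + (k + rng.random()) * (hi - lo) / n for k in strata]
    points = [dict(zip(columns, vals)) for vals in zip(*columns.values())]
    return [p for p in points if p["attitude_weight"] + p["norms_weight"] <= 0.9]
```

Because the constraint is applied after sampling, somewhat more than the target number of raw points would be drawn in practice to end up with the desired number of valid combinations.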
3.15 Ten simulations were run for each parameter combination. Preliminary testing with 30 repetitions indicated that simulations using the same parameters could generate epidemics that differ substantially on when they 'take off', but they had similar shapes once started, and hence similar behaviour adoption curves (not specifically shown, but visible in Figure 5). Ten of the seeds were retained for use with the calibration simulations. These random seeds generated epidemics with known peaks regardless of the behaviour parameter combination, as the generated epidemic was not affected by protective behaviour (since efficacy is set to 0).
3.16 The behaviour curves from the 10 simulations were centred on the timestep of the epidemic peak and aver-
aged. The average curve was compared to the (centred) 13 data points of the Hong Kong hand washing dataset
(Cowling et al. 2010, supplementary information) for calculation of the three fit criteria.
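The three criteria can be computed from a centred simulated curve and the centred empirical series along these lines (illustrative Python; the function name and the toy data in the test are ours, not the Hong Kong dataset):

```python
def fit_criteria(sim, obs, sim_days, obs_days):
    """Compute (MSE, Max, When) for one parameter set.
    sim/obs: adoption proportions; *_days: day index of each point,
    centred so that day 0 is the epidemic peak."""
    # MSE: compare only at the days where empirical observations exist.
    matched = [sim[sim_days.index(d)] for d in obs_days]
    mse = sum((s - o) ** 2 for s, o in zip(matched, obs)) / len(obs)
    # Max: difference between predicted and actual maximum adoption levels.
    max_err = abs(max(sim) - max(obs))
    # When: days between the timing of the two maxima.
    when_err = abs(sim_days[sim.index(max(sim))] - obs_days[obs.index(max(obs))])
    return mse, max_err, when_err
```

Each criterion is an error to be minimised, so a parameter set scoring (0, 0, 0) would reproduce the empirical curve, its peak height and its peak timing exactly.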
3.17 Parallel plot analysis was used as an exploratory tool. This is an interactive technique using parallel coordinates
(Inselberg 1997; Chang 2015) to simultaneously show the full set of model parameters and the criteria metrics.
That is, simulation runs can be filtered by specific values or ranges of one or more of the input parameters, or by their difference from the criteria.
3.18 Dominance analysis was used to identify the best fit candidate parameter sets. This technique assigns each
parameter set to a dominance front (using the algorithm of Deb et al. 2002). Front 0 is the Pareto efficient frontier, where any improvement in the fit for one criterion would decrease the fit against at least one of the other criteria (Figure 1). Front 1 would be the Pareto efficient frontier if all the front 0 parameter sets were removed from the comparison, and so on for higher front values until all parameter sets are allocated a front number.
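The front assignment can be sketched as iterative peeling: find the non-dominated set, label it front 0, remove it, and repeat. (The fast non-dominated sort of Deb et al. 2002 computes the same fronts with more efficient bookkeeping; this O(n³) version is only illustrative.)

```python
def dominance_fronts(scores):
    """Assign each criteria tuple a front number: 0 for the Pareto efficient
    frontier, 1 for the frontier once front 0 is removed, and so on.
    Smaller criterion values are better."""
    def dominated(p, others):
        return any(all(o[k] <= p[k] for k in range(len(p))) and
                   any(o[k] < p[k] for k in range(len(p)))
                   for o in others)

    front = [None] * len(scores)
    remaining = set(range(len(scores)))
    level = 0
    while remaining:
        current = [i for i in remaining
                   if not dominated(scores[i],
                                    [scores[j] for j in remaining if j != i])]
        for i in current:
            front[i] = level
        remaining -= set(current)
        level += 1
    return front
```

For example, with two criteria the points (1,3), (2,2) and (3,1) form front 0, while (2,4), (4,2) and (3,3), each dominated by a front 0 point, form front 1.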
Results
4.1 The parameter sets that are not dominated are those on the Pareto efficient frontier (front 0). These are described in Table 3 with their performance against the three criteria. By definition, for all other parameter sets,
there is at least one on the frontier that is a better fit on at least one criterion and at least as good a fit on all
others. Thus, these are the objectively best candidates.
4.2 The choice between these for the best fit overall is subjective, trading performance in one criterion against
performance in the others and also adding other factors not captured in the criteria. Two methods were used
to assist with that choice, quantitative distance from best fit criteria and qualitative fit of behaviour curves.
4.3 The fit for all tested parameter sets is displayed in Figure 2, with the non-dominated (front 0) candidates marked in red and labelled with the set number from Table 3. Each appears in the lower left corner of at least one of the sub-figures. From (a) and (b), a small error in the timing of the maximum adoption cannot be combined with a small error in either of the other properties. Focussing only on those other properties (sub-figure (d), the
Parameter values | Criteria
Set   ω_A    ω_N    δ     B      MSE   Max   When
1 0.70 0.20 0.18 0.50 0.00 0.09 78
2 0.70 0.20 0.10 0.50 0.00 0.08 78
3 0.65 0.10 0.20 0.40 0.00 0.05 71
4 0.60 0.20 0.06 0.45 0.00 0.05 84
5 0.55 0.10 0.10 0.35 0.00 0.01 76
6 0.35 0.10 0.18 0.25 0.01 0.01 73
7 0.65 0.10 0.00 0.50 0.07 0.01 206
8 0.55 0.10 0.18 0.25 0.08 0.19 69
9 0.70 0.20 0.14 0.30 0.12 0.26 62
10 0.55 0.35 0.16 0.40 0.13 0.27 61
11 0.20 0.30 0.04 0.20 0.19 0.29 36
12 0.30 0.35 0.00 0.30 0.19 0.29 19
13 0.25 0.30 0.10 0.25 0.23 0.29 9
14 0.25 0.50 0.14 0.25 0.29 0.29 33
15 0.25 0.25 0.02 0.35 0.34 0.01 104
Table 3: Best fit parameter sets and their assessment.
relevant section of (c) expanded), parameter sets 5 and 6 achieve a much closer maximum adoption compared
to sets 3 and 4, with only a small loss in the mean squared error. While the same analysis could have been
performed by examining Table 3 directly, the visualisation allows fast comparison, even with a larger number
of criteria.
[Figure 2 scatter plots omitted: panels (a)-(c) plot MSE adoption, Max adoption error and When adoption error against each other, with dominance fronts 0, 1 and 2+ distinguished; panel (d) enlarges the best fit region of (c).]
Figure 2: Average outcome over 10 simulations for each of 813 parameter sets. Subfigures (a), (b) and (c) display the outcome against different pairs of criteria, with subfigure (d) focussing on the best fit section of (c). Those on the Pareto efficient frontier are coloured red and numbered according to Table 3.
4.4 These best fit candidates are also coloured red in the parallel coordinate analysis (see Figure 3). This revealed
that good fit parameter sets existed throughout the tested parameter space for the weights and discount, but
that the threshold should not exceed 0.5. The main benefit of this analysis, however, is its interactivity. For example, it can provide a visual method of pattern-oriented modelling filtering, by adjusting ranges on the criteria results and displaying the parameter values of the simulations that survive.
[Figure 3 parallel coordinates plot omitted: axes are front, incidence-discount, attitude-weight-NV, norms-weight-NV, protectNV-threshold, max_protect_error, when_max_protect_error and prop_protect_mse.]
Figure 3: Interactive analysis of simulation experiments. The input parameter values appear in the left section of the screen, and the fit against each criterion on the right. Simulation runs can be highlighted in groups (such as all those on the Pareto efficient frontier as displayed) or individually to explore the effect of different combinations of parameter values.
4.5 For the qualitative visualisation, fifty simulations were run using the NetLogo BehaviorSpace tool (Wilensky 1999) for each of the non-dominated parameter sets. The average adoption curve is shown in Figure 4. Only sets
1 to 6 display the appropriate pattern of behaviour, with approximately two thirds of the population adopting
the behaviour before the start of the epidemic followed by an increase and then return to a similar level once
the epidemic has passed. An inspection of Table 3 shows that the mean squared error is similar for all six,
but parameter sets 5 and 6 also have a good match in the estimated maximum adoption level, supporting the
selection of either of these as the best fit.
[Figure 4 line chart omitted: proportion adopting non-vaccination behaviour against day (centred at the epidemic peak), one curve per candidate parameter set.]
Figure 4: The average of 50 simulation runs for each of the non-dominated candidate parameter sets. The selected best fit parameter set (set 6) is drawn in red. Empirical behavior values (extracted from Cowling et al. 2010, supplementary information) are shown with dots.
4.6 Ultimately, parameter set 6 was selected as the best fit and used as the TELL ME non-vaccination behaviour
default values. The individual runs for the model with these default parameter values are shown in Figure 5,
together with the average behaviour curve.
[Figure 5 line chart omitted: proportion adopting non-vaccination behaviour (approximately 0.5 to 0.8) against day, centred at the epidemic peak.]
Figure 5: 50 simulation runs for proportion adopting hand washing behaviour during the 2009 H1N1 epidemic in Hong Kong, together with the average behaviour curve (and 95% confidence interval). Parameter values are: 0.35 for attitude weight, 0.1 for norms weight, 0.55 for threat weight, 0.18 discount in cumulative incidence, and 0.25 for threshold. Empirical behaviour values (extracted from Cowling et al. 2010, supplementary information) are shown with dots.
Discussion
5.1 This paper describes a detailed calibration process using the prototype TELL ME model as a case study. The
model is complicated, with many components and parameters to reflect policy makers’ understanding of their
planning environment. It is also complex, with model behaviour shaped by two types of interactions. Personal decisions about protective behaviour affect epidemic progress, which influences perceptions of threat and hence personal decisions. Behaviour decisions of agents are also directly influenced by the decisions of nearby agents, through their perception of norms.
5.2 The calibration process first reduced the dimensions of the parameter space by setting epidemic parameters,
population density and attitude distribution to values drawn from the literature. Some other parameters were
set to values that removed their influence in the model (notably behaviour efficacy and those associated with communication).
5.3 This reduced the parameters required to calibrate the model to only four: attitude weight, norms weight, incidence discount and adoption threshold. These parameters control the central process of the model: adoption of protective behaviour in response to an epidemic. With only limited empirical information about behaviour throughout an epidemic, we used pattern-oriented modelling and attempted to calibrate against three weak signals: timing of the behaviour peak (compared to the epidemic peak), maximum level of protective behaviour, and minimising the mean square difference between the simulation estimate and measured behaviour level.
5.4 Having three assessment criteria raises the question of how to compare runs that have different rankings across criteria. The standard approach is to set an acceptance threshold for each criterion (Railsback & Grimm 2012) and then select only from those runs that pass all of them. However, this is inefficient: if the thresholds are set low enough to pass simulations that are generally excellent but slightly less fit on one criterion, then they also admit any simulation that is slightly less fit on all criteria. Instead, we have used the concept of dominance to identify the objectively best parameter sets; for any excluded simulation, there is at least one member of the dominant candidates that is better on at least one criterion and no worse on all the others. Additional criteria were then used to choose between these objectively good candidates, determining what to give up in order to achieve the best overall fit.
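The dominance filter itself is simple to express. The sketch below is a minimal illustration, not the tunePareto implementation used in the study; it assumes each criterion has already been oriented so that smaller values are better (for example, the absolute peak-timing error).

```python
def dominates(a, b):
    """True if objective vector a dominates b (all objectives minimised)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated(candidates):
    """Return the candidates not dominated by any other candidate.

    `candidates` maps a parameter-set label to a tuple of criterion
    values, each oriented so that smaller is better.
    """
    return {
        name: scores
        for name, scores in candidates.items()
        if not any(dominates(other, scores)
                   for other_name, other in candidates.items()
                   if other_name != name)
    }
```

Every excluded candidate is then, by construction, beaten on at least one criterion and matched on the rest by some member of the returned front.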
5.5 There is little similarity among the non-dominated parameter sets. Very different parameter values can achieve similar outcomes (for example, sets 5 and 6), and the parameter values in the best-fit sets covered a broad range. This reflects the interdependence between the parameters and emphasises the difficulties in calibrating the TELL ME model; it would not have been possible to identify these candidates by tuning parameters individually.
5.6 The rigorous calibration process was instrumental in detecting structural problems with the model. In particular, the prototype was unable to generate results with a behaviour peak earlier than the epidemic peak, in conflict with the empirical results for hand hygiene during the 2009 H1N1 epidemic in Hong Kong. A reasonable fit could have been achieved against a single criterion of minimum mean squared error, but assessing against multiple criteria highlighted the timing weakness.
5.7 Further consideration of the model rules makes it clear that this is a structural or theoretical gap rather than a failure of calibration. As the attitudes, weights and threshold are fixed, change in behaviour can arise only from changes in the norms or in perceived threat. The attitude weight is instrumental in setting the proportion adopting in the absence of an epidemic, but plays no part in behaviour change because the attitudes of agents are constant. As the epidemic nears an agent, local incidence increases, which raises perceived threat and may trigger adoption. This may in turn trigger a cascade through the norms component (the proportion of visible agents who are protecting themselves). However, the threat component of the behaviour decision (Equation 3) can only respond to an epidemic, not anticipate it, and the norms component can only accelerate adoption or delay abandoning it. Therefore, regardless of parameter values, the simulation is unable to generate a pattern with a behaviour curve that peaks before the epidemic peak.
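The structural limit can be made concrete with a schematic version of the adoption rule. This is not Equation 3 of the model: the functional form, names and weights below are illustrative assumptions, chosen only to show why a rule driven by current local incidence can react to an epidemic but never anticipate it.

```python
def adopts(attitude, perceived_threat, norm_share,
           w_attitude, w_norms, threshold):
    """Schematic adoption rule: a weighted sum of a fixed attitude,
    current perceived threat and the share of visible adopters is
    compared with a threshold. Illustrative only, not Equation 3.
    """
    threat_term = (1 - w_attitude - w_norms) * perceived_threat
    score = w_attitude * attitude + w_norms * norm_share + threat_term
    return score >= threshold
```

Because `attitude` and the weights are constants, the score can rise only through `perceived_threat` (driven by incidence that has already occurred) or `norm_share` (driven by adoption that has already occurred), so adoption necessarily lags the epidemic under any parameterisation of this form.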
Conclusion
6.1 Ultimately, the TELL ME ABM could not be calibrated adequately for policy assessment. That is, the best-fit parameter set was used as the model's default values, but the simulation did not produce realistic model behaviour. For the purposes of the TELL ME project, this outcome was disappointing but not unexpected. The ABM was a prototype intended to identify the extent to which such a model could be developed for planning purposes. The attempt highlighted both the limited empirical information about behaviour during an epidemic and the absence of information about the effect of communication. Relevant behavioural information must be collected if a full planning model is to be developed in the future.
6.2 In contrast, the use of dominance was successful in identifying the candidate parameter sets that are objectively best against several competing criteria. Selection between these candidates was then relatively simple, as only a limited number needed to be considered. Further, the rigorous process highlighted structural problems in the model, as the desired timing of the behaviour peak could not be achieved while also achieving good performance on the other criteria.
Acknowledgements
This paper benefited enormously from thoughtful comments and questions from several referees, and the authors would like to thank them for their care and effort. We would also like to thank Nigel Gilbert, Andrew Skates and other colleagues at the Centre for Research in Social Simulation (University of Surrey) and at Sandtable, and the TELL ME project partners for their input into the development of the TELL ME model.
Notes
1 This research has received funding from the European Research Council under the European Union's Seventh Framework Programme (FP/2007-2013), Grant Agreement number 278723. The full project title is TELL ME: Transparent communication in Epidemics: Learning Lessons from experience, delivering effective Messages, providing Evidence, with details at http://tellmeproject.eu/.
2 The model and supporting documentation are available from several online locations. The EU project site links to the model code and users' guide at http://www.tellmeproject.eu/node/392, together with reports concerning the project. The model and users' guide are also lodged with OpenABM at https://www.openabm.org/model/4536/version/1. The model and users' guide are also available from the CRESS website at http://cress.soc.surrey.ac.uk/web/resources/models/tell-me-model, as is the working paper with the detailed technical information. The calibration simulation dataset is available on request from the first author.
3 Similar functionality could be achieved within an open source environment by combining tools: one for the parameter space sampling and simulation management (such as OpenMOLE, MEME, or the lhs and RNetLogo packages in R), and another to analyse the results and calculate the dominance fronts (such as the tunePareto package in R).
References
Ajzen, I. (1991). The theory of planned behavior. Organizational Behavior and Human Decision Processes, 50(2), 179–211. doi:10.1016/0749-5978(91)90020-T
Badham, J. M. & Gilbert, N. (2015). TELL ME design: Protective behaviour during an epidemic. CRESS Working
Paper 2015:2, Centre for Research in Social Simulation, University of Surrey
Center for International Earth Science Information Network - CIESIN - Columbia University & Centro Interna-
cional de Agricultura Tropical - CIAT (2013). Gridded population of the world, version 3 (GPWv3): Population
density grid, future estimates (2015). NASA Socioeconomic Data and Applications Center (SEDAC)
Chang, K. (2015). Parallel Coordinates v0.5.0. Available from https://syntagmatic.github.io/parallel-
coordinates/
Cowling, B. J., Ng, D. M., Ip, D. K., Liao, Q., Lam, W. W., Wu, J. T., Lau, J. T., Griffiths, S. M. & Fielding, R. (2010). Community psychological and behavioral responses through the first wave of the 2009 influenza A(H1N1) pandemic in Hong Kong. Journal of Infectious Diseases, 202(6), 867–876
Deb, K., Pratap, A., Agarwal, S. & Meyarivan, T. (2002). A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2), 182–197
Diekmann, O. & Heesterbeek, J. (2000). Mathematical Epidemiology of Infectious Diseases. Wiley Chichester
Durham, D. P. & Casman, E. A. (2012). Incorporating individual health-protective decisions into disease trans-
mission models: A mathematical framework. Journal of The Royal Society Interface,9(68), 562–570
European Centre for Disease Prevention and Control (2010). The 2009 A(H1N1) pandemic in Europe. Tech. rep.,
ECDC, Stockholm
Fielding, J. E., Kelly, H. A., Mercer, G. N. & Glass, K. (2014). Systematic review of influenza A(H1N1)pdm09 virus
shedding: Duration is aected by severity, but not age. Influenza and Other Respiratory Viruses,8(2), 142–150.
doi:10.1111/irv.12216
Inselberg, A. (1997). Multidimensional detective. In Proceedings of the IEEE Symposium on Information Visualization 1997 (pp. 100–107). IEEE
Maddux, J. E. & Rogers, R. W. (1983). Protection motivation and self-efficacy: A revised theory of fear appeals and attitude change. Journal of Experimental Social Psychology, 19(5), 469–479. doi:10.1016/0022-1031(83)90023-9
Moss, S. (2007). Alternative approaches to the empirical validation of agent-based models. Journal of Artificial
Societies and Social Simulation,11(1), 5
Müssel, C., Lausser, L., Maucher, M. & Kestler, H. A. (2012). Multi-objective parameter selection for classifiers. Journal of Statistical Software, 46(5)
Railsback, S. F. & Grimm, V. (2012). Agent-Based and Individual-Based Modeling: A Practical Introduction. Prince-
ton University Press
Rosenstock, I. M. (1974). The health belief model and preventive health behavior. Health Education & Behavior,
2(4), 354–386. doi:10.1177/109019817400200405
Sandtable (2015). White paper: The Sandtable model foundry
Schmitt, C., Rey-Coyrehourcq, S., Reuillon, R. & Pumain, D. (2015). Half a billion simulations: Evolutionary
algorithms and distributed computing for calibrating the SimpopLocal geographical model. Environment
and Planning B,42(2), 300–315. doi:10.1068/b130064p
United Nations, Department of Economic and Social Affairs, Population Division (2013). World population prospects: The 2012 revision (total population, 2015, medium variant)
Wiegand, T., Revilla, E. & Knauer, F. (2004). Dealing with uncertainty in spatially explicit population models.
Biodiversity & Conservation,13(1), 53–78. doi:10.1023/B:BIOC.0000004313.86836.ab
Wilensky, U. (1999). NetLogo. Center for Connected Learning and Computer-Based Modeling, Northwestern
University