Agent-based models (ABMs) are an increasingly popular choice for simulating large systems of interacting components, and have been applied across a wide variety of natural and environmental systems. However, ABMs can be incredibly disparate and often opaque in their formulation, implementation, and analysis. This can impede critical assessment and re-implementation, and jeopardize the reproducibility and conclusions of ABM studies. In this review, we survey recent work towards standardization in ABM methodology in several aspects: model description and documentation, model implementation, and model analysis and inference.
Based on a critical review of the literature, focused on ABMs of environmental and natural systems, we describe a recurrent trade-off between flexibility and standardization in ABM methodology. We find that standard protocols for model documentation are beginning to become established, although their uptake by the ABM community is inhibited by their sometimes excessive level of detail. We highlight how implementation options now exist at all points along a spectrum from ad hoc, ‘from scratch’ implementations, to specific software offering ‘off-the-shelf’ ABM implementations. We outline how the main focal points of ABM analysis (behavioural and inferential analysis) are facing similar issues with similar approaches. While this active development of ABM analysis techniques brings additional methods to our analysis toolbox, it does not contribute to the development of a standardized framework, since the performance and design of these methods tend to be highly problem-specific. We therefore recommend that agent-based modellers consider multiple approaches simultaneously when analyzing their model. Well-documented software packages, and critical comparative reviews of such packages, will be important facilitators in these advances. ABMs can additionally make better use of developments in other fields working with high-dimensional problems, such as Bayesian statistics and machine learning.

With the rapid increase in the amount and sources of big data, using big data and machine learning methods to identify site soil pollution has become a research hotspot. However, previous studies that used basic site information as pollution identification indexes suffer mainly from low accuracy and efficiency when conducting complex model predictions for multiple soil pollution types. In this study, we collected the environmental data of 199 sites in 6 typical industries involving heavy metal and organic pollution. After feature fusion and selection, 10 indexes based on pollution sources and pathways were used to establish the soil pollution identification index system. A Multi-gate Mixture-of-Experts (MMoE) network was constructed to carry out the multiple tasks of identifying soil heavy metal, VOC and SVOC pollution simultaneously. The SHAP framework was used to reveal the importance of the pollution identification indexes for the multiple outputs of the MMoE and to obtain their driving factors. The results showed that the accuracies of the MMoE model were 0.600, 0.783 and 0.850 for soil heavy metal, VOC and SVOC pollution identification, respectively, which were 0-20% higher than the accuracies of single-task BP neural networks. The indexes of raw material containing organic compounds, enterprise scale, soil pollution traces and industry type have significant but differing importance for site soil pollution. This study proposes a more efficient and accurate method to identify site soil pollution and its driving factors, which offers a step towards realizing intelligent identification and risk control of site soil pollution globally.

Mathematical modeling is typically framed as the art of reducing scientific knowledge into an arithmetical layout. However, most untrained people get the art of modeling wrong and end up neglecting it, because modeling is not simply about writing equations and generating numbers through simulations. Models do not only tell a story; they are also shaped by the circumstances under which they are envisioned. They guide apprentice and experienced modelers to build better models by preventing known pitfalls and invalid assumptions in the virtual world and, most importantly, to learn from them through simulation and identify gaps in pushing scientific knowledge further. The power of the human mind is well-documented for idealizing concepts and creating virtual reality models, and as our hypotheses grow more complicated and more complex data become available, modeling earns more noticeable footing in biological sciences. The fundamental modeling paradigms include discrete-events, dynamic systems, agent-based (AB), and system dynamics (SD). The source of knowledge is the most critical step in the model-building process regardless of the paradigm, and the necessary expertise includes (a) clear and concise mental concepts acquired through different ways that provide the fundamental structure and expected behaviors of the model and (b) numerical data necessary for statistical analysis, not for building the model. The unreasonable effectiveness of models in growing scientific learning and knowledge arises, first, because different researchers would model the same problem differently, given their knowledge and experiential background, leading them to choose different variables and model structures. Secondly, different researchers might use different paradigms and even different mathematics to resolve the same problem; thus, model needs are intrinsic to their perceived assumptions and structures. Thirdly, models evolve as the scientific community's knowledge accumulates and matures over time, hopefully resulting in improved modeling efforts; thus, the perfect model is fictional. Some paradigms are most appropriate for macro, high-abstraction, less detail-oriented scenarios, while others are most suitable for micro, low-abstraction, more detail-oriented strategies. Modern hybridization that adds artificial intelligence (AI) to mathematical models can become the next technological wave in modeling. AI can be an integral part of SD/AB models and, before long, write the model code by itself. Successes and failures in model building are more related to the ability of the researcher to interpret the data and understand the underlying principles and mechanisms, and so to formulate the correct relationships among variables, than to profound mathematical knowledge.

Agent-based models (ABMs) are increasingly used in the management sciences. Though useful, ABMs are often critiqued: it is hard to discern why they produce the results they do and whether other assumptions would yield similar results. To help researchers address such critiques, we propose a systematic approach to conducting sensitivity analyses of ABMs. Our approach deals with a feature that can complicate sensitivity analyses: most ABMs include important non-parametric elements, while most sensitivity analysis methods are designed for parametric elements only. The approach moves from charting out the elements of an ABM through identifying the goal of the sensitivity analysis to specifying a method for the analysis. We focus on four common goals of sensitivity analysis: determining whether results are robust, which elements have the greatest impact on outcomes, how elements interact to shape outcomes, and which direction outcomes move when elements change. For the first three goals, we suggest a combination of randomized finite change indices calculation through a factorial design. For direction of change, we propose a modification of individual conditional expectation (ICE) plots to account for the stochastic nature of the ABM response. We illustrate our approach using the Garbage Can Model, a classic ABM that examines how organizations make decisions.

Agent-based modeling is a simulation method in which autonomous agents interact with their environment and one another, given a predefined set of rules. It is an integral method for modeling and simulating complex systems, such as socio-economic problems. Since agent-based models are not described by simple and concise mathematical equations, the code that generates them is typically complicated, large, and slow. Here we present Agents.jl, a Julia-based software that provides an ABM analysis platform with minimal code complexity. We compare our software with some of the most popular ABM software in other programming languages. We find that Agents.jl is not only the most performant but also the least complicated software, providing the same (and sometimes more) features as the competitors with less input required from the user. Agents.jl also integrates excellently with the entire Julia ecosystem, including interactive applications, differential equations, parameter optimization, and so on. This removes any “extensions library” requirement from Agents.jl, which is paramount in many other tools.
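The definition given in this abstract (autonomous agents interacting under a predefined set of rules) can be illustrated with a very small sketch. It is written in plain Python and does not use Agents.jl or any other ABM library; the wealth-exchange rule, the class `Agent`, and the functions `step` and `run` are assumptions chosen purely for illustration.

```python
import random


class Agent:
    """A minimal agent: an identifier plus one state variable."""

    def __init__(self, uid, wealth=1):
        self.uid = uid
        self.wealth = wealth


def step(agents, rng):
    """One tick: in random order, each agent with wealth gives one unit
    to a randomly chosen agent (a gift to oneself is simply skipped)."""
    order = agents[:]
    rng.shuffle(order)
    for agent in order:
        if agent.wealth > 0:
            other = rng.choice(agents)
            if other is not agent:
                agent.wealth -= 1
                other.wealth += 1


def run(n_agents=50, n_steps=100, seed=0):
    """Build the population and advance it n_steps ticks."""
    rng = random.Random(seed)
    agents = [Agent(i) for i in range(n_agents)]
    for _ in range(n_steps):
        step(agents, rng)
    return agents


agents = run()
wealths = sorted(a.wealth for a in agents)
print("total wealth (conserved):", sum(wealths))
print("richest agent holds:", wealths[-1])
```

Even this toy rule has no compact closed-form description of individual trajectories, which is the sense in which ABM code tends to grow large; dedicated frameworks mainly take over the scheduling, space, and data-collection parts shown here by hand.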

Individual-based models have become important tools in the global battle against infectious diseases, yet model complexity can make calibration to biological and epidemiological data challenging. We propose using a Bayesian optimization framework employing Gaussian process or machine learning emulator functions to calibrate a complex malaria transmission simulator. We demonstrate our approach by optimizing over a high-dimensional parameter space with respect to a portfolio of multiple fitting objectives built from datasets capturing the natural history of malaria transmission and disease progression. Our approach quickly outperforms previous calibrations, yielding an improved final goodness of fit. Per-objective parameter importance and sensitivity diagnostics provided by our approach offer epidemiological insights and enhance trust in predictions through greater interpretability.

To robustly predict the effects of disturbance and ecosystem changes on species, it is necessary to produce structurally realistic models with high predictive power and flexibility. To ensure that these models reflect the natural conditions necessary for reliable prediction, models must be informed and tested using relevant empirical observations. Pattern‐oriented modelling (POM) offers a systematic framework for employing empirical patterns throughout the modelling process and has been coupled with complex systems modelling, such as in agent‐based models (ABMs). However, while the production of ABMs has been rising rapidly, the explicit use of POM has not increased. Challenges with identifying patterns and an absence of specific guidelines on how to implement empirical observations may limit the accessibility of POM and lead to the production of models which lack a systematic consideration of reality. This review serves to provide guidance on how to identify and apply patterns following a POM approach in ABMs (POM‐ABMs), specifically addressing: where in the ecological hierarchy can we find patterns; what kinds of patterns are useful; how should simulations and observations be compared; and when in the modelling cycle are patterns used? The guidance and examples provided herein are intended to encourage the application of POM and inspire efficient identification and implementation of patterns for both new and experienced modellers alike. Additionally, by generalising patterns found especially useful for POM‐ABM development, these guidelines provide practical help for the identification of data gaps and guide the collection of observations useful for the development and verification of predictive models. Improving the accessibility and explicitness of POM could facilitate the production of robust and structurally realistic models in the ecological community, contributing to the advancement of predictive ecology at large.

The utility of Agent Based Models (ABMs) for decision making support as well as for scientific applications can be increased considerably by the availability and use of methodologies for thorough model behaviour analysis. In view of their intrinsic construction, ABMs have to be analysed numerically. Furthermore, ABM behaviour is often complex, featuring strong non-linearities, tipping points, and adaptation. This easily leads to high computational costs, presenting a serious practical limitation. Model developers and users alike would benefit from methodologies that can explore large parts of parameter space at limited computational costs. In this paper we present a methodology that makes this possible. The essence of our approach is to develop a cost-effective surrogate model based on ABM output using machine learning to approximate ABM simulation data. The development consists of two steps, both with iterative loops of training and cross-validation. In the first part, a Support Vector Machine (SVM) is developed to split behaviour space into regions of qualitatively different behaviour. In the second part, a Support Vector Regression (SVR) is developed to cover the quantitative behaviour within these regions. Finally, sensitivity indices are calculated to rank the importance of parameters for describing the boundaries between regions, and for the quantitative dynamics within regions. The methodology is demonstrated in three case studies: a differential equation model of predator-prey interaction, a common-pool resource ABM and an ABM representing the Philippine tuna fishery. In all cases, the model and the corresponding surrogate model show a good match. Furthermore, different parameters are shown to influence the quantitative outcomes, compared to those that influence the underlying qualitative behaviour. Thus, the method helps to distinguish which parameters determine the boundaries in parameter space between regions that are separated by tipping points, or by any criterion of interest to the user.

We estimate the parameters of 41 simulation models to find which of 9 estimation algorithms performs best. Unfortunately, no single algorithm is the best at estimation for all or even most of the models. Five main results emerge instead from this research. First, each algorithm is the best estimator for at least one parameter. Second, the best estimation algorithm varies not just between models but even between parameters of the same model. Third, each estimation algorithm fails to estimate at least one identifiable parameter. Fourth, choosing the right algorithm improves estimation performance more than quadrupling the number of simulation runs. Fifth, half of the agent-based models tested cannot be fully identified. We argue therefore that the testing performed here should be done in other applied work, and to facilitate this we share the R package freelunch.

Computational social science has witnessed a shift from pure theoretical to empirical agent-based models (ABMs) grounded in data-driven correlations between behavioral factors defining agents' decisions. There is a strong urge to go beyond theoretical ABMs with behavioral theories setting stylized rules that guide agents' actions, especially when it concerns policy-related simulations. However, it remains unclear to what extent theory-driven ABMs mislead, if at all, a choice of a policy when compared to the outcomes of models with empirical micro-foundations. This is especially relevant for pro-environmental policies that increasingly rely on quantifying cumulative effects of individual behavioral changes, where ABMs are so helpful. We propose a comparison framework to address this methodological dilemma, which quantitatively explores the gap in predictions between theory- and data-driven ABMs. Inspired by the existing theory-driven model, ORVin-T, which studies the individual choice between organic and conventional products, we design a survey to collect data on individual preferences and purchasing decisions. We then use this extensive empirical microdata to build an empirical twin, ORVin-E, replacing the theoretical assumptions and secondary aggregated data used to parametrize agents' decision strategies with our empirical survey data. We compare the models in terms of key outputs, perform sensitivity analysis, and explore three policy scenarios. We observe that the theory-driven model predicts the shifts to organic consumption as accurately as the ABM with empirical micro-foundations at both aggregated and individual scales. There are slight differences (±5%) between the estimations of the two models with regard to different behavioral change scenarios: increasing conventional tax, launching organic social-informational campaigns, and their combination. 
Our findings highlight the goodness of fit and usefulness of theoretical modeling efforts, at least in the case of incremental behavioral change. They also shed light on the conditions under which theory-driven and data-driven models are aligned and on the value of empirical data for studying systemic changes.

Agent-based models provide a flexible framework that is frequently used for modelling many biological systems, including cell migration, molecular dynamics, ecology and epidemiology. Analysis of the model dynamics can be challenging due to their inherent stochasticity and heavy computational requirements. Common approaches to the analysis of agent-based models include extensive Monte Carlo simulation of the model or the derivation of coarse-grained differential equation models to predict the expected or averaged output from the agent-based model. Both of these approaches have limitations, however, as extensive computation of complex agent-based models may be infeasible, and coarse-grained differential equation models can fail to accurately describe model dynamics in certain parameter regimes. We propose that methods from the equation learning field provide a promising, novel and unifying approach for agent-based model analysis. Equation learning is a recent field of research from data science that aims to infer differential equation models directly from data. We use this tutorial to review how methods from equation learning can be used to learn differential equation models from agent-based model simulations. We demonstrate that this framework is easy to use, requires few model simulations, and accurately predicts model dynamics in parameter regions where coarse-grained differential equation models fail to do so. We highlight these advantages through several case studies involving two agent-based models that are broadly applicable to biological phenomena: a birth–death–migration model commonly used to explore cell biology experiments and a susceptible–infected–recovered model of infectious disease spread.
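The idea of learning a differential equation from agent-based simulation output can be sketched in a few lines. The example below is not the authors' code: it assumes a well-mixed birth-death model on `n_sites` sites (no migration, hypothetical parameters `p_birth` and `p_death`), and it replaces a full equation-learning pipeline with ordinary least squares on a two-term library, C and C². Under these assumptions the mean-field expectation is roughly dC/dt ≈ (p_birth − p_death)·C − p_birth·C².

```python
import random


def simulate_abm(n_sites=5000, p_birth=0.05, p_death=0.01,
                 c0=0.05, n_steps=200, seed=1):
    """Well-mixed birth-death ABM; returns the occupied fraction per step."""
    rng = random.Random(seed)
    occupied = [False] * n_sites
    for i in rng.sample(range(n_sites), int(c0 * n_sites)):
        occupied[i] = True
    density = [sum(occupied) / n_sites]
    for _ in range(n_steps):
        # Only agents alive at the start of the step get to act.
        for i in [j for j, occ in enumerate(occupied) if occ]:
            if rng.random() < p_death:
                occupied[i] = False
            elif rng.random() < p_birth:
                target = rng.randrange(n_sites)
                if not occupied[target]:  # crowding: birth fails if taken
                    occupied[target] = True
        density.append(sum(occupied) / n_sites)
    return density


def fit_two_terms(x, y):
    """Least squares for y ~ a*x + b*x**2 via the 2x2 normal equations."""
    s11 = sum(v ** 2 for v in x)
    s12 = sum(v ** 3 for v in x)
    s22 = sum(v ** 4 for v in x)
    t1 = sum(v * w for v, w in zip(x, y))
    t2 = sum(v * v * w for v, w in zip(x, y))
    det = s11 * s22 - s12 * s12
    return (t1 * s22 - t2 * s12) / det, (s11 * t2 - s12 * t1) / det


density = simulate_abm()
# Forward differences with a unit time step approximate dC/dt.
dC = [density[t + 1] - density[t] for t in range(len(density) - 1)]
a_hat, b_hat = fit_two_terms(density[:-1], dC)
print(f"learned model: dC/dt = {a_hat:.4f}*C + ({b_hat:.4f})*C^2")
```

A single simulation run is enough here because the model is well mixed; for spatially structured models the learned coefficients would be expected to depart from the mean-field values, which is the regime the abstract highlights.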

This study examines the possibility of applying the novel likelihood-free Bayesian inference called BayesFlow proposed by Radev et al. (2020) for the estimation of agent-based models (ABMs). BayesFlow is a fully likelihood-free approach, which directly approximates a posterior rather than a likelihood function by learning an invertible probabilistic mapping between parameters and standard Gaussian variables, conditional on simulation data from the ABM to be estimated. BayesFlow certainly achieved superior accuracy to the benchmark method of Kernel Density Estimation-MCMC of Grazzini et al. (2017) and the more sophisticated method of Mixture Density Network-MCMC of Platt (2019), in the validation tests of recovering the ground-truth values of parameters from the simulated datasets of a standard New Keynesian ABM (NK-ABM). Furthermore, the truly empirical estimation of NK-ABM with the real data of the US economy successfully showed the desirable pattern of posterior contraction along with the increase in observation periods. This deep neural network-based method holds general applicability without any critical dependence on pre-selected design and high computational efficiency. These features are desirable when scaling the method to practical-sized ABMs, which typically have high-dimensional parameters and observation variables.

Estimating the parameters of mathematical models is a common problem in almost all branches of science. However, this problem can prove notably difficult when processes and model descriptions become increasingly complex and an explicit likelihood function is not available. With this work, we propose a novel method for globally amortized Bayesian inference based on invertible neural networks that we call BayesFlow. The method uses simulations to learn a global estimator for the probabilistic mapping from observed data to underlying model parameters. A neural network pretrained in this way can then, without additional training or optimization, infer full posteriors on arbitrarily many real data sets involving the same model family. In addition, our method incorporates a summary network trained to embed the observed data into maximally informative summary statistics. Learning summary statistics from data makes the method applicable to modeling scenarios where standard inference techniques with handcrafted summary statistics fail. We demonstrate the utility of BayesFlow on challenging intractable models from population dynamics, epidemiology, cognitive science, and ecology. We argue that BayesFlow provides a general framework for building amortized Bayesian parameter estimation machines for any forward model from which data can be simulated.
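To make "likelihood-free" concrete, the sketch below uses rejection ABC (approximate Bayesian computation), which is a much simpler technique than BayesFlow and is not amortized: it must be rerun for every new data set. It shares only the core idea of inferring parameters from simulations without evaluating a likelihood. The toy simulator, the uniform prior, and every name here (`simulate_adopters`, `abc_rejection`, `tolerance`) are assumptions made for illustration.

```python
import random


def simulate_adopters(theta, n_agents, rng):
    """Toy simulator: each agent independently adopts with probability theta."""
    return sum(rng.random() < theta for _ in range(n_agents))


def abc_rejection(observed, n_agents=100, n_draws=20000, tolerance=2, seed=0):
    """Keep prior draws whose simulated count lies within `tolerance`
    of the observed count; the kept draws approximate the posterior."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.random()  # draw from a uniform prior on [0, 1)
        simulated = simulate_adopters(theta, n_agents, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(theta)
    return accepted


posterior = abc_rejection(observed=30)
posterior_mean = sum(posterior) / len(posterior)
print(f"accepted {len(posterior)} draws, posterior mean ~ {posterior_mean:.3f}")
```

The handcrafted summary (a single count) and the fixed tolerance are exactly the design choices that the abstract says a learned summary network is meant to replace.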

Mesa is an agent-based modeling framework written in Python. Originally started in 2013, it was created to be the go-to tool for researchers wishing to build agent-based models with Python. Within this paper we present Mesa's design goals, along with its underlying architecture. This includes its core components: 1) the model (Model, Agent, Schedule, and Space), 2) analysis (Data Collector and Batch Runner) and 3) the visualization (Visualization Server and Visualization Browser Page). We then discuss how agent-based models can be created in Mesa. This is followed by a discussion of applications and extensions by other researchers to demonstrate how Mesa's design is decoupled and extensible, thus creating the opportunity for a larger decentralized ecosystem of packages that people can share and reuse for their own needs. Finally, the paper concludes with a summary and discussion of future development areas for Mesa.

Model checking is an effective way to verify behaviours of an agent-based simulation system. Three behaviours are analysed: operational, control, and global behaviours. Global behaviours of a system emerge from operational behaviours of local components regulated by control behaviours of the system. Previous work principally focuses on verifying the system from the operational point of view (operational behaviour). Whether the global behaviour of the system conforms to the control behaviour has not been investigated. Thus, in this paper, we propose a more complete approach for verifying global and operational behaviours of systems. To do so, these three behaviours are first formalized by automata-based techniques. The meta-transformation between automata theories and Kripke structure is then provided, in order to illustrate the feasibility of the model transformation between the agent-based simulation model and the Kripke structure-based model. Then, a mapping between the models is proposed. Subsequently, the global behaviour of the system is verified by the properties extracted from the control behaviour, and the operational behaviour is checked by general system performance properties (e.g. safety, deadlock freedom). Finally, a case study on the simulation system for aircraft maintenance has been carried out. A counterexample of signal sending between Flight agent and Plane agent has been produced by the NuSMV model checker. Modifications for the NuSMV model and agent-based simulation model have been performed. The experiment results show that 9% out of 19% of flights have been changed to be serviceable.

Locusts are significant agricultural pests. Under favorable environmental conditions flightless juveniles may aggregate into coherent, aligned swarms referred to as hopper bands. These bands are often observed as a propagating wave having a dense front with rapidly decreasing density in the wake. A tantalizing and common observation is that these fronts slow and steepen in the presence of green vegetation. This suggests the collective motion of the band is mediated by resource consumption. Our goal is to model and quantify this effect. We focus on the Australian plague locust, for which excellent field and experimental data are available. Exploiting the alignment of locusts in hopper bands, we concentrate solely on the density variation perpendicular to the front. We develop two models in tandem: an agent-based model that tracks the position of individuals and a partial differential equation model that describes locust density. In both these models, locusts are either stationary (and feeding) or moving. Resources decrease with feeding. The rate at which locusts transition from moving to stationary is enhanced by resource abundance, and the rate of the reverse transition is diminished by it. This effect proves essential to the formation, shape, and speed of locust hopper bands in our models. From the biological literature we estimate ranges for the ten input parameters of our models. Sobol sensitivity analysis yields insight into how the band’s collective characteristics vary with changes in the input parameters. By examining 4.4 million parameter combinations, we identify biologically consistent parameters that reproduce field observations. We thus demonstrate that resource-dependent behavior can explain the density distribution observed in locust hopper bands. This work suggests that feeding behaviors should be an intrinsic part of future modeling efforts.
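Sobol sensitivity analysis, as mentioned in this abstract, attributes output variance to individual inputs. The sketch below estimates first-order Sobol indices with a standard pick-freeze (Saltelli-type) Monte Carlo estimator in plain Python. The function `model` is a hypothetical three-input stand-in, not the locust model, chosen because its indices are known analytically: for independent uniform inputs on [0, 1], Y = x1 + 2·x2 gives S1 = 0.2, S2 = 0.8, and S3 = 0 for the inert third input.

```python
import random


def model(x):
    """Hypothetical stand-in for an expensive simulation (third input inert)."""
    return x[0] + 2.0 * x[1] + 0.0 * x[2]


def first_order_sobol(f, n_params, n_base=20000, seed=0):
    """Pick-freeze estimate of first-order indices for inputs ~ U(0, 1)."""
    rng = random.Random(seed)
    A = [[rng.random() for _ in range(n_params)] for _ in range(n_base)]
    B = [[rng.random() for _ in range(n_params)] for _ in range(n_base)]
    fA = [f(row) for row in A]
    fB = [f(row) for row in B]
    pooled = fA + fB
    mean = sum(pooled) / len(pooled)
    variance = sum((y - mean) ** 2 for y in pooled) / (len(pooled) - 1)
    indices = []
    for i in range(n_params):
        total = 0.0
        for a_row, b_row, ya, yb in zip(A, B, fA, fB):
            mixed = a_row[:]        # copy of the A row ...
            mixed[i] = b_row[i]     # ... with input i taken from B
            # Centering f(B) leaves the expectation unchanged but lowers noise.
            total += (yb - mean) * (f(mixed) - ya)
        indices.append(total / n_base / variance)
    return indices


S = first_order_sobol(model, n_params=3)
for name, value in zip(["x1", "x2", "x3"], S):
    print(f"first-order index of {name}: {value:.3f}")
```

The cost is `n_base * (n_params + 2)` model evaluations, which gives a sense of why analyses of a ten-parameter simulation can run to millions of parameter combinations.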

The Overview, Design concepts and Details (ODD) protocol for describing Individual- and Agent-Based Models (ABMs) is now widely accepted and used to document such models in journal articles. As a standardized document for providing a consistent, logical and readable account of the structure and dynamics of ABMs, some research groups also find it useful as a workflow for model design. Even so, there are still limitations to ODD that obstruct its more widespread adoption. Such limitations are discussed and addressed in this paper: the limited availability of guidance on how to use ODD; the length of ODD documents; limitations of ODD for highly complex models; lack of sufficient details of many ODDs to enable reimplementation without access to the model code; and the lack of provision for sections in the document structure covering model design rationale, the model’s underlying narrative, and the means by which the model’s fitness for purpose is evaluated. We document the steps we have taken to provide better guidance on: structuring complex ODDs and an ODD summary for inclusion in a journal article (with full details in supplementary material; Table 1); using ODD to point readers to relevant sections of the model code; and updating the document structure to include sections on model rationale and evaluation. We also further advocate the need for standard descriptions of simulation experiments and argue that ODD can in principle be used for any type of simulation model. Thereby ODD would provide a lingua franca for simulation modelling.

The recent advancement of agent-based modeling is characterized by higher demands on the parameterization, evaluation and documentation of these computationally expensive models. Accordingly, there is also a growing request for "easy to go" applications just mimicking the input-output behavior of such models. Metamodels are being increasingly used for these tasks. In this paper, we provide an overview of common metamodel types and the purposes of their usage in an agent-based modeling context. To guide modelers in the selection and application of metamodels for their own needs, we further assessed their implementation effort and performance. We performed a literature research in January using four different databases. Five different terms paraphrasing metamodels (approximation, emulator, meta-model, metamodel and surrogate) were used to capture the whole range of relevant literature in all disciplines. All metamodel applications found were then categorized into specific metamodel types and rated by different junior and senior researchers from varying disciplines (including forest sciences, landscape ecology, or economics) regarding the implementation effort and performance. Specifically, we captured the metamodel performance according to (i) the consideration of uncertainties, (ii) the suitability assessment provided by the authors for the particular purpose, and (iii) the number of valuation criteria provided for suitability assessment. We selected distinct metamodel applications from studies published in peer-reviewed journals from to. These were used for the sensitivity analysis, calibration and upscaling of agent-based models, as well as to mimic their prediction for different scenarios. This review provides information about the most applicable metamodel types for each purpose and forms a first guidance for the implementation and validation of metamodels for agent-based models.

Humans have observed the natural world and how people interact with it for millennia. Over the past century, synthesis and expansion of that understanding has occurred under the banner of the “new” discipline of ecology. The mechanisms considered operate in and between many different scales, from the individual and short time frames, up through populations, communities, land/seascapes and ecosystems. Whereas some of these scales have been more readily studied than others (particularly the population to regional landscape scales), over the course of the past 20 years new unifying insights have been possible via the application of ideas from new perspectives, such as the fields of complexity and network theory. At any sufficiently large gathering (and with sufficient lubrication), discussions over whether ecologists will ever uncover unifying laws, and what they may look like, still persist. Any pessimism expressed tends to grow from acknowledgment that gaping holes still exist in our understanding of the natural world and its functioning, especially at the smallest and grandest scales. Conceptualizations of some fundamental ideas, such as evolution, are also undergoing review as global change presents levels of directional pressure on ecosystems not previously seen in recorded history. New sensor and monitoring technologies are opening up new data streams at volumes that can seem overwhelming but also provide an opportunity for a profusion of new discoveries by marrying data across scales in volumes hitherto infeasible. As with so many aspects of science and life, now is an exciting time to be an ecologist.

Individual-based models, ‘IBMs’, describe naturally the dynamics of interacting organisms or social or financial agents. They are considered too complex for mathematical analysis, but computer simulations of them cannot give the general insights required. Here, we resolve this problem with a general mathematical framework for IBMs containing interactions of an unlimited level of complexity, and derive equations that reliably approximate the effects of space and stochasticity. We provide software, specified in an accessible and intuitive graphical way, so any researcher can obtain analytical and simulation results for any particular IBM without algebraic manipulation. We illustrate the framework with examples from movement ecology, conservation biology, and evolutionary ecology. This framework will provide unprecedented insights into a hitherto intractable panoply of complex models across many scientific fields.

Agent-based models find wide application in all fields of science where large-scale patterns emerge from properties of individuals. Owing to the increasing capacity of computing resources, it has been possible in recent years to improve the level of detail and structural realism of next-generation models. However, this comes at the expense of increased model complexity, which requires more efficient tools for model exploration, analysis and documentation that enable reproducibility, repeatability and parallelization. NetLogo is a widely used environment for agent-based model development, but it does not provide sufficient built-in tools for extensive model exploration, such as sensitivity analyses. One tool for controlling NetLogo externally is the R package RNetLogo. However, this package is not suited for efficient, reproducible research: it has stability and resource allocation issues, is not straightforward to set up and use on high performance computing clusters, and does not provide easy-to-use utilities for tasks such as storing and exchanging metadata.
We present the R package nlrx, which overcomes stability and resource allocation issues by running NetLogo simulations via dynamically created XML experiment files. Class objects make setting up experiments more convenient and helper functions provide many parameter exploration approaches, such as Latin Hypercube designs, Sobol sensitivity analyses or optimization approaches. Output is automatically collected in user-friendly formats and can be post-processed with provided utility functions. nlrx enables reproducibility by storing all relevant information and simulation output of experiments in one R object which can conveniently be archived and shared.
We provide a detailed description of the nlrx package functions and the overall workflow. We also present a use case scenario using a NetLogo model, for which we performed a sensitivity analysis and a genetic algorithm optimization.
The nlrx package is the first framework for documentation and application of reproducible NetLogo simulation model analysis.
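
The Latin Hypercube designs mentioned above can be illustrated with a minimal sketch. It is written in plain Python rather than R and does not use or reproduce the nlrx API. The function name `latin_hypercube` and the parameter ranges are hypothetical and chosen only for illustration.

```python
import random

def latin_hypercube(n_samples, bounds, seed=0):
    """Return n_samples points. Each parameter range is split into n_samples
    equal strata, and every stratum is used exactly once per parameter."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        strata = list(range(n_samples))
        rng.shuffle(strata)  # random pairing of strata across parameters
        width = (high - low) / n_samples
        # One uniform draw inside each stratum of this parameter.
        columns.append([low + (s + rng.random()) * width for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

# Hypothetical parameter ranges, for illustration only.
bounds = [(0.0, 1.0), (10.0, 50.0)]
design = latin_hypercube(5, bounds)
for point in design:
    print(point)
```

Each row of `design` would be one parameter set to pass to a simulation run. Compared with plain random sampling, the stratification guarantees that every part of each parameter range is visited once.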

To determine the resilience of complex social-ecological systems (SESs) it is necessary to have a thorough understanding of the system behavior under changing political, economic, and environmental conditions (i.e., external system stressors). Such behavior can be predicted if one knows the stable and unstable equilibrium states in a system and how these equilibria react to changes in the system stressors. The state of the system rapidly or gradually changes either toward (i.e., stable equilibrium) or away from (i.e., unstable equilibrium) an equilibrium. However, the equilibrium states in a SES are often unknown and difficult to identify in real systems. In contrast, agent-based SES models can potentially be used to determine equilibria states, but are rarely used for this purpose. We developed a generic approach to identify stable and unstable equilibria states with agent-based SES models. We used an agent-based SES model to simulate land-use change in an alpine mountain region in the Canton of Valais, Switzerland. By iteratively running this model for different input settings, we were able to identify equilibria in intensive and extensive agriculture. We also assessed the sensitivity of these equilibria to changes in external system stressors. With support-vector machine classifications, we created bifurcation diagrams in which the stable and unstable equilibria as a function of the values of a system stressor were depicted. The external stressors had a strong influence on the equilibrium states. We also found that a minimum amount of direct payments was necessary for agricultural extensification to take place. Our approach does not only provide valuable insights into the resilience of our case-study region to changing conditions, but can also be applied to other (agent-based) SES models to present important model results in a condensed and understandable format.

As agent-based modelling gains popularity, the demand for transparency in underlying modelling assumptions grows. Behavioural rules guiding agents’ decisions, learning, interactions and possible changes in these should rely on solid theoretical and empirical grounds. This field has matured enough to reach the point at which we need to go beyond just reporting what social theory we base these rules upon. Many social science theories operate with various abstract constructions such as attitudes, perceptions, norms or intentions. These concepts are rather subjective and remain open to interpretation when operationalizing them in a formal model code. There is a growing concern that how modellers interpret qualitative social science theories in quantitative ABMs may differ from case to case. Yet, formal tests of these differences are scarce and a systematic approach to analyse any possible disagreements is lacking. Our paper addresses this gap by exploring the consequences of variations in formalizations of one social science theory on the simulation outcomes of agent-based models of the same class. We ran simulations to test the impact of four differences: in model architecture concerning specific equations and their sequence within one theory, in factors affecting agents’ decisions, in representation of these potentially differing factors, and finally in the underlying distribution of data used in a model. We illustrate emergent outcomes of these differences using an agent-based model developed to study regional impacts of households’ solar panel investment decisions. The Theory of Planned Behaviour was applied as one of the most common social science theories used to define behavioural rules of individual agents. Our findings demonstrate qualitative and quantitative differences in simulation outcomes, even when agents’ decision rules are based on the same theory and data. 
The paper outlines a number of critical methodological implications for future developments in agent-based modelling.

Individual-based models provide the modularity and structural flexibility necessary for modeling infectious diseases at the within-host and population levels, but are challenging to implement. Levels of complexity can exceed the capacity and timescales available to students and trainees in most academic institutions. Here we describe the process and advantages of a multi-disease framework approach developed with formal software support. The epidemiological modeling software, EMOD, has undergone a decade of software development. It is structured so that a majority of code is shared across disease models, including malaria, HIV, tuberculosis, dengue, polio, and typhoid. In addition to implementation efficiency, the sharing increases code usage and testing. The freely available codebase also includes hundreds of regression tests, scientific feature tests, and component tests to help verify functionality and avoid inadvertent changes to functionality during future development. Here we describe the levels of detail, flexible configurability, and modularity enabled by EMOD and the role of software development principles and processes in its development.

Agent-based simulations have become increasingly prominent in various disciplines. This trend is positive, but it comes with challenges: while there are more and more standards for design, verification, validation, and presentation of the models, the various meta-theoretical strategies of how the models should be related to reality often remain implicit. Differences in the epistemological foundations of models, however, make it difficult to relate distinct models to each other and to ensure a cumulative expansion of knowledge. Concepts and the analytic language developed by philosophers of science can help to overcome these obstacles. This paper introduces some of these concepts to the modelling community. It also presents an epistemological framework that helps to clarify how one wishes to generate knowledge about reality by means of one's model and that helps to relate models to each other. Since the interpretation of a model is strongly connected to the activities of model verification and validation, these two activities will be embedded into the framework and their respective epistemological roles will be clarified. The resulting meta-theoretical framework aligns well with recently proposed frameworks for model presentation and evaluation.

Modeling and simulation techniques have demonstrated success in studying biological systems. As the drive to better capture biological complexity leads to more sophisticated simulators, it becomes challenging to perform statistical analyses that help translate predictions into increased understanding. These analyses may require repeated executions and extensive sampling of high-dimensional parameter spaces: analyses that may become intractable due to time and resource limitations. Significant reduction in these requirements can be obtained using surrogate models, or emulators, that can rapidly and accurately predict the output of an existing simulator. We apply emulation to evaluate and enrich understanding of a previously published agent-based simulator of lymphoid tissue organogenesis, showing an ensemble of machine learning techniques can reproduce results obtained using a suite of statistical analyses within seconds. This performance improvement permits incorporation of previously intractable analyses, including multi-objective optimization to obtain parameter sets that yield a desired response, and Approximate Bayesian Computation to assess parametric uncertainty. To facilitate exploitation of emulation in simulation-focused studies, we extend our open source statistical package, spartan, to provide a suite of tools for emulator development, validation, and application. Overcoming resource limitations permits enriched evaluation and refinement, easing translation of simulator insights into increased biological understanding.

The quantity of data and processes used in modeling projects has been dramatically increasing in recent years due to the progress in computation capability and to the popularity of new approaches such as open data. Modelers face an increasing difficulty in analyzing and modeling complex systems that consist of many heterogeneous entities. Adapting existing models is relevant to avoid dealing with the complexity of writing and studying a new model from scratch. ODD (Overview, Design concepts, Details) protocol has emerged as a solution to document Agent-Based Models (ABMs). It appears to be a convenient solution to address significant problems such as comprehension, replication, and dissemination. However, it lacks a standard that formalizes the use of data in empirical models. This paper tackles this issue by proposing a set of rules that outline the use of empirical data inside an ABM. We call this new protocol ODD+2D (ODD+Decision + Data). ODD+2D integrates a mapping diagram called DAMap (Data to Agent Mapping). This mapping model formalizes how data are processed and mapped to agent-based models. In this paper, we focus on the architecture of ODD+2D, and we illustrate it with a residential mobility model in Marrakesh.

The increased availability of high-resolution ocean data globally has enabled more detailed analyses of physical-biological interactions and their consequences for the ecosystem. We present IBMlib, which is a versatile, portable and computationally effective framework for conducting Lagrangian simulations in the marine environment. The purpose of the framework is to handle complex individual-level biological models of organisms, combined with realistic 3D oceanographic models of physics and biogeochemistry describing the environment of the organisms, without assumptions about spatial or temporal scales. The open-source framework features a minimal robust interface to facilitate the coupling between individual-level biological models and oceanographic models, and we provide application examples including forward/backward simulations, habitat connectivity calculations, assessing ocean conditions, comparison of physical circulation models, model ensemble runs and recently posterior Eulerian simulations using the IBMlib framework. We present the code design ideas behind the longevity of the code, our implementation experiences, as well as code performance benchmarking. The framework may contribute substantially to progress in representing, understanding, predicting and eventually managing marine ecosystems.

Sensitivity analysis provides information on the relative importance of model input parameters and assumptions. It is distinct from uncertainty analysis, which addresses the question ‘How uncertain is the prediction?’ Uncertainty analysis needs to map what a model does when selected input assumptions and parameters are left free to vary over their range of existence, and this is equally true of a sensitivity analysis. Despite this, many uncertainty and sensitivity analyses still explore the input space moving along one-dimensional corridors leaving space of the input factors mostly unexplored. Our extensive systematic literature review shows that many highly cited papers (42% in the present analysis) fail the elementary requirement to properly explore the space of the input factors. The results, while discipline-dependent, point to a worrying lack of standards and recognized good practices. We end by exploring possible reasons for this problem, and suggest some guidelines for proper use of the methods.
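
A toy example, not taken from the reviewed papers, may help show why one-dimensional corridors can leave the input space unexplored. The sketch below assumes a hypothetical two-input model whose response is a pure interaction. Varying one input at a time around the baseline sees no effect, while a full grid over both inputs does.

```python
from itertools import product

def model(x1, x2):
    """Toy response with a pure interaction between the two inputs."""
    return x1 * x2

levels = [-1.0, -0.5, 0.0, 0.5, 1.0]
baseline = (0.0, 0.0)

# One-at-a-time: vary each input along a corridor through the baseline.
oat_outputs = [model(x, baseline[1]) for x in levels]
oat_outputs += [model(baseline[0], x) for x in levels]

# Full factorial: vary both inputs together.
grid_outputs = [model(x1, x2) for x1, x2 in product(levels, repeat=2)]

oat_range = max(oat_outputs) - min(oat_outputs)
grid_range = max(grid_outputs) - min(grid_outputs)
print("OAT output range: ", oat_range)
print("grid output range:", grid_range)
```

The one-at-a-time design would report that neither input matters, although the output spans its full range once both inputs move together.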

To evaluate the concern over the reproducibility of computational science, we reviewed 2367 journal articles on agent-based models published between 1990 and 2014 and documented the public availability of source code. The percentage of publications that make the model code available is about 10%. The percentages are similar for publications that are reportedly dependent on public funding. There are big differences among journals in the public availability of model code and software used. This suggests that the varying social norms and practical convenience around sharing code may explain some of the differences among different sectors of the scientific community.

Agent-based models (ABMs) are increasingly recognized as valuable tools in modelling human-environmental systems, but challenges and criticisms remain. One pressing challenge, in the era of “Big Data” and given the flexibility of representation afforded by ABMs, is identifying the appropriate level of complicatedness in model structure for representing and investigating complex real-world systems. In this paper, we differentiate the concepts of complexity (model behaviour) and complicatedness (model structure), and illustrate the non-linear relationship between them. We then systematically evaluate the trade-offs between simple (often theoretical) models and complicated (often empirically-grounded) models. We propose using pattern-oriented modelling, stepwise approaches, and modular design to guide modellers in reaching an appropriate level of model complicatedness. While ABMs should be constructed as simple as possible but as complicated as necessary to address the predefined research questions, we also warn modellers of the pitfalls and risks of building “mid-level” models mixing stylized and empirical components.

Inverse parameter estimation of process-based models is a long-standing
problem in many scientific disciplines. A key question for inverse parameter
estimation is how to define the metric that quantifies how well model
predictions fit to the data. This metric can be expressed by general cost or
objective functions, but statistical inversion methods require a particular
metric, the probability of observing the data given the model parameters,
known as the likelihood.
For technical and computational reasons, likelihoods for process-based
stochastic models are usually based on general assumptions about variability
in the observed data, and not on the stochasticity generated by the model.
Only in recent years have new methods become available that allow the
generation of likelihoods directly from stochastic simulations. Previous
applications of these approximate Bayesian methods have concentrated on
relatively simple models. Here, we report on the application of a
simulation-based likelihood approximation for FORMIND, a parameter-rich
individual-based model of tropical forest dynamics.
We show that approximate Bayesian inference, based on a parametric likelihood
approximation placed in a conventional Markov chain Monte Carlo (MCMC)
sampler, performs well in retrieving known parameter values from virtual
inventory data generated by the forest model. We analyze the results of the
parameter estimation, examine its sensitivity to the choice and aggregation
of model outputs and observed data (summary statistics), and demonstrate the
application of this method by fitting the FORMIND model to field data from an
Ecuadorian tropical forest. Finally, we discuss how this approach differs
from approximate Bayesian computation (ABC), another method commonly used to
generate simulation-based likelihood approximations.
Our results demonstrate that simulation-based inference, which offers
considerable conceptual advantages over more traditional methods for inverse
parameter estimation, can be successfully applied to process-based models of
high complexity. The methodology is particularly suitable for heterogeneous
and complex data structures and can easily be adjusted to other model types,
including most stochastic population and individual-based models. Our study
therefore provides a blueprint for a fairly general approach to parameter
estimation of stochastic process-based models.

The proliferation of agent-based models (ABMs) in recent decades has motivated model practitioners to improve the transparency, replicability, and trust in results derived from ABMs. The complexity of ABMs has risen in stride with advances in computing power and resources, resulting in larger models with complex interactions and learning and whose outputs are often high-dimensional and require sophisticated analytical approaches. Similarly, the increasing use of data and dynamics in ABMs has further enhanced the complexity of their outputs. In this article, we offer an overview of the state-of-the-art approaches in analyzing and reporting ABM outputs highlighting challenges and outstanding issues. In particular, we examine issues surrounding variance stability (in connection with determination of appropriate number of runs and hypothesis testing), sensitivity analysis, spatio-temporal analysis, visualization, and effective communication of all these to non-technical audiences, such as various stakeholders.

Individual-based models (IBMs) can simulate the actions of individual animals as they interact with one another and the landscape in which they live. When used in spatially-explicit landscapes IBMs can show how populations change over time in response to management actions. For instance, IBMs are being used to design strategies of conservation and of the exploitation of fisheries, and for assessing the effects on populations of major construction projects and of novel agricultural chemicals. In such real world contexts, it becomes especially important to build IBMs in a principled fashion, and to approach calibration and evaluation systematically. We argue that insights from physiological and behavioural ecology offer a recipe for building realistic models, and that Approximate Bayesian Computation (ABC) is a promising technique for the calibration and evaluation of IBMs.

The ability to discover physical laws and governing equations from data is
one of humankind's greatest intellectual achievements. A quantitative
understanding of dynamic constraints and balances in nature has facilitated
rapid development of knowledge and enabled advanced technological achievements,
including aircraft, combustion engines, satellites, and electrical power. In
this work, we combine sparsity-promoting techniques and machine learning with
nonlinear dynamical systems to discover governing physical equations from
measurement data. The only assumption about the structure of the model is that
there are only a few important terms that govern the dynamics, so that the
equations are sparse in the space of possible functions; this assumption holds
for many physical systems. In particular, we use sparse regression to determine
the fewest terms in the dynamic governing equations required to accurately
represent the data. The resulting models are parsimonious, balancing model
complexity with descriptive ability while avoiding overfitting. We demonstrate
the algorithm on a wide range of problems, from simple canonical systems,
including linear and nonlinear oscillators and the chaotic Lorenz system, to
the fluid vortex shedding behind an obstacle. The fluid example illustrates the
ability of this method to discover the underlying dynamics of a system that
took experts in the community nearly 30 years to resolve. We also show that
this method generalizes to parameterized, time-varying, or externally forced
systems.
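
The idea of using sparse regression to select the fewest terms can be sketched on a one-dimensional toy system. The sketch below assumes the hypothetical dynamics dx/dt = -2x, a candidate library of three terms (1, x, x²), and a simple sparsification rule: fit by least squares, zero the coefficients below a threshold, and refit. It is an illustration of the general idea only and is not claimed to be the authors' exact algorithm.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def least_squares(theta, y, active):
    """Least-squares fit of y using only the active library columns."""
    ata = [[sum(r[i] * r[j] for r in theta) for j in active] for i in active]
    aty = [sum(r[i] * yk for r, yk in zip(theta, y)) for i in active]
    return solve_linear(ata, aty)

def sparse_regression(theta, y, threshold=0.1, max_iter=10):
    """Repeat: fit, drop terms with small coefficients, refit the rest."""
    n_terms = len(theta[0])
    active = list(range(n_terms))
    coef = [0.0] * n_terms
    for _ in range(max_iter):
        fitted = least_squares(theta, y, active)
        coef = [0.0] * n_terms
        for i, c in zip(active, fitted):
            coef[i] = c
        kept = [i for i in active if abs(coef[i]) >= threshold]
        if kept == active:
            break  # nothing was pruned, so the fit has converged
        if not kept:
            return [0.0] * n_terms
        active = kept
    return [c if i in active else 0.0 for i, c in enumerate(coef)]

# Synthetic, noise-free data from the assumed system dx/dt = -2 x.
times = [0.02 * k for k in range(101)]
states = [math.exp(-2.0 * t) for t in times]
derivatives = [-2.0 * x for x in states]
library = [[1.0, x, x * x] for x in states]  # candidate terms: 1, x, x^2

coefficients = sparse_regression(library, derivatives)
print(dict(zip(["1", "x", "x^2"], coefficients)))
```

With measured data the derivatives would have to be estimated numerically and would carry noise, so the threshold becomes a tuning choice that trades model complexity against fit.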

Despite the critical role that replication plays in the advancement of science, its presence in modelling literature is rare. To encourage others to conduct replication and report success and challenges that facilitate or hinder replication, we present the replication of an agent-based model (ABM) of residential sprawl using the Replication Standard. Replication results achieved relational equivalence. Through the replication process, issues with the original research were identified and corrected in an Improved model, which qualitatively supported original results. A specific challenge affecting alignment of original and Replicate models included capturing model output variability and publishing all original output data for statistical analysis. Through the replication of agent-based models, additional confidence in model behaviour can be garnered and replicated ABMs can become accredited for reuse by others. Future research in the development and refinement of replication methodology and assessments should be cultivated along with a culture of value for replication efforts.

Many complex systems occurring in the natural or social sciences or economics are frequently described on a microscopic level, e.g., by lattice- or agent-based models. To analyze the states of such systems and their bifurcation structure on the level of macroscopic observables, one has to rely on equation-free methods like stochastic continuation. Here we investigate how to improve stochastic continuation techniques by adaptively choosing the parameters of the algorithm. This allows one to obtain bifurcation diagrams quite accurately, especially near bifurcation points. We introduce lifting techniques which generate microscopic states with a naturally grown structure, which can be crucial for a reliable evaluation of macroscopic quantities. We show how to calculate fixed points of fluctuating functions by employing suitable linear fits. This procedure offers a simple measure of the statistical error. We demonstrate these improvements by applying the approach in analyses of (i) the Ising model in two dimensions, (ii) an active Ising model, and (iii) a stochastic Swift-Hohenberg model. We conclude by discussing the abilities and remaining problems of the technique.

With advances in computing, agent-based models (ABMs) have become a feasible and appealing tool to study biological systems. ABMs are seeing increased incorporation into both the biology and mathematics classrooms as powerful modeling tools to study processes involving substantial amounts of stochasticity, nonlinear interactions, and/or heterogeneous spatial structures. Here we present a brief synopsis of the agent-based modeling approach with an emphasis on its use to simulate biological systems, and provide a discussion of its role and limitations in both the biology and mathematics classrooms.

Many domains of science have developed complex simulations to describe phenomena of interest. While these simulations provide high-fidelity models, they are poorly suited for inference and lead to challenging inverse problems. We review the rapidly developing field of simulation-based inference and identify the forces giving additional momentum to the field. Finally, we describe how the frontier is expanding so that a broad audience can appreciate the profound influence these developments may have on science.

Many of the statistical models that could provide an accurate, interesting, and testable explanation for the structure of a data set turn out to have intractable likelihood functions. The method of approximate Bayesian computation (ABC) has become a popular approach for tackling such models. This review gives an overview of the method and the main issues and challenges that are the subject of current research. Expected final online publication date for the Annual Review of Statistics and Its Application Volume 6 is March 7, 2019. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
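
For readers new to ABC, the simplest variant (rejection sampling) can be sketched in a few lines. The toy model, the uniform prior, the observed value and the tolerance below are all assumptions made for illustration and do not come from the review.

```python
import random

def simulate(p, n_agents, rng):
    """Toy stochastic model: count agents that adopt with probability p."""
    return sum(1 for _ in range(n_agents) if rng.random() < p)

def abc_rejection(observed, n_agents, n_draws, tolerance, seed=1):
    """Keep prior draws whose simulated summary lies close to the data."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        p = rng.random()  # draw from a Uniform(0, 1) prior
        simulated = simulate(p, n_agents, rng)  # summary statistic of one run
        if abs(simulated - observed) <= tolerance:
            accepted.append(p)
    return accepted

observed_adopters = 60  # hypothetical observation out of 200 agents
posterior = abc_rejection(observed_adopters, n_agents=200,
                          n_draws=5000, tolerance=5)
posterior_mean = sum(posterior) / len(posterior)
print(len(posterior), "accepted draws; posterior mean of p is about",
      round(posterior_mean, 3))
```

No likelihood is evaluated at any point: the accepted draws approximate the posterior purely through forward simulation. A smaller tolerance would tighten the approximation at the cost of fewer accepted draws.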

Methods for testing and analyzing agent-based models have drawn increasing attention in the literature, in the context of efforts to establish standard frameworks for the development and documentation of models. This process can benefit from the use of established software environments for data analysis and visualization. For instance, the popular NetLogo agent-based modelling software can be interfaced with Mathematica and R, letting modellers use the advanced analysis capabilities available in these programming languages. To extend these capabilities to an additional user base, this paper presents the pyNetLogo connector, which allows NetLogo to be controlled from the Python general-purpose programming language. Given Python’s increasing popularity for scientific computing, this provides additional flexibility for modellers and analysts. PyNetLogo’s features are demonstrated by controlling one of NetLogo’s example models from an interactive Python environment, then performing a global sensitivity analysis with parallel processing.

The key intent of this work is to present a comprehensive comparative literature survey of the state of the art in software agent-based computing technology and its incorporation within the modelling and simulation domain. The original contribution of this survey is two-fold: (1) Present a concise characterization of almost the entire spectrum of agent-based modelling and simulation tools, thereby highlighting the salient features, merits, and shortcomings of such multi-faceted application software; this article covers eighty-five agent-based toolkits that may assist system designers and developers with common tasks, such as constructing agent-based models and portraying the real-time simulation outputs in tabular/graphical formats and visual recordings. (2) Provide a usable reference that aids engineers, researchers, learners and academicians in readily selecting an appropriate agent-based modelling and simulation toolkit for designing and developing their system models and prototypes, cognizant of both their expertise and the requirements of their application domain. In a nutshell, a significant synthesis of Agent Based Modelling and Simulation (ABMS) resources has been performed in this review that stimulates further investigation into this topic.

We consider Bayesian inference techniques for agent-based (AB) models, as an alternative to simulated minimum distance (SMD). Three computationally heavy steps are involved: (i) simulating the model, (ii) estimating the likelihood and (iii) sampling from the posterior distribution of the parameters. Computational complexity of AB models implies that efficient techniques have to be used with respect to points (ii) and (iii), possibly involving approximations. We first discuss non-parametric (kernel density) estimation of the likelihood, coupled with Markov chain Monte Carlo sampling schemes. We then turn to parametric approximations of the likelihood, which can be derived by observing the distribution of the simulation outcomes around the statistical equilibria, or by assuming a specific form for the distribution of external deviations in the data. Finally, we introduce Approximate Bayesian Computation techniques for likelihood-free estimation. These allow embedding SMD methods in a Bayesian framework, and are particularly suited when robust estimation is needed. These techniques are first tested in a simple price discovery model with one parameter, and then employed to estimate the behavioural macroeconomic model of De Grauwe (2012), with nine unknown parameters.

Over the past 10 years the use of the term ‘tipping point’ in the scientific literature has exploded. It was originally used loosely as a metaphor for the phenomenon that, beyond a certain threshold, runaway change propels a system to a new state. Although several specific mathematical definitions have since been proposed, we argue that these are too narrow and that it is better to retain the original definition.

Agent-based models (ABMs) are increasingly used in social science, economics, mathematics, biology and computer science to describe time-dependent systems in circumstances where a description in terms of equations is difficult. Yet few tools are currently available for the systematic analysis of ABM behaviour. Numerical continuation and bifurcation analysis is a well-established tool for the study of deterministic systems. Recently, equation-free (EF) methods have been developed to extend numerical continuation techniques to systems where the dynamics are described at a microscopic scale and continuation of a macroscopic property of the system is considered. To date, the practical use of EF methods has been limited by: (1) the overhead of application-specific implementation; (2) the laborious configuration of problem-specific parameters; and (3) large ensemble sizes (potentially) leading to computationally restrictive run-times.

The two main challenges of ecological modelling are to yield more general understanding and theory and to provide testable and robust predictions. To achieve this, emergence, structural realism, and prediction have to become key elements of designing models. In the special issue “Next-generation ecological modelling”, which is dedicated to Donald DeAngelis on the occasion of his 70th birthday, 16 contributions present and discuss main features of next-generation ecological modelling. One key feature is to base the description of individuals’ behaviour and interactions on first principles rooted in energetic or evolutionary theory. To cope with increasing model complexity, standardization, separate testing of alternative submodels against multiple output patterns, and documenting these tests will be required. Including micro-evolution is essential to capture organisms’ response to changing conditions. Functional types may be used instead of species for representing communities. Model analysis will be challenging, but robustness analysis, which tries to break models’ explanations, can help to tell signals from noise and identify general mechanisms underlying the internal organization of ecological systems. Ultimately, next-generation modelling should aim at developing general theory to better understand stability properties and mechanisms. This understanding then can provide the basis for restoring, maintaining, or strengthening the resilience of ecosystems and supporting sustainable management of natural resources.

Existing methodologies of sensitivity analysis may be insufficient for a proper analysis of Agent-based Models (ABMs). Most ABMs consist of multiple levels, contain various nonlinear interactions, and display emergent behaviour. This limits the information content that follows from the classical sensitivity analysis methodologies that link model output to model input. In this paper we evaluate the performance of three well-known methodologies for sensitivity analysis. The three methodologies are extended OFAT (one-factor-at-a-time), and proportional assigning of output variance by means of model fitting and by means of Sobol’ decomposition. The methodologies are applied to a case study of limited complexity consisting of free-roaming and procreating agents that make harvest decisions with regard to a diffusing renewable resource. We find that each methodology has its own merits and exposes useful information, yet none of them provides a complete picture of model behaviour. We recommend extended OFAT as the starting point for sensitivity analysis of an ABM, for its use in uncovering the mechanisms and patterns that the ABM produces.
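
To make the variance-based approach concrete, the sketch below estimates first-order Sobol’ indices by Monte Carlo for a hypothetical additive model with two uniform inputs. The model, the sample size and the function names are assumptions for illustration. The estimator is a commonly used pick-and-freeze form and is not claimed to be the one used in the paper.

```python
import random

def model(x):
    """Toy additive model; analytic first-order indices are 0.2 and 0.8."""
    return x[0] + 2.0 * x[1]

def first_order_sobol(f, n_inputs, n_samples, seed=2):
    """Estimate the share of output variance due to each input alone."""
    rng = random.Random(seed)
    # Two independent sample matrices A and B with inputs uniform on [0, 1).
    a = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    b = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    fa = [f(row) for row in a]
    fb = [f(row) for row in b]
    mean = (sum(fa) + sum(fb)) / (2 * n_samples)
    variance = sum((y - mean) ** 2 for y in fa + fb) / (2 * n_samples - 1)
    indices = []
    for i in range(n_inputs):
        # Rows of A with column i taken from B: only input i differs from A.
        fab = [f(ra[:i] + [rb[i]] + ra[i + 1:]) for ra, rb in zip(a, b)]
        v_i = sum((yb - mean) * (yab - ya)
                  for yb, yab, ya in zip(fb, fab, fa)) / n_samples
        indices.append(v_i / variance)
    return indices

sobol_indices = first_order_sobol(model, n_inputs=2, n_samples=20000)
print([round(s, 2) for s in sobol_indices])
```

For a stochastic ABM each call to the model would itself be noisy, so replicate runs per parameter set would be needed before such indices can be interpreted.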