Embedding System Dynamics in Agent-Based Models for Complex Adaptive Systems
ABSTRACT: Complex adaptive systems (CAS) are composed of interacting agents, exhibit nonlinear properties such as positive and negative feedback, and tend to produce emergent behavior that cannot be wholly explained by deconstructing the system into its constituent parts. Both system dynamics (equation-based) approaches and agent-based approaches have been used to model such systems, and each has its benefits and drawbacks. In this paper, we introduce a class of agent-based models with an embedded system dynamics model, and detail the semantics of a simulation framework for these models. This model definition, along with the simulation framework, combines agent-based and system dynamics approaches in a way that retains the strengths of both paradigms. We show the applicability of our model by instantiating it for two example complex adaptive systems in the field of Computational Sustainability, drawn from ecology and epidemiology. We then present a more detailed application in epidemiology, in which we compare a previously unstudied intervention strategy to established ones. Our experimental results, unattainable using previous methods, yield insight into the effectiveness of these intervention strategies.
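The hybrid the abstract describes, discrete agents layered over a continuous stock-and-flow submodel, can be sketched in a few lines. The sketch below is our own illustration, not the paper's framework: each spatial patch embeds a small system-dynamics (ODE) model of logistic resource growth, integrated by forward Euler, while foraging agents act on the stock in discrete steps. All class names and parameter values are invented for the example.

```python
# Illustrative sketch (assumed design, not the authors' framework):
# an agent-based model whose environment embeds a system-dynamics submodel.
import random

class Patch:
    """A resource patch whose stock follows a logistic stock-and-flow model."""
    def __init__(self, stock=50.0, capacity=100.0, growth=0.1):
        self.stock, self.capacity, self.growth = stock, capacity, growth

    def step_sd(self, dt=1.0):
        # Embedded system-dynamics update: dS/dt = r * S * (1 - S/K),
        # integrated with a single forward-Euler step of size dt.
        flow = self.growth * self.stock * (1.0 - self.stock / self.capacity)
        self.stock = max(0.0, self.stock + flow * dt)

class Forager:
    """A discrete agent that harvests from the continuous stock."""
    def __init__(self, rate=2.0):
        self.rate, self.harvested = rate, 0.0

    def act(self, patch):
        take = min(self.rate, patch.stock)
        patch.stock -= take
        self.harvested += take

def simulate(n_patches=10, n_agents=5, steps=100, seed=0):
    rng = random.Random(seed)
    patches = [Patch() for _ in range(n_patches)]
    agents = [Forager() for _ in range(n_agents)]
    for _ in range(steps):
        for p in patches:              # continuous (system-dynamics) phase
            p.step_sd()
        for a in agents:               # discrete (agent) phase
            a.act(rng.choice(patches))
    return sum(p.stock for p in patches), sum(a.harvested for a in agents)
```

Alternating a continuous integration phase with a discrete agent phase is one simple way to interleave the two paradigms; the paper's simulation framework defines its own, more precise semantics for this interaction.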
Source available from: John Sterman
ABSTRACT: When is it better to use agent based (AB) models, and when should differential equation (DE) models be used? Where DE models assume homogeneity and perfect mixing within compartments, AB models can capture heterogeneity in agent attributes and in the network of interactions among them. Using contagious disease as an example, we contrast the dynamics of AB models with those of the corresponding mean-field DE model, specifically, comparing the standard SEIR model - a nonlinear DE - to an explicit AB model of the same system. We examine both agent heterogeneity and the impact of different network structures, including fully connected, random, Watts-Strogatz small world, scale-free, and lattice networks. Surprisingly, in many conditions the AB and DE dynamics are quite similar. Differences between the DE and AB models are not statistically significant on key metrics relevant to public health, including diffusion speed, peak load on health services infrastructure and total disease burden. We explore the conditions under which the AB and DE dynamics differ, and consider implications for managing infectious disease. The results extend beyond epidemiology: from innovation adoption to the spread of rumor and riot to financial panics, many important social phenomena involve analogous processes of diffusion and social contagion. Management Science 01/2008; 54:998-1014.
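The contrast the abstract draws, a mean-field compartmental DE versus an explicit agent-based model of the same contagion process, can be illustrated with a minimal matched pair. The sketch below uses SIR rather than SEIR for brevity, and all parameter values are illustrative assumptions, not taken from the paper:

```python
# Hedged sketch: mean-field DE model vs. an agent-based counterpart
# under perfect mixing, so the two should produce similar attack rates.
import random

def sir_de(beta=0.3, gamma=0.1, n=1000, i0=10, steps=200, dt=1.0):
    # Mean-field compartmental model, forward-Euler integration.
    s, i, r = float(n - i0), float(i0), 0.0
    for _ in range(steps):
        new_inf = beta * s * i / n * dt
        new_rec = gamma * i * dt
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
    return r / n  # final attack rate (fraction ever infected)

def sir_ab(beta=0.3, gamma=0.1, n=1000, i0=10, steps=200, seed=1):
    # Agent-based counterpart: each infective contacts one agent chosen
    # uniformly at random per step (the perfect-mixing assumption).
    rng = random.Random(seed)
    state = ['I'] * i0 + ['S'] * (n - i0)
    for _ in range(steps):
        infected = [k for k, st in enumerate(state) if st == 'I']
        if not infected:
            break
        for k in infected:
            j = rng.randrange(n)
            if state[j] == 'S' and rng.random() < beta:
                state[j] = 'I'
            if rng.random() < gamma:
                state[k] = 'R'
    return state.count('R') / n
```

Replacing the uniform random contact in `sir_ab` with draws from a fixed network (lattice, small-world, scale-free) is what breaks the perfect-mixing assumption and produces the DE/AB divergences the abstract investigates.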
Conference Paper: Bandit Based Monte-Carlo Planning.
ABSTRACT: For large state-space Markovian Decision Problems, Monte-Carlo planning is one of the few viable approaches to find near-optimal solutions. In this paper we introduce a new algorithm, UCT, that applies bandit ideas to guide Monte-Carlo planning. In finite-horizon or discounted MDPs the algorithm is shown to be consistent and finite sample bounds are derived on the estimation error due to sampling. Experimental results show that in several domains, UCT is significantly more efficient than its alternatives. Consider the problem of finding a near-optimal action in large state-space Markovian Decision Problems (MDPs) under the assumption that a generative model of the MDP is available. One of the few viable approaches is to carry out sampling-based lookahead search, as proposed by Kearns et al. (8), whose sparse lookahead search procedure builds a tree with its nodes labelled by either states or state-action pairs in an alternating manner, and the root corresponding to the initial state from where planning is initiated. Each node labelled by a state is followed in the tree by a fixed number of nodes associated with the actions available at that state, whilst each corresponding state-action labelled node is followed by a fixed number of state-labelled nodes sampled using the generative model of the MDP. During sampling, the sampled rewards are stored with the edges connecting state-action nodes and state nodes. The tree is built in a stage-wise manner, from the root to the leaves. Its depth is fixed. The computation of the values of the actions at the initial state happens from the leaves by propagating the values up in the tree: the value of a state-action labelled node is computed based on the average of the sum of the rewards along the edges originating at the node and the values at the corresponding successor nodes, whilst the value of a state node is computed by taking the maximum of the values of its children. Kearns et al.
showed that in order to find an action at the initial state whose value is within the ε-vicinity of that of the best, for discounted MDPs with discount factor 0 ≤ γ < 1, K actions and uniformly bounded rewards, regardless of the size of the state-space fixed-size trees suffice (8). In particular, the depth of the tree is proportional to
Machine Learning: ECML 2006, 17th European Conference on Machine Learning, Berlin, Germany, September 18-22, 2006, Proceedings; 01/2006