Article

Boundedly Rational Rule Learning in a Guessing Game

Authors:
Dale O. Stahl

Abstract

We combine Nagel's “step-k” model of boundedly rational players with a “law of effect” learning model. Players begin with a disposition to use one of the step-k rules of behavior, and over time they learn how the available rules perform and switch to better-performing rules. We offer an econometric specification of this dynamic process and fit it to Nagel's experimental data. We find that the rule learning model vastly outperforms other nested and non-nested learning models. We find strong evidence for diverse dispositions, and we reject the Bayesian rule-learning model. Journal of Economic Literature Classification Numbers: C70, C52, D83.
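To make the abstract's mechanism concrete, here is a minimal Python sketch of step-k rules governed by a law-of-effect dynamic. It is our illustrative reading, not Stahl's econometric specification: the target fraction P, the decay BETA, the payoff function, and the initial dispositions are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

P = 2 / 3     # target fraction of the mean (an assumption; Nagel used 2/3)
K = 4         # candidate rules: step-0 ... step-3
N = 20        # players
T = 10        # rounds
BETA = 0.9    # decay of accumulated reinforcement (illustrative)

def step_k_guess(k, anchor):
    # step-0 plays the anchor (50 in round 1, last round's mean afterwards);
    # step-k iterates the best response to step-(k-1) opponents k times
    return anchor * P ** k

# each player starts with a disposition toward one of the K rules
propensity = np.ones((N, K))
propensity[np.arange(N), rng.integers(K, size=N)] += 5.0

anchor = 50.0
for t in range(T):
    # rules are chosen with probability proportional to their propensity
    probs = propensity / propensity.sum(axis=1, keepdims=True)
    rules = np.array([rng.choice(K, p=probs[i]) for i in range(N)])
    guesses = step_k_guess(rules, anchor)
    target = P * guesses.mean()

    # law of effect: every rule is reinforced by the payoff it earned or
    # would have earned this round, so better-performing rules gain mass
    payoffs = 100.0 - np.abs(step_k_guess(np.arange(K), anchor) - target)
    propensity = BETA * propensity + payoffs
    anchor = guesses.mean()

print("mean guess after %d rounds: %.2f" % (T, guesses.mean()))
```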


... One is the Cournot algorithm described in the "Motivating example" section. 2 Another well-known behavior rule is fictitious play (e.g., Brown [2], Fudenberg and Kreps [4], and Fudenberg and Takahashi [5]), which chooses a pure-action best response to the observed frequency of actions by the opponent group. 3 Each of the level-k rules in level-k theory (e.g., Nagel [15], Stahl [18] and Mohlin [14]) is also a behavior rule 4 : the level-0 rule is to choose all actions with equal probability, the level-1 rule best responds to the choice of the level-0 opponent, and so on. Notice that these behavior rules are well defined without knowledge of the component game G. ...
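For concreteness, the three families of rules named here can be written as maps from observed history to actions. The 2x2 payoff matrix, the symmetry assumption, and all names below are ours, purely for illustration (the best-response oracle is where payoff knowledge enters):

```python
import numpy as np

# illustrative payoff matrix for a symmetric 2x2 game (an assumption):
# rows = own action, columns = opponent action
U = np.array([[3.0, 0.0],
              [5.0, 1.0]])

def best_response(belief):
    # pure best response to a probability vector over opponent actions
    return int(np.argmax(U @ belief))

def cournot(history):
    # best respond to the opponent's most recent action
    return best_response(np.eye(U.shape[1])[history[-1]])

def fictitious_play(history):
    # best respond to the empirical frequency of all past opponent actions
    counts = np.bincount(history, minlength=U.shape[1])
    return best_response(counts / counts.sum())

def level_k(k):
    # level-0 mixes uniformly; level-k best responds to a level-(k-1)
    # opponent (symmetry lets us reuse U for both players)
    belief = np.full(U.shape[1], 1.0 / U.shape[1])
    action = None
    for _ in range(k):
        action = best_response(belief)
        belief = np.eye(U.shape[1])[action]
    return action  # None for k = 0, whose play is the uniform mix

history = [0, 1, 1, 0, 1]  # opponent's past actions (made up)
print(cournot(history), fictitious_play(history), level_k(2))
```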
... We allow players in any group to hold different behavior rules and to change behavior rules over time. The latter case includes rule-learning (e.g., Stahl [18] and [19]) and hypothesis testing (e.g., Foster and Young [3]), that is, in each period, a behavior rule (which can be an algorithm) for each player is determined by a metarule (which can be an algorithm as well) on how to adjust behavior rules over time. ...
... We focus on "rational or justifiable" behavior rules in the sense that the rule prescribes (i) a best response to some belief or (ii) a previously chosen action. Such behaviors are predominant among humans (e.g., Stahl [18], and Kneeland [9]). When designing an algorithm, we also want it to have these properties. ...
Article
Full-text available
There is a widespread hope that, in the near future, algorithms will become so sophisticated that “solutions” to most problems are found by machines. In this note, we cast some doubt on this expectation by showing the following impossibility result: given a set of finite-memory, finite-iteration algorithms, there exists a continuum of games whose unique and strict Nash equilibrium cannot be reached from a large set of initial states. A Nash equilibrium is a social solution to conflicts of interest, and hence finite algorithms should not always be relied upon for social problems. Our result also shows how to construct games that deceive a given set of algorithms into being trapped in a cycle without a Nash equilibrium.
... (34) FM models are the technique of choice for analyzing Beauty-Contest data, revealing that virtually all 'nontheorist' subjects 14 (94%) fall into one of three boundedly rational depth-of-reasoning classes (levels 0, 1 or 2). (35,36) FM models are being applied increasingly in empirical game theory, including the analysis of e.g. trust-game data, social-preferences data, and common-pool-resource data, demonstrating the broad applicability of a multiple-criteria approach. ...
... 34 Experiments have shown that deliberative reasoning can be "activated by metacognitive experiences of difficulty or disfluency during the process of reasoning", (58) i.e. by experiences that affect confidence negatively. 35 T II subjects come out of their training just as confident (in distribution) as T B subjects, and their overall classification performance, true-positive performance, and true-negative performance is statistically indistinguishable from the T B status quo. This classification performance itself could be the result of either or both (i) fundamental difficulty of interrupting the impulsive state of mind, or (ii) failure of T II's design to stimulate metacognitive processes. ...
... 34 Our results do not conclusively rule out the possibility that T II training could outperform the baseline, because we compare here our initial attempts at producing training packages against packages informed and refined over years of industry-standard approach development by professionals in the field of information security training. 35 "Confidence in the accuracy of intuitive judgment appears to depend in large part on the ease or difficulty with which information comes to mind ... and the perceived difficulty of the judgment at hand." (58) T RI on the other hand is a limited success, in that it increased the true-negative classification rate relative to T B by 15%. But it did not have a statistically significant effect on the true-positive classification rate relative to T B. T RI appears to be successful in stimulating metacognitive processes, and relatedly, in reducing (over-)confidence. ...
Preprint
Full-text available
Normative decision theory proves inadequate for modeling human responses to the social-engineering campaigns of Advanced Persistent Threat (APT) attacks. Behavioral decision theory fares better, but still falls short of capturing social-engineering attack vectors, which operate through emotions and peripheral-route persuasion. We introduce a generalized decision theory, under which any decision will be made according to one of multiple coexisting choice criteria. We denote the set of possible choice criteria by C. Thus the proposed model reduces to conventional Expected Utility theory when | C_EU | = 1, whilst Dual-Process (thinking fast vs. thinking slow) decision making corresponds to a model with | C_DP | = 2. We consider a more general case with | C | >= 2, which necessitates careful consideration of *how*, for a particular choice-task instance, one criterion comes to prevail over others. We operationalize this with a probability distribution that is conditional upon traits of the decision maker as well as upon the context and the framing of choice options. Whereas existing Signal Detection Theory (SDT) models of phishing detection commingle the different peripheral-route persuasion pathways, in the present descriptive generalization the different pathways are explicitly identified and represented. A number of implications follow immediately from this formulation, ranging from the conditional nature of security-breach risk to delineation of the prerequisites for valid tests of security training. Moreover, the model explains the `stepping-stone' penetration pattern of APT attacks, which has confounded modeling approaches based on normative rationality.
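A minimal sketch of how such a model could be operationalized: several coexisting criteria score the options, and which criterion governs a given task instance is drawn from a distribution conditioned on traits, context, and framing. Every function, trait name, and weight below is a hypothetical illustration, not the paper's specification.

```python
import numpy as np

rng = np.random.default_rng(1)

def expected_utility(options, ctx):
    # deliberative criterion: probability-weighted payoff of each option
    return [sum(p * u for p, u in opt["lotteries"]) for opt in options]

def affect_heuristic(options, ctx):
    # peripheral route: score options by their emotional pull alone
    return [opt["affect"] * ctx["arousal"] for opt in options]

CRITERIA = [expected_utility, affect_heuristic]

def criterion_probs(traits, ctx):
    # illustrative logistic weighting: higher arousal and impulsivity shift
    # probability mass from deliberation toward the affective criterion
    z = traits["impulsivity"] + ctx["arousal"] - 1.0
    p_affect = 1.0 / (1.0 + np.exp(-z))
    return [1.0 - p_affect, p_affect]

def decide(options, traits, ctx):
    probs = criterion_probs(traits, ctx)
    criterion = CRITERIA[rng.choice(len(CRITERIA), p=probs)]
    return int(np.argmax(criterion(options, ctx)))

options = [
    {"lotteries": [(0.9, 10.0), (0.1, 0.0)], "affect": 0.2},  # safe, dull
    {"lotteries": [(0.2, 30.0), (0.8, 0.0)], "affect": 0.9},  # flashy lure
]
print(decide(options, {"impulsivity": 0.8}, {"arousal": 0.9}))
```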
... As discussed in Section 1, this paper is closely related to the extensive literature on limited depth of reasoning in strategic environments. Over the past thirty years, this idea has been studied in a variety of theoretical works (see, for instance, Binmore (1987, 1988), Selten (1991, 1998), Aumann (1992), Stahl (1993), and Alaoui and Penta (2016, 2018)). Beyond theoretical work, Nagel (1995) conducts the first experiment to study people's iterative reasoning process, using the "beauty contest" game. ...
... First, in most laboratory experiments in economics and game theory, subjects play the same game with multiple repetitions, in order to gain experience and facilitate convergence to equilibrium behavior. Ho and Su (2013) and Ho et al. (2021) propose a modification of CH that allows for learning across repeated plays of the same sequential game, in a different way than in Stahl (1996) but in the same spirit. In their setting, players repeatedly play the same sequential game and update their beliefs about the distribution of levels after observing past outcomes of earlier games, while holding beliefs fixed during each play of the game. ...
Preprint
Full-text available
We explore the dynamic cognitive hierarchy (CH) theory proposed by Lin and Palfrey (2022) in the setting of multi-stage games of incomplete information. In such an environment, players will learn other players' payoff-relevant types and levels of sophistication at the same time as the history unfolds. For a class of two-person dirty faces games, we fully characterize the dynamic CH solution, predicting that lower-level players will figure out their face types in later periods than higher-level players. Finally, we re-analyze the dirty faces game experimental data from Bayer and Chan (2007) and find the dynamic CH solution can better explain the data than the static CH solution.
... Our paper belongs to the literature on learning across games. Following Selten, Abbink, Buchta & Sadrieh (2003), we consider a population of (artificial) agents who use behavior rules as in Stahl (1996) to decide upon some course of action in unfamiliar situations as described by Gilboa & Schmeidler (1995). Gilboa & Schmeidler (1995) provide a theoretical basis for learning across games. ...
... An empirical test of Mengel's (2012) partition model is provided by Grimm & Mengel (2012). Stahl (1996, 1999, 2000) introduced a rule-based approach to model learning by boundedly rational agents. The agents have behavior rules, which are maps from information sets to sets of feasible actions, and the reinforcement principle defines a learning dynamic on the space of behavior rules. ...
Article
Full-text available
We study one-shot play in the set of all bimatrix games by a large population of agents. The agents never see the same game twice, but they can learn ‘across games’ by developing solution concepts that tell them how to play new games. Each agent’s individual solution concept is represented by a computer program, and natural selection is applied to derive a stochastically stable solution concept. Our aim is to develop a theory predicting how experienced agents would play in one-shot games. To use the theory, visit https://gplab.nhh.no/gamesolver.php.
... 39 Cf. Stahl (1995, p. 304). Figure 4.2 illustrates a simple model of adaptive behaviour (Stahl, 1995). It is centred on the evaluation of the feedback a certain behaviour yields, according to which the individual rules are updated. ...
... 45 Cf. e.g. Camerer (2003a), Nagel (1995), Stahl (1995). ...
Chapter
Prediction is involved in many recurrent tasks individuals are confronted with and can be seen as the result of the interaction between “judgement, intuition, and educated guesswork.” Even when forecasts rely on mathematical methods, the central role of intuition cannot be denied, as it supervises e.g. the choice of variables that belong to the model, their initial values and their functional specification. This chapter examines a central aspect of human cognition and problem-solving, namely that individuals make use of boundedly rational heuristics for taking decisions under uncertainty. Heuristics are simplified procedures for assessing probabilities. They are based on rules of thumb. They rely on mental clues which selectively orient the search process and enable the individual to reach her goals when time, informational and computational capabilities are constrained. Although in some cases boundedly rational heuristics can be held responsible for the sub-optimality of outcomes and for behavioural biases, in other cases they represent an essential support for carrying on inference when complexity overloads the individual's cognitive and computational capabilities, enabling the individual to reach better solutions than otherwise. There are mainly two different approaches to subjective judgement and boundedly rational heuristics, namely the “heuristics and biases” approach, pioneered by Kahneman and Tversky, and the “ecological rationality” approach, with Gigerenzer as one of its most influential proponents.
... 39 Cf. Stahl (1995, p. 304). Figure 4.2 illustrates a simple model of adaptive behaviour (Stahl, 1995). It is centred on the evaluation of the feedback a certain behaviour yields, according to which the individual rules are updated. ...
... 45 Cf. e.g. Camerer (2003a), Nagel (1995), Stahl (1995). ...
Chapter
The phenomenon of referring is pervasive and regards all fields of human thought and activity, so much so that it appears to be an inescapable basis of all that can be thought, conceptualized and expressed. The human capability of referring creates the basis for ordering the subjective perception of the world, for interpreting events, for interacting with others, etc., thus creating the basis for all activities which regard human cognition and which are essential for individual survival. Being able to establish self-references is even a necessary prerequisite for self-change and behavioural adjustment. Furthermore, the reflexive capacity underlies basic problem-solving abilities and makes mental adaptiveness possible. Consciousness (in the form of self-consciousness) can be identified as the main source of reflexivity for human thought and action. Individuals think and are simultaneously conscious of their thought, so that all discourses are directed both to outward reality (the external world) and to the inner reality of the individual who formulates them, since she is conscious of expressing them. Therefore it can be said that each human discourse, being a human way of thought formulation, has a self-referring nature. This chapter is dedicated to the analysis of the polyvalent concept of “self-reference.” After its definition, which will be accompanied by an overview of the different kinds of reference relations, some common varieties and possible taxonomies of self-reference will be presented. The polymorphism of self-reference will be illustrated by its implications for formal and natural language. The logical consistency of self-reference in its different forms and contexts of appearance will then be discussed, in that the relation between self-reference and paradoxes will be examined and some guidelines for testing the legitimacy of self-references will be extrapolated. The chapter concludes by discussing the role of self-reference for human understanding as well as for social and individual decision making.
... For individual DM tasks, Rieskamp and Otto (2006) find evidence that subjects use a reinforcement-learning scheme over the available heuristics. For strategic DM, Stahl (1996) concludes that subjects often apply rule-learning, which is essentially a form of reinforcement learning over a set of decision strategies. Closely related to this approach is the literature on evolution as the selection mechanism of decision rules, e.g., Engle-Warnick and Slonim (2004); Friedman (1991). ...
... Experimental studies of individual DM and strategic DM generally find significant between-subject heterogeneity, e.g. in learning models (Cheung and Friedman 1997; Daniel et al. 1998; Ho et al. 2007; Rapoport et al. 1998; Rutström and Wilcox 2009; Stahl 1996; Shachat and Swarthout 2004; Spiliopoulos 2012). ...
Article
Full-text available
For decisions in the wild, time is of the essence. Available decision time is often cut short through natural or artificial constraints, or is impinged upon by the opportunity cost of time. Experimental economists have only recently begun to conduct experiments with time constraints and to analyze response time (RT) data, in contrast to experimental psychologists. RT analysis has proven valuable for the identification of individual and strategic decision processes including identification of social preferences in the latter case, model comparison/selection, and the investigation of heuristics that combine speed and performance by exploiting environmental regularities. Here we focus on the benefits, challenges, and desiderata of RT analysis in strategic decision making. We argue that unlocking the potential of RT analysis requires the adoption of process-based models instead of outcome-based models, and discuss how RT in the wild can be captured by time-constrained experiments in the lab. We conclude that RT analysis holds considerable potential for experimental economics, deserves greater attention as a methodological tool, and promises important insights on strategic decision making in naturally occurring environments.
... They estimate a high frequency (about 40%) of worldly subjects and a moderate frequency (about 20%) of "Naïve Nash" (like our Equilibrium) subjects, but, in contrast to our results, no "rational expectations" subjects. 50 This reconfirms a point that has been made by many other experimental studies, including Roth, Prasnikar, Okuno-Fujiwara, and Zamir (1991), McKelvey and Palfrey (1992), Beard and Beil (1994), Nagel (1995), Stahl and Wilson (1995), Stahl (1996), Ho and Weigelt (1996), and Ho, Weigelt, and Camerer (1998). ...
... There is a growing experimental literature that studies the principles that govern strategic behavior, surveyed in Kagel and Roth (1995) and Crawford (1997). See, among others, Beard and Beil (1994), Brandts and Holt (1993), Cachon and Camerer (1996), Camerer and Ho (1998), Cooper, DeJong, Forsythe, and Ross (1990), Friedman (1996), Ho and Weigelt (1996), Ho, Camerer, and Weigelt (1998), McKelvey and Palfrey (1992), Nagel (1995), Palfrey and Rosenthal (1994), Roth (1987), Roth, Prasnikar, Okuno-Fujiwara, and Zamir (1991), Schotter, Weigelt, and Wilson (1994), Selten (1998), Stahl (1996), Stahl and Wilson (1995), Straub (1995), and Van Huyck, Battalio, and Beil (1990). 5 Subjects were not allowed to record the pie sizes, and the frequencies with which they looked up payoffs repeatedly made clear that they did not memorize them. ...
Article
Full-text available
This paper reports experiments designed to measure strategic sophistication, the extent to which players' behavior reflects attempts to predict others' decisions, taking their incentives into account. Subjects played normal-form games with various patterns of iterated dominance and unique pure-strategy equilibria without dominance, using a computer interface that allowed them to look up hidden payoffs as often as desired, one at a time, while automatically recording their look-ups. Monitoring information search allows tests of game theory's implications for cognition as well as decisions, and subjects' deviations from search patterns suggested by equilibrium analysis help to predict their deviations from equilibrium decisions.
... Most of these models deal with the notion of bounded rationality, whereby players are rational only to some extent; the degree of rationality is associated with the sophistication of a player. A dynamic model in which players choose one of the step-k behavioral rules, learn from the results of the experiment, and choose more successful rules in subsequent iterations was presented and estimated in Stahl (1996). A further extension of the set of possible behavioral strategies is discussed in Stahl (1998). ...
Preprint
Full-text available
A Keynesian beauty contest is a wide class of games of guessing the most popular strategy among other players. In particular, guessing a fraction of the mean of numbers chosen by all players is a classic behavioral experiment designed to test iterative reasoning patterns among various groups of people. The previous literature reveals that the level of sophistication of the opponents is an important factor affecting the outcome of the game. Smarter decision makers choose strategies that are closer to the theoretical Nash equilibrium and demonstrate faster convergence to equilibrium in iterated contests with information revelation. We replicate a series of classic experiments by running virtual experiments with modern large language models (LLMs) that play against various groups of virtual players. We test how advanced the LLMs' behavior is compared to the behavior of human players. We show that LLMs typically take into account the opponents' level of sophistication and adapt by changing their strategy. In various settings, most LLMs (with the exception of Llama) are more sophisticated and play lower numbers compared to human players. Our results suggest that LLMs (except Llama) are rather successful in identifying the underlying strategic environment and adapting their strategies to the changing set of parameters of the game in the same way that human players do. All LLMs still fail to play dominant strategies in a two-player game. Our results contribute to the discussion on the accuracy of modeling human economic agents by artificial intelligence.
... In this experiment, the rule learning model is used as the rule classification layer. The function of this layer is to quickly predict partial results according to the defined association rules [16]. This improves the speed of model prediction. ...
Article
Full-text available
Vehicle collisions are a significant concern in road accidents, particularly with the rise of autonomous driving technology. However, existing studies often struggle to accurately predict collisions due to inconsistent correlations between collected data and collision labels. Therefore, this work quantitatively analyzes traffic accident data and constructs new features with strong correlations to the labels. In this study, a rule classification-dilated convolution network (R-DCN) model, which combines rule learning with dilated convolutional networks, is proposed. The rule learning model predicts partially collided vehicles using predefined rules, resulting in interpretability, high prediction efficiency, and quick computation. The remaining vehicle collisions are estimated using dilated convolutional layers, addressing the issue of missing important features in conventional convolution models. To distinguish between intense collisions (predicted by rule learning) and nonintense collisions (predicted by the dilated convolutional model), the network is trained on data from which the intense collisions predicted by the rule learning model have been removed. The proposed model exhibits enhanced sensitivity to nonintense collision data. Compared to existing models, the approach presented in this work demonstrates superior evaluation metrics and training speed.
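A schematic sketch of the two-stage idea, not the paper's R-DCN architecture: a hand-written rule flags the easy, intense collisions cheaply, and a small stack of dilated 1-D convolutions scores the remainder. Feature names, thresholds, and layer sizes are all invented.

```python
import torch
import torch.nn as nn

def rule_layer(features):
    # hypothetical hand-written rule: flag an intense collision when both
    # relative speed and closing gap cross illustrative thresholds
    speed, gap = features[:, 0], features[:, 1]
    return (speed > 0.8) & (gap < 0.2)

class DilatedNet(nn.Module):
    # stacked 1-D convolutions with growing dilation widen the receptive
    # field over the time series without pooling away fine detail
    def __init__(self, channels=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(2, channels, kernel_size=3, dilation=1, padding=1),
            nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size=3, dilation=2, padding=2),
            nn.ReLU(),
            nn.Conv1d(channels, 1, kernel_size=1),
        )

    def forward(self, x):  # x: (batch, 2, time)
        return torch.sigmoid(self.net(x).mean(dim=-1)).squeeze(-1)

batch = torch.rand(16, 2, 50)        # fake sensor sequences
summary = batch.mean(dim=-1)         # (batch, 2) summary features
intense = rule_layer(summary)        # stage 1: cheap rule predictions
residual = batch[~intense]           # stage 2 sees only the remaining cases
scores = DilatedNet()(residual)      # collision probabilities
```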
... Most of these models deal with the notion of bounded rationality, whereby players are rational only to some extent; the degree of rationality is associated with the sophistication of a player. A dynamic model in which players choose one of the step-k behavioral rules, learn from the results of the experiment, and choose more successful rules in subsequent iterations was presented and estimated in Stahl (1996). A further extension of the set of possible behavioral strategies is discussed in Stahl (1998). ...
... In fact, as Carpenter et al. (2013) and Gill and Prowse (2016) show, the capacity to effectively play this game depends on cognitive skills. The capacity to perform well in a beauty contest depends on cognitive skills in two ways: the strategic ability of forming higher-order beliefs (level-k thinking; see Duffy and Nagel, 1997; Ho and Su, 2013; Nagel, 1995; Stahl, 1996) and the capacity of forming correct beliefs about the strategic ability of opponents and best responding to them. 8 Since performing well in the beauty contest game conflates the effect of cognitive skills through these two avenues, we focus our analysis on the distance from best response (in absolute terms) to assess subject performance, instead of the k-level as is typically done in the literature. ...
Article
Full-text available
The frustration-aggression hypothesis posits that anger affects economic behaviour essentially by temporarily changing individual social preferences and specifically attitudes towards punishment. Here, we test a different channel in an experiment where we externally induce anger in a subgroup of participants (following a standard procedure that we verify by using a novel method of textual analysis). We show, in a pre-registered experiment, that anger can impair the capacity to think strategically in a beauty-contest game. Angry participants choose numbers further away from the best-response level and earn significantly lower profits. Using a finite mixture model, we show that anger increases the number of level-zero players by 9 percentage points, a percentage increase of more than 30%. Furthermore, with a second pre-registered experiment, we show that this effect is not common to all negative emotions. Sad participants do not play significantly further away from the best-response level than the control group, and sadness does not lead to more level-zero play.
... so-called Cournot-Nash equilibrium. 1 While the experimental evidence by Cox and Walker (1998) suggests that naïve expectations may be a reasonable description of firms' expectation formation behavior, the experimental evidence by Stahl (1996), Offerman et al. (2002), Bigoni (2010) and Assenza et al. (2015) paints a richer picture. In particular, these studies suggest that firms switch between a limited number of heuristic forecasting models to form their expectations. ...
Article
Full-text available
We develop a nonlinear duopoly model in which the heuristic expectation formation and learning behavior of two boundedly rational firms may engender complex dynamics. Most importantly, we assume that the firms employ different forecasting models to predict the behavior of their opponent. Moreover, the firms learn by leaning more strongly on forecasting models that yield more precise predictions. An eight-dimensional nonlinear map drives the dynamics of our approach. We analytically derive the conditions under which its unique steady state is locally stable and numerically study its out-of-equilibrium behavior. In doing so, we detect multiple scenarios with coexisting attractors at which the firms’ behavior yields distinctively different market outcomes.
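The switching ingredient can be sketched in a few lines: heuristics are weighted by a logistic (discrete-choice) function of their past squared prediction errors, so that firms lean more strongly on the more precise model. This stylization is our assumption, not the paper's eight-dimensional map; the two heuristics and the intensity-of-choice value are invented.

```python
import numpy as np

# two illustrative forecasting heuristics for the rival's next output
def naive(history):
    return history[-1]

def trend(history):
    return history[-1] + (history[-1] - history[-2])

HEURISTICS = [naive, trend]
GAMMA = 2.0  # intensity of choice: how strongly precision drives switching

def heuristic_weights(history):
    # lean more strongly on the heuristic whose one-step-ahead forecasts
    # were more precise in the past (smaller squared prediction error)
    errors = np.zeros(len(HEURISTICS))
    for t in range(2, len(history) - 1):
        for j, h in enumerate(HEURISTICS):
            errors[j] += (h(history[: t + 1]) - history[t + 1]) ** 2
    fitness = -errors
    w = np.exp(GAMMA * (fitness - fitness.max()))  # numerically stable logit
    return w / w.sum()

rival_output = [10.0, 11.0, 12.5, 12.0, 13.0]
w = heuristic_weights(rival_output)
forecast = sum(wj * h(rival_output) for wj, h in zip(w, HEURISTICS))
print(w, forecast)
```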
... In the El-Farol bar problem (Arthur, 1994), agents hold a heterogeneous set of simple predictive models and learn to use the more effective rules (given their individual experience) over time; interestingly, such a learning process converges to the Nash equilibrium solution. Empirical work in repeated games by Stahl (1996, 1999, 2000) and Haruvy and Stahl (2012) finds evidence that subjects learn to use relatively simple rules based on their prior performance; they refer to their model as rule-learning. These are concepts strikingly similar to those proposed by the ERP; however, the ERP studies were in the domain of individual decision making, whereas the economic studies are in strategic decision making. ...
Article
Full-text available
Over the past decades psychological theories have made significant headway into economics, culminating in the 2002 (partially) and 2017 Nobel prizes awarded for work in the field of Behavioral Economics. Many of the insights imported from psychology into economics share a common trait: the presumption that decision makers use shortcuts that lead to deviations from rational behaviour (the Heuristics-and-Biases program). Many economists seem unaware that this viewpoint has long been contested in cognitive psychology. Proponents of an alternative program (the Ecological-Rationality program) argue that heuristics need not be irrational, particularly when judged relative to characteristics of the environment. We sketch out the historical context of the antagonism between these two research programs and then review more recent work in the Ecological-Rationality tradition. While the heuristics-and-biases program is now well-established in (mainstream neo-classical) economics via Behavioral Economics, we show there is considerable scope for the Ecological-Rationality program to interact with economics. In fact, we argue that there are many existing, yet overlooked, bridges between the two, based on independently derived research in economics that can be construed as being aligned with the tradition of the Ecological-Rationality program. We close the paper with a discussion of the open challenges and difficulties of integrating the Ecological Rationality program with economics.
... In fact, as Carpenter et al. (2013) and Gill and Prowse (2016) show, the capacity to effectively play this game depends on cognitive skills. The capacity to perform well in a beauty contest depends on cognitive skills in two ways: the strategic ability of forming higher-order beliefs (level-k thinking; see Duffy and Nagel, 1997; Ho and Su, 2013; Nagel, 1995; Stahl, 1996) and the capacity of forming correct beliefs about the strategic ability of opponents and best responding to them. 8 Since performing well in the beauty contest game conflates the effect of cognitive skills through these two avenues, we focus our analysis on the distance from best response (in absolute terms) to assess subject performance, instead of the k-level as is typically done in the literature. ...
Article
Full-text available
The frustration-aggression hypothesis posits that anger affects economic behaviour essentially by temporarily changing individual social preferences and specifically attitudes towards punishment. Here, we test a different channel in an experiment where we externally induce anger in a subgroup of participants (following a standard procedure that we verify by using a novel method of textual analysis). We show, in a pre-registered experiment, that anger can impair the capacity to think strategically in a beauty-contest game. Angry participants choose numbers further away from the best-response level and earn significantly lower profits. Using a finite mixture model, we show that anger increases the number of level-zero players by 9 percentage points, a percentage increase of more than 30%. Furthermore, with a second pre-registered experiment, we show that this effect is not common to all negative emotions. Sad participants do not play significantly further away from the best-response level than the control group, and sadness does not lead to more level-zero play.
... In the El-Farol bar problem (Arthur 1994), agents hold a heterogeneous set of simple predictive models and learn to use the more effective rules (given their individual experience) over time; interestingly, such a learning process converges to the Nash equilibrium solution. Empirical work in repeated games by Stahl (1996, 1999, 2000) and Haruvy & Stahl (2012) finds evidence that subjects learn to use relatively simple rules based on their prior performance; they refer to their model as rule-learning. These are concepts strikingly similar to those proposed by the ERP; however, the ERP studies were in the domain of individual decision making, whereas the economic studies are in strategic decision making. ...
Preprint
Full-text available
Over the past decades psychological theories have made significant headway into economics, culminating in the 2002 (partially) and 2017 Nobel prizes awarded for work in the field of Behavioral Economics. Many of the insights imported from psychology into economics share a common trait: the presumption that decision makers use shortcuts that lead to deviations from rational behaviour (the Heuristics-and-Biases program). Many economists seem unaware that this viewpoint has long been contested in cognitive psychology. Proponents of an alternative program (the Ecological-Rationality program) argue that heuristics need not be irrational, particularly when judged relative to characteristics of the environment. We sketch out the historical context of the antagonism between these two research programs and then review more recent work in the Ecological-Rationality tradition. While the heuristics-and-biases program is now well-established in (mainstream neo-classical) economics via Behavioral Economics, we show there is considerable scope for the Ecological-Rationality program to interact with economics. In fact, we argue that there are many existing, yet overlooked, bridges between the two, based on independently derived research in economics that can be construed as being aligned with the tradition of the Ecological-Rationality program. We close the chapter with a discussion of the open challenges and difficulties of integrating the Ecological Rationality program with economics.
... After this paper, a fruitful literature emerged studying iterative reasoning, bounded rationality, and learning. See Duffy and Nagel (1997), Ho et al. (1998), Nagel (1995), and Stahl (1996) for early applications of iterated best reply and learning models. For more recent reviews, see Akin and Urhan (2011), Camerer (2003), Crawford et al. (2013), and Nagel (2008). ...
Article
Full-text available
This paper theoretically and experimentally investigates the behavior of asymmetric players in guessing games. The asymmetry is created by introducing r > 1 replicas of one of the players. Two-player and restricted N-player cases are examined in detail. Based on the model parameters, the equilibrium is either unique, in which all players choose zero, or mixed, in which the weak player (r = 1) imitates the strong player (r > 1). A series of experiments involving two- and three-player repeated guessing games with a unique equilibrium is conducted. We find that equilibrium behavior is observed less frequently and overall choices are farther from the equilibrium in two-player asymmetric games in contrast to symmetric games, but this is not the case in three-player games. Convergence towards equilibrium exists in all cases, but asymmetry slows down the speed of convergence to the equilibrium in two- but not in three-player games. Furthermore, the strong players have a slight earning advantage over the weak players, and asymmetry increases the discrepancy in choices (defined as the squared distance of choices from the winning number) in both games.
... Guessing games, also known as Beauty-Contest games, have been pivotal in showing not only that backward induction and dominance-solvability break down, but also that game play can be characterized by membership in a boundedly rational, discrete (level-k) depth-of-reasoning class (Nagel, 1995). FM models are the technique of choice for analyzing Beauty-Contest data, revealing that virtually all "nontheorist" subjects 15 (94%) fall into one of three boundedly rational depth-of-reasoning classes (level 0, 1, or 2) (Bosch-Domènech et al., 2010; Stahl, 1996). FM models are being applied increasingly in empirical game theory, including the analysis of, for example, trust-game data, social-preferences data, and common-pool-resource data, demonstrating the broad applicability of a multiple-criteria approach. ...
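A deliberately minimal finite-mixture (FM) sketch in this spirit: guesses are modeled as a mixture of Gaussians centered on the level-0/1/2 targets of a p = 2/3 game, and EM recovers the class shares. The fixed centers, common sigma, and data are illustrative assumptions, not the specification used in the cited studies.

```python
import numpy as np
from scipy.stats import norm

P = 2 / 3
CENTERS = 50 * P ** np.arange(3)  # level-0, level-1, level-2 target guesses

def em_mixture(guesses, sigma=6.0, iters=50):
    # EM over mixture weights only (component means fixed at the level-k
    # targets, common known sigma): a deliberately minimal FM model
    w = np.full(len(CENTERS), 1.0 / len(CENTERS))
    for _ in range(iters):
        dens = norm.pdf(guesses[:, None], loc=CENTERS, scale=sigma) * w
        resp = dens / dens.sum(axis=1, keepdims=True)  # E-step
        w = resp.mean(axis=0)                          # M-step
    return w, resp

guesses = np.array([49.0, 34.0, 33.5, 21.0, 23.0, 50.0, 22.5, 35.0])
weights, resp = em_mixture(guesses)
print("class shares:", weights.round(2))
print("most likely level per subject:", resp.argmax(axis=1))
```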
Article
Full-text available
Normative decision theory proves inadequate for modeling human responses to the social-engineering campaigns of advanced persistent threat (APT) attacks. Behavioral decision theory fares better, but still falls short of capturing social-engineering attack vectors which operate through emotions and peripheral-route persuasion. We introduce a generalized decision theory, under which any decision will be made according to one of multiple coexisting choice criteria. We denote the set of possible choice criteria by $\mathcal{C}$. Thus, the proposed model reduces to conventional Expected Utility theory when $|\mathcal{C}_{\text{EU}}|=1$, while Dual-Process (thinking fast vs. thinking slow) decision making corresponds to a model with $|\mathcal{C}_{\text{DP}}|=2$. We consider a more general case with $|\mathcal{C}|\ge 2$, which necessitates careful consideration of how, for a particular choice-task instance, one criterion comes to prevail over others. We operationalize this with a probability distribution that is conditional upon traits of the decision maker as well as upon the context and the framing of choice options. Whereas existing signal detection theory (SDT) models of phishing detection commingle the different peripheral-route persuasion pathways, in the present descriptive generalization the different pathways are explicitly identified and represented. A number of implications follow immediately from this formulation, ranging from the conditional nature of security-breach risk to delineation of the prerequisites for valid tests of security training. Moreover, the model explains the "stepping-stone" penetration pattern of APT attacks, which has confounded modeling approaches based on normative rationality.
... Various experimental studies (Nagel, 1995; Duffy and Nagel, 1997) have shown that while outcomes in opening rounds were quite divergent from the Nash equilibrium, subsequent rounds showed outcomes much closer to the predictions of the theory. Learning models in which the agents acquire understanding of the game and the strategy environment have been popular for explaining this convergence to equilibrium (Stahl, 1996; Weber, 2003) without relying on implausibly high levels of reasoning. Subjects were often found to exhibit depths of reasoning generally between order 0 and 2 (Duffy and Nagel, 1997). ...
Preprint
In Keynesian Beauty Contests, notably modeled by p-guessing games, players try to guess the average of guesses multiplied by p. Convergence of plays to the Nash equilibrium has often been justified by agents' learning. However, questions remain about the origin of reasoning types and equilibrium behavior when learning takes place in unstable environments. When successive values of p can lie above and below 1, boundedly rational agents may learn about their environment through simplified representations of the game, reasoning with analogies and constructing expectations about the behavior of other players. We introduce an evolutionary process of learning to investigate the dynamics of learning and the resulting optimal strategies in unstable p-guessing game environments with analogy partitions. As a validation of the approach, we first show that our genetic algorithm behaves consistently with previous results in persistent environments, converging to the Nash equilibrium. We characterize strategic behavior in mixed regimes with unstable values of p. Varying the number of iterations given to the genetic algorithm to learn about the game replicates the behavior of agents with different levels of reasoning in the level-k approach. This evolutionary process hence provides a learning foundation for endogenizing the existence of, and transitions between, levels of reasoning in cognitive hierarchy models.
... Various experimental studies (Nagel, 1995; Duffy and Nagel, 1997) have shown that while outcomes in opening rounds were quite divergent from the Nash equilibrium, subsequent rounds showed outcomes much closer to the predictions of the theory. Learning models in which the agents acquire understanding of the game and the strategy environment have been popular for explaining this convergence to equilibrium (Stahl, 1996; Weber, 2003) without relying on implausibly high levels of reasoning. Subjects were often found to exhibit depths of reasoning generally between order 0 and 2 (Duffy and Nagel, 1997). ...
Preprint
Full-text available
In Keynesian Beauty Contests, notably modeled by p-guessing games, players try to guess the average of guesses multiplied by p. Theoretical and experimental research in the spirit of level-k models has characterized the behavior of agents with different levels of reasoning when p is persistently above or below 1. Convergence of plays to the Nash equilibrium has often been justified by agents' learning. However, questions remain about the origin of reasoning types and equilibrium behavior when learning takes place in unstable environments. When successive values of p can lie above and below 1, boundedly rational agents may learn about their environment through simplified representations of the game, reasoning with analogies and constructing expectations about the behavior of other players. We introduce an evolutionary process of learning to investigate the dynamics of learning and the resulting optimal strategies in unstable p-guessing game environments with analogy partitions. As a validation of the approach, we first show that our genetic algorithm behaves consistently with previous results in persistent environments, converging to the Nash equilibrium. We characterize strategic behavior in mixed regimes with unstable values of p. Varying the number of iterations given to the genetic algorithm to learn about the game replicates the behavior of agents with different levels of reasoning in the level-k approach. This evolutionary process hence provides a learning foundation for endogenizing the existence of, and transitions between, levels of reasoning in cognitive hierarchy models.
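The core of such an evolutionary process can be sketched compactly for a persistent p < 1 regime: strategies are numbers, fitness is closeness to the winning number, and tournament selection plus Gaussian mutation drives the population toward the Nash equilibrium at zero. The sketch omits the paper's analogy partitions and unstable-p regimes; all sizes and rates are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

POP, GENS, P = 40, 200, 2 / 3  # illustrative sizes; p fixed below 1 here

def fitness(pop):
    # a strategy is a number in [0, 100]; payoff falls with distance
    # from the winning number p * mean(pop)
    target = P * pop.mean()
    return -np.abs(pop - target)

pop = rng.uniform(0, 100, POP)
for g in range(GENS):
    fit = fitness(pop)
    # tournament selection: the fitter of two random strategies reproduces
    i, j = rng.integers(POP, size=(2, POP))
    parents = np.where(fit[i] > fit[j], pop[i], pop[j])
    # mutation: small Gaussian perturbation, clipped to the choice space
    pop = np.clip(parents + rng.normal(0, 1.0, POP), 0, 100)

print("mean strategy after %d generations: %.3f" % (GENS, pop.mean()))
```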
... 30 Second, they are based on backward-looking heuristics (with the exception of the fitted rules and rule 14), so that their functional forms are easy to compute (e.g., rule 1). Finally, we have excluded level-k types of expectations in the vein of the rule learning model of Stahl (1996) and the cognitive hierarchy model of Camerer et al. (2004), since there is no common prior through which a level-0 type can form imitation after the first period. 31 The goodness of fit of a given rule to the experimental data is based on the aggregate one-period-ahead forecast error, computed as the root mean squared error (RMSE): ...
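The excerpt is cut off just before the formula; the standard definition it presumably refers to, for one-period-ahead forecasts of realizations over T periods, is:

\[
\mathrm{RMSE} = \sqrt{\frac{1}{T}\sum_{t=1}^{T}\bigl(\hat{x}_{t+1\mid t}-x_{t+1}\bigr)^{2}}
\]

where \(\hat{x}_{t+1\mid t}\) is the rule's forecast of \(x_{t+1}\) made at time \(t\).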
Article
Experimental evidence shows that the rational expectations hypothesis fails to characterize the path to equilibrium after an exogenous shock when actions are strategic complements. Under identical shocks, however, repetition allows adaptive learning, so that inertia in adjustment should fade away with experience. If this finding proves to be robust, inertia in adjustment may be irrelevant among experienced agents. The conjecture in the literature is that inertia would still persist, perhaps indefinitely, in the presence of real-world complications such as nonidentical shocks. Herein, we empirically test the conjecture that the inertia in adjustment is more persistent if the shocks are nonidentical. For both identical and nonidentical shocks, we find persistent inertia and similar patterns of adjustment that can be explained by backward-looking expectation rules. Notably, refining these rules with similarity-based learning approach improves their predictive power.
... games (such as deep reinforcement learning). I introduced the concept of rule learning in static normal-form games (Stahl, 1996, 2000; Stahl and Haruvy, 2012). Unlike formal game theory, instead of seeking a model of behavior common to all players and consistent with rationality, archetypal models of other players define a space of models, and reinforcement comes from the payoffs experienced (and counterfactually simulated). ...
Preprint
Full-text available
Most important economic decision problems are sequential, and thus naturally represented as Markov Decision Problems (MDP). After reviewing the theory of MDPs, the applicability of MDPs to real-life sequential decisions appears impractical. The central question addressed in this essay is how ordinary humans behave in the real-life sequential decision problems they face. A formal behavioral approach is presented, and it also appears impractical for all but toy MDPs. After engaging in introspection, a key insight is the enormous extent to which information and reinforcement provided by parents, teachers and others shapes our behavior in real-life. With these insights, an integrated framework of dynamic behavior emerges in which genetic evolution, short-term reinforcement learning and long-term acquisition of knowledge via institutions are seen as important aspects.
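For readers who want the reviewed object in executable form, here is a toy MDP solved by value iteration; every number below is invented for illustration:

```python
import numpy as np

# a toy MDP: 2 states, 2 actions; P[a][s, s'] transition probabilities,
# R[s, a] immediate rewards (all values illustrative)
P = np.array([[[0.9, 0.1],    # action 0
               [0.5, 0.5]],
              [[0.2, 0.8],    # action 1
               [0.1, 0.9]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
GAMMA = 0.95

V = np.zeros(2)
for _ in range(500):  # value iteration to (approximately) a fixed point
    Q = R + GAMMA * np.einsum("ast,t->sa", P, V)  # Q[s, a]
    V = Q.max(axis=1)                             # greedy Bellman backup

print("optimal values:", V.round(3), "policy:", Q.argmax(axis=1))
```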
... (30) FM models are the technique of choice for analyzing Beauty-Contest data, revealing that virtually all 'nontheorist' subjects 12 (i.e., those who are not professional game theorists; 94%) fall into one of three boundedly rational depth-of-reasoning classes (levels 0, 1 or 2). (31,32) FM models are being applied increasingly in empirical game theory, including the analysis of e.g. trust-game data, social-preferences data, and common-pool-resource data, demonstrating the broad applicability of a multiple-criteria approach. ...
Preprint
Full-text available
Normative decision theory proves inadequate for modeling human responses to the social-engineering campaigns of Advanced Persistent Threat (APT) attacks. Behavioral decision theory fares better, but still falls short of capturing social-engineering attack vectors, which operate through emotions and peripheral-route persuasion. We introduce a generalized decision theory, under which any decision will be made according to one of multiple coexisting choice criteria. We denote the set of possible choice criteria by C. Thus the proposed model reduces to conventional Expected Utility theory when | C_{EU} | = 1, whilst Dual-Process (thinking fast vs. thinking slow) decision making corresponds to a model with | C_{DP} | = 2. We consider a more general case with | C | ≥ 2, which necessitates careful consideration of how, for a particular choice-task instance, one criterion comes to prevail over others. We operationalize this with a probability distribution that is conditional upon traits of the decision maker as well as upon the context and the framing of choice options. Whereas existing Signal Detection Theory (SDT) models of phishing detection commingle the different peripheral-route persuasion pathways, in the present descriptive generalization the different pathways are explicitly identified and represented. A number of implications follow immediately from this formulation, ranging from the conditional nature of security-breach risk to delineation of the prerequisites for valid tests of security training. Moreover, the model explains the 'stepping-stone' penetration pattern of APT attacks, which has confounded modeling approaches based on normative rationality.
... Such winning rules would allow for comparisons of hierarchical thinking models along the lines proposed by Nagel (1995) and Stahl (1996) for the guessing game. Finally, instead of forming a belief and best responding to it, bidders may be behaving according to ambiguity aversion (also called probabilistic risk-aversion or uncertainty aversion) (see Camerer & Weber, 1992), which has been shown in first-price sealed-bid auctions by Salo and Weber (1995). ...
... As a result, even if every subject understands that the target should be zero, there is self-fulfilling slow convergence. Using Nagel's (1995) data, Stahl (1996) combines her level-k model with a "law of effect" learning model: Agents start with a propensity to use a certain level of reasoning, and over time the players learn how the available rules perform and switch to better performing rules. He rejects Bayesian rule learning in favor of this level-k model. ...
Chapter
We introduce a generalization of the Beauty Contest (BC) game as a framework that incorporates different models from micro- and macroeconomics by formulating their reduced forms as special cases of the BC. Examples include public good games, ultimatum games, Bertrand, Cournot, some auctions, asset markets, New-Keynesian, and general equilibrium models with sentiments/animal spirits. This becomes feasible by considering BC specifications with a best response or optimal action as a function of other players' aggregated actions. For characterizing an integrated account of heterogeneity in economics, as observed in BC experiments, we employ a non-equilibrium model, the so-called “level-k” model, based on one (or more) reference point(s) and (limited) iterated best responses. Level-k and related models thus bridge the gap between non-strategic (e.g. irrational, intuitive or random) behavior and equilibrium choices. We also give a brief overview of interactive decision-making within experimental economics, and discuss elicitation methods, cognitive and population measures, to better understand heterogeneity in human reasoning in general, and in economic experiments in particular.
... Furthermore, subjects seem to perform only a few steps of reasoning (about 2) and have heterogeneous guessing abilities. Several models have been proposed to capture individual differences in guessing ability or to stress popular reasons for choosing a particular number (see Stahl, 1996, 1998; Camerer et al., 2004; Güth et al., 2002). 2 Some other recent experimental findings about price guessing games (Fehr and Tyran, 2008) demonstrated that the strategic environment matters. In their paper, the authors examined this question in the context of the adjustment of nominal prices after an anticipated (exogenous) monetary shock, and showed that when agents' actions are strategic substitutes, adjustment to the new equilibrium is extremely quick, whereas under strategic complementarity, adjustment is both very slow and associated with relatively large real effects. ...
Article
Full-text available
We investigate experimentally a new variant of the beauty contest game (BCG) in which players' actions are strategic substitutes (a negative feedback BCG). Our results show that chosen numbers are closer to the rational expectation equilibrium than in a strategic complements environment (a positive feedback BCG). We also find that the estimated average depth of reasoning from the cognitive hierarchy model does not differ between the two environments. We show that the difference may be attributed to the fact that additional information is more valuable when players' actions are strategic substitutes rather than strategic complements, in line with other recent experimental findings.
Article
How do CEOs and academics differ in how they view academic research? We survey a sample of CEOs and business-school academics to measure their views on each other and on academic research. We explore differences between these two groups with an experimental beauty contest game (EBC) and by asking how much they value and trust different business disciplines, data types, and academic methodologies. We observe, in the EBC, that both CEOs and academics hold similarly lower expectations about the reasoning abilities of CEOs than about those of academics. While CEOs and academics both tend to trust company-specific data and simpler, more scientific methodologies, the groups differ in the value they place on disciplines that address CEOs' duties and on business-specific methodologies. Together, our results shed new light on the disconnect between academic research and practitioners and indicate areas where that gap can be narrowed.
Article
Keynes’ beauty contest story (Keynes, 1936, p. 156) is a metaphor illuminating how speculators’ decisions on stock markets are guided by expectations of expectations. Keynes’ idea has been implemented in laboratory experiments since the early nineties using a guessing game design with a contracting factor usually smaller than 1. That means that 1 is the unique Nash equilibrium if the choice space is given by the natural numbers from 1 to 100. In laboratory experiments it has been found that some participants learn to discover the theoretical equilibrium but, having discovered it, will not necessarily choose it, taking into account the expected boundedly rational behavior of the other participants. The theoretical equilibrium, however, plays the role of an anchor for those participants’ decisions. In our present study we pose the question of what will happen in a guessing game if there is not a unique equilibrium, but a multiplicity of equilibria. Furthermore, we introduce (a)symmetric strategy spaces, with numbers chosen around a center of zero, or in the negative or positive interval. In our experimental setting, each integer of the choice set is an equilibrium of the guessing game if the contracting factor is chosen equal to 1. Since real stock market speculators also do not have a unique equilibrium anchor when deciding on selling or buying an asset, our procedure is in line with Keynes’ original idea. Game theory suggests that participants might use mixed strategies, leading to a stochastic sequence of choices in iterated experiments. In contrast, our experimental evidence shows recurrent patterns in the time series of iterated choices. The aim of our study is to develop and test hypotheses explaining this evidence.
Article
We leverage response-time data from repeated strategic interactions to measure the strategic complexity of a situation by how long people think on average when they face that situation (where we categorize situations according to characteristics of play in the previous round). We find that strategic complexity varies significantly across situations, and we find considerable heterogeneity in how responsive subjects’ thinking times are to complexity. We also study how variation in response times at the individual level affects success: when a subject thinks for longer than she would normally do in a particular situation, she wins less frequently and earns less.
Article
Full-text available
The cognitive hierarchy (CH) approach posits that players in a game are heterogeneous with respect to levels of strategic sophistication. A level-k player believes all other players in the game have lower levels of sophistication distributed from 0 to k-1, and these beliefs correspond to the truncated distribution of a "true" distribution of levels. We extend the CH framework to extensive form games, where these initial beliefs over lower levels are updated as the history of play in the game unfolds, providing information to players about other players' levels of sophistication. For a class of centipede games with a linearly increasing pie, we fully characterize the dynamic CH solution and show that it leads to the game terminating earlier than in the static CH solution for the centipede game in reduced normal form.
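The static belief structure at the heart of CH is a two-liner; the dynamic extension then updates these beliefs as the history of play reveals information. The Poisson(1.5) "true" distribution below is the conventional illustrative choice, not the paper's estimate:

```python
import math
import numpy as np

def truncated_beliefs(true_dist, k):
    # a level-k player believes all opponents reason at levels 0..k-1, with
    # weights proportional to the "true" level distribution truncated at k
    w = np.asarray(true_dist[:k], dtype=float)
    return w / w.sum()

# illustrative Poisson(1.5) distribution over levels, a common CH choice
tau = 1.5
true_dist = [math.exp(-tau) * tau ** k / math.factorial(k) for k in range(6)]

print(truncated_beliefs(true_dist, k=3))  # a level-3 player's beliefs over 0..2
```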
Article
We propose a Bayesian cognitive hierarchy (BCH) model with fixed reasoning levels for two-person normal-form games. The model extends the previous static version of the cognitive hierarchy model to dynamic environments and combines the cognitive hierarchy model with one of the most advanced adaptive learning models. We estimate the proposed model and other models with five datasets of two-person repeated normal-form games. The results indicate that the fixed-level BCH model can reasonably capture changes in the sophistication of behavior over time. Compared with the adaptive learning model, introducing reasoning can significantly improve the interpretation of data. We further decompose the BCH model to investigate the effect of each modeling component and find that, in different games, players rely on different decision-making processes of learning and reasoning.
Article
A classic issue in behavioral economics is the extent to which agents who make systematic mistakes have large effects on market outcomes. One perspective is that agents who make systematic mistakes have large effects on outcomes in settings characterized by strategic complementarity, but not in settings characterized by strategic substitutability. In this paper, we extend the experimental approach based on this perspective found in Cooper, Schneider, and Waldman (2017) concerning beauty contest experiments to the pricing game initially investigated in Fehr and Tyran (2008). Our main results are as follows: i) given strategic complementarity and multiple identical shocks, convergence to equilibrium play after an initial shock and the initial subsequent shocks is not immediate even though the shocks are identical; ii) the periodic introduction of inexperienced players given strategic complementarity slows down speed of convergence to equilibrium play; and iii) behavior in the pricing game given strategic complementarity shows faster post-shock convergence after later shocks than we found in our earlier paper for the beauty contest. In addition to showing these results, we discuss what the two papers suggest concerning how to model settings characterized by agents who vary in terms of their abilities to process information and form expectations.
Preprint
Full-text available
This paper theoretically and experimentally investigates the behavior of asymmetric players in guessing games. The asymmetry is created by introducing k > 1 replicas of one of the players. Two-player and restricted N-player cases are examined in detail. Based on the model parameters, the equilibrium is either unique in which all players choose zero or mixed in which the weak player (k = 1) imitates the strong player (k > 1). A series of experiments involving two and three-player repeated guessing games with unique equilibrium is conducted. We find that equilibrium behavior is observed less frequently and overall choices are farther from the equilibrium in two-player asymmetric games in contrast to symmetric games, but this is not the case in three-player games. Convergence towards equilibrium exists in all cases but asymmetry slows down the speed of convergence to the equilibrium in two, but not in three-player games. Furthermore, the strong players have a slight earning advantage over the weak players, and asymmetry increases discrepancy in choices (defined as the squared distance of choices from the winning number) in both games.
Article
In standard models of iterative thinking, players choose a fixed rule level from a fixed rule hierarchy. Nonequilibrium behavior emerges when players do not perform enough thinking steps. Existing approaches, however, are inherently static. This paper introduces a Bayesian level-k model, in which level-0 players adjust their actions in response to historical game play, whereas higher-level thinkers update their beliefs on opponents’ rule levels and best respond with different rule levels over time. As a consequence, players choose a dynamic rule level (i.e., sophisticated learning) from a varying rule hierarchy (i.e., adaptive learning). We apply our model to existing experimental data on three distinct games: the p-beauty contest, Cournot oligopoly, and private-value auction. We find that both types of learning are significant in p-beauty contest games, but only adaptive learning is significant in the Cournot oligopoly, and only sophisticated learning is significant in the private-value auction. We conclude that it is useful to have a unified framework that incorporates both types of learning to explain dynamic choice behavior across different settings. This paper was accepted by Manel Baucells, decision analysis.
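The core updating step in such a model is ordinary Bayes' rule applied to opponents' rule levels. A minimal sketch, where the prior and the likelihood numbers are placeholders rather than the paper's estimated specification:

```python
def update_level_beliefs(prior, likelihoods):
    """One round of Bayesian updating over an opponent's rule level.
    prior[k]      : current belief that the opponent uses level k
    likelihoods[k]: probability of the observed action under level k"""
    posterior = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(posterior)
    return [q / total for q in posterior]

# Example with three candidate levels: the observed action is most
# consistent with level-1 play, so belief shifts toward level 1.
print(update_level_beliefs([0.4, 0.4, 0.2], [0.1, 0.6, 0.3]))
```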
Article
We axiomatically characterise a class of updating rules in the contexts of: (a) consumer’s equilibrium; and (b) random choice by a decision maker. The updating rule captures the effect of the absence of a product on the expenditure-share function in (a) and the effect of the removal of an alternative on the choice probabilities of the remaining alternatives in (b). The class of updating rules, called the hemi-Bayesian random choice rule, can be described as follows: the expenditure share or probability weight of the alternative that is removed is distributed proportionately to the alternatives belonging to its lower contour set according to some linear order. We show that this class of rules is the same as the random consideration set rule of Manzini and Mariotti (2014).
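The redistribution step has a direct computational reading. The following sketch is one natural interpretation, not the authors' formal definition: the removed alternative's weight is split among the alternatives below it in the linear order, in proportion to their current weights (it assumes that lower contour set is nonempty).

```python
def hemi_bayesian_update(probs, order, removed):
    """Redistribute a removed alternative's probability weight among
    the alternatives ranked below it (its lower contour set),
    proportionally to their current weights. One reading of the rule;
    assumes the lower contour set is nonempty."""
    rank = {a: i for i, a in enumerate(order)}   # order: best to worst
    lower = [a for a in probs if a != removed and rank[a] > rank[removed]]
    mass, base = probs[removed], sum(probs[a] for a in lower)
    new = {a: p for a, p in probs.items() if a != removed}
    for a in lower:
        new[a] += mass * probs[a] / base
    return new

# Example: removing 'b' sends its weight to 'c' and 'd', which sit
# below 'b' in the linear order.
print(hemi_bayesian_update({'a': 0.4, 'b': 0.3, 'c': 0.2, 'd': 0.1},
                           ['a', 'b', 'c', 'd'], 'b'))
```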
Article
We develop a graphical, non-analytical version of the two-person beauty contest game to study the developmental trajectory of instinctive behavior and learning from kindergarten to adulthood. These are captured by observing behavior when the game is played in two consecutive trials. We find that equilibrium behavior in the first trial increases significantly between 5 and 10 years of age (from 17.9% to 61.4%) and stabilizes afterwards. Children of all ages learn to play the equilibrium, especially when they observe an equilibrium choice by the rival. Our younger children are the weakest learners mainly because they are less frequently paired with rivals who play at equilibrium. Finally, the choice process data suggests that participants who play at equilibrium in the second trial are also performing fewer steps before reaching a decision, indicating that they are less hesitant about their strategy.
Article
Full-text available
http://www.economics-ejournal.org/economics/discussionpapers/2019-53 The goal of this paper is to show how adding behavioral components to micro-founded models of macroeconomics may contribute to a better understanding of real-world phenomena. The authors introduce the reader to variations of the Keynesian Beauty Contest (Keynes, The General Theory of Employment, Interest, and Money, 1936), theoretically and experimentally, with a descriptive model of behavior. They bridge the discrepancies between (benchmark) solution concepts and bounded rationality through step-level reasoning, the so-called level-k or cognitive hierarchy models. These models have recently been used as building blocks for new behavioral macro theories to understand puzzles like the missing rise of inflation after the financial crisis, the effectiveness of quantitative easing, the forward guidance puzzle, and the effectiveness of temporary fiscal expansion.
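For the canonical p-beauty contest, these step-level models reduce to simple arithmetic: if level 0 guesses uniformly on [0, 100] (mean 50) and the target is p times the average guess, then level k guesses 50·p^k. A minimal illustration with the common choice p = 2/3:

```python
def level_k_guess(k: int, p: float = 2 / 3, level0_mean: float = 50.0) -> float:
    """Level-k guess in a p-beauty contest on [0, 100]: level 0 guesses
    at random (mean 50); each higher level best responds to the level
    below it, multiplying by p."""
    return level0_mean * p ** k

for k in range(5):
    print(k, round(level_k_guess(k), 2))
# 0 -> 50.0, 1 -> 33.33, 2 -> 22.22, 3 -> 14.81, 4 -> 9.88
# (the Nash equilibrium guess is 0)
```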
Article
Protocol analysis, in the form of concurrent verbal ‘thinking aloud’ reports, is a method of collecting and analyzing data about cognitive processes. This approach can help economists in evaluating competing theories of behavior and in categorizing heterogeneity of thinking patterns. As a proof of concept, I tested this method in the context of a guessing game. I found that concurrent think-aloud protocols can inform us about individuals’ thought processes without affecting decisions. The method allowed me to identify game-theoretic thinking and heterogeneous approaches to unravelling the guessing game. The think-aloud protocol is inexpensive and scalable, and it is a useful tool for identifying empirical regularities regarding decision processes.
Article
This review discusses selected work in experimental game theory. My goals are to further the dialogue between theorists and empiricists that has driven progress in economics and game theory and to guide future experimental work. I focus on experiments whose lessons are relevant to establishing and maintaining coordination and cooperation in human relationships, the role of communication in doing so, and the underlying cognition. These are questions of central importance, where both the gap between theory and experience and the role of experiments in closing it seem large. Humans appear to be unique in their ability to use language to manipulate and communicate mental models of the world and of other people, vital skills in relationships. Continuing the dialogue between theorists and empiricists should help to explain why it matters for cooperation that we can communicate, and why and how it matters whether we communicate via natural language or abstract signals.
Article
This chapter surveys important behavioral and experimental results on strategic interactions from three literatures: game theory, principal–agent theory, and bargaining theory. In behavioral game theory, Section 7.1 discusses tests of three foundational assumptions of equilibrium: correct beliefs, best response, and strategic sophistication. I also discuss play in games with multiple equilibria, such as coordination games and repeated games. For principal–agent theory, I discuss experimental results on individuals’ responses to financial and nonfinancial incentives. I also discuss behavioral factors such as reciprocity, intrinsic motivation, and status. The survey of bargaining theory discusses both free‐form and structured bargaining, as well as prominent psychological biases. For all three literatures, I also discuss operational applications.
Conference Paper
The paper examines an experimental “p-beauty contest” guessing game. The objective of the study is to conduct an experimental study of simultaneous decision-making by subjects within various groups in the “p-beauty contest” guessing game and to estimate the influence of various factors. The interaction of factors was evaluated. The contribution of this study is to extend the analysis of simultaneous decision-making by individuals within various groups to the conditions of the “p-beauty contest” game. The subjects took decisions simultaneously, both as part of a group of three subjects and as part of a group of six subjects. The results from the experiment showed that the subjects make more rational decisions when in the larger group. The four-factor (p-value, group size, period, and number of subjects) experimental design shows that the main effects of the factors “p-value” and “Number of subjects” were significant. Further, the “p-value” by “Group size,” “Group size” by “Period,” “Group size” by “Number of subjects,” and “Period” by “Number of subjects” interactions were also significant.
Working Paper
Full-text available
This paper models the scenario where players in a finite perfect-information game can have heterogeneous foresight-levels, and the players are uncertain about their opponents’ foresight-levels. Foresight-level is defined as the number of subsequent stages that a player can observe/understand from a given move. We define the Limited Foresight Equilibrium (LFE), which provides an assessment for this model and makes outcome predictions given a distribution over players’ foresight-levels. We show the existence of LFE. In LFE, within a single play of the game, as play proceeds, the perception of the game changes for players with limited-foresight; they update beliefs about opponents’ foresights and adjust strategies to maximize payoff within their foresight bound. The LFE beliefs of players with higher foresight-level are consistent regarding more opponent-types. Players with given foresight-levels cannot distinguish among different higher-foresight types of a player, but if they observe actions that are impossible according to lower-foresight opponents’ LFE strategies, then they discover that some opponent has higher-foresight. LFE strategies take reputation effects into account. In applications, LFE is shown to rationalize experimental findings on Sequential Bargaining and the Centipede game. We discuss experimental findings from Rampal (2018) corroborating LFE’s predictions using race games.
Chapter
This chapter discusses specialized models of the cognitive processes involved in strategic thinking. It describes strategic thinking as the choice of strategies in mathematical games. Game theory specifies how players choose high-value strategies by guessing the likely choices of other players and acting on these guesses. Realistically, players may fail to correctly guess what others will do. This naturally limited thinking process is the main focus of this chapter. The chapter discusses one part of behavioral game theory: a cognitive hierarchy (CH) model of the limits on what players infer about what other players will do. It also presents some motivating empirical examples of the wide scope of games to which the theory has been applied with some success, including field data, and consistency with data on functional magnetic resonance imaging (fMRI) and visual fixations measured using both Mouselab and eye tracking.
Article
Full-text available
Experimental games typically involve subjects playing the same game a number of times. In the absence of perfect rationality by all players, the subjects may use the behavior of their opponents in early rounds to learn about the extent of irrationality in the population they face. This makes the problem of finding the Bayes-Nash equilibrium of the experimental game much more complicated than finding the game-theoretic solution to the ideal game without irrationality. We propose and implement a computationally intensive algorithm for finding the equilibria of complicated games with irrationality via the minimization of an appropriate multivariate function. We propose two hypotheses about how agents learn when playing experimental games. The first posits that they tend to learn about each opponent as they play that opponent repeatedly, but do not learn about the population parameters through their observations of random opponents (myopic learning). The second posits that both types of learning take place (sequential learning). We introduce a computationally intensive sequential procedure to decide on the informational value of conducting additional experiments. With the help of that procedure, we decided after 12 experiments that our original model of irrationality was unsatisfactory for the purpose of discriminating between our two hypotheses. We changed our models, allowing for two different types of irrationality, reanalyzed the old data, and conducted 7 more experiments. The new model successfully discriminated between our two hypotheses about learning. After only 7 more experiments, our approximately optimal stopping rule led us to stop sampling and accept the model where both types of learning occur.
Article
Full-text available
We provide an overview of the methods of analysis and results obtained, and, most important, an assessment of the success of rational learning dynamics in tying down limit beliefs and limit behavior. We illustrate the features common to rational or Bayesian learning in single-agent, game-theoretic, and equilibrium frameworks. We show that rational learning is possible in each of these environments. The issue is not whether rational learning can occur, but what results it produces. If we assume a natural, complex parameterization of the choice environment, all we know is that the rational learner believes his posteriors will converge somewhere with prior probability one. Alternatively, if we, the modelers, assume the simple parameterization of the choice environment that is necessary to obtain positive results, we are closing our models in the ad hoc fashion that rational learning was introduced to avoid. We believe that a partial resolution of this conundrum is to pay more attention to how learning interacts with other dynamic forces. We show that in a simple economy, the forces of market selection can yield convergence to rational expectations equilibria even without every agent behaving as a rational learner.
Article
Full-text available
This lecture is divided into five parts. First we discuss the evolutionary approach to optimization – and specifically to game theory – and some of its implications for the idea of bounded rationality, such as the development of truly dynamic theories of games, and the idea of “rule rationality” (as opposed to “act rationality”). Next comes the area of “trembles,” including equilibrium refinements, “crazy” perturbations, failure of common knowledge of rationality, the limiting average payoff in infinitely repeated games as an expression of bounded rationality, ε-equilibria, and related topics. Section 3 deals with players who are modeled as computers (finite state automata, Turing machines), which has now become perhaps the most active area in the field. In Section 4 we discuss the work on the foundations of decision theory that deals with various paradoxes (such as Allais (1953) and Ellsberg (1961)), and with results of laboratory experiments, by relaxing various of the postulates and so coming up with a weaker theory. Section 5 is devoted to one or two open problems. Most of this lecture is set in the framework of non-cooperative game theory, because most of the work has been in this framework. Game theory is indeed particularly appropriate for discussing fundamental ideas in this area, because it is relatively free from special institutional features. The basic ideas are probably applicable to economic contexts that are not game-theoretic (if there are any).
Article
Full-text available
A model of the process by which players learn to play repeated coordination games is proposed with the goal of understanding the results of recent experiments. In those experiments, the dynamics of subjects' strategy choices and the resulting patterns of discrimination among equilibria varied systematically with the rule for determining payoffs and the size of the interacting groups in ways that are not adequately explained by available methods of analysis. The model suggests a possible explanation by showing how the dispersion of subjects' beliefs interacts with the learning process to determine the probability distribution of its dynamics and limiting outcome.
Article
Full-text available
The authors study the steady states of a system in which players learn about the strategies their opponents are playing by updating their Bayesian priors in light of their observations. Players are matched at random to play a fixed extensive-form game, and each player observes the realized actions in his own matches but not the intended off-path play of his opponents or the realized actions in other matches. Because players are assumed to live finite lives, there are steady states in which learning continually takes place. If lifetimes are long and players are very patient, the steady-state distribution of actions approximates those of a Nash equilibrium.
Article
Full-text available
Subjective utility maximizers, in an infinitely repeated game, will learn to predict opponents' future strategies and will converge to play according to a Nash equilibrium of the repeated game. Players' initial uncertainty is placed directly on opponents' strategies, and the above result is obtained under the assumption that the individual beliefs are compatible with the chosen strategies. An immediate corollary is that, when playing a Harsanyi-Nash equilibrium of a repeated game of incomplete information about opponents' payoff matrices, players will eventually play a Nash equilibrium of the real game, as if they had complete information.
Article
We apply a sequential Bayesian sampling procedure to study two models of learning in repeated games. In the first model individuals learn only about an opponent when they play her or him repeatedly but do not update from their experience with that opponent when they move on to play the same game with other opponents. We label this the nonsequential model. In the second model individuals use Bayesian updating to learn about population parameters from each of their opponents, as well as learning about the idiosyncrasies of that particular opponent. We call this the sequential model. We sequentially sample observations on the behavior of experimental subjects in the so-called "centipede game." This game allows for a trade-off between competition and cooperation, which is of interest in many economic situations. At each point in time, the "state" of our dynamic problem consists of our beliefs about the two models and beliefs about the nuisance parameters of the two models. Our "choice" set is to sample or not to sample one more data point and, if we should not sample, which of the models to select. After 19 matches (4 subjects per match), we stop and reject the nonsequential model in favor of the sequential model.
Article
A learning process for 2-person games in normal form is introduced. The game is assumed to be played repeatedly by two large populations, one for player 1 and one for player 2. Every individual plays against changing opponents in the other population. Mixed strategies are adapted to experience. The process evolves in discrete time. All individuals in the same population play the same mixed strategy. The mixed strategies played in one period are publicly known in the next period. The payoff matrices of both players are publicly known. In a preliminary version of the model, the individuals increase and decrease probabilities of pure strategies directly in response to payoffs against last period’s observed opponent strategy. In this model, the stationary points are the equilibrium points, but genuinely mixed equilibrium points fail to be locally stable. On the basis of the preliminary model, an anticipatory learning process is defined, in which the individuals first anticipate the opponent strategies according to the preliminary model and then react to these anticipated strategies in the same way as to the observed strategies in the preliminary model. This means that primary learning effects on the other side are anticipated, but not the secondary effects due to anticipations in the opponent population. Local stability of the anticipatory learning process is investigated for regular games, i.e., for games where all equilibrium points are regular. A stability criterion is derived which is necessary and sufficient for sufficiently small adjustment speeds. This criterion requires that the eigenvalues of a matrix derived from both payoff matrices are negative. It is shown that the stability criterion is satisfied for 2×2 games without pure-strategy equilibrium points, for zero-sum games, and for games where one player’s payoff matrix is the unit matrix and the other player's payoff matrix is negative definite. Moreover, the addition of constants to rows or columns of payoff matrices does not change stability. The stability criterion is related to an additive decomposition of payoffs reminiscent of a two-way analysis of variance. Payoffs are decomposed into row effects, column effects, and interaction effects. Intuitively, the stability criterion requires a preponderance of negative covariance between the interaction effects in both players’ payoffs. The anticipatory learning process assumes that the effects of anticipations on the other side remain unanticipated. At least for completely mixed equilibrium points, the stability criterion remains unchanged if anticipations of anticipation effects are introduced.
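To make the contrast between the preliminary and anticipatory processes concrete, here is a minimal numerical sketch. The payoff-proportional adjustment rule is a simple stand-in rather than the paper's exact specification, and the matching-pennies payoffs are illustrative.

```python
import numpy as np

def adjust(p, payoff_matrix, q, speed=0.05):
    """Preliminary process (stand-in rule): raise the probability of
    pure strategies earning above-average payoff against the opponent
    mix q, lower the rest, and renormalize."""
    payoffs = payoff_matrix @ q          # payoff of each pure strategy
    avg = p @ payoffs                    # current expected payoff
    new = np.maximum(p + speed * (payoffs - avg), 0.0)
    return new / new.sum()

def anticipatory(p, q, A, B, speed=0.05):
    """Anticipatory process: first anticipate the opponent's
    preliminary adjustment, then react to the anticipated mix."""
    q_hat = adjust(q, B.T, p, speed)     # anticipated opponent update
    return adjust(p, A, q_hat, speed)

# Matching pennies (zero-sum): mixed equilibrium at (1/2, 1/2)
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
B = -A                                   # B[i, j]: player 2's payoff
p, q = np.array([0.7, 0.3]), np.array([0.4, 0.6])
for _ in range(500):
    p, q = anticipatory(p, q, A, B), anticipatory(q, p, B.T, A.T)
print(p.round(3), q.round(3))            # should drift toward [0.5, 0.5]
```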
Article
This paper presents the detailed statistical modelling of an extensive body of educational research data on teaching styles and pupil performance. Clustering of teachers into distinct teaching styles is carried out using a latent class model, and comparison of these latent classes for differences in pupil achievement is examined using unbalanced variance component ("mixed") models. Differences among the classes are altered by the probabilistic clustering of the latent class model compared to the original findings of the Teaching Styles project, and the statistical significance of the differences is substantially reduced when allowance is made for the correlation among children taught by the same teacher.
Book
Finite mixture distributions arise in a variety of applications ranging from the length distribution of fish to the content of DNA in the nuclei of liver cells. The literature surrounding them is large and goes back to the end of the last century, when Karl Pearson published his well-known paper on estimating the five parameters in a mixture of two normal distributions. In this text we attempt to review this literature and in addition indicate the practical details of fitting such distributions to sample data. Our hope is that the monograph will be useful to statisticians interested in mixture distributions and to research workers in other areas applying such distributions to their data. This monograph is concerned with statistical distributions which can be expressed as superpositions of (usually simpler) component distributions. Such superpositions are termed mixture distributions or compound distributions. For example, the distribution of height in a population of children might be expressed as follows: h(height) = ∫ g(height : age) f(age) d age (1.1), where g(height : age) is the conditional distribution of height on age, and f(age) is the age distribution of the children in the population.
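In the finite case the superposition in (1.1) becomes a weighted sum of component densities, typically fit by maximum likelihood via the EM algorithm. Below is a minimal EM sketch for a two-component normal mixture, in the spirit of the height example (an illustration, not code from the book):

```python
import numpy as np

def em_two_normals(x, iters=300):
    """Fit a two-component normal mixture by EM (minimal sketch).
    Densities are computed up to the common 1/sqrt(2*pi) constant,
    which cancels in the E-step ratio."""
    w = 0.5
    mu = np.array([x.min(), x.max()], dtype=float)
    sd = np.array([x.std(), x.std()])
    for _ in range(iters):
        # E-step: responsibility of component 1 for each observation
        d1 = np.exp(-0.5 * ((x - mu[0]) / sd[0]) ** 2) / sd[0]
        d2 = np.exp(-0.5 * ((x - mu[1]) / sd[1]) ** 2) / sd[1]
        r = w * d1 / (w * d1 + (1 - w) * d2)
        # M-step: re-estimate mixing weight, means, and spreads
        w = r.mean()
        mu = np.array([np.average(x, weights=r),
                       np.average(x, weights=1 - r)])
        sd = np.sqrt([np.average((x - mu[0]) ** 2, weights=r),
                      np.average((x - mu[1]) ** 2, weights=1 - r)])
    return w, mu, sd

# Example in the spirit of the height illustration: two age groups
rng = np.random.default_rng(0)
heights = np.concatenate([rng.normal(110, 6, 500), rng.normal(140, 7, 500)])
print(em_two_normals(heights))
```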
Article
We use a dynamical systems approach to model the origin of bargaining conventions and report the results of a symmetric bargaining game experiment. Our experiment also provides evidence on the psychological salience of symmetry and efficiency. The observed behavior in the experiment was systematic, replicable, and roughly consistent with the dynamical systems approach. For instance, we do observe unequal-division conventions emerging in communities of symmetrically endowed subjects.
Article
This paper studies myopic Bayesian learning processes for finite-player, finite-strategy normal form games. Initially, each player is presumed to know his own payoff function but not the payoff functions of the other players. Assuming that the common prior distribution of payoff functions satisfies independence across players, it is proved that the conditional distributions on strategies converge to a set of Nash equilibria with probability one. Under a further assumption that the prior distributions are sufficiently uniform, convergence to a set of Nash equilibria is proved for every profile of payoff functions, that is, every normal form game.
Article
A steady-state, random-matching game model is proposed in which rules of thumb, which assign strategies to individual games, are the units of choice for individuals. Costs, which reflect complexity or difficulty of use, are associated with the rules independently of the games in which they are employed. A population equilibrium is a distribution of rules across the population of players such that no individual has an incentive to change rules, given the current distribution. Examples illustrate the concept, and existence is demonstrated.
Article
We use simple learning models to track the behavior observed in experiments concerning three extensive form games with similar perfect equilibria. In only two of the games does observed behavior approach the perfect equilibrium as players gain experience. We examine a family of learning models which possess some of the robust properties of learning noted in the psychology literature. The intermediate term predictions of these models track well the observed behavior in all three games, even though the models considered differ in their very long term predictions. We argue that for predicting observed behavior the intermediate term predictions of dynamic learning models may be even more important than their asymptotic properties. Journal of Economic Literature Classification Numbers: C7, C92.
Article
We investigate learning in a probabilistic task, called "medical diagnosis." On each trial, a subject is presented with a stimulus configuration indicating the value of four medical symptoms. The subject responds by guessing which of two diseases is present and is then given feedback about which disease was actually present. The feedback is determined according to fixed conditional probabilities unknown to the subject. We test a normative Bayesian model as well as simple variants of well-known psychological models including the Fuzzy Logical Model of Perception, an Exemplar model, a two-layer Connectionist model and an ALCOVE model. Both the asymptotic predictions of these models (i.e., predictions regarding behavior after it has stabilized and learning is complete) and predictions of trial-by-trial changes in behavior are tested. The models are tested against existing data from Estes et al. (1989, Journal of Experimental Psychology: Learning, Memory, & Cognition, 15, 556–571) and new data from medical diagnosis tasks that include not only asymmetric but also symmetric base rates. Learning was observed in all cases in that subjects tended to match the objective probabilities of the symptom configurations more closely in later trials. All of the descriptive models give a more accurate account of performance than the normative Bayesian model. Relative to a benchmark measure, however, none of these models does an especially good job of characterizing asymptotic performance or the learning process. We suggest that future experiments should address individual performance, rather than group learning curves.
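For reference, the normative Bayesian benchmark for such a task computes the posterior probability of each disease from the symptom configuration. The sketch below treats the four symptoms as conditionally independent, which is a simplifying assumption; all probability values are illustrative.

```python
def posterior_disease(prior_d1, p_given_d1, p_given_d2, symptoms):
    """Normative Bayesian benchmark: P(disease 1 | symptom pattern).
    Treats the four binary symptoms as conditionally independent,
    a simplifying assumption rather than the task's exact structure."""
    like1 = like2 = 1.0
    for present, p1, p2 in zip(symptoms, p_given_d1, p_given_d2):
        like1 *= p1 if present else 1 - p1
        like2 *= p2 if present else 1 - p2
    num = prior_d1 * like1
    return num / (num + (1 - prior_d1) * like2)

# Illustrative asymmetric base rates (75/25) and symptom probabilities
print(posterior_disease(0.75, [0.6, 0.4, 0.8, 0.3],
                        [0.2, 0.7, 0.3, 0.6], [1, 0, 1, 0]))
```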
Article
We make two points about the number, B, of bootstrap simulations needed to construct a percentile-t confidence interval based on an n-sample from a continuous distribution: (i) The bootstrap's reduction of the error of coverage probability, from O(n^{-1/2}) to O(n^{-1}), is available uniformly in B, provided the nominal coverage probability is a multiple of (B + 1)^{-1}. In fact, this improvement is available even if the number of simulations is held fixed as n increases. However, smaller values of B can result in longer confidence intervals. (ii) In a large sample, the simulated statistic values behave like random observations from a continuous distribution, unless B increases faster than any power of the sample size. Only if B increases exponentially quickly with n is there a detectable effect due to the discreteness of the bootstrap statistic.
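Point (i) pins down the practical choice of B: for a 95% interval, pick B so that 0.95 is a multiple of (B + 1)^{-1}, e.g. B = 19, 99, or 999. A minimal percentile-t sketch for the mean (an illustration, assuming NumPy):

```python
import numpy as np

def percentile_t_ci(x, alpha=0.05, B=999, seed=0):
    """Percentile-t bootstrap CI for the mean (minimal sketch).
    B is chosen so nominal coverage 1 - alpha is a multiple of
    1/(B + 1): here 0.95 = 950/1000."""
    rng = np.random.default_rng(seed)
    n = len(x)
    mean, se = x.mean(), x.std(ddof=1) / np.sqrt(n)
    t_stats = np.empty(B)
    for b in range(B):
        xb = rng.choice(x, size=n, replace=True)
        t_stats[b] = (xb.mean() - mean) / (xb.std(ddof=1) / np.sqrt(n))
    lo, hi = np.quantile(t_stats, [alpha / 2, 1 - alpha / 2])
    return mean - hi * se, mean - lo * se

x = np.random.default_rng(42).exponential(size=50)
print(percentile_t_ci(x))
```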
Article
A method is described for the minimization of a function of n variables, which depends on the comparison of function values at the (n + 1) vertices of a general simplex, followed by the replacement of the vertex with the highest value by another point. The simplex adapts itself to the local landscape, and contracts on to the final minimum. The method is shown to be effective and computationally compact. A procedure is given for the estimation of the Hessian matrix in the neighbourhood of the minimum, needed in statistical estimation problems.
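The method remains the standard derivative-free option in numerical libraries. A minimal usage example on the Rosenbrock test function, assuming SciPy's implementation:

```python
import numpy as np
from scipy.optimize import minimize

def rosenbrock(v):
    """Classic banana-shaped test function; minimum at (1, 1)."""
    x, y = v
    return (1 - x) ** 2 + 100 * (y - x ** 2) ** 2

# The simplex method compares function values at the n + 1 vertices
# and replaces the worst vertex; no derivatives are needed.
result = minimize(rosenbrock, x0=np.array([-1.2, 1.0]), method="Nelder-Mead")
print(result.x)  # approximately [1. 1.]
```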
Article
Deductive equilibrium analysis often fails to provide a unique equilibrium solution in many situations of strategic interdependence. Consequently, a theory of equilibrium selection would be a useful complement to the theory of equilibrium points. A salient equilibrium selection principle would allow decisionmakers to implement a mutual best response outcome. This paper uses the experimental method to examine the salience of payoff-dominance, security, and historical precedents in related average opinion games. The systematic and, hence, predictable behavior observed in the experiments suggests that it should be possible to construct an accurate theory of equilibrium selection.
Article
This paper explores the idea of constructing theoretical economic agents that behave like actual human agents and using them in neoclassical economic models. It does this in a repeated-choice setting by postulating "artificial agents" who use a learning algorithm calibrated against human learning data from psychological experiments. The resulting calibrated algorithm appears to replicate human learning behavior to a high degree and reproduces several "stylized facts" of learning. It can, therefore, be used to replace the idealized, perfectly rational agents in appropriate neoclassical models with "calibrated agents" that represent actual human behavior. The paper discusses the possibilities of using the algorithm to represent human learning in normal-form stage games and in more general neoclassical models in economics. It explores the likelihood of convergence to long-run optimality and to Nash behavior, and the "characteristic learning time" implicit in human adaptation in the economy.
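A minimal stand-in for such a calibrated learning algorithm is a law-of-effect rule in which choice probabilities are proportional to accumulated payoffs; the sketch below illustrates that idea and is not the paper's calibrated algorithm.

```python
import random

def law_of_effect_agent(payoff, periods=2000, seed=0):
    """Minimal law-of-effect learner (a stand-in for the paper's
    calibrated algorithm): choice probabilities are proportional to
    accumulated payoff 'strengths', so better-performing actions
    are chosen more often over time."""
    rng = random.Random(seed)
    strengths = [1.0, 1.0]                  # initial propensities
    for _ in range(periods):
        total = strengths[0] + strengths[1]
        a = 0 if rng.random() < strengths[0] / total else 1
        strengths[a] += payoff(a)           # reinforce the chosen action
    return [s / sum(strengths) for s in strengths]

# Repeated choice: action 1 pays 1.0 per play, action 0 pays 0.5,
# so action 1's choice probability should grow toward 1.
print(law_of_effect_agent(lambda a: 1.0 if a == 1 else 0.5))
```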
Article
We study learning processes for finite strategic-form games, in which players use the history of past play to forecast play in the current period. In a generalization of fictitious play, we assume only that players asymptotically choose best responses to the historical frequencies of opponents' past play. This implies that if the stage-game strategies converge, the limit is a Nash equilibrium. In the basic model, play seems unlikely to converge to a mixed-strategy equilibrium, but such convergence is natural when the stage game is perturbed in the manner of Harsanyi's purification theorem. Journal of Economic Literature Classification Number: C72.
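The baseline dynamic is easy to state in code. Below is a minimal fictitious-play sketch (exact best responses each period, rather than the paper's weaker asymptotic condition); the matching-pennies payoffs are illustrative.

```python
import numpy as np

def fictitious_play(A, B, T=2000):
    """Fictitious play sketch: each player best responds to the
    empirical frequency of the opponent's past actions.
    A[i, j], B[i, j]: payoffs to players 1 and 2 at action pair (i, j)."""
    counts1 = np.ones(A.shape[0])  # how often player 1 chose each action
    counts2 = np.ones(A.shape[1])  # how often player 2 chose each action
    for _ in range(T):
        a = int(np.argmax(A @ (counts2 / counts2.sum())))  # BR to 2's history
        b = int(np.argmax((counts1 / counts1.sum()) @ B))  # BR to 1's history
        counts1[a] += 1
        counts2[b] += 1
    return counts1 / counts1.sum(), counts2 / counts2.sum()

# Matching pennies: the empirical frequencies approach the mixed
# equilibrium (1/2, 1/2) even though period-by-period play cycles.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
print(fictitious_play(A, -A))
```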
Article
Corruption in the public sector erodes tax compliance and leads to higher tax evasion. Moreover, corrupt public officials abuse their public power to extort bribes from the private agents. In both types of interaction with the public sector, the private agents are bound to face uncertainty with respect to their disposable incomes. To analyse effects of this uncertainty, a stochastic dynamic growth model with the public sector is examined. It is shown that deterministic excessive red tape and corruption deteriorate the growth potential through income redistribution and public sector inefficiencies. Most importantly, it is demonstrated that the increase in corruption via higher uncertainty exerts adverse effects on capital accumulation, thus leading to lower growth rates.
Article
Two groups, each containing 10 pairs of players playing a finitely repeated matching pennies game, were varied in terms of the information available to any player about the past choices and payoffs of its opponent. The data reveal that the presentation of such information does have a significant effect on the nature of play. For subjects without information about opponents' moves, there is evidence in favor of the hypothesis that past experience with different choices affects current strategy. For fully informed subjects, on the other hand, choices are considerably closer to i.i.d. play. Journal of Economic Literature Classification Numbers: C72, C92.
Article
The authors provide a convergence theory for adaptive learning algorithms useful for the study of learning by economic agents. Their results extend the framework of L. Ljung, previously utilized by A. Marcet and T. J. Sargent and by M. Woodford, by permitting nonlinear laws of motion driven by stochastic processes that may exhibit moderate dependence, such as mixing and mixingale processes. The authors draw on previous work by H. J. Kushner and D. S. Clark to provide readily verifiable and/or interpretable conditions ensuring algorithm convergence, chosen for their suitability in the context of adaptive learning.
Article
The aim of this paper is to present the new theory called “inductive game theory”. A paper published by one of the present authors with A. Matsui discussed some part of inductive game theory in a specific game. Here, we give a more developed discourse of the theory. The paper is written to show one entire picture of the theory: from individual raw experiences and short-term memories to long-term memories, inductive derivation of individual views, classification of such views, decision making or modification of behavior based on a view, and repercussions from the modified play in the objective game. We focus on some clear-cut cases, setting aside many possible variants, but still give a lot of results. In order to show one possible discourse as a whole, we ask how Nash equilibrium emerges from the viewpoint of inductive game theory, and give one answer.
Article
In a class of games including some Cournot and Bertrand games, a sequence of plays converges to the unique Nash equilibrium if and only if the sequence is “consistent with adaptive learning” according to the new definition we propose. In the Arrow-Debreu model with gross substitutes, a sequence of prices converges to the competitive equilibrium if and only if the sequence is consistent with adaptive learning by price-setting market makers for the individual goods. Similar results are obtained for “sophisticated” learning. All the familiar learning algorithms generate play that is consistent with adaptive learning.
Cheung, Y.-W., and Friedman, D. (1994). “Learning in Evolutionary Games: Some Laboratory Results.” Economics Department, University of California, Santa Cruz.
... and since the largest simulated log-likelihood difference was 10.679, the statistic 34.223 has a p-value less than 0.001. Thus, we reject the directional learning hypothesis in favor of our model. ...
Aumann, R. (1986). “Rationality and Bounded Rationality,” Nancy L. Schwartz Lecture. Kellogg School of Management, Northwestern University.
Blume, L., and Easley, D. (1992). “Rational Expectations and Rational Learning.” Department of Economics, Cornell University.
Chen, X., and White, H. (1994). “Nonparametric Adaptive Learning with Feedback.” University of California at San Diego.
... These Monte Carlo results indicate that the 5% critical value of the log-likelihood difference is 13.08, and since the smallest simulated log-likelihood difference was 0.017, the statistic −0.083 has a p-value less than 0.001. Thus, we can reject the hypothesis that K = 4 in favor of the hypothesis that K = 3. ...
Aitkin, M., Anderson, D., and Hinde, J. (1981). “Statistical Modelling of Data on Teaching Styles,” J. Roy. Statist. Soc. A 144, 419–461.
Selten, R. (1991). “Evolution, Learning, and Economic Behavior,” Games Econ. Behav. 3, 3–24.