Electronic copy available at: http://ssrn.com/abstract=1159537
International Journal of Forecasting 18 (2002) 321–344 www.elsevier.com/locate/ijforecast
Forecasting decisions in conflict situations: a comparison of game theory, role-playing, and unaided judgement

Kesten C. Green*

School of Business and Public Management, Victoria University of Wellington, and Decision Research Ltd., P.O. Box 5530, Wellington, New Zealand
Abstract
Can game theory aid in forecasting the decision making of parties in a conflict? A review of the literature revealed diverse
opinions but no empirical evidence on this question. When put to the test, game theorists’ predictions were more accurate
than those from unaided judgement but not as accurate as role-play forecasts. Twenty-one game theorists made 99 forecasts
of decisions for six conflict situations. The same situations were described to 290 research participants, who made 207
forecasts using unaided judgement, and to 933 participants, who made 158 forecasts in active role-playing. Averaged across
the six situations, 37 percent of the game theorists’ forecasts, 28 percent of the unaided-judgement forecasts, and 64 percent
of the role-play forecasts were correct.
© 2002 International Institute of Forecasters. Published by Elsevier Science B.V. All rights reserved.
Keywords: Conflict; Expert opinion; Forecasting; Game theory; Judgement; Role-playing; Simulation
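As an illustration of how accuracy figures of the kind reported in the abstract might be computed, the following minimal Python sketch calculates the percentage of forecasts that match the actual decision for each situation and then takes an unweighted mean across situations. The function names and the toy forecasts are hypothetical, and the unweighted averaging is only an assumed reading of "averaged across the six situations"; none of it is taken from the study's own procedures or data.

```python
def percent_correct(forecasts, actual_decision):
    """Percentage of forecasts that match the decision actually made."""
    if not forecasts:
        raise ValueError("at least one forecast is required")
    hits = sum(1 for forecast in forecasts if forecast == actual_decision)
    return 100.0 * hits / len(forecasts)


def mean_percent_correct(situations):
    """Unweighted mean of per-situation accuracy (an assumed reading).

    `situations` maps a situation name to (forecasts, actual_decision).
    """
    if not situations:
        raise ValueError("at least one situation is required")
    per_situation = [
        percent_correct(forecasts, actual)
        for forecasts, actual in situations.values()
    ]
    return sum(per_situation) / len(per_situation)


# Hypothetical toy data: the decision labels are invented for illustration.
toy_situations = {
    "Situation 1": (["A", "B", "A", "A"], "A"),  # 3 of 4 forecasts match
    "Situation 2": (["C", "D"], "D"),            # 1 of 2 forecasts match
}

if __name__ == "__main__":
    print(mean_percent_correct(toy_situations))
```

Averaging per-situation percentages gives each situation equal weight regardless of how many forecasts it attracted; pooling all forecasts before dividing would instead weight situations by their number of forecasts.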
1. Introduction

In 1996 the New Zealand government transferred some of the assets of its monopoly electricity generator to a new private sector electricity-generating company, Contact Energy Ltd. It further split the residual into three entities in 1999. Wishing to know how participants in the new competitive market for wholesale electricity would behave following the second split, Contact organised its executives to role-play the generator company managers in a series of electricity trading simulations. The role-play behaviour was so at odds with the executives’ own beliefs about how the market participants should and would behave, that they ignored the forecast. Turning to game theory for help, Contact management found it to be “no help at all . . . the role-playing exercise had already foretold the future, as we were to find out to our cost.”¹

¹ Interview with Toby Stevenson, General Manager Electricity Trading, Contact Energy Limited, 7 December 2000.

* Tel.: +64-4-499-2040; fax: +64-4-499-2080. E-mail address: kesten.green@vuw.ac.nz (K.C. Green).

0169-2070/02/$ – see front matter © 2002 International Institute of Forecasters. Published by Elsevier Science B.V. All rights reserved.
PII: S0169-2070(02)00025-0

This anecdote suggests that role-playing may
be an effective approach to predicting decisions made in conflicts among small numbers of decision makers with much at stake. The primary purpose of the research described in this paper was to investigate the relative accuracy of methods used to forecast decisions made in real conflicts. For this purpose, I defined accuracy as the proportion of forecasts that match the actual decision. Accuracy is commonly regarded as the most important criterion for judging the worth of a forecast (Armstrong, 2001b). The methods I examined were unaided judgement, game theory, and role-playing. I defined game theory as what game theorists do when faced with practical forecasting problems. It was not the purpose of the study to investigate other aspects of the methods, such as their value for generating strategic ideas.

While unaided judgement is commonly used to forecast decisions in conflicts, game theory and role-playing are not. Armstrong, Brodie, and McIntyre (1987) surveyed 59 practitioner members of the International Institute of Forecasters. The practitioners were asked about the use, by their respective organisations, of methods for forecasting competitive actions. The authors found that the organisations of 85 percent of practitioners used the opinions of experts with domain knowledge, the organisations of 8 percent of practitioners used formal game theory, and the organisations of 7 percent of practitioners used role-playing. The same study found expert opinion on the relative value of the methods to be at odds with the reported frequency of use by practitioners’ organisations. Both marketing and forecasting experts ranked game theory and role-playing more highly than practitioners, although they disagreed about the relative value of the two methods: forecasting experts preferred role-playing over game theory.

Game theory may help practitioners provide more accurate forecasts than unaided judgement because, for example, the discipline of the approach should tend to counter judgemental biases. Indeed, Nalebuff and Brandenburger (1996, p. 8) wrote “by presenting a more complete picture of each . . . situation, game theory makes it possible to see aspects of the situation that would otherwise have been ignored. In these neglected aspects, some of the greatest opportunities . . . are to be found”. The entry on game theory in Bullock and Trombley’s (1999) dictionary states that game theorists “hope to produce a complete theory and explanation of the social world”. Given these claims and the fact that game theory is used by forecasting practitioners and is recommended by experts, it is legitimate to ask whether the method can help forecasters make useful predictions for real conflicts.

Opinions on the value of game theory for forecasting real conflicts are diverse. In contrast to the optimistic claims made by Nalebuff and Brandenburger (1996) and in Bullock and Trombley (1999), Shubik (1975, p. xi) described the assumptions behind formal game theory as “peculiarly rationalistic”. He continued: “It is assumed that the individuals are capable of accurate and virtually costless computations. Furthermore, they are assumed to be completely informed about their environment. They are presumed to have perfect perceptions. They are regarded as possessing well-defined goals. It is assumed that these goals do not change over the period of time during which the game is played”. He concluded that while game theory may be applicable to actual games (such as backgammon or chess), and even be useful for constructing a model to approximate an economic structure, such as a market, “It is much harder to consider being able to trap the subtleties of a family quarrel or an international treaty bargaining session” (p. 14).

The usefulness and realism of role-playing are often contrasted with the limitations of game theory in the game-theory literature. For example, Nalebuff and Brandenburger (1996, p. 62) emphasised the importance and difficulty of
appreciating the perceptions of other parties. In a brief note, they suggested that one way for managers to do this is to “ask a colleague to role-play by stepping into [another] player’s shoes” (p. 63).

Shubik (1975) dealt with role-playing more comprehensively. He stated that “an extremely valuable aspect of operational gaming is the perspective gained by viewing a conflict of interests from the other side. Experience gained in playing roles foreign to one’s own interests may provide insights hard to obtain in any other manner” (p. 9). Shubik further suggested that game theory is less realistic than role-playing (gaming): “In summary we should suggest that many of the uses of gaming are not concerned with problems which can be clearly and narrowly defined as belonging to game theory. Environment-poor experimental games come closest to being strict game theory problems. Yet even here, features such as learning, searching, organising, are best explained by psychology, social-psychology, management science, and other disciplines more relevant than game theory” (p. 17).

Schelling (1961) stated that “part of the rationale of game organization [role-play experimentation] is that no straightforward analytical process will generate a ‘solution’ to the problem, predict an outcome, or produce a comprehensive map of the alternative routes, processes, and outcomes that are latent in the problem” (p. 47). In contrast, role-plays “do generate these complexities and, by most reports, do it in a fruitful and stimulating way”.

2. Prior evidence

Although judgement may be adequate for predicting decisions made in routine conflicts, decision makers can be subject to serious judgemental biases or blind spots. Decision makers who are involved in a conflict tend to give “insufficient consideration of the contingent decisions of others”, as evidenced by such phenomena as winner’s curse and non-rational escalation of commitment (Zajac & Bazerman, 1991, p. 43).

Babcock, Loewenstein, Issacharoff, and Camerer (1995) demonstrated that judgement tends to be biased by the role of the judge. They asked participants to provide an estimate of a fair judgement in a dispute between two parties. Before being presented with identical briefing material, half the participants were told they were to take the role of a lawyer for the complainant and the other half were told they were to represent the defendant. Estimates of a fair judgement diverged between the two groups of ‘lawyers’ and the authors show that the two groups interpreted the briefing material in different (self-serving) ways. Similarly, Cyert, March, and Starbuck (1961) found that role-players produced divergent forecasts from identical sets of numbers depending on whether they were told they were ‘cost analysts’ or ‘sales analysts’. Statman and Tyebjee (1985) replicated the study, and obtained results consistent with the earlier research.

Given the foregoing evidence, a manager wanting to forecast a decision in a conflict may consider asking someone not involved in the conflict for his or her considered opinion on what decision is most likely. Yet independent judges are also subject to influences that lead to inaccurate forecasts. For example, they may be subject to biases arising from the use of common judgemental heuristics, or to overconfidence. Bazerman (1998, p. 6) identified three broad classes of heuristic (availability, representativeness, anchoring) that can engender judgemental biases. Arkes (2001) examined the evidence and concluded that experience often leads forecasters using unaided judgement to ignore base rate data and shun decision aids to the detriment of forecast accuracy. Evidence on the forecasting accuracy of independent judges
is provided by Armstrong (2001a) who found that research participants exercising their unaided judgement performed no better than chance when predicting decisions made in conflicts.

There is little evidence on the predictive validity of game theory for real conflicts. Reisman, Kumar, and Motwani (2001) reviewed all game-theory articles published in the leading US OR/MS journals and found that an average of less than one article per year addressed a real-world application. Armstrong (1997) stated “I have reviewed the literature on the effectiveness of game theory for predictions and have been unable to find any evidence to directly support the belief that game theory would aid predictive ability” (p. 94).

Such evidence as there is tends to be indirect and incomplete, typically comparing game-theory predictions with decisions made in role-play experiments rather than with those from actual conflicts. For example, Armstrong and Hutcherson (1989) found two studies (Neslin & Greenhalgh, 1983; Eliashberg, LaTour, Sangaswamy, & Stern, 1986) that involved the use of game theory to predict decisions made in negotiations. In both studies the negotiations were role-plays rather than actual negotiations, with the implication that the role-play decisions (which game theory predicted imperfectly) were equivalent to actual negotiation agreements.

Similar approaches have been widely used by game theorists. Shubik (1975) notes that “experimental gaming” is employed to examine the “validity of various solution concepts” developed by game theorists (p. 20). Rapoport and Orwent (1962) conducted a comprehensive review of the use of experimental games to test game-theory hypotheses. They concluded this review with a suggestion that “game theory is not descriptive and will not predict human behavior, especially in games with imperfect information about the payoff matrices” (p. 34).

I sent an e-mail letter to 474 e-mail addresses asking for empirical evidence on the predictive validity of game theory for real conflicts. I had previously used the addresses to recruit game-theorist research participants. Included in the letter was a URL for a draft of this paper. I received 18 responses, none of which provided substantive evidence on predictive validity.

One response was an automatic message from an ISP informing of an invalid address and five were automatic messages generated for addressees who were on leave or otherwise unavailable. A single respondent claimed to have no relevant expertise, and a further four either asked for the draft paper in another format or stated that they would look at the material later. Seven respondents provided comments on the paper or information on game theory. One of these referred to “the huge literature on experimental economics that shows under what conditions game theory with the ‘rational actor assumptions’ works, and where it does not”. The second of these respondents pointed to his own work on the civil conflict in Northern Ireland as offering evidence of predictive validity for a variety of game theory (Brams & Togman, 2000). He stated “I think our predictions . . . have for the most part been borne out by events”.²

² Personal communication from Steven Brams, 13 June 2001.

Brams and Togman suggested that game theory and the theory of moves (an extension of game theory) give insights that “can help political leaders predict both the dynamics of conflict and the conditions that can provide an escape from it” (p. 337). The book in which their paper appears also contains claims of accuracy for a forecasting model using game-theory and decision-theory analysis of the civil conflict over Jerusalem (Organski, 2000). A third respondent pointed to his own work on analysing play in an international sporting competition. His paper was not relevant to this research. Finally, four of the
respondents who had offered comment or information did not address the question of predictive validity.

A search of the Social Science Citation Index (SSCI) for the period 1978 to 7 July 2001 using the phrases ‘game theory’ and ‘forecasting’ yielded two articles. Substituting ‘prediction’ for ‘forecasting’ yielded a further 14 articles. Jehiel (1998) turned up in both searches, but the article was purely speculative and theoretical. Of the remainder, one concerned a behavioural pharmacology study using mice (Parmigiani, Ferrari, & Palanza, 1998), a second was a review of research on strategic decision making (Schwenk, 1995), and a third concerned the predictive validity of agency theory (Ghosh & John, 2000).

Aside from Henderson (1998), which offered only opinions on the usefulness of game theory, the remaining 11 articles all provided evidence for the predictive accuracy of game theory relative to another method, relative to a decision made in a real conflict, or both. Predictions were compared to the outcomes of experiments in six of the articles, and this was the only comparison that was made in two of these articles (Diekmann, 1993; Sonnegard, 1996). Batson and Ahmad (2001) compared the predictions of classic game theory with predictions from the theory of rational choice and the empathy–altruism hypothesis. Gibbons and Van Boven (2001) compared game-theory predictions with the outcomes of experiments and the stated preferences of role-players. McCabe and Smith (2000) compared game-theory predictions with the outcomes of experiments and with predictions based on theories of ‘social exchange’. Suleiman (1996) conducted experiments to test hypotheses about why classic game theory fails to predict the behaviour of participants in ‘ultimatum’ game experiments. The experiments involved comparing the game outcomes with the predictions of the participants.

Three of the articles mentioned the predictive accuracy of game theory for real conflicts. Ghemawat and McGahan (1998) used historical market data to examine price competition between electricity generating companies. The authors found that game-theory models had greater explanatory power than ‘nonstrategic analysis’. They suggested that game theory is most likely to provide accurate forecasts of conflicts when there is concentrated competition (few players), mutual familiarity, and repeated interaction. Gruca, Kumar, and Sudharshan (1992) proposed a game-theory model of existing players’ responses to a new entrant to a market. They compared the predictions from their game-theory model with empirical evidence on actual behaviour in conflicts of this type and with the predictions of an alternative model. They found that their model provided forecasts that were more consistent with the empirical evidence for more combinations of competitive situation than did the alternative model, but did not provide enumeration. Keser and Gardner (1999) found that the Nash equilibrium failed to predict the behaviour of participants in a common-pool resource game experiment. They suggested that “policies based on that equilibrium’s predictions are suspect”. The participants were students experienced in game theory, and the researchers expected the design of the experiment to favour Nash equilibrium decisions.

Finally, Sandholm (1998) compared the predictions of three classes of ‘evolutionary game theory’ models of social convention on the basis of reasonableness.

Although the 11 articles did not provide formal evidence on the accuracy of forecasts by game theorists relative to reasonable non-game-theoretic alternatives, they did support the contention that game theory is considered to be a forecasting method by some authors.

A search of the Internet using the Google search engine and the same terms I used for the
SSCI search on July 17, 2001 (‘game theory’ and ‘forecasting’) produced too many hits to investigate. Using the three phrases ‘comparative’, ‘forecasting accuracy’, and ‘game theory’ yielded 17 hits, including one duplicate. Eleven of the hits were lists of courses or papers; the three search terms were dispersed among unrelated course summaries and abstracts of papers. Four of the hits were of single documents, rather than lists. None of the four offered any evidence on the forecasting accuracy of game theory for real conflicts. One URL returned an ‘access denied’ message.

Bennett (1995, p. 27) listed four missing dimensions of classic game-theory models: differing perceptions, dynamics, combinatorial complexity, and linked issues. Two extensions of game theory that set out to address its shortcomings have been ‘hypergame analysis’ (Bennett & Huxham, 1982), and ‘drama theory’ (Howard, 1994a,b). While hypergames are intended to take players’ divergent perceptions into account by describing and analysing a set of subjective but linked games, drama theory moves further away from the sparseness of game theory by attempting to incorporate emotion into the analysis of conflict situations. On the face of it, these developments should deliver greater realism, and perhaps also greater predictive accuracy, than classic game theory. Nevertheless, to test hypotheses about behaviour generated by drama theory, Bennett and McQuade (1996), for example, resorted to role-playing to provide the behavioural benchmark.

Role-playing is the third and last of the three forecasting methods considered. It is “a technique whereby people play roles and enact a situation in a realistic manner. Role-playing can be used to predict what will happen if various strategies are employed. It is especially relevant when trying to forecast decisions made by two parties who are in conflict” (Armstrong, 2001d, p. 807). Role-playing is used to simulate rather than analyse conflict situations. Armstrong (2001a) found that role-playing provided more accurate forecasts than chance and unaided judgement.

Role-playing and game theory depend on contrasting assumptions about modelling conflict situations. Those who adopt a game-theory approach must assume that they can radically reduce the complexity of a conflict without predictive validity being lost. The role-play approach, on the other hand, incorporates complexity and emotion into a simulation.

Armstrong (2001c) has suggested that the accuracy of different approaches to predicting decisions made in conflicts is related to the realism with which they allow forecasters to model the situation: the more realistic the representation, the more accurate the predictions are likely to be (Principle 7.2). This hypothesis implies that the use of game-theory knowledge should result in more accurate forecasts than the use of unaided judgement and that the use of role-playing should produce more accurate forecasts than the use of game-theory knowledge.

3. Methods

3.1. Selection of conflicts

I chose six real conflicts to assess the relative accuracy of forecasts from unaided judgement, game theory, and role-playing. The conflicts each occurred between small numbers of decision-makers with much at stake. They were diverse in the parties that were involved, the level of conflict, and the type of decision that was to be made. Furthermore, the conflicts were all unlikely to be recognised by participants. I neither sought nor deliberately avoided conflict situations that resembled familiar game-theory models. Neutral but informed observers wrote the descriptions of the situations.

Three of the situations (‘Artists’ Protest’, ‘Distribution Channel’, and ‘55% Pay Plan’)
were first used in research by Armstrong (1987). Artists’ Protest was a conflict between artists and government over financial support; Armstrong based his description of it on a report in The Wall Street Journal (Newman, 1982). Distribution Channel was a proposal by Philco Corporation for commercial cooperation requiring decision makers to trade off conflicting interests; Armstrong took a description of it from a book of marketing case histories (Berg, 1970). The 55% Pay Plan was a conflict between National Football League owners and players over revenue shares; Armstrong based his description of it on two reports in Sports Illustrated published prior to the start of negotiations (Boyle, 1982; Kirshenbaum, 1982).

A fourth situation, ‘Panalba Drug Policy’, was first used in experiments conducted by Armstrong (1977) on the influence of role on managers’ behaviour. Panalba was a conflict between the board members of a pharmaceutical company and the consumers of one of the company’s drugs. Unlike the other conflicts in this research, one party to the conflict (the consumers) was absent except in the abstract. Armstrong wrote: “The description was based upon the true case of Panalba as reported by Mintz (1969). Information was also taken from Upjohn’s Annual Reports. I made up details for this case . . . to make the extreme nature of this case obvious. Attempts were made to obtain further information from the Upjohn Co. to ensure the facts were accurately presented, but they refused to answer” (p. 196).

Prior research using these four situations provided information on the accuracy of role-playing and judgemental forecasts. These data were collected by Armstrong and colleagues and were summarised in Armstrong (2001a). In all, 147 judgemental forecasts made by 230 participants, and 119 role-play forecasts from 653 participants were obtained from this source.

I wrote the description of the fifth situation, ‘Zenith Investment’, for this research project. I based my description on a Granada Television documentary about a British Steel investment decision that was complicated by conflicting interests among the decision makers (Graef, 1976). I disguised the situation, but did not alter it in any substantive way. I sought advice from a professor who had used the documentary in his teaching for many years. He had also worked at British Steel for one of the managers who played a major part in the investment decision portrayed in the documentary. The professor considered the description to be an accurate representation of the situation.

I also wrote the description of the sixth situation, ‘Nurses Dispute’. The situation was a dispute over pay between nursing staff and management. The nurses went on strike, angry that they were being offered a much lower pay increase than management had already given to intensive care nurses and junior doctors. A mediator was appointed by a government agency. The principal negotiators for the two parties co-operated with the research and, after they had reached an agreement, some members of the nurses’ negotiation team role-played the dispute in a debriefing session using the material I had prepared for this research. Asked how the description could be improved, the principal negotiator did not have any suggestions and considered the material a fair and accurate representation. I used two versions of the situation description that were the same in all respects other than the names of the people and organisations involved. In the first version I used actual names. In the second, I changed the names of the people and organisations after one potential participant said he was unhappy playing the role of a real person and withdrew from his session.

The descriptions of the six situations are available on the Internet at www.kestencgreen.com. I used the same situation descriptions, role descriptions, and exhaustive list of decisions from which the participants were asked to
choose for all three forecasting methods: unaided judgement, role-playing, and game theory. I discuss an exception to the uniform treatment below.

For any forecasting method to be useful, it must be more accurate than chance. I use the term ‘chance’ to refer to the probability of selecting, at random, the decision that was actually made, from the exhaustive list of potential decisions provided to participants.

3.2. Unaided judgement

The participants using unaided judgement read the description of a conflict and then selected the decision they thought most likely from a list. I recruited these judges on the basis of convenience. They had no special knowledge of the situations nor of the class of problem I asked them to consider.

Except for predictions for the Zenith and Nurses situations, I took most of the predictions using unaided judgement that are used in this research from studies reported by Armstrong (2001a). Role descriptions were not provided to all the judges who took part in the research reported by Armstrong, but Armstrong and Hutcherson (1989) have shown that providing role descriptions to judges has no effect on their forecasting accuracy. In some cases, the judges were paired off to discuss the problem before making their predictions. In other cases, they acted alone.

I provided all participants in the new research with a full set of information for a single situation (including roles for both parties) and a questionnaire. I told them that the material included a description of a conflict that had occurred in the past. I asked them to read the material and, without conferring with others or referring to other sources of information, use their judgement to predict the decision that had actually been made, and then complete and return the questionnaire.

Volunteers from one third-year university class provided 10 Zenith predictions. I asked participants to leave the lecture theatre with the material I had given them and to return the completed questionnaires no more than one hour after having received their instructions.

Volunteers from two third-year classes provided 20 predictions. I used 10 minutes of class time to distribute material on five situations (Nurses was not included) and to brief the students. I instructed them to return their completed questionnaires to their lecturer, or to fold them (so that ‘Freepost’ information and address were displayed) and post them.

I had offered five marketing students, one information-technology student, one environmental-planning consultant, and one medical doctor $NZ25 (about $US10) to participate in a role-playing session, but I did not need them for that purpose. Instead, they provided eight predictions for the Zenith situation. I asked them to adjourn to a room away from the role-players with the material I had given them and to return completed questionnaires to me no more than one hour after getting their instructions. In a situation similar to the one just described, students who were not needed for role-playing provided predictions for the Nurses situation instead.

3.3. Role-playing

I gave each role-play participant a single role description to read and told participants to adopt the role they had been given for the duration of their simulation. When they had adopted their roles, I asked the role-players to read a description of the conflict. I then told them to form groups, with each group comprising one role-player for each role. For example, in the Nurses situation there were five roles: two management roles, two union roles, and one mediator role.

Once the role-players were in their groups, I told them to role-play the conflict from the time
specified in the description until they arrived at a decision or ran out of time. When they finished, each role-player recorded the decision the group had made or, if they had run out of time, the decision he or she thought would have been made had the group been free to continue.
With the exception of six groups of Nurses Dispute role-players, the role-players were, like the judges, recruited on the basis of convenience rather than because they resembled the characters they were to play or were familiar with the roles they were to play or were knowledgeable about the situation. Most of the role-players were university students. I conducted role-play simulations for the Zenith and Nurses situations as part of this research, and took role-play results for the other four situations from Armstrong (2001a).
I brought the participants for the Zenith and Nurses simulations together in lecture theatres or similar settings for their briefings. I gave each participant an information sheet, which provided basic information on the research project and on role-playing, and a consent form. I told them it was very important that they take the role-playing seriously. Additional space, such as a second lecture theatre, meeting rooms, or lobby areas, was available for role-playing. I encouraged the role-players to make good use of the available space for holding both formal meetings and private discussions. At the end of the briefings, I told them that they were free to improvise provided that they remained in character and true to the situation description. They were allowed to retain their printed role and situation descriptions for the duration of the session. This material included the questionnaire that the role-players were to fill in at the end of their role-plays. I drew their attention to the decision options presented in the questionnaires and told them that they would be expected to match one of these with their own group’s role-play decision. I also gave role-players printed self-adhesive badges showing their characters’ names and positions. I told them to meet in their groups, wearing their name badges, and to introduce themselves to each other while in character.
I told the Zenith Investment role-players that the chairman of Zenith would set a time and place for a meeting and that they should prepare for the meeting in a manner consistent with their printed briefing material. In the verbal briefings, I emphasised that it was appropriate to hold informal discussions with other members of their group (the Zenith Policy Committee) prior to the meeting. I told Nurses Dispute role-players that, after introductions, the two parties to the dispute should agree to a time and place for their first meeting and that subsequent meetings could be held at the discretion of the parties. I told participants who had been given the mediator roles, one from each group at the session, to meet at a designated place. I told them to discuss mediation and its application to the dispute among themselves until the parties in their own groups called upon them, or 30 minutes had elapsed. If the contesting parties failed to agree within 30 minutes, they were obliged to accept the services of their government-appointed mediator. This measure was intended to simulate the effect of employment relations legislation that had just taken effect at the time of the dispute.
No efforts were made to increase the realism of the simulations beyond what has been described here. There were no theatrical or technological devices, nor were any confederate role-players used.
The procedures I adopted in this research were similar to those Armstrong and Hutcherson adopted for Artists’ Protest, Distribution Channel, and 55% Pay Plan (Armstrong, 1987; Armstrong & Hutcherson, 1989). Although similar in other respects, the role-plays of Panalba Drug Policy (Armstrong, 1977) involved people belonging to a single party (the members of the Upjohn Board) who had a
common purpose; neither representatives of consumers nor of the Food and Drug Administration were present.
A total of 170 university students role-played the Zenith situation in 17 groups. I conducted these role-plays in three sessions with different participants. Five groups role-played the situation in an organisational behaviour class and four in a conflict-of-laws class. I recruited the eight groups of role-players in the third session from among students attending mathematics and computer science lectures. I offered the students who attended the third session $NZ25 to take part. Whereas participants in earlier sessions had been assigned randomly to roles, in the third session I asked ‘natural leaders’ and ‘quantitative analysis experts’ to come to the front of the theatre. I allocated ‘natural leaders’ to the Chairman roles and ‘quantitative experts’ to the roles of either Finance Director or Chief Planner. I assigned the remaining participants randomly to the seven other roles.
I told all participants that their taking part would help with research on decision making in conflict situations and was likely to be both enjoyable and (in the case of the first two sessions) relevant to their studies. I told participants that the situation they were to role-play had occurred in the past and that it involved a group of senior managers making an important investment decision.
A total of 110 participants role-played the Nurses situation in 22 groups. I conducted three sessions. In the first session, 10 students with work experience who were enrolled in courses on dispute resolution role-played in two groups. In the second session, 90 students recruited with an offer of $NZ25 cash role-played in 18 groups. In the third session, 10 participants selected for experience relevant to the situation (union negotiators, managers, management negotiators, and professional mediators) role-played in two groups.
I asked participants in the first and second Nurses sessions whether they would tend to identify more with union or management in a pay dispute. I then allocated them according to their preferences, with the more equivocal participants given the mediator role. I allocated participants in the third session to roles on the basis of my assessment of the compatibility of their experience with the demands of the roles.
3.4. Prediction by game theorists
The initial sample of 558 game theorists was composed of the members of the Game Theory Society, recipients of the International Society of Dynamic Games E-Letter,³ and a small number of prominent game-theory experts who were not members of the other two groups. I expected that a large initial sample would be necessary because the task was demanding and I was offering no extrinsic rewards.
I sent everyone in the sample of experts an e-mail message and material on five situations, excluding the Nurses Dispute (Appendix A). I personalised the messages when this was possible. Although the Game Theory Society membership list contained only e-mail addresses, names were included in most of the e-mail addresses. I conducted Internet searches when they were not and ended up with names for all but 57 e-mail addresses in the sample.
I made recipients aware of the purpose of the research with the e-mail message’s subject line: ‘Using Game Theory to predict the outcomes of conflicts’. Moreover, in the first paragraph I wrote “I am engaged on a research project which investigates the accuracy of different methods for predicting the outcomes of conflicts”.
I attached MS-Word files containing the material on the situations to the messages. I varied the order of the five situation files across
³ http://www.hut.fi/Units/SAL/isdg/issue23.txt.
the sample to reduce the risk that respondents who predicted decisions for only some of the situations would do so for similar subsets. In addition, I hoped that varying the order of the situations in the e-mail messages would counter any effect that might have arisen had the experts changed their processes of deriving predictions in some systematic way as they worked through the situations.
The material on each of the situations included descriptions of all of the roles, a description of the situation, and a questionnaire. In the questionnaires, I asked respondents to select the most likely from a list of possible decisions. I also asked them how long they took to derive the prediction and how many years’ experience they had working with game theory. I repeated the question on experience for each situation in case the recipient of the e-mail message asked other people to respond to one or more of the problems. In anticipation of some respondents being willing to participate in the research but unwilling to make a prediction for particular situations, I included a question with each situation asking those who did not make predictions to give a reason for not doing so. I also asked them to identify any situations they thought they recognised. I asked respondents to complete and return their questionnaires electronically or to print them out and return the completed forms by fax or post.
After 13 and 14 days, I sent individualised e-mail reminders to the 413 addressees in the original sample who had not responded. The reminder included a copy of the original letter and the attached files in the same order as in the original e-mail message.
I sent individual replies to all those who responded, irrespective of the nature of their responses.
A year after the first appeal to game theorists, I sent material on the Nurses situation to those who had provided predictions in response to the first appeal. I followed this with up to two reminders one week apart to those who had not already responded.
4. Findings
4.1. Unaided-judgement replication and extension
4.1.1. Participation
Participants provided one forecast each for Artists’ Protest (eight forecasts), Distribution Channel (five), Panalba Drug Policy (four), Zenith Investment (21), and Nurses Dispute (22).
Students from the two classes who undertook the task in their own time reported taking between 2 and 120 minutes to derive their individual predictions. The mean time was 30 minutes and the median was 25 minutes. I gave the participants making predictions for Zenith and Nurses an hour to do so. Zenith judges took an average of 15 minutes to derive a prediction. I did not ask the Nurses judges to record the time they actually spent on the task.
One Nurses Dispute respondent claimed to know more about the situation than was contained in the material provided but gave no indication of knowing anything about the actual agreement, and so I included the response in the analysis. None of the other respondents admitted to recognising any of the situations they had been given.
Fewer than 10 percent of the participants I asked to undertake the task in their own time returned completed questionnaires. I received responses for four of the five situations provided; none was received for the 55% Pay Plan.
4.1.2. Predictions
The unaided-judgement predictions new to this research were (in an unweighted average across the situations) correct for 31 percent of
predictions. This rate is little different from chance (27 percent) for the five situations for which I received responses. New predictions for situations that were also reported by Armstrong (2001a) (Artists, Distribution, and Panalba) were correct for 19 percent of predictions. This finding is consistent with Armstrong (2001a), who reported 13 percent correct for the same situations.
4.2. Role-playing extension
4.2.1. Participation
For the Zenith Investment situation, roughly half the students in the two classes that were not offered payment stayed to take part in role-plays. After their briefing, the organisational behaviour class had little more than 30 minutes remaining for role-playing and completing questionnaires. The conflict-of-laws class had a full hour available, and all took this long, while the groups of paid recruits took as long as an hour for their role-playing alone. Some of the role-players given only half an hour complained of not having enough time, as did some who were given an hour. Nevertheless, all 17 groups made decisions. One of the paid recruits recognised the situation but did not recall the decision that was made or discuss her knowledge with her group.
For the Nurses Dispute situation, participants in the first and second sessions typically took between 45 minutes and one hour for their role-playing. The two groups of role-players in the third session took one-and-a-half hours for their role-playing and neither group had come to an agreement when I asked them to stop.
4.2.2. Predictions
For the Zenith Investment situation, the role-play decisions matched the actual decision for 10 of the 17 groups (59 percent). This was better than chance, which is 33 percent for Zenith. The finding was similar to the unweighted average for the role-plays of the four situations reported by Armstrong (2001a), which was 60 percent. Chance for these four situations is, on average, somewhat lower at 26 percent, however. The forecasting accuracy of the eight groups with ‘natural leaders’ and ‘quantitative analysis experts’ allocated to roles was similar to that of the other nine groups: there were five correct forecasts from each treatment.
For the Nurses Dispute situation, role-play agreements matched the actual agreement between management and union in 18 of the 22 groups (82 percent). This is better than chance. Although neither of the groups in the two-group (third) session reached an agreement in the course of their role-plays, both groups chose the actual decision when asked what decision they thought would have occurred had their role-playing continued.
4.3. Game theorists
4.3.1. Participation
I received responses of various types from 269 addresses (48 percent of the initial sample). Of these, 78 were invalid-address messages, 18 were automatic rejection messages (typically on leave messages), and six were statements that the addressee was not a game-theory expert.
The balance of 167 responses consisted of messages from 95 game theorists who stated that they did not wish to participate, 51 game theorists who promised to respond or who wanted more information but did not respond after receiving it, and 21 game theorists who each responded with a completed questionnaire for at least one situation.
4.3.1.1. Reasons for refusing to participate
Of the 95 people who responded but refused to participate, 90 provided reasons (Appendix B). Most (72) stated they were too busy with other commitments to spend time helping with
this research. Eight maintained that their game-theory speciality was not applicable to the problems provided. Ten wrote either that it was not appropriate to apply game theory to the problems given (six) or that there was insufficient information for them to derive predictions (four).
4.3.1.2. Selective prediction
Nine of the 21 experts who returned completed questionnaires did not provide predictions for all of the original five situations. I obtained 85 usable predictions in all from the 21 experts out of a potential total of 105 predictions. It seems reasonable to assume that they made predictions for the situations they considered themselves most capable of predicting accurately. One respondent recognised the Artists’ Protest situation and I excluded his response from the analysis.
4.3.1.3. Further participation
When I appealed again to the 21 game theorists who had responded to the original appeal, 14 provided usable predictions for the Nurses Dispute situation and one sent an unusable response.
4.3.1.4. Probabilistic forecasts
While the questionnaires instructed respondents to ‘check one – ✓’ or something similar, the e-mail message stated “you may assign probabilities to possible outcomes, rather than picking a single outcome, if you consider this to be appropriate”. One game theorist who responded for a single situation (Artists) provided probabilities rather than choosing a single decision. He assigned a zero probability to the actual decision.
4.3.2. Predictions
Overall, the game-theory experts’ predictions matched the actual decision for 37 percent of their predictions. This rate is better than chance (an unweighted average of 27 percent over the six situations) and better than unaided judgement (28 percent).
5. Discussion
5.1. Relative predictive accuracy
In Table 1, I combined the results from this research (chiefly game-theory experts’ predictions for six situations, and unaided judgement

Table 1
Accuracy of unaided judgement, game theorist, and role-play predictions. Percent correct predictions (number of predictions)
                        Chance   Unaided       Game       Role-playᵇ
                                 judgementᵃ    theorist
Artists’ Protest          17      5 (39)        6 (18)    29 (14)
Distribution Channel      33      5 (42)       31 (13)    75 (12)
55% Pay Plan              25     27 (15)       29 (17)    60 (10)
Zenith Investment         33     29 (21)       22 (18)    59 (17)
Panalba Drug Policy       20     34 (68)       84 (19)    76 (83)
Nurses Dispute            33     68 (22)       50 (14)    82 (22)
Totals (unweightedᶜ)      27     28 (207)      37 (99)    64 (158)
ᵃ Results reported in Armstrong (2001a), except Zenith Investment, Nurses Dispute, and 17 predictions for other situations from this research: Artists (one correct/n = 8); Distribution (1/5); Panalba (1/4).
ᵇ Results reported in Armstrong (2001a) except Zenith Investment and Nurses Dispute.
ᶜ Percentage figures in this row are unweighted averages of the percentage of correct responses reported for each situation.
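
As an illustrative check that is not part of the original analysis, the Totals row of Table 1 can be recomputed from the per-situation figures. The sketch below assumes the table values as transcribed above; the variable and function names are hypothetical, and percentages are rounded half up.

```python
import math

# Table 1 as printed: chance (%), then (percent correct, number of predictions)
# for unaided judgement, game theorists, and role-play. Names are illustrative.
TABLE_1 = {
    "Artists' Protest":     (17, (5, 39),  (6, 18),  (29, 14)),
    "Distribution Channel": (33, (5, 42),  (31, 13), (75, 12)),
    "55% Pay Plan":         (25, (27, 15), (29, 17), (60, 10)),
    "Zenith Investment":    (33, (29, 21), (22, 18), (59, 17)),
    "Panalba Drug Policy":  (20, (34, 68), (84, 19), (76, 83)),
    "Nurses Dispute":       (33, (68, 22), (50, 14), (82, 22)),
}
METHODS = ("unaided judgement", "game theorist", "role-play")


def round_half_up(x):
    """Round to the nearest integer, with halves rounded up."""
    return math.floor(x + 0.5)


def unweighted_totals(table):
    """Mean of the per-situation percentages and total predictions per column."""
    rows = list(table.values())
    totals = {"chance": round_half_up(sum(r[0] for r in rows) / len(rows))}
    for i, method in enumerate(METHODS, start=1):
        mean_pct = sum(r[i][0] for r in rows) / len(rows)
        n_predictions = sum(r[i][1] for r in rows)
        totals[method] = (round_half_up(mean_pct), n_predictions)
    return totals


totals = unweighted_totals(TABLE_1)
for column, value in totals.items():
    print(column, value)
```

Because each situation carries equal weight, the 83 Panalba role-play predictions count for no more than the 10 for the 55% Pay Plan.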
and role-play predictions for Zenith Investment and Nurses Dispute) with the unaided judgement and role-play predictions for four situations from Armstrong (2001a).
Role-play forecasts were more accurate than predictions by game theorists, which were, in turn, more accurate than unaided judgement predictions. These findings are consistent with Schelling’s (1961) observation that role play provides more realistic representations of conflict than do theoretical models, and with Armstrong’s (2001b) suggestion that forecasting accuracy for conflicts is positively related to the realism of the forecasting method. The difference in accuracy between prediction by game theorists and prediction using unaided judgement is substantial and is statistically significant at 95 percent (χ² = 4.3, degrees of freedom 1, P = 0.04).⁴ The error was about 12 percent lower across the six situations, thus supporting the claim that game theory can improve the forecasting of conflict situations. The difference in accuracy between prediction by game theorists and prediction by role-play decisions is large and statistically significant (χ² = 24.6, degrees of freedom 1, P < 0.001). I applied the statistical tests to the Table 1 data in the form of two separate two-by-two contingency tables (two accuracy levels by two methods).
Some participants provided more than one forecast. These participants may have been able to apply what they learned to subsequent forecasts. If this were the case, the game-theory experts were most likely to benefit because they provided an average of five forecasts each and were free to revise their forecasts. Role-players, on the other hand, generally played a role in a single situation, although some were asked for their opinion on the decision that was made in an unrelated situation (Armstrong, 1987; Armstrong & Hutcherson, 1989). Unlike the role-players, the game theorists were not restricted in the time they had to derive their forecasts nor in the resources they could exploit, although my written instructions asked them to avoid conferring with others.
Motivation may have influenced forecasting accuracy. Game theorists had the most reason to be motivated. I told them that I planned to use their forecasts in a comparison of forecasting methods, and that I would publish their names in the resulting paper (Appendix A). The participants playing roles or using unaided judgement, on the other hand, knew they would remain anonymous.
The superiority of game theorists’ forecasts over forecasts based on unaided judgement may be due to the game theorists’ greater familiarity with conflicts. This possibility does not affect the conclusion that those who wish to predict the decisions made in conflicts would do better to ask game theorists than to rely on judges with no expertise. The relative accuracy of people who are experts on conflicts and who use unaided judgement to predict decisions made in conflicts is a matter for further research.
5.2. Factors that might have disadvantaged game theorists’ predictions
Perhaps the game theorists were not really experts. I put respondents through a number of ‘filters’ before asking them for their forecasts, filters that would reasonably be expected to exclude those who were not experts. These filters were: (a) the composition of the original sample (mostly Game Theory Society members); (b) the initial e-mail message, in which I stated “I am writing to you because you are a game theory expert”; and (c) the potential respondents’ own assessments of their ability to successfully apply a game-theory approach to the problems.
I asked respondents about their experience
⁴ Unless otherwise stated, all χ² tests are of two-by-two contingency tables with continuity correction (Siegel & Castellan, 1988, Eq. (6.3), p. 116).
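
As a hedged illustration of the two-by-two tests with continuity correction reported for the Table 1 data, the sketch below recovers the number of correct predictions by rounding each situation's percentage times its number of predictions. That recovery step is an assumption, as are all function and variable names. The statistic is the standard Yates-corrected chi-square, which is assumed to match the cited Eq. (6.3).

```python
import math

# (percent correct, number of predictions) per situation, from Table 1.
UNAIDED = [(5, 39), (5, 42), (27, 15), (29, 21), (34, 68), (68, 22)]
GAME_THEORIST = [(6, 18), (31, 13), (29, 17), (22, 18), (84, 19), (50, 14)]
ROLE_PLAY = [(29, 14), (75, 12), (60, 10), (59, 17), (76, 83), (82, 22)]


def correct_and_incorrect(cells):
    """Recover (correct, incorrect) counts; an assumption based on rounding."""
    correct = sum(round(pct * n / 100) for pct, n in cells)
    total = sum(n for _, n in cells)
    return correct, total - correct


def yates_chi_square(a, b, c, d):
    """Chi-square with continuity correction for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * max(abs(a * d - b * c) - n / 2, 0) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_df1(chi2):
    """Upper-tail probability for a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


gt = correct_and_incorrect(GAME_THEORIST)
chi2_vs_unaided = yates_chi_square(*gt, *correct_and_incorrect(UNAIDED))
chi2_vs_role_play = yates_chi_square(*gt, *correct_and_incorrect(ROLE_PLAY))
print(round(chi2_vs_unaided, 1), round(p_value_df1(chi2_vs_unaided), 2))
print(round(chi2_vs_role_play, 1), p_value_df1(chi2_vs_role_play) < 0.001)
```

Under these assumptions the recovered counts are 37 of 99 correct for game theorists, 52 of 207 for unaided judgement, and 110 of 158 for role-play, and the statistics round to the reported 4.3 and 24.6.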
with game theory. Their median experience was six and a quarter years. Most of the respondents had enough experience to merit game-theory-expert status. More experience as a game theorist did not, however, lead to more accurate predictions (Table 2).
Perhaps the respondents did not spend enough time on the tasks to demonstrate the strength of a game-theory approach to predicting decisions made by parties in conflict. The respondents were presumably busy people, and I did not offer to pay them for their work. Nevertheless, the respondents responded voluntarily and were, it seems likely, motivated to perform well having taken on the tasks. If, after starting on the task, potential respondents began to doubt their ability to predict the decisions in the time they had available, they could have abandoned the task without risking embarrassment. It seems plausible that the withdrawal of respondents who doubted their own abilities would tend to bias responses towards those from the more capable. The link between confidence and accuracy is, however, tenuous (Arkes, 2001).
The game theorists reported spending nearly 40 minutes, on average, deriving their predictions for each situation, including 10 minutes reading the material. This figure is in keeping with the up to 45 minutes that Armstrong (1977) allowed role-players in his research. One game theorist respondent, who stated apologetically that he had spent ‘only’ about 20 minutes on each of the situations, went on to write that “spending more time would have not changed my answers, except if I had communicated with others”. In response to follow-up e-mail messages, nine of the 12 respondents stated that they did not believe that more time would have changed their predictions. Little or no gain in accuracy was associated with spending more time on the forecasting task (χ² = 0.08, degrees of freedom 2, P = 0.96)⁵ (Table 3). Further, only for the Panalba situation was there a monotonic increase in accuracy across the three time categories.
Perhaps the respondents were unable to demonstrate the strength of a game-theory approach to prediction because they lacked adequate information about the situations. Four experts cited inadequate information about the situations as a reason for not participating in the research. Yet these four comprised only four percent of those who provided a reason for not participating.
I designed the research to compare the forecasting accuracy of the methods by providing the same material to the practitioners of all three methods. I assumed that the material provided was a fair representation of the type of information that could realistically be assembled about an unfolding conflict. Perhaps the experts could have produced better forecasts had they been provided with different information. Perhaps, given equivalent resources, game-theory experts would have collected different types of

Table 2
Game theorists’ predictions: the effect of experience on accuracy
Experience                  nᵃ     % Correct
Fewer than 5 years         (31)    39
Between 5 and 10 years     (35)    40
More than 10 years         (33)    33
ᵃ Number of predictions.

Table 3
Game theorists’ predictions: the effect of time spent on accuracy
Time spent on forecast      nᵃ     % Correct
Up to 25 minutes           (36)    36
26 to 40 minutes           (43)    37
More than 40 minutes       (20)    40
ᵃ Number of predictions.

⁵ Test of independence in a two-by-n classification (Fisher, 1973, p. 87).
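
The time-spent test can be sketched in the same hedged way. The code below assumes that the counts of correct predictions can be recovered by rounding the Table 3 percentages, and it uses the ordinary Pearson chi-square for a two-by-n table, which is assumed to correspond to the test cited in the footnote. All names are illustrative.

```python
import math

# Table 3 as printed: (time category, number of predictions, percent correct).
TABLE_3 = [
    ("Up to 25 minutes", 36, 36),
    ("26 to 40 minutes", 43, 37),
    ("More than 40 minutes", 20, 40),
]


def recover_counts(rows):
    """Turn (label, n, percent) rows into (correct, incorrect) counts by rounding."""
    counts = []
    for _, n, pct in rows:
        correct = round(n * pct / 100)
        counts.append((correct, n - correct))
    return counts


def chi_square_two_by_n(counts):
    """Pearson chi-square test of independence for a two-by-n table."""
    total_correct = sum(c for c, _ in counts)
    total_incorrect = sum(i for _, i in counts)
    grand_total = total_correct + total_incorrect
    chi2 = 0.0
    for correct, incorrect in counts:
        n = correct + incorrect
        expected_correct = n * total_correct / grand_total
        expected_incorrect = n * total_incorrect / grand_total
        chi2 += (correct - expected_correct) ** 2 / expected_correct
        chi2 += (incorrect - expected_incorrect) ** 2 / expected_incorrect
    return chi2


def p_value_df2(chi2):
    """Upper-tail probability for 2 degrees of freedom (closed form)."""
    return math.exp(-chi2 / 2)


counts = recover_counts(TABLE_3)
chi2 = chi_square_two_by_n(counts)
print(counts, round(chi2, 2), round(p_value_df2(chi2), 2))
```

With the recovered counts of 13, 16, and 8 correct predictions, the statistic and P value round to the reported 0.08 and 0.96.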
information about the situations and would, as a consequence, have produced better forecasts. Further research may address these issues.
Perhaps the game theorists would have been more accurate in forecasting different types of conflicts. The game-theory experts provided the most accurate forecasts for the Panalba Drug Policy conflict, while their accuracy for the other conflicts was poor. In sympathy with Shubik (1975), Gintis (2000) offered a possible explanation for this dichotomy in the game theorists’ forecasting accuracy: the rational actor model from neoclassical economics. He wrote “... when faced with market conditions – anonymous, nonstrategic interactions – people behave like self-interested, outcome-oriented actors ... In other settings, especially in the area of strategic interactions, people behave quite differently” (p. 240). The Panalba conflict is an ‘anonymous, nonstrategic interaction’, and the decision makers in the conflict did behave as ‘self-interested, outcome-oriented actors’. The five other conflicts, on the other hand, involved face-to-face strategic interaction and ongoing relationships. This distinction suggests that managers would be sensible to seek forecasts of conflicts from game theorists only if those in conflict are likely to behave as ‘rational actors’.
Bennett’s (1995) ‘missing dimensions’ (mentioned earlier) may also help managers to distinguish between conflicts for which game theory can provide accurate forecasts and those for which this is unlikely. For example, these dimensions are weak or absent in the Panalba conflict. Whether managers would be able to use Bennett’s ‘missing dimensions’ or Gintis’s ‘market conditions’ to make the distinction between conflicts that are tractable under game theory is a matter for further research. Ironically, even if such a distinction could be made in advance, Gintis (2000) suggested that game theory may not add value, because, in situations in which it is likely to help make accurate predictions, “standard neoclassical models do well without the intellectual baggage of game theory” (p. 240).
While Gintis declared that predicting the decisions made in individual real conflicts is “not what game theory is good at”,⁶ other researchers seem to think that game theory can be used to provide useful forecasts for some types of conflict for which ‘rational actor’ assumptions would seem rather brave. The use of methods based on game theory to forecast civil conflicts has been mentioned (Brams & Togman, 2000; Organski, 2000), as has their use for predicting incumbents’ responses to a new competitor (Gruca et al., 1992), for common-pool resource problems (Keser & Gardner, 1999), and for conflicts involving concentrated competition, mutual familiarity, and repeated interaction (Ghemawat & McGahan, 1998).
My research provides some evidence of game theorists’ forecasting accuracy for some of these types of conflicts. While none of the six conflicts I used involved sectarian violence, it could be argued that the Artists’ Protest was a civil conflict. The game-theory experts, on average, did no better than chance in forecasting the decision made in this conflict. Any pay dispute (for example, the 55% Pay Plan and the Nurses Dispute) meets the conditions listed by Ghemawat and McGahan, if they are taken literally, as does the Zenith Investment conflict. The game-theory experts averaged 34 percent correct for these three conflicts. This compares with 41 percent for unaided judgement, 67 percent for role-playing, and 30 percent for chance. None of the six conflicts involved a new competitor or a common pool resource. Game theorists’ predictions may be more accurate for conflicts of these types than they were for the
⁶ Personal communication from Herbert Gintis, 15 June 2001.
six I used in this research. Further research may answer this question.
5.3. Did the game theorists use game theory?
For each of their predictions, I asked game theorists, “Broadly, what approach did you use to derive your prediction?”. The answers to this question were diverse. Five university students, enrolled in an Honours-level economics course that included game theory, rated the responses. Each of these student raters had completed between two and five courses with game-theory content. I told the raters nothing about the research participants or the situations. Specifically, they did not know that the participants were game-theory experts. They each rated up to 97 of 98 responses both for evidence of game-theory knowledge (implicit or explicit) and for the extent to which the responses implied that knowledge of game theory had been applied to making a prediction. The raters derived their ratings independently using a zero (‘none’) to 10 (‘high’) scale. The rating questionnaire, which includes the game theorists’ responses, is available on the Internet at www.kestencgreen.com.
I calculated Cronbach’s Alpha scores to test interrater reliability for the students’ knowledge and application ratings. Alphas were 0.69 (n = 73) and 0.42 (n = 55), respectively. I averaged the five raters’ ratings to provide two mean ratings for each game-theorist response. The overall average rating for knowledge was 2.6 and for application 2.2. Some game-theorist responses were rated highly: the maximums of the five-rater means were 8.0 and 8.2, respectively. Ratings were mostly quite low, however.
While I asked the raters to assess the responses without giving them contextual information, the game theorists knew the context when they wrote their responses. It is reasonable to assume that they felt no obligation to demonstrate their knowledge of game theory or to describe its application in their (typically brief) responses; after all, I had told them that I selected them because of their game-theory expertise. For these reasons, I examine relative rather than absolute ratings in the following analysis.
The ratings provide modest support for the contention that a more structured application of game theory would have increased the accuracy of game theorists’ forecasts. Of the 33 predictions that were associated with a mean knowledge rating of 3.0 or higher, 48 percent were correct. This compares with 33 percent correct for the 64 predictions associated with lower ratings. The difference between the proportions is not significant, however (χ² = 1.7, degrees of freedom 1, P = 0.20). The proportion correct among the 19 predictions with application ratings of 3.0 or higher was 42 percent compared to 37 percent among the 78 with lower ratings. Again, the difference is not statistically significant (χ² = 0.02, degrees of freedom 1, P = 0.89). There was no meaningful relationship between the situations and the mean ratings for the situations (F(5,92) = 0.9; P = 0.47 for both ratings).
Ratings aside, what is important for the purpose of this research is that the respondents were aware that I was assessing game-theory forecasts and that I expected them to apply their game-theory knowledge and skills to the forecasting problems (Appendix A). The responses of the six who considered it inappropriate to apply game theory to the problems lend further support to my contention that respondents knew what was expected of them. If some respondents failed to apply game-theory knowledge to the problems (and some responses suggested this), this implies that these respondents considered that knowledge not useful for, or applicable to, these problems, or that the cost of applying it was too high.
5.4. Alternative assessment of forecast accuracy

Despite the relative inaccuracy of game-theory experts’ forecasts, a manager who obtained several forecasts for a single conflict might still have been led to expect the decision that actually occurred, or at least have been warned that it might occur. This might be the case if the actual decision made in the conflict was the most popular choice of the game theorists or if the most popular choice of the game theorists was similar to the actual decision. Nevertheless, it seems reasonable to assume that such a set of forecasts would be less likely to lead a manager to expect the actual decision than a set in which an absolute majority were accurate. A manager is most likely to commit to a course of action when provided with an unambiguous forecast and to reap the benefits of this commitment if the forecast is accurate. I propose the following accuracy-score regime to allow comparisons between the forecasting methods based on the likelihood that a set of forecasts would lead managers to confidently expect the actual decision:

Accuracy of forecast set                                                              Accuracy score
An absolute majority choose the actual decision                                         2
Actual decision is the most popular choice (≤50 percent)                                1
Either popular choice or absolute majority choice is similar to actual decision         0.5
No clear choice                                                                         0
Either popular choice or absolute majority choice is dissimilar to actual decision     −0.5
Popular choice is substantially different from the actual decision                     −1
Absolute majority choice is substantially different from the actual decision           −2

For both the Panalba and Nurses conflicts, the game theorists’ absolute majority choice was the actual decision (Table 1). The accuracy score is therefore 2 for both Panalba and Nurses.

In the case of the Artists’ Protest, the most popular choice by game theorists⁷ was option C (42 percent). Option C was a compromise between the status quo and the actual decision, which was the government conceding to the artists’ demands. This forecast could have alerted the parties to the likelihood that the actual decision would favour the artists. Score: 0.5.

⁷ n = 17, as one game theorist provided a probabilistic forecast.

In the case of the Distribution Channel conflict, most game theorists (54 percent) forecast a short-term trial for the scheme, but the actual decision was to make a long-term commitment. The forecast would not have been likely to lead the proposer of the scheme to expect a long-term commitment. Score: −0.5.

In the case of the 55% Pay Plan, the great majority of game theorists (88 percent) forecast a strike by the players, but their forecasts were evenly divided between the three strike options: short, medium, and long. Such a forecast would have led the parties to expect a strike but not necessarily a long strike, which is what occurred. Score: 0.

In the case of the Zenith Investment conflict, the majority of game theorists (56 percent) forecast that no new steel production plants would be purchased. The Board chose to purchase two new plants. The absolute majority game-theorist forecast was completely wrong in this instance. Score: −2.

As Table 4 shows, the total accuracy score for the game-theorist forecasts for the six conflicts is 2.0 out of a possible total of 12. Making a similar calculation for the role-play forecasts is straightforward: for all conflicts except the Artists’ Protest, an absolute majority of forecasts was correct and therefore the role-play forecasts score two points for each of these. The popular role-play decision for Artists’ Protest was option B, which was similar to the actual
decision.⁸ In fact, 13 of the 14 role-play decisions either matched the actual decision (option A) or were similar to it (options B and C). This implies a score of 0.5 for Artists’ Protest and a total accuracy score for role-playing of 10.5.

⁸ Personal communication from J. Scott Armstrong, 11 January 2002.

Table 4
Accuracy of forecasts: an alternative assessment. Accuracy scores: −2 to +2

                         Predictions by game theorists    Role-playing
Artists’ Protest           0.5                              0.5
Distribution Channel      −0.5                              2
55% Pay Plan               0                                2
Zenith Investment         −2                                2
Panalba Drug Policy        2                                2
Nurses Dispute             2                                2
Totals (unweighted)        2.0                             10.5

5.5. Cost of forecasts

Managers deciding between forecasting methods are often influenced by the relative costs of methods as well as their accuracy. When seeking to forecast decisions in a conflict, asking a neutral observer to describe the situation is likely to be a sensible starting point. A week (40 h) may be enough time for an experienced person to assemble a concise and accurate description of a situation. The actual time, however, will depend upon the importance and complexity of the problem and on any deadlines inherent in the situation. The time taken to compile a situation description is the major part of the fixed cost of forecasting.

The cost per forecast varies across the methods. Students were used for unaided judgement and role-play forecasts in this research. Students’ time is likely to cost, at most, one-sixth as much as that of experts. Although most of the forecasts reported in this paper took somewhat less than an hour to produce, in practice it would be difficult to recruit students or experts for less than one hour at a time for a commercial forecasting task. On this basis, the marginal cost of a game-theory forecast is six times more than the cost of an unaided-judgement forecast by a student. The number of role-players needed for a role-play forecast will vary depending on the situation and the judgement of the forecaster. A two-party conflict can be role-played using two students to represent each party; four students in total. The cost of a role-play forecast in this case would be two-thirds the cost of a game-theory forecast. The role-plays of the six situations used in this research involved an average of six role-players per simulation. With six role-players per simulation, the marginal cost is similar to that of the game-theory forecasts: the equivalent of one hour of expert time. Thus 10 independent role-play or game-theory forecasts for a single conflict may cost the equivalent of 50 h of expert time. Ten unaided-judgement forecasts by domain experts would have a similar cost, but the cost of using students as judges would be nearer the equivalent of 42 h of expert time.

6. Conclusions

The primary purpose of my research was to assess the relative worth of unaided judgement, game theory, and role-playing for predicting decisions made in real-world conflicts involving few players and high stakes. To do this, I combined new findings on the accuracy of game-theory experts’ predictions, role-play predictions, and predictions based on unaided judgement with role-play and unaided-judgement findings summarised in Armstrong (2001a). I then compared the accuracy of the forecasts from the different methods. The results support the view implied by Schelling (1961) and stated directly by Armstrong (2001a) that
role-playing will provide more accurate forecasts than other methods for forecasting decisions in conflicts because it provides more realistic representations.

The predictions of game-theory experts did offer an improvement over the traditional approach: their forecasts were more accurate, on average, than unaided judgement. Role-play predictions were better than chance and unaided judgement for all situations, and better than game-theory experts’ predictions for all but one situation. Game theorists’ forecasts, on the other hand, varied more widely in their accuracy than did role-play forecasts. Their forecasts were less accurate than the forecasts of people using unaided judgement for two of six situations and were no better than chance for four of the six situations. Game theorists are experts on conflicts, whereas the other research participants were not. Further research would be necessary to determine whether game-theory expertise confers any advantage over the unaided judgement of experts on conflicts who are not familiar with game theory. The cost of the forecasting methods is similar.

Acknowledgements

I am grateful for the cooperation of the game theory experts (listed in Appendix C) who contributed their judgements and their helpful comments on this research. Without their help, my research would not have been possible. I also thank J. Scott Armstrong, Urs Daellenbach, John Davies, six anonymous reviewers who provided helpful advice and criticism, David Harte and Shirley Pledger who provided advice on statistical methods, and Mary Haight who made many useful editorial suggestions. This paper is, in part, based on research funded by the Public Good Science Fund administered by the Foundation for Research Science and Technology (FRST contract: Vic 903). The contract is administered by Raymond Harbridge and Pat Walsh.

Appendix A. E-mail messages to game theorists

First e-mail message

Subject: Using game theory to predict the outcomes of conflicts.

Dear Dr X

I am writing to you because you are an expert in game theory. I am engaged on a research project which investigates the accuracy of different methods for predicting the outcomes of conflicts. What I would like you to do is to read each of the 5 attached descriptions of real conflict situations and to predict the outcomes of each conflict. The files contain both descriptions of the situations and of the individuals or parties involved.

Each file includes a short questionnaire. Space is provided for your prediction, and for a short description of the method you used to derive your prediction. You may assign probabilities to possible outcomes, rather than picking a single outcome, if you consider this to be appropriate. If you are unable to provide a prediction for a situation, please state why in the space provided in the questionnaire. The sixth file contains a questionnaire only. Please complete it when you have finished with the 5 situations.

I would appreciate it if you do not discuss the situations with other people, as I’d rather each participant provided an independent response. Although I intend to acknowledge all of the people, such as yourself, who help me with this research, my report will not associate any prediction with any individual.
Your prompt response is very important to the successful completion of my project. Please help me to prove the sceptics wrong about the level of cooperation I get!

Best regards, Kesten Green
School of Business and Public Management, Victoria University of Wellington, PO Box 5530, Wellington, New Zealand. Tel.: +64-4-499-2040; fax: +64-4-499-2080. e-mail: kesten.green@vux.ac.nz

Second e-mail message

Subject: Help with research on game theory.

Dear Dr Y

About three weeks ago I sent you an email asking for your help with research I am conducting on game theory. I really need more responses, and I wonder whether you might consider responding.

The material I originally sent is repeated below. If you are unable to read the attachments, please let me know and I will send you .txt files instead.

Regards
Kesten Green
etc.

Appendix B. Game theorist responses

Not appropriate to apply game theory to the problems provided (six responses)

One stated that the aim of what he does is “not to predict what shall happen here. This depends on the psychology of the players, which is not the object of mathematics. It is to give one of [the players] the quantitative tools that will let him act optimally according to his perceived interests”.

Another was of the opinion that “most/many theorists see GT as prescriptive rather than descriptive”, and therefore not an appropriate technique for “predict[ing] actual behaviour”. This respondent asserted that “game theory is a mediocre predictor of actual behaviour”, and that “I believe that the long-term ambition of (most) game theory to find optimal solutions to any decision problem is fundamentally misconceived”. Further, “finding a situation that can be well modelled by a game of chicken tells you a number of interesting things ... it does not give you a prediction of what the outcome will be with real players ... nor ... how idealised rational players should react”.

A third game theorist stated that he “did not see why [predicting decisions] is a game theory question”. In particular, the respondent objected that the request to predict the decision made in the one situation he had looked at, presumably the Panalba Drug Policy, “seemed ... a question about my opinions on company ethics”.

In a fourth expert’s opinion “the role of game theory in practical situations is not so much in computing the equilibrium, but rather a useful help in thinking the situation through”. His brief outline of what this would involve had a similar flavour to the approach recommended by Nalebuff and Brandenburger (1996). The respondent went on to write that “Game theory is a tool in understanding complex situations ... which forces you to think of the strategic aspects of the situation, but people do not always behave strategically, and one has to take that into account also”. In sum: “The best game theory ... can offer is to explain some phenomena, but I don’t see how it can predict the outcomes of real life situations”. This response was echoed by a fifth game theorist: “I am afraid our theoretical knowledge is not straightforwardly applicable to real-life problems”. A sixth wrote that she was “a game ‘theorist’ and not a strategic planner”, and further that she failed to “see any ‘game theory’ in [the] project”.
Insufficient information to derive a prediction (four responses)

One stated: “You have not provided sufficient information about preferences and institutions for me to identify a game-theoretic model and make a prediction from that”. The respondent was concerned that in order to “predict what ‘really’ might happen (rather than what a theoretical model would predict), I would need to know a lot more about the context in which the problems arose”.

Unresolved responses (51)

Twenty-five respondents stated that they could not read the MS-Word documents that contained the information on the situations and the summary questionnaire. I sent these respondents the information in the form they requested. Seventeen did not respond, four refused to participate or were on leave, and four returned completed questionnaires.

Eight respondents asked for more information about the researcher and the research. I sent replies to all those who had asked for more information but provided little extra information because in doing so I would have risked responses from this group being different from those of other respondents. As it happens, none of this group returned completed questionnaires.

As many as 44 experts responded promising help with the research; 13 of these respondents did so, and five later refused.

Appendix C. Game theorist respondents

The following respondents completed the set tasks (above the line) or provided useful comments on the research topic (below the line):

Manel Baucells         Holger Meinhardt           Maurice Salles
Emilio Calvo           Claudio Mezzetti           Giorgos Stamatopoulos
Gary Charness          Hannu Nurmi                Tristan Tomala
Bereket Kebede         Andre Rossi de Oliveira    Yelena Yanovskaya
Somdeb Lahiri          Ronald Peeters             Shmuel Zamir
Massimiliano Landi     Alex Possajennikov         José Zarzuelo
Andy McLennan          Eleuterio Prado            Anthony Ziegelmeyer
----------------------------------------------------------------------
Peter Bennett          Vito Fragnelli             Harold Houba
Pierre Bernhard        Herbert Gintis             Marc Kilgour
Steven Brams

References

Arkes, H. R. (2001). Overconfidence in judgmental forecasting. In Armstrong, J. S. (Ed.), Principles of forecasting: a handbook for researchers and practitioners. Norwell, MA: Kluwer Academic, pp. 495–515.
Armstrong, J. S. (1977). Social irresponsibility in management. Journal of Business Research, 5, 185–213.
Armstrong, J. S. (1987). Forecasting methods for conflict situation. In Wright, G., & Ayton, P. (Eds.), Judgemental forecasting. Chichester: Wiley, pp. 157–176.
Armstrong, J. S. (1997). Why can’t a game be more like a business?: a review of Co-opetition by Nalebuff and Brandenburger. Journal of Marketing, 61 (April), 92–95.
Armstrong, J. S. (2001a). Role playing: a method to forecast decisions. In Armstrong, J. S. (Ed.), Principles of forecasting: a handbook for researchers and practitioners. Norwell, MA: Kluwer Academic, pp. 15–30.
Armstrong, J. S. (2001b). Selecting forecasting methods. In Armstrong, J. S. (Ed.), Principles of forecasting: a handbook for researchers and practitioners. Norwell, MA: Kluwer Academic, pp. 365–386.
Armstrong, J. S. (2001c). Standards and practices for forecasting. In Armstrong, J. S. (Ed.), Principles of forecasting: a handbook for researchers and practitioners. Norwell, MA: Kluwer Academic, pp. 679–732.
Armstrong, J. S. (2001d). The forecasting dictionary. In Armstrong, J. S. (Ed.), Principles of forecasting: a handbook for researchers and practitioners. Norwell, MA: Kluwer Academic, pp. 761–824.
Armstrong, J. S., Brodie, R. J., & McIntyre, S. H. (1987). Forecasting methods for marketing: review of empirical research. International Journal of Forecasting, 3, 335–376.
Armstrong, J. S., & Hutcherson, P. D. (1989). Predicting the outcome of marketing negotiations. International Journal of Research in Marketing, 6, 227–239.
Babcock, L., Loewenstein, G., Issacharoff, S., & Camerer, C. (1995). Biased judgments of fairness in bargaining. The American Economic Review, 85(5), 1337–1343.
Batson, C. D., & Ahmad, N. (2001). Empathy-induced altruism in a prisoner’s dilemma II: what if the target of empathy has defected? European Journal of Social Psychology, 31(1), 25–36.
Bazerman, M. H. (1998). Judgment in managerial decision making, 4th ed. New York: Wiley.
Bennett, P. G. (1995). Making decisions in international relations: game theory and beyond. Mershon International Studies Review, 39, 19–52.
Bennett, P. G., & Huxham, C. S. (1982). Hypergames and what they do: a ‘soft O.R.’ approach. Journal of the Operational Research Society, 33(1), 41–50.
Bennett, P. G., & McQuade, P. (1996). Experimental dramas: prototyping a multiuser negotiation simulation. Group Decision and Negotiation, 5, 119–136.
Berg, T. L. (Ed.), (1970). Mismarketing: case histories of marketing misfires. New York: Doubleday, pp. 87–131.
Boyle, R. H. (1982). The 55% solution. Sports Illustrated, 1 February, 30.
Brams, S. J., & Togman, J. M. (2000). Agreement through threats: the Northern Ireland case. In Miroslav, N., & Lepgold, J. (Eds.), Being useful: policy relevance and international relations theory. Ann Arbor, MI: University of Michigan Press, pp. 325–342.
Bullock, A., & Trombley, S. (Eds.), (1999). The new fontana dictionary of modern thought, 3rd ed. London: Harper Collins.
Cyert, R. M., March, J. G., & Starbuck, W. H. (1961). Two experiments on bias and conflict in organisational estimation. Management Science, 7, 254–264.
Diekmann, A. (1993). Cooperation in an asymmetric volunteers dilemma game: theory and experimental evidence. International Journal of Game Theory, 22(1), 75–85.
Eliashberg, J., LaTour, S. A., Rangaswamy, A., & Stern, L. W. (1986). Assessing the predictive accuracy of two utility-based theories in a marketing channel negotiation context. Journal of Marketing Research, 23, 101–110.
Fisher, R. A. (1973). Statistical methods for research workers, 14th ed. New York: Hafner.
Ghemawat, P., & McGahan, A. M. (1998). Order backlogs and strategic pricing: the case of the US large turbine generator industry. Strategic Management Journal, 19(3), 255–268.
Ghosh, M., & John, G. (2000). Experimental evidence for agency models of salesforce compensation. Marketing Science, 19(4), 348–365.
Gibbons, R., & Van Boven, L. (2001). Contingent social utility in the prisoners’ dilemma. Journal of Economic Behavior and Organisation, 45(1), 1–17.
Gintis, H. (2000). Game theory evolving: a problem-centred introduction to modelling strategic behavior. Princeton, NJ: Princeton University Press.
Graef, R. (1976). Decision Steel. Grenada Colour Productions.
Gruca, T. S., Kumar, K. R., & Sudharshan, D. (1992). An equilibrium-analysis of defensive response to entry using a coupled response function model. Marketing Science, 11(4), 348–358.
Henderson, H. (1998). Viewing ‘the new economy’ from diverse forecasting perspectives. Futures, 30(4), 267–275.
Howard, N. (1994a). Drama theory and its relation to game theory. Part 1: Dramatic resolution vs. rational solution. Group Decision and Negotiation, 3, 187–206.
Howard, N. (1994b). Drama theory and its relation to game theory. Part 2: Formal model of the resolution process. Group Decision and Negotiation, 3, 207–235.
Jehiel, P. (1998). Repeated games and limited forecasting. European Economic Review, 42(3–5), 543–551.
Keser, C., & Gardner, R. (1999). Strategic behavior of experienced subjects in a common pool resource game. International Journal of Game Theory, 28(2), 241–252.
Kirshenbaum, J. (1982). Right destination, wrong track. Sports Illustrated, 1 February, 7.
McCabe, K. A., & Smith, V. L. (2000). A comparison of naïve and sophisticated subject behavior with game theoretic predictions. Proceedings of the National Academy of Sciences USA, 97(7), 3777–3781.
Mintz, M. (1969). F.D.A. and Panalba: a conflict of commercial and therapeutic goals. Science, 165, 815–881.
Nalebuff, B. J., & Brandenburger, A. M. (1996). Co-opetition. London: Harper Collins.
Neslin, S. A., & Greenhalgh, L. (1983). Nash’s theory of cooperative games as a predictor of the outcomes of buyer–seller negotiations. Journal of Marketing Research, 20, 368–379.
Newman, B. (1982). Artists in Holland survive by selling to the government. The Wall Street Journal, 7 January, 1.
Organski, A. F. K. (2000). The outcome of the negotiations over the status of Jerusalem: a forecast. In Miroslav, N., & Lepgold, J. (Eds.), Being useful: policy relevance and international relations theory. Ann Arbor, MI: University of Michigan Press, pp. 343–359.
Parmigiani, S., Ferrari, P. F., & Palanza, P. (1998). An evolutionary approach to behavioral pharmacology: using drugs to understand proximate and ultimate mechanisms of different forms of aggression in mice. Neuroscience and Biobehavioral Review, 23(2), 143–153.
Rapoport, A., & Orwent, C. (1962). Experimental games: a review. Behavioral Science, 7(1), 1–37.
Reisman, A., Kumar, A., & Motwani, J. G. (2001). A meta review of game theory publications in the flagship US-based OR/MS journals. Management Decision, 39(2), 147–155.
Sandholm, W. H. (1998). History-independent prediction in evolutionary game theory. Rational Society, 10(3), 303–326.
Schelling, T. C. (1961). Experimental games and bargaining theory. World Politics, XIV(1), 47–68.
Schwenk, C. R. (1995). Strategic decision-making. Journal of Management, 21(3), 471–493.
Shubik, M. (1975). Games for society, business and war. Amsterdam: Elsevier.
Siegel, S., & Castellan, Jr. N. J. (1988). Nonparametric statistics for the behavioral sciences, 2nd ed. Singapore: McGraw-Hill.
Sonnegard, J. (1996). Determination of first movers in sequential bargaining games: an experimental study. Journal of Economic Psychology, 17(3), 359–386.
Statman, M., & Tyebjee, T. T. (1985). Optimistic capital budgeting forecasts: an experiment. Financial Management, Autumn, 27–33.
Suleiman, R. (1996). Expectations and fairness in a modified Ultimatum game. Journal of Economic Psychology, 17(5), 531–554.
Zajac, E. J., & Bazerman, M. H. (1991). Blind spots in industry and competitor analysis: implications of interfirm (mis)conceptions for strategic decisions. Academy of Management Review, 16(1), 37–56.

Biography: Kesten GREEN is a Research Associate of the Industrial Relations Centre at the School of Business and Public Management at Victoria University of Wellington and Managing Director of Decision Research Ltd, an independent research company. Kesten was also a co-founder of the New Zealand economic forecasting and consulting firm Infometrics Ltd.