
GAMES AND ECONOMIC BEHAVIOR 20, 102-116 (1997)

ARTICLE NO. GA970577

The Absent-Minded Driver*

Robert J. Aumann, Sergiu Hart, and Motty Perry

Center for Rationality and Interactive Decision Theory, The Hebrew University of Jerusalem, Feldman Building, Givat-Ram, 91904 Jerusalem, Israel†

Received February 6, 1996

The example of the “absent-minded driver” was introduced by Piccione and Rubinstein in the context of games and decision problems with imperfect recall. They claim that a “paradox” or “inconsistency” arises when the decision reached at the “planning stage” is compared with that at the “action stage.” Though the example is provocative and worth having, their analysis is questionable. A careful analysis reveals that while the considerations at the planning and action stages do differ, there is no paradox or inconsistency. Journal of Economic Literature Classification Numbers: D81, C72.

© 1997 Academic Press

1. INTRODUCTION

An absent-minded driver starts driving at START in Figure 1. At X he can either EXIT and get to A (for a payoff of 0) or CONTINUE to Y. At Y he can either EXIT and get to B (payoff 4), or CONTINUE to C (payoff 1). The essential assumption is that he cannot distinguish between intersections X and Y, and cannot remember whether he has already gone through one of them.

Piccione and Rubinstein (1997; henceforth P & R), who introduced this example, claim that a “paradox” or “inconsistency” arises when the decision reached at the planning stage (at START) is compared with that at the action stage (when the driver is at an intersection). Though the example is provocative and worth having, P & R’s analysis seems flawed. A careful analysis reveals that while the considerations at the planning and action stages do differ, there is no paradox or inconsistency.

* This is an outgrowth of notes and correspondence originating in May and June of 1994. We thank Ehud Kalai, Roger Myerson, Phil Reny, Ariel Rubinstein, Dov Samet, Larry Samuelson, and Asher Wolinsky for discussions on these topics, and the Associate Editor and the referee for some insightful comments. Research partially supported by grants of the U.S.-Israel Binational Science Foundation, the Israel Academy of Sciences and Humanities, and the Game Theory Program at SUNY-Stony Brook.

† E-mail: ratio@vms.huji.ac.il.


0899-8256/97 $25.00
Copyright © 1997 by Academic Press
All rights of reproduction in any form reserved.


FIG. 1. The absent-minded driver problem.

We start in Section 2 by laying down the fundamental observations that underlie the driver’s decision problem, and then show in Section 3 how P & R’s analysis violates these observations. In Section 4 we formally define the concept of action-optimality and use it to analyze P & R’s example. Section 5 studies action-optimality in a more general setup, with some interesting and unexpected conclusions. We conclude with a detailed discussion of various issues in Section 6.

2. FUNDAMENTALS

At the planning stage, the decision problem is straightforward. In the example, the optimal randomized decision¹ is “CONTINUE with probability 2/3 and EXIT with probability 1/3.” We call this the planning-optimal decision.

¹ The problem is to maximize $(1-p)\cdot 0 + p(1-p)\cdot 4 + p^2\cdot 1$ over $p$, where $p$ is the probability of CONTINUE.
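As a quick check of the planning-stage optimum, the following Python sketch evaluates the objective from footnote 1 with exact rational arithmetic. The function name `planning_payoff` and the grid search are illustrative choices, not part of the original analysis.

```python
from fractions import Fraction

def planning_payoff(p):
    """Expected payoff at START when CONTINUE has probability p at each intersection."""
    return (1 - p) * 0 + p * (1 - p) * 4 + p * p * 1

# The objective simplifies to 4p - 3p^2, so the first-order condition
# 4 - 6p = 0 gives the candidate maximizer p = 2/3.
p_opt = Fraction(2, 3)

# Cross-check on a grid of exact rationals that contains 2/3.
grid = [Fraction(k, 300) for k in range(301)]
best_on_grid = max(grid, key=planning_payoff)

print(best_on_grid, planning_payoff(best_on_grid))  # 2/3 4/3
```

The grid is only a sanity check; the first-order condition already pins down the maximizer because the objective is a concave quadratic.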


At the action stage, though, even formulating the decision problem is not straightforward. The following observations are essential for a correct analysis of the decision at the action stage.²

• First, the driver makes a decision at each intersection through which he passes. Moreover, when at one intersection, he can determine the action only there, and not at the other intersection, where he isn’t.

• Second, since he is in completely indistinguishable situations at the two intersections, whatever reasoning obtains at one must obtain also at the other, and he is aware of this.

3. THE P & R ANALYSIS

Consider the action stage. The driver finds himself at an intersection; he does not know which. Let $\alpha$ be the probability that X is the current intersection, and let $p$ and $q$ be the probabilities of CONTINUE at the current and at “other” intersections, respectively. Then the expected payoff at the action stage is

\[ H(p,q,\alpha) := \alpha\,[(1-p)\cdot 0 + p(1-q)\cdot 4 + pq\cdot 1] + (1-\alpha)\,[(1-p)\cdot 4 + p\cdot 1]. \]

P & R maximize

\[ H(p,p,\alpha) = \alpha\,[(1-p)\cdot 0 + p(1-p)\cdot 4 + p^2\cdot 1] + (1-\alpha)\,[(1-p)\cdot 4 + p\cdot 1] \]

over $p$, holding $\alpha$ fixed. Thus they take $p$ and $q$ as decision variables to be maximized simultaneously, subject to the constraint $q = p$. This makes sense only if the driver controls the probabilities at both intersections, which violates the first observation. But even if, by some magical process, the driver could control the probability $q$ at the other intersection, surely $\alpha$ depends on $q$, and cannot be held fixed in the maximization!
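To make the objection concrete, here is a small Python sketch of the function $H$ defined in this section. It holds $\alpha$ fixed, maximizes $H(p,p,\alpha)$ over $p$, and then recomputes what $\alpha$ would be if the other intersection really used the maximizer. The belief formula $\alpha = 1/(1+q)$ is the one derived in Section 4; the helper names and the grid are assumptions made for illustration.

```python
from fractions import Fraction

def H(p, q, alpha):
    """Action-stage payoff: p here, q at the other intersection, alpha = P(at X)."""
    at_x = (1 - p) * 0 + p * (1 - q) * 4 + p * q * 1
    at_y = (1 - p) * 4 + p * 1
    return alpha * at_x + (1 - alpha) * at_y

def belief_at_x(q):
    """P(current intersection is X) when CONTINUE has probability q elsewhere."""
    return 1 / (1 + q)

grid = [Fraction(k, 300) for k in range(301)]

# Hold alpha at the value implied by q = 2/3 and maximize H(p, p, alpha) over p.
alpha_fixed = belief_at_x(Fraction(2, 3))
p_joint = max(grid, key=lambda p: H(p, p, alpha_fixed))

# If the other intersection really used p_joint, alpha would no longer be alpha_fixed.
alpha_implied = belief_at_x(p_joint)

print(alpha_fixed, p_joint, alpha_implied)  # 3/5 1/3 3/4
```

The mismatch between the assumed 3/5 and the implied 3/4 is the inconsistency the text points to: the belief cannot be held fixed while $q$ is being varied.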

4. ACTION-OPTIMALITY

How, then, should the driver reason at the action stage? Let us spell out in detail the implications of the two observations in Section 2:

(i) The optimal decision is the same at both intersections; call it $p^*$.

(ii) Therefore, at each intersection, the driver believes that $p^*$ is chosen at the other intersection.

(iii) At each intersection, the driver optimizes his decision given his beliefs. Therefore, choosing $p$ at the current intersection to be $p^*$ must be optimal given the belief that $p^*$ is chosen at the other intersection.

² One can imagine scenarios for which these observations do not hold. But such scenarios do not correspond to the plain meaning of the words used to describe the situation. More important, with those other scenarios the analysis at the planning stage also changes, and again there is no paradox. Piccione and Rubinstein have yet to adduce an explicit scenario that does display a paradox. See Section 6(c) for more on this issue.

Given a behavior $q$ at the other intersection, the probability that the current intersection is X is $\alpha = 1/(1+q)$.³ Thus if we set $h(p,q) := H(p,q,1/(1+q))$, we can restate the final implication (iii) as follows:

    $p^*$ is action-optimal if the maximum of $h(p,p^*)$ over $p$ is attained at $p = p^*$.

Thus, $p^*$ is a fixed point⁴ of the set-valued mapping $q \to \arg\max_p h(p,q)$.⁵

Applying this analysis to the example, we see that the randomized planning-optimal decision (CONTINUE with probability 2/3) is also action-optimal. Indeed, if this is the behavior at the other intersection, then the probability that the current intersection is X is $\alpha = 3/5$. Therefore the expected payoff from choosing CONTINUE at the current intersection with probability $p$ is

\[ h(p,2/3) = (3/5)\cdot[(1-p)\cdot 0 + p\cdot(1/3)\cdot 4 + p\cdot(2/3)\cdot 1] + (2/5)\cdot[(1-p)\cdot 4 + p\cdot 1], \]

which equals 8/5 for all $p$. So $p = 2/3$ maximizes it; thus $p^* = 2/3$ is action-optimal.

So there is no paradox: the planning-optimal choice of 2/3 is also action-optimal.
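The claim that $h(p,2/3)$ equals 8/5 for every $p$ can be checked directly. The sketch below uses exact fractions; the function name `h` mirrors the text, while the sample points are an arbitrary illustrative choice.

```python
from fractions import Fraction

def h(p, q):
    """Expected payoff at the current intersection: p here, q at the other one."""
    alpha = 1 / (1 + q)  # belief of being at X
    at_x = (1 - p) * 0 + p * (1 - q) * 4 + p * q * 1
    at_y = (1 - p) * 4 + p * 1
    return alpha * at_x + (1 - alpha) * at_y

q_star = Fraction(2, 3)

# Evaluate h(p, 2/3) at p = 0, 1/10, ..., 1 and collect the distinct values.
values = {h(Fraction(k, 10), q_star) for k in range(11)}

print(values)  # {Fraction(8, 5)}
```

Since the payoff does not depend on $p$, every $p$ is a best reply to $q = 2/3$, and in particular $p = 2/3$ is.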

Moreover, $p^* = 2/3$ is the unique action-optimal decision. Indeed,

\[ h(p,q) = [(1-p)\cdot 0 + p(1-q)\cdot 4 + pq\cdot 1]\,\frac{1}{1+q} + [(1-p)\cdot 4 + p\cdot 1]\,\frac{q}{1+q} = \frac{(4-6q)p + 4q}{1+q}. \]

³ This probability is the “consistent belief” of P & R; but unlike P & R, we consider it to be the one that is appropriate to this problem. In the Appendix we derive it from a formal model. Informally, since the driver always goes through X, but only $q$ of the time through Y, the ratio of the probabilities is 1 to $q$, so they must be $1/(1+q)$ and $q/(1+q)$. These probabilities may also be derived from a “fair lottery” approach (see Footnote 3 in Aumann, Hart, and Perry, 1997).

⁴ Formally, $(p^*,p^*)$ is a symmetric Nash equilibrium in the symmetric game between “the driver at the current intersection” and “the driver at the other intersection” (the strategic form game with payoff functions $h$).

⁵ For the pure case, see Section 6(d) below.


FIG. 2. Multiple action-optimal decisions.

Given $q$, the maximizing $p$ therefore is: 0 for $q > 2/3$; 1 for $q < 2/3$; and anything for $q = 2/3$. Thus the only fixed point is $p^* = 2/3$.
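The fixed-point argument can be sketched in Python. Because $h(p,q)$ is linear in $p$, the sign of its slope determines the best reply, and a candidate $p^*$ is action-optimal exactly when it is a best reply to itself. The helper `is_action_optimal` and the candidate grid are illustrative assumptions.

```python
from fractions import Fraction

def h(p, q):
    """Expected payoff at the current intersection: p here, q at the other one."""
    alpha = 1 / (1 + q)
    at_x = (1 - p) * 0 + p * (1 - q) * 4 + p * q * 1
    at_y = (1 - p) * 4 + p * 1
    return alpha * at_x + (1 - alpha) * at_y

def is_action_optimal(p_star):
    """True if p_star maximizes h(., p_star); uses linearity of h in its first argument."""
    slope = h(1, p_star) - h(0, p_star)
    if slope > 0:
        return p_star == 1   # only CONTINUE for sure is a best reply
    if slope < 0:
        return p_star == 0   # only EXIT for sure is a best reply
    return True              # every p is a best reply

candidates = [Fraction(k, 60) for k in range(61)]
fixed_points = [p for p in candidates if is_action_optimal(p)]

print(fixed_points)  # [Fraction(2, 3)]
```

The grid includes the pure decisions 0 and 1, neither of which is a best reply to itself.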

The notion of action-optimality defined here is mathematically identical to the “modified multiselves approach” described near the end (Section 7) of P & R. But unlike P & R, we consider this notion to be the natural and correct formulation of the driver’s decision problem at the action stage. See Section 6 below for further discussion.

5. A MORE CHALLENGING EXAMPLE, WITH MULTIPLE

ACTION-OPTIMAL DECISIONS

We have seen that, in the specific example of P & R, randomized planning optimality and action optimality coincide. This is not always so! While, in general, any planning-optimal randomized decision is also action-optimal,⁶ we will now show that there may be action-optimal choices that are not planning-optimal. Surprisingly, there may even be action-optimal choices that, at the intersections, look better than the planning-optimal choice.

⁶ Proposition 3 of P & R and also the end of the Appendix below.


FIG. 3. A more challenging example.

For a simple example with action-optimal decisions that are not planning-optimal, change the payoffs to be 1 at A, 0 at B, and 2 at C; see Fig. 2. The unique planning-optimal choice is CONTINUE, i.e., $p = 1$. There are, however, three action-optimal decisions: $p = p_1^* = 1$, $p_2^* = 0$, and $p_3^* = 1/4$ (e.g., to see that $p_2^* = 0$ is action-optimal, note that if the decision at the other intersection is EXIT, then it is indeed optimal to EXIT now too).
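The three action-optimal decisions for the modified payoffs can be recovered with the same best-reply test as in Section 4. In the sketch below the payoff constants follow the text (1 at A, 0 at B, 2 at C); the function names and the grid are illustrative assumptions.

```python
from fractions import Fraction

U_A, U_B, U_C = 1, 0, 2  # payoffs of the Fig. 2 variant

def f(p):
    """Expected payoff at START when CONTINUE has probability p at both intersections."""
    return (1 - p) * U_A + p * (1 - p) * U_B + p * p * U_C

def h(p, q):
    """Expected payoff at the current intersection, with beliefs 1/(1+q) and q/(1+q)."""
    at_x = (1 - p) * U_A + p * (1 - q) * U_B + p * q * U_C
    at_y = (1 - p) * U_B + p * U_C
    return (at_x + q * at_y) / (1 + q)

def is_action_optimal(p_star):
    slope = h(1, p_star) - h(0, p_star)  # h(., q) is linear in p
    if slope > 0:
        return p_star == 1
    if slope < 0:
        return p_star == 0
    return True

grid = [Fraction(k, 20) for k in range(21)]
planning_optimal = max(grid, key=f)
action_optimal = [p for p in grid if is_action_optimal(p)]

print(planning_optimal)                  # 1
print([str(p) for p in action_optimal])  # ['0', '1/4', '1']
```

Evaluating `h(p, p)` at the three fixed points gives 1, 4/5, and 2, so in this two-intersection example the planning-optimal decision also looks best at the intersections.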

This leads us to the next point: When there are multiple action-optimal choices, how do their payoffs compare? Of course, when computed at START, the one that is planning-optimal yields the maximum payoff. But how does it look at the current intersection? In the example above, $p = p_1^* = 1$ yields the highest payoff among all the action-optimal decisions also when compared at the intersections; i.e., $h(p_1^*,p_1^*) > h(p_i^*,p_i^*)$ for $i = 2, 3$. But this is not always so when there are more than two intersections. Indeed, consider an example with three intersections, the payoff being 7 at the first EXIT, 0 at the second EXIT, 22 at the third EXIT, and 2 if always CONTINUE (see Fig. 3). Then:

(i) The unique planning-optimal decision is $p = 0$ (with a payoff of 7).

(ii) There are three action-optimal decisions: $p = p_1^* = 0$, $p_2^* = 7/30$, and $p_3^* = 1/2$.

(iii) The ex-ante expected payoffs for $p_1^*$, $p_2^*$, and $p_3^*$ are, respectively, 7, 8519/1350 ≈ 6.31, and 6.5.

(iv) The ex-post expected payoffs $h(p^*,p^*)$ for $p_1^*$, $p_2^*$, and $p_3^*$ are, respectively, 7, 7378/1159 ≈ 6.37, and 50/7 ≈ 7.14; thus $h(p_3^*,p_3^*)$ is larger than $h(p_1^*,p_1^*) \equiv h(p,p)$.
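The numbers in (i) to (iv) can be reproduced with exact arithmetic. The sketch below assumes that the two-intersection beliefs generalize in the natural way, so that the chances of being at X, Y, Z are proportional to $1 : q : q^2$ (for $q = 1/2$ this gives 4/7, 2/7, 1/7). All function names are hypothetical.

```python
from fractions import Fraction

EXIT_PAYOFFS = [7, 0, 22]  # payoff for exiting at the 1st, 2nd, 3rd intersection
THROUGH_PAYOFF = 2         # payoff if CONTINUE is chosen at all three

def payoff_from(k, probs):
    """Expected payoff from intersection k (0-based); probs[j] = P(CONTINUE) at j."""
    total, reach = Fraction(0), Fraction(1)
    for j in range(k, 3):
        total += reach * (1 - probs[j]) * EXIT_PAYOFFS[j]
        reach *= probs[j]
    return total + reach * THROUGH_PAYOFF

def ex_ante(p):
    """Expected payoff at START when p is used at every intersection."""
    return payoff_from(0, [p, p, p])

def h(p, q):
    """Expected payoff at the current intersection: p here, q at the other two."""
    weights = [Fraction(1), q, q * q]  # assumed relative chances of X, Y, Z
    total = Fraction(0)
    for k, w in enumerate(weights):
        probs = [q, q, q]
        probs[k] = p
        total += w * payoff_from(k, probs)
    return total / sum(weights)

def is_action_optimal(p_star):
    slope = h(1, p_star) - h(0, p_star)  # h(., q) is linear in p
    if slope > 0:
        return p_star == 1
    if slope < 0:
        return p_star == 0
    return True

grid = [Fraction(k, 30) for k in range(31)]
action_optimal = [p for p in grid if is_action_optimal(p)]

print([str(p) for p in action_optimal])           # ['0', '7/30', '1/2']
print([str(ex_ante(p)) for p in action_optimal])  # ['7', '8519/1350', '13/2']
print([str(h(p, p)) for p in action_optimal])     # ['7', '7378/1159', '50/7']
```

The grid step of 1/30 is chosen so that both interior fixed points lie on it; a coarser grid would miss 7/30.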

The reader may ask, since the choice is among three possibilities yielding 7, ≈6.37, or ≈7.14, why does the driver not choose the action with the highest yield, namely $p_3^*$? The answer, of course, is that at the action stage, the driver cannot choose among $p_1^*$, $p_2^*$, and $p_3^*$. His beliefs there are not under his control; he cannot choose what to believe. Action optimality is a condition for consistency of beliefs and rational behavior. If the player is to be consistent at the action stage, he must believe in one of the three possibilities $p_1^*$, $p_2^*$, $p_3^*$; but which one is not up to him at that stage.

We have already pointed to the formal similarity between action-optimality and game equilibria. Choosing between the $p_i^*$ is much like choosing between game equilibria, which is something that the individual player in a game cannot do; it must be done by an outside force (like a custom or a norm), or by all the players somehow coordinating their actions.

In our case, there is only one player, who acts at different times. Because of his absent-mindedness, he had better coordinate his actions; this coordination can take place only before he starts out, at the planning stage. At that point, he should choose $p_1^*$. If indeed he chose $p_1^*$, there is no problem. If by mistake he chose $p_2^*$ or $p_3^*$, then that is what he should do at the action stage. (If he chose something else, or nothing at all, then at the action stage he will have some hard thinking to do.)

Once having coordinated on $p_1^*$, there is no incentive for the driver to choose $p_3^*$ at the action stage. Nevertheless, one may ask whether at that stage he will be sorry that he did not coordinate on $p_3^*$ rather than on $p_1^*$. After all, if he had, his expectation now would be ≈7.14, which is greater than the 7 he now expects! If the answer is “yes,” we have a conceptual puzzle; why should the driver get himself into a situation where at START he is sure that he will be sorry at every intersection he reaches? Why not avoid the sorrow by coordinating on $p_3^*$ in the first place?


FIG. 4. An automatic car.

But the answer is “no”; he should not be sorry. Having chosen $p_1^*$, he knows he must be at X when finding himself at an intersection. Being at X is like being at START (i.e., at the planning stage), and then the best choice is $p_1^*$ and only $p_1^*$. So when reaching an intersection after having chosen $p_1^*$, the driver is not sorry that he indeed chose $p_1^*$ rather than $p_3^*$.

Why, then, is the driver’s expectation at an intersection nevertheless larger for $p_3^*$ than for $p_1^*$? The reason is that at an intersection, his belief as to where he is if he chose $p_3^*$ differs from his belief when he chose $p_1^*$. Having chosen $p_1^*$, he knows he must be at X. If he had chosen $p_3^*$, he would have attributed probabilities 4/7, 2/7, and 1/7 to being at X, Y, and Z. He “prefers” the latter distribution, because it gives him a chance of already having passed the “dangerous” intersection Y and a better shot at the high payoff of 22. But as we said above, one cannot choose one’s beliefs, and it makes little sense to discuss “preferences” between them. Specifically, since he does know that he is at X, it would be silly for him to say, “I wish I had chosen the other plan, because then in my ignorance I would have been deluded into expecting a higher payoff than now.”

To clarify this point, consider the example in Fig. 4. The car is automatic and EXITS with probability 1/2 at each intersection. The decision maker is a passenger, who sleeps during most of the trip. At START, he is given the option to be woken either at both intersections, or only at X. In the first option he is absent-minded: when waking up, he does not know at which intersection he is. We call the second option “clear-headedness.”

FIG. 5. Clear-headed or absent-minded?

As in the previous discussion, the question at X is not operative (what to do) but only whether it makes sense to be “sorry.” If he chose clear-headedness, his expectation upon reaching X is 1/4. If he had chosen absent-mindedness, then when reaching X he would have attributed probability 2/3 to being at X and 1/3 to being at Y. Therefore his expected payoff at that point would have been $(2/3)\cdot(1/4) + (1/3)\cdot(1/2) = 1/3$, which is larger than 1/4. Is he therefore sorry that he chose to be clear-headed? Clearly this would be absurd, as the payoffs do not depend on his choice.

To make this even more striking, assume that when he is not absent-minded, the probability of CONTINUE is increased to 4/7 (Fig. 5). Then ex-ante the clear-headed option is actually preferred to the absent-minded option: it yields 16/49 rather than 1/4. But, upon reaching X, the clear-headed option still yields 16/49, whereas the absent-minded option yields 1/3 (as above), which is bigger than 16/49. Surely, it would be absurd for the decision maker to wish he were absent-minded; it would be sticking his head in the sand!
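The arithmetic of the two automatic-car comparisons is easy to check. The figures themselves are not reproduced in the text, so the sketch below assumes a payoff of 1 if the car continues through both intersections and 0 otherwise; this assumption is inferred from the stated values 1/4, 1/2, and 16/49, and the variable names are hypothetical.

```python
from fractions import Fraction

half = Fraction(1, 2)

# Assumption: payoff 1 only if the car continues past both X and Y, else 0.
value_at_x = half * half  # must continue twice: 1/4
value_at_y = half         # must continue once:  1/2

# Absent-minded beliefs on waking: X is always visited, Y only half the time.
weight_x, weight_y = Fraction(1), half
belief_x = weight_x / (weight_x + weight_y)
belief_y = weight_y / (weight_x + weight_y)
absent_minded_on_waking = belief_x * value_at_x + belief_y * value_at_y

# Fig. 5 variant: the clear-headed option continues with probability 4/7.
clear_headed_fig5 = Fraction(4, 7) ** 2

print(belief_x, belief_y)        # 2/3 1/3
print(absent_minded_on_waking)   # 1/3
print(clear_headed_fig5)         # 16/49
```

Under this assumption the ordering is 1/4 < 16/49 < 1/3: the clear-headed option of Fig. 5 beats absent-mindedness ex ante, yet the absent-minded expectation on waking is still the larger number.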

Another issue that these examples raise is this⁷: Assume the driver in Fig. 3 has chosen $p_3^*$ at START. His expected payoff there is then 6.5. But, he is certain to go through X, where his expected payoff becomes ≈7.14. For a treatment of this issue (which is entirely different from that of the current paper) please see Aumann, Hart, and Perry (1997).

6. DISCUSSION

This section elaborates on a number of different matters.

(a) Decision Points. Part of the problem in P & R’s analysis is in their interpretation of information sets. Recall that the extensive form describes the way a game is played. The play proceeds from one node to the next, as each player is called upon to make a choice whenever a decision node of his is reached. Of course, when asked to make a choice, he possesses certain information. This is accurately described by information sets: two decision nodes where a player’s information is the same belong to the same information set. But decisions are made at nodes, not at information sets (cf. the first observation of Section 2).

In games with perfect recall, a given information set can be reached only once (at a single node) in any one play of the game. Therefore, there is no harm in identifying decision points with information sets in such games, though even there the decision point is basically a node. But in games with absent-mindedness, when an information set may be visited more than once, it is simply incorrect to identify decision points with information sets.

(b) Control. At each intersection, the driver “expects” that he will do the same at the other intersection. He expects it, and maximizes given that expectation, and the maximizing behavior turns out the same as the expectation; that is precisely action-optimality. But expecting is not determining. He cannot, in fact, determine it; he cannot control what happens at the other intersection.

In their Section 7, P & R discuss what they call the “modified multi-selves approach,” which is the same as our notion of action-optimality. They write that this approach “assumes that a decision maker, upon reaching an information set, takes his actions to be immutable at future occurrences of that information set. ... At the other extreme one finds the opposite axiom for which only one self resides in the information set and expects that, were the information set to occur again, he would adopt whichever behavior rule he adopts now” (their italics).

⁷ We are grateful to the associate editor in charge of the current paper for raising this question.

Piccione and Rubinstein’s “opposite extremes” are in fact identical. The key element is control. At the action stage, once the driver reaches an intersection, there is no way that he can control or even affect what he does at the other intersection. So from his viewpoint, here and now, his future action really is “immutable.” But that does not contradict P & R’s “opposite axiom” (the “one self” approach). As we said above, expecting to do the same at the other intersection does not imply that the driver can determine here what happens there.

While P & R’s “opposite axiom” is correct as written, their formalization of it is inappropriate. This formalization, which leads to “EXIT with probability 5/9,” is based on the incorrect assumption that at the first intersection X the driver can control what he does at the second intersection Y.
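The figure of 5/9 can be reproduced from the function $H$ of Section 3. The sketch below is one reading of that formalization, stated as an assumption: for a belief $\alpha$ held fixed, maximize $H(p,p,\alpha)$ over $p$, and then require that $\alpha = 1/(1+p)$ be the consistent belief for the maximizer itself. The helper names and the grid are hypothetical.

```python
from fractions import Fraction

def H(p, q, alpha):
    """Action-stage payoff: p here, q at the other intersection, alpha = P(at X)."""
    at_x = (1 - p) * 0 + p * (1 - q) * 4 + p * q * 1
    at_y = (1 - p) * 4 + p * 1
    return alpha * at_x + (1 - alpha) * at_y

grid = [Fraction(k, 90) for k in range(91)]

def joint_maximizer(alpha):
    """Maximize H(p, p, alpha) over p on the grid, holding alpha fixed."""
    return max(grid, key=lambda p: H(p, p, alpha))

# Keep the candidates c whose consistent belief 1/(1+c) leads back to c itself.
consistent = [c for c in grid if joint_maximizer(1 / (1 + c)) == c]

print([str(c) for c in consistent])      # ['4/9']
print([str(1 - c) for c in consistent])  # ['5/9']
```

CONTINUE with probability 4/9 corresponds to EXIT with probability 5/9, which differs from the planning-optimal and action-optimal value of 1/3 for EXIT.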

(c) Consistent Analysis. Conceivably, P & R could challenge the first observation in Section 2: that at the action stage, the driver can control only what he does at the current intersection. Perhaps, after all, by some unspecified psychic process, he can control also what he does at a subsequent intersection. But in that case, he will have exercised this control already at the first intersection, and anything that he may think he is doing at the second intersection has no real effect. This makes the analyses at the action and planning stages identical, and then surely there is no paradox.

Another possibility is that the mysterious psychic process affects the decision at the subsequent intersection (say with some probability) but does not fully determine it. To analyze this possibility one would have to spell out just how the process works, and take it into account in the planning stage.

One could also consider a model in which the driver gets at most one chance to change his plan: at the first or second intersection, but not at both. In that case, too, he should take this into account in the planning stage, and again no paradox results.

In brief: One may consider various different scenarios. Whatever its specifications are, the precise scenario must be taken into account at the planning as well as at the action stage. The analyses at the action and planning stages must be consistent; they must analyze the same scenario.

(d) The Pure Case. Piccione and Rubinstein (P & R) probably agree that the pure case is not particularly interesting. Be that as it may, it turns out that in that case there is no action-optimal decision. If the driver EXITS at the other intersection, he should CONTINUE here, and if he CONTINUES at the other intersection, he should EXIT here.⁸ (Formally, $h(p,1)$ is maximized at $p = 0$, and $h(p,0)$ at $p = 1$.)

Even though this is a one-person decision problem, randomized behavior is necessary at the action stage, because there are two independent decisions there. So how should the driver behave at the action stage? There is no clear answer. How do you play “Matching Pennies” when limited to pure strategies?
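The absence of a pure action-optimal decision can be seen by tabulating the pure best replies. The sketch uses the function $h$ of Section 4 with the original payoffs (0 at A, 4 at B, 1 at C); encoding EXIT as 0 and CONTINUE as 1 is an illustrative convention.

```python
from fractions import Fraction

def h(p, q):
    """Expected payoff at the current intersection: p here, q at the other one."""
    alpha = Fraction(1, 1 + q)  # belief of being at X
    at_x = (1 - p) * 0 + p * (1 - q) * 4 + p * q * 1
    at_y = (1 - p) * 4 + p * 1
    return alpha * at_x + (1 - alpha) * at_y

EXIT, CONTINUE = 0, 1

# Best pure reply here to each pure decision at the other intersection.
best_reply = {q: max((EXIT, CONTINUE), key=lambda p: h(p, q))
              for q in (EXIT, CONTINUE)}

print(best_reply)  # {0: 1, 1: 0}

# A pure action-optimal decision would have to be its own best reply.
pure_fixed_points = [q for q, p in best_reply.items() if p == q]
print(pure_fixed_points)  # []
```

Each pure decision is answered by the opposite one, which is the Matching Pennies structure the text mentions.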

(e) Tying Knots. There is one particular scenario in Figure 1 that deserves further attention. Assume that the driver has a handkerchief in his pocket. Whenever he goes through an intersection, he ties a knot in the handkerchief, if there was no knot; or he unties the knot, if there was one. At the beginning (i.e., at START), it is equally probable that the handkerchief had or did not have a knot. The driver, absent-minded as he is, does not remember which was the case.

Thus, at each one of the two intersections, the probability of having a knot in the handkerchief is 1/2. Therefore the driver does not learn anything about the intersection from the fact that there is or there is not a knot. The condition that “he does not know at which intersection he currently is” is satisfied.

However, the handkerchief allows the driver to use the following strategy: “EXIT if there is a knot, CONTINUE if there is not.” This is clearly better than anything he can do without the handkerchief (it yields a payoff of 2). The handkerchief has made it possible to separate the intersections without identifying them. It serves as an external correlation device. Of course, other things could be used, like sunspots, policemen, and so on (for instance, assume the single policeman in town chooses at random at which intersection to be).
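The payoff of 2 can be verified by enumerating the two equally likely initial states of the handkerchief. The sketch assumes the driver looks at the handkerchief on arriving at an intersection, decides, and flips its state before driving on; the function `drive` and this timing are illustrative assumptions.

```python
from fractions import Fraction

PAYOFF = {"A": 0, "B": 4, "C": 1}

def drive(knot_at_start):
    """Payoff under the rule: EXIT if there is a knot, CONTINUE if there is not."""
    knot = knot_at_start
    for exit_node in ("A", "B"):  # exits available at X and then at Y
        if knot:
            return PAYOFF[exit_node]
        knot = not knot           # tie a knot while passing through
    return PAYOFF["C"]

# A knot is present at START with probability 1/2.
expected = Fraction(1, 2) * drive(True) + Fraction(1, 2) * drive(False)

print(drive(True), drive(False), expected)  # 0 4 2
```

For comparison, the best the driver can do without the handkerchief is the planning-optimal payoff of $4p - 3p^2$ at $p = 2/3$, which is 4/3.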

In all these cases, the driver can behave differently at the two intersections. But then he should take this into account at the planning stage as well, and again there is no paradox (see Subsection (c) above).

⁸ This may sound like P & R’s argument in the pure case, but it isn’t. Given the planning-optimal decision (which in the pure case is CONTINUE), P & R claim that at the action stage, the driver should switch to EXIT. They base this on 1/2-1/2 probabilities of being at the two intersections; these probabilities are derived from the assumption that the driver indeed CONTINUES. But if he decides to EXIT, this assumption makes no sense: the probabilities cannot be computed as if he had chosen CONTINUE. As in Section 3, when calculating the expected payoff of switching plans, you cannot use probabilities as if you had not switched. In contrast, we say that if the driver EXITS at the other intersection, he should CONTINUE here, and if he CONTINUES at the other intersection, he should EXIT here. That is a different kettle of fish altogether.


7. CONCLUSION

At the action stage, the driver must assume that the other decision is fixed at the action-optimal value. This is consistent with the optimal choice at the planning stage. Thus, the example of the absent-minded driver displays no dynamic inconsistency.

APPENDIX

We provide here the precise derivation of the function $h(p,q)$ of Section 4. Recall that $p$ and $q$ denote the probabilities of CONTINUE at the current and at the other intersection, respectively, and $h(p,q)$ denotes the resulting expected payoff at the current intersection. For now, think of $p$ and $q$ as fixed.

Define two random variables, $Z$ and $t$. $Z$ is the end-node that is eventually reached, and $t$ is the current time. Thus $Z$ takes the values A, B, and C; as for $t$, we are only interested in two values, say $t = 1$, which is the time when X is visited, and $t = 2$, which is the time when Y is visited (if CONTINUE is chosen at X).

Without any information, it is equally probable that the current time $t$ is 1 or 2:

\[ P(t = 1) = P(t = 2) = 1/2; \]

otherwise, the two decision points would be distinguishable. This holds for the total probability (not conditional on one end-node or another).

From the definition of $p$ and $q$, we have:

\[ P(Z = A \mid t = 1) = 1 - p, \qquad P(Z = A \mid t = 2) = 1 - q, \]
\[ P(Z = B \mid t = 1) = p(1 - q), \qquad P(Z = B \mid t = 2) = q(1 - p), \]
\[ P(Z = C \mid t = 1) = pq, \qquad P(Z = C \mid t = 2) = qp. \]

Putting it all together yields

\[ P(Z = A \text{ and } t = 1) = (1 - p)/2, \]
\[ P(Z = A \text{ and } t = 2) = (1 - q)/2, \]
\[ P(Z = B \text{ and } t = 1) = p(1 - q)/2, \]
\[ P(Z = B \text{ and } t = 2) = q(1 - p)/2, \]
\[ P(Z = C \text{ and } t = 1) = pq/2, \]
\[ P(Z = C \text{ and } t = 2) = qp/2. \]

The “current intersection” $N$ is defined as the intersection, if any, visited at the current time $t$. If $t = 1$, it is necessarily X; if $t = 2$, it is Y if $Z = B$ or $Z = C$, and “none” if $Z = A$ (we write this as $N = \emptyset$). Thus we obtain $P(N = X) = P(t = 1) = 1/2$; $P(N = Y) = P(t = 2 \text{ and } Z \in \{B, C\}) = q(1 - p)/2 + qp/2 = q/2$; and $P(N = \emptyset) = P(t = 2 \text{ and } Z = A) = (1 - q)/2$.

The expected payoff $h(p,q)$ at the current intersection can thus be written as

\[ h(p,q) = P(N = X \mid N \in \{X, Y\})\, E(u(Z) \mid N = X) + P(N = Y \mid N \in \{X, Y\})\, E(u(Z) \mid N = Y), \]

where $u(Z)$ is the payoff at the end-node $Z$. Therefore

\[ h(p,q) = [(1-p)\cdot u(A) + p(1-q)\cdot u(B) + pq\cdot u(C)]\,\frac{1/2}{1/2 + q/2} + [(1-p)\cdot u(B) + p\cdot u(C)]\,\frac{q/2}{1/2 + q/2}. \]

Thus, conditional on currently being at an intersection, the probability is $1/(1+q)$ that it is X, and $q/(1+q)$ that it is Y. These are the “consistent beliefs” of P & R. (Note that the beliefs about the identity of the current intersection depend only on the behavior at the other intersection, not at the current one, where nothing has yet been done. Hence these probabilities are a function of $q$ and not of $p$.)
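The derivation of the beliefs can be traced step by step in code. The sketch below builds the joint distribution of $(Z, t)$ exactly as tabulated in this Appendix and conditions on being at an intersection; the function names `joint` and `beliefs` are hypothetical.

```python
from fractions import Fraction

def joint(p, q):
    """Joint distribution of (end-node Z, current time t), with P(t=1) = P(t=2) = 1/2."""
    half = Fraction(1, 2)
    return {
        ("A", 1): (1 - p) * half,     ("A", 2): (1 - q) * half,
        ("B", 1): p * (1 - q) * half, ("B", 2): q * (1 - p) * half,
        ("C", 1): p * q * half,       ("C", 2): q * p * half,
    }

def beliefs(p, q):
    """P(N = X | at an intersection) and P(N = Y | at an intersection)."""
    dist = joint(p, q)
    at_x = sum(pr for (z, t), pr in dist.items() if t == 1)
    at_y = sum(pr for (z, t), pr in dist.items() if t == 2 and z in ("B", "C"))
    return at_x / (at_x + at_y), at_y / (at_x + at_y)

q = Fraction(2, 3)
for p in (Fraction(0), Fraction(1, 3), Fraction(1)):
    print(p, beliefs(p, q))  # the beliefs are (3/5, 2/5) for every p
```

The loop varies $p$ while keeping $q$ fixed, which illustrates the closing remark that the beliefs are a function of $q$ and not of $p$.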

Next, let $x$ and $y$ denote the probabilities of CONTINUE at X and Y, respectively. The expected payoff at START is then

\[ f(x,y) := (1-x)\cdot u(A) + x(1-y)\cdot u(B) + xy\cdot u(C). \]

A behavior $p$ is planning-optimal if it maximizes $f(p,p)$ over $p$. To compare planning-optimality with action-optimality, note that

\[ \left(\frac{1}{2} + \frac{q}{2}\right) h(p,q) + \left(\frac{1}{2} - \frac{q}{2}\right) u(A) = \frac{1}{2} f(p,q) + \frac{1}{2} f(q,p). \]

Denote this expression by $g(p,q)$; the right side may be interpreted as the expected payoff, evaluated at START, of choosing $p$ at one intersection and $q$ at the other, but without knowing which is which. Now $p^*$ is action-optimal if it maximizes $h(p,p^*)$ over $p$, the second argument being fixed at $p^*$. Equivalently, since $h(p,p^*)$ is, for fixed $p^*$, a positive linear transformation of $g(p,p^*)$ (see above), it follows that $p^*$ is action-optimal if and only if $p^*$ maximizes $g(p,p^*) = [f(p,p^*) + f(p^*,p)]/2$ over $p$. This implies that the randomized planning-optimal decision $p$ is action-optimal. Indeed, the first-order necessary conditions of the two problems are identical; they are moreover sufficient for action-optimality, where the function to be maximized is linear.⁹

⁹ This argument is general; it proves Proposition 3 of P & R.
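The identity relating $h$ and $f$ can be spot-checked numerically with exact fractions. The sketch below tests it on a grid of $(p, q)$ pairs for two payoff vectors taken from the examples in the text; the dictionary layout and function names are illustrative assumptions.

```python
from fractions import Fraction

def f(x, y, u):
    """Expected payoff at START with CONTINUE probabilities x at X and y at Y."""
    return (1 - x) * u["A"] + x * (1 - y) * u["B"] + x * y * u["C"]

def h(p, q, u):
    """Expected payoff at the current intersection: p here, q at the other one."""
    at_x = (1 - p) * u["A"] + p * (1 - q) * u["B"] + p * q * u["C"]
    at_y = (1 - p) * u["B"] + p * u["C"]
    return (at_x + q * at_y) / (1 + q)

def g_left(p, q, u):
    return (1 + q) / 2 * h(p, q, u) + (1 - q) / 2 * u["A"]

def g_right(p, q, u):
    return (f(p, q, u) + f(q, p, u)) / 2

payoffs = [{"A": 0, "B": 4, "C": 1}, {"A": 1, "B": 0, "C": 2}]
grid = [Fraction(k, 8) for k in range(9)]

identity_holds = all(g_left(p, q, u) == g_right(p, q, u)
                     for u in payoffs for p in grid for q in grid)
print(identity_holds)  # True
```

A grid check is not a proof, but since both sides are polynomials of low degree in $p$ and $q$, agreement on a 9 by 9 grid is consistent with the algebraic identity stated above.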


REFERENCES

Aumann, R. J., Hart, S., and Perry, M. (1997). “The Forgetful Passenger,” Games and Econ. Behav. 20, 117-120.

Piccione, M., and Rubinstein, A. (1997). “On the Interpretation of Decision Problems with Imperfect Recall,” Games and Econ. Behav. 20, 3-24.