NBER WORKING PAPER SERIES

A CATERING THEORY OF DIVIDENDS

Malcolm Baker

Jeffrey Wurgler

Working Paper 9542

http://www.nber.org/papers/w9542

NATIONAL BUREAU OF ECONOMIC RESEARCH

1050 Massachusetts Avenue

Cambridge, MA 02138

March 2003

We would like to thank Viral Acharya, Raj Aggarwal, Katharine Baker, Randy Cohen, Gene D'Avolio, Steve

Figlewski, Xavier Gabaix, Paul Gompers, Florian Heider, Laurie Hodrick, Dirk Jenter, Kose John, Steve

Kaplan, John Long, Asis Martinez-Jerez, Colin Mayer, Holger Mueller, Sendhil Mullainathan, Eli Ofek,

Lubos Pastor, Lasse Pedersen, Gordon Phillips, Raghu Rau, Jay Ritter, Rick Ruback, David Scharfstein,

Hersh Shefrin, Andrei Shleifer, Erik Stafford, Jeremy Stein, Ryan Taliaferro, Jerold Warner, Luigi Zingales

and seminar participants at Dartmouth, Harvard Business School, London Business School, LSE, MIT, NBER

Behavioral and Corporate Conferences, NYU, Oxford, the University of Chicago, the University of Michigan,

the University of Rochester, and Washington University for helpful comments; John Long and Simon

Wheatley for data; and Ryan Taliaferro for superb research assistance. Baker gratefully acknowledges

financial support from the Division of Research of the Harvard Business School. The views expressed herein

are those of the authors and not necessarily those of the National Bureau of Economic Research.

©2003 by Malcolm Baker and Jeffrey Wurgler. All rights reserved. Short sections of text not to exceed two

paragraphs, may be quoted without explicit permission provided that full credit including © notice, is given

to the source.

A Catering Theory of Dividends

Malcolm Baker and Jeffrey Wurgler

NBER Working Paper No. 9542

March 2003

JEL No. G35

ABSTRACT

We develop a theory in which the decision to pay dividends is driven by investor demand. Managers

cater to investors by paying dividends when investors put a stock price premium on payers and not

paying when investors prefer nonpayers. To test this prediction, we construct four time series

measures of the investor demand for dividend payers. By each measure, nonpayers initiate dividends

when demand for payers is high. By some measures, payers omit dividends when demand is low.

Further analysis confirms that the results are better explained by the catering theory than other

theories of dividends.

Malcolm Baker

Harvard Business School

Soldiers Field Road

Boston, MA 02136

and NBER

mbaker@hbs.edu

Jeffrey Wurgler

New York University School of Business

44 West Fourth Street

New York, NY 10012-1126

and NBER

jwurgler@stern.nyu.edu

I. Introduction

Miller and Modigliani (1961) prove that dividend policy is irrelevant to stock price in

perfect and efficient capital markets. In that setup, no rational investor has a preference between

dividends and capital gains. Arbitrage ensures that dividend policy is irrelevant.

Over forty years later, the only assumption in this proof that has not been thoroughly

scrutinized is market efficiency.1 In this paper, we present a theory of dividends that relaxes this

assumption. It has three basic ingredients. First, for either psychological or institutional reasons,

some investors have an uninformed, time-varying demand for dividend-paying stocks. Second,

arbitrage fails to prevent this demand from driving apart the prices of stocks that do and do not

pay dividends. Third, managers cater to investor demand – paying dividends when investors put

a higher price on the shares of payers, and not paying when investors prefer nonpayers. We

formalize this catering theory of dividends in a simple model.

The catering theory differs from the standard view of the effect of investor demand on

dividend policy. The standard view emphasizes the irrelevance of dividend policy to share prices

even when some investor clienteles have a rational preference for dividends. For example, Black

and Scholes (1974) write: “If a corporation could increase its share price by increasing (or

decreasing) its payout ratio, then many corporations would do so, which would saturate the

demand for higher (or lower) dividend yields, and would bring about an equilibrium in which

marginal changes in a corporation’s dividend policy would have no effect on the price of its

stock” (p. 2). This equilibrium intuition for dividend irrelevance can also be found in corporate

finance textbooks.

1 Allen and Michaely (2002) provide a comprehensive survey of payout policy research.

The catering theory and the clientele equilibrium theory differ on several key points. One

is that catering takes seriously the possibility that investor demand for dividends is affected by

sentiment. This adds a new and unexplored source of demand to the rational dividend clienteles

considered by Black and Scholes. Another difference is that the catering view focuses more on

the demand for shares that pay dividends, whereas the determinate supply response in a clientele

equilibrium view is the overall level of dividends. For example, we discuss the possibility that

managers cater to investors who categorize dividend-paying shares more or less together, and

pay less attention to whether the yield on a particular share is three or four percent.

But perhaps the most crucial difference is that catering takes a less extreme view on how

fast managers or arbitrageurs eliminate an emerging dividend premium or discount. According to

Black and Scholes, managers compete so aggressively that a nontrivial dividend premium or

discount never arises, and so for a given firm dividend policy remains effectively irrelevant. This

argument is compelling only if fluctuations in the demand for dividends are small relative to the

capacity of firms to adjust supply. It is not obvious a priori that this is the case, particularly if

demand is affected by sentiment. The catering theory acknowledges the possibility of a nontrivial

dividend premium, and thus the relevance of dividend policy.

The main prediction of the catering theory is that the propensity to pay dividends depends

on a measurable dividend premium in stock prices. To test this hypothesis, we construct four

time series measures of the demand for dividend-paying shares. The broadest one is what we

simply call the dividend premium – it is the difference between the average market-to-book ratio

of dividend payers and nonpayers. The other measures are the difference in the prices of Citizens

Utilities’ cash dividend and stock dividend share classes (between 1956 and 1989 CU had two

classes of shares which differed in the form but not the level of their payouts); the average

announcement effect of recent dividend initiations; and the difference between the future stock

returns of payers and nonpayers. Intuition suggests that the dividend premium, the CU dividend

premium, and initiation effects would be positively related to investor demand for dividends. In

contrast, the difference in future returns of payers and nonpayers would be negatively related to

any such demand – if demand for payers is so high that they are relatively overpriced, their

future returns will be relatively low.
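The broadest of these measures can be illustrated with a minimal sketch. It follows the verbal definition above (average market-to-book ratio of payers minus that of nonpayers) and assumes equal weighting within each group; the function name, the input layout, and the sample numbers are hypothetical and are not taken from the paper's data.

```python
from statistics import mean


def dividend_premium(firms):
    """Difference in average market-to-book between payers and nonpayers.

    `firms` is an iterable of (market_to_book, pays_dividend) pairs for one
    year. Equal weighting is an assumption of this sketch.
    """
    payers = [mb for mb, pays in firms if pays]
    nonpayers = [mb for mb, pays in firms if not pays]
    if not payers or not nonpayers:
        raise ValueError("need at least one payer and one nonpayer")
    return mean(payers) - mean(nonpayers)


# Made-up firms for illustration only, not data from the paper.
sample_year = [(1.5, True), (2.0, True), (1.0, False), (1.5, False)]
premium = dividend_premium(sample_year)
```

A positive value means payers trade at higher market-to-book ratios than nonpayers in that year, which the catering theory reads as high demand for payers.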

We then use these four measures of demand to explain time variation in the aggregate rates of

dividend initiation and omission. The results on initiations are the strongest. Each of the four

demand measures is a significant predictor of the rate of initiation. The lagged dividend premium

variable by itself explains a remarkable sixty percent of the annual variation in the initiation rate.

Another perspective is future stock returns. When the initiation rate increases by one standard

deviation, returns on payers are lower than nonpayers by nine percentage points per year over the

next three years. Conversely, the omission rate increases when the dividend premium is low, and

when future returns on payers are high.
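What it means for a lagged variable to "explain" a share of annual variation can be sketched as the R-squared of a one-variable regression of this year's initiation rate on last year's dividend premium. The function name and the series below are hypothetical placeholders constructed for illustration; they are not the paper's data or its exact specification.

```python
def lagged_r_squared(predictor, outcome):
    """R-squared from regressing outcome[t] on predictor[t-1] (with intercept)."""
    if len(predictor) != len(outcome) or len(predictor) < 3:
        raise ValueError("need two equal-length series with at least 3 points")
    x = predictor[:-1]   # premium in year t-1
    y = outcome[1:]      # initiation rate in year t
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("series must vary")
    # In a univariate regression, R-squared equals the squared correlation.
    return (sxy * sxy) / (sxx * syy)


# Placeholder series: the rate is built as an exact linear function of the
# lagged premium, so R-squared is close to one by construction.
premium = [0.10, 0.20, 0.40, 0.30, 0.50]
initiation_rate = [0.0] + [0.01 + 0.05 * p for p in premium[:-1]]
r2 = lagged_r_squared(premium, initiation_rate)
```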

After considering several alternative explanations, we conclude that the results are best

explained by catering. Explanations based on time-varying firm characteristics such as

investment opportunities or profitability, for example, do not account for the results: The

dividend premium variable helps to explain the residual “propensity to initiate” dividends that

remains after controlling for changing firm characteristics, including investment opportunities,

profits, and firm size using the methodology of Fama and French (2001). Alternative

explanations based on time-varying contracting problems, such as agency or asymmetric

information theories, do not address many aspects of the results, for instance why dividend

policy would be related to the CU dividend premium and future returns. The lack of a compelling

alternative explanation, plus the close connection between the predictions of catering and the

patterns that we document, favors the catering explanation.

We then investigate which source of investor demand creates the time-varying dividend

premium that attracts caterers. One possibility is rational dividend clienteles based on taxes,

transaction costs, or institutional investment constraints. We would expect such clienteles to be

satisfied by changes in the overall level of dividends, rather than the number of shares that pay

dividends. The evidence does not support this prediction – initiations and omissions are related

to the dividend premium, but the aggregate dividend yield, the aggregate payout ratio, and the

aggregate rate of dividend increases are not. Moreover, the relationship between initiations and

omissions and the dividend premium is also apparent after controlling for plausible proxies for

rational clienteles. Another possibility is that demand is driven by investor sentiment. Consistent

with this hypothesis, we find a strong correlation between the dividend premium and the closed-

end fund discount.

In summary, we develop and test a catering theory of dividends that relaxes the market

efficiency assumption of the M&M dividend irrelevance proof. The theory rounds out the

collection of theories that relax other assumptions of the proof, and adds to the literature of

behavioral corporate finance. In an early contribution, Shefrin and Statman (1984) develop

theories of investor preference for dividends based on self-control problems, prospect theory, and

regret aversion. The current paper is closer to recent research that views managerial decisions as

rational responses to security mispricing. For example, Baker and Wurgler (2000) and Baker,

Greenwood, and Wurgler (2002) view security issuance decisions as responses to mispricing or

perceived mispricing, and Baker and Wurgler (2002a) develop this into a market timing theory

of capital structure that relaxes the market efficiency assumption of the M&M capital structure

irrelevance proof. Shleifer and Vishny (2002) develop a theory of mergers based on rational

responses to mispricing. Morck, Shleifer, and Vishny (1990), Stein (1996), Baker, Stein, and

Wurgler (2001), and Polk and Sapienza (2001) study rational corporate investment in inefficient

capital markets. Graham and Harvey (2001) and Jenter (2001) provide more evidence that

managers react to mispricing.

Section II develops the catering theory. Section III presents the main empirical results.

Section IV considers alternative explanations. Section V discusses the source of investor demand

for dividends. Section VI concludes.

II. A catering theory of dividends

The theory has three ingredients. First, there is a time-varying, uninformed demand for

the shares of firms that pay cash dividends. Second, limits on arbitrage allow this demand to

affect prices. Third, managers rationally cater in response. After discussing these ingredients, we

combine them in a simple model.

A. Investor demand for dividends

We posit that at some times investors generally prefer stocks that pay cash dividends, and

at other times generally prefer nonpayers. A useful framework for developing this hypothesis is

categorization. Categorization refers to the pervasive cognitive process of grouping objects into

discrete categories such as “birds” or “chairs.” This allows related objects to be considered

together, in terms of a small set of common features that define category membership, rather

than as individual objects, each with its own list of identifying attributes. Categorization thus

speeds up communication and inference. Rosch (1978) provides a detailed review.

In standard theory, investors do not categorize. Instead, they identify each security with a

list of abstract statistics, such as mean return, variance, and covariance. In reality, as Barberis

and Shleifer (2002) point out, investors often do categorize securities into “small stocks,” “value

stocks,” “tech stocks,” “old-economy stocks,” “junk bonds,” “utilities,” and so forth. For many

investors, these labels appear to capture all they want to know, or have the ability to process,

about the securities within the category.

There are several reasons to suspect that certain investors and institutions categorize

“dividend payers” directly or use dividends to classify stocks as “old economy,” for example.

Whether a stock pays dividends is clearly a salient characteristic, perhaps even more so than

industry, size, or index membership, and the financial press often categorizes firms according to

dividend payment.2 The fact that many firms pay small but nonzero dividends suggests that there

is a discrete component to attracting attention through dividends.

One reason why dividends are salient is a belief that dividend-paying stocks are less

risky.3 This notion is common in the popular financial press, and was once common in the

academic literature – Graham and Dodd (1951) and Gordon (1959) are recognized for this idea,

but Miller and Modigliani (1961) cite a number of other papers of this vintage that make the

same argument. Naïve investors, such as retirees and those who hold dividend-paying stocks for

“income” despite the tax penalty, would seem especially likely to fall prey to this bird-in-the-

hand argument. For them, the quarterly dividend check is much more salient than daily gyrations

in the stock price. If the risk tolerance of bird-in-the-hand investors changes over time, their

preferences for payers and nonpayers will also change. This is one possible mechanism by which

unsophisticated investors may display a time-varying sentiment for payers.

2 For example, a July 16, 2002 Wall Street Journal article titled “Where should you invest now?” categorizes

appealing investment options into TIPs, Ginnie Maes, Real Estate, and Dividend Paying Stocks. Quoting from the

article: “As of Friday, prices for dividend-paying stocks in the S&P 500 stock index had fallen 8.04% vs. a loss of

28.18% for stocks in the index without dividends.”

3 Hyman (1988) describes investor reaction to Consolidated Edison’s 1974 dividend omission. “It smashed the

keystone of faith for investment in utilities: that the dividend is safe and will be paid.” (p. 109).

Another way dividends can become salient is if investors use them to infer managers’

investment plans. For example, investors may interpret nonpayment, controlling for profitability,

as evidence that the firm thinks it has excellent investment opportunities. Conversely, dividends

may be taken as evidence that opportunities are weak. These inferences create another channel

through which payers and nonpayers become categories, and suggest a second realistic

mechanism to generate a time-varying sentiment across categories. That is, when investors’

perceptions of overall growth opportunities are high, they prefer nonpayers, and vice-versa. Note

that time variation is driven here by perceptions of growth opportunities, not risk tolerance as

above. One popular model (Shiller (1984, 2000)) that combines both effects is that steady

dividends mean “old-economy.” Old-economy stocks are viewed as safer but also as having less

potential than the “new-economy” stocks which plow back everything to finance growth.

Black and Scholes (1974) and Allen, Bernardo, and Welch (2000), among others, suggest

that institutional frictions also lead to the rational categorization of payers by dividend clienteles.

Imperfections that have been proposed to cause clienteles include transaction costs, taxes, and

institutional investment constraints. Many endowed institutions are restricted to spending from

income, for example, a clear reason to categorize payers. Others may take dividends as evidence

that a stock is a “prudent” investment. Time variation in these imperfections can then induce

time-varying clientele preferences. The 1970s witnessed a number of events that may have led to

clientele demand shifts. The 1974 ERISA may have increased the attractiveness of payers to

pension funds (Del Guercio (1996) and Brav and Heaton (1998)). The 1975 advent of negotiated

commissions reduced the transaction cost of creating homemade dividends. And of course tax

code changes can differentially affect payers and nonpayers.

Given that categorization occurs, time-varying demand between categories could also

arise from what Mullainathan (2002) calls categorical inference. Investors using categorical

inference could, for example, overestimate the impact of news about a particular payer for other

payers, and underestimate its impact for nonpayers. Thus even without any explicit preference

for cash dividends, the fact that categories have already been built around them could lead to

variation in demand between payers and nonpayers.

Finally, building on ideas in Thaler and Shefrin (1981), Shefrin and Statman (1984)

propose that some investors prefer dividend-paying stocks to homemade dividends to combat

self-control problems. Shefrin and Statman also motivate an investor preference for dividends

with prospect theory and regret aversion arguments. The prospect theory argument combines

ideas in Kahneman and Tversky (1979) and Thaler (1980, 1983) with the result that dividends

and capital gains allow investors a more flexible and agreeable mental accounting. When capital

gains are low, investors can find a silver lining in the dividend; when capital gains are high,

dividends and capital gains are individually-wrapped presents that can be savored separately.

These theories offer additional reasons why investors view payers and nonpayers as distinct. To

the extent that the germane considerations vary over time, they might also lead to a time-varying

preference for payers.

B. Limited arbitrage

In perfect and efficient markets, uninformed demand for dividends would not affect stock

prices. Arbitrage would prevent it. Arbitrageurs could short the firm with a preferred dividend

policy and go long a correctly priced “perfect substitute” – a firm with the same investment

policy but a different dividend policy. In perfect and efficient markets, only investment policy

affects stock prices, so an arbitrage follows by making homemade dividends on the long firm to

match the dividends declared by the short firm. In the absence of further frictions, this position

delivers an up-front gain and can be risklessly held forever, or liquidated when prices move back

in line. Competition for such arbitrage opportunities, it is argued, would eliminate any dividend

premium or discount and maintain dividend policy irrelevance.

In practice, the long-short arbitrage that drives this irrelevance proof is risky and costly.4

Limited arbitrage is the second postulate of the catering theory. An obvious risk in long-short

arbitrage is fundamental risk, which arises simply because individual stocks do not have perfect

substitutes (Wurgler and Zhuravskaya (2002)). This risk is in principle diversifiable, but

arbitrageurs also face a systematic risk, often called noise-trader risk or interim price risk, if they

try to trade against systematic sentiment. With short horizons or limited capital, they are

sensitive to this risk (De Long, Shleifer, Summers, and Waldmann (1990) and Shleifer and

Vishny (1997)). Finally, long-short arbitrage is costly. Nontrivial shorting costs are reported by

D’Avolio (2002), Geczy, Musto, and Reed (2002), and Lamont and Jones (2002).

If arbitrage is limited and uninformed demand varies at the category level, as Barberis

and Shleifer propose, then prices may also vary at the category level. Barberis, Shleifer, and

Wurgler (2001) and Greenwood and Sosner (2001) find evidence for demand-induced

comovement within the categories defined by stock indexes. If payers and nonpayers are

investment categories, the same logic implies that uninformed demand may also affect their

relative prices.

4 Limited arbitrage explanations have been developed for closed-end fund discounts (Lee, Shleifer, and Thaler

(1991) and Pontiff (1996)), risk arbitrage returns (Mitchell and Pulvino (2001) and Baker and Savasoglu (2002)),

post-earnings-announcement drift (Mendenhall (2002)), the Internet bubble (Ofek and Richardson (2002a, 2002b)),

seasoned equity issue returns (Pontiff and Schill (2001)), negative stub values (Lamont and Thaler (2000) and

Mitchell, Pulvino, and Stafford (2001)), IPO underpricing (Duffie, Garleanu, and Pedersen (2002)), index inclusion

effects (Greenwood (2001) and various papers on S&P 500 additions), and the predictive power of such variables as

breadth of ownership (Chen, Hong, and Stein (2002)), market liquidity (Baker and Stein (2002)), and book-to-

market (Alti, Hwang, and Trombley (2002)).

Our empirical work is soon to come. For the impatient reader, we point to Long (1978) as

some initial evidence that uninformed, time-varying demand for cash dividends affects stock

prices. Long studies the Citizens Utilities Company, which between 1956 and 1989 had one

share class that paid cash dividends and another that paid stock dividends. By charter, the

payouts to the two classes were supposed to have equal pretax value. In practice, the stock

dividend averaged ten percent higher than the cash dividend. Long finds that during his sample

period, the cash-paying share’s relative price was too high, given its pretax dividend

disadvantage and its further tax disadvantage.5 More interesting, the relative price fluctuates

substantially over time. Long, Poterba (1986), and Hubbard and Michaely (1997) conclude that

these fluctuations cannot be explained by traditional theories of dividends.

C. Catering as a rational response

The third element of the theory is that managers cater to uninformed investor demand. In

the setting of dividends, catering implies that managers will tend to initiate or continue paying

dividends when investors put a higher price on payers, and omit dividends or avoid initiating

them when investors favor nonpayers.

The objective of catering is to capture the stock price premium associated with the

characteristics investors currently favor. Catering is thus distinct from the usual policy of

maximizing shareholder value. In inefficient markets, managers have to decide which of two

prices to maximize: A short-run price affected by uninformed demand, and a fundamental or

long-run value determined by investment policy. Catering maximizes the short-run price, while

the traditional policy emphasizes fundamental value.

5 In 1955 CU obtained a special IRS exemption making the stock dividends not taxable as ordinary income. In

general, regular stock dividends have been taxable since the 1969 Tax Reform Act, but CU received a grandfather

clause in that Act.

In general, whether managers will rationally cater to a short-run mispricing is an

empirical question.6 One element in their decision is how much of a fundamental tradeoff there is

between catering and investment policy – if they can maximize short-run and long-run price

without conflict, they will do both.7 Another element is whether managers can personally profit

from any short-term overvaluation that follows from successful catering. If they hold a

significant amount of equity themselves, they can sell their overvalued shares. Or they may be

able to issue dilutive, overpriced shares. A final consideration is the horizon of managers, or the

horizon of the investors they care about most. Managers with short horizons, for instance those

with compensation tied to short-run performance, will be more likely to cater.

D. A model of dividend catering

A short model makes these tradeoffs precise, and illustrates the more subtle features and

limits of the catering theory. The model assumes that investors strictly categorize payers and

nonpayers. While extreme, this is a convenient way of capturing the distinction that we want to

emphasize – zero versus nonzero payout, not small versus large payout. Fama and French (2001)

also focus on this dimension of dividend policy.

Consider a firm with Q shares outstanding. At t = 1, it pays a liquidating distribution of

$V = F + \epsilon$ per share, where $\epsilon$ is a normally distributed error term with mean zero. At t = 0, it has the

choice of paying an interim dividend d∈{0,1} per share, which reduces the liquidating

distribution by d(1+c). The risk-free rate is zero. The cost c captures any tradeoff between

dividend and investment policy, such as would result from costly external finance or taxes. The

Miller and Modigliani case has c equal to zero – dividend policy does not in any way affect the

cash flows to investors.

6 Conditions under which managers will pursue short-run over long-run value are also discussed by Miller and Rock

(1985), Stein (1989), Shleifer and Vishny (1990), Blanchard, Rhee and Summers (1993) and Stein (1996).

7 An example of a setting in which no tradeoff exists is firm names. Cooper, Dimitrov, and Rau (2001) and Rau,

Patel, Osobov, Khorana, and Cooper (2001) document that when investor sentiment favored the Internet (before

March 2000), a number of firms added “dot com” to their names, but when sentiment turned away (after March

2000), firms were changing back. While many of these name changes surely coincided with changes in investment

policy, Rau et al. argue that at least some of them were simply catering to sentiment for the Internet.

There are two types of investors, category investors and arbitrageurs. Both have constant

absolute risk aversion. Arbitrageurs have aggregate risk tolerance per period of $\gamma_A$. They have

rational expectations over the terminal distribution, and they know the long-run cost of an

interim dividend. Thus they expect a liquidating distribution of F if the firm does not pay an

interim dividend and F-c if it does.

Category investors have aggregate risk tolerance per period of $\gamma_C = \gamma$. They have an

irrational expectation of the terminal distribution, and they do not recognize the cost of an

interim dividend. Their irrational expectation introduces a source of uninformed demand. For

purposes of developing the model, we suppose that they categorize because they view nonpayers

as growth firms, and they judge the prospects of those firms relative to their own assessment of

growth opportunities. (Alternatively, their irrational expectations could reflect biased inferences

that overweight within-category information as in Mullainathan (2002), biased risk perceptions

arising from the bird-in-the-hand fallacy, or capture institutional constraints in a reduced form.)

Specifically, they expect a final payment of $V^D$ from payers and $V^G$ from nonpayers. For

simplicity, we assume that they misestimate the mean payout, but not the distribution around the

mean. Typically, their net result is to cause $V^D$ and $V^G$ to fall on opposite sides of F.

If the firm meets its criteria, investor group k demands

$$D_0^k = \gamma_k \left( E_k(V) - P_0 \right). \tag{1}$$

Prices of dividend payers $P^D$ (cum dividend) and growth firms $P^G$ are therefore

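$$
P_0 = \begin{cases}
P_0^D \equiv \dfrac{\gamma}{\gamma+\gamma_A}\,V^D + \dfrac{\gamma_A}{\gamma+\gamma_A}\,(F - c) - \dfrac{Q}{\gamma+\gamma_A} \\[1.5ex]
P_0^G \equiv \dfrac{\gamma}{\gamma+\gamma_A}\,V^G + \dfrac{\gamma_A}{\gamma+\gamma_A}\,F - \dfrac{Q}{\gamma+\gamma_A}
\end{cases} \tag{2}
$$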
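The step from demands to prices can be written out as a short derivation. It is a sketch under the assumption that each group's demand is its risk tolerance times the gap between its expected payoff and the price (payoff variance normalized to one), and that the market clears when total demand equals the Q shares outstanding.

```latex
% Market clearing for a dividend payer.
% Category investors (risk tolerance \gamma) expect V^D;
% arbitrageurs (risk tolerance \gamma_A) expect F - c.
\begin{align*}
\gamma\,(V^D - P_0^D) + \gamma_A\,(F - c - P_0^D) &= Q \\
(\gamma + \gamma_A)\,P_0^D &= \gamma V^D + \gamma_A (F - c) - Q \\
P_0^D &= \frac{\gamma}{\gamma+\gamma_A}\,V^D
        + \frac{\gamma_A}{\gamma+\gamma_A}\,(F - c)
        - \frac{Q}{\gamma+\gamma_A}
\end{align*}
% The growth-firm price follows by replacing V^D with V^G and F - c with F.
```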

Given these prices, the manager chooses whether to pay dividends. We assume that the

manager is risk neutral and cares about both the current stock price and the value of total

distributions. The manager’s only effect on the latter is through the cost of dividends c. With his

horizon measured as $\lambda$, the manager solves:

$$\max_{d}\; (1-\lambda)\,P_0 + \lambda\,(-dc) \tag{3}$$

The solution is straightforward. The manager pays dividends if the dividend premium is

positive and exceeds the present value of the long-run cost that he incorporates. That is, when

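$$P_0^D - P_0^G \equiv \frac{\gamma}{\gamma+\gamma_A}\left(V^D - V^G\right) - \frac{\gamma_A}{\gamma+\gamma_A}\,c \;\geq\; \frac{\lambda}{1-\lambda}\,c. \tag{4}$$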

The first term in the middle is the immediate positive price impact of switching categories. The

second is the immediate negative price impact of the arbitrageurs’ recognition of the cost c. To

induce payment, the net of these must exceed the long-run cost that the manager incorporates,

the term on the right. Qualitatively, the propensity to pay is increasing in the dividend premium,

decreasing in c, decreasing in the prevalence of arbitrage (the relative risk-bearing capacity of

arbitrageurs and category investors), and decreasing in managers’ horizons. The announcement

effect of an initiation is positive and increasing in the dividend premium.8

8 Note that an uninformed demand interpretation of announcement effects could explain why dividend changes have

price impacts while at the same time appear to contain more information about past earnings than future earnings

(Lintner (1956), Fama and Babiak (1968), Watts (1973), DeAngelo, DeAngelo, and Skinner (1996) and Benartzi,

Michaely, and Thaler (1997)).
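The two-category pricing and payment rule can be sketched numerically. The code below assumes market-clearing prices of the form [γ V + γ_A × (arbitrageurs' expected payoff) − Q] / (γ + γ_A) and a manager who maximizes (1 − λ) P_0 − λ d c; the class, function names, and parameter values are illustrative assumptions, not part of the paper. It checks that comparing the objective at d = 1 and d = 0 gives the same answer as the threshold on the dividend premium, and that more arbitrage capital can switch the decision off.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Params:
    gamma: float    # risk tolerance of category investors
    gamma_a: float  # risk tolerance of arbitrageurs
    q: float        # shares outstanding
    f: float        # expected fundamental payoff per share
    c: float        # long-run cost of paying the dividend
    lam: float      # weight on long-run value, 0 <= lam < 1
    v_d: float      # category investors' expected payoff for payers
    v_g: float      # category investors' expected payoff for nonpayers


def price_payer(p):
    return (p.gamma * p.v_d + p.gamma_a * (p.f - p.c) - p.q) / (p.gamma + p.gamma_a)


def price_growth(p):
    return (p.gamma * p.v_g + p.gamma_a * p.f - p.q) / (p.gamma + p.gamma_a)


def pays_by_objective(p):
    """Compare the manager's objective with d = 1 against d = 0."""
    value_pay = (1.0 - p.lam) * price_payer(p) - p.lam * p.c
    value_skip = (1.0 - p.lam) * price_growth(p)
    return value_pay >= value_skip


def pays_by_threshold(p):
    """Dividend premium must cover the manager's discounted long-run cost."""
    return price_payer(p) - price_growth(p) >= p.lam / (1.0 - p.lam) * p.c


# Illustrative values only: category investors currently favor payers.
base = Params(gamma=2.0, gamma_a=1.0, q=1.0, f=10.0, c=0.3, lam=0.4,
              v_d=10.6, v_g=9.8)
# Same sentiment, but far more arbitrage capital.
more_arbitrage = replace(base, gamma_a=20.0)
```

With these placeholder values the manager pays under `base` but not under `more_arbitrage`, in line with the comparative static that the propensity to pay falls as arbitrage becomes more prevalent.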

Equation (4) contains the basic time series predictions that we test, plus several cross-

sectional predictions that we leave to future work. However, this two-category version is too

simplistic to incorporate key stylized facts, such as the persistence of dividend payment and the

negative announcement effect of omissions. To address these facts, we briefly outline extensions

of the model that make use of a third category, former payers. These stocks lack the

characteristics noticed by category investors, as they pay no dividends and have low (historical)

earnings growth.9 Thus they attract only arbitrageurs, so their price is

$$P_0^{FD} = F - \frac{Q}{\gamma_A}.$$

With this third category, the model can address the stylized fact that dividend payment is

empirically quite persistent. That is, equation (4) shares the feature of many theories of dividends

(for example, Miller and Rock (1985)) that the decisions to initiate and omit are symmetric. With

former payers, dividends can be sticky. In particular, the decision for growth firms to initiate is

still governed by (4), while current payers continue when:

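$$P_0^D - P_0^{FD} \equiv \frac{\gamma}{\gamma+\gamma_A}\left(V^D - \left(F - \frac{Q}{\gamma_A}\right)\right) - \frac{\gamma_A}{\gamma+\gamma_A}\,c \;\geq\; \frac{\lambda}{1-\lambda}\,c. \tag{5}$$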

Like the propensity to initiate, the propensity to continue is decreasing in the long-run cost and

increasing in the dividend premium. The new insight is that continuing may be desirable even

when initiating is not. More formally, if $\gamma_A$ is small, or if c is small and $V^G$ and $V^D$ fall on opposite

sides of F, then (5) is satisfied whenever (4) is satisfied. Intuitively, former payers are neglected

stocks, attracting only arbitrageurs. Even if initiating is undesirable, current payers may want to

continue if the price impact to omitting is large. Note that this third category also suggests why

9 The low historical earnings growth can be motivated by assuming that former payers’ past dividends were not fully

replenished by stock issues (perhaps as a result of the same external finance costs represented by c) or, more

intuitively, on empirical grounds. Fama and French (2001) report that dividend payers have average (asset) growth

rates of 8.78%, while nonpayers average 11.62% and former payers average only 4.67%. These averages are for the

1963-98 full sample. For 1993-98, the averages are 6.65%, 17.67%, and 7.61%, respectively.

some firms might initiate (reinitiate) dividends even when the dividend premium is negative, and

why such initiations would still have a positive announcement effect.

A third category is also useful in addressing the stylized fact that the announcement effect

of omissions is negative (Healy and Palepu (1988) and Michaely, Thaler, and Womack (1995)),

whereas in the simplest two-category model it is not. Specifically, consider an intermediate time

period between t = 0 and t = 1, in which the neglected former payers face a positive probability

of being recategorized as growth firms – for example, because of a random earnings shock. In

this setup, dividend payers may choose to omit a dividend at t = 0 even when (5) is not satisfied.

They suffer a short-run negative announcement effect, but the expected value of being

recategorized may be worth it. It is straightforward to formally incorporate this effect.

Of course, there are many other ways to explain some of these facts, such as fundamental

risk, financial constraints, or asymmetric information. Our goal here is to illustrate the pros and

cons of a model that isolates the market efficiency assumption of Miller and Modigliani. Such a

model predicts that the propensity to pay dividends is robustly increasing in the dividend

premium, and decreasing in the long-run costs of paying dividends. Realistic variants of it

suggest that the decisions to initiate and to continue paying should be analyzed separately.

III. Empirical tests

We test the prediction that the decision to pay dividends depends on uninformed demand

for dividend payers as revealed through stock price signals. The model illustrates some cross-

sectional wrinkles, but this is primarily a time series prediction because uninformed demand is

hypothesized to be systematic.

A. Dividend payment variables

Our measures of dividend payment are derived from aggregations of Compustat data. The

observations in the underlying 1962-2000 sample are selected as in Fama and French (2001, pp.

40-41): “The Compustat sample for calendar year t … includes those firms with fiscal year-ends

in t that have the following data (Compustat data items in parentheses): total assets (6), stock

price (199) and shares outstanding (25) at the end of the fiscal year, income before extraordinary

items (18), interest expense (15), [cash] dividends per share by ex date (26), preferred dividends

(19), and (a) preferred stock liquidating value (10), (b) preferred stock redemption value (56), or

(c) preferred stock carrying value (130). Firms must also have (a) stockholder’s equity (216), (b)

liabilities (181), or (c) common equity (60) and preferred stock par value (130). Total assets must

be available in years t and t-1. The other items must be available in t. … We exclude firms with

book equity below $250,000 or assets below $500,000. To ensure that firms are publicly traded,

the Compustat sample includes only firms with CRSP share codes of 10 or 11, and we use only

the fiscal years a firm is in the CRSP database at its fiscal year-end. … We exclude utilities (SIC

codes 4900-4949) and financial firms (SIC codes 6000-6999).”
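
As a rough illustration of the exclusion screens in the passage quoted above, the following sketch checks a single firm-year record. The record layout and field names are hypothetical (they are not Compustat or CRSP mnemonics), the thresholds are assumed to be expressed in dollars, and the data-availability requirements in the quotation are not modeled.

```python
from dataclasses import dataclass


@dataclass
class FirmYear:
    # Hypothetical record layout; field names are illustrative only.
    sic: int             # SIC industry code
    share_code: int      # CRSP share code
    book_equity: float   # book equity, in dollars (assumed unit)
    total_assets: float  # total assets, in dollars (assumed unit)


def passes_screens(obs: FirmYear) -> bool:
    """Return True if the firm-year survives the exclusion screens."""
    if obs.share_code not in (10, 11):   # keep CRSP share codes 10 or 11 only
        return False
    if 4900 <= obs.sic <= 4949:          # exclude utilities
        return False
    if 6000 <= obs.sic <= 6999:          # exclude financial firms
        return False
    if obs.book_equity < 250_000 or obs.total_assets < 500_000:
        return False                     # exclude very small firms
    return True
```

A firm-year that fails any one screen is dropped, which is why lists and delists in the text are defined relative to the screened sample rather than to Compustat as a whole.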

Within this sample we count a firm-year observation as a dividend payer if it has positive

dividends per share by the ex date, else it is a nonpayer. To aggregate this firm-level data into

useful time series, two aggregate identities are helpful:

$$\text{Payers}_t = \text{New Payers}_t + \text{Old Payers}_t + \text{List Payers}_t, \tag{6}$$

$$\text{Old Payers}_t = \text{Payers}_{t-1} - \text{New Nonpayers}_t - \text{Delist Payers}_t. \tag{7}$$

The first identity defines the number of payers and the second describes the evolution. Payers is

the total number of payers at time t, New Payers is the number of initiators among last year’s

nonpayers, Old Payers is the number of payers that also paid last year, List Payers is the number

of firms that are payers this year and were not in the sample last year, New Nonpayers is the

number of omitters among last year’s payers, and Delist Payers is the number of last year’s

payers that are not in the sample this year. Note that two analogous identities hold if one

switches “Payers” and “Nonpayers” everywhere. Also note that lists and delists are with respect

to our sample, which involves several screens. Thus new lists include both IPOs that survive the

screens in their Compustat debut and established Compustat firms when they first survive

the screens. They also include the established NASDAQ firms that appear in Compustat for the first

time in the 1970s. Similarly, delists include both delists from Compustat and firms that fall

below the screens.
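
The bookkeeping behind identities (6) and (7) can be sketched as follows, assuming each year's screened sample is available as a mapping from a firm identifier to a payer flag. The function name and data layout are hypothetical and chosen only for illustration.

```python
from typing import Dict


def payer_flows(prev: Dict[str, bool], curr: Dict[str, bool]) -> Dict[str, int]:
    """Count the flows that appear in identities (6) and (7).

    prev and curr map a firm identifier to True (payer) or False (nonpayer)
    for the firms in the screened sample last year and this year.
    """
    flows = {
        "payers": sum(1 for pays in curr.values() if pays),
        "prev_payers": sum(1 for pays in prev.values() if pays),
        # Paid this year, in the sample last year as a nonpayer.
        "new_payers": sum(1 for f, pays in curr.items()
                          if pays and f in prev and not prev[f]),
        # Paid this year and last year.
        "old_payers": sum(1 for f, pays in curr.items()
                          if pays and prev.get(f, False)),
        # Paid this year, not in the sample last year.
        "list_payers": sum(1 for f, pays in curr.items()
                           if pays and f not in prev),
        # Paid last year, in the sample this year as a nonpayer.
        "new_nonpayers": sum(1 for f, pays in curr.items()
                             if not pays and prev.get(f, False)),
        # Paid last year, not in the sample this year.
        "delist_payers": sum(1 for f, pays in prev.items()
                             if pays and f not in curr),
    }
    # Identity (6): composition of this year's payers.
    assert flows["payers"] == (flows["new_payers"] + flows["old_payers"]
                               + flows["list_payers"])
    # Identity (7): evolution of last year's payers.
    assert flows["old_payers"] == (flows["prev_payers"] - flows["new_nonpayers"]
                                   - flows["delist_payers"])
    return flows
```

Because every payer this year is either a new payer, an old payer, or a new list, and every payer last year either continues, omits, or leaves the sample, the two assertions hold by construction.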

We use these aggregate totals to define three basic measures of the dynamics of dividend

payment among certain subsets of firms:

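$$\text{Initiate}_t = \frac{\text{New Payers}_t}{\text{Nonpayers}_{t-1} - \text{Delist Nonpayers}_t}, \tag{8}$$

$$\text{Continue}_t = \frac{\text{Old Payers}_t}{\text{Payers}_{t-1} - \text{Delist Payers}_t}, \tag{9}$$

$$\text{Listpay}_t = \frac{\text{List Payers}_t}{\text{List Payers}_t + \text{List Nonpayers}_t}. \tag{10}$$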

In words, the rate of initiation Initiate is the fraction of surviving nonpayers that become new

payers. The rate at which firms continue paying Continue is the fraction of surviving payers that

continue paying. It can also be viewed as one minus the rate at which firms omit dividends. The

rate at which new lists in the sample pay Listpay is self-explanatory.
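
A minimal sketch of equations (8) through (10), taking the aggregate counts as inputs. The function and argument names are hypothetical; only the ratios themselves come from the definitions above.

```python
from typing import Dict


def payment_rates(*, new_payers: int, old_payers: int, list_payers: int,
                  list_nonpayers: int, prev_payers: int, prev_nonpayers: int,
                  delist_payers: int, delist_nonpayers: int) -> Dict[str, float]:
    """Compute Initiate, Continue, and Listpay from aggregate counts."""
    surviving_nonpayers = prev_nonpayers - delist_nonpayers  # denominator of (8)
    surviving_payers = prev_payers - delist_payers           # denominator of (9)
    return {
        "initiate": new_payers / surviving_nonpayers,
        "continue": old_payers / surviving_payers,
        "listpay": list_payers / (list_payers + list_nonpayers),
    }
```

For example, with 100 payers last year, 10 of which delist and 81 of which pay again, the continuation rate is 81/90 = 0.9, which is one minus the omission rate of 9/90.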

These variables capture the decision whether to pay dividends, not how much to pay. We

take this approach for several reasons. First, these are the natural dependent variables in a theory

in which investors categorize shares based on whether they pay dividends. (Wings make a

“bird,” regardless of their length.) Second, as an empirical matter, the payout ratio is sensitive to

profitability and the dividend yield is sensitive to changes in share prices. The decision to initiate

or omit dividends, in contrast, is always a policy decision. Third, Fama and French (2001)

document a decline in the number of payers, and no comparable pattern in the payout ratio.

Nonetheless, measures of the level of dividends turn out to be useful in discriminating among

alternative interpretations for the basic results.

Table 1 lists the aggregate totals and the dividend payment variables. The sample

displays similar characteristics to the sample in Fama and French (2001). For our purposes, the

most notable feature of the data is the time variation in the dividend variables. The rate of

initiation starts out high in the early years of the sample, then drops dramatically in the late

1960s, rebounds in the mid 1970s, drops again in the late 1970s and remains low through the end

of the sample. The rate at which firms continue paying displays less variation, as expected. The

rate at which lists pay displays the most variation. As Fama and French point out, it has declined

steadily over the past few decades.

While we do not focus on the level of dividends, as just discussed, it is useful to get a

rough sense of the aggregate economic significance of initiations. In the average year in our

sample, newly-initiated dividends amount to 0.5% of dividends already paid by payers, and 29%

of the change in the amount that is paid by payers (in years when this change is positive). The

fact that the first number is so small is not surprising. The numerator is small because the rate of

initiation is low and the typical initiator is small and starts off with a small dividend, while the

denominator is high because the persistence of payment is high and the typical surviving payer

tends to increase dividends over time. We also caution that the 29% figure is affected by outlying

years in which the change in the amount paid by existing payers is barely positive. Nonetheless,

these figures provide some sense of the aggregate economic significance of initiations.

B. Investor demand for dividends variables

We relate dividend payment choices to several stock market measures of the uninformed

demand for dividend-paying shares. Conceptually, an ideal measure would be the difference

between the market prices of firms that have the same investment policy and different dividend

policies. In the frictionless and efficient markets of Miller and Modigliani (1961), of course, this

price difference is zero. But uninformed demand combined with limits to arbitrage, as discussed

above, can lead to a time-varying price difference.

Our first measure, which we simply call the dividend premium because it is the broadest

measure, is motivated by this intuition. It is the difference in the logs of the average market-to-

book ratios of payers and nonpayers – that is, the log of the ratio of average market-to-books.10

We define market-to-book following Fama and French (2001). Market equity is end of calendar

year stock price times shares outstanding (Compustat item 24 times item 25).11 Book equity is

stockholders’ equity (Item 216) [or first available of common equity (60) plus preferred stock par

value (130) or book assets (6) minus liabilities (181)] minus preferred stock liquidating value

(10) [or first available of redemption value (56) or par value (130)] plus balance sheet deferred

taxes and investment tax credit (35) if available and minus post retirement assets (330) if

available. The market-to-book ratio is book assets minus book equity plus market equity all

divided by book assets.
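
The fallback logic in this definition can be sketched as below. Argument names are hypothetical, missing items are represented by None, and returning None when neither an equity measure nor a preferred stock value is available is an assumption of this sketch (the sample screens already require these items).

```python
from typing import Optional


def first_available(*values: Optional[float]) -> Optional[float]:
    """Return the first value that is not missing, or None."""
    for v in values:
        if v is not None:
            return v
    return None


def book_equity(stockholders_equity, common_equity, preferred_par,
                total_assets, liabilities, preferred_liquidating,
                preferred_redemption, deferred_taxes_itc,
                post_retirement_assets) -> Optional[float]:
    """Book equity with the order of fallbacks described in the text."""
    common_plus_par = (common_equity + preferred_par
                       if common_equity is not None and preferred_par is not None
                       else None)
    assets_minus_liabilities = (total_assets - liabilities
                                if total_assets is not None and liabilities is not None
                                else None)
    equity = first_available(stockholders_equity, common_plus_par,
                             assets_minus_liabilities)
    preferred = first_available(preferred_liquidating, preferred_redemption,
                                preferred_par)
    if equity is None or preferred is None:
        return None
    be = equity - preferred
    if deferred_taxes_itc is not None:      # added if available
        be += deferred_taxes_itc
    if post_retirement_assets is not None:  # subtracted if available
        be -= post_retirement_assets
    return be


def market_to_book(total_assets: float, be: float, price: float,
                   shares: float) -> float:
    """(Book assets - book equity + market equity) / book assets."""
    return (total_assets - be + price * shares) / total_assets
```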

10 Market-to-book ratios are approximately lognormally distributed. As a result, levels of the market-to-book ratio,

unlike logs, have the property that the cross-sectional variance increases with the mean. In our context, this means

that the absolute size of a premium measured in levels could proxy for a market-wide valuation ratio.

11 Here we want an aggregate market-to-book measure for a precise point in time, the end of the calendar year. Later

in the paper, when we use market-to-book as a firm characteristic, we use the end of fiscal year stock price.

We then average the market-to-book ratios across payers and nonpayers in each year. The

equal- and value-weighted dividend premium series are the difference of the logs of these

averages. These variables are listed by year in Table 2 and the value-weighted series are plotted

in Figure 1. The figure shows that the average payer and nonpayer market-to-books diverge

significantly at short frequencies. It reveals several interesting patterns. Dividend payers start out

at a premium, by this measure, in the first years of the sample. The valuation of nonpayers then

spikes up in 1967 and 1968 and falls sharply, in relative terms, through 1972. The dividend

premium takes another dip in 1974, and for over two decades now payers have traded at a

discount by this measure. The discount widened in 1999 but closed somewhat in 2000. At this

point it is premature to speculate on the forces that move the dividend premium variable over

time. In Baker and Wurgler (2002b), we draw on academic histories of the capital market and a

review of historical articles in the financial press to provide a detailed, but still highly stylized,

account of its variation.
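
For concreteness, the dividend premium for one year can be computed as in the sketch below. The function names are hypothetical, and the weighting variable used for the value-weighted series is not specified in this section, so the weights are left as an optional input.

```python
import math
from typing import Optional, Sequence


def average(values: Sequence[float],
            weights: Optional[Sequence[float]] = None) -> float:
    """Equal-weighted mean, or weighted mean if weights are supplied."""
    if weights is None:
        return sum(values) / len(values)
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def dividend_premium(payer_mb: Sequence[float], nonpayer_mb: Sequence[float],
                     payer_weights: Optional[Sequence[float]] = None,
                     nonpayer_weights: Optional[Sequence[float]] = None) -> float:
    """Log of average payer M/B minus log of average nonpayer M/B."""
    return (math.log(average(payer_mb, payer_weights))
            - math.log(average(nonpayer_mb, nonpayer_weights)))
```

A positive value means that payers trade at a premium by this measure, and a negative value means that they trade at a discount. Taking logs of the averages, rather than averaging logs, matches the description of the measure as the log of the ratio of average market-to-books.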

The primary disadvantage of the dividend premium variable is that it may also reflect the

relative investment opportunities of payers and nonpayers, as opposed to uninformed demand for

dividend-paying shares. We consider this in our discussion of alternative explanations.

Our second measure of investor demand for dividend payers is the difference in the prices

of Citizens Utilities cash dividend and stock dividend share classes. Between 1956 and 1989 the

Citizens Utilities Company had two classes of shares outstanding on which the payouts were to

be of equal value, as set down in an amendment to the corporate charter. In practice, the relative

payouts were close to a fixed multiple. Long (1978) describes the case in detail. We measure the

CU dividend premium as the difference in the log price of the cash payout share and the log price

of the stock payout share. The 1962 through 1972 data were kindly provided by John Long and

the 1973 through 1989 data are from Hubbard and Michaely (1997).12 Table 3 reports the CU

dividend premium year by year.
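
A small sketch of the annual CU premium, assuming monthly prices for the two share classes are given as parallel lists. Following footnote 12, the annual value is taken as the log of the average monthly price ratio. The function name is hypothetical, and the reinvestment adjustment described in that footnote is not modeled here.

```python
import math
from typing import Sequence


def cu_dividend_premium(cash_class_prices: Sequence[float],
                        stock_class_prices: Sequence[float]) -> float:
    """Log of the average monthly ratio of cash-class to stock-class price."""
    ratios = [c / s for c, s in zip(cash_class_prices, stock_class_prices)]
    return math.log(sum(ratios) / len(ratios))
```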

By its nature, the CU dividend premium does not reflect anything about investment

opportunities. This reduces the number of alternative explanations for why it fluctuates, but it

also means that arbitraging the premium entails no fundamental risk, only noise-trader risk, so

the amount of sentiment that it reflects may be muted. Other disadvantages include the fact that

CU is just one firm; the stock payout share is more liquid than the cash payout share; there was a

one-way, one-for-one convertibility of the stock payout class to the cash payout class, truncating

the ability of the price ratio to reveal pro-cash-dividend sentiment; certain sentiment-based

mechanisms outlined above involve categorization of firms rather than shares, so a case in which

one firm offers two dividend policies may lead to weaker results; and the experiment ended in

1990, when CU switched to stock payouts on both classes.

Our third measure of uninformed demand for dividends is the average announcement

effect of recent initiations.13 Intuitively, if investors are clamoring for dividends, they may make

themselves heard through their reaction to initiations. Asquith and Mullins (1983) find that

initiations are greeted with a positive return on average, but they do not study whether this effect

varies over time. We define a dividend initiation as the first cash dividend declaration date in

CRSP in the twelve months prior to the year in which the firm is identified as a Compustat New

Payer. Since Compustat payers are defined using fiscal years while CRSP allows us to use

12 There are two further adjustments made throughout the 1962 through 1989 series. The annual value that we

consider is the log of the average of the monthly price ratios, because the relative prices fluctuate dramatically even

within a year. And to control for the fact that cash dividends were quarterly, in practice, while the stock dividends

were semiannual, the cash dividends are assumed to be reinvested until the corresponding stock dividend is paid.

13 In closer analogy with the other demand variables, one might like to define an announcement effect variable that

combines the reactions to initiations and omissions. That is, when demand for dividend payers is high, initiation

effects may be particularly positive and omission effects particularly negative. Unfortunately, CRSP data do not

provide precise omission announcement dates.

calendar years, the resulting asynchronicity means that the number of initiation announcements

identified in CRSP for year t does not equal the number of Compustat New Payers in year t.

Another difference arises because the required CRSP data are not always available.

Given an initiation in calendar year t, we calculate the cumulative abnormal return over

the three-day window from day –1 to day +1 relative to the CRSP declaration date as the

cumulative difference between the firm return and the CRSP value-weighted market index. To

control for the differences in volatility across firms and time (see Campbell, Lettau, Malkiel and

Xu (2000)), we scale each firm’s three-day excess return by the square root of three times the

standard deviation of its daily excess returns. The standard deviation of excess returns is

measured from 120 calendar days through five trading days before the declaration date.

Averaging these across initiations in year t gives a standardized, cumulative abnormal

announcement return A. To determine whether the average return in a given year is statistically

significant, we compute a test statistic by multiplying A by the square root of the number of

initiations in year t. This statistic is asymptotically standard normal and has more power if the

true abnormal return is constant across securities (Brown and Warner (1980) and Campbell, Lo,

and MacKinlay (1997)), which is a natural hypothesis in our context. Table 3 reports the average

standardized initiation announcement effects year by year.
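
The standardization and test statistic described above can be sketched as follows. Inputs are daily excess returns (firm return minus the market index return). The function names are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator is used.

```python
import math
import statistics
from typing import Sequence, Tuple


def standardized_car(event_excess: Sequence[float],
                     estimation_excess: Sequence[float]) -> float:
    """Three-day cumulative excess return scaled by sqrt(3) times daily sigma."""
    if len(event_excess) != 3:
        raise ValueError("expected three daily excess returns (days -1, 0, +1)")
    sigma = statistics.stdev(estimation_excess)  # sample std. dev. (assumption)
    return sum(event_excess) / (math.sqrt(3) * sigma)


def annual_announcement_effect(scaled_cars: Sequence[float]) -> Tuple[float, float]:
    """Return (A, test statistic) for one year's initiations."""
    n = len(scaled_cars)
    a = sum(scaled_cars) / n        # average standardized announcement return
    return a, a * math.sqrt(n)      # statistic: A times sqrt(number of initiations)
```

Each scaled return has approximately unit variance if the three daily excess returns are uncorrelated and no abnormal return is present, which is why multiplying the average by the square root of the number of initiations gives a statistic that can be compared with the standard normal distribution.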

Our last demand measure is the difference between the future returns on value-weighted

indexes of payers and nonpayers. Under the rather stark version of catering outlined in the

previous section, managers rationally initiate dividends to exploit a market mispricing. If this is

literally the case, then a high rate of initiations should forecast low returns on payers relative to

nonpayers as the overpricing of payers reverses. The opposite should hold for omissions.

Table 4 reports the correlation between the sentiment measures. We correlate the first

three measures at year t with the excess real return on payers over nonpayers $r_D - r_{ND}$ in year t+1

and the cumulative excess return $R_D - R_{ND}$ from years t+1 through t+3. To the extent that these

variables capture a common factor in uninformed investor demand for dividends, we expect the

dividend premium, the CU premium, and announcement effects to be positively correlated with

each other, and negatively correlated with the future excess returns of payers. Table 4 shows that

these correlations are as expected, with two exceptions: the CU premium and the initiation effect

are negatively correlated, and the initiation effect and one-year-ahead excess returns are

positively correlated. The dividend premium is correlated with all of the other variables in the

expected direction, however. This suggests that the dividend premium may be the single best

reflection of the common factor. In any case, given that each measure has its own advantages and

disadvantages, it is reassuring that they correlate roughly as expected.14

Table 4 also reports autocorrelations and Dickey-Fuller tests for unit roots. These

statistics shed light on the time series properties of the data and the potential for spurious

correlation in the regressions to follow. Of course, the textbook case of spurious correlation

involves nonstationary variables, and so before one puts too much weight on the Dickey-Fuller

tests it is worth noting the theoretical considerations that suggest that these variables are indeed

stationary. For example, if the market-to-book ratio is itself stationary, the dividend premium

cannot grow without bound. In the absence of this prior information, however, Table 4 shows

that we cannot reject a unit root in the dividend premium or the CU dividend premium. A similar

logic holds for the dividend payment variables: Each one is mathematically bounded between

14 We have also considered average ex-dividend day returns as a fifth measure of investor demand. Ex-day returns

do vary over time (e.g., Eades, Hess, and Kim (1994)). However, they have less of a category-switching

interpretation than our other four measures: A dividend payer seems likely to be viewed as a payer before, during,

and after the ex-day.

one and zero, but we cannot formally reject a unit root (unreported). More practically, what these

statistics suggest is that in certain cases we should control for a time trend before concluding that

a relationship is robust.

C. Time series relationships

Here we document the basic relationships between the rates of dividend payment and the

measures of the demand for dividend-paying shares. The top panel of Figure 2 plots the dividend

premium against the raw rate of dividend initiation in the following year.

The figure reveals a strong positive relationship, consistent with catering. On average, the

rate of initiation is 11.0% when the dividend premium is positive and only 3.1% when it is

negative. In the first half of the sample, the dividend premium and subsequent initiations move

almost in lockstep. The premium then submerges in the late 1970s, leading the rate of initiation

down once again. A qualitatively similar figure obtains with the rate of initiations by large firms,

small firms, or firms that have been listed for at least five years (unreported).

The dividend premium has been negative since around 1978, and the initiation rate has

also remained low. The figure gives a visual impression that the relationship has broken down in

this period. In fact, this pattern is not inconsistent with the theory. Equation (4) indicates that

there is no reason to initiate dividends when they are discounted. A monotonic relationship

between initiations and the dividend premium is predicted only when the latter is positive.

Consistent with this prediction, the correlation between the two series is 0.53 for the 14 years in

which the lagged dividend premium is positive, and 0.03 for the 24 years in which it is negative.

Of course, another (less flattering) possibility is that exogenous factors such as the growth in

dividend-unprotected executive stock options, or the emergence of repurchases as a substitute for

dividends, have suppressed initiations in recent years. Still another interpretation of the 1980s

and 1990s data is mentioned below (where we discuss the lower panel of Figure 2).
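
The split-sample comparison above can be sketched as follows, assuming two annual series of equal length in which the premium in year t-1 is paired with the initiation rate in year t. The function names are hypothetical. Each subsample needs at least two observations and some variation for the correlation to be defined.

```python
import math
from typing import Dict, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def split_by_lagged_sign(premium: Sequence[float],
                         initiate: Sequence[float]
                         ) -> Dict[str, Tuple[int, float, float]]:
    """For each sign of the lagged premium: (years, mean rate, correlation)."""
    pairs = list(zip(premium[:-1], initiate[1:]))  # premium at t-1, rate at t
    out = {}
    for label, keep in (("positive", lambda p: p > 0),
                        ("negative", lambda p: p < 0)):
        sub = [(p, r) for p, r in pairs if keep(p)]
        ps = [p for p, _ in sub]
        rs = [r for _, r in sub]
        out[label] = (len(sub), sum(rs) / len(rs), pearson(ps, rs))
    return out
```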

To examine the basic relationship in the figure more formally, Table 5 regresses the

dividend policy measures on the lagged demand for dividends measures. For example, the

initiation rate is modeled as:

$$\text{Initiate}_t = a + b\,P^{D-ND}_{t-1} + c\,A_{t-1} + d\,P^{CU}_{t-1} + u_t, \tag{11}$$

where Initiate is the rate of initiation, $P^{D-ND}$ is the market dividend premium (value-weighted or

equal-weighted), $A$ is the average initiation announcement effect, and $P^{CU}$ is the Citizens Utilities

dividend premium. All independent variables are standardized to have unit variance and all

standard errors are robust to heteroskedasticity and serial correlation to four lags using the

procedure of Newey and West (1987).
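
A univariate version of this specification, with one lagged and standardized regressor, can be sketched in plain Python as below. The function names are hypothetical. The covariance estimator uses Bartlett weights 1 - l/(L+1) and no small-sample correction, which is one common form of the Newey-West estimator and is assumed here, not taken from the text.

```python
import math
import statistics
from typing import List, Sequence, Tuple


def lag_and_standardize(y: Sequence[float],
                        x: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Pair y[t] with x[t-1], scaling the regressor to unit variance."""
    x_lag = list(x[:-1])
    sd = statistics.stdev(x_lag)
    return list(y[1:]), [v / sd for v in x_lag]


def ols_newey_west(y: Sequence[float], x: Sequence[float],
                   lags: int = 4) -> Tuple[float, float, float]:
    """Regress y on a constant and x; return (a, b, Newey-West s.e. of b)."""
    n = len(y)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    sxy = sum(u * v for u, v in zip(x, y))
    det = n * sxx - sx * sx
    b = (n * sxy - sx * sy) / det
    a = (sy - b * sx) / n
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    inv = [[sxx / det, -sx / det], [-sx / det, n / det]]  # (X'X)^{-1}
    g = [(u, u * xi) for u, xi in zip(resid, x)]          # scores u_t * (1, x_t)
    s = [[0.0, 0.0], [0.0, 0.0]]
    for lag in range(lags + 1):
        w = 1.0 if lag == 0 else 1.0 - lag / (lags + 1)   # Bartlett weight
        for t in range(lag, n):
            for i in range(2):
                for j in range(2):
                    term = g[t][i] * g[t - lag][j]
                    if lag > 0:                            # add the transpose term
                        term += g[t - lag][i] * g[t][j]
                    s[i][j] += w * term
    tmp = [[sum(inv[i][k] * s[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
    v = [[sum(tmp[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    return a, b, math.sqrt(max(v[1][1], 0.0))
```

With `lags=0` the standard error reduces to the heteroskedasticity-robust (White) estimate. Because the regressor is scaled to unit variance, the slope can be read as the change in the dependent variable associated with a one-standard-deviation change in the lagged regressor.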

Panel A reports the determinants of initiations. The regression in the first column

corresponds to Figure 2. It shows that a one-standard-deviation increase in the value-weighted

market dividend premium is associated with a 3.90 percentage point increase in the initiation rate

in the following year, or roughly three-quarters of the standard deviation of that variable.15 This

one measure explains a striking 60 percent of the variation in the rate of initiation. The second

column shows that the effect of the equal-weighted dividend premium is essentially the same.16

The remaining columns show the effect of other variables, and the results of a multivariate horse

15 If nonpayers are trading at a discount to payers, a large number of initiations may mechanically dilute the price of

payers and hence lower the premium. This can create the sort of Stambaugh (1999) bias that is described in the

Appendix in connection with return predictability. This bias is increasing in the correlation between the errors of the

prediction regression in Table 5 and the errors in an autoregression of the dividend premium on the lagged dividend

premium. In the case of Initiate, these errors have a correlation of less than 0.01, so the bias is inconsequential. In

the case of Continue and Listpay, the correlation is also not statistically significant.

16 The dependent variable is implicitly an equal-weighted measure, so an equal-weighted independent variable may

seem appropriate. On the other hand, the value-weighted premium, which emphasizes larger firms, is likely to be

more visible to potential initiators. The two measures perform almost identically in this and future tables. We

proceed with value weights alone for the sake of brevity.

race. The lagged initiation announcement effect and the CU premium have significant positive

coefficients, as predicted. But they disappear in a multivariate regression that includes the

dividend premium. This is consistent with earlier indications that the dividend premium best

captures the common factor in these variables.

Panel B reports analogous regressions for the rate of continuation. The dividend premium

effect is again as predicted by catering: When dividends are at a discount, payers are more likely

to omit (not continue). The dividend premium effect is smaller here, consistent with the lower

sensitivity predicted by certain versions of the model. Specifically, a one-standard-deviation

increase in the dividend premium increases the continuation rate by 0.85 percentage points.

Indeed, to the extent that some omissions are forced by profitability circumstances, which we

control for in the next section, it may be surprising that the effect is as strong as it is. The other

columns of Panel B show that the other measures of demand do not have explanatory power for

the rate of continuation, however.

Panel C shows that the rate at which lists pay is also positively related to the dividend

premium. A one-standard-deviation increase in the dividend premium increases Listpay by 16.08

percentage points. The relative size of the coefficient here again reflects the relative variation in

the dependent variable. Using a dividend premium variable defined just over recent new lists has

at least as much explanatory power (unreported). The CU premium also has a strong univariate

effect here, but as before the dividend premium wins a horse race.

Table 6 shows the relationship between dividend policy and our fourth demand variable,

the future excess returns of payers over nonpayers. In Panel A, the dependent variable is the

difference between the returns on value-weighted indexes of payers and nonpayers. Panels B and

C look separately at the returns on payers and nonpayers, respectively, to examine whether any

results for relative returns are indeed coming from the difference in returns, which the theory

emphasizes, and not payer or nonpayer returns alone. Each panel examines one-, two-, and

three-year-ahead returns, and cumulative three-year returns. The table reports ordinary least-squares

coefficients as well as coefficients adjusted for the small-sample bias analyzed by Stambaugh

(1999). The p-values reported in the table represent a two-tailed test of the hypothesis of no

predictability using a bootstrap technique described in the Appendix.

Panel A indicates that dividend policy does have predictive power for relative returns. A

one-standard-deviation increase in the rate of initiation forecasts a decrease in the relative return

of payers of around eight percentage points in the next year, and thirty percentage points over the

next three years. This strikes us as a substantial magnitude – arguably, a magnitude worth

catering to. The predictive power of the standardized continuation rate is similar. The rate at

which lists pay has no predictive power, however, unless a time trend is included, in which case

it displays a similar level of predictability to the other dividend policy variables. The bottom

panels confirm that the relative return predictability cannot be attributed to just payer or

nonpayer predictability. As theory suggests, it is the relative return that matters.

Tables 5 and 6 provide some support for the catering theory’s basic predictions. Firms

appear more likely to initiate dividends when the demand for dividend-paying shares is high, and

more likely to omit when demand is low.

IV. Alternative explanations

The catering explanation for these results is that dividend payment is, to an important

extent, a rational managerial response to a real or perceived stock market mispricing. While it is

often possible to reinterpret an individual empirical relationship, it turns out to be very difficult

to construct a coherent, non-catering alternative explanation for the full set of results. We discuss

a variety of alternative hypotheses below.

A. Statistical robustness

Time-series regressions raise a standard set of statistical issues. One is spurious

correlation. Recall that despite theoretical reasons to believe that the variables are stationary,

formal tests do not always reject a unit root. On the other hand, the key for statistical inference is

that the residuals are stationary. In the regression of initiation on the dividend premium, we can

reject a unit root in the residuals at the 10 percent level (unreported).

A more practical question is whether these relationships are robust to the inclusion of a

time trend. One would not expect future relative stock returns to be predictable from a time

trend, but the other measures of investor demand are worth checking. Table 7 includes a trend

alongside the dividend premium. The coefficient remains strongly significant for initiations. For

continuations, however, inclusion of a trend pushes the coefficient to the 10 percent level of

significance, and considerably reduces the size of the coefficient on new lists, though it does not

eliminate statistical significance.
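
One way to see what including a trend does is the Frisch-Waugh-Lovell result: the coefficient on the dividend premium in a regression that also contains a constant and a linear trend equals the slope from regressing the detrended dependent variable on the detrended premium. The sketch below uses this equivalence. It illustrates the mechanics only, is not the procedure behind Table 7, and uses hypothetical function names.

```python
from typing import List, Sequence


def residualize_on_trend(z: Sequence[float]) -> List[float]:
    """Residuals from regressing z on a constant and a time index 0..n-1."""
    n = len(z)
    tbar = (n - 1) / 2
    zbar = sum(z) / n
    stt = sum((t - tbar) ** 2 for t in range(n))
    slope = sum((t - tbar) * (zi - zbar) for t, zi in enumerate(z)) / stt
    return [zi - zbar - slope * (t - tbar) for t, zi in enumerate(z)]


def slope_controlling_for_trend(y: Sequence[float], x: Sequence[float]) -> float:
    """Coefficient on x in a regression of y on a constant, a trend, and x."""
    ry = residualize_on_trend(y)
    rx = residualize_on_trend(x)
    return sum(a * b for a, b in zip(rx, ry)) / sum(a * a for a in rx)
```

If two series merely share a common trend, their detrended residuals are unrelated and the slope shrinks toward zero, which is the pattern reported above for some of the measures.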

In unreported results, we include a trend alongside the CU dividend premium and the

initiation announcement effect. This changes our earlier inferences only in the case of the CU

dividend premium: It does not have explanatory power beyond a common trend. We have also

considered the raw (unstandardized) average initiation announcement effect, which we did not

examine earlier. It turns out to have a positive but insignificant univariate relationship with

initiations; however, it becomes significant in the presence of a trend term.

B. Time-varying investment opportunities

We now turn to economics-based alternative explanations. The relationship in Figure 2

could be an artifact of time variation in investment opportunities, in an environment of rational

managers and rational investors. That is, nonpayers may be initiating dividends not because they

are chasing the relative premium on payers but because their investment opportunities are low in

an absolute sense. An inverse relationship between dividends and investment opportunities could

follow if external finance is costly, as in Myers (1984) and Myers and Majluf (1984), or if

dividends are a response to agency costs of free cash flow, as in Jensen (1986). This is a natural

alternative explanation that is worth considering in some detail.

A first point is that this explanation makes the converse prediction that payers will be

more likely to omit when their investment opportunities are high. This would imply a negative

relationship between the dividend premium and the rate at which firms continue paying, not

positive as we found earlier. Therefore, this alternative could apply only to initiations.

To examine its bite (for initiations), a straightforward test is to simply control for the

level of investment opportunities and see if the dividend premium retains residual explanatory

power. We consider two potential measures of investment opportunities, the average market-to-

book of the set of firms in question and the overall CRSP value-weighted dividend yield. The

first and fourth columns in Table 7 show the results. The investment opportunities proxies enter

with the predicted signs – nonpayers are less likely to initiate when their average market-to-book

is high, and when the overall dividend-price ratio is low. For continuations and new lists,

however, these variables enter with the wrong sign for the alternative explanation. More

important, the dividend premium coefficient is not much affected.

The investment opportunities view also makes similar predictions for both repurchases

and dividends, while catering involves only the latter. Thus we can examine whether the rate of

repurchase is also related to the dividend premium, or only the rate of dividend initiation. We

construct aggregate time series measures of the rate of repurchase, defining a repurchase as a

nonzero purchase of common and preferred stock (Compustat item 115). The first usable year is

1972. We find that the rate of repurchase among all firms, and the rate at which firms “initiate”

repurchases (new repurchasers in year t divided by surviving non-repurchasers), have an

insignificant negative correlation with the lagged dividend premium (unreported). The dividend

initiation rate, by contrast, has a correlation of 0.73 over the same 29-year period.
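
The repurchase "initiation" rate described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names and the dictionary layout (firm identifier mapped to the Compustat item 115 amount) are hypothetical, and a firm is treated as surviving if it appears in both years.

```python
def is_repurchaser(purchase_amount):
    """A firm-year counts as a repurchase if the purchase amount is nonzero."""
    return purchase_amount is not None and purchase_amount != 0


def repurchase_initiation_rate(prev_year, curr_year):
    """New repurchasers in year t divided by surviving non-repurchasers.

    prev_year, curr_year: dicts mapping a firm id to its purchase amount
    (hypothetical layout). Firms missing from curr_year did not survive.
    """
    survivors = [firm for firm, amount in prev_year.items()
                 if not is_repurchaser(amount) and firm in curr_year]
    if not survivors:
        return float("nan")
    new_repurchasers = sum(1 for firm in survivors
                           if is_repurchaser(curr_year[firm]))
    return new_repurchasers / len(survivors)
```

The same construction, applied to dividend payments instead of repurchases, gives a rate comparable to the dividend initiation rate discussed in the text.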

Finally, time-varying investment opportunities lead more naturally to variation in the

level of dividends, not necessarily the number of firms paying a dividend as is the essence of

initiations and omissions. Thus, this alternative hypothesis would predict that the dividend

premium should bear an even stronger relationship to the level of dividends, whereas catering to

category investors would not necessarily predict a relationship in levels. Consistent with the

latter view, we find that neither the payout ratio nor the dividend yield is significantly correlated

with the lagged dividend premium (unreported), where we use updated data from Shiller (1989)

on earnings and dividends for the S&P 500 and the CRSP value-weighted dividend yield. Also

note that we control for the dividend yield directly in the last three columns of Table 7. This

actually increases the effect of the dividend premium on the initiation rate.

All of this casts doubt on the ability of investment opportunities to explain the connection

between initiations and the dividend premium. Moreover, this story has fundamental difficulties

addressing the connection to future relative returns or the CU dividend premium (though as

mentioned above, this relationship is not apparent beyond a common time trend).

C. Correlated errors in forecasting investment opportunities

The second alternative explanation we consider is a variant of the first and is suggested

by the referee. Perhaps managers and investors make correlated errors in their forecasts. That is,

investors sometimes get excited about growth prospects and bid up the price of nonpayers, who

they feel are better suited to exploit new opportunities. Managers, rather than catering to this

sentiment, are equally smitten and choose to invest all available resources rather than paying any

dividends. This story is better than the rational expectations version outlined above in that it can

address the return predictability results, but otherwise it has the same shortcomings.

D. Time-varying sample characteristics

The results could arise because our dividend demand measures are somehow related to

the cross-sectional distribution of dividend-relevant characteristics within payer and nonpayer

samples. This is a more general version of the investment opportunities explanation discussed

above. As a contrived example, suppose the variance of investment opportunities among

nonpayers increases (for some reason) whenever the dividend premium increases. Then an

increasing initiation rate could indicate that a relatively high fraction of nonpayers do not need to

retain cash, not that nonpayers as a group are catering to the dividend premium. Note that in this

example, the average investment opportunities of nonpayers are held constant, so the time series

exercises in Table 7 would mistakenly attribute the effect to the dividend premium.

We can evaluate this explanation by controlling directly for sample characteristics. In

particular, we examine whether the dividend premium helps to explain the residual variation in

dividend decisions after controlling for the characteristics studied by Fama and French (2001).

They model the expected probability that a firm is a payer as a function of four variables:

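\[
\Pr(\mathit{Payer}_{it} = 1) = \operatorname{logit}\!\left(a + b\,NYP_{it} + c\left(\frac{M}{B}\right)_{it} + d\left(\frac{dA}{A}\right)_{it} + e\left(\frac{E}{A}\right)_{it}\right) + u_{it}, \tag{12}
\]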

where size NYP is the NYSE market capitalization percentile, i.e. the percentage of firms on the

NYSE having smaller capitalization than firm i in that year. Market-to-book M/B is measured as

defined previously, with the slight modification that here we use the fiscal year closing stock

price (Compustat item 199) instead of the calendar year close. Growth dA/A in book assets

(Compustat item 6) is self-explanatory. Profitability E/A is earnings before extraordinary items

(18) plus interest expense (15) plus income statement deferred taxes (50) divided by book assets.

The error term u is the residual propensity to pay dividends for a given firm-year.
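
As a rough illustration of equation (12), the sketch below evaluates the fitted probability that a firm is a payer and the implied firm-year residual. The coefficient names mirror a, b, c, d, and e in the equation; the function names and the numeric inputs are hypothetical and are not estimates from the paper.

```python
import math


def payer_probability(nyp, market_to_book, asset_growth, profitability, coef):
    """Logistic function of the four Fama-French characteristics.

    coef: dict with keys "a", "b", "c", "d", "e" (illustrative values only).
    """
    index = (coef["a"]
             + coef["b"] * nyp
             + coef["c"] * market_to_book
             + coef["d"] * asset_growth
             + coef["e"] * profitability)
    return 1.0 / (1.0 + math.exp(-index))


def residual_propensity(is_payer, fitted_probability):
    """u_it: actual payer indicator (0 or 1) minus the fitted probability."""
    return is_payer - fitted_probability
```

A positive residual means the firm pays even though its characteristics predict a low probability of paying; averaging these residuals within a year is what the second stage uses.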

The tests proceed in two stages. In the first stage, we follow Fama and French in

estimating firm-level logit regressions using these firm characteristics. As before, we examine

dividend payment separately among surviving nonpayers, surviving payers, and new lists. We

also follow them in estimating specifications that exclude M/B – they suggest that the degree to

which this variable measures investment opportunities may change over time, and indeed we

have been arguing that this variable is affected by uninformed investor demand.

In the second stage, we regress the average annual prediction errors, or the aggregate

“propensity to pay,” on the value-weighted dividend premium. For example, naming PTI the

residual rate of initiation or the “propensity to initiate,” we estimate:

\[
PTI_t = f + g\,P^{D-ND}_{t-1} + v_t, \quad \text{where} \quad PTI_t \equiv \frac{1}{N}\sum_i u_{it}. \tag{13}
\]

Explanatory power for the propensity to initiate (or, analogously, the propensity to continue

PTC or the propensity to list as a payer PTL) would mean that the dividend premium is not

affecting dividend policy through the average or the cross-sectional distribution of any of these

four characteristics.17 The regression in (13) is analogous to our earlier time series regressions,

such as equation (11), but now the effect of varying characteristics has been removed. Note that

the two-stage approach gives deference to the characteristics variables by allowing the dividend

premium to explain only residual variation. And in terms of statistical power, the dividend

premium is using only 38 data points to fit, not thousands like the characteristics.
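
The second stage can be sketched as below, assuming the first-stage fitted probabilities are already in hand. All names are hypothetical, the records layout is an assumption, and the plain one-regressor OLS shown here omits the standard errors that the reported results would require.

```python
from collections import defaultdict


def annual_mean_errors(records):
    """Average prediction error per year (the aggregate propensity to pay).

    records: iterable of (year, payer_indicator, fitted_probability).
    """
    totals, counts = defaultdict(float), defaultdict(int)
    for year, payer, prob in records:
        totals[year] += payer - prob
        counts[year] += 1
    return {year: totals[year] / counts[year] for year in totals}


def ols(x, y):
    """Return (intercept, slope) from a one-regressor least squares fit."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def second_stage(propensity_by_year, premium_by_year):
    """Regress the year-t propensity on the dividend premium of year t-1."""
    years = sorted(y for y in propensity_by_year if (y - 1) in premium_by_year)
    lagged_premium = [premium_by_year[y - 1] for y in years]
    propensity = [propensity_by_year[y] for y in years]
    return ols(lagged_premium, propensity)
```

Because the second stage uses one observation per year, its sample size is the number of years, which is the point the text makes about statistical power.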

Table 8 shows the results of this exercise. The first stage results indicate that size and

profitability have the most robust effects on the propensity to pay, as Fama and French find. The

right column shows the second stage results. In general, controlling for characteristics directly,

the dividend premium retains statistically significant explanatory power for most subsamples.

Comparing these coefficients to our earlier time series results, one can see that controlling for

firm characteristics barely affects the initiation rate coefficient. It is 3.90 in Table 5, and

controlling for characteristics moves it only slightly, and does not affect its statistical

significance. We view this as compelling evidence that the dividend premium is not working

through a background correlation with the level or distribution of characteristics.

Controlling for characteristics also tends to improve the post-1980 correlation between

initiations and the dividend premium, as shown in Panel B of Figure 2. This suggests another

perspective on the poor post-1980 fit of the raw initiation rate: namely, the raw rate was

depressed by the recent influx of small, unprofitable, high market-to-book firms noted by Fama

and French. Within the language of the model, firms with these characteristics would tend to

have high fundamental costs of paying dividends. Controlling for characteristics may better

17 Including the dividend premium directly in equation (13) and estimating the coefficients in a panel regression

gives qualitatively similar results to our two-stage procedure (unreported). A panel regression is necessary in that

specification because the dividend premium does not vary within a year, as the Fama-MacBeth procedure requires.

reveal the partial effect of the dividend premium.18 Interestingly, the only period where the rate

of initiation is sharply lower than would be expected from the dividend premium is the early

1970s. The most obvious explanation is the Nixon dividend controls (1971-1974).

Controlling for characteristics does tend to reduce the effect of the dividend premium

among the other samples, however. That characteristics would help to explain omissions might

be expected given that omissions are known to be associated with characteristics such as low

profitability. Nevertheless, the dividend premium approaches statistical significance even in this

sample, and remains statistically significant in the new list sample.

The methodology in Table 8 is also useful for confirming once again that our empirical

results, like our theory, are mainly about the decision whether to pay dividends, not how much to

pay. That is, we have constructed a time series of the raw rate of dividend increases and found

that it has a significant positive correlation with the dividend premium (unreported), but that this

result comes entirely from changing characteristics like profitability. When these characteristics

are accounted for using the two-stage procedure, there is no relationship between the residual

propensity to increase dividends and the dividend premium (unreported).

Finally, we can also ask whether the average annual prediction errors from Table 8

predict the relative returns of payers and nonpayers. In other words, we ask whether the non-

characteristics-related variation in dividend policy, which is presumably more closely related to

catering, also predicts returns. We find that the average prediction errors indeed have comparable

18 To be fully consistent with the theory, this interpretation of the 1980s and 1990s also involves low-frequency

measurement error in the dividend premium. If the true dividend premium was negative, then as mentioned before,

the low post-1980 correlation with the raw level of initiations is not problematic for the theory. The fact that

controlling for characteristics improves the correlation suggests that the true dividend premium may have been

positive over some of this period, but that we measure it with low-frequency error. For example, if nonpayers

typically have greater intangible assets, their market-to-books could be naturally higher than those of payers, but

higher-frequency variation in the dividend premium might still be informative about demand for dividends.

or greater predictive power than the raw dividend payment measures (unreported). This indicates

that the predictability results also do not work through firm characteristics.

E. Time-varying contracting problems

Another class of alternative explanations involves time-varying contracting problems,

such as adverse selection or agency. With regard to adverse selection, it is possible that when

nonpayers trade at a low value, this is a particularly important time for them to signal their

investment opportunities. Initiating dividends serves as a signal in the models of Bhattacharya

(1979), Hakansson (1982), John and Williams (1985), and Miller and Rock (1985). Once again,

the natural way to evaluate this hypothesis is to control for the level of nonpayer market-to-book

directly, and examine whether the dividend premium has residual explanatory power. The results

in Table 7 show that it does. Moreover, it is hard to imagine a rational expectations equilibrium

model in which dividend policy choices predict future returns, or would have any natural reason

to be correlated with the CU dividend premium.

Agency costs may also vary over time, with high agency costs requiring dividend

payments. For example, La Porta, Lopez-de-Silanes, Shleifer, and Vishny (2000) find that

dividend policy varies across countries according to the degree of investor protection. If the

dividend premium were a simple time trend, this could be a more compelling explanation for our

results. As it stands, this explanation requires governance to improve briefly in the late 1960s,

deteriorate, and then improve again. Of course, it is possible that variation in investment

opportunities and profits might affect agency costs, but this would be addressed in Table 8. Here,

one must imagine agency problems that arise independent of firm characteristics.

V. On the source of demand for dividends

Process of elimination, as well as the close connection between the results and

predictions of the model, suggests that managers are catering to investor demand. Here we ask

the follow-up question: Which investors are managers catering to? Put differently, what drives

the dividend premium? These are hard questions and we offer only preliminary conclusions. The

two basic possibilities are traditional dividend clienteles or sentimental investors.

A. Dividend clienteles

Black and Scholes (1974) suggest that uninformed demand for dividends results from

dividend clienteles, which in turn derive from such imperfections as taxes, transaction costs, or

institutional investment constraints.19 In general, rational clienteles would be satisfied by a

supply response in the aggregate level of dividends, not the number of dividend-paying shares.

Also, if they are diversified, rational clienteles will not care about how the supply response is

distributed across firms. In fact, Marsh and Merton (1987) point out that current dividend payers,

with high financial slack and modest investment opportunities, are probably the lowest marginal

cost suppliers of dividends. These considerations suggest that if the dividend premium were

varying in response to rational clientele demands, it should have a closer connection to the level

of dividends than the number of payers. We find the opposite.

Another approach is to see if we can directly match up the dividend premium with any

plausible proxies for clienteles. A natural proxy for tax clienteles, for example, is the relative tax

advantage of dividend income versus capital gains. Figure 1 suggests that the 1986 Tax Reform

Act, which should have shrunk the anti-dividend clientele, had no visible effect on the dividend

19 Miller and Scholes (1978) propose that tax code changes could have no influence, because taxes on dividends can

be postponed indefinitely. However, Peterson, Peterson, and Ang (1985) find empirically that most investors do not

avoid taxation.

premium. Similarly, Hubbard and Michaely (1997) study the reaction of the CU dividend

premium to the 1986 reform. They conclude that tax-motivated clienteles do not seem to affect

that variable. As an aside, the lack of a differential reaction to the reform by payers and

nonpayers also seems inconsistent with dividend tax capitalization.

Table 7 contains a more formal test of whether firms are catering to tax clienteles. The

personal tax advantage for dividends (typically a net disadvantage) is measured as the ratio of the

after-tax income from a dollar of dividends to a dollar of long-term capital gains. That is, we take

one minus the average marginal income rate, divided by one minus the average marginal long-

term capital gains rate. The tax rates in this calculation are weighted average rates across

shareholder groups as calculated by the NBER TAXSIM model. They are reported at

www.nber.org/~taxsim/mrates/mrates2.html and described by Feenberg and Coutts (1993). Table

7 shows that if anything, the initiation rate is positively related to this variable, not negatively

related, and in any case its inclusion does not much affect the dividend premium coefficient. (In

Panel C, the large t-statistic on taxes disappears when a trend is included because of trends in

both the rate at which lists pay and the tax advantage variable.)
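
The tax advantage variable defined above reduces to a one-line ratio. The sketch below is illustrative only: the function name is hypothetical, and the inputs stand in for the weighted average marginal rates that the text takes from TAXSIM.

```python
def dividend_tax_advantage(income_rate, capital_gains_rate):
    """After-tax value of a dollar of dividends relative to a dollar of
    long-term capital gains: (1 - income rate) / (1 - capital gains rate).

    Rates are fractions (0.40 means 40 percent). A value below one means
    dividends are tax-disadvantaged relative to capital gains.
    """
    if not 0.0 <= capital_gains_rate < 1.0:
        raise ValueError("capital gains rate must be in [0, 1)")
    return (1.0 - income_rate) / (1.0 - capital_gains_rate)
```

With an income rate of 40 percent and a capital gains rate of 20 percent, for example, the ratio is roughly 0.75, a net disadvantage for dividends.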

Transaction costs also vary over time, changing the cost of homemade dividends. Perhaps

this induces changes in demand by transaction cost clienteles. Black (1976) dismisses this

argument, pointing out that there are simple institutional solutions to the problem of the small

investor’s transaction costs. However, Jones (2001) shows that transaction costs have declined

dramatically since the mid-1970s, which coincides with the reduction in the rate of initiation that

we document.20 Jones’s Figure 3 shows the average annual one-way transaction cost for the

NYSE, or one half of the bid-ask spread plus commissions. This series is strongly positively

20 The rise of mutual funds roughly coincides with these falling transaction costs, potentially lowering an individual

investor’s cost of monetizing capital gains further still.

correlated with the rate of initiation, though this comes mostly from a common time trend; the

correlation between the detrended variables is not significant (unreported). More importantly, in

regressions that include both variables, the dividend premium has more statistical significance

than transaction costs in explaining the initiation rate (unreported).

Another theoretical possibility is that dividend clienteles are motivated by institutional

investment constraints. For instance, the 1974 Employee Retirement Income Security Act may

have increased the pro-dividend clientele by creating a vague “prudent man” rule for pension

funds. The law was revised in 1979 to allow pension funds to provide venture capital, thus

erasing any doubt that nonpayers were acceptable investments and perhaps shrinking the

dividend clientele. Figure 2 could be broadly consistent with these institutional shifts. However,

the dividend premium seems to anticipate the law, peaking in 1972 and beginning its drop in

1977. Perhaps ERISA is part of the story in this period, but we are not aware of investment

constraints that could explain the dividend premium over the 1960s and early 1970s.

The rational clientele explanations for the dividend premium also face some difficulty

accounting for the magnitude of the return predictability effects. Even under limited arbitrage, in

equilibrium the marginal clientele investor should still be indifferent to leaving the clientele or

taking advantage of the mispricing that his colleagues presumably induce. But the marginal

clientele investor’s savings on transaction costs or taxes, for example, seem unlikely to be worth

a tradeoff of nine percentage points per year in pre-tax expected returns.

B. Sentiment for dividends

For these reasons we conclude that rational dividend clienteles are unlikely to be driving

the dividend premium. This leaves time-varying sentiment between payers and nonpayers as the

remaining explanation. Of course, economists are just beginning to understand sentiment, so

such hypotheses are harder to reject by construction. Here we attempt to provide some rejectable

tests for the sentiment-based view of demand for dividends.

We outlined two specific sentiment mechanisms earlier in the paper. One was based on

the bird-in-the-hand fallacy and time-varying risk aversion. It proposes that when investors are

highly tolerant of risk, they stray more from the perceived safety of dividend-paying stocks.

Another story involves time-varying investor perceptions of growth opportunities. It holds that

uninformed or unsophisticated investors use dividend policy to infer a firm’s investment plans.

From a zero-payout policy (controlling for profitability), they tend to infer that the firm wants to

reinvest and grow. And so when they believe the outlook for growth stocks is generally good,

they favor nonpayers. When they feel it is bad, they favor payers. Either sentiment mechanism

seems consistent with most of our results, including return predictability.

As a first test of these mechanisms, we compare the closed-end fund discount with the

dividend premium. Zweig (1973) and Lee, Shleifer, and Thaler (1991) view the closed-end fund

discount as a measure of general investor sentiment. Whether it reflects risk tolerance,

expectations for growth stocks, or both, is far from clear. A positive correlation between the

closed-end fund discount and the dividend premium would therefore be consistent with both

mechanisms outlined above, and would not be predicted by any of the alternative explanations

we have considered. We gather value-weighted discounts on closed-end stock funds for 1962

through 1993 from Neal and Wheatley (1998), for 1994 through 1998 from

CDA/Wiesenberger, and for 1999 and 2000 from the discounts on stock funds reported in the

Wall Street Journal in the turn-of-the-year issues.

Figure 3 shows the relationship between the dividend premium and the closed-end fund

discount. They are not perfectly synchronous, but they are visibly related. The correlation is 0.37

with a p-value of 0.02. This provides some initial support for the sentiment mechanisms.
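
As a rough check on how a correlation of this size maps into significance, the sketch below computes the standard t-statistic for a sample correlation. The sample size of 39 is an assumption (one observation per year for 1962 through 2000, the span of the discount data described above), and the function name is hypothetical.

```python
import math


def correlation_t_stat(r, n):
    """t-statistic for the null of zero correlation, with n - 2 degrees of freedom."""
    if n <= 2 or abs(r) >= 1.0:
        raise ValueError("need n > 2 and |r| < 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# n = 39 is an assumption based on annual data for 1962 through 2000.
t_stat = correlation_t_stat(0.37, 39)
print(round(t_stat, 2))  # prints 2.42
```

A t-statistic near 2.4 with 37 degrees of freedom corresponds to a two-sided p-value in the neighborhood of the reported 0.02.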

To tie this back to our basic results, Table 9 uses the closed-end fund discount as an

instrumental variable for the dividend premium. The table also uses lagged capital gains and

future relative returns on payers and nonpayers as instruments. The logic for using future relative

returns is that they are arguably a purer (though perhaps noisier) measure of sentiment for

dividends than the dividend premium. Recent capital gains on the market could capture either of

the mechanisms outlined above – after a crash, unsophisticated investors may tend more toward

the “bird in the hand” rationale, and also view general growth opportunities as bleak.21

Table 9 shows that the instrumental variables coefficients are, in general, about as

statistically and economically significant as the basic OLS coefficients. For the specification that

uses future returns as an instrument, this merely puts the earlier predictability results in units of

the dividend premium. For the other specifications, the results have a more novel value. At a

minimum, they confirm that the specific component of the dividend premium associated

with these variables helps to explain rates of initiation and omission, thus casting doubt on

generic “omitted third factor” alternative explanations. To the extent that the instruments pick up

investor sentiment, the results provide affirmative support for a sentiment interpretation.
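
The logic of the instrumental variables estimates can be illustrated with the textbook single-instrument estimator, in which the slope is the covariance of the instrument with the outcome divided by its covariance with the regressor. This is a sketch of the general technique, not the estimation routine behind Table 9; the function names are hypothetical.

```python
def _covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def iv_slope(outcome, regressor, instrument):
    """Just-identified IV slope: cov(z, y) / cov(z, x).

    outcome: e.g. the initiation rate; regressor: the lagged dividend
    premium; instrument: e.g. the closed-end fund discount.
    """
    cov_zx = _covariance(instrument, regressor)
    if cov_zx == 0:
        raise ValueError("instrument is uncorrelated with the regressor")
    return _covariance(instrument, outcome) / cov_zx
```

Only the variation in the dividend premium that moves with the instrument identifies the slope, which is why the estimates speak to the "omitted third factor" concern.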

VI. Conclusion

We develop a catering theory of dividends that focuses on the market efficiency

assumption of the M&M dividend irrelevance proof. It adds to the collection of theories of

21 We thank Lubos Pastor for suggesting that we use past capital gains in this manner.

dividends that relax other specific assumptions of the proof. The essence of catering is that

managers give investors what they want. In the setting of dividends, catering implies that

managers will tend to initiate dividends when investors put a relatively high stock price on

dividend payers, and tend to omit dividends when investors prefer nonpayers. A simple model

formalizes the key tradeoffs between maximizing fundamental value and catering, and offers

testable time-series and cross-sectional predictions.

Our empirical work focuses on the central time-series prediction of the model: a positive

relationship between the rates of dividend initiation and omission and the difference between the

prevailing stock prices of payers and nonpayers. We test this relationship using four measures of

investor demand for payers. The aggregate initiation rate is significantly positively related to all

four (however, in one case this does not amount to more than a common trend). One proxy for

investor demand for payers, the difference between the average market-to-book ratios of payers

and nonpayers – the “dividend premium” – explains a statistically impressive three-fifths of the

annual variation in the rate of initiation. In addition, the rate of omission is significantly

negatively related to two of the four proxies for investor demand. After reviewing other

possibilities, we conclude that catering is the most natural explanation.

We then inquire about the source of time-varying demand for dividends. We do not find

strong evidence for a traditional dividend clientele. Instead, investor sentiment appears to affect

the demand for dividends. This is suggested in the connection between the closed-end fund

discount and the dividend premium, and in instrumental variables estimates of the effect of the

dividend premium on dividend payment. In Baker and Wurgler (2002b), we review academic

histories of the capital markets and historical financial news articles to construct a detailed

timeline of how investor attitudes toward dividends have changed over time.

Appendix

This appendix describes the simulations that generate the bias-adjusted coefficients and

p-values reported in Table 6. As discussed by Stambaugh (1999), a small-sample bias arises

when the explanatory variable is persistent and there is a contemporaneous correlation between

innovations in the explanatory variable and stock returns. For example, in the following system

\[
R_t = a + b\,X_{t-1} + u_t \tag{A1}
\]

\[
X_t = c + d\,X_{t-1} + v_t, \tag{A2}
\]

the bias is equal to

\[
E[\hat{b} - b] = \frac{\sigma_{uv}}{\sigma_v^2}\,E[\hat{d} - d], \tag{A3}
\]

where the hats represent OLS estimates. Kendall (1954) shows the OLS estimate of d has a

negative bias. The bias for OLS b is therefore of the opposite sign to the sign of the covariance

between innovations in dividend policy and returns.

The sign of this covariance is not obvious a priori (unlike when the predictor is a scaled-

price variable). To address the potential for bias and conduct inference, we use a bootstrap

estimation technique. The approach is identical to Baker and Stein (2002) and is similar to that

used in Vuolteenaho (2001), Kothari and Shanken (1997), Stambaugh (1999), and Ang and

Bekaert (2001). For each regression in Table 6, we perform two sets of simulations.

The first set generates a bias-adjusted point estimate. We simulate (A1) and (A2)

recursively starting with X0, using the OLS coefficient estimates, and drawing with replacement

from the empirical distribution of the errors u and v. We throw out the first 100 draws (to draw

from the unconditional distribution of X), then draw an additional N observations, where N is the

size of the original sample. (For the cumulative three-year regressions, the number of additional

draws is one third the size of the original sample, since it contains overlapping returns.) With

each simulated sample, we re-estimate (A1). This gives us a set of coefficients b*. The bias-

adjusted coefficient BA reported in Table 6 subtracts the bootstrap bias estimate (the mean of b*

minus the OLS b) from the OLS b.

In the second set of simulations, we redo everything as above under the null hypothesis of

no predictability – that is, imposing b equals zero. This gives us a second set of coefficients b**.

With these in hand, we can determine the probability of observing an estimate as large as the

OLS b by chance, given the true b = 0. These are the p-values in Table 6.
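
The two sets of simulations can be sketched as follows. Several choices here are assumptions rather than details given in the text: the residuals u and v are drawn as a pair so that their contemporaneous correlation is preserved, the p-value is one-sided in the direction of the OLS estimate, and the overlapping three-year case is not handled. All function names are hypothetical.

```python
import random


def ols(x, y):
    """Return (intercept, slope, residuals) from a one-regressor OLS fit."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return intercept, slope, resid


def simulate(a, b, c, d, u, v, x0, n, rng, burn=100):
    """Simulate (A1) and (A2) recursively, discarding the first `burn` draws."""
    x_prev, x_lag, returns = x0, [], []
    for step in range(burn + n):
        k = rng.randrange(len(u))  # assumption: draw (u, v) jointly
        if step >= burn:
            x_lag.append(x_prev)
            returns.append(a + b * x_prev + u[k])
        x_prev = c + d * x_prev + v[k]
    return x_lag, returns


def bootstrap(r, x, n_sims=500, seed=0):
    """r[t] is the return following predictor value x[t]; len(x) == len(r) + 1.

    Returns (OLS b, bias-adjusted b, p-value under the null b = 0).
    """
    x_lag, x_next = x[:-1], x[1:]
    a, b, u = ols(x_lag, r)
    c, d, v = ols(x_lag, x_next)
    rng = random.Random(seed)
    n = len(r)
    # First set: simulate under the OLS estimates to measure the bias in b.
    b_star = [ols(*simulate(a, b, c, d, u, v, x[0], n, rng))[1]
              for _ in range(n_sims)]
    b_adjusted = b - (sum(b_star) / n_sims - b)
    # Second set: impose b = 0 and ask how often chance matches the OLS b.
    b_null = [ols(*simulate(a, 0.0, c, d, u, v, x[0], n, rng))[1]
              for _ in range(n_sims)]
    if b >= 0:
        extreme = sum(1 for s in b_null if s >= b)
    else:
        extreme = sum(1 for s in b_null if s <= b)
    return b, b_adjusted, extreme / n_sims
```

Fixing the seed makes the simulated coefficients reproducible; a larger `n_sims` reduces simulation noise in both the bias estimate and the p-value.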

References

Allen, Franklin, Antonio E. Bernardo, and Ivo Welch, 2000, A theory of dividends based on tax

clienteles, Journal of Finance 55, 2499-2536.

Allen, Franklin, and Roni Michaely, 2002, Payout policy, University of Pennsylvania working

paper.

Ali, Ashiq, Hwang, Lee-Seok, and Mark A. Trombley, 2002, Arbitrage risk and the book-to-

market mispricing, Journal of Financial Economics (forthcoming).

Ang, Andrew and Geert Bekaert, 2001, Stock return predictability: Is it there?, NBER working

paper #8207.

Asquith, Paul, and David W. Mullins, Jr., 1983, The impact of initiating dividend payments on

shareholders’ wealth, Journal of Business 56, 77-96.

Baker, Malcolm, and Serkan Savasoglu, 2002, Limited arbitrage in mergers and acquisitions,

Journal of Financial Economics (forthcoming).

Baker, Malcolm, and Jeremy C. Stein, 2002, Market liquidity as a sentiment indicator, Harvard

University working paper.

Baker, Malcolm, Stein, Jeremy C., and Jeffrey Wurgler, 2001, When does the market matter?

Stock prices and the investment of equity-dependent firms, Harvard University working

paper.

Baker, Malcolm and Jeffrey Wurgler, 2000, The equity share in new issues and aggregate stock

returns, Journal of Finance 55, 2219-2257.

Baker, Malcolm and Jeffrey Wurgler, 2002a, Market timing and capital structure, Journal of

Finance 57, 1-32.

Baker, Malcolm and Jeffrey Wurgler, 2002b, Why are dividends disappearing? An empirical

analysis, Harvard University working paper.

Baker, Malcolm, Robin Greenwood, and Jeffrey Wurgler, 2002, The maturity of debt issues and

predictable variation in bond returns, Journal of Financial Economics (forthcoming).

Barberis, Nicholas, and Andrei Shleifer, 2002, Style investing, Journal of Financial Economics

(forthcoming).

Barberis, Nicholas, Andrei Shleifer, and Robert W. Vishny, 1998, A model of investor

sentiment, Journal of Financial Economics 49, 307-343.

Barberis, Nicholas, Andrei Shleifer, and Jeffrey Wurgler, 2001, Comovement, University of

Chicago working paper.

Benartzi, Shlomo, Roni Michaely, and Richard Thaler, 1997, Do changes in dividends signal the

future or the past?, Journal of Finance 52, 1007-1034.

Black, Fischer, and Myron S. Scholes, 1974, The effects of dividend yield and dividend policy

on common stock prices and returns, Journal of Financial Economics 1, 1-22.

Black, Fischer, 1976, The dividend puzzle, Journal of Portfolio Management, 5-8.

Blanchard, Olivier, Chanyong Rhee, and Lawrence Summers, 1990, The stock market, profit,

and investment, Quarterly Journal of Economics 108, 115-136.

Boehme, Rodney D., and Sorin M. Sorescu, 2002, The long-run performance following dividend

initiations and resumptions: Underreaction or product of chance?, Journal of Finance 57,

871-900.

Brav, Alon, and J. B. Heaton, 1998, Did ERISA's prudent man rule change the pricing of

dividend omitting firms?, Duke University working paper.

Brown, Stephen, and Jerold Warner, 1980, Measuring security price performance, Journal of

Financial Economics 8, 205-258.

Campbell, John Y., Martin Lettau, Burton G. Malkiel, and Yexiao Xu, 2001, Have individual

stocks become more volatile? An empirical exploration of idiosyncratic risk, Journal of

Finance 56, 1-44.

Campbell, John Y., Andrew W. Lo, and A. Craig MacKinlay, 1997, The Econometrics of

Financial Markets, (Princeton University Press, Princeton, NJ).

Chen, Joseph, Harrison Hong, and Jeremy C. Stein, 2002, Breadth of ownership and stock

returns, Journal of Financial Economics (forthcoming).

Cooper, Michael J., Orlin Dimitrov, and P. Raghavendra Rau, 2001, A rose.com by any other

name, Journal of Finance 56, 2371-2388.

Daniel, Kent, Hirshleifer, David, and Avanidhar Subrahmanyam, 1998, Investor psychology and

security market under- and overreactions, Journal of Finance 53, 1839-1885.

D’Avolio, Gene, 2002, The market for borrowing stock, Journal of Financial Economics

(forthcoming).

DeAngelo, Harry, Linda DeAngelo, and Douglas J. Skinner, 1996, Dividend signaling and the

disappearance of sustained earnings growth, Journal of Financial Economics 40, 341-371.

DeLong, J. Bradford, Andrei Shleifer, Lawrence H. Summers, and Robert Waldmann, 1990,

Noise trader risk in financial markets, Journal of Political Economy 98, 703-738.

Del Guercio, Diane, 1996, The distorting effect of the prudent-man laws on institutional equity

investments, Journal of Financial Economics 40, 31-62.

Duffie, Darrell, Nicolae Garleanu, and Lasse Heje Pedersen, 2002, Securities lending, shorting,

and pricing, Journal of Financial Economics (forthcoming).

Eades, Kenneth M., Patrick J. Hess, and E. Han Kim, 1994, Time-series variation in dividend

pricing, Journal of Finance 49, 1617-1638.

Fama, Eugene F., and Harvey Babiak, 1968, Dividend policy: An empirical analysis, Journal of

the American Statistical Association 63, 1132-1161.

Fama, Eugene F., and Kenneth R. French, 2001, Disappearing dividends: Changing firm

characteristics or lower propensity to pay?, Journal of Financial Economics 60, 3-44.

Feenberg, Daniel, and Elizabeth Coutts, 1993, An introduction to the Taxsim model, Journal of

Policy Analysis and Management 12, 189-194.

Geczy, Christopher, David K. Musto, and Adam Reed, 2002, Stocks are special too: An analysis

of the equity lending market, Journal of Financial Economics (forthcoming).

Gordon, Myron J., 1959, Dividends, earnings, and stock prices, Review of Economics and

Statistics 41, 99-105.

Graham, Benjamin, and David L. Dodd, 1951, Security Analysis: Principles and Techniques

(McGraw-Hill, New York, NY).

Graham, John R., and Campbell R. Harvey, 2001, The theory and practice of corporate finance:

Evidence from the field, Journal of Financial Economics 60, 187-244.

Greenwood, Robin, 2001, Large events and limited arbitrage: Evidence from a Japanese stock

index redefinition, Harvard University working paper.

Greenwood, Robin, and Nathan Sosner, 2001, Where do betas come from?, Harvard University

working paper.

Grullon, Gustavo, and Roni Michaely, 2002, Dividends, share repurchases, and the substitution

hypothesis, Journal of Finance (forthcoming).

Hakansson, Nils H., 1982, To pay or not to pay dividends, Journal of Finance 37, 415-428.

Healy, Paul M., and Krishna G. Palepu, 1988, Earnings information conveyed by dividend

initiations and omissions, Journal of Financial Economics 21, 149-176.

Hong, Harrison, and Jeremy C. Stein, 1999, A unified theory of underreaction, momentum

trading and overreaction in asset markets, Journal of Finance 54, 2143-2184.

Hubbard, Jeff, and Roni Michaely, 1997, Do investors ignore dividend taxation? A

reexamination of the Citizens Utilities case, Journal of Financial and Quantitative

Analysis 32, 117-135.

Hyman, Leonard, 1988, America’s Electric Utilities: Past, Present, and Future (Public Utility

Reports, Arlington, VA).

Jensen, Michael C., 1986, Agency costs of free cash flow, corporate finance and takeovers,

American Economic Review 76, 323-329.

Jenter, Dirk, 2001, Managerial portfolio decisions and market timing, Harvard University

working paper.

John, Kose, and Joseph Williams, 1985, Dividends, dilution, and taxes: A signaling equilibrium,

Journal of Finance 40, 1053-1070.

Kahneman, Daniel, and Amos Tversky, 1979, Prospect theory: An analysis of decision under

risk, Econometrica 47, 263-291.

Kendall, M. G., 1954, Note on bias in estimation of auto-correlation, Biometrika 41, 403-404.

Kothari, S. P., and Jay Shanken, 1997, Book-to-market, dividend yield, and expected market

returns: A time series analysis, Journal of Financial Economics 44, 169-203.

Lamont, Owen A., and Charles M. Jones, 2002, Short sale constraints and stock returns, Journal

of Financial Economics (forthcoming).

Lamont, Owen A., and Richard H. Thaler, 2001, Can the market add and subtract? Mispricing in

tech-stock carve-outs, University of Chicago working paper.

La Porta, Rafael, Florencio Lopez-de-Silanes, Andrei Shleifer, and Robert Vishny, 2000, Agency

problems and dividend policies around the world, Journal of Finance 55, 1-33.

Lee, Charles M., Andrei Shleifer, and Richard Thaler, 1991, Investor sentiment and the closed-

end fund puzzle, Journal of Finance 46, 75-110.

Lintner, John, 1956, The distribution of incomes of corporations among dividends, retained

earnings, and taxes, American Economic Review 46, 97-113.

Long, John B., 1978, The market valuation of cash dividends: A case to consider, Journal of

Financial Economics 6, 235-264.

Malkiel, Burton G., 1999, A Random Walk Down Wall Street (Norton, New York, NY).

Marsh, Terry A., and Robert C. Merton, 1987, Dividend behavior for the aggregate stock market,

Journal of Business 60, 1-40.

Mendenhall, Richard R., 2002, Post-earnings announcement drift and arbitrage risk, Journal of

Business (forthcoming).

Miller, Merton H., and Franco Modigliani, 1961, Dividend policy, growth and the valuation of

shares, Journal of Business 34, 411-433.

Miller, Merton H., and Kevin Rock, 1985, Dividend policy under asymmetric information,

Journal of Finance 40, 1031-1051.

Miller, Merton H., and Myron Scholes, 1978, Dividends and taxes, Journal of Financial

Economics 6, 333-364.

Mitchell, Mark, and Todd C. Pulvino, 2001, Characteristics of risk and return in risk arbitrage,

Journal of Finance 56, 2135-2176.

Mitchell, Mark, Todd C. Pulvino, and Erik Stafford, 2002, Limited arbitrage in equity markets,

Journal of Finance 57, 551-584.

Morck, Randall, Andrei Shleifer, and Robert Vishny, 1990, The stock market and investment: Is

the market a sideshow?, Brookings Papers on Economic Activity 2:1990, 157-215.

Mullainathan, Sendhil, 2002, Thinking through categories, MIT working paper.

Myers, Stewart, 1984, The capital structure puzzle, Journal of Finance 39, 575-592.

Myers, Stewart, and Nicholas Majluf, 1984, Corporate financing and investment decisions when

firms have information that investors do not have, Journal of Financial Economics 13,

187-221.

Neal, Robert, and Simon M. Wheatley, 1998, Do measures of investor sentiment predict

returns?, Journal of Financial and Quantitative Analysis 33, 523-547.

Newey, Whitney K., and Kenneth D. West, 1987, A simple, positive semi-definite,

heteroskedasticity and autocorrelation consistent covariance matrix, Econometrica 55,

703-708.

Ofek, Eli, and Matthew Richardson, 2002a, DotCom mania: The rise and fall of internet stock

prices, Journal of Finance (forthcoming).

Ofek, Eli, and Matthew Richardson, 2002b, The valuation and market rationality of internet

stock prices, NYU working paper.

Peterson, Pamela, David Peterson, and James Ang, 1985, Direct evidence on the marginal rate of

taxation on dividend income, Journal of Financial Economics 14, 267-282.

Polk, Christopher, and Paola Sapienza, 2001, The real effects of investor sentiment,

Northwestern University working paper.

Pontiff, Jeffrey, 1996, Costly arbitrage: Evidence from closed-end funds, Quarterly Journal of

Economics 111, 1135-1152.

Pontiff, Jeffrey, and Michael J. Schill, 2001, Long-run seasoned equity offering returns: Data

snooping, model misspecification, or mispricing? A costly arbitrage approach, University

of Washington working paper.

Poterba, James M., 1986, The market valuation of cash dividends: The Citizens Utilities case

reconsidered, Journal of Financial Economics 15, 395-405.

Rau, P. Raghavendra, Ajay Patel, Igor Osobov, Ajay Khorana, and Michael J. Cooper, 2001, The

game of the name: Value changes accompanying dot.com additions and deletions, Purdue

University working paper.

Rosch, Eleanor, 1978, Principles of categorization, in Eleanor Rosch and Barbara B. Lloyd, eds.:

Cognition and Categorization (Lawrence Erlbaum Associates, Hillsdale, NJ).

Shefrin, Hersh M., and Meir Statman, 1984, Explaining investor preference for cash dividends,

Journal of Financial Economics 13, 253-282.