Bridging the Gap between ‘Joe Sixpack’ and ‘Bill Gates’: On the Efficiency of Institutions for Redistribution

All democracies have implemented institutions that redistribute income from the rich to the poor. Economists tend to have strong views on how this redistribution should be organized, based on the two theorems of welfare economics. However, these views are mostly neglected. I argue that the reason for this neglect is likely to be that these institutions are constrained Pareto-efficient after a century of experimentation. If not, some political bargaining would lead to the implementation of the Pareto-improvement. Hence, economists should concentrate on an explanation of the constrained efficiency of existing institutions instead of on the design of drawing table grand reforms. This approach is applied to three institutions frequently observed in reality: minimum wages, education subsidies, and unemployment insurance. We show that these institutions for redistribution are likely to be constrained efficient. We analyze the impact of the constitutional environment on the implementation of efficient redistribution. Finally, we evaluate the causes for the observed cross-country variation in redistribution.
All modern capitalist democracies apply a multitude of institutions to redistribute
income. Although there is some discussion about which groups take advantage of
this redistribution, by and large the below median citizens tend to benefit at the
expense of the wealthier half of society. This redistribution has not remained un-
disputed. Obviously, the wealthier half of society can be expected to be opposed
to it. However, there has also been genuine concern about the consequences of
redistribution policies for incentives. Redistribution can only be achieved by tax-
ing the fruits of innate or inherited abilities. However, since effort and ability are
hard to distinguish, redistribution also taxes the fruits of effort. This dilemma is
known as the trade-off between equity (taxing ability) and efficiency (exempting
effort from taxation).
During the seventies and eighties, when redistribution was at its maximum,
the task of giving policy advice was a simple one. Economists argued correctly
that there was too much redistribution. Nowadays, this era has gone. The welfare
state has been reconstructed. Hence, economists face more subtle questions to-
day, which are therefore more difficult to answer.
The effects of redistribution on economic activity are in many respects similar
to those of robbery. When Robin Hood steals money from the rich and distrib-
utes it among the poor, this undermines the incentives for the provision of effort.
Why would a rich man work when Robin Hood will steal the revenues of his
efforts? And as far as the rich have incentives for working, these incentives are
biased against plowshares and in favor of swords, in order to defend their prop-
erty rights.
Violence and robbery have therefore been means to redistribute in-
come from producers. Robbers face the equivalent of the trade-off between eq-
uity and efficiency: too much robbery will take away the incentives for producing
the commodities that are to be robbed in the future.
All modern democracies tend to redistribute income. The force supporting this
regularity is the fact that the median voter earns less than the mean income, since
the income distribution is skewed to the right, with a fat upper tail. Since the
median voter rules in a democracy, there will always be substantial support for
at least some redistribution.
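The gap between median and mean income can be illustrated with a minimal sketch. It assumes, purely for illustration, a lognormal income distribution (a common right-skewed approximation that is not taken from the text); the parameter values `mu` and `sigma` are hypothetical.

```python
import math

def lognormal_median(mu: float, sigma: float) -> float:
    """Median of a lognormal distribution with log-mean mu and log-sd sigma."""
    return math.exp(mu)

def lognormal_mean(mu: float, sigma: float) -> float:
    """Mean of the same lognormal distribution."""
    return math.exp(mu + 0.5 * sigma ** 2)

# Hypothetical parameters, chosen only for illustration.
mu, sigma = 10.0, 0.6
median_income = lognormal_median(mu, sigma)
mean_income = lognormal_mean(mu, sigma)

# For any sigma > 0 the mean exceeds the median, so a proportional tax
# rebated as an equal lump sum would benefit the median voter
# (ignoring incentive effects).
gap_ratio = mean_income / median_income  # equals exp(sigma**2 / 2)
print(f"median={median_income:.0f} mean={mean_income:.0f} ratio={gap_ratio:.3f}")
```

The larger the dispersion `sigma`, the larger the ratio of mean to median income in this sketch, and hence the larger the median voter's gain from such a transfer.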
There is indeed a parallel between the robbery of Robin Hood and the redistri-
bution by the democratic welfare state.
In fact, democracy legitimizes a state-
monopoly of violence, to imprison Robin Hood and to stop unbounded robbery,
in exchange for some ‘regulated’ redistribution/robbery by the median voter.
This compromise of democracy is probably the best protection of their property rights
against theft and robbery that the rich can buy.
This view might be outrageous
for those who see the democratic welfare state as the moral end point of political
history, but from a social scientist’s point of view it helps to understand history.
Since redistribution is an almost inevitable feature of modern democracies, the
policy advice of economists to abolish all redistribution is of little practical use.
Far more useful advice would guide policymakers in their design of
redistribution policies, in order to minimize their cost.
It is this question that will be addressed in this paper. The discussion will
draw heavily upon research of the past ten years. However, the detailed analysis
of the trade-offs that will be sketched is as yet incomplete. The paper is
therefore partly a sketch of a research program. This program draws heavily on
Becker’s efficient redistribution hypothesis, which is set out in section 2.
Section 3 provides an overview of how various institutions for redistribution fit in
our economic modeling. The role of these institutions in explaining the wide
cross-country variety in outcomes will be discussed in section 4.
In the simple world of the first two theorems of welfare economics, redistribution
is an easy task.
The first theorem deals with efficiency. Markets guarantee effi-
ciency, under conditions that are ignored for the moment. The second theorem of
welfare economics tells us that we can implement any efficient allocation by an
appropriate lump sum redistribution of income. A redistribution is lump sum when
it is not contingent on variables that are endogenous to the market process. A
complication is that such variables are hardly available (age is one of the
few). Progressive income taxation is the best analyzed example of an instrument
that furthers redistribution, albeit at the expense of the incentives for the
provision of effort, see Mirrlees. However, the main twist of the theorems of
welfare economics is maintained: the market mechanism is applied for the
allocation of commodities (the first theorem), while the distribution is
governed by an appropriate redistribution of labor income (the second theorem).
As a side remark, note that this instrument is enforced at a central level. This seems
to be a requirement for any redistribution policy. Otherwise the policy would face
an adverse selection problem: the poor would be queuing to collect their benefits,
but the rich would not show up to pay their contributions.
Regrettably, reality is too complex to be cast in this simple mold. Elster
analyzed the shape of actual institutions for redistribution, which are more
complex and diversified than is predicted by the above argument. There are more
institutions involved, like public health insurance, subsidies to education,
social insurance, old age provisions, means-tested rent subsidies and in-kind
provision of services (food stamps, child care, education). As suggested by the
title of Elster’s book, redistribution is also not completely centralized.
Furthermore, administrative procedures and conventions play a role: the order of layoff
in a firm, queuing procedures for surgical operations, the institutional environ-
ment of wage bargaining, waiting list procedures for housing cooperatives, rent
regulation, minimum wages, admission procedures to universities, screening pro-
cedures of insurance companies, and norms on the division of labor within the
family. This multitude of institutions makes the issue of their efficiency the more
pressing: is a mix of instruments really the best that can be achieved?
This question brings us to Becker’s efficient redistribution hypothesis. Becker
observes that policy advice by economists for reorganizing institutions is met with
scepticism by politicians. Usually, it is simply ignored. According to Becker’s
efficient redistribution hypothesis, the reason for this attitude is that our
institutions, though not Pareto efficient, are at least second best. Given the
constraint that the policymaker cannot distinguish between ability and effort,
the outcome implemented by the actual set of institutions is the best that can
be attained: there are no feasible Pareto improvements left. The motivation for
this hypothesis lies in the nature of the political process. As long as
proposals for Pareto improvements are floating around, it is hard to imagine a
reason why politicians would not be able
to implement them. By the definition of a Pareto-improvement there are only
gainers and no losers. The only reason might be inefficiency in the bargaining
process, i.e. politicians being unable to agree on who is entitled to the surplus of
the Pareto improvement and therefore wasting the opportunity. However, in the
long run, one would expect politicians to come to an agreement so that the po-
litical equilibrium would be close to efficient. This hypothesis does not yield a
deterministic outcome. It says that all surpluses will be distributed efficiently, but
that does not imply how they will be distributed. This issue will be discussed in
section 4.
The efficient redistribution hypothesis turns on its head the research agenda
for economists. Instead of making a simple drawing table analysis of how wel-
fare can be increased by institutional reform, economists are asked to explain
why institutions are shaped the way that they are. Two types of objections are
raised to this hypothesis. First, those involved in policymaking in particular
have objected that they observe many opportunities for Pareto improvements that
lie idle along the sidewalk. The disaster of Dutch economic policy in
the seventies is a notorious example. These examples can be viewed as short-run
deviations that would be resolved by subsequent bargaining. The second objec-
tion is by those economists who are engaged in advising on Pareto-improving
policies. They say that if the hypothesis were true, then their profession would
no longer exist. However, the same argument applies to consulting in the private
sector. Our theories of firms assume that entrepreneurs behave rationally. Do these
theories make consultants, who help entrepreneurs to find the rational decision in
the myriad of reality, superfluous?
What is the virtue of applying the efficient redistribution hypothesis? Could
we not do equally well by simply analyzing what optimal institutions look like
and ignore actual institutions? Understanding the raison d’être of actual
institutions is a safety device for any policy advice. Though the history of
most modern redistributive institutions is relatively short (a century at most),
this time span provides ample opportunity for experimentation. The forces in the
direction of efficiency had a reasonable chance to operate. Take for example
Mirrlees’ conclusion that a negative income tax is the optimal taxation scheme.
This institution has never been applied in practice, in any case not as the only
redistributive instrument. This being the case, one can proceed in either of two
directions. Either one comes up with a grand proposal for a complete overhaul of
the welfare state following the prescriptions of the theoretical analysis, or
one is more suspicious and considers the discrepancy between theoretical
prescription and reality as evidence against the adequacy of the underlying
economic model.
8 Standard Pareto efficiency takes into account three constraints: production technology, tastes and
initial endowments. The imperfect information of the redistribution agency implies that its actions
affect the incentives for players to provide effort, which yields a fourth incentive constraint, see e.g.
As a first reaction, our preference goes to the second approach. The history of
economic thought provides examples of the dangers of the first. The Modigliani-
Miller theorem, saying that the structure of liabilities of a firm is irrelevant, was
an important breakthrough in economic theory. However, businessmen have ig-
nored the dictum and have continued to care about an ‘optimal’ mix of their li-
abilities, even though they were lacking a firm notion of ‘optimality.’ They were
probably right, and nowadays economists are working to understand the reasons
for the structure of liabilities to matter. Similarly, observed institutions for redis-
tribution provide a benchmark for any theory about optimal institutions. As a
minimum, this theory should be able to explain some of the observed patterns of
redistribution. If not, one might doubt the empirical relevance of the theory. The
efficient redistribution hypothesis therefore is a safety belt against overly ambi-
tious policy advice.
By maintaining the efficient redistribution hypothesis, we ask ourselves a
question that is fundamentally different from that of Mirrlees. Mirrlees starts
by postulating a welfare function that aggregates the utilities of all
individual citizens into a single measure of aggregate welfare. The optimal tax
system maximizes this aggregate welfare. The specification of the welfare function
is a normative step, since the weights we apply for various groups of citizens
depend on a subjective judgment. By contrast, our approach is positive. An ag-
gregate welfare function is superfluous, individual preferences suffice. The distri-
bution of utility is taken to be exogenous. The question asked is whether there is
any change in the mix of instruments that leaves the distribution of utility of all
but one agent unaffected and yields a positive change for at least one individual.
As long as such a change is available, the mix is not efficient.
The condition for this type of efficiency usually takes the form of an equality
of the marginal distortions of two instruments, evaluated for each type of citizen.
This equality constraint can be verified empirically, that is, we can check whether
this condition for the efficient mix of two instruments is satisfied in a particular
country, see Bullock. Whether or not this condition holds is highly contingent
on the structure of the economy. For example, the more elastic workers’ effort
is with respect to monetary incentives, the more detrimental the effect of
marginal rates in the tax system. Hence, the higher the supply elasticity of
effort, the lower the use of income taxation as compared to other instruments.
Hence, the efficient redistribution hypothesis has testable empirical implica-
tions. Our economy will have parameters, like the supply elasticity of effort, the
elasticity of substitution between various skill types in labor demand, or the de-
gree of relative risk aversion. The trade-off between various instruments is driven
by these parameters. Both estimates of these parameters and information on the
prevailing wage distribution and redistributive institutions are therefore necessary
ingredients for testing the efficient redistribution hypothesis.
3.1 Introduction
The discussion in section 2 showed that there is almost no limit to the number of
institutions that can be considered. We shall confine our attention to institutions
that are related to the labor market. Labor accounts for two thirds of GDP, so
this restriction captures an important share of what is going on in the economy.
Nevertheless, this leaves out many important institutions, like the marriage mar-
ket, the division of labor within the family, the regional segmentation associated
with the housing market, and health insurance. Even this restriction still leaves us
with a myriad of institutions, which cannot all be analyzed in one stroke.
The analysis consists of three layers. We start with a simple Walrasian
economy, where all agents are price takers and where the central government can
levy income taxes and subsidize human capital formation. In this world, wage
setting institutions are irrelevant, since wages are determined by the equilibrium
of supply and demand. This model is similar to that in Mirrlees’ seminal paper,
with one difference: here, workers with various levels of human capital are im-
perfect substitutes, while in Mirrlees’ world, workers are perfect substitutes. This
model provides a framework for an analysis of the trade-off between subsidizing
human capital formation and a system of progressive income taxes, analogous to
ideas in Tinbergen. With imperfect substitutability between worker types,
human capital formation will affect relative wages. The history of the 19th
century provides a number of examples where an acceleration in human capital
formation yielded a subsequent compression of the wage distribution.
In the second layer, we shall extend this model with search frictions. The
search frictions create a surplus at the moment that the opportunity for a match
arises. Hence, agents are no longer price takers but have to bargain on the
distribution of this surplus. Policymakers can intervene in the outcome of the
bargaining process, for example by minimum wage legislation, mandatory
extension of collective agreements or recognition procedures for
trade unions. In this paper, we shall confine the analysis to minimum wages, as
recent evidence has documented their importance for the increase in wage dis-
persion in the United States during the eighties.
In this second layer, jobs are a one-shot opportunity. At the start of a job,
agents know the potential of a job. The future bears no uncertainty in that regard.
In the third layer, this assumption is relaxed. Agents have to make specific in-
vestments when they start a job. The per period payoff of these investments will
follow a random walk. This set up further increases the role of the institutions
for wage setting. Wage setting is no longer done once-and-for-ever at the start of
a job, as in the second layer. Each period new shocks come in that affect the
available surplus, and therefore wages. Lifetime wage inequality in the economy
depends on how these surpluses are divided between employer and employee.
The greater the share of workers in uncertain future surpluses, the greater wage
inequality. Furthermore, public social insurance benefits that are related to the
last earned wage rate tend to insure surpluses that are acquired in the early stage
of the career against depletion in later stages. The rise in early retirement in Eu-
rope during the past thirty years, which is boosted by the availability of public
social security, signifies the importance of this phenomenon for explaining the
wage distribution.
The subsequent subsections will consider each of these layers in greater detail.
3.2 The Trade-off between Taxation and Education in the Walrasian Benchmark
In the beginning of the twentieth century, the United States took the lead in what
afterwards turned out to be one of the major innovations of that era, the
education revolution. Within a few decades, the high school was transformed from
an elite institution to a service that was available to the great majority of the
population. Remarkably, this revolution did not start in the well-established,
Europe-oriented states of New England. The movement was initiated in the rich
agricultural states in the Midwest, dubbed the education belt by Goldin and
Katz. Among them, Iowa was the front-runner. By the end of the forties, the
fraction of the population having had high school education was some twenty
percentage points ahead of countries like the United Kingdom.
The newly educated cohorts started entering the labor market from 1915 on-
wards. Solid evidence on the return to education before World War II is not avail-
able. However, Goldin and Katz
survey some scattered evidence on wage
differentials between World War I and II. Their most convincing piece is a sur-
vey containing information on wages and years of education in 1915, held not
surprisingly in Iowa. This evidence shows a clear reduction in the return to
education in the period 1915-1950, see also Goldin and Margo. Only after
1980, following a period of low inflow into the university system, did the return
to education start to rise again, see Katz and Murphy and Card and Lemieux.
In the Netherlands, the timing of the high school revolution was some 30 years
later. Consequently, the fall in the return to education also lagged behind by 30
years. In 1961, the return to education was still 13% per year of education. It
dropped steadily until it reached a level of 7% in 1985. Since that time, the return
has stabilized at that low level, see Hartog, Oosterbeek and Teulings.
The education revolution started even later in most East Asian countries.
South Korea provides an interesting example. After massive investment in human
capital in the sixties and seventies, the return to education dropped rapidly
during the eighties, see Kim and Topel.
The common factor in all three examples is a decline in the return to educa-
tion after a period of investment in human capital. A useful framework for ana-
lyzing this phenomenon is the assignment model, where workers that differ by
their skill level have to be assigned to jobs that differ in their complexity, see
(1995, 1999). The driving force is the Ricardian concept of comparative
advantage: highly skilled workers have a comparative advantage in complex jobs.
The percentage productivity increase per additional unit of skill goes up with the
level of job complexity. In equilibrium, highly skilled workers will be assigned
to more complex jobs. This assignment is in the mutual interest of both low and
highly skilled workers. Low-skilled workers would earn lower wages if they were
assigned to more complex jobs because they lack adequate skills. The
high-skilled would earn lower wages in simple jobs because their skills are
irrelevant there.
In this type of world, an increase in the supply of highly skilled workers will
reduce the return to skill, because the number of highly complex jobs where these
skills can be applied most fruitfully is limited. Hence, the highly skilled workers
will end up in less complex jobs, where the return to the skill index is lower.
Hence, a general increase in the stock of human capital in the economy (i.e. the
supply of high skill workers) will reduce the return to human capital. This effect
is captured by the complexity dispersion parameter, see Teulings. This parameter
measures the percentage reduction in the return to human capital in response to
a one percent increase in the stock of human capital. Teulings and Vieira
estimate this parameter to be in the range of 2.4 to 5, with 5 being the more
plausible estimate, based on research for the United States, the Netherlands,
and Portugal. A numerical example illustrates the force of the mechanism.
Suppose that initially the return to human capital equals 10% per year of
education. Then, raising the level of education of all workers by one year
(hence, a 10% increase in the value of the stock of human capital) will reduce
the return by 5 (i.e. the complexity dispersion parameter) × 10% = 50%, that is,
from 10% per year to 5% per year. Obviously, there are countervailing forces.
Where the extension of higher education among the population has compressed wage
differentials, skill biased technological progress has increased the demand for
human capital and therefore its return. This is exactly what Tinbergen’s
metaphor of a race between education and technology describes.
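The arithmetic of the numerical example can be sketched as follows. The function name `return_after_expansion` is hypothetical, and the calculation is only the first-order approximation used in the example, not the full assignment model.

```python
def return_after_expansion(initial_return: float,
                           stock_increase: float,
                           dispersion: float) -> float:
    """Return to human capital after the stock grows by `stock_increase`.

    First-order approximation of the text's example: the return falls by
    `dispersion * stock_increase` as a fraction of its initial value.
    """
    relative_fall = dispersion * stock_increase
    return initial_return * (1.0 - relative_fall)

initial_return = 0.10   # 10% per year of education
stock_increase = 0.10   # one extra year adds about 10% to the stock's value

# The two ends of the estimated range for the complexity dispersion parameter.
high_estimate = return_after_expansion(initial_return, stock_increase, 5.0)
low_estimate = return_after_expansion(initial_return, stock_increase, 2.4)
print(f"dispersion 5.0: {high_estimate:.3f}, dispersion 2.4: {low_estimate:.3f}")
```

With the parameter at 5 the return halves to 5% per year, as in the example; at the lower bound of 2.4 the same expansion would leave a return of 7.6% per year.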
This mechanism has important consequences for efficient redistribution poli-
cies. When there is no demand for redistribution at all, the market mechanism
would lead to an efficient demand for human capital. Workers would decide on
their investment in human capital by setting equal the marginal cost of acquiring
human capital to the market rate of return to that capital.
Any subsidy to the
education system would lead to inefficiencies since workers would be overinvest-
ing in human capital. This subsidy runs even counter to the usual goal of redis-
tribution policies, since it tends to favor the children of high-income families.
This has been the motivation for many pleas to reduce subsidies to education. At
the same time, however, this advice raises Becker’s question as to why many
societies nevertheless subsidize higher education.
Our model yields an explanation for this puzzle. Human capital creation yields
an endogenous force pushing in the direction of a compression of gross wage
differentials. Without this force, high marginal tax rates would be required to ex-
tract money from high ability individuals that can be distributed among the less
able citizens. However, these high marginal rates would tax the revenues derived
not only from ability but also from effort. If we were able to push up the
level of human capital in the economy (even above the efficient level) by
subsidizing the education system, this higher level of human capital would reduce
the dispersion of gross wages. Hence, it would reduce the need for high marginal
tax rates. There would be less distortion in the provision of effort. The optimal
level of subsidies would equalize the marginal cost of the distortion caused by
too high a participation in the education system and the marginal revenues of a
smaller distortion due to lower marginal tax rates.
The widespread preference among politicians for subsidies to the education
system, even for people who enter higher education and who can therefore be
expected to earn high future salaries, might be perfectly rational. The efficient
redistribution hypothesis yields the conclusion that the greater the demand for
redistribution in a society, the higher the subsidy to the education system. This is
a testable implication.
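The trade-off described in this subsection can be made concrete with a stylized sketch. Everything in it is an assumption made for illustration: the quadratic cost functions, the parameters `a`, `b`, `c` and `t0`, and the function names are hypothetical and are not taken from the model in the text.

```python
def optimal_subsidy(a: float, b: float, c: float, t0: float) -> float:
    """Subsidy s minimizing a*s**2/2 + b*(t0 - c*s)**2/2.

    All functional forms are hypothetical:
      a  : curvature of the cost of overinvestment in education
      b  : curvature of the cost of tax distortions
      c  : reduction in the required marginal tax rate per unit of subsidy
      t0 : marginal tax rate required without any subsidy, a stand-in
           for the demand for redistribution
    """
    return b * c * t0 / (a + b * c * c)

def marginal_cost_education(s: float, a: float) -> float:
    """Marginal distortion from pushing education above the efficient level."""
    return a * s

def marginal_gain_taxation(s: float, b: float, c: float, t0: float) -> float:
    """Marginal saving in tax distortions from a slightly higher subsidy."""
    return b * c * (t0 - c * s)

a, b, c = 1.0, 2.0, 0.5
s_low = optimal_subsidy(a, b, c, t0=0.3)    # modest demand for redistribution
s_high = optimal_subsidy(a, b, c, t0=0.5)   # strong demand for redistribution
print(f"subsidy at t0=0.3: {s_low:.3f}, at t0=0.5: {s_high:.3f}")
```

At the optimum the two marginal distortions are equal, and under these assumed forms a higher `t0` yields a higher subsidy, which is the direction of the testable implication stated above.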
3.3 Minimum Wages are Relevant when we Include Search Frictions
In the Walrasian world as envisaged by the first two theorems of welfare
economics, any direct intervention in prices is equivalent to some, albeit rather
strange, tax system. For example, a minimum wage would be equivalent to a
100% tax on all labor supplied at hourly wage rates below the minimum. This
tax would have the effect of discouraging any labor supply at that level, which is
equivalent to what would be achieved by a minimum wage. Nobody would pro-
pose such a tax system as an efficient means of redistribution: taxing the lowest
incomes does not seem to contribute to redistribution from the rich to the poor.
In a Walrasian world with market clearing prices, minimum wages can therefore
only be a violation of efficiency.
This conclusion explains the fierce opposition
of economists to minimum wages.
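The equivalence between a wage floor and a 100% tax below the floor can be illustrated with a minimal sketch. The wage list, the floor and the common positive reservation wage are hypothetical values chosen for illustration, as are the function names.

```python
def employed_with_minimum_wage(market_wages, minimum):
    """Jobs that survive a wage floor: only those paying at least the minimum."""
    return [w for w in market_wages if w >= minimum]

def employed_with_equivalent_tax(market_wages, minimum, reservation_wage):
    """Jobs that survive a 100% tax on labor supplied below the minimum.

    Workers supply labor only if the after-tax wage covers their
    reservation wage (assumed positive and identical for everyone).
    """
    employed = []
    for w in market_wages:
        tax_rate = 1.0 if w < minimum else 0.0
        net_wage = w * (1.0 - tax_rate)
        if net_wage >= reservation_wage:
            employed.append(w)
    return employed

market_wages = [4.0, 6.0, 8.0, 12.0]   # hypothetical market-clearing hourly wages
floor = 7.0
same = (employed_with_minimum_wage(market_wages, floor)
        == employed_with_equivalent_tax(market_wages, floor, reservation_wage=1.0))
print("same employment under both instruments:", same)
```

In both cases only the jobs paying at least the floor remain, which is why, in a Walrasian setting, the minimum wage adds nothing that this odd tax could not do.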
Nevertheless, empirical research suggests that direct price interventions are an
important instrument for redistribution. Minimum wages have for a long time
been thought to have relatively little impact. In their survey of the US economy
in the seventies, Brown, Gilroy and Kohen concluded that a 10% increase in the
minimum wage would lead to a fall of 1 to 3% in employment among youngsters. As
a percentage of total employment, the effect would be even smaller, in the
order of magnitude of 0.1 to 0.3 percent. Later on, Card and Krueger could not
find any disemployment effect at all.
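The relation between the two ranges quoted above can be checked with a short calculation. It assumes that the employment effect is confined to youngsters and that they account for about 10% of total employment; that share is inferred from the two ranges and is not stated in the text.

```python
def employment_change(elasticity: float, wage_change: float) -> float:
    """Proportional change in employment for a proportional wage change."""
    return elasticity * wage_change

wage_change = 0.10                 # a 10% rise in the minimum wage
youth_elasticities = (-0.1, -0.3)  # implied by the 1 to 3% range in the text
youth_share = 0.10                 # assumption inferred from the two ranges

for eps in youth_elasticities:
    youth_effect = employment_change(eps, wage_change)
    total_effect = youth_share * youth_effect   # effect on total employment
    print(f"elasticity {eps}: youth {youth_effect:.1%}, total {total_effect:.2%}")
```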
This view has been challenged by a number of recent papers showing that
minimum wages had a large impact on wage dispersion, see DiNardo, Fortin, and
Lemieux, Lee, and Teulings. Wage dispersion increased substantially in the
United States during the eighties. The search for the
causes of this phenomenon has been a theme for many papers. Skill biased tech-
nological progress and globalization have been the main candidates for an expla-
nation. Empirical research shows globalization not to be a likely cause, and hence,
by a process of elimination, skill biased technological progress has generally been
considered to be the main explanation.
However, during this same period, the minimum wage has been reduced by
almost 40% in real terms. Lee and Teulings show that this trend can explain
virtually the whole increase in dispersion in the lower half of the wage
distribution. Even relative wages at a wage level of twice the minimum wage were
affected by a change in the minimum. Figure 1 (taken from Teulings) depicts the
wage distribution for males (dotted line) and females (continuous line) in the
four main regions in the United States for 1973, 1979, 1985, 1989, and 1991. The
distributions are ranked by their ratio of the minimum to median wage. An
eyeball test suffices to see that an increase in the minimum tends to distort
the whole lower half of the distribution, in particular for females, who earn
wages closer to the minimum. Similar pictures in Teulings, Vogels, and van
Dieten reveal the same pattern for the Netherlands.
Also, interventions in the institutions for wage setting have a major impact on
the wage distribution. These institutions vary substantially between countries.
11 However, Teulings offers a Walrasian model where minimum wages contribute to redistribution.
That model is based on the same assignment model as in section 3.2. The elimination of low-skilled
workers from effective labor supply pushes up the wages of slightly better skilled workers.
Figure 1 – Low wage distributions for males and females separately, estimation results
Wage setting is rather centralized in the Scandinavian countries, while it is de-
centralized in the United States and Canada. The economies in the north-west of
Europe take an intermediate position. The word ‘centralization’ might yield the
misconception that in Scandinavian countries all wages are actually set by a
central planner. This is unlikely, since nowhere in the world has central
planning been feasible, simply because central planners lack the information to
implement such a wage scheme. Teulings and Hartog
provide an alternative interpreta-
tion, where ‘centralization’ is merely a device for affecting the bargaining power
in the bargain over the match specific surpluses.
In practice, centralization of wage bargaining tends to favor the bargaining
power of employers. Empirical research shows that the more ‘centralized’ econo-
mies have smaller wage differentials. This holds in particular for those differen-
tials which are related to variables that, at least at first sight, do not reflect the
human capital of the worker. Typical examples are the industry affiliation of the
worker or the profitability of his firm (e.g. Blau and Kahn and Teulings and
Hartog). An interpretation that is consistent with the evidence is that
‘centralization’ in wage bargaining tends to reduce the bargaining power of
workers in claiming surplus at the firm level, see Nickell (1998, Table 4) and
Teulings.
These issues can only be analyzed in a framework that allows for some kinds
of frictions, which prevent markets from clearing. A convenient idea is to extend
the assignment model discussed in the previous section with search frictions of
the standard Pissarides type, see Teulings and Gautier. In the assignment
model, both workers and jobs are heterogeneous. The principle of comparative
advantage implies that each worker type has its own optimal job type.
Hence, there is an obvious rationale for investment in search for a suitable part-
ner that matches your own comparative advantage. However, since search is
costly, there is a trade-off between continued search for an even better match and
the cost of search. The reservation match quality is that match quality that is just
acceptable to both the job seeker and the employer. Worse matches are rejected
and agents continue to search. Better matches yield a surplus compared to the
reservation match. Hence, match quality is partly a matter of luck, depending on
the question whether a job seeker happens to run into a good match before he
settles down in a match of only reservation quality. By implication, in most
matches there is a surplus above the reservation match quality. This surplus is
match-specific, since it evaporates as soon as agents separate. Therefore, wages
are set, not by perfect competition, but by bargaining in a bilateral monopoly of
job seeker and firm.
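The role of the reservation match quality can be illustrated with a deliberately simplified, one-sided sketch. It is not the two-sided model referred to above: it assumes match quality drawn from a uniform distribution on [0, 1], a fixed cost per draw, and no discounting, and all names and values are hypothetical.

```python
import math
import random

def reservation_quality(search_cost: float) -> float:
    """Reservation value for match quality drawn from Uniform(0, 1).

    Solves search_cost = E[max(x - r, 0)] = (1 - r)**2 / 2, the stopping
    rule for sequential search without discounting (requires cost <= 0.5).
    """
    return 1.0 - math.sqrt(2.0 * search_cost)

def search_until_accepted(reservation: float, rng: random.Random) -> float:
    """Draw match qualities until one reaches the reservation value."""
    while True:
        quality = rng.random()
        if quality >= reservation:
            return quality

rng = random.Random(0)          # fixed seed so the sketch is reproducible
r = reservation_quality(0.02)   # hypothetical search cost per draw
accepted = [search_until_accepted(r, rng) for _ in range(10_000)]
mean_surplus = sum(q - r for q in accepted) / len(accepted)
print(f"reservation={r:.2f}, mean surplus above reservation={mean_surplus:.3f}")
```

Every accepted match is at least of reservation quality, and on average it is strictly better; that average excess is the match-specific surplus over which job seeker and firm bargain.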
12 This feature of surplus sharing distinguishes labor market models from models for the marriage
market. In the labor market we can transfer surplus (or: utility) from the firm to the worker by raising
the wage and vice versa. In the marriage market this option is not available.
In this type of world, a new field for policy intervention emerges. Changes in
the institutions for wage bargaining affect the distribution of the surplus between
job seeker and firm. On the basis of the discussed evidence on the impact of
minimum wages, these interventions have a large effect on the wage dispersion.
Again, Becker’s efficient redistribution hypothesis is a useful safety device.
Economists are trained to support the view that we should leave markets on their
own and that redistribution should be implemented by income transfers, a view
that is based on the two theorems of welfare economics. Politicians never com-
plied with this rule, as is documented by the widespread use of minimum wages
and similar interventions in wage setting. Only recently have economists started to
understand the reasons for the potential effectiveness of these interventions. The
experience of the United States regarding the effect of minimum wages offers an
excellent example.
Systematic results on the second best characteristics of this type of intervention
are still lacking. Some preliminary results have been achieved, see e.g. Van den
Berg, Mortensen, and Flinn. A first conclusion is that,
contrary to a world with perfect competition, where wages adequately reflect
the social cost of labor, wage rates are set at an inefficient level in the first
place. Hence, intervention in wage setting does not necessarily reduce efficiency.
A low minimum wage might therefore even increase efficiency compared to the case
with no minimum wages. However, the trade-off between redistribution by
progressive taxation or minimum wages has not yet been formally
analyzed. The search models developed during the past fifteen years will turn out
to be of great help.
An important consideration is the incentive structure of a minimum wage com-
pared to the tax system. Since taxes are a transfer by the worker and his firm to
the government, both agents have a joint interest in cheating. In contrast, a mini-
mum wage imposes a transfer from the firm to its worker. Hence, one of the two
agents that are directly involved has an incentive to ask for compliance. This
softens the informational burden of enforcement. Another implication is that no
money is transferred to a political bureaucracy. This alleviates the incentive con-
straints that govern the design of these bureaucracies, since bureaucrats have their
own agenda. As a US congressman once said: ‘after all, an increase in the mini-
mum wage does not cost a penny to the federal budget.’
13 Here, I use the word ‘efficiency’ in a loose way. I assume that a costless
3.4 Firm Specific Investment as a Motive for Unemployment Insurance
We usually take a static view on income inequality, implicitly comparing differ-
ent skills or ability types at one point in time. However, there is much random-
ness in the careers of workers. Some fare well. They enter a profitable company
with a growing number of employees, become the boss, and manage to climb the hill
of insider rents. Others fare less well. Their firms turn out to be less
profitable. They may even be forced to leave the firm and make a new start at another
company. Obviously, the wages of both groups will vary widely, even though
they started from an equal position. This variation is not due to across-skill-type
differentials but to within-skill-type-across-career differentials. This observation
fits the fact that the variance of the error term in a standard wage regression
increases over the career of a worker.
In cross-country comparisons of wage dispersion, the across-skill-type and
within-skill-type-across-career differentials are usually merged into a single sta-
tistic. This practice has obscured the two underlying forces. People tend to focus
on the across-skill-type differentials. However, the second factor delivers an im-
portant contribution to inequality. As an illustration, Figure 2 plots the experience
and tenure profiles that are generated by a standard OLS wage regression for the
United States and the Netherlands.
We normalize the experience profiles at their
level at 4 years of experience to eliminate the effect of the separate minimum
wages for youngsters in the Netherlands. Panel A plots the profiles for workers
with zero tenure. Panel B plots the profiles for workers who joined their present
employer at 4 years of experience and stayed there from then on
(tenure = experience - 4). Figure 2 offers a first indication of the importance of
the actual career of a worker: workers with zero tenure earn much lower wages.
Furthermore, there are substantial differences between the United States and the
Netherlands. The experience profile for newly hired workers is steeper in the first
half of their career in the Netherlands than in the United States. Moreover, the
profile stays at this high level in the Netherlands, whereas it declines in the United
States. However, much of this difference is undone when a worker manages to
stay at the same firm during his career. In that case, the experience profiles look
more similar in both countries. In the United States, much depends therefore on
the question whether you manage to stay with the same firm.
How do redistribution policies affect this process? For this question, we need
a different type of model than the one we have applied so far. We collapse the
skill-type dimension and introduce a time dimension instead. In Teulings and Van
der Ende, a worker and a firm have to make a specific investment at the
start of their employment relation. This investment raises the productivity of the
worker well above his productivity in other jobs. After the moment of investment,
log productivity follows a random walk. Each period of time a new shock
comes in, which moves productivity either up or down. As soon as productivity
falls below a threshold, continuation of the employment relation is no longer
profitable, and the worker and the firm will agree to separate.
14 The coefficients are taken from Teulings and Hartog (1998, Table 1.2), using the values for 1985.
In this world the labor market is a selection device. All employment relations
are random walks. Good draws survive, bad draws are eliminated by separation.
New relations are started. Again, good draws survive and bad draws are eliminated.
In the course of time, an increasing share of a cohort will find itself in a
good draw. This model can explain the distribution of tenure and the tenure
profile in wages quite well, see Teulings and Van der Ende.
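The selection mechanism described above can be illustrated with a small Python simulation. It is a stylized sketch, not the estimated model of Teulings and Van der Ende. It assumes that every new match starts at the same distance start_gap above the separation threshold, that each period log productivity moves up or down by a fixed step with equal probability, and that a separated worker starts a new match immediately. All names and parameter values are hypothetical.

```python
import random


def simulate_cohort(n_workers=5_000, n_periods=30, start_gap=1.0, step=0.5, seed=0):
    """Follow one cohort through a labor market that selects on random-walk draws.

    gap[i] is log productivity minus the separation threshold for worker i.
    Returns the cohort's mean gap per period and the final tenure of each worker.
    """
    rng = random.Random(seed)
    gap = [start_gap] * n_workers
    tenure = [0] * n_workers
    mean_gap_by_period = []
    for _ in range(n_periods):
        for i in range(n_workers):
            gap[i] += step if rng.random() < 0.5 else -step  # new shock comes in
            tenure[i] += 1
            if gap[i] < 0.0:        # below the threshold: separate ...
                gap[i] = start_gap  # ... and start a new match (assumed immediate)
                tenure[i] = 0
        mean_gap_by_period.append(sum(gap) / n_workers)
    return mean_gap_by_period, tenure


if __name__ == "__main__":
    mean_gap, tenure = simulate_cohort()
    long_tenure_share = sum(t >= 10 for t in tenure) / len(tenure)
    print(f"mean gap after 1 period:   {mean_gap[0]:.2f}")
    print(f"mean gap after 30 periods: {mean_gap[-1]:.2f}")
    print(f"share with tenure >= 10:   {long_tenure_share:.2f}")
```

Because bad draws are reset while good draws are kept, the average distance to the threshold rises with the age of the cohort in this sketch, even though each individual match is a driftless random walk.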
As in the model with search frictions discussed in the previous section, specific
investments create a surplus. These surpluses constitute the return on these
specific investments. Efficiency requires that the surpluses are divided between
the worker and the firm according to their share in specific investments.
Competition on the labor market will drive the expected discounted value of these
surpluses down to the value of investment. Again, wage bargaining determines
how these surpluses are divided between the worker and the firm. Since the sur-
plus is equal to productivity minus the separation threshold, this surplus follows
a random walk. Hence, the wage rate also follows a random walk. This implication
of the model has been confirmed for the United States by Topel. The
difference between the Netherlands and the United States in panel B of Figure 2
is also consistent with this model, since at the firm level, workers have greater
bargaining power in the United States than in the Netherlands, see the previous
section. Hence, tenure profiles should be steeper in the United States.
A first implication of this model is that the size of worker bargaining power
affects the distribution of surpluses not only due to search as discussed in the
previous section but also due to subsequent shocks to match productivity. The
greater workers’ bargaining power, the larger their share in randomly determined
match-specific surpluses, and therefore the larger wage inequality. In the extreme
case where workers’ bargaining power is equal to zero and workers receive only
their reservation payoff, workers would not face any uncertainty. The low
firm-specific worker bargaining power that is implemented in the economy by
‘centralization’ of wage formation serves in fact as an insurance device for the returns
on workers’ match specific investments. A large share of the difference in wage
dispersion between the United States and Canada on the one hand and the Scan-
dinavian and north-west European countries on the other hand can therefore be
ascribed to differences in degree of insurance of these investments.
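The insurance argument above can be sketched numerically in Python. The sketch assumes, for illustration only, that the wage equals a common reservation payoff plus a fixed share of the match surplus, that the surplus follows a symmetric random walk, and that a match whose surplus has vanished simply pays the reservation payoff. The wage rule, the function name, and all parameter values are hypothetical.

```python
import random
import statistics


def wage_dispersion(worker_share, n_matches=10_000, n_periods=20, step=0.1, seed=0):
    """Standard deviation of wages when wage = reservation payoff + share * surplus."""
    rng = random.Random(seed)
    reservation_payoff = 1.0  # hypothetical, identical for all workers
    wages = []
    for _ in range(n_matches):
        surplus = 1.0  # hypothetical initial match surplus
        for _ in range(n_periods):
            surplus += step if rng.random() < 0.5 else -step
        surplus = max(surplus, 0.0)  # no surplus left: worker gets the reservation payoff
        wages.append(reservation_payoff + worker_share * surplus)
    return statistics.pstdev(wages)


if __name__ == "__main__":
    for share in (0.0, 0.2, 0.5):
        print(f"worker share {share:.1f}: wage std. dev. {wage_dispersion(share):.3f}")
```

With a worker share of zero the simulated wage dispersion is exactly zero, and it rises in proportion to the share, which is the sense in which low firm-level bargaining power insures workers against shocks to match productivity.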
15 A violation of this rule would lead to a hold-up problem, the firm delaying its specific invest-
ment because it gets too small a share in the surplus or the other way around. Hence, bargaining
institutions should be in line with investment shares, a requirement known as the Hosios
condition. It is not obvious that the decentralized market mechanism will generate this outcome.
16 Note that this market payoff is endogenous. Lower workers’ bargaining power is likely to raise
the market payoff, since firms have strong incentives to invest in new jobs. Hence, lower workers’
bargaining power does not necessarily imply a lower total expected discounted payoff.
The institutions for wage setting are not the only insurance device that is op-
erative. Unemployment and disability insurance play a similar role, at least in
most European economies. In these countries, workers are entitled to a career-
long unemployment or disability benefit as soon as they pass the age of 55, if
not de jure, then at least de facto. These benefits tend to be related to the last
earned wage. Now, consider a cohort of workers at the age of 55, who have gone
through a selection process lasting 30 years. On average, they will be well above
the separation threshold, some even quite far. Hence, they have a substantial
match-specific capital at stake. This explains the huge wage losses for displaced
workers in the United States, see Jacobson, Lalonde, and Sullivan.
Figure 2 Wage experience profiles in the Netherlands and the United States, 1985
A. Log wages as a function of experience for newly hired workers, relative to the wage of a worker
with four years of experience.
B. Log wages as a function of experience for workers with tenure equal to experience minus four,
relative to the wage of a worker with four years of experience.
Beyond the age of 55, some workers will find themselves in matches of which
the productivity declines due to downward random shocks. However, since their
entitlement to an unemployment benefit is related to their last earned wage, there
are adverse incentives in accepting wage reductions. These adverse incentives will
affect wage bargaining and induce employers to accept non-declining wages for
these workers. Moreover, employers feel safe to pay these wages since they know
that doing so maximizes the claim of their workers to unemployment benefits
and therefore facilitates future displacement in the case of a further productivity
decline. This mechanism explains why the general experience profile remains
quite high in the Netherlands for workers with more than 30 years of experience,
see Figure 2, panel A. The exit route to social security benefits puts a floor in
Statistics on participation rates of elderly males in European countries testify
to the relevance of this mechanism. In fact, the participation rate for this age
category is the main explanation for the cross-country differences in male
participation. The standard interpretation is largely in terms of early retirement schemes
with a fixed entitlement age. Though these fixed schemes, which operate inde-
pendently of match productivity, obviously play an important role, the random
component in match productivity is probably far more important.
Becker’s efficient redistribution hypothesis suggests the reason why this exit
route of early retirement, which is largely a by-product of a high degree of in-
surance of the returns on match-specific investments, shows up more prominently
in European countries than in the United States. First, all institutions in Europe
are more geared towards redistribution. Apparently, the political constellation in
European countries leads to a greater demand for redistribution than in the United
States. Becker’s hypothesis suggests that the system will seek an efficient mix of
instruments. Given the larger demand for redistribution, the marginal cost in terms
of efficiency of each redistributive instrument must be higher. This explains why
career-long benefits for workers above 55 are more acceptable in Europe than on
the other side of the Atlantic. A similar reasoning might explain why European
countries opt for a more ‘centralized’ wage formation with lower worker bargain-
ing power than the United States: it offers an insurance mechanism for firm spe-
cific capital, which helps to satisfy the large demand for redistribution.
17 In fact, the fixed entitlement age in early retirement schemes might be endogenous. Most schemes
have been introduced or made more generous at times when adverse shocks hit match productivity.
The main difference with the social insurance schemes is that these entitlements are irreversible even
when match productivity picks up for later cohorts. This argument might also help to square the
emergence of these institutions with Becker’s efficient redistribution hypothesis. It seems hard to
believe that these institutions are efficient in the steady state. However, during the recessions of the
seventies and eighties they have served the generation in power. Hence, though these institutions
satisfy Becker’s efficient redistribution hypothesis in the one shot game, they do not do so in the
repeated game, in particular where these schemes were funded on a pay-as-you-go basis.
At first sight, maximum insurance can be achieved by designing the institu-
tions for wage formation such that workers’ bargaining power is close to zero.
However, since first best efficiency requires the surpluses to be shared according
to the cost of investment, the distribution of bargaining power can only be ma-
nipulated to the extent that the burden of specific investments can be shifted to
employers. Some investments can probably be shifted at low costs. However,
other, non-verifiable or non-observable components, in particular in human capi-
tal, cannot be shifted. An attempt to push workers’ bargaining power below a
certain threshold would reduce the incentives for these investments. Similarly, ad-
equate incentives for the provision of effort probably require workers to get some
share in the surplus. Hence, the choice of worker bargaining power can best be
described by the classical trade-off between incentives and insurance, where the
United States puts more weight on incentives and Europe more on insurance.
Much of the reasoning in section 3 took for granted that the demand for redis-
tribution is much higher in Scandinavia and the north-west of Europe than in the
United States and Canada. It is this level of demand which then explains why all
institutions in Europe seem to contribute to redistribution. Since distortions tend
to increase quadratically with the extent to which an institution is applied, it is
efficient to spread redistribution among many institutions. This brings us to the
issue as to why countries face differences in the demand for redistribution in the
first place. At this point, we can only formulate some hypotheses. I suggest two
types of explanations. The first is based on the ‘constitutional setting’ in which
the political system has to operate, while the second relies on the status of the
United States and Canada as immigration countries.
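The claim that quadratic distortions make it efficient to spread redistribution over many institutions can be checked with a short Python sketch. It assumes, as a stylized illustration that is not taken from the paper, that instrument i delivers redistribution x_i at an efficiency cost c_i * x_i^2 and that costs add up across instruments. Minimizing total cost for a given total amount of redistribution then equalizes the marginal costs 2 * c_i * x_i. The function name and numbers are hypothetical.

```python
def efficient_mix(cost_weights, total_redistribution):
    """Split a redistribution target over instruments with cost c_i * x_i ** 2.

    The first-order conditions 2 * c_i * x_i = lambda imply that x_i is
    proportional to 1 / c_i. Returns the amounts, the total efficiency cost,
    and the common marginal cost lambda.
    """
    inverse = [1.0 / c for c in cost_weights]
    scale = total_redistribution / sum(inverse)
    amounts = [scale * inv for inv in inverse]
    total_cost = sum(c * x ** 2 for c, x in zip(cost_weights, amounts))
    marginal_cost = 2.0 * scale  # equals 2 * c_i * x_i for every instrument
    return amounts, total_cost, marginal_cost


if __name__ == "__main__":
    for n_instruments in (1, 2, 4):
        _, cost, marginal = efficient_mix([1.0] * n_instruments, 4.0)
        print(f"{n_instruments} instrument(s): cost {cost:.1f}, marginal cost {marginal:.1f}")
```

In this sketch, spreading a given target over four identical instruments cuts the total efficiency cost to a quarter of the single-instrument cost, while doubling the target doubles the marginal cost of every instrument in the mix.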
The domain of politics is bound by the ‘constitutional setting.’ This phrase
does not necessarily refer to a written constitution. Rules of conduct can also
bind the game that politicians play. Politics does not have the deterministic out-
come that is suggested by the median voter theorem. When preferences are not
single peaked, a coalition might overrule the median voter. In this setting, rules
of conduct can be particularly persistent because no member of a ruling coalition
dares to change them, for fear of being removed from the ruling coalition.
An example of a constitutional setting that affects the demand for redistribution
is the funding of primary education in the United States. In Europe, the education
system is financed from tax revenues. The United States has a decentralized
system where a large part of the funding comes from local real estate taxes.
Each local community supports its own education system. As has been discussed
in section 3.2, the acquisition of human capital yields distributional externalities.
An increase in the stock of human capital yields wage compression. In a
decentralized system such as that in the United States, these externalities cannot be
internalized. Each community sets its real estate taxes to equate marginal cost and
revenues from the education system, disregarding the contribution of a higher level
of human capital to redistribution and therefore ignoring the potential for lower
marginal rates in income taxation. Despite this inefficiency, nobody dares to change
this ‘constitutional setting’ by switching to a more centralized system, for fear
of being removed from the ruling coalition. This mechanism might explain why the
lower deciles of the skill distribution in the United States lag behind their
counterparts in north-west Europe, see Leuven, Oosterbeek and van Ophem. In a
similar vein, the combination of an integrated labor market and a political system
that grants little authority to the federal center is likely to leave many
opportunities for internalizing externalities unexploited. This might pose a threat to
European integration in the future. However, a speedy integration of the labor
market is not to be expected in a continent that is divided by all kinds of language
and other cultural barriers.
The second explanation of cross-country variation in the demand for redistri-
bution takes immigration to be the main force. Several mechanisms might link
immigration to the demand for redistribution. Migrants to wealthy countries are
in the vast majority low-skilled. They enter in the lower tail of the skill distribution,
thereby reducing the average level of human capital in the economy. The assign-
ment model discussed in section 3.2 predicts that a reduction in the average level
of human capital will increase wage differentials. The issue is then why a coun-
try that faces a demand for redistribution at the same time allows migrants to
enter, largely undoing the redistribution effort?
The first explanation recognises that, though low skilled workers are hurt by
migration, the median voter might benefit from it. In that case, both income re-
distribution from the mean to the median and migration will help the median
voter. Second, migrants may have reduced voting power during the initial period
after immigration. This moves the median voter up in the hierarchy. Under this
assumption there might be multiple equilibria: in the no migration equilibrium,
there is much redistribution since the median voter is a lower skilled worker,
while in the migration equilibrium, there is little redistribution, since the median
voter is a better skilled worker. Third, the median voter might be hurt by migra-
tion in the short run. However, in a dynamic equilibrium, he might gain in the
future. The combination of his acquisition of the skills required in the host coun-
try and the future immigration of new cohorts of unskilled workers moves him
up in the skill distribution, into brackets that gain from immigration. The assign-
ment model discussed in section 3.2 will allow a more formal analysis of these
three explanations. A final explanation builds on the presumption that migrants,
prepared to take the risk of moving to a new home country, are probably less
risk averse. Nations with a long history of immigration can therefore be expected
to be inhabited by a less risk averse population. This might reduce the demand
for redistribution.
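The second mechanism above, that migrants without voting power move the median voter up in the hierarchy, can be illustrated with a few lines of Python. The population below is entirely hypothetical: 100 natives with skill levels 1 to 100 and 25 migrants who all enter at skill level 0. The sketch only locates the median voter within the resident population; it does not model the resulting choice of redistribution.

```python
import statistics


def median_voter_rank(native_skills, migrant_skills, migrants_vote):
    """Skill of the median voter and the share of residents with lower skill."""
    voters = native_skills + (migrant_skills if migrants_vote else [])
    residents = native_skills + migrant_skills
    median_skill = statistics.median(voters)
    share_below = sum(s < median_skill for s in residents) / len(residents)
    return median_skill, share_below


if __name__ == "__main__":
    natives = list(range(1, 101))  # hypothetical skill levels 1..100
    migrants = [0] * 25            # hypothetical low-skilled entrants
    for migrants_vote in (True, False):
        skill, share = median_voter_rank(natives, migrants, migrants_vote)
        print(f"migrants vote: {migrants_vote}, median voter skill {skill}, "
              f"share of residents below {share:.2f}")
```

With these numbers, the median voter is close to the middle of the resident skill distribution when migrants vote, but sits at the 60th percentile of residents when they do not, so the decisive voter is a better skilled worker.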
Actual institutions for redistribution are much more complicated than has been
suggested by Mirrlees’ seminal paper on optimal income taxation. Redistribution
not only makes use of all kinds of taxes and subsidies, but also intervenes
directly in prices and wages. Decentralized rules of the game, like wage setting
conventions within the firm, play an important role as well. I analyzed in par-
ticular the role of subsidies to education and minimum wages. The importance of
these instruments is consistent with the fact that the main difference in income
distribution across countries is already present in the distribution of gross wages.
Hence, the tax system can be held responsible for only a small part of the cross-
country variation in income dispersion. Most redistribution directly affects gross
wages, like the two instruments mentioned above.
This paper intends to show that these deviations from Mirrlees’ simple rule do
not necessarily imply constrained Pareto-inefficiencies. On the contrary, for two
cases that have been discussed, the use of these instruments is probably efficient.
This is exactly what one would expect on the grounds of Becker’s efficient
redistribution hypothesis. Politicians were not wrong in using these instruments;
economists were, in applying too simple a model of the economy to produce reliable
conclusions on the efficiency of institutions for redistribution.
This paper has two faces. On the one hand, it forces economists to be modest,
by the application of Becker’s efficient redistribution hypothesis. Economic
history has shaped the institutions for redistribution with good reason. Policy
advice by an economist based on the simple rules suggested by the second
theorem of welfare economics is likely to underestimate the complexity of the real
world. The more we learn about these arguments, the more modest we have to
be in our policy advice.
On the other hand, this paper sets an ambitious task. Apparently, we do not
fully understand the forces that govern the institutions for redistribution. Future
research will disclose these secrets, at least in the long run. As soon as this job
has been completed, economists will be able to design the optimal institutions at
a drawing table, but they will always be bothered by why actual institutions look
like they do.
In the meantime, policymakers might feel uncomfortable. They might remem-
ber what has been said about the long run and feel that they are left without
guidance and, even worse, without a goal. They do not have to worry. Probably,
a wide range of small Pareto-improving reforms is lying on the sidewalk. And
for the grand designs that are needed from time to time, it is better to have them
from economists who understand the world as it is than from those who disre-
gard it.