All content in this area was uploaded by Lemin Wu on Jul 29, 2015
If Not Malthusian, Then Why?
Lemin Wu*
June 2015†
Abstract: This paper shows that the Malthusian mechanism alone cannot explain the pre-
industrial stagnation of living standards. Improvement in luxury technology, if faster than im-
provement in subsistence technology, would have kept living standards growing. The Malthusian
trap is essentially a puzzle of balanced growth between the luxury sector and the subsistence sec-
tor. The author argues that balanced growth is caused by group selection in the form of biased
migration. It is proven that a tiny bit of bias in migration can suppress a strong growth tendency.
The theory re-explains the Malthusian trap and the prosperity of ancient market economies such as
Rome and Song. It also suggests a new set of factors triggering modern economic growth.
Keywords: Malthusian trap, group selection, very long run growth, source-sink migration
JEL Classification: N3, O41, B12
*School of Economics, Peking University, leminwu@pku.edu.cn.
†First draft: November 2013. An earlier version of this paper, “Does Malthus Really Explain the Malthusian
Trap?” is the first chapter of the author’s doctoral dissertation.
1 Introduction
All other arguments are of slight and subordinate consideration in comparison of
this. I see no way by which man can escape from the weight of this law, which pervades
all animated nature.
Thomas Malthus (1809, chap.1)
But in economics, the admission that mankind need not live at the margin of subsis-
tence [...] meant that, the very long run limit of wages was not physiological subsistence,
it was psychological subsistence—a much more complicated and difficult matter to for-
mulate exactly.
Lionel Robbins (1998, pg.174)
Life was miserable and stagnant for most who lived before the Industrial Revolution. “The av-
erage person in the world of 1800 was no better off than the average person of 100,000 BC” (Clark,
2008). According to Malthus, poverty lingered because economic progress only led to faster popu-
lation growth. A larger population depresses average income and brings society back to persistent
poverty. For two hundred years, Malthus’s wisdom has been held as one of the most infallible
doctrines in economics.
However, this paper shows that Malthus’s mechanism alone cannot explain the Malthusian trap.
Improvement in luxury technology, if faster than improvement in subsistence technology, would have
kept living standards increasing. To suppress luxury, it takes the Darwinian force of selection in a
social context: cultures and technologies spread faster if they favor group prosperity at the expense
of individual welfare. The trap is no less Darwinian than Malthusian.
Explaining the Malthusian trap is the core value of Malthusian theory. Despite the empirical
weakness of the relationship between average income and population growth, most economists hold a
Malthusian view of history because Malthus explains the persistent poverty of the pre-industrial
world, and no competing theory has been available. However, Malthus’s success in explaining the
Malthusian trap is an illusion. The prediction is right—the trap exists—but the mechanism is
wrong.
I challenge Malthusian theory not because it is inconsistent with data (though it is)—with
enough ad hocery, Malthusian theory can be reconciled with almost any fact. I attack it because
the theory relies on a crucial assumption. Implicitly, Malthus assumes away the conflict between
individual welfare and group fitness. He ignores the fact that, in the real world, what promotes
the individual’s welfare does not always expand the group’s population. Diamonds, spas, circuses,
monuments—be they for sexual attraction, comfort, entertainment or vanity—are immune from the
Malthusian force. Increases in these types of consumption can raise living standards without limits.
Take flowers and bread as a metaphor. Flowers attract mates; bread feeds mouths.
Population increases with the bread but not with the flowers. Choice-theoretically, when there
are flowers, people are simply better off. With the average bread consumption anchored by the
Malthusian force, the equilibrium living standards are fully determined by how many flowers the
average person has, which in turn depends on the ratio of flowers to bread in an economy.
I call bread “subsistence” and flowers “luxury”. Both have hedonic value, but one dollar’s
worth of subsistence has a larger demographic effect than one dollar’s worth of luxury (the division
is relative: beef is a luxury relative to potatoes, yet is a subsistence good compared to diamonds).
The two-sector model allows culture and technology to change equilibrium living standards
while the Malthusian constraint is still binding. What matters is not the size of the economy
but the structure of production (say, flowers versus bread); not the aggregate demand but the
relative preference, i.e., how much value people place on one thing—flowers—as compared with
another—bread. By adjusting the ratio of luxury to subsistence, variation in production structure
and social preference can now explain a large portion of fluctuations in living standards, which are
misattributed to changes in fertility culture and disease environment when the two-sector model is
unavailable.
It follows that the Romans were rich not because technological progress temporarily exceeded
population growth—as Malthusians claim—but because Rome had a business-friendly legal system
and an active market economy. Well-functioning courts and marketplaces boost industry more than
they boost agriculture; this biases the production structure toward luxury, and thereby raises the
average living standards of the whole society.
Conversely, after the Agricultural Revolution, the hunters-and-gatherers-turned-peasants failed
to achieve the level of leisure and nutrition their ancestors once enjoyed because agriculture biased
the production structure toward subsistence. The tragedy recurred when potatoes dominated the
Irish diet in the late 18th century.
Yet the two-sector model raises a serious puzzle. If luxury productivity had been growing
slightly faster than the subsistence productivity, living standards would have been rising steadily,
but this never happened until the modern era. The absence of an upward trend in the average
living standards implies that the luxury sector and the subsistence sector somehow grew at the
same speed. The balance of growth lies at the heart of the question of why the Malthusian trap
ever existed, but Malthus was never aware of the puzzle, to say nothing of addressing it.
I show that balanced growth is caused by Darwinian selection of technologies at the group level.
Here is how it works: since living standards increase with the ratio of luxury to subsistence, migration
is usually from places relatively rich in subsistence to places relatively rich in luxury. Subsistence
technologies are thus more likely than luxury technologies to be carried around by migrants and
conquerors—technologies are spread in a selective way. Even if luxury technologies intrinsically
grow faster, the advantage of subsistence technologies in spread might offset the advantage of luxury
technologies in growth, thus keeping the two sectors in balance. Simulations confirm that living
standards would grow steadily if there were only the Malthusian mechanism but not the technological
selection, whereas a tiny bit of selection is enough to suppress a strong tendency of growth. Selection,
no less than the Malthusian mechanism, is crucial to the existence of the Malthusian trap.
The paper is naturally composed of two parts. The first part is the two-sector Malthusian model,
which raises the balanced growth puzzle (section 3). The second part addresses the puzzle with the
group selection theory (section 4).
Malthusian theory is no easy target. To replace it, I need to demonstrate three things. First, I
show that the luxury-subsistence division can explain a number of irregularities in historical data
(section 3.4). Second, I uncover the source-sink pattern in historical migrations (section 4.2). Third,
I simulate and mathematically prove that living standards would grow in a Malthusian economy
unless selection is introduced, and that a tiny bit of selection can suppress a strong tendency of
growth (sections 4.4 and 4.5).
In section 5, I consider the implications of the theory for some major issues in economic history,
including the Agricultural Revolution (section 5.1), the rise and fall of ancient market economies
(section 5.2), the long-term impacts of wars and migrations (section 5.3), and the Industrial Revolu-
tion (section 5.4). Section 6 concludes and discusses what the research means to Economic History
as a discipline.
2 Literature review
How sound is the Malthusian fact? Figure 1 shows Maddison (2003)’s estimates of the world’s GDP
per capita for the last two millennia. Using Maddison’s data, Ashraf and Galor (2011) confirm that,
by year 1500, the level of a country’s technology explains the country’s population density, but not
the income per capita.
However, Maddison’s “guesstimates” might be contaminated with a Malthusian presumption. A
series of revisions in the past few years has largely wiped away the humdrum picture of ancient
economic life in the original Maddison series. It has been shown that, during the pre-modern
centuries, while Italy and Spain experienced stagnation and even declines (Malanima, 2011; Álvarez-
Nogal and De La Escosura, 2013), the per capita GDP of England doubled between 1270 and 1700,
and that of the Netherlands almost tripled between 1000 and 1500. Researchers questioned when
sustained growth actually started (Hersh and Voth, 2009; Persson, 2010; Fouquet, 2014), but, with the
exception of Wu et al. (2014), none asked whether the Malthusian trap existed. The take-home message of
most of this research is that fluctuations in living standards were much larger than previously thought:
the findings disturb the Malthusian interpretation of short-run fluctuations but not the Malthusian
fact of long-term stagnation. Even Wu et al. (2014) agree that the European per capita GDP
failed to recover from the collapse of the Roman Empire until perhaps the dawn of the Industrial
Revolution. Hence, it is fair to say that the existence of the Malthusian trap is still widely held as
a fact.
The extent to which Malthusian theory can explain short-run changes has also been doubted.
In fact, average income and population growth were poorly correlated in the
English data. Yet, for all its weakness, the Malthusian force is believed to dominate in the long run
by its persistence (Lee, 1987, p.452). Allen (2008) worried about the conjecture but mentioned no
alternative solution.

Figure 1: World GDP per capita 1-2001 AD. Series: World and Western Europe. Horizontal axis: year (AD); vertical axis: GDP per capita (1990 international Geary-Khamis dollars). Data source: Maddison (2003)

Another critique of Malthus targets his failure to predict modern growth. As a remedy, theorists
have endogenized the acceleration of growth to reconcile the Malthusian stagnation with modern
economic growth (Simon and Steinmann, 1991; Jones, 2001; Hansen and Prescott, 2002; Galor and
Weil, 2000; Galor and Moav, 2002). Presuming that stagnation was caused by the Malthusian force,
these researchers described how a Malthusian shackle would have been broken, but they never asked
whether the shackle is truly Malthusian or not.
This paper has two key concepts: sectoral duality and group selection. Two-sector Malthusian
models have appeared in Restuccia, Yang, and Zhu (2008); Voigtländer and Voth (2013), and Yang
and Zhu (2013). These studies define sectors production-wise: agriculture uses land; manufacturing
does not. In this paper, I divide sectors consumption-wise: agricultural products and manufactured
products have different demographic effects. Steady growth is possible in my model but not in
theirs. Rudimentary consumption-wise two-sector models have been built in Davies (1994) and
Taylor and Brander (1998). More sophisticated versions appeared in Lipsey, Carlaw, and Bekar
(2005, chap.9) and Weisdorf (2008). However, these researchers never discuss the possibility that
a sustained directional change in production structure would disturb the long-term constancy of
living standards.
The other concept, group selection, was once a taboo, but has revived in the past few decades.
Academia once resisted the idea because it reminded one of Nazi eugenics. But modern biological
research has demonstrated the absurdity of racism based on genetic reasons. Research on group
selection is and should be apolitical. Nevertheless, it is worth noting that, unlike Galor and Moav
(2002) and Clark (2008), my notion of selection is not genetic but cultural. Another reason why
group selection was dismissed as pseudo-science is that researchers once thought group selection to
be invariably weak compared to individual-level selection. In social contexts, emphasizing group
selection seemed to contradict individual rationality. However, since about the 1970s, biologists and
economists have found numerous ways for group selection to exert a significant influence despite
individual selection. Bowles and Gintis (2002) and Bowles (2006) use it to explain how cooperation
gets rooted in human nature. The sociobiologist Wilson (2015) provides a book-length introduction
to the modern literature on multilevel selection and how it explains altruism. In this paper, selection
operates among Nash equilibria that emerge out of individuals’ rational choices. It does not even
require people to be altruistic to do things that matter for the group’s survival and expansion. My
theory is thus immune to the usual critiques of group selection, though the critiques themselves
are mostly unwarranted. This paper also contributes to the sociobiological literature on group
selection. Multilevel selection theorists have been overwhelmingly concerned with the origin of
altruism; they have published hundreds of books and papers on the topic. But this paper is the
first to apply the idea to explaining the Malthusian trap, an issue of no less importance.
A parallel study that I participate in, Wu, Dutta, Levine, and Papageorge (2014) (henceforth
WDLP), is the closest research to this one. There are at least three differences between WDLP
and this paper. First, although WDLP have a similar two-sector Malthusian model, we leave out
the important comparative statics regarding cultural preference, as well as the population stasis
requirement. WDLP cover only one dimension of comparative statics; this paper covers three.
Second, WDLP take the inter-sector difference in demographic effects as a presumed fact, while
this paper explains how the difference arises from an evolutionary biological perspective. Third,
although both WDLP and this paper come to the same conclusion that either the Malthusian fact
is false, or Malthus’s explanation for the fact is wrong, WDLP explore the first possibility, while
this paper focuses on the second: re-explaining the Malthusian trap.
Like WDLP, Levine and Modica (2013) treat the Malthusian trap not as
a fact to explain but as a false prediction to challenge. Instead of dividing goods into sectors, they
focus on the allocation of resources between the people and the authority—a special pair of luxury
and subsistence in light of my theory. Their equilibrium is the maximization of authority-controlled
resources meant for wars between states. It is a special case of my group selection theory.
3 A two-sector Malthusian theory
In this section, I begin with a two-sector Malthusian model that adds two new comparative statics to
the classical theory. The classical Malthusian theory is shown to be a special case that assumes away
the conflict between individual welfare and group fitness. Next, I discuss the historical relevance of
multi-sectorality and its implications for long-term growth.
3.1 The division of sectors
Suppose there are $H$ identical people living on an isolated island that has $M$ types of commodities,
$j = 1, 2, \ldots, M$. The representative agent consumes $E \in \mathbb{R}^M_+$, a bundle she chooses to maximize a
utility function that is differentiable and strictly increasing:
$$\max_{E \in C(H)} U(E). \tag{1}$$
The island is a Malthusian economy. Given resources, her choice set, $C$, shrinks with population:
$\forall H_1 < H_2$, $C(H_1) \supset C(H_2)$. The population growth rate depends on the average consumption $E$:
$$\frac{\dot{H}}{H} = n(E). \tag{2}$$
Assume that $n(E)$ is continuous, differentiable and strictly increasing, and that there exists a set
$S$ on which population stays constant. Call $S$ the constant population set. Any isolated economy
that finds itself on the constant population set is at a Malthusian equilibrium.
If $U(\cdot)$ is not a transformation of $n(\cdot)$, there must exist some bundle of consumption $E$ at which
one commodity is more luxurious than another, i.e., $\exists\, j_1, j_2 \in \{1, 2, \ldots, M\}$ such that
$$\frac{\partial U(E)/\partial E_{j_1}}{\partial n(E)/\partial E_{j_1}} > \frac{\partial U(E)/\partial E_{j_2}}{\partial n(E)/\partial E_{j_2}}. \tag{3}$$
Compared with $j_2$, commodity $j_1$ marginally contributes more to individual utility than to population
growth. This makes $j_1$ a luxury relative to $j_2$. We can define the “luxuriousness” of each
commodity in the following way: $\forall E \in \mathbb{R}^M_+$, $\forall j \in \{1, 2, \ldots, M\}$, commodity $j$’s luxuriousness at $E$ is
$$\frac{\partial U(E)/\partial E_j}{\partial n(E)/\partial E_j}. \tag{4}$$
Order all commodities by their luxuriousness, and we have a spectrum from the most luxurious
commodity to the most subsistential commodity. In fact we can always distinguish luxury goods
from subsistence goods as long as $U(\cdot)$ is not a transformation of $n(\cdot)$.
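As a numerical illustration of the luxuriousness measure in (4), the following Python sketch ranks three commodities. The functional forms of `utility` and `growth`, the weights, and the consumption bundle are illustrative assumptions chosen for this example, not taken from the paper; only the ratio of marginal utility to marginal demographic effect follows the definition above.

```python
import math

# Hypothetical commodities and weights (assumptions for illustration only).
GOODS = ["potatoes", "beef", "diamonds"]
U_WEIGHTS = [0.3, 0.4, 0.3]    # assumed weights in the utility function U
N_WEIGHTS = [0.6, 0.3, 0.01]   # assumed weights in the growth function n

def utility(E):
    """Assumed log-linear utility U(E); strictly increasing in every good."""
    return sum(w * math.log(e) for w, e in zip(U_WEIGHTS, E))

def growth(E):
    """Assumed log-linear population growth rate n(E); strictly increasing."""
    return sum(c * math.log(e) for c, e in zip(N_WEIGHTS, E)) - 0.5

def partial(f, E, j, h=1e-6):
    """Central finite-difference estimate of the partial derivative of f in E[j]."""
    up, down = list(E), list(E)
    up[j] += h
    down[j] -= h
    return (f(up) - f(down)) / (2 * h)

def luxuriousness(E):
    """Ratio of marginal utility to marginal demographic effect for each good."""
    return [partial(utility, E, j) / partial(growth, E, j) for j in range(len(E))]

def ranked(E):
    """Goods ordered from most luxurious to most subsistential at bundle E."""
    return sorted(zip(GOODS, luxuriousness(E)), key=lambda pair: pair[1], reverse=True)

for good, score in ranked([2.0, 1.0, 0.5]):
    print(f"{good}: {score:.3f}")
```

Under these assumed weights the ordering runs from diamonds through beef to potatoes, so beef is a luxury relative to potatoes and a subsistence good relative to diamonds.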
3.2 Production structure and social preference
Here I provide a graphical representation of the two-sector Malthusian model. To start with, draw
the representative agent’s consumption space (figure 2A). Besides the familiar indifference curve
and production possibility frontier, the diagram has a “constant population curve”. If the agent’s
consumption bundle is right on the curve, population stays constant. If the bundle lies to the
left—consumption is less than reproduction requires—population decreases. If the bundle lies to
the right, population increases.

Figure 2: The two-sector Malthusian equilibrium. Panels (A) and (B); axes: subsistence, luxury; curves: constant population curve, indifference curve, PPF; marked points: E, E′.

Population changes shift the production possibility frontier. When population declines, the
frontier expands: each person is endowed with a larger choice set. When population rises, the
frontier contracts—the economy has diminishing returns to labor. The returns to labor diminish
because land is crucial to production and its supply is inelastic.
Assume that the expansion and contraction of the production possibility frontier are shape
preserving, that is, the shape of the production possibility frontier is independent of the size of the
population. It would be interesting to allow luxury to be more labor-intensive than subsistence,
but such a relaxation of assumption would only complicate the model in ways that are favourable
to my hypothesis—the equilibrium living standards would then become even more responsive to
technological changes. Hence I stick to the shape-preserving assumption (see footnote 1). The spirit is that I do
not intend to equate subsistence with food. Luxury and subsistence are a more fundamental pair
of concepts than food and non-food.
The constant population curve crosses the indifference curve from above because, by definition,
subsistence goods are more important than luxury goods to population growth (see footnote 2). The curve does
not have to be vertical though. Beef is more luxurious than potatoes, but beef contains calories too.
The Malthusian equilibrium must lie on the constant population curve. As figure 2B shows, if
the economy expands, the temporary affluence will raise the density of population. The production
possibility frontier will then contract until the economy returns to the constant population curve.

1 One may consult Wu et al. (2014) for the case where the assumption is dropped.
2 The definition of luxury determines the direction of crossing. If we denote the consumption of subsistence as $E_A$
and the consumption of luxury as $E_B$, the definition of luxury implies
$$\frac{\partial U(E)/\partial E_A}{\partial n(E)/\partial E_A} < \frac{\partial U(E)/\partial E_B}{\partial n(E)/\partial E_B} \implies \frac{\partial U(E)/\partial E_A}{\partial U(E)/\partial E_B} < \frac{\partial n(E)/\partial E_A}{\partial n(E)/\partial E_B}, \quad \text{i.e., } MRS_U < MRS_n, \tag{5}$$
so the constant population curve must be steeper than the indifference curve at $E$.

Unlike the classical Malthusian theory, the two-sector framework allows technological shocks to
change equilibrium living standards. A positive shock in luxury technology expands the production
possibility frontier vertically (figure 3A). After population adjusts, the economy returns to the
constant population curve (figure 3B). The new equilibrium (𝐸′′) is above the old one (𝐸) because
the production possibility frontier has become steeper.

Figure 3: Progress of luxury technology improves equilibrium living standards. Panels (A) and (B); axes: subsistence, luxury; curves: CPC, IC, PPF; marked points: E, E′, E′′.

However, long-term living standards will decrease if progress occurs in the subsistence sector
instead. As figure 4A shows, the production possibility frontier expands horizontally as the subsis-
tence sector expands. The abundance of subsistence goods increases population. After the economy
returns to the constant population curve, the new equilibrium stays below the old one, because the
production possibility frontier has become flatter (figure 4B). In the long run, what matters for
living standards is not the size but the shape of the production possibility frontier.
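The two comparative statics above can be reproduced in a small simulation. The sketch below is only an illustration under assumptions that are mine rather than the paper's: Cobb-Douglas utility with luxury weight `beta`, a linear per-capita production possibility frontier that shrinks with population as `H ** (-gamma)` (so its shape is preserved), and the limiting case in which population growth depends on subsistence consumption alone, so the constant population curve is vertical at `e_bar`. All parameter names and values are hypothetical.

```python
import math

def allocate(A, B, H, beta, gamma):
    """Per-capita bundle chosen by the representative agent at population H."""
    scale = H ** (-gamma)              # diminishing returns to labor on fixed land
    E_sub = A * (1.0 - beta) * scale   # Cobb-Douglas: share 1 - beta goes to subsistence
    E_lux = B * beta * scale           # and share beta goes to luxury
    return E_sub, E_lux

def malthusian_equilibrium(A, B, beta=0.5, gamma=0.5, e_bar=1.0,
                           speed=0.1, H0=1.0, steps=5000):
    """Iterate population until the economy settles on the constant population curve."""
    H = H0
    for _ in range(steps):
        E_sub, _ = allocate(A, B, H, beta, gamma)
        growth = speed * math.log(E_sub / e_bar)   # assumed n(E): subsistence only
        H *= math.exp(growth)
    E_sub, E_lux = allocate(A, B, H, beta, gamma)
    utility = (1.0 - beta) * math.log(E_sub) + beta * math.log(E_lux)
    return {"H": H, "E_sub": E_sub, "E_lux": E_lux, "utility": utility}

base = malthusian_equilibrium(A=1.0, B=1.0)
luxury_shock = malthusian_equilibrium(A=1.0, B=2.0)        # luxury productivity doubles
subsistence_shock = malthusian_equilibrium(A=2.0, B=1.0)   # subsistence productivity doubles

for name, eq in [("base", base), ("luxury shock", luxury_shock),
                 ("subsistence shock", subsistence_shock)]:
    print(name, {key: round(value, 4) for key, value in eq.items()})
```

In this setup subsistence consumption returns to `e_bar` in every case. The luxury shock leaves population unchanged and raises equilibrium utility, while the subsistence shock raises population and lowers equilibrium utility, because only the slope `B / A` of the frontier survives the population adjustment.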
Luxury expansion characterizes the impact that market-oriented economic policies would have
on an ancient economy, such as the Roman Empire and the Song dynasty of China. Markets boost
both agriculture and manufacturing, but manufacturing usually reaps more benefits from markets
than agriculture does, so the equilibrium living standards will rise as a result of the tilted production
structure. The Malthusian force would have no way to check the improvement.
Paradoxically, subsistence expansions, such as the Agricultural Revolution, only drag an econ-
omy into deeper poverty. Archaeological evidence shows that the ancient peasants lived a worse life
than their hunter-gatherer ancestors: leisure time was shortened; diet became less diversified; and
harvest failures caused more frequent starvation (Diamond, 1987).

Figure 4: Subsistence technological progress decreases equilibrium living standards. Panels (A) and (B); axes: subsistence, luxury; curves: CPC, IC, PPF; marked points: E, E′, E′′.

Besides technology, the two-sector model also gives culture an important role to play. Suppose
there is a cultural shock that makes people desire more luxury: the indifference curve becomes
flatter (figure 5A). Imagine the luxury culture as one that promotes a new conspicuous consumption
used to signal unobserved income (Moav and Neeman, 2008). Those who dared reject the cultural
norms might have more food to eat but would be less attractive on the marriage market. Individual
reproductive rationality requires people to spend resources on vanity. As people trade subsistence
for luxury, population undergoes a gradual decline. When the adjustment is over, those who remain
turn out to enjoy higher equilibrium living standards (figure 5B).
In short, luxury is socially free: so long as everyone desires more, more is granted to each
who survives. Who pays for the extra luxury? Those who would otherwise have been born, and those who
would otherwise not have died, pay with their lives.

Figure 5: Luxury culture shock increases equilibrium living standards. Panels (A) and (B); axes: subsistence, luxury; curves: CPC, IC, PPF; marked points: E, E′, E′′.

The above results seem to suggest that technologies and cultures that are more biased toward
luxury always yield higher living standards. As a correlation, the rule holds most of the time but
exceptions exist. The monotonicity is broken if there exist multiple equilibria. Fortunately, I can
show that multiple equilibria arise only if subsistence goods are Giffen. To the extent that Giffen
goods are rare, we do not have to worry about the exceptions in most cases. In the online appendix (see footnote 3),
I offer proofs of the following theorems:
Theorem 1 (Production Structure Theorem). (a) For an economy already at a stable equilibrium,
a positive luxury technological shock always improves equilibrium living standards. (b) Other things
being equal, if subsistence goods are not Giffen, a more luxury-biased production structure always
means higher equilibrium living standards.
Theorem 2 (Free Luxury Theorem). (a) For an economy already at a stable equilibrium, a luxury
cultural shock always improves equilibrium living standards. (b) Other things being equal, if
subsistence goods are not Giffen, a more luxury-biased culture always means higher equilibrium living
standards.
The above two theorems each add a new dimension of comparative statics to the classical
Malthusian theory. The classical theory, in contrast, has only one comparative static, the one
that is associated with the constant population curve. This single “classical” comparative static, it
turns out, works well in the two-sector model too. When the disease environment worsens, warfare
becomes more frequent, or people decide to postpone marriage and have fewer children, the constant
population curve simply shifts rightward: population grows more slowly at each level of consumption
(figure 6A). The ensuing decline of population expands the production possibility frontier. People
who survive the changes are better off in the new equilibrium than in the old equilibrium (figure 6B).
The two-sector model thus preserves the merit of the classical theory.
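The cultural shock and the shift of the constant population curve can both be checked in closed form. The sketch below rests on illustrative assumptions of mine, not the paper's specification: Cobb-Douglas preferences with luxury weight `beta`, a linear per-capita frontier scaled by `H ** (-gamma)`, and a vertical constant population curve at subsistence level `e_bar`. A higher `beta` stands in for a luxury culture, and a higher `e_bar` for a rightward shift of the curve due to disease, war or delayed marriage.

```python
def closed_form_equilibrium(A=1.0, B=1.0, beta=0.5, gamma=0.5, e_bar=1.0):
    """Equilibrium of the assumed two-sector economy, solved analytically.

    The agent consumes E_sub = A * (1 - beta) * H**(-gamma) and
    E_lux = B * beta * H**(-gamma); population is constant when E_sub == e_bar.
    """
    scale = e_bar / (A * (1.0 - beta))   # value of H**(-gamma) that pins E_sub at e_bar
    H = scale ** (-1.0 / gamma)          # equilibrium population
    E_lux = B * beta * scale             # equilibrium luxury consumption per capita
    return {"H": H, "E_sub": e_bar, "E_lux": E_lux}

base = closed_form_equilibrium()
luxury_culture = closed_form_equilibrium(beta=0.6)    # people desire more luxury
harsher_demography = closed_form_equilibrium(e_bar=1.2)   # curve shifts rightward

for name, eq in [("base", base), ("luxury culture", luxury_culture),
                 ("harsher demography", harsher_demography)]:
    print(name, {key: round(value, 4) for key, value in eq.items()})
```

In both experiments population falls and the survivors consume at least as much of each good as before. Consumption bundles rather than utility levels are compared for the cultural shock, because the utility function itself changes with `beta`.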

Figure 6: Diseases, wars and delayed marriage increase equilibrium living standards. Panels (A) and (B); axes: subsistence, luxury; curves: CPC, CPC′, IC, PPF; marked points: E, E′.

3 http://wulemin.weebly.com/uploads/1/4/6/2/14620598/malthusonlineappendix.pdf

3.3 Classical Malthusianism as a special case
The classical theory is actually a special case of the two-sector model. When 𝑈(·)is a transformation
of 𝑛(·), all commodities have the same level of luxuriousness, and the constant population curve
will coincide with the indifference curve. As the two curves coincide, a luxury technological shock
can change the consumption bundle but not the equilibrium living standards. Once the economy has
returned to the constant population curve, it will have the same level of utility as in the original
equilibrium (figure 7B).
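
The coincidence of the two curves can be verified in two lines. As a sketch, assume that “transformation” means $U(E) = f(n(E))$ for some differentiable, strictly increasing function $f$ (the symbol $f$ is notation introduced here for illustration), and recall that population is constant exactly where $n(E) = 0$:

```latex
\begin{align*}
\frac{\partial U(E)/\partial E_j}{\partial n(E)/\partial E_j}
  &= \frac{f'(n(E))\,\partial n(E)/\partial E_j}{\partial n(E)/\partial E_j}
   = f'(n(E)) \quad \text{for every } j, \\
E^* \in S \;\Longrightarrow\; n(E^*) = 0
  &\;\Longrightarrow\; U(E^*) = f(0).
\end{align*}
```

Under this assumption every commodity has the same luxuriousness, and equilibrium utility equals $f(0)$ whatever the technology, which is the stagnation result of the classical theory.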

Figure 7: Malthusian theory is a special case where population growth and individual utility are fully aligned. Panels (A) and (B); axes: subsistence, luxury; curves: IC (CPC), PPF; marked points: E, E′, E′′.

This is why the classical theory predicts stagnation. By the coincidence of curves, two sectors
are reduced to one; the difference of demographic effects between bread and flowers is ignored; and
the conflict of reproductive interest between individual and group is assumed away.
The last point bears emphasizing. The conflict between individual and group is the ultimate
reason why the curves cross. The constant population curve is an iso-group-fitness curve, along which
population grows at the same rate; the indifference curve is an iso-utility curve, or approximately
an iso-individual-fitness curve. Millions of years of natural selection have shaped the human preference
system into a maximizer of one’s own reproductive success. The conflict of reproductive interest
between individual and group prevails in both culture and nature. As surely as the conflict persists,
the division of sectors is a perpetual human condition. Assuming the individual’s interest to be
perfectly aligned with the group’s interest is simply unrealistic.
There are still many who believe there is nothing wrong with accepting an unrealistic assumption
as long as the theory makes the right prediction, and that among all theories that predict correctly,
the simplest is the best (Friedman, 1953). In their view, Malthusian theory is great. It might fail
to predict short-term changes, but ad hoc explanations are always available to explain away the
inconsistencies; the most important thing is: Malthus predicts the Malthusian trap. So why bother
to find an alternative?
The above view makes a serious methodological mistake—Milton Friedman (1953) is wrong.
Unrealistic assumptions are fine only if they are not crucial. If an unrealistic assumption is crucial,
that is, if making the assumption more realistic would lead the prediction astray from reality, then
the theory is wrong (Solow, 1956). Malthusian theory has an unrealistic assumption: the harmony
of reproductive interest between individual and group (were that the case, the whole field of
game theory would be pointless!). If we relax the assumption, turning the one-sector model into
a multi-sector one, the multi-sector model will predict a trend of growth in living standards (this
part will be shown in section 3.6). Hence the assumption is both unrealistic and crucial. We need
a new theory to replace Malthusian theory, not only because the two-sector model accounts for
a larger portion of variation in living standards (it does), but also because Malthusian theory is
simply wrong.
3.4 Evidence of multi-sectorality
How relevant is sectoral division in actual history? The short answer is: very much. Whether one
recognizes the division shapes conclusions as important as when the Great Divergence occurred and how
real income inequality evolved.
Based on the similarity of wages in terms of calories, Pomeranz (2009) argued that the Yangzi
Delta of China was on the same level of development as Northwest Europe as late as the end of the
18th century. He ignored the idea that the per capita calories are fixed by the Malthusian force, and
that most of the difference in living standards comes from the difference in non-food consumption.
As Broadberry and Gupta (2006) showed, if we measure the purchasing power of wage by grams of
silver, the “silver wage” of the Yangzi Delta was only comparable with the level of the central and
eastern parts of Europe (table 1). Regions varied widely in the ratio of silver wage to grain wage.
That the ratio was higher in England and the Netherlands than in other places reflected these two
countries’ relative advantage in tradable goods production, and it was by this advantage that the
Northwest European economies stood out in the pre-modern centuries.
Table 1: The grain wage and silver wage of different regions, 1750-99

Region                 Grain wage (kg/day)   Silver wage (g/day)   Ratio (silver/grain)
Southern England       7                     8.3                   1.2
Antwerp                9.6                   6.9                   0.7
Vienna                 7                     3                     0.4
India                  2.3                   1.2                   0.5
Yangzi delta, China    3                     1.7                   0.6

Data source: Broadberry and Gupta (2006).
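
The ratio column of Table 1 can be recomputed from the two wage columns. The short Python sketch below does so with the figures reported in the table; the variable names are illustrative, and rounding to one decimal place is assumed to match the table's presentation.

```python
# Grain wage (kg per day) and silver wage (g per day), as reported in Table 1.
WAGES = {
    "Southern England": (7.0, 8.3),
    "Antwerp": (9.6, 6.9),
    "Vienna": (7.0, 3.0),
    "India": (2.3, 1.2),
    "Yangzi delta, China": (3.0, 1.7),
}

# Silver wage divided by grain wage, rounded to one decimal place.
ratios = {region: round(silver / grain, 1)
          for region, (grain, silver) in WAGES.items()}

for region, ratio in ratios.items():
    print(f"{region}: {ratio}")
```

A higher ratio means that a day's wage buys relatively more silver-priced goods than grain, which is the sense in which Southern England stands apart from the Yangzi delta despite a broadly similar order of magnitude in grain wages.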
Hoffman et al. (2002) studied the implication of sectoral division for real income inequality.
Because the rich spent a larger portion of income on luxury than did the poor, the decline of the
prices of luxuries relative to the prices of subsistence goods enlarged the real inequality in Europe
between 1500 and 1800.
More direct evidence for multi-sectorality is provided in Wu (2012). In that paper, I did a
simple exercise: instead of regressing birth and death rates on real wages, as most people do when
testing the Malthusian relationship, I regressed these demographic variables on “sectoral wages”, that
is, the purchasing power of wages measured in terms of goods of certain sectors. During the three
centuries before 1800, the period conventionally believed to have provided the strongest evidence
in support of Malthus, multi-sectorality turned out to be a salient feature of the English economy.
If pasture goods became more affordable, population growth barely changed; if arable goods became
more affordable, the population growth rate increased substantially. Within the category of arable goods, the
affordability of wheat had little impact on population growth, but the affordability of barley and
oats, the staple foods of the poor, almost solely explained the impact of real wages on birth
and death rates. However, barley and oats accounted for merely 10% of the English economy, much less
than the share of wheat. The remaining 90%, including wheat, beef, cotton and candles, hardly
mattered demographically. Productivity improvement in the 90% sector surely would have increased
the long-term living standards, with more families switching their diet from porridge to bread, and
starting to call tea, sugar and coffee “necessities” of life. Changes like this might not be reflected in
the real wage series historians have built, but, with the proper data and method, it is still possible
to assess this part of the welfare increase.
Hersh and Voth (2009) estimated that, by the end of the 18th century, tea, coffee and sugar had
added at least the equivalent of 16% (and possibly as much as 20%) of income to English welfare.
Contrast it with two other New World crops, maize and potatoes. Chen and Kung (2013) estimated
that maize accounted for 18% of the population increase in China during 1776-1910, but had no
significant effect on economic growth. Nunn and Qian (2011) estimated that potatoes accounted for
about a quarter of the growth in Old World population between 1700 and 1900. Potatoes triggered
a Malthusian crisis in Ireland in the late 18th century; the explosion of population drove the Irish
to extreme poverty that culminated in the Great Famine of 1845 (Mokyr, 1983). The difference
between tea, coffee and sugar on the one hand, and maize and potatoes on the other hand, is
none other than the difference between luxury and subsistence. While the abundance in the former
improved the quality of life, the abundance in the latter increased the population.
3.5 The algebraic two-sector model
This section builds a simple algebraic model that captures all of the three comparative statics.
Assume the representative agent maximizes a Cobb-Douglas utility function over her subsistence
consumption, $x$, and luxury consumption, $y$:
$$\max\; U(x, y) = x^{1-\beta} y^{\beta}. \tag{6}$$
The constant returns to scale makes the magnitude of utility meaningful: utility doubles when
consumption doubles.
Specify the subsistence production function as $X = A L_A^{1-\gamma_A} H_A^{\gamma_A}$ and the luxury
production function as $Y = B L_B^{1-\gamma_B} H_B^{\gamma_B}$. $L_A$ and $L_B$ are land used in the
production of subsistence and luxury respectively. Their sum is the total endowment of land,
$L_A + L_B = L$. $H_A$ and $H_B$ are labor employed in the respective sectors, and $H_A + H_B = H$.
Assumption 1. $\gamma_A = \gamma_B \equiv \gamma < 1$.
I assume $\gamma_A = \gamma_B$ so that population growth affects the two sectors in proportion, equivalent to
the shape-preserving assumption in the graphical model. $\gamma_A$ and $\gamma_B$ are smaller than one because
of diminishing returns to labor.
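By maximizing the agent’s utility under land and labor constraints, we can derive her consumption
bundle:
$$x = A(1-\beta)\left(\frac{H}{L}\right)^{\gamma-1} \tag{7}$$
$$y = B\beta\left(\frac{H}{L}\right)^{\gamma-1} \tag{8}$$
Substitute equation 7 and equation 8 into $U = x^{1-\beta} y^{\beta}$. The level of utility is
$$U = A\left(\frac{H}{L}\right)^{\gamma-1}\left(\frac{B}{A}\right)^{\beta}(1-\beta)^{1-\beta}\beta^{\beta}. \tag{9}$$
Since $A(H/L)^{\gamma-1}(1-\beta) = x$ (equation 7), $U$ can be expressed alternatively as
$$U = x\left(\frac{B}{A}\right)^{\beta}\left(\frac{\beta}{1-\beta}\right)^{\beta}. \tag{10}$$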
The economy converges to equilibrium by population adjustment. In an isolated economy, the net
growth rate of population, $g_H$, is equal to the natural growth rate of population, $n$. Assume that $n$
depends on the average consumption of subsistence only.
Assumption 2. $g_H \equiv \dot{H}/H = n = \delta(\ln x - \ln \bar{x})$, and $\delta > 0$.
$\bar{x}$ is the level of average subsistence at which population remains constant. The assumption
means a vertical constant population curve: population growth is independent of average luxury
consumption. In the equilibrium, $x = \bar{x}$. Therefore,
Proposition 1.
$$U^E = \bar{x}\left(\frac{B}{A}\right)^{\beta}\left(\frac{\beta}{1-\beta}\right)^{\beta} \tag{11}$$
The equilibrium utility increases with
(a) the relative luxury productivity, $B/A$,
(b) the relative preference for luxury, $\beta$,
(c) and the required consumption for population balance, $\bar{x}$.
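The two-sector algebra above can be checked numerically. The following is a minimal sketch rather than the author's own code: the function names and the parameter values in the usage lines are illustrative assumptions, and only equations 6, 7, 8 and 11 are taken from the text.

```python
def consumption(A, B, beta, gamma, H, L):
    """Per-capita subsistence x and luxury y (equations 7 and 8)."""
    density = (H / L) ** (gamma - 1.0)  # diminishing returns to labor on fixed land
    x = A * (1.0 - beta) * density
    y = B * beta * density
    return x, y


def utility(x, y, beta):
    """Cobb-Douglas utility (equation 6)."""
    return x ** (1.0 - beta) * y ** beta


def equilibrium_utility(A, B, beta, x_bar):
    """Equation 11: utility once population has adjusted so that x = x_bar."""
    return x_bar * (B / A) ** beta * (beta / (1.0 - beta)) ** beta


def equilibrium_population(A, beta, gamma, L, x_bar):
    """Population H at which equation 7 delivers exactly x = x_bar."""
    return L * (x_bar / (A * (1.0 - beta))) ** (1.0 / (gamma - 1.0))


# Illustrative (assumed) parameter values.
H = equilibrium_population(A=1.0, beta=0.3, gamma=0.5, L=100.0, x_bar=1.0)
x, y = consumption(A=1.0, B=1.0, beta=0.3, gamma=0.5, H=H, L=100.0)
print(utility(x, y, 0.3), equilibrium_utility(1.0, 1.0, 0.3, 1.0))  # agree up to rounding
```

Doubling $A$ and $B$ together leaves `equilibrium_utility` unchanged (only population rises), whereas doubling $B$ alone raises it by the factor $2^{\beta}$, which is comparative static (a) of the proposition.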
It is worth noting that the above model uses an exogenously given function of population growth.
However, the new standard of the literature is to endogenize the birth rates from households’ fertility
choices. Yet I stick to the exogenous functions because, as Weisdorf (2008) demonstrates, adding
such a microfoundation will not change any of the above results. When subsistence is relatively
cheaper than luxury, raising children becomes less costly to parents. The result is the same: the
equilibrium income per capita depends on production structure. I abstract from fertility choices
to save unnecessary complications and to highlight the key mechanisms at work—it’s not fertility
choice but the physiological nature of commodities that matters.
3.6 The balanced growth puzzle
Proposition 1 implies that living standards will rise steadily if luxury productivity grows faster than
subsistence productivity. Denote $g_A$ as the growth rate of subsistence productivity and $g_B$ as the
growth rate of luxury productivity. Appendix A.1 proves that $g_U$ converges to $\beta(g_B - g_A)$ in the
long run.
In the classical Malthusian theory, there is only one sector ($\beta = 0$), therefore $g_U = 0$. In the
two-sector model, $\beta$ is positive. The equilibrium living standards will have an upward or downward
trend unless $g_B = g_A$.
Given Malthusian stagnation as a fact, the implied balanced growth is an extraordinary phe-
nomenon. The world population had grown from several million at the dawn of the agricultural
revolution, to three hundred million at the birth of Christ, and to almost one billion on the eve of
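the industrial revolution; it went up by a factor of at least 1000. To keep up, subsistence production
must have grown by a thousand-fold; the subsistence technology $A$ would have grown by about
thirty-fold if $\gamma = 0.5$ (with land and average subsistence fixed, equation 7 implies
$A \propto H^{1-\gamma}$, and $1000^{0.5} \approx 32$).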
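The thirty-fold figure follows from equation 7 with land and average subsistence held fixed. A short sketch of the arithmetic, where the function name is a hypothetical label introduced here:

```python
def required_tech_growth(population_factor, gamma):
    """Factor by which A must rise to keep x constant when H rises by
    population_factor and land L is fixed: A is proportional to H**(1 - gamma)."""
    return population_factor ** (1.0 - gamma)


print(round(required_tech_growth(1000, 0.5), 1))  # 31.6
```

A larger labor share $\gamma$ lowers the required growth in $A$, since diminishing returns to labor are then weaker.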
Throughout the thousand-fold growth of subsistence, luxury must have grown in exact propor-
tion. What could keep things balanced over ten thousand years during a thousand-fold growth? In
comparison, world population has grown only six-fold since the industrial revolution. In the mean-
time, the world GDP has grown by a factor of 150—the progress in luxury productivity has been
much faster than the progress in subsistence productivity. If the balance of growth is so difficult to
maintain within such a short period, isn’t it extraordinary that growth was once balanced for not
decades or centuries, but for millennia and even longer?
Moreover, it is natural to expect luxury production to intrinsically grow faster than subsistence
production. There are at least four reasons why we should expect so. First, manufacturing and
commerce are usually more labor-intensive than agriculture.⁴ Population growth, by increasing
labor supply, naturally expands luxury production more than it expands subsistence production.
Second, industrial innovations are much less constrained by the possibilities of nature than are
agricultural innovations.
⁴ I assume land plays equal parts in luxury production and subsistence production, i.e. $\gamma_A = \gamma_B$ (a simplifying
assumption that is unfavourable to my hypothesis). If instead I allow subsistence to rely more on land than luxury
does (which would be a very reasonable assumption, for subsistence is mostly about food and basic shelter), then
balanced growth alone is enough to cause steady progress of living standards. Nevertheless, I keep the stricter
assumption of $\gamma_A = \gamma_B$ for model tractability. That said, it is admittedly imprecise to call balanced growth the
stagnation condition, for the condition should take into account the gap between $\gamma_A$ and $\gamma_B$. Wu et al. (2014) derive
the $\gamma$-adjusted balanced growth condition as
$$g_B - \frac{1-\gamma_B}{1-\gamma_A}\, g_A = 0. \tag{12}$$
Third, the incentives for industrial innovations are better protected than the incentives for
agricultural innovations. An ancient farmer who succeeded with a new crop could hardly reap any of
the social benefit that spilled out of her own land. By contrast, in manufacturing and commerce,
keeping trade secrets for monopoly rent was feasible most of the time.
Last but not least, manufacturing allows a larger extent of division of labor. As Adam Smith
(1887, Chapter I, Book I) put it,
“The nature of agriculture, indeed, does not admit of so many subdivisions of labour,
nor of so complete a separation of one business from another, as manufactures. [...] This
impossibility of making so complete and entire a separation of all the different branches of
labour employed in agriculture, is perhaps the reason why the improvement of the
productive powers of labour, in this art, does not always keep pace with their improvement
in manufactures. The most opulent nations, indeed, generally excel all their neighbors
in agriculture as well as in manufactures; but they are commonly more distinguished by
their superiority in the latter than in the former.”
The above reasons combined, the Malthusian trap appears to be a most unlikely coincidence.
Therefore, the Malthusian fact is still a fundamental puzzle of history. Malthus has not solved the
most mysterious part of the puzzle. To address the puzzle, I propose four explanations. Below, I
reject three of them. In the next section, I will explore the fourth in detail.
I Evolutionary adaptation
Long exposure to a luxury good might cause genetic adaptation that allows people to use it
as a subsistence good. For example, lactose intolerance is relatively rare among Northwest
Europeans, whose ancestors, as a conjecture goes, had a higher reliance on milk as a source of
nutrition than did people in Asia, who did not develop the gene. The problem is that, even if
the conjecture is correct, genetic adaptation is usually slow, and the mechanism that works for
food does not work for manufactured products, which are a more important category of luxury
goods.
II Positional goods
Diamonds are precious because they are rare. Positional goods become worthless when there
are too many: people value how much they own compared with others instead of what they
own per se. The abundance of a particular luxury drives people away from the luxury. The
problem with this hypothesis is that the Malthusian fact is not about the lack of desire, but
the shortage of goods. The mechanism might explain why being rich does not make one much
happier, but it does not explain why physical deprivation lasts.
III Constant returns to scale
Solow and Samuelson (1953) showed that, in a dynamic system described as
$$A_{t+1} = F_A(A_t, B_t), \qquad B_{t+1} = F_B(A_t, B_t),$$
if $F_A(\cdot)$ and $F_B(\cdot)$ have constant returns to scale, then $A$ and $B$ will grow in balance on a stable
path. But the theorem only pushes back the question one step further. It is doubtful whether
the theorem is applicable to luxury and subsistence growth. Even if it is applicable, we still
have to answer why the functions have constant returns to scale. So far, I have seen no reason
why they should be so.
IV Group selection
This hypothesis holds that, for most of ancient history, there was a systematic bias in the direction
of migration and conquest: people and civilizations from relatively subsistence-rich groups tended
to conquer and replace those from relatively luxury-rich groups. High-living-standard societies
were rare in ancient times because the only way to achieve high living standards under the
Malthusian constraint was to have a highly developed luxury sector, but if a society spends too
many resources on luxury, it invites invasion and immigration from relatively subsistence-rich
groups that eventually erode the luxury culture. Because of the bias, groups that value austerity
are more likely to persist than groups with a licentious lifestyle. The bias also enables
subsistence technology to spread faster than luxury technology. Even if luxury technology
intrinsically grows faster than subsistence technology, the spread advantage of subsistence might
offset the growth advantage of luxury. The hypothesis fits historical facts so well that I cannot
reject it. The rest of the paper is devoted to the idea.
4 A group selection theory of the Malthusian trap
I have shown that it takes balanced growth between the luxury and subsistence sectors for living
standards to remain constant, and that the balanced growth puzzle cannot be explained by the
evolutionary adaptation hypothesis, the positional goods hypothesis and the constant returns to
scale hypothesis. This section elaborates on a fourth alternative, the group selection mechanism.
Selection works through biased migration. I first explain what biased migration is, and discuss
historical evidence in support of the migration pattern. Next, I build a group selection model
that allows me to derive the threshold condition of stagnation. Finally, simulations are conducted
to compare the paths of global average living standards with and without selection. Both model
and simulation confirm that a Malthusian world without selection would have an upward trend of
growth in average living standards, but when selection is introduced—a tiny bit of selection would
suffice—the trend is gone.
4.1 Biased migration
Group selection is related to the phenomenon of “biased migration”. This section uses a simple model
to explain how certain characteristics in culture and technology bias the direction of migration.
Suppose there is a sea of identical villages, all at the equilibrium state. Following Tiebout (1956),
I assume free migration across villages but forbid trade between them.⁵ Bread and flowers are the
only commodities. Suddenly, one of the villages discovers a better way to grow flowers. Its produc-
tion possibility frontier expands vertically. If migration were forbidden, the flower village would end
up with higher living standards. But free migration equalizes utility across the villages (figure 8B).
With a steeper production possibility frontier tangent with the same indifference curve, the flower
village stays to the left of the constant population curve in the migration equilibrium—its death rate
is higher than the birth rate. The natural decrease of population does not expand the production
possibility frontier because the under-reproduction is filled up with the continuous immigration from
the other villages. The flower village becomes a demographic sink and the surrounding villages a
demographic source.
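[Figure 8, panels (A) and (B). Axes: subsistence and luxury. Each panel shows the constant population
curve (CPC), an indifference curve (IC) and the production possibility frontier (PPF), with the
points E, E′ and E″ marked.]
Figure 8: Source-sink migration emerges out of difference in production structure.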
Differences among cultures cause source-sink migration too. Suppose in one of the villages, girls
begin to ask for more flowers from their suitors: the indifference curve becomes flatter (figure 9A).
⁵ Trade substitutes for migration. If trade is free of cost, different regions will face the same relative price of luxury
to subsistence. The Malthusian force then equates consumption across regions, and there will be no need to migrate.
But if trade has a cost, the relative price will differ and migration will emerge. I forbid trade only to simplify the
analysis. The assumption is not crucial. The model applies so long as trade has a cost.
If migration were forbidden, the girls would get what they demand for free (remember the free
luxury theorem). But in the migration equilibrium, the equality of utility means demographic
imbalance. In the beginning, people in the surrounding villages do not move. They will stay put
until the population of the flower village decreases enough for the economy to move from $E'$ to
$E''$ (figure 9B). After $E''$, the continuous immigration will keep the flower village to the left of the
constant population curve. The source-sink pattern emerges again.⁶
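[Figure 9, panels (A) and (B). Axes: subsistence and luxury. Each panel shows the constant population
curve (CPC), an indifference curve (IC) and the production possibility frontier (PPF), with the
points E, E′ and E″ marked.]
Figure 9: Source-sink migration emerges out of difference in social preference.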
The craze for flowers will not last forever, because the immigrants who come from the places
that do not value flowers as much will dilute the flower culture.⁷ The arms race of conspicuous
consumption is constrained not by Malthusian forces, but by source-sink migration, and the selection
that follows. Selection dissipates luxury cultures and diffuses subsistence cultures.
4.2 Evidence of biased migration
Source-sink migration is best documented in the context of rural-urban migration, where the phe-
nomenon is sometimes called “urban natural decrease”: in pre-modern Europe, the urban death rate
was higher than the birth rate, and the natural decrease coincided with the natural increase in the
surrounding rural area.
De Vries (2006) decomposes the net changes of pre-modern European urban population into net
immigration and natural growth. As figure 10 shows, during most of the time between 1500 and
1800, urban population had been growing in both Northern and Mediterranean Europe. Despite
the net increase, urban population had been declining naturally, that is, the death rate was higher
than the birth rate in the cities. During the half century between 1600 and 1650, Northern Europe
had an annual growth of 0.32% in its urban population; meanwhile, the urban death rate exceeded
the birth rate by 0.33%. A flow of rural migrants that amounted to 0.65% of the size of urban
population per year had been replenishing the cities.
⁶ Here the migrants are assumed to keep their old preference. If they convert to new cultures, the diagram is
slightly different but the source-sink pattern still remains.
⁷ The migrants might be assimilated into the host culture to some extent, which slows the dilution.
[Figure 10, two panels: “The growth rate of urban population of Northern Europe, 1500-1890” and
“The growth rate of urban population of Mediterranean Europe, 1500-1800”. Each panel plots the net
growth rate and the natural growth rate (in percent) by half-century period; horizontal axis: year.]
Figure 10: The source-sink pattern of migration in pre-modern Europe. Data
source: De Vries (2006, p.203-208).
Malthusian theory is inconsistent with the migration pattern. The period 1800-1850 witnessed
a spike in the growth of urban population in Northern Europe. According to the classical theory,
the spike suggests a rise in urban living standards, which would cause a faster natural growth in
the urban population. In fact, the gap between the urban death rate and the urban birth rate only
widened in this period.
The anomaly can be easily explained by the biased migration model. After 1800, the growth of
manufacturing and commerce accelerated in the urban areas; the growth of agriculture accelerated
in the rural areas. The polarization of production structures spurred more migration into the cities
than in the previous centuries. The flood of immigrants lowered the average subsistence, including
hygiene and workplace safety, by so much that the natural growth rate of the urban population
dropped further. As proposition 2 later shows, the depth of the demographic sink increases with
the distance of production structures.
Another piece of evidence for biased migration comes from Ravenstein (1885), who calculates
the gap between a county’s population and the number of its natives, enumerated throughout Eng-
land and Wales in 1881. When there were more residents than natives, he regards the county as
one of absorption—more people moved in than moved out; otherwise, it was a county of dispersion.
Ravenstein also marks whether a county was “agricultural” or “industrial”. He calls a county “agri-
cultural” if the county’s proportion of agricultural population exceeded the sample average, and
“industrial” otherwise.
As figure 11 shows, the overwhelming majority of the non-agricultural counties were counties of
absorption, and the overwhelming majority of the agricultural counties were counties of dispersion.
The pattern is consistent with the hypothesis that people usually move from subsistence-rich regions
to luxury-rich regions.
[Figure 11, two histograms: non-agricultural counties and agricultural counties. Frequency is plotted
against demographic imbalance (%) = (population − natives) / population.]
Figure 11: The demographic imbalances of non-agricultural and agricultural
counties. Data source: Ravenstein (1885, p.185-186).
Source-sink migration is not limited to the rural-urban context. Production structure can differ
over a much larger scale, say, between south and north of a continent. The difference will trigger
biased migration too, but data are hardly available on such a large scale. That is why I focus on
the rural-urban context to demonstrate the source-sink migration.
4.3 The spread of ideas through migration
So long as ideas move with people, the bias of migration naturally gives rise to bias in the spread of
ideas. Because people usually move from subsistence-rich regions to luxury-rich regions, a subsis-
tence culture (technology) is easier to spread than a luxury culture (technology), other things being
equal.
Today, anyone with electricity and Internet can watch lectures taught by the best scholars in
the world. Technological spread and migration are largely disentangled from each other. But in
ancient times, when the supply of books was limited, migration was crucial to the spread of ideas.
It took thousands of years for agriculture to spread from the Fertile Crescent to Northern Europe,
and the process coincided with the spread of the original Neolithic groups’ genes in both timing
and spatial extent (Cavalli-Sforza, Menozzi, and Piazza, 1993). If the hunter-gatherers beyond the
frontiers had learned agriculture without the immigrants’ help, the reproductive advantage of the
first farmers would have been quickly lost, and the observed pattern of genetic spread would be
impossible. Likewise, the Indo-European family of languages originated in the Caucasian steppe,
where the early domestication of horses lent serious military advantage to the herders. If
horse-herding had spread without migration, the Caucasians would have found it hard to conquer the
neighbouring peoples who had learnt the new arts of war, and the proto-Indo-European language
would have had no chance to diffuse at all.
Even in societies with decent literacy levels, migration still played a crucial role in the spread of
ideas. In the 15th century, the learned Byzantine exiles who fled out of the falling Constantinople
revived Greek studies in Renaissance Italy. Considering the potential demand for Greek letters and
thoughts, it is extraordinary that the knowledge had not spread earlier to Italy by means other than
migration. Another example is the revolutionary impact of Jewish emigres from Nazi Germany on
U.S. science. Moser, Voena, and Waldinger (2014) estimated that the arrival of German Jewish
emigres brought a 31 percent increase in patenting by U.S. inventors in emigres’ research fields after
1933. Even today, complaints about “brain drain” are frequently heard. Again, people are concerned
about the spread of technology through migration.
It is fair to say that, at least in ancient times, migration was an important channel of spreading
ideas. The bias of migration means a bias in technological diffusion. The question is: can this bias
explain the Malthusian trap?
4.4 The group selection models
I build two group selection models to answer the above question. The first model studies a single
village that is surrounded by an infinite number of villages. I call the case “partial equilibrium”
because the model treats the population and technology of the surrounding villages as fixed. The
relatively simple result highlights the key mechanism at work.
The second model studies the general equilibrium of two villages, allowing them to fully interact
with each other. A threshold condition is derived to tell when selection dominates growth and
when growth overpowers selection. Both models assume away cultural selection by keeping $\beta$ fixed.
Focusing on technological selection alone, I can easily extend the results to cultural selection.
4.4.1 The partial equilibrium
Suppose there are an infinite number of villages as described in section 3.5. All villages begin with
the same technological levels and population sizes. As time elapses, bread and flower technologies
stagnate at $A'$ and $B'$ in all villages except one. Use an asterisk to denote that special village. Its
subsistence technology stagnates too, $A^* = A'$, but its luxury technology $B^*$ tends to grow at the
rate of $g$. We call it the flower village and the others the bread villages. When $B^*$ exceeds $B'$, there
will be continuous migration from the bread villages to the flower village.
Assumption 3. Trade is forbidden across the villages but migration is free of cost.⁸
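Free migration equalizes the level of utility, $U^* = U'$, which by equation 10 means
$$x^*\left(\frac{B^*}{A^*}\right)^{\beta}\left(\frac{\beta}{1-\beta}\right)^{\beta} = x'\left(\frac{B'}{A'}\right)^{\beta}\left(\frac{\beta}{1-\beta}\right)^{\beta},$$
where $x^*$ and $x'$ are the average consumption of bread. Rearrange the equation and take logarithms:
$$\ln x^* - \ln x' = -\beta\left(\ln\frac{B^*}{A^*} - \ln\frac{B'}{A'}\right). \tag{13}$$
⁸ See footnote 5 in section 4.1 for a justification of the assumption.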
The net emigration rate from the flower village, $m$, is equal to the natural growth rate of
population, $n$, which in turn depends on the average bread, $x^*$, that is,
$$m = n = \delta(\ln x^* - \ln \bar{x}). \tag{14}$$
$\bar{x}$ is the level of average bread that keeps population in natural balance. Since $\delta > 0$ and $x^* < \bar{x}$,
$m$ is negative: migrants move from the bread villages into the flower village. The emigration has a
negligible effect on each bread village, because their number is infinite and migration between them
is frictionless. So the bread villages still have $x' = \bar{x}$.
Denote $s^* \equiv \ln(B^*/A^*)$ and $s' \equiv \ln(B'/A')$, the relative luxury productivities. Substituting
$x' = \bar{x}$ and equation 13 into equation 14, we get
Proposition 2.
$$m = -\beta\delta(s^* - s') \tag{15}$$
The net emigration rate is proportional to the distance of production structures. Having a higher
relative luxury productivity than the neighbouring villages causes net immigration.
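The step from equations 13 and 14 to proposition 2 can be traced numerically. This is a minimal sketch under the assumptions of the partial equilibrium model; the function names and the numbers in the usage lines are illustrative and not taken from the paper.

```python
import math


def net_emigration_rate(B_star, A_star, B_prime, A_prime, beta, delta, x_bar):
    """Net emigration rate m of the flower village, built from equations 13 and 14."""
    x_prime = x_bar  # the bread villages stay in natural balance
    # Utility equalization (equation 13) pins down average bread in the flower village.
    ln_x_star = math.log(x_prime) - beta * (
        math.log(B_star / A_star) - math.log(B_prime / A_prime)
    )
    return delta * (ln_x_star - math.log(x_bar))


def proposition_2(s_star, s_prime, beta, delta):
    """Closed form of equation 15."""
    return -beta * delta * (s_star - s_prime)


# Assumed example: the flower village is twice as productive in luxury as its neighbours.
m = net_emigration_rate(B_star=2.0, A_star=1.0, B_prime=1.0, A_prime=1.0,
                        beta=0.5, delta=0.1, x_bar=1.0)
print(m, proposition_2(math.log(2.0), 0.0, 0.5, 0.1))  # both negative: a demographic sink
```

A negative $m$ means net immigration, and its size scales linearly with the gap $s^* - s'$.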
Migrants spread ideas. Assume that migration affects $B^*$ by displacing hosts’ technology with
immigrants’ technology in proportion to the number of immigrants:
Assumption 4. From time $t$ to $t + \Delta t$, $B^*$ updates by taking the weighted geometric average of $B^*$
and $B'$ and growing at the rate of $g$. Because $m$ is negative for the flower village, the weight on the
immigrants’ technology is the immigration share $-m\Delta t$:
$$B^*(t + \Delta t) = B^*(t)^{1 + m\Delta t}\,(B')^{-m\Delta t}\,(1 + g\Delta t) \tag{16}$$
Divide both sides of equation 16 by $A'$, take logarithms, and calculate the limit as $\Delta t \to 0$. We
can rewrite the equation into the motion function of $s^*$:
$$\dot{s}^* = m(s^* - s') + g \tag{17}$$
Substitute equation 15 into equation 17:
$$\dot{s}^* = -\delta\beta(s^* - s')^2 + g \tag{18}$$
The differential equation has a stable equilibrium:
Proposition 3. In the long run, even if $B^*$ has an intrinsic tendency to grow at the constant rate
$g$, the flower village’s relative productivity, $s^* = \ln(B^*/A^*)$, will stabilize at
$$s' + \sqrt{\frac{g}{\delta\beta}}.$$
Note that $g$ has a level effect but no growth effect on the equilibrium level of $s^*$.
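The stability claim can be illustrated by integrating equation 18 forward in time. The sketch below uses a plain forward-Euler scheme; the step size, horizon and parameter values are assumptions chosen for illustration, and the rest point it is compared against is the value at which the right-hand side of equation 18 vanishes.

```python
import math


def simulate_relative_productivity(g, beta, delta, s_prime=0.0, s0=None,
                                   dt=0.1, steps=50_000):
    """Forward-Euler integration of ds*/dt = g - delta*beta*(s* - s')**2 (equation 18)."""
    s = s_prime if s0 is None else s0
    for _ in range(steps):
        s += (g - delta * beta * (s - s_prime) ** 2) * dt
    return s


g, beta, delta = 0.01, 0.5, 0.1               # assumed illustrative values
rest_point = math.sqrt(g / (delta * beta))    # zero of equation 18 when s' = 0
from_below = simulate_relative_productivity(g, beta, delta)          # starts at s* = s'
from_above = simulate_relative_productivity(g, beta, delta, s0=2.0)  # starts far above
print(from_below, from_above, rest_point)
```

Both trajectories settle at the same level, so a higher $g$ shifts the level of $s^*$ but does not produce sustained growth in it.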
4.4.2 The general equilibrium
The partial equilibrium model assumes away the influence of the flower village on its neighbours.
In this section, we study a two-village model to take into account the general equilibrium effect.
Suppose village 1 and village 2 start identical. Their bread technologies, $A_1$ and $A_2$, grow at the
same constant rate $g_A$, and their flower technologies, $B_1$ and $B_2$, drift with noises:
$$d\ln B_i = (g_A + g)\,dt + \sigma\,dz_i \tag{19}$$
Here $g > 0$ captures the growth advantage of flower productivity over bread productivity. The
error terms $z_i$ ($i = 1, 2$) are Brownian motions. $z_1$ and $z_2$ are independent of each other, and
$\mathrm{Var}(\sigma\,dz) = \sigma^2\,dt$. I introduce the stochastic growth of technology as the source of inter-regional
variation. Variation is the basis of technological selection as mutation is the basis of natural selection.
I fix the growth rates of $A_1$ and $A_2$ to keep population equal between the regions. The equality of
population makes the model tractable without loss of generality. With $s_i \equiv \ln(B_i/A_i)$, equation 19
can be rewritten as
$$ds_i = g\,dt + \sigma\,dz_i. \tag{20}$$
That $g > 0$ allows both villages, if isolated, to grow steadily in living standards. However,
selection cancels out growth by adding a “drag” term to the motion of $s_i$. The drag appears when
$s_1 \neq s_2$. Following assumption 4, the drag term is a quadratic of the difference between $s_1$ and $s_2$
as in equation 18:
$$ds_i = \left[g - I_{\{s_i > s_j\}}\,\beta\delta(s_i - s_j)^2\right]dt + \sigma\,dz_i.$$
Here $I_{\{s_i > s_j\}}$ is an indicator function that equals 1 if $s_i > s_j$ and 0 otherwise. If $s_1 > s_2$, village
1 is relatively rich in flowers. It attracts immigration from village 2, which drags $s_1$ closer to $s_2$. If
instead $s_1 < s_2$, village 1 is relatively rich in bread. The bread village receives no immigration and
selection will not affect its relative luxury productivity.
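Since utility depends on $s$, the most interesting variables are the global average of the $s_i$’s,
$\mu = \frac{1}{2}(s_1 + s_2)$, and the inter-regional variation, $\nu = \frac{1}{2}(s_1 - s_2)^2$.⁹
Applying Itō’s lemma, we get
$$d\mu = (g - \beta\delta\nu)\,dt + \frac{\sqrt{2}}{2}\,\sigma\,dz \tag{21}$$
$$d\nu = \left(\sigma^2 - 2\sqrt{2}\,\beta\delta\,\nu^{3/2}\right)dt + 2\sqrt{\nu}\,\sigma\,dz \tag{22}$$
where $z$ is a Brownian motion.
⁹ $\nu$ is the sample variance: $\nu \equiv \frac{1}{2}(s_1 - s_2)^2 = [s_1 - \frac{1}{2}(s_1 + s_2)]^2 + [s_2 - \frac{1}{2}(s_1 + s_2)]^2$.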
Taking long-term expectation of both sides of equation 21, we have
$$\mathop{\mathrm{E}}_{t\to+\infty}\left(\frac{d\mu}{dt}\right) = g - \beta\delta\mathop{\mathrm{E}}_{t\to+\infty}(\nu) \tag{23}$$
Denote $S \equiv \beta\delta\,\mathrm{E}_{t\to+\infty}(\nu)$. $S$ is the force of selection. $(g - S)$ captures the race between growth
and selection. If $g > S$, growth overcomes selection and living standards grow; otherwise, selection
dominates growth.
Appendix A.2 proves that the variation $\nu$ will always converge to a finite value, and
$$S = \frac{k}{2}(\beta\delta)^{1/3}(\sigma^2)^{2/3}, \tag{24}$$
where $k \equiv \left[3^{1/3}\,\Gamma\!\left(\tfrac{4}{3}\right)\right]^{-1} \approx 0.78$ is a constant. Comparing $g$ and $S$, we get the threshold
condition:
Proposition 4. Growth overcomes selection, if
$$g > \frac{k}{2}(\beta\delta)^{1/3}(\sigma^2)^{2/3}. \tag{25}$$
Otherwise, selection dominates growth.
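The closed form for the force of selection can be compared with a direct simulation of the two-village system. The sketch below is an illustration, not the procedure of appendix A.2: it discretizes the drag-augmented motion of $s_1$ and $s_2$ with an Euler-Maruyama scheme and time-averages $\nu$. The step size, horizon, seed and the parameter values in the usage lines are all assumptions, chosen so that the process mixes quickly rather than for realism.

```python
import math
import random


def selection_force_closed_form(beta, delta, sigma):
    """Equation 24, with k = 1 / (3**(1/3) * Gamma(4/3)), roughly 0.78."""
    k = 1.0 / (3.0 ** (1.0 / 3.0) * math.gamma(4.0 / 3.0))
    return 0.5 * k * (beta * delta) ** (1.0 / 3.0) * (sigma ** 2) ** (2.0 / 3.0)


def selection_force_simulated(beta, delta, sigma, g=0.0, dt=0.01,
                              steps=400_000, seed=0):
    """Euler-Maruyama simulation of s_1 and s_2; returns beta*delta*mean(nu)."""
    rng = random.Random(seed)
    s1 = s2 = 0.0
    nu_sum = 0.0
    noise_scale = sigma * math.sqrt(dt)
    for _ in range(steps):
        gap = s1 - s2
        nu_sum += 0.5 * gap * gap
        drag = beta * delta * gap * gap
        # Only the village that is relatively richer in luxury is dragged back.
        drift1 = g - (drag if gap > 0 else 0.0)
        drift2 = g - (drag if gap < 0 else 0.0)
        s1 += drift1 * dt + noise_scale * rng.gauss(0.0, 1.0)
        s2 += drift2 * dt + noise_scale * rng.gauss(0.0, 1.0)
    return beta * delta * nu_sum / steps


if __name__ == "__main__":
    # Assumed values: beta*delta = 1 and sigma = 1 keep the run short.
    print(selection_force_closed_form(0.5, 2.0, 1.0))
    print(selection_force_simulated(0.5, 2.0, 1.0))
```

The simulated value is a noisy time average, so it should be expected to match the closed form only to within a few percent at this run length.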
When $g < S$, $\mathrm{E}_{t\to+\infty}(d\mu/dt) < 0$. The possibility that $\mu$ could decline seems to contradict
Malthusian stagnation. But in the real world, the decline of $\mu$ has a natural limit. Luxury consumption
cannot decrease further when it reaches zero. In this sense, the case of $g < S$ already explains
why the average luxury consumption was almost nil in the ancient times. That said, if one feels
uncomfortable with zero utility under Cobb-Douglas utility function, a simple remedy is to assume,
not unreasonably, that luxury productivity growth accelerates if average luxury is close to zero
(demand is huge when luxury is rare). Then the equilibrium will have a positive amount of average
luxury. The later simulations will use the method.
As proposition 4 indicates, two sets of variables determine how strong selection is. The first is
the variance of technological growth $\sigma^2$, which provides the necessary heterogeneity for selection to
work on. The second is the product of two exogenous variables, $\beta\delta$. Denote $\lambda \equiv \beta\delta$, and call it
the intensity of selection. In a richer setting, $\lambda$ would further incorporate the migrants’ willingness
to move and the hosts’ susceptibility to migrants’ influence. A little calibration can help gauge the
relative strength of selection. If $\beta = 0.5$, $\delta = 0.1$ and $\sigma = 0.02$, the threshold $\hat{g}$ is about 0.078%. In actual
history, the world population grew at roughly 0.1% per year between the Agricultural
and Industrial Revolutions: $g_H \approx 0.1\%$. Since appendix A.1 proves that $g_A$ converges to $(1 - \gamma)g_H$,
if $\gamma = 0.5$, $g_A$ is roughly 0.05%. Therefore, if $g_B \le g_A + \hat{g} = 0.128\%$, or less than about 2.5 times
the level of $g_A$, the global average living standards will not have an upward trend of growth.
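The calibration can be reproduced by evaluating the right-hand side of equation 25 directly. In this sketch the function name is a hypothetical label, and the time unit of $\delta$ and $\sigma$ is assumed to be one year; the three parameter values are the ones stated in the text.

```python
import math


def selection_threshold(beta, delta, sigma):
    """Right-hand side of equation 25: (k/2) * (beta*delta)**(1/3) * (sigma**2)**(2/3)."""
    k = 1.0 / (3.0 ** (1.0 / 3.0) * math.gamma(4.0 / 3.0))  # roughly 0.78
    return 0.5 * k * (beta * delta) ** (1.0 / 3.0) * (sigma ** 2) ** (2.0 / 3.0)


g_hat = selection_threshold(beta=0.5, delta=0.1, sigma=0.02)
print(f"{100 * g_hat:.3f}%")  # 0.078%
```

Because the threshold scales with $\sigma^{4/3}$ but only with $(\beta\delta)^{1/3}$, it is considerably more sensitive to the dispersion of technology shocks than to the intensity of selection.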
It is worth noting that assumption 4 allows technology to regress when “barbarians” invade.
Though largely neglected in the growth literature, there were quite a few well-documented instances
of technological regress in history. Some of them were apparently caused by invasions from less
civilized groups. One example, which Aiyar, Dalgaard, and Moav (2008) provide in their study
of technological regress, is a mortar called ‘Pozzolana cement’ that the Romans used to construct
large and durable structures such as baths, pantheons, and aqueducts. The technology was lost
after the fall of the Roman Empire and was only relearned in the early 13th century. As the case
of Pozzolana cement illustrates, luxury technologies are particularly vulnerable to social disorder
following “barbarian invasions”. Assumption 4 does capture part of reality.
However, there is no reason to expect peaceful immigration to bring about serious technological
regress. While having lasting impacts on the host society, migrants tend to be gradually assimilated,
and the mutual learning between natives and migrants often pushes forward the technological
frontiers of both groups. Assumption 4 might characterize the consequences of invasions but it overstates the power
of selection in the case of peaceful migrations.
The question is: is assumption 4 crucial? In the next section, I will relax the assumption,
allowing natives and migrants to learn from each other with no chance of technological regress.
Even then, the group selection theory still holds. Selection is so strong that even if wars are ruled
out, peaceful migration alone is enough to suppress the upward trend of living standards growth.
For the algebraic model, however, I keep the assumption, because the model would otherwise be intractable. As
shown above, the assumption allows me to analytically derive the condition under which Malthusian
stagnation arises. Qualitatively, the condition applies to peaceful migrations as well.
4.5 Simulations
The general equilibrium model has three limitations. First, the two-village setup may fail to cap-
ture the real-world intense competition among hundreds of countries. Second, the model assumes
migration is free of cost and occurs instantly, so there is no difference of utility across the regions.
Third, the model assumes immigrants’ technologies to be able to displace natives’ technologies,
no matter whose technologies are better. As discussed above, the assumption is unreasonable for
peaceful migrations.
To ensure robustness, I relax all of the three assumptions in this section. The algebraic models
being no longer tractable, I turn to computer simulations. The simulations have a hundred regions
instead of two. Migration is gradual and its speed increases with the utility gap across regions.
Moreover, besides the baseline case where technologies are indiscriminately substituted, I also study
the case where learning occurs only if the immigrants have a better technology (peaceful migrations
with no wars).
4.5.1 The baseline simulation
Imagine a world composed of 100 sites, arrayed on a 10 ×10 grid. Each grid represents a re-
gion that has the same population dynamics and the same production and utility functions as the
baseline model specifies. Time is discrete. At period 𝑡, the state of grid (𝑖, 𝑗) is characterized by
{𝐴𝑖𝑗𝑡, 𝐵𝑖𝑗𝑡, 𝐻𝑖𝑗𝑡}: the subsistence technology, the luxury technology and the population size.
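The state space just described can be sketched as a small data structure. The block below is only an illustration, not the author's code: the names `Region`, `make_world` and `neighbours` are hypothetical, the initial values are placeholders, and it assumes that "neighbouring" means the grids that share an edge.

```python
from dataclasses import dataclass

@dataclass
class Region:
    A: float  # subsistence technology
    B: float  # luxury technology
    H: float  # population size

def make_world(size=10, A0=1.0, B0=1.0, H0=1.0):
    """Build a size x size world; the initial values are placeholders."""
    return [[Region(A0, B0, H0) for _ in range(size)] for _ in range(size)]

def neighbours(i, j, size=10):
    """Grids sharing an edge with (i, j); an assumed definition of adjacency."""
    candidates = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < size and 0 <= c < size]

world = make_world()
print(sum(len(row) for row in world))  # number of regions in the world
print(len(neighbours(0, 0)), len(neighbours(0, 5)), len(neighbours(5, 5)))
```

Under this assumed adjacency, a corner grid has two neighbours, a side grid three and an interior grid four.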
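Assume 𝐴𝑖𝑗 and 𝐵𝑖𝑗 evolve in the following way:
$$A_{ij}(t+1) = A_{ij}(t)\,\bigl(1 + g_{A,ij} + \sigma_A \epsilon_{A,ij}\bigr) + \text{selection effect} \qquad (26)$$
$$B_{ij}(t+1) = B_{ij}(t)\,\bigl(1 + g_{B,ij} + \sigma_B \epsilon_{B,ij}\bigr) + \text{selection effect} \qquad (27)$$
The error terms 𝜖𝐴 and 𝜖𝐵 have normal distributions, 𝜖𝐴, 𝜖𝐵 ∼ 𝑁(0, 1), i.i.d. 𝑔𝐴𝑖𝑗 is the same
across all grids: 𝑔𝐴𝑖𝑗 = 𝑔𝐴, but 𝑔𝐵𝑖𝑗 increases with the relative rarity of luxury:
$$g_{B,ij} = g_B \left[ 1 + \left( \frac{B_{ij}}{A_{ij}} \right)^{\alpha} \right]. \qquad (28)$$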
The additional term in the bracket is meant to prevent the downward trend. 𝛼 is arbitrarily set to
be a large negative number (𝛼 = −10) to minimize its impact when 𝐵𝑖𝑗/𝐴𝑖𝑗 > 1. Though appearing
ad hoc, adding the term increases the growth rate of luxury, which is unfavourable to my hypothesis
and only makes the theory even more robust.
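To make the role of the bracket term concrete, here is a minimal sketch of one period of equations (26) to (28) with the selection effect left out. It assumes the bracket in equation (28) reads 1 + (𝐵/𝐴)^𝛼; the function names and the parameter values in the usage lines are illustrative, not the author's code.

```python
import random

ALPHA = -10.0  # value of alpha stated in the text

def luxury_growth_rate(g_B, A, B, alpha=ALPHA):
    """Equation (28) as read here: g_Bij = g_B * (1 + (B / A) ** alpha)."""
    return g_B * (1.0 + (B / A) ** alpha)

def step_technology(A, B, g_A, g_B, sigma_A, sigma_B, rng):
    """One period of equations (26)-(27), omitting the selection effect."""
    eps_A = rng.gauss(0.0, 1.0)  # standard normal shock to subsistence growth
    eps_B = rng.gauss(0.0, 1.0)  # standard normal shock to luxury growth
    A_next = A * (1.0 + g_A + sigma_A * eps_A)
    B_next = B * (1.0 + luxury_growth_rate(g_B, A, B) + sigma_B * eps_B)
    return A_next, B_next

# When luxury is abundant (B/A = 2) the extra term is negligible;
# when luxury is as scarce as subsistence (B/A = 1) the rate doubles.
print(luxury_growth_rate(0.01, A=1.0, B=2.0))
print(luxury_growth_rate(0.01, A=1.0, B=1.0))
print(step_technology(1.0, 2.0, 0.005, 0.01, 0.05, 0.05, random.Random(0)))
```

With 𝛼 = −10 the correction is about 0.1% of 𝑔𝐵 at 𝐵/𝐴 = 2, so it matters only when luxury is rare.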
At each period, residents of each grid decide whether they should move to a neighbouring grid
for higher living standards. For two grids next to each other, if grid 1 has a higher utility than
grid 2, some residents of grid 2 will move to grid 1, and the migration rate is proportional to the
difference of utility:
$$\frac{\text{Migrants}}{\text{Population of the Origin}} = \theta\,(\ln U_1 - \ln U_2) \qquad (29)$$
Unlike the previous model, here 𝜃 is finite.
I simulate two scenarios. In the first scenario, barbarian invasions and technological regress are
allowed. The immigrants’ technologies are assumed to displace the natives’ technologies, no matter
whose technologies are better. I call the scenario “indiscriminate substitution”.
In the second scenario, people only learn from those who do better: if the immigrants are better
at producing things, the natives will update their technology in the same way as “indiscriminate
substitution”; but if the immigrants’ technologies are inferior, the natives will keep their old ways to
produce, and the immigrants will convert to the natives’ technologies. I call the scenario “selective
learning”. There is no chance of technological regress under “selective learning”.
The force of selection is weaker under selective learning, yet it still favors the spread of subsistence
technologies. To see this, suppose there are two identical regions. If one has a positive shock in
subsistence productivity, people will emigrate to the other region to spread the improved subsistence
technology. But if it is the luxury technology that has improved, no emigration will occur and the
luxury technology has to remain local (contrast it with the indiscriminate substitution scenario
where selection happens either way).
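The difference between the two scenarios can be sketched as a single update rule for the host region. The mixing rule below is an assumption made for illustration (the host's technology moves toward the migrants' technology in proportion to the migrants' share of the post-migration population); the text here does not spell out the exact rule, and the function name is hypothetical.

```python
def update_host_technology(host_tech, migrant_tech, migrant_share, selective):
    """Return the host region's technology level after migrants arrive.

    Assumption (not from the paper): the host level moves toward the
    migrants' level in proportion to the migrants' population share.
    """
    if selective and migrant_tech <= host_tech:
        # Selective learning: inferior technology is ignored and the
        # migrants convert to the natives' technology, so nothing changes.
        return host_tech
    # Indiscriminate substitution, or selective learning from better migrants.
    return host_tech + migrant_share * (migrant_tech - host_tech)

# Inferior migrants: regress under substitution, no change under learning.
print(update_host_technology(10.0, 5.0, 0.2, selective=False))
print(update_host_technology(10.0, 5.0, 0.2, selective=True))
# Superior migrants: both scenarios give the same update.
print(update_host_technology(10.0, 20.0, 0.2, selective=True))
```

Only the first call produces technological regress, which is the feature that selective learning rules out.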
I simulate both scenarios. It turns out that, over a vast range of parameters, there is no upward
trend of growth under either scenario. Selection is dominant over growth. To save computer time,
I treat each simulated period as a decade. As table 3 in appendix C summarizes, I parameterize
𝑔𝐴 = 0.5% and 𝑔𝐵 = 1% (per decade) for the baseline cases. The size of 𝑔𝐴 guarantees a growth rate
of population close to historical rates. 𝑔𝐵 is twice as big as 𝑔𝐴, promising a strong tendency of luxury
growth. The (subsistence) income elasticity of population, 𝛿 = 0.2, matches the estimation from the
English demographic and price data. The other crucial parameters include the standard deviation
of the growth errors, 𝜎 = 5%, and the migration propensity, 𝜃 = 0.1. To a first approximation,
𝜃 = 0.1 means that if there is an opportunity to move to a neighbouring place that is twice as rich, only
1% of people will take the opportunity in a typical year.
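The reading of 𝜃 = 0.1 follows from equation (29). The sketch below evaluates the migration share for a neighbour that is twice as rich; the function name is hypothetical, the floor at zero (no flow toward the poorer grid) is an assumption, and dividing the per-decade share by ten is a simple linear approximation of the yearly rate.

```python
import math

def migration_share(theta, U_dest, U_origin):
    """Equation (29): share of the origin's population that moves in one period."""
    gap = math.log(U_dest) - math.log(U_origin)
    return max(0.0, theta * gap)  # assumption: nobody moves toward the poorer grid

per_period = migration_share(0.1, U_dest=2.0, U_origin=1.0)  # one period = one decade
per_year = per_period / 10.0                                 # linear approximation

print(f"per decade: {per_period:.2%}, per year: {per_year:.2%}")
```

The share is 0.1 × ln 2, a little under 7% per decade, or somewhat below 1% per year, the same order of magnitude as the first-approximation figure in the text.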
Figure 12(A) presents the key result of the simulations. It compares the global average utility,
weighted by regional population,¹⁰ both with and without the indiscriminate-substitution type
of selection. The Malthusian assumption alone fails to deliver the Malthusian result. Over ten
thousand years, the global average utility increases about tenfold in the absence of migration (the result
is the same if migration is allowed but migrants are assumed to carry no technologies). But when
the knowledge-spreading migration is introduced, the trend is gone.
The absence of growth trend under selection is not a result of technological stagnation. Rather,
technologies grow faster when migrants are allowed to carry them around. This is how the “Malthu-
sian + migration” case achieves a faster population growth than the purely Malthusian case in
figure 12(B).
[Figure 12, two panels, both plotted against year (0 to 10,000). (A) The global average utility (weighted by regional population). (B) The total population (logarithmic scale). Each panel compares the “Malthusian” path with the “Malthusian + migration” path.]
Figure 12: A purely Malthusian simulation does not produce the stagnation
of living standards. To ensure stagnation, the Malthusian mechanism has to
be combined with biased migration.
Moreover, the distribution of regional utility under biased migration is stationary (figure 18 in
appendix B). The richest region’s utility never exceeds twice the poorest region’s utility. Selection
keeps all regions interlocked. In contrast, if there is only Malthusian force but no selection, the
variation is enormous and ever more divergent over time. Figure 17 in appendix B further traces
the utility of three representative regions: one at the corner of the world, one on the side, and one
in the middle. Despite cycles spanning thousands of years, there is no trend of growth in any single
region.
¹⁰ The weighted global average utility is the “true” average that assigns equal weight to each person of the world.
If I drop the weighting and use the average of regional utility instead, the path of global utility will only be more
stable, as utility is negatively correlated with population. I will stick to the weighted average, which is unfavourable
to my hypothesis, throughout all simulations.
Figure 13 shows the result under the selective learning scenario. Selection, weakened as it is,
still dominates growth. The global average utility climbs up slowly before it stabilizes at a plateau.
In the long run, there is no upward trend of growth either.
[Figure 13: the global average utility (weighted by regional population) against year (0 to 30,000), under indiscriminate substitution and under selective learning.]
Figure 13: If learning is selective, the global average utility will stabilize at
a higher level than if technology is indiscriminately substituted.
4.5.2 Robustness checks
In this section, I show that the dominance of selection over growth is robust to variation in three
sets of parameters, including (a) the standard deviation of growth errors, 𝜎, (b) the side length of
the simulated world, 𝑤 and 𝑙, and (c) the migration propensity, 𝜃.
First, I vary 𝜎 from 0% to 15% with each step equal to 1%, and 𝑔𝐵 from 0% to 2% with each
step equal to 0.1%, keeping all the other parameters the same as in the baseline case. For each
pair of 𝜎 and 𝑔𝐵, I run the simulations five times. I adopt a stringent criterion of stagnation. If
the global average utility grows more than 25% from the 300th period to the 600th period (a
span of 3000 years), I treat it as a trend of growth, and if more than one of the five simulations
shows such a trend, I mark the pair of parameters as “progressive”; otherwise, “stagnant”.
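The classification rule can be written down directly. The following is a minimal sketch with hypothetical function names; it assumes each simulated path is a list in which index 𝑡 holds the global average utility at period 𝑡, and the two test paths are made up for illustration rather than taken from the simulations.

```python
def has_growth_trend(path, start=300, end=600, threshold=0.25):
    """True if utility grows by more than `threshold` between the two periods."""
    return path[end] / path[start] - 1.0 > threshold

def classify(paths):
    """Label one (sigma, g_B) pair from its repeated simulation runs."""
    trending = sum(1 for p in paths if has_growth_trend(p))
    return "progressive" if trending > 1 else "stagnant"

# Parameter grid described in the text: 16 values of sigma, 21 values of g_B.
sigmas = [s / 100 for s in range(0, 16)]   # 0% to 15% in steps of 1%
g_Bs = [g / 1000 for g in range(0, 21)]    # 0% to 2% in steps of 0.1%
pairs = [(s, g) for s in sigmas for g in g_Bs]

# Made-up paths: one flat, one growing 0.1% per period.
flat = [1.0] * 601
rising = [1.001 ** t for t in range(601)]

print(len(pairs))
print(classify([flat] * 4 + [rising]))      # a single trending run is tolerated
print(classify([flat] * 3 + [rising] * 2))  # two trending runs are not
```

With five runs per pair, the grid implies 336 parameter pairs and 1,680 runs for each scenario of knowledge spread.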
I conduct the robustness check for both scenarios of knowledge spread. The result is figure 19
in the appendix. Selection gets stronger with a larger 𝜎: under indiscriminate substitution, when
𝜎 = 3%, selection dominates if 𝑔𝐵 − 𝑔𝐴 ≤ 0.6%; when 𝜎 = 5%, selection dominates if 𝑔𝐵 − 𝑔𝐴 ≤ 1%.
As expected, selection is weaker under selective learning, but there is a caveat. A simulated history
treated as progressive does not necessarily have an upward trend in the long run. The path of
selective learning in figure 13 keeps rising until it stabilizes at about the 20,000th year. Applying the
above criterion, I would treat the history as progressive, but it actually has no trend in the long run.
To verify that the force of selection is robust to various sizes of the square world, I experiment
with every integer value of side length from 3 to 20, running five simulations for each. Figure 20
in appendix B shows the cumulative growth from the 300th period to the 600th period of these
experiments. The variation is bigger when the world is smaller, for the results are then more likely
to be driven by the idiosyncrasies of individual grids. Nevertheless, there is hardly any difference
between a world of 100 grids and a world of 400 grids. The baseline simulation, which assumes a
10 ×10 world, is representative in this respect.
The results are also robust to variation in 𝜃. To verify this, I run 10 experiments under each
scenario for each value of 𝜃 from 0 to 0.2 with the step equal to 0.01. The power of small 𝜃’s is
extraordinary. The baseline simulation assumes 𝜃 = 0.1. It is already a conservative estimate of
people’s willingness to move. But as figure 14 shows, even if 𝜃 is as small as 0.01 (only 0.1% of
people would move each year to a neighbouring region that is twice as rich), selection still dominates.
This by no means suggests that migration is unimportant. If 𝜃 = 0 (the pure Malthusian case), the
cumulative growth is far larger than if 𝜃 = 0.01. Growth plummets as 𝜃 deviates slightly from
zero. A tiny bit of migration is strong enough to dominate a strong tendency of growth. I leave the
explanation to section 5.4, where I discuss the implications of the group selection model
for the Industrial Revolution.
[Figure 14: cumulative growth over 3000 years (vertical axis, -30% to 120%) against 𝜃, the migration propensity parameter (horizontal axis, 0.00 to 0.20), for indiscriminate substitution and for selective learning.]
Figure 14: The cumulative growth of global average utility over 3000 years.
Notes: each point represents an experiment at the corresponding level of mi-
gration propensity, 𝜃.
4.5.3 Pointwise balance
Selection ensures the equality of long-run average growth rates between the sectors. But the mere
equality is not sufficient for the stagnation of living standards. World population growth had
changed speed several times (figure 15). Behind each change is the acceleration of subsistence
technology growth. At these moments, for living standards to keep constant, the progress of luxury
technology must accelerate to exactly the same speed—a pointwise balance.
[Figure 15, two panels. (A) World population (million, logarithmic scale) from 10,000 BC to 2,000 AD, with upper and lower estimates. (B) Global average luxury and subsistence technologies (log) against year; luxury technology in grey, subsistence technology in black.]
Figure 15: (A) The world historical population estimates. Data source: US
census bureau. (B) When subsistence technology growth accelerates, luxury
technology growth will accelerate to the same speed.
To test the pointwise balance hypothesis, I fix 𝑔𝐵 at 1%, and have 𝑔𝐴 jump from 0.25% to 0.75%
at the 1001st period of the simulation. If the pointwise balance exists, the global luxury technology
growth will speed up to the same rate as the subsistence technology growth immediately after the
juncture. This is exactly what happens in the simulation, as figure 15 shows. I further conduct a
Chow test:
$$\Delta \log(\text{luxury technology}) = \underset{(1 \times 10^{-3})}{5 \times 10^{-3}} + \underset{(0.6 \times 10^{-3})}{10 \times 10^{-3}} \times \text{break dummy}_{t=1001} + \epsilon \qquad (30)$$
With a p-value as low as 10⁻⁶, the test rejects the null hypothesis that there is no kink in luxury
technology growth at the 1001st decade. The estimated coefficient of the break dummy, 10𝑒−3, is
exactly twice as large as the constant term, 5𝑒−3. It means that when the growth rate of subsistence
productivity triples from 0.25% to 0.75%, the growth rate of luxury technology triples from 0.25%
to 0.75% too. Although 𝑔𝐵 is fixed, luxury growth catches up fast and fully. Selection ensures
balanced growth not only trend-wise but also point-wise.
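The reading of the break regression can be illustrated with a short sketch. A regression on a constant and a single break dummy has a simple closed form: the constant is the pre-break mean and the dummy coefficient is the change in mean. The series below is made up and noise-free, chosen only to reproduce the two coefficients reported in equation (30); it is not the paper's simulation output, and the function name is hypothetical.

```python
from statistics import mean

def break_regression(dlog, break_index):
    """OLS of dlog on a constant and a post-break dummy.

    With one dummy regressor, the OLS constant equals the pre-break mean
    and the dummy coefficient equals the post-break mean minus it.
    """
    pre = mean(dlog[:break_index])
    post = mean(dlog[break_index:])
    return pre, post - pre

# Illustrative series: growth of 5e-3 before the break, 15e-3 after it.
series = [0.005] * 1000 + [0.015] * 1000
const, dummy = break_regression(series, 1000)

post_break_growth = const + dummy
print(const, dummy, post_break_growth / const)
```

A dummy coefficient twice the size of the constant therefore means that the post-break growth rate is three times the pre-break rate.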
5 Rethinking major events of economic history
The combination of the two-sector model and the group selection model paints a new picture of
economic history. In what follows, I will discuss the implications of the theory for four issues,
namely, the Agricultural Revolution, the ancient market economies, the welfare consequences of
wars and migrations, and the Industrial Revolution.
5.1 Why farm?
The early farmers were worse off than their hunter-gatherer ancestors. They had less leisure, worse
nutrition and greater inequality between sexes and across castes. The paradox of immiserizing
growth can be explained by the fact that agriculture is a subsistence technology. By tilting the
production structure toward subsistence, it caused living standards to decline in the long run. Yet,
if agriculture was so bad, “Why [did people] farm? Why work harder, for food less nutritious and
a supply more capricious? Why invite famine, plague, pestilence and crowded living conditions
(Harlan, 1975)?”
Farmers are faced with a prisoner’s dilemma. People choosing what is best for themselves are
hardly concerned with the prospect of the whole group’s misery. That the farmers as a group
would end up worse off would not bother one who saw agriculture as the dominant strategy to
maximize her own chance of survival and reproduction. But even if there was a group of altruistic
visionaries who coordinated to keep the hunting-gathering lifestyle, the group could not compete
with one that had switched to agriculture. The latter was relatively richer in subsistence. The
higher density of population, the consequent impoverished life, and the greed for new lands would
drive the agricultural group to invade the hunting-gathering group instead of the other way around.
Selection would wipe away whoever refused to farm (Cavalli-Sforza, Menozzi, and Piazza, 1993).
5.2 The rise and fall of the wealth of nations
First published in 1776, The Wealth of Nations declared the birth of modern economics. However,
Gregory Clark (2008, chap.2, pg.35) commented, “[I]n 1776, when the Malthusian economy still
governed human welfare in England, the calls of Adam Smith for restraint in government taxation
and unproductive expenditure were largely pointless [... while] those scourges of failed modern
states—war, violence, disorder, harvest failures, collapsed public infrastructures, bad sanitation—
were the friends of mankind before 1800.”
Provocative as he sounded, Gregory Clark was only making explicit the natural conclusion of
the classical Malthusian theory. Without the two-sector model, no matter how uncomfortable one
may feel about the remark, there is no way to refute it.
In fact, Smith is right, though in a way he was never aware of. The policies he suggested can
improve living standards, not only in the short run but also in the long run, not only in Solow’s
time but also in his own time, and much earlier times as well. Laissez-faire, light taxation and the
division of labor, if applied to economic policies, raise productivity in all sectors, but manufacturing
and commerce benefit more than agriculture. The rise of the ratio of luxury to subsistence leads to
higher equilibrium living standards.
This explains why the average Romans and Song Chinese were richer than the other peoples in
history. According to the estimates of Lo Cascio and Malanima (2009), the per capita GDP of Roman
Italy reached $1400 in US 1990 dollars in 150 AD, and the per capita GDP of the whole Roman
Empire was as high as $1000. Among the many available estimates, Temin (2013) regards this
set of numbers as closest to reality. To put the estimates into perspective, consider that Maddison
(2003) estimated the per capita GDP of most ancient societies at slightly above or around $450.
$1400 per capita is what the Netherlands achieved as late as 1700. The reason why the Romans were
rich is very similar to the reason why people living in modern developed countries are rich. As Temin
(2013) shows, Rome had a functioning legal system, an active financial market, and a broad market
network. The security of property rights stimulated investment; the scale of the market facilitated
labor division; and standardised mass production improved the quality of consumer goods to a high
level. All these were meaningless in the old Malthusian view of history, but, in light of the new
theory, they were as crucial to ancient living standards as they are to modern life.
By contrast, the “friends of mankind” (wars, violence, disorder, collapsed infrastructures)
often destroy more luxury than subsistence and decrease living standards in the long run.
5.3 A brief history of the long war against luxury
Fatal clashes on the group level have been a persistent human condition since primitive society. Of
the fourteen groups studied in Mae Enga, a modern hunter-gatherer society in Papua New Guinea,
five went extinct in tribal clashes over a 50-year period. In place of the extinct groups, new groups
formed out of the old groups that survived and expanded (Soltis, Boyd, and Richerson, 1995). A
group that spent too much on luxuries would hardly survive.
The domestication of animals and plants divided the world into nomadic zones and arable zones.
Until the mass use of gunpowder, clashes between the two had disrupted growth over and over again.
The three pre-modern peaks in the social development index of Ian Morris (2011) all ended in “barbarian”
invasions. The sea peoples raided Anatolia, the Levant and Egypt; the Huns and Goths ruined the
Western Roman Empire; the Jurchens and Mongols conquered the Song Dynasty of China. A brief
review of the three events can help us appreciate the crucial role migration plays in suppressing the
trend of luxury growth.
Around 1000 BC, the sea peoples, arguably nomads from the hinterland of Europe, destroyed
a number of highly developed kingdoms built by the Hittites, Minoans, and Mycenaeans. Urban
centres, artistic representation, elaborate writing systems, and large-scale trading, shipping and
construction vanished; civilizations were reduced to impoverished, illiterate, technically backward
and violent small communities. The population of the largest cities in the West declined from 80,000
(Babylon and Thebes) in 1200 BC to 25,000 (Susa) in 1000 BC. “The invasions were not merely
military operations, but involved the movements of large populations, by land and sea, seeking new
lands to settle” (Bryce, 1999).
The collapse of Rome was even more dramatic than the collapse of the Hittite kingdom. In
post-Roman Europe, production shrank to meet only local needs. Worldwide copper pollution
plummeted to a seventh of the Roman peak level (Hong et al., 1996). Elites found it hard to afford
the tiled roofs that once even the lowest class of Roman peasants had for their houses (Ward-Perkins,
2005). It is of course unfair to blame all of the loss and decline on the invaders. There is evidence
of mild recession in the third and fourth centuries, arguably caused by civil wars and epidemics.
But the invasions certainly did most of the devastation.
Observe how this view contradicts the Malthusian version of Roman history. In that view, the
average Roman lived beyond the verge of subsistence only because the Roman population had not
caught up with technology for a short while; when it finally did, prosperity was gone (Temin, 2013).
The problem is: if the Romans had lived a pinched life under population pressure, why would Em-
peror Valens have bothered to recruit armies from the “barbarian” immigrants, allowing the Gothic
refugees from the Huns to reside within Roman territory in the first place? After the collapse of
Rome and the decline of population that followed, why did average living standards not rise, as
Malthusian theory predicts, but instead plunged? My answer is: the Romans were rich because
their economic system encouraged luxury production and consumption; the post-Roman Europeans
were poor because the new rulers—not so unlike bandits—adopted policies hurting commerce and
industry. Europe later turned into a feudal society where obligation replaced profit as the guiding
principle of economic life. In many parts of the continent, money transactions disappeared. In-
dividuals’ freedom gave way to group survival. It was then, by the contraction of commerce and
industry, that Europe became a true “subsistence economy”. Europe did not recover until
a thousand years later, when another round of commercial revolution began in the Italian cities.
The same catastrophe befell Song China. Broadberry, Guan, and Li (2014) estimated that the
per capita GDP of Song was about $1500 in US 1990 dollars in the 11th century. Manufacturing
and commerce were so developed that they contributed two-thirds of the government’s tax revenue
(Liu, 2015). Song’s textile machinery was comparable with European designs in the eighteenth
century. Its furnaces put out as much iron as the whole of Europe would produce in 1700. Song’s
coal mines were large enough for hundreds of workers to work at the same time. However, while
the combination of textile, iron and coal sent England onto the track of the Industrial Revolution,
Song failed prematurely.
Unlike England, Song had few geographical barriers to protect itself from invasions. Its collapse
is best viewed as one of several waves of group selection that surged in East Asia in the 12th and 13th
centuries. Before the Jurchens, Song’s rival was Liao, a country the nomadic Khitans built. After
25 years of war, Song and Liao signed a peace treaty on the condition that Song would pay an annual
tribute to Liao. The peace lasted more than 120 years, bringing prosperity to both sides. Occupying
part of China proper that included today’s Beijing, Liao turned from a backward pasture economy
into a civilized country that had a highly developed manufacturing sector. But the Khitans, now
civilized, ended up easy prey for the Jurchen barbarians. Two years after Liao fell, the Jurchens
further conquered Kaifeng (Song’s capital) and annexed the northern half of China. A century later,
the now-civilized Jurchens were in turn wiped out by the barbarian Mongols.
Though the Mongols inherited enormous wealth from the Jurchens and the Song Chinese, they
threw away many of the institutions and policies that had made the wealth possible. The Mongols
divided subjects into four castes with institutionalized discrimination between them. For a cheap
and stable supply of labor, the government forbade workers and their offspring from changing jobs.
The system was later inherited by the Hong-Wu Emperor of the Ming dynasty, who concluded from
the hyperinflation under the Mongols’ rule that money was a dangerous thing, and that the best way
to organise the economy was to fix people at preassigned places and jobs, discouraging movement
of goods and people. The result was that Ming China became a predominantly agrarian economy.
While Song collected two-thirds of its tax from commerce and industry, agriculture provided 84% of
Ming’s tax revenue. Even so, Ming’s total agricultural tax was still smaller than that of Song China
despite the former’s larger territory and population. The living standards that the Song Chinese
once achieved were not reached again in China until perhaps Deng Xiaoping’s reform.
Above, we have seen how invasions from less developed regions destroyed some of the greatest
civilisations the world had ever seen. But that is only one aspect of wars’ impact on luxury growth.
In response to wars, groups often intentionally cut down on luxuries, and that might have been an
even stronger force undermining luxury growth.
For example, during the Warring States period of China (476-221 BC), restraint on luxury was the theme
of a series of political and economic reforms.¹¹ In the face of constant nomadic harassment, King
Wu-Ling of Zhao (340-295 BC) commanded his subordinates to take off their wide sleeves and long
robes¹² and switch to nomadic uniform (pants, belts and boots) in order to fight as cavalry. Half
a century before King Wu-Ling, Shang Yang’s reform swept another kingdom, Qin. The reformer
punished commerce, rewarded cultivation, forbade migration and restricted entertainment. In a
word, he cut down luxury and directed as many resources as possible to subsistence. The subjects
were deprived, but the Qin kingdom defeated all of the six rival kingdoms and united China for the
first time in history. A contemporary philosopher commented, “Qin is different from all the other
kingdoms. The people are poor and the government is cruel. Whoever hopes for a better life can
do nothing but combat hard. This makes Qin army the strongest of all.”¹³
Qin’s idea of governance had a lasting impact on the later Chinese dynasties. Part of the
influence is reflected in the mainstream of ancient Chinese economic thought, which emphasized
restraints on luxury and commerce. The ancient thinkers thought differently than Adam Smith, not
because they were blind to the benefits of commerce, but because they cared about the country’s
survival more than about the subjects’ welfare. In a country where tax-fed mercenaries are not the
backbone of military strength, the government had better sacrifice commercial gains to ensure ease
of conscription. Adam Smith is unique, not because he discovered something no one had thought
of, but because he lived on the eve of the modern era, when individual welfare was finally reconciled
with group survival and expansion: it was the richer Europeans that had moved to America, Africa
and India, instead of the other way around. Smith prophesied the day at dawn.
¹¹ To name a few, Li Hui’s reform in Wei, Wu Qi’s in Chu, Shen Buhai’s in Han, Shang Yang’s in Qin and King
Wu-Ling’s in Zhao.
¹² Veblen (1899) pointed out that the inconvenience in the clothing style reflects the need for conspicuous consumption.
¹³ Xun Zi, chapter Yi Bing (On Wars).
5.4 Luxury explosions and the Industrial Revolution
Thanks to the Industrial Revolution, we have escaped the Malthusian trap. Understanding why the
Malthusian trap existed is basic to explaining how we escaped it. The classical Malthusian theory
predicts instability when birth rates decrease with income. So, most researchers have used demo-
graphic transition (multiple equilibria) to make sense of the Industrial Revolution. Demographic
transition is not incompatible with the two-sector model, but the current theory adds to the con-
ventional interpretation by pointing to a new set of factors triggering modern economic growth
(table 2).
Table 2: The “revolutionary” factors in the old and new theories
Model feature | Triggering event | Result
Classical Malthusian theory
Birth rates decrease with income | Income rises | Switch to the higher equilibrium
Group selection theory
Trade replaces migration | Trade cost drops | Selection slows down
Migrants spread knowledge | Printing tech. develops | Tech. & migration disentangled
Threshold: $g > \frac{k}{2}(\beta\delta)^{1/3}(\sigma^2)^{2/3}$ | 𝑔 increases | Balance is tipped
Threshold: $g > \frac{k}{2}(\beta\delta)^{1/3}(\sigma^2)^{2/3}$ | 𝜎 decreases | Growth dominates selection
Literacy was a luxury | It becomes subsistence | Literacy spurs growth
The first factor is trade. Trade substitutes for migration. A decline in the cost of trade, combined
with a rise in the cost of migration (political barriers to migration increased in the modern era),
can slow down the selection that draws living standards downward.
The second factor is books. It is crucial to the group selection theory that migration should be
a major channel to spread knowledge. The assumption no longer held after printing presses spread.
The next two factors appear in the threshold condition of growth, $g > \frac{k}{2}(\beta\delta)^{1/3}(\sigma^2)^{2/3}$. From 1500
to 1800, Northwest Europe experienced a steady decline in the relative price of luxuries over staple
food and fuels (Hoffman et al., 2002). This implies that the gap in growth rates, 𝑔, had become
larger. What is more, Fouquet (2014) shows that the variance of GDP growth rates of European
economies decreased in the 19th century. The increase in 𝑔 and the decrease in 𝜎² might reverse the
inequality relationship.
The last factor may be regarded as a theory by itself, the luxury explosion theory. The theory
holds that a technology—akin to culture—that turns from a luxury technology into a subsistence
technology will spread in an explosive way. To see this, go back to the group selection model. With
a little abuse of notation, denote a negative selection, the case where a trait is selected against
(luxury), as 𝜆 < 0, and a positive selection, the case where a trait is selected for (subsistence),
as 𝜆 > 0. Following a similar derivation as in section 4.4.2, the relationship between the force of
selection and the intensity of selection is still
$$S = \Phi \lambda^{1/3}, \qquad (31)$$
except that the sign of 𝜆 now indicates whether the trait is selected for or selected against.
As figure 16 shows, the “S” shape of 𝑆(𝜆)means that even a tiny 𝜆can produce a large force
of selection. The “S” shape results from a subtle variation effect: when selection is less intense—
because, say, people are more reluctant to move—regions tend to deviate further away from each
other in terms of production structure and the level of utility. The increased gap of utility motivates
people to move notwithstanding the inertia; and the enlarged difference of lifestyle means migrants
have more surprises to offer to the host culture. Overall, the greater variation compensates for the
loss of interest in migration. It makes weak selection still have a strong impact14.
The derivative of 𝑆(𝜆) is infinite at 0. When environmental changes make a luxury trait less
luxurious, little will change if the trait remains a luxury. But if the trait thereby turns into
subsistence, even an extremely tiny environmental change will trigger a big change in 𝑆: a luxury
explosion.
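The asymmetry described above can be illustrated numerically. The following is only a sketch of equation (31), taking the real cube root so that the sign of 𝜆 carries over to 𝑆; the value Φ = 1 and the sample values of 𝜆 are assumptions chosen for illustration, not figures from the paper.

```python
import math

def force_of_selection(lam, phi=1.0):
    """S = phi * lam^(1/3), using the real cube root so S has the sign of lam.
    phi = 1.0 is an illustrative assumption."""
    return phi * math.copysign(abs(lam) ** (1.0 / 3.0), lam)

def slope(lam, phi=1.0):
    """dS/dlam = (phi / 3) * |lam|^(-2/3), which grows without bound as lam -> 0."""
    return phi / 3.0 * abs(lam) ** (-2.0 / 3.0)

eps = 1e-6  # a tiny environmental change (hypothetical size)

# Trait crosses from luxury (lam = -eps) to subsistence (lam = +eps).
jump_across_zero = force_of_selection(eps) - force_of_selection(-eps)

# Trait moves by the same total amount but stays a luxury throughout.
shift_within_luxury = force_of_selection(-0.5) - force_of_selection(-0.5 - 2 * eps)

print(f"change in S when crossing zero:   {jump_across_zero:.6f}")
print(f"change in S while staying luxury: {shift_within_luxury:.8f}")
```

Under these assumed values, a shift in 𝜆 of the same size moves 𝑆 by several orders of magnitude more when it crosses zero than when the trait remains a luxury.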
[Figure 16 plot: the curve 𝑆(𝜆) through the origin. Labels: negative selection (luxury), positive selection (subsistence), “hardly any difference”, “luxury explosion”.]
Figure 16: The relationship between the force of selection 𝑆 and the intensity
of selection 𝜆. Here 𝜆 < 0 means the commodity is a luxury, and it is subject to
negative selection; 𝜆 > 0 means the commodity is subsistence, and it is subject
to positive selection.
The luxury explosion has profound implications for triggering modern growth. Consider the
relationship between literacy and fertility. The tradeoff between quantity and quality of children
is a classic example of the choice between subsistence and luxury. Spending money on books and
education for children (a choice of quality over quantity) might increase the number of
grandchildren (Galor and Klemp, 2014), but, if all households do so, the density of population
will decline in most cases. The quality of children is the pivot of transition in a number of
unified growth theories (Galor and Moav, 2002; Galor and Weil, 2000; Clark, 2008; Galor, 2011).
Transitions in these models are mostly driven by multiple equilibria of fertility choice. Here,
selection provides a new mechanism of transition: literacy, which was meant for reading the Bible
at the onset of the Religious Reformation, unintentionally equipped the masses with scientific
knowledge, engineering know-how and nationalist enthusiasm, by which Europe colonized the other
parts of the world. What used to be a luxury turned into subsistence. The luxury explosion then
made a revolution.
14 A similar mechanism can explain why hardly intermarried peoples are still genetically close to each other. Pinker
(2003, p.143) notes that “Rare genes can offer immunity to endemic diseases, so they get sucked into one group from a
neighbouring group like ink on a blotter, even if members of one group mate with members of the other infrequently.
That is why Jews, for example, tend to be genetically similar to their non-Jewish neighbours all over the world, even
though until recently they tended to marry other Jews. As little as one conversion, affair, or rape involving a gentile
in every generation can be enough to blur genetic boundaries over time.”
The Industrial Revolution is unique because human capital is a very special luxury. Most other
luxuries, like diamonds and yachts, cannot switch into subsistence. Weapons switch but they do not
improve living standards. Human capital not only enriches the individual but also strengthens a
country. Moreover, human capital is immune to the “erosion” of biased migration. If an immigrant
wants to benefit from the “luxury”, she has to learn it, whereas a learned emigrant can apply his
knowledge away from home. Hardly any other luxury combines these wonderful features.
There have been plenty of explanations for the Industrial Revolution. It is unlikely that any
single factor can account for the whole transition experience. However, the current state of the
Industrial Revolution research is that most existing ideas are anecdotal. The idea that receives
the most rigorous modelling is demographic transition (there are many versions of it, but the
underlying mechanism is the same—demographic transition combined with multiple equilibria).
The disproportionate popularity of that single explanation is rooted in the widely held presumption
that the Malthusian trap is caused by the Malthusian mechanism. Now that the presumption
is shaken and an alternative mechanism has been put into a model, the Industrial Revolution is
open to a new set of interpretations that might be rigorously modelled under the new framework:
institution, trade, social insurance, the Renaissance, the Reformation, the Scientific Revolution, and
the Enlightenment (Mokyr, 2005).
6 Concluding remarks
For more than two centuries, scholars have taken Malthus’s explanation for the Malthusian trap for
granted. The conventional wisdom is wrong.
Differing from the Malthusian view of history, this paper suggests the following basic story.
Imagine a world where people live on two things: bread and flowers. Population increases with
bread, hence the average consumption of bread is fixed in the long run by the Malthusian force. But
population hardly responds to flowers. If the flower sector grows faster than the bread sector, people
will live better and better by each having more and more flowers. This never happened until
the Industrial Revolution. Throughout the thousands of years before that time, flower productivity
had somehow grown at the same rate as bread productivity.
The cause of the balanced growth is group selection. People organize themselves into compet-
ing groups. When a group has comparative advantage at making bread, its average member will
have fewer flowers than their neighbours do. Greed drives them to move abroad. As they move,
they spread the technology of their hometown to other places. The consequence is that the bread
technology tends to spread faster than the flower technology. Even if the flower sector intrinsically
grows faster than the bread sector, a tiny bit of spread advantage of the bread sector can offset
a large growth advantage of the flower sector. With the whole world interlocked in a network of
migration and occasional conquests, living standards were stagnant almost everywhere. Thus the
Malthusian trap is at the same time a Darwinian trap.
For all its novelty, the group selection theory of the Malthusian trap is a tautology, a tautology
that makes the theory irrefutably robust. Here is how: what the theory is set to explain is why
the average pre-industrial person had so little luxury15—so few flowers. The theory ascribes this
Malthusian fact to group competition. By definition, luxury contributes to individual utility at
the expense of group fitness. Fitness matters only in the corresponding context of competition.
Therefore, luxury must be constrained by group competition, the only context in which group
fitness ever matters. So, the way I define luxury has already ensured that group competition is the
main suppressor of luxury.
Overall, this paper has four contributions:
I It replaces Malthus’s explanation of the Malthusian trap.
II It explains why the Malthusian relationship between average income and population growth
is empirically weak: the classical theory misses two of the three determinants of long-run
equilibrium, i.e. social preference and production structure.
III It explains the prosperity of ancient market economies such as Rome and Song.
IV It suggests a new set of factors that might have triggered modern economic growth.
Malthusian theory is fundamental to our understanding of history. Replacing it opens numerous
possibilities for economic history research. Here, I discuss two of them.
First, the two-sector model liberates living standards researchers from the Malthusian presump-
tion. The presumption is evident in Maddison’s series, where both Rome’s and Song’s per capita
GDP were estimated at $450, that is, only $50 above the lowest number in the data. The pre-
sumption is also evident in the dubious methodology of many empirical researchers, who ignored
non-agricultural output when estimating income, for they believe, as Baumol (1990) put it, “[i]n a
period in which agriculture probably occupied some 90 percent of the population, the expansion of
industry [...] could not by itself have created a major upheaval in living standards.” The presump-
tion is even evident in many skeptics’ work. Too often we have seen researchers who provide strong
evidence of high living standards in certain historical episodes concluding with an apologetic tone
that the prosperity must be a temporary phenomenon that is doomed to disappear when population
catches up with technology. Temin (2013, p.193), for example, said, “[i]t reveals even Malthusian
economies can have economic growth, that is, can have rising standards of living. This can go on
for a long time, even centuries, even though without industrialization, it is doomed to end.” Now,
with the two-sector model available, large swings of living standards become a serious theoretical
possibility. Researchers no longer have to hide or distort facts to fit any theory. Hopefully, the
two-sector model will also direct more researchers’ attention to the non-agricultural sectors. In the
classical model, commerce and industry are unimportant for living standards; but in the two-sector
model, they become crucial.
15 Daily calorie intake per person has hardly changed since the Industrial Revolution. The improvement of life is
mostly reflected in the diversity and abundance of luxury.
Second, the group selection theory calls on the profession to embrace a “macro” view of the
Industrial Revolution. What I mean is this: the classical Malthusian theory has led most economists
to believe that the key mechanism of the Industrial Revolution lies in the demographic transition.
Guided by this belief, researchers spent most of their energies on changes of fertility behaviour
in pre-modern Europe: how fertility varied with income, status, education, etc. Fertility is
a micro decision, made on the household level. Fertility is important. But it is only one of the
three comparative statics in the two-sector model. The other two, social preference and production
structure, are no less important than fertility. What determines these two? They are determined by
politics, policies, institutions, wars, trade, migration, cultures, and geography. Incorporating social
preference and production structure into the analysis means we need to rethink the roles these
“macro” factors play in a society. When it comes to the Industrial Revolution, previous researchers
asked why households changed their minds about children; now the added question is: what had
the prince done that made his country stand out?
Finally, I would like to further address the difference between Wu, Dutta, Levine, and Papageorge
(2014), henceforth WDLP, and this paper. Most related studies treat the Malthusian trap as a fact,
but WDLP is an exception. They argue that, because manufacturing and commerce usually grow
faster than agriculture, the income per capita had a slow yet still significant trend of growth before
the Industrial Revolution.
As stated previously, the two-sector Malthusian theory leads to two possibilities. One is that
the Malthusian fact is right, but requires a new explanation; the other is that the Malthusian trap
did not exist at all. This paper explores the first possibility, while WDLP studies the second. The
reality must lie in between. The two theories can actually be reconciled. I will leave the details of
the reconciliation to another paper. Here, I only sketch the idea.
Many luxuries are culture-specific. They are desired within a culture but not without. Group se-
lection has no way to eliminate the growth of such luxuries because migration never responds to the
difference of consumption in these items. Therefore, a distinction should be made between “universal
luxury” and “provincial luxury”. Universal luxuries are desired by all human beings; provincial lux-
uries, only by a group of people. Group selection suppresses universal luxuries, but leaves provincial
luxuries free to grow. This explains why culture was so diverse across pre-industrial societies despite
the monotony of life (measured by universal luxury). This is also why most economic historians
accept the Malthusian fact while WDLP come to a different conclusion. The profession has focused
on universal luxury only, but WDLP are concerned with all types of luxuries. The current paper
explains the Malthusian trap in its usual sense, that is, why the average consumption of universal
luxury was constantly low throughout the pre-industrial era.
According to a famous anecdote in the history of science, Darwin and Wallace independently
discovered natural selection, both by reading Malthus’s essay on population16. Now the anecdote
has a happy ending: Malthus inspired Darwin, but it was Darwin who put in place the final piece
of Malthus’s puzzle.
16In his autobiography (1876), Charles Darwin wrote: “In October 1838, that is, fifteen months after I had begun my
systematic inquiry, I happened to read for amusement Malthus on Population, and being well prepared to appreciate
the struggle for existence which everywhere goes on from long-continued observation of the habits of animals and
plants, it at once struck me that under these circumstances favourable variations would tend to be preserved, and
unfavourable ones to be destroyed. The results of this would be the formation of a new species. Here, then I had at
last got a theory by which to work.”
References
Aiyar, Shekhar, Carl-Johan Dalgaard, and Omer Moav. 2008. “Technological Progress and Regress
in Pre-Industrial Times.” Journal Of Economic Growth 13 (2):125–144.
Allen, Robert C. 2008. “A Review of Gregory Clark’s A Farewell to Alms: A Brief Economic History
of the World.” Journal of Economic Literature 46 (4):946–973.
Álvarez-Nogal, Carlos and Leandro Prados De La Escosura. 2013. “The Rise and Fall of Spain
(1270–1850).” The Economic History Review 66 (1):1–37.
Ashraf, Quamrul and Oded Galor. 2011. “Dynamics and Stagnation in the Malthusian Epoch.” The
American Economic Review: 2003–2041.
Baumol, William J. 1990. “Entrepreneurship: Productive, Unproductive, and Destructive.” Journal
of Political Economy: 893–921.
Bowles, Samuel. 2006. “Group Competition, Reproductive Leveling, and the Evolution of Human
Altruism.” Science 314 (5805):1569–1572.
Bowles, Samuel and Herbert Gintis. 2002. “Behavioural Science: Homo Reciprocans.” Nature
415 (6868):125–128.
Broadberry, Stephen, Hanhui Guan, and David D. Li. 2014. “China, Europe and the Great Di-
vergence: A Study in Historical National Accounting, 950-1850.” Economic History Department
Paper, London School of Economics.
Broadberry, Stephen and Bishnupriya Gupta. 2006. “The Early Modern Great Divergence: Wages,
Prices and Economic Development in Europe and Asia, 1500–1801.” The Economic History
Review 59 (1):2–31.
Bryce, Trevor. 1999. The Kingdom of the Hittites. Oxford University Press.
Cavalli-Sforza, Luigi L, Paolo Menozzi, and Alberto Piazza. 1993. “Demic Expansions and Human
Evolution.” Science 259 (5095):639–646.
Chen, Shuo and James Kai-sing Kung. 2013. “Of Maize and Men: The Effect of a New World Crop
on Population and Economic Growth in China.” Available at SSRN 2102295.
Clark, Gregory. 2008. A Farewell to Alms: A Brief Economic History of the World. Princeton
University Press.
Davies, John E. 1994. “Giffen Goods, the Survival Imperative, and the Irish Potato Culture.”
Journal of Political Economy: 547–565.
De Vries, Jan. 2006. European Urbanisation, 1500-1800, vol. 4. Routledge.
Diamond, Jared. 1987. “The Worst Mistake in the History of the Human Race.” Discover 8 (5):64–66.
Fouquet, Roger. 2014. “Seven Centuries of Economic Growth and Decline.” Presented at the
AEA/ASSA Annual Meeting 2015.
Friedman, Milton. 1953. “The Methodology of Positive Economics.” Essays In Positive Economics
3 (8).
Galor, Oded. 2011. Unified Growth Theory. Princeton University Press.
Galor, Oded and Marc Klemp. 2014. “The Biocultural Origins of Human Capital Formation.” Tech.
rep., University Library of Munich, Germany.
Galor, Oded and Omer Moav. 2002. “Natural Selection and the Origin of Economic Growth.” The
Quarterly Journal of Economics 117 (4):1133–1191.
Galor, Oded and David N Weil. 2000. “Population, Technology, and Growth: From Malthusian
Stagnation to the Demographic Transition and Beyond.” American Economic Review 90 (4):806–
828.
Hansen, Gary D and Edward C Prescott. 2002. “Malthus to Solow.” American Economic Review:
1205–1217.
Harlan, Jack Rodney. 1975. Crops and Man. American Society of Agronomy.
Hersh, Jonathan and Joachim Voth. 2009. “Sweet Diversity: Colonial Goods and the Rise of
European Living Standards after 1492.”
Hoffman, Philip T, David S Jacks, Patricia A Levin, and Peter H Lindert. 2002. “Real Inequality
in Europe since 1500.” The Journal of Economic History 62 (02):322–355.
Hong, Sungmin, Jean-Pierre Candelone, Clair C Patterson, and Claude F Boutron. 1996. “History
of Ancient Copper Smelting Pollution during Roman and Medieval Times Recorded in Greenland
Ice.” Science 272 (5259):246–249.
Jones, Charles I. 2001. “Was an Industrial Revolution Inevitable? Economic Growth over the Very
Long Run.” Advances In Macroeconomics 1 (2).
Lee, Ronald. 1987. “Population Dynamics of Humans and Other Animals.” Demography 24 (4):443–
465.
Levine, David and Salvatore Modica. 2013. “Anti-Malthus: Conflict and the Evolution of Societies.”
Research In Economics.
Lipsey, Richard G, Kenneth I Carlaw, and Clifford T Bekar. 2005. Economic Transformations:
General Purpose Technologies and Long-Term Economic Growth. Oxford University Press.
Liu, William Guanglin. 2015. “The Making of a Fiscal State in Song China, 960–1279.” Economic
History Review 68 (1):48–78.
Lo Cascio, Elio and Paolo Malanima. 2009. “Ancient and Pre-Modern Economies: GDP in the
Roman Empire and Early Modern Europe.” Presented at the conference on “Long-Term
Quantification in Mediterranean Ancient History,” Brussels.
Maddison, Angus. 2003. The World Economy: Historical Statistics. OECD Publishing.
Malanima, Paolo. 2011. “The Long Decline of a Leading Economy: GDP in Central and Northern
Italy, 1300-1913.” European Review of Economic History 15 (02):169–219.
Malthus, Thomas Robert. 1809. An Essay on the Principle of Population, as It Affects the Future
Improvement of Society, vol. 2.
Moav, Omer and Zvika Neeman. 2008. “Conspicuous Consumption, Human Capital, and Poverty.”
Human Capital and Poverty.
Mokyr, Joel. 1983. Why Ireland Starved: An Analytical and Quantitative Study of Irish Poverty,
1800–1851. London and Boston: George Allen and Unwin.
———. 2005. “The Intellectual Origins of Modern Economic Growth.” Journal of Economic History
65 (02):285–351.
Morris, Ian. 2011. Why the West Rules - For Now: The Patterns of History, and What They Reveal
about the Future. Random House LLC.
Moser, Petra, Alessandra Voena, and Fabian Waldinger. 2014. “German-Jewish Emigrés and US
Invention.” Tech. rep., National Bureau of Economic Research.
Nunn, Nathan and Nancy Qian. 2011. “The Potato’s Contribution to Population and Urbanization:
Evidence From a Historical Experiment.” The Quarterly Journal of Economics 126 (2):593–650.
Persson, Karl Gunnar. 2010. “The End of the Malthusian Stagnation Thesis.” Tech. rep., Mimeo
(University Of Copenhagen).
Pinker, Steven. 2003. The Blank Slate: the Modern Denial of Human Nature. Penguin.
Pomeranz, Kenneth. 2009. The Great Divergence: China, Europe, and the Making of the Modern
World Economy. Princeton University Press.
Ravenstein, Ernest George. 1885. “The Laws of Migration.” Journal of the Statistical Society of
London: 167–235.
Restuccia, Diego, Dennis Tao Yang, and Xiaodong Zhu. 2008. “Agriculture and Aggregate Produc-
tivity: A Quantitative Cross-Country Analysis.” Journal of Monetary Economics 55 (2):234–250.
Robbins, Lionel. 1998. A History of Economic Thought: the LSE Lectures. Wiley Online Library.
Simon, Julian Lincoln and Gunter Steinmann. 1991. “Population Growth, Farmland, and the Long-
Run Standard of Living.” Journal of Population Economics 4 (1):37–51.
Smith, Adam. 1887. An Inquiry into the Nature and Causes of the Wealth of Nations. T. Nelson
and Sons.
Solow, Robert M. 1956. “A Contribution to the Theory of Economic Growth.” The Quarterly
Journal of Economics 70 (1):65–94.
Solow, Robert M and Paul A Samuelson. 1953. “Balanced Growth under Constant Returns to
Scale.” Econometrica: 412–424.
Soltis, Joseph, Robert Boyd, and Peter J Richerson. 1995. “Can Group-Functional Behaviors Evolve
by Cultural Group Selection? An Empirical Test.” Current Anthropology: 473–494.
Taylor, M Scott and James A Brander. 1998. “The Simple Economics of Easter Island: A Ricardo-
Malthus Model of Renewable Resource Use.” The American Economic Review 88 (1):119–138.
Temin, Peter. 2013. The Roman Market Economy. Princeton University Press.
Tiebout, Charles M. 1956. “A Pure Theory of Local Expenditures.” The Journal of Political
Economy: 416–424.
Veblen, Thorstein. 1899. The Theory of the Leisure Class: an Economic Study of Institutions.
Macmillan.
Voigtländer, Nico and Hans-Joachim Voth. 2013. “The Three Horsemen of Riches: Plague, War,
and Urbanization in Early Modern Europe.” The Review of Economic Studies 80 (2):774–811.
Ward-Perkins, Bryan. 2005. The Fall of Rome: and the End of Civilization. Oxford University
Press.
Weisdorf, Jacob L. 2008. “Malthus Revisited: Fertility Decision Making Based on Quasi-Linear
Preferences.” Economics Letters 99 (1):127–130.
Wilson, David Sloan. 2015. Does Altruism Exist?: Culture, Genes, and the Welfare of Others. Yale
University Press.
Wu, Lemin. 2012. “Does Malthus Really Explain the Constancy of Living Standards?” Available at
SSRN 2187113.
Wu, Lemin, Rohan Dutta, David K Levine, and Nicholas W Papageorge. 2014. “Entertaining
Malthus: Bread, Circuses and Economic Growth.” Tech. rep., UCLA Department Of Economics.
Yang, Dennis Tao and Xiaodong Zhu. 2013. “Modernization of Agriculture and Long-Term Growth.”
Journal of Monetary Economics 60 (3):367–382.
Appendices
A Proofs
A.1 Prove that $g$ converges to $\beta(g_B - g_A)$ in the long run.
First, I prove the following lemma.
Lemma 5. If an isolated economy has constant growth rates of technology $g_A$ and $g_B$, then
$g_A - (1-\gamma)g_H$ converges to 0.
Proof:
Population evolves in the following way:
$$g_H = \delta(\ln x - \ln \bar{x})$$
Since $x = A(1-\beta)^{\gamma}H^{\gamma-1}$ (equation 7),
$$g_H = \delta\left[\ln A + \gamma\ln(1-\beta) + (\gamma-1)\ln H - \ln\bar{x}\right]$$
Denote $M \equiv \ln A + (\gamma-1)\ln H$, then
$$g_H = \delta\left[M + \gamma\ln(1-\beta) - \ln\bar{x}\right]$$
The motion of $M$ follows
$$dM = g_A + (\gamma-1)g_H = g_A + (\gamma-1)\delta\left[M + \gamma\ln(1-\beta) - \ln\bar{x}\right]$$
Since $(\gamma-1)\delta < 0$, $M$ will stabilize at
$$M^* = \frac{g_A}{(1-\gamma)\delta} - \gamma\ln(1-\beta) + \ln\bar{x}$$
Hence $dM = g_A - (1-\gamma)g_H$ converges to 0.
Proposition 6. $g_U$ converges to $\beta(g_B - g_A)$.
Proof: Start by expressing $U$ as a function of $A$ and $B$. We cannot use the formula of equilibrium
utility (equation 11) because the continuous progress of technology will pull the economy slightly
away from the equilibrium state. So I turn to equation 9, which applies to dynamic scenarios as well.
Suppose land is fixed. By log-linearizing equation 9, we get
$$g_U = \beta(g_B - g_A) + g_A - (1-\gamma)g_H.$$
Lemma 5 holds that $g_A - (1-\gamma)g_H$ converges to 0. Therefore, $g_U$ converges to $\beta(g_B - g_A)$.
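The convergence in Lemma 5 and Proposition 6 can be checked numerically. The sketch below iterates the motion of $M$ with the baseline parameter values of Table 3; the one-year Euler step, the starting value of $M$, and the function names are assumptions introduced here for illustration, not part of the paper's simulation.

```python
import math

# Baseline values from Table 3 of the paper.
g_A, g_B = 0.005, 0.01
delta, gamma, beta, x_bar = 0.2, 0.5, 0.5, 1.0

def population_growth(M):
    """g_H = delta * [M + gamma*ln(1-beta) - ln(x_bar)], with M = ln A + (gamma-1) ln H."""
    return delta * (M + gamma * math.log(1 - beta) - math.log(x_bar))

def simulate(M0=0.0, years=500):
    """Iterate dM = g_A + (gamma-1)*g_H with a one-year Euler step (an assumption)."""
    M = M0
    for _ in range(years):
        M += g_A + (gamma - 1) * population_growth(M)
    return M

# Stationary value of M from the proof of Lemma 5.
M_star = g_A / ((1 - gamma) * delta) - gamma * math.log(1 - beta) + math.log(x_bar)

M_end = simulate()
g_H_end = population_growth(M_end)
# Log-linearized utility growth from the proof of Proposition 6.
g_U_end = beta * (g_B - g_A) + g_A - (1 - gamma) * g_H_end

print(f"M* = {M_star:.5f}, simulated M = {M_end:.5f}")
print(f"g_U = {g_U_end:.5f}, beta*(g_B - g_A) = {beta * (g_B - g_A):.5f}")
```

With these values the gap between $M$ and $M^*$ shrinks by a factor of $1 - (1-\gamma)\delta = 0.9$ each step, so the simulated $g_U$ settles at $\beta(g_B - g_A)$.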
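A.2 Prove that $S \equiv \beta\delta\,E_{t\to+\infty}(\nu) = \frac{k}{2}(\beta\delta)^{1/3}(\sigma^2)^{2/3}$.
Proof: By Ito’s lemma,
$$d\nu^{x} = \left[\sigma^2 x(2x-1)\,\nu^{x-1} - 2\sqrt{2}\,|\lambda|\,x\,\nu^{x+\frac{1}{2}}\right]dt + 2\sigma x\,\nu^{x-\frac{1}{2}}\,dz$$
Since $E_{t\to+\infty}(d\nu^{x}) \to 0$, the long-run expectation of the drift term
$$\sigma^2 x(2x-1)\,E_{t\to+\infty}(\nu^{x-1}) - 2\sqrt{2}\,|\lambda|\,x\,E_{t\to+\infty}(\nu^{x+\frac{1}{2}}) = 0.$$
Let $f(x) \equiv E_{t\to+\infty}(\nu^{x})$ and denote $\frac{\sigma^2}{2\sqrt{2}|\lambda|}$ as $a$, then the above equation can be rewritten as a
general term formula:
$$f\left(x + \tfrac{3}{2}\right) = a(2x+1)f(x)$$
with $f(0) = E_{t\to+\infty}(\nu^{0}) = 1$.
The general solution is
$$f(x) = \frac{1}{3}(3a)^{\frac{2}{3}x}\,\mathrm{Pochhammer}\left(\frac{4}{3}, \frac{2}{3}x - 1\right).$$
Let $x = 1$, then
$$f(1) = \frac{a^{2/3}}{3^{1/3}\,\mathrm{Gamma}\left(\frac{4}{3}\right)}.$$
Denote $k \equiv \left[3^{1/3}\,\mathrm{Gamma}\left(\frac{4}{3}\right)\right]^{-1} \approx 0.78$, then $f(1)$ can be written as $k a^{2/3}$.
By definition,
$$E_{t\to+\infty}(\nu) = f(1) = k\left(\frac{\sigma^2}{2\sqrt{2}|\lambda|}\right)^{2/3} = k\nu^{*}$$
Substituting $E_{t\to+\infty}(\nu) = k\nu^{*}$ into $S \equiv \beta\delta\,E_{t\to+\infty}(\nu)$, we get
$$S = \lambda k\nu^{*} = \frac{k}{2}(\beta\delta)^{1/3}(\sigma^2)^{2/3}.$$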
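The algebra above can be spot-checked numerically. The sketch below evaluates the closed form through the Gamma function, using the identity Pochhammer(q, n) = Gamma(q + n)/Gamma(q), and confirms the recursion, the constant $k$, and the final expression for $S$. The inputs $\lambda = 0.1$ (read as $\beta\delta$ from Table 3) and $\sigma^2 = 0.0025$ are illustrative assumptions only.

```python
import math

def f(x, a):
    """Closed form for E[nu^x]: (1/3)(3a)^(2x/3) * Pochhammer(4/3, 2x/3 - 1),
    with Pochhammer(q, n) = Gamma(q + n) / Gamma(q)."""
    pochhammer = math.gamma(2 * x / 3 + 1 / 3) / math.gamma(4 / 3)
    return (3 * a) ** (2 * x / 3) * pochhammer / 3

# k = [3^(1/3) * Gamma(4/3)]^(-1), the constant defined in the proof.
k = 1 / (3 ** (1 / 3) * math.gamma(4 / 3))

# Hypothetical inputs: lam stands for beta*delta; sigma2 is chosen for illustration.
lam, sigma2 = 0.1, 0.0025
a = sigma2 / (2 * math.sqrt(2) * abs(lam))
nu_star = a ** (2 / 3)

# The recursion f(x + 3/2) = a(2x + 1) f(x) should hold at any x >= 0.
recursion_ok = all(
    math.isclose(f(x + 1.5, a), a * (2 * x + 1) * f(x, a))
    for x in (0.0, 0.5, 1.0, 2.0)
)

S_left = lam * k * nu_star                                # S = lambda * k * nu*
S_right = k / 2 * lam ** (1 / 3) * sigma2 ** (2 / 3)      # S = (k/2) lambda^(1/3) (sigma^2)^(2/3)

print(f"k = {k:.4f}, recursion holds: {recursion_ok}")
print(f"S (left) = {S_left:.6f}, S (right) = {S_right:.6f}")
```

The two expressions for $S$ agree because $(2\sqrt{2})^{2/3} = 2$, which is where the factor $\frac{k}{2}$ comes from.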
B Figures
[Figure 17 plot: three panels of regional utility (vertical axis, 0 to 2.5) against year (horizontal axis, 0 to 10000), titled corner (1,1), side (5,1), and interior (5,5).]
Figure 17: The regional utility fluctuates wildly but has no trend. Here is
the history of regional utility of three representative regions: a corner region
(1,1), a side region (5,1), and an interior region (5,5).
[Figure 18 plot: five histograms of frequency (0 to 25) against the regional average utility, one each for year 2000, year 4000, year 6000, year 8000, and year 10000.]
Figure 18: The distribution of regional average utility is stable over time.
[Figure 19 plot: axes are the std. of growth rates (0% to 15%) and the drift rate of the luxury technology (0% to 2%). Legend: growth in both scenarios; stagnation in both; growth under selective learning but stagnation under indiscriminate substitution.]
Figure 19: The progressive and stagnant areas on the parameter space. Notes:
a point is counted as a “growth point” only if the global average utility grows
more than 25% over 3000 years. There are three areas in the parameter space.
The upper left area that is marked with crosses is where the global average
utility stagnates in both the indiscriminate substitution case and the selective
learning case. The middle area marked with circles is where the utility grows
under selective learning but not under indiscriminate substitution. The lower
right area marked with triangles is where growth occurs under both scenarios.
[Figure 20 plot: cumulative growth over 3000 years (−30% to 60%) against the side length of the square world (0 to 20), for two series: indiscriminate substitution and selective learning.]
Figure 20: The cumulative growth of global average utility over 3000 years.
Notes: each point represents an experiment at the corresponding side length
of the world.
C Tables
Table 3: Parameterization of the baseline simulation
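Parameter    Value    Interpretation
$g_A$        0.5%     Subsistence growth rate
$g_B$        1%       Luxury growth rate
$\sigma_A$   5%       Std. of subsistence growth
$\sigma_B$   5%       Std. of luxury growth
$\delta$     0.2      $n = \delta(\ln x - \ln \bar{x})$
$\gamma$     0.5      $X = A L_A^{1-\gamma} H_A^{\gamma}$, $Y = B L_B^{1-\gamma} H_B^{\gamma}$
$\bar{x}$    1        $n = \delta(\ln x - \ln \bar{x})$
$\theta$     0.1      Migration rate
$\beta$      0.5      $U = x^{1-\beta} y^{\beta}$
$\alpha$     −10      $g_{B_{ij}} = g_B[1 + (B_{ij}/A_{ij})^{\alpha}]$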
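The last row of Table 3 makes the luxury drift of region $(i, j)$ depend on its technology ratio $B_{ij}/A_{ij}$. The sketch below simply evaluates that formula at the baseline values $g_B = 1\%$ and $\alpha = -10$; the function name and the sample ratios are assumptions for illustration, and the code makes no claim about how the paper's simulation is actually implemented.

```python
def luxury_drift(B, A, g_B=0.01, alpha=-10):
    """g_B_ij = g_B * [1 + (B_ij / A_ij)^alpha], as written in Table 3.
    B and A are a region's luxury and subsistence technology levels."""
    return g_B * (1 + (B / A) ** alpha)

# Sample technology ratios (hypothetical values).
balanced = luxury_drift(1.0, 1.0)       # B/A = 1
luxury_ahead = luxury_drift(2.0, 1.0)   # B/A = 2
luxury_behind = luxury_drift(0.5, 1.0)  # B/A = 0.5

print(f"B/A = 1.0 -> drift {balanced:.4f}")
print(f"B/A = 2.0 -> drift {luxury_ahead:.6f}")
print(f"B/A = 0.5 -> drift {luxury_behind:.4f}")
```

With a strongly negative $\alpha$, the extra term vanishes quickly once $B_{ij}$ exceeds $A_{ij}$, so the drift approaches $g_B$, while a region whose luxury technology lags its subsistence technology gets a much larger drift.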