Speed of Convergence in a Malthusian World: Weak or Strong Homeostasis?

Arnaud Deseau
LIDAM Discussion Paper IRES
2023 / 10
March 16, 2023
Abstract
Standard Malthusian models predict that a productivity or population shock modifies income per capita in the short run. In the long run, however, population pressures make income per capita gradually come back to its steady state. I investigate the duration of this short-run fluctuation, estimating the speed of convergence of Malthusian economies to their GDP per capita and population steady states. To do so, I first build and calibrate a Malthusian model capturing explicitly the idea that marriages are postponed (advanced) and the fertility potential of couples reduced (augmented) during depressions (expansions). I then run β-convergence regressions on historical panel data. I find consistent evidence of weak homeostasis, with a half-life of about one century. This implies that early modern data may display high persistence without necessarily rejecting the Malthusian hypothesis.
Keywords: Convergence, Homeostasis, Malthusian dynamics, Preventive check, Marriage, Fertility, Malthusian model, β-convergence
JEL Codes: N10, N13, N33, O10, O47
I am grateful to Hugues Annoye, Thomas Baudin, Pierre de Callataÿ, Greg Clark, David de la Croix, Frédéric
Docquier, Oded Galor, Marc Goñi, Andreas Irmen, Nils-Petter Lagerlöf, Hélène Latzer, Anastasia Litina, Fabio
Mariani, Stelios Michalopoulos, Luca Pensieroso and David Weil for their valuable comments and suggestions. I
also thank participants at the 2018 University of Luxembourg CREA Workshop on Culture and Comparative
Development; the 2017 Université Saint-Louis Bruxelles CEREC Workshop on Macroeconomics and Growth;
and seminars at Brown University and IRES (UCLouvain).
CEREC, Université Saint-Louis Bruxelles, 38 Boulevard du Jardin Botanique, 1000 Bruxelles, Belgium and
IRES/LIDAM, UCLouvain, College L. H. Dupriez, 3 Place Montesquieu, B-1348 Louvain-la-Neuve, Belgium
(e-mail: arnaud.deseau@uclouvain.be).
In four centuries [1300-1700], the [French] population only increased by 2 million persons in
all! And some say less! [...] Thus, an extraordinary ecological equilibrium is revealed. Of course, it
did not exclude possibly prodigious, but always temporary, upheavals and negative fluctuations in its
time like those experienced by animal population.
Emmanuel Le Roy Ladurie (1977), Motionless History.
1 Introduction
One of the most central predictions of Malthusian theory is that standards of living were stagnant before the onset of industrialization. Stagnation, however, does not literally mean constant, or flat, per capita income. In fact, any shock striking a Malthusian economy generates fluctuations, or volatility, in the standards of living, namely temporary or non-sustained economic growth. Indeed, a simple Malthusian model predicts that a positive shock to the technology level (say, the introduction of better cultivation techniques) increases income per capita in the short run only; in the long run, population increases and the economy returns to its initial level of income per capita. This is the so-called “Malthusian trap” mechanism, which has been recognized as one of the major obstacles to achieving sustained economic growth over millennia (Kremer, 1993; Galor and Weil, 2000; Hansen and Prescott, 2002; Clark, 2007; Ashraf and Galor, 2011; Galor, 2011).
While the existence of the Malthusian trap mechanism is widely accepted, previous literature has found mixed evidence about its exact strength. A first group of studies finds evidence of a weak Malthusian trap, known as weak homeostasis,¹ with slow convergence taking several centuries (Lee and Anderson, 2002; Crafts and Mills, 2009; Fernihough, 2013; Bouscasse et al., 2021). On the other hand, Madsen et al. (2019) find evidence of a strong Malthusian trap, or strong homeostasis, with fast convergence (a few decades).
¹ Homeostasis comes from the Greek homoios “similar” and stasis “steady”, meaning “staying the same”. In demography, it refers to a population equilibrium maintained by density-dependent checks (Lee, 1987).
In this article, I reinvestigate the question of the strength of the Malthusian trap by examining the speed of convergence of Malthusian economies to their steady state, i.e. the time it takes them to return to their steady state after a shock. I argue that fluctuations in Malthusian times are, by nature, long, as a shock is absorbed through demographic fluctuations, which take time to unfold. For example, the introduction of better cultivation techniques increases income per capita and, in turn, people are expected to marry younger and to start having children earlier in life. Fertility will accordingly increase, but slowly, and not by an enormous margin. Similarly, the increase in per capita income allows more resources to be concentrated on the same number of individuals, slowly improving survival chances and lowering mortality. These are the so-called preventive and positive checks, originally argued by Malthus (1798) himself. This means that any shock to a Malthusian economy is likely to take generations to disappear.
To investigate this conjecture, I first build an overlapping-generations Malthusian growth
model including both preventive and positive checks as means of population adjustment. In
particular, agents first choose to marry (or not), influencing the extensive margin of fertility,
and then choose the number of children within marriage, influencing the intensive margin of
fertility. Both choices depend on income per capita, in a Malthusian fashion. I show that the
speed of convergence of a Malthusian economy to its steady state depends on four parameters:
the land share of output and the elasticities of fertility, marriage and survival with respect to
income per capita. I calibrate the model for England and show that, under plausible parameter
values, the speed of convergence indicates weak homeostasis, with a half-life of about a century.
Using elasticity values estimated in the literature, I find further evidence of weak homeostasis,
of about the same magnitude, for Scandinavian and European countries.
Second, I systematically confront the model predictions with the data, using β-convergence regressions à la Barro and Sala-i Martin (1992). Using the latest version of the Maddison Project historical GDP per capita series and simulated GDP per capita data series from Lagerlöf (2019), I empirically confirm weak homeostasis, of the same magnitude as predicted by my model. Endogeneity issues are addressed using an internal instrument approach (GMM) and controlling for the State History Index of Borcan et al. (2018). Measurement error issues are dealt with using several strategies, including exclusion of the most uncertain part of the data, time averages, and time-interacted regressors. Next, I run the same regressions using McEvedy et al.’s (1978) historical population series and Reba et al.’s (2016) historical urban population series, confirming weak homeostasis and its magnitude.
This article contributes to the growing literature examining the existence and strength of the Malthusian trap (Lee and Anderson, 2002; Nicolini, 2007; Crafts and Mills, 2009; Kelly and Gráda, 2012; Fernihough, 2013; Møller and Sharp, 2014; Lagerlöf, 2015; Madsen et al., 2019; Cummins, 2020; Jensen et al., 2021). For instance, Fernihough (2013) finds evidence of weak homeostasis in Northern Italy using VAR methods. Similarly, Jensen et al. (2021) investigate Malthusian dynamics in Denmark, also using a VAR methodology. I contribute to this literature in two main respects. First, rather than analyzing the case of one specific country using time series methods, I use a panel of Malthusian countries and a β-convergence model that exploits the within-country variations in lagged income per capita or population levels to estimate the speed of convergence. This article is the first, to my knowledge, to provide evidence of weak homeostasis in a panel of Malthusian economies. Second, using the most comprehensive and up-to-date panel data available to study Malthusian economies, I am able to characterize, for the first time, the full distribution of convergence speeds during the Malthusian period. I show that most countries were characterized by weak homeostasis, while highlighting significant differences in the strength of the Malthusian trap. The closest article to mine is Madsen et al. (2019), who find strong homeostasis for a panel of 17 countries (900-1870). The main difference between the two articles lies in the approach to the data and the statistical model. While Madsen et al. (2019) rely on largely interpolated data from heterogeneous historical sources and use a SUR model, I take the data as given and use the techniques and remedies developed in the empirical growth literature, such as fixed-effects models and internal instruments (GMM).
This article also adds to the literature studying Malthusian dynamics in overlapping-generations frameworks. The existing overlapping-generations Malthusian frameworks consider the intensive margin of fertility as the only channel through which population adjusts (Ashraf and Galor, 2011; Lagerlöf, 2019). I build on these previous models by incorporating, for the first time, marriage as an explicit channel through which the population adjusts, as originally argued by Malthus (1798) himself. The marriage channel allows me to model the extensive margin of fertility, as unmarried individuals typically did not have children in the Malthusian era. I can therefore model richer Malthusian population and convergence dynamics.
Finally, this article relates to the literature deriving the speed of convergence in growth
models. Working in continuous time, Irmen (2004) and Szulga (2012) find that the speed of
convergence of a Malthusian economy depends on the land share of output and the elasticities of
the birth rate and death rate to income per capita. I contribute to this literature by showing that
the elasticity of the marriage rate to income per capita also matters to characterize the speed of
convergence. In a modern context, this article relates also to the seminal work of Barro (1991)
and Barro and Sala-i Martin (1992).
The rest of this article is organized as follows. Section 2 presents my Malthusian growth
model. Section 3 presents my calibration exercise, discussing the parameters I use and presenting
my simulations. I also derive the speed of convergence implied by my model and discuss it in
relation to the literature. Section 4 describes my empirical strategy and the data I use to estimate
the speed of convergence. Section 5 presents and discusses my empirical results. Section 6
concludes.
2 Theoretical Framework
To describe the key mechanisms behind the dynamics of GDP per capita and population at
the Malthusian epoch, I first build a theoretical model. I consider an overlapping-generations
economy with time modelled as discrete and going from zero to infinity, and where agents live
two periods. In the first period of their life, they are inactive children entirely supported by
their parents; they make no decisions. In the second period of their life, they work, earn an
income and make decisions about consumption, marriage and fertility.
I deviate from textbook Malthusian models by explicitly modelling marriage, celibacy and childlessness decisions. In brief, this means that I consider both the extensive margin of fertility, i.e. whether or not an individual marries and can have children, and the intensive margin of fertility, i.e. variations in an individual’s number of surviving children within marriage. These two elements are crucial in my model as they directly affect the speed at which a Malthusian economy returns to its steady state after a shock. They are in line with empirical studies showing the importance of the so-called preventive checks, advocated by Malthus (1798) himself, in affecting fertility. Indeed, Cinnirella et al. (2017) show that real wages negatively affect birth spacing within marriage and the timing of marriage and of the first child in England for the period 1540-1850. Cummins (2020) finds similar results, with a negative effect of living standards on the age at first marriage in France between 1650 and 1820. de La Croix et al. (2019) show that singleness and childlessness are key elements to take into account when estimating reproductive success in pre-industrial times. Therefore, modelling both the extensive and intensive margins of fertility appears crucial to a rigorous analysis of population dynamics during the Malthusian era.
I model childlessness and celibacy together, leaving the possibility to procreate only to married agents. This is fully consistent with historical studies showing very low illegitimate birth rates in pre-industrial Europe (Hajnal, 1965; Segalen and Fine, 1988; Wrigley et al., 1989). Marriage offers the opportunity for agents to gain utility from a source other than pure consumption.² On the other hand, the disutility of marriage is represented by a search cost that agents need to pay in order to match with a partner.³ Agents are assumed to be heterogeneous in their search cost, which is exogenously given. At the beginning of their adult life, agents draw a search cost $\lambda_i$ with $\lambda_i \sim U(1, b)$, where $b$ is the maximum of the uniform distribution. Agents maximize their utility and therefore a marriage occurs only if the utility of being married exceeds the utility of being single. Within marriage, I let the agent’s fertility depend on his income per capita, in line with standard Malthusian theory and empirical evidence (Cinnirella et al., 2017; de La Croix et al., 2019; Cummins, 2020).
Preferences and Budget Constraints. The utility of a married agent $i$ of generation $t$ is defined à la Baudin et al. (2015):
$$U^{M}_{i,t} = \ln c_t + \gamma \ln (n_t + \nu) - \ln \lambda_i, \tag{1}$$
where $c_t$ denotes consumption, $\gamma > 0$ is a child preference parameter, $n_t$ is the number of surviving children, $\nu > 0$ allows for childlessness as the individual utility remains defined when $n_t = 0$, and $\lambda_i$ is the utility cost of marriage.
It follows that the utility of an unmarried agent of generation $t$ is given by:
$$U^{S}_{i,t} = \ln c_t + \gamma \ln (\nu). \tag{2}$$
Agents allocate their income between consumption and child rearing such that we have the following budget constraint:
$$c_t = y_t - f(n_t), \tag{3}$$
² This means that parents only care about the quantity of surviving children, as in a standard Malthusian model.
³ Alternatively, one can think of the cost as representing a dowry that agents need to pay in order to marry.
where $y_t$ is the agent’s income and $f(n_t)$ is the cost of having $n_t$ children in terms of goods.
A convenient functional form for $f(\cdot)$, capturing the idea of childlessness ($f(0) = 0$) and allowing for potentially non-constant returns to scale in the production of children, is the following:
$$f(n_t) = q\,(n_t + \nu)^{1/\delta} - q\,\nu^{1/\delta}, \tag{4}$$
with $q > 0$ being the unitary cost of a child and $\delta > 0$ a parameter influencing the degree of returns to scale in child production.
Fertility. Maximizing (1) subject to (3) and (4), I obtain the optimal fertility behaviour of a married agent of generation $t$:
$$n_t = \kappa \cdot \left( y_t + q\,\nu^{1/\delta} \right)^{\delta} - \nu \equiv n_t(y_t), \tag{5}$$
where $\kappa = \left( \frac{\gamma\delta}{q\gamma\delta + q} \right)^{\delta}$. Thus, in accordance with Malthusian theory, the number of surviving children within marriage depends positively on income per capita ($\partial n_t / \partial y_t > 0$).
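The first-order condition behind (5) can be checked numerically. The following is a minimal sketch, assuming the benchmark values reported in Table 1 and the steady-state income of 20,108 quoted in Section 3.1. The function names are illustrative and not taken from the paper, and the expression used for $\kappa$ is the one implied by maximizing (1) subject to (3) and (4).

```python
import math

# Benchmark values from Table 1 (gamma, q, delta, nu).
GAMMA, Q, DELTA, NU = 1.0, 1.0, 0.09, 0.33
Y_STAR = 20108.0  # steady-state income per capita quoted in Section 3.1


def child_cost(n):
    """Goods cost of n surviving children, equation (4); zero when n = 0."""
    return Q * (n + NU) ** (1.0 / DELTA) - Q * NU ** (1.0 / DELTA)


def fertility(y):
    """Optimal surviving children of a married agent, equation (5)."""
    kappa = (GAMMA * DELTA / (Q * GAMMA * DELTA + Q)) ** DELTA
    return kappa * (y + Q * NU ** (1.0 / DELTA)) ** DELTA - NU


def married_utility(n, y):
    """Equation (1) without the search-cost term, which does not depend on n."""
    return math.log(y - child_cost(n)) + GAMMA * math.log(n + NU)


n_star = fertility(Y_STAR)
print(round(n_star, 2))  # to be compared with the fertility target in Table 1
```

Evaluating `married_utility` slightly above and below `n_star` is a quick way to confirm that the closed form is indeed a maximum, and evaluating `fertility` at a higher income confirms the positive income effect.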
Marriage. An agent is indifferent between being married and single if utility is the same in both situations. I define $\bar{\lambda}$ as the draw from the search cost distribution that makes an agent indifferent between being married and single. The condition for an agent to be married is $\lambda_i < \bar{\lambda}$, with $\lambda_i \sim U(1, b)$. I can therefore compute the probability for an agent of generation $t$ to be married as:
$$p_t = P(\lambda_i < \bar{\lambda}) = \frac{\bar{\lambda}(y_t) - 1}{b - 1} \equiv p_t(y_t), \tag{6}$$
where $b$ is the maximum of the uniform distribution and the threshold draw $\bar{\lambda}$ depends on an individual’s income.⁴ Since I work at the generation level, $p_t$ is also equivalent to the marriage rate in that Malthusian economy. In the rest of the article, I will use $p_t$ as the marriage rate.
⁴ The full expression of $\bar{\lambda}$ is available in Section A of the Appendix.
Thus, in accordance with the idea of Malthus (1798), an increase in income lowers the age at marriage, resulting in a higher marriage rate at the generation level in our model ($\partial p_t / \partial y_t > 0$).
Production. Total output in period $t$ is given by:
$$Y_t = (A_t T)^{\alpha} L_t^{1-\alpha}, \tag{7}$$
where $A_t$ is a land-augmenting technology factor, $T$ is total land area, $L_t$ is the size of the labour force, which is equivalent to the adult population in my analysis, and $\alpha \in (0, 1)$ is the land share of output.
I assume that workers are self-employed and earn an income equal to the output per worker in $t$. Using (7) and normalizing land area to unity ($T = 1$), we obtain:
$$y_t = \left( \frac{A_t}{L_t} \right)^{\alpha}. \tag{8}$$
Following Lagerlöf (2019), I consider sustained but constant growth in land productivity. The technological level in period $t$ is given by:
$$A_t = A_0 (1 + g)^{t}, \tag{9}$$
where $A_0$ is the initial technological level and $g$ is an exogenously given and constant rate of technological progress.
Mortality. Malthus (1798) and Malthusian theory assert that population adjusts via the so-called positive and preventive checks. My model includes the two types of Malthusian population adjustment: (i) preventive checks, as both the decision to marry and the number of children within marriage result from agents’ optimization, and (ii) positive checks, as I model the survival rate of adult agents as directly depending on their income in the following way:
$$s_t = \min\left( \bar{s},\; \underline{s}\, y_t^{\phi} \right), \tag{10}$$
where $\bar{s}$ is the maximal survival rate, $\underline{s}$ is a parameter calibrated to target an initial survival rate and $\phi$ is the elasticity of the survival rate to income per capita. Thus, in accordance with Malthusian theory, adult survival is increasing in income as long as $\bar{s} > \underline{s}\, y_t^{\phi}$, since $\underline{s} > 0$ and $\phi > 0$.
Population Dynamics. The size of the population of the next generation $t + 1$ is given by:
$$L_{t+1} = n_t\, p_t\, s_t\, L_t. \tag{11}$$
Income per capita Dynamics. Forwarding (8) to period $t + 1$ and using (8), (9) and (11), I obtain a first-order difference equation giving the income per capita of the next generation:
$$y_{t+1} = \left( \frac{1 + g}{n_t(y_t)\, p_t(y_t)\, s_t(y_t)} \right)^{\alpha} \cdot y_t \equiv \psi(y_t). \tag{12}$$
Steady State. Expression (12) is a non-monotonic function of $y_t$ with a unique inflexion point. Provided that the initial income per capita $y_0$ is not too low, it is possible to demonstrate that $\psi(y_t)$ has a unique and globally stable interior steady state $y^{*}$ implicitly defined by:
$$\left( \frac{1 + g}{n(y^{*})\, p(y^{*})\, s(y^{*})} \right)^{\alpha} = 1. \tag{13}$$
Proof. See Section A of the Appendix.
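The transition map (12) and the steady-state condition (13) can be illustrated numerically. The sketch below is a minimal example under stated assumptions: parameter values are the benchmark values of Table 1; the marriage threshold is obtained by equating (1) and (2) with unmarried agents consuming all their income, which may differ in form from the expression given in the Appendix; and the cap on the survival rate, which the text does not report, is set to one. All function and variable names are illustrative.

```python
# Benchmark parameter values from Table 1.
GAMMA, Q, DELTA, PHI, ALPHA = 1.0, 1.0, 0.09, 0.13, 0.5
G, S_LOW, NU, B = 0.023, 0.196, 0.33, 5.96
S_MAX = 1.0        # assumption: the maximal survival rate is not reported in the text
Y_STAR = 20108.0   # steady-state income per capita targeted in Section 3.1


def fertility(y):
    """Equation (5): surviving children per married agent."""
    kappa = (GAMMA * DELTA / (Q * GAMMA * DELTA + Q)) ** DELTA
    return kappa * (y + Q * NU ** (1.0 / DELTA)) ** DELTA - NU


def marriage_rate(y):
    """Equation (6), with the threshold derived by equating (1) and (2)."""
    n = fertility(y)
    c_married = y - (Q * (n + NU) ** (1.0 / DELTA) - Q * NU ** (1.0 / DELTA))
    threshold = (c_married / y) * ((n + NU) / NU) ** GAMMA  # assumption, see lead-in
    return min(1.0, max(0.0, (threshold - 1.0) / (B - 1.0)))


def survival_rate(y):
    """Equation (10)."""
    return min(S_MAX, S_LOW * y ** PHI)


def psi(y):
    """Equation (12): next generation's income per capita."""
    growth = fertility(y) * marriage_rate(y) * survival_rate(y)
    return ((1.0 + G) / growth) ** ALPHA * y


# Iterate the map from above and from below the targeted steady state.
y_high, y_low = 1.5 * Y_STAR, 0.7 * Y_STAR
for _ in range(200):
    y_high, y_low = psi(y_high), psi(y_low)

print(psi(Y_STAR) / Y_STAR)   # close to 1 if Y_STAR is (nearly) a fixed point
print(y_high, y_low)          # both paths should settle on the same income level
```

Because the parameters in Table 1 are rounded, the fixed point found by iteration need not coincide exactly with the targeted income level; a small gap is expected.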
3 Quantitative Analysis
In this section, I analyse the speed of convergence of a representative Malthusian economy
using my theoretical model. I start by discussing the identification of the parameters that I
use to calibrate my Malthusian model. I then discuss the simulation results of my calibration
exercise. Finally, I derive the speed of convergence implied by my model and discuss it with
respect to the literature.
3.1 Identification of the Parameters
In order to simulate the evolution of a representative Malthusian economy and study its speed of
convergence, I first set the value of some parameters a priori, while some others are set to match
some target following an exact identification procedure. I focus on England as the literature
already provides a rich array of parameter values for that economy during the Malthusian
period. Table 1 summarizes and explains my calibration strategy.
Table 1: Benchmark Parameter Values
Parameter         Value   Interpretation and comments
$t$               25      Number of years per generation. Fixed a priori
$\gamma$          1       Preference for children. Fixed a priori
$q$               1       Unitary cost of a child. Fixed a priori
$\delta$          0.09    Gives preventive checks-income per capita elasticity of 0.22. Fixed a priori
$\phi$            0.13    Gives positive checks-income per capita elasticity of 0.13. Fixed a priori
$\alpha$          0.5     Land share of output. Fixed a priori
$g$               0.023   Rate of technological progress per generation. Fixed a priori
$\underline{s}$   0.196   Minimum of the survival rate. To match $s^{*} = 0.71$
$\nu$             0.33    Child quantity preference parameter. To match $n^{*} = 1.62$
$b$               5.96    Maximum of the search cost distribution. To match $p^{*} = 0.89$
Notes: See text for more details on the sources.
First, the length of a period or generation $t$ is fixed at 25 years, meaning that an agent lives at most 50 years in my model.⁵ This is in line with life expectancy figures in pre-industrial England as reported by Wrigley et al. (1997). Life expectancy at the age of 20 was as high as 33-34 years over the period 1550-1799. Conditional on their survival until the age of 20, Malthusian agents therefore have good chances of reaching the age of 50. This is also in line with the evidence on the so-called European Marriage Pattern (EMP) from Hajnal (1965). Indeed, the EMP is characterized by a late age at first marriage for women (between the ages of 24 and 26) and low illegitimate birth rates. In my setting, agents marry and procreate only in the second period of their life, that is to say between the ages of 25 and 50, as indicated by the EMP.
Next, I normalize $\gamma$ and $q$, respectively the agent’s preference for children and the cost of raising a child, to one.
⁵ de la Croix and Gobbi (2017) make a similar assumption in a modern context with developing economies.
Elasticity parameters $\delta$ and $\phi$ are particularly important in my setting, as they directly affect the speed of convergence in my model (see Section 3.3). Since I am working at the generation level, those parameters represent respectively the long-run elasticities of the preventive checks (fertility and marriage) and the long-run elasticities of the positive checks (survival) to income per capita. The empirical literature on Malthusian dynamics provides various estimates of such long-run elasticities based on wage, Crude Birth Rate (CBR), Crude Marriage Rate (CMR) and Crude Death Rate (CDR) time series. Such estimates are available for England (Lee, 1981; Lee and Anderson, 2002; Klemp, 2012; Møller and Sharp, 2014), Northern Italy (Fernihough, 2013), Scandinavian countries (Lagerlöf, 2015; Klemp and Møller, 2016) and Germany (Pfister and Fertig, 2020).
For England, long-run elasticities range from 0.12 to 0.32 for the preventive checks and from 0.08 to 0.22 for the positive checks. I set $\delta = 0.09$ and $\phi = 0.13$ in my benchmark specification to match the mean of the long-run elasticities provided by the aforementioned literature for England. This corresponds to a long-run elasticity of 0.22 for the preventive checks and 0.13 for the positive checks. In my model, the value of the long-run elasticity of the positive checks is directly given by $\phi$, as equation (10) corresponds to the constant-elasticity case. For the long-run elasticity of the preventive checks, I fix $\delta$ such that the sum of the elasticities of fertility and marriage with respect to income per capita is equal to the targeted value (see Section B of the Appendix for more details).
Setting $\delta < 1$ means that my model considers decreasing returns to scale in the production of children, while most standard Malthusian models assume constant returns to scale ($\delta = 1$).⁶ As pointed out by Lagerlöf (2019), we may interpret decreasing returns to scale in the production of children as stemming from an implicit production function for child survival featuring two inputs: parental time devoted to each child and each child’s food intake. More children automatically yield less time per child, leading to an increase in the per-child amount of the consumption good necessary to ensure the survival of each child. Furthermore, the aforementioned empirical literature consistently finds values well below unity for the long-term elasticities of the preventive and positive checks. For instance, using exogenous cross-county variations in Swedish harvests between 1816 and 1856, Lagerlöf (2015) finds long-run elasticities of fertility, marriage and mortality of 0.1, 0.16 and -0.09, respectively.
⁶ See, for instance, Ashraf and Galor (2011).
The land share of output ($\alpha$) for England is set at 0.5, corresponding to its estimated long-run value for the Malthusian period (Federico et al., 2020).
In standard Malthusian models with constant technological progress, total population at the steady state is not constant. In fact, (13) shows that population grows at the same pace as technology; this is a necessary condition to keep income per capita constant at the steady state. Consequently, $g$ is calibrated using the 25-year average population growth in Campbell et al. (2015) data for the period 1270-1675.
Consider next the three remaining parameters, $\underline{s}$, $\nu$ and $b$, which are calibrated to match respectively the steady-state survival rate for adults ($s^{*}$), the agent’s steady-state fertility ($n^{*}$) and the steady-state marriage rate ($p^{*}$) following an exact identification procedure. The first target $s^{*}$ is set to 0.71 as in Wrigley (1968). This corresponds to the survival rate from age 25 to age 50 for the period 1538-1624 in England. The second target $p^{*}$ is set to 0.89, which corresponds to a percentage of never-married women of 11% as reported by Dennison and Ogilvie (2014) for England. This figure is the average of the percentage of never-married women for England across 45 historical studies and is also very close to the value reported in the seminal study of Wrigley et al. (1989). Knowing the first two targets, the third target $n^{*}$ is given by the steady-state condition in (13). I also set the steady-state level of income per capita $y^{*}$ at 20,108 (2013 British pounds). This corresponds to the 1300-1325 average GDP per capita of England cumulated over one generation (25 years) using Campbell et al. (2015) data. I adjust the initial level of technology $A_0$ in (9) to reach the desired level of $y^{*}$.
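The exact identification step can be sketched as an inversion of the steady-state relations. This is a minimal illustration, not the calibration code used for the paper: it takes the targets and fixed parameters quoted above, uses the marriage threshold obtained by equating (1) and (2) (an assumption, since the full expression is in the Appendix), and normalizes the initial adult population to one (also an assumption) to back out $A_0$. Variable names are illustrative.

```python
# Parameters fixed a priori (Table 1) and steady-state targets from the text.
GAMMA, Q, DELTA, PHI, ALPHA, G = 1.0, 1.0, 0.09, 0.13, 0.5, 0.023
S_TARGET, P_TARGET, Y_STAR = 0.71, 0.89, 20108.0

# Third target: steady-state condition (13) requires n * p * s = 1 + g.
n_target = (1.0 + G) / (P_TARGET * S_TARGET)

# Survival scale parameter from equation (10) evaluated at the steady state.
s_low = S_TARGET / Y_STAR ** PHI


def fertility(y, nu):
    """Equation (5) for a given value of nu."""
    kappa = (GAMMA * DELTA / (Q * GAMMA * DELTA + Q)) ** DELTA
    return kappa * (y + Q * nu ** (1.0 / DELTA)) ** DELTA - nu


# Solve fertility(Y_STAR, nu) = n_target for nu by bisection;
# fertility is decreasing in nu over this bracket.
lo, hi = 0.01, 1.0
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if fertility(Y_STAR, mid) > n_target:
        lo = mid
    else:
        hi = mid
nu = 0.5 * (lo + hi)

# Upper bound of the search-cost distribution from equation (6).
c_married = Y_STAR - (Q * (n_target + nu) ** (1.0 / DELTA) - Q * nu ** (1.0 / DELTA))
threshold = (c_married / Y_STAR) * ((n_target + nu) / nu) ** GAMMA  # assumption
b = 1.0 + (threshold - 1.0) / P_TARGET

# Initial technology from equation (8), with the initial population set to one.
L0 = 1.0  # assumption: normalization
A0 = L0 * Y_STAR ** (1.0 / ALPHA)

# Compare with the calibrated values reported in Table 1.
print(round(n_target, 2), round(s_low, 3), round(nu, 2), round(b, 2))
```

Each parameter is pinned down by exactly one target, which is what makes the procedure an exact identification rather than an estimation.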
3.2 Simulation Results
Before looking at the speed of convergence itself, this section shows the overall ability of my model to reproduce Malthusian dynamics. To do so, I simulate a Black Death-like shock killing 60% of the population at $t = 5$. The size of the population shock is taken from Campbell et al. (2015) and corresponds to the lowest population level observed in England after the Black Death, to take into account the diffusion process of the plague and its multiple resurgences. This figure is also consistent with Benedictow et al. (2004), who find an overall mortality of 62.5% for England. Figure 1 shows the evolution of income per capita ($y_t$), fertility ($n_t$), the marriage rate ($p_t$) and the survival rate ($s_t$) under our benchmark parametrization across 20 generations.
Standard Malthusian theory predicts that an exogenous negative shock to the population level (such as the Black Death) increases income per capita in the short run only.⁷ After the shock, population increases and the economy gradually converges back to its steady state such that, in the long run, income per capita is constant. This is, by construction, what I observe in my model. Figure 1 shows that, right after the plague onset, the surviving agents indeed enjoy a temporarily higher level of income per capita. Those better material conditions mean that agents have better chances of surviving, marry more and are able to raise more surviving children within marriage. This translates into faster population growth, which in turn triggers the convergence process of income per capita to its steady state. In Figure 1, I also display the half-life of convergence for my benchmark specification. The half-life is about 4 generations, meaning that any shock to the English Malthusian economy is persistent across several generations (see Section 3.3 for a complete discussion of the speed of convergence).
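The experiment described above can be sketched by iterating the transition map (12) after a one-off population loss. The following is an illustrative simulation under stated assumptions: benchmark values from Table 1, a marriage threshold obtained by equating (1) and (2), and a survival cap of one (the text does not report it). With $y_t = (A_t / L_t)^{\alpha}$, killing 60% of adults multiplies income per capita by $0.4^{-\alpha}$ on impact. The half-life computed here is measured on log deviations with linear interpolation between generations, which is a choice made for this sketch and need not coincide with the value displayed in Figure 1.

```python
import math

GAMMA, Q, DELTA, PHI, ALPHA = 1.0, 1.0, 0.09, 0.13, 0.5   # Table 1
G, S_LOW, NU, B = 0.023, 0.196, 0.33, 5.96                # Table 1
S_MAX = 1.0                                               # assumption: cap not reported
SHOCK_PERIOD, DEATH_SHARE, PERIODS = 5, 0.60, 20


def psi(y):
    """Transition map (12) built from equations (5), (6) and (10)."""
    m = (GAMMA * DELTA / (Q * GAMMA * DELTA + Q) * (y + Q * NU ** (1 / DELTA))) ** DELTA
    n = m - NU
    c_married = y - (Q * m ** (1 / DELTA) - Q * NU ** (1 / DELTA))
    threshold = (c_married / y) * (m / NU) ** GAMMA       # assumption: from (1) = (2)
    p = min(1.0, max(0.0, (threshold - 1.0) / (B - 1.0)))
    s = min(S_MAX, S_LOW * y ** PHI)
    return ((1.0 + G) / (n * p * s)) ** ALPHA * y


# Locate the steady state by iterating the map from the targeted income level.
y_ss = 20108.0
for _ in range(500):
    y_ss = psi(y_ss)

# Simulate: steady state until the shock, then let the map run.
path = []
y = y_ss
for t in range(PERIODS + 1):
    if t == SHOCK_PERIOD:
        y *= (1.0 - DEATH_SHARE) ** (-ALPHA)   # fewer workers on the same land
    path.append(y)
    y = psi(y)

# Half-life: generations needed for the log deviation to fall to half its impact value.
gaps = [math.log(v / y_ss) for v in path[SHOCK_PERIOD:]]
k = next(i for i, gap in enumerate(gaps) if gap <= 0.5 * gaps[0])
half_life = (k - 1) + (gaps[k - 1] - 0.5 * gaps[0]) / (gaps[k - 1] - gaps[k])

index = [100.0 * v / y_ss for v in path]   # income per capita, steady state = 100
print(round(index[SHOCK_PERIOD], 1), round(half_life, 2))
```

Working directly with the map for $y_t$ avoids tracking $A_t$ and $L_t$ separately, since only their ratio matters for income per capita.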
Figure 1: Responses of the English Malthusian Economy to a Black Death
[Figure: four panels plotting GDP per capita, surviving children per adult, the marriage rate and the survival rate against generations (0 to 20); annotation: half-life = 3.98 generations.]
Notes: This figure plots the response of income per capita (top-left panel), fertility (top-right panel), marriage (bottom-left panel) and survival (bottom-right panel) to a Black Death-like shock killing 60% of the population at $t = 5$.
Figure 2 evaluates the ability of my model to replicate the dynamics of income per capita after the Black Death, using English historical GDP per capita data from Campbell et al. (2015). To do so, I first extract the cyclical component in the data using a Hodrick–Prescott filter.⁸ This is necessary as my model analyses the dynamics of convergence to a unique and fixed steady state. By contrast, fluctuations in the data might reflect changes in the position of the Malthusian steady state, as well as transitional dynamics. As argued by North and Thomas (1973) and Acemoglu and Robinson (2012), the Black Death might have affected the steady state of the English economy through institutional changes. Figure 2 shows that my model generates a path for GDP per capita very similar to the cyclical component of the data in the years following the Black Death. This is remarkable as the transitional dynamics in my model are governed only by the long-run elasticities provided by the aforementioned empirical literature.
⁷ Jedwab et al. (2022) find evidence that the Black Death was indeed a plausibly exogenous shock to the European economy.
⁸ I set the smoothing parameter to 500 given that I use generations of 25 years.
Figure 2: GDP per capita Dynamic after the Black Death: Simulated Path vs. Data
[Figure: cyclical component of GDP per capita (1300s = 100) plotted against years 1300 to 1700, with one line for the data and one for the simulation.]
Notes: This figure plots the cyclical component of GDP per capita after the Black Death from Campbell et al. (2015) (solid line) and from our benchmark simulation (dashed line). The cyclical component is obtained using a Hodrick–Prescott filter with a smoothing parameter of 500. We normalize data on the period 1300-1325, the last period before the occurrence of the Black Death in England (1348).
3.3 The Speed of Convergence
In my model, the speed at which GDP per capita converges to its steady state is given by:
$$\beta = -\alpha \left( \epsilon_{n_t} + \epsilon_{p_t} + \epsilon_{s_t} \right), \tag{14}$$
where $\epsilon_{n_t}$, $\epsilon_{p_t}$ and $\epsilon_{s_t}$ are the elasticities of fertility, marriage and survival with respect to income per capita. See Section C in the Appendix for additional details on the derivation of the speed of convergence. It is hence possible to fully characterize the speed of convergence of a Malthusian economy using only elasticities. Similar results are found by Irmen (2004) and Szulga (2012) in continuous time.
Table 2 gives the speed of convergence for different parametrizations of my model. Under
my benchmark parametrization, the speed of convergence is about 17% per generation.
Convergence to the steady state is hence slow: it takes about four generations, or one century, for
the English Malthusian economy to absorb half of a shock. This is in line with much of the
literature, which finds evidence of weak homeostasis (Lee, 1993; Lee and Anderson, 2002; Crafts
and Mills, 2009; Fernihough, 2013; de la Croix and Gobbi, 2017; Bouscasse et al., 2021; de la
Croix and Gobbi, 2022). Using equation (14), I also compute the speed of convergence for
Denmark, Norway, Sweden and European Malthusian societies thanks to the long-run
elasticities provided by Galloway (1988), Lagerlöf (2015) and Klemp and Møller (2016). I find
half-lives ranging between 48 and 115 years, pointing once again towards weak homeostasis. All
in all, my benchmark falls exactly at the median of the half-lives reported in Table 2 (mean of
126; standard deviation of 101). Looking at the studies focused on England, my benchmark
appears close to the lower tail of the half-lives found in the literature. In particular, I am very
close to Lee and Anderson (2002), who find a half-life of 107 years for the period 1540-1870.
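The conversion from a per-generation speed of convergence to a half-life in years can be sketched as follows. The sketch assumes the continuous-time formula ln(2)/β and a generation length of 25 years (inferred from the paper's remark that 50 years correspond to two generations); both are assumptions that happen to reproduce the benchmark row of Table 2, not a statement of the author's exact computation.

```python
import math

# Assumption: one generation lasts 25 years (50 years = two generations).
YEARS_PER_GENERATION = 25


def half_life_in_generations(beta):
    """Generations needed to absorb half of a shock at speed beta per generation."""
    return math.log(2.0) / beta


def half_life_in_years(beta, years_per_generation=YEARS_PER_GENERATION):
    """Same half-life expressed in calendar years."""
    return half_life_in_generations(beta) * years_per_generation


benchmark_beta = 0.175  # |beta| of the benchmark specification in Table 2
print(round(half_life_in_generations(benchmark_beta), 2))  # 3.96 generations
print(round(half_life_in_years(benchmark_beta)))           # 99 years
```

Other rows of Table 2 may differ from this formula by a year or so, possibly because the table reports rounded values of β.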
To see under which conditions my model can generate a speed of convergence compat-
ible with strong homeostasis for England, I consider three deviations from my benchmark
parametrization. The logic is to gradually push the long-run elasticities towards the upper-
bounds provided by the literature. Doing so, I am able to compute the highest speed of conver-
gence that the English Malthusian economy can achieve under plausible elasticity values. I find
that the lowest half-life for England is 54 years. This is almost twice as fast as my benchmark,
but still well above strong homeostasis, represented by a half-life of 30 years or less as in Madsen
et al. (2019). This gives further evidence that any shock on a Malthusian economy is persistent
over several generations, even if the Malthusian trap mechanism remains binding in the long
run.
Table 2: Speed of Convergence in our Model and in the Literature
| Country | Value of Parameters | \|β\| | Half-Life (years) | Comments |
|---|---|---|---|---|
| This Study: | | | | |
| England | δ = 0.09, ϕ = 0.13 and α = 0.5 | 0.175 | 99 | Benchmark specification |
| England | δ = 0.11, ϕ = 0.22 and α = 0.5 | 0.273 | 64 | With upper-bound elasticity values |
| England | δ = 0.09, ϕ = 0.13 and α = 0.6 | 0.21 | 83 | With the upper-bound land share of output |
| England | δ = 0.11, ϕ = 0.22 and α = 0.6 | 0.327 | 54 | With upper-bound elasticity and land share of output |
| Denmark (1821-1890) | | | 84 | Using (14), α = 0.5 and elasticities reported by Klemp and Møller (2016) |
| Norway (1775-1853) | | | 91 | Using (14), α = 0.5 and elasticities reported by Klemp and Møller (2016) |
| Sweden (1775-1873) | | | 48 | Using (14), α = 0.5 and elasticities reported by Klemp and Møller (2016) |
| Sweden (1816-1870) | | | 99 | Using (14), α = 0.5 and elasticities reported by Lagerlöf (2015) |
| Europe (1540-1870) | | | 115 | Using (14), α = 0.5 and elasticities reported by Galloway (1988) |
| Estimated Half-Lives from Other Studies: | | | | |
| England (1541-1870) | | | 431 | Crafts and Mills (2009) |
| Sub-Saharan Africa (1990) | | | 198 | de la Croix and Gobbi (2022) |
| England (1250-1870) | | | 150 | Bouscasse et al. (2021) |
| Northern Italy (1650-1881) | | | 112 | Fernihough (2013) |
| England (1540-1870) | | | 107 | Lee and Anderson (2002) |
| Developing countries (1990) | | | 100 | de la Croix and Gobbi (2017) |
| England (1540-1870) | | | 70 | Lee (1993) |
| 17 countries (1470-1870) | | | 29 | Madsen et al. (2019); article claiming strong homeostasis |
Notes: See text for more details on parameter values and sources.
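The summary statistics quoted in the text for Table 2 (mean of 126, standard deviation of 101) can be checked with a few lines of Python. Which rows enter the computation is not stated in the text; the selection below (the five non-England rows computed with equation (14) plus the eight estimates from other studies) is an inference, retained only because it reproduces the reported figures.

```python
import statistics

# Half-lives (years) from Table 2. Assumption: the England rows of
# "This Study" are excluded from the comparison set.
other_half_lives = [84, 91, 48, 99, 115,                   # equation (14), other countries
                    431, 198, 150, 112, 107, 100, 70, 29]  # other studies
benchmark = 99  # England, benchmark specification

mean_hl = statistics.mean(other_half_lives)
sd_hl = statistics.stdev(other_half_lives)  # sample standard deviation
median_with_benchmark = statistics.median(other_half_lives + [benchmark])

print(round(mean_hl), round(sd_hl), median_with_benchmark)  # 126 101 99.5
```

Under this selection the benchmark of 99 years sits at the middle of the ordered half-lives, consistent with the statement that it falls at the median.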
4 Empirical Framework
In this section, I start by presenting the data I use to estimate the speed of convergence of
Malthusian economies. Next, I detail my main estimating equation and discuss the potential
threats to my identification strategy.
4.1 Data
In my analysis, I use two kinds of datasets: (i) GDP per capita series (either historical or sim-
ulated) and (ii) historical population series (either total or urban population). Historical GDP
per capita series come from the Maddison Project Database (Bolt and Van Zanden, 2020).
Building on the pioneering work of Maddison (2003), the Maddison Project provides standard-
ized historical GDP per capita series running over several centuries. These series are regularly
updated and enriched by researchers in the field of historical national accounting. To limit mea-
surement error issues, I focus on the period 1000 CE - 1800 CE and consider only countries
with good data availability i.e. GDP per capita data available every year or every ten years
before 1800 CE. Following these two criteria, twelve Malthusian economies are considered,
including core (e.g. Italy, England, China) and more peripheral (e.g. Mexico, Poland, Sweden)
Malthusian economies. Simulated GDP per capita series come from Lagerlöf (2019). Lagerlöf
(2019) shows that a Malthusian model with stochastic and accelerating growth in land produc-
tivity is able to match the moments of historical GDP per capita series presented in Fouquet and
Broadberry (2015). Simulations are available for 1,000 model economies and 501 years, mak-
ing it very useful to circumvent the lack of GDP per capita data inherent to the pre-industrial
period. From an econometric point of view, it corresponds to an ideal setting where both the
cross-sectional and the time dimensions are large, limiting the bias of the different estimators
on the speed of convergence.
Historical population series come from various sources. First, considering total population
figures, I use McEvedy et al.’s (1978) data. Population series from that source have been
widely used to address various questions in the comparative development literature, with most
of the contributions exploiting cross-country variations over a few years (Acemoglu et al., 2001;
Nunn, 2008; Nunn and Qian, 2011; Ashraf and Galor, 2011, 2013).⁹ My objective, on the
other hand, is to exploit within-country population changes, and I therefore coded McEvedy
et al.’s (1978) data in their full panel dimension. Despite their wide use in the literature, McEvedy
et al.’s (1978) data are also heavily criticized, mostly for measurement error issues (Guinnane,
2021). In order to mitigate this problem, I use only a specific time frame and set of countries.
First, I consider only the period between 1000 CE and 1750 CE. It corresponds to a period
recognized as Malthusian and avoids the sizeable uncertainty on population figures surrounding
the end of the Roman Empire and the Early Middle Ages. Second, within the selected period, I
keep only countries for which population figures are reported with the maximum frequency,
i.e. every century before 1600 CE and every half-century after 1600 CE. Following those two
criteria, eighteen countries are considered, with a majority of European countries. Turning to
historical urban population series, I use Reba et al. (2016), who compiled and geocoded
Chandler’s (1987) and Modelski’s (2003) figures. In particular, the database provides population
levels for cities all around the world from 3700 BC to 2000 CE. I apply the same procedure
as for the Maddison Project’s data or McEvedy et al.’s (1978) data, namely I first select urban
population levels within the 1000 CE - 1800 CE period.¹⁰ Then, I focus on cities with good
data availability, i.e. cities with a population figure available for at least seven half-centuries
(out of the seventeen potentially available) between 1000 CE and 1800 CE.¹¹
⁹ For example, Ashraf and Galor (2011) use McEvedy et al.’s (1978) data as dependent variable for three
periods: 1 CE, 1000 CE and 1500 CE.
¹⁰ When both Chandler and Modelski estimates are available for the same city and year, we take the average
between the two figures. This is the case for 20 cities, only for year 1000 CE.
¹¹ That threshold corresponds to the median of data availability.
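The city selection rule described above (average duplicate estimates for the same city and year, then keep cities with at least seven observations between 1000 CE and 1800 CE) can be sketched as follows. The record layout, city names and population numbers are made up for illustration and do not come from Reba et al. (2016).

```python
from collections import defaultdict
from statistics import mean

MIN_OBSERVATIONS = 7  # threshold quoted in the text


def merge_sources(records):
    """Average estimates that share the same city and year."""
    grouped = defaultdict(list)
    for city, year, population in records:
        grouped[(city, year)].append(population)
    return {key: mean(values) for key, values in grouped.items()}


def select_cities(merged, first_year=1000, last_year=1800,
                  min_obs=MIN_OBSERVATIONS):
    """Return the cities with at least min_obs observations in the window."""
    counts = defaultdict(int)
    for city, year in merged:
        if first_year <= year <= last_year:
            counts[city] += 1
    return sorted(city for city, n in counts.items() if n >= min_obs)


# Hypothetical toy records: (city, year, population estimate).
records = [("Alpha", year, 30000 + 10 * (year - 1000))
           for year in range(1000, 1800, 100)]
records.append(("Alpha", 1000, 40000))  # second source for the same city-year
records += [("Beta", 1500, 12000), ("Beta", 1600, 15000), ("Beta", 1700, 18000)]

merged = merge_sources(records)
selected = select_cities(merged)
print(selected)  # ['Alpha']
```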
4.2 Empirical Strategy
To empirically assess the speed of convergence of a Malthusian economy to its steady state, I rely
on standard β-convergence models. Such models have been extensively used in the growth
literature to quantify the speed at which modern economies converge to their steady state (Barro,
1991; Barro and Sala-i Martin, 1992; Islam, 1995; Caselli et al., 1996; Barro, 2015). More
recently, that framework has also been used in the Malthusian context (Madsen et al., 2019).
My main specification is the following dynamic panel:
$$\frac{\ln(y_{i,t}) - \ln(y_{i,t-\tau})}{\tau} = \beta \ln(y_{i,t-\tau}) + \gamma X_{i,t} + \delta_t + \alpha_i + \varepsilon_{i,t}, \qquad (15)$$
where $i = 1, ..., N$ indicates my unit of analysis, which can be either a country or a city, and
$t = 1, ..., T$ corresponds to a year. The left-hand side corresponds to the growth rate of my
variable of interest $y$, which can be either GDP per capita or population levels. The parameter
$\tau$ indicates the number of years between two available $y$ in the data, such that my dependent
variable is always the average annual growth rate of $y$ between period $t-\tau$ and $t$. To handle
gaps, measurement error and to avoid spurious changes in the data, I target a minimal $\tau$ of 50
years.¹² The right-hand side is composed of the lagged dependent variable $y_{i,t-\tau}$, a vector of
control variables $X_{i,t}$, time fixed effects $\delta_t$, country or city fixed effects $\alpha_i$ and an idiosyncratic
error term $\varepsilon_{i,t}$.
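As an illustration of how the left-hand side of equation (15) can be built from raw series, the sketch below first averages observations within 50-year windows and then computes the average annual log growth rate between consecutive windows. The window alignment, the function names and the toy numbers are assumptions made for the example; the paper does not spell out these implementation details.

```python
import math


def block_averages(series_by_year, start, end, width=50):
    """Average the observations falling in consecutive windows [s, s + width)."""
    averages = {}
    for window_start in range(start, end, width):
        values = [v for year, v in series_by_year.items()
                  if window_start <= year < window_start + width]
        if values:
            averages[window_start] = sum(values) / len(values)
    return averages


def annualized_log_growth(levels_by_period):
    """Return {t: (ln y_t - ln y_{t-tau}) / tau} for consecutive available periods."""
    periods = sorted(levels_by_period)
    growth = {}
    for prev, cur in zip(periods, periods[1:]):
        tau = cur - prev  # years between two available observations
        growth[cur] = (math.log(levels_by_period[cur])
                       - math.log(levels_by_period[prev])) / tau
    return growth


# Toy decadal series (made up): 1000 until 1340, then 1100 from 1350 on.
toy = {year: 1000.0 if year < 1350 else 1100.0 for year in range(1300, 1400, 10)}
averages = block_averages(toy, 1300, 1400)  # {1300: 1000.0, 1350: 1100.0}
growth = annualized_log_growth(averages)    # {1350: ln(1.1) / 50}
```

Because `tau` is taken from the data, the same function also handles gaps longer than 50 years, in which case the growth rate is averaged over the longer interval.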
¹² This means that when the data are available at intervals shorter than 50 years, we compute 50-year averages for
that variable. This corresponds to two generations in our theoretical model, or the complete lifetime of a Malthusian
agent.
My coefficient of interest is β, which corresponds to the average annual speed at which an
economy converges to its steady state. Obtaining unbiased estimates of the speed of convergence
is challenging in many ways. First, one might challenge the inclusion of country fixed effects in
my regressions. Country fixed effects are indeed traditionally viewed as a solution to the omitted
variable bias, as they control for all time-invariant characteristics affecting long-run economic
development, such as geography, climate and culture. Not including country fixed effects in a
β-convergence model will hence inevitably bias the speed of convergence downwards, unless
the set of time-varying explanatory variables $X_{i,t}$ is rich enough. However, as highlighted by
Barro (2015), country fixed effects are themselves a source of upward bias in the measurement
of convergence speed, referred to as the Hurwicz-Nickell bias (Hurwicz, 1950; Nickell, 1981).
Barro (2015) shows in particular that the Hurwicz-Nickell bias tends to zero as the overall
sample length in years tends to infinity, meaning that the bias can be substantial in the modern
growth context where the time dimension rarely exceeds 50 years. In a Malthusian context
though, the advantages of using country fixed effects are heightened, while the associated risks
are dampened. Indeed, the scarcity of available time-varying control variables in the case of
a large sample and long time frame renders the country fixed effects crucial to neutralize the
omitted variable bias. On the other hand, the risk of a sizeable Hurwicz-Nickell bias is greatly
mitigated by the large overall sample length, since my Malthusian analysis spans centuries.
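The dependence of the Hurwicz-Nickell bias on the time dimension can be illustrated with a small Monte Carlo sketch. It simulates a generic AR(1) panel with unit fixed effects and compares the within (LSDV) estimate of the autoregressive coefficient for a short and a long panel. The data-generating process, parameter values and function names are illustrative assumptions; this is a textbook illustration, not a replication of the paper's setting.

```python
import random


def simulate_panel(n_units, n_periods, rho, rng, burn_in=50):
    """Simulate y_it = rho * y_i,t-1 + alpha_i + e_it for each unit."""
    panel = []
    for _ in range(n_units):
        alpha = rng.gauss(0.0, 1.0)
        y = alpha / (1.0 - rho)  # start at the unit's own steady state
        path = []
        for t in range(burn_in + n_periods + 1):
            y = rho * y + alpha + rng.gauss(0.0, 1.0)
            if t >= burn_in:
                path.append(y)
        panel.append(path)
    return panel


def within_estimate(panel):
    """LSDV estimate of rho: regress y_t on y_{t-1} after unit demeaning."""
    num = 0.0
    den = 0.0
    for path in panel:
        lagged, current = path[:-1], path[1:]
        lag_mean = sum(lagged) / len(lagged)
        cur_mean = sum(current) / len(current)
        for x, y in zip(lagged, current):
            num += (x - lag_mean) * (y - cur_mean)
            den += (x - lag_mean) ** 2
    return num / den


rng = random.Random(0)
true_rho = 0.9
bias_short = within_estimate(simulate_panel(500, 5, true_rho, rng)) - true_rho
bias_long = within_estimate(simulate_panel(500, 200, true_rho, rng)) - true_rho
print(bias_short, bias_long)  # strongly negative for T = 5, close to zero for T = 200
```

Since the convergence coefficient is the autoregressive coefficient minus one, a downward-biased estimate of `rho` translates into an overstated speed of convergence, which is the direction of bias discussed above.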
Even if the advantages of using country fixed effects might outweigh their disadvantages in
the Malthusian context, β-convergence models can still be plagued by endogeneity bias.
Country fixed effects cannot capture time-varying steady-state determinants, such
as institutional changes, which can jointly determine current economic growth and past levels
of economic development. To directly address that issue, I include Statehist and its squared
level as control variables (Borcan et al., 2018). Statehist is an index retracing state development
every half-century from 3500 BCE until today, and is therefore a suitable control for broad
institutional changes. Moreover, my analysis always includes time fixed effects in order to
control for global changes in the steady-state determinants, such as the spread of new technologies
or climatic changes.¹³
¹³ For instance, most of our analysis spans from the 11th to the 19th century, meaning that we are capturing
both the effect of the Medieval Warm Period and the Little Ice Age with time fixed effects, assuming that the effect
of those climatic events was, on average, the same for each country.
To further address the endogeneity bias, I provide, when possible, results using the Arellano
and Bond (1991) GMM estimator (AB) and the Blundell and Bond (1998) GMM estimator
(BB). The AB estimator uses a GMM estimation procedure where all the variables are taken
in first differences and lagged levels are used to instrument the endogenous regressors. This
procedure was first used by Caselli et al. (1996) in the growth context to address both the
Hurwicz-Nickell bias and the endogeneity of regressors. The BB estimator builds on AB,
exploiting additional moment conditions which use lagged first differences of the regressors to
instrument the levels of the endogenous variables. Obviously, AB and BB are no panacea and
the literature on dynamic panels has identified several issues in their use. AB’s main issue is the
problem of weak instruments, which is known to bias β estimates towards their LSDV
counterparts (Hauk and Wacziarg, 2009). BB often corrects for the weak instrument problem,
but requires a stationarity assumption to deliver consistent results. In particular, BB requires
that the country fixed effects are uncorrelated with lagged differences in the dependent variable,
i.e. $E(\alpha_i(\Delta \ln y_{i,t-s} - \ln y_{i,t-r})) = 0$ for all $r$ and $s$. Even if the stationarity condition does
not seem to hold in practice, it has been demonstrated that BB delivers systematically lower
biases in the modern growth context than AB under weak instruments (Hauk and Wacziarg,
2009). Monte Carlo simulations in the modern growth context have further shown that
within-country estimators (LSDV, AB and BB) perform better in estimating the “true” speed of
convergence than the between or the random effects estimators when the endogeneity bias on the
steady-state determinants is severe (Hauk Jr, 2017). Measurement error, on the other hand, is
better dealt with using the between or the random effects estimator (Hauk and Wacziarg, 2009).
Considering the endogeneity bias stemming from omitted variables as the most serious threat
to my analysis, I therefore choose to rely on within-country estimators (LSDV, AB and BB)
to estimate the speed of convergence.
To address measurement error in the lagged dependent variable, I nevertheless implement
several strategies. First, as already mentioned above, I systematically avoid using the most
uncertain population or GDP per capita data by excluding pre-1000 CE figures. Indeed, as pointed
out recently by Guinnane (2021) for population figures, we simply “do not know the population”
going that far back in the past, when standardized and systematic censuses were not conducted.
Population and output measures between 1000 CE and 1800 CE also contain a sizeable part
of uncertainty. However, local censuses, parish registers or proxy variables such as
urbanization are increasingly available for that period, reducing the overall measurement error. I also
only consider countries or cities with the best, or at least above median, data coverage for the
considered time periods (see Section 4.1 for more details). Second, I compute 50-year averages
when the data are available at intervals shorter than 50 years to avoid spurious changes in the
considered variables, and to focus on long-run dynamics. Third, in the presence of classical
measurement error in the regressors, AB and BB can in principle correct for it, as they are based on an
internal instrumentation strategy to estimate coefficients. Non-classical measurement error, such as
systematic differences in the GDP per capita or population levels across countries, is taken into
account via country fixed effects. This is the case for instance if rich countries report
systematically, but consistently through time, lower errors than poor countries. Time fixed effects can
also deal with common time-varying measurement error, as for instance increased uncertainty
in population figures moving away from 1800 CE. Finally, as a further robustness check for
non-classical measurement error, I also systematically perform LSDV estimations with
year-interacted lagged dependent variables. By that means, I can in principle take into account any
time-varying differences in measurement correlated with initial population or GDP per capita
levels.
5 Results
In this section, I present my empirical estimates of the speed of convergence for various Malthu-
sian economies. I start by presenting my results using historical and simulated income per capita
data. Then, I present my results using total and urban population historical data.
5.1 Convergence with GDP per capita Data
In Table 3, I present my results based on OLS and LSDV estimations of equation (15) using
Maddison Project’s data (Bolt and Van Zanden, 2020).¹⁴ Table 3 shows that Malthusian
economies take at least several generations to absorb a shock, revealing a pattern of weak
homeostasis.
Table 3: Speed of Convergence using GDP per capita Data from the Maddison Project
| | (1) OLS | (2) LSDV | (3) LSDV | (4) OLS | (5) LSDV | (6) LSDV |
|---|---|---|---|---|---|---|
| Sample Used | Full | Full | Full | Europe | Europe | Europe |
| log(GDPpc) | -0.0006 | -0.0057** | -0.0057*** | 0.0000 | -0.0046** | -0.0046** |
| | (0.001) | (0.002) | (0.002) | (0.001) | (0.001) | (0.001) |
| Time FE | Yes | Yes | Yes | Yes | Yes | Yes |
| Country FE | No | Yes | Yes | No | Yes | Yes |
| Statehist | No | No | Yes | No | No | Yes |
| Observations | 85 | 85 | 85 | 69 | 69 | 69 |
| adj. R-sq | -0.01 | 0.16 | 0.14 | -0.05 | 0.11 | 0.08 |
| Half-Life | 1197 | 122 | 121 | -18766 | 150 | 152 |
| Half-Life 95% C.I. | [-434,252] | [422,71] | [391,72] | [-356,370] | [587,86] | [663,86] |
Notes: This table presents estimates of the speed of convergence using GDP per capita data from the Maddison Project. Columns 1-3 present
results using the full sample of countries we selected from the Maddison Project’s data and columns 4-6 show results focusing on European
countries. Standard errors clustered at the country level are in parentheses. * p<0.1, ** p<0.05, *** p<0.01.
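The half-life rows of the table follow from the annual convergence coefficient through the relation ln(2)/|β|. The sketch below applies it to reported coefficients and shows how a confidence interval for β maps into an interval for the half-life. The normal critical value of 1.96 is an assumption (the paper does not state how its intervals are built), so the interval function illustrates the mapping rather than reproducing the reported bounds.

```python
import math


def half_life_years(beta):
    """Half-life in years implied by an annual coefficient beta (negative under convergence)."""
    return -math.log(2.0) / beta


def half_life_interval(beta, std_error, critical_value=1.96):
    """Map a confidence interval for beta into (slowest, fastest) half-lives."""
    lower = beta - critical_value * std_error  # fastest convergence
    upper = beta + critical_value * std_error  # slowest convergence
    return half_life_years(upper), half_life_years(lower)


print(round(half_life_years(-0.0057)))  # 122, as in column 2
slow, fast = half_life_interval(-0.0057, 0.002)
```

When the upper bound of the interval for β is positive, the corresponding half-life comes out negative, which is presumably how the negative entries in the half-life rows arise.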
Starting with the most parsimonious specification, with only time fixed effects as controls,
column 1 reveals that the lagged dependent variable coefficient is not statistically different from
zero. This is not really surprising as the omitted variable bias is substantial here, driving
the lagged dependent coefficient towards zero. Moreover, as my model suggests, Malthusian
economies should display conditional convergence rather than absolute convergence, as the
steady-state position of each economy depends on its characteristics.¹⁵
¹⁴ In that case, GMM estimations are not reported due to the lack of observational units. Indeed, as advised by
Roodman (2009), a useful rule of thumb to avoid weak instrument issues in GMM estimations is to keep the total
number of instruments below the number of observational units. This is not possible with the sample we consider
from the Maddison Project data, as we have eleven countries and fifteen instruments in the most parsimonious
possible instrumentation, resulting in unitary Hansen test p-values.
¹⁵ From the steady-state condition in (13), it is clear that two economies with, for instance, different rates of
technological progress $g$ will not converge to the same steady state.
Adding country fixed effects, column 2 reveals a negative and significant relationship
between GDP per capita growth and its initial level, pointing toward conditional convergence of
Malthusian economies. The estimated coefficient implies a half-life of 122 years (ln(2)/0.0057),
with a 95% confidence interval giving half-lives between 422 years and 71 years. Therefore, the
most comprehensive and up-to-date historical GDP per capita series are consistent with weak
homeostasis of Malthusian economies, as it takes at least several generations to absorb a shock.
Compared to other studies, my results fall between Fernihough (2013) and Bouscasse et al.
(2021), who find half-lives of 112 and 150 years respectively. However, my results are in great
contrast with Madsen et al. (2019), who find a half-life of 29 years for income per capita and
conclude in favor of strong homeostasis of Malthusian economies.¹⁶ In addition, since LSDV
typically delivers an overestimate of the speed of convergence and OLS an underestimate of it,
the true β should lie between OLS and LSDV. This means that the “true” half-life should be
consistent with weak homeostasis only, as the lower bound of the half-life in that case is about
120 years.
To limit the omitted variable bias, column 3 adds Statehist and its squared level as controls.
The speed of convergence is almost unaffected, as the reported half-life is now slightly lower at
121 years. As a robustness check, columns 4 to 6 replicate the analysis restricting the sample to
European countries, which gives similar results. In particular, column 6 indicates a slower speed
of convergence, with a half-life of 152 years, confirming the weak homeostasis pattern found in
the previous columns. I find no significant differences in the estimated speed of convergence
between the two samples of countries.
As an additional robustness check, Figure 3 displays LSDV estimations of columns 3 and
6, adding an interaction term between time fixed effects and the initial level of GDP per capita.
¹⁶ Note that my article has several methodological differences with respect to Madsen et al. (2019). First, they use
seemingly unrelated regression (SUR) models, a random effects family estimator, while I use within-country
estimators (LSDV, AB and BB). Second, they rely on interpolated data coming from heterogeneous sources for
GDP per capita and population data, while I take the data as given.
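The argument that the true coefficient lies between the OLS and LSDV estimates can be illustrated with a toy simulation of a convergence regression with unit fixed effects. Pooled OLS ignores the fixed effects and understates the speed of convergence, while LSDV on a short panel overstates it. The data-generating process (no time effects, no controls, ten periods of 50 years, the chosen variances) and all names below are illustrative assumptions, not the paper's setup.

```python
import random

TAU = 50            # years between observations
TRUE_BETA = -0.005  # annual convergence coefficient used to generate the data


def simulate(n_units, n_periods, rng, burn_in=40):
    """Return, per unit, a list of (lagged log level, annual growth) pairs."""
    panel = []
    for _ in range(n_units):
        alpha = rng.gauss(0.0, 0.002)
        log_y = -alpha / TRUE_BETA  # unit-specific steady state
        rows = []
        for t in range(burn_in + n_periods):
            growth = TRUE_BETA * log_y + alpha + rng.gauss(0.0, 0.002)
            if t >= burn_in:
                rows.append((log_y, growth))
            log_y += TAU * growth
        panel.append(rows)
    return panel


def slope(pairs):
    """Slope of a simple regression of y on x with an intercept."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    return sxy / sxx


def pooled_ols(panel):
    """Pooled regression that ignores unit fixed effects."""
    return slope([row for rows in panel for row in rows])


def lsdv(panel):
    """Within estimator: demean each unit's data before pooling."""
    demeaned = []
    for rows in panel:
        mx = sum(x for x, _ in rows) / len(rows)
        my = sum(y for _, y in rows) / len(rows)
        demeaned.extend((x - mx, y - my) for x, y in rows)
    return slope(demeaned)


rng = random.Random(1)
panel = simulate(500, 10, rng)
ols_beta = pooled_ols(panel)
lsdv_beta = lsdv(panel)
print(ols_beta, TRUE_BETA, lsdv_beta)  # OLS closest to zero, LSDV most negative
```

The short time dimension is chosen deliberately so that the LSDV bias is visible; with many periods the LSDV estimate moves towards the true value.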
Figure 3: Speed of Convergence per period using Maddison Project Data
[Figure: two panels of coefficient estimates by period, 1300 to 1750, on a scale from -.02 to .04. Reference lines in each panel: half-life = 115 years (Galloway, 1988) and half-life = 30 years (Madsen et al., 2019).]
Notes: This figure reports estimates of the speed of convergence using GDP per capita data from the Maddison
Project. It corresponds to the LSDV estimations in column 3, Table 3 (left panel) and in column 6, Table 3 (right
panel), adding year-interacted lagged GDP per capita levels as controls. 95% confidence intervals are reported.
Figure 4: Speed of Convergence per country using Maddison Project Data
[Figure: two panels of coefficient estimates by country, on a scale from -.04 to .01. First panel: China, France, Italy, Mexico, Netherlands, Peru, Poland, Portugal, Spain, Sweden, United Kingdom. Second panel: France, Italy, Netherlands, Poland, Portugal, Spain, Sweden, United Kingdom. Reference lines in each panel: half-life = 115 years (Galloway, 1988) and half-life = 30 years (Madsen et al., 2019).]
Notes: This figure reports estimates of the speed of convergence using GDP per capita data from the Maddison
Project. It corresponds to the LSDV estimations in column 3, Table 3 (left panel) and in column 6, Table 3 (right
panel), adding country-interacted lagged GDP per capita levels as controls. 95% confidence intervals are reported.
This allows me to look at the heterogeneity of the speed of convergence through time, and to check
the possible influence of non-classical measurement errors. Whether considering the full or the
European sample of countries, the vast majority of the estimated coefficients are not
statistically different from a half-life of 115 years, as found for Europe using the long-run elasticities
of Galloway (1988). Overall, the point estimates are stable in magnitude and, for the most part,
statistically different from zero at the 5% level. This indicates a clear and stable pattern of weak
homeostasis during the Malthusian period. On the contrary, strong homeostasis, as represented
by the highest half-life found in Madsen et al. (2019) (about 30 years), is always rejected at the
5% level.
Turning to the heterogeneity of the speed of convergence by country, Figure 4 displays
the point estimates of LSDV estimations of columns 3 and 6, adding an interaction term
between country fixed effects and the initial level of GDP per capita. Figure 4 reveals mixed
results, as some countries are found compatible with weak homeostasis (e.g. the Netherlands), and
some other countries rather lean towards strong homeostasis (e.g. Poland). Some countries, like
France or Spain, are even found to be compatible with both types of homeostasis. However,
the precision of my estimates is clearly an issue in that specification. As displayed in Figure 4,
confidence intervals are generally large, due to a lack of variation in the data for some countries.
In Table 4, I present my results based on OLS, LSDV and GMM estimations of equation
(15) using Lagerlöf’s (2019) simulated data. That dataset has the great advantage of
reproducing the same moments as Maddison Project’s data, while possessing a far larger time and
cross-sectional dimension. This is useful for estimating with greater precision the spectrum of
plausible half-lives in Malthusian economies. Consistent with the weak homeostasis displayed
when using historical GDP per capita series from the Maddison Project, Table 4 reveals
half-lives ranging from about three centuries to one century.
Table 4: Speed of Convergence using simulated GDP per capita Data from Lagerlöf (2019)
| | (1) OLS | (2) LSDV | (3) GMM-AB | (4) GMM-BB |
|---|---|---|---|---|
| log(GDPpc) | -0.0019*** | -0.0052*** | -0.0063*** | -0.0047*** |
| | (0.000) | (0.000) | (0.002) | (0.001) |
| Time FE | Yes | Yes | Yes | Yes |
| Country FE | No | Yes | Yes | Yes |
| Observations | 10000 | 10000 | 9000 | 10000 |
| adj. R-sq | 0.09 | 0.18 | . | . |
| AR(7) | | | 0.17 | 0.18 |
| Hansen | | | 0.22 | 0.23 |
| Diff. Hansen | | | . | 0.21 |
| Instruments | | | 13 | 15 |
| Half-Life | 363 | 133 | 110 | 146 |
| Half-Life 95% C.I. | [403,330] | [141,126] | [212,75] | [250,103] |
Notes: This table presents estimates of the speed of convergence using simulated GDP per capita data from Lagerlöf (2019). Columns 3 and
4 display Arellano and Bond (1991) and Blundell and Bond (1998) GMM estimations, using the seventh and further lagged values of GDP
per capita as instruments. We use a collapse matrix of instruments and report instrument count. The AR(7) row reports the p-value of a test
for no seventh-order correlation in the residuals. Standard errors clustered at the country level are in parentheses. * p<0.1, ** p<0.05, ***
p<0.01.
The LSDV estimates imply a half-life of 133 years, with a 95% confidence interval giving
half-lives between 141 and 126 years. As expected, the speed of convergence is now estimated
with much more precision, while falling in the wide confidence intervals of my previous results
in Table 3. Note that the Hurwicz-Nickell bias is very unlikely to affect my estimates in
that case, as this is an ideal setting where both the time and the cross-sectional dimensions are very large
($N = 1000$ and $T = 500$). Interestingly, the point estimate is very close to my previous
LSDV estimation in Table 3, column 3, indicating that the Hurwicz-Nickell bias is indeed
not substantial in the Malthusian context. A comparison with Maddison Project’s data is fully
relevant here, as Lagerlöf’s (2019) simulated data come from a Malthusian model which is
found to match the moments of the six historical GDP per capita series presented in Fouquet
and Broadberry (2015). The original series presented in Fouquet and Broadberry (2015) are
still part of the latest Maddison Project database for some countries (e.g. Holland and Italy) or
are updated versions using the same methodology (e.g. England and Sweden).
Columns 3 and 4 display AB and BB GMM results. As highlighted by Monte Carlo
simulations in the modern growth context (Hauk and Wacziarg, 2009; Hauk Jr, 2017), BB is likely
to deliver better estimates of the speed of convergence compared to AB in the presence of weak
instruments; the second best estimator in that context is LSDV. Column 3 reveals AB estimates
of the speed of convergence that are outside the plausible bounds given by OLS and LSDV, which
is recognized as a sign of weak instruments in the literature. In those conditions, BB is the
preferred GMM estimator. Column 4 shows BB estimation results with a half-life of about 146
years, pointing again towards weak homeostasis. The 95% confidence interval gives half-lives
between 250 and 103 years. In terms of post-estimation tests, I first fail to reject the null hypothesis
of no seventh-order serial correlation in the residuals (AR(7) test), meaning that using the seventh
(and greater) lag of GDP per capita as instruments does not violate the exclusion restriction.
Second, I fail to reject the null hypothesis of both the Hansen test and the difference-in-Hansen test
for all GMM instruments, indicating that the moment conditions are globally satisfied.
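The two diagnostics used in this paragraph, the OLS/LSDV bounds check and the reading of the post-estimation p-values, can be written down in a few lines. The coefficients and p-values are those reported in Table 4; the helper names and the 5% threshold for the tests are illustrative assumptions.

```python
def outside_bounds(estimate, ols, lsdv):
    """True if a GMM estimate falls outside the interval spanned by OLS and LSDV."""
    low, high = sorted((ols, lsdv))
    return not (low <= estimate <= high)


def fails_to_reject(p_value, level=0.05):
    """True if the null hypothesis cannot be rejected at the given level."""
    return p_value > level


# Coefficients on log(GDPpc) reported in Table 4.
table4 = {"OLS": -0.0019, "LSDV": -0.0052, "GMM-AB": -0.0063, "GMM-BB": -0.0047}

ab_suspect = outside_bounds(table4["GMM-AB"], table4["OLS"], table4["LSDV"])
bb_suspect = outside_bounds(table4["GMM-BB"], table4["OLS"], table4["LSDV"])
print(ab_suspect, bb_suspect)  # True False

# p-values reported for the BB column: AR(7), Hansen, difference-in-Hansen.
bb_p_values = {"AR(7)": 0.18, "Hansen": 0.23, "Diff. Hansen": 0.21}
all_pass = all(fails_to_reject(p) for p in bb_p_values.values())
```

An estimate flagged by `outside_bounds` is only a symptom of weak instruments, not a formal test.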
Figure 5 investigates the time heterogeneity of the speed of convergence. All the coefficients
are statistically different from zero and are very precisely estimated, thanks to the large time
and sample size in Lagerlöf (2019). The speed of convergence is fairly stable over time. Half
of the estimated coefficients cannot reject a half-life of 115 years at the 5% level, as found
for Europe using the long-run elasticities of Galloway (1988). Moreover, all the remaining
coefficients indicate a slower speed of convergence, which is again a clear sign of weak homeostasis
of Malthusian economies.
Investigating the cross-sectional heterogeneity of the speed of convergence, Figure 6
displays the kernel density of the estimated speed for the 1000 simulated Malthusian economies
in Lagerlöf (2019). Thanks to the large sample size, I can visualize the whole spectrum of the
possible speeds of convergence during Malthusian times. Consistent with the literature and my
results, it appears that the mode of the distribution is very close to a half-life of 115 years, as
found for Europe using the long-run elasticities of Galloway (1988). Strong homeostasis,
represented by a half-life of 30 years or less, is found much less likely as it is close to the lower tail
of the distribution. Figure E-4 in the Appendix delivers the point estimates along with their
95% confidence intervals for the first 200 simulated economies from Lagerlöf (2019).
Figure 5: Speed of Convergence per period using Lagerlöf’s (2019) simulated data
[Figure: coefficient estimates by period, 1350 to 1800, on a scale from 0 to -.02. Reference lines: half-life = 115 years (Galloway, 1988) and half-life = 30 years (Madsen et al., 2019).]
Notes: This figure reports estimates of the speed of convergence by period using simulated GDP per capita data
from Lagerlöf (2019). It corresponds to the LSDV estimation in column 2, Table 4, adding year-interacted lagged
GDP per capita levels as controls. 95% confidence intervals are reported.
Figure 6: Speed of Convergence per country using Lagerlöf’s (2019) simulated data
[Figure: kernel density (Epanechnikov kernel, bandwidth = 0.0013) of the estimated β coefficients. Horizontal axis: Speed of Convergence - β coefficients, from -.04 to .02; vertical axis: Density, from 0 to 80. Reference lines: half-life = 30 years (Madsen et al., 2019) and half-life = 115 years (Galloway, 1988).]
Notes: This figure reports the kernel density of the estimated speed of convergence by country using simulated
GDP per capita data from Lagerlöf (2019). It corresponds to the LSDV estimation in column 2, Table 4, adding
country-interacted lagged GDP per capita levels as controls.
5.2 Convergence with Population Data
In Table 5, I present my results based on OLS, LSDV and GMM estimations of equation
(15) using McEvedy et al.’s (1978) population data. Consistent with the predictions of my
theoretical model (see Section C of the Appendix for more details), population converges to its
Malthusian steady state at a pace similar to that of GDP per capita and displays weak homeostasis.
Table 5: Speed of Convergence using Total Population Data from McEvedy et al. (1978)
OLS LSDV LSDV GMM-AB GMM-BB
(1) (2) (3) (4) (5)
log(Population) -0.000*** -0.005*** -0.006*** -0.009*** -0.004*
(0.000) (0.001) (0.001) (0.003) (0.002)
Time FE Yes Yes Yes Yes Yes
Country FE No Yes Yes Yes Yes
Statehist No No Yes Yes Yes
Observations 180 180 180 162 180
adj. R-sq 0.48 0.60 0.61 . .
AR(2) 0.69 0.36
Hansen 0.94 0.99
Diff. Hansen . 0.87
Instruments 18 22
Half-Life 4414 147 125 73 167
Half-Life 95% C.I. [12873,2663] [224,109] [231,86] [215,44] [-1176,78]
Notes: This table presents estimates of the speed of convergence using total population data from McEvedy et al. (1978). Columns 4 and 5
display Arellano and Bond (1991) and Blundell and Bond (1998) GMM estimations, using the second to fourth lagged values of total population
as instruments. We treat Statehist and its squared level as endogenous, using the same set of lags as instruments. We use a collapsed matrix
of instruments and report the instrument count. The AR(2) row reports the p-value of a test for no second-order correlation in the residuals.
Standard errors clustered at the country level are in parentheses. * p<0.1, ** p<0.05, *** p<0.01.
Controlling for time and country fixed effects, column 2 reveals a negative and highly significant
relationship between population growth and its initial level. The implied half-life is about
147 years, which is in line with my previous results using historical GDP per capita series. The
95% confidence interval indicates half-lives between 224 and 109 years, which remains clearly within
the range of weak homeostasis. To address omitted variable issues, column 3 adds Statehist and
its squared level as controls. Convergence tends to be faster, with a half-life of about 125 years.
However, I do not find significant differences in the speed of convergence between columns 2
and 3. This last result is close to my previous LSDV estimations using GDP per capita series
in Table 3, column 3, and Table 4, column 2, showing again evidence of weak homeostasis.
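The half-lives discussed above can be approximated directly from the β coefficients. The sketch below assumes the continuous-time mapping half-life = -ln(2)/β, which may differ slightly from the exact formula attached to equation (15); the function name `half_life` is illustrative. Because Table 5 rounds coefficients to three decimals, the output only approximates the reported values (roughly 139 rather than 147 years for column 2). Under this convention a positive β yields a negative half-life, which is one possible reading of the negative bound in the column 5 confidence interval.

```python
import math


def half_life(beta):
    """Years needed to close half of the gap to the steady state.

    Assumes the continuous-time mapping half_life = -ln(2) / beta
    (an assumption, not necessarily the paper's exact formula).
    A positive beta (divergence) gives a negative value.
    """
    if beta == 0:
        return math.inf  # no convergence at all
    return -math.log(2) / beta


# Rounded LSDV coefficients from Table 5, columns 2 and 3
for beta in (-0.005, -0.006):
    print(f"beta = {beta:.3f} -> half-life of about {half_life(beta):.0f} years")
```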
Columns 4 and 5 use GMM estimation procedures. Starting with the AB GMM estimation,
column 4 reveals a faster speed of convergence than our LSDV results, with a half-life
of 73 years. By contrast, the BB GMM estimation in column 5 displays a slower speed
of convergence than LSDV, with a half-life of 167 years. Clearly, the AB estimate of the speed
of convergence falls outside the plausible bounds provided by OLS and LSDV, which is symptomatic of
a weak instruments problem. Under those conditions, the BB and LSDV estimations appear to be
more reliable, pointing again towards weak homeostasis. However, further caution is needed
in interpreting the GMM results, as the Hansen test p-values are close to unity. This comes from
the fact that the cross-sectional dimension in McEvedy et al.’s (1978) data is small compared
to the number of instruments.17 This is the well-known “too many instruments” problem
highlighted by Roodman (2009).
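The bounding argument used above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the paper's estimation: it uses a hypothetical AR(1) panel y_it = ρ·y_i,t-1 + μ_i + ε_it with ρ = 0.5, 300 units and 10 periods, and all function names are illustrative. In a dynamic panel with a short time dimension, pooled OLS tends to overstate ρ because the unit effect is positively correlated with the lagged level, while LSDV tends to understate it (the Nickell bias). Since a one-period growth regression on the lagged level has β = ρ - 1, OLS then understates and LSDV overstates the speed of convergence, so a consistent estimate is expected to fall between the two.

```python
import random


def simulate_panel(n_units=300, n_periods=10, rho=0.5, burn_in=50, seed=0):
    """Simulate y_it = rho * y_i,t-1 + mu_i + eps_it for each unit."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        mu = rng.gauss(0.0, 1.0)   # unit fixed effect
        y = mu / (1.0 - rho)       # start at the unit's own steady state
        series = []
        for t in range(burn_in + n_periods + 1):
            y = rho * y + mu + rng.gauss(0.0, 1.0)
            if t >= burn_in:       # discard the burn-in periods
                series.append(y)
        panel.append(series)
    return panel


def slope(x, y):
    """OLS slope of y on x (with an intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def pooled_ols(panel):
    """Regress y_t on y_{t-1}, ignoring the unit effects."""
    x = [s[t] for s in panel for t in range(len(s) - 1)]
    y = [s[t + 1] for s in panel for t in range(len(s) - 1)]
    return slope(x, y)


def lsdv(panel):
    """Within estimator: demean both variables unit by unit."""
    x, y = [], []
    for s in panel:
        lag, cur = s[:-1], s[1:]
        mean_lag, mean_cur = sum(lag) / len(lag), sum(cur) / len(cur)
        x += [v - mean_lag for v in lag]
        y += [v - mean_cur for v in cur]
    return slope(x, y)


panel = simulate_panel()
print(f"true rho = 0.5, pooled OLS = {pooled_ols(panel):.2f}, LSDV = {lsdv(panel):.2f}")
```

In this setup the pooled OLS estimate lies above the true ρ and the LSDV estimate below it, which is the logic behind treating the two as bounds for the GMM estimates.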
Figure 7 displays the point estimates of my LSDV estimation in column 3, adding
year-interacted initial population levels. All the estimated coefficients are compatible with a
half-life of 115 years, as found for Europe using the long-run elasticities of Galloway (1988).
The point estimates are fairly stable in magnitude, and are all statistically different from zero.
This indicates, again, a clear pattern of weak homeostasis during the whole Malthusian period.
Strong homeostasis, by contrast, is rejected.
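The comparison with the two reference half-lives amounts to checking whether the β implied by each benchmark lies inside an estimate's 95% confidence interval. A minimal sketch, assuming a normal approximation (plus or minus 1.96 standard errors) and the mapping β = -ln(2)/half-life; it uses the rounded column 3 estimate from Table 5 (-0.006, standard error 0.001) rather than the period-specific coefficients plotted in Figure 7, which are not reported in the text. The function name `is_compatible` is illustrative.

```python
import math


def is_compatible(beta_hat, std_err, half_life_years, z=1.96):
    """True if the beta implied by `half_life_years` lies within
    beta_hat +/- z * std_err (normal approximation, an assumption)."""
    beta_ref = -math.log(2) / half_life_years
    return beta_hat - z * std_err <= beta_ref <= beta_hat + z * std_err


beta_hat, std_err = -0.006, 0.001  # rounded values, Table 5, column 3
print("115 years (weak homeostasis):", is_compatible(beta_hat, std_err, 115))
print("30 years (strong homeostasis):", is_compatible(beta_hat, std_err, 30))
```

With these inputs the 115-year benchmark falls inside the interval while the 30-year benchmark falls outside it.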
Figure 8 investigates the cross-country heterogeneity of the speed of convergence. As in
Figure 7, strong homeostasis is clearly rejected since all the estimated coefficients reject a half-life
of 30 years or less, as in Madsen et al. (2019). In particular, the estimated half-life for England
and Wales is 90 years and is not statistically different from my benchmark result of 100 years.
17 In that case, our sample is composed of 18 countries. For example, in column 4 we instrument all right-hand-side
variables with their first three lags, amounting to 22 instruments.
Figure 7: Speed of Convergence per period using Population Data from McEvedy et al. (1978)
[Figure: estimated β coefficients by period (1000, 1100, 1200, 1300, 1400, 1500, 1600, 1650, 1700, 1750); vertical axis from -0.024 to 0.006. Reference lines: Half-life = 115 years, Galloway (1988); Half-life = 30 years, Madsen et al. (2019).]
Notes: This figure reports estimates of the speed of convergence by period using total population data from
McEvedy et al. (1978). It corresponds to the LSDV estimation in column 3, Table 5, adding year-interacted
lagged total population levels as controls. 95% confidence intervals are reported.
Figure 8: Speed of Convergence per country using Population Data from McEvedy et al. (1978)
[Figure: estimated β coefficients by country; vertical axis from -0.024 to 0.006. Reference lines: Half-life = 100 years, Benchmark - England; Half-life = 54 years, Upper-bound - England; Half-life = 30 years, Madsen et al. (2019). Countries shown: Austria, Bel. and Lux., China Proper, Czechoslovakia, England & Wales, France, Germany, Hungary, Italy, Japan, Korea, Pak. India & Bang., Poland, Portugal, Romania, Russia in Europe, Scandinavia, Spain.]
Notes: This figure reports estimates of the speed of convergence by country using total population data from
McEvedy et al. (1978). It corresponds to the LSDV estimation in column 3, Table 5, adding country-interacted
lagged total population levels as controls. 95% confidence intervals are reported.
By contrast, the upper-bound half-life of 54 years for England is clearly rejected, giving
further evidence of the strength of the Malthusian trap in England. I am also able to confirm
that England is converging faster than the average European economy, as a test of equality with
a half-life of 115 years rejects the null hypothesis. Some Malthusian economies, such as Spain
or Korea, have a significantly higher speed of convergence. However, it is not clear whether
this pattern reflects the influence of measurement error or the existence of stronger Malthusian
forces. Reassuringly, all those economies are compatible with the upper-bound half-life
of England, which may therefore represent one of the highest speeds of convergence that a
Malthusian economy can reach. This is still in line with weak homeostasis, as it means that two
generations are needed to absorb half of a shock.
In Table 6, I present my results based on OLS, LSDV and GMM estimations of equation
(15) using urban population data from Reba et al. (2016). My results confirm the weak home-
ostasis pattern found using GDP per capita or total population historical series, with half-lives
of about one century.
Starting with city-level population data, column 2 reveals a negative and highly significant
relationship between urban population growth and the initial level of urban population, condi-
tional on time and city fixed effects. The corresponding half-life is about 95 years, with a 95%
confidence interval indicating half-lives between 155 and 65 years.
Columns 3 and 4 display AB and BB GMM estimations. The AB estimation in column
3 shows a speed of convergence falling within the OLS-LSDV bound, with a half-life of 115
years. This is further confirmed by the BB GMM estimation in column 4, which delivers a very
similar speed of convergence compared to AB, with a half-life of 119 years. However, I fail to
pass the AR(3) test in both specifications, suggesting a violation of the exclusion restriction for
the set of considered internal instruments.
Turning to country-level estimations, column 6 reveals a negative and highly significant
relationship between urban population growth and its initial level, conditional on time and
Table 6: Speed of Convergence using Urban Population Data from Reba et al. (2016)
Observational Unit:   City                                            Country
                      OLS         LSDV        GMM-AB      GMM-BB      OLS         LSDV        LSDV        GMM-AB      GMM-BB
                      (1)         (2)         (3)         (4)         (5)         (6)         (7)         (8)         (9)
log(Population)       -0.003***   -0.007***   -0.006***   -0.006***   -0.004***   -0.007***   -0.008***   -0.007**    -0.005*
                      (0.001)     (0.001)     (0.002)     (0.002)     (0.001)     (0.001)     (0.001)     (0.003)     (0.003)
Time FE               Yes         Yes         Yes         Yes         Yes         Yes         Yes         Yes         Yes
City FE               No          Yes         Yes         Yes         No          No          No          No          No
Country FE            No          No          No          No          No          Yes         Yes         Yes         Yes
Statehist             No          No          No          No          No          No          Yes         Yes         Yes
Observations          1706        1706        1239        1706        509         509         509         411         509
adj. R-sq             0.08        0.22        .           .           0.16        0.24        0.26        .           .
AR(3)                                         0.06        0.07                                            0.11        0.12
Hansen                                        0.12        0.12                                            0.56        0.91
Diff. Hansen                                  .           0.02                                            .           0.99
Instruments                                   29          31                                              21          25
Half-Life             258         95          115         119         192         94          84          97          133
Half-Life 95% C.I.    [500,174]   [155,68]    [253,74]    [325,73]    [270,149]   [133,73]    [118,64]    [2707,49]   [-698,61]
Notes: This table presents estimates of the speed of convergence using urban population data from Reba et al. (2016). Columns 1-4 present results using city-level data and columns 5-9 show
results using urban population aggregated at the country-level. Columns 3 and 4 and columns 8 and 9 display Arellano and Bond (1991) and Blundell and Bond (1998) GMM estimations. In
the case of city-level estimates using GMM, we use the third and further lagged values of urban population as instruments. In the case of country-level GMM estimates, we use the third and
fourth lagged values of urban population as instruments. When included, we treat Statehist and its squared level as endogenous, using the same set of lags as instruments. We use a collapsed
matrix of instruments and report the instrument count. The AR(3) row reports the p-value of a test for no third-order correlation in the residuals. Standard errors clustered at the city level in
columns 1-4 and at the country level in columns 5-9 are in parentheses. * p<0.1, ** p<0.05, *** p<0.01.
country fixed effects. The half-life is almost identical to the previous LSDV estimation using
city-level data in column 2. Column 7 adds Statehist and its squared level as control variables
in order to reduce the omitted variable bias. The estimated speed of convergence is now faster
with a half-life of 84 years.
In columns 8 and 9, I perform GMM estimations at the country level. I find half-lives of
97 and 133 years respectively. The AB GMM estimation falls within the OLS-LSDV bounds.
Contrary to the previous GMM estimations in columns 3 and 4, I now pass the AR(3)
test in both cases. Moreover, the Hansen test and the difference-in-Hansen test do not reject their null
hypotheses, a sign that the moment conditions are globally satisfied.18 This suggests
that the instruments and the moment conditions used are valid.
Overall, my results using historical urban population data clearly confirm the weak home-
ostasis pattern found in the previous sections. Most of the half-lives are close to one century,
and the smallest half-life found is about 84 years.
Figure 9 explores the time heterogeneity of the speed of convergence, both for my city-level
and country-level estimations. In both cases, weak homeostasis is confirmed. This is particularly
striking for the city-level data, where none of the point estimates from 1250 CE onward
can reject a half-life of 115 years at the 5% level. On the other hand, strong homeostasis is
always rejected at the 5% level, except for the first period of the country-level data.
Figure 10 plots the kernel density of the estimated speed of convergence for a sample of 185
cities. A half-life of 115 years, as found for Europe using the long-run elasticities of Galloway
(1988), is very close to the mode of the distribution. Moreover, the distribution is also more
concentrated around that given half-life than my previous estimates with Lagerlöf’s (2019)
simulated data (Figure 6), giving more precise evidence in favor of weak homeostasis. On the
contrary, strong homeostasis is much less likely. Figure E-5 of the Appendix delivers the point
estimates along with their 95% confidence intervals for the 185 cities in our sample.
18 Although the Hansen test p-values are close to unity, the total number of instruments is well below the number of
observational units, with 25 instruments for 47 countries in column 9.
Figure 9: Speed of Convergence per period using Data from Reba et al. (2016)
[Figure: two panels (left: city-level data; right: country-level data) showing estimated β coefficients by period, every 50 years from 1050 to 1800; vertical axis from -0.024 to 0.006. Reference lines in both panels: Half-life = 115 years, Galloway (1988); Half-life = 30 years, Madsen et al. (2019).]
Notes: This figure reports estimates of the speed of convergence by period using urban population data from Reba
et al. (2016). It corresponds to the LSDV estimations in column 2, Table 6 (left panel) and in column 7, Table 6
(right panel), adding year-interacted lagged urban population levels as controls. 95% confidence intervals reported.
Figure 10: Speed of Convergence per city using Data from Reba et al. (2016)
[Figure: kernel density of the estimated β coefficients (kernel = epanechnikov, bandwidth = 0.0017). Horizontal axis: Speed of Convergence - β coefficients; vertical axis: Density. Reference lines: Half-life = 30 years, Madsen et al. (2019); Half-life = 115 years, Galloway (1988).]
Notes: This figure reports the kernel density of the estimated speed of convergence by city using urban population
data from Reba et al. (2016). It corresponds to the LSDV estimation in column 2, Table 6, adding city-interacted
lagged urban population levels as controls.
Figure 11 shows the cross-country heterogeneity of the speed of convergence using urban
population data. Along with the city-level results, Figure 11 shows that very few countries
fail to reject strong homeostasis at the 5% level. It is important to note that countries known
to have good quality urban population data, such as the Netherlands or the United Kingdom,
are estimated with great precision. In particular, the estimated speed of convergence for the
United Kingdom is not statistically different from the half-life of one century found with our
benchmark parametrization for England.
Figure 11: Speed of Convergence per country using Data from Reba et al. (2016)
[Figure: estimated β coefficients by country; axis from -0.06 to 0.02. Reference lines: Half-life = 100 years, Benchmark - England; Half-life = 54 years, Upper-bound - England; Half-life = 30 years, Madsen et al. (2019). Countries shown: Algeria, Austria, Belgium, Bulgaria, Cambodia, China, Denmark, Ecuador, Egypt, France, Germany, Greece, Hungary, India, Indonesia, Iran, Iraq, Ireland, Israel, Italy, Japan, Korea, Rep., Libya, Lithuania, Mexico, Morocco, Myanmar, Netherlands, Nigeria, Peru, Poland, Portugal, Romania, Russia, Saudi Arabia, Serbia and Montenegro, Slovakia, Spain, Sweden, Switzerland, Syria, Thailand, Tunisia, Turkey, Ukraine, United Kingdom, Uzbekistan.]
Notes: This figure reports estimates of the speed of convergence by country using urban population data from Reba et al.
(2016). It corresponds to the LSDV estimation in column 7, Table 6, adding country-interacted lagged urban
population levels as controls. 95% confidence intervals are reported.
6 Conclusion
How long can living standards deviate from their long-run equilibrium after a shock in a Malthu-
sian world? This article proposes to look at the speed of convergence of Malthusian economies
to answer this question. I proceed in two steps. First, I build an overlapping-generations
Malthusian growth model, emphasizing the channels through which population
adjusts to its standard of living. In particular, I follow the idea of Malthus (1798), including
both preventive and positive checks as means of population adjustment. Agents first choose
to marry (or not), influencing the extensive margin of fertility, and then choose the number
of children within marriage, influencing the intensive margin of fertility. Both choices depend
on income per capita, in a Malthusian fashion. I then derive the speed of convergence implied
by my model, showing that it depends only on the land share of output and the elasticities of
fertility, marriage and survival with respect to income per capita. I also perform a calibration
exercise, showing that under plausible parameter values the speed of convergence indicates weak
homeostasis for England, with a half-life of about a century. Second, I systematically confront
the model predictions with the data using β-convergence regressions à la Barro and Sala-i-Martin
(1992). I first focus on historical and simulated GDP per capita data series, showing that they
exhibit weak homeostasis of the same magnitude as implied by my model. I then use historical
total and urban population series to estimate the speed of convergence, finding similar results
regarding the pattern of weak homeostasis and its magnitude. I address the endogeneity issues
using an internal instrument approach (GMM) and controlling for the State History Index of
Borcan et al. (2018). Measurement error issues are dealt with using several strategies, including the
exclusion of the most uncertain part of the data, time averages and time-interacted regressors.
Overall, my results highlight the validity of the Malthusian theory and the Malthusian
trap mechanism to explain the thousands of years of stagnation that humanity faced before
the Industrial Revolution. In this perspective, weak homeostasis helps us understand the
multiple episodes of non-sustained growth lasting for decades in pre-industrial
times, referred to as “golden ages” or “efflorescences” in the literature (Goldstone, 2002). Similarly,
different patterns of homeostasis can help explain why common economic shocks can
lead to a long-lasting divergence in the standards of living between two Malthusian economies.
In this perspective, my work already highlights significant differences between the speed of