Is Ethnic Diversity a Tragedy for Public Good Provision in Africa?
Gwen-Jiro Clochard*, Guillaume Hollard
This version: January 2019
Abstract
Pioneering investigations of the economic consequences of ethnic diversity - a ubiquitous feature of African societies - found a strong negative correlation. The present paper revisits this “diversity burden” conjecture, by using (1) newly available, better-quality data sets allowing for sub-national estimations and multiple robustness checks, and (2) a new instrument, a pre-colonial measure of ethnic diversity, to explore causality. We find that the impact of ethnic diversity depends on the public good and the specification considered. Overall, completely heterogeneous societies produce 3 percent less public goods than their homogeneous counterparts, a negative but more limited effect than previously found.
Keywords: public good provision, ethnic diversity, causal effect
JEL Classification: C26, H41¹
* Corresponding author - Ecole polytechnique, CREST - gwen-jiro.clochard@polytechnique.edu - 5 avenue Le Chatelier, Office 4083, 91120 Palaiseau, France
Ecole polytechnique, CREST and CNRS - guillaume.hollard@polytechnique.edu - 5 avenue Le Chatelier, Office 4035, 91120 Palaiseau, France
¹ We thank Esther Duflo, Marcel Fafchamps, Christopher Udry, Lorenz Götte, Margherita Comola, Pierre Boyer, Anett John, Alessandro Riboni, Omar Sene and Fabien Perez for very useful comments. We also thank participants at the CSAE2018 conference and other seminars. Last, we are grateful to Boris Gershman and Stelios Michalopoulos for data sharing. The authors declare that they have no relevant material or financial interests that relate to the research described in this paper.
1 Introduction
In their influential analysis of the determinants of growth in Sub-Saharan Africa, Easterly and Levine (1997) singled out the effect of ethnic diversity. Ethnic diversity has been shown to be negatively correlated with a variety of economic outcomes. For instance, La Porta et al. (1999), Alesina et al. (2003) and Banerjee et al. (2007) found that more heterogeneous societies have worse infrastructure, and higher illiteracy and infant-mortality rates. In Alesina et al. (2003), ceteris paribus, completely heterogeneous societies produce 20 percent less public goods than homogeneous ones.
Twenty years after the pioneering work of Easterly and Levine (1997), we here propose to
revisit one of the most thoroughly-investigated questions in Development Economics, that of the
“diversity burden”, in the light of the substantial progress that has been made in data availability
and econometric analysis. Our aim is to contribute to the still rapidly-growing literature on the link
between diversity and public goods in the following ways.
Our first contribution relates to data compilation. Pioneering work on diversity used cross-country analyses (Alesina et al. (2003), Fearon and Laitin (2003), Posner (2004)), while more recent works benefit from sub-national data availability (Alesina and Zhuravskaya (2011), Gerring et al. (2015), Gisselquist et al. (2016), Gershman and Rivera (2018)). We here contribute to this literature by combining surveys, geographical information and historical ethnic data, which enable us to carry out our estimations at the regional level (the first sub-national administrative level) in 32 African countries. Regional-level analysis enables us to use country fixed effects, capturing all of the unobserved characteristics at the country level, such as the colonial influence of institutions (Acemoglu et al., 2001).
Our second contribution regards the large set of public goods for which we estimate the effect of ethnic diversity. The link between ethnic diversity and public-good provision is typically investigated for one particular public good, such as health centers, schools or roads. In contrast, we here consider a set of nine public goods: the electricity grid, cellphone service, sewage-treatment facilities, piped-water systems, post offices, schools, police stations, health centers and paved roads.
The third, and main, contribution is to propose a new instrument based on a pre-colonial measure of ethnic diversity. It is well known that OLS regressions are subject to a number of caveats, such as omitted-variable bias, reverse causality and measurement error, leading to biased estimates. In contrast to the standard OLS approach, IV regressions yield causal estimates as long as the instrument is relevant and exogenous.
We find that the impact of ethnic diversity depends on the public good considered: in line
with previous work, we find a negative effect for access to the electricity grid, clean water and
roads. However, there is no significant impact on cellphone service, sewage treatment or health
centers and, surprisingly, a positive effect on schools, post offices and police stations. We estimate
an average effect on public-good access by constructing a general index. This index is slightly
negatively correlated with diversity: ceteris paribus, completely heterogeneous societies produce
3 percent less public goods than their homogeneous counterparts, as compared to the 20 percent
figure found in Alesina et al. (2003).
We test the validity of our findings via a series of robustness checks. We first control for multiple-hypothesis testing. Second, we use alternative measures of ethnic diversity, namely the polarization index and a notion of proximity among groups based on language. Third, we use an alternative data set to calculate our ethnic-diversity measures. Fourth, we use alternative specifications of the instrument. Fifth and last, we use alternative data for the pre-colonial population. In (almost) all of the specifications, ethnic diversity continues to adversely affect public-good provision on average. However, the size, and even sign, of this effect varies across specifications. In sum, we confirm the existence of a “diversity burden”, but one that is limited in size and not systematic.
The remainder of the paper is organized as follows. In Section 2, we present the data and the
variables of interest, and the instrument is then discussed in Section 3. The estimation framework
is presented in Section 4, and the results in Section 5. The different robustness checks appear in
Section 6. Last, Section 7 concludes.
2 Data
We appeal to five different data sets for our analysis: three waves of the Afrobarometer (2008-2015)², geographical data on soil quality, access to water and ethnic homeland territories calculated by Nunn and Wantchekon (2011) and Michalopoulos and Papaioannou (2013), data on pre-colonial populations from Manning (2013), data on the depth of ethnic divisions from Gershman and Rivera (2018) and an alternative data set for our measure of ethnic diversity: the Demographic Health Surveys³.
² Available at http://afrobarometer.org/data/merged-data
The descriptive statistics are displayed in Table 1.
2.1 Afrobarometer
The Afrobarometer data come from nationally-representative surveys with a minimum of 1200
individuals per country. We pool data from the fourth (2008-2009), fifth (2011-2013) and sixth
(2014-2015) rounds to increase statistical power within regions. The pooled data sets give us
information on 32 African countries:⁴ Algeria, Benin, Botswana, Burkina Faso, Cameroon, Cape
Verde, Côte d’Ivoire, Gabon, Ghana, Guinea, Kenya, Lesotho, Liberia, Madagascar, Malawi, Mali,
Mauritius, Morocco, Mozambique, Namibia, Niger, Nigeria, Sao Tome and Principe, Senegal,
Sierra Leone, South Africa, Swaziland, Tanzania, Togo, Uganda, Zambia and Zimbabwe. We have
information on roughly 100,000 individuals in over 400 regions. In addition to increased statistical
power, the use of several rounds of the Afrobarometer enables us to avoid the criticism of only
measuring a phenomenon at one single point in time (Rosenzweig and Udry, 2018).
Public goods. There are a variety of ways to analyze the impact of ethnic diversity on public-good
provision. At one end of the spectrum, Alesina et al. (1999) investigated the relationship between
ethnic diversity and the inputs to public goods, i.e. public spending; at the other end, some work
(Alesina et al. (2003), Banerjee et al. (2007)) considered the relationship between diversity and
public-good outputs, such as the literacy rate, school quality and the infant-mortality rate. We
here investigate this same link at a point in-between these two extremes, as we look at the impact
of fractionalization on access to public goods. Access, meaning that the good is present close
to the survey respondent, is a necessary but not sufficient condition for individuals being able to
enjoy the good: having an electricity pylon close to one’s home does not necessarily mean that
one is connected to the grid or that the electricity received is reliable; having a school nearby does
not mean that one is able to send one’s children there or that the school has access to books and
properly-trained teachers; health centers can exist but be of poor quality if the doctors do not show
up or if they lack medical supplies.
³ Available at https://dhsprogram.com/data/
⁴ There are 36 countries in the Afrobarometer data, but questions about ethnicity are absent from the questionnaires in Burundi, Egypt, Sudan and Tunisia.
Table 1: Summary statistics.
Variable Mean Std. Dev. Min. Max. N
Public goods
Electricity grid 0.61 0.49 0 1 123568
Cellphone service 0.89 0.31 0 1 120846
Sewage treatment 0.27 0.44 0 1 122416
Piped-water system 0.55 0.5 0 1 123243
Post office 0.25 0.43 0 1 122548
Schools 0.88 0.32 0 1 123284
Police stations 0.36 0.48 0 1 122616
Clinics 0.61 0.49 0 1 122655
Road 0.48 0.5 0 1 123648
Index of public goods 0.54 0.28 0 1 123648
Individual controls
Urbanized area 0.4 0.49 0 1 122361
Age 36.95 14.57 18 110 122615
Education 3.31 2.11 0 9 123381
Wealth -0.01 0.70 -1.35 1.31 122977
Religious 0.72 0.45 0 1 122423
Gender 0.5 0.5 0 1 123648
Regional controls
Region population 1614.97 1900.41 1.73 14558.1 104224
Region size 30361.4 41369.19 344.86 463712.6 104224
Suitability for agriculture 0.45 0.22 0 0.93 104224
Presence of rivers 0.15 0.17 0 1 104224
Pre-colonial institutions 1.63 0.81 0 3 104224
Ethnic Diversity Measures
EFI 0.64 0.21 0 0.96 123799
Historic EFI 0.27 0.22 0 0.79 103808
Polarization 0.6 0.18 0 0.98 123799
Historic Polarization 0.64 0.24 0 1 111732
EFI DHS 0.56 0.22 0 0.9 60351
ELF(1) 0.1 0.17 0 0.67 82475
ELF(5) 0.28 0.25 0 0.85 82475
ELF(10) 0.49 0.25 0 0.92 82475
ELF(13) 0.53 0.25 0 0.95 82475
The index of public goods is defined as the average of access to the nine public goods. Gender is 1 for males, 0 for females; populations are expressed in thousands; region size is in square kilometers. The wealth index is calculated from a factor analysis such that the mean is 0. The ELF(1) through ELF(13) variables represent ethnic fractionalization at the different levels of depth from Gershman and Rivera (2018).
The Afrobarometer surveys include information on the presence of nine public goods: the electricity grid, cellphone service, sewage-treatment facilities, piped-water systems, post offices, schools, police stations, health centers and paved roads. These are not self-reported but answered by the investigators, thus ensuring comparability across individuals. From the data, we construct an index of access to public goods (called Index of public goods below), which is defined as average access to all nine public goods: $Index = \frac{1}{9} \sum_{i=1}^{9} public\_good_i$. We use this index to investigate the effect of diversity on public-good provision in general.
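As a minimal sketch of the index construction described above, the following Python snippet averages nine 0/1 access indicators for one respondent. The field names are hypothetical (they are not the Afrobarometer variable names), and the handling of missing answers, which the text does not describe, is left out.

```python
# Hypothetical field names for the nine interviewer-coded access dummies.
PUBLIC_GOODS = [
    "electricity_grid", "cellphone_service", "sewage_treatment",
    "piped_water", "post_office", "school", "police_station",
    "health_center", "paved_road",
]


def public_goods_index(record):
    """Average of the nine 0/1 access dummies for one respondent."""
    return sum(record[good] for good in PUBLIC_GOODS) / len(PUBLIC_GOODS)


# Illustrative respondent with access to three of the nine goods.
example = {good: 0 for good in PUBLIC_GOODS}
example.update(electricity_grid=1, cellphone_service=1, school=1)
index_value = public_goods_index(example)
```

Because every component is a dummy, the index lies between 0 (no good present) and 1 (all nine present), which matches the range reported in Table 1.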
The Ethno-linguistic Fractionalization Index. As is common in the literature, we measure ethnic diversity by the Ethno-linguistic Fractionalization Index (EFI, sometimes referred to as Ethno-Linguistic Fractionalization - ELF - in the literature), which is the probability that two randomly-selected individuals belong to different ethnic groups (see Equation 1, where $s_i$ represents the share of group $i$ in the population). The larger the value of this index, the more ethnically diverse the population.
$$EFI = 1 - \sum_i s_i^2 \qquad (1)$$
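To illustrate Equation 1, here is a short Python sketch that computes the index from a list of self-reported group labels. The function name and the toy samples are assumptions made for illustration only; they are not taken from the survey data.

```python
from collections import Counter


def efi(group_labels):
    """EFI = 1 - sum_i s_i^2, where s_i is the share of group i."""
    counts = Counter(group_labels)
    total = sum(counts.values())
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


homogeneous = efi(["A"] * 10)            # a single group
two_equal = efi(["A"] * 5 + ["B"] * 5)   # two groups of equal size
four_equal = efi(list("ABCD") * 5)       # four groups of equal size
```

With equal-sized groups the index rises with the number of groups, and it is zero when everyone belongs to the same group.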
Using Afrobarometer data, we construct the fractionalization index at the regional level (there
are 400 regions in the data set). The mean value of EFI in our data is 0.64 with a standard
deviation of 0.21, reflecting the great variability across the continent (see Figure 1).
Figure 1: Distribution of the Ethno-Linguistic Fractionalization (EFI) and Polarization (Pola) Indexes
2.2 Pre-colonial territories
To construct our instrument, we use pre-colonial ethnic territories, taken from Murdock (1967)
and calculated by Nunn and Wantchekon (2011) and Michalopoulos and Papaioannou (2013). We
construct a pre-colonial measure of the EFI (see the discussion in Section 3). The instrument is,
due to data limitations, a measure of diversity of territories (rather than populations). The average
value of this measure is 0.27, which is far lower than the current level of fractionalization, thus
illustrating the sharp rise in diversity following colonization.
2.3 Controls
The individual controls we introduce from the Afrobarometer are urbanization (whether the individual lives in an urban or rural area), age, education, wealth, religiosity and gender. The wealth index is calculated from a factor analysis of questions about possessions (television, radio, car and mobile phone). We also introduce region-level controls using data from Michalopoulos and Papaioannou (2013): region population and size, the suitability of the soil for agriculture, the presence of rivers and the level of pre-colonial institutions.
2.4 Robustness checks
We check the validity of our results via several robustness checks (see Section 6 for the details). We first use an alternative measure of ethnic diversity, the Polarization Index, developed by Esteban and Ray (1994), which is defined as the closeness of the observed distribution to the bimodal distribution (50:50) (Equation 2). In the same way as for the EFI, we construct the current Polarization Index at the regional level and create the pre-colonial instrument of polarization.
$$POLA = 1 - \sum_i \left( \frac{0.5 - s_i}{0.5} \right)^2 s_i = 4 \sum_i \sum_{j \neq i} s_i^2 s_j \qquad (2)$$
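The two expressions in Equation 2 can be checked numerically. The Python sketch below implements both forms for a vector of group shares that sum to one; the function names and the example share vectors are illustrative assumptions.

```python
def polarization(shares):
    """POLA = 1 - sum_i ((0.5 - s_i) / 0.5)^2 * s_i."""
    return 1.0 - sum(((0.5 - s) / 0.5) ** 2 * s for s in shares)


def polarization_pairwise(shares):
    """Equivalent form: 4 * sum_i sum_{j != i} s_i^2 * s_j."""
    return 4.0 * sum(
        s_i ** 2 * s_j
        for i, s_i in enumerate(shares)
        for j, s_j in enumerate(shares)
        if i != j
    )


bimodal = [0.5, 0.5]                    # two groups of equal size
fragmented = [0.25, 0.25, 0.25, 0.25]   # four groups of equal size

pola_bimodal = polarization(bimodal)
pola_fragmented = polarization(fragmented)
```

Polarization peaks at the 50:50 split and falls as the population fragments into more groups, whereas the EFI keeps rising: this is why the two measures can rank regions differently.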
The second robustness check requires data on pre-colonial populations, from the estimations
of populations by slave-trade region from 1850 to 1900 in Manning (2013). We use this data to
refine the definition of our instrument by including population densities in its construction.
In the third robustness check, we look at the impact of the depth of ethnic diversity on public-good provision. To do so we use data from Gershman and Rivera (2018) that show the degree of ethnic diversity for different levels of language differences.
Last, we use an alternative data set, the Demographic Health Surveys, to construct our Ethno-Linguistic Fractionalization Index. These are much larger data sets that are collected in many
countries around the world, and particularly in Africa. They cover about 100,000 individuals per
country, and therefore produce more precise figures for our measure. However, the DHS and
Afrobarometer surveys are not carried out in exactly the same countries. We combine the data
for only 14 countries: Benin, Burkina Faso, Cameroon, Ghana, Guinea, Kenya, Malawi, Mali,
Senegal, Sierra Leone, South Africa, Swaziland, Uganda and Zambia.
3 Using an Instrumental-Variable approach to ensure causality
To go beyond Ordinary Least Squares (OLS) correlations, which may produce misleading conclusions due to reverse causality, measurement error and omitted-variable biases, we propose an instrument to establish the causal impact of fractionalization on public-good provision.
3.1 Presentation of the instrument
The instrument we propose here is a pre-colonial measure of the Ethnic Fractionalization Index,
calculated using the historical homelands of ethnic groups. Instrumental variables (IV) allow us
to cope with the previously-mentioned biases, provided that our instrument is relevant, i.e. the
instrument is correlated with EFI, and at least conditionally exogenous, i.e. the instrument only
affects public-good provision through its impact on ethnic diversity.
Identifying a valid instrument is a challenge. Acemoglu et al. (2001) use colonial settlers’ death
as an instrument for institutional capacity, Nunn and Wantchekon (2011) consider distance to the
coast to instrument the number of slaves taken from a country, and Atkin (2013) uses altitude,
rainfall and temperature to instrument agricultural income.
The instrument we construct is based on geographical data on pre-colonial ethnic territories, compiled by Nunn and Wantchekon (2011) and Michalopoulos and Papaioannou (2016) (see Figure 2⁵). African populations before the colonial period were characterized by low density (African population density in 1850 was 3 inhabitants per km², while the corresponding European figure was 15: Cameron (1993)), with few cities, and little ethnic diversity (Boserup, 1985). From the beginning of the colonial period (around 1870) cities started to emerge and grow. For instance, in 1884 the population of Bamako (Mali) was 2,500, but grew to 100,000 in 1960. The city of Nairobi (Kenya) was built from scratch by British settlers during the construction of the Kenya Uganda Railway in 1899. By the time of Kenyan independence in 1963, the population had reached 343,000 (Mitullah, 2003). These cities remain the regional administrative centers (most regions in Africa are named after their main city).
⁵ Copyright American Economic Association; reproduced with permission of the American Economic Review
Figure 2: Difference between ethnic historical homelands and current African borders (Source:
Michalopoulos and Papaioannou (2016))
The idea on which our instrument is based is what we could call an “aspiration effect”: if
the city is located in a region where the population was ethnically diverse in pre-colonial times,
the current population is also more likely to be ethnically diverse than if the city is located in a
region that was more homogeneous. Since cities were developed in the first place to serve colonial
interests, they can be considered as exogenous shocks that strongly affected how ethnic groups
interact in the long run. Things would have been different had ethnic groups chosen to live together.
A similar argument is often made regarding the exogeneity of current national borders in Africa,
most of which were established at the Berlin conference in 1885.
In a given region, we draw a disc centered on the current administrative city.⁶ Inside this disc, we calculate the area corresponding to the historical homeland of each ethnic group, which we use as a proxy for the number of people from this ethnic group within the disc (see Figure 3). The index thus measures the probability that two randomly-chosen pixels in the disc belong to different ethnic historical homelands. As a robustness check, we also calculate the index using data on population density in pre-colonial times from Manning (2013).⁷
⁶ We use a radius of 100km for our estimations. As a robustness check, we checked alternative values ranging from 50 to 500km: see Section 6.
Figure 3: Creation of the instrument. From the administrative center of present-day regions, we
draw a circle of 100km (not to scale on the figure) and then count the area controlled by each group
within the circle to compute a pre-colonial measure of ethnic diversity.
We then calculate the historic ethnic fractionalization index (Historic EFI) using the formula in Equation 1. Historic EFI, which we use as an instrument, has a mean of 0.27, which is considerably lower than the current value of EFI (0.64), highlighting the significant rise in diversity following colonization. As expected, the two measures are highly correlated, with a first-stage F-statistic of 2914 (see Table 2).
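The construction of the instrument can be sketched on a toy raster. The Python snippet below is only an illustration under strong assumptions: the homeland map is a small grid of equal-area pixels, distances are Euclidean in pixel units, and the function name and inputs are hypothetical. It does not reproduce the GIS processing of the actual homeland data.

```python
from collections import Counter
from math import hypot


def historic_efi(homeland_grid, center, radius):
    """1 - sum of squared area shares of the homelands inside the disc.

    homeland_grid: 2-D list of homeland labels, one per equal-area pixel
    center: (row, col) of the administrative city, in pixel units
    radius: disc radius, in pixel units
    """
    pixels_inside = Counter()
    for r, row in enumerate(homeland_grid):
        for c, label in enumerate(row):
            if hypot(r - center[0], c - center[1]) <= radius:
                pixels_inside[label] += 1
    total = sum(pixels_inside.values())
    return 1.0 - sum((n / total) ** 2 for n in pixels_inside.values())


# Toy 6 x 6 map: homeland "A" fills the left half, "B" the right half.
toy_map = [["A"] * 3 + ["B"] * 3 for _ in range(6)]

# City on the boundary between the two homelands: the disc is split evenly.
on_boundary = historic_efi(toy_map, center=(2.5, 2.5), radius=2.0)
# City well inside homeland A: the disc contains a single homeland.
inside_a = historic_efi(toy_map, center=(2.5, 0.0), radius=1.0)
```

A city sitting on a homeland boundary gets a high historic index, while a city in the middle of a single homeland gets zero, which is the variation the instrument exploits.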
3.2 Exogeneity
We argue that our instrument is immune to potential issues related to reverse causality, as the
data used to construct it are precolonial, at a time when the public goods as considered in this paper
did not exist in Africa.
⁷ The pre-colonial population data from Manning (2013) are calculated for slave-trade regions (which correspond to current countries), not for ethnic groups. Therefore the only observable differences between the original index and that measured using pre-colonial population densities correspond to cases where there are differences at country borders. This means that an ethnic territory which was exogenously divided in two by a border could be counted as artificially having two different densities.
Table 2: First stage of the 2SLS-IV regression. Dependent variable: the Ethno-linguistic Fractionalization Index.
                              Without controls   With controls
Historic EFI                  0.159***           0.246***
                              (0.003)            (0.003)
Urbanized area                                   0.043***
                                                 (0.001)
Age                                              -0.000***
                                                 (0.000)
Education                                        0.001**
                                                 (0.000)
Wealth                                           -0.001
                                                 (0.001)
Religious                                        -0.003**
                                                 (0.001)
Gender                                           0.001
                                                 (0.001)
Region population                                -0.000***
                                                 (0.000)
Region size                                      0.000***
                                                 (0.000)
Suitability for agriculture                      0.075***
                                                 (0.005)
Presence of rivers                               0.189***
                                                 (0.004)
Pre-colonial institutions                        -0.038***
                                                 (0.001)
Constant                      0.583***           0.326***
                              (0.001)            (0.009)
No. of obs.                   103800             97623
Adj. R²                       0.027              0.339
F                             2843.70            2914.68
* p<0.10, ** p<0.05, *** p<0.01. The estimations include country and Afrobarometer-round fixed effects in the second column only. Robust standard errors in parentheses. Gender is 1 for males, 0 for females.
However, endogeneity could still come from omitted variables. Indeed, it is possible that the location of the administrative centers, from which we draw the circle used to compute our instrument, is influenced by factors which favor (or impede) public good production. To avoid this potential omitted variable bias, we control for several factors.
First, we control for factors which could influence overall wealth and bring more (diverse)
populations, such as pre-colonial institutions (Acemoglu et al., 2001), soil suitability to agriculture
(we use the soil quality index from Michalopoulos and Papaioannou (2013)) and access to rivers
and oceans.
Second, we look in the literature for factors which could influence public good provision.
4 Regression framework
We want to estimate the causal impact of ethno-linguistic fractionalization on a set of public goods. The equation we estimate is the following for individual $i$ in region $j$ and country $k$:
$$Y_{ijk} = \beta_0 + \beta_1 EFI_{jk} + \gamma_1 X_{ijk} + \gamma_2 C_{jk} + \delta_k + \varepsilon_i \qquad (3)$$
Here $Y_{ijk}$ is a public-good dummy for the presence of a school, clinic, paved road, etc. The vector $X_{ijk}$ represents a set of controls at the individual level, while $C_{jk}$ is the set of regional controls. The $\delta_k$ parameter represents the country fixed effects, which capture country characteristics such as national institutions (which according to Collier (2000) are key determinants of the impact of heterogeneity) and factor endowments (which could increase the correlations with ethnic diversity: see Easterly and Levine (2003)).
In the case of endogenous ethnic diversity, a simple ordinary least squares (OLS) estimation will not be consistent. We therefore estimate a Two-Stage Least Squares Instrumental Variable (2SLS-IV) model with the following equations:
$$\begin{aligned} Y_{ijk} &= \beta_0 + \beta_1 \widehat{EFI}_{jk} + \gamma_1 X_{ijk} + \gamma_2 C_{jk} + \delta_k + \varepsilon_i \\ EFI_{jk} &= \pi_0 + \pi_1 \, historic\,EFI_{jk} + \omega_1 X_{ijk} + \omega_2 C_{jk} + \gamma_k + \mu_i \end{aligned} \qquad (4)$$
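The mechanics of the two stages can be sketched in Python on synthetic data. This is a stripped-down illustration, not the paper's specification: there is one instrument, one endogenous regressor, no controls and no fixed effects, and all variable names, the data-generating process and the value of TRUE_BETA are assumptions chosen so that OLS is visibly biased by an unobserved confounder.

```python
import random


def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    mean_a, mean_b = mean(a), mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def ols(x, y):
    """Slope and intercept of a one-regressor OLS fit of y on x."""
    slope = cov(x, y) / cov(x, x)
    return slope, mean(y) - slope * mean(x)


def two_stage_least_squares(z, x, y):
    """2SLS with one instrument z, one endogenous regressor x, no controls."""
    pi1, pi0 = ols(z, x)                   # first stage: x on z
    x_hat = [pi0 + pi1 * zi for zi in z]   # fitted values of x
    beta1, _ = ols(x_hat, y)               # second stage: y on fitted x
    return beta1


# Synthetic data (purely illustrative): u is an unobserved confounder
# that raises both the regressor x and the outcome y.
rng = random.Random(0)
TRUE_BETA = -0.5
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + ui for zi, ui in zip(z, u)]
y = [TRUE_BETA * xi + 2 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

beta_ols, _ = ols(x, y)
beta_iv = two_stage_least_squares(z, x, y)
```

In this design the OLS slope is pushed upward by the confounder, while the 2SLS slope stays close to the true coefficient. With a single instrument, the 2SLS estimate also coincides with the ratio cov(z, y) / cov(z, x).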
5 Results
5.1 OLS results
We first estimate Equation 3 without any controls: the results appear in Table 3. In contrast with the previous results in the literature, most of the coefficients are positive: only that for schools is negative. In particular, the index of public goods appears to be positively correlated with ethnic diversity. Table 3 also shows that there is no single pattern linking ethnic division and access to public goods, as the coefficients are significantly different from one good to the other.
Table 3: OLS regressions without controls.
Dependent     Index of       Electricity   Cellphone   Sewage      Piped-water
variable      public goods   grid          service     treatment   system
EFI           0.081***       0.033***      0.073***    0.133***    0.094***
              (0.004)        (0.006)       (0.004)     (0.006)     (0.007)
Constant      0.493***       0.593***      0.844***    0.186***    0.488***
              (0.002)        (0.004)       (0.003)     (0.004)     (0.004)
No. of obs.   123648         123568        120846      122416      123243
Adj. R²       0.004          0.000         0.003       0.004       0.002
Dependent     Post           Schools       Police      Clinic      Road
variable      office
EFI           0.110***       -0.012***     0.125***    0.048***    0.147***
              (0.006)        (0.004)       (0.006)     (0.006)     (0.007)
Constant      0.176***       0.892***      0.283***    0.576***    0.385***
              (0.004)        (0.003)       (0.004)     (0.004)     (0.004)
No. of obs.   122548         123284        122616      122655      123648
Adj. R²       0.003          0.000         0.003       0.000       0.004
* p<0.10, ** p<0.05, *** p<0.01. Country and Afrobarometer-round fixed effects are NOT included. The Index of public goods variable is defined as average access to the nine public goods in the data set.
The results from OLS estimations without and then with controls appear in Appendix A for each public good and the index (the first two columns of Tables A.1 to A.10). Adding controls eliminates the positive correlation between the index of public goods and diversity, with the coefficient becoming insignificant. This corroborates recent results from Gisselquist et al. (2016) and Gershman and Rivera (2018), who find that, when investigated at the sub-national level, the link between ethnic diversity and public-good provision disappears.
5.2 2SLS - IV results
The main contribution of this paper is to estimate the causal link between ethnic fractionalization and public-good access. We first check the relevance of our instrument by regressing EFI on the instrument by OLS, including all of the controls used in the main regression (in order to avoid biases: see Angrist and Pischke (2008, Chapter 4)): this is the first stage of our 2SLS-IV regression (the second line of Equation 4). The results in Table 2 show that our instrument is relevant, as the rule of thumb (F-statistic > 10, see Stock and Yogo (2002)) clearly holds.
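The relevance check can be sketched as follows in Python for the simplest case of one instrument and no controls. This is an assumption-laden illustration: it uses the classical homoskedastic F-statistic, whereas the tables report robust standard errors, and the toy data are invented.

```python
def first_stage_f(z, x):
    """Homoskedastic F-statistic for the slope in a regression of x on z."""
    n = len(z)
    mean_z, mean_x = sum(z) / n, sum(x) / n
    s_zz = sum((zi - mean_z) ** 2 for zi in z)
    s_zx = sum((zi - mean_z) * (xi - mean_x) for zi, xi in zip(z, x))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    explained = (s_zx / s_zz) * s_zx   # explained sum of squares
    residual = s_xx - explained        # residual sum of squares
    return explained / (residual / (n - 2))


instrument = [0, 1, 2, 3, 4, 5]
strongly_related = [0.1, 0.9, 2.2, 2.8, 4.1, 4.9]   # tracks the instrument
weakly_related = [1, 0, 1, 0, 1, 0]                 # barely related to it

f_strong = first_stage_f(instrument, strongly_related)
f_weak = first_stage_f(instrument, weakly_related)
```

Only the first toy regressor clears the F > 10 threshold; the second would be flagged as a weak first stage.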
We then use this instrument to estimate the 2SLS-IV model (the first line of Equation 4). As
discussed in the previous Section, this enables us to interpret the estimated coefficient on ethnic
fractionalization causally. The IV-regression results appear in Tables 4 and 5, and the coefficients
are plotted in Figure 4. The relationship between the OLS and IV regression results is in Table
6 and Figure 5. Estimation results for each public good (including the index) are displayed in
Appendix A in Tables A.1 to A.10.
Figure 4: Estimated coefficients for the Ethno-linguistic Fractionalization Index for each public good in 2SLS-IV estimations. Individual controls include urban area, gender, age, education, wealth and religiosity. Regional controls include region population and surface, pre-colonial institutions, soil suitability for agriculture and access to river streams and oceans. Country and Afrobarometer-round fixed effects are included. The Index of public goods variable is defined as average access to the nine public goods in the data set.
The 2SLS-IV results (Tables 4 and 5) indicate that, on average, ethnic diversity has a negative
impact on public-good provision. The findings in the existing literature thus seem to be confirmed.
Table 4: 2SLS-IV results I.
Dependent Index of Electricity Cellphone Sewage Piped-water
variable public goods grid service treatment system
EFI -0.030** -0.093*** 0.019 0.021 -0.107***
(0.015) (0.026) (0.022) (0.026) (0.030)
Urbanized area 0.278*** 0.403*** 0.081*** 0.392*** 0.413***
(0.002) (0.003) (0.002) (0.003) (0.003)
Age -0.000 -0.000 0.000 0.000* 0.000
(0.000) (0.000) (0.000) (0.000) (0.000)
Education 0.014*** 0.024*** 0.008*** 0.014*** 0.016***
(0.000) (0.001) (0.001) (0.001) (0.001)
Wealth 0.039*** 0.086*** 0.021*** 0.043*** 0.059***
(0.001) (0.002) (0.002) (0.002) (0.002)
Religious 0.003** -0.003 0.015*** 0.002 0.006*
(0.002) (0.003) (0.003) (0.003) (0.003)
Gender -0.016*** -0.030*** -0.010*** -0.017*** -0.021***
(0.001) (0.002) (0.002) (0.002) (0.003)
Region population 0.000 0.000*** 0.000*** 0.000*** 0.000***
(0.000) (0.000) (0.000) (0.000) (0.000)
Region size -0.000*** -0.000*** -0.000*** -0.000*** -0.000***
(0.000) (0.000) (0.000) (0.000) (0.000)
Suitability for agriculture -0.040*** -0.057*** -0.001 -0.125*** -0.049***
(0.005) (0.009) (0.008) (0.008) (0.011)
Presence of rivers 0.002 0.095*** -0.004 -0.037*** 0.110***
(0.005) (0.009) (0.007) (0.010) (0.011)
Pre-colonial institutions 0.003** 0.004 -0.003 -0.005** -0.013***
(0.001) (0.002) (0.002) (0.002) (0.003)
Constant 0.590*** 0.587*** 0.718*** 0.602*** 0.617***
(0.009) (0.015) (0.011) (0.015) (0.016)
No. of obs. 97623 97564 95087 96845 97304
Adj. R² 0.471 0.474 0.148 0.430 0.369
* p<0.10, ** p<0.05, *** p<0.01. The estimations include country and Afrobarometer-round fixed-
effects. Robust standard errors in parentheses. Gender is 1 for males, 0 for females. The Index of
public goods variable is defined as average access to the nine public goods in the data set.
Table 5: 2SLS-IV results II.
Dependent Post Schools Police Clinic Road
variable office
EFI 0.144*** 0.105*** 0.155*** -0.045 -0.391***
(0.028) (0.023) (0.032) (0.035) (0.033)
Urbanized area 0.291*** 0.029*** 0.336*** 0.202*** 0.342***
(0.003) (0.002) (0.004) (0.004) (0.004)
Age 0.000 -0.000* -0.000*** -0.000*** 0.000
(0.000) (0.000) (0.000) (0.000) (0.000)
Education 0.009*** 0.008*** 0.012*** 0.015*** 0.018***
(0.001) (0.001) (0.001) (0.001) (0.001)
Wealth 0.025*** 0.010*** 0.032*** 0.035*** 0.044***
(0.002) (0.002) (0.002) (0.003) (0.002)
Religious 0.003 0.015*** 0.004 0.009** -0.025***
(0.003) (0.003) (0.003) (0.004) (0.004)
Gender -0.010*** -0.006*** -0.013*** -0.014*** -0.020***
(0.002) (0.002) (0.003) (0.003) (0.003)
Region population -0.000*** -0.000 -0.000*** -0.000 -0.000***
(0.000) (0.000) (0.000) (0.000) (0.000)
Region size 0.000*** 0.000** 0.000** 0.000 -0.000***
(0.000) (0.000) (0.000) (0.000) (0.000)
Suitability for agriculture -0.076*** 0.011 -0.039*** 0.020* -0.040***
(0.009) (0.008) (0.010) (0.011) (0.011)
Presence of rivers -0.107*** -0.019** -0.068*** -0.034*** 0.050***
(0.010) (0.008) (0.011) (0.012) (0.012)
Pre-colonial institutions 0.028*** 0.009*** 0.006** 0.003 0.003
(0.003) (0.002) (0.003) (0.003) (0.003)
Constant 0.401*** 0.743*** 0.347*** 0.662*** 0.627***
(0.018) (0.016) (0.020) (0.021) (0.019)
No. of obs. 96748 97337 96866 96832 97623
Adj. R² 0.252 0.051 0.207 0.118 0.235
* p<0.10, ** p<0.05, *** p<0.01. The estimations include country and Afrobarometer-round
fixed-effects. Robust standard errors in parentheses. Gender is 1 for males, 0 for females.
Table 6: EFI coefficients in different specifications: OLS without controls, OLS with controls and
IV.
Dependent variable | OLS without controls | OLS with controls | IV
Index of public goods 0.081*** 0.004 -0.030**
(0.004) (0.003) (0.015)
Electricity grid 0.033*** -0.044*** -0.093***
(0.006) (0.006) (0.026)
Cellphone service 0.073*** 0.022*** 0.019
(0.004) (0.005) (0.022)
Sewage treatment 0.133*** 0.050*** 0.021
(0.006) (0.006) (0.026)
Piped water 0.094*** -0.059*** -0.107***
(0.007) (0.007) (0.030)
Post offices 0.110*** 0.019*** 0.144***
(0.006) (0.006) (0.028)
Schools -0.012*** -0.029*** 0.105***
(0.004) (0.005) (0.023)
Police stations 0.125*** 0.042*** 0.155***
(0.006) (0.008) (0.032)
Health centers 0.048*** 0.005 -0.045
(0.006) (0.008) (0.035)
Roads 0.147*** 0.025*** -0.391***
(0.007) (0.007) (0.033)
* p<0.10, ** p<0.05, *** p<0.01. Robust standard errors are in parentheses. The individual controls include
urban area, gender, age, education, wealth and religiosity. The regional controls include region population
and surface, pre-colonial institutions, the suitability of soil for agriculture and access to river streams and
oceans. Country and Afrobarometer-round fixed effects are included. The Index of public goods variable is
defined as average access to the nine public goods in the data set.
[Figure 5: coefficient plot of the EFI estimates (horizontal axis from -0.400 to 0.200). Panels: Index of public goods, Electricity, Cellphone, Sewage, Piped water, Post, Schools, Police, Clinic, Road. Series: OLS without controls, OLS with controls, IV. The plotted values are those reported in Table 6.]
Figure 5: Estimated coefficients for the Ethno-linguistic Fractionalization Index for each public good in
OLS without controls, OLS with controls and IV estimations. Individual controls include urban area, gen-
der, age, education, wealth and religiosity. Regional controls include region population and surface, pre-
colonial institutions, soil suitability for agriculture and access to river streams and oceans. Country and
Afrobarometer-round fixed effects are included. The Index of public goods variable is defined as average
access to the nine public goods in the data set.
Ethnic diversity is associated with 3 percent lower public-good provision, as measured by the Index
of public goods. This coefficient is far smaller than that previously found in the literature. However,
the results also show that the effects differ significantly according to the public good considered,
both in size (the coefficient for access to electricity is -0.093, whereas that for roads is
-0.391) and in sign: three goods (electricity, clean water and roads) are negatively affected by
diversity; for three (cellphone coverage, sewage treatment and health centers) the effect is
not significant; and for the last three (post offices, schools and police stations) the effect of diversity
is positive. This variety of results calls for a better understanding of the channels through which
ethnic diversity affects public-good provision, and in particular how public-good characteristics
(excludability, rivalry, returns to scale, relevant population size etc.) are related to the impact of
diversity.
6 Robustness Checks
In this Section, we test the sensitivity of our results to a number of robustness checks. We first
use a different measure of ethnic diversity, the polarization index, to correct for biases in the EFI
measure. Second, we refine our instrument by including pre-colonial populations in its calculation.
Third, we test the sensitivity of our results to the depth of ethnic divisions, using different measures
of the depth of language diversity. Fourth, we modify the radius of our instrument, to control for
any potential bias from choosing 100km as an arbitrary value for our instrument. Fifth, we use
an alternative data set to control for potential sampling issues within regions in the Afrobarometer
data, by using data from the Demographic Health Surveys. Last, we account for potential multiple-
hypothesis testing issues by carrying out Bonferroni and Holm p-value corrections.
6.1 Using an alternative measure of diversity: the polarization index
There are several issues with the use of the EFI measure to capture the effect of ethnic
diversity (for a good review, see Posner (2004)). We therefore test the robustness of our findings by
re-estimating our regressions using the measure of ethnic polarization developed by Esteban and
Ray (1994). This measure is defined by the formula in Equation 2, with s_i being the population
share belonging to group i. This index is a measure of the closeness between the distribution of
groups in the population and the 50:50 distribution. Contrary to the fractionalization measure, for a
population with groups of equal size, the polarization index does not monotonically increase with
the number of ethnic groups. Polarization in fact reaches a maximum when there are two groups
with identically-sized populations (see Montalvo and Reynal-Querol (2005)).
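The contrast between the two indices can be illustrated with a short sketch. It assumes the usual Herfindahl-type fractionalization index, 1 - Σ s_i², and the Reynal-Querol form of the polarization index, 4 Σ s_i² (1 - s_i), as used by Montalvo and Reynal-Querol (2005). Equation 2 is not reproduced in this section, so the exact form used in the paper may differ, and the function names are purely illustrative.

```python
def fractionalization(shares):
    """Herfindahl-type index: 1 - sum of squared population shares."""
    return 1.0 - sum(s * s for s in shares)


def polarization(shares):
    """Reynal-Querol form (assumed here): 4 * sum(s_i^2 * (1 - s_i))."""
    return 4.0 * sum(s * s * (1.0 - s) for s in shares)


def equal_groups(n):
    """Population shares for n groups of identical size."""
    return [1.0 / n] * n


# Compare both indices as the number of equally sized groups grows.
for n in (2, 3, 4, 10):
    shares = equal_groups(n)
    print(n, round(fractionalization(shares), 3), round(polarization(shares), 3))
```

With equally sized groups, fractionalization keeps rising with the number of groups, whereas polarization under this form peaks at two groups and declines afterwards, which is the property described above.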
We calculate the measure of polarization for both the present and pre-colonial times. We then
estimate the model in Equation 5. Results are presented in Appendix B.
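$$
\begin{aligned}
Y_{ijk} &= \beta_0 + \beta_1 \widehat{\mathit{POLA}}_{jk} + \gamma_1 X_{ijk} + \gamma_2 X'_{jk} + \delta_k + \epsilon_i \\
\mathit{POLA}_{jk} &= \pi_0 + \pi_1 \, \mathit{historic\_EFI}_{jk} + \omega_1 X_{ijk} + \omega_2 X'_{jk} + \gamma_k + \mu_i
\end{aligned}
\qquad (5)
$$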
The F-stat, displayed in Table B.1, is 55000 (far higher than 10), so that we can confidently
estimate the 2SLS-IV regressions. The estimated coefficients from regressions with the fraction-
alization and polarization indices are plotted in Figure B.1. We find that of our nine public goods,
five remain very significantly negatively affected by ethnic polarization, while two are positively
affected. For the last two goods, the effect is insignificant. The result for the Index of public goods
(Table B.2) is very significant, and negative. In terms of size, the polarization effect appears to be
larger than that of ethnic diversity, so that, in line with the results in Montalvo and Reynal-Querol
(2005), ethnic diversity appears to be particularly problematic when the distribution of groups is
close to bimodal (50:50). The negative impact of ethnic diversity is confirmed.
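The two-stage logic and the first-stage F-statistic check can be sketched on simulated data. This is a minimal illustration, not the estimation code behind the paper: it uses a single instrument and no controls, fixed effects or robust standard errors, and every variable name and parameter value below is a hypothetical choice made for the example.

```python
import random


def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    mean_a, mean_b = mean(a), mean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)


def ols_slope(regressor, outcome):
    """Slope of a bivariate OLS regression with an intercept."""
    return covariance(regressor, outcome) / covariance(regressor, regressor)


def two_stage_least_squares(y, d, z):
    """Return (IV slope of y on d, first-stage F-statistic) for one instrument z."""
    n = len(y)
    # First stage: project the endogenous regressor d on the instrument z.
    pi1 = ols_slope(z, d)
    pi0 = mean(d) - pi1 * mean(z)
    d_hat = [pi0 + pi1 * zi for zi in z]
    # First-stage F-statistic for the single excluded instrument.
    mean_d = mean(d)
    explained = sum((dh - mean_d) ** 2 for dh in d_hat)
    residual = sum((di - dh) ** 2 for di, dh in zip(d, d_hat))
    f_stat = explained / (residual / (n - 2))
    # Second stage: regress the outcome on the first-stage fitted values.
    return ols_slope(d_hat, y), f_stat


# Simulated data (assumption): u is an unobserved confounder, true effect is -0.5.
rng = random.Random(0)
n = 20_000
z = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
d = [0.8 * zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [-0.5 * di + 2.0 * ui + rng.gauss(0, 1) for di, ui in zip(d, u)]

beta_ols = ols_slope(d, y)
beta_iv, f_stat = two_stage_least_squares(y, d, z)
print(f"OLS: {beta_ols:.3f}  IV: {beta_iv:.3f}  first-stage F: {f_stat:.0f}")
```

In this simulated setting the confounder pushes the naive OLS slope upward, while the IV slope should land close to the true value of -0.5 as long as the first-stage F-statistic is well above the rule-of-thumb value of 10. The simulation says nothing about the actual data.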
6.2 Using pre-colonial populations
One limitation of the calculation of our instrument is the use of the spatial distribution of ethnic
groups to approximate the group’s pre-colonial population (see Section 3). One solution would be
to incorporate data on ethnic-group population prior to colonization. Unfortunately, this data for
all ethnic groups is not, to the best of our knowledge, available. However, it is now possible to
obtain estimations of pre-colonial populations at the national level, thanks to the work of Manning
(2013). Including population in our instrument should improve the quality of our pre-colonial
index of diversity.
However, as the data is calculated at the country level, we do not have any idea of the intra-
country distribution, only that between countries. In the calculation of our instrument, we used
a radius from the regional administrative center (Figure 3). Density variations will then only be
observed across country borders. This could be even more problematic as the borders were drawn
regardless of ethnic groups, and often split groups in two. Including population at the country
level would therefore mean that we count the pixels belonging to the historical territory of a single
ethnic group differently depending on the country to which they were assigned.
We use population data for 1850 through 1900 for availability reasons. Results are presented
in Appendix C. The first-stage regressions (Table C.1) indicate that we can perform 2SLS-IV
regressions (the F-statistic is far greater than 10). The results are plotted in Figure C.1 and turn
out to be very similar to our initial results. The results for the index of public goods (Table C.2)
also indicate that our initial results continue to hold, with a significant negative impact of ethnic
diversity on public-good provision.
6.3 Using the depth of ethnic divisions
Another issue in the analysis of ethnic diversity is that the definition of ethnic group is somewhat
arbitrary. Even slightly different group definitions (based on language, religion, tradition,
moral values etc.) can produce very large differences in the number and size of the groups.8
Building on the work of Fearon and Laitin (1996), a stream of literature has focused on the
role of the depth of division. The underlying idea is that the deeper the divisions (often measured
by linguistic, but sometimes also genetic, differences), the more rooted the differences, and thus
the more adverse the impact of diversity on economic outcomes. A recent paper by Gershman and
Rivera (2018) found that “only deep-rooted diversity based on cleavages formed in the distant past
is strongly inversely associated with regional development”. In particular, for their 13 levels of
language diversity, they find an adverse link between diversity and economic outcomes only below
Level 9 (indicating diversity between groups that speak very different languages). We use their
regional measures of ethnic diversity at different levels of division depth to test the sensitivity of
our results to division depth. We therefore use our pre-colonial measure of diversity to instrument
the measures at different levels.
8 One famous example is Rwanda: as they speak similar languages, Hutus and Tutsis were included in the same
group in the Atlas of Murdock (1967). History has unfortunately shown that there existed considerable tensions within
this group.
Results are displayed in Appendix D. First-stage estimations are presented in Tables D.1 and
D.2. The measures are very significantly correlated with our instrument, so that we can use them
in the 2SLS-IV regressions. The results for levels 1, 5, 10 and 13 are displayed in Figure D.1, and
those for all levels on the Index of public goods in Tables D.3 and D.4. These results confirm our
findings: our initial result that diversity has a negative causal effect on public-good provision is thus
robust to division depth.
6.4 Modification of the radius
To see whether instrument validity depends on our arbitrary value of 100km for the radius of
the circle used to calculate it (see Figure 3), we re-estimated the regressions for values of the radius
ranging from 50km to 500km.
Results are displayed in Appendix E. The first-stage regressions appear in Table E.1, and reveal
a positive significant effect for all of the values of the radius considered. The IV regressions for the
different radii are plotted in Figure E.1, with the regression results for the index of public goods in
Table E.2. These show that the effect we estimated for the initial radius value is in fact the smallest
observed. For the index of public goods, our result of an adverse effect of ethnic diversity is then
robust to the radius chosen for the calculation of our instrument.
6.5 Using an alternative data set: the Demographic Health Surveys
One final issue with our measure of diversity is that Afrobarometer surveys are representative
at the country level, but not at the regional level. To help overcome this problem, we pool data from
three rounds of the survey to gain statistical power. This does not however necessarily eliminate the
potential sampling bias. We thus turn to an alternative data set: the Demographic Health Surveys.
We construct the EFI measure for regions, which turns out to be very strongly correlated with our
initial measure (see Figure F.1).
Results are displayed in Appendix F. We estimated the first-stage regressions to check whether
we could use our instrument with the DHS measure (Table F.1). The results from the 2SLS-IV
regressions on all the public goods are presented in Figure F.2, and those from the estimation on
the index of public goods in Table F.2. Overall, the results from the DHS measure tend to be
stronger, and confirm that the impact of ethnic diversity on the provision of public goods is indeed
negative.
6.6 Statistical inference
As we estimate the model in Equation 4 for ten outcomes (the nine public goods in Afro-
barometer data and our index of public-good access), there is a risk of finding significant effects
that are in fact false positives. We carry out two tests to check whether our findings are robust to
multiple-hypothesis testing concerns. The results appear in Table G.1 in Appendix G.
The first test is the Bonferroni correction (Sidák, 1968). The idea here is that, to avoid false
positives, we need to multiply the p-value obtained in the regressions by the number of tests per-
formed. As we here estimate the effect of ethnic diversity on ten outcomes, we multiply all the
p-values by a factor of 10.
Second, we perform the Holm correction (Holm, 1979), which takes into account that the
Bonferroni process is very conservative and increases the risk of type-II errors (the risk of not
identifying true positives) drastically.
The results show that there is no longer a significant negative impact of ethnic diversity on the
index of public goods. However, the results remain highly significant for all the individual public
goods in the data set that were significant in the initial results.
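The two corrections can be sketched as follows. This is an illustrative implementation of the textbook Bonferroni and Holm adjustments, not the code used for Table G.1. The p-value for the index is approximated from the rounded coefficient and standard error in Table 6 under a normal approximation, which is an assumption, and the list of four p-values in the second example is hypothetical.

```python
import math


def two_sided_p_value(coef, std_err):
    """Two-sided p-value under a standard normal approximation (assumption)."""
    z = abs(coef / std_err)
    return math.erfc(z / math.sqrt(2.0))


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Holm step-down adjustment: the k-th smallest p-value is scaled by (m - k + 1)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted


# Index of public goods: IV coefficient -0.030 with standard error 0.015 (Table 6).
p_index = two_sided_p_value(-0.030, 0.015)
print(round(p_index, 4), round(min(1.0, 10 * p_index), 3))

# Hypothetical p-values, only to contrast the two procedures.
example = [0.001, 0.04, 0.03, 0.2]
print(bonferroni(example))
print(holm(example))
```

Under these assumptions the unadjusted p-value for the index is just below 0.05, so multiplying it by ten pushes it well above conventional thresholds, in line with the loss of significance reported above. Holm never adjusts a p-value by more than Bonferroni does, which is why it is the less conservative of the two.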
6.7 Summary of the robustness checks
We performed a series of tests to check the validity of our main finding (that ethnic diversity has
a significant, causal negative impact on public-good provision). A summary of all the regressions
carried out on the index of public goods appears in Table 7 and Figure 6. All of the results for the
IV regressions point in the same direction and confirm our initial findings.
[Figure 6: coefficient plot for the Index of public goods (horizontal axis from -1.500 to 0.000). Row labels: EFI, Polarization, ELF(1), ELF(10), EFI DHS. Series: OLS without controls, OLS with controls, IV EFI, IV EFI with population 1850, IV Polarization, IV with GR Level 1, IV with GR Level 10, IV radius 300km, IV radius 500km, IV with EFI from DHS. The plotted values are those reported in Table 7.]
Figure 6: Estimated coefficients from regressions of the Index of public goods: OLS without controls, OLS
with controls, IV with EFI, IV including pre-colonial population in 1850 from Manning (2013), IV with
the Polarization index, IV including depth of divisions (using Levels 1 and 10 from Gershman and Rivera
(2018)) and IV with different radii of 300 and 500km. Individual controls include urban area, gender, age,
education, wealth and religiosity. Regional controls include region population and surface, pre-colonial in-
stitutions, soil suitability for agriculture and access to river streams and oceans. Country and Afrobarometer-
round fixed effects are included. The Index of public goods variable is defined as average access to the nine
public goods in the data set.
Table 7: Regression results for the Index of public goods in all of the specifications.
                    (1)           (2)          (3)         (4)          (5)
                    OLS without   OLS with     IV EFI      IV EFI       IV Pola
                    controls      controls                 pop 1850
EFI                 0.081***      0.004        -0.030**    -0.752***
                    (0.004)       (0.003)      (0.015)     (0.084)
Polarization                                                            -0.212***
                                                                        (0.030)
Controls included   No            Yes          Yes         Yes          Yes
No. of obs.         123648        101393       97623       101027       101027
Adj. R²             0.004         0.477        0.471       0.229        0.467

                    (6)           (7)          (8)         (9)          (10)
                    IV with GR    IV with GR   IV radius   IV radius    IV with
                    Level 1       Level 10     300km       500km        DHS
EFI                 -0.071***     -0.042***    -1.331***   -1.093***    -0.795***
                    (0.020)       (0.012)      (0.181)     (0.105)      (0.063)
Controls included   Yes           Yes          Yes         Yes          Yes
No. of obs.         69173         69173        101027      101194       56447
Adj. R²             0.439         0.439        .           .            0.256
* p<0.10, ** p<0.05, *** p<0.01. Robust standard errors in parentheses. Country and Afrobarometer-round
fixed effects are included, except in the 1st column. The individual controls include urban area, gender,
age, education, wealth and religiosity. The regional controls include region population and surface, pre-
colonial institutions, suitability of the soil for agriculture and access to river streams and oceans. Columns 1
to 3 present estimations using the Ethno-linguistic Fractionalization Index, Column 4 using the instrument
with pre-colonial populations of 1850 (using data from Manning (2013)), Column 5 the Polarization index,
Columns 6 and 7 specifications with Levels 1 and 10 of EFI from Gershman and Rivera (2018), Columns 8
and 9 specifications with changes in the radius used to construct the instrument, and Column 10 estimations
when EFI is calculated using DHS data.
Table 8: Estimated coefficients on ethnic diversity for each public good in all the specifications.
Columns: (1) OLS without controls; (2) OLS with controls; (3) IV EFI; (4) IV EFI with pop 1850; (5) IV Pola
Electricity grid 0.033*** -0.044*** -0.093*** 0.311** 0.135
Cellphone service 0.073*** 0.022*** 0.019 -0.142*** -0.354***
Sewage treatment 0.133*** 0.050*** 0.021 -1.992*** -0.596***
Piped water 0.094*** -0.059*** -0.107*** -0.872*** -0.193***
Post offices 0.110*** 0.019*** 0.144*** -0.165 -0.055
Schools -0.012*** -0.029*** 0.105*** 0.273*** 0.304***
Police stations 0.125*** 0.042*** 0.155*** -0.750*** -0.474***
Health centers 0.048*** 0.005 -0.045 -1.094*** 0.110
Roads 0.147*** 0.025*** -0.391*** -2.059*** -0.764***
No. negatives 1 3 3 6 5
No. positives 8 5 3 2 1
No. insignificant 0 1 3 1 3
Columns: (6) IV with GR Level 1; (7) IV with GR Level 10; (8) IV radius 300km; (9) IV radius 500km; (10) IV with DHS
Electricity grid 0.023 0.014 -0.121 0.266** -0.989***
Cellphone service 0.086*** 0.051*** -0.363** -0.010 -0.380***
Sewage treatment -0.066* -0.039* -2.949*** -1.020*** -0.799***
Piped water -0.350*** -0.207*** -1.352*** -1.189*** -1.649***
Post offices 0.128*** 0.076*** -0.563*** -1.759*** 0.231**
Schools 0.131*** 0.078*** 0.492** -0.536*** -0.154*
Police stations 0.135*** 0.080*** -1.271*** -1.976*** -0.438***
Health centers -0.123*** -0.073*** -2.258*** -2.657*** -0.996***
Roads -0.527*** -0.313*** -3.209*** -1.036*** -1.749***
No. negatives 3 3 7 7 7
No. positives 4 4 1 1 1
No. insignificant 2 2 1 1 1
* p<0.10, ** p<0.05, *** p<0.01. Country and Afrobarometer-round fixed effects are included, except in
the 1st column. The individual controls include urban area, gender, age, education, wealth and religios-
ity. The regional controls include region population and surface, pre-colonial institutions, suitability of the
soil for agriculture and access to river streams and oceans. Columns 1 to 3 present estimations using the
Ethno-linguistic Fractionalization Index, Column 4 using the instrument with pre-colonial populations of
1850, Column 5 the Polarization index, Columns 6 and 7 specifications with Levels 1 and 10 of EFI from
Gershman and Rivera (2018), Columns 8 and 9 specifications with changes in the radius used to construct
the instrument, and Column 10 estimations when EFI is calculated using DHS data. The number of positive,
negative and insignificant coefficients at the bottom of each column are defined at the 5% threshold.
7 Conclusion
The “African Growth Tragedy” refers to African growth remaining rather low in the decades
following independence. The exact reasons for this slow growth remain a matter of debate. One
important factor, emphasized by many authors following Easterly and Levine (1997), is ethnic
divisions. We have here looked at a necessary condition for some economic growth to occur: the
presence of basic public goods such as roads and post offices. It is furthermore often suspected that
one major consequence of ethnic divisions is the prevention of efficient public-good provision.9 We
thus ask whether ethnic divisions have acted as a brake on growth by preventing the provision of
the required public goods.
Taking advantage of better data availability, we estimate the causal effect of ethnic fractional-
ization on public-good provision. The average effect is only moderate. Ethnic divisions do play a
role, but not always in the expected direction (they sometimes have positive effects), resulting in a
small average effect.
Our analysis here is conducted at the regional level, in contrast with most previous work that
was at the national level. A number of interesting points are worth noting. First, ethnic fractional-
ization varies greatly across regions within a given country (in our sample, the standard deviation
of EFI at the national level is 0.10, while that at the regional level is 0.21). Second, the regional bor-
ders surrounding cities most often developed during the colonial period. The locations of colonial
cities then played a major role in explaining the current regional levels of EFI. Had colonization
not occurred, we conjecture that cities would have developed inside ethnic homelands, generating
less ethnic diversity. As a result, the effect of ethnic diversity may be stronger at the regional than
at the national level.
Last, there is another ethnic-diversity factor that has not yet been widely studied, and which
would likely benefit from further research: the fact that ethnic belonging varies over time. Most of
the literature (including the present paper) considers ethnic groups as fixed, while there are many
examples of changing relationships between groups. For instance, clans with both Hutus and
Tutsis lived peacefully at the beginning of the twentieth century (Minnaert, 2009), even though
ethnic tensions later led to the 1994 genocide. Further research should try to understand the factors that
lie behind feelings of ethnic belonging. For instance, Depetris-Chauvin et al. (2018) found that
after a victory of the national football team, individuals were significantly more likely to identify
themselves as natives of the country rather than as members of their ethnic groups.
9 Another important issue is ethnic favoritism, i.e. the tendency for political leaders to favor their own ethnic group.
References
Acemoglu, D., S. Johnson, and J. A. Robinson (2001). The colonial origins of comparative devel-
opment: An empirical investigation. American Economic Review 91(5), 1369–1401.
Alesina, A., R. Baqir, and W. Easterly (1999). Public goods and ethnic divisions. Quarterly
Journal of Economics 114(4), 1243–1284.
Alesina, A., A. Devleeschauwer, W. Easterly, S. Kurlat, and R. Wacziarg (2003). Fractionalization.
Journal of Economic Growth 8(2), 155–194.
Alesina, A. and E. Zhuravskaya (2011). Segregation and the quality of government in a cross
section of countries. American Economic Review 101(5), 1872–1911.
Angrist, J. D. and J.-S. Pischke (2008). Mostly Harmless Econometrics: An Empiricist’s Compan-
ion. Princeton University Press.
Atkin, D. (2013). Trade, tastes, and nutrition in India. American Economic Review 103(5), 1629–1663.
Banerjee, A., L. Iyer, and R. Somanathan (2007). Public action for public goods. Handbook of
Development Economics 4, 3117–3154.
Boserup, E. (1985). Economic and demographic interrelationships in sub-Saharan Africa.
Population and Development Review 11(3), 383–397.
Cameron, R. E. (1993). A Concise Economic History of the World: from Paleolithic Times to the
Present. Oxford University Press, USA.
Collier, P. (2000). Ethnicity, politics and economic performance. Economics & Politics 12(3),
225–245.
Depetris-Chauvin, E., R. Durante, and F. R. Campante (2018). Building nations through shared
experiences: Evidence from African football. Technical report, National Bureau of Economic
Research.
Easterly, W. and R. Levine (1997). Africa’s growth tragedy: policies and ethnic divisions. Quar-
terly Journal of Economics 112, 1203–1250.
Easterly, W. and R. Levine (2003). Tropics, germs, and crops: how endowments influence eco-
nomic development. Journal of Monetary Economics 50(1), 3–39.
Esteban, J.-M. and D. Ray (1994). On the measurement of polarization. Econometrica 62(4),
819–851.
Fearon, J. D. and D. D. Laitin (1996). Explaining interethnic cooperation. American Political
Science Review 90(04), 715–735.
Fearon, J. D. and D. D. Laitin (2003). Ethnicity, insurgency, and civil war. American Political
Science Review 97(1), 75–90.
Gerring, J., S. C. Thacker, Y. Lu, and W. Huang (2015). Does diversity impair human development?
a multi-level test of the diversity debit hypothesis. World Development 66, 166–188.
Gershman, B. and D. Rivera (2018). Subnational diversity in sub-Saharan Africa: Insights from a
new dataset. Journal of Development Economics 133, 231–263.
Gisselquist, R. M., S. Leiderer, and M. Niño-Zarazúa (2016). Ethnic heterogeneity and public
goods provision in Zambia: Evidence of a subnational “diversity dividend”. World
Development 78, 308–323.
Holm, S. (1979). A simple sequentially rejective multiple test procedure. Scandinavian Journal of
Statistics 6(2), 65–70.
La Porta, R., F. Lopez-de Silanes, A. Shleifer, and R. Vishny (1999). The quality of government.
Journal of Law, Economics, and Organization 15(1), 222–279.
Manning, P. (2013). African population, 1650-1950: Methods for new estimates by region. In
African Economic History Conference.
Michalopoulos, S. and E. Papaioannou (2013). Pre-colonial ethnic institutions and contemporary
African development. Econometrica 81(1), 113–152.
Michalopoulos, S. and E. Papaioannou (2016). The long-run effects of the scramble for Africa.
American Economic Review 106(7), 1802–1848.
Minnaert, S. (2009). Les pères blancs et la société rwandaise durant l’époque coloniale alle-
mande (1900–1916). une rencontre entre cultures et religions. In Les Religions au Rwanda,
défis, convergences, et competitions, Actes du Colloque International du 18-19 septembre 2008
Butare/Huye, pp. 53–101.
Mitullah, W. (2003). Understanding slums: case studies for the global report on human settlements
2003: the case of Nairobi, Kenya. Technical report, University of Nairobi.
Montalvo, J. G. and M. Reynal-Querol (2005). Ethnic diversity and economic development. Jour-
nal of Development Economics 76(2), 293–323.
Murdock, G. P. (1967). Ethnographic Atlas. University of Pittsburgh Press.
Nunn, N. and L. Wantchekon (2011). The slave trade and the origins of mistrust in Africa. American
Economic Review 101(7), 3221–3252.
Posner, D. N. (2004). Measuring ethnic fractionalization in Africa. American Journal of Political
Science 48(4), 849–863.
Rosenzweig, M. and C. Udry (2018). External validity in a stochastic world: Evidence from
low-income countries. Technical report, Yale University.
Sidák, Z. (1968). On multivariate normal probabilities of rectangles: their dependence on correlations.
The Annals of Mathematical Statistics 39(5), 1425–1434.
Stock, J. H. and M. Yogo (2002, November). Testing for weak instruments in linear IV regression.
Working Paper 284, National Bureau of Economic Research.
Appendices
A Regression results for each public good
Table A.1: OLS without controls, OLS with controls and IV for the Index of public goods.
OLS without controls OLS with controls IV
EFI 0.081*** 0.004 -0.030**
(0.004) (0.003) (0.015)
Urbanized area 0.279*** 0.278***
(0.002) (0.002)
Age -0.000 -0.000
(0.000) (0.000)
Education 0.014*** 0.014***
(0.000) (0.000)
Wealth 0.040*** 0.039***
(0.001) (0.001)
Religious 0.004** 0.003**
(0.002) (0.002)
Gender -0.016*** -0.016***
(0.001) (0.001)
Region population 0.000 0.000
(0.000) (0.000)
Region size -0.000*** -0.000***
(0.000) (0.000)
Suitability for agriculture -0.055*** -0.040***
(0.004) (0.005)
Presence of rivers -0.003 0.002
(0.004) (0.005)
Pre-colonial institutions -0.001 0.003**
(0.001) (0.001)
Constant 0.493*** 0.592*** 0.590***
(0.002) (0.006) (0.009)
No. of obs. 123648 101393 97623
Adj. R² 0.004 0.477 0.471
* p<0.10, ** p<0.05, *** p<0.01. Robust standard errors in parentheses. Country and Afrobarometer-round
fixed effects are included, except in the first column. The Index of public goods variable is defined as average
access to the nine public goods in the data set.
Table A.2: OLS without controls, OLS with controls and IV for access to the electricity grid.
OLS without controls OLS with controls IV
EFI 0.033*** -0.044*** -0.093***
(0.006) (0.006) (0.026)
Urbanized area 0.401*** 0.403***
(0.003) (0.003)
Age -0.000* -0.000
(0.000) (0.000)
Education 0.023*** 0.024***
(0.001) (0.001)
Wealth 0.087*** 0.086***
(0.002) (0.002)
Religious -0.003 -0.003
(0.003) (0.003)
Gender -0.030*** -0.030***
(0.002) (0.002)
Region population 0.000*** 0.000***
(0.000) (0.000)
Region size -0.000*** -0.000***
(0.000) (0.000)
Suitability for agriculture -0.040*** -0.057***
(0.008) (0.009)
Presence of rivers 0.095*** 0.095***
(0.007) (0.009)
Pre-colonial institutions 0.006*** 0.004
(0.002) (0.002)
Constant 0.593*** 0.566*** 0.587***
(0.004) (0.010) (0.015)
No. of obs. 123568 101330 97564
Adj. R² 0.000 0.472 0.474
* p<0.10, ** p<0.05, *** p<0.01. Robust standard errors in parentheses. Country and Afrobarometer-round
fixed effects are included, except in the first column.
Table A.3: OLS without controls, OLS with controls and IV for access to cellphone service.
OLS without controls OLS with controls IV
EFI 0.073*** 0.022*** 0.019
(0.004) (0.005) (0.022)
Urbanized area 0.083*** 0.081***
(0.002) (0.002)
Age 0.000** 0.000
(0.000) (0.000)
Education 0.008*** 0.008***
(0.001) (0.001)
Wealth 0.020*** 0.021***
(0.002) (0.002)
Religious 0.016*** 0.015***
(0.003) (0.003)
Gender -0.009*** -0.010***
(0.002) (0.002)
Region population 0.000*** 0.000***
(0.000) (0.000)
Region size -0.000*** -0.000***
(0.000) (0.000)
Suitability for agriculture -0.003 -0.001
(0.006) (0.008)
Presence of rivers -0.013** -0.004
(0.006) (0.007)
Pre-colonial institutions -0.005*** -0.003
(0.001) (0.002)
Constant 0.844*** 0.723*** 0.718***
(0.003) (0.007) (0.011)
No. of obs. 120846 98700 95087
Adj. R² 0.003 0.146 0.148
* p<0.10, ** p<0.05, *** p<0.01. Robust standard errors in parentheses. Country and Afrobarometer-round
fixed effects are included, except in the first column.
Table A.4: OLS without controls, OLS with controls and IV for access to sewage-treatment facili-
ties.
OLS without controls OLS with controls IV
EFI 0.133*** 0.050*** 0.021
(0.006) (0.006) (0.026)
Urbanized area 0.392*** 0.392***
(0.003) (0.003)
Age 0.000 0.000*
(0.000) (0.000)
Education 0.014*** 0.014***
(0.001) (0.001)
Wealth 0.045*** 0.043***
(0.002) (0.002)
Religious 0.002 0.002
(0.003) (0.003)
Gender -0.017*** -0.017***
(0.002) (0.002)
Region population 0.000*** 0.000***
(0.000) (0.000)
Region size -0.000*** -0.000***
(0.000) (0.000)
Suitability for agriculture -0.138*** -0.125***
(0.007) (0.008)
Presence of rivers -0.035*** -0.037***
(0.008) (0.010)
Pre-colonial institutions -0.008*** -0.005**
(0.002) (0.002)
Constant 0.186*** 0.606*** 0.602***
(0.004) (0.010) (0.015)
No. of obs. 122416 100595 96845
Adj. R² 0.004 0.442 0.430
* p<0.10, ** p<0.05, *** p<0.01. Robust standard errors in parentheses. Country and Afrobarometer-round
fixed effects are included, except in the first column.
Table A.5: OLS without controls, OLS with controls and IV for access to piped water.
OLS without controls OLS with controls IV
EFI 0.125*** 0.042*** 0.155***
(0.006) (0.008) (0.032)
Urbanized area 0.343*** 0.336***
(0.003) (0.004)
Age -0.000*** -0.000***
(0.000) (0.000)
Education 0.012*** 0.012***
(0.001) (0.001)
Wealth 0.034*** 0.032***
(0.002) (0.002)
Religious 0.004 0.004
(0.003) (0.003)
Gender -0.013*** -0.013***
(0.003) (0.003)
Region population -0.000*** -0.000***
(0.000) (0.000)
Region size 0.000*** 0.000**
(0.000) (0.000)
Suitability for agriculture -0.055*** -0.039***
(0.009) (0.010)
Presence of rivers -0.043*** -0.068***
(0.009) (0.011)
Pre-colonial institutions -0.008*** 0.006**
(0.002) (0.003)
Constant 0.283*** 0.418*** 0.347***
(0.004) (0.015) (0.020)
No. of obs. 122616 100605 96866
Adj. R² 0.003 0.209 0.207
* p<0.10, ** p<0.05, *** p<0.01. Robust standard errors in parentheses. Country and Afrobarometer-round
fixed effects are included, except in the first column.
Table A.6: OLS without controls, OLS with controls and IV for access to post offices.
OLS without controls OLS with controls IV
EFI 0.110*** 0.019*** 0.144***
(0.006) (0.006) (0.028)
Urbanized area 0.300*** 0.291***
(0.003) (0.003)
Age 0.000 0.000
(0.000) (0.000)
Education 0.009*** 0.009***
(0.001) (0.001)
Wealth 0.028*** 0.025***
(0.002) (0.002)
Religious 0.004 0.003
(0.003) (0.003)
Gender -0.010*** -0.010***
(0.002) (0.002)
Region population -0.000*** -0.000***
(0.000) (0.000)
Region size 0.000*** 0.000***
(0.000) (0.000)
Suitability for agriculture -0.085*** -0.076***
(0.008) (0.009)
Presence of rivers -0.071*** -0.107***
(0.008) (0.010)
Pre-colonial institutions 0.007*** 0.028***
(0.002) (0.003)
Constant 0.176*** 0.490*** 0.401***
(0.004) (0.014) (0.018)
No. of obs. 122548 100480 96748
Adj. R² 0.003 0.263 0.252
* p<0.10, ** p<0.05, *** p<0.01. Robust standard errors in parentheses. Country and Afrobarometer-round
fixed effects are included, except in the first column.
Table A.7: OLS without controls, OLS with controls and IV for access to schools.
OLS without controls OLS with controls IV
EFI -0.012*** -0.029*** 0.105***
(0.004) (0.005) (0.023)
Urbanized area 0.038*** 0.029***
(0.002) (0.002)
Age -0.000* -0.000*
(0.000) (0.000)
Education 0.008*** 0.008***
(0.001) (0.001)
Wealth 0.010*** 0.010***
(0.002) (0.002)
Religious 0.014*** 0.015***
(0.003) (0.003)
Gender -0.005*** -0.006***
(0.002) (0.002)
Region population -0.000*** -0.000
(0.000) (0.000)
Region size 0.000*** 0.000**
(0.000) (0.000)
Suitability for agriculture 0.012* 0.011
(0.007) (0.008)
Presence of rivers 0.001 -0.019**
(0.006) (0.008)
Pre-colonial institutions -0.001 0.009***
(0.002) (0.002)
Constant 0.892*** 0.809*** 0.743***
(0.003) (0.013) (0.016)
No. of obs. 123284 101103 97337
Adj. R² 0.000 0.054 0.051
* p<0.10, ** p<0.05, *** p<0.01. Robust standard errors in parentheses. Country and Afrobarometer-round
fixed effects are included, except in the first column.
Table A.8: OLS without controls, OLS with controls and IV for access to police stations.
OLS without controls OLS with controls IV
EFI 0.125*** 0.042*** 0.155***
(0.006) (0.008) (0.032)
Urbanized area 0.343*** 0.336***
(0.003) (0.004)
Age -0.000*** -0.000***
(0.000) (0.000)
Education 0.012*** 0.012***
(0.001) (0.001)
Wealth 0.034*** 0.032***
(0.002) (0.002)
Religious 0.004 0.004
(0.003) (0.003)
Gender -0.013*** -0.013***
(0.003) (0.003)
Region population -0.000*** -0.000***
(0.000) (0.000)
Region size 0.000*** 0.000**
(0.000) (0.000)
Suitability for agriculture -0.055*** -0.039***
(0.009) (0.010)
Presence of rivers -0.043*** -0.068***
(0.009) (0.011)
Pre-colonial institutions -0.008*** 0.006**
(0.002) (0.003)
Constant 0.283*** 0.418*** 0.347***
(0.004) (0.015) (0.020)
No. of obs. 122616 100605 96866
Adj. R² 0.003 0.209 0.207
* p<0.10, ** p<0.05, *** p<0.01. Robust standard errors in parentheses. Country and Afrobarometer-round
fixed effects are included, except in the first column.
Table A.9: OLS without controls, OLS with controls and IV for access to health centers.
OLS without controls OLS with controls IV
EFI 0.048*** 0.005 -0.045
(0.006) (0.008) (0.035)
Urbanized area 0.200*** 0.202***
(0.003) (0.004)
Age -0.000** -0.000***
(0.000) (0.000)
Education 0.015*** 0.015***
(0.001) (0.001)
Wealth 0.035*** 0.035***
(0.002) (0.003)
Religious 0.010** 0.009**
(0.004) (0.004)
Gender -0.014*** -0.014***
(0.003) (0.003)
Region population -0.000*** -0.000
(0.000) (0.000)
Region size 0.000*** 0.000
(0.000) (0.000)
Suitability for agriculture -0.014 0.020*
(0.010) (0.011)
Presence of rivers -0.045*** -0.034***
(0.010) (0.012)
Pre-colonial institutions 0.000 0.003
(0.002) (0.003)
Constant 0.576*** 0.662*** 0.662***
(0.004) (0.015) (0.021)
No. of obs. 122655 100578 96832
Adj. R² 0.000 0.114 0.118
* p<0.10, ** p<0.05, *** p<0.01. Robust standard errors in parentheses. Country and Afrobarometer-round
fixed effects are included, except in the first column.
Table A.10: OLS without controls, OLS with controls and IV for access to paved roads.
OLS without controls OLS with controls IV
EFI 0.147*** 0.025*** -0.391***
(0.007) (0.007) (0.033)
Urbanized area 0.328*** 0.342***
(0.003) (0.004)
Age 0.000*** 0.000
(0.000) (0.000)
Education 0.018*** 0.018***
(0.001) (0.001)
Wealth 0.045*** 0.044***
(0.002) (0.002)
Religious -0.022*** -0.025***
(0.003) (0.004)
Gender -0.021*** -0.020***
(0.003) (0.003)
Region population -0.000*** -0.000***
(0.000) (0.000)
Region size -0.000*** -0.000***
(0.000) (0.000)
Suitability for agriculture -0.111*** -0.040***
(0.009) (0.011)
Presence of rivers -0.034*** 0.050***
(0.009) (0.012)
Pre-colonial institutions 0.020*** 0.003
(0.002) (0.003)
Constant 0.385*** 0.477*** 0.627***
(0.004) (0.012) (0.019)
No. of obs. 123648 101393 97623
Adj. R² 0.004 0.263 0.235
* p<0.10, ** p<0.05, *** p<0.01. Robust standard errors in parentheses. Country and Afrobarometer-round
fixed effects are included, except in the first column.
B Polarization index
[Coefficient plot for Figure B.1; horizontal axis from -1.000 to 0.500. Point estimates as labeled in each panel:]
Panel IV with EFI IV with Polarization
Index of public goods -0.030 -0.212
Electricity -0.093 0.135
Cellphone 0.019 -0.354
Sewage 0.021 -0.596
Piped water -0.107 -0.193
Post 0.144 -0.055
Schools 0.105 0.304
Police 0.155 -0.474
Clinic -0.045 0.110
Road -0.391 -0.764
Figure B.1: The estimated coefficients for EFI and Polarization in IV-2SLS regressions for the index of
public goods and the nine public goods in the data set. The individual controls include urban area, gender,
age, education, wealth and religiosity. The regional controls include region population and surface, pre-
colonial institutions, the suitability of soil for agriculture and access to river streams and oceans. Country
and Afrobarometer-round fixed effects are included. The Index of public goods variable is defined as average
access to the nine public goods in the data set.
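To make the contrast between the two indices concrete, the following is a minimal sketch in Python. It assumes the standard Herfindahl-based fractionalization index, 1 - Σ s_i², and the Reynal-Querol polarization index, 4 Σ s_i² (1 - s_i), where s_i is the population share of group i. The exact definitions used for the estimates above are given in the main text and may differ; the function names and the sample data are hypothetical.

```python
from collections import Counter


def group_shares(ethnicities):
    """Return the population share of each group in a list of group labels."""
    counts = Counter(ethnicities)
    total = sum(counts.values())
    return [count / total for count in counts.values()]


def fractionalization(shares):
    """Herfindahl-based index (assumed definition): 1 - sum of squared shares."""
    return 1.0 - sum(s * s for s in shares)


def polarization(shares):
    """Reynal-Querol index (assumed definition): 4 * sum of s^2 * (1 - s)."""
    return 4.0 * sum(s * s * (1.0 - s) for s in shares)


# Hypothetical respondent ethnicities within one area (not the paper's data).
two_groups = ["A"] * 50 + ["B"] * 50
four_groups = ["A"] * 25 + ["B"] * 25 + ["C"] * 25 + ["D"] * 25

efi_two = fractionalization(group_shares(two_groups))
pol_two = polarization(group_shares(two_groups))
efi_four = fractionalization(group_shares(four_groups))
pol_four = polarization(group_shares(four_groups))

print("two equal groups: ", round(efi_two, 3), round(pol_two, 3))
print("four equal groups:", round(efi_four, 3), round(pol_four, 3))
```

Under these assumed definitions, moving from two to four equal-sized groups raises fractionalization while lowering polarization, which is one reason the two measures can yield coefficients of different sign and size for the same public good.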
Table B.1: First stage of the 2SLS-IV regression. Dependent variable: current polarization index.
Without controls With controls
Historic Polarization 0.094*** 0.136***
(0.003) (0.003)
Urbanized area 0.005***
(0.001)
Age 0.000***
(0.000)
Education 0.003***
(0.000)
Wealth -0.004***
(0.001)
Religious 0.003***
(0.001)
Gender -0.001
(0.001)
Region population -0.000***
(0.000)
Region size 0.000***
(0.000)
Suitability for agriculture -0.072***
(0.003)
Presence of rivers 0.021***
(0.003)
Pre-colonial institutions -0.003***
(0.001)
Constant 0.442*** 0.402***
(0.007) (0.010)
No. of obs. 111724 101027
Adj. R² 0.281 0.314
F 67050.13 55551.21
* p<0.10, ** p<0.05, *** p<0.01. Country and Afrobarometer-round fixed effects are included. Robust
standard errors in parentheses. Gender is 1 for males, 0 for females.
Table B.2: 2SLS-IV regression of the index of public goods on the Polarization index.
EFI Polarization
EFI -0.030**
(0.015)
Polarization -0.212***
(0.030)
Urbanized area 0.278*** 0.280***
(0.002) (0.002)
Age -0.000 0.000
(0.000) (0.000)
Education 0.014*** 0.014***
(0.000) (0.000)
Wealth 0.039*** 0.039***
(0.001) (0.001)
Religious 0.003** 0.004***
(0.002) (0.002)
Gender -0.016*** -0.016***
(0.001) (0.001)
Region population 0.000 0.000
(0.000) (0.000)
Region size -0.000*** -0.000***
(0.000) (0.000)
Suitability for agriculture -0.040*** -0.063***
(0.005) (0.004)
Presence of rivers 0.002 0.004
(0.005) (0.004)
Pre-colonial institutions 0.003** -0.001
(0.001) (0.001)
Constant 0.590*** 0.688***
(0.009) (0.015)
No. of obs. 97623 101027
Adj. R² 0.471 0.467
* p<0.10, ** p<0.05, *** p<0.01. Estimations include country and Afrobarometer-round fixed-effects.
Robust standard errors in parentheses. Gender is 1 for males, 0 for females. The Index of public goods is
defined as average access to the nine public goods in the data set.
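The two stages reported in Tables B.1 and B.2 can be illustrated with a small simulation. The sketch below is not the authors' code: it uses synthetic data, a single hypothetical control, and plain NumPy least squares. The variable names (`historic`, `current`, `outcome`) only mimic the roles of the historic index, the current index and the public-goods index.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def two_stage_least_squares(y, x_endog, z, controls):
    """Return (first-stage coefficient on z, 2SLS coefficient on x_endog)."""
    n = len(y)
    exog = np.column_stack([np.ones(n), controls])
    # First stage: regress the endogenous regressor on the instrument and controls.
    first_X = np.column_stack([z, exog])
    first_coef = ols(first_X, x_endog)
    x_hat = first_X @ first_coef
    # Second stage: replace the endogenous regressor with its fitted values.
    second_X = np.column_stack([x_hat, exog])
    second_coef = ols(second_X, y)
    return first_coef[0], second_coef[0]


# Synthetic data (assumption): an unobserved confounder biases plain OLS.
rng = np.random.default_rng(0)
n = 20_000
true_effect = -0.3
control = rng.normal(size=n)
confounder = rng.normal(size=n)  # unobserved by the econometrician
historic = rng.normal(size=n)    # instrument, independent of the confounder
current = 0.5 * historic + 0.2 * control + confounder + rng.normal(size=n)
outcome = true_effect * current + 0.1 * control + 2.0 * confounder + rng.normal(size=n)

ols_estimate = ols(np.column_stack([current, np.ones(n), control]), outcome)[0]
first_stage, iv_estimate = two_stage_least_squares(outcome, current, historic, control)

print("OLS estimate:        ", round(ols_estimate, 3))
print("First-stage estimate:", round(first_stage, 3))
print("2SLS estimate:       ", round(iv_estimate, 3))
```

In this simulation OLS is pulled away from the true effect by the confounder, while the 2SLS estimate lands close to it. Standard errors taken from the manual second stage would be incorrect, so a dedicated IV routine should be used for inference.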
C Pre-colonial populations
[Coefficient plot for Figure C.1, series EFI; horizontal axis from -6.000 to 0.000. Point estimates as labeled in the figure, in panel order:]
Panel Initial measure 1850 1870 1900
Index of public goods -0.183 -2.520 -2.540 -2.612
Electricity -0.314 -2.069 -2.061 -2.097
Cellphone 0.048 -0.422 -0.394 -0.381
Sewage -0.155 -4.675 -4.714 -4.828
Piped water -0.200 -2.924 -2.921 -2.977
Post -0.129 -1.924 -1.976 -2.069
Schools 0.035 -0.133 -0.137 -0.155
Police -0.093 -2.863 -2.907 -3.012
Clinic -0.196 -2.603 -2.644 -2.740
Road -0.577 -4.374 -4.405 -4.515
Figure C.1: Estimated coefficients for EFI in 2SLS-IV regressions for the index of public goods and the
nine public goods in the data set, for our initial measure of ethnic diversity as well as measures of diversity
including populations ranging from 1850 to 1900. Individual controls include urban area, gender, age,
education, wealth and religiosity. Regional controls include region population and surface, pre-colonial
institutions, the suitability of soil for agriculture and access to river streams and oceans. Country and
Afrobarometer-round fixed effects are included. The Index of public goods variable is defined as average
access to the nine public goods in the data set.
Table C.1: First-stage regressions for measures using pre-colonial population
Initial Pre-colonial populations
1850 1870 1900
Historic EFI 0.246*** 0.064*** 0.064*** 0.063***
(0.003) (0.004) (0.004) (0.004)
Urbanized area 0.043*** 0.037*** 0.037*** 0.037***
(0.001) (0.001) (0.001) (0.001)
Age -0.000*** -0.000*** -0.000*** -0.000***
(0.000) (0.000) (0.000) (0.000)
Education 0.001** 0.001*** 0.001*** 0.001***
(0.000) (0.000) (0.000) (0.000)
Wealth -0.001 -0.001 -0.001 -0.001
(0.001) (0.001) (0.001) (0.001)
Religious -0.003** -0.004*** -0.004*** -0.004***
(0.001) (0.001) (0.001) (0.001)
Gender 0.001 0.001 0.001 0.001
(0.001) (0.001) (0.001) (0.001)
Region population -0.000*** -0.000*** -0.000*** -0.000***
(0.000) (0.000) (0.000) (0.000)
Region size 0.000*** 0.000*** 0.000*** 0.000***
(0.000) (0.000) (0.000) (0.000)
Suitability for agriculture 0.075*** 0.104*** 0.104*** 0.104***
(0.005) (0.004) (0.004) (0.004)
Presence of rivers 0.189*** 0.178*** 0.177*** 0.177***
(0.004) (0.004) (0.004) (0.004)
Pre-colonial institutions -0.038*** -0.061*** -0.061*** -0.061***
(0.001) (0.001) (0.001) (0.001)
Constant 0.326*** 0.418*** 0.418*** 0.418***
(0.009) (0.009) (0.009) (0.009)
No. of obs 97623 101027 101027 101027
Adj. R² 0.339 0.307 0.307 0.307
F 2914.68 4192.04 4194.23 4198.68
* p<0.10, ** p<0.05, *** p<0.01. Estimations include country and Afrobarometer-round fixed-effects.
Standard errors in parentheses. Gender is 1 for males, 0 for females. The Index of public goods is defined
as average access to the nine public goods in the data set. “Initial” refers to the measure used in previous
sections.
Table C.2: IV regressions using pre-colonial populations. The dependent variable is the index of
public goods
Initial Pre-colonial populations
1850 1870 1900
EFI -0.030** -0.752*** -0.766*** -0.794***
(0.015) (0.084) (0.085) (0.088)
Urbanized area 0.278*** 0.306*** 0.306*** 0.307***
(0.002) (0.003) (0.004) (0.004)
Age -0.000 -0.000*** -0.000*** -0.000***
(0.000) (0.000) (0.000) (0.000)
Education 0.014*** 0.014*** 0.014*** 0.014***
(0.000) (0.000) (0.000) (0.000)
Wealth 0.039*** 0.040*** 0.040*** 0.040***
(0.001) (0.001) (0.001) (0.001)
Religious 0.003** 0.001 0.001 0.000
(0.002) (0.002) (0.002) (0.002)
Gender -0.016*** -0.015*** -0.015*** -0.015***
(0.001) (0.002) (0.002) (0.002)
Region population 0.000 -0.000*** -0.000*** -0.000***
(0.000) (0.000) (0.000) (0.000)
Region size -0.000*** 0.000 0.000 0.000
(0.000) (0.000) (0.000) (0.000)
Suitability for agriculture -0.040*** 0.033*** 0.035*** 0.038***
(0.005) (0.011) (0.011) (0.012)
Presence of rivers 0.002 0.125*** 0.127*** 0.132***
(0.005) (0.015) (0.015) (0.016)
Pre-colonial institutions 0.003** -0.048*** -0.048*** -0.050***
(0.001) (0.005) (0.005) (0.006)
Constant 0.590*** 0.919*** 0.926*** 0.938***
(0.009) (0.037) (0.038) (0.039)
No. obs 97623 101027 101027 101027
Adj. R² 0.471 0.229 0.219 0.200
* p<0.10, ** p<0.05, *** p<0.01. Estimations include country and Afrobarometer-round fixed-effects.
Standard errors in parentheses. Gender is 1 for males, 0 for females. “Initial” refers to the measure used in
previous sections.
D Depth of ethnic diversity
Table D.1: First-stage regressions for measures from GR.
Level 1 Level 2 Level 3 Level 4 Level 5 Level 6
Historic EFI 0.205*** 0.244*** 0.274*** 0.262*** 0.282*** 0.329***
(0.002) (0.003) (0.003) (0.003) (0.004) (0.004)
Controls Yes Yes Yes Yes Yes Yes
No. of obs. 69173 69173 69173 69173 69173 69173
Adj. R² 0.562 0.571 0.544 0.498 0.524 0.557
F 2783.20 5058.43 4694.60 5798.81 6965.22 6986.53
Level 7 Level 8 Level 9 Level 10 Level 11 Level 12
Historic EFI 0.304*** 0.296*** 0.327*** 0.346*** 0.289*** 0.288***
(0.003) (0.003) (0.004) (0.004) (0.004) (0.004)
Controls Yes Yes Yes Yes Yes Yes
No. of obs. 69173 69173 69173 69173 69173 69173
Adj. R² 0.614 0.611 0.470 0.411 0.420 0.423
F 8389.19 8466.70 5615.29 5699.02 5220.06 5252.12
Level 13
Historic EFI 0.289***
(0.004)
Controls Yes
No. of obs. 69173
Adj. R² 0.424
F 5242.24
* p<0.10, ** p<0.05, *** p<0.01. Estimations include country and Afrobarometer-round fixed-effects.
Robust standard errors in parentheses. The individual controls include urban area, gender, age, education,
wealth and religiosity. The regional controls include region population and surface, pre-colonial institutions,
suitability of the soil for agriculture and access to river streams and oceans. “GR” refers to Gershman and
Rivera (2018).
Table D.2: First-stage regressions for measures from GR from level 8 to 13.
Level 8 Level 9 Level 10 Level 11 Level 12 Level 13
Historic EFI 0.296*** 0.327*** 0.346*** 0.289*** 0.288*** 0.289***
(0.003) (0.004) (0.004) (0.004) (0.004) (0.004)
Urbanized area 0.027*** 0.037*** 0.042*** 0.043*** 0.043*** 0.043***
(0.002) (0.002) (0.002) (0.002) (0.002) (0.002)
Age -0.000*** -0.000*** -0.000*** -0.000*** -0.000*** -0.000***
(0.000) (0.000) (0.000) (0.000) (0.000) (0.000)
Education 0.000 -0.000 0.001* 0.001*** 0.001*** 0.001***
(0.000) (0.000) (0.000) (0.000) (0.000) (0.000)
Wealth -0.001 -0.004*** -0.001 -0.000 -0.000 -0.000
(0.001) (0.001) (0.001) (0.001) (0.001) (0.001)
Religious -0.007*** -0.012*** -0.012*** -0.011*** -0.011*** -0.011***
(0.002) (0.002) (0.002) (0.002) (0.002) (0.002)
Gender 0.001 0.003* 0.002 0.001 0.001 0.001
(0.001) (0.001) (0.001) (0.001) (0.001) (0.001)
Region population -0.000*** -0.000*** -0.000*** -0.000*** -0.000*** -0.000***
(0.000) (0.000) (0.000) (0.000) (0.000) (0.000)
Region size 0.000*** 0.000*** 0.000*** 0.000*** 0.000*** 0.000***
(0.000) (0.000) (0.000) (0.000) (0.000) (0.000)
Suitability for agriculture 0.184*** 0.235*** 0.138*** 0.116*** 0.125*** 0.125***
(0.005) (0.006) (0.006) (0.006) (0.006) (0.006)
Presence of rivers 0.182*** 0.132*** 0.084*** 0.057*** 0.056*** 0.055***
(0.006) (0.006) (0.006) (0.006) (0.006) (0.006)
Pre-colonial institutions -0.029*** -0.024*** -0.014*** -0.008*** -0.007*** -0.007***
(0.001) (0.001) (0.001) (0.001) (0.001) (0.001)
Constant -0.230*** 0.098*** 0.452*** 0.492*** 0.491*** 0.490***
(0.005) (0.006) (0.007) (0.007) (0.007) (0.007)
No. of obs. 69173 69173 69173 69173 69173 69173
Adj. R² 0.611 0.470 0.411 0.420 0.423 0.424
F 8466.70 5615.29 5699.02 5220.06 5252.12 5242.24
* p<0.10, ** p<0.05, *** p<0.01. Estimations include country and Afrobarometer-round fixed-effects.
Robust standard errors in parentheses. The individual controls include urban area, gender, age, education,
wealth and religiosity. The regional controls include region population and surface, pre-colonial institutions,
suitability of the soil for agriculture and access to river streams and oceans. “GR” refers to Gershman and
Rivera (2018).
Table D.3: 2SLS-IV results using data on the depth of ethnic divisions for the index of public
goods.
Afro GR
Level 1 Level 2 Level 3 Level 4 Level 5 Level 6
EFI -0.030** -0.071*** -0.060*** -0.053*** -0.056*** -0.052*** -0.044***
(0.015) (0.020) (0.017) (0.015) (0.016) (0.015) (0.013)
Urbanized area 0.278*** 0.262*** 0.263*** 0.262*** 0.263*** 0.263*** 0.263***
(0.002) (0.002) (0.002) (0.002) (0.002) (0.002) (0.002)
Age -0.000 -0.000** -0.000** -0.000** -0.000** -0.000** -0.000**
(0.000) (0.000) (0.000) (0.000) (0.000) (0.000) (0.000)
Education 0.014*** 0.013*** 0.013*** 0.013*** 0.013*** 0.014*** 0.014***
(0.000) (0.000) (0.000) (0.000) (0.000) (0.000) (0.000)
Wealth 0.039*** 0.041*** 0.041*** 0.041*** 0.041*** 0.041*** 0.041***
(0.001) (0.001) (0.001) (0.001) (0.001) (0.001) (0.001)
Religious 0.003** 0.006*** 0.005*** 0.005*** 0.005*** 0.006*** 0.006***
(0.002) (0.002) (0.002) (0.002) (0.002) (0.002) (0.002)
Gender -0.016*** -0.017*** -0.017*** -0.017*** -0.017*** -0.017*** -0.017***
(0.001) (0.002) (0.002) (0.002) (0.002) (0.002) (0.002)
Region population 0.000 0.000*** 0.000*** 0.000*** 0.000*** 0.000*** 0.000***
(0.000) (0.000) (0.000) (0.000) (0.000) (0.000) (0.000)
Region size -0.000*** -0.000*** -0.000*** -0.000*** -0.000* -0.000* -0.000**
(0.000) (0.000) (0.000) (0.000) (0.000) (0.000) (0.000)
Suitability for agriculture -0.040*** -0.052*** -0.047*** -0.050*** -0.047*** -0.044*** -0.045***
(0.005) (0.006) (0.006) (0.006) (0.006) (0.006) (0.006)
Presence of rivers 0.002 -0.012* -0.009 -0.008 -0.009 -0.003 -0.002
(0.005) (0.006) (0.006) (0.006) (0.006) (0.006) (0.006)
Pre-colonial institutions 0.003** -0.010*** -0.011*** -0.011*** -0.010*** -0.011*** -0.012***
(0.001) (0.001) (0.001) (0.001) (0.001) (0.001) (0.001)
Constant 0.590*** 0.492*** 0.503*** 0.504*** 0.517*** 0.515*** 0.515***
(0.009) (0.011) (0.011) (0.012) (0.013) (0.013) (0.013)
No. of obs. 97623 69173 69173 69173 69173 69173 69173
Adj. R² 0.471 0.439 0.440 0.439 0.439 0.439 0.440
* p<0.10, ** p<0.05, *** p<0.01. Estimations include country and Afrobarometer-round fixed-effects.
Robust standard errors in parentheses. Gender is 1 for males, 0 for females. The Index of public goods is
defined as average access to the nine public goods in the data set. “Afro” refers to Afrobarometer data, while
“GR” refers to Gershman and Rivera (2018).
Table D.4: 2SLS-IV results using data on the depth of ethnic divisions for the index of public
goods.
GR
Level 7 Level 8 Level 9 Level 10 Level 11 Level 12 Level 13
EFI -0.048*** -0.049*** -0.045*** -0.042*** -0.050*** -0.051*** -0.050***
(0.014) (0.014) (0.013) (0.012) (0.014) (0.014) (0.014)
Urbanized area 0.263*** 0.263*** 0.263*** 0.263*** 0.264*** 0.264*** 0.264***
(0.002) (0.002) (0.002) (0.002) (0.002) (0.002) (0.002)
Age -0.000** -0.000** -0.000** -0.000** -0.000** -0.000** -0.000**
(0.000) (0.000) (0.000) (0.000) (0.000) (0.000) (0.000)
Education 0.014*** 0.014*** 0.014*** 0.014*** 0.014*** 0.014*** 0.014***
(0.000) (0.000) (0.000) (0.000) (0.000) (0.000) (0.000)
Wealth 0.041*** 0.041*** 0.041*** 0.041*** 0.041*** 0.041*** 0.041***
(0.001) (0.001) (0.001) (0.001) (0.001) (0.001) (0.001)
Religious 0.006*** 0.006*** 0.005*** 0.006*** 0.005*** 0.005*** 0.005***
(0.002) (0.002) (0.002) (0.002) (0.002) (0.002) (0.002)
Gender -0.017*** -0.017*** -0.017*** -0.017*** -0.017*** -0.017*** -0.017***
(0.002) (0.002) (0.002) (0.002) (0.002) (0.002) (0.002)
Region population 0.000*** 0.000*** 0.000*** 0.000*** 0.000*** 0.000*** 0.000***
(0.000) (0.000) (0.000) (0.000) (0.000) (0.000) (0.000)
Region size -0.000** -0.000* -0.000*** -0.000** -0.000* -0.000* -0.000*
(0.000) (0.000) (0.000) (0.000) (0.000) (0.000) (0.000)
Suitability for agriculture -0.042*** -0.042*** -0.040*** -0.045*** -0.045*** -0.044*** -0.044***
(0.007) (0.007) (0.007) (0.006) (0.006) (0.006) (0.006)
Presence of rivers 0.000 0.000 -0.003 -0.005 -0.006 -0.006 -0.006
(0.007) (0.007) (0.006) (0.006) (0.006) (0.006) (0.006)
Pre-colonial institutions -0.012*** -0.012*** -0.012*** -0.012*** -0.011*** -0.011*** -0.011***
(0.001) (0.002) (0.001) (0.001) (0.001) (0.001) (0.001)
Constant 0.517*** 0.517*** 0.515*** 0.515*** 0.522*** 0.522*** 0.522***
(0.013) (0.013) (0.013) (0.013) (0.014) (0.014) (0.014)
No. of obs. 69173 69173 69173 69173 69173 69173 69173
Adj. R² 0.440 0.440 0.439 0.439 0.438 0.438 0.438
* p<0.10, ** p<0.05, *** p<0.01. Estimations include country and Afrobarometer-round fixed-effects.
Robust standard errors in parentheses. The individual controls include urban area, gender, age, education,
wealth and religiosity. The regional controls include region population and surface, pre-colonial institutions,
suitability of the soil for agriculture and access to river streams and oceans. The Index of public goods is
defined as average access to the nine public goods in the data set. “GR” refers to Gershman and Rivera
(2018).
[Coefficient plot for Figure D.1; horizontal axis from -0.600 to 0.200. Point estimates as labeled in each panel:]
Panel Initial measure (EFI) Level 1 Level 5 Level 10 Level 13
Index of public goods -0.030 -0.071 -0.052 -0.042 -0.050
Electricity -0.093 0.023 0.017 0.014 0.016
Cellphone 0.019 0.086 0.063 0.051 0.062
Sewage 0.021 -0.066 -0.048 -0.039 -0.046
Piped water -0.107 -0.350 -0.254 -0.207 -0.249
Post 0.144 0.128 0.093 0.076 0.090
Schools 0.105 0.131 0.096 0.078 0.094
Police 0.155 0.135 0.098 0.080 0.096
Clinic -0.045 -0.123 -0.089 -0.073 -0.088
Road -0.391 -0.527 -0.385 -0.313 -0.375
Figure D.1: The estimated coefficients on EFI for the index of public goods as well as each of the nine
public goods in the data set, in IV regressions for our initial measure of diversity and for levels of depth
of 1, 5, 10 and 13 from Gershman and Rivera (2018). The individual controls include urban area, gender,
age, education, wealth and religiosity. The regional controls include region population and surface, pre-
colonial institutions, the suitability of soil for agriculture and access to river streams and oceans. Country
and Afrobarometer-round fixed effects are included. The Index of public goods variable is defined as average
access to the nine public goods in the data set.
E Variation of the radius
[Coefficient plot for Figure E.1, series EFI; horizontal axis from -3.000 to 1.000. Point estimates as labeled in each panel:]
Panel 50 km 100 km (Initial measure) 200 km 350 km 500 km
Index of public goods 0.040 -0.030 -0.424 -0.764 -1.093
Electricity -0.028 -0.093 0.018 0.233 0.266
Cellphone 0.171 0.019 0.118 -0.208 -0.010
Sewage 0.118 0.021 -0.626 -1.474 -1.020
Piped water -0.379 -0.107 -0.606 -0.628 -1.189
Post 0.311 0.144 -0.145 -0.648 -1.759
Schools 0.088 0.105 0.202 0.081 -0.536
Police 0.310 0.155 -0.344 -0.907 -1.976
Clinic 0.058 -0.045 -0.721 -1.813 -2.657
Road -0.242 -0.391 -1.443 -1.352 -1.036
Figure E.1: The estimated coefficients in 2SLS-IV regressions of the index of public goods and the nine
public goods in the data set, with the radius for calculating the instrument ranging from 50 to 500km. The
individual controls include urban area, gender, age, education, wealth and religiosity. The regional controls
include region population and surface, pre-colonial institutions, the suitability of soil for agriculture and
access to river streams and oceans. Country and Afrobarometer-round fixed effects are included. The Index
of public goods variable is defined as average access to the nine public goods in the data set.
Table E.1: First-stage regressions for measures of pre-colonial ethnic diversity with radius ranging from 50 to 500km.
Radius
50km 100km 150km 200km 250km 300km 350km 400km 450km 500km
Historic EFI 0.214*** 0.246*** 0.163*** 0.110*** 0.056*** 0.040*** 0.077*** 0.112*** 0.120*** 0.089***
(0.004) (0.003) (0.003) (0.004) (0.004) (0.004) (0.004) (0.005) (0.005) (0.006)
Controls Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
No. of obs. 93420 97623 100448 100828 100916 101027 101115 101115 101139 101194
Adj. R² 0.321 0.339 0.322 0.312 0.307 0.306 0.307 0.309 0.308 0.306
F 4933.38 2914.68 3533.91 4048.71 4254.95 4315.90 4163.52 4120.28 4184.09 4336.29
* p<0.10, ** p<0.05, *** p<0.01. Robust standard errors in parentheses. Country and Afrobarometer-round fixed effects are included. The individual
controls include urban area, gender, age, education, wealth and religiosity. The regional controls include region population and surface, pre-colonial
institutions, suitability of the soil for agriculture and access to river streams and oceans. The 100km radius is the measure used in our initial regressions.
Table E.2: 2SLS-IV results of the effect of ethnic fractionalization with variation in the radius for the calculation of the instrument
Radius
50km 100km 150km 200km 250km 300km 350km 400km 450km 500km
EFI 0.040** -0.030** -0.214*** -0.424*** -0.974*** -1.331*** -0.764*** -0.557*** -0.640*** -1.093***
(0.019) (0.015) (0.023) (0.039) (0.100) (0.181) (0.079) (0.054) (0.058) (0.105)
Controls Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
No. of obs. 93420 97623 100448 100828 100916 101027 101115 101115 101139 101194
Adj. R² 0.473 0.471 0.456 0.397 0.061 . 0.220 0.340 0.296 .
* p<0.10, ** p<0.05, *** p<0.01. Estimations include country and Afrobarometer-round fixed-effects. Robust standard errors in parentheses. The
individual controls include urban area, gender, age, education, wealth and religiosity. The regional controls include region population and surface,
pre-colonial institutions, suitability of the soil for agriculture and access to river streams and oceans. The 100km radius is the measure used in our initial regressions. The
Index of public goods is defined as average access to the nine public goods in the data set.
F Alternative data set: the Demographic and Health Surveys
[Scatter plot for Figure F.1; axes EFI and EFI DHS, each from 0 to 1, with OLS fitted values and a 45-degree line.]
Figure F.1: Comparison between the Ethno-Linguistic Fractionalization Indices calculated from the
Afrobarometer and the Demographic and Health Surveys.
Table F.1: First stage of the IV regressions using DHS data for EFI.
Afrobarometer DHS
Without controls With controls Without controls With controls
Historic EFI 0.159*** 0.246*** -0.024*** 0.099***
(0.003) (0.003) (0.004) (0.005)
Urbanized area 0.043*** 0.040***
(0.001) (0.002)
Age -0.000*** -0.000***
(0.000) (0.000)
Education 0.001** -0.000
(0.000) (0.000)
Wealth -0.001 -0.000
(0.001) (0.001)
Religious -0.003** -0.010***
(0.001) (0.002)
Gender 0.001 0.001
(0.001) (0.001)
Region population -0.000*** 0.000***
(0.000) (0.000)
Region size 0.000*** -0.000***
(0.000) (0.000)
Suitability for agriculture 0.075*** -0.058***
(0.005) (0.007)
Presence of rivers 0.189*** 0.267***
(0.004) (0.005)
Pre-colonial institutions -0.038*** -0.056***
(0.001) (0.001)
Constant 0.583*** 0.326*** 0.574*** 0.833***
(0.001) (0.009) (0.001) (0.007)
No. of obs. 103800 97623 58030 56447
Adj. R² 0.027 0.339 0.001 0.474
* p<0.10, ** p<0.05, *** p<0.01. Estimations include country and Afrobarometer-round fixed-effects in the
second column. Robust standard errors in parentheses. Gender is 1 for males, 0 for females.
Table F.2: IV results using DHS data for EFI.
Afrobarometer DHS
EFI -0.030**
(0.015)
EFI DHS -0.795***
(0.063)
Urbanized area 0.278*** 0.311***
(0.002) (0.003)
Age -0.000 -0.000***
(0.000) (0.000)
Education 0.014*** 0.014***
(0.000) (0.001)
Wealth 0.039*** 0.034***
(0.001) (0.002)
Religious 0.003** -0.012***
(0.002) (0.003)
Gender -0.016*** -0.016***
(0.001) (0.002)
Region population 0.000 0.000***
(0.000) (0.000)
Region size -0.000*** -0.000***
(0.000) (0.000)
Suitability for agriculture -0.040*** -0.087***
(0.005) (0.009)
Presence of rivers 0.002 0.233***
(0.005) (0.020)
Pre-colonial institutions 0.003** -0.044***
(0.001) (0.004)
Constant 0.590*** 1.093***
(0.009) (0.057)
No. of obs. 97623 56447
Adj. R² 0.471 0.256
* p<0.10, ** p<0.05, *** p<0.01. Estimations include country and Afrobarometer-round fixed-effects.
Robust standard errors in parentheses. Gender is 1 for males, 0 for females.
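[Coefficient plot for Figure F.2, series EFI DHS; horizontal axis from -2.000 to 0.500. Legend: Index, Electricity, Cellphone, Piped water, Sewage, Post, School, Police, Clinic, Road. Coefficient labels in extraction order: -0.795, -0.989, -0.380, -0.799, -1.649, 0.231, -0.154, -0.438, -0.996, -1.749.]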
Figure F.2: The estimated coefficients from 2SLS-IV regressions of the index of public goods and the nine
public goods in the data set, with the EFI measure coming from the Demographic and Health Surveys. The
individual controls include urban area, gender, age, education, wealth and religiosity. The regional controls
include region population and surface, pre-colonial institutions, the suitability of soil for agriculture and
access to river streams and oceans. Country and Afrobarometer-round fixed effects are included. The Index
of public goods variable is defined as average access to the nine public goods in the data set.
G Statistical inference
Table G.1: P-values corrected for multiple hypothesis testing
Dependent variable Regression coefficient 2SLS p-value Bonferroni p-value Holm p-value
Index of public goods -0.030 0.049** 0.490 0.196
Electricity grid -0.093 0.000*** 0.000*** 0.000***
Cellphone service 0.018 0.396 1.000 0.792
Sewage treatment 0.021 0.407 1.000 0.407
Piped water system -0.107 0.000*** 0.000*** 0.000***
Post offices 0.144 0.000*** 0.000*** 0.000***
Schools 0.104 0.000*** 0.000*** 0.000***
Police stations 0.155 0.000*** 0.000*** 0.000***
Health centers -0.045 0.206 1.000 0.618
Roads -0.391 0.000*** 0.000*** 0.000***
* p<0.10, ** p<0.05, *** p<0.01. The individual controls include urban area, gender, age, education, wealth
and religiosity. The regional controls include region population and surface, pre-colonial institutions, the
suitability of soil for agriculture and access to river streams and oceans. Country and Afrobarometer-round
fixed effects are included. The Index of public goods variable is defined as average access to the nine public
goods in the data set.
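The corrections in Table G.1 can be reproduced approximately from the rounded 2SLS p-values. The sketch below is a hedged illustration in Python, not the authors' code: the function names are hypothetical and the inputs are the three-decimal p-values as printed in the table. The Holm column of the table appears consistent with the step-down multipliers applied without the usual monotonicity step (Sewage is 0.407 although Cellphone is 0.792), so the sketch exposes both variants.

```python
def bonferroni(pvals):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def holm(pvals, enforce_monotone=True):
    """Holm step-down adjustment.

    The k-th smallest p-value (k = 1..m) is multiplied by (m - k + 1).
    With enforce_monotone=True, adjusted values are made non-decreasing
    in the sorted order, which is the textbook procedure.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        value = min(1.0, pvals[i] * (m - rank))
        if enforce_monotone:
            running_max = max(running_max, value)
            value = running_max
        adjusted[i] = value
    return adjusted


# Rounded 2SLS p-values in the row order of Table G.1.
names = ["Index", "Electricity", "Cellphone", "Sewage", "Piped water",
         "Post", "Schools", "Police", "Health centers", "Roads"]
pvals = [0.049, 0.000, 0.396, 0.407, 0.000, 0.000, 0.000, 0.000, 0.206, 0.000]

bonf = bonferroni(pvals)
holm_raw = holm(pvals, enforce_monotone=False)
holm_monotone = holm(pvals)

for name, b, h_raw, h_mono in zip(names, bonf, holm_raw, holm_monotone):
    print(f"{name:15s} {b:.3f} {h_raw:.3f} {h_mono:.3f}")
```

With the monotonicity step enforced, the adjusted p-value for Sewage would equal that of Cellphone; the conclusions at conventional significance levels are unchanged either way.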
60
ResearchGate has not been able to resolve any citations for this publication.
Article
Full-text available
This paper presents a new dataset on subnational ethnolinguistic and religious diversity in Sub-Saharan Africa covering 36 countries and almost 400 first-level administrative units. We use population censuses and large-scale household surveys to compile detailed data on the ethnolinguistic composition of each region and match all reported ethnicities to Ethnologue, a comprehensive catalog of world languages. This matching allows us to standardize the notion of an ethnolinguistic group and account for relatedness between language pairs, a correlate of shared history and culture, when calculating diversity indices. Exploiting within-country variation provided by our new dataset, we find that local public goods provision, as reflected in metrics of education, health, and electricity access, is negatively related to ethnolinguistic diversity, but only if the underlying basic languages are first aggregated into larger families or if linguistic distances between groups are taken into consideration. In other words, only deep-rooted diversity, based on cleavages formed in the distant past, is strongly inversely associated with a range of regional development indicators. Furthermore, we show that subnational diversity has been remarkably persistent over the past two-three decades implying that population sorting in the short to medium run is unlikely to bias our main findings.
Article
Full-text available
The “diversity debit” hypothesis – that ethnic diversity has a negative impact on social, economic, and political outcomes – has been widely accepted in the literature. Indeed, with respect to public goods provision – the focus of this article – the conventional wisdom holds that a negative relationship between ethnic heterogeneity and public goods provision is so well-established empirically that future research should abandon examination of whether such a relationship exists and focus instead on why it exists, that is, on the mechanisms underlying a negative relationship. This article challenges the conventional wisdom on empirical grounds. It demonstrates at the sub-national level strong evidence for a “diversity dividend” – that is, a positive relationship between ethnic heterogeneity and some measures of public goods provision, in particular welfare outcomes related to publicly provided goods and services. Building on the literature, the article draws on new analysis at district level for Zambia, using a new dataset compiled by the authors from administrative, budget, and survey data, which cover a broader range of public goods outcomes than previous work, including information on both budgetary and welfare outcomes. The article explores why relationships may differ for sub-national budgetary and welfare outcomes, considering separate models for each. Analysis shows results to be robust across a variety of alternative specifications and models. Given the more nuanced relationship between ethnic diversity and public goods provision documented, the article argues that the key task for future work is not to address why the relationship is negative, but to study under what conditions such direction holds true, and the mechanisms that underlie a diversity dividend. It concludes by considering key explanatory hypotheses against the Zambian data to identify promising areas for such theory development. 
More broadly, while the diversity debit hypothesis highlights the costs of diversity and could be interpreted as providing support for policies that minimize it, the findings in this article are consistent with a view that diversity can be good for communities, not only for normative reasons, but also because, under some conditions, it can support concrete welfare gains.
We examine whether shared collective experiences help build a national identity, by looking at the impact of national football teams’ victories in sub-Saharan Africa. We find that individuals surveyed in the days after an important victory of their country’s national team are 37 percent less likely to identify primarily with their ethnic group, and 30 percent more likely to trust other ethnicities, than those interviewed just before. Crucially, national team achievements also reduce violence: countries that (barely) qualified for the Africa Cup of Nations experience less civil conflict (9 percent fewer episodes) in the following months than countries that (barely) did not. (JEL D74, J15, L83, O15, O17, Z21)
We provide a new compilation of data on ethnic, linguistic, and religious composition at the subnational level for a large number of countries. Using these data, we measure segregation of groups within the country. To overcome the endogeneity problem that arises because of mobility and endogenous internal borders, we construct an instrument for segregation. We find that more ethnically and linguistically segregated countries, i.e., those where groups live more spatially separately, have a lower quality of government; there is no relationship between religious segregation and governance. Trust is an important channel of influence; it is lower in more segregated countries.
This study explores the determinants of ethnolinguistic diversity within as well as across countries, shedding light on its geographic origins. The empirical analysis conducted across countries, virtual countries, and pairs of contiguous regions establishes that geographic variability, captured by variation in regional land quality and elevation, is a fundamental determinant of contemporary linguistic diversity. The findings are consistent with the proposed hypothesis that differences in land endowments gave rise to location-specific human capital, leading to the formation of localized ethnicities. (JEL J15, J24, Z13)
Many conditional cash transfer (CCT) programs have important social components and, therefore, can have an effect on social capital. In 2007, we conducted a field experiment, a public goods game, with 1451 subjects in Cartagena, Colombia. We interpret the behavior in the game as a measure of what in the literature has been called social capital. We played the game in two similar and adjacent neighborhoods. The ‘treatment’ neighborhood, Pozón, had been targeted for over 2 years by a CCT program, Familias en Acción; the ‘control’ neighborhood, Ciénaga, had not. In 2008, with the program being implemented in both neighborhoods, we played the same public goods game, and were therefore able to implement a difference-in-differences strategy to estimate the impact of the CCT on our measure of social capital. In 2007, the level of cooperation we observed in the treatment neighborhood was considerably higher than that in the control one. Although similar in many dimensions, the two groups turned out to be significantly different in some observable variables; the positive result was robust to controls for these differences. In 2008, we found that the level of cooperation was statistically identical across the two neighborhoods, and similar to the levels observed in 2007 in the treatment one. We conclude that the CCT program did improve cooperation. In analyzing the effect of the CCT on cooperation we also look at other (individual and group) determinants of individual behavior in the game, and we compare our measure based on behavior in the game to more traditional measures of social capital used in the literature that we collected in a context-specific survey.
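The difference-in-differences logic described above can be sketched in a few lines. Here the neighborhood whose program status changes between the two rounds is Ciénaga, while Pozón, already covered in both rounds, serves as the comparison. The cooperation rates below are hypothetical placeholders chosen only to mirror the qualitative pattern in the abstract; they are not the study's data, and the function name is an assumption.

```python
def diff_in_diff(
    changed_pre: float,
    changed_post: float,
    comparison_pre: float,
    comparison_post: float,
) -> float:
    """Change in the group whose status changed, net of the comparison group's change."""
    return (changed_post - changed_pre) - (comparison_post - comparison_pre)


# Hypothetical cooperation rates (share of subjects contributing in the game).
cienaga_2007, cienaga_2008 = 0.25, 0.40  # program arrives between rounds
pozon_2007, pozon_2008 = 0.40, 0.40      # program in place in both rounds

effect = diff_in_diff(cienaga_2007, cienaga_2008, pozon_2007, pozon_2008)
print(f"estimated program effect on cooperation: {effect:.2f}")
```

Subtracting the comparison group's change removes any shift common to both neighborhoods between 2007 and 2008, under the usual assumption that the two would otherwise have followed parallel trends.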
This study departs from extant work on diversity and development in several respects. Using DHS data from a large number of developing countries, we adopt four human development outcomes: child mortality, fertility, education, and wealth. We exploit evidence at multiple levels (country, subnational region, and district) and we measure diversity in a variety of ways. This approach reveals that although diversity may have negative ramifications for human development at the national level, it is very unlikely to have the same effects at subnational levels.