
The geographic spread of COVID-19 correlates with structure of social networks as measured by Facebook


Abstract

We use anonymized and aggregated data from Facebook to show that areas with stronger social ties to two early COVID-19 "hotspots" (Westchester County, NY, in the U.S. and Lodi province in Italy) generally have more confirmed COVID-19 cases as of March 30, 2020. These relationships hold after controlling for geographic distance to the hotspots as well as for the income and population density of the regions. These results suggest that data from online social networks may prove useful to epidemiologists and others hoping to forecast the spread of communicable diseases such as COVID-19.
Theresa Kuchler, Dominic Russel, Johannes Stroebel
To forecast the geographic spread of communicable diseases such as COVID-19, it is valuable
to know which individuals are likely to physically interact (Piontti et al., 2018). Yet, the
geographic structure of social networks and interactions is usually hard to measure on a
national or global scale. In Bailey et al. (2018b), we showed how data from online social
networking services can be used to measure and understand the geographic structure of social
networks. We introduced a new data set, the Social Connectedness Index, which captures the
relative probability that individuals across two regions are connected through a friendship
link on Facebook, a global online social network. At the time, we suggested that such a
measure of the geographic structure of social networks may be helpful to epidemiologists
hoping to forecast the spread of communicable diseases. The idea was that two regions
connected through many friendship links are likely to see more physical interactions between
their residents, providing increased opportunities for the spread of communicable diseases.
In this note, we explore the relationship between the geographic spread of COVID-19
and the geographic structures of social networks in the United States and in Italy. We
show that regions with stronger social ties to early COVID-19 “hotspots” in each country
(Westchester County, NY, in the United States, and Lodi province in Italy) have more
documented COVID-19 cases per resident as of March 30, 2020. This relationship is robust
to controlling for the geographic distance to these early “hotspots”, as well as a number of
demographic characteristics of the regions.
Date: April 8, 2020. arXiv:2004.03055v1 [physics.soc-ph], 7 Apr 2020. The social connectedness data
used in this paper (as well as similar data for a wide range of other geographies) are accessible
to other researchers by emailing sci_data@fb.com.
Theresa Kuchler: New York University, Stern School of Business. Email: tkuchler@stern.nyu.edu
Dominic Russel: New York University, Stern School of Business. Email: drussel@stern.nyu.edu
Johannes Stroebel (corresponding author): New York University, Stern School of Business. Email: johannes.stroebel@nyu.edu
Our objective is not to incorporate social connectedness data into a state-of-the-art
epidemiological model, but instead to provide a “proof of concept” by highlighting that social
connectedness as measured by our Social Connectedness Index is correlated with COVID-19
prevalence in a statistically meaningful way. This finding suggests to us that the
geographic structure of social networks as measured by Facebook may indeed provide a useful
proxy for the type of social interactions that epidemiologists have long known to contribute
to the spread of communicable diseases.[1] We thus hope that the Social Connectedness Index
can help epidemiologists with forecasting the spread of communicable diseases, in particular
given that these data are easily accessible to researchers by emailing sci_data@fb.com.
Data Description
To measure the intensity of social connectedness between locations, we use an anonymized
snapshot of all active Facebook users and their friendship networks from March 2020. As
of the end of 2019, Facebook had nearly 2.5 billion monthly active users around the world:
248 million in the U.S. and Canada, 394 million in Europe, 1.04 billion in Asia-Pacific, and
817 million in the rest of the world. The data therefore have extremely wide coverage, and
provide a unique opportunity to map the geographic structure of social networks around the
world. Locations are assigned to users based on their information and activity on Facebook,
including their public profile information, and device and connection information. Our
measure of the social connectedness between two locations i and j is provided by the Social
Connectedness Index (SCI) introduced by Bailey et al. (2018b):
$$\text{SocialConnectedness}_{ij} = \frac{\text{FB\_Connections}_{ij}}{\text{FB\_Users}_{i} \times \text{FB\_Users}_{j}} \tag{1}$$
Here, FB_Connections_ij is the total number of Facebook friendship links between Facebook
users living in location i and Facebook users living in location j.[2] FB_Users_i and FB_Users_j
are the number of active users in each location. SocialConnectedness_ij thus measures the
relative probability of a Facebook friendship link between a given Facebook user in location
i and a given Facebook user in location j: if this measure is twice as large, a given Facebook
user in region i is twice as likely to be friends with a given Facebook user in region j.
[1] A number of other researchers have explored how different aspects of social media and internet-usage
patterns can be used for tracking and preventing disease (for one overview see Aiello et al., 2020). Our hope
is to add to this exciting work.
[2] Establishing a connection on Facebook requires the consent of both individuals, and there is an upper
limit of 5,000 on the number of connections a person can have. As a result, Facebook connections are
generally more likely to be between real-world acquaintances than links on many other social networking
platforms.
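As an illustration of Equation (1), the following minimal Python sketch computes the index from raw counts. The function name, the input layout (a list of region pairs, one per friendship link, plus a dictionary of active users per region), and the toy numbers are assumptions made for this example; they do not describe how the published SCI data are actually built.

```python
from collections import Counter


def social_connectedness(links, users):
    """Compute SCI_ij = FB_Connections_ij / (FB_Users_i * FB_Users_j).

    links: iterable of (region_a, region_b) tuples, one per friendship link
           (hypothetical input format chosen for this sketch).
    users: dict mapping region -> number of active users.
    Returns a dict keyed by the sorted region pair.
    """
    connections = Counter()
    for a, b in links:
        # Friendship links are undirected, so (A, B) and (B, A) count together.
        connections[tuple(sorted((a, b)))] += 1
    return {
        (i, j): n / (users[i] * users[j])
        for (i, j), n in connections.items()
    }


if __name__ == "__main__":
    toy_links = [("A", "B"), ("B", "A"), ("A", "C")]
    toy_users = {"A": 10, "B": 5, "C": 2}
    for pair, value in sorted(social_connectedness(toy_links, toy_users).items()):
        print(pair, value)
```

Because the index divides by the product of the two user counts, it is comparable across region pairs of very different sizes, which is what allows the "twice as likely" reading above.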
In previous work, we have shown that this measure of social connectedness is useful for
describing real-world social networks. We also documented that it predicts a large number
of important economic and social interactions. For example, social connectedness as
measured through Facebook friendship links is strongly related to patterns of sub-national
and international trade (Bailey et al., 2020a), patent citations (Bailey et al., 2018b), travel
flows (Bailey et al., 2019b, 2020b), and investment decisions (Kuchler et al., 2020). More
generally, we have found that information on individuals’ Facebook friendship links can help
understand their product adoption decisions (Bailey et al., 2019c) and their housing and
mortgage choices (Bailey et al., 2018a, 2019a).
In the next section, we use these data to explore how the domestic spread of confirmed
COVID-19 cases is related to the social connectedness to two early COVID-19 “hotspots”:
Westchester County, NY, in the U.S., and Lodi Province in Italy. Westchester County
includes New Rochelle, a community that had the first major COVID-19 outbreak in the
eastern United States (NPR, March 10, 2020). As of March 30th, the county had over 9,300
cases, second only to nearby New York City. Additionally, a number of articles have reported
wealthy residents from Westchester and the New York area fleeing to other parts of the U.S.
(New York Times, March 25, 2020), providing a vector that could potentially spread the
disease across the country. Social connections to Westchester may thus provide particularly
important information for tracking the spread of COVID-19, especially if individuals’ travel
patterns follow their social networks, as suggested by Bailey et al. (2019b, 2020b). Lodi is an
Italian province of around 230,000 inhabitants in the heavily impacted region of Lombardy.
It contains Codogno, where the earliest cases of COVID-19 in Italy were detected, and has
been at the center of Italy’s outbreak (New York Times, March 21, 2020).
Data on confirmed COVID-19 cases in the United States by county come from the Johns
Hopkins University Center for Systems Science and Engineering.[3] Similarly, data on
confirmed COVID-19 cases for each Italian province come from the Italian Dipartimento della
Protezione Civile.[4] We use data from March 30th, 2020, but our results are robust to
using data from prior days. At this stage it is important to note that, as with any data on
confirmed cases, some bias may be introduced by differential testing across regions.
[3] Available at https://github.com/CSSEGISandData/COVID-19.
[4] Available at https://github.com/pcm-dpc. Because Italian provinces on the island of Sardinia do
not align with current European NUTS3 regions (the level at which we aggregate social connectedness), we
include Sardinia as a single observation throughout our analysis.
Results
Panel (a) of Figure 1 shows a heatmap of the social connectedness of Westchester County, NY,
to all other U.S. counties; darker colors correspond to stronger social ties. Panel (b) shows
the distribution of COVID-19 cases per 10,000 residents across U.S. counties, again with
darker colors corresponding to higher COVID-19 prevalence. These maps show a number
of similarities. Perhaps most notably, coastal regions and urban centers appear to have
both high levels of connectedness to Westchester and larger numbers of COVID-19 cases per
resident. But a number of more subtle patterns emerge as well. Both measures are high
in the communities along the coasts of Florida (in particular along the southeastern coast,
near Miami), in western and central Colorado (in particular in areas with ski resorts), and
in the upper northeast. These areas are all popular vacation destinations and second home
locations for many well-heeled residents of Westchester. Indeed, the governors of Florida
and Rhode Island have both publicly lamented the number of New York area residents
fleeing to their states and spreading COVID-19 (Tampa Bay Times, March 23, 2020; Time,
March 28, 2020). By contrast, many areas that are geographically closer but less socially
connected to Westchester, such as in western Pennsylvania and West Virginia, have fewer
confirmed COVID-19 cases. There are also a number of patterns of COVID-19 prevalence
that connectedness to Westchester alone cannot explain. Areas surrounding King County,
WA (Seattle), for example, have relatively low levels of connectedness to Westchester, but
were an independent early hotspot of COVID-19. Some states in the southern U.S. where
residents were slower to limit travel also have higher case densities than would be predicted
purely by social connectedness to Westchester (New York Times, April 2, 2020).
The two bottom panels of Figure 1 explore the relationship between COVID-19 prevalence
and social ties to Westchester more formally. Panel (c) shows a binscatter plot of
social connectedness to Westchester County and the number of COVID-19 cases per 10,000
residents. We exclude those counties within 50 miles of Westchester County: while those
areas have strong social links to Westchester, they are also close enough geographically such
that their populations might interact physically with Westchester residents even in the ab-
sence of social links (e.g., in supermarkets and houses of worship). There is a strong positive
relationship between COVID-19 prevalence and social ties to Westchester. Quantitatively,
a doubling of a county’s social connectedness to Westchester is associated with an increase
of about 0.88 COVID-19 cases per 10,000 residents. The R-Squared of this relationship is
0.093, suggesting that, in a statistical sense, 9.3% of the cross-county variation in COVID-19
cases can be explained by counties’ social connectedness to Westchester.
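The binscatter construction and the "doubling" interpretation can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, the bins are formed by sorting on log(SCI) and splitting into roughly equal-sized groups, and the doubling effect assumes a regression that is linear in the natural log of SCI, in which case a doubling shifts the outcome by the slope times ln 2. Under that assumption, an effect of 0.88 cases per 10,000 residents would correspond to a slope of roughly 1.27 on log(SCI).

```python
import math

import numpy as np


def binscatter(x, y, n_bins=30):
    """Sort observations by x, split them into n_bins roughly equal-sized
    groups, and return the mean of x and the mean of y within each group.
    Assumes there are at least n_bins observations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)
    groups = np.array_split(order, n_bins)
    x_means = np.array([x[g].mean() for g in groups])
    y_means = np.array([y[g].mean() for g in groups])
    return x_means, y_means


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (slope, R^2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (intercept + slope * x)
    total = y - y.mean()
    r2 = 1.0 - (residuals @ residuals) / (total @ total)
    return slope, r2


def effect_of_doubling(slope_on_log_x):
    """Change in y when x doubles, if y is linear in ln(x)."""
    return slope_on_log_x * math.log(2)
```

Plotting `x_means` against `y_means` gives the binned scatter; the fitted slope and R-squared come from the underlying observations, not from the binned means.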
Figure 1: Social Network Distributions from Westchester and COVID-19 Cases in the U.S.
(a) Log of SCI to Westchester County, NY
(b) COVID-19 Cases per 10k Residents by County
(c) Westchester binscatter without controls (horizontal axis: log(Social Connectedness); vertical axis: cases per 10k people)
(d) Westchester binscatter with controls (horizontal axis: log(Social Connectedness); vertical axis: cases per 10k people)
Note: Panel (a) shows the social connectedness to Westchester for U.S. counties. Panel (b) shows the number
of confirmed COVID-19 cases by U.S. county on March 30th, 2020. Panels (c) and (d) show binscatter plots
with counties more than 50 miles from Westchester as the unit of observation. To generate the plot in
Panel (c) we group log(SCI) into 30 equal-sized bins and plot the average against the corresponding average
case density. We then group log(SCI) into 100 equal-sized bins and plot the average log(SCI) against the
corresponding average case density. Panel (d) is constructed in a similar manner. However, we first regress
log(SCI) and cases per 10,000 residents on a set of control variables and plot the residualized values on each
axis. Red lines show quadratic fit regressions. The controls for Panel (d) are 100 dummies for the percentile
of the county distance to Westchester from the National Bureau of Economic Research; population density
and median household income made available by Chetty et al. (2016); and dummies for the six National
Center for Health Statistics Urban-Rural county classifications.
One concern with interpreting these initial correlations is that they might be primarily
picking up other factors that affect the spread of COVID-19, and that are correlated with
social connectedness. Specifically, even after dropping counties within 50 miles of Westch-
ester, the correlations might be primarily picking up geographic distance to Westchester
(which is related to the number of friendship links to Westchester). As a result, including
social connectedness might not improve predictive power for models that already control
for some of these other variables. In Panel (d), we therefore present a binscatter plot of
the relationship between social connectedness to Westchester County and COVID-19 cases
that controls for a number of these possible confounding variables (in addition to excluding
nearby counties). Most importantly, we non-parametrically control for the geographic dis-
tance between each county and Westchester County by including 100 dummies for percentiles
of that distance. We also control for income, population density, and a classification of how
urban/rural a county is. Even conditional on these other factors, Panel (d) shows a strong
positive relationship between COVID-19 cases as of March 30, 2020 and social connectedness
to Westchester County. With these controls, a doubling of a county’s social connectedness
to Westchester is associated with an increase of about 0.80 COVID-19 cases per 10,000 res-
idents. The total R-Squared of the statistical relationship is 0.190, while the incremental
R-Squared from controlling for social connectedness to Westchester is 0.037.
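The residualization behind Panel (d) and the incremental R-squared can be sketched as follows. This is a minimal sketch, assuming ordinary least squares throughout; the helper names are hypothetical and the code is not the authors' own. Both log(SCI) and case density are regressed on the controls (including the distance-percentile dummies), and the relationship between the two sets of residuals gives the same slope as a regression that includes the controls directly. The incremental R-squared is taken here as the R-squared with SCI included minus the R-squared with controls only.

```python
import numpy as np


def percentile_dummies(values, n_groups=100):
    """One indicator column per equal-sized rank group of `values`
    (100 groups gives percentile dummies). Rows sum to one, so the
    dummies also absorb the regression constant."""
    values = np.asarray(values, dtype=float)
    ranks = np.argsort(np.argsort(values))       # ranks 0 .. n-1
    group = (ranks * n_groups) // len(values)    # group ids 0 .. n_groups-1
    return (group[:, None] == np.arange(n_groups)[None, :]).astype(float)


def residualize(target, controls):
    """Residuals from an OLS regression of target on the control columns."""
    coef, *_ = np.linalg.lstsq(controls, target, rcond=None)
    return target - controls @ coef


def partial_slope(y, x, controls):
    """Slope of residualized y on residualized x."""
    rx = residualize(x, controls)
    ry = residualize(y, controls)
    return (rx @ ry) / (rx @ rx)


def r_squared(y, design):
    """R^2 of an OLS regression of y on the columns of design."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    total = y - y.mean()
    return 1.0 - (resid @ resid) / (total @ total)


def incremental_r2(y, x, controls):
    """Gain in R^2 from adding x to a regression that already has the controls."""
    return r_squared(y, np.column_stack([controls, x])) - r_squared(y, controls)
```

In this sketch `controls` would stack the distance dummies with income, population density, and the urban-rural indicators; the dummies already span a constant, so no separate intercept column is needed.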
It is important to highlight that the purpose of this exercise is to demonstrate the pre-
dictive power of social connectedness measured via online social networks for COVID-19
prevalence. We chose the current set of control variables to highlight that the Social Con-
nectedness Index has such predictive power over and above a number of variables on which
data is already easily available, and that may partially proxy for social connections in models
of communicable disease spread. The observed increase in predictive power thus suggests
that the Social Connectedness Index might serve as a valuable measure above some existing
proxies for social interactions.[5]
Figure 2 explores the analogous relationships for Lodi province in Italy. The provinces
with highest COVID-19 case densities and connectedness to Lodi are in the surrounding
Lombardy region, as well as the nearby Piemonte and Veneto regions. There are also rela-
tively high levels of both connectedness to Lodi and COVID-19 cases in Rimini, a popular
tourist destination along the Adriatic Sea. A number of provinces in southern Italy send
workers and students to the industrial Lombardy region, and therefore have strong social
ties to that region. While some of these areas have seen a number of COVID-19 cases,
these numbers are not disproportionately large, perhaps reflecting the efforts of Italian authorities to
restrict the movement of individuals (LA Times, March 8, 2020). Panels (c) and (d) repeat
the binscatter exercise from Figure 1. We exclude provinces within 50 kilometers of Lodi. In Panel
(d) we control for geographic distance using 20 dummies for the quantile of the distance from
each province to Lodi, as well as GDP per inhabitant and population density. Again we find
that the Social Connectedness Index appears to have predictive power above these other
measures that might commonly be used to proxy for social interactions. Quantitatively, a
doubling of SCI corresponds to an increase of 16.6 COVID-19 cases per 10,000 residents
after controlling for these relevant factors. The incremental R-Squared of including social
connectedness to Lodi over the other control variables is 0.057.
[5] This is not to suggest that the Social Connectedness Index is the only such measure, and we believe
that further advances can be made using other data sources, such as cell-phone location pings. But the Social
Connectedness Index has a number of advantages, including the fact that it is easily accessible to researchers
and consistently available for a large number of global geographies.
Figure 2: Social Network Distributions of Lodi and COVID-19 Cases in Italy
(a) Percentile of SCI to Lodi Province, Italy
(b) COVID-19 Cases per 10k Residents by Province
(c) Lodi binscatter without controls (horizontal axis: log(Social Connectedness); vertical axis: cases per 10k people)
(d) Lodi binscatter with controls (horizontal axis: log(Social Connectedness); vertical axis: cases per 10k people)
Note: Panel (a) shows a measure of social connectedness to Lodi for Italian provinces. Panel (b) shows
the number of confirmed COVID-19 cases by Italian province on March 30th, 2020. Panels (c) and (d) show
binscatter plots with provinces more than 50 kilometers from Lodi as the unit of observation. To generate the
plot in Panel (c) we group log(SCI) into 30 equal-sized bins and plot the average against the corresponding
average case density. Panel (d) is constructed in a similar manner. However, we first regress log(SCI) and
cases per 10,000 residents on a set of control variables and plot the residualized values on each axis. Red lines
show quadratic fit regressions. The controls for Panel (d) are 20 dummies for the quantile of the province
distance to Lodi; GDP per inhabitant; and population density.
Caveats
It is important, at this stage, to re-emphasize that we are not epidemiologists, and that the
goal of this note is not to provide an epidemiological model of the spread of COVID-19.[6] In
normal times we would not venture this far from our primary area of expertise and study
the spread of a disease like COVID-19. Indeed, in Bailey et al. (2018b), we explicitly
proposed the modeling of communicable diseases as a potentially fruitful direction for others
to pursue, without attempting any such modeling ourselves. However, these are not normal
times, and we have spent much of the last few years exploring these data on the geographic
structure of social networks. In the process, we have found them to be extremely useful for
understanding a large number of social and economic relationships such as trade patterns,
patent citations, and travel flows. Given the urgency of the current global health crisis,
we hope that our expertise in measuring social networks can therefore contribute to the
worldwide interdisciplinary research effort to better understand COVID-19.
In particular, we hope that some of the initial patterns we document in this note —
together with our earlier work showing how social connections as measured by Facebook can
explain many important social and economic phenomena — might be sufficiently striking to
epidemiologists such that they would want to incorporate the Social Connectedness Index
data in their own work. For example, the availability of zip-code level data on social con-
nectedness in the United States as well as similar data for many countries around the world
will allow for more detailed modeling as COVID-19 case data becomes available at that level
of geographic disaggregation. We would be excited to work with any interested team to help
them get the most out of the Social Connectedness Index data.
[6] As a result, please also excuse our woefully incomplete-to-nonexistent review of the large related
literature in epidemiology and related fields. We are grateful for any guidance in this direction.
References
A. E. Aiello, A. Renson, and P. N. Zivich. Social media- and internet-based disease surveillance
for public health. Annual Review of Public Health, 41:101–118, 2020.
M. Bailey, R. Cao, T. Kuchler, and J. Stroebel. The economic effects of social networks:
Evidence from the housing market. Journal of Political Economy, 126(6):2224–2276, 2018a.
M. Bailey, R. Cao, T. Kuchler, J. Stroebel, and A. Wong. Social connectedness:
Measurements, determinants, and effects. Journal of Economic Perspectives, 32(3):259–80, 2018b.
M. Bailey, E. Dávila, T. Kuchler, and J. Stroebel. House price beliefs and mortgage leverage
choice. The Review of Economic Studies, 86(6):2403–2452, 2019a.
M. Bailey, P. Farrell, T. Kuchler, and J. Stroebel. Social connectedness in urban areas.
Working Paper 26029, National Bureau of Economic Research, 2019b.
M. Bailey, D. M. Johnston, T. Kuchler, J. Stroebel, and A. Wong. Peer effects in product
adoption. Working Paper 25843, National Bureau of Economic Research, 2019c.
M. Bailey, A. Gupta, S. Hillenbrand, T. Kuchler, R. Richmond, and J. Stroebel. International
trade and social connectedness. Working paper, 2020a.
M. Bailey, T. Kuchler, D. Russel, B. State, and J. Stroebel. Social connectedness in Europe.
Working paper, 2020b.
R. Chetty, J. N. Friedman, N. Hendren, M. R. Jones, and S. R. Porter. The opportunity atlas:
Mapping the childhood roots of social mobility. National Bureau of Economic Research
Working Paper No. 25147, 2016.
T. Kuchler, L. Peng, J. Stroebel, Y. Li, and D. Zhou. Social proximity to capital:
Implications for investors and firms. Working paper, 2020.
A. P. Piontti, N. Perra, L. Rossi, N. Samay, and A. Vespignani. Charting the Next Pandemic:
Modeling Infectious Disease Spreading in the Data Science Age. Springer, 2018.