
Laws of Population Growth

December 2008
Proceedings of the National Academy of Sciences 105(48):18702-7


DOI:10.1073/pnas.0807435105

Source
PubMed

Authors:

Hernan D. Rozenfeld

American Physical Society

Michael Batty

University College London


An important issue in the study of cities is defining a metropolitan area, because different definitions affect conclusions regarding the statistical distribution of urban activity. A commonly employed method of defining a metropolitan area is the Metropolitan Statistical Areas (MSAs), based on rules attempting to capture the notion of city as a functional economic region, and it is performed by using experience. The construction of MSAs is a time-consuming process and is typically done only for a subset (a few hundreds) of the most highly populated cities. Here, we introduce a method to designate metropolitan areas, denoted "City Clustering Algorithm" (CCA). The CCA is based on spatial distributions of the population at a fine geographic scale, defining a city beyond the scope of its administrative boundaries. We use the CCA to examine Gibrat's law of proportional growth, which postulates that the mean and standard deviation of the growth rate of cities are constant, independent of city size. We find that the mean growth rate of a cluster by utilizing the CCA exhibits deviations from Gibrat's law, and that the standard deviation decreases as a power law with respect to the city size. The CCA allows for the study of the underlying process leading to these deviations, which are shown to arise from the existence of long-range spatial correlations in population growth. These results have sociopolitical implications, for example, for the location of new economic development in cities of varied size.


arXiv:0808.2202v2 [physics.soc-ph] 10 Sep 2009

Laws of Population Growth

Hernán D. Rozenfeld¹, Diego Rybski¹, José S. Andrade Jr.², Michael Batty³, H. Eugene Stanley⁴, and Hernán A. Makse¹,²

¹ Levich Institute and Physics Department, City College of New York, New York, NY 10031, USA

² Departamento de Física, Universidade Federal do Ceará, 60451-970 Fortaleza, Ceará, Brazil

³ Centre for Advanced Spatial Analysis, University College London, 1-19 Torrington Place, London WC1E 6BT, UK

⁴ Center for Polymer Studies and Physics Department, Boston University, Boston, MA 02215, USA


I. INTRODUCTION

In recent years there has been considerable work on how to define cities and how the different definitions affect the statistical distribution of urban activity [1, 2]. This is a long-standing problem in the spatial analysis of aggregated data sources, referred to as the ‘modifiable areal unit problem’ or the ‘ecological fallacy’ [3, 4], where different definitions of spatial units based on administrative or governmental boundaries give rise to inconsistent conclusions with respect to explanations and interpretations of data at different scales. The conventional method of defining human agglomerations is through the MSAs [1, 2, 5, 6, 7], which are subject to socio-economic factors. The MSA has been of indubitable importance for the analysis of population growth, and is constructed manually, case by case, based on subjective judgment (MSAs are defined starting from a highly populated central area and adding its surrounding counties if they have social or economic ties).

In this report, we propose a new way to measure the extent of human agglomerations based on clustering techniques using a fine geographical grid, covering both urban and rural areas. In this view, “cities” represent clusters of population, i.e., adjacent populated geographical spaces. Our algorithm, the “city clustering algorithm” (CCA), allows for an automated and systematic way of building population clusters based on the geographical location of people. The CCA has one parameter (the cell size) that is useful for the study of human agglomerations at different length scales, similar to the level of aggregation in the context of social sciences. We show that the CCA allows for the study of the origin of statistical properties of population growth. We use the CCA to analyze the postulates of Gibrat’s law of proportional growth applied to cities, which assumes that the mean and standard deviation of the growth rates of cities are constant. We show that population growth at a fine geographical scale for different urban and regional systems at country and continental levels (Great Britain, the USA, and Africa) deviates from Gibrat’s law. We find that the mean and standard deviation of population growth rates decrease with population size, in some cases following a power-law behavior. We argue that the underlying demographic process leading to the deviations from Gibrat’s law can be modeled from the existence of long-range spatial correlations in the growth of the population, which may arise from the concept that “development attracts further development.” These results have implications for social policies, such as those pertaining to the location of new economic development in cities of different sizes. The present results imply that, on average, the greatest growth rate occurs in the smallest places, where there is the greatest risk of failure (larger fluctuations). A corollary is that the safest growth occurs in the largest places, which have less likelihood of rapid growth.

The analyzed data consist of the number of inhabitants, $n_i(t)$, in each cell $i$ of a fine geographical grid at a given time, $t$. The cell size varies for each data set used in this study. We consider three different geographic scales. On the smallest scale, the area of study is Great Britain (GB: England, Scotland and Wales), a highly urbanized country with a population of 58.7 million in 2007 and an area of 0.23 million km². The grid is composed of 5.75 million cells of 200m-by-200m [8]. At the intermediate scale, we study the USA (continental USA without Alaska), a single country nearly continental in scale, with a population of 303 million in 2007 and an area of 7.44 million km². The grid contains 7.44 million cells of approximately 1km-by-1km obtained from the US Census Bureau [9]. The datasets of GB and USA are populated-places datasets, with population counts defined at points in a grid. Since there could be some distortions in the true residential population involved at the finest grid resolution, we perform our analysis by investigating the statistical properties as a function of the grid size by coarse-graining the data as explained in Section IV A. At the largest scale, we analyze the continent of Africa, composed of 53 countries with a total population of 933 million in 2007 and an area of 30.34 million km². These data are gridded with less resolution by 0.50 million cells of approximately 7.74km-by-7.74km [10]. More detailed information about these datasets is found in Section IV A (all the datasets studied in this paper are available at http://lev.ccny.cuny.edu/~hmakse/cities/city data.zip).

II. RESULTS

Figure 1A illustrates operation of the CCA. In order to identify urban clusters, we require

connected cells to have nonzero population. We start by selecting an arbitrary populated

cell (ﬁnal results are independent of the choice of the initial cell). Iteratively, we then grow a

cluster by adding nearest neighbors of the boundary cells with a population strictly greater

than zero, until all neighbors of the boundary are unpopulated. We repeat this process until

all populated cells have been assigned to a cluster. This technique was introduced to model

forest ﬁre dynamics [11] and is termed the “burning algorithm,” since one can think of each

populated cell as a burning tree.
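The burning procedure described above can be sketched as a breadth-first flood fill. This is only an illustrative sketch, not the authors' code: the function name `city_clustering`, the list-of-lists grid layout, and the choice of the four edge-sharing cells as "nearest neighbors" are assumptions made here.

```python
from collections import deque


def city_clustering(grid):
    """Label clusters of populated cells on a 2D grid (illustrative CCA sketch).

    grid: list of rows, each a list of non-negative cell populations.
    Returns (labels, populations): labels[y][x] is the cluster index of the
    cell, or -1 if it is unpopulated; populations[c] is the total population
    of cluster c (the sum of its cell populations).
    Assumption: neighbors are the 4 edge-sharing cells.
    """
    rows, cols = len(grid), len(grid[0])
    labels = [[-1] * cols for _ in range(rows)]
    populations = []
    for y0 in range(rows):
        for x0 in range(cols):
            # Skip empty cells and cells that were already "burned".
            if grid[y0][x0] <= 0 or labels[y0][x0] != -1:
                continue
            cluster = len(populations)
            labels[y0][x0] = cluster
            total = 0
            frontier = deque([(y0, x0)])
            while frontier:
                y, x = frontier.popleft()
                total += grid[y][x]
                # Burn every populated, not yet labeled neighbor.
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] > 0 and labels[ny][nx] == -1):
                        labels[ny][nx] = cluster
                        frontier.append((ny, nx))
            populations.append(total)
    return labels, populations
```

Because every populated cell is visited exactly once, the resulting partition does not depend on which populated cell is picked first; only the numbering of the clusters does.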

The population $S_i(t)$ of cluster $i$ at time $t$ is the sum of the populations $n^{(i)}_j(t)$ of each cell $j$ within it, $S_i(t) = \sum_{j=1}^{N_i} n^{(i)}_j(t)$, where $N_i$ is the number of cells in the cluster. Results of the CCA are shown in Fig. 1B, representing the urban cluster surrounding the City of London (red cluster overlaying a satellite image; see http://lev.ccny.cuny.edu/~hmakse/cities/london.gif for an animated image of Fig. 1B). Figure 1C depicts all the clusters of GB, indicating the large variability in their population and size.

A feature of the CCA is that it allows the analysis of the population clusters at diﬀerent

length scales by coarse-graining the grid and applying the CCA to the coarse-grained dataset

(see Section IV A for details on coarse-graining the data). At larger scales, disconnected

areas around the edge of a cluster could be added into the cluster. This is justiﬁed when,

for example, a town is divided by a wide highway or a river.

Tables I and II in Supporting Information (SI) Section I. show a detailed comparison

between the urban clusters obtained with the CCA applied to the USA in 1990, and the

results obtained from the analysis of MSAs from the US Census Bureau used in previous

studies of population growth [5, 6, 7]. We observe that the MSAs considered in Ref. [5] are

similar to the clusters obtained with the CCA with a cell size of 4km-by-4km or 8km-by-

8km. In particular, the population sizes of the clusters have the same order of magnitude

as the MSAs. On the other hand, for large cities the MSAs from the data of Ref. [6] seem

to be mostly comparable to our results for cell sizes of 2km-by-2km or 4km-by-4km.

Use of the CCA permits a systematic study of cluster dynamics. For instance, clusters may expand or contract, merge or split between two considered times, as illustrated in Fig. 2. We quantify these processes by measuring the probability distribution of the temporal changes in the clusters for the data of GB. We find that when the cell size is 2.2km-by-2.2km, 84% of the clusters evolve from 1981 to 1991 following the first three cases presented in Fig. 2 (no change, expansion or reduction), 6% of the clusters merge from two clusters into one in 1991, and 3% of the clusters split into two clusters.

Next, we apply the CCA to study the dynamics of population growth by investigating Gibrat’s law, which postulates that the mean and standard deviation of growth rates are constant [1, 2, 5, 7, 12]. The conventional method [1, 2, 7] is to assume that the populations of a given city or cluster $i$, at times $t_0$ and $t_1 > t_0$, are related by

$$S_1 = R(S_0)\, S_0, \tag{1}$$

where $S_0 \equiv S_i(t_0) = \sum_{j}^{N_i} n^{(i)}_j(t_0)$ and $S_1 \equiv S_i(t_1) = \sum_{j}^{N_i} n^{(i)}_j(t_1)$ are the initial and final populations of cluster $i$, respectively, and $R(S_0)$ is the positive growth factor, which varies from cluster to cluster. Following the literature in population dynamics [1, 2, 5, 7], we define the population growth rate of a cluster as $r(S_0) \equiv \ln R(S_0) = \ln(S_1/S_0)$, and study the dependence of the mean value of the growth rate, $\langle r(S_0)\rangle$, and the standard deviation, $\sigma(S_0) = \sqrt{\langle r(S_0)^2\rangle - \langle r(S_0)\rangle^2}$, on the initial population, $S_0$. The averages $\langle r(S_0)\rangle$ and $\sigma(S_0)$ are calculated applying nonparametric techniques [13, 14] (see Section IV B for details). To obtain the population growth rate of clusters we take into account that not all clusters occupy the same area between $t_0$ and $t_1$, according to the cases discussed in Fig. 2. The figure shows how to calculate the growth rate $r(S_0)$ in each case.

We analyze the population growth in the USA from $t_0 = 1990$ to $t_1 = 2000$ [9]. We apply the CCA to identify the clusters in the data of 1990 and calculate their growth rates by comparing them to the population of the same clusters in 2000 when the data are gridded with a cell size of 2000m by 2000m. We calculate the annual growth rates by dividing $r$ by the time interval $t_1 - t_0$.
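As a small worked illustration of the growth rate definition, the sketch below computes the annualized rate $r = \ln(S_1/S_0)/(t_1 - t_0)$. The function name and its list-based arguments are hypothetical; summing several initial or final populations follows the merge and split cases described in the caption of Fig. 2.

```python
import math


def annual_growth_rate(initial_parts, final_parts, t0, t1):
    """Annualized growth rate ln(S1/S0) / (t1 - t0) (illustrative sketch).

    initial_parts: populations at t0 of the cluster(s) involved; more than
                   one entry when clusters merge by t1 (S0 = S0' + S0'').
    final_parts:   populations at t1; more than one entry when a cluster
                   splits (S1 = S1' + S1'').
    """
    s0 = sum(initial_parts)
    s1 = sum(final_parts)
    if s0 <= 0 or s1 <= 0:
        raise ValueError("populations must be positive to take a logarithm")
    return math.log(s1 / s0) / (t1 - t0)


# A cluster growing from 10,000 to 11,000 inhabitants between 1990 and 2000:
rate = annual_growth_rate([10_000], [11_000], 1990, 2000)  # ln(1.1) / 10
```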

Figure 3A shows a nonparametric regression with bootstrapped 95% confidence bands [13, 14] of the growth rate of the USA, $\langle r(S_0)\rangle$ (see Section IV B for details). We find that the growth rate diminishes from $\langle r(S_0)\rangle \approx 0.012 \pm 0.004$ (error includes the confidence bands) for populations below $10^4$ inhabitants to $\langle r(S_0)\rangle \approx 0.002 \pm 0.002$ for the largest populations around $S_0 \approx 10^7$. We may argue that the mean growth rate deviates from Gibrat’s law beyond the confidence bands. While it is difficult to fit the data to a single function for the entire range, the data show a decrease with $S_0$ approximately following a power-law in the tail for populations larger than $10^4$. An attempt to fit the data with a power-law yields the following scaling in the tail:

$$\langle r(S_0)\rangle \sim S_0^{-\alpha}, \tag{2}$$

where $\alpha$ is the mean growth exponent, which takes a value $\alpha_{\rm USA} = 0.28 \pm 0.08$ from Ordinary Least Squares (OLS) analysis [15] (see Section IV B for details on OLS and on the estimation of the exponent error).

Figure 3B shows the dependence of the standard deviation $\sigma(S_0)$ on the initial population $S_0$. On average, fluctuations in the growth rate of large cities are smaller than for small cities, in contrast to Gibrat’s law. This result can be approximated over many orders of magnitude by the power-law

$$\sigma(S_0) \sim S_0^{-\beta}, \tag{3}$$

where $\beta$ is the standard deviation exponent. We carry out an OLS regression analysis and find that $\beta_{\rm USA} = 0.20 \pm 0.06$. The presence of a power-law implies that fluctuations in the growth process are statistically self-similar at different scales, for populations ranging from $\sim 1000$ to $\sim 10$ million according to Fig. 3B.

Figure 4 shows the analysis of the growth rate of the population clusters of GB from gridded databases [8] with a cell size of 2.2km-by-2.2km at $t_0 = 1981$ and $t_1 = 1991$. The average growth rate depicted in Fig. 4A comprises large fluctuations as a function of $S_0$, especially for smaller populations. However, a slight decrease with population seems evident, from rates around $\langle r\rangle \approx 0.008 \pm 0.001$ with $S_0 \approx 10^4$ dropping to zero or even negative values for the largest populations, $S_0 \approx 10^6$. We find that 3556 clusters with population around $S_0 = 10^3$ exhibit negative growth rates as well. Thus, the mean rates are plotted on a semi-logarithmic scale in Fig. 4A. When considering intermediate populations ranging from $S_0 = 3000$ to $S_0 = 3 \times 10^5$, the data seem to follow approximately a power-law with $\alpha_{\rm GB} = 0.17 \pm 0.05$ from OLS regression analysis, as shown in the inset of Fig. 4A. Figure 4B shows the standard deviation for GB, $\sigma(S_0)$, exhibiting deviations from Gibrat’s law, with a tendency to decrease with population according to Eq. (3) and a standard deviation exponent $\beta_{\rm GB} = 0.27 \pm 0.04$, obtained with the OLS technique.

The CCA allows for a study of the growth rates as a function of the scale of observation, by changing the size of the grid. We find (SI Section II) that the data for GB are approximately invariant under coarse-graining the grid at different levels for both the mean and standard deviation. When the data of the USA are aggregated spatially from cell size 2000m to 8000m, the scaling of the mean rates crosses over to a flat behavior closer to Gibrat’s law. At the scale of 8000m the mean is approximately constant (with fluctuations). However, we find that, at this scale, all cities in the northeastern USA spanning from Boston to Washington D.C. form a single cluster. Despite these differences, the scaling of the standard deviation for the USA holds approximately invariant even up to the large scale of observation of 8000m.

Next, we analyze the population growth in Africa during the period 1960 to 1990 [10]. In this case the population data are based on a larger cell size, so we evaluate the data cell by cell (without the application of the CCA). Despite the differences in the economic and urban development of Africa, Great Britain and the USA, we find that the mean and standard deviation of the growth rate in Africa display similar scaling as found for the USA and GB. In Fig. 5A we show the results for the growth rate in Africa when the grid is coarse-grained with a cell size of 77km-by-77km. We find a decrease of the growth rate from $\langle r(S_0)\rangle \approx 0.1$ to $\langle r(S_0)\rangle \approx 0.01$ between populations $S_0 \approx 10^3$ and $S_0 \approx 10^6$, respectively. All populations have positive growth rates. A log-log plot of the mean rates shown in Fig. 5A reveals a power-law scaling $\langle r(S_0)\rangle \sim S_0^{-\alpha_{\rm Af}}$, with $\alpha_{\rm Af} = 0.21 \pm 0.05$ from OLS regression analysis. The standard deviation (Fig. 5B) satisfies Eq. (3) with a standard deviation exponent $\beta_{\rm Af} = 0.19 \pm 0.04$.

The CCA allows for a study of the origin of the observed behavior of the growth rates by examining the dynamics and spatial correlations of the population of cells. To this end, we first generate a surrogate dataset that consists of shuffling two randomly chosen populated cells, $n^{(i)}_j(t_0)$ and $n^{(i)}_k(t_0)$, at time $t_0$. This swapping process preserves the probability distribution of $n^{(i)}_j$, but destroys any spatial correlations among the population cells. Figure 4C shows the results of the randomization of the GB dataset, indicating power-law scaling in the tail of $\sigma(S_0)$ with standard deviation exponent $\beta_{\rm rand} = 1/2$. This result can be interpreted in terms of the uncorrelated nature of the randomized dataset (SI Section III). We consider that the population of each cell $j$ increases by a random amount $\delta_j$ with mean value $\bar\delta$ and variance $\langle(\delta - \bar\delta)^2\rangle = \Delta^2$, and that $r \ll 1$; then $n^{(i)}_j(t_1) = n^{(i)}_j(t_0) + \delta_j$. Therefore, the population of a cluster at time $t_1$ can be written as

$$S_1 = S_0 + \sum_{j=1}^{N_i} \delta_j. \tag{4}$$

It can be shown that (SI Section III):

$$\langle S_1^2\rangle = \langle S_0^2\rangle + \sum_{j,k}^{N_i} \langle(\delta_j - \bar\delta)(\delta_k - \bar\delta)\rangle. \tag{5}$$

Randomly shuffling population cells destroys the correlations, leading to $\langle(\delta_j - \bar\delta)(\delta_k - \bar\delta)\rangle = \Delta^2 \delta_{jk}$ (where $\delta_{jk}$ is the Kronecker delta function), which implies $\beta_{\rm rand} = 1/2$ [16] (see SI Section III).
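The randomization step can be sketched as follows. This is a hedged illustration rather than the authors' procedure: the text describes swapping pairs of randomly chosen populated cells, whereas the hypothetical helper `shuffled_surrogate` below draws a full random permutation of the populated cells, which is the limit of many such swaps. Unpopulated cells are left in place.

```python
import random


def shuffled_surrogate(cell_populations, seed=None):
    """Return a spatially randomized copy of a list of cell populations.

    Populations of populated cells (value > 0) are permuted at random among
    the populated positions; empty cells stay where they are. The multiset of
    values, and hence the probability distribution of cell populations, is
    preserved, while spatial correlations between cells are destroyed.
    """
    rng = random.Random(seed)
    populated = [i for i, n in enumerate(cell_populations) if n > 0]
    values = [cell_populations[i] for i in populated]
    rng.shuffle(values)
    surrogate = list(cell_populations)  # copy; the input is not modified
    for i, v in zip(populated, values):
        surrogate[i] = v
    return surrogate
```

The flat list stands for the cells of the grid in any fixed order; for a 2D grid one could flatten it, shuffle, and reshape.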

The fact that $\beta$ lies below the random exponent ($\beta_{\rm rand} = 1/2$) for all the analyzed data suggests that the dynamics of the population cells display spatial correlations, which are eliminated in the random surrogate data. The cells are not occupied randomly; rather, spatial correlations arise, since when the population in one cell increases, the probability of growth in an adjacent cell also increases. That is, development attracts further development, an idea that has been used to model the spatial distribution of urban patterns [17]. Indeed, these ideas are related to the study of the origin of power-laws in complex systems [18, 19].

When we analyze the populated cells, we indeed find that spatial correlations in the incremental population of the cells, $\delta_j$, are asymptotically of a scale-invariant form characterized by a correlation exponent $\gamma$,

$$\langle(\delta_j - \bar\delta)(\delta_k - \bar\delta)\rangle \sim \frac{\Delta^2}{|\vec{x}_j - \vec{x}_k|^{\gamma}}, \tag{6}$$

where $\vec{x}_j$ is the location of cell $j$. For GB we find $\gamma = 0.93 \pm 0.08$ (see Fig. 4D). In SI Section III we show that power-law correlations in the fluctuations at the cell level, Eq. (6), lead to a standard deviation exponent $\beta = \gamma/4$. For $\gamma = 2$, the dimension of the substrate, we recover $\beta_{\rm rand} = 1/2$ (larger values of $\gamma$ result in the same $\beta$, since when $\gamma > 2$ correlations become irrelevant). If $\gamma = 0$, the standard deviation of the population growth rates has no dependence on the population size ($\beta = 0$), as stated by Gibrat’s law. In the case of GB, $\gamma = 0.93 \pm 0.08$ gives $\beta = 0.23 \pm 0.02$, approximately consistent with the measured value $\beta_{\rm GB} = 0.27 \pm 0.04$ within the error bars. This observation suggests that the underlying demographic process leading to the scaling in the standard deviation can be modeled as arising from the long-range correlated growth of population cells.
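The relation between the correlation exponent and the standard deviation exponent can be motivated by a short scaling argument. The sketch below is a heuristic written for this purpose, not the derivation of SI Section III; it assumes $r \ll 1$, a compact two-dimensional cluster of $N_i$ cells with linear size $L \sim N_i^{1/2}$, and $S_0 \propto N_i$. The integration variable $\rho$ (distance between cells, in units of the cell size) is introduced here.

```latex
% Heuristic scaling argument under the assumptions stated in the lead-in.
\begin{align*}
r - \langle r\rangle &\approx \frac{1}{S_0}\sum_{j=1}^{N_i}(\delta_j-\bar\delta),
\qquad
\sigma^2(S_0) \approx \frac{1}{S_0^2}\sum_{j,k}^{N_i}
  \langle(\delta_j-\bar\delta)(\delta_k-\bar\delta)\rangle .\\[4pt]
\text{Uncorrelated cells:}\quad
&\sum_{j,k}^{N_i}\Delta^2\,\delta_{jk} = N_i\,\Delta^2
\;\Rightarrow\;
\sigma \sim \frac{N_i^{1/2}}{N_i} \sim S_0^{-1/2},
\qquad \beta_{\rm rand}=\tfrac{1}{2} .\\[4pt]
\text{Correlated cells } (\gamma<2):\quad
&\sum_{j,k}^{N_i}\frac{\Delta^2}{|\vec{x}_j-\vec{x}_k|^{\gamma}}
\sim \Delta^2 N_i\int_{1}^{L}\rho^{-\gamma}\,\rho\,d\rho
\sim \Delta^2 N_i\,L^{2-\gamma}
\sim \Delta^2 N_i^{\,2-\gamma/2},\\
&\Rightarrow\;
\sigma \sim \frac{N_i^{\,1-\gamma/4}}{N_i}
\sim S_0^{-\gamma/4},
\qquad \beta=\frac{\gamma}{4} .\\[4pt]
\text{GB:}\quad
&\gamma = 0.93\pm0.08
\;\Rightarrow\;
\beta = \frac{0.93}{4}\pm\frac{0.08}{4}\approx 0.23\pm0.02 .
\end{align*}
```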

III. DISCUSSION

Our results suggest the existence of scale-invariant growth mechanisms acting at different geographical scales. Furthermore, Eq. (3) is similar to what is found for the growth of firms and other macroeconomic indicators [16, 20]. Thus, our results support the existence of an underlying link between the fluctuation dynamics of population growth and various economic indicators, implying considerable unevenness in economic development across different population sizes. City growth is driven by many processes, of which population growth and migration is only one. Our study captures only the growth of population but not economic growth per se. Many cities grow economically while losing population, and thus the processes we imply are those that influence a changing population. Our assumption is that population change is an indicator of city growth or decline, and therefore we have based our studies on population clustering techniques. Alternatively, the MSAs provide a set of rules that try to capture the idea of the city as a functional economic region.

The results we obtain show scale-invariant properties which we have modelled using long-

range spatial correlations between the population of cells. That is, strong development in

an area attracts more development in its neighborhood and much beyond. A key ﬁnding is

that small places exhibit larger ﬂuctuations than large places. The implications for locating

activity in diﬀerent places are that there is a greater probability of larger growth in small

places, but also a greater probability of larger decline. Opportunity must be weighed against

the risk of failure.

One may take these ideas to a higher level of abstraction to study cell-to-cell flows (migration, commuting, etc.) gridded at different levels. As a consequence one may define population clusters, or MSAs, in terms of functional linkages between neighboring cells. In addition one may relax some conditions imposed in the CCA. Here we consider a cell to be part of a cluster only if its population is strictly greater than 0. In SI Section V we relax this condition and study the robustness of the CCA when only cells with a population above a higher threshold than 0 (for instance, 5 or 20) are allowed into clusters. We find that, even though small clusters present a slight deviation, the overall behavior of the growth rate and standard deviation is conserved.

IV. MATERIALS AND METHODS

A. Information on the datasets

The datasets analyzed in this paper were obtained from the websites http://census.ac.uk, http://www.esri.com/, and http://na.unep.net/datasets/datalist.php, for GB, USA and Africa, respectively, and can be downloaded from http://lev.ccny.cuny.edu/~hmakse/cities/city data.zip.

The datasets consist of a list of populations at speciﬁc coordinates at two time steps t0

and t1. A graphical representation of the data can be seen in Fig. 1C for GB where each

point represents a data point directly extracted from the dataset.

To perform the CCA at diﬀerent scales we coarse-grain the datasets. For this purpose,

we overlay a grid on the corresponding map (USA, GB, or Africa) with the desired cell size

(for example, 2km-by-2km or 4km-by-4km for the USA). Then, the population of each cell

is calculated as the sum of the populations of points (obtained from the original dataset)

that fall into this cell.
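The coarse-graining step described above amounts to binning points into square cells and summing their populations. The sketch below assumes, for illustration only, that each data point is given as an `(x, y, population)` tuple in projected planar coordinates with the same length unit as `cell_size`; the function name and the dictionary output are hypothetical choices.

```python
import math
from collections import defaultdict


def coarse_grain(points, cell_size):
    """Sum point populations into square cells of side cell_size.

    points: iterable of (x, y, population) tuples.
    Returns a dict mapping the integer cell index (column, row) to the total
    population of all points that fall into that cell.
    """
    cells = defaultdict(float)
    for x, y, population in points:
        # floor() keeps the binning consistent for negative coordinates too.
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[key] += population
    return dict(cells)


# Three points binned into 2-by-2 cells (units are arbitrary here):
grid = coarse_grain([(0.5, 0.5, 10), (1.5, 0.2, 5), (2.1, 0.0, 7)], 2.0)
```

The resulting sparse dictionary can then be converted to a dense grid and passed to a clustering routine at the chosen scale.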

Table I shows information on the datasets and results on USA, GB and Africa for the

cell size used in the main text as well as some of the exponents obtained in our analysis.

TABLE I: Characteristics of datasets and summary of results

Data    Number of cells  t0    t1    Average growth rate  Cell size       Number of clusters  α            β
USA     1.86 million     1990  2000  0.9%                 2km-by-2km      30,210              0.28 ± 0.08  0.20 ± 0.06
GB      0.10 million     1981  1991  0.3%                 2.2km-by-2.2km  10,178              0.17 ± 0.05  0.27 ± 0.04
Africa  2,216            1960  1990  4%                   77km-by-77km    3,988               0.21 ± 0.05  0.19 ± 0.04

B. Calculation of $\langle r(S_0)\rangle$ and $\sigma(S_0)$ and methodology

The average growth rate, $\langle r(S_0)\rangle = \langle \ln(S_1/S_0)\rangle$, and the standard deviation, $\sigma(S_0) = \sqrt{\langle r(S_0)^2\rangle - \langle r(S_0)\rangle^2}$, are defined as follows. If we call $P(r|S_0)$ the conditional probability distribution of finding a cluster with growth rate $r(S_0)$ given an initial population $S_0$, then we can obtain $\langle r(S_0)\rangle$ and $\sigma(S_0)$ through

$$\langle r(S_0)\rangle = \int r\, P(r|S_0)\, dr, \tag{7}$$

and

$$\langle r(S_0)^2\rangle = \int r^2 P(r|S_0)\, dr. \tag{8}$$

Once $r(S_0)$ and $\sigma(S_0)$ are calculated for each cluster, we perform a nonparametric regression analysis [13, 14], a technique broadly used in the literature of population dynamics. The idea is to provide an estimate for the relationship between the growth rate and $S_0$ and between the standard deviation and $S_0$. Following the methods explained in Ref. [14], we apply the Nadaraya-Watson method to calculate an estimate for the growth rate, $\hat r(S_0)$, with

$$\langle \hat r(S_0)\rangle = \frac{\sum_{i=0}^{\text{all clusters}} K_h(S_0 - S_i(t_0))\, r_i(S_0)}{\sum_{i=0}^{\text{all clusters}} K_h(S_0 - S_i(t_0))}, \tag{9}$$

and an estimate for the standard deviation, $\hat\sigma(S_0)$, with

$$\hat\sigma(S_0) = \sqrt{\frac{\sum_{i=0}^{\text{all clusters}} K_h(S_0 - S_i(t_0))\, \left(r_i(S_0) - \langle \hat r(S_0)\rangle\right)^2}{\sum_{i=0}^{\text{all clusters}} K_h(S_0 - S_i(t_0))}}, \tag{10}$$

where $S_i(t_0)$ is the population of cluster $i$ at time $t_0$ (as defined in the main text), $r_i(S_0)$ is the growth rate of cluster $i$, and $K_h(S_0 - S_i(t_0))$ is a Gaussian kernel of the form

$$K_h(S_0 - S_i(t_0)) = \exp\!\left[-\frac{(\ln S_0 - \ln S_i(t_0))^2}{2h^2}\right], \qquad h = 0.5. \tag{11}$$
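The kernel estimates of Eqs. (9)-(11) can be sketched in a few lines of plain Python. The function names are hypothetical, and the kernel is written with the negative exponent of a standard Gaussian in $\ln S$; this is an illustrative sketch under those assumptions, not the authors' implementation.

```python
import math


def gaussian_kernel(s0, si, h=0.5):
    """Gaussian kernel in log-population with bandwidth h, as in Eq. (11)."""
    u = math.log(s0) - math.log(si)
    return math.exp(-u * u / (2.0 * h * h))


def nadaraya_watson(s0, populations, rates, h=0.5):
    """Kernel-weighted mean and standard deviation of growth rates at s0.

    populations: initial cluster populations S_i(t0), all positive.
    rates:       growth rates r_i of the same clusters.
    Returns (mean_hat, sigma_hat) following Eqs. (9) and (10).
    """
    weights = [gaussian_kernel(s0, si, h) for si in populations]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("s0 is too far from all data points for bandwidth h")
    mean_hat = sum(w * r for w, r in zip(weights, rates)) / total
    var_hat = sum(w * (r - mean_hat) ** 2
                  for w, r in zip(weights, rates)) / total
    return mean_hat, math.sqrt(var_hat)
```

Evaluating `nadaraya_watson` on a logarithmically spaced set of `s0` values gives smooth curves for the mean and standard deviation as functions of the initial population.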

Finally, we compute the 95% confidence bands (calculated from 500 random samples with replacement) to estimate the amount of statistical error in our results [13]. The bootstrapping technique was applied by sampling as many data points as the number of clusters and performing the nonparametric regression on the sampled data. By performing 500 realizations of the bootstrapping algorithm and extracting the so-called $\alpha/2$ quantile ($\alpha$ here is not related to the growth rate exponent), we obtain the 95% confidence bands.
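A minimal sketch of the bootstrap step is given below. It assumes a pointwise percentile band (taking the $\alpha/2$ and $1 - \alpha/2$ empirical quantiles of the resampled estimates), which is one common reading of the description above; the exact quantile rule, the function names, and the fixed random seed are assumptions made here for illustration.

```python
import math
import random


def nw_mean(s0, populations, rates, h=0.5):
    """Nadaraya-Watson mean growth rate at s0 with a Gaussian kernel in ln S."""
    weights = [math.exp(-(math.log(s0) - math.log(si)) ** 2 / (2.0 * h * h))
               for si in populations]
    return sum(w * r for w, r in zip(weights, rates)) / sum(weights)


def bootstrap_band(s0, populations, rates, n_boot=500, level=0.95,
                   h=0.5, seed=0):
    """Pointwise percentile confidence band for the mean growth rate at s0.

    Each realization resamples as many clusters as in the data, with
    replacement, and recomputes the kernel estimate on the resampled data.
    Returns (lower, upper).
    """
    rng = random.Random(seed)
    n = len(populations)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(nw_mean(s0,
                                 [populations[i] for i in idx],
                                 [rates[i] for i in idx], h))
    estimates.sort()
    tail = (1.0 - level) / 2.0  # alpha/2 for a 95% band is 0.025
    lower = estimates[math.floor(tail * (n_boot - 1))]
    upper = estimates[math.ceil((1.0 - tail) * (n_boot - 1))]
    return lower, upper
```

Repeating the call over a grid of `s0` values traces out the lower and upper bands around the regression curve.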

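To obtain the exponents $\alpha$ and $\beta$ of the power-law scalings for $\langle r(S_0)\rangle$ and $\sigma(S_0)$, respectively, we perform an OLS regression analysis [15]. More specifically, to obtain the exponent $\beta$ from Eq. (3), we first linearize the data by considering the logarithm of the independent and dependent variables so that Eq. (3) becomes $\ln \sigma(S_0) \sim -\beta \ln S_0$. Then, we apply a linear Ordinary Least Squares regression; because Eq. (3) is written with $S_0^{-\beta}$, the exponent is minus the fitted slope:

$$\beta = -\,\frac{N_c \sum_{i=1}^{N_c} \left[\ln S_i(t_0)\, \ln \sigma(S_i(t_0))\right] - \sum_{i=1}^{N_c} \ln S_i(t_0) \sum_{i=1}^{N_c} \ln \sigma(S_i(t_0))}{N_c \sum_{i=1}^{N_c} \left(\ln S_i(t_0)\right)^2 - \left(\sum_{i=1}^{N_c} \ln S_i(t_0)\right)^2}, \tag{12}$$

where $N_c$ is the number of clusters found using the CCA. Analogously, we obtain the exponent $\alpha$ by linearizing $\langle |r(S_0)| \rangle$ and calculating

$$\alpha = -\,\frac{N_c \sum_{i=1}^{N_c} \left[\ln S_i(t_0)\, \ln \langle |r(S_i(t_0))| \rangle\right] - \sum_{i=1}^{N_c} \ln S_i(t_0) \sum_{i=1}^{N_c} \ln \langle |r(S_i(t_0))| \rangle}{N_c \sum_{i=1}^{N_c} \left(\ln S_i(t_0)\right)^2 - \left(\sum_{i=1}^{N_c} \ln S_i(t_0)\right)^2}. \tag{13}$$

Next we compute the 95% confidence interval for the exponents $\alpha$ and $\beta$. For this we follow the book of Montgomery and Peck [15]. The half-width of the 95% confidence interval for $\beta$ is given by

$$t_{0.025,\,N_c-2} \cdot se, \tag{14}$$

where $t_{\alpha'/2,\,N_c-2}$ is the upper $\alpha'/2$ percentage point of the t-distribution with $N_c - 2$ degrees of freedom (here $\alpha' = 0.05$) and $se$ is the standard error of the exponent, defined as

$$se = \sqrt{\frac{SS_E}{(N_c - 2)\, S_{xx}}}, \tag{15}$$

where $SS_E$ is the residual sum of squares and $S_{xx}$ is the sum of squared deviations of the regressor ($\ln S_0$) from its mean.

Finally, we express the value of the exponent in terms of the 95% confidence intervals as

$$\beta \pm t_{0.025,\,N_c-2} \cdot se. \tag{16}$$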
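The OLS fit and its confidence interval can be sketched as below. Since the power laws are written with a negative exponent, the sketch returns minus the fitted log-log slope. The function name is hypothetical, and the default `t_quantile=1.96` is the large-sample normal approximation to $t_{0.025,\,N_c-2}$, an assumption made here to keep the example dependency-free; for few data points the exact t quantile should be supplied instead.

```python
import math


def fit_power_law_exponent(populations, values, t_quantile=1.96):
    """Fit values ~ populations**(-exponent) by OLS on the logarithms.

    populations: initial populations S_i(t0), all positive.
    values:      positive quantities to fit, e.g. sigma(S_i) or <|r(S_i)|>.
    Returns (exponent, half_width), where half_width = t_quantile * se and
    se = sqrt(SSE / ((n - 2) * Sxx)) is the standard error of the slope.
    """
    xs = [math.log(s) for s in populations]
    ys = [math.log(v) for v in values]
    n = len(xs)
    if n < 3:
        raise ValueError("need at least 3 points to estimate the error")
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    # Residual sum of squares and sum of squared deviations of ln S.
    sse = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    s_xx = sum_xx - sum_x ** 2 / n
    se = math.sqrt(sse / ((n - 2) * s_xx))
    return -slope, t_quantile * se
```

For data that follow an exact power law the half-width collapses to numerical noise; scatter around the line widens it in proportion to the residuals.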

Acknowledgments

We thank L.H. Dobkins and J. Eeckhout for providing data on MSA and C. Briscoe

for help with the manuscript. This work is supported by the National Science Foundation

through grant NSF-HSD. J.S.A. thanks the Brazilian agencies CNPq, CAPES, FUNCAP

and FINEP for ﬁnancial support.

[1] Gabaix X (1999) Zipf’s law for cities: an explanation. Quart. J. Econ. 114: 739-767.

[2] Gabaix X, Ioannides Y M (2003) The evolution of city size distributions, in Handbook of Urban

and Regional Economics, Vol.4, eds Henderson J V, Thisse J F (Elsevier Science, Amsterdam),

pp 2341-2378.

[3] Unwin D J (1996) GIS, spatial analysis and spatial statistics. Progress in Human Geography

20:540-551.

[4] King G, Rosen O, Tanner M A (2004) eds, Ecological Inference: New Methodological Strategies

(Cambridge University Press, New York).

[5] Eeckhout J (2004) Gibrat’s law for (all) cities. Amer. Econ. Rev. 94: 1429-1451.

[6] Dobkins L H, Ioannides Y M (2000) Spatial interactions among U.S. cities: 1900-1990. Reg.

Sci. Urban Econ. 31: 701-731.

[7] Ioannides Y M, Overman H G (2003) Zipf ’s law for cities: an empirical examination Reg. Sci.

Urban Econ. 33: 127-137.

[8] The 1981 and 1991 population census, Crown Copyright, ESRC purchase,

http://census.ac.uk/

[9] ESRI Inc (2000) ArcView 3.2 data sets: North America, Environmental Systems Research

Institute, Redlands, CA.

[10] UNESCO (1987) through GRID, http://na.unep.net/datasets/datalist.php

[11] Stauﬀer D (1984) Introduction to percolation theory (Taylor & Francis, London).

[12] Eaton J, Eckstein Z (1997) Cities and growth: theory and evidence from France and Japan.

Reg. Sci. Urban Econ. 27: 443-474.

[13] Härdle W (1990) Applied Nonparametric Regression (Cambridge University Press, Cambridge).

[14] Silverman, B W (1986) Density Estimation for Statistics and Data Analysis (Chapman and

Hall, New York).

[15] Montgomery D C, Peck E A (1992) Introduction to linear regression analysis (John Wiley & Sons, Inc.).

[16] Stanley M H R et al. (1996) Scaling behavior in the growth of companies. Nature 379: 804-806.

[17] Makse H A, Havlin S, Stanley H E (1995) Modelling urban growth patterns. Nature 377:

608-612.

[18] Barabási A-L, Albert R (1999) Emergence of scaling in random networks. Science 286: 509-512.

[19] Carlson J M, Doyle J (2002) Complexity and robustness. PNAS, 99:2538-2545.

[20] Rossi-Hansberg E, Wright M L J (2007) Establishment size dynamics in the aggregate economy. Amer. Econ. Rev. 97: 1639-1666.

FIG. 1: (A) Sketch illustrating the CCA applied to a sample of gridded population data. In the top left panel, cells are colored in blue if they are populated ($n^{(i)}_j(t) > 0$); otherwise, if $n^{(i)}_j(t) = 0$, they are in white. In the top right panel we initialize the CCA by selecting a populated cell and burning it (red cell). Then, we burn the populated neighbors of the red cell as shown in the lower left panel. We keep growing the cluster by iteratively burning neighbors of the red cells until all neighboring cells are unpopulated, as shown in the lower right panel. Next, we pick another unburned populated cell and repeat the algorithm until all populated cells are assigned to a cluster. The population $S_i(t)$ of cluster $i$ at time $t$ is then $S_i(t) = \sum_{j=1}^{N_i} n^{(i)}_j(t)$. (B) Cluster identified with the CCA in the London area (red) overlaying a corresponding satellite image (extracted from maps.google.com).

The greenery corresponds to vegetation, and thus approximately indicates unoccupied areas. For

example, Richmond Park can be found as a vegetation area in the south-west. The areas in the

east along the Thames River correspond mainly to industrial districts and in the west the London

Heathrow Airport, also not populated. The yellow line in the center represents the administrative

boundary of the City of London, demonstrating the diﬀerence with the urban cluster found with the

CCA. The pink clusters surrounding the major red cluster are smaller conglomerates not connected

to London. The ﬁgure shows that an analysis based on the City of London captures only a partial

area of the real urban agglomeration. (C) Result of the CCA applied to all of GB showing the

large variability in the population distribution. The color bar (in logarithmic scale) indicates the

population of each urban cluster.
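The burning procedure described in the caption of Fig. 1A can be sketched as a breadth-first flood fill. This is a minimal illustration, not the authors' implementation: the function name `city_clustering`, the list-of-lists grid, and the choice of 4-connected (edge-sharing) neighbors are assumptions made here. The optional `threshold` argument is also an assumption; it mirrors the variation discussed in SI Section IX, and the default of 0 gives the rule stated in the caption.

```python
from collections import deque

def city_clustering(grid, threshold=0):
    """Group populated cells into clusters by iteratively 'burning' neighbors.

    grid: list of rows, each a list of cell populations.
    threshold: a cell counts as populated if its population > threshold.
    Returns (labels, populations): labels[r][c] is the cluster index of a
    cell (-1 if unpopulated) and populations[i] is the summed population S_i.
    """
    rows, cols = len(grid), len(grid[0])
    labels = [[-1] * cols for _ in range(rows)]
    populations = []
    # Assumption: neighbors are the four edge-sharing cells.
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    for r0 in range(rows):
        for c0 in range(cols):
            if grid[r0][c0] <= threshold or labels[r0][c0] != -1:
                continue  # unpopulated, or already burned
            cluster = len(populations)
            labels[r0][c0] = cluster
            total = 0
            queue = deque([(r0, c0)])
            while queue:
                r, c = queue.popleft()
                total += grid[r][c]
                for dr, dc in steps:
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < rows and 0 <= cc < cols
                            and grid[rr][cc] > threshold
                            and labels[rr][cc] == -1):
                        labels[rr][cc] = cluster  # burn the neighbor
                        queue.append((rr, cc))
            populations.append(total)
    return labels, populations

# Toy 4x4 population grid (made-up numbers).
example = [
    [5, 3, 0, 0],
    [0, 2, 0, 7],
    [0, 0, 0, 1],
    [4, 0, 0, 0],
]
labels, populations = city_clustering(example)
print(populations)  # [10, 8, 4]
```

Clusters are numbered in the order in which their first cell is met in a row-by-row scan, so the toy grid yields three clusters with populations 10, 8 and 4.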

FIG. 2: Illustration of possible changes in cluster shapes. In each case we show how the growth rate is computed. In the first case, there is no areal modification in the cluster between $t_0$ and $t_1$. In the second, the cluster expands. In the third, the cluster reduces its area. In the fourth, one cluster divides into two, and therefore we consider the population at $t_1$ to be $S_1 = S'_1 + S''_1$. In the fifth case, two clusters merge to form one at $t_1$. In this case we consider the population at $t_0$ to be $S_0 = S'_0 + S''_0$.
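The five cases in Fig. 2 can all be handled by one small function if the populations of the pieces at each time are summed before the logarithm is taken. The sketch below assumes the growth rate is $r = \ln(S_1/S_0)$, consistent with $R = e^r$ used in the SI; the function name `growth_rate` and the list-based interface are hypothetical.

```python
import math

def growth_rate(s0_parts, s1_parts):
    """Growth rate r = ln(S1 / S0) of one cluster between t0 and t1.

    s0_parts: populations at t0 of the cluster(s) that make up the unit
              (more than one entry if clusters merge by t1).
    s1_parts: populations at t1 (more than one entry if the cluster divides).
    """
    s0 = sum(s0_parts)
    s1 = sum(s1_parts)
    if s0 <= 0 or s1 <= 0:
        raise ValueError("populations must be positive to take the logarithm")
    return math.log(s1 / s0)

# Made-up populations for three of the cases.
r_expansion = growth_rate([1000], [1100])      # no split, no merge
r_division = growth_rate([1000], [600, 500])   # S1 = S1' + S1''
r_merge = growth_rate([400, 600], [1000])      # S0 = S0' + S0''
print(r_expansion, r_division, r_merge)
```

Summing the pieces first means that a division or a merge is treated exactly like an ordinary change in the population of a single cluster.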

FIG. 3: Results for the USA using a cell size of 2000m-by-2000m. (A) Mean annual growth rate for population clusters in the USA versus initial population of the clusters. The straight dashed line shows a power-law fit with $\alpha_{\mathrm{USA}} = 0.28 \pm 0.08$ as determined using OLS regression. (B) Standard deviation of the growth rate for the USA. The straight dashed line corresponds to a power-law fit using OLS regression with $\beta_{\mathrm{USA}} = 0.20 \pm 0.06$.

FIG. 4: Results for Great Britain using a cell size of 2.2km-by-2.2km. (A) Mean annual growth rate of population clusters in Great Britain versus the initial cluster population. The inset shows a double logarithmic plot of the growth rate in the intermediate range of populations, $3000 < S_0 < 3 \times 10^5$. A power-law fit using OLS leads to an exponent $\alpha_{\mathrm{GB}} = 0.17 \pm 0.05$ for this range. (B) Double logarithmic plot of the standard deviation of the annual growth rates of population clusters in Great Britain versus the initial cluster population. The straight line corresponds to a power-law fit using OLS with an exponent $\beta_{\mathrm{GB}} = 0.27 \pm 0.04$, according to Eq. (3). (C) Scaling of the standard deviation in cluster population obtained from the randomized surrogate dataset of GB by randomly swapping the cells. The data show an exponent $\beta_{\mathrm{rand}} = 1/2$ in the tail. The deviations for small $S_0$ are discussed in SI Section VIII, where we test these results by generating random populations. (D) Long-range spatial correlations in the population growth of cells for GB according to Eq. (6). The straight line corresponds to an exponent $\gamma = 0.93 \pm 0.08$.

FIG. 5: Results for Africa using a cell size of 77km-by-77km. (A) Mean growth rate of clusters in Africa versus the initial size of population $S_0$. The straight dashed line shows a power-law fit with exponent $\alpha_{\mathrm{Af}} = 0.21 \pm 0.05$, obtained using OLS regression. (B) Standard deviation of the growth rate in Africa. The straight line corresponds to a power-law fit using OLS providing the exponent $\beta_{\mathrm{Af}} = 0.19 \pm 0.04$.
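The captions of Figs. 3 to 5 report exponents obtained from OLS power-law fits. One common way to carry out such a fit, sketched here for the scaling $\sigma(S_0) \sim S_0^{-\beta}$, is a straight-line regression in log-log space. The function name `fit_power_law` and the synthetic data are assumptions for illustration; the sketch does not reproduce the binning or the error bars used in the paper.

```python
import numpy as np

def fit_power_law(s0, sigma):
    """OLS fit of sigma = A * s0**(-beta) in log-log space.

    Returns (beta, A). Inputs must be positive.
    """
    x = np.log(np.asarray(s0, dtype=float))
    y = np.log(np.asarray(sigma, dtype=float))
    slope, intercept = np.polyfit(x, y, 1)  # y = slope * x + intercept
    return -slope, np.exp(intercept)

# Synthetic, noise-free data with a known exponent (not data from the paper).
s0 = np.array([1e3, 1e4, 1e5, 1e6])
sigma = 2.0 * s0 ** -0.2
beta, amplitude = fit_power_law(s0, sigma)
print(round(beta, 3), round(amplitude, 3))
```

On noise-free input the fit recovers the exponent and amplitude used to generate the data, which is a quick sanity check before applying it to measured values.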

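[Figure 1: panels A, B and C; image not reproduced.]

[Figure 2: schematic of a cluster at $t_0$ and $t_1$ for the cases No Change, Expansion, Reduction, Division and Merge. The growth rate is $r(S_0) = \ln(S_1/S_0)$, with $S_1 = S'_1 + S''_1$ for a division and $S_0 = S'_0 + S''_0$ for a merge.]

[Figure 3: panels A and B; image not reproduced.]

[Figure 4: panels A, B, C and D; image not reproduced.]

[Figure 5: panels A and B; image not reproduced.]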

SUPPORTING INFORMATION

Laws of Population Growth

Hernán D. Rozenfeld, Diego Rybski, José S. Andrade Jr., Michael Batty, H. Eugene Stanley, and Hernán A. Makse

As supplementary materials we provide the following. In Section V we present tables with details on our results using the CCA and results presented in previous papers, to allow for comparison between the different approaches. In Section VI we study the stability of the scaling found in the text under a change of scale in the cell size. In Section VII we detail the calculations that relate the spatial correlations in population growth to $\sigma(S_0)$, namely the relation $\beta = \gamma/4$. In Section VIII we describe the random surrogate dataset used to further test our results. In Section IX we further test the robustness of the CCA by proposing a small variation in the algorithm.

V. CLUSTERS AT DIFFERENT SCALES AND COMPARISON WITH

METROPOLITAN STATISTICAL AREAS

In this section, Tables S1 and S2 allow for a detailed comparison of urban clusters obtained

with the CCA applied to the USA in 1990, and the populations of MSA from US Census

Bureau used in previous studies of population growth [5, 6, 7].

We can see that the MSA presented by Eeckhout (2004) typically correspond to our

clusters using cell sizes of 4km and 8km. For example, for the New York City region

Eeckhout’s data are well approximated by a cell size of 4km, but Los Angeles is better

approximated when using a cell size of 8km. On the other hand Dobkins-Ioannides (2000)

data are better described by cell sizes of 2km or 4km. For instance, Chicago is well described

by a cell size of 4km and Los Angeles is better described by a cell size of 2km.

An interesting remark is that the population of Los Angeles when using cell sizes of 2km,

4km and 8km does not vary as much as that for New York. This could be caused by the

fact that major cities in the northeast of USA are closer to each other than large cities in

the southwest, which may be attributed to land or geographical constraints.

It is important to relate the results of Table S2 to the ecological fallacy. As the cell size is increased, the population of a cluster also increases, as expected, because the cluster now covers a larger area. This is not a direct manifestation of an ecological fallacy, which would appear if the statistical results (growth rate vs. $S$ or standard deviation vs. $S$) changed as the cell size increased. In Figs. 6 and 7 (SI Section VI), we observe that the growth rate and standard deviation for the USA and GB follow the same form, except for the growth rate in the USA, for which different cell sizes show deviations from each other. The latter may be indicative of an ecological fallacy. In this case, it is not obvious which cell size is the "correct" one. We consider this point (the possibility of choosing the cell size) to be a feature of the CCA, since one may pick the cell size appropriate to the specific problem under study.

Table S1: Top 10 largest MSAs of the USA in 1990 from previous analyses of population growth.

| Rank | MSA (Dobkins-Ioannides) | Population | MSA (Eeckhout) | Population |
|---|---|---|---|---|
| 1 | NYC NY206 | 9,372,000 | NYC-North NJ-Long Is., NY-NJ-CT-PA | 19,549,649 |
| 2 | Los Angeles CA172 | 8,863,000 | Los Angeles-Riverside-Orange County, CA | 14,531,529 |
| 3 | Chicago IL59 | 7,333,000 | Chicago-Gary-Kenosha, IL-IN-WI | 8,239,820 |
| 4 | Philadelphia PA228 | 4,857,000 | Washington-Baltimore, DC-MD-VA-WV | 6,727,050 |
| 5 | Detroit MI80 | 4,382,000 | San Francisco-Oakland-San Jose, CA | 6,253,311 |
| 6 | Washington DC312 | 3,924,000 | Philadelphia-Wilmington-Atlantic City, PA-NJ-DE-MD | 5,892,937 |
| 7 | San Francisco CA266 | 3,687,000 | Boston-Worcester-Lawrence, MA-NH-ME-CT | 5,455,403 |
| 8 | Houston TX129 | 3,494,000 | Detroit-Ann Arbor-Flint, MI | 5,187,171 |
| 9 | Atlanta GA19 | 2,834,000 | Dallas-Fort Worth, TX | 4,037,282 |
| 10 | Boston MA39 | 2,800,000 | Houston-Galveston-Brazoria, TX | 3,731,131 |

Table S2: Top 10 largest clusters of the USA in 1990 from our analysis for different cell sizes. The city names are the major cities that belong to the clusters and were picked to show the areal extension of the cluster. Cities within one cluster are separated by " / ".

| Rank | Cluster (cell = 1km) | Population | Cluster (cell = 2km) | Population | Cluster (cell = 4km) | Population | Cluster (cell = 8km) | Population |
|---|---|---|---|---|---|---|---|---|
| 1 | NYC | 7,012,989 | NYC-Long Is. / Newark / Jersey City | 12,511,237 | NYC-Long Is. / N. NJ-Newark / Jersey City | 17,064,816 | NYC-Long Is. / North NJ / Philadelphia / D.C.-Boston | 41,817,858 |
| 2 | Chicago | 2,312,783 | Los Angeles / Long Beach | 9,582,507 | Los Angeles / Long Beach / Pomona | 10,878,034 | Los Angeles / San Clemente / Riverside | 13,304,233 |
| 3 | Los Angeles | 1,411,791 | Chicago / Rockford | 4,836,529 | Chicago / Gary / Rockford | 7,230,404 | Chicago / Gary / Rockford / Milwaukee | 9,288,345 |
| 4 | Philadelphia | 1,282,834 | Philadelphia / Wilmington | 3,151,704 | Washington / Baltimore / Springfield | 5,316,890 | San Francisco / Santa Cruz / Brentwood | 5,736,479 |
| 5 | Boston | 759,024 | Detroit | 2,906,453 | Philadelphia / Trenton / Wilmington | 4,935,734 | Detroit / Ann Arbor / Monroe / Sarnia | 4,442,723 |
| 6 | Newark | 581,048 | San Francisco / San Jose | 2,601,639 | San Francisco / San Jose / Concord | 4,766,960 | Miami / Port St. Lucie | 4,000,432 |
| 7 | San Francisco | 507,300 | Washington / Alexandria / Bethesda | 2,059,421 | Detroit / Waterford / Canton | 3,722,778 | Dallas / Fort Worth | 3,536,186 |
| 8 | Washington | 504,068 | Phoenix | 1,556,077 | Miami / W. Palm Beach | 3,719,773 | Houston | 3,425,647 |
| 9 | Jersey City | 438,591 | Boston / Lowell / Quincy | 1,498,208 | Dallas / Fort Worth | 3,134,233 | Cleveland / Canton | 3,233,341 |
| 10 | Baltimore | 437,413 | Miami | 1,465,490 | Boston / Brockton / Nashua | 3,064,925 | Pittsburgh / Youngstown / Morgantown | 3,214,661 |

FIG. 6: Sensitivity of the results under coarse-graining of the data for GB. (A) Average growth rate and (B) standard deviation for GB using the clustering algorithm for different cell sizes. The dashed line represents the OLS regression estimate for the exponents (A) $\alpha_{\mathrm{GB}} = 0.17$ and (B) $\beta_{\mathrm{GB}} = 0.27$ obtained in the main text. For clarity we do not show the confidence bands.

VI. SCALING UNDER COARSE-GRAINING

In this section we test the sensitivity of our results to a coarse-graining of the data. We analyze the average growth rate $\langle r(S_0) \rangle$ and the standard deviation $\sigma(S_0)$ for GB and the USA by coarse-graining the data sets at different levels.

In Fig. 6A we observe that although the results are not identical for all coarse-grainings, they are statistically similar, showing a slight decay in the growth rate. Moreover, we see that cities of size $S_0 \approx 10^3$ and $S_0 \approx 10^6$ still exhibit a tendency to have negative growth rates for all levels of coarse-graining, as explained in the main text. In the case of the USA (Fig. 7A) there is a crossover to a flat behavior at a cell size of 8000m, although at this scale all of the northeast USA becomes one large cluster of 41 million inhabitants. On the other hand, Figs. 6B and 7B show that the scaling of Eq. (3) in the main text, $\sigma(S_0) \sim S_0^{-\beta}$, still holds when using the coarse-grained datasets for both GB and the USA.
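One simple way to coarse-grain a gridded population dataset is to sum the populations of adjacent blocks of cells, which doubles (or triples, etc.) the cell side while conserving the total population. The sketch below illustrates that idea; the text does not specify how the coarse-graining was implemented, so the block-sum approach, the function name `coarse_grain` and the requirement that the grid dimensions be multiples of the factor are all assumptions.

```python
import numpy as np

def coarse_grain(grid, factor):
    """Merge factor-by-factor blocks of cells by summing their populations."""
    grid = np.asarray(grid)
    rows, cols = grid.shape
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    # Split each axis into (block index, position inside block), then sum
    # over the two 'inside block' axes.
    blocks = grid.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.sum(axis=(1, 3))

fine = np.arange(16).reshape(4, 4)   # toy 4x4 grid with populations 0..15
coarse = coarse_grain(fine, 2)       # 2x2 grid of block sums
print(coarse)
# [[10 18]
#  [42 50]]
```

Running a clustering step on `fine` and on `coarse` then gives the kind of cell-size comparison discussed in this section.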

FIG. 7: Study of results under coarse-graining of the data for the USA. (A) Average growth rate and (B) standard deviation for the USA using the clustering algorithm for different cell sizes. The dashed line represents the OLS regression estimate for the exponents (A) $\alpha_{\mathrm{USA}} = 0.28$ and (B) $\beta_{\mathrm{USA}} = 0.20$ obtained in the main text. For clarity we do not show the confidence bands.

VII. CORRELATIONS

In this section we elaborate on the calculations leading to the relation between Gibrat's law and the spatial correlations in the cell population. We first show that when the population cells are randomly shuffled (destroying any spatial correlations between the growth rates of the cells), the standard deviation of the growth rate becomes $\sigma(S_0) \sim S_0^{-\beta_{\mathrm{rand}}}$, where $\beta_{\mathrm{rand}} = 1/2$ [16]. Then, we show that long-range spatial correlations in the population of the cells lead to the relation $\beta = \gamma/4$, as stated at the end of Section II in the main text.

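Assuming that the population growth rate is small ($r \ll 1$), we can write $R = e^r \approx 1 + r$. Replacing $R = 1 + r$ in Eq. (1) in the main text we obtain

$$S_1 = S_0 + S_0 r. \qquad (17)$$

We define the standard deviation of the populations $S_1$ as $\sigma_1$, which is a function of $S_0$:

$$\sigma_1(S_0) = \sqrt{\langle S_1^2 \rangle - \langle S_1 \rangle^2}. \qquad (18)$$

This quantity is easier to relate to the spatial correlations of the cells than the standard deviation $\sigma(S_0)$ of the growth rates $r$. Then, since $\langle S_1 \rangle = S_0 + S_0 \langle r \rangle$ and $\langle S_1^2 \rangle = S_0^2 + 2 S_0^2 \langle r \rangle + S_0^2 \langle r^2 \rangle$, we obtain

$$\sigma_1(S_0) \sim S_0\, \sigma(S_0), \qquad (19)$$

where $\sigma(S_0) = \sqrt{\langle r^2 \rangle - \langle r \rangle^2}$ as defined in the main text. Therefore, using Eq. (3) in the main text,

$$\sigma_1(S_0) \sim S_0^{1-\beta}. \qquad (20)$$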

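As stated in the main text, the total population of a cluster at time $t_0$ is the sum of the populations of each cell, $S_0 = \sum_{j=1}^{N_i} n_j^{(i)}$, where $N_i$ is the number of cells in cluster $i$. The population of a cluster at time $t_1$ can be written as

$$S_1 = S_0 + \sum_{j=1}^{N_i} \delta_j, \qquad (21)$$

where $\delta_j$ is the increment in the population of cell $j$ from time $t_0$ to $t_1$ (notice that $\delta_j$ can be negative). Therefore, the standard deviation $\sigma_1(S_0)$ is

$$\sigma_1(S_0)^2 = \sum_{j,k} \langle \delta_j \delta_k \rangle - \Big\langle \sum_{j} \delta_j \Big\rangle^2 = \sum_{j,k} \langle (\delta_j - \bar{\delta})(\delta_k - \bar{\delta}) \rangle. \qquad (22)$$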

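After the process of randomization explained in Section II of the main text, the correlations between the increments of population in each cell are destroyed. Thus,

$$\langle (\delta_j - \bar{\delta})(\delta_k - \bar{\delta}) \rangle = \Delta^2 \delta_{jk}, \qquad (23)$$

where $\Delta^2 = \overline{\delta^2} - \bar{\delta}^2$ and $\delta_{jk}$ is the Kronecker delta (not a population increment). Replacing in Eq. (22), and since $\langle n \rangle = (1/N_i) \sum_{j}^{N_i} n_j = S_0/N_i$, we obtain

$$\sigma_1(S_0)^2 = N_i \Delta^2 \sim S_0. \qquad (24)$$

Comparing with Eq. (20) we obtain $\beta_{\mathrm{rand}} = 1/2$ for this uncorrelated case.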

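Let us assume that the correlation of the population increments $\delta_j$ decays as a power law of the distance between cells, indicating long-range scale-free correlations. Thus, asymptotically,

$$\langle (\delta_j - \bar{\delta})(\delta_k - \bar{\delta}) \rangle \sim \frac{\Delta^2}{|\vec{x}_j - \vec{x}_k|^{\gamma}}, \qquad (25)$$

where $\vec{x}_j$ denotes the position of cell $j$ and $\gamma$ is the correlation exponent (for $|\vec{x}_j - \vec{x}_k| \to 0$, the correlations $\langle (\delta_j - \bar{\delta})(\delta_k - \bar{\delta}) \rangle$ tend to a constant). For large clusters, we can approximate the double sum in Eq. (22) by an integral. Then, assuming that the shape of the clusters can be approximated by disks of radius $r_c$, for $\gamma < 2$ we obtain

$$\sigma_1(S_0)^2 = \sum_{j,k} \frac{\Delta^2}{|\vec{x}_j - \vec{x}_k|^{\gamma}} \to \frac{\Delta^2 N_i}{a^2} \int^{r_c} \frac{r\, dr\, d\theta}{r^{\gamma}} \sim \frac{\Delta^2 N_i}{(2 - \gamma)\, a^2}\, r_c^{2-\gamma}, \qquad (26)$$

where $a^2$ is the area of each cell and $r_c$ the radius of the cluster. Since the cluster area scales as $r_c^2 \sim N_i a^2$, we finally obtain

$$\sigma_1(S_0)^2 \sim N_i^{2 - \gamma/2}. \qquad (27)$$

Using $S_0 = N_i \langle n \rangle$ and Eq. (20) we arrive at

$$\beta = \frac{\gamma}{4}. \qquad (28)$$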
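The last step can be spelled out by matching exponents. The lines below are a sketch under the assumptions already used in this section: disk-shaped clusters whose area scales as $r_c^2 \sim N_i a^2$, a cluster population proportional to the number of cells, and all constant prefactors dropped.

```latex
\begin{align*}
\sigma_1(S_0)^2 &\sim N_i \, r_c^{\,2-\gamma}
  && \text{(double sum replaced by an integral over a disk)} \\
 &\sim N_i \, N_i^{(2-\gamma)/2} = N_i^{\,2-\gamma/2}
  && \text{(using } r_c^2 \sim N_i a^2 \text{)} \\
 &\sim S_0^{\,2-\gamma/2}
  && \text{(using } S_0 = N_i \langle n \rangle \text{)} \\
\sigma_1(S_0)^2 &\sim S_0^{\,2-2\beta}
  && \text{(Eq. (20) squared)} \\
2 - 2\beta &= 2 - \gamma/2
  \quad\Longrightarrow\quad \beta = \gamma/4 .
\end{align*}
```

Setting $\gamma = 2$ in this relation gives $\beta = 1/2$, the uncorrelated value obtained in Eq. (24).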

Equation (28) shows that Gibrat's law is recovered when the correlation of the population increments is a constant, independent of the positions of the cells; that is, when the populations of all cells increase equally. In other words, if $\gamma = 0$, the standard deviation of the population growth rates has no dependence on the population size ($\beta = 0$), as stated by Gibrat's law. The random case is obtained for $\gamma = d$, where $d$ is the dimensionality of the substrate. In this case $d = 2$ and $\beta_{\mathrm{rand}} = 1/2$. For $\gamma > 2$, the correlations become irrelevant and we still find the uncorrelated case $\beta_{\mathrm{rand}} = 1/2$. For intermediate values $0 < \gamma < 2$ we obtain $0 < \beta = \gamma/4 < 1/2$.

VIII. RANDOM SURROGATE DATASET

In this section we elaborate on the randomization procedure used to understand the role of correlations in population growth.

Figure 4C in the main text shows the standard deviation $\sigma(S_0)$ when the population of each cluster is randomized, breaking any spatial correlation in population growth. For clusters with a large population, $\sigma(S_0)$ follows a power law with exponent $\beta_{\mathrm{rand}} = 1/2$, and for small $S_0$, $\sigma(S_0)$ presents deviations from the power-law function, as seen in Fig. 4C, with a smaller standard deviation than the prediction of the random case. This deviation is caused by the fact that the population of a cluster is bound to be positive: a cluster with a small population $S_0$ cannot decrease its population by a large number, since it would lead to negative values of $S_1$. This produces an upper bound on the fluctuations of the growth rate for small $S_0$ and results in smaller values of $\sigma(S_0)$ than expected (below the scaling with exponent $\beta_{\mathrm{rand}} = 1/2$).

To support this argument, we carry out simulations using the clusters of GB, where the population $n_j(t_0)$ of each cell $j$ is replaced with random numbers following an exponential distribution with probability $P(n_j) \sim e^{-n_j/n_0}$. The decay constant, $n_0 = 150$, is extracted from the data of GB to mimic the original distribution. This is done through a direct measure of $P(n_j)$ from the GB dataset and fitting the data using OLS regression analysis. We obtain the population $n_j(t_1) = n_j(t_0) + \delta_j$ of cell $j$ at time $t_1$ by picking random numbers for the population increments $\delta_j$ following a uniform distribution in the range $-150q < \delta_j < 150q$. Here $q$ determines the variance of the increments. Since the population cannot be negative, we impose the additional condition $n_j(t_1) \geq 0$. Figure 8 shows the results of the standard deviation $\sigma(S_0)$ for four different $q$-values for this uncorrelated model. We find that the tail of $\sigma(S_0)$ reproduces the uncorrelated exponent $\beta_{\mathrm{rand}} = 1/2$. For small $S_0$ we find that the standard deviation levels off to an approximately constant value, as in the surrogate data of Fig. 4C. The crossover from an approximately constant $\sigma(S_0)$ to a power law moves to smaller values of the population $S_0$ as the standard deviation of the $\delta_j$ becomes smaller (smaller value of $q$).

Such behavior can be understood since the condition $n_j^{(i)}(t_1) \geq 0$ imposes a lower "wall" on the random walk specified by $n_j^{(i)}(t_1) = n_j^{(i)}(t_0) + \delta_j$. As the initial population gets smaller, the walker "feels" the presence of the wall and the fluctuations decrease accordingly, thus explaining the deviations from the power law with exponent $\beta_{\mathrm{rand}} = 1/2$ for small population values. Therefore, as the value of $q$ decreases, the small-population plateau disappears, as observed in Fig. 8.
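A stripped-down version of this uncorrelated model can be simulated in a few lines. The sketch below is not the authors' code and makes several assumptions: clusters are taken to be groups of a fixed number of cells rather than the real GB clusters, the condition $n_j(t_1) \geq 0$ is imposed by clamping negative values to zero, and the function name `surrogate_sigma`, the number of realizations and the seed are arbitrary. It only checks the exponent of the large-$S_0$ tail, not the small-$S_0$ plateau.

```python
import numpy as np

def surrogate_sigma(n_cells, q, n0=150.0, n_real=2000, seed=0):
    """Mean S0 and std of r = ln(S1/S0) for clusters of n_cells random cells."""
    rng = np.random.default_rng(seed)
    # Cell populations at t0: exponential with decay constant n0.
    n_t0 = rng.exponential(n0, size=(n_real, n_cells))
    # Uncorrelated increments, uniform in (-q*n0, q*n0).
    delta = rng.uniform(-q * n0, q * n0, size=(n_real, n_cells))
    # Assumption: the wall n_j(t1) >= 0 is enforced by clamping at zero.
    n_t1 = np.maximum(n_t0 + delta, 0.0)
    s0 = n_t0.sum(axis=1)
    s1 = n_t1.sum(axis=1)
    keep = s1 > 0                      # guard against log(0)
    r = np.log(s1[keep] / s0[keep])
    return s0.mean(), r.std()

sizes = [16, 64, 256, 1024]            # number of cells per cluster
points = [surrogate_sigma(n, q=0.5) for n in sizes]
mean_s0 = np.array([p[0] for p in points])
sigma = np.array([p[1] for p in points])
slope, _ = np.polyfit(np.log(mean_s0), np.log(sigma), 1)
beta_tail = -slope                     # expected to be close to 1/2
print(round(beta_tail, 2))
```

Because the increments of different cells are independent, the variance of the cluster increment grows linearly with the number of cells, which is why the fitted exponent should come out near $1/2$ for these cluster sizes.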

IX. A VARIATION OF THE CCA

In this section we study a variation of the CCA. In the main text we stop growing a cluster when all boundary cells are unpopulated, that is, have population exactly 0. In other words, clusters are composed of cells with population strictly greater than 0. It is important to analyze whether this stopping criterion can be relaxed so that a cluster includes only cells with a population larger than a given threshold. In Fig. 9A and Fig. 9B we show the results for the population growth rate and standard deviation, respectively, in GB when the cell size is 2.2km-by-2.2km (as in the main text) but including only cells with a population strictly larger than 5 and 20.

Although for small population clusters we observe a slight variation in the growth rate and in the standard deviation, the results show that the thresholds do not influence the global statistics when compared to the plots in the main text.

FIG. 8: Standard deviation $\sigma(S_0)$ for the random data set as explained in SI Section VIII. The results for $\sigma(S_0)$ are rescaled to collapse the power-law tails with exponent $\beta_{\mathrm{rand}} = 1/2$ and to emphasize the deviations from this function for small values of $S_0$. The larger the parameter $q$, the larger the deviations from the power law at lower $S_0$. In other words, the crossover to the power-law tail appears at larger $S_0$ as $q$ increases.

FIG. 9: Sensitivity of the results under a change in the stopping criterion in the CCA. (A) Average growth rate for GB with a population threshold of 5 (green line) and 20 (black dashed line) and (B) standard deviation for GB with a population threshold of 5 (green line) and 20 (black dashed line). For clarity we do not show the confidence bands.

The angiogenic growth of cities

Article

Full-text available

Apr 2024
J R SOC INTERFACE

Describing the space–time evolution of urban population is a fundamental challenge in the science of cities, yet a complete theoretical treatment of the underlying dynamics is still missing. Here, we first reconstruct the evolution of London (UK) over 180 years and show that urban growth consists of an initial phase of diffusion-limited growth, followed by the development of the railway transport network and a consequential shift from central to suburban living. Such dynamics—which are analogous to angiogenesis in biological systems—can be described by a minimalist reaction–diffusion model coupled with economic constraints and an adaptive transport network. We then test the generality of our approach by reproducing the evolution of Sydney, Australia, from 1851 to 2011. We show that the rail system coevolves with urban population, displaying hierarchical characteristics that remain constant over time unless large-scale interventions are put in place to alter the modes of transport. These results demonstrate that transport schemes are first-order controls of long-term urbanization patterns and efforts aimed at creating more sustainable and healthier cities require careful consideration of population–transport feedbacks.

Feasibility of Urban–Rural Temperature Difference Method in Surface Urban Heat Island Analysis under Non-Uniform Rural Landcover: A Case Study in 34 Major Urban Agglomerations in China

Article

Full-text available

Mar 2024

The urban–rural temperature difference is widely used in measuring surface urban heat island intensity (SUHII), where the accurate determination of rural background is crucial. However, traditionally, the entire permeable rural surface has been selected to represent the background temperature, leaving uncertainty about the impact of non-uniform rural surfaces with multiple land covers on the accuracy of SUHII quantification. In this study, we proposed two quantifications of SUHII derived from the primary (SUHII1) and secondary (SUHII2) land types, respectively, which successively occupy over 40–50% of whole rural regions. The spatial integration and temporal variation of SUHII1 and SUHII2 were compared with the result from whole rural regions (SUHII) within 34 urban agglomerations (UAs) in China. The results showed that the SUHII1 and SUHII2 differed slightly with SUHII, and the correlation coefficients of SUHII and SUHII1/SUHII2 are generally above 0.9 in most (32) UAs. Regarding the long-term SUHII between 2003 and 2019, the three methods demonstrated similar seasonal patterns, although SUHII1 (or SUHII2) tended to overestimate or underestimate compared to SUHII. As for the multi-year integration at the regional scale, the day–night cycle and monthly variations of SUHII1 and SUHII were found to be identical for each geographical division separately, indicating that the spatiotemporal pattern revealed by SUHII is minimally affected by the diversity of rural landcover types. The findings confirmed the viability of the urban–rural LST difference method for measuring long-term regional SUHII patterns under non-uniform rural land cover types.

On the influence of density and morphology on the Urban Heat Island intensity

Thesis

Full-text available

Jan 2024

Yunfei Li

The urban heat island (UHI) effect, describing an elevated temperature of urban areas compared with their natural surroundings, can expose urban dwellers to additional heat stress, especially during hot summer days. A comprehensive understanding of the UHI dynamics along with urbanization is of great importance to efficient heat stress mitigation strategies towards sustainable urban development. This is, however, still challenging due to the difficulties of isolating the influences of various contributing factors that interact with each other. In this work, I present a systematical and quantitative analysis of how urban intrinsic properties (e.g., urban size, density, and morphology) influence UHI intensity. To this end, we innovatively combine urban growth modelling and urban climate simulation to separate the influence of urban intrinsic factors from that of background climate, so as to focus on the impact of urbanization on the UHI effect. The urban climate model can create a laboratory environment which makes it possible to conduct controlled experiments to separate the influences from different driving factors, while the urban growth model provides detailed 3D structures that can be then parameterized into different urban development scenarios tailored for these experiments. The novelty in the methodology and experiment design leads to the following achievements of our work. First, we develop a stochastic gravitational urban growth model that can generate 3D structures varying in size, morphology, compactness, and density gradient. We compare various characteristics, like fractal dimensions (box-counting, area-perimeter scaling, area-population scaling, etc.), and radial gradient profiles of land use share and population density, against those of real-world cities from empirical studies. The model shows the capability of creating 3D structures resembling real-world cities. 
This model can generate 3D structure samples for controlled experiments to assess the influence of some urban intrinsic properties in question. [Chapter 2] With the generated 3D structures, we run several series of simulations with urban structures varying in properties like size, density and morphology, under the same weather conditions. Analyzing how the 2m air temperature based canopy layer urban heat island (CUHI) intensity varies in response to the changes of the considered urban factors, we find the CUHI intensity of a city is directly related to the built-up density and an amplifying effect that urban sites have on each other. We propose a Gravitational Urban Morphology (GUM) indicator to capture the neighbourhood warming effect. We build a regression model to estimate the CUHI intensity based on urban size, urban gross building volume, and the GUM indicator. Taking the Berlin area as an example, we show the regression model capable of predicting the CUHI intensity under various urban development scenarios. [Chapter 3] Based on the multi-annual average summer surface urban heat island (SUHI) intensity derived from Land surface temperature, we further study how urban intrinsic factors influence the SUHI effect of the 5,000 largest urban clusters in Europe. We find a similar 3D GUM indicator to be an effective predictor of the SUHI intensity of these European cities. Together with other urban factors (vegetation condition, elevation, water coverage), we build different multivariate linear regression models and a climate space based Geographically Weighted Regression (GWR) model that can better predict SUHI intensity. By investigating the roles background climate factors play in modulating the coefficients of the GWR model, we extend the multivariate linear model to a nonlinear one by integrating some climate parameters, such as the average of daily maximal temperature and latitude. This makes it applicable across a range of background climates. 
The nonlinear model outperforms linear models in SUHI assessment as it captures the interaction of urban factors and the background climate. [Chapter 4] Our work reiterates the essential roles of urban density and morphology in shaping the urban thermal environment. In contrast to many previous studies that link bigger cities with higher UHI intensity, we show that cities larger in the area do not necessarily experience a stronger UHI effect. In addition, the results extend our knowledge by demonstrating the influence of urban 3D morphology on the UHI effect. This underlines the importance of inspecting cities as a whole from the 3D perspective. While urban 3D morphology is an aggregated feature of small-scale urban elements, the influence it has on the city-scale UHI intensity cannot simply be scaled up from that of its neighbourhood-scale components. The spatial composition and configuration of urban elements both need to be captured when quantifying urban 3D morphology as nearby neighbourhoods also cast influences on each other. Our model serves as a useful UHI assessment tool for the quantitative comparison of urban intervention/development scenarios. It can support harnessing the capacity of UHI mitigation through optimizing urban morphology, with the potential of integrating climate change into heat mitigation strategies.

Quantifying the environmental synergistic effect of cooling-air purification-carbon sequestration from urban forest in China

Article

Full-text available

Mar 2024
J CLEAN PROD

Intensifying urban imprint on land surface warming: Insights from local to global scale

Article

Full-text available

Feb 2024

Increasing urbanization exacerbates surface energy balance perturbations and the health risks of climate warming; however, it has not been determined whether urban-induced warming and attributions vary from local, regional, to global scale. Here, the local surface urban heat island (SUHI) is evidenced to manifest with an annual daily mean intensity of 0.99℃–1.10℃ during 2003–2018 using satellite observations over 536 cities worldwide. Spatiotemporal patterns and mechanisms of SUHI tightly link with climatevegetation conditions, with regional warming effect reaching up to 0.015℃–0.138℃ (annual average) due to surface energy alterations. Globally, the SUHI footprint of 1,860 cities approximates to 1% of the terrestrial lands, about 1.8–2.9 times far beyond the urban impervious areas, suggesting the enlargements of the imprint of urban warming from local to global scales. With continuous development of urbanization, the implications for SUHI-added warming and scaling effects are considerably important on accelerating global warming.

On the parametric description of log-growth rates of Romanian city sizes

Article

May 2024

Cooling and optimizing urban heat island based on a thermal knowledge-informed multi-type ant colony model

Article

May 2024
REMOTE SENS ENVIRON

Analyzing urban scaling laws in the United States over 115 years

Article

Apr 2024

The scaling relations between city attributes and population are emergent and ubiquitous aspects of urban growth. Quantifying these relations and understanding their theoretical foundation, however, is difficult due to the challenge of defining city boundaries and a lack of historical data to study city dynamics over time and space. To address this issue, we analyze scaling between city infrastructure and population across 857 metropolitan areas in the conterminous United States over an unprecedented 115 years (1900–2015) using dasymetrically refined historical population estimates, historical urban road network models, and multi-temporal settlement data to define dynamic city boundaries. We demonstrate that urban scaling exponents closely match theoretical models over a century. Despite some close quantitative agreement with theory, the empirical scaling relations unexpectedly vary across regions. Our analysis of scaling coefficients, meanwhile, reveals that contemporary cities use more developed land and kilometers of road than cities of similar population in 1900, which has serious implications for urban development and impacts on the local environment. Overall, our results provide a new way to study urban systems based on novel, geohistorical data.

How to perceive and map the synergy between CO2 and air pollutants: Observation, measurement, and validation from a case study of China

Article

Feb 2024
J ENVIRON MANAGE

Human Impacts on Land Surface‐Atmosphere Interactions

Chapter

Dec 2023

Modelling urban growth patterns

Article

Full-text available

Oct 1995
NATURE

CITIES grow in a way that might be expected to resemble the growth of two-dimensional aggregates of particles, and this has led to recent attempts1á¤-3 to model urban growth using ideas from the statistical physics of clusters. In particular, the model of diffusion-limited aggregation4,5 (DLA) has been invoked to rationalize the apparently fractal nature of urban morphologies1. The DLA model predicts that there should exist only one large fractal cluster, which is almost perfectly screened from incoming á¤~development unitsá¤™ (representing, for example, people, capital or resources), so that almost all of the cluster growth takes place at the tips of the clusterá¤™s branches. Here we show that an alternative model, in which development units are correlated rather than being added to the cluster at random, is better able to reproduce the observed morphology of cities and the area distribution of sub-clusters (á¤~towns') in an urban system, and can also describe urban growth dynamics. Our physical model, which corresponds to the correlated percolation model6á¤-8 in the presence of a density gradient9, is motivated by the fact that in urban areas development attracts further development. The model offers the possibility of predicting the global properties (such as scaling behaviour) of urban morphologies.

GIS, Spatial Analysis and Spatial Statistics

Article

Full-text available

Dec 1996

David J Unwin

Ecological Inference: New Methodological Strategies

Book

Sep 2004

This collection of essays brings together a diverse group of scholars to survey the latest strategies for solving ecological inference problems in various fields. The last half-decade has witnessed an explosion of research in ecological inference--the process of trying to infer individual behavior from aggregate data. Although uncertainties and information lost in aggregation make ecological inference one of the most problematic types of research to rely on, these inferences are required in many academic fields, as well as by legislatures and the Courts in redistricting, by business in marketing research, and by governments in policy analysis.

Applied Nonparametric Regression

Article

Nov 1991

Introduction to Percolation Theory

Article

Oct 1987

Dietrich Stauffer

Density Estimation for Statistics and Data Analysis (Chapman and Hall)

Article

Mar 1988

Bernard Walter Silverman

Scaling behavior in the growth of companies

Article

Jan 1996
NATURE

Introduction to Linear Regression Analysis

Book

Jan 1992

Applied Non-Parametric Regression

Book

Oct 1990

Wolfgang Karl Härdle

Applied Nonparametric Regression is the first book to bring together in one place the techniques for regression curve smoothing involving more than one variable. The computer and the development of interactive graphics programs have made curve estimation possible. This volume focuses on the applications and practical problems of two central aspects of curve smoothing: the choice of smoothing parameters and the construction of confidence bounds. Härdle argues that all smoothing methods are based on a local averaging mechanism and can be seen as essentially equivalent to kernel smoothing. To simplify the exposition, kernel smoothers are introduced and discussed in great detail. Building on this exposition, various other smoothing methods (among them splines and orthogonal polynomials) are presented and their merits discussed. All the methods presented can be understood on an intuitive level; however, exercises and supplemental materials are provided for those readers desiring a deeper understanding of the techniques. The methods covered in this text have numerous applications in many areas using statistical analysis. Examples are drawn from economics as well as from other disciplines including medicine and engineering.

Spatial Interactions Among US Cities

Article

Nov 2001
REG SCI URBAN ECON

We test implications of economic geography by exploring spatial interactions among U.S. cities. We use a data set consisting of 1900–1990 metro area populations, and spatial measures including distance from the nearest larger city in a higher-tier, adjacency, and location within U.S. regions. We also date cities from their time of settlement. We find that among cities which enter the system, larger cities are more likely to locate near other cities. Moreover, older cities are more likely to have neighbors. Distance from the nearest higher-tier city is not always a significant determinant of size and growth. We find no evidence of persistent non-linear effects on urban growth of either size or distance, although distance is important for city size for some years.
