LETTERS
Rank clocks
Michael Batty¹
Many objects and events, such as cities, firms and internet hubs, scale with size [1–4] in the upper tails of their distributions. Despite intense interest in using power laws to characterize such distributions, most analyses have been concerned with observations at a single instant of time, with little analysis of objects or events that change in size through time (notwithstanding some significant exceptions [5–7]). It is now clear that the evident macro-stability in such distributions at different times can mask a volatile and often turbulent micro-dynamics, in which objects can change their position or rank-order rapidly while their aggregate distribution appears quite stable. Here I introduce a graphical representation termed the ‘rank clock’ to examine such dynamics for three distributions: the size of cities in the US from AD 1790, the UK from AD 1901 and the world from 430 BC. Our results destroy any notion that rank–size scaling is universal: at the micro-level, these clocks show cities and civilizations rising and falling in size at many times and on many scales. The conventional model explaining such scaling on the basis of growth by proportionate effect cannot replicate these micro-dynamics, suggesting that such models and explanations are considerably less general than has hitherto been assumed.
I begin with the US city size distributions compiled [8] for the populations of the largest 100 cities from AD 1790 to AD 2000 using the decennial Population Census. City sizes are represented as rank–size distributions after Zipf, where population in city $i$ at time $t$, $P_i(t)$, is ordered against its rank $r_i(t)$ and plotted logarithmically to give an immediate visualization of scaling [9]. The relative stability of these rankings is clear from the plots in Fig. 1a but the switch in rankings, with many cities entering and leaving the top 100, is completely hidden. Over this 210 yr period, there are 266 cities that at some stage belong to the top 100; from 1840 when the number of cities first reached 100, only 21 remain in 2000. On average, it takes 105 yr for 50% of cities to appear or disappear from the top 100, while the average change in rank order for a typical city in each 10 yr period is 7 ranks. If the distributions are collapsed onto one another, there is only an 18% difference between any of their rank orders. This contrasts with a typical switch in rank orders, which is also illustrated in Fig. 1a, where I plot the ‘1950’ ranks using ‘2000’ population values.
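The ranking step and the count of cities that stay in the list can be sketched in Python. This is only an illustration on made-up populations: the function names `rank_cities` and `survivors`, the city labels and the values are assumptions, not the census data or the procedure actually used here.

```python
from typing import Dict, List


def rank_cities(populations: Dict[str, float], top_n: int) -> Dict[str, int]:
    """Return the rank (1 = largest) of the top_n cities by population."""
    # Sort by descending population; ties are broken by name (an assumption).
    ordered = sorted(populations, key=lambda c: (-populations[c], c))
    return {city: r for r, city in enumerate(ordered[:top_n], start=1)}


def survivors(ranks_then: Dict[str, int], ranks_now: Dict[str, int]) -> List[str]:
    """Cities that appear in the ranked list at both times."""
    return sorted(set(ranks_then) & set(ranks_now))


# Hypothetical populations at two census dates (not real data).
pop_t0 = {"A": 900, "B": 500, "C": 300, "D": 100}
pop_t1 = {"A": 1200, "B": 450, "C": 700, "E": 600}

ranks_t0 = rank_cities(pop_t0, top_n=3)
ranks_t1 = rank_cities(pop_t1, top_n=3)
stay = survivors(ranks_t0, ranks_t1)
```

In this toy case city B drops out of the top 3 and city E enters, which is the kind of turnover that a rank–size plot of the two dates would not show.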
To visualize these micro-dynamics, I first plot trajectories in Fig. 1b based on the rank and size of each city in the rank–size space where each city is coloured according to its rank order and the time when it first appears in the top 100. These trajectories are ordered so that greater changes in rank overlay lower ones. Although cities that do not change their rank show up clearly as vertical lines, this plot is confusing. I therefore propose a ‘rank clock’, where rank orders are plotted for each city in temporal clockwise direction with the highest rank at the centre and the lowest on the circumference. This provides an immediate visualization of the dynamics, as I show for the US data in Fig. 1c. To develop meaningful analysis on the clock, I measure various displacements of rank, but first extract significant city trajectories as in Fig. 1d. New York City has been the top rank (at the clock’s centre) since 1790, but the clock reveals how large cities such as Chicago, Los Angeles and Houston enter the system as population diffuses across the US, how cities in the original 13 colonies such as Richmond and Charleston fall out of favour, and how cities in the northeast ‘rust-belt’ such as Buffalo slowly decline in rank.
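The geometry of a rank clock can be sketched as a mapping from a city's rank series to points in the plane. The text fixes only the clockwise direction of time and the placement of rank 1 at the centre; the rest of this sketch is assumed: the start at 12 o'clock, the even spacing of time steps around the full circle, the linear radius scale and the name `clock_coords` are all hypothetical choices.

```python
import math
from typing import List, Optional, Tuple

Point = Optional[Tuple[float, float]]


def clock_coords(ranks: List[Optional[int]], max_rank: int) -> List[Point]:
    """Map a city's rank series to (x, y) points on a rank clock.

    Assumptions (not from the paper): time step k of T sits at angle
    2*pi*k/T measured clockwise from 12 o'clock, and the radius grows
    linearly from 0 at rank 1 to 1 at max_rank.
    """
    total_steps = len(ranks)
    points: List[Point] = []
    for k, r in enumerate(ranks):
        if r is None:  # city is outside the ranked list at this time
            points.append(None)
            continue
        radius = (r - 1) / (max_rank - 1)
        angle = 2 * math.pi * k / total_steps
        # Clockwise from the top: x = R sin(angle), y = R cos(angle).
        points.append((radius * math.sin(angle), radius * math.cos(angle)))
    return points


# Hypothetical rank series over four time steps (None = not in the list).
top_city = clock_coords([1, 1, 1, 1], max_rank=100)
riser = clock_coords([None, 100, 50, 2], max_rank=100)
```

A city that holds rank 1 throughout stays at the centre, while a city that enters at rank 100 and climbs traces a path from the circumference inwards.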
To complement the US data, I have constructed clocks for two other very different data sets. The US data are based on the rapid growth of key cities in the New World, and represent the transition from an essentially agrarian society to an industrial one. The original 24 cities in 1790 comprise only 5% of the US population (~200,000), whereas in 2000, the top 100 cities constitute 20% (~55 million). The UK data, taken from a reworking of the decennial Census data into 458 urban places [10], illustrate a population growing from 37 million to 57 million over the period AD 1901 to AD 2001. The third set is based on Chandler’s 4,000 yr world history of urban growth [11], from which I have culled and added data to the top 50 cities from 430 BC to AD 2000. There are 390 distinct cities in this data that grow from about 3 million in total to almost 600 million over the period. Like the US data, this reveals the massive transition to industrialization but only at the very end of the time series. Other key transitions from the classical era to the Dark Ages, the importance of China and Japan, the ebb and flow of urban populations in the Middle Ages, and the emergence of the modern era, are all reflected in this data. Details are contained in the Supplementary Information.
Rank–size distributions and rank clocks are shown for the UK and world data in Fig. 2. The UK rank–size profiles show scaling in the upper tail as the number of urban places in this data set is fixed. In the UK data, of the top 100 cities in 1901, 73 remain in these top ranks in 2001. The change in rank order in each 10 yr period for an average city is only 14 out of a total of 458 cities (about 3% of the list, against 7 ranks out of 100 for the US), which is less than half the US rate. In the world data, no cities in the top 50 in 430 BC remain in the top 50 in the year 2000, while from the Fall of Constantinople in 1453, only 6 cities from that era now appear in the top 50 (in 2000). The cities that show the greatest longevity through their being in the top 50 over the 2,430 yr period are perhaps surprising: Suzhou (2,158 yr), Nanking (2,080 yr), Wuchang (1,850 yr), Benares (1,780 yr), Rayy (1,630 yr) and Rome (1,530 yr). Paris (525 yr) is 77th out of 390 cities. The half life for cities in this data set remaining in the top 50 is ~200 yr, almost twice as long as for the US data. Comparing the patterns in the three clocks in Figs 1 and 2, visual intuition suggests that changes in rank are smoothest in the UK data, more volatile and complex in the US, while most volatile in the world data where large changes in rank seem to dominate all historical periods. A more detailed sample of this micro-dynamics is shown in the clock in Fig. 3, where 7 ‘typical’ cities are plotted for the world data, each showing a different pattern of rank change, suggesting a possible basis for future classification of city types based on their local dynamics [12].
To compare these three systems, I reduce each set of cities to the top 50. These of course generate different rank clocks (shown in the Supplementary Information) in the US and UK systems, as I am now dealing with the top 50 cities from 266 US and 458 UK cities. Throughout I am also making the assumption that a city that exists at times $t-1$ and $t$ in the top 50 remains in that list regardless of the length of time between $t-1$ and $t$. For the world data set where time periods vary substantially in length, this complicates the analysis, for cities might hop in and out of the top 50 during long time intervals.

¹ Centre for Advanced Spatial Analysis, The Bartlett School, University College London, 1–19 Torrington Place, London WC1E 6BT, UK.
Vol 444 | 30 November 2006 | doi:10.1038/nature05302 | ©2006 Nature Publishing Group

The rank shifts [13] are defined as distances $d_i(t) = |r_i(t) - r_i(t-1)|$, which exist only if the city in question is in the top 50 at each time period within which $N_i(t)$ such cities comprise the common set $\Omega(t)$. These distances can be plotted on the clock, as can the average shift per time $d(t) = \sum_{i \in \Omega(t)} |r_i(t) - r_i(t-1)| / N_i(t)$, which enables analysis of the extent to which the system is converging or diverging in terms of rank shift [14]. The average shift over all time periods $T$ is defined as $d = \sum_t d(t) / T$. If there are no shifts, then all these distances are zero, while the maximum average shift that can take place at any time is $d(t) < n/2$ where $n = 50$, the total number of cities in the list.
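The distance measures above can be sketched in Python. This is a minimal illustration, assuming that each time period is supplied as a mapping from city to rank and that $T$ counts the transitions between consecutive periods; the names `rank_shift` and `overall_shift` and the toy rank lists are hypothetical.

```python
from typing import Dict, List


def rank_shift(prev: Dict[str, int], curr: Dict[str, int]) -> float:
    """Average |r_i(t) - r_i(t-1)| over cities ranked at both times."""
    common = set(prev) & set(curr)  # the common set at time t
    if not common:
        return 0.0  # assumption: report zero when no city is shared
    return sum(abs(curr[c] - prev[c]) for c in common) / len(common)


def overall_shift(rank_series: List[Dict[str, int]]) -> float:
    """Mean of d(t) over the transitions between consecutive rank lists."""
    shifts = [rank_shift(a, b) for a, b in zip(rank_series, rank_series[1:])]
    return sum(shifts) / len(shifts)


# Hypothetical top-3 lists at three successive times.
series = [
    {"A": 1, "B": 2, "C": 3},
    {"A": 1, "C": 2, "D": 3},  # B leaves, D enters, C moves up one rank
    {"C": 1, "A": 2, "D": 3},  # A and C swap places
]
d_t = [rank_shift(a, b) for a, b in zip(series, series[1:])]
d_bar = overall_shift(series)
```

Cities that enter or leave the list between two periods contribute nothing to $d(t)$ in this sketch, which follows the restriction to the common set stated above.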
These distances are plotted in Fig. 4a–c. The world system is clearly the most volatile, with an overall shift $d$ of 14.27 and with the average $d(t)$ indicating bigger shifts in the classical era and until around the year AD 1000. After that, there is a gradual reduction in shift until the Industrial Revolution. For the US data, the overall shift is much lower with $d$ at 4.67, with the average $d(t)$ being largest between 1830 and 1890, the period of greatest city expansion in the US. From the earlier analysis, I would expect the UK to show much lower volatility, but once the city set is restricted to the top 50, changes in rank ($d$ at 4.22) are only a little less than for the US case. The UK was effectively locked into its current urban settlement pattern by 1901, and rank shifts in the last century are greatest in the 1950s and 1960s, a period of rapid suburbanization [15].
My second analysis involves defining growth rates and examining their trajectories using the rank clock. These rates will be split into those associated with the growth and the change in the share of population in cities. First I define the growth rate for each city $i$ at time $t$, $\lambda_i(t)$, as $P_i(t)/P_i(t-1)$, from which $P_i(t) = \lambda_i(t) P_i(t-1)$, and then define population shares as $p_i(t) = P_i(t)/P(t)$ where total population is $P(t) = \sum_i P_i(t)$. The expected growth rate for cities $\lambda(t)$ is:

$$
\begin{aligned}
\lambda(t) &= \sum_i p_i(t)\,\lambda_i(t) = \sum_i p_i(t)\,\frac{P_i(t)}{P_i(t-1)} \\
&= \frac{P(t)}{P(t-1)} \left( \sum_i p_i(t)\,\frac{p_i(t)}{p_i(t-1)} \right)
\end{aligned}
\tag{1}
$$

where the overall growth $\Gamma(t)$ and expected shift in population shares $\theta(t)$ are the first and second terms on the second line of equation (1), respectively. A more workable form is based on the logarithmic growth rate $\log \lambda_i(t) = \log[P_i(t)/P_i(t-1)]$, whose expected value is an information statistic defined as:

$$
\begin{aligned}
I[\lambda(t)] &= \sum_i p_i(t) \log \lambda_i(t) = \sum_i p_i(t) \log \frac{P_i(t)}{P_i(t-1)} \\
&= \log \frac{P(t)}{P(t-1)} + \sum_i p_i(t) \log \frac{p_i(t)}{p_i(t-1)}
\end{aligned}
\tag{2}
$$

The log of overall growth $I[\Gamma(t)]$ and expected log of the shares $I[\theta(t)]$ are the first and second terms on the second line of equation (2). The second term is an information difference, which holds the key to further analysis through decomposition into growth and change at different spatial scales, thereby enabling different systems to be integrated through a hierarchy of information entropies [16–18].
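The two decompositions can be checked numerically with a short Python sketch. It assumes the same set of cities is present, with positive population, at both times; the function name `decompose`, the dictionary keys and the toy populations are hypothetical, and natural logarithms are used since the base is not stated.

```python
import math
from typing import Dict, List


def decompose(pop_prev: List[float], pop_curr: List[float]) -> Dict[str, float]:
    """Growth and information components for one time step.

    Assumes both lists cover the same cities in the same order and that
    every population is positive.
    """
    total_prev, total_curr = sum(pop_prev), sum(pop_curr)
    p_prev = [x / total_prev for x in pop_prev]          # shares at t-1
    p_curr = [x / total_curr for x in pop_curr]          # shares at t
    growth = [c / b for b, c in zip(pop_prev, pop_curr)]  # lambda_i(t)

    gamma = total_curr / total_prev                        # overall growth
    theta = sum(p * p / q for p, q in zip(p_curr, p_prev))  # share shift
    lam = sum(p * g for p, g in zip(p_curr, growth))       # expected growth

    i_gamma = math.log(gamma)
    i_theta = sum(p * math.log(p / q) for p, q in zip(p_curr, p_prev))
    i_lam = sum(p * math.log(g) for p, g in zip(p_curr, growth))
    return {"lam": lam, "gamma": gamma, "theta": theta,
            "I_lam": i_lam, "I_gamma": i_gamma, "I_theta": i_theta}


# Hypothetical populations of three cities at t-1 and t.
parts = decompose([100.0, 50.0, 50.0], [120.0, 80.0, 40.0])
```

On these toy numbers the product form of equation (1) and the additive form of equation (2) both hold to floating-point precision, with the share term of equation (2) non-negative, as expected of an information difference between two share distributions.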

Figure 1 | Rank size and rank clocks for the US urban system 1790–2000. a, Zipf plots of the top 100 cities. Red indicates the rank of cities in 1950 using populations for 2000. b, Trajectories of each city in rank–size space with New York City rank 1 throughout. c, The rank clock, with each axis running from rank 1 at the centre to 100 on the circumference, and cities coloured by date of entry from 1790 (red) to 2000 (blue). d, Sample city trajectories. [Figure not reproduced: panels a and b plot population $P_i(t)$ against rank $r_i(t)$ on logarithmic axes; the clocks in c and d are marked by census year from 1790 to 2000.]

These growth and information statistics reflect the balance between overall change and the shift and share in rank and population size, which in its simplest additive form is expressed in the expected log of city growth rates in equation (2). The growth and information components averaged over all time periods for each data set are shown in Table 1, where it is clear that for the world and UK systems, the averages $\theta$ and $I[\theta]$ are very small, implying relatively little contribution to overall growth, which mainly comes from $\Gamma$ and $I[\Gamma]$. In contrast, there is greater shift and overall growth in the US system. However, the UK has been a low growth city system over the entire data period whereas the world system had relatively low growth from 430 BC to about AD 1600 when it began to take off.
These patterns are very clear in the trajectories of the three growth components, which are plotted in rank clock form for all three systems in Fig. 4d. The information statistics mirror these trajectories. There we observe for the world system that the $\theta(t)$ and $I[\theta(t)]$ make very little contribution to the total change while $\Gamma(t)$ and $I[\Gamma(t)]$ dominate, with the significant periods being the early Dark Ages from the collapse of the Roman Empire to around 500, and the Industrial Revolution which merges into the current era of globalization beginning around 1800 [19]. In the UK system, the statistics pick up the recession period 1930–40 and the relatively low growth era of the 1970s. Lastly, the US urban system shows much greater growth and population shifts in the nineteenth century with specific shifts in population in the 1860s and relatively low periods of growth in the 1820s, the 1930s and 1940s, and the 1980s.

Figure 2 | Rank size and rank clocks for the UK and world urban systems. a, b, The Zipf plot (a) and the rank clock (b) for the UK, with each clock axis from rank 1 to 458. c, d, The Zipf plot (c) and the rank clock (d) for the world, with each clock axis from rank 1 to 50. [Figure not reproduced: the Zipf plots show population $P_i(t)$ against rank $r_i(t)$ on logarithmic axes; the UK clock is marked from 1901 to 2001 and the world clock from 430 BC to AD 2000.]

Figure 3 | Trajectories of a sample of world cities from 430 BC to AD 2000. Note the dominance of Rome until modern times, and the importance of the Chinese cities Nanking and Suzhou. From 1600 onwards, the volatility of the clock increases as cities of the early modern period and the Industrial Revolution are replaced in the top 50 by cities in the developing world. [Figure not reproduced: the clock is marked from 430 BC to AD 1950.]

The information generated on the micro-dynamics of rank size can be used to examine the consistency of the widely accepted model used to generate growth and change in systems that scale [1,20–22]. This model is based on proportionate random growth, first suggested by Gibrat [23], subject to a lower bound $P_{\min}(t)$ below which populations cannot fall. For $n$ cities, $P_i(t) = [\Gamma + \varepsilon_i(t)]\,P_i(t-1)$ with $P_i(t) > P_{\min}(t)$ where $\varepsilon_i(t)$ is a random value chosen from a normal distribution, and $P_{\min}(t)$ is a fixed proportion of the average $P(t)/n$ of the total population [24]. With 1,500 cities and using a constant ten year growth rate of 1.126 discounted to each yearly time step, I have run the model for $T = 2{,}500$ steps, which mirrors the population change in Chandler’s world data set. The biggest differences between the simulated and three real data sets are the low values of the population shift-shares with $\theta = 1.00061$ and $I[\theta] = 1.00031$. The macro- and micro-dynamics of this simulation are shown in the Supplementary Information, from which it is clear that the rank clock has some similarities with that associated with the world data set, whereas the distance clock is similar to the US clock.
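A proportionate-growth model of this kind can be sketched in Python as below. Only the growth rule, the yearly discounting of the ten year rate 1.126 and the existence of a lower bound come from the text; the standard deviation `sigma`, the bound `floor_fraction`, the equal starting sizes, the reset-to-floor rule and the small run size are assumptions chosen for illustration, and the function names are hypothetical.

```python
import random
from typing import List


def gibrat_step(pops: List[float], gamma: float, sigma: float,
                floor_fraction: float, rng: random.Random) -> List[float]:
    """One step of P_i(t) = [gamma + eps_i(t)] * P_i(t-1) with a lower bound.

    Assumption: cities falling below the bound are reset to it, where the
    bound is floor_fraction times the mean size after growth.
    """
    grown = [(gamma + rng.gauss(0.0, sigma)) * p for p in pops]
    p_min = floor_fraction * sum(grown) / len(grown)
    return [max(p, p_min) for p in grown]


def simulate(n: int, steps: int, gamma: float, sigma: float,
             floor_fraction: float, seed: int = 0) -> List[List[float]]:
    """Run the model from equal starting sizes and keep every time slice."""
    rng = random.Random(seed)
    pops = [1.0] * n
    history = [pops]
    for _ in range(steps):
        pops = gibrat_step(pops, gamma, sigma, floor_fraction, rng)
        history.append(pops)
    return history


# Ten year rate of 1.126 (from the text) discounted to a yearly step.
yearly_gamma = 1.126 ** (1 / 10)

# Small illustrative run; sigma and floor_fraction are assumed values.
history = simulate(n=200, steps=100, gamma=yearly_gamma,
                   sigma=0.05, floor_fraction=0.3, seed=1)
final_sizes = sorted(history[-1], reverse=True)
```

Ranking each slice of `history` and computing rank shifts between consecutive slices would then give simulated counterparts of the rank and distance clocks, although a run of the size reported in the text (1,500 cities over 2,500 steps) would be needed for any comparison with the figures quoted there.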
It is clear however that this model is not able to generate the unique events associated with a turbulent world history in the rise and fall of cities and civilizations [19], nor is it able to mirror change over shorter time periods. Although Gibrat’s model does generate universal scaling behaviour for city size distributions, its micro-dynamics is very different from that which is revealed using the rank clocks. Models of proportionate random growth generating scale-free effects are now being explored for networks, and preliminary analysis of their historical dynamics can be informed using rank clocks [25]. To test these ideas further, much larger temporal data sets (such as clusters of internet hubs and websites, firm sizes, scientific citation networks, individual income distributions, and short lived epidemics) would thus appear to be excellent candidates for analysis using rank clocks.

Figure 4 | Distance clocks and growth rates. a–c, Distance clocks for the top 50 cities for the US (a), the UK (b) and the world (c), coloured according to when each city enters the rank order from the earliest in red to the latest in blue, and with greatest changes in distance overlaying lesser. d, The shift-share growth clock. All axes are from rank 1 to 50 for the clocks in a, b and c, and from 0 (centre) to 2.5 (circumference) for d. [Figure not reproduced.]

Table 1 | Distance, growth and information statistics

Component                                   US       World     UK        Gibrat’s model
$d$                                         4.6675   14.2773   4.22049   9.4853
$\lambda = \sum_t \lambda(t)/T$             1.3787   1.1311    1.0044    1.3441
$\Gamma = \sum_t \Gamma(t)/T$               1.3122   1.1256    0.9991    1.3433
$\theta = \sum_t \theta(t)/T$               1.0456   1.0042    1.0054    1.0006
$I[\lambda] = \sum_t I[\lambda(t)]/T$       0.2770   0.1032    0.0010    0.2942
$I[\Gamma] = \sum_t I[\Gamma(t)]/T$         0.2572   0.1012    -0.0017   0.2939
$I[\theta] = \sum_t I[\theta(t)]/T$         0.0198   0.0020    0.0027    0.0003

See text for details of quantities in the leftmost column. All rates are normalized to 10 yr time intervals.

Received 12 May; accepted 27 September 2006.

1. Blank, A. & Solomon, S. Power laws in cities population, financial markets and internet sites: Scaling and systems with a variable number of components. Physica A 287, 279–288 (2000).
2. Gabaix, X. & Ioannides, Y. M. in Handbook of Regional and Urban Economics Vol. 4 (eds Henderson, V. & Thisse, J-F.) 2341–2378 (North-Holland, Amsterdam, 2004).
3. Axtell, R. L. Zipf distribution of U.S. firm sizes. Science 293, 1818–1820 (2001).
4. Adamic, L. A. & Huberman, B. A. Zipf’s law and the internet. Glottometrics 3, 143–150 (2002).
5. Stanley, M. H. R. et al. Scaling behavior in the growth of companies. Nature 379, 804–806 (1996).
6. White, D. R., Kejzar, N., Tsallis, C. & Rozenblat, C. Generative Historical Model of City-Size Hierarchies: 430BCE–2005 (ISCOM Working Paper, Institute of Mathematical Behavioral Sciences, University of California, Irvine, CA, 2005).
7. Barabasi, A-L. The origin of bursts and heavy tails in human dynamics. Nature 435, 207–211 (2005).
8. Gibson, C. Population of the 100 Largest Cities and Other Urban Places in the United States: 1790 to 1990 (Population Division Paper 27, US Bureau of the Census, Washington DC, 1998).
9. Zipf, G. K. Human Behavior and the Principle of Least Effort (Addison-Wesley, Cambridge, Massachusetts, 1949).
10. CDU census website. ⟨http://census.ac.uk/cdu/⟩ (accessed 8 August, 2006).
11. Chandler, T. Four Thousand Years of Urban Growth: An Historical Census (Edward Mellon, Lampeter, UK, 1987).
12. Ioannides, Y. M. & Overman, H. G. Zipf’s law for cities: An empirical examination. Reg. Sci. Urban Econ. 33, 127–137 (2003).
13. Havlin, S. The distance between Zipf plots. Physica A 216, 148–150 (1995).
14. Guerin-Pace, F. Rank-size distribution and the process of urban growth. Urban Stud. 32, 551–562 (1995).
15. Robson, B. T. Urban Growth: An Approach (Methuen, London, 1972).
16. Theil, H. Statistical Decomposition Analysis (North-Holland, Amsterdam, 1972).
17. Batty, M. Entropy in spatial aggregation. Geogr. Anal. 8, 1–21 (1976).
18. Gell-Mann, M. & Tsallis, C. (eds) Nonextensive Entropy-Interdisciplinary Applications (Oxford Univ. Press, New York, 2004).
19. Turchin, P. Historical Dynamics: Why States Rise and Fall (Princeton Univ. Press, Princeton, New Jersey, 2003).
20. Gabaix, X. Zipf’s law for cities: An explanation. Q. J. Econ. 114, 739–767 (1999).
21. Sornette, D. & Cont, R. Convergent multiplicative processes repelled from zero: Power laws and truncated power laws. J. Phys. I (Paris) 7, 431–444 (1997).
22. Pumain, D. in Hierarchy in Natural and Social Sciences (ed. Pumain, D.) 169–222 (Springer, Dordrecht, 2006).
23. Gibrat, R. Les Inégalités Économiques (Librarie du Recueil, Sirey, Paris, 1931).
24. Malcai, O., Biham, O. & Solomon, S. Power-law distributions and Levy-stable intermittent fluctuations in stochastic systems of many autocatalytic elements. Phys. Rev. E 60, 1299–1303 (1999).
25. Krapivsky, P. L. & Redner, S. Statistics of changes in lead node in connectivity-driven networks. Phys. Rev. Lett. 89, 258703 (2002).

Supplementary Information is linked to the online version of the paper at
www.nature.com/nature.
Acknowledgements This work was partially supported by the EPSRC Spatially
Embedded Complex Systems Engineering Consortium. I thank R. Carvalho for
useful discussions, and D. Dorling for providing the UK data set.
Author Information Reprints and permissions information is available at
www.nature.com/reprints. The author declares no competing financial interests.
Correspondence and requests for materials should be addressed to the author
(m.batty@ucl.ac.uk).