The Evolution of Natural Cities from the Perspective of Location-Based Social Media


Abstract

This paper examines the former location-based social medium Brightkite, over its three-year life span, based on the concept of natural cities. The term 'natural cities' refers to spatially clustered geographic events, such as the agglomerated patches aggregated from individual social media users' locations. We applied the head/tail division rule to derive natural cities. More specifically, we generated a triangulated irregular network, made up of individual unique user locations, and then categorized small triangles (smaller than an average size) as natural cities for the United States (mainland) on a monthly basis. The concept of natural cities provides a powerful means to develop new insights into the evolution of real cities, because there are virtually no data available to track the history of a city across its entire life span and at very fine spatial and temporal scales. Therefore, natural cities can act as a good proxy of real cities, in the sense of understanding underlying interactions, at a global level, rather than of predicting cities, at an individual level. Apart from the data produced and the contributed methods, we established new insights into the structure and dynamics of natural cities, e.g., the idea that natural cities evolve in nonlinear manners at both spatial and temporal dimensions. Keywords: Big data, head/tail breaks, ht-index, power laws, fractal, and nonlinearity
Bin Jiang and Yufan Miao
Department of Technology and Built Environment, Division of Geomatics
University of Gävle, SE-801 76 Gävle, Sweden
(Draft: August 2013, Revision: September 2013, January 2014)
1. Introduction
Once upon a time, there were no cities, only scattered villages. Over time, cities gradually emerged
through the interaction of people or residents; similarly, large or mega cities evolve through the
interaction of cities or people. This is a conjecture mentioned in Jiang (2013b), in which he argued that
geographic phenomena such as urban growth are essentially unpredictable. Many models in the
literature that claim to predict urban growth are in effect limited to short-term prediction, like
weather forecasts; weather beyond five days is essentially unforecastable (Bak 1996). A typical
city may have hundreds of years of history, making it nearly impossible to track its growth
quantitatively because of a lack of related data. More important, a city grows within a system of cities;
one cannot understand a city’s growth without considering other related cities. In this paper, we
illustrate that emerging social media provide an unprecedented data source for studying the evolution
of natural cities (see Section 2 for the definition), and subsequently for better understanding the structure
and dynamics of real cities. Location-based social media, sometimes termed location-based social
networks, such as Flickr, Twitter, and Foursquare (Traynor and Curran 2012, Zheng and Zhou 2011)
refer to a set of Internet-based applications founded on Web 2.0 technologies and ideologies that allow
users to create and exchange user-generated content. Location-based social media can act as a proxy of
real cities (or human settlements in general) and provide better understanding of underlying structure
and dynamics of human settlements.
Not long ago, there were no social media, only scattered home pages and bulletin board systems
created and maintained by individuals and institutions (Boyd and Ellison 2008, Kaplan and Haenlein
2010). In the era of Web 1.0, geographic locations were not an issue. However, with Web 2.0,
geographic locations have become an important feature of social media. Almost all social
media allow users to tag their geographic locations, often at the level of meters, when sharing and
exchanging user-generated content. Location-based social media enable users to track individual
historical trajectories, their friends, and even the growth of social media. Unlike with conventional
cities, the trajectories of social media are well documented by the hosting companies; and unlike
conventional census data, social media data is defined at the individual level, often at very fine spatial and
temporal scales. Data can be obtained using crawling techniques or through the social media’s
officially released application programming interfaces (APIs). This study aimed to showcase how
social media’s time-stamped location data can be utilized to study the evolution of natural cities, and
thus provide new insights into the underlying structure and dynamics of real cities.
The contribution of this paper can be seen from three aspects: data, methods, and new insights.
This study produced a large amount of data regarding natural cities from the former social medium
Brightkite during its entire 31-month life span. The resulting data has significant value for further
study of city growth and of the allometric relationship between populations and physical extents (data, as
well as related source code, from the study will be released upon acceptance of this paper). We drew
upon a set of fractal or scaling oriented methods to characterize natural cities. These unique methods
help create new insights into the evolution of natural cities as well as that of real cities. For example,
natural cities demonstrate a striking nonlinear property, spatially and temporally (see Section 5).
Moreover, the evolution of natural cities can provide better understanding of social media from a
unique geospatial perspective.
This study provides new perspectives, as well as different ways of thinking, to the study of cities and
city growth in the era of big data (Mayer-Schonberger and Cukier 2013). We did not adopt
conventional census data, but rather the emerging georeferenced social media data; we did not adopt
conventional geographic units or boundaries that are imposed from the top down by authorities, but
rather the naturally defined concept of natural cities, to avoid the statistical bias arising from the modifiable areal
unit problem (Openshaw 1984); and we did not rely on standard and spatial statistics with a
well-defined mean to characterize spatial heterogeneity, but rather power-law-based statistics, driven
by fractal and scaling thinking. Therefore, the underlying ways of thinking adopted in this study are
bottom up rather than top down, in terms of data and methods, nonlinear rather than linear, and fractal
rather than Euclidean in terms of the power-law statistics. In short, this study argues that
geospatial analysis requires a different way of thinking when dealing with the problem of spatial
heterogeneity.
The remainder of this paper is structured as follows. Section 2 presents the methods in which we
define the concept of natural cities and discuss ways of characterizing them. Section 3
presents the data on a monthly basis and shows basic statistics of the data. Section 4 discusses the
results and major findings, and Section 5 discusses the implications of the study. Finally, Section 6 draws a
conclusion and points to future work.
2. Methods
In this section, we illustrate and define the concept of natural cities and present various ways of
characterizing natural cities. We also discuss how natural cities differ from conventional cities and
why they represent a new way of thinking for geospatial analysis.
2.1 Defining natural cities
To approach the difficult task of defining and describing natural cities, we start with definitions of
conventional cities and try to clarify why the conventional definitions are not natural. A city is a
relatively large and permanent human settlement. But how large a settlement must be to qualify as a
city is unclear. For example, a city in Sweden may not qualify as a city in China. Also, many cities
have a particular administrative, legal, and historical status according to their local laws. In the United
States, for example, cities can refer to incorporated places, urban areas, or metropolitan areas with
sufficient population of, say, at least 10,000. This population threshold can be very subjective and is
dependent on the country. This subjectivity is also demonstrated in the physical boundaries of cities,
which are legally and administratively determined. Remotely sensed imagery provides new means to
delineate city boundaries, but how does one choose an appropriate pixel value as a cutoff for the
delineation? Because of these subjectivities, conventional definitions of cities are unnatural. How, then,
can we define a city in more natural ways?
We present three examples of natural cities before formally defining the concept. In the first example,
natural cities are derived from massive street nodes, including both junctions and street ends. Given all
street nodes of an entire country, we can run an iterative clustering algorithm to determine whether a
node is within the neighborhood of another node. For example, set a radius of 700 meters and continuously
draw a circle around each node to determine whether any other node is within its circle. This
progressive and exhaustive process results in many natural cities; see Figure 1a for an illustrative
example. In their study, Jiang and Jia (2011) found that millions of natural cities could be derived
from tens of millions of street nodes in the United States using OpenStreetMap (OSM) data
(Bennett 2010). Instead of massive street nodes, the second example relies on a massive number of
street blocks to extract natural cities. Jiang and Liu (2012) adopted the three largest European
countries: France, Germany, and the UK for their case studies, again using OSM data. The idea is
illustrated in Figure 1b in which small blocks (smaller than an average city block) constitute a natural
city. Although this method sounds very simple, the computation is very intensive for each country, and
involves millions of street blocks. The third example comes from Jiang and Yin (2014), in which the
authors relied on nighttime imagery to derive natural cities. The authors took all pixel values (millions
of pixels each valued between 0 and 63) of an image of the United States and computed an average
value or mean. The mean split all the pixels into two groups: those above the mean, and those below the mean.
For the pixels above the mean, a second mean was obtained, and it can be a meaningful cutoff for
delineating natural cities.
Figure 1: (Color online) Natural cities based on (a) street nodes and (b) street blocks
(Note: Blue rectangles are the boundaries of the natural cities, which are composed of high-density
nodes or small street blocks based on the head/tail division rule (Jia and Jiang 2010))
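The first example, clustering street nodes that lie within a fixed radius of one another, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used by Jiang and Jia (2011): the function name cluster_by_radius, the use of planar coordinates in meters, and the naive pairwise comparison are all introduced here for illustration only.

```python
import math

def cluster_by_radius(points, radius):
    """Group point indices so that points within `radius` of each other,
    directly or through a chain of neighbors, end up in the same cluster.

    Naive O(n^2) union-find sketch; a run over millions of nodes would
    need a spatial index instead of comparing every pair.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= radius:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())

# Hypothetical street nodes (meters): three close together, two farther east.
nodes = [(0, 0), (500, 0), (1000, 0), (5000, 0), (5400, 0)]
print(cluster_by_radius(nodes, 700))  # [[0, 1, 2], [3, 4]]
```

Nodes 0 and 2 are 1,000 meters apart, yet they fall into the same cluster because node 1 links them; this chaining is what lets a fixed radius produce clusters of very different sizes.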
These examples of deriving natural cities point out the importance of the mean’s effect, which is based
on the head/tail division rule: Given a variable X, if its values x follow a heavy tailed distribution, then
the mean (m) of the values can divide all the values into two parts: a high percentage in the tail, and a
low percentage in the head (Jiang and Liu 2012). A heavy tailed distribution refers to a statistical
distribution that is right-skewed, for example, power law, lognormal, and exponential. Obviously,
the density of street nodes, the size of street blocks, and the nighttime imagery pixel values all exhibit
a heavy tailed distribution, which implies that there are far more small things than large ones. In this
paper, we introduce an additional way of deriving natural cities: from individual users’ geographical
data of location-based social media. From unique users who check in from locations across an entire
country, we can build up a huge triangular irregular network (TIN), and then categorize these small
triangles (smaller than a mean) as natural cities (Figure 2); refer to the Appendix for a short tutorial.
Section 5 includes a discussion of why the head/tail division rule works so well in delineating natural
cities.
Figure 2: (Color online) Procedure of generating natural cities (red patches) from points through TIN
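The cutoff step of the procedure in Figure 2 can be sketched as follows. The sketch makes several assumptions that go beyond the text: the TIN is taken as already built (for example, by a GIS package) and is passed in as triples of vertex indices; a triangle counts as small when all three of its edges are shorter than the mean edge length, which is one plausible reading of the cutoff (Table 3 reports the mean as an average edge length); and small triangles that share an edge are merged into one patch. The name natural_city_patches is hypothetical.

```python
import math
from collections import defaultdict

def natural_city_patches(points, triangles):
    """Return (mean_edge_length, patches) for a given triangulation.

    points: list of (x, y); triangles: list of (i, j, k) vertex indices.
    Each patch is a list of triangle indices joined through shared edges.
    """
    def edges_of(tri):
        i, j, k = tri
        return [tuple(sorted(e)) for e in ((i, j), (j, k), (i, k))]

    # Length of every unique TIN edge, and the mean used as the cutoff.
    length = {}
    for tri in triangles:
        for a, b in edges_of(tri):
            length[(a, b)] = math.dist(points[a], points[b])
    mean = sum(length.values()) / len(length)

    # Assumed rule: a triangle is "small" if all its edges are below the mean.
    small = [t for t, tri in enumerate(triangles)
             if all(length[e] < mean for e in edges_of(tri))]

    # Merge small triangles that share an edge into one patch (union-find).
    by_edge = defaultdict(list)
    for t in small:
        for e in edges_of(triangles[t]):
            by_edge[e].append(t)
    parent = {t: t for t in small}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for tris in by_edge.values():
        for other in tris[1:]:
            parent[find(other)] = find(tris[0])

    patches = defaultdict(list)
    for t in small:
        patches[find(t)].append(t)
    return mean, sorted(patches.values())

# Hypothetical input: a unit square of four points plus two distant points.
pts = [(0, 0), (1, 0), (1, 1), (0, 1), (10, 0), (10, 10)]
tin = [(0, 1, 2), (0, 2, 3), (1, 4, 2), (2, 4, 5), (2, 5, 3)]
mean, patches = natural_city_patches(pts, tin)
print(patches)  # [[0, 1]]
```

Only the two triangles inside the unit square survive the cutoff, and they merge into a single patch. The few long edges pull the mean well above the typical short edge, which is why the mean separates dense clusters from sparse surroundings.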
Based on these examples, a formal definition of natural cities can be derived. Natural cities refer to
human settlements or human activities in general on Earth’s surface that are objectively or naturally
defined and delineated from massive geographic information of various kinds, and based on the
head/tail division rule. Unlike conventional cities, natural cities do not need to meet a minimum
population requirement. A one-person settlement may constitute a natural city, or even zero people, if
natural cities are defined not according to human population, but something else. For example, when
natural cities are defined according to street nodes, a natural city derived from one street node may
have no people there at all. The reader may question whether this definition makes sense, but the
definition makes good sense because it provides a new perspective for geospatial analysis, and helps
us develop new insights into geographic forms and processes (see Sections 4 and 5). That is also the
reason that we use the term natural cities to refer to human settlements or human activities in general
on the Earth’s surface. With the concept of natural cities, we abandon the top-down imposed unnatural
geographic units or boundaries such as states, counties, and cities, in order to study geographic forms
and processes more scientifically.
2.2 Characterizing natural cities
The rank-size distribution of cities in a region can be well characterized by Zipf’s law, i.e., an inverse
power relationship between city rank (r) and city size (N), N ∝ r^(-1) (Zipf 1949). Simply put, when
ranking all cities in a decreasing order for a given country, the largest city is twice as big as the second
largest, three times as big as the third largest, and so on. In other words, a city’s size by population is
inversely proportional to its rank. Such a simple and neat law is found to hold remarkably well for
almost all countries or regions (e.g., Berry and Okulicz-Kozaryn 2011), although some researchers
have challenged its universality (e.g., Benguigui and Blumenfeld-Leiberthal 2011). Essentially, Zipf’s
law indicates two aspects: (1) a power-law relationship between rank and size, and (2) a Zipf’s
exponent of one. Most previous studies have confirmed the first aspect, but not the second; the Zipf’s
exponent was found to deviate from one. In other words, the first aspect is less controversial
than the second. Some researchers argued that Zipf’s law was primarily used for characterizing
large cities rather than all cities. In this study, we chose large natural cities (larger than a mean) to
examine whether they followed Zipf’s law. The scaling pattern of far more small cities than large
ones underlies Zipf’s law: a majority of small cities and a minority of large cities. More important,
the scaling pattern recurs not just once, but multiple times for those large cities, again and again. This
is the basis of head/tail breaks (Jiang 2013), a novel classification scheme for data with a heavy tailed
distribution. In what follows, we illustrate head/tail breaks with a working example.
Table 1: Head/tail breaking statistics for the TIN edges
The triangulated irregular network shown in Figure 2 appears to contain far more short edges
than long ones, and indeed, this is true. There are 504 edges, ranging from the shortest 0.001 to the
longest 46.752. The wide range 46.751 = 46.752 – 0.001 and the large ratio 46,752 = 46.752/0.001
clearly indicate far more short edges than long ones. The average length of the 504 edges is 2.2, which
splits all the edges into two unbalanced parts: 135 in the head (27 percent) and 369 in the tail (73
percent). This head/tail breaking process can be continued for the head again and again, as shown in
Table 1. Eventually, the scaling pattern of far more short edges than long ones recurs five times, three
of which are plotted in Figure 3 as so-called nested rank-size plots. Given that the scaling pattern
recurs five times, the ht-index is six. Note that the ht-index (Jiang and Yin 2014) is an alternative index to
fractal dimension (Mandelbrot 1983) used to capture the complexity of geographical features.
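The head/tail breaking process described above can be sketched as a short loop: split at the mean, keep the head, and repeat while the head remains a minority. The stopping threshold of 40 percent (head_limit) is an assumption made for this sketch, since the text only requires the head to be a low percentage; the function names and the sample values are also hypothetical and are not the 504 TIN edges of Table 1.

```python
def head_tail_breaks(values, head_limit=0.4):
    """Return the successive means (class breaks) found by head/tail breaks.

    The head (values above the mean) is split again as long as it holds at
    most `head_limit` of the values being split. The 0.4 default is an
    assumed threshold, not one stated in the text.
    """
    breaks = []
    current = list(values)
    while len(current) > 1:
        mean = sum(current) / len(current)
        head = [v for v in current if v > mean]
        if not head or len(head) / len(current) > head_limit:
            break
        breaks.append(mean)
        current = head
    return breaks

def ht_index(values, head_limit=0.4):
    """Number of times the scaling pattern recurs, plus one."""
    return len(head_tail_breaks(values, head_limit)) + 1

# Hypothetical sizes with far more small values than large ones.
sizes = [1] * 27 + [3] * 9 + [9] * 3 + [27]
print(ht_index(sizes))  # 4
```

In this sample, the pattern of far more small values than large ones recurs three times (heads of 13, 4, and 1 values), so the ht-index is four, in the same way that five recurrences give an ht-index of six in the worked example.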
Figure 3: (Color online) Nested rank-size plots for the first three hierarchical levels with respect to the
first three rows in Table 1
(Note: The x axis and y axis represent rank and size respectively. The largest plot contains the 504
edges, the red being the first head (135 edges) and the blue being the first tail (369 edges). The 135
edges are plotted again with the red representing 35 in the second head and the blue 100 in the second
tail. The smallest plot is for the 35 edges in the second head.)
Head/tail breaks or ht-index provides a simple yet effective means to characterize natural cities, or
data in general with a heavy tailed distribution for mapping purposes. The derived ht-index captures
the hierarchy or scaling hierarchy of the data. For mapping purposes, head/tail breaks is superior to
conventional classification methods for capturing the underlying scaling pattern (Jiang 2013).
The ht-index complements fractal dimension for characterizing the complexity of geographic features or
fractals in general.
3. Data and Data Processing
As stated above, the data for this study came from the former location-based social medium Brightkite,
during its three-year (31 months to be more precise) life span, from April 2008 to October 2010 (Cho,
Myers, and Leskovec 2011). The case study included 2,837,256 locations in the mainland United States.
From these locations, we removed duplicates, obtained 412,961 unique locations for
generating a TIN, and then 8,307 natural cities as of October 2010, by following the procedure shown
in Figure 2, as well as the short tutorial in the Appendix. The location data was time stamped (Table 2),
so we were able to slice all these locations monthly in an accumulated manner, i.e., locations at month
m_{i+1} contain all locations between months m_1 and m_i, where 1 ≤ i ≤ 31. For each time interval or
snapshot, we generated a set of natural cities ranging from dozens to thousands. For some snapshots,
we had to split the data into small pieces and reassemble them in ArcGIS for visualization and
analysis. For example, Figure 4 illustrates the 8,307 natural cities as of October 2010, showing their
boundaries and populations. Note that this is just one of the 31 snapshots or datasets in the study.
Table 2: Initial check-in data format
User   Check-in time         Latitude    Longitude    Location id
58186  2008-12-03T21:09:14Z  39.633321   -105.317215  ee8b88dea22411
58186  2008-11-30T22:30:12Z  39.633321   -105.317215  ee8b88dea22411
58186  2008-11-28T17:55:04Z  -13.158333  -72.531389   e6e86be2a22411
58186  2008-11-26T17:08:25Z  39.633321   -105.317215  ee8b88dea22411
58187  2008-08-14T21:23:55Z  41.257924   -95.938081   4c2af967eb5df8
Figure 4: (Color online) The largest set of natural cities as of October 2010 (red patches for boundaries
and red dots for populations) on the background of TIN (gray lines) generated from 412,961 unique
location points (2,837,256 points including duplicates)
Table 3: Measurements and statistics from location points to natural cities for the different time
(Note: Pnt = # of points, PntUniq = # of unique points, TINEdge = # of TIN edges, Mean = Average
length of TIN edges, NaturalCity = # of natural cities)
Table 3 lists some basic measurements and statistics from the location points to the natural cities. For
example, for the first month, April 2008, only 44 natural cities were generated from 3,784 locations, of
which 1,199 unique locations were used for generating a TIN with 3,580 edges, and a mean of 54,043
as the cutoff to derive the 44 natural cities. The number of natural cities increased to 8,307 as of
October 2010. During the 31 months, the number of natural cities increased rapidly at some points, e.g., by more
than four times from April to May 2008 and from March to April 2009. We do not know why there
were such rapid increases, but they could relate to advertising effects. In addition, there was a slight drop
in the number of natural cities from August to September 2010. In the following section, we utilize the
seven time intervals highlighted in Table 3 for a detailed discussion of our findings.
4. Results and Discussions
Before discussing the findings, we map the natural cities at the seven time intervals (or snapshots) for
the four largest natural cities, surrounding Chicago, New York, San Francisco, and Los Angeles. These are
shown in Figure 5, which illustrates clearly how the four cities or regions grew or expanded during the
31-month period. All parts of the country can be assessed for similar patterns of growth and evolution.
We know little about why the procedure shown in Figure 2, as well as in the Appendix, works so well,
but the resulting patterns suggest that the natural cities effectively capture the evolution of real cities.
On the one hand, the natural cities expanded towards more fragmented pieces, far more small pieces
than large ones. On the other hand, the physical boundaries of the natural cities tended to become
more irregular over time. These two aspects suggest that the natural cities are fractal, and become
more and more fractal, closely resembling real cities (Batty and Longley 1994). These two aspects
are further discussed in the following.
Figure 5: (Color online) Evolution of the natural cities near the four largest city regions with TIN as
a background
These results can be assessed from both global and local perspectives. Globally, all the natural cities in
the United States exhibit a power-law distribution. This is shown in the rank-size plots (Figure 6), in which
the distribution lines are very straight for all the natural cities at the different time intervals in the
log-log plots. The natural cities as of April 2008, except the smallest with fewer than 12 people, exhibited
a clear power law, probably the straightest distribution among all others. This result is the same in May
2008. However, the distribution lines from October 2008 to October 2010 are less straight, indicating
that a few of the largest natural cities did not fit the power-law distribution well. This is particularly
obvious for the last two snapshots in April 2009 and October 2010. A possible reason for this
difference, moving from a striking to a less striking power law, is described below.
Figure 6: (Color online) Rank-size plot for the natural cities
In further examinations, we looked at the large cities (larger than the mean) in each snapshot and
found that Zipf’s exponent was indeed around one for the first two months (0.98 and 1.08), and then
greater than one by about 0.25 (Table 4). Considering the two aspects of Zipf’s law, this result suggests that
Zipf’s law held remarkably well for the first two months, but less so for the remaining months. We
postulated a possible reason: The social medium users in the first two months increased proportionally
with the populations of real cities, thus leading to a striking Zipf’s law effect among the natural cities
because the populations of real cities are power-law distributed. Over time, large cities (particularly
a few of the largest cities such as New York) did not keep pace with the other cities in attracting more users.
In other words, beyond the first two months, the increase in social medium users became less
proportional to the real cities’ populations. As a result, Zipf’s law is less striking. We assess this point
further in the discussion of our findings from a local perspective below. In contrast to small deviations
of Zipf’s exponent, the ht-index increased from four to seven (Table 4). Note that ht-index is a
measure for characterizing complexity of fractals or of geographic features (Jiang and Yin 2014). The
increment of the ht-index implies that more hierarchical levels were added, reflecting well the
evolution of the natural cities and of the social medium.
Table 4: Zipf’s exponent and ht-index for the natural cities
          2008-04  2008-05  2008-10  2008-12  2009-03  2009-04  2010-10
Exponent  0.98     1.08     1.26     1.24     1.26     1.27     1.25
Ht-index  4        4        4        5        6        6        7
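For readers who wish to reproduce an exponent of the kind reported in Table 4, one simple option is a least-squares fit of log size against log rank. The text does not state how the exponents were estimated, so the fitting method below is an assumption, and the function name zipf_exponent and the synthetic sizes are hypothetical. Least squares on log-log data is known to be less reliable than maximum-likelihood estimation, so the sketch is only a first check.

```python
import math

def zipf_exponent(sizes):
    """Estimate Zipf's exponent as minus the least-squares slope of
    log(size) against log(rank), where rank 1 is the largest size."""
    ranked = sorted(sizes, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(size) for size in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -slope

# Synthetic sizes that follow N = 1000 / r exactly.
sizes = [1000 / rank for rank in range(1, 51)]
print(round(zipf_exponent(sizes), 2))  # 1.0
```

Following the procedure in the text, the input would be only the large natural cities of a snapshot (those larger than the mean), not all of them.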
Locally, there are two points to discuss. First, the boundaries of the natural cities became more
irregular over time, very much like the Koch curve as the number of iterations increases. For example, the
boundaries of the natural cities as of April 2008 were simple enough to be described by Euclidean
geometry. However, over time, the boundaries must be characterized by fractal geometry — more
fragmented with more fine scales added. Second, large natural cities tended to become larger and
larger, while small ones continuously emerged at local levels. Figure 5 illustrates this finding in a less
striking manner, as the city sizes are measured by the physical extents. But if the city sizes are
measured by population as in Figure 7, we notice rapid increases for the four largest cities.
Overall, the four cities tended to become larger and larger, but there was a major difference among the
four. To illustrate the difference, we must clarify that Figure 7 uses graduated dots to represent
the city sizes, which are classified according to head/tail breaks. This is because the city sizes
exhibited a heavy tailed distribution, or there were far more small cities than large ones. Therefore, the
dot sizes in Figure 7 do not represent city sizes, strictly speaking, but rather, the corresponding classes
to which the cities belong. Notice that the largest natural city in the New York region in October 2010
appears smaller than in April 2009, which indicates that the natural city belonged to a higher class in
April 2009 than in October 2010. This is indeed true! Table 5 clearly indicates that the New York
natural city in April 2009 belonged to the sixth among the six classes, while its position in October
2010 dropped to the fifth among the seven classes. This finding also describes what we stated above: A
few of the largest cities did not keep pace with the others in attracting more users.
Figure 7: (Color online) Evolution of the natural cities in terms of populations (or points)
near the four largest city regions
Table 5: Evolution of the four cities within the system of the natural cities
(Note: a/b where a and b respectively denote the class the particular city belongs to, and the total
number of classes or the ht-index)
               2008-04  2008-05  2008-10  2008-12  2009-03  2009-04  2010-10
Chicago        1/4      2/4      3/4      3/5      3/6      4/6      4/7
New York       2/4      3/4      3/4      4/5      4/6      6/6      5/7
San Francisco  3/4      4/4      4/4      5/5      5/6      6/6      6/7
Los Angeles    3/4      3/4      4/4      4/5      5/6      6/6      7/7
The above results or findings can be summarized by nonlinearity, which is reflected in both spatial and
temporal dimensions. Spatially, the natural cities were distributed heterogeneously or unevenly, i.e.,
there were far more small cities than large ones. This uneven distribution also was seen in the temporal
dimension. For example, within the first 10 months of 2008, the natural cities already had taken the
shapes of individual cities (Figure 5), with populations continuously growing, and small natural cities
being added persistently for the remaining time. In other words, it took just one third of the social
medium’s lifetime to determine the shapes of individual cities. That is also the reason that we chose
the seven unequal time intervals to examine the evolution.
5. Implications of the Study
Location-based social media provide large amounts of location data of significant value for
studying human activities in the virtual world, as well as on the Earth’s surface. Nowadays, the social
sciences — human geography in particular — benefit considerably from emerging social media data
that are time-stamped and location-based. The ways of doing geography and social sciences are
changing! The emerging big data harvested from social media, as well as from positioning and
geospatial technologies, coupled with data-intensive computing (Hey, Tansley, and Tolle 2009) are
transforming conventional social sciences into computational social sciences (Lazer et al. 2009). In
this section, we discuss some deep implications of this study for geography and social sciences in
general.
The notion of natural cities implies a sort of bottom-up thinking in terms of data collection and
geographic units or boundaries. Conventional geographic data collected and maintained from the top
down by authorities are usually sampled and aggregated, and therefore, are small-sized. On the other
hand, new data harvested from social media are massive and individual, so they are called ‘big data.’
Time-stamped and location-based social media data, supported by Web 2.0 technologies and
contributed by individuals through humans as sensors (Goodchild 2007), constitute a brilliant new data
source for geographic research. Conventional geographic units or boundaries are often imposed from
the top down by authorities or centralized committees, while natural cities are defined and delineated
objectively in some natural manner, based on the head/tail division rule. This natural manner
guarantees that we can see a true picture of urban structure and dynamics, and suggests the
universality of Zipf’s law. This true picture is fractal and can be illustrated with this example: Forcefully
throw a wine glass onto a cement floor, and it will very likely break into a large number of pieces.
Like the natural cities, these glass pieces are fractal or follow Zipf’s law: On the one hand, there are
far more small pieces than large ones, and on the other hand, each piece has an irregular shape.
The evolution of natural cities demonstrates nonlinearity at both spatial and temporal dimensions, or
equivalently from both static and dynamic points of view. Many phenomena in human geography, as
well as in physical geography, bear this nonlinearity (Batty and Longley 1994, Frankhauser 1994,
Chen 2009, and Phillips 2003). However, we are still very much constrained by linear thinking,
explicitly or implicitly, consciously or unconsciously. For example, we rely on Euclidean geometry to
describe Earth’s surface, and on a well-defined mean to characterize spatial heterogeneity. Our
mindsets apparently lag behind the advances of data and technologies. Conventional linear thinking is
not suitable for describing the Earth’s surface (the geographic forms), not to mention uncovering the
underlying geographic processes. Instead, we should adopt nonlinear thinking, or nonlinear
mathematics such as fractal geometry, chaos theories, and complexity for geographic research. The
tools adopted in this study, such as the head/tail division rule, head/tail breaks, and the ht-index, are grounded in
nonlinear mathematics and power-law-based statistics. These nonlinear mathematical tools help to
elicit new insights into the evolution of natural cities. Nonlinearity also implies that geographic forms
and processes are unpredictable like long-term weather or climate in general. To better predict and
understand geographical phenomena, we must seek to uncover the underlying mechanisms through
simulations rather than simple correlations.
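The head/tail breaks and ht-index mentioned above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and the 40% limit on the head's share is an assumption commonly used with head/tail breaks, not a value stated in this section.

```python
from statistics import mean

def head_tail_breaks(values, head_limit=0.4):
    """Return the successive mean cutoffs found by head/tail breaks.

    The data are split around their mean; the part above the mean (the head)
    is split again, as long as the head remains a minority of the data.
    head_limit is an assumed threshold for what counts as a minority.
    """
    breaks = []
    data = list(values)
    while len(data) > 1:
        m = mean(data)
        head = [v for v in data if v > m]
        # Stop when nothing lies above the mean or the head is not a minority.
        if not head or len(head) / len(data) > head_limit:
            break
        breaks.append(m)
        data = head
    return breaks

def ht_index(values, head_limit=0.4):
    """Number of classes produced by head/tail breaks (breaks plus one)."""
    return len(head_tail_breaks(values, head_limit)) + 1

# Hypothetical sizes with far more small values than large ones.
sizes = [1] * 8 + [2] * 4 + [4] * 2 + [8]
print(head_tail_breaks(sizes), ht_index(sizes))
```

For data in which all values are equal, no value exceeds the mean, so the sketch returns no breaks and an ht-index of 1.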
The head/tail division rule is intellectually exciting because it appears to be both powerful and
mysterious. The reason why the head/tail division rule is an effective tool to derive natural cities, in
particular at the different time stages, remains an open question. However, we tend to believe it is the
effect of the wisdom of crowds — the diverse and heterogeneous many are often smarter than the few,
even a few experts (Surowiecki 2004). The massive number of edges (up to 1,238,859) in the
TIN generated from the massive location points constituted the ‘crowds,’ and they collectively decided
an average cutoff for delineating the natural cities. Every single edge had ‘its voice heard’ in the
democratic decision. From the effectively derived natural cities, we can see an advantage of working
with big data. If we had not worked with the entire US data set, but only an area surrounding New
York for example, we would not have been able to determine a sensible cutoff for delineating the New
York natural city. Only with the big data that includes all location points or all edges can a meaningful
cutoff be determined and applied to all. In this sense, the approach to delineating natural cities is
holistic and bottom up, with participation of all diverse and heterogeneous individuals.
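The point about needing the whole data set can be illustrated with a toy calculation. The edge lengths below are made-up numbers, not values from the Brightkite data, and the function name is hypothetical; the sketch only shows how a mean computed over a single dense area differs from a mean computed over all edges.

```python
from statistics import mean

def short_edges(lengths, cutoff):
    """Keep the edges strictly shorter than the cutoff."""
    return [e for e in lengths if e < cutoff]

# Hypothetical TIN edge lengths (made-up numbers, arbitrary units).
city_edges = [100, 120, 150, 200, 250, 300]   # dense urban area
rural_edges = [20000, 50000, 80000]           # sparse countryside
all_edges = city_edges + rural_edges

global_cutoff = mean(all_edges)   # mean over the whole data set
local_cutoff = mean(city_edges)   # mean over the city subset only

kept_globally = short_edges(city_edges, global_cutoff)
kept_locally = short_edges(city_edges, local_cutoff)
print(len(kept_globally), len(kept_locally))
```

With these numbers, all six city edges fall below the global mean, whereas only three fall below the mean of the city subset, so a locally derived cutoff would cut the dense area itself into pieces.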
It is important to note that the check-in users are biased towards certain types of people. Thus the
derived natural cities are not exactly the same as the corresponding real cities. However, no one can
deny that the boundaries shown in Figure 5 are those of Chicago, New York, San Francisco, and
Los Angeles, in particular with respect to the last time interval 2010-10. One can simply cross-check
Google Maps to see what the cities or regions look like. On the other hand, this paper does not aim to study
real cities, at an individual level, in terms of how they can be captured or predicted by the natural cities, but to
understand, at a collective level, underlying mechanisms of agglomerations, formed either by people
in physical space (real cities), or by the check-in users in virtual space (natural cities). In other words,
we consider cities (either real or natural cities) as an emergence (Johnson 2002) developed from
interactions of individual people from the bottom up. We believe that the insights developed from social
media data, e.g., fractal structure and nonlinear dynamics, can be applied to real cities. The fact that not
all people are check-in users should not be considered a biased sampling issue. Sampling is an
inevitable technique in times of information scarcity, the so-called small data era, but it is not a
legitimate concept in the big data era. Large social media data imply N=all (Mayer-Schonberger
and Cukier 2013). This N=all is the essence of big data. Given the 2.8 million check-in locations,
the social media can be a good proxy for studying the evolution of real cities in the country.
We face an unprecedented golden era for geography, or social sciences in general, with the wave of
social media and, in particular, the increasing convergence of social media and geographic information
science (Sui and Goodchild 2011). For the first time in history, human activities can be documented at
very fine spatial and temporal scales. In this study, we sliced the data monthly, but we certainly could
have done so weekly, daily, and even hourly. We believe that the observed nonlinearity at the temporal
dimension would be even more striking. This, of course, warrants further study. Geographers should
ride the wave of social media and develop a more computationally minded geography or
computational geography (Openshaw 1998). If we do not seize this unique opportunity, we may risk
being purged from the sciences. The rise of computational social science is a timely response to the
rapid advances of data and technologies. In fact, physicists and computer scientists have already been
working in this exciting and rapidly changing domain (see Brockmann, Hufnagel, and Geisel 2006;
and Zheng and Zhou 2011). We geographers should do more rather than less.
6. Conclusion
Driven by the lack of data for tracking the evolution of cities, this study demonstrated that emerging
location-based social media such as Flickr, Twitter, and Foursquare can act as a proxy for studying and
understanding the underlying mechanisms by which cities evolve. Compared with conventional census data
that are usually sampled, aggregated, and small, the time-stamped and location-based social media
data can be characterized as all, individual, and big. In this paper, we abandoned conventional
definitions of cities, and adopted objectively or naturally defined natural cities, using massive
geographic information of various kinds, and based on the head/tail division rule. Built on the notion
of the wisdom of crowds, the head/tail division rule works very well to establish a meaningful cutoff
for delineating natural cities. Natural cities provide an effective means or unique perspective to study
human activities for better understanding of geographic forms and processes.
We examined the evolution of natural cities, derived from massive location points of the social
medium Brightkite, during its 31-month life span. We found nonlinearity during the evolution of
natural cities in both spatial and temporal dimensions, and the universality of Zipf’s law. We archived
all the data that could be of further use for developing and verifying urban theories. This study has
deep implications for geography and social sciences in light of the increasing amounts of data that can
be harvested from location-based social media. Therefore, we call for the application of nonlinear
mathematics, such as fractal geometry, chaos theories and complexity to geographic and social science
research. A limitation of this study lies in the data, which show only the social medium’s continuous rise
and not its decline. Brightkite seemed to disappear overnight. Future research should concentrate on
development of power-law-based statistics, and underlying nonlinear mathematics, to manage the
increasing social media data and on agent-based simulations to reveal the mechanisms for the
evolution of natural cities.
Acknowledgement
An early version of this paper was presented as a keynote address entitled “The evolution of natural cities:
a new way of looking at human mobility”, at Mobile Ghent '13, 23-25 October 2012, University of
Ghent, Belgium. XXXX
References
Bak P. (1996), How Nature Works: The science of self-organized criticality, Springer-Verlag: New
York.
Batty M. and Longley P. (1994), Fractal Cities: A geometry of form and function, Academic Press:
Benguigui L. and Blumenfeld-Lieberthal E. (2011), The end of a paradigm: Is Zipf’s law universal?
Journal of Geographical Systems, 13, 87–100.
Bennett J. (2010), OpenStreetMap: Be your own cartographer, Packt Publishing: Birmingham.
Berry B. J. L. and Okulicz-Kozaryn A. (2011), The city size distribution debate: Resolution for US
urban regions and megalopolitan areas, Cities, 29, S17-S23.
Boyd D. M. and Ellison N. B. (2008), Social network sites: Definition, history, and scholarship,
Journal of Computer-Mediated Communication, 13, 210 – 230.
Brockmann D., Hufnagel L., and Geisel T. (2006), The scaling laws of human travel, Nature, 439,
462–465.
Chen Y. (2009), Spatial interaction creates period-doubling bifurcation and chaos of urbanization,
Chaos, Solitons & Fractals, 42(3), 1316-1325.
Cho E., Myers S. A., and Leskovec J. (2011), Friendship and mobility: user movement in
location-based social networks, Proceedings of the 17th ACM SIGKDD international conference
on Knowledge discovery and data mining, ACM: New York, 1082-1090.
Frankhauser P. (1994), La Fractalité des Structures Urbaines, Economica: Paris.
Goodchild M. F. (2007), Citizens as sensors: The world of volunteered geography, GeoJournal, 69(4),
211-221.
Hey T., Tansley S., and Tolle K. (2009), The Fourth Paradigm: Data intensive scientific discovery,
Microsoft Research: Redmond, Washington.
Jia T. and Jiang B. (2010), Measuring urban sprawl based on massive street nodes and the novel
concept of natural cities, Preprint:
Jiang B. (2013), Head/tail breaks: A new classification scheme for data with a heavy-tailed distribution,
The Professional Geographer, 65 (3), 482 – 494.
Jiang B. and Jia T. (2011), Zipf's law for all the natural cities in the United States: a geospatial
perspective, International Journal of Geographical Information Science, 25(8), 1269-1281.
Jiang B. and Liu X. (2012), Scaling of geographic space from the perspective of city and field blocks
and using volunteered geographic information, International Journal of Geographical
Information Science, 26(2), 215-229.
Jiang B. and Yin J. (2014), Ht-index for quantifying the fractal or scaling structure of geographic
features, Annals of the Association of American Geographers, xx, xx-xx, preprint:
Johnson S. (2002), Emergence: The Connected Lives of Ants, Brains, Cities, and Software, Scribner:
New York.
Kaplan A. M. and Haenlein M. (2010), Users of the world, unite! The challenges and opportunities of
social media, Business Horizons, 53, 59—68.
Lazer, D., Pentland, A., Adamic, L., Aral, S., Barabási, A.-L., Brewer, D., Christakis, N., Contractor,
N., Fowler, J., Gutmann, M., Jebara, T., King, G., Macy, M., Roy, D., and Van Alstyne, M.
(2009), Computational social science, Science, 323, 721-724.
Mandelbrot B. (1982), The Fractal Geometry of Nature, W. H. Freeman and Co.: New York.
Mayer-Schonberger V. and Cukier K. (2013), Big Data: A revolution that will transform how we live,
work, and think, Eamon Dolan/Houghton Mifflin Harcourt: New York.
Openshaw S. (1984), The Modifiable Areal Unit Problem, Geo Books: Norwich, Norfolk.
Openshaw S. (1998), Towards a more computationally minded scientific human geography,
Environment and Planning A, 30, 317-332.
Phillips J. D. (2003), Sources of nonlinearity and complexity in geomorphic systems, Progress in
Physical Geography, 27(1), 1–23.
Sui D. and Goodchild M. (2011), The convergence of GIS and social media: challenges for GIScience,
International Journal of Geographical Information Science, 25(11), 1737–1748.
Surowiecki J. (2004), The Wisdom of Crowds: Why the Many Are Smarter than the Few, ABACUS:
Traynor D. and Curran K. (2012), Location-based social networks, In: Lee I. (editor, 2012), Mobile
Services Industries, Technologies, and Applications in the Global Economy, IGI Global:
Hershey, PA, 243 - 253.
Zheng Y. and Zhou X. (editors, 2011), Computing with Spatial Trajectories, Springer: Berlin.
Zipf G. K. (1949), Human Behavior and the Principles of Least Effort, Addison Wesley: Cambridge,
Appendix: Tutorial on How to Derive Natural Cities based on ArcGIS
This tutorial aims to show, step by step with ArcGIS, how to derive natural cities, using the
first month's data (2008-04) as an example. Once you have the check-in data for the first month,
transfer them into an Excel sheet with two columns, x and y, representing longitude
and latitude respectively. Add a third column z, and set all its values to one (or any arbitrary value, since
ArcGIS relies on 3D points for creating a TIN). Insert the Excel sheet as a shapefile data layer in
ArcGIS (Figure A1(a)). Create a TIN from the point layer, using ArcToolbox > 3D analyst tools > TIN
management > Create TIN (Figure A1(b)).
Figure A1: Screen snapshots including (a) the 1199 unique points, (b) the TIN from the 1199 points, (c)
the selected edges shorter than the mean 54042.8, and (d) the 44 natural cities created
Convert the TIN into TIN edge, using ArcToolbox > 3D analyst tools > Conversion > From TIN > TIN
Edge. The converted TIN edge is a polyline layer. Right click the polyline layer to Open Attribute
Table in order to get statistics about the length of the edges. Figure A2 shows that the frequency
distribution is clearly L-shaped, indicating that there are far more short edges than long ones.
Note that the mean is 54042.8.
Figure A2: Statistics of TIN edges
Select the edges shorter than this mean (54042.8) via the menu Selection > Select by Attributes....
The selected edges are highlighted in Figure A1(c). The selected shorter edges correspond to high-density
locations. Dissolve all the shorter edges into polygons, which become the individual natural cities (Figure A1(d)),
following ArcToolbox > Data Management Tools > Generalization > Dissolve, or alternatively
following menu Geoprocessing > Dissolve (the option ‘Create multipart features’ should be
The above steps can all be done with existing ArcGIS functions, since the first month's data set is small
enough. However, for some later months the whole process cannot be completed with existing ArcGIS
functions alone, because check-in points are added cumulatively. In this case, we must split
the data into small pieces and put them back together in ArcGIS with some simple code. All the
related code, together with the natural cities data, is archived at
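Outside ArcGIS, the selection and dissolve steps of this tutorial can be approximated with a short Python sketch. It is not the archived code referred to above. It assumes the TIN edges are already available as pairs of point indices, and it groups points joined by shorter-than-mean edges into clusters (a union-find stand-in for Dissolve) rather than building polygons. The function name, the sample points, and the hand-specified edges are all hypothetical.

```python
import math
from statistics import mean

def natural_city_clusters(points, edges):
    """Group points connected by TIN edges shorter than the mean edge length.

    points: list of (x, y) tuples; edges: list of (i, j) index pairs,
    assumed to come from a triangulation computed elsewhere.
    Returns the mean cutoff and the clusters as sorted lists of point indices.
    """
    lengths = [math.dist(points[i], points[j]) for i, j in edges]
    cutoff = mean(lengths)
    parent = list(range(len(points)))

    def find(a):
        # Follow parent links to the root, shortening the path on the way.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    touched = set()
    for (i, j), d in zip(edges, lengths):
        if d < cutoff:                 # keep only the shorter-than-mean edges
            parent[find(i)] = find(j)  # merge the clusters of both endpoints
            touched.update((i, j))

    clusters = {}
    for p in touched:
        clusters.setdefault(find(p), []).append(p)
    return cutoff, sorted(sorted(c) for c in clusters.values())

# Two tight groups of points far apart, with hand-specified edges.
pts = [(0, 0), (1, 0), (0, 1), (100, 100), (101, 100), (100, 101)]
tin_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (1, 3), (2, 3)]
cutoff, cities = natural_city_clusters(pts, tin_edges)
print(cities)
```

In this toy input the two long edges linking the groups exceed the mean and are dropped, so the points fall into two clusters, one per group.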

Supplementary resource (1)

... A common knowledge is that the process of urbanization involves four aspects: urban system, urban form, urban ecology, and urbanism [1,3]. Both urban system and urban form can be well described from the prospective of natural cities [32,33]. However, it is difficult to study urban ecology and urbanism through modern technology. ...
Full-text available
The phenomenon of Iks was first found by anthropologists and biologists, but it is actually a problem of human geography. However, it has not yet drawn extensive attention of geographers. Based on the relationship between urbanization and ikization, this paper is devoted to constructing a model to explain ikization. The research methods include literature-based analogy, mathematical modeling, empirical analysis, and numerical experiments. The main findings are as follows. First, the generalized production function can be used to model the behavior of ikization resulting from dramatic changes in the geographical environment and sudden cultural rupture. Second, nonlinear replacement dynamic models can be used to explain the possibility of rapid urbanization leading to ikization. Observational data is utilized to verify the fast urbanization mode, and numerical experimentation is employed to reveal the possible key factor causing ikization. The principal conclusions can be reached that social transition should adopt a relatively mild approach to changing, and protecting the geographical environment and reviving traditional culture contribute to national sustainable development.
... With the development of geospatial big data, Jiang and colleagues coined the term 'natural city' (NC), which leveraged different types of location-based social media data to delimit city boundaries at country and global scales using a cutoff that guides the clustering process and is determined objectively by long-tailed statistics underlying the adjacencies between geotagged web content (e.g. Jiang and Jia 2011;Jiang and Miao 2015;Ma, Sandberg, and Jiang 2017). ...
Full-text available
The underlying complexity of urban space can be manifested by its fractal forms and scaling statistics. This paper examines these characteristics at the intra-urban scale through the lens of clustered street junctions (including road ends) in two Chinese metropolitan areas: Beijing and Shenzhen. We derived the cluster sets with Euclidean distance thresholds starting at 100 meters (m) and ending at 1000m, and outlined each cluster using a concave-hull method to maintain their original irregular shapes. Within each delimited cluster, we examined four urban attributes: gross domestic product, number of street nodes, polygon area, and population. Our analysis revealed that power law distribution applied to almost every cluster set in terms of the four attributes, but varied from one attribute to another or from city to city, represented primarily by fluctuated power law exponents and ht-index values whose profiles along with the cluster growth can effectively characterize the urban structure. Additionally, we computed the spectrum of intra-urban scaling exponents with cluster size increments, contributing new insights into the allometric relationships between urban configuration and function.
... Following this, all the stay points are converted into a triangular irregular network (TIN) model and a higher-density area of stay points; in other words, recreational hotspots are identified based on the length of TIN edges. Herein, the interpretation on whether the density of an area is 'higher or not' is theoretically based on the discussions of urban heterogeneity or scaling (Jiang, 2018;Jiang & Liu, 2012;Jiang & Miao, 2015;Ma et al., 2020). Finally, incorporating the approach proposed by Liu, Singleton, et al. (2021), we characterize recreational hotspots through street segments. ...
Full-text available
Recreational activities are heterogeneously distributed throughout urban space, with far more low-density areas than high-density ones. Identification of recreational hotspots, or high-density areas, plays a critical role in urbanplanning. Nevertheless, from the perspective of urban heterogeneity, recreational hotspots remain inadequately understood for further theoretical and empirical investigations. Hence, based on the volunteered GPS trajectory data, we established a novel framework for effectively capturing recreational hotspots. The entire process can be divided into three steps: extracting stay points from individuals' tracks; clustering points by using heavy-tailed distribution statistics of the point-point proximities based on triangular irregular network (TIN); and generating the hotspots and integrating them with street segments. To assess the proposed framework, we started by introducing it in three typical Chinese cities and analyzing the reliability of the capturing process. Furthermore, taking one of the three cases as an example, we compared the proposed framework with current widely-used clustering methods, namely, K-means, DBSCAN and CFSFDP. The results show that the proposed framework performs well in both the empirical investigations and methodological comparisons, as it not only highlights the existing hotspots that are in line with general public perceptions, but also outperforms the three clustering algorithms in terms of fitness of the purpose, rapidity, and accuracy. Overall, this study extends urban heterogeneity to the application of the urban recreational system and provides potentials for its redesign and improvement.
... In this study, we identified the most significant urban settlement as the city's urban core. The benefit of doing this is that it identifies natural cities (the continuous geographic space where human activities gather as a result of urban self-organization processes) [56] instead of administrative cities imposed by the government for the convenience of governance. Figure 4 reveals the results of urban-rural resilience evaluation with resilience from ecological and social dimensions and urban core areas identified in this research. ...
Full-text available
The urban–rural system is an economically, socially, and environmentally interlinked space, which requires the integration of industry, space, and population. To achieve sustainable and coordinated development between urban and rural systems, dynamic land use change within the urban–rural system and the ecological and social consequences need to be clarified. This study uses system resilience to evaluate such an impact and explores the impact of land use change, especially land conversion induced by urbanization on regional development through the lens of urban–rural resilience. The empirical case is based on the Beijing-Tianjin-Hebei Urban Agglomeration (BTHUA) in China from 2000 to 2020 when there was rapid urbanization in this region. The results show that along with urbanization in the BTHUA, urban–rural resilience is high in urban core areas and low in peripheral areas. From the urban core to the rural outskirts, there is a general trend that comprehensive resilience decreases with decreased social resilience and increased ecological resilience in this region. Specifically, at the city level, comprehensive resilience decreases sharply from the urban center to its 3–5 km buffer zone and then remains relatively stable in the rural regions. A similar trend goes for social resilience at the city level, while ecological resilience increases sharply from the urban center to its 1–3 km buffer zone, and then remains relatively stable in the rural regions in this region, except for cities in the west and south of Hebei. This study contributes to the conceptualization and measurement of urban–rural resilience in the urban–rural system with empirical findings revealing the impact of rapid urbanization on urban–rural resilience over the last twenty years in the BTHUA in China. In addition, the spatial heterogeneity results could be used for policy reference to make targeted resilience strategies in the study region.
... Its importance is further highlighted by the fact that, due to global availability, Social Media studies are commonly focusing on areas of high Social Media usage, neglecting in a sense the question of how possible would it be to deploy the defined methods in a different context. To the best of the authors' knowledge, little has been done to showcase the potential similarities and differences of deploying Social Media data in transportation research in different cities with the few exceptions to be found on the analysis of the deploying of the natural cities concept by (Jiang and Miao, 2015), the exploratory investigation of millions of Twitter footprints with the extraction of radius gyration for users in USA cities (Cheng et al., 2021), the identification of tourist hot spots in European cities (García-Palomares et al., 2015), the study of how people experience the city on local and global scale through geotagged photos (Paldino et al., 2015), and the use of Social Media as a global mobility proximity (Hawelka et al., 2013). However, in all of the above cases, the methodological approach for the (in some cases indirect) comparison of different city comparison is based on the general tweets data collection (from the Twitter Streaming Application Programming Interface, API) that returns a fraction of the total tweets posted, without focusing on the posting characteristics of individual users. ...
Full-text available
Social Media have increasingly provided data about the movement of people in cities making them useful in understanding the daily life of people in different geographies. Particularly useful for travel analysis is when Social Media users allow (voluntarily or not) tracing their movement using geotagged information of their communication with these online platforms. In this paper we use geotagged tweets from 10 cities in the European Union and United States of America to extract spatiotemporal patterns, study differences and commonalities among these cities, and explore the nature of user location recurrence. The analysis here shows the distinction between residents and tourists is fundamental for the development of city-wide models. Identification of repeated rates of location (recurrence) can be used to define activity spaces. Differences and similarities across different geographies emerge from this analysis in terms of local distributions but also in terms of the worldwide reach among the cities explored here. The comparison of the temporal signature between geotagged and non-geotagged tweets also shows similar temporal distributions that capture in essence city rhythms of tweets and activity spaces.
Full-text available
This thesis seeks to elucidate the dynamics inherent to the emergence of spatial structures aimed at a type of consumer, creating new centralities in the urban space. To elucidate, the consumption behavior of a group was analyzed, through the understanding of the symbols and signs valued by them. It was understood that with the advent of the internet, social interactions that were once geographically limited became present in a virtual environment, where they play the role that once belonged to urban centralities. This new dynamic created centralities in the virtual sphere, where individuals with mutual interests gather together without the need to be in the same place, nor to interact in real time and in the same language. Therefore, it was presented in this thesis that the virtual occupies a space of centrality of subgroups (which would not necessarily be spatially close) with the power to modify the geographic space, by creating places that carry their symbols and meanings. The luxury fitness market was selected as the spatial anchor, as it has a large presence in the virtual sphere, is widely spread in urban places and presents a succinct number of facilities, which allowed the use of on-site research for four years (from 2017 to 2021) for the collection of primary data and an in-depth analysis of the symbols and signs of behavior and consumption for the target audience of this market segment.
Full-text available
Social media is a dashboard of an individual, group, or society that offers all the information with or without the knowledge of the users, and it is one of the most prominent online activity. The nature of social media data is determined on real-time experience, and it enhances the marketers to pick this for their social media marketing. Hence, this research focuses on how social media information and its usage influence marketing analytics that strongly enhancing market analysis for the product or services improvement. The data from social media are gathered, analysed, tested, and the results of the analysis are used for marketing planning, designing, and decision-making. There are various third party tools offered in the market, but, however, there are analytical tools offered by social networking sites which would assist a lay person decision analysis. This research is performed to understand which segmentation focuses on social media data, and how the freely available data influence even the unfocused group to do a marketing analysis. It helps the social media users to understand the audience insights, emotions of the other social media users, helps to compare brands and represents brand positioning and mainly focus the trends that is currently adopted amongst the society. The various indispensable features of social media pave way to emergence of various marketing analytics tools, and this also impacts social media marketing since the social networking sites are one of the powerful medium which is common amongst demographic dimensions that carry advertising and awareness of various of brands, products, information, and services. The research results that social media is one of the core feature stimuli for marketing analytics which enables effective social media marketing.KeywordsSocial networkingCustomer engagementAnalytics toolsAdvertisingPerception and recognition
Full-text available
Social media data have been widely used in natural sciences and social sciences in the past 5 years, benefiting from the rapid development of deep learning frameworks and Web 2.0. Its advantages have gradually emerged in urban design, urban planning, landscape architecture design, sustainable tourism, and other disciplines. This study aims to obtain an overview of social media data in urban design and landscape research through literature reviews and bibliometric visualization as a comprehensive review article. The dataset consists of 1220 articles and reviews works from SSCI, SCIE, and A&HCI, based on the Web of Science core collection, respectively. The research progress and main development directions of location-based social media, text mining, and image vision are introduced. Moreover, we introduce Citespace, a computer-network-based bibliometric visualization, and discuss the timeline trends, hot burst keywords, and research articles with high co-citation scores based on Citespace. The Citespace bibliometric visualization tool facilitates is used to outline future trends in research. The literature review shows that the deep learning framework has great research potential for text emotional analysis, image classification, object detection, image segmentation, and the expression classification of social media data. The intersection of text, images, and metadata provides attractive opportunities as well.
Full-text available
We are now seeing governments and funding agencies looking at ways to increase the value and pace of scientific research through increased or open access to both data and publications. In this point of view article, we wish to look at another aspect of these twin revolutions, namely, how to enable developers, designers and researchers to build intuitive,multimodal, user-centric, scientific applications that can aid and enable scientific research.
Full-text available
Geospatial analysis is very much dominated by a Gaussian way of thinking, which assumes that things in the world can be characterized by a well-defined mean, i.e., things are more or less similar in size. However, this assumption is not always valid. In fact, many things in the world lack a well-defined mean, and therefore there are far more small things than large ones. This paper attempts to argue that geospatial analysis requires a different way of thinking - a Paretian way of thinking that underlies skewed distribution such as power laws, Pareto and lognormal distributions. I review two properties of spatial dependence and spatial heterogeneity, and point out that the notion of spatial heterogeneity in current spatial statistics is only used to characterize local variance of spatial dependence. I subsequently argue for a broad perspective on spatial heterogeneity, and suggest it be formulated as a scaling law. I further discuss the implications of Paretian thinking and the scaling law for better understanding of geographic forms and processes, in particular while facing massive amounts of social media data. In the spirit of Paretian thinking, geospatial analysis should seek to simulate geographic events and phenomena from the bottom up rather than correlations as guided by Gaussian thinking. KEYWORDS: Big data, scaling of geographic space, head/tail breaks, power laws, heavy-tailed distributions
Spatial trajectories have been bringing the unprecedented wealth to a variety of research communities. A spatial trajectory records the paths of a variety of moving objects, such as people who log their travel routes with GPS trajectories. The field of moving objects related research has become extremely active within the last few years, especially with all major database and data mining conferences and journals. Computing with Spatial Trajectories introduces the algorithms, technologies, and systems used to process, manage and understand existing spatial trajectories for different applications. This book also presents an overview on both fundamentals and the state-of-the-art research inspired by spatial trajectory data, as well as a special focus on trajectory pattern mining, spatio-temporal data mining and location-based social networks. Each chapter provides readers with a tutorial-style introduction to one important aspect of location trajectory computing, case studies and many valuable references to other relevant research work. Computing with Spatial Trajectories is designed as a reference or secondary text book for advanced-level students and researchers mainly focused on computer science and geography. Professionals working on spatial trajectory computing will also find this book very useful.
The ability to gather and manipulate real-world contextual data, such as user location, in modern software systems presents opportunities for new and exciting application areas. A key focus among those working in the area of location-based services today has been the creation of social networks that allow mobile device users to exchange details of their personal location as a key point of interaction. While the initial interest in these services has been exceptionally high, they are plagued by the same challenges as all location-based services regarding the privacy and security of users and their data. This chapter aims to investigate the area of Location-Based Social Networks (LBSNs), with a view to documenting how they contribute to a new form of expertise made possible by accurate knowledge of where people are actually located at a given moment in time.
This presentation will set out the eScience agenda by explaining the current scientific data deluge and the case for a “Fourth Paradigm” for scientific exploration. Examples of data intensive science will be used to illustrate the explosion of data and the associated new challenges for data capture, curation, analysis, and sharing. The role of cloud computing, collaboration services, and research repositories will be discussed.
History tells us that when you want something done you turn to a leader: right? Wrong. If you want to make a correct decision or solve a problem, large groups of people are smarter than a few experts. This brilliant and insightful book shows why the conventional wisdom is so wrong and why the theory of the wisdom of crowds has huge implications for how we run our businesses, structure our political systems and organise our society. Shrewd, meticulous and profound, The Wisdom of Crowds will change for ever the way you think about human behaviour.
Four phases of interest in the distribution of city sizes are identified and current conflict in the literature is shown to be a consequence of poorly-selected units of observation. When urban regions are properly defined, US urban growth obeys Gibrat’s Law and the city size distribution is strictly Zipfian rank-size with coefficient q = 1.0. Care has to be taken with definition of the largest urban-economic regions, however; the fit in the upper tail of the distribution is best when they are recognized to be megalopolitan in scale.
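The rank-size relation mentioned above, in which the city of rank r has a size proportional to 1 / r^q, can be checked with a short sketch. It estimates q as the negated slope of an ordinary least-squares line fitted to log(size) against log(rank). The function name `rank_size_exponent` and the synthetic sizes are assumptions made for illustration; log-log regression is used here only because it is simple, and it is not claimed to be the estimation method of the abstract.

```python
import math


def rank_size_exponent(sizes):
    """Estimate q in size ~ C / rank**q by least squares on log-log values."""
    ordered = sorted(sizes, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(size) for size in ordered]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return -covariance / variance  # the slope is -q


# Synthetic, exactly Zipfian sizes (hypothetical data, not real cities).
synthetic_sizes = [1000.0 / rank for rank in range(1, 11)]
q = rank_size_exponent(synthetic_sizes)
print(round(q, 6))
```

On real data the estimate depends strongly on how the units of observation are delineated, which is the point the abstract makes about properly defined urban regions.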