ArticlePDF Available

Self-organized Natural Roads for Predicting Traffic Flow: A Sensitivity Study

Authors:
  • HKUST(GZ)

Abstract and Figures

In this paper, we extended road-based topological analysis to both nationwide and urban road networks, and concentrated on a sensitivity study with respect to the formation of self-organized natural roads based on the Gestalt principle of good continuity. Both Annual Average Daily Traffic (AADT) and Global Positioning System (GPS) data were used to correlate with a series of ranking metrics including five centrality-based metrics and two PageRank metrics. It was found that there exists a tipping point from segment-based to road-based network topology in terms of correlation between ranking metrics and their traffic. To our big surprise, (1) this correlation is significantly improved if a selfish rather than utopian strategy is adopted in forming the self-organized natural roads, and (2) point-based metrics assigned by summation into individual roads tend to have a much better correlation with traffic flow than line-based metrics. These counter-intuitive surprising findings constitute emergent properties of self-organized natural roads, which are intelligent enough for predicting traffic flow, thus shedding substantial insights into the understanding of road networks and their traffic from the perspective of complex networks. Keywords: topological analysis, traffic flow, phase transition, small world, scale free, tipping point
Content may be subject to copyright.
1
Self-organized Natural Roads for Predicting Traffic Flow: A Sensitivity Study
Bin Jiang, Sijian Zhao and Junjun Yin
Department of Land Surveying and Geo-informatics
The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
Email: bin.jiang@polyu.edu.hk
(Dated: July 5, 2008)
Abstract
In this paper, we extended road-based topological analysis to both nationwide and urban road networks, and
concentrated on a sensitivity study with respect to the formation of self-organized natural roads based on the
Gestalt principle of good continuity. Both Annual Average Daily Traffic (AADT) and Global Positioning
System (GPS) data were used to correlate with a series of ranking metrics including five centrality-based metrics
and two PageRank metrics. It was found that there exists a tipping point from segment-based to road-based
network topology in terms of correlation between ranking metrics and their traffic. To our big surprise, (1) this
correlation is significantly improved if a selfish rather than utopian strategy is adopted in forming the self-
organized natural roads, and (2) point-based metrics assigned by summation into individual roads tend to have a
much better correlation with traffic flow than line-based metrics. These counter-intuitive surprising findings
constitute emergent properties of self-organized natural roads, which are intelligent enough for predicting traffic
flow, thus shedding substantial insights into the understanding of road networks and their traffic from the
perspective of complex networks.
Keywords: topological analysis, traffic flow, phase transition, small world, scale free, tipping point
1. Introduction
Natural roads are joined road segments based on the Gestalt principle of good continuity, and they are self-
organized in nature. Let us assume that every segment at each end chooses one most suitable neighboring
segment with a smallest deflection angle to join together; and this process (refer to Figure 16 later in the text for
an illustration) goes on until the deflection angle is greater than a preset threshold (e.g. degree 45). This process
resembles Bak’s sandpile (Bak, Tang and Wiesenfield 1987; Bak 1996) in which sand is added continuously to
generate different sizes of avalanche. The size of avalanches exhibits a universal regularity of power law
distribution, the same behavior demonstrated by the connectivity and length of natural roads (c.f. a later section
on more discussions). There is also no typical size of natural roads. More than the avalanches, natural roads
demonstrate a sort of collective intelligence (Surowiecki 2004) that is able to predict traffic flow.
Self-organized natural roads, or strokes in terms of Thomson (2003), differ from named roads that are identified
by unique names (Jiang and Claramunt 2004). Named roads are more difficult to implement than natural roads
because of the incomplete nature of road databases, in which some segments may have missing or wrong names.
On the other hand, the formation of natural roads is significantly biased by join principles and deflection angle
threshold in the join process. In other words, there is a sensitivity issue involved in the formation of natural roads.
How each segment determines to join with one of its neighboring segments follows a self-organized process
based on three different join principles. The first is called every-best-fit, and it works like this. Every pair of
segments at a junction point have to negotiate with each other to have best fit (i.e., the one with a smallest
deflection angle), in terms of which one joins which one. This principle is rather utopian or communism in
nature, and seems the best strategy. The second is called self-best-fit. Instead of every, each segment only
considers itself to find a best fit, and does not care about others in the process. Thus it is rather selfish or
capitalism in nature. Or it can be deemed a natural selection. Similar to this selfish principle, there is another one
called self-fit. Obviously each segment tries to choose arbitrarily one fit, i.e., the one with a deflection angle less
than a preset threshold, to join, but not necessarily to be the best fit. In comparison (c.f. Figure 1 for an
illustration, and Appendix B for algorithms), the first principle always leads to a unique set of natural roads,
while the other two principles would generate enormous sets of natural roads, depending on the search order of
the segments. Figures 1c and 1d are just one of many possible sets for each principle. It is one of the sensitivity
issues we intend to study in the paper.
2
This paper is intended to investigate the join principles and deflection angle threshold with respect to the
formation of natural roads, and their correlation to Annual Average Daily Traffic (AADT) and GPS data (both
referred to as traffic flow or flow in what follows). We found that there exists a tipping point from segment-
based to road-based network topology in terms of correlation between ranking metrics and their traffic flow (or
metric-flow correlation). To our big surprise, (1) the correlation based on the principles of self-best-fit and self-
fit is much better than that based on the principle of every-best-fit, and (2) point-based metrics (in particular for
local and global integrations, see Appendix C) assigned by summation into individual roads tend to have a much
better correlation with traffic flow than line-based metrics. These counter-intuitive surprising findings provide
telling evidence that self-organized natural roads are an emergence developed from individual road segments,
and are intelligent enough for predicting traffic flow, thus shedding substantial insights into the understanding of
road networks and their traffic from the perspective of complex networks.
The remainder of this paper is structured as follows. Section 2 introduces geometric and topological
representations of road networks using a notional road network, in particular different transformations from
geometric to topological representation using the three join principles. We brief data sources and essential
processing of road networks as well as observed traffic in section 3. Section 4 illustrates our experiments and
findings about various sensitivity issues. Section 5 speculates in detail on the emergent properties of natural
roads and their implications. Finally section 6 concludes the paper with a summary and future work.
2. Geometric versus topological representations of road networks
Although road networks can be abstracted as graphs, represented by a Point-Point Distance Matrix (PPDM, c.f.
A-1 for an example), we still refer to them as a geometric representation (the network in gray in Figure 1a). This
is based on the facts that (1) the junction points have precise geometric coordinates referenced to the earth, and
(2) the distances between the pairs of points are a major concern for the representation. The points are defined in
a Euclidian space, and the distances are taken by the matrix as its elements. Even though the distance matrix can
be further abstracted topologically into a Point-Point Connectivity Matrix (PPCM, c.f. A-2 or the connectivity
graph in Figure 1a), it still cannot be regarded as a true topology because of a lack of an interesting structure or
pattern. We can remark that the networks or graphs have a very boring connectivity structure, because of the lack
of variation in connectivity for individual points. The same observation can be made in reality where most
junctions have a degree of 4, and a very few have a degree less or greater than 4. However, things would be
rather different if we take a truly topological view (c.f. Figure 4 also).
Junction s or ends Road sement s
1
2
3
4
5
6
78
9
10
11
12
13
14
15
Nodes repre senting segments Links if two se gments intersected
(a)
1
2
3
4
5
6
78
9
10
11
12
13
14
15
Nodes representing roads Lin ks if tw o ro ads i nters ec ted
a
b
c
d
e
f
g
(b)
1
2
3
4
5
6
78
9
10
11
12
13
14
15
Nodes representing roads Links if two road s inter sected
(c)
1
2
3
4
5
6
78
9
10
11
12
13
14
15
Nodes representing roads Links if two roads i ntersected
(d)
Figure 1: (color online) A notional road network and its connectivity graphs: (a) segment-based connectivity
graph, and road-based connectivity graphs with respect to different join principles of (b) every-best-fit, (c) self-
best-fit, and (d) self-fit
3
This topological view takes a higher level (macro-scale) of abstraction, which considers adjacent relationships of
individual roads, i.e., a sort of road-road intersection. Seven intersected roads can be transformed into a Line-
Line Adjacency Matrix (LLAM, c.f. A-3), which is equivalent to the connectivity graphs shown in Figure 1b,
Figure 1c and Figure 1d with respect to the join principles of every-best-fit, self-best-fit and self-fit. The matrix
contains all information about the adjacency: its elements are set to 1 if the corresponding roads are intersected
and 0 otherwise. The graph based on LLAM contains no geometric information but the binary relation (1/0), i.e.,
no coordinates or distances attached to the nodes and links. However, the graph demonstrates some interesting
connectivity structure, e.g., the distribution of connectivity of the nodes is skewed significantly. It sets a clear
difference from the underlying road networks.
Another topological representation in terms of point and point relationship can be developed. The point-point
relationship is set up based on whether or not a pair of points share a road in common, i.e. 1 if yes and 0
otherwise in the corresponding Point-Point Adjacency Matrix (PPAM, c.f. A-4). This alternative graph also
demonstrates an interesting structure in terms of variation of connectivity. It is important to note the relationship
of the two topological representations. They are closely related, and can be easily derived through the operation
of multiplication of Line-Point Incidence Matrix (LPIM, c.f. A-5) and its transpose.
The LLAM-based representation is well developed and applied in space syntax community (Hillier and Hanson
1984), but it is far less so for the PPAM based representation. For sake of simplicity and intuition, we will
respectively call them line-based and point-based approaches (Jiang and Claramunt 2002). Alternatively, they
are named as primal and dual representations (Batty 2004; Jiang and Liu 2007). They are a powerful tool for
obtaining structure and patterns, thus an important analytical model for predicting traffic flow (e.g., Jiang and
Liu 2007). However, the dual relationship has yet to be applied, and a sensitivity study about the prediction
deserves further investigation as shown later in this paper.
3. Data sources and processing
A main dataset for the study was obtained from the Swedish Road Administration (Vägverket), and it contains
both road networks and AADT assigned to each individual road segment. It should be noted that it is a massive
dataset, involving in total ~ 45 000 segments, ~ 100 000 kilometers in length, across Sweden (Figure 2a). The
entire road network is divided into seven regions (Figure 2b). We keep the seven networks for separate
investigations for consistent checking of our findings, and in the mean time merge them together as an entire
network for some experiments. Before the experiments, isolated segments were removed to ensure all roads are
interconnected. It is important to note that the percentage of isolated segments is very low (< 0.5%), except for
the region of Stockholm, which is a bit higher. This is probably a partial reason that it shows some special
behavior compared with other regions. As for the Gävle urban street network (Figure 2c), it consists of ~3 400
segments, and traffic flow is obtained from GPS log files, by one taxi company (Taxi 2007), recording locations
of 50 taxicabs every 10 seconds. The GPS dataset has been preprocessed to ensure the recoded locations are truly
trajectories. We have in total seven days (October1-7, 2007) of such data for consistent checking, the same data
used in Jiang (2006). Before any further experiments, we make sure the networks are truly road segment based.
In case they are not, we join separate parts together to be one segment between two junctions, in order to form
the kind of road network shown in Figure 1. In the course of this process, traffic flows of the separate parts are
averaged to the segment.
(a)
(b)
(c)
Figure 2: (color online) nationwide road networks in Sweden (a) divided into seven regions (b) and Gävle urban
street network (c) (NOTE: the red spot in b is the location of the city Gävle)
4
4. Experiments and findings
4.1 Overall statistics on segments versus roads
We first examine the sensitivity of deflection angle threshold in forming natural streets. We chose every fifth
degree as an interval between degree 0 and 90 to examine how many roads generated from individual segments
with respect to the series of threshold angles. As plotted in Figure 3, the number of roads drops from degree 0 to
degree 5 dramatically, and continuously yet slowly till degree 30. And, the number tends to become rather stable
from degree 30 onwards. This observation is valid for both nationwide and urban road networks. Intuitively,
degree 45 appears to be an ideal threshold angle that helps generate natural roads with a good continuity. The
number of roads becomes stable around ~15 000 for the nationwide network, and around ~ 1 100 for the urban
street network. In this respect, both the nationwide road network and urban street network shows a significant
similarity.
10000
15000
20000
25000
30000
35000
40000
45000
50000
0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95
(a)
500
1000
1500
2000
2500
3000
3500
4000
0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95
(b)
Figure 3: The number of natural roads drops as the threshold angle rises: the case of the entire nationwide road
network (a) and Gävle urban street network (b)
It is important to note the fact that the natural roads generated by the threshold angle 0 are identical to the
segments. There are a couple of exceptions, where two adjacent segments have no angle change at all. The
distribution of segment and road connectivity (when the threshold angle is set to 45) is very different: the former
is a normal like distribution, while the latter is a power law distribution (Figure 4). We can remark that over 60%
of segments have a connectivity of 4 (i.e., a typical connectivity), and maximum connectivity is not more than 11.
On the other hand, maximum road connectivity is over 220, over 80% of roads have a connectivity less than 4,
and there is no typical connectivity for natural roads. This diversity of road length and connectivity has been
illustrated in a previous study, with a big sample of American cities and expressed by the 80/20 principle (Jiang
2007). Other researchers (e.g. Gastner and Newman 2006; Porta, Crucitti and Latora 2006) have also studied
spatial networks from the perspective of complex networks with some interesting findings.
0.0%
10.0%
20.0%
30.0%
40.0%
50.0%
60.0%
70.0%
1234567891011
(a)
(b)
Figure 4: Distribution of (a) segment connectivity and (b) road connectivity, whose log-log plot shows a straight
line (the inset)
4.2 Findings based on the line-based approach
In what follows, we will demonstrate how ranking metrics (c.f. Appendix C) correlate to traffic flow, and how
the metric-flow correlation alters with respect to the threshold angle and join principles. Before that, we examine
whether or not the distribution of the metrics and traffic flow shows a universal regularity of power law with
respect to the three join principles. From the plots in Figure 5, we can observe that except local and global
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
1 14 27 40 53 66 79 92 105 118 131 144 157 170 183 196 209 222
0%
0%
0%
1%
10%
100%
1 10 100 1000
5
integrations, all other metrics, as well as traffic flow (threshold angle is set to 45), exhibit a power law
distribution. Clearly, natural roads generated by the principle of self-best-fit have a more striking power law than
the other two. Overall, all the power law distributions are very similar, but there is a striking similarity between
that of PageRank and connectivity (or control) metrics, and between flow and betweenness (or weighted
PageRank) metrics. This is not particularly surprising, since (1) for an undirected graph PageRank scores reflect
connectivity, and (2) the definition of betweenness metric is based on the concept of flow.
0.00%
0.01%
0.10%
1.00%
10.00%
100.00%
110100
EveryBestFit
SelfBestFit
SelfFit
(a)
0.01%
0.10%
1.00%
10.00%
100.00%
110100
EveryBest Fit
SelfBestFit
SelfFit
(b)
0.00%
0.01%
0.10%
1.00%
10.00%
100.00%
1 10 100
EveryBes tFit
SelfBestFit
SelfFit
(c)
0.00%
0.01%
0.10%
1.00%
10.00%
100.00%
1 10 100
EveryBes tFit
SelfBestFit
SelfFit
(d)
0.00%
0.01%
0.10%
1.00%
10.00%
100.00%
1 10 100
EveryBes tFit
SelfBestFit
SelfFit
(e)
0.0%
0.0%
0.1%
1.0%
10.0%
100.0%
110100
EveryBestF it
SelfBestFit
SelfFit
(f)
0.00%
0.01%
0.10%
1.00%
10.00%
100.00%
1 10 100
EveryBes tFit
SelfBestFit
SelfFit
(g)
0.00%
0.01%
0.10%
1.00%
10.00%
100.00%
110100
EveryBes tFit
SelfBestFit
SelfFit
(h)
Figure 5: (color online) Log-log plots of (a) connectivity, (b) control, (c) betweenness, (d) PageRank (d = 0.20),
(e) weighted PageRank (d = 0.20), (f) flow (threshold angle = 45), (g) local integration and (h) global integration
6
0.01%
0.10%
1.00%
10.00%
100.00%
1 10 100
EveryBestF it
SelfBestFit
SelfFit
(a)
0.01%
0.10%
1.00%
10.00%
100.00%
110100
EveryBes tFit
SelfBestFit
SelfFit
(b)
0.01%
0.10%
1.00%
10.00%
100.00%
110100
EveryBes tFit
SelfBestFit
SelfFit
(c)
0.01%
0.10%
1.00%
10.00%
100.00%
110100
EveryBest Fit
SelfBestFit
SelfFit
(d)
0.01%
0.10%
1.00%
10.00%
100.00%
1 10 100
EveryBestF it
SelfBestFit
SelfFit
(e)
0.01%
0.10%
1.00%
10.00%
100.00%
1 10 100
EveryBes tFit
SelfBestFit
SelfFit
(f)
0.01%
0.10%
1.00%
10.00%
100.00%
110100
EveryBes tFit
SelfBestFit
SelfFit
(g)
0.01%
0.10%
1.00%
10.00%
100.00%
1 10 100
EveryBes tFit
SelfBestFit
SelfFit
(h)
Figure 6: (color online) Log-log plots of (a) connectivity, (b) control, (c) betweenness, (d) PageRank (d = 0.95),
(e) weighted PageRank (d = 0.95), (f) flow (threshold angle = 45), (g) local integration and (h) global integration
To fully examine the metric-flow correlation, we applied them to the nationwide road networks. Figures 7, 8 and
9 demonstrate one set of results with respect to the principles of every-best-fit, self-best-fit and self-fit for the
region of Sydost. We can observe that in all three cases segments (threshold angle = 0) have no metric-flow
correlation at all, with R square values being 0. However, this correlation rises gradually with the increase of the
threshold angle until degree 15, and then becomes stable for a while. It appears that degree 15 is a kind of tipping
point where R square values reach a maximum and become stable until degree 90. Of the seven metrics,
weighted PageRank, PageRank, connectivity, and control (with a decreasing order) are the best ones in terms of
7
metric-flow correlation. The poorest are local and global integrations, while the betweenness metric is
somewhere between the best and poorest. Cross checking Figures 7, 8 and 9, self-best-fit (Figure 8) is the best
option (R square over 0.75). We can also note that the damping factor d around 0.20 seems the best choice for
the nationwide networks. It is important to note that in theory connectivity is equivalent to PageRank for an
undirected graph. However, due to the different damping factor d values, the actual metric-flow correlations may
not be identical. This is clearly reflected in the plots. The observations are pretty consistent among the seven
regions of the nationwide road network. Thus, only one region is used for illustration purpose.
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0 1 2 3 4 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90
Angle
R square
Connect
Control
GInteg
LInteg
Between
(a)
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0123451015202530354045505560657075808590
Angle
R square
d=1.00
d=0.95
d=0.90
d=0.85
d=0.80
d=0.70
d=0.60
d=0.50
d=0.40
d=0.30
d=0.20
d=0.15
d=0.10
d=0.05
d=0.01
(b)
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0 1 2 3 4 5 1015202530354045505560657075808590
Angle
R square
d=1.00
d=0.95
d=0.90
d=0.85
d=0.80
d=0.70
d=0.60
d=0.50
d=0.40
d=0.30
d=0.20
d=0.15
d=0.10
d=0.05
d=0.01
(c)
Figure 7: (color online) Correlation coefficient (R square) between traffic flow and (a) five centrality-based
metrics, (b) PageRank and (c) weighted PageRank, with respect to the threshold angle 45, based on the principle
of every-best-fit and using the case of the Sydost region
(NOTE: for both PageRank and weighted PageRank, they have a series of PageRank scores with respect to
different damping factor d values)
8
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0 1 2 3 4 51015202530354045505560657075808590
Angle
R square
Connect
Control
GInteg
LInteg
Between
(a)
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0 1 2 3 4 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90
Angle
R square
d=1.00
d=0.95
d=0.90
d=0.85
d=0.80
d=0.70
d=0.60
d=0.50
d=0.40
d=0.30
d=0.20
d=0.15
d=0.10
d=0.05
d=0.01
(b)
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0 1 2 3 4 5 1015202530354045505560657075808590
Angle
R square
d=1.00
d=0.95
d=0.90
d=0.85
d=0.80
d=0.70
d=0.60
d=0.50
d=0.40
d=0.30
d=0.20
d=0.15
d=0.10
d=0.05
d=0.01
(c)
Figure 8: (color online) Correlation coefficient (R square) between traffic flow and (a) five centrality-based
metrics, (b) PageRank and (c) weighted PageRank, with respect to the threshold angle, based on the principle of
self-best-fit and using the case of the Sydost region
(NOTE: for both PageRank and weighted PageRank, they have a series of PageRank scores with respect to
different damping factor d values. The curves are the averaged result of 20 experiments.)
9
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0 1 2 3 4 5 1015202530354045505560657075808590
Angle
R square
Connect
Control
GInteg
LInteg
Between
(a)
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0 1 2 3 4 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90
Angle
R square
d=1.00
d=0.95
d=0.90
d=0.85
d=0.80
d=0.70
d=0.60
d=0.50
d=0.40
d=0.30
d=0.20
d=0.15
d=0.10
d=0.05
d=0.01
(b)
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0 1 2 3 4 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90
Angle
R square
d=1.00
d=0.95
d=0.90
d=0.85
d=0.80
d=0.70
d=0.60
d=0.50
d=0.40
d=0.30
d=0.20
d=0.15
d=0.10
d=0.05
d=0.01
(c)
Figure 9: (color online) Correlation coefficient (R square) between traffic flow and (a) five centrality-based
metrics, (b) PageRank and (c) weighted PageRank, with respect to the threshold angle 45, based on the principle
of self-fit and using the case of the Sydost region
(NOTE: for both PageRank and weighted PageRank, they have a series of PageRank scores with respect to
different damping factor d values. The curves are the averaged result of 20 experiments.)
Similar findings can be observed with the Gävle urban street network (Figures 10, 11 and 12). For instance, no
metric-flow correlation exists at all for segments, but a significant correlation for streets. However, the
correlation tends to become stable for PageRank metrics between degree 30 and 75, rather than between 15 and
90 in the previous case of nationwide networks. Weighted PageRank (when d = 0.95) is still the best metric (R
square over 0.7) in terms of metric-flow correlation. It is followed by betweenness, whose R square is over 0.6.
This result conforms to previous studies (Hillier and Iida 2005; Turner 2007). Again both local and global
10
integrations are the poorest in terms of metric-flow correlation. More importantly, the principle self-best-fit is
proved to be the best option. As for a possible reason, we will speculate on it later on.
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0 1 2 3 4 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90
Angle
R square
Connect
Control
GInteg
LInteg
Between
(a)
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0123451015202530354045505560657075808590
Angle
R square
d=1.00
d=0.95
d=0.90
d=0.85
d=0.80
d=0.70
d=0.60
d=0.50
d=0.40
d=0.30
d=0.20
d=0.15
d=0.10
d=0.05
d=0.01
(b)
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0 1 2 3 4 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90
Angle
R square
d=1.00
d=0.95
d=0.90
d=0.85
d=0.80
d=0.70
d=0.60
d=0.50
d=0.40
d=0.30
d=0.20
d=0.15
d=0.10
d=0.05
d=0.01
(c)
Figure 10: (color online) Correlation coefficient (R square) between traffic flow and (a) five centrality-based
metrics, (b) PageRank and (c) weighted PageRank, with respect to the threshold angle 45, based on the principle
of every-best-fit and using the case of the Gävle street network and one day traffic flow
(NOTE: for both PageRank and weighted PageRank, they have a series of PageRank scores with respect to
different damping factor d values)
11
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0 1 2 3 4 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90
Angle
R square
Connect
Control
GInteg
LInteg
Between
(a)
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0 1 2 3 4 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90
Angle
R square
d=1.00
d=0.95
d=0.90
d=0.85
d=0.80
d=0.70
d=0.60
d=0.50
d=0.40
d=0.30
d=0.20
d=0.15
d=0.10
d=0.05
d=0.01
(b)
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0 1 2 3 4 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90
Angle
R square
d=1.00
d=0.95
d=0.90
d=0.85
d=0.80
d=0.70
d=0.60
d=0.50
d=0.40
d=0.30
d=0.20
d=0.15
d=0.10
d=0.05
d=0.01
(c)
Figure 11: (color online) Correlation coefficient (R square) between traffic flow and (a) five centrality-based
metrics, (b) PageRank and (c) weighted PageRank, with respect to the threshold angle 45, based on the principle
of self-best- fit and using the case of the Gävle urban street network and one day traffic
(NOTE: for both PageRank and weighted PageRank, they have a series of PageRank scores with respect to
different damping factor d values. The curves are the averaged result of 20 experiments.)
12
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0 1 2 3 4 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90
Angle
R square
Connect
Control
GInteg
LInteg
Between
(a)
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0123451015202530354045505560657075808590
Angle
R square
d=1.00
d=0.95
d=0.90
d=0.85
d=0.80
d=0.70
d=0.60
d=0.50
d=0.40
d=0.30
d=0.20
d=0.15
d=0.10
d=0.05
d=0.01
(b)
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0 1 2 3 4 5 1015202530354045505560657075808590
Angle
R square
d=1.00
d=0.95
d=0.90
d=0.85
d=0.80
d=0.70
d=0.60
d=0.50
d=0.40
d=0.30
d=0.20
d=0.15
d=0.10
d=0.05
d=0.01
(c)
Figure 12: (color online) Correlation coefficient (R square) between traffic flow and (a) five centrality-based
metrics, (b) PageRank and (c) weighted PageRank, with respect to the threshold angle, based on the principle of
self-fit and using the case of the Gävle urban street network and one day traffic
(NOTE: for both PageRank and weighted PageRank, they have a series of PageRank scores with respect to
different damping factor d values. The curves are the averaged result of 20 experiments.)
4.3 Findings based on the point-based approach
In this experiment, we adopt the point-based approach for forming connectivity graphs, and then assign point-
based metrics by summation into individual roads. Surprisingly, both local and global integrations, previously
demonstrated no scaling property using the line-based approach (Figures 5 and 6), exhibit a striking power law
distribution (Figures 13 and 14).
13
0.00%
0.01%
0.10%
1.00%
10.00%
100.00%
110100
EveryBestFit
SelfBestFit
SelfFit
(a)
0.00%
0.01%
0.10%
1.00%
10.00%
100.00%
110100
EveryBestFit
SelfBestFit
SelfFit
(b)
Figure 13: Log-log plots of local (a) and global (b) integration using the point-based approach (the case of the
entire nationwide road network)
0.01%
0.10%
1.00%
10.00%
100.00%
1 10 100
EveryBestFit
SelfBestFit
SelfFit
(a)
0.01%
0.10%
1.00%
10.00%
100.00%
110100
EveryBestFit
SelfBestFit
SelfFit
(b)
Figure 14: (color online) Log-log plots of local (a) and global (b) integration using the point-based approach (the
case of the Gävle street network)
There is a significant improvement for local and global integrations in terms of metric-flow correlation. There is
no correlation (R square values less than 0.20) for local and global integrations with the line-based approach.
However, R square values reach to around 0.8 for the case of Sydost (Figure 15a) when the point-based approach
is adopted. A similar observation can be made for the case of Gävle (Figure 15b). To this point, we can remark
that there is a significant relationship between scaling and metric-flow correlation. For instance, with the line-
based approach both local and global integrations do not follow the scaling law (Figures 5 and 6), and there is
nearly no metric-flow correlation for the integrations (Figures 7-12). However, in the space syntax community, it
is commonly accepted that local and global integrations are the default indicators for traffic flow. It is absolutely
not the case in our experiments based on the line-based approach. However, when we shift from the line-based to
the point-based approach, local and global integrations become the default indicators for traffic. It appears a
close relationship between the metrics’ scaling distribution (Figures 13 and 14), and the metric-flow correlation
(Figures 15 and 16).
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0123451015202530354045505560657075808590
Angle
R square
Connect
Control
GInteg
LInteg
Between
(a)
14
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0123451015202530354045505560657075808590
Angle
R square
Connect
Control
GInteg
LInteg
Between
(b)
Figure 15: (color online) Correlation coefficient (R square) between traffic flow and point-based centrality
metrics (a) the case of Syndost and (b) the case of Gävle (NOTE: local and global integrations in particular)
5. Discussions on emergent properties of natural roads
What we have found through the experiments can be considered to be emergent properties of natural roads,
because they are not properties of the fundamental element, i.e., road segments. In the above experiments, we
have illustrated that roads distinguish from segments in terms of the general behaviors. Roads are generated from
segments by a self-organized process, but they demonstrate intelligence that underlying constituent segments
lack. There is a striking zone between degree 30 and 75, where the emergent properties retain stable: (1) the
number of natural roads remains stable, and (2) the metric-flow correlation does not alter much. The emergent
properties remain valid for both nationwide and urban road networks. Some slight differences between
nationwide and urban road networks do exist. Among others, the betweenness metric tends to be a good indicator
for traffic flow in urban rather than nationwide settings. Under the line-based approach, traffic flow and all
metrics except local and global integrations demonstrate the scaling property. However, while remaining
unchanged for the scaling with all other metrics and traffic flow, local and global integrations exhibit the scaling
(c.f. Figures 13 and 14, in comparison with Figures 5 and 6) when the point-based approach is adopted, more
specifically, when point-based metrics are assigned by summation into individual roads. Although the
underlying principle or mechanism still waits to be found, we try to justify our findings from the perspective of
multi-agent systems (MAS) or complex adaptive systems (CAS) (Maes 1994; Johnson 2002).
The emergent properties found can be considered to be the outcome of interactions of individual segments from
the bottom up. The segments, roads, and threshold angle (e.g., degree 45) can be compared to the sand grains,
avalanches and slope in Bak’s sandpile model (Bak, Tang and Wiesenfield 1987; Bak 1996). Segments can be
regarded as multiple agents at the micro-scale, in which every individual segment interacts with its adjacent
neighbors (at both ends) to form individual roads. The forming process can be regarded as a tracking process in
which each segment at every junction point chooses one with the smallest angle to join. In the end, the formed
roads meet the condition of energy minimization. It is a natural way of forming roads. In this regard, the
formation of roads (Figure 16) resembles that of avalanches in Bak’s sandpile model, where sand is added
continuously, and a series of avalanches are generated as long as the slope of sand piles is greater than a
threshold. Surprisingly, the size of roads, as well as that of avalanches, demonstrates a regularity of power law.
Furthermore, the selfish oriented principles tend to capture traffic much better than the utopian one. This may
sound counter intuitive, but it is exactly the diversity, one of the distinguished natures of CAS, that makes the
difference. In other words, the more diverse the agents are, the more intelligent the CAS.
Roads can be regarded as multiple agents at the macro-scale, in which every road interacts with each other to
form a connected whole. In the connected whole, all the roads collectively determine an individual’s status. This
is particularly true for PageRank metrics, which is justified by a federal system in which each agent (as a web
page) casts a vote to determine an individual’s status (Surowiecki 2004). More than that, an important page tends
to have a higher weight than a less important page. Thus it is not a perfect democracy in the sense of one page
one vote. In other words, not only popularity but also prestige determines the rankings in the whole. In addition,
centrality metrics have also this nature of collective determination, although they are not as smart as PageRank
metrics. Overall, the collectively determined metrics capture very well the traffic flow. This is another emergent
property developed from the interactions of individual roads at the macro-scale.
15
1
2
3
4
5
6
78
9
10
11
12
13
14
15
(a)
1
2
3
4
5
6
78
9
1
0
11
12
13
14
15
(b)
1
2
3
4
5
6
78
9
10
11
12
13
14
15
(c)
1
2
3
4
5
6
78
9
1
0
11
12
13
14
15
(d)
Figure 16: (color online) Formation of a natural road (green) in the sequence of (a), (b), (c) and (d) using the
principle of self-best-fit
A third emergent property is with respect to the point-based approach. If we assign point-based metrics by
summation into individual segments, then there would be no metric-flow correlation at all. However, things
would be rather different if the point-based metrics are assigned into individual roads. First of all, local and
global integrations exhibit a scaling property, whereas it lacks such a property in the line-based approach.
Second, local and global integrations become the best metrics to capture traffic, while other metrics remain with
no significant changes. To this point, we are still unable to provide a satisfactory justification as to why it is so.
However, we conjecture that it is due to the Modifiable Areal Unit Problem (MAUP) (Openshaw 1984), which
should be more properly named as the Modified Linear Unit Problem. MAUP refers to the fact that the
aggregation units will affect statistics of spatial data, e.g., correlation relationships are strengthened by
aggregations. As we have noticed, the point-based metrics assigned to segments have nearly no correlation, but
tend to be highly correlated when the point-based metrics are assigned to roads. Clearly, metric-flow correlations
are strengthened by aggregations from segments to roads. Whether this conjecture remains valid requires further
investigation.
The emergent properties and intelligence demonstrated by natural roads provide telling evidence that cities are
self-organized phenomena, which have life structure, as articulated by Alexander (2004) and Salingaros (2005).
Linked to the empirical findings are also some fundamental, maybe philosophical, issues such as contradictions
of uniformity and stupidity (of segments) versus diversity and intelligence (of roads), and unpredictability (of
road length) versus predictability (of traffic). For instance, the principle self-best-fit based on natural selections
makes natural roads more diverse than the other two principles. This is probably the reason why it is the best
principle. The behavior change from segments to roads, in particular related to metric-flow correlation, sounds
like a phase transition. This transition can be compared to that from ants to colonies, and sand grains to
avalanches. These issues are fundamental to many complexity systems in nature or society, which deserve
further research.
6. Conclusion
We studied road networks from the perspective of complex networks by concentrating on the sensitivity issues
with respect to join principles, the damping factors with PageRank metrics, and the difference between line-
based and point-based approaches. Using massive road networks and traffic flow data, we found that (1) there
exists a tipping point from segment-based to road-based network topology in terms of correlation between
ranking metrics and traffic flow, (2) the correlation is significantly improved if a selfish rather than utopian
strategy is adopted in forming the self-organized natural roads, and (3) point-based metrics assigned by
summation into individual roads tend to have a much better correlation with traffic flow than line-based metrics,
and this is particularly true for both local and global integrations. In addition, we found that weighted PageRank
16
with an appropriate d factor setting tends to be one of the best metrics for correlating or predicting traffic flow.
In comparison with line-based and point-based approaches, the point-based one tends to be the best option.
We have tried to put natural roads in analogue with many complex phenomena such as ants/colonies and sand
grains/avalanches, which demonstrate emergent properties and a universal regularity of power law distribution.
We illustrated various emergent properties developed from roads and road network topology. Our study sheds
substantial insights into the understanding of road networks. Road networks, although artifacts in nature, can be
compared with biological entities, which exhibit complexity that is developed from the interaction of individuals,
thus bottom-up in nature.
Acknowledgements
This research was financially supported by research grants from the Hong Kong Polytechnic University, and the
Swedish Research Council FORMAS. The data on nationwide road networks and relevant traffic flow are
provided by the Swedish Road Administration (Vägverket). In this regard, Michael Westin from Vägverket
Region Mitt deserves our special thanks for his assistance in preparing and transferring the datasets. The Gävle
street network is provided by Gävle city, and GPS data are provided kindly by the Gävle taxi company (TAXI
Stor och Liten) in Sweden. In addition, the first author is grateful to Xintao Liu and Hong Zhang for their help in
drawing Figure 1 and patience in discussing this paper while it was prepared for publication.
References
Alexander C. (2004), The Nature of Order, Center for Environmental Structure: Berkeley, CA.
Bak P. (1996), How Nature Works: the science of self-organized criticality, Springer-Verlag: New York.
Bak P., Tang C. and Wiesenfield K. (1987), Self-organized criticality: an explanation of 1/f noise, Physical
Review Letters, 59, 381 – 384.
Batty M. (2004), A new theory of space syntax, CASA working paper #75.
Freeman L. C. (1979), Centrality in social networks: conceptual clarification, Social Networks, 1, 215 - 239.
Gastner M.T. and Newman M.E.J. (2006), The spatial structure of networks, The European Physical Journal B,
49, 247–252.
Hillier B. and Hanson J. (1984), The Social Logic of Space, Cambridge University Press: Cambridge.
Hillier B. and Iida S. (2005), Network and psychological effects in urban movement, In: Cohn A.G. and Mark D.
M. (eds.), Proceedings of the International Conference on Spatial Information Theory, COSIT 2005,
Ellicottsville, N.Y., U.S.A., September 14-18, 2005, Springer-Verlag: Berlin, 475-490.
Jiang B. (2006), Ranking spaces for predicting human movement in an urban environment, to appear in
International Journal of Geographical Information Science, xx, xx – xx, Preprint,
arxiv.org/physics/0612011/.
Jiang B. (2007), A topological pattern of urban street networks: universality and peculiarity, Physica A, 384, 647
– 655.
Jiang B. and Claramunt C. (2002), Integration of space syntax into GIS: new perspectives for urban morphology,
Transactions in GIS, 6(3), 295-309.
Jiang B. and Claramunt C. (2004), Topological analysis of urban street networks, Environment and Planning B:
Planning and Design, 31, 151- 162.
Jiang B. and Liu C. (2007), Street-based topological representations and analyses for predicting traffic flow in
GIS, to appear in International Journal of Geographic Information Science, xx, xx – xx, preprint,
arxiv.org/abs/0709.1981.
Johnson S. (2002), Emergence: the connected lives of ants, brains, cities and software, TOUCHSTONE: New
York.
Langville A. N. and Meyer C. D. (2006), Google's PageRank and Beyond: the science of search engine rankings,
Princeton University Press: Princeton, N.J.
Maes P. (1994), Modeling adaptive autonomous agents, Artificial Life, 1, 135 – 162.
Openshaw, S. (1984), The Modifiable Areal Unit Problem, Geo Book: Norwich.
Page L. and Brin S. (1998), The anatomy of a large-scale hypertextual Web search engine, Proceedings of the
Seventh International Conference on World Wide Web, 7, 107-117.
Porta S., Crucitti P. and Latora V. (2006), The network analysis of urban streets: a dual approach, Physica A, 369,
853 – 866.
Salingaros N. A. (2005), Principles of Urban Structure, Techne: Delft.
Surowiecki J. (2004), The Wisdom of Crowds: why the many are smarter than the few, ABACUS: London.
Taxi (2007), Taxi Stor och Liten, www.taxi107000.com
17
Thomson R. C. (2003), Bending the axial line: smoothly continuous road centre-line segments as a basis for road
network analysis, in Hanson, J. (ed.), Proceedings of the Fourth Space Syntax International Symposium,
University College London, London.
Turner A. (2007), From axial to road-centre lines: a new representation for space syntax and a new model of
route choice for transport network analysis, Environment and Planning B: Planning and Design, 34(3),
539–555.
Xing W. and Ghorbani A. (2004), Weighted PageRank algorithm, Second Annual Conference on
Communication Networks and Services Research CNSR’04, Fredericton, N.B. Canada, 305 – 314.
18
Appendix A: Matrices derived from the notational road network in Figure 1
With reference to Figure 1 in the main text of this paper, we derived various matrices, representing different
networks or graphs. First, Point-Point Distance Matrix (PPDM) is a matrix, a two-dimensional array, containing
the distances of a set of points, which are road junctions or ends.
=
0000000000000015
0000000000000014
00000000000013
00000000000012
00000000000011
0000000000000
010
000000000000009
0000000000008
0000000000007
000000000000006
00000000000005
00000000000004
000000000000003
00000000000002
000000000000001
151413121110987654321
x
x
xxx
xxx
xxx
x
x
xxx
xxx
x
xx
xx
x
xx
x
PPDM
(A-1)
where x denotes different distances between two points. The distance matrix becomes a binary connectivity
matrix, when all x are set to 1 (Figure 1a).
=
00100000000000015
00100000000000014
11010000000000013
00101001000000012
000101010000
00011
00001000000000010
0000000010000009
0001100010000008
0000001100100007
0000000000100006
0000000000010105
0
000000000100104
0000000000010003
0000000000110002
0000000000000101
151413121110987654321
PPCM
(A-2)
Line-line adjacent matrix (LLAM) is a binary matrix, whose element is set to 1 if corresponding lines are
intersected and 0 otherwise (Figure 1b).
=
0000011
0000001
0001001
0010001
0000001
1000001
1111110
g
f
e
d
c
b
a
gfedcba
LLAM
(A-3)
Point-point adjacent matrix (PPAM) is a binary matrix indicating whether or not a pair of points share a line in
common: 1 if yes and 0 otherwise (Figure 1b).
19
=
00110001101001115
00100000000000014
11010001101001113
10101101101001112
00010101000000011
00011000000000010
0000000010000009
1011100010100118
1011001100100117
0000000000111006
1011000111011115
00000
00001101104
0000000001110003
1011000110110012
1011000110100101
151413121110987654321
PPAM
(A-4)
Generally, a Line-Point Incidence Matrix (LPIM) shows the relationship between lines and points, i.e., whether
or not a point on a line (Figure 1b).
=
000000000001010
011000000000000
000111000000000
000010010000000
000000101000000
0
00000000111100
101100011010011
151413121110987654321
g
f
e
d
c
b
a
LPIM
(A-5)
The above two adjacency matrices can be easily derived from LPIM using the following operations:
T
LPIMLPIMLLAM *=
L
PIM
L
PIMPPAM T*=
20
Appendix B: Algorithms for forming natural roads with different join principles
Input: Segment-based network shape file
Output: road-based network shape file
Sub Main () //main function
Get all the segments in segment-based network shape file;
Get randomly a segment as a starting search segment from all the segments;
While (there is no such a starting search segment) do
If (this segment is not processed) then
Change the status of that segment to be processed;
Get a new segment by calling function ‘SearchSegmentByPnt_EveryBestFit’ with the old
segment and it’s ‘from’ direction as parameters;
/* to call different functions */
// Get a new segment by calling function ‘SearchSegmentByPnt_SelfBestFit’ with the old
// segment and it’s ‘from’ direction as parameters;
// Get a new segment by calling function ‘SearchSegmentByPnt_SelfFit’ with the old
// segment and it’s ‘from’ direction as parameters;
Get another new segment by calling function ‘SearchSegmentByPnt_EveryBestFit’ with the
last new constructed segment and it’s ‘to’ direction as parameters;
/* to call different functions */
// Get another new segment by calling function ‘SearchSegmentByPnt_SelfBestFit’ with
// the last new constructed segment and it’s ‘to’ direction as parameters;
// Get another new segment by calling function ‘SearchSegmentByPnt_SelfFit’ with the
// last new constructed segment and it’s ‘to’ direction as parameters;
Create a road with final constructed segment and add to the road-based network shape
file;
End
Get randomly a segment as a starting search segment from all the segments except the
processed segments;
End while
End sub
Algorithm I based on the principle of every-best-fit
Function SearchSegmentByPnt_EveryBestFit (old segment, direction) as new segment
//a recursive function
If (direction is ‘from’ direction) then
Search point = the from point of old segment;
Else if (direction is ‘to’ direction) then
Search point = the to point of old segment;
End if
Use a spatial filter to search for the segments intersected with search point except old
segment;
If (there are no intersected segments) then
Return new segment = old segment;
Else
If (the searched segments are all processed) then
Return new segment = old segment;
Else
Exclude the processed segments from the searched segments to get a remained set;
Calculate the deflection angles (a1) between old segment and every segment in the
remained set;
Calculate the deflection angle (a2) of every pair in the remained set;
Select the segments which meet with the condition a1 < a2;
// the principle of every-best-fit
If (There are no selected segments) then
Return new segment = old segment;
Else
Get the minimum deflection angle from a1 and its corresponding segment;
If (the minimum deflection angle < threshold) then
Join old segment and that segment into new segment at search point;
Change the status of that segment to be processed;
Call function ‘SearchSegmentByPnt_EveryBestFit’ recursively with new segment
and direction as parameters;
Else
Return new segment = old segment;
End If
21
End If
End If
End If
End function
Algorithm II based on the principle of self-best-fit
Function SearchSegmentByPnt_SelfBestFit (old segment, direction) as new segment
//a recursive function
If (direction is ‘from’ direction) then
Search point = the from point of old segment;
Else if (direction is ‘to’ direction) then
Search point = the to point of old segment;
End if
Use a spatial filter to search for the segments intersected with search point except old
segment;
If (there is no intersected segments) then
Return new segment = old segment;
Else
If (the searched segments are all processed) then
Return new segment = old segment;
Else
Exclude the processed segments from the searched segments to get a remained set;
Calculate the deflection angles between old segment and every segment in the remained
set;
Get the minimum deflection angle and its corresponding segment;
// the principle of Self-best-fit
If (the minimums deflection angle < threshold) then
Join old segment and that segment into new segment at search point;
Change the status of that segment to be processed;
Call function ‘SearchSegmentByPnt_SelfBestFit’ recursively with new segment and
direction as parameters;
Else
Return new segment = old segment;
End If
End If
End If
End function
Algorithm based on the principle of self-fit
Function SearchSegmentByPnt_SelfFit (old segment, direction) as new segment
//a recursive function
If (direction is ‘from’ direction) then
Search point = the from point of old segment;
Else if (direction is ‘to’ direction) then
Search point = the to point of old segment;
End if
Use a spatial filter to search for the segments intersected with search point except old
segment;
If (there are no intersected segments) then
Return new segment = old segment;
Else
If (the searched segments are all processed) then
Return new segment = old segment;
Else
Exclude the processed segments from the searched segments to get a remained set;
Calculate the deflection angles (a1) between old segment and every segment in the
remained set;
Select the segments that meet with the condition a1 < threshold
If (there are not selected segments) then
Return new segment = old segment;
Else
Get randomly a segment from the selected segments;
// the principle of self-fit
Join old segment and that segment into new segment at search point;
Change the status of that segment to be processed;
Call function ‘SearchSegmentByPnt_SelfFit’ recursively with new segment and
direction as parameters;
End if
End If
End If
22
End function
Appendix C: An introduction to ranking metrics used in the paper
To make the paper self contained, we introduce briefly the seven ranking metrics examined in the paper,
including connectivity, control, closeness which leads to both local and global integrations, betweenness,
PageRank and weighted PageRank (NOTE: some of the metrics are given in a short format such as Connect,
LInteg, GInteg, and Between in the plots). The reader may refer to relevant literature for more details, e.g., Jiang
(2006) for space syntax metrics originally developed by Hillier and Hanson (1984), Freeman (1979) for
centrality metrics, Langville and Meyer (2006) for the PageRank metric, originally developed by Page and Brin
(1998), and Xing and Ghorbani (2004) for the weighted PageRank metric. Note that space syntax metrics with
the exception of the control metric are based on centrality metrics, although they are named differently. In what
follows, we will outline the linkage.
The connectivity metric is de facto degree centrality, which measures the number of roads that interconnect a
given road. In the connectivity graph that represents road-road intersection, connectivity is the number of nodes
that link a given node. Formally connectivity is defined by:
kCnti= (C-1)
where k is the number of nodes directly linked to the given node i.
The control metric of a node is closely related to the connectivity of the directly linked nodes. Formally it is
defined by:
=
=k
jj
iCnt
Ctr
1
1 (C-2)
where k is the number of directly linked nodes (or connectivity) of a considered node, and j
Cnt is the
connectivity of the jth directly linked node.
The closeness metric measures the smallest number of links from a street to all other streets. In the
corresponding connectivity graph, it is the shortest distance from a given node to all other nodes. It is defined by:
=
=n
k
ijid
n
Cls
1
),(
1 (C-3)
where ),( jid is the shortest distance between nodes i and j.
The closeness metric becomes a sort of local closeness when considering only nodes within a few steps, instead
of all the nodes in the connectivity graph. In this sense, the closeness metric given by (C-3) is defined at a global
level, thus global closeness metric so to speak. Both local and global closeness metrics are the base for defining
local and global integrations, as used in our experiments.
The betweenness centrality measures to what extent a road is between roads. In the connectivity graph, it reflects
the intermediary location of a node along indirect relationships linking other nodes. Formally it is defined by:
∑∑
=
=
=n
j
j
kij
ikj
ip
p
Btw
1
1
1
(C-4)
where ij
pis he number of shortest paths from i to j, and ikj
p is the number of shortest paths from i to j that pass
through k, so
ij
ikj
p
pis the proportion of shortest paths from i to j that pass through k.
23
The PageRank metric is initially defined for web graphs (directed in nature) for ranking individual web pages
(Page and Brin 1998). The basic idea of PageRank is that a highly ranked node is one that highly ranked nodes
point to. It is defined formally as follows:
+
=
)(
Pr
1
Pr
iONj j
j
in
d
n
d (C-5)
where n is the total number of nodes; ON(i) is the outlink neighbors (i.e., those nodes that point to node i); Pri
and Prj are rank scores of node i and j, respectively; nj denotes the number of outlink nodes of node j; d is a
damping factor, which is usually set to 0.85 for ranking web pages.
With the above definition of PageRank, the PageRank of a node at any iteration is evenly divided over the nodes
to which it links (or outlink nodes). However, the propagation of the PageRank should follow an uneven rule, i.e.,
the more popular nodes tend to get a higher proportion. This is exactly the basic motivation of the weighted
PageRank (Xing and Ghorbani 2004).
The weighted PageRank is defined as follows:
+
=
)(
1
iONj jji WWprd
n
d
Wpr (C-6)
where weight Wj is added to propagate a PageRank score from one particular node i to its outlink nodes. This is
different from equation (C-5), where a PageRank score is evenly divided among its outlink nodes.
The weight Wj represents the relative popularity of node j among its counterparts, and it is defined as follows:
=)(kw
w
Wj
j (C-7)
where k is the counterpart nodes of j, w is the weight for individual links, indicating their relative popularity
based on the percentage of inlinks and outlinks.
... The other is spatial heterogeneity across all scales. Geometry-based representations fail to capture the spatial property of heterogeneity across all scales (Jiang, Zhao and Yin 2008). To better understand geographic forms or cities' structure, we must rely on topology-based representations, e.g., the topological representation of named or natural streets Claramunt 2004, Jiang, Zhao andYin 2008). ...
... Geometry-based representations fail to capture the spatial property of heterogeneity across all scales (Jiang, Zhao and Yin 2008). To better understand geographic forms or cities' structure, we must rely on topology-based representations, e.g., the topological representation of named or natural streets Claramunt 2004, Jiang, Zhao andYin 2008). This paper develops another topological representation that views cities as a whole. ...
... In the London Underground map, the geographic locations of stations are still relatively correct, although the distances between stations are distorted. With both topological data structures and cartograms, one, however, cannot see the underlying scaling of far more small things than large ones; see Jiang, Zhao and Yin (2008) for details. This makes our use of topology unique, because the topology enables us to see the underlying scaling property or hierarchy. ...
Preprint
A city is a whole, as are all cities in a country. Within a whole, individual cities possess different degrees of wholeness, defined by Christopher Alexander as a life-giving order or simply a living structure. To characterize the wholeness and in particular to advocate for wholeness as an effective design principle, this paper develops a geographic representation that views cities as a whole. This geographic representation is topology-oriented, so fundamentally differs from existing geometry-based geographic representations. With the topological representation, all cities are abstracted as individual points and put into different hierarchical levels, according to their sizes and based on head/tail breaks - a classification scheme and visualization tool for data with a heavy tailed distribution. These points of different hierarchical levels are respectively used to create Thiessen polygons. Based on polygon-polygon relationships, we set up a complex network. In this network, small polygons point to adjacent large polygons at the same hierarchical level and contained polygons point to containing polygons across two consecutive hierarchical levels. We computed the degrees of wholeness for individual cities, and subsequently found that the degrees of wholeness possess both properties of differentiation and adaptation. To demonstrate, we developed four case studies of all China and UK natural cities, as well as Beijing and London natural cities, using massive amounts of street nodes and Tweet locations. The topological representation and the kind of topological analysis in general can be applied to any design or pattern, such as carpets, Baroque architecture and artifacts, and fractals in order to assess their beauty, echoing the introductory quote from Christopher Alexander. Keywords: Wholeness, natural cities, head/tail breaks, complex networks, scaling hierarchy, urban design
... In recent years, aided by the availability of new datasets, there have been various studies on transportation systems including air transportation (e.g., Diop et al., 2021;Siozos-Rousoulis et al., 2021), rail and metro networks (e.g., Derrible and Kennedy, 2010;Wang et al., 2017), public transportation networks (e.g., bus network Li et al., 2020), and urban road networks (e.g., Jiang et al., 2008;Lämmer et al., 2006;Lee and Jung, 2018;Liu, 2019;Reza et al., 2022). These studies have focused on different aspects or questions ranging from understanding topological properties (e.g., computing the efficiency of the network) to finding vulnerabilities or bottlenecks (e.g., identifying key nodes against failure or targeted attacks). ...
... These works found that closeness centrality showed the highest correlation with land use densities, while betweenness centrality exhibited a weaker correlation. In Jiang et al. (2008), authors investigated the correlation between traffic flow and centrality measures in self-organized road network systems. In the city of Bologna, Porta et al. (2009) studied the connection between the centrality of the street interconnections and the densities of commercial and service activities. ...
Article
Full-text available
In this paper, we study urban road infrastructure in densely populated cities. As the subject of our study, we choose road networks from 35 populous cities worldwide, including China, India, Pakistan, Colombia, Brazil, Bangladesh, and Cote d’Ivoire. We abstract road networks as complex systems, represented by graphs consisting of nodes and links, and employ tools from network science to study their topological properties. Our multi-scale analysis includes macro-, meso-, and micro-scale perspectives, deriving insights into both common and unexpected patterns in these networks. At the macro-scale, we examine the global properties of these networks, summarizing the results in radar diagrams. This analysis reveals significant correlations among key metrics, indicating that more robust networks tend to be more efficient, while diameter and average path length show negative correlations with other properties. At the meso-scale, we explore the existence of sub-structures embedded within the road networks using two main concepts, namely, community and core-periphery structures. We find that while these densely populated city road networks show particularly strong community structures (high modularity values, close to 1.0) that are not typical to other networks, they exhibit a low level of presence of core-periphery structures, with an average coreness of 6.3%. This points to the cities being polycentric. At the micro-scale, we find nodal-level properties of the network. Specifically, we compute the various centrality measures and examine their distributions to capture the prevalent characteristics of these networks. We observe that the centrality measures present different distribution patterns. While the degree distribution demonstrates a limited range of degree values, the betweenness centrality distribution follows a power law, and the closeness centrality exhibits a binomial distribution—yet these patterns remain consistent across the studied cities. Overall, our multi-scale analysis provides valuable insights into the topological properties of urban road networks, informing city planning, traffic management, and infrastructure development in similar urban environments.
... Recent studies highlight the advantage of street-based topological representations for traffic flow prediction and human activity forecast (Ma, Osaragi, Oki, & Jiang, 2020;Jiang & Huang, 2021). For instance, Jiang et al. (2008) distinguished that streetbased topological representations are a better alternative GIS representation, which has farreaching implications for applications of GIS for traffic management and transport planning, and to the understanding of self-organized nature of cities and capture the small world and scale-free properties of the transportation network. In such a context, (Ma, Omer, Osaragi, Sandberg, & Jiang, 2019;Ma, Osaragi, Oki, & Jiang, 2020) distinguished that the Natural-Street topological representations are excellent proxies to capture the human mobility or traffic flow none other than any topological representation that utilized in the recent studies. ...
... Street Connectivity is primarily utilized to describe the scale-free property of the transportation network. The SC of segment , , can be calculated as follows (Jiang, Zhao, & Yin, 2008): ...
Article
Full-text available
The transportation system is a critical infrastructure with profound impacts on the economy and social well-being. Therefore, understanding and enhancing transportation system resilience in the face of disruptions is of utmost importance. This study presents a framework that utilizes critical-link attacks to disrupt transportation network segments, evaluating their resilience. By utilizing network topological parameters as proxies, the framework holistically captures the network's resilience, considering all potential individual road segment disruptions. In contrast to conventional studies focusing on pre-hazard occurrences, this framework incorporates worst-case scenarios, encompassing all potential individual disruptions. Statistical and spatial analysis techniques are employed to assess transportation network resilience in both macro and micro events, employing topological parameters to evaluate network changes under varying levels of disruption. The results demonstrate the effectiveness of topological parameters in capturing transportation system resilience, enabling the identification of critical road segments from structural and network perspectives. These findings contribute to evaluating network performance and ensuring optimal transportation serviceability.
... • PageRank methods have been used to study engineered systems such as road and urban space networks (Jiang et al., 2008). ...
... Network-based iterative algorithms are being applied to a broad range of problems, such as ranking search results in the WWW [1], predicting the traffic in urban roads [2], recommending the items that an online user might appreciate [3], measuring the competitiveness of countries in world trade [4,5], ranking species according to their importance in plant-pollinator mutualistic networks [6,7], assessing scientific impact [8,9], identifying influential spreaders [10], and many others. While linear algorithms are applied to a broad range of real systems [11,12], it has been recently shown that the non-linear fitness-complexity metric introduced in ref. [5] markedly outperforms linear metrics in ranking the nodes by their importance in bipartite networks that exhibit a nested architecture [7,13]. ...
Preprint
Numerical analysis of data from international trade and ecological networks has shown that the non-linear fitness-complexity metric is the best candidate to rank nodes by importance in bipartite networks that exhibit a nested structure. Despite its relevance for real networks, the mathematical properties of the metric and its variants remain largely unexplored. Here, we perform an analytic and numeric study of the fitness-complexity metric and a new variant, called minimal extremal metric. We rigorously derive exact expressions for node scores for perfectly nested networks and show that these expressions explain the non-trivial convergence properties of the metrics. A comparison between the fitness-complexity metric and the minimal extremal metric on real data reveals that the latter can produce improved rankings if the input data are reliable.
... This provides a promising research field for exploring the correlation between a space's syntactic data and social interaction (Hillier, 1996). This theory permits spatial research from a broad perspective, encompassing topics such as traffic flow (Jiang et al., 2008), crime distribution (Nubani and Wineman, 2005), wayfinding (Peponis et al., 1990;Haq and Girotto, 2003), evacuation ( € Unl€ u et al., 2005), office productivity (Wineman, 1982), mobility in museums (Hillier and Tzortzi, 2006) and shopping space efficiency (Garip and Unlu, 2009). ...
Article
Purpose The primary objective of this study is to investigate the cognitive aspects of spatial experiences of paediatric inpatients who receive long-term treatment in a healthcare setting in relation to the syntactic parameters of healthcare environment. It is aimed to investigate how the change in the child’s cognition caused by the environmental stress experienced by the child during his/her stay in the hospital is related to the physical parameters of the treatment space. Design/methodology/approach The methodology of the study is based on a correlational analysis to identify the cognitive and syntactic factors of the healthcare environment that contribute to changes in the perceptual processes of a sample group of thirty children. The study examined the relationships between the graph and isovist variables, and the cognitive parameters of paediatric inpatients. The two datasets were subjected to regression analyses in order to identify any significant findings, which allowed for a discussion of how the patients’ changing perceptual processes are influenced by the syntactic measures of the healthcare setting. Findings The study showed that a syntactically intelligible floor plan contributes significantly to reducing environmental stress among paediatric inpatients. The presence of shared spaces within the healthcare environment, where social interaction with peers is possible, emerges as a crucial factor influencing children’s spatial perception. Additionally, the visibility characteristics of shared spaces may also play a key role in enhancing children’s perceptions of safety. Research limitations/implications The limitations of the study include the fact that the study was conducted in an oncology and haematology inpatient unit with challenging conditions in terms of the mobility potentials of the children, which might have affected their perceptual processes. A further limitation is that the sample size comprised only 30 children, and the spatial configuration of the healthcare environment was linear and not particularly complex. Social implications By identifying the impact of spatial design on children’s well-being, the study informs the creation and improvement of healthcare environments. Enhanced understanding of factors like intelligible floor plans, shared spaces and isovist values can lead to more child-friendly facilities, potentially alleviating stress for young patients. Consequently, this research may contribute to improved healthcare outcomes, increased comfort for paediatric inpatients, and a more supportive environment for their families, fostering a holistic approach to paediatric care and positively influencing the overall quality of life for children undergoing long-term treatment. Originality/value This study contributes to the theoretical discourse on how the constrained physical conditions of a paediatric healthcare environment may influence the perceptual processes of paediatric inpatients. The results of this evidence-based study have the potential to inform the evaluation of design guidelines for healthcare settings, with the ultimate aim of enhancing therapeutic environments.
Article
Aiming at enhancing geographical road network analysis from a network science perspective, we introduce a novel problem of analyzing the road network in a city, and consider providing a new network centrality metric that could be useful for that problem. The problem addresses vehicular evacuation in urban settings. During emergency and disaster scenarios of vehicular evacuation, the shortest distance routes might not always be the most optimal. Instead, routes that are easier to traverse can be more crucial, even if they involve detours. Also, destinations for evacuation do not necessarily have to be restricted to traditional facilities; broad and well-maintained streets might also serve as suitable alternatives. We focus on the streets as the basic units of the road network to be investigated, and consider a scenario in which people efficiently move from starting intersections around their current places to designated goal streets, following the routes of easiest traversal. For the road network, we employ its topological representation, where vertices and edges correspond to streets and intersections between them, respectively. We thus represent the road network as a vertex-weighted graph, where the weight of each vertex reflects its ease of traversal. By appropriately extending the recently developed edge-centrality metric, “salience”, to this vertex-weighted graph, we construct a new network centrality metric to detect critical streets for the newly introduced problem. Using a toy model of road network and real-world urban road networks obtained from OpenStreetMap, we experimentally reveal its distinctive characteristics by comparing it with several baselines. Moreover, we demonstrate that the proposed network centrality metric can successfully find critical streets for vehicular evacuation, which are difficult to detect using baseline methods.
Article
Road network selection plays a key role in map generalization for creating multi-scale road network maps. Existing methods usually determine road importance based on road geometric and topological features, few evaluate road importance from the perspective of road utilization based on human travel data, ignoring the functional values of roads, which leads to a mismatch between the generated results and people’s needs. This paper develops two functional semantic features (i.e. travel path selection probability and regional attractiveness) to measure the functional importance of roads and proposes an automatic road network selection method based on graph convolutional networks (GCN), which models road network selection as a binary classification. Firstly, we create a dual graph representing the source road network and extract road features including six graphical and two functional semantic features. Then, we develop an extended GCN model with connectivity loss for generating multi-scale road networks and propose a refinement strategy based on the road continuity principle to ensure road topology. Experiments demonstrate the proposed model with functional features improves the quality of selection results, particularly for large and medium scale maps. The proposed method outperforms state-of-the-art methods and provides a meaningful attempt for artificial intelligence models empowering cartography.
Article
Full-text available
Relations between different components of urban structure are often measured in aliteral manner, along streets for example, the usual representation being routesbetween junctions which form the nodes of an equivalent planar graph. A popularvariant on this theme ? space syntax ? treats these routes as streets containing one ormore junctions, with the equivalent graph representation being more abstract, basedon relations between the streets which themselves are treated as nodes. In this paper,we articulate space syntax as a specific case of relations between any two sets, in thiscase, streets and their junctions, from which we derive two related representations.The first or primal problem is traditional space syntax based on relations betweenstreets through their junctions; the second or dual problem is the more usualmorphological representation of relations between junctions through their streets.The unifying framework that we propose suggests we shift our focus from the primalproblem where accessibility or distance is associated with lines or streets, to the dualproblem where accessibility is associated with points or junctions. This traditionalrepresentation of accessibility between points rather than between lines is easier tounderstand and makes more sense visually. Our unifying framework enables us toeasily shift from the primal problem to the dual and back, thus providing a muchricher interpretation of the syntax. We develop an appropriate algebra which providesa clearer approach to connectivity and distance in the equivalent graphrepresentations, and we then demonstrate these variants for the primal and dualproblems in one of the first space syntax street network examples, the French villageof Gassin. An immediate consequence of our analysis is that we show how the directconnectivity of streets (or junctions) to one another is highly correlated with thedistance measures used. This suggests that a simplified form of syntax can beoperationalized through counts of streets and junctions in the original street network.
Article
History tells us that when you want something done you turn to a leader: right? Wrong. If you want to make a correct decision or solve a problem, large groups of people are smarter than a few experts. This brilliant and insightful book shows why the conventional wisdom is so wrong and why the theory of the wisdom of crowds has huge implications for how we run our businesses, structure our political systems and organise our society. Shrewd, meticulous and profound, The Wisdom of Crowds will change for ever the way you think about human behaviour.
Book
The book presents a new theory of space: how and why it is a vital component of how societies work. The theory is developed on the basis of a new way of describing and analysing the kinds of spatial patterns produced by buildings and towns. The methods are explained so that anyone interested in how towns or buildings are structured and how they work can make use of them. The book also presents a new theory of societies and spatial systems, and what it is about different types of society that leads them to adopt fundamentally different spatial forms. From this general theory, the outline of a 'pathology of modern urbanism' in today's social context is developed.
Article
This paper presents the use of smoothly continuous road centre-line segments - which are here termed strokes - as a useful basis for the analysis of street and road networks. The decomposition of networks into linear strokes, which show good continuation of direction and continuity of character (width or type), has been found to provide a basis for robust, effective, and efficient automatic generalisation of road networks. Network generalisation and space syntax are shown to have similar requirements for understanding the importance of individual elements in the network structure. Stroke-based network generalisation employs structural analyses based on space syntax methods. The idea of replacing axial lines with strokes as the basic spatial representation in space syntax is presented. It is suggested that in spaces where linearity is dominant, in certain contexts such as traffic flow studies or when curved routes must be handled, stroke-based space syntax could offer definite computational and modelling advantages over the conventional use of axial lines.