A visual data-driven and network-based tool
for transportation planning and simulation
Michele Ferretti
King’s College London
London, United Kingdom
michele.ferretti@kcl.ac.uk
Luca Pappalardo
University of Pisa
ISTI, CNR
Pisa, Italy
lpappalardo@di.unipi.it
Gianni Barlacchi
University of Trento,
SKIL, Telecom Italia
Trento, Italy
barlacchi@fbk.eu
Bruno Lepri
Fondazione Bruno Kessler
Trento, Italy
lepri@fbk.eu
ABSTRACT
The availability of massive data describing human mobility offers the possibility to design simulation tools to control and improve transportation systems. In this perspective, we propose a visual and data-driven simulation tool based on a multiplex network representation of mobility data, where every layer describes people’s movements with a given transportation mode. We then develop a visual application which provides an easy-to-use interface to explore the mobility fluxes and the connectivity of every urban zone in a city. Our application allows the user to visualize changes in the transportation system resulting from the addition or removal of transportation modes, urban zones and single stops. We show how our visual application can be used to explore mobility in Singapore, by using data provided by the CIKM challenge 2017 and mobility data obtained from external sources. The application makes it possible to simulate the reaction to changes in the public transportation system and to assess the resilience of the transportation network to the removal of single subway/bus stops.
KEYWORDS
urban science, data science, human mobility, complex systems,
network science, multiplex networks
ACM Reference format:
Michele Ferretti, Luca Pappalardo, Gianni Barlacchi, and Bruno Lepri. 2016. A visual data-driven and network-based tool for transportation planning and simulation. In Proceedings of ACM Conference, Washington, DC, USA, July 2017 (Conference’17), 4 pages.
DOI: 10.1145/nnnnnnn.nnnnnnn
1 PROBLEM STATEMENT
Nowadays the availability of massive data describing human movements allows us to face relevant urban computing challenges [2, 4, 6, 7, 9, 15]. For example, the observation of mobility flows offers the possibility to investigate the resilience of urban transportation systems, thus uncovering weak points and potentially sub-standard routes. By combining methods from machine learning and network science, we can design powerful models and simulation tools for what-if analysis of different urban planning scenarios [12, 13].
Specifically, Singapore is a global city where such tools would be particularly valuable, due to the complexity of its transportation system [11]. Despite Singapore’s renowned efficiency, its transport services still face daily challenges which might undermine its economy and negatively impact the well-being of its inhabitants. Common problems in the public transportation system are related to a non-optimal positioning of bus stops or subway stations; to prolonged waiting times at such stops; or to misaligned interconnections and inter-modal routes between different transportation networks (e.g., subway and bus). Further, as recently empirically demonstrated by Xu and González [16], a slight re-routing of a fraction of daily rush-hour car commutes across metropolitan areas produces more-than-proportional reductions in traffic, alleviating the overall transport system’s congestion state. In this perspective, as highlighted in the Intelligent Transport System Strategic Plan for Singapore [1], the development of big data analytics tools can help to control the transportation system and improve both the customer’s travel experience and the system’s overall efficiency.
Starting from these considerations, we address the following questions: (i) what and where are the weakest transportation routes in a city? (ii) Given some changes in the transportation system, what scenarios are likely to occur and what is their impact on human mobility? In the literature, Çolak et al. [6] investigate the interplay between the number of vehicles and the road capacity on their routes to determine the level of congestion in urban areas. They show that the ratio of the road supply to the travel demand can explain the percentage of time lost in congestion. De Domenico et al. [8] show that the efficiency in exploring the transportation layers depends on the layers’ topology and the interconnection strengths. Although these works doubtless shed light on interesting aspects of the structure of urban transportation, they do not provide easy-to-use tools for exploring a city’s demand for mobility and the efficiency of the related transportation system. Giannotti et al. [10] partly overcome this problem by proposing a querying and mining system (M-Atlas) for extracting mobility patterns from GPS tracks. However, M-Atlas does not allow users to investigate the transportation system’s resilience with respect to a city’s mobility demand.
We propose a visual, data-driven and network-based simulation tool to highlight and explore the weaknesses in a public transportation system. Our tool is based on a multiplex network representation of mobility data [8], where every layer describes people’s movements with a given transportation mode, e.g., buses, metros, taxis. A node in a layer represents a zone of the city, edges indicate routes between zones and edge weights indicate the number of people moving between two nodes in a given time window. From the multiplex network we extract a set of measures indicating a layer’s carrying capacity and its ability to satisfy the overall mobility needs in a given urban area [3]. We then develop a visual application which provides an easy-to-use interface to explore the mobility fluxes and the connectivity of every urban zone in a city. Our visual application allows the user to visualize changes in the transportation system resulting from the addition or removal of transportation modes, urban zones and single stops. We show how our visual system can be used to explore human mobility in Singapore, by using data provided by the CIKM challenge 2017 and Singapore mobility data obtained from external sources. The application makes it possible to point out weak routes among urban areas in the city (i.e., routes where public transportation does not meet the needs of the city users), and to simulate changes in the capacity of public transportation to satisfy the needs of citizens when specific events occur in the city, e.g., closing/adding transportation modes or subway/bus stops. Our approach is highly flexible since it uses only data about transportation and mobility flows. Given that many open datasets of such nature are publicly available¹, our approach can potentially be applied to any other urban area to simulate traffic changes due to specific events, such as the impact of adding or removing transportation modes or stops, the impact of closing the access to an urban area, or the organization of city-wide public events.
2 DATA SOURCES
We use heterogeneous data sources to simulate, by using our visual tool, transportation changes in the city of Singapore. In particular, we use data about bus lines available at the website www.mytransport.sg². For every bus line, we retrieve information about its stops and the GPS traces describing the bus route. For every stop, we retrieve its GPS position. The bus lines data provide information to build a transportation network describing the displacements of inhabitants between different zones of Singapore. We split Singapore into urban zones by using the shape files provided at the website data.gov.sg³, where different administrative divisions of the city are provided. We use the most fine-grained division and assign every bus stop to the corresponding urban zone. We hence obtain for every urban zone $z$: the bus stops $z$ contains, the bus lines passing through $z$ and all the urban zones connected to $z$. We define two urban zones $z_1$ and $z_2$ to be connected if there is at least one bus line connecting $z_1$ and $z_2$. Finally, we use data indicating both the presence of people in every urban zone and the fluxes of people between urban zones at a given date and time, downloaded from the API provided by DataSpark⁴ for the CIKM AnalytiCup 2017⁵. We use these data to estimate the number of people moving by bus between two urban zones in a given time window, since official information about the number of users traveling on the buses is not available.

¹ https://data.gov.sg/group/transport
² https://www.mytransport.sg/content/mytransport/home/dataMall.html
³ https://data.gov.sg/dataset/master-plan-2014-subzone-boundary-web
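The per-zone bookkeeping described in this section can be illustrated with a minimal Python sketch. It assumes that every bus stop has already been assigned to an urban zone (the point-in-polygon step against the shape files is not shown), and all names and toy data below (`build_zone_index`, `stop_to_zone`, `line_to_stops`) are hypothetical, not taken from the original implementation.

```python
from collections import defaultdict
from itertools import combinations


def build_zone_index(stop_to_zone, line_to_stops):
    """Return (stops per zone, bus lines per zone, connected zones per zone)."""
    zone_stops = defaultdict(set)
    for stop, zone in stop_to_zone.items():
        zone_stops[zone].add(stop)

    zone_lines = defaultdict(set)
    zone_neighbours = defaultdict(set)
    for line, stops in line_to_stops.items():
        # Zones served by this line (stops without a known zone are skipped).
        zones = {stop_to_zone[s] for s in stops if s in stop_to_zone}
        for zone in zones:
            zone_lines[zone].add(line)
        # Two zones are connected if at least one bus line serves both.
        for z1, z2 in combinations(sorted(zones), 2):
            zone_neighbours[z1].add(z2)
            zone_neighbours[z2].add(z1)
    return dict(zone_stops), dict(zone_lines), dict(zone_neighbours)


# Toy, made-up data: four stops in three zones, three bus lines.
stop_to_zone = {"s1": "A", "s2": "A", "s3": "B", "s4": "C"}
line_to_stops = {"10": ["s1", "s3"], "20": ["s2", "s4"], "30": ["s3"]}
zone_stops, zone_lines, zone_neighbours = build_zone_index(stop_to_zone, line_to_stops)
```

In this toy example zone A is connected to both B and C, while B and C are not connected to each other because no single line serves both.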
3 METHODOLOGY
In an urban area, two zones can be connected through several
transportation means (bus, taxi, subway, etc.). To express this
kind of information we introduce the concept of urban multiplex
network:
Definition 3.1. An Urban Multiplex Network (UMN) is a network in which nodes represent zones of an urban area and two nodes can be connected, at the same time, by multiple edges that belong to different dimensions. We model such a structure with an edge-labeled multigraph denoted by $G = (V, E, L)$ where: $V$ is a set of nodes (urban zones); $L$ is a set of labels (public transportation means); $E$ is a set of labeled edges, i.e., a set of triples $(u, v, d)$ where $u, v \in V$ and $d \in L$ is a label. We use the term dimension to indicate a label.
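One possible in-memory representation of such an edge-labeled multigraph is sketched below in Python. The class name, the method names and the choice of undirected edges are assumptions made for illustration; the definition itself does not prescribe them.

```python
from collections import defaultdict


class UrbanMultiplexNetwork:
    """Edge-labeled multigraph G = (V, E, L); edges are assumed undirected."""

    def __init__(self):
        self.nodes = set()    # V: urban zones
        self.labels = set()   # L: transportation means (dimensions)
        self.weights = {}     # E with weights: (frozenset({u, v}), d) -> flux
        self._adj = defaultdict(lambda: defaultdict(set))  # d -> u -> neighbours

    def add_edge(self, u, v, d, weight=0):
        """Add the labeled edge (u, v, d) with an optional mobility flux."""
        self.nodes.update((u, v))
        self.labels.add(d)
        self.weights[(frozenset((u, v)), d)] = weight
        self._adj[d][u].add(v)
        self._adj[d][v].add(u)

    def neighbours(self, u, d):
        """Neighbours of zone u on dimension d (empty set if none)."""
        return set(self._adj.get(d, {}).get(u, set()))

    def dimensions(self, u, v):
        """Dimensions on which zones u and v are directly connected."""
        pair = frozenset((u, v))
        return {d for (p, d) in self.weights if p == pair}


# Toy, made-up network: two zones linked by both bus and metro.
g = UrbanMultiplexNetwork()
g.add_edge("A", "B", "bus", weight=120)
g.add_edge("A", "B", "metro", weight=300)
g.add_edge("A", "C", "bus", weight=40)
```

Keeping one adjacency map per dimension makes per-layer neighbour queries cheap, which is what the measures defined on top of the network need.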
The multidimensional connectivity of two zones in an urban area is a combination of two elements: connection intensity and connection redundancy [14]. We define the intensity of the connection between two zones on a single dimension as:
Definition 3.2. Connection intensity

$$h_d(u, v) = w_d(u, v) \, \frac{|\Gamma_d(u) \cap \Gamma_d(v)|}{\min(|\Gamma_d(u)|, |\Gamma_d(v)|)}, \qquad (1)$$

where $w_d : V \times V \times L \to \mathbb{N}$ is a weight function representing the mobility flux between two zones on dimension $d$, and $\Gamma_d$ is the set of neighbours of a zone.
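As a hedged illustration of Equation (1), the following Python sketch computes the connection intensity from a flux value and two neighbour sets taken on the same dimension. The function name and the convention of returning zero when one of the zones has no neighbours are assumptions, since the definition does not cover that case.

```python
def connection_intensity(weight, neighbours_u, neighbours_v):
    """h_d(u, v): flux scaled by the share of common neighbours on dimension d."""
    smaller = min(len(neighbours_u), len(neighbours_v))
    if smaller == 0:
        # Assumption: the ratio is undefined here, so we treat it as zero.
        return 0.0
    common = len(neighbours_u & neighbours_v)
    return weight * common / smaller


# Toy, made-up values: 100 people move between u and v by bus.
gamma_u = {"v", "a", "b"}        # bus neighbours of u
gamma_v = {"u", "a", "b", "c"}   # bus neighbours of v
h = connection_intensity(100, gamma_u, gamma_v)
```

Here the two zones share two neighbours (a and b) and the smaller neighbourhood has three members, so the intensity is two thirds of the flux.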
Connection intensity hence consists of two factors: the first factor, $w_d$, indicates how many people move between the two zones using transportation layer $d$; the second factor, $\frac{|\Gamma_d(u) \cap \Gamma_d(v)|}{\min(|\Gamma_d(u)|, |\Gamma_d(v)|)}$, is the fraction of common neighbours, $|\Gamma_d(u) \cap \Gamma_d(v)|$, with respect to the most selective zone, $\min(|\Gamma_d(u)|, |\Gamma_d(v)|)$. The idea is that, on each dimension, the connection intensity is determined by the number of displacements between the two zones, weighted by the selectiveness of the more selective zone, i.e., the probability that the cluster shared by the two zones is the main one for the zone with the smallest set of neighbours. The second element of multidimensional connectivity is connection redundancy, which takes into account the relevance of a dimension for a zone, i.e., to what extent the removal of the links belonging to a dimension affects the capacity to reach a zone’s strong connections.
Definition 3.3. Connection Redundancy

$$r_d(u, v) = (1 - DR(u, d)) \, (1 - DR(v, d)), \qquad (2)$$

where dimension relevance $DR$ is the fraction of neighbours that become directly unreachable from a zone if all the edges in a specific dimension were removed [5]. We give a higher score to the edges that appear in several dimensions, so we are interested in the complement of those values. If the two areas are linked in more than one dimension, the score rises up to a maximum of 1. We combine connection intensity and connection redundancy taking into account the multidimensionality of connectivity: a greater number of connections on different dimensions is reflected in a greater chance of having a strong connectivity.

⁴ https://datasparkanalytics.com/
⁵ http://cikm2017.org/CIKM_AnalytiCup_task2_dataset.html
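Dimension relevance and connection redundancy can be sketched in Python as follows. The sketch reads $DR(u, d)$ as the share of the neighbours of $u$ (over all dimensions) that are linked to $u$ only through dimension $d$; this reading, the nested-dictionary layout of `adj` and the function names are assumptions made for illustration.

```python
def dimension_relevance(adj, u, d):
    """DR(u, d): share of u's neighbours lost if dimension d is removed.

    adj maps each dimension to a dict {node: set of neighbours}.
    """
    all_neighbours = set()
    other_neighbours = set()
    for dim, layer in adj.items():
        neigh = layer.get(u, set())
        all_neighbours |= neigh
        if dim != d:
            other_neighbours |= neigh
    if not all_neighbours:
        return 0.0  # Assumption: an isolated zone loses nothing.
    return len(all_neighbours - other_neighbours) / len(all_neighbours)


def connection_redundancy(adj, u, v, d):
    """r_d(u, v) = (1 - DR(u, d)) * (1 - DR(v, d))."""
    return (1 - dimension_relevance(adj, u, d)) * (1 - dimension_relevance(adj, v, d))


# Toy, made-up network: A-B is served by bus and metro, A-C by bus only.
adj = {
    "bus": {"A": {"B", "C"}, "B": {"A"}, "C": {"A"}},
    "metro": {"A": {"B"}, "B": {"A"}},
}
r_bus = connection_redundancy(adj, "A", "B", "bus")
r_metro = connection_redundancy(adj, "A", "B", "metro")
```

In the toy network, removing the bus layer cuts A off from C, so the bus redundancy of the pair (A, B) is lower than its metro redundancy, whose removal disconnects no neighbour of either zone.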
Definition 3.4. Connectivity. Let $u, v \in V$ be two nodes and $L$ be the set of dimensions of an urban multiplex network $G = (V, E, L)$. The connectivity of two urban areas $u, v$ is defined as:

$$c(u, v) = \sum_{d \in L} h_d(u, v) \, (1 + r_d(u, v)). \qquad (3)$$

The proposed measure can also be used to estimate the strength of the connection in mono-dimensional networks, where $r_d$ is zero and the overall sum is $h_d$.
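A small worked example may help to see how Equation (3) combines the two measures. The Python sketch below takes per-dimension intensity and redundancy values as plain dictionaries; the function name and all numbers are made up for illustration.

```python
def connectivity(intensity, redundancy):
    """c(u, v): sum over dimensions of h_d(u, v) * (1 + r_d(u, v))."""
    return sum(h * (1 + redundancy[d]) for d, h in intensity.items())


# Toy, made-up values for one pair of zones linked by bus and metro.
intensity = {"bus": 60.0, "metro": 150.0}
redundancy = {"bus": 0.5, "metro": 1.0}
c_multi = connectivity(intensity, redundancy)  # 60 * 1.5 + 150 * 2.0

# Mono-dimensional case: redundancy is zero, so c reduces to h_d.
c_mono = connectivity({"bus": 60.0}, {"bus": 0.0})
```

With these values the multi-layer connectivity is 390, while the single-layer case returns the bus intensity unchanged.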
4 APPLICATION DESIGN
The overall application, comprising both the network representation and the interactive interface, is currently running on a private hosting solution. An ad hoc release to interested third parties and potential collaborations might be considered in the future. The application design closely follows the analytical framework described in Section 1, while implementing a client-server architecture pattern. The back-end component of this structure is responsible for implementing and serving the models presented in Section 3, which the client application then consumes via a REST service exposed by the same server. This API not only is responsible for data provisioning, but also acts as the communication layer between the user and the network models. It is worth noting that, given its modularity, our application can be re-purposed as an agnostic provider of services to other consumers. In particular, the exposed methods consist of:
• a query endpoint returning the network features’ geometries for a given urban zone ID, and additional information used to populate the geographic map application;
• a second query endpoint returning, for each given urban zone ID, the computed network metrics, i.e., Connection Intensity, Connection Redundancy and Multidimensional Connectivity (Section 3).
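To make the shape of such a service more concrete, the following Python sketch routes two query paths to JSON responses. It is not the application's actual API: the paths (`/geometry/<id>`, `/metrics/<id>`), the field names, the zone ID and the in-memory stores are all hypothetical placeholders.

```python
import json

# Hypothetical in-memory stores standing in for the back-end models.
GEOMETRIES = {
    "Z01": {"type": "Polygon", "coordinates": [[[0, 0], [0, 1], [1, 1], [0, 0]]]},
}
METRICS = {
    "Z01": {"intensity": 66.7, "redundancy": 0.5, "connectivity": 100.0},
}
ROUTES = {"geometry": GEOMETRIES, "metrics": METRICS}


def handle(path):
    """Map '/geometry/<zone_id>' or '/metrics/<zone_id>' to (status, JSON body)."""
    parts = path.strip("/").split("/")
    if len(parts) != 2 or parts[0] not in ROUTES:
        return 404, json.dumps({"error": "unknown endpoint"})
    resource, zone_id = parts
    store = ROUTES[resource]
    if zone_id not in store:
        return 404, json.dumps({"error": "unknown zone"})
    return 200, json.dumps({"zone_id": zone_id, resource: store[zone_id]})


status, body = handle("/metrics/Z01")
```

Separating geometry from metrics in this way would let a client redraw the map once and refresh only the metric values after each simulation run.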
The front-end application is thus the entry point for quickly interacting with and prototyping future transportation scenarios. The User Interface (UI), visible in Figure 1, has been developed with ease of use and clarity of interpretation as its guiding principles. It allows a non-technical audience to inspect the number of routes connecting an urban zone to the rest of the city simply by clicking on an urban zone in the city panel (Figure 2), and to control them via an interactive menu (Figure 1). Upon addition/removal of one or more routes in the menu, the user can trigger the calculation of the network metrics, which are promptly displayed in three separate windows. Every window shows a 3D map of the city, where an urban zone’s height is proportional to the average value of the corresponding network metric, computed over the connected urban zones (Figure 3). The menu and the bottom windows allow the user to simulate how the city’s connectivity changes after, for example, the construction of a new route or the temporary closing of an existing one.

Figure 1: The Application User Interface (UI) displaying the 3D network metrics for the city of Singapore. The interactive menu on the right allows the user to select and deselect transportation routes. The “Run Model” button calculates the intensity, redundancy and connectivity measures and visualizes them in the three bottom windows.

Figure 2: The city panel of the application user interface. It visualizes all the urban zones in the city (Singapore). When the user clicks on an urban zone, the system shows in the menu all the bus routes passing through that urban area.
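The height used in the 3D maps can be sketched as a per-zone average of a pairwise metric. The Python snippet below assumes that each unordered pair of connected zones appears once in the input; the function name, the `scale` parameter and the toy values are hypothetical.

```python
def zone_heights(pair_values, scale=1.0):
    """Return zone -> scale * mean metric value over its connected zones.

    pair_values maps an unordered pair (u, v), listed once, to a metric value.
    """
    totals, counts = {}, {}
    for (u, v), value in pair_values.items():
        for zone in (u, v):
            totals[zone] = totals.get(zone, 0.0) + value
            counts[zone] = counts.get(zone, 0) + 1
    return {zone: scale * totals[zone] / counts[zone] for zone in totals}


# Toy, made-up connectivity values for two pairs of zones.
heights = zone_heights({("A", "B"): 390.0, ("A", "C"): 60.0})
```

Zone A is connected to both B and C, so its height is the mean of the two pair values, whereas B and C each inherit the value of their single connection.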
All of the geographic map components are fully interactive and built with the latest web-mapping technologies based on WebGL standards with 3D capabilities; as such, they allow for a fluid and seamless experience. This is a crucial factor: it allows the application not only to hide the complexity of the models, but also to let the technology recede into the background, making space for generating discussions and streamlining decision-making processes.
5 EXPERIMENTS
We conduct an extensive connectivity assessment by observing the resilience of Singapore’s urban network to the removal of high or low connectivity links. Figure 4 shows how the relative size of the largest network component changes with the removal of a given percentage of links, sorted by increasing (red solid line) or decreasing (blue dashed line) connectivity order. We find that Singapore has a resilient urban network, as all the nodes are still reachable when removing up to 30% of the links. However, the deletion of links in decreasing order of connectivity affects the network’s global connectivity less, as more than 90% of the urban zones are still reachable after the removal of almost all the links (Figure 4, blue dashed line). In contrast, when deleting the links in increasing order of connectivity the network crumbles faster, as almost 20% of urban zones become unreachable after the removal of 90% of the links (Figure 4, red solid line). These results suggest that the proposed connectivity metrics can profitably be deployed to discover those urban connections whose existence is crucial for the network resilience. Moreover, those metrics can also help to evaluate the impact of changes in the city’s mobility (e.g., the closure of a bus line). Figure 4 highlights the points where the accelerated network disassembly commences: up to those values the transportation system exhibits a fair resilience, but surpassing such thresholds provokes a rapid network fall-out. Further, while our simulated experiment has been conducted, due to time constraints, only on the bus transportation layer, it is straightforward to envisage and implement in practice a more comprehensive simulation. Testing the network resilience to the above stress conditions in a truly multi-modal perspective thus represents an important tool at the disposal of transportation planners to assess the current state of the transportation network; plan future operations; and keep running the system at its overall optimal capacity.

Figure 3: The 3D map in the rightmost bottom window. The height of an urban zone (in Singapore) is proportional to the average value of its multidimensional connectivity, computed across all the connected urban zones.
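The link-removal experiment of this section can be approximated with the following Python sketch, which removes links in increasing or decreasing order of connectivity and records the relative size of the largest connected component. It is an illustration on a tiny made-up graph, not the code used to produce the reported results; the function names and the union-find approach are assumptions.

```python
def largest_component_fraction(nodes, edges):
    """Relative size of the largest connected component (union-find)."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    sizes = {}
    for n in nodes:
        root = find(n)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values()) / len(nodes)


def removal_curve(nodes, scored_edges, ascending=True, steps=10):
    """Return (removed fraction, largest component fraction) pairs.

    scored_edges is a list of (u, v, connectivity) triples; links are removed
    in increasing (ascending=True) or decreasing order of connectivity.
    """
    ordered = sorted(scored_edges, key=lambda e: e[2], reverse=not ascending)
    curve = []
    for k in range(steps + 1):
        removed = round(len(ordered) * k / steps)
        remaining = [(u, v) for u, v, _ in ordered[removed:]]
        curve.append((k / steps, largest_component_fraction(nodes, remaining)))
    return curve


# Toy, made-up network with four zones and four scored links.
nodes = ["A", "B", "C", "D"]
scored_edges = [("A", "B", 5.0), ("B", "C", 1.0), ("C", "D", 3.0), ("A", "C", 2.0)]
increasing = removal_curve(nodes, scored_edges, ascending=True, steps=4)
decreasing = removal_curve(nodes, scored_edges, ascending=False, steps=4)
```

Plotting the second element of each pair against the first gives a curve of the same kind as the one described for Figure 4, one per removal order.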
Figure 4: The stability of the urban network to link removal. The x axis shows the percentage of removed links. The y axis shows the size of the largest network component.

Acknowledgements. This work has been partially funded by the European project SoBigData RI (Grant Agreement 654024).
REFERENCES
[1] Intelligent transport system strategic plan for Singapore. https://www.lta.gov.sg/ltaacademy/pdf/J15Nov_p04Chin_SmartMobility2030.pdf.
[2] Mohammed N Ahmed, Gianni Barlacchi, Stefano Braghin, Francesco Calabrese, Michele Ferretti, Vincent Lonij, Rahul Nair, Rana Novack, Jurij Paraszczak, and Andeep S Toor. A multi-scale approach to data-driven mass migration analysis. In SoGood@ECML-PKDD, 2016.
[3] Albert-László Barabási and Márton Pósfai. Network science. Cambridge University Press, Cambridge, 2016.
[4] Gianni Barlacchi, Marco De Nadai, Roberto Larcher, Antonio Casella, Cristiana Chitic, Giovanni Torrisi, Fabrizio Antonelli, Alessandro Vespignani, Alex Pentland, and Bruno Lepri. A multi-source dataset of urban life in the city of Milan and the province of Trentino. Scientific Data, 2:150055, 2015.
[5] M. Berlingerio, M. Coscia, F. Giannotti, A. Monreale, and D. Pedreschi. Foundations of multidimensional network analysis. In 2011 International Conference on Advances in Social Networks Analysis and Mining, pages 485–489, July 2011.
[6] Serdar Çolak, Antonio Lima, and Marta C González. Understanding congested travel in urban areas. Nature Communications, 7:10793, 2016.
[7] Manlio De Domenico, Antonio Lima, Marta C González, and Alex Arenas. Personalized routing for multitudes in smart cities. EPJ Data Science, 4(1):1, 2015.
[8] Manlio De Domenico, Albert Solé-Ribalta, Sergio Gómez, and Alex Arenas. Navigability of interconnected networks under random failures. Proceedings of the National Academy of Sciences, 111(23):8351–8356, 2014.
[9] Marco De Nadai, Jacopo Staiano, Roberto Larcher, Nicu Sebe, Daniele Quercia, and Bruno Lepri. The death and life of great Italian cities: A mobile phone data perspective. In Proceedings of the 25th International Conference on World Wide Web, WWW ’16, pages 413–423, Republic and Canton of Geneva, Switzerland, 2016. International World Wide Web Conferences Steering Committee.
[10] Fosca Giannotti, Mirco Nanni, Dino Pedreschi, Fabio Pinelli, Chiara Renso, Salvatore Rinzivillo, and Roberto Trasarti. Unveiling the complexity of human mobility by querying and mining massive trajectory data. The VLDB Journal, 20(5):695–719, October 2011.
[11] S. Jiang, J. Ferreira, and M. C. González. Activity-based human mobility patterns inferred from mobile phone data: A case study of Singapore. IEEE Transactions on Big Data, 3(2):208–219, June 2017.
[12] Shan Jiang, Yingxiang Yang, Siddharth Gupta, Daniele Veneziano, Shounak Athavale, and Marta C. González. The TimeGeo modeling framework for urban mobility without travel surveys. Proceedings of the National Academy of Sciences, 113(37):E5370–E5378, 2016.
[13] Luca Pappalardo, Salvatore Rinzivillo, and Filippo Simini. Human mobility modelling: Exploration and preferential return meet the gravity model. Procedia Computer Science, 83:934–939, 2016. The 7th International Conference on Ambient Systems, Networks and Technologies (ANT 2016) / The 6th International Conference on Sustainable Energy Information Technology (SEIT-2016) / Affiliated Workshops.
[14] Luca Pappalardo, Giulio Rossetti, and Dino Pedreschi. How well do we know each other? Detecting tie strength in multidimensional social networks. In 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pages 1040–1045, Aug 2012.
[15] Luca Pappalardo, Filippo Simini, Salvatore Rinzivillo, Dino Pedreschi, Fosca Giannotti, and Albert-László Barabási. Returners and explorers dichotomy in human mobility. Nature Communications, 6, 09 2015.
[16] Yanyan Xu and Marta C. González. Collective benefits in traffic during mega events via the use of information technologies. Journal of The Royal Society Interface, 14(129), 2017.