The City Browser: Utilizing Massive Call Data to Infer City
Mobility Dynamics
Fahad Alhasoun
Center for Computational
Engineering
Massachusetts Institute of
Technology
Cambridge MA, USA
fha@mit.edu
Abdullah Almaatouq
Center for Complex
Engineering Systems at
KACST and MIT
Riyadh, Saudi Arabia
amaatouq@mit.edu
Kael Greco
Senseable City Lab
Massachusetts Institute of
Technology
Cambridge MA, USA
kael@mit.edu
Riccardo Campari
Senseable City Lab
Massachusetts Institute of
Technology
Cambridge MA, USA
campari@mit.edu
Anas Alfaris
Center for Complex
Engineering Systems at
KACST and MIT
Riyadh, Saudi Arabia
anas@mit.edu
Carlo Ratti
Senseable City Lab
Massachusetts Institute of
Technology
Cambridge MA, USA
ratti@mit.edu
ABSTRACT
This paper presents the City Browser, a tool developed to analyze
the complexities underlying human mobility at the city scale. The
tool uses data generated from mobile phones as a proxy to provide
several insights with regards to the commuting patterns of the pop-
ulation within the bounds of a city. The three major components
of the browser are the data warehouse, modules and algorithm,
and the visualization interface. The modules and algorithm com-
ponent utilizes Call Detail Records (CDRs) stored within the data
warehouse to infer mobility patterns that are then communicated
through the visualization interface. The modules and algorithm
component consists of four modules: the spatial-temporal decom-
position module, the home/work capturing module, the community
detection module, and the flow estimation module. The visualiza-
tion interface manages the output of each module to provide a com-
prehensive view of a city’s mobility dynamics over varying time
scales. A case study is presented on the city of Riyadh in Saudi
Arabia, where the browser was developed to better understand city
mobility patterns.
Categories and Subject Descriptors
H.2.8 [Database Applications]: Data Mining; H.2.8 [Database
Applications]: Spatial Databases and GIS; H.4 [Information Sys-
tems Applications]: Miscellaneous
General Terms
Urban Computing, Data Analytics, City Science
Copyright is held by the author/owner(s).
UrbComp’14, August 24, 2014, New York, NY, USA.
Keywords
Call Detail Records, Mobile Applications, Urban Analysis
1. INTRODUCTION
Cities today house over 50 percent of the world's population, consume 60-80 percent of global energy, and emit almost 75 percent of greenhouse gases [18]. Some have suggested that almost 70 percent of the world's population will reside in cities by 2050 [18]. With rapid urban population growth, cities' infrastructures are being strained to the point of becoming a major hindrance to socioeconomic activity. Left unaddressed, the problem threatens to weigh down the return on investment from public projects being constructed throughout cities and to adversely affect the quality of life of all residents.
Understanding the complexities underlying the emergent behavior of human travel patterns at the city level is essential to informed decision-making about urban transportation infrastructures [2]. Traditional methods of assessing the social demand on transportation are expensive and time-consuming to conduct [10, 19, 21]. Such assessments usually take the form of surveys whose sample sizes are small compared to the total population of a city. Furthermore, such methods lack the temporal resolution needed for fine-grained analysis of human travel.
Road counter technologies such as pressure tubes, inductive loops and other traffic counting techniques allow travelers to be counted at a finer time resolution; however, their drawback is spatial resolution. They are usually highly local and capture activity at a specific point in space that is minuscule with respect to the city as a whole [16]. Such techniques therefore cannot provide a holistic overview of the status of the system. In addition, deploying new traffic counting technologies can be extremely expensive at the scale of the world's mega-cities.
An alternative approach to capturing the social demand is to use data generated from mobile phones to model and understand human mobility [21]. Data pertaining to mobile phone usage can be gathered at different levels within the GSM network. Telecom companies usually do not keep track of all the data traffic running across their networks; however, they store certain information for billing purposes and network development. Call Detail Records, often referred to as CDRs, are one type of information telecom companies keep for billing purposes. Every time a user makes a phone call, sends a text message or uses the Internet, and even passively when the mobile device communicates with the cellular network access points, the mobile network keeps a record of the usage information and location in the CDRs [11]. Such a large data set can therefore be used as a proxy to understand the social demand on transportation infrastructures.
The motivation behind developing the browser is the demand for a tool that provides fine-grained analysis of the complexity of human travel within cities. The approach takes advantage of existing built infrastructure to sense the mobility of people, eliminating the financial and temporal burdens of traditional methods. The outcomes of the tool will assist both planners and the public in understanding the complexities of human mobility within their cities.
In this paper we present the City Browser, a tool that facilitates a simplified understanding of human mobility across a city. The rest of the paper is organized as follows: Section 2 describes existing methods and approaches, Section 3 presents the methodology of the browser, Section 4 describes the general architecture of the system, Section 5 describes each component of the tool in detail, Section 6 describes the visualization interface, and Section 7 presents results of the case study of the city of Riyadh in Saudi Arabia. The contributions of this paper can be summarized in the following two points:
• We propose an architecture that combines several known techniques for data collection, storage and analysis in one framework, in a meaningful context, to develop the "City Browser", which can aid in simplifying the complexity of human mobility across a city.
• We examine the usefulness of the system through a case study of Riyadh, Saudi Arabia. The case study contains 100 million real mobile phone activity records and demonstrates the process of analyzing a massive amount of data and, through visualization, distilling the bits into actionable insights.
2. BACKGROUND
Several research efforts have investigated approaches to modeling and understanding mobility demand within cities. Traditional methods of demand modeling infer the collective demand on transportation infrastructures through household or road surveys that gather information about users' behavior. Another approach has been to use theoretical models to estimate the number of trips and their directionality based on land use models. These approaches can be unreliable and carry financial and temporal costs. With the emergence of pervasive technologies around the world, researchers have started investigating human behavior through data gathered from mobile phones [8, 9, 12, 13].
Varying approaches have used the data as a proxy to better understand human mobility. The work ranges from decomposing the data along its different dimensions to gaining insights into behavioral patterns by applying algorithms and processes to models built on the data. Research investigating the dimensionality of the data includes work on using the spatial decomposition of aggregate activity to understand the dynamics of cities and universal patterns of human mobility [3, 9]. Other researchers have developed techniques to gain more insight from the data by creating algorithms that capture more of its hidden patterns [7, 12, 22]. For example, researchers have modeled the social network based on the data captured from users' interactions to better understand whether the composition of social communities correlates with geographical constraints [17]. Another approach has been to capture users' trips from the data set and aggregate them to gain insight into the flows of people around the city and their dynamics [4, 21]. Such understanding can help identify flawed urban planning in cities [23].
3. METHODOLOGY
The objective of the browser is to provide an understanding of the complexity underlying human mobility within a city. The browser captures the dynamics of the population distribution to investigate aspects pertaining to flows of people as well as the structure of the community. Investigating population localization dynamics provides information about emerging zones with higher population densities; certain dense zones emerge on a daily basis, like commercial areas on weekdays, while others emerge as a consequence of events that are not periodic in nature. The browser investigates whether the formation of periodic dense zones influences the segregation of the city's population into communities. It also provides information about how the city responds to events in terms of population commuting flows.
The approach to simplifying the complexity of human mobility is staged into four steps. Step 1 decomposes the population distribution across the spatial dimension over the course of a day, capturing the emergence of dense zones (see Subsection 3.1). Step 2 analyzes each individual in the CDRs to capture their home/work locations (see Subsection 3.2). Step 3 investigates the formation of communities within cities as a result of their home/work choices (see Subsection 3.3). Step 4 estimates flows of people within the city on the time scale of a day (see Subsection 3.4).
3.1 Spatial-Temporal Decomposition
The first phase of the methodology decomposes the population over the spatial dimension of the city on the scale of a day; it captures time series of the density of people in every zone with a time granularity of minutes. The technique quantifies the magnitude of mobile user activity within the defined time window, generating time series of user activity density for each zone covered by a cell tower. Observing densities at such fine time granularity reveals when, where and how fast different dense zones emerge.
3.2 Home/Work Places Capturing
The second phase takes a larger time scale spanning weeks to capture residential and business areas. It does so by identifying the locations where users spend most of their time during the day and at night (i.e. work and home locations, respectively) across a sufficiently long time interval. Aggregating the number of users who spend most of their time at a particular location captures zones that emerge as a result of daily routine activities, such as regular business areas and schools.
3.3 Community Detection
To better understand the influence of where people live and work, this phase investigates the formation of segregated communities based on home and work locations. The formation of a mobility community indicates that a subset of the population travels within confined bounds of the city (i.e. a neighborhood or group of neighborhoods) and tends not to cross those bounds. Such analysis can provide insight into the level of heterogeneity of trip sources and destinations.
3.4 Flows Estimation
To better understand daily commuting within a city, this phase captures flows within the city through the origin-destination estimation algorithm. The algorithm captures trips generated by users throughout the day and then aggregates the flows of people over a specified time window. The results show how dense zones emerge in terms of where the population visiting those zones comes from.
4. GENERAL ARCHITECTURE
The general architecture of the browser is composed of three major components: the data warehouse, the modules and algorithms, and the visualization interface. The data warehouse contains the data the modules and algorithms need to produce the insights and information shown through the visualization interface. The general architecture is shown in Figure 1. The data warehouse contains data pertaining to mobile phone usage as well as GIS information of the city and traffic counts. Four major modules reside within the modules and algorithms component: the spatial-temporal decomposition module, the home/work capturing module, the community detection module and the flows estimation module. Finally, the visualization interface combines the results produced by the modules and algorithms with GIS information of the city to provide a comprehensive dynamic view of human mobility within a city.
Figure 1: City Browser general architecture
"details of the implementation of the architecture"
5. COMPONENTS
The City Browser is decomposed into components following the general architecture described in Section 4. This section provides the details of each component. Breaking the browser down into components allows for a more scalable, modular and simpler architecture for development. Each of the components is described below.
5.1 Data Warehouse
The data warehouse houses several datasets containing information about the structure of the city as well as its dynamics. It contains a geospatial database of the city, including the lookup table of cell tower locations used to map mobile phone activity to locations. In addition, it contains time series of mobile phone usage data as well as traffic counts.
The major part of the data warehouse is mobile phone billing information, also known as Call Detail Records (CDRs), which are records that telecom companies usually keep for the purpose of generating bills for customers. The CDRs are generated by mobile switching centers (MSCs) within GSM networks and go through several processing steps to be usable by telecom providers. The CDRs are finally structured in a table-like format holding information about phone activity details. Each entry in the CDRs table is a record representing an activity generated by a user. Every time a user makes a phone call, sends a text message or accesses the Internet, the CDRs keep a record of the cell tower that was used to facilitate the activity. In addition, the data warehouse contains a lookup table for cell tower geospatial information where each cell tower is mapped to its coordinates (i.e. latitude and longitude). Each record within the database is referred to as an activity; it is described by cell tower $c$, time $t$ and user $u$, and is represented as $a(c, t, u)$. For each user, the dataset contains a series of captured activities, represented in this paper as:
$$A_u = \{a_0, a_1, a_2, \ldots, a_n \mid u = u_{a_0} = u_{a_1} = u_{a_2} = \cdots = u_{a_n}\}$$
where $a_0$ is an activity record and $u_{a_0}$ is the user generating activity $a_0$. The data warehouse also contains traffic volume counts at specific points on the road network. Traffic counts are usually taken for a defined period of time, during which pressure tubes are placed on certain links to count the number of times vehicles pass across them. Furthermore, information about the geometry of the road network is housed within the data warehouse as a spatial database. The road network spatial database contains information about roads such as number of lanes, category, length and speed limit.
5.2 Modules and Algorithms
The Modules and Algorithms component is composed of four modules: the spatial-temporal decomposition module, the home/work capturing module, the community detection module, and the flows estimation module. Each of the modules is described below.
5.2.1 Spatial-Temporal Decomposition Module
The first step toward understanding the dynamics of a city on the scale of a day is to look at the dynamics of population densities across the city through aggregate user activities at each cell tower. This module breaks down the total activity of users along both the spatial and temporal dimensions. A similar approach was developed in [4]. For each cell tower within the city, the module generates time series of activity levels for a specified time granularity $\Delta t$. To capture the collective behavior of the population across the city, the module computes the aggregate activity level of users at every cell tower $c_i$ within the city. The aggregate phone activity level $AL(c_i, \Delta t)$ at cell tower $c_i$ for a time window $\Delta t$ is computed from the dataset as follows:
$$AL(c_i, \Delta t) = \sum_{c = c_i,\; t \in \Delta t} a(c, t, u)$$
where $a(c, t, u)$ is an activity generated through cell tower $c$ at time $t$. The time series for each location $c_i$ gives insight into the nature of the zone where the cell tower resides in terms of its use. For example, work areas within cities are expected to have a higher density of activity during work hours compared to residential areas. The module also provides insight into collective population behavior, showing when the city becomes alive in the morning. It also captures information on how users respond to events in terms of localization or service usage. The objective of developing this module is to provide a holistic overview of the change in population densities across space and time.
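The aggregation behind AL can be illustrated with a minimal sketch. It assumes each record is a (tower, timestamp, user) tuple and counts one unit of activity per record; the function name `activity_levels`, the tuple layout, and the sample values are hypothetical and not taken from the browser's implementation.

```python
from collections import Counter
from datetime import datetime


def activity_levels(records, window_minutes=15):
    """Count activities per (cell tower, time window).

    records: iterable of (tower_id, timestamp, user_id) tuples, mirroring a(c, t, u).
    window_minutes: window length; assumed here to divide 60 evenly.
    """
    levels = Counter()
    for tower_id, timestamp, _user_id in records:
        # Floor the timestamp to the start of the window it falls in.
        floored_minute = timestamp.minute - timestamp.minute % window_minutes
        window_start = timestamp.replace(minute=floored_minute, second=0, microsecond=0)
        levels[(tower_id, window_start)] += 1
    return levels


# Invented sample records (tower, time, user); not taken from the case study.
sample = [
    ("c1", datetime(2000, 1, 1, 9, 47), "u1"),
    ("c1", datetime(2000, 1, 1, 9, 59, 30), "u2"),
    ("c1", datetime(2000, 1, 1, 10, 0), "u1"),
    ("c2", datetime(2000, 1, 1, 9, 50), "u3"),
]
levels = activity_levels(sample)
```

Reading the counter for one tower across consecutive window starts gives the time series for that zone.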
5.2.2 Capturing Home/Work Places Module
Expanding the time interval of the analysis, this module captures work zones as well as residential zones. In essence, it takes the places where the majority of a user's daytime calls are made as a proxy for work locations. First, we separate activity records into two time windows to capture the most visited zones at daytime versus nighttime for a particular user $u$. Activities that hold potential work locations are collected in a set:
$$\mathrm{day}_u = \{a_0, a_1, a_2, \ldots, a_n \mid u = u_{a_0} = u_{a_1} = u_{a_2} = \cdots = u_{a_n},\; t_{a_i} \in \text{daytime}\}$$
where $a_0$ is an activity record, $u_{a_0}$ is the user generating activity $a_0$ and $t_{a_i}$ is the time tag of activity $a_i$. Similarly, $\mathrm{night}_u$ is obtained with the same logic for nighttime activity. Then the location $\mathrm{work}_u$ for user $u$ is chosen to be the most frequently occurring location in $\mathrm{day}_u$, and $\mathrm{home}_u$ is chosen to be the most frequently occurring location in $\mathrm{night}_u$.
After determining $\mathrm{work}_u$ and $\mathrm{home}_u$ for each user, the module aggregates the resulting zones where users spend most of their time during the day and night. Because the analysis uses a larger time granularity, this aggregation identifies dense zones that correspond to business and residential areas. Thus, this module quantifies the extent to which a zone can be considered a residential or business zone.
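A minimal sketch of the home/work assignment follows. The paper does not state the hours that separate daytime from nighttime, so the 08:00 to 18:00 daytime window below is purely an assumption, as are the function name `home_work_locations` and the record layout.

```python
from collections import Counter
from datetime import datetime

# Assumed boundaries; the source does not specify the daytime window.
DAY_START_HOUR = 8
DAY_END_HOUR = 18


def home_work_locations(records):
    """Return {user_id: (home_tower, work_tower)}.

    records: iterable of (tower_id, timestamp, user_id) tuples.
    A user with no daytime (or nighttime) records gets None for work (or home).
    """
    day, night = {}, {}
    for tower_id, timestamp, user_id in records:
        is_daytime = DAY_START_HOUR <= timestamp.hour < DAY_END_HOUR
        bucket = day if is_daytime else night
        bucket.setdefault(user_id, Counter())[tower_id] += 1

    result = {}
    for user_id in set(day) | set(night):
        # The most frequent tower in each set stands in for work_u and home_u.
        work = day[user_id].most_common(1)[0][0] if user_id in day else None
        home = night[user_id].most_common(1)[0][0] if user_id in night else None
        result[user_id] = (home, work)
    return result


# Invented sample: user u1 is mostly at tower "B" by day and tower "A" by night.
sample = [
    ("A", datetime(2000, 1, 1, 22, 0), "u1"),
    ("A", datetime(2000, 1, 2, 6, 30), "u1"),
    ("C", datetime(2000, 1, 2, 23, 15), "u1"),
    ("B", datetime(2000, 1, 3, 10, 0), "u1"),
    ("B", datetime(2000, 1, 3, 14, 0), "u1"),
    ("A", datetime(2000, 1, 3, 9, 0), "u1"),
]
locations = home_work_locations(sample)
```

Counting how many users share each home tower or work tower then gives the residential and business intensity of a zone.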
5.2.3 Community Detection Module
Building on the output of Section 5.2.2, this module investigates whether there are groups within the population forming communities that have similar home and work locations. The module begins with the city-wide network of connected zones $G(N, E)$, where $N$ is the set of cell towers within the city representing the zones and $E$ is the set of weighted directed edges, each weighted by the number of users whose home and work locations lie, respectively, in the zones corresponding to the edge's starting and terminating nodes. The adjacency matrix $A$ of this network is as follows:
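$$A = \begin{pmatrix} w_{0,0} & w_{0,1} & \cdots & w_{0,m} \\ w_{1,0} & w_{1,1} & \cdots & w_{1,m} \\ \vdots & \vdots & \ddots & \vdots \\ w_{m,0} & w_{m,1} & \cdots & w_{m,m} \end{pmatrix}$$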
where $w_{0,1}$ is the number of users having $\mathrm{home}_u = c_0$ and $\mathrm{work}_u = c_1$. The algorithm then uses a modularity optimization scheme, such that sets of nodes are clustered in a way that minimizes internal arc disruption [5, 14]. Each resulting community represents an area where a large fraction of users are mostly located during the day and night.
Modularity is a standard objective function in the field of community detection; it measures how well a partition of network nodes into communities reflects the characteristics of the underlying network (in our case the commuting flow among zones). The rationale behind modularity is that a group of nodes whose connections are mostly directed towards its own members represents a community with high modularity, while a set of nodes whose intra-community connections are no denser than what we would expect after randomly rewiring all the links contributes little to modularity.
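To make the objective concrete, the sketch below scores a given partition of a small home/work matrix. The paper does not say which modularity variant it optimizes; the code assumes a commonly used form for weighted directed networks, Q = (1/W) Σ_ij [w_ij − s_i^out s_j^in / W] δ(g_i, g_j), where W is the total weight and g_i the community of zone i. The function name and the sample matrix are hypothetical, and the code only evaluates a partition; it does not perform the optimization.

```python
def directed_modularity(weights, community):
    """Score a partition of a weighted directed network.

    weights[i][j]: number of users with home in zone i and work in zone j.
    community[i]: community label assigned to zone i.
    """
    n = len(weights)
    total = sum(sum(row) for row in weights)
    if total == 0:
        return 0.0
    out_strength = [sum(weights[i]) for i in range(n)]
    in_strength = [sum(weights[i][j] for i in range(n)) for j in range(n)]

    q = 0.0
    for i in range(n):
        for j in range(n):
            if community[i] == community[j]:
                # Observed weight minus the weight expected under random rewiring.
                expected = out_strength[i] * in_strength[j] / total
                q += weights[i][j] - expected
    return q / total


# Invented 4-zone example: zones 0-1 and zones 2-3 exchange most commuters.
w = [
    [10, 8, 0, 1],
    [9, 10, 1, 0],
    [0, 1, 10, 7],
    [1, 0, 8, 10],
]
q_blocks = directed_modularity(w, [0, 0, 1, 1])
q_mixed = directed_modularity(w, [0, 1, 0, 1])
q_single = directed_modularity(w, [0, 0, 0, 0])
```

In this example the partition that follows the two commuting blocks scores higher than the mixed one, and placing every zone in a single community scores zero, which matches the random-rewiring baseline described above.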
Communities resulting from modularity optimization of telecom-
munication data have been empirically shown to be representative
of the actual social and administrative boundaries at the level of
whole countries [7].
In the case of a city, we went further and studied communities at
the level of the neighborhood. The interesting results we obtained
are discussed in Section 7.
5.2.4 Flows Estimation Module
To capture the directionality and mobility of the population across the city, the browser houses an algorithm that provides information about the collective behavior of human mobility by mining mobile phone activity. The module estimates the aggregate flows of people across the city with a three-step algorithm that takes the CDRs as input and produces the aggregated flows of people between locations for every time window $\Delta t$ (i.e. the origin-destination matrix). A similar approach was developed in [4]. The module starts by arranging the data at the user level and considering each of a user's displacements as a potential trip. The resulting potential trips then go through a filtering process that removes noise. Finally, the last step aggregates the remaining trips along both the spatial and temporal dimensions to generate an origin-destination matrix for the time slice of interest.
The first step in the algorithm looks at phone activities at the user level and gathers all activities generated by each user, sorted in time, as follows:
$$A_u = \{a_0, a_1, a_2, \ldots, a_n \mid u = u_{a_0} = u_{a_1} = u_{a_2} = \cdots = u_{a_n},\; t_{a_0} < t_{a_1} < t_{a_2} < \cdots < t_{a_n}\}$$
where $A_u$ is the set of all activities generated by user $u$, $u_{a_i}$ is the user generating activity $a_i$ and $t_{a_i}$ is the time tag of activity $a_i$. Every two consecutive records belonging to the same user are merged into a pair of location records with their associated times, representing a potential displacement of the user. The set of displacements of a user is given by:
$$D_u = \{(c_{a_i}, c_{a_{i+1}}, t_{a_i}, t_{a_{i+1}}) \mid a_i, a_{i+1} \in A_u\}$$
where $D_u$ is the set of all potential displacements of user $u$, $c_{a_i}$ is the cell tower facilitating activity $a_i$, $t_{a_i}$ is the time tag of activity $a_i$ and $u_{a_i}$ is the user generating activity $a_i$. The set of potential displacements treats each pair of successive user activities as a potential trip, though this includes noisy data such as users who did not change location between successive activities but were nevertheless served by different nearby cell towers, a phenomenon referred to as localization error. To capture user trips in which a displacement actually occurs, we apply further filtering to the set of potential displacements $D_u$. The goal of the filtering process is to eliminate all captured pairs of location records that are noise in terms of trip capturing: records that are due to localization error, have very long time intervals, or show no movement. Entries that correspond to localization error are filtered out by eliminating all trips that are shorter than a specified distance, namely the maximum distance between any two neighboring cell towers within an urban setting. Given any two neighboring cell towers $c_{a_i}$ and $c_{a_j}$, each element within $D_u$ must satisfy the predicate below.
$$\mathrm{distance}(c_{a_i}, c_{a_{i+1}}) > \max[\mathrm{distance}(c_{a_i}, c_{a_j})]$$
where $\mathrm{distance}(c_{a_i}, c_{a_{i+1}})$ is the distance between the towers $c_{a_i}$ and $c_{a_{i+1}}$. The filter eliminates potential displacements whose distance does not exceed the maximum distance between any two neighboring towers in the city. In addition, each pair of records must satisfy $t_{a_{i+1}} - t_{a_i} < \alpha$, where $t_{a_i}$ is the time tag of activity $a_i$. That is, a pair of consecutive activity records whose time difference is more than the threshold $\alpha$ is filtered out of the set of displacements $D_u$, which reduces the uncertainty in capturing the actual departure and arrival times of trips.
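The first two steps (pairing consecutive records, then filtering) can be sketched as follows. The code assumes timestamps in seconds, tower coordinates as (latitude, longitude) pairs, and great-circle distance between towers; the function names, the 1 km distance threshold, and the one-hour value used for α are illustrative assumptions rather than values given in the paper.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def potential_displacements(activities):
    """Pair consecutive activities of ONE user into (c_i, c_i+1, t_i, t_i+1).

    activities: list of (tower_id, timestamp_seconds) tuples.
    """
    ordered = sorted(activities, key=lambda a: a[1])
    return [(c0, c1, t0, t1) for (c0, t0), (c1, t1) in zip(ordered, ordered[1:])]


def filter_displacements(displacements, tower_coords, min_distance_km, max_gap_seconds):
    """Keep displacements that look like real trips."""
    kept = []
    for c0, c1, t0, t1 in displacements:
        if c0 == c1:
            continue  # no movement detected
        if haversine_km(tower_coords[c0], tower_coords[c1]) <= min_distance_km:
            continue  # likely localization error between nearby towers
        if t1 - t0 >= max_gap_seconds:
            continue  # gap too long to pin down departure and arrival
        kept.append((c0, c1, t0, t1))
    return kept


# Invented towers: A and B are about 150 m apart, C is roughly 15 km away.
towers = {"A": (24.700, 46.700), "B": (24.701, 46.701), "C": (24.800, 46.800)}
user_activities = [("A", 0), ("B", 600), ("C", 1800), ("C", 2400), ("A", 50000)]
trips = filter_displacements(
    potential_displacements(user_activities),
    towers,
    min_distance_km=1.0,    # stands in for the max neighboring-tower distance
    max_gap_seconds=3600,   # stands in for the threshold alpha
)
```

Of the four potential displacements in this example, only the move from B to C survives: A to B is within the localization-error distance, C to C shows no movement, and C to A has too long a gap.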
The result of the filtering process is the set of displacements $\bar{D}_u$ containing all pairs of locations where movement was detected and a reasonable time duration for the trip was captured. After that, the final step towards the generation of OD matrices is to aggregate the trips according to the specified time slice into the origin-destination matrix given by:
$$OD(\Delta t) = \begin{pmatrix} 0 & T_{0,1} & T_{0,2} & \cdots \\ T_{1,0} & 0 & T_{1,2} & \cdots \\ T_{2,0} & T_{2,1} & 0 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$
where each element $T_{i,j}$ gives the number of trips captured from $c_i$ to $c_j$ during the time slice $\Delta t$. The value of $T_{i,j}$ is computed by:
$$T_{i,j}(\Delta t) = \sum \bar{D}_u(c_{a_n}, c_{a_{n+1}}, t_{a_n}, t_{a_{n+1}})$$
where $c_{a_n} \in i$, $c_{a_{n+1}} \in j$ and $t_{a_{n+1}} - t_{a_n} \in \Delta t$. Thus, $T_{i,j}$ quantifies the flows from zone $i$ to zone $j$ during the time window $\Delta t$.
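The aggregation step can be sketched as a sparse count of filtered displacements per origin-destination pair. How a trip is assigned to a time slice is not spelled out in the text, so the rule used here (both the departure and the arrival time must fall inside the slice) is an assumption, as are the function name `od_matrix` and the tuple layout.

```python
from collections import Counter


def od_matrix(displacements, slice_start, slice_end):
    """Count trips per (origin, destination) within [slice_start, slice_end).

    displacements: iterable of (origin, destination, t_depart, t_arrive) tuples
    pooled over all users, with times in seconds.
    Returns a Counter, i.e. a sparse OD matrix; missing pairs count as zero.
    """
    trips = Counter()
    for origin, destination, t_depart, t_arrive in displacements:
        # Assumed rule: the whole trip must lie inside the time slice.
        if slice_start <= t_depart and t_arrive < slice_end:
            trips[(origin, destination)] += 1
    return trips


# Invented filtered displacements from several users.
filtered = [
    ("c0", "c1", 100, 700),
    ("c0", "c1", 200, 850),
    ("c1", "c2", 300, 880),
    ("c0", "c1", 850, 1500),  # crosses the slice boundary, so it is not counted
]
od = od_matrix(filtered, slice_start=0, slice_end=900)  # one 15-minute slice
```

A sparse counter is used instead of a dense matrix because, with thousands of towers, most origin-destination pairs carry no trips in a single slice.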
6. VISUALIZATION INTERFACE
The visualization component shows the results of the modules and algorithms on two time scales, depending on the nature of their outputs. It visualizes the population density distribution and major flows of people across the city dynamically over the span of a day, while on longer time scales it shows a static map of the communities forming around the analysis of dense zones.
The visualization starts by showing the spatial-temporal decomposition of the population over the scale of a day. A dynamic visualization with a time granularity of 15 minutes captures population density variations across the day and night, as shown in Figure 3. This visualization presents a rotatable, scalable map onto which a shifting, three-dimensional grid is superimposed to show locational agglomerations of cellphone activity. Grid sectors rise and fall, and brighten and fade, as people move across the city using their mobile devices.
On the same scale of a day, the visualization component shows the directionality of human mobility through the output of the flow estimation module as well as the car counts stored in the data set. Major flows within the city, showing the aggregate commuting behavior throughout the day, are visualized with a time window of 15 minutes. The component visualizes the trips generated in each time slice as arcs that rise from the originating to the terminating cell tower. As shown in Figure 6, each arc embodies a variable number of trips; to illustrate this, its thickness and height vary with the intensity of activity along that route (on a logarithmic scale). The arcs are drawn over the same city base geography, on top of the social interaction mesh from above, in an effort to reveal unseen connections between the two results. In addition, car counts are built into the visualization as half-spheres placed at their respective intersections. Each sphere changes shape and color at an hourly rhythm in line with the measured volume.
On the longer time scale, to visualize the output of the community detection module, the visualization interface overlays the community network on the spatial dimension of the city to show whether there are correlations between the formation of communities and the urban fabric of the city. Nodes represent zones of the city and arcs represent groups of people who spend most of their time across the day and night between the connected nodes. The community detection module provides the set of nodes that belong to the same community. To visualize its output, nodes belonging to the same community are given the same color, as shown in Figure 5. Thus, areas where sub-communities spend most of their time during the day and night are bounded within zones of the same color.
7. CASE STUDY
Over the past decade, Saudi Arabia has taken strong steps towards developing a diversified economy, specifically by enhancing its Information and Communication Technology (ICT) infrastructure [1]. Today, Saudi Arabia has one of the highest Internet penetration levels in the Gulf area, with current penetration at 14.7 million users. It is also ranked among the highest countries worldwide in mobile penetration, with a rate of 188% [6]. The high mobile penetration rate in Saudi Arabia makes it an ideal candidate for using Call Detail Records (CDRs) as in situ sensors for human mobility.
The City Browser was implemented for the Urban Transportation System (UTS), a system developed to provide city planners with insights into the mobility of the population. The project started by gathering information related to the structure of the city as well as the dynamics of the population. The data gathered includes CDRs spanning the month of December, a spatial database of the road network of Riyadh, and traffic count data at different points within the city. Currently, the data is housed within the data warehouse, where several modules and algorithms use it to generate insights into the dynamics of the city.
7.1 Data Description
Our dataset consists of one full month of records for the entire country of Saudi Arabia, with 3 billion mobile activities across over 10 thousand unique cell towers, provided by a single carrier. Each record contains an anonymized user ID, the type of activity (i.e., SMS, MMS, call, data, etc.), the cell tower facilitating the service, the duration if it is a phone call, and the time stamp of the activity. Each cell tower ID is spatially mapped to its latitude and longitude. For privacy reasons, user ID information was completely anonymized on the telecom operator's side.
Previous studies [9, 15] have shown that human communication
patterns are highly heterogeneous: some users use their mobile
phones much more frequently than others. The characteristics of the
dynamics of individual communication activity shown in Fig 2
support this hypothesis.
Figure 2: Communication patterns in the CDR dataset. Fig 2a shows
the Empirical Cumulative Distribution Function (ECDF) of activity
durations. We find that almost 75% of the users conduct activities
that last for 70 seconds or less. Fig 2b shows the statistical
distribution of the number of communication records generated by
the users in a single day. Fig 2c shows the inter-event time
distribution Pr(∆t) of calling activity, where ∆t is the time
elapsed between consecutive communication records (outgoing phone
calls and SMS) for the same user.
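
The quantities plotted in Fig 2 can be illustrated with a minimal
Python sketch. The record layout follows the fields listed in
Section 7.1, but the class name `CDR`, the field names, the use of
seconds as the time unit, and the helper functions are assumptions
made for illustration, not the data warehouse's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CDR:
    """Hypothetical record layout mirroring the fields in Section 7.1."""
    user_id: str           # anonymized user ID
    activity: str          # e.g. "call", "sms", "mms", "data"
    tower_id: int          # cell tower facilitating the service
    timestamp: float       # seconds since an arbitrary epoch (assumed unit)
    duration: float = 0.0  # seconds; only meaningful for phone calls


def inter_event_times(records):
    """Return the gaps between consecutive records of the same user."""
    by_user = defaultdict(list)
    for r in records:
        by_user[r.user_id].append(r.timestamp)
    gaps = []
    for times in by_user.values():
        times.sort()
        gaps.extend(later - earlier for earlier, later in zip(times, times[1:]))
    return gaps


def ecdf(values):
    """Return sorted (x, F(x)) pairs of the empirical CDF of `values`."""
    xs = sorted(values)
    n = len(xs)
    points = {}
    for rank, x in enumerate(xs, start=1):
        points[x] = rank / n  # for tied values the largest rank wins
    return sorted(points.items())
```

Applying `ecdf` to the `duration` field of call records gives the
kind of curve shown in Fig 2a, and a histogram of
`inter_event_times` gives the kind of distribution shown in Fig 2c.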
7.2 City Spatial-temporal Decomposition
The first step towards understanding the data in the city of Riyadh
is to decompose cellular activity along the spatial and temporal
dimensions. The visualization in Figure 3 shows cellular activity
through color, transparency, and height (in logarithmic scale)
gridded across the metropolitan expanse of Riyadh. As opposed to
seeing the cell towers as discrete points in the city, we show
network traffic interpolated over a 100 by 100 grid. Each grid cell
is assigned an intensity based on its distance to surrounding
antennas and their activity levels using a Gaussian smoothing
function. The temporal activity is interpolated in a similar
manner, showing smooth transitions between each time slice in the
dataset. The city's downtown core quickly becomes clouded in a smog
of network activity early in the morning that hangs over the region
for the entire day. Clear sub-centers emerge that follow
construction density, and these sub-centers appear to be
partitioned by the roadway network itself.
Figure 3: Spatial-temporal decomposition output for a single time
slice. The figure demonstrates the time-cumulative spatial mobile
activity conducted between 9:45am and 10:00am.
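
The gridding step can be sketched as follows. This is a minimal
illustration of Gaussian-weighted interpolation over a regular
grid, not the browser's actual implementation: the function name
`gaussian_grid`, the bounding box argument, the bandwidth `sigma`,
and the use of plain Euclidean distance in degrees are all
assumptions.

```python
import math


def gaussian_grid(towers, bounds, n=100, sigma=0.01):
    """Interpolate tower activity onto an n x n grid.

    towers: iterable of (lat, lon, activity) tuples.
    bounds: (lat_min, lat_max, lon_min, lon_max) of the area of interest.
    sigma:  Gaussian bandwidth in degrees (hypothetical default).
    Returns a list of n rows (latitude bands), each with n intensities.
    """
    towers = list(towers)
    lat_min, lat_max, lon_min, lon_max = bounds
    dlat = (lat_max - lat_min) / n
    dlon = (lon_max - lon_min) / n
    grid = [[0.0] * n for _ in range(n)]
    for i in range(n):
        cell_lat = lat_min + (i + 0.5) * dlat      # cell center latitude
        for j in range(n):
            cell_lon = lon_min + (j + 0.5) * dlon  # cell center longitude
            total = 0.0
            for lat, lon, activity in towers:
                d2 = (lat - cell_lat) ** 2 + (lon - cell_lon) ** 2
                # Nearby, busy towers contribute most to the cell intensity.
                total += activity * math.exp(-d2 / (2.0 * sigma ** 2))
            grid[i][j] = total
    return grid
```

For rendering heights in logarithmic scale, as in Figure 3, each
intensity could be passed through `math.log1p` before display. A
production version would likely use projected coordinates and a
vectorized or truncated kernel rather than this brute-force loop.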
The city’s shifting activity profile also highlights a rich
temporal signature of communication that is all Riyadh’s own.
Watching the oscillations of the activity landscape, we see that
Riyadh comes alive at around 6:15am. We also see strong regional
delineation: the residential neighborhoods to the southwest and
northeast of the downtown core come alive well before the rest of
the city, and experience the strongest inter-hour fluctuations
throughout the course of the day. Finally, we see some peculiar
discontinuities in aggregate talk throughout the day, almost as if
all phone traffic were suddenly halved at strange intervals.
7.3 Capturing Home/Work Places
A fundamental step in analyzing mobility behavior is to examine the
emergence of zones with higher densities at a coarser time
granularity, in order to understand the distribution of residential
and business zones. Expanding our time intervals to capture broader
day and night variation, we can begin to differentiate dense
business areas and schools from dense residential neighborhoods.
Figure 4: Dense work zones during the day versus home locations
during the night. We observe high day-densities at the periphery
where major universities are located.
The map in Figure 4 highlights the contrast between purely
day-dense zones, shaded towards red, and purely night-dense zones,
shaded towards blue, showing some mono-centrically clustered day
hotspots that follow the overall spatial logic of the city. At the
periphery we also see a number of universities show up strongly as
day locations. Lastly, we see high agglomerations of residences to
both the south and east of the city, with smaller pockets scattered
throughout.
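
One simple way to obtain a day-versus-night contrast of this kind
is a normalized difference of activity counts per tower. The sketch
below is an illustrative assumption rather than the home/work
capturing module itself: the hour windows, the function name
`day_night_index`, and the index definition are all hypothetical.

```python
from collections import Counter

# Assumed time windows; the hours used by the module are not given here.
DAY_HOURS = frozenset(range(9, 17))
NIGHT_HOURS = frozenset(range(0, 6)) | {22, 23}


def day_night_index(events, day_hours=DAY_HOURS, night_hours=NIGHT_HOURS):
    """Score each tower from -1 (purely night) to +1 (purely day).

    events: iterable of (tower_id, hour_of_day) pairs, hour in 0..23.
    Towers with no activity in either window are left out of the result.
    """
    day, night = Counter(), Counter()
    for tower_id, hour in events:
        if hour in day_hours:
            day[tower_id] += 1
        elif hour in night_hours:
            night[tower_id] += 1
    index = {}
    for tower_id in set(day) | set(night):
        d, n = day[tower_id], night[tower_id]
        index[tower_id] = (d - n) / (d + n)
    return index
```

Under this convention, values near +1 would map to the red end of a
color scale such as the one in Figure 4 and values near -1 to the
blue end.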
7.4 Detecting Mobility Communities
The work/home dense zone visualizations shown in Section 7.3 point
to an organizational logic of the city. Conceptualizing the
totality of day/night commutes as a city-wide mobility network, we
can conceivably break this network into sub-communities by applying
a regional delineation algorithm.
Figure 5: Community Detection Module results plotted by latitude
and longitude on the map of Riyadh. We find support for the
commonly held belief that heavily trafficked streets, on many
levels, are instruments of segregation and control.
By overlaying the results of the community detection module on the
geography of the city (see Figure 5), a number of interesting
relationships are revealed between the detected communities and the
built form of the city. Most strikingly, the resulting clusters
closely correlate with the main arterials of the city’s roadway
infrastructure. Mobility communities seem to be partitioned by the
street network itself, underscoring the city’s dependence on
highway infrastructure, while also supporting the commonly held
belief that heavily trafficked streets, on many levels, are
instruments of segregation and control, or, perhaps more
optimistically: good streets make good neighbors.
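
To make the idea of splitting a commute network into communities
concrete, the following sketch applies a naive greedy modularity
agglomeration, in the spirit of the modularity methods of [5, 14].
It is an assumption that the module works along these lines; the
function name `greedy_modularity` and the edge-list input format
are hypothetical, and this version favors readability over the
efficiency needed for city-scale networks.

```python
from collections import defaultdict


def greedy_modularity(edges):
    """Greedily merge communities while modularity increases.

    edges: iterable of (u, v, weight) for an undirected graph, e.g.
           commute counts between home and work zones.
    Returns a list of sets of nodes, one set per detected community.
    """
    between = defaultdict(float)  # weight between each pair of communities
    degree = defaultdict(float)   # total weighted degree of each community
    total = 0.0                   # total edge weight W
    for u, v, w in edges:
        if u == v:
            continue              # self-loops are ignored in this sketch
        between[frozenset((u, v))] += w
        degree[u] += w
        degree[v] += w
        total += w
    members = {node: {node} for node in degree}
    while True:
        best_gain, best_pair = 0.0, None
        for pair, w in between.items():
            a, b = tuple(pair)
            # Modularity change if a and b merge: e_ab/W - k_a*k_b/(2*W^2).
            gain = w / total - degree[a] * degree[b] / (2.0 * total ** 2)
            if gain > best_gain:
                best_gain, best_pair = gain, (a, b)
        if best_pair is None:
            break                 # no merge improves modularity any further
        a, b = best_pair
        members[a] |= members.pop(b)
        degree[a] += degree.pop(b)
        merged = defaultdict(float)
        for pair, w in between.items():
            relabeled = frozenset(a if x == b else x for x in pair)
            if len(relabeled) == 2:  # drop edges now internal to a community
                merged[relabeled] += w
        between = merged
    return list(members.values())
```

On a toy network of two tightly connected triangles joined by a
single link, this procedure stops with one community per triangle,
which is the behavior one would expect when a sparse connection
(such as a major arterial) separates two dense clusters.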
7.5 Flow Estimation
Understanding the flows that contribute to dense-zone emergence at
a smaller time granularity unveils rich information pertaining to
the sources of dense zones as well as the distribution of flow over
time. By collecting and filtering each user’s mobile activity as a
sequence of cell tower locations and then aggregating the trips of
all users, we are able to estimate flows in terms of origins and
destinations of trips. We have observed that these estimated flows
contribute to the emergence of high-density zones in the city of
Riyadh; moreover, this approach includes the added benefit of
capturing travel demand at highly dynamic time slices ranging from
seasonal variations to hourly fluctuations. Such a high temporal
resolution has the potential to transform our understanding of
urban mobility [20].
Figure 6: The extracted Origin Destination (OD) matrix across
Riyadh at the time slice of 9:30-9:45am. The height of the line
corresponds to the number of trips between a specific OD.
The resulting dynamic maps bore a striking similarity to the local
intuition of vehicular flows across the city (see Figure 6).
Overall flows correspond quite closely to the underlying street
network. Most notably, Figure 6 shows intense activity along the
city’s main arterials: King Fahd Road and the Northern and Eastern
Ring roads. This agrees with the local community’s subjective
understanding of commute patterns across the city. To further
validate our results, we compared them against the best
ground-truth measurements of roadway activity: car count volumes
captured by pressure-tube sensors placed at multiple intersections
across the city.
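
The aggregation of per-user tower sequences into
origin-destination counts per time slice can be sketched as below.
This is a simplified illustration under stated assumptions: a trip
is taken to be any pair of consecutive records at different towers,
the 15-minute slice length matches the slices shown in the figures,
and the `max_gap` filter and the function name `od_matrix` are
hypothetical. The flow estimation module's own filtering rules are
not reproduced here.

```python
from collections import Counter, defaultdict


def od_matrix(records, slice_seconds=900, max_gap=3600):
    """Count tower-to-tower trips per time slice.

    records: iterable of (user_id, timestamp_seconds, tower_id).
    slice_seconds: length of a time slice (900 s = 15 minutes).
    max_gap: consecutive records further apart than this are not
             treated as one trip (assumed filter).
    Returns a Counter keyed by (slice_index, origin, destination).
    """
    by_user = defaultdict(list)
    for user_id, ts, tower in records:
        by_user[user_id].append((ts, tower))
    flows = Counter()
    for events in by_user.values():
        events.sort(key=lambda e: e[0])  # order each user's records by time
        for (t0, origin), (t1, dest) in zip(events, events[1:]):
            if origin == dest or t1 - t0 > max_gap:
                continue
            # The trip is assigned to the slice in which it started.
            flows[(int(t0 // slice_seconds), origin, dest)] += 1
    return flows
```

Summing the counter over all slice indices gives a static OD
matrix, while keeping the slice index preserves the hourly and
sub-hourly fluctuations discussed above.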
8. SUMMARY AND FUTURE WORK
In this paper, we have presented a new tool addressing the
complexity of human mobility in cities and showed its application
to the city of Riyadh, the capital of Saudi Arabia, through the UTS
project. The project developed the Riyadh Mobility Browser by
implementing several modules that mine data generated from mobile
phones to provide a coherent understanding of the dynamics of the
interaction between the city's social structure and its
transportation infrastructure.
At the current stage, the browser is built to work with historical
data and thus provides an after-the-fact analysis; it does not
allow for the parsing and analysis of data in real time. A
potential direction for future work is to investigate the
possibility of enabling the browser to parse such big data in real
time by establishing a live data feed with GSM network operators.
The city mobility browser synthesizes and extends existing al-
gorithms to provide a holistic decomposition of the complexity of
mobility across multiple dimensions. Although the browser cap-
tures the dynamics of the demand on transportation, it does not
map the demand over the road network of the city.
We also acknowledge that some of the explanations and conclusions
proposed in this work might lack rigorous validation, owing to the
nature of CDRs, which lack sufficient granularity in space and
time. Spatially, the data is mapped to the locations of cell towers
and not the exact locations of users; the coordinates of cell
towers are therefore used as a proxy for the exact locations of
users. Temporally, users exhibit bursty phone usage behavior:
activities are clustered around particular times of the day rather
than spread out across the day, which limits how comprehensively
mobility can be understood. However, we believe that our analysis
describes well the current trends and phenomena of human mobility
and can be leveraged in planning the city and its transportation
operations.
The visualizations provided by the tool give a dynamic qualitative
understanding of the spatial attributes of the city as well as the
directionality of its population across different times of the day.
The city mobility browser is envisioned to be a tool that can
provide planners, engineers, and the public with an
easy-to-understand analysis while capturing fine-grained details
about the city. Future work could also enable the visualization
interface to provide quantitative analysis and a better
understanding of emerging patterns.
9. REFERENCES
[1] A. Almaatouq, F. Alhasoun, R. Campari, and A. Alfaris. The
influence of social norms on synchronous versus
asynchronous communication technologies. In Proceedings
of the 1st ACM International Workshop on Personal Data
Meets Distributed Multimedia, PDM ’13, pages 39–42, New
York, NY, USA, 2013. ACM.
[2] M. Batty, K. Axhausen, F. Giannotti, A. Pozdnoukhov,
A. Bazzani, M. Wachowicz, G. Ouzounis, and Y. Portugali.
Smart cities of the future. The European Physical Journal
Special Topics, 214(1):481–518, 2012.
[3] D. Brockmann, L. Hufnagel, and T. Geisel. The scaling laws
of human travel. Nature, 439(7075):462–465, Jan. 2006.
[4] F. Calabrese, M. Colonna, P. Lovisolo, D. Parata, and
C. Ratti. Real-time urban monitoring using cell phones: A
case study in Rome. IEEE Transactions on Intelligent
Transportation Systems, 12(1):141–151, 2011.
[5] A. Clauset, M. E. J. Newman, and C. Moore. Finding
community structure in very large networks. Physical Review
E, 70:066111, 2004.
[6] Communications and Information Technology Commission. ICT
indicators, Q2-2012, 2012.
[7] Y.-A. de Montjoye, C. A. Hidalgo, M. Verleysen, and V. D.
Blondel. Unique in the Crowd: The privacy bounds of human
mobility. Scientific Reports, 3, Mar. 2013.
[8] N. Eagle and A. (Sandy) Pentland. Reality mining: Sensing
complex social systems. Personal Ubiquitous Comput.,
10(4):255–268, Mar. 2006.
[9] M. C. Gonzalez, C. A. Hidalgo, and A.-L. Barabasi.
Understanding individual human mobility patterns. Nature,
453(7196):779–782, June 2008.
[10] J. Herrera, D. Work, R. Herring, X. Ban, Q. Jacobson, and
A. Bayen. Evaluation of traffic data obtained via
GPS-enabled mobile phones: The Mobile Century field
experiment. Transportation Research Part C, 18(4):568–583,
August 2010.
[11] S. Jiang, G. A. Fiore, Y. Yang, J. Ferreira, Jr., E. Frazzoli,
and M. C. González. A review of urban computing for
mobile phone traces: Current methods, challenges and
opportunities. In Proceedings of the 2nd ACM SIGKDD
International Workshop on Urban Computing, UrbComp
’13, pages 2:1–2:9, New York, NY, USA, 2013. ACM.
[12] S. Jiang, J. Ferreira, Jr., and M. González. Discovering urban
spatial-temporal structure from human activity patterns. In
Proceedings of the ACM SIGKDD International Workshop
on Urban Computing, pages 95–102, 2012.
[13] J. R. Kwapisz, G. M. Weiss, and S. A. Moore. Activity
recognition using cell phone accelerometers. SIGKDD
Explor. Newsl., 12(2):74–82, Mar. 2011.
[14] M. E. Newman. Modularity and community structure in
networks. Proc Natl Acad Sci U S A, 103(23):8577–8582,
June 2006.
[15] J. Onnela, J. Saramaki, J. Hyvonen, G. Szabo, D. Lazer,
K. Kaski, J. Kertesz, and A. L. Barabasi. Structure and tie
strengths in mobile communication networks. Proc. Natl.
Acad. Sci. USA, 104(18):7332–7336, 2007.
[16] S. Peeta and P. Zhang. Counting device selection and
reliability: Synthesis study. Technical report, Purdue
University.
[17] C. Ratti, S. Sobolevsky, F. Calabrese, C. Andris, J. Reades,
M. Martino, R. Claxton, and S. H. Strogatz. Redrawing the
map of Great Britain from a network of human interactions.
PLoS ONE, 5(12):e14248, Dec. 2010.
[18] P. Rode and R. Burdett. Cities: investing in energy and
resource efficiency. In United Nations Environment
Programme, (corp. ed.) Towards a Green Economy:
Pathways to Sustainable Development and Poverty
Eradication, pages 453–492. United Nations Environment
Programme, 2011.
[19] W. Shen and L. Wynter. Real-time traffic prediction using
GPS data with low sampling rates: A hybrid approach. In
91st Transportation Research Board Annual Meeting,
number 12-1692, Washington, D.C., January 2012.
[20] J. L. Toole, S. Colak, F. Alhasoun, A. Evsukoff, and M. C.
Gonzalez. The path most travelled: Mining road usage
patterns from massive call data. arXiv preprint
arXiv:1403.0636, 2014.
[21] P. Wang, T. Hunter, A. M. Bayen, K. Schechtner, and M. C.
González. Understanding Road Usage Patterns in Urban
Areas. Scientific Reports, 2, Dec. 2012.
[22] J. Yuan, Y. Zheng, and X. Xie. Discovering regions of
different functions in a city using human mobility and POIs.
In Proceedings of the 18th ACM SIGKDD international
conference on Knowledge discovery and data mining, pages
186–194. ACM, 2012.
[23] Y. Zheng, Y. Liu, J. Yuan, and X. Xie. Urban computing with
taxicabs. In Proceedings of the 13th international conference
on Ubiquitous computing, pages 89–98. ACM, 2011.
... (2) duration of the connection; (3) the caller's location; (4) the type of connection (phone call, SMS, internet query, etc.) and (5) the user's type of service (subscription, prepaid, etc.) (Alhasoun, et al., 2014). ...
... They defined the home-work line as the start and the end points of a commute. This was the first step towards understanding travel demands (Greco, 2014;Alhasoun, et al., 2014). ...
... From the urban perspective, an individual's mobility pattern is one of the most meaningful information, which describes one of the most vital components of urban analysis: origin-destination matrices. Creating accurate O-D matrices is a crucial component of transportation network optimization, not only for assessing moment-to-moment capacity constraints, but also for forecasting future needs (Alhasoun, et al., 2014). ...
Thesis
Cities are facing great challenges and incremental increases in population size, transportation demands, and infrastructure, thus there is a need to deal with the city in a more intelligent way and use futuristic prediction tools in order to enhance the quality of life. Nowadays, there are increasing amounts of data sources, which are responsible for reshaping cities. This new enthusiasm for technology, algorithms, and applications can provide a better understanding of cities. It also enables stakeholders to have predictive statistics to help in the decisionmaking process. Data is never a new thing, but data sources are always changing. The internet made everything easier and reachable using a wide range of technologies such as IOT (Internet of Things) and M2M (Machine to Machine) etc. All of this offers a new potential to deliver an analytical framework for urban optimization. According to the technologically mutated data sources, currently, data comes from weather channels, street security cameras, Facebook, Twitter, sensor networks, in-car devices, location-based smartphone applications, RFID tags, and smart meters, to mention a few (Hinssen, 2012). This massive amount of data, which comes from real-time-based tools, has put the world in a new era called the era of ‘Big Data’. Accordingly, smarter approaches for urban development are highly needed to manage this enormous amount of data that exists in a sustainable way. A knowledge-based approach will enable cities to cut costs, save energy, improve their services, optimize their infrastructure, enhance the quality of life of their citizens, reduce their environmental footprints, fuel innovation and drive sustainable economic growth. This dissertation is concerned with the newly emerged technologies in the urban informatics domain and will review developed systems worldwide, comparing and analyzing them. 
It will examine the cities' experiences in an attempt to learn and extract guidance from their experiences in order to enhance and engage technology in the urban development process in Egypt, Alexandria City. This will be achieved through developing a city dashboard for Alexandria, Egypt. It will be designed to monitor traffic using mobile signals and will also review all the hashtags that are related to Alexandria from the most famous social network platforms (Twitter, Facebook). The dashboard will involve the user-generated data as part of the decision-making process and provide close insights into citizens’ status among other diverse data sources that are essential not only for city decision-makers but also for individual citizens as well.
... Thus, different aspects of cellular network data have been used for different researches [6]. Among them we find; studies on patterns of mobility [7], [8], large scale studies of urban mobility [9], [10], [11], [12] conception of human mobility models [13], [14], [15], [16], studies on land usage [17], [18], etc. Although the literature abounds with examples of cellular data usages, the suitability of such data to identify and characterize human mobility is discussed [19], and many inherent aspects of these data such as granularity and coarse location estimates, may disrupt mobility models. ...
... Then, although admitting that roaming information was not the most pertinent, [27] used CDR to evaluate the proportion of tourists in a territory, whereas [28] used phone calls, ticketing and online photos information to extract tourist statistics. Such individual profiling has also been developed in [10] which leverages cellular data to detect communities inside a city. ...
Article
Full-text available
With the rapid growth of cell phone networks during the last decades, call detail records (CDR) have been used as approximate indicators for large scale studies on human and urban mobility. Although coarse and limited, CDR are a real marker of human presence. In this paper, we use more than 800 million of CDR to identify weekly patterns of human mobility through mobile phone data. Our methodology is based on the classification of individuals into six distinct presence profiles where we focus on the inherent temporal and geographical characteristics of each profile within a territory. Then, we use an event-based algorithm to cluster individuals and we identify 12 weekly patterns. We leverage these results to analyze population estimates adjustment processes and as a result, we propose new indicators to characterize the dynamics of a territory. Our model has been applied to real data coming from more than 1.6 million individuals and demonstrates its relevance. The product of our work can be used by local authorities for human mobility analysis and urban planning.
... Consists of one full month of records for the entire country, with 3 billion mobile activities to over 10, 000 unique cell towers, provided by a single telecommunication service provider [2,1]. Each record contains: i) an anonymized user identifier; ...
Preprint
The mapping of populations socio-economic well-being is highly constrained by the logistics of censuses and surveys. Consequently, spatially detailed changes across scales of days, weeks, or months, or even year to year, are difficult to assess; thus the speed of which policies can be designed and evaluated is limited. However, recent studies have shown the value of mobile phone data as an enabling methodology for demographic modeling and measurement. In this work, we investigate whether indicators extracted from mobile phone usage can reveal information about the socio-economical status of microregions such as districts (i.e., average spatial resolution < 2.7km). For this we examine anonymized mobile phone metadata combined with beneficiaries records from unemployment benefit program. We find that aggregated activity, social, and mobility patterns strongly correlate with unemployment. Furthermore, we construct a simple model to produce accurate reconstruction of district level unemployment from their mobile communication patterns alone. Our results suggest that reliable and cost-effective economical indicators could be built based on passively collected and anonymized mobile phone data. With similar data being collected every day by telecommunication services across the world, survey-based methods of measuring community socioeconomic status could potentially be augmented or replaced by such passive sensing methods in the future.
... Other researchers like Gonzalez et al. [12] and Song et al. [13] also analyzed mobility patterns and urban mobility in the past decade. Real-time urban monitoring studies also were conducted by using cellular data [14,15]. ...
Article
Human mobility patterns are associated with many aspects of our life. With the increase of the popularity and pervasiveness of smartphones and portable devices, the Internet of Things (IoT) is turning into a permanent part of our daily routines. Positioning technologies that serve these devices such as the cellular antenna (GSM networks), global navigation satellite systems (GPS), and more recently the WiFi positioning system (WPS) provide large amounts of spatio-temporal data in a continuous way (data streams). In order to understand human behavior, the detection of important places and the movements between these places is a fundamental task. That said, the proposal of this work is a method for discovering user habits over mobility data without any a priori or external knowledge. Our approach extends a density-based clustering method for spatio-temporal data to identify meaningful places the individuals’ visit. On top of that, a Gaussian mixture model (GMM) is employed over movements between the visits to automatically separate the trajectories accordingly to their key identifiers that may help describe a habit. By regrouping trajectories that look alike by day of the week, length, and starting hour, we discover the individual’s habits. The evaluation of the proposed method is made over three real-world datasets. One dataset contains high-density GPS data and the others use GSM mobile phone data with 15-min sampling rate and Google Location History data with a variable sampling rate. The results show that the proposed pipeline is suitable for this task as other habits rather than just going from home to work and vice versa were found. This method can be used for understanding person behavior and creating their profiles revealing a panorama of human mobility patterns from raw mobility data.
... Therefore, the geo-location of the cell site indicates that the mobile phone user is inside the coverage area of the cell tower while making the phone call. Therefore, CDR can be useful for activity analysis in terms of user behaviour [6]- [8] and patterns of urban community [9], [10]. ...
Conference Paper
Full-text available
Since mobile phone has become one of the most popular communication method. In order to find different characteristics of each cell towers and locations from various type of data collecting within CDRs. We explore CDRs to find amount of people the city in a period of time, to analyze highly active period and inactive hour in day of weeks. Behavior of mobile phone usage. And implement clustering algorithm to find a proper number which gather distinctive usage patterns from several cell towers within 24 study locations, which are located in Bangkok and surroundings area. In this study, we also discovered some patterns that can acutely describe area use.
... It was discovered that people tend to follow simple reproducible patterns. Alhasoun et al. inferred individual home/ work locations by analyzing users' CDRs, then investigated the formation of segregated communities based on users' home and work locations, and estimated people flows within the city within a day time scale [20]. Bayir et al. used cellular networks of real-world cell phone data to analyze human mobility in city-wide level. ...
Article
Full-text available
The prevalence of smartphones equipped with various sensors enables pervasive capture of users’ location data. WiFi scan lists on one smartphone, i.e., scan results of network in a range, can roughly indicate the physical location of the phone in a time period. Considering the close relationship between location and daily life, users’ life style can be inferred from their WiFi scan lists. Given the issue of user privacy, in this paper, we explore anonymized WiFi scan lists to discover users’ life style. Individual life style about mobility and important places of home and workplaces is discovered, respectively, based on the stay places extracted from anonymized WiFi scan lists and the reconstructed mobility trajectories. We first learn the life style about mobility by detecting activity areas from mobility trajectories and introducing two metrics of activeness and diversity to measure individual mobility. Then, we discover the life style about the home and workplaces identified from anonymized WiFi scan lists, such as stay duration at home, activeness of going outside at night, and working hours on weekdays and weekends. Experiments were conducted on a real-world large-scale dataset, which contains records of smart phone usage of more than 17,000 volunteering participants. Our work is a promising step towards automatically discover people’s life style from anonymized smartphone data.
... From the research, authors found that people usually follow simple routines involving a few frequent places. Similar mobility is observed in [15] for analyzing in Saudi Arabia. Noulas et al. found human mobility patterns are following a universal law, which is the probability of people traveling from one place to another is related to the relative rank of places [16]. ...
Article
Today's modern communication technologies such as cloud radio access and software defined networks are key candidate technologies for enabling 5G networks as they incorporate intelligence for data-driven networks. Traditional content caching in the last mile access point has shown a reduction in the core network traffic. However, the radio access network still does not fully leverage such solution. Transmitting duplicate copies of contents to mobile users consumes valuable radio spectrum resources and unnecessary base station energy. To overcome these challenges, we propose huMan mObility-based cOntent Distribution (MOOD) system. MOOD exploits urban scale users' mobility to allocate radio resources spatially and temporally for content delivery. Our approach uses the broadcast nature of wireless communication to reduce the number of duplicated transmissions of contents in the radio access network for conserving radio resources and energy. Furthermore, a human activity model is presented and statistically analyzed for simulating people daily routines. The proposed approach is evaluated via simulations and compared with a generic broadcast strategy in an actual existing deployment of base stations as well as a smaller cells environment, which is a trending deployment strategy in future 5G networks. MOOD achieves 15.2% and 25.4% of performance improvement in the actual and small-cell deployment, respectively.
Article
Full-text available
Extensive theoretic work attempts to address the role of social norms in describing, explaining and predicting human behaviors. However, traditional methods of assessing the effect can be expensive and time consuming. In this work, we utilize data generated by the call detail records (CDRs) and geo-tagged Tweets (GTTs) as enabling proxies for understanding human activity patterns. We present preliminary results on the effect of social norms on communication patterns during different times of the day, including prayer times. Specifically, we investigate the variations in population behavioral patterns with respect to social norms between asynchronous (i.e., Twitter) and synchronous (i.e., phone calls) communication mediums in the city of Riyadh, the capital of Saudi Arabia.
Conference Paper
Full-text available
In this work, we present three classes of methods to extract information from triangulated mobile phone signals, and describe applications with different goals in spatiotemporal analysis and urban modeling. Our first challenge is to relate extracted information from phone records (i.e., a set of time-stamped coordinates estimated from signal strengths) with destinations by each of the million anonymous users. By demonstrating a method that converts phone signals into small grid cell destinations, we present a framework that bridges triangulated mobile phone data with previously established findings obtained from data at more coarse-grained resolutions (such as at the cell tower or census tract levels). In particular, this method allows us to relate daily mobility networks, called motifs here, with trip chains extracted from travel diary surveys. Compared with existing travel demand models mainly relying on expensive and less-frequent travel survey data, this method represents an advantage for applying ubiquitous mobile phone data to urban and transportation modeling applications. Second, we present a method that takes advantage of the high spatial resolution of the triangulated phone data to infer trip purposes by examining semantic-enriched land uses surrounding destinations in individual's motifs. In the final section, we discuss a portable computational architecture that allows us to manage and analyze mobile phone data in geospatial databases, and to map mobile phone trips onto spatial networks such that further analysis about flows and network performances can be done. The combination of these three methods demonstrate the state-of-the-art algorithms that can be adapted to triangulated mobile phone data for the context of urban computing and modeling applications.
Article
Full-text available
Rapid urbanization places increasing stress on already burdened transportation systems, resulting in delays and poor levels of service. Billions of spatiotemporal call detail records (CDRs) collected from mobile devices create new opportunities to quantify and solve these problems. However, there is a need for tools to map new data onto existing transportation infrastructure. In this work, we propose a system that leverages this data to identify patterns in road usage. First, we develop an algorithm to mine billions of calls and learn location transition probabilities of callers. These transition probabilities are then upscaled with demographic data to estimate origin-destination (OD) flows of residents between any two intersections of a city. Next, we implement a distributed incremental traffic assignment algorithm to route these flows on road networks and estimate congestion and level of service for each roadway. From this assignment, we construct a bipartite usage network by connecting census tracts to the roads used by their inhabitants. Comparing the topologies of the physical road network and bipartite usage network allows us to classify each road's role in a city's transportation network and detect causes of local bottlenecks. Finally, we demonstrate an interactive, web-based visualization platform that allows researchers, policymakers, and drivers to explore road congestion and usage in a new dimension. To demonstrate the flexibility of this system, we perform these analyses in multiple cities across the globe with diverse geographical and sociodemographic qualities. This platform provides a foundation to build congestion mitigation solutions and generate new insights into urban mobility.
Article
Full-text available
Here we sketch the rudiments of what constitutes a smart city which we define as a city in which ICT is merged with traditional infrastructures, coordinated and integrated using new digital technologies. We first sketch our vision defining seven goals which concern: developing a new understanding of urban problems; effective and feasible ways to coordinate urban technologies; models and methods for using urban data across spatial and temporal scales; developing new technologies for communication and dissemination; developing new forms of urban governance and organisation; defining critical problems relating to cities, transport, and energy; and identifying risk, uncertainty, and hazards in the smart city. To this, we add six research challenges: to relate the infrastructure of smart cities to their operational functioning and planning through management, control and optimisation; to explore the notion of the city as a laboratory for innovation; to provide portfolios of urban simulation which inform future designs; to develop technologies that ensure equity, fairness and realise a better quality of city life; to develop technologies that ensure informed participation and create shared knowledge for democratic city governance; and to ensure greater and more effective mobility and access to opportunities for urban populations. We begin by defining the state of the art, explaining the science of smart cities. We define six scenarios based on new cities badging themselves as smart, older cities regenerating themselves as smart, the development of science parks, tech cities, and technopoles focused on high technologies, the development of urban services using contemporary ICT, the use of ICT to develop new urban intelligence functions, and the development of online and mobile forms of participation. 
Seven project areas are then proposed: Integrated Databases for the Smart City; Sensing, Networking and the Impact of New Social Media; Modelling Network Performance, Mobility and Travel Behaviour; Modelling Urban Land Use, Transport and Economic Interactions; Modelling Urban Transactional Activities in Labour and Housing Markets; Decision Support as Urban Intelligence; and Participatory Governance and Planning Structures for the Smart City. Finally, we anticipate the paradigm shifts that will occur in this research and define a series of key demonstrators which we believe are important to progressing a science of smart cities.
Conference Paper
Urban geographers, planners, and economists have long been studying urban spatial structure to understand the development of cities. Statistical and data mining techniques, as proposed in this paper, go a long way in improving our knowledge about human activities extracted from travel surveys. As of today, most urban simulators have not yet incorporated the various types of individuals by their daily activities. In this work, we detect clusters of individuals by daily activity patterns, integrated with their usage of space and time, and show that daily routines can be highly predictable, with clear differences depending on the group, e.g. students vs. part-time workers. This analysis presents the basis to capture collective activities at large scales and expand our perception of urban structure from the spatial dimension to the spatial-temporal dimension. It will be helpful for planners to understand how individuals utilize time and interact with urban space in metropolitan areas and crucial for the design of sustainable cities in the future.
Article
GPS devices, as an emerging mobile traffic data source, offer new opportunities for short-term traffic prediction, especially in arterial networks where traditional fixed-location sensors are sparse or unavailable. In particular, we consider GPS data that is provided in the form of point speeds, rather than trajectories. This is the case when GPS data from consumers is sampled at discrete points by a service provider, e.g. to protect the privacy of the consumers by not permitting a reconstruction of their trajectories. In the context studied in this paper, as well as others observed in practice, such GPS sampling rates are quite low, and link-level speed estimates based on a small sample of instantaneous GPS speed readings can be unreliable. Therefore, traditional time-series traffic prediction methods based on fixed-location data sources are usually inapplicable in this context. This paper presents a hybrid data mining approach for real-time traffic speed prediction based on such GPS data. It was found that reliable speed predictions can be obtained by combining GPS data with an additional, offline data source collecting link speed periodically. The primary contribution of GPS data comes from both the global and local count information within ranges of speed categories. The key elements of our approach include the neighboring distance criterion considering both local and global GPS count information, the ensemble rule, and the cross-validation framework. The example studied is drawn from the traffic prediction competition of the 2010 IEEE International Conference on Data Mining, in which the authors were part of a team that finished second worldwide.
Article
The development of a city gradually fosters different functional regions, such as educational areas and business districts. In this paper, we propose a framework (titled DRoF) that Discovers Regions of different Functions in a city using both human mobility among regions and points of interest (POIs) located in a region. Specifically, we segment a city into disjoint regions according to major roads, such as highways and urban expressways. We infer the functions of each region using a topic-based inference model, which regards a region as a document, a function as a topic, categories of POIs (e.g., restaurants and shopping malls) as metadata (like authors, affiliations, and key words), and human mobility patterns (when people reach/leave a region and where people come from and leave for) as words. As a result, a region is represented by a distribution of functions, and a function is featured by a distribution of mobility patterns. We further identify the intensity of each function in different locations. The results generated by our framework can benefit a variety of applications, including urban planning, location choosing for a business, and social recommendations. We evaluated our method using large-scale and real-world datasets, consisting of two POI datasets of Beijing (in 2010 and 2011) and two 3-month GPS trajectory datasets (representing human mobility) generated by over 12,000 taxicabs in Beijing in 2010 and 2011 respectively. The results justify the advantages of our approach over baseline methods solely using POIs or human mobility.