The origin of heterogeneity in human mobility ranges

Abstract

In the last decade, scientists from different disciplines discovered a great heterogeneity in human mobility ranges, since a power law characterizes the distribution of the characteristic distance traveled by individuals, the so-called radius of gyration. The origin of such heterogeneity, however, still remains unclear. In this paper, we analyze two mobility datasets and observe that an individual's locations tend to be grouped in dense clusters representing geographical mobility cores. We show that the heterogeneity in human mobility ranges is mainly due to trips between these mobility cores, while it is greatly reduced when individuals are constrained to move within a single mobility core.
Luca Pappalardo
Department of Computer Science
University of Pisa
Largo Bruno Pontecorvo 3, 56127 Pisa, Italy
lpappalardo@di.unipi.it
CCS Concepts
Applied computing → Physics; Mathematics and statistics;
Keywords
human mobility; mobility data mining; mobile phone data;
GPS data; data science; Big Data
1. INTRODUCTION
In the last decade the availability of big mobility data,
such as GPS tracks from vehicles and mobile phone data,
offered a series of novel insights on the quantitative patterns
characterizing human mobility. In particular, scientists from
different disciplines discovered that human movements are
not completely random but follow specific statistical laws.
The mobility of an individual can be confined within a sta-
ble circle defined by a center of mass and a radius of gyration
[7, 12]. Interestingly, such circles are found to be highly het-
erogeneous since a power law characterizes the distribution
of the radius of gyration of individuals [7, 14]. Although
these discoveries have doubtless shed light on interesting aspects
about human mobility, the origin of the observed patterns
still remains unclear: what is the origin of the heterogeneity
in human mobility ranges? Answering this question
is of great importance in contexts like urban planning and
the design of smart cities, since it can be helpful for crucial
problems such as movement prediction [3, 20] and activity
recognition [11, 8, 15].
(c) 2016, Copyright is with the authors. Published in the Workshop Proceedings of the
EDBT/ICDT 2016 Joint Conference (March 15, 2016, Bordeaux, France) on CEUR-WS.org
(ISSN 1613-0073). Distribution of this paper is permitted under the terms of
the Creative Commons license CC-by-nc-nd 4.0
In this paper, we address this question by performing a
data-driven study of human mobility. In our analysis we
exploit the access to two mobility datasets, each storing the
trajectories of about 50,000 individuals. We observe that
the locations visited by the individuals tend to cluster in
dense groups, representing meaningful geographical units or
mobility cores. We then compute for every individual her
inter-core characteristic traveled distance and her intra-core
characteristic traveled distance, which are defined by the
radius of gyration computed on the trips between mobility
cores and the trips within mobility cores respectively. From
the comparison of the total radius of gyration of an indi-
vidual with her intra- and inter-core radius of gyration we
observe two main results. First, a strong linear correlation
emerges between the total radius of an individual and her
inter-core radius, suggesting that the mobility range of an
individual is mainly determined by trips between mobility
cores. Second, the distribution of the characteristic intra-
core radius of gyration has a peak suggesting that individu-
als show typical mobility ranges when constrained to move
within mobility cores. Our results, which emerge on differ-
ent types of mobility data and at different geographical and
temporal scales, suggest that people perform two types of
trips: intra-core trips and inter-core trips, the latter being
the origin of the observed heterogeneity in mobility ranges.
The paper is organized as follows. Section 2 summarizes
some works relevant to our topic. Section 3 introduces the
two mobility datasets we analyze and Section 4 describes
the measures of individual human mobility we use during
the analysis. Section 5 shows the results of our work and
finally Section 6 concludes the paper.
2. RELATED WORK
The availability of Big Data on human mobility allowed
scientists from different disciplines to discover that tradi-
tional mobility models adapted from the observation of an-
imals [5, 6] and dollar bills [2] are not suitable to describe
people’s movements. Indeed, at a global scale humans are
characterized by a huge heterogeneity, since a power law
emerges in the distribution of the radius of gyration, the
characteristic distance traveled by individuals [7, 12]. De-
spite this heterogeneity, through the observation of past mo-
bility history the whereabouts of most individuals can be
predicted with an accuracy higher than 80% [4, 18]. More-
over, according to their recurrent and total mobility patterns
individuals naturally split into two distinct mobility profiles,
namely returners and explorers, which show communication
preferences with individuals in the same mobility profile [14].
The patterns of individual human mobility have been ob-
served in both GSM data and GPS data [7, 12], and have
been used to build generative models of individual human
mobility [10, 18, 14], generative models to describe human
migration flows [17, 21, 9], methods to discover geographic
borders according to recurrent trips of private vehicles [16],
methods to predict the formation of social ties [3, 20], and
classification models to predict the kind of activity associated
with individuals’ trips solely on the basis of the observed
displacements [11, 8, 15]. Bagrow et al. exploit network science
techniques to split the mobility of individuals into mobility
units, or mobility habitats [1]. They find a relationship
between the total radius of gyration of an individual and the
trips between the main mobility habitats. In this paper we
investigate the existence of mobility groups at different ge-
ographical levels. We use data mining clustering techniques
(instead of network techniques) to aggregate an individual’s
locations into clusters.
3. MOBILITY DATA
GSM data. Our first data source consists of anonymized
mobile phone data collected by a European mobile carrier for
billing and operational purposes. The mobile phones carried
by individuals in their daily routine offer a good proxy to
study the structure and dynamics of human mobility: each
time an individual makes a call the tower that communi-
cates with her phone is recorded by the carrier, effectively
tracking her current location. The dataset consists of Call
Detail Records (CDR) describing the calls of 67,000 individuals
during three months, selected from 1 million users provided
that they visited more than two locations during the
observation period and that their average call frequency was
$f \geq 0.5\ \mathrm{hour}^{-1}$. Each call is characterized by timestamp,
caller and callee identifiers, duration of the call and the ge-
ographical coordinates of the tower serving the call. We
reconstruct a user’s movements based on the time-ordered
list of phone towers from which a user made her calls [7].
GPS data. Our second data source is a GPS dataset
storing information about the trips of 46,000 private vehi-
cles traveling in Tuscany during one month. The GPS traces
are provided by Octo Telematics¹, a company that provides
a data collection service for insurance companies. The GPS
device embedded into a vehicle’s engine automatically turns
on when the vehicle starts, and the sequence of GPS points
that the device transmits every 30 seconds to the server via
a GPRS connection forms the global trajectory of a vehicle.
We exploit the stops of the vehicles to split the global trajec-
tory into several sub-trajectories, corresponding to the trips
performed by the vehicle. We set a stop duration threshold
of at least 20 minutes to create the sub-trajectories, in order
to avoid short stops like traffic lights: if the time interval be-
tween two consecutive observations of a vehicle is larger than
20 minutes, the first observation is considered as the end of a
sub-trajectory and the second one is considered as the start
of another sub-trajectory. We also performed the extraction
of the sub-trajectories by using different stop duration
thresholds (5, 10, 15, 20, 30 and 40 minutes) without finding
significant differences in the sample of trips and in the statistical
analysis we present in this paper. We assign each origin
and destination point of the obtained sub-trajectories to the
corresponding Italian census cell, using information provided
by the Italian National Institute of Statistics (ISTAT). We
describe the movements of a vehicle by the time-ordered list
of census cells where the vehicle stopped [14].
¹ http://www.octotelematics.com/
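The stop-based segmentation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used in the paper: the function name `split_into_trips`, the tuple layout `(timestamp, lat, lon)` and the sample coordinates are assumptions introduced here. The only element taken from the text is the rule that a gap larger than the stop duration threshold (20 minutes by default) closes one sub-trajectory and opens the next.

```python
from datetime import datetime, timedelta


def split_into_trips(observations, stop_threshold=timedelta(minutes=20)):
    """Split a time-ordered list of (timestamp, lat, lon) tuples into trips.

    A new sub-trajectory starts whenever the time gap between two
    consecutive observations is larger than stop_threshold.
    """
    trips = []
    current = []
    for obs in observations:
        # Compare the new observation with the last one of the open trip.
        if current and obs[0] - current[-1][0] > stop_threshold:
            trips.append(current)
            current = []
        current.append(obs)
    if current:
        trips.append(current)
    return trips


# Hypothetical example: two points 30 seconds apart, a 44.5-minute gap,
# then two more points 30 seconds apart.
t0 = datetime(2011, 5, 1, 8, 0, 0)
points = [
    (t0, 43.72, 10.40),
    (t0 + timedelta(seconds=30), 43.73, 10.41),
    (t0 + timedelta(minutes=45), 43.77, 11.25),
    (t0 + timedelta(minutes=45, seconds=30), 43.78, 11.26),
]
trips = split_into_trips(points)
print(len(trips))  # 2
```

Passing a different `stop_threshold` (for instance 5 or 40 minutes) gives a simple way to repeat the sensitivity check mentioned in the text.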
GSM vs GPS. The GSM and the GPS datasets differ
in several aspects [13, 12]. The GPS data refer to trips
performed during one month (May 2011) in an area corresponding
to a single Italian region, while the mobile phone
data cover an entire European country and a period of observation
of three months. The GPS data represent a 2%
sample of the population of vehicles in Italy [12], while the
mobile phone dataset covers users of a major European operator,
about 25% of the country’s adult population [7,
14]. The trajectories described by mobile phone data include
all possible means of transportation. In contrast, the
GPS data refer to private vehicle displacements only. The
fact that each dataset contains aspects missing in the other
makes the two types of data suitable for an independent
validation of human mobility patterns.
4. MOBILITY MEASURES
The radius of gyration $r_g$ is a standard measure to describe
the characteristic distance traveled by an individual, defined
as [7, 12]:

$$r_g = \sqrt{\frac{1}{N} \sum_{i \in L} n_i (\mathbf{r}_i - \mathbf{r}_{cm})^2}, \qquad (1)$$

where $L$ is the set of locations visited by the individual,
$\mathbf{r}_i$ is a two-dimensional vector describing the geographical
coordinates of location $i$; $n_i$ is the visitation frequency of
location $i$; $N = \sum_{i \in L} n_i$ is the total number of visits of the
individual, and $\mathbf{r}_{cm}$ is the center of mass of the individual
defined as the mean weighted point of the visited locations
[7, 12]. The distribution of the radius of gyration is well
fitted by a power-law with exponential cutoff, as measured
on mobile phone data [7, 14] and GPS data [12, 14].
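As an illustration of Equation (1), the following Python sketch computes the radius of gyration from a list of locations and their visitation frequencies. It rests on assumptions that are not stated in the paper: locations are given as (latitude, longitude) pairs in degrees, the distance between a location and the center of mass is measured with the haversine formula, and the center of mass is approximated by the visit-weighted mean of latitudes and longitudes. The function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = math.radians(p[0]), math.radians(p[1])
    lat2, lon2 = math.radians(q[0]), math.radians(q[1])
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def center_of_mass(locations, visits):
    """Visit-weighted mean point r_cm (assumption: plain mean of lat/lon)."""
    total = sum(visits)
    lat = sum(n * p[0] for p, n in zip(locations, visits)) / total
    lon = sum(n * p[1] for p, n in zip(locations, visits)) / total
    return (lat, lon)


def radius_of_gyration(locations, visits):
    """r_g = sqrt( (1/N) * sum_i n_i * |r_i - r_cm|^2 ), in km."""
    total = sum(visits)  # N, the total number of visits
    cm = center_of_mass(locations, visits)
    squared = sum(n * haversine_km(p, cm) ** 2
                  for p, n in zip(locations, visits))
    return math.sqrt(squared / total)


# Two equally visited locations on the equator, one degree of longitude apart:
# r_g is half the distance between them.
rg = radius_of_gyration([(0.0, 0.0), (0.0, 1.0)], [1, 1])
print(round(rg, 1))
```

Averaging latitudes and longitudes is a reasonable approximation for the regional and national scales discussed here, but a projection to planar coordinates would be a safer choice for individuals whose locations span very large distances.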
Given a partition of an individual’s locations in $m$ groups,
or mobility cores, we define a dominant location $D_i$ as the
most visited location in group $i$, i.e. the preferred location of
the individual when she visits locations in group $i$ (see Figure 1).
We define the inter-core radius $r_g^{inter}$ of an individual
as the radius of gyration computed on her $m$ dominant locations
($m \geq 2$), and the intra-core radius $r_g^{intra}$ as the radius
of gyration computed on the locations of a given mobility
core. Table 1 summarizes the mobility measures we use in
our analysis and Figure 1 schematizes some of the concepts
introduced above.
measure                          symbol
radius of gyration               $r_g$
dominant location                $D_i$
intra-core radius of gyration    $r_g^{intra}$
inter-core radius of gyration    $r_g^{inter}$
Table 1: The mobility measures used in our study
and the corresponding mathematical notation.
[Figure 1 legend: dominant location; mobility core; noise location]
Figure 1: The image illustrates the locations vis-
ited by an individual. Blue circles are visited loca-
tions, groups of circles within blue dashed shapes
are mobility cores, red circles are dominant loca-
tions. Green circles are noise locations that are not
part of any mobility core. The radius of gyration
is computed on all the circles, the inter-core radius
on red circles, the intra-core radius on the circles
within the same dashed shape.
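The definitions of dominant location, inter-core radius and intra-core radius can be made concrete with a small Python sketch. It is written under stated assumptions rather than taken from the paper: locations are assumed to be already projected to planar (x, y) coordinates in km, each mobility core is a list of `((x, y), visits)` pairs, and the inter-core radius weights each dominant location by its own visitation frequency (the paper does not spell out the weighting). All function names are hypothetical.

```python
import math


def radius_of_gyration(points, visits):
    """Radius of gyration of planar (x, y) points weighted by visit counts."""
    total = sum(visits)
    cx = sum(n * p[0] for p, n in zip(points, visits)) / total
    cy = sum(n * p[1] for p, n in zip(points, visits)) / total
    squared = sum(n * ((p[0] - cx) ** 2 + (p[1] - cy) ** 2)
                  for p, n in zip(points, visits))
    return math.sqrt(squared / total)


def dominant_location(core):
    """Most visited location of a core, returned as ((x, y), visits)."""
    return max(core, key=lambda item: item[1])


def inter_core_radius(cores):
    """Radius of gyration computed on the dominant locations of the cores."""
    dominants = [dominant_location(core) for core in cores]
    return radius_of_gyration([p for p, _ in dominants],
                              [n for _, n in dominants])


def intra_core_radius(core):
    """Radius of gyration computed on the locations of a single core."""
    return radius_of_gyration([p for p, _ in core], [n for _, n in core])


# Hypothetical individual with two mobility cores 100 km apart.
cores = [
    [((0.0, 0.0), 5), ((1.0, 0.0), 1)],
    [((100.0, 0.0), 5), ((101.0, 0.0), 1)],
]
print(inter_core_radius(cores))     # 50.0
print(intra_core_radius(cores[0]))  # about 0.37
```

In this toy case the inter-core radius (50 km) is two orders of magnitude larger than the intra-core radius, which mirrors the qualitative picture sketched in Figure 1.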
5. RESULTS
For every individual in the two datasets, we partition her
locations in mobility cores by using the DBSCAN clustering
algorithm [19], which extracts dense groups of points according
to two input parameters: eps, the maximum search
radius; and minPts, the minimum number of points (locations)
needed to form a cluster. Every location has two features,
the latitude and the longitude of its position, which the
algorithm uses to group locations into clusters. We set minPts = 2
and eps = 5, 10, 50, 100 km in our experiments and discard
the noise points produced by the algorithm, i.e. locations
that do not belong to any dense cluster of locations according
to the input parameters (see Figure 1).
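To make the clustering step concrete, here is a minimal textbook-style DBSCAN written in plain Python. It is a sketch under assumptions, not the implementation used for the experiments: the distance between locations is taken to be the haversine distance in km (the text does not name the metric), a point counts as its own neighbor when comparing against minPts, and the names `dbscan`, `haversine_km` and `NOISE` are introduced here for illustration. The sample coordinates are hypothetical points near Pisa, Florence and Rome.

```python
import math

EARTH_RADIUS_KM = 6371.0
NOISE = -1


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = math.radians(p[0]), math.radians(p[1])
    lat2, lon2 = math.radians(q[0]), math.radians(q[1])
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def dbscan(points, eps_km, min_pts):
    """Return one label per point: 0, 1, ... for clusters, NOISE otherwise."""
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within eps_km of point i (including i itself).
        return [j for j in range(len(points))
                if haversine_km(points[i], points[j]) <= eps_km]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels


locations = [
    (43.716, 10.396), (43.720, 10.400),  # two nearby points around Pisa
    (43.769, 11.255), (43.772, 11.258),  # two nearby points around Florence
    (41.900, 12.500),                    # an isolated point in Rome
]
print(dbscan(locations, eps_km=5, min_pts=2))    # [0, 0, 1, 1, -1]
print(dbscan(locations, eps_km=100, min_pts=2))  # [0, 0, 0, 0, -1]
```

The two calls illustrate the effect of eps reported in the text: with eps = 5 km the individual has two mobility cores and one noise location, while with eps = 100 km the two cores merge into one. The quadratic neighbor search is fine for the handful of locations of a single individual; a spatial index would be preferable for larger inputs.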
We compute the distribution of the number of
(non-noise) clusters obtained per individual, at different values of the eps
parameter (see Figure 2). We observe a peaked distribution
where the majority of individuals have few mobility cores,
e.g. two mobility cores when eps = 5 km and one mobility
core when eps = 100 km, and individuals having more
than ten mobility cores are extremely rare (Figure 2). The
fact that the algorithm produces non-noise clusters indicates
that the locations of an individual are not randomly
distributed but tend to aggregate in dense groups of locations,
representing geographical units of individual mobility.
Our distribution of cores per person is in contrast with previous
work that builds mobility groups using network science
techniques [1], where most users possess 5-20 mobility
groups and only 7% of users have a single mobility group.
We also compare an individual’s radius of gyration $r_g$ with
her inter-core radius $r_g^{inter}$, observing a strong linear correlation
(see Figure 3). Since the inter-core radius is computed
on the dominant locations of the individual’s mobility cores,
this result suggests that the radius of gyration is mainly determined
by the tendency of an individual to partition her
mobility in different geographical units. If we compute the
distribution of individuals’ intra-core radius $r_g^{intra}$, indeed,
we do not obtain a power law anymore (Figure 4): a peak
emerges from the distribution of $r_g^{intra}$ for low eps, suggesting
that, when restricted to move within mobility cores, individuals
show typical radii of gyration. In summary, our analysis
suggests that: (i) individuals tend to split their mobility in
dense groups of locations (mobility cores); (ii) the distance
between the dominant locations in mobility cores generates
the observed heterogeneity in human mobility ranges; (iii)
the heterogeneity is indeed greatly reduced when individuals
are constrained to move within mobility cores.
Interestingly, we observe that similar results emerge from
both the mobile phone dataset, which captures displacements
by any means of transportation in an entire European
country during three months, and the GPS dataset, which
only captures movements by private vehicles that occurred in
Tuscany during one month.
[Figure 2 plots: histograms of the number of clusters per user (x axis: # clusters; y axis: # users) for (a) eps = 5km, (b) eps = 10km, (c) eps = 50km, (d) eps = 100km.]
Figure 2: Distribution of the number of clusters
per individual on the GSM dataset for eps =
5, 10, 50, 100 km (the GPS dataset produces similar
results). The plots highlight a clear tendency of
locations to cluster in dense groups. We observe
that: (i) the majority of individuals have few mobility
cores (2 or 3), (ii) as eps increases the mode of
the distribution approaches one.
[Figure 3 plots: rg [km] (x axis) versus inter-rg [km] (y axis) for individuals with 2 mobility cores; (a) eps = 5km, (b) eps = 10km.]
Figure 3: Radius of gyration (on x axis) versus inter-
core radius (y axis) of individuals having two mobil-
ity cores, for eps = 5km (a) and eps = 10km (b).
Plots refer to the GSM dataset (the GPS dataset
produces similar results).
[Figure 4 plots: PDF of intra-rg (x axis: intra-rg [km]; y axis: p(intra-rg)); panel (a) labeled eps = 5km, panel (b) labeled eps = 10km.]
Figure 4: Distribution of intra-core radius $r_g^{intra}$
across individuals in the GSM dataset (the GPS
dataset produces similar results), for eps = 5km (a)
and eps = 50km (b). We observe that, for eps = 5km,
the distribution is not a power law anymore but a
peak emerges denoting a characteristic radius of gyration
(a). For eps = 50km the distribution starts
approaching a power law.
6. CONCLUSIONS
In this paper we showed that the locations visited by individuals
tend to cluster in a small number of mobility cores.
The radius of gyration computed on the dominant locations
of the mobility cores correlates strongly with the standard
radius of gyration, meaning that the characteristic distance
traveled by individuals is mainly determined by their dominant
locations. Moreover, individuals show homogeneous
radii of gyration when constrained to travel within mobility
cores. Our results show that individual human mobility
is composed of two types of trips: intra-core trips, which
represent movement within a given geographical unit, and
inter-core trips, which connect locations belonging
to different mobility cores and generate the heterogeneity
observed in human mobility ranges. As future work, we
plan to investigate more deeply the structure of intra- and
inter-core trips and to quantify the contribution of every single
intra- or inter-core trip in shaping the characteristic traveled
distance of an individual.
7. ACKNOWLEDGMENTS
This work has been partially funded by the EU under
the FP7-ICT Program by project Petra n. 609042, under
H2020 Program by projects SoBigData grant n. 654024 and
Cimplex grant n. 641191.
8. REFERENCES
[1] J. Bagrow and Y.-R. Lin. Mesoscopic structure and
social aspects of human mobility. PLoS ONE, 7(5),
2012.
[2] D. Brockmann, L. Hufnagel, and T. Geisel. The
scaling laws of human travel. Nature,
439(7075):462–465, 2006.
[3] E. Cho, S. A. Myers, and J. Leskovec. Friendship and
mobility: user movement in location-based social
networks. In Proceedings of the 17th ACM SIGKDD
International Conference on Knowledge Discovery and
Data Mining, KDD’11, pages 1082–1090. ACM, 2011.
[4] N. Eagle and A. Pentland. Eigenbehaviors: identifying
structure in routine. Behavioral Ecology and
Sociobiology, 63:1057–1066, 2009.
[5] G. M. V. et al. Lévy flight search patterns of
wandering albatrosses. Nature, 381:413–415, 1996.
[6] G. R.-F. et al. Lévy walk patterns in the foraging
movements of spider monkeys. Behavioral Ecology and
Sociobiology, 55(25), 2003.
[7] M. C. González, C. A. Hidalgo, and A.-L. Barabási.
Understanding individual human mobility patterns.
Nature, 453(7196):779–782, June 2008.
[8] S. Jiang, J. F. Jr, and M. González. Clustering daily
patterns of human activities in the city. Data Mining
and Knowledge Discovery, 25:478–510, 2012.
[9] W. S. Jung, F. Wang, and H. E. Stanley. Gravity
model in the Korean highway. EPL (Europhysics
Letters), 81:48005, 2008.
[10] D. Karamshuk, C. Boldrini, M. Conti, and
A. Passarella. Human mobility models for
opportunistic networks. IEEE Communications
Magazine, 49(12):157–165, 2011.
[11] L. Liao, D. J. Patterson, D. Fox, and H. Kautz.
Learning and inferring transportation routines. Artif.
Intell., 171(5-6):311–331, Apr. 2007.
[12] L. Pappalardo, S. Rinzivillo, Z. Qu, D. Pedreschi, and
F. Giannotti. Understanding the patterns of car
travel. The European Physical Journal Special Topics,
215(1):61–73, 2013.
[13] L. Pappalardo, F. Simini, S. Rinzivillo, D. Pedreschi,
and F. Giannotti. Comparing general mobility and
mobility by car. In Proceedings of the 1st BRICS
Countries Congress (BRICS-CCI) and 11th Brazilian
Congress (CBIC) on Computational Intelligence, 2013.
[14] L. Pappalardo, F. Simini, S. Rinzivillo, D. Pedreschi,
F. Giannotti, and A.-L. Barabási. Returners and
explorers dichotomy in human mobility. Nature
Communications, 6, 09 2015.
[15] S. Rinzivillo, L. Gabrielli, M. Nanni, L. Pappalardo,
D. Pedreschi, and F. Giannotti. The purpose of
motion: Learning activities from individual mobility
networks. In Proceedings of International Conference
on Data Science and Advanced Analytics, DSAA’14,
2014.
[16] S. Rinzivillo, S. Mainardi, F. Pezzoni, M. Coscia,
D. Pedreschi, and F. Giannotti. Discovering the
geographical borders of human mobility. Künstliche
Intelligenz, 26(3):253–260, 2012.
[17] F. Simini, M. C. González, A. Maritan, and A.-L.
Barabási. A universal model for mobility and
migration patterns. Nature, 484(7392):96–100, 2012.
[18] C. Song, Z. Qu, N. Blumm, and A.-L. Barabási.
Limits of predictability in human mobility. Science,
327:1018–1021, 2010.
[19] P.-N. Tan, M. Steinbach, and V. Kumar. Introduction
to Data Mining. Addison Wesley, 2006.
[20] D. Wang, D. Pedreschi, C. Song, F. Giannotti, and
A.-L. Barabási. Human mobility, social ties, and link
prediction. In Proceedings of the 17th ACM SIGKDD
International Conference on Knowledge Discovery and
Data Mining, KDD ’11, pages 1100–1108, New York,
NY, USA, 2011. ACM.
[21] G. K. Zipf. The p1p2/d hypothesis: On the intercity
movement of persons. American Sociological Review,
11(6):677–686, 1946.
... In fact, all the above-mentioned works assume users to have constant moving speed and they can travel across the whole network. However, it has been well known that real-life mobile social networks are heterogeneous [8], [21], [29], e.g. users frequently visit and stay around a few "home locations", and their moving ranges are confined by heterogeneous gyration radii. ...
... Both of the two cases imply that users are more likely to stay in a fixed area around some home location, and the heterogeneous model further guarantees the moving areas are different across the users. Therefore the SSRM model complies with moving patterns found in mobile social networks [8], [21], [29]. ...
... Census or other administrative units are, however, a more useful unit of activity space when assessing variability in the activity spaces of an offender population [48]. They can also be considered a proxy for unmeasured activity space: we do not visit places in isolation but tend to cluster our activities together; places immediately around or in between activity nodes are more likely to be in our activity space than places further away [49][50][51][52]. ...
Article
Full-text available
It is well established that offenders' routine activity locations (nodes) shape their crime locations, but research examining the geography of offenders' routine activity spaces has to date largely been limited to a few core nodes such as homes and prior offense locations, and to small study areas. This paper explores the utility of police data to provide novel insights into the spatial extent of, and overlap between, individual offenders' activity spaces. It includes a wider set of activity nodes (including relatives' homes, schools, and non-crime incidents) and broadens the geographical scale to a national level, by comparison to previous studies. Using a police dataset including n=60,229 burglary, robbery, and extra-familial sex offenders in New Zealand, a wide range of activity nodes were present for most burglary and robbery offenders, but fewer for sex offenders, reflecting sparser histories of police contact. In a novel test of the criminal profiling assumptions of homology and differentiation in a spatial context, we find that those who offend in nearby locations tend to share more activity space than those who offend further apart. However, in finding many offenders' activity spaces span wide geographic distances, we highlight challenges for crime location choice research and geographic profiling practice.
... Brockmann et al.[7]study the scaling laws of human mobility by observing the circulation of bank notes in United States, finding that travel distances of bank notes follow a power-law behavior. González et al.[21]analyze a nation-wide large-scale mobile phone dataset and find a large heterogeneity in human mobility ranges[46]: (i) travel distances of individuals follow a power-law behavior, confirming the results byBrockmann et al.; (ii)the radius of gyration of individuals, i.e., their characteristic traveled distance, follows a power-law behavior with an exponential cutoff. Song et al.[64]observe on mobile phone data that individuals are characterized by a power-law behavior in waiting times, i.e. the time between a displacement and the next displacement by an individual. ...
Article
Full-text available
Human mobility modelling is of fundamental importance in a wide range of applications, such as the developing of protocols for mobile ad hoc networks or for what-if analysis and simulation in urban ecosystems. Current generative models generally fail in accurately reproducing the individuals' recurrent daily schedules and at the same time in accounting for the possibility that individuals may break the routine and modify their habits during periods of unpredictability of variable duration. In this article we present DITRAS (DIary-based TRAjectory Simulator), a framework to simulate the spatio-temporal patterns of human mobility in a realistic way. DITRAS operates in two steps: the generation of a mobility diary and the translation of the mobility diary into a mobility trajectory. The mobility diary is constructed by a Markov model which captures the tendency of individuals to follow or break their routine. The mobility trajectory is produced by a model based on the concept of preferential exploration and preferential return. We compare DITRAS with real mobility data and synthetic data produced by other spatio-temporal mobility models and show that it reproduces the statistical properties of real trajectories in an accurate way.
Article
Full-text available
The availability of massive digital traces of human whereabouts has offered a series of novel insights on the quantitative patterns characterizing human mobility. In particular, numerous recent studies have lead to an unexpected consensus: the considerable variability in the characteristic travelled distance of individuals coexists with a high degree of predictability of their future locations. Here we shed light on this surprising coexistence by systematically investigating the impact of recurrent mobility on the characteristic distance travelled by individuals. Using both mobile phone and GPS data, we discover the existence of two distinct classes of individuals: returners and explorers. As existing models of human mobility cannot explain the existence of these two classes, we develop more realistic models able to capture the empirical findings. Finally, we show that returners and explorers play a distinct quantifiable role in spreading phenomena and that a correlation exists between their mobility patterns and social interactions.
Conference Paper
Full-text available
The large availability of mobility data allows us to investigate complex phenomena about human movement. However this adundance of data comes with few information about the purpose of movement. In this work we address the issue of activity recognition by introducing Activity-Based Cascading (ABC) classification. Such approach departs completely from probabilistic approaches for two main reasons. First, it exploits a set of structural features extracted from the Individual Mobility Network (IMN), a model able to capture the salient aspects of individual mobility. Second, it uses a cascading classification as a way to tackle the highly skewed frequency of activity classes. We show that our approach outperforms existing state-of-theart probabilistic methods. Since it reaches high precision, ABC classification represents a very reliable semantic amplifier for Big Data.
Conference Paper
Full-text available
In the last years, the emergence of big data led scientists from diverse disciplines toward the study of the laws underlying human mobility. Although these recent discoveries have shed light on very interesting and fascinating aspects about people movements, they are generally focused on global and general mobility patterns. For this reason, they do not necessarily capture phenomena related to specific types of mobility, such as mobility by car, by public transportations means, by foot and so on. In this work, we aim to compare general human mobility with mobility expressed by a specific conveyance, trying to address the following question: What are the differences between general mobility and mobility by car? To answer this question, we present the results of an analysis performed on a big mobile phone dataset and on a GPS dataset storing information about car travels in Italy.
Article
Full-text available
Are the patterns of car travel different from those of general human mobility? Based on a unique dataset consisting of the GPS trajectories of 10 million travels accomplished by 150,000 cars in Italy, we investigate how known mobility models apply to car travels, and illustrate novel analytical findings. We also assess to what extent the sample in our dataset is representative of the overall car mobility, and discover how to build an extremely accurate model that, given our GPS data, estimates the real traffic values as measured by road sensors.
Article
Full-text available
Data mining and statistical learning techniques are powerful analysis tools yet to be incorporated in the domain of urban studies and transportation research. In this work, we analyze an activity-based travel survey conducted in the Chicago metropolitan area over a demographic representative sample of its population. Detailed data on activities by time of day were collected from more than 30,000 individuals (and 10,552 households) who participated in a 1-day or 2-day survey implemented from January 2007 to February 2008. We examine this large-scale data in order to explore three critical issues: (1) the inherent daily activity structure of individuals in a metropolitan area, (2) the variation of individual daily activities—how they grow and fade over time, and (3) clusters of individual behaviors and the revelation of their related socio-demographic information. We find that the population can be clustered into 8 and 7 representative groups according to their activities during weekdays and weekends, respectively. Our results enrich the traditional divisions consisting of only three groups (workers, students and non-workers) and provide clusters based on activities of different time of day. The generated clusters combined with social demographic information provide a new perspective for urban and transportation planning as well as for emergency response and spreading dynamics, by addressing when, where, and how individuals interact with places in metropolitan areas.
Article
Full-text available
The individual movements of large numbers of people are important in many contexts, from urban planning to disease spreading. Datasets that capture human mobility are now available and many interesting features have been discovered, including the ultra-slow spatial growth of individual mobility. However, the detailed substructures and spatiotemporal flows of mobility--the sets and sequences of visited locations--have not been well studied. We show that individual mobility is dominated by small groups of frequently visited, dynamically close locations, forming primary "habitats" capturing typical daily activity, along with subsidiary habitats representing additional travel. These habitats do not correspond to typical contexts such as home or work. The temporal evolution of mobility within habitats, which constitutes most motion, is universal across habitats and exhibits scaling patterns both distinct from all previous observations and unpredicted by current models. The delay to enter subsidiary habitats is a primary factor in the spatiotemporal growth of human travel. Interestingly, habitats correlate with non-mobility dynamics such as communication activity, implying that habitats may influence processes such as information spreading and revealing new connections between human mobility and social networks.
Article
Introduced in its contemporary form in 1946 (ref. 1), but with roots that go back to the eighteenth century, the gravity law is the prevailing framework with which to predict population movement, cargo shipping volume and inter-city phone calls, as well as bilateral trade flows between nations. Despite its widespread use, it relies on adjustable parameters that vary from region to region and suffers from known analytic inconsistencies. Here we introduce a stochastic process capturing local mobility decisions that helps us analytically derive commuting and mobility fluxes that require as input only information on the population distribution. The resulting radiation model predicts mobility patterns in good agreement with mobility and transport patterns observed in a wide range of phenomena, from long-term migration patterns to communication volume between different regions. Given its parameter-free nature, the model can be applied in areas where we lack previous mobility measurements, significantly improving the predictive accuracy of most of the phenomena affected by mobility and transport processes.
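As a rough illustration of the parameter-free flux described above, the following minimal sketch uses the expression commonly given for the radiation model, T_ij = T_i * m_i * n_j / ((m_i + s_ij) * (m_i + n_j + s_ij)), where m_i and n_j are the source and destination populations and s_ij is the population within the circle of radius r_ij centred on the source, excluding both endpoints. The function name `radiation_flux`, the input layout, the use of Euclidean distance, and supplying the total outflow T_i directly are all assumptions made for this example, not details taken from the abstract.

```python
import math


def radiation_flux(populations, coords, outflows):
    """Sketch of radiation-model fluxes between locations (hypothetical helper).

    populations: list of population counts, one per location
    coords:      list of (x, y) tuples in any planar unit (assumed Euclidean)
    outflows:    list of total trips T_i leaving each location (assumed given)
    Returns a matrix flux[i][j] with the estimated trips from i to j.
    """
    n = len(populations)
    flux = [[0.0] * n for _ in range(n)]
    for i in range(n):
        m_i = populations[i]
        for j in range(n):
            if i == j:
                continue
            n_j = populations[j]
            r_ij = math.dist(coords[i], coords[j])
            # Population inside the circle of radius r_ij around i,
            # excluding the source i and the destination j themselves.
            s_ij = sum(
                populations[k]
                for k in range(n)
                if k != i and k != j
                and math.dist(coords[i], coords[k]) <= r_ij
            )
            flux[i][j] = (
                outflows[i] * m_i * n_j
                / ((m_i + s_ij) * (m_i + n_j + s_ij))
            )
    return flux


# Toy example: three locations on a line (made-up numbers).
toy_flux = radiation_flux(
    populations=[100, 50, 50],
    coords=[(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)],
    outflows=[10.0, 10.0, 10.0],
)
```

In this toy setup the nearer of two equally sized destinations receives the larger flux from location 0, because the population of the intermediate location counts as intervening opportunities for the farther one. The double loop is quadratic per pair and is meant for readability, not for large datasets.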
Article
The availability of massive network and mobility data from diverse domains has fostered the analysis of human behavior and interactions. Broad, extensive, and multidisciplinary research has been devoted to the extraction of non-trivial knowledge from this novel form of data. We propose a general method to determine the influence of social and mobility behavior over a specific geographical area in order to evaluate to what extent the current administrative borders represent the real basin of human movement. We build a network representation of human movement starting with vehicle GPS tracks and extract relevant clusters, which are then mapped back onto the territory, finding a good match with the existing administrative borders. The novelty of our approach is its focus on a detailed spatial resolution: we map emerging borders in terms of individual municipalities, rather than macro-regional or national areas. We present a series of experiments to illustrate and evaluate the effectiveness of our approach.
Article
This paper introduces a hierarchical Markov model that can learn and infer a user's daily movements through an urban community. The model uses multiple levels of abstraction in order to bridge the gap between raw GPS sensor measurements and high level information such as a user's destination and mode of transportation. To achieve efficient inference, we apply Rao–Blackwellized particle filters at multiple levels of the model hierarchy. Locations such as bus stops and parking lots, where the user frequently changes mode of transportation, are learned from GPS data logs without manual labeling of training data. We experimentally demonstrate how to accurately detect novel behavior or user errors (e.g. taking a wrong bus) by explicitly modeling activities in the context of the user's historical data. Finally, we discuss an application called “Opportunity Knocks” that employs our techniques to help cognitively-impaired people use public transportation safely.