
Comparing General Mobility and Mobility by Car


Abstract

In recent years, the emergence of big data has led scientists from diverse disciplines toward the study of the laws underlying human mobility. Although these discoveries have shed light on fascinating aspects of people's movements, they generally focus on global and general mobility patterns. For this reason, they do not necessarily capture phenomena related to specific types of mobility, such as mobility by car, by public transportation, or by foot. In this work, we compare general human mobility with mobility expressed by a specific conveyance, addressing the following question: What are the differences between general mobility and mobility by car? To answer this question, we present the results of an analysis performed on a big mobile phone dataset and on a GPS dataset storing information about car travels in Italy.
Luca Pappalardo, Filippo Simini, Salvatore Rinzivillo, Dino Pedreschi and Fosca Giannotti
KDD Lab, Department of Computer Science
University of Pisa, Italy
Email: {lpappalardo, pedre}
Institute of Physics, Budapest University of Technology and Economics
Budapest, Hungary
Pisa, Italy
Email: {rinzivillo, fosca.giannotti}
In the last few years, the emergence of big data has led scientists from diverse disciplines toward the study of human mobility, helping to discover and understand the hidden patterns in the trajectories people follow during their daily life. Such a social microscope showed that traditional mobility models adapted from the observation of particles or animals (such as Brownian motion and Lévy flights) [1], [2], and more recently from the observation of dollar bills [5], are not suitable to describe people's movements. Indeed, at a global scale humans are characterized by a huge heterogeneity, since a power law was observed in the distribution of the characteristic distance traveled by users [3], [4]. Despite the observed heterogeneity in people's movements, the whereabouts of most individuals can be predicted from their past mobility history with very high accuracy, above 80% [6], [7].
These recent discoveries have undoubtedly shed light on fascinating aspects of human mobility. However, they generally focus on global and general mobility patterns: through the analysis of GSM and other types of data describing travels with different transportation means, researchers revealed that our movements are not random, but follow their own laws. Since such laws are very general, they do not necessarily capture phenomena related to specific types of mobility. To clarify this point, let us consider movements by bike. Bikes are a convenient and efficient means of transportation within a city, but they are not suitable for covering very large distances. For this reason, while the predictability of bikers may not differ much from the general pattern, the variability of their characteristic traveled distance is presumably much lower, leading to a different mobility pattern.
The aim of this paper is to compare general mobility with mobility by car, trying to answer the following question: What are the differences between general mobility patterns and patterns of car travel? To address this question, we exploit a big mobile phone dataset collected by a European mobile phone carrier and a GPS dataset consisting of the detailed spatio-temporal trajectories of travels performed by cars in Italy. Using these data, we show the difference between global mobility and mobility by car in two important aspects: the distribution of the radius of gyration and the distribution of time spent in the visited locations.
The rest of the paper is organized as follows: Section II describes the datasets and highlights the main differences between GSM and GPS data. Section III briefly describes the individual mobility measures we used to unveil the patterns, while Section IV presents a comparison between GSM and GPS patterns. Finally, Section V concludes the paper.
Mobile phones are nowadays very common technological devices and offer a good proxy for capturing individual trajectories. Indeed, each time a user makes a call the carrier records the tower that communicates with the phone, effectively pinpointing the user's location. Unfortunately, this information is not very accurate, because an individual could be anywhere within the tower's reception area, which can span tens of square kilometers. Furthermore, the location is usually recorded only when a person uses her phone, providing little information about her whereabouts between calls. Since call patterns are bursty [8], the actual position of a user is unknown most of the time.
In the current study, we exploit a GSM dataset collected by a European mobile phone carrier for billing and operational purposes. It contains temporal (date and time) and spatial (the cell phone tower's coordinates) information of all calls and text messages sent by 3 million customers¹. Table I shows an example of phone records. In order to select the most reliable users for our purpose, we restricted our period of observation to three months and applied some filters to the data. For each user, we discarded locations visited only once during the entire period of observation, and those with a number of calls less than 0.05% of the total². Then, from the resulting dataset we deleted all users who visited only a single location, and those with a call frequency of less than twelve calls per day on average during the period of observation. The filtering phase resulted in a final dataset of 67,000 active users.
¹ To guarantee anonymity, each user is identified with an anonymized security
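The filtering steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' pipeline: the function name `filter_gsm_users`, the record layout `(user_id, tower_id)`, and the toy data are assumptions. The share threshold is left as a parameter because the text states 0.05% while the footnote states n_i/N < 0.005. The sketch also assumes that the daily call frequency is computed on all of a user's calls, before any location is discarded.

```python
from collections import Counter, defaultdict


def filter_gsm_users(records, n_days, min_share=0.005, min_calls_per_day=12.0):
    """Filter (user_id, tower_id) call records as described in the text.

    Returns a dict mapping each retained user to {tower_id: call_count}.
    """
    calls = defaultdict(Counter)
    for user_id, tower_id in records:
        calls[user_id][tower_id] += 1

    retained = {}
    for user_id, counts in calls.items():
        total = sum(counts.values())
        # Drop towers seen only once or holding too small a share of the calls.
        towers = {t: n for t, n in counts.items()
                  if n > 1 and n / total >= min_share}
        # Drop users left with fewer than two locations.
        if len(towers) < 2:
            continue
        # Drop users below the average daily call frequency (assumption:
        # the frequency is computed on all calls, before location filtering).
        if total / n_days < min_calls_per_day:
            continue
        retained[user_id] = towers
    return retained


# Toy records over a 90-day window (hypothetical user and tower names).
records = ([("u1", "A")] * 600 + [("u1", "B")] * 500 + [("u1", "C")]
           + [("u2", "A")] * 2000
           + [("u3", "A")] * 10 + [("u3", "B")] * 10)
print(filter_gsm_users(records, n_days=90))
```

In the toy data only `u1` survives: `u2` has a single location and `u3` calls too rarely, while tower `C` is discarded for `u1` because it was visited only once.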
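Table I. Example of phone records.
| Timestamp | Coordinates | Caller | Callee | Type |
|---|---|---|---|---|
| 2008/04/01 - 23:45:00 | (32.567, 2.642) | A45J23 | F45J23 | SMS |
| 2008/04/02 - 06:02:10 | (33.282, 2.221) | K65232 | V56YT4 | Call |
| ... | ... | ... | ... | ... |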
Unlike GSM data, GPS traces provide high-resolution location data, storing the geodetic coordinates with an average sampling rate of a few seconds. Even though these features are ideal for a refined statistical analysis of human mobility patterns, relatively few works in the literature are based on GPS data, mainly due to the difficulty of obtaining complete traces covering movements along the whole day. In this work, we have access to a GPS dataset that stores information on approximately 9.8 million different car travels from 159,000 cars tracked during one month (May 2011) in an area of 250 km × 250 km in central Italy. The GPS traces are collected by Octo Telematics Italia Srl, a company that provides a data collection service for insurance companies. Since GPS data do not provide explicit information about visited locations, we assigned each origin and destination point of the travels to the corresponding Italian census cell, according to information provided by the Italian National Institute of Statistics. After such aggregation, many users are found to have only one visited location. We discarded them and took into account only those users with the most frequent location (most likely their home or work) inside the region of Tuscany. These filtering operations produced a dataset of 46,121 users, where a travel is described by a timestamp and a pair of coordinates corresponding to the centroids of the origin and destination cells of the travel (Table II).
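Table II. Example of car travel records.
| Timestamp | Origin | Destination | Car |
|---|---|---|---|
| 2011/05/12 - 08:31:20 | (32.567, 2.546) | (32.7, 2.511) | F45J23 |
| 2011/05/24 - 17:53:08 | (32.1982, 2.333) | (33.123, 2.31) | H2705L |
| ... | ... | ... | ... |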
Table III summarizes a few properties of the two datasets described above. It is worth noting that GPS data provide information about displacements performed by car only. For this reason, we have only partial knowledge of the whole mobility of individuals. Conversely, GSM data may provide information about travels made using all transportation means, although a trip is actually recorded only when the user calls before and after it.
² This means that a location $i$ is deleted if $n_i/N < 0.005$, where $n_i$ is the number of calls performed at tower $i$, and $N$ is the total number of calls performed by the user.
Table III. Properties of the two datasets.
| Dataset | Volume | Conveyance | Precision |
|---|---|---|---|
| GPS | 46,121 users | Cars | High |
| GSM | 67,000 users | Many | Low |
In order to explore the statistical properties of the mobility patterns, for each user we computed several individual mobility measures.
The center of mass $\vec{r}_{cm}$ of a user represents the pivot of her individual mobility. Mathematically, it is a two-dimensional vector representing the weighted mean of the visited locations:

$$\vec{r}_{cm} = \frac{1}{W} \sum_{i=1}^{L} w_i \vec{r}_i$$

where $L$ is the total number of distinct towers/cells visited by the user/car; $\vec{r}_i$ is a two-dimensional vector representing the geographic coordinates of tower/cell $i$; $w_i$ is the weight assigned to location $i$; and $W$ is the sum of the weights over all locations. Depending on the measure used to weight a location, two different centers of mass can be defined. The frequency-based center of mass weights locations according to their visitation frequency, hence $w_i$ is the number of calls/arrivals performed in location $i$. In the time-based center of mass, $w_i$ is the total time spent by the user in location $i$.
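As a small illustration of the weighted mean above, the following Python sketch computes a center of mass from a list of weighted locations. The function name `center_of_mass` and the toy coordinates are hypothetical, and the sketch assumes planar coordinates (for example, positions already projected to kilometers); averaging raw latitude and longitude is only an approximation over small areas.

```python
def center_of_mass(locations):
    """Weighted mean of 2-D points.

    locations: list of ((x, y), weight) pairs, where the weight is either
    the number of calls/arrivals (frequency-based) or the time spent
    (time-based) at that location.
    """
    total_weight = sum(weight for _, weight in locations)
    if total_weight <= 0:
        raise ValueError("the sum of the weights must be positive")
    x = sum(weight * point[0] for point, weight in locations) / total_weight
    y = sum(weight * point[1] for point, weight in locations) / total_weight
    return (x, y)


# Hypothetical user with two locations 10 km apart.
by_frequency = [((0.0, 0.0), 3), ((10.0, 0.0), 1)]      # visits as weights
by_time = [((0.0, 0.0), 90.0), ((10.0, 0.0), 10.0)]     # hours as weights
print(center_of_mass(by_frequency))
print(center_of_mass(by_time))
```

The same function serves both definitions; only the weights passed in change.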
Another interesting measure of an individual's central position is the most frequent location $L_1$, i.e. the location where she can be found with the highest probability, which is most likely her home or work place. This measure can be computed in a straightforward way by taking the tower/cell from which the user performs the highest number of calls/arrivals.
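A minimal sketch of this computation, assuming each call or arrival is available as a location identifier in a list (the function name `most_frequent_location` and the sample identifiers are hypothetical):

```python
from collections import Counter


def most_frequent_location(visits):
    """Return the location with the highest number of calls/arrivals."""
    if not visits:
        raise ValueError("at least one visit is required")
    location, _count = Counter(visits).most_common(1)[0]
    return location


print(most_frequent_location(["home", "work", "home", "gym", "home"]))
```

The text does not say how ties are broken; this sketch simply returns whichever of the tied locations `Counter.most_common` lists first.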
The radius of gyration of a user [3], [4] is a mobility measure representing the characteristic distance traveled by each individual. It is a concept borrowed from physics, defined as the root mean square of the weighted sum of all locations' distances from the center of mass:

$$r_g = \sqrt{\frac{1}{W} \sum_{i=1}^{L} w_i \left(\vec{r}_i - \vec{r}_{cm}\right)^2}$$

where $\vec{r}_{cm}$ is the vector of coordinates representing the center of mass. We computed two types of radius of gyration: i) $r_g$ with respect to the frequency-based center of mass, i.e. $w_i$ is the total number of calls/arrivals in $i$; ii) $r_g$ with respect to the time-based center of mass, i.e. $w_i$ is the total time spent in $i$.
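The definition can be illustrated with a short Python sketch. It is a sketch under assumptions rather than the authors' code: the function name `radius_of_gyration` and the toy numbers are hypothetical, coordinates are assumed to be planar (for example, kilometers), and the weighted root mean square distance from the weighted center of mass is taken as the reading of the definition above.

```python
import math


def radius_of_gyration(locations):
    """Weighted root mean square distance from the weighted center of mass.

    locations: list of ((x, y), weight) pairs; use visit counts as weights
    for the frequency-based radius and time spent for the time-based one.
    """
    total_weight = sum(weight for _, weight in locations)
    if total_weight <= 0:
        raise ValueError("the sum of the weights must be positive")
    cx = sum(weight * p[0] for p, weight in locations) / total_weight
    cy = sum(weight * p[1] for p, weight in locations) / total_weight
    mean_sq = sum(weight * ((p[0] - cx) ** 2 + (p[1] - cy) ** 2)
                  for p, weight in locations) / total_weight
    return math.sqrt(mean_sq)


# Hypothetical car: equally frequent arrivals at home and at a place 30 km
# away, but far more hours parked at home.
frequency_weights = [((0.0, 0.0), 10), ((30.0, 0.0), 10)]
time_weights = [((0.0, 0.0), 200.0), ((30.0, 0.0), 20.0)]
print(radius_of_gyration(frequency_weights))
print(radius_of_gyration(time_weights))
```

In this toy case the time-based radius comes out smaller than the frequency-based one, because the long stays at home pull both the center of mass and the weights toward a single point.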
In this section, we compare the patterns found on the two
datasets, highlighting the main differences between general
mobility and mobility by car.
The first aspect we investigated is the difference between the frequency-based radius of gyration and the time-based one. In other words, how does the choice of the locations' weights influence the value of the radius of gyration? Figure 1 shows the scatter plots of frequency-based $r_g$ versus time-based $r_g$ for GSM (left) and GPS (center) users. For a better understanding of the underlying correlation, error bars are also drawn, with a bin size of 50 km (GSM) and 5 km (GPS). At first glance, in both cases the two measures seem to be correlated, as expected. However, a closer examination reveals an interesting difference: while in the GSM case the mean of the error bars tends to be biased toward the time-based $r_g$, in the GPS data the mean frequency-based $r_g$ tends to be higher. One possible interpretation is that, for cars, the visitation frequency of locations distant from the center of mass is higher, leading to a larger characteristic traveled distance.
Fig. 1. Scatter plots of frequency-based $r_g$ versus time-based $r_g$ for GSM (left) and GPS (center) users. A zoomed version of the GSM scatter plot is shown on the right. Error bars use a bin size of 50 km (left) and 5 km (center and right).
Fig. 2. Frequency- and time-based probability distributions of $r_g$ for GSM (left) and GPS (center) users. On the right, a GSM plot for users with the most frequent location $L_1$ in a region with size comparable to Tuscany (only locations inside the region are considered in the computation of the radii).
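One common way to obtain binned error bars for a scatter plot like the one discussed above is to group users by one radius and summarize the other within each bin. The sketch below is an assumption about the procedure, not a description of the authors' code: the function name `binned_mean_std` is hypothetical, the choice of which radius goes on which axis is arbitrary here, and the error bar is taken to be the mean plus or minus one (population) standard deviation per bin.

```python
import math
from collections import defaultdict


def binned_mean_std(xs, ys, bin_size):
    """Group ys by fixed-width bins of xs; return {bin_start: (mean, std)}."""
    bins = defaultdict(list)
    for x, y in zip(xs, ys):
        bins[int(x // bin_size)].append(y)

    summary = {}
    for index, values in sorted(bins.items()):
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        summary[index * bin_size] = (mean, math.sqrt(variance))
    return summary


# Toy radii in km for four hypothetical users, with 5 km bins as for GPS.
time_rg = [1, 2, 6, 7]
frequency_rg = [2, 4, 5, 5]
print(binned_mean_std(time_rg, frequency_rg, bin_size=5))
```

Comparing each bin's mean with the bin's own range then shows which of the two radii tends to be larger.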
Figure 2 (left) shows the distributions of frequency- and time-based radii of gyration computed on the GSM dataset. There is no significant difference between the two curves, which practically coincide. Conversely, a sharp difference clearly emerges from the distributions of radii in the GPS case (Figure 2, center). Indeed, the time-based distribution is shifted towards shorter radii, and peaks at 2 km instead of 5 km. This confirms the prominent role of frequency with respect to time observed in Figure 1 (center), suggesting that cars are usually parked for a long time in locations close to the center of mass (like home and work locations). This effect is absent in GSM data because people can continue their call activity even while being stationary. Another difference is that while the GSM curves decrease over the entire range, the GPS curves grow up to 2 km. This is presumably due to the tendency to cover such small distances by foot, bike, or bus, resulting in a lower probability of finding such travels in the GPS dataset.
To test to what extent the differences in the distribution of $r_g$ are due to the geographic scale (GSM data refer to a whole country, while GPS data refer to a single region), we computed the $r_g$ distribution of those GSM users having the most frequent location in a region of the country with a size and population comparable to Tuscany. In the computation of the radius, only the locations within the region are taken into account. As Figure 2 (right) shows, the distribution of the radius does not change significantly, suggesting that the shape of the curves, and their slopes, are rather independent of the scale and are related to the portion of mobility they represent.
The time spent across the visited locations is another interesting mobility aspect worth investigating. Figure 3 (left) shows the GSM distribution of time spent for the five most frequent locations $L_1, \ldots, L_5$. As we can see, the time spent is clearly unbalanced in favor of the most important location $L_1$. This is reasonable, because the most frequent location usually corresponds to the user's home or work place, which are the locations where an individual spends most of the time. The plot also suggests that time is proportional to frequency: the more a user visits a location, the more time she spends there. This phenomenon is confirmed by Figure 4, where the correlation, though not perfectly linear, is evident.
The same pattern is also observed on GPS data, although the difference between $L_1$ and the other locations is less sharp (Figure 3, right). It is worth noting that in both plots $L_1$ intersects the other curves at approximately the same points. This is very interesting because, independently of the geographical scale and of the portion of mobility considered, beyond a certain fraction of time it is much more likely for a user to be located in $L_1$ than in any of the other locations.
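The per-user quantity behind these distributions can be sketched as follows: rank a user's locations by visitation frequency and compute the fraction of her total time spent in each of the top-ranked ones. The function name `time_fractions_by_rank`, the input layout, and the toy numbers are hypothetical, and the sketch assumes the fraction is taken over the time spent in all of the user's locations.

```python
def time_fractions_by_rank(location_stats, k=5):
    """Fraction of total time spent in the k most frequent locations.

    location_stats: dict mapping a location to (n_visits, hours_spent).
    Returns a list whose first element refers to L1, the second to L2, etc.
    """
    total_hours = sum(hours for _, hours in location_stats.values())
    if total_hours <= 0:
        raise ValueError("total time spent must be positive")
    ranked = sorted(location_stats.values(),
                    key=lambda stats: stats[0], reverse=True)
    return [hours / total_hours for _, hours in ranked[:k]]


# Hypothetical user with three locations.
stats = {"home": (40, 300.0), "work": (20, 150.0), "gym": (5, 50.0)}
print(time_fractions_by_rank(stats))
```

Collecting the first element of this list over all users gives the sample from which a distribution for $L_1$ can be drawn, and likewise for the other ranks.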
Fig. 3. Distribution of the fraction of time spent in the five most frequent locations $L_1, \ldots, L_5$ for the GSM (left) and the GPS (right) datasets.
Fig. 4. Correlation between calls and time spent in the nine most frequent locations (GSM dataset). Axes: c, number of calls; t, hours spent.
Our data-driven analysis showed the difference between general mobility and mobility by car regarding two main aspects: the distribution of the radius of gyration and the distribution of time spent in the visited locations. We discovered that, regardless of the geographic scale, the shape of the $r_g$ distribution is different, since mobility by car poorly covers displacements over small distances. However, in both cases the distribution of time spent in visited locations presents a clear dominance of the most frequent location $L_1$, with such dominance more pronounced in the GSM dataset. Moreover, the $L_1$ curve intersects the others at approximately the same points in both cases, hinting at the presence of a general pattern underlying the phenomenon.
The authors wish to thank the company Octo Telematics Italia Srl for providing the car travels GPS data. The research reported in this article has been partially supported by European FP7 project DATASIM (http://www.
[1] G. M. Viswanathan et al., Lévy flight search patterns of wandering albatrosses, Nature 381, 413-415 (1996).
[2] G. Ramos-Fernandez et al., Lévy walk patterns in the foraging movements of spider monkeys, Behavioral Ecology and Sociobiology 55, 25
[3] M. C. González, C. A. Hidalgo, A.-L. Barabási, Understanding individual human mobility patterns, Nature 453, 779-782 (2008).
[4] L. Pappalardo, S. Rinzivillo, Z. Qu, D. Pedreschi, F. Giannotti, Understanding the patterns of car travel, European Physical Journal Special Topics 215, 61-73 (2013).
[5] D. Brockmann, L. Hufnagel, T. Geisel, The scaling laws of human travel, Nature 439, no. 7075, 462-465 (2006).
[6] N. Eagle, A. S. Pentland, Eigenbehaviors: identifying structure in routine, Behavioral Ecology and Sociobiology 63, 1057-1066 (2009).
[7] C. Song, Z. Qu, N. Blumm, A.-L. Barabási, Limits of predictability in human mobility, Science 327, 1018-1021 (2010).
[8] A.-L. Barabási, The origin of bursts and heavy tails in human dynamics, Nature 435, 207 (2005).
... In conclusion, we obtain two intertwined results: first, the known human mobility models can be refined to deal with car mobility, and second, the available GPS data can indeed be used as a faithful proxy of car mobility. The work in this chapter is based on two papers [14,127] published in 2013. ...
... Table 5.2 summarizes some characteristics of the datasets. The GSM and the GPS datasets differ in several aspects [14,127]. The GPS data refers to trips performed during one month (May 2011) in an area corresponding to a single Italian region, while the mobile phone data cover an entire European country and a period of observation of three months. ...
... The fact that one dataset contains aspect missing in the other dataset makes the two types of data suitable for an independent validation of the universality of the patterns emerging from human mobility behavior. The works in [14,127] summarizes the main differences between the general mobility describe by GSM data and the vehicle mobility described by GPS data. ...
Full-text available
Understanding human social behavior is a longstanding dream of mankind, a really profound point from both pragmatic and philosophical perspectives. The ability of drawing a comprehensive picture of human behavior and dynamics is helpful in many problems, which characterize our modern and complex society: the prevention of devastating pandemic diseases; the diffusion of new ideas or technologies over a social network; the patterns of success in different spheres of our activities. Big Data are nowadays a powerful social microscope which paves the road to realize the dream, allowing to ``photograph'' the main aspects of the society and to create a comprehensive picture of human behavior. This thesis proposes to study human behavior and dynamics through a combination of techniques from network science and data mining. In the context of human mobility, we use mobile phone data and GPS trajectories from vehicles to show that people can be profiled into two distinct categories, namely returners and explorers, according to their recurrent mobility patterns. We construct a new mobility model that can reproduce the observed dichotomy and show that returners and explorers play a distinct quantifiable role in spreading phenomena. We also investigate the issue of activity recognition from human movements by presenting a classification model to recognize the activity performed by an individual by observing some characteristics of her movements. We then move from individuals to connections, entering the domain of social network analysis. We investigate the challenging problem of community detection in dynamic social networks presenting Tiles, an innovative algorithm able to track the history of social communities in a streaming fashion. We also address the fascinating problem of the information diffusion over a social network, studying the spreading of musical tastes over a music social media. 
We show that certain individuals act as musical leader or innovators and that they can generate three different patterns of diffusion. Finally, we investigate the potentiality of Big Data in providing estimate for the socio-economic development of a territory. We use mobile phone data and GPS trajectories from vehicles to show that human mobility, and mobility diversity in particular, is highly correlated to wellbeing at municipality and province level. Individuals' movements and quality of life are linked aspects of society, opening the scenario for the definition of new statistical index that rely on Big Data to monitor the economic health of a territory. We conclude the thesis by revising the most promising research directions which open up from the results summarized in the thesis and introducing other interesting aspects related to the data-driven study of human behavior and dynamics.
... • Factors: The base probabilities are then multiplied by 3 or 6, resulting in expected CDR counts of 3 and 6 respectively. The choice for 6 (on average) was taken as half of what (Pappalardo et al. 2013;Becker et al. 2013;Isaacman et al. 2011) used or had, as we specifically want to use methods that work on moderate counts of daily CDR. We then halve that again, to see how far down we can go. ...
In this work we present two methods that can extract habitual movement patterns and reconstruct the underlying movement of users from their call detail records (CDR) in a way that works for users with only moderate numbers of CDRs and that does not make any prior assumptions on the behaviour of the users. The methods allow for a more comprehensive user base in large-scale studies due to the fact that users that might otherwise have to be discarded can also be analysed. The first one is computationally not overly intense and is based on association mining. The second one, which we named DAMOCLES, is based on extracting idiosyncratic daily patterns from clustered daily activities. The methods are evaluated on real data of 140 users over an average of 200 days against benchmarks using assumptions commonly found in the literature such as a work week from Monday to Friday on GPS ground truth. Both methods clearly outperform the benchmarks and for many users retrieve similar regularities. Additionally a simulation study is performed that allows to evaluate the methods in a more controlled environment.
... In other words, since individuals are inactive most of their time, CDRs allow to reconstruct only a subset of an individual's mobility. Several works in literature study the bias in CDRs by comparing the mobility patterns observed on CDRs to the same patterns observed on GPS data [36,39,35,38] or handover data (data capturing the location of mobile phone users recorded every hour or so) [15]. The studies agree that the bias in CDRs does not affect significantly the study of human mobility patterns. ...
Full-text available
Human mobility modelling is of fundamental importance in a wide range of applications, such as the developing of protocols for mobile ad hoc networks or for what-if analysis and simulation in urban ecosystems. Current generative models generally fail in accurately reproducing the individuals' recurrent daily schedules and at the same time in accounting for the possibility that individuals may break the routine and modify their habits during periods of unpredictability of variable duration. In this article we present DITRAS (DIary-based TRAjectory Simulator), a framework to simulate the spatio-temporal patterns of human mobility in a realistic way. DITRAS operates in two steps: the generation of a mobility diary and the translation of the mobility diary into a mobility trajectory. The mobility diary is constructed by a Markov model which captures the tendency of individuals to follow or break their routine. The mobility trajectory is produced by a model based on the concept of preferential exploration and preferential return. We compare DITRAS with real mobility data and synthetic data produced by other spatio-temporal mobility models and show that it reproduces the statistical properties of real trajectories in an accurate way.
... In other words, since individuals are inactive most of their time, CDRs allow to reconstruct only a subset of the mobility of an individual. Several works in literature study the bias in CDR data by comparing the mobility patterns observed on CDR data to the same patterns observed on GPS data [43,44,46,47] or handover data (data capturing the location of mobile phone users recorded every hour or so) [24]. The studies agree that the bias in CDR data does not affect significantly the study of human mobility patterns. ...
Full-text available
An intriguing open question is whether measurements made on Big Data recording human activities can yield us high-fidelity proxies of socio-economic development and well-being. Can we monitor and predict the socio-economic development of a territory just by observing the behavior of its inhabitants through the lens of Big Data? In this paper, we design a data-driven analytical framework that uses mobility measures and social measures extracted from mobile phone data to estimate indicators for socio-economic development and well-being. We discover that the diversity of mobility, defined in terms of entropy of the individual users' trajectories, exhibits (i) significant correlation with two different socio-economic indicators and (ii) the highest importance in predictive models built to predict the socio-economic indicators. Our analytical framework opens an interesting perspective to study human behavior through the lens of Big Data by means of new statistical indicators that quantify and possibly "nowcast" the well-being and the socio-economic development of a territory.
... GSM vs GPS. The GSM and the GPS datasets differ in several aspects [13,12]. The GPS data refers to trips performed during one month (May 2011) in an area corresponding to a single Italian region, while the mobile phone data cover an entire European country and a period of observation of three months. ...
Conference Paper
Full-text available
In the last decade, scientists from different disciplines discovered a great heterogeneity in human mobility ranges, since a power law characterizes the distribution of the characteristic distance traveled by individuals, the so-called radius of gyra-tion. The origin of such heterogeneity, however, still remains unclear. In this paper, we analyze two mobility datasets and observe that an individual's locations tend to be grouped in dense clusters representing geographical mobility cores. We show that the heterogeneity in human mobility ranges is mainly due to trips between these mobility cores, while it is greatly reduced when individuals are constrained to move within a single mobility core.
... The GSM and the GPS datasets differ in several aspects [2,3]. The GPS data refers to trips performed during one month (May 2011) in an area corresponding to a single Italian region, while the mobile phone data cover an entire European country and a period of observation of three months. ...
As location-sensing devices and apps become more prevalent, the scale and availability of big GPS trajectory data are also rapidly expanding. Big GPS trajectory data analytics offers new opportunities for gaining insights into vehicle movement dynamics and road network usage patterns that are important for transportation studies and urban planning among other fields. Processing big GPS trajectory data, consisting of billions of GPS waypoints and millions of individual trajectories is a challenging yet important task for researchers from these different domains. In this research, we propose an Apache Spark-based geo-computing framework for using big GPS trajectory data to estimate vehicle miles travelled, an important metric used by both federal and state highway agencies in the United States for transportation planning. The computing challenge lies in scaling the processing of billions of raw GPS points data as well as the steps for map matching for a statewide road network consisting of thousands of road segments. In this work, we develop a scalable map-matching module that considers both the spatiotemporal information of GPS waypoint sequences and topologic information of road network for the State of Maryland while striking a balance between matching accuracy and computing time. We processed 19.8 million raw GPS trips consisting of approximately 1.4 billion GPS waypoints collected in Maryland during a four-month period in 2015 to estimate vehicle miles travelled for Maryland’s road network. The estimation results show that using big GPS trajectory analytic methods is promising for obtaining accurate and stable vehicle miles travelled estimates.
Full-text available
Are the patterns of car travel different from those of general human mobility? Based on a unique dataset consisting of the GPS trajectories of 10 million travels accomplished by 150,000 cars in Italy, we investigate how known mobility models apply to car travels, and illustrate novel analytical findings. We also assess to what extent the sample in our dataset is representative of the overall car mobility, and discover how to build an extremely accurate model that, given our GPS data, estimates the real traffic values as measured by road sensors.
Full-text available
Lévy flights are a special class of random walks whose step lengths are not constant but rather are chosen from a probability distribution with a power-law tail. Realizations of Lévy flights in physical phenomena are very diverse, examples including fluid dynamics, dynamical systems, and micelles1,2. This diversity raises the possibility that Lévy flights may be found in biological systems. A decade ago, it was proposed that Lévy flights may be observed in the behaviour of foraging ants3. Recently, it was argued that Drosophila might perform Lévy flights4, but the hypothesis that foraging animals in natural environments perform Lévy flights has not been tested. Here we study the foraging behaviour of the wandering albatross Diomedea exulans, and find a power-law distribution of flight-time intervals. We interpret our finding of temporal scale invariance in terms of a scale-invariant spatial distribution of food on the ocean surface. Finally, we examine the significance of our finding in relation to the basis of scale-invariant phenomena observed in biological systems.
Full-text available
A range of applications, from predicting the spread of human and electronic viruses to city planning and resource management in mobile communications, depend on our ability to foresee the whereabouts and mobility of individuals, raising a fundamental question: To what degree is human behavior predictable? Here we explore the limits of predictability in human dynamics by studying the mobility patterns of anonymized mobile phone users. By measuring the entropy of each individual’s trajectory, we find a 93% potential predictability in user mobility across the whole user base. Despite the significant differences in the travel patterns, we find a remarkable lack of variability in predictability, which is largely independent of the distance users cover on a regular basis.
The dynamics of many social, technological and economic phenomena are driven by individual human actions, turning the quantitative understanding of human behaviour into a central question of modern science. Current models of human dynamics, used from risk assessment to communications, assume that human actions are randomly distributed in time and thus well approximated by Poisson processes. In contrast, there is increasing evidence that the timing of many human activities, ranging from communication to entertainment and work patterns, follows non-Poisson statistics, characterized by bursts of rapidly occurring events separated by long periods of inactivity. Here I show that the bursty nature of human behaviour is a consequence of a decision-based queuing process: when individuals execute tasks based on some perceived priority, the timing of the tasks will be heavy tailed, with most tasks being rapidly executed, whereas a few experience very long waiting times. In contrast, random or priority blind execution is well approximated by uniform inter-event statistics. These findings have important implications, ranging from resource management to service allocation, in both communications and retail.
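The decision-based queuing process described above can be sketched as a small simulation: a fixed-length task list in which, at each step, the highest-priority task is usually executed and then replaced by a new task with a random priority. This is a hedged sketch in the spirit of the description, not the author's implementation; the function name `simulate_queue` and the parameters `queue_len`, `p_priority`, and `seed` are assumptions.

```python
import random


def simulate_queue(n_steps, queue_len=2, p_priority=0.999, seed=0):
    """Return the waiting time of each executed task.

    With probability p_priority the highest-priority task is executed;
    otherwise a task is picked at random (priority-blind execution).
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Each task is stored as (priority, step at which it entered the queue).
    queue = [(rng.random(), 0) for _ in range(queue_len)]
    waits = []
    for t in range(1, n_steps + 1):
        if rng.random() < p_priority:
            idx = max(range(queue_len), key=lambda i: queue[i][0])
        else:
            idx = rng.randrange(queue_len)
        waits.append(t - queue[idx][1])
        # The executed task is replaced by a fresh task with random priority.
        queue[idx] = (rng.random(), t)
    return waits


if __name__ == "__main__":
    by_priority = simulate_queue(100_000, p_priority=0.999)
    at_random = simulate_queue(100_000, p_priority=0.0)
    print("longest wait, priority-driven:", max(by_priority))
    print("longest wait, random choice:", max(at_random))
```

Comparing the two runs gives a rough feel for the contrast the abstract draws between priority-driven and priority-blind execution: in the priority-driven run, low-priority tasks can remain in the queue for a long time before being executed.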
The dynamic spatial redistribution of individuals is a key driving force of various spatiotemporal phenomena on geographical scales. It can synchronize populations of interacting species, stabilize them, and diversify gene pools. Human travel, for example, is responsible for the geographical spread of human infectious disease. In the light of increasing international trade, intensified human mobility and the imminent threat of an influenza A epidemic, the knowledge of dynamical and statistical properties of human travel is of fundamental importance. Despite its crucial role, a quantitative assessment of these properties on geographical scales remains elusive, and the assumption that humans disperse diffusively still prevails in models. Here we report on a solid and quantitative assessment of human travelling statistics by analysing the circulation of bank notes in the United States. Using a comprehensive data set of over a million individual displacements, we find that dispersal is anomalous in two ways. First, the distribution of travelling distances decays as a power law, indicating that trajectories of bank notes are reminiscent of scale-free random walks known as Lévy flights. Second, the probability of remaining in a small, spatially confined region for a time T is dominated by algebraically long tails that attenuate the superdiffusive spread. We show that human travelling behaviour can be described mathematically on many spatiotemporal scales by a two-parameter continuous-time random walk model to a surprising accuracy, and conclude that human travel on geographical scales is an ambivalent and effectively superdiffusive process.
Despite their importance for urban planning, traffic forecasting and the spread of biological and mobile viruses, our understanding of the basic laws governing human motion remains limited owing to the lack of tools to monitor the time-resolved location of individuals. Here we study the trajectory of 100,000 anonymized mobile phone users whose position is tracked for a six-month period. We find that, in contrast with the random trajectories predicted by the prevailing Lévy flight and random walk models, human trajectories show a high degree of temporal and spatial regularity, each individual being characterized by a time-independent characteristic travel distance and a significant probability to return to a few highly frequented locations. After correcting for differences in travel distances and the inherent anisotropy of each trajectory, the individual travel patterns collapse into a single spatial probability distribution, indicating that, despite the diversity of their travel history, humans follow simple reproducible patterns. This inherent similarity in travel patterns could impact all phenomena driven by human mobility, from epidemic prevention to emergency response, urban planning and agent-based modelling.
Scale invariant patterns have been found in different biological systems, in many cases resembling what physicists have found in other nonbiological systems. Here we describe the foraging patterns of free-ranging spider monkeys (Ateles geoffroyi) in the forest of the Yucatan Peninsula, Mexico, and find that these patterns resemble what physicists know as Lévy walks. First, the length of a trajectory's constituent steps, or continuous moves in the same direction, is best described by a power-law distribution in which the frequency of ever larger steps decreases as a negative power function of their length. The rate of this decrease is very close to that predicted by a previous analytical Lévy walk model to be an optimal strategy to search for scarce resources distributed at random (Viswanathan et al. 1999). Second, the frequency distribution of the duration of stops or waiting times also approximates a power-law function. Finally, the mean square displacement during the monkeys' first foraging trip increases more rapidly than would be expected from a random walk with constant step length, but within the range predicted for Lévy walks. In view of these results, we analyze the different exponents characterizing the trajectories described by females and males, and by monkeys on their own or when part of a subgroup. We discuss the origin of these patterns and their implications for the foraging ecology of spider monkeys.
Longitudinal behavioral data generally contains a significant amount of structure. In this work, we identify the structure inherent in daily behavior with models that can accurately analyze, predict, and cluster multimodal data from individuals and communities within the social network of a population. We represent this behavioral structure by the principal components of the complete behavioral dataset, a set of characteristic vectors we have termed eigenbehaviors. In our model, an individual’s behavior over a specific day can be approximated by a weighted sum of his or her primary eigenbehaviors. When these weights are calculated halfway through a day, they can be used to predict the day’s remaining behaviors with 79% accuracy for our test subjects. Additionally, we demonstrate the potential for this dimensionality reduction technique to infer community affiliations within the subjects’ social network by clustering individuals into a “behavior space” spanned by a set of their aggregate eigenbehaviors. These behavior spaces make it possible to determine the behavioral similarity between both individuals and groups, enabling 96% classification accuracy of community affiliations within the population-level social network. Additionally, the distance between individuals in the behavior space can be used as an estimate for relational ties such as friendship, suggesting strong behavioral homophily amongst the subjects. This approach capitalizes on the large amount of rich data previously captured during the Reality Mining study from mobile phones continuously logging location, proximate phones, and communication of 100 subjects at MIT over the course of 9 months. As wearable sensors continue to generate these types of rich, longitudinal datasets, dimensionality reduction techniques such as eigenbehaviors will play an increasingly important role in behavioral research.