Article
PDF Available

Abstract and Figures

The availability of massive digital traces of human whereabouts has offered a series of novel insights on the quantitative patterns characterizing human mobility. In particular, numerous recent studies have led to an unexpected consensus: the considerable variability in the characteristic travelled distance of individuals coexists with a high degree of predictability of their future locations. Here we shed light on this surprising coexistence by systematically investigating the impact of recurrent mobility on the characteristic distance travelled by individuals. Using both mobile phone and GPS data, we discover the existence of two distinct classes of individuals: returners and explorers. As existing models of human mobility cannot explain the existence of these two classes, we develop more realistic models able to capture the empirical findings. Finally, we show that returners and explorers play a distinct quantifiable role in spreading phenomena and that a correlation exists between their mobility patterns and social interactions.
... The second line of research focuses on discovering the statistical laws that govern human mobility. These studies document that, far from being random, human mobility is characterized by predictable patterns, such as a stunning heterogeneity of human travel patterns (González et al. 2008); a strong tendency to routine and a high degree of predictability of individuals' future whereabouts (Song, Qu, Blumm, and Barabási 2010b); the presence of the returners and explorers dichotomy (Pappalardo, Simini, Rinzivillo, Pedreschi, Giannotti, and Barabási 2015); a conservative quantity in the number of locations actively visited by individuals (Alessandretti, Sapiezynski, Sekara, Lehmann, and Baronchelli 2018), and more (Barbosa et al. 2018; Luca et al. 2021). These quantifiable patterns are universal across different territories and data sources and are usually referred to as the "laws" of human mobility. ...
... Mobility data describe the movements of a set of moving objects during a period of observation. The objects may represent individuals (González et al. 2008), animals (Ramos-Fernández, Mateos, Miramontes, Cocho, Larralde, and Ayala-Orozco 2004), private vehicles (Pappalardo et al. 2015), boats (Fernandez Arguedas et al. 2018), or even players on a sports field (Rossi et al. 2018). Mobility data are generally collected in an automatic way as a by-product of human activity on electronic devices (e.g., mobile phones, GPS devices, so- ... [Figure 1: Representation of a 'TrajDataFrame' object, with columns latitude, longitude, time stamp, and object identifier.] ...
... Individual measures summarize the mobility patterns of a single moving object, while collective measures summarize the mobility patterns of a population as a whole. For instance, the so-called radius of gyration (González et al. 2008) and its variants (Pappalardo et al. 2015) quantify the characteristic distance traveled by an individual. Several measures inspired by the Shannon entropy have been proposed to quantify the predictability of an individual's movements (Song et al. 2010b). ...
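As an illustration of the radius of gyration mentioned above, the following is a minimal sketch and not the implementation used by any of the cited works. It assumes the common definition (root mean square distance of an individual's recorded points from their center of mass), approximates the center of mass by the arithmetic mean of latitude and longitude, and measures distances with the haversine formula. The function names and the Earth radius constant are illustrative choices.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def radius_of_gyration_km(points):
    """Root mean square distance of the points from their center of mass.

    `points` is a non-empty list of (lat, lon) tuples, one per recorded
    position. The center of mass is approximated by the mean latitude and
    longitude, which is reasonable only for trajectories far from the poles
    and the antimeridian.
    """
    if not points:
        raise ValueError("at least one point is required")
    n = len(points)
    center = (sum(lat for lat, _ in points) / n,
              sum(lon for _, lon in points) / n)
    return math.sqrt(sum(haversine_km(p, center) ** 2 for p in points) / n)


if __name__ == "__main__":
    # Hypothetical trajectory: three positions recorded for one individual.
    trajectory = [(43.72, 10.40), (43.72, 10.40), (43.77, 11.25)]
    print(round(radius_of_gyration_km(trajectory), 2), "km")
```

An individual who never leaves a single location has a radius of gyration of zero, while occasional long trips inflate the value quickly because distances enter squared.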
Article
Full-text available
The last decade has witnessed the emergence of massive mobility datasets, such as tracks generated by GPS devices, call detail records, and geo-tagged posts from social media platforms. These datasets have fostered a vast scientific production on various applications of mobility analysis, ranging from computational epidemiology to urban planning and transportation engineering. One strand of literature addresses data cleaning issues related to raw spatiotemporal trajectories, while a second line of research focuses on discovering the statistical "laws" that govern human movements. A significant effort has also been put into designing algorithms to generate synthetic trajectories able to reproduce, realistically, the laws of human mobility. Last but not least, a line of research addresses the crucial problem of privacy, proposing techniques to perform the re-identification of individuals in a database. A survey of the state of the art cannot avoid noticing that no statistical software supports scientists and practitioners in all the aspects of mobility data analysis mentioned above. In this paper, we propose scikit-mobility, a Python library that has the ambition of providing an environment to reproduce existing research, analyze mobility data, and simulate human mobility habits. scikit-mobility is efficient and easy to use as it extends pandas, a popular Python library for data analysis. Moreover, scikit-mobility provides the user with many functionalities, from visualizing trajectories to generating synthetic data, from analyzing statistical patterns to assessing the privacy risk related to the analysis of mobility datasets.
... Mobility (migration regardless of distance and borders) more broadly has traditionally been the focus of transport studies and urban planning (e.g., Pappalardo et al., 2015; Song et al., 2010). There has been some indication that intentions (also expressed in Google searches) are in fact a relevant predictor of future migration both at the national and international levels (e.g., Van Dalen & Henkens, 2013; Tjaden et al., 2019; Böhme et al., 2018). ...
... Extended Detail Records (XDRs) and Control Plane Records (CPRs) also record locations when something is downloaded or phones are switching antennas, providing more complete coverage. Measuring human mobility (within countries, regions or cities) is an exploding field of research spanning physics and network science, to data mining, and has fueled advances from public health to transportation engineering, urban planning, official statistics and the design of smart cities (see, e.g., Song et al., 2010; Pappalardo et al., 2015). ...
Article
Full-text available
The interest in human migration is at its all-time high, yet data to measure migration is notoriously limited. “Big data” or “digital trace data” have emerged as new sources of migration measurement complementing ‘traditional’ census, administrative and survey data. This paper reviews the strengths and weaknesses of eight novel, digital data sources along five domains: reliability, validity, scope, access and ethics. The review highlights the opportunities for migration scholars but also stresses the ethical and empirical challenges. This review intends to be of service to researchers and policy analysts alike and help them navigate this new and increasingly complex field.
... Indeed, such an exploitation-exploration (specialization-diversification) dichotomy is a common mechanism governing the dynamics of many diverse self-organized and adaptive systems [31][32][33][34]. ...
Preprint
Full-text available
From sports to science, the recent availability of large-scale data has made it possible to gain insights into the drivers of human innovation and success in a variety of domains. Here we quantify human performance in the popular game of chess by leveraging a very large dataset comprising over 120 million games between almost 1 million players. We find that individuals encounter hot streaks of repeated success, longer for beginners than for expert players, and even longer cold streaks of unsatisfying performance. Skilled players can be distinguished from the others based on their gaming behaviour. Differences appear from the very first moves of the game, with experts tending to specialize and repeat the same openings while beginners explore and diversify more. However, experts experience a broader response repertoire, and display a deeper understanding of different variations within the same line. Over time, the opening diversity of a player tends to decrease, hinting at the development of individual playing styles. Nevertheless, we find that players are often not able to recognize their most successful openings. Overall, our work contributes to quantifying human performance in competitive settings, providing a first large-scale quantitative analysis of individual careers in chess, helping unveil the determinants separating elite from beginner performance.
... Human mobility analysis strives to comprehend the intrinsic properties of human movements and the mechanisms behind observed patterns (Szell et al., 2012; Xu et al., 2018). Models of human mobility indicate a high degree of regularity instead of randomness in human population movements (González et al., 2008; Gao et al., 2020) and thus enable a certain level of predictability (Badr et al., 2020; Xiong et al., 2020) at multiple spatiotemporal scales (Simini et al., 2012; Pappalardo et al., 2015; Yan et al., 2017). These patterns of movements change with holidays (Deville et al., 2014) as well as emergencies (Wesolowski et al., 2015). ...
Article
Full-text available
In response to the coronavirus disease 2019 (COVID-19) pandemic, various countries have sought to control COVID-19 transmission by introducing non-pharmaceutical interventions. Restricting population mobility, by introducing social distancing, is one of the most widely used non-pharmaceutical interventions. Although similar population mobility restriction interventions were introduced, their impacts on COVID-19 transmission are often inconsistent across different regions and different time periods. These differences may provide critical information for tailoring COVID-19 control strategies. In this paper, anonymized high spatiotemporal resolution mobile-phone location data were employed to empirically analyze and quantify the impact of lockdowns on population mobility. Both the Guangdong-Hong Kong-Macao Greater Bay Area (GBA) in China and the San Francisco Bay Area (SBA) in the United States were studied. In response to the lockdowns, a general reduction in population mobility was observed, but the structural changes in mobility are very different between the two bays: 1) GBA mobility decreased by approximately 74.0–80.1% while the decrease of SBA was about 25.0–42.1%; 2) compared to SBA, the GBA had smoother volatility in daily volume during the lockdown. The volatility change indexes for GBA and SBA were 2.55% and 7.52%, respectively; 3) the effect of lockdown on short- to long-distance mobility was similar in GBA while the medium- and long-distance impact was more pronounced in SBA.
... Gonzalez et al. [24] explored the statistical properties of human mobility patterns with indices such as distribution of displacements, radius of gyration distribution, and return probability. Recent research proposed various individual human mobility models, including the exploration and preferential return (EPR) model [25], the d-EPR model [26], the TimeGeo model [27], and the model with memory effect and population-induced competition [28]. Most early studies and current works discovered that human mobility follows reproducible laws, which led to the research on the predictability of human mobility [29][30][31]. ...
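The exploration and preferential return (EPR) mechanism named above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of any cited paper: at each step the individual visits a new location with probability rho * S^(-gamma), where S is the number of distinct locations visited so far, and otherwise returns to a known location chosen with probability proportional to its past visit count. The spatial part of the model (how far a new location lies from the current one) and waiting times are omitted, and the default parameter values are illustrative.

```python
import random


def simulate_epr(n_steps, rho=0.6, gamma=0.21, seed=None):
    """Simulate a sequence of visited location ids under a simplified EPR rule.

    Returns (trajectory, visits), where `trajectory` is the list of location
    ids in visiting order and `visits` maps each location id to its count.
    The parameters rho and gamma are illustrative defaults (assumptions).
    """
    rng = random.Random(seed)
    visits = {0: 1}      # the individual starts at location 0
    trajectory = [0]
    for _ in range(n_steps):
        n_distinct = len(visits)
        p_new = rho * n_distinct ** (-gamma)
        if rng.random() < p_new:
            # Exploration: move to a location never visited before.
            location = n_distinct
            visits[location] = 1
        else:
            # Preferential return: pick a known location with probability
            # proportional to how often it has been visited.
            known = list(visits)
            weights = [visits[loc] for loc in known]
            location = rng.choices(known, weights=weights, k=1)[0]
            visits[location] += 1
        trajectory.append(location)
    return trajectory, visits


if __name__ == "__main__":
    trajectory, visits = simulate_epr(1000, seed=42)
    top = sorted(visits.values(), reverse=True)[:5]
    print("distinct locations:", len(visits))
    print("visit counts of the five most visited locations:", top)
```

Because the exploration probability decays as more locations are discovered, a few locations accumulate most of the visits, which is the qualitative behaviour the model is meant to reproduce.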
Article
Full-text available
Although ordinary human mobility has been extensively studied, anomalous human mobility during emergencies or mass events is not sufficiently understood. The recently proposed vector field approach has shed light on human mobility studies. Here, the authors improve it to analyze anomalous human mobility in mass events. Specifically, the authors develop the anomalous field, the source field, and the dispersion field to identify the crowd gathering location, the start time, and the end time of anomalous human mobility. In addition, the authors propose the decay coefficient and the maximum distance to quantify the influence degree and scope of a mass event. The present approach can be used to capture the spatiotemporal characteristics of human mobility in mass events.
Article
The study of social characteristics in human interactions is a fundamental topic in mobile networks. By taking advantage of this type of information, we can leverage the knowledge about the behavior of the nodes in a network, leading to better routing strategies, especially in opportunistic dissemination scenarios. The state of the art of social-based opportunistic routing algorithms applies simple social characterization metrics, such as node’s properties and communities, which cannot capture individual social links that last longer and represent stronger bonds when compared to communities. This work proposes SocialRoute, a social-based opportunistic routing algorithm that considers individual social links instead of communities to route messages efficiently. Additionally, we evaluate how deploying static relay nodes at popular locations can aid in the transmission by using dissemination profiles that can be selected based on the message content and priority. We show that SocialRoute can transmit data with a low computational cost, both in terms of obtaining the social characteristics and overhead generated during dissemination. Additionally, this approach provides a more equally distributed dissemination between the nodes, in opposition to strategies that rely heavily on individual nodes due to their popularity. Finally, we evaluate our proposed algorithm using two real mobility traces and compare it to four state-of-the-art solutions, showing that SocialRoute obtains higher delivery ratios while maintaining a fraction of the overhead.
Preprint
Full-text available
The return of normalcy to the population's lifestyle is a critical recovery milestone in the aftermath of disasters, and delayed lifestyle recovery could lead to significant well-being impacts. Lifestyle recovery captures the collective effects of population activities and the restoration of infrastructure and business services. This study uses a novel approach to leverage privacy-enhanced location intelligence data to characterize distinctive lifestyle patterns and to unveil recovery trajectories after a disaster in the context of 2017 Hurricane Harvey in Harris County, Texas. The analysis integrates multiple data sources to record the number of visits from home census block groups (CBGs) to different points of interest during the baseline period and disruptive period. First, primary clustering using k-means characterized four distinct essential and non-essential lifestyle patterns. Then, secondary clustering characterized the impact of the hurricane into three recovery trajectories based on the severity of maximum disruption and duration of recovery. The results reveal multiple recovery trajectories and durations within each lifestyle cluster, which imply differential recovery rates among similar lifestyle and demographic groups. The findings offer a twofold theoretical significance: (1) lifestyle recovery is a critical milestone that needs to be examined, quantified, and monitored in the aftermath of disasters; (2) the spatial structures of cities formed by human mobility and the distribution of facilities extend the spatial reach of flood impacts on population lifestyles. The analysis and findings also provide novel data-driven insights for public officials and emergency managers to examine, measure, and monitor a critical milestone in community recovery trajectory based on the return of lifestyles to normalcy.
Article
Existing Bluetooth-based private contact tracing (PCT) systems can privately detect whether people have come into direct contact with patients with COVID-19. However, we find that the existing systems lack functionality and flexibility, which may hurt the success of contact tracing. Specifically, they cannot detect indirect contact (e.g., people may be exposed to COVID-19 by using a contaminated sheet at a restaurant without making direct contact with the infected individual); they also cannot flexibly change the rules of “risky contact,” such as the duration of exposure or the distance (both spatially and temporally) from a patient with COVID-19 that is considered to result in a risk of exposure, which may vary with the environmental situation. In this article, we propose an efficient and secure contact tracing system that enables us to trace both direct contact and indirect contact. To address the above problems, we need to utilize users’ trajectory data for PCT, which we call trajectory-based PCT. We formalize this problem as a spatiotemporal private set intersection that satisfies both the security and efficiency requirements. By analyzing different approaches such as homomorphic encryption, which could be extended to solve this problem, we identify the trusted execution environment (TEE) as a candidate method to achieve our requirements. The major challenge is how to design algorithms for a spatiotemporal private set intersection under the limited secure memory of the TEE. To this end, we design a TEE-based system with flexible trajectory data encoding algorithms. Our experiments on real-world data show that the proposed system can process hundreds of queries on tens of millions of records of trajectory data within a few seconds.
Article
Human mobility pattern analysis has received rising attention. However, little is known about the mobility patterns of private Electric Vehicle (EV) users. In response, this paper characterized mobility patterns of private EV users using a unique one-month dataset containing moving trajectories of 76,774 actual private EVs in January 2018 in Beijing. Specifically, we first explored the diversity, regularity, spatial extent, and uniqueness of EV users’ mobility patterns. The results suggested that most EV users had both regular travel and activity patterns (the mean travel and activity entropies were 2.17 and 1.83, respectively) with special preferences towards some specific activity locations relative to all the locations they visited (the mean number of activity locations visited was 13.57 in one month). Furthermore, they tended to perform activities within a small geographical area (the mean radius of gyration was 7.60 km) and have a short daily travel distance (the mean value was 37.35 km) relative to their electric driving range. Further, we associated EV users’ mobility patterns with the built environment through ordinary least squares and geographically weighted regression models, particularly considering the so-called modifiable areal unit problem (MAUP). Due to the MAUP, most of the statistically significant built environment variables varied across spatial analysis units (SAUs). Gymnasia was the only variable statistically associated with the mobility patterns for all SAUs; while the variables related to residence and workplace were not statistically associated.
Article
Full-text available
The timely, accurate monitoring of social indicators, such as poverty or inequality, on a fine-grained spatial and temporal scale is a crucial tool for understanding social phenomena and policymaking, but poses a great challenge to official statistics. This article argues that an interdisciplinary approach, combining the body of statistical research in small area estimation with the body of research in social data mining based on Big Data, can provide novel means to tackle this problem successfully. Big Data derived from the digital crumbs that humans leave behind in their daily activities are in fact providing ever more accurate proxies of social life. Social data mining from these data, coupled with advanced model-based techniques for fine-grained estimates, has the potential to provide a novel microscope through which to view and understand social complexity. This article suggests three ways to use Big Data together with small area estimation techniques, and shows how Big Data has the potential to mirror aspects of well-being and other socioeconomic phenomena.
Article
Full-text available
This study leverages mobile phone data to analyze human mobility patterns in a developing nation, especially in comparison to those of a more industrialized nation. Developing regions, such as the Ivory Coast, are marked by a number of factors that may influence mobility, such as less infrastructural coverage and maturity, less economic resources and stability, and in some cases, more cultural and language-based diversity. By comparing mobile phone data collected from the Ivory Coast to similar data collected in Portugal, we are able to highlight both qualitative and quantitative differences in mobility patterns - such as differences in likelihood to travel, as well as in the time required to travel - that are relevant to consideration on policy, infrastructure, and economic development. Our study illustrates how cultural and linguistic diversity in developing regions (such as Ivory Coast) can present challenges to mobility models that perform well and were conceptualized in less culturally diverse regions. Finally, we address these challenges by proposing novel techniques to assess the strength of borders in a regional partitioning scheme and to quantify the impact of border strength on mobility model accuracy.
Article
Full-text available
This study leverages mobile phone data to analyze human mobility patterns in developing countries, especially in comparison to more industrialized countries. Developing regions, such as the Ivory Coast, are marked by a number of factors that may influence mobility, such as less infrastructural coverage and maturity, less economic resources and stability, and in some cases, more cultural and language-based diversity. By comparing mobile phone data collected from the Ivory Coast to similar data collected in Portugal, we are able to highlight both qualitative and quantitative differences in mobility patterns - such as differences in likelihood to travel, as well as in the time required to travel - that are relevant to consideration on policy, infrastructure, and economic development. Our study illustrates how cultural and linguistic diversity in developing regions (such as Ivory Coast) can present challenges to mobility models that perform well and were conceptualized in less culturally diverse regions. Finally, we address these challenges by proposing novel techniques to assess the strength of borders in a regional partitioning scheme and to quantify the impact of border strength on mobility model accuracy.
Article
Given its effective techniques and theories from various sources and fields, data science is playing a vital role in transportation research and the consequences of the inevitable switch to electronic vehicles. This fundamental insight provides a step towards the solution of this important challenge. Data Science and Simulation in Transportation Research highlights entirely new and detailed spatial-temporal micro-simulation methodologies for human mobility and the emerging dynamics of our society. Bringing together novel ideas grounded in big data from various data mining and transportation science sources, this book is an essential tool for professionals, students, and researchers in the fields of transportation research and data mining.
Article
The potential of low-frequency bus localization data for the monitoring and control of bus system performance is investigated in this paper. It is shown that data with a sampling rate as low as 1 min, when processed appropriately, can provide ample information. Accurate estimates of stop arrival and departure times are obtained; these estimates in turn allow the analysis of headways and travel times. A three-parameter gamma family of distributions is fitted for headways at the stops along a bus line. The evolution of the parameters demonstrates critical points on the line where bus bunching is significantly increased. Moreover, this analysis allows differentiating problems associated with varying passenger demand from uncertainties associated with traffic conditions. Furthermore it is shown that expected travel time and travel time variability can be calculated from low-frequency localization data. Finally, the way in which the results can be used to calibrate a simulation model that can test bus control strategies is presented. The methods are applied and validated to data obtained from Bus Route Number 1 in Boston, Massachusetts.