Article

Exploring Millions of Footprints in Location Sharing Services

Authors:
To read the full-text of this research, you can request a copy directly from the authors.

Abstract

Location sharing services (LSS) like Foursquare, Gowalla, and Facebook Places support hundreds of millions of user-driven footprints (i.e., "checkins"). Those global-scale footprints provide a unique opportunity to study the social and temporal characteristics of how people use these services and to model patterns of human mobility, which are significant factors for the design of future mobile+location-based services, traffic forecasting, urban planning, as well as epidemiological models of disease spread. In this paper, we investigate 22 million checkins across 220,000 users and report a quantitative assessment of human mobility patterns by analyzing the spatial, temporal, social, and textual aspects associated with these footprints. We find that: (i) LSS users follow the “Levy Flight” mobility pattern and adopt periodic behaviors; (ii) While geographic and economic constraints affect mobility patterns, so does individual social status; and (iii) Content and sentiment-based analysis of posts associated with checkins can provide a rich source of context for better understanding how users engage with these services.

... To date, researchers have relied on mobility models (Johnson and Maltz 1996; Jardosh et al. 2003), geotagged data via IP addresses (e.g., Wikipedia), or user check-in data available via APIs (e.g., Foursquare) (Cheng et al. 2011; Noulas and others 2011; Sen and others 2015; Hecht and Stephens 2014). Check-in data is attractive because it is relatively easy to access via APIs or scraping and captures a popular user practice, thus offering relatively high penetration rates. ...
... LBSNs provide a unique source to collect detailed and large-scale human mobility traces, which have been widely used in human mobility research. For instance, researchers have studied the spatial and temporal mobility characteristics using check-in datasets (Noulas and others 2012; Cheng et al. 2011; Noulas and others 2011). Others use check-in traces to build various applications, such as predicting human movements (Cho, Myers, and Leskovec 2011; Scellato and others 2011), inferring friendship (Scellato, Noulas, and Mascolo 2011), predicting customer volume (Georgiev, Noulas, and Mascolo 2014), measuring urban socioeconomic deprivation (Venerandi and others 2015), and even improving the efficiency of content delivery networks. ...
... Our work has direct implications on research efforts that apply check-in data to study human mobility without careful consideration of biases and limitations. For example, Cheng et al. analyze Foursquare check-ins, and use their results to report a strong periodic pattern in human movements (Cheng et al. 2011). Later work further leverages these patterns to predict users' future movement (Cho, Myers, and Leskovec 2011; Scellato and others 2011). ...
Article
Social computing researchers are using data from location-based social networks (LBSN), e.g., "Check-in" traces, as approximations of human movement. Recent work has questioned the validity of this approach, showing large discrepancies between check-in data and actual user mobility. To further validate and understand such discrepancies, we perform a crowdsourced study of Foursquare users that seeks to a) quantify bias and misrepresentation in check-in datasets and the impact of self-selection in prior studies, and b) understand the motivations behind misrepresentation of check-ins, and the potential impact of any system changes designed to curtail such misbehavior. Our results confirm the presence of significant misrepresentation of location check-ins on Foursquare. They also show that while "extraneous" check-ins are motivated by external rewards provided by the system, "missing" check-ins are motivated by personal concerns such as location privacy. Finally, we discuss the broader implications of our findings to the use of check-in datasets in future research on human mobility.
... Other studies use large-scale data, including LBSN data, and data mining techniques to understand which factors may be associated with people's movement patterns. For example, Cheng et al. [2021] used geolocated data from Twitter to understand user movements. The authors associated this spatial information with the economic characteristics of users, the geographic aspects of the areas frequented, as well as their positioning within the social network and the language used in their check-ins. ...
... A plausible explanation lies in the differing ways users engage with LBSNs. This result aligns with expectations, as several studies [Cheng et al., 2021; González et al., 2008; Rhee et al., 2008; Brockmann et al., 2006] highlight the tendency for users' travel distances, both in LBSNs and in similar datasets, to cluster at shorter distances and become increasingly rare over greater distances, as we have demonstrated. ...
Article
Full-text available
Location-Based Social Networks (LBSNs) are valuable for understanding urban behavior and providing useful data on user preferences. Modeling their data into graphs like interest networks (iNETs) offers important insights for urban area recommendations, mobility forecasting, and public policy development. This study uses check-ins and venue reviews to compare the iNETs resulting from two distinct LBSNs, Foursquare and Google Places. Although these two LBSNs differ in nature, with data varying in regularity and purpose, their resulting iNETs reveal similar urban behavior patterns. When analyzing the impact of socioeconomic, political, and geographic factors on iNET edges (each edge representing users' interests in a pair of regions), only geographic factors showed a significant influence. When studying the granularity of area sizes to model iNETs, we highlight important trade-offs between larger and smaller sizes. Additionally, we propose a methodology to identify clusters of geographically neighboring areas where user interest is strongest, which can be advantageous for understanding urban space usage.
... Many interesting temporal patterns of human mobility have also been discovered in previous work. For example, (Cheng et al. 2011) indicates that people return to specific locations on a daily basis, and (Noulas et al. 2011) reports the different mobility levels of people on weekdays and weekends. Without modelling the distance in the space, a number of entropy-based approaches are proposed to investigate the randomness of human mobility (Cranshaw et al. 2010;Smith et al. 2014). ...
... Next, we validate our analogy for two well studied temporal properties of human physical mobility: distribution of returning probability (Gonzalez, Hidalgo, and Barabasi 2008;Cheng et al. 2011) and hourly mobility levels (Noulas et al. 2011;Hu et al. 2016). Returning probability measures the periodic patterns of human mobility. ...
Preprint
Full-text available
With the wide adoption of the multi-community setting in many popular social media platforms, the increasing user engagements across multiple online communities warrant research attention. In this paper, we introduce a novel analogy between the movements in the cyber space and the physical space. This analogy implies a new way of studying human online activities by modelling the activities across online communities in a similar fashion as the movements among locations. First, we quantitatively validate the analogy by comparing several important properties of human online activities and physical movements. Our experiments reveal striking similarities between the cyber space and the physical space. Next, inspired by the established methodology on human mobility in the physical space, we propose a framework to study human "mobility" across online platforms. We discover three interesting patterns of user engagements in online communities. Furthermore, our experiments indicate that people with different mobility patterns also exhibit divergent preferences to online communities. This work not only attempts to achieve a better understanding of human online activities, but also intends to open a promising research direction with rich implications and applications.
... Pioneer studies, based on CDR [1,2] and banknote records [3], found that the distribution of displacement $\Delta r$ is well approximated by a power law, $P(\Delta r) \sim \Delta r^{-\beta}$ (or 'Lévy distribution' [4], as typically $1 < \beta < 3$), and that an exponential cut-off in the distribution may control boundary effects [2]. These findings were confirmed by studies based on GPS trajectories of individuals [5,6,7] and vehicles [8,9], as well as online social networks data [10,11,12]. It has been noted, however, that power-law behaviour may fail to describe intra-urban displacements [13]. ...
Preprint
The recent availability of digital traces generated by phone calls and online logins has significantly increased the scientific understanding of human mobility. Until now, however, limited data resolution and coverage have hindered a coherent description of human displacements across different spatial and temporal scales. Here, we characterise mobility behaviour across several orders of magnitude by analysing ~850 individuals' digital traces sampled every ~16 seconds for 25 months with ~10 meters spatial resolution. We show that the distributions of distances and waiting times between consecutive locations are best described by log-normal distributions and that natural time-scales emerge from the regularity of human mobility. We point out that log-normal distributions also characterise the patterns of discovery of new places, implying that they are not a simple consequence of the routine of modern life.
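As an illustration of the power-law form of the displacement distribution quoted in the snippet above, the following minimal Python sketch estimates the exponent β from a list of displacements. It uses the standard maximum-likelihood estimator for a continuous power law with a known lower cutoff; it is not taken from any of the cited papers. The function names, the cutoff `r_min`, and the synthetic sample are all assumptions made for the example.

```python
import math
import random


def powerlaw_exponent_mle(displacements, r_min):
    """Estimate beta in P(r) ~ r^(-beta) for r >= r_min.

    Uses the continuous power-law maximum-likelihood estimator
    beta = 1 + n / sum(ln(r_i / r_min)), applied to the tail r_i >= r_min.
    """
    tail = [r for r in displacements if r >= r_min]
    if not tail:
        raise ValueError("no displacements at or above r_min")
    log_sum = sum(math.log(r / r_min) for r in tail)
    if log_sum == 0.0:
        raise ValueError("all displacements equal r_min; exponent is undefined")
    return 1.0 + len(tail) / log_sum


def sample_powerlaw(beta, r_min, n, seed=0):
    """Draw n synthetic displacements by inverse-transform sampling (hypothetical helper)."""
    rng = random.Random(seed)
    # CDF: F(r) = 1 - (r / r_min)^(-(beta - 1)), inverted for a uniform u in [0, 1).
    return [r_min * (1.0 - rng.random()) ** (-1.0 / (beta - 1.0)) for _ in range(n)]


if __name__ == "__main__":
    synthetic = sample_powerlaw(beta=1.8, r_min=1.0, n=20000)
    print(f"estimated beta: {powerlaw_exponent_mle(synthetic, r_min=1.0):.3f}")
```

On real check-in data the choice of `r_min` matters, and a fit of this kind says nothing by itself about whether a power law describes the data better than an alternative such as a log-normal.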
... With the popularity of location-based social networks [17], users can share their real time activities by checking in at POIs, which provides a novel data source to study their collective behavior. For example, Cheng et al. [18] investigated 22 million checkins across 220,000 users and reported a quantitative assessment of human mobility patterns by analyzing the spatial, temporal, social, and textual aspects associated with these footprints. Noulas et al. [19] conducted an empirical study of geographic user activity patterns based on check-in data in Foursquare. ...
Preprint
In this paper we present the first population-level, city-scale analysis of application usage on smartphones. Using deep packet inspection at the network operator level, we obtained a geo-tagged dataset with more than 6 million unique devices that launched more than 10,000 unique applications across the city of Shanghai over one week. We develop a technique that leverages transfer learning to predict which applications are most popular and estimate the whole usage distribution based on the Point of Interest (POI) information of that particular location. We demonstrate that our technique has an 83.0% hitrate in successfully identifying the top five popular applications, and a 0.15 RMSE when estimating usage with just 10% sampled sparse data. It outperforms by about 25.7% over the existing state-of-the-art approaches. Our findings pave the way for predicting which apps are relevant to a user given their current location, and which applications are popular where. The implications of our findings are broad: it enables a range of systems to benefit from such timely predictions, including operating systems, network operators, appstores, advertisers, and service providers.
... Other works seek to relate different factors to mobility. Using geolocated Twitter data, [1] explores the relationships among the distances traveled by users, the regions they frequent, their popularity on the platform, and the language used in their posts, identifying a mobility pattern in which shorter distances are preferred over longer ones. In another line of investigation, [24] assesses whether Foursquare data resemble the data reported by the WTO (World Tourism Organization) with regard to international tourism. ...
Conference Paper
Full-text available
Location-Based Social Networks (LBSNs) can help model users’ interests in urban areas in several ways. In the present work, we focus on Interest Networks (iNETs), which result from modeling LBSN data into graphs. The present study provides insights into which areas are frequently visited together by getting data from two distinct LBSNs, Foursquare and Google Places. Although the studied LBSNs differ in nature, with data varying in regularity and purpose, both modeled iNETs revealed similar urban behavior patterns and were likewise impacted by socioeconomic and geographic factors. Also, we discuss the development of a tool to empower urban studies and the by-products of this research.
... Various studies have employed advanced modeling techniques, including machine learning methods, to validate the use of mobile data for estimating travel patterns (e.g. Tenkanen et al 2017, Ruktanonchai et al 2018, Yu et al 2019, Merrill et al 2020, Wood et al 2020, Cheng et al 2021, Liang et al 2022, Bian et al 2023, Massenkoff and Wilmers 2023. For example, Liang et al (2022) combined mobile device data with the American Community Survey and demonstrated that this approach can complement traditional survey data in understanding visitor demographics and temporal visitation patterns to national sites. ...
Article
Full-text available
Outdoor recreation plays a pivotal role in improving people’s physical and mental health, serving as a popular form of entertainment and a significant economic contributor. Limited access to these resources not only exacerbates health disparities but also deprives underserved areas of essential benefits like stress relief and community bonding, both of which are crucial for enhancing overall quality of life. This paper provides one of the first detailed analyses of water-based recreation at over 61 000 inland and coastal sites across the United States. We aim to explore disparities in recreational behavior across race, ethnicity, income, and socioeconomic status. Using Advan cellphone data from more than 70 million outdoor trips, representing 215 000 census block groups, we find that communities of color, rural areas, and socioeconomically disadvantaged groups are significantly underrepresented in water-based recreational visits. Despite living similar distances from recreational sites, these groups show notably different patterns in travel distance for water-based recreation. Additionally, we find Native Americans from underserved areas have to travel 3–5 times longer distances than other groups for water-based recreation. Our findings show that the extensive and frequent cellphone mobility data could reveal policy-relevant patterns especially those made by underserved Americans often overlooked in traditional household surveys.
... Data sets capturing individual-level movements have been used for (1) modeling and analyzing human mobility patterns (e.g., [15,16,20,69,70,87,88,102,110,111]), (2) recommending locations to users based on previously visited locations (for example, [11, 12, 24–27, 35, 48–50, 52, 54–56, 58–61, 68, 86, 100, 101, 112, 115, 116, 119–121, 124, 125, 127–131, 133, 134], surveyed in [6]), (3) predicting the next location to be visited by an individual (e.g., [5,13,33,53,57,84,96,114]), and (4) suggesting new friends to individuals based on similar interests observed by visiting similar locations ([14,34,67,80,89,99,104,108,113]). Publicly available real-world data sets have been the driving force for human mobility data science in recent years. These data sets mainly comprise trajectory data and location-based social network (LBSN) data. ...
Article
Full-text available
Human mobility data science using trajectories or check-ins of individuals has many applications. Recently, we have seen a plethora of research efforts that tackle these applications. However, research progress in this field is limited by a lack of large and representative datasets. The largest and most commonly used dataset of individual human trajectories captures fewer than 200 individuals while data sets of individual human check-ins capture fewer than 100 check-ins per city per day. Thus, it is not clear if findings from the human mobility data science community would generalize to large populations. Since obtaining massive, representative, and individual-level human mobility data is hard to come by due to privacy considerations, the vision of this paper is to embrace the use of data generated by large-scale socially realistic microsimulations. Informed by both real data and leveraging social and behavioral theories, massive spatially explicit microsimulations may allow us to simulate entire megacities at the person level. The simulated worlds, which do not capture any identifiable personal information, allow us to perform “in silico” experiments using the simulated world as a sandbox in which we have perfect information and perfect control without jeopardizing the privacy of any actual individual. In silico experiments have become commonplace in other scientific domains such as chemistry and biology, permitting experiments that foster the understanding of concepts without any harm to individuals. This work describes challenges and opportunities for leveraging massive and realistic simulated alternate worlds for in silico human mobility data science.
... Combining mobility data and social networks, LBSN data finds many applications. A first application found in the literature was on modeling and describing human mobility patterns (e.g., [55,168]), analyzing these patterns (e.g., [54]), and explaining why individual users choose locations and how social ties affect this choice (e.g., [239]). Another application is that of location recommendation, which leverages check-ins of users and their ratings in the user-location network to recommend new locations to users [26]. ...
Article
Mobility data captures the locations of moving objects such as humans, animals, and cars. With the availability of GPS-equipped mobile devices and other inexpensive location-tracking technologies, mobility data is collected ubiquitously. In recent years, the use of mobility data has demonstrated significant impact in various domains including traffic management, urban planning, and health sciences. In this paper, we present the domain of mobility data science. Towards a unified approach to mobility data science, we present a pipeline having the following components: mobility data collection, cleaning, analysis, management, and privacy. For each of these components, we explain how mobility data science differs from general data science, we survey the current state of the art, and describe open challenges for the research community in the coming years.
... This has presented tremendous opportunities across a spectrum of disciplines, including research questions related to spatial optimization and planning. As is widely known, advances in technology enable social media to generate a large amount of Geosocial network data with GPS location information, which is considered a new type of data assets (Cheng, Caverlee, Lee, and Sui, 2021;Stefanidis, Crooks, and Radzikowski, 2013). If analyzed properly, the data could be useful in many areas. ...
... These studies followed students on university campuses, volunteers in theme parks and state fairs, cabs on their journey through a city, or bank notes travelling through the United States. Two studies (Cheng et al. 2011;Noulas et al. 2012) showed that logins to location sharing services (LSS) like Foursquare also follow a Lévy-flight pattern on a global scale. Deutschmann (2016a; demonstrated that this pattern also holds for various forms of international mobility between countries worldwide, including migrants, tourists, asylum-seekers, and refugees. ...
... Combining mobility data and social networks, LBSN data finds many applications. A first application found in the literature was on modeling and describing human mobility patterns (e.g., [75,225,271,347]), analyzing these patterns (e.g., [74,226,328]), and explaining why individual users choose locations and how social ties affect this choice (e.g., [320,379]). Another application is that of location recommendation, which leverages check-ins of users and their ratings in the user-location network to recommend new locations to users [37,172,324,351]. ...
Preprint
Full-text available
Mobility data captures the locations of moving objects such as humans, animals, and cars. With the availability of GPS-equipped mobile devices and other inexpensive location-tracking technologies, mobility data is collected ubiquitously. In recent years, the use of mobility data has demonstrated significant impact in various domains including traffic management, urban planning, and health sciences. In this paper, we present the emerging domain of mobility data science. Towards a unified approach to mobility data science, we envision a pipeline having the following components: mobility data collection, cleaning, analysis, management, and privacy. For each of these components, we explain how mobility data science differs from general data science, we survey the current state of the art and describe open challenges for the research community in the coming years.
... Another characteristic present in the graph is the change in geographic position: vehicles move extensively in various directions. From latitude and longitude it is possible to compute the average amplitude of spatial movement, known as the radius of gyration $r_g$ [Cheng et al. 2011], which captures the distances between the different locations and the center of mass of a vehicle, as shown in Equation 5, where $r_i$ is a movement reading given by the pair (latitude, longitude) and $r_{cm}$ is the center of mass of the vehicle's movements (the mean of all readings). The value $(r_i - r_{cm})$ is the distance between a reading and the center of mass. ...
Conference Paper
Research on opportunistic vehicular networks has attracted attention with regard to contact selection for use in data networks. We present a GeoSocial model for contact selection and message routing in urban environments, based on the amplitude and frequency of movement and on the social structure of the vehicles. We consider the temporality of link formation in the network to extract this information. The model was evaluated on a real dataset of taxi movements. As a result, we show that with a small number of vehicles it is possible to achieve superior delivery rates with low relative overhead compared to other routing protocols.
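The radius of gyration described in the snippet above can be sketched in a few lines of Python. The snippet does not reproduce its Equation 5, so this sketch assumes the usual root-mean-square form, r_g = sqrt((1/n) Σ (r_i − r_cm)²), with r_cm taken as the plain mean of the latitude and longitude readings. Measuring each distance with the haversine formula is a further assumption, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def radius_of_gyration_km(readings):
    """Root-mean-square distance of the readings from their center of mass.

    The center of mass is the arithmetic mean of latitudes and longitudes,
    which is only reasonable for readings confined to a small region that
    does not straddle the antimeridian or a pole.
    """
    if not readings:
        raise ValueError("at least one reading is required")
    n = len(readings)
    center = (
        sum(lat for lat, _ in readings) / n,
        sum(lon for _, lon in readings) / n,
    )
    mean_squared = sum(haversine_km(r, center) ** 2 for r in readings) / n
    return math.sqrt(mean_squared)


if __name__ == "__main__":
    # Two readings on the equator, one degree either side of the prime meridian.
    print(radius_of_gyration_km([(0.0, -1.0), (0.0, 1.0)]))
```

A small value indicates an entity that stays near one place, while a large value indicates wide-ranging movement.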
... Mittal et al., 2019), to the representational (the uneven data 'shadows', potentially biased representation, and overall lack of socio-demographic information; Longley et al., 2015), to the ethical (Boyd and Crawford, 2012; Zook, 2017). These intrinsic limitations have been common challenges in human mobility analysis (Cheng et al., 2011; Noulas et al., 2012) and need to be taken into account when processing and interpreting such data. Different efforts have been made to partially address these challenges, for example, through aggregating mobility patterns or inferring larger patterns based on statistical models (e.g., Alexander et al., 2015; Phithakkitnukoon et al., 2010; Ye et al., 2009), through contextualizing data with nearby land use information and points of interest (POIs) (Horozov et al., 2006), and through specifically evaluating the bias in these types of data (e.g., Longley et al., 2015; Luo et al., 2016). ...
Article
Full-text available
Analyses of urban spaces have often stressed the importance of both the density and diversity of the people they attract. However, the diversity of people is a challenging concept to operationalize within the context of urban spaces, which is why many evaluations of urban space have relied primarily on density-based measures. We argue that a focus on only one of the two aspects misses important aspects of the variety of urban spaces in our cities. To address this, we design a methodology that evaluates both the density and diversity of human behavior in urban spaces based on geosocial media data. We operationalize density as the frequency of tweets from visitors to a particular location and diversity as the variety of the home neighborhoods of those visitors. Taking Singapore as a test case, we identify networks between the home neighborhoods of 28k Twitter users based on 2.2 million geolocated tweets collected between 2012 and 2016. Based on these data, we categorize the urban landscape of Singapore into four “performance” categories, namely High-Density/High-Diversity, High-Density/Low-Diversity, Low-Density/High-Diversity, and Low-Density/Low-Diversity. Our findings illustrate that this combined indicator provides useful nuance compared to differentiation between well and less performing spaces based on density alone. By enabling a categorization of urban spaces that fits closer to the diversity of human behavior in these spaces, human mobility data sets, such as the social media data we use, open the door to a practical evaluation of the design and planning of our heterogeneous urban environment.
... found that sentiment expressed in tweets posted around 78 census areas of London correlated highly with community socioeconomic well being, as measured by the Index of Multiple Deprivation (i.e., qualitative study of deprived areas in the UK local councils). In another study they found that happy places tend to interact with other happy places, although other indicators such as demographic data and human mobility were not used in their research (Cheng et al. 2011). ...
Article
The social connections people form online affect the quality of information they receive and their online experience. Although a host of socioeconomic and cognitive factors were implicated in the formation of offline social ties, few of them have been empirically validated, particularly in an online setting. In this study, we analyze a large corpus of geo-referenced messages, or tweets, posted by social media users from a major US metropolitan area. We linked these tweets to US Census data through their locations. This allowed us to measure emotions expressed in the tweets posted from an area, the structure of social connections, and also use that area's socioeconomic characteristics in analysis. We find that at an aggregate level, places where social media users engage more deeply with less diverse social contacts are those where they express more negative emotions, like sadness and anger. Demographics also have an impact: these places have residents with lower household income and education levels. Conversely, places where people engage less frequently but with diverse contacts have happier, more positive messages posted from them and also have better educated, younger, more affluent residents. Results suggest that cognitive factors and offline characteristics affect the quality of online interactions. Our work highlights the value of linking social media data to traditional data sources, such as US Census, to drive novel analysis of online behavior.
... Previous studies were not able to explore any differences due to a lack of data. While some studies have shown that mobility is correlated to social status (Cheng et al. 2011) and community well-being (Lathia, Quercia, and Crowcroft 2012) measured at city and neighborhood levels, they are based on inferred attributes where the ground truth is not firmly established. ...
Article
High accuracy location data are routinely available to a plethora of mobile apps and web services. The availability of such data has led to a better general understanding of human mobility. However, as location data are usually not associated with demographic information, little work has been done to understand the differences in human mobility across demographics. In this study we begin to fill the void. In particular, we explore how the growing number of geotagged footprints that social network users create can reveal demographic attributes and how these footprints enable the understanding of mobility at a demographic level. Our methodology gives rise to novel opportunities in the study of mobility. We leverage publicly available geotagged photographs from a popular photosharing network to build a dataset on demographic mobility patterns. Our analysis of this dataset not only reproduces previous results on mobility behavior at various geographical levels but further extends the existing picture: it allows for the refinement of mobility modeling from entire populations to specific demographic groups. Our analysis suggests the existence of regional variations in mobility and reveals statistically significant differences in mobility between genders and ethnicities.
... Yet these studies were clearly restricted in scale: US-dollar bills, for instance, can only be used in the United States, and "it remains unclear whether the observed properties are specific to the US or whether they represent universal features" (Brockmann and Theis 2008: 33). Two studies (Cheng et al. 2011; Noulas et al. 2012) have indeed shown that logins to location sharing ... Whether planet-scale human activity in general also follows a Lévy flight has thus not been thoroughly examined to date. Overall, the global sphere has largely been omitted by this natural-scientific strand of research, at least when it comes to human mobility. ...
Book
Increasingly, people travel and communicate across borders. Yet, we still know little about the overall structure of this transnational world. Is it really a fully globalized world in which everything is linked, as popular catchphrases like “global village” suggest? Through a sweeping comparative analysis of eight types of mobility and communication among countries worldwide—from migration and tourism to Facebook friendships and phone calls—Mapping the Transnational World demonstrates that our behavior is actually regionalized, not globalized. Emanuel Deutschmann shows that transnational activity within world regions is not so much the outcome of political, cultural, or economic factors, but is driven primarily by geographic distance. He explains that the spatial structure of transnational human activity follows a simple mathematical function, the power law, a pattern that also fits the movements of many other animal species on the planet. Moreover, this pattern remained extremely stable during the five decades studied—1960 to 2010. Unveiling proximity-induced regionalism as a major feature of planet-scale networks of transnational human activity, Deutschmann provides a crucial corrective to several fields of research. Revealing why a truly global society is unlikely to emerge, Mapping the Transnational World highlights the essential role of interaction beyond borders on a planet that remains spatially fragmented. "Mapping the Transnational World offers a large-scale look at various human connections spanning national borders. I appreciate the breadth of coverage: the description of regionalization and globalization across eight types of human activity over five decades is a big contribution all on its own. 
The use of network-analytic techniques to model these cross-border connections is impressive."—Jason Beckfield, Harvard University "This spectacularly ambitious and potentially paradigm-shifting work builds what is indeed one of the first systematic attempts to document, analyze, and explain the totality of migrations on a planetary scale. Mapping the Transnational World is a big empirical step forward for the study of international migration and mobilities."—Adrian Favell, University of Leeds
Article
In the era of information overload, location-based social software has gained widespread popularity, and the demand for personalized POI (Point of Interest) recommendation services is growing rapidly. Recommending the next POI is crucial in recommendation systems, aiming to suggest appropriate next-visit locations based on users’ historical trajectories and check-in data. However, the existing research often neglects user preferences’ diversity and dynamic nature and the need for the deep modeling of key collaborative relationships across various dimensions. As a result, the recommendation performance is limited. To address these challenges, this paper introduces an innovative Multi-View Contrastive Fusion Hypergraph Learning Model (MVHGAT). The model first constructs three distinct hypergraphs, representing interaction, trajectory, and geographical location, capturing the complex relationships and high-order dependencies between users and POIs from different perspectives. Subsequently, a targeted hypergraph convolutional network is designed for aggregation and propagation, learning the latent factors within each view. Through multi-view weighted contrastive learning, the model uncovers key collaborative effects between views, enhancing both user and POI representations’ consistency and discriminative power. The experimental results demonstrate that MVHGAT significantly outperforms several state-of-the-art methods across three public datasets, effectively addressing issues such as data sparsity and oversmoothing. This model provides new insights and solutions for the next POI recommendation task.
Article
This study examines the routine activity theory’s capable guardianship concept using contemporary GIS data. Leveraging high-frequency geotracked mobile phone data from Advan, which provides ambient population counts for census block groups every 2 hr, the research tests the impact of residents’ presence on burglary frequency at different times of the day and week. Through a series of negative binomial regression and GS2SLS models, significant variations in deterrence provided by such capable guardianship are revealed across different times of the day, as well as between weekdays and weekends. Highlighting the novelty and utility of mobile phone tracking data, the research offers insights into the temporal dynamics of residential burglary, informing targeted crime prevention strategies, enhancing community safety, and advancing criminological theory.
Conference Paper
Location-Based Social Networks (LBSNs) are useful for understanding urban behavior, offering valuable data on user preferences. Modeling these data as graphs, such as Interest Networks, yields relevant insights. These networks can be useful for, for example, recommending urban areas, forecasting mobility, and formulating public policy. This study compares interest networks from two distinct LBSNs, Foursquare and Google Places, using check-in data and venue reviews. Although the LBSNs studied differ in nature, with data differing in regularity and purpose, both modeled interest networks revealed similar patterns of urban behavior. Socioeconomic and geographic factors also showed a similar impact on the interest networks studied.
Article
Purpose: The study investigated the relationship between digital supply chain (DSC) and sustainable supply chain performance (SSCP) of small and medium-sized enterprises (SMEs) through the lens of supply chain integration (SCI) and information sharing (IS). It concentrates on the mediating role of SCI and IS in the link between DSC and SSCP, which previous research has not addressed. Design/methodology/approach: This research examines how the DSC affects the performance of the organization and the supply chain. A quantitative methodology was employed, with data gathered through a carefully designed questionnaire. The targeted respondents were senior- and middle-level managers. The primary survey yielded 467 valid responses. The data were analyzed using partial least squares structural equation modeling (PLS-SEM). Findings: The findings imply that SCI’s function in the information-sharing process is crucial, as it fosters cooperation, coordination and connectivity throughout the DSC. Furthermore, the study’s conclusions offer helpful information on how businesses might enhance supply chain performance through information exchange. Businesses are constantly concentrating on the role that the DSC plays as a catalyst for sustainable growth and are improving supply chain performance through SCI and information exchange. Originality/value: This study highlights the gaps and unexplored themes in the existing literature, catalogs the DSC research published in the main logistics journals and helps people recognize and appreciate this kind of work. It also has the potential to contribute to future research on SSCP. Moreover, the novelty of the research is further reinforced by its coverage of the newfound mechanism, in which SCI and IS mediate the relationship between DSC and SSCP, directly and positively enhancing SSCP.
Article
Spatial interaction research is particularly important for geographical analyses, as it plays a crucial role in extracting travel patterns. However, previous studies on spatial interactions have not adequately considered regional population variations over time, resulting in insufficiently precise travel predictions. Moreover, the threshold of spatial correlations is difficult to determine. Existing studies have assumed fully connected spatial correlation matrices, which is not realistic. To address these limitations, we proposed the Self-paced Gaussian-Based Graph Convolutional Network (SG-GCN) to automatically estimate the threshold of spatial correlations for travel flow predictions. It incorporates a temporal dimension into spatial relationship matrices to enhance the accuracy of vehicle flow predictions. In particular, Gaussian-based GCN identifies patterns in a time series of regional flows, enabling more precise capturing of spatial relationships while fusing node and edge features. Building on this model, self-paced contrastive learning automatically sets thresholds to determine the presence or absence of spatial relationships. The model's performance was verified through two empirical case studies conducted in New York City, USA, and Ningbo, China, using 2.8 million bicycle-sharing records and 1.25 million taxi trip records, respectively. The proposed model helps delineate mobility patterns in cities of varying scales and with different modes of transportation.
Article
Point-of-interest (POI) recommendation has gained significant traction recently due to the rising trend of location-based networks. Traditional approaches rely on a centralized collection of user data. Concerning privacy protection, decentralized federated learning employs model training on each user’s device with nearby collaborative training techniques. However, existing decentralized federated recommendations suffer from two major problems: (1) Privacy risks: existing approaches expose geographical location or co-rated items information when constructing user neighborhoods. (2) Performance limitations: existing approaches adopt a simple model without incorporating auxiliary information. To solve these, we propose CA-PDBPR (category-aware privacy preserving POI recommendation using decentralized Bayesian personalized ranking) to address the above challenges. Specifically, we introduce a novel privacy-enhanced neighborhood creation method utilizing POI category preferences to calculate decentralized user similarity through secret sharing technology, ensuring a higher level of privacy. Moreover, we integrate POI category information with a refined Bayesian personalized ranking (BPR) loss function to enhance recommendation performance. Experimental evaluations conducted on real-world datasets validate the effectiveness of the CA-PDBPR model, demonstrating enhanced recommendation quality while minimizing data exposure compared with state-of-the-art alternatives.
Article
This study aims to examine, period by period, the technological developments that have found application in tourism since the early years of the Republic of Türkiye. To this end, data were collected through a secondary-source review, a qualitative research method; the inability to access all secondary sources limited the scope of the research. Over the years, the elevator, cash register, telegraph, telephone, microphone, robots, augmented reality, extended reality, virtual reality applications, agency software, navigation applications, blockchain, big data, artificial intelligence, the metaverse, and three-dimensional food presentations have become part of service processes in tourism. In line with the technologies examined in this research, the technological developments used in creating and delivering services in various periods were found to play a role in customer satisfaction and dissatisfaction.
Article
In this paper, we propose a sentiment analysis of Twitter data focused on the attitudes and sentiments of Polish migrants and stayers during the pandemic. We collected 9 million tweets and retweets between January and August 2021, and analysed them using MultiEmo, the multilingual, multilevel, multi-domain sentiment analysis corpus. We discovered that the sentiment of tweets differs between migrants and stayers over time, and it relates to the country of migration. The general sentiment is similar for migrants and stayers, but a more detailed analysis reveals that hashtags related to staying safe and staying at home, as well as vaccinations, are more polarised for migrants than for stayers, and they reflect the general development trend of the pandemic in Europe. In addition to comparing migrants with stayers, we also compared migrants staying in different countries. Amongst the countries of migration for which we collected at least 3000 tweets, the most positive sentiment of Polish migrants’ tweets was observed in Belgium, with the most negative sentiment coming from Estonia. We also observed that the sentiment of tweets written in Polish by stayers in Poland is less negative when compared to Polish migrants in most of the countries with the highest number of tweets.
Article
The primary objective of this study is to discover traffic exposure variables from some new data sources and explore how these new data sources and their combination affect the performance of zone-level crash models. Seven types of check-in activities and five types of taxi trips are inferred from Twitter and taxi GPS records, respectively. Then, Bayesian spatial models are employed to conduct zone-level traffic crash analysis. The results suggest that some specific check-in activities and inferred taxi trips are closely related to zone-level crash counts, thereby confirming the benefits of incorporating new data sources into zone-level crash models. The comparative analyses further indicate that Twitter check-in activities perform better than inferred taxi trips as a proxy for traffic exposure in spatial analyses of traffic crashes, and that the detailed trip purpose information hidden in new data sources benefits zone-level crash models more than simply aggregating location points in each zone. The results of this study reveal that each big data source has its own prominent coverage of user groups and spatial areas, and that their combination can serve as effective supplementary information to traditional exposure variables to improve the performance of zone-level crash models and better reveal the spatial impacts of human activities on traffic crashes. The findings of this study can help transportation authorities develop more targeted traffic demand adjustment strategies to effectively reduce zone-level crash risks.
Article
Transport data are important for understanding human mobility and urban interactions within a city. As China’s transportation infrastructure continues to grow, more research is needed to analyse the spatial patterns of travel flows and to understand how these patterns change over time. With the development of online car-hailing and ride sharing services, floating car data have become a new resource to facilitate the analysis of human mobility patterns and the interactions of urban mobility within a city. The detection of urban communities based on urban networks is a helpful way to represent urban interactions. However, understanding community changes using online car-hailing data remains an underexplored topic. To this end, this study applies a community detection method to explore community changes over time based on the newly available floating car data (DiDi Chuxing (‘DiDi’)) in Chengdu, China. We applied undirected graphs to examine the spatial distribution of DiDi usage and the spatial patterns of travel distance. In addition, we explored the spatial-temporal variations of the communities at the taxi zone level using Blondel’s iterative algorithm, a modularity optimization approach. Results suggest that: 1) taxi zones on the south and west sides of Chengdu have more average daily trips compared to those in other areas; 2) residential taxi zones in the northeast area have a long median travel distance, indicating people living in those areas travel longer distances; and 3) the detected community structures change at different times. These findings provide valuable information for urban planning and location-based services in Chengdu.
Conference Paper
This study explores the potential of utilizing GPS data for Activity-Based Models, under the condition that no additional information, such as travel diaries, is required. To extract activity details, we first developed a Time-Spatial Centroid-based Clustering algorithm to identify activity locations and times. Then a Home Detection algorithm was used in combination with two APIs from Google Maps, namely Nearby Search and Place Details, to label each identified activity as either ‘home’ or a specific activity type. Next, a Markov Chain Multinomial Logit Choice model was developed for the extracted activities that models the sequential relationship between consecutive activities. The approach was applied to a GPS dataset collected in Japan in 2020. The estimated parameters revealed how background factors, such as activity time and the person's age, and the previous activity are associated with the current activity. Thus, GPS data alone can provide certain knowledge about activity-travel, which could benefit travel demand forecasting practice.
Article
Social media data has frequently sourced research on topics such as traveller planning or the factors that influence travel decisions. The literature on the location of tourist activities, however, is scarce. The studies in this line that do exist focus mainly on identifying points of interest and rarely on the urban areas that attract tourists. Specifically, as acknowledged in the literature, tourist attractions produce major imbalances with respect to adjacent urban areas. The present study aims to fill this research gap by addressing a twofold objective. The first was to design a methodology to identify the preferred tourist areas based on concentrations of places and activities. The tourist area was delimited using Instasights heatmap information, and the areas of interest were identified by linking data from the location-based social network Foursquare to TripAdvisor’s database. The second objective was to delimit areas of interest based on users’ existing urban dynamics. The method provides a thorough understanding of functional diversity and the location of a city’s different functions. In this way, it contributes to a better understanding of the spatial distribution imbalances of tourist activities. Tourist areas of interest were revealed via the identification of users’ preferences and experiences. A novel methodology was thus created that can be used in the design of future tourism strategies or, indeed, in urban planning. The city of Bucharest, Romania, was taken as a case study to develop this exploratory research.
Article
Social Media have increasingly provided data about the movement of people in cities making them useful in understanding the daily life of people in different geographies. Particularly useful for travel analysis is when Social Media users allow (voluntarily or not) tracing their movement using geotagged information of their communication with these online platforms. In this paper we use geotagged tweets from 10 cities in the European Union and United States of America to extract spatiotemporal patterns, study differences and commonalities among these cities, and explore the nature of user location recurrence. The analysis here shows the distinction between residents and tourists is fundamental for the development of city-wide models. Identification of repeated rates of location (recurrence) can be used to define activity spaces. Differences and similarities across different geographies emerge from this analysis in terms of local distributions but also in terms of the worldwide reach among the cities explored here. The comparison of the temporal signature between geotagged and non-geotagged tweets also shows similar temporal distributions that capture in essence city rhythms of tweets and activity spaces.
Chapter
The popularity of mobile phones and the rapid development of location-based social networks (LBSN) provide rich data support for position recommendation. This paper presents a study of timeliness position recommendation by mining the time cycle of human geographical and social activities, and proposes to re-divide the geographical circle and social circle according to the time state. The geographical factors and social factors are quantified according to the spatial aggregation effect on users’ check-ins and the users’ behaviors. Finally, the recommendation model GSTS (Geographical, Social, Temporal, Spatial) is proposed, which integrates multiple geographical factors and multiple social factors by a matrix decomposition technique. The experimental results show that the proposed model can achieve the best recommendation performance.
Article
The objective of this study is to mine and analyze large-scale social media data (rich spatio-temporal data unlike traditional surveys) and develop comparative infographics of emerging transportation trends and mobility indicators by adopting natural language processing and data-driven techniques. As such, first, around 13 million tweets for about 20 days (16 December 2019–4 January 2020) from North America were collected, and tweets closely aligned with emerging transportation and mobility trends (such as shared mobility, vehicle technology, built environment, user fees, telecommuting, and e-commerce) were identified. Data analytics captured spatio-temporal differences in social media user interactions and concerns about such trends, as well as topics of discussions formed through such interactions. California, Florida, Georgia, Illinois, and New York are among the most visible locations discussing such trends. Views were positive overall: people were more positive about shared mobility, vehicle technology, telecommuting, and e-commerce, and more negative about user fees and the built environment. Ride-hailing, fuel efficiency, trip navigation, daily as well as shopping and recreational activities, gas price, tax, and product delivery were among the emergent topics. The social media data-driven framework would allow real-time monitoring of transportation trends by agencies, researchers, and professionals.
Article
In light of the growing number of user privacy violations in centralized social networks, the need to define effective platforms for decentralized online social networks (DOSNs) is deeply felt. Interesting solutions have been proposed in the past, which provide the necessary mechanisms to allow users to keep control over their personal information and set the rules that regulate other users’ access. Unfortunately, the effectiveness of this type of solution is severely reduced by the fact that different user communities with a shared interest could be disconnected/separated from each other. This translates into a reduced ability to spread data of common interest effectively to all interested users, as currently happens in centralized social networks. In order to overcome this limitation, this paper proposes a disruptive approach, which exploits the availability of a new class of Internet of Things (IoT) devices with autonomous social behaviors and cognitive abilities. Such devices can be leveraged as friendship intermediaries between device owners who are connected to a DOSN platform and share the same interest. We will demonstrate that clear advantages can be achieved in terms of an increased percentage of Interested Reachable Nodes (a specific measure of Delivery Ratio) in distributed social networks among humans, when enhanced with so-called Mediator Objects adhering to the well-known social IoT (SIoT) paradigm.
Article
Predicting user activity intensity is crucial for various applications. However, existing studies have two main problems. First, as user activity intensity is nonstationary and nonlinear, traditional methods can hardly fit the nonlinear spatio-temporal relationships that characterize user mobility. Second, user movements between different areas are valuable, but have not been utilized for the construction of spatial relationships. Therefore, we propose a deep learning model, the geographical interactions-weighted graph convolutional network-gated recurrent unit (GGCN-GRU), which is good at fitting nonlinear spatio-temporal relationships and incorporates users’ geographic interactions to construct spatial relationships in the form of graphs as the input. The model consists of a graph convolutional network (GCN) and a gated recurrent unit (GRU). The GCN, which is efficient at processing graphs, extracts spatial features. These features are then input into the GRU, which extracts their temporal features. Finally, the GRU output is passed through a fully connected layer to obtain the predictions. We validated this model using a social media check-in dataset and found that the geographical interactions graph construction method performs better than the baselines. This indicates that our model is appropriate for fitting the complex nonlinear spatio-temporal relationships that characterize user mobility and helps improve prediction accuracy when considering geographic flows.
Chapter
The world of technology continues to grow, and the use of digital information is giving rise to massive datasets. To find the most popular locations around the globe, it is very challenging to find the places in chunks. Social media provides a huge dataset, and researchers can use this dataset to predict the places where individuals can establish their businesses based on their popularity. In this research work, an ensemble classification strategy has been applied to several classification methods, such as Naïve Bayes, support vector machine, and multilayer perceptron. Two evaluation methods, k-fold cross-validation and a training/test split, are used to predict the results. The prediction results show that 10-fold cross-validation gives the highest accuracy compared to the training/test split. The ensemble method improves the overall efficiency in location prediction.
Article
Focusing on the diversified demands of location privacy in mobile social networks (MSNs), we propose a privacy-enhancing k-nearest neighbors search scheme over MSNs. First, we construct a dual-server architecture that incorporates location privacy and fine-grained access control. Under the above architecture, we design a lightweight location encryption algorithm to achieve a minimal cost to the user. We also propose a location re-encryption protocol and an encrypted location search protocol based on secure multi-party computation and homomorphic encryption mechanism, which achieve accurate and secure k-nearest friends retrieval. Moreover, to satisfy fine-grained access control requirements, we propose a dynamic friends management mechanism based on public-key broadcast encryption. It enables users to grant/revoke others’ search right without updating their friends’ keys, realizing constant-time authentication. Security analysis shows that the proposed scheme satisfies adaptive L-semantic security and revocation security under a random oracle model. In terms of performance, compared with the related works with single server architecture, the proposed scheme reduces the leakage of the location information, search pattern and the user–server communication cost. Our results show that a decentralized and end-to-end encrypted k-nearest neighbors search over MSNs is not only possible in theory, but also feasible in real-world MSNs collaboration deployment with resource-constrained mobile devices and highly iterative location update demands.
Article
We describe an “Urban Observatory” facility designed for the study of complex urban systems via persistent, synoptic, and granular imaging of dynamical processes in cities. An initial deployment of the facility has been demonstrated in New York City and consists of a suite of imaging systems—both broadband and hyperspectral—sensitive to wavelengths from the visible (∼400 nm) to the infrared (∼13 micron) operating at cadences of ∼0.01–30 Hz (characteristically ∼0.1 Hz). Much like an astronomical survey, the facility generates a large imaging catalog from which we have extracted observables (e.g., time-dependent brightnesses, spectra, temperatures, chemical species, etc.), collecting them in a parallel source catalog. We have demonstrated that, in addition to the urban science of cities as systems, these data are applicable to a myriad of domain-specific scientific inquiries related to urban functioning including energy consumption and end use, environmental impacts of cities, and patterns of life and public health. We show that an Urban Observatory facility of this type has the potential to improve both a city’s operations and the quality of life of its inhabitants.
Conference Paper
We propose and evaluate a probabilistic framework for estimating a Twitter user's city-level location based purely on the content of the user's tweets, even in the absence of any other geospatial cues. By augmenting the massive human-powered sensing capabilities of Twitter and related microblogging services with content-derived location information, this framework can overcome the sparsity of geo-enabled features in these services and enable new location-based personalized information services, the targeting of regional advertisements, and so on. Three of the key features of the proposed approach are: (i) its reliance purely on tweet content, meaning no need for user IP information, private login information, or external knowledge bases; (ii) a classification component for automatically identifying words in tweets with a strong local geo-scope; and (iii) a lattice-based neighborhood smoothing model for refining a user's location estimate. The system estimates k possible locations for each user in descending order of confidence. On average we find that the location estimates converge quickly (needing just 100s of tweets), placing 51% of Twitter users within 100 miles of their actual location.
Conference Paper
There have been many location sharing systems developed over the past two decades, and only recently have they started to be adopted by consumers. In this paper, we present the results of three studies focusing on the foursquare check-in system. We conducted interviews and two surveys to understand, both qualitatively and quantitatively, how and why people use location sharing applications, as well as how they manage their privacy. We also document surprising uses of foursquare, and discuss implications for design of mobile social services.
Conference Paper
Little research exists on one of the most common, oldest, and most utilized forms of online social geographic information: the 'location' field found in most virtual community user profiles. We performed the first in-depth study of user behavior with regard to the location field in Twitter user profiles. We found that 34% of users did not provide real location information, frequently incorporating fake locations or sarcastic comments that can fool traditional geographic information tools. When users did input their location, they almost never specified it at a scale any more detailed than their city. In order to determine whether or not natural user behaviors have a real effect on the 'locatability' of users, we performed a simple machine learning experiment to determine whether we can identify a user's location by only looking at what that user tweets. We found that a user's country and state can in fact be determined easily with decent accuracy, indicating that users implicitly reveal location information, with or without realizing it. Implications for location-based services and privacy are discussed.
Conference Paper
This paper examines tweets about two geographically local events, a shooting and a building collapse, that took place in Wichita, Kansas and Atlanta, Georgia, respectively. Most Internet research has focused on examining ways the Internet can connect people across long distances, yet there are benefits to being connected to others who are nearby. People in close geographic proximity can provide real-time information and eyewitness updates for one another about events of local interest. We first show a relationship between structural properties in the Twitter network and geographic properties in the physical world. We then describe the role of mainstream news in disseminating local information. Last, we present a poll of 164 users' information-seeking practices. We conclude with practical and theoretical implications for sharing information in local communities.
Conference Paper
Geography and social relationships are inextricably intertwined; the people we interact with on a daily basis almost always live near us. As people spend more time online, data regarding these two dimensions -- geography and social relationships -- are becoming increasingly precise, allowing us to build reliable models to describe their interaction. These models have important implications in the design of location-based services, security intrusion detection, and social media supporting local communities. Using user-supplied address data and the network of associations between members of the Facebook social network, we can directly observe and measure the relationship between geography and friendship. Using these measurements, we introduce an algorithm that predicts the location of an individual from a sparse set of located users with performance that exceeds IP-based geolocation. This algorithm is efficient and scalable, and could be run on a network containing hundreds of millions of users.
Conference Paper
The Natural Language Toolkit is a suite of program modules, data sets and tutorials supporting research and teaching in computational linguistics and natural language processing. NLTK is written in Python and distributed under the GPL open source license. Over the past year the toolkit has been rewritten, simplifying many linguistic data structures and taking advantage of recent enhancements in the Python language. This paper reports on the simplified toolkit and explains how it is used in teaching NLP.
Article
A range of applications, from predicting the spread of human and electronic viruses to city planning and resource management in mobile communications, depend on our ability to foresee the whereabouts and mobility of individuals, raising a fundamental question: To what degree is human behavior predictable? Here we explore the limits of predictability in human dynamics by studying the mobility patterns of anonymized mobile phone users. By measuring the entropy of each individual’s trajectory, we find a 93% potential predictability in user mobility across the whole user base. Despite the significant differences in the travel patterns, we find a remarkable lack of variability in predictability, which is largely independent of the distance users cover on a regular basis.
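As a rough illustration of what "measuring the entropy of a trajectory" can mean, the sketch below computes two simple entropy measures over a sequence of visited locations. It is a minimal sketch under stated assumptions: the abstract does not say which estimator the authors used, the function names and the toy trajectory are hypothetical, and both measures ignore the order of visits, so this does not reproduce the reported 93% figure.

```python
import math
from collections import Counter


def random_entropy(trajectory):
    """Entropy if every distinct location were equally likely: log2(N)."""
    return math.log2(len(set(trajectory)))


def uncorrelated_entropy(trajectory):
    """Shannon entropy of the visit frequencies, ignoring visit order."""
    counts = Counter(trajectory)
    total = len(trajectory)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical trajectory: one location label per observation.
trajectory = ["home", "work", "home", "work", "home", "gym", "home", "work"]
print("random entropy (bits):      ", random_entropy(trajectory))
print("uncorrelated entropy (bits):", uncorrelated_entropy(trajectory))
```

A lower entropy means the visit distribution is more concentrated on a few locations. An order-aware estimate would additionally need to account for the sequence in which locations are visited.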
Article
The dynamic spatial redistribution of individuals is a key driving force of various spatiotemporal phenomena on geographical scales. It can synchronize populations of interacting species, stabilize them, and diversify gene pools. Human travel, for example, is responsible for the geographical spread of human infectious disease. In the light of increasing international trade, intensified human mobility and the imminent threat of an influenza A epidemic, the knowledge of dynamical and statistical properties of human travel is of fundamental importance. Despite its crucial role, a quantitative assessment of these properties on geographical scales remains elusive, and the assumption that humans disperse diffusively still prevails in models. Here we report on a solid and quantitative assessment of human travelling statistics by analysing the circulation of bank notes in the United States. Using a comprehensive data set of over a million individual displacements, we find that dispersal is anomalous in two ways. First, the distribution of travelling distances decays as a power law, indicating that trajectories of bank notes are reminiscent of scale-free random walks known as Lévy flights. Second, the probability of remaining in a small, spatially confined region for a time T is dominated by algebraically long tails that attenuate the superdiffusive spread. We show that human travelling behaviour can be described mathematically on many spatiotemporal scales by a two-parameter continuous-time random walk model to a surprising accuracy, and conclude that human travel on geographical scales is an ambivalent and effectively superdiffusive process.
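The claim that travelling distances decay as a power law can be checked on a set of displacements by estimating the tail exponent. The sketch below uses the standard maximum-likelihood estimator for a continuous power law, alpha = 1 + n / sum(ln(d / d_min)). This is an assumption for illustration only: the abstract does not state how the authors fitted their data, and the function names, the cutoff `d_min`, and the synthetic exponent of 1.6 are all hypothetical choices.

```python
import math
import random


def powerlaw_mle_exponent(distances, d_min):
    """MLE of alpha for p(d) ~ d^(-alpha), using only distances >= d_min."""
    tail = [d for d in distances if d >= d_min]
    log_sum = sum(math.log(d / d_min) for d in tail)
    if log_sum == 0.0:
        raise ValueError("need at least one distance strictly above d_min")
    return 1.0 + len(tail) / log_sum


def sample_power_law(n, alpha, d_min, rng):
    """Draw n synthetic distances by inverse-transform sampling."""
    # 1 - rng.random() lies in (0, 1], so the power is always defined.
    return [d_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]


# Synthetic displacements with an arbitrary exponent, for demonstration only.
rng = random.Random(0)
distances = sample_power_law(20000, alpha=1.6, d_min=1.0, rng=rng)
print("estimated exponent:", powerlaw_mle_exponent(distances, d_min=1.0))
```

On real displacement data the choice of `d_min` matters, and a truncated or otherwise heavy-tailed distribution can look similar over a limited range, so an estimate like this is a starting point rather than evidence of a Lévy flight by itself.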
Conference Paper
We report that human walks performed in outdoor settings of tens of kilometers resemble a truncated form of Levy walks commonly observed in animals such as monkeys, birds and jackals. Our study is based on about one thousand hours of GPS traces involving 44 volunteers in various outdoor settings including two different college campuses, a metropolitan area, a theme park and a state fair. This paper shows that many statistical features of human walks follow a truncated power law, showing evidence of scale-freedom, and do not conform to the central limit theorem. These traits are similar to those of Levy walks. It is conjectured that the truncation, which makes the mobility deviate from pure Levy walks, comes from geographical constraints including walk boundaries, physical obstructions and traffic. None of the commonly used mobility models for mobile networks captures these properties. Based on these findings, we construct a simple Levy walk mobility model that is versatile enough to emulate the diverse statistical patterns of human walks observed in our traces. The model is also used to recreate similar power-law inter-contact time distributions observed in previous human mobility studies. Our network simulation indicates that the Levy walk features are important in characterizing the performance of mobile network routing.
Article
An optimal search theory, the so-called Lévy-flight foraging hypothesis, predicts that predators should adopt search strategies known as Lévy flights where prey is sparse and distributed unpredictably, but that Brownian movement is sufficiently efficient for locating abundant prey. Empirical studies have generated controversy because the accuracy of statistical methods that have been used to identify Lévy behaviour has recently been questioned. Consequently, whether foragers exhibit Lévy flights in the wild remains unclear. Crucially, moreover, it has not been tested whether observed movement patterns across natural landscapes having different expected resource distributions conform to the theory’s central predictions. Here we use maximum-likelihood methods to test for Lévy patterns in relation to environmental gradients in the largest animal movement data set assembled for this purpose. Strong support was found for Lévy search patterns across 14 species of open-ocean predatory fish (sharks, tuna, billfish and ocean sunfish), with some individuals switching between Lévy and Brownian movement as they traversed different habitat types. We tested the spatial occurrence of these two principal patterns and found Lévy behaviour to be associated with less productive waters (sparser prey) and Brownian movements to be associated with productive shelf or convergence-front habitats (abundant prey). These results are consistent with the Lévy-flight foraging hypothesis, supporting the contention that organism search strategies naturally evolved in such a way that they exploit optimal Lévy patterns.
Conference Paper
The increasing availability of GPS-enabled devices is changing the way people interact with the Web, and brings us a large amount of GPS trajectories representing people's location histories. In this paper, based on multiple users' GPS trajectories, we aim to mine interesting locations and classical travel sequences in a given geospatial region. Here, interesting locations are culturally important places, such as Tiananmen Square in Beijing, and frequented public areas, such as shopping malls and restaurants. Such information can help users understand surrounding locations, and would enable travel recommendation. In this work, we first model multiple individuals' location histories with a tree-based hierarchical graph (TBHG). Second, based on the TBHG, we propose a HITS (Hypertext Induced Topic Search)-based inference model, which regards an individual's access to a location as a directed link from the user to that location. This model infers the interest of a location by taking into account the following three factors. 1) The interest of a location depends not only on the number of users visiting this location but also on these users' travel experiences. 2) Users' travel experiences and location interests have a mutual reinforcement relationship. 3) The interest of a location and the travel experience of a user are relative values and are region-related. Third, we mine the classical travel sequences among locations considering the interests of these locations and users' travel experiences. We evaluated our system using a large GPS dataset collected by 107 users over a period of one year in the real world. As a result, our HITS-based inference model outperformed baseline approaches such as rank-by-count and rank-by-frequency. Meanwhile, when considering the users' travel experiences and location interests, we outperformed baselines such as rank-by-count and rank-by-interest.
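The mutual reinforcement between users' travel experience and location interest can be sketched with a plain HITS-style iteration, where users play the role of hubs and locations the role of authorities. This is a simplified sketch under stated assumptions, not the authors' model: it ignores the tree-based hierarchical graph and the region-related scoring described above, and the function names and toy visit list are hypothetical.

```python
import math


def _normalize(scores):
    """Scale scores to unit Euclidean length (left unchanged if all zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    if norm == 0.0:
        return dict(scores)
    return {k: v / norm for k, v in scores.items()}


def hits_user_location(visits, iterations=50):
    """Return (experience per user, interest per location).

    visits: list of (user, location) pairs; repeated pairs add extra weight.
    """
    experience = {u: 1.0 for u, _ in visits}
    interest = {loc: 1.0 for _, loc in visits}
    for _ in range(iterations):
        # A location is interesting if experienced users visit it.
        new_interest = {loc: 0.0 for loc in interest}
        for user, loc in visits:
            new_interest[loc] += experience[user]
        # A user is experienced if they visit interesting locations.
        new_experience = {u: 0.0 for u in experience}
        for user, loc in visits:
            new_experience[user] += new_interest[loc]
        interest = _normalize(new_interest)
        experience = _normalize(new_experience)
    return experience, interest


# Hypothetical visit log for demonstration only.
visits = [
    ("alice", "square"), ("bob", "square"), ("carol", "square"),
    ("alice", "mall"), ("bob", "mall"),
    ("carol", "cafe"),
]
experience, interest = hits_user_location(visits)
for loc in sorted(interest, key=interest.get, reverse=True):
    print(loc, round(interest[loc], 3))
```

Unlike a rank-by-count baseline, the score of a location here depends on who visits it, not only on how many visits it receives.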
Article
Despite their importance for urban planning, traffic forecasting and the spread of biological and mobile viruses, our understanding of the basic laws governing human motion remains limited owing to the lack of tools to monitor the time-resolved location of individuals. Here we study the trajectory of 100,000 anonymized mobile phone users whose position is tracked for a six-month period. We find that, in contrast with the random trajectories predicted by the prevailing Lévy flight and random walk models, human trajectories show a high degree of temporal and spatial regularity, each individual being characterized by a time-independent characteristic travel distance and a significant probability to return to a few highly frequented locations. After correcting for differences in travel distances and the inherent anisotropy of each trajectory, the individual travel patterns collapse into a single spatial probability distribution, indicating that, despite the diversity of their travel history, humans follow simple reproducible patterns. This inherent similarity in travel patterns could impact all phenomena driven by human mobility, from epidemic prevention to emergency response, urban planning and agent-based modelling.
Words of the year 2010 (The Wall Street Journal)
  • R.-S Cholera
Cholera, R.-S. 2011. Words of the year 2010 (The Wall Street Journal). http://on.wsj.com/e7AyTt.
Cheating, and claiming mayorships from your couch
  • Foursquare
Foursquare. 2010. Cheating, and claiming mayorships from your couch. http://blog.foursquare.com/2010/04/07/503822143/.
Foursquare. 2011. So we grew 3400% last year.
Rhythms of social interaction: Messaging within a massive online network
  • S A Golder
  • D M Wilkinson
  • B A Huberman
Golder, S. A.; Wilkinson, D. M.; and Huberman, B. A. 2007. Rhythms of social interaction: Messaging within a massive online network. In Proceedings of the Third Communities and Technologies Conference.
Foursquare nearing 1 million checkins per day (Mashable)
  • J Grove
Grove, J. 2010. Foursquare nearing 1 million checkins per day (Mashable). http://mashable.com/2010/05/28/foursquarecheckins/.