Article

Flickr data for analysing tourists’ spatial behaviour and movement patterns: A comparison of clustering techniques


Abstract

Purpose The purpose of this study is to analyse the suitability of photo-sharing platforms, such as Flickr, to extract relevant knowledge on tourists’ spatial movement and point of interest (POI) visitation behaviour and compare the most prominent clustering approaches to identify POIs in various application scenarios.

Design/methodology/approach The study, first, extracts photo metadata from Flickr, such as upload time, location and user. Then, photo uploads are assigned to latent POIs by density-based spatial clustering of applications with noise (DBSCAN) and k-means clustering algorithms. Finally, association rule analysis (FP-growth algorithm) and sequential pattern mining (generalised sequential pattern algorithm) are used to identify tourists’ behavioural patterns.

Findings The approach has been demonstrated for the city of Munich, extracting 13,545 photos for the year 2015. POIs, identified by DBSCAN and k-means clustering, could be meaningfully assigned to well-known POIs. By doing so, both techniques show specific advantages for different usage scenarios. Association rule analysis revealed strong rules (support: 1.0-4.6 per cent; lift: 1.4-32.1 per cent), and sequential pattern mining identified relevant frequent visitation sequences (support: 0.6-1.7 per cent).

Research limitations/implications As a theoretic contribution, this study comparatively analyses the suitability of different clustering techniques to appropriately identify POIs based on photo upload data as an input to association rule analysis and sequential pattern mining as an alternative but also complementary techniques to analyse tourists’ spatial behaviour.

Practical implications From a practical perspective, the study highlights that big data sources, such as Flickr, show the potential to effectively substitute traditional data sources for analysing tourists’ spatial behaviour and movement patterns within a destination. Especially, the approach offers the advantage of being fully automatic and executable in a real-time environment.

Originality/value The study presents an approach to identify POIs by clustering photo uploads on social media platforms and to analyse tourists’ spatial behaviour by association rule analysis and sequential pattern mining. The study gains novel insights into the suitability of different clustering techniques to identify POIs in different application scenarios.
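The support and lift figures reported under Findings can be made concrete with a small sketch. The code below is not the FP-growth algorithm used in the study; it only computes support, confidence and lift for a single candidate rule. The function name `rule_metrics`, the POI labels and the visit sets are hypothetical and chosen purely for illustration.

```python
def rule_metrics(baskets, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    baskets: list of sets, one set of visited POIs per user (assumed layout).
    """
    n = len(baskets)
    a, c = set(antecedent), set(consequent)
    n_a = sum(1 for b in baskets if a <= b)          # users who visited the antecedent
    n_c = sum(1 for b in baskets if c <= b)          # users who visited the consequent
    n_ac = sum(1 for b in baskets if (a | c) <= b)   # users who visited both
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Hypothetical data: four users and the POIs each one visited.
visits = [
    {"POI_A", "POI_B"},
    {"POI_A", "POI_B", "POI_C"},
    {"POI_C"},
    {"POI_A"},
]
support, confidence, lift = rule_metrics(visits, ["POI_A"], ["POI_B"])
```

A lift above 1 means the two POIs are visited together more often than would be expected if visits were independent.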


... As suggested by Dolnicar (2021), in the future, market segmentation in tourism will harvest its primary strength by using web-based data, such as online search data (Fuchs et al., 2014;Höpken et al., 2015), web navigation data (Pitman et al., 2010), or online feedback data (Dietz et al., 2020) as opposed to relying on surveys or interview data. In line with this claim, Höpken et al. (2020) recently examined the suitability of different clustering techniques to identify points of interest based on uploaded photo data extracted from the photo-sharing platform Flickr. We will discuss the details of this work in more depth below. ...
... Traditional data sources, like guest surveys, visitor censuses, or on-site observations, impose a high amount of manual work and, thus, do not enable data gathering and analysis automatically and in real-time (Fuchs et al., 2014;Höpken et al., 2015;Önder et al., 2016). A study by Höpken et al. (2020) presents an approach that uses uploaded photos on the social media platform Flickr to analyze tourists' movement patterns when visiting points of interest (POIs), such as sights or attractions, in the destination city of Munich, Germany. By employing and comparatively assessing DBSCAN and k-means clustering for differing use scenarios, photo uploads on Flickr were clustered to POIs (Tan et al., 2018). ...
... First, a k-nearest neighbor distance plot showing the average distance of each point to its k nearest neighbors was employed to identify optimal DBSCAN parameter values with minPts = 3 and ε = 0.0009 (i.e., 99 m), respectively (Höpken et al., 2020). Moreover, the number k of clusters found by DBSCAN was used for k-means to guarantee comparability of the two clustering approaches. ...
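The k-nearest neighbor distance curve described in this passage can be sketched in a few lines. The example assumes plain Euclidean distance on coordinate pairs in degrees; the function name `k_distances` and the sample points are hypothetical and are not data from the study. In practice the sorted values would be plotted, and ε would be read off by eye at the bend of the curve.

```python
import math


def k_distances(points, k):
    """Sorted average distance from each point to its k nearest neighbours."""
    result = []
    for i, p in enumerate(points):
        # Distances from p to every other point, nearest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(sum(dists[:k]) / k)
    return sorted(result)


# Hypothetical coordinates in degrees: four nearby uploads and one isolated upload.
points = [(0.0, 0.0), (0.0003, 0.0), (0.0, 0.0003), (0.0003, 0.0003), (0.01, 0.01)]
curve = k_distances(points, k=3)
```

In this toy data the last value of `curve` is far larger than the others, which is the kind of jump that separates dense regions from isolated points.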
Chapter
This chapter will discuss the unsupervised machine learning technique known as clustering and its main approaches and use cases. After presenting typical application areas for the tourism industry, the mathematical principle of clustering will be explained. Various techniques for representing differences between cases or clusters will be introduced, and major methods used to form clusters based on these differences will be presented (i.e., single linkage, complete linkage, average linkage, and centroid). Subsequently, the three most widely applied clustering approaches will be described. First, major concepts of hierarchical clustering, like divisive and agglomerative techniques, will be highlighted. Second, the partitioning technique k-means will be introduced, and, third, DBSCAN (Density-Based Spatial Clustering of Applications with Noise) will be discussed. By using real tourism data and the data science platform RapidMiner, the practical demonstration will then explain step-by-step how clustering approaches can be executed. After employing typical processes for data transformation and normalization, RapidMiner processes for k-means, hierarchical clustering, and DBSCAN will be shown, and the clustering results will be discussed. Lastly, a tourism case applying k-means and DBSCAN to identify points of interest based on uploaded photo data extracted from the platform Flickr will conclude the chapter.
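As a rough illustration of the partitioning idea behind k-means, the following sketch alternates between assigning points to the nearest centroid and recomputing centroids. It is written in plain Python rather than as the RapidMiner processes the chapter demonstrates, and the function name `k_means`, the fixed starting centroids and the sample points are assumptions made for the example.

```python
import math


def k_means(points, centroids, iterations=10):
    """Basic k-means: returns (final centroids, cluster label per point)."""

    def nearest(p, centres):
        return min(range(len(centres)), key=lambda c: math.dist(p, centres[c]))

    for _ in range(iterations):
        # Assignment step: group points by their nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its members.
        centroids = [
            tuple(sum(coord) / len(members) for coord in zip(*members)) if members else old
            for members, old in zip(clusters, centroids)
        ]
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical two-dimensional points forming two obvious groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = k_means(points, centroids=[(0, 0), (10, 10)])
```

Note that k-means needs the number of clusters up front (here via the two starting centroids) and assigns every point to a cluster, so it has no notion of noise.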
... and Flickr.com has provided a new opportunity for analyzing and understanding visitors' behavior and preferences (Zhang et al. 2019; Barros et al. 2020; Höpken et al. 2020; Lee and Tsou 2018; Gallo et al. 2017; Giglio et al. 2019). Visitors can take geo-tagged photos during their visits with digital photo-capturing devices such as smart phones and tablets that have ...
... Apart from visual contents, geo-tagged photos include metadata such as GPS location, timestamp, title and tags which provide valuable insights into the spatiotemporal behavior and preferences of visitors at the visited locations. Many studies in the literature use geo-tagged photos to analyze and understand visitors' spatiotemporal behavior and preferences at tourist attractions (Zhang et al. 2019; Barros et al. 2020; Höpken et al. 2020; Lee and Tsou 2018; Gallo et al. 2017; Giglio et al. 2019), for instance in national parks (Barros et al. 2020). However, there is still no specific and comprehensive study that uses geo-tagged photos to analyze and understand visitors' spatiotemporal behavior and preferences at shopping locations that are visited by many tourists. ...
... Social photo-sharing platforms provide digital footprints of social users (i.e., visitors) that come in the format of geo-tagged photo which includes the location of where it was generated. In this context, geo-tagged photos allow the spatial and temporal dimensions of visitors' behavior to be analyzed (Barros et al. 2020;Höpken et al. 2020;Lee and Tsou 2018;Sun et al. 2013). Geo-tagged photos are used to identify and analyze hotspots and visitors' activities in various tourist attractions (Höpken et al. 2020;Lee and Tsou 2018;Sun et al. 2013). ...
Article
Understanding shopping visitors’ behavior and preferences is important for tourism and retail business. This paper introduces geo-tagged photos on a social media platform as an additional data resource to study spatiotemporal behavior and preferences of visitors at shopping locations. We propose a novel framework that uses geo-tagged photos within an urban region for identifying shopping locations and then discovering spatiotemporal behavior and preferences of visitors at the identified shopping spots. We present a case study of Los Angeles City, California, USA. The analysis results give insights into spatiotemporal behavior and preferences of shopping visitors in different groups of shopping locations such as shopping mall, plaza, market and so on. The results also reveal the preferences of shopping visitors in groups of shopping locations classified by selling products. In addition, the preferences between domestic and international shopping visitors are also uncovered. The proposed approach and findings of the case study are beneficial to tourism managers and the managers of shopping places, especially those in Los Angeles, in understanding spatiotemporal behavior and preferences of shopping visitors. This can help to improve the activities of promoting and attracting tourists to visit local shopping locations.
... Based on the research results of this study, it is observed that Instagram has a significant impact on sales, in that it increases branding and establishes loyal relationships. This increases the profit of a business; when it comes to online advertising and shopping, Instagram can help drive traffic and convert would-be Instagram followers into customers [76]. It can also maximize click-through rates, which will ensure that Instagram followers visit the online store. ...
... As a result, they are more likely to buy from you since they like your personality. Local businesses can also benefit from Instagram, as you may ask Instagram followers to share images, to check in to the location, which will increase the visibility of your business on social media networks [76]. We have described the most common flow of social networking according to Figure 2. ...
Article
Today, the public is not willing to spend much time identifying their personal needs. Therefore, it needs a system that automatically recommends customized items to customers. The Recommender system has an internet of things (IoT) that entails a subclass of evidenced-based sieving structures that pursues to forecast the assessment of a customer would stretch to an item. Within social networks, numerous categories of RS operate on different recommendation expertise. In this state-of-the-art, we describe and classify current studies from three different aspects by describing different methods of recommender systems. The Friend Recommendation System in social networks is necessary and inevitable, and it is due to this kind of coordination that inevitably recommends latent friends to customers. Making recommendations for friends is an imperative assignment for community networks, as obligating supplementary networks customarily superiors to enhanced customer experience.
... The former method was typically used to measure ST fluctuations in municipal population densities or to generate occupancy curves for establishments (D'Silva et al., 2018; Haidery et al., 2020; Toepke, 2016), while the latter method of analyzing trajectories was used to model aggregate movements of specific user populations of interest, such as those potentially infected with influenza (Allen et al., 2016; Gao et al., 2018; Marquet et al., 2006; Padmanabhan et al., 2014; Wakamiya et al., 2018). Activity analysis was performed to detect ST variability in hot-spots of specific user actions, and was frequently used to show ST patterns in tourist activity behaviours (Duan et al., 2020; Hasnat & Hasan, 2018; Hoepken et al., 2020; Jing et al., 2020; Kovacs et al., 2021; Lee & Tsou, 2018; Sun & Bakillah, 2013; L.-C. Wang, Yan, et al., 2016). ...
... Spatial clustering algorithms such as k-means and DBSCAN were used to identify variations in ST points derived from LBSMD to develop spatial trajectories and detect anomalous clusters of posting activity (Belcastro et al., 2021;Fan et al., 2021;Hoepken et al., 2020;Hu et al., 2015;Q. Huang & Wong, 2015;S. ...
Article
With billions of active users, social media platforms now generate spatial data on a massive scale, which presents researchers with opportunities to use this data in new and innovative ways. There is a body of literature within which location-based social media data have been used as a means to spatially analyze certain phenomena. Due to the low cost, high availability, and extreme ranges of content and geographic coverage, location-based social media data have the potential to play a major role in GIS research. To this end, our review maps the field of evidence as it relates to the methods used by GIS researchers to collect, spatially analyze, and cartographically visualize location-based social media data. The results are informed by 222 articles that utilize spatial social media for a wide variety of purposes and GIS-based analyses. Findings were organized according to a typical GIS workflow and are presented to give the reader an operationalized understanding of each step in the workflow. Collection strongly favoured platforms like Twitter and Flickr and, similar to the methods of analysis, was dictated either by content or by locational data attributes. Data visualization was achieved through discrete, continuous, and transmission surfaces.
... However, Instagram, the largest photo-video sharing platform, has a wealth of information about movement patterns in tourist attraction areas, which have not been deeply explored. To bridge this gap, this paper investigated tourists' movement patterns at and between POIs in the Lake Constance region using data crawled from Instagram, based on the framework from Höpken et al.'s study [12] and compares the performance of NK-MEANS with DBSCAN with regard to the geographic information clustering problem. ...
... Based on these clusters, a threshold of s(X → Y) = 0.001 was set to filter frequent items, and then meaningful association rules were identified, which are listed in Table 4. The mined association rules are distributed between 13 popular POIs shown in Fig. 6, which are centered on the lakeside promenade (1) and span the railway station (6), club house (12) and shopping center (13). These results impressively reflect users' movement trajectories among tourist attractions within Friedrichshafen. ...
Chapter
Understanding the characteristics of tourists’ movements is essential for tourism destination management. With advances in information and communication technology, more and more people are willing to upload photos and videos to various social media platforms while traveling. These openly available media data are gaining increasing attention in the field of movement pattern mining as a new data source. In this study, uploaded images and their geographic information within the Lake Constance region, Germany, were collected and, through clustering analysis, a state-of-the-art k-means with noise removal algorithm was compared with the commonly used DBSCAN on an Instagram dataset. Finally, association rules between popular attractions at region level and city level were mined respectively. Results show that social media data like Instagram constitute a valuable input to analyse tourists’ movement patterns as input to decision support and destination management.
... DBSCAN, unlike the k-means method, can detect clusters of any type of data with no limits and without specifying the clusters' number. Furthermore, because DBSCAN detects noise locations, no outlier detection is required (Höpken et al., 2020). ...
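The two properties named in this passage, no preset number of clusters and explicit noise labelling, show up directly in a minimal DBSCAN sketch. The code assumes Euclidean distance on coordinate pairs and counts a point as its own neighbour; the function name `dbscan` and the sample points are hypothetical. The parameter values mirror the minPts = 3 and ε = 0.0009 quoted earlier on this page, applied here to toy data rather than to the study's photos.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)

    def neighbours(i):
        # All points within eps of point i, including i itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE              # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster        # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:       # j is a core point, so keep expanding
                queue.extend(nbrs)
    return labels


# Hypothetical coordinates in degrees: two dense groups and one isolated upload.
points = [
    (0.0, 0.0), (0.0005, 0.0), (0.0, 0.0005),
    (0.01, 0.01), (0.0105, 0.01), (0.01, 0.0105),
    (0.05, 0.05),
]
labels = dbscan(points, eps=0.0009, min_pts=3)
```

The number of clusters falls out of the density parameters, and the isolated point is labelled as noise instead of being forced into a cluster.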
... Donaire et al., 2014). Giglio et al. (2019) collected 26,392 photos linked to six Italian cities to assess clusters surrounding points of interest (POI). This research used the Density-based Spatial Clustering of Applications with Noise (DBSCAN) technique to automatically characterise the most frequently visited geographic locations (Giglio et al., 2019). Höpken et al. (2018) also analysed the suitability of different clustering methods, including DBSCAN, to investigate tourists' spatial movement and POI visitation behaviour using 13,545 photos for the year 2015 (Höpken et al., 2020). ...
Purpose Several review articles have been published within the Artificial Intelligence (AI) literature that have explored a range of applications within the tourism and hospitality sectors. However, how efficiently the applied AI methods and algorithms have performed with respect to the type of applications and the multimodal sets of data domains have not yet been reviewed. Therefore, this paper aims to review and analyse the established AI methods in hospitality/tourism, ranging from data modelling for demand forecasting, tourism destination and behaviour pattern to enhanced customer service and experience. Design/methodology/approach The approach was to systematically review the relationship between AI methods and hospitality/tourism through a comprehensive literature review of papers published between 2010 and 2021. In total, 146 articles were identified and then critically analysed through content analysis into themes, including “AI methods” and “AI applications”. Findings The review discovered new knowledge in identifying AI methods concerning the settings and available multimodal data sets in hospitality and tourism. Moreover, AI applications fostering the tourism/hospitality industries were identified. It also proposes novel personalised AI modelling development for smart tourism platforms to precisely predict tourism choice behaviour patterns. Practical implications This review paper offers researchers and practitioners a broad understanding of the proper selection of AI methods that can potentially improve decision-making and decision-support in the tourism/hospitality industries. Originality/value This paper contributes to the tourism/hospitality literature with an interdisciplinary approach that reflects on theoretical/practical developments for data collection, data analysis and data modelling using AI-driven technology.
... Flickr users have strong motivation to upload and share photos of their destination visits and thus generate complete records of the attractions they visited (Murray, 2008). Stienmetz and Fesenmaier (2016) found a strong correlation between the spatial and temporal pattern identified by the Flickr VGI data and actual visitor trajectories, and the Flickr VGI data has become an established data source to investigate the spatial behaviour and movement patterns of tourists (Höpken, Müller, Fuchs, & Lexhagen, 2020;Stienmetz & Fesenmaier, 2019). ...
... The transitional probability from attraction A to B is set to equal the number of visitors that moved from A to B divided by the number of visitors that departed from attraction A. Thus, this novel matrix driven by big data can distinguish the visitor flow intensity from attraction A to B and B to A, which relaxes the symmetric assumption in traditional spatial matrices and is closer to the reality of visitor flows. Previous studies have demonstrated that Flickr data are suitable for detecting the spatial behaviour of visitors within a destination (Höpken et al., 2020; Stienmetz & Fesenmaier, 2016), and the sample size (n = 127,583 users) is sufficient to construct robust transition probability matrices. For robustness testing, a traditional spatial weights matrix (distance-based k nearest neighbours) is also used to verify the novel matrix implemented in this research. ...
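The transition probability defined in this passage can be sketched as a row-normalised count of consecutive moves. The example assumes each visitor's trajectory is an ordered list of attraction labels and that each visitor makes a given move at most once, so counting moves equals counting visitors; the function name `transition_probabilities` and the trajectories are hypothetical.

```python
from collections import defaultdict


def transition_probabilities(trajectories):
    """Map each origin to {destination: share of departures from that origin}."""
    counts = defaultdict(lambda: defaultdict(int))
    for path in trajectories:
        # Count every consecutive move origin -> destination.
        for origin, dest in zip(path, path[1:]):
            counts[origin][dest] += 1
    probs = {}
    for origin, dests in counts.items():
        departures = sum(dests.values())
        probs[origin] = {dest: n / departures for dest, n in dests.items()}
    return probs


# Hypothetical ordered visits, one list per visitor.
trajectories = [["A", "B", "C"], ["A", "B"], ["A", "C"], ["B", "A"]]
probs = transition_probabilities(trajectories)
```

Here the probability of moving from A to B (2/3) differs from that of moving from B to A (1/2), which is the asymmetry the passage describes.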
Article
The aim of this study is to investigate the determinants of attraction demand and shed light on the spillover effects of visitor flows between/across attractions in London using spatial econometric modelling. Both global and local models reveal that income and search queries are significant determinants to attraction demand, while distance from city centre is only significant in the global model. Visitor flow spillovers from neighbouring attractions are found to have significant effects on attraction demand. The intensity and direction of visitor flows’ spillover effects vary by attraction locations. Furthermore, asymmetric spillover effects of visitor flows between a pair of attractions have been identified for the first time in the tourism literature. The adoption of novel spatial estimation methods generates a new dimension to investigate intra-destination demand across attractions. This can provide empirical evidence for decision-makers to optimise visitor flows within a destination.
... Their business intelligence (BI) architecture integrates and combines data sources, like customers' web search and navigation as well as booking and feedback data. Within this research vein, Höpken et al. (2014, 2015) and Keil et al. (2017) proposed a multidimensional destination data warehouse which permits interactive BI-based (i.e., automated) data extraction and knowledge generation, such as decision trees estimating tourists' booking and cancellation behavior, association rules and sequential patterns predicting travellers' movement patterns and place attachment (Höpken et al. 2020a), and artificial neural networks prognosticating international arrivals (Höpken et al. 2020b). In line with the sustainability objective of smart tourism, Fuchs et al. (2013) employed OLAP technologies to identify visitors with the smallest ecological footprint. ...
... In general, tourism models from the administrative sciences seek to forecast time-varying parameters and apply them to an almost ideal demand system (Höpken et al., 2020), as well as to build prediction models for tourist destinations (Wan and Song, 2018). Social science models, on the other hand, look for relationships, with a preference for community development and for the community as an actor in tourism (Dahles et al., 2020; Mamirkulova et al., 2020). ...
Article
This article discusses theories of tourism and the elements used to objectify representations linked to rural tourism from a community perspective. The aim of the research is to establish connections between the scope of a tourism approach, a tourism model and a tourism system, and their relationship with community-based rural tourism. The analysis draws on a literature review in order to set out connections with the academic reflections in which tourism is established as an academic discipline. The results show that some postulates of tourism emphasise more quantitative aspects, while in others theories from the human and social sciences predominate. In conclusion, understanding these relationships makes it possible to outline a state of the art of knowledge production in the context of COVID-19, with the purpose of strengthening theories about rural aspects of tourism from a community perspective.
... For example, Kim et al. (2021) point out that current big data technology and the support provided by Record Data can reduce the load bearing pressure and advance preparation of scenic spots by understanding the user's preference for travel time. However, Höpken et al. (2020) point out that these original data are often processed by various clustering technologies through thresholds. In some cases, the data also involve photos and GPS tags to identify points people are interested in and learn about users' preferences. ...
Article
Given the importance of data safety for psychology, the present study investigated the influence of data leaking scandal on campus customers’ financial consumption behaviors at intelligent tourism platforms in China, and explored the roles that individual characteristics play in this process by focusing on a set of participants from colleges. Data were collected through sending out an online questionnaire, where respondents were asked to finish a series of questions about their background information, their trust, future consuming intention, and defensive behaviors toward intelligent platforms. After they finished these questions, a short description about an online tourism platform leaking customers’ personal information was presented to the respondents, following which they were asked to report about their future consuming intentions and defensive behaviors again. In total, 236 participants of college students and teachers were recruited. Paired samples mean comparison showed that after the stimulus was presented, the respondents had a significant decrease in future financial consumption intention, and a significant increase in defensive behaviors toward online tourism platforms due to risks perceived. Multiple regression analysis was conducted subsequently to investigate individual characteristics that may account for part of the decrease (increase) in consuming intention (defensive behaviors). Results showed that, customers with higher level of trust and monthly income, as well as older customers, tend to experience higher level of decrease in consuming intention, and increase in defensive behaviors. These findings highlighted the importance of online tourism platforms guaranteeing data security of their customers.
... It is also stated that online photos shared on social media affect customer behaviors on destination choice and the purchasing decision-making process (Terttunen, 2017). Based on the underlined results of many studies, it is clear that social media has effective, supportive and stimulating roles in tourist behaviors such as pre-information seeking, sharing (Höpken et al., 2020;Zeng & Gerritsen, 2014), the way of information seeking, making decision and booking, collecting information, evaluating alternatives, choosing (Gupta, 2019), and intention to visit the destination (Koo et al., 2016). ...
Chapter
Digital marketing and online social media platforms have become the cornerstones to the success of places and accommodation. This edited volume investigates the current status of digital marketing and social media utilization by both travellers and service providers and explores future digital marketing and social media research trends. - Explores the most effective digital marketing strategies and campaigns; - Investigates the current status of digital marketing and social media utilization by both travellers and service providers; - Provides a view of future digital marketing and social media research trends.
... Research focused on tourism hotspots and tourist movement has also become prevalent. Scholars have used tourist photographs and machine learning algorithms to identify popular tourist destinations within major cities and seasonal tourist hotspots (Giglio et al., 2020;Kaufman et al., 2019;Lee & Tsou, 2018;Vu et al., 2015;Zhang et al., 2018), recommend tourist attractions (Han et al., 2021) and analyze tourist movement patterns (Domènech et al., 2020;García-Palomares et al., 2015;Höpken et al., 2020;Önder et al., 2014). This vein of research has largely focused on identifying trends within urban centers and known tourist destinations. ...
Article
Unlike large cities and urban heritage attractions, archaeological heritage sites in peripheral locations cannot readily absorb the impact of over-tourism. Peru’s rapid growth as a tourist destination since the late 1990s has been bolstered by the popularity of its archaeological heritage sites and past civilizations, particularly in the Cuzco region. As mass tourism and issues of overcrowding have confronted visitors and heritage managers in the Cuzco region, tourists have sought alternative archaeological attractions beyond Cuzco’s ticketed sites – many of which are open-access and unmonitored. By integrating large-scale internet photo datasets from Flickr and the power of computer vision and machine learning algorithms, this study identifies ‘on-the-rise’ (i.e. emerging) archaeological heritage attractions accessible from five Peruvian cities. The major contributions of this paper are (1) to provide a novel method capable of being scaled globally to identify emerging archaeological tourist attractions that are unmonitored by ticket sales, (2) to identify ‘on-the-rise’ archaeological attractions in Peru, and (3) to mitigate future risk (e.g. over-tourism, pollution, destruction and damage) at identified sites as tourist itineraries continue to expand. The identification of emerging archaeological heritage attractions is significant for targeted and sustainable heritage planning, archaeological conservation, future destination marketing and tourism development.
... Following a steady promotion of this policy, industries have been transferred to the surrounding areas in an orderly manner, such that the positive effects of reshaping the spatial pattern of various types of industries in the city have gradually emerged. Previous studies have focused on issues, such as the evolutionary characteristics of the cluster of productive service industries in Beijing [27], the spatial pattern of public service industries [37][38][39][40][41], and the spatial distribution characteristics of the catering and accommodation industries [2]. However, there are relatively few municipal-level studies analyzing the structure of the tourism and leisure industry in general while specifically exploring the spatial characteristics of its sub-sectors. ...
Article
By taking Beijing as the case site, using open-source Point of Interest data, and employing spatial visualization techniques, this study explores the spatial structural characteristics of the Beijing tourism and leisure industry and its sub-sectors. It has been found that (1) the nearest neighbor indexes of the tourism and leisure industry and its sub-sectors are all less than 1, indicating that the tourism and leisure industry and its sub-sectors in Beijing exhibit a spatial clustering distribution. Scenic spots have the largest R-value of 0.52 and, thus, the lowest degree of clustering. The minimum R-value of 0.15 is found in catering, marking the highest degree of clustering in the industry; (2) the main directional trend of the tourism and leisure industry and its sub-sectors in Beijing is the “northeast-southwest” direction, the south-north directional dispersion is dominant, and scenic spots demonstrate a more noticeable trend of spatial dispersion; (3) within the area from Sanlitun Street in the north to Panjiayuan Street in the south, and from Chaoyangmen Street in the west to Liulitun Street in the east, is situated the largest portion of cluster centers with the highest degree of clustering in Beijing’s tourism and leisure industry. The contiguous high-density cluster center of catering starts from Sanlitun Street in the north to Jinsong Street in the south, and from Chaoyangmen Street in the west to Liulitun Street in the east. The cluster of shopping and entertainment shows a checkerboard pattern in the CZCF and NUDZ. The high-value cluster of accommodation occurs primarily around Sanlitun, Panjiayuan, and Qianmen; (4) the distribution of three grades of hot spot areas and non-significant areas of tourism and leisure, catering, accommodation, and shopping and entertainment in Beijing demonstrates a circular pattern that centers around the CZCF and expands outward in sequence. 
High-value hot spot streets for this area are dominated by Beixinqiao Street, Hepingli Street, Sanlitun Street, Heping Street, and Tuanjiehu Street; and the high-value cold spot streets of the area are chiefly in Fuzizhuang Township, Wangping Town, Miaofeng Mountain Town, and Tanzhesi Town.
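The nearest neighbor index reported in this abstract can be illustrated with a small worked example. The sketch below assumes the common Clark-Evans form, R = observed mean nearest-neighbor distance divided by the distance expected under complete spatial randomness, 0.5 / sqrt(n / A). The function name, the toy coordinates and the study-area size are illustrative assumptions and are not taken from the Beijing study.

```python
import math


def nearest_neighbor_index(points, area):
    """Clark-Evans style ratio R for planar points inside a region of the given area.

    R < 1 suggests clustering, R close to 1 a random pattern, R > 1 dispersion.
    """
    n = len(points)
    if n < 2 or area <= 0:
        raise ValueError("need at least two points and a positive area")

    # Observed mean distance from each point to its nearest neighbor (brute force).
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    observed = total / n

    # Expected mean nearest-neighbor distance under complete spatial randomness.
    expected = 0.5 / math.sqrt(n / area)
    return observed / expected


# Hypothetical toy data: four tightly packed points vs. four evenly spread points
# in the same 10 x 10 study area.
clustered = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
regular = [(2.5, 2.5), (7.5, 2.5), (2.5, 7.5), (7.5, 7.5)]

r_clustered = nearest_neighbor_index(clustered, area=100.0)
r_regular = nearest_neighbor_index(regular, area=100.0)
print(r_clustered, r_regular)
```

Read against this sketch, the reported R-values of 0.15 (catering) and 0.52 (scenic spots) both fall below 1, with the smaller value indicating the tighter clustering.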
... The last question dealt with security which was also part of the standard facilities. This is consistent with the research results of Höpken et al. [28], who discussed tourism behaviour toward famous tourist attractions. They indicated the tourists' characteristics or specific information to extract the data to utilize the management, service, and marketing activities for reaching sustainable tourism. ...
Article
Full-text available
The evolution of information technology and social media today affects the behaviour and expectations of tourists when seeking knowledge and information. Proper answers can enhance the opportunity for travel decision-making. The purpose of this research is to study tourist information-seeking behaviours. Such behaviours are derived from frequently asked question items by means of association rule mining (ARM). Following previous research, the questions were clustered into four groups in what is termed the ESAN model, which is the input data to ARM with default parameters in Weka. It was found that the question patterns generated by the ARM, consisting of clusters E, S, A, and N, had a confidence value above 98%, and the support values were 0.980-0.987, 1, 0.988-0.994, and 0.983-0.996, respectively. The utilization of this result includes 1) the representation of a data preparation model to access information quickly and to meet the needs of tourists and 2) the provision of guidelines for the design of a rule-based chatbot.
... The generalized sequential pattern algorithm (GSP) is a mining method that can detect recurrent sequences that exceed a user-specified support threshold. The method was first presented by Agrawal and Srikant [21] and later applied to analyze tourist behavior [22], the learning behavior of university students after exposure to educational games [23], and the behavior of book lending transactions [24]. The mining of sequential patterns in poultry has been used to assess growing chicks' behavior under heat and cold stress [8]. ...
Article
Full-text available
Broiler productivity is dependent on a range of variables; among them, the rearing environment is a significant factor for proper well-being and productivity. Behavior indicates the bird’s initial response to an adverse environment and is capable of providing an indicator of well-being in real-time. The present study aims to identify and characterize the sequential pattern of broilers’ behavior when exposed to thermoneutral conditions (TNZ) and thermal stress (HS) by constant heat. The research was carried out in a climatic chamber with 18 broilers under thermoneutral conditions and heat stress for three consecutive days (at three different ages). The behavior database was first analyzed using one-way ANOVA, Tukey test by age, and Boxplot graphs, and then the sequence of the behaviors was evaluated using the generalized sequential pattern (GSP) algorithm. We were able to predict behavioral patterns at the different temperatures assessed from the behavioral sequences. Birds in HS were prostrate, identified by the shorter behavioral sequence, such as the {Lying down, Eating} pattern, unlike TNZ ({Lying down, Walking, Drinking, Walking, Lying down}), which indicates a tendency to increase behaviors (feeding and locomotor activities) that guarantee the better welfare of the birds. The sequence of behaviors ‘Lying down’ followed by ‘Lying laterally’ occurred only in HS, which represents a stressful thermal environment for the bird. Using the pattern mining sequences approach, we were able to identify temporal relationships between thermal stress and broiler behavior, confirming the need for further studies on the use of temporal behavior sequences in environmental controllers.
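The core test behind sequential pattern mining, namely whether an ordered pattern occurs in enough sequences to pass a support threshold, can be sketched briefly. This covers only the support-counting step and not the candidate generation and pruning of the full GSP algorithm. The behaviour labels are borrowed from the abstract, but the four sequences and the helper names are hypothetical.

```python
def contains_in_order(sequence, pattern):
    """True if all pattern items occur in the sequence in order (gaps allowed)."""
    remaining = iter(sequence)
    # 'item in remaining' consumes the iterator up to the match, which enforces order.
    return all(item in remaining for item in pattern)


def support(sequences, pattern):
    """Fraction of sequences that contain the pattern as an ordered subsequence."""
    hits = sum(contains_in_order(seq, pattern) for seq in sequences)
    return hits / len(sequences)


# Hypothetical observation sequences, one per bird.
observations = [
    ["Lying down", "Walking", "Drinking", "Walking", "Lying down"],
    ["Lying down", "Eating"],
    ["Lying down", "Lying laterally"],
    ["Walking", "Eating", "Lying down"],
]

s_lying = support(observations, ["Lying down"])
s_lying_walk = support(observations, ["Lying down", "Walking"])
s_walk_lying = support(observations, ["Walking", "Lying down"])
print(s_lying, s_lying_walk, s_walk_lying)
```

A pattern counts as frequent when its support meets the user-specified minimum; because order matters, {Lying down, Walking} and {Walking, Lying down} receive different support values here.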
... In this stage, we used the sequential pattern mining method to reveal the movement patterns of visitors. Sequential pattern mining usually transforms the recreation track into the visiting sequence of scenic site, which makes the recreational behavior easier to be understood (Höpken et al., 2020). Finally, on the basis of summarizing the above empirical evidence, conceptual patterns of different types of recreational flows were proposed: we argue that these help to understand the general rules of recreation behavior demonstrated by park visitors. ...
Article
With increasingly diversified and personalized lifestyles, visitors’ recreation behaviors within urban parks have become more and more active and changeable; meeting this novel change in demand has become a new challenge for effective park design and management. In fact, it has become an urgent task to understand more precisely how visitors flow through parks and in what patterns. This paper used Global Positioning System (GPS) tracking data to quantitatively identify the actual spatial movement and time cost of visitors to Jiefang Park, Wuhan, China. The results show that most recreational flows through the park relied on its major roads. These flows generally follow the distance decay law from the park entrance before reaching the midpoint of the recreation process, and recreational stoppages of these flows also occurred mainly in the first half of the trip, exhibiting three different characteristics. Based on the empirical evidence, this paper summarizes three generalized patterns of recreational flows in order to depict the general recreation behaviors of urban park visitors. These findings could help predict the spatial movement and time budget of park visitors and thus provide support for future park design and management.
... In [9] tourists' collective information about their activities in a city is used to identify POIs of interest and the tourists' behaviour in an urban area. The authors employ a density based clustering algorithm (POI identification) and association-rule mining (behaviour analysis) on users' geo-localized photos uploaded on a photo sharing platform. ...
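The association-rule step mentioned in this excerpt can be made concrete with the standard support, confidence and lift measures. The sketch below is a minimal illustration under stated assumptions: each tourist is represented by the set of POI clusters they visited, the POI labels are placeholders, and the function name is hypothetical. It evaluates one given rule rather than mining rules with FP-growth.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    a, c = set(antecedent), set(consequent)

    n_a = sum(a <= t for t in transactions)         # visits containing the antecedent
    n_c = sum(c <= t for t in transactions)         # visits containing the consequent
    n_ac = sum((a | c) <= t for t in transactions)  # visits containing both

    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Hypothetical data: the set of POI clusters visited by each of four tourists.
visits = [
    {"poi_a", "poi_b"},
    {"poi_a", "poi_b", "poi_c"},
    {"poi_a"},
    {"poi_c"},
]

sup, conf, lift = rule_metrics(visits, {"poi_a"}, {"poi_b"})
print(sup, conf, lift)
```

A lift above 1 means the consequent POI is visited more often by tourists who also visited the antecedent POI than by tourists overall.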
Chapter
Full-text available
We consider the urban tourism scenario, which is characterized by limited availability of information about individuals’ past behaviour. Our system goal is to identify relevant next Points of Interest (POIs) recommendations. We propose a technique that addresses the domain requirements by using clusters of users’ visit trajectories that show similar visit behaviour. Previous analysis clustered visit trajectories by aggregating trajectories that contain similar POIs. We compare our approach with a state-of-the-art neighbour-based next-item recommendation model. The results show that customizing recommendations for clusters of users with similar behaviour yields superior performance on different quality dimensions of the recommendation.
... Big data approaches enable destinations to have new measurable sustainability goals, such as identifying and targeting customers with the smallest ecological footprint. Furthermore, a transition is being witnessed from the use of big data and machine learning methods to understand what has happened to predictive analytics providing information on what is going to happen (Kamel et al., 2008; Li et al., 2017; Höpken et al., 2019; Höpken et al., 2020a; Höpken et al., 2020b). All of these technological and methodological enhancements have a profound impact on tourism management. ...
Chapter
Full-text available
Information and communication technologies are major drivers of change. Also in tourism, businesses and entire destinations have to find new business models to stay competitive and relevant. Utilizing the possibilities of digital technologies in developing new business models is called digital transformation. This chapter examines what digital transformation in tourism is and how technology affects leadership and management in tourism organizations. Digital transformation is conceptualized as a creative process activated by knowledge management and knowledge transfer which, in turn, aims at creating new business possibilities and models, respectively. By drawing on the extensive literature on topics connected to digital transformation, such as tourism management, leadership, knowledge and change management, as well as creativity, the chapter at hand discusses the current state of digital transformation management in tourism. A research outlook for the future of digital transformation management in tourism is finally proposed.
Thesis
Full-text available
We are currently living in an era of change induced by a new technological cycle that promises to redefine our culture, our society and our economy. Artificial Intelligence (AI), Machine Learning (ML), Big Data (BD), blockchain, robotics, the Internet of Things (IoT), the metaverse and the rest of cutting-edge technologies are leading a new innovative wave that is plunging us fully into the so-called IV Industrial Revolution. Given this panorama of change, it is obvious that the entire Spanish economy and its productive fabric will be totally affected. Precisely, in this thesis we want to focus our attention on how one of the most representative sectors of our economy, tourism, is facing this wave of innovation. As was already the case with the emergence of the technological leap of Information and Communications Technology (ICTs), this new innovative impulse promises to transform the sector, although with some notable differences: this new wave will bring changes that are increasingly rapid, continuous and intense over time, posing a major challenge, perhaps as never seen before, for the tourism industry. The tourism sector is one of the most consolidated and important economic sectors in the world. During the last decades, it has established itself as one of the fastest growing and largest sectors worldwide, setting a record figure of 1.5 billion international tourists in 2019, surpassing those of 2018 by almost 5% (WTO, 2020). This translates into a contribution of more than 10% to the Gross Domestic Product (GDP) worldwide, more than 7% of total international trade and nearly 30% of world exports of services, keeping pace globally with the value of oil or automobile exports, thus making tourism one of the top five activities in world trade (WTTC, 2019). 
The economic importance also holds for the Spanish case: international tourist arrivals in 2019 exceeded 83.5 million travelers, accounting for more than 11% of the total international market and a growth in travelers of almost 1% compared to 2018 figures. Thus, tourism is established as a fundamental pillar of the Spanish economy, due to its contribution to GDP, employment or economic growth, well above the rest of the OECD countries (Figure 1.1), and due to the compensating role of the external imbalance that the Spanish economy suffers structurally (Pedreño-Muñoz and Ramón-Rodríguez, 2009). These characteristics give it an undeniable resistance to the latest economic crises, although it is true that the COVID-19 pandemic has overexerted it, has forced it to transform itself once again and has made evident the excessive dependence of our economy on tourism, due to its multiplier effect on the rest of the economy, because of the transversal nature of the consumption of tourist demand. In this globalizing context and the integration of new digital technologies in practically any area of society, it is unthinkable that the evolution of the sector should not be linked to the digital economy, being one of its basic pillars of development (Hojeghan and Esfangareh, 2011). This digital revolution is taking place in an environment where the number of tourists worldwide is growing steadily, with the rate of growth of tourism demand exceeding the rate of growth of the economy, due to the rise of emerging countries, the greater ability of young generations to travel and the lower costs of air travel, and represents a significant opportunity for digital innovation in the sector. As we will discuss below, the sector in question is characterized by a low capacity to generate innovation, although at the same time it is extremely sensitive to the adaptation of its structure to new technologies. 
This fact implies a constant need for renewal within the sector in order to adapt its competitive capacity through the new innovations already mentioned. In fact, as we will see in this thesis, there are many examples in the literature on the adaptation of the sector to BD and AI, clean energy, mobile technology, augmented reality, IoT, virtual assistants or blockchain, among others. But all these advances have as their starting point a common element: data. These are the key element to raise the productivity of companies and make the most of these technologies, so having the ability to collect, exchange, process and analyze data is indispensable in any industry. The introduction of data and algorithms oriented to price management, demand capture, user segmentation and optimization of value chain processes, among other applications, as a fundamental part of the structure of any industry has a fundamental impact on the study of industrial economics. Researchers, public administrations, companies and all the actors involved must adapt to this new reality, implement policies that promote the digitization of the productive fabric and address new lines of research to understand the new participants, thus achieving a greater understanding of this data-driven tourism. Precisely here lies one of the main strengths of tourism compared to other sectors: it has a huge amount of varied, dispersed and representative information, as it is produced by the 'digital footprint' of the tourist on each trip. The challenge for companies in the sector, from large hotel companies to SMEs, is to make the most of this data. Otherwise, they will be swept away by new technology companies, whose main source of business is innovation, and not the tourism sector. 
In fact, the new technological reality of the industry has created a competitive environment in which technology-based disruptors, who know how to create new markets by satisfying untapped needs, coexist with traditional players who generally do not have the capacity to innovate. Under these premises, this doctoral thesis aims to review the economic principles of tourism from an innovative perspective, analyze the potential impact of the application of AI in the tourism industry at all levels and the need for the use of machine learning algorithms in research in the sector. To this end, first of all, the conceptual framework on which it is framed is constructed. Chapter II of the thesis is devoted to a review of the evolution of the concept of innovation and its importance in economic theory. For this purpose, theoretical references that have studied the role of technology and innovation in economic growth, such as Schumpeter, Solow, Romer or Lucas, are studied. The aim is to understand the impact that the disruptive changes we are experiencing in the economy are having, in order to subsequently apply them to the transformation of the structure of the tourism industry. It is precisely in Chapter III where an applied analysis of innovation and the impact of new technologies on the tourism sector is carried out. It will study the state of innovation in the tourism sector, making important clarifications on the sector's capacity to adapt or develop innovations. In addition, it will explain the digital principles that are transforming the tourism industry and the new research cycle derived from the emergence of BD and which is led by techniques based on ML algorithms, thus justifying the choice of the tourism sector as a case study. Chapter IV provides a complete review of the transforming process that the structure of the tourism industry is undergoing due to the technological paradigm shift. 
Thus, it studies how these innovative processes are developing a new tourism demand based on data, how the tourism value chain is being reinvented, how tourism prices are set in a market with almost perfect information, what challenges are posed for the labor and training market in the sector, and what role they play in the emergence of new technology-based competitors in the sector. In Chapters V and VI, Airbnb is chosen as an applied case study, as it is representative of all the challenges faced by the sector in terms of technology, political regulation, market intervention, reinterpretation of the tourism value chain, emergence of economic or pandemic shocks that researchers must face. The applied analysis of Airbnb aims to contribute to the research of the sector in this empirical context dominated by BD and by the increasingly imperative need to apply machine learning-based algorithms to understand the different challenges posed by Airbnb in the sector, highlighting among them the processing and simplification of huge databases. Chapter V has as a case study Madrid, the capital of Spain and the fourth destination by number of Airbnb ads in Europe. For this applied case, we study whether the COVID-19 pandemic had a significant impact on the structure of Airbnb supply and demand. For this purpose, the study starts from a logit model of hedonic panel data, different alternative methods of variable selection and likelihood tests are applied to confirm the existence of the structural change affecting the decision making when renting an apartment from the Platform. This work aims to contribute to the research of the sector, and specifically of Airbnb, based on BD. Chapter VI focuses the study on the Valencian Community, one of the main sun and beach tourist destinations, to carry out an analysis on the pricing of tourist accommodation on the platform. 
This case study aims to analyze whether the application of ML algorithms allows companies to optimize prices in a more efficient way than traditional models. To this end, the performance of a traditional hedonic pricing model is compared to an estimation model based on neural networks to demonstrate the best fit in the predictive capacity of machine learning-based techniques when setting prices. Thus, the doctoral thesis constitutes a valuable and novel contribution to the new research cycle in the sector. It proposes an exhaustive review of all the implications and applications of new technologies in tourism and the advantages of using machine learning based analysis techniques for researchers in their study.
Article
Full-text available
The accuracy and completeness of information in geographical databases are very important for many location-based applications and services. However, the incompleteness in geographical databases is currently an issue. One consequence of this is that the geographic bounding boxes of many points of interests (POIs) have not been known. This paper studies the problem of estimating geographic bounding boxes for POIs using geo-tagged photos contributed by public users on social media. We present a novel approach using relevant geo-tagged photos of POIs to estimate geographic bounding boxes for the POIs. In the proposed method, we extend to apply survival analysis with random distance variable for our estimation. We demonstrate the superiority and effectiveness of our proposed approach over competing methods.
Chapter
Tourism and photography have become very complementary, and tourists are constantly seeking the best spots to capture pictures and memorize their vacations. However, the search for the best and unforgettable photographic spots is difficult and time-consuming for tourists, especially when visiting new regions. In this paper, we propose a method for discovering tourist photo spots from geotagged photos using clustering algorithms. The clusters are characterized to determine the type of photos such as selfies or panoramic. We compare our approach to the most used clustering algorithms namely K-Means and DBSCAN. The approach is simulated and experimentally evaluated on a real photographic dataset of the French capital Paris. Our approach identifies the best-known, quirky and thematic spots in the reference websites.
Keywords: Tourism; Photographic spots; Clustering; HDBSCAN; Knowledge discovery
Article
The unprecedented development of the internet has compelled a growing number of tourists to share their photographs on social media. These images convey valuable memories and points of interest. As photography and content sharing have become commonplace among visitors, pictorial digital footprints represent a prevalent topic in tourism research. Studies on tourists’ movement trajectories hold great importance for destination management, marketing, and services. Flickr is a popular source in photo-based tourism research given the digital footprints embedded in photos’ metadata; however, the site’s bottlenecks (e.g. declining user activity, overly professional photographs) raise concerns. Scholars have instead gradually shifted their attention to emerging photo platforms such as Instagram—yet these pictures do not contain geographical information. Taking Beijing as a focal location, we introduce an approach in which landmark recognition complements the geographical cues in Instagram photos. Instagram check-in data and data identified through landmark recognition are validated. Ultimately, the recognized landmark information appears highly correlated with check-in data. This study demonstrates the feasibility of landmark recognition for extracting tourists’ footprints from ordinary content in user-generated photos. Findings also confirm that many photos from general social media platforms can serve as alternative and representative data sources in photo-based tourism research.
Chapter
Data extraction is a process in which potentially useful information and knowledge are extracted from a large volume of fuzzy, random data. With modern means of automatic data collection and processing, the gap between the rapid growth in data volume and the stagnation of data analysis methods keeps widening. The need for scientific research and decision-making based on extensive analysis of existing data is therefore becoming increasingly urgent. The FP-growth algorithm is a representative data extraction algorithm. This work is based on the FP-growth algorithm in the data retrieval process.
Chapter
Full-text available
This study aims to explore the composition of virtual guided tour experience on Airbnb and to develop a formation process of virtual guided tour experience. A case study based on the qualitative analysis was conducted with a dataset of online reviews towards an Online Experience in Beijing, China. A three-stage process of virtual guided tour experience was concluded, including experience encounter, experience evaluation, and behavioral intention. Experience encounter describes the experience composition from four dimensions: interpretation quality, host credibility, tourist-host social contact, and peer interaction; Experience evaluation is involved with benefits mainly gained from the enhanced understanding of local culture and the satisfaction attributed by the sense of telepresence; Further, behavioral intention covers both online and offline willingness to recommend or repurchase the virtual tour, or visit the destination in person after the pandemic. Theoretical and practical implications in navigating tourism recovery were discussed.
Chapter
Full-text available
We applied four machine learning models, linear regression, k-nearest neighbors (KNN), random forest, and support vector machine, to predict consumer demand for bike sharing in Seoul. We aimed to advance previous research on bike sharing demand by incorporating features other than weather, such as air pollution, traffic information, Covid-19 cases, and social economic factors, to increase prediction accuracy. The data were retrieved from the Seoul Public Data Park website, which records the counts of public bike rentals in Seoul, Korea, from January 1 to December 31, 2020. We found that the two best models are the random forest and the support vector machine models. Among the 29 features in six categories, the features in the weather, pollution, and Covid-19 outbreak categories are the most important in model prediction. While almost all social economic features are the least important, we found that they help enhance the performance of the models.
Book
Full-text available
This open access book presents the proceedings of the International Federation for IT and Travel & Tourism (IFITT)’s 29th Annual International eTourism Conference, which assembles the latest research presented at the ENTER2022 conference, which will be held on January 11–14, 2022. The book provides an extensive overview of how information and communication technologies can be used to develop tourism and hospitality. It covers the latest research on various topics within the field, including augmented and virtual reality, website development, social media use, e-learning, big data, analytics, and recommendation systems. The readers will gain insights and ideas on how information and communication technologies can be used in tourism and hospitality. Academics working in the eTourism field, as well as students and practitioners, will find up-to-date information on the status of research.
Chapter
Full-text available
Nowadays, hotels are adopting high technologies to improve the quality of their facilities and services to build competitive advantages. Although smart hotels are an emerging trend, no known studies have investigated hotel employees’ and guests’ perceptions of this kind of hotel. This research will investigate how hotel employees and guests perceive the benefits and drawbacks of smart hotels using Q methodology.
Chapter
Full-text available
As global travel emerges from the pandemic, pent up interest in travel will lead to consumers making their choice between global destinations. Instagram is a key source of destination inspiration. DMO marketing success on this channel relies on projecting a destination image that resonates with this target group. However, usual text-based marketing intelligence on this channel does not work as content is consumed first and foremost as a visual projection. The author has built a deep learning based visual classifier for destination image measurement from photos. In this paper, we compare projected and perceived destination images in Instagram photography for four of the most Instagrammed destinations worldwide. We find that whereas the projected destination image aligns well to the perceived image, there are specific aspects of the destinations that are of more interest to Instagrammers than reflected in the current destination marketing.
Chapter
Full-text available
This paper shows a first analysis of the experiences and challenges of studying tourism during the times of the COVID-19 pandemic. 14 tourism students from two higher education institutions in Europe participated in three focus group discussions. One generation of these students started their education in presence and had to shift online with the start of the pandemic, while the other generation started their education knowing that lessons would be mainly online. Authors used qualitative content analysis to analyze the participants’ statements. As a result of the analysis, several themes emerged, and students contextualized eLearning as an education method for a future without COVID-19.
Chapter
Full-text available
Customer relationship management (CRM) is proving to be one of the most promising business strategies. However, in the field of destination marketing literature, a problem exists as to how data-supported CRM can be established. While customer data management has already been well exploited in other industries, DMOs lack customer proximity and data sovereignty. The aim of this paper is to fill this research gap and show how a data-based CRM can be deployed by DMOs based on the principles of social exchange theory. In 13 expert interviews, these aspects were examined from the DMO’s point of view. The results show that the exchange relationship must be established taking into account the DMO’s extraordinary conditions and critical success factors. In order to stimulate guests’ desire for dialogue or the willingness to disclose personal data, DMOs should offer high-quality customer benefits. A combination of hedonic and utilitarian benefits are found to be the most effective stimuli. In return, only the most necessary customer information should be requested and subsequently built passively. Only if the cost and benefit ratio of the exchange relationship is positive for both parties, a database for the CRM can be built in order to foster long-lasting relationships with potential and returning guests.
Article
Full-text available
Space-time tourist behaviour is influenced by numerous factors related both to tourists and the destination. Yet, however complex it may be, understanding and to some extent managing the way tourists move in space and time is crucial to ensuring the quality of their experience, as well as the effective and sustainable management of destinations and attractions. In the rural wine tourism context, studies on space-time behaviour are rare. The present study uses empirical data collected from tourists staying in hotels of the Bairrada Wine Route territory (N = 116), combining a GPS tracking study with a questionnaire survey. Using a time-geographical analytical approach, the GPS tracking data were mapped for a more detailed analysis of the tourists’ movements in the Bairrada terroir. The findings highlight specificities of tourist consumption in the context of rural wine regions and provide valuable insights for destination planning, service design and marketing of the Bairrada Wine Route.
Article
Full-text available
Traditional tourism data collection includes surveys, interviews and focus groups. However, these methods are both expensive and time consuming. Moreover, there is a lag between the time of data collection and the receipt of that data for analysis. Today, almost all individuals leave digital footprints on the Internet, which can also be used for tourism research. One type of digital footprint is the photos uploaded on websites such as Flickr. The aim of this study is to determine whether the digital footprints in Flickr provide a useful indicator for tourism demand. Photos tagged with “Austria” between 2007 and 2011 were collected using Flickr API. Residents were distinguished from tourists using the data, and spatial analyses were conducted of the tourist-generated data. The results indicate that geotagged photos in Austria are more representative of actual tourist numbers at the city level than at the regional level.
Conference Paper
Full-text available
Flickr presents an abundance of geotagged photos for data mining. In particular, we propose extracting spatio-temporal metadata from Flickr photos; combining a collection of such photos yields a spatio-temporal entity movement trail, a trajectory describing an individual's movements. Using these spatio-temporal Flickr photographer trajectories, we aim to extract valuable tourist information about where people are going, what time they are going there, and where they are likely to go next. In order to achieve this goal, we present our novel spatio-temporal trajectory regions-of-interest mining and sequential pattern mining framework. It differs from previous work in that it forms regions-of-interest by taking both space and time into consideration simultaneously, and thus produces higher-quality sequential patterns. We test our framework's ability to uncover interesting patterns for the tourism sciences industry by performing experiments using a large dataset of Queensland photo taker movements for the year 2012. Experimental results validate the usefulness of our approach at finding new, information-rich spatio-temporal tourist patterns from this dataset, especially in comparison with the 2D approaches shown in the literature.
Conference Paper
Flickr represents a massive opportunity to mine valuable human movement data from geo-tagged photos. However, existing Flickr trajectory data mining research has not considered mining frequent trajectory patterns whilst also considering the temporal domain. Therefore, a significant opportunity exists to demonstrate the application of a pattern mining algorithm to a large geo-tagged photo dataset. Thus, we present a novel application of the trajectory pattern mining algorithm to a 2012 Flickr dataset of Australia and one of its states, Queensland. In our experiments we show that many interesting, previously unknown patterns are discovered through our framework. Our framework is able to discover expected major landmarks such as cities and tourist attractions. In addition, we make the notable discovery of what is theorized to be valuable tourist travel information about sequential movements between hot-spot attractions.
Conference Paper
In this paper, we present a Geographical Information Retrieval system, which aims to automatically extract and analyze touristic information from photos in online image collections (in our case study, Flickr). Our system collects all the photos, and the related information, that are associated with a specific city. We then use the Google Maps service to geolocate the retrieved photos, and finally we analyze the geo-referenced data to achieve our goals: 1) determining and locating the most interesting places of the city, i.e. the most visited locations, and 2) reconstructing touristic routes of the users visiting the city. Information is filtered by using a set of constraints, which we apply to select only the users that can reasonably be considered tourists visiting the city. Tests were performed on an Italian city, Palermo, that is rich in artistic and touristic attractions, but preliminary tests showed that our technique could successfully be applied to any city in the world with a reasonable number of touristic landmarks.
Article
Novel methods and tools are being developed to explore the significance of the new types of user-related spatiotemporal data. This approach helps uncover the presence and movements of tourists from cell phone network data and the georeferenced photos they generate. A city's visitors have many ways of leaving voluntary or involuntary electronic trails: prior to their visits, tourists generate server log entries when they consult digital maps or travel Web sites; during their visit, they leave traces on wireless networks whenever they use their mobile phones; and after their visit, they might add online reviews and photos. Broadly speaking then, there are two types of footprint: active and passive. Passive tracks are left through interaction with an infrastructure, such as a mobile phone network, that produces entries in locational logs; active prints come from the users themselves when they expose locational data in photos, messages, and sensor measurements. In this article, we consider two types of digital traces from Rome, Italy: georeferenced photos made publicly available on the photo-sharing Web site Flickr and aggregate records of wireless network events generated by mobile phone users making calls and sending text messages on the Telecom Italia Mobile (TIM) system.
Conference Paper
Vacation planning is one of the frequent, but nonetheless laborious, tasks that people engage in online, and it requires skilled interaction with a multitude of resources. This paper constructs intra-city travel itineraries automatically by tapping a latent source reflecting geo-temporal breadcrumbs left by millions of tourists. For example, the popular rich media sharing site, Flickr, allows photos to be stamped with the time they were taken and mapped to Points of Interest (POIs) by geographical (i.e. latitude-longitude) and semantic (e.g., tags) metadata. Leveraging this information, we construct itineraries following a two-step approach. Given a city, we first extract photo streams of individual users. Each photo stream provides estimates of where the user was, how long he stayed at each place, and what the transit time between places was. In the second step, we aggregate all user photo streams into a POI graph. Itineraries are then automatically constructed from the graph based on the popularity of the POIs and subject to the user's time and destination constraints. We evaluate our approach by constructing itineraries for several major cities and comparing them, through a "crowd-sourcing" marketplace (Amazon Mechanical Turk), against itineraries constructed from popular bus tours that are professionally generated. Our extensive survey-based user studies over about 450 workers on AMT indicate that high quality itineraries can be automatically constructed from Flickr data.
Conference Paper
Social media such as those residing in the popular photo sharing websites are attracting increasing attention in recent years. As a type of user-generated data, wisdom of the crowd is embedded inside such social media. In particular, millions of users upload to Flickr their photos, many associated with temporal and geographical information. In this paper, we investigate how to rank the trajectory patterns mined from the uploaded photos with geotags and timestamps. The main objective is to reveal the collective wisdom recorded in the seemingly isolated photos and the individual travel sequences reflected by the geo-tagged photos. Instead of focusing on mining frequent trajectory patterns from geo-tagged social media, we put more effort into ranking the mined trajectory patterns and diversifying the ranking results. Through leveraging the relationships among users, locations and trajectories, we rank the trajectory patterns. We then use an exemplar-based algorithm to diversify the results in order to discover the representative trajectory patterns. We have evaluated the proposed framework on 12 different cities using a Flickr dataset and demonstrated its effectiveness.
Article
Mobile guides (based on PDAs, smart phones, or mobile phones) play an increasingly important role in tourism, giving tourists ubiquitous access to relevant information especially during their trip. Due to a more difficult access to mobile applications in a ubiquitous usage environment, based on time constraints, lighting conditions, bandwidth, etc., user acceptance of mobile applications strongly depends on the application adaptation to the concrete usage context. This article presents a framework for mobile applications in tourism, enabling a flexible implementation of adaptive, context-aware tourism applications. The framework especially provides approaches for user interface adaptation, content adaptation (recommendation), and interaction modality adaptation. The framework has been prototypically instantiated and evaluated in two different application scenarios, a city guide for the city of Innsbruck and a skiing guide for the ski resort DolomitiSuperski. Both application scenarios showed high usage rates and customer satisfaction and proved the applicability and effectiveness of the presented approach for developing adaptive mobile tourism applications.
Book
The rapid growth of the Web in the last decade makes it the largest publicly accessible data source in the world. Web mining aims to discover useful information or knowledge from Web hyperlinks, page contents, and usage logs. Based on the primary kinds of data used in the mining process, Web mining tasks can be categorized into three main types: Web structure mining, Web content mining and Web usage mining. Web structure mining discovers knowledge from hyperlinks, which represent the structure of the Web. Web content mining extracts useful information/knowledge from Web page contents. Web usage mining mines user access patterns from usage logs, which record clicks made by every user. The goal of this book is to present these tasks, and their core mining algorithms. The book is intended to be a text with a comprehensive coverage, and yet, for each topic, sufficient details are given so that readers can gain a reasonably complete knowledge of its algorithms or techniques without referring to any external materials. Four of the chapters, structured data extraction, information integration, opinion mining, and Web usage mining, make this book unique. These topics are not covered by existing books, but yet they are essential to Web data mining. Traditional Web mining topics such as search, crawling and resource discovery, and link analysis are also covered in detail in this book.
Article
Online social networks allow users to tag their posts with geographical coordinates collected through the GPS interface of smart phones. The time- and geo-coordinates associated with a sequence of posts/tweets manifest the spatial–temporal movements of people in real life. This paper aims to analyze such movements to discover people and community behavior. To this end, we defined and implemented a novel methodology to mine popular travel routes from geo-tagged posts. Our approach infers interesting locations and frequent travel sequences among these locations in a given geo-spatial region, as shown from the detailed analysis of the collected geo-tagged data.
Article
New sources of geolocated information, associated with big data and social networks, show great promise for geographical research, especially in the field of tourism geography. Photo-sharing services comprise one of these sources. The aim of this article is to demonstrate the potential of photo-sharing services for identifying and analyzing the main tourist attractions in eight major European cities: Athens, Barcelona, Berlin, London, Madrid, Paris, Rome and Rotterdam. Geotagged photographs on Panoramio were differentiated according to whether they had been taken by tourists or local residents, and their spatial distribution patterns were analyzed using spatial statistical techniques in a GIS. The results indicated the concentration and dispersion of photographs in each city and their main hot spots, and revealed marked differences between tourists' and residents' photographs, since the former showed higher spatial concentrations. In addition, differences were observed between cities; Barcelona and Rome presented a strong spatial concentration compared with London or Paris, which showed much greater dispersion.
Article
The number of geo-tagged digital photos has grown exponentially in the past decades. Increasing numbers of digital photos with geo-tags are available on many photo-sharing websites such as Flickr and Instagram. The proliferation of online photos offers great opportunities to study people's travel experiences and preferences. Mining tourists' behavior and city preferences has become popular in recent geographic information system (GIS) research. However, the huge amount of data also poses challenges in spatial analytics. In this study, we automate the detection of places of interest in multiple cities based on spatial and temporal features of Flickr images from 2007 on. We also speed up the process by running jobs on top of the RHadoop platform. This project provides fast and accurate tourist destination detection by mining large amounts of geo-tagged Flickr images. In addition, this study provides insight in applying the RHadoop platform to strengthen large geospatial data analytics. Our methods can be applied to many other cities, and results are valuable for tourism management.
Conference Paper
The paper introduces various concepts of ubiquitous (i.e. mobile) services which especially serve to collect customer-based information from tourists during their destination stay. The latter information source has been identified as a vital input for electronic Customer Relationship Management at the level of tourism destinations. The proposed mobile service concepts are prototypically visualized and qualitatively assessed by destination suppliers from Sweden and by a sample of potential customers from Germany. The gained empirical results suggest that the proposed concepts of an Electronic Customer Card, Detailed Slope Information, Avalanche Warning, and Quick Response code/NFC-tags show the potential to gain customer-based information and benefits for both, customers and destination suppliers.
Article
Decision-relevant data stemming from various business processes within tourism destinations (e.g. booking or customer feedback) are usually extensively available in electronic form. However, these data are not typically utilized for product optimization and decision support by tourism managers. Although methods of business intelligence and knowledge extraction are employed in many travel and tourism domains, current applications usually deal with different business processes separately, which lacks a cross-process analysis approach. This study proposes a novel approach for business intelligence-based cross-process knowledge extraction and decision support for tourism destinations. The approach consists of (a) a homogeneous and comprehensive data model that serves as the basis of a central data warehouse, (b) mechanisms for extracting data from heterogeneous sources and integrating these data into the homogeneous data structures of the data warehouse, and (c) analysis methods for identifying important relationships and patterns across different business processes, thereby bringing to light new knowledge. A prototype of the proposed concepts was implemented for the leading Swedish mountain destination Åre, which demonstrates the effectiveness of the proposed business intelligence architecture and the gained business benefits for a tourism destination.
Article
Geo-tagged photos of users on social media sites (e.g., Flickr) provide plentiful location-based data. These data provide a wealth of information about user behaviours, and their potential is increasing as it becomes ever more common for images to be associated with location information in the form of geo-tags. Recently, there has been an increasing tendency to use the information from these geo-tagged photos for learning to recommend tourist locations. In this paper, we propose a system to recommend interesting tourist locations and interesting tourist travel sequences (i.e., sequences of tourist locations) from a collection of geo-tagged photos. The proposed system is capable of understanding context (i.e., time, date, and weather), as well as taking into account the collective wisdom of people, to make tourist recommendations. We illustrate our technique on a sample of a public Flickr data set. Experimental results demonstrate that the proposed approach is able to generate better recommendations compared with other state-of-the-art landmark-based recommendation methods.
Book
This book is a thorough introduction to the most important topics in data mining and machine learning. It begins with a detailed review of classical function estimation and proceeds with chapters on nonlinear regression, classification, and ensemble methods. The final chapters focus on clustering, dimension reduction, variable selection, and multiple comparisons. All these topics have undergone extraordinarily rapid development in recent years and this treatment offers a modern perspective emphasizing the most recent contributions. The presentation of foundational results is detailed and includes many accessible proofs not readily available outside original sources. While the orientation is conceptual and theoretical, the main points are regularly reinforced by computational comparisons. Intended primarily as a graduate level textbook for statistics, computer science, and electrical engineering students, this book assumes only a strong foundation in undergraduate statistics and mathematics, and facility with using R packages. The text has a wide variety of problems, many of an exploratory nature. There are numerous computed examples, complete with code, so that further computations can be carried out readily. The book also serves as a handbook for researchers who want a conceptual overview of the central topics in data mining and machine learning.
Article
This paper proposes a method that fully exploits the contextual information of geo-tagged web photos to recommend tourism attractions to a user according to his personal interests and current time and location. The proposed method first detects tourism attractions from geo-tags, and estimates their popularity from the quantity of users' photos. The time at which photos were taken is used to discover temporal fluctuations in attractions' popularity, and the distance between consecutive photos is exploited to model the spatial influence on a user's travel behavior. Photos' textual and visual information is used to reveal users' personal interests. Collaborative filtering is also adopted in the recommendation process. With all this contextual information, our method predicts a user's preference for a certain attraction from different aspects, and automatically combines the prediction scores to give the final recommendation result with a learning-to-rank model. Experiments on a Panoramio dataset show that our method performs better than the state-of-the-art method, especially for users with little traveling history.
Conference Paper
The advent of photo-sharing services results in massive user-generated geo-tagged photos. These photos implicitly and explicitly indicate points-of-interest and their associations. This study aims to combine two data mining techniques: clustering and association rules mining to mine areas of attraction, and their associative patterns. We analyze photos from Flickr in the area of Queensland, Australia, a popular tourist destination hosting the Great Barrier Reef and tropical rain forest. We report interesting experimental results and discuss findings.
Article
Recently, the phenomenal advent of photo-sharing services, such as Flickr and Panoramio, has led to voluminous community-contributed photos with text tags, timestamps, and geographic references on the Internet. The photos, together with their time- and geo-references, become the digital footprints of photo takers and implicitly document their spatiotemporal movements. This study aims to leverage the wealth of these enriched online photos to analyze people’s travel patterns at the local level of a tour destination. Specifically, we focus our analysis on two aspects: (1) tourist movement patterns in relation to the regions of attractions (RoA), and (2) topological characteristics of travel routes by different tourists. To do so, we first build a statistically reliable database of travel paths from a noisy pool of community-contributed geotagged photos on the Internet. We then investigate the tourist traffic flow among different RoAs by exploiting the Markov chain model. Finally, the topological characteristics of travel routes are analyzed by performing a sequence clustering on tour routes. Tests on four major cities demonstrate promising results of the proposed system.
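The Markov chain step mentioned in this abstract can be illustrated with a minimal sketch. It assumes each travel path is simply an ordered list of region labels and estimates first-order transition probabilities by counting consecutive pairs; the function name and the sample paths are hypothetical and are not taken from the cited study.

```python
from collections import Counter, defaultdict


def transition_probabilities(paths):
    """Estimate first-order Markov transition probabilities between regions.

    `paths` is a list of travel paths, each an ordered list of region labels.
    Returns {source: {destination: probability}}.
    """
    counts = defaultdict(Counter)
    for path in paths:
        # Count every move from one region to the next one on the same path.
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1
    # Normalise the outgoing counts of each region so they sum to 1.
    return {
        src: {dst: n / sum(outgoing.values()) for dst, n in outgoing.items()}
        for src, outgoing in counts.items()
    }


# Hypothetical paths over three made-up regions of attraction.
example_paths = [["A", "B", "C"], ["A", "B"], ["A", "C"]]
probabilities = transition_probabilities(example_paths)
```

Regions that only ever appear at the end of a path have no outgoing row in the result, which is one of several modelling choices a real analysis would need to make explicitly.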
Article
The clustering algorithm DBSCAN relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape as well as to distinguish noise. In this paper, we generalize this algorithm in two important directions. The generalized algorithm, called GDBSCAN, can cluster point objects as well as spatially extended objects according to both their spatial and their nonspatial attributes. In addition, four applications using 2D points (astronomy), 3D points (biology), 5D points (earth science) and 2D polygons (geography) are presented, demonstrating the applicability of GDBSCAN to real-world problems.
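The density-based notion of clusters described here can be sketched as follows. This is a minimal illustration of plain DBSCAN on 2D points with Euclidean distance, not the GDBSCAN generalization from the paper; the parameter names `eps` and `min_pts` and the sample points are assumptions chosen for the example.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)  # None means "not yet visited"

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point: expand through it
                queue.extend(reachable)
    return labels


# Two dense groups and one isolated point (made-up coordinates).
sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
sample_labels = dbscan(sample, eps=1.5, min_pts=3)
```

The neighbour search is a brute-force scan, which keeps the sketch short but is quadratic in the number of points; practical implementations typically use a spatial index instead.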
Conference Paper
In this research, we propose a method to automatically generate a landmark identification system for geo-tagged photographs, based on analysis of various data collected from the Web. The method first conducts Web analysis based on three major procedures: (1) Automatic extraction of points-of-interest (POIs) based on geographical clustering of geo-tagged images, (2) Retrieval of landmark candidates for each extracted POI from search results of map search API, and (3) Collection and feature extraction of Web images of the landmark candidates. The system then identifies the landmark which appears in the query geo-tagged photograph, by comparing the location and content-based features of the query with the information accumulated by the previous procedures. Experimental results show that the proposed method is capable of constructing a highly accurate landmark identification system by leveraging Web information.
Conference Paper
The advent of media-sharing sites like Flickr and YouTube has drastically increased the volume of community-contributed multimedia resources available on the web. These collections have a previously unimagined depth and breadth, and have generated new opportunities - and new challenges - to multimedia research. How do we analyze, understand and extract patterns from these new collections? How can we use these unstructured, unrestricted community contributions of media (and annotation) to generate "knowledge". As a test case, we study Flickr - a popular photo sharing website. Flickr supports photo, time and location metadata, as well as a light-weight annotation model. We extract information from this dataset using two different approaches. First, we employ a location-driven approach to generate aggregate knowledge in the form of "representative tags" for arbitrary areas in the world. Second, we use a tag-driven approach to automatically extract place and event semantics for Flickr tags, based on each tag's metadata patterns. With the patterns we extract from tags and metadata, vision algorithms can be employed with greater precision. In particular, we demonstrate a location-tag-vision-based approach to retrieving images of geography-related landmarks and features from the Flickr dataset. The results suggest that community-contributed media and annotation can enhance and improve our access to multimedia resources - and our understanding of the world.
Conference Paper
Mining frequent patterns in transaction databases, time-series databases, and many other kinds of databases has been studied popularly in data mining research. Most of the previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is still costly, especially when there exist prolific patterns and/or long patterns. In this study, we propose a novel frequent pattern tree (FP-tree) structure, which is an extended prefix-tree structure for storing compressed, crucial information about frequent patterns, and develop an efficient FP-tree-based mining method, FP-growth, for mining the complete set of frequent patterns by pattern fragment growth. Efficiency of mining is achieved with three techniques: (1) a large database is compressed into a highly condensed, much smaller data structure, which avoids costly, repeated database scans, (2) our FP-tree-based mining adopts a pattern fragment growth method to avoid the costly generation of a large number of candidate sets, and (3) a partitioning-based, divide-and-conquer method is used to decompose the mining task into a set of smaller tasks for mining confined patterns in conditional databases, which dramatically reduces the search space. Our performance study shows that the FP-growth method is efficient and scalable for mining both long and short frequent patterns, and is about an order of magnitude faster than the Apriori algorithm and also faster than some recently reported new frequent pattern mining methods.
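The pattern-growth idea in this abstract can be sketched compactly. The code below is a simplified illustration, not the authors' implementation: it builds an FP-tree with a header table, then mines each item's conditional pattern base recursively. Transactions are passed as `(items, weight)` pairs so that conditional bases can reuse the same function; all names and the sample baskets are hypothetical.

```python
from collections import Counter, defaultdict


class Node:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count, self.children = 0, {}


def build_tree(weighted_transactions, min_count):
    """Build an FP-tree; return (header table, counts of frequent items)."""
    counts = Counter()
    for items, weight in weighted_transactions:
        for item in set(items):
            counts[item] += weight
    frequent = {i: c for i, c in counts.items() if c >= min_count}
    root, header = Node(None, None), defaultdict(list)
    for items, weight in weighted_transactions:
        # Keep frequent items only, ordered by descending global count.
        ordered = sorted((i for i in set(items) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            if item not in node.children:
                node.children[item] = Node(item, node)
                header[item].append(node.children[item])
            node = node.children[item]
            node.count += weight
    return header, frequent


def fp_growth(weighted_transactions, min_count, suffix=frozenset()):
    """Return {frozenset(itemset): absolute support count}."""
    header, frequent = build_tree(weighted_transactions, min_count)
    patterns = {}
    for item, nodes in header.items():
        pattern = suffix | {item}
        patterns[pattern] = frequent[item]
        # Conditional pattern base: the prefix path above each node of `item`.
        base = []
        for node in nodes:
            path, parent = [], node.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                base.append((path, node.count))
        patterns.update(fp_growth(base, min_count, pattern))
    return patterns


# Made-up baskets; each gets weight 1.
baskets = [["a", "b"], ["a", "b", "c"], ["a", "c"], ["d"]]
found = fp_growth([(b, 1) for b in baskets], min_count=2)
```

No candidate itemsets are generated and tested against the full database here; each recursion only sees the compressed prefix paths relevant to the current suffix, which is the design choice the abstract credits for the efficiency gain.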
Conference Paper
We investigate how to organize a large collection of geotagged photos, working with a dataset of about 35 million images collected from Flickr. Our approach combines content analysis based on text tags and image data with structural analysis based on geospatial data. We use the spatial distribution of where people take photos to define a relational structure between the photos that are taken at popular places. We then study the interplay between this structure and the content, using classification methods for predicting such locations from visual, textual and temporal features of the photos. We find that visual and temporal features improve the ability to estimate the location of a photo, compared to using just textual features. We illustrate using these techniques to organize a large photo collection, while also revealing various interesting properties about popular cities and landmarks at a global scale.
Article
It has long been realized that in pulse-code modulation (PCM), with a given ensemble of signals to handle, the quantum values should be spaced more closely in the voltage regions where the signal amplitude is more likely to fall. It has been shown by Panter and Dite that, in the limit as the number of quanta becomes infinite, the asymptotic fractional density of quanta per unit voltage should vary as the one-third power of the probability density per unit voltage of signal amplitudes. In this paper the corresponding result for any finite number of quanta is derived; that is, necessary conditions are found that the quanta and associated quantization intervals of an optimum finite quantization scheme must satisfy. The optimization criterion used is that the average quantization noise power be a minimum. It is shown that the result obtained here goes over into the Panter and Dite result as the number of quanta becomes large. The optimum quantization schemes for 2^b quanta, b = 1, 2, ..., 7, are given numerically for Gaussian and for Laplacian distribution of signal amplitudes.
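The abstract does not spell out the necessary conditions, so the following sketch rests on an assumption: it uses the form in which they are commonly stated for empirical data, namely that each sample is assigned to its nearest quantum and each quantum is moved to the mean of the samples assigned to it. Alternating these two steps in one dimension is the same iteration that k-means applies to points in the plane. The function name and the sample values are hypothetical.

```python
def lloyd_1d(samples, levels, iterations=50):
    """Refine quantization levels to reduce mean squared quantization error."""
    levels = sorted(levels)
    for _ in range(iterations):
        # Step 1: assign every sample to its nearest level.
        buckets = [[] for _ in levels]
        for x in samples:
            nearest = min(range(len(levels)), key=lambda i: (x - levels[i]) ** 2)
            buckets[nearest].append(x)
        # Step 2: move each level to the mean of its samples (keep it if empty).
        updated = [sum(b) / len(b) if b else level
                   for b, level in zip(buckets, levels)]
        if updated == levels:  # converged: assignments can no longer change
            break
        levels = updated
    return levels


# Made-up amplitudes forming two groups, with a poor initial guess.
quanta = lloyd_1d([0.0, 1.0, 2.0, 10.0, 11.0, 12.0], levels=[0.0, 5.0])
```

The result depends on the initial levels and is only guaranteed to be a local optimum, which is also why k-means is typically run from several starting points.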
Conference Paper
We are given a large database of customer transactions, where each transaction consists of customer-id, transaction time, and the items bought in the transaction. We introduce the problem of mining sequential patterns over such databases. We present three algorithms to solve this problem, and empirically evaluate their performance using synthetic data. Two of the proposed algorithms, AprioriSome and AprioriAll, have comparable performance, albeit AprioriSome performs a little better when the minimum number of customers that must support a sequential pattern is low. Scale-up experiments show that both AprioriSome and AprioriAll scale linearly with the number of customer transactions. They also have excellent scale-up properties with respect to the number of transactions per customer and the number of items in a transaction.
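The problem setting can be made concrete with a small sketch. It does not implement AprioriSome or AprioriAll; it only shows, under simplifying assumptions, how transactions are grouped into one time-ordered sequence per customer and how the support of a given sequential pattern is counted as the fraction of customers whose sequence contains it. The function names and the sample records are hypothetical.

```python
from collections import defaultdict


def build_sequences(transactions):
    """Group (customer_id, time, items) records into time-ordered sequences."""
    by_customer = defaultdict(list)
    for customer_id, time, items in transactions:
        by_customer[customer_id].append((time, set(items)))
    return {
        customer_id: [items for _, items in sorted(records, key=lambda r: r[0])]
        for customer_id, records in by_customer.items()
    }


def contains(sequence, pattern):
    """True if each itemset of `pattern` is a subset of a distinct element of
    `sequence`, with the matched elements appearing in the same order."""
    position = 0
    for wanted in pattern:
        while position < len(sequence) and not set(wanted) <= sequence[position]:
            position += 1
        if position == len(sequence):
            return False
        position += 1  # the next itemset must match a strictly later element
    return True


def support(sequences, pattern):
    """Fraction of customers whose sequence contains the pattern."""
    hits = sum(contains(seq, pattern) for seq in sequences.values())
    return hits / len(sequences)


# Made-up records: (customer_id, time, items).
records = [
    (1, 2, ["b"]), (1, 1, ["a"]),
    (2, 1, ["a", "c"]), (2, 2, ["b"]),
    (3, 1, ["b"]), (3, 2, ["a"]),
    (4, 1, ["a", "b"]),
]
share = support(build_sequences(records), [{"a"}, {"b"}])
```

A full miner would enumerate candidate patterns and keep those whose support meets a minimum threshold; the counting step above is the part every such algorithm has to perform.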
Vu, Q., Li, G., Law, R. and Zhang, Y. (2017), "Travel diaries analysis by sequential rule mining", Journal of Travel Research, doi: 10.1177/0047287517692446.
Kisilevich, S., Keim, D. and Rokach, L. (2010b), "A novel approach to mining travel sequences using collections of geotagged photos", Proceedings of the 13th AGILE International Conference on Geographic Information Science, pp. 163-182.
Kisilevich, S., Mansmann, F. and Keim, D. (2010a), "P-DBSCAN: a density-based clustering algorithm for exploration and analysis of attractive areas using collections of geotagged photos", Proceedings of the 1st International Conference and Exhibition on Computing for Geospatial Research Application, COM.Geo '10, ACM, New York, NY, pp. 38:1-38:4.
Larose, D.T. (2005), Discovering Knowledge in Data: An Introduction to Data Mining, John Wiley and Sons, NJ.