
Abstract

An important problem in knowledge discovery from trajectories is their segmentation into subparts (subtrajectories). Existing algorithms for trajectory segmentation generally use explicit criteria to create segments. In this article, we propose segmenting trajectories using a novel, unsupervised approach, in which no explicit criteria are predetermined. To achieve this, we apply the Minimum Description Length (MDL) principle, which can measure homogeneity in the trajectory data by computing the similarities between landmarks (i.e. representative points of the trajectory) and the points in their neighborhood. Based on the homogeneity measurements, we propose an algorithm named Greedy Randomized Adaptive Search Procedure for Unsupervised Trajectory Segmentation (GRASP-UTS), which is a meta-heuristic that builds segments by modifying the number and positions of landmarks. We perform experiments with GRASP-UTS on two real-world datasets, using segment purity and coverage metrics to evaluate its efficiency. Experimental results demonstrate that GRASP-UTS correctly segmented sample trajectories without predetermined criteria, by computing similarities between landmarks and other trajectory points.
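As a rough illustration of the idea in the abstract (landmarks whose number and positions are adjusted to minimise a cost), the following Python sketch runs a GRASP-style search on a one-dimensional signal. It is not the GRASP-UTS implementation: the cost function, the `penalty` and `rcl_size` parameters, the neighbourhood moves and all function names are assumptions made for this example, and the MDL-based cost of the paper is replaced by a simple penalty-plus-squared-error stand-in.

```python
import random

def cost(signal, landmarks, penalty):
    # Toy stand-in for an MDL cost (assumption): a fixed penalty per landmark
    # plus the squared deviation of each point from its nearest landmark.
    data = sum(
        min((abs(i - l), (v - signal[l]) ** 2) for l in landmarks)[1]
        for i, v in enumerate(signal)
    )
    return penalty * len(landmarks) + data

def construct(signal, penalty, rng, rcl_size=3):
    # Greedy randomised construction: repeatedly add one landmark drawn at
    # random from the few best cost-reducing candidates.
    n = len(signal)
    current = {rng.randrange(n)}
    current_cost = cost(signal, current, penalty)
    while True:
        scored = sorted(
            (cost(signal, current | {m}, penalty), m)
            for m in range(n) if m not in current
        )
        rcl = [item for item in scored[:rcl_size] if item[0] < current_cost]
        if not rcl:
            return current, current_cost
        current_cost, chosen = rng.choice(rcl)
        current = current | {chosen}

def neighbours(landmarks, n):
    # Landmark sets reachable by removing, shifting or adding one landmark.
    for l in sorted(landmarks):
        if len(landmarks) > 1:
            yield landmarks - {l}
        for m in (l - 1, l + 1):
            if 0 <= m < n and m not in landmarks:
                yield (landmarks - {l}) | {m}
    for m in range(n):
        if m not in landmarks:
            yield landmarks | {m}

def local_search(signal, current, current_cost, penalty):
    # First-improvement local search until no neighbour lowers the cost.
    improved = True
    while improved:
        improved = False
        for candidate in neighbours(current, len(signal)):
            c = cost(signal, candidate, penalty)
            if c < current_cost:
                current, current_cost, improved = candidate, c, True
                break
    return current, current_cost

def grasp(signal, penalty=1.0, iterations=10, seed=0):
    # Repeat construction + local search and keep the cheapest result.
    rng = random.Random(seed)
    best, best_cost = None, float("inf")
    for _ in range(iterations):
        current, current_cost = construct(signal, penalty, rng)
        current, current_cost = local_search(signal, current, current_cost, penalty)
        if current_cost < best_cost:
            best, best_cost = sorted(current), current_cost
    return best, best_cost

# Two flat plateaus: the search settles on one landmark per plateau.
landmarks, total = grasp([0, 0, 0, 0, 5, 5, 5, 5])
```

In this toy setting each point belongs to its nearest landmark, so the segments are contiguous by construction; a real trajectory would need a similarity measure over spatial and temporal features instead of the squared difference used here.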
... The trajectory segmentation algorithms proposed in recent years can be classified as supervised [9][10][11][12][13][14], unsupervised [15][16][17][18][19][20][21][22][23][24][25][26][27], and semisupervised [28]. ...
... The cost-function-based approach, which includes GRASP-UTS [23], segments the trajectory by minimizing a cost function. GRASP-UTS was proposed by Soares et al. in 2015. ...
... In this work, the harmonic mean (H) of average purity P and average coverage C is used to evaluate the proposed algorithm. The concepts of coverage and purity were first proposed in [23], and the harmonic mean (H) was used to evaluate trajectory segmentation algorithms in [19]. ...
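The harmonic mean mentioned in this excerpt is H = 2PC / (P + C). The short Python sketch below shows the arithmetic. The purity definition used here (share of the most frequent ground-truth label in a segment) is an assumption for illustration, since the excerpt does not define it, and the labels and the coverage value of 0.8 are made up.

```python
from collections import Counter

def segment_purity(labels):
    # Assumed definition: share of the most frequent label inside one segment.
    counts = Counter(labels)
    return max(counts.values()) / len(labels)

def harmonic_mean(purity, coverage):
    # H = 2PC / (P + C); returned as 0 when both inputs are 0.
    if purity + coverage == 0:
        return 0.0
    return 2 * purity * coverage / (purity + coverage)

# Hypothetical segments, each a list of ground-truth labels per point.
segments = [["walk", "walk", "walk", "bus"], ["bus", "bus"]]
avg_purity = sum(segment_purity(s) for s in segments) / len(segments)  # 0.875
h = harmonic_mean(avg_purity, 0.8)  # 0.8 is a made-up average coverage
```

Because the harmonic mean is dominated by the smaller of its two inputs, a segmentation cannot score well by maximising purity alone (for example, by cutting the trajectory into very many tiny segments).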
Article
With the development of wireless networks, location-based services (e.g., place-of-interest recommendation) play a crucial role in daily life. However, the acquired data are noisy and massive, which makes them difficult to mine with artificial intelligence algorithms. One of the fundamental problems of trajectory knowledge discovery is trajectory segmentation. Reasonable segmentation can reduce computing resources and improve storage effectiveness. In this work, we propose an unsupervised algorithm for trajectory segmentation based on multiple motion features (TS-MF). The proposed algorithm consists of two steps: segmentation and merging. The segmentation step uses the Pearson coefficient to measure the similarity of adjacent trajectory points and extracts the segmentation points from a global perspective. The merging step optimizes the minimum description length (MDL) value by merging local sub-trajectories, which can avoid excessive segmentation and improve the accuracy of trajectory segmentation. To demonstrate the effectiveness of the proposed algorithm, experiments are conducted on two real datasets. Evaluations of the algorithm’s performance in comparison with the state of the art indicate that the proposed method achieves the highest harmonic average of purity and coverage.
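A minimal Python sketch of the first step described above: the Pearson coefficient between the feature vectors of adjacent points, with a candidate split wherever the similarity is low. The feature layout (speed, acceleration, turning angle), the threshold of 0.5 and the function names are assumptions for illustration; the abstract does not specify them, and this is not the TS-MF implementation.

```python
import math

def pearson(u, v):
    # Pearson correlation coefficient between two equal-length vectors.
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumption: a constant vector counts as "not similar"
    return cov / (norm_u * norm_v)

def candidate_splits(features, threshold=0.5):
    # Index i + 1 starts a new segment when points i and i + 1 are dissimilar.
    return [
        i + 1
        for i in range(len(features) - 1)
        if pearson(features[i], features[i + 1]) < threshold
    ]

# Hypothetical per-point features: (speed, acceleration, turning angle).
features = [(1.0, 0.1, 5.0), (1.1, 0.2, 6.0), (9.0, 3.0, 1.0), (9.5, 3.2, 1.2)]
splits = candidate_splits(features)  # [2]
```

In practice the features would probably need to be normalised to comparable scales first, otherwise the feature with the largest range dominates the coefficient.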
... One of them is the preparation of input data for algorithms that require a unique data structure, such as a neural network [15]. The reason for segmentation can also be maximizing the homogeneity of data belonging to one segment [16], finding points of interest or hot spots in the trajectory [17], or finding patterns in the trajectory [18]. ...
... The Greedy Randomized Adaptive Search Procedure for Unsupervised Trajectory Segmentation (GRASP-UTS) [16] algorithm randomly selects the initial segmentation point. Then, the cost function is computed by using an adaptive greedy algorithm that modifies the number and positions of segmentation points. ...
Article
Identifying the distribution of users’ mobility is an essential part of transport planning and traffic demand estimation. With the increase in the usage of mobile devices, they have become a valuable source of traffic mobility data. Raw data contain only specific traffic information, such as position. To extract additional information such as transport mode, collected data need to be further processed. A trajectory needs to be divided into several meaningful consecutive segments according to some criteria to determine the transport mode change point. Existing algorithms for trajectory segmentation based on the transport mode change most often use predefined knowledge-based rules to create trajectory segments, i.e., rules based on a defined maximum pedestrian speed or the detection of a pedestrian segment between two consecutive transport modes. This paper aims to develop a method that segments a trajectory based on the transport mode change in real time without preassumed rules. Instead of rules, transition patterns are detected during the transition from one transport mode to another. Transition State Matrices (TSM) were used to automatically detect the transport mode change point in the trajectory. The developed method is based on the sensor data collected from mobile devices. After testing and validating the method, an overall accuracy of 98% and 96%, respectively, was achieved. As higher accuracy of trajectory segmentation means better and more homogeneous data, applying this method during the data collection adds additional value to the data.
... Transform: these algorithms are based on the definition of a series of points that mathematically generate a function that approximates the trajectory. [Flattened table residue: algorithms listed with their reference numbers and grouped under criterion headings such as Angle and Velocity; the entries include GRASP-UTS (ref. 23) and RGRASP-SemTS (ref. 24).] Probability: these algorithms use probabilities calculated by the algorithm itself to make the decision. Based on multiple criteria: these algorithms combine several of the above criteria to make the decision. ...
... Soares et al. 23 use MDL to avoid the use of thresholds. This solution selects N points randomly as representative points, similarly to a clustering approach, and automatically adjusts itself by means of the cost function formulated with MDL. ...
Article
With the continuous development and cost reduction of positioning and tracking technologies, a large number of trajectories are being exploited in multiple domains for knowledge extraction. A trajectory is formed by a large number of measurements, many of which are unnecessary to describe the actual trajectory of the vehicle, or even harmful due to sensor noise. This not only consumes large amounts of memory, but also makes the knowledge extraction process more difficult. Trajectory summarisation techniques can solve this problem, generating a smaller and more manageable representation and even semantic segments. In this comprehensive review, we explain and classify techniques for the summarisation of trajectories according to their search strategy and point evaluation criteria, describing connections with the line simplification problem. We also explain several special concepts in the trajectory summarisation problem. Finally, we outline the recent trends and best practices to continue the research on future summarisation algorithms.
... Trajectory segmentation is an essential part of the mobility data mining framework. Trajectory segmentation is the process of dividing a trajectory into segments on the basis of some criteria [24,10]. Extracting attributes and features from the segments in trajectory data mining is comparable to the classical data mining process of data segmentation. ...
... A trajectory is a group of spatio-temporal points located in space and time; different approaches can be used to split it into segments. Algorithms for trajectory segmentation include: CB-SMoT [20], WKMeans [15], GRASP-UTS [24], RGRASP-SemTS [10], SWS [6] and WSII [5]. In this work, we used the Wise Sliding Window Segmentation (WSII) algorithm, which proved to be more efficient than the other segmentation algorithms. ...
Thesis
A trajectory is a time-ordered sequence of geolocations of a moving object. With the advancement in geolocation technologies and their availability on new devices, it became easier to record any moving object's trajectory. However, the majority of trajectory datasets have no labels. A visual analytics platform for semantic annotation of trajectories (VISTA) was proposed to address this issue. The primary issue with the existing VISTA platform is that the manual annotation process is burdensome, i.e., the manual labeling process needs expert user involvement, and the labeling process is time-consuming and intensive. This project aims to implement a semi-automatic mechanism that overcomes the labor-intensive task of the manual trajectory annotation process and reduces the human involvement and effort with the platform. For the implementation of this new feature, we used two machine learning models. The first is the Wise Sliding Window Segmentation (WSII) algorithm, a segmentation strategy to partition trajectories into more similar subparts. The second is a labeling model that provides appropriate labels for the partitions suggested by WSII. The new semi-automated process works as follows. The user performs the manual annotation for the first trajectory. Onwards, the semi-automation pipeline handles the process and provides partitions and labels for the new trajectories until the user completes the annotation process for all trajectories. Implementing the new feature in the existing platform has significantly reduced human effort, interaction, and time with the platform.
... Some approaches start with an over-segmentation of the data stream, and proceed by removing false positive cuts (thus merging two segments) relying on geometrical or statistical properties of the segments [1,82,97,28,83,70]. Using the opposite strategy, some approaches start with a unique segment and perform further partitions, still based on geometric or statistical properties [123]. Other approaches look for movement repetitions within the time series to use as the basis for the segmentation [21]. ...
Thesis
Full and partial automation of Robotic Minimally Invasive Surgery holds significant promise to improve patient treatment, reduce recovery time, and reduce the fatigue of the surgeons. However, to accomplish this ambitious goal, a mathematical model of the intervention is needed. In this thesis, we propose to use Dynamic Movement Primitives (DMPs) to encode the gestures a surgeon has to perform to achieve a task. DMPs make it possible to learn a trajectory, thus imitating the dexterity of the surgeon, and to execute it while generalizing it both spatially (to new starting and goal positions) and temporally (to different speeds of execution). Moreover, they have other desirable properties that make them well suited for surgical applications, such as online adaptability, robustness to perturbations, and the possibility to implement obstacle avoidance. We propose various modifications to improve the state of the art of the framework, as well as novel methods to handle obstacles. Moreover, we validate the usage of DMPs to model gestures by automating a surgery-related task and using DMPs as the low-level trajectory generator. In the second part of the thesis, we introduce the problem of unsupervised segmentation of tasks' execution into gestures. We will introduce latent variable models to tackle the problem, proposing further developments to combine such models with the DMP theory. We will review the Auto-Regressive Hidden Markov Model (AR-HMM) and test it on surgery-related datasets. Then, we will propose a generalization of the AR-HMM to general, non-linear dynamics, showing that this results in a more accurate segmentation, with a less severe over-segmentation. Finally, we propose a further generalization of the AR-HMM that aims at integrating a DMP-like dynamic into the latent variable model.
... Although a rule-based method is easy to implement and relatively intuitive, it is not robust to noise. The cost-function-based approach partitions a trajectory by minimizing a specific cost function to build the most homogeneous segments, such as GRASP-UTS and TS-MF based on the minimum description length principle [12,20]. The sliding window-based approach determines where the moving object changed its behavior by deriving the local features within a fixed-size sliding window (e.g., OWS [21] and SWS [13]). ...
Article
Semantic place annotation can provide individual semantics, greatly helping the field of trajectory data mining. Most existing methods rely on annotated or external data and require retraining models following a region change, thus preventing their large-scale applications. Herein, we propose an unsupervised method denoted as UPAPP for the semantic place annotation of individual trajectories using spatiotemporal information. The Bayesian Criterion is specifically employed to decompose the spatiotemporal probability of visiting the candidate place into spatial probability, duration probability, and visiting time probability. Spatial information in two geospatial data sources is comprehensively integrated to calculate the spatial probability. In terms of the temporal probabilities, the Term Frequency–Inverse Document Frequency weighting algorithm is used to count the potential visits to different place types in the trajectories and to generate the prior probabilities of the visiting time and duration. Finally, the spatiotemporal probability of the candidate place is combined with the importance of the place category to annotate the visited places. Experimental results on a trajectory dataset collected by 709 volunteers in Beijing showed that our method achieved an overall and average accuracy of 0.712 and 0.720, respectively, indicating that the visited places can be annotated accurately without any annotated data.
... Although a rule-based method is easy to implement and relatively intuitive, it is not robust to noise. A cost-function-based approach partitions a trajectory to build the most homogeneous segments by minimizing a cost function, such as GRASP-UTS and TS-MF based on the minimum description length principle [10,19]. A sliding-window-based approach determines where the moving object changed its behavior within a fixed-size sliding window (kernel) by deriving the local features or interpolation (e.g., OWS [20] and SWS [11]). ...
Preprint
Semantic place annotation can provide individual semantics, which can be of great help in the field of trajectory data mining. Most existing methods rely on annotated or external data and require retraining following a change of region, thus preventing their large-scale applications. Herein, we propose an unsupervised method denoted as UPAPP for the semantic place annotation of trajectories using spatiotemporal information. The Bayesian Criterion is specifically employed to decompose the spatiotemporal probability of the candidate place into spatial probability, duration probability, and visiting time probability. Spatial information in ROI and POI data is subsequently adopted to calculate the spatial probability. In terms of the temporal probabilities, the Term Frequency Inverse Document Frequency weighting algorithm is used to count the potential visits to different place types in the trajectories and to generate the prior probabilities of the visiting time and duration. The spatiotemporal probability of the candidate place is then combined with the importance of the place category to annotate the visited places. Validation with a trajectory dataset collected by 709 volunteers in Beijing showed that our method achieved an overall and average accuracy of 0.712 and 0.720, respectively, indicating that the visited places can be annotated accurately without any external data.
... Besides these methodologies, several other solutions to the trajectory segmentation problem have been proposed in the literature, yet with objectives different from ours. For example, cost-function-based strategies were presented in [24,25], while clustering-based ones are introduced in [29,30], and a method based on interpolation kernels is described in [10,11]. All these approaches are more focused on splitting a movement into homogeneous parts, rather than discovering significant stops, which is the purpose of this paper. ...
Article
Identifying the portions of trajectory data where movement ends and a significant stop starts is a basic, yet fundamental task that can affect the quality of any mobility analytics process. Most of the many existing solutions adopted by researchers and practitioners are simply based on fixed spatial and temporal thresholds stating when the moving object remained still for a significant amount of time, yet such thresholds remain as static parameters for the user to guess. In this work we study the trajectory segmentation from a multi-granularity perspective, looking for a better understanding of the problem and for an automatic, user-adaptive and essentially parameter-free solution that flexibly adjusts the segmentation criteria to the specific user under study and to the geographical areas they traverse. Experiments over real data, and comparison against simple and state-of-the-art competitors show that the flexibility of the proposed methods has a positive impact on results.
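For context, the fixed-threshold baseline that this abstract argues against can be sketched in a few lines of Python. This is a generic illustration, not the method proposed in the paper: the point layout `(t, x, y)`, the parameter names `max_radius` and `min_duration`, and the anchor-based rule are all assumptions.

```python
import math

def detect_stops(points, max_radius, min_duration):
    # points: list of (t, x, y) tuples sorted by time t.
    # A stop is a run of points that stay within max_radius of the first
    # point of the run for at least min_duration time units.
    # Returns (start_index, end_index) pairs, both inclusive.
    stops = []
    i, n = 0, len(points)
    while i < n:
        j = i + 1
        while j < n and math.dist(points[i][1:], points[j][1:]) <= max_radius:
            j += 1
        if points[j - 1][0] - points[i][0] >= min_duration:
            stops.append((i, j - 1))
            i = j  # continue after the stop
        else:
            i += 1
    return stops

# Hypothetical track: three nearby fixes over 120 s, then fast movement.
track = [(0, 0, 0), (60, 1, 0), (120, 0, 1), (180, 50, 50), (240, 100, 100)]
stops = detect_stops(track, max_radius=5, min_duration=100)  # [(0, 2)]
```

The two thresholds are exactly the static parameters the abstract refers to: the same values are applied to every user and every area, regardless of sampling rate or local mobility habits.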
... There is also the potential for the user to define regions that could be areas of specific interest for the operator [44]. Automatically finding the spatial regions could be achieved using strategies that try to divide a trajectory into multiple meaningful subtrajectories in unsupervised [45,46] or semisupervised [47] ways by applying minimum description length (MDL) or sliding window segmentation (SWS) techniques. In this work, we created several spatial regions between the two ports being analyzed. ...
Article
With the recent increase in the use of sea transportation, the importance of maritime surveillance for detecting unusual vessel behavior related to several illegal activities has also risen. Unfortunately, the data collected by surveillance systems are often incomplete, creating a need for the data gaps to be filled using techniques such as interpolation methods. However, such approaches do not decrease the uncertainty of ship activities. Depending on the frequency of the data generated, they may even confuse operators, inducing errors when evaluating ship activities and tagging them as unusual. Using domain knowledge to classify activities as anomalous is essential in the maritime navigation environment since there is a well-known lack of labeled data in this domain. In an area where identifying anomalous trips is a challenging task using solely automatic approaches, we use visual analytics to bridge this gap by utilizing users’ reasoning and perception abilities. In this work, we propose a visual analytics tool that uses spatial segmentation to divide trips into subtrajectories and score them. These scores are displayed in a tabular visualization where users can rank trips by segment to find local anomalies. The amount of interpolation in subtrajectories is displayed together with scores so that users can use both their insight and the trip displayed on the map to determine if the score is reliable.
Chapter
Recent improvements in positioning technology have led to massive amounts of moving-object data. A crucial task is to find the moving objects that travel together. Usually, they are called spatio-temporal patterns. Due to the emergence of many different kinds of spatio-temporal patterns in recent years, different approaches have been proposed to extract them. However, each approach only focuses on mining a specific kind of pattern. In addition to the fact that it is a painstaking task due to the large number of algorithms used to mine and manage patterns, it is also time-consuming. Additionally, we have to execute these algorithms again whenever new data are added to the existing database. To address these issues, we first redefine spatio-temporal patterns in the itemset context. Secondly, we propose a unifying approach, named GeT_Move, using a frequent closed itemset-based spatio-temporal pattern-mining algorithm to mine and manage different spatio-temporal patterns. GeT_Move is implemented in two versions, which are GeT_Move and Incremental GeT_Move. Experiments are performed on real and synthetic datasets and the results show that our approaches are very effective and outperform existing algorithms in terms of efficiency.
Article
Mobility and spatial interaction data have become increasingly available due to the wide adoption of location‐aware technologies. Examples of mobility data include human daily activities, vehicle trajectories, and animal movements, among others. In this article we focus on a special type of mobility data, i.e. origin‐destination pairs, and present a new approach to the discovery and understanding of spatio‐temporal patterns in the movements. Specifically, to extract information from complex connections among a large number of point locations, the approach involves two steps: (1) spatial clustering of massive GPS points to recognize potentially meaningful places; and (2) extraction and mapping of the flow measures of clusters to understand the spatial distribution and temporal trends of movements. We present a case study with a large dataset of taxi trajectories in Shenzhen, China to demonstrate and evaluate the methodology. The contribution of the research is two‐fold. First, it presents a new methodology for detecting location patterns and spatial structures embedded in origin‐destination movements. Second, the approach is scalable to large data sets and can summarize massive data to facilitate pattern extraction and understanding.
Article
The knowledge of the transportation mode used by humans (e.g. bicycle, on foot, car and train) is critical for travel behaviour research, transport planning and traffic management. Nowadays, new technologies such as the Global Positioning System have replaced traditional survey methods (paper diaries, telephone) because they are more accurate and problems such as under-reporting are avoided. However, although the movement data collected (timestamped positions in digital form) have generally high accuracy, they do not contain the transportation mode. We present in this article a new method for segmenting movement data into single-mode segments and for classifying them according to the transportation mode used. Our fully automatic method differs from previous attempts for five reasons: (1) it relies on fuzzy concepts found in expert systems, that is, membership functions and certainty factors; (2) it uses OpenStreetMap data to help the segmentation and classification process; (3) we can distinguish between 10 transportation modes (including between tram, bus and car) and propose a hierarchy; (4) it handles data with signal shortages and noise, and other real-life situations; (5) in our implementation, there is a separation between the reasoning and the knowledge, so that users can easily modify the parameters used and add new transportation modes. We have implemented the method and tested it with a 17-million point data set collected in the Netherlands and elsewhere in Europe. The accuracy of the classification with the developed prototype, determined by comparing the classified results with reference data derived from manual classification, is 91.6%.
Article
Place-oriented analysis of movement data, i.e., recorded tracks of moving objects, includes finding places of interest in which certain types of movement events occur repeatedly and investigating the temporal distribution of event occurrences in these places and, possibly, other characteristics of the places and links between them. For this class of problems, we propose a visual analytics procedure consisting of four major steps: 1) event extraction from trajectories; 2) extraction of relevant places based on event clustering; 3) spatiotemporal aggregation of events or trajectories; 4) analysis of the aggregated data. All steps can be fulfilled in a scalable way with respect to the amount of the data under analysis; therefore, the procedure is not limited by the size of the computer's RAM and can be applied to very large data sets. We demonstrate the use of the procedure by example of two real-world problems requiring analysis at different spatial scales.
Chapter
An important problem in the study of moving objects is the identification of stops. This problem becomes more difficult due to error-prone recording devices. We propose a method that discovers stops in a trajectory that contains artifacts, namely movements that did not actually take place but correspond to recording errors. Our method is an interactive density-based clustering algorithm, for which we define density on the basis of both the spatial and the temporal properties of a trajectory. The interactive setting allows the user to tune the algorithm and to study the stability of the anticipated stops.
Article
Development in techniques of spatial data acquisition enables us to easily record the trajectories of moving objects. Movement of human beings, animals, and birds can be captured by GPS loggers. The obtained data are analyzed by visualization, clustering, and classification to detect patterns frequently or rarely found in trajectories. To extract a wider variety of patterns in analysis, this article proposes a new method for analyzing trajectories on a network space. The method first extracts primary routes as subparts of trajectories. The topological relations among primary routes and trajectories are visualized as both a map and a graph‐based diagram. They permit us to understand the spatial and topological relations among the primary routes and trajectories at both global and local scales. The graph‐based diagram also permits us to classify trajectories. The representativeness of primary routes is evaluated by two numerical measures. The method is applied to the analysis of daily travel behavior of one of the authors. Technical soundness of the method is discussed as well as empirical findings.
Article
Many devices generate large amounts of data that follow some sort of sequentiality, e.g., motion sensors, e-pens, eye trackers, etc. and often these data need to be compressed for classification, storage, and/or retrieval tasks. Traditional clustering algorithms can be used for this purpose, but unfortunately they do not cope with the sequential information implicitly embedded in such data. Thus, we revisit the well-known K-means algorithm and provide a general method to properly cluster sequentially-distributed data. We present Warped K-Means (WKM), a multi-purpose partitional clustering procedure that minimizes the sum of squared error criterion, while imposing a hard sequentiality constraint in the classification step. We illustrate the properties of WKM in three applications, one being the segmentation and classification of human activity. WKM outperformed five state-of-the-art clustering techniques to simplify data trajectories, achieving a recognition accuracy of near 97%, which is an improvement of around 66% over their peers. Moreover, such an improvement came with a reduction in the computational cost of more than one order of magnitude.
Article
A new method for encoding a videoconference image sequence, termed adaptive neural net vector quantisation (ANNVQ), has been derived. It is based on Kohonen's self-organised feature maps, a neural network type clustering algorithm. The new method differs from it, in that after training the initial codebook, a modified form of adaptation resumes, in order to respond to scene changes and motion. The main advantages are high image quality with modest bit rates and effective adaptation to motion and scene changes, with the capability to quickly adjust the instantaneous bit rate in order to keep the image quality constant. This is a good match to packet switched networks where variable bit rate and uniform image quality are highly desirable. Simulation experiments have been carried out with 4 × 4 blocks of pixels from an image sequence consisting of 20 frames of size 112 × 96 pixels each. With a codebook size of 512, ANNVQ results in high image quality upon image reconstruction, with peak signal-to-noise ratio (PSNR) of about 36 to 37 dB, at coding bit rates of about 0.50 bit/pixel. This compares quite favourably with classical vector quantisation at a similar bit rate. Moreover, this value of PSNR remains approximately constant, even when encoding image frames with considerable motion.