A Trajectory Segmentation Algorithm Based on
Interpolation-based Change Detection Strategies
Mohammad Etemad
Institute for Big Data Analytics
Halifax, NS
etemad@dal.ca
Amílcar Soares
Institute for Big Data Analytics
Halifax, NS
amilcar.soares@dal.ca
Arazoo Hoseyni
Institute for Big Data Analytics
Halifax, NS
a.hoseyni@dal.ca
Jordan Rose
Institute for Big Data Analytics
Halifax, NS
jordanrose@dal.ca
Stan Matwin
Institute for Big Data Analytics
Polish Academy of Sciences
Halifax, NS
stan@cs.dal.ca
ABSTRACT
Trajectory mining is a research field which aims to provide fundamental insights into decision-making tasks related to moving objects. One of the fundamental pre-processing steps for trajectory mining is its segmentation, where a raw trajectory is divided into several meaningful consecutive sub-sequences. In this work, we propose an unsupervised trajectory segmentation algorithm named Octal Window Segmentation (OWS) that is based on processing an error signal generated by measuring the deviation of the middle point of an octal window. The algorithm we propose is flexible and can be applied to different domains by selecting an appropriate interpolation kernel. We examined our algorithm on two datasets from different domains. The experiments show that the proposed algorithm achieved a cross-validated harmonic mean of purity and coverage of more than 93% on both datasets. We also show that OWS obtained statistically significantly higher results than a baseline for unsupervised trajectory segmentation.
1 INTRODUCTION
Processing traces of people, vehicles, vessels, and animals has been the focus of attention in the academic and industry sectors. These traces of moving objects are called trajectory data and can be informally defined as a consecutive sequence of the geolocations of a moving object. Transportation mode detection [6], fishing detection [4], tourism [7], environmental science [18], and traffic dynamics [2, 5, 20] are a few examples of domains where trajectory mining methods can be applied.
One of the fundamental trajectory mining tasks is segmentation, i.e., splitting raw trajectories into sub-trajectories. Trajectory segmentation is a fundamental task since the method influences the features representing each trajectory. An accurate segmentation method may provide higher quality features that better represent the moving object behavior. The segmentation task is therefore based on methods capable of distinguishing the homogeneous or similar parts of a trajectory based on some criteria. Three cases can be distinguished: supervised, unsupervised, and semi-supervised trajectory segmentation. Unsupervised methods use only the raw trajectory as input, while supervised methods use labels available in training data to extract some knowledge and use this knowledge as criteria to generate the sub-trajectories.
© 2019 Copyright held by the owner/author(s). Published in the Workshop Proceedings of the EDBT/ICDT 2019 Joint Conference on CEUR-WS.org, March 26, 2019. Distribution of this paper is permitted under the terms of the Creative Commons license CC-by-nc-nd 4.0.
Finally, semi-supervised methods use a combination of both labeled and unlabeled data as a criterion. Although efforts to create labeled trajectory datasets [8, 23] can be found in the literature, the majority of trajectory datasets do not contain such labels. Therefore, this work focuses on the development of unsupervised methods for trajectory segmentation.
Since trajectory data is usually large and has all the characteristics of Big Data, i.e., volume, velocity, variety, veracity, and value, presenting a fast and accurate segmentation method is of prime importance. In this research, we investigate the topic of trajectory segmentation and propose an unsupervised segmentation method that can generate high-quality segments. The intuition behind our approach is that when a moving object changes its behavior, this shift may be detected using only its geolocation over time. Unlike previous methods that use speed variations [14], direction variation [15], or a combination of many features [8, 9, 11, 13, 16], this work focuses on finding these changes in behavior only from the object's coordinates, using interpolation methods to generate an error signal. This error signal is then used as a criterion to split the trajectories into sub-trajectories. Our method can be customized to a domain by using different kernel interpolation methods. The contributions of this paper are (i) the proposal of an unsupervised trajectory segmentation method named OWS, (ii) a comparison of OWS and a baseline regarding performance and execution time, and (iii) a comparison of different kernel interpolations for datasets of different domains.
The rest of this paper is organized as follows. We review the algorithms for trajectory segmentation and interpolation in Section 2. In Section 3, the definitions used throughout this paper and the OWS algorithm are detailed. The experiments (e.g., metrics, datasets, hyperparameter tuning, and results) and their analysis are detailed in Section 4. Finally, Section 5 provides the conclusions obtained from this work and future work that may be conducted.
2 RELATED WORK
Trajectory segmentation methods such as TRACLUS [10], W-KMeans [11], SMoT [1], CB-SMoT [14], GRASP-UTS [16], and RGRASP-SemTS [9] have been proposed to segment trajectory data. These methods are briefly reviewed in Section 2.1. Since the proposed method can be customized for a domain by selecting different interpolation methods as its kernel, we review some major interpolation methods in Section 2.2.
2.1 Trajectory segmentation
Trajectory segmentation methods can be divided into three categories regarding the input data of the algorithm: (i) unsupervised, (ii) supervised, and (iii) semi-supervised.
Unsupervised methods use only raw trajectory data as input and compute a set of features from it. This family of methods considers the similarity among features in the neighborhood of a sequence to create a set of sub-trajectories [11, 16, 21]. Supervised methods use labels available in training data to extract some knowledge and use it as criteria to generate the sub-trajectories [3, 14, 23]. Finally, semi-supervised methods use a combination of both labeled and unlabeled data as a criterion. RGRASP-SemTS is an example of such a method [9].
A trajectory segmentation method can use a cost function or clustering to create sub-trajectories. GRASP-UTS, RGRASP-SemTS, and W-KMeans are examples of cost-function-based methods, while TRACLUS, SMoT, and CB-SMoT are examples of clustering-based methods.
A quantitative comparison between some of the aforementioned methods is given in [16]. The authors reported higher performance for GRASP-UTS in comparison to W-KMeans and CB-SMoT. The highest segmentation performance shown in [16] was a purity of 91.37% and coverage of 83.00% on the fishing vessels dataset, and a purity of 90.57% and coverage of 83.47% on the hurricanes dataset (detailed in Section 4.1). In this work, we repeated the experiments in the same environment and showed that our proposed method obtained better results when compared to GRASP-UTS.
2.2 Interpolation
Sometimes it is necessary to resample trajectory data due to signal loss. Calculating the geolocation for a timestamp at which the geolocation is missing is called interpolation. Different methods such as linear, random walk, Bézier curve, Catmull-Rom, and kinematic path have been introduced to calculate the geolocation of these missing points. An interpolation method can be useful for one domain and useless for others. For example, random walk interpolation can be useful to interpolate wild animal behavior [17]. Bézier curve interpolation can be useful for moving objects in fluid environments [19]. Hermite and spline interpolation can be useful for AIS data (trajectories of vessels) [22], and kinematic interpolation is useful for transportation [12] or fast-moving objects. Linear interpolation is the simplest and most popular interpolation method. In this method, the missing location is calculated so that it sits on a straight line between two available points. Cubic and kinematic methods calculate the speed and acceleration of the moving object at each point of the octal window to interpolate the missing position. We implemented random walk, kinematic, cubic, and linear interpolation as kernels for the proposed segmentation algorithm with the objective of exploring their results on different trajectory datasets.
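To make the linear case concrete, the following is a minimal Python sketch of linear interpolation between two known points. It is an illustration under stated assumptions, not the implementation used in the experiments: the function name `linear_interpolate` is ours, a point is represented as a `(longitude, latitude, timestamp)` tuple, and longitude and latitude are treated as planar coordinates, which is only reasonable over short distances.

```python
from typing import Tuple

# A trajectory point as (x = longitude, y = latitude, t = timestamp).
Point = Tuple[float, float, float]


def linear_interpolate(p_a: Point, p_b: Point, t: float) -> Point:
    """Place the missing point on the straight line through p_a and p_b."""
    (xa, ya, ta), (xb, yb, tb) = p_a, p_b
    if tb == ta:
        raise ValueError("the two known points must have different timestamps")
    # Fraction of the time interval [ta, tb] that has elapsed at time t.
    w = (t - ta) / (tb - ta)
    return (xa + w * (xb - xa), ya + w * (yb - ya), t)


# A point requested halfway in time lies halfway along the line.
print(linear_interpolate((0.0, 0.0, 0.0), (2.0, 4.0, 10.0), 5.0))  # (1.0, 2.0, 5.0)
```

A timestamp outside the interval between the two known points gives a weight outside the range from 0 to 1, so the same formula extends the line beyond them.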
3 THE TRAJECTORY SEGMENTATION METHOD
In this section, we detail our novel algorithm for unsupervised trajectory segmentation named Octal Window Segmentation (OWS). We first introduce the definitions used to describe the algorithm (Section 3.1). Afterwards, we detail the OWS algorithm step by step in Section 3.2.
3.1 Definitions
A trajectory point ($p_i$) is defined as $p_i = (x_i, y_i, t_i)$, where $x_i$ is the longitude, $y_i$ is the latitude, and $t_i$ ($t_i < t_{i+1}$) is the capturing time of the moving object. A raw trajectory ($\tau_n$) is a sequence of trajectory points captured through time, $\tau_n = (p_i, p_{i+1}, \ldots, p_n)$, with $p_i \in \tau_n$ and $i \le n$.
A segment or sub-trajectory is a subsequence of a raw trajectory generated by splitting it into two or more sub-sequences. For example, if we have one split point, $k$, and $\tau_n$ is a raw trajectory, then $s_1 = (p_i, p_{i+1}, \ldots, p_k)$ and $s_2 = (p_{k+1}, p_{k+2}, \ldots, p_n)$ are two sub-trajectories generated from $\tau_n$. The process of generating sub-trajectories from a raw trajectory is called segmentation.
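The definition of a segment can be sketched in Python as follows. This is an illustration only: the helper `split_trajectory` is a hypothetical name, a trajectory is a plain list of `(x, y, t)` tuples, and split points are given as 0-based list positions, where the point at a split position closes the segment it belongs to.

```python
def split_trajectory(trajectory, split_indexes):
    """Split a list of points into consecutive sub-trajectories.

    Each index k in split_indexes is the (0-based) position of the last
    point of a segment, so the next segment starts at position k + 1.
    """
    segments, start = [], 0
    for k in sorted(split_indexes):
        segments.append(trajectory[start:k + 1])
        start = k + 1
    segments.append(trajectory[start:])
    return segments


# Five points and one split point: s1 holds three points, s2 holds two.
raw = [(0.0, 0.0, 0), (0.1, 0.0, 1), (0.2, 0.0, 2), (0.2, 0.1, 3), (0.2, 0.2, 4)]
s1, s2 = split_trajectory(raw, [2])
print(len(s1), len(s2))  # 3 2
```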
An octal window ($S_{ow}$) is a sub-trajectory with seven trajectory points, in which new trajectory points are created using interpolation techniques. We define $S_{ow} = (p_1, p_2, p_3, p_4, p_5, p_6, p_7)$ so that the points $p_i$ are time-ordered. The indexes are relative to each window so that it can slide over a raw trajectory and represent different windows.
The decision to use seven trajectory points in a window was motivated by the fact that it is necessary to use at least three points to interpolate (predict) a trajectory point preceding or following them. Calculating acceleration in kinematic interpolation requires at least three points. Since we interpolate both forward and backward, we use the three points at the beginning of the window to predict forward the position of the fourth point, and the three points at the end of the window to predict backward another fourth point, which is expected to be very close to the result of the forward interpolation (see details in Section 3.2). We then use these two interpolated positions to create a midpoint and calculate its geographical distance from $p_4$. Since we have two sets of four points each, and both sets share the middle point $p_4$, the minimum number of required trajectory points is seven. Increasing the length of the octal window can possibly improve the results; however, the objective of this work is to use as little memory as possible. We use the term current octal window to refer to the window being processed by our procedure at a given moment. After processing a window, we slide the window by one point along the trajectory and process the next window.
3.2 Octal Window Segmentation Algorithm
The intuition behind our algorithm is that when a moving object changes from one behavior to another, this can be captured directly from its geolocation. To obtain an estimated position, i.e., where the moving object is supposed to be if its behavior does not change, we use interpolation methods. Afterwards, we compare the real position of the moving object with the estimated one, creating an error signal. By evaluating this error signal, it is possible to estimate whether the moving object changed its behavior in a region and to use this information to create sub-trajectories.
The rst procedure that composes OWS unsupervised trajec-
tory segmentation algorithm is detailed in Algorithm 1. This
procedure creates an error signal by sliding the octal window
over a raw trajectory τn.
Algorithm 1 Generate Error Signal
Require: τ_n - the raw trajectory
1: E ← {}
2: E.append([0, 0, 0])
3: for (i = 3; i < n − 3; i++) do
4:     Create octal window S_ow = (p_{i−3}, ..., p_{i+3})
5:     p_F ← interpolate forward S_ow
6:     p_B ← interpolate backward S_ow
7:     p_C ← extract midpoint from p_F and p_B
8:     ϵ_i ← Haversine(p_i, p_C)
9:     E.append(ϵ_i)
10: end for
11: E.append([0, 0, 0])
12: return E

The procedure starts with an array of error signals ($E$) in line 1. In line 2, the empty signal set $[0, 0, 0]$ is added to the list and represents the error for the first three points of the raw trajectory. The algorithm explores all the octal windows in lines 3 to 10 as follows. First, the current octal window is created (line 4). The forward interpolation is calculated in line 5. In this method, we assume that $p_i$ (the middle point of the current octal window, $p_4$ in window-relative indexes) is missing and will be interpolated using points $p_1, p_2, p_3$. The interpolated point at time $t_i = t_3 + \frac{t_5 - t_3}{2}$ is called $p_F$. Afterwards, the backward interpolation is calculated (line 6). In this method, it is also assumed that $p_i$ in the current octal window is missing. However, we reverse the order of the points so that points $p_7, p_6, p_5$ are used to interpolate the point $p_i$ at time $t_i = t_5 - \frac{t_5 - t_3}{2}$, and the procedure calls it $p_B$. In line 7, we use the geolocations of $p_F$ and $p_B$ to calculate a midpoint ($p_C$). The error signal $\epsilon_i$ is finally computed in line 8, and it is obtained by calculating the haversine distance between $p_i$ and $p_C$. Symmetrically to line 2, line 11 adds $[0, 0, 0]$ for the last three points of the raw trajectory.
Figure 1: An example of an error signal calculation for an octal sliding window $S_{ow}$.
Figure 1 shows the interpolated positions $p_B$ and $p_F$ as red points, $p_i$ as a green point, and $p_C$ as a yellow point. In the example of Figure 1, the haversine distance from the estimated position $p_C$ to the real position $p_i$ is visible. This may indicate that the moving object's behavior has changed at position $p_i$.
In Figure 2, an example of an error signal generated by Algo-
rithm 1 is shown. A raw trajectory with around 150 trajectory
points was used in this example. As can be seen in Figure 2, there
are several trajectory points (e.g., around trajectory point 95, or
around trajectory point 123) along the raw trajectory where the
estimated positions were far from the actual reported positions
by the moving object.
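As an illustration only, the procedure in Algorithm 1 can be sketched in Python as follows. The sketch rests on several assumptions of ours and is not the implementation used in the experiments: the kernel is a simple linear one that extends the straight line through the two points nearest to the middle point, the midpoint is the arithmetic mean of the two interpolated coordinates, the padding zeros are stored as three scalar values so that the signal has one entry per trajectory point, and trajectories shorter than seven points yield an all-zero signal. The names `haversine`, `extrapolate`, and `generate_error_signal` are hypothetical.

```python
import math


def haversine(p, q, radius_km=6371.0):
    """Great-circle distance in km between two (lon, lat, t) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def extrapolate(p_a, p_b, t):
    """Assumed linear kernel: continue the straight line p_a -> p_b to time t."""
    w = (t - p_a[2]) / (p_b[2] - p_a[2])
    return (p_a[0] + w * (p_b[0] - p_a[0]), p_a[1] + w * (p_b[1] - p_a[1]), t)


def generate_error_signal(traj):
    """Slide a seven-point window over traj and return one error per point."""
    n = len(traj)
    if n < 7:                       # no complete window fits (our convention)
        return [0.0] * n
    errors = [0.0, 0.0, 0.0]        # first three points have no complete window
    for i in range(3, n - 3):
        w = traj[i - 3:i + 4]       # w[0] .. w[6] play the role of p1 .. p7
        t_mid = w[2][2] + (w[4][2] - w[2][2]) / 2
        p_f = extrapolate(w[1], w[2], t_mid)    # forward, from p2 and p3
        p_b = extrapolate(w[5], w[4], t_mid)    # backward, from p6 and p5
        p_c = ((p_f[0] + p_b[0]) / 2, (p_f[1] + p_b[1]) / 2, t_mid)
        errors.append(haversine(w[3], p_c))     # distance from the real p4
    errors.extend([0.0, 0.0, 0.0])  # last three points have no complete window
    return errors
```

With this kernel, a trajectory that moves at constant speed along a straight line produces an error signal that stays close to zero, while the points next to a sharp turn produce clearly positive values.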
Figure 2: An example of an error signal calculation for an octal sliding window $S_{ow}$.

Algorithm 2 Octal Window Segmentation
Require: ϵ - min. error value to split a trajectory
1: E ← Generate Error Signal(τ_n)
2: first ← 0
3: q ← [(first, n)]
4: p ← ∅
5: while q ≠ ∅ do
6:     t ← q.pop()
7:     curr ← E[t[0] : t[1]]
8:     m ← max(curr)
9:     if m > ϵ then
10:         idx ← index(curr == m)
11:         if len(idx) == 1 then
12:             q.append((t[0], t[0] + idx[0]))
13:             q.append((t[0] + idx[0] + 1, t[1]))
14:         else
15:             ixx ← group index(idx)
16:             for all g ∈ ixx do
17:                 q.append((first, ixx[g]))
18:                 first = ixx[g]
19:             end for
20:             q.append((first, t[1]))
21:         end if
22:     else
23:         p.append(t)
24:     end if
25: end while
26: return p

The OWS algorithm is detailed in Algorithm 2, which receives as input a single $\epsilon$ value. The intuition of how this algorithm works is that segments are created at partitioning positions where the error values from $E$ are higher than the $\epsilon$ value. These partitioning positions are stored as a list of tuples with the indexes where segments start and end. Algorithm 2 starts by creating the error signal $E$, which is the output of the procedure in Algorithm 1. In lines 2 and 3, the algorithm initializes the variable $first$ with the value 0, which represents the starting index of the trajectory, creates the first tuple $(first, n)$, which represents the entire trajectory, and adds it to a list $q$. In line 4, the final variable $p$, which will hold all the partitioning position tuples, is declared as an empty list. While the list $q$ is not empty, lines 6 to 24 are executed. First, the algorithm takes the next element $t$ from the list $q$ (line 6), which in the first run is the full trajectory, creates a list $curr$ with the error values from $E$ in the range given by $t$ (line 7), and gets its maximal error value $m$ (line 8). If this maximal error value is greater than the threshold, the index of $m$ is retrieved, and two new tuples are created if there is a single position with value $m$ (lines 11 to 13). The new tuples are stored in $q$ and are analyzed in the next iterations of the algorithm, which will look for other error values higher than the $\epsilon$ threshold. If there is more than one partitioning position with a value equal to $m$ (lines 14 to 21), tuples are created at every position that satisfies this criterion. This procedure runs until tuples have been created for all partitioning positions where the error values are greater than the error threshold $\epsilon$. In the last step, i.e., if $m \le \epsilon$, tuple $t$ is appended to the final list $p$.
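The recursive splitting described for Algorithm 2 can be sketched in Python as follows. This is a simplified reading under our own assumptions, not the authors' implementation: the error signal is passed in as a plain list of numbers, a range is a `(start, end)` pair with an exclusive end, the point that holds the maximum error is left out of both new ranges (as in lines 12 and 13), ties are handled by cutting at every position that holds the maximum instead of the grouping step of line 15, and empty ranges are skipped. The name `ows_segments` is hypothetical.

```python
def ows_segments(errors, epsilon):
    """Return sorted (start, end) ranges, end exclusive, whose errors are <= epsilon."""
    pending = [(0, len(errors))]     # q: ranges that still need to be examined
    partitions = []                  # p: ranges accepted as final segments
    while pending:
        start, end = pending.pop()
        curr = errors[start:end]
        if not curr:                 # guard against empty ranges (our addition)
            continue
        m = max(curr)
        if m > epsilon:
            # Cut at every position that holds the maximum error value.
            cuts = [start + j for j, e in enumerate(curr) if e == m]
            first = start
            for c in cuts:
                pending.append((first, c))
                first = c + 1
            pending.append((first, end))
        else:
            partitions.append((start, end))
    return sorted(partitions)


signal = [0, 0, 0, 1, 2, 9, 1, 0, 0, 8, 0, 0, 0]
print(ows_segments(signal, epsilon=5))   # [(0, 5), (6, 9), (10, 13)]
```

In this example the two peaks above the threshold, at positions 5 and 9, become the partitioning positions, and a threshold above every error value returns the whole trajectory as one segment.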
4 EXPERIMENTS
This section details the metrics and datasets (Section 4.1) and the algorithms' parameter selection procedure (Section 4.2). Finally, the analysis of the interpolation methods used with the OWS algorithm is detailed in Section 4.3, and Section 4.4 shows a comparison between our OWS strategy and a baseline segmentation algorithm.
4.1 Metrics and datasets
Since our method is classified as an unsupervised method, clustering metrics such as purity, coverage, and the harmonic mean of purity and coverage are proper evaluation metrics. In this work, we have used the metrics named average purity and average coverage. They were first introduced in the context of trajectory segmentation in the work of [16]. These two metrics were designed to be orthogonal, i.e., when one tends to increase, the other tends to decrease. Therefore, we defined the harmonic mean ($H$) of average purity ($P$) and average coverage ($C$), $H = \frac{2PC}{P + C}$, as the primary metric for our analysis and to simplify the plots and the comparison of the segmentation algorithms.
The purity of a segment is defined as follows. Assume the set of all target labels in a segment is $L$, with $k$ trajectory points. The majority label, $p_d \in L$, is the label of the majority of trajectory points in the segment, and the number of occurrences of $p_d$ is $p$. Therefore, the purity of a segment is $\frac{p}{k}$. The average of the purity values for all segments generated by a trajectory segmentation algorithm is called average purity, $P$. The coverage of a segment can be calculated using a segment identifier ($sid$) from the segments found by the segmentation algorithm. Assuming $\sigma_m$ is a segment that was supposed to be found by a segmentation algorithm, it is possible to verify, for every segment found by the algorithm, the most frequent $sid$ by $\frac{sid}{m}$. The average coverage of all generated segments is then called $C$. The more over-segmented a trajectory is, the higher the purity values are expected to be. However, lower values for coverage will be computed when the segmentation algorithm finds a large number of segments. The same conflict occurs if the trajectory is under-segmented, i.e., the purity values tend to decrease, but the coverage tends to have a higher value.
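The three metrics can be sketched in Python as follows. This reflects our reading of the definitions above and is not the evaluation code used in the experiments: every trajectory point carries a found segment identifier, an expected segment identifier, and a target label, coverage is averaged over the expected segments, and the function names are hypothetical.

```python
from collections import Counter


def average_purity(found_ids, true_labels):
    """Mean over found segments of (majority-label count / segment size)."""
    by_segment = {}
    for sid, label in zip(found_ids, true_labels):
        by_segment.setdefault(sid, []).append(label)
    scores = [Counter(lbls).most_common(1)[0][1] / len(lbls)
              for lbls in by_segment.values()]
    return sum(scores) / len(scores)


def average_coverage(found_ids, true_ids):
    """Mean over expected segments of (most frequent found sid count / size)."""
    # Same computation as purity with the roles of the two labelings swapped.
    return average_purity(true_ids, found_ids)


def harmonic_mean(p, c):
    """H = 2PC / (P + C), defined as 0 when both inputs are 0."""
    return 2 * p * c / (p + c) if p + c else 0.0


# Eight points: two expected segments, and a found segmentation that cuts
# one point too early.
labels   = ["a", "a", "a", "a", "b", "b", "b", "b"]
true_ids = [0, 0, 0, 0, 1, 1, 1, 1]
found    = [0, 0, 0, 1, 1, 1, 1, 1]
P = average_purity(found, labels)      # (3/3 + 4/5) / 2 = 0.9
C = average_coverage(found, true_ids)  # (3/4 + 4/4) / 2 = 0.875
print(round(P, 3), round(C, 3), round(harmonic_mean(P, C), 3))
```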
Two datasets were selected for the evaluation of OWS and the baseline, GRASP-UTS: (i) the fishing dataset (5190 points, 153 segments) and (ii) the hurricane dataset (1990 points, 182 segments). The datasets were processed using the same conditions and features adopted in the experiments of [16]. The objective was to achieve the best result reported by GRASP-UTS for the unsupervised trajectory segmentation problem.
4.2 Algorithms parameter selection
In the experiments conducted in this work, ten different trajectory subsets were created to properly evaluate the performance of the segmentation algorithms. We used one subset for estimating the input parameter values of both algorithms, and the remaining nine to verify the algorithms' performance in terms of the harmonic mean of average purity and coverage. The same process was repeated with every subset serving in turn as the set for input parameter estimation, with validation on the remaining subsets. As a result, ten different values of the harmonic mean of average purity and coverage were found in our experiments.
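The evaluation protocol can be sketched in Python as follows. This is an illustration under our own assumptions: `tune` and `evaluate` are hypothetical callables standing in for parameter estimation and for the harmonic mean computation, and the score of each round is taken as the mean over the nine validation subsets, which the text does not state explicitly.

```python
def tune_and_validate(subsets, tune, evaluate):
    """Tune on each subset in turn and validate on all the other subsets.

    Returns one score per subset used for tuning, so ten subsets give ten
    values.
    """
    results = []
    for k, tuning_subset in enumerate(subsets):
        params = tune(tuning_subset)
        others = [s for j, s in enumerate(subsets) if j != k]
        scores = [evaluate(s, params) for s in others]
        results.append(sum(scores) / len(scores))
    return results
```

Note that this inverts the usual cross-validation split: the small part (one subset) is used for parameter estimation and the large part (nine subsets) for validation.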
The input parameter values for GRASP-UTS were estimated by a grid search over all combinations of values reported in [16]. The choice of the best input parameter combination was guided by the best cost function value achieved by an algorithm configuration, in the same way as reported in [16].
For the OWS segmentation algorithm, the $\epsilon$ value was found using the following steps. First, the total error signal $E$ was generated for the subset used for parameter estimation. Afterwards, the harmonic mean of the purity and coverage was calculated by running OWS using percentile values ($P$) of $E$ as $\epsilon$. We tested the percentile values $\epsilon \in E$ from 99 to 90 ($P = [P_{99}, P_{98}, P_{97}, P_{96}, P_{95}, P_{94}, P_{93}, P_{92}, P_{91}, P_{90}]$). The percentile that produced the highest harmonic mean was chosen as the $\epsilon$ value and was used to estimate the harmonic mean on the remaining nine subsets.
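The threshold selection can be sketched in Python as follows. It is an illustration only: the percentile is computed with linear interpolation between ranks, which is one common convention and may differ from the one used in the experiments, and `score` is a hypothetical callable that returns the harmonic mean obtained on the tuning subset for a given threshold.

```python
def percentile(values, q):
    """q-th percentile (0 to 100) using linear interpolation between ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def select_epsilon(errors, score, percentiles=range(99, 89, -1)):
    """Return the candidate threshold (P99 down to P90) with the best score."""
    candidates = [percentile(errors, q) for q in percentiles]
    return max(candidates, key=score)


# Toy error signal 0..100 and a toy score that peaks at a threshold of 95.
toy_errors = list(range(101))
print(select_epsilon(toy_errors, score=lambda eps: -abs(eps - 95)))  # 95.0
```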
4.3 OWS interpolation methods evaluation
In the first experiment, we tested the kinematic, linear, random walk, and cubic interpolation methods in OWS for the hurricanes and fishing datasets.
The results on the fishing dataset show that random walk interpolation produces the highest harmonic mean. Since we do not have enough samples (10 harmonic mean values for every segmentation algorithm) to verify whether the outcomes are normally distributed, we used the Mann-Whitney U test to verify whether the difference in the results is significant. If the p-value (P) was lower than 0.05, we rejected the hypothesis that the observed median values came from the same distribution, i.e., the difference is statistically significant. A Mann-Whitney U test indicated that the random walk interpolation kernel produces a statistically significantly higher median (M = 93.68) harmonic mean for trajectory segmentation compared to kinematic (S = 11.0, P = 0.0018, M = 86.98), linear (S = 22.0, P = 0.0188, M = 91.57), and cubic (S = 13.0, P = 0.0028, M = 91.61) interpolation. We think that this result shows that the human factor (i.e., the vessel's captain) plays an essential role in detecting fishing activities, and that this is reflected in random changes in movement behavior.
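For readers unfamiliar with the test, the Mann-Whitney U statistic can be computed as in the Python sketch below. We assume that the reported S values correspond to this statistic, which the text does not state. The function name is ours, the sample values are purely illustrative and are not the results of the experiments, and the computation of the p-value is left out.

```python
def mann_whitney_u(a, b):
    """Smaller of the two U statistics for samples a and b.

    U counts the pairs (x, y), with x from a and y from b, in which x is
    larger than y; ties count as one half.
    """
    u_a = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    return min(u_a, len(a) * len(b) - u_a)


# Illustrative values only: fully separated samples give U = 0,
# interleaved samples give a larger U.
print(mann_whitney_u([1, 2, 3], [4, 5, 6]))  # 0
print(mann_whitney_u([1, 4], [2, 3]))        # 2
```

With two samples of ten values each, U ranges from 0 (no overlap between the samples) to 50, so small values indicate well-separated results.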
Table 1: Comparing OWS interpolation methods on the fishing dataset.
     RW      LIN     KIN     CUB
M    93.68   91.57   86.98   91.61
σ    1.85    2.68    4.88    1.56
The results on the hurricanes dataset are detailed in Table 2. The results show that kinematic interpolation produces the highest harmonic mean. A Mann-Whitney U test indicated that the kinematic interpolation kernel (M = 93.11) produces a statistically significantly higher median harmonic mean for octal window segmentation compared to random walk (S = 18.0, P = 0.0086, M = 92.45), linear (S = 10.0, P = 0.0014, M = 90.71), and cubic (S = 9.0, P = 0.0011, M = 87.91) interpolation. We think that the kinematic interpolation worked better on this dataset because high-speed moving objects tend to follow this kind of movement and also because the sampling rate for this dataset is constant (e.g., every 6 hours).
Table 2: Comparing OWS interpolation methods on the hurricanes dataset.
     RW      LIN     KIN     CUB
M    92.45   90.71   93.11   87.91
σ    1.24    1.12    2.53    4.22
4.4 Comparison with a baseline
In this section, we compare two algorithms for unsupervised trajectory segmentation, OWS and GRASP-UTS. Figure 3 shows a violin chart of the results of OWS segmentation on the fishing dataset in blue (left) and of GRASP-UTS in green (right) for all subsets and interpolation methods. The random walk interpolation shows visible improvements over GRASP-UTS, as do all the other interpolation kernels. Furthermore, a Mann-Whitney U test between GRASP-UTS and the random walk interpolation kernel on the fishing dataset shows that OWS produces a statistically significantly higher median harmonic mean than GRASP-UTS.
Figure 4 shows the results of the OWS segmentation on the hurricane dataset in blue (left) and of GRASP-UTS in green (right). Even though GRASP-UTS took advantage of using wind speed as a feature, the kinematic interpolation shows a considerable improvement over GRASP-UTS. The other interpolation methods also show results that are competitive with GRASP-UTS. Another Mann-Whitney U test between GRASP-UTS and the kinematic interpolation kernel on the hurricane dataset shows that OWS produces a statistically significantly higher median harmonic mean than GRASP-UTS.
Figure 3: Comparing results of OWS against GRASP-UTS on the fishing dataset.
Figure 4: Comparing results of OWS against GRASP-UTS on the hurricanes dataset.
5 CONCLUSION
In this work, we proposed an unsupervised trajectory segmentation algorithm named Octal Window Segmentation (OWS) that segments trajectory data using interpolation methods to generate an error signal, which measures how far the reported geolocation of a moving object is from where it was expected to be. This error signal represents possible partitioning positions where a moving object changed its behavior, and it is used to segment the trajectory data into sub-trajectories. The proposed model can be adapted to different domains by adjusting the interpolation method. The experimental results show that the kinematic interpolation is more suitable for the hurricane dataset, while the random walk interpolation was the best choice for segmenting the fishing dataset. We compared OWS against the state-of-the-art segmentation algorithm GRASP-UTS, and the results show that our algorithm achieved a statistically significantly higher harmonic mean of purity and coverage for the hurricane and fishing datasets. Furthermore, OWS does not need any knowledge beyond the raw trajectory, while GRASP-UTS needs trajectory features such as speed, direction variation, etc.
This work can be extended in several directions. First, we intend to expand the quantitative comparison of OWS with other methods like W-KMeans and CB-SMoT and to use other trajectory datasets like Geolife [24]. We also intend to evaluate other interpolation methods and the effects of increasing the window size used to create the error signal.
REFERENCES
[1]
Luis Otavio Alvares, Vania Bogorny, Bart Kuijpers, Jose Antonio Fernandes
de Macedo, Bart Moelans, and Alejandro Vaisman. 2007. A Model for Enrich-
ing Trajectories with Semantic Geographical Information. In Proceedings of
the 15th Annual ACM International Symposium on Advances in Geographic
Information Systems (GIS ’07). ACM, New York, NY, USA, Article 22, 8 pages.
https://doi.org/10.1145/1341012.1341041
[2]
Pablo Samuel Castro, Daqing Zhang, Chao Chen, Shijian Li, and Gang Pan.
2013. From taxi GPS traces to social and community dynamics: A survey.
ACM Computing Surveys (CSUR) 46, 2 (2013), 17.
[3]
Sina Dabiri and Kevin Heaslip. 2018. Inferring transportation modes from GPS
trajectories using a convolutional neural network. Transportation Research
Part C: Emerging Technologies 86 (2018), 360–371.
[4]
Erico N de Souza, Kristina Boerder, Stan Matwin, and Boris Worm. 2016.
Improving shing pattern detection from satellite AIS using data mining and
machine learning. PloS one 11, 7 (2016), e0158248.
[5]
Renata Dividino, Amilcar Soares, Stan Matwin, Anthony W Isenor, Sean
Webb, and Matthew Brousseau. 2018. Semantic Integration of Real-Time
Heterogeneous Data Streams for Ocean-related Decision Making. In Big Data
and Articial Intelligence for Military Decision Making. STO. https://doi.org/
10.14339/STO-MP- IST-160-S1- 3- PDF
[6]
Mohammad Etemad, Amílcar Soares Júnior, and Stan Matwin. 2018. Predicting
Transportation Modes of GPS Trajectories using Feature Engineering and
Noise Removal. In Advances in Articial Intelligence: 31st Canadian Conference
on Articial Intelligence, Canadian AI 2018, Toronto, ON, Canada, May 8–11,
2018, Proceedings 31. Springer, 259–264.
[7]
Shanshan Feng, Gao Cong, Bo An, and Yeow Meng Chee. 2017. POI2Vec:
Geographical Latent Representation for Predicting Future Visitors.. In AAAI.
102–108.
[8]
Amílcar Soares Júnior, Chiara Renso, and Stan Matwin. 2017. ANALYTiC:
An Active Learning System for Trajectory Classication. IEEE Computer
Graphics and Applications 37, 5 (2017), 28–39. https://doi.org/10.1109/MCG.
2017.3621221
[9]
Amílcar Soares Júnior, Valéria Times, Chiara Renso, Stan Matwin, and Lucıdio
A. F. Cabral. 2018. A semi-supervised approach for the semantic segmen-
tation of trajectories. In 19th IEEE International Conference on Mobile Data
Management.
[10]
Jae-Gil Lee, Jiawei Han, and Kyu-Young Whang. 2007. Trajectory clustering:
a partition-and-group framework. In Proceedings of the 2007 ACM SIGMOD
international conference on Management of data. ACM, 593–604.
[11]
Luis A. Leiva and Enrique Vidal. 2013. Warped K-Means: An algorithm to
cluster sequentially-distributed data. Information Sciences 237 (2013), 196 – 210.
https://doi.org/10.1016/j.ins.2013.02.042 Prediction, Control and Diagnosis
using Advanced Neural Computations.
[12]
Jed A Long. 2016. Kinematic interpolation of movement data. International
Journal of Geographical Information Science 30, 5 (2016), 854–868.
[13]
B. N. Moreno, A. Soares Júnior, V. C. Times, P. Tedesco, and Stan Matwin. 2014.
Weka-SAT: A Hierarchical Context-Based Inference Engine to Enrich Trajec-
tories with Semantics. In Advances in Articial Intelligence. Springer Interna-
tional Publishing, Cham, 333–338. https://doi.org/10.1007/978-3- 319-06483-3_
34
[14]
Andrey Tietbohl Palma, Vania Bogorny, Bart Kuijpers, and Luis Otavio Alvares.
2008. A Clustering-based Approach for Discovering Interesting Places in
Trajectories. In Proceedings of the 2008 ACM Symposium on Applied Computing
(SAC ’08). ACM, New York, NY, USA, 863–868. https://doi.org/10.1145/1363686.
1363886
[15]
Jose Antonio MR Rocha, Valéria C Times, Gabriel Oliveira, Luis O Alvares, and
Vania Bogorny. 2010. DB-SMoT: A direction-based spatio-temporal clustering
method. In Intelligent systems (IS), 2010 5th IEEE international conference. IEEE,
114–119.
[16]
A. Soares Júnior, B. N. Moreno, V. C. Times, S. Matwin, and L. A. F. Cabral.
2015. GRASP-UTS: an algorithm for unsupervised trajectory segmentation.
International Journal of Geographical Information Science 29, 1 (2015), 46–68.
[17]
Georgios Technitis, Walied Othman, Kamran Sa, and Robert Weibel. 2015.
From A to B, randomly: a point-to-point random trajectory generator for
animal movement. International Journal of Geographical Information Science
29, 6 (2015), 912–934.
[18]
Tammy M Thompson, Sebastian Rausch, Rebecca K Saari, and Noelle E Selin.
2014. A systems approach to evaluating the air quality co-benets of US
carbon policies. Nature Climate Change 4, 10 (2014), 917.
[19]
Yann Tremblay, Scott A Shaffer, Shannon L Fowler, Carey E Kuhn, Birgitte I
McDonald, Michael J Weise, Charles-André Bost, Henri Weimerskirch, Daniel E
Crocker, Michael E Goebel, et al. 2006. Interpolation of animal tracking data
in a fluid environment. Journal of Experimental Biology 209, 1 (2006), 128–140.
[20]
I. Varlamis, K. Tserpes, and C. Sardianos. 2018. Detecting Search and Rescue
Missions from AIS Data. In 2018 IEEE 34th International Conference on Data
Engineering Workshops (ICDEW). 60–65.
https://doi.org/10.1109/ICDEW.2018.00017
[21]
Zhixian Yan, Nikos Giatrakos, Vangelis Katsikaros, Nikos Pelekis, and Yannis
Theodoridis. 2011. SeTraStream: semantic-aware trajectory construction over
streaming movement data. In International Symposium on Spatial and Temporal
Databases. Springer, 367–385.
[22]
Daiyong Zhang, Jia Li, Qing Wu, Xinglong Liu, Xiumin Chu, and Wei He.
2017. Enhance the AIS data availability by screening and interpolation. In
Transportation Information and Safety (ICTIS), 2017 4th International Conference
on. IEEE, 981–986.
[23]
Yu Zheng, Hao Fu, X Xie, WY Ma, and Q Li. 2011. Geolife GPS trajectory
dataset-User Guide. (2011).
[24]
Yu Zheng, Hao Fu, Xing Xie, Wei-Ying Ma, and Quannan Li. 2011. Geolife GPS
trajectory dataset - User Guide.
https://www.microsoft.com/en-us/research/publication/geolife-gps-trajectory-dataset-user-guide/