ARTICLE IN PRESS
JID: JOES [m5+; October 21, 2021;18:48 ]
Available online at www.sciencedirect.com
Journal of Ocean Engineering and Science xxx (xxxx) xxx
www.elsevier.com/locate/joes
Original Article
Ship behavior prediction via trajectory extraction-based clustering for
maritime situation awareness
Brian Murray, Lokukaluge Prasad Perera
UiT The Arctic University of Norway, Tromsø, Norway
Received 15 October 2020; received in revised form 14 February 2021; accepted 18 March 2021
Available online xxx
Abstract
This study presents a method in which historical AIS data are used to predict the future trajectory of a selected vessel. This is facilitated
via a system intelligence-based approach that can be subsequently utilized to provide enhanced situation awareness to navigators and future
autonomous ships, aiding proactive collision avoidance. By evaluating the historical ship behavior in a given geographical region, the method
applies machine learning techniques to extrapolate commonalities in relevant trajectory segments. These commonalities represent historical
behavior modes that correspond to the possible future behavior of the selected vessel. Subsequently, the selected vessel is classified to a
behavior mode, and a trajectory with respect to this mode is predicted. This is achieved via an initial clustering technique and subsequent
trajectory extraction. The extracted trajectories are then compressed using the Karhunen–Loève transform, and clustered using a Gaussian
Mixture Model. The approach in this study differs from others in that trajectories are not clustered for an entire region, but rather for
relevant trajectory segments. As such, the extracted trajectories provide a much better basis for clustering relevant historical ship behavior
modes. A selected vessel is then classified to one of these modes using its observed behavior. Trajectory predictions are facilitated using an
enhanced subset of data that likely correspond to the future behavior of the selected vessel. The method yields promising results, with high
classification accuracy and low prediction error. However, vessels with abnormal behavior degrade the results in some situations;
these cases are also discussed in this study.
© 2021 Shanghai Jiaotong University. Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license ( http://creativecommons.org/licenses/by-nc-nd/4.0/ )
Keywords: Maritime situation awareness; Ship navigation; Trajectory prediction; Collision avoidance; Machine learning; Unsupervised learning; AIS.
1. Introduction
Technological advances are permeating almost every in-
dustry. Artificial intelligence, increased computational power
and wireless communication capabilities have the potential
to allow for disruptive innovations that can change business
models drastically. Many argue that there is a digital rev-
olution underway and are calling it Industry 4.0 [1] . If one
looks to the automotive industry for instance, significant inno-
vations related to autonomous cars are being developed at an
exponential rate. Autonomous cars are already being tested in
general traffic areas and there are claims that mass production
could be possible by 2021 [2] .
Corresponding author.
E-mail address: brian.murray@uit.no (B. Murray).
Similarly, it can be argued that shipping is currently on
its way into a fourth technical revolution, Shipping 4.0 [3] .
The first revolution in shipping can be argued to be the transition
from sail to steam at the turn of the 19th century,
the second from steam to diesel around 1910, and the third
came with the introduction of automated systems, made pos-
sible through the advent of computers around 1970. Like the
car industry, the shipping industry is looking to autonomy
as a possible disruptive element. The shipping industry has,
however, historically been considered conservative, with in-
novations being implemented at a slower rate than in similar
industries. As such, technologies associated with autonomous
ships are not as developed as those for autonomous cars.
Nonetheless, many companies are working on the develop-
ment of autonomous ships. The first autonomous ships, e.g.
Yara Birkeland, are planned to be launched in 2020 and fully
autonomous by 2022 [4] . It can be argued that if the required
https://doi.org/10.1016/j.joes.2021.03.001
2468-0133/© 2021 Shanghai Jiaotong University. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license
( http://creativecommons.org/licenses/by-nc-nd/4.0/ )
Please cite this article as: B. Murray and L.P. Perera, Ship behavior prediction via trajectory extraction-based clustering for maritime situation awareness,
Journal of Ocean Engineering and Science, https://doi.org/10.1016/j.joes.2021.03.001
B. Murray and L.P. Perera Journal of Ocean Engineering and Science xxx (xxxx) xxx
Nomenclature
$a$ Arbitrary AIS Parameter Vector
$A$ Set of AIS Data
$c$ Trajectory Class
$C$ Data Cluster
$d$ Euclidean Distance
$e$ Eigenvector
$E$ Eigenvector Matrix
$f$ Trajectory Feature Vector
$I$ Identity Matrix
$J_3$ Class Separability Criterion
$k$ Hyper-parameter for kNN classifier
$K_M$ Number of Free Parameters in Mixture Model
$L$ Number of Data Points in Selected Trajectory
$LL(\cdot)$ Log-likelihood Function
$M$ Number of Models in Mixture Model
$N$ Number of Trajectories
$p$ Arbitrary Vessel Position
$q$ Selected Vessel Position
$r$ Search Radius [m]
$R$ Rotation Matrix
$s$ Vessel State
$S_b$ Between-class Scatter Matrix
$S_w$ Within-class Scatter Matrix
$t$ Timestamp
$T$ Elapsed Time [s]
$T_\delta$ Additional Time Period [s]
$T_p$ Desired Prediction Time Horizon [s]
$v$ Speed over Ground [m/s]
$x$ UTM x-coordinate [m]
$\mathbf{x}$ Reduced Feature Vector
$X$ Set of Reduced Feature Vectors
$y$ UTM y-coordinate [m]
$z$ Class Membership
$Z$ Spatial Data Matrix
$\Delta L$ Step Size [m]
$\Lambda$ Eigenvalue Matrix
$\mu$ Mean Vector
$\pi$ Prior Distribution
$\Sigma$ Covariance Matrix
$\theta$ Rotation Angle [°]
$\Theta$ Model Parameters
$\chi$ Course over Ground [°]
Subscripts
$0$ Initial State
$i$ Sample Number
$g$ Global
$j$ Class Number
$k$ $k$th State
$l$ Number of Eigenvectors
$m$ Model Number in Gaussian Mixture
$\delta$ Maximum Offset
Superscripts
$\hat{\ }$ Estimated Parameter / State
Acronyms
AIS Automatic Identification System
BIC Bayesian Information Criterion
EM Expectation Maximization
GMM Gaussian Mixture Model
KL Karhunen–Loève
LDA Linear Discriminant Analysis
technologies are available, autonomous ships will be safer and
more efficient than conventional vessels, and that because of
this fact they should be adopted by the industry [5] . For this to
occur, however, autonomous ships must be proven to operate
at a level of safety comparable to, or better than, conventional
manned vessels.
1.1. Maritime situation awareness
For autonomous ships to be introduced into commercial
shipping lanes, effective collision avoidance systems [6] must
be in place to ensure that the autonomous operations have the
required level of safety. Given that the vessels are unmanned,
an autonomous ship must be able to make decisions based
on its understanding of its surroundings, i.e. its own situation
awareness. Situation awareness is defined as “Being aware of
what is happening around you and understanding what that
information means to you now and in the future” [7] , and is
separated into three levels [8] :
1. Perception of the elements in the current situation
2. Comprehension of the current situation
3. Projection of the future status
For an autonomous vessel, situation awareness will pri-
marily entail obstacle detection and prediction of close-range
encounter situations. Other vessels are the most common ob-
stacle an autonomous ship will encounter, and are referred to
as target vessels in an encounter situation. The autonomous
vessel in this case is referred to as the own ship. Such situa-
tions will require collision avoidance maneuvers.
1.1.1. Perception of elements in the current situation
To effectively conduct collision avoidance maneuvers with
respect to target vessels, an own ship will need to be able
to first detect the target vessel, and evaluate relevant param-
eters such as its position, course over ground and speed over
ground. This can be considered as the first level of situa-
tion awareness. An autonomous ship must, therefore, first
define its current state, where all obstacles and their cur-
rent states are known. In order to perceive the relevant ob-
stacles, an autonomous ship must be able to observe them.
Since there is no navigator on-board, collision avoidance tech-
nologies will rely heavily on the sensor suite available on-
board the vessel, as they must in essence replace the eyes
of the navigator. An advanced obstacle detection and tracking
system, which utilizes sensor fusion to enhance detection
capabilities, should be utilized. Relevant sensors will
likely include RADAR (Radio Detection and Ranging) and
electro-optical sensors [9], for example LIDAR (Light Detection
and Ranging), stereo cameras and infra-red cameras.
1.1.2. Comprehension of the current situation
Based on its current state, the own ship must be capable of
evaluating the risk of collision. If there is a risk of collision,
the own ship must conduct collision avoidance maneuvers
that adhere to the COLREGS as outlined in [10]. This corresponds
to level two of Endsley’s situation awareness, where
the ship must now make sense of its current state, and the im-
mediate implications it has for the safety of the operation. Fu-
jii and Tanaka [11] and Goodwin [12] introduced the concept
of the ship domain, where a safety region around a relevant
vessel is introduced to indicate the collision risk. A thorough
review of collision avoidance methods can be found in [13] .
These methods are designed with respect to ships in close-
range encounters, where the collision risk is high enough to
require collision avoidance maneuvers.
1.1.3. Projection of the future status
Level three situation awareness addresses the projection of
the future state of the vessel. In a collision avoidance setting,
this entails predicting both the future states of the own ship,
as well as the future states of target vessels. Previous studies
relating to collision avoidance techniques entail predicting the
future state of a target vessel via calculations using constant
course and speed values. Based on this, collision risk param-
eters relating to the closest point of approach (CPA) such as
the distance (DCPA) and time (TCPA) can be determined,
and necessary collision avoidance maneuvers conducted on
this basis.
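The constant course and speed calculation described above can be sketched as follows. This is a minimal illustration, not the formulation of any cited method: the function name `cpa`, the flat UTM-style coordinates, and the choice to clamp a negative TCPA to zero are all assumptions made for the example.

```python
import math

def cpa(own_pos, own_cog_deg, own_sog, tgt_pos, tgt_cog_deg, tgt_sog):
    """Return (dcpa [m], tcpa [s]) assuming both vessels hold course and speed."""
    def velocity(cog_deg, sog):
        # Course over ground is measured clockwise from north (the +y axis).
        chi = math.radians(cog_deg)
        return sog * math.sin(chi), sog * math.cos(chi)

    ovx, ovy = velocity(own_cog_deg, own_sog)
    tvx, tvy = velocity(tgt_cog_deg, tgt_sog)
    # Relative position and velocity of the target with respect to the own ship.
    rx, ry = tgt_pos[0] - own_pos[0], tgt_pos[1] - own_pos[1]
    vx, vy = tvx - ovx, tvy - ovy
    v2 = vx * vx + vy * vy
    if v2 < 1e-12:
        # Identical velocities: the separation never changes.
        return math.hypot(rx, ry), 0.0
    tcpa = -(rx * vx + ry * vy) / v2
    # Assumption: a negative TCPA (closest approach already passed) is clamped
    # to zero, so DCPA then equals the current separation.
    tcpa = max(tcpa, 0.0)
    dcpa = math.hypot(rx + vx * tcpa, ry + vy * tcpa)
    return dcpa, tcpa

# Head-on example: own ship northbound, target 1000 m ahead and southbound.
dcpa, tcpa = cpa((0.0, 0.0), 0.0, 5.0, (0.0, 1000.0), 180.0, 5.0)
```

Because the relative velocity is assumed constant, the estimate is only as good as that assumption, which is why such values are mainly useful over short horizons.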
Ships have a slow response time when control actions are
sent to change the speed or course over ground. Cars for
instance can make changes almost instantaneously, depend-
ing on their speed. The inertia forces of a ship are, how-
ever, much higher, and resultant collision avoidance maneu-
vers will take much longer to conduct. Therefore, it is de-
sirable to predict the risk of collision as far as possible in
advance. This entails predicting the future trajectories of both
the own ship and target vessels accurately. Methods such as
[14] , where a fuzzy logic based decision making system for
collision avoidance was introduced, and [15] , where parallel
trajectory planning was proposed for autonomous collision
avoidance, can improve the ability of an autonomous vessel
to make decisions. Additionally, work on more advanced pre-
diction algorithms, e.g. [16] , where extended Kalman filters
were utilized to estimate ship trajectories, can enhance the
situation awareness of autonomous vessels to aid in effective
collision avoidance. However, predictions under such methods
are only useful up to rather short prediction horizons (order
of seconds to minutes). These methods are, therefore, useful
in the case of a close-range encounter situation. In such sit-
uations, the own ship must make decisions based on input
from the sensor system, and plan effective collision avoid-
ance maneuvers. This, however, entails that there is a risk of
collision.
This study suggests an approach in which the trajectory of
a target vessel is predicted far in advance, such that potential
close-range encounter situations are prevented from occurring.
With an enhanced level of situation awareness, an autonomous
vessel can predict its own future states, as well as those for
relevant target vessels, for a period up to 30 min into the
future. Based on this level of situation awareness, intelligent
decisions can be made to identify possible future close-range
encounter situations, and optimally implement simple proac-
tive collision avoidance strategies. Examples of such strategies
may include minor speed or course alterations, such that the
future trajectory of the own ship is altered. This is unfortunately
not straightforward. It can be assumed that the majority
of vessels will be manned in the foreseeable future. As
such, the behavior of potential target vessels is highly unpre-
dictable for an autonomous agent. Such a strategy, therefore,
requires a system intelligence based approach to maritime sit-
uation awareness.
1.2. System intelligence based ship trajectory prediction
Data from the Automatic Identification System (AIS) pro-
vide a powerful data set upon which analytics can be con-
ducted. Historical AIS data provide insight into historical ship
behavior that can be used to gain insight into patterns in mar-
itime traffic. A myriad of ship parameters are recorded in the
stored ship trajectories, including positional data, speed over
ground values, and course over ground values for various time
instances. AIS data provide an ideal data set upon which ma-
chine learning techniques can be applied to yield insight into
patterns for subsequent use in maritime traffic analysis. Ma-
chine learning is a powerful tool, where insight can be ex-
tracted from data for a variety of purposes. Examples in the
maritime field include [17] , where an optimal truncated least
square support vector was utilized to estimate parameters for
nonlinear maneuvering models, and Shen et al. [18] where
deep reinforcement learning was used to facilitate automatic
collision avoidance.
This study proposes providing future vessels with a degree
of system intelligence, facilitated by historical knowledge
that is extrapolated via machine learning techniques from AIS
data. Using the historical knowledge available, such system
intelligence will provide predictions of vessel trajectories,
allowing for subsequent collision risk assessment. The purpose
is to enhance the safety of future autonomous ship operations
and to provide decision support to conventional
vessels. This section presents relevant related work and the
contributions of this study.
1.2.1. Related work
An increasing amount of research is being conducted on
methods to utilize AIS data. Zhang et al. [19] analyzed AIS
data to gain insight into the spatial-temporal dynamics of ship
traffic around ports. Additionally, Liu et al. [20] used AIS data
to evaluate regional collision risk, and [21] utilized AIS data
to automatically generate ship routes. Tu et al. [22] provided
a comprehensive review of methods to exploit AIS data for
maritime navigation. Most work in the field has previously fo-
cused on predicting vessel trajectory patterns and general traf-
fic behavior e.g. [23] . Identifying anomalous behavior based
on general vessel patterns, e.g. [24], has also been a focus.
These methods are useful for general behavior analysis, but
are of limited use with respect to aiding in collision avoid-
ance.
Of most interest in a collision avoidance setting is the work
done on utilizing AIS data to predict the future trajectory of
a vessel. The idea is to infer the future trajectory of a vessel
based on the historical behavior of vessels in the same re-
gion. This information is stored in historical AIS data. Ristic
et al. [25] presented a method to predict the future motion of
a vessel utilizing a particle filter approach, but the accuracy is
limited for use in collision avoidance. Pallotta et al. [26] pre-
sented the TREAD (Traffic Route Extraction and Anomaly
Detection) methodology to cluster all trajectories in a defined
region in an unsupervised manner, and subsequently classify a
selected vessel to one of the clusters, each representing a traf-
fic route for the purpose of anomaly detection. Pallotta et al.
[27] subsequently utilized the TREAD methodology to iden-
tify traffic routes, classify a vessel to a route, and predict the
vessel position along this route using the Ornstein–Uhlenbeck
stochastic process. The TREAD technique, however, clusters
way-points, entry points, and stationary points such that the
data for the entire region is utilized to differentiate between
the vessels. As such, there can be significant discrepancies be-
tween sub-paths for trajectories belonging to the same class.
This is of limited importance for long-term predictions (or-
der of hours), and the method using the Ornstein–Uhlenbeck
stochastic process is effective in such cases. The method’s
mean and variance functions do not change over time, how-
ever, which can be considered a strict assumption for real
applications. Short-term predictions (order 5–30 min) of high
accuracy and resolution, however, are arguably of more inter-
est for collision avoidance purposes. For such predictions, the
method will not be as effective. Mazzarella et al. [28] also
presented a prediction method using a Bayesian network-
based algorithm with a particle filter for prediction horizons
in the order of hours. However, this method also has lim-
ited efficacy in short-term trajectory predictions relevant for
collision avoidance purposes.
Hexeberg et al. [29] presented an AIS-based approach to
predict short-term vessel trajectories. The method utilizes a
single point neighbor search method to predict a vessel tra-
jectory based on the underlying AIS data. The method, how-
ever, is unable to handle branching, and [30] expanded on
this work to provide multiple predictions via a prediction
tree, where samples are drawn from close neighbors in the
underlying data. In this manner, a probability estimate can be
evaluated for the future position at a given point in time, facil-
itated via Gaussian Mixture Models. As opposed to previous
methods, these methods do not utilize clustering to identify
traffic routes. All predictions are based on the AIS data in
the neighborhoods of predicted states. As such, these methods
do not take into consideration the relationship between data
points. Future states are predicted iteratively from an initial
state based on the AIS data in the neighborhood of a pre-
dicted position. These data, however, may include data points
that have no relationship to the initial, or previous, predicted
states, and as such may degrade the accuracy. Rong et al.
[31] also presented an approach using a Gaussian process
model, where a probabilistic trajectory prediction method is
outlined which, in addition to predicting the future positions
of a vessel, also describes the uncertainty of the predicted
position. The method, however, is only evaluated using
regular ship routes and offers no method to identify multiple
possible future routes the vessel may follow, and classify it
to one.
1.2.2. Contribution
In this study, a method to provide system intelligence
to future autonomous ships is suggested for the purpose of
enhanced situation awareness. The method is facilitated by
leveraging historical AIS data via machine learning techniques
to predict the future trajectory of a vessel based on its initial
state. The method provides short-term trajectory predictions
(order 5–30 min) that can provide a basis for collision risk
assessments. In this manner, possible close-range encounter
situations can be avoided, and the overall safety associated
with autonomous operations can be increased.
The method presented in this study is based on a similar
structure to that of previous techniques, in that trajectories
are first clustered, a selected vessel is classified to a given
cluster of trajectories, and a subsequent trajectory prediction is
determined. However, this method is designed to aid in short-
term trajectory predictions. As such, an alternative approach is
suggested, where an initial clustering technique is utilized to
extract a subset of data from a historical AIS data set, centered
about the initial vessel state. This cluster contains AIS data
that has a high degree of similarity to the initial state of the
selected vessel. Using this initial cluster, all unique future, i.e.
forward, trajectories are extracted from the cluster. The length
of these is defined by the desired prediction time horizon.
These trajectories represent all future paths of ships that had
similar states to the initial state of the selected vessel. This
data set will, therefore, only contain data that are related to the
initial vessel state, as well as retain the relationship between
data points.
The extracted forward trajectories represent the possible
future behavior of the selected vessel for a given prediction
horizon. In this study, it is of interest to identify all possible
trajectory modes of the historical ship behavior, such that a
high fidelity trajectory prediction can be conducted to support
collision avoidance. Identifying such modes can be facilitated
by clustering the forward trajectories. It is only of interest to
differentiate between different possible modes for the dura-
tion of the desired prediction horizon. As such, clustering the
extracted forward trajectories will provide a better basis for
relevant route identification compared to other methods where
entire trajectories for regions are considered.
A clustering technique is suggested based on all relevant
data in each unique extracted forward trajectory. Dimensionality
reduction via the Karhunen–Loève transform is first conducted
in order to compress the trajectories, whilst retaining
the most important information relevant for differentiating the
ship behavior. Such dimensionality reduction should support
the clustering performance. Clustering is then facilitated via
unsupervised Gaussian Mixture Modeling. A selected vessel
is then classified to a cluster based on its past behavior. This
is achieved via backward trajectory extraction, and optimally
generating features for class separation using Linear Discrim-
inant Analysis. Finally, a trajectory prediction is conducted
with respect to the trajectory data in the cluster of historical
ship behavior.
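For orientation, the compression and clustering steps named above can be written in their standard textbook form. This is a sketch under assumptions rather than the formulation used later in this study: the symbols follow the nomenclature where possible, while the subscripted $\mu_f$ and $\Sigma_f$ (mean and covariance of the trajectory feature vectors) and the sample-covariance normalization are introduced here for illustration only.

```latex
% Karhunen-Loeve transform of N trajectory feature vectors f_i
\mu_f = \frac{1}{N}\sum_{i=1}^{N} f_i, \qquad
\Sigma_f = \frac{1}{N}\sum_{i=1}^{N} (f_i - \mu_f)(f_i - \mu_f)^{T}

% Eigendecomposition; E_l holds the l eigenvectors with the largest eigenvalues
\Sigma_f E = E \Lambda, \qquad
\mathbf{x}_i = E_l^{T}\,(f_i - \mu_f)

% Gaussian Mixture Model with M components fitted to the reduced vectors
p(\mathbf{x}) = \sum_{m=1}^{M} \pi_m\, \mathcal{N}(\mathbf{x} \mid \mu_m, \Sigma_m),
\qquad \sum_{m=1}^{M} \pi_m = 1
```

Keeping only the $l$ leading eigenvectors retains the directions of greatest variance among the trajectories, which is what allows the mixture model to be fitted in a much lower-dimensional space.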
The method has enhanced performance as it can discover
the cluster of most similar ship behavior. This allows for
predictions with a higher degree of fidelity than other meth-
ods with respect to collision avoidance. This is effective for
challenging regions with more complex traffic, i.e. multiple
possible routes with various speeds. The trajectory prediction
also provides an increased level of accuracy given the rela-
tionship between data points in the underlying data. Similar
methods use techniques that introduce time invariance, such
as dynamic time warping. These result in effective clustering
of trajectories of similar shapes for a given region, but cannot
capture the temporal relationship between when the various
maneuvers take place. Furthermore, clustering ship trajectories for
entire regions will yield different results than clustering the
extracted trajectories as suggested in this study. Therefore,
the technique in this study provides a better basis for dif-
ferentiating relevant historical ship behavior, supporting ship
trajectory prediction. The method can also be applied in any
geographical region, as the algorithm only requires access to
raw AIS data of sufficient density for the region of interest.
An initial version of this work was presented in [32] .
Furthermore, in [33] , a Dual Linear Autoencoder approach
was introduced to facilitate trajectory prediction, that utilized
similar clustering and classification regimes to those in this
study. The clustering and classification techniques utilized in
[33] were, however, not the focus of the study, and, there-
fore, not addressed in detail. This study can, therefore, be
considered a parallel study, where the methods introduced in
[32] are expanded upon, and addressed in detail.
2. Methodology
This section outlines the methodology utilized to facili-
tate trajectory predictions via the proposed system intelligence
approach. First, the general approach to facilitate a trajectory
prediction is presented. Second, the trajectory clustering mod-
ule is outlined. Next, the trajectory classification module is
discussed. Finally, the methodology involved in the trajectory
prediction module is presented.
2.1. General prediction approach
The objective of the method presented in this study is to
facilitate a prediction of the future trajectory of a target ves-
sel, hereafter referred to as a selected vessel. In this study,
a prediction horizon of 30 min is investigated. This, how-
ever, can be varied based on the desires of the user. It is
further assumed that the past 10 min behavior of the selected
vessel is available. The architecture of the method can be
split into three modules: the trajectory clustering module, the
trajectory classification module and the trajectory prediction
module. This is illustrated in Fig. 1.
The trajectory clustering module first employs an initial
clustering technique. Based on the current state of the se-
lected vessel, the technique identifies historical AIS messages
in a defined region surrounding the current position of the
selected vessel. Furthermore, data are filtered such that they
have similar speed and course over ground values to that of
the selected vessel. In this manner, ships with similar behavior
in the past are identified. The forward, i.e. future, trajectories
are then extracted 30 min into the future from their initial
data points. These represent the distribution of possible 30
min behavior for the selected vessel. Next, the forward tra-
jectories are clustered. In this manner, each cluster represents
a mode of ship behavior, where each is comprised of similar
trajectories. It is, therefore, of interest to identify the most
likely mode of future ship behavior the selected vessel may
belong to, such that a prediction of enhanced fidelity can be
facilitated.
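The forward extraction step described in this paragraph can be sketched as follows. This is a minimal sketch under assumptions: the data layout (`history` keyed by a trajectory identifier, `cluster_hits` as index pairs), the function name `extract_forward`, and the rule of discarding trajectories that do not span the full horizon are hypothetical choices made for the example.

```python
from collections import defaultdict

def extract_forward(history, cluster_hits, s0_xy, horizon_s, extra_s=0.0):
    """Extract forward trajectories starting inside the initial cluster.

    history: {traj_id: [(t, x, y), ...]} with each list sorted by time t [s].
    cluster_hits: iterable of (traj_id, index) pairs for points in the cluster.
    Returns {traj_id: [(t - t_start, x, y), ...]} covering horizon_s + extra_s.
    """
    hits = defaultdict(list)
    for traj_id, idx in cluster_hits:
        hits[traj_id].append(idx)

    forward = {}
    for traj_id, indices in hits.items():
        track = history[traj_id]
        # The hit closest to the selected vessel becomes the initial point,
        # so each trajectory is counted once even with several hits.
        start = min(indices, key=lambda i: (track[i][1] - s0_xy[0]) ** 2
                                           + (track[i][2] - s0_xy[1]) ** 2)
        t_start = track[start][0]
        segment = [(t - t_start, x, y) for t, x, y in track[start:]
                   if t - t_start <= horizon_s + extra_s]
        # Assumption: keep only trajectories that reach the prediction horizon.
        if segment and segment[-1][0] >= horizon_s:
            forward[traj_id] = segment
    return forward

# Example with a 20 min horizon and 5 min of additional data.
history = {
    "a": [(0, 0, 0), (600, 10, 0), (1200, 20, 0), (1800, 30, 0), (2400, 40, 0)],
    "b": [(0, 5, 0), (600, 15, 0)],
}
result = extract_forward(history, [("a", 0), ("a", 1), ("b", 0)],
                         (9, 0), horizon_s=1200, extra_s=300)
```

In the example, trajectory "b" is dropped because it ends before the horizon, while "a" is re-timed from its nearest point to the selected vessel.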
In the classification module, the trajectories identified in
the clustering module are extended 10 min into the past. This
is referred to as a backward trajectory extraction. In this manner,
each backward trajectory is an extension of one of the
forward trajectories from the clustering module, and has a
corresponding class label. The past 10 min behavior of the
selected vessel is compared to the backward trajectories, and
the comparison is used to classify the selected vessel to one
of the clusters of future behavior.
In the prediction module, the subset of forward trajectories
belonging to the cluster identified in the classification module
is utilized to conduct a prediction. In this manner, only the
specific behavior in the cluster is used to conduct the pre-
diction. This should enhance the accuracy of the prediction
compared to cases in which the trajectories diverge.
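The classification and selection steps outlined above can be illustrated with a simple nearest-neighbor vote. This is only a sketch under assumptions: it compares raw positions of trajectories sampled at the same time instants, whereas the features actually used for classification are addressed in the classification module; the function names and the value of `k` are hypothetical.

```python
import math
from collections import Counter

def trajectory_distance(a, b):
    """Euclidean distance between two trajectories sampled at the same instants."""
    return math.sqrt(sum((ax - bx) ** 2 + (ay - by) ** 2
                         for (ax, ay), (bx, by) in zip(a, b)))

def classify(selected, backward, labels, k=3):
    """Majority vote among the k backward trajectories closest to `selected`."""
    ranked = sorted(range(len(backward)),
                    key=lambda i: trajectory_distance(selected, backward[i]))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

def select_forward(forward, labels, cluster):
    """Forward trajectories of the chosen cluster, used as the prediction basis."""
    return [f for f, lab in zip(forward, labels) if lab == cluster]

# Two behavior modes: eastbound (label 0) and northbound (label 1).
backward = [[(0, 0), (1, 0)], [(0, 0), (1, 0.1)],
            [(0, 0), (0, 1)], [(0, 0), (0.1, 1)]]
labels = [0, 0, 1, 1]
mode = classify([(0, 0), (0.9, 0.05)], backward, labels, k=3)
```

Restricting the prediction to `select_forward(...)` for the chosen mode mirrors the idea that only behavior from the matching cluster should inform the predicted trajectory.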
2.2. Trajectory clustering module
Machine learning can be split into two groups, supervised
and unsupervised learning. Supervised learning deals with
techniques where class labels are available, and one wishes
to train an algorithm to correctly classify an unseen data
point to a given class. Unsupervised learning, however, deals
with data where the class labels are unavailable. In such a
case, it is desirable to discover underlying groupings, or clusters,
in the data. Clustering is, therefore, a form of unsupervised
learning. In this study, the class labels for the extracted
trajectories are unavailable, requiring the use of unsupervised
learning. As such, clustering is investigated to discover groupings,
or clusters, of historical ship trajectories that represent
behavior modes that a selected vessel may belong
to. This section covers the methodology utilized to cluster the
historical trajectories.
Fig. 1. Method architecture.
2.2.1. Initial clustering
The input to the algorithm is the initial state of a selected
vessel, and is defined in (1) . This state can be thought of as
the current state of a target vessel, whose future trajectory is
of interest to predict. Such parameters can be acquired from
on-board sensors e.g. radar, or from external sources e.g. AIS.
$$s_0 \rightarrow [x_0, y_0, \chi_0, v_0, T_0] \quad (1)$$
It is of interest to identify similar vessels in the historical
AIS database, i.e. data points with a high degree of similarity
to $s_0$. It can be argued that AIS data points similar to
$s_0$ will have a higher probability of having similar trajectories
than dissimilar data points. It is, therefore, assumed that
ships that were in a similar geographical location, with a sim-
ilar course and speed over ground, will likely have behaved
in a similar manner. As such, the trajectories of these vessels
can be thought of as representing the distribution of the pos-
sible future behavior of the selected vessel. It is, therefore,
assumed that these trajectories can be used to estimate the
future behavior of the selected vessel. The discovery of such
similar vessels is achieved via the initial clustering technique
described in this section.
A matrix Z can be defined as the subset of spatial data in
the AIS data set. The spatial data is converted from longitude
and latitude values to UTM coordinates ( x, y) prior to cluster-
ing. A rotational affine transformation can be defined to rotate
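$Z = [x_z, y_z]$ by $\theta = \chi_0$ to $Z' = [x_z', y_z']$. This transformation
is defined in (2).
$$ Z' = R Z^T \tag{2} $$
where $x_z \in \mathbb{R}$, $y_z \in \mathbb{R}$, $x_z' \in \mathbb{R}$, $y_z' \in \mathbb{R}$ and $R$ is the rotation
matrix defined as:
$$ R = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \tag{3} $$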
The new matrix, $Z'$, will have a basis comprised of a vector
in the direction of $\chi_0$, and one orthogonal to $\chi_0$. An initial
cluster $C_0$ is then created using data in the space spanned by
these basis vectors in (4). This clustering operation results in
a rectangular cluster $C_0$ with a height of $2\delta_H$ and width $2\delta_W$
centered about $s_0$ as illustrated in Fig. 2, which is adapted
from that presented in [32]. The cluster also only contains
data points with similar $\chi$ and $v$ values that were at a similar
position to the selected vessel at some previous time point.
The rectangular shape of the cluster orthogonal to $\chi_0$ should
capture most vessels that have similar trajectories to that of
$s_0$.
Fig. 2. Initial cluster $C_0$.
$$ C_0 = \{ a_i \in A : (|x'_{z,i} - x'_{z,0}| \le \delta_W \wedge |y'_{z,i} - y'_{z,0}| \le \delta_H) \wedge (|\chi_i - \chi_0| \le \chi_\delta \wedge |v_i - v_0| \le v_\delta) \} \tag{4} $$
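As an illustration of the selection in (2) to (4), the following minimal Python sketch rotates the positions by $\chi_0$ and applies the four thresholds. It is not the implementation used in this study: the function name `initial_cluster`, the array layout, the use of radians, the counter-clockwise form of the rotation matrix, and the wrapping of course differences to $[-\pi, \pi]$ are assumptions introduced here.

```python
import numpy as np

def initial_cluster(points, s0, delta_w, delta_h, delta_chi, delta_v):
    """Return a boolean mask of the rows of `points` that fall inside C_0.

    points: array (N, 4) with columns x, y, course [rad], speed (assumed layout).
    s0:     tuple (x0, y0, chi0, v0) describing the selected vessel.
    """
    x0, y0, chi0, v0 = s0
    theta = chi0
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    # Rotate the historical positions and the query position with the same R.
    xy_rot = (R @ points[:, :2].T).T
    x0_rot, y0_rot = R @ np.array([x0, y0])
    # Assumption: wrap course differences so that 359 deg and 1 deg are close.
    dchi = np.angle(np.exp(1j * (points[:, 2] - chi0)))
    return ((np.abs(xy_rot[:, 0] - x0_rot) <= delta_w)
            & (np.abs(xy_rot[:, 1] - y0_rot) <= delta_h)
            & (np.abs(dchi) <= delta_chi)
            & (np.abs(points[:, 3] - v0) <= delta_v))
```

The mask can be used to index the full AIS table, so that the vessel identifiers of the selected rows are available for the subsequent trajectory extraction.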
2.2.2. Forward trajectory extraction
Based on the initial cluster $C_0$, unique instances of vessel
trajectories are identified, given that multiple data points in
$C_0$ may belong to the same trajectory. Once unique trajectory
instances have been identified, the nearest point of each
trajectory to $s_0$ in geographical space is defined as its initial
point. The forward trajectories of all instances are then extracted
from this point over a period of time into the future
corresponding to the desired prediction horizon $T_p$. An additional
time period, $T_\delta$, is extracted to ensure sufficient data
density for the trajectory prediction module at the culmination
of the prediction. The trajectories belonging to $C_0$ represent
the possible behavior of the selected vessel, as their initial
points have a high degree of similarity to $s_0$. In other words,
it is likely that the future trajectory of the selected vessel will
be similar to one of the trajectories in $C_0$.
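The extraction step can be sketched for a single historical track as follows. This is a hedged illustration only: the function name `extract_forward`, the column layout of `track`, and the use of seconds for time are assumptions, and the track is assumed to be sorted by time and already reduced to one unique vessel passage.

```python
import numpy as np

def extract_forward(track, xy0, t_pred, t_extra):
    """Extract the forward part of one track.

    track:   array (n, 4) with columns t [s], x, y, v, sorted by time (assumed).
    xy0:     position (x0, y0) of the selected vessel.
    t_pred:  prediction horizon T_p in seconds.
    t_extra: additional period T_delta in seconds.
    """
    # The point nearest to s_0 in geographical space becomes the initial point.
    dist = np.hypot(track[:, 1] - xy0[0], track[:, 2] - xy0[1])
    i0 = int(np.argmin(dist))
    t0 = track[i0, 0]
    # Keep everything from the initial point up to T_p + T_delta later.
    keep = (track[:, 0] >= t0) & (track[:, 0] <= t0 + t_pred + t_extra)
    return track[keep]
```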
2.2.3. Trajectory feature generation
Assuming that the trajectories in $C_0$ represent the distribution
of the possible future behavior of the selected vessel,
it is desirable to discriminate between the various possibil-
ities, i.e. discover groupings of behavior. In this sense, one
wishes to cluster the trajectories into classes of behavior. To
achieve this, each unique trajectory must be described by a
set of features. The term feature in this case refers to an in-
dividual measurable parameter that describes the trajectory.
Each trajectory is to be clustered in an unsupervised manner
based on these features. As such, a trajectory feature vector
is constructed comprising relevant parameters.
The first step in the generation of the feature vectors is
to linearly interpolate each trajectory at 30 s intervals. This
is done to generate higher density data, as well as provide a
common time index with which the trajectories can be compared.
The initial point of each trajectory is defined as $T_0$.
Subsequent data points are, therefore, 30 s apart, starting at
this point. In this manner, the trajectories can be directly compared
at the same time instance relative to $T_0$. Using the interpolated
data, each trajectory feature vector is constructed by
flattening the matrix containing the positional and speed data
$(x, y, v)$ of the trajectory. If each trajectory is of length $L$,
the resultant trajectory feature vector is defined as $f \in \mathbb{R}^{3L \times 1}$.
Utilizing the positional data, $f$ will incorporate the shape of
the trajectory and the inherent course alterations between data
points. The speed of the vessel along the trajectory will also
be inherent in the positional data. Nonetheless, the speed over
ground values at each time instance were deemed relevant to
include to enhance the information stored in each vector.
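A minimal sketch of the feature generation is given below, assuming NumPy and time stamps in seconds. The function name `trajectory_feature_vector`, the row-major flattening order $(x_1, y_1, v_1, x_2, \ldots)$, and the choice of $L$ as the horizon divided by the 30 s step (with the first sample at $T_0$) are assumptions made for illustration and are not stated in this form in the text.

```python
import numpy as np

def trajectory_feature_vector(t, x, y, v, horizon_s, dt=30.0):
    """Interpolate one trajectory on a common time grid and flatten it.

    t, x, y, v: 1-D sequences of equal length, t in seconds and increasing.
    horizon_s:  length of the extracted window in seconds.
    Returns a column vector of shape (3L, 1).
    """
    t = np.asarray(t, dtype=float)
    n_samples = int(round(horizon_s / dt))         # L (assumed definition)
    grid = t[0] + dt * np.arange(n_samples)        # T_0, T_0 + 30 s, ...
    cols = [np.interp(grid, t, np.asarray(c, dtype=float)) for c in (x, y, v)]
    # Stack to an (L, 3) matrix and flatten row by row.
    return np.column_stack(cols).reshape(-1, 1)
```

Under these assumptions, a window of 35 min gives $L = 70$ samples and a feature vector with 210 entries.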
As mentioned, the objective of the module is to cluster the
trajectories, and as such, the respective feature vectors should
provide a basis for discriminating between the classes of be-
havior. In general, including as much information as possible,
i.e. increasing the dimensionality of the feature vector, should
enhance the discriminatory properties of the data set. This is
true, but in a clustering setting one may encounter issues re-
lated to the curse of dimensionality [34] .
Clustering is based on grouping data points via some dis-
tance measure. Points that are closer together are more likely
to be considered part of the same cluster. The curse of dimen-
sionality in relation to clustering was discussed in [35] , where
it was pointed out that a fixed number of data points will
become increasingly sparse as the dimensionality increases.
Data points can in a sense be lost in space as the dimension-
ality increases, as the distance between points with respect
to a given dimension can be large. As a result, clustering
data using standard techniques in a high dimensional space
will degrade the results, as the algorithms are unable to find
groupings in the data. One method to ameliorate this effect
is to reduce the dimensionality of the data.
A common method for dimensionality reduction is the
Karhunen–Loève (KL) transform [36]. The purpose of the
transform is to attain uncorrelated features and is shown in
(5). First, the set of all feature vectors is centered such that
all features have mean zero within the set. Subsequently, the
covariance matrix, $\Sigma$, of the set of all feature vectors is calculated.
Matrix $E$ consists of the eigenvectors of $\Sigma$, and $\Lambda$
is the eigenvalue matrix, where the relationship is shown in
(6). The transform in (5) projects the feature vector, $f$, onto the space spanned
by the eigenvectors of the covariance matrix. The covariance
of the data inherently describes the correlation among the
respective parameters. As such, the eigenvectors of the covariance
matrix will describe the mutually orthogonal directions in which
the data has the highest degree of variation.
$$ x = E^T f \tag{5} $$
where $x \in \mathbb{R}^{3L \times 1}$, $f \in \mathbb{R}^{3L \times 1}$ and $E \in \mathbb{R}^{3L \times 3L}$.
$$ \Sigma = E \Lambda E^T \tag{6} $$
where $\Sigma \in \mathbb{R}^{3L \times 3L}$ and $\Lambda \in \mathbb{R}^{3L \times 3L}$.
In a high dimensional space, however, many of the eigenvectors
will describe very little variation in the data. The KL-transform,
therefore, projects $f$ onto the subspace spanned by
the $l$ eigenvectors with the $l$ largest eigenvalues in (7).
$$ x = E_l^T f \tag{7} $$
where $x \in \mathbb{R}^{l \times 1}$ and $E_l \in \mathbb{R}^{3L \times l}$.
This will inherently preserve the most important covariance
information in the data whilst reducing the dimensionality
to $l$. This may be abstract for the case of the trajectory
feature vector, $f$, as each dimension represents a position or
speed value at a given time instance. Take for instance the
case of a 30 min prediction with five minutes added to allow
for sufficient data density. The dimensionality of $f$ will
then be 210, since 35 min sampled at 30 s intervals gives $L = 70$
and $3L = 210$. The eigenvectors of $\Sigma$ will point in the directions
within this 210-dimensional space where there is a high
degree of variation between the trajectories. As such, it is
difficult to gain a direct physical interpretation of the eigenvectors,
as the projection onto them represents a combination
of multiple parameters. By choosing the $l$ largest eigenvalues,
one chooses the $l$ directions where the variation in the
data is greatest. When projecting the feature vectors onto the
subspace spanned by the eigenvectors corresponding to the
largest eigenvalues, one is in fact generating new features
with a high degree of variation that can be used for further
analysis.
In this study, the projection of $f$ onto the eigenvectors corresponding
to the three largest eigenvalues was chosen as a
representation for each trajectory. Generally, the projection
should retain at least 95% of the variance in the data. This is
evaluated by investigating the sum of the chosen eigenvalues
over the sum of all eigenvalues [37] . It was found that using
the eigenvectors corresponding to the three largest eigenvalues
fulfilled this requirement when evaluating the results. Addi-
tionally, a three-dimensional vector can easily be visualized
when evaluating the performance of the clustering algorithm.
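The KL-transform and the retained-variance check can be sketched as follows. This is an illustrative NumPy sketch rather than the code used in this study: the function name `kl_transform` is hypothetical, the feature vectors are assumed to be stacked as rows of a matrix `F`, and, unlike the fixed choice of three eigenvectors above, the sketch selects the smallest $l$ whose eigenvalue ratio reaches the target.

```python
import numpy as np

def kl_transform(F, var_target=0.95):
    """Project feature vectors onto the leading eigenvectors of their covariance.

    F: array (n_trajectories, 3L), one flattened trajectory per row.
    Returns the projected data (n, l), the basis E_l (3L, l), and the
    fraction of variance retained by the l chosen eigenvalues.
    """
    Fc = F - F.mean(axis=0)                    # center every feature
    cov = np.cov(Fc, rowvar=False)             # covariance matrix Sigma
    eigval, eigvec = np.linalg.eigh(cov)       # eigh returns ascending order
    order = np.argsort(eigval)[::-1]           # sort descending
    eigval, eigvec = eigval[order], eigvec[:, order]
    ratio = np.cumsum(eigval) / np.sum(eigval)
    l = int(np.searchsorted(ratio, var_target) + 1)
    E_l = eigvec[:, :l]
    return Fc @ E_l, E_l, float(ratio[l - 1])
```

Because the rows of `F` are individual trajectories, the row-wise product `Fc @ E_l` applies the projection in (7) to every trajectory at once.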
2.2.4. Unsupervised Gaussian mixture model clustering
Using the reduced trajectory feature vectors generated via
the KL-transform, the trajectories can be clustered. Depending
on $s_0$, the number of true clusters, i.e. classes, will vary.
As such, a flexible clustering algorithm is required that can
adapt to the data in each prediction. Unsupervised Gaussian
Mixture Model clustering was chosen for use in this study.
A Gaussian Mixture Model (GMM) [38] is a flexible model
that adapts to the underlying data. GMMs assume that a data
set, $X$, consists of a mixture of $M$ different Gaussian distributions.
Each distribution has its own mean vector, $\mu_m$, covariance
matrix, $\Sigma_m$, and prior distribution, $\pi_m$. As such, each
distribution will describe that particular class or cluster, i.e.
class $m$. The class membership parameter, $z_i$, is introduced
for each data point, $x_i$, where:
$$ z_{ik} = \begin{cases} 1 & \text{if } k = m \\ 0 & \text{otherwise} \end{cases} $$
where $z_i \in \mathbb{R}^{M \times 1}$.
The class conditional probability is shown in (8). The most
likely model is estimated by maximizing the log-likelihood
with respect to the various model parameters.
$$ p(x_i \mid z_{im} = 1) \sim \mathcal{N}(\mu_m, \Sigma_m) \tag{8} $$
The class membership of the trajectories is, however, un-
known. As such, the Expectation Maximization (EM) algo-
rithm is utilized to conduct the unsupervised GMM cluster-
ing. The GMM requires that a specified number of underly-
ing models, M, is input. Based on this, the EM algorithm
initializes all model parameters. A common method is to initialize
all $\mu_m$ as randomly chosen data points, the priors as
$\pi_m = \frac{1}{M}$ and $\Sigma_m = I$. This initialization is unlikely to model
the underlying data correctly. As such, the algorithm conducts
what is known as the expectation step. In this step, the expected
class membership $z_{im}$ is evaluated in (9), based on
the current model parameters, $\Theta$. All data points will, therefore,
have updated class memberships based on the current
model parameters.
$$ z_{im} = \frac{p(x_i \mid z_{im} = 1; \Theta)\, \pi_m}{\sum_{k=1}^{M} p(x_i \mid z_{ik} = 1; \Theta)\, \pi_k} \tag{9} $$
The next step in the EM algorithm is known as the maxi-
mization step. In this step, the model parameters are updated
based on the new distribution resulting from the expectation
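step. This is done by maximizing the log-likelihood with respect
to $\Theta$. The estimated parameters in the maximization
step are calculated in (10), (11) and (12).
$$ \hat{\mu}_m = \frac{\sum_{i=1}^{N} z_{im} x_i}{\sum_{i=1}^{N} z_{im}} \tag{10} $$
$$ \hat{\Sigma}_m = \frac{\sum_{i=1}^{N} z_{im} (x_i - \mu_m)(x_i - \mu_m)^T}{\sum_{i=1}^{N} z_{im}} \tag{11} $$
$$ \hat{\pi}_m = \frac{\sum_{i=1}^{N} z_{im}}{N} \tag{12} $$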
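The expectation and maximization steps in (9) to (12) can be sketched in NumPy as shown below. This is a simplified illustration and not the implementation used in this study. The names `gaussian_pdf` and `em_gmm`, the small regularization term `reg` added to each covariance matrix, the use of the freshly updated mean in the covariance update, and the convergence check on the means alone are assumptions introduced here. Only a single random initialization is run.

```python
import numpy as np

def gaussian_pdf(X, mu, cov):
    """Multivariate normal density evaluated at every row of X."""
    d = X.shape[1]
    diff = X - mu
    sol = np.linalg.solve(cov, diff.T).T
    expo = -0.5 * np.sum(diff * sol, axis=1)
    _, logdet = np.linalg.slogdet(cov)
    return np.exp(expo - 0.5 * (d * np.log(2.0 * np.pi) + logdet))

def em_gmm(X, M, n_iter=200, tol=1e-6, reg=1e-6, seed=0):
    """Fit an M-component GMM to X (N, d) with a basic EM loop."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    # Initialization: random data points as means, identity covariances,
    # uniform priors 1/M.
    mu = X[rng.choice(N, size=M, replace=False)].copy()
    cov = np.stack([np.eye(d) for _ in range(M)])
    pi = np.full(M, 1.0 / M)
    z = np.full((N, M), 1.0 / M)
    for _ in range(n_iter):
        # Expectation step, Eq. (9): expected class memberships.
        dens = np.column_stack(
            [pi[m] * gaussian_pdf(X, mu[m], cov[m]) for m in range(M)])
        z = dens / np.maximum(dens.sum(axis=1, keepdims=True), 1e-300)
        # Maximization step, Eqs. (10) to (12).
        Nm = z.sum(axis=0) + 1e-12             # guard against empty components
        mu_new = (z.T @ X) / Nm[:, None]
        for m in range(M):
            diff = X - mu_new[m]               # assumption: updated mean is used
            cov[m] = (z[:, m, None] * diff).T @ diff / Nm[m] + reg * np.eye(d)
        pi = Nm / N
        shift = np.max(np.abs(mu_new - mu))
        mu = mu_new
        if shift < tol:                        # parameter convergence (means only)
            break
    return pi, mu, cov, z
```

Hard cluster labels can be obtained from the returned memberships with `z.argmax(axis=1)`.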
The EM algorithm now repeats, where the expected class
memberships are updated, as well as the model parameters.
The algorithm is in a sense adapting to the data, where the
most likely distribution of the data is discovered. This iterative
process continues in a loop until a stopping criterion is
met. One common stopping criterion is the convergence of the
total log-likelihood. Alternatively, one can terminate the algorithm
if there is little to no change in the model parameters,
i.e. the parameters themselves converge. The parameter convergence
criterion was utilized in this study. The
EM algorithm can often have issues with convergence due to poor
initialization. To avoid such convergence issues, a technique is employed
where a number of random initializations are run for
a number of iterations. The best run, i.e. the run with the
greatest log-likelihood score, is then chosen and run for further
iterations. The mixture model will, upon convergence,
consist of $M$ distinct Gaussian distributions which describe
the class conditional probabilities, $p(x \mid c_m)$, of the data, along
with an associated prior distribution, $\pi_m$. The posterior probability
$p(c_m \mid x)$ can be found via Bayes' rule in (13) using
the resultant conditional probabilities and priors from the algorithm.
$$ p(c_m \mid x) = \frac{p(x \mid c_m)\, \pi_m}{p(x)} \tag{13} $$
$$ p(c_m \mid x) > p(c_j \mid x) \quad \forall j \neq m, \; j = 1 \ldots M \tag{14} $$
Clustering of the dataset is then conducted via Bayesian
classification, where each feature vector, $x_i$, is classified to class $m$
according to (14). However, the number of underlying classes,
$M$, is as previously mentioned unknown. In order to determine
the most likely number of clusters, the Bayesian Information
Criterion (BIC) [39], defined in (15), is utilized.
$$ BIC = -2\, LL(M) + K_M \ln(N) \tag{15} $$
For a GMM with $M$ underlying distributions, $LL(M)$ is
the total log-likelihood function computed at the optimum,
$K_M$ the number of free parameters in the mixture model, and
$N$ the number of data points. The EM algorithm can be run
for various GMMs by altering $M$. By calculating the BIC for
each resultant GMM, the most likely GMM is that with the
lowest BIC, as it offers the best trade-off between high likelihood
and low complexity. In this study, it was assumed that there
will be no more than 20 unique clusters in the trajectory data,
and the BIC was, therefore, evaluated for values of $M$ up to
20.
This process discovers the best GMM to fit the data and
provides the number of possible routes, or trajectory behavior
modes, a selected vessel may belong to. By classifying all the
extracted forward trajectories, class labels can be assigned. These
labels are used for further analysis in the subsequent modules.
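The model selection via the BIC can be illustrated with the short sketch below. It assumes that the converged total log-likelihood of each candidate GMM has already been computed, and that every component has a full covariance matrix, so that the number of free parameters in $d$ dimensions is $Md + Md(d+1)/2 + (M-1)$. The function names are hypothetical.

```python
import math

def gmm_free_parameters(M, d):
    """Free parameters of a full-covariance GMM: means, covariances, priors."""
    return M * d + M * d * (d + 1) // 2 + (M - 1)

def bic(log_likelihood, M, d, N):
    """Bayesian Information Criterion for a GMM with M components."""
    return -2.0 * log_likelihood + gmm_free_parameters(M, d) * math.log(N)

def select_num_clusters(log_likelihoods, d, N):
    """Pick the M with the lowest BIC.

    log_likelihoods: dict mapping M to the converged total log-likelihood.
    """
    scores = {M: bic(ll, M, d, N) for M, ll in log_likelihoods.items()}
    return min(scores, key=scores.get), scores
```

In the setting described above, `log_likelihoods` would hold one entry for each value of $M$ from 1 to 20 and `d` would be 3, the dimensionality of the reduced feature vectors.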
2.3. Trajectory classification module
The trajectory clustering module has now clustered all trajectories
present in $C_0$ to $M$ classes. Each class represents
a group of trajectories that have a high degree of similarity.
As such, each class represents a possible future route, or behavior
mode, the selected vessel may belong to. It is now of
interest to classify the selected vessel to the most likely class
of the $M$ possibilities. In this sense, an estimate of the distribution
of the possible future behavior of the selected vessel
can be made. Using the data in the class of ship behavior, a
trajectory prediction can be made. This section presents the
method utilized to achieve such a classification.
2.3.1. Backward trajectory extraction
One possible method to conduct the aforementioned classification
is to utilize the current vessel state, $s_0$, and compare
it to the data points in $C_0$. This, however, will have limited
predictive power, as the classification will be based solely on
one time instance of the selected vessel. An alternative approach
is, therefore, suggested, where the previous 10 min of
the selected vessel’s trajectory are compared to the previous
10 min of data for all trajectories in $C_0$. This is in a sense the
inverse of the forward trajectory extraction process described
in Section 2.2.2. Instead of extracting the trajectories from $T_0$
and 30 min into the future, the past trajectories are extracted
from the same initial point, i.e. from $T_0$, and 10 min into the
past from that time instance. It is assumed that at least 10 min
of behavior for the selected vessel should be available via the
on-board sensors of the own ship, or via external sources e.g.
AIS. The method is otherwise identical to that described in
Section 2.2.2. All the backward trajectories extracted from $C_0$
will have the same labels as those determined by the clustering
technique in Section 2.2.4. As such, a labeled data set is
available that can be used to classify the observed trajectory
of the selected vessel.
2.3.2. Optimal feature generation
Each backward trajectory feature vector is represented by
flattening the matrix containing all position and speed over
ground data, in the same manner as for the forward trajectories
in Section 2.2.3. This will result in a vector $f \in \mathbb{R}^{3L \times 1}$. In
the case of a 10 min trajectory this will be a 60-dimensional
space within which the classification must take place. This
can be a challenging task, as it is likely that the features are
quite similar, given that the vessels in $C_0$ generally will have
similar trajectories for the past 10 min.
To improve the classification accuracy, Linear Discriminant
Analysis (LDA) [40] is utilized. LDA provides a method to
generate features with optimal separation between classes in
a supervised manner. Using the class separability measure $J_3$
in (16), one can optimize a transformation such that features
are generated to optimize class separability.
$$ J_3 = \mathrm{trace}\{ S_w^{-1} S_m \} \tag{16} $$
$S_m$ is the mixture scatter matrix defined as $S_m = S_w + S_b$,
where $S_w$ is the within-class scatter matrix and $S_b$
the between-class scatter matrix. $S_w$ and $S_b$ are defined in
(17) and (19), respectively. $S_w$ describes how compact the
data within each class is, whilst $S_b$ describes how spread out
each class is with respect to the global mean, $\mu_g$, defined in (18).
In a classification setting, one wishes to minimize the trace of $S_w$,
i.e. data are more compact within each class, and maximize
the trace of $S_b$, i.e. the classes are more spread out. This
corresponds to maximizing the class separation criterion $J_3$.
$$ S_w = \sum_{m=1}^{M} \pi_m \Sigma_m \tag{17} $$
$$ \mu_g = \sum_{m=1}^{M} \pi_m \mu_m \tag{18} $$
$$ S_b = \sum_{m=1}^{M} \pi_m (\mu_m - \mu_g)(\mu_m - \mu_g)^T \tag{19} $$
It is desirable to find a transformation $x = A^T f$ such that $J_3$ is
maximized in the transformed space. The optimal transformation
with respect to class separability is found to be $A = E$,
where $E$ is the matrix of eigenvectors of $S_w^{-1} S_b$ in the original
vector space. This relationship is shown in (21), where $\Lambda$
is the corresponding diagonal eigenvalue matrix. The transformation
is shown in (20). However, $S_b$ is of rank $M - 1$,
and correspondingly $S_w^{-1} S_b$ is also of rank $M - 1$. As such,
there will be $M - 1$ nonzero eigenvalues. The transformation in (20) will, therefore,
project $f$ onto the subspace spanned by the eigenvectors with the $l$ largest
eigenvalues in a similar manner to the KL-transform. If $l = M - 1$,
optimality with respect to $J_3$ will be preserved. Further dimensionality
reduction can still be conducted by choosing a
value $l < M - 1$. This will, however, be a sub-optimal solution.
Further details on LDA can be found in [41].
$$ x = E^T f \tag{20} $$
where $x \in \mathbb{R}^{l \times 1}$, $f \in \mathbb{R}^{3L \times 1}$ and $E \in \mathbb{R}^{3L \times l}$.
$$ S_w^{-1} S_b = E \Lambda E^T \tag{21} $$
where $S_w^{-1} S_b \in \mathbb{R}^{3L \times 3L}$ and $\Lambda \in \mathbb{R}^{l \times l}$.
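The LDA feature generation in (16) to (21) can be sketched as follows. This is an illustrative NumPy sketch under several assumptions that are not taken from the text: the function name `lda_projection` is hypothetical, the class priors $\pi_m$ are estimated as class frequencies, the class covariances are the biased sample covariances, and a pseudo-inverse replaces $S_w^{-1}$ to guard against a singular within-class scatter matrix.

```python
import numpy as np

def lda_projection(F, labels, l=None):
    """Project labeled feature vectors onto the leading LDA directions.

    F:      array (N, d), one backward-trajectory feature vector per row.
    labels: array (N,) of class labels from the clustering step.
    l:      number of directions to keep; defaults to M - 1.
    """
    classes = np.unique(labels)
    N, d = F.shape
    S_w = np.zeros((d, d))
    mu_g = np.zeros(d)
    stats = []
    for c in classes:
        Fc = F[labels == c]
        pi_c = len(Fc) / N                     # assumed prior: class frequency
        mu_c = Fc.mean(axis=0)
        S_w += pi_c * np.cov(Fc, rowvar=False, bias=True)   # Eq. (17)
        mu_g += pi_c * mu_c                                  # Eq. (18)
        stats.append((pi_c, mu_c))
    S_b = np.zeros((d, d))
    for pi_c, mu_c in stats:
        diff = (mu_c - mu_g)[:, None]
        S_b += pi_c * (diff @ diff.T)                        # Eq. (19)
    # Assumption: pseudo-inverse in place of the inverse of S_w.
    eigval, eigvec = np.linalg.eig(np.linalg.pinv(S_w) @ S_b)
    order = np.argsort(eigval.real)[::-1]
    if l is None:
        l = len(classes) - 1
    E = eigvec[:, order[:l]].real
    return F @ E, E
```

The returned matrix `E` can be reused to project the observed backward trajectory of the selected vessel into the same space as the labeled historical data.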
2.3.3. Classification
Despite utilizing the optimal features described in Sec-
tion 2.3.2 , the classification task is highly non-linear, and
likely with significant overlap between classes in most cases.
This is due to the high degree of similarity between the past
trajectories. As a result, the k-Nearest Neighbor (kNN) classifier
[42] is utilized due to its nonlinear predictive power.
Given a data point $x_0$, the kNN classifier will measure the
distance to all other data points, $x_i$, in the dataset $X$ using
the Euclidean distance as shown in (22).
$$ d_i = \| x_i - x_0 \|_2 \tag{22} $$
The kNN classifier will then identify the $k$ nearest data points
using distance measures from (22). Based on this subset of
data, the algorithm then identifies the class with the most data
points in the subset, and classifies $x_0$ to the majority class.
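A minimal sketch of the kNN decision rule is given below. The function name `knn_classify` and the default value of `k` are assumptions made for illustration, and ties are resolved in favor of the class encountered first among the nearest neighbors.

```python
import numpy as np
from collections import Counter

def knn_classify(x0, X, labels, k=5):
    """Classify x0 by majority vote among its k nearest neighbors in X.

    x0:     array (d,), the point to classify.
    X:      array (N, d), the labeled data points.
    labels: sequence of N class labels.
    """
    d = np.linalg.norm(X - x0, axis=1)         # Euclidean distances, Eq. (22)
    nearest = np.argsort(d)[:k]                # indices of the k nearest points
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]          # majority class
```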
In this study, $x_0$ is the projection of the backward trajectory