ARTICLE IN PRESS

JID: JOES [m5+; October 21, 2021;18:48 ]

Available online at www.sciencedirect.com

Journal of Ocean Engineering and Science xxx (xxxx) xxx

www.elsevier.com/locate/joes

Original Article

Ship behavior prediction via trajectory extraction-based clustering for maritime situation awareness

Brian Murray ∗, Lokukaluge Prasad Perera

UiT The Arctic University of Norway, Tromsø, Norway

Received 15 October 2020; received in revised form 14 February 2021; accepted 18 March 2021

Available online xxx

Abstract

This study presents a method in which historical AIS data are used to predict the future trajectory of a selected vessel. This is facilitated

via a system intelligence-based approach that can be subsequently utilized to provide enhanced situation awareness to navigators and future

autonomous ships, aiding proactive collision avoidance. By evaluating the historical ship behavior in a given geographical region, the method

applies machine learning techniques to extrapolate commonalities in relevant trajectory segments. These commonalities represent historical

behavior modes that correspond to the possible future behavior of the selected vessel. Subsequently, the selected vessel is classiﬁed to a

behavior mode, and a trajectory with respect to this mode is predicted. This is achieved via an initial clustering technique and subsequent

trajectory extraction. The extracted trajectories are then compressed using the Karhunen–Loève transform, and clustered using a Gaussian

Mixture Model. The approach in this study differs from others in that trajectories are not clustered for an entire region, but rather for

relevant trajectory segments. As such, the extracted trajectories provide a much better basis for clustering relevant historical ship behavior

modes. A selected vessel is then classiﬁed to one of these modes using its observed behavior. Trajectory predictions are facilitated using an

enhanced subset of data that likely correspond to the future behavior of the selected vessel. The method yields promising results, with high

classification accuracy and low prediction error. However, vessels with abnormal behavior degrade the results in some situations; these cases are also discussed in this study.

© 2021 Shanghai Jiaotong University. Published by Elsevier B.V.

This is an open access article under the CC BY-NC-ND license ( http://creativecommons.org/licenses/by-nc-nd/4.0/ )

Keywords: Maritime situation awareness; Ship navigation; Trajectory prediction; Collision avoidance; Machine learning; Unsupervised learning; AIS.

1. Introduction

Technological advances are permeating almost every in-

dustry. Artiﬁcial intelligence, increased computational power

and wireless communication capabilities have the potential

to allow for disruptive innovations that can change business

models drastically. Many argue that there is a digital rev-

olution underway and are calling it Industry 4.0 [1] . If one

looks to the automotive industry for instance, signiﬁcant inno-

vations related to autonomous cars are being developed at an

exponential rate. Autonomous cars are already being tested in

general trafﬁc areas and there are claims that mass production

could be possible by 2021 [2] .

∗Corresponding author.

E-mail address: brian.murray@uit.no (B. Murray).

Similarly, it can be argued that shipping is currently on

its way into a fourth technical revolution, Shipping 4.0 [3] .

The first revolution in shipping can be argued to be the transition from sail to steam at the turn of the 19th century,

the second from steam to diesel around 1910, and the third

came with the introduction of automated systems, made pos-

sible through the advent of computers around 1970. Like the

car industry, the shipping industry is looking to autonomy

as a possible disruptive element. The shipping industry has,

however, historically been considered conservative, with in-

novations being implemented at a slower rate than in similar

industries. As such, technologies associated with autonomous

ships are not as developed as those for autonomous cars.

Nonetheless, many companies are working on the develop-

ment of autonomous ships. The ﬁrst autonomous ships, e.g.

Yara Birkeland, are planned to be launched in 2020 and fully

autonomous by 2022 [4] . It can be argued that if the required

https://doi.org/10.1016/j.joes.2021.03.001

Nomenclature

a     Arbitrary AIS Parameter Vector
A     Set of AIS Data
c     Trajectory Class
C     Data Cluster
d     Euclidean Distance
e     Eigenvector
E     Eigenvector Matrix
f     Trajectory Feature Vector
I     Identity Matrix
J_3   Class Separability Criterion
k     Hyper-parameter for kNN classifier
K_M   Number of Free Parameters in Mixture Model
L     Number of Data Points in Selected Trajectory
ℒ(·)  Log-likelihood Function
M     Number of Models in Mixture Model
N     Number of Trajectories
p     Arbitrary Vessel Position
q     Selected Vessel Position
r     Search Radius [m]
R     Rotation Matrix
s     Vessel State
S_b   Between-class Scatter Matrix
S_w   Within-class Scatter Matrix
t     Timestamp
T     Elapsed Time [s]
T_δ   Additional Time Period [s]
T_p   Desired Prediction Time Horizon [s]
v     Speed over Ground [m/s]
x     UTM x-coordinate [m]
x     Reduced Feature Vector
X     Set of Reduced Feature Vectors
y     UTM y-coordinate [m]
z     Class Membership
Z     Spatial Data Matrix
ΔL    Step Size [m]
Λ     Eigenvalue Matrix
μ     Mean Vector
π     Prior Distribution
Σ     Covariance Matrix
θ     Rotation Angle [°]
Θ     Model Parameters
χ     Course over Ground [°]

Subscripts

0     Initial State
i     Sample Number
g     Global
j     Class Number
k     k-th State
l     Number of Eigenvectors
m     Model Number in Gaussian Mixture
δ     Maximum Offset

Superscripts

ˆ     Estimated Parameter / State

Acronyms

AIS   Automatic Identification System
BIC   Bayesian Information Criterion
EM    Expectation Maximization
GMM   Gaussian Mixture Model
KL    Karhunen–Loève
LDA   Linear Discriminant Analysis

technologies are available, autonomous ships will be safer and

more efﬁcient than conventional vessels, and that because of

this fact they should be adopted by the industry [5] . For this to

occur, however, autonomous ships must be proven to operate

at a level of safety comparable to, or better than, conventional

manned vessels.

1.1. Maritime situation awareness

For autonomous ships to be introduced into commercial

shipping lanes, effective collision avoidance systems [6] must

be in place to ensure that the autonomous operations have the

required level of safety. Given that the vessels are unmanned,

an autonomous ship must be able to make decisions based

on its understanding of its surroundings, i.e. its own situation

awareness. Situation awareness is deﬁned as “Being aware of

what is happening around you and understanding what that

information means to you now and in the future” [7] , and is

separated into three levels [8] :

1. Perception of the elements in the current situation

2. Comprehension of the current situation

3. Projection of the future status

For an autonomous vessel, situation awareness will pri-

marily entail obstacle detection and prediction of close-range

encounter situations. Other vessels are the most common ob-

stacle an autonomous ship will encounter, and are referred to

as target vessels in an encounter situation. The autonomous

vessel in this case is referred to as the own ship. Such situa-

tions will require collision avoidance maneuvers.

1.1.1. Perception of elements in the current situation

To effectively conduct collision avoidance maneuvers with

respect to target vessels, an own ship will need to be able

to ﬁrst detect the target vessel, and evaluate relevant param-

eters such as its position, course over ground and speed over

ground. This can be considered as the ﬁrst level of situa-

tion awareness. An autonomous ship must, therefore, ﬁrst

deﬁne its current state, where all obstacles and their cur-

rent states are known. In order to perceive the relevant ob-

stacles, an autonomous ship must be able to observe them.

Since there is no navigator on-board, collision avoidance tech-

nologies will rely heavily on the sensor suite available on-

board the vessel, as they must in essence replace the eyes

of the navigator. An advanced obstacle detection and track-

ing system, which utilizes sensor fusion to enhance detec-

tion capabilities, should be utilized. Relevant sensors will

likely include RADAR (Radio Detection and Ranging) and

electro-optical sensors [9]. Examples include LIDAR (Light Detection and Ranging), stereo cameras and infra-red cameras.

1.1.2. Comprehension of the current situation

Based on its current state, the own ship must be capable of

evaluating the risk of collision. If there is a risk of collision,

the own ship must conduct collision avoidance maneuvers

that adhere to the COLREGS as outlined in [10] . This corre-

sponds to level two of Endsley’s situation awareness, where

the ship must now make sense of its current state, and the im-

mediate implications it has for the safety of the operation. Fu-

jii and Tanaka [11] and Goodwin [12] introduced the concept

of the ship domain, where a safety region around a relevant

vessel is introduced to indicate the collision risk. A thorough

review of collision avoidance methods can be found in [13] .

These methods are designed with respect to ships in close-

range encounters, where the collision risk is high enough to

require collision avoidance maneuvers.

1.1.3. Projection of the future status

Level three situation awareness addresses the projection of

the future state of the vessel. In a collision avoidance setting,

this entails predicting both the future states of the own ship,

as well as the future states of target vessels. Previous studies on collision avoidance typically predict the future state of a target vessel via calculations using constant

course and speed values. Based on this, collision risk param-

eters relating to the closest point of approach (CPA) such as

the distance (DCPA) and time (TCPA) can be determined,

and necessary collision avoidance maneuvers conducted on

this basis.
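
The constant course and speed assumption described above can be illustrated with a short sketch. It is not taken from the cited studies; the function names `cpa` and `velocity_from_course`, the planar (UTM-like) coordinates, and the convention that course is measured clockwise from north are all assumptions made for illustration.

```python
import math


def velocity_from_course(speed_ms, course_deg):
    """Convert speed and course over ground to a velocity (vx, vy) in m/s.

    Assumes course is measured clockwise from north, with north along +y.
    """
    c = math.radians(course_deg)
    return speed_ms * math.sin(c), speed_ms * math.cos(c)


def cpa(own_pos, own_vel, tgt_pos, tgt_vel):
    """Return (tcpa_s, dcpa_m) for two vessels holding constant velocity.

    Positions are (x, y) in metres, velocities are (vx, vy) in m/s.
    """
    # Position and velocity of the target relative to the own ship.
    rx, ry = tgt_pos[0] - own_pos[0], tgt_pos[1] - own_pos[1]
    vx, vy = tgt_vel[0] - own_vel[0], tgt_vel[1] - own_vel[1]
    rel_speed_sq = vx * vx + vy * vy
    if rel_speed_sq == 0.0:
        # Identical velocities: the separation never changes.
        return 0.0, math.hypot(rx, ry)
    # Time of minimum separation; clamped to zero if it lies in the past.
    tcpa = max(0.0, -(rx * vx + ry * vy) / rel_speed_sq)
    dcpa = math.hypot(rx + vx * tcpa, ry + vy * tcpa)
    return tcpa, dcpa


# Own ship heading north, target heading west, both at 5 m/s.
own_v = velocity_from_course(5.0, 0.0)
tgt_v = velocity_from_course(5.0, 270.0)
tcpa, dcpa = cpa((0.0, 0.0), own_v, (1000.0, 1000.0), tgt_v)
```

Because the estimate extrapolates the current velocity linearly, it degrades as soon as either vessel maneuvers, which is the limitation that motivates longer-horizon, data-driven prediction.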

Ships have a slow response time when control actions are

sent to change the speed or course over ground. Cars for

instance can make changes almost instantaneously, depend-

ing on their speed. The inertia forces of a ship are, how-

ever, much higher, and resultant collision avoidance maneu-

vers will take much longer to conduct. Therefore, it is de-

sirable to predict the risk of collision as far as possible in

advance. This entails predicting the future trajectories of both

the own ship and target vessels accurately. Methods such as

[14] , where a fuzzy logic based decision making system for

collision avoidance was introduced, and [15] , where parallel

trajectory planning was proposed for autonomous collision

avoidance, can improve the ability of an autonomous vessel

to make decisions. Additionally, work on more advanced pre-

diction algorithms, e.g. [16] , where extended Kalman ﬁlters

were utilized to estimate ship trajectories, can enhance the

situation awareness of autonomous vessels to aid in effective

collision avoidance. However, predictions under such methods

are only useful up to rather short prediction horizons (order

of seconds to minutes). These methods are, therefore, useful

in the case of a close-range encounter situation. In such sit-

uations, the own ship must make decisions based on input

from the sensor system, and plan effective collision avoid-

ance maneuvers. This, however, entails that there is a risk of

collision.

This study suggests an approach in which the trajectory of

a target vessel is predicted far in advance, such that potential

close-range encounter situations are prevented from occurring.

With an enhanced level of situation awareness, an autonomous

vessel can predict its own future states, as well as those for

relevant target vessels, for a period up to 30 min into the

future. Based on this level of situation awareness, intelligent

decisions can be made to identify possible future close-range

encounter situations, and optimally implement simple proac-

tive collision avoidance strategies. Examples of such strategies

may include minor speed or course alterations, such that the

future trajectory of the own ship is altered. This is unfortunately not straightforward. It can be assumed that the majority of vessels will be manned in the foreseeable future. As

such, the behavior of potential target vessels is highly unpre-

dictable for an autonomous agent. Such a strategy, therefore,

requires a system intelligence based approach to maritime sit-

uation awareness.

1.2. System intelligence based ship trajectory prediction

Data from the Automatic Identiﬁcation System (AIS) pro-

vide a powerful data set upon which analytics can be con-

ducted. Historical AIS data provide insight into historical ship

behavior that can be used to gain insight into patterns in mar-

itime trafﬁc. A myriad of ship parameters are recorded in the

stored ship trajectories, including positional data, speed over

ground values, and course over ground values for various time

instances. AIS data provide an ideal data set upon which ma-

chine learning techniques can be applied to yield insight into

patterns for subsequent use in maritime trafﬁc analysis. Ma-

chine learning is a powerful tool, where insight can be ex-

tracted from data for a variety of purposes. Examples in the

maritime ﬁeld include [17] , where an optimal truncated least

square support vector was utilized to estimate parameters for

nonlinear maneuvering models, and Shen et al. [18] where

deep reinforcement learning was used to facilitate automatic

collision avoidance.

This study proposes to provide future vessels with a degree of system intelligence, facilitated by historical knowledge that is extrapolated via machine learning techniques from AIS data. Using the historical knowledge available, such system intelligence will provide predictions of vessel trajectories, allowing for subsequent collision risk assessment. The purpose is both to enhance the safety of future autonomous ship operations and to provide decision support to conventional vessels. This section presents relevant related work and the contributions of this study.

1.2.1. Related work

An increasing amount of research is being conducted on

methods to utilize AIS data. Zhang et al. [19] analyzed AIS

data to gain insight into the spatial-temporal dynamics of ship

trafﬁc around ports. Additionally, Liu et al. [20] used AIS data

to evaluate regional collision risk, and [21] utilized AIS data

to automatically generate ship routes. Tu et al. [22] provided

a comprehensive review of methods to exploit AIS data for

maritime navigation. Most work in the ﬁeld has previously fo-

cused on predicting vessel trajectory patterns and general traf-

ﬁc behavior e.g. [23] . Identifying anomalous behavior based

on general vessel patterns, e.g. [24], has also been a focus.

These methods are useful for general behavior analysis, but

are of limited use with respect to aiding in collision avoid-

ance.

Of most interest in a collision avoidance setting is the work

done on utilizing AIS data to predict the future trajectory of

a vessel. The idea is to infer the future trajectory of a vessel

based on the historical behavior of vessels in the same re-

gion. This information is stored in historical AIS data. Ristic

et al. [25] presented a method to predict the future motion of

a vessel utilizing a particle ﬁlter approach, but the accuracy is

limited for use in collision avoidance. Pallotta et al. [26] pre-

sented the TREAD (Trafﬁc Route Extraction and Anomaly

Detection) methodology to cluster all trajectories in a deﬁned

region in an unsupervised manner, and subsequently classify a

selected vessel to one of the clusters, each representing a traf-

ﬁc route for the purpose of anomaly detection. Pallotta et al.

[27] subsequently utilized the TREAD methodology to iden-

tify trafﬁc routes, classify a vessel to a route, and predict the

vessel position along this route using the Ornstein–Uhlenbeck

stochastic process. The TREAD technique, however, clusters

way-points, entry points, and stationary points such that the

data for the entire region is utilized to differentiate between

the vessels. As such, there can be signiﬁcant discrepancies be-

tween sub-paths for trajectories belonging to the same class.

This is of limited importance for long-term predictions (or-

der of hours), and the method using the Ornstein–Uhlenbeck

stochastic process is effective in such cases. The method’s

mean and variance functions do not change over time, how-

ever, which can be considered a strict assumption for real

applications. Short-term predictions (order 5–30 min) of high

accuracy and resolution, however, are arguably of more inter-

est for collision avoidance purposes. For such predictions, the

method will not be as effective. Mazzarella et al. [28] also

presented a prediction method using a Bayesian network-

based algorithm with a particle ﬁlter for prediction horizons

in the order of hours. However, this method also has lim-

ited efﬁcacy in short-term trajectory predictions relevant for

collision avoidance purposes.

Hexeberg et al. [29] presented an AIS-based approach to

predict short-term vessel trajectories. The method utilizes a

single point neighbor search method to predict a vessel tra-

jectory based on the underlying AIS data. The method, how-

ever, is unable to handle branching, and [30] expanded on

this work to provide multiple predictions via a prediction

tree, where samples are drawn from close neighbors in the

underlying data. In this manner, a probability estimate can be

evaluated for the future position at a given point in time, facil-

itated via Gaussian Mixture Models. As opposed to previous

methods, these methods do not utilize clustering to identify

trafﬁc routes. All predictions are based on the AIS data in

the neighborhoods of predicted states. As such, these methods

do not take into consideration the relationship between data

points. Future states are predicted iteratively from an initial

state based on the AIS data in the neighborhood of a pre-

dicted position. These data, however, may include data points

that have no relationship to the initial, or previous, predicted

states, and as such may degrade the accuracy. Rong et al.

[31] also presented an approach using a Gaussian process

model, where a probabilistic trajectory prediction method is

outlined which, in addition to predicting the future positions

of a vessel, also describes the uncertainty of the predicted

position. The method, however, is only evaluated using

regular ship routes and offers no method to identify multiple

possible future routes the vessel may follow, and classify it

to one.

1.2.2. Contribution

In this study, a method to provide system intelligence

to future autonomous ships is suggested for the purpose of

enhanced situation awareness. The method is facilitated by

leveraging historical AIS data via machine learning techniques

to predict the future trajectory of a vessel based on its initial

state. The method provides short-term trajectory predictions

(order 5–30 min) that can provide a basis for collision risk

assessments. In this manner, possible close-range encounter

situations can be avoided, and the overall safety associated

with autonomous operations can be increased.

The method presented in this study is based on a similar

structure to that of previous techniques, in that trajectories

are ﬁrst clustered, a selected vessel is classiﬁed to a given

cluster of trajectories, and a subsequent trajectory prediction is

determined. However, this method is designed to aid in short-

term trajectory predictions. As such, an alternative approach is

suggested, where an initial clustering technique is utilized to

extract a subset of data from a historical AIS data set, centered

about the initial vessel state. This cluster contains AIS data

that has a high degree of similarity to the initial state of the

selected vessel. Using this initial cluster, all unique future, i.e.

forward, trajectories are extracted from the cluster. The length

of these is deﬁned by the desired prediction time horizon.

These trajectories represent all future paths of ships that had

similar states to the initial state of the selected vessel. This

data set will, therefore, only contain data that are related to the

initial vessel state, as well as retain the relationship between

data points.

The extracted forward trajectories represent the possible

future behavior of the selected vessel for a given prediction

horizon. In this study, it is of interest to identify all possible

trajectory modes of the historical ship behavior, such that a

high ﬁdelity trajectory prediction can be conducted to support

collision avoidance. Identifying such modes can be facilitated

by clustering the forward trajectories. It is only of interest to

differentiate between different possible modes for the dura-

tion of the desired prediction horizon. As such, clustering the

extracted forward trajectories will provide a better basis for

relevant route identiﬁcation compared to other methods where

entire trajectories for regions are considered.

A clustering technique is suggested based on all relevant

data in each unique extracted forward trajectory. Dimensionality reduction via the Karhunen–Loève transform is first conducted in order to compress the trajectories, whilst retaining

the most important information relevant for differentiating the

ship behavior. Such dimensionality reduction should support

the clustering performance. Clustering is then facilitated via

unsupervised Gaussian Mixture Modeling. A selected vessel

is then classiﬁed to a cluster based on its past behavior. This

is achieved via backward trajectory extraction, and optimally

generating features for class separation using Linear Discrim-

inant Analysis. Finally, a trajectory prediction is conducted

with respect to the trajectory data in the cluster of historical

ship behavior.

The method has enhanced performance as it can discover

the cluster of most similar ship behavior. This allows for

predictions with a higher degree of ﬁdelity than other meth-

ods with respect to collision avoidance. This is effective for

challenging regions with more complex trafﬁc, i.e. multiple

possible routes with various speeds. The trajectory prediction

also provides an increased level of accuracy given the rela-

tionship between data points in the underlying data. Similar

methods use techniques that introduce time invariance, such

as dynamic time warping. These result in effective clustering

of trajectories of similar shapes for a given region, but cannot capture when the various maneuvers take place. Furthermore, clustering ship trajectories for

entire regions will yield different results than clustering the

extracted trajectories as suggested in this study. Therefore,

the technique in this study provides a better basis for dif-

ferentiating relevant historical ship behavior, supporting ship

trajectory prediction. The method can also be applied in any

geographical region, as the algorithm only requires access to

raw AIS data of sufﬁcient density for the region of interest.

An initial version of this work was presented in [32] .

Furthermore, in [33] , a Dual Linear Autoencoder approach

was introduced to facilitate trajectory prediction, that utilized

similar clustering and classiﬁcation regimes to those in this

study. The clustering and classiﬁcation techniques utilized in

[33] were, however, not the focus of the study, and, there-

fore, not addressed in detail. This study can, therefore, be

considered a parallel study, where the methods introduced in

[32] are expanded upon, and addressed in detail.

2. Methodology

This section outlines the methodology utilized to facili-

tate trajectory predictions via the proposed system intelligence

approach. First, the general approach to facilitate a trajectory

prediction is presented. Second, the trajectory clustering mod-

ule is outlined. Next, the trajectory classiﬁcation module is

discussed. Finally, the methodology involved in the trajectory

prediction module is presented.

2.1. General prediction approach

The objective of the method presented in this study is to

facilitate a prediction of the future trajectory of a target ves-

sel, hereafter referred to as a selected vessel. In this study,

a prediction horizon of 30 min is investigated. This, how-

ever, can be varied based on the desires of the user. It is

further assumed that the past 10 min behavior of the selected

vessel is available. The architecture of the method can be

split into three modules: the trajectory clustering module, the

trajectory classiﬁcation module and the trajectory prediction

module. This is illustrated in Fig. 1 .

The trajectory clustering module ﬁrst employs an initial

clustering technique. Based on the current state of the se-

lected vessel, the technique identiﬁes historical AIS messages

in a deﬁned region surrounding the current position of the

selected vessel. Furthermore, data are ﬁltered such that they

have similar speed and course over ground values to that of

the selected vessel. In this manner, ships with similar behavior

in the past are identiﬁed. The forward, i.e. future, trajectories

are then extracted 30 min into the future from their initial

data points. These represent the distribution of possible 30

min behavior for the selected vessel. Next, the forward tra-

jectories are clustered. In this manner, each cluster represents

a mode of ship behavior, where each is comprised of similar

trajectories. It is, therefore, of interest to identify the most

likely mode of future ship behavior the selected vessel may

belong to, such that a prediction of enhanced ﬁdelity can be

facilitated.

In the classiﬁcation module, the trajectories identiﬁed in

the clustering module are extended 10 min into the past. This

is referred to as a backward trajectory extraction. In this man-

ner, each backward trajectory is an extension of one of the

forward trajectories from the clustering module, and has a corresponding class label. The past 10 min behavior of the selected vessel is compared to the backward trajectories, and the comparison is used to classify the selected vessel to one of the clusters of future behavior.

In the prediction module, the subset of forward trajectories belonging to the cluster identified in the classification module is utilized to conduct a prediction. In this manner, only the

speciﬁc behavior in the cluster is used to conduct the pre-

diction. This should enhance the accuracy of the prediction

compared to cases in which the trajectories diverge.

2.2. Trajectory clustering module

Machine learning can be split into two groups, supervised

and unsupervised learning. Supervised learning deals with

techniques where class labels are available, and one wishes to train an algorithm to correctly classify an unseen data

point to a given class. Unsupervised learning, however, deals

with data where the class labels are unavailable. In such a

case, it is desirable to discover underlying groupings, or clus-

ters, in the data. Clustering is, therefore, a form of unsuper-

vised learning. In this study, the class labels for the extracted

trajectories are unavailable, requiring the use of unsupervised

learning. As such, clustering is investigated to discover groupings, or clusters, of historical ship trajectories that represent behavior modes that a selected vessel may belong to. This section covers the methodology utilized to cluster the historical trajectories.

Fig. 1. Method architecture.

2.2.1. Initial clustering

The input to the algorithm is the initial state of a selected

vessel, and is deﬁned in (1) . This state can be thought of as

the current state of a target vessel, whose future trajectory is

of interest to predict. Such parameters can be acquired from

on-board sensors e.g. radar, or from external sources e.g. AIS.

$$ s_0 \rightarrow [x_0, y_0, \chi_0, v_0, T_0] \tag{1} $$

It is of interest to identify similar vessels in the historical

AIS database, i.e. data points with a high degree of similarity to $s_0$. It can be argued that AIS data points similar to $s_0$ will have a higher probability of having similar trajectories than dissimilar data points. It is, therefore, assumed that

ships that were in a similar geographical location, with a sim-

ilar course and speed over ground, will likely have behaved

in a similar manner. As such, the trajectories of these vessels

can be thought of as representing the distribution of the pos-

sible future behavior of the selected vessel. It is, therefore,

assumed that these trajectories can be used to estimate the

future behavior of the selected vessel. The discovery of such

similar vessels is achieved via the initial clustering technique

described in this section.

A matrix Z can be deﬁned as the subset of spatial data in

the AIS data set. The spatial data is converted from longitude

and latitude values to UTM coordinates $(x, y)$ prior to clustering. A rotational affine transformation can be defined to rotate $Z = [x_z, y_z]$ by $\theta = \chi_0$ to $Z' = [x'_z, y'_z]$. This transformation is defined in (2).

$$ Z' = R Z^{T} \tag{2} $$

where $x_z \in \mathbb{R}$, $y_z \in \mathbb{R}$, $x'_z \in \mathbb{R}$, $y'_z \in \mathbb{R}$ and $R$ is the rotation matrix defined as:

$$ R = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \tag{3} $$

The new matrix, $Z'$, will have a basis comprised of a vector in the direction of $\chi_0$, and one orthogonal to $\chi_0$. An initial cluster $C_0$ is then created using data in the space spanned by these basis vectors in (4). This clustering operation results in a rectangular cluster $C_0$ with a height of $2\delta_H$ and width $2\delta_W$ centered about $s_0$ as illustrated in Fig. 2, which is adapted from that presented in [32]. The cluster also only contains data points with similar $\chi$ and $v$ values that were at a similar position to the selected vessel at some previous time point. The rectangular shape of the cluster orthogonal to $\chi_0$ should capture most vessels that have similar trajectories to that of $s_0$.

Fig. 2. Initial cluster $C_0$.

$$C_0 = \{ a_i \in A : (|x'_{z_i} - x'_{z_0}| \le \delta_W \wedge |y'_{z_i} - y'_{z_0}| \le \delta_H) \wedge (|\chi_i - \chi_0| \le \chi_\delta \wedge |v_i - v_0| \le v_\delta) \} \tag{4}$$
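The rotation in (2) and (3) and the membership test in (4) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation. The function name, the dictionary keys and the example values are hypothetical. It also assumes that $x$ is the UTM easting, $y$ the northing, and that the course is given in radians clockwise from north, in which case the rotated $y'$ axis points along $\chi_0$. Wrapping the course difference into $[-\pi, \pi]$ is an added assumption that is not stated in the text.

```python
import math

def initial_cluster(points, s0, delta_w, delta_h, chi_delta, v_delta):
    """Select the historical AIS points that fall inside C_0 around s0.

    points and s0 are dicts with keys "x", "y" (UTM, metres), "chi"
    (course over ground, radians) and "v" (speed over ground).
    """
    theta = s0["chi"]
    c, s = math.cos(theta), math.sin(theta)

    def rotate(x, y):
        # One column of Z' = R Z^T, with R as in (3).
        return c * x - s * y, s * x + c * y

    x0, y0 = rotate(s0["x"], s0["y"])
    cluster = []
    for p in points:
        xr, yr = rotate(p["x"], p["y"])
        # Assumption: course differences are wrapped into [-pi, pi].
        d_chi = math.atan2(math.sin(p["chi"] - theta),
                           math.cos(p["chi"] - theta))
        in_rectangle = abs(xr - x0) <= delta_w and abs(yr - y0) <= delta_h
        similar_motion = (abs(d_chi) <= chi_delta
                          and abs(p["v"] - s0["v"]) <= v_delta)
        if in_rectangle and similar_motion:
            cluster.append(p)
    return cluster


# Hypothetical example: selected vessel heading due east.
s0 = {"x": 0.0, "y": 0.0, "chi": math.pi / 2, "v": 10.0}
history = [
    {"x": 150.0, "y": 0.0, "chi": 1.60, "v": 10.5},  # ahead, along the course
    {"x": 0.0, "y": 150.0, "chi": 1.60, "v": 10.5},  # abeam, across the course
    {"x": 50.0, "y": 20.0, "chi": 3.10, "v": 10.0},  # nearby, different course
]
c0 = initial_cluster(history, s0, delta_w=100.0, delta_h=200.0,
                     chi_delta=0.2, v_delta=1.0)
```

With $\delta_H > \delta_W$ as in this example, only the first point is retained: the second lies outside the rectangle across the course, and the third has a dissimilar course.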

2.2.2. Forward trajectory extraction

Based on the initial cluster $C_0$, unique instances of vessel trajectories are identified, given that multiple data points in $C_0$ may belong to the same trajectory. Once unique trajectory instances have been identified, the nearest point of each trajectory to $s_0$ in geographical space is defined as its initial point. The forward trajectories of all instances are then extracted from this point and a period of time into the future corresponding to the desired prediction horizon $T_p$. An additional time period, $T_\delta$, is extracted to ensure sufficient data density for the trajectory prediction module at the culmination of the prediction. The trajectories belonging to $C_0$ represent the possible behavior of the selected vessel, as their initial points have a high degree of similarity to $s_0$. In other words, it is likely that the future trajectory of the selected vessel will be similar to one of the trajectories in $C_0$.

2.2.3. Trajectory feature generation

Assuming that the trajectories in $C_0$ represent the distribution of the possible future behavior of the selected vessel, it is desirable to discriminate between the various possibilities, i.e. discover groupings of behavior. In this sense, one wishes to cluster the trajectories into classes of behavior. To achieve this, each unique trajectory must be described by a set of features. The term feature in this case refers to an individual measurable parameter that describes the trajectory. Each trajectory is to be clustered in an unsupervised manner based on these features. As such, a trajectory feature vector is constructed comprising relevant parameters.

The first step in the generation of the feature vectors is to linearly interpolate each trajectory at 30 s intervals. This is done to generate higher density data, as well as provide a common time index with which the trajectories can be compared. The initial point of each trajectory is defined as $T_0$. Subsequent data points are, therefore, 30 s apart, starting at this point. In this manner, the trajectories can be directly compared at the same time instance relative to $T_0$. Using the interpolated data, each trajectory feature vector is constructed by flattening the matrix containing the positional and speed data $(x, y, v)$ of the trajectory. If each trajectory is of length $L$, the resultant trajectory feature vector is defined as $f \in \mathbb{R}^{3L \times 1}$. Utilizing the positional data, $f$ will incorporate the shape of the trajectory and the inherent course alterations between data points. The speed of the vessel along the trajectory will also be inherent in the positional data. Nonetheless, the speed over ground values at each time instance were deemed relevant to include to enhance the information stored in each vector.
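The interpolation and flattening steps can be sketched as follows. This is an illustrative sketch only: the function name and the sample trajectory are hypothetical, and the interleaved ordering of the flattened vector $(x_0, y_0, v_0, x_1, \ldots)$ is an assumption, as the text does not specify it. The time grid starts at $T_0$ and excludes the end point, which gives $L = 70$ samples for a 35 min window.

```python
import numpy as np

def trajectory_feature_vector(t, x, y, v, horizon_s, step_s=30.0):
    """Interpolate one trajectory onto a common time grid and flatten it.

    t holds time stamps in seconds relative to the initial point T_0 and
    must be increasing; x, y are positions and v is speed over ground.
    """
    grid = np.arange(0.0, horizon_s, step_s)   # L samples, step_s apart
    samples = np.column_stack([
        np.interp(grid, t, x),                 # linear interpolation
        np.interp(grid, t, y),
        np.interp(grid, t, v),
    ])                                         # shape (L, 3)
    return samples.reshape(-1)                 # shape (3L,)


# Hypothetical trajectory with irregular AIS time stamps.
t = np.array([0.0, 60.0, 2100.0])
x = np.array([0.0, 120.0, 4200.0])
y = np.zeros(3)
v = np.full(3, 5.0)
f = trajectory_feature_vector(t, x, y, v, horizon_s=35 * 60)
```

Because every trajectory is sampled on the same grid relative to $T_0$, element $k$ of one feature vector is directly comparable with element $k$ of any other.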

As mentioned, the objective of the module is to cluster the

trajectories, and as such, the respective feature vectors should

provide a basis for discriminating between the classes of be-

havior. In general, including as much information as possible,

i.e. increasing the dimensionality of the feature vector, should

enhance the discriminatory properties of the data set. This is

true, but in a clustering setting one may encounter issues re-

lated to the curse of dimensionality [34] .

Clustering is based on grouping data points via some dis-

tance measure. Points that are closer together are more likely

to be considered part of the same cluster. The curse of dimen-

sionality in relation to clustering was discussed in [35] , where

it was pointed out that a ﬁxed number of data points will

become increasingly sparse as the dimensionality increases.

Data points can in a sense be lost in space as the dimension-

ality increases, as the distance between points with respect

to a given dimension can be large. As a result, clustering

data using standard techniques in a high dimensional space

will degrade the results, as the algorithms are unable to ﬁnd

groupings in the data. One method to ameliorate this effect

is to reduce the dimensionality of the data.

A common method for dimensionality reduction is the Karhunen–Loève (KL) transform [36]. The purpose of the transform is to attain uncorrelated features and is shown in (5). First, the set of all feature vectors is centered such that all features have mean zero within the set. Subsequently, the covariance matrix, $\Sigma$, of the set of all feature vectors is calculated. Matrix $E$ consists of the eigenvectors of $\Sigma$, and $\Lambda$ is the eigenvalue matrix, where the relationship is shown in (6). The transform in (5) projects the feature vector, $f$, onto the space spanned by the eigenvectors of the covariance matrix. The covariance of the data inherently describes the correlation among the respective parameters. As such, the eigenvectors of the covariance matrix will describe the mutually orthogonal directions in which the data has the highest degree of variation.

$$x = E^T f \tag{5}$$

Where $x \in \mathbb{R}^{3L \times 1}$, $f \in \mathbb{R}^{3L \times 1}$ and $E \in \mathbb{R}^{3L \times 3L}$

$$\Sigma = E \Lambda E^T \tag{6}$$

Where $\Sigma \in \mathbb{R}^{3L \times 3L}$ and $\Lambda \in \mathbb{R}^{3L \times 3L}$

In a high dimensional space, however, many of the eigenvectors will describe very little variation in the data. The KL-transform, therefore, projects $f$ onto the subspace spanned by the $l$ eigenvectors with the $l$ largest eigenvalues in (7).

$$x = E_l^T f \tag{7}$$

Where $x \in \mathbb{R}^{l \times 1}$ and $E_l \in \mathbb{R}^{3L \times l}$

This will inherently preserve the most important covariance information in the data whilst reducing the dimensionality to $l$. This may be abstract for the case of the trajectory feature vector, $f$, as each dimension represents a position or speed value at a given time instance. Take for instance the case of a 30 min prediction with five minutes added to allow for sufficient data density. The dimensionality of $f$ will then be 210. The eigenvectors of $\Sigma$ will point in the directions within this 210-dimensional space where there is a high degree of variation between the trajectories. As such, it is difficult to gain a direct physical interpretation of the eigenvectors, as the projection onto them represents a combination of multiple parameters. By choosing the $l$ largest eigenvalues, one chooses the $l$ directions where the variation in the data is greatest. When projecting the feature vectors onto the subspace spanned by the eigenvectors corresponding to the largest eigenvalues, one is in fact generating new features with a high degree of variation that can be used for further analysis.

In this study, the projection of $f$ onto the eigenvectors corresponding to the three largest eigenvalues was chosen as a representation for each trajectory. Generally, the projection should retain at least 95% of the variance in the data. This is evaluated by investigating the sum of the chosen eigenvalues over the sum of all eigenvalues [37]. It was found that using the eigenvectors corresponding to the three largest eigenvalues fulfilled this requirement when evaluating the results. Additionally, a three-dimensional vector can easily be visualized when evaluating the performance of the clustering algorithm.
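The KL-transform and the retained-variance check described above can be sketched with NumPy. This is a minimal sketch under stated assumptions: the function name and the synthetic data are hypothetical, and the feature vectors are stored as the rows of a matrix, so the projection is written as a matrix product rather than the column-vector form of (7).

```python
import numpy as np

def kl_transform(F, l=3):
    """Project feature vectors (rows of F) onto the l leading eigenvectors.

    Returns the reduced features, the basis E_l and the fraction of the
    total variance retained (sum of chosen eigenvalues over sum of all).
    """
    Fc = F - F.mean(axis=0)               # centre every feature
    cov = np.cov(Fc, rowvar=False)        # covariance matrix Sigma
    eigvals, E = np.linalg.eigh(cov)      # symmetric input, ascending order
    order = np.argsort(eigvals)[::-1][:l] # indices of the l largest
    E_l = E[:, order]
    X = Fc @ E_l                          # each row is x = E_l^T f
    retained = eigvals[order].sum() / eigvals.sum()
    return X, E_l, retained


# Synthetic stand-in for trajectory features: 40 vectors in 12 dimensions
# that vary mainly along three latent directions.
rng = np.random.default_rng(0)
latent = rng.normal(size=(40, 3))
mixing = rng.normal(size=(3, 12))
F = latent @ mixing + 0.01 * rng.normal(size=(40, 12))
X, E_l, retained = kl_transform(F, l=3)
```

The returned `retained` value corresponds to the 95% criterion discussed above, and the sample covariance of the reduced features is diagonal, i.e. the new features are uncorrelated.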

2.2.4. Unsupervised Gaussian mixture model clustering

Using the reduced trajectory feature vectors generated via the KL-transform, the trajectories can be clustered. Depending on $s_0$, the number of true clusters, i.e. classes, will vary. As such, a flexible clustering algorithm is required that can adapt to the data in each prediction. Unsupervised Gaussian Mixture Model clustering was chosen for use in this study.

A Gaussian Mixture Model (GMM) [38] is a flexible model that adapts to the underlying data. GMMs assume that a data set, $X$, consists of a mixture of $M$ different Gaussian distributions. Each distribution has its own mean vector, $\mu_m$, covariance matrix, $\Sigma_m$, and prior distribution, $\pi_m$. As such, each distribution will describe that particular class or cluster, i.e. class $m$. The class membership parameter, $z_i$, is introduced for each data point, $x_i$, where:

$$z_{ik} = \begin{cases} 1 & \text{if } k = m \\ 0 & \text{otherwise} \end{cases}$$

Where $z_i \in \mathbb{R}^{M \times 1}$

The class conditional probability is shown in (8). The most likely model is estimated by maximizing the log-likelihood with respect to the various model parameters.

$$p(x_i \mid z_{im} = 1) \sim \mathcal{N}(\mu_m, \Sigma_m) \tag{8}$$

The class membership of the trajectories is, however, unknown. As such, the Expectation Maximization (EM) algorithm is utilized to conduct the unsupervised GMM clustering. The GMM requires that a specified number of underlying models, $M$, is input. Based on this, the EM algorithm initializes all model parameters. A common method is to initialize all $\mu_m$ as randomly chosen data points, the priors as $\pi_m = \frac{1}{M}$ and $\Sigma_m = I$. This initialization is unlikely to model the underlying data correctly. As such, the algorithm conducts what is known as the expectation step. In this step, the expected class membership $z_{im}$ is evaluated in (9), based on the current model parameters, $\Theta$. All data points will, therefore, have updated class memberships based on the current model parameters.

$$z_{im} = \frac{p(x_i \mid z_{im} = 1; \Theta)\, \pi_m}{\sum_{k=1}^{M} p(x_i \mid z_{ik} = 1; \Theta)\, \pi_k} \tag{9}$$

The next step in the EM algorithm is known as the maximization step. In this step, the model parameters are updated based on the new distribution resulting from the expectation step. This is done by maximizing the log-likelihood with respect to $\Theta$. The estimated parameters in the maximization step are calculated in (10), (11) and (12).

$$\hat{\mu}_m = \frac{\sum_{i=1}^{N} z_{im} x_i}{\sum_{i=1}^{N} z_{im}} \tag{10}$$

$$\hat{\Sigma}_m = \frac{\sum_{i=1}^{N} z_{im} (x_i - \mu_m)(x_i - \mu_m)^T}{\sum_{i=1}^{N} z_{im}} \tag{11}$$

$$\hat{\pi}_m = \frac{\sum_{i=1}^{N} z_{im}}{N} \tag{12}$$
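One iteration of the expectation and maximization steps in (9) to (12) can be sketched as below. This is a didactic sketch, not the implementation used in the study. The function names and the synthetic data are hypothetical, the covariance update uses the freshly updated mean (the standard choice), and the initial means are picked deterministically here for reproducibility rather than at random.

```python
import numpy as np

def gaussian_pdf(X, mu, cov):
    """Multivariate normal density evaluated at every row of X."""
    d = X.shape[1]
    diff = X - mu
    maha = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * maha) / norm

def em_step(X, mus, covs, pis):
    """One EM iteration; returns updated parameters, memberships and the
    log-likelihood of the parameters that were passed in."""
    N, M = X.shape[0], len(pis)
    # Expectation step, (9): expected class memberships z_im.
    weighted = np.column_stack(
        [pis[m] * gaussian_pdf(X, mus[m], covs[m]) for m in range(M)])
    total = weighted.sum(axis=1, keepdims=True)
    Z = weighted / total
    # Maximization step, (10)-(12).
    Nm = Z.sum(axis=0)
    new_mus = (Z.T @ X) / Nm[:, None]
    new_covs = np.empty((M, X.shape[1], X.shape[1]))
    for m in range(M):
        diff = X - new_mus[m]             # uses the updated mean
        new_covs[m] = (Z[:, m, None] * diff).T @ diff / Nm[m]
    new_pis = Nm / N
    return new_mus, new_covs, new_pis, Z, float(np.log(total).sum())


# Two synthetic, well-separated groups of reduced feature vectors.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(size=(100, 2)), rng.normal(size=(100, 2)) + 10.0])
mus = X[[0, -1]].copy()                   # data points as initial means
covs = np.array([np.eye(2), np.eye(2)])   # Sigma_m = I
pis = np.full(2, 0.5)                     # pi_m = 1/M
log_likelihoods = []
for _ in range(30):
    mus, covs, pis, Z, ll = em_step(X, mus, covs, pis)
    log_likelihoods.append(ll)
```

The recorded log-likelihood does not decrease between iterations, which is the property that makes its convergence usable as a stopping criterion.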

The EM algorithm now repeats, where the expected class memberships are updated, as well as the model parameters. The algorithm is in a sense adapting to the data, where the most likely distribution of the data is discovered. This iterative process continues in a loop until a stopping criterion is met. One common stopping criterion is the convergence of the total log-likelihood. Alternatively, one can terminate the algorithm if there is little to no change in the model parameters, i.e. the parameters themselves converge. The parameter convergence criterion was utilized in this study. The EM algorithm can often have issues with convergence due to poor initialization. To avoid such issues, a technique is employed where a number of random initializations are run for a number of iterations. The best run, i.e. the run with the greatest log-likelihood score, is then chosen and run for further iterations. The mixture model will, upon convergence, consist of $M$ distinct Gaussian distributions which describe the class conditional probabilities, $p(x \mid c_m)$, of the data, along with an associated prior distribution, $\pi_m$. The posterior probability $p(c_m \mid x)$ can be found via Bayes' rule in (13) using the resultant conditional probabilities and priors from the algorithm.

$$p(c_m \mid x) = \frac{p(x \mid c_m)\, \pi_m}{p(x)} \tag{13}$$

$$p(c_m \mid x) > p(c_j \mid x) \quad \forall j \neq m, \; j = 1 \ldots M \tag{14}$$

Clustering of the dataset is then conducted via Bayesian classification, where each feature, $x_i$, is classified to class $m$ according to (14). However, the number of underlying classes, $M$, is as previously mentioned unknown. In order to determine the most likely number of clusters, the Bayesian Information Criterion (BIC) [39], defined in (15), is utilized.

$$BIC = -2\, LL(M) + K_M \ln(N) \tag{15}$$

For a GMM with $M$ underlying distributions, $LL(M)$ is the total log-likelihood function computed at the optimum, $K_M$ the number of free parameters in the mixture model, and $N$ the number of data points. The EM algorithm can be run for various GMMs by altering $M$. By calculating the BIC for each resultant GMM, the most likely GMM is that with the lowest BIC, as it offers the best trade-off between a high likelihood and a low model complexity. In this study, it was assumed that there will be no more than 20 unique clusters in the trajectory data, and the BIC was, therefore, evaluated for values of $M$ up to 20.
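The model selection in (15) can be sketched as follows. The function names and the log-likelihood values are hypothetical, chosen only to illustrate the trade-off. The parameter count $K_M$ assumes full covariance matrices, which the text does not state explicitly: $M - 1$ free priors, $Md$ mean entries and $Md(d+1)/2$ covariance entries for $d$-dimensional features.

```python
import math

def gmm_free_parameters(M, d):
    """K_M for M full-covariance Gaussian components in d dimensions."""
    return (M - 1) + M * d + M * d * (d + 1) // 2

def bic(log_likelihood, M, d, N):
    """Bayesian Information Criterion as in (15)."""
    return -2.0 * log_likelihood + gmm_free_parameters(M, d) * math.log(N)

def select_model(log_likelihoods, d, N):
    """Return the M with the lowest BIC.

    log_likelihoods maps each candidate M to LL(M) at the EM optimum.
    """
    return min(log_likelihoods,
               key=lambda M: bic(log_likelihoods[M], M, d, N))


# Made-up log-likelihoods for 200 three-dimensional feature vectors.
candidate_ll = {1: -1000.0, 2: -900.0, 3: -895.0}
best_M = select_model(candidate_ll, d=3, N=200)
```

In this made-up example the small gain in likelihood from a third component does not offset the penalty for its ten additional parameters, so two components are selected.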

This process discovers the best GMM to fit the data and provides the number of possible routes, or trajectory behavior modes, a selected vessel may belong to. By classifying all the extracted forward trajectories, class labels can be assigned. These labels are used for further analysis in the subsequent modules.

2.3. Trajectory classiﬁcation module

The trajectory clustering module has now clustered all trajectories present in $C_0$ to $M$ classes. Each class represents a group of trajectories that have a high degree of similarity. As such, each class represents a possible future route, or behavior mode, the selected vessel may belong to. It is now of interest to classify the selected vessel to the most likely class of the $M$ possibilities. In this sense, an estimate of the distribution of the possible future behavior of the selected vessel can be made. Using the data in the class of ship behavior, a trajectory prediction can be made. This section presents the method utilized to achieve such a classification.

2.3.1. Backward trajectory extraction

One possible method to conduct the aforementioned classification is to utilize the current vessel state, $s_0$, and compare it to the data points in $C_0$. This, however, will have limited predictive power, as the classification will be based solely on one time instance of the selected vessel. An alternative approach is, therefore, suggested, where the previous 10 min of the selected vessel's trajectory are compared to the previous 10 min of data for all trajectories in $C_0$. This is in a sense the inverse of the forward trajectory extraction process described in Section 2.2.2. Instead of extracting the trajectories from $T_0$ and 30 min into the future, the past trajectories are extracted from the same initial point, i.e. from $T_0$, and 10 min into the past from that time instance. It is assumed that at least 10 min of behavior for the selected vessel should be available via the on-board sensors of the own ship, or via external sources, e.g. AIS. The method is otherwise identical to that described in Section 2.2.3. All the backward trajectories extracted from $C_0$ will have the same labels as those determined by the clustering technique in Section 2.2.4. As such, a labeled data set is available that can be used to classify the observed trajectory of the selected vessel.

2.3.2. Optimal feature generation

Each backward trajectory feature vector is represented by flattening the matrix containing all position and speed over ground data, in the same manner as for the forward trajectories in Section 2.2.3. This will result in a vector $f \in \mathbb{R}^{3L \times 1}$. In the case of a 10 min trajectory this will be a 60-dimensional space within which the classification must take place. This can be a challenging task, as it is likely that the features are quite similar, given that the vessels in $C_0$ generally will have similar trajectories for the past 10 min.

To improve the classification accuracy, Linear Discriminant Analysis (LDA) [40] is utilized. LDA provides a method to generate features with optimal separation between classes in a supervised manner. Using the class separability measure $J_3$ in (16), one can optimize a transformation such that features are generated to optimize class separability.

$$J_3 = \operatorname{trace}\{ S_w^{-1} S_m \} \tag{16}$$

$S_m$ is the mixture scatter matrix defined as $S_m = S_w + S_b$, where $S_w$ is the within-class scatter matrix and $S_b$ the between-class scatter matrix. $S_w$ and $S_b$ are defined in (17) and (19), respectively. $S_w$ describes how compact the data within each class is, whilst $S_b$ describes how spread out each class is with respect to the global mean, $\mu_g$. In a classification setting, one wishes to minimize the trace of $S_w$, i.e. data are more compact within each class, and maximize the trace of $S_b$, i.e. the classes are more spread out. This corresponds to maximizing the class separation criterion $J_3$.

$$S_w = \sum_{m=1}^{M} \pi_m \Sigma_m \tag{17}$$

$$\mu_g = \sum_{m=1}^{M} \pi_m \mu_m \tag{18}$$

$$S_b = \sum_{m=1}^{M} \pi_m (\mu_m - \mu_g)(\mu_m - \mu_g)^T \tag{19}$$

It is desirable to find a transformation $x = A^T f$ such that $J_3$ is maximized in the transformed space. The optimal transformation with respect to class separability is found to be $A = E$, where $E$ is the matrix of eigenvectors of $S_w^{-1} S_b$ in the original vector space. This relationship is shown in (21), where $\Lambda$ is the corresponding diagonal eigenvalue matrix. The transformation is shown in (20). However, $S_b$ is of rank $M - 1$, and correspondingly $S_w^{-1} S_b$ is also of rank $M - 1$. As such, there will be $M - 1$ nonzero eigenvalues. The transformation in (20) will, therefore, project $f$ onto the subspace spanned by the $l$ eigenvectors with the largest eigenvalues, in a similar manner to the KL-transform. If $l = M - 1$, optimality with respect to $J_3$ will be preserved. Further dimensionality reduction can still be conducted by choosing a value $l < M - 1$. This will, however, be a sub-optimal solution. Further details on LDA can be found in [41].

$$x = E^T f \tag{20}$$

Where $x \in \mathbb{R}^{l \times 1}$, $f \in \mathbb{R}^{3L \times 1}$ and $E \in \mathbb{R}^{3L \times l}$

$$S_w^{-1} S_b = E \Lambda E^T \tag{21}$$

Where $S_w^{-1} S_b \in \mathbb{R}^{3L \times 3L}$ and $\Lambda \in \mathbb{R}^{l \times l}$
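The scatter matrices in (17) to (19) and the projection in (20) can be sketched with NumPy. This is a hedged illustration rather than the authors' code. The function name and the synthetic data are hypothetical, the priors $\pi_m$ are estimated as class frequencies, and a pseudo-inverse is used in place of $S_w^{-1}$ as an assumption to guard against a singular $S_w$ when few trajectories are available.

```python
import numpy as np

def lda_transform(F, labels, l=None):
    """Project feature vectors (rows of F) onto the leading eigenvectors
    of S_w^{-1} S_b. Returns the new features, the basis and all
    eigenvalues sorted in descending order."""
    classes = np.unique(labels)
    D = F.shape[1]
    priors = np.array([np.mean(labels == c) for c in classes])
    means = np.array([F[labels == c].mean(axis=0) for c in classes])
    mu_g = priors @ means                                  # (18)
    S_w = np.zeros((D, D))
    S_b = np.zeros((D, D))
    for c, pi_m, mu_m in zip(classes, priors, means):
        diff = F[labels == c] - mu_m
        S_w += pi_m * (diff.T @ diff) / len(diff)          # (17)
        S_b += pi_m * np.outer(mu_m - mu_g, mu_m - mu_g)   # (19)
    # Assumption: pseudo-inverse instead of a plain inverse of S_w.
    eigvals, E = np.linalg.eig(np.linalg.pinv(S_w) @ S_b)
    order = np.argsort(eigvals.real)[::-1]
    if l is None:
        l = len(classes) - 1          # keeps optimality with respect to J_3
    A = E[:, order[:l]].real
    return F @ A, A, eigvals.real[order]


# Three synthetic classes of five-dimensional feature vectors.
rng = np.random.default_rng(2)
centers = np.array([[0.0, 0, 0, 0, 0], [6.0, 0, 0, 0, 0], [0.0, 6, 0, 0, 0]])
labels = np.repeat([0, 1, 2], 50)
F = centers[labels] + rng.normal(size=(150, 5))
X, A, eigs = lda_transform(F, labels)
```

With $M = 3$ classes only two eigenvalues are meaningfully different from zero, so the two-dimensional projection retains the class separation of the original five-dimensional space.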

2.3.3. Classiﬁcation

Despite utilizing the optimal features described in Section 2.3.2, the classification task is highly non-linear, and likely with significant overlap between classes in most cases. This is due to the high degree of similarity between the past trajectories. As a result, the k-Nearest Neighbor (kNN) classifier [42] is utilized due to its nonlinear predictive power. Given a data point $x_0$, the kNN classifier will measure the distance to all other data points, $x_i$, in the dataset $X$ using the Euclidean distance as shown in (22).

$$d_i = \| x_i - x_0 \|_2 \tag{22}$$

The kNN classifier will then identify the $k$ nearest data points using distance measures from (22). Based on this subset of data, the algorithm then identifies the class with the most data points in the subset, and classifies $x_0$ to the majority class. In this study, $x_0$ is the projection of the backward trajectory