978-1-6654-7273-9/22/$31.00 ©2022 IEEE
Applying the Big Data Technologies for Enhancing
Maritime Interoperability Framework
Zdravko Paladin, Nexhat Kapidani, Enis Kočan, IEEE Member, Tommaso Nicoletti, Georgios Vosinakis,
Alkiviadis Astyakopoulos, Christos Bolakis, Marios Moutzouris
Abstract— In this paper, we present the core applications of
data lakes and other big data infrastructure technologies for the
purpose of enhancing the maritime interoperability framework
and ensuring resilient collaboration among agencies. The
approach is based on the deployment of multi-layered and
semantically enabled Data Lakes for storing various maritime
data collected from heterogeneous sensors, and on the
information exchange process through the Common
Information Sharing Environment (CISE) network using
advanced Command and Control (C2) platforms. The results of
this paper are derived from the EU-funded project
EFFECTOR, highlighting the significant contribution of
advanced solutions using Artificial Intelligence algorithms and
supporting UAV and C2 technologies to various operations at
sea. The validation survey results collected from end-users after
the execution of the maritime trials are presented as well.
Keywords—Big Data, Data Lake, CISE, UAV, Maritime C2
I. INTRODUCTION
Following recent trends in the development of data-driven
technologies, the maritime sector has recognized the benefits
of exploiting large data sets and advanced systems that can
host the ever-growing quantity and variety of maritime big
data relevant to strategic and tactical decision-making. For
maritime surveillance and other safety purposes, it is of key
importance to have access to an appropriate, well-organized
repository of big data collected from heterogeneous sensors.
Such repositories, in the form of Data Lakes, are the
fundamental infrastructure for collecting and storing various
data as a knowledge base from which Artificial Intelligence
(AI) and Machine Learning (ML) algorithms can extract and
deliver the relevant operational information.
Thus, the Data Lake (DL) can be defined as “a flexible,
scalable data storage and management system, which ingests
and stores raw data from heterogeneous sources in their
original format, and provides query processing and data
analytics in an on-the-fly manner.” Raw data in the DL have
little value by themselves if they are not properly tagged
with metadata, because searching and querying for relevant
information are then significantly hampered; data lake
services become efficient only when the data are adequately
structured in particular units. Such data repositories have
great value for establishing and running the Common
Information Sharing Environment (CISE) network, which
relies fully on efficient communications and fast information
exchange between maritime agencies' legacy systems, based on
the CISE Data and Services Model.
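The role of metadata tagging described above can be illustrated with a minimal sketch. It assumes a purely in-memory catalog; the class names `RawObject` and `MetadataCatalog`, the source labels, and the tag keys are hypothetical and are not taken from the EFFECTOR implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class RawObject:
    """One raw payload kept in its original format, plus descriptive metadata."""
    payload: bytes
    source: str            # illustrative labels such as "ais" or "radar"
    fmt: str               # original format, stored untouched
    received_at: datetime
    tags: dict = field(default_factory=dict)


class MetadataCatalog:
    """In-memory stand-in for a data lake catalog (hypothetical, for illustration)."""

    def __init__(self):
        self._objects = []

    def ingest(self, obj: RawObject) -> None:
        self._objects.append(obj)

    def find(self, **criteria):
        """Return objects whose source or tags match every given criterion."""
        def matches(obj):
            for key, value in criteria.items():
                actual = obj.source if key == "source" else obj.tags.get(key)
                if actual != value:
                    return False
            return True
        return [o for o in self._objects if matches(o)]


catalog = MetadataCatalog()
now = datetime.now(timezone.utc)
catalog.ingest(RawObject(b"raw AIS sentence", "ais", "nmea0183", now, {"area": "adriatic"}))
catalog.ingest(RawObject(b"\x00\x01", "radar", "binary", now, {"area": "aegean"}))
catalog.ingest(RawObject(b"raw AIS sentence", "ais", "nmea0183", now))  # untagged

# Only the tagged AIS object can be found by area; the untagged one cannot.
adriatic_ais = catalog.find(source="ais", area="adriatic")
```

The untagged object is still stored, but it can only be reached by scanning everything from its source, which is the search penalty the text refers to.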
The paper is structured as follows: Chapter II presents the
main features of current Data Lake developments; Chapter III
describes the core capabilities of Big Data infrastructure for
maritime purposes; Chapter IV reviews the data flow of the
main unmanned aerial vehicle (UAV) components for the
multilayered Data Lake developed within the EFFECTOR
project, together with the maritime Command and Control
(C2) platforms and the validation results for these
components, which were tested by maritime agencies in the
Maritime Trials held in 2022.
II. KEY FEATURES OF DATA LAKES
As Data Lakes advance, AI/ML techniques strongly influence
the features of big data repositories, especially when complex
multiscale processing, situation analysis, and forecasting of
future events are performed on large quantities of varied data.
Following this path, Big Data collected through a network of
sensors provide a continual flow to the DL, which feeds
decision support tools, most commonly AI/ML/Deep Learning
modules, that extract relevant information for administrative
and autonomous systems. In that sense, Data Lakes, as a
product of the multiplication and growth of big data, are
characterized by huge volume, variety, velocity, value, and
veracity, known as the 5 Vs model. Data in these lakes are kept
in their original form and format (natural or raw data) and are
used by large user communities. A widely used platform for big
data technologies is Hadoop. Furthermore, the data lake
architecture is mostly determined by three zones in hybrid
modes: the Raw Data Zone (raw data ingestion and storage),
the Processing Zone (data processing and intermediate
storage), and the Access Zone (data querying and storage of
available data).
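The three-zone layout can be sketched as a toy pipeline. This is an illustration under stated assumptions, not the architecture of any specific product: the class `ZonedDataLake`, the JSON record shape, and the field names `mmsi` and `ts` are hypothetical.

```python
import json


class ZonedDataLake:
    """Toy three-zone layout: raw -> processing -> access (zone names follow the text)."""

    def __init__(self):
        self.raw = []          # Raw Data Zone: payloads exactly as received
        self.processing = []   # Processing Zone: parsed, intermediate records
        self.access = {}       # Access Zone: query-ready view keyed by vessel ID

    def ingest(self, payload: str) -> None:
        self.raw.append(payload)

    def process(self) -> None:
        """Parse raw payloads; malformed ones remain only in the raw zone."""
        self.processing.clear()
        for payload in self.raw:
            try:
                record = json.loads(payload)
            except json.JSONDecodeError:
                continue
            if isinstance(record, dict) and "mmsi" in record and "ts" in record:
                self.processing.append(record)

    def publish(self) -> None:
        """Keep the latest record per vessel for querying."""
        for record in self.processing:
            current = self.access.get(record["mmsi"])
            if current is None or record["ts"] >= current["ts"]:
                self.access[record["mmsi"]] = record

    def query(self, mmsi):
        return self.access.get(mmsi)


lake = ZonedDataLake()
lake.ingest('{"mmsi": 1, "ts": 10, "lat": 42.1, "lon": 19.1}')
lake.ingest('{"mmsi": 1, "ts": 20, "lat": 42.2, "lon": 19.0}')
lake.ingest("not json")   # stays in the raw zone, never reaches the access zone
lake.process()
lake.publish()
latest = lake.query(1)
```

Keeping the malformed payload in the raw zone reflects the idea that nothing is discarded at ingestion; filtering happens only when data are promoted to later zones.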
Another important concept is the Data Warehouse (DW),
which integrates, manages, and aggregates data collected
from multiple data sources. In a DW, the data are stored in
relational databases and queried with SQL (Online Analytical
Processing, OLAP, and Online Transaction Processing,
OLTP), while in a DL the data are stored in Hadoop, a
relational database, or NoSQL data stores, with different query
and programming languages (SQL, Cypher, Java, Python, R,
etc.). The related essential tools therefore cover data
integration, stream data ingestion, and cloud data processing
functionalities. Their key features are on-premise availability
with a Graphical User Interface (GUI), batch processing, and
a certain level of extensibility, with support for processing
structured data and, to some extent, semi-structured and
unstructured data. Stream data ingestion tools come in a
variety of designs and support the processing of structured,
semi-structured, and unstructured data.
Z.P. and N.K are with Administration for Maritime Safety and Port
Management of Montenegro (AMSPM), Bar, Montenegro (all emails in
E.K. is with the University of Montenegro (UoM), Faculty of Electrical
Engineering, Podgorica, Montenegro (firstname.lastname@example.org).
T.N. is with Engineering I.I., S.p.A., (ENG), Milano, Italy
G.V. is with the Institute of Communication and Computer Systems
(ICCS), Athens, Greece, (email@example.com)
A.A. and C.B. are with the Center of Security Studies (KEMEA), Ministry
of Citizen Protection, Athens, Greece, (firstname.lastname@example.org)
M.M. is with SATWAYS ltd., Athens, Greece (email@example.com)
30th Telecommunications forum TELFOR 2022 Serbia, Belgrade, November 15-16, 2022. Proceedings ISBN: 978-1-6654-7272-2 (IEEE), pp. 105-109
III. MARITIME BIG DATA FRAMEWORK EXAMPLES
Maritime Big Data are supported by software tools, known as
Big Data Analytics, which can quickly process and extract the
queried vessel or other maritime information from large
databases, visualize it through Business Intelligence (BI)
software, and help maritime authorities make proper
decisions. Building on this approach and including the CISE
network, a similar architecture in the form of a Maritime Big
Data Framework could be established by integrating data
collected from national sources and transferred over the
Inter-VTS Exchange Format (IVEF) or National Marine
Electronics Association (NMEA) protocols into the databases
of a multi-functional DL with layers for ingestion, aggregation,
semantics, data fusion and analytics, and harmonization.
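As an illustration of what an ingestion layer might check before storing NMEA traffic, the sketch below validates the checksum of a sentence. It assumes NMEA 0183 framing, in which the checksum is the XOR of all characters between the leading `$` or `!` and the `*`, written as two hexadecimal digits. The function names are hypothetical and the sample sentence body is made up for the example.

```python
from functools import reduce


def nmea_checksum(body: str) -> str:
    """XOR of all characters between the start delimiter and '*', as two hex digits."""
    return f"{reduce(lambda acc, ch: acc ^ ord(ch), body, 0):02X}"


def is_valid_nmea(sentence: str) -> bool:
    """Accept a sentence only if its trailing checksum matches its body."""
    sentence = sentence.strip()
    if len(sentence) < 4 or sentence[0] not in "$!" or "*" not in sentence:
        return False
    body, _, received = sentence[1:].rpartition("*")
    return received.upper() == nmea_checksum(body)


# Build a well-formed sentence from an illustrative body, then corrupt one digit.
body = "GPGLL,4916.45,N,12311.12,W,225444,A"
good = f"${body}*{nmea_checksum(body)}"
bad = good.replace("4916", "4917")

accepted = [s for s in (good, bad, "garbage") if is_valid_nmea(s)]
```

Rejecting corrupted sentences at the ingestion layer is one possible design; a data lake that keeps everything in its raw zone could instead store them and only tag them as invalid.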
The information is processed and delivered through decision
support tools and through the services management and
actions, mission, and reporting layers, and is then distributed
using interactive and interoperable C2 systems and platforms.
These components further translate the required data and
information to CISE Adaptors and to the established
EU/regional/national CISE Nodes and Gateways. Significant
research efforts within the recent EU ANDROMEDA project
extended and enriched the basic CISE Data and Services
Model with additional vocabulary, entities, objects, services,
and related data classes for both land and maritime border
surveillance, widening its scope in the form of the eCISE
Model.
Accordingly, an important application of the combined AI
capabilities of deployed modules, such as InSyTo (a soft
information fusion and management toolbox for detecting
meetings between ships) and the Early Collision Notification
System (ECNS) warning tool, confirmed the efforts towards
increasing maritime safety and security by tracking vessel
trajectories and predicting collision risks.
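The source does not describe how ECNS computes collision risk. As a generic illustration of trajectory-based risk prediction, the sketch below uses the textbook closest point of approach (CPA) calculation for two vessels moving at constant velocity. It assumes a local flat plane with positions in metres and velocities in metres per second; the function names and the threshold values are hypothetical.

```python
import math


def cpa_tcpa(p1, v1, p2, v2):
    """Return (distance at closest approach, time to closest approach).

    Positions are (x, y) in metres on a local flat plane, velocities in m/s.
    A TCPA of 0 means the closest approach is now: the vessels are diverging
    or have no relative motion.
    """
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]   # relative position
    vx, vy = v2[0] - v1[0], v2[1] - v1[1]   # relative velocity
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        return math.hypot(rx, ry), 0.0
    tcpa = max(0.0, -(rx * vx + ry * vy) / speed_sq)
    return math.hypot(rx + vx * tcpa, ry + vy * tcpa), tcpa


def collision_risk(p1, v1, p2, v2, min_distance_m=500.0, horizon_s=1200.0):
    """Flag a pair if it will pass closer than min_distance_m within horizon_s."""
    dcpa, tcpa = cpa_tcpa(p1, v1, p2, v2)
    return dcpa < min_distance_m and tcpa <= horizon_s


# Two vessels 10 km apart heading straight at each other at 5 m/s each.
head_on = collision_risk((0, 0), (5, 0), (10000, 0), (-5, 0))
# Two vessels on parallel courses 2 km apart at the same speed.
parallel = collision_risk((0, 0), (5, 0), (0, 2000), (5, 0))
```

An operational tool would work on geodetic coordinates and account for course changes and vessel dimensions, which this sketch deliberately ignores.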
Specifically, Big Data applications are the key component in
establishing the development platform for e-Navigation and
for safety and security management platforms, composed of
information service modules for navigation, ship dynamics,
and security and safety features.
Another important big-data-oriented approach to maritime
surveillance was created within the SESAME platform, which
provides novel solutions for the management, analysis, and
visualization of multi-source Automatic Identification System
(AIS) and satellite data streams from the high-resolution Earth
Observation Sentinel-1 Synthetic Aperture Radar (SAR) and
Sentinel-2 optical imagery. The platform combines
big-data-oriented frameworks such as Cassandra, Hadoop,
Spark, and Flink for storage and batch processing.
Furthermore, Big Data techniques make it possible to detect
anomalous behavior at sea using knowledge-driven and, mostly,
data-driven approaches: ML methods (such as clustering with
the DBSCAN algorithm, neural networks, and K-means
clustering) and stochastic methods (such as Gaussian
Processes, mixture models, Bayesian networks, and Kernel
Density Estimation, KDE), applied to AIS data sets.
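Of the stochastic methods listed, KDE is the simplest to show in a few lines. The sketch below estimates a one-dimensional density over historical vessel speeds and flags new observations that fall in low-density regions. The speed values, the bandwidth, and the threshold are made-up illustrative numbers, and real AIS anomaly detection would normally use more features than speed alone.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function estimated from 1-D samples with a Gaussian kernel."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


def flag_anomalies(history, observations, bandwidth=1.0, threshold=0.01):
    """Return the observations whose estimated density under `history` is below threshold."""
    density = gaussian_kde(history, bandwidth)
    return [x for x in observations if density(x) < threshold]


# Hypothetical speeds over ground (knots) seen for one vessel class on one route.
history = [10.0, 10.5, 11.0, 11.5, 12.0, 12.5, 10.2, 11.8]
# New reports: one typical, one far too fast, one almost stopped.
flagged = flag_anomalies(history, [11.0, 25.0, 0.3])
```

The threshold controls the trade-off between missed anomalies and false alarms and would have to be tuned on real traffic data.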
IV. CASE STUDY: UAV AND C2 IN EU EFFECTOR PROJECT
A. Importing UAV-collected data in Data Lake
In the maritime operational scenarios defined for validating
the EFFECTOR technical solution, a UAV system would mainly
be used once an event has been detected, to provide live
information on the evolving incident. The main novel
characteristics of a UAV system supporting maritime safety
and security operations were identified as follows:
Detection: The system is used to identify or verify a report
of an incident in the area concerned. The object detection
capability allows rapid and efficient area scanning.
Tracking: Once a vessel of interest has been detected, live,
uninterrupted information on its situation is of utmost
importance throughout the operation.
Timeliness: Because conditions at sea change quickly,
detection and tracking algorithms must provide live
information with minimal delay and no interruption.
Fast Deployment: The system is easy and fast to deploy, so
that live information and tracking of a target can start as soon
as possible after the incident is identified. In both the search
and rescue and the maritime safety fields, the initial time after
the identification of an event is critical.
Interoperability: Given the abundance of legacy systems
used in maritime operations and communications, any new
system needs to ensure seamless interoperability with
existing systems. Specifically, in the EU, CISE is used as the
basis for information exchange between public authorities.
Any messages or communications generated by the system
should be compatible with CISE.
The purpose of the architecture implemented in the EFFECTOR
system (Figure 1) is to detect an object in the camera feed,
assign it an identification (ID), and then keep track of its
trajectory without altering the ID. The detection and tracking
architecture is composed of two main tools: a detector and a
tracker. The detector is YOLOv5, a state-of-the-art object
detector. When YOLOv4 was published, it outperformed
EfficientDet, the state-of-the-art object detection model up to
that time. The comparison, performed on a V100 GPU, showed
that YOLOv4 achieved the same accuracy as EfficientDet at
almost double the frames per second (FPS). However, YOLOv4
is built on Darknet, while YOLOv5 is built on PyTorch, which
makes training, testing, and deployment much easier. The
output of the detector is passed to the Deep SORT tracker. The
run-time complexity of this tracking algorithm is not
perceivably affected by the number of objects, since it
processes only the position and velocity of tracked objects,
while the output of the neural network is independent of the
number of objects, which greatly reduces the number of
parameters that need to be processed.
Figure 1. General architecture showing the distinct components
of the system
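The goal of assigning an ID to each detected object and keeping it stable across frames can be illustrated with a much simpler tracker than the one used in the project. The sketch below is not Deep SORT, which combines Kalman-filter motion prediction with appearance features; it is a greedy matcher based on bounding-box overlap (intersection over union, IoU). The class name, the box format, and the overlap threshold are assumptions, and the detections are hand-written stand-ins for detector output.

```python
from itertools import count


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


class GreedyIouTracker:
    """Keeps a stable ID per object by matching each detection to the best-overlapping track."""

    def __init__(self, min_iou=0.3):
        self.min_iou = min_iou
        self.tracks = {}        # track ID -> last known box
        self._ids = count(1)

    def update(self, detections):
        """Return {track_id: box} for the current frame."""
        unmatched = dict(self.tracks)
        assigned = {}
        for box in detections:
            best_id, best_iou = None, self.min_iou
            for track_id, last_box in unmatched.items():
                score = iou(box, last_box)
                if score >= best_iou:
                    best_id, best_iou = track_id, score
            if best_id is None:
                best_id = next(self._ids)   # a new object enters the scene
            else:
                del unmatched[best_id]      # each track is matched at most once
            assigned[best_id] = box
        self.tracks = assigned              # tracks without a match are dropped
        return assigned


tracker = GreedyIouTracker()
frame1 = tracker.update([(0, 0, 10, 10), (100, 100, 120, 120)])
# Same two objects, slightly moved and reported in a different order.
frame2 = tracker.update([(102, 101, 122, 121), (1, 0, 11, 10)])
```

Unlike this sketch, a production tracker keeps unmatched tracks alive for several frames so that an ID survives short occlusions or missed detections.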
B. C2 Platform for Data Lake
In the EFFECTOR project, the SeaMIS, MUSCA, and ENGAGE
Command and Control (C2) systems were used in the French,
Portuguese, and Greek maritime trials, respectively. The
MUSCA C2 gives basic support at the strategic, operational,
and tactical levels, providing the Common Operational Picture
(COP), tactical situational awareness, and fused intelligence
data for maritime operation centers. The main improvement
comes from interoperability with other maritime forces through
a shared and agreed model such as CISE. Thanks to the
integration of CISE functionality in MUSCA, it is also possible
to communicate with other partners connected to the CISE
network and to share, through the CISE features of the C2 GUI,
items such as vessels, anomalies, and reports.
The MUSCA C2 is a modern web application built on a
microservice architecture, using Java Spring technologies for
the back-end and Angular for the front-end. It is a single-page
web application centered on a map and a set of tools needed
for Command and Control in the maritime environment. The
GUI supports user management, monitoring of vessel traffic
and vessel details, analysis of anomalies triggered by the
underlying services based on restrictive criteria derived from
user-defined rules, planning of tasks for a mission on a
specific event, creation of reports, and sharing of information
using the CISE data model over the CISE network connection.
The MUSCA C2 comprises the following modules: the
Extraction Modules, the Message Broker, the Recognized
Maritime Picture (RMP), and the Mission Planning (MP)
module (Figure 2).
Figure 2. MUSCA architecture
MUSCA uses an event-driven architecture. There are two
Extraction modules: the Sensor Converter, which converts
the original raw data into a machine-readable format, and the
ETL (Extract, Transform, Load) module, which takes the data
from the sources and puts them in a shared queue. The Message
Broker module acts as a large shared queue that gathers the
information and passes it on to the underlying Data Lake,
where all the data are collected. Thanks to the Message
Broker, each underlying service can forward information to
the C2 GUI, which renders useful data such as anomalies,
incidents, or risks to the operator on a map. The RMP module
provides a composite picture of activity over a maritime area
of interest. The purpose of the RMP is to show an object of
interest, such as a vessel in the maritime environment, and to
determine what it is doing and whether some type of action is
required.
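The flow from Sensor Converter through the shared queue to the Data Lake and the GUI can be sketched with a small in-process example. MUSCA itself is described as a Java Spring application, so this Python sketch only mirrors the event-driven pattern; the comma-separated raw format, the vessel identifiers, the speed rule, and all names are hypothetical.

```python
import json
import queue


def sensor_converter(raw_line: str) -> dict:
    """Convert a raw 'mmsi,lat,lon,speed' line into a machine-readable record."""
    mmsi, lat, lon, speed = raw_line.split(",")
    return {"mmsi": int(mmsi), "lat": float(lat), "lon": float(lon), "speed": float(speed)}


class MessageBroker:
    """Single shared queue with fan-out to subscribers (stand-in for a real broker)."""

    def __init__(self):
        self._queue = queue.Queue()
        self._subscribers = []

    def subscribe(self, handler):
        self._subscribers.append(handler)

    def publish(self, event: dict):
        self._queue.put(event)

    def dispatch(self):
        """Deliver every queued event to every subscriber, in arrival order."""
        while not self._queue.empty():
            event = self._queue.get()
            for handler in self._subscribers:
                handler(event)


data_lake = []    # receives every event
gui_alerts = []   # receives only what the operator should see on the map


def store_in_lake(event):
    data_lake.append(json.dumps(event))


def render_anomaly(event, max_speed_kn=30.0):
    # Illustrative rule: any vessel above the speed limit is shown as an anomaly.
    if event["speed"] > max_speed_kn:
        gui_alerts.append(f"Anomaly: vessel {event['mmsi']} at {event['speed']} kn")


broker = MessageBroker()
broker.subscribe(store_in_lake)
broker.subscribe(render_anomaly)

for line in ["262000001,42.09,19.08,12.5", "262000002,42.10,19.10,41.0"]:
    broker.publish(sensor_converter(line))   # load step: put the record on the shared queue
broker.dispatch()
```

Because producers only publish to the broker, new consumers (for example a CISE adapter) can be added by subscribing another handler without changing the extraction modules.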
The RMP module provides full situational awareness: it
conveys knowledge of traffic tracks and positions, as well as of
events that have occurred in the maritime environment, such as
anomalies or incidents, and thus gives a better understanding
of the situation. The MP module gives the operators a graphical
way to understand and deploy the available assets and to filter
them according to the traffic and the knowledge of the border
situation (Figure 3). When creating a Mission, the operators
can define a set of tasks according to the anomaly, incident, or
risk that has been triggered in the C2 GUI. A Mission defines
the details and outlines a possible action to mitigate the event
that occurred, while optimizing the use of assets. The C2 also
communicates with a MUSCA module called the MUSCA CISE
Adapter. The main feature of the MUSCA C2 is its integration
with a legacy system for live streaming of maritime traffic;
thanks to the MUSCA services, it is also possible to gather and
fuse different sources, such as radar and satellite. The CISE
concept was used as a data model for networked sharing, to
gather partner service data and forward them to the UI through
a well-defined protocol.
Figure 3. Vessel traffic in MUSCA GUI compliant with CISE
C. End-users survey results of Maritime Trials
The EFFECTOR technical solution, comprising the Data Lakes
and plugged-in layers as well as the UAV components and C2
functionalities, was evaluated through end-user Validation
Questionnaires for the French, Portuguese, and Greek
Maritime Operational Trials. These questionnaires aim to
examine the overall and specific quality of the services and
systems deployed in the maritime operational environment.
Based on a statistical analysis of the collected answers (64),
the largest share of respondents (48.44%) rated the overall
EFFECTOR Interoperability Framework as “5 – Excellent”,
42.19% rated it as “4 – Very Good”, and the remaining
answers (9.37%) covered marks 3, 2, and 1. These results
indicate a high level of EFFECTOR platform quality, a high
development level of the technical components, compliance
with the CISE standard, and satisfaction of the maritime
end-users' operational requirements.
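The reported percentages can be cross-checked against the total of 64 answers. The paper does not state the underlying counts, so the figures computed below are back-calculated assumptions rather than reported results.

```python
TOTAL_ANSWERS = 64
reported_pct = {
    "5 - Excellent": 48.44,
    "4 - Very Good": 42.19,
    "marks 3, 2 and 1": 9.37,
}

# Back-calculate the nearest whole number of answers behind each percentage.
implied_counts = {
    label: round(pct * TOTAL_ANSWERS / 100) for label, pct in reported_pct.items()
}

# Recompute the percentages from those counts to check that they are consistent.
recomputed_pct = {
    label: 100 * n / TOTAL_ANSWERS for label, n in implied_counts.items()
}

for label in reported_pct:
    print(f"{label}: {implied_counts[label]} answers, {recomputed_pct[label]:.3f}%")
```

Under this reading, the three groups correspond to 31, 27, and 6 answers, which sum to the 64 collected answers and reproduce the reported percentages to within rounding.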
Among other topics, the survey covered the Key Performance
Area (KPA) Command and Control System, which focuses on
the essential features of the C2 software used in the
operational trials. The number of exchanged CISE entities was
assessed through the questionnaires; according to the majority
of answers, it ranged between 3 and 5 and was thus assessed
as “Very good”. The alarm/notification visibility feature was
used successfully in all trials. Regarding GUI quality, in the
general evaluation across all C2 systems and trials, nearly 50%
of answers were “Good: 50-70%”, while slightly above 50%
were “Very good: 70-100%”. The EFFECTOR system
incorporated versatile information and communications
standards for smooth and timely collaboration between the
datasets and legacy systems of all partners. The respondents
confirmed the use of the AIS standard in all EFFECTOR trials,
as well as of radar, satellite services, imaging devices,
geolocation data, video streams, NMSW, and UAV
standards/systems. Furthermore, the respondents evaluated
very positively the degree of reaction to incidents and the
usability of special-area management, with a significant
majority of marks in the range of 50-70% (Figure 4). Very
satisfactory levels were reported for the user-friendliness of
the map in the GUI, the assignment of deployable assets,
comprehensive intelligence report/management (Figure 5),
and the increased efficiency of resource allocation (Figure 6).
Figure 4. Reaction to the incident and management of special areas
Figure 5. The degree of user-friendliness of the map in the GUI,
assets assignment, and intelligence report/management
Figure 6. The efficiency of allocation of resources (UAV, boats)
V. CONCLUSION
The various safety and security operations in the maritime
sector benefit greatly from advanced ICT technologies used to
process large amounts of heterogeneous data. The trends in
data science are therefore fully applicable to the maritime
sector, through the adoption of conceptual frameworks for
managing and handling large data sets that add value to the
collected data by processing and sharing them in the
appropriate format. This paper analyzed the main models used
to design data lakes and the overall big data infrastructure,
together with a description of the C2 components. Some
notable results from the analysis of the validation surveys
assessing the EFFECTOR technical solution were also
presented. Maritime Big Data technologies showed potential
for essential applications in the maritime domain, contributing
to the efficient management of assets and systems in complex
maritime environments. The use cases include ship
operations, maritime surveillance, safety, security, search and
rescue missions, environmental conditions, and law
enforcement.
ACKNOWLEDGMENT
This work has received funding from the European Union’s
Horizon 2020 Research and Innovation Programme under
Grant Agreement No 883374 (project EFFECTOR). This
article reflects only the authors’ views and the Research
Executive Agency (REA) is not responsible for any use that
may be made of the information it contains.
REFERENCES
 R. Hai, C. Quix, M. Jarke, “Data lake concept and systems: a survey”,
arXiv:2106.09592 [cs.DB], 2021.
 B. Inmon, “Data Lake Architecture: Designing the Data Lake and
Avoiding the Garbage Dump” Technics Publications, USA, 2016.
 European Maritime Safety Agency (EMSA): CISE Technical
 CISE homepage (2020), https://ec.europa.eu/cise, 2022.
 J.J. Huls, G. David, “Big Data: Maritime Uses and Challenges”,
Combined Joint Operations from the Sea Centre of Excellence, 2018.
 B. Soyer, A. Tettenborn, “Artificial Intelligence and Autonomous
Shipping - Developing the International Legal Framework”, 2021.
 A. Gorelik, “The Enterprise Big Data Lake”, O'Reilly Media, 2019.
 P. Sawadogo, J. Darmont, “On data lake architectures and metadata
management”, Journal of Intelligent Information Systems, Springer
Verlag, 56 (1), pp. 97-120, 2021.
 T. Hlupić, J. Puniš, “An Overview of Current Trends in Data Ingestion
and Integration”, In 44th MIPRO Conference Proceedings, 2021.
 Z. Paladin, N. Kapidani, Ž. Lukšić, T. Nicoletti, M. Moutzouris, A.
Blum, A. Astyakopoulos, “A Maritime Big Data Framework
Integration in a Common Information Sharing Environment”,
Proceedings of 45th MIPRO Conference, pp. 1161-1166, 2022.
 A. Mihailović, N. Kapidani, E. Kočan, D. Merino, J. Rasanen,
“Analysing the prospect of the maritime common information sharing
environment’s implementation and feasibility in Montenegro”, In
Scientific Journal of Maritime Research, 35(2), pp. 256-266, 2021.
 A. Mihailović, N.Kapidani, E. Kočan, et al., “A Framework for
Incorporating a National Maritime Surveillance System into the
European Common Information Sharing Environment,” International
Conference on Information Technology (25th IT), 2021, pp. 1-6.
 Z. Paladin, N. Kapidani, Ž. Lukšić, A. Mihailović, A. Astyakopoulos,
A. Blum, “Combined AI Capabilities for Enhancing Maritime Safety
in a Common Information Sharing Environment”, Proceedings of 35th
Bled e-Conference, pp. 145-160, 2022.
 Y. Zhang, A. Zhang, D. Zhang, Z. Kang, Y. Liang, “Design and
Development of Maritime Data Security Management Platform”,
Applied Sciences, MDPI, 12, (800), 2022.
 R. Fablet, N. Bellec, L. Chapel, C. Friguet, R.Garello, G. Hajduch,
“Next step for Big Data Infrastructure and Analytics for the
Surveillance of the Maritime Traffic from AIS Sentinel Satellite Data
Streams”, BiDS 2017: Big Data from Space Conference, France 2017.
 K. Wolsing, L. Roepert, J. Bauer, K. Wehrle, “Anomaly Detection in
Maritime AIS Tracks: A Review of Recent Approaches”, Journal of
Marine Science and Engineering, 10(1), 112, pp. 1-19, 2022.
 EFFECTOR EU Project - “An End to end Interoperability Framework
For Maritime Situational Awareness at Strategic and TacTical
OpeRations”, GA ID: 883374, 2020-2022, https://www.effector-project.eu/.
 E. Vasilopoulos, G. Vosinakis, M. Krommyda, L. Karagiannidis, E.
Ouzounoglou, A. Amditis, “Autonomous Object Detection Using a
UAV Platform in the Maritime Environment”, Proceedings of
International Conference RCIS, pp. 567-579, Springer, Cham., 2022.
 VOC Homepage, http://host.robots.ox.ac.uk/pascal/VOC/.
 M. Tan, R. Pang, Q.V. Le, “EfficientDet: Scalable and efficient
object detection”, Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition, pp. 10781-10790, 2020.
 U. Nepal, H. Eslamiat, “Comparing YOLOv3, YOLOv4 and YOLOv5
for Autonomous Landing Spot Detection in Faulty UAVs”, Sensors
2022, 22, 464. https://doi.org/10.3390/s22020464.
 E. Vasilopoulos, G. Vosinakis, M. Krommyda, L. Karagiannidis, E.
Ouzounoglou, A. Amditis, “A Comparative Study of Autonomous
Object Detection Algorithms in the Maritime Environment Using a
UAV Platform”, Computation, 10(3), 42, 2022.