
Redundant Path Optimization in Smart Ship Software-Defined Networking and Time-Sensitive Networking Networks: An Improved Double-Dueling-Deep-Q-Networks-Based Approach


Citation: Xu, Y.; He, S.; Zhou, Z.; Xu, J. Redundant Path Optimization in Smart Ship Software-Defined Networking and Time-Sensitive Networking Networks: An Improved Double-Dueling-Deep-Q-Networks-Based Approach. J. Mar. Sci. Eng. 2024, 12, 2214.
Academic Editor: Yassine Amirat
Received: 6 November 2024
Revised: 25 November 2024
Accepted: 29 November 2024
Published: 2 December 2024
Copyright: © 2024 by the authors.
Licensee MDPI, Basel, Switzerland.
This article is an open access article
distributed under the terms and
conditions of the Creative Commons
Attribution (CC BY) license (https://
Yanli Xu *, Songtao He, Zirui Zhou and Jingxin Xu
College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China; (S.H.)
*Correspondence:; Tel.: +86-021-38282805
Abstract: Traditional network architectures in smart ship communication systems struggle to efficiently manage the integration of heterogeneous sensor data. Additionally, conventional end-to-end transmission algorithms that rely on single-metric and single-path selection are inadequate in fulfilling the high reliability and real-time transmission requirements essential for high-priority service data. This inadequacy results in increased latency and packet loss for critical control information. To address these challenges, this study proposes an innovative ship network framework that synergistically integrates Software-Defined Networking (SDN) and Time-Sensitive Networking (TSN) technologies. Central to this framework is the introduction of a redundant multipath selection algorithm, which leverages Double Dueling Deep Q-Networks (D3QNs) in conjunction with Graph Convolutional Networks (GCNs). Initially, an optimization function encompassing transmission latency, bandwidth utilization, and packet loss rate is formulated within a software-defined time-sensitive network transmission framework tailored for smart ships. The proposed D3QN-GCN-based algorithm effectively identifies optimal working and redundant paths for TSN switches. These dual-path configurations are then disseminated by the SDN controller to the TSN switches, enabling the TSN’s inherent reliability redundancy mechanisms to facilitate the simultaneous transmission of critical service flows across multiple paths. Experimental evaluations demonstrate that the proposed algorithm exhibits robust convergence characteristics and significantly outperforms existing algorithms in terms of reducing network latency and packet loss rates. Furthermore, the algorithm enhances bandwidth utilization and promotes balanced network load distribution. This research offers a novel and effective solution for shipboard switch path selection, thereby advancing the reliability and efficiency of smart ship communication systems.
Keywords: smart ship; software-defined networking; time-sensitive networking; double dueling
deep Q-networks; redundant path selection
1. Introduction
In the era of Industry 4.0, rapid advancements in computer and next-generation
information technologies have led to the maturation of intelligent factories, smart vehicles,
and advanced aerospace systems. Traditional sectors like maritime transport have also
begun embracing intelligent development, resulting in the emergence of smart ships. A smart ship is a modernized vessel that obtains real-time operational data through sensor, communication, and other intelligent technologies, achieves intelligent navigation and management through computer and automatic control technologies, and is characterized by safety, environmental protection, economy, and reliability.
However, compared to smart cars and the aerospace industry, the progress of ship
intelligence has been relatively slow. This is mainly due to the unique characteristics of
ships, such as complex sailing environments, limited communication conditions, and the
high specialization of traditional systems. On the one hand, ship electronic systems have long relied on specialized industry protocol standards. Although these standards are highly standardized, the protocols are ill-suited to the connectivity requirements of modern, general-purpose intelligent devices. During ship digitization, intelligent systems need to integrate diverse data and functional modules. However, communication between systems continues to depend on heterogeneous protocols, resulting in inefficient information exchange and inadequate system interoperability. Therefore, a novel ship communication network architecture is necessary to support heterogeneous data interactions among intelligent terminals. On the other hand, the marine environment presents significant challenges to ship information systems. Under harsh sea conditions in particular, the reliability of information transmission becomes crucial. Smart ships therefore urgently need highly reliable communication network protocols to guarantee their safe operation [6].
Traditional ships employ communication technologies such as the National Marine Electronics Association (NMEA) protocol based on the Controller Area Network (CAN) [2], as well as network protocols like Modbus. These protocols cannot meet the demands of intelligent operations. In advanced ships, the large volume of data from complex devices creates high demands for bandwidth and diverse transmission capabilities. Ethernet technology, known for its high bandwidth and low cost, is now used in intelligent vehicles, smart factories, aerospace, and many other smart applications. However, standard Ethernet cannot handle real-time and critical data streams and lacks high-reliability redundancy and fault-tolerance mechanisms. Consequently, Ethernet in existing ship information systems is mainly used for non-critical business data transmission. Intelligent ships therefore urgently need Ethernet protocol technologies that support real-time services and redundant transmission.
To tackle these challenges, new networking technologies have emerged. Software-Defined Networking (SDN) is a network architecture that decomposes the traditional distributed hardware network into a centralized control plane and a data forwarding plane, representing a new generation of network technology programmable via software. The software-programmable control plane can be applied to large networks for unified path planning and dynamic scheduling. Its programmable data plane supports custom data structures, enabling the unification and transformation of data from different protocols and facilitating communication between heterogeneous systems [9].
Additionally, Time-Sensitive Networking (TSN), based on the Medium Access Control (MAC) layer of Ethernet, has been developed to address the lack of real-time processing and traffic scheduling in traditional Ethernet. TSN is a collection of protocols proposed by the IEEE 802.1 working group for industrial internet applications. Originating from the Audio Video Bridging (AVB) protocol for real-time audio and video transmission, TSN now includes time-sensitive transmission protocols supporting various data streams, covering high-precision time synchronization, traffic shaping and scheduling, redundancy fault tolerance, and network configuration. TSN technology has been widely applied in autonomous driving systems of intelligent vehicles and shows promising prospects in advanced aerospace [13].
In summary, to meet the needs of smart ships for heterogeneous communication protocol integration and for the real-time, reliable transmission of high-priority data, this paper proposes a network architecture for smart ship communication systems based on SDN and TSN technologies, termed Ship Software-Defined Time-Sensitive Networking (SSDTSN). In this architecture, the control plane adopts SDN technology to converge the heterogeneous protocols of multiple shipboard devices and to handle configuration tasks such as path selection, while the data plane adopts TSN-based Ethernet technology to ensure reliable, real-time data transmission.
Currently, TSN’s time synchronization and traffic scheduling protocols guarantee real-time communication performance. Meanwhile, its redundancy fault-tolerance protocol enhances transmission reliability by replicating high-priority traffic across multiple paths, ensuring seamless redundancy. However, the TSN protocol itself does not specify how these reliable multipaths should be selected. An efficient path selection algorithm is therefore required to realize the redundancy fault-tolerance mechanism of TSN. Given that TSN is a protocol standard for the MAC sublayer of the data link layer in the Ethernet model, the selection of multiple paths in the redundancy mechanism of a data plane TSN switch depends on the path selection algorithms implemented in the SDN control layer. Realizing highly reliable redundant transmission over TSN in the SSDTSN architecture requires selecting multiple data transmission paths based on various link characteristics. Therefore, this paper focuses on data link layer path selection based on the SDN data plane.
Existing research primarily uses shortest path algorithms such as Shortest Path First (SPF) for SDN path selection, and few studies employ deep reinforcement learning (DRL) for the optimal selection of multiple redundant paths in basic SDN architectures. Therefore, within the proposed SSDTSN framework, this paper designs a ship redundant path selection algorithm based on a Double Dueling Deep Q-Network (D3QN) and a Graph Convolutional Network (GCN). The GCN is a type of neural network designed for processing and analyzing graph-structured data. GCNs leverage the structure of the network topology to model the relationships and dependencies among network elements, such as switches and links, within a communication network. Integrating the GCN with the D3QN enables the algorithm to learn the optimal path selection strategy through reinforcement learning while exploiting the inherent graph structure of ship communication networks for more informed and efficient decision making. The algorithm selects multiple redundant paths for transmitting high-priority traffic to meet smart ships’ low-latency and high-reliability requirements.
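To make the role of the GCN concrete, the following is a minimal sketch of one graph-convolution step in the widely used symmetric-normalized form, ReLU(D^-1/2 (A + I) D^-1/2 X W). It is an illustration only: the four-switch ring topology, the two per-switch features, and the weight matrix are hypothetical, and the text does not state which GCN variant or layer sizes the authors use.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalization of the adjacency matrix.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    out = matmul(matmul(norm, feats), weight)
    return [[max(v, 0.0) for v in row] for row in out]

# Hypothetical ring of four switches; features: [utilization, latency in ms].
adj = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
feats = [[0.2, 1.0], [0.5, 2.0], [0.1, 1.5], [0.9, 3.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, for illustration only
embedding = gcn_layer(adj, feats, weight)
```

Each output row mixes a switch's own features with those of its neighbors, which is how topology information can enter the state seen by the reinforcement learning agent.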
The main contributions of this paper are as follows:
(1) By integrating SDN and TSN technologies, we propose a novel network architecture for smart ship information systems.
(2) We establish a switch path selection model for ship software-defined networks and design a path selection algorithm based on a D3QN and GCN, along with a ship redundant multipath selection algorithm.
(3) Through simulations, we validate the convergence and effectiveness of the proposed algorithm. Experimental results indicate that the algorithm surpasses existing methods in terms of latency, packet loss, and bandwidth utilization in the simulated network topology.
The remainder of this paper is organized as follows: Section 2 reviews the current status of path selection algorithms under software-defined networks. Section 3 presents the optimized architecture of ship software-defined time-sensitive networks and the modeling of switch path selection. Section 4 details the design of the path selection algorithm based on a D3QN and the ship redundant multipath selection algorithm. Section 5 provides an experimental evaluation of the proposed algorithm. Finally, Section 6 concludes this paper and discusses future work.
2. Related Work
The mainstream path selection algorithms for SDN and TSN can be broadly categorized into the following types: traditional shortest path optimization algorithms, algorithms based on deep learning and machine learning, algorithms based on reinforcement learning, and algorithms based on DRL [16].
A. Traditional Shortest Path Optimization Algorithms
In Ethernet path selection, the predominant algorithms are route optimization methods based on Shortest Path First (SPF). These algorithms primarily use a single metric, such as bandwidth utilization or transmission latency, as the link cost and employ the Dijkstra or Bellman–Ford algorithm to select the shortest path. However, because they rely on a single metric, the selected path may not be optimal under comprehensive conditions and can easily lead to link congestion. In recent years, extensions of SPF that aim to reduce congestion have incorporated multiple metrics. Heuristic algorithms, genetic Ant Colony Optimization (ACO) algorithms, and Simulated Annealing algorithms can effectively select better paths and alleviate network congestion. Nevertheless, these intelligent optimization algorithms introduce path selection latency owing to their computational complexity. Traditional shortest path optimization algorithms therefore exhibit limitations when addressing complex, multi-objective network optimization problems, and more efficient algorithms are needed to meet the performance demands of contemporary networks.
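As an illustration of the single-metric approach described above, the sketch below runs Dijkstra's algorithm with one link cost (here latency in milliseconds). The switch names and cost values are hypothetical and do not come from the paper's experiments.

```python
import heapq

def shortest_path(links, src, dst):
    """Dijkstra over links = {node: {neighbor: cost}}; returns (path, total_cost)."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in links[u].items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None, float("inf")
    # Walk the predecessor chain back from the destination.
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1], dist[dst]

# Hypothetical four-switch topology with latency (ms) as the only link cost.
latency = {
    "s1": {"s2": 2.0, "s3": 5.0},
    "s2": {"s1": 2.0, "s4": 2.0},
    "s3": {"s1": 5.0, "s4": 1.0},
    "s4": {"s2": 2.0, "s3": 1.0},
}
path, cost = shortest_path(latency, "s1", "s4")
```

Because only latency enters the cost, every flow between `s1` and `s4` is steered onto the same links regardless of their load, which is the congestion risk noted above.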
B. Routing Algorithms Based on Deep Learning and Machine Learning
In SDN-based communication systems, deep learning and machine learning have been applied to improve performance in areas such as routing selection and scheduling. Ampratwum et al. proposed a framework based on deep neural networks to identify the quality of service requirements of flows and provide the necessary routing strategies; tests showed that it met latency requirements faster than heuristic algorithms. Awad et al. developed a machine-learning-based multipath routing framework that selects routing schemes based on network states and routing requests, addressing the multipath routing problem in SDN with link constraints and flow rule space constraints. Azzouni et al. used Long Short-Term Memory (LSTM) recurrent neural networks and deep neural networks to learn traffic characteristics, and used supervised learning to choose routing paths, which improved network throughput and lowered routing costs. However, routing optimization algorithms that rely on supervised learning require large-scale datasets to be collected and labeled, leading to high implementation costs and an inability to meet the performance needs of data transmission. Consequently, although deep learning and machine learning methodologies offer theoretical advantages, their widespread adoption in practical network environments is constrained by high implementation costs and a dependence on large-scale datasets.
C. Routing Algorithms Based on Reinforcement Learning
Reinforcement learning is a dynamic optimization algorithm where agents make
decisions through continuous interactions with the environment, making it suitable for
solving routing optimization problems.
Q-learning is a widely adopted reinforcement learning algorithm that identifies optimal strategies through environmental interactions. The Q-value estimates the expected return of taking a specific action in a given state. A Q-table, structured as a two-dimensional matrix, stores the Q-values for all possible state-action pairs. Using the Q-table, Q-learning iteratively evaluates and updates the values of state-action pairs. This iterative process enables the algorithm to converge on the optimal Q-value function, from which the optimal decision-making policy is derived [27].
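The Q-table update described above can be sketched with the standard tabular Q-learning rule, Q(s, a) ← Q(s, a) + α [r + γ max Q(s', a') − Q(s, a)]. The state and action names, the learning rate `alpha`, and the discount `gamma` below are hypothetical values chosen for illustration.

```python
from collections import defaultdict

def q_update(q_table, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(state, action)."""
    # Greedy estimate of the best value reachable from the next state.
    best_next = max((q_table[(next_state, a)] for a in next_actions), default=0.0)
    td_target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (td_target - q_table[(state, action)])
    return q_table[(state, action)]

# Hypothetical routing step: at switch s1, forward to s2 with reward -2 (negative latency).
q = defaultdict(float)
q[("s2", "to_s4")] = 1.0
new_value = q_update(q, "s1", "to_s2", reward=-2.0,
                     next_state="s2", next_actions=["to_s4", "to_s1"])
```

The table holds one entry per state-action pair, so its size grows with the product of the state and action counts.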
Houda Hassen et al. explored the deployment of a Q-learning algorithm to solve routing optimization from the perspective of minimizing latency. By combining greedy and SoftMax methods, they proposed an improved Q-learning algorithm based on congestion avoidance. Rischke et al. introduced the QR-SDN algorithm based on Q-learning, which can be applied to multipath routing between terminal switch pairs while maintaining flow integrity. Casas-Velasco et al. proposed an intelligent routing algorithm based on reinforcement learning, using link state information to make routing decisions that adapt to dynamic traffic and meet quality of service requirements. However, smart ships require fine-grained control over multi-source heterogeneous data streams, especially for the real-time and reliable transmission of critical data such as course control, emergency braking and steering, and alarm information. As intelligent maritime networks expand, Q-table-based reinforcement learning algorithms require more storage to maintain information such as system states, action sets, and reward values. The growing data volume prolongs query times, which in turn delays path selection, raises network latency, and increases packet loss rates. Therefore, although reinforcement-learning-based routing algorithms demonstrate potential in dynamic network environments, their application is constrained in networks characterized by high complexity and stringent real-time requirements. To address these challenges and fulfill practical application needs, it is essential to optimize the algorithmic structure.
D. Routing Algorithms Based on DRL
In complex, large-scale, or continuous environments, the state and action spaces can become exceedingly large, causing the Q-table to expand dramatically and resulting in prohibitive storage and computational costs. To address this limitation, DRL employs deep neural networks, called Q-networks, to approximate the Q-value function. These Q-networks enable efficient learning and decision making in intricate environments by handling extensive state and action spaces.
W. Liu et al. employed Deep Q-Networks (DQNs) and a Deep Deterministic Policy Gradient (DDPG) to construct deep-reinforcement-learning-based routing (DRL-R). Compared to Open Shortest Path First (OSPF), DRL-R achieved shorter flow completion times, higher throughput, better load balancing, and improved robustness, with the DDPG outperforming the DQN. D. Xia et al. proposed a routing algorithm based on DRL for heterogeneous factory networks. Using a Double DQN, they implemented action selection and evaluation through different value functions to address the overestimation problem. Shinde et al. introduced a DRL algorithm called Advantage Actor–Critic (A2C), where the advantage function measures how much better an action is compared to other actions in a given situation.
In DRL algorithms applied to path selection problems, the DQN and its variants, which are particularly suited to discrete action spaces, are predominantly used. The DQN is based on Q-learning and replaces the traditional Q-table with a deep neural network that estimates the value function of state-action pairs. However, the original DQN has certain shortcomings, which led to the development of several variants, such as the Double DQN and the Dueling DQN [35].
The Double DQN addresses the problem of Q-value overestimation in a traditional
DQN by introducing an online network and a target network. In this setup, the online
network is responsible for selecting the next action. In contrast, the target network evaluates
the Q-value of that action, thus avoiding the bias that arises when action selection and
value evaluation use the same network. The Dueling DQN, on the other hand, optimizes
the network structure by decomposing the Q-value function into a state value function
and an advantage function. This decomposition enhances the policy’s decision-making
capability and improves learning efficiency.
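The decoupling of action selection from value evaluation can be sketched as the standard Double DQN training target, y = r + γ Q_target(s', argmax_a Q_online(s', a)). The Q-value lists below stand in for network outputs and are hypothetical numbers.

```python
def double_dqn_target(reward, gamma, q_online_next, q_target_next, done=False):
    """Double DQN target: online network picks the action, target network scores it."""
    if done:
        return reward  # no bootstrap at a terminal state
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best_action]

# Hypothetical next-state Q-values for three candidate paths.
q_online_next = [1.0, 3.0, 2.0]   # online network prefers action 1
q_target_next = [0.5, 1.5, 4.0]   # target network overrates action 2
y_double = double_dqn_target(1.0, 0.9, q_online_next, q_target_next)
y_vanilla = 1.0 + 0.9 * max(q_target_next)  # single-network DQN target, for comparison
```

In this example the Double DQN target is 2.35, whereas taking the maximum of the target network alone gives 4.6, which illustrates how the decoupling tempers overestimation.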
Combining the strengths of the Double DQN and the Dueling DQN, the D3QN improves algorithm performance. The D3QN architecture includes two streams: one outputs the state value, and the other outputs the advantage value for each action. By merging these two streams to compute the final Q-value, the D3QN provides a more accurate value function estimate, reduces Q-value overestimation, and yields a more stable training process, faster convergence, and better policy performance. Therefore, this paper introduces a D3QN into the smart ship network system and investigates the reliable path selection problem for SDN-converged TSN networks.
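One common way to merge the two streams is the mean-subtracted aggregation used in the original Dueling DQN formulation, Q(s, a) = V(s) + A(s, a) − mean over a' of A(s, a'). The sketch below assumes that form; the text does not state which aggregation the authors use, and the numbers are hypothetical.

```python
def dueling_q(state_value, advantages):
    """Combine a scalar state value with per-action advantages into Q-values."""
    mean_adv = sum(advantages) / len(advantages)
    # Subtracting the mean keeps the value and advantage streams identifiable.
    return [state_value + a - mean_adv for a in advantages]

# Hypothetical outputs of the two streams for three candidate paths.
q_values = dueling_q(2.0, [1.0, -1.0, 0.0])
best_action = max(range(len(q_values)), key=q_values.__getitem__)
```

Here `q_values` is `[3.0, 1.0, 2.0]`, so the first candidate path would be selected.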
3. Optimized Architecture of Ship Networks and Modeling of Switch Path Selection
3.1. Design of Optimized Architecture for Smart Ship Networks
This section constructs an industrial Ethernet architecture for ships that supports heterogeneous multi-protocol communication based on SDN and TSN technologies, namely the SSDTSN. In this architecture, SDN technology decomposes the network into a control plane and a data plane. The DRL algorithms are deployed in the control plane, while TSN technology is embedded into the switches in the data plane. The data plane interfaces with sensors, the Automatic Identification System (AIS), the Global Positioning System (GPS), and other terminal devices through intelligent gateways. The control plane connects to upper-layer applications via the Application Programming Interface (API), enabling upper-level application control or 5G remote intelligent control.
3.1.1. Architecture of SSDTSN
The DRL-R architecture integrates DRL with SDN: the DRL agent, deployed on the SDN controller, continuously interacts with the network from a global perspective, thereby enabling optimized routing decisions. Building upon this framework and incorporating TSN, this paper proposes a novel SDN architecture specifically tailored for ship applications. The SSDTSN architecture is illustrated in Figure 1. First, the network controller (the SDN control plane) acquires the network topology, link information, and traffic data through data plane switches. It obtains link latency information via the generalized Precision Time Protocol (gPTP), the precise time synchronization protocol of TSN switches, and gathers bandwidth and packet information of links and ports through periodic statistical reports. The DRL module in the control plane uses the global information of the data plane obtained via SDN to learn the real-time state of data transmission over network links. It then issues new switch flow table entries to control data forwarding paths and traffic scheduling.
Figure 1. SSDTSN architecture diagram.
After the periodic issuance of control information, the data perception layer collects navigation data and ship-specific data through intelligent terminal devices. The heterogeneous integration layer converts various ship communication protocols into Virtual Local Area Network (VLAN) Ethernet data frames supporting TSN via intelligent gateways. In the data forwarding layer, TSN switches transmit data from ship intelligent systems based on TSN’s time synchronization, traffic scheduling, and redundancy fault-tolerance mechanisms.
The proposed algorithm is primarily applied within the data forwarding layer to implement the redundancy fault-tolerance mechanism. Specifically, the SDN control plane selects and distributes both working and redundant paths. Then, using TSN’s redundancy protocol IEEE 802.1CB, data streams are replicated and transmitted redundantly.
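The replicate-and-eliminate idea behind IEEE 802.1CB can be sketched as follows: frames are tagged with sequence numbers, copied onto both paths, and the receiver keeps the first copy of each sequence number. This is a deliberately simplified model, not the standard's sequence recovery function (which works over a bounded history window), and the frame contents and path names are hypothetical.

```python
def replicate(frames, paths):
    """Tag each frame with a sequence number and copy it onto every path."""
    return {path: [(seq, frame) for seq, frame in enumerate(frames)] for path in paths}

def eliminate(arrivals):
    """Accept the first copy of each sequence number and drop later duplicates."""
    seen = set()
    delivered = []
    for seq, frame in arrivals:
        if seq not in seen:
            seen.add(seq)
            delivered.append((seq, frame))
    # Return frames in sequence order.
    return [frame for _, frame in sorted(delivered)]

frames = ["rudder=5", "rudder=7", "alarm=1"]
copies = replicate(frames, ["working", "redundant"])
# Simulate the loss of frame 1 on the working path.
working = [c for c in copies["working"] if c[0] != 1]
received = eliminate(working + copies["redundant"])
```

Even though the working path dropped one frame, `received` contains all three frames once, because the redundant copy fills the gap and the duplicates are discarded.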
3.1.2. Example of SSDTSN Topology
The China Classification Society (CCS) has published intelligent ship standards that categorize a vessel’s systems into six primary domains: intelligent navigation, intelligent hull, intelligent machinery, intelligent energy efficiency management, intelligent cargo management, and intelligent integrated platform. Building upon the proposed SSDTSN architecture, this paper introduces a novel network topology for intelligent ships. An example of the ship’s SSDTSN topology is illustrated in Figure 2, where each system is connected to the ship’s communication network through its respective TSN switch. The SDN controller centrally manages the TSN switches of each independent system, ensuring unified control and seamless integration across the vessel’s intelligent systems.
Figure 2. Topology diagram of SSDTSN.
3.2. Modeling of Path Selection Problems
Based on the optimized smart ship network architecture, this section systematically models the path selection problem by introducing key parameters such as transmission latency, bandwidth utilization, and packet loss rate to ensure efficient and reliable data transmission.
3.2.1. Parameter Definition
In smart ship communication networks, based on an existing study, the network topology model is an undirected graph $G = (V, E)$, where the set of TSN switch nodes is represented as $V = \{v_1, v_2, \ldots, v_M\}$, with $M$ being the number of nodes. The set of links between nodes is represented as $E = \{e_1, e_2, \ldots, e_N\}$, where $N$ is the number of links. For data flow $d_i$, the source node is $v_s$ and the destination node is $v_d$. The set of reachable paths is $P = \{p_1, p_2, \ldots, p_J\}$, where the probability of transmitting via path $p_j$ is $w_j$. Additional definition parameters are shown in Table 1.
Table 1. The notation definition.
Notation      Definition
$d_i$         Data flow
$P$           End-to-end transmission path
$p_i$         Transmission links of flow $d_k$ at time $t$
$L_h$         Transmission latency of flow $d_k$ at time $t$
$\bar{L}$     Average transmission latency of $J$ data flows
$U_{used}$    Used bandwidth, including the number of bytes received ($Rx$) and sent ($Tx$) during statistical time $t$
$B$           Total path bandwidth
$\bar{U}$     Average bandwidth utilization of $J$ data flows
$U_{cv}$      Bandwidth load coefficient of variation
$R_{d_i}$     Packet loss rate of the transmission path for $d_i$
$\bar{R}$     Average packet loss rate of $J$ data flows
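Definition 1 (average path transmission latency). During path transmission, if packet loss occurs, data retransmission is required, so the path transmission latency depends on the packet loss rate. The path latency therefore comprises the initial transmission latency plus the retransmission latency associated with the packet loss rate.
For any single data flow $d_i$, each link it traverses from its source node to its destination node contributes a link latency. After traversing $h$ links, the total path latency, denoted $L_h$, is the accumulated initial and retransmission latency of those links (Equation (1)).
The average transmission latency of K data flows is
$$\bar{L} = \frac{1}{K} \sum_{i=1}^{K} L_{d_i}, \tag{2}$$
where $L_{d_i}$ is the total path latency of flow $d_i$.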
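Definition 2 (average packet loss rate). The SDN controller obtains link packet loss rates through statistical data. For any data flow $d_i$, the packet loss rates of the links it traverses from its source node to its destination node are $R_i$. The packet loss rate of the transmission path for $d_i$, denoted $R_{d_i}$, is obtained from these link loss rates (Equation (3)).
The average packet loss rate of K data flows is
$$\bar{R} = \frac{1}{K} \sum_{i=1}^{K} R_{d_i}. \tag{4}$$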
Definition 3 (bandwidth load coefficient of variation). The SDN controller obtains statistical information from switch node ports, including the number of bytes received $Rx$ and the number of bytes transmitted $Tx$ during the statistical time $t$. The used bandwidth is calculated as
$$U_{used} = Rx + Tx. \tag{5}$$
Let $B$ be the total bandwidth; then, the bandwidth utilization is
$$U = \frac{U_{used}}{B}. \tag{6}$$
The average bandwidth utilization of the K data flows is
$$\bar{U} = \frac{1}{K} \sum_{i=1}^{K} U_i, \tag{7}$$
where $U_i$ is the bandwidth utilization for flow $d_i$.
The coefficient of variation (CV) has been used in prior work to quantify the degree of data dispersion. To assess the load balance of all paths, this paper uses the CV as the measure. The standard deviation $\sigma_{U_{used}}$ and the mean $\mu_{U_{used}}$ of the bandwidth occupied by the paths determine the average load coefficient of variation, $U_{cv}$. The lower the value of $U_{cv}$, the more balanced the load. The average load coefficient of variation of network bandwidth is
$$U_{cv} = \frac{\sigma_{U_{used}}}{\mu_{U_{used}}}. \tag{8}$$
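The quantities in Definitions 2 and 3 can be illustrated with a short sketch. Two points are assumptions rather than statements from the text: the path loss rate is computed as 1 − ∏(1 − R_i), which holds only if link losses are independent, and the coefficient of variation uses the population standard deviation. Byte counts and bandwidth are assumed to be expressed in the same units over the same statistical interval.

```python
import math
import statistics

def path_loss_rate(link_loss_rates):
    """Assumed model: a packet survives the path only if it survives every link."""
    return 1.0 - math.prod(1.0 - r for r in link_loss_rates)

def bandwidth_utilization(rx_bytes, tx_bytes, total_bandwidth):
    """Used bandwidth (Rx + Tx) divided by the total bandwidth B."""
    return (rx_bytes + tx_bytes) / total_bandwidth

def load_cv(used_bandwidths):
    """Coefficient of variation: standard deviation over mean of per-path load."""
    mean = statistics.fmean(used_bandwidths)
    return statistics.pstdev(used_bandwidths) / mean

loss = path_loss_rate([0.1, 0.2])                    # two-link path
utilization = bandwidth_utilization(300, 200, 1000)  # hypothetical counters
cv_balanced = load_cv([50.0, 50.0])                  # identical loads
cv_skewed = load_cv([40.0, 60.0])                    # uneven loads
```

A perfectly balanced pair of paths gives a coefficient of variation of zero, and the value grows as the load becomes more uneven.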
3.2.2. Problem Modeling
In an SDN architecture integrated with TSN, two paths must be selected to achieve the real-time and reliable transmission of critical service flows. Both the working path and the redundant path should use links with low latency, a low packet loss rate, and low bandwidth utilization. The optimization objective, Equation (9), is subject to the following constraints:
$$\sum_{k \in J} w_k = 1, \tag{10}$$
$$w_k \geq 0, \quad p_k \in P, \; k \in J, \tag{11}$$
$$L_h < L_{max}, \tag{12}$$
$$U_{used} < B, \tag{13}$$
$$\mu + \nu + \lambda = 1, \quad 0 \leq \mu, \nu, \lambda \leq 1. \tag{14}$$
Among these, Equation (9) is the optimization objective, satisfying the requirements for different network latency and packet loss rates.
Equation (10) covers all forwarding paths of a data flow between the source node and the destination node: their selection probabilities sum to one.
Equation (11) indicates that the probability of selecting a forwarding path is non-negative.
Equation (12) stipulates that the transmission latency of a data flow does not exceed the data transmission latency threshold.
Equation (13) specifies that the bandwidth used by a data flow cannot exceed the total link bandwidth.
Equation (14) represents the weight values for network bandwidth utilization, network transmission latency, and network packet loss rate, respectively.
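The constraints in Equations (10) to (14) can be checked mechanically for a candidate solution. The sketch below is illustrative: the function names, the tolerance `tol`, and the numeric values are hypothetical, and the probability constraint is read as non-negativity.

```python
def probabilities_valid(probs, tol=1e-9):
    """Equations (10) and (11): path probabilities are non-negative and sum to one."""
    return all(w >= 0.0 for w in probs) and abs(sum(probs) - 1.0) <= tol

def path_feasible(latency, used_bw, max_latency, total_bw):
    """Equations (12) and (13): latency below threshold, used bandwidth below total."""
    return latency < max_latency and used_bw < total_bw

def weights_valid(mu, nu, lam, tol=1e-9):
    """Equation (14): the three weights lie in [0, 1] and sum to one."""
    in_range = all(0.0 <= w <= 1.0 for w in (mu, nu, lam))
    return in_range and abs(mu + nu + lam - 1.0) <= tol

ok_probs = probabilities_valid([0.7, 0.3])
ok_path = path_feasible(latency=4.0, used_bw=600.0, max_latency=10.0, total_bw=1000.0)
ok_weights = weights_valid(0.4, 0.3, 0.3)
```

A candidate working or redundant path that fails `path_feasible` would be excluded before any objective value is compared.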
4. DRL of Redundant Multipath Selection Algorithms
This section details the design of the redundant multipath selection algorithm, based on a D3QN and on the network model and problem formulation presented above.
4.1. Problem Description
With limited network bandwidth and a fixed network architecture, finding the best transmission path with the minimum latency and the minimum packet loss rate is an NP-complete problem. To address this problem, a D3QN-based optimization algorithm is designed, with the reinforcement learning task modeled as a Markov Decision Process (MDP) $(S, A, P, R, \gamma)$ [26].
(1) State: The SDN controller collects the data plane switch network topology, switch
port data, traffic information, and other data. From these, we identify all transmission
paths $P$, the transmission traffic $D$, the bandwidth utilization $U$, the packet loss $R$,
and the path latency $L$, which together form the raw graph feature information.
Subsequently, the GCN is employed to process the network topology and features, thereby
enhancing the quality of the state representation and the efficacy of the learned policy.
The resulting state space is as follows:

$$s_t = \mathrm{GCN}\begin{bmatrix} d_1, d_2, \ldots, d_k \\ p_{d_1}, p_{d_2}, \ldots, p_{d_k} \\ U_{e_1}, U_{e_2}, \ldots, U_{e_N} \\ L_{d_1}, L_{d_2}, \ldots, L_{d_k} \\ R_{d_1}, R_{d_2}, \ldots, R_{d_k} \end{bmatrix} \qquad (15)$$
(2) Action: The D3QN senses the network state information, obtains the real-time
state of each link in the network, and calculates the optimal forwarding path of the data
flow, denoted as $a_t$. For each data stream, the action either keeps the current transfer
path or selects a new path from $P$:

$$a_t = \pi(s_t) = \begin{bmatrix} d_1, d_2, \ldots, d_k \\ p_{d_1}, p_{d_2}, \ldots, p_{d_k} \end{bmatrix} \qquad (16)$$
(3) State transition: After taking action $a_t$, the network state changes and new network
topology feature information becomes available. The new network information is processed
using the GCN to obtain the next state $s_{t+1}$:

$$P(s_{t+1} \mid s_t, a_t) = \Pr\left(s_{t+1} = \mathrm{GCN}[D_{t+1}, P_{t+1}, L_{t+1}, U_{t+1}, R_{t+1}] \mid s_t, a_t\right) \qquad (17)$$
(4) Reward: The D3QN takes into account the load balance, the average network
transmission latency, and the network packet loss rate, and expresses the reward function
$r_t$ as follows:

$$R(s_t, a_t) = \frac{1}{\exp(\mu U_{cv}) + \exp(\nu L_{t+1}) + \exp(\lambda R_{t+1})} \qquad (18)$$

where $\mu, \nu, \lambda \in [0, 1]$ are adjustable weights and $\mu + \nu + \lambda = 1$.
When $\nu = 1$, only the transmission latency of the data streams is optimized; when
$\lambda = 1$, only the transmission packet loss of the data streams is optimized.
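A minimal sketch of the reward in Equation (18) follows. The function and argument names are hypothetical, and the inputs are assumed to be normalized to comparable ranges (for example to [0, 1]) before being passed in; the source does not state how the raw latency and loss values are scaled.

```python
import math


def reward(load_cv, latency, loss_rate, mu, nu, lam):
    """Reward of Equation (18): larger when load CV, latency, and loss are smaller."""
    return 1.0 / (math.exp(mu * load_cv)
                  + math.exp(nu * latency)
                  + math.exp(lam * loss_rate))


# Hypothetical normalized observations for two candidate paths.
fast_path = reward(load_cv=0.6, latency=0.2, loss_rate=0.01, mu=0.4, nu=0.4, lam=0.2)
slow_path = reward(load_cv=0.6, latency=0.8, loss_rate=0.01, mu=0.4, nu=0.4, lam=0.2)
```

Because every term sits in the denominator, a path with lower latency (all else equal) receives the higher reward, and the reward is bounded above by 1/3, which is attained when all three metrics are zero.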
(5) Discount: a constant value between 0 and 1 that weighs the importance of current
rewards against future rewards, defined as follows:

$$\gamma \in [0, 1] \qquad (19)$$
4.2. Path Selection Algorithm Based on D3QN Fusion GCN
To optimize path selection within smart ship communication systems, this section
introduces an algorithm that integrates a D3QN with a GCN, thereby enabling highly
reliable and real-time path selection.
4.2.1. The Optimal Path Selection Process Based on D3QN
The learning and training process of the D3QN optimization algorithm is as follows.
For experience replay and state selection, the D3QN draws random samples from an
experience replay buffer to break the correlation between consecutive samples and improve
the stability of training. In addition, a maximum hop count is set to eliminate
inappropriate path selections.
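The two mechanisms above can be sketched as follows. The class `ReplayBuffer`, the helper `within_hop_limit`, and the representation of a path as a list of switch names are illustrative assumptions rather than the authors' code.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity):
        # A bounded deque silently drops the oldest transition once full.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks temporal correlation.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def within_hop_limit(path, max_hops):
    """A path given as a list of nodes uses len(path) - 1 hops."""
    return len(path) - 1 <= max_hops


buf = ReplayBuffer(capacity=3)
for i in range(5):
    buf.push(state=i, action=0, reward=0.0, next_state=i + 1)
batch = buf.sample(2)
```

With the settings reported later in Table 3, the capacity would be 20,000 and the batch size 512.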
For the action selection strategy, the Q-network uses the $\epsilon$-greedy strategy, a
fundamental action selection method in reinforcement learning, where $\epsilon$ is the
exploration rate that balances exploration and exploitation. With probability $\epsilon$
the agent explores a random action that might yield a higher reward; otherwise it selects
the best-known action [38]:

$$a_t = \begin{cases} \text{random action}, & \text{with probability } \epsilon \\ \arg\max_{a} Q(s_t, a; \theta), & \text{with probability } 1 - \epsilon \end{cases} \qquad (20)$$
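The selection rule can be sketched in a few lines. The function name `epsilon_greedy` and the representation of Q-values as a plain list indexed by action are assumptions made for illustration.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return a random action with probability epsilon, else the greedy action."""
    if rng.random() < epsilon:
        # Explore: any action index, chosen uniformly.
        return rng.randrange(len(q_values))
    # Exploit: index of the largest estimated Q-value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


greedy_action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)
explored_action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=1.0)
```

With `epsilon=0.0` the call always returns the greedy action (index 1 here); with `epsilon=1.0` it always returns a uniformly random index.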
The Q-network of a D3QN adopts the dueling structure [
], which divides the network into a value stream and an advantage stream and estimates
the value $V$ of each state and the advantage $A$ of each action, respectively. The formula
for the Q-value is as follows:

$$Q(s, a; \theta, \alpha, \beta) = V(s; \theta, \beta) + \left( A(s, a; \theta, \alpha) - \frac{1}{|A|} \sum_{a'} A(s, a'; \theta, \alpha) \right) \qquad (21)$$

where $\theta$ represents the shared network parameters; $\alpha$ and $\beta$ are independent
parameters of the advantage and value streams, respectively; and $|A|$ is the size of the
action space.
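The aggregation of the two streams can be illustrated with plain numbers. The sketch assumes the common mean-subtracted form of the dueling combination, with the stream outputs already computed; `dueling_q_values` is a hypothetical helper name.

```python
def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a) by subtracting the mean advantage."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + (a - mean_advantage) for a in advantages]


q = dueling_q_values(state_value=2.0, advantages=[1.0, 2.0, 3.0])
```

Subtracting the mean advantage makes the decomposition identifiable: the average of the resulting Q-values equals the state value, so the value stream cannot drift by an arbitrary constant that the advantage stream compensates for.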
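According to the Bellman criterion [40], the target Q-value $Y_t$ is estimated as

$$Y_t = r_t + \gamma \max_{a} Q(s_{t+1}, a; \theta^{-}) \qquad (22)$$

where $r_t$ is the immediate reward, $\gamma$ is the discount factor, $Q(s_{t+1}, a; \theta^{-})$
is the Q-value of the target network, and $\theta^{-}$ is the parameter of the target network.
The loss function uses the Mean Square Error (MSE) to measure the difference between the
predicted Q-value and the target Q-value:

$$L(\theta) = \mathbb{E}\left[ \left( Y_t - Q(s_t, a_t; \theta) \right)^2 \right] \qquad (23)$$

The parameter $\theta$ of the Q-network is updated by the gradient descent method:

$$\theta \leftarrow \theta - \alpha \cdot \nabla_{\theta} L(\theta) \qquad (24)$$

where $\alpha$ is the learning rate.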
Regarding the update of the target network, the parameter $\theta^{-}$ of the target network
is copied from the main Q-network only at set training intervals, $\theta^{-} \leftarrow \theta$,
in order to reduce the correlation between the main network and the target network. This
keeps the target Q-value stable and lets the predicted Q-value converge iteratively toward
the target Q-value.
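The target and loss computation can be sketched numerically as follows. Note that Equation (22) takes the maximum over the target network, whereas Algorithm 1 states the double-Q form in which the online network chooses the action and the target network evaluates it; the sketch follows the Algorithm 1 form. The function names and the `done` flag are assumptions.

```python
def double_dqn_target(reward, gamma, q_online_next, q_target_next, done=False):
    """Online network selects the next action; target network evaluates it."""
    if done:
        return reward  # no bootstrapping from a terminal state
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


def mse_loss(targets, predictions):
    """Mean squared error between target and predicted Q-values (Equation (23))."""
    return sum((y - q) ** 2 for y, q in zip(targets, predictions)) / len(targets)


# The online network prefers action 1, so the target network's estimate for
# action 1 (0.3) is used rather than its own maximum (0.5).
y = double_dqn_target(reward=1.0, gamma=0.99,
                      q_online_next=[0.2, 0.8], q_target_next=[0.5, 0.3])
loss = mse_loss([1.0, 2.0], [0.0, 2.0])
```

Decoupling action selection from evaluation in this way is what reduces the overestimation bias of the plain maximum in Equation (22).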
4.2.2. Optimal Path Selection Algorithm Based on D3QN Fused with GCN
In [
], the authors introduced a D3QN into vehicular networks to address resource
allocation challenges. Building upon their network model, this study leverages the D3QN
as a foundational framework and incorporates the GCN to extract network feature values.
Consequently, we propose a novel optimal path selection algorithm tailored for
SSDTSN, named Graph-enhanced D3QN (G-D3QN). The workflow of the proposed algorithm
is depicted in Figure 3, and its pseudocode is given in Algorithm 1.
The algorithm acquires the raw state information of the network topology through the SDN
controller and processes it with a GCN. Topology and path features are extracted, and the
resulting high-dimensional embedded representation of the network is output as the state
space of the D3QN. The optimal forwarding paths of the data streams are then computed by
the D3QN algorithm.
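To illustrate the feature extraction step, the sketch below implements a single graph convolution layer in plain Python. It assumes the widely used symmetric-normalized propagation rule with self-loops and a ReLU activation; the source does not state which GCN variant is used, and the adjacency matrix, feature matrix, and weight matrix here are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gcn_layer(adjacency, features, weights):
    """One layer: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adjacency)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    degree = [sum(row) for row in a_hat]
    # Symmetric normalization by node degree.
    norm = [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]
    out = matmul(matmul(norm, features), weights)
    return [[max(0.0, v) for v in row] for row in out]


# Two connected switches with one feature each (for example, normalized latency).
embedding = gcn_layer(adjacency=[[0.0, 1.0], [1.0, 0.0]],
                      features=[[1.0], [3.0]],
                      weights=[[1.0]])
```

In this two-node example each node averages its own feature with its neighbor's, so both embeddings become 2.0; stacking such layers lets a node's representation reflect conditions several hops away.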
Figure 3. G-D3QN schematic diagram.
4.3. Redundant Multipath Selection Algorithm for Smart Ship
Utilizing the previously established SSDTSN framework in conjunction with the
G-D3QN algorithm, this section proposes an innovative redundant multipath selection
technique for intelligent ship communication networks. This technique seeks to increase
fault tolerance and secure seamless communication, thereby elevating the overall reliability
and robustness of data transmissions in smart maritime environments.
4.3.1. Smart Ship Data Flow Priority Classification
Building on existing research in autonomous driving traffic classification [ ] and ship
traffic studies [ ], this paper thoroughly classifies data flows. The traffic priorities are
detailed in Table 2.
Table 2. Communication data types and characteristics.

| Data | Type | Cycle | Requirements | Reliability | Priority |
|---|---|---|---|---|---|
| Navigation data transmission: real-time radar and AIS info | Isochronous | 50 µs–2 ms | Strict time limit, real time | High | High |
| Engine control and monitoring: engine parameters, power system control signals | Cyclic Sync | 100 µs–2 ms | Max latency, real time | High | High |
| Safety alarm system: fire, leakage, emergency stop notifications | Alarms/Events | Async/sudden | Max latency, real time | High | High |
| Sensor data collection: environmental sensors, equipment | Cyclic Async | 2–20 ms | Max latency | High | Medium |
| System configuration and maintenance: equipment parameters, fault diagnostics | Config/Diag | Async/sudden | Bandwidth | Medium | Medium |
| Network management and device control: topology management, start/stop instructions | Network control | Cyclic | Bandwidth | High | Medium |
| Non-critical communication: crew chat, non-critical data | Best Effort | Async/sudden | None | Low | Low |
| Surveillance video transmission: surveillance cameras, entertainment system | Video | Async/sudden | Max latency | Low | Low |
| Voice communication: intercom system, public broadcasting | Audio/Voice | Async/sudden | Max latency | Low | Low |
In ship information systems, diverse data streams impose varying transmission
performance requirements, necessitating specialized transmission strategies. Additionally,
the redundancy mechanisms inherent in TSN require allocating supplementary
communication paths, which mandates that data flows be prioritized based on their significance.
Consequently, the redundant transmission paths examined in this study are primarily
allocated to high-priority traffic, ensuring the reliability of critical data flows. This
classification considers the unique characteristics of marine vessels, the specific attributes of
data streams, and their respective performance demands. Doing so ensures that the transmission
processes within intelligent ships adhere to the traffic priority standards established for
time-sensitive networks.
4.3.2. Introduction to Redundant Multipath Selection Algorithms
In networks implementing TSN technology, data flows can be sent through multiple
redundant pathways, enabling uninterrupted communication [ ]. This study presents
a novel redundant multipath selection algorithm specifically designed for maritime
communication systems. Within the SSDTSN architecture, TSN switches periodically provide
updates on flow characteristics and transmission processes. The SDN control plane employs
the G-D3QN algorithm to select both primary and alternative paths. These path flow tables
are then disseminated to the TSN switches, which oversee the redundant transmission
of high-priority traffic. This approach enhances fault tolerance and ensures continuous
communication, thereby boosting the reliability and robustness of data transmission in
intelligent maritime settings. The structure of the algorithm is shown in Figure 4.
Figure 4. Redundant multipath selection algorithms structure.
The redundant multipath selection algorithm starts by configuring the SDN controller
and TSN switches to collect real-time network topology information, such as path delays
and switch port statuses. This step is crucial because it provides a clear picture of the
current network conditions and forms the foundation for optimal path selection. After
gathering this information, the SDN controller uses Algorithm 1 to determine the working path
by selecting the route with the lowest latency that also meets acceptable packet loss rates,
ensuring the efficient and reliable transmission of critical data services. Next, to enhance
fault tolerance, the controller chooses a redundant path that overlaps with the working path
as little as possible, ideally using different physical links or nodes to avoid single points of
failure. Once both paths are determined, they are loaded into flow tables and distributed
to the TSN switches in the data plane. Finally, high-priority traffic is sent simultaneously
over both the working and redundant paths, ensuring that communication remains
uninterrupted even if one of the paths fails or degrades. Algorithm 2 outlines
the pseudocode of the proposed algorithm.
Algorithm 1 Path Selection Algorithm Based on D3QN Fusion GCN
Require: Network topology information, including time latency, packet loss rate, and
    bandwidth occupancy rate
Ensure: The end-to-end traffic transmission path
GCN preprocessing: input the related network topology information into the graph
    convolutional neural network to obtain the topological graph feature information
D3QN initialization: initialize the experience replay buffer $D$, the online
    network parameter $\theta$, and the target network parameter $\theta^{-}$
for episode = 1 to $M$ do
    Initialize state $s_1$
    Initialize action $a_1$
    for $t$ = 1 to $T$ do
        Select action $a_t$ according to the $\epsilon$-greedy strategy
        Execute action $a_t$, observe the new state $s_{t+1}$, and receive reward $r_t$
        Store transition $(s_t, a_t, r_t, s_{t+1})$ in $D$
        Update the online network:
            Sample a batch of experiences from $D$
            Calculate the target $Y_t = r_t + \gamma Q(s_{t+1}, \arg\max_{a} Q(s_{t+1}, a; \theta); \theta^{-})$
            Calculate the loss function $L = (Y_t - Q(s_t, a_t; \theta))^2$
            Minimize the loss $L$ by gradient descent
        if it is time to update the target network then
            Update target network parameters: $\theta^{-} \leftarrow \theta$
        end if
    end for
end for
Algorithm 2 Redundant Multipath Selection Algorithm for Smart Ships
Require: Network topology, including node and link information, and end-to-end node
Ensure: Working path and redundant path
Initialize the SDN controller and TSN switches to obtain path latency and switch port
    status
Use Algorithm 1 to determine the working path: select a working path with the best
    time latency and a better packet loss rate
Use Algorithm 1 to determine the redundant path: select a redundant path with the
    minimum association with the working path
The SDN controller sends the traffic transfer instructions to the data plane TSN switches
Key service flows are transmitted over the two paths simultaneously
end
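The working/redundant split in Algorithm 2 can be illustrated without the learned policy. The sketch below deliberately swaps in Dijkstra's shortest-path search on latency as a stand-in for Algorithm 1, and models "minimum association" as a large cost penalty on links already used by the working path. The function names, the penalty value, and the four-switch topology are all assumptions for illustration.

```python
import heapq


def shortest_path(graph, src, dst, penalized=frozenset(), penalty=0.0):
    """Dijkstra on graph[u][v] = latency; links in `penalized` cost extra."""
    dist, prev, heap = {src: 0.0}, {}, [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, latency in graph[u].items():
            extra = penalty if frozenset((u, v)) in penalized else 0.0
            nd = d + latency + extra
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def select_paths(graph, src, dst, penalty=1e6):
    """Working path first, then a redundant path that avoids its links if possible."""
    working = shortest_path(graph, src, dst)
    if working is None:
        return None, None
    used_links = {frozenset(link) for link in zip(working, working[1:])}
    redundant = shortest_path(graph, src, dst, used_links, penalty)
    return working, redundant


# Hypothetical topology: link latencies in microseconds between four switches.
topology = {
    "S1": {"S2": 10.0, "S3": 15.0},
    "S2": {"S1": 10.0, "S4": 10.0},
    "S3": {"S1": 15.0, "S4": 15.0},
    "S4": {"S2": 10.0, "S3": 15.0},
}
working_path, redundant_path = select_paths(topology, "S1", "S4")
```

Using a penalty instead of removing the links means a redundant path is still returned when no fully link-disjoint route exists, in which case it shares as few links as possible with the working path.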
5. Experimental Evaluation
In this section, we establish a virtual simulation environment to validate the
effectiveness of the proposed algorithm and compare its performance with other algorithms.
By configuring experimental parameters such as the simulated topology of the ship
communication system, flow latency, bandwidth, and packet loss, we first test the convergence
and effectiveness of our algorithm. Subsequently, using this simulation environment, we
compare the performance of our algorithm with the DQN algorithm, the ACO algorithm,
and the OSPF algorithm in terms of latency and the packet loss rate under both single-path
and dual-path scenarios.
5.1. Experimental Configuration
In this section, we developed a virtual simulation environment utilizing the Python-based
NetworkX library [ ], drawing on existing studies [ ]. This environment
facilitated the flexible implementation of network topologies and accelerated the training
process. The experimental topology streamlines the intelligent maritime network system
by reducing it to essential switching nodes, resulting in a network structure composed of
multiple switching nodes and data terminals. This simplification allows us to concentrate
on the network’s core functionalities without the computational burden of modeling the
entire complexity of real-world intelligent ship systems. By decreasing complexity, we can
more effectively analyze and interpret the results, a common strategy in network simulation
research to balance realism with computational feasibility [45].
Experimental Hardware Configuration: GPU: NVIDIA GeForce RTX 4090; CPU:
Intel Core i7-14700KF; Memory: 128 GB; Operating System: Windows 11 64-bit.
Experimental Environment Configuration: The simulation environment is developed
using the PyTorch framework, which offers dynamic computation graphs and GPU
acceleration, making it highly suitable for deep learning applications [ ]. We employed a GCN
to extract feature values for each path, specifically targeting latency, bandwidth utilization,
and packet loss rate. GCNs are exceptionally effective for processing graph-structured
data because they can capture the complex relationships and interactions between nodes
and edges within the network [ ]. By leveraging a GCN, we achieve comprehensive
representations of the network topology and its associated metrics, enhancing our ability
to analyze and optimize network performance.
5.2. Learning Parameter Settings
Following the experimental configuration, this section details the specific settings of
the learning parameters.
The GCN is designed with a two-layer fully connected structure, with each layer comprising
16 neurons. This architecture effectively captures and processes the graph-structured
data inherent in the network topology while maintaining model simplicity. Similarly,
the D3QN utilizes a two-layer fully connected structure with 64 neurons in the hidden
layer to accommodate the larger state-action spaces typical in reinforcement learning tasks,
thereby enabling more precise policy learning.
The Rectified Linear Unit (ReLU) is employed as the activation function due to its
ability to introduce non-linearity and mitigate vanishing gradient issues, enhancing the
stability and efficiency of the training process. For optimization, the Adam optimizer is
chosen for its adaptive learning rate capabilities, which improve the optimization efficiency
and convergence speed, especially in deep learning models with numerous parameters.
To balance exploration and exploitation, an $\epsilon$-greedy strategy is implemented. The
initial exploration rate $\epsilon$ is set to one, encouraging extensive exploration of the
action space during the early training stages. This rate decays exponentially by a factor of
0.999 until it reaches a minimum of 0.01. This approach ensures that the model explores
thoroughly at first to discover optimal policies and gradually shifts toward exploiting the
learned strategies, thereby enhancing convergence effectiveness.
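A short sketch makes the schedule concrete. It assumes the decay factor is applied once per training step, which the source does not state explicitly; the function names are hypothetical.

```python
import math


def epsilon_at(step, eps_start=1.0, decay=0.999, eps_min=0.01):
    """Exploration rate after `step` decay applications, clipped at eps_min."""
    return max(eps_min, eps_start * decay ** step)


def steps_to_floor(eps_start=1.0, decay=0.999, eps_min=0.01):
    """Smallest number of decay steps after which epsilon has reached eps_min."""
    return math.ceil(math.log(eps_min / eps_start) / math.log(decay))


floor_step = steps_to_floor()
eps_mid_training = epsilon_at(1000)
eps_end_of_training = epsilon_at(6000)
```

Under this per-step assumption the rate is still about 0.37 after 1000 steps and reaches the 0.01 floor after 4603 steps, so roughly the last quarter of a 6000-step run is almost purely exploitative.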
Additional hyperparameters, following [33,48], are listed in Table 3.
Table 3. Algorithm parameter settings.

| Parameter | Value |
|---|---|
| Learning rate | 0.1, 0.01, 0.001, 0.0001 |
| Training times | 6000 |
| Batch size (D3QNAgent) | 512 |
| γ discount factor | 0.99 |
| Experience cache area | 20,000 |
| ϵ-greedy | Initial 1.0, decay rate 0.999, minimum 0.01 |
| (µ, ν, λ) | (0.2, 0.7, 0.1), (0.3, 0.5, 0.2), (0.4, 0.4, 0.2), (0.5, 0.3, 0.2), (0.7, 0.2, 0.1) |
Each experiment uses a single random seed and interacts with the environment for 6000
steps. In this paper, the effectiveness and convergence of the proposed algorithm are
verified by setting different weight values $(\mu, \nu, \lambda)$ and learning rates $\alpha$.
5.3. Algorithm Validation
As shown in Figure 5, when using the same $(\mu, \nu, \lambda)$
values but different learning rates,
a learning rate of 0.05 leads to more significant network latency optimization.
Figure 5. G-D3QN. (a) Reward value at different learning rates; (b) latency at different learning rates.
In Figure 6, we observe the reward values for different $(\mu, \nu, \lambda)$
weight settings at
a learning rate of 0.05. Good optimization can be achieved when
and $\lambda = 0.2$.
Figure 6. G-D3QN reward values under different weights.
In the initial stage of training, data transmission in the network yields poor latency
performance, and the reward value of the proposed algorithm decreases. As the number of
training steps increases, the proposed algorithm gradually gains experience, responds
to network changes, and makes dynamic adjustments. The average network latency
performance gradually improves, and the reward value increases and eventually converges.
Figure 7 shows the results of two tests of the D3QN at a learning rate of 0.05 with
weight values of $\mu = 0.4$, $\nu = 0.4$, and $\lambda = 0.2$, compared to the DQN. It can be
concluded that the D3QN converges faster and achieves a better reward value.
Figure 7. Comparison of D3QN and DQN training reward values. (a) test1; (b) test2.
5.4. Algorithmic Performance Evaluation
In this section, based on the simulation topology, the proposed algorithm is compared with
the DQN, ACO, and OSPF. For the proposed algorithm, the single-path variant uses the
G-D3QN to select one transmission path, and the dual-path variant uses the G-D3QN-based
ship redundant multipath selection algorithm to select two paths; both are referred to as
the G-D3QN below. To demonstrate the proposed algorithm’s advantages, the performance of
the four algorithms is compared for single-path and dual-path transmission, respectively.
Table 4 summarizes the average latency, average packet loss rate, and average load CV for
data stream transmission.
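Table 4. Performance comparison of different methods.

| Algorithm | Average Time Latency (µs), single path | Average Time Latency (µs), dual path | Average Packet Loss Rate (%), single path | Average Packet Loss Rate (%), dual path | Load CV, single path | Load CV, dual path |
|---|---|---|---|---|---|---|
| D3QN | 134.0661 | 116.8674 | 1.3075 | 0.01726 | 0.6454 | 0.6174 |
| DQN | 135.9177 | 117.9687 | 1.3163 | 0.01736 | 0.6562 | 0.6310 |
| ACO | 135.2319 | 121.2392 | 1.3133 | 0.01735 | 0.6535 | 0.6216 |
| OSPF | 142.9973 | 126.5977 | 1.4780 | 0.02074 | 0.7872 | 0.7356 |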
Average delay analysis: Single-path scenario: The G-D3QN method exhibits the lowest
average delay among all approaches when operating on a single path. The DQN and ACO
show a comparable performance, with slightly higher delays than the G-D3QN, while
OSPF records the highest average delay. This indicates that the G-D3QN achieves an
optimal delay performance in single-path conditions. Dual-path scenario: the G-D3QN
outperforms the other methods in a dual-path setup, maintaining the lowest average delay.
The DQN follows, with ACO exhibiting marginally higher delays and OSPF again showing
the highest delays. These results demonstrate that the G-D3QN consistently provides a
superior delay performance even when utilizing dual paths.
Average packet loss rate analysis: Single-path scenario: the G-D3QN achieves the
lowest packet loss rate in single-path transmission, with the DQN and ACO presenting
similar, slightly higher rates and OSPF recording the highest loss rate. This suggests that
the G-D3QN is particularly effective in minimizing packet loss when using a single path.
Dual-path scenario: In dual-path conditions, the G-D3QN continues to lead with the lowest
packet loss rate, followed by the DQN and ACO, which have comparable rates slightly
above the G-D3QN, and OSPF, which maintains the highest loss rate. These findings
further confirm the efficacy of the G-D3QN in enhancing network reliability through
reduced packet loss.
Load balancing analysis: Single-path scenario: The G-D3QN method demonstrates the
lowest load CV in single-path transmissions, indicating a more uniform load distribution.
The DQN and ACO exhibit similar CV values, while OSPF records the highest CV. A lower
CV signifies better load balancing, highlighting the G-D3QN’s superior performance.
Dual-path scenario: Even in dual-path scenarios, the G-D3QN maintains the lowest CV value,
with the DQN and ACO showing similar, slightly higher CVs and OSPF again having
the highest CV. This also underscores the G-D3QN’s capability to achieve enhanced load
balancing under dual-path conditions.
Overall performance: The G-D3QN approach consistently outperforms all other methods
across all evaluated metrics, demonstrating its advantages in reducing delay and packet
loss rates while improving load balancing. The DQN and ACO perform comparably and trail
the G-D3QN by a small margin: in Table 4, ACO is marginally ahead of the DQN on all three
single-path metrics and on dual-path packet loss and load CV, whereas the DQN achieves the
lower dual-path delay. OSPF performs the worst across all metrics, highlighting the limitations
of traditional routing protocols in meeting real-time and reliability requirements.
Performance comparison between single and dual paths: For all evaluated methods,
the average delay and average packet loss rates are significantly lower in dual-path
configurations than in single-path setups. This indicates that adopting a dual-path strategy can
substantially enhance network performance. Additionally, the load CV values are reduced
in dual-path scenarios, reflecting a more balanced load distribution and more efficient
utilization of network resources.
The G-D3QN method, based on deep reinforcement learning, demonstrates an
exceptional performance in enhancing network metrics by effectively reducing delay and
packet loss rates and achieving superior load balancing. Implementing dual-path
redundant transmissions further improves the network performance. These results validate the
proposed G-D3QN-based redundant multipath selection algorithm’s capability to
facilitate a low-latency and highly reliable transmission of high-priority data flows within the
SSDTSN architecture.
5.4.1. G-D3QN Single- and Dual-Path Performance Comparison
As shown in Figure 8, the latency, packet loss rate, and load CV of G-D3QN-based
single-path transmission are compared with those of dual-path transmission using the same
algorithm. Compared with a single path, the dual path reduces the average latency by about
12.39%, the average packet loss rate by about 98.72%, and the average load CV by about
4.68%.
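For reference, the relative reduction between two averages can be computed as shown below, here applied to the G-D3QN row of Table 4. The helper name is hypothetical. Recomputing from the rounded table averages gives roughly 12.83%, 98.68%, and 4.34%, which is close to but not identical to the percentages quoted in the text; the quoted values are presumably averaged differently (for example per flow), which is an assumption here.

```python
def relative_reduction(single_path, dual_path):
    """Percentage by which the dual-path value is lower than the single-path value."""
    return (single_path - dual_path) / single_path * 100.0


# G-D3QN row of Table 4 as (single path, dual path).
g_d3qn_table4 = {
    "latency_us": (134.0661, 116.8674),
    "packet_loss_pct": (1.3075, 0.01726),
    "load_cv": (0.6454, 0.6174),
}
reductions = {name: relative_reduction(single, dual)
              for name, (single, dual) in g_d3qn_table4.items()}
```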
5.4.2. Multi-Algorithm Latency Performance Comparison
The single- and dual-path transmission latency performance comparison of the four
algorithms is shown in Figure 9. For single-path transmission, the G-D3QN algorithm
reduces the latency by about 1.04% on average compared with the DQN algorithm, by
0.62% compared with the ACO algorithm, and by 5.80% compared with the OSPF algorithm.
For dual-path transmission, the G-D3QN algorithm reduces the latency by about 0.93% on
average compared with the DQN algorithm, by 3.10% compared with the ACO algorithm,
and by 7.85% compared with OSPF.
Figure 8. G-D3QN single path vs dual path. (a) Latency; (b) packet loss; (c) load CV.
Figure 9. Latency evaluation of four algorithms. (a) Single path; (b) dual path.
5.4.3. Multi-Algorithm Packet Loss Performance Comparison
The packet loss performance comparison of the four algorithms for single- and dual-path
transmission is shown in Figure 10. For single-path transmission, the G-D3QN algorithm
reduces the packet loss rate by about 0.68% on average compared with the DQN algorithm,
by 0.38% compared with the ACO algorithm, and by about 11.30% compared with the OSPF
algorithm. For dual-path transmission, the G-D3QN algorithm reduces the packet loss rate
by about 0.58% on average compared with the DQN algorithm, by 0.57% compared with the
ACO algorithm, and by about 16.63% compared with OSPF.
5.4.4. Multi-Algorithm Load Performance Comparison
The comparison of the bandwidth load coefficients of variation (load CV) of the four
algorithms for single- and dual-path transmission is shown in Figure 11. For single-path
transmission, the G-D3QN algorithm reduces the load CV by about 1.71% on average
compared with the DQN algorithm, by 1.05% compared with the ACO algorithm, and by
about 18.40% compared with OSPF. For dual-path transmission, the G-D3QN algorithm
reduces the load CV by about 2.35% on average compared with the DQN algorithm, by
0.98% compared with the ACO algorithm, and by about 15.81% compared with OSPF.
Figure 10. Packet loss evaluation of four algorithms. (a) Single path; (b) dual path.
Figure 11. Load CV evaluation of four algorithms. (a) Single path; (b) dual path.
Therefore, the path selection algorithm based on the G-D3QN outperforms the DQN, ACO,
and OSPF algorithms in both single-path and dual-path configurations. It optimizes the
transmission latency, enhances the transmission reliability, and improves bandwidth
resource utilization. The ship redundant multipath selection algorithm
proposed in this paper can help the smart shipboard communication system reduce network
latency and packet loss and promote network load balancing.
6. Conclusions
Smart ships require extensive data interaction between various intelligent sensors
and multiple intelligent systems. Among these, high-priority data transmissions require
guarantees of high real-time performance and high reliability. This paper proposes a novel
network architecture integrating SDN and TSN to meet the communication needs of smart
ships. Additionally, a redundant multipath selection algorithm based on a D3QN is
developed to generate a working path and a redundant path that meet the latency and packet
loss rate requirements; both paths are distributed through the SDN control plane to the
data plane TSN switches. By utilizing TSN’s inherent redundancy mechanisms and the selection
of multiple redundant paths, high-priority traffic is transmitted simultaneously across
multiple paths, which supports high-reliability transmission. Experiments show that the
proposed redundant multipath selection algorithm can optimize latency and packet loss rate
and offers a useful reference for addressing reliability problems in smart ships.
However, this study has several limitations. The experimental evaluations were
primarily conducted in simulated environments without validation in actual maritime
communication networks, which may not fully capture the complexities and dynamic nature of
real-world settings. Additionally, integrating wireless communication technologies such as
5G and 6G within maritime communication infrastructures was not extensively explored,
potentially limiting the algorithm’s applicability across a broader range of maritime
scenarios. Furthermore, although the algorithm effectively reduces latency and packet loss
rates, its computational complexity and resource demands pose challenges for scalability in
larger and more complex networks. Future research should focus on enhancing the
algorithm’s scalability and efficiency to better handle intricate network environments. Lastly,
the algorithm’s performance improvements are sensitive to specific weighting parameters,
indicating that its optimization effectiveness may vary depending on different contexts
and requirements. This sensitivity underscores the necessity for greater flexibility in the
algorithm’s design and additional validation to ensure adaptability to diverse operational
conditions and demands.
In this study, simulations were predominantly utilized to conduct experimental
analyses. The subsequent phase will focus on developing a hardware simulation environment
to facilitate verification and deployment testing on actual marine vessels. Furthermore,
with the integration of advanced wireless communication technologies such as 5G and 6G
alongside SDN and TSN, future models will seek to establish a unified architecture for both
wireless and wired network transmissions. This unified framework is designed to address
the increasingly diverse communication requirements of smart ships, thereby enhancing
the efficiency, flexibility, and reliability of maritime communication systems.
Author Contributions: Y.X.: conceptualization, formal analysis, methodology, supervision, and
writing—review and editing. S.H.: conceptualization, data curation, investigation, methodology,
project administration, resources, software, writing—original draft, and writing—review and
editing. Z.Z.: conceptualization, investigation, software, and validation. J.X.: investigation,
resources, and writing—review and editing. All authors have read and agreed to the published
version of the manuscript.
Funding: This work was partially supported by the National Natural Science Foundation of China
(62271303), the Innovation Program of Shanghai Municipal Education Commission of China
(2021-01-07-00-10-E00121), and the Natural Science Foundation of Shanghai (20ZR1423200)
(author: Yanli Xu).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The simulation code for this article is available online at
https:// (accessed on 30 November 2024).
Conflicts of Interest: The authors declare no conflicts of interest.
The following abbreviations are used in this manuscript:
NMEA National Marine Electronics Association
CAN Controller Area Network
SDN Software-Defined Networking
TSN Time-Sensitive Networking
D3QN Double Dueling Deep Q-Network
GCN Graph Convolutional Network
API Application Programming Interface
VLAN Virtual Local Area Network
AIS Automatic Identification System
GPS Global Positioning System
SSDTSN Ship Software-Defined Time-Sensitive Networking
ACO Ant Colony Optimization
LSTM Long Short-Term Memory
DQN Deep Q-Network
DDPG Deep Deterministic Policy Gradient
A2C Advantage Actor–Critic
ReLU Rectified Linear Unit
MSE Mean Square Error
MDP Markov Decision Process
OSPF Open Shortest Path First
gPTP generalized Precision Time Protocol
References
1. China Classification Society. Rules for Intelligent Ships 2024; China Classification Society: Beijing, China, 2023.
2. Tran, K.; Keene, S.; Fretheim, E.; Tsikerdekis, M. Marine Network Protocols and Security Risks. J. Cybersecur. Priv. 2021, 1, 239–251.
3. Hao, G.; Xiao, W.; Huang, L.; Chen, J.; Zhang, K.; Chen, Y. The Analysis of Intelligent Functions Required for Inland Ships. J. Mar. Sci. Eng. 2024, 12, 836.
4. Yu, Z. Research on Multi-Source Heterogeneous Data Fusion Method for Intelligent Ship Navigation Test. In Proceedings of the 2023 IEEE 3rd International Conference on Data Science and Computer Application (ICDSCA), Dalian, China, 27–29 October 2023; pp. 514–516.
5. Tang, Y.; Shao, N. Design and Research of Integrated Information Platform for Smart Ship. In Proceedings of the 2017 4th International Conference on Transportation Information and Safety (ICTIS), Banff, AB, Canada, 8–10 August 2017; pp. 37–41.
6. Kim, J.; Son, J. Design of an Integrated Telecom System for Performance Enhancement on Smart Ships. J. Theor. Appl. Inf. Technol. 2021, 99.
7. Liu, S.; Xing, B.; Li, B.; Gu, M. Ship Information System: Overview and Research Trends. Int. J. Nav. Archit. Ocean Eng. 2014, 6, 670–684.
8. Prajapati, A.; Sakadasariya, A.; Patel, J. Software Defined Network: Future of Networking. In Proceedings of the 2018 International Conference on Intelligent Computing and Sustainable System (ICISS), Coimbatore, India, 19–20 January 2018; pp. 1351–1354.
9. Hauser, F.; Häberle, M.; Merling, D.; Lindner, S.; Gurevich, V.; Zeiger, F.; Frank, R.; Menth, M. A Survey on Data Plane Programming with P4: Fundamentals, Advances, and Applied Research. J. Netw. Comput. Appl. 2023, 212, 103561.
10. Xu, Y.; Shang, J.; Tang, H. Recent Trends of In-Vehicle Time Sensitive Networking Technologies, Applications and Challenges. China Commun. 2023, 20, 30–55.
11. Maletić, Ž.; Mlađen, M.; Ljubojević, M. A Survey on the Current State of Time-Sensitive Networks Standardization. In Proceedings of the 2023 10th International Conference on Electrical, Electronic and Computing Engineering (IcETRAN), East Sarajevo, Bosnia and Herzegovina, 5–8 June 2023; pp. 1–6.
12. Xu, Y.; Huang, J. A Survey on Time-Sensitive Networking Standards and Applications for Intelligent Driving. Processes 2023, 11, 2211.
13. Fiori, T.; Lavacca, F.; Valente, F.; Eramo, V. Proposal and Investigation of a Lite Time Sensitive Networking Solution for the Support of Real Time Services in Space Launcher Networks. IEEE Access 2024, 12, 10664–10680.
14. IEEE Std 802.1CB-2017; IEEE Standard for Local and Metropolitan Area Networks–Frame Replication and Elimination for Reliability. IEEE: New York, NY, USA, 2017.
15. Zhou, J.; Cui, G.; Hu, S.; Zhang, Z.; Yang, C.; Liu, Z.; Wang, L.; Li, C.; Sun, M. Graph Neural Networks: A Review of Methods and Applications. AI Open 2020, 1, 57–81.
16. Let, G.; Pratap, C.; Jagannath, D.; Dolly, D.; Evangeline, L. Software-Defined Networking Routing Algorithms: Issues, QoS and Models. Wirel. Pers. Commun. 2023, 131, 1631–1661.
17. Sheu, J.; Zeng, Q.; Jagadeesha, R.; Chang, Y. Efficient Unicast Routing Algorithms in Software-Defined Networking. In Proceedings of the 2016 European Conference on Networks and Communications (EuCNC), Athens, Greece, 27–30 June 2016; pp. 377–381.
18. Abe, J.; Mantar, H.; Yayimli, A. k-Maximally Disjoint Path Routing Algorithms for SDN. In Proceedings of the 2015 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, Xi’an, China, 17–19 September 2015; pp. 499–508.
19. Tao, J.; Shen, Y.; Yan, Y.; Wu, Y.; Zhang, Y.; Wan, J. A Distributed Heuristic Multicast Algorithm Based on QoS Implemented by SDN. In Proceedings of the 2017 3rd IEEE International Conference on Computer and Communications (ICCC), Chengdu, China, 13–16 December 2017; pp. 23–29.
20. Jing, S.; Muqing, W.; Yong, B.; Min, Z. An Improved GAC Routing Algorithm Based on SDN. In Proceedings of the 2017 3rd IEEE International Conference on Computer and Communications (ICCC), Chengdu, China, 13–16 December 2017; pp. 173–176.
21. Lin, C.; Wang, K.; Deng, G. A QoS-Aware Routing in SDN Hybrid Networks. Procedia Comput. Sci. 2017, 110, 242–249.
22. Owusu, A.; Nayak, A. A Framework for QoS-Based Routing in SDNs Using Deep Learning. In Proceedings of the 2020 International Symposium on Networks, Computers and Communications (ISNCC), Montreal, QC, Canada, 20–22 October 2020; pp. 1–6.
23. Awad, M.; Ahmed, M.; Almutairi, A.; Ahmad, I. Machine Learning-Based Multipath Routing for Software Defined Networks. J. Netw. Syst. Manag. 2021, 29, 18.
24. Azzouni, A.; Boutaba, R.; Pujolle, G. NeuRoute: Predictive Dynamic Routing for Software-Defined Networks. In Proceedings of the 2017 13th International Conference on Network and Service Management (CNSM), Tokyo, Japan, 26–30 November 2017; pp. 1–6.
25. Kato, N.; Fadlullah, Z.; Mao, B.; Tang, F.; Akashi, O.; Inoue, T.; Mizutani, K. The Deep Learning Vision for Heterogeneous Network Traffic Control: Proposal, Challenges, and Future Perspective. IEEE Wirel. Commun. 2016, 24, 146–153.
26. Shakya, A.; Pillai, G.; Chakrabarty, S. Reinforcement Learning Algorithms: A Brief Survey. Expert Syst. Appl. 2023, 231, 120495.
27. Szepesvári, C. Markov Decision Processes. In Algorithms for Reinforcement Learning; Springer International Publishing: Cham, Switzerland, 2010; pp. 1–10.
28. Hassen, H.; Meherzi, S.; Jemaa, Z. Improved Exploration Strategy for Q-Learning Based Multipath Routing in SDN Networks. J. Netw. Syst. Manag. 2024, 32, 25.
29. Rischke, J.; Sossalla, P.; Salah, H.; Fitzek, F.; Reisslein, M. QR-SDN: Towards Reinforcement Learning States, Actions, and Rewards for Direct Flow Routing in Software-Defined Networks. IEEE Access 2020, 8, 174773–174791.
30. Casas-Velasco, D.; Rendon, O.; da Fonseca, N. Intelligent Routing Based on Reinforcement Learning for Software-Defined Networking. IEEE Trans. Netw. Serv. Manag. 2020, 18, 870–881.
31. Wang, H.; Liu, N.; Zhang, Y.; Feng, D.; Huang, F.; Li, D.; Zhang, Y. Deep Reinforcement Learning: A Survey. Front. Inf. Technol. Electron. Eng. 2020, 21, 1726–1744.
32. Liu, W.; Cai, J.; Chen, Q.; Wang, Y. DRL-R: Deep Reinforcement Learning Approach for Intelligent Routing in Software-Defined Data-Center Networks. J. Netw. Comput. Appl. 2021, 177, 102865.
33. Xia, D.; Wan, J.; Xu, P.; Tan, J. Deep Reinforcement Learning-Based QoS Optimization for Software-Defined Factory Heterogeneous Networks. IEEE Trans. Netw. Serv. Manag. 2022, 19, 4058–4068.
34. Jinesh, N.; Shinde, S.; Narayan, D. Deep Reinforcement Learning-Based QoS Aware Routing in Software Defined Networks. In Proceedings of the 2023 14th International Conference on Computing Communication and Networking Technologies (ICCCNT), Delhi, India, 6–8 July 2023; pp. 1–6.
35. Gayatri Shivani, M.; Subba Rao, S.; Sujatha, C. A Survey on Deep Recurrent Q Networks. In Intelligent Systems and Machine Learning; Nandan Mohanty, S., Garcia Diaz, V., Satish Kumar, G., Eds.; Springer: Cham, Switzerland, 2023; pp. 251–261.
36. Özalp, R.; Varol, N.; Taşci, B.; Uçar, A. A Review of Deep Reinforcement Learning Algorithms and Comparative Results on Inverted Pendulum System. In Machine Learning Paradigms: Advances in Deep Learning-Based Technological Applications; Tsihrintzis, G., Jain, L., Eds.; Springer: Cham, Switzerland, 2020; pp. 237–256.
37. Aslam, S.; Michaelides, M.; Herodotou, H. Internet of Ships: A Survey on Architectures, Emerging Applications, and Challenges. IEEE Internet Things J. 2020, 7, 9714–9727.
38. Amin, M.; Othman, M. Re-Exploration of ε-Greedy in Deep Reinforcement Learning. In RiTA 2020. Lecture Notes in Mechanical Engineering; Chew, E., Majeed, A.P.P.A., Liu, P., Platts, J., Myung, H., Kim, J., Kim, J.-H., Eds.; Springer: Singapore, 2021.
39. Wang, Z.; Schaul, T.; Hessel, M.; Van Hasselt, H.; Lanctot, M.; De Freitas, N. Dueling Network Architectures for Deep Reinforcement Learning. In Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 20–22 June 2016; pp. 1995–2003.
40. Luong, N.; Hoang, D.; Gong, S.; Niyato, D.; Wang, P.; Liang, Y.C.; Kim, D. Applications of Deep Reinforcement Learning in Communications and Networking: A Survey. IEEE Commun. Surv. Tutorials 2019, 21, 3133–3174.
41. Jiang, F.; Li, Y.; Sun, C.; Wang, C. Dueling Double Deep Q-Network Based Computation Offloading and Resource Allocation Scheme for Internet of Vehicles. In Proceedings of the 2023 IEEE Wireless Communications and Networking Conference (WCNC), Glasgow, UK, 26–29 March 2023; pp. 1–6.
42. Igual, L.; Seguí, S. Network Analysis. In Introduction to Data Science: A Python Approach to Concepts, Techniques and Applications; Springer International Publishing: Cham, Switzerland, 2024; pp. 151–174.
43. Yang, L.; Wei, Y.; Yu, F.; Han, Z. Joint Routing and Scheduling Optimization in Time-Sensitive Networks Using Graph-Convolutional-Network-Based Deep Reinforcement Learning. IEEE Internet Things J. 2022, 9, 23981–23994.
44. Xu, J.; Wang, Y.; Zhang, B.; Ma, J. A Graph Reinforcement Learning Based SDN Routing Path Selection for Optimizing Long-Term Revenue. Future Gener. Comput. Syst. 2024, 150, 412–423.
45. Fujimoto, R.; Riley, G.; Perumalla, K. Wire–Line Network Simulation. In Network Simulation; Springer International Publishing: Cham, Switzerland, 2007; pp. 19–25.
46. Mishra, P. Introduction to Neural Networks Using PyTorch. In PyTorch Recipes: A Problem-Solution Approach to Build, Train and Deploy Neural Network Models; Apress: Berkeley, CA, USA, 2023; pp. 117–133.
47. Nie, M.; Chen, D.; Wang, D. Reinforcement Learning on Graphs: A Survey. IEEE Trans. Emerg. Top. Comput. Intell. 2023, 7, 1065–1082.
48. Chen, G.; Sun, J.; Zeng, Q.; Jing, G.; Zhang, Y. Joint Edge Computing and Caching Based on D3QN for the Internet of Vehicles. Electronics 2023, 12, 2311.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.