Gossip-based monitoring of virtualized resources in 5G networks



Mauro Femminella∗† Gianluca Reali
Department of Engineering, University of Perugia, Perugia, Italy
Consorzio Nazionale Interuniversitario per le Telecomunicazioni (CNIT), Italy
Abstract—The network function virtualization (NFV)
paradigm decouples service functions (SFs) from the physical
equipment where they are executed, thus increasing the efficiency
of resource utilization, and makes networks and services more
scalable and flexible. However, in order to efficiently manage
and chain SFs to build network service slices in 5G networks,
it is necessary to localize (virtualized) SFs together with their
current status, which includes their load, attached virtual links
status, configuration parameters, etc. To this aim, we propose
a monitoring architecture able to track the network location
and the current status of distributed and virtualized SFs, by
using agents responsible for monitoring the status of co-located SFs,
both physical and virtual ones. The monitoring agents exchange
their information by means of a gossip protocol to increase
the reliability of the process and to build a distributed service
monitoring architecture. In this way, it is possible to keep
service decisions as local as possible, limiting the interactions
with a centralized orchestrator, and thus increasing network
scalability. We show that the network overhead of the distributed
monitoring process is negligible.
Index Terms—Signaling, network discovery, network slice
monitoring, gossip protocol, packet interception
I. Introduction
Network softwarization technologies have changed service
delivery by providing on-demand networking and computing
resources. By combining the capabilities of cloud computing
with the on-demand routing functions of software defined
networking (SDN), they have attracted attention from both
academia and industry. The possibility of implementing base
station processing in edge clouds, envisioned by the mobile
edge computing (MEC) paradigm [1], [2], has further pushed
network softwarization forward, making network function
virtualization (NFV) [3] the reference network service
delivery approach in 5G networks [4]–[6] for providing
different types of services.
Most proposals have focused on particular NFV aspects,
such as platforms for implementing it [7]–[12], optimal
algorithms to chain simple service functions (SFs)
implemented by virtual network function (VNF) instances
to create complex network services [13]–[15], or
architectural and protocol solutions for steering traffic between
NFV instances [16], [17]. However, a number of open issues
and concerns have recently been raised about the applicability
of the NFV model, especially with respect to network
operation and management needs [18]. In fact, most
proposals dealing with service chaining assume the position and
the status of the existing VNF instances to be known and
easily retrievable, e.g. by querying a central orchestrator which
knows the entire (virtualized) network status. However,
this is easy only for small networks, made of just a few computing
clusters. Conversely, large networks, composed of different
edge and core computing tenants interconnected by dynamic
virtual networks, are much more complex to manage. This
is even more challenging in the 5G scenario [4], [5],
where the need for low latency imposes additional requirements
on service chaining, as well as on network slices [19] able to
adapt to the needs of vertical users. This includes the dynamic
reconfiguration of network topology and/or VNF instances to
respond to variations of the service load.

This work has been performed in the framework of the European projects
5G-EVE and 5G-CARMEN. These projects have received funding from the
European Union's Horizon 2020 research and innovation programme under
grant agreements No. 815974 and 825012. The views expressed are those of
the authors and do not necessarily represent the projects. The Commission is
not responsible for any use that may be made of the information it contains.

In this regard, a possible solution is to adopt a hierarchical
approach, where the central orchestrator (CO) executes
its tasks by interacting with a number of local managers
(LMs), which manage computing resources in edge/MEC
servers and set up network services by driving local SDN
controllers (LCs). LMs share network information to make
informed decisions and keep the decision process as local as
possible, without involving the CO [20]. Thus, a fundamental
component is the monitoring function. In fact, by knowing the
current status of network resources it is possible to deploy
optimized service chains by means of NFV instances.
In this work, we propose the introduction of a novel local
entity, called monitoring agent (MA), used for monitoring
the status and resources allocated to VNFs in a computing
cluster/tenant, and co-located with the LM. In order to allow
LMs to do their job, the task of the MAs is not only to
provide an updated status of currently deployed VNFs, but
also to provide the LMs with an updated picture of the whole
virtual network. For scalability reasons, it is not recommended
to provide the exact (network) location of each VNF instance
in the provider network, but just the capabilities in terms
of VNF type and load monitored by MAs. In this way, the
communication involves MAs only. We propose a gossip-based
protocol to keep each MA informed about the service status
of the resources monitored by the other MAs. Hence, we can
provide a distributed and robust solution, able to easily deal
with the dynamic nature of virtual networks. In addition, the
discovery function is embedded in the monitoring protocol
itself, without requiring a further mechanism to accomplish
this task. In order to optimize service chaining, it is very
useful to know not only the presence of a given VNF type
in a given cluster, but also its distance in terms of IP hops and
communication latency. To this aim, the gossip protocol leverages
packet interception capabilities, enabled by SDN. This allows
each MA to have a tree-like view of the surrounding MAs
(and thus of cloud computing tenants and relevant virtual
links), each associated with its current service capabilities.
We demonstrate that this design approach allows LMs to take
better decisions and limits the protocol overhead.
The paper is organized as follows. In section II, we analyze
related work in the field. In section III, we present our
monitoring solution, detailing the service architecture and
protocol details. Section IV presents a performance analysis.
Finally, in section V we draw our conclusions.
II. Related work
A formalization of the gossip problem is proposed in
[21]. Like most gossip solutions, our proposal is round-based.
Gossip rounds can be synchronous or asynchronous.
Synchronous ones need a synchronization system, which
increases overhead. For this reason we propose an asynchronous
approach. Gossip protocols generally select peers involved in
a gossip session randomly, but even gossip protocols using
deterministic strategies exist [22]. Gossip protocols can either
involve a single pair of peers, or multiple separated pairs,
or multiple overlapping pairs. In this regard, an important
feature of our proposal is the capability to establish gossip
sessions with multiple peers in a single round, which allows
saving bandwidth and provides multiple system updates within
a single gossip round. This approach can be compared with
gossip algorithms used in wireless multi-hop/ad-hoc networks.
In [23] the gossip protocol makes use of the broadcast nature
of the wireless medium to send messages to all neighbors
in a single round, whilst our solution exploits the packet
interception capabilities of SDN for sending gossip messages
to multiple peers in a single round.
Gossip-based solutions have also been used to solve the
discovery problem [24], [25]. Some analogies between our
proposal and the one shown in [24] exist, although the problem
formalization is different. Both algorithms aim to create a
network spanning tree, used for distributing messages.
Nevertheless, the proposal in [24] needs prior knowledge of
all node interfaces in the network for creating a spanning tree.
Even if the tree generation is started by an arbitrary node, this
tree is used for distributing messages over the entire network.
On the contrary, our proposal is fully distributed, and each
peer runs the same algorithm and creates its own distribution
tree. In [25], the two-hop walk is quite similar to the
D-mode discovery process proposed in [26], but it assumes
prior knowledge of the peer set of neighbors. A solution for
collecting this information is proposed in [27].
We have to mention our former NSIS-based proposal [26].
Such a proposal extends the NSIS protocol [28] by means
of gossip and off-path signaling. It is further refined by the
off-path signaling protocol (OSP) [29], specifically designed
for providing a signaling framework for NFV. However, the
OSP proposal still retains the two-layer organization of NSIS
protocol, adding gossip-based peer discovery and peer-to-peer
flooding message distribution. We show that this organization
is not strictly necessary, and that it is possible to embed
the monitoring function in an enhanced gossip function, with
significant saving of bandwidth, as well as protocol
simplification. Thus, although part of this proposal has been inspired
by the solution described in [30], which was an evolution of
[26], their scope, objectives, and structure are really different.
In the area of VNF monitoring, D-StreaMon is an interesting
solution that can support traffic analysis [31]. D-StreaMon
exploits the flexibility of the publish/subscribe networking
model, which could be used to build chains of monitoring
entities. Thus, its messages are good candidates to be carried
over our proposal. Other monitoring solutions are specialized
for specific environments, such as ConMon, dealing with
micro-service environments [32]. A first example of moni-
toring function of VNF chains is presented in [33], but it is
more oriented to debugging rather than providing a monitoring
primitive for network slices.
Finally, the project 5GTANGO includes a DevOps approach
that can be utilized for the validation and verification of VNFs
and network services [34]. Since this architecture needs a
continuous stream of monitoring data from VNF instances,
it could be a good venue for testing our protocol.
III. Monitoring solution
A. System architecture
We consider the system architecture depicted in Fig. 1,
which is the reference architecture in many 5G network
proposals [4], [5]. An SDN-based transport network intercon-
nects the core network cloud with a number of (edge/MEC)
cloud tenants, which implement cloud-based access network
processing functions. Each tenant includes a LM, a LC, and
a MA. The MA is responsible for querying the VNF instances
running in the local tenant to retrieve their service status (e.g.
VNF type, maximum capacity, current load, relevant slice).
This information is available to the LM, which also drives
the operation of the SDN network by interacting with LC. In
addition, each MA exchanges its information with other peer
MAs to distribute the VNF service status all over the overlay
network and to provide a global VNF monitoring service. A
specific feature of the proposed architecture is that each MA
has a tree-like view of the network. In this view, each edge
cloud tenant is represented by the relevant MA (see Fig. 1), and
each link in this tree is labeled with the IP distance between
the MAs and their communication latency. Gossip packets are
intercepted and redirected to MAs by means of packet inter-
ception, realized by applying a specific configuration to SDN
devices interacting with the LC. The proposed solution uses
a gossip-based discovery protocol that carries in the message
payload also the status of monitored (virtualized) resources.
Thus, by executing the mutual discovery, MA entities also
update VNFs status information in their peers. The proposed
algorithm is presented in the next subsection.

Fig. 1. Considered 5G NFV network scenario. Labeled elements: MEC/Edge cloud, Edge cloud, SDN transport, Central cloud.

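As an illustration of the kind of record an MA might keep, the following Python sketch stores the per-VNF fields listed above (VNF type, maximum capacity, current load, relevant slice) and aggregates them per VNF type, in line with the idea of advertising capabilities rather than the location of each instance. All class, field, and function names are hypothetical and are not taken from the prototype.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class VnfStatus:
    """Status of one local VNF instance (field names are illustrative)."""
    vnf_type: str
    max_capacity: float   # e.g. requests/s the instance can serve
    current_load: float   # same unit as max_capacity
    slice_id: str


def summarize(instances):
    """Aggregate local instances into one capability entry per VNF type."""
    totals = defaultdict(lambda: {"capacity": 0.0, "load": 0.0, "instances": 0})
    for vnf in instances:
        entry = totals[vnf.vnf_type]
        entry["capacity"] += vnf.max_capacity
        entry["load"] += vnf.current_load
        entry["instances"] += 1
    return dict(totals)
```

A summary of this kind is what an MA could place in the payload of its gossip messages, keeping the message size independent of the number of local instances of the same type.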
B. Gossip-based discovery protocol
The approach used to discover MA nodes consists of a
gossip protocol, inspired by the Newscast protocol [35] and leveraging
SDN packet interception. Gossip sessions are round-based
and asynchronous. They are established between two nodes,
an initiator and a responder, through a three-way handshake,
which uses three messages: Registration, Response, and Ack.
At the beginning of each round the initiator sends a Registra-
tion message to the responder. When the responder receives
this message, it replies with a Response. The handshake is
closed by a final Ack message sent by the initiator. As in other
gossip protocols (e.g., see [25]), both the Registration and
the Response messages include a list of (MA) peers that the
initiator and the responder may want to share with each other,
referred to as peers to share (PTS) list. Therefore, each node
can establish gossip sessions with other (possibly unknown)
nodes on subsequent rounds.
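The three-way handshake described above can be sketched in Python as follows. This is a minimal model under stated assumptions: message and function names are hypothetical, each node shares every peer it knows (the real protocol bounds the PTS list and prefers leaf peers), and packet interception by intermediate MAs is ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    kind: str                                   # "Registration", "Response" or "Ack"
    sender: str
    receiver: str
    pts: list = field(default_factory=list)     # peers to share (empty in the Ack)


def gossip_session(initiator, responder, known):
    """Run one handshake and return the three messages exchanged.

    `known` maps a node name to the set of peer names it knows; it is
    updated in place with the identities learned during the session.
    """
    reg = Message("Registration", initiator, responder,
                  sorted(known[initiator] - {responder}))
    resp = Message("Response", responder, initiator,
                   sorted(known[responder] - {initiator}))
    ack = Message("Ack", initiator, responder)  # closes the handshake, no PTS

    # Each side learns the other endpoint plus the identities it shared.
    known[responder] |= set(reg.pts) | {initiator}
    known[initiator] |= set(resp.pts) | {responder}
    known[responder].discard(responder)
    known[initiator].discard(initiator)
    return [reg, resp, ack]
```

For instance, a newly started node that only knows the tracker learns the tracker's peers after a single session, and can select any of them as responder in the following rounds.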
Differently from other gossip protocols, the SDN packet
interception capabilities allow the Registration message to be
received and processed not only by the responder, but also by
the MA nodes on (or close to) the path from the initiator to the
responder (see Fig. 2). This way, these nodes can participate in
the discovery process by sending Response messages towards
the initiator, sharing their own PTS list. In addition, they
forward the message towards the destination (responder). In
this way, with a single Registration message, the initiator
provides an update of the status of its monitored resources not
only to the responder, but also to all intermediate MA nodes.
In turn, through the Response message these nodes (responder
and intermediate ones) disseminate their status information.
By handshaking with each node, the initiator can evaluate
its downstream distance from all MA nodes along the path,
in terms of both IP hops and, roughly, latency for reaching
each of them. Thus, this policy aims at discovering overlay
paths and evaluating the associated metrics, so as to allow each
initiator to build a tree-like view of the MA network.

Fig. 2. Gossip discovery of MA entities enhanced with SDN-based packet
interception: gossip session and path discovery from MA1 to the Tracker.

When a MA node is turned on, its list of reachable MA nodes is
empty. Thus, it is necessary to ask another always-on node,
called tracker, for an initial list of active MA nodes to gossip with.
Hence, the tracker acts as the first MA responder (node on the
right in Fig. 2). A tracker address must therefore be statically
configured in all MA nodes. After this initial procedure, a
MA node knows (at least) one additional MA node, and
can periodically establish a gossip session with it, in order
to update its PTS list and exchange monitoring information.
Clearly, a MA cannot know when it has discovered all the
other MA nodes in the network (end of discovery phase). It
assumes that when it does not learn the identity of any new
MA node for a given number of subsequent cycles, it can enter
a steady phase, possibly increasing the gossip session
period to limit overhead.
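The end-of-discovery rule above can be sketched as a small Python counter. The class name, the `steady_period_factor` knob, and the choice to fall back to the discovery phase when a new identity appears later are assumptions made for illustration; the paper only states that the period may be increased once no new MA has been learned for a given number of cycles.

```python
class DiscoveryMonitor:
    """Counts consecutive gossip cycles that brought no new MA identity."""

    def __init__(self, quiet_cycles_needed, steady_period_factor=2.0):
        self.quiet_cycles_needed = quiet_cycles_needed
        self.steady_period_factor = steady_period_factor  # hypothetical knob
        self.known = set()
        self.quiet_cycles = 0

    @property
    def in_steady_phase(self):
        return self.quiet_cycles >= self.quiet_cycles_needed

    def end_of_cycle(self, peers_seen):
        """Record the identities seen in this cycle; return the current phase."""
        new = set(peers_seen) - self.known
        self.known |= new
        # Assumption: any new identity restarts the count.
        self.quiet_cycles = 0 if new else self.quiet_cycles + 1
        return self.in_steady_phase

    def gossip_period(self, base_period):
        """Gossip period to use, stretched while in the steady phase."""
        if self.in_steady_phase:
            return base_period * self.steady_period_factor
        return base_period
```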
1) Mathematical model: Before entering the protocol details,
we present a mathematical model of the proposed gossip
protocol. The network is modeled as a graph of MA nodes,
referred to as MA overlay, denoted as $G = (V, E)$. $V$ is the
set of nodes with cardinality $K = |V|$, and $E$ is the set
of the undirected edges. The routing of gossip packets,
which determines the elements in $E$ connecting MA peers,
uses the underlying IP routing, which we assume to be based
on shortest-path algorithms. MAs intercept Registration packets
by using SDN rules. We define a path $\pi_{ij} = \{i, k, \ldots, j\}$ as the
ordered sequence of MA nodes on the path from $i$ to $j$, and we
denote by $s_{ij} = \pi_{ij} - \{i\}$ the path without the source node,
that is the sequence of MA nodes visited by a packet sent by
the peer $i$ towards the peer $j$. We define $S = \{s_{ij} \mid i, j \in V\}$.
Let us focus on the discovery phase. It allows all MA
nodes in $V$ to receive the identities of the other MA nodes
and to evaluate the relevant metrics. The minimization of the
discovery time translates into minimizing the number of gossip
sessions. We model this problem as a set covering problem
(which is a class of problems known to be NP-hard, see [36]):
given a node $i \in V$ and the associated universe $U_i = V - \{i\}$,
the set $S_i = \{s_{ik}, k \in D_i \subseteq U_i\}$, $S_i \subseteq S$, is a cover for $U_i$ if
the union of its elements contains all elements in $U_i$. Thus, it
is possible to formulate the following problem (C1):
$$\min \sum_{i=1}^{K} |D_i| \qquad (1)$$
subject to
$$\bigcup_{k \in D_i} s_{ik} = \bigcup_{k \in U_i} s_{ik}, \quad \forall i \in V. \qquad (2)$$
The solution of this problem, that is the identification of the
minimum sets $D_i$, $i \in V$, provides a solution of the discovery
problem for all MA peers. In fact, $|D_i|$ is the minimum number
of gossip rounds necessary for the node $i$ to contact all the
other MA nodes in $V$ by leveraging the interception capability
of SDN. In addition, for each MA node $i \in V$, we define
the single-source shortest-path tree $T_i$ rooted at $i$ [36]. $T_i$
identifies the MA nodes on the (shortest) path towards any
other node $k \in V$. An example of $T_i$ for a very simple graph
$G$ is drawn in bold in Fig. 3, where $i = 1$. We say that a
node $h \in V$ is a leaf for $i$ if it is a leaf for the tree $T_i$, that is
$h \in s_{ij} \Rightarrow h = j$. We denote as $L_i$ the set of leaf nodes for
$i$, and $M_i = |L_i|$. Paths associated to leaves for node 1 are
shown by red dashed arrows in Fig. 3.
Our proposed solution of the problem (C1) is based on the
following consideration. If a node $i$ executes a gossip session
with all the leaves of its $T_i$ tree, it certainly discovers all the
(MA) nodes in $G$, together with the relevant metrics, thanks
to packet interception of Registration messages. Thus, our
solution is aimed to quickly discover all Leaf Peers (LPs) of
the tree associated with each node in $V$.
Theorem 1: The optimal solution $D_i^*$ to the set cover
problem (C1) is given by the sets of leaves for each node
in the overlay, that is $D_i^* = L_i$, $\forall i \in V$.
Proof: By definition of leaf, each node $h \in L_i$, $i \in V$,
belongs to the optimal solution set $D_i^*$, that is $L_i \subseteq D_i^*$,
otherwise $U_i$ is not covered. We show that $D_i^* = L_i$. Assume,
by contradiction, that $\exists z \in D_i^* - L_i$. Then, since $z$ is not a
leaf, $\exists y \in L_i \mid z \in s_{iy}$. Thus, from the shortest path routing
assumption it follows that $s_{iz} \subset s_{iy}$. Since $y \in L_i \subseteq D_i^*$,
$D_i^* - \{z\}$ is still a solution of (C1) (see (2)) but with a lower
cost than $D_i^*$ (see (1)). Consequently, $D_i^*$ cannot be an optimal
solution for $i$.
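Theorem 1 can be checked on a toy overlay with a few lines of Python. The sketch below assumes unit-cost hops, so that a breadth-first search yields a shortest-path tree $T_i$; function names and the example graph are illustrative only and do not come from the paper.

```python
from collections import deque


def shortest_path_tree(adj, root):
    """BFS parent map of a shortest-path tree rooted at `root` (unit-cost hops)."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in sorted(adj[node]):        # sorted: deterministic tie-breaking
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return parent


def path_without_source(parent, dest):
    """s_ij: ordered MA nodes from the root to `dest`, root excluded."""
    path = []
    while parent[dest] is not None:
        path.append(dest)
        dest = parent[dest]
    return path[::-1]


def leaves(parent):
    """L_i: nodes of the tree that are nobody's parent (root excluded)."""
    inner = {p for p in parent.values() if p is not None}
    return {n for n in parent if n not in inner and parent[n] is not None}


def covered_by(parent, dests):
    """Union of the paths s_ik for k in `dests` (left-hand side of (2))."""
    return set().union(*(path_without_source(parent, k) for k in dests))
```

On the overlay `{1: {2, 3}, 2: {1, 4, 5}, 3: {1, 6}, 4: {2}, 5: {2}, 6: {3}}`, node 1 has leaves 4, 5, and 6; gossiping with these three nodes covers the whole universe $U_1$, while dropping any of them leaves that leaf undiscovered.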
Similarly, it is easy to show that $L_i$ is the solution also for
the optimization problem (C2) modeling network overhead:
$$\min \sum_{i=1}^{K} \sum_{j \in B_i} \Big[ p_{ij}\, q + \sum_{k \in s_{ij}} p_{ik}\,(r + a) \Big] \qquad (3)$$
subject to
$$\bigcup_{k \in B_i} s_{ik} = \bigcup_{k \in U_i} s_{ik}, \quad \forall i \in V, \qquad (4)$$
that is (C2) $\equiv$ (C1), since they have the same solution $B_i = D_i = L_i$, where
• $p_{ij}$ is the distance, defined as the number of IP hops,
from the node $i$ to the node $j$ on the tree $T_i$, i.e. on $s_{ij}$;
• $q$, $r$, and $a$ are the sizes, at IP layer, of the Registration,
Response, and Ack messages, respectively (Fig. 2).

Fig. 3. Example of a possible subset of the universe and solution for the
discovery problem for node 1. The tree $T_1$ is drawn in bold. Legend: edge of the graph $G$; shortest path tree for node 1; path towards a leaf for node 1.

2) The peer selection algorithm: To cope with all these
issues, we designed a heuristic aimed to quickly discover the
LPs of each MA node, since all the other MA nodes will be
discovered by intercepting Registration messages. Even if the
optimal solution of the network discovery problem is known,
it is difficult to implement it in practice, since the system
works in a distributed fashion with an incomplete knowledge
of the overlay. In fact, inspecting the routing table of SDN
devices (e.g. by interacting with the LC) is not enough
to identify the leaves of the MA overlay. In addition, it is a
circular problem: the identities of MA nodes are not known at
bootstrap, since they are discovered during the execution of the
network discovery, which we want to optimize by contacting
only specific nodes (often still unknown), i.e. the LPs, to limit
the number of necessary gossip rounds, as in Fig. 2. For this
reason, we call this solution Leaf-based. In addition, each
node, in order to let other nodes quickly discover their LPs,
should exchange only the identities of potential leaves in the
PTS field of Registration and Response messages (Fig. 2).
To this aim, we have defined two simple, lightweight, soft-state
structures storing peer information at each MA node.
The former, called peer table (PeT), stores the identities of
the other MA peers together with their associated metrics
(peer element, PE). The latter, called path table (PaT), stores
in node ithe ordered sequence of PEs in sij, as new MA
nodes are discovered and contacted. The PaT is computed by
each initiator iby inspecting the Response messages sent by
any intermediate node kthat has intercepted the Registration
message destined to a responder j, as shown in Fig. 2.
Clearly, a node z, which is a leaf for i, is not necessarily a
leaf also for j, which receives zin the PTS list sent by i. In
addition, it may happen, especially in the initial rounds, that
a newly activated MA node knows just a limited number of
peers, thus the identities it shares might not be true leaves.
We now detail the algorithms executed in MA nodes. The
initiator must select two types of peers stored in the PeT: the
so-called peer to gossip (PTG), which is the responder, and
the PTS list, which includes the PE identities to share with the
PTG and any intercepting node. When the initiator receives the
PTS list within a Response sent by a remote peer, it:
• adds each element of the received PTS not already
  present in the PeT as new PE, with flags <isGossiped,
  isContacted> set to <true, false>. This is important for
  subsequent selection of PTG and PTS. In fact, since each
  node tries to share just LPs, an identity received in a PTS
  list is a good candidate for being selected as future PTG;
• if not already present, adds each intercepting MA to
  the PeT together with its metrics, and the relevant flag
  isContacted is set to “true”. This peer is not a good
  candidate for future selection of the PTG or a PTS
  element, since it is not an LP for the initiator.
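The two update rules above can be sketched in Python as follows. The `PeerElement` fields and the `process_response` signature are hypothetical; in particular, refreshing the metrics of a responding peer that is already in the PeT is an assumption, since the text only specifies what happens when the peer is not yet present.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PeerElement:
    peer_id: str
    is_gossiped: bool = False      # identity learned from a received PTS list
    is_contacted: bool = False     # a handshake with this peer has taken place
    ip_hops: Optional[int] = None
    rtt_ms: Optional[float] = None


def process_response(pet, sender_id, ip_hops, rtt_ms, pts):
    """Update the peer table `pet` (dict: peer_id -> PeerElement) from one Response."""
    # Rule 1: shared identities not yet known become <isGossiped, isContacted> = <true, false>.
    for peer_id in pts:
        if peer_id not in pet:
            pet[peer_id] = PeerElement(peer_id, is_gossiped=True, is_contacted=False)
    # Rule 2: the responding (or intercepting) MA is stored with its metrics and marked contacted.
    sender = pet.setdefault(sender_id, PeerElement(sender_id))
    sender.is_contacted = True
    sender.ip_hops, sender.rtt_ms = ip_hops, rtt_ms
```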
The initiator updates a temporary path list (peerList) as it
receives Responses. The position in the peerList of a peer is
exactly its distance in MA hops. The procedure is completed
when the MA hop distance of the PTG is equal to the number
of received responses (size of the peerList), and the last
element of that list is exactly the PTG. If this condition is
not met, at Tgossip expiration, the path is truncated at the
last peer having a position equal to its distance. In any case,
the new path is added to the pathList stored in the PaT.
Finally, the initiator sends an Ack message back, which does
not include any PTS. Through this procedure, the initiator can
also roughly estimate the round trip latency to any responding
peer. The IP distance can be
evaluated by inspecting the IP header of the gossip
Registration messages.
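The completion and truncation rules for the peerList can be sketched in Python as below. The sketch assumes that each Response carries the responder's distance from the initiator in MA hops (how this value is conveyed is not detailed here), and the function name and return convention are hypothetical.

```python
def build_path(responses, ptg):
    """Return (path, complete) from the Responses received in one gossip session.

    `responses` is an iterable of (peer_id, ma_hops) pairs in arrival order,
    where `ma_hops` >= 1 is the responder's distance in MA hops.
    """
    by_distance = {hops: peer for peer, hops in responses}
    path = []
    for position in range(1, len(by_distance) + 1):
        if position not in by_distance:
            break                    # gap: a Response was lost or is late
        path.append(by_distance[position])
    # Complete when every Response fits its position and the PTG closes the path.
    complete = bool(path) and len(path) == len(by_distance) and path[-1] == ptg
    return path, complete
```

When `complete` is false at the expiration of the gossip timer, the returned `path` is already the truncated one, ending at the last peer whose position matches its distance.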
A last issue is still open: the selection of PTG and PTS
elements. The selection of PTS elements is a common process
for initiator, forwarders, and responder. Assume that the
maximum size of the PTS list is H, which is a protocol
design parameter. Since the Leaf-based gossip protocol aims
at gossiping LPs, and shared identities are good candidates
to be gossiped, if the PaT includes at least H paths, H
randomly selected LPs of these paths are used. Otherwise,
the node tries to fill the PTS list by using peers already
discovered but still not contacted, which are identifiable by the
flag isContacted = false in their PEs. Such peers are those
that have already contacted the selecting node, or those whose
identities have been shared by other nodes (i.e. they also have
the flag isGossiped = true). Since uncontacted peers might
also be LPs, this approach is preferable to making use of peers
already contacted, which are not LPs, given that all nodes
should preferably share LPs.
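A possible reading of the PTS selection rule is sketched below in Python. It assumes that, when fewer than H leaf peers are available, all of them are shared and the remaining slots are filled with randomly chosen uncontacted peers; the function name and arguments are hypothetical.

```python
import random


def select_pts(leaf_peers, uncontacted, H, rng=random):
    """Pick up to H identities to share.

    leaf_peers  -- last element of each path stored in the PaT
    uncontacted -- PeT entries whose isContacted flag is false
    """
    leaf_peers = list(dict.fromkeys(leaf_peers))     # drop duplicates, keep order
    if len(leaf_peers) >= H:
        return rng.sample(leaf_peers, H)
    extra = [p for p in dict.fromkeys(uncontacted) if p not in leaf_peers]
    fill = rng.sample(extra, min(H - len(leaf_peers), len(extra)))
    return leaf_peers + fill
```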
PTG is selected randomly from three priority lists. The first
list, referred to as high priority, includes uncontacted PEs
with the flag isGossiped set to true, since they are most likely
LPs. The second list, referred to as low priority, includes
uncontacted PEs with the flag isGossiped set to false, which are
unlikely to be LPs. Finally, the third list, referred to as no
priority, includes all LPs of the PaT. Thus, uncontacted peers
are preferably selected, in order to quickly accomplish network
discovery. When all peers have been contacted (priority lists
are empty), peers enter the steady phase, during which just
LPs are gossiped, in order to update the status of the highest
possible number of peers by a single Registration message.
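The PTG selection can be sketched in Python as follows, assuming a strict order among the three lists (a lower-priority list is used only when the higher ones are empty) and a uniform random choice within a list. The peer table is reduced here to a hypothetical mapping from peer identity to the pair of flags.

```python
import random


def select_ptg(pet, leaf_peers, rng=random):
    """Pick the next responder; `pet` maps peer_id -> (is_gossiped, is_contacted)."""
    high = [p for p, (gossiped, contacted) in pet.items() if gossiped and not contacted]
    low = [p for p, (gossiped, contacted) in pet.items() if not gossiped and not contacted]
    for candidates in (high, low, list(leaf_peers)):
        if candidates:
            return rng.choice(sorted(candidates))
    return None   # nothing known yet: the caller would fall back to the tracker
```

Once both the high and low priority lists are empty, every call returns a leaf peer, which corresponds to the steady phase described above.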
Since PE states are soft, they are removed if not refreshed.
The lifetime value depends on the number of paths
in the PaT (pathList), and is set equal to $T_{gossip}^{\max} \times
\max(|pathList|, 1) \times (1 + \Delta)$, where $\Delta$ is a parameter used
to avoid accidental PE cancellations.
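As a worked example of the lifetime rule, the following Python helper evaluates the expression for given inputs. The parameter names are hypothetical, and the sample values used in the note below (a 5 s period, 37 paths, a 10% margin) are placeholders chosen for illustration, not settings reported by the authors.

```python
def pe_lifetime(t_gossip_max, path_count, delta):
    """Soft-state lifetime of a peer element, in the same unit as t_gossip_max."""
    return t_gossip_max * max(path_count, 1) * (1 + delta)
```

With a maximum gossip period of 5 s, 37 stored paths, and a margin of 0.1, the lifetime is about 203.5 s. A plausible reading of the dependence on the number of paths is that, in the steady phase, a given leaf is refreshed roughly once every |pathList| rounds.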
The PaT consistency is guaranteed by updates done when
a new path is collected during a gossip session. Due to space
limitations, we do not report details of the function used
to merge, update, or truncate paths already present in the
pathList as a consequence of a gossip session.
IV. Performance analysis
The performance of our proposal has been evaluated
by combining simulations and experimental results obtained
with a proof-of-concept prototype running in Linux virtual
machines. Experimental results have confirmed the simulations.
The simulation setup consists of 60 nodes, divided into stub
and core nodes. There are 36 network stubs, modeling MA
nodes in (virtualized) base stations [6], MEC nodes [2], or
additional fog/edge nodes [37]. An additional MA stub node
acts as Tracker. Each stub is connected to one of the 23 core
nodes, which represent MA nodes in the 5G core (cloud-based)
network. We have evaluated performance with both
the full topology and a partial topology. In the first case, all
60 nodes are MA nodes. In the latter, 48 nodes represent
computing clusters with their MA, and 12 core nodes are
standard IP routers. We have compared the results of the
proposed approach with those of the OSP protocol [29], which
has been specifically designed to distribute signaling (in this
case monitoring information) in architectures based on VNFs.
A. Network discovery of MAs
The performance of the network discovery algorithms was
evaluated by computing the convergence time of both leaf-based
gossip and OSP. We define the convergence time $t_i$
of the node $i$ as the time taken for executing the transient
phase (MA discovery), that is when at least one complete
gossip session has been executed with all MA nodes in
the leaf-based solution. Hence, the network convergence time
is $\max_{i \in V}\{t_i\}$. In OSP, the discovery ends when all MA
neighbors (nodes at MA distance 1 hop) have been gossiped.
Fig. 4 shows the discovery time as a function of H, the
number of shared PEs in a gossip message. In all experiments,
both the Leaf and OSP solutions provide satisfactory conver-
gence time. It emerges that when the Leaf solution is used,
the convergence time is mostly stable. This result is generally
valid since sharing 2 identities is enough to fill the PeT of each
peer with a sufficient number of uncontacted peers. In this
way, uncontacted peers can be selected as a PTG in most
subsequent cycles. By using the OSP solution, the convergence
time increases with the number of shared identities H. Sharing
many peer identities makes the set of PEs, selectable for the
next PTG, large. In addition, the number of possible PTGs
is much higher than the number of MA neighbors, thus it is
disadvantageous to test a large set of PEs, most of which are
unreachable at MA level (see also [29] and relevant supporting
document). For the full overlay topology, we observe that
when the number of shared peer identities is 1 or 2, the OSP
solution is preferable. In a sparse, partial topology, the Leaf solution
is preferable in all situations. In fact, in a sparse topology the
discovery time of the OSP solution tends to increase, due to
the fact that the average number of “neighbors” could increase.
Instead, the discovery time of the Leaf solution slightly decreases,
as expected (fewer MA peers, lower discovery time). Since the Leaf solution
is designed to discover leaves, a significant decrease of the
convergence time is expected when the number of leaves
decreases. Thus, in the end, the Leaf solution has a much
more stable and predictable behavior, and is preferable.

Fig. 4. Convergence time [s] vs. size of the PTS list (number of shared peer identities), for (a) the full topology and (b) the partial topology; 95% confidence intervals are shown.

B. Network overhead in steady state update of MAs
When all MA nodes are discovered and the goal is just
to exchange NFV status information, we have both executed
experiments and defined a theoretical model for the full
topology. We denote by $\delta$ the IP network diameter, and by $\xi_{ij}$
the probability that a peer $i$ selects an LP $j \in L_i$ as a PTG.
The length of the paths $p_{ij}$ in the PaT of $i$ is modeled as a
discrete random variable distributed in the range $[p_i^{min}, p_i^{max}]$
with mean $\mu_i$ and variance $\sigma_i^2$. The overhead generated by the
Leaf solution during a gossip session between nodes $i$ and $j$
is equal to:
$$\phi_{ij} = p_{ij}\, q + (r + a) \sum_{h=1}^{p_{ij}} h = p_{ij}\, q + (r + a) \frac{p_{ij}(p_{ij} + 1)}{2} \qquad (5)$$
thus, the average overhead generated by the $i$th node is:
$$\Phi_{i,Leaf} = \sum_{j \in L_i} \xi_{ij} \phi_{ij} = q\mu_i + (r + a) \frac{\mu_i + \mu_i^2 + \sigma_i^2}{2} \qquad (6)$$
where $\xi_{ij} = 1/M_i$ is assumed in the steady state.
We also assume that the probability mass function of $p_{ij}$ is
uniform between $p_i^{min} = 1$ and $p_i^{max}$. Thus, by a
simple mathematical derivation, omitted for space limitations,
it follows that
$$\Phi_{i,Leaf} = q\mu_i + (r + a) \Big[ \frac{\mu_i}{2} + \frac{1}{2} \Big( \mu_i^2 + \frac{\mu_i^2 - \mu_i}{3} \Big) \Big] \qquad (7)$$
Thus, the total network signaling overhead is:
$$\Phi_{Leaf} = \sum_{i=1}^{K} \Phi_{i,Leaf} = \sum_{i=1}^{K} \Big[ q\mu_i + \frac{r + a}{3} \big( \mu_i + 2\mu_i^2 \big) \Big] \qquad (8)$$
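The closed form obtained under the uniform assumption can be checked numerically. The Python sketch below compares the per-session overhead $p\,q + (r + a)\,p(p + 1)/2$ averaged over path lengths uniformly distributed on $\{1, \ldots, p^{max}\}$ with the expression $q\mu + \frac{r + a}{3}(\mu + 2\mu^2)$, using exact rational arithmetic. Function names are hypothetical, and the message sizes used when calling the functions are arbitrary placeholders.

```python
from fractions import Fraction


def mean_session_overhead(p_max, q, r, a):
    """E[phi] for path lengths uniform on {1, ..., p_max}, computed term by term."""
    phis = [p * q + (r + a) * Fraction(p * (p + 1), 2) for p in range(1, p_max + 1)]
    return sum(phis) / p_max


def closed_form(p_max, q, r, a):
    """q*mu + (r + a)/3 * (mu + 2*mu^2), with mu = (1 + p_max)/2."""
    mu = Fraction(1 + p_max, 2)
    return q * mu + Fraction(r + a, 3) * (mu + 2 * mu * mu)
```

The two functions agree for every `p_max`, which follows from the variance of the discrete uniform distribution on $\{1, \ldots, p^{max}\}$ being $(\mu^2 - \mu)/3$ when $\mu = (1 + p^{max})/2$.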
The overhead rate is found by dividing ΦLeaf by the gossip
period Tgossip, which is equal to 5 seconds. The length of
Registration (q) and Response (r) messages is roughly 1.2 KB
in total, including 2 shared identities of PEs in the PTS list and
the NFV status information in the payload. The selected transport
protocol is UDP, since protocol reliability and robustness are
ensured by gossip. The length of the Ack message (a) is 112
bytes. We found that the mathematical model closely matches
the experimental performance for the Leaf approach, being a
slight upper bound.
As for the comparison between OSP and the proposed Leaf-
based gossip, the latter definitely outperforms the former in
terms of used bandwidth. First, we considered that, in the
steady state, the leaf-based gossip approach has a cycle of
about $T_{cycle} = 37 \times T_{gossip}$ to update all MAs in the network,
where 37 is the number of leaves of the graph (36 stubs plus
the Tracker). To keep the same update rate, we assumed to
flood the MA overlay with OSP messages every $T_{cycle}$ seconds,
which is roughly equal to 3 minutes. The aggregate bandwidth
consumed by the Leaf-based gossip is equal to 2.4 Mb/s versus
4.73 Mb/s of OSP. Thus, the bandwidth saving is about 50%.
As for the fractional overhead, defined as aggregate overhead
divided by the overall network transport capacity, by using the
strongly conservative assumption of having bidirectional links
at 1 Gb/s, it results equal to $1.62 \times 10^{-5}$, thus negligible.
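The figures quoted in this paragraph can be cross-checked with a few lines of Python. The snippet reads the fractional overhead as $1.62 \times 10^{-5}$ and derives the overall transport capacity that this value implies; that capacity is a back-of-the-envelope inference, not a number stated in the text, and the variable names are hypothetical.

```python
T_GOSSIP_S = 5                     # gossip period, in seconds
LEAVES = 37                        # 36 stubs plus the Tracker
LEAF_MBPS, OSP_MBPS = 2.4, 4.73    # aggregate bandwidth of the two solutions
FRACTIONAL_OVERHEAD = 1.62e-5      # Leaf overhead over total transport capacity

# Time needed to refresh all MAs: one leaf per gossip round.
t_cycle_s = LEAVES * T_GOSSIP_S

# Relative bandwidth saving of Leaf-based gossip with respect to OSP.
saving = 1 - LEAF_MBPS / OSP_MBPS

# Total capacity implied by the fractional overhead, in Gb/s.
implied_capacity_gbps = LEAF_MBPS / FRACTIONAL_OVERHEAD / 1e3
```

The cycle is 185 s, i.e. roughly 3 minutes, and the saving is slightly above 49%. The implied capacity is close to 148 Gb/s, which would be compatible with a few tens of links at 1 Gb/s per direction.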
Preliminary results show that, adopting a topology where
some nodes are simply IP routers, the overhead of the
proposed solution slightly decreases, as desired, since there
are fewer MA nodes to update. On the other hand, since the
average node degree of the overlay graph increases, the
OSP overhead increases as well, due to the significant impact
of the node degree on the flooding procedure of OSP.
A (positive) side effect is that intermediate, core MAs
receive updates much more frequently. Thus, it could
be possible to further optimize the Leaf-based
gossip algorithm by disabling, or at least relaxing,
gossiping in core MAs. In fact, these nodes receive frequent
updates from most of the other MAs, so they could avoid acting
as initiators and distribute their status only as responders.
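The responder-only idea can be sketched with a toy exchange model in Python. This is not the Leaf-based gossip algorithm itself: it is a generic push-pull exchange, and the class `MonitoringAgent`, the `is_core` flag, and the status strings are all hypothetical names introduced for illustration.

```python
import random
from dataclasses import dataclass, field


@dataclass
class MonitoringAgent:
    name: str
    is_core: bool = False                      # hypothetical flag: core MAs never initiate
    version: int = 0                           # version of the local status entry
    view: dict = field(default_factory=dict)   # peer name -> (version, status)

    def local_entry(self) -> dict:
        return {self.name: (self.version, f"status-of-{self.name}")}

    def merge(self, remote_view: dict) -> None:
        # Keep the most recent entry known for each peer.
        for peer, (ver, status) in remote_view.items():
            if peer not in self.view or self.view[peer][0] < ver:
                self.view[peer] = (ver, status)

    def respond(self, remote_view: dict) -> dict:
        # Responder role: absorb the initiator's view, reply with the local one.
        self.merge(remote_view)
        return {**self.view, **self.local_entry()}

    def gossip_round(self, peers: list, rng: random.Random) -> bool:
        # Initiator role: skipped by core MAs under the relaxed scheme.
        if self.is_core or not peers:
            return False
        target = rng.choice(peers)
        self.merge(target.respond({**self.view, **self.local_entry()}))
        return True


rng = random.Random(0)
core = MonitoringAgent("core", is_core=True)
leaves = [MonitoringAgent(f"leaf{i}") for i in range(3)]
for leaf in leaves:
    leaf.gossip_round([core], rng)
core_initiated = core.gossip_round(leaves, rng)
```

In this toy run the core agent never initiates an exchange, yet it learns the status of every leaf and passes earlier leaves' status on to later ones through its responses.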
In this paper we showed a proposal for providing a robust,
distributed monitoring service for a 5G network architecture
based on the NFV and SDN concepts. The proposed solution,
based on the concepts of gossip, SDN packet interception,
and VNF information aggregation in monitoring agents, is
scalable with the number of VNF instances running in data
center tenants, and it is able to adapt to changes in the
(virtualized) network topology. It is particularly suitable for
VNF deployments with many computing clusters, such as
those envisaged in the forthcoming 5G networks. Also, it
fits very well the concept of service slice in 5G network
architecture, since each slice can build its virtualized network
topology, including only a subset of monitoring agents, which
can be virtualized as well. Thus, it can be used as a building
block, in conjunction with scalable monitoring software, such
as D-StreaMon [31], to realize scalable network and service
monitoring per slice (and thus per vertical customer) in
forthcoming 5G networks. Given the protocol properties, our
solution can be adopted even if virtual links interconnect data
center tenants offered by different cloud computing providers,
thus guaranteeing maximum implementation freedom. Finally,
incremental deployment is possible, favoring its adoption in
real settings.
Future work will pursue the complete system implementa-
tion by using open source software.
[1] Y. C. Hu et al., “Mobile edge computing - A key technology towards
5G,” ETSI White Paper No. 11, Sept. 2015.
[2] Y. Mao et al., “A survey on mobile edge computing: The communication
perspective,” IEEE Communications Surveys & Tutorials, vol. 19, no. 4, pp.
2322–2358, Fourthquarter 2017.
[3] “Network functions virtualisation (NFV); management and orchestra-
tion,” ETSI GS NFV-MAN 001 V1.1.1, Dec. 2014.
[4] S. Abdelwahab et al., “Network function virtualization in 5G,” IEEE
Communications Magazine, vol. 54, no. 4, pp. 84–91, April 2016.
[5] P. Rost et al., “Mobile network architecture evolution toward 5G,” IEEE
Communications Magazine, vol. 54, no. 5, pp. 84–91, May 2016.
[6] Z. Zaidi et al., “Will SDN be part of 5G?” IEEE Communications Surveys
& Tutorials, vol. 20, no. 4, pp. 3220–3258, Fourthquarter 2018.
[7] M. Femminella et al., “An enabling platform for autonomic management
of the future internet,” Network, IEEE, vol. 25, no. 6, pp. 24–32, 2011.
[8] J. Martins et al., “ClickOS and the art of network function virtualization,”
[9] J. Sherry et al., “Making middleboxes someone else’s problem: Network
processing as a cloud service,” in ACM SIGCOMM’12.
[10] J. Hwang, K. K. Ramakrishnan, and T. Wood, “NetVM: High performance
and flexible networking using virtualization on commodity
platforms,” in USENIX NSDI’14, Seattle, WA, Apr. 2014, pp. 445–458.
[11] X. Ge et al., “OpenANFV: Accelerating network function virtualization
with a consolidated framework in OpenStack,” in ACM SIGCOMM’14.
[12] A. Gember-Jacobson et al., “OpenNF: Enabling innovation in network
function control,” in ACM SIGCOMM’14.
[13] S. Mehraghdam, M. Keller, and H. Karl, “Specifying and placing chains
of virtual network functions,” in IEEE CloudNet ’14, Oct 2014.
[14] B. Addis et al., “Virtual network functions placement and routing
optimization,” in IEEE CloudNet’15, Oct 2015, pp. 171–177.
[15] S. Sahhaf et al., “Scalable architecture for service function chain
orchestration,” in EWSDN’15, Sept 2015, pp. 19–24.
[16] P. Quinn and J. Guichard, “Service function chaining: Creating a service
plane via network service headers,” Computer, vol. 47, no. 11, Nov 2014.
[17] Y. Zhang et al., “Steering: A software-defined networking for inline
service chaining,” in IEEE ICNP’13, Oct 2013, pp. 1–10.
[18] R. Mijumbi et al., “Management and orchestration challenges in network
functions virtualization,” IEEE Communications Magazine, vol. 54,
no. 1, pp. 98–105, January 2016.
[19] H. Zhang et al., “Network slicing based 5G and future mobile networks:
Mobility, resource management, and challenges,” IEEE Communications
Magazine, vol. 55, no. 8, pp. 138–145, Aug 2017.
[20] D. Valocchi et al., “Extensible signaling framework for decentralized
network management applications,” in IFIP/IEEE NOMS’16, April
[21] T. Gonzalez, “An efficient algorithm for gossiping in the multicasting
communication environment,” Parallel and Distributed Systems, IEEE
Transactions on, vol. 14, no. 7, pp. 701–708, 2003.
[22] J. Liu et al., “Deterministic gossiping,” Proceedings of the IEEE, vol. 99,
no. 9, pp. 1505–1524, 2011.
[23] X. Gao et al., “An efficient heuristic gossiping mechanism in ad hoc
routing,” in CHINACOM 2007, 2007, pp. 978–982.
[24] A. Bagchi, S. Hakimi, and E. F. Schmeichel, “Gossiping in a distributed
network,” Computers, IEEE Transactions on, vol. 42, no. 2, pp. 253–
256, 1993.
[25] B. Haeupler et al., “Discovery through gossip,” in ACM SPAA ’12, 2012,
pp. 140–149.
[26] M. Femminella et al., “Gossip-based signaling dissemination extension
for next steps in signaling,” in IEEE NOMS 2012, 2012, pp. 1022–1028.
[27] A. Ganesh, A.-M. Kermarrec, and L. Massoulie, “Peer-to-peer mem-
bership management for gossip-based protocols,” Computers, IEEE
Transactions on, vol. 52, no. 2, pp. 139–149, 2003.
[28] R. Hancock et al., “Next Steps in Signaling (NSIS): Framework,” IETF,
RFC 4080, Jun. 2005.
[29] M. Femminella, G. Reali, and D. Valocchi, “A signaling protocol for
service function localization,” IEEE Communications Letters, vol. 20,
no. 7, pp. 1325–1328, 2016.
[30] M. Femminella et al., “Open Call Deliverable OCA-DS1.1 -
Advanced Networking for the EU genomic research (ARES),” 2015, call deliverables/
documents/ares final report.pdf.
[31] D. Palmisano et al., “D-STREAMON NFV-capable distributed frame-
work for network monitoring,” in ITC 29, Sep. 2017.
[32] F. Moradi et al., “Conmon: An automated container based network
performance monitoring system,” in IFIP/IEEE IM’17, May 2017.
[33] M. Peuster and H. Karl, “Profile your chains, not functions: Automated
network service profiling in devops environments,” in IEEE NFV-SDN,
Nov 2017.
[34] P. Twamley et al., “5GTANGO: An approach for testing NFV deploy-
ments,” in EuCNC 2018, June 2018, pp. 1–218.
[35] S. Voulgaris, M. Jelasity, and M. van Steen, “A robust and scalable
peer-to-peer gossiping protocol,” in AP2PC ’03, 2005.
[36] T. H. Cormen et al., Introduction to Algorithms, 2nd ed. McGraw-Hill
Higher Education, 2001.
[37] L. M. Vaquero and L. Rodero-Merino, “Finding your way in the fog:
Towards a comprehensive definition of fog computing,” SIGCOMM
Comput. Commun. Rev., vol. 44, no. 5, Oct. 2014.