All content in this area was uploaded by Mario Di Mauro on Dec 13, 2020
Comparative Performability Assessment of SFCs:
The case of Containerized IP Multimedia Subsystem
Mario Di Mauro, Member, IEEE, Giovanni Galatro, Maurizio Longo, Member, IEEE, Fabio Postiglione, Marco
Tambasco
Abstract—The failure of a single network element composing
a Service Function Chain (SFC) unavoidably leads to some
degradation in terms of availability (ability of guaranteeing
working conditions), and/or performance (ability of sustaining
a certain workload) for the whole SFC. By considering both
of these aspects, we propose, as a case study, a joint analysis
of availability and performance (a.k.a. performability) of IP
Multimedia Subsystem, an SFC infrastructure which plays a
key role in the all-IP convergence of telecommunication services,
especially in view of 5G prospects. We refer to an implementation
of IMS based on container technology (containerized IMS, or
cIMS), which makes it possible to decouple the application layer from the
underlying hardware infrastructure more efficiently than classic
virtualization schemes. We model the probabilistic behavior of
a cIMS by means of Stochastic Reward Networks (SRN) and
Reliability Block Diagram (RBD) formalisms to take into account
failure and repair events. Then, with the assistance of a designed-from-scratch
algorithm (OptChains+), we carry out a performability
analysis: i) to evaluate and compare series/parallel cIMS
configurations (or settings), and ii) to find settings with minimum
cost and maximum availability, given a performance level. The
proposed assessment lends itself to a sensitivity analysis, here
demonstrated by examples, useful for robustness evaluation.
Index Terms—IP Multimedia Subsystem, 5G, Containers,
Availability, Performability, Stochastic Reward Networks, Redun-
dancy Optimization, SFC.
ACRONYMS & NOTATION
(c)IMS (Containerized) IP Multimedia Subsystem
CNF Containerized Network Function
CNR Containerized Network Replica
CNT Container
CSCF Call Session Control Function
CTMC Continuous-Time Markov Chain
DCK Docker
HPV Hypervisor
HW Hardware
I-CSCF, I Interrogating CSCF
M. Di Mauro, G. Galatro, M. Longo, F. Postiglione are with the
Department of Information and Electrical Engineering and Applied
Mathematics (DIEM), University of Salerno, 84084, Fisciano, Italy. E-mails:
{mdimauro,longo,fpostiglione}@unisa.it, g.galatro1@studenti.unisa.it
M. Tambasco is with Research Consortium on Telecommunications
(CoRiTeL), University of Salerno, 84084, Fisciano, Italy. E-mail:
marco.tambasco@coritel.it
MAA Message Authentication Answer
MAR Message Authentication Request
P-CSCF, P Proxy CSCF
RBD Reliability Block Diagram
S-CSCF, S Serving CSCF
SFC Service Function Chain
SIP Session Initiation Protocol
SRN Stochastic Reward Networks
UAA User Authentication Answer
UAR User Authentication Request
VM Virtual Machine
VNF Virtual Network Function
$A(t)$, $A$ Instantaneous, Steady-State Availability
$c_j$ Capacity for node $j$
$\gamma_j$ Normalized performance level for node $j$
$E_j$ Cost (expenditure) for node $j$
$i$ Marking
$\lambda$ ($\mu$) Failure (Repair) rate
$p_{ij}$ Probability of node $j$ being in marking $i$
$P_{up}$ ($P_{dn}$) Place denoting an up (down) condition
$r_i(j)$ Reward rate in marking $i$ for node $j$
$T_f$ ($T_r$) Timed transition of failures (repairs)
$t$ Immediate transition
$W$ Demand
I. INTRODUCTION
Today, network and telco providers are benefiting from
the Service Function Chaining (SFC) paradigm, which makes it
easy to create and deploy novel services through a
series (namely, a chain) of concatenated network components
[1], [2]. Such chains are often designed by exploiting softwarized
technologies such as virtualization, microservices, and containerized
environments, which provide a flexible and handy habitat
for diverse telco frameworks [3], [4]. Among such frameworks
we focus on the IP Multimedia Subsystem (IMS), which,
embracing the service chaining concept, has been elected both
by standardization groups [5] and by the industry world [6],
[7] as the ideal intermediary between legacy networks and
5G-based solutions.
In this regard, we highlight that, being mainly focused on an
architectural problem, we consider, for the sake of simplicity,
a high-level perspective of the IMS service chain, as often
contemplated in the technical literature on SFC infrastructures
(e.g., [8], [9]).
Moreover, a recent project named Clearwater [10] has been
specifically conceived to offer an open source implementation
of IMS frameworks within virtualized and containerized en-
vironments, and to be exploited as a benchmark testbed for
assorted performance analyses [11], [12], [13].
Inspired by this last trend, we draw up a technique for
managing availability and performance of service chains,
where a containerized version of IMS (cIMS for brevity) is
considered as the pivotal use case. Basically, a container is
a lightweight process [14], [15] that, unlike classic
virtual machines (VMs), does not include a whole operating
system. Container technology is normally coupled with the
Network Function Virtualization (NFV) paradigm, which makes
it possible to encapsulate the network logic (e.g. routing, firewalling,
load balancing) in virtualized elements referred to as Virtual
Network Functions (VNFs). A valuable example is provided
in [16] where the authors present Glasgow Network Function
(GNF), a container-based NFV framework aimed at running
and orchestrating VNFs encapsulated in Linux containers, with
the advantage of allowing a fast deployment on commodity
devices that do not need hardware-accelerated virtualization.
In the specific case considered here, containers fulfill func-
tionalities encountered in an IMS domain. Moreover, since
failure and repair events can occur at various layers of the
cIMS system (container, VM, hypervisor, etc.), we first model
the probabilistic behavior of the whole cIMS chain, and
then solve an optimization problem to achieve cIMS configurations
guaranteeing, simultaneously, high availability¹ and
minimal costs at a given performance level. More generally,
the methodology can be adapted to other instances in the field
of telecommunications and hence can be useful to management
organizations involved in planning/deploying/maintaining sys-
tems and services, where trade-offs among cost, availability,
and performance are critical.
The paper is organized as follows. In Section II we advance
a general perspective of the problem and state the contributions
offered herein. In Section III we review the related literature,
highlighting the main differences w.r.t. our proposal. Section
IV starts with a brief overview of IMS, providing more details
on its deployment in a containerized setting, which may give
rise to different configurations. In Section V, we present the
probabilistic model of cIMS which accounts for failure and
repair events by means of two formalisms: Reliability Block
Diagram (RBD) and Stochastic Reward Networks (SRN). In
Section VI, after the formalization of our optimization problem,
we outline the procedure for performability assessment
which exploits an algorithm (OptChains+) designed to manage
the cIMS probabilistic models and to expedite the search for
optimal (minimum cost) deployments. In Section VII we report
the outcomes of a numerical performance analysis across the
various possible configurations a cIMS may assume. An assessment
of the robustness of the system in critical conditions
is also provided through a sensitivity analysis. Finally, Section
VIII draws some conclusions and hints for further research.

¹ High or “five nines” availability requirement indicates a steady-state
availability greater than 0.99999, corresponding to a maximum tolerated
downtime of 5.26 minutes per year.
II. MOTIVATION AND MAIN CONTRIBUTIONS
Service availability is considered a crucial parameter of
Quality of Service (QoS) as specified by ITU-T E.800 [17].
The severe requirements described by this recommendation
imply a very careful network design of 5G infrastructures,
where virtualized and containerized modules interact not only
with one another, but also with the underlying hardware infrastructure.
For instance, multiple containers deployed on a single
virtual machine can be adversely affected by a malfunctioning
operating system running on the VM, or, similarly, multiple
virtualized network functions sharing the same physical layer
can be adversely influenced by a misconfiguration of the
resource isolation mechanism [18]. Further issues can arise
when the network elements have to be traversed in a specific
order to provide a service. This is the case of SFCs, of which IMS
can be considered a particular realization. Other examples
include: virtualized Evolved Packet Core (vEPC) solutions
able to interoperate in a chained fashion with SDN components
[19], service chain deployments across virtual data centers
[20], and virtual Mobility Management Elements (vMMEs) organized
as SFCs traversed by signaling processing flows [21].
Across such a chained scheme, the availability analysis must
consider the features of each single node which, if affected by
a failure event, disrupts the whole chain network flow.
Accordingly, we advance the following main contribu-
tions: i) We propose a detailed availability characterization
of softwarized network chains by exploiting two techniques:
Reliability Block Diagrams (RBD) to describe the high-level
interconnections among concatenated nodes, and Stochastic
Reward Networks (SRN) to model the internal structure
of each node from a probabilistic point of view; ii) We
conduct a performability (availability plus performance) as-
sessment of an exemplary softwarized chain constituted by
an IP Multimedia Subsystem deployed in a container-based
setting, namely, a containerized IMS (cIMS), where three
containerized schemes have been compared and discussed in
detail; moreover, through a sensitivity analysis, we evaluate the
robustness of cIMS with respect to deviations of some external
parameters from their nominal values due, e.g., to designer’s
uncertainty on failure/repair mean times; iii) We devise an
algorithm (nicknamed OptChains+) in charge of first evaluating
the RBD/SRN models associated with cIMS deployments (or
settings) and then pinpointing the settings which achieve, at the
same time, minimum cost and high availability under specific
performance criteria. Finally, an experimental testbed based
on the Clearwater project has been deployed to execute stress
tests aimed at deriving the relevant workload parameters.
III. RELATED RESEARCH
Over the last years, industry and academia have shown a notable
interest in refining techniques and methods for evaluating the
performance, reliability, and availability of novel communication
systems based on cloud models [22]. In particular, the
attention to Service Level Agreements (namely, the performance
indicators that operators must guarantee) encourages
ever deeper analyses of the performability of the systems,
where performance and availability are considered in a unified
manner ([23], [24]).
To perform a broad-range comparison with the related
literature, we select assorted criteria to highlight the
differences/novelties introduced along several directions. The first
criterion pertains to the adoption of the state-space model, namely
the SRN. With respect to classic Continuous-Time Markov
Chain (CTMC) models (adopted for example in [25], [26], [27]
to characterize the availability of cloud-based infrastructures),
SRNs make it possible to overcome the uncontrollable state space growth
issue arising when modeling real-world complex systems.
A second criterion involves the completeness of the availability
modeling, whereas many works focus only on selected
aspects (e.g. only failures). Examples include [28],
where some open-source and commercial systems have been
considered for the comparison, but without proposing a
failure/repair mathematical model. Similarly, the stochastic
model proposed in [29], dealing with aspects of an IaaS
infrastructure by adopting the SRN methodology, does not include
a failure/repair characterization. Other works are focused on
availability analyses of compositions of virtualized elements that
realize an SFC infrastructure. The authors in [30], for instance,
analyze availability issues of an SFC with respect to the
minimum number of backup VNFs to deploy. The proposed
model addresses (for the sake of simplicity) only failure
events but not repair actions. A repair model is also missing
in [31], where a framework for the reliability evaluation of a
virtualized deployment has been proposed. Similarly, in [32]
the problem of distributing VNF replicas between primary and
backup paths so as to maximize the SFC's availability has
been tackled via a heuristic algorithm, without taking into account
failure/repair models.
A third criterion regards the broader vision offered by the
performability analysis w.r.t. classic availability assessments
which neglect performance aspects. Along this line we cite
[33], where the authors carry out an availability analysis of
microservice-oriented architectures inspired by the Google
Kubernetes service, and where performability is left
as future work. Performability analysis is also lacking in [34],
where the Stochastic Petri Net framework is adopted, and
where a cloud server, a load balancer, and a database distribute
different requests across VMs to realize a Disaster-Recovery-as-a-Service
paradigm.
The present work, which follows the lines traced in recent
works by the same authors ([35], [36]), advances with respect
to the existing literature along two directions: first, it offers a
high-availability model of a cIMS architecture that, being
at the current state of the art of technology in the telco world, has not
yet been characterized at a sufficient level of detail (to the best
knowledge of the authors); second, it presents a performability
assessment, not to be found in most related literature, that
may be crucial when coping with service management trade-offs.
Moreover, besides the current application to cIMS, the
tool developed here can be more generally adapted to novel
infrastructures implemented via series-parallel arrangements,
as is the case of redundant SFCs.
IV. IMS FRAMEWORK AND CONTAINER-BASED
ENVIRONMENTS
The IP Multimedia Subsystem was originally conceived
as a framework for accessing a large amount of multimedia
IP-based facilities with guaranteed quality of service [37].
Nowadays, IMS is becoming crucial in revitalizing telco in-
frastructures, by enabling, for example, strong inter-operations
of Voice over LTE (VoLTE) across network operators’ bound-
aries, and supporting advanced services such as video calling,
HD voice, web messaging and enriched communications.
Within the IMS domain, the signaling flows are regulated
by Call Session Control Function (CSCF) servers by means
of the SIP protocol. The CSCF functionalities are shared
among three servers. The Proxy CSCF (P-CSCF) is a SIP
proxy, and basically acts as an interface between the user
equipment and the IMS domain. The Interrogating CSCF (I-CSCF)
forwards SIP requests or responses within the domain.
The Serving CSCF (S-CSCF) is in charge of performing
some core functions such as session and routing control and user
registration management. Another important node is the Home
Subscriber Server (HSS), an evolved database accommodating
users' profiles that can be retrieved through a specific protocol
called Diameter.
Usually, the SIP flow across an IMS framework follows a
predefined path. A classic example is the Registration procedure
shown in Figure 1: the user device sends a REGISTER
message to the P-CSCF (1) in order to get access to the IMS
domain; the REGISTER message is transferred to the I-CSCF
(2) that, in turn, retrieves from the HSS the address of the appropriate
S-CSCF in charge of managing the SIP session. Such
a query/response procedure is managed by means of two
messages: User Authentication Request (UAR) (3), and User
Authentication Answer (UAA) (4). Once the REGISTER message
arrives at the S-CSCF (5), the latter queries the HSS for the user
profile through another pair of messages: Message Authentication
Request (MAR) (6) and Message Authentication
Answer (MAA) (7). If the procedure finishes correctly, the S-CSCF
transmits a 200 OK message to the user device (8), (9),
(10), and the registration is completed.
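The Registration procedure can be written down as an ordered list of messages, which makes explicit why every node of the chain must be working for the service to be delivered. The following is only an illustrative sketch: the tuple layout and the function names are hypothetical, and the hop-by-hop return path assumed for the 200 OK message (steps 8 to 10) is not spelled out in the text.

```python
# Registration message sequence as described in the text (Fig. 1).
# Assumption: the 200 OK (steps 8-10) travels back hop by hop along the
# reverse path S-CSCF -> I-CSCF -> P-CSCF -> Device.
REGISTRATION_FLOW = [
    (1, "Device", "P-CSCF", "REGISTER"),
    (2, "P-CSCF", "I-CSCF", "REGISTER"),
    (3, "I-CSCF", "HSS", "UAR"),
    (4, "HSS", "I-CSCF", "UAA"),
    (5, "I-CSCF", "S-CSCF", "REGISTER"),
    (6, "S-CSCF", "HSS", "MAR"),
    (7, "HSS", "S-CSCF", "MAA"),
    (8, "S-CSCF", "I-CSCF", "200 OK"),
    (9, "I-CSCF", "P-CSCF", "200 OK"),
    (10, "P-CSCF", "Device", "200 OK"),
]


def nodes_traversed(flow):
    """Return the IMS nodes (user device excluded) touched by the flow."""
    nodes = set()
    for _step, src, dst, _msg in flow:
        nodes.update((src, dst))
    nodes.discard("Device")
    return nodes


def registration_succeeds(flow, up_nodes):
    """The procedure completes only if every traversed node is up."""
    return nodes_traversed(flow) <= set(up_nodes)
```

With all four nodes up the check passes; removing any single node (for instance the HSS) makes it fail, which is the series behavior later captured by the RBD.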
A. Containerized IMS infrastructure
We consider a deployment of an IMS framework in
a container-based domain such as the recently introduced
Docker [38], RKT [39], or OpenVZ [40].
With respect to VMs, containers exhibit some differences.
First, containers share the host operating system, whereas a
VM has its own guest operating system, resulting in a heavier
structure (in terms of disk/memory utilization, start-up time,
etc.). Second, a VM exhibits strong isolation at the host
kernel level (since the operating system is not shared), thus
offering stronger security than its container counterpart.
Finally, containers are more flexible in terms of portability
(since they do not have a separate operating system), whereas
VMs require additional effort during porting operations,
especially when the hosting platforms are different.

[Figure 1: message sequence among Device, P-CSCF, I-CSCF, HSS, and
S-CSCF: 1. REGISTER; 2. REGISTER; 3. Diameter UAR; 4. Diameter UAA;
5. REGISTER; 6. Diameter MAR; 7. Diameter MAA; 8.-10. 200 OK.]
Fig. 1: Registration procedure in IMS domain (simplified).
Market leaders such as Amazon Web Services and Google
Container Engine typically take advantage of both virtualization
and containerization by designing infrastructures where
containers run on top of VMs [41]. Actually, the possibilities
of combining VMs with containers strongly depend on the
specific policy adopted by the cloud provider. A useful
taxonomy is provided in [33], where two common schemes
stand out: the first is the homogeneous one, where several
instances of a single kind of container run on top of one and
the same VM. Such a scheme is well suited for public cloud
environments where, for security reasons, it is preferable not to
share VMs among different users or tenants. The second one is
the heterogeneous scheme, where different kinds of containers
are allowed to share the same VM. This is the case of private
cloud environments (see for example [42]), where there are no
strict security requirements, and where it is possible to design
scalable redundancy policies (by replicating the entire VM) to
cope with failure events.
Specifically, we consider Docker-based implementations of
both schemes that share a common five-layer arrangement
referred to as a Containerized Network Replica (CNR), consisting
of (see Fig. 2): i) an infrastructure layer that generically
embodies hardware components (HW) such as CPU, RAM,
power supplies; ii) a hypervisor (HPV) acting as an interface
between hardware and upper layers; iii) a virtual machine
(VM) layer that provides a wrapper for the Docker environment;
iv) a Docker daemon² (DCK) that offers a runtime
environment to handle containers; v) a Container (CNT) which
embeds the specific software functionality to be provided (e.g.
Proxy, Serving, etc.), and represents the basic element
delivering/managing IMS sessions. Since a DCK can handle more
than one CNT, and since an HPV can handle different VMs,
different CNR schemes are possible. As illustrated in Fig.
2, scheme (a) accounts for a homogeneous implementation
of a CNR hosting only one kind of instance (e.g. P-CSCF
or P instance); scheme (b) represents a co-located homogeneous
implementation where different container instances
(e.g. I-CSCF (I) and HSS (H) instances) are deployed over
different dockers and virtual machines, although they share
the same underlying infrastructure; scheme (c) accounts for
a heterogeneous implementation with a co-located
deployment of different containers; finally, scheme (d)
represents a mixed heterogeneous deployment with different
dockers and virtual machines that is not really used in practice,
but is included for the sake of completeness. We remark that
schemes (b) and (c) represent different kinds of co-location
(that in real IMS settings typically involves HSS and I-CSCF
- see [43]): the former implements co-location at infrastructure
level, whereas the latter implements co-location at container
level. It is useful to define a Containerized Network Function
(CNF) as an ensemble of CNRs deployed to provide a specific
IMS functionality. Otherwise stated, a CNF is a logical
abstraction of a cIMS node composed of one or more CNRs.
Thus, in the sequel, the terms CNF and cIMS node may be
used interchangeably.

² Hereafter, for the sake of brevity, docker daemon will be simply referred
to as docker.

[Figure 2: four panels (a)-(d), each showing the Hardware (HW),
Hypervisor (HPV), Virtual Machine (VM), Docker Daemon (DCK), and
Container (CNT) layers of a CNR scheme.]
Fig. 2: Different schemes of CNRs. Schemes (a) and (b)
realize pure and co-located homogeneous deployments,
respectively. Schemes (c) and (d) realize co-located and
mixed heterogeneous deployments, respectively. Note that
co-location in (b) is intended at infrastructure level, whereas
in (c) is intended at container level.
In Fig. 3 we outline an exemplary mapping between
functionalities implemented through CNFs (P/I/S-CSCF and
HSS), and the corresponding physical deployments realized
via CNRs. The dashed rectangle surrounding I-CSCF and HSS
indicates that such nodes are typically co-located, meaning that
the corresponding CNRs are implemented through schemes
2(b) and/or 2(c).
The introduction of the CNF representation provides some
degrees of freedom. First, a single CNF can be realized by
means of multiple CNRs that, in principle, can be distributed
geographically. Second, different CNRs belonging to the same
CNF can have a different number of containers, since a CNR
has limited resources and can support only up to a certain number of
cIMS sessions. Finally, different CNRs belonging to the same
CNF can be deployed according to different schemes (see Fig.
2), as occurs for HSS that, in this example, is deployed by
means of two CNRs: a co-located homogeneous one (CNR 3)
shared with I-CSCF, and a pure homogeneous one (CNR 4).
Fig. 3: A Containerized Network Function (CNF) represents
a logical abstraction of an IMS functionality (P/I/S-CSCF,
HSS) and can be deployed through one or more CNRs.
Fig. 4: Interconnections among nodes in a (containerized) IMS
infrastructure (homogeneous deployment case).
V. AVAILABILITY MODEL OF CONTAINERIZED IMS
FRAMEWORK
We now demonstrate how a combination of RBD and SRN
formalisms can help the availability analysis of a cIMS. The
former makes it possible to interpret a cIMS infrastructure in terms of
high-level interconnections among nodes, as illustrated in Fig.
4, which reflects the sequential connection of nodes depicted in
Fig. 1. As reported in Fig. 4, each CNR can host a different
number of containers, thus realizing an effective redundancy
only up to the DCK layer. This is due to the fact that specific
availability requirements can be met through a variable number
of containers per CNR.
On the other hand, the SRN methodology, stemming from
Markov Reward Models [44], [45], makes it possible to describe the
interactions occurring among the various layers of a CNR
which composes a generic cIMS node. More specifically, we
adopt its graphical description in terms of bipartite directed
graphs where places (depicted as circles) account for specific
conditions (e.g. nodes up/down), and transitions (depicted
as rectangles) represent the actions (e.g. a node fails or
is repaired). Inside a place, tokens (represented by dots or
numbers) characterize holding conditions. In the case of the CNT
layer, one (or more) tokens lying in the “up” place indicate
one or more working containers, whereas for the remaining
layers (DCK, VM, HPV, HW) one token in the “up” place
indicates a working layer. When a failure/repair event related
to a specific CNR layer occurs (namely, a transition is fired),
a token (or more than one token in the case of the CNT layer) is
transferred from the source place to a destination place.
In an SRN, transition times are assumed to be exponential
random variables (a common assumption in reliability and
availability analyses), with $\lambda$ denoting the failure rate, and $\mu$
the repair rate. Solving an SRN amounts to evaluating the reward
function, defined as a non-negative random process associated
with some dependability metrics (among them, the availability).
Let $Y(t)$ be the reward function that is equal to 1 when
the system is working at time $t$, and to 0 otherwise. The
instantaneous availability can be expressed as [44]

$$A(t) = P\{Y(t) = 1\} = E[Y(t)] = \sum_{i \in I} r_i \cdot p_i(t), \tag{1}$$

where $I$ is the set of markings, namely, the set of feasible
token distributions, $r_i$ (commonly referred to as reward rate)
is the value of $Y(t)$ in marking $i$, and $p_i(t)$ is the corresponding
probability. The set $I$ can be split into a subset of “up” states
($r_i = 1$), and a subset of “down” states ($r_i = 0$).
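As a minimal illustration of (1), consider the simplest possible model: a single component with one “up” and one “down” marking, exponential failure rate $\lambda$ and repair rate $\mu$, starting in the up state. For this two-state CTMC the state probabilities have a well-known closed form, so the reward-weighted sum can be evaluated directly. The rates below are hypothetical and the function names are ours; the sketch is not the SRN solver used in the paper.

```python
import math


def two_state_probabilities(lam, mu, t):
    """Return (p_up(t), p_dn(t)) for a two-state CTMC that starts up,
    with failure rate lam and repair rate mu."""
    steady_up = mu / (lam + mu)
    p_up = steady_up + (lam / (lam + mu)) * math.exp(-(lam + mu) * t)
    return p_up, 1.0 - p_up


def availability(rewards, probs):
    """Eq. (1): A(t) = sum over markings of r_i * p_i(t)."""
    return sum(r * p for r, p in zip(rewards, probs))


# Hypothetical rates (per hour): mean time to failure 1000 h, mean repair 1 h.
lam, mu = 1.0 / 1000.0, 1.0
rewards = (1, 0)  # r_up = 1, r_dn = 0

a_start = availability(rewards, two_state_probabilities(lam, mu, 0.0))
a_steady = availability(rewards, two_state_probabilities(lam, mu, 1e6))
```

At $t = 0$ the availability equals 1, and for large $t$ it settles at $\mu/(\lambda+\mu)$, the steady-state value used in the rest of the analysis.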
A. Availability model of homogeneous scheme
Figure 5 (part included in dashed rectangle A) describes
the SRN model of the homogeneous scheme depicted in Fig.
2(a) implementing a generic CNR. Places $P_{upCNT}$ [$P_{dnCNT}$],
$P_{upDCK}$ [$P_{dnDCK}$], $P_{upVM}$ [$P_{dnVM}$], $P_{upHPV}$ [$P_{dnHPV}$],
and $P_{upHW}$ [$P_{dnHW}$] take into account the working [failure]
conditions of container instance, docker daemon, virtual machine,
hypervisor, and hardware, respectively. Note that each
place contains only one token (indicated by number 1), except
for the place $P_{upCNT}$ that contains $n_k$ tokens, which denotes
the possibility of having more replicas of a single container
instance.
Transitions $T_{fCNT}$ [$T_{rCNT}$], $T_{fDCK}$ [$T_{rDCK}$], $T_{fVM}$
[$T_{rVM}$], $T_{fHPV}$ [$T_{rHPV}$], and $T_{fHW}$ [$T_{rHW}$] denote failure
[repair] activities characterizing containers, docker, virtual machine,
hypervisor, and hardware respectively. Such transitions
(depicted as unfilled rectangles) are called “timed” transitions
and, as previously said, are characterized by exponentially distributed
times. If a timed transition is “marking-dependent” (in
Fig. 5 we insert the “#” symbol to denote such condition), its
effective rate is multiplied by the number of tokens available in
the pertinent place. Conversely, transitions $t_{CNT}$, $t_{DCK}$, $t_{VM}$,
and $t_{HPV}$ (represented by filled and thin rectangles) are called
“immediate” transitions and account for actions occurring in
a zero-length time interval.
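The effect of a marking-dependent rate can be seen on the container layer taken in isolation. The sketch below is a toy model under explicit assumptions that are ours, not the paper's: the lower layers are ignored (always up), and both the failure and the repair transitions of the containers are marking-dependent, so that with $m$ containers up out of $n$ the failure rate is $m\lambda$ and the repair rate is $(n-m)\mu$. Under these assumptions the number of working containers follows a birth-death chain whose steady state is binomial, which the code checks.

```python
from math import comb


def effective_rate(base_rate, tokens):
    """Marking-dependent transition: base rate times tokens in the input place."""
    return base_rate * tokens


def container_layer_steady_state(n, lam, mu):
    """Steady-state distribution of the number of up containers (0..n),
    obtained from the birth-death balance equations."""
    weights = [1.0]  # unnormalized probability of 0 containers up
    for m in range(n):
        repair_rate = effective_rate(mu, n - m)    # m -> m + 1 containers up
        failure_rate = effective_rate(lam, m + 1)  # m + 1 -> m containers up
        weights.append(weights[-1] * repair_rate / failure_rate)
    total = sum(weights)
    return [w / total for w in weights]


def binomial_reference(n, lam, mu):
    """Closed form expected when containers fail and recover independently."""
    p = mu / (lam + mu)
    return [comb(n, m) * p**m * (1 - p) ** (n - m) for m in range(n + 1)]
```

For example, `container_layer_steady_state(3, 0.1, 1.0)` matches `binomial_reference(3, 0.1, 1.0)` up to rounding. If the repair transition were not marking-dependent (a single repair facility), only the `repair_rate` line would change and the binomial form would no longer hold.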
The time-evolution of the SRN in Fig. 5 (part A) can
be analyzed by starting from the initial working condition,
where $n_k$ tokens are located in place $P_{upCNT}$, while a single
token is present in all the remaining “up” places. When a
single container failure occurs (e.g. an uncontrolled reboot of a
container instance), transition $T_{fCNT}$ is fired and one token in
$P_{upCNT}$ is moved to place $P_{dnCNT}$. As a consequence, $n_k - 1$
tokens remain in $P_{upCNT}$. Conversely, once the container
is repaired, $T_{rCNT}$ is fired and the token comes
back to $P_{upCNT}$.

[Figure 5: SRN graph with parts A, B, C, and D.]
Fig. 5: SRN-based model representative of a generic CNR deployed according to: the homogeneous scheme (part A); the
homogeneous co-located scheme (parts A+B), the heterogeneous co-located scheme (parts A+D); the heterogeneous mixed
scheme (parts A+B+C+D).
Now, let us consider the case of a docker layer failure.
The transition $T_{fDCK}$ is fired and the token is moved from
$P_{upDCK}$ to $P_{dnDCK}$. Notice that, when the docker layer fails,
all container instances that need the underlying docker layer
to be up and running become inactive. Such an issue is taken
into account through an inhibitory arc (depicted as a segment
between $P_{upDCK}$ and $t_{CNT}$ with a little circle close to the
latter) that, in case of a docker failure, forces $t_{CNT}$ to be fired.
When the docker gets repaired, $T_{rDCK}$ is fired, and two actions
occur: first, the token passes from $P_{dnDCK}$ to $P_{upDCK}$, then
the inhibitory arc between $P_{dnDCK}$ and $T_{rCNT}$ is disabled
and, consequently, the $n_k$ tokens are ready to be transferred
from $P_{dnCNT}$ to $P_{upCNT}$.
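The token game just described for the container and docker layers can be mimicked with a few lines of code. The class below is a hypothetical, untimed sketch (no rates, only the enabling rules): it tracks the tokens in the container places and the position of the single docker token, flushes all container tokens when the docker fails (the effect attributed to the immediate transition $t_{CNT}$), and blocks container repairs while the docker is down (the inhibitory arc on $T_{rCNT}$).

```python
class CnrMarking:
    """Token counts for the CNT and DCK layers of one CNR (untimed sketch)."""

    def __init__(self, n_containers):
        self.up_cnt = n_containers  # tokens in P_upCNT
        self.dn_cnt = 0             # tokens in P_dnCNT
        self.dck_up = True          # True: token in P_upDCK, False: in P_dnDCK

    def fail_container(self):
        """T_fCNT: move one token from P_upCNT to P_dnCNT, if any."""
        if self.up_cnt == 0:
            return False
        self.up_cnt -= 1
        self.dn_cnt += 1
        return True

    def repair_container(self):
        """T_rCNT: inhibited while the docker token sits in P_dnDCK."""
        if not self.dck_up or self.dn_cnt == 0:
            return False
        self.dn_cnt -= 1
        self.up_cnt += 1
        return True

    def fail_docker(self):
        """T_fDCK followed by t_CNT: all containers become inactive."""
        if not self.dck_up:
            return False
        self.dck_up = False
        self.dn_cnt += self.up_cnt
        self.up_cnt = 0
        return True

    def repair_docker(self):
        """T_rDCK: the docker token returns to P_upDCK."""
        if self.dck_up:
            return False
        self.dck_up = True
        return True
```

Starting from three working containers, a docker failure leaves the marking at (0 up, 3 down); container repairs are refused until `repair_docker()` has been called, after which tokens can return to the up place one at a time.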
Similar behaviors occur in case of virtual machine, hypervisor,
and hardware failures/repairs. It is worth noting that the
only layer without an immediate transition is the hardware
layer. Indeed, since hardware is the lowest layer of a generic
cIMS node structure, no further underlying dependencies have
to be taken into account. Let us now define two quantities useful
for the forthcoming performability analysis: the demand $W$,
that is the required system performance in terms of concurrent
IMS sessions; the capacity $c_j$, namely the maximum number
of concurrent IMS sessions a container belonging to CNF $j$
can manage. Therefore, the reward rate in marking $i$ is

$$r_i(j) = \begin{cases} 1 & \text{if } \sum_{h=1}^{k} \#P^{(h)}_{upCNT} \cdot c_j \ge W, \\ 0 & \text{otherwise,} \end{cases} \tag{2}$$

where $k$ is the number of CNRs forming CNF $j$ (indexed by $h$), and “#”
refers to the number of tokens³. By defining $\gamma_j = W/c_j$ as the
normalized performance level, (2) can be recast as

$$r_i(j) = \begin{cases} 1 & \text{if } \sum_{h=1}^{k} \#P^{(h)}_{upCNT} \ge \gamma_j, \\ 0 & \text{otherwise.} \end{cases} \tag{3}$$
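The reward rate in (3) is a simple threshold test on the total number of working containers across the CNRs of a node. A minimal sketch follows; the function names and the numerical values of demand and capacity are hypothetical and serve only to show the computation.

```python
def normalized_level(demand, capacity):
    """gamma_j = W / c_j."""
    return demand / capacity


def reward(up_tokens_per_cnr, gamma):
    """Eq. (3): 1 if the working containers summed over the CNRs of the
    node reach the normalized performance level, 0 otherwise."""
    return 1 if sum(up_tokens_per_cnr) >= gamma else 0


# Hypothetical figures: W = 30000 concurrent sessions, c_j = 10000 per container.
gamma = normalized_level(30000, 10000)  # three working containers are needed

r_ok = reward([2, 1], gamma)   # two CNRs with 2 and 1 working containers
r_low = reward([1, 1], gamma)  # only two working containers in total
```

Here `r_ok` is 1 and `r_low` is 0: the node counts as available only in markings where the surviving containers can still sustain the demand, which is what turns a pure availability model into a performability one.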
Finally, in the limit for $t \to \infty$, we get the steady-state
availability for cIMS node $j$:

$$A_j = \lim_{t \to +\infty} A_j(t) = \sum_{i \in I} r_i(j) \cdot p_{ij}, \tag{4}$$

where $r_i(j)$ is derived from (3), and $p_{ij}$ is the steady-state
probability given by $p_{ij} = \lim_{t \to +\infty} p_{ij}(t)$ (where $p_{ij}(t)$ is
the instantaneous probability of node $j$ being in marking $i$).
From the single cIMS node availability in (4), it is possible to
derive the overall steady-state availability for the homogeneous
scheme as

$$A_{cIMS} = \prod_{j \in \{P,S,I,H\}} A_j. \tag{5}$$

The product in (5) stems from the RBD-like modeling of Fig.
4 representative of series connection among cIMS nodes. The
overall cIMS steady-state availability, in fact, requires that
each node must be available.
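A short numerical sketch of the series product in (5), together with the conversion from steady-state availability to yearly downtime behind the “five nines” requirement, is given below. The per-node availabilities are hypothetical values chosen only for illustration, and a year of 365.25 days is assumed.

```python
from math import prod

MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumption: average year length


def series_availability(node_availabilities):
    """Eq. (5): product of the per-node steady-state availabilities."""
    return prod(node_availabilities.values())


def yearly_downtime_minutes(availability):
    """Expected downtime per year implied by a steady-state availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


# Hypothetical per-node availabilities for P-CSCF, S-CSCF, I-CSCF, and HSS.
nodes = {"P": 0.999999, "S": 0.999999, "I": 0.999998, "H": 0.999998}
a_cims = series_availability(nodes)
five_nines_downtime = yearly_downtime_minutes(0.99999)
```

With these numbers the chain still exceeds 0.99999, but its availability is lower than that of its weakest node, as expected for a series structure. The value of `five_nines_downtime` is about 5.26 minutes, in agreement with the requirement quoted in the Introduction.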
³ In the standard Petri Net terminology there is a little abuse of notation, as
the symbol # denotes both the number of tokens and the marking-dependent
transitions in SRN graphical representations.

B. Availability model of homogeneous co-located scheme
Let us now consider the availability model of a CNR
deployed according to a homogeneous co-located scheme as
shown in Fig. 2(b). We recall that such a co-location is realized
by sharing the infrastructural level (hypervisor/hardware). The
corresponding SRN is given by parts A and B in Fig. 5,
where, for the sake of simplicity, we consider only two different
co-located containers and two different docker and virtual
machine layers. Comparing this co-located scheme against the
homogeneous scheme of Fig. 5 (part A), we can observe that
the main structure remains unaltered, whereas a new piece
(B) typifies the presence of co-located elements. Places and
transitions characterizing such a new piece are distinguished
by means of a prime superscript (e.g. $P'_{upCNT}$, $T'_{fCNT}$, etc.).
Moreover, we can notice the presence of two further inhibitory
arcs connecting the two parts of the graph. The first one,
between $P_{upHPV}$ and $t'_{VM}$, accounts for the fact that, if
the hypervisor fails, the co-located VM (and, in turn, the co-located
DCK and CNT) cannot be operative; thus, $t'_{VM}$ is fired and
the only token in $P'_{upVM}$ is moved to $P'_{dnVM}$.
The second inhibitory arc, between $P_{dnHPV}$ and $T'_{rVM}$,
accounts for the fact that the token cannot be moved from
$P'_{dnVM}$ to $P'_{upVM}$ until the token in $P_{dnHPV}$ is transferred
to $P_{upHPV}$, since the co-located virtual machine cannot be
restored until the hypervisor gets repaired.
Similarly, given marking $i$, it is possible to define a new reward rate as
\[
r'_i(j_1, j_2) =
\begin{cases}
1 & \text{if } \sum_{h=1}^{k} \#P^{(h)}_{upCNT} \geq \gamma_{j_1} \,\wedge\, \sum_{h=1}^{k} \#P'^{(h)}_{upCNT} \geq \gamma_{j_2}, \\
0 & \text{otherwise,}
\end{cases}
\tag{6}
\]
where the symbol $\wedge$ denotes a logical AND operator between the two conditions. The above expression can be interpreted as a generalization of (3) to the case of two co-located containers with performance capacities $c_{j_1}$ and $c_{j_2}$ belonging to co-located nodes $j_1$ and $j_2$ that, in our case, are I-CSCF and HSS.
Accordingly, the steady-state availability pertinent to a pair of co-located nodes can be expressed as
\[
A'_{j_1 j_2} = \lim_{t \to +\infty} A'_{j_1 j_2}(t) = \sum_{i \in I} r'_i(j_1, j_2) \cdot p'_{ij}, \tag{7}
\]
where $r'_i(j_1, j_2)$ is given by (6), and $p'_{ij}$ is the corresponding steady-state probability. Considering that, in a homogeneous co-located scheme, the overall cIMS infrastructure is composed of two non-colocated nodes (typically P-CSCF and S-CSCF) and two co-located nodes (typically I-CSCF and HSS), the overall steady-state availability is:
\[
A'_{cIMS} = A'_{j_1 j_2} \cdot \prod_{j \neq j_1, j_2} A_j, \tag{8}
\]
with $j, j_1, j_2 \in \{P,S,I,H\}$. The product of $A_j$ (derived from (4)) spans across the two non-colocated nodes, whereas $A'_{j_1 j_2}$ (from (7)) takes into account the remaining co-located nodes.
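The co-located case of (6)-(8) can be sketched in the same spirit: the reward now requires both co-located nodes to meet their own threshold in the same marking, and the resulting pair availability is multiplied by the availabilities of the non-colocated nodes. The joint-marking encoding and all numbers below are hypothetical and serve only to illustrate the structure of the formulas.

```python
import math
from typing import Mapping, Sequence, Tuple

Up = Tuple[int, ...]          # containers up per CNR for one node type
JointMarking = Tuple[Up, Up]  # assumed encoding: (I-CSCF side, HSS side)


def colocated_reward(up_j1: Up, up_j2: Up, gamma_j1: float, gamma_j2: float) -> int:
    """Eq. (6): both co-located nodes must meet their own threshold."""
    return 1 if sum(up_j1) >= gamma_j1 and sum(up_j2) >= gamma_j2 else 0


def pair_availability(probs: Mapping[JointMarking, float],
                      gamma_j1: float, gamma_j2: float) -> float:
    """Eq. (7): reward-weighted sum over the joint markings."""
    return sum(colocated_reward(m1, m2, gamma_j1, gamma_j2) * p
               for (m1, m2), p in probs.items())


def colocated_cims_availability(a_pair: float, a_others: Sequence[float]) -> float:
    """Eq. (8): pair availability times the non-colocated node availabilities."""
    return a_pair * math.prod(a_others)


# Hypothetical joint distribution for one shared CNR (illustrative only).
joint_probs = {
    ((2,), (3,)): 0.97,
    ((1,), (3,)): 0.01,
    ((2,), (2,)): 0.01,
    ((0,), (0,)): 0.01,
}
a_pair = pair_availability(joint_probs, gamma_j1=2, gamma_j2=3)
a_total = colocated_cims_availability(a_pair, [0.999, 0.999])
```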
C. Availability model of heterogeneous co-located scheme
The next model refers to a heterogeneous co-located scheme of a CNR, and is depicted in Fig. 5 (parts included in dashed rectangles A and D). Such a scheme represents a lightweight co-location, since the whole infrastructure from docker to hardware can host different kinds of containers. In this case, just one new element, represented by another container, has been introduced (part D in the pertinent SRN). Similar to the previous case, the inhibitory arc between $P_{upDCK}$ and $t''_{CNT}$ forces the co-located container to fail in case of a docker failure, whereas the inhibitory arc from $P_{dnDCK}$ to $T''_{rCNT}$ prevents the co-located containers from working again until the docker gets repaired.
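For a given marking $i$, the reward rate admits the following expression:
\[
r''_i(j_1, j_2) =
\begin{cases}
1 & \text{if } \sum_{h=1}^{k} \#P^{(h)}_{upCNT} \geq \gamma_{j_1} \,\wedge\, \sum_{h=1}^{k} \#P''^{(h)}_{upCNT} \geq \gamma_{j_2}, \\
0 & \text{otherwise.}
\end{cases}
\tag{9}
\]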
Notice that such a reward rate is similar to (6), except for the fact that, now, the co-location is realized at container level; thus, $P''_{upCNT}$ intervenes in (9). Accordingly, the steady-state availability expression turns out akin to the one derived for the previous (homogeneous co-located) case, viz.
\[
A''_{cIMS} = A''_{j_1 j_2} \cdot \prod_{j \neq j_1, j_2} A_j, \tag{10}
\]
with $j, j_1, j_2 \in \{P,S,I,H\}$. Again, the product of $A_j$ (derived from (4)) spans across the two non-colocated nodes (P-CSCF and S-CSCF), whereas $A''_{j_1 j_2}$ (built by starting from (9) as for (7)) accounts for the remaining co-located nodes $j_1$ and $j_2$, namely, I-CSCF and HSS.
D. Availability model of heterogeneous mixed scheme
This last SRN model pertains to a heterogeneous mixed case that is a combination of homogeneous co-located and heterogeneous co-located schemes; referring to Fig. 5, it comprises parts A, B, C, and D. The introduction of part C (logically connected to part B) guarantees the symmetry with part D (logically connected to part A) and allows modeling the behavior of two separate sub-structures (each composed of container(s), docker, and VM) which share the same underlying infrastructure. The two inhibitory arcs (one between $P'_{upDCK}$ and $t'''_{CNT}$ and another from $P'_{dnDCK}$ to $T'''_{rCNT}$) admit the same interpretation, mutatis mutandis, offered for the heterogeneous case. Let us now derive expressions for the reward rate and the steady-state availability. Given marking $i$, we express the reward rate as
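\[
r'''_i(j_1, j_2, j_3, j_4) =
\begin{cases}
1 & \text{if } \Big\{ \sum_{h=1}^{k} \#P^{(h)}_{upCNT} \geq \gamma_{j_1} \,\wedge\, \sum_{h=1}^{k} \#P''^{(h)}_{upCNT} \geq \gamma_{j_2} \Big\} \\
  & \quad \wedge\, \Big\{ \sum_{h=1}^{k} \#P'^{(h)}_{upCNT} \geq \gamma_{j_3} \,\wedge\, \sum_{h=1}^{k} \#P'''^{(h)}_{upCNT} \geq \gamma_{j_4} \Big\}, \\
0 & \text{otherwise.}
\end{cases}
\tag{11}
\]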
Notice that the above expression can be derived from (3) by considering two pairs of co-located containers with capacities $c_{j_1}$ (with $j_1$ representing I-CSCF), $c_{j_2}$ (with $j_2$ representing HSS), $c_{j_3}$ (with $j_3$ representing P-CSCF), and $c_{j_4}$ (with $j_4$ representing S-CSCF), in accordance with Fig. 2(d). Hence, the pertinent steady-state availability can be expressed as
\[
A'''_{cIMS} = \lim_{t \to +\infty} A'''_{j_1 j_2 j_3 j_4}(t) = \sum_{i \in I} r'''_i(j_1, j_2, j_3, j_4) \cdot p'''_{ij}, \tag{12}
\]
with $j_1, j_2, j_3, j_4 \in \{P,S,I,H\}$. The reward function $r'''_i(j_1, j_2, j_3, j_4)$ is given by (11) and $p'''_{ij}$ is the corresponding steady-state probability.
VI. PERFORMABILITY ASSESSMENT
In this section, after a formal statement of the problem
of searching for the optimal cIMS configurations, we present
an automated procedure designed to render the performability
analysis more efficient.
A. Problem Formalization
Let setting $S$ be a generic deployment of a cIMS infrastructure composed of a certain number of CNRs. Our goal is to find the settings satisfying high availability requirements at minimal cost (since CNRs can be variously combined among them and with different schemes, the optimum could be achieved by more than one setting). This optimization problem can be formalized as follows.
Letting $E_j$ be the cost (expenditure) of node $j$, composed of $h$ CNRs, the overall cost of a cIMS setting is
\[
E(S) = \sum_{j \in \{P,S,I,H\}} E_j. \tag{13}
\]
Letting $R = \{S : A_{cIMS}(S) \geq A_0\}$ be the ensemble of settings satisfying a steady-state availability requirement $A_0$, the formal solution of the problem amounts to:
\[
S^* = \arg\min_{S \in R} E(S). \tag{14}
\]
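The selection rule in (14) can be sketched as a filter followed by a minimum. The `Setting` type and the function name are illustrative assumptions; the three entries use the availability and cost values reported in the numerical evaluation for the co-located scheme with $\gamma = 3$ (settings S1, S2 and S3). Since the optimum need not be unique, the sketch returns every feasible setting attaining the minimum cost.

```python
from typing import List, NamedTuple


class Setting(NamedTuple):
    name: str
    availability: float  # A_cIMS(S)
    cost: float          # E(S), Eq. (13)


def optimal_settings(settings: List[Setting], a0: float) -> List[Setting]:
    """Eq. (14): minimum-cost settings among those with A_cIMS >= A0."""
    feasible = [s for s in settings if s.availability >= a0]
    if not feasible:
        return []
    best_cost = min(s.cost for s in feasible)
    return [s for s in feasible if s.cost == best_cost]


# Co-located scheme, gamma = 3, values as reported in Section VII.
reported = [
    Setting("S1", 0.999986, 40),
    Setting("S2", 0.9999925, 44),
    Setting("S3", 0.9999911, 44),
]
best = optimal_settings(reported, a0=0.99999)
```

Here S1 is cheaper but infeasible, so the minimum is shared by S2 and S3; breaking the tie (for instance by availability) is a separate design decision.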
We shall work under two assumptions. The first one concerns the cost computation/assignment of a cIMS system. Since a single cIMS node is composed of one or more CNRs, we assume that the cost of a single CNR is the sum of three dimensionless contributions: i) cost per container (CNT), embodying the software logic and licenses, ii) cost per docker and virtual machine (DCK+VM), which includes the operating system, and iii) cost per hypervisor and hardware (HPV+HW), representing the infrastructure cost. Each contribution is supposed to be equally priced, with a normalized cost amounting to 1. Such an assumption, in line with the pricing policy of top-player services such as Amazon AWS or Microsoft Azure, reflects the fact that software parts have a cost comparable with physical parts, since an extra amount due to licenses must be considered. Needless to say, the proposed analysis can be generalized by customizing (13) and by choosing different cost contributions.
Fig. 6: Big picture of the procedure implementing OptChains+ to support the performability assessment.
The second assumption concerns the diversity between CSCF containers (P, S, I) and the HSS, as the latter implies an additional criticality due to the underlying database structure. This issue is taken into account by imposing that the HSS container provides one extra replica w.r.t. CSCF containers. In other words, we impose that $\gamma_{HSS} = \gamma_{CSCF} + 1$. To avoid overburdened notation, we will use $\gamma$ in place of $\gamma_{CSCF}$.
B. Performability analysis through OptChains+ algorithm
In our analysis, we face a combinatorial search problem across a huge number of possible redundancy schemes obtained by variously combining CNRs and pertinent containers. Our analysis is assisted by TimeNET [46], a powerful tool for SRN model evaluation, whose functionalities have been further enriched by means of a purposely designed Python-based external module (footnote 4) implementing a multi-stage procedure (sketched in Fig. 6) which:
• automatically builds, replicates (to achieve redundancy), and evaluates SRN models per cIMS node on the basis of some parameters such as: desired scheme (homogeneous, heterogeneous, etc.), Mean Time to Failure (MTTF) 1/λ and Mean Time to Repair (MTTR) 1/µ for various layers, desired steady-state availability target $A_0$ (0.99999 in our case), cost per layer, value of γ;
• automatically composes the series/parallel structures (settings) through the RBD formalism, along with the overall availability evaluation; at the same time, an extraction of feasible settings satisfying the desired constraints is performed.
Footnote 4: Available on request.
Algorithm 1: OptChains+
 1  Initialize the vector R' containing all possible CNFs with various parameters (schemes, λ, µ, A_0, costs, γ);
 2  for CNF ∈ R' do
 3      if #Container ≥ γ then
 4          SRN model evaluation (CNF)
 5          if A_CNF ≥ A_0 then
 6              R[CNF] ← A_CNF
 7          end
 8      end
 9  end
    Intermediate Input: R, g1, g2, g3, G
10  minCost ← inf
11  for p ∈ R_pcscf do
12      calculate E_p
13      if E_p > g1 · minCost then
14          continue;
15      end
16      for s ∈ R_scscf do
17          calculate E_s
18          if E_p + E_s > g2 · minCost
19             OR (A_p · A_s) < A_0 then
20              continue;
21          end
22          if homogeneous then
23              for i ∈ R_icscf do
24                  calculate E_i
25                  if Σ_{k=p,s,i} E_k > g3 · minCost
26                     OR Π_{k=p,s,i} A_k < A_0 then
27                      continue;
28                  end
29                  for h ∈ R_hss do
30                      calculate E_h
31                      if Σ_{k=p,s,i,h} E_k > G · minCost
32                         OR Π_{k=p,s,i,h} A_k < A_0 then
33                          continue;
34                      end
35                      minCost ← min{minCost, E_cIMS}
36                      save [cIMS, A_cIMS, E_cIMS]
37                  end
38              end
39          end
40          else if co-located or heterog. then
41              for c ∈ R_col-het do
42                  calculate E_c
43                  if Σ_{k=p,s,c} E_k > G · minCost
44                     OR Π_{k=p,s,c} A_k < A_0 then
45                      continue;
46                  end
47                  minCost ← min{minCost, E_cIMS}
48                  save [cIMS, A_cIMS, E_cIMS]
49              end
50          end
51      end
52  end
The described procedure has been embedded into an algorithm dubbed OptChains+, whose pseudo-code is reported in Algorithm 1. The first part (lines 1-9) embodies the external call to TimeNET to build and evaluate SRN models for single CNFs (made of one or more CNRs - see Fig. 3), retaining only the (feasible) CNFs that satisfy a given availability constraint ($A_{CNF} \geq A_0$, line 5). The rationale behind this choice is to save computational resources for the evaluation of the final availability $A_{cIMS}$, which is obtained as the product of $A_{CNF}$ terms, one per node: since each term is at most 1, a CNF with $A_{CNF} < A_0$ can never belong to a feasible setting. We remark that such an external call (line 4) is just preparatory to obtaining the vector R of feasible CNFs; thus, in case a different tool is used, the rest of OptChains+ remains unaltered.
The second part of the algorithm aims at achieving a reduced number of cIMS settings (matching specific cost $E_{cIMS}$ and availability $A_{cIMS}$ criteria at the same time) for different schemes: homogeneous (lines 22-39) and co-located/heterogeneous (lines 40-50). The intermediate inputs for such a second part include: R, γ, and four weight factors (g1, g2, g3, G) adopted to tune the pruning/searching process. The idea is to perform an exhaustive search with pruning, starting to cycle on the sub-vector $R_{pcscf}$, which includes all feasible Proxy-type CNFs (line 11). The algorithm prunes all the settings whose P-CSCF cost exceeds g1 times the cost of cIMS (line 13). The variable minCost represents the whole cIMS cost calculated/updated within a cycle, and is initialized at line 10. Then, when analyzing the sub-vector $R_{scscf}$ (line 16), OptChains+ prunes all the settings whose total cost of Proxy and Serving-type CNFs exceeds g2 times the cIMS cost, or whose availability product $A_p \cdot A_s$ is less than $A_0$. Similar logic holds for I-type containers (line 23) and H-type containers (line 29). At line 31, the weight factor G indicates that an extra amount of settings (with a cost of up to G times the current minimum cost) is retained for backup. The final output is a vector gathering all the feasible cIMS settings along with their availability $A_{cIMS}$ and cost $E_{cIMS}$ (line 36 for homogeneous schemes, and line 48 for co-located/heterogeneous schemes). We remark that any reasonable criterion can be pursued to select the weight factors. The practical rule we adopted is based on the assumption that each of the four nodes is worth 1/4 of the whole cIMS deployment. Hence, the algorithm starts to prune all settings whose P-CSCF cost exceeds twice that share, thus g1 = 1/2. Such a “conservative” reasoning is repeated further ahead in OptChains+, so as to obtain the rescaled weight factors g2 = 3/4 and g3 = 1. Ultimately, the value of G is set to 1.15, implying that we keep more settings than needed (precisely, settings whose cost is increased by up to 15%), with the aim of providing a broader set of cIMS combinations.
Intuitively, greater values of weight factors result in a more
conservative strategy since more settings are kept, but at the
cost of a higher computation time.
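To make the pruning logic concrete, the following is a minimal sketch of the homogeneous branch of Algorithm 1 (lines 10-39). It is not the authors' module: the `CNF` tuple, the function name, and the three candidate CNFs are hypothetical, and the candidate list is assumed to be already filtered (lines 1-9) and sorted by ascending cost. Note that, with minCost initialized to infinity, cost-based pruning only starts after the first feasible setting has been found.

```python
import math
from typing import Dict, List, NamedTuple, Tuple


class CNF(NamedTuple):
    name: str
    availability: float  # A_CNF, assumed already >= A0
    cost: int            # E of the CNF


def opt_chains_homogeneous(R: Dict[str, List[CNF]], a0: float,
                           g1: float = 0.5, g2: float = 0.75,
                           g3: float = 1.0, G: float = 1.15):
    """Pruned nested search over P, S, I and H candidates (homogeneous case)."""
    min_cost = math.inf
    saved: List[Tuple[Tuple[str, ...], float, int]] = []
    for p in R["pcscf"]:
        if p.cost > g1 * min_cost:
            continue
        for s in R["scscf"]:
            e_ps = p.cost + s.cost
            a_ps = p.availability * s.availability
            if e_ps > g2 * min_cost or a_ps < a0:
                continue
            for i in R["icscf"]:
                e_psi = e_ps + i.cost
                a_psi = a_ps * i.availability
                if e_psi > g3 * min_cost or a_psi < a0:
                    continue
                for h in R["hss"]:
                    e = e_psi + h.cost
                    a = a_psi * h.availability
                    if e > G * min_cost or a < a0:
                        continue
                    min_cost = min(min_cost, e)
                    saved.append(((p.name, s.name, i.name, h.name), a, e))
    return saved, min_cost


# Hypothetical candidates, sorted by ascending cost and reused for every node.
candidates = [CNF("b", 0.99999, 8), CNF("a", 0.999999, 10), CNF("c", 0.9999999, 14)]
R = {node: candidates for node in ("pcscf", "scscf", "icscf", "hss")}
saved, min_cost = opt_chains_homogeneous(R, a0=0.99999)
```

In this toy run the cheapest candidate `b` is always rejected by the availability test, the first saved setting uses `a` for all four nodes (cost 40), and only settings with a cost within the factor G of that minimum are retained.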
In our case study, which assumes a maximum of 6 containers to deploy per CNR (an assumption in keeping with the resource constraint of the experimental testbed), the variety of redundancy schemes to analyze produces a number of settings in the order of $10^{13}$ (consider combining 7 container counts (0-6) deployed across 4 nodes, and, then, composing a setting of 4 elements: $(7^4)^4$).
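The order of magnitude quoted above can be checked with plain integer arithmetic. The variable names are illustrative; the three numbers (7 choices, 4 positions, 4 elements) are those given in the text.

```python
choices_per_position = 7   # 0 to 6 containers
positions = 4              # first level of the combination
elements_per_setting = 4   # second level of the combination

per_element = choices_per_position ** positions    # 7^4
total_settings = per_element ** elements_per_setting  # (7^4)^4 = 7^16

print(per_element, total_settings)
```

The total is about $3.3 \times 10^{13}$, which is why an exhaustive evaluation without pruning is impractical.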
Fig. 7: Sketch of the experimental testbed relying on the Clearwater architecture.
It is worth noting that, since OptChains+ is a heuristic algorithm, its time complexity is related to the choice of weight factors. If they are high (conservative policy with few pruned settings), the complexity could reach $O(n^4)$. Conversely, for low weight factors (relaxed policy with many pruned settings - the typical case), the complexity drops to $O(n \log n)$ due to an embedded ascending sorting operation within cost vectors.
With the proposed OptChains+ tuning, the number of settings to analyze decreases to $10^5$, and, on a standard PC (Intel Core i5-3230 CPU @ 2.60 GHz, with 8 GB of RAM), the whole procedure requires about 450 seconds to run (neglecting the call to the external tool TimeNET).
VII. NUMERICAL EVALUATION
In Fig. 7 we sketch the deployed experimental testbed based
on the Clearwater platform that we exploit to support the load
assumption about the number of cIMS sessions representing
the adopted performance capacity indicator.
On a laptop with an Intel Core i7-3630QM CPU @ 2.40 GHz and 8 GB of RAM, we deploy two Linux-based virtual machines (1 virtual core and 2 GB of RAM per VM): the first one hosts a containerized deployment of the whole cIMS architecture, including P-CSCF (Bono), S/I-CSCF (Sprout), and HSS (Homestead). The second VM is a stress node that executes some routines to perform a load stress against the containerized platform. The test scenario considers 1000 concurrent IMS sessions with a BHCA (Busy Hour Call Attempts) equal to 2.6 per user (in line with values provided for VoLTE - see [47]). The resulting average call setup delay is 80 ms, a quite reasonable value since the infrastructure is deployed on the same node, so that interconnection delays are negligible.
On the other hand, due to the lack of measured data concerning MTTF and MTTR of containerized components, we refer in part to the technical literature (see, e.g., [33]), and in part to expert hints. Such parameters are shown in Table I.
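To get a feel for the Table I values, the sketch below computes the availability each layer would have if it were an isolated two-state (up/down) component with exponential failure and repair times, for which the steady-state availability is MTTF/(MTTF+MTTR). This is only a back-of-the-envelope view under that stated assumption: the SRN models of the paper also capture the dependencies among layers, so these figures are not the node availabilities $A_j$.

```python
HOUR = 3600.0  # seconds per hour

# (MTTF in hours, MTTR in hours), values taken from Table I.
layers = {
    "CNT": (500.0, 2.0 / HOUR),
    "DCK": (1000.0, 5.0 / HOUR),
    "VM": (2880.0, 1.0),
    "HPV": (2880.0, 2.0),
    "HW": (60000.0, 8.0),
}


def isolated_availability(mttf_h: float, mttr_h: float) -> float:
    """Steady-state availability of an isolated two-state component."""
    return mttf_h / (mttf_h + mttr_h)


availability = {name: isolated_availability(*times) for name, times in layers.items()}
weakest = min(availability, key=availability.get)

for name, value in availability.items():
    print(f"{name}: {value:.7f}")
```

Under this simplification the software layers with reboot-scale repair times (CNT, DCK) are individually far more available than the VM and hypervisor layers, whose repair times are in the order of hours.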
The experiment allows for a performance demand $W$ ranging from 2000 to 5000, assuming a performance capacity $c = 1000$ (both in terms of IMS sessions) and assuming, for simplicity, $c_j = c$. This basically means that, if a provider needs to guarantee up to, say, 4000 concurrent IMS sessions, we get $\gamma = 4$, indicating the need for at least 4 containers. Hence, according to the $W$ value, $\gamma$ ranges from 2 to 5 (in case of a non-integer result, we consider the next integer value for $\gamma$).
TABLE I: Parameter values. CNT and DCK repair times must be interpreted as times spent to perform a software reboot.
Parameter   Description                             Value
1/λ_CNT     container MTTF (hour)                   500
1/λ_DCK     docker daemon MTTF (hour)               1000
1/λ_VM      virtual machine MTTF (hour)             2880
1/λ_HPV     hypervisor MTTF (hour)                  2880
1/λ_HW      hardware MTTF (hour)                    60000
1/µ_CNT     container MTTR (sec)                    2
1/µ_DCK     docker daemon MTTR (sec)                5
1/µ_VM      virtual machine MTTR (hour)             1
1/µ_HPV     hypervisor MTTR (hour)                  2
1/µ_HW      hardware MTTR (hour)                    8
W           performance demand (IMS sessions)       (2000, 5000)
c           performance capacity (IMS sessions)     1000
A_0         steady-state availability requirement   0.99999
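The mapping from demand to the required number of containers amounts to a ceiling of $W/c$, with the HSS requiring one extra replica as assumed in Section VI-A. A minimal sketch (function names are illustrative):

```python
import math


def required_containers(demand_w: float, capacity_c: float) -> int:
    """gamma = W / c, rounded up to the next integer when not exact."""
    return math.ceil(demand_w / capacity_c)


def required_hss_containers(demand_w: float, capacity_c: float) -> int:
    """HSS needs one extra replica: gamma_HSS = gamma_CSCF + 1."""
    return required_containers(demand_w, capacity_c) + 1


gammas = {w: required_containers(w, 1000) for w in (2000, 3000, 4000, 4500, 5000)}
```

For instance, a demand of 4500 sessions with $c = 1000$ is served with $\gamma = 5$ CSCF containers and 6 HSS containers.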
To consider a practical case, we analyze some relevant settings (extracted among about 1000 produced by the procedure), as reported in Table II. The column Scheme indicates the type of cIMS deployment along with different values of $\gamma$. With a little abuse of notation, the homogeneous (HOM.) scheme refers to a cIMS setting where all nodes are composed of homogeneous CNRs, whereas in the co-located (COL.) and heterogeneous (HET.) schemes, I-CSCF and HSS share the same CNR(s) of co-located and heterogeneous type, respectively. For each scheme, we consider four exemplary settings ($S_1, \ldots, S_4$) where a maximum of 4 CNRs per node are allowed. Let us clarify the notation adopted in Table II by considering, for instance, setting $S_1$ in the co-located scheme with $\gamma = 2$. The notation [2 2 0 0] used for P-CSCF indicates that 2 out of 4 (homogeneous) CNRs are exploited and 2 containers per CNR are used. Similarly, for the S-CSCF ([1 1 1 0]), 3 out of 4 (homogeneous) CNRs are exploited and 1 container per CNR is used.
A slightly different notation is used for I-CSCF and HSS, which share the same CNRs. In such a case, [2,3 2,3 0,0 0,0] indicates that 2 out of 4 (co-located) CNRs are exploited, where 2 I-type containers and 3 H-type containers are deployed per CNR, respectively.
Such a concise notation is also helpful to quickly compute the cost $E$ for each setting. As regards the previous example, the deployment cost $E_P$ for P-CSCF amounts to 1·4 (CNT) + 1·2 (DCK+VM) + 1·2 (HPV+HW) = 8; the cost $E_S$ for S-CSCF amounts to 1·3 (CNT) + 1·3 (DCK+VM) + 1·3 (HPV+HW) = 9; the cost $E_{I,H}$ for co-located I-CSCF and HSS amounts to 1·10 (CNT) + 1·4 (DCK+VM) + 1·2 (HPV+HW) = 16. The total cost amounts to $E = E_P + E_S + E_{I,H} = 33$.
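The worked example above can be reproduced with two small helpers operating on the Table II notation. The function names are illustrative, and one detail is an assumption inferred from the example rather than stated in the text: a layer is only charged when it hosts at least one container (so an unused CNR costs nothing, and a co-located CNR pays one DCK+VM layer per container type actually present).

```python
from typing import Sequence, Tuple

UNIT = 1  # normalized cost of each contribution (CNT, DCK+VM, HPV+HW)


def cost_homogeneous(containers_per_cnr: Sequence[int]) -> int:
    """Cost of a node such as [2, 2, 0, 0]: CNT + DCK+VM + HPV+HW."""
    used_cnrs = sum(1 for n in containers_per_cnr if n > 0)
    return UNIT * sum(containers_per_cnr) + UNIT * used_cnrs + UNIT * used_cnrs


def cost_colocated(pairs_per_cnr: Sequence[Tuple[int, int]]) -> int:
    """Cost of co-located I-CSCF/HSS, e.g. [(2, 3), (2, 3), (0, 0), (0, 0)]."""
    containers = sum(i + h for i, h in pairs_per_cnr)
    # Assumption: one DCK+VM layer per container type present in a CNR.
    dck_vm = sum((i > 0) + (h > 0) for i, h in pairs_per_cnr)
    hpv_hw = sum(1 for i, h in pairs_per_cnr if i + h > 0)
    return UNIT * (containers + dck_vm + hpv_hw)


e_p = cost_homogeneous([2, 2, 0, 0])
e_s = cost_homogeneous([1, 1, 1, 0])
e_ih = cost_colocated([(2, 3), (2, 3), (0, 0), (0, 0)])
total = e_p + e_s + e_ih
```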
Let us now explore the results in terms of availability and costs for various settings through the panels of Fig. 8, where we report the steady-state availability $A_{cIMS}$ for different values of $\gamma$. Let us consider, for instance, Fig. 8(b) showing the case $\gamma = 3$, where the four exemplary settings have been grouped per scheme. Each bar indicates the availability value, whereas the number inside reports the cost associated with that particular setting. The horizontal dashed line represents the “five nines” threshold: a bar that does not reach it means that the pertinent setting does not match the availability requirement.
TABLE II: A selection of 4 exemplary settings (S1, S2, S3, S4) with different distributions of CNRs grouped per scheme and for values of γ ranging from 2 to 5.
Scheme  γ  Setting  P-CSCF     S-CSCF     I-CSCF     HSS
HOM.    2  S1       [2 2 0 0]  [2 2 0 0]  [2 2 0 0]  [2 2 1 0]
HOM.    2  S2       [2 2 0 0]  [2 2 0 0]  [2 2 0 0]  [3 3 0 0]
HOM.    2  S3       [2 2 0 0]  [2 2 0 0]  [2 2 0 0]  [3 4 0 0]
HOM.    2  S4       [2 2 0 0]  [2 2 0 0]  [2 3 0 0]  [3 3 0 0]
HOM.    3  S1       [3 3 0 0]  [3 3 0 0]  [3 3 0 0]  [2 2 2 0]
HOM.    3  S2       [3 3 0 0]  [3 3 0 0]  [3 3 0 0]  [4 4 0 0]
HOM.    3  S3       [3 3 0 0]  [3 3 0 0]  [3 3 0 0]  [4 5 0 0]
HOM.    3  S4       [3 3 0 0]  [3 3 0 0]  [3 4 0 0]  [4 4 0 0]
HOM.    4  S1       [4 4 0 0]  [4 4 0 0]  [4 4 0 0]  [3 3 2 0]
HOM.    4  S2       [4 4 0 0]  [4 4 0 0]  [4 4 0 0]  [5 5 0 0]
HOM.    4  S3       [4 4 0 0]  [4 4 0 0]  [4 4 0 0]  [5 6 0 0]
HOM.    4  S4       [4 4 0 0]  [4 4 0 0]  [4 5 0 0]  [5 5 0 0]
HOM.    5  S1       [5 5 0 0]  [5 5 0 0]  [5 5 0 0]  [3 3 3 0]
HOM.    5  S2       [5 5 0 0]  [5 5 0 0]  [5 5 0 0]  [6 6 0 0]
HOM.    5  S3       [5 5 0 0]  [5 5 0 0]  [5 6 0 0]  [6 6 0 0]
HOM.    5  S4       [5 6 0 0]  [5 5 0 0]  [5 5 0 0]  [6 6 0 0]
Scheme  γ  Setting  P-CSCF     S-CSCF     I,H (CNR sharing)
COL.    2  S1       [2 2 0 0]  [1 1 1 0]  [2,3 2,3 0,0 0,0]
COL.    2  S2       [2 2 0 0]  [2 2 0 0]  [2,3 2,3 0,0 0,0]
COL.    2  S3       [2 2 0 0]  [2 3 0 0]  [2,3 2,3 0,0 0,0]
COL.    2  S4       [2 2 0 0]  [2 2 0 0]  [2,3 2,3 0,0 0,0]
COL.    3  S1       [3 3 0 0]  [3 3 0 0]  [3,2 3,2 0,2 0,0]
COL.    3  S2       [3 3 0 0]  [3 3 0 0]  [3,3 3,3 0,1 0,1]
COL.    3  S3       [3 3 0 0]  [3 3 0 0]  [3,2 3,2 0,2 0,2]
COL.    3  S4       [3 4 0 0]  [3 3 0 0]  [3,3 3,3 0,1 0,1]
COL.    4  S1       [4 4 0 0]  [4 4 0 0]  [2,3 2,3 2,2 0,0]
COL.    4  S2       [4 4 0 0]  [4 4 0 0]  [3,3 3,3 1,2 1,2]
COL.    4  S3       [4 4 0 0]  [4 4 0 0]  [2,3 2,3 2,2 2,2]
COL.    4  S4       [4 5 0 0]  [4 4 0 0]  [3,3 3,3 1,2 1,2]
COL.    5  S1       [5 5 0 0]  [5 5 0 0]  [3,3 3,3 2,3 0,0]
COL.    5  S2       [5 5 0 0]  [2 3 3 3]  [2,3 3,3 3,3 3,0]
COL.    5  S3       [2 3 3 3]  [5 5 0 0]  [2,3 3,3 3,3 3,0]
COL.    5  S4       [5 5 0 0]  [2 3 3 4]  [2,3 3,3 3,3 3,0]
HET.    2  S1       [1 1 1 0]  [2 2 0 0]  [2,3 2,3 0,0 0,0]
HET.    2  S2       [2 2 0 0]  [2 2 0 0]  [2,3 2,3 0,0 0,0]
HET.    2  S3       [2 2 0 0]  [2 3 0 0]  [2,3 2,3 0,0 0,0]
HET.    2  S4       [2 3 0 0]  [2 2 0 0]  [2,3 2,3 0,0 0,0]
HET.    3  S1       [3 3 0 0]  [3 3 0 0]  [2,2 2,2 1,2 0,0]
HET.    3  S2       [3 3 0 0]  [3 3 0 0]  [3,3 3,3 0,1 0,1]
HET.    3  S3       [3 3 0 0]  [3 3 0 0]  [3,1 3,2 0,2 0,3]
HET.    3  S4       [3 3 0 0]  [3 3 3 0]  [2,2 2,2 1,2 0,0]
HET.    4  S1       [4 4 0 0]  [4 4 0 0]  [2,3 2,3 2,2 0,0]
HET.    4  S2       [4 4 0 0]  [2 2 2 2]  [2,3 2,3 2,2 0,0]
HET.    4  S3       [2 2 2 2]  [4 4 0 0]  [2,3 2,3 2,2 0,0]
HET.    4  S4       [4 4 0 0]  [2 2 2 2]  [3,3 1,3 1,2 0,0]
HET.    5  S1       [5 5 0 0]  [5 6 2 0]  [3,3 3,3 2,3 0,0]
HET.    5  S2       [5 5 0 0]  [2 3 3 3]  [3,3 3,3 2,3 0,0]
HET.    5  S3       [2 3 3 3]  [5 5 0 0]  [3,3 3,3 2,3 0,0]
HET.    5  S4       [5 5 0 0]  [2 3 3 4]  [3,3 3,3 2,3 0,0]
[Figure 8: four bar charts of the steady-state availability for settings S1-S4, grouped per scheme (homogeneous, co-located, heterogeneous); the cost E of each setting is printed inside its bar.]
(a) Availability for the case γ=2. Costs E (S1-S4): homogeneous 35, 34, 35, 35; co-located 33, 32, 33, 33; heterogeneous 31, 30, 31, 31.
(b) Availability for the case γ=3. Costs E (S1-S4): homogeneous 42, 42, 43, 43; co-located 40, 44, 44, 45; heterogeneous 37, 42, 42, 42.
(c) Availability for the case γ=4. Costs E (S1-S4): homogeneous 50, 50, 51, 51; co-located 47, 54, 54, 55; heterogeneous 44, 48, 48, 49.
(d) Availability for the case γ=5. Costs E (S1-S4): homogeneous 57, 58, 59, 59; co-located 54, 64, 64, 65; heterogeneous 56, 56, 56, 57.
Fig. 8: Steady-state availability considering 4 exemplary settings (S1, S2, S3, S4) per scheme for: γ = 2, 3, 4, 5.
In order to highlight some unexpected and interesting behaviors, for each case we also report a setting satisfying the “four nines” but not the “five nines” condition. This is the case of $S_1$, whose availability values are 0.999985, 0.999986, and 0.999987 for the homogeneous, co-located, and heterogeneous schemes, respectively.
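To put the gap between these values and the 0.99999 threshold in more tangible terms, steady-state availability can be converted into expected downtime per year. This conversion is a standard back-of-the-envelope computation added here for illustration (assuming a 365-day year); it is not part of the paper's evaluation.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525600, assuming a 365-day year


def downtime_minutes_per_year(availability: float) -> float:
    """Expected yearly downtime implied by a steady-state availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


for label, a in [("five nines", 0.99999),
                 ("S1 homogeneous", 0.999985),
                 ("S1 co-located", 0.999986),
                 ("S1 heterogeneous", 0.999987)]:
    print(f"{label}: {downtime_minutes_per_year(a):.2f} min/year")
```

The five-nines target corresponds to roughly 5.3 minutes of downtime per year, whereas the three $S_1$ settings imply between about 6.8 and 7.9 minutes.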
Let us now focus on the homogeneous scheme: among settings $S_2$, $S_3$, and $S_4$, which barely satisfy the availability constraint, $S_2$ has the lowest cost ($E = 42$), so we elect this setting as the best one. Notice that $S_1$ achieves the same cost as $S_2$, but with a different distribution of containers in the HSS node (see Table II). For the co-located scheme, we also consider $S_2$ as optimal, although the same cost ($E = 44$) is achieved by $S_3$ but at a lower availability level (0.9999911 for $S_3$ vs. 0.9999925 for $S_2$). This notwithstanding, a network designer could more comfortably choose $S_3$, should the uniformity of replica distribution be at a premium (related, perhaps, to deployment flexibility). In such a case, in fact, all CSCF nodes being equal, HSS exhibits distributions [2 2 2 2] for $S_3$ and [3 3 1 1] for $S_2$.
Similar considerations hold true for the heterogeneous scheme where, again, $S_2$ emerges as the setting satisfying the best trade-off between availability and cost.
Now, consider the case $\gamma = 5$, whose availability results are shown in Fig. 8(d). In comparison to the case $\gamma = 3$, two facts emerge. First, the availability values for settings $S_2$, $S_3$, and $S_4$ (those able to guarantee the “five nines”) are almost equal (in each scheme), and are very close to the 0.99999 threshold. Basically, this is related to the need to achieve high availability requirements with a more challenging performance level that, in turn, implies more redundancy at the container level for all considered settings and schemes. Second, the difference in terms of costs between homogeneous and co-located settings becomes more pronounced. This behavior can be explained as follows. For $\gamma = 5$, the system has to manage a greater number of cIMS sessions w.r.t. the case $\gamma = 3$, which in turn implies that we need more containers. When the number of containers grows, the co-located setting is filled more “quickly” than the homogeneous one; thus, more DCK and VM layers are needed, resulting in additional costs.
In conclusion, taking an overall view of the availability results, two aspects should be highlighted. The first aspect concerns the monotonic increase of the cost with $\gamma$, since more resources are needed (in terms of CNRs and/or containers). The second aspect pertains to the choice of a particular scheme among the three considered: according to the performed analysis, in fact, co-located and heterogeneous schemes offer the best trade-off in terms of cost and availability when the performance level is not so high. Basically, this is due to the possibility of arranging containers in a more flexible way, since the homogeneous scheme forces the introduction of a new CNR in case different types of containers have to be deployed.
On the other hand, when a high performance level is required, redundancy at CNR level is needed also for co-located and heterogeneous schemes; thus, the homogeneous arrangement becomes more attractive in terms of cost reduction.
A. Sensitivity analysis
We carry out a sensitivity analysis useful, from the designer’s perspective, to cope with parameter uncertainty. Precisely, we evaluate the effects of drifts from nominal values (see Table I) for six critical parameters: failure and repair times pertaining to container, docker, and virtual machine layers. The results are reported in the panels of Fig. 9, starting from the best settings ($S_2$) derived in the previous analysis for the case $\gamma = 5$. Let us first analyze Fig. 9(a), showing the sensitivity analysis for the container failure time ($1/\lambda_{CNT}$). It is possible to observe that, for the case of the co-located scheme, the failure time can be reduced from its nominal value (500 hours, circled in red) to about 370 hours with no side effects on the availability, since the corresponding curve remains above the horizontal dashed line (five nines limit). For homogeneous and heterogeneous schemes, such analysis reveals a similar behavior (see the zoomed inset), but with more stringent margins, since nominal values can be reduced from 500 hours to not less than 480 hours. However, the improved robustness of the co-located scheme is paid for with a higher cost (see Fig. 8(d)).
Similar arguments hold for the container repair time sensitivity as shown in Fig. 9(b). In fact, the nominal value of $1/\mu_{CNT}$ can be relaxed from 2 seconds to about 2.5 seconds for the co-located scheme, and to about 2.15 seconds for homogeneous and heterogeneous schemes.
Similar results come from Figs. 9(c) and 9(d) for docker failure and repair times, respectively. On one hand, we can see that the nominal value of $1/\lambda_{DCK}$ can be decreased from 1000 hours to about 640 hours (co-located) or to about 900 hours (homogeneous and heterogeneous cases). On the other hand, the nominal value of $1/\mu_{DCK}$ can be relaxed from 5 seconds to about 9 seconds for the co-located scheme, and to about 5.6 seconds in case of homogeneous and heterogeneous schemes. Also in this case, the improved robustness to deviation of docker parameters is paid in terms of a higher cost of deployment.
Finally, we analyze the sensitivity for VM failure and repair times as reported in Figs. 9(e) and 9(f), respectively. The nominal value of $1/\lambda_{VM}$ can be diminished from 2880 hours to 2870 hours (co-located) and to 2878 hours (homogeneous/heterogeneous) with no side effects on the high availability requirement. On the contrary, the margin for the parameter $1/\mu_{VM}$ is even more stringent: it can be relaxed from 1 hour to 1 hour and 1 second (co-located) and to 1 hour and 7 seconds (homogeneous/heterogeneous).
In summary, the sensitivity analysis reveals that the robust-
ness of the whole cIMS is influenced by two factors: i) the
robustness of individual layers, that, for some cases (CNT,
DCK) exhibits a reasonable margin, whereas in other cases
(VM) is practically not amenable to any significant deviation;
ii) the type of deployment where, typically, the co-located
scheme offers more room for manoeuvre.
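The kind of margin reported above (how far a parameter can drift before the five-nines line is crossed) can be located numerically with a bisection on the parameter of interest. The sketch below does this on a deliberately simplified toy model, not on the paper's SRN: a node is assumed up when at least k out of n independent, identical replicas are up, each with availability MTTF/(MTTF+MTTR). All names and the numeric values (k = 5, n = 6, MTTR = 1 hour) are hypothetical.

```python
from math import comb


def k_out_of_n(a: float, k: int, n: int) -> float:
    """Probability that at least k of n independent replicas are up."""
    return sum(comb(n, m) * a**m * (1.0 - a)**(n - m) for m in range(k, n + 1))


def toy_availability(mttf_h: float, mttr_h: float, k: int, n: int) -> float:
    """Toy node availability; ignores the layer dependencies of the SRN."""
    a = mttf_h / (mttf_h + mttr_h)
    return k_out_of_n(a, k, n)


def min_mttf(mttr_h: float, k: int, n: int, a0: float,
             lo: float, hi: float, tol: float = 1e-6) -> float:
    """Smallest MTTF in [lo, hi] keeping availability >= a0 (bisection).

    Relies on the toy availability being increasing in the MTTF.
    """
    if not (toy_availability(lo, mttr_h, k, n) < a0 <= toy_availability(hi, mttr_h, k, n)):
        raise ValueError("threshold is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if toy_availability(mid, mttr_h, k, n) >= a0:
            hi = mid
        else:
            lo = mid
    return hi


threshold = min_mttf(mttr_h=1.0, k=5, n=6, a0=0.99999, lo=100.0, hi=5000.0)
```

The same loop could be wrapped around any availability evaluator, including an external SRN solver, as long as the availability is monotonic in the swept parameter over the search interval.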
[Figure 9: six line plots, panels (a)-(f), of the overall cIMS steady-state availability versus one parameter at a time for the homogeneous, co-located, and heterogeneous schemes, each with a zoomed inset around the nominal value.]
Fig. 9: Influence on the overall cIMS infrastructure (case γ=5) of: container failure time (a), container repair time (b), docker failure time (c), docker repair time (d), virtual machine failure time (e), virtual machine repair time (f). Nominal values (reported in Table I) are circled in red.
VIII. CONCLUSIONS
This work represents, to our knowledge, the first attempt at a performability assessment of a container-based IP Multimedia Subsystem (cIMS), a particular realization of a Service Function Chain. We adopt a two-level hierarchy approach to model cIMS by invoking two frameworks: Reliability Block Diagram (RBD) and Stochastic Reward Networks (SRN). The former captures high-level interconnections among cIMS nodes; the latter characterizes the probabilistic behavior of each node in terms of failure and repair events. First, we set up an experimental testbed relying on a containerized IMS platform (Clearwater) to derive a performance benchmark in terms of maximum IMS sessions simultaneously supported.
Then, we derive an optimal set of cIMS deployments (organized in a taxonomy including homogeneous/co-located/heterogeneous schemes) satisfying the “five nines” availability requirement at minimum cost, for a given performance demand. The analysis is supported by TimeNET (a well-established tool for SRN evaluation), and by an algorithm (OptChains+) which makes it possible to: i) automatically build and evaluate SRN models by starting from specs of single cIMS nodes, ii) automatically build and evaluate RBD models for the high-level composition of cIMS settings, iii) assign/compute costs and extract feasible settings.
Numerical results suggest that, when the performance level
is not particularly demanding, heterogeneous and co-located
schemes offer a better availability/cost trade-off than the
homogeneous setting. The latter, instead, is more suitable for
high performance levels. The assessment is enriched by a
sensitivity analysis that evaluates the robustness of the cIMS
architecture when some critical design parameters deviate
from their nominal values. We plan to extend
the analysis to other cIMS deployments that, for instance,
might include: co-location of multiple network nodes, more
sophisticated interconnections among the involved elements, and
time-varying load requirements as demanded by contemporary
Service Level Agreements. Other hints for future research
stem from the consideration that the method presented here
can be adapted, for the benefit of service management
organizations, to other infrastructures exhibiting a chained
arrangement. Among modern networking systems, examples include:
virtualized EPC nodes able to intervene in a service chain
thanks to the SDN paradigm; dedicated SFCs (virtualized
or containerized) composed, for instance, of firewalls, load
balancers, and IDSs built as chained resources in virtual data
centers; virtualized Mobility Management Entities (vMMEs)
whose signaling flow is organized in a chained fashion.
REFERENCES
[1] G. Davoli, W. Cerroni, C. Contoli, F. Foresta, F. Callegati, “Implementation
of service function chaining control plane through OpenFlow,” in
2017 IEEE Conference on Network Function Virtualization and Software
Defined Networks, pp. 1–4, 2017.
[2] D. Borsatti, W. Cerroni, G. Davoli, F. Callegati, “Intent-based Service
Function Chaining on ETSI NFV Platforms,” in 2019 IEEE Conference
on Networks of the Future, pp. 144–146, 2019.
[3] Google cloud platform - container engine. Available online:
https://cloud.google.com/container-engine/, accessed: 2020-04-28.
[4] Amazon EC2. Available online: https://aws.amazon.com/ecs, accessed:
2020-04-28.
[5] ETSI Tech. Spec. 124 173 V15.2.0 (2018-09). Available online:
https://www.etsi.org/deliver/etsi_ts/124100_124199/124173/15.02.00_60/ts_124173v150200p.pdf, accessed: 2020-04-28.
[6] Ericsson Tech. Rep., “Real-time interaction in 5G –
A use case example from the health care industry,”
2019 [Online]. Available: https://www.ericsson.com/4a44a9/assets/local/digital-services/offerings/voice-services/health-care-case-real-time-interaction-in-5g-with-ims-data-channel.pdf, accessed: 2019-08-01.
[7] Huawei Tech. Rep., “Vo5G Technical White Paper,” 2018
[Online]. Available: https://www.huawei.com/it/industry-insights/technology/vo5g-technical-white-paper, accessed: 2019-08-01.
[8] J. Sun, G. Zhu, G. Sun, D. Liao, Y. Li, A.K. Sangaiah, M. Ra-
machandran, and V. Chang, “A Reliability-Aware Approach for Resource
Efficient Virtual Network Function Deployment,” IEEE Access, vol. 6,
pp. 18238–18250, 2018.
[9] A.S. Sendi, Y. Jarraya, M. Pourzandi, and M. Cheriet, “Efficient Provi-
sioning of Security Service Function Chaining Using Network Security
Defense Patterns,” IEEE Trans. Serv. Comput., vol. PP, no. 99, pp. 1–1,
2016.
[10] The Clearwater Project. Available online: http://www.projectclearwater.org/,
accessed: 2020-04-28.
[11] D. Cotroneo, R. Natella, and S. Rosiello, “NFV-throttle: An overload
control framework for network function virtualization,” IEEE Trans.
Netw. Service Manag., vol. 14, no. 4, pp. 949–963, 2017.
[12] D. Cotroneo, L. De Simone and R. Natella, “NFV-Bench: A Depend-
ability Benchmark for Network Function Virtualization Systems,” IEEE
Trans. Netw. Service Manag., vol. 14, no. 4, pp. 934–948, 2017.
[13] M. Di Mauro, A. Liotta, “Statistical Assessment of IP Multimedia Sub-
system in a Softwarized Environment: A Queueing Networks Approach,”
IEEE Trans. Netw. Service Manag., vol. 16, no. 4, pp. 1493–1506, 2019.
[14] C. Negus, W. Henry, Docker Containers. Prentice-Hall, 1 ed., 2015.
[15] Y. Zhang, Network Function Virtualization: Concepts and Applicability
in 5G Networks (cap.2, par. 2.2.3). Hoboken (NJ), Wiley-IEEE Press,
Inc., 1 ed., 2018.
[16] R. Cziva, and D.P. Pezaros, “Container Network Functions: Bringing
NFV to the Network Edge,” IEEE Commun. Mag., vol. 55, no. 6, pp. 24–
31, 2017.
[17] Recommendation E.800. Available online:
https://www.itu.int/rec/T-REC-E.800-200809-I.
[18] H. Chantre, and N.L.S. Fonseca, “Reliable Broadcasting in 5G NFV-
Based Networks,” IEEE Commun. Mag., vol. 56, no. 3, pp. 218–224,
2018.
[19] NEC Corporation, “NEC Virtualized Evolved Packet Core - vEPC,”
2014 [Online]. Available: https://networkbuilders.intel.com/docs/vEPC_white_paper_w.cover_final.pdf, accessed: 2019-08-01.
[20] Ericsson Review, “Virtualizing network services - the telecom cloud,”
2014 [Online]. Available: https://www.ericsson.com/assets/local/publications/ericsson-technology-review/docs/2014/er-telecom-cloud.pdf, accessed: 2019-08-01.
[21] H. Jin, Y. Jin, H. Lu, C. Zhao and M. Peng, “NFV and SFC: A Case
Study of Optimization for Virtual Mobility Management,” IEEE J. Sel.
Area Comm., vol. 36, no. 10, pp. 2318–2332, 2018.
[22] R. Ghosh, F. Longo, F. Frattini, S. Russo, and K. S. Trivedi, “Scalable
analytics for IaaS cloud availability,” IEEE Trans. Cloud Comput., vol. 2,
no. 1, pp. 57–70, 2014.
[23] B.R. Haverkort, R. Marie, G. Rubino, and K.S. Trivedi, Performability
modelling techniques and tools. Chichester(UK), John Wiley and Sons,
Ltd., 2001.
[24] K. Nagaraja, G. Gama, R. Bianchini, R.P. Martin, W. Meira, and
T.D. Nguyen, “Quantifying the performability of cluster-based services,”
IEEE Trans. Parallel Distrib. Syst., vol. 16, no. 5, pp. 456–467, 2005.
[25] R. Matos, J. Dantas, J. Araujo, K.S. Trivedi, and P. Maciel, “Redundant
Eucalyptus private clouds: Availability modeling and sensitivity analy-
sis,” Journal of Grid Computing, vol. 15, no. 1, pp. 1–22, 2017.
[26] M. C. Bezerra, R. Melo, J. Dantas, P. Maciel and F. Vieira, “Availability
modeling and analysis of a VoD service for eucalyptus platform,” in
2014 IEEE International Conference on Systems, Man, and Cybernetics,
pp. 3779–3784, 2014.
[27] Z. Hong, M. Shi, Y. Wang, “CTMC-Based Availability Analysis of Multiple
Cluster Systems with Common Mode Failure,” in 2016 IEEE Conference
on Applied Computing and Information Technology, pp. 394–396,
2016.
[28] W. Li, A. Kanso, “Comparing Containers versus Virtual Machines for
Achieving High Availability,” in 2015 IEEE International Conference
on Cloud Engineering, pp. 353–358, 2015.
[29] D. Bruneo, “A Stochastic Model to Investigate Data Center Performance
and QoS in IaaS Cloud Computing Systems,” IEEE Trans. Parallel
Distrib. Syst., vol. 25, no. 3, pp. 560–569, 2014.
[30] J. Fan, C. Guan, Y. Zhao, and C. Qiao, “Availability-aware mapping of
service function chains,” in IEEE INFOCOM 2017 - IEEE Conference
on Computer Communications, pp. 1–9, 2017.
[31] J. Liu, Z. Jiang, N. Kato, O. Akashi, and A. Takahara, “Reliability
evaluation for NFV deployment of future mobile broadband networks,”
IEEE Wireless Commun., vol. 23, no. 3, pp. 90–96, 2016.
[32] J. Kong, I. Kim, X. Wang, Q. Zhang, H. C. Cankaya, W. Xie, T. Ikeuchi,
and J. P. Jue, “Guaranteed-availability Network Function Virtualization
with Network Protection and VNF replication,” in GLOBECOM 2017,
pp. 1–6, 2017.
[33] S. Sebastio, R. Ghosh, and T. Mukherjee, “An availability analysis
approach for deployment configurations of containers,” IEEE Trans.
Serv. Comput., vol. PP, no. 99, pp. 1–1, 2018.
[34] E. Andrade, B. Nogueira, R. Matos, G. Callou, and P. Maciel, “Availabil-
ity modeling and analysis of a disaster-recovery-as-a-service solution,”
Computing, vol. 99, no. 10, pp. 929–954, 2017.
[35] M. Di Mauro, M. Longo, and F. Postiglione, “Availability Evaluation
of Multi-tenant Service Function Chaining Infrastructures by Multidi-
mensional Universal Generating Function,” IEEE Trans. Serv. Comput.,
DOI: 10.1109/TSC.2018.2885748, 2018.
[36] M. Di Mauro, G. Galatro, M. Longo, F. Postiglione, and M. Tambasco,
“IP Multimedia Subsystem in an NFV environment: availability evaluation
and sensitivity analysis,” in IEEE NFV-SDN Conference, Verona,
Italy, Nov. 2018.
[37] G. Camarillo and M.A. Garcia-Martin, The 3G IP Multimedia Subsystem.
New York, John Wiley and Sons, Inc., 3rd ed., 2008.
[38] Docker. Available online: https://www.docker.com, accessed:
2020-04-28.
[39] CoreOS. Available online: https://coreos.com/rkt/, accessed: 2020-04-28.
[40] OpenVZ. Available online: https://openvz.org, accessed: 2020-04-28.
[41] T. Combe, A. Martin, and R. Di Pietro, “To Docker or Not to Docker: A
Security Perspective,” IEEE Cloud Computing, vol. 3, no. 5, pp. 54–62,
2016.
[42] Amazon AWS Lambda. Available online: https://aws.amazon.com/lambda/,
accessed: 2020-04-28.
[43] S.I. Ahson, IP Multimedia Subsystem (IMS) Handbook. Broken Sound
Parkway (NW), CRC Press, 2008.
[44] J.K. Muppala, G. Ciardo, and K.S. Trivedi, “Stochastic Reward Nets for
Reliability Prediction,” in Communications in Reliability, Maintainabil-
ity and Serviceability, pp. 9–20, 1994.
[45] A. Reibman, R. Smith, and K.S. Trivedi, “Markov and Markov
reward model transient analysis: An overview of numerical approaches,”
Europ. Journ. of Oper. Res., vol. 40, no. 2, pp. 257–267, 1989.
[46] R. German, C. Kelling, A. Zimmermann, and G. Hommel, “TimeNET: a
toolkit for evaluating non-Markovian stochastic Petri nets,” Performance
Evaluation, vol. 24, no. 1-2, pp. 69–87, 1995.
[47] Tonse Telecom, “The LTE Data Storm in the Core of Your Network”,
White Paper, Jan. 2013.
Mario Di Mauro (Laurea in Electronics Engineering,
Univ. of Salerno (Italy), 2005; MS in Networking,
Telecom Italia Learning Centre, 2006; PhD
degree in Information Engineering, Univ. of Salerno,
2018). He was a Research Engineer with CoRiTeL
(Research Consortium on Telecommunications, led
by Ericsson Italy) and then a Research Fellow with
Univ. of Salerno. His main fields of interest include
network availability, network security, and data analysis
for telecommunication infrastructures.
Giovanni Galatro received the Laurea degree
(summa cum laude) in information engineering from
the University of Salerno (Italy) in 2018, and has
been a visiting student at the Dept. of Computer Science
at Groningen University (Netherlands). In 2017 he
was awarded a scholarship with the Telecommunication and
Applied Statistics groups, focused on the availability
analysis of modern telco infrastructures.
Maurizio Longo (Laurea in Electronics Engineering,
Univ. of Napoli (Italy), 1972; MSEE, Stanford
Univ., CA, 1977) retired in 2018 from the Univ. of
Salerno (Italy) as Full Professor of Telecommunica-
tions. In this university he also served as Department
Dean, as the Chairman of the Graduate School of
Information Engineering, and as the Director of the
CoRiTeL (Research Consortium on Telecommuni-
cations) Lab. He held academic positions also with
the Univ. Federico II (Napoli), the Parthenope Univ.
(Napoli), the Univ. of Lecce and the Aeronautical
Academy. He has authored over 180 papers in international journals and
conference proceedings, mainly in the fields of telecommunication networks.
Fabio Postiglione is currently an Assistant Professor
of Applied Statistics at Univ. of Salerno (Italy). He
received his Laurea degree (summa cum laude) in
Electrical Engineering and his Ph.D. degree in Infor-
mation Engineering from Univ. of Salerno in 1999
and 2005, respectively. His main research interests
include degradation analysis, lifetime estimation,
reliability and availability evaluation of complex
systems (telecommunication networks, fuel cells),
Bayesian statistics and data analysis.
Marco Tambasco received his Master’s degree in
Electronic Engineering from Univ. of Salerno in
2010. He then joined CoRiTeL (Research Consortium
on Telecommunications). His research interests
include network analysis and design, availability
and security of cloud-based telecommunication systems,
and Network Function Virtualization (NFV) and
Software Defined Networking (SDN) prototyping.