All content in this area was uploaded by Abbas Heydarnoori on Oct 23, 2014
Reliable Deployment of Component-based Applications into
Distributed Environments
Abbas Heydarnoori Farhad Mavaddat
University of Waterloo,
Waterloo, ON,
Canada, N2L 3G1
Abstract
The software deployment process is a sequence of related
activities for installing an already developed applica-
tion into its target environment, and bringing it into an
executing state. For complex component-based applica-
tions that should be deployed into a large distributed en-
vironment, several deployment configurations are typ-
ically possible. These deployment configurations can
have significant impacts on the application’s quality of
service properties such as reliability. In distributed sys-
tems, the reliability of the application is highly depen-
dent on the reliability of its network, and network fail-
ures can have adverse effects on the application’s reli-
ability. Thus, one possible way to increase the relia-
bility of a distributed component-based application is to
deploy it so that the communications among its components
are kept as local as possible. In this paper, a
graph-based deployment planning approach is proposed
for this purpose.
Keywords: Software deployment, Software reliability,
Component-based applications, Distributed systems,
Communication channels.
1 Introduction
With significant advances in software development
technologies, it is possible to have complex software
systems that consist of a large number of heteroge-
neous components distributed over many hosts with
different hardware and software characteristics. How-
ever, in these applications, different components of the
application may have various hardware and software
requirements, and hence they may provide their de-
sired functionality only when these requirements are
satisfied. Consequently, after the development of an
application, a sequence of related activities must be
done to place its components into the suitable hosts in
the distributed environment, and to make the appli-
cation available for use. This sequence of activities is
referred to as the software deployment process, and
comprises the following activities: acquiring the developed
application from its producer; planning where and how
different components of the application should be in-
stalled in the target environment, resulting in a de-
ployment plan; installing the application into its target
environment according to its deployment plan; config-
uring it; and finally executing it.
For complex application deployments in which many
components must be deployed onto many hosts in a
distributed environment, several deployment configurations
are usually possible. In a given context, some
of these deployment configurations are obviously more
effective than others in terms of some quality of ser-
vice (QoS) characteristics such as reliability, perfor-
mance, dependability, and so on. The main aim of
the present work is to maximize the reliability, defined
as the probability of failure-free software operation for
a specified period of time in a specified environment
[1]. In the context of distributed environments, one
potential problem is network failures. In these environ-
ments, connectivity losses can lead to disastrous effects
on the system’s reliability, and the software application
may not provide its desired functionality. To reduce
the risks of this problem, one solution is to make the
communications among the application components as
local as possible. In this way, components located in
the same host can communicate regardless of
the network’s status. Thus, from this perspective, the
most reliable deployment configuration can be defined
as one with the least amount of communications among
the hosts in the distributed environment. From another
point of view, this can also be seen as a performance
improvement, because in distributed environments network
communication imposes overhead on the software
application. Thus, reduced communication among hosts can
result in higher performance.
In this paper, a graph-based approach is pre-
sented for planning the deployment of channel-based
component-based applications into peer-to-peer distributed
environments (e.g., the Internet) so that the reliability
of the application is maximized. A channel is a
peer-to-peer communication medium with well-defined
characteristics and behavior [2]. Examples of channel-
based models are Reo [2], IWIM [3], and Manifold [4].
In a peer-to-peer architecture, two or more comput-
ers (called nodes) can directly communicate with each
other, without requiring any intermediary devices [5].
In contrast to the client/server architecture, in a peer-
to-peer architecture, nodes have equivalent responsibil-
ities, enabling applications that focus on collaboration
and communication. Therefore, the deployment plan-
ning approach presented in this paper does the plan-
ning with respect to the various channel types (or im-
plementations) required by different components of the
application, and various channel types (or implemen-
tations) that different hosts in the target environment
can support.
This paper is organized as follows. Section 2 presents
the user-specified inputs that are used in reliable
deployment planning. Then, Section 3 describes
how these inputs are used to do the actual reliable de-
ployment planning with the help of graph theory algo-
rithms. Finally, Section 4 presents concluding remarks
and outlines future work.
2 Deployment Planning Inputs
To generate deployment plans, the following inputs
should be specified by the user: (1) the software appli-
cation being deployed, (2) the distributed environment
in which the application will be deployed, and (3) the
user-defined constraints regarding this deployment. The
following subsections describe these inputs.
2.1 Specification of the Software Application being Deployed
This input specifies the software application being
deployed into the target environment. In the view of
this paper, a software application comprises a num-
ber of software components connected by a number of
channels with different characteristics. The nature of
these software components is irrelevant to this specification;
they are treated as black-box software entities
that read data from their input ports and write data
to their output ports. Thus, they can be processes,
Web services, Java beans, CORBA components, and so
on. Communication among these black-box software
entities takes place via the channels among them. These
channels can have different types or implementations.
Figure 1. A sample application graph. The $C_i$s represent application components. The $T_d$s represent different channel types among them.
For example, channel types can be synchronous, asyn-
chronous, or FIFO. Examples of channel implementa-
tions are simple message passing, encrypted message
passing, or shared memory. It is also possible
to model the primitives of other communication mod-
els (such as message passing, shared spaces, or remote
procedure calls) by the channel-based communication
model [2]. Thus, other kinds of component-based
applications can also be seen as channel-based
component-based applications. Given
this discussion, a component-based application can be
modeled as a graph. The nodes of this graph repre-
sent the components of the application. The edges of
this graph represent the channels among the applica-
tion components.
Definition 2.1 (Application Graph) Suppose the $C_i$s
represent different components of the application, and
the $T_d$s represent different channel types. Then, the
application graph $AG = (V_{AG}, E_{AG})$ is defined as a graph on
$V_{AG} = \{C_1, C_2, \ldots, C_n\}$ in which each edge $e \in E_{AG}$
has a label $l_e \in \{T_1, T_2, \ldots, T_k\}$.
For example, Fig. 1 shows a sample application
graph for a hypothetical software system consisting of six
components $C_1$ through $C_6$ connected by channels of
types $T_1$ through $T_7$.
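As a minimal sketch of Definition 2.1, an application graph can be stored as a mapping from unordered component pairs to channel-type labels. The edge set below is hypothetical, since the edges of Fig. 1 are not listed in the text; only the names $C_1$ to $C_6$ and $T_1$ to $T_7$, and the fact that $C_5$ uses $T_5$ and $T_6$, follow the paper. The helper names are likewise illustrative.

```python
# Hypothetical edge set: the real edges of Fig. 1 are not given in the text.
# Only the naming (C1..C6, T1..T7) and C5's channel types (T5, T6) follow the paper.
APP_EDGES = {
    frozenset({"C1", "C2"}): "T1",
    frozenset({"C1", "C4"}): "T2",
    frozenset({"C2", "C3"}): "T3",
    frozenset({"C3", "C4"}): "T4",
    frozenset({"C3", "C5"}): "T5",
    frozenset({"C5", "C6"}): "T6",
    frozenset({"C4", "C6"}): "T7",
}


def components(app_edges):
    """V_AG: every component that appears as an endpoint of some edge."""
    return set().union(*app_edges)


def channel_types_of(app_edges, component):
    """Labels l_e of all edges e incident to `component`."""
    return {label for edge, label in app_edges.items() if component in edge}
```

This representation allows at most one channel per pair of components; a list of (pair, label) tuples would be needed if parallel channels had to be modeled.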
2.2 Specification of the Target Environment
In this paper, the target environment for the deployment
of the application is a peer-to-peer distributed
environment consisting of a number of hosts with
computational capabilities (e.g., PCs, laptops, servers, etc.).
Figure 2. A sample distributed environment consisting of four hosts $H_1$ through $H_4$ supporting different channel types $T_1$ through $T_7$.
Furthermore, the required software for the communication
among the application components has already been
installed on them. However, since different hosts
may have different hardware properties, they may not
be able to support some features of the communication
software installed on them, or it may even be impossible
to install some kinds of communication software on
them. It is also possible that different features or versions
of the communication software have been installed on
different hosts intentionally (e.g., for reasons of cost
or security). Consequently, the
available hosts in the target environment might
be able to support different sets of channel types (or
implementations). As an example, Fig. 2 shows
a sample peer-to-peer distributed environment consisting
of four hosts $H_1$ through $H_4$, each of which can support
a different subset of the channel types $T_1$ through $T_7$. To create
a channel of type $T_d$ between two not necessarily distinct
hosts $H_x$ and $H_y$ in this environment, both of
them must be able to support that channel type $T_d$.
2.3 Specification of the User-defined Constraints
Users may have special constraints regarding the
placement of the application components that should
be taken into account during the deployment planning.
For example, suppose users want the application
presented in Fig. 1 to be deployed into the target
environment presented in Fig. 2 so that component
$C_5$ runs on either $H_1$ or $H_2$.
Component Name    Candidate Hosts
$C_1$             $H_1$, $H_2$
$C_2$             $H_1$, $H_2$, $H_3$
$C_3$             $H_2$, $H_4$
$C_4$             $H_1$, $H_3$, $H_4$
$C_5$             $H_2$
$C_6$             $H_3$, $H_4$
Table 1. Candidate hosts for the deployment
of the application components presented in
Fig. 1 into the target environment presented
in Fig. 2.
3 Deployment Planning
In this section, the inputs specified in Section 2 are
used to do the actual deployment planning. First,
however, we introduce the concept of a candidate host,
which is used in the rest of this paper.
Definition 3.1 (Candidate Host) For the given
application graph $AG = (V_{AG}, E_{AG})$ and set of channel
types $T = \{T_1, T_2, \ldots, T_k\}$, suppose
$T_{C_i} = \{T_d \mid T_d \in T, \exists \{C_i, C_j\} \in E_{AG} : l_{\{C_i, C_j\}} = T_d\}$
denotes all channel types connected to the component $C_i$ in the
application graph $AG$, and $T_{H_x}$ denotes the set of channel
types that host $H_x$ can support. Then, host $H_x$ is a
candidate host for the deployment of component $C_i$ only if
(1) $T_{C_i} \subseteq T_{H_x}$, and (2) host $H_x$ satisfies the user-defined
constraints regarding the deployment of component $C_i$.
For example, Table 1 shows the candidate hosts for
deploying the components of the application presented
in Fig. 1 to the target environment presented in Fig.
2. For a more specific example, consider component
$C_5$. In the application graph presented in Fig. 1, $C_5$ is
connected to the channel types $T_5$ and $T_6$. By looking
at the channel types that each of the hosts in the target
environment can support, we see that hosts $H_2$, $H_3$,
and $H_4$ can support component $C_5$'s required channel
types. But, as mentioned in Section 2.3, users want $C_5$
to be deployed on either host $H_1$ or $H_2$. Thus, with
respect to this constraint, the only candidate host for
the deployment of component $C_5$ is $H_2$.
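The candidate-host test of Definition 3.1 can be sketched as a subset check followed by a constraint filter. The function name `candidate_hosts` and the capability sets in `SUPPORTED` are assumptions made for illustration: Fig. 2 is not reproduced in the text, so the sets are chosen only so that $H_2$, $H_3$, and $H_4$ support $T_5$ and $T_6$, as the $C_5$ example states.

```python
def candidate_hosts(required, supported, allowed=None):
    """Hosts H_x with T_Ci a subset of T_Hx that also satisfy the user constraint.

    required:  set of channel types connected to the component (T_Ci)
    supported: dict mapping host -> set of channel types it supports (T_Hx)
    allowed:   optional set of hosts permitted by the user; None = no constraint
    """
    return {
        host
        for host, types in supported.items()
        if required <= types and (allowed is None or host in allowed)
    }


# Hypothetical capability sets (Fig. 2 is not reproduced in the text).
SUPPORTED = {
    "H1": {"T1", "T2", "T5"},
    "H2": {"T1", "T3", "T5", "T6"},
    "H3": {"T4", "T5", "T6", "T7"},
    "H4": {"T2", "T5", "T6", "T7"},
}

# Component C5 needs T5 and T6; the user restricts it to H1 or H2.
without_constraint = candidate_hosts({"T5", "T6"}, SUPPORTED)
with_constraint = candidate_hosts({"T5", "T6"}, SUPPORTED, allowed={"H1", "H2"})
```

Under these assumed capability sets, the unconstrained check yields the three hosts named in the example, and the user constraint narrows them to $H_2$.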
One naive way of finding the most reliable
deployment configuration is to generate all possible
deployment configurations by taking every combination of
candidate hosts for the different components of the
application, and then to select the deployment configuration
with the greatest number of local channels among the
application components (or, equivalently, the least number
of channels among the hosts). However, the number of possible
deployment configurations grows exponentially with the number
of components, so this approach is impractical for large
applications, and more efficient algorithms and heuristics
are needed. The following section shows how the problem can
be solved approximately in polynomial time.
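The exhaustive search described above can be sketched as follows. The candidate sets are those of Table 1, which give $2 \times 3 \times 2 \times 3 \times 1 \times 2 = 72$ configurations; the channel list `EDGES` is hypothetical because the edges of Fig. 1 are not listed in the text, and the function names are illustrative.

```python
from itertools import product

# Candidate hosts taken from Table 1.
CANDIDATES = {
    "C1": ["H1", "H2"],
    "C2": ["H1", "H2", "H3"],
    "C3": ["H2", "H4"],
    "C4": ["H1", "H3", "H4"],
    "C5": ["H2"],
    "C6": ["H3", "H4"],
}

# Hypothetical channels (the edges of Fig. 1 are not listed in the text).
EDGES = [("C1", "C2"), ("C1", "C4"), ("C2", "C3"), ("C3", "C4"),
         ("C3", "C5"), ("C5", "C6"), ("C4", "C6")]


def remote_channels(deployment, app_edges):
    """Number of application edges whose endpoints sit on different hosts."""
    return sum(1 for a, b in app_edges if deployment[a] != deployment[b])


def brute_force(candidates, app_edges):
    """Try every combination of candidate hosts; return (best, number tried)."""
    names = list(candidates)
    best, best_cost, tried = None, None, 0
    for hosts in product(*(candidates[c] for c in names)):
        tried += 1
        deployment = dict(zip(names, hosts))
        cost = remote_channels(deployment, app_edges)
        if best_cost is None or cost < best_cost:
            best, best_cost = deployment, cost
    return best, tried


best, tried = brute_force(CANDIDATES, EDGES)
```

The loop is exponential in the number of components, which is why it serves only as a baseline for small instances.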
3.1 Using Multiway Cut Problem in Reliable Deployment Planning
In this section, we look for an efficient algorithm
for solving this problem in polynomial time.
For this purpose, we show that the reliable deployment
problem corresponds to the multiway cut problem in
graph theory [6].
Definition 3.2 (Multiway Cut Problem) Let $G = (V, E)$
be an undirected graph on $V = \{v_1, v_2, \ldots, v_n\}$ in
which each edge $e \in E$ has a non-negative weight $w(e)$,
and let $T = \{t_1, t_2, \ldots, t_m\} \subseteq V$ be a set of terminals.
Multiway cut is the problem of finding a set of edges
$E' \subseteq E$ such that the removal of $E'$ from $E$ disconnects
each terminal from all other terminals, and the solution
cost $MC = \sum_{e \in E'} w(e)$ is also minimized.
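To make the two conditions of Definition 3.2 concrete, the following sketch checks whether a given edge set separates all terminals and computes its cost. It is only a feasibility checker, not a solver; the function names and the edge representation (a dict from `frozenset({u, v})` to weight) are assumptions made for illustration.

```python
def is_multiway_cut(weights, terminals, removed):
    """True if deleting `removed` leaves every terminal in its own component."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for edge in weights:
        if edge not in removed:
            u, v = tuple(edge)
            parent[find(u)] = find(v)  # merge the two endpoints
    roots = [find(t) for t in terminals]
    return len(set(roots)) == len(roots)


def cut_cost(weights, removed):
    """Solution cost: the sum of w(e) over the removed edges."""
    return sum(weights[e] for e in removed)


# Toy graph: a hub x joined to three terminals A, B, C.
TOY = {
    frozenset({"A", "x"}): 3,
    frozenset({"x", "B"}): 1,
    frozenset({"x", "C"}): 1,
}
GOOD = {frozenset({"x", "B"}), frozenset({"x", "C"})}
BAD = {frozenset({"x", "B"})}
```

In the toy graph, removing the two light edges separates all three terminals at cost 2, whereas removing only one of them leaves A and C connected through x.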
Suppose $AG = (V_{AG}, E_{AG})$ is the application graph
of the software application being deployed,
$V_{TG} = \{H_1, H_2, \ldots, H_m\}$ represents the set of available hosts
in the target environment, and $CH_{C_i}$ represents the
set of candidate hosts for the deployment of component
$C_i$. To solve the reliable deployment problem, a
graph $G = (V, E)$ is built in the following way:
• $V = V_{AG} \cup V_{TG}$.
• $E = E_{AG} \cup E_H$, where $E_H = \{\{C_i, H_x\} \mid C_i \in V_{AG}, H_x \in CH_{C_i}\}$.
• $w(e) = \begin{cases} 1 & e \in E_{AG} \\ n^2 & e \in E_H \end{cases}$. Here $n^2$ serves as a large
number, where $n$ is the number of application components.
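The construction above can be sketched as a function that returns the weighted edge set and the terminals. The candidate sets come from Table 1; the application edges in `EDGES` are hypothetical because Fig. 1 is not reproduced in the text, and the name `build_cut_graph` is illustrative. Hosts that are not a candidate for any component would be isolated vertices and are omitted here, which does not affect the cut.

```python
def build_cut_graph(app_edges, candidates):
    """Return (weights, terminals) for the graph G = (V, E) described above.

    app_edges:  iterable of (C_i, C_j) pairs, the edges E_AG
    candidates: dict mapping component -> iterable of candidate hosts (CH_Ci)
    """
    n = len(candidates)  # number of application components
    weights = {frozenset(edge): 1 for edge in app_edges}      # E_AG, weight 1
    for component, hosts in candidates.items():
        for host in hosts:
            weights[frozenset({component, host})] = n * n    # E_H, weight n^2
    terminals = sorted({h for hosts in candidates.values() for h in hosts})
    return weights, terminals


# Candidate hosts taken from Table 1.
CANDIDATES = {
    "C1": ["H1", "H2"],
    "C2": ["H1", "H2", "H3"],
    "C3": ["H2", "H4"],
    "C4": ["H1", "H3", "H4"],
    "C5": ["H2"],
    "C6": ["H3", "H4"],
}
# Hypothetical channels (the edges of Fig. 1 are not listed in the text).
EDGES = [("C1", "C2"), ("C1", "C4"), ("C2", "C3"), ("C3", "C4"),
         ("C3", "C5"), ("C5", "C6"), ("C4", "C6")]

WEIGHTS, TERMINALS = build_cut_graph(EDGES, CANDIDATES)
```

With the six components of Table 1, every component-to-host edge receives weight $6^2 = 36$, which exceeds the total weight of all application edges in any simple graph on six components.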
Fig. 3 shows an example of a graph developed in
this way for the application graph presented in Fig. 1,
and the target environment presented in Fig. 2. In this
graph, if we set hosts as the terminals of the multiway
cut problem, we prove in the following theorem that the
solution of the multiway cut problem is the solution of
the reliable deployment problem we intend to solve.
Theorem 3.1 Suppose graph $G = (V, E)$ is built in
the way mentioned earlier, and hosts of the target en-
vironment are set as the terminals. Then, the multiway
cut solution of this graph is the solution of the reliable
deployment problem we are looking for. This means
that the application components that lie in the same
subgraph with a host should be deployed on that host,
and this deployment configuration has the least number
of channels among hosts.
Figure 3. A graph built for finding the most
reliable deployment configuration of the ap-
plication presented in Fig. 1 into the target
environment shown in Fig. 2.
Proof Suppose $n$ represents the number of components
of the application, and $k$ represents the size of
$E_H$, i.e., $k = |E_H|$. In the multiway cut solution
we are looking for, each component must be assigned
to exactly one host. Thus, $(k - n)$ edges whose total
weight is $n^2(k - n)$ will be removed from $E_H$ in
the cut. Also, suppose $L_{OPT}$ represents the solution of
the reliable deployment problem, i.e., the least number
of channels among hosts after the deployment of the
application. These channels are exactly those application
graph edges that lie in the cut, and their total
weight is $L_{OPT} \times 1 = L_{OPT}$. Thus, our goal is to prove
that $MC = n^2(k - n) + L_{OPT}$.
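As a worked instance of this target equality, consider the running example. Table 1 lists $2 + 3 + 2 + 3 + 1 + 2 = 13$ component-to-host candidate pairs, so $k = 13$, and the application has $n = 6$ components. The value of $L_{OPT}$ depends on the edges of Fig. 1 and is therefore left symbolic.

```latex
\begin{aligned}
n^2 (k - n) &= 6^2 \times (13 - 6) = 36 \times 7 = 252, \\
MC &= n^2 (k - n) + L_{OPT} = 252 + L_{OPT}.
\end{aligned}
```

Because an application graph on six components has at most $\binom{6}{2} = 15$ edges, $L_{OPT} \le 15 < 36$, so removing one additional host edge always costs more than any saving on application edges.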
Case A: $MC \le n^2(k - n) + L_{OPT}$.
Suppose $D : V_{AG} \to V_{TG}$ is a deployment whose
cost is optimum, i.e., it has $L_{OPT}$
channels among hosts. Now, assume that $C$ is
its corresponding cut in the graph $G = (V, E)$:
$$C = \underbrace{\{\{C_i, H_x\} \mid D(C_i) \neq H_x,\ \{C_i, H_x\} \in E_H\}}_{M} \cup \underbrace{\{\{C_i, C_j\} \mid D(C_i) \neq D(C_j),\ \{C_i, C_j\} \in E_{AG}\}}_{N}$$
$M$ represents the set of edges of $E_H$ that lie in
the cut, and $N$ represents the set of edges of $E_{AG}$
that lie in the cut. The size of $M$ is $(k - n)$ and
the size of $N$ is $L_{OPT}$. Furthermore, the weight of
each edge in $M$ is $n^2$ and the weight of each edge
in $N$ is 1. With respect to this description:
$$\begin{aligned}
w(C) &= w(\{\{C_i, H_x\} \mid D(C_i) \neq H_x,\ \{C_i, H_x\} \in E_H\}) + w(\{\{C_i, C_j\} \mid D(C_i) \neq D(C_j),\ \{C_i, C_j\} \in E_{AG}\}) \\
&= n^2 \times |\{\{C_i, H_x\} \mid D(C_i) \neq H_x,\ \{C_i, H_x\} \in E_H\}| + 1 \times |\{\{C_i, C_j\} \mid D(C_i) \neq D(C_j),\ \{C_i, C_j\} \in E_{AG}\}| \\
&= n^2(k - n) + L_{OPT}
\end{aligned}$$
Since $MC$ is the cost of the optimum multiway cut,
$MC \le w(C)$. Therefore, $MC \le n^2(k - n) + L_{OPT}$.
Case B: $MC \ge n^2(k - n) + L_{OPT}$.
Suppose $C$ is the optimum multiway cut for graph
$G = (V, E)$ whose cost is $MC$. Now, we want to
use this cut to generate its corresponding deployment
$D$. For this purpose, we prove the following
subcases:
Subcase B.1: Cut $C$ includes at most $(k - n)$
edges of $E_H$.
Suppose we want to find a cut whose cost
is the heaviest. In the deployment configuration
we are looking for, each component
should be assigned to exactly one host. For
this purpose, for each component $C_i$ in graph
$G$, we keep an arbitrary edge connecting that
component to an arbitrary host, and we cut
the rest of the edges in $E_H$ and $E_{AG}$. Since
the maximum number of edges in the application
graph is $\binom{n}{2}$, the cost of this cut is
at most $\binom{n}{2} + n^2(k - n)$. Thus, the cost of
the multiway cut $C$ cannot be more than
$\frac{n^2}{2} + n^2(k - n)$. This means that the cut $C$
includes at most $(k - n)$ edges of $E_H$, because
if, for example, it included $(k - n + 1)$
edges of $E_H$, then the cost of the cut could be
as large as $\binom{n}{2} + n^2(k - n + 1)$ and would be at
least $n^2(k - n + 1)$, which is more than the
maximum cost we found here.
Subcase B.2: Each component $C_i$ is connected
to at most one host in the cut $C$.
Suppose a component $C_i$ is still connected to two
different hosts $H_x$ and $H_y$ after the cut edges are
removed. This means that $H_x$ and $H_y$ are connected
to each other through $C_i$. However, since $H_x$ and $H_y$
belong to the set of terminals, this is impossible.
Therefore, $C_i$ is connected to at most
one host in the cut.
Subcase B.3: Each component $C_i$ is connected
to exactly one host in the cut $C$.
By subcase B.1, at least $n$ edges of $E_H$ remain after
the cut, and by subcase B.2, each of the $n$ components
keeps at most one of them. Therefore, each component $C_i$
is connected to exactly one host in the cut $C$.
$D(C_i)$ represents the host on which component
$C_i$ is mapped.
1. For each terminal $t_i \in T$, find a minimum-cost
set of edges $C_{t_i}$ whose removal disconnects $t_i$
from the rest of the terminals;
2. Discard the cut $C_{t_x}$ whose cost $w(C_{t_x})$ is the
heaviest;
3. Output the union of the rest, call it $C$.
Algorithm 1: Approximation algorithm for
solving the multiway cut problem.
By using subcase B.3, the deployment configuration $D$
corresponding to cut $C$ can be constructed. Suppose
$L_D = |\{\{C_i, C_j\} \mid D(C_i) \neq D(C_j),\ \{C_i, C_j\} \in E_{AG}\}|$
represents the cost of the deployment configuration
$D$, i.e., the number of channels among
the hosts in the deployment configuration $D$. In
the following, we prove the correctness of case B:
$$\begin{aligned}
MC &= n^2(k - n) + |\{\{C_i, C_j\} \mid \{C_i, C_j\} \in C,\ \{C_i, C_j\} \in E_{AG}\}| \\
&\ge n^2(k - n) + |\{\{C_i, C_j\} \mid D(C_i) \neq D(C_j),\ \{C_i, C_j\} \in E_{AG}\}| \\
&= n^2(k - n) + L_D
\end{aligned}$$
$$\Longrightarrow MC \ge n^2(k - n) + L_D \ge n^2(k - n) + L_{OPT}$$
Cases A and B together imply that $MC = n^2(k - n) + L_{OPT}$.
Therefore, the correctness of Theorem 3.1
is proved.
In Theorem 3.1, we showed that the solution of the
reliable deployment problem can be found by solving
the multiway cut problem in graph theory. However,
the multiway cut problem has been proved to be NP-hard
when the number of terminals is greater
than two. Thus, unless P = NP, it does not have a
polynomial-time solution [6]. Nevertheless, many
approximation algorithms for the multiway
cut problem can be found in the literature [6, 7, 8]. One
well-known and simple approximation algorithm, developed
by Dahlhaus et al., is provided in Algorithm 1 [6].
Fig. 4 shows an example of applying this algorithm
to the graph presented in Fig. 3. As this figure shows,
one of the main problems of these approximation
algorithms is that some components may not
be assigned to any host (e.g., $C_4$ and $C_6$). To solve
this problem, after applying the multiway cut
approximation algorithm to the graph, we check whether
all components are assigned to a host. If
some components are not assigned to any host,
we connect each of them to one of its candidate
hosts, and we cut all the application
graph edges that are connected to those components.
Since we are actually removing from the multiway cut
approximation some heavy edges that connect the
components to the hosts, this approach not only solves
the problem, but also improves the approximation
of the multiway cut. Consequently, the result is closer
to the optimum solution we are looking for. After
applying this improvement to the multiway cut
approximation presented in Fig. 4, one possible solution for
the reliable deployment problem is
$\{(C_1 \to H_2), (C_2 \to H_2), (C_3 \to H_2), (C_4 \to H_4), (C_5 \to H_2), (C_6 \to H_4)\}$.
Figure 4. An approximation for the multiway cut of the graph presented in Fig. 3.
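Algorithm 1 and the repair step described above can be sketched as follows. This is an illustrative reading, not the authors' implementation: the minimum isolating cuts are computed with a plain Edmonds-Karp max-flow written for this example, an orphaned component is given an arbitrary candidate host (the first in sorted order), and all function names and the toy instance are assumptions.

```python
from collections import deque

INF = float("inf")


def source_side(weights, s, others):
    """Nodes on s's side of a minimum cut separating s from all of `others`.

    weights maps frozenset({u, v}) -> non-negative weight (undirected edges).
    Plain Edmonds-Karp max-flow with a super-sink attached to `others`.
    """
    res = {s: {}}  # residual capacities res[u][v]
    for edge, w in weights.items():
        u, v = tuple(edge)
        res.setdefault(u, {})[v] = w
        res.setdefault(v, {})[u] = w
    sink = object()  # artificial super-sink
    res[sink] = {}
    for t in others:
        res.setdefault(t, {})[sink] = INF
        res[sink][t] = 0
    while True:
        parent, queue = {s: None}, deque([s])
        while queue and sink not in parent:  # BFS for a shortest augmenting path
            u = queue.popleft()
            for v, cap in res[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return set(parent)  # nodes still reachable from s
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(res[a][b] for a, b in path)
        for a, b in path:
            res[a][b] -= flow
            res[b][a] += flow


def isolating_cut_heuristic(weights, terminals):
    """Algorithm 1: union of all isolating cuts except the heaviest one."""
    cuts = []
    for t in terminals:
        side = source_side(weights, t, [x for x in terminals if x != t])
        cuts.append({e for e in weights if len(e & side) == 1})
    cuts.sort(key=lambda cut: sum(weights[e] for e in cut))
    return set().union(*cuts[:-1])


def extract_deployment(cut, candidates):
    """Read the deployment off the cut; repair components left without a host."""
    deployment = {}
    for component, hosts in candidates.items():
        kept = [h for h in sorted(hosts) if frozenset({component, h}) not in cut]
        # A multiway cut leaves at most one host edge per component. Repair:
        # an orphaned component goes to an arbitrary (first) candidate host.
        deployment[component] = kept[0] if kept else sorted(hosts)[0]
    return deployment


# Toy instance (hypothetical, not the paper's Fig. 3): two components, two hosts.
TOY_CANDIDATES = {"C1": ["H1"], "C2": ["H1", "H2"]}
TOY_WEIGHTS = {
    frozenset({"C1", "C2"}): 1,  # application edge, weight 1
    frozenset({"C1", "H1"}): 4,  # host edges, weight n^2 = 4
    frozenset({"C2", "H1"}): 4,
    frozenset({"C2", "H2"}): 4,
}
toy_cut = isolating_cut_heuristic(TOY_WEIGHTS, ["H1", "H2"])
toy_deployment = extract_deployment(toy_cut, TOY_CANDIDATES)
```

In the toy instance, the heuristic removes only the edge between $C_2$ and $H_2$, so both components land on $H_1$ and no channel crosses hosts. A less arbitrary repair rule, such as choosing the candidate host that already holds most of the orphan's neighbours, is one possible refinement.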
4 Conclusions and Future Work
This paper presented a graph-based approach for
maximizing the reliability of component-based application
deployments into distributed environments. The
reliability of a distributed application is highly
dependent on the reliability of its network. Therefore,
in the deployment planning approach presented
in this paper, the communications among the application
components are kept as local as
possible, minimizing the effects of network failures
on the application's reliability. Furthermore, it
was demonstrated that this deployment problem
corresponds to the multiway cut problem in graph theory,
which is an NP-hard problem. In this paper, an existing
multiway cut approximation algorithm with some im-
provements was used to solve the reliable deployment
problem. For future work, we plan to develop a number
of heuristics that can provide a better approximation
of the reliable deployment problem.
Acknowledgment
The authors would like to acknowledge Ehsan Chini-
forooshan for his valuable suggestions in developing the
ideas expressed in this paper.
References
[1] Lyu, M. R. Handbook of Software Reliability En-
gineering, IEEE Computer Society Press and
McGraw-Hill, 1996.
[2] Arbab, F. Reo: A Channel-based Coordination
Model for Component Composition. Mathemati-
cal Structures in Computer Science, 14, 3 (June
2004), 329-366.
[3] Katis, P., Sabadini, N. and Walters, R. F. C. A
Formalization of the IWIM Model. In Proceed-
ings of the 4th International Conference on Co-
ordination Languages and Models (COORDINA-
TION 2000), Limassol, Cyprus, September 11-13,
2000.
[4] Bonsangue, M. M., Arbab, F., de Bakker, J. W.,
Rutten, J., Scutellà, A. and Zavattaro, G. A Transition
System Semantics for the Control-driven Coordination
Language Manifold. Theoretical Computer
Science, 240, 1 (June 2000), 3-47.
[5] Schollmeier, R. A Definition of Peer-to-Peer Networking
for the Classification of Peer-to-Peer Architectures
and Applications. In Proceedings of the
IEEE 2001 International Conference on Peer-to-Peer
Computing (P2P2001), Linköping, Sweden,
August 27-29, 2001.
[6] Dahlhaus, E., Johnson, D. S., Papadimitriou, C.
H., Seymour, P. D. and Yannakakis, M. The Com-
plexity of Multiterminal Cuts. SIAM Journal on
Computing, 23, 4 (August 1994), 864-894. Prelim-
inary version appeared in STOC’92.
[7] Calinescu, G., Karloff, H., and Rabani, Y. An
Improved Approximation Algorithm for Multiway
Cut. Journal of Computer and System Sciences,
60, 3 (June 2000), 564-574. Preliminary version in
STOC’98.
[8] Vazirani, V. V. Approximation Algorithms, Sec-
ond Edition, Springer, 2002.