Learning Run-time Compositions of Interacting Adaptations
Nicolás Cardozo
Systems and Computing Engineering Department
Universidad de los Andes, Colombia
n.cardozo@uniandes.edu.co
Ivana Dusparic
School of Computer Science and Statistics
Trinity College Dublin, Ireland
ivana.dusparic@scss.tcd.ie
ABSTRACT
Self-adaptive systems continuously adapt to internal and external changes in their execution environment. In context-based self-adaptation, adaptations take place in response to the characteristics of the execution environment, captured as a context. However, in large-scale adaptive systems operating in dynamic environments, multiple contexts are often active at the same time, requiring simultaneous execution of multiple adaptations. Complex interactions between such adaptations might not have been foreseen or accounted for at design time. For example, adaptations can partially overlap, requiring only partial execution of each, or they can be conflicting, requiring some of the adaptations not to be executed at all, in order to preserve system execution. To ensure a correct composition of adaptations, we propose ComInA, a novel reinforcement learning based approach, which autonomously learns interactions between adaptations as well as the most appropriate adaptation composition for each combination of active contexts, as they arise. We present an initial evaluation of ComInA in an urban public transport network simulation, where multiple adaptations to buses, routes, and stations are required. Early results show that ComInA correctly identifies whether adaptations are compatible or conflicting and learns to execute adaptations which maximize system performance. However, further investigation is needed into how best to utilize such identified relationships to optimize a wider range of metrics and utilize more complex composition strategies.
CCS CONCEPTS
• Computing methodologies → Reinforcement learning; • Software and its engineering → Software development techniques; Software design engineering;
KEYWORDS
Dynamic software composition, Reinforcement learning
ACM Reference format:
Nicolás Cardozo and Ivana Dusparic. 2020. Learning Run-time Compositions of Interacting Adaptations. In Proceedings of Symposium on Software Engineering for Adaptive and Self-Managing Systems, Seoul, South Korea, 2020 (SEAMS’20), 7 pages.
https://doi.org/10.1145/nnnnnnn.nnnnnnn
SEAMS’20, 2020, Seoul, South Korea
© 2020 Copyright held by the owner/author(s). Publication rights licensed to Association for Computing Machinery.
This is the author’s version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in Proceedings of Symposium on Software Engineering for Adaptive and Self-Managing Systems, 2020, https://doi.org/10.1145/nnnnnnn.nnnnnnn.
1 INTRODUCTION
Self-adaptive systems [24] enable the dynamic adaptation of software systems in response to input gathered from their surrounding (internal and external) execution environment. In this paradigm, information about the environment is used to select the most appropriate behavior from a set of predefined adaptation alternatives. Composition-based self-adaptive systems achieve dynamic adaptation by continuously (re)composing system modules with the appropriate specialized behavior to respond to the sensed environment. If multiple environment situations are sensed, multiple adaptations need to be composed. The relationship between adaptations can result in three broad composition situations, which we identify as: compatible, in which adaptations are independent from each other and their composition can take place safely; complementary, in which adaptations enhance each other or overlap, and either one or all can be executed; and conflicting, in which composing the adaptations will lead to an execution error, or the effects of the adaptations cancel each other out.
To illustrate these (and to evaluate our approach), we use the TranCity example, a public transport management system (Section 4), where multiple adaptations to the buses, stations, and routes might be needed. For example, adaptations to close a station temporarily and stop serving it can be seen as compatible (independent), while closing the station and sending an additional bus to it might be conflicting. Composing conflicting adaptations might lead to an erroneous system state. Adaptation mechanisms need to ensure that no conflicts exist between the adaptations, or between adaptations and the base system. The best combination of adaptations to be executed is the one that has the most positive impact on the global behavior of the system, in its current context of execution. Therefore, for each possible context combination, a decision must be made on whether, and which, adaptations to compose.
Managing the interaction between adaptations and their composition is still an open problem [26]. Different mechanisms exist to ensure that the composition of adaptations still satisfies the system goals [26], but they all require the upfront definition of composition rules between adaptations (Section 5). This enforces a fixed resolution mechanism for all adaptation composition scenarios, which may not always yield the most appropriate behavior to fulfill the system’s goal, as some of the composition interactions may have been unforeseen by the developers or may have changed over time. In addition, even if a conflict between adaptations is detected at design time, pre-defining the dominant adaptation that will always take precedence might not result in the best system performance, as adaptations might need to alternate to ensure fairness.
To address these issues, we propose a mechanism for the run-time composition of adaptations, Composing Interacting Adaptations (ComInA) (Section 3), in which the composition strategy is not presupposed by the system, but rather is learned to be the composition most beneficial to the overall system execution. ComInA uses Reinforcement Learning (RL), in particular W-Learning [19], to learn: (1) the relationships (i.e., compatibility or conflict) between adaptations that arise when multiple contexts are sensed simultaneously, and (2) the most suitable individual or composed adaptations to execute for every combination of contexts, that takes advantage of complementary adaptations or prevents composing conflicting ones. As new contexts and adaptations are discovered while the system executes, the ComInA learning processes are dynamically expanded to capture their relationships with the existing system. To realize adaptations, ComInA uses an adaptation engine at the programming language level, Context-oriented Programming (COP) [25], enabling fine-grained adaptations (i.e., at the method level), rather than coarse-grained component-level adaptations used by existing approaches (Section 5).
ComInA’s initial results show the feasibility and usability of the correct dynamic composition of adaptations. Further evaluation is still needed to assess the scope of ComInA’s applicability (Section 6).
2 PRELIMINARIES
ComInA uses RL to learn the most suitable way of composing adaptations at run time. In particular, we use the W-Learning multi-objective learning technique to resolve the most suitable composition with respect to the system objectives. The dynamic adaptation behavior in ComInA is implemented using COP, a programming language-based adaptation engine. The learning composition behavior posited in this paper is implemented on top of this adaptation engine. In this section, we provide the fundamental concepts from both techniques required to understand our proposed solution.
2.1 Reinforcement Learning
In RL, intelligent agents learn to map environment situations (i.e., states) to actions so as to maximize a long-term numerical reward signal they receive from the environment [23]. An RL agent uses a set of states, S, consisting of all of the states contained within an environment, and a set of actions, A (the ‘action space’), i.e., the set of all actions that the agent can execute that affect the environment. At each time step, it receives r, the numerical reward signal encoding the positive or negative impact the action had on the environment.
Q-learning [28] is a widely used model-free implementation of RL. The long-term quality of an action taken in a particular state is represented by a Q-value, Q(s,a) [28], and is learned for each state-action pair. Q-learning incorporates only a single source of reward, i.e., it enables optimization for a single system goal. To learn how to meet multiple system goals, multiple Q-learning processes can be implemented. However, as an agent can only execute a single action at a time, an arbitration approach needs to be added that resolves which of the Q-learning processes, i.e., which of the agent’s goals, gets to take control of action execution. W-learning provides that mechanism. At every time step, each Q-learning process nominates the action that is best suited to the goal it represents. By implementing W-learning on top, agents also learn, for each of the states of each of their goals, how important it is, in terms of received rewards, for the action nominated by that goal to be selected, as opposed to actions nominated by other goals. This is expressed as a W-value, W(s), and is learned by each Q-learning agent for each state in which it can be. The goal with the maximum W-value takes priority at the next time step, and its nominated action is executed.
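As a rough illustration of the two update rules described above, the following Python sketch shows a tabular Q-learning step and a W-value step for a goal whose nominated action was not executed. The W-value update follows the commonly cited W-learning formulation, in which W(s) moves toward the loss a goal suffers by not being obeyed; the function names, the dictionary-based tables, and the way goals are passed to the arbiter are assumptions made for this illustration and are not taken from ComInA.

```python
from collections import defaultdict

ALPHA = 0.1  # learning rate (illustrative; same value as in the evaluation)
GAMMA = 0.1  # discount factor (illustrative; same value as in the evaluation)


def q_update(q, state, action, reward, next_state, actions):
    """One tabular Q-learning step: move Q(s,a) toward r + gamma * max_b Q(s',b)."""
    best_next = max(q[(next_state, b)] for b in actions)
    target = reward + GAMMA * best_next
    q[(state, action)] += ALPHA * (target - q[(state, action)])
    return q[(state, action)]


def w_update(w, q, state, nominated, reward, next_state, actions):
    """W-value step for a goal that was NOT obeyed in `state`.

    The loss is what the goal predicted for its own nominated action
    minus what it actually obtained after another goal's action ran.
    """
    best_next = max(q[(next_state, b)] for b in actions)
    loss = q[(state, nominated)] - (reward + GAMMA * best_next)
    w[state] += ALPHA * (loss - w[state])
    return w[state]


def arbitrate(w_values):
    """Return the goal with the highest W-value; its action gets executed."""
    return max(w_values, key=w_values.get)
```

A goal that keeps losing reward when it is ignored accumulates a larger W-value in that state, which is what eventually lets it win the arbitration.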
2.2 Context-oriented Programming
COP [25] is a programming language approach to realize dynamic behavior adaptations at a fine level of granularity (i.e., methods). COP enables clean modularization of adaptations from the base program logic as well as from other adaptations. Adaptations are incorporated in a running system by dynamically recomposing the running system with the adaptation behavior. The dynamic composition model used in COP reifies the MAPE loop. Therefore, our solution transcends the boundaries of COP and is applicable to other adaptation engines.
To implement ComInA we use the Context Traits [16] COP language, an extension of ECMAScript enabling dynamic adaptations. Three main concepts are behind dynamic adaptations in COP: contexts, behavioral adaptations, and context activations. Contexts are defined as first-class entities of the system that capture meaningful situations from the system’s surrounding execution environment. On their own, contexts only represent such sensed situations. Adaptive behavior is realized by means of behavioral adaptations, modular behavior (e.g., method) specifications defined in isolation from other components. Behavioral adaptations are associated with a context. When the context is activated (from anywhere in the system), the associated behavioral adaptation is composed into the system. If the context is no longer valid, then it is deactivated, withdrawing its associated behavioral adaptations from the system.
In Context Traits, behavioral adaptations are composed following either a predefined static policy, or a dynamic policy, in which the behavioral adaptations associated with the context activated the latest are executed first. In ComInA we combine COP with RL to override these rules for context adaptation, and enable the system to learn preferred compositions at run time.
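To make the activation and composition model more concrete, here is a small Python sketch of the idea. It is not the Context Traits API (which extends ECMAScript); the `Context` and `AdaptiveObject` classes, their methods, and the `next_stop` behavior are hypothetical names used only to show how activating a context composes its behavioral adaptation into the system. As a simplification, the sketch runs only the adaptation of the most recently activated context instead of chaining all active ones.

```python
class Context:
    """A named situation with method-level behavioral adaptations (hypothetical)."""

    def __init__(self, name, adaptations):
        self.name = name
        self.adaptations = adaptations  # method name -> replacement function


class AdaptiveObject:
    """Holds base behavior plus the list of currently active contexts."""

    def __init__(self, base_methods):
        self.base_methods = base_methods
        self.active = []  # most recently activated context is last

    def activate(self, context):
        if context not in self.active:
            self.active.append(context)

    def deactivate(self, context):
        if context in self.active:
            self.active.remove(context)

    def call(self, method, *args):
        # Dynamic policy: the latest activated context adapting the method wins.
        for context in reversed(self.active):
            if method in context.adaptations:
                return context.adaptations[method](*args)
        # No active context adapts this method: fall back to base behavior.
        return self.base_methods[method](*args)


# Base behavior serves the next station; FullBus skips one station.
bus = AdaptiveObject({"next_stop": lambda stop: stop + 1})
full_bus = Context("FullBus", {"next_stop": lambda stop: stop + 2})
```

Calling `bus.call("next_stop", 3)` uses the base behavior until `bus.activate(full_bus)` is executed, and returns to it once the context is deactivated.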
3 LEARNING COMPOSITION STRATEGIES
ComInA is a novel adaptation composition approach for self-adaptive systems that autonomously learns interactions between adaptations as well as the most appropriate adaptation compositions for each combination of active contexts, as they arise. ComInA is based on W-Learning, and consists of (1) Context Agents, which are tasked with executing a designated adaptation for each context as well as learning the suitability of other adaptations available in the system, (2) Interaction Agents, which are tasked with learning how contexts and adaptations affect a given context (i.e., relationship between contexts), and (3) a Context Composer, which, based on input from the Context and Interaction Agents, determines the adaptation, or combination of adaptations, to execute for each combination of active contexts. This section details the algorithms implemented by each of ComInA’s agents.
3.1 Context Agent Design
ComInA-based systems contain a set of context agents $A_{c_1}, \ldots, A_{c_n}$, implemented using a Q-learning RL process for each context. Initially, each context agent has a set of states capturing whether the context is active at a particular time (1) or not (0), $S_{q_{c_1}} = \{C_1^0, C_1^1\}$, and an action set containing a single adaptation, $A = \{a_{c_1}\}$. $a_{c_1}$ is the adaptation required for the context, and is assumed to be given. Whether the adaptation definition is pre-specified or learned is irrelevant to the implementation of ComInA. The learning process of a Context Agent is outlined in Algorithm 1. Every time context $c_1$ is active, the $A_{c_1}$ agent proposes to execute adaptation $a_{c_1}$. However, based on other active contexts, and the decision of the Context Composer, other adaptations could be executed. In such a case, the agent expands its action space with the executed adaptation (unknown before), and learns (using Q-learning) the impact of its execution on the system. The action set of the Context Agent is thus built at run time. Knowledge about other adaptations present in the system might affect the agent’s preference for executing its own adaptation, e.g., if a better adaptation is discovered.
Algorithm 1 ComInA Context Agent Init and Learning Process
S_qi := {C_i^0, C_i^1}
A_i := {a_i}
QLearning.Initialize(S_qi, A_i)
while true do                                  // continuously executing system
    CurrentContexts[] := senseEnvConditions()
    // check whether the agent's context is currently active
    if Context C_i is in CurrentContexts[] then
        nominateAdaptationToExecute(A_i)
        currentState := C_i^1
    else
        currentState := C_i^0
    end if
    // get environment reward for current state
    reward := QLearning.getRwd(currentState)
    // if executed adaptation not previously seen, expand action set
    execAdapt := getExecutedAdaptation()
    if execAdapt is not in A_i then
        A_i := A_i ∪ {execAdapt}
    end if
    // standard Q-learning update
    QLearning.Update(prevState, execAdapt, reward)
end while
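The Context Agent loop of Algorithm 1 can be approximated in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the class name, the string encoding of states (e.g., `FullBus-1` for an active context), the adaptation names, and the externally supplied reward are all hypothetical choices made for illustration.

```python
from collections import defaultdict


class ContextAgent:
    """Q-learning agent for one context; its action set grows at run time."""

    def __init__(self, context, adaptation, alpha=0.1, gamma=0.1):
        self.context = context
        self.states = (f"{context}-0", f"{context}-1")  # inactive / active
        self.actions = [adaptation]  # initially only its own adaptation
        self.q = defaultdict(float)
        self.alpha, self.gamma = alpha, gamma
        self.prev_state = self.states[0]

    def sense(self, current_contexts):
        """Map the sensed contexts to this agent's own state."""
        active = self.context in current_contexts
        return self.states[1] if active else self.states[0]

    def nominate(self, state):
        """Propose the best known adaptation when active, otherwise nothing."""
        if state != self.states[1]:
            return None
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def learn(self, state, executed, reward):
        """Expand the action set if needed, then apply a Q-learning update."""
        if executed not in self.actions:
            self.actions.append(executed)  # adaptation unknown before
        best_next = max(self.q[(state, a)] for a in self.actions)
        key = (self.prev_state, executed)
        self.q[key] += self.alpha * (reward + self.gamma * best_next - self.q[key])
        self.prev_state = state
```

The point of the sketch is the `learn` method: the agent does not need to know other adaptations in advance, it simply records whichever adaptation the composer executed and starts estimating its value.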
3.2 Interaction Agent Design
Interaction Agents, $A_{w_{c_1}}, \ldots, A_{w_{c_n}}$, defined per context, learn how their context interacts with other agents. Interaction agents are implemented using W-learning. We selected W-Learning as a basis for our approach due to its ability to encode relationships between adaptations, without having their relative priorities predefined or hard-coded at design time [5]. The initial relative priority of adaptations is expressed through rewards given to context agents. However, during the system execution, if an adaptation is "neglected" for a while, its associated W-value will eventually become higher than other W-values, and even lower priority adaptations will gain control of execution, enabling a dynamic shift in priorities. In addition, W-learning implicitly enables detection of complementary adaptations. For example, multiple adaptations might be suitable for multiple goals: an adaptation might be suitable for another context as a "side-effect" of its execution even though it was designated to adapt to a different context. In such a case, the W-values of both contexts will be low, as neither context has to "compete" to execute. Such interaction enables the detection of relationships between multiple adaptations by learning the best adaptation, or combination of adaptations, to be executed.
The learning process of an Interaction Agent is outlined in Algorithm 2. At the start, the state space of $A_{w_{c_1}}$ is identical to the state space of the context agent, $S_{w_{c_1}} = S_{q_{c_1}}$. The state space of Interaction Agents is expanded at run time as new contexts are observed. Such expansion enables Agents to (quantitatively) learn the impact of applying the preferred adaptation on the system performance, for each particular context combination. For example, an interaction agent for context $c_1$ might expand its state space, at run time, to represent all active/inactive combinations with context $c_2$ (statesW = ["C1-0,C2-0","C1-0,C2-1","C1-1,C2-0","C1-1,C2-1"]), and learn W-values for each of said combinations.
Algorithm 2 ComInA Interaction Agent Init and Learning Process
S_wi := S_qi
WLearning.Initialize(S_wi)
while true do                                  // continuously executing system
    CurrentContexts[] := senseEnvConditions()
    // translate all active contexts into a single state
    for all Context c in CurrentContexts do
        currentWState += c
    end for
    // if agent knows current state, get its importance (W-value)
    if currentWState is in S_wi then
        w := WLearning.getW(currentWState)
        // nominate action as suggested by Q-Learning
        WLearning.nominateAdaptation(w, a_i)
    else
        // if unaware of current state, expand state set
        S_wi := S_wi ∪ {currentWState}
    end if
    // standard W-learning update
    WLearning.Update(prevWState, execAdapt, reward)
end while
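The run-time expansion of an Interaction Agent's state space can be sketched as follows. The sketch is illustrative only: the `InteractionAgent` class, the `all_states` helper, and the `loss` signal passed to `learn` are assumptions, and newly created composite states simply start with a W-value of 0 rather than inheriting earlier values. The state strings follow the `C1-0,C2-1` encoding used in the example above.

```python
from collections import defaultdict
from itertools import product


def all_states(contexts):
    """Enumerate every active/inactive combination, e.g. 'C1-0,C2-1'."""
    return [
        ",".join(f"{c}-{bit}" for c, bit in zip(contexts, bits))
        for bits in product((0, 1), repeat=len(contexts))
    ]


class InteractionAgent:
    """W-learning agent whose state space grows as new contexts are seen."""

    def __init__(self, context, alpha=0.1):
        self.contexts = [context]
        self.states = all_states(self.contexts)
        self.w = defaultdict(float)
        self.alpha = alpha

    def observe(self, current_contexts):
        """Expand the state space for unseen contexts; return the current state."""
        new = [c for c in sorted(current_contexts) if c not in self.contexts]
        if new:
            self.contexts.extend(new)
            self.states = all_states(self.contexts)
        return ",".join(f"{c}-{int(c in current_contexts)}" for c in self.contexts)

    def learn(self, state, loss):
        """Move W(state) toward the loss suffered when the agent was not obeyed."""
        self.w[state] += self.alpha * (loss - self.w[state])
        return self.w[state]
```

Enumerating all combinations grows exponentially with the number of known contexts, which is one reason to restrict each agent to the contexts that affect the same components.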
Note that in very large systems, Interaction Agents will not capture all other active contexts, only the contexts directly affecting the same components that their own context affects.
3.3 Context Composer
ComInA contains at least one Context Composer agent.¹ This module is in charge of composing selected adaptations coming from the other agents. At each timestep, a Context Composer receives all adaptation candidates from Context Agents, and their associated impact from Interaction Agents. The adaptation with the highest W-value is executed. However, as a part of exploration, a Context Composer also executes compositions of adaptations, varying their combinations and their order, to evaluate the impact of composite adaptations on the system. Such composite adaptations are treated just like individual adaptations, and are therefore added to Context Composers’ action space, and their suitability for a particular context is learned (e.g., adapts=["A1","A2","A1,A2","A2,A1"]). Therefore, over time, individual Context Agents can suggest the most suitable composite adaptations for their context.

¹ In smaller systems a single composer might have a view of all Context and Interaction agents, while in large ones, multiple composers might be in charge of a limited number of components that cooperate as a multi-agent system to ensure global performance.
Algorithm 3 Context Composition Selection and Execution
Require: currentWState
while true do
    // collect preferred adaptations from relevant context agents
    var adaptNominations[][], finalAdaptation
    for all A_c1 to A_cn do
        adaptNominations.push(A_ci.adaptation, w_i)
    end for
    maxAdapt := findMaxWvalue(adaptNominations)
    if exploring then
        // try combinations of adaptations
        finalAdaptation := composition of A_ci.adaptation and A_cj.adaptation
    else
        finalAdaptation := maxAdapt
    end if
    finalAdaptation.execute()
end while
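The selection step of Algorithm 3 can be sketched in Python as below. The `compose` function, its arguments, and the choice to explore by sampling an ordered pair of nominated adaptations are assumptions made for illustration; how ComInA actually schedules exploration and enumerates combinations is not specified beyond the pseudocode.

```python
import random


def compose(nominations, exploring, rng=random):
    """Pick the adaptation(s) to execute from (adaptation, w_value) pairs.

    Returns a tuple of adaptation names in execution order.
    """
    if not nominations:
        return ()
    if exploring and len(nominations) > 1:
        # Exploration: try an ordered combination of two nominated adaptations.
        first, second = rng.sample([name for name, _ in nominations], 2)
        return (first, second)
    # Exploitation: execute the adaptation with the highest W-value.
    best, _ = max(nominations, key=lambda pair: pair[1])
    return (best,)
```

For example, `compose([("A1", 0.2), ("A2", 0.7)], exploring=False)` selects only `A2`, while the exploring branch yields either `("A1", "A2")` or `("A2", "A1")`, matching the ordered composites listed in the action space above.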
The executed adaptations are selected based on learned interactions using W-Learning in both the Context and Interaction Agents. The proposed combinations are then enacted, using the ordered composition strategy from COP. However, ComInA does not fix the composition strategy but allows composition strategies to be continuously defined based on the W-values of the agents’ expanded state spaces.
4 EVALUATION AND EARLY RESULTS
We evaluate the feasibility and performance of ComInA using a city transport management system.
4.1 Public Transport Case Study
The base TranCity system allows monitoring bus services in a city, observing the system’s occupancy (of buses and stations), buses’ frequency, number of buses operating on a route, and road conditions (e.g., whether they are blocked). Figure 1 shows a view of TranCity, with the route used in the evaluation, R4, highlighted.
As the mobility conditions in the city change continuously, the system needs to take actions that modify TranCity’s basic behavior based on the current context gathered. In particular, whenever any of the monitored variables’ constraints are violated, for example, if a station exceeds its capacity, a context is activated and its associated adaptations are executed. Table 1 shows the subset of contexts and their associated adaptations used in the evaluation.
During the system execution, multiple contexts can become activated simultaneously. The system must decide which adaptation, or combination of adaptations, is the best to execute to maximize performance. We implement this decision process using ComInA.

Figure 1: TranCity bus route system

Context       Adaptation
FullBus       skip next station, send new bus
FullStation   close the station, send bus, redirect passengers
DepotEmpty    return a bus to the depot
Table 1: Contexts and their corresponding adaptations
4.2 Evaluation Scenarios and Parameters
We evaluate behavior and performance of ComInA in the following specific context combinations:
Scenario 1: Independent/Complementary Adaptations, in which the FullBus and FullStation contexts can be simultaneously active.
Scenario 2: Conflicting Adaptations, in which the DepotEmpty and FullStation contexts can be simultaneously active.
In each of the scenarios, we compare the performance of two ComInA composition variations with the baseline:
ComInA Individual. The system dynamically decides which adaptation to execute for one of the active contexts.
ComInA Composable. The system can choose to execute either of the individual adaptations or a composition of the adaptations associated with both contexts.
Predefined Adaptation. The baseline behavior as provided by existing approaches. The adaptation which gets executed in case of multiple active contexts is predefined. We have two such baselines, one for each of the two active contexts being always a predefined "winner".
We capture system performance using two metrics:
The number of context alerts, where fewer alerts means better performance. Context alerts happen as a result of abnormal system operation. These may come from an increase/decrease of passengers in buses/stations during TranCity’s operation. Unresolved alerts continue to be raised at each timestep until resolved, so more effective adaptation strategies will have a lower number of alerts.
Passenger delay, calculated at each timestep as the difference between the total cumulative delay all passengers experience at timestep t and the delay they experienced at the previous timestep t − 1. Ideally, if the system is operating smoothly, the additional delay introduced in each timestep is 0.
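The passenger delay metric described above amounts to differencing a cumulative series. A minimal Python sketch, assuming the simulation exposes the total cumulative delay as a list indexed by timestep (the function name and the sample numbers are made up for illustration):

```python
def delay_increments(cumulative_delay):
    """Additional delay per timestep: D(t) - D(t-1) for every t >= 1."""
    return [
        curr - prev
        for prev, curr in zip(cumulative_delay, cumulative_delay[1:])
    ]


# Hypothetical cumulative delays for four consecutive timesteps.
example = delay_increments([0, 5, 5, 12])
```

In this made-up series the second increment is 0, which is the value the metric would show at every timestep if the system were operating smoothly.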
The combination of the above ComInA implementations, baselines and metrics enables us to evaluate whether using ComInA is better for the system performance than predefined composition rules, as well as whether ComInA can detect relationships between context adaptations active simultaneously and learn to either compose them or execute a single adaptation.
Each scenario ran for 5000 learning episodes, where the initial 2/3 of the episodes were dedicated to exploring, i.e., learning Q-values and W-values, and the remaining 1/3 to exploiting learned behavior. α and γ were both set to 0.1 in both scenarios.²
4.3 Results and Analysis
Figure 2: Delay per timestep for Scenario 1

Alert         ComInA individual   ComInA composable   Predefined Bus   Predefined Station
FullBus       2170                1766                1236             3315
FullStation   303                 234                 175              477
Both          173                 109                 48               320
Table 2: Number of context alerts raised in Scenario 1
Scenario 1 Results. Figure 2 shows the delay in Scenario 1. Table 2 depicts the number of raised alerts. For clarity of presentation, in Figure 2 we only show the delay for the two ComInA-based approaches, while the table contains the alert counts for the baselines as well. Intuitively, the adaptations provided for the two contexts FullStation and FullBus are either compatible or complementary, as they both send an additional bus from the depot into the system (whether to the station or a bus position). The FullStation adaptation additionally closes the station for further arriving passengers until the capacity is freed. We observe better overall performance when composition of the two adaptations is allowed (both in terms of lower maximum delay and number of alerts raised), rather than when the system has to pick only one adaptation to execute (note that the drop in the delay around step 3700 corresponds to the system switching to exploitation of the learned knowledge). We observe that in 5000 steps, the number of times both FullStation and FullBus were active simultaneously varied between 48 and 320, signifying the frequency of such situations occurring, and the need for dynamic approaches to resolve them. We also observe that the baseline in which only the FullBus adaptation is executed has the lowest overall average delay (not pictured, but observed to range up to a maximum of 15 minutes vs. up to 300 in the composed case) and the lowest number of alerts raised. However, even though results here show better overall performance, always prioritizing the Bus adaptation might result in unfairness and imbalance in the system. To better capture the impact at a fine-grained level per passenger, bus, or station, further evaluation will need to include a fairness metric that prevents a single adaptation from always being favored and, with it, the neglect of individual stations/passengers which do not significantly impact the overall averages.

² The evaluation did not include extensive hyperparameter tuning and it was performed with intuitively set initial parameters. Extensive evaluation of all hyperparameter combinations and their impact on system performance is left for future validation.
Scenario 2 Results. Figure 3 shows the delays in Scenario 2. Table 3 depicts the number of raised alerts. Intuitively, the adaptations associated with the contexts FullStation and DepotEmpty are conflicting, as one requires sending an additional bus into the system, and the other requires removing a bus from the system to serve as an emergency backup in the depot. For this scenario we did not provide a baseline that always executes one adaptation, since continuously removing buses from the system would result in no service. In terms of delay (focusing only on the exploitation stage), both versions achieve stable low delay in the system. However, in terms of raised alerts, allowing composed adaptations is worse for every type of alert, raising roughly 16% (FullStation) to 57% (both contexts active) more alerts according to Table 3. This indicates that the system metrics correctly identified that the adaptations are conflicting and should not be composed (unlike Scenario 1, where composed adaptations improved performance).
Figure 3: Delay per timestep for Scenario 2

Alert         ComInA individual   ComInA composable
FullStation   2380                2761
DepotEmpty    713                 1055
Both          623                 981
Table 3: Number of context alerts raised in Scenario 2
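The relative differences between the two columns of Table 3 can be recomputed with a few lines of Python. The alert counts are copied from the table; the helper name and the rounding to one decimal place are choices made for this illustration.

```python
def relative_change(baseline, candidate):
    """Percentage change of candidate relative to baseline (positive = more alerts)."""
    return 100.0 * (candidate - baseline) / baseline


# Alert counts from Table 3: (ComInA individual, ComInA composable).
table3 = {
    "FullStation": (2380, 2761),
    "DepotEmpty": (713, 1055),
    "Both": (623, 981),
}

changes = {
    name: round(relative_change(individual, composable), 1)
    for name, (individual, composable) in table3.items()
}
```

The resulting percentages are all positive, i.e., the composable variant raises more alerts than the individual one in every row, with the smallest gap for FullStation and the largest when both contexts are active.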
Results Summary and Analysis. These early results confirm that ComInA can provide useful insights into relationships between adaptations at run time that are not necessarily observable at design time. As a consequence, the system learns relationships with other adaptations as they occur, dynamically expanding its learning process. Consequently, learned relationships are taken into account by the Context Composer to select the adaptation, or the combination of adaptations, most suitable for the system performance. In compatible adaptation scenarios, composing adaptations proved more suitable for the system performance, while for conflicting adaptations better performance was achieved by selecting only a single one. Note that the relationship between the adaptations (i.e., whether they are compatible or conflicting) was not given to the system a priori, but ComInA was able to implicitly learn these relationships, yielding the most suitable composition behavior. However, this initial study focused only on simple composition strategies (i.e., to compose adaptations or not). Further work is needed to validate the wider applicability of our approach.
5 RELATED WORK
A recent paper extensively reviews the use of learning in collective self-adaptive systems [10]. Out of over 40 reviewed papers, about 60% use RL to learn behavior. However, no uses in context-aware adaptation compositions were found.
Non-learning-based approaches exist to manage the composition of dynamic adaptations as a means to avoid conflicting adaptations in the running system; we classify those by the general technique used for adaptation composition.
Goal-based approaches are used to adapt the system behavior according to the definition of its goals. Static rules dictate the composition of adaptations, depending on available service options. Such rules, and their operation flow, are predefined, which may not cover all possible cases in a dynamic environment. Examples of goal-based approaches are GoCoMo [8, 9], which adapts service composites in mobile networks to optimize QoS; adaptive goals [3], which describe possible adaptations for a particular goal coupled with different strategies, dictating the conditions under which adaptations should take place; and QuA [15], a component-based architecture which monitors services to identify available options that satisfy a service request with a higher QoS.
Policy-based approaches constrain the systems’ behavior to effect different
behavior for special situations, and manage its impact on the system [13, 14].
Policies are used to orchestrate the composition of adaptations. PobSAM
[20, 21] presents an actor-based formal model to manage dynamic adaptation.
PobSAM actors’ behavior is dictated by a manager, which operates following a
set of policies. Similar approaches are based on event systems to monitor and
adapt systems. In such models, composition events are raised and, if a
specified condition is met, the system adaptation executes as defined by the
policy [11, 17]. Policies are defined beforehand in these systems, but do not
need to be hard-coded and can be user-specified [4]. Once specified, the
policies define strict adaptations according to the interacting system
components.
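The event-condition-adaptation pattern described above can be illustrated with a minimal sketch. It is not taken from any of the cited systems: the `Policy` class, the `handle_event` function, the event fields, and the adaptation names are all hypothetical and serve only to show how a predefined policy ties a condition to an adaptation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical event payload, e.g. {"battery": 0.1, "signal": 0.9}.
Event = Dict[str, float]


@dataclass
class Policy:
    """A predefined policy: if the condition holds, run the adaptation."""
    name: str
    condition: Callable[[Event], bool]
    adaptation: Callable[[], str]


def handle_event(event: Event, policies: List[Policy]) -> List[str]:
    """Execute every adaptation whose policy condition holds for the event."""
    return [p.adaptation() for p in policies if p.condition(event)]


# Illustrative policies (assumed names and thresholds).
policies = [
    Policy("low_battery",
           lambda e: e.get("battery", 1.0) < 0.2,
           lambda: "reduce_refresh_rate"),
    Policy("weak_signal",
           lambda e: e.get("signal", 1.0) < 0.3,
           lambda: "switch_to_cache"),
]

executed = handle_event({"battery": 0.1, "signal": 0.9}, policies)
```

In this sketch every matching policy fires independently, and nothing checks whether the triggered adaptations interact; any such check would have to be written into the policies beforehand.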
Conict supervision approaches use monitoring strategies to
dictate when adaptations are needed. For particular identied sit-
uations requiring adaptation an associated resolution is specied.
When multiple adaptations are required, their order is specied by
users beforehand [1, 2, 7].
Modeling approaches are used to specify variability (e.g., SPLs [18],
BPM [27]), and system adaptations (e.g., refinements [22]). Such approaches
specify, at a high level of abstraction, the system components and their
adaptation replacement components. In such approaches, adaptation composition
is specified by the modeling language, and enforced dynamically by rule
systems describing the system restrictions and interactions between their
components.
Service tiles [6] enable fine-grained dynamic adaptation composition, similar
to our approach. Here, a service is composed of available components according
to the tile cardinality, defined as an integer linear programming problem.
This approach enables a dynamic composition policy based on available tiles.
However, policies may not always yield the best overall system behavior. In
ComInA, we use RL to get environment feedback on the system performance, to
learn the impact of compositions on the behavior.
6 CONCLUSION AND FUTURE WORK
This paper presents ComInA, a novel RL-based approach to context adaptation
composition that autonomously learns interactions between adaptations at run
time, which are used to learn the most appropriate adaptation composition for
each combination of active contexts, as they arise. ComInA consists of Context
Agents, learning the best adaptations for each context, Interaction Agents,
learning the relationships between adaptations, and Context Composer Agents,
using learned relationships to select the best adaptation combinations. ComInA
agents dynamically expand their learning process at run time, as new contexts
and adaptations are observed. Our initial evaluation confirms that ComInA
provides useful insights into relationships between adaptations at run time,
taking advantage of this information to select the best compositions.
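As a rough illustration of learning a composition from environment feedback, the sketch below uses a tabular, one-step value update with epsilon-greedy exploration to choose between executing one adaptation or both for a given set of active contexts. It is a simplified stand-in, not the ComInA implementation: the action names, the context names, the toy `reward` function (in which the two adaptations conflict), and all parameter values are assumptions.

```python
import random
from collections import defaultdict

# Hypothetical composition choices for two adaptations A and B.
ACTIONS = ("only_A", "only_B", "both")


def reward(active_contexts, action):
    """Toy environment (an assumption, not the paper's simulation):
    A and B conflict, so executing both is penalised."""
    return {"only_A": 1.0, "only_B": 0.5, "both": -1.0}[action]


def learn_composition(active_contexts, steps=2000, alpha=0.1,
                      epsilon=0.1, seed=0):
    """Learn a value per (state, composition) pair from observed rewards."""
    rng = random.Random(seed)
    q = defaultdict(float)              # value estimates start at 0
    state = frozenset(active_contexts)  # state = combination of contexts
    for _ in range(steps):
        if rng.random() < epsilon:      # explore a random composition
            action = rng.choice(ACTIONS)
        else:                           # exploit the current estimates
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        r = reward(state, action)
        # One-step update: move the estimate towards the observed reward.
        q[(state, action)] += alpha * (r - q[(state, action)])
    best = max(ACTIONS, key=lambda a: q[(state, a)])
    return q, best


# Context names are illustrative only.
q_values, best = learn_composition({"rush_hour", "road_closure"})
```

Under this toy reward the learned values rank the conflicting composition last, so the agent ends up preferring a single adaptation. A complementary pair could be modeled by giving the combined action the highest reward.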
A complete evaluation of ComInA’s applicability is required in multiple
dimensions. Most importantly, the performance in other possible relationships
between adaptations needs to be evaluated. For example, hard-conflict
situations need to be addressed, i.e., cases where executing conflicting
adaptations might be not just suboptimal, as was the case in our Scenario 2,
but unsafe for the system behavior, e.g., in safety-critical systems. In such
situations, instead of relying on trial-and-error (i.e., learning) to evaluate
the impact of composed adaptations, we could investigate whether analyzing
Q-values and W-values can identify such relationships directly. Different
means of composition also need to be investigated. In our initial evaluation,
at each timestep we execute either just one adaptation, or both. This decision
was made dynamically at each timestep, constituting an improvement over the
state of the art, but it is still limited to simple compositions. Potentially,
both complementary and conflicting adaptations can benefit from partial
composition. In complementary situations, overlapping parts of the adaptations
can be executed only once rather than executing multiple complete adaptations
in a composed scenario. Similarly, in the conflicting scenario, conflicting
sub-parts of the adaptations could potentially be identified and excluded from
the composition, while still partially composing other parts of adaptations.
Finally, scalability and the emerging behavior in large-scale systems should
be investigated, by implementing multiple collaborating Context Composers
managing their local adaptation compositions. Distributed W-learning [12] can
be used to enable system-wide coordination and collaboration between multiple
Interaction Agents.
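The idea of partial composition can be sketched as follows, under strong simplifying assumptions: each adaptation is modeled as an ordered list of named sub-steps, the set of conflicting step pairs is assumed to be already known, and earlier adaptations take priority over later ones. The function `compose_partially` and all step names are hypothetical; how such sub-parts and conflicts would actually be identified is exactly the open question raised above.

```python
from typing import FrozenSet, List, Set


def compose_partially(adaptations: List[List[str]],
                      conflicts: Set[FrozenSet[str]]) -> List[str]:
    """Merge adaptations step by step.

    - A step shared by several adaptations is kept only once.
    - A step that conflicts with an already selected step is skipped.
    """
    composed: List[str] = []
    for steps in adaptations:
        for step in steps:
            if step in composed:
                continue  # overlapping part: execute only once
            if any(frozenset((step, chosen)) in conflicts
                   for chosen in composed):
                continue  # conflicting sub-part: exclude it
            composed.append(step)
    return composed


# Illustrative adaptations (assumed step names).
extra_bus = ["dispatch_bus", "update_timetable"]
reroute = ["change_route", "update_timetable"]
close_depot = ["close_depot", "notify_passengers"]

# Complementary case: the shared step appears once in the result.
complementary = compose_partially([extra_bus, reroute], set())

# Conflicting case: only the conflicting sub-step is dropped.
known_conflicts = {frozenset(("dispatch_bus", "close_depot"))}
conflicting = compose_partially([extra_bus, close_depot], known_conflicts)
```

Giving priority to the first adaptation is only one possible design choice; a learned ranking of adaptations could replace the fixed ordering.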
Learning Run-time Compositions of Interacting Adaptations SEAMS’20, 2020, Seoul, South Korea
ACKNOWLEDGMENTS
We thank the anonymous reviewers for their input on earlier versions of the
manuscript. This publication is supported in part by a research grant from
Science Foundation Ireland (SFI) under Grant Number 16/SP/3804.
REFERENCES
[1] Nagwa Badr, Denis Reily, and A. TalebBendiab. 2002. A Conflict Resolution
Architecture For Self-Adaptive Systems. In International Workshop on Architecting
Dependable Systems (WADS’02).
[2] L. Baresi and S. Guinea. 2011. Self-Supervising BPEL Processes. IEEE Transactions
on Software Engineering 37, 2 (March 2011), 247–263.
https://doi.org/10.1109/TSE.2010.37
[3] Luciano Baresi and Liliana Pasquale. 2010. Live Goals for Adaptive Service
Compositions. In Proceedings of the 2010 ICSE Workshop on Software Engineering
for Adaptive and Self-Managing Systems (SEAMS’10). ACM, New York, NY, USA,
114–123. https://doi.org/10.1145/1808984.1808997
[4] Boxue Wang and Xinhuai Tang. 2014. Designing a self-adaptive and context-aware
service composition system. In IEEE Computers, Communications and IT Applications
Conference (ComComAp’14). 155–160.
https://doi.org/10.1109/ComComAp.2014.7017188
[5] Nicolás Cardozo, Ivana Dusparic, and Jorge H. Castro. 2017. Peace COrP: Learning
to solve conflicts between contexts. In Int. Workshop on Context-Oriented
Programming (COP’17). New York, NY, USA.
[6] L. Cavallaro, E. D. Nitto, C. A. Furia, and M. Pradella. 2010. A Tile-Based Approach
for Self-Assembling Service Compositions. In IEEE International Conference on
Engineering of Complex Computer Systems (ICECCS’10). 43–52.
https://doi.org/10.1109/ICECCS.2010.6
[7] A. Charfi, T. Dinkelaker, and M. Mezini. 2009. A Plug-in Architecture for
Self-Adaptive Web Service Compositions. In IEEE International Conference on Web
Services (ICWS’09). 35–42. https://doi.org/10.1109/ICWS.2009.125
[8] Nanxi Chen, Nicolás Cardozo, and Siobhán Clarke. 2016. Self-Organizing
Goal-Driven Services in Mobile Pervasive Computing. Transactions on Services
Computing (2016), to appear.
[9] Nanxi Chen and Siobhán Clarke. 2014. A Dynamic Service Composition Model
for Adaptive Systems in Mobile Computing Environments. In International
Conference on Service Oriented Computing (ICSOC’14), Vol. 8831. Springer Berlin
Heidelberg, 93–107.
[10] M. D’Angelo, S. Gerasimou, S. Ghahremani, J. Grohmann, I. Nunes, E. Pournaras,
and S. Tomforde. 2019. On Learning in Collective Self-Adaptive Systems: State
of Practice and a 3D Framework. In IEEE/ACM 14th International Symposium on
Software Engineering for Adaptive and Self-Managing Systems (SEAMS’19). 13–24.
https://doi.org/10.1109/SEAMS.2019.00012
[11] Pierre-Charles David and Thomas Ledoux. 2003. Towards a Framework for
Self-adaptive Component-Based Applications. In Distributed Applications and
Interoperable Systems, Jean-Bernard Stefani, Isabelle Demeure, and Daniel
Hagimont (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 1–14.
[12] Ivana Dusparic and Vinny Cahill. 2012. Autonomic multi-policy optimization in
pervasive systems: Overview and evaluation. TAAS 7, 1 (2012), 11.
[13] Christos Efstratiou, Keith Cheverst, Nigel Davies, and Adrian Friday. 2001. An
Architecture for the Effective Support of Adaptive Context-Aware Applications.
In Proceedings of the Second International Conference on Mobile Data Management
(MDM ’01). Springer-Verlag, London, UK, 15–26.
[14] C. Efstratiou, A. Friday, N. Davies, and K. Cheverst. 2002. Utilising the event
calculus for policy driven adaptation on mobile systems. In Proceedings International
Workshop on Policies for Distributed Systems and Networks. 13–24.
[15] Frank Eliassen, Eli Gjørven, Viktor S. Wold Eide, and Jørgen Andreas Michaelsen.
2006. Evolving Self-adaptive Services Using Planning-based Reflective Middleware.
In Proceedings of the 5th Workshop on Adaptive and Reflective Middleware
(ARM’06). ACM, New York, NY, USA, 1–6.
https://doi.org/10.1145/1175855.1175856
[16] Sebastián González, Kim Mens, Marius Colacioiu, and Walter Cazzola. 2013.
Context Traits: dynamic behaviour adaptation through run-time trait recomposition.
In Proceedings of International Conference on Aspect-Oriented Software
Development (AOSD’13). New York, NY, USA, 209–220.
https://doi.org/10.1145/2451436.2451461
[17] Phil Greenwood and Lynne Blair. 2006. A Framework for Policy Driven
Auto-adaptive Systems Using Dynamic Framed Aspects. In Transactions on
Aspect-Oriented Software Development II, Awais Rashid and Mehmet Aksit (Eds.).
Springer Berlin Heidelberg, Berlin, Heidelberg, 30–65.
[18] S. Hallsteinsen, E. Stav, A. Solberg, and J. Floch. 2006. Using product line
techniques to build adaptive systems. In International Software Product Line
Conference (SPLC’06). 150–160. https://doi.org/10.1109/SPLINE.2006.1691586
[19] Mark Humphrys. 1996. Action Selection methods using Reinforcement Learning.
In Proceedings of the Fourth International Conference on Simulation of Adaptive
Behavior. MIT Press, 135–144.
[20] Narges Khakpour, Saeed Jalili, Carolyn Talcott, Marjan Sirjani, and Mohammad
Reza Mousavi. 2009. PobSAM: Policy-based Managing of Actors in Self-Adaptive
Systems. In Proceedings of the International Workshop on Formal Aspects
of Component Software (FACS’09), Vol. 263. 129–143.
[21] Narges Khakpour, Ramtin Khosravi, Marjan Sirjani, and Saeed Jalili. 2010. Formal
Analysis of Policy-based Self-adaptive Systems. In Proceedings of the ACM
Symposium on Applied Computing (SAC’10). ACM, New York, NY, USA, 2536–2543.
https://doi.org/10.1145/1774088.1774613
[22] D. Pavlovic and D. R. Smith. 2001. Composition and refinement of behavioral
specifications. In Proceedings of the International Conference on Automated Software
Engineering (ASE’01). 157–165. https://doi.org/10.1109/ASE.2001.989801
[23] Richard S. Sutton and Andrew G. Barto. 2018. Reinforcement Learning, An
Introduction (second ed.). MIT Press. 550 pages.
[24] Mazeiar Salehie and Ladan Tahvildari. 2009. Self-Adaptive Software: Landscape
and Research Challenges. ACM Transactions on Autonomous and Adaptive Systems
4, 2 (May 2009), 14:1–14:42. https://doi.org/10.1145/1516533.1516538
[25] Guido Salvaneschi, Carlo Ghezzi, and Matteo Pradella. 2012. Context-Oriented
Programming: A Software Engineering Perspective. Journal of Systems and Software
85, 8 (Aug. 2012), 1801–1817. https://doi.org/10.1016/j.jss.2012.03.024
[26] Bradley Schmerl, Jesper Andersson, Thomas Vogel, Myra B. Cohen, Cecilia M. F.
Rubira, Yuriy Brun, Alessandra Gorla, Franco Zambonelli, and Luciano Baresi.
2017. Challenges in Composing and Decomposing Assurances for Self-Adaptive
Systems. In Software Engineering for Self-Adaptive Systems III. Assurances, Rogério
de Lemos, David Garlan, Carlo Ghezzi, and Holger Giese (Eds.). Springer
International Publishing, Cham, 64–89.
[27] Johanneke Siljee, Ivor Bosloper, Jos Nijhuis, and Dieter Hammer. 2005. DySOA:
Making Service Systems Self-adaptive. In International Conference on
Service-Oriented Computing (ICSOC’05), Boualem Benatallah, Fabio Casati, and Paolo
Traverso (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 255–268.
[28] Christopher J. C. H. Watkins and Peter Dayan. 1992. Technical Note: Q-Learning.
Machine Learning 8, 3 (May 1992).