Learning Run-time Compositions of Interacting Adaptations
Nicolás Cardozo
Systems and Computing Engineering Department
Universidad de los Andes, Colombia
n.cardozo@uniandes.edu.co
Ivana Dusparic
School of Computer Science and Statistics
Trinity College Dublin, Ireland
ivana.dusparic@scss.tcd.ie
ABSTRACT
Self-adaptive systems continuously adapt to internal and external changes in their execution environment. In context-based self-adaptation, adaptations take place in response to the characteristics of the execution environment, captured as a context. However, in large-scale adaptive systems operating in dynamic environments, multiple contexts are often active at the same time, requiring simultaneous execution of multiple adaptations. Complex interactions between such adaptations might not have been foreseen or accounted for at design time. For example, adaptations can partially overlap, requiring only partial execution of each, or they can be conflicting, requiring some of the adaptations not to be executed at all, in order to preserve system execution. To ensure a correct composition of adaptations, we propose ComInA, a novel reinforcement learning based approach, which autonomously learns interactions between adaptations as well as the most appropriate adaptation composition for each combination of active contexts, as they arise. We present an initial evaluation of ComInA in an urban public transport network simulation, where multiple adaptations to buses, routes, and stations are required. Early results show that ComInA correctly identifies whether adaptations are compatible or conflicting and learns to execute adaptations which maximize system performance. However, further investigation is needed into how best to utilize such identified relationships to optimize a wider range of metrics and utilize more complex composition strategies.
CCS CONCEPTS
• Computing methodologies → Reinforcement learning; • Software and its engineering → Software development techniques; Software design engineering;
KEYWORDS
Dynamic software composition, Reinforcement learning
ACM Reference format:
Nicolás Cardozo and Ivana Dusparic. 2020. Learning Run-time Compositions of Interacting Adaptations. In Proceedings of Symposium on Software Engineering for Adaptive and Self-Managing Systems, Seoul, South Korea, 2020 (SEAMS'20), 7 pages.
https://doi.org/10.1145/nnnnnnn.nnnnnnn
SEAMS’20, 2020, Seoul, South Korea
© 2020 Copyright held by the owner/author(s). Publication rights licensed to Association for Computing Machinery.
This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in Proceedings of Symposium on Software Engineering for Adaptive and Self-Managing Systems, 2020, https://doi.org/10.1145/nnnnnnn.nnnnnnn.
1 INTRODUCTION
Self-adaptive systems [24] enable the dynamic adaptation of software systems in response to input gathered from their surrounding (internal and external) execution environment. In this paradigm, information about the environment is used to select the most appropriate behavior from a set of predefined adaptation alternatives. Composition-based self-adaptive systems achieve dynamic adaptation by continuously (re)composing system modules with the appropriate specialized behavior to respond to the sensed environment. If multiple environment situations are sensed, multiple adaptations need to be composed. The relationship between adaptations can result in three broad composition situations, which we identify as: compatible, in which adaptations are independent from each other and their composition can take place safely; complementary, in which adaptations enhance each other or overlap, and either one or all can be executed; and conflicting, in which composing the adaptations will lead to an execution error, or the effects of the adaptations cancel each other out.
To illustrate these situations (and to evaluate our approach), we use the TranCity example, a public transport management system (Section 4), in which multiple adaptations to the buses, stations, and routes might be needed. For example, adaptations to close a station temporarily and to stop serving it can be seen as compatible (independent), while closing a station and sending an additional bus to it might be conflicting. Composing conflicting adaptations might lead to an erroneous system state. Adaptation mechanisms need to assure that no conflicts exist between the adaptations, or between adaptations and the base system. The best combination of adaptations to execute is the one that has the most positive impact on the global behavior of the system in its current context of execution. Therefore, for each possible context combination, a decision must be made whether, and which, adaptations to compose.
Managing the interaction between adaptations and their composition is still an open problem [26]. Different mechanisms exist to assure that the composition of adaptations still satisfies the system goals [26], but they all require the upfront definition of composition rules between adaptations (Section 5). This enforces a fixed resolution mechanism for all adaptation composition scenarios, which may not always yield the most appropriate behavior to fulfill the system's goal, as some of the composition interactions may have been unforeseen by the developers or may have changed over time. In addition, even if a conflict between adaptations is detected at design time, pre-defining the dominant adaptation that will always take precedence might not result in the best system performance, as adaptations might need to alternate to ensure fairness.
To address these issues, we propose a mechanism for the run-time composition of adaptations, Composing Interacting Adaptations (ComInA) (Section 3), in which the composition strategy is not presupposed by the system, but rather is learned as the composition most beneficial to the overall system execution. ComInA uses Reinforcement Learning (RL), in particular W-Learning [19], to learn: (1) the relationships (i.e., compatibility or conflict) between adaptations that arise when multiple contexts are sensed simultaneously, and (2) the most suitable individual or composed adaptations to execute for every combination of contexts, taking advantage of complementary adaptations and preventing the composition of conflicting ones. As new contexts and adaptations are discovered while the system executes, the ComInA learning processes are dynamically expanded to capture their relationships with the existing system. To realize adaptations, ComInA uses an adaptation engine at the programming language level, Context-oriented Programming (COP) [25], enabling fine-grained adaptations (i.e., at the method level), rather than the coarse-grained component-level adaptations used by existing approaches (Section 5).

ComInA's initial results show the feasibility and usability of the correct dynamic composition of adaptations. Further evaluation is still needed to assess the scope of ComInA's applicability (Section 6).
2 PRELIMINARIES
ComInA uses RL to learn the most suitable way of composing adaptations at run time. In particular, we use the W-Learning multi-objective learning technique to resolve the most suitable composition with respect to the system objectives. The dynamic adaptation behavior in ComInA is implemented using COP, a programming language-based adaptation engine. The learning composition behavior posited in this paper is implemented on top of such an adaptation engine. In this section, we provide the fundamental concepts from both techniques required to understand our proposed solution.
2.1 Reinforcement Learning
In RL, intelligent agents learn to map environment situations (i.e., states) to actions so as to maximize a long-term numerical reward signal they receive from the environment [23]. An RL agent uses a set of states, S, consisting of all states contained within an environment, and A, the "action space", i.e., the set of all actions the agent can execute that affect the environment; at each time step, it receives r, the numerical reward signal encoding the positive or negative impact the action had on the environment.

Q-learning [28] is a widely used model-free implementation of RL. The long-term quality of an action taken in a particular state is represented by a Q-value, Q(s,a) [28], and is learned for each state-action pair. Q-learning incorporates only a single source of reward, i.e., it enables optimization for a single system goal. To learn how to meet multiple system goals, multiple Q-learning processes can be implemented. However, as an agent can only execute a single action at a time, an arbitration approach needs to be added that resolves which of the Q-learning processes, i.e., which of the agent's goals, gets to take control of action execution. W-learning provides that mechanism. At every time step, each Q-learning process nominates the action to execute that is best suited to the goal it represents. By implementing W-learning on top, agents also learn, for each state of each of their goals, how important it is, in terms of received rewards, for the action nominated by that goal to be selected, as opposed to actions nominated by other goals. This is expressed as a W-value, W(s), and is learnt by each Q-learning agent for each state in which it can be. The goal with the maximum W-value takes priority at the next time step, and its nominated action is executed.
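To make these mechanics concrete, the following minimal sketch shows a tabular Q-learner and a W-value arbitration step. It is written in TypeScript (Context Traits, used later in the paper, extends ECMAScript); the class and function names are ours, and the arbitration shown is a simplification of Humphrys' full W-learning rule.

// Minimal tabular Q-learning with W-value arbitration (illustrative sketch).
class QLearner {
  private q = new Map<string, Map<string, number>>();
  constructor(private alpha = 0.1, private gamma = 0.1) {}

  private row(s: string): Map<string, number> {
    let r = this.q.get(s);
    if (!r) { r = new Map(); this.q.set(s, r); }
    return r;
  }

  // Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
  update(s: string, a: string, reward: number, sNext: string): void {
    const row = this.row(s);
    const old = row.get(a) ?? 0;
    const maxNext = Math.max(0, ...this.row(sNext).values());
    row.set(a, old + this.alpha * (reward + this.gamma * maxNext - old));
  }

  // Nominate the greedy action for state s (assumes a non-empty action set).
  nominate(s: string, actions: string[]): string {
    const row = this.row(s);
    return actions.reduce((best, a) =>
      (row.get(a) ?? 0) > (row.get(best) ?? 0) ? a : best);
  }
}

// Each goal holds a W-value per state, estimating how much reward it loses
// when its nominated action is not executed; the goal with the highest
// W-value wins and its nomination is executed (assumes at least one goal).
function arbitrate(goals: { w: number; nomination: string }[]): string {
  return goals.reduce((best, g) => (g.w > best.w ? g : best)).nomination;
}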
2.2 Context-oriented Programming
COP [25] is a programming language approach to realize dynamic behavior adaptations at a fine level of granularity (i.e., methods). COP enables the clean modularization of adaptations from the base program logic as well as from other adaptations. Adaptations are incorporated in a running system by dynamically recomposing the running system with the adaptation behavior. The dynamic composition model used in COP reifies the MAPE loop. Therefore, our solution transcends the boundaries of COP and is applicable to other adaptation engines.

To implement ComInA we use the Context Traits [16] COP language, an extension of ECMAScript enabling dynamic adaptations. Three main concepts are behind dynamic adaptations in COP: contexts, behavioral adaptations, and context activations. Contexts are defined as first-class entities of the system that capture meaningful situations from the system's surrounding execution environment. On their own, contexts represent situations sensed from the system's surrounding environment. Adaptive behavior is realized by means of behavioral adaptations, modular behavior (e.g., method) specifications defined in isolation from other components. Behavioral adaptations are associated with a context. When the context is activated (from anywhere in the system), the associated behavioral adaptation is composed into the system. If the context is no longer valid, it is deactivated, withdrawing its associated behavioral adaptations from the system.
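To make these three concepts concrete, the following hypothetical sketch renders contexts, behavioral adaptations, and context (de)activation in TypeScript. It mimics the flavor of a COP runtime but is not the actual Context Traits API; all names are illustrative.

// A hypothetical sketch of the three COP concepts (not the Context Traits API).
type Method = (...args: unknown[]) => unknown;

class Context {
  // Behavioral adaptations: method variants registered against this context.
  private adaptations: Array<{ target: any; method: string; behavior: Method }> = [];

  constructor(public readonly name: string) {}

  adapt(target: any, method: string, behavior: Method): void {
    this.adaptations.push({ target, method, behavior });
  }

  // Activation composes the adaptations into the running objects.
  activate(): void {
    for (const a of this.adaptations) {
      a.target[`__base_${a.method}`] ??= a.target[a.method];
      a.target[a.method] = a.behavior;
    }
  }

  // Deactivation withdraws them, restoring the base behavior.
  deactivate(): void {
    for (const a of this.adaptations)
      a.target[a.method] = a.target[`__base_${a.method}`];
  }
}

// Usage: a hypothetical FullStation context swapping a station method.
const station = { admit: () => console.log("passenger admitted") };
const fullStation = new Context("FullStation");
fullStation.adapt(station, "admit", () => console.log("station closed"));
fullStation.activate();   // station.admit() now prints "station closed"
fullStation.deactivate(); // base behavior restored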
In Context Traits, behavioral adaptations are composed following either a predefined static policy, or a dynamic policy in which the behavioral adaptations associated with the most recently activated context are executed first. In ComInA we combine COP with RL to override these rules for context adaptation, and enable the system to learn preferred compositions at run time.
3 LEARNING COMPOSITION STRATEGIES
ComInA is a novel adaptation composition approach for self-adaptive systems that autonomously learns interactions between adaptations as well as the most appropriate adaptation compositions for each combination of active contexts, as they arise. ComInA is based on W-Learning, and consists of (1) Context Agents, which are tasked with executing a designated adaptation for each context as well as learning the suitability of other adaptations available in the system, (2) Interaction Agents, which are tasked with learning how contexts and adaptations affect a given context (i.e., the relationships between contexts), and (3) a Context Composer, which, based on input from the Context and Interaction Agents, determines the adaptation, or combination of adaptations, to execute for each combination of active contexts. This section details the algorithms implemented by each of ComInA's agents.
3.1 Context Agent Design
ComInA-based systems contain a set of Context Agents Ac1, ..., Acn, implemented using a Q-learning RL process for each context. Initially, each Context Agent has a set of states capturing whether the context is active at a particular time (1) or not (0), Sqc1 = {Ci-0, Ci-1}, and an action set containing a single adaptation, A = {ac1}. ac1 is the adaptation required for the context, and is assumed to be given. The adaptation definition, whether pre-specified or learned, is irrelevant to the implementation of ComInA. The learning process of a Context Agent is outlined in Algorithm 1. Every time context c1 is active, the Ac1 agent proposes to execute adaptation ac1. However, based on other active contexts and the decision of the Context Composer, other adaptations could be executed. In such a case, the agent expands its action space with the executed adaptation (unknown before), and learns (using Q-learning) the impact of its execution on the system. The action set of the Context Agent is thus built at run time. Knowledge about other adaptations present in the system might affect the agent's preference for executing its own adaptation, e.g., if a better adaptation is discovered.
Algorithm 1 ComInA Context Agent Init and Learning Process
1: Sqi := {Ci-0, Ci-1}
2: Ai := {ai}
3: QLearning.Initialize(Sqi, Ai)
4: while true do ▷ continuously executing system
5:   CurrentContexts[] := senseEnvConditions()
     ▷ check whether the agent's context is currently active
6:   if context Ci is in CurrentContexts[] then
7:     nominateAdaptationToExecute(Ai)
8:     currentState := Ci-1
9:   else
10:    currentState := Ci-0
11:  end if
     ▷ get environment reward for current state
12:  reward := QLearning.getRwd(currentState)
     ▷ if executed adaptation not previously seen, expand action set
13:  execAdapt := getExecutedAdaptation()
14:  if execAdapt is not in Ai then
15:    Ai := Ai ∪ {execAdapt}
16:  end if
     ▷ standard Q-learning update
17:  QLearning.Update(prevState, execAdapt, reward)
18: end while
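A compact illustration of Algorithm 1's central idea, the run-time expansion of the action set (lines 13-16), might look as follows in TypeScript; the naming and structure are ours, not the actual implementation.

// Context Agent sketch: starts knowing only its designated adaptation and
// absorbs any adaptation the Context Composer actually executed, so that
// Q-values for it can be learned from then on.
class ContextAgent {
  readonly actions = new Set<string>();
  private q = new Map<string, number>(); // keyed by `${state}|${action}`

  constructor(readonly contextId: string, designatedAdaptation: string,
              private alpha = 0.1, private gamma = 0.1) {
    this.actions.add(designatedAdaptation);
  }

  // States Ci-1 / Ci-0: the agent's context is active or not.
  state(activeContexts: Set<string>): string {
    return `${this.contextId}-${activeContexts.has(this.contextId) ? 1 : 0}`;
  }

  // Q-learning update over whichever adaptation was actually executed.
  observe(prev: string, executed: string, reward: number, next: string): void {
    this.actions.add(executed); // expand action set if previously unseen
    let maxNext = 0;
    for (const a of this.actions)
      maxNext = Math.max(maxNext, this.q.get(`${next}|${a}`) ?? 0);
    const key = `${prev}|${executed}`;
    const old = this.q.get(key) ?? 0;
    this.q.set(key, old + this.alpha * (reward + this.gamma * maxNext - old));
  }

  // Nominate the greedy adaptation for the current state.
  nominate(state: string): string {
    let best = "", bestQ = -Infinity;
    for (const a of this.actions) {
      const v = this.q.get(`${state}|${a}`) ?? 0;
      if (v > bestQ) { best = a; bestQ = v; }
    }
    return best;
  }
}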
3.2 Interaction Agent Design
Interaction Agents, Awc1, ..., Awcn, defined per context, learn how their context interacts with other agents. Interaction Agents are implemented using W-learning. We selected W-Learning as the basis for our approach due to its ability to encode relationships between adaptations without having their relative priorities predefined or hard-coded at design time [5]. The initial relative priority of adaptations is expressed through the rewards given to Context Agents. However, during system execution, if an adaptation is "neglected" for a while, its associated W-value will eventually become higher than the other W-values, and even lower-priority adaptations will gain control of execution, enabling a dynamic shift in priorities. In addition, W-learning implicitly enables the detection of complementary adaptations. For example, multiple adaptations might be suitable for multiple goals; e.g., an adaptation might be suitable for another context as a "side-effect" of its execution, even though it was designated to adapt to a different context. In such a case, the W-values of both contexts will be low, as neither context has to "compete" to execute. Such interaction enables the detection of relationships between multiple adaptations by learning the best adaptation, or combination of adaptations, to be executed.

The learning process of an Interaction Agent is outlined in Algorithm 2. At the start, the state space of Awi is identical to the state space of the Context Agent, Swc1 = Sqc1. The state space of Interaction Agents is expanded at run time as new contexts are observed. Such expansion enables agents to (quantitatively) learn the impact of applying the preferred adaptation on the system performance, for each particular context combination. For example, an Interaction Agent for context c1 might expand its state space, at run time, to represent all active/inactive combinations with context c2 (statesW = ["C1-0,C2-0", "C1-0,C2-1", "C1-1,C2-0", "C1-1,C2-1"]), and learn W-values for each of said combinations.
Algorithm 2 ComInA Interaction Agent Init and Learning Process
1: Swi := Sqi
2: WLearning.Initialize(Swi)
3: while true do ▷ continuously executing system
4:   CurrentContexts[] := senseEnvConditions()
     ▷ translate all active contexts into a single state
5:   for all context c in CurrentContexts do
6:     currentWState += c
7:   end for
     ▷ if agent knows current state, get its importance (W-value)
8:   if currentWState is in Swi then
9:     w := WLearning.getW(currentWState)
       ▷ nominate action as suggested by Q-learning
10:    WLearning.nominateAdaptation(w, ai)
11:  else
       ▷ if unaware of current state, expand state set
12:    Swi := Swi ∪ {currentWState}
13:  end if
     ▷ standard W-learning update
14:  WLearning.Update(prevWState, execAdapt, reward)
15: end while
Note that in very large systems, Interaction Agents will not capture all other active contexts, only the contexts directly affecting the same components that their own context affects.
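The run-time state-space expansion of Algorithm 2 can be sketched as follows; this is an illustrative TypeScript rendering under our own naming, and the W-update shown is a simplified stand-in for the full W-learning rule.

// Interaction Agent sketch: the joint state string is built from the
// active/inactive flags of the contexts seen so far (e.g. "C1-1,C2-0"),
// and unseen joint states are added to the state set on the fly.
class InteractionAgent {
  private w = new Map<string, number>(); // W-value per joint state

  constructor(readonly contextId: string, private alpha = 0.1) {}

  jointState(flags: Map<string, 0 | 1>): string {
    const s = [...flags.entries()]
      .map(([c, on]) => `${c}-${on}`)
      .sort()
      .join(",");
    if (!this.w.has(s)) this.w.set(s, 0); // expand state set when unseen
    return s;
  }

  // Importance of this context's nomination in the given joint state.
  getW(state: string): number {
    return this.w.get(state) ?? 0;
  }

  // Simplified update: move W(s) toward the reward deficit suffered when
  // another goal's action was executed instead of this agent's nomination.
  update(state: string, deficit: number): void {
    const old = this.w.get(state) ?? 0;
    this.w.set(state, old + this.alpha * (deficit - old));
  }
}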
3.3 Context Composer
ComInA contains at least one Context Composer agent.¹ This module is in charge of composing the selected adaptations coming from the other agents. At each timestep, a Context Composer receives all adaptation candidates from the Context Agents and their associated impact from the Interaction Agents. The adaptation with the highest W-value is executed. However, as part of exploration, a Context Composer also executes compositions of adaptations, varying their combinations and their order, to evaluate the impact of composite adaptations on the system. Such composite adaptations are treated just like individual adaptations: they are added to the Context Composer's action space and their suitability for a particular context is learned (e.g., adapts=["A1","A2","A1,A2","A2,A1"]). Therefore, over time, individual Context Agents can suggest the most suitable composite adaptations for their context.

¹ In smaller systems a single composer might have a view of all Context and Interaction Agents, while in large ones multiple composers might each be in charge of a limited number of components, cooperating as a multi-agent system to ensure global performance.
Algorithm 3 Context Composition Selection and Execution
Require: currentWState
1: while true do
     ▷ collect preferred adaptations from relevant Context Agents
2:   var adaptNominations[][], finalAdaptation
3:   for all Ac1 to Acn do
4:     adaptNominations.push(Aci.adaptation, wi)
5:   end for
6:   maxAdapt := findMaxWvalue(adaptNominations)
7:   if exploring then
       ▷ try combinations of adaptations
8:     finalAdaptation := Aci.adaptation ∪ Acj.adaptation
9:   else
10:    finalAdaptation := maxAdapt
11:  end if
12:  finalAdaptation.execute()
13: end while
The executed adaptations are selected based on the learned interactions, using W-Learning in both the Context and Interaction Agents. The proposed combinations are then enacted using the ordered composition strategy from COP. However, ComInA does not fix the composition strategy, but allows composition strategies to be continuously defined based on the W-values of the agents' expanded state spaces.
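The selection step of Algorithm 3 can be sketched as follows (illustrative TypeScript, our names); note how an ordered composition is returned as a list, so that "A1,A2" and "A2,A1" remain distinct candidate actions.

// Context Composer selection sketch. Exploitation picks the nomination with
// the highest W-value; exploration occasionally tries an ordered composition
// of the two strongest nominations (assumes a non-empty nomination list).
interface Nomination { adaptation: string; w: number; }

function selectAdaptations(noms: Nomination[], exploring: boolean): string[] {
  const ranked = [...noms].sort((a, b) => b.w - a.w);
  if (exploring && ranked.length > 1) {
    // ordered composition of the two strongest nominations
    return [ranked[0].adaptation, ranked[1].adaptation];
  }
  return [ranked[0].adaptation];
}

// Enactment follows COP's ordered composition: each adaptation in the list
// is composed into the system in turn.
function execute(composition: string[], enact: (a: string) => void): void {
  for (const a of composition) enact(a);
}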
4 EVALUATION AND EARLY RESULTS
We evaluate the feasibility and performance of ComInA using a city transport management system.
4.1 Public Transport Case Study
The base TranCity system allows monitoring bus services in a city, observing the system's occupancy (of buses and stations), buses' frequency, the number of buses operating on a route, and road conditions (e.g., whether roads are blocked). Figure 1 shows a view of TranCity, with the highlighted route R4 used in the evaluation.

Figure 1: TranCity bus route system

As the mobility conditions in the city change continuously, the system needs to take actions that modify TranCity's basic behavior based on the currently gathered context. In particular, whenever any of the monitored variables' constraints are violated, for example, if a station exceeds its capacity, a context is activated and its associated adaptations are executed. Table 1 shows the subset of contexts and their associated adaptations used in the evaluation.

Context       Adaptation
FullBus       skip next station, send new bus
FullStation   close the station, send bus, redirect passengers
DepotEmpty    return a bus to the depot

Table 1: Contexts and their corresponding adaptations

During the system execution, multiple contexts can become activated simultaneously. The system must decide which adaptation, or combination of adaptations, is the best to execute to maximize performance. We implement this decision process using ComInA.
4.2 Evaluation Scenarios and Parameters
We evaluate the behavior and performance of ComInA in the following specific context combinations:

• Scenario 1: Independent/Complementary Adaptations, in which the FullBus and FullStation contexts can be simultaneously active,
• Scenario 2: Conflicting Adaptations, in which the DepotEmpty and FullStation contexts can be simultaneously active.

In each of the scenarios, we compare the performance of two ComInA composition variations with the baseline:

• ComInA Individual. The system dynamically decides which adaptation to execute for one of the active contexts,
• ComInA Composable. The system can choose to execute either of the individual adaptations or a composition of the adaptations associated with both contexts,
• Predefined Adaptation. The baseline behavior as provided by existing approaches. The adaptation which gets executed in case of multiple active contexts is predefined. We have two such baselines, one for each of the two active contexts always being the predefined "winner".
We capture system performance using two metrics:

• The number of context alerts, where fewer alerts means better performance. Context alerts happen as a result of abnormal system operation; these may come from an increase/decrease of passengers in buses/stations during TranCity's operation. Unresolved alerts continue to be raised at each timestep until resolved, so more effective adaptation strategies will have a lower number of alerts.
• Passenger delay, calculated at each timestep as the difference between the total cumulative delay all passengers experience at timestep t, and the delay they experienced at the previous timestep t−1 (formalized below). Ideally, if the system is operating smoothly, the additional delay introduced in each timestep is 0.
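Writing D(t) for the total cumulative delay experienced by all passengers up to timestep t (notation ours), the per-timestep delay metric is:

delay(t) = D(t) − D(t−1)

so a smoothly operating system yields delay(t) = 0 at every timestep.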
The combination of the above ComInA implementations, baselines, and metrics enables us to evaluate whether using ComInA is better for the system performance than predefined composition rules, as well as whether ComInA can detect relationships between context adaptations active simultaneously and learn to either compose them or execute a single adaptation.

Each scenario ran for 5000 learning episodes, where the initial 2/3 of the episodes were dedicated to exploring, i.e., learning Q-values and W-values, and the remaining 1/3 to exploiting the learned behavior. α and γ were both set to 0.1 in both scenarios.²

² The evaluation did not include extensive hyperparameter tuning and was performed with intuitively set initial parameters. Extensive evaluation of all hyperparameter combinations and their impact on system performance is left for future validation.
4.3 Results and Analysis
Figure 2: Delay per timestep for Scenario 1
              ComInA      ComInA      Predefined  Predefined
              individual  composable  Bus         Station
FullBus       2170        1766        1236        3315
FullStation   303         234         175         477
Both          173         109         48          320

Table 2: Number of context alerts raised in Scenario 1
Scenario 1 Results. Figure 2 shows the delay in Scenario 1; Table 2 depicts the number of raised alerts. For clarity of presentation, Figure 2 only shows the delay for the two ComInA-based approaches, while the table contains the alert counts for the baselines as well. Intuitively, the adaptations provided for the two contexts FullStation and FullBus are either compatible or complementary, as they both send an additional bus from the depot into the system (whether to the station or to a bus position). The FullStation adaptation additionally closes the station for further arriving passengers until capacity is freed. We observe better overall performance when composition of the two adaptations is allowed (both in terms of lower maximum delay and number of alerts raised) than when the system has to pick only one adaptation to execute (note that the drop in the delay around step 3700 corresponds to the system switching to exploitation of the learned knowledge). We observe that in 5000 steps, the number of times both FullStation and FullBus were active simultaneously varied between 48 and 320, signifying the frequency of such situations occurring and the need for dynamic approaches to resolve them. We also observe that the baseline where only the FullBus adaptation is executed has the lowest overall average delay (not pictured, but observed to range up to a maximum of 15 minutes vs. up to 300 in the composed case) and the lowest number of alerts raised. However, even though the results here show better overall performance, always prioritizing the Bus adaptation might result in unfairness and imbalance in the system. To better capture the impact at a fine-grained level per passenger, bus, or station, further evaluation will need to include a fairness metric, to prevent only one adaptation from ever being executed, and thus prevent neglecting individual stations/passengers which do not significantly impact the overall averages.
Scenario 2 Results. Figure 3 shows the delays in Scenario 2; Table 3 depicts the number of raised alerts. Intuitively, the adaptations associated with the contexts FullStation and DepotEmpty are conflicting, as one requires sending an additional bus into the system, and the other removing a bus from the system to serve as an emergency backup in the depot. Therefore, for this scenario we did not provide a baseline always executing one adaptation, since continuously removing buses from the system would result in no service. In terms of delay (focusing only on the exploitation stage), both versions achieve stable low delay in the system. However, in terms of raised alerts, allowing composed adaptations is over 20% worse across all types of alerts. This indicates that the system metrics correctly identified that the adaptations are conflicting and should not be composed (unlike Scenario 1, where composed adaptations improved performance).
Figure 3: Delay per timestep for Scenario 2
              ComInA individual  ComInA composable
FullStation   2380               2761
DepotEmpty    713                1055
Both          623                981

Table 3: Number of context alerts raised in Scenario 2
Results Summary and Analysis. These early results confirm that ComInA can provide useful insights into relationships between adaptations at run time that are not necessarily observable at design time. The system learns relationships with other adaptations as they occur, dynamically expanding its learning process. Consequently, the learned relationships are taken into account by the Context Composer to select the adaptation, or the combination of adaptations, most suitable for the system performance. In compatible adaptation scenarios, composing adaptations proved more suitable for the system performance, while for conflicting adaptations better performance was achieved by selecting only a single one. Note that the relationship between the adaptations (i.e., whether they are compatible or conflicting) was not given to the system a priori; ComInA was able to implicitly learn these relationships, yielding the most suitable composition behavior. However, this initial study focused only on simple composition strategies (i.e., to compose adaptations or not). Further work is needed to validate the wider applicability of our approach.
5 RELATED WORK
A recent paper extensively reviews the use of learning in collective self-adaptive systems [10]. Out of over 40 reviewed papers, about 60% use RL to learn behavior. However, no uses in context-aware adaptation composition were found.
Non learning-based approaches exist to manage the composition of dynamic adaptations as a means to avoid conflicting adaptations in the running system; we classify these by the general technique used for adaptation composition.
Goal-based approaches adapt the system behavior according to the definition of its goals. Static rules dictate the composition of adaptations, depending on the available service options. Such rules, and their operation flow, are predefined, and may not cover all possible cases in a dynamic environment. Examples of goal-based approaches are GoCoMo [8, 9], which adapts service composites in mobile networks to optimize QoS; adaptive goals [3], which describe possible adaptations for a particular goal, coupled with different strategies dictating the conditions under which adaptations should take place; and QuA [15], a component-based architecture which monitors services to identify available options that satisfy a service request with a higher QoS.
Policy-based approaches constrain the system's behavior to effect different behavior in special situations, and manage its impact on the system [13, 14]. Policies are used to orchestrate the composition of adaptations. PobSAM [20, 21] presents an actor-based formal model to manage dynamic adaptation. PobSAM actors' behavior is dictated by a manager, which operates following a set of policies. Similar approaches are based on event systems to monitor and adapt systems. In such models, composition events are raised, and if a specified condition is met, the system adaptation executes as defined by the policy [11, 17]. Policies are defined beforehand in these systems, but do not need to be hard-coded and can be user-specified [4]. Once specified, the policies define strict adaptations according to the interacting system components.
Conflict supervision approaches use monitoring strategies to dictate when adaptations are needed. For particular identified situations requiring adaptation, an associated resolution is specified. When multiple adaptations are required, their order is specified by users beforehand [1, 2, 7].
Modeling approaches are used to specify variability (e.g., SPLs [18], BPM [27]) and system adaptations (e.g., refinements [22]). Such approaches specify, at a high level of abstraction, the system components and their adaptation replacement components. In these approaches, adaptation composition is specified by the modeling language and enforced dynamically by rule systems describing the system restrictions and the interactions between their components.
Service tiles [6] enable fine-grained dynamic adaptation composition, similar to our approach. Here, a service is composed from available components according to the tile cardinality, defined as an integer linear programming problem. This approach enables a dynamic composition policy based on the available tiles. However, policies may not always yield the best overall system behavior. In ComInA, we use RL to get environment feedback on the system performance, to learn the impact of compositions on the behavior.
6 CONCLUSION AND FUTURE WORK
This paper presents ComInA, a novel RL-based approach to context adaptation composition that autonomously learns interactions between adaptations at run time, which are used to learn the most appropriate adaptation composition for each combination of active contexts, as they arise. ComInA consists of Context Agents, learning the best adaptations for each context, Interaction Agents, learning the relationships between adaptations, and Context Composer Agents, using the learned relationships to select the best adaptation combinations. ComInA agents dynamically expand their learning process at run time, as new contexts and adaptations are observed. Our initial evaluation confirms that ComInA provides useful insights into relationships between adaptations at run time, taking advantage of this information to select the best compositions.

A complete evaluation of ComInA's applicability is required in multiple dimensions. Most importantly, the performance in other possible relationships between adaptations needs to be evaluated. For example, hard-conflict situations need to be addressed: cases where executing conflicting adaptations might not be just suboptimal, as was the case in our Scenario 2, but unsafe for the system behavior, e.g., in safety-critical systems. In such situations, instead of relying on trial-and-error (i.e., learning) to evaluate the impact of composed adaptations, we could investigate whether analyzing Q-values and W-values can identify such relationships directly. Different means of composition also need to be investigated. In our initial evaluation, at each timestep we execute either just one adaptation, or both. This decision was dynamic for each timestep, constituting an improvement over the state of the art, but it is still limited to only simple compositions. Potentially, both complementary and conflicting adaptations can benefit from partial composition. In complementary situations, overlapping parts of the adaptations can be executed only once, rather than executing multiple complete adaptations in a composed scenario. Similarly, in the conflicting scenario, conflicting sub-parts of the adaptations could potentially be identified and excluded from the composition, while still partially composing the other parts of the adaptations. Finally, scalability and the emerging behavior in large-scale systems should be investigated, by implementing multiple collaborating Context Composers managing their local adaptation compositions. Distributed W-learning [12] can be used to enable system-wide coordination and collaboration between multiple Interaction Agents.
ACKNOWLEDGMENTS
We thank the anonymous reviewers for their input on earlier versions of the manuscript. This publication is supported in part by a research grant from Science Foundation Ireland (SFI) under Grant Number 16/SP/3804.
REFERENCES
[1] Nagwa Badr, Denis Reilly, and A. Taleb-Bendiab. 2002. A Conflict Resolution Architecture For Self-Adaptive Systems. In International Workshop on Architecting Dependable Systems (WADS'02).
[2] L. Baresi and S. Guinea. 2011. Self-Supervising BPEL Processes. IEEE Transactions on Software Engineering 37, 2 (March 2011), 247–263. https://doi.org/10.1109/TSE.2010.37
[3] Luciano Baresi and Liliana Pasquale. 2010. Live Goals for Adaptive Service Compositions. In Proceedings of the 2010 ICSE Workshop on Software Engineering for Adaptive and Self-Managing Systems (SEAMS'10). ACM, New York, NY, USA, 114–123. https://doi.org/10.1145/1808984.1808997
[4] Boxue Wang and Xinhuai Tang. 2014. Designing a self-adaptive and context-aware service composition system. In IEEE Computers, Communications and IT Applications Conference (ComComAp'14). 155–160. https://doi.org/10.1109/ComComAp.2014.7017188
[5] Nicolás Cardozo, Ivana Dusparic, and Jorge H. Castro. 2017. Peace COrP: Learning to solve conflicts between contexts. In International Workshop on Context-Oriented Programming (COP'17). New York, NY, USA.
[6] L. Cavallaro, E. D. Nitto, C. A. Furia, and M. Pradella. 2010. A Tile-Based Approach for Self-Assembling Service Compositions. In IEEE International Conference on Engineering of Complex Computer Systems (ICECCS'10). 43–52. https://doi.org/10.1109/ICECCS.2010.6
[7] A. Charfi, T. Dinkelaker, and M. Mezini. 2009. A Plug-in Architecture for Self-Adaptive Web Service Compositions. In IEEE International Conference on Web Services (ICWS'09). 35–42. https://doi.org/10.1109/ICWS.2009.125
[8] Nanxi Chen, Nicolás Cardozo, and Siobhán Clarke. 2016. Self-Organizing Goal-Driven Services in Mobile Pervasive Computing. Transactions on Services Computing (2016), to appear.
[9] Nanxi Chen and Siobhán Clarke. 2014. A Dynamic Service Composition Model for Adaptive Systems in Mobile Computing Environments. In International Conference on Service Oriented Computing (ICSOC'14), Vol. 8831. Springer Berlin Heidelberg, 93–107.
[10] M. D'Angelo, S. Gerasimou, S. Ghahremani, J. Grohmann, I. Nunes, E. Pournaras, and S. Tomforde. 2019. On Learning in Collective Self-Adaptive Systems: State of Practice and a 3D Framework. In IEEE/ACM 14th International Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS'19). 13–24. https://doi.org/10.1109/SEAMS.2019.00012
[11] Pierre-Charles David and Thomas Ledoux. 2003. Towards a Framework for Self-adaptive Component-Based Applications. In Distributed Applications and Interoperable Systems, Jean-Bernard Stefani, Isabelle Demeure, and Daniel Hagimont (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 1–14.
[12] Ivana Dusparic and Vinny Cahill. 2012. Autonomic multi-policy optimization in pervasive systems: Overview and evaluation. ACM Transactions on Autonomous and Adaptive Systems 7, 1 (2012), 11.
[13] Christos Efstratiou, Keith Cheverst, Nigel Davies, and Adrian Friday. 2001. An Architecture for the Effective Support of Adaptive Context-Aware Applications. In Proceedings of the Second International Conference on Mobile Data Management (MDM'01). Springer-Verlag, London, UK, 15–26.
[14] C. Efstratiou, A. Friday, N. Davies, and K. Cheverst. 2002. Utilising the event calculus for policy driven adaptation on mobile systems. In Proceedings International Workshop on Policies for Distributed Systems and Networks. 13–24.
[15] Frank Eliassen, Eli Gjørven, Viktor S. Wold Eide, and Jørgen Andreas Michaelsen. 2006. Evolving Self-adaptive Services Using Planning-based Reflective Middleware. In Proceedings of the 5th Workshop on Adaptive and Reflective Middleware (ARM'06). ACM, New York, NY, USA, 1–6. https://doi.org/10.1145/1175855.1175856
[16] Sebastián González, Kim Mens, Marius Colacioiu, and Walter Cazzola. 2013. Context Traits: dynamic behaviour adaptation through run-time trait recomposition. In Proceedings of the International Conference on Aspect-Oriented Software Development (AOSD'13). New York, NY, USA, 209–220. https://doi.org/10.1145/2451436.2451461
[17] Phil Greenwood and Lynne Blair. 2006. A Framework for Policy Driven Auto-adaptive Systems Using Dynamic Framed Aspects. In Transactions on Aspect-Oriented Software Development II, Awais Rashid and Mehmet Aksit (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 30–65.
[18] S. Hallsteinsen, E. Stav, A. Solberg, and J. Floch. 2006. Using product line techniques to build adaptive systems. In International Software Product Line Conference (SPLC'06). 150–160. https://doi.org/10.1109/SPLINE.2006.1691586
[19] Mark Humphrys. 1996. Action Selection methods using Reinforcement Learning. In Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior. MIT Press, 135–144.
[20] Narges Khakpour, Saeed Jalili, Carolyn Talcott, Marjan Sirjani, and Mohammad Reza Mousavi. 2009. PobSAM: Policy-based Managing of Actors in Self-Adaptive Systems. In Proceedings of the International Workshop on Formal Aspects of Component Software (FACS'09), Vol. 263. 129–143.
[21] Narges Khakpour, Ramtin Khosravi, Marjan Sirjani, and Saeed Jalili. 2010. Formal Analysis of Policy-based Self-adaptive Systems. In Proceedings of the ACM Symposium on Applied Computing (SAC'10). ACM, New York, NY, USA, 2536–2543. https://doi.org/10.1145/1774088.1774613
[22] D. Pavlovic and D. R. Smith. 2001. Composition and refinement of behavioral specifications. In Proceedings of the International Conference on Automated Software Engineering (ASE'01). 157–165. https://doi.org/10.1109/ASE.2001.989801
[23] Richard S. Sutton and Andrew G. Barto. 2018. Reinforcement Learning: An Introduction (2nd ed.). MIT Press. 550 pages.
[24] Mazeiar Salehie and Ladan Tahvildari. 2009. Self-Adaptive Software: Landscape and Research Challenges. ACM Transactions on Autonomous and Adaptive Systems 4, 2 (May 2009), 14:1–14:42. https://doi.org/10.1145/1516533.1516538
[25] Guido Salvaneschi, Carlo Ghezzi, and Matteo Pradella. 2012. Context-Oriented Programming: A Software Engineering Perspective. Journal of Systems and Software 85, 8 (Aug. 2012), 1801–1817. https://doi.org/10.1016/j.jss.2012.03.024
[26] Bradley Schmerl, Jesper Andersson, Thomas Vogel, Myra B. Cohen, Cecilia M. F. Rubira, Yuriy Brun, Alessandra Gorla, Franco Zambonelli, and Luciano Baresi. 2017. Challenges in Composing and Decomposing Assurances for Self-Adaptive Systems. In Software Engineering for Self-Adaptive Systems III. Assurances, Rogério de Lemos, David Garlan, Carlo Ghezzi, and Holger Giese (Eds.). Springer International Publishing, Cham, 64–89.
[27] Johanneke Siljee, Ivor Bosloper, Jos Nijhuis, and Dieter Hammer. 2005. DySOA: Making Service Systems Self-adaptive. In International Conference on Service-Oriented Computing (ICSOC'05), Boualem Benatallah, Fabio Casati, and Paolo Traverso (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 255–268.
[28] Christopher J. C. H. Watkins and Peter Dayan. 1992. Technical Note: Q-Learning. Machine Learning 8, 3 (May 1992).