Efficient Plan Adaptation through Replanning Windows
and Heuristic Goals
Alfonso E. Gerevini¹ and Ivan Serina²
¹ DEA, Università degli Studi di Brescia, via Branze 38, 25123 Brescia, Italy
gerevini@ing.unibs.it
² Free University of Bozen – Bolzano, Viale Stazione 16, 39042 Bressanone, Italy
ivan.serina@unibz.it
Abstract
Fast plan adaptation is important in many AI applications. From a theoretical
point of view, in the worst case adapting an existing plan to solve a new problem
is no more efficient than a complete regeneration of the plan. However, in practice
plan adaptation can be much more efficient than plan generation, especially when
the adapted plan can be obtained by performing a limited amount of changes to
the original plan. In this paper, we investigate a domain-independent method
for plan adaptation that modifies the original plan by replanning within limited
temporal windows containing portions of the plan that need to be revised. Each
window is associated with a particular replanning subproblem that contains some
“heuristic goals” facilitating the plan adaptation, and that can be solved using
different planning methods. An experimental analysis shows that, in practice,
adapting a given plan for solving a new problem using our techniques can be much
more efficient than replanning from scratch.
1 Introduction
Fast plan adaptation is important in many AI applications requiring a plan reasoning
module. A typical plan adaptation task consists of modifying a previously generated
plan in order to use it for solving a new problem which is “similar” to the original
one, in the sense that only a few facts in the initial and goal states are different. This
process can be either off-line (e.g., adapting a plan retrieved from a plan library before
its execution), or on-line (e.g., adapting a plan during a “mixed-initiative” construction
of it [3], or during its execution).
From a theoretical point of view, in the worst case adapting an existing plan is
no more efficient than a complete regeneration of the plan [15]. However, in practice,
we expect that in many cases plan adaptation should be much easier than a complete
replanning, because the adapted plan can be obtained by performing a limited amount
of changes to the original plan. Our general goal is the development of plan adaptation
methods that perform efficiently especially in such cases.
Adj is a domain-independent planning system based on planning graphs [1] that uses a collection of systematic and heuristic search techniques for solving plan adaptation tasks by replanning certain portions of the original plan (Adj is available by inquiring the authors). In this context, a plan is
a partially ordered set of STRIPS-actions, where each action is associated with a time
step and unordered actions are associated with the same time step (indicating that
these actions can be executed in any relative order). A first version of Adj using the
IPP planner [13] was presented in [8]. In this paper, we propose a significant extension
of the original version of Adj based on new heuristics for fast replanning, and we in-
vestigate the use of a different style of planning for the replanning phase. Moreover, we
introduce a plan optimization phase in Adj, and we present new experimental results
evaluating our approach.
Adj is based on identifying and resolving the inconsistencies (with respect to the
new problem) that are present in the original (input) plan by revising limited portions
of the plan. An inconsistency is an action-precondition that is not satisfied, a goal
of the new problem that is not achieved, or a pair of mutually exclusive actions [1].
The portions of the plan that are revised are delimited by some temporal windows on
the plan containing inconsistencies at specific time steps. Each replanning window is
associated with a particular replanning (sub)problem that is solved using a systematic
or heuristic search algorithm (in our implementation we used the search algorithms
of IPP [13] and FF [10], but other algorithms could be used as well). During the
adaptation process a temporal window can be incrementally enlarged up to contain, in
the worst case, the full plan. In such a case a complete replanning is performed. When
a subplan for solving the replanning problem associated with a temporal window W is
computed, the actions of the plan under adaptation that are in W are replaced with the
new subplan. This technique is complete, in the sense that if the replanning algorithm
is sound and complete and the new problem is solvable, then the revision of the input
plan for the old problem leads to a correct plan for the new problem.
In the rest of the paper, first we give a general description of our plan-adaptation
method, followed by a more detailed description; then we present the results of an
experimental evaluation showing that, in practice, our approach can be much more
efficient than replanning from scratch.
2 Plan Adaptation, Replanning Windows and Heuristic
Goals
In this section we describe our plan adaptation technique. We assume that the reader
is familiar with the Graphplan-style of planning [1].
2.1 A General Description of adjust-plan
Given a plan π0 for a planning problem Π0 and a new problem Π, differing from Π0 in some initial or goal fact(s), the plan adjustment process of Adj consists of three main phases:

1. Analysis of the input plan to determine a subset of the actions in π0 that are applicable for solving Π.
2. Identification of the set S of inconsistencies that are present in π0 with respect to Π (an inconsistency is a pair of mutually exclusive actions, an action with some unachieved precondition(s), or a goal of Π which is not achieved).

3. Revision of π0 to repair S, obtaining a valid plan for Π.
The first phase is performed by mapping each action of π0 to an action node of the planning graph G for the new problem (if such a node exists). This mapping considers three cases: (a) the number of time steps involved in π0 is the same as the number of levels in G; (b) the number of time steps involved in π0 is higher than the number of levels in G; (c) the number of time steps involved in π0 is smaller than the number of levels in G. Each of these cases will be discussed in the last part of Section 2.
The second and the third phases of the adaptation process are accomplished by an
algorithm, adjust-plan, that is correct and complete: the adjusted plan is a valid
plan for Π, and if there exists a plan for Π, then an adjusted plan is computed. Figure
1 describes the main steps of adjust-plan, while some important additional details
are given in the next subsections. The first step of adjust-plan identifies and removes
from π0 the actions that are not applicable in the context of the new problem. Then step 2 identifies the earliest time step at which π0 contains an inconsistency. Since we are considering deterministic domains, this can be accomplished in polynomial time by simulating the execution of the plan (analogously, the facts that are necessarily true at any time step can be determined in polynomial time).
Then adjust-plan processes the time step i identified at step 2, trying to remove the inconsistencies by replanning from time i−1 to time i. If there exists no plan, or a certain search limit is exceeded, then the replanning window is enlarged (e.g., we replan from i−1 to i+1). The process is iterated until either a (sub)plan is found, the window has been enlarged up to include all the actions of the plan and the (complete) replanning has failed, or the search has reached a predefined CPU-time limit (max-adjust-time). Note that, in our current implementation of adjust-plan, during replanning (step 5) the actions of π that are present in the replanning window are ignored (a new planning graph for the replanning problem is constructed).
The replanning process at step 5 is performed by first constructing the corresponding
replanning graph, and then searching it by using the backtracking scheme of IPP or FF.
At step 6 of adjust-plan the replanning window can be increased going either
backward in time (i.e., init-time is decreased), forward in time (i.e., goal-time is
increased), or both. Enlarging a replanning window corresponds to revising a larger
portion of the plan under adaptation. Such a portion will be replaced by the subplan
solving the replanning problem associated with the enlarged window (when this is
found). Adj has a default value for max-adjust-time that can be modified by the
user. In principle, if max-adjust-time is set to sufficiently high values, then adjust-
plan can increase a replanning window up to reach the earliest and latest time steps.
This would determine a complete search.
Algorithm: adjust-plan
Input: a plan π0 with some inconsistencies w.r.t. Π, a CPU-time limit max-adjust-time.
Output: either a correct plan or fail.

1. Let π be the plan obtained from π0 by removing the actions that are not applicable for solving Π;
2. Identify the earliest time step i in π containing an inconsistency; if there is no such time step, then return π;
3. If i is the latest time step of π, then set init-time to i−1 and goal-time to i; otherwise set init-time to i and goal-time to i+1;
4. While CPU-time ≤ max-adjust-time
5.   Replan using F(init-time) as the initial facts and G(goal-time) as the goals, where F(init-time) is the set of facts that are true at time init-time, and G(goal-time) is a set containing the preconditions of the actions in π at time goal-time;
6.   If there is no plan from F(init-time) to G(goal-time), or a search limit is exceeded, then decrease init-time or increase goal-time (i.e., enlarge the replanning window); otherwise replace the existing actions between init-time and goal-time in π with the new subplan, and goto 2.
7. Return fail.

Figure 1: High-level description of adjust-plan.
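To make the control flow of Figure 1 concrete, the following is a minimal Python sketch of the adjust-plan loop under simplifying assumptions: the Action type, the replan(F, G) callback (standing in for the IPP/FF search over a replanning graph), and all identifiers are illustrative rather than Adj's actual code, and mutex handling, the automatically increased search limit, and the heuristic Ω-goals of Sections 2.3–2.5 are omitted.

import time
from collections import namedtuple

# Illustrative STRIPS action: sets of ground facts (not Adj's internal representation).
Action = namedtuple("Action", "name pre add delete")

def state_at(plan, init, t):
    """Facts true just before the actions at time step t, by forward simulation."""
    state = set(init)
    for step in plan[:t]:                       # each step is a list of unordered actions
        dels = set().union(*(a.delete for a in step))
        adds = set().union(*(a.add for a in step))
        state = (state - dels) | adds
    return state

def earliest_inconsistency(plan, init, goals):
    """First step with an unsatisfied precondition, len(plan) if the final goals fail,
    or None if the plan is valid (mutex checks are omitted in this sketch)."""
    for i, step in enumerate(plan):
        state = state_at(plan, init, i)
        if any(not a.pre <= state for a in step):
            return i
    return None if goals <= state_at(plan, init, len(plan)) else len(plan)

def adjust_plan(plan, init, goals, replan, max_adjust_time=60.0):
    """Repair the earliest inconsistency by replanning inside a window that is enlarged
    on failure; replan(F, G) stands for an external search (e.g. IPP- or FF-style) that
    returns a list of steps or None."""
    start, plan = time.time(), [list(step) for step in plan]
    while time.time() - start <= max_adjust_time:
        i = earliest_inconsistency(plan, init, goals)
        if i is None:
            return plan                                     # valid plan for the new problem
        init_t, goal_t = (i - 1, i) if 0 < i == len(plan) else (i, i + 1)
        while time.time() - start <= max_adjust_time:
            F = state_at(plan, init, init_t)
            G = set(goals) if goal_t >= len(plan) else set().union(*(a.pre for a in plan[goal_t]))
            subplan = replan(F, G)
            if subplan is not None:
                plan[init_t:goal_t] = subplan               # replace the window with the new subplan
                break
            if init_t > 0:                                  # enlarge the replanning window
                init_t -= 1
            elif goal_t < len(plan):
                goal_t += 1
            else:
                return None                                 # full-plan replanning failed
    return None

The essential point is that a failed replanning call never aborts the adaptation: it only widens the window, so the process degrades gracefully towards complete replanning.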
Finally, during replanning within a particular window we impose a search limit that is automatically increased when the replanning window is enlarged. (In the current implementation, this limit constrains either the maximum number m of time steps or the number of actions in the solution of the replanning problem, depending on the style of planning, parallel or sequential, that we use for solving the replanning problem. For parallel planning, m is initially set to 3, and it is then increased by 2 each time the replanning window is enlarged by one level; for sequential planning, m is initially set to 4, and it is then increased by 3 times the number of actions in the replanning window.) The motivation for this heuristic is that, when a replanning problem associated with a certain window is hard to solve, it can be easier to revise a larger portion of the plan (instead of dedicating a big effort to the revision of a restricted part of the plan). While this does not affect completeness, in practice it can significantly speed up the computation of an adapted plan.
2.2 A Detailed Description of adjust-plan
The input plan π0 of adjust-plan is a plan for solving an old problem that we would like to adapt for solving a new problem Π, where the initial state or the goal state differs from the corresponding state in the old problem. We indicate with IΠ the initial state of Π, with π the plan under adaptation for solving Π, and with G the planning graph of Π. For the moment we assume that G is the planning graph for Π such that the last level of G contains all the goals of Π with no mutually exclusive relations between them. Moreover, we assume that length(G) = length(π0), i.e., that the number of levels in G (excluding the final goal level) is the same as the number of time steps of π0. In the last part of Section 2, we will generalise to the other cases.
The definition and computation of a replanning window use the following notion of applicable action:

Definition 1 (Applicable action). An action Ak of π0 at a certain time step k is applicable for solving Π if and only if Ak is a node of G at the action-level k.

Note that, by definition of planning graph [1], when the fact corresponding to a precondition of an action Ak of π0 is absent at the fact-level k of G, the action node corresponding to Ak is also absent at the action-level k of G.
Before starting the adaptation process we remove from π0 the actions that are not applicable, and we set π to the resulting plan. During the adaptation process π is incrementally revised by replacing subplans within certain temporal windows with new subplans.
Definition 2 (Replanning window). Given two time steps i and j of a plan π under adaptation for solving Π, a replanning window for π is a pair ⟨F(i), G(j)⟩ such that:

• 0 ≤ i < j and, if i > 1, then the subplan of π from time step 0 to time step i−1 is a valid subplan for achieving the preconditions of the actions at time i−1 from IΠ;

• F(i) is the set of positive facts that are true at time step i in π;

• G(j) is a set of (sub)goals either consisting of the final goals of Π (if j is the time step corresponding to the final goals of Π), or containing the preconditions of the actions at time j (otherwise).

The time steps i and j of a replanning window ⟨F(i), G(j)⟩ are called the replanning start time and the replanning end time, respectively; F(i) is called the replanning initial state, and G(j) is called the replanning goal set. Note that, when the replanning goal set does not correspond to the goals of Π, Definition 2 requires that G(j) contains the action-preconditions at time j, but it does not specify which other replanning goals should be included (if any); moreover, these extra goals are irrelevant for ensuring correctness and completeness [8]. On the other hand, as will be discussed in the next section, they can be useful for making the adaptation process more efficient.
Step 2 of adjust-plan guarantees that each replanning window contains the ear-
liest inconsistencies of π. This allows computing an exact assessment of the facts F(i)
that are true at the replanning start time. (If the plan contained an inconsistency
preceding the current replanning start time i, then we could compute only an approx-
imation of the facts that are true at time i, because these can depend on how the
preceding inconsistency will be repaired.) In particular, F(i) can be computed in poly-
nomial time by simulating the execution of the actions in π preceding time i; i.e., F(i) can be incrementally computed according to the following definition:

F(i) = IΠ                  if i = 0,
F(i) = Ei ∪ F⁺i−1          if i > 0,

where Ei is the set formed by the positive effects of the actions of π at time step i−1, and F⁺i−1 is the subset of F(i−1) formed by the facts that are not deleted by the actions of π at time i−1.
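As an illustration, this recurrence can be computed by a single forward pass over the plan prefix; the sketch below assumes the same illustrative Action representation used in the sketch of Section 2.1 (sets of ground facts for preconditions, add and delete effects) and is not Adj's code.

from collections import namedtuple

Action = namedtuple("Action", "name pre add delete")   # sets of ground facts

def replanning_initial_state(plan, initial_state, i):
    """F(i): the positive facts true at time step i, computed incrementally as
    F(0) = I_Pi and F(k) = E_k U F+_{k-1} for k > 0, where E_k collects the positive
    effects of the actions at step k-1 and F+_{k-1} keeps the facts of F(k-1) that
    those actions do not delete."""
    F = set(initial_state)                       # F(0) = I_Pi
    for k in range(1, i + 1):
        step = plan[k - 1]                       # actions of the plan at time step k-1
        E_k = set().union(*(a.add for a in step))
        deleted = set().union(*(a.delete for a in step))
        F = E_k | (F - deleted)                  # E_k U F+_{k-1}
    return F

# For instance, with pickup = Action("pickup_D", {"clear_D", "ontab_D", "arm_empty"},
#                                    {"hold_D"}, {"ontab_D", "clear_D", "arm_empty"}),
# replanning_initial_state([[pickup]], {"clear_D", "ontab_D", "arm_empty"}, 1)
# returns {"hold_D"}.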
In [8] we prove that adjust-plan is sound and complete, provided that max-adjust-time
is set to a sufficiently high value and that a sound and complete algorithm is used for
the replanning phase.
2.3 Heuristic Replanning Goals
According to Definition 2, the replanning goal set can contain some goals in addition
to the preconditions of the actions that are planned at the replanning end time. In the
following, Σj indicates this set of action-preconditions, while Ωj indicates the (possibly empty) set of the remaining goals of G(j), i.e., G(j) = Σj ∪ Ωj.
Including a non-empty Ω-set in G(j) can be useful because some actions in the subsequent part of the plan might require that the Ω-facts hold in the replanning goal state (even though they are not preconditions of actions at time j). This need arises, for example, when an action A of π0 in the replanning window has some effect φ that is not required by any other action in the replanning window, but that persists up to a time step after the replanning end time (j) to achieve a precondition of an action B that otherwise would not be satisfied (using the causal-link planning terminology, we can say that π0 contains the causal link A --φ--> B). The inclusion of φ in G(j) is useful because the (sub)plan found during replanning might not necessarily contain A (all the actions in the replanning window are always replaced by the new subplan solving the corresponding replanning problem). Thus, if φ were not in G(j) and, after the subplan replacement, φ did not hold at time j, then adjust-plan would identify an inconsistency (the unsatisfied φ-precondition of B) at a time later than j, and hence another replanning phase would be activated. Reducing the amount of replanning can be crucial not only for efficiency, but also for reusing as much as possible the input plan, which is important, for example, in mixed-initiative planning.
The Ω goal set of G(j) can be seen as an assessment of the set Ω*_j of the facts that should hold at time j for achieving those preconditions of actions in the subsequent part of π that otherwise would be unsatisfied (and so these actions would not be reusable for solving Π). Note that, in general, computing an exact assessment of Ω*_j would not be feasible, because the subsequent part of π can change incrementally, as a consequence of repeated replanning phases to cope with inconsistencies at times later than j. Fortunately, including an exact assessment of Ω* in G(j) is not necessary for ensuring the correctness and completeness of the method [8]. Instead, the inclusion of the Ω-goals should be seen as a heuristic aimed at reusing as many actions of the original plan as possible, and at obtaining a fast adaptation.
We have developed definitions of Ω-goals that are based on the assumption that Π can be solved by a plan which is similar to π, i.e., that the adaptation of π0 requires a limited number of changes. In the rest of the paper, we present two such definitions and we experimentally compare them in order to analyze their importance and effectiveness in practice.
We will use the following notation for associating a time step in the plan under adaptation π to a time step in the input plan π0. Let t be a time step of π; t̂ denotes the time step corresponding to t in π0 before any modification. I.e., t̂ is defined as

t̂ = t − Σ_{s=1..k} δs,

where k is the number of subplan replacements performed by the algorithm, and δs is the difference between the number of time steps involved in the new replanned subplan and the number of time steps in the corresponding existing subplan, for the s-th replacement.
2.4 Forward Ω Goal Set
We now give the definition of the forward Ω replanning goals (the ΩF-goal set), which uses the notion of persistent fact with respect to π0 and to the planning graph of the new problem. ΩF_j is a set of facts that we include in G(j) to obtain a replanning goal state that is similar to the corresponding state at time ĵ in π0. The reason for this is that, if π0 is correct for the old problem Π0, then each state reached by π0 contains a set of facts that is an exact assessment of its Ω*-set. Hence, if we can solve Π using a plan that is similar to π0, then we can expect that Ωj is a good approximation of Ω*_j.
Definition 3 (Persistent fact). A fact fk is persistent in π0 and G at time k if and only if fk is a node of G at the fixpoint level such that the corresponding noop-action is not mutually exclusive with any applicable action of π0.
In the following definition, endtime(π0) indicates the last time step of plan π0.

Definition 4 (Forward Ω Goal set – ΩF). Let k be a time step of π0, i the earliest time step where π0 has an inconsistency, j the end time of the current replanning window in π, and Γk the following set of facts:

Γk = F(k)              if k ≤ i,
Γk = Ek ∪ Γ⁻k−1        if k > i,

where Ek is the set formed by the positive effects of the applicable actions of π0 at time k, and Γ⁻k−1 is the subset of Γk−1 formed by the facts that are persistent in π0 and G at time k−1. ΩF_j is the subset of Γĵ consisting of the facts that are persistent in π0 and G at time ĵ, if ĵ < endtime(π0), and the empty set if ĵ = endtime(π0).
Note that the Γ-sets in the definition of the ΩF-goals of G(j), as well as the Σ-goals, can contain facts that are mutually exclusive in G at level ĵ. This can be the case because the set Ej of action-effects, as well as the set Σj of action-preconditions, can contain facts that are mutually exclusive. This can happen even if the original plan from which we start the adaptation is valid (for the old problem), because the new problem Π can have a different initial state, introducing mutual exclusion relations that were not present in the planning graph of the old problem. During the computation of ΩF_j, if a certain Γx contains a subset of mutually exclusive facts, then we impose that no more than one of them (randomly chosen) belongs to Γx+1 (for x < j).
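A possible way to implement the ΩF computation of Definitions 3 and 4 is sketched below. The persistent(f, k) and mutex(f, g, k) predicates stand in for queries to the planning graph of the new problem (no-op mutexes at the fixpoint level and fact mutexes, respectively), the plan representation is the same illustrative one used earlier, and the effect set of each Γ-step is taken from the plan actions applied between levels k−1 and k, as in the worked example that follows; none of this is Adj's actual implementation.

import random
from collections import namedtuple

Action = namedtuple("Action", "name pre add delete")   # sets of ground facts

def forward_omega(pi0, F_i, i, j_hat, persistent, mutex):
    """Omega^F at the (mapped) time step j_hat, following Definitions 3-4: starting from
    Gamma_i = F(i), add the positive effects of the plan actions and keep only the facts
    that are persistent; at each step retain no more than one fact of every mutually
    exclusive group; finally keep the facts of Gamma_{j_hat} persistent at j_hat."""
    if j_hat >= len(pi0):                        # j_hat = endtime(pi0): Omega^F is empty
        return set()
    gamma = set(F_i)                             # Gamma_k = F(k) for k <= i
    for k in range(i + 1, j_hat + 1):
        step = pi0[k - 1]                        # applicable actions applied between k-1 and k
        E_k = set().union(*(a.add for a in step))
        gamma = E_k | {f for f in gamma if persistent(f, k - 1)}
        kept = []                                # mutex filter: at most one per exclusive group
        for f in sorted(gamma, key=lambda _: random.random()):
            if not any(mutex(f, g, k) for g in kept):
                kept.append(f)
        gamma = set(kept)
    return {f for f in gamma if persistent(f, j_hat)}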
Figure 2 gives an example illustrating the elements of the sets F(i), Γj, ΩF_j and G(j). The figure shows a portion of the planning graph G for a problem Π in the blocks world domain, that we are trying to solve by adapting a plan π0 (for the sake of clarity, the information in the planning graph that is not relevant for the example is omitted). Initially we have π = π0. We assume that all the actions of π0 are applicable, that the earliest inconsistency in π0 is at time i, and that the current replanning window is ⟨F(i), G(i+1)⟩. Moreover, we assume that, after the application of the actions of π preceding i, the set of positive facts in the replanning initial state is the same as Γi (since in this example k̂ = k for any step k, to simplify the notation k̂ is indicated with k):

Γi = F(i) = {hold_C, on_A_B, on_D_E, clear_D, clear_A}.

Note that the facts clear_B and ontab_D are present in G, but after the execution of π up to time i−1 they are false. stack_C_B is the only action of π at time i. This action has two preconditions: hold_C and clear_B. clear_B is not satisfied at level i, and this causes an inconsistency in π at time step i. The elements of Γi+1 are the union of:

• the set of the positive effects of the action stack_C_B: {clear_C, on_C_B, arm_empty};

• the subset Γ⁻i of the facts in Γi that at time i+1 continue to be true, i.e., the facts in Γi that are persistent in π0 and G: {clear_D, on_D_E, clear_A} (on_D_E, clear_D and clear_A are the only facts in F(i) whose corresponding no-ops at level i in the planning graph are not mutually exclusive either with stack_C_B or with each other).

hold_C does not belong to Γ⁻i (and to Γi+1), as the corresponding no-op is mutually exclusive with stack_C_B (i.e., hold_C is not persistent in π0 and G at time i). Similarly, on_A_B also does not belong to Γi+1. The Σ-set and the ΩF-set forming G(j) are obtained as follows (arm_empty and on_D_E in Γi+1 are not in ΩF_{i+1} because they are not persistent in π and G at i+1):

Σi+1 = {arm_empty, clear_D, ontab_D},  ΩF_{i+1} = {on_C_B, clear_C, clear_A}.
2.5 Causal Ω Goal Set
The previously defined ΩF-set contains all the facts that are persistent at a specific
time step. Unfortunately, this set could contain some facts that are not necessary for
the correctness of the remaining actions of the plan, which could make solving the
local planning problem harder than necessary. If we consider the example of Figure 2,
fact clear_C at time step i+1 is not necessary to the applicability of pickup_D and stack_D_A.

In order to improve the definition of the heuristic goal set, we can exploit the causal link representation of the plan under adaptation [16]: the existence of a causal link a --f--> b "crossing" a time step l indicates that the truth of fact f at l is necessary for the correctness of π. More precisely, the set Lπ of the causal links in π is defined as:

Lπ = { (a --f--> b) | a, b ∈ π ∧ f ∈ add(a) ∧ f ∈ prec(b) ∧ Levπ(a) < Levπ(b) ∧
       ¬∃ c ∈ π s.t. f ∈ del(c) ∧ Levπ(a) ≤ Levπ(c) < Levπ(b) },

where Levπ(x) corresponds to the time step of the action x in π. Moreover, we define the set Lπ_l ⊆ Lπ of the causal links in π crossing a time step l as follows:

Lπ_l = { (a --f--> b) ∈ Lπ | Levπ(a) < l ≤ Levπ(b) }.
[Figure 2 here: a fragment of the planning graph G between levels i and i+1 (facts, the actions stack_C_B and pick_up_D, their no-ops, and the mutual exclusion relations), together with the blocks-world state at level i and the goal state. Goals = {on_C_B, on_D_A}; π = { ... ; i: stack_C_B; i+1: pickup_D; i+2: stack_D_A }.]
Figure 2: Example illustrating the definitions of F(i), Γj, ΩF_j and G(j), for j = i+1. The actions of π at times i and i+1 are stack_C_B and pickup_D, respectively. Solid boxes represent the facts belonging to F(i), and dashed boxes the facts of G that are not in F(i). The facts belonging to G(j) are indicated with gray boxes, while the facts belonging to ΩF_j are enclosed in a box.
The f-facts involved in the causal links of Lπ_l can be used to derive a new definition of the Ω-goal set at time step l:

ΩC_l = { f | ∃ (a --f--> b) ∈ Lπ_l }.

For the example of Figure 2, we have the following causal links crossing time step i+1 (ainit and aend are special actions whose effects are the facts that are true in the initial state, and whose preconditions are the problem goals, respectively):

Lπ_{i+1} = { stack_C_B --on_C_B--> aend,  stack_C_B --arm_empty--> pickup_D,
             ainit --clear_D--> pickup_D,  ainit --clear_A--> stack_D_A }

and so we have ΩC_{i+1} = {on_C_B, clear_A, arm_empty, clear_D}.
Differently from ΩF_{i+1}, ΩC_{i+1} does not contain the fact clear_C, but it includes clear_A, which is necessary to the applicability of stack_D_A (note that the facts arm_empty and clear_D already belong to Σi+1). It is important to observe that, since π could be an invalid plan, there could be some conflicts between facts in ΩC_l and in Σ. In these cases, in order to avoid having conflicting replanning goals, it is necessary to select a subset of the computed replanning goals. We do this heuristically: having to choose between two facts that are mutually exclusive, we prefer the fact that (a) is used by an action that is temporally nearest to the time step of the replanning goals, and (b) whose truth satisfies the greatest number of action preconditions.
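The sets Lπ and ΩC_l can be computed directly from the definitions above, as in the following sketch. The (level, action) plan representation and the Action type are illustrative assumptions, the special actions ainit and aend are expected to be added by the caller, and the mutex-based selection among conflicting goals just described is omitted; this is not Adj's implementation.

from collections import namedtuple

Action = namedtuple("Action", "name pre add delete")   # sets of ground facts

def causal_links(steps):
    """L_pi for a plan given as a list of (level, action) pairs: a --f--> b holds when a
    adds f, b needs f, a precedes b, and no action at a level in [Lev(a), Lev(b)) deletes f.
    The special actions a_init/a_end can be passed by the caller as extra pairs."""
    links = []
    for la, a in steps:
        for lb, b in steps:
            if la >= lb:
                continue
            for f in a.add & b.pre:
                threatened = any(f in c.delete and la <= lc < lb for lc, c in steps)
                if not threatened:
                    links.append((a, f, b, la, lb))
    return links

def causal_omega(steps, l):
    """Omega^C_l: the facts carried by the causal links of L^pi_l, i.e. those crossing l."""
    return {f for (_a, f, _b, la, lb) in causal_links(steps) if la < l <= lb}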
2.6 Additional Details
To conclude the description of our plan-adaptation method, we consider the cases in which the number of time steps nt involved in π0 is different from the number of levels nl in the planning graph G.
[Figure 3 here: a plan drawn over its levels, with the replanning windows W1, ..., W4 (delimited by time steps i1, j1, ..., i4, j4) used during adaptation, and the alternative optimization windows O1, O2, ..., Oend considered during the optimization process.]

Figure 3: Example illustrating the iterative optimization procedure. W1, ..., W4 are the replanning windows of the subplans replaced during adaptation. The Oi are the alternative replanning windows considered during the optimization process so as to find a series of plans, each of which is characterized by a smaller number of time steps/actions with respect to the previous plan.
When nt > nl, the applicable actions of π0 are determined by first extending G to have the same number of levels as the time steps of π0, and then applying a definition of applicable action analogous to Definition 1.
When nt < nl, the mapping between plan actions and graph nodes in the definition of applicable actions is done by considering the last nt levels of G. E.g., we try to map actions at time nt to nodes at the last level of G, actions at time nt−1 to nodes at the penultimate level of G, and so on. If for an action A of π0 there is no corresponding action node in G, then we consider A an action that cannot be applied for solving Π, and we remove it from π. If nt < nl, we say that a fact fk is persistent in π0 and G if and only if fk is a node of G at level l = k + length(G) − length(π0), and the corresponding noop-action at level l is not mutually exclusive with any applicable action of π0 at time k.
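In other words, the alignment between plan time steps and graph levels reduces to an index shift, as the small sketch below illustrates; graph_action_names(level) is a hypothetical accessor on G used for illustration only, not an Adj function.

def applicable(action_name, k, graph_action_names, n_t, n_l):
    """Applicability test generalised to n_t != n_l, assuming graph_action_names(level)
    returns the names of the action nodes at a given action-level of G. When the plan
    has fewer time steps than G has levels, step k is aligned with the last n_t levels
    (the same shift used for the persistence test); when it has more, G is assumed to
    have been extended so the levels match and no shift is needed."""
    level = k + (n_l - n_t) if n_t < n_l else k
    return action_name in graph_action_names(level)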
2.7 Plan Optimisation
Concerning plan quality, the plan adaptation techniques that we have presented can clearly find a first solution that is not optimal with respect to the number of actions or time steps. If we are interested in finding good quality plans, which is important in many applications, we can extend our method by identifying a "suitable" alternative replanning window O for each previous replanning phase, and repeating the adaptation process in order to find a new subplan with a lower number of actions/time steps w.r.t. the corresponding number of actions/time steps in O.
The iterative optimization process illustrated in the example of Figure 3 produces a
series of optimized plans, each of which involves a smaller number of either time steps
or actions. The longer this process runs, the better the final plan can be. This process
can be interrupted at any time to return the best plan generated so far.
The computation of an optimized plan is based on considering alternative replan-
ning windows for the subplans that were previously inserted, during either the plan
adaptation phase or the plan optimization process. For each of these subplans, we
have to consider a replanning window that is larger than the one previously used; if,
for this enlarged window, we find a subplan involving a smaller number of time steps
(or actions) w.r.t. the current subplan in the replanning window under consideration,
then the plan actions inside the replanning window are replaced with those in the newly
generated subplan.
For instance, if we have replanned from F(î) to G(ĵ), obtaining a subplan with k time steps, then we will optimize from F(i′) to G(j′), with j′ > ĵ and i′ ≤ î, imposing on the solution subplan a maximum number of time steps equal to (j′ − ĵ) + k + (î − i′). If there is no plan from F(i′) to G(j′) that satisfies such a constraint, then we either decrease i′ or increase j′ (i.e., we enlarge the replanning window); otherwise, we revise the plan with the new subplan, and then we consider another replanning window previously processed by the plan adaptation/optimization process for the possible improvement of the corresponding subplan. In principle, each replanning window can be enlarged up to contain the entire current plan, i.e., F(i′) = IΠ (the initial state) and G(j′) = GΠ (the goals of Π). In this case, if we solve the replanning problems using an optimal planner, like IPP, the generated plan is obviously guaranteed to be optimal.
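The bound imposed on the subplan searched in an enlarged optimization window is simply the number of time steps the window currently spans, as the following small helper (an illustration, not Adj code) makes explicit.

def optimisation_step_bound(i_hat, j_hat, k, i_new, j_new):
    """Maximum number of time steps allowed for the subplan searched from F(i_new) to
    G(j_new), given that the window [i_hat, j_hat] is currently covered by a subplan of
    k time steps: (j_new - j_hat) + k + (i_hat - i_new), i.e. the number of steps the
    enlarged window currently spans; a strictly shorter subplan improves the plan."""
    assert i_new <= i_hat and j_new > j_hat
    return (j_new - j_hat) + k + (i_hat - i_new)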
3 Experimental Results
We present the results for some experiments aimed at testing the efficiency of our plan
adaptation approach, which is implemented in the Adj system. The general goal of
our experiments was to confirm the hypothesis that plan adaptation using Adj can be
much faster than replanning from scratch, for which we used two types of planners: the
optimal parallel planner IPP based on planning graphs, and the satisficing state-based
sequential planner FF based on heuristic search techniques. These well-known planners
are among the best available for these styles of planning. We tested our system using a
large collection of known planning problems in different known domains: Rocket [20],
Logistics [12], Gripper [4], DriverLog and Zenotravel [14]. For each of these problems
we designed some modifications (to the initial state or to the goals) and we solved the modified versions using a solution plan for the original problem as part of the input. (The machine we used for the tests was an Intel(R) Xeon(TM) CPU at 2.40 GHz, with 1 GB of RAM.)
Concerning the comparison of Adj with IPP, we used a set of variants of some
benchmarks from the 1st International Planning Competition (IPC). For this experi-
ment, IPP was also used by Adj to solve the planning problems associated with the
replanning windows. Figure 4 compares the CPU-times consumed by Adj with the
CPU-times consumed by IPP for solving the same set of problem variants, which are
named on the x-axis of the plots using numbers. Each problem variant contains a few changes to the facts of the initial or goal state(s) of the original problem, making the input plan solving the original problem invalid for the revised problem. (The formalisation of the test problems is available from http://www.ing.unibs.it/~serina/adaptation/adjust-problems.tar.) In general, these results show that solving a problem by plan adaptation using Adj can be much faster than by replanning from scratch, up to four orders of magnitude. (For example, Adj with ΩC solved the Logistics_c problem 14 in 0.66 seconds and with ΩF in 0.75 seconds, while IPP required 1871 seconds.)
[Figure 4 here: three plots of CPU time in seconds (logarithmic scale) over the problem variants. Rocket_b variants: OmegaCL 30 solved, OmegaF 30 solved, IPP 24 solved. Logistics_c variants: OmegaCL 45 solved, OmegaF 45 solved, IPP 20 solved. Gripper_12 variants: OmegaCL 30 solved, OmegaF 30 solved, IPP 9 solved.]

Figure 4: CPU-seconds (logarithmic scale) required by Adj and IPP for solving variants of Rocket_b, Logistics_c, and Gripper_12.
[Figure 5 here: three plots of CPU time in seconds (logarithmic scale) over the problem variants. Logistics Untyped Strips domain, log70 problems: OmegaCL 175 solved, OmegaF 175 solved, FF 175 solved. DriverLog domain, pfile20 problems: OmegaCL 146 solved, OmegaF 61 solved. ZenoTravel Strips domain, pfile20 handcoded problems: OmegaCL 96 solved, OmegaF 47 solved, FF 87 solved.]

Figure 5: CPU-seconds (logarithmic scale) required by Adj and the corresponding generation CPU-time using FF for the variants of Logistics70, DriverLog Strips pfile20 and Zenotravel Strips Handcoded pfile20.
The main reason for this observed behaviour is that, very often, the planning subproblems defined by the replanning windows are much easier than the full problem considered by IPP, and they can be solved by shorter (sub)plans. In general, the performance of Adj depends on the number of replanning windows considered, on their size, and on the complexity of the corresponding replanning problems; we observed, however, that the first two factors seem less crucial than the third one.
From the results in Figure 4 we can also observe that, for this experiment, the performances of Adj with ΩC and with ΩF are comparable. We believe that the main reason here is the simplicity of the test problems, for which the use of ΩF is sufficiently effective. On the other hand, we observe that the overhead due to the computation of causal links in ΩC does not slow down the overall performance.
The problems used for the experimental analysis presented in the next part of this
section are variants of hard problems from the 2nd and 3rd IPCs. None of these
problems can be solved by IPP within the CPU-time and memory limits imposed in
our experiments. Since FF was the winner of the 2nd IPC (domain-independent track),
for these problems we used FF both as the reference planner for the replanning from
scratch and as the planner for solving the local replanning problems in Adj. For this
experiment, the mutual exclusion relations used by Adj were computed by running the DISCOPLAN system [6]. Figure 5-a shows the results for several variants of the very
hard problem logistics-70 (from the “additional problems track” of the 2nd IPC). We
can observe that plan adaptation is about two orders of magnitude faster than planning
from scratch. The performances of ΩC and ΩF are again very similar; sometimes the less accurate ΩF heuristic is even more effective than the ΩC heuristic, but we observed that often the solutions generated using ΩF are not as good as those obtained using ΩC.
Figure 5-b gives results for the Driverlog domain from the 3rd IPC [14]. We cre-
ated a set of variants to the pfile20 problem (the most difficult STRIPS problem of
the “standard track” of the 3rd IPC). These modifications concern the initial and/or
final positions of the involved drivers and packages. FF solved none of these problem
variants within the 3600 CPU-seconds limit. Differently from the previous cases, in this experiment we observed that the ΩC goal sets are very useful compared to the ΩF goal sets: Adj with ΩC solves 146 of the 175 problem variants, while Adj with ΩF solves only 61 variants. Moreover, Adj with ΩC is generally faster than with ΩF, and it finds solutions with better quality. Interestingly, ΩF can solve two problems that ΩC cannot solve. We think that the reason is related to the fact that Adj with ΩC can remove some useless actions from the plan, which allows the system to find better quality solutions, but which for these problems makes processing the final replanning windows more difficult.
Figure 5-c gives results for a set of variants of the pfile20 “handcoded” problem in
the Zenotravel STRIPS domain (the hardest problem in this domain, originally designed
for testing the performance of domain-dependent planners). Here again we can observe a generally good behavior of ΩC, with which Adj solves 96 of the 105 problem variants, while with ΩF it solves only 47 variants. FF solves 87 variants (using at most 1800 CPU-seconds). Overall, Adj with ΩC is one or two orders of magnitude faster than
FF, although the first solution produced by Adj can contain more actions than the
corresponding solution produced by FF.
As previously observed, the first plan computed by Adj can involve a number
of time steps or actions that is higher than necessary. In Figure 6, we give some
experimental results about the effectiveness of the plan optimisation phase of Adj.
On the x-axis, we have the CPU-seconds for finding a solution; on the y-axis, we have
plan qualities, in terms of either number of time steps (levels) or number of actions, for
the various solutions that are incrementally generated during the optimisation process.
The curves in Figure 6 indicate that Adj computes very quickly a first solution
which is worse than the optimal solution. The optimisation process can then incrementally derive additional solutions with better quality, up to a solution that, in terms of the number of involved time steps or actions, is similar to the optimal one. For example, concerning the Logistics_a variant 24, Adj computes a first solution involving 16 time steps in 1.24 seconds. Then the optimization process derives additional solutions involving 15, 14, 13, 12, and 11 time steps, using 1.8, 4.2, 142, 1599, and 1800 CPU-seconds, respectively, while IPP computes an optimal solution in 1430 CPU-seconds.
Moreover, Figure 6 also compares the performances of FF and Adj with either ΩC or ΩF. Although usually the first solution computed by Adj is not better than the one generated by FF, we observe that, for the test problems considered in this experiment, Adj generates additional plans, quickly deriving a solution that is better than the solution computed by FF.
[Figure 6 here: three plots of plan quality against CPU-seconds. The first shows the number of levels for ADJUST-PLAN vs. IPP; the second and third show the number of actions for OmegaCL, OmegaF and FF on the log70-I1-G1-n4 and log70-I5-G1-n1 variants.]

Figure 6: Optimisation phase for variant num. 24 of Logistics_a considering Adj vs. IPP, and variants 34 and 151 of Logistics70 Untyped considering Adj vs. FF.
4 Conclusions and Future Work
We have presented a method and some heuristics for efficient domain-independent
plan adaptation that are implemented in the Adj system. An experimental analysis
indicates that solving a planning problem by adapting an existing plan for a similar
problem using Adj can be dramatically more efficient than replanning from scratch
using two well-known approaches to planning.
In the literature, other methods and systems for plan adaptation have been pro-
posed, such as priar [11], prodigy/analogy [20] and spa [9]. These systems use underlying planning algorithms and data structures that are significantly different from
those implemented in Adj. Another important difference concerns the input informa-
tion. While Adj uses a very simple and general plan description that can be easily
generated from the output of many planners (or even written by hand), the other men-
tioned systems use specific additional information (such as the “refinement” decisions
taken during the search process in spa) constraining the input plan to be a plan gen-
erated by a specific planner. Other recent related systems are Far-off [17] and van
der Krogt & de Weerdt’s system [19], each of which is based on techniques that are
significantly different from Adj’s techniques.
We are currently studying a new version of Adj using weaker, but computationally more efficient, data structures and an extension to support numeric and temporal
domains [14]. Other current and future work includes further techniques for determin-
ing replanning goals, and additional experiments, including, in particular, a comparison
with the performance of other related plan adaptation systems. (A preliminary compar-
ison with the two most related systems mentioned above indicates that Adj performs
more efficiently.) Moreover, we are studying the integration of LPG [5] and SGPlan
[21] into Adj for solving the problems associated with Adj’s replanning windows.
Finally, we intend to develop a complete case-based planning system, which selects
an existing plan from a library and adapts it to solve the current problem using Adj.
Since adapting an existing plan is PSPACE-Complete [15], the use of an exact general
method for deciding whether adapting a plan from a library gives a computational
advantage w.r.t. a complete replanning from scratch seems practically infeasible. How-
ever, this question can be addressed by using heuristic methods such as the polynomial
technique presented in [17].
References
[1] A. Blum and M.L. Furst. Fast planning through planning graph analysis. Artificial Intel-
ligence, 90:281–300, 1997.
[2] B. Drabble. Excalibur — A program for planning and reasoning with processes. Artificial
Intelligence, 62(1):1–40, 1993.
[3] G. Ferguson and J. Allen. Towards a mixed-initiative planning assistant. In Proceedings of AIPS-96, 1996.
[4] M. Fox and D. Long. The detection and exploitation of symmetry in planning problems.
In Proceedings of IJCAI-99, pages 956–961, 1999.
[5] A. Gerevini, A. Saetti, and I. Serina. Planning through stochastic local search and temporal
action graphs. Journal of Artificial Intelligence Research (JAIR), 20:239–290, 2003.
[6] A. Gerevini and L. Schubert. Inferring state constraints in discoplan: Some new results.
In Proceedings of the AAAI-00, pages 761–767. AAAI press/The MIT Press, 2000.
[7] A. Gerevini and I. Serina. Fast planning through greedy action graphs. In Proceedings of
the AAAI-99, pages 503–510. AAAI Press/MIT Press, July 1999.
[8] A. Gerevini and I. Serina. Fast plan adaptation through planning graphs: Local and sys-
tematic search techniques. In Proceedings of the AIPS-00, pages 112–121. AAAI Press/MIT
Press, 2000.
[9] S. Hanks and D.S. Weld. A domain-independent algorithm for plan adaptation. Journal
of Artificial Intelligence Research (JAIR), 2:319–360, 1995.
[10] J. Hoffmann. FF: The fast-forward planning system. AI Magazine, 22(3):57–62, 2001.
[11] S. Kambhampati and J. A. Hendler. A validation-structure-based theory of plan modification and reuse. Artificial Intelligence, 55:193–258, 1992.
[12] H.A. Kautz and B. Selman. Pushing the envelope: Planning, propositional logic, and
stochastic search. In Howard Shrobe and Ted Senator, editors, Proceedings of the AAAI-
96, pages 1194–1201. AAAI Press, 1996.
[13] J. Koehler, B. Nebel, J. Hoffmann, and Y. Dimopoulos. Extending planning graphs to
an ADL subset. In Fourth European Conference on Planning (ECP’97), pages 273–285.
Springer Verlag, 1997.
[14] D. Long and M. Fox. The 3rd international planning competition: Results and analysis.
Journal of Artificial Intelligence Research (JAIR), Forthcoming, 2003.
[15] B. Nebel and J. Koehler. Plan reuse versus plan generation: A complexity-theoretic
perspective. Artificial Intelligence, 76:427–454, 1995.
[16] J.S. Penberthy and D.S. Weld. UCPOP: A sound, complete, partial order planner for
ADL. In Proceedings of KR’92, pages 103–114, Boston, MA, 1992. Morgan Kaufmann.
[17] F. Tonidandel and M. Rillo. The Far-off system: A heuristic search case-based planning. In Proceedings of AIPS-02, pages 302–311. AAAI Press, 2002.
[18] D.R. Traum, J.F. Allen, G. Ferguson, P.A. Heeman, C.-H. Hwang, T. Kato, N.
Martin, M. Poesio, and L. K. Schubert. Integrating natural language understanding and
plan reasoning in the TRAINS-93 conversation system. In Working Notes of the AAAI
Spring Symposium on Active NLP, pages 63–67, Stanford, CA, 1994.
[19] R. van der Krogt and M. de Weerdt. Plan repair as an extension of planning. In Proceedings
of ICAPS’05, pages 161–170. AAAI, 2005.
[20] M. Veloso. Planning and learning by analogical reasoning, volume 886 of Lecture Notes in
Artificial Intelligence and Lecture Notes in Computer Science. Springer-Verlag Inc., New
York, NY, USA, 1994.
[21] C. W. Hsu, B. W. Wah, R. Huang, and Y. X. Chen. New features in SGPlan for handling soft constraints and goal preferences in PDDL3.0. In Abstract Booklet of the Fifth International Planning Competition, ICAPS'06, 2006.