Content uploaded by Houssam Abbas
Author content
All content in this area was uploaded by Houssam Abbas on Apr 02, 2017
Content may be subject to copyright.
Convergence Proofs for Simulated Annealing Falsification of Safety
Properties*
Houssam Abbas1and Georgios Fainekos2
Abstract— The problem of falsifying temporal logic proper-
ties of hybrid automata can be posed as a minimization problem
by utilizing quantitative semantics for temporal logics. Previous
work has used a variation of Simulated Annealing (SA) to solve
the problem. While SA is known to converge to the global
minimum of a continuous objective function over a closed and
bounded search space, or when the search space is discrete,
there do not exist convergence proofs for the cases addressed
in that previous work. Namely, when the objective function
is discontinuous, and when the objective is a vector-valued
function. In this paper, we derive conditions and we prove
convergence of SA to a global minimum in both scenarios.
We also consider matters affecting the practical performance
of SA.
I. INTRODUCTION
One of the major challenges in the model-based develop-
ment of Cyber-Physical Systems (CPS) is how to automati-
cally verify the correctness of a CPS model with respect to
some formal specification. The proliferation of embedded
computers in a multitude of safety critical systems and
the well documented cases of CPS system failures due to
software-physical system interactions [1], [2] demonstrate
the urgency and the importance of the problem. However,
it is well known [3] that the verification problem of CPS is
undecidable, in general. Therefore, a lot of research effort
has been focused on testing-based methodologies [4]–[13].
In previous work [14]–[16], a notion of robustness of
temporal logics [17] is utilized as a cost function in order to
convert the temporal logic falsification problem of CPS into
an optimization problem. In detail, the robust semantics of
a temporal logic formula over a CPS trajectory evaluate to
a positive value if the trajectory satisfies the specification
and to a negative value otherwise. Thus, we can convert
the falsification problem into a minimization problem of the
specification robustness over the set of all system trajectories.
In general, the resulting optimization problem is non-convex
and non-linear and the search space is uncountable. Thus, in
[14]–[16], a number of stochastic heuristic search techniques
were employed in order to solve the minimization problem
with very promising results. In particular, in [14], a version
of Simulated Annealing (SA) [18], [19] was utilized and a
new SA heuristic for the minimization of a particular class
*This work was partially supported by NSF awards CNS-1017074 and
CNS-1116136.
1H. Abbas is with the Department of Electrical, Computer and Energy
Engineering, Arizona State University, Tempe, AZ, USA hyabbas at
asu.edu
2G. Fainekos is with the Department of Computing, Informatics and
Decision Systems Engineering, Arizona State University, Tempe, AZ, USA
fainekos at asu.edu
of vector functions was also proposed. For the former, it was
claimed that under the assumption of a finite search space1
we can guarantee convergence of SA to the global minimum
and, thus, guarantee the solution of the original falsification
problem. The convergence of the SA algorithm for vector
functions was left as an open problem.
In this paper, we lift the assumption of the finite search
space and we answer the question of what classes of CPS
and under what conditions we can guarantee the convergence
of SA to the global minimum. Furthermore, we derive
conditions so that the SA algorithm over vector functions
converges to the global minimum, as well. In brief, we prove
that if a CPS is simulatable [20], then the SA algorithms are
guaranteed to converge to the global minimum if the global
minimum does not belong to an equivalence class of measure
zero. The results in this paper are important because they
help us understand the practical and theoretical limitations
of the application of SA to the falsification problem of CPS.
While they are presented for safety requirements due to space
constraints, they can easily be extended to general temporal
logic formulae.
Notation.µ(·)denotes the Lebesgue measure. k·k is the
Euclidean distance, and B(x) = {y∈Rd| ky−xk< }.
R≥0= [0,∞)and N≥0=R≥0∩N. ‘Discrete set’ will mean
a finite or countably infinite set. For a set X,P(X)is the
set of all subsets of X.
II. PROB LEM F ORM ULATI ON
A. Falsification of safety properties of hybrid automata
We now introduce the practical setting in which the above
two algorithms are applied, and all theorems are proven. We
stress that we prove the correctness of the SA algorithms
when applied to this practical problem. We consider a
deterministic, non-Zeno2hybrid automaton [21]
H= (L, X, F low, I nit, Inv, E , G, Re)
where L={`0, `1, . . .} ⊂ Nis a countable set of ‘locations’,
X⊂Rdis the continuous state space, F low :L×X→X
is a vector field describing the continuous evolution of a
trajectory at (`, x),Init ⊂Xis the set of initial continuous
conditions, Inv :L→ P(X)associates an invariant set
of F low(`, ·)to each location `,E⊂L×Ldescribes the
possible jumps between locations (i.e. (`i, `j)∈Eiff there
1Any compact set of initial conditions and/or other search parameters is
going to be discretized to a finite set of floating-point numbers.
2An automaton is Zeno if it has trajectories that perform an infinite
number of discrete jumps in a finite amount of time. This is an artifact
of the modeling, and can not happen in reality.
1594
Fiftieth Annual Allerton Conference
Allerton House, UIUC, Illinois, USA
October 1 - 5, 2012
978-1-4673-4539-2/12/$31.00 ©2012 IEEE
exists a trajectory of the system that visits `ithen `jwithout
visiting any other location in-between), G:E→ P(X)
defines guard conditions that cause jumps, and Re :E×X→
Xis the reset map which resets the continuous state with
every jump. H=L×Xis the ‘state-space’ of H. In this
paper we consider automata for which Init ⊂Inv(`0)so
the initial set H0={`0} × Init.
Given an initial state h= (`0, x)∈H0, the hybrid
trajectory that starts there is a vector function ηh: [0,∞)→
Hwhich associates a pair (location, continuous state) to each
point in time: ηh(t)=(`(t), s(x, t)) where s(x, t)is the
continuous state at time tand `(t)is the location of s(x, t).
loc(h)=(`0, `1, . . .)is the list of locations visited by ηh,
with no repetitions. At the time of the jth jump time tj,
say between locations iand k,ηh(tj)is actually set-valued:
ηh(tj) = {(i, tj),(k, tj)}. Because there is no ambiguity
about `0,η(`0,x)and loc(`0, x)will also be denoted by ηx
and loc(x)respectively.
We are given a safety property φof the automaton, and
the set U ⊆ Xof states that violate φ. To falsify φmeans to
find an initial state h0∈H0such that the trajectory ηh0that
starts there enters the unsafe set U. Here we only consider
trajectories of finite duration D < ∞. With non-Zenoness,
this implies that loc(x)is always finite, even if Lis countably
infinite. Lmodels the discrete variables of the automaton so
the range of each xiis not a discrete set3. We will need the
following definitions [14]:
Definition 1 (Discrete distances): lUis the location of the
unsafe set U. Let G= (L, E)be the graph with vertex set L
and edge set E. The discrete distance π(`, `0)between the
two locations `and `0is the length of the shortest path in
Gbetween the locations. Given x∈Init and the trajectory
ηx,`(x)is the location visited by ηxthat is closest to `U,
and k(x)the corresponding distance:
`(x) = argmin`∈loc(x)π(`, `U)
k(x) = min`∈loc(x)π(`, `U) = π(`(x), `U)
Definition 2 (Continuous distances): For x∈X,
dU(x) = infy∈U kx−ykis the distance between xand the
set U. Given (`, x)∈H,r(x) = mint≤DdZ(s(x, t)), where
Zis either Uif `U∈loc(x)(trajectory enters the location
of U), otherwise it is the guard leading to `0, where `0is
the next location in a shortest path in Gfrom `to `U.
Definition 3 (Robustness): Given a trajectory ηx, its ro-
bustness is V(x),(k(x), r(x)). This is the smallest
‘distance’ between the trajectory and the unsafe set U(note
this isn’t a distance in the mathematical sense of the word).
The robustness time t(x)is the time when trajectory ηxis
closest to U:`(x) = `(t(x)) and r(x) = dZ(s(x, t(x)).
Finding an x0that produces an unsafe trajectory (one that
enters U) can be achieved by finding the automaton’s tra-
jectory with smallest robustness. Because only one initial
location is possible (namely, `0), the search for a minimum
3If xiis a discrete state variable taking values in Qi⊂Z, then Lcan
be augmented to L0=L×Qi: every jump in the value of xican then be
modeled as a change of locations in the extended set L0. The continuous
dynamics are unaffected.
robustness-trajectory is to be carried over the initial set Init.
The minimization problems treated in this paper are then:
Problem 1: For a dynamical system (with only one loca-
tion, L={`0})
D=Init, R=R
V(x) = min
t≤DdU(s(t;x))
min
x∈D V(x)
(1)
Problem 2: For a hybrid automaton:
D=Init, R=Z×R
V(x)=(k(x), r(x))
min
x∈D V(x)
(2)
B. Simulators of hybrid automata
Simulated Annealing (which will be formally defined in
the next section) requires the ability to evaluate Vat any
point of the search space: this evaluation requires the simu-
lation of a system trajectory starting at that point. A hybrid
system simulator Hs, which is necessarily a discretization
of the real system H, must be accurate, in the sense that
for every simulated trajectory (generated by the simulator)
starting at some xs∈Inits, there is an arbitrarily close
real system trajectory (generated by H) starting at x∈Init
(Initsis the discretization of Init).
Not every hybrid system admits an accurate simulator.
In [20] sufficient conditions are given on Hand Hsfor
Hsto be an accurate simulator of H. The details of these
conditions are given in the appendix. The following is a
direct consequence of [20, Theorem 3.4]:
Proposition 1: Let Hbe a hybrid automaton, and let P
be the partition of Init induced by the equivalence relation
x≡x0iff loc(x) = loc(x0). Let S∈Pbe a part such
that µ(S)6= 0. If Hsatisfies the conditions for accurate
simulation over S, then for any x0∈Sand every > 0, there
exists δ > 0with the following property: for every trajectory
ηx(·)with initial point xin Bδ(x0)∩S, and every t < D,
there exists t0such that |t−t0|< ,s(x, t0)and s(x0, t)are
in the same location, and ks(x, t0)−s(x0, t)k< .
III. SIM UL ATE D ANNEALING
Simulated Annealing (SA) is a well-known iterative
stochastic algorithm for global optimization. We are given
an objective function Vwith domain D ⊂ Rdand range R.
Dis known as the state space. The objective is to minimize
V.
We are given a Markov kernel R(·,·)on (D,B)where B
is the Borel sigma field on D. This is called the transition
Markov kernel. Thus for each xin D,R(x, ·)is a probability
measure on (D,B), and for each B∈ B,R(·, B)is a
measurable function. We are also given a cooling schedule
(τ0, τ1, . . .)on (D,B): this is a sequence of (possibly ran-
dom) positive numbers.
1595
A. Traditional SA
In traditional SA, D ⊂ Rdand R ⊂ R. SA constructs
iteratively a sequence of states (xi)∈ D, a sequence of
candidate points (yi)∈ D, and a sequence of temperatures
(τi)∈R>0, as follows: an initial state x0and an initial
temperature τ0are given. Having constructed the sequences
(x1, x2, . . . , xk),(y1, y2, . . . , yk), and (τ1, τ2, . . . , τk), a next
candidate point yk+1 is selected according to the probability
distribution R(xk,·). The next state xk+1 is set
xk+1 =(yk+1 with probability p(xk, yk+1, τk)
xkwith probability 1−p(xk, yk+1, τk)
where
p(x, y, τ ) = min{1,exp[−V(y)−V(x)
τ]}(3)
p(x, y, t)is referred to as the acceptance probability. Note
that if V(y)< V (x)(so the candidate improves the value
of the objective function), the candidate is accepted with
certainty.
The following conditions are used in [18] to prove con-
vergence:
•C0. The objective function Vis continuous on D.
•C1. The state space Dis a bounded closed subset of
Rd.
•C2. There exists an x∗∈ D such that Vachieves its
minimum at x∗
•C3. Let x∗be as in C2. For every > 0, the set {x∈
D | kx−x∗k< }=B(x∗)∩ D has positive Lebesgue
measure.
•C4. The selection Markov kernel Ris absolutely con-
tinuous (with respect to the Lebesgue measure on Rd)
and it has a density which is uniformly bounded away
from 0. That is, Ris of the form
R(x, B) = ZB
r(x, y)dy with inf
x,y∈D r(x, y)>0
This implies that all of Dis reachable from any x∈ D.
•C5. For every open subset Bin D,R(x, B)is contin-
uous in x.
•C6. For every choice of initial state x0and initial
temperature τ0, the sequence of temperatures (τk)k≥0
converges in probability to 0.
The following theorem is proven in [18]:
Theorem 1: Let x1, x2, x3, . . . be the sequence of states
generated by the SA algorithm with selection Markov kernel
Rand with cooling schedule τ. Assume that conditions
C0-C6 hold. Let V∗denote the global minimum of Von
D. Then, for every choice of initial conditions (x0, τ0),
the sequence of function values (V(xk))k≥0converges in
probability to V∗. That is, ∀ > 0,P r[|V(xk)−V∗|> ]→0
as k→ ∞.
We now discuss the applicability of conditions C0-C6 to our
hybrid automata:
C0: C0 does not hold for Problem 2, and one contribution of
this paper is to show convergence in the absence of global
continuity. Another contribution is to show that it holds for
Problem 1.
C1: The initial state x0∈Init typically models starting
physical parameters of the system, and these are always finite
in magnitude, whence Init is bounded. Closure of Init will
have to be assumed (this is a standard assumption, in both
practice and theory [4], [20], [22], [23]).
C2: It is shown in Section IV that C0 and C1 imply C2, so
it holds for Problem 1. We assume it holds for Problem 2.
C3: We first generalize C3 to hybrid automata in condition
C7:
•C7. Let x∗be as in C2, and let S∗∈Pbe the part to
which it belongs. Then µ(Bδ(x∗)∩S∗)>0∀δ > 0.
It is immediate to see that the probability of converging to a
minimum that does not satisfy C7 (such as an isolated point)
is 0. Therefore our results can only claim convergence to
minima that belong to sets of non-zero measure, and this is
captured in C7.
C4-C6: these are properties of the optimization algorithm
rather than of the system. The Hit-and-Run sampler may be
used to satisfy C4 and C5 [14], and the cooling schedule can
be chosen to satisfy C6.
We introduce one more condition on the automaton:
•C8. The hybrid automaton admits an accurate simulator
over Init.
Without this condition, it is not guaranteed that we can draw
conclusions about the real system, based on simulations.
B. SA for minimizing a vector function
We now consider the case R=Z×R, so V(x) =
(V1(x), V2(x)) is a vector function. The range of Vis
lexicographically ordered: (k, r)≤(k0, r0)iff (k < k0)
OR (k=k0and r≤r0). This is a total order, so the
issue of non-dominance [24] does not arise. The resulting
vector SA algorithm, introduced in [14], constructs a se-
quence of states (xi)∈ D, a sequence of candidate points
(yi)∈ D, and a sequence of temperatures (τi)∈R>0,
as follows: an initial state x0∈ D and an initial temper-
ature τ0>0are given. Having constructed the sequences
(x1, x2, . . . , xk),(y1, y2, . . . , yk), and (τ1, τ2, . . . , τk), a next
candidate point yk+1 is selected according to the probability
distribution R(xk,·). It then computes
αi= exp −Vi(yk+1)−Vi(xk)
τk, i = 1,2
u=UniformRandomReal(0,1)
The next state xk+1 is determined as
•xk+1 =yk+1 if the event A=
(V1(xk) = V1(yk+1)∧u≤α2)∨(V1(xk)6=V1(yk+1 )∧
u≤α1)is true
•and xk+1 =xkotherwise.
The two events on either side of the disjunctive are disjoint.
It comes that the update rule for vector SA is:
xk+1 =(yk+1 with probability pa(xk, yk+1, τk)
xkwith probability 1−pa(xk, yk+1, τk)
1596
where
pa(x, y, τ ) = min{1,
2
X
i=1
Pr[Vi(yk+1)6=Vi(xk)|u≤α3−i]·α3−i}
We conclude with a Lemma, used in the later proofs,
which will allow us to forego continuity of Vover Init.
It is proved for vector V, which subsumes scalar Vas a
special case. The global minimum of V∗= (k∗, r∗)of V
will be characterized by
k∗≤k(x)∀x∈ D
r∗≤r(x)∀x∈ {y∈ D|V1(y) = k∗}
Given = (1, 2)∈N>0×R>0, define
D={x∈ D|V1(x)≤k∗+1and V2(x)≤r∗+2}
={x∈ D|V(x)≤V∗+}
Lemma 1: Assume C7, C8. Then it holds that ∀=
(1, 2)>0,0< µ(D).
Proof: Fix = (1, 2)>0, and consider the optimum
x∗and its robustness time t∗=t(x∗). By C8 and Prop.1,
∃δ2>0s.t. the image of Bδ2(x∗)∩S∗,Bδ2under
the hybrid dynamics at time t∗is a set of points that are
at least 2-close to s(x∗, t∗). Moreover, all trajectories ηy
starting in Bδ2follow the same sequence of locations as ηx∗;
in particular, they all visit the location l(x∗)and therefore
have k(y) = k(x∗). So |k(y)−k(x∗)|< 1. We know that
µ(Bδ2)>0by condition C7. Recognizing that Bδ2⊂ D
proves the lemma.
Optimizing a vector function. One popular way to optimize
a vector objective function is to map its output V(x)=(k, r)
to scalars in R(e.g. [25]), e.g. using the inverse logit function
with a > 0
Ya: (k, r)7→ k+ 2 [2 exp(r/a)/(1 + exp(r/a)) −1]
Yamaps Z×Rto {z+b|z∈Z, b ∈(−1,1)}. It is strictly
increasing, so the global minima of Ya◦Vare the same as
the global minima of V. However, it faces the ‘saturation’
effect for large absolute values of r(and fixed k): differences
between Yavalues become insignificant at these extremes,
thus not providing the optimizer with enough guidance. This
problem is exacerbated in a practical implementation, which
will discretize Ya, because the the discretized inverse logit is
no longer strictly increasing. This means the global minima
of ˜
Ya◦Vare no longer the same as those of V.
In section IV, it is shown that traditional SA converges
when solving Problem 1. In section V, it is shown that
vector SA converges when solving Problem 2. Throughout,
any mention of a global minimum refers to a minimum that
satisfies all needed conditions, which will be explicated.
IV. TRADITIONAL SA F OR A DY NA MIC AL SY ST E M
Our first result states that traditional SA converges to
the global minimum V∗when solving Problem 1. Recall
that Problem 1 deals with automata with one location, so
there are no guards or resets. This result may be seen as a
special case of Thm.3 presented in the next section. But it is
presented here as a separate result, as a first simple extension
of SA convergence, and to hihglight the role played by the
condition that µ(D)>0for all > 0.
Theorem 2: Assume that conditions C1,C2,C4-C8 are sat-
isfied. Then SA will converge in probability to the global
minimum of Problem 1.
Proof: The proof proceeds along identical lines to
Belisle’s original proof in [18] and so is not repeated. The
main difference is that Vis not continuous over its domain
Init in the present paper. However, Lemma 1 removes the
need for this ’global’ continuity of the objective function,
and may be invoked when proving [18, Lemma 1].
V. VECTOR OBJECTIVE FUNCTION
In this section we prove the convergence of vector SA
when solving Problem 2. The novelty is that this SA algo-
rithm deals with a multi-objective function V(x)=(k , r)∈
N×R, and the objective is no longer continuous on D.
The proof is a variation on Belisle’s proof [18]. Some of
the definitions need to be appropriately modified to account
for the new objective function. The following theorem is
proven next.
Theorem 3: Let x1, x2, . . . be the sequence of states gen-
erated by vector SA when solving Problem 2. Assume that
conditions C1,C2,C4-C8 hold. Let V∗denote the global
minimum of Von D. Then, for any pair of initial conditions
(x0, τ0), the sequence of function values (V(xk)), k ≥0,
converges in probability to V∗.
Since Vis a vector function, all expressions are under-
stood to apply in a component-wise fashion, e.g. |V(x)|>0
iff |V1(x)|>0and |V2(x)|>0.
We will show that for every x0∈ D, t0>0, > 0and
δ > 0there exists an integer n1such that
P r[Xn/∈ D|(X0, T0) = (x0, τ0)] ≤δ∀n≥n1(4)
This will prove the theorem. If is such that Init ⊆ D
then Eq.(4) is trivially satisfied. Therefore, fix > 0such
that D⊂Init,x0∈ D, τ0>0and δ > 0. Let m
and nbe positive integers, ζ∈N>0×R>0, and Xm
n=
(Xn, Xn+1, . . . , Xm)be the random sequence generated by
the random process {Xn}.xm
nwill be a realization of Xm
n.
Define the following events:
A=A(m, n) = the event that none of the states (Xi)n+m
i=n
is in D
B=B(ζ, m, n) = the event that at least one of the
transitions Xn+(k−1) →Xn+k,k= 0,1, . . . , m, is a move
from Dto H,ζ ,{x∈ D|V∗+<V(x)≤V∗++ζ}.
C=C(ζ, m, n) = the event that at least one of the
transitions Xn+(k−1) →Xn+k,k= 0,1, . . . , m, is a move
from Dto ˜
H,ζ ,{x∈ D|V∗+ζ+<V(x)}.
D=the event that Xn+m/∈ D.
Observe that D⊂A∪B∪C. Thus for every ζ, m and n
1597
we have
Pr[Xn+m/∈ D|(x0, τ0)]
= Pr[D|(x0, τ0)]
≤Pr[A|(x0, τ0)] + Pr[B|(x0, τ0)]
+ Pr[D|(x0, τ0)]
(5)
In the next three sections we prove the following three
lemmata, which show that the ‘escape’ probabilities on
the right hand side of (5) can be made arbitrarily small,
regardless of initial conditions (x0, τ0).
Lemma 2: There exists an integer m0(which does not
depend on (x0, τ0)), such that Pr[A(m0, n)|(x0, τ0)] <
δ/3∀n≥0
Lemma 3: Let m0be as in Lemma 2. There exists a ζ0=
(ζ0,1, ζ0,2), independent of (x0, τ0), such that
Pr[B(ζ0, m0, n)|(x0, τ0)] < δ/3∀n≥0
Lemma 4: Let m0and ζ0be as in Lemma 2 and Lemma
3 resp. There exists an integer n0, independent of (x0, τ0),
such that
Pr[C(ζ0, m0, n)|(x0, τ0)] < δ/3∀n≥n0
Thus we may conclude that Pr[Xn+m/∈ D|(x0, τ0)] <
δ∀n≥n0. Therefore, Eq.(4) holds with n1=n0+m0.
The proof of Lemma 2 is almost identical to the original
proof in [18] with some minor obvious modifications, and
again, uses Lemma 1 instead of continuity of V. Therefore
it is omitted.
A. Proof of Lemma 3
Let Xi→Xi+1 denote a transition from state Xito state
Xi+1, and ζ∈N+×R+.
Pr[B(ζ, m0, n)|(x0, τ0)]
≤
m−1
X
j=0
Pr[Xn+j∈ D→Xn+j+1 ∈H,ζ
| {z }
Bj(ζ,m0,n)
|(x0, τ0)]
Pr[Bj(ζ, m0, n)|(x0, τ0)] ≤sup
x∈D
R(x, H,ζ )
Thus
Pr[B(ζ, m0, n)|(x0, τ0)] ≤m0sup
x∈D
R(x, H,ζ )(6)
Now we may write
H,ζ ={x∈ D | V1∗+1< V1(x)≤V1∗+1+ζ1}
∩ {x∈ D | V2∗+2< V2(x)≤V2∗+2+ζ2}
,H1
,ζ ∩H2
,ζ
Consider a sequence {ζ(i)}s.t. ζ(i)→0as i→ ∞. Then
D ⊇ H1
,ζ(1) ⊇H1
,ζ(2) ⊇. . .: since Dis bounded (C1), H1
,ζ
is bounded and therefore has finite Lebesgue measure. Next,
µ(H2
,ζ )→0as ζ2→0(Lemma 2 in [18] with ζ2= 1/`).
Thus µ(H1
,ζ ∩H2
,ζ ) = µ(H,ζ )→0as ζ→0. Therefore,
for every x∈ D,R(x, H,ζ ) = Rx∈H,ζ r(x, y)dy →0as
ζ→0. Thus:
•R(x, H,ζ(i))is a real-valued continuous function over
a compact space D ∀i.
•{R(x, H,ζ(i))}iis a monotonically decreasing se-
quence of functions.
•The sequence {R(x, H,ζ(i))}converges pointwise to
the 0 function.
Dini’s theorem allows us to conclude that R(x, H,ζ )→0
as ζ→0uniformly in x. Thus ∃ζ0= (ζ1,0, ζ2,0)s.t.
supx∈D R(x, H,ζ0)< δ/3m0. Combined with (6), this
proves the lemma.
B. Proof of Lemma 4
Let m0and ζ0be as in Lemma 1 and Lemma 2 respec-
tively. Fix τ∗>0.
Pr[C(ζ0, m0, n)|(x0, τ0)]
≤
m−1
X
j=0
Pr[Xn+j∈ D→Xn+j+1 ∈˜
H,ζ0
| {z }
Cj(ζ0,m0,n)
|(x0, τ0)].
Pr[Cj(ζ0, m0, n)|(x0, τ0)]
= Pr[Cj(ζ0, m0, n)∩τn+j≤τ∗|(x0, τ0)]
+ Pr[Cj(ζ0, m0, n)∩τn+j> τ∗|(x0, τ0)]
The first summand is upper bounded by
pa(x, y, τ ), x ∈ D, y ∈˜
H,ζ0, τ ≤τ∗
≤α1+α2
= exp −V1(y)−V1(x)
τ+ exp −V2(y)−V2(x)
τ
≤exp −V∗−1−ζ1,0+V∗+1
τ
+ exp −V∗−2−ζ2,0+V∗+2
τ
≤exp −ζ1,0
τ+ exp −ζ2,0
τ
So there exists a τ∗s.t. Pr[C(ζ0, m0, n)|(x0, τ0)] ≤δ/6
when τ≤τ∗. Condition C6 guarantees the existence of an
n2such that τn≤τ∗for all n≥n2. The second summand
can be made arbitrarily small by C6: in particular, let n3be
such that it is smaller than δ/6. Letting n0≥max{n2, n3}
proves Lemma 4.
VI. PR AC TIC AL CONSIDERATIONS
The previous sections have demonstrated that SA is a
consistent optimization algorithm for the falsification of
temporal logical properties of hybrid automata; that is, the
sequence of samples it produces converges (in probability)
to the set of global optima, regardless of the initial sample.
This is an important property since it guarantees that longer
runs of SA will produce better minima. From a practical
standpoint, previous work [14] has demonstrated that in
practice, SA performs well, both in terms of speed and
quality of obtained minimum. Rather than replicate those
experiments here, we focus instead on the factors that affect
finite-time performance 4, and how they affect it.
4Finite-time performance tells us how close is the current minimum,
after Nsamples generated, to the global minimum. The answer is naturally
affected by the likelihood of SA to spend many samples near non-global
minima for this system and specification.
1598
It is well-known that the practical performance of SA de-
pends on the specific objective function being optimized, the
particular cooling schedule, and the neighborhood selection;
see [26] for a good review of these issues. In our case,
the neighborhood is all of Init as per condition C4, and
the cooling schedule is adaptively modified to maintain an
acceptance-to-rejection ratio close to 1. This is permitted by
condition C6, and has been shown experimentally to help
avoid local minima traps [14]. The objective function is
directly related to the system and property being falsified,
and we now briefly illustrate how its graph can affect
convergence. We select two benchmark hybrid automata and
corresponding unsafe sets, and study the following three
issues for each:
- Generate the partition Pof its Init set. This allows us to
assess whether it satisfies condition C7.
- Plot the graph of the objective function, which can indicate
the difficulty of this problem instance.
- See if vector SA generates samples with different sequences
of locations (`i)for our system. If a run of vector SA
generates very few different (`i), this might indicate a
local minimum trap, which should not be confused with
having converged (indeed, Lemma 2 asserts that the tail of
the generated sequence consists of samples with the same
sequence of locations with increasing probability).
Our first system, Sys1, is a 2D, 5-location hybrid automa-
ton with linear dynamics in each location:
F low(1, x)=0.1−1 10
−100 −1x, F low (2, x)=0.1−1 100
−10 −1x
F low(3, x) = F low (4, x)=0.1−1 100
−10 −1x
F low(5, x)=0.11−10
10 1 x
Guard(1,2) = {x1>0}, Guard(2,3) = {x2<0∧x1<4.5}
Guard(3,4) = {x1<0}, Guard(2,5) = {x2<0∧x1>4.5}
Guard(4,1) = {x2>0}, Guard(5,3) = {x1<4.5}
U={3< x1<4,3< x2<4}
To generate the graph of the robustness function, we sampled
the initial set with a step size of 0.01 in both dimensions,
leading to a grid with 3600 points. The graph (not shown here
for lack of space) displays several near-flat ‘valleys’ (regions
of small function values) surrounded by ‘peaks’ (regions of
large function values): if SA samples from a given valley, it
will continue sampling from it for a long time because the
probability of accepting an increase in function value will
be small, following Eq.(3). Thus we expect that once SA
samples from a valley that contains a global minimum, then
there’s a high probability it will get arbirarily close to that
minimum.
Using the same grid, the partition Pof the initial set was
obtained. A coarser partition (with step size = 0.02) is shown
in Fig.1 for clarity. A total of 4 parts were obtained, each
corresponding to a sequence of locations. It can be seen that
Parts 1 and 3 cover much of Init (see Fig.1), so it is to be
expected that at least initially, SA will sample from these 2
parts overwhelmingly. If the global minimum is in a different
part, longer runs of SA are required.
Fig. 1: Init partition for Sys1. Parts correspond to the
sequences si
We then ran SA on Sys1 with a sample size of 1000 initial
points, and a trajectory duration of 2sec. All 4 parts were
visited, with the vast majority of the points chosen from
the Part3. This is in accordance with the observations made
above about the partition of Init and shape of the graph.
It is notable that the minimum found by SA (down to two
significant figures) is [1.35,1.74], almost at the boundary of
parts 3 and 4. These points are harder to find for SA because
of the different sequences on either side of the boundary.
The corresponding robustness value is 0.5 (compare to global
minimum robustness value of 0.49, found on above grid, at
[1.58,1.79]).
We also re-use the Nav0 benchmark from [27], which
we will argue is a harder instance for SA. Nav0 is a 4-
dimensional automaton with 16 locations, and it is unknown
whether it is falsifiable or not. For the purposes of presenting
results graphically, we fix the last 2 dimensions of the state
vector to [x3, x4] = [0.1,0.2], and let SA vary the first two
dimensions. The graph of the robustness function is given
in Fig.2. The graph was obtained by sampling the initial set
with a step size of 0.01 (for a total of 3600 points), and
computing a trajectory of duration 10sec. We can observe
a large number of minima with varying depths, increasing
the odds of SA spending a large number of samples in these
minima before jumping back out.
Fig. 2: Graph of the robustness function for Nav0.
Using the same grid, the partition Pof the initial set
1599
was computed. The partition has 189 parts, with the largest
part containing only 15.3% of the points, and the 4 largest
parts together have 53.3% of the points. Fig.3 shows how
the initial set of conditions is highly fragmented. While we
haven’t established a precise relation between the measure of
the parts and SA convergence, condition C7 and the proof
of Prop.1 drive us to conjecture that a larger measure for
the parts leads to a faster convergence. The small size of all
partitions in Nav0 then suggests this is a hard instance for
SA.
Fig. 3: Zoom on Init partition for Nav0. Different symbols
and colors correspond to different sequences of locations.
Note the fragmentation on the right side.
SA was run on Nav0 with a sample size of 1000 initial
points, and a test duration of 10sec. It selected points from 62
parts, with the largest 2 containing 70% of the 1000 points,
and the smallest containing 0.01% of the points. These parts
were not the largest parts in P, which is expected since SA
is driven by the objective function as well as the measure
of the parts. SA found a global minimum value of 0.09 at
[0.61,3.40], compared to the global (grid) minimum of 0.08
at [0.67,3.52].
On the theoretical side, the probabilistic convergence of
the sequence of initial states generated by SA to a global
optimum, can be informally divided into two parts: the first
part is the convergence of the Markov Chain to its zero-
temperature stationary distribution, π∞, in an appropriate
sense. The stationary distribution favors minima of the objec-
tive function. The second part is then the sampling from the
stationary distribution. Thus a bound on SA’s convergence
involves bounding these two components. The second part
depends on the shape of the stationary distribution. In our
case, it is exponential of the form π∞(x) = 1
Me−βV (x)[14],
where Mis the (unknown) normalization constant. For the
first part, recently, a result on finite-time guarantees for SA
optimization over continuous domains was obtained in [28].
Informally, given a desired precision of the optimization, it
provides a number of samples after which we are guaranteed
that the minimum so far is within the desired precision of the
global minimum. The strength of this result is that it requires
very little of the objective function, namely only that it be
well-defined pointwise, measurable, and bounded between 0
and 1. In traditional SA, g◦V(x)satisfies these conditions,
with Vgiven in Problem 1 and gis any monotone increasing
function that maps [0,∞]to [0,1]. No such result exists yet
for a vector objective function, like the one in Problem 2;
this is the subject of future research.
VII. CONCLUSIONS
The problem of falsifying safety properties of hybrid
automata was formulated as optimization problems in [14].
In this paper, we provided conditions on the system under
which Simulated Annealing will converge in probability to
the global minimum, and thus return a system trajectory
of mimimal robustness. Research can proceed along three
directions: the first is to establish convergence conditions for
discrete implementations of the continuous models studied in
this paper. A second direction is to broaden the class of sys-
tems for which SA converges, e.g. to systems with more than
one starting location, and to automata with state- and time-
dependent guard sets, as well as to non-autonomous systems.
A third direction of research is to develop computable criteria
that establish whether a given system is, a priori, suitable
for efficient SA falsification or not. As part of this direction,
there is obvious interest in establishing finite-time guarantees
for the vector SA algorithm.
ACKNOWLEDGMENT
The authors wish to thank Tolga Duman for many helpful
discussions.
REFERENCES
[1] M. Blair, S. Obenski, and P. Bridickas, “Patriot missile software
problem,” United States General Accounting Office, Tech. Rep.
GAO/IMTEC-92-26, 1992.
[2] E. J. Hoffman, W. L. Ebert, M. D. Femiano, H. R. Freeman, C. J.
Gay, C. P. Jones, P. J. Luers, and J. G. Palmer, “The near rendezvous
burn anomaly of december 1998,” Applied Physics Laboratory, Johns
Hopkins University, Tech. Rep., Nov. 1999.
[3] T. A. Henzinger, P. W. Kopke, A. Puri, and P. Varaiya, “What’s
decidable about hybrid automata?” J. Comput. Syst. Sci., vol. 57, no. 1,
pp. 94–124, 1998.
[4] S. Ratschan and J.-G. Smaus, “Finding errors of hybrid systems by
optimizing an abstraction-based quality estimate,” in Proceedings of
the Third Int’l Conf. on Tests and Proofs, Zurich, Switzerland, July
2009, pp. 153–168.
[5] Q. Zhao, B. H. Krogh, and P. Hubbard, “Generating test inputs for
embedded control systems,” IEEE Control Systems Magazine, vol.
Aug., pp. 49–57, 2003.
[6] M. Branicky, M. Curtiss, J. Levine, and S. Morgan, “Sampling-based
planning, control and verification of hybrid systems,” IEE Proc.-
Control Theory Appl., vol. 153, no. 5, pp. 575–590, 2006.
[7] T. Nahhal and T. Dang, “Test coverage for continuous and hybrid
systems,” in CAV, ser. LNCS, vol. 4590. Springer, 2007, pp. 449–
462.
[8] E. Plaku, L. E. Kavraki, and M. Y. Vardi, “Hybrid systems: From
verification to falsification,” in Proceedings of the 19th International
Conference on Computer Aided Verification, ser. LNCS, W. Damm
and H. Hermanns, Eds., vol. 4590. Springer, 2007, pp. 463–476.
[9] ——, “Falsification of ltl safety properties in hybrid systems,” in Proc.
of the Conf. on Tools and Algorithms for the Construction and Analysis
of Systems (TACAS), ser. LNCS, vol. 5505, 2009, pp. 368 – 382.
1600
[10] A. Rizk, G. Batt, F. Fages, and S. Soliman, “On a continuous degree
of satisfaction of temporal logic formulae with applications to systems
biology,” in International Conference on Computational Methods in
Systems Biology, ser. LNCS, no. 5307. Springer, 2008, pp. 251–268.
[11] P. Zuliani, A. Platzer, and E. M. Clarke, “Bayesian statistical model
checking with application to simulink/stateflow verification,” in Pro-
ceedings of the 13th ACM International Conference on Hybrid Sys-
tems: Computation and Control, 2010, pp. 243–252.
[12] A. Donze and O. Maler, “Systematic simulation using sensitivity
analysis,” in Hybrid Systems: Computation and Control, ser. LNCS,
vol. 4416. Springer, 2007, pp. 174–189.
[13] F. Lerda, J. Kapinski, E. M. Clarke, and B. H. Krogh, “Verification
of supervisory control software using state proximity and merging,”
in Hybrid Systems: Computation and Control, ser. LNCS, vol. 4981.
Springer, 2008, pp. 344–357.
[14] H. Abbas, G. E. Fainekos, S. Sankaranarayanan, F. Ivancic, A. Gupta,
and G. J. Pappas, “Probabilistic temporal logic falsification of cyber-
physical systems,” ACM Transactions on Embedded Computing Sys-
tems, vol. (Accepted), 2011.
[15] Y. S. R. Annapureddy, C. Liu, G. E. Fainekos, and S. Sankara-
narayanan, “S-taliro: A tool for temporal logic falsification for hybrid
systems,” in Tools and algorithms for the construction and analysis
of systems, ser. LNCS, vol. 6605. Springer, 2011, pp. 254–257.
[16] T. Nghiem, S. Sankaranarayanan, G. Fainekos, F. Ivancic, A. Gupta,
and G. Pappas, “Monte-carlo techniques for falsification of temporal
properties of non-linear hybrid systems,” in Hybrid Systems: Compu-
tation and Control, 2010.
[17] G. Fainekos and G. Pappas, “Robustness of temporal logic specifica-
tions for continuous-time signals,” Theoretical Computer Science, vol.
410, no. 42, pp. 4262–4291, September 2009.
[18] C. J. P. Belisle, “Convergence theorems for a class of simulated
annealing algorithms on Rd,” Journal of Applied Probability, vol. 29,
no. 4, pp. 885–895, Dec. 1992.
[19] B. Hajek, “Cooling schedules for optimal annealing,” Mathematics of
operation research, vol. 13, no. 2, pp. 311–329, 1988.
[20] R. G. Sanfelice and A. R. Teel, “Dynamical properties of hybrid
systems simulators,” Automatica, vol. 46, no. 2, pp. 239–248, 2010.
[21] P. Tabuada, Verification and Control of Hybrid Systems: A Symbolic
Approach. Springer, 2009.
[22] A. Girard and G. J. Pappas, “Approximation metrics for discrete and
continuous systems,” IEEE Trans. Auto. Cont., vol. 52, no. 5, pp. 782–
798, 2007.
[23] J. Lygeros, K. H. Johansson, S. N. Simic, J. Zhang, and S. Sastry,
“Dynamical properties of hybrid automata,” IEEE Transactions on
Automatic Control, vol. 48, pp. 2–17, 2003.
[24] K. I. Smith, R. M. Everson, J. E. Fieldsend, C. Murphy, and R. Misra,
“Dominance-based multiobjective simulated annealing,” IEEE Trans-
actions on Evolutionary computation, vol. 12, no. 3, pp. 323–342,
2008.
[25] P. Czyzak and A. Jaszkiewicz, “Pareto simulated annealing - a meta-
heuristic technique for multiple-objective combinatorial optimization,”
J. Multi-Criteria Decision Analysis, vol. 7, pp. 34–47, 1998.
[26] D. Henderson, S. H. Jacobson, and A. W. Johnson, “The theory
and practice of simulated annealing,” in Handbook of metaheuristics.
Springer, 2003.
[27] H. Abbas and G. Fainekos, “Linear hybrid system falsification through
local search,” in Automated Technology for Verification and Analysis,
ser. LNCS, vol. 6996. Springer, 2011, pp. 503–510.
[28] A. Lecchini-Visintini, J. Lygeros, and J. M. Maciejowski, “Stochastic
optimization on continuous domains with finite-time guarantees by
markov chain monte carlo methods,” IEEE Transactions on Automatic
Control, vol. 55, no. 12, pp. 2858–2863, Dec. 2010.
[29] R. Goebel, R. G. Sanfelice, and A. R. Teel, “Hybrid dynamical
systems,” IEEE Control Systems Magazine, pp. 28–93, 2009.
VIII. APPENDIX
This section details the ‘hybrid basic conditions’ that are
required of a hybrid automaton Hfor it to have an accurate
simulator. The conditions are framed in the formalism of
differential and difference inclusions [29], which generalizes
the formalism used in this paper.
Definition 4: [20] A hybrid system Hon a state space
Rnis defined by
•A set C⊆Rdcalled the flow set
•A set-valued map F:Rn→ P(Rn)called the flow
map
•A set D⊆Rncalled the jump set
•A set-valued map G:Rn→ P(Rn)called the jump
map
We write H= (C, F, D, G). The dynamics of the system
are given by
ξ∈Rn(˙
ξ∈F(ξ), ξ ∈C
ξ+∈G(ξ), ξ ∈D
Hybrid automata, defined in section II-A, can be modeled
using Def. 4 as follows [29]: let n=d+1 and ξ= [`, x]T∈
L×Rd⊂Rd+1 be the state of the system. Then we take:
C`=Inv(`), D`=∪`0:(`,`0)∈EGuard(`, `0),F`(x) =
F low(`, x),∀x∈C`, and
G`(x) = {[`0, Re(`, `0, x)]T|x∈Guard(`, `0)},∀x∈D`.
For a deterministic automaton, for any x, there is a unique
location `0such that x∈Guard(`, `0)(otherwise, two jumps
(`, `0)and (`, `00)are possible). Thus G`(x)is a singleton.
The dynamics of Hare then given by:
[`, x]T∈L×Rd(˙
ξ∈F(ξ) = {[0, F`(x)]T}, x ∈C`
ξ+∈G(ξ) = {[`0, Re(`, `0, x)]T}, x ∈D`
Conditions for accurate simulation [20, Assumption 2.5].
The data of the hybrid system H= (C, F, D, G), satisfies:
A1. Cand Dare closed sets.
A2. F:Rd→ P(Rd)is outer semicontinuous (o.s.c.)5and
locally bounded, and F(ξ)is nonempty and convex for all
ξ∈C.
A3. G:Rd→ P(Rd)is o.s.c. and locally bounded, and
G(ξ)is nonempty for all ξ∈D.
Proposition 1 is an application of the following theorem.
A ‘maximal’ trajectory is one which can not be extended.
Recall that Dis the time-limit of all trajectories we compute.
Theorem 4: [20, Thm. 3.4] Assume that Hsatisfies the
above conditions. Let K⊂Rnbe a compact set such that H
is pre-complete from K(i.e. each maximal trajectory starting
from Kis either bounded or has infinite length). Then, for
every > 0there exists δ∗>0with the following property:
for any δ∈(0, δ∗]and any solution ηxwith x∈K+δB1(0)
there exists a solution ηx0with x0∈Ksuch that the two
trajectories have the same number of location transitions, and
their continuous parts are -close at all times t<Dat which
they’re in the same location.
5A set-valued map F:Rn→ P(Rn)is o.s.c. iff for all sequences
(ξi)∈Rnconverging to ξand all sequences (ωi)∈F(ξi)converging to
ω, it holds that ω∈F(ξ).
1601