Convergence Proofs for Simulated Annealing Falsification of Safety Properties*
Houssam Abbas¹ and Georgios Fainekos²
Abstract: The problem of falsifying temporal logic properties of hybrid automata can be posed as a minimization problem by utilizing quantitative semantics for temporal logics. Previous work has used a variation of Simulated Annealing (SA) to solve the problem. While SA is known to converge to the global minimum of a continuous objective function over a closed and bounded search space, or when the search space is discrete, there are no convergence proofs for the cases addressed in that previous work, namely when the objective function is discontinuous and when the objective is a vector-valued function. In this paper, we derive conditions and we prove convergence of SA to a global minimum in both scenarios. We also consider matters affecting the practical performance of SA.
I. INTRODUCTION
One of the major challenges in the model-based develop-
ment of Cyber-Physical Systems (CPS) is how to automati-
cally verify the correctness of a CPS model with respect to
some formal specification. The proliferation of embedded
computers in a multitude of safety critical systems and
the well documented cases of CPS system failures due to
software-physical system interactions [1], [2] demonstrate
the urgency and the importance of the problem. However,
it is well known [3] that the verification problem of CPS is
undecidable, in general. Therefore, a lot of research effort
has been focused on testing-based methodologies [4]–[13].
In previous work [14]–[16], a notion of robustness of
temporal logics [17] is utilized as a cost function in order to
convert the temporal logic falsification problem of CPS into
an optimization problem. In detail, the robust semantics of
a temporal logic formula over a CPS trajectory evaluate to
a positive value if the trajectory satisfies the specification
and to a negative value otherwise. Thus, we can convert
the falsification problem into a minimization problem of the
specification robustness over the set of all system trajectories.
In general, the resulting optimization problem is non-convex
and non-linear and the search space is uncountable. Thus, in
[14]–[16], a number of stochastic heuristic search techniques
were employed in order to solve the minimization problem
with very promising results. In particular, in [14], a version
of Simulated Annealing (SA) [18], [19] was utilized and a
new SA heuristic for the minimization of a particular class of vector functions was also proposed. For the former, it was claimed that under the assumption of a finite search space¹ we can guarantee convergence of SA to the global minimum and, thus, guarantee the solution of the original falsification problem. The convergence of the SA algorithm for vector functions was left as an open problem.

* This work was partially supported by NSF awards CNS-1017074 and CNS-1116136.
¹ H. Abbas is with the Department of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, AZ, USA, hyabbas at asu.edu
² G. Fainekos is with the Department of Computing, Informatics and Decision Systems Engineering, Arizona State University, Tempe, AZ, USA, fainekos at asu.edu

In this paper, we lift the assumption of the finite search
space and we answer the question of what classes of CPS
and under what conditions we can guarantee the convergence
of SA to the global minimum. Furthermore, we derive
conditions so that the SA algorithm over vector functions
converges to the global minimum, as well. In brief, we prove
that if a CPS is simulatable [20], then the SA algorithms are
guaranteed to converge to the global minimum if the global
minimum does not belong to an equivalence class of measure
zero. The results in this paper are important because they
help us understand the practical and theoretical limitations
of the application of SA to the falsification problem of CPS.
While they are presented for safety requirements due to space
constraints, they can easily be extended to general temporal
logic formulae.
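Notation. $\mu(\cdot)$ denotes the Lebesgue measure. $\|\cdot\|$ is the Euclidean norm, and $B_\epsilon(x) = \{y \in \mathbb{R}^d \mid \|y - x\| < \epsilon\}$. $\mathbb{R}_{\ge 0} = [0, \infty)$ and $\mathbb{N}_{\ge 0} = \mathbb{R}_{\ge 0} \cap \mathbb{N}$. 'Discrete set' will mean a finite or countably infinite set. For a set $X$, $\mathcal{P}(X)$ is the set of all subsets of $X$.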
II. PROBLEM FORMULATION
A. Falsification of safety properties of hybrid automata
We now introduce the practical setting in which the above
two algorithms are applied, and all theorems are proven. We
stress that we prove the correctness of the SA algorithms
when applied to this practical problem. We consider a
deterministic, non-Zeno² hybrid automaton [21]
$$\mathcal{H} = (L, X, Flow, Init, Inv, E, G, Re)$$
where $L = \{\ell_0, \ell_1, \ldots\} \subset \mathbb{N}$ is a countable set of 'locations', $X \subset \mathbb{R}^d$ is the continuous state space, $Flow : L \times X \to X$ is a vector field describing the continuous evolution of a trajectory at $(\ell, x)$, $Init \subset X$ is the set of initial continuous conditions, $Inv : L \to \mathcal{P}(X)$ associates an invariant set of $Flow(\ell, \cdot)$ to each location $\ell$, $E \subset L \times L$ describes the possible jumps between locations (i.e. $(\ell_i, \ell_j) \in E$ iff there exists a trajectory of the system that visits $\ell_i$ then $\ell_j$ without visiting any other location in-between), $G : E \to \mathcal{P}(X)$ defines guard conditions that cause jumps, and $Re : E \times X \to X$ is the reset map which resets the continuous state with every jump. $H = L \times X$ is the 'state-space' of $\mathcal{H}$. In this paper we consider automata for which $Init \subset Inv(\ell_0)$ so the initial set $H_0 = \{\ell_0\} \times Init$.

¹ Any compact set of initial conditions and/or other search parameters is going to be discretized to a finite set of floating-point numbers.
² An automaton is Zeno if it has trajectories that perform an infinite number of discrete jumps in a finite amount of time. This is an artifact of the modeling, and cannot happen in reality.

Fiftieth Annual Allerton Conference, Allerton House, UIUC, Illinois, USA, October 1 - 5, 2012. 978-1-4673-4539-2/12/$31.00 ©2012 IEEE

Given an initial state $h = (\ell_0, x) \in H_0$, the hybrid trajectory that starts there is a vector function $\eta_h : [0, \infty) \to H$ which associates a pair (location, continuous state) to each point in time: $\eta_h(t) = (\ell(t), s(x, t))$ where $s(x, t)$ is the continuous state at time $t$ and $\ell(t)$ is the location of $s(x, t)$. $loc(h) = (\ell_0, \ell_1, \ldots)$ is the list of locations visited by $\eta_h$, with no repetitions. At the $j$th jump time $t_j$, say between locations $i$ and $k$, $\eta_h(t_j)$ is actually set-valued: $\eta_h(t_j) = \{(i, t_j), (k, t_j)\}$. Because there is no ambiguity about $\ell_0$, $\eta_{(\ell_0, x)}$ and $loc(\ell_0, x)$ will also be denoted by $\eta_x$ and $loc(x)$ respectively.
We are given a safety property $\varphi$ of the automaton, and the set $U \subset X$ of states that violate $\varphi$. To falsify $\varphi$ means to find an initial state $h_0 \in H_0$ such that the trajectory $\eta_{h_0}$ that starts there enters the unsafe set $U$. Here we only consider trajectories of finite duration $D < \infty$. With non-Zenoness, this implies that $loc(x)$ is always finite, even if $L$ is countably infinite. $L$ models the discrete variables of the automaton so the range of each $x_i$ is not a discrete set³. We will need the following definitions [14]:
Definition 1 (Discrete distances): $\ell_U$ is the location of the unsafe set $U$. Let $G = (L, E)$ be the graph with vertex set $L$ and edge set $E$. The discrete distance $\pi(\ell, \ell')$ between the two locations $\ell$ and $\ell'$ is the length of the shortest path in $G$ between the locations. Given $x \in Init$ and the trajectory $\eta_x$, $\ell^*(x)$ is the location visited by $\eta_x$ that is closest to $\ell_U$, and $k(x)$ the corresponding distance:
$$\ell^*(x) = \operatorname{argmin}_{\ell \in loc(x)} \pi(\ell, \ell_U)$$
$$k(x) = \min_{\ell \in loc(x)} \pi(\ell, \ell_U) = \pi(\ell^*(x), \ell_U)$$
Definition 2 (Continuous distances): For $x \in X$, $d_U(x) = \inf_{y \in U} \|x - y\|$ is the distance between $x$ and the set $U$. Given $(\ell, x) \in H$, $r(x) = \min_{t \le D} d_Z(s(x, t))$, where $Z$ is either $U$ if $\ell_U \in loc(x)$ (trajectory enters the location of $U$), otherwise it is the guard leading to $\ell'$, where $\ell'$ is the next location in a shortest path in $G$ from $\ell$ to $\ell_U$.
Definition 3 (Robustness): Given a trajectory $\eta_x$, its robustness is $V(x) \triangleq (k(x), r(x))$. This is the smallest 'distance' between the trajectory and the unsafe set $U$ (note this isn't a distance in the mathematical sense of the word). The robustness time $t^*(x)$ is the time when trajectory $\eta_x$ is closest to $U$: $\ell^*(x) = \ell(t^*(x))$ and $r(x) = d_Z(s(x, t^*(x)))$.
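The discrete distance of Definition 1 can be illustrated with a breadth-first search on the location graph. The following is only a minimal sketch: it assumes directed edges, uses the edge set suggested by the guard names of the Sys1 benchmark later in the paper, and picks location 3 as a purely hypothetical $\ell_U$. The function names are illustrative and do not come from the paper.

```python
from collections import deque


def discrete_distances(edges, target):
    """Shortest-path length pi(l, target) for every location l that can reach target."""
    # Build a reverse adjacency map so we can search backwards from the target.
    preds = {}
    for src, dst in edges:
        preds.setdefault(dst, []).append(src)
    dist = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for p in preds.get(node, []):
            if p not in dist:
                dist[p] = dist[node] + 1
                queue.append(p)
    return dist


def closest_location(loc_sequence, dist):
    """Return (l*(x), k(x)) for the list of locations visited by a trajectory."""
    inf = float("inf")
    best = min(loc_sequence, key=lambda l: dist.get(l, inf))
    return best, dist.get(best, inf)


# Edges read off the guard names Guard(i, j); the unsafe location 3 is an assumption.
edges = [(1, 2), (2, 3), (3, 4), (2, 5), (4, 1), (5, 3)]
dist = discrete_distances(edges, target=3)
print(closest_location((1, 2), dist))
```

Locations that cannot reach the target are given an infinite distance, which keeps the lexicographic comparison of $V$ well defined in this sketch.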
Finding an $x_0$ that produces an unsafe trajectory (one that enters $U$) can be achieved by finding the automaton's trajectory with smallest robustness. Because only one initial location is possible (namely, $\ell_0$), the search for a minimum robustness-trajectory is to be carried over the initial set $Init$. The minimization problems treated in this paper are then:

³ If $x_i$ is a discrete state variable taking values in $Q_i \subset \mathbb{Z}$, then $L$ can be augmented to $L' = L \times Q_i$: every jump in the value of $x_i$ can then be modeled as a change of locations in the extended set $L'$. The continuous dynamics are unaffected.

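Problem 1: For a dynamical system (with only one location, $L = \{\ell_0\}$):
$$\mathcal{D} = Init, \quad \mathcal{R} = \mathbb{R}, \quad V(x) = \min_{t \le D} d_U(s(x, t)), \quad \min_{x \in \mathcal{D}} V(x) \tag{1}$$
Problem 2: For a hybrid automaton:
$$\mathcal{D} = Init, \quad \mathcal{R} = \mathbb{Z} \times \mathbb{R}, \quad V(x) = (k(x), r(x)), \quad \min_{x \in \mathcal{D}} V(x) \tag{2}$$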
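To make the objective of Problem 1 concrete, the sketch below evaluates $V(x) = \min_{t \le D} d_U(s(x, t))$ on a sampled trajectory. Everything specific here is an assumption for illustration: the unsafe set is taken to be an axis-aligned box, the trajectory is approximated with forward Euler at a fixed step `dt`, and the names `dist_to_box` and `robustness` are hypothetical. The minimum over sampled times only approximates the true minimum over $[0, D]$.

```python
import math


def dist_to_box(p, lo, hi):
    """Euclidean distance from point p to the box [lo, hi]; zero inside the box."""
    return math.sqrt(sum(max(l - c, 0.0, c - h) ** 2 for c, l, h in zip(p, lo, hi)))


def robustness(x0, flow, lo, hi, duration, dt=0.01):
    """Approximate V(x0): smallest distance to the box along an Euler trajectory."""
    x = list(x0)
    best = dist_to_box(x, lo, hi)
    for _ in range(int(round(duration / dt))):
        dx = flow(x)
        x = [xi + dt * di for xi, di in zip(x, dx)]   # one forward Euler step
        best = min(best, dist_to_box(x, lo, hi))
    return best


# A trajectory moving along the diagonal enters the box, so its robustness is 0.
towards = robustness((0.0, 0.0), lambda x: [1.0, 1.0], (3.0, 3.0), (4.0, 4.0), 4.0)
# A trajectory moving away never gets closer than its starting distance.
away = robustness((0.0, 0.0), lambda x: [-1.0, -1.0], (3.0, 3.0), (4.0, 4.0), 4.0)
print(towards, away)
```

A falsifying initial condition is one for which this value reaches zero, which is what the minimization in (1) searches for.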
B. Simulators of hybrid automata
Simulated Annealing (which will be formally defined in
the next section) requires the ability to evaluate Vat any
point of the search space: this evaluation requires the simu-
lation of a system trajectory starting at that point. A hybrid
system simulator $\mathcal{H}_s$, which is necessarily a discretization of the real system $\mathcal{H}$, must be accurate, in the sense that for every simulated trajectory (generated by the simulator) starting at some $x_s \in Init_s$, there is an arbitrarily close real system trajectory (generated by $\mathcal{H}$) starting at $x \in Init$ ($Init_s$ is the discretization of $Init$).
Not every hybrid system admits an accurate simulator. In [20] sufficient conditions are given on $\mathcal{H}$ and $\mathcal{H}_s$ for $\mathcal{H}_s$ to be an accurate simulator of $\mathcal{H}$. The details of these conditions are given in the appendix. The following is a direct consequence of [20, Theorem 3.4]:
Proposition 1: Let $\mathcal{H}$ be a hybrid automaton, and let $\mathcal{P}$ be the partition of $Init$ induced by the equivalence relation $x \sim x'$ iff $loc(x) = loc(x')$. Let $S \in \mathcal{P}$ be a part such that $\mu(S) \neq 0$. If $\mathcal{H}$ satisfies the conditions for accurate simulation over $S$, then for any $x_0 \in S$ and every $\epsilon > 0$, there exists $\delta > 0$ with the following property: for every trajectory $\eta_x(\cdot)$ with initial point $x$ in $B_\delta(x_0) \cap S$, and every $t < D$, there exists $t'$ such that $|t - t'| < \epsilon$, $s(x, t')$ and $s(x_0, t)$ are in the same location, and $\|s(x, t') - s(x_0, t)\| < \epsilon$.
III. SIMULATED ANNEALING
Simulated Annealing (SA) is a well-known iterative stochastic algorithm for global optimization. We are given an objective function $V$ with domain $\mathcal{D} \subset \mathbb{R}^d$ and range $\mathcal{R}$. $\mathcal{D}$ is known as the state space. The objective is to minimize $V$.
We are given a Markov kernel $R(\cdot, \cdot)$ on $(\mathcal{D}, \mathcal{B})$ where $\mathcal{B}$ is the Borel sigma field on $\mathcal{D}$. This is called the transition Markov kernel. Thus for each $x$ in $\mathcal{D}$, $R(x, \cdot)$ is a probability measure on $(\mathcal{D}, \mathcal{B})$, and for each $B \in \mathcal{B}$, $R(\cdot, B)$ is a measurable function. We are also given a cooling schedule $(\tau_0, \tau_1, \ldots)$ on $(\mathcal{D}, \mathcal{B})$: this is a sequence of (possibly random) positive numbers.
A. Traditional SA
In traditional SA, $\mathcal{D} \subset \mathbb{R}^d$ and $\mathcal{R} \subset \mathbb{R}$. SA constructs iteratively a sequence of states $(x_i) \in \mathcal{D}$, a sequence of candidate points $(y_i) \in \mathcal{D}$, and a sequence of temperatures $(\tau_i) \in \mathbb{R}_{>0}$, as follows: an initial state $x_0$ and an initial temperature $\tau_0$ are given. Having constructed the sequences $(x_1, x_2, \ldots, x_k)$, $(y_1, y_2, \ldots, y_k)$, and $(\tau_1, \tau_2, \ldots, \tau_k)$, a next candidate point $y_{k+1}$ is selected according to the probability distribution $R(x_k, \cdot)$. The next state $x_{k+1}$ is set to
$$x_{k+1} = \begin{cases} y_{k+1} & \text{with probability } p(x_k, y_{k+1}, \tau_k) \\ x_k & \text{with probability } 1 - p(x_k, y_{k+1}, \tau_k) \end{cases}$$
where
$$p(x, y, \tau) = \min\left\{1, \exp\left[-\frac{V(y) - V(x)}{\tau}\right]\right\} \tag{3}$$
$p(x, y, \tau)$ is referred to as the acceptance probability. Note that if $V(y) < V(x)$ (so the candidate improves the value of the objective function), the candidate is accepted with certainty.
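The iteration described above can be sketched in a few lines. This is a minimal illustration, not the implementation used in [14]: the uniform proposal over a box stands in for the kernel $R$ (its density is constant, hence bounded away from 0), and the deterministic schedule $\tau_k = \tau_0 / (1 + k)$ is just one assumed schedule that tends to 0. The function names and the tracking of the best state seen are additions for illustration.

```python
import math
import random


def accept_prob(v_x, v_y, tau):
    """Acceptance probability of Eq. (3): min{1, exp(-(V(y) - V(x)) / tau)}."""
    if v_y <= v_x:
        return 1.0  # a non-worsening candidate is always accepted
    return math.exp(-(v_y - v_x) / tau)


def simulated_annealing(V, lo, hi, n_iter, tau0, rng):
    """Run SA on the box [lo, hi]; returns (final state, best state seen, best value)."""
    x = [rng.uniform(l, h) for l, h in zip(lo, hi)]
    v_x = V(x)
    best_x, best_v = list(x), v_x
    for k in range(n_iter):
        tau = tau0 / (1.0 + k)                            # assumed cooling schedule
        y = [rng.uniform(l, h) for l, h in zip(lo, hi)]   # assumed uniform kernel R
        v_y = V(y)
        if rng.random() < accept_prob(v_x, v_y, tau):
            x, v_x = y, v_y
            if v_x < best_v:
                best_x, best_v = list(x), v_x
    return x, best_x, best_v


rng = random.Random(0)
final_x, best_x, best_v = simulated_annealing(
    lambda p: (p[0] - 1.0) ** 2, [-5.0], [5.0], 2000, 1.0, rng)
print(final_x, best_x, best_v)
```

Because any candidate that improves on the current state is accepted with certainty, the best value seen along the chain coincides with the best candidate value sampled so far.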
The following conditions are used in [18] to prove convergence:
C0. The objective function $V$ is continuous on $\mathcal{D}$.
C1. The state space $\mathcal{D}$ is a bounded closed subset of $\mathbb{R}^d$.
C2. There exists an $x^* \in \mathcal{D}$ such that $V$ achieves its minimum at $x^*$.
C3. Let $x^*$ be as in C2. For every $\epsilon > 0$, the set $\{x \in \mathcal{D} \mid \|x - x^*\| < \epsilon\} = B_\epsilon(x^*) \cap \mathcal{D}$ has positive Lebesgue measure.
C4. The selection Markov kernel $R$ is absolutely continuous (with respect to the Lebesgue measure on $\mathbb{R}^d$) and it has a density which is uniformly bounded away from 0. That is, $R$ is of the form
$$R(x, B) = \int_B r(x, y)\,dy \quad \text{with} \quad \inf_{x, y \in \mathcal{D}} r(x, y) > 0$$
This implies that all of $\mathcal{D}$ is reachable from any $x \in \mathcal{D}$.
C5. For every open subset $B$ in $\mathcal{D}$, $R(x, B)$ is continuous in $x$.
C6. For every choice of initial state $x_0$ and initial temperature $\tau_0$, the sequence of temperatures $(\tau_k)_{k \ge 0}$ converges in probability to 0.
The following theorem is proven in [18]:
Theorem 1: Let $x_1, x_2, x_3, \ldots$ be the sequence of states generated by the SA algorithm with selection Markov kernel $R$ and with cooling schedule $\tau$. Assume that conditions C0-C6 hold. Let $V^*$ denote the global minimum of $V$ on $\mathcal{D}$. Then, for every choice of initial conditions $(x_0, \tau_0)$, the sequence of function values $(V(x_k))_{k \ge 0}$ converges in probability to $V^*$. That is, $\forall \epsilon > 0$, $\Pr[|V(x_k) - V^*| > \epsilon] \to 0$ as $k \to \infty$.
We now discuss the applicability of conditions C0-C6 to our
hybrid automata:
C0: C0 does not hold for Problem 2, and one contribution of
this paper is to show convergence in the absence of global
continuity. Another contribution is to show that it holds for
Problem 1.
C1: The initial state $x_0 \in Init$ typically models starting physical parameters of the system, and these are always finite in magnitude, whence $Init$ is bounded. Closure of $Init$ will have to be assumed (this is a standard assumption, in both practice and theory [4], [20], [22], [23]).
C2: It is shown in Section IV that C0 and C1 imply C2, so it holds for Problem 1. We assume it holds for Problem 2.
C3: We first generalize C3 to hybrid automata in condition C7:
C7. Let $x^*$ be as in C2, and let $S \in \mathcal{P}$ be the part to which it belongs. Then $\mu(B_\delta(x^*) \cap S) > 0$ $\forall \delta > 0$.
It is immediate to see that the probability of converging to a
minimum that does not satisfy C7 (such as an isolated point)
is 0. Therefore our results can only claim convergence to
minima that belong to sets of non-zero measure, and this is
captured in C7.
C4-C6: these are properties of the optimization algorithm
rather than of the system. The Hit-and-Run sampler may be
used to satisfy C4 and C5 [14], and the cooling schedule can
be chosen to satisfy C6.
We introduce one more condition on the automaton:
C8. The hybrid automaton admits an accurate simulator
over Init.
Without this condition, it is not guaranteed that we can draw
conclusions about the real system, based on simulations.
B. SA for minimizing a vector function
We now consider the case $\mathcal{R} = \mathbb{Z} \times \mathbb{R}$, so $V(x) = (V_1(x), V_2(x))$ is a vector function. The range of $V$ is lexicographically ordered: $(k, r) \le (k', r')$ iff ($k < k'$) OR ($k = k'$ and $r \le r'$). This is a total order, so the issue of non-dominance [24] does not arise. The resulting vector SA algorithm, introduced in [14], constructs a sequence of states $(x_i) \in \mathcal{D}$, a sequence of candidate points $(y_i) \in \mathcal{D}$, and a sequence of temperatures $(\tau_i) \in \mathbb{R}_{>0}$, as follows: an initial state $x_0 \in \mathcal{D}$ and an initial temperature $\tau_0 > 0$ are given. Having constructed the sequences $(x_1, x_2, \ldots, x_k)$, $(y_1, y_2, \ldots, y_k)$, and $(\tau_1, \tau_2, \ldots, \tau_k)$, a next candidate point $y_{k+1}$ is selected according to the probability distribution $R(x_k, \cdot)$. It then computes
$$\alpha_i = \exp\left[-\frac{V_i(y_{k+1}) - V_i(x_k)}{\tau_k}\right], \quad i = 1, 2$$
$$u = \mathrm{UniformRandomReal}(0, 1)$$
The next state $x_{k+1}$ is determined as $x_{k+1} = y_{k+1}$ if the event
$$A = \big(V_1(x_k) = V_1(y_{k+1}) \wedge u \le \alpha_2\big) \vee \big(V_1(x_k) \neq V_1(y_{k+1}) \wedge u \le \alpha_1\big)$$
is true, and $x_{k+1} = x_k$ otherwise.
The two events on either side of the disjunction are disjoint. It follows that the update rule for vector SA is:
$$x_{k+1} = \begin{cases} y_{k+1} & \text{with probability } p_a(x_k, y_{k+1}, \tau_k) \\ x_k & \text{with probability } 1 - p_a(x_k, y_{k+1}, \tau_k) \end{cases}$$
where
$$p_a(x, y, \tau) = \min\left\{1, \sum_{i=1}^{2} \Pr[V_i(y_{k+1}) \neq V_i(x_k) \mid u \le \alpha_{3-i}] \cdot \alpha_{3-i}\right\}$$
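The acceptance event $A$ of vector SA can be sketched as a small function. This is an illustrative reading of the rule above rather than the code of [14]: the name `vector_accept` is hypothetical, the uniform draw `u` is passed in explicitly so the behaviour is reproducible, and the early return for non-worsening candidates is an implementation choice that avoids overflow in `exp` while leaving the accepted set unchanged (in that case $\alpha \ge 1 \ge u$).

```python
import math


def vector_accept(v_x, v_y, tau, u):
    """Decide whether candidate value v_y = (k, r) replaces current value v_x.

    The second component is compared when the first components agree,
    otherwise the first component is compared; u is a draw from Uniform(0, 1).
    """
    k_x, r_x = v_x
    k_y, r_y = v_y
    diff = (r_y - r_x) if k_x == k_y else (k_y - k_x)
    if diff <= 0:
        return True          # alpha >= 1, so u <= alpha always holds
    alpha = math.exp(-diff / tau)
    return u <= alpha


# A candidate that lowers the discrete distance is accepted even if r grows.
print(vector_accept((2, 0.5), (1, 9.0), tau=0.1, u=0.999))
# With equal discrete distance, a worse r is accepted only for small enough u.
print(vector_accept((1, 0.5), (1, 0.7), tau=0.1, u=0.5))
print(vector_accept((1, 0.5), (1, 0.7), tau=0.1, u=0.1))
```

The function never compares both components at once, which mirrors the lexicographic order on the range of $V$.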
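We conclude with a Lemma, used in the later proofs, which will allow us to forego continuity of $V$ over $Init$. It is proved for vector $V$, which subsumes scalar $V$ as a special case. The global minimum $V^* = (k^*, r^*)$ of $V$ will be characterized by
$$k^* \le k(x) \quad \forall x \in \mathcal{D}$$
$$r^* \le r(x) \quad \forall x \in \{y \in \mathcal{D} \mid V_1(y) = k^*\}$$
Given $\epsilon = (\epsilon_1, \epsilon_2) \in \mathbb{N}_{>0} \times \mathbb{R}_{>0}$, define
$$\mathcal{D}_\epsilon = \{x \in \mathcal{D} \mid V_1(x) \le k^* + \epsilon_1 \text{ and } V_2(x) \le r^* + \epsilon_2\} = \{x \in \mathcal{D} \mid V(x) \le V^* + \epsilon\}$$
Lemma 1: Assume C7, C8. Then it holds that $\forall \epsilon = (\epsilon_1, \epsilon_2) > 0$, $0 < \mu(\mathcal{D}_\epsilon)$.
Proof: Fix $\epsilon = (\epsilon_1, \epsilon_2) > 0$, and consider the optimum $x^*$ and its robustness time $t^* = t^*(x^*)$. By C8 and Prop.1, $\exists \delta_2 > 0$ s.t. the image of $B_{\delta_2}(x^*) \cap S \triangleq B_{\delta_2}$ under the hybrid dynamics at time $t^*$ is a set of points that are at least $\epsilon_2$-close to $s(x^*, t^*)$. Moreover, all trajectories $\eta_y$ starting in $B_{\delta_2}$ follow the same sequence of locations as $\eta_{x^*}$; in particular, they all visit the location $\ell^*(x^*)$ and therefore have $k(y) = k(x^*)$. So $|k(y) - k(x^*)| < \epsilon_1$. We know that $\mu(B_{\delta_2}) > 0$ by condition C7. Recognizing that $B_{\delta_2} \subset \mathcal{D}_\epsilon$ proves the lemma.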
Optimizing a vector function. One popular way to optimize a vector objective function is to map its output $V(x) = (k, r)$ to scalars in $\mathbb{R}$ (e.g. [25]), e.g. using the inverse logit function with $a > 0$
$$Y_a : (k, r) \mapsto k + 2\left[\frac{2\exp(r/a)}{1 + \exp(r/a)} - 1\right]$$
$Y_a$ maps $\mathbb{Z} \times \mathbb{R}$ to $\{z + b \mid z \in \mathbb{Z}, b \in (-1, 1)\}$. It is strictly increasing, so the global minima of $Y_a \circ V$ are the same as the global minima of $V$. However, it faces the 'saturation' effect for large absolute values of $r$ (and fixed $k$): differences between $Y_a$ values become insignificant at these extremes, thus not providing the optimizer with enough guidance. This problem is exacerbated in a practical implementation, which will discretize $Y_a$, because the discretized inverse logit is no longer strictly increasing. This means the global minima of $\tilde{Y}_a \circ V$ are no longer the same as those of $V$.
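The saturation effect is easy to observe in double-precision arithmetic. The sketch below evaluates only the bracketed inverse-logit term $2\exp(r/a)/(1 + \exp(r/a)) - 1$; the function name `squash` and the sample values of $r$ and $a$ are assumptions chosen for illustration.

```python
import math


def squash(r, a):
    """Bracketed inverse-logit term 2*exp(r/a)/(1 + exp(r/a)) - 1 in floating point."""
    e = math.exp(r / a)
    return 2.0 * e / (1.0 + e) - 1.0


# For moderate r the map is strictly increasing, as in exact arithmetic.
print(squash(1.0, 1.0) < squash(2.0, 1.0))
# For large r/a, 1 + exp(r/a) rounds to exp(r/a) and distinct r values collide.
print(squash(40.0, 1.0) == squash(50.0, 1.0))
# A larger scale a postpones the collision but flattens the map near zero.
print(squash(40.0, 10.0) < squash(50.0, 10.0))
```

Once two robustness values collide, an optimizer working on the scalarized objective can no longer tell them apart, which is the loss of strict monotonicity described above.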
In section IV, it is shown that traditional SA converges
when solving Problem 1. In section V, it is shown that
vector SA converges when solving Problem 2. Throughout,
any mention of a global minimum refers to a minimum that
satisfies all needed conditions, which will be explicated.
IV. TRADITIONAL SA FOR A DYNAMICAL SYSTEM
Our first result states that traditional SA converges to the global minimum $V^*$ when solving Problem 1. Recall that Problem 1 deals with automata with one location, so there are no guards or resets. This result may be seen as a special case of Thm.3 presented in the next section. But it is presented here as a separate result, as a first simple extension of SA convergence, and to highlight the role played by the condition that $\mu(\mathcal{D}_\epsilon) > 0$ for all $\epsilon > 0$.
Theorem 2: Assume that conditions C1, C2, C4-C8 are satisfied. Then SA will converge in probability to the global minimum of Problem 1.
Proof: The proof proceeds along identical lines to Belisle's original proof in [18] and so is not repeated. The main difference is that $V$ is not continuous over its domain $Init$ in the present paper. However, Lemma 1 removes the need for this 'global' continuity of the objective function, and may be invoked when proving [18, Lemma 1].
V. VECTOR OBJECTIVE FUNCTION
In this section we prove the convergence of vector SA when solving Problem 2. The novelty is that this SA algorithm deals with a multi-objective function $V(x) = (k, r) \in \mathbb{N} \times \mathbb{R}$, and the objective is no longer continuous on $\mathcal{D}$. The proof is a variation on Belisle's proof [18]. Some of the definitions need to be appropriately modified to account for the new objective function. The following theorem is proven next.
Theorem 3: Let $x_1, x_2, \ldots$ be the sequence of states generated by vector SA when solving Problem 2. Assume that conditions C1, C2, C4-C8 hold. Let $V^*$ denote the global minimum of $V$ on $\mathcal{D}$. Then, for any pair of initial conditions $(x_0, \tau_0)$, the sequence of function values $(V(x_k))$, $k \ge 0$, converges in probability to $V^*$.
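Since $V$ is a vector function, all expressions are understood to apply in a component-wise fashion, e.g. $|V(x)| > 0$ iff $|V_1(x)| > 0$ and $|V_2(x)| > 0$.
We will show that for every $x_0 \in \mathcal{D}$, $\tau_0 > 0$, $\epsilon > 0$ and $\delta > 0$ there exists an integer $n_1$ such that
$$\Pr[X_n \notin \mathcal{D}_\epsilon \mid (X_0, T_0) = (x_0, \tau_0)] \le \delta \quad \forall n \ge n_1 \tag{4}$$
This will prove the theorem. If $\epsilon$ is such that $Init \subseteq \mathcal{D}_\epsilon$ then Eq.(4) is trivially satisfied. Therefore, fix $\epsilon > 0$ such that $\mathcal{D}_\epsilon \subset Init$, $x_0 \in \mathcal{D}$, $\tau_0 > 0$ and $\delta > 0$. Let $m$ and $n$ be positive integers, $\zeta \in \mathbb{N}_{>0} \times \mathbb{R}_{>0}$, and $X_n^m = (X_n, X_{n+1}, \ldots, X_m)$ be the random sequence generated by the random process $\{X_n\}$. $x_n^m$ will be a realization of $X_n^m$.
Define the following events:
$A = A(m, n)$ = the event that none of the states $(X_i)_{i=n}^{n+m}$ is in $\mathcal{D}_\epsilon$.
$B = B(\zeta, m, n)$ = the event that at least one of the transitions $X_{n+(k-1)} \to X_{n+k}$, $k = 1, \ldots, m$, is a move from $\mathcal{D}_\epsilon$ to $H_{\epsilon,\zeta} \triangleq \{x \in \mathcal{D} \mid V^* + \epsilon < V(x) \le V^* + \epsilon + \zeta\}$.
$C = C(\zeta, m, n)$ = the event that at least one of the transitions $X_{n+(k-1)} \to X_{n+k}$, $k = 1, \ldots, m$, is a move from $\mathcal{D}_\epsilon$ to $\tilde{H}_{\epsilon,\zeta} \triangleq \{x \in \mathcal{D} \mid V^* + \zeta + \epsilon < V(x)\}$.
$D$ = the event that $X_{n+m} \notin \mathcal{D}_\epsilon$.
Observe that $D \subset A \cup B \cup C$. Thus for every $\zeta$, $m$ and $n$ we have
$$\Pr[X_{n+m} \notin \mathcal{D}_\epsilon \mid (x_0, \tau_0)] = \Pr[D \mid (x_0, \tau_0)] \le \Pr[A \mid (x_0, \tau_0)] + \Pr[B \mid (x_0, \tau_0)] + \Pr[C \mid (x_0, \tau_0)] \tag{5}$$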
In the next three sections we prove the following three
lemmata, which show that the ‘escape’ probabilities on
the right hand side of (5) can be made arbitrarily small,
regardless of initial conditions (x0, τ0).
Lemma 2: There exists an integer $m_0$ (which does not depend on $(x_0, \tau_0)$), such that $\Pr[A(m_0, n) \mid (x_0, \tau_0)] < \delta/3$ $\forall n \ge 0$.
Lemma 3: Let $m_0$ be as in Lemma 2. There exists a $\zeta_0 = (\zeta_{0,1}, \zeta_{0,2})$, independent of $(x_0, \tau_0)$, such that $\Pr[B(\zeta_0, m_0, n) \mid (x_0, \tau_0)] < \delta/3$ $\forall n \ge 0$.
Lemma 4: Let $m_0$ and $\zeta_0$ be as in Lemma 2 and Lemma 3 resp. There exists an integer $n_0$, independent of $(x_0, \tau_0)$, such that $\Pr[C(\zeta_0, m_0, n) \mid (x_0, \tau_0)] < \delta/3$ $\forall n \ge n_0$.
Thus we may conclude that $\Pr[X_{n+m_0} \notin \mathcal{D}_\epsilon \mid (x_0, \tau_0)] < \delta$ $\forall n \ge n_0$. Therefore, Eq.(4) holds with $n_1 = n_0 + m_0$.
The proof of Lemma 2 is almost identical to the original proof in [18] with some minor obvious modifications, and again, uses Lemma 1 instead of continuity of $V$. Therefore it is omitted.
A. Proof of Lemma 3
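Let $X_i \to X_{i+1}$ denote a transition from state $X_i$ to state $X_{i+1}$, and $\zeta \in \mathbb{N}_+ \times \mathbb{R}_+$.
$$\Pr[B(\zeta, m_0, n) \mid (x_0, \tau_0)] \le \sum_{j=0}^{m_0 - 1} \Pr[\underbrace{X_{n+j} \in \mathcal{D}_\epsilon \to X_{n+j+1} \in H_{\epsilon,\zeta}}_{B_j(\zeta, m_0, n)} \mid (x_0, \tau_0)]$$
$$\Pr[B_j(\zeta, m_0, n) \mid (x_0, \tau_0)] \le \sup_{x \in \mathcal{D}} R(x, H_{\epsilon,\zeta})$$
Thus
$$\Pr[B(\zeta, m_0, n) \mid (x_0, \tau_0)] \le m_0 \sup_{x \in \mathcal{D}} R(x, H_{\epsilon,\zeta}) \tag{6}$$
Now we may write
$$H_{\epsilon,\zeta} = \{x \in \mathcal{D} \mid V_1^* + \epsilon_1 < V_1(x) \le V_1^* + \epsilon_1 + \zeta_1\} \cap \{x \in \mathcal{D} \mid V_2^* + \epsilon_2 < V_2(x) \le V_2^* + \epsilon_2 + \zeta_2\} \triangleq H^1_{\epsilon,\zeta} \cap H^2_{\epsilon,\zeta}$$
Consider a sequence $\{\zeta(i)\}$ s.t. $\zeta(i) \to 0$ as $i \to \infty$. Then $\mathcal{D} \supseteq H^1_{\epsilon,\zeta(1)} \supseteq H^1_{\epsilon,\zeta(2)} \supseteq \ldots$: since $\mathcal{D}$ is bounded (C1), $H^1_{\epsilon,\zeta}$ is bounded and therefore has finite Lebesgue measure. Next, $\mu(H^2_{\epsilon,\zeta}) \to 0$ as $\zeta_2 \to 0$ (Lemma 2 in [18] with $\zeta_2 = 1/\ell$). Thus $\mu(H^1_{\epsilon,\zeta} \cap H^2_{\epsilon,\zeta}) = \mu(H_{\epsilon,\zeta}) \to 0$ as $\zeta \to 0$. Therefore, for every $x \in \mathcal{D}$, $R(x, H_{\epsilon,\zeta}) = \int_{H_{\epsilon,\zeta}} r(x, y)\,dy \to 0$ as $\zeta \to 0$. Thus:
- $R(x, H_{\epsilon,\zeta(i)})$ is a real-valued continuous function over a compact space $\mathcal{D}$ $\forall i$.
- $\{R(x, H_{\epsilon,\zeta(i)})\}_i$ is a monotonically decreasing sequence of functions.
- The sequence $\{R(x, H_{\epsilon,\zeta(i)})\}$ converges pointwise to the 0 function.
Dini's theorem allows us to conclude that $R(x, H_{\epsilon,\zeta}) \to 0$ as $\zeta \to 0$ uniformly in $x$. Thus $\exists \zeta_0 = (\zeta_{1,0}, \zeta_{2,0})$ s.t. $\sup_{x \in \mathcal{D}} R(x, H_{\epsilon,\zeta_0}) < \delta/(3 m_0)$. Combined with (6), this proves the lemma.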
B. Proof of Lemma 4
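Let $m_0$ and $\zeta_0$ be as in Lemma 2 and Lemma 3 respectively. Fix $\bar{\tau} > 0$.
$$\Pr[C(\zeta_0, m_0, n) \mid (x_0, \tau_0)] \le \sum_{j=0}^{m_0 - 1} \Pr[\underbrace{X_{n+j} \in \mathcal{D}_\epsilon \to X_{n+j+1} \in \tilde{H}_{\epsilon,\zeta_0}}_{C_j(\zeta_0, m_0, n)} \mid (x_0, \tau_0)].$$
$$\Pr[C_j(\zeta_0, m_0, n) \mid (x_0, \tau_0)] = \Pr[C_j(\zeta_0, m_0, n) \wedge \tau_{n+j} \le \bar{\tau} \mid (x_0, \tau_0)] + \Pr[C_j(\zeta_0, m_0, n) \wedge \tau_{n+j} > \bar{\tau} \mid (x_0, \tau_0)]$$
The first summand is upper bounded by
$$p_a(x, y, \tau), \quad x \in \mathcal{D}_\epsilon,\ y \in \tilde{H}_{\epsilon,\zeta_0},\ \tau \le \bar{\tau}$$
$$\le \alpha_1 + \alpha_2 = \exp\left[-\frac{V_1(y) - V_1(x)}{\tau}\right] + \exp\left[-\frac{V_2(y) - V_2(x)}{\tau}\right]$$
$$\le \exp\left[\frac{-V_1^* - \zeta_{1,0} - \epsilon_1 + V_1^* + \epsilon_1}{\tau}\right] + \exp\left[\frac{-V_2^* - \zeta_{2,0} - \epsilon_2 + V_2^* + \epsilon_2}{\tau}\right]$$
$$\le \exp\left[-\frac{\zeta_{1,0}}{\tau}\right] + \exp\left[-\frac{\zeta_{2,0}}{\tau}\right]$$
So there exists a $\bar{\tau}$ s.t. $\Pr[C(\zeta_0, m_0, n) \mid (x_0, \tau_0)] \le \delta/6$ when $\tau \le \bar{\tau}$. Condition C6 guarantees the existence of an $n_2$ such that $\tau_n \le \bar{\tau}$ for all $n \ge n_2$. The second summand can be made arbitrarily small by C6: in particular, let $n_3$ be such that it is smaller than $\delta/6$. Letting $n_0 \ge \max\{n_2, n_3\}$ proves Lemma 4.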
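The existence of the temperature threshold $\bar{\tau}$ in the proof can be made explicit with a short computation. The sketch below is an illustration under one particular choice, not the construction used in the paper: requiring each exponential term to be at most $\delta/12$ gives $\bar{\tau} = \min(\zeta_{1,0}, \zeta_{2,0}) / \ln(12/\delta)$, and the two-term bound is then at most $\delta/6$ for every $\tau \le \bar{\tau}$. The function names and the numerical values of $\zeta_0$ and $\delta$ are hypothetical.

```python
import math


def temperature_threshold(zeta, delta):
    """A tau_bar making each term exp(-zeta_i / tau) at most delta / 12 (0 < delta < 1)."""
    return min(zeta) / math.log(12.0 / delta)


def escape_bound(zeta, tau):
    """The bound exp(-zeta_1 / tau) + exp(-zeta_2 / tau) from the proof of Lemma 4."""
    return sum(math.exp(-z / tau) for z in zeta)


zeta0 = (1, 0.25)      # hypothetical (zeta_{1,0}, zeta_{2,0})
delta = 0.1            # hypothetical tolerance
tau_bar = temperature_threshold(zeta0, delta)
print(tau_bar, escape_bound(zeta0, tau_bar), delta / 6)
```

Lower temperatures only shrink the bound further, since each exponential term is increasing in $\tau$.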
VI. PRACTICAL CONSIDERATIONS
The previous sections have demonstrated that SA is a
consistent optimization algorithm for the falsification of
temporal logical properties of hybrid automata; that is, the
sequence of samples it produces converges (in probability)
to the set of global optima, regardless of the initial sample.
This is an important property since it guarantees that longer
runs of SA will produce better minima. From a practical
standpoint, previous work [14] has demonstrated that in
practice, SA performs well, both in terms of speed and
quality of obtained minimum. Rather than replicate those
experiments here, we focus instead on the factors that affect
finite-time performance⁴, and how they affect it.

⁴ Finite-time performance tells us how close the current minimum is, after $N$ samples generated, to the global minimum. The answer is naturally affected by the likelihood of SA to spend many samples near non-global minima for this system and specification.

It is well-known that the practical performance of SA de-
pends on the specific objective function being optimized, the
particular cooling schedule, and the neighborhood selection;
see [26] for a good review of these issues. In our case,
the neighborhood is all of Init as per condition C4, and
the cooling schedule is adaptively modified to maintain an
acceptance-to-rejection ratio close to 1. This is permitted by
condition C6, and has been shown experimentally to help
avoid local minima traps [14]. The objective function is
directly related to the system and property being falsified,
and we now briefly illustrate how its graph can affect
convergence. We select two benchmark hybrid automata and
corresponding unsafe sets, and study the following three
issues for each:
- Generate the partition $\mathcal{P}$ of its $Init$ set. This allows us to assess whether it satisfies condition C7.
- Plot the graph of the objective function, which can indicate the difficulty of this problem instance.
- See if vector SA generates samples with different sequences of locations $(\ell_i)$ for our system. If a run of vector SA generates very few different $(\ell_i)$, this might indicate a local minimum trap, which should not be confused with having converged (indeed, Lemma 2 asserts that the tail of the generated sequence consists of samples with the same sequence of locations with increasing probability).
Our first system, Sys1, is a 2D, 5-location hybrid automa-
ton with linear dynamics in each location:
F low(1, x)=0.11 10
100 1x, F low (2, x)=0.11 100
10 1x
F low(3, x) = F low (4, x)=0.11 100
10 1x
F low(5, x)=0.1110
10 1 x
Guard(1,2) = {x1>0}, Guard(2,3) = {x2<0x1<4.5}
Guard(3,4) = {x1<0}, Guard(2,5) = {x2<0x1>4.5}
Guard(4,1) = {x2>0}, Guard(5,3) = {x1<4.5}
U={3< x1<4,3< x2<4}
To generate the graph of the robustness function, we sampled
the initial set with a step size of 0.01 in both dimensions,
leading to a grid with 3600 points. The graph (not shown here
for lack of space) displays several near-flat ‘valleys’ (regions
of small function values) surrounded by ‘peaks’ (regions of
large function values): if SA samples from a given valley, it
will continue sampling from it for a long time because the
probability of accepting an increase in function value will
be small, following Eq.(3). Thus we expect that once SA samples from a valley that contains a global minimum, then there's a high probability it will get arbitrarily close to that minimum.
Using the same grid, the partition Pof the initial set was
obtained. A coarser partition (with step size = 0.02) is shown
in Fig.1 for clarity. A total of 4 parts were obtained, each
corresponding to a sequence of locations. It can be seen that
Parts 1 and 3 cover much of Init (see Fig.1), so it is to be
expected that at least initially, SA will sample from these 2
parts overwhelmingly. If the global minimum is in a different
part, longer runs of SA are required.
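The grid-based estimate of the partition can be sketched as follows. This is only an illustration of the idea: the function `loc` that returns the sequence of locations of the trajectory starting at a point is assumed to be supplied by a simulator, and the toy `loc` used here, together with the grid bounds and step, is hypothetical and unrelated to Sys1.

```python
from collections import defaultdict


def partition_grid(loc, lo, hi, step):
    """Group 2D grid points of the box [lo, hi] by the location sequence of their trajectory."""
    parts = defaultdict(list)
    nx = int(round((hi[0] - lo[0]) / step))
    ny = int(round((hi[1] - lo[1]) / step))
    for i in range(nx + 1):
        for j in range(ny + 1):
            x = (lo[0] + i * step, lo[1] + j * step)
            parts[tuple(loc(x))].append(x)
    return parts


def part_fractions(parts):
    """Fraction of grid points in each part, a rough proxy for its relative measure."""
    total = sum(len(points) for points in parts.values())
    return {seq: len(points) / total for seq, points in parts.items()}


# Toy stand-in for a simulator: trajectories starting with x1 > 0 jump once.
toy_loc = lambda x: (1, 2) if x[0] > 0 else (1,)
parts = partition_grid(toy_loc, (-1.0, -1.0), (1.0, 1.0), 0.5)
fractions = part_fractions(parts)
print(fractions)
```

Parts that receive only a handful of grid points are candidates for violating condition C7, although a finite grid cannot by itself establish that a part has measure zero.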
Fig. 1: $Init$ partition for Sys1. Parts correspond to the sequences $s_i$.
We then ran SA on Sys1 with a sample size of 1000 initial
points, and a trajectory duration of 2sec. All 4 parts were
visited, with the vast majority of the points chosen from
Part 3. This is in accordance with the observations made
above about the partition of Init and shape of the graph.
It is notable that the minimum found by SA (down to two
significant figures) is [1.35,1.74], almost at the boundary of
parts 3 and 4. These points are harder to find for SA because
of the different sequences on either side of the boundary.
The corresponding robustness value is 0.5 (compare to global
minimum robustness value of 0.49, found on above grid, at
[1.58,1.79]).
We also re-use the Nav0 benchmark from [27], which
we will argue is a harder instance for SA. Nav0 is a 4-
dimensional automaton with 16 locations, and it is unknown
whether it is falsifiable or not. For the purposes of presenting
results graphically, we fix the last 2 dimensions of the state
vector to [x3, x4] = [0.1,0.2], and let SA vary the first two
dimensions. The graph of the robustness function is given
in Fig.2. The graph was obtained by sampling the initial set
with a step size of 0.01 (for a total of 3600 points), and
computing a trajectory of duration 10sec. We can observe
a large number of minima with varying depths, increasing
the odds of SA spending a large number of samples in these
minima before jumping back out.
Fig. 2: Graph of the robustness function for Nav0.
Using the same grid, the partition $\mathcal{P}$ of the initial set was computed. The partition has 189 parts, with the largest part containing only 15.3% of the points, and the 4 largest parts together have 53.3% of the points. Fig.3 shows how the initial set of conditions is highly fragmented. While we haven't established a precise relation between the measure of the parts and SA convergence, condition C7 and the proof of Prop.1 drive us to conjecture that a larger measure for the parts leads to a faster convergence. The small size of all parts in Nav0 then suggests this is a hard instance for SA.
Fig. 3: Zoom on Init partition for Nav0. Different symbols
and colors correspond to different sequences of locations.
Note the fragmentation on the right side.
SA was run on Nav0 with a sample size of 1000 initial
points, and a test duration of 10sec. It selected points from 62
parts, with the largest 2 containing 70% of the 1000 points,
and the smallest containing 0.01% of the points. These parts
were not the largest parts in P, which is expected since SA
is driven by the objective function as well as the measure
of the parts. SA found a minimum value of 0.09 at
[0.61,3.40], compared to the global (grid) minimum of 0.08
at [0.67,3.52].
On the theoretical side, the probabilistic convergence of
the sequence of initial states generated by SA to a global
optimum, can be informally divided into two parts: the first
part is the convergence of the Markov Chain to its zero-temperature stationary distribution, $\pi$, in an appropriate sense. The stationary distribution favors minima of the objective function. The second part is then the sampling from the stationary distribution. Thus a bound on SA's convergence involves bounding these two components. The second part depends on the shape of the stationary distribution. In our case, it is exponential of the form $\pi(x) = \frac{1}{M} e^{-\beta V(x)}$ [14], where $M$ is the (unknown) normalization constant. For the first part, recently, a result on finite-time guarantees for SA optimization over continuous domains was obtained in [28]. Informally, given a desired precision of the optimization, it provides a number of samples after which we are guaranteed that the minimum so far is within the desired precision of the global minimum. The strength of this result is that it requires very little of the objective function, namely only that it be well-defined pointwise, measurable, and bounded between 0 and 1. In traditional SA, $g \circ V(x)$ satisfies these conditions, with $V$ given in Problem 1 and $g$ any monotone increasing function that maps $[0, \infty]$ to $[0, 1]$. No such result exists yet
for a vector objective function, like the one in Problem 2;
this is the subject of future research.
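The two ingredients above, an exponential target distribution and a rescaled objective $g \circ V$ bounded between 0 and 1, can be illustrated with a minimal Python sketch. It is not the algorithm of [28] or of [14]. It assumes a one-dimensional search space [0, 1], a fixed inverse temperature `beta`, independent uniform proposals, a hypothetical discontinuous objective `V`, and the particular choice g(v) = v/(1+v). All names are illustrative.

```python
import math
import random


def g(v):
    """One possible monotone increasing map from [0, inf] to [0, 1] (assumption)."""
    return 1.0 if math.isinf(v) else v / (1.0 + v)


def V(x):
    """Hypothetical nonnegative objective with a jump at x = 0.5 (not from the paper)."""
    return abs(x - 0.2) if x < 0.5 else 1.0 + abs(x - 0.8)


def metropolis_min(objective, beta, n_samples, seed=0):
    """Metropolis chain on [0, 1] targeting pi(x) proportional to exp(-beta * objective(x)).

    Returns the best point and the best objective value visited so far.
    """
    rng = random.Random(seed)
    x = rng.random()
    fx = objective(x)
    best_x, best_f = x, fx
    for _ in range(n_samples):
        y = rng.random()  # independent uniform proposal (symmetric)
        fy = objective(y)
        # Accept with probability min(1, exp(-beta * (fy - fx))).
        if fy <= fx or rng.random() < math.exp(-beta * (fy - fx)):
            x, fx = y, fy
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f


# The rescaled objective g(V(x)) is bounded between 0 and 1.
best_x, best_f = metropolis_min(lambda x: g(V(x)), beta=20.0, n_samples=2000)
print(best_x, best_f)
```

In this sketch only pointwise evaluations of the objective are used, so the jump at x = 0.5 poses no difficulty for the sampler itself; how many samples are needed for a given precision is exactly what a finite-time bound would quantify.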
VII. CONCLUSIONS
The problem of falsifying safety properties of hybrid
automata was formulated as an optimization problem in [14].
In this paper, we provided conditions on the system under
which Simulated Annealing will converge in probability to
the global minimum, and thus return a system trajectory
of minimal robustness. Research can proceed along three
directions: the first is to establish convergence conditions for
discrete implementations of the continuous models studied in
this paper. A second direction is to broaden the class of sys-
tems for which SA converges, e.g. to systems with more than
one starting location, and to automata with state- and time-
dependent guard sets, as well as to non-autonomous systems.
A third direction of research is to develop computable criteria
that establish whether a given system is, a priori, suitable
for efficient SA falsification or not. As part of this direction,
there is obvious interest in establishing finite-time guarantees
for the vector SA algorithm.
ACKNOWLEDGMENT
The authors wish to thank Tolga Duman for many helpful
discussions.
REFERENCES
[1] M. Blair, S. Obenski, and P. Bridickas, “Patriot missile software
problem,” United States General Accounting Office, Tech. Rep.
GAO/IMTEC-92-26, 1992.
[2] E. J. Hoffman, W. L. Ebert, M. D. Femiano, H. R. Freeman, C. J.
Gay, C. P. Jones, P. J. Luers, and J. G. Palmer, “The near rendezvous
burn anomaly of December 1998,” Applied Physics Laboratory, Johns
Hopkins University, Tech. Rep., Nov. 1999.
[3] T. A. Henzinger, P. W. Kopke, A. Puri, and P. Varaiya, “What’s
decidable about hybrid automata?” J. Comput. Syst. Sci., vol. 57, no. 1,
pp. 94–124, 1998.
[4] S. Ratschan and J.-G. Smaus, “Finding errors of hybrid systems by
optimizing an abstraction-based quality estimate,” in Proceedings of
the Third Int’l Conf. on Tests and Proofs, Zurich, Switzerland, July
2009, pp. 153–168.
[5] Q. Zhao, B. H. Krogh, and P. Hubbard, “Generating test inputs for
embedded control systems,” IEEE Control Systems Magazine, vol.
Aug., pp. 49–57, 2003.
[6] M. Branicky, M. Curtiss, J. Levine, and S. Morgan, “Sampling-based
planning, control and verification of hybrid systems,” IEE Proc.-
Control Theory Appl., vol. 153, no. 5, pp. 575–590, 2006.
[7] T. Nahhal and T. Dang, “Test coverage for continuous and hybrid
systems,” in CAV, ser. LNCS, vol. 4590. Springer, 2007, pp. 449–
462.
[8] E. Plaku, L. E. Kavraki, and M. Y. Vardi, “Hybrid systems: From
verification to falsification,” in Proceedings of the 19th International
Conference on Computer Aided Verification, ser. LNCS, W. Damm
and H. Hermanns, Eds., vol. 4590. Springer, 2007, pp. 463–476.
[9] ——, “Falsification of LTL safety properties in hybrid systems,” in Proc.
of the Conf. on Tools and Algorithms for the Construction and Analysis
of Systems (TACAS), ser. LNCS, vol. 5505, 2009, pp. 368 – 382.
[10] A. Rizk, G. Batt, F. Fages, and S. Soliman, “On a continuous degree
of satisfaction of temporal logic formulae with applications to systems
biology,” in International Conference on Computational Methods in
Systems Biology, ser. LNCS, no. 5307. Springer, 2008, pp. 251–268.
[11] P. Zuliani, A. Platzer, and E. M. Clarke, “Bayesian statistical model
checking with application to Simulink/Stateflow verification,” in
Proceedings of the 13th ACM International Conference on Hybrid
Systems: Computation and Control, 2010, pp. 243–252.
[12] A. Donze and O. Maler, “Systematic simulation using sensitivity
analysis,” in Hybrid Systems: Computation and Control, ser. LNCS,
vol. 4416. Springer, 2007, pp. 174–189.
[13] F. Lerda, J. Kapinski, E. M. Clarke, and B. H. Krogh, “Verification
of supervisory control software using state proximity and merging,”
in Hybrid Systems: Computation and Control, ser. LNCS, vol. 4981.
Springer, 2008, pp. 344–357.
[14] H. Abbas, G. E. Fainekos, S. Sankaranarayanan, F. Ivancic, A. Gupta,
and G. J. Pappas, “Probabilistic temporal logic falsification of cyber-
physical systems,” ACM Transactions on Embedded Computing Sys-
tems, vol. (Accepted), 2011.
[15] Y. S. R. Annapureddy, C. Liu, G. E. Fainekos, and S. Sankara-
narayanan, “S-taliro: A tool for temporal logic falsification for hybrid
systems,” in Tools and algorithms for the construction and analysis
of systems, ser. LNCS, vol. 6605. Springer, 2011, pp. 254–257.
[16] T. Nghiem, S. Sankaranarayanan, G. Fainekos, F. Ivancic, A. Gupta,
and G. Pappas, “Monte-carlo techniques for falsification of temporal
properties of non-linear hybrid systems,” in Hybrid Systems: Compu-
tation and Control, 2010.
[17] G. Fainekos and G. Pappas, “Robustness of temporal logic specifica-
tions for continuous-time signals,” Theoretical Computer Science, vol.
410, no. 42, pp. 4262–4291, September 2009.
[18] C. J. P. Belisle, “Convergence theorems for a class of simulated
annealing algorithms on Rd,” Journal of Applied Probability, vol. 29,
no. 4, pp. 885–895, Dec. 1992.
[19] B. Hajek, “Cooling schedules for optimal annealing,” Mathematics of
Operations Research, vol. 13, no. 2, pp. 311–329, 1988.
[20] R. G. Sanfelice and A. R. Teel, “Dynamical properties of hybrid
systems simulators,” Automatica, vol. 46, no. 2, pp. 239–248, 2010.
[21] P. Tabuada, Verification and Control of Hybrid Systems: A Symbolic
Approach. Springer, 2009.
[22] A. Girard and G. J. Pappas, “Approximation metrics for discrete and
continuous systems,” IEEE Trans. Auto. Cont., vol. 52, no. 5, pp. 782–
798, 2007.
[23] J. Lygeros, K. H. Johansson, S. N. Simic, J. Zhang, and S. Sastry,
“Dynamical properties of hybrid automata,” IEEE Transactions on
Automatic Control, vol. 48, pp. 2–17, 2003.
[24] K. I. Smith, R. M. Everson, J. E. Fieldsend, C. Murphy, and R. Misra,
“Dominance-based multiobjective simulated annealing,” IEEE
Transactions on Evolutionary Computation, vol. 12, no. 3, pp. 323–342,
2008.
[25] P. Czyzak and A. Jaszkiewicz, “Pareto simulated annealing - a
metaheuristic technique for multiple-objective combinatorial optimization,”
J. Multi-Criteria Decision Analysis, vol. 7, pp. 34–47, 1998.
[26] D. Henderson, S. H. Jacobson, and A. W. Johnson, “The theory
and practice of simulated annealing,” in Handbook of metaheuristics.
Springer, 2003.
[27] H. Abbas and G. Fainekos, “Linear hybrid system falsification through
local search,” in Automated Technology for Verification and Analysis,
ser. LNCS, vol. 6996. Springer, 2011, pp. 503–510.
[28] A. Lecchini-Visintini, J. Lygeros, and J. M. Maciejowski, “Stochastic
optimization on continuous domains with finite-time guarantees by
Markov chain Monte Carlo methods,” IEEE Transactions on Automatic
Control, vol. 55, no. 12, pp. 2858–2863, Dec. 2010.
[29] R. Goebel, R. G. Sanfelice, and A. R. Teel, “Hybrid dynamical
systems,” IEEE Control Systems Magazine, pp. 28–93, 2009.
VIII. APPENDIX
This section details the ‘hybrid basic conditions’ that are
required of a hybrid automaton $H$ for it to have an accurate
simulator. The conditions are framed in the formalism of
differential and difference inclusions [29], which generalizes
the formalism used in this paper.
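Definition 4: [20] A hybrid system $H$ on a state space
$\mathbb{R}^n$ is defined by
• A set $C \subset \mathbb{R}^n$ called the flow set
• A set-valued map $F : \mathbb{R}^n \to \mathcal{P}(\mathbb{R}^n)$ called the flow
map
• A set $D \subset \mathbb{R}^n$ called the jump set
• A set-valued map $G : \mathbb{R}^n \to \mathcal{P}(\mathbb{R}^n)$ called the jump
map
We write $H = (C, F, D, G)$. The dynamics of the system
are given by
\[
\xi \in \mathbb{R}^n : \quad
\begin{cases}
\dot{\xi} \in F(\xi), & \xi \in C \\
\xi^{+} \in G(\xi), & \xi \in D
\end{cases}
\]
Hybrid automata, defined in section II-A, can be modeled
using Def. 4 as follows [29]: let $n = d + 1$ and let
$\xi = [\ell, x]^T \in L \times \mathbb{R}^d \subset \mathbb{R}^{d+1}$
be the state of the system. Then we take:
\[
C_\ell = Inv(\ell), \qquad
D_\ell = \bigcup_{\ell' : (\ell, \ell') \in E} Guard(\ell, \ell'), \qquad
F_\ell(x) = Flow(\ell, x) \;\; \forall x \in C_\ell,
\]
and
\[
G_\ell(x) = \{ [\ell', Re(\ell, \ell', x)]^T \mid x \in Guard(\ell, \ell') \} \;\; \forall x \in D_\ell.
\]
For a deterministic automaton, for any $x$, there is a unique
location $\ell'$ such that $x \in Guard(\ell, \ell')$ (otherwise, two jumps
$(\ell, \ell')$ and $(\ell, \ell'')$ are possible). Thus $G_\ell(x)$ is a singleton.
The dynamics of $H$ are then given by:
\[
[\ell, x]^T \in L \times \mathbb{R}^d : \quad
\begin{cases}
\dot{\xi} \in F(\xi) = \{ [0, F_\ell(x)]^T \}, & x \in C_\ell \\
\xi^{+} \in G(\xi) = \{ [\ell', Re(\ell, \ell', x)]^T \}, & x \in D_\ell
\end{cases}
\]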
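The encoding of a hybrid automaton as data $(C, F, D, G)$ can be made concrete with a small Python sketch. Everything in it is an assumption made for illustration and none of it comes from [20] or [29]: the two-location thermostat, the scalar continuous state, the forward Euler integration, the rule that a jump takes priority when the state lies in both the flow set and the jump set, and all names (`HybridSystem`, `simulate`, `thermostat`).

```python
from dataclasses import dataclass
from typing import Callable, Tuple

State = Tuple[int, float]  # xi = [l, x]: location index and scalar continuous state


@dataclass
class HybridSystem:
    in_C: Callable[[int, float], bool]  # membership test for the flow set C_l = Inv(l)
    F: Callable[[int, float], float]    # flow map F_l(x), single-valued here
    in_D: Callable[[int, float], bool]  # membership test for the jump set D_l
    G: Callable[[int, float], State]    # jump map, returns [l', Re(l, l', x)]


def simulate(h: HybridSystem, xi0: State, t_end: float,
             dt: float = 1e-3, max_jumps: int = 10_000):
    """Forward Euler simulation; jumps take priority over flows (assumption)."""
    l, x = xi0
    t, jumps = 0.0, 0
    while t < t_end and jumps < max_jumps:
        if h.in_D(l, x):
            l, x = h.G(l, x)       # discrete transition, time does not advance
            jumps += 1
        elif h.in_C(l, x):
            x += dt * h.F(l, x)    # continuous evolution inside the invariant
            t += dt
        else:
            break                  # state left both C and D: trajectory ends
    return (l, x), jumps


# Hypothetical thermostat: location 0 = heater off, location 1 = heater on.
thermostat = HybridSystem(
    in_C=lambda l, x: x >= 18.0 if l == 0 else x <= 22.0,  # closed invariants
    F=lambda l, x: -x if l == 0 else 30.0 - x,
    in_D=lambda l, x: x <= 18.0 if l == 0 else x >= 22.0,  # closed guards
    G=lambda l, x: (1 - l, x),                             # identity reset
)

final_state, n_jumps = simulate(thermostat, (0, 20.0), t_end=5.0)
print(final_state, n_jumps)
```

In this toy system each location has exactly one outgoing guard, so the jump map returns a single state, matching the singleton $G_\ell(x)$ of a deterministic automaton.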
Conditions for accurate simulation [20, Assumption 2.5].
The data of the hybrid system $H = (C, F, D, G)$ satisfies:
A1. $C$ and $D$ are closed sets.
A2. $F : \mathbb{R}^n \to \mathcal{P}(\mathbb{R}^n)$ is outer semicontinuous (o.s.c.)$^5$ and
locally bounded, and $F(\xi)$ is nonempty and convex for all
$\xi \in C$.
A3. $G : \mathbb{R}^n \to \mathcal{P}(\mathbb{R}^n)$ is o.s.c. and locally bounded, and
$G(\xi)$ is nonempty for all $\xi \in D$.
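As an illustration of condition A2 (a standard textbook example, not taken from [20]), consider a scalar flow map built from the sign function. The single-valued version fails outer semicontinuity at the origin, while the set-valued version that fills in the interval at the origin satisfies all of A2.

```latex
% Single-valued sign map: not o.s.c. at 0.
F_1(\xi) = \{\operatorname{sign}(\xi)\}, \qquad \operatorname{sign}(0) = 0.
% Take \xi_i = 1/i \to 0 and \omega_i = 1 \in F_1(\xi_i); then \omega_i \to 1,
% but 1 \notin F_1(0) = \{0\}.

% Set-valued version: o.s.c., locally bounded, nonempty and convex values.
F_2(\xi) =
\begin{cases}
\{-1\},   & \xi < 0 \\
[-1, 1],  & \xi = 0 \\
\{1\},    & \xi > 0
\end{cases}
% Any limit \omega of \omega_i \in F_2(\xi_i) with \xi_i \to 0 lies in [-1, 1] = F_2(0).
```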
Proposition 1 is an application of the following theorem.
A ‘maximal’ trajectory is one which cannot be extended.
Recall that $D$ is the time limit of all trajectories we compute.
Theorem 4: [20, Thm. 3.4] Assume that $H$ satisfies the
above conditions. Let $K \subset \mathbb{R}^n$ be a compact set such that $H$
is pre-complete from $K$ (i.e. each maximal trajectory starting
from $K$ is either bounded or has infinite length). Then, for
every $\epsilon > 0$ there exists $\delta^* > 0$ with the following property:
for any $\delta \in (0, \delta^*]$ and any solution $\eta_x$ with $x \in K + \delta B_1(0)$
there exists a solution $\eta_{x'}$ with $x' \in K$ such that the two
trajectories have the same number of location transitions, and
their continuous parts are $\epsilon$-close at all times $t < D$ at which
they are in the same location.
$^5$ A set-valued map $F : \mathbb{R}^n \to \mathcal{P}(\mathbb{R}^n)$ is o.s.c. iff for all sequences
$(\xi_i) \subset \mathbb{R}^n$ converging to $\xi$ and all sequences $(\omega_i)$ with $\omega_i \in F(\xi_i)$ converging to
$\omega$, it holds that $\omega \in F(\xi)$.