## No full-text available

To read the full-text of this research, you can request a copy directly from the authors.

... Recently, machine learning techniques have also been applied to falsification to enhance the search ability. For instance, Bayesian optimization [3,11,36] utilizes an acquisition function to balance exploration and exploitation; reinforcement learning [27,37] naturally emphasizes exploration. ...
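
The exploration/exploitation balance an acquisition function provides can be sketched in a few lines. This is a generic upper-confidence-bound rule, not the acquisition used by any of the cited tools; the candidate tuples and `kappa` value are invented for illustration.

```python
import math

def ucb_acquisition(mean, std, kappa=2.0):
    """UCB acquisition for a *minimization* problem: a lower predicted
    mean (exploitation) and a larger predictive uncertainty
    (exploration) both make a candidate more attractive."""
    return mean - kappa * std

def pick_next(candidates):
    """candidates: list of (x, predicted_mean, predicted_std) tuples
    coming from some surrogate model (e.g., a Gaussian process)."""
    return min(candidates, key=lambda c: ucb_acquisition(c[1], c[2]))[0]

# Point B has the worse mean but much larger uncertainty, so the
# acquisition prefers exploring it.
cands = [("A", 0.5, 0.1), ("B", 0.8, 1.0)]
print(pick_next(cands))  # "B"
```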

Hybrid system falsification is an important quality assurance method for cyber-physical systems, offering better scalability and practical feasibility than exhaustive verification. Given a desired temporal specification, falsification tries to find a violating input rather than a proof of correctness. The state-of-the-art falsification approaches often employ stochastic hill-climbing optimization that minimizes the degree of satisfaction of the temporal specification, given by its quantitative robust semantics. However, it has been shown that the performance of falsification can be severely affected by the so-called scale problem, related to the different scales of the signals used in the specification (e.g., rpm and speed): in the robustness computation, the contribution of one signal can be masked by another. In this paper, we propose a novel approach to tackle this problem. We first introduce a new robustness definition, called QB-Robustness, which combines classical Boolean satisfaction and quantitative robustness. We prove that QB-Robustness can be used to judge the satisfaction of the specification and avoid the scale problem in its computation. QB-Robustness is exploited by a falsification approach based on Monte Carlo Tree Search over the structure of the formal specification. First, tree traversal identifies the sub-formulas for which the quantitative robustness needs to be computed. Then, on the leaves, numerical hill-climbing optimization is performed, aiming to falsify those sub-formulas. Our in-depth evaluation on multiple benchmarks demonstrates that our approach achieves better falsification results than state-of-the-art falsification approaches guided by classical quantitative robustness, and it is largely unaffected by the scale problem.
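
The scale problem described above can be seen in a two-line computation. The bounds, signal names, and sample values below are invented for illustration; the point is only that the classical min-based robustness of a conjunction is dominated by the signal with the smaller numeric scale.

```python
def conj_robustness(speed, rpm, v_max=120.0, r_max=4700.0):
    # Classical robustness of "speed < v_max AND rpm < r_max" at one
    # time point: the minimum of the two satisfaction margins.
    return min(v_max - speed, r_max - rpm)

# rpm moves much closer to its bound, but because rpm margins live on
# a scale of thousands while speed margins live on a scale of tens,
# the speed margin keeps deciding the minimum: the optimizer receives
# no signal at all about the change in rpm.
print(conj_robustness(80.0, 4000.0))  # 40.0
print(conj_robustness(80.0, 4650.0))  # 40.0 (unchanged)
```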

Falsification of hybrid systems is attracting ever-growing attention in quality assurance of Cyber-Physical Systems (CPS) as a practical alternative to exhaustive formal verification. In falsification, one searches for a falsifying input that drives a given black-box model to output an undesired signal. In this paper, we identify input constraints—such as the constraint “the throttle and brake pedals should not be pressed simultaneously” for an automotive powertrain model—as a key factor for the practical value of falsification methods. We propose three approaches for systematically addressing input constraints in optimization-based falsification, two of which come from the lexicographic method studied in the context of constrained multi-objective optimization. Our experiments show the approaches’ effectiveness.

We study the problem of computing input signals that produce system behaviors that falsify requirements written in temporal logic. We provide a method to automatically search for falsifying time-varying uncertain inputs for nonlinear and possibly hybrid systems. The input to the system is parametrized using piecewise constant signals with varying switch times. By applying small perturbations to the system input in space and time, and by using a gradient descent approach, we try to converge to the worst local system behavior. The experimental results on non-trivial benchmarks demonstrate that this local search can significantly improve the rate of finding falsifying counterexamples.
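
The perturbation-based descent can be sketched with finite differences over the parameters of a piecewise-constant input. This is a minimal stand-in, not the paper's algorithm: `simulate` here is a hypothetical black box returning the specification's robustness for two constant input segments, and its quadratic form is invented so the example has a known minimum.

```python
def simulate(u):
    # Toy stand-in for "simulate the system, monitor the spec":
    # robustness as a smooth function of the two input segments.
    return (u[0] - 1.0) ** 2 + (u[1] + 2.0) ** 2

def descend(u, step=0.1, eps=1e-4, iters=200):
    """Finite-difference gradient descent on the input parameters."""
    u = list(u)
    for _ in range(iters):
        grad = []
        for i in range(len(u)):
            v = list(u)
            v[i] += eps  # small perturbation of one segment value
            grad.append((simulate(v) - simulate(u)) / eps)
        u = [ui - step * gi for ui, gi in zip(u, grad)]
    return u

u = descend([0.0, 0.0])
print([round(x, 2) for x in u])  # converges to about [1.0, -2.0]
```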

Few real-world hybrid systems are amenable to formal verification, due to their complexity and black box components. Optimization-based falsification---a methodology of search-based testing that employs stochastic optimization---is attracting attention as an alternative quality assurance method. Inspired by the recent works that advocate coverage and exploration in falsification, we introduce a two-layered optimization framework that uses Monte Carlo tree search (MCTS), a popular machine learning technique with solid mathematical and empirical foundations. MCTS is used in the upper layer of our framework; it guides the lower layer of local hill-climbing optimization, thus balancing exploration and exploitation in a disciplined manner.
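
The disciplined exploration/exploitation balance MCTS offers comes from its UCB1 selection rule, which can be sketched compactly. The node structure and reward convention below (e.g., rewarding low robustness) are illustrative assumptions, not the paper's implementation.

```python
import math

class Node:
    def __init__(self):
        self.visits = 0
        self.reward = 0.0  # accumulated reward, e.g. negated robustness

def ucb1(child, parent_visits, c=1.4):
    if child.visits == 0:
        return float("inf")  # always try unvisited children first
    exploit = child.reward / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore

def select(children, parent_visits):
    """Pick the child (e.g., an input sub-region) to descend into."""
    return max(range(len(children)),
               key=lambda i: ucb1(children[i], parent_visits))

a, b = Node(), Node()
a.visits, a.reward = 10, 5.0   # well-explored, mean reward 0.5
b.visits, b.reward = 1, 0.4    # barely explored, mean reward 0.4
print(select([a, b], 11))      # 1: the exploration bonus favors b
```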

Search-based testing is widely used to find bugs in models of complex Cyber-Physical Systems. Latest research efforts have improved this approach by casting it as a falsification procedure of formally specified temporal properties, exploiting the robustness semantics of Signal Temporal Logic. The scaling of this approach to highly complex engineering systems requires efficient falsification procedures, which should be applicable also to black box models. Falsification is also exacerbated by the fact that inputs are often time-dependent functions. We tackle the falsification of formal properties of complex black box models of Cyber-Physical Systems, leveraging machine learning techniques from the area of Active Learning. Tailoring these techniques to the falsification problem with time-dependent, functional inputs, we show a considerable saving in computational effort by reducing the number of model simulations needed. The merits of the proposed approach are demonstrated on a challenging industrial-level benchmark from the automotive domain.

Many industrial cyber-physical system (CPS) designs are too complex to formally verify system-level properties. A practical approach for testing and debugging these system designs is falsification, wherein the user provides a temporal logic specification of correct system behaviors, and some technique for selecting test cases is used to identify behaviors that demonstrate that the specification does not hold for the system. While coverage metrics are often used to measure the exhaustiveness of this kind of testing approach for software systems, existing falsification approaches for CPS designs do not consider coverage for the signal variables. We present a new coverage measure for continuous signals and a new falsification technique that leverages the measure to efficiently identify falsifying traces. This falsification algorithm combines global and local search methods and uses a classification technique based on support vector machines to identify regions of the search space on which to focus effort. We use an industrial example from an automotive fuel cell application and other benchmark models to compare the new approach against existing falsification tools.

We propose a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers. We present asynchronous variants of four standard reinforcement learning algorithms and show that parallel actor-learners have a stabilizing effect on training allowing all four methods to successfully train neural network controllers. The best performing method, an asynchronous variant of actor-critic, surpasses the current state-of-the-art on the Atari domain while training for half the time on a single multi-core CPU instead of a GPU. Furthermore, we show that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task involving finding rewards in random 3D mazes using a visual input.

Model-free reinforcement learning has been successfully applied to a range of challenging problems, and has recently been extended to handle large neural network policies and value functions. However, the sample complexity of model-free algorithms, particularly when using high-dimensional function approximators, tends to limit their applicability to physical systems. In this paper, we explore algorithms and representations to reduce the sample complexity of deep reinforcement learning for continuous control tasks. We propose two complementary techniques for improving the efficiency of such algorithms. First, we derive a continuous variant of the Q-learning algorithm, which we call normalized advantage functions (NAF), as an alternative to the more commonly used policy gradient and actor-critic methods. The NAF representation allows us to apply Q-learning with experience replay to continuous tasks, and substantially improves performance on a set of simulated robotic control tasks. To further improve the efficiency of our approach, we explore the use of learned models for accelerating model-free reinforcement learning. We show that iteratively refitted local linear models are especially effective for this, and demonstrate substantially faster learning on domains where such models are applicable.

The popular Q-learning algorithm is known to overestimate action values under certain conditions. It was not previously known whether, in practice, such overestimations are common, whether this harms performance, and whether they can generally be prevented. In this paper, we answer all these questions affirmatively. In particular, we first show that the recent DQN algorithm, which combines Q-learning with a deep neural network, suffers from substantial overestimations in some games in the Atari 2600 domain. We then show that the idea behind the Double Q-learning algorithm, which was introduced in a tabular setting, can be generalized to work with large-scale function approximation. We propose a specific adaptation to the DQN algorithm and show that the resulting algorithm not only reduces the observed overestimations, as hypothesized, but that this also leads to much better performance on several games.
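
The core idea generalized above can be shown in a toy tabular form: one value estimate selects the greedy action, the other evaluates it, which damps the overestimation produced by taking a max over noisy estimates. The dictionaries and values below are invented for illustration.

```python
def double_q_target(q_online, q_target, reward, gamma=0.99):
    """q_online / q_target: action -> value dicts for the next state."""
    best_action = max(q_online, key=q_online.get)  # select with one estimate
    return reward + gamma * q_target[best_action]  # evaluate with the other

q_online = {"left": 1.2, "right": 0.9}  # noisy, overestimates "left"
q_target = {"left": 0.8, "right": 1.0}
print(round(double_q_target(q_online, q_target, reward=0.0), 3))  # 0.792
```

Standard Q-learning would instead use max over `q_target` alone (here 0.99 * 1.0 = 0.99), propagating the largest, and most optimistically biased, estimate.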

We propose a family of trust region policy optimization (TRPO) algorithms for learning control policies. We first develop a policy update scheme with guaranteed monotonic improvement, and then we describe a finite-sample approximation to this scheme that is practical for large-scale problems. In our experiments, we evaluate the method on two different and very challenging sets of tasks: learning simulated robotic swimming, hopping, and walking gaits, and playing Atari games using images of the screen as input. For these tasks, the policies are neural networks with tens of thousands of parameters, mapping from observations to actions.

Metric Temporal Logic (MTL) specifications can capture complex state and timing requirements. Given a nonlinear dynamical system and an MTL specification for that system, our goal is to find a trajectory that violates or satisfies the specification. This trajectory can be used as a concrete feedback to the system designer in the case of violation or as a trajectory to be tracked in the case of satisfaction. The search for such a trajectory is conducted over the space of initial conditions, system parameters and input signals. We convert the trajectory search problem into an optimization problem through MTL robust semantics. Robustness quantifies how close the trajectory is to violating or satisfying a specification. Starting from some arbitrary initial condition and parameter and given an input signal, we compute a descent direction in the search space, which leads to a trajectory that optimizes the MTL robustness. This process can be iterated to reach local optima (min or max). We demonstrate the method on examples from the literature.

Industrial control systems are often hybrid systems that are required to satisfy strict performance requirements. Verifying designs against requirements is a difficult task, and there is a lack of suitable open benchmark models to assess, evaluate, and compare tools and techniques. Benchmark models can be valuable for the hybrid systems research community, as they can communicate the nature and complexity of the problems facing industrial practitioners. We present a collection of benchmark problems from the automotive powertrain control domain that are focused on verification for hybrid systems; the problems are intended to challenge the research community while maintaining a manageable scale. We present three models of a fuel control system, each with a unique level of complexity, along with representative requirements in signal temporal logic (STL). We provide results obtained by applying a state of the art analysis tool to these models, and finally, we discuss challenge problems for the research community.

The automatic analysis of transient properties of nonlinear dynamical systems is a challenging problem. The problem is even more challenging when complex state-space and timing requirements must be satisfied by the system. Such complex requirements can be captured by Metric Temporal Logic (MTL) specifications. The problem of finding system behaviors that do not satisfy an MTL specification is referred to as MTL falsification. This paper presents an approach for improving stochastic MTL falsification methods by performing local search in the set of initial conditions. In particular, MTL robustness quantifies how correct or wrong is a system trajectory with respect to an MTL specification. Positive values indicate satisfaction of the property while negative values indicate falsification. A stochastic falsification method attempts to minimize the system's robustness with respect to the MTL property. Given some arbitrary initial state, this paper presents a method to compute a descent direction in the set of initial conditions, such that the new system trajectory gets closer to the unsafe set of behaviors. This technique can be iterated in order to converge to a local minimum of the robustness landscape. The paper demonstrates the applicability of the method on some challenging nonlinear systems from the literature.

Monitoring transient behaviors of real-time systems plays an important role in model-based systems design. Signal Temporal Logic (STL) emerges as a convenient and powerful formalism for continuous and hybrid systems. This paper presents an efficient algorithm for computing the robustness degree in which a piecewise-continuous signal satisfies or violates an STL formula. The algorithm, by leveraging state-of-the-art streaming algorithms from Signal Processing, is linear in the size of the signal and its implementation in the Breach tool is shown to outperform alternative implementations.
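
The min/max robustness rules such a monitor computes can be sketched naively over a sampled signal. This sketch is quadratic in the signal length, whereas the paper's streaming algorithm is linear; the signal values are invented for illustration.

```python
def rob_pred(signal, f):
    """Pointwise robustness of the predicate f(x) > 0."""
    return [f(x) for x in signal]

def rob_always(r):
    """G phi: worst case over the remaining horizon at each time."""
    return [min(r[t:]) for t in range(len(r))]

def rob_eventually(r):
    """F phi: best case over the remaining horizon at each time."""
    return [max(r[t:]) for t in range(len(r))]

sig = [3.0, 2.5, -0.5, 4.0]
r = rob_pred(sig, lambda x: x)   # robustness of "x > 0"
print(rob_always(r)[0])      # -0.5: "always x > 0" is violated
print(rob_eventually(r)[0])  # 4.0:  "eventually x > 0" holds robustly
```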

In this paper we introduce a variant of temporal logic tailored for specifying desired properties of continuous signals. The logic is based on a bounded subset of the real-time logic MITL, augmented with a static mapping from continuous domains into propositions. From formulae in this logic we automatically create property monitors that can check whether a given signal of bounded length and finite variability satisfies the property. A prototype implementation of this procedure was used to check properties of simulation traces generated by Matlab/Simulink.

S-TaLiRo is a Matlab (TM) toolbox that searches for trajectories of minimal robustness in Simulink/Stateflow diagrams. It can analyze arbitrary Simulink models or user-defined functions that model the system. At the heart of the tool, we use randomized testing based on stochastic optimization techniques including Monte-Carlo methods and Ant-Colony Optimization. Among the advantages of the toolbox is the seamless integration inside the Matlab environment, which is widely used in industry for model-based development of control software. We present the architecture of S-TaLiRo and its working on an application example.

With the rapid development of software and distributed computing, Cyber-Physical Systems (CPS) are widely adopted in many application areas, e.g., smart grids and autonomous vehicles. It is difficult to detect defects in CPS models due to the complexities involved in the software and physical systems. To find defects in CPS models efficiently, robustness guided falsification of CPS is introduced. Existing methods use several optimization techniques to generate counterexamples, which falsify the given properties of a CPS. However, those methods may require a large number of simulation runs to find the counterexample and are far from practical. In this work, we explore state-of-the-art Deep Reinforcement Learning (DRL) techniques to reduce the number of simulation runs required to find such counterexamples. We report our method and the preliminary evaluation results.

Many problems in the design and analysis of cyber-physical systems (CPS) reduce to the following optimization problem: given a CPS which transforms continuous-time input traces in R^m to continuous-time output traces in R^n and a cost function over output traces, find an input trace which minimizes the cost. Cyber-physical systems are typically so complex that solving the optimization problem analytically by examining the system dynamics is not feasible. We consider a black-box approach, where the optimization is performed by testing the input-output behaviour of the CPS.
We provide a unified, tool-supported methodology for CPS testing and optimization. Our tool is the first CPS testing tool that supports Bayesian optimization. It is also the first to employ fully automated dimensionality reduction techniques. We demonstrate the potential of our tool by running experiments on multiple industrial case studies. We compare the effectiveness of Bayesian optimization to state-of-the-art testing techniques based on CMA-ES and Simulated Annealing.

Studying transient properties of nonlinear systems is an important problem for safety applications. Computationally, it is a very challenging problem to verify that a nonlinear system satisfies a safety specification. Therefore, in many cases, engineers try to solve a related problem, i.e., they try to find a system behavior that does not satisfy a given specification. This problem is called specification falsification. Optimization has been shown to be very effective in providing a practical solution to the falsification problem. In this paper, we provide effective and practical local and global optimization strategies to falsify a smooth nonlinear system of arbitrary complexity.

We present a model-based falsification scheme for artificial pancreas controllers. Our approach performs a closed-loop simulation of the control software using models of the human insulin-glucose regulatory system. Our work focuses on testing properties of an overnight control system for hypoglycemia/hyperglycemia minimization in patients with type-1 diabetes. This control system is currently the subject of extensive phase II clinical trials.
We describe how the overall closed loop simulator is constructed, and formulate properties to be tested. Significantly, the closed loop simulation incorporates the control software, as is, without any abstractions. Next, we demonstrate the use of a simulation-based falsification approach to find potential property violations in the resulting control system. We formulate a series of properties about the controller behavior and examine the violations obtained. Using these violations, we propose modifications to the controller software to improve its performance under these adverse (corner-case) scenarios. We also illustrate the effectiveness of robustness as a metric for identifying interesting property violations. Finally, we identify important open problems for future work.

Cyber-physical systems (CPS), such as automotive systems, are starting to include sophisticated machine learning (ML) components. Their correctness, therefore, depends on properties of the inner ML modules. While learning algorithms aim to generalize from examples, they are only as good as the examples provided, and recent efforts have shown that they can produce inconsistent output under small adversarial perturbations. This raises the question: can the output from learning components lead to a failure of the entire CPS? In this work, we address this question by formulating it as a problem of falsifying signal temporal logic (STL) specifications for CPS with ML components. We propose a compositional falsification framework where a temporal logic falsifier and a machine learning analyzer cooperate with the aim of finding falsifying executions of the considered model. The efficacy of the proposed technique is shown on an automatic emergency braking system model with a perception component based on deep neural networks.

We propose a framework to solve falsification problems of conditional safety properties—specifications such that “a safety property \(\varphi _{\mathsf {safe}}\) holds whenever an antecedent condition \(\varphi _{\mathsf {cond}}\) holds.” In outline, our framework follows the existing one based on robust semantics and numerical optimization. That is, we search for a counterexample input by iterating the following procedure: (1) pick an input; (2) test how robustly the specification is satisfied under the current input; and (3) pick a new input again, hopefully with a smaller robustness. In falsification of conditional safety properties, one of the problems of the existing algorithm is the following: we sometimes iteratively pick inputs that do not satisfy the antecedent condition \(\varphi _{\mathsf {cond}}\), and the corresponding tests become less informative. To overcome this problem, we employ Gaussian process regression—one of the model estimation techniques—and estimate the region of the input search space in which the antecedent condition \(\varphi _{\mathsf {cond}}\) holds with high probability.

This article presents a general class of associative reinforcement learning algorithms for connectionist networks containing stochastic units. These algorithms, called REINFORCE algorithms, are shown to make weight adjustments in a direction that lies along the gradient of expected reinforcement in both immediate-reinforcement tasks and certain limited forms of delayed-reinforcement tasks, and they do this without explicitly computing gradient estimates or even storing information from which such estimates could be computed. Specific examples of such algorithms are presented, some of which bear a close relationship to certain existing algorithms while others are novel but potentially interesting in their own right. Also given are results that show how such algorithms can be naturally integrated with backpropagation. We close with a brief discussion of a number of additional issues surrounding the use of such algorithms, including what is known about their limiting behaviors as well as further considerations that might be used to help develop similar but potentially more powerful reinforcement learning algorithms.
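
A tiny REINFORCE instance makes the gradient-following claim concrete: a softmax policy over two arms adjusts its weights along the gradient of expected reinforcement using sampled actions only, with no explicit gradient of the reward. The arm rewards, learning rate, and step count are invented for illustration.

```python
import math, random

def softmax(w):
    e = [math.exp(x) for x in w]
    s = sum(e)
    return [x / s for x in e]

def reinforce(steps=5000, lr=0.1, seed=1):
    rng = random.Random(seed)
    w = [0.0, 0.0]
    rewards = [0.2, 0.8]  # arm 1 pays more
    for _ in range(steps):
        p = softmax(w)
        a = 0 if rng.random() < p[0] else 1   # sample an action
        r = rewards[a]
        # REINFORCE update: grad of log pi is (indicator - p).
        for i in range(2):
            w[i] += lr * r * ((1.0 if i == a else 0.0) - p[i])
    return softmax(w)

p = reinforce()
print(p[1] > p[0])  # True: the policy concentrates on the better arm
```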

Techniques for testing cyberphysical systems (CPS) currently use a combination of automatic directed test generation and random testing to find undesirable behaviors. Existing techniques can fail to efficiently identify bugs because they do not adequately explore the space of system behaviors. In this paper, we present an approach that uses the rapidly exploring random trees (RRT) technique to explore the state-space of a CPS. Given a Signal Temporal Logic (STL) requirement, the RRT algorithm uses two quantities to guide the search: The first is a robustness metric that quantifies the degree of satisfaction of the STL requirement by simulation traces. The second is a metric for measuring coverage for a dense state-space, known as the star discrepancy measure. We show that our approach scales to industrial-scale CPSs by demonstrating its efficacy on an automotive powertrain control system.

In this paper, we describe a formal framework for conformance testing of continuous and hybrid systems, using the international standard 'Formal Methods in Conformance Testing' (FMCT). We propose a novel test coverage measure for these systems, which is defined using the star discrepancy notion. This coverage measure is used to quantify the validation 'completeness'. It is also used to guide input stimulus generation by identifying the portions of the system behaviors that are not adequately examined. We then propose a test generation method, which is based on a robotic motion planning algorithm and is guided by the coverage measure. This method was implemented in a prototype tool that can handle high dimensional systems (up to 100 dimensions).
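
Star discrepancy itself is expensive to compute exactly; as a cheap illustrative stand-in, the sketch below counts how many cells of a uniform grid a point set has visited, capturing the same intuition of quantifying how evenly samples cover the space. The grid resolution and point sets are invented for illustration.

```python
def grid_coverage(points, cells=4):
    """points: list of (x, y) in [0,1)^2.  Returns the fraction of
    the cells x cells grid that contains at least one point."""
    hit = {(int(x * cells), int(y * cells)) for x, y in points}
    return len(hit) / (cells * cells)

# A clustered set covers one cell; an evenly spread set covers all.
clustered = [(0.01 * i, 0.01 * i) for i in range(10)]
spread = [(i / 4 + 0.1, j / 4 + 0.1) for i in range(4) for j in range(4)]
print(grid_coverage(clustered) < grid_coverage(spread))  # True
```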

In this paper, we consider the robust interpretation of Metric Temporal Logic (MTL) formulas over signals that take values in metric spaces. For such signals, which are generated by systems whose states are equipped with non-trivial metrics, for example continuous or hybrid, robustness is not only natural, but also a critical measure of system performance. Thus, we propose multi-valued semantics for MTL formulas, which capture not only the usual Boolean satisfiability of the formula, but also topological information regarding the distance, ε, from unsatisfiability. We prove that any other signal that remains ε-close to the initial one also satisfies the same MTL specification under the usual Boolean semantics. Finally, our framework is applied to the problem of testing formulas of two fragments of MTL, namely Metric Interval Temporal Logic (MITL) and closed Metric Temporal Logic (clMTL), over continuous-time signals using only discrete-time analysis. The motivating idea behind our approach is that if the continuous-time signal fulfills certain conditions and the discrete-time signal robustly satisfies the temporal logic specification, then the corresponding continuous-time signal should also satisfy the same temporal logic specification.

We describe Breach, a Matlab/C++ toolbox providing a coherent set of simulation-based techniques aimed at the analysis of deterministic models of hybrid dynamical systems. The primary feature of Breach is to facilitate the computation and the property investigation of large sets of trajectories. It relies on an efficient numerical solver of ordinary differential equations that can also provide information about sensitivity with respect to parameter variation. The latter is used to perform approximate reachability analysis and parameter synthesis. A major novel feature is the robust monitoring of metric interval temporal logic (MITL) formulas. The application domain of Breach ranges from embedded systems design to the analysis of complex non-linear models from systems biology.

This paper is motivated by the need for a formal specification method for real-time systems. In these systems, quantitative temporal properties play a dominant role. We first characterize real-time systems by giving a classification of such quantitative temporal properties. Next, we extend the usual models for temporal logic by including a distance function to measure time and analyze what restrictions should be imposed on such a function. Then we introduce appropriate temporal operators to reason about such models by turning qualitative temporal operators into (quantitative) metric temporal operators and show how the usual quantitative temporal properties of real-time systems can be expressed in this metric temporal logic. After we illustrate the application of metric temporal logic to real-time systems by several examples, we end this paper with some conclusions.

This paper puts forward two useful methods for self-adaptation of the mutation distribution - the concepts of derandomization and cumulation. Principal shortcomings of the concept of mutative strategy parameter control and two levels of derandomization are reviewed. Basic demands on the self-adaptation of arbitrary (normal) mutation distributions are developed. Applying arbitrary, normal mutation distributions is equivalent to applying a general, linear problem encoding.
The underlying objective of mutative strategy parameter control is roughly to favor previously selected mutation steps in the future. If this objective is pursued rigorously, a completely derandomized self-adaptation scheme results, which adapts arbitrary normal mutation distributions. This scheme, called covariance matrix adaptation (CMA), meets the previously stated demands. It can still be considerably improved by cumulation - utilizing an evolution path rather than single search steps.
Simulations on various test functions reveal local and global search properties of the evolution strategy with and without covariance matrix adaptation. Their performances are comparable only on perfectly scaled functions. On badly scaled, non-separable functions usually a speed up factor of several orders of magnitude is observed. On moderately mis-scaled functions a speed up factor of three to ten can be expected.
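
The idea of adapting the mutation distribution from observed success can be illustrated, in a deliberately simplified form, by a (1+1)-ES with a 1/5th-success-style step-size rule. This is a much weaker relative of the CMA scheme described above (it adapts only a scalar step size, not a covariance matrix); the test function and adaptation factors are invented for illustration.

```python
import random

def sphere(x):
    return sum(v * v for v in x)

def one_plus_one_es(x, sigma=1.0, iters=3000, seed=0):
    rng = random.Random(seed)
    fx = sphere(x)
    for _ in range(iters):
        y = [v + sigma * rng.gauss(0.0, 1.0) for v in x]
        fy = sphere(y)
        if fy < fx:
            x, fx = y, fy
            sigma *= 1.1   # successful mutation: widen the search
        else:
            sigma *= 0.97  # failure: contract toward the parent
    return fx

print(one_plus_one_es([5.0, -3.0]) < 1e-2)  # True
```

The factors 1.1 and 0.97 are chosen so that the step size is roughly stationary at about a 1/5 success rate; on badly scaled or non-separable functions this scalar rule is exactly where CMA's full covariance adaptation pays off.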

B. Hoxha, H. Abbas, and G. E. Fainekos. Benchmarks for temporal logic requirements for automotive systems.