Chapter

Verification of LSTM Neural Networks with Non-linear Activation Functions

Authors:
To read the full-text of this research, you can request a copy directly from the authors.

Abstract

Recurrent neural networks are increasingly employed in safety-critical applications, such as control in cyber-physical systems, and therefore their verification is crucial for guaranteeing reliability and correctness. We present a novel approach for verifying the dynamic behavior of Long short-term memory networks (LSTMs), a popular type of recurrent neural network (RNN). Our approach employs the satisfiability modulo theories (SMT) solver iSAT solving complex Boolean combinations of linear and non-linear constraint formulas (including transcendental functions), and it therefore is able to verify safety properties of these networks.KeywordsFormal verificationRecurrent neural networksLSTMSMT solvingiSAT

No full-text available

Request Full-text Paper PDF

To read the full-text of this research,
you can request a copy directly from the authors.

ResearchGate has not been able to resolve any citations for this publication.
Article
Full-text available
Reactive synthesis is the task of automatically deriving a correct implementation from a specification. It is a promising technique for the development of verified programs and hardware. Despite recent advances in terms of algorithms and tools, however, reactive synthesis is still not practical when the specified systems reach a certain bound in size and complexity. In this paper, we present a sound and complete modular synthesis algorithm that automatically decomposes the specification into smaller subspecifications. For them, independent synthesis tasks are performed, significantly reducing the complexity of the individual tasks. Our decomposition algorithm guarantees that the subspecifications are independent in the sense that completely separate synthesis tasks can be performed for them. Moreover, the composition of the resulting implementations is guaranteed to satisfy the original specification. Our algorithm is a preprocessing technique that can be applied to a wide range of synthesis tools. We evaluate our approach with state-of-the-art synthesis tools on established benchmarks: the runtime decreases significantly when synthesizing implementations modularly.
Article
Full-text available
Pedestrian trajectory prediction is one of the main concerns of computer vision problems in the automotive industry, especially in the field of advanced driver assistance systems. The ability to anticipate the next movements of pedestrians on the street is a key task in many areas, e.g., self-driving auto vehicles, mobile robots or advanced surveillance systems, and they still represent a technological challenge. The performance of state-of-the-art pedestrian trajectory prediction methods currently benefits from the advancements in sensors and associated signal processing technologies. The current paper reviews the most recent deep learning-based solutions for the problem of pedestrian trajectory prediction along with employed sensors and afferent processing methodologies, and it performs an overview of the available datasets, performance metrics used in the evaluation process, and practical applications. Finally, the current work exposes the research gaps from the literature and outlines potential new research directions.
Chapter
Full-text available
Successfully attaining consensus in the absence of a centralized coordinator is a fundamental problem in distributed multi-agent systems. We analyze progress in the Synod consensus protocol—which does not assume a unique leader—under the assumptions of asynchronous communication and potential agent failures. We identify a set of sufficient conditions under which it is possible to guarantee that a set of agents will eventually attain consensus. First, a subset of the agents must behave correctly and not permanently fail until consensus is reached, and second, at least one proposal must be eventually uninterrupted by higher-numbered proposals. To formally reason about agent failures, we introduce a failure-aware actor model (FAM). Using FAM, we model the identified conditions and provide a formal proof of eventual progress in Synod. Our proof has been mechanically verified using the Athena proof assistant and, to the best of our knowledge, it is the first machine-checked proof of eventual progress in Synod.
Article
Full-text available
Spacecraft collision avoidance procedures have become an essential part of satellite operations. Complex and constantly updated estimates of the collision risk between orbiting objects inform various operators who can then plan risk mitigation measures. Such measures can be aided by the development of suitable machine learning (ML) models that predict, for example, the evolution of the collision risk over time. In October 2019, in an attempt to study this opportunity, the European Space Agency released a large curated dataset containing information about close approach events in the form of conjunction data messages (CDMs), which was collected from 2015 to 2019. This dataset was used in the Spacecraft Collision Avoidance Challenge, which was an ML competition where participants had to build models to predict the final collision risk between orbiting objects. This paper describes the design and results of the competition and discusses the challenges and lessons learned when applying ML methods to this problem domain.
Chapter
Full-text available
Lifted ( family-based ) static analysis by abstract interpretation is capable of analyzing all variants of a program family simultaneously, in a single run without generating any of the variants explicitly. The elements of the underlying lifted analysis domain are tuples, which maintain one property per variant. Still, explicit property enumeration in tuples, one by one for all variants, immediately yields combinatorial explosion. This is particularly apparent in the case of program families that, apart from Boolean features, contain also numerical features with large domains, thus giving rise to astronomical configuration spaces. The key for an efficient lifted analysis is a proper handling of variability-specific constructs of the language (e.g., feature-based runtime tests and #if\texttt {\#if} # if directives). In this work, we introduce a new symbolic representation of the lifted abstract domain that can efficiently analyze program families with numerical features. This makes sharing between property elements corresponding to different variants explicitly possible. The elements of the new lifted domain are constraint-based decision trees , where decision nodes are labeled with linear constraints defined over numerical features and the leaf nodes belong to an existing single-program analysis domain. To illustrate the potential of this representation, we have implemented an experimental lifted static analyzer, called SPLNum 2^2 2 Analyzer , for inferring invariants of C programs. An empirical evaluation on BusyBox and on benchmarks from SV-COMP yields promising preliminary results indicating that our decision trees-based approach is effective and outperforms the baseline tuple-based approach.
Conference Paper
Full-text available
In autonomous air-traffic management scenarios of the future, manned and unmanned aircraft will be able to safely navigate through the National Airspace System, independent of centralized air-traffic controllers, by sharing critical data necessary for maintaining standard separation with each other. Under such conditions, every aircraft must have sufficient knowledge about other aircraft sharing the airspace to operate safely. In this paper, we specify such a state of knowledge and present a formally verified distributed knowledge propagation protocol, which guarantees that this state will eventually be attained, leading to heightened collaborative situational awareness among the aircraft. We use the TLA+^+ Specification Language to specify our protocol and some safety-critical correctness properties. We also provide mechanically-verified proofs of the correctness properties, under a set of suitable operating conditions of the system, by using the TLA+ Proof System.
Conference Paper
Full-text available
SPARK Ada's support for proofs of correctness make the programming language ideal for implementing a PVS specification. Algorithmically implementing a PVS specification in SPARK Ada allows users to maintain the rigor of PVS in executable code. The goal of such an implementation is to maintain the validity of the proofs showing the specification implements formal requirements specified in PVS as theorems. This then shows the implementation also satisfies those formal requirements. We synthesized portions of NASA's DAIDALUS (Detect and AvoID Alerting Logic for Unmanned Systems) PVS specification into SPARK Ada. To provide confidence in the correspondence between the PVS specification and the SPARK Ada implementation, we designed a formal synthesis process. This process, while currently manual, allows us to have increased confidence that the properties proven to hold for the specification will continue to hold for the implementation.
Chapter
Full-text available
Robonaut2 (R2) is a humanoid robot onboard the International Space Station (ISS), performing specialized tasks in collaboration with astronauts. After deployment, R2 developed an unexpected emergent behavior. R2’s inability to distinguish between knee-joint faults (e.g., due to sensor drift versus violated environmental assumptions) began triggering safety-preserving freezes-in-place in the confined space of the ISS, preventing further motion until a ground-control operator determines the root-cause and initiates proper corrective action. Runtime verification (RV) algorithms can efficiently disambiguate the temporal signatures of different faults in real-time. However, no previous RV engine can operate within the limited available resources and specialized platform constraints of R2’s hardware architecture. An attempt to deploy the only runtime verification engine designed for embedded flight systems, R2U2, failed due to resource constraints. We present a significant redesign of the core R2U2 algorithms to adapt to severe resource and certification constraints and prove their correctness. We further define an optimization enabled by our new algorithms and implement the new version of R2U2. We encode specifications describing real-life faults occurring onboard Robonaut2 using Mission-time Linear Temporal Logic (MLTL) and detail our process of specification debugging, validation, and refinement. We deployed this new version of R2U2 on Robonaut2, demonstrating successful real-time fault disambiguation and mitigation triggering of R2’s knee-joint faults without false positives.
Chapter
Full-text available
The Curiosity rover is one of the most complex systems successfully deployed in a planetary exploration mission to date. It was sent by NASA to explore the surface of Mars and to identify potential signs of life. Even though it has limited autonomy on-board, most of its decisions are made by the ground control team. This hinders the speed at which the Curiosity reacts to its environment, due to the communication delays between Earth and Mars. Depending on the orbital position of both planets, it can take 4–24 min for a message to be transmitted between Earth and Mars. If the Curiosity were controlled autonomously, it would be able to perform its activities much faster and more flexibly. However, one of the major barriers to increased use of autonomy in such scenarios is the lack of assurances that the autonomous behaviour will work as expected. In this paper, we use a Robot Operating System (ROS) model of the Curiosity that is simulated in Gazebo and add an autonomous agent that is responsible for high-level decision-making. Then, we use a mixture of formal and non-formal techniques to verify the distinct system components (ROS nodes). This use of heterogeneous verification techniques is essential to provide guarantees about the nodes at different abstraction levels, and allows us to bring together relevant verification evidence to provide overall assurance.
Chapter
Full-text available
Neural networks provide quick approximations to complex functions, and have been increasingly used in perception as well as control tasks. For use in mission-critical and safety-critical applications, however, it is important to be able to analyze what a neural network can and cannot do. For feed-forward neural networks with ReLU activation functions, although exact analysis is NP-complete, recently-proposed verification methods can sometimes succeed.
Conference Paper
Full-text available
We present a novel conflict-aware flight planning approach that avoids the possibility of near mid-air collisions (NMACs) in the flight planning stage. Our algorithm computes a valid flight-plan for an aircraft (ownship) based on a starting time, a set of discrete way-points in 3D space, discrete values of ground speed, and a set of available flight-plans for traffic aircraft. A valid solution is one that avoids loss of standard separation with available traffic flight-plans. Solutions are restricted to permutations of constant ground speed and constant vertical speed for the ownship between consecutive waypoints. Since the course between two consecutive way-points is not changed, this strategy can be used in situations where vertical or lateral constraints due to terrain or weather may restrict deviations from the original flight-plan. This makes our approach particularly suitable for unmanned aerial systems (UAS) integration into urban air traffic management airspace. Our approach has been formally verified using the Athena proof assistant. Our work, therefore, complements the state-of-the-art pairwise tactical conflict resolution approaches by enabling an ownship to generate strategic flight-plans that ensure standard separation with multiple traffic aircraft, while conforming to possible restrictions on deviation from its flight path.
Article
Full-text available
A Boolean logic driven Markov process (BDMP) is a dependability analysis model that defines a continuous-time Markov chain (CTMC). This formalism has high expressive power, yet it remains readable because its graphical representation stays close to standard fault trees. The size of a BDMP is roughly speaking proportional to the size of the system it models, whereas the size of the CTMC specified by this BDMP suffers from exponential growth. Thus quantifying large BDMPs can be a challenging task. The most general method to quantify them is Monte Carlo simulation, but this may be intractable for highly reliable systems. On the other hand, some subcategories of BDMPs can be processed with much more efficient methods. For example, BDMPs without repairs can be translated into dynamic fault trees, a formalism accepted as an input of the STORM model checker, that performs numerical calculations on sparse matrices, or they can be processed with the tool FIGSEQ that explores paths going to a failure state and calculates their probabilities. BDMPs with repairs can be quantified by FIGSEQ (BDMPs capturing quickly and completely repairable behaviors are solved by a different algorithm), and by the I&AB (Initiator and All Barriers) method, recently published and implemented in a prototype version of RISKSPECTRUM PSA. This tool, based exclusively on Boolean representations looks for and quantifies minimal cut sets of the system, i.e., minimal combinations of component failures that induce the loss of the system. This allows a quick quantification of large models with repairable components, standby redundancies and some other types of dependencies between omponents. All these quantification methods have been tried on a benchmark whose definition was published at the MARS 2017 workshop: the model of emergency power supplies of a nuclear power plant. In this paper, after a recall of the theoretical principles of the various quantification methods, we compare their performances on that benchmark.
Conference Paper
Full-text available
Software engineers working in industry seldom try to apply formal methods to solve problems. There are various reasons for this. Sometimes these reasons are understandable---the cost of using formal methods does not make economic sense in many contexts. However, formal methods are also often greeted with scepticism. Formal methods are assumed to take too much time, require tools that are too academic, or to be too mathematical to be understood by practice-oriented software engineers. We tested these assumptions by designing a small course around a framework for program verification, aimed at regular computer science students enrolled in a Master's programme. After four lectures and associated exercises, students were given a small verification task where they had to model and verify a real, non-trivial, C function in Why3. A significant majority of students managed to prove a non-trivial functional specification of this C function in the time allotted, and many also pointed out inherent flaws of this function discovered during formalization. Participants reported no major difficulties or mental hurdles in learning Why3, and considered its approach to be appropriate for selected components of safety-critical software. While formal verification tools such as Why3 still have lots of room for improvement, this experience shows that in a short amount of time, software engineers can be taught to use a program verification tool, and obtain usable results without being fully proficient in it. We further recommend that courses on formal methods should also let students explore these as techniques to be applied, instead of only focusing on the theory behind them, as we expect this to gradually lower the barrier to wider acceptance.
Chapter
Full-text available
Runtime enforcement refers to the theories, techniques, and tools for enforcing correct behavior of systems at runtime. We are interested in such behaviors described by specifications that feature timing constraints formalized in what is generally referred to as timed properties. This tutorial presents a gentle introduction to runtime enforcement (of timed properties). First, we present a taxonomy of the main principles and concepts involved in runtime enforcement. Then, we give a brief overview of a line of research on theoretical runtime enforcement where timed properties are described by timed automata and feature uncontrollable events. Then, we mention some tools capable of runtime enforcement, and we present the TiPEX tool dedicated to timed properties. Finally, we present some open challenges and avenues for future work.
Chapter
Full-text available
Hybrid system falsification is an actively studied topic, as a scalable quality assurance methodology for real-world cyber-physical systems. In falsification, one employs stochastic hill-climbing optimization to quickly find a counterexample input to a black-box system model. Quantitative robust semantics is the technical key that enables use of such optimization. In this paper, we tackle the so-called scale problem regarding Boolean connectives that is widely recognized in the community: quantities of different scales (such as speed [km/h] vs. rpm, or worse, rph) can mask each other’s contribution to robustness. Our solution consists of integration of the multi-armed bandit algorithms in hill climbing-guided falsification frameworks, with a technical novelty of a new reward notion that we call hill-climbing gain. Our experiments show our approach’s robustness under the change of scales, and that it outperforms a state-of-the-art falsification tool.
Chapter
Full-text available
Verification of fault-tolerant distributed protocols is an immensely difficult task. Often, in these protocols, thresholds on set cardinalities are used both in the process code and in its correctness proof, e.g., a process can perform an action only if it has received an acknowledgment from at least half of its peers. Verification of threshold-based protocols is extremely challenging as it involves two kinds of reasoning: first-order reasoning about the unbounded state of the protocol, together with reasoning about sets and cardinalities. In this work, we develop a new methodology for decomposing the verification task of such protocols into two decidable logics: EPR and BAPA. Our key insight is that such protocols use thresholds in a restricted way as a means to obtain certain properties of “intersection” between sets. We define a language for expressing such properties, and present two translations: to EPR and BAPA. The EPR translation allows verifying the protocol while assuming these properties, and the BAPA translation allows verifying the correctness of the properties. We further develop an algorithm for automatically generating the properties needed for verifying a given protocol, facilitating fully automated deductive verification. Using this technique we have verified several challenging protocols, including Byzantine one-step consensus, hybrid reliable broadcast and fast Byzantine Paxos.
Chapter
Full-text available
A controller is a device that interacts with a plant. At each time point, it reads the plant’s state and issues commands with the goal that the plant operates optimally. Constructing optimal controllers is a fundamental and challenging problem. Machine learning techniques have recently been successfully applied to train controllers, yet they have limitations. Learned controllers are monolithic and hard to reason about. In particular, it is difficult to add features without retraining, to guarantee any level of performance, and to achieve acceptable performance when encountering untrained scenarios. These limitations can be addressed by deploying quantitative run-time shields that serve as a proxy for the controller. At each time point, the shield reads the command issued by the controller and may choose to alter it before passing it on to the plant. We show how optimal shields that interfere as little as possible while guaranteeing a desired level of controller performance, can be generated systematically and automatically using reactive synthesis. First, we abstract the plant by building a stochastic model. Second, we consider the learned controller to be a black box. Third, we measure controller performance and shield interference by two quantitative run-time measures that are formally defined using weighted automata. Then, the problem of constructing a shield that guarantees maximal performance with minimal interference is the problem of finding an optimal strategy in a stochastic 2-player game “controller versus shield” played on the abstract state space of the plant with a quantitative objective obtained from combining the performance and interference measures. We illustrate the effectiveness of our approach by automatically constructing lightweight shields for learned traffic-light controllers in various road networks. The shields we generate avoid liveness bugs, improve controller performance in untrained and changing traffic situations, and add features to learned controllers, such as giving priority to emergency vehicles .
Chapter
Full-text available
Mission-time LTL (MLTL) is a bounded variant of MTL over naturals designed to generically specify requirements for mission-based system operation common to aircraft, spacecraft, vehicles, and robots. Despite the utility of MLTL as a specification logic, major gaps remain in analyzing MLTL, e.g., for specification debugging or model checking, centering on the absence of any complete MLTL satisfiability checker. We prove that the MLTL satisfiability checking problem is NEXPTIME-complete and that satisfiability checking Open image in new window , the variant of MLTL where all intervals start at 0, is PSPACE-complete. We introduce translations for MLTL-to-LTL, Open image in new window , MLTL-to-SMV, and MLTL-to-SMT, creating four options for MLTL satisfiability checking. Our extensive experimental evaluation shows that the MLTL-to-SMT transition with the Z3 SMT solver offers the most scalable performance.
Article
Full-text available
The correct representation of the relevant properties of a system is an essential requirement for the effective use and wide adoption of model-based practices in industry. Uncertainty is one of the inherent properties of any measurement or estimation that is obtained in any physical setting; as such, it must be considered when modeling software systems deal with real data. Although a few modeling languages enable the representation of measurement uncertainty, these aspects are not normally incorporated into their type systems. Therefore, operating with uncertain values and propagating their uncertainty become cumbersome processes, which hinder their realization in real environments. This paper proposes an extension of OCL/UML primitive datatypes that enables the representation of the uncertainty that comes from physical measurements or user estimates into the models, together with an algebra of operations that are defined for the values of these types.
Article
Full-text available
Hyperproperties, such as non-interference and observational determinism, relate multiple system executions to each other. They are not expressible in standard temporal logics, like LTL, CTL, and CTL*, and thus cannot be monitored with standard runtime verification techniques. HyperLTL extends linear-time temporal logic (LTL) with explicit quantification over traces in order to express hyperproperties. We investigate the runtime verification problem of HyperLTL formulas for three different input models: (1) The parallel model, where a fixed number of system executions is processed in parallel. (2) The unbounded sequential model, where system executions are processed sequentially, one execution at a time. In this model, the number of incoming executions is a-priori unbounded and may in fact grow forever. (3) The bounded sequential model where the traces are processed sequentially and the number of incoming executions is bounded. We show that the existence of a bound in the parallel and bounded sequential models leads to a different notion of monitorability than in the unbounded sequential model. We show that deciding the monitoriability problem for alternation-free HyperLTL is PS P A C E -complete while the problem is undecidable in general. For every input model, we provide monitoring algorithms along with run-time and storage optimizations. By recognizing properties of specifications such as reflexivity, symmetry, and transitivity, we reduce the number of comparisons between traces. For the sequential models, we present a technique that minimizes the number of traces that need to be stored. We evaluate our optimizations, showing that this leads to a more scalable monitoring and, in particular, a significantly lower memory consumption.
Article
Rational Chebyshev approimations have been computed for the Fresnel integrals C ( x ) C(x) and S ( x ) S(x) for arguments in the intervals [ 0. , 1.2 ] [0.,1.2] and [ 1.2 , 1.6 ] [1.2,1.6] , and for the related functions f ( x ) f(x) and g ( x ) g(x) for the intervals [ 1.6 , 1.9 ] [1.6,1.9] , [ 1.9 , 2.4 ] [1.9,2.4] and [ 2.4 , ∞ ] [2.4,\infty ] . Maximal relative errors range down to 2 × 10 − 19 2 \times {10^{ - 19}} .
Chapter
Recurrent neural networks (RNNs) such as Long Short Term Memory (LSTM) networks have become popular in a variety of applications such as image processing, data classification, speech recognition, and as controllers in autonomous systems. In practical settings, there is often a need to deploy such RNNs on resource-constrained platforms such as mobile phones or embedded devices. As the memory footprint and energy consumption of such components become a bottleneck, there is interest in compressing and optimizing such networks using a range of heuristic techniques. However, these techniques do not guarantee the safety of the optimized network, e.g., against adversarial inputs, or equivalence of the optimized and original networks. To address this problem, we propose DiffRNN, the first differential verification method for RNNs to certify the equivalence of two structurally similar neural networks. Existing work on differential verification for ReLU-based feed-forward neural networks does not apply to RNNs where nonlinear activation functions such as Sigmoid and Tanh cannot be avoided. RNNs also pose unique challenges such as handling sequential inputs, complex feedback structures, and interactions between the gates and states. In DiffRNN, we overcome these challenges by bounding nonlinear activation functions with linear constraints and then solving constrained optimization problems to compute tight bounding boxes on non-linear surfaces in a high-dimensional space. The soundness of these bounding boxes is then proved using the dReal SMT solver. We demonstrate the practical efficacy of our technique on a variety of benchmarks and show that DiffRNN outperforms state-of-the-art RNN verification tools such as Popqorn.
Article
Urban air mobility (UAM) refers to air transportation services within an urban area. Existing ATM methods for unmanned aerial systems do not provide guarantees and will not scale to the projected traffic densities for UAM. We provide a decentralized, hierarchical approach for ATM that allows for scalability to high traffic densities as well as providing theoretical guarantees of correctness with respect to linear temporal logic specifications. Our main contributions are two-fold. First, we propose a novel ATM architecture that divides control authority between vertihubs that are each in charge of all vehicles in their local airspace. Each vertihub also contains a number of vertiports that are in charge of vehicle takeoffs and landings. The resulting architecture is decentralized and hierarchical, which enables scalability and robustness in the event of any individual vertihub or vertiport no longer being operational. Second, we provide a contract-based correct-by-construction reactive synthesis approach that provably guarantees safety with respect to user-provided safety specifications in linear temporal logic. We demonstrate the approach on realistic large-v
Chapter
Reinforcement learning algorithms discover policies that maximize reward. However, these policies generally do not adhere to safety, leaving safety in reinforcement learning (and in artificial intelligence in general) an open research problem. Shield synthesis is a formal approach to synthesize a correct-by-construction reactive system called a shield that enforces safety properties of a running system while interfering with its operation as little as possible. A shield attached to a learning agent guarantees safety during learning and execution phases. In this paper we summarize three types of shields that are synthesized from different specification languages, and discuss their applicability to reinforcement learning. First, we discuss deterministic shields that enforce specifications expressed as linear temporal logic specifications. Second, we discuss the synthesis of probabilistic shields from specifications in probabilistic temporal logic. Third, we discuss how to synthesize timed shields from timed automata specifications. This paper summarizes the application areas, advantages, disadvantages and synthesis approaches for the three types of shields and gives an overview of experimental results.
Article
Distributed reactive synthesis is the problem of algorithmically constructing controllers of distributed, communicating systems so that each closed-loop system satisfies a given temporal specification. We present an algorithm, called negotiation , for sound (but necessarily incomplete) distributed reactive synthesis based on assume–guarantee decompositions. The negotiation algorithm iteratively constructs assumptions and guarantees for each system. In each iteration, each system attempts to fulfill its specification and its guarantee (from the previous round), under the current assumption on the other systems, by solving a reactive synthesis problem. If the specification is not realizable, the algorithm computes a sufficient assumption on the other systems that ensures it can realize the specification and guarantee. This additional assumption further constrains the behavior of other systems and they might require an additional assumption, leading to the next round in the negotiation. The process terminates when a compatible assumption–guarantee pair is found for each system, which is sufficient to also satisfy the specification of each system. We have built a tool called Agnes that implements this algorithm. Using Agnes, we empirically demonstrate the effectiveness of our proposed algorithm on two case studies.
Conference Paper
This work presents formal progress envelopes applied to flight systems for distinctly classifying a system's state space into regions where a formal proof of progress for a distributed algorithm holds or does not hold. It also presents an approach for runtime integration of formal methods in the dynamic data-driven applications systems (DDDAS) architecture using parameterized proofs. Finally, it showcases the development of reusable parameterized proof libraries for high-level statistical and stochastic reasoning in the Athena proof assistant and demonstrates their use with a progress proof for the Paxos distributed consensus protocol.
Chapter
Deep neural networks are revolutionizing the way complex systems are developed. However, these automatically-generated networks are opaque to humans, making it difficult to reason about them and guarantee their correctness. Here, we propose a novel approach for verifying properties of a widespread variant of neural networks, called recurrent neural networks. Recurrent neural networks play a key role in, e.g., speech recognition, and their verification is crucial for guaranteeing the reliability of many critical systems. Our approach is based on the inference of invariants, which allow us to reduce the complex problem of verifying recurrent networks into simpler, non-recurrent problems. Experiments with a proof-of-concept implementation of our approach demonstrate that it performs orders-of-magnitude better than the state of the art.
Chapter
Despite many recent advances, reactive synthesis is still not really a practical technique. The grand challenge is to scale from small transition systems, where synthesis performs well, to complex multi-component designs. Compositional methods, such as the construction of dominant strategies for individual components, reduce the complexity significantly, but are usually not applicable without extensively rewriting the specification. In this paper, we present a refinement of compositional synthesis that does not require such an intervention. Our algorithm decomposes the system into a sequence of components, such that every component has a strategy that is dominant, i.e., performs at least as good as any possible alternative, provided that the preceding components follow their (already synthesized) strategies. The decomposition of the system is based on a dependency analysis, for which we provide semantic and syntactic techniques. We establish the soundness and completeness of the approach and report on encouraging experimental results.
Article
Isabelle/SACM is a tool for automated construction of model-based assurance cases with integrated formal methods, based on the Isabelle proof assistant. Assurance cases show how a system is safe to operate, through a human comprehensible argument demonstrating that the requirements are satisfied, using evidence of various provenances. They are usually required for certification of critical systems, often with evidence that originates from formal methods. Automating assurance cases increases rigour, and helps with maintenance and evolution. In this paper we apply Isabelle/SACM to a fragment of the assurance case for an autonomous underwater vehicle demonstrator. We encode the metric unit system (SI) in Isabelle, to allow modelling requirements and state spaces using physical units. We develop a behavioural model in the graphical RoboChart state machine language, embed the artifacts into Isabelle/SACM, and use it to demonstrate satisfaction of the requirements.
Conference Paper
In this paper, we propose a general approach to derive runtime enforcement implementations for multiagent systems, called shields, from temporal logical specifications. Each agent of the multi-agent system is monitored, and if needed corrected, by the shield, such that a global specification is always satisfied. The different ways of how a shield can interfere with each agent in the system in case of an error introduces the need for quantitative objectives. This work is the first to discuss the shield synthesis problem with quantitative objectives. We provide several cost functions that are utilized in the multi-agent setting and provide methods for the synthesis of cost-optimal shields and fair shields, under the given assumptions on the multi-agent system. We demonstrate the applicability of our approach via a detailed case study on UAV mission planning for warehouse logistics and simulating the shielded multi-agent system on ROS/Gazebo.
Conference Paper
Many software systems are today variational. They can produce a potentially large variety of related programs (variants) by selecting suitable configuration options (features) at compile time. Specialized variability-aware (lifted, family-based) static analyses allow analyzing all variants of the family, simultaneously, in a single run without generating any of the variants explicitly. In effect, they produce precise analysis results for all individual variants. The elements of the lifted analysis domain represent tuples (i.e. disjunction of properties), which maintain one property from an existing single-program analysis domain per variant. Nevertheless, explicit property enumeration in tuples, one by one for all variants, immediately yields to combinatorial explosion given that the number of variants can grow exponentially with the number of features. Therefore, such lifted analyses may be too costly or even infeasible for families with a large number of variants.
Article
Reinforcement Learning (RL) algorithms have found limited success beyond simulated applications, and one main reason is the absence of safety guarantees during the learning process. Real world systems would realistically fail or break before an optimal controller can be learned. To address this issue, we propose a controller architecture that combines (1) a model-free RL-based controller with (2) model-based controllers utilizing control barrier functions (CBFs) and (3) online learning of the unknown system dynamics, in order to ensure safety during learning. Our general framework leverages the success of RL algorithms to learn high-performance controllers, while the CBF-based controllers both guarantee safety and guide the learning process by constraining the set of explorable polices. We utilize Gaussian Processes (GPs) to model the system dynamics and its uncertainties. Our novel controller synthesis algorithm, RL-CBF, guarantees safety with high probability during the learning process, regardless of the RL algorithm used, and demonstrates greater policy exploration efficiency. We test our algorithm on (1) control of an inverted pendulum and (2) autonomous carfollowing with wireless vehicle-to-vehicle communication, and show that our algorithm attains much greater sample efficiency in learning than other state-of-the-art algorithms and maintains safety during the entire learning process.
Article
We introduce agent-environment systems where the agent is stateful and executing a ReLU recurrent neural network. We define and study their verification problem by providing equivalences of recurrent and feed-forward neural networks on bounded execution traces. We give a sound and complete procedure for their verification against properties specified in a simplified version of LTL on bounded executions. We present an implementation and discuss the experimental results obtained.
Article
We present a novel scheme called Decentralized Attestation for Device Swarms (DADS), which is, to the best of our knowledge, the first to accomplish decentralized attestation in device swarms. Device swarms are smart, mobile, and interconnected devices that operate in large numbers and are likely to be part of emerging applications in Cyber-Physical Systems (CPS) and Industrial Internet of Things (IIoTs). Swarm devices process and exchange safety, privacy, and mission-critical information. Thus, it is important to have a good code verification technique that scales to device swarms and establishes trust among collaborating devices. DADS has several advantages over current state-of-the-art swarm attestation techniques: It is decentralized, has no single point of failure, and can handle changing topologies after nodes are compromised. DADS assures system resilience to node compromise/failure while guaranteeing only devices that execute genuine code remain part of the group. We conduct performance measurements of communication, computation, memory, and energy using the TrustLite embedded systems architecture in OMNeT++ simulation environment. We show that the proposed approach can significantly reduce communication cost and is very efficient in terms of computation, memory, and energy requirements. We also analyze security and show that DADS is very effective and robust against various attacks.
Conference Paper
Copland is a domain specific language designed for describing, analyzing and executing attestation protocols. Its formal semantics defines evaluation, sequencing, and dispatch of measurements resulting in evidence describing a system's state. That evidence is in turn appraised to determine if and how an external system will interact with it. The contribution of this work is a description of the first Copland interpreter and the attestation manager built around it. Following an overview of the syntax and formal semantics is a collection of motivating examples. Next is a description of a Haskell-based Copland interpreter and the attestation manager constructed around it. Examples are provided to show the interpreter's interface format. A description of the Copland landscape and future goals closes the presentation.