Article

Precise measurement-based worst-case execution time estimation

Abstract

During the development of real-time systems, the worst-case execution time (WCET) of every task or program in the system must be known in order to show that all timing requirements for the system are fulfilled. The increasing complexity of modern processor architectures makes achieving this objective more and more challenging. The incorporation of performance-enhancing features like caches and speculative execution, as well as the interaction of these components, makes execution times highly dependent on the execution history and the overall state of the system. To determine a bound for the execution time of a program, static methods for timing analysis face the challenge of modeling all possible system states which can occur during the execution of the program. The necessary approximation of potential system states is likely to overestimate the actual WCET considerably. On the other hand, measurement-based timing analysis techniques use a relatively small number of run-time measurements to estimate the worst-case execution time. As it is generally impossible to observe all potential executions of a real-world program, this approach cannot provide any guarantees about the calculated WCET estimate, and the results are often imprecise. This thesis presents a new approach to timing analysis which was designed to overcome the problems of existing methods. By partitioning the analyzed programs into easily traceable segments and by precisely controlling run-time measurements, the new method is able to preserve information about the execution context of measured execution times. After an adequate number of measurements have been taken, this information can be used to precisely estimate the WCET of a program without being overly pessimistic. The method can be seamlessly integrated into frameworks for static program analysis. Thus, results from static analyses can be used to make the estimates more precise and to perform run-time measurements more efficiently.
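The combination step described above can be illustrated with a minimal sketch (not the thesis' actual implementation): per-segment worst-case observations are kept separately for each predecessor context, and the estimate is the longest context-sensitive path through a segment DAG. The trace format and the segment-graph encoding are assumptions made for this example.

    from collections import defaultdict

    # Measured trace events: (segment, predecessor, cycles); the predecessor
    # serves as a coarse execution context for each measured time.
    def collect_max_times(trace_events):
        worst = defaultdict(int)
        for seg, pred, cycles in trace_events:
            worst[(seg, pred)] = max(worst[(seg, pred)], cycles)
        return worst

    def topo_order(successors, entry):
        seen, order = set(), []
        def dfs(n):
            if n in seen:
                return
            seen.add(n)
            for m in successors.get(n, ()):
                dfs(m)
            order.append(n)
        dfs(entry)
        return list(reversed(order))

    # successors: DAG of program segments.  The estimate is a longest path
    # where a segment's weight depends on the predecessor it is entered from.
    def wcet_estimate(successors, worst, entry, exit_seg):
        best = {(entry, None): worst.get((entry, None), 0)}
        for seg in topo_order(successors, entry):
            for (s, pred), t in list(best.items()):
                if s != seg:
                    continue
                for nxt in successors.get(seg, ()):
                    w = worst.get((nxt, seg), 0)
                    best[(nxt, seg)] = max(best.get((nxt, seg), 0), t + w)
        return max(t for (s, _), t in best.items() if s == exit_seg)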

... These tools measure the execution times of many small snippets of the task. The individual execution times are combined into a global WCET estimate, e.g., using integer linear programming [WKRP08, Sta09] or probabilistic methods [BCP02, BBN05]. Despite all efforts to combine snippet execution times safely, these tools suffer from the same lack of soundness as full-blown measurements do; it can therefore not be guaranteed that the given WCET estimates are an upper bound of the actual WCET. ...
Article
Reliable task-level execution time information is indispensable for validating the correct operation of safety-critical embedded real-time systems. Static worst-case execution time (WCET) analysis is a method that computes safe upper bounds of the execution time of single uninterrupted tasks. The method is based on abstract interpretation and involves abstract hardware models that capture the timing behavior of the processor on which the tasks run. For complex processors, task-level execution time bounds are obtained by a state space exploration which involves the abstract model and the program. Partial state space exploration is not sound. A full exploration can become too expensive. Symbolic state space exploration methods using binary decision diagrams (BDDs) are known to provide efficient means for covering large state spaces. This work presents a symbolic method for the efficient state space exploration of abstract pipeline models in the context of static WCET analysis. The method has been implemented for the Infineon TriCore 1 which is a real-life processor of medium complexity. Experimental results on a set of benchmarks and an automotive industry application demonstrate that the approach improves the scalability of static WCET analysis while maintaining soundness.
Conference Paper
We present a fast and accurate timing simulation of binary code execution on complex embedded processors. The underlying block timings are extracted from a preceding hardware execution and differentiated by execution context. In this way, complex factors, such as caches, can be reflected accurately without explicit modeling. Based on the timings observed in one hardware execution, the timing of numerous other executions for different inputs can be simulated at an average error below 5% for complex applications on an ARM Cortex-A9 processor.
Conference Paper
The present paper investigates the influence of the execution history on the precision of measurement-based execution time estimates for embedded software. A new approach to timing analysis is presented which was designed to overcome the problems of existing static and dynamic methods. By partitioning the analyzed programs into easily traceable segments and by precisely controlling run-time measurements with on-chip tracing facilities, the new method is able to preserve information about the execution context of measured execution times. After an adequate number of measurements have been taken, this information can be used to precisely estimate the Worst-Case Execution Time of a program without being overly pessimistic.
Conference Paper
Estimating the worst-case execution time (WCET) of real-time embedded systems is essential for verifying their correct functioning. Traditionally, the WCET of a program is estimated assuming availability of the program's binary, which is disassembled to reconstruct the program, and in some cases its source code, to derive useful high-level execution information. However, in certain scenarios the program's owner requires that the binary of the program not be reverse-engineered to protect intellectual property, and in extreme situations the program's binary is not available for the analysis, in which case it is substituted by program-execution traces. In this paper we show that we can obtain WCET estimates for programs based on runtime-generated or owner-provided time-stamped execution traces, without the need to access the source code or reverse-engineer the binaries of the programs. We show that we can provide very accurate WCET estimations using both integer linear programming (ILP) and constraint logic programming (CLP). Our method generates safe and tight WCET estimations for all the benchmarks used in the evaluation.
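A hedged sketch of the trace-processing front end such a method needs: per-block worst-case observations are derived from a time-stamped execution trace and could then feed an ILP- or CLP-based calculation. The trace format (address, timestamp) and the block-mapping function are invented for illustration.

    # Derive per-block worst-case observations from a time-stamped trace.
    def block_maxima(trace, block_of):
        worst = {}
        prev_block, prev_ts = None, None
        for addr, ts in trace:
            blk = block_of(addr)
            if prev_block is None:
                prev_block, prev_ts = blk, ts
            elif blk != prev_block:                 # entered a new basic block
                dt = ts - prev_ts                   # time spent in prev_block
                worst[prev_block] = max(worst.get(prev_block, 0), dt)
                prev_block, prev_ts = blk, ts
        return worst

    trace = [(0x100, 0), (0x104, 3), (0x200, 10), (0x100, 14)]
    print(block_maxima(trace, lambda a: a & 0xF00))  # {256: 10, 512: 4}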
Book
Full-text available
In this book we shall introduce four of the main approaches to program analysis: Data Flow Analysis, Control Flow Analysis, Abstract Interpretation, and Type and Effect Systems. Each of Chapters 2 to 5 deals with one of these approaches at some length and generally treats the more advanced material in later sections. Throughout the book we aim at stressing the many similarities between what may at a first glance appear to be very unrelated approaches. To help get this idea across, and to serve as a gentle introduction, this chapter treats all of the approaches at the level of examples. The technical details are worked out, but it may be difficult to apply the techniques to related examples until some of the material of later chapters has been studied.
Thesis
Full-text available
When constructing real-time systems, safe and tight estimations of the worst case execution time (WCET) of programs are needed. To obtain tight estimations, a common approach is to do path and timing analyses. Path analysis is responsible for eliminating infeasible paths in the program and timing analysis is responsible for accurately modeling the timing behavior of programs. The focus of this thesis is on analysis of programs running on high-performance microprocessors employing pipelining and caching.

This thesis presents a new method, referred to as cycle-level symbolic execution, that tightly integrates path and timing analysis. An implementation of the method has been used to estimate the WCET for a suite of programs running on a high-performance processor. The results show that by using an integrated analysis, the overestimation is significantly reduced compared to other methods. The method automatically eliminates infeasible paths and derives path information such as loop bounds, and performs accurate timing analysis for a multiple-issue processor with an instruction and data cache.

The thesis also identifies timing anomalies in dynamically scheduled processors. These anomalies can lead to unbounded timing effects when estimating the WCET, which makes it unsafe to use previously presented timing analysis methods. To handle these unbounded timing effects, two methods are proposed. The first method is based on program modifications and the second method relies on using pessimistic timing models. Both methods make it possible to safely use all previously published timing analysis methods even for architectures where timing anomalies can occur.

Finally, the use of data caching is examined. For data caching to be fruitful in real-time systems, data accesses must be predictable when estimating the WCET. Based on a notion of predictable and unpredictable data structures, it is shown how to classify program data structures according to their influence on data cache analysis. For both categories, several examples of frequently used types of data structures are provided. Furthermore, it is shown how to make an efficient data cache analysis even when data structures have an unknown placement in memory. This is important, for example, when analyzing single subroutines of a program.
Conference Paper
Full-text available
In this paper we present a measurement-based worst-case execution time (WCET) analysis method. Exhaustive end-to-end execution-time measurements are computationally intractable in most cases. Therefore, we propose to measure execution times of subparts of the application code and then compose these times into a safe WCET bound. This raises a number of challenges to be solved. First, there is the question of how to define and subsequently calculate adequate subparts. Second, a huge amount of test data is required to enforce the execution of selected paths for the desired run-time measurements. The presented method provides solutions to both problems. In a number of experiments we show the usefulness of the theoretical concepts and the practical feasibility by using current state-of-the-art industrial case studies from project partners.
Conference Paper
Full-text available
Microcontrollers are the core part of automotive Electronic Control Units (ECUs). A significant investment of the ECU manufacturers, and even of their customers, is linked to the specified microcontroller family. To preserve this investment, new generations of the microcontroller must be designed continuously, with hardware and software compatibility but higher system performance and/or lower cost. The challenge for the microcontroller manufacturer is to get the relevant inputs for improving the system performance, since a microcontroller is used by many customers in many different applications. For Infineon's latest TriCore®-based 32-bit microcontroller product line, the required statistical data is gathered by using the trace features of the Emulation Device (ED). Infineon's customers use EDs in their unchanged target system and application environment. With an analytical methodology based on this statistical data, the performance improvements of different SoC architecture and implementation options can be quantified. This allows an objective assessment of improvement options by comparing their performance-cost ratios.
Conference Paper
Full-text available
In this paper we present a new measurement-based worst-case execution time (WCET) analysis method. Exhaustive end-to-end measurements are computationally intractable in most cases. Therefore, we propose to measure execution times of subparts of the application. We use heuristic methods and model checking to generate test data, forcing the execution of selected paths to perform runtime measurements. The measured times are used to calculate the WCET in a final computation step. As we operate at the source code level, our approach is platform independent except for the run-time measurements performed on the target host. We show the feasibility of the required steps and explain our approach by means of a case study.
Conference Paper
Full-text available
A program denotes computations in some universe of objects. Abstract interpretation of programs consists in using that denotation to describe computations in another universe of abstract objects, so that the results of abstract execution give some information on the actual computations. An intuitive example (which we borrow from Sintzoff [72]) is the rule of signs. The text -1515 * 17 may be understood to denote computations on the abstract universe {(+), (-), (±)} where the semantics of arithmetic operators is defined by the rule of signs. The abstract execution -1515 * 17 → -(+) * (+) → (-) * (+) → (-) proves that -1515 * 17 is a negative number. Abstract interpretation is concerned with a particular underlying structure of the usual universe of computations (the sign, in our example). It gives a summary of some facets of the actual executions of a program. In general this summary is simple to obtain but inaccurate (e.g. -1515 + 17 → -(+) + (+) → (-) + (+) → (±)). Despite its fundamentally incomplete results, abstract interpretation allows the programmer or the compiler to answer questions which do not need full knowledge of program executions or which tolerate an imprecise answer (e.g. partial correctness proofs of programs ignoring the termination problems, type checking, program optimizations which are not carried out in the absence of certainty about their feasibility, …).
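The rule-of-signs example translates directly into executable form; the following minimal sketch mirrors the abstract computations quoted above, with '±' as the unknown-sign element (zero is folded into '+' for simplicity):

    NEG, POS, TOP = '-', '+', '±'

    def alpha(n):                       # abstraction of a concrete integer
        return POS if n >= 0 else NEG

    def abs_mul(a, b):
        if TOP in (a, b):
            return TOP
        return POS if a == b else NEG   # rule of signs for multiplication

    def abs_add(a, b):
        return a if a == b else TOP     # mixed signs: result sign unknown

    print(abs_mul(alpha(-1515), alpha(17)))  # '-': -1515 * 17 is negative
    print(abs_add(alpha(-1515), alpha(17)))  # '±': sign of -1515 + 17 unknown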
Article
Full-text available
Hard real-time systems must obey strict timing constraints. Therefore, one needs to derive guarantees on the worst-case execution times of a system's tasks. In this context, predictable behavior of system components is crucial for the derivation of tight and thus useful bounds. This paper presents results about the predictability of common cache replacement policies. To this end, we introduce three metrics, evict, fill, and mls, that capture aspects of cache-state predictability. A thorough analysis of the LRU, FIFO, MRU, and PLRU policies yields the respective values under these metrics. To the best of our knowledge, this work presents the first quantitative, analytical results for the predictability of replacement policies. Our results support empirical evidence in static cache analysis.
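The predictability gap these metrics quantify can be seen in a toy simulation (an illustration, not the paper's analysis): after k distinct accesses, an LRU set of associativity k is fully determined, whereas a FIFO set may still hold a block from the unknown initial state.

    ASSOC = 4

    def lru_access(state, block):        # front of the list = most recently used
        if block in state:
            state.remove(block)
        state.insert(0, block)
        del state[ASSOC:]

    def fifo_access(state, block):       # hits do not reorder a FIFO set
        if block not in state:
            state.insert(0, block)       # front = youngest, back = oldest
            del state[ASSOC:]

    for name, access in (("LRU", lru_access), ("FIFO", fifo_access)):
        s1 = ['x', 'y', 'z', 'a']        # unknown initial state, 'a' oldest
        s2 = []                          # empty initial state
        for b in ['a', 'b', 'c', 'd']:   # k = ASSOC distinct accesses
            access(s1, b)
            access(s2, b)
        print(name, s1, s2, "converged:", set(s1) == set(s2))
    # LRU converges to {a,b,c,d}; FIFO still contains the unknown block 'x'.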
Conference Paper
Full-text available
This paper presents a framework for combining low-level measurement data through high-level static analysis techniques on instrumented programs in order to generate WCET estimates, for which we introduce the instrumentation point graph (IPG). We present the notion of iteration edges, which are the most important property of the IPG from a timing analysis perspective since they allow more path-based information to be integrated into tree-based calculations on loops. The main focus of this paper, however, is an algorithm that performs a hierarchical decomposition of an IPG into an Itree to permit tree-based WCET calculations. The Itree representation supports a novel high-level structure, the meta-loop, which enables iteration edges to be merged in the calculation stage. The timing schema required for the Itree is also presented. Finally, we outline some conclusions and future areas of interest.
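As background, a minimal generic timing schema of the kind such tree-based calculations build on is sketched below; node shapes and cycle counts are invented, and the Itree itself additionally handles iteration edges and meta-loops as described above.

    def wcet(node):
        kind = node[0]
        if kind == 'basic':                # ('basic', observed worst case)
            return node[1]
        if kind == 'seq':                  # ('seq', child1, child2, ...)
            return sum(wcet(c) for c in node[1:])
        if kind == 'alt':                  # ('alt', condition, then, else)
            return wcet(node[1]) + max(wcet(node[2]), wcet(node[3]))
        if kind == 'loop':                 # ('loop', iteration bound, body)
            return node[1] * wcet(node[2])
        raise ValueError(kind)

    tree = ('seq', ('basic', 10),
                   ('loop', 8, ('alt', ('basic', 2), ('basic', 7), ('basic', 4))),
                   ('basic', 5))
    print(wcet(tree))                      # 10 + 8 * (2 + 7) + 5 = 87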
Article
Full-text available
This paper describes the tool support for a framework for performing probabilistic worst-case execution time (WCET) analysis for embedded real-time systems. The tool is based on a combination of measurement and static analysis, all in a probabilistic framework. Measurement is used to determine execution traces, and static analysis to construct the worst-case path, effectively providing an upper bound on the worst-case execution time of a program. The paper illustrates the theoretical framework and the components of the tool together with a case study.
Conference Paper
Traditional approaches for worst-case execution time (WCET) analysis produce values which are very pessimistic if applied to modern processors. In addition, end-to-end measurements as used in industry produce estimates of the execution time that potentially underestimate the real worst-case execution time. We introduce the notion of probabilistic hard real-time systems, which have to meet all their deadlines but for which a (high) probabilistic guarantee suffices. We combine both measurement and analytical approaches into a model for computing probabilistic bounds on the execution time of the worst-case path of sections of code. The technique presented is based on combining (probabilistically) the worst-case effects seen in individual blocks to build the execution-time model of the worst-case path of the program (such a case may not have been observed in the measurements). We provide three alternative operators for the combination, based on whether information about their dependency is known. Experimental evaluation on two case studies shows that the values obtained by traditional analysis are reached only with extremely low probability.
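The combination operator for the independent case can be sketched as a convolution of per-block timing profiles; the distributions below are toy numbers, not measured data.

    # Convolve two execution-time distributions ({cycles: probability}).
    def convolve(d1, d2):
        out = {}
        for t1, p1 in d1.items():
            for t2, p2 in d2.items():
                out[t1 + t2] = out.get(t1 + t2, 0.0) + p1 * p2
        return out

    block_a = {10: 0.9, 25: 0.1}           # observed profile of block A
    block_b = {5: 0.8, 12: 0.2}            # observed profile of block B
    path = convolve(block_a, block_b)
    print(path)                            # {15: 0.72, 22: 0.18, 30: 0.08, 37: 0.02}
    print(sum(p for t, p in path.items() if t > 30))   # P(time > 30) ≈ 0.02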
Thesis
In this work the automatic generation of program analyzers from concise specifications is presented. It focuses on provably correct and complex interprocedural analyses for imperative programs of real-world size. Thus, a powerful and flexible specification mechanism is required, enabling both correctness proofs and efficient implementations. The generation process relies on the theory of data flow analysis and on abstract interpretation. The theory of data flow analysis provides methods to efficiently implement analyses. Abstract interpretation provides the relation to the semantics of the programming language. This allows the systematic derivation of efficient, provably correct, and terminating analyses. The approach has been implemented in the program analyzer generator PAG. It addresses analyses ranging from "simple" intraprocedural bit vector frameworks to complex interprocedural alias analyses. A high-level specialized functional language is used as the specification mechanism, enabling elegant and concise specifications even for complex analyses. Additionally, it allows the automatic selection of efficient implementations for the underlying abstract datatypes, such as balanced binary trees, binary decision diagrams, bit vectors, and arrays. For the interprocedural analysis, the functional approach, the call string approach, or a novel approach especially targeting the precise analysis of loops can be chosen. In this work the implementation of PAG as well as a large number of applications of PAG are presented.
Article
The theoretical basis of sequential circuit synthesis is developed, with particular reference to the work of D. A. Huffman and E. F. Moore. A new method of synthesis is developed which emphasizes formal procedures rather than the more familiar intuitive ones. Familiarity is assumed with the use of switching algebra in the synthesis of combinational circuits.
Article
Run-time guarantees play an important role in the area of embedded systems and especially hard real-time systems. These systems are typically subject to stringent timing constraints, which often result from the interaction with the surrounding physical environment. It is essential that the computations are completed within their associated time bounds; otherwise severe damage may result, or the system may be unusable. Therefore, a schedulability analysis has to be performed which guarantees that all timing constraints will be met. Schedulability analyses require upper bounds for the execution times of all tasks in the system to be known. These bounds must be safe, i.e., they may never underestimate the real execution time. Furthermore, they should be tight, i.e., the overestimation should be as small as possible. In modern microprocessor architectures, caches, pipelines, and all kinds of speculation are key features for improving (average-case) performance. Unfortunately, they make the analysis of the timing behaviour of instructions very difficult, since the execution time of an instruction depends on the execution history. A lack of precision in the predicted timing behaviour may lead to a waste of hardware resources, which would have to be invested in order to meet the requirements. For products which are manufactured in high quantities, e.g., in the automobile or telecommunications markets, this would result in intolerable expenses. The subject of this chapter is one particular approach and the subtasks involved in computing safe and precise bounds on the execution times of tasks in real-time systems.
Article
Also published as: Saarbrücken, University, dissertation, 1997.
Article
This paper explores the issues to be addressed to provide safe worst-case execution time (WCET) estimation methods based on measurements. We suggest using structural testing for the exhaustive exploration of paths in a program. Since test data generation is in general too complex to be used in practice for most real-size programs, we propose to generate test data for program segments only, using program clustering. Moreover, to be able to combine the execution times of program segments and to obtain the WCET of the whole program, we advocate the use of compiler techniques to reduce (ideally eliminate) the timing variability of program segments and to make the time of program segments independent from one another. (Jean-François Deverge and Isabelle Puaut. Safe Measurement-Based WCET Estimation. In Reinhard Wilhelm, editor, 5th Intl. Workshop on Worst-Case Execution Time (WCET) Analysis, Dagstuhl, Germany, 2007. Schloss Dagstuhl - Leibniz-Zentrum für Informatik. http://drops.dagstuhl.de/opus/volltexte/2007/808)
Conference Paper
Embedded computer systems are characterized by the presence of a processor running application specific software. A large number of these systems must satisfy real-time constraints. This paper examines the problem of determining the bound on the running time of a given program on a given processor. An important aspect of this problem is determining the extreme case program paths. The state of the art solution here relies on an explicit enumeration of program paths. This runs out of steam rather quickly since the number of feasible program paths is typically exponential in the size of the program. We present a solution for this problem, which considers all paths implicitly by using integer linear programming. This solution is implemented in the program cinderella which currently targets a popular embedded processor - the Intel i960. The preliminary results of using this tool are presented here.
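The implicit-path idea is easy to reproduce on a toy control-flow graph. The sketch below uses the PuLP package as an off-the-shelf ILP solver; the graph, the per-block cycle costs, and the choice of solver are assumptions of this example, not details of cinderella.

    from pulp import LpProblem, LpMaximize, LpVariable, lpSum, value

    # Toy CFG: entry A branches to B or C; both rejoin at D.  Costs in cycles.
    cost = {'A': 5, 'B': 20, 'C': 8, 'D': 3}
    x = {b: LpVariable(f"x_{b}", lowBound=0, cat="Integer") for b in cost}

    prob = LpProblem("ipet", LpMaximize)
    prob += lpSum(cost[b] * x[b] for b in cost)   # maximize total execution time
    prob += x['A'] == 1                           # the entry block runs once
    prob += x['B'] + x['C'] == x['A']             # flow out of the branch
    prob += x['D'] == x['B'] + x['C']             # flow into the join
    prob.solve()
    print(value(prob.objective))                  # 28.0 = 5 + 20 + 3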
Article
Verification of program running time is essential in system design with real-time constraints. Simulation with incomplete test patterns or simple instruction counting are not appropriate for complex architectures. Software running times of embedded systems are process state and input data dependent. Formal analysis of such dependencies leads to software running time intervals rather than single values. These intervals depend on program properties, execution paths, and states of processes, as well as on the target architecture. An approach to analysis of process behavior using running time intervals is presented. It improves our previous work by exploiting program segments with single paths and by taking the execution context into account. The example of an asynchronous transfer mode (ATM) cell handler demonstrates significant improvements in analysis precision. Experimental results show the superiority of the presented approach over well-established approaches.
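Composing running-time intervals over program segments is straightforward to sketch; intervals carry both lower and upper bounds, so sequences add and branches widen. The segment structure below is invented for illustration.

    def seq(i1, i2):      # consecutive segments: bounds add up
        return (i1[0] + i2[0], i1[1] + i2[1])

    def alt(i1, i2):      # branch: union of the two intervals
        return (min(i1[0], i2[0]), max(i1[1], i2[1]))

    # A single-path segment (4-6 cycles) followed by a data-dependent branch.
    print(seq((4, 6), alt((2, 3), (5, 9))))   # (6, 15)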
Article
Real-time systems have to complete their actions w.r.t. given timing constraints. In order to validate that these constraints are met, static timing analysis is usually performed to compute an upper bound of the worst-case execution times (WCET) of all the involved tasks.
Article
In previous work [1], we have developed the theoretical basis for the prediction of the cache behavior of programs by abstract interpretation. Abstract interpretation is a technique for the static analysis of dynamic properties of programs. It is semantics based, that is, it computes approximative properties of the semantics of programs. On this basis, it allows for correctness proofs of analyses. It thus replaces commonly used ad hoc techniques by systematic, provable ones, and it allows the automatic generation of analyzers from specifications, as in the Program Analyzer Generator, PAG. In this paper, abstract semantics of machine programs are refined which determine the contents of caches. For interprocedural analysis, existing methods are examined and a new approach that is especially tailored for the analysis of hardware with states is presented. This allows for a static classification of the cache behavior of memory references of programs. The calculated information can be used to...
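A minimal sketch of a must-cache analysis for one LRU set, in the spirit of the approach described above: the abstract state maps each block to an upper bound on its age, the update follows the LRU replacement, and the join at control-flow merges keeps only blocks known on both paths, at their maximal age. Associativity and block names are invented.

    ASSOC = 4

    def update(state, block):
        old = state.get(block, ASSOC)       # age ASSOC means "maybe not cached"
        new = {block: 0}                    # accessed block becomes youngest
        for b, age in state.items():
            if b == block:
                continue
            aged = age + 1 if age < old else age   # only younger blocks age
            if aged < ASSOC:
                new[b] = aged               # ages >= ASSOC: may be evicted
        return new

    def join(s1, s2):                       # must-join: intersect, keep max age
        return {b: max(s1[b], s2[b]) for b in s1.keys() & s2.keys()}

    s = {}
    for b in ['p', 'q', 'p']:
        s = update(s, b)
    print(s)                                # {'p': 0, 'q': 1}: guaranteed hits
    print(join({'p': 0, 'q': 1}, {'p': 2})) # {'p': 2}: only p survives the merge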
Article
Real-time systems and many other computer applications must meet specifications and perform tasks that satisfy timing as well as logical criteria for correctness. Examples of timing properties and constraints include deadlines, the periodic execution of processes, and external event recognition based on time of occurrence (e.g., [9, 18]). We present a scheme for reasoning with and about time and for specifying timing properties in concurrent programs. The objectives are to predict the timing behavior of higher-level language programs and to prove that they meet their timing constraints, through the direct analysis of program statements. (This chapter is based on "Reasoning about Time in Higher-Level Language Software" by Alan Shaw, IEEE Transactions on Software Engineering, vol. 15, no. 7, pp. 875-889, July 1989, © 1989 IEEE.)
Article
Programs spend most of their time in loops and procedures. Therefore, most program transformations and the necessary static analyses deal with these. It has long been recognized that different execution contexts for procedures may induce different execution properties. There are well-established techniques for interprocedural analysis, like the call string approach. Loops have not received similar attention in the area of data flow analysis and abstract interpretation. All executions are treated in the same way, although typically the first and later executions may exhibit very different properties. In this paper a new technique is presented that allows the application of the well-known and established interprocedural analysis theory to loops. It turns out that the call string approach has limited flexibility in its possibilities to group several calling contexts together for the analysis. An extension to overcome this problem is presented that relies on a similar approach but gives more useful results in practice. The classical and the new techniques are implemented in our Program Analyzer Generator PAG, which is used to demonstrate our findings by applying the techniques to several real-world programs.
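A hedged sketch of the idea of analyzing loops with contexts, analogous to calling contexts: the analysis keeps one abstract state for the first iteration and one for all later iterations, iterating the latter to a fixpoint. The domain below (a must-set of cached blocks) and all names are invented for illustration.

    def analyze_loop(init_state, transfer, join, max_iters=100):
        first = transfer(init_state, context="first")   # e.g., cold caches
        later = first
        for _ in range(max_iters):                      # fixpoint for "later"
            nxt = join(later, transfer(later, context="later"))
            if nxt == later:
                return first, later                     # per-context results
            later = nxt
        raise RuntimeError("no fixpoint reached")

    def transfer(state, context):      # loop body accesses blocks 'a' and 'b'
        return state | {'a', 'b'}

    def join(s1, s2):                  # must-information: intersection
        return s1 & s2

    first, later = analyze_loop(frozenset(), transfer, join)
    print(first, later)                # both contexts guarantee {'a', 'b'} cached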
Exploiting Hardware Performance Counters with Flow and Context Sensitive Profiling
  • Glenn Ammons
  • Thomas Ball
  • James R Larus
Glenn Ammons, Thomas Ball, and James R. Larus. Exploiting Hardware Performance Counters with Flow and Context Sensitive Profiling. In PLDI '97: Proceedings of the ACM SIGPLAN 1997 conference on Programming Language Design and Implementation, pages 85-96. ACM, 1997.
An Approach to Symbolic Worst-Case Execution Time Analysis
  • Guillem Bernat
  • Alan Burns
Guillem Bernat and Alan Burns. An Approach to Symbolic Worst-Case Execution Time Analysis. In 25th IFAC Workshop on Real-Time Programming, 2000.
Optimally Profiling and Tracing Programs
  • Thomas Ball
  • James R Larus
Thomas Ball and James R. Larus. Optimally Profiling and Tracing Programs. In POPL '92: Proceedings of the 19th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 59-70. ACM, 1992.
aiT: Worst-Case Execution Time Prediction by Static Program Analysis
  • Christian Ferdinand
  • Reinhold Heckmann
Christian Ferdinand and Reinhold Heckmann. aiT: Worst-Case Execution Time Prediction by Static Program Analysis. In René Jacquart, editor, Building the Information Society. IFIP 18th World Computer Congress, Topical Sessions, 22-27 August 2004, Toulouse, France, pages 377-384. Kluwer, Boston, Mass., 2004.
Towards Predicated WCET Analysis
  • Amine Marref
  • Guillem Bernat
Amine Marref and Guillem Bernat. Towards Predicated WCET Analysis. In Raimund Kirner, editor, 8th Intl. Workshop on Worst-Case Execution Time (WCET) Analysis, Dagstuhl, Germany, 2008. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, Germany.
Semiautomatic Derivation of Abstract Processor Models
  • Markus Pister
  • Marc Schlickling
  • Mohamed Abdel Maksoud
Markus Pister, Marc Schlickling, and Mohamed Abdel Maksoud. Semiautomatic Derivation of Abstract Processor Models. Reports of ES PASS, June 2009.
A Framework for Static Analysis of VHDL Code
  • Marc Schlickling
  • Markus Pister
Marc Schlickling and Markus Pister. A Framework for Static Analysis of VHDL Code. In Christine Rochange, editor, Proceedings of the 7th International Workshop on Worst-Case Execution Time (WCET) Analysis at Pisa, Italy, 2007.
Static Analysis Support for Measurement-Based WCET Analysis
  • Stefan Schaefer
  • Bernhard Scholz
  • Stefan M Petters
  • Gernot Heiser
Stefan Schaefer, Bernhard Scholz, Stefan M. Petters, and Gernot Heiser. Static Analysis Support for Measurement-Based WCET Analysis. In 12th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, Work-in-Progress Session, 2006.