Article (PDF available)

Abstract

The classical approach to automatic cost analysis consists of two phases. Given a program and some measure of cost, the analysis first produces cost relations (CRs), i.e., recursive equations which capture the cost of the program in terms of the size of its input data. Second, CRs are converted into closed form, i.e., into expressions without recurrences. Whereas the first phase has received considerable attention, with a number of cost analyses available for a variety of programming languages, the second phase has been comparatively less studied. This article presents, to our knowledge, the first practical framework for the generation of closed-form upper bounds for CRs which (1) is fully automatic, (2) can handle the distinctive features of CRs originating from cost analysis of realistic programming languages, (3) is not restricted to simple complexity classes, and (4) produces reasonably accurate solutions. A key idea in our approach is to view CRs as programs, which allows applying semantics-based static analyses and transformations to bound them; specifically, our method is based on the inference of ranking functions and loop invariants and on the use of partial evaluation.
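For illustration (a toy example of our own, not one taken from the article), a cost relation for a loop that pays 3 cost units per iteration, together with the closed-form upper bound that a linear ranking function justifies, looks as follows:

```latex
% Hypothetical cost relation for a loop scanning an input of size n,
% paying 3 cost units per iteration, and its closed-form upper bound.
\begin{align*}
  C(n) &= 0            && \text{if } n \le 0\\
  C(n) &= 3 + C(n-1)   && \text{if } n > 0
\end{align*}
% The ranking function f(n) = n bounds the number of recursive unfoldings,
% so a closed-form upper bound is
\[
  C(n) \;\le\; 3\,n \qquad \text{for all } n \ge 0.
\]
```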
... Recurrence equations have played a central role in static cost analysis and verification for decades, having been used in many automated tools [15,14,16,59,32,1,71,51,23,50,48] ever since Wegbreit's seminal work in 1975 [80]. The goal of static cost analysis is to infer information about the resources used by programs (e.g. ...
... Second, despite significant progress on specialised solvers, they too do not support the breadth of behaviours arising from programs. As reported in [70], tools such as RaML, CiaoPP, PUBS, Cofloco, KoAT/LoAT, Loopus, or Duet [33,32,1,23,25,24,72,40] all present important limitations and are each unable to infer tight bounds for some classes of programs that exhibit, e.g., nonlinear recursive, amortised, non-monotonic, and/or multiphase behaviour, even when a simple closed-form solution exists. ...
... Because of this, Cofloco outperforms our prototype in category nonmonot, where it obtains exact solutions in cases where our prototype only infers (accurate) approximations. For example, in the memory leak benchmark, we only find bounds [4589/4599, 4609/4599] for s = 0 instead of [1,1], and in the open-zip benchmark, a counter-example caused by numerical errors is found for our candidate upper bound. We may note that the approaches mentioned in Sections 8.2 and 9, not implemented in this prototype, could alleviate this limitation. ...
Preprint
Recurrence equations have played a central role in static cost analysis, where they can be viewed as abstractions of programs and used to infer resource usage information without actually running the programs with concrete data. Such information is typically represented as functions of input data sizes. More generally, recurrence equations have been increasingly used to automatically obtain non-linear numerical invariants. However, state-of-the-art recurrence solvers and cost analysers suffer from serious limitations when dealing with the (complex) features of recurrences arising from cost analyses. We address this challenge by developing a novel order-theoretical framework where recurrences are viewed as operators and their solutions as fixpoints, which allows leveraging powerful pre/postfixpoint search techniques. We prove useful properties and provide principles and insights that enable us to develop techniques and combine them to design new solvers. We have also implemented and experimentally evaluated an optimisation-based instantiation of the proposed approach. The results are quite promising: our prototype outperforms state-of-the-art cost analysers and recurrence solvers, and can infer tight non-linear lower/upper bounds, in a reasonable time, for complex recurrences representing diverse program behaviours.
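To give a concrete flavour of the operator/fixpoint view (a minimal sketch under our own simplifying assumptions, not the authors' solver), a recurrence can be applied as an operator F to a candidate bound g; checking F(g) ≤ g pointwise (a pre/postfixpoint condition, depending on the convention used) certifies g as an upper bound on the least solution when F is monotone:

```python
# Sketch (our own toy example, not the paper's implementation): view the recurrence
#   f(0) = 0,  f(n) = f(n // 2) + 1  for n >= 1
# as an operator F on candidate cost functions, and check that a candidate
# upper bound g satisfies F(g)(n) <= g(n) on sampled inputs. For a monotone
# operator, such a candidate over-approximates the least solution.
import math

def F(g):
    """Apply one unfolding of the recurrence to the candidate g."""
    def Fg(n):
        return 0 if n == 0 else g(n // 2) + 1
    return Fg

def g(n):
    """Candidate closed-form upper bound: log2(n) + 1 (and 0 at n = 0)."""
    return 0 if n == 0 else math.log2(n) + 1

violations = [n for n in range(0, 10_000) if F(g)(n) > g(n) + 1e-9]
print("candidate refuted on:", violations[:5] if violations else "no sampled input")
```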
... As will be discussed in Section 2, such models can be established at different levels of abstraction, ranging from models that characterize individual functional hardware blocks [3], via Instruction Set Architecture (ISA) characterization models [4,5,6], to models based on intermediate representations used by the compiler [7,8]. The final energy models provide information that feeds into static resource usage analysis algorithms [9,10,11,12,13,14,15,16], where they represent the energy usage of elementary parts of the computation. This is discussed in Section 3. ...
... The input to the CiaoPP parametric static resource usage analyzer [13,14,15] is the HC IR, along with assertions in the common assertion language expressing the energy model for LLVM IR blocks and/or individual ISA instructions, and possibly some additional (trusted) information. The analyzer is based on an approach in which recursive equations (cost relations), representing the cost of running the program, are extracted from the program and solved, obtaining upper- and lower-bound cost functions (which may be polynomial, exponential or logarithmic) in terms of the program's inputs [9,10,11,12,16]. ...
Preprint
Promoting energy efficiency to a first-class system design goal is an important research challenge. Although more energy-efficient hardware can be designed, it is software that controls the hardware; for a given system, the potential for energy savings is likely to be much greater at the higher levels of abstraction in the system stack. Thus the greatest savings are expected from energy-aware software development, which is the vision of the EU ENTRA project. This article presents the concept of energy transparency as a foundation for energy-aware software development. We show how energy modelling of hardware is combined with static analysis to allow the programmer to understand the energy consumption of a program without executing it, thus enabling exploration of the design space taking energy into consideration. The paper concludes by summarising the current and future challenges identified in the ENTRA project.
... The challenge we address originates from the established approach of setting up recurrence relations representing the cost of predicates, parameterized by input data sizes (Wegbreit, 1975; Rosendahl, 1989; Debray et al., 1990; Debray and Lin, 1993; Debray et al., 1997; Navas et al., 2007; Albert et al., 2011; Serrano et al., 2014; Lopez-Garcia et al., 2016), which are then solved to obtain closed forms of such recurrences (i.e., functions that provide either exact, or upper/lower bounds on resource usage in general). Such an approach can infer different classes of functions (e.g., polynomial, factorial, exponential, summation, or logarithmic). ...
... We also include tools such as PUBS (Albert et al., 2011, 2013) and Cofloco (Flores-Montoya, 2017) in this category. These works emphasize the shortcomings of using overly simple recurrence relations, chosen so as to fit the limitations of available CAS solvers, and the necessity to consider non-deterministic (in)equations, piecewise definitions, multiple variables, possibly increasing variables, and to study the non-monotonic behavior and control flow of the equations. ...
Article
Full-text available
Automatic static cost analysis infers information about the resources used by programs without actually running them with concrete data, and presents such information as functions of input data sizes. Most of the analysis tools for logic programs (and many for other languages), such as CiaoPP, are based on setting up recurrence relations representing (bounds on) the computational cost of predicates and solving them to find closed-form functions. Such recurrence solving is a bottleneck in current tools: many of the recurrences that arise during the analysis cannot be solved with state-of-the-art solvers, including computer algebra systems (CASs), so that specific methods for different classes of recurrences need to be developed. We address such a challenge by developing a novel, general approach for solving arbitrary, constrained recurrence relations, which uses machine learning (sparse-linear and symbolic) regression techniques to guess a candidate closed-form function, and a combination of an SMT solver and a CAS to check whether such a function is actually a solution of the recurrence. Our prototype implementation and its experimental evaluation within the context of the CiaoPP system show quite promising results. Overall, for the considered benchmark set, our approach outperforms state-of-the-art cost analyzers and recurrence solvers, and can find closed-form solutions, in a reasonable time, for recurrences that cannot be solved by them.
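A toy sketch of this guess-and-check scheme (our own illustration; the actual system uses sparse-linear and symbolic regression with an SMT solver and a CAS for verification, whereas here plain least-squares fitting and sample-based checking stand in for both):

```python
# Toy guess-and-check in the spirit of the approach described above
# (our own sketch, not the CiaoPP implementation).
import numpy as np

def f(n):
    """Constrained recurrence: f(0) = 0, f(n) = f(n - 1) + n."""
    acc = 0
    for k in range(1, n + 1):
        acc += k
    return acc

# 1. Guess: fit a degree-2 polynomial to sampled values and round coefficients.
xs = np.arange(0, 30)
ys = np.array([f(n) for n in xs], dtype=float)
coeffs = np.round(np.polyfit(xs, ys, deg=2), 6)   # expect ~[0.5, 0.5, 0.0]
candidate = np.poly1d(coeffs)

# 2. Check: verify the candidate against the recurrence on a larger range
#    (the real tool discharges this check with an SMT solver plus a CAS).
ok = all(abs(candidate(n) - f(n)) < 1e-6 for n in range(0, 200))
print("candidate:", coeffs, "verified on samples:", ok)
```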
... SRA has been mainly driven by the timing analysis community. Static cost analysis techniques based on setting up and solving recurrence equations date back to Wegbreit's [40] seminal paper, and have been developed significantly in subsequent work [41,42,43,44,45]. Other classes of approaches to cost analysis use dependent types [46], SMT solvers [47], or size change abstraction [48]. ...
Preprint
Energy transparency is a concept that makes a program's energy consumption visible, from hardware up to software, through the different system layers. Such transparency can enable energy optimizations at each layer and between layers, and help both programmers and operating systems make energy-aware decisions. In this paper, we focus on deeply embedded devices, typically used for Internet of Things (IoT) applications, and demonstrate how to enable energy transparency through existing Static Resource Analysis (SRA) techniques and a new target-agnostic profiling technique, without hardware energy measurements. Our novel mapping technique enables software energy consumption estimations at a higher level than the Instruction Set Architecture (ISA), namely the LLVM Intermediate Representation (IR) level, and therefore introduces energy transparency directly to the LLVM optimizer. We apply our energy estimation techniques to a comprehensive set of benchmarks, including single- and also multi-threaded embedded programs from two commonly used concurrency patterns, task farms and pipelines. Using SRA, our LLVM IR results demonstrate a high accuracy with a deviation in the range of 1% from the ISA SRA. Our profiling technique captures the actual energy consumption at the LLVM IR level with an average error of 3%.
... Many advanced techniques for imperative integer programs apply abstract interpretation to generate numerical invariants. The obtained size-change information forms the basis for the computation of actual bounds on loop iterations and recursion depths, using counter instrumentation [27], ranking functions [6,2,15,52], recurrence relations [4,1], and abstract interpretation itself [58,18]. Automatic resource analysis techniques for functional programs are based on sized types [54], recurrence relations [23], term rewriting [9], and amortized resource analysis [35,42,30,51]. ...
Preprint
This article presents a resource analysis system for OCaml programs. This system automatically derives worst-case resource bounds for higher-order polymorphic programs with user-defined inductive types. The technique is parametric in the resource and can derive bounds for time, memory allocations and energy usage. The derived bounds are multivariate resource polynomials which are functions of different size parameters that depend on the standard OCaml types. Bound inference is fully automatic and reduced to a linear optimization problem that is passed to an off-the-shelf LP solver. Technically, the analysis system is based on a novel multivariate automatic amortized resource analysis (AARA). It builds on existing work on linear AARA for higher-order programs with user-defined inductive types and on multivariate AARA for first-order programs with built-in lists and binary trees. For the first time, it is possible to automatically derive polynomial bounds for higher-order functions and polynomial bounds that depend on user-defined inductive types. Moreover, the analysis handles programs with side effects and even outperforms the linear bound inference of previous systems. At the same time, it preserves the expressivity and efficiency of existing AARA techniques. The practicality of the analysis system is demonstrated with an implementation and integration with Inria's OCaml compiler. The implementation is used to automatically derive resource bounds for 411 functions and 6018 lines of code derived from OCaml libraries, the CompCert compiler, and implementations of textbook algorithms. In a case study, the system infers bounds on the number of queries that are sent by OCaml programs to DynamoDB, a commercial NoSQL cloud database service.
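As a rough sketch of how bound inference can reduce to linear programming (our own toy constraint system, not RaML's; scipy's linprog stands in for the off-the-shelf LP solver), consider inferring potential annotations (q0, q) for a list traversal that costs 2 units per element:

```python
# Minimal sketch of LP-based amortized bound inference (our own toy model,
# not RaML's constraint generation). We look for potential annotations
# (q0, q) such that q0 + q*|xs| pays for a traversal costing 2 per element:
#   per-element:  q  >= 2   (the head's potential covers the loop body cost)
#   base case:    q0 >= 0   (constant potential covers the nil case)
# Objective: minimise q0 + q, i.e. prefer the tightest annotation.
from scipy.optimize import linprog

c = [1.0, 1.0]                       # minimise q0 + q
A_ub = [[-1.0, 0.0],                 # -q0 <= 0   (q0 >= 0)
        [0.0, -1.0]]                 # -q  <= -2  (q  >= 2)
b_ub = [0.0, -2.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None), (None, None)])
q0, q = res.x
print(f"inferred bound: cost(xs) <= {q0:.0f} + {q:.0f} * |xs|")
```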
Preprint
Traditional static resource analyses estimate the total resource usage of a program, without executing it. In this paper we present a novel resource analysis whose aim is instead the static profiling of accumulated cost, i.e., to discover, for selected parts of the program, an estimate or bound of the resource usage accumulated in each of those parts. Traditional resource analyses are parametric in the sense that the results can be functions on input data sizes. Our static profiling is also parametric, i.e., our accumulated cost estimates are also parameterized by input data sizes. Our proposal is based on the concept of cost centers and a program transformation that allows the static inference of functions that return bounds on these accumulated costs depending on input data sizes, for each cost center of interest. Such information is much more useful to the software developer than the traditional resource usage functions, as it allows identifying the parts of a program that should be optimized, because of their greater impact on the total cost of program executions. We also report on our implementation of the proposed technique using the CiaoPP program analysis framework, and provide some experimental results. This paper is under consideration for acceptance in TPLP.
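To fix intuitions about cost centers, here is a purely dynamic analogue written for illustration (the paper's contribution is inferring such accumulated costs statically, as functions of input data sizes, without running the program):

```python
# Illustrative dynamic analogue of cost centers (our own sketch): each
# decorated function accumulates its cost (here, call counts) in a named
# cost center.
from collections import defaultdict
from functools import wraps

accumulated = defaultdict(int)

def cost_center(name, cost_per_call=1):
    def deco(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            accumulated[name] += cost_per_call   # attribute cost to this center
            return fn(*args, **kwargs)
        return wrapper
    return deco

@cost_center("parse")
def parse(token):
    return token.strip()

@cost_center("eval")
def evaluate(value):
    return len(value)

total = sum(evaluate(parse(t)) for t in ["  a ", "bb ", " ccc"])
print(dict(accumulated))   # e.g. {'parse': 3, 'eval': 3}
```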
Chapter
Cross-organisational workflows involve multiple concurrently running workflows across organisations, and are in general more complex and unpredictable than single individual ones. Minor modifications in a collaborating workflow may be propagated to other concurrently running ones, leading to negative impacts on the cross-organisational workflow as a whole, e.g., deadline violation, which can be disastrous for, e.g., healthcare organisations. Worst-case execution time (WCET) is a metric for detecting potential deadline violation for a workflow. In this paper, we present a tool, RplTool, which helps planners model and simulate cross-organisational workflows in a resource-sensitive formal modelling language, Rpl. The tool is equipped with a static analyser to approximate the WCET of workflows, which can provide decision support to the workflow planners prior to the implementation of the workflow and help estimate the effect of changes to avoid deadline violation. Keywords: worst-case execution time, cross-organisational workflows, resource planning, formal modelling, static analysis
Chapter
In this paper, we develop semantic foundations for precise cost analyses of programs running on architectures with multi-scalar pipelines and in-order execution with branch prediction. This model is then used to prove the correctness of an automatic cost analysis we designed. The analysis is implemented and evaluated in an existing framework for high-assurance cryptography. In this field, developers aggressively hand-optimize their code to take maximal advantage of micro-architectural features while looking for provable semantic guarantees.
Article
Full-text available
Cost analysis of Java bytecode is complicated by its unstructured control flow, the use of an operand stack and its object-oriented programming features (like dynamic dispatching). This paper addresses these problems and develops a generic framework for the automatic cost analysis of sequential Java bytecode. Our method generates cost relations which define at compile-time the cost of programs as a function of their input data size. To the best of our knowledge, this is the first approach to the automatic cost analysis of Java bytecode.
Article
Full-text available
This paper describes a new static analysis for finding approximations to the path-length of variables in imperative, object-oriented programs. The path-length of a variable v is the cardinality of the longest chain of pointers that can be followed from v. It is shown how such information may be used for automatic termination inference of programs dealing with dynamically created data structures.
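A small example of the path-length measure (our own sketch of the concept, not the static analysis itself): for an acyclic singly linked list, the path-length of the cursor strictly decreases at each step of a traversal, which is exactly the argument a termination analysis can exploit:

```python
# Sketch of the path-length measure for a singly linked structure
# (our own illustration of the concept, not the static analysis itself).
class Node:
    def __init__(self, nxt=None):
        self.next = nxt

def path_length(v):
    """Length of the longest chain of pointers followed from v (acyclic case)."""
    length = 0
    while v is not None:
        length += 1
        v = v.next
    return length

xs = Node(Node(Node(None)))
cursor = xs
while cursor is not None:
    nxt = cursor.next
    if nxt is not None:
        # the measure strictly decreases, which guarantees termination
        assert path_length(nxt) < path_length(cursor)
    cursor = nxt
print("traversal bounded by path_length(xs) =", path_length(xs))
```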
Article
We show how to efficiently obtain linear a priori bounds on the heap space consumption of first-order functional programs. The analysis takes space reuse by explicit deallocation into account and also furnishes an upper bound on the heap usage in the presence of garbage collection. It covers a wide variety of examples including, for instance, the familiar sorting algorithms for lists, including quicksort. The analysis relies on a type system with resource annotations. Linear programming (LP) is used to automatically infer derivations in this enriched type system. We also show that integral solutions to the linear programs derived correspond to programs that can be evaluated without any operating system support for memory management. The particular integer linear programs arising in this way are shown to be feasibly solvable under mild assumptions.
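As a simplified illustration of the kind of resource-annotated typing such a system infers (our own notation in the spirit of this line of work, not a judgement taken from the paper), a list-copying function that allocates one heap cell per input element can be annotated as follows:

```latex
% Simplified resource-annotated type: copy consumes 1 heap cell per input
% element and needs no extra constant space.
\[
  \mathit{copy} : (L^{1}(A),\, 0) \to (L^{0}(A),\, 0)
\]
% Reading: with potential 1 per input element and 0 free cells up front,
% the call runs within its potential, i.e.
\[
  \mathrm{heap}(\mathit{copy}(\mathit{xs})) \;\le\; 1 \cdot |\mathit{xs}| + 0 .
\]
```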
Conference Paper
Let C be a program written in a formal language in order to be executed by some kind of machinery. A statement about C might be true or false and has the form C:M. For the time being, just consider the statement C:M as a collection of data yielding information about the resources required to execute C; and if we know that C:M is true (or false), we know something useful when it comes to determining the computational complexity of C. Let Γ be a set of statements, and let Γ ⊧ C:M denote that C:M will be true if all the statements in Γ are true. (The statements in Γ might say something about the computational complexity of the subprograms of C.) If Γ = ∅, we will simply write ⊧ C:M.
Article
Constraint Logic Programming (CLP) is a merger of two declarative paradigms: constraint solving and logic programming. Although a relatively new field, CLP has progressed in several quite different directions. In particular, the early fundamental concepts have been adapted to better serve in different areas of applications. In this survey of CLP, a primary goal is to give a systematic description of the major trends in terms of common fundamental concepts. The three main parts cover the theory, implementation issues, and programming for applications.
Article
This paper attempts to provide an adequate basis for formal definitions of the meanings of programs in appropriately defined programming languages, in such a way that a rigorous standard is established for proofs about computer programs, including proofs of correctness, equivalence, and termination. The basis of our approach is the notion of an interpretation of a program: that is, an association of a proposition with each connection in the flow of control through a program, where the proposition is asserted to hold whenever that connection is taken. To prevent an interpretation from being chosen arbitrarily, a condition is imposed on each command of the program. This condition guarantees that whenever a command is reached by way of a connection whose associated proposition is then true, it will be left (if at all) by a connection whose associated proposition will be true at that time. Then by induction on the number of commands executed, one sees that if a program is entered by a connection whose associated proposition is then true, it will be left (if at all) by a connection whose associated proposition will be true at that time. By this means, we may prove certain properties of programs, particularly properties of the form: ‘If the initial values of the program variables satisfy the relation R1, the final values on completion will satisfy the relation R2’.
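A small worked instance of the inductive-assertion idea (our own sketch, with the propositions written as executable checks rather than proved formally): attach a proposition to the loop's back edge, check that it is preserved by the body, and combine it with the exit condition to obtain the relation between initial and final values:

```python
# Floyd-style inductive assertions, checked dynamically for illustration
# (the paper's point is that the same propositions can be *proved* to hold
# at their control-flow connections, by induction on the commands executed).
def sum_first(n):
    """Returns 0 + 1 + ... + (n - 1), with the loop invariant made explicit."""
    i, s = 0, 0
    while i < n:
        # proposition attached to this connection: s == 0 + 1 + ... + (i - 1)
        assert s == i * (i - 1) // 2 and 0 <= i <= n
        s += i
        i += 1
    # exit connection: invariant plus "not (i < n)" give the final relation R2
    assert i == max(n, 0) and s == i * (i - 1) // 2
    return s

print(sum_first(10))   # 45
```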
Article
Analyzing time complexity of functional programs in a lazy language is problematic, because the time required to evaluate a function depends on how much of the result is “needed” in the computation. Recent results in strictness analysis provide a formalisation of this notion of “need”, and thus can be adapted to analyse time complexity. The future of programming may be in this paradigm: to create software, first write a specification that is clear, and then refine it to an implementation that is efficient. In particular, this paradigm is a prime motivation behind the study of functional programming. Much has been written about the process of transforming one functional program into another. However, a key part of the process has been largely ignored, for very little has been written about assessing the efficiency of the resulting programs. Traditionally, the major indicators of efficiency are time and space complexity. This paper focuses on the former. Functional programming can be split into two camps, strict and lazy. In a strict functional language, analysis of time complexity is straightforward, because of the following compositional rule: The time to evaluate (ƒ(g x)) equals the time to evaluate (g x) plus the time to evaluate (ƒ y), where y is the value of (g x). However, in a lazy language, this rule only gives an upper bound, possibly a crude one. For example, if ƒ is the head function, then (g x) need only be evaluated far enough to determine the first element of the list, and this may take much less time than evaluating (g x) completely. The key to a better analysis is to describe formally just how much of the result “needs” to be evaluated; we call such a description a context. Recent results in strictness analysis show how such contexts can be modelled using the domain-theoretic notion of a projection [WH87]. This paper describes how these results can be applied to the analysis of time complexity. The method used was inspired by work of Bror Bjerner on the complexity analysis of programs in Martin-Löf's type theory [Bje87]. The main contribution of this paper is to simplify Bjerner's notation, and to show how contexts can replace his “demand notes”. The language used in this paper is a first-order language. This restriction is made because context analysis for higher-order languages is still under development. An approach to higher-order context analysis is outlined in [Hug87b]. Context analysis is based on backwards analysis, rather than the earlier approach of abstract interpretation; both are discussed in [AH87]. Some work on complexity analysis [Weg75,LeM85] has concentrated on automated analysis: algorithms that derive a closed form for the time complexity of a program. The goal here is less ambitious. We are simply concerned with describing a method of converting a functional program into a series of equations that describe its time complexity. This modest beginning is a necessary precursor to any automatic analysis. The time equations can be solved by traditional methods, yielding either an exact solution, an upper bound, or an approximate solution. (Incidentally, although [LeM85] claims to analyse a lazy language, the analysis uses exactly the composition rule above, and so is more suited for a strict language.) The analysis given here involves two kinds of equations. First are equations defining projection transformers that specify how much of a value is “needed”.
Second are equations that specify the time complexity; these depend on the projection transformers defined by the first equations. In both cases, we will be more concerned with setting up the equations than with finding their solutions. As already noted, traditional methods may be applied to satisfy the time complexity equations. Solving the projection transformer equations is more problematic. In some cases, we can find an appropriate solution by choosing an appropriate finite domain of projections, and then applying the method of [WH87] to find the solution in this domain. In other cases, no finite domain of solutions is appropriate, and we will find a solution by a more ad-hoc method: guessing one and verifying that it satisfies the required conditions. More work is required to determine what sort of solutions to the projection transformer equations will be most useful for time analysis, and how to find these solutions. This paper is organized as follows. Section 1 describes the language to be analysed. Section 2 presents the evaluation model. Section 3 gives a form of time analysis suitable for strict evaluation. Section 4 shows how projections can describe what portion of a value is “needed”, and introduces projection transformers. Section 5 gives the time analysis for lazy evaluation. Section 6 presents a useful extension to the analysis method. Section 7 concludes.
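To see concretely why the strict compositional rule is only a crude upper bound under lazy evaluation, here is a toy sketch (our own, with a Python generator standing in for a lazy list; it is not the projection-based analysis of the paper):

```python
# Toy illustration: under lazy evaluation, head (map f xs) only pays for the
# first element, so the strict rule T(f (g x)) = T(g x) + T(f y)
# over-approximates the real cost. A generator stands in for a lazy list.
work = 0

def expensive(x):
    global work
    work += 1            # count one unit of work per element actually forced
    return x * x

def lazy_map(f, xs):
    return (f(x) for x in xs)   # nothing is evaluated yet

def head(it):
    return next(it)             # forces only the first element

xs = range(1_000_000)
print(head(lazy_map(expensive, xs)), "work units:", work)   # prints: 0 work units: 1
```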