
Unité Mixte de Recherche 5104 CNRS - INPG - UJF

Centre Equation

2, avenue de VIGNATE

F-38610 GIERES

tel : +33 456 52 03 40

fax : +33 456 52 03 50

http://www-verimag.imag.fr

Proving Inter-Program Properties

Andrei Voronkov, Iman Narasamdya

Verimag Research Report n° TR-2008-13

September 2008

Reports are downloadable at the following address

http://www-verimag.imag.fr


Proving Inter-Program Properties

Andrei Voronkov, Iman Narasamdya

The University of Manchester

voronkov@cs.man.ac.uk

Verimag

Iman.Narasamdya@imag.fr

September 2008

Abstract

We develop foundations for proving properties relating two programs. Our formalization is based on a suitably adapted notion of program invariant for a single program. First, we give an abstract formulation of the theory of program invariants based on the notion of assertion function: a function that assigns assertions to program points. Then, we develop this abstract notion further so that it can be used to prove properties between two programs. We describe two applications of the theory. One application is in translation validation for optimizing compilers, and the other is in the certification of smart-card applications in the framework of Common Criteria. The latter application is part of an industrial project conducted at the Verimag laboratory.

Keywords: Assertion Function, Invariant, Translation Validation, Common Criteria Certification

Reviewers: Laurent Mounier

Notes:

How to cite this report:

@techreport { ,

title = { Proving Inter-Program Properties},

author = { Andrei Voronkov and Iman Narasamdya},

institution = { Verimag Research Report },

number = {TR-2008-13},

year = { },

note = { }

}


1 Introduction

Techniques for proving properties relating two programs have become important in program verification. Verifying a program consists of proving that the program satisfies a given specification. The specification is usually written in a formal language such as first-order or temporal logic. However, in some settings, such as software evaluation and certification, the formal specification itself is often not available, and we are only given a model of the specification. This model is essentially a program written in a simple language. To prove the correctness of our program, we first formulate the property relating the program and the model. For example, our program is correct with respect to the model if the two perform the same sequence of function calls when run on the same input. Such a property between two programs is called an inter-program property throughout this report.

Inter-program properties describe relationships between two programs. A relationship between two

programs includes a mapping between locations and a relationship between variables of the two programs.

Moreover, inter-program properties also involve run-time behaviors of the two programs. Consider the

following two programs:

P:

    i := 0
    while (i < 100) do
        i := i + 1
        q:
    od
    return i

P′:

    i′ := 0
    while (i′ < 100) do
        i′ := i′ + 2
        q′:
    od
    return i′

We want to prove that P and P′ are semantically equivalent. That is, for every pair of runs of the two programs, one run terminates if and only if the other does, and if the runs terminate, they return the same value. We first assert i = i′ at q and q′. Then, we argue that P and P′ are equivalent as follows. From the entries of P and P′, by taking two iterations of the loop in P and a single iteration of the loop in P′, one can reach q and q′ such that the values of i and i′ coincide. From q and q′, knowing that the values of i and i′ coincide (that is, the equality i = i′ holds), there are two possibilities depending on the values of i and i′. One possibility is to follow the same paths as before and reach q and q′ again such that the values of i and i′ coincide. The other possibility is to exit the loops, in which case the values of i and i′ still coincide. These two possibilities show that the runs of P and P′ are either both terminating or both non-terminating. The second possibility shows that, on termination, both runs return the same value.
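The pairing of loop iterations described above can be checked on the concrete runs. The following Python sketch is only an illustration and not part of the formal development: the function names run_P and run_P_prime are hypothetical, and each function records the value of the counter every time control passes the label q (respectively q′).

```python
def run_P():
    """Simulate P; return (returned value, values of i observed at q)."""
    i = 0
    at_q = []
    while i < 100:
        i = i + 1
        at_q.append(i)  # label q
    return i, at_q


def run_P_prime():
    """Simulate P'; return (returned value, values of i' observed at q')."""
    i = 0
    at_q = []
    while i < 100:
        i = i + 2
        at_q.append(i)  # label q'
    return i, at_q


result, trace = run_P()
result_prime, trace_prime = run_P_prime()

# Two iterations of P correspond to one iteration of P', so every
# second visit to q is paired with one visit to q'.
paired = trace[1::2]
assert paired == trace_prime   # i = i' holds at each paired visit
assert result == result_prime  # both runs return the same value
```

Running two concrete executions only tests the claim on these runs; the theory developed in this report is what turns such a pairing into a proof.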

The notion of semantic equivalence is an example of an inter-program property. This notion is heavily used in compiler verification, particularly in the translation validation approach and in certifying compilers. In the translation validation approach [11], for each compilation, one proves that the source and the target programs are semantically equivalent. In a certifying compiler in particular, the compiler must produce a certificate of this equivalence.

One might also be interested in the notion of safe implementation. For example, a program is a safe implementation of another program if the sequence of observable behaviors performed by the former is a subsequence of that of the latter. Consider the programs P and P′ above and imagine that there is a function call f(i) at q and a function call f(i′) at q′. Let function calls and return values be the only observable behaviors. P and P′ are no longer equivalent because they perform different sequences of function calls. Nonetheless, one can prove that P′ is a safe implementation of P.
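The subsequence condition can be illustrated on the two example programs. The Python sketch below is an assumption-laden illustration: the helper names observe_P, observe_P_prime and is_subsequence are hypothetical, and observable behaviors are modeled as tuples ("f", value) for calls and ("return", value) for the returned value.

```python
def observe_P():
    """Observable behaviors of P with a call f(i) placed at q."""
    events = []
    i = 0
    while i < 100:
        i = i + 1
        events.append(("f", i))  # call f(i) at q
    events.append(("return", i))
    return events


def observe_P_prime():
    """Observable behaviors of P' with a call f(i') placed at q'."""
    events = []
    i = 0
    while i < 100:
        i = i + 2
        events.append(("f", i))  # call f(i') at q'
    events.append(("return", i))
    return events


def is_subsequence(short, long):
    """True if `short` can be obtained from `long` by deleting elements."""
    remaining = iter(long)
    # `item in remaining` advances the iterator until it finds `item`,
    # so the relative order of events is respected.
    return all(item in remaining for item in short)


obs_P = observe_P()
obs_P_prime = observe_P_prime()

assert obs_P != obs_P_prime                    # not equivalent
assert is_subsequence(obs_P_prime, obs_P)      # P' safely implements P
assert not is_subsequence(obs_P, obs_P_prime)  # but not the converse
```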

Standard techniques for proving properties of a single program have been studied for four decades [4, 5]. However, although many kinds of inter-program property have been used in program verification, there is no adequate basis for describing inter-program properties formally, such that a rigorous standard is established for certificates and proofs of such properties. We propose in this report an abstract theory of inter-program properties. The theory is based on the notion of assertion function: a function that assigns assertions to program points. For example, in the above program P we can assert that i ≤ 100 at q by defining an assertion function I that maps q to i ≤ 100.

The formalization of our theory is based on a suitably adapted notion of program invariant for a single

program. We introduce the notion of extendible assertion function as a constructive notion for describing


and proving program invariants. An assertion function I of a program is extendible if, for every run of the program reaching a point $p_1$ on which I is defined and at which the assertion defined at $p_1$ holds, we can always extend the run so that it reaches a point $p_2$ on which I is defined and at which the assertion at $p_2$ holds. For example, suppose that we define an assertion function I of the program P above such that, on the entry and exit of P, I is defined as true, and on q, I is defined as i ≤ 100. The function I is extendible because if a run reaches q such that i ≤ 100 holds, then we can extend the run either to reach q again or to reach the exit of P, and the assertions defined at those points will also hold.
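The extension step in this example can be tested on a finite sample of states. The Python sketch below is an illustration under stated assumptions, not the formal definition: the names I, next_asserted and extends_from are hypothetical, a state is reduced to the value of i, and checking a finite range of values is a test rather than a proof.

```python
# Assertion function I for P: entry and exit map to true, q maps to i <= 100.
I = {
    "entry": lambda i: True,
    "q": lambda i: i <= 100,
    "exit": lambda i: True,
}


def next_asserted(point, i):
    """Advance the run of P to the next point on which I is defined."""
    if point == "entry":
        i = 0  # i := 0
    elif point != "q":
        raise ValueError("no transition from the exit point")
    if i < 100:  # loop test
        return "q", i + 1  # i := i + 1, then reach q
    return "exit", i


def extends_from(point, i):
    """If I holds at (point, i), the assertion at the next point holds too."""
    if not I[point](i):
        return True  # nothing to show when the assertion does not hold
    next_point, next_i = next_asserted(point, i)
    return I[next_point](next_i)


checked_q = all(extends_from("q", i) for i in range(-1000, 1001))
checked_entry = all(extends_from("entry", i) for i in range(-1000, 1001))
assert checked_q and checked_entry
```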

We develop the notion of extendible assertion function further so that it can be used to prove inter-program properties. To this end, we consider the two programs as a pair of programs with disjoint sets of variables. For example, to assert that i = i′ at q and q′ in the programs P and P′ above, we define an assertion function I of (P, P′) such that I maps (q, q′) to i = i′. We will show in this report that meta-properties that hold for a single program also hold for a pair of programs. Furthermore, since we are interested in certificates, we develop a notion of verification condition to serve as a certificate. A verification condition is itself a set of assertions. A certificate can be turned into a proof by proving that all assertions in the verification condition are valid.

In this report we discuss two prominent applications of the theory of inter-program properties. The first application is translation validation, where we focus on translation validation for optimizing compilers. We can show that the notion of extendible assertion function can capture the inter-program properties used in all existing works on translation validation for optimizing compilers. In Section 5 we will discuss its application to our previous work on finding basic block and variable correspondence [8] and briefly mention how our notion of weakly extendible assertion function and the corresponding notion of verification condition can be used to certify other approaches.

The other application is in software certification. We describe an industrial project for certifying smart-card applications at the Verimag laboratory. In this project, we show that, using our theory, we can provide certificates that certify properties between different models of a specification in the framework of Common Criteria [1].

In summary, the contributions of this report are the following:

• A theory of inter-program properties as an adequate basis for describing and proving properties

relating two programs.

• Applications of the theory in compiler verification and in software certification.

The outline of this report is as follows. We first describe the main assumptions used in the theory

of inter-program properties. We then develop a theory of properties of a single program. We call such

properties intra-program properties. Then, we develop the theory further so that it can be used to prove

inter-program properties. Having the theory of inter-program properties, we then discuss two applications

of the theory in translation validation and in Common Criteria certification.

2 Main Assumptions

Our formalization will be based on standard assumptions about programs and their semantics. We assume that a program consists of a finite set of program points. For example, a program point of a program P can be the entry or the exit of a sequence of statements (or a block) in P. We denote by $\mathit{Point}_P$ the set of program points of P. A program-point flow graph of P is a finite directed graph whose nodes are the program points of P. In the sequel, we assume that every program P we are dealing with is associated with a program-point flow graph, denoted by $G_P$.

We assume that every program has a unique entry point and a unique exit point. Denote by entry(P)

and exit(P), respectively, the entry and the exit point of program P. We assume that the program-point

flow graph contains no edge into the entry point and no edge from the exit point.

We describe the run-time behavior of a program as sequences of configurations. A configuration of a program run consists of a program point and a mapping from variables to values. Such a mapping is called a state. The variables used in a state do not necessarily coincide with variables of the program. For example, we may consider memory to be a variable. Formally, a configuration is a pair (p,σ), where p is a program


point and σ is a state. A configuration (p,σ) is called an entry configuration for P if p = entry(P), and

an exit configuration for P if p = exit(P). For a configuration γ, we denote by pp(γ) the program point

of γ and by state(γ) the state of this configuration.

We assume that the semantics of a program P is defined as a transition relation $\mapsto_P$ with transitions of the form $(p_1,\sigma_1) \mapsto_P (p_2,\sigma_2)$, where $p_1, p_2$ are program points, $\sigma_1, \sigma_2$ are states, and $(p_1,p_2)$ is an edge in the program-point flow graph of P.

DEFINITION 2.1 (Computation Sequence, Run) A computation sequence of a program P is either a finite or an infinite sequence of configurations

$$(p_0,\sigma_0), (p_1,\sigma_1), \ldots, \tag{1}$$

where $(p_i,\sigma_i) \mapsto_P (p_{i+1},\sigma_{i+1})$ for all i. A run R of a program P from an initial state $\sigma_0$ is a computation sequence (1) such that $p_0 = \mathit{entry}(P)$. A run is complete if it cannot be extended, that is, it is either infinite or terminates at an exit configuration.
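To make these notions concrete, the following Python sketch models the example program P from the introduction as a transition relation over configurations and enumerates its run. It is a minimal illustration under assumptions of our own: the intermediate program point "head" (the loop test), the names EDGES, step and run, and the bound max_steps (needed because runs may be infinite in general) do not appear in the report.

```python
from typing import Dict, Iterator, Optional, Tuple

State = Dict[str, int]      # a state maps variables to values
Config = Tuple[str, State]  # a configuration is (program point, state)

# Program-point flow graph of P: no edge into entry, none out of exit.
EDGES = {("entry", "head"), ("head", "q"), ("head", "exit"), ("q", "head")}


def step(config: Config) -> Optional[Config]:
    """Deterministic transition relation of P; None at an exit configuration."""
    point, state = config
    if point == "entry":
        return "head", {**state, "i": 0}  # i := 0
    if point == "head":
        if state["i"] < 100:              # loop test
            return "q", {**state, "i": state["i"] + 1}
        return "exit", state
    if point == "q":
        return "head", state
    return None


def run(initial: State, max_steps: int = 10_000) -> Iterator[Config]:
    """Yield the run of P from `initial`, truncated after max_steps."""
    config: Optional[Config] = ("entry", initial)
    for _ in range(max_steps):
        if config is None:
            return
        yield config
        config = step(config)


configs = list(run({"i": 7}))
# Every transition follows an edge of the flow graph.
follows_edges = all(
    (a[0], b[0]) in EDGES for a, b in zip(configs, configs[1:])
)
assert follows_edges
assert configs[-1] == ("exit", {"i": 100})  # the run is complete
```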

For two configurations $\gamma_1, \gamma_2$, we write $\gamma_1 \mapsto_P^{*} \gamma_2$ to denote that there is a computation sequence of P starting at $\gamma_1$ and ending at $\gamma_2$. We say that a computation sequence is trivial if it is a sequence of length 1.

We introduce two restrictions on the semantics of programs. First, we assume that programs are deterministic. That is, for every program P, given a configuration $\gamma_1$, there exists at most one configuration $\gamma_2$ such that $\gamma_1 \mapsto_P \gamma_2$. Second, we assume that, for every program P and for every non-exit configuration $\gamma_1$ of P's run, there exists a configuration $\gamma_2$ such that $\gamma_1 \mapsto_P \gamma_2$, that is, a complete run may only terminate in an exit configuration. Our results can easily be generalized by dropping these restrictions. Indeed, one can view a non-deterministic program as a deterministic program having an additional input variable x whose value is an infinite sequence of numbers; these numbers are used to decide which of the non-deterministic choices should be made. Further, if a program computation can terminate in a state different from the exit state, we can add an artificial transition from this state to the exit state. After such a modification we can also consider arbitrary non-deterministic programs.

Further, we assume some assertion language in which one can write assertions involving variables and express properties of states. For example, the assertion language may be some first-order language. The set of all assertions is denoted by $\mathit{Assertion}$. We will use meta-variables $\alpha$, $\phi$, $\varphi$, and $\psi$, along with their primed, subscripted, and superscripted variants, to range over assertions. We write $\sigma \models \alpha$ to mean that an assertion $\alpha$ is true in a state $\sigma$, and also say that $\sigma$ satisfies $\alpha$, or that $\alpha$ holds at $\sigma$. We say that an assertion $\alpha$ is valid if $\sigma \models \alpha$ for every state $\sigma$. We will also use a similar notation for configurations: for a configuration $(p,\sigma)$ and assertion $\alpha$ we write $(p,\sigma) \models \alpha$ if $\sigma \models \alpha$. We also write $\sigma \not\models \alpha$ to mean that an assertion $\alpha$ is false in $\sigma$, or that $\sigma$ does not satisfy $\alpha$. We assume that the assertion language is closed under the standard propositional connectives and respects their semantics, for example $\sigma \models \neg\alpha$ if and only if $\sigma \not\models \alpha$.

To ease readability we introduce the following notation: for all assertions $\alpha$, $\alpha_1$, and $\alpha_2$, and for every state $\sigma$,

$$\begin{array}{lcl}
\alpha_1 \wedge \alpha_2 & \text{for} & \alpha, \text{ where } \sigma \models \alpha \text{ if and only if } \sigma \models \alpha_1 \text{ and } \sigma \models \alpha_2 \\
\alpha_1 \vee \alpha_2 & \text{for} & \alpha, \text{ where } \sigma \models \alpha \text{ if and only if } \sigma \models \alpha_1 \text{ or } \sigma \models \alpha_2 \\
\neg\alpha_1 & \text{for} & \alpha, \text{ where } \sigma \models \alpha \text{ if and only if } \sigma \not\models \alpha_1 \\
\alpha_1 \Rightarrow \alpha_2 & \text{for} & \alpha, \text{ where } \sigma \models \alpha \text{ if and only if } \sigma \models \alpha_2 \text{ whenever } \sigma \models \alpha_1
\end{array}$$
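One way to picture the satisfaction relation and the propositional connectives is to treat an assertion as a predicate on states. The Python sketch below is only an illustrative model under that assumption; the report leaves the assertion language abstract, and the names satisfies, conj, disj, neg and implies are hypothetical. Validity quantifies over all states, so it cannot be decided by evaluating a predicate on finitely many states.

```python
from typing import Callable, Dict

State = Dict[str, int]
Assertion = Callable[[State], bool]  # assumption: assertions as predicates


def satisfies(sigma: State, alpha: Assertion) -> bool:
    """sigma |= alpha."""
    return alpha(sigma)


def conj(a1: Assertion, a2: Assertion) -> Assertion:
    return lambda s: a1(s) and a2(s)


def disj(a1: Assertion, a2: Assertion) -> Assertion:
    return lambda s: a1(s) or a2(s)


def neg(a1: Assertion) -> Assertion:
    return lambda s: not a1(s)


def implies(a1: Assertion, a2: Assertion) -> Assertion:
    # sigma |= (a1 => a2) iff sigma |= a2 whenever sigma |= a1
    return lambda s: (not a1(s)) or a2(s)


at_most_100: Assertion = lambda s: s["i"] <= 100
non_negative: Assertion = lambda s: s["i"] >= 0

sigma = {"i": 42}
assert satisfies(sigma, conj(at_most_100, non_negative))
assert not satisfies(sigma, neg(at_most_100))
assert satisfies({"i": 200}, implies(at_most_100, non_negative))
```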

3 Intra-Program Properties

In this section we introduce the notion of program invariant for a single program and some related notions

that make it more suitable to present inter-program properties later.

3.1 Program Invariants

We introduce the notion of assertion function that associates program points with assertions. An assertion

function for a program P is a partial function

$$I : \mathit{Point}_P \to \mathit{Assertion}$$
