
Unité Mixte de Recherche 5104 CNRS - INPG - UJF

Centre Equation

2, avenue de VIGNATE

F-38610 GIERES

tel : +33 456 52 03 40

fax : +33 456 52 03 50

http://www-verimag.imag.fr

Proving Inter-Program Properties

Andrei Voronkov, Iman Narasamdya

Verimag Research Report no. TR-2008-13

September 2008

Reports are downloadable at the following address

http://www-verimag.imag.fr


Proving Inter-Program Properties

Andrei Voronkov, Iman Narasamdya

The University of Manchester

voronkov@cs.man.ac.uk

Verimag

Iman.Narasamdya@imag.fr

September 2008

Abstract

We develop foundations for proving properties relating two programs. Our formalization is

based on a suitably adapted notion of program invariant for a single program. First, we give

an abstract formulation of the theory of program invariants based on the notion of assertion

function: a function that assigns assertions to program points. Then, we develop this abstract

notion further so that it can be used to prove properties between two programs. We describe

two applications of the theory. One application is in translation validation for optimizing

compilers, and the other is in the certification of smart-card applications in the framework

of Common Criteria. The latter application is part of an industrial project conducted at

Verimag laboratory.

Keywords: Assertion Function, Invariant, Translation Validation, Common Criteria Certification

Reviewers: Laurent Mounier

Notes:

How to cite this report:

@techreport { ,

title = { Proving Inter-Program Properties},

author = { Andrei Voronkov and Iman Narasamdya},

institution = { Verimag Research Report },

number = {TR-2008-13},

year = {2008},

note = { }

}


1 Introduction

Techniques for proving properties between two programs have become important in the area of program verification. The verification of a program consists of proving that the program satisfies a given specification. The specification is usually written in a formal language such as first-order or temporal logic. However, in some cases, like software evaluation and certification, the formal specification itself is often not available, and we are only given a model of the specification. This model is essentially a program written in a simple language. To prove the correctness of our program, we first formulate the property relating the program and the model. For example, our program is correct with respect to the model if they perform the same sequence of function calls when both of them are run on the same input. Such a property between two programs is called an inter-program property throughout this report.

Inter-program properties describe relationships between two programs. A relationship between two

programs includes a mapping between locations and a relationship between variables of the two programs.

Moreover, inter-program properties also involve run-time behaviors of the two programs. Consider the

following two programs:

P:

i := 0
while (i < 100) do
  i := i + 1
  q :
od
return i

P′:

i′ := 0
while (i′ < 100) do
  i′ := i′ + 2
  q′ :
od
return i′

We want to prove that P and P′ are semantically equivalent. That is, for every pair of runs of both programs, one run is terminating if and only if the other is, and if the runs are terminating, they return the same value. We first assert i = i′ at q and q′. Then, we argue that P and P′ are equivalent with the following reasoning. From the entries of P and P′, by taking two iterations of the loop in P and a single iteration of the loop in P′, one can reach q and q′ such that the values of i and i′ coincide. From q and q′, knowing that the values of i and i′ coincide (or the equality i = i′ holds), there are two possibilities depending on the values of i and i′. One possibility is to follow the same paths as before and reach q and q′ again such that the values of i and i′ coincide. The other possibility is to exit the loops, in which case the values of i and i′ remain equal. These two possibilities show that both runs of P and P′ are either terminating or non-terminating. The second possibility shows that on termination, both runs return the same value.
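The informal equivalence argument above can be checked concretely by simulation. The following is a minimal Python sketch (ours, not part of the report; the function names are invented) that runs both programs, records the value of i at q and of i′ at q′, and checks the cut-point relation i = i′ under the pairing of two iterations of P with one iteration of P′:

```python
# Sketch (ours): simulating programs P and P' of the introduction.

def run_P(trace=None):
    i = 0
    while i < 100:
        i = i + 1
        if trace is not None:
            trace.append(i)   # value of i at program point q
    return i

def run_P_prime(trace=None):
    i2 = 0
    while i2 < 100:
        i2 = i2 + 2
        if trace is not None:
            trace.append(i2)  # value of i' at program point q'
    return i2

# Both complete runs terminate and return the same value.
assert run_P() == run_P_prime() == 100

tP, tP2 = [], []
run_P(tP)
run_P_prime(tP2)
# Pairing two iterations of P with one iteration of P': i = i' at (q, q').
assert tP[1::2] == tP2
```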

The notion of semantic equivalence is an example of an inter-program property. Such a notion is heavily used in compiler verification, particularly in the translation validation approach and in certifying compilers. In the translation validation approach [11], for each compilation, one proves that the source and the target programs are semantically equivalent. In particular, a certifying compiler must produce a certificate certifying such an equivalence.

One might also be interested in the notion of safe implementation. For example, a program is a safe implementation of another program if the sequence of observable behaviors performed by the former program is a subsequence of that of the latter program. Consider the programs P and P′ above and imagine that there is a function call f(i) at q and q′. Let function calls and return values be the only observable behaviors. P and P′ are no longer equivalent because they perform different sequences of function calls. Nonetheless, one can prove that P′ is a safe implementation of P.

Standard techniques for proving properties of a single program have been studied for four decades [4, 5]. However, although many kinds of inter-program properties have been used in program verification, there is no adequate basis for describing inter-program properties formally such that a rigorous standard is established for certificates and proofs about such properties. We propose in this report an abstract theory of inter-program properties. The theory is based on the notion of assertion function: a function that assigns assertions to program points. For example, in the above program P we can assert that i ≤ 100 at q by defining an assertion function I such that I maps q to i ≤ 100.

The formalization of our theory is based on a suitably adapted notion of program invariant for a single

program. We introduce the notion of extendible assertion function as a constructive notion for describing


and proving program invariants. An assertion function I of a program is extendible if for every run of the program reaching a point p1 on which I is defined such that the assertion defined at p1 holds, we can always extend the run so that it reaches a point p2 on which I is defined and the assertion at p2 holds. For example, suppose that we define an assertion function I of the program P above such that, on the entry and exit of P, I is defined as true, and on q, I is defined as i ≤ 100. The function I is extendible because if a run reaches q such that i ≤ 100 holds, then we can extend the run either to reach q again or to reach the exit of P, and the assertions defined at those points will also hold.

We develop further the notion of extendible assertion function so that it can be used to prove inter-program properties. To this end, we consider the two programs as a pair of programs with disjoint sets of variables. For example, to assert that i = i′ at q and q′ in the programs P and P′ above, we define an assertion function I of (P, P′) such that I maps (q, q′) to i = i′. We will show in this report that meta-properties that hold for the case of a single program also hold for the case of a pair of programs. Furthermore, since we are interested in certificates, we develop the notion of verification condition as a form of certificate. A verification condition itself is a set of assertions. A certificate can be turned into a proof by proving that all assertions in the verification condition are valid.

In this report we discuss two prominent applications of the theory of inter-program properties. The first application is translation validation. We focus on translation validation for optimizing compilers. We can show that the notion of extendible assertion function can capture the inter-program properties used in all existing works on translation validation for optimizing compilers. In Section 5 we will discuss its application to our previous work on finding basic block and variable correspondences [8] and briefly mention how our notion of weakly extendible assertion function and the corresponding notion of verification condition can be used to certify other approaches.

The other application is in software certification. We describe an industrial project for certifying smart-card applications at Verimag laboratory. In this project, we show that, using our theory, we can provide certificates that certify properties between different models of a specification in the framework of Common Criteria [1].

In summary, the contributions of this report are the following:

• A theory of inter-program properties as an adequate basis for describing and proving properties

relating two programs.

• Applications of the theory in compiler verification and in software certification.

The outline of this report is as follows. We first describe the main assumptions used in the theory

of inter-program properties. We then develop a theory of properties of a single program. We call such

properties intra-program properties. Then, we develop the theory further so that it can be used to prove

inter-program properties. Having the theory of inter-program properties, we then discuss two applications

of the theory in translation validation and in Common Criteria certification.

2 Main Assumptions

Our formalization will be based on standard assumptions about programs and their semantics. We assume that a program consists of a finite set of program points. For example, a program point of a program P can be the entry or the exit of a sequence of statements (or a block) in P. We denote by PointP the set of program points of P. A program-point flow graph of P is a finite directed graph whose nodes are the program points of P. In the sequel, we assume that every program P we are dealing with is associated with a program-point flow graph, denoted by GP.

We assume that every program has a unique entry point and a unique exit point. Denote by entry(P)

and exit(P), respectively, the entry and the exit point of program P. We assume that the program-point

flow graph contains no edge into the entry point and no edge from the exit point.

We describe the run-time behavior of a program as sequences of configurations. A configuration of a program run consists of a program point and a mapping from variables to values. Such a mapping is called a state. The variables used in a state do not necessarily coincide with variables of the program. For example, we may consider memory to be a variable. Formally, a configuration is a pair (p, σ), where p is a program point and σ is a state. A configuration (p, σ) is called an entry configuration for P if p = entry(P), and an exit configuration for P if p = exit(P). For a configuration γ, we denote by pp(γ) the program point of γ and by state(γ) the state of this configuration.

We assume that the semantics of a program P is defined as a transition relation →P with transitions of the form (p1, σ1) →P (p2, σ2), where p1, p2 are program points, σ1, σ2 are states, and (p1, p2) is an edge in the program-point flow graph of P.

DEFINITION 2.1 (Computation Sequence, Run) A computation sequence of a program P is either a finite or an infinite sequence of configurations

(p0, σ0), (p1, σ1), ...,   (1)

where (pi, σi) →P (pi+1, σi+1) for all i. A run R of a program P from an initial state σ0 is a computation sequence (1) such that p0 = entry(P). A run is complete if it cannot be extended, that is, it is either infinite or terminates at an exit configuration.
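These notions can be modeled directly. Below is a small Python sketch (our own modeling, with invented names, not from the report) representing configurations as (program point, state) pairs and a run as the iteration of a deterministic step function, instantiated for the loop program P of the introduction with program points entry, q, and exit:

```python
# Sketch (ours): configurations, a transition relation, and complete runs.
from typing import Callable, Dict, Tuple

Config = Tuple[str, Dict[str, int]]  # (program point, state)

def run(step: Callable[[Config], Config], entry_state: Dict[str, int],
        exit_point: str = "exit", max_steps: int = 100000):
    """Compute the complete run from an entry configuration (bounded)."""
    cfg: Config = ("entry", entry_state)
    trace = [cfg]
    for _ in range(max_steps):
        if cfg[0] == exit_point:
            break                    # run terminates at an exit configuration
        cfg = step(cfg)
        trace.append(cfg)
    return trace

def step_P(cfg: Config) -> Config:
    """Deterministic transition relation of program P (point labels are ours)."""
    p, s = cfg
    if p == "entry":
        return ("q", {"i": 1})       # i := 0, then one loop iteration to q
    if p == "q":
        return ("q", {"i": s["i"] + 1}) if s["i"] < 100 else ("exit", s)
    raise ValueError(p)

trace = run(step_P, {})
assert trace[0] == ("entry", {})
assert trace[-1] == ("exit", {"i": 100})
```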

For two configurations γ1, γ2, we write γ1 →*P γ2 to denote that there is a computation sequence of P starting at γ1 and ending at γ2. We say that a computation sequence is trivial if it is a sequence of length 1.

We introduce two restrictions on the semantics of programs. First, we assume that programs are deterministic. That is, for every program P, given a configuration γ1, there exists at most one configuration γ2 such that γ1 →P γ2. Second, we assume that, for every program P and for every non-exit configuration γ1 of P’s run, there exists a configuration γ2 such that γ1 →P γ2; that is, a complete run may only terminate in an exit configuration. Our results can easily be generalized by dropping these restrictions. Indeed, one can view a non-deterministic program as a deterministic program having an additional input variable x whose value is an infinite sequence of numbers; these numbers are used to decide which of the non-deterministic choices should be made. Further, if a program computation can terminate in a state different from the exit state, we can add an artificial transition from this state to the exit state. After such a modification we can also consider arbitrary non-deterministic programs.

Further, we assume some assertion language in which one can write assertions involving variables and express properties of states. For example, the assertion language may be some first-order language. The set of all assertions is denoted by Assertion. We will use meta variables α, φ, ϕ, and ψ, along with their primed, subscript, and superscript notations, to range over assertions. We write σ |= α to mean that an assertion α is true in a state σ, and also say that σ satisfies α, or that α holds at σ. We say that an assertion α is valid if σ |= α for every state σ. We will also use a similar notation for configurations: for a configuration (p, σ) and an assertion α we write (p, σ) |= α if σ |= α. We also write σ ⊭ α to mean that α is false in σ, or that σ does not satisfy α. We assume that the assertion language is closed under the standard propositional connectives and respects their semantics; for example, σ |= ¬α if and only if σ ⊭ α.

To ease readability we introduce the following notation: for all assertions α, α1, and α2, and for every state σ, we write

α1 ∧ α2  for  α, where σ |= α if and only if σ |= α1 and σ |= α2;
α1 ∨ α2  for  α, where σ |= α if and only if σ |= α1 or σ |= α2;
¬α1      for  α, where σ |= α if and only if σ ⊭ α1;
α1 ⇒ α2  for  α, where σ |= α if and only if σ |= α2 whenever σ |= α1.


3 Intra-Program Properties

In this section we introduce the notion of program invariant for a single program and some related notions

that make it more suitable to present inter-program properties later.

3.1 Program Invariants

We introduce the notion of assertion function that associates program points with assertions. An assertion function for a program P is a partial function

I : PointP → Assertion

mapping program points of P to assertions such that I(entry(P)) and I(exit(P)) are defined. The notion

of assertion function generalizes the notion of program invariant: one can consider I as a collection of

invariants associated with program points. The requirement that I is defined on the entry and exit points is

purely technical and not restrictive, for one can always define I(entry(P)) and I(exit(P)) as ⊤, that is,

an assertion that holds at every state.

Given an assertion function I, we call a program point p I-observable if I(p) is defined. A configuration (p, σ) is called I-observable if so is its program point p. We say that a configuration γ = (p, σ) satisfies I, denoted by γ |= I, if I(p) is defined and σ |= I(p). We will also say that I is defined on γ if it is defined on p, and write I(γ) to denote I(p).

DEFINITION 3.1 (Program Invariant) Let I be an assertion function of a program P. The function I is said to be a program invariant of P if for every run

γ0, γ1, ...

of the program such that γ0 |= I, we have γi |= I for all i ≥ 0 whenever I is defined on pp(γi). □

In other words, an assertion function is an invariant if and only if for every program run from an entry

configuration satisfying I, every observable configuration of this run satisfies I too.

This notion of invariant is useful for asserting that a program satisfies some properties, including partial correctness of a program. Recall that a program P is partially correct with respect to a precondition ϕ and a postcondition ψ, denoted by {ϕ}P{ψ}, if for every run of P from a configuration satisfying ϕ and reaching an exit configuration, this exit configuration satisfies ψ. Likewise, a program P is totally correct with respect to a precondition ϕ and a postcondition ψ, denoted by [ϕ]P[ψ], if every run of P from a configuration satisfying ϕ terminates in an exit configuration and this exit configuration satisfies ψ.

THEOREM 3.2 Let P be a program and ϕ,ψ be assertions. Let I be an assertion function for P such that

I(entry(P)) = ϕ and I(exit(P)) = ψ. If I is an invariant, then {ϕ}P{ψ}. If, in addition, I is only

defined on the entry and the exit points, then I is an invariant if and only if {ϕ}P{ψ}.

PROOF. Suppose that I is an invariant of P and γ1 →*P γ2, where γ1 is an entry configuration, γ2 is an exit configuration, and γ1 |= ϕ. Then γ1 |= I. Using this and the fact that γ2 is I-observable, we obtain γ2 |= I, that is, γ2 |= ψ.

Now suppose that I is only defined on the entry and the exit points, and {ϕ}P{ψ}. Consider any complete run of P from a configuration γ1 that satisfies ϕ. We have to show that every I-observable configuration of this run also satisfies I. It is obvious that γ1 |= I. The only observable configuration of this run different from γ1 may be an exit configuration γ2, in which case, by our restrictions on programs, the run terminates at this configuration; then by {ϕ}P{ψ} we have γ2 |= ψ, that is, γ2 |= I. □

One can provide a similar characterization of loop invariants using our notion of invariant.

3.2 Extendible Assertion Functions

Our notion of invariant is not immediately useful for proving that a program satisfies some properties.

For proving, we need a more constructive characterization of relations between I and P than just those

expressed by program runs. We introduce the notion of extendible assertion function that provides such a

characterization.

DEFINITION 3.3 Let I be an assertion function of a program P. I is strongly extendible if for every run

γ0, ..., γi

of the program such that i ≥ 0, γ0 |= I, γi |= I, and γi is not an exit configuration, there exists a finite computation sequence

γi, ..., γi+n

such that


1. n > 0,

2. γi+n |= I, and

3. for all j such that i < j < i + n, the configuration γj is not I-observable.

The definition of weakly-extendible assertion function is obtained from this definition by dropping condition 3. □

EXAMPLE 3.4 Let us give an example illustrating the difference between the two notions of extendible

assertion functions. Consider the following program P:

i := 0

j := 0

while (j < 100) do

if (i > j) then j := j + 1

else i := i + 1

fi

q :

od

Define an assertion function I of P such that I(entry(P)) = ⊤ and I(q) = I(exit(P)) = (i = j), and

I(p) is undefined on all program points p different from q and the entry and exit points. Then I is weakly

extendible but not strongly extendible. To show that I is weakly extendible, it is enough to observe the

following properties:

1. From an entry configuration, in two iterations of the loop, one reaches a configuration with the

program point q in which i = j = 1;

2. For every v < 100, from a configuration with the program point q in which i = j = v, in two

iterations of the loop, one can reach a configuration in which i = j = v + 1;

3. For every v ≥ 100, from a configuration with the program point q in which i = j = v, one can reach

an exit configuration in which i = j = v.

To show that I is not strongly extendible, it is sufficient to note that, from any entry configuration, after

one iteration of the loop, one can reach a configuration with the program point q in which i = 1 and j = 0

and so i = j does not hold.

□
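The two observations above can be replayed by simulation. This Python sketch (ours, not from the report) records the configuration at q on every loop iteration and confirms that i = j fails at the first visit but holds at every second visit:

```python
# Sketch (ours): tracing Example 3.4's program at point q.
def trace_q():
    i = j = 0
    visits = []
    while j < 100:
        if i > j:
            j += 1
        else:
            i += 1
        visits.append((i, j))   # configuration at program point q
    return visits, (i, j)

visits, final = trace_q()
assert visits[0] == (1, 0)                       # first visit: i = j fails
assert all(i == j for (i, j) in visits[1::2])    # every second visit: i = j
assert final == (100, 100)                       # exit with i = j
```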

Using the same arguments as in the proof of Theorem 3.2, we can show that weakly-extendible functions are sufficient for proving partial correctness:

THEOREM 3.5 Let I be a weakly-extendible assertion function of a program P such that I(entry(P)) = ϕ and I(exit(P)) = ψ. Then {ϕ}P{ψ}, that is, P is partially correct with respect to the precondition ϕ and the postcondition ψ. □

On the other hand, strongly-extendible assertion functions serve as invariants, as the following theorem

shows:

THEOREM 3.6 Every strongly-extendible assertion function I of a program P is also an invariant of P.

PROOF. We have to show that, for every run γ0,γ1,... of P such that γ0 |= I and every I-observable

configurationγiof this run, we have γi|= I. We will prove it by inductionon i. When i = 0, the statement

is trivial. Suppose i > 0. Take the greatest number j such that 0 ≤ j < i and γjis I-observable. Such a

number exists since γ0is I-observable. By the induction hypothesis, we have γj |= I. By the definition

of strongly-extendible assertion function, we have that there exists an n > 0 and a run γ0,...,γj,...,γn

such that γn |= I and all configurations between γjand γnare not I-observable. Note that both γiand

γnare the first I-observable configurations after γjin their runs. By the assumption that our programs are

deterministic, we obtain γi= γn, so γi|= I.

?


The following theorem shows that for terminating programs there is a closer relationship between strongly-extendible assertion functions and invariants:

THEOREM 3.7 Let an assertion function I be an invariant of P such that I(entry(P)) = α. Let P terminate for every entry configuration satisfying α, that is, every run from an entry configuration satisfying α is finite. Then I is strongly extendible.

PROOF. Take any run γ0, ..., γi of P such that γ0 |= I, γi |= I, and γi is not an exit configuration. We extend this run to a run γ0, ..., γi+n that satisfies the conditions of Definition 3.3. To this end, first extend the run to a complete run

R = γ0, ..., γi, γi+1, ....

Let us show that R contains a configuration γi+n with n > 0 on which I is defined. Such a configuration exists since R is finite and I is defined on the exit configuration of R. Take the smallest such n. Since n is the smallest, I is undefined on all configurations between γi and γi+n in R. Since I is an invariant, we have γi+n |= I. □

The condition on programs to be terminating is not very constructive. We will introduce other sufficient conditions on assertion functions which, on the one hand, guarantee that an invariant is also strongly or weakly extendible, and, on the other hand, make our notion of invariant similar to more traditional ones [7]. To this end, we will use paths in the program-point flow graph GP. Such a path is called trivial if it consists of a single point. To guarantee that an invariant I of a program P is strongly extendible, we require that I be defined on certain program points such that those points break all cycles in GP, that is, every cycle in GP contains at least one of these points. We introduce the notion of covering set to describe this requirement.

DEFINITION 3.8 (Covering Set) Let P be a program and C be a set of program points in P. We say that C covers P if entry(P) ∈ C and every infinite path in GP contains a program point in C. An assertion function I is said to cover P if the set of I-observable program points covers P. □

Any set C that covers P is often called a cut-point set of P.
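For a finite flow graph, C covers P exactly when entry(P) ∈ C and the graph with the points of C removed is acyclic, so that every infinite path must pass through C. A small Python sketch (ours; the graph encoding and point labels are invented) checks this condition by cycle detection:

```python
# Sketch (ours): checking the covering-set condition of Definition 3.8
# on a finite program-point flow graph given as an adjacency dict.

def covers(graph, entry, cut_points):
    if entry not in cut_points:
        return False
    # Remove the cut points; C covers P iff what remains is acyclic.
    nodes = [n for n in graph if n not in cut_points]
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in nodes}

    def has_cycle(n):
        color[n] = GRAY
        for m in graph.get(n, []):
            if m in cut_points:
                continue
            if color.get(m) == GRAY or (color.get(m) == WHITE and has_cycle(m)):
                return True
        color[n] = BLACK
        return False

    return not any(color[n] == WHITE and has_cycle(n) for n in nodes)

# Flow graph of the loop program P of the introduction (labels are ours).
G = {"entry": ["test"], "test": ["body", "exit"], "body": ["q"],
     "q": ["test"], "exit": []}
assert covers(G, "entry", {"entry", "q"})   # q breaks the loop cycle
assert not covers(G, "entry", {"entry"})    # cycle test→body→q→test remains
```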

THEOREM 3.9 Let I be an invariant of P. If I covers P, then I is strongly extendible.

PROOF. Take any run γ0, ..., γi of P such that γ0 |= I, γi |= I, and γi is not an exit configuration. We have to extend this run to a run γ0, ..., γi+n satisfying the conditions of Definition 3.3. To this end, first extend this run to a complete run R = (γ0, ..., γi, γi+1, ...). Let us show that R contains a configuration γi+n with n > 0 on which I is defined. Indeed, if R is finite, then the last configuration of R is an exit configuration, and then I is defined on it. If R is infinite, then the path pp(γi+1), pp(γi+2), ... is infinite, hence contains a program point on which I is defined. Take the smallest positive n such that I is defined on γi+n. Since n is the smallest, I is undefined on all configurations between γi and γi+n in R. Since I is an invariant, we have γi+n |= I. □

EXAMPLE 3.10 Consider again the program P of Example 3.4. Define an assertion function I1 of P such that I1 is defined only on the entry and the exit points, I1(entry(P)) = ⊤, and I1(exit(P)) = (i = j). I1 is an invariant of P, but does not cover P since it is undefined on all points in the loop. Nevertheless, I1 is strongly extendible.

Let us now define another assertion function I2 such that I2(entry(P)) = ⊤, I2(exit(P)) = (i = j), I2(q) = (i > j ⇒ i = j + 1), and I2 is undefined on all other points. I2 is an invariant of P and also strongly extendible. Moreover, I2 covers P. □


3.3 Verification Conditions

Our next aim is to define a notion of verification condition as a collection of formulas and to use these verification conditions to prove properties of programs. We want to define it in such a way that a verification condition guarantees certain properties of programs. To this end, we use the notions of precondition and liberal precondition for programs and paths in program-point flow graphs.

DEFINITION 3.11 (Weakest Liberal Precondition) An assertion ϕ is called the weakest liberal precondition of a program P and an assertion ψ, if

1. {ϕ}P{ψ}, and

2. for every assertion ϕ′ such that {ϕ′}P{ψ}, the assertion ϕ′ ⇒ ϕ is valid.

In general, the weakest liberal precondition may not exist. If it exists, we denote the weakest liberal precondition of P and ψ by wlpP(ψ).

In a similar way, we introduce the notion of a weakest liberal precondition of a path π = (p0, ..., pn) in the flow graph. An assertion ϕ is called a precondition of the path π and an assertion ψ, if, for every state σ0 such that σ0 |= ϕ, there exist states σ1, ..., σn such that

(p0, σ0) → (p1, σ1) → ... → (pn, σn)

and σn |= ψ. An assertion ϕ is called the weakest precondition of π and ψ, denoted by wpπ(ψ), if it is a precondition of π and ψ, and, for every precondition ϕ′ of π and ψ, the assertion ϕ′ ⇒ ϕ is valid.

An assertion ϕ is called a liberal precondition of the path π and an assertion ψ, if, for every sequence σ0, ..., σn of states such that

(p0, σ0) → (p1, σ1) → ... → (pn, σn)

and σ0 |= ϕ, we have σn |= ψ. An assertion ϕ is called the weakest liberal precondition of π and ψ, denoted by wlpπ(ψ), if it is a liberal precondition of π and ψ, and, for every liberal precondition ϕ′ of π and ψ, the assertion ϕ′ ⇒ ϕ is valid. □
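For deterministic programs whose steps are everywhere defined, the (liberal) precondition of a path can be computed semantically by composing the path's state transformers with the postcondition; in that setting wp and wlp coincide. A Python sketch (ours, for illustration only; real verification-condition generators compute wp syntactically, e.g. by substitution for assignments):

```python
# Sketch (ours): semantic weakest precondition of a path, with assertions
# as predicates over states and each edge as a deterministic transformer.

def wp_path(steps, psi):
    """wp of a path: push the state forward through the path, then test psi."""
    def pre(state):
        s = dict(state)
        for step in steps:
            s = step(s)
        return psi(s)
    return pre

# Path through P's loop body once: the statement i := i + 1.
inc_i = lambda s: {**s, "i": s["i"] + 1}
psi = lambda s: s["i"] <= 100          # assertion at q

pre = wp_path([inc_i], psi)            # semantically: i + 1 <= 100, i.e. i <= 99
assert pre({"i": 99})
assert not pre({"i": 100})
```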

Later, in proving the correctness of verification conditions, we find the following property of the weakest liberal precondition useful:

COROLLARY 3.12 Let π = (p0, ..., pn) be a path, and ϕ and ψ be assertions. Suppose that there exists a sequence σ0, ..., σn of states such that

(p0, σ0) → (p1, σ1) → ... → (pn, σn),

σ0 |= ϕ, and ϕ ⇒ wlpπ(ψ) is valid. Then σn |= ψ.

PROOF. Since σ0 |= ϕ and ϕ ⇒ wlpπ(ψ) is valid, we have σ0 |= wlpπ(ψ). Since wlpπ(ψ) is the weakest liberal precondition for π and ψ, we have σn |= ψ. □

Another useful property of weakest preconditions and weakest liberal preconditions is that the weakest liberal precondition can be expressed in terms of the weakest precondition.

THEOREM 3.13 Let π be a path and ψ be an assertion. Then wlpπ(ψ) is equivalent to wpπ(ψ) ∨

¬wpπ(⊤).

PROOF. Let π = (p0, ..., pn). We have to show that, for every state σ, σ |= wlpπ(ψ) if and only if σ |= wpπ(ψ) ∨ ¬wpπ(⊤).

(⇒) Suppose that σ |= wlpπ(ψ) for some state σ. Suppose further that there exists a sequence σ0, ..., σn of states such that σ0 = σ and

(p0, σ0) → (p1, σ1) → ... → (pn, σn).

Since the weakest liberal precondition is a liberal precondition, by the definition of liberal precondition, we have σn |= ψ. Hence, by the definition of precondition, the state σ satisfies some precondition ϕ of π and ψ. By the definition of weakest precondition, the assertion ϕ ⇒ wpπ(ψ) is valid, and thus we have σ |= wpπ(ψ).

Suppose that there is no such sequence σ0, ..., σn. By the definition of precondition, any precondition of π and ⊤ is equivalent to ⊥, and so is wpπ(⊤). Thus, we have σ |= ¬wpπ(⊤).

(⇐) Suppose σ |= wpπ(ψ) for some state σ. Then there exist states σ0, ..., σn such that σ = σ0,

(p0, σ0) → (p1, σ1) → ... → (pn, σn),

and σn |= ψ. By the definition of liberal precondition, σ satisfies some liberal precondition ϕ of π and ψ. By the definition of weakest liberal precondition, the assertion ϕ ⇒ wlpπ(ψ) is valid. Hence, we have σ |= wlpπ(ψ).

Suppose now that σ |= ¬wpπ(⊤). Since every state satisfies ⊤, the relation σ |= ¬wpπ(⊤) means that there is no sequence σ0, ..., σn of states such that σ = σ0 and

(p0, σ0) → (p1, σ1) → ... → (pn, σn).

By the definition of liberal precondition, the state σ satisfies any liberal precondition of π and any assertion. Thus, the state σ also satisfies any liberal precondition ϕ of π and ψ. By the definition of weakest liberal precondition, the assertion ϕ ⇒ wlpπ(ψ) is valid. Hence, we have σ |= wlpπ(ψ). □

We have so far not imposed any restrictions on the programming languages in which programs are

written. However, to provide certificates or verification conditions for program properties, we need to be

able to compute the weakest and the weakest liberal precondition of a given path and an assertion.

DEFINITION 3.14 (Weakest Precondition Property) We say that a programming language has the weakest precondition property if, for every assertion ψ and path π, the weakest precondition for π and ψ exists and, moreover, can effectively be computed from π and ψ. □

In the sequel we assume that our programming language has the weakest precondition property. Note that

Theorem 3.13 implies that in any such language, given a path π and an assertion ψ, one can also compute

the weakest liberal precondition for π and ψ.

Next, we describe the verification conditions associated with assertion functions. Such verification conditions form certificates for program properties described by the assertion functions. Let I be an assertion function. A path p0, ..., pn in GP is called I-simple if n > 0, I is defined on p0 and pn, and I is undefined on all program points p1, ..., pn−1. We will say that the path is between p0 and pn.

DEFINITION 3.15 Let I be an assertion function of a program P such that the domain of I covers P. The strong verification condition associated with I is the set of assertions

{I(p0) ⇒ wlpπ(I(pn)) | π is an I-simple path between p0 and pn}.

Note that the strong verification condition is always finite. □

THEOREM 3.16 Let I be an assertion function of a program P whose domain covers P and S be the strong verification condition associated with I. If every assertion in S is valid, then I is strongly extendible.

PROOF. Take any run γ0,...,γiof P such that γ0|= I, γi|= I and γiis not an exit configuration. Using

arguments of the proof of Theorem 3.9, we extend this run to a run γ0,...,γi+nsuch that I is defined on

γi+nbut undefined on γi+1,...,γi+n−1. It remains to prove that γi+n|= I.

Consider the run γi,...,γi+nand denote the program point of each configuration γjin this run by

pjand the state of γjby σj. Then the path π = (pi,...,pi+n) is simple and we have σi |= I(pi). The

assertion

I(pi) ⇒ wlpπ(I(pi+n))

8/34Verimag Research Report noTR-2008-13

Page 11

Andrei Voronkov, Iman Narasamdya

belongs to the strong verification condition associated with I, hence is valid, so by Corollary 3.12 we have σi+n |= I(pi+n), which is equivalent to γi+n |= I. □

Note that this theorem gives us a sufficient condition for checking partial correctness of the program: given an assertion function I defined on a covering set, we can generate the strong verification condition associated with I. By Theorem 3.16, validity of this condition guarantees that I is strongly extendible, hence also weakly extendible. Therefore, by Theorem 3.5, partial correctness is guaranteed. Moreover, the strong verification condition is simply a collection of assertions, so if we have a theorem prover for the assertion language, it can be used to check the strong verification condition.
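As an illustration, the generation and checking of a strong verification condition can be sketched for a hypothetical toy flow-graph language (not the language of this report). In the sketch below, assertions are Python predicates over states, wlp is computed by composing backward transformers over each I-simple path, and "validity" is approximated by exhaustive checking over a small domain in place of a theorem prover.

```python
# Toy instance of Definition 3.15 / Theorem 3.16. Assertions are predicates
# over states (dicts); wlp of a path is the composition of per-edge
# backward transformers, applied right to left.

def wlp_assign(var, expr):
    # wlp of "var := expr": psi  ->  psi[expr/var]
    return lambda psi: lambda s: psi({**s, var: expr(s)})

def wlp_guard(cond):
    # wlp of a guard edge: psi  ->  (cond => psi)
    return lambda psi: lambda s: (not cond(s)) or psi(s)

def wlp_path(edges, psi):
    # wlp of a path e1;...;en is wlp_e1(...wlp_en(psi)...)
    for e in reversed(edges):
        psi = e(psi)
    return psi

# program: p0 --x:=0--> p1,  p1 --[x<3]; x:=x+1--> p1,  p1 --[x>=3]--> p2
I = {
    'p0': lambda s: True,
    'p1': lambda s: s['x'] <= 3,
    'p2': lambda s: s['x'] == 3,
}
simple_paths = [  # the I-simple paths, as (start, edge list, end)
    ('p0', [wlp_assign('x', lambda s: 0)], 'p1'),
    ('p1', [wlp_guard(lambda s: s['x'] < 3),
            wlp_assign('x', lambda s: s['x'] + 1)], 'p1'),
    ('p1', [wlp_guard(lambda s: s['x'] >= 3)], 'p2'),
]

def strong_vc_holds():
    # check I(p) => wlp_pi(I(q)) for every I-simple path pi between p and q,
    # "validity" approximated by enumerating a small state space
    return all(
        (not I[p](s)) or wlp_path(edges, I[q])(s)
        for p, edges, q in simple_paths
        for s in ({'x': v} for v in range(-2, 7)))

print(strong_vc_holds())  # → True
```

A real certificate checker would discharge these implications with a theorem prover rather than by enumeration; the structure of the check, however, is exactly one implication per I-simple path.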

One can reformulate the notion of verification condition in such a way that it will guarantee weak extendibility. For every path π, denote by start(π) and end(π), respectively, the first and the last point of π.

DEFINITION 3.17 Let I be an assertion function of a program P and Π a set of paths in GP such that for every path π in Π both start(π) and end(π) are I-observable. For every program point p in P, denote by Π|p the set of paths in Π whose first point is p.

The weak verification condition associated with I and Π consists of all assertions of the form

I(start(π)) ⇒ wlpπ(I(end(π))),

where π ∈ Π, and all assertions of the form

I(p) ⇒ ∨_{π∈Π|p} wpπ(⊤),

where p is an I-observable point.

The first kind of assertion in this definition is similar to the assertions used in the strong verification condition, but instead of all simple paths we consider all paths in Π. The second kind of assertion expresses that, whenever a configuration at a point p satisfies I(p), the computation from this configuration will inevitably follow at least one path in Π. This informal explanation is made more precise in the following theorem.

THEOREM 3.18 Let I and Π be as in Definition 3.17 and W be the weak verification condition associated

with I and Π. If every assertion in W is valid, then I is weakly extendible.

PROOF. In the proof, whenever we denote a configuration by γi, we use pi for the program point and σi for the state of this configuration, and similarly for other indices instead of i.

Take any run γ0,...,γi of P such that γ0 |= I, γi |= I and γi is not an exit configuration. Since pi is I-observable, the following assertion belongs to W:

I(pi) ⇒ ∨_{π∈Π|pi} wpπ(⊤),

and hence it is valid. Since γi |= I, we have σi |= I(pi), then by the validity of the above formula we have

σi |= ∨_{π∈Π|pi} wpπ(⊤).

This implies that there exists a path π ∈ Π|pi such that σi |= wpπ(⊤). Let the path π have the form pi,...,pi+n. Then, by the definition of wpπ(⊤), there exist states σi+1,...,σi+n such that

(pi,σi) → (pi+1,σi+1) → ... → (pi+n,σi+n).

Using that π ∈ Π and repeating arguments of Theorem 3.16 we can prove σi+n |= I(pi+n). □


4 Inter-Program Properties

In this section we develop further the notion of extendible assertion function so that it can be used to prove inter-program properties. Given a pair (P,P′) of programs, we assume that they have disjoint sets of variables. A configuration is a tuple (p,p′,σ̂), where p ∈ PointP, p′ ∈ PointP′, and σ̂ is a state mapping from all variables of both programs to values. Such a state can be considered as a pair of states: one for the variables of P and one for the variables of P′. In the sequel, such a state σ̂ is written as (σ,σ′), where σ is for P and σ′ is for P′. Similarly, the configuration (p,p′,σ̂) can be written as (p,p′,σ,σ′).

Similar to the case of a single program, a configuration γ = (p,p′,σ,σ′) is called an entry configuration for (P,P′) if p = entry(P) and p′ = entry(P′), and an exit configuration for (P,P′) if p = exit(P) and p′ = exit(P′). We overload the functions pp and state to deal with such configurations, that is, pp(γ) = (p,p′) and state(γ) = (σ,σ′). We introduce two new functions on configurations, ps1 and ps2, such that, on γ, ps1(γ) = (p,σ) is a configuration of P and ps2(γ) = (p′,σ′) is a configuration of P′.

The transition relation → of a pair (P,P′) of programs contains two kinds of transitions:

(p1,p′,σ1,σ′) → (p2,p′,σ2,σ′),

such that (p1,σ1) → (p2,σ2) is in the transition relation of P, and

(p,p′1,σ,σ′1) → (p,p′2,σ,σ′2),

such that (p′1,σ′1) → (p′2,σ′2) is in the transition relation of P′.

Having the notion of transition relation for pairs of programs, the notions of computation sequence and run can be defined in the same way as in the case of a single program. That is, a computation sequence of (P,P′) is a finite or infinite sequence

γ0,γ1,...

of configurations such that γi → γi+1 for all i. A run from an initial state σ̂ is a computation sequence such that γ0 = (p0,p′0,σ̂) is an entry configuration. One can observe that, for any pair (σ,σ′) of states, there can be many runs of (P,P′) from (σ,σ′). The following lemma then shows that if any of those runs is terminating, then all runs are terminating, and they terminate at the same configuration.

LEMMA 4.1 Let (P,P′) be a pair of programs. If a run of (P,P′) from an entry configuration γ is

terminating at an exit configuration γ′, then all runs of (P,P′) from γ are terminating at γ′.

PROOF. Let R = γ0,...,γk be a terminating run of (P,P′) such that γ0 = γ and γk = γ′. Denote by R|P the subsequence

γi0,...,γim

of R such that, for all l = 0,...,m − 1, the transition ps1(γil) → ps1(γil+1) is a transition in P. This means that the run of P from the entry configuration ps1(γi0) terminates at the exit configuration ps1(γim). Similarly for R|P′, we have the subsequence

γj0,...,γjn.

We also have

γ0 = (pp(ps1(γi0)), pp(ps2(γj0)), state(ps1(γi0)), state(ps2(γj0)))
γk = (pp(ps1(γim)), pp(ps2(γjn)), state(ps1(γim)), state(ps2(γjn))).   (2)

Assume that there is a non-terminating run R′ of (P,P′) from γ. Then, R′|P or R′|P′ is infinite. Without loss of generality, suppose that

R′|P = γ′0,γ′1,...

is infinite. That is, the run of P from ps1(γ′0) is non-terminating. However, since ps1(γ′0) = ps1(γi0) and P is deterministic, the run of P from ps1(γ′0) must terminate. This contradicts the existence of R′.


Moreover, the run of P from ps1(γi0) must terminate at ps1(γim). Using the same argument, we can show that the run of P′ from ps2(γj0) must terminate at ps2(γjn). By the equalities (2), it follows that all runs of (P,P′) from γ are terminating at γ′. □

We will show later that the above lemma allows us to preserve meta-properties of the abstract notions introduced in the previous section when these notions are developed further for the case of a pair of programs.

An assertion function of a pair (P,P′) of programs is a partial function

I : PointP × PointP′ → Assertion

mapping pairs of program points of P and P′ to assertions such that I is defined on (entry(P),entry(P′)) and (exit(P),exit(P′)).

Given an assertion function I, we call a pair of program points (p,p′) I-observable if I(p,p′) is defined. Let γ = (p,p′,σ,σ′) be a configuration. Then, γ is I-observable if so is the pair of program points (p,p′). We also say that γ satisfies I, denoted by γ |= I, if I is defined on (p,p′) and (σ,σ′) |= I(p,p′). We will also say that I is defined on γ if it is defined on (p,p′), and we write I(γ) to denote I(p,p′).

The notions of partial and total correctness for the case of single programs can be adapted for the case

of pairs of programs. A pair (P,P′) of programs is partially correct with respect to a precondition ϕ and a

postcondition ψ, denoted by {ϕ}(P,P′){ψ}, if for every run of (P,P′) from a configuration satisfying ϕ

and reaching an exit configuration, this exit configuration satisfies ψ. A pair (P,P′) of programs is totally

correct with respect to a precondition ϕ and a postcondition ψ, denoted by [ϕ](P,P′)[ψ], if every run of

(P,P′) from a configuration satisfying ϕ terminates in an exit configuration and this exit configuration

satisfies ψ.

Unlike in the case of a single program, for a pair of programs there are no notions of invariant and strongly-extendible assertion function. The transition relation of a pair of programs has no synchronization

mechanism. For example, one program in a pair can make as many transitions as possible, while the other

program in the same pair stays at some program point without making any transition. Thus, it is not useful

to have the notions of invariant and strongly-extendible assertion functions.

The notion of weakly-extendible assertion function is better suited for describing inter-program prop-

erties. Weakly-extendible assertion functions for a pair of programs can be defined in the same way as in

the case of a single program.

DEFINITION 4.2 Let I be an assertion function of a pair (P,P′) of programs. I is weakly extendible if for

every run

γ0,...,γi

of (P,P′) such that i ≥ 0, γ0 |= I, γi |= I, and γi is not an exit configuration, there exists a finite

computation sequence

γi,...,γi+n

of (P,P′) such that

1. n > 0, and

2. γi+n |= I. □

EXAMPLE 4.3 Let us illustrate the notion of weakly-extendible assertion function for a pair of programs.

Consider the following two programs P and P′:


P :
i := 0
j := 0
while (j < 100) do
  if (i > j) then j := j + 1
  else i := i + 1
  fi
  q :
od

P′ :
i′ := 0
j′ := 0
while (j′ < 100) do
  i′ := i′ + 1
  j′ := j′ + 1
  q′ :
od

Define an assertion function I of (P,P′) such that

I(entry(P),entry(P′)) = ⊤
I(q,q′) = ϕ
I(exit(P),exit(P′)) = ϕ,

where

ϕ = (i = i′) ∧ (j = j′) ∧ (i = j).

The function I is weakly extendible due to the following properties:

1. From an entry configuration of (P,P′), by taking a computation sequence consisting of two iterations of the loop of P and one iteration of the loop of P′, one reaches a configuration with program points (q,q′) in which ϕ holds.

2. For every v < 100, from a configuration with the program points (q,q′) in which i = i′ = j = j′ = v, by taking a computation sequence consisting of two iterations of the loop of P and one iteration of the loop of P′, one again reaches a configuration with program points (q,q′) in which i = i′ = j = j′ = v + 1.

3. For every v ≥ 100, from a configuration with the program points (q,q′) in which i = i′ = j = j′ = v, one can reach an exit configuration in which i = i′ = j = j′ = v.
□
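The three properties above can also be checked mechanically. The sketch below (a brute-force simulation, not part of the formal development) interleaves the two loops with the schedule just described, two iterations of P per iteration of P′, and checks that ϕ holds at every visit to the pair of points (q,q′); the names i2, j2 stand for the primed variables i′, j′.

```python
# Simulate the pair (P, P') of Example 4.3 under the schedule
# "two iterations of P's loop, then one iteration of P''s loop",
# observing the joint state each time both points (q, q') are reached.

def run_pair():
    i = j = 0          # variables of P
    i2 = j2 = 0        # variables of P' (primed)
    visits = []        # joint states observed at (q, q')
    while j < 100:
        for _ in range(2):            # two iterations of P's loop body
            if j < 100:
                if i > j:
                    j += 1
                else:
                    i += 1
        if j2 < 100:                  # one iteration of P''s loop body
            i2 += 1
            j2 += 1
        visits.append((i, j, i2, j2))
    return visits

visits = run_pair()
# phi = (i = i') and (j = j') and (i = j), checked at every (q, q') visit
phi_holds = all(i == i2 and j == j2 and i == j for (i, j, i2, j2) in visits)
print(phi_holds, len(visits))  # → True 100
```

Each of the 100 observed visits satisfies ϕ with i = i′ = j = j′ = v for v = 1,...,100, matching properties 1 and 2; the final visit (v = 100) is the state in which both programs exit, matching property 3.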

Concerning the sufficiency of weakly-extendible assertion functions for proving partial correctness, we obtain the same result as in the case of a single program, as stated by the following theorem:

THEOREM 4.4 Let I be an assertion function of a pair (P,P′) of programs such that

ϕ = I(entry(P),entry(P′)) and ψ = I(exit(P),exit(P′)).

If the assertion function I is weakly extendible, then {ϕ}(P,P′){ψ}, that is, (P,P′) is partially correct

with respect to the precondition ϕ and postcondition ψ.

PROOF. Suppose that I is weakly extendible and γ →* γ′ in (P,P′), where γ is an entry configuration, γ′ is an exit configuration, and γ |= ϕ. It follows that γ |= I. By Lemma 4.1, all runs of (P,P′) from γ terminate at γ′.

Consider any complete run

R = γ0,...,γm

of (P,P′) from γ, that is, γ = γ0 and γm = γ′. We need to prove that γm |= ψ. Take the largest number j such that γj is not the exit configuration γ′ and γj |= I. Such a configuration exists since γ0 = γ and γ |= I. Since I is weakly extendible, there exists a computation sequence

γj,...,γj+n

such that γj+n |= I. Now, since j is the largest such number, we have γj+n = γm, and thus γm |= I. It follows by the definition of I that γm |= ψ, as required. □


Similar to the properties of a single program, the verification conditions associated with inter-program

properties use the notion of path. However,since the flow graphsof the two programs in a pair of programs

are considered disjoint, the notion of path for pairs of programs needs to be elaborated. A path π of a pair

(P,P′) of programs is a finite or infinite sequence

(p0,p′

0),(p1,p′

1),...

of pairs of program points such that, for all i ≥ 0, either

• (pi,pi+1) is an edge of GPand p′

i= p′

i+1, or

• (p′

i,p′

i+1) is an edge of GP′ and pi= pi+1

A path ˆ π of (P,P′) can be considered as a trajectory in a two dimensional space where the axes are paths

of P and P′. We denote such a path ˆ π by (π,π′), where π and pi′are the axes of the space, π is a path of

P and π′is a path of P′.

Having the notion of path for a pair of programs, the notions of precondition and liberal precondition

for paths of a pair of programs can be defined in the same way as in the case of a single program. In

fact, the weakest precondition of a path of a pair of programs may be derived from the paths of the single

programs.

THEOREM 4.5 Let (π,π′) be a path of a pair (P,P′) of programs. Let ψ be an assertion such that ψ is equivalent to ψ1 ∧ ψ2, where ψ1 contains only variables from P and ψ2 contains only variables from P′. Then, wp(π,π′)(ψ) is equivalent to wpπ(ψ1) ∧ wpπ′(ψ2).

PROOF. Let (π,π′) = (p0,p′0),...,(pk,p′k). Suppose there is a pair (σ0,σ′0) of states that satisfies wp(π,π′)(ψ). Then there is a sequence (σ1,σ′1),...,(σk,σ′k) of pairs of states such that

(p0,p′0,σ0,σ′0) → (p1,p′1,σ1,σ′1) → ... → (pk,p′k,σk,σ′k)

and (σk,σ′k) |= ψ, which also means (σk,σ′k) |= ψ1 ∧ ψ2. By the disjointness of the sets of variables of P and P′, we have σk |= ψ1 and σ′k |= ψ2.

By the construction of (π,π′), we have

(pi0,σi0) → ... → (pim,σim)

such that π = pi0,...,pim, σi0 = σ0, and σim = σk. Similarly, we have

(p′j0,σ′j0) → ... → (p′jn,σ′jn)

such that π′ = p′j0,...,p′jn, σ′j0 = σ′0, and σ′jn = σ′k. It follows that σ0 |= wpπ(ψ1) and σ′0 |= wpπ′(ψ2). Consequently, (σ0,σ′0) |= wpπ(ψ1) ∧ wpπ′(ψ2), as required. □
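For a deterministic toy path language, the decomposition stated in Theorem 4.5 can be checked by brute force. The sketch below is our own illustration, with an assumed encoding of paths as lists of partial state transformers: wpπ(ψ) holds in σ exactly when π is enabled from σ and ψ holds in the state it produces, and the equivalence is tested on a small grid of joint states.

```python
# Check wp_(pi,pi')(psi1 /\ psi2)  ==  wp_pi(psi1) /\ wp_pi'(psi2)
# for one toy interleaving; pi touches only x, pi' touches only y.

def wp(steps, psi):
    # steps: list of functions state -> state, or None when not enabled
    def pre(s):
        for step in steps:
            s = step(s)
            if s is None:           # a guard failed: path not enabled
                return False
        return psi(s)
    return pre

# pi (over x):  [x < 5]; x := x + 1      pi' (over y):  y := 2 * y
pi  = [lambda s: s if s['x'] < 5 else None,
       lambda s: {**s, 'x': s['x'] + 1}]
pi2 = [lambda s: {**s, 'y': 2 * s['y']}]

psi1 = lambda s: s['x'] == 3        # mentions only variables of P
psi2 = lambda s: s['y'] > 4         # mentions only variables of P'

# compare wp over the interleaving pi;pi' with the conjunction of the
# component preconditions, on a small grid of joint states
ok = all(
    wp(pi + pi2, lambda s: psi1(s) and psi2(s))({'x': x, 'y': y})
    == (wp(pi, psi1)({'x': x, 'y': y}) and wp(pi2, psi2)({'x': x, 'y': y}))
    for x in range(8) for y in range(8))
print(ok)  # → True
```

Because the variable sets are disjoint, each half of the joint state evolves independently, which is exactly why the precondition factors; with shared variables the equality would fail in general.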

We can define the verification condition associated with weakly extendible assertion functions similarly to the case of a single program.

DEFINITION 4.6 Let I be an assertion function of a pair (P,P′) of programs and Π a set of non-trivial paths of the pair of programs such that for every path π in Π both start(π) and end(π) are I-observable. For every pair (p,p′) of program points, denote by Π|(p,p′) the set of paths in Π whose first pair of points is (p,p′).

The weak verification condition associated with I and Π consists of all assertions of the form

I(start(π)) ⇒ wlpπ(I(end(π))),

where π ∈ Π, and all assertions of the form

I(p,p′) ⇒ ∨_{π∈Π|(p,p′)} wpπ(⊤),

where (p,p′) is an I-observable point and p is not the exit point of P. □


THEOREM 4.7 Let I and Π be as in Definition 4.6 and W be the weak verification condition associated

with I and Π. If every assertion in W is valid, then I is weakly extendible.

PROOF. In the proof, whenever we denote a configuration by γi, we use (pi,p′i) for the program points and (σi,σ′i) for the states of this configuration, and similarly for other indices instead of i. Take any run γ0,...,γi of (P,P′) such that γ0 |= I, γi |= I and γi is not an exit configuration. Since (pi,p′i) is I-observable, the following assertion belongs to W:

I(pi,p′i) ⇒ ∨_{π∈Π|(pi,p′i)} wpπ(⊤),

and hence it is valid. Since γi |= I, we have (σi,σ′i) |= I(pi,p′i), then by the validity of the above formula we have

(σi,σ′i) |= ∨_{π∈Π|(pi,p′i)} wpπ(⊤).

This implies that there exists a path π ∈ Π|(pi,p′i) such that (σi,σ′i) |= wpπ(⊤). Let the path π have the form

(pi,p′i),...,(pi+n,p′i+n).

Then, by the definition of wpπ(⊤), there exist pairs of states (σi+1,σ′i+1),...,(σi+n,σ′i+n) such that

(pi,p′i,σi,σ′i) → (pi+1,p′i+1,σi+1,σ′i+1) → ... → (pi+n,p′i+n,σi+n,σ′i+n).

Using that π ∈ Π, it follows that (σi+n,σ′i+n) |= I(pi+n,p′i+n). □

The notion of weak verification condition is the cornerstone of our theory of inter-program properties: it forms a suitable notion of certificate for properties involving two programs.

5 Translation Validation

Translation validation [11] is an approach to compiler verification. In this approach, instead of proving

the correctness of a compiler for all source programs, one proves that, for a single source program, the

program and the result of its compilation, or the target program, are semantically equivalent. The translation validation approach has mainly been used in the verification of optimizing compilers, for example in [12, 10, 15, 13, 8]. In the case of optimizing compilers, the target program is obtained by applying optimizing transformations to the source program. Both source and target programs are usually in the same language. In the sequel we focus on applying our theory to translation validation for optimizing compilers.

In translation validation one first has to define formally the correctness property between the source

and the target programs. A typical correctness property in translation validation is semantic equivalence.

An example of informal definition of semantic equivalence is as follows: a source program P and a target

program P′are semantically equivalent if, for every pair of runs of both programs on the same input, (1)

both runs perform the same sequence of function calls, (2) one run is terminating if and only if so is the

other, and (3) on termination both runs return the same value. Having the correctness property, usually

one then defines a notion of correspondence between two programs. The semantic equivalence is then

established by finding some correspondences between the programs. Both the correctness property and

the notion of correspondence are inter-program properties. If we can show that such properties can be

captured by our notion of extendible assertion function, then we can provide certificates or proofs for those properties.


5.1 Basic-Block and Variable Correspondences

We start our discussion with our translation validation work described in [8, 9]. In our work we introduce the notion of basic-block and variable correspondence. The equivalence between two programs is established by finding certain basic-block and variable correspondences.

Denote by InVarP the set of input variables of a program P. In the sequel, given two programs P and P′, we assume that there always exists a one-to-one correspondence In between InVarP and InVarP′. We also say that runs R and R′ are on the same input if, letting σ0 and σ′0 be the initial states of, respectively, R and R′, we have σ0(x) = σ′0(In(x)) for all x ∈ InVarP.

We first define the notion of program equivalence. Denote by InVarP and ObsVarP the sets of, respectively, input variables and observable variables of a program P. The source program P and the target program P′ are semantically equivalent if there exist a one-to-one correspondence In between InVarP and InVarP′ and a one-to-one correspondence Obs between ObsVarP and ObsVarP′, such that for every pair of runs

R = (p0,σ0),(p1,σ1),...
R′ = (p′0,σ′0),(p′1,σ′1),...

of, respectively, P and P′ with σ0(x) = σ′0(In(x)) for all x ∈ InVarP, the following conditions hold:

• R is terminating (or finite) if and only if so is R′;

• if R and R′ are terminating with, respectively, states σ and σ′, then σ(y) = σ′(Obs(y)) for all y ∈ ObsVarP.

A block in a program is a sequence of statements in the program. A block is basic if it is maximal, and

it can only be entered at the beginning and exited at the end of the block.

Let us assume that the program points being considered in a program consist of the entry point of each basic block in the program, such that the point is denoted by the basic block itself. A run can be defined as a sequence

(β0,σ0),(β1,σ1),(β2,σ2),...,

where, for all i ≥ 0, the point βi is the entry point of basic block βi. For any run R and any sequence ¯b of basic blocks, we denote by R|¯b the subsequence of R consisting only of configurations whose program points are the entry points of basic blocks in ¯b.

Given two programs P and P′, let ¯b = b1,...,bm and ¯b′ = b′1,...,b′m be sequences of distinct basic blocks of, respectively, P and P′, and let ¯x = x1,...,xn and ¯x′ = x′1,...,x′n be sequences of distinct variables of, respectively, P and P′. There is a basic-block and variable correspondence between (¯b,¯x) and (¯b′,¯x′) if for every two runs

R = (β0,σ0),(β1,σ1),...
R′ = (β′0,σ′0),(β′1,σ′1),...

of, respectively, P and P′ on the same input, let

R|¯b = (βi0,σi0),(βi1,σi1),...
R′|¯b′ = (β′i′0,σ′i′0),(β′i′1,σ′i′1),...,

then R|¯b and R′|¯b′ are of the same length and the following conditions hold for all k:

1. βik = bj if and only if β′i′k = b′j for all j, and

2. σik+1(xl) = σ′i′k+1(x′l) for all l.

In the sequel, we often call ¯b,¯b′ sequences of control blocks and ¯x,¯x′ sequences of control variables. We assume that every program has a unique start block and a unique exit block. The entry of the start block is the program's entry point, while the exit of the exit block is the program's exit point. The start block of a program is also a control block, and it always corresponds to the start block of the other program.


[The original flow-graph figure is rendered here as block listings; the t/f branch edges and loop back-edges of the picture are omitted.]

P :
b0: i ← 0; s ← 0
b1: test i < n
b2: return s
b3: x ← i; test s = 0
b4: s ← x
b5: s ← s + x
b6: y ← m; i ← i + y

P′ :
b′0: i′ ← 0; s′ ← 0; y′ ← m′
b′1: test i′ < n′
b′2: return s′
b′3: s′ ← s′ + i′; i′ ← i′ + y′

Figure 1: P′ is an optimized version of P.

EXAMPLE 5.1 Let us give an example of basic-block and variable correspondence. Consider the programs P and P′ depicted in Figure 1. The variables m,n are the input variables of P, and their primed counterparts are the input variables of P′ such that In(m) = m′ and In(n) = n′. P′ is obtained from P by fusing the branches that are represented by blocks b4 and b5, and by moving the assignment instruction y ← m out of the loop.

There is a basic-block and variable correspondence between (b6,(s,i,m,n)) and (b′3,(s′,i′,m′,n′)). For every two runs of P and P′ on the same input, the block b6 is visited as many times as the block b′3 is visited, and on each corresponding visit, at the exits of the blocks, the values of s,i,m,n coincide with the values of their primed counterparts. With the same reasoning, it is easy to see that there is a basic-block and variable correspondence between ((b6,b2),(s,i,m,n)) and ((b′3,b′2),(s′,i′,m′,n′)). □
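This correspondence can also be checked empirically. The sketch below is our own brute-force illustration, not part of the formal development: it interprets the two programs of Figure 1 (with inputs m ≥ 1 so that both runs terminate), records the control variables at each exit of b6 and of b′3, and compares the visit sequences and the returned values; primed variables are encoded with a 2 suffix.

```python
# Interpret P and P' of Figure 1 at basic-block level and record the
# control variables (s, i, m, n) at every exit of b6 (resp. b'3).

def run_P(m, n):
    i, s, visits = 0, 0, []
    while i < n:                        # b1
        x = i                           # b3
        s = x if s == 0 else s + x      # b4 / b5
        y = m                           # b6
        i = i + y
        visits.append((s, i, m, n))     # exit of b6
    return s, visits                    # b2: return s

def run_P2(m2, n2):
    i2, s2, y2, visits = 0, 0, m2, []   # b'0
    while i2 < n2:                      # b'1
        s2 = s2 + i2                    # b'3
        i2 = i2 + y2
        visits.append((s2, i2, m2, n2)) # exit of b'3
    return s2, visits                   # b'2: return s'

ok = True
for m in range(1, 5):          # m >= 1 keeps both loops terminating
    for n in range(0, 12):
        s, v = run_P(m, n)
        s2, v2 = run_P2(m, n)  # In(m) = m', In(n) = n'
        ok = ok and s == s2 and v == v2
print(ok)  # → True
```

The visit sequences match exactly: b6 and b′3 are reached the same number of times, with equal values of the control variables at each corresponding visit, which is what the basic-block and variable correspondence asserts.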

Establishing program equivalence can be accomplished by finding basic-block and variable correspondences between the exit blocks and between the observable variables.

THEOREM 5.2 Let P and P′ be programs, and bt and b′t be the exit blocks of P and P′, respectively. Let Obs be the one-to-one correspondence between ObsVarP and ObsVarP′, where ObsVarP = {x1,...,xn}. P and P′ are semantically equivalent if and only if there is a basic-block and variable correspondence between

(bt,(x1,...,xn)) and (b′t,(Obs(x1),...,Obs(xn))). □

Note that in the above example there is a correspondence between ((b6,b2),(s,i,m,n)) and ((b′3,b′2),(s′,i′,m′,n′)). By the definition of basic-block and variable correspondences, it obviously follows that there is a correspondence between (b2,s) and (b′2,s′). Since b2 and b′2 are the exit blocks and the only observable variables are s and s′, we can conclude that the programs P and P′ in the above example are equivalent.

The verification of basic-block and variable correspondences has been described in detail in [9]. For the presentation in this section, suppose that one finds a basic-block and variable correspondence between (¯b,¯x) of a program P and (¯b′,¯x′) of a program P′, where ¯b = b1,...,bm, ¯b′ = b′1,...,b′m, ¯x = x1,...,xn,


and ¯x′ = x′1,...,x′n. Assume that a path is a sequence of program points, where each point is the exit of a basic block and is denoted by the basic block itself. Given a sequence ¯b of basic blocks, a path π = β0,...,βk is ¯b-simple if β0 and βk are in ¯b, or β0 is the start block and βk is in ¯b, but none of β1,...,βk−1 are in ¯b. Assume further that every program has a unique variable ρ representing the program counter, and for every basic block β in the program, the value of ρ is updated with β by the assignment ρ ← β at the entry of β.

The verification condition associated with a basic-block and variable correspondence consists of two parts: conjecture preservation and simulation relation. A conjecture is a set of assertions. Consider again the basic-block and variable correspondence between (¯b,¯x) and (¯b′,¯x′). Let ρ,ρ′ be the program counters of, respectively, programs P and P′, and C be a conjecture in the verification condition. For two blocks β,β′, we often write ρ(β,β′) as a shorthand for ρ = β ∧ ρ′ = β′. For every pair β,β′ of corresponding control blocks, we require that the assertion ∧C ∧ ρ(β,β′) ⇒ ∧_{k=1}^{n} xk = x′k is valid. For all i = 1,...,m and for all j = 1,...,m, let

Π_{bj,bi} = {π^1_{bj,bi},...,π^c_{bj,bi}}
Π_{b′j,b′i} = {π^1_{b′j,b′i},...,π^d_{b′j,b′i}}

be the sets of, respectively, all ¯b-simple paths between bj and bi and all ¯b′-simple paths between b′j and b′i. The verification condition associated with conjecture preservation consists of the following assertions: for all k = 1,...,c and for all l = 1,...,d,

∧C ∧ ρ(bj,b′j) ⇒ wlp_{π^k_{bj,bi}}(wlp_{π^l_{b′j,b′i}}(∧C)).

The verification condition associated with simulation relation consists of the following assertions: for all k = 1,...,c,

∧C ∧ ρ(bj,b′j) ∧ wp_{π^k_{bj,bi}}(⊤) ⇒ ∨_{l=1}^{d} wp_{π^l_{b′j,b′i}}(⊤),

and for all l = 1,...,d,

∧C ∧ ρ(bj,b′j) ∧ wp_{π^l_{b′j,b′i}}(⊤) ⇒ ∨_{k=1}^{c} wp_{π^k_{bj,bi}}(⊤).

EXAMPLE 5.3 Let us consider again the programs P and P′ in Example 5.1. The programs are depicted in Figure 1. In this example we show the verification condition associated with the basic-block and variable correspondence between ((b6,b2),(s,i,m,n)) and ((b′3,b′2),(s′,i′,m′,n′)).

Let bs and b′s be the start blocks of, respectively, P and P′. That is, bs is the predecessor of b0 and b′s is the predecessor of b′0. Let ϕ be an assertion equivalent to m = m′ ∧ n = n′. The conjecture C in the verification condition consists of the following assertions:

ρ(bs,b′s) ⇒ ϕ,
ρ(b6,b′3) ⇒ ϕ ∧ s = s′ ∧ i = i′ ∧ y′ = m′, and
ρ(b2,b′2) ⇒ ϕ ∧ s = s′ ∧ i = i′.

The first assertion above describes the input condition. The second and third assertions describe the correspondence between corresponding control variables at the corresponding control blocks. Note that in the second assertion the conjunct y′ = m′ is a loop invariant that is crucial for proving the correspondence.

Having the conjecture, we can generate assertions associated with conjecture preservation and simulation relation for the following pairs of sets of simple paths:

• Π_{b6,b6} = {π^{b4}_{b6,b6}, π^{b5}_{b6,b6}} and Π_{b′3,b′3} = {π_{b′3,b′3}};

• Π_{b6,b2} = {π_{b6,b2}} and Π_{b′3,b′2} = {π_{b′3,b′2}};

• Π_{bs,b6} = {π^{b4}_{bs,b6}, π^{b5}_{bs,b6}} and Π_{b′s,b′3} = {π_{b′s,b′3}}; and

• Π_{bs,b2} = {π_{bs,b2}} and Π_{b′s,b′2} = {π_{b′s,b′2}},

where π^{b3}_{b1,b2} denotes a path from b1 to b2 via b3. □
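For the pair (Π_{b6,b6}, Π_{b′3,b′3}), the conjecture-preservation assertions can be checked by brute force: from every small state satisfying C at (b6,b′3), execute one ¯b-simple path of P and the ¯b′-simple path of P′ and check that C holds again. The sketch below is our own illustration; primed variables are encoded with a 2 suffix, the two ¯b-simple paths of P (via b4 and via b5) are merged into one deterministic step that follows the s = 0 test, and validity is approximated by exhaustive enumeration rather than a theorem prover.

```python
# Brute-force check of "C /\ rho(b6,b'3) => wlp(wlp(C))" for the simple
# paths from b6 back to b6 in P and from b'3 back to b'3 in P'.

def C_b6_b3(s):   # conjecture at (b6, b'3)
    return (s['m'] == s['m2'] and s['n'] == s['n2'] and s['s'] == s['s2']
            and s['i'] == s['i2'] and s['y2'] == s['m2'])

def step_P(s):    # exit of b6 -> b1 -> b3 -> (b4 | b5) -> exit of b6
    if not s['i'] < s['n']:              # path enabled only if i < n
        return None
    x = s['i']
    new_s = x if s['s'] == 0 else s['s'] + x
    return {**s, 's': new_s, 'y': s['m'], 'i': s['i'] + s['m']}

def step_P2(s):   # exit of b'3 -> b'1 -> exit of b'3
    if not s['i2'] < s['n2']:            # enabled only if i' < n'
        return None
    return {**s, 's2': s['s2'] + s['i2'], 'i2': s['i2'] + s['y2']}

ok = True
rng = range(0, 4)
for m in rng:
    for n in rng:
        for i in rng:
            for v in rng:   # common value of s and s'
                s = {'m': m, 'n': n, 'i': i, 's': v, 'y': 0,
                     'm2': m, 'n2': n, 'i2': i, 's2': v, 'y2': m}
                if not C_b6_b3(s):
                    continue
                t = step_P(s)
                if t is not None:        # P follows its simple path ...
                    t2 = step_P2(t)
                    if t2 is not None:   # ... and P' follows its path
                        ok = ok and C_b6_b3(t2)
print(ok)  # → True
```

The enumeration confirms, on this small domain, that the conjecture is preserved around the corresponding loops, which is the content of the conjecture-preservation assertions for this pair of path sets.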


The notion of basic-block and variable correspondence can be captured by the notions of extendible assertion function and weak verification condition. That is, given a basic-block and variable correspondence between programs P and P′, we can define an extendible assertion function Î and a set Π̂ of paths of (P,P′), such that if all assertions in the weak verification condition associated with Î and Π̂ are valid, then so are all assertions in the verification condition associated with the correspondence.

Consider a basic-block and variable correspondence between (¯b,¯x) of P and (¯b′,¯x′) of P′. Let V be the verification condition associated with the correspondence, such that C is the conjecture in V. We define the assertion function Î of (P,P′) that expresses C. For simplicity, since we are interested in the correspondence of control variables at the exits of corresponding control blocks, we say that Î is defined on a pair (β,β′) of control blocks to mean that Î is defined on the pair (exit(β),exit(β′)) of the exits of β and β′. The function Î is defined as follows:

Î(entry(P),entry(P′)) = ρ(entry(P),entry(P′)) ∧ ∧C

and for every pair β,β′ of corresponding control blocks

Î(β,β′) = ρ(β,β′) ∧ ∧C.

Note that Î can be defined on other pairs of program points.

Next, recall that a path π̂ of (P,P′) can be considered as a trajectory in a two-dimensional space where the axes are a path π of P and a path π′ of P′. We denote such a path π̂ by (π,π′). We now impose some requirements on the set Π̂ of paths of (P,P′). First, the set Π̂ includes all simple paths in the sets of simple paths used to generate the verification condition V, that is,

Π̂ ⊇ {(π_{β1,β2}, π_{β′1,β′2}) | ∃ Π_{β1,β2}, Π_{β′1,β′2}. π_{β1,β2} ∈ Π_{β1,β2} ∧ π_{β′1,β′2} ∈ Π_{β′1,β′2}},

where Π_{β1,β2} is the set of all ¯b-simple paths from β1 to β2 and Π_{β′1,β′2} is the set of all ¯b′-simple paths from β′1 to β′2. Second, for every other path (π,π′) in Π̂, π and π′ are not prefixes of any ¯b-simple and ¯b′-simple path, and neither the nontrivial prefixes of π are ¯b-simple paths nor the nontrivial prefixes of π′ are ¯b′-simple paths.

Having the function Î and the set Π̂, one can prove a basic-block and variable correspondence by proving the weak verification condition associated with Î and Π̂.

THEOREM 5.4 Let V, Î, and Π̂ be as defined above, and W be the weak verification condition associated with Î and Π̂. If all assertions in W are valid, then all assertions in V are valid.

PROOF. We first prove the conjecture preservation of V. Take any two pairs (β1,β′1) and (β2,β′2) of corresponding control blocks, such that there exist a ¯b-simple path π_{β1,β2} from β1 to β2 and a ¯b′-simple path π_{β′1,β′2} from β′1 to β′2. We need to prove that the assertion

∧C ∧ ρ = β1 ∧ ρ′ = β′1 ⇒ wlp_{π_{β1,β2}}(wlp_{π_{β′1,β′2}}(∧C))   (3)

is valid. Since the assertion ∧C ∧ ρ = β1 ∧ ρ′ = β′1 is equivalent to Î(β1,β′1) and we assume that ρ and ρ′ are updated immediately preceding the exit blocks, the assertion wlp_{(π_{β1,β2},π_{β′1,β′2})}(Î(β2,β′2)) is equivalent to wlp_{π_{β1,β2}}(wlp_{π_{β′1,β′2}}(∧C)). Because the assertion

Î(β1,β′1) ⇒ wlp_{(π_{β1,β2},π_{β′1,β′2})}(Î(β2,β′2))

is valid, so is the assertion (3).

For the simulation relation of V, we prove it by contradiction. Assume that there are two pairs (β1,β′1) and (β2,β′2) of corresponding control blocks such that, without loss of generality, there is a ¯b-simple path π_{β1,β2}, but the assertion

∧C ∧ ρ = β1 ∧ ρ′ = β′1 ∧ wp_{π_{β1,β2}}(⊤) ⇒ ∨_{π∈Π_{β′1,β′2}} wp_π(⊤)


is not valid. Recall that the set Π_{β′1,β′2} is the set of all ¯b′-simple paths from β′1 to β′2. Since the assertion ∧C ∧ ρ = β1 ∧ ρ′ = β′1 is equivalent to Î(β1,β′1), it means that there is a pair (σ,σ′) of states satisfying Î(β1,β′1) such that there is a computation sequence of P that starts from the exit of β1 and state σ, and follows the path π_{β1,β2}, but there is no computation sequence of P′ that starts from the exit of β′1 and state σ′, and follows any path in Π_{β′1,β′2}.

By the requirements imposed on Π̂, the path π_{β1,β2} is only paired with some path in Π_{β′1,β′2}. It means that at the exits of β1 and β′1, the computation sequence from the states (σ,σ′) that satisfy Î(β1,β′1) cannot follow any path in Π̂. This is a contradiction since all assertions in W are valid. □

EXAMPLE 5.5 Consider again the programs P and P′ in Figure 1. We want to verify the basic-block and variable correspondence between (b6,(s,i,m,n)) and (b′3,(s′,i′,m′,n′)). The verification condition V associated with the correspondence consists of a conjecture C that includes the following assertions:

ρ(bs,b′s) ⇒ m = m′ ∧ n = n′, and
ρ(b6,b′3) ⇒ m = m′ ∧ n = n′ ∧ s = s′ ∧ i = i′ ∧ y′ = m′.

The pairs of sets of simple paths considered in generating V are (Π_{b6,b6},Π_{b′3,b′3}) and (Π_{bs,b6},Π_{b′s,b′3}). These sets of paths are defined as in Example 5.3.

We define the assertion function Î as follows:

Î(bs,b′s) = ρ(bs,b′s) ∧ ⋀C
Î(b6,b′3) = ρ(b6,b′3) ∧ ⋀C
Î(b2,b′2) = ⊤

Next, we define the set Π̂ as the union of the cross products of the following pairs of sets: (Π_{b6,b6},Π_{b′3,b′3}), (Π_{bs,b6},Π_{b′s,b′3}), (Π_{bs,b2},Π_{b′s,b′2}), and (Π_{b6,b2},Π_{b′3,b′2}). It can be proved that all assertions in the weak verification condition W associated with Î and Π̂ are valid. By Theorem 5.4, all assertions in V are also valid. It means that there exists a basic-block and variable correspondence between (b6,(s,i,m,n)) and (b′3,(s′,i′,m′,n′)). □

Note that in the above example the function Î is defined on the pair (b2,b′2) although none of these blocks are control blocks. Moreover, the set Π̂ above includes the pair (π_{bs,b2},π_{b′s,b′2}) of simple paths, but none of them are used to generate the verification condition V. One can actually prove a larger correspondence, that is, between ((b6,b2),(s,i,m,n)) and ((b′3,b′2),(s′,i′,m′,n′)), and thus Î can be defined only on pairs of control blocks and Π̂ includes only pairs of paths used to generate V. By the following theorem, it follows that there exists a basic-block and variable correspondence between (b6,(s,i,m,n)) and (b′3,(s′,i′,m′,n′)).

THEOREM 5.6 Let P and P′ be programs. If there is a basic-block and variable correspondence between (b̄,β,x̄) of P and (b̄′,β′,x̄′) of P′, or there is a basic-block and variable correspondence between (b̄,x̄,y) of P and (b̄′,x̄′,y′) of P′, then there is a basic-block and variable correspondence between (b̄,x̄) and (b̄′,x̄′).

Adding a pair of control blocks or a pair of control variables into a basic-block and variable correspon-

dence often results in a non basic-block and variable correspondence. In such a case, to apply the above

theorem, one can always translate programs into SSA form [2], and modify the correspondence according

to the variable renaming that occurs during the translation.

In the following example we will show that we can prove a basic-block and variable correspondence

using the notions of extendible assertion function and weak verification condition although the verification

condition associated with the correspondence cannot be generated.

EXAMPLE 5.7 Consider the programs P and P′ in Figure 2. The one-to-one correspondence In between the input variables maps N to N′. There is a basic-block and variable correspondence between ((bs,b6),j) and ((b′s,b′0),N′), and we want to verify this correspondence. The verification condition associated with the


[Figure 2 here: flow graphs of P and P′. P: bs (N ≥ 0), b0 (i ← 0), b1 (i < N), b2 (j ← i), b3 (i ← i + 1), b4 (j ← N), b5 (return j), exit b6. P′: entry b′s with return N′, exit b′0.]

Figure 2: P′ is an optimized version of P.

correspondence cannot be generated because there are infinitely many simple paths from bs to b6. Adding new pairs of control blocks is not possible because all blocks in P′ are already control blocks.
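To see the claimed correspondence concretely, the two flow graphs of Figure 2 can be rendered as executable functions. This is a hypothetical rendering (the report defines the programs only as flow graphs); the block names from the figure appear as comments.

```python
# Hypothetical executable rendering of the Figure 2 flow graphs.

def P(N):
    assert N >= 0        # bs: entry test N >= 0
    i = 0                # b0
    while i < N:         # b1
        j = i            # b2
        i = i + 1        # b3
    j = N                # b4
    return j             # b5, exiting at b6

def P_prime(N_prime):
    return N_prime       # b's: return N' immediately, exiting at b'0

# The correspondence ((bs,b6), j) and ((b's,b'0), N') claims that, on
# corresponding inputs (In maps N to N'), the value of j at b6 equals the
# value of N' at b'0 -- regardless of how many loop iterations P performs.
for n in [0, 1, 7]:
    assert P(n) == P_prime(n)
```

The infinitely many simple paths from bs to b6 correspond to the unbounded number of loop iterations in P, which is exactly why the ordinary verification condition cannot be generated.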

We can prove the correspondence using the notions of extendible assertion function and weak verification condition. Define the assertion function Î as follows:

Î(bs,b′s) = (N = N′),
Î(b4,b′s) = (N = N′ ∧ N ≥ 0 ∧ i ≤ N),
Î(b6,b′0) = (j = N′).

Let the set Π̂ of paths of (P,P′) consist of the following paths:

(π^{b5}_{bs,b6}, π_{b′s,b′0}), (π^{b2,b3}_{bs,b6}, π_{b′s,b′0}), (π_{bs,b4}, π_{b′s}), (π_{b4,b4}, π_{b′s}), (π_{b4,b6}, π_{b′s,b′0}),

where π_b denotes a trivial path consisting only of a single point b, and a superscript distinguishes simple paths by the intermediate blocks they pass through. One can prove that all assertions in the weak verification condition associated with Î and Π̂ are valid. Moreover, from Î and Π̂, one can reason that there is a basic-block and variable correspondence between ((bs,b6),j) and ((b′s,b′0),N′). □

5.2 Proof Rule VALIDATE

In this section we discuss how our notion of extendible assertion function can capture inter-program properties described by the proof rule VALIDATE in [15]. The proof rule consists of several steps. First, establish a control abstraction κ between programs P and P′. The abstraction is a mapping from CPP′ to CPP, where CPP′ is a cut-point set of P′ and, additionally, includes the exit block of P′. The set CPP can be defined similarly. The abstraction κ must map the entry and the exit of P′ to, respectively, the entry and the exit of P. Second, for each point p′ in CPP′, form an intra-program assertion αp′ referring only to variables in P′. Next, establish a data abstraction δ, which is an assertion relating variables in P and variables in P′.

A path in a program can be considered as a transition relation containing the conditions that enable the path to be traversed and the data transformation effected by the path. For example, consider the program P in Example 1. The path consisting of the instructions

i := i + y; i < n; x := i

describes the transition relation

i* = i + y ∧ i* < n ∧ x* = i*.
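The construction of such a transition relation can be sketched mechanically: walk the path, star each assigned variable, and read subsequent occurrences of already-assigned variables through their starred versions. This is our own illustrative sketch, not the report's formalization; the token-wise substitution is deliberately naive.

```python
# Sketch: building the transition relation of a straight-line path as a
# conjunction over "starred" (post-state) variables.

def star(expr, starred):
    # Naive token-wise starring; enough for this illustration.
    return " ".join(t + "*" if t in starred else t for t in expr.split())

def transition_relation(path):
    """path: list of ('assign', var, expr) or ('cond', expr) steps."""
    starred = set()          # variables updated so far on the path
    conjuncts = []
    for step in path:
        if step[0] == 'assign':
            _, var, expr = step
            expr = star(expr, starred)   # read the latest values
            starred.add(var)
            conjuncts.append(f"{var}* = {expr}")
        else:                            # ('cond', expr)
            conjuncts.append(star(step[1], starred))
    return " ∧ ".join(conjuncts)

path = [('assign', 'i', 'i + y'), ('cond', 'i < n'), ('assign', 'x', 'i')]
print(transition_relation(path))
# i* = i + y ∧ i* < n ∧ x* = i*
```

The printed relation matches the one above: the condition i < n is evaluated on the updated value i*, and the final assignment relates x* to i*.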


For simplicity, we denote by π the transition relation described by a path π.

Similar to the proof technique for verifying basic-block and variable correspondences, the proof rule VALIDATE generates assertions that express a simulation relation. That is, for each pair (p1,p′1) of program points such that κ(p′1) = p1 and there is a CPP′-simple path π_{p′1,p′2} from p′1 to p′2 in the flow graph of P′, let Π_{κ(p′1),κ(p′2)} be the set of all simple paths from κ(p′1) to κ(p′2) in P; one proves that the following assertion is valid:

α_{p′1} ∧ δ ∧ π_{p′1,p′2} ⇒ ∃V*_P. (⋁_{π_{κ(p′1),κ(p′2)} ∈ Π_{κ(p′1),κ(p′2)}} π_{κ(p′1),κ(p′2)}) ∧ δ* ∧ α*_{p′2},   (4)

where V*_P is a sequence of starred versions of some variables in P, and δ* and α*_{p′2} are obtained from δ and α_{p′2} by replacing all variables updated in π_{p′1,p′2} and π_{κ(p′1),κ(p′2)} by their starred counterparts.

In [15] the correctness of the data abstraction δ is proved separately. Essentially, for every two simple paths from p′1 to p′2 and from κ(p′1) to κ(p′2), one proves that the assertion

α_{p′1} ∧ δ ∧ π_{p′1,p′2} ∧ π_{κ(p′1),κ(p′2)} ⇒ δ* ∧ α*_{p′2}   (5)

is valid.

The notions of extendible assertion function and weak verification condition can capture the program properties described by the rule VALIDATE. First, define an assertion function Î from the abstractions and intra-program assertions in the rule. Then, reuse the simple paths in the proof rule to generate the weak verification condition associated with the function. Let the assertion function Î of (P,P′) be defined as follows: for every point p′ in CPP′,

Î(κ(p′),p′) = αp′ ∧ δ,

and Î is undefined on other pairs of points. Define a set Π̂ of paths of (P,P′) as follows:

Π̂ = {(π,π′) | ∃Π_{p′1,p′2}, Π_{κ(p′1),κ(p′2)}. π′ ∈ Π_{p′1,p′2} ∧ π ∈ Π_{κ(p′1),κ(p′2)}}.

Note that this definition of Î is different from the Î discussed in the previous section on basic-block and variable correspondences: the function Î in this section is only defined on pairs of control points.

Having the function Î and the set Π̂, one can prove a property described by rule VALIDATE by proving the weak verification condition associated with Î and Π̂.

THEOREM 5.8 Let Î and Π̂ be as defined above, and let W be the weak verification condition associated with Î and Π̂. If all assertions in W are valid, then so are all assertions of the forms (4) and (5).

PROOF. We first prove that all assertions of the form (5) are valid. Since the assertion Î(κ(p′1),p′1) is equivalent to α_{p′1} ∧ δ, the assertion π_{p′1,p′2} ∧ π_{κ(p′1),κ(p′2)} ⇒ δ* ∧ α*_{p′2} is equivalent to wlp(π_{p′1,p′2}, π_{κ(p′1),κ(p′2)})(δ ∧ α_{p′2}), and since the assertion

Î(κ(p′1),p′1) ⇒ wlp(π_{p′1,p′2}, π_{κ(p′1),κ(p′2)})(δ ∧ α_{p′2})

is valid, it follows that the assertion

α_{p′1} ∧ δ ∧ π_{p′1,p′2} ∧ π_{κ(p′1),κ(p′2)} ⇒ δ* ∧ α*_{p′2}

is also valid.

Now, assume that, for some points p′1 and p′2, the assertion

α_{p′1} ∧ δ ∧ π_{p′1,p′2} ⇒ ∃V*_P. (⋁_{π_{κ(p′1),κ(p′2)} ∈ Π_{κ(p′1),κ(p′2)}} π_{κ(p′1),κ(p′2)}) ∧ δ* ∧ α*_{p′2}

is not valid. It means that there is a pair (σ,σ′) of states satisfying α_{p′1} ∧ δ ∧ π_{p′1,p′2}: there is a computation sequence from p′1 on σ′ traversing the path π_{p′1,p′2}, but there is no computation sequence from p1 on σ traversing any path in Π_{κ(p′1),κ(p′2)}. Since CPP is a cut-point set, all paths from κ(p′1) to κ(p′2) are in Π_{κ(p′1),κ(p′2)}. By the definition of Π̂, the path π_{p′1,p′2} is paired precisely with the paths in Π_{κ(p′1),κ(p′2)}.


[Figure 3 here: flow graphs of P and P′. P: b0 (i ← 0; x ← a + b), b1 (test on i = 0, branching to b2 or b3), b2 (i ← x), b3 (i ← i + x), b4 (y ← a + i; i < 100, looping back to b1 or exiting), exit bt. P′: b′0 (i′ ← 0; x′ ← a′ + b′), b′1 (i′ ← i′ + x′; i′ < 100), b′2 (y′ ← a′ + i′), exit b′t.]

Figure 3: P′ is an optimized version of P.

Thus, there are states (σ,σ′) satisfying Î(κ(p′1),p′1), but the computation sequence from (κ(p′1),p′1) on (σ,σ′) does not follow any path in Π̂. However, since all assertions in W are valid, our assumption is contradictory. □

EXAMPLE 5.9 In this example we consider programs used as an example in [15]. We depict the programs in Figure 3.

Let us denote the entry point of a basic block by the name of the basic block itself. The control abstraction κ maps b0 to b′0, b1 to b′1, and bt to b′t. The blocks bt and b′t are the exit blocks of P and P′, respectively. The data abstraction δ is defined as the following assertion:

ρ = κ(ρ′) ∧ i = i′ ∧ a = a′ ∧ b = b′ ∧ (ρ′ ≠ b′1 ⇒ x = x′ ∧ y = y′),

where ρ is a program counter, and the data abstraction always implies the equality ρ = κ(ρ′). Furthermore, at the entry of b′1 we have the assertion α_{b′1} equivalent to x′ = a′ + b′.

We define the assertion function Î as follows:

Î(b0,b′0) = δ
Î(bt,b′t) = δ
Î(b1,b′1) = δ ∧ α_{b′1},

and Î is undefined on other pairs of points. The set Π̂ of paths of (P,P′) consists of the following paths:

(π_{b0,b1}, π_{b′0,b′1}), (π^{b2}_{b1,b1}, π_{b′1,b′1}), (π^{b3}_{b1,b1}, π_{b′1,b′1}), (π^{b2}_{b1,bt}, π_{b′1,b′t}), (π^{b3}_{b1,bt}, π_{b′1,b′t}).

These pairs of paths are all pairs of simple paths considered in the proof rule VALIDATE. Thus, by Theorem 5.8, if all assertions in the verification condition associated with Î and Π̂ are valid, then all assertions generated by VALIDATE are also valid. □
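The equivalence that VALIDATE certifies for Figure 3 can also be checked on concrete inputs by rendering both graphs as functions. This is a hypothetical rendering: we read the branch at b1 as "if i = 0 then b2 else b3" (an assumption about the figure's edge labels), and we only consider inputs with a + b > 0 so that the loops terminate.

```python
# Hypothetical executable rendering of the Figure 3 flow graphs.

def P(a, b):
    i = 0
    x = a + b                # b0
    while True:
        if i == 0:           # b1 (reading the test as i = 0)
            i = x            # b2
        else:
            i = i + x        # b3
        y = a + i            # b4
        if not (i < 100):
            return y         # bt

def P_prime(a, b):
    i = 0
    x = a + b                # b'0
    while True:
        i = i + x            # b'1: the two branches of P merged
        if not (i < 100):
            return a + i     # b'2 (y' <- a' + i') then b't

# delta requires i = i', a = a', b = b', and outside b'1 also x = x' and
# y = y'; in particular both programs return the same final value of y.
assert P(3, 4) == P_prime(3, 4)
```

The merge is sound because on the first iteration i = 0, so i ← x and i ← i + x coincide; on later iterations only the b3 branch is taken.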

The proof rule VALIDATE cannot prove the inter-program property in Example 5.7. For each pair (π_{p′1,p′2}, π_{κ(p′1),κ(p′2)}) of simple paths used in generating assertions of the forms (4) and (5), both paths are nontrivial. However, in Example 5.7 the set Π̂ contains the pair (π_{b4,b4}, π_{b′s}), where π_{b′s} is a trivial path. In this sense, our notions of extendible assertion function and weak verification condition are more powerful than the proof rule VALIDATE.


5.3 Simulation Invariants

We show in this section that the notion of simulation invariant introduced in the work on credible compilation [12] can be captured by our notions of extendible assertion function and weak verification condition.

There are two kinds of invariants introduced in [12]: standard invariants and simulation invariants. A standard invariant of a program P is written as ⟨α⟩p, where α is an assertion and p is a program point of P. The invariant is true if, for all executions of P, the assertion α holds on the state at point p.

Simulation invariants express a simulation relationship between the partial executions of programs. Partial executions of a program are computation sequences starting from the entry of the program. A simulation invariant between two programs P and P′ is written as ⟨α, ē⟩p ⊳ ⟨α′, ē′⟩p′, where α, α′ are assertions, ē, ē′ are equally long sequences of expressions, and p, p′ are program points of P, P′, respectively. For sequences ē = e1,...,en and ē′ = e′1,...,e′n of expressions, we write ē = ē′ for ⋀_{i=1}^{n} ei = e′i. The invariant is true if, for all partial executions of P′ reaching p′ with α′ true, there exists a partial execution of P reaching p with α true such that the execution of P is on the same input as that of P′ and ē = ē′.

The notion of standard invariant can be captured by the notion of extendible assertion function. Instead of proving a single standard invariant, in credible compilation one usually proves a set of standard invariants. Given a set S of standard invariants, we define an assertion function Î as follows: for ⟨α⟩p ∈ S, Î(p) = α. Note that Î can also be defined on other points. We then prove that Î is a strongly-extendible assertion function.

THEOREM 5.10 Let S be a set of standard invariants and Î be an assertion function such that, for all ⟨α⟩p ∈ S, Î(p) = α. If Î is strongly extendible, then all standard invariants in S are true. □

The proof of the above theorem follows straightforwardly from the definition of strongly-extendible assertion functions.

The notion of simulation invariant can be captured by the notions of weakly-extendible assertion function and weak verification condition. Similar to proving standard invariants, instead of proving a single simulation invariant, one usually proves a set of simulation invariants. Similar to proving basic-block and variable correspondences and properties described by rule VALIDATE, proving a set of simulation invariants requires some standard invariants that are assumed to be true. Let S be a set of simulation and standard invariants of programs P and P′. Denote by

S|p = {α | ∃⟨α⟩p ∈ S}

the set of assertions of all standard invariants in S whose point is p. Denote by

S|(p,p′) = {α ∧ α′ ∧ ē = ē′ | ∃⟨α, ē⟩p ⊳ ⟨α′, ē′⟩p′ ∈ S}

the set of assertions of all simulation invariants in S whose pair of points is (p,p′). We define an assertion function Î of (P,P′) as follows: for every pair (p,p′) of points such that S|(p,p′) is not empty,

Î(p,p′) = ⋀S|(p,p′) ∧ ⋀S|p ∧ ⋀S|p′.

Let TS = {p′ | ∃⟨α, ē⟩p ⊳ ⟨α′, ē′⟩p′ ∈ S} be the set of all program points of P′ such that there is a simulation invariant in S involving these points. We assume that the set of all TS-simple paths is finite. This assumption is also used in the proof rules described in [12] to prove a set of simulation invariants. We define a set Π̂ of paths of (P,P′) with the following requirement: for every TS-simple path π_{p′1,p′2}, there is a path π_{p1,p2} in the flow graph of P such that (1) there are simulation invariants ⟨α1, ē1⟩p1 ⊳ ⟨α′1, ē′1⟩p′1 and ⟨α2, ē2⟩p2 ⊳ ⟨α′2, ē′2⟩p′2 in S, and (2) (π_{p1,p2}, π_{p′1,p′2}) is in Π̂. Note that the path π_{p1,p2} can be trivial.

Having the function Î and the set Π̂, one can prove a set of simulation invariants by proving the weak verification condition associated with Î and Π̂.

THEOREM 5.11 Let S, Î, and Π̂ be as defined above, and let W be the weak verification condition associated with Î and Π̂. If all assertions in W are valid, then all simulation invariants in S are true. □
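The construction of Î from S can be sketched concretely, modelling an assertion as a set of conjunct strings. This is our own illustration; the sample invariants are those of the credible-compilation example discussed next.

```python
# Sketch: building Î(p,p') = ⋀S|(p,p') ∧ ⋀S|p ∧ ⋀S|p' from a set S of
# standard invariants (point, assertion) and simulation invariants
# (p, alpha, e) ⊳ (p', alpha', e') with single-expression sequences.

standard = [("p4", "g%12 = 0 ∨ g%12 = 6"),
            ("p'4", "g'%12 = 0"),
            ("p'3", "g'%12 = 6")]

simulation = [("p5", "g%12 = 0", "g", "p'2", "⊤", "g'"),
              ("p4", "g%12 = 6", "g", "p'3", "⊤", "g'"),
              ("p5", "g%12 = 6", "g", "p'5", "⊤", "g'"),
              ("p4", "g%12 = 0", "g", "p'4", "⊤", "g'"),
              ("p7", "⊤", "g", "p'7", "⊤", "g'")]

def S_at(p):                      # S|p
    return {a for (q, a) in standard if q == p}

def S_at_pair(p, pp):             # S|(p,p')
    return {f"{a} ∧ {ap} ∧ {e} = {ep}"
            for (q, a, e, qp, ap, ep) in simulation if (q, qp) == (p, pp)}

def I_hat(p, pp):                 # Î(p,p') as a set of conjuncts
    return S_at_pair(p, pp) | S_at(p) | S_at(pp)

# Î(p4,p'4) conjoins the simulation invariant with both standard invariants.
assert I_hat("p4", "p'4") == {"g%12 = 0 ∧ ⊤ ∧ g = g'",
                              "g%12 = 0 ∨ g%12 = 6",
                              "g'%12 = 0"}
```

The point of the sketch is that Î is determined purely syntactically by S; the real proof obligation is the weak verification condition over these conjunctions.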


[Figure 4 here: flow graphs of P and P′. P: p1 (g ← 0), p4 (g < 48; f to p7), p5 (g ← g + 6), p7 (return g). P′: p′1 (g′ ← 0), p′2 (g′ ← g′ + 6), p′3 (g′ < 48; f to p′7), p′5 (g′ ← g′ + 6), p′4 (g′ < 48; f to p′7), p′7 (return g′).]

Figure 4: P′ is an optimized version of P.

To prove the above theorem, we would need to describe the proof rules in [12]; in this section we only provide an informal proof. First, the set TS contains all program points of P′ at which there is a simulation invariant. Since we consider all TS-simple paths, by the requirement on Π̂, if there is a TS-simple path π_{p′1,p′2}, then there is a path π_{p1,p2} in the flow graph of P such that there are simulation invariants at (p1,p′1) and at (p2,p′2). Thus, the paths in Π̂ represent the directions that the rules in [12] follow in the proof.

EXAMPLE 5.12 We consider an example taken from the work on credible compilation in [12]. The programs in the example are depicted in Figure 4. Each block in the programs has only one instruction, and the instruction is labelled with a point denoting the entry of the block. The set S of simulation and standard invariants to be proved consists of the following invariants:

⟨g%12 = 0 ∨ g%12 = 6⟩p4, ⟨g′%12 = 0⟩p′4, ⟨g′%12 = 6⟩p′3,
⟨g%12 = 0, g⟩p5 ⊳ ⟨⊤, g′⟩p′2,
⟨g%12 = 6, g⟩p4 ⊳ ⟨⊤, g′⟩p′3,
⟨g%12 = 6, g⟩p5 ⊳ ⟨⊤, g′⟩p′5,
⟨g%12 = 0, g⟩p4 ⊳ ⟨⊤, g′⟩p′4,
⟨⊤, g⟩p7 ⊳ ⟨⊤, g′⟩p′7.

The assertion function Î is defined as follows:

Î(p5,p′2) = g%12 = 0 ∧ g = g′
Î(p4,p′3) = g%12 = 6 ∧ g = g′ ∧ g′%12 = 6 ∧ (g%12 = 0 ∨ g%12 = 6)
Î(p5,p′5) = g%12 = 6 ∧ g = g′
Î(p4,p′4) = g%12 = 0 ∧ g = g′ ∧ g′%12 = 0 ∧ (g%12 = 0 ∨ g%12 = 6)
Î(p7,p′7) = g = g′

Both at the pair of exit points and at the pair of entry points, Î is defined as ⊤. The set Π̂ of paths of (P,P′) consists of the following paths:

(π_{p1,p5}, π_{p′1,p′2}), (π_{p5,p4}, π_{p′2,p′3}), (π_{p4,p5}, π_{p′3,p′5}),
(π_{p5,p4}, π_{p′5,p′4}), (π_{p4,p7}, π_{p′4,p′7}), (π_{p4,p7}, π_{p′3,p′7}).

It is easy to see that if all assertions in the weak verification condition W associated with Î and Π̂ are valid, then all simulation invariants in S are true. Assume that all assertions in W are valid. The invariant ⟨⊤,g⟩p7 ⊳ ⟨⊤,g′⟩p′7 is true by the following reasoning. For every path π′ going to p′7, there is a path π going to p7 such that (π,π′) is in Π̂ and Î is defined on the starts of π and π′. For the path π_{p′4,p′7}, we have the path π_{p4,p7} such that (π_{p4,p7}, π_{p′4,p′7}) is in Π̂ and Î is defined on (p4,p′4) and on (p7,p′7). Since all assertions in W are valid, we can ignore the assertions defined on (p4,p′4). For the path π_{p′3,p′7}, we have the path π_{p4,p7} such that (π_{p4,p7}, π_{p′3,p′7}) is in Π̂ and Î is defined on (p4,p′3). Assuming that the invariants ⟨g%12 = 0,g⟩p4 ⊳ ⟨⊤,g′⟩p′4 and ⟨g%12 = 6,g⟩p4 ⊳ ⟨⊤,g′⟩p′3 are true, we can prove inductively that, for all partial executions of P′ that reach p′7, there is a partial execution of P such that the execution reaches p7 and the values of g and g′ on reaching p7 and p′7 coincide. □

Recall that we required that, for every pair (π,π′) in the set Π̂, the path π′ is not trivial. Due to this requirement, the proof rules described in [12] cannot prove the inter-program property in Example 5.7, where the set Π̂ contains the pair (π_{b4,b4}, π_{b′s}) and π_{b′s} is a trivial path. Thus, with our notions of extendible assertion function and weak verification condition, we can prove more inter-program properties than the proof rules in [12] can.

The notion of simulation invariant used in credible compilation is similar to the notion of simulation triple in Necula's work on translation validation [10]. Thus, our notions of extendible assertion function and weak verification condition can capture the notion of simulation triple as well.

6 Common Criteria Certification

We discuss in this section an application of our theory of inter-program properties in the certification of smart-card applications. The work described in this section is part of an industrial project, called EDEN2, that has been conducted at the Verimag laboratory.1 The aims of the project are twofold: (1) to develop a method for software certification in the framework of Common Criteria certification [1], and (2) to provide a certificate or a collection of certificates showing that a smart-card application follows its specification or a model of its specification.

Common Criteria (CC) is an international standard for the evaluation of security-related systems. CC defines requirements for certification: security policy model (SPM), functional specification (FSP), high-level design (HLD), low-level design (LLD), and implementation (IMP). Given a specification of a system or a program, an SPM is a model of the specification. An FSP describes an input-output relationship of the system. HLDs are often fused into FSPs or into LLDs. An LLD itself is described as a reference implementation.2 The IMP is the code implementing the system.

Each requirement in CC has a representation. For example, in the EDEN2 project the SPM is written in a declarative language that specifies, for each smart-card command, the normal behavior of the command and the actions that the command has to perform when a card tear (or power loss) occurs. The FSP and the LLD in the EDEN2 project are programs written in subsets of Java, while the IMP is a Java Card program [3, 14]. The HLD in EDEN2 is fused into the LLD. Essentially, the SPM, the FSP, the LLD, and the IMP are programs that can be represented as program-point flow graphs. Between every two consecutive requirement representations there is a so-called representation correspondence (RCR). An RCR is essentially a property relating two programs, or an inter-program property, and thus we can apply our theory of inter-program properties to proving RCRs and providing certificates about the RCRs. Our theory is also applicable to proving properties of SPMs, FSPs, LLDs, and IMPs. In this report we focus on the application of the theory to proving RCRs.

The definitions of RCRs between two consecutive requirements are different. We first discuss the definition of RCRs between SPMs and FSPs. To this end, we discuss the SPM and the FSP. An SPM and an FSP consist of a set of commands. A command can be thought of as a method in a Java program or a function in a C program. Each command in the SPM and the FSP can be represented by two programs: one program specifies the normal behavior of the command, and the other specifies what the command has to do when a card tear occurs. For simplicity, we call the former program the normal fragment of the command and the latter the abrupt fragment of the command. In the FSP the normal

1 Industrial partners involved in this project include companies that work on security for embedded systems, e.g., Gemalto and Trusted Logic.
2 In the latest version of the Common Criteria report [1], HLD and LLD are replaced by the TOE design description (TDS). In this report we regard LLDs as TDSs.


and abrupt fragments are represented by a try-catch construct: the try part represents the normal fragment, and the catch part catches a special exception and represents the abrupt fragment.

The operation of SPMs and FSPs resembles the operation of smart-card applications, that is, by sending a sequence of commands to the SPMs and the FSPs. One can think of an SPM or an FSP as a program that takes as an input a sequence of commands of the form C(a1,...,an), where C is the command's name and a1,...,an are input arguments. A run of an SPM or an FSP can be described as a sequence of runs of commands. For each run of a command, if no card tear occurs and the run of the command terminates normally, then the run of the SPM or the FSP fetches the next input C(a1,...,an) from the input sequence. If a card tear occurs, then the run goes to the abrupt fragment of the command. If the run of the abrupt fragment then terminates, the run of the command is said to terminate abruptly, and in turn the run of the SPM or the FSP simply terminates.

A run of an SPM or an FSP is a finite or infinite alternating sequence

γ0, ε1, γ1, ε2, ...,

where

• γ0 is an entry configuration;
• for all i ≥ 0, we have γi → γi+1; and
• for all j ≥ 1, the event εj is an event associated with the transition γj−1 → γj.

We assume that each of the SPM and the FSP has an input variable, and the state of configuration γ0 maps this variable to the input value, which is a sequence of commands. Later, in the definition of RCRs between SPMs and FSPs, we introduce a one-to-one correspondence Obs between the set of observable variables of an SPM and the set of observable variables of an FSP. We assume that Obs maps the input variable of the SPM to the input variable of the FSP.

For every run of a command, upon reaching the exit of the normal fragment, the run of an SPM or an FSP emits either a Pass event or a Fail event, and upon reaching the exit of the abrupt fragment, the run emits an Abrupt event. We assume that emitting an event is the same as assigning the event to a special variable ε. Events are not restricted to Pass, Fail, and Abrupt events; we allow internal or unobservable events.

We now define the notion of RCR between SPMs and FSPs that we use in EDEN2. Let E be a set of observable events. Denote by R|E the subsequence of a run R consisting only of the events in E and the configurations at which they occur:

R = (p0,σ0), ε1, (p1,σ1), ε2, ...
R|E = (p0,σ0), εi1, (pi1,σi1), εi2, (pi2,σi2), ...

where εij ∈ E for all j. Let X be a set of variables of an SPM; we denote by Ab(X) the set of variables in X that are modified in the abrupt fragment of the SPM.

DEFINITION 6.1 Let OSPM and OFSP be the sets of observable variables of, respectively, an SPM and an FSP such that there is a one-to-one correspondence Obs between OSPM and OFSP. Let EO = {Pass, Fail, Abrupt} be the set of observable events of the SPM and the FSP. There is an RCR between the SPM and the FSP if, for every run

R|EO = (p0,σ0), εi1, (pi1,σi1), ...

of the FSP, there is a run

R′|EO = (p′0,σ′0), ε′j1, (p′j1,σ′j1), ...

of the SPM, where for all x ∈ OSPM we have σ0(x) = σ′0(Obs(x)), such that, for all k,

• εik = ε′jk,
• if εik ≠ Abrupt, then σik(x) = σ′jk(Obs(x)) for all x ∈ OSPM, and
• if εik = Abrupt, then σik(y) = σ′jk(Obs(y)) for all y ∈ Ab(OSPM).


[Figure 5 here: flow graphs of the normal fragments of checkPIN. P1 (SPM): pe (trial > 0), p1 (pin = p), p2 (val := ⊤; trial := MAX), p3 (val := ⊥; trial := trial − 1), exit px. P′1 (FSP): p′e (trial > 0), p′1 (length = l; i := 0), p′′1 (loop: i < l, pin[i] = p[i], i := i + 1), p′2 (val := ⊤; trial := MAX), p′3 (val := ⊥; trial := trial − 1), exit p′x.]

Figure 5: P1 is on the left and P′1 is on the right.

?

To apply the theory of inter-program properties to proving an RCR between an SPM and an FSP, we

prove the RCR between each corresponding commands separately. Let Obs be a one-to-one correspon-

dence between observable variables of the SPM and of the FSP. There is an RCR between the SPM and the

FSP of a command C if the following conditions hold. For any run R of the command C in the FSP from

a state σ1, there is a run R′of the same command in the SPM from a state σ′

?

• if R is terminating (or the run reaches the exit of normal or abrupt fragment), then so is R′,

1such that σ1and σ′

1satisfy

x∈OSPMx = Obs(x), and

• when R and R′are terminating with, respectively, states σ2and σ′

such that

2, R and R′emit the same event ε

– if ε ?= Abrupt, then σ2and σ′

– otherwise σ2and σ′

2satisfy?

x∈OSPMx = Obs(x);

2satisfy x = Obs(x) for all x ∈ Ab(OSPM).

EXAMPLE 6.2 In this example we show that there is an RCR between the SPM and the FSP of the command checkPIN used for PIN verification. Let us first consider the flow graphs representing the normal fragments of the SPM and of the FSP. Call the former flow graph P1 and the latter P′1. These flow graphs are depicted in Figure 5. The edge (p2,px) emits a Pass event, while the other edges coming to px emit a Fail event. Similarly, the edge (p′2,p′x) emits a Pass event, while the other edges coming to p′x emit a Fail event.

For clarity, we assume that the SPM and the FSP have disjoint sets of variables. To this end, we consider that all variables in the FSP are in primed notation. Let the sets

OSPM = {trial, pin, p, val, MAX, ε}
OFSP = {trial′, pin′, p′, val′, MAX′, ε′}

be the sets of observable variables of, respectively, the SPM and the FSP, such that the one-to-one correspondence Obs between OSPM and OFSP maps each variable in OSPM to its primed counterpart in OFSP.

Note that pin in the SPM has a scalar type but pin′ in the FSP has an array type, so we have to define the equivalence between pin and pin′. First, every array PIN p has a length l associated with the array; we write the association as a pair (p,l). We introduce a predicate ≡ between such pairs such that, given


[Figure 6 here: flow graphs of the abrupt fragments. P2 (SPM): entry ae, an Abrupt edge with val := ⊥, exit ax. P′2 (FSP): entry a′e, an Abrupt edge with val := ⊥, exit a′x.]

Figure 6: P2 is on the left and P′2 is on the right.

array PINs p, p′ and lengths l, l′, we say that (p,l) ≡ (p′,l′) if l = l′ and, for all i = 0,...,l − 1, we have p[i] = p′[i]. Next we introduce a predicate ∼ between scalar PINs and array PINs. The predicate ∼ is axiomatized as follows: for every scalar PINs w, x and every array PINs y, z,

x ∼ y ⇒ (y ≡ z ⇔ x ∼ z)
x ∼ y ⇒ (w = x ⇔ w ∼ y).

The predicate ∼ defines the equality between a scalar PIN and an array PIN.
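For illustration, the two predicates admit a simple concrete model — our assumption, not the report's formalization — in which a scalar PIN is a tuple of digits and an array PIN is a list paired with its length. In this model both predicates are decidable, and the two axioms can be checked on instances.

```python
# A concrete model of the predicates ≡ and ∼ (illustrative assumption).

def equiv(pl1, pl2):
    """(p,l) ≡ (p',l'): same length and element-wise equal up to it."""
    (p1, l1), (p2, l2) = pl1, pl2
    return l1 == l2 and all(p1[i] == p2[i] for i in range(l1))

def sim(x, pl):
    """x ∼ (p,l): the scalar PIN x equals the array PIN p of length l."""
    p, l = pl
    return len(x) == l and all(x[i] == p[i] for i in range(l))

# Checking the two axioms on sample instances:
x, y = (1, 2, 3), ([1, 2, 3], 3)
for z in [([1, 2, 3], 3), ([1, 2, 4], 3)]:
    assert sim(x, y) and (equiv(y, z) == sim(x, z))   # first axiom
for w in [(1, 2, 3), (9, 9, 9)]:
    assert sim(x, y) and ((w == x) == sim(w, y))      # second axiom
```

The axioms thus say exactly that ∼ transports array-PIN equivalence and scalar-PIN equality across the scalar/array boundary.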

The following assertions express the correspondence between the observable variables of the SPM and of the FSP:

φ1 ⇔ trial = trial′
φ2 ⇔ val = val′
φ3 ⇔ pin ∼ (pin′, length′)
φ4 ⇔ p ∼ (p′, l′)
φ5 ⇔ MAX = MAX′
φ6 ⇔ ε = ε′

Next, we define an assertion function Î1 of (P1,P′1) as follows:

Î1(pe,p′e) = ⋀_{i=1}^{6} φi
Î1(p1,p′1) = ⋀_{i=1}^{6} φi ∧ trial > 0
Î1(p1,p′′1) = ⋀_{i=2}^{6} φi ∧ trial > 0 ∧ trial = trial′ + 1 ∧ length′ = l′ ∧ i′ < l′ ∧ (∀j. 0 ≤ j < i′ ⇒ pin′[j] = p′[j])
Î1(p2,p′2) = ⋀_{i=1}^{6} φi ∧ pin = p ∧ (pin′,length′) ≡ (p′,l′)
Î1(p3,p′3) = ⋀_{i=1}^{6} φi ∧ pin ≠ p ∧ (pin′,length′) ≢ (p′,l′)
Î1(px,p′x) = ⋀_{i=1}^{6} φi

The function Î1 is undefined elsewhere.

Denote a path from point p to q in a program-point flow graph by π_{p,q}, and a trivial path consisting only of the single point p by π_p. We define a set Π̂1 of paths of (P1,P′1) consisting of the following paths:

(π_{pe,p1}, π_{p′e,p′1}), (π_{pe,px}, π_{p′e,p′x}), (π_{p1}, π_{p′1,p′′1}), (π_{p1,px}, π_{p′1,p′x}),
(π_{p1,p2}, π_{p′1,p′2}), (π_{p1}, π_{p′′1,p′′1}), (π_{p1,p2}, π_{p′′1,p′2}), (π_{p1,p3}, π_{p′′1,p′3}),
(π_{p2,px}, π_{p′2,p′x}), (π_{p3,px}, π_{p′3,p′x}).

One can prove that all assertions in the weak verification condition associated with Î1 and Π̂1 are valid.
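The RCR being certified can also be exercised on a hypothetical executable rendering of the two normal fragments, in which MAX, the stored PIN, and the candidate PIN are passed explicitly, and the returned triple (trial, val, event) stands in for the observable variables:

```python
# Hypothetical executable rendering of the Figure 5 normal fragments.

def spm_checkPIN(trial, pin, p, MAX):
    """P1: scalar PIN comparison."""
    val = False
    if trial > 0:                       # pe
        if pin == p:                    # p1
            val, trial = True, MAX      # p2 -> Pass
            return trial, val, "Pass"
        trial = trial - 1               # p3
    return trial, val, "Fail"           # px

def fsp_checkPIN(trial, pin, p, MAX):
    """P'1: array PIN compared element by element."""
    val = False
    if trial > 0:                       # p'e
        if len(pin) == len(p):          # length = l
            i = 0
            while i < len(p) and pin[i] == p[i]:   # p''1 loop
                i = i + 1
            if i == len(p):             # every digit matched
                val, trial = True, MAX
                return trial, val, "Pass"
        trial = trial - 1
    return trial, val, "Fail"

# Obs relates trial~trial', val~val', and the emitted events.
for cand in [(1, 2, 3), (9, 9, 9)]:
    s = spm_checkPIN(3, (1, 2, 3), cand, 3)
    f = fsp_checkPIN(3, [1, 2, 3], list(cand), 3)
    assert s == f
```

The loop in the FSP is exactly what forces the intermediate pair (p1,p′′1) in Î1 above: the SPM sits at p1 while the FSP iterates.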

We now consider the flow graphs of the abrupt fragments of the SPM and of the FSP. Call the former P2 and the latter P′2. These flow graphs are depicted in Figure 6. We define an assertion function Î2 of (P2,P′2) as follows:

Î2(ae,a′e) = ⊤
Î2(ax,a′x) = (val = val′).

The function Î2 is undefined elsewhere. One can prove easily that all assertions in the weak verification condition associated with Î2 and the corresponding set Π̂2 of paths of (P2,P′2) are valid.

From the assertion functions Î1, Î2 and from the sets Π̂1, Π̂2, one can easily see that there is an RCR between the SPM and the FSP of the command checkPIN. First, since for all p ≠ px the function Î1(p,p′x) is undefined, and the set {π′ | ∃π. (π,π′) ∈ Π̂1} is the set of all simple paths induced by the points in P′1, for every pair of runs R′ of P′1 and R of P1 whose entry configurations satisfy Î1(pe,p′e), if the run R′ is terminating, then so is the run R. Since Î1 is weakly extendible and the assertion Î1(px,p′x) ⇒ ⋀_{x∈OSPM} x = Obs(x) is valid, the exit configurations of both runs satisfy ⋀_{x∈OSPM} x = Obs(x).

Now, if a card tear occurs, then the run R′ will go to the entry of P′2. Since the assertion Î2(ae,a′e) is valid, the run R can also go to the entry of P2 such that the entry configurations of the runs R and R′ on reaching the entries of P2 and P′2 satisfy Î2(ae,a′e). With the same reasoning as above, if R′ reaches a′e, then R reaches ae too. Since Î2 is weakly extendible and the assertion Î2(ax,a′x) ⇒ ⋀_{x∈Ab(OSPM)} x = Obs(x) is valid, the exit configurations of both runs satisfy ⋀_{x∈Ab(OSPM)} x = Obs(x). Therefore, there is an RCR between the SPM and the FSP of the command. □

We now focus on RCRs between FSPs and LLDs. Before discussing RCRs, we first describe LLDs. In EDEN2 the language used to write an LLD is a subset of Java. This subset includes the memory characteristics and the transaction mechanism of Java Card [14, 3]. First, in the language of LLDs there are two kinds of memory, persistent memory and transient memory. The difference between these kinds of memory is the following: when power is lost (or a card tear occurs), data stored in the persistent memory is kept, while data stored in the transient memory is lost. In the sequel, variables whose values are stored in the persistent memory are called persistent variables, and variables whose values are stored in the transient memory are called transient variables.

Similar to the FSP, an LLD consists of a set of commands, where each command is a Java method. Card tears are captured using a try-catch construct, where the try part represents the normal fragment of the LLD and the catch part catches a special exception and represents the abrupt fragment of the LLD. The language of LLDs offers a transaction mechanism that resembles the transaction mechanism of the Java Card API. Our modelling of transactions follows the modelling of Java Card transactions in [6]. We introduce a boolean variable inTrans to keep track of whether a transaction is in progress. When a transaction begins, the value of inTrans is set to true, and when it ends, the value of inTrans is set to false. One can set the value of inTrans to false to escape from a transaction. This feature is useful for variables whose updates must be unconditional.
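As an illustration, the two kinds of memory and the inTrans flag can be modelled as below. This is a minimal sketch of our own (the class and method names are not from EDEN2 or from the report), showing how a card tear wipes transient data and aborts an in-progress transaction:

```python
# Hypothetical model of LLD memory kinds and the inTrans transaction flag.
class Card:
    def __init__(self):
        self.persistent = {"trial": 3}   # survives a card tear
        self.transient = {}              # lost on a card tear
        self.in_trans = False            # true while a transaction is in progress

    def begin_transaction(self):
        self.in_trans = True
        self._backup = dict(self.persistent)  # snapshot for rollback

    def commit_transaction(self):
        self.in_trans = False            # updates made in the transaction are kept

    def abort_transaction(self):
        self.persistent = self._backup   # roll back persistent updates
        self.in_trans = False

    def card_tear(self):
        self.transient.clear()           # transient data is lost
        if self.in_trans:
            self.abort_transaction()     # an open transaction is aborted

card = Card()
card.begin_transaction()
card.persistent["trial"] = 2
card.card_tear()                         # tear during the transaction
print(card.persistent["trial"])          # rolled back to 3
```

The rollback on tear reflects the atomicity that the transaction mechanism is meant to provide for persistent updates.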

Similar to FSPs, an LLD is a program that takes as an input a sequence of command calls of the form C(a1,...,an), where C is the command's name and a1,...,an are input arguments. The notion of run of LLDs is the same as the notion of run of FSPs.

Having described LLDs, we now define RCRs between FSPs and LLDs. Let us first denote by Pr(X) the set of persistent variables in the set X of variables of an LLD. Later, in the definition of RCRs between an FSP and an LLD, we require that observable persistent variables of the LLD are updated in the same order as their counterparts of the FSP. But when a transaction is in progress, such an order becomes irrelevant. For example, given a one-to-one correspondence Obs between observable variables of the LLD and of the FSP, if no transaction is in progress and the observable persistent variables of the LLD are updated in the order x1,x2,x3, then their counterparts are updated in the order Obs(x1),Obs(x2),Obs(x3). However, when a transaction is in progress, the order of updating Obs(x1),Obs(x2),Obs(x3) is irrelevant. Moreover, whether a transaction is in progress or not, each variable is updated with the same value as its counterpart. To this end, first, for each persistent variable x of the LLD and its counterpart Obs(x) of the FSP, we associate with both variables an event function Write_x. This function takes as an input the value v of x or Obs(x) and returns an event Write_x(v). The following assertion axiomatizes the event function:

∀x,y,v,w. (Write_x(v) = Write_y(w) ⇔ Write_x = Write_y ∧ v = w),

where the equality Write_x = Write_y denotes a syntactic equality. In the sequel we denote by τx the domain of variable x.
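The axiom says that event functions behave as free constructors: two events coincide exactly when both the function symbol and the written value coincide. A small sketch of our own (names hypothetical) models an event as a pair of the syntactic symbol and the value, which satisfies both directions of the equivalence:

```python
# Model Write_x(v) as a pair (symbol, value); pair equality is componentwise,
# which is exactly what the axiom requires of the event functions.
def write(x, v):
    return ("Write_" + x, v)

assert write("trial", 2) == write("trial", 2)   # same symbol, same value
assert write("trial", 2) != write("trial", 3)   # same symbol, different value
assert write("trial", 2) != write("pin", 2)     # different symbol, same value
```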

Second, the set of events emitted by the LLD is a powerset of the set of events emitted by the FSP. Next, assignments to observable persistent variables and committing transactions emit events in the following way:


• In the try part of the FSP, the update of a variable y, where y = Obs(x) for an observable persistent variable x in the LLD, emits Write_x(v), where v is the updated value of y.

• In the try part of the LLD,

  – if no transaction is in progress, that is, the variable inTrans is false, then the update of an observable persistent variable x emits {Write_x(v)}, where v is the updated value of x;

  – if a transaction is in progress, that is, the variable inTrans is true, then when inTrans is set to false and beforehand the observable persistent variables x0,...,xn were updated such that the latest updated values of these variables are, respectively, v0,...,vn, the resetting of inTrans emits {Write_x0(v0),...,Write_xn(vn)}, provided the resetting is not caused by aborting the in-progress transaction. When the resetting of inTrans is caused by aborting the in-progress transaction, or when no observable variables were updated, no set of events is emitted.

For comparing events of the LLD and events of the FSP, we say that a nonempty set {ε0,...,εm} of LLD's events matches a sequence ε′0,...,ε′n of FSP's events if (1) m = n, and (2) for all i = 0,...,m, there exists j such that 0 ≤ j ≤ n and ε′j = εi. Now, we say that a sequence ε̂1, ε̂2,... of sets of LLD's events matches a sequence ε′1, ε′2,... of FSP events if either both sequences are of length 0, or there is an increasing sequence n1 < n2 < ... of positive integers such that

1. ε̂1 matches ε′1,...,ε′_{n1}, and

2. for all i ≥ 2, ε̂i matches ε′_{n_{i−1}+1},...,ε′_{n_i}.
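The two matching relations can be sketched operationally as follows. This is our own reformulation (names hypothetical): by condition (1), each set fixes the length of the segment it must match, so the cut points n1 < n2 < ... can be computed greedily:

```python
# A set of LLD events matches a segment of FSP events iff they have the same
# size and every event of the set occurs in the segment.
def set_matches(event_set, segment):
    return (len(event_set) == len(segment)
            and all(e in segment for e in event_set))

# A sequence of event sets matches a sequence of events iff the latter cuts
# into consecutive segments matched by the successive sets.
def sequence_matches(sets, events):
    pos = 0
    for s in sets:
        nxt = pos + len(s)                 # each set fixes its segment length
        if not set_matches(s, events[pos:nxt]):
            return False
        pos = nxt
    return pos == len(events)              # no FSP events may be left over

assert sequence_matches([{"a", "b"}, {"c"}], ["b", "a", "c"])
assert not sequence_matches([{"a"}], ["a", "c"])
```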

Note that the one-to-one correspondence Obs maps variables of the LLD to variables of the FSP. We assume that the FSP and the LLD have disjoint sets of variables. In the sequel, for simplicity, the inverse of Obs is called Obs as well. That is, for any variable x of the LLD and any variable x′ of the FSP, x′ = Obs(x) if and only if x = Obs(x′).

DEFINITION 6.3 Let O_FSP and O_LLD be the sets of observable variables of, respectively, an FSP and an LLD, and Obs be a one-to-one correspondence between these sets. Let the sets

E_FSP = {Pass, Fail, Abrupt} ∪ {Write_x(v) | x ∈ Pr(O_LLD) ∧ v ∈ τ_{Obs(x)}}
E_LLD = {{Pass},{Fail},{Abrupt}} ∪ (P({Write_x(v) | x ∈ Pr(O_LLD) ∧ v ∈ τx}) − {∅})

be the sets of observable events of the FSP and of the LLD, respectively. There is an RCR between the FSP and the LLD if, for every run

R|E_LLD = (p0,σ0), ε_{i1}, (p_{i1},σ_{i1}), ...

of the LLD, there is a run

R′|E_FSP = (p′0,σ′0), ε′_{j1}, (p′_{j1},σ′_{j1}), ...

of the FSP, where for all x ∈ O_LLD, we have σ0(x) = σ′0(Obs(x)), such that there is an increasing sequence n1 < n2 < ... of positive integers such that

1. ε_{i1} matches ε′_{j1},...,ε′_{j_{n1}}, and

2. for all k > 1, ε_{ik} matches ε′_{j_{n_{k−1}}+1},...,ε′_{j_{n_k}},

and

• for all l, if ε_{il} ≠ {Pass}, ε_{il} ≠ {Fail}, and ε_{il} ≠ {Abrupt}, then σ_{il}(y) = σ′_{j_{n_l}}(Obs(y)) for all y ∈ Pr(O_LLD); otherwise

• σ_{il}(x) = σ′_{j_{n_l}}(Obs(x)) for all x ∈ O_LLD.


[Figure 7 displays the program-point flow graphs of the try parts of the command checkPIN: the FSP P1 over the points pe, p1,...,p6, px, and the LLD P′1 over the points p′e, p′1,...,p′8, p′x, with tests such as length = l, trial > 0, i < l, pin[i] = p[i], inTrans, and assignments such as trial := trial − 1, i := 0, i := i + 1, trial := MAX, val := ⊤, val := ⊥, inTrans := ⊤, inTrans := ⊥, tb := trial.]

Figure 7: P1 is on the left and P′1 is on the right.

Similar to the RCR between an SPM and an FSP, we use the special variable ε to store the events emitted by the FSP and the LLD. For the RCR between an FSP and an LLD, emitting an event means concatenating the event to the current value of the special variable ε. Particularly for the LLD, we use another special variable εt to keep track of the updated observable persistent variables when a transaction is in progress. When the variable inTrans is set to true, the variable εt is set to the empty set. During the transaction, any update to an observable persistent variable x with value v is recorded by updating εt with εt ∪ {Write_x(v)}. When the variable inTrans is set to false, the variable ε is set to ε;εt only if the resetting of inTrans is not caused by aborting the in-progress transaction. Moreover, when the LLD emits a Pass or Fail event and a transaction is in progress, then ε is updated with ε;εt;Pass or ε;εt;Fail, respectively. When a card tear occurs and the LLD emits Abrupt, the content of εt is discarded and ε is updated with ε;Abrupt. When an observable persistent variable is updated more than once in a transaction, one can always translate the LLD into SSA form [2] so that the program text contains only one assignment to each variable.
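The bookkeeping with ε and εt can be sketched as follows. This is a minimal model of our own (the names eps, eps_t, set_in_trans, and update are hypothetical): singleton events are emitted directly outside a transaction, only the latest written value per variable is buffered inside one, and the buffer is flushed as a single set on commit:

```python
eps = []        # value of the special variable ε: the emitted event trace
eps_t = {}      # value of ε_t: latest buffered value per variable
in_trans = False

def set_in_trans(value, aborted=False):
    global in_trans, eps_t
    if value:
        eps_t = {}                                   # transaction begins
    elif not aborted and eps_t:                      # commit flushes the buffer
        eps.append(frozenset(("Write_" + x, v) for x, v in eps_t.items()))
    in_trans = value

def update(x, v):
    if in_trans:
        eps_t[x] = v                                 # only the latest value is kept
    else:
        eps.append(frozenset({("Write_" + x, v)}))   # emit a singleton set

set_in_trans(True)
update("trial", 3)
update("trial", 2)        # overwrites: only the latest value will be emitted
update("pin", "1234")
set_in_trans(False)       # commit: one set containing two Write events
print(len(eps), len(eps[0]))   # 1 2
```

Keeping only the latest value per variable plays the role that the SSA translation plays in the text: repeated updates within a transaction collapse to one emitted write.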

Similar to RCRs between SPMs and FSPs, we apply the theory of inter-program properties to proving an RCR between an FSP and an LLD by proving the RCR between each pair of corresponding commands separately.

EXAMPLE 6.4 We consider again the command checkPIN in this example. Figure 7 depicts the FSP and the LLD of the try parts of the command checkPIN. The flow graph of the FSP is called P1 and is on the lefthand side of the figure, while the other flow graph is the flow graph of the LLD and is called P′1. Persistent variables in P′1 are trial, pin, length, MAX. Other variables are transient. The variable tb is a backup variable for the variable trial.


Let the set

O_FSP = {trial, pin, length, p, l, val, MAX, ε}

be the set of observable variables of the FSP and the set O_LLD be the set of observable variables of the LLD such that O_LLD consists of the primed counterparts of all variables in O_FSP. The one-to-one correspondence Obs between O_FSP and O_LLD maps each variable in O_FSP to its primed counterpart in O_LLD. We express the relationship of observable variables by the following assertions:

φ1 ⇔ pin = pin′ ∧ length = length′ ∧ MAX = MAX′ ∧ trial = trial′
φ2 ⇔ p = p′ ∧ l = l′ ∧ val = val′ ∧ ε = ε′
φ  ⇔ φ1 ∧ φ2 ∧ (inTrans′ ⇒ trial = tb′)

The assertions φ1 and φ2 describe the correspondence of, respectively, persistent and transient variables.

We define an assertion function ˆI1 of (P1,P′1) as follows:

ˆI1(pe,p′e) = ˆI1(p1,p′1) = ˆI1(p1,p′5) = ˆI1(p5,p′6) = ˆI1(p2,p′2)
= ˆI1(p3,p′3) = ˆI1(p3,p′7) = ˆI1(p4,p′4) = ˆI1(p6,p′8) = ˆI1(px,p′x) = φ

Let S1 = {p | ∃p′,ϕ. ˆI1(p,p′) = ϕ} and S′1 = {p′ | ∃p,ϕ. ˆI1(p,p′) = ϕ} be the sets of program points on which ˆI1 is defined. Given a set S of program points in a flow graph, we say that a path π = p0,...,pn in the flow graph is S-simple if n > 0, p0 and pn are in S, and none of p1,...,pn−1 are in S.

We define a set ˆΠ1 of paths of (P1,P′1) as follows: for every S′1-simple path π_{p′,q′},

• there is an S1-simple path π_{p,q} such that ˆI1(p,p′) and ˆI1(q,q′) are defined, or

• there is a trivial path π_p, where p ∈ S1, such that ˆI1(p,p′) and ˆI1(p,q′) are defined.

One can easily prove that the assertions in the verification condition associated with ˆI1 and ˆΠ1 are valid, and thus ˆI1 is weakly extendible.
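The notion of S-simple path lends itself to a direct enumeration by depth-first search. The following sketch is our own code over a hypothetical successor map: it collects the paths that start and end in S without visiting any point of S in between:

```python
# Enumerate the S-simple paths of a flow graph given as a successor map.
def s_simple_paths(succ, s):
    paths = []
    def extend(path):
        for q in succ.get(path[-1], []):
            if q in s:
                paths.append(path + [q])   # ends in S: an S-simple path
            elif q not in path:            # avoid cycles outside S
                extend(path + [q])
    for p in s:
        extend([p])
    return paths

# Tiny example: edges a -> b, a -> c, b -> c, with S = {a, c}.
succ = {"a": ["b", "c"], "b": ["c"]}
print(sorted(s_simple_paths(succ, {"a", "c"})))  # [['a', 'b', 'c'], ['a', 'c']]
```

Note that intermediate points are required only to lie outside S; an endpoint may repeat, so loops through a single point of S are also S-simple.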

We next consider the catch parts of the command checkPIN. The flow graphs P2 and P′2 in Figure 8 are the catch parts of the command. Note that the catch part P2 of the FSP is different from the one shown on the righthand side of Figure 6. The flow graph P2 in Figure 8 updates the variables p and l. The counterparts of these variables in the LLD are transient variables,3 and so on abrupt termination they are set to their default values. Nevertheless, one can easily define an assertion function of the flow graph P2 in Figure 8 and the flow graph P2 of the SPM in Figure 6 such that there is still an RCR between the SPM and the FSP of the command checkPIN.

We define an assertion function ˆI2 of (P2,P′2) as follows:

2) as follows:

ˆI2(ae,a′

ˆI2(a1,a′

ˆI2(ax,a′

e)

1)

x)

=

=

=

φ1∧ p = p′∧ ε = ε′∧ (inTrans′⇒ trial = tb′)

φ1∧ p = p′∧ val = val′∧ ε = ε′

φ.

Note that the assertions φ ⇒ ˆI2(ae,a′e) and ˆI2(ae,a′e) ⇒ ∧_{x∈Pr(O_LLD)} x = Obs(x) ∧ ε = ε′ are valid. Moreover, since the set S′1 above covers P′1, by the weak extendibility of ˆI1, it follows that for every finite run of P′1, there is a finite run of P1 such that the initial configurations of the runs satisfy φ and the last configurations of the runs satisfy ˆI2(ae,a′e).

Let S′2 = {p′ | ∃p,ϕ. ˆI2(p,p′) = ϕ} be the set of points in P′2 such that for each point p′ in S′2, there is a point p in P2 and ˆI2(p,p′) is defined. Similarly, let S2 = {p | ∃p′,ϕ. ˆI2(p,p′) = ϕ}. Let Π_{S′2} be the set of all S′2-simple paths and Π_{S2} be the set of all S2-simple paths. We define a set ˆΠ2 of paths of (P2,P′2) as follows:

ˆΠ2 = {(π_{p,q}, π_{p′,q′}) | ∃ϕ1,ϕ2. (π_{p,q}, π_{p′,q′}) ∈ Π_{S2} × Π_{S′2} and ˆI2(p,p′) = ϕ1 and ˆI2(q,q′) = ϕ2}.

One can prove that the assertions in the weak verification condition associated with ˆI2 and ˆΠ2 are valid. From the assertion functions ˆI1, ˆI2 and the sets ˆΠ1, ˆΠ2, and the weak extendibility of ˆI1 and ˆI2, one can easily see that there is an RCR between the FSP and the LLD of the command checkPIN. □

3 Stack variables are transient variables.


[Figure 8 displays the flow graphs of the catch parts: both P2 (points ae, a1, ax) and P′2 (points a′e, a′1, a′x) emit Abrupt, set val := ⊥, reset the array with the loop j := 0; while j < p.length do p[j] := 0; j := j + 1, and set l := 0; in P′2, if inTrans holds, additionally trial := tb.]

Figure 8: P2 is on the left and P′2 is on the right.

7 Conclusion

We have developed a theory of inter-program properties. The theory forms a basis for describing and proving properties between two programs. The cornerstone of the theory is the notion of weak verification condition, by which one can provide certificates certifying the inter-program properties. The theory itself is abstract and general, in the sense that it can be applied to programs written in any programming language, as long as the language has the weakest precondition property.

We have applied the theory in the translation validation for optimizing compilers and in Common Criteria certification. In translation validation, we have shown that, using the notions of extendible assertion function and weak verification condition, we can capture different notions of correspondence used in different translation validation work. We have also shown that we can prove the equivalence of two programs in the presence of optimizations that introduce or eliminate loops. In Common Criteria certification, we have shown that the theory can be applied to two programs written in different languages, and the theory can also provide certificates certifying representation correspondences between requirements in Common Criteria.

References

[1] Common Criteria for Information Technology Security Evaluation. Version 3.1, CCMB-2007-09-003.

[2] B. Alpern, M.N. Wegman, and F.K. Zadeck. Detecting equality of variables in programs. In Conference Record of the Fifteenth Annual ACM Symposium on Principles of Programming Languages (POPL 1988), pages 1–11, 1988.

[3] Z. Chen. Java Card Technology for Smart Cards. The Java Series. Addison-Wesley, 2000.

[4] Robert W. Floyd. Assigning meaning to programs. In J. T. Schwartz, editor, Proceedings of Symposium in Applied Mathematics, pages 19–32, 1967.

[5] C. A. R. Hoare. An axiomatic basis for computer programming. CACM, 12(10):576–580, 1969.

[6] E.-M.G.M. Hubbers and E. Poll. Reasoning about card tears and transactions in Java Card. In M. Wermelinger and T. Margaria-Steffen, editors, Fundamental Approaches to Software Engineering, 7th International Conference, FASE 2004, volume 2984 of LNCS, pages 114–128. Springer-Verlag, 2004.

[7] Jacques Loeckx, Kurt Sieber, and Ryan D. Stansifer. The Foundations of Program Verification (2nd ed.). John Wiley & Sons, Inc., New York, NY, USA, 1987.

[8] I. Narasamdya and A. Voronkov. Finding basic block and variable correspondence. In Proceedings of the 12th International Static Analysis Symposium (SAS), 2005.

[9] Iman Narasamdya. Establishing Program Equivalence in Translation Validation for Optimizing Compilers. PhD thesis, The University of Manchester, 2007. Downloadable at http://www-verimag.imag.fr/˜narasamd/NarasamdyaThesis.ps.

[10] George C. Necula. Translation validation for an optimizing compiler. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 83–95, June 2000.

[11] A. Pnueli, M. Siegel, and E. Singerman. Translation validation. LNCS, 1384, 1998.

[12] M. Rinard and D. Marinov. Credible compilation with pointers. In Proceedings of the FLoC Workshop on Run-Time Result Verification, Trento, Italy, July 1999.

[13] Xavier Rival. Symbolic transfer function-based approaches to certified compilation. In Proceedings of the 31st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 1–13. ACM Press, 2004.

[14] Sun Microsystems, Inc., Palo Alto, California. Java Card 3.0 Platform Specification, 2008. http://java.sun.com/javacard/3.0/.

[15] Lenore D. Zuck, Amir Pnueli, and Benjamin Goldberg. VOC: A methodology for the translation validation of optimizing compilers. J. UCS, 9(3):223–247, 2003.
