
Java’s Integral Types in PVS

Bart Jacobs

Dep. Comp. Sci., Univ. Nijmegen,

P.O. Box 9010, 6500 GL Nijmegen, The Netherlands.

bart@cs.kun.nl

February 20, 2003

Abstract. This paper extends PVS’s standard bitvector library with multiplication, division and remainder operations, together with associated results. This extension is needed to give appropriate semantics to Java’s integral types in program verification. Special emphasis is therefore put on Java’s widening and narrowing functions in relation to the newly defined operations on bitvectors.

1 Introduction

Many programming languages offer different integral types, represented by different numbers of bits. In Java one has types byte (8 bits), short (16 bits), int (32 bits) and long (64 bits). Additionally, there is a 16 bit type char for unicode characters, see [6, §4.2.1]. It is a usual abstraction in program verification to disregard these differences and interpret all of these types as the unbounded, mathematical integers. This same approach has been followed until recently within the Java verification technology developed in Nijmegen around the LOOP translation tool [1] and the PVS theorem prover [12].

During the last few years the main application area for the LOOP tool has been Java Card based smart cards. Within this setting the abovementioned abstraction of integral types is problematic, for the following reasons.

– Given the limited resources on a smart card, a programmer chooses his/her integral data types as small as possible, so that potential overflows are a concern (see [4, Chapter 14]). Since such overflows do not produce exceptions in Java (as they do in Ada), a precise semantics is needed.

– Communication between a smart card and a terminal uses a special structured byte sequence, called an “apdu”, see [4]. As a result, many low-level operations with bytes occur frequently, such as bitwise negation or masking.

– Unnoticed overflow may form a security risk: imagine you use a short for a sequence number in a security protocol, which is incremented with every protocol run. An overflow then makes you vulnerable to a possible replay attack.
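The last concern can be made concrete with a minimal Java sketch. The class and method names are hypothetical and only illustrate the wrap-around behaviour of a 16 bit counter; they are not taken from any Java Card API.

```java
// Illustrative sketch only: SequenceCounterDemo and next are hypothetical names.
class SequenceCounterDemo {
    // Increment a 16 bit sequence number.
    static short next(short seq) {
        // seq + 1 is computed on ints; the cast narrows the result back to
        // 16 bits and silently wraps around. No exception is thrown.
        return (short) (seq + 1);
    }

    public static void main(String[] args) {
        System.out.println(next(Short.MAX_VALUE)); // prints -32768

        // After 2^16 increments the counter is back at its starting value,
        // so a previously used sequence number becomes valid again.
        short s = 0;
        for (int i = 0; i < 65536; i++) {
            s = next(s);
        }
        System.out.println(s); // prints 0
    }
}
```

A verification that models short as an unbounded integer would never see the second output, which is exactly the situation a replay attack exploits.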

Attention in the theorem proving community has focused mainly on formalising

properties of (IEEE 754) floating-point numbers, see e.g. [2,7,8,14]. Such results are

of interest in the worlds of microprocessor construction and scientific computation.

However, there are legitimate concerns about integral types as well. It is argued in [15]


that Java’s integral types are unsafe, because overflow is not detected via exceptions, and are confusing because of the asymmetric way that conversions work: arguments are automatically promoted, but results are not¹.

The verification approach centered around the LOOP tool uses the specification language JML [10,11] in order to express the required correctness properties for Java programs. Such (simple) properties can also be checked statically with the ESC/Java tool [5], but such checking ignores integral bounds. The theorem prover based approach with the semantics of this paper will take bounds into account. In [3] it is proposed that a specification language (like JML for Java) should use the mathematical (unbounded) integers for describing the results of programs using bounded integral types, because “developers are in a different mindset when reading or writing specifications, particularly when it comes to reasoning about integer arithmetic”. This issue is not resolved yet in the program specification community, and it will not be settled here.

Instead, this paper describes the new semantics for Java’s integral types developed for the LOOP tool. This semantics is based on PVS’s (standard) bitvector library. This PVS library describes bitvectors of arbitrary length, given as a parameter, together with functions bv2nat and bv2int for the unsigned (one’s-complement) and signed (two’s-complement) interpretation of bitvectors. Associated basic operations are defined, such as addition, subtraction, and concatenation. In this paper, the following items are added to this library.

1. Executable definitions. For instance, the standard library contains “definitions by

specification” of the form:

-(bv: bvec[N]): { bvn: bvec[N] | bv2int(bvn) =

IF bv2int(bv) = minint THEN bv2int(bv)

ELSE -(bv2int(bv)) ENDIF}

*(bv1: bvec[N], bv2: bvec[N]): {bv: bvec[2*N] | bv2nat(bv) =

bv2nat(bv1) * bv2nat(bv2)}

Such definitions² are not so useful for our program verifications, because sometimes we need to actually compute outcomes. Therefore we give executable redefinitions of these operations. Then we can compute, for instance, (4*b)&0x0F.

2. Similarly, executable definitions are introduced for division and remainder operations, which are not present in the standard library. We give such definitions both for unsigned and signed interpretations, following standard hardware realisations via shifting of registers. The associated results are non-trivial challenges in theorem proving.

3. Typically for Java we introduce so-called widening and narrowing, for turning a bitvector of length $N$ into one of length $2N$ and back, see [6, §5.1.2 and §5.1.3]. When, for example, a byte is added to a short, both arguments are first “promoted” in Java-speak to integers via widening, and then added. Appropriate results are proven relating for instance widening and multiplication or division.

¹ For a byte (or short) b, the assignment b = b-b leads to a compile time error: the arguments of the minus function are first converted implicitly to int, but the result must be converted explicitly back, as in b = (byte)(b-b).

² Readers familiar with PVS will see that these definitions generate so-called type correctness conditions (TCCs), requiring that the above sets are non-empty. These TCCs can be proved via the inverses int2bv and nat2bv of the (bijective) functions bv2int and bv2nat, see Section 2. The inverses exist because one has bijections, but they are not executable.
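The promotion behaviour mentioned in the items above can be observed directly in Java. The following is a small sketch with hypothetical names (PromotionDemo and its methods); it only restates what the Java compiler does with byte and short operands.

```java
// Illustrative sketch only: PromotionDemo and its method names are hypothetical.
class PromotionDemo {
    // byte + short: both operands are promoted to int before the addition,
    // so the result type is int and cannot overflow for these operand types.
    static int add(byte b, short s) {
        return b + s;
    }

    // b - b has type int. Writing "b = b - b" is rejected by the compiler;
    // the explicit cast narrows the int result back to a byte.
    static byte selfDifference(byte b) {
        return (byte) (b - b);
    }

    // The expression (4*b)&0x0F from the text, evaluated on promoted ints.
    static int maskedQuadruple(byte b) {
        return (4 * b) & 0x0F;
    }
}
```

For example, `PromotionDemo.add((byte) 127, (short) 32767)` yields 32894, a value that fits in neither operand type.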

We show how our definitions of multiplication, division and remainder satisfy all properties listed in the Java Language Specification [6, §15.17.1-3]. In particular we get a good handle on overflow, so that we can prove for the values minint = 0x80000000 and maxint = 0x7FFFFFFF, the truth of the following Java booleans.

    minint - 1 == maxint
    minint * -1 == minint
    minint / -1 == minint
    maxint + 1 == minint
    maxint * maxint == 1

As a result the familiar cancellation laws for multiplication ($a \cdot b = a \cdot c \Rightarrow b = c$, for $a \neq 0$) and for division ($\frac{a \cdot b}{a \cdot c} = \frac{b}{c}$, for $a \neq 0$, $c \neq 0$) do not hold, since:

    minint * -1 == minint * 1
    (minint * -1) / (minint * 1) == 1

But these cancellation laws do hold in case there is no overflow. Similarly, we can prove the crucial property of the familiar mask to turn bytes into nonnegative shorts: for a byte b,

    (short)(b & 0xFF) == ((b >= 0) ? b : (b + 256))
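The Java booleans, the cancellation counterexample and the mask property stated above can all be evaluated in plain Java. The sketch below uses hypothetical names (OverflowIdentities and its methods) and checks the statements by execution, which is of course much weaker than the PVS proofs discussed in this paper.

```java
// Illustrative sketch only: class and method names are hypothetical.
class OverflowIdentities {
    static final int MININT = 0x80000000; // same value as Integer.MIN_VALUE
    static final int MAXINT = 0x7FFFFFFF; // same value as Integer.MAX_VALUE

    // The five Java booleans listed in the text.
    static boolean wrapAroundHolds() {
        return MININT - 1 == MAXINT
            && MININT * -1 == MININT
            && MININT / -1 == MININT
            && MAXINT + 1 == MININT
            && MAXINT * MAXINT == 1;
    }

    // Counterexample to both cancellation laws: a = minint, b = -1, c = 1.
    static boolean cancellationFails() {
        int a = MININT, b = -1, c = 1;
        boolean multiplicationLawFails = (a * b == a * c) && (b != c);
        boolean divisionLawFails = (a * b) / (a * c) != b / c;
        return multiplicationLawFails && divisionLawFails;
    }

    // The mask property for one byte value.
    static boolean maskProperty(byte b) {
        return (short) (b & 0xFF) == ((b >= 0) ? b : (b + 256));
    }

    // The mask property for all 256 byte values.
    static boolean maskPropertyForAllBytes() {
        for (int i = Byte.MIN_VALUE; i <= Byte.MAX_VALUE; i++) {
            if (!maskProperty((byte) i)) {
                return false;
            }
        }
        return true;
    }
}
```

All three checks return true; the parentheses around the conditional expression in maskProperty are needed because == binds more tightly than ?: in Java.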

Integral arithmetic is a very basic topic in computer science (see e.g. [13]). Most theorem provers have a standard bitvector library that covers the basics, developed mostly for hardware verification. But multiplication, division and remainder are typically not included. The contribution of this paper lies in the logical formalisation of these operations and their results, and in linking the outcome to Java’s arithmetic, especially to its widening and narrowing operations. These are the kind of results that “everybody knows” but are hard to find and easy to get wrong.

This paper has a simple structure. It starts by explaining the basics of PVS’s standard

bitvector library. Then, in Section 4 it describes our definition of multiplication with

associated properties. Division and remainder operations are more difficult; they are

first described in unsigned (one’s-complement) form in Section 5, and subsequently in

signed (two’s-complement) form in Section 6. Although the work we have done has

been carried out in the language of a specific theorem prover (namely PVS), we shall

use a general, mathematical notation to describe it.

2 PVS’s standard bitvector library

The distribution of PVS comes with a basic bitvector library³. We sketch some ingredients that will be used later. A bit is defined in PVS as a boolean, but here we shall equivalently use it as an element of $\{0, 1\}$. A bitvector of length $N$ is a function in $\mathrm{bvec}(N) \stackrel{\mathrm{def}}{=} [\mathrm{below}(N) \to \mathrm{bit}]$, where $\mathrm{below}(N)$ is the $N$-element set $\{0, 1, \ldots, N-1\}$ of natural numbers below $N$. For instance, the null bitvector is $\lambda i \in \mathrm{below}(N).\, 0$, which we shall often write as $\vec{0}$, leaving the length $N$ implicit. Similarly, one can write $\vec{1}$ for $\lambda i \in \mathrm{below}(N).\, 1$. It should be distinguished from $1 = \lambda i \in \mathrm{below}(N).\, \mathrm{if}\ i = 0\ \mathrm{then}\ 1\ \mathrm{else}\ 0$.

³ Developed by Butler, Miner, Carreño (NASA Langley), Miller, Greve (Rockwell Collins) and Srivas (SRI International).

The unsigned interpretation of bitvectors is given by the (parametrised) function $\mathrm{bv2nat} : \mathrm{bvec}(N) \to \mathrm{below}(2^N)$, defined as:

$$\mathrm{bv2nat}(bv) \stackrel{\mathrm{def}}{=} \text{bv2nat-rec}(bv, N)$$

where

$$\text{bv2nat-rec}(bv, n) \stackrel{\mathrm{def}}{=} \begin{cases} 0 & \text{if } n = 0 \\ 2^{n-1} \cdot bv(n-1) + \text{bv2nat-rec}(bv, n-1) & \text{if } n > 0 \end{cases} \qquad (1)$$

Clearly, bv2nat is bijective. And also: $\mathrm{bv2nat}(bv) = 0 \Leftrightarrow bv = \vec{0}$, $\mathrm{bv2nat}(bv) = 2^N - 1 \Leftrightarrow bv = \vec{1}$, and $\mathrm{bv2nat}(bv) = 1 \Leftrightarrow bv = 1$.

The signed interpretation is given by a similar function $\mathrm{bv2int} : \mathrm{bvec}(N) \to \{ i \in \mathbb{Z} \mid -2^{N-1} \leq i \wedge i < 2^{N-1} \}$. It is defined in terms of the unsigned interpretation:

$$\mathrm{bv2int}(bv) \stackrel{\mathrm{def}}{=} \begin{cases} \mathrm{bv2nat}(bv) & \text{if } \mathrm{bv2nat}(bv) < 2^{N-1} \\ \mathrm{bv2nat}(bv) - 2^N & \text{otherwise.} \end{cases} \qquad (2)$$

The condition $\mathrm{bv2nat}(bv) < 2^{N-1}$ means that the most significant bit $bv(N-1)$ is $0$. Therefore, this bit is often called the sign bit, when the signed interpretation is used. This bv2int function is also bijective.
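Definitions (1) and (2) translate almost literally into executable code. The following Java sketch is not the PVS library; it assumes a bitvector is represented as a boolean array whose index 0 is the least significant bit, and it only supports lengths from 1 to 62 so that all values fit in a long. All names are hypothetical.

```java
// Illustrative sketch only, assuming bitvectors are boolean arrays with
// index 0 as the least significant bit and 1 <= length <= 62.
class BitvectorInterp {
    // Follows equation (1): recursion on the number n of low-order bits.
    static long bv2natRec(boolean[] bv, int n) {
        if (n == 0) {
            return 0L;
        }
        long bit = bv[n - 1] ? 1L : 0L;
        return (1L << (n - 1)) * bit + bv2natRec(bv, n - 1);
    }

    // Unsigned interpretation.
    static long bv2nat(boolean[] bv) {
        return bv2natRec(bv, bv.length);
    }

    // Follows equation (2): signed interpretation via the unsigned one.
    static long bv2int(boolean[] bv) {
        int n = bv.length;
        long unsigned = bv2nat(bv);
        return unsigned < (1L << (n - 1)) ? unsigned : unsigned - (1L << n);
    }

    // Helper (hypothetical): the 8 bits of a Java byte as a bitvector.
    static boolean[] fromByte(byte b) {
        boolean[] bv = new boolean[8];
        for (int i = 0; i < 8; i++) {
            bv[i] = ((b >> i) & 1) == 1;
        }
        return bv;
    }
}
```

For N = 8 the signed interpretation agrees with Java's own reading of a byte, and the unsigned one with `b & 0xFF`.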

The PVS bitvector library provides various basic operations and results. For instance, there is an (executable) addition operation $+$ on bitvectors, introduced via a recursively defined adder. A unary minus operation $-$ is introduced via a specification, as described in the introduction. Binary minus is then defined as: $bv_1 - bv_2 \stackrel{\mathrm{def}}{=} bv_1 + (-bv_2)$. These operations work for both the unsigned and for the signed interpretation. A typical result is:

$$\mathrm{bv2int}(bv_1 + bv_2) = \begin{cases} \mathrm{bv2int}(bv_1) + \mathrm{bv2int}(bv_2) & \text{if } -2^{N-1} \leq \mathrm{bv2int}(bv_1) + \mathrm{bv2int}(bv_2) \text{ and } \mathrm{bv2int}(bv_1) + \mathrm{bv2int}(bv_2) < 2^{N-1} \\ \mathrm{bv2int}(bv_1) + \mathrm{bv2int}(bv_2) - 2^N & \text{if } \mathrm{bv2int}(bv_1) \geq 0 \text{ and } \mathrm{bv2int}(bv_2) \geq 0 \\ \mathrm{bv2int}(bv_1) + \mathrm{bv2int}(bv_2) + 2^N & \text{otherwise.} \end{cases}$$

The second case deals with overflow, and the third one with underflow. The library shows, among other things, that the structure $(\mathrm{bvec}(N), +, \vec{0}, -)$ is a commutative group.

Also we shall make frequent use of left and right shift operations. For $k \in \mathbb{N}$,

$$\mathrm{lsh}(k, bv) \stackrel{\mathrm{def}}{=} \lambda i \in \mathrm{below}(N).\ \begin{cases} bv(i-k) & \text{if } i \geq k \\ 0 & \text{otherwise.} \end{cases}$$

$$\mathrm{rsh}(k, bv) \stackrel{\mathrm{def}}{=} \lambda i \in \mathrm{below}(N).\ \begin{cases} bv(i+k) & \text{if } i + k < N \\ 0 & \text{otherwise.} \end{cases}$$
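As an informal sanity check, the three-case addition result can be compared with Java's byte arithmetic for N = 8, and the shift definitions can be written out on boolean arrays. This is a sketch under the assumption that index 0 is the least significant bit and that the shift amount k is non-negative; the names are hypothetical and the code is not the PVS adder.

```java
// Illustrative sketch only: names are hypothetical, N = 8 for the addition check.
class AddAndShift {
    // Right-hand side of the addition result for N = 8.
    static int expectedSum(byte x, byte y) {
        int sum = x + y;                      // exact sum, computed on ints
        if (-128 <= sum && sum < 128) {
            return sum;                       // no overflow
        }
        if (x >= 0 && y >= 0) {
            return sum - 256;                 // overflow
        }
        return sum + 256;                     // underflow
    }

    // Left shift by k >= 0: bit i of the result is bv(i-k), or 0 if i < k.
    static boolean[] lsh(int k, boolean[] bv) {
        boolean[] result = new boolean[bv.length];
        for (int i = 0; i < bv.length; i++) {
            result[i] = i >= k && bv[i - k];
        }
        return result;
    }

    // Right shift by k >= 0: bit i of the result is bv(i+k), or 0 if i+k >= N.
    static boolean[] rsh(int k, boolean[] bv) {
        boolean[] result = new boolean[bv.length];
        for (int i = 0; i < bv.length; i++) {
            // k < N - i is the condition i + k < N, written to avoid int overflow.
            result[i] = k < bv.length - i && bv[i + k];
        }
        return result;
    }
}
```

For every pair of bytes x and y, `(byte) (x + y)` equals `expectedSum(x, y)`. Both shifts fill with zeros, so rsh corresponds to an unsigned right shift rather than to Java's sign-preserving >> operator.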


3 Widening and narrowing

As mentioned in the introduction, Java uses so-called widening and narrowing operations to move from one integral type to another. These operations can be described in a parametrised way, as functions:

$$\mathrm{widen} : \mathrm{bvec}(N) \to \mathrm{bvec}(2 * N) \qquad \text{and} \qquad \mathrm{narrow} : \mathrm{bvec}(2 * N) \to \mathrm{bvec}(N)$$

defined as:

$$\mathrm{widen}(bv) \stackrel{\mathrm{def}}{=} \lambda i \in \mathrm{below}(2 * N).\ \begin{cases} bv(i) & \text{if } i < N \\ bv(N-1) & \text{otherwise} \end{cases}$$

$$\mathrm{narrow}(BV) \stackrel{\mathrm{def}}{=} \lambda i \in \mathrm{below}(N).\ BV(i)$$

Thus, narrowing simply ignores the first (most significant) $N$ bits. The key property of widening is that the signed interpretation is unaffected, in the sense that:

$$\mathrm{bv2int}(\mathrm{widen}(bv)) = \mathrm{bv2int}(bv)$$

A theme that will re-appear several times in this paper is that after widening there is no overflow:

$$\begin{aligned} \mathrm{bv2int}(\mathrm{widen}(bv_1) + \mathrm{widen}(bv_2)) &= \mathrm{bv2int}(bv_1) + \mathrm{bv2int}(bv_2) \\ \mathrm{bv2int}(-\mathrm{widen}(bv)) &= -\mathrm{bv2int}(bv) \end{aligned} \qquad (3)$$

There are similar results about narrowing. First:

$$\mathrm{narrow}(\mathrm{widen}(bv)) = bv.$$

But also:

$$\begin{aligned} \mathrm{narrow}(BV_1 + BV_2) &= \mathrm{narrow}(BV_1) + \mathrm{narrow}(BV_2) \\ \mathrm{narrow}(-BV) &= -\mathrm{narrow}(BV) \end{aligned}$$

The LOOP tool uses widening and narrowing in the translation of Java’s arithmetical expressions. For instance, for a byte b and short s, a Java expression

    (short)(b + 2*s)

is translated into PVS as:

$$\mathrm{narrow}(\mathrm{widen}(\mathrm{widen}(b)) + 2 * \mathrm{widen}(s))$$

because the arguments are “promoted” in Java to 32 bit integers before addition and

multiplication are applied.

In this way we can explain (and verify in PVS) that for byte b = -128, one has in Java: b-1 is $-129$ and (byte)(b-1) is $127$.
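The widening and narrowing functions of this section can be mirrored in Java on boolean arrays, next to the Java-level expressions they explain. The sketch below is only an illustration: it assumes index 0 is the least significant bit and lengths of at most 31 bits before widening, and all names (WidenNarrow, signedValue, bits, example) are hypothetical.

```java
import java.util.Arrays;

// Illustrative sketch only, assuming boolean arrays with index 0 as the
// least significant bit and at most 31 bits before widening.
class WidenNarrow {
    // widen: copy the N bits and repeat the sign bit bv(N-1) in the upper half.
    static boolean[] widen(boolean[] bv) {
        int n = bv.length;
        boolean[] result = new boolean[2 * n];
        for (int i = 0; i < 2 * n; i++) {
            result[i] = i < n ? bv[i] : bv[n - 1];
        }
        return result;
    }

    // narrow: keep only the lower half of the bits.
    static boolean[] narrow(boolean[] big) {
        return Arrays.copyOf(big, big.length / 2);
    }

    // Signed (two's-complement) value of a bitvector.
    static long signedValue(boolean[] bv) {
        long value = 0L;
        for (int i = 0; i < bv.length; i++) {
            if (bv[i]) {
                value += 1L << i;
            }
        }
        return bv[bv.length - 1] ? value - (1L << bv.length) : value;
    }

    // The n low-order bits of a value as a bitvector.
    static boolean[] bits(long value, int n) {
        boolean[] bv = new boolean[n];
        for (int i = 0; i < n; i++) {
            bv[i] = ((value >> i) & 1L) == 1L;
        }
        return bv;
    }

    // The Java expression from the text: operands are promoted to int,
    // the int result is narrowed to a short.
    static short example(byte b, short s) {
        return (short) (b + 2 * s);
    }

    public static void main(String[] args) {
        byte b = -128;
        System.out.println(b - 1);          // prints -129
        System.out.println((byte) (b - 1)); // prints 127
    }
}
```

Running signedValue on widen(bits(b, 8)) returns b for every byte b, which is the key property of widening for N = 8.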