Formal Modelling of a Microcontroller Instruction Set in B

Valério Medeiros Jr¹, David Déharbe¹

Federal University of Rio Grande do Norte, Natal RN 59078-970, Brazil

Abstract. This paper describes an approach to model the functional aspects of the instruction set of microcontroller platforms using the notation of the B method. The paper presents specifically the case of the Z80 platform. This work is a contribution towards the extension of the B method to handle developments up to assembly level code.

1 Introduction

The B method [1] supports the construction of models of safety systems, whose correctness is guaranteed by the verification of proofs. An initial abstract model of the system requirements is defined and then refined step by step until an implementation model is reached. Development environments based on the B method also include source code generators for programming languages, but the result of this translation cannot be compared with its source by formal means. A recent paper [4] presented an approach to extend the scope of the B method up to the assembly language level. One key component of this approach is to build, within the framework of the B method, formal models of the instruction sets of such assembly languages.

This work gives an overview of the formal modelling of the instruction set of the Z80 microcontroller [6]¹. Using the responsibility division mechanism provided by B, auxiliary libraries of basic modules were developed as part of the construction of the microcontroller model. These libraries contain many definitions of concepts common to microcontrollers; besides the Z80 model, they are used by two other microcontroller models that are under way.

Other possible uses of a formal model of a microcontroller instruction set include documentation, the construction of simulators, and possibly serving as the starting point of a verification effort for the actual implementation of a Z80 design. Moreover, the model of the instruction set could be instrumented with non-functional aspects, such as the number of cycles it takes to execute an instruction, to prove lower and upper bounds on the execution time of a routine. The goal of this project, though, is to provide a basis for the generation of software artifacts at the assembly level that are amenable to refinement verification within the B method.

This paper focuses on the presentation of the Z80 model, including the elementary libraries that describe hardware aspects. The paper is structured as follows. Section 2 provides a short introduction to the B method. Section 3 presents the elementary libraries and the modelling of some elements common to microcontrollers. Section 4 presents the B model of the Z80 instruction set. Section 5 provides some information on the proof effort needed to analyze the presented models. Related work is discussed in Section 6. Finally, the last section is devoted to the conclusions.

¹ Readers interested in more details are invited to visit our repository at: http://code.google.com/p/b2asm.

2 Introduction to the B Method

The B method for software development [1] is based on the B Abstract Machine Notation (AMN) and the use of formally proved refinements up to a specification sufficiently concrete that programming code can be automatically generated from it. Its mathematical basis consists of first order logic, integer arithmetic and set theory, and its corresponding constructs are similar to those of the Z notation.

A B specification is structured in modules. A module defines a set of valid states, including a set of initial states, and operations that may provoke a transition between states. The design process starts with a module containing a so-called functional model of the system under development. In this initial modelling stage, the B method requires the user to prove that, in a machine, all initial states are valid, and that operations do not define transitions from valid states to invalid states.

Essentially, a B module contains two main parts: a header and the available operations. Figure 1 shows a very basic example. The MACHINE clause gives the name of the module. The next two clauses, SEES and INCLUDES, respectively reference external modules and create an instance of an external module. The VARIABLES clause declares the names of the variables that compose the state of the machine. Next, the INVARIANT clause defines the type of the variables and other restrictions on them. The INITIALISATION clause specifies the initial states. Finally, operations correspond to the transitions between states of the machine.

MACHINE micro
SEES TYPES, ALU
INCLUDES MEMORY
VARIABLES pc
INVARIANT pc ∈ INSTRUCTION
INITIALISATION pc := 0
OPERATIONS
    JMP(jump) =
        PRE jump ∈ INSTRUCTION
        THEN pc := jump
        END
END

Fig. 1. A very basic B machine.
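To make the roles of the INVARIANT and PRE clauses concrete, the machine of Fig. 1 can be mimicked in Python. This is only an illustrative sketch: B discharges these conditions statically by proof, whereas the code below checks them at run time. The class name `Micro`, the exception `PreconditionError` and the size of INSTRUCTION (0..65535) are assumptions made for the example and do not come from the model.

```python
class PreconditionError(Exception):
    """Raised when an operation is called outside its PRE clause."""


class Micro:
    """Run-time analogue of the B machine `micro` (illustrative only)."""

    def __init__(self, instruction_count=65536):
        # Assumption: INSTRUCTION is the integer range 0..instruction_count-1.
        self.instruction = range(instruction_count)
        self.pc = 0  # INITIALISATION pc := 0
        assert self.invariant(), "initial state must satisfy the invariant"

    def invariant(self):
        # INVARIANT pc ∈ INSTRUCTION
        return self.pc in self.instruction

    def jmp(self, jump):
        # PRE jump ∈ INSTRUCTION
        if jump not in self.instruction:
            raise PreconditionError(f"jump={jump!r} is not in INSTRUCTION")
        # THEN pc := jump
        self.pc = jump
        assert self.invariant(), "operation must preserve the invariant"


m = Micro()
m.jmp(100)
```

The two assertions correspond to the two proof obligations mentioned above: the initial state is valid, and each operation called within its precondition leads from a valid state to a valid state.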


3 Model structure and basic components

We have developed a reusable set of basic definitions to model hardware concepts and data type concepts. These definitions are grouped into two separate development projects and are available as libraries. A third project is devoted to the higher-level aspects of the platform. Thus, the workspace is composed of a hardware library, a types library and a project for the specific platform, in this case the Z80. The corresponding dependency diagram is depicted in Figure 2; information specific to each project is presented in the following.

Fig. 2. Dependency diagram of the Z80 model.

3.1 Bit Representation and Manipulation

The entities defined in the module BIT_DEFINITION are the type for bits, logical operations on bits (negation, conjunction, disjunction, exclusive disjunction), as well as a conversion function from booleans to bits.

First, bits are modelled as a set of integers: BIT = 0..1. Negation is a unary function on bits, defined as:

bit_not ∈ BIT → BIT ∧ ∀(bb).(bb ∈ BIT ⇒ bit_not(bb) = 1 − bb)

The module also provides lemmas on negation that may be useful for the

users of the library to develop proofs:

∀(bb).(bb ∈ BIT ⇒ bit_not(bit_not(bb)) = bb)

Conjunction is a binary function on bits, defined as:

bit_and ∈ BIT × BIT → BIT ∧
∀(b1, b2).(b1 ∈ BIT ∧ b2 ∈ BIT ⇒
    ((bit_and(b1, b2) = 1) ⇔ (b1 = 1) ∧ (b2 = 1)))

The module also provides the following lemmas for conjunction, stating commutativity and associativity:

∀(b1, b2).(b1 ∈ BIT ∧ b2 ∈ BIT ⇒
    (bit_and(b1, b2) = bit_and(b2, b1))) ∧
∀(b1, b2, b3).(b1 ∈ BIT ∧ b2 ∈ BIT ∧ b3 ∈ BIT ⇒
    (bit_and(b1, bit_and(b2, b3)) = bit_and(bit_and(b1, b2), b3)))


The module provides definitions of bit_or (disjunction) and bit_xor (exclusive disjunction), as well as lemmas on those operators. These are standard and their expression in B is similar to that of bit_and; they are thus omitted.

Finally, the conversion from booleans to bits is simply defined as:

bool_to_bit ∈ BOOL → BIT ∧ bool_to_bit = {TRUE ↦ 1, FALSE ↦ 0}

Observe that all the lemmas that are provided in this module have been

mechanically proved by the theorem prover included with our B development

environment. None of these proofs requires human insight.
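Because BIT has only two elements, the lemmas of this module can also be checked by plain enumeration, which gives some intuition for why no human insight is needed. The following Python sketch mirrors the definitions above. The bodies of `bit_or` and `bit_xor` are assumptions (the standard truth tables), since the text omits them, and `check_lemmas` is a hypothetical helper name.

```python
from itertools import product

BIT = (0, 1)


def bit_not(bb):
    # bit_not(bb) = 1 − bb
    return 1 - bb


def bit_and(b1, b2):
    # bit_and(b1, b2) = 1 ⇔ b1 = 1 ∧ b2 = 1
    return 1 if (b1 == 1 and b2 == 1) else 0


def bit_or(b1, b2):
    # Assumed standard disjunction (definition omitted in the text).
    return 1 if (b1 == 1 or b2 == 1) else 0


def bit_xor(b1, b2):
    # Assumed standard exclusive disjunction (definition omitted in the text).
    return 1 if b1 != b2 else 0


# bool_to_bit = {TRUE ↦ 1, FALSE ↦ 0}
BOOL_TO_BIT = {True: 1, False: 0}


def bool_to_bit(b):
    return BOOL_TO_BIT[b]


def check_lemmas():
    """Exhaustively check the lemmas stated for negation and conjunction."""
    for bb in BIT:
        assert bit_not(bit_not(bb)) == bb  # double negation
    for b1, b2 in product(BIT, repeat=2):
        assert bit_and(b1, b2) == bit_and(b2, b1)  # commutativity
    for b1, b2, b3 in product(BIT, repeat=3):
        # associativity
        assert bit_and(b1, bit_and(b2, b3)) == bit_and(bit_and(b1, b2), b3)
    return True
```

Enumeration is not a substitute for the mechanical proofs; it only works here because the carrier set is finite and tiny.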

3.2 Representation and Manipulation of Bit Vectors

Sequences are predefined in B as functions whose domain is an integer range with lower bound 1 (one). Indices in bit vectors usually range from 0 (zero) upwards, and the model we propose obeys this convention by making a one-position shift where necessary. This shift is needed in order to use the predefined functions on sequences. We thus define bit vectors as non-empty sequences of bits, and BIT_VECTOR is the set of all such sequences: BIT_VECTOR = seq1(BIT).

The function bv_size returns the size of a given bit vector. It is basically a wrapper for the predefined function size that applies to sequences.

bv_size ∈ BIT_VECTOR → ℕ₁ ∧
bv_size = λbv.(bv ∈ BIT_VECTOR | size(bv))

We also define two functions bv_set and bv_clear that, given a bit vector and a position in the bit vector, return the bit vector resulting from setting the corresponding position to 1 or to 0, respectively, and a function bv_get that, given a bit vector and a valid position, returns the value of the bit at that position. Only the first definition is shown here:

bv_set ∈ BIT_VECTOR × ℕ → BIT_VECTOR ∧ bv_set =
    λv, n.(v ∈ BIT_VECTOR ∧ n ∈ ℕ ∧ n < bv_size(v) | v <+ {n+1 ↦ 1})
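The one-position shift can be illustrated with a small Python sketch. Python tuples are indexed from 0, so a physical position n maps directly to tuple index n, which plays the role of sequence index n+1 in the B definition. The bodies of `bv_clear` and `bv_get` are assumptions written by analogy with bv_set, since the text does not show them, and `_check` is a hypothetical helper.

```python
def bv_size(v):
    # Wrapper around the predefined size function, as in the B model.
    return len(v)


def _check(v, n):
    # Hypothetical helper: enforce v ∈ BIT_VECTOR ∧ n ∈ ℕ ∧ n < bv_size(v).
    if len(v) == 0 or any(b not in (0, 1) for b in v):
        raise ValueError("not a non-empty sequence of bits")
    if not (0 <= n < bv_size(v)):
        raise IndexError("position outside the bit vector")


def bv_set(v, n):
    # Override the element at physical position n with 1.
    _check(v, n)
    return v[:n] + (1,) + v[n + 1:]


def bv_clear(v, n):
    # Assumed by analogy with bv_set: override position n with 0.
    _check(v, n)
    return v[:n] + (0,) + v[n + 1:]


def bv_get(v, n):
    # Assumed definition: value of the bit at physical position n.
    _check(v, n)
    return v[n]
```

Like the overriding operator in B, these functions return a new vector and leave the argument unchanged, and the size of the result equals the size of the input.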

Additionally, the module provides definitions for the classical logical combinations of bit vectors: bv_not, bv_and, bv_or and bv_xor. Only the first two are presented here. Observe that the domain of the binary operators is restricted to pairs of bit vectors of the same length:

bv_not ∈ BIT_VECTOR → BIT_VECTOR ∧
bv_not = λv.(v ∈ BIT_VECTOR | λi.(i ∈ 1..bv_size(v) | bit_not(v(i)))) ∧
bv_and ∈ BIT_VECTOR × BIT_VECTOR → BIT_VECTOR ∧
bv_and = λv1, v2.(v1 ∈ BIT_VECTOR ∧ v2 ∈ BIT_VECTOR ∧
    bv_size(v1) = bv_size(v2) | λi.(i ∈ 1..bv_size(v1) | bit_and(v1(i), v2(i))))

We provide several lemmas on bit vector operations. These lemmas express

properties on the size of the result of the operations as well as classical algebraic

properties such as associativity and commutativity.
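As an informal illustration of such lemmas, the Python sketch below lifts the bit operations pointwise to tuples of bits and checks size preservation, commutativity and associativity of bv_and by enumeration over all vectors of a small fixed length. The helper names `all_bit_vectors` and `check_lemmas` are hypothetical, and the check is a finite test rather than a proof for arbitrary lengths.

```python
from itertools import product


def bit_not(b):
    return 1 - b


def bit_and(a, b):
    return 1 if (a == 1 and b == 1) else 0


def bv_not(v):
    # Pointwise negation: the result has the same size as v.
    return tuple(bit_not(b) for b in v)


def bv_and(v1, v2):
    # The domain is restricted to pairs of vectors of the same length.
    if len(v1) != len(v2):
        raise ValueError("bit vectors must have the same length")
    return tuple(bit_and(a, b) for a, b in zip(v1, v2))


def all_bit_vectors(n):
    # Hypothetical helper: every bit vector of length n.
    return list(product((0, 1), repeat=n))


def check_lemmas(n=3):
    vectors = all_bit_vectors(n)
    for v in vectors:
        assert len(bv_not(v)) == len(v)  # size is preserved
        assert bv_not(bv_not(v)) == v  # double negation
    for v1, v2 in product(vectors, repeat=2):
        assert len(bv_and(v1, v2)) == len(v1)  # size is preserved
        assert bv_and(v1, v2) == bv_and(v2, v1)  # commutativity
    for v1, v2, v3 in product(vectors, repeat=3):
        # associativity
        assert bv_and(v1, bv_and(v2, v3)) == bv_and(bv_and(v1, v2), v3)
    return True
```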

3.3 Modelling Bytes and Bit Vectors of Length 16

Bit vectors of length 8 are bytes. They form a common entity in hardware design. We provide the following definitions:

BYTE_WIDTH = 8 ∧ BYTE_INDEX = 1..BYTE_WIDTH ∧
PHYS_BYTE_INDEX = 0..(BYTE_WIDTH − 1) ∧
BYTE = {bt | bt ∈ BIT_VECTOR ∧ bv_size(bt) = BYTE_WIDTH} ∧
BYTE_ZERO ∈ BYTE ∧ BYTE_ZERO = BYTE_INDEX × {0}

BYTE_INDEX is the domain of the functions modelling bytes. It starts at 1 to comply with the definition of sequences in B. However, it is common in hardware architectures to start indexing from zero, and the definition PHYS_BYTE_INDEX is used to provide functionality obeying this convention. The BYTE type is a specialization of BIT_VECTOR with a fixed size. Other specific definitions are provided to facilitate further modelling: the type BV16 is created for bit vectors of length 16 in a similar way.

3.4 Bit Vector Arithmetics

Bit vectors are used to represent and combine numbers from integer ranges (signed or unsigned). Therefore, our library includes functions to manipulate such data, for example the function bv_to_nat, which maps bit vectors to natural numbers:

bv_to_nat ∈ BIT_VECTOR → ℕ ∧
bv_to_nat = λv.(v ∈ BIT_VECTOR | Σi.(i ∈ dom(v) | v(i) × 2^i))

An associated lemma is: ∀n.(n ∈ ℕ₁ ⇒ bv_to_nat(nat_to_bv(n)) = n)
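The round-trip lemma can be exercised with a short Python sketch. Two assumptions are made here and should be read as such: the bit at physical position p (0-based, least significant bit first) is taken to carry weight 2^p, so that every n ≥ 1 is representable, and the body of `nat_to_bv` is hypothetical, since the text names the function but does not define it.

```python
def bv_to_nat(v):
    # Assumption: tuple index p is the physical position, with weight 2**p.
    return sum(bit << p for p, bit in enumerate(v))


def nat_to_bv(n):
    # Hypothetical definition: shortest least-significant-bit-first encoding.
    if n < 1:
        raise ValueError("nat_to_bv is only used here for n in N1")
    bits = []
    while n > 0:
        bits.append(n & 1)
        n >>= 1
    return tuple(bits)


# Finite check of the lemma bv_to_nat(nat_to_bv(n)) = n for small n.
round_trip_ok = all(bv_to_nat(nat_to_bv(n)) == n for n in range(1, 1025))
```

The restriction of the lemma to ℕ₁ matches the fact that bit vectors are non-empty: under this encoding, the value 0 has no shortest non-empty representation unless one is chosen by convention.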

3.5 Basic Data Types

The instruction sets of microcontrollers usually share common data types. These types are placed in the types library. Each type module has functions to manipulate and convert its data. There are six common basic data types represented by modules; see Table 1 for details.

Table 1. Descriptions of basic data types

Type name      UCHAR    SCHAR      USHORTINT  SSHORTINT      BYTE    BV16
Range          0..255   -128..127  0..65535   -32768..32767  –       –
Physical size  1 byte   1 byte     2 bytes    2 bytes        1 byte  2 bytes
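The ranges in Table 1 follow from the physical sizes. The Python sketch below recomputes them, assuming 8-bit bytes and a two's complement interpretation for the signed types; this assumption is consistent with the ranges in the table but is not stated explicitly in the text. The function names are hypothetical.

```python
def unsigned_range(n_bytes):
    # An unsigned value on 8*n_bytes bits lies in 0 .. 2^bits − 1.
    bits = 8 * n_bytes
    return (0, 2 ** bits - 1)


def signed_range(n_bytes):
    # Assumption: two's complement, so −2^(bits−1) .. 2^(bits−1) − 1.
    bits = 8 * n_bytes
    return (-(2 ** (bits - 1)), 2 ** (bits - 1) - 1)


TYPE_RANGES = {
    "UCHAR": unsigned_range(1),
    "SCHAR": signed_range(1),
    "USHORTINT": unsigned_range(2),
    "SSHORTINT": signed_range(2),
}
```

BYTE and BV16 have no numeric range in the table because they are plain bit vectors; a range only appears once an interpretation (signed or unsigned) is chosen.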

Usually, each type module just needs to instantiate concepts that were already defined in the hardware modelling library. For example, the function bv_to_nat from bit vector arithmetics is specialized to byte_uchar. As the set BYTE is a subset of BIT_VECTOR, this function can be defined as follows:

byte_uchar ∈ BYTE → ℕ ∧
byte_uchar = λ(v).(v ∈ BYTE | bv_to_nat(v))

The definitions of the types library reuse the basic definitions from the hardware library. This provides greater confidence and facilitates the proof process, because the prover can reuse the previously defined lemmas.

The inverse function uchar_byte is easily defined:

uchar_byte ∈ UCHAR → BYTE ∧
uchar_byte = (byte_uchar)⁻¹
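Defining uchar_byte as a relational inverse only yields a total function on UCHAR because byte_uchar is a bijection between BYTE and 0..255. The Python sketch below makes this visible by building both maps as finite dictionaries. It assumes that bytes are tuples of 8 bits with the least significant bit first and that position p carries weight 2^p; both are assumptions of the sketch, not statements from the model.

```python
from itertools import product

BYTE_WIDTH = 8

# BYTE: all bit vectors of size BYTE_WIDTH, represented as tuples of bits.
BYTE = list(product((0, 1), repeat=BYTE_WIDTH))


def bv_to_nat(v):
    # Assumption: tuple index p is the physical position, with weight 2**p.
    return sum(bit << p for p, bit in enumerate(v))


# byte_uchar = λ(v).(v ∈ BYTE | bv_to_nat(v)), tabulated as a dictionary.
byte_uchar = {v: bv_to_nat(v) for v in BYTE}

# uchar_byte = (byte_uchar)⁻¹, obtained by swapping keys and values.
uchar_byte = {n: v for v, n in byte_uchar.items()}

# The inverse is a function on the whole of UCHAR = 0..255 because no two
# bytes map to the same number and every number in the range is reached.
is_bijection = sorted(uchar_byte) == list(range(256))
```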

Similarly, several other functions and lemmas were created for all other data

types.

4 Description of the Z80 B model

The Z80 is a CISC microcontroller developed by Zilog [6]. It supports 158 different instructions, all of which were specified. These instructions are classified into the following categories: load and exchange; block transfer and search; arithmetic and logical; rotate and shift; bit manipulation; jump, call and return; input/output; and basic CPU control.

The main module includes an instance of the memory module and accesses the definitions from the basic data type modules and the ALU module.

MACHINE
    Z80
INCLUDES
    MEMORY
SEES
    ALU, BIT_DEFINITION, BIT_VECTOR_DEFINITION,
    BYTE_DEFINITION, BV16_DEFINITION,
    UCHAR_DEFINITION, SCHAR_DEFINITION,
    SSHORT_DEFINITION, USHORT_DEFINITION

Each instruction is represented by a B operation in the module Z80. By default, all operation parameters are either predefined elements of the model or integer values in decimal representation. The internal registers contain 208 bits of read/write memory. They include two sets of six general-purpose registers, which may be used individually as 8-bit registers or as 16-bit register pairs. The working registers are represented by the variable rgs8. The domain of rgs8 (id_reg_8) is the set of identifiers of the 8-bit registers. These registers can be accessed in pairs, forming 16 bits, which results in another set of identifiers for the 16-bit registers, named id_reg_16. The main working register of the Z80 is the accumulator (rgs8(a0)), used for arithmetic, logic, input/output and loading/storing operations.
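The pairing of 8-bit registers can be sketched in Python as follows. The sketch stores register contents as integers in 0..255 rather than as bit vectors, which is a simplification. Apart from `a0`, which appears in the text, the register identifiers (`b0`, `c0` and so on) and the helpers `read_pair` and `write_pair` are hypothetical names chosen for the example.

```python
# Hypothetical identifiers for one set of 8-bit registers; only a0 is
# taken from the text.
ID_REG_8 = ("a0", "f0", "b0", "c0", "d0", "e0", "h0", "l0")

# rgs8 maps each 8-bit register identifier to its content.
rgs8 = {reg: 0 for reg in ID_REG_8}


def read_pair(regs, high, low):
    # The 16-bit value of a pair: high register in the upper byte.
    return (regs[high] << 8) | regs[low]


def write_pair(regs, high, low, value):
    # Split a 16-bit value across the two 8-bit registers of a pair.
    if not (0 <= value <= 0xFFFF):
        raise ValueError("value does not fit in 16 bits")
    regs[high] = (value >> 8) & 0xFF
    regs[low] = value & 0xFF


write_pair(rgs8, "b0", "c0", 0x1234)
```

Keeping a single store of 8-bit registers and deriving the 16-bit view from it avoids having two state variables that must be kept consistent by the invariant.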

4.1 Modelling Registers, Input and Output Ports and Instructions

The Z80 has different types of registers and instructions. The CPU contains general-purpose registers (id_reg_8), a stack pointer (sp), a program counter (pc), two index registers (ix and iy), an interrupt register (i), a refresh register (r), two bits (iff1, iff2) used to control interrupts, a pair of bits defining the interrupt mode (im), and the input and output ports (i_o_ports). Part of the corresponding definitions from the INVARIANT is reproduced below:

rgs8 ∈ id_reg_8 → BYTE ∧ pc ∈ INSTRUCTION ∧
sp ∈ BV16 ∧ ix ∈ BV16 ∧ iy ∈ BV16 ∧
i ∈ BYTE ∧ r ∈ BYTE ∧ iff1 ∈ BIT ∧ iff2 ∈ BIT ∧
im ∈ (BIT × BIT) ∧ i_o_ports ∈ BYTE → BYTE

A simple example of an instruction is LD_n_A, shown below. Modelling an instruction often requires the predefined functions, which ease the construction of the model. This instruction uses the updateAddressMem function from the MEMORY module, which receives a memory address and the new value to store there. Finally, it increments the program counter (pc) and updates the refresh register (r).

LD_n_A(nn) =
    PRE nn ∈ USHORT
    THEN
        updateAddressMem(ushort_to_bv16(nn), rgs8(a0)) ||
        pc := instruction_next(pc) || r := update_refresh_reg(r)
    END

The microcontroller model can also specify security properties. For example, the last operation could be restricted to write only to a defined region of memory.
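A restricted variant of LD_n_A can be sketched in Python as follows. Everything beyond the names taken from the operation above is an assumption: the writable region 0x8000..0xFFFF, the body of `instruction_next` (advance by one, wrapping at 16 bits) and the body of `update_refresh_reg` (increment the low 7 bits and keep bit 7) are hypothetical, because the text does not give these definitions. Memory and registers hold plain integers rather than bit vectors.

```python
class PreconditionError(Exception):
    """Raised when an operation is called outside its precondition."""


class Z80Sketch:
    def __init__(self, writable=range(0x8000, 0x10000)):
        self.mem = {}  # address -> byte value
        self.rgs8 = {"a0": 0}  # only the accumulator is modelled here
        self.pc = 0
        self.r = 0
        self.writable = writable  # hypothetical safety restriction

    def update_address_mem(self, address, value):
        self.mem[address] = value

    @staticmethod
    def instruction_next(pc):
        # Hypothetical body: next instruction slot, wrapping at 16 bits.
        return (pc + 1) % 0x10000

    @staticmethod
    def update_refresh_reg(r):
        # Hypothetical body: increment the low 7 bits, preserve bit 7.
        return (r & 0x80) | ((r + 1) & 0x7F)

    def ld_n_a(self, nn):
        # PRE nn ∈ USHORT, strengthened with the region restriction.
        if not (0 <= nn <= 0xFFFF):
            raise PreconditionError("nn is not in USHORT")
        if nn not in self.writable:
            raise PreconditionError("nn is outside the writable region")
        # Parallel substitution: compute every new value from the old state
        # before updating anything.
        new_pc = self.instruction_next(self.pc)
        new_r = self.update_refresh_reg(self.r)
        self.update_address_mem(nn, self.rgs8["a0"])
        self.pc, self.r = new_pc, new_r
```

In B the restriction would be part of the precondition or the invariant and be discharged by proof; here it is only a run-time check that leaves the state untouched when it fails.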

5 Proofs

The proof obligations make it possible to verify the data types, important system properties, and whether the expressions are well-defined (WD)². The properties provide additional guarantees, because they can encode many safety rules. However, the model can be very difficult to prove.

Several iterations were needed to arrive at good library definitions, as well as to fine-tune the model of the microcontroller instructions by factoring common functionalities into auxiliary definitions.

However, few proof commands³ are needed to discharge most proof obligations. As there are many similar assembly instructions, some human-directed proofs, when replayed, could discharge other proof obligations. A good example is a set of 17 proof commands that quickly aided the verification of 99% (2295) of the WD proofs. We also set up a proving environment consisting of networked computers to take advantage of the distribution facilities now provided in the B development environment. Finally, all of the 2926 proof obligations were proved using the tool support of the development environment.

6 Related Works

The computer science literature contains some approaches [2, 3] to modelling hardware and virtual machines using the B method. In both works, the B method was used successfully to model the operational semantics. However, the cost of modelling remained high, and the present paper has described some techniques to lower it.

² An expression is called “well-defined” (or unambiguous) if its definition assigns it a unique interpretation or value.

³ Proof commands are steps that direct the prover towards a proof; they cannot introduce false hypotheses.

In general, researchers employing the B method have focused on more abstract levels of software description. Regarding low-level aspects, there has been previous work on modelling the Java Virtual Machine [3].

The main motivation of our research is the development of verified software up to the assembly level, which requires specifying the semantics of the underlying hardware. Accordingly, some aspects, such as the execution time of the instructions, were not modelled in our work. We also did not consider the microarchitecture of the hardware, as hardware verification is outside the scope of our work. However, there are many other specialized techniques to address these questions.

7 Conclusions

This work has shown an approach to the formal modelling of the instruction set of microcontrollers using the B method. During the construction of this model, some ambiguities and errors were encountered in the official reference for the Z80 microcontroller [6]. As the B notation has a syntax that is not too distant from that of imperative programming languages, such a model could be used to improve the documentation used by assembly programmers. Besides, the formal notation used is analyzed by software that guarantees the correctness of typing and the well-definedness of expressions, in addition to safety properties of the microcontroller state.

Future work comprises the development of software with the B method from functional specification to assembly level, using the Z80 model presented in this paper. The mechanical compilation of B algorithmic constructs to the assembly platform is also envisioned.

Acknowledgements: This work received support from ANP (Agência Nacional do Petróleo, Gás Natural e Biocombustíveis) and CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico).

References

1. Abrial, J.-R. The B-Book: Assigning Programs to Meanings. Cambridge University Press, United States of America, 1st edition, 1996.

2. Aljer, A.; Devienne, P.; Tison, S.; Boulanger, J.-L. and Mariano, G. BHDL: Circuit Design in B. In: ACSD, Third International Conference on Application of Concurrency to System Design, pages 241-242, 2003.

3. Casset, L. and Lanet, J.-L. A Formal Specification of the Java Bytecode Semantics using the B Method. Technical Report, Gemplus, 1999.

4. Dantas, B.; Déharbe, D.; Galvão, S. L.; Moreira, A. M. and Medeiros Jr, V. G. Applying the B Method to Take on the Grand Challenge of Verified Compilation. In: SBMF, Salvador, 2008. SBC.

5. Hoare, C. A. R. The verifying compiler, a grand challenge for computing research. In: VMCAI, pages 78-78, 2005.

6. Zilog. Z80 Family CPU User Manual. http://www.zilog.com/docs/z80/um0080.pdf
