Session F1C
Simple Type Theory:
Simple Steps Towards a Formal Specification
William M. Farmer and Martin v. Mohrenschildt
Department of Computing and Software
McMaster University
1280 Main Street West
Hamilton, Ontario L8S 4K1
Canada
{wmfarmer,mohrens}@mcmaster.ca
Abstract - Engineers, particularly software engineers, need to know how to read and write precise specifications. Specifications are made precise by expressing them in a formal mathematical language. Simple type theory, also known as higher-order logic, is an excellent educational and practical tool for creating and understanding formal specifications. It provides a better logical foundation for specification than first-order logic and is a better introductory specification language than industrial specification languages like VDM-SL and Z. For these reasons, we recommend that simple type theory be incorporated into the undergraduate engineering curriculum.
INTRODUCTION
The creation and use of specifications is fundamental to engineering practice. Software engineers, as well as many other engineers, need to know how to read and write precise specifications. Specifications are made precise by expressing them in a formal mathematical language. Such formal specifications have many advantages over informal specifications written in natural language because they can be mechanically constructed and analyzed by humans or by software. However, most engineers lack the background required to employ formal specifications, despite the fact that undergraduate engineering programs usually include some exposure to logic and discrete mathematics.
Simple type theory, also known as higher-order logic, is a natural extension of first-order logic which is simple, highly expressive, and practical. (For a full discussion of the virtues of simple type theory, see [5].) It is an excellent educational and practical tool for creating and understanding formal specifications. This is illustrated by simple type theory’s three strengths.
Simplicity. Simple type theory has a very simple syntax and semantics. It contains fewer syntactic categories than first-order logic, and it is based on the same semantic principles as first-order logic. It is therefore not much harder for students to learn than first-order logic. By virtue of its simplicity, it is also much easier to learn than industrial specification languages like VDM-SL [2] and Z [20].
0-7803-8552-7/04/$20.00 © 2004 IEEE
Expressivity. Simple type theory is a highly expressive
specification language as well as a powerful logic. It contains
the basic logical tools needed for writing clear and concise
formal specifications. They are:
1) Propositional connectives.
2) Universal and existential quantifiers.
3) Application and abstraction operators (e.g., for applying and defining functions, respectively).
4) Higher-order objects (such as functions and sets).
5) Types.
6) A definite description operator (for forming expressions of the form “the unique object x that satisfies the property P”).
We will illustrate each of these elements later in the paper. It is important to note that first-order logic contains only a few of these elements, namely, propositional connectives, quantifiers, and an operator for applying functions and predicates.
Practicality. Although a textbook formulation of simple type theory is a much more practical specification language than first-order logic, it would be burdensome to write large, complex specifications in it. However, by extending its syntax and semantics in certain ways, simple type theory can be made into an effective specification language for actual use. These extensions do not change simple type theory in any fundamental way; they just make it more convenient to employ.
Simple type theory provides a better logical foundation for formal specification than first-order logic. It is a better introductory specification language than the industrial specification languages VDM-SL and Z. And some of the most popular and advanced computer theorem proving systems, including HOL [10], IMPS [7], Isabelle [18], ProofPower [13], and PVS [16], are based on logics that are extensions of simple type theory. For these reasons, we recommend that simple type theory be incorporated into the undergraduate engineering curriculum.
The paper is organized as follows. Section II explains what simple type theory is and introduces a practical version of simple type theory called BESTT. The process of composing a specification is outlined in section III. The main section of the paper, section IV, illustrates how specifications can be written in BESTT. Simple type theory as a specification language is compared with the leading industrial specification languages in section V, and software tools for manipulating specifications written in simple type theory are briefly discussed in section VI. The paper ends with some remarks about our experiences using simple type theory at McMaster in section VII and some concluding remarks in section VIII.
October 20–23, 2004, Savannah, GA
34th ASEE/IEEE Frontiers in Education Conference
WHAT IS SIMPLE TYPE THEORY?
As we said in the Introduction, simple type theory is a natural extension of first-order logic which is simple, highly expressive, and practical. There are many different formulations of simple type theory, but the most common formulation today is a system due to A. Church [1], known as Church’s type theory, which includes machinery for applying and specifying functions and a definite description operator. In this paper we will always assume that “simple type theory” means “Church’s type theory”.
Simple type theory is the most popular form of type
theory. Simple type theory, like other type theories, has two
kinds of syntactic objects. Expressions denote values including
the truth values T (true) and F (false); they do what both terms
and formulas do in firstorder logic. Types denote nonempty
sets of values; they are used to restrict the scope of variables,
control the formation of expressions, and classify expressions
by value. For example, a formula is an expression of type
BOOL that denotes a truth value.
Unlike some other type theories, simple type theory is a classical, nonconstructive, two-valued logic. It includes strong support for specifying and reasoning with a hierarchy of higher-order functions. (A function is higher order if it takes other functions as arguments.) The structure of the hierarchy is inherited from the structure of the types. In contrast to a set theory (such as Zermelo-Fraenkel (ZF) set theory), simple type theory is a function theory in which functions are also used to represent other kinds of values such as sets and relations.
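The idea that functions can represent sets may be easier to see in a programming language. The following Python fragment is only an illustrative sketch, not BESTT notation: a set of integers is modelled by its characteristic function from integers to truth values, and the names IntSet, is_even, small, union, and member are hypothetical.

```python
from typing import Callable

# A "set of integers" represented by its characteristic function INT -> BOOL.
IntSet = Callable[[int], bool]


def is_even(n: int) -> bool:
    """The set of even integers, given as a predicate."""
    return n % 2 == 0


def small(n: int) -> bool:
    """The finite set {0, 1, 2}, given as a predicate."""
    return 0 <= n < 3


def union(s: IntSet, t: IntSet) -> IntSet:
    """A higher-order function: it takes two functions and returns a function."""
    return lambda n: s(n) or t(n)


def member(n: int, s: IntSet) -> bool:
    """Set membership is just function application."""
    return s(n)


even_or_small = union(is_even, small)
```

Here union is higher order in the sense used above, since its arguments and its result are themselves functions.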
There are several ways of extending simple type theory so that it is more practical for actual use (see [5]). One example of a practical extension of simple type theory developed at McMaster University is BESTT [4], a Basic Extended Simple Type Theory. BESTT has type variables for forming polymorphic types and expressions as in the HOL logic [10] and built-in machinery for working with tuples, lists (finite sequences), and sets.
Undefinedness arises naturally throughout mathematics, computer science, and engineering. For example, division by 0 is never defined and many computer programs do not terminate on all inputs. Since undefinedness is often unavoidable, a practical logic for specification must have a mechanism for dealing with it. BESTT comes with a partial semantics in which functions may be partial and expressions may be undefined (but formulas are always either true or false). This semantics formalizes the traditional approach to undefinedness employed in mathematical practice (see [3], [6]). As a result, undefined applications of partial functions like 1/0 and undefined definite descriptions like “the unique x such that x ≠ x” can be directly expressed in BESTT (for more examples, see [6]).
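A rough way to experiment with these two sources of undefinedness is sketched below in Python. This is not the BESTT semantics: it assumes that undefinedness is modelled by the value None and that definite descriptions range over a finite collection. The names divide, the, and is_defined are hypothetical.

```python
from typing import Callable, Iterable, Optional, TypeVar

A = TypeVar("A")


def divide(x: float, y: float) -> Optional[float]:
    """A partial function: the result is undefined (None) when y is 0."""
    return None if y == 0 else x / y


def the(domain: Iterable[A], p: Callable[[A], bool]) -> Optional[A]:
    """Definite description over a finite domain (an assumption of this sketch).

    Returns the unique element satisfying p, and None (undefined) when
    no element or more than one element satisfies p.
    """
    matches = [a for a in domain if p(a)]
    return matches[0] if len(matches) == 1 else None


def is_defined(value: object) -> bool:
    """A statement about definedness is itself always true or false."""
    return value is not None
```

For instance, the(range(10), lambda x: x != x) is undefined in this model because no element differs from itself, while is_defined applied to it is simply false.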
THE PROCESS OF COMPOSING A SPECIFICATION
By a specification we mean a document written in a formal language (a language having a precise syntax and semantics) that describes the requirements, interfaces, data structures, module designs, etc. of a system. While there are philosophical discussions about how formal or informal a specification should be, we think that it is essential that engineering students, particularly software engineering students, be capable of making precise and unambiguous statements about small systems.
The process of composing a specification for a system
consists of seven steps. The first three are for all systems, but
the last four are just for systems that keep a state or memory.
1) Define the types. The first step is to define the types of values (quantities, data, objects, inputs, outputs, etc.) with which the system is concerned. If this step is done correctly, the remaining steps are often straightforward.
2) Declare the constants. The constants are the primitives of a language L for describing the values and making assertions about them. For example, the constants may include 0, 1, −, +, ∗ for talking about integers.
3) State the axioms. The axioms are a set Γ of statements in L that are assumed to be true about the values. The language L and set Γ of axioms form a theory that specifies the values. For example, the axioms may include a statement that says 0 is the identity with respect to + over the integers.
4) Declare the state variables. If the system we are specifying keeps a state or memory, the state variables are special constants that represent components of the system’s state. For example, the two state variables for a stack might be an array that holds the stack elements and a natural number that holds the height of the stack.
5) State the invariants. The invariants of the system are statements involving the state variables that are assumed to remain true in all states of the system. In the stack example above, the statement that the height is between 0 and the length of the array would be an invariant.
6) Declare the operations. The operations are special constants that effectively read or write state variables. They may also take inputs and return outputs. They represent how the system interacts with its state. In the stack example, top, pop, and push would be operations.
7) Specify the operations. The operations are either defined as functions that map states to states or are specified by the properties they satisfy, for example, using pre- and postconditions. In the stack example, the top operation could be defined as a partial function that maps a nonempty stack of elements to the top element of the stack.
In practice, the specification writer usually does not do the steps in an entirely sequential manner but instead goes back and forth between the steps until a complete specification is produced.
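The stack example that runs through the seven steps can be mimicked in an executable form. The Python sketch below is an illustration under stated assumptions rather than a BESTT specification: the array has a hypothetical fixed length CAPACITY, elements are integers, operations are written as functions from states to states, and an undefined result is modelled by None.

```python
from dataclasses import dataclass
from typing import List, Optional

CAPACITY = 4  # hypothetical fixed length of the array


@dataclass
class Stack:
    # Step 4: the two state variables.
    cells: List[Optional[int]]  # array holding the stack elements
    height: int                 # number of elements currently stored


def new_stack() -> Stack:
    """The initial state: an empty stack."""
    return Stack(cells=[None] * CAPACITY, height=0)


def invariant(s: Stack) -> bool:
    # Step 5: the height lies between 0 and the length of the array.
    return 0 <= s.height <= len(s.cells)


# Steps 6 and 7: operations defined as functions from states to states.
def push(s: Stack, x: int) -> Optional[Stack]:
    """Precondition: the stack is not full; otherwise undefined (None)."""
    if s.height >= len(s.cells):
        return None
    cells = list(s.cells)
    cells[s.height] = x
    return Stack(cells, s.height + 1)


def pop(s: Stack) -> Optional[Stack]:
    """Precondition: the stack is nonempty; otherwise undefined (None)."""
    if s.height == 0:
        return None
    cells = list(s.cells)
    cells[s.height - 1] = None
    return Stack(cells, s.height - 1)


def top(s: Stack) -> Optional[int]:
    """Partial function: defined only on nonempty stacks."""
    return None if s.height == 0 else s.cells[s.height - 1]
```

Each operation returns a new state, so one can check that invariant holds before and after every call.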
WRITING SPECIFICATIONS IN BESTT
Before a student is able to compose a specification using BESTT, or any other specification language, he or she must become familiar with logic, and in our case with higher-order logic. The student also needs to have acquired basic knowledge in discrete mathematics, data structures, algorithms, and finite state machines. Due to space limitations, we can only give a short introduction to the language of BESTT. For a complete introduction the reader is referred to [4]. (The notation we will use in this paper is slightly different from that in [4].) We would like to point out that the language of BESTT is very compact, containing only a small number of language elements compared to other specification languages such as VDM-SL and Z. A detailed comparison between BESTT and other specification languages is given in section V. The reader familiar with the ML [19] functional programming language or the PVS specification language will notice many similarities between BESTT and these languages.
Following the process of composing a specification presented in section III, a specification in BESTT can contain five types of specification elements:
1) Type definitions.
2) Declarations of constants, state variables, and operations.
3) Axioms.
4) Invariants.
5) Definitions or specifications of operations.
Types form the foundation of BESTT. We assume the atomic types include BOOL, INT, FLOAT, CHAR, STRING, and UNIT. Let α and β be types. Compound types are constructed using two binary type constructors: α → β is the type of functions from α to β and α ∗ β is the product type of α and β. For convenience, we allow the usage of record types, e.g., (x:INT, y:INT), which are essentially product types with named components. Additional compound types are constructed using two unary constructors: set(α) and list(α) are the types of sets and lists of elements of type α, respectively. We will use the alternation constructor |, e.g., Male | Female, to construct enumerated types. Finally, we will assume that users can define their own type constructors in BESTT, e.g., order(α) = (α ∗ α) → BOOL, including recursive type constructors.
A type definition is given using the keyword type followed by an identifier naming that type, possibly a list of type variables for type constructors, and then the type. BESTT allows the definition of abstract types, i.e., types that are not further defined. The syntax we use for type declarations is identical to that used in the language OCaml [14] (a dialect of ML). For example,
type Date
type Gender = Male | Female
type Person = STRING ∗ Date ∗ Gender
defines the abstract type Date, the type Gender as an enumeration of constants, and the type Person as a product type of STRING, Date, and Gender. We would like to point out that at any point in time we could further specify the type Date without changing any of the definitions depending on it.
Declarations are used to declare constants, state variables, and operations. A declaration consists of the keyword decl, an identifier, and a type. For example,
decl data_set : set(Person)
decl get_date : Person → Date
declare the state variable data_set and the operation get_date.
Expressions in BESTT are formed from variables and declared constants, state variables, and operations using standard expression constructors. The expression constructors include an if-then-else constructor and constructors to build and access lists such as ⟨a₁,...,aₙ⟩. Sets can be constructed by enumeration, e.g., {a₁,...,aₙ}, or by set abstraction, e.g., (S a : α . p(a)) where p is a predicate over the elements of type α. Individual objects can be described using the construct of definite description. For example, the definite description (I a : α . p(a)), where again p is a predicate over the elements of type α, denotes the unique element that satisfies p if there is such an element and is undefined otherwise. See [4] for a complete list of the expression constructors in BESTT.
Formulas, i.e., expressions of type BOOL, are formed
using the constants T, F, representing true and false, the
standard logical connectors ¬,∧,∨,⇒,⇔ and the quantifiers
∀,∃. BESTT also provides the definedness operator ↓ such that
E↓ asserts that the expression E is defined.
An invariant is a formula involving state variables that is assumed to be true. Logically, an invariant is just a special kind of axiom. Invariants are stated using the keyword inv followed by the name and formula of the invariant. For example, the invariant
inv unique_birthday =
  ∀ (n,d,g), (n′,d′,g′) : data_set .
    n = n′ ⇒ d = d′
states that, for any two objects of type Person associated with the variable data_set, if the two strings are identical then so are the two birthdays.
BESTT provides the special expression constructor ↦ that is used to define a function by means of what is called lambda abstraction in logic. Definitions associate a previously declared identifier with an expression. The definition
def get_date = (n,d,g) ↦ d
defines the operation get_date by stating that, given an object of type Person, declared to be the product of STRING, Date, and Gender, it returns the element of type Date.
Now we can continue with our example by declaring
decl find_date : Person → (BOOL ∗ Date)
decl find_people : Date → set(Person)
and then make the following definitions to give meaning to
the declared operations:
def find_date = n ↦
  ((∃ d : Date, g : Gender . (n,d,g) ∈ data_set),
   (I d : Date . ∃ g : Gender . (n,d,g) ∈ data_set))
Note that, by the previously stated invariant unique_birthday, the definite description is defined exactly if the first component in the pair is true.
def find_people = d ↦
  (S (n,d′,g) : Person .
    (n,d′,g) ∈ data_set ∧ d = d′)
find_people returns the set of objects of type Person that are “born on some given date”.
Summarizing, we have given the specification of a simple
system that stores people together with their unique birthdays,
and allows us to determine the birthday of a person and the
set of all the people born on any particular date.
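The birthday system can also be prototyped to see how its pieces fit together. The Python sketch below is an approximation, not a translation fixed by BESTT. It assumes that the abstract type Date is represented by a string, that the state variable data_set is passed explicitly as an argument, that find_date takes the name string (as its definition does), and that an undefined definite description is modelled by None.

```python
from enum import Enum
from typing import NamedTuple, Optional, Set, Tuple

Date = str  # assumption: the abstract type Date is modelled as a string


class Gender(Enum):
    MALE = "Male"
    FEMALE = "Female"


class Person(NamedTuple):
    name: str
    birthday: Date
    gender: Gender


def unique_birthday(data_set: Set[Person]) -> bool:
    """Invariant: two entries with the same name have the same birthday."""
    return all(p.birthday == q.birthday
               for p in data_set for q in data_set
               if p.name == q.name)


def get_date(p: Person) -> Date:
    """Projection onto the Date component."""
    return p.birthday


def find_date(data_set: Set[Person], name: str) -> Tuple[bool, Optional[Date]]:
    """Returns (name occurs in data_set, the unique birthday or None)."""
    dates = {p.birthday for p in data_set if p.name == name}
    found = len(dates) > 0
    date = next(iter(dates)) if len(dates) == 1 else None
    return (found, date)


def find_people(data_set: Set[Person], d: Date) -> Set[Person]:
    """Set abstraction: all stored persons born on the date d."""
    return {p for p in data_set if p.birthday == d}
```

When unique_birthday holds, the second component of find_date is defined exactly when the first component is true, which mirrors the remark about the definite description.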
Higher-order constructs are very useful if one aims to specify abstract concepts such as sorting. Using the type constructor order(α), as defined above, we can declare an operation sorter as follows:
decl sorter : order(α) → (list(α) → list(α))
We can then define the operation sorter by definite description:
def sorter =
  I s : order(α) → (list(α) → list(α)) .
    ∀ o : order(α), a, b : list(α) .
      s(o)(a) = b ⇔
        (∀ i : INT . 0 ≤ i < |b| − 1 ⇒ o(b[i], b[i + 1]))
        ∧ list_equiv(a, b)
where list_equiv(a,b) is a predicate of type (list(α) ∗ list(α)) → BOOL that is true if the sequences a and b contain the same elements but possibly in a different order.
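The sorter specification says what a correct result looks like without saying how to compute it. The Python sketch below checks a candidate output against that description. It rests on two assumptions: list_equiv is read as equality of multisets (which requires hashable elements), and an order is any Boolean function of two arguments. The names is_ordered, satisfies_sorter_spec, and leq are hypothetical.

```python
from collections import Counter
from typing import Callable, Hashable, List, TypeVar

A = TypeVar("A", bound=Hashable)
Order = Callable[[A, A], bool]


def list_equiv(a: List[A], b: List[A]) -> bool:
    """Same elements with the same multiplicities, in any order."""
    return Counter(a) == Counter(b)


def is_ordered(o: Order, b: List[A]) -> bool:
    """For all i with 0 <= i < len(b) - 1, o(b[i], b[i + 1]) holds."""
    return all(o(b[i], b[i + 1]) for i in range(len(b) - 1))


def satisfies_sorter_spec(o: Order, a: List[A], b: List[A]) -> bool:
    """True when b is an acceptable value of sorter(o)(a)."""
    return is_ordered(o, b) and list_equiv(a, b)


def leq(x: int, y: int) -> bool:
    """An example order on integers."""
    return x <= y
```

Any sorting routine can be tested against satisfies_sorter_spec, for example by passing the result of the built-in sorted function as b.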
Specifications are often intended to describe dynamic models in which the state of the model can be changed by the specification’s operations, as done in VDM-SL and Z. In this case, we speak about traces of states where a state q is a list of values associated with the specification’s state variables. Since an operation o effectively reads and writes state variables, it can be viewed as a state transition function that moves a state q to a state q̃, which is written as δ(q,o) = q̃. Let Q be the state space, i.e., the set of all possible states, and O be the set of operations. We introduce the notation q ⊨ A to denote that the formula A holds in the state q. Then q ⊨ A′ means that, for each o ∈ O, if δ(q,o) = q̃, then q̃ ⊨ A, i.e., that A holds in each possible next state. q ⊨ (A ⇒ B) means that q ⊨ A implies q ⊨ B. There are similar definitions for the other propositional connectives.
An invariant is now a formula A such that
q₀ ⊨ A ∧ ∀ q : Q . q ⊨ (A ⇒ A′)
where q₀ is the initial state. With this the student is introduced to model checking as a tool to verify properties of specifications. Traces allow us to define the notion of an event as done in [9] and [12]: @T(A) denotes the time points in a trace ⟨q₀,q₁,...⟩ at which qᵢ ⊨ (¬A ∧ A′), and @T(A) WHEN D denotes the time points at which qᵢ ⊨ ¬A ∧ A′ ∧ D ∧ D′, i.e., at which A changes from false to true and D remains true. (This could also be defined using previous states instead of next states.)
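Events over a trace can be computed directly once a trace is available as data. The Python sketch below makes several simplifying assumptions: the trace is finite, a state is a dictionary from state variable names to values, a formula is a Boolean function on states, and the primed formula A′ at position i is evaluated in the single next state of the trace rather than over all operations. The names at_true, at_true_when, and holds_along are hypothetical.

```python
from typing import Callable, Dict, List

State = Dict[str, object]
Formula = Callable[[State], bool]


def at_true(trace: List[State], a: Formula) -> List[int]:
    """@T(A): positions i where A is false in q_i and true in q_(i+1)."""
    return [i for i in range(len(trace) - 1)
            if not a(trace[i]) and a(trace[i + 1])]


def at_true_when(trace: List[State], a: Formula, d: Formula) -> List[int]:
    """@T(A) WHEN D: as @T(A), with D true in both q_i and q_(i+1)."""
    return [i for i in at_true(trace, a)
            if d(trace[i]) and d(trace[i + 1])]


def holds_along(trace: List[State], a: Formula) -> bool:
    """Checks q_0 |= A and q_i |= (A => A') along this one trace only."""
    if not trace or not a(trace[0]):
        return False
    return all((not a(trace[i])) or a(trace[i + 1])
               for i in range(len(trace) - 1))
```

Note that holds_along inspects a single trace, whereas the invariant condition in the text quantifies over the whole state space Q, so it can refute an invariant but not establish one.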
To illustrate the usage of events we give a small specification that defines an event that occurs if a user clicks the mouse in a button (fixed area) on the screen. We need the following type declarations:
type mouse_button = Up | Down
type mouse_pos = (x:INT, y:INT)
Now we declare the state of the mouse using a state variable:
decl mouse : (b:mouse_button, pos:mouse_pos)
We need a predicate that is true if the mouse is pointing
to the area of interest:
decl in_area : mouse_pos → BOOL
The predicate in_area can easily be defined, given that the area is located at position (x₀,y₀) with width w and height h, as
def in_area = (x,y) ↦
  (x₀ ≤ x ≤ x₀ + w) ∧ (y₀ ≤ y ≤ y₀ + h)
Now we can specify the events of interest, first that the mouse button was pressed while the mouse pointed to the area:
def B1 = @T(mouse.b = Down)
  WHEN in_area(mouse.pos.x,mouse.pos.y)
Note that these events do not happen if we move the mouse
with the button down over the area without pressing it down
within the area. Further we need the events when the mouse
button was released while the mouse was within the area and
while it was released outside the area:
def B2 = @T(mouse.b = Up)
  WHEN in_area(mouse.pos.x,mouse.pos.y)
def B3 = @T(mouse.b = Up)
  WHEN ¬in_area(mouse.pos.x,mouse.pos.y)
If a B1 event is followed by a B2 event then the mouse
was clicked and released within the area, but if it is followed
by a B3 event then the mouse was clicked but not released
within the area.
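The three mouse events can be traced by hand on small examples, or with a short program. The Python sketch below is an illustration under assumptions: the area constants X0, Y0, W, and H are hypothetical values, a trace is a finite list of mouse states, and each event is reported at the position i of the state just before the button changes.

```python
from typing import List, NamedTuple, Tuple

# Hypothetical position and size of the button area.
X0, Y0, W, H = 10, 10, 100, 20


class Mouse(NamedTuple):
    down: bool  # True for Down, False for Up
    x: int
    y: int


def in_area(x: int, y: int) -> bool:
    return X0 <= x <= X0 + W and Y0 <= y <= Y0 + H


def events(trace: List[Mouse]) -> List[Tuple[int, str]]:
    """Lists the B1, B2, and B3 events of a trace as (position, name)."""
    found = []
    for i in range(len(trace) - 1):
        q, nxt = trace[i], trace[i + 1]
        inside = in_area(q.x, q.y) and in_area(nxt.x, nxt.y)
        outside = not in_area(q.x, q.y) and not in_area(nxt.x, nxt.y)
        if not q.down and nxt.down and inside:
            found.append((i, "B1"))      # pressed within the area
        if q.down and not nxt.down:
            if inside:
                found.append((i, "B2"))  # released within the area
            elif outside:
                found.append((i, "B3"))  # released outside the area
    return found


def clicked(trace: List[Mouse]) -> bool:
    """True if some B1 event is directly followed by a B2 event."""
    names = [name for _, name in events(trace)]
    return any(a == "B1" and b == "B2" for a, b in zip(names, names[1:]))
```

A trace in which the button is pressed inside the area and the pointer is then dragged out before release yields B1 followed by B3, so clicked returns False for it.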
COMPARISON WITH OTHER SPECIFICATION LANGUAGES
Instructors introducing students to formal specification can choose from a variety of different specification languages. In general, one can distinguish two classes of specification languages: model oriented and property oriented. Model-oriented languages, such as the PVS [16] specification language; UML [11], the Unified Modeling Language; VDM-SL [2], the specification language of the Vienna Development Method (VDM); and Z [20] (called Z after Zermelo-Fraenkel set theory), are languages that describe the operations by defining how they affect the models (state space) they work on. This includes the notion of pre- and postconditions and the different kinds of specification languages using finite state machines. Property-oriented languages, often called algebraic languages, make abstract statements about the properties of the operations, which could be satisfied by many models. The commonly used example is the stack. A property-oriented language would state that pop(push(s,t)) = s and top(push(s,t)) = t, while a model-oriented language would first declare that a stack has a state represented as a list of objects and then describe how top, pop, and push affect this state.
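The two styles can be placed side by side in a small experiment. In the Python sketch below, the model-oriented part assumes a stack of integers represented as an immutable tuple, and the property-oriented part checks the two stack equations on a finite sample of inputs, which tests them but does not prove them. The name satisfies_stack_properties is hypothetical.

```python
from typing import Iterable, Tuple

# Model-oriented view: the state of a stack is a tuple of elements.
Stack = Tuple[int, ...]

EMPTY: Stack = ()


def push(s: Stack, t: int) -> Stack:
    return s + (t,)


def pop(s: Stack) -> Stack:
    if not s:
        raise ValueError("pop is undefined on the empty stack")
    return s[:-1]


def top(s: Stack) -> int:
    if not s:
        raise ValueError("top is undefined on the empty stack")
    return s[-1]


# Property-oriented view: only the equations between operations matter.
def satisfies_stack_properties(stacks: Iterable[Stack],
                               items: Iterable[int]) -> bool:
    """Checks pop(push(s,t)) = s and top(push(s,t)) = t on the samples."""
    items = list(items)
    return all(pop(push(s, t)) == s and top(push(s, t)) == t
               for s in stacks for t in items)
```

Any other model of a stack, for example one based on linked cells, could be substituted for the tuple model and checked against the same two equations.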
In contrast to Z, BESTT allows expressions to be undefined. BESTT is not a three-valued logic; the logic provides only the usual two boolean values, and formulas are always defined. We think that this approach is very natural, e.g., top(empty_stack) is undefined. Specification languages cannot ignore occurrences of undefinedness. They can deal with them directly, as BESTT does, or indirectly in various ways, for example by using artificially constructed types with undefined values or by introducing the notion of an “exception”.
The languages UML and Z use two-dimensional graphical constructs to assist the composer and reader of specifications. In our software design courses, we found that using BESTT within Parnas Tables [17] increased the readability of specifications in a similar way.
To illustrate the difference in syntax we define the same structure, a simple binary tree, in BESTT, VDM-SL, and Z. The BESTT specification is
type Btree(α) = Empty | Node(Btree(α) ∗ α ∗ Btree(α))
where the type Btree(α) is parameterized with the type of the content of the nodes. This line can be directly sent to an OCaml interpreter or compiler. Since VDM-SL and Z do not support type variables, we cannot specify a parameterized type, but we have to construct our trees over some fixed type, e.g., INT. The binary tree specification in VDM-SL is given by
binnode :: left  : Btree
           value : INT
           right : Btree
Btree = [binnode]
and in Z the same specification is
Btree ::= nil | binnode⟨⟨Btree × INT × Btree⟩⟩
The specifications in BESTT and Z look very similar, but the VDM-SL specification is different because Btree is not actually a type in VDM-SL, only binnode is. We found, confirmed by our experiences teaching formal specification to students, that the BESTT notation is more natural.
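For comparison with a mainstream programming language, a parameterized binary tree can be sketched in Python as follows. This is an illustration only: it assumes that the empty tree is represented by None, so it is closer in spirit to the optional type used in the VDM-SL version than to the Empty | Node alternation of BESTT, and the helper functions size and in_order are hypothetical.

```python
from typing import Generic, List, Optional, TypeVar

A = TypeVar("A")


class Node(Generic[A]):
    """A tree node with a left subtree, a value of type A, and a right subtree."""

    def __init__(self, left: "Optional[Node[A]]", value: A,
                 right: "Optional[Node[A]]") -> None:
        self.left = left
        self.value = value
        self.right = right


def size(t: "Optional[Node[A]]") -> int:
    """Number of nodes; the empty tree (None) has size 0."""
    return 0 if t is None else size(t.left) + 1 + size(t.right)


def in_order(t: "Optional[Node[A]]") -> List[A]:
    """Values of the tree listed from left to right."""
    if t is None:
        return []
    return in_order(t.left) + [t.value] + in_order(t.right)
```

The type variable A plays the role of α, so the same definition serves for trees of integers, strings, or any other element type.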
SOME REMARKS ABOUT SOFTWARE TOOLS
Simple type theory can be used as a specification language without the aid of software tools, especially for educational purposes. However, as one considers specifications of greater size and complexity, it becomes increasingly more difficult to express and analyze them in simple type theory without support from software tools. In short, practical specification using simple type theory (or any other formal specification language) requires tool support.
There are three basic kinds of software tools for formal
specification:
1) Tools for constructing specifications. These tools help the user to find syntactic mistakes in specifications and to combine smaller specifications into larger ones. They include type checkers for checking whether a specification is type correct and type inferers for determining the types of the specification components. The latter are included in implementations of the ML programming language.
2) Tools for analyzing specifications using symbolic computation. There is a wide range of tools of this kind. These include model checkers for determining whether a property holds in every state of a specified system, computer algebra systems for simplifying algebraic expressions, decision procedures that can automatically decide whether certain kinds of conjectures are true or false, systems for finding counterexamples to conjectures, and various kinds of systems for visualizing the models that satisfy a specification.
3) Tools for analyzing specifications using formal deduction. These include automatic theorem provers like Otter [15] that attempt to automatically find a proof for a conjecture and proof development systems like PVS with which the user and the system interactively construct a proof of a conjecture.
Specification systems of the future will contain an integrated set of tools of all three kinds described above. They will be effectively “laboratories” in which to construct and analyze specifications and in which symbolic computation and formal deduction are fully integrated (e.g., see [8]).
EXPERIENCES AT MCMASTER UNIVERSITY
By teaching introductory courses in software engineering to undergraduate and graduate students in software engineering, computer engineering, and electrical engineering, we have become very familiar with the problems students have in understanding formal specification techniques and languages. Our experiences are what prompted us to develop BESTT and our approach to teaching specification.
Our approach has been incorporated in the B.Eng. program in software engineering at McMaster University. Designed in 1999 and revised in 2003, this program is accredited by the Canadian Engineering Accreditation Board (CEAB). The first year is a common year to all engineering students. In the 2003 revision we decided to postpone the first introductory course in software engineering to the second term in the second year. Students first have to pass three second-year first-term courses: one course in logic, a separate course in discrete mathematics, and a course that introduces them to the fundamentals of programming as a continuation of their first-year programming course. In their first software