Specification and Verification: The Spec# Experience
Mike Barnett
Microsoft Research, Redmond
mbarnett@microsoft.com
Manuel Fähndrich
Microsoft Research, Redmond
maf@microsoft.com
K. Rustan M. Leino
Microsoft Research, Redmond
leino@microsoft.com
Peter Müller
ETH Zurich
peter.mueller@inf.ethz.ch
Wolfram Schulte
Microsoft Research, Redmond
schulte@microsoft.com
Herman Venter
Microsoft Research, Redmond
hermanv@microsoft.com
ABSTRACT
Spec# is a programming system that puts specifications in
the hands of programmers and includes tools that use them.
The system includes an object-oriented programming lan-
guage with specification constructs, a compiler that emits
executable code and run-time checks for specifications, a
programming methodology that gives rules for structuring
programs and for using specifications, and a static program
verifier that attempts to mathematically prove the correct-
ness of programs. This paper reflects on the six-year expe-
rience of building and using Spec#, the scientific contribu-
tions of the project, remaining challenges for tools that seek
to establish program correctness, and prospects of incorpo-
rating program verification into everyday software engineer-
ing.
0. INTRODUCTION: THE SPEC# VISION
Software engineering is the process of authoring software
that is to fulfill some worldly needs. It is an expensive en-
deavor that presents difficulties at all levels. At the top level,
the gathering of requirements for the software is usually an
exploratory and iterative process. Any change in require-
ments ripples through all parts of the software artifact and
is further complicated by having to maintain previous ver-
sions of the software. At the level of the program itself,
the interaction between program modules requires an un-
derstanding of what is expected and what can be assumed
by the interacting modules. At the level of each module, one
problem is to maintain the consistency of data structures, let
alone remember what it means for a particular data struc-
ture to be consistent. At the level of individual operations,
the algorithmic details can be tricky to get right.
Two common themes among all of these difficulties are
the problem of having useful and accurate documentation
and the problem of making sure that programs adhere to
documented behavior and do not misuse features of the pro-
gramming language.
Spec# (pronounced “speck sharp”) is a research project
aimed at addressing these two problems. To combat the first
problem, Spec# takes the well-known approach of provid-
ing contracts, specification constructs to document behav-
ior. To combat the second problem, Spec# adds automatic
tool support, which includes an automatic program verifier.

Copyright 2009 Barnett, Fähndrich, Leino, Müller, Schulte, Venter.
The Spec# project set out to explore the programmer ex-
perience of using specifications all the time and receiving
benefit from them. The tool support is intended not just
to help ensure program correctness, but also, importantly,
to lure programmers into recording their design decisions in
specifications, knowing that the specifications will not just
rot as stale comments.
In this paper, we describe the Spec# programming sys-
tem, along with our initial goals of the project, what we have
done with it, how it has already had some impact, and how
we now, in retrospect, view some of our design decisions.
1. SONGS OF INNOCENCE
We started the Spec# project in 2003 as an attempt to
build a comprehensive program verification system [4]. Our
dream was to build a real system that real programmers
can use on real programs to do real verification—a system
that “the programming masses” could use in their everyday
work. We wanted to explore and push the boundaries of
specification and verification technology to get closer to re-
alizing these aspirations. Let us consider in more detail the
lay of the land at the time we started the project.
Program verification was already several decades old, start-
ing with some formal underpinnings of program semantics
and techniques for proving program correctness [13]. Sup-
ported by mechanical proof assistants, some early program
verifiers were the GYPSY system with the Boyer-Moore
prover [0] and the Stanford Pascal Verifier [18]. Later sys-
tems, which are still used today, include full-featured proof
assistants like PVS [23] and Isabelle/HOL [22].
Another approach that used verification technology was
extended static checking, as in ESC/Modula-3 [9] and
ESC/Java [12]. These tools have been more closely integrated into
existing programming languages and value automation over
expressivity and over mathematical guarantees, such as finding all
errors in a program. The automation is enabled by a breed of
combined decision procedures that today is known as Satis-
fiability Modulo Theories (SMT) solvers [8]. To make them
easier and more cost-effective to use, extended static check-
ers were intentionally designed to be unsound, that is, they
may miss certain errors.
Dynamic checking of specifications has always been done
by Eiffel [20], which also pioneered the inclusion of specifi-
cations in an object-oriented language. The tool suite for
the Java Modeling Language (JML) [16] included a facility
for dynamic checking of specifications [5]. Eiffel influenced
JML, and the strong influence of both of these on Spec# is
all too evident.
Our plan to target real programs was no doubt our sin-
gle most influential decision: it has permeated every corner
of the Spec# design. Targeting real programs meant not
designing a tool for a toy language with idealized features.
In addition to learning how to handle difficult or otherwise
uncomfortable language features, a benefit of this decision
is the large body of programs and libraries that can be used
as starting points for specification and verification. The de-
cision implied a connection with an existing language, so we
decided to build our language extensions around C# and
the .NET platform. Other well-known real languages with
specifications were Eiffel, Java+JML, and SPARK Ada [1].
As in Eiffel, our extensions made specifications first-class
elements in the language.
Our plan to build a system for real programmers immedi-
ately ruled out the possibility of exposing users to an inter-
active proof assistant. We felt that while programmers need
to know how to specify a program, they should not need to
understand proof theory, the logical encoding of a program’s
semantics, or how to issue tactics to guide the proof search.
Instead, we turned to an automatic SMT solver. This is not
to say that verification is fully automatic, because Spec#
users still need to supply specifications. But all interaction
between users and Spec#’s tools takes place in the context
of the program and its specifications. The major contender
in this space was ESC/Java and similar tools using JML
specifications in Java programs.
Another important consequence of building a system for
real programmers was the need for something to attract
users. It is a long journey for a user to get to the point of
writing specifications that lead to effective verification. To
give users immediate benefit for any specification they write,
however partial, we decided to provide dynamic checking of
specifications. As with Eiffel, the Spec# compiler emits run-
time checks of many contracts as part of the target code.
Throughout the project, we also worked hard on providing
good defaults so that programmers would not be unduly
burdened for the most common cases.
Finally, our plan to do real verification meant not com-
promising soundness. This goal aligned our direction with
fully featured proof assistants. Sound verification of object-
oriented programs does not come easily. Of the unsound
features in ESC/Java, many were known to have sound solu-
tions. But two open key areas were how to specify and verify,
in the presence of subclassing and dynamically-dispatched
methods (which give rise to the possibility of reentrancy),
the internal consistency conditions of data structures (known
as object invariants) and the possible effects that methods
can have on the program state (known as method framing).
To ensure that our verifier would scale to large programs
and could be applied to libraries, we also wanted to sup-
port modular verification, where every class can be verified
without knowing its clients and subclasses. We started the
project with an idea for addressing these problems, which
gave us a glimmer of hope of building a sound and modular
verifier.
Though we aimed for a broad design, we left out sev-
eral things initially, so that we could provide simpler speci-
fications. For example, we provided no support for writing
specifications for unsafe (non type-safe) code, concurrency,
higher-level aspects of closure objects, and some functional-
correctness concerns of algorithmic verification. The spec-
ifications instead focused on partial properties, of the kind
that every programmer could write down and for which one
might be willing to accept the run-time overhead of dynamic
checking. Some other features that we left out of our initial
design, like generics, were subsequently added.
In summary, we set out to build a programming system
with specifications as an integral component. The system
was to blend into existing practices, it was to provide a
range of assurance levels from dynamic checking to static
verification, and the static verification was to be sound and
automatic. To succeed, a number of scientific questions had
to be answered; for example: how to specify object invari-
ants that may span multiple objects, how to statically ver-
ify object invariants that may be temporarily broken, how
to encode method framing to match programmer intentions
while avoiding recursive definitions that may cause problems
with the SMT solver, how to facilitate abstraction by pure
methods, how to push common idioms into the simpler-to-
understand type system, how to choose an appropriate level
of dynamic checks to include at run time, and how to present
errors to users as early as possible during program design.
We also had a number of engineering concerns: how to in-
tegrate a language with specifications and extended types
into a multi-language platform, how to obtain specifications
for common libraries, how to store specifications for reuse,
how to integrate the tools into an IDE, and, of course, how
to manage our own development of the tools themselves.
Surprisingly, this list did not seem so daunting at the time.
2. THE SPEC# SYSTEM
Spec# is an object-oriented language; it is a superset of
C# v2.0, the version of C# released in 2005. As a mem-
ber of the .NET family of languages, Spec# compiles to
MSIL bytecode, runs in the .NET virtual machine, and is
supported by Visual Studio’s integrated development envi-
ronment (IDE), which provides syntax highlighting and the
ability to run the program verifier in the background as the
code is being written. The extensions to C# chiefly con-
sist of the standard design-by-contract features [20]: method
contracts (pre- and postconditions) and object invariants.
2.0 A Quick Tour of the Language
Fig. 0 shows the most commonly used features. A full
explanation of the language is in the Spec# tutorial [17].
The specifications in a Spec# program are written directly
in the program, as in GYPSY and Eiffel. And as in Eiffel and
JML, the specifications are expressions of the programming
language, which is more natural for programmers than some
mathematical notation.
2.0.0 Pre- and Postconditions
Specified using the Spec# keyword requires, the method
CrankItUp has a precondition, which indicates what condition
must hold on entry to the method. The caller must establish
the condition and the method implementation can assume it.
Symmetrically, the postcondition (Spec# keyword ensures)
indicates what condition must hold on (normal) exit from
the method. The implementation must establish it and the
caller can assume it.
The first postcondition uses the expression old(Volume())
to refer to the value of Volume() on entry to the method. It
promises that the value of Volume() is increased by amount.
The second postcondition expresses that the method returns
public class Stereo {
  int currentCDSlot;
  [Rep] Speaker! left = new Speaker();
  [Rep] Speaker! right = new Speaker();

  invariant 0 <= currentCDSlot;
  invariant left != right;
  invariant left.Gain == right.Gain;

  public int CrankItUp(int amount)
    requires 0 <= amount;
    ensures Volume() == old(Volume()) + amount;
    ensures result == Volume();
  {
    expose (this) {
      left.Adjust(amount);
      right.Adjust(amount);
    }
    ...
  }

  [Pure] public int Volume()
  { return left.Gain; }

  public void ChangeCD(int newSlot)
    requires 0 <= newSlot;
  { currentCDSlot = newSlot; }
}
Figure 0: A (partial) Spec# program demonstrat-
ing the basic features of the language. It contains
method contracts that describe (part of) the method
behavior and object invariants describing the consis-
tent state of each instance of the class.
the final value of Volume(); the Spec# keyword result refers
to the return value of the method.
Since contracts must not cause any state changes, methods
may be used in contracts only if they are side-effect free,
which is indicated by the [Pure] custom attribute (a .NET
feature that allows associating arbitrary metadata with
program elements), as in the definition of Volume().
Method contracts are inherited in subclasses, that is, over-
riding methods have to live up to the contracts of the meth-
ods they override. This requirement allows one to reason
modularly about a method call in terms of the method con-
tract without knowing all possible implementations of that
method in subclasses. Contract inheritance enforces the
well-known concept of behavioral subtyping.
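For illustration, consider a hypothetical subclass of the class in Fig. 0 (our example, assuming CrankItUp were declared virtual); the override is bound by the contract it inherits:

```cs
public class KaraokeMachine : Stereo {
  // Inherits CrankItUp's contract from Stereo:
  //   requires 0 <= amount;
  //   ensures  Volume() == old(Volume()) + amount;
  //   ensures  result == Volume();
  // The override may not weaken this contract, so a caller
  // that reasons via Stereo's contract remains correct even
  // when the dynamically dispatched method is this one.
  public override int CrankItUp(int amount)
  {
    ...
  }
}
```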
2.0.1 Object Invariants
The class Stereo declares three object invariants, which
specify what it means for an object of this class to be in a
“good” state, i.e., when it is consistent. Whereas the first two
invariants constrain the values of the fields of a Stereo ob-
ject, the third invariant relates the states of two sub-objects
that make up the representation of the aggregate object.
The first of the two calls in the body of the method
CrankItUp might break that invariant before the subsequent
call re-establishes it. To indicate that an object in-
variant might be temporarily violated, the two calls
must appear within an expose statement, which is
described in more detail in Sec. 2.1. Note that no expose
statement is needed in ChangeCD, because there is no tempo-
rary violation of an invariant: the single assignment maintains
the invariant.
2.0.2 Non-null Types
Null-dereference problems have become the bane of object-
oriented programming. We found that the single most com-
mon specification is the exclusion of the null value from
the possible values of a field, method parameter, or result.
Therefore, we chose a lightweight notation for this prop-
erty: non-nullness is indicated in the declaration of a vari-
able or method by an exclamation point (“bang”) after the
type name. In Fig. 0, such non-null types are used in the
declarations of the fields left and right. A non-null type
for a field, parameter, or method result is similar to a cor-
responding invariant, precondition, or postcondition. How-
ever, non-null types are enforced via Spec#’s special non-
null type system, see Sec. 2.2.0. References of non-null types
can be dereferenced safely without requiring run-time checks
or proof obligations to prevent errors. (Don’t you wish your
programming language could do this?)
2.0.3 Ownership
Each object of type Stereo is an aggregate object. It con-
tains references to other objects which make up its internal
representation. In Spec#, the aggregate/sub-object relation
is expressed using the [Rep] custom attribute in the decla-
ration of the field pointing to the sub-object. In our exam-
ple, the speakers are sub-objects of a Stereo object—we say
that the Stereo object owns its speakers. Due to this own-
ership relation, Spec# enforces that two Stereo objects do
not share their speakers and that, in general, a speaker can
be modified only through its owning Stereo object. This lets
a Stereo object maintain object invariants over the state of
its speakers, such as the third object invariant. We explain
more details in the next subsection.
2.1 Spec# Methodology
Spec# comes with its own programming and verification
methodology, which permits sound modular reasoning about
object invariants and framing. We will give a short overview
of this important contribution of the Spec# project here; a
full explanation is in the Spec# tutorial [17].
A first problem is that there is no natural granularity asso-
ciated with an object invariant. An invariant cannot always
hold: it is generally necessary to temporarily violate an in-
variant with later state changes re-establishing it while up-
dating, as illustrated by method CrankItUp. The granularity
also does not correspond to method boundaries: a method
may temporarily violate the invariant and then make a call-
back, which makes the object accessible outside the class
while it is in an inconsistent state. Such a situation would
make the implicit assumption that object invariants hold
on entry to public methods incorrect, leading to unsound
verification.
A second, related, problem is that an object invariant of-
ten depends on the state of other objects, for instance, the
invariant of an aggregate object typically depends on the
state of its sub-objects, as illustrated by Stereo. Conse-
quently, modifications of these sub-objects potentially vi-
olate the invariant of the aggregate. This situation is in-
escapable for any system with reusable components, so the
methodology must allow it. But the verifier must ensure
that the aggregate object’s invariants are re-established be-
fore the aggregate relies on them again.
Spec# solves the first problem with the expose statement,
which is similar to a lock statement in concurrent program-
ming. It indicates a non-re-entrant lexical region within
which an object’s state is vulnerable and within which the
invariant may be temporarily violated. The object invariant
must hold in order for the block to be entered or exited.
The second problem is solved by using an ownership sys-
tem: the heap is organized into a forest of tree structures.
The edges of the trees indicate ownership, that is, an aggre-
gate/sub-object relation. Roughly, an object invariant can
depend only on state in the subtree of which it is the root.
Within an expose block, a method can call down in the own-
ership tree, but not up; this prevents methods from being
invoked on inconsistent objects.
That is the basic approach to specifying aggregate ob-
jects in Spec#. However, many object-oriented programs
are not only about hierarchical data structures, but also
consist of mutually referring objects, such as the Subject-
Observer pattern. To deal with such peer relations, the
methodology uses a notion of peer consistency, which says
that an object and all its peers are consistent. In fact, we
use peer consistency as a default in method contracts.
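To sketch the peer relation (our hypothetical example; the [Peer] attribute is the counterpart of [Rep], following the Spec# tutorial):

```cs
public class Subject {
  [Peer] Observer observer;  // shares this object's owner; the
                             // field is not part of this subtree
  public void Notify()
    // The implicit default precondition requires the receiver
    // and its peers (here, observer) to be peer consistent.
  { ... }
}
```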
As pointed out in Sec. 1, the second problem for a sound
verifier is the frame problem, which deals with what "frame"
can be put around a method call to limit the effects the call
may potentially have. For instance, in method CrankItUp,
what is the program state after the call to left.Adjust? A
sound but pessimistic verifier might treat the call as mod-
ifying everything in the heap, since potentially all objects
in the heap are reachable from every method (for instance,
through static fields). A better solution would be to know
exactly which parts of the heap a method changes, but in the
presence of subclassing and information hiding, it is not even
possible to name them all! Instead, some form of abstraction
is needed, yet one that is precise enough for the program ver-
ifier. Again, Spec# utilizes its ownership system: without
an explicit specification stating otherwise (using the Spec#
keyword modifies), a method may modify only the fields of
the receiver and of those objects within the subtree of which
the receiver is the root. Using ownership to abstract over
the modifications of sub-objects is justified, because clients
of an object should not be concerned with its sub-objects.
For instance, clients of Stereo objects need to know only
about the result of Volume(), but not how the volume is
stored in the sub-objects of Stereo.
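By this default, CrankItUp may modify only the receiver and its owned speakers. A method that updates objects outside the receiver's subtree must declare so; a sketch (our example, assuming the tutorial's wildcard notation for modifies clauses):

```cs
public void MatchVolume(Stereo! other)
  requires other.Volume() <= this.Volume();
  modifies other.*;  // may also modify other and its sub-objects,
                     // beyond the default frame of the receiver
{
  other.CrankItUp(this.Volume() - other.Volume());
}
```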
2.2 The Checkers
Spec# specifications are checked at three levels: by the
compiler, by dynamic checks, and by the static verifier. We
summarize the checks in this section.
2.2.0 Compiler Checks
The Spec# compiler checks three important properties.
Non-null Types. The non-null type checker enforces that
expressions of non-null types actually contain non-null val-
ues. Most aspects of the non-null type system can be checked
statically. Those that cannot (for instance, casts from possibly-null
to non-null types) lead to run-time checks and proof obligations
for the static verifier.
Because non-null annotations are nearly ubiquitous, our
compiler offers a command-line option for having non-null
types be the unmarked case: a reference type T that permits
null is then specified as T?, corresponding to nullable value
types in C#. That is, instead of T! and T, this option lets
one write T and T?, respectively.
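The two conventions express the same types (our sketch):

```cs
// Default mode: reference types admit null; "!" excludes it.
Speaker! left;    // never null
Speaker  backup;  // may be null

// With the non-null-by-default option enabled:
Speaker  left;    // never null
Speaker? backup;  // may be null
```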
The main complication in enforcing non-null types arises
during the initialization of variables. To make sure that
a constructor does not access the object being constructed
before its non-null fields are initialized, we developed two
solutions. The simpler solution is to allow the constructor
base call to occur anywhere within a constructor body (not
just first). The compiler can then perform syntactic checks
to make sure initialization occurs before the base call, which
means all non-null fields have non-null values after the base
call. The other solution caters to legacy code and cyclic data
structures by a more sophisticated data-flow analysis [10].
Importantly, the compiler is smart enough to track the
flow of null and non-null values. Therefore, non-null anno-
tations are typically applied only to fields and parameters.
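For instance, a null test narrows the type of a possibly-null value (our sketch):

```cs
public void Describe(string s)  // s may be null (default mode)
{
  if (s != null) {
    // The compiler's data-flow analysis knows s is non-null
    // here, so the dereference raises no proof obligation
    // and needs no annotation on the local value.
    int n = s.Length;
  }
}
```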
Method Purity. As mentioned earlier, we require con-
tracts to be pure, that is, side-effect free, to ensure that
dynamic contract checking does not interfere with the exe-
cution of the rest of the program and that contracts have a
simple semantics that can be encoded in the static verifier.
It would be easy to forbid the use of all side-effecting oper-
ations (such as field updates) in the body of a pure method,
but doing so would be too restrictive. For instance, a method
called in a specification might want to use an iterator to tra-
verse a collection. Creating and advancing the iterator are
side effects; however, these effects are not observable when
the method returns. Spec# enforces this notion of purity,
called weak purity, which forbids pure methods only from
changing the state of existing objects.
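For instance, the following method is weakly pure even though it allocates and advances an enumerator (our sketch; the collection type is hypothetical):

```cs
[Pure] public int CountTracks(List<Track>! cds)
{
  int n = 0;
  // foreach creates and mutates a fresh enumerator object;
  // these effects are invisible to the caller, since no object
  // that existed before the call is modified.
  foreach (Track t in cds) { n++; }
  return n;
}
```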
Admissibility. The compiler also ensures that specifica-
tions follow the Spec# methodology. This includes enforc-
ing limits on what can be mentioned in an object invariant
and what things a pure method is allowed to read. These
admissibility checks are crucial for sound static verification.
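For example, an object invariant may depend on the state of [Rep] sub-objects but not of arbitrary foreign objects; a variant of Fig. 0's class (our sketch):

```cs
public class Stereo {
  [Rep] Speaker! left = new Speaker();
  Speaker neighbor;                // not owned by this object

  invariant 0 <= left.Gain;       // admissible: left is [Rep]
  // invariant 0 <= neighbor.Gain;   rejected: neighbor lies
  //                                 outside the ownership subtree
}
```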
2.2.1 Dynamic Checking
Large portions of Spec# contracts can be checked at run
time. For instance, except for the non-null specifications
and method purity, all of the specifications shown in Fig. 0
result in run-time checks generated by the Spec# compiler.
To avoid the performance overhead of run-time checks, there
are compiler options for turning off the dynamic checks for
some of the contracts.
2.2.2 Static Checking
The program verifier flags violations both of the explicit
contracts and of the runtime semantics (e.g., null-dereference,
array index out of bounds, or division by zero). It checks one
method at a time. If the verification fails, it displays an er-
ror message, the location of the error, the trace through the
method that contains the error, and possibly a counterex-
ample. The problem can then be fixed by correcting the
program or, almost equally often, the specification. Fig. 1
illustrates how the IDE reports verification errors to the pro-
grammer.
Our static verifier is sound, but not complete. That is, it
finds all errors in a program, but it might also complain
about methods that are actually correct (so-called spurious
warnings). Such spurious warnings can often be fixed by pro-
viding more comprehensive specifications. In some cases, it
is necessary to add an assumption to the program, a condition
that is assumed to be true by the verifier.

Figure 1: A screenshot of the Spec# IDE. Verification
errors are indicated by squigglies, and tool tips
display error messages. The example could be fixed
by adding the precondition a.Length > 0.

To assume a
condition e, one uses the Spec# program statement assume
e, but it must be used with care because stating a wrong as-
sumption makes the verification unsound (an extreme case
is the statement assume false, which allows one to verify
anything). However, assumptions are checked at run time
and, thus, can be found during testing. assume statements
also make a good focus for human code reviews.
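For instance, the error in Fig. 1 could alternatively be suppressed by an assumption (our sketch; the helper Sum is hypothetical):

```cs
public int Average(int[]! a)
{
  // If the verifier cannot establish that a is non-empty,
  // the programmer may state it as an assumption:
  assume a.Length > 0;       // trusted statically, checked at run time
  return Sum(a) / a.Length;  // division by zero is now ruled out
}
```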
It is important to understand that static verification does
not fully replace testing. Tests are still necessary to en-
sure the requirements have been captured correctly, to check
those properties that are not expressed by contracts, and to
check properties ignored by our verifier (for instance, stack
overflows).
3. CASE STUDIES
The Spec# language and static verifier have been used in
a number of case studies, which helped to evaluate the per-
formance of the programming system and provided valuable
feedback to the Spec# team.
The largest program written in Spec# is the Spec# pro-
gram verifier itself. The code makes extensive use of non-
null types, but also includes contracts. Large portions of the
code (around 3500 lines of source code) are routinely verified
as part of the Spec# test suite. The verification guarantees
the absence of run-time errors, but includes only a few
functional-correctness properties.
Another case study was to retrofit an existing compiler
infrastructure with specifications. That experience helped
form the default method contracts based on peer consis-
tency.
Spec# has also been applied to code developed outside
the Spec# project to verify the absence of run-time errors.
For instance, Spec# was used to verify a file system device
driver of the Singularity operating system, demonstrating
that Spec# is (barely) flexible enough to handle system-
level code. Several students at ETH Zurich used Spec# to
verify libraries of the Mono .NET implementation.
Finally, we used Spec# to verify some classical textbook
examples. Supporting these programs is important because
they are likely to be among the first examples new users
might try and educators might want to use. For those ex-
amples, we verified functional correctness, typically simple
mathematical properties.
4. IMPACT
In this section, we summarize Spec#’s influence on re-
searchers and language designers in academia and industry.
4.0 Scientific Results
The main research focus of the Spec# project has been
on improving verification methodology by identifying com-
mon programming idioms and developing techniques and
notations for their specification and verification. We have
advanced verification methodology for object-oriented pro-
grams in three important areas. First, the Spec# methodol-
ogy supports sound modular verification of object invariants
in the presence of multi-object invariants, subclassing, and
call-backs. Second, Spec# has elaborate support for data
abstraction through model fields and pure methods. Third,
Spec#’s dynamic ownership model allows one to express
heap topologies and to use them for verification. Besides
these results in the area of verification methodology, the
Spec# project gained practical experience with a design of
non-null types and incorporated flexible object initialization
schemes. It also advanced the foundations of program ver-
ification, for instance, by providing a verification condition
generator for unstructured programs. The scientific contri-
butions of the Spec# project have been published in over 30
articles.
4.1 Impact on Academia
Spec# has had a major impact on academic research. A
number of research projects build directly on the Spec# in-
frastructure. SpecLeuven [14] is an extension of the Spec#
methodology and tools to handle concurrency. SpecLeuven
uses Spec#’s ownership system to enforce locking strategies.
Müller's group extends the Spec# tools, especially the IDE
integration, to provide the programmer with better expla-
nations of verification errors and counterexamples.
A number of research groups use the Boogie verification
engine, which was developed as part of the Spec# project [2].
For instance, various Java/JML, bytecode/BML, and Eiffel
projects use Boogie as the target for their verifiers. At the other
end, Boogie’s output is now also fed to interactive theorem
provers. In addition, we see people encode and verify new
logics, for instance region logics, and verify challenging ex-
amples, like garbage collectors.
Other projects do not use the Spec# infrastructure, but
seem to be influenced and inspired by the Spec# project.
Eiffel now supports attached types, a variation of a non-null
type system. JML does not include a non-null type sys-
tem, but now offers non-null annotations and even decided
to make non-null the default for all reference types. The idea
to run a program verifier within an IDE and to report verifi-
cation errors just like compiler errors has been picked up by
ESC/Java2, which now comes with an Eclipse integration.
Similarly, the Rodin tool provides an Eclipse integration for
the Event-B tools [19].
Spec# has been used to teach program verification at half
a dozen universities, mostly in graduate seminars but also at
the undergraduate level. We and others have taught Spec#
at a number of summer schools as well as several tutorials
at major conferences.
4.2 Impact on Industry
Initially, we had hoped to convince one of the program-
ming language teams at Microsoft to add Spec#-like fea-
tures. Not only is this a difficult proposition by itself, but
our focus on the Spec# language did not address the fact
that .NET is a multi-language platform. We also did not
have any support for unsafe (managed) code or for con-
currency and were battling a perception that verification
is relevant only for safety-critical software (which is not a
core business for Microsoft). Even so, Spec# has influenced
other projects in Microsoft Research as well as in product
groups.
HAVOC. HAVOC is a tool that repurposes parts of the
Spec# system to verify low-level sequential systems code
written in C [6]. HAVOC has been applied to verify prop-
erties of device drivers and critical components in the Win-
dows kernel. A version of HAVOC has also been targeted
to find specific errors in a very large code base at Microsoft.
HAVOC’s recent focus has been to infer specifications for
legacy code bases.
VCC. The Verifying C Compiler (VCC) project [7] is spec-
ifying and verifying a shipping code base: Microsoft Hyper-
V. Hyper-V is a hypervisor — a thin layer of software that
sits just above the hardware and beneath one or more oper-
ating systems. VCC adopts Spec#’s tool chain and method-
ology. It addresses Spec#’s limits in two dimensions: it in-
cludes full functional verification and it verifies concurrent
operating system code. For the latter, VCC allows two-
state invariants that span multiple objects without sacrific-
ing thread or data modularity.
Code Contracts for .NET. The lessons we learned from
Spec# inspired a new project, Code Contracts for .NET. To
avoid the need to get users to adopt a new language (and
the need to support it), we introduced a library-based
approach. Taking advantage of the interoperability of all .NET
languages, the methods of the contract library, such as
Contract.Requires, can be called from any .NET
program. Both method contracts and object invariants are
supported, although we intentionally do not (yet) offer a
sound treatment for invariants. Non-null types and purity
checking are not supported.
A compiler generates the normal MSIL code for the calls
to contract methods, and post-build tools then extract the
contracts and use them for both dynamic and static check-
ing. We have also investigated how such contracts can be ex-
ploited by other tools (like Pex, see sidebar). As of .NET 4.0
(Beta 1 shipped in May 2009), the contract library is now
a part of mscorlib, .NET’s standard library. The associated
tools are distributed through DevLabs, a web site where the
Visual Studio group makes early technology available.
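The library-based approach can be sketched in miniature. The following Python analogue is hypothetical: only the name Contract.Requires comes from Code Contracts, while the ContractError class, the Ensures helper, and the isqrt example are ours for illustration. It shows how contract methods can be ordinary library calls that double as run-time checks:

```python
# Hypothetical Python analogue of a library-based contract facility.
# Only the name Contract.Requires is taken from Code Contracts; the rest
# is an illustrative sketch, not the .NET API.

class ContractError(AssertionError):
    """Raised when a dynamically checked contract is violated."""

class Contract:
    @staticmethod
    def Requires(condition, message="precondition violated"):
        # Dynamic checking: a failed precondition raises immediately.
        if not condition:
            raise ContractError(message)

    @staticmethod
    def Ensures(condition, message="postcondition violated"):
        if not condition:
            raise ContractError(message)

def isqrt(n):
    """Integer square root, with contracts stated as plain library calls."""
    Contract.Requires(n >= 0, "n must be non-negative")
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    Contract.Ensures(r * r <= n < (r + 1) * (r + 1))
    return r
```

Because the contracts are ordinary calls, any client in any language of the platform can invoke them; post-build tools can then extract the same conditions for static checking.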
5. SONGS OF EXPERIENCE
In this section, we reflect on our initial aspirations and
design decisions.
5.0 Not Being a Toy Language
The fact that we built Spec# as a full-scale .NET language
and made a mode for it inside the Visual Studio IDE has had
far-reaching consequences. For one, it made the scope of the
project large enough to include a wealth of both scientific
and engineering challenges. The project shows that it is
possible to build a practical verifier of this scale; in fact,
given the present availability of SMT solvers and verification
engines like Boogie or Why [11], the task of building a verifier
is now smaller than it was. Let us take a closer look at some
decisions we made and how they fared.
Most importantly, being a full language that compiles to
a common platform has increased the credibility of the re-
search. We feel that it has heightened the impact of the
research, letting us approach people who, especially at Mi-
crosoft, might not have been impressed by a one-off system.
The integration into Visual Studio allowed us to do back-
ground verification at design time, immediately indicating
errors in the program text by the underlinings fondly known
as squigglies. It also allowed us to populate tool tips, the lit-
tle floating windows that pop up when the mouse hovers
near an element, with contracts that boost programmer un-
derstanding of the code. A crucial consequence of this is how
it has allowed us to, through live demos, communicate the
vision of our project. Demos aside, if verification is ever to
make it into the daily rhythm of mainstream programming,
it will be through such a design-time interface that provides
on-line verification.
Having access to existing programs and libraries has two
important advantages: it lets you try out your ideas and it
brutally reveals problems that still need solutions. It thus
both validates research and guides the way to more research
problems to be tackled. On numerous occasions, seeing the
results of our experiments forced us to support previously
ignored features and to alter and expand our specification
methodology.
Dealing with a full language also has disadvantages. Build-
ing the prototype system takes effort but is helped by the ini-
tial energy and enthusiasm that go into the creation of a new
research project. Once most of the system is in place, how-
ever, adding or modifying features of the system becomes a
larger undertaking than one would wish. Frequently, we felt
that we were not able to move as quickly as we wished. For
example, adding a new feature with syntax required changes
not just to the parser, but also to the rest of the compiler, the
admissibility checker, the meta-data encoder and decoder, and
the verifier.
Our IDE integration did just enough to communicate our
vision. However, our implementation thereof is far inferior to
product-quality integration, and the Spec# mode in Visual
Studio is downright clunky compared to the whiz-bang C#
mode. Using our own integration, we were able to provide
the programmer with feedback as the program is keyed in,
but it also meant that we were unable to take advantage of
all of the IDE advances, such as refactoring support, without
a prohibitively large engineering investment.
If one were to do the research project again, it is not clear
that extending an existing language (here, C#) is the best
strategy. Not only does it mean having to deal with con-
structs that are difficult to reason about, but it also presents
a problem when the base language evolves. For example,
for us to migrate Spec# to extend versions 3 and 4 of C#
would require more development resources than we have, all
for the purpose of supporting features of marginal research
return. In contrast, SPARK Ada is built as a subset of Ada,
which more easily lets a language designer pick features that
mesh well with verification and trivially solves the problem
of what to do when the base language evolves. But a subset
is also problematic, because it makes it more difficult to
apply the verifier to legacy code.
A final point is the question of where in the compilation
chain to apply verification. The Spec# program verifier ac-
tually starts with the MSIL bytecode that the compiler pro-
duces. This lets the verifier ignore syntactic variations of-
fered by the source language (for example, for loops versus
while loops) and allows cross-language verifiers to be built.
But for some features, like C#’s CLU-like iterators, it would
be much easier to start with the source constructs intact
than to first reverse engineer, and then verify, the auxiliary
classes and chopped-up method bodies that the compiler
emits into the bytecode. While one might be tempted to
start at the source, we feel that it is the right thing to start
with MSIL—otherwise every language would have to write
its own verifier. But compilers should annotate the bytecode
to make it easy to recover higher-level information.
5.1 Non-null Types
Non-null types have proven to be useful and easy to use.
Like other successful type features, they provide an enforce-
able discipline that hits a sweet spot of ruling out most pro-
grams with certain kinds of errors while allowing most pro-
grams without such errors. In our experience, the non-null
types are almost universally liked by users; the main
exception is the difficulty of converting legacy code.
We started the Spec# project with reference types being
possibly-null by default (as in C# and Java) and requir-
ing the type modifier ! to express a non-null type (except
for local variables, where the type checker automatically in-
ferred the non-null mode). We found, however, that in our
own code, the non-null-by-default option led to less clutter.
Consequently, our suggestion to future language designers is
to make non-null the default and possibly-null the explicitly
marked option.
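As a rough dynamic analogue of this design, consider the following hypothetical Python sketch (Spec# enforces non-null types statically; the NonNull descriptor and the Account class are ours for illustration). Fields reject null by default, and a possibly-null field is the explicitly marked exception:

```python
# Hypothetical run-time analogue of non-null fields. Spec# enforces this
# statically through its type system; Python can only check at run time.

class NonNull:
    """Field descriptor that rejects None, mirroring a non-null type."""
    def __set_name__(self, owner, name):
        self.name = "_" + name
    def __set__(self, obj, value):
        if value is None:
            raise TypeError(f"{self.name[1:]} is declared non-null")
        setattr(obj, self.name, value)
    def __get__(self, obj, objtype=None):
        return getattr(obj, self.name)

class Account:
    owner = NonNull()   # non-null by default, as we recommend
    note = None         # possibly-null: the explicit opt-in

    def __init__(self, owner):
        self.owner = owner

a = Account("Alice")
# a.owner = None  would raise TypeError: owner is declared non-null
```

The opt-in direction matters: in our experience, most fields and parameters are intended to be non-null, so the rarer possibly-null case is the one worth annotating.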
As a downside, our non-null type system has some holes
resulting from the engineering compromises required to in-
tegrate the types into an existing platform. A complete
system needs support from the .NET virtual machine, for
example, to ensure that element assignments respect non-null
element types despite .NET's covariant arrays. As another example,
handling non-null static fields requires more control during
class initialization than the .NET machine provides. We
designed (sometimes complicated) workarounds for the lack
of virtual-machine support, but we did not implement all
of these workarounds. Our hope is that future execution
platforms will be designed with non-null types in mind.
Our technology-transfer attempt of non-null types into
Microsoft languages left us with a surprise. We had felt
that the fruits of our non-null research were ready for prime
time. But, as we just described, non-null types do not reap
their full benefits in a single language—they need platform
support. Interestingly enough, as we described in Sec. 4, we
had better luck with the technology transfer of contracts.
Meanwhile, non-Microsoft languages have picked up non-
null types, as we mentioned in Sec. 4.1.
5.2 Dynamic and Static Checking
Another major goal of our initial design was to support
both dynamic and static checking of specifications. Such a
design has several advantages. Most importantly, it immedi-
ately rewards users for writing specifications, because these
will turn into useful run-time assertions. And each
specification added in that way helps reduce the additional cost of
writing provable specifications at a later time, when the program
might be run through the program verifier. Another ad-
vantage of using specifications for both dynamic and static
checking is that the user has to learn just one specification
language.
Whenever dynamic checking would be too expensive—for
example, in the enforcement of method frames—our princi-
ple was to drop the dynamic check. If dynamic checks are
a subset of the static checks, then one still obtains the nice
property that any program that is statically verified will run
without dynamic violations of specifications. An important
exception to this subsetting is the assume statement, whose
sole purpose is to trade a dynamic check for an assumption
by the static verifier.
Unfortunately, we were not always consistent with this
principle. For example, we introduced quantified expres-
sions (like forall) not just in specifications, but also in code,
and this (mis)led us to insist that all quantified expressions
be executable. This ruled out some quantifiers like those
that range over all objects in the heap, which are sometimes
useful in static verification. Worse, it led us to allow any
executable code in quantified expressions, which exceeded
our abilities to generate sensible verification conditions for
all quantified expressions. It is now clear to us that this is
not a great design for quantified expressions.
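The executability issue can be made concrete with a Python sketch (the Spec#-like forms in the comments are approximations of its quantifier syntax, not verbatim quotes). Quantifiers over finite index ranges run fine; heap-ranging quantifiers have no executable counterpart:

```python
# Python sketch of executable quantified expressions. The commented
# forall/exists forms approximate Spec#-style syntax for comparison.

a = [2, 4, 6, 8]

# roughly: forall {int i in (0 : a.Length); a[i] % 2 == 0}
all_even = all(a[i] % 2 == 0 for i in range(len(a)))

# roughly: exists {int i in (0 : a.Length); a[i] > 5}
has_big = any(a[i] > 5 for i in range(len(a)))

# A quantifier ranging over all objects in the heap has no such executable
# reading: there is no finite collection to enumerate at run time, which
# is why such quantifiers are usable only by the static verifier.
```

Restricting quantified expressions to such enumerable ranges keeps them both provable and runnable; allowing arbitrary executable code inside them, as we did, forfeits the former.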
A point that often comes up in discussions is the prospect
of omitting the dynamic checks of those specifications that
have been statically verified. We never got around to trying
this for Spec#. However, we do offer a word of caution
to others who might consider doing so: The soundness of
such optimizations relies on all proof obligations in all parts
of the program being enforced in some way, which is dicey
when (as in Spec#) static verification is optional and not all
specifications generate dynamic checks, or in an environment
(like .NET) where some of the interoperating languages have
neither static nor dynamic checking.
5.3 Verifying Loops
A familiar issue with static verification is the need for loop
invariants. Whereas a tool like ESC/Java avoids this issue
by checking only a bounded number of iterations of each
loop, Spec# verifies all iterations, which comes at a cost.
Our experience shows this cost to be moderately low, which
we explain as follows. The effective loop invariant draws
from three sources. One source is that the Spec# program
verifier includes an abstract interpreter that infers loop in-
variants. By default, it only infers relations between local
variables and numerical constants (the so-called interval ab-
stract domain), but even this provides some useful lower
bounds for some loop indices. A second source of loop in-
variants is loop-modification inference [9]: Spec# enforces
the method frame on every loop iteration. In other words,
an automatic part of the invariant of a loop is the modifies
clause of the enclosing method. This is the “biggest” part
of the effective loop invariant, and it is also the part that
no user would want to supply explicitly. Finally, the third
source is user-supplied loop invariants. Because of the
first two sources, there are many loops that require no user-
supplied loop invariants at all, especially for methods with
no postconditions. As one takes steps toward functional-
correctness verification, the need for user-supplied loop in-
variants is likely to increase.
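The three sources can be illustrated by turning them into dynamic checks for a simple summation loop (a hypothetical Python sketch; Spec# discharges these obligations statically):

```python
# Hypothetical Python sketch: the three sources of an effective loop
# invariant, rendered as run-time assertions.

def sum_upto(n):
    """Returns 0 + 1 + ... + (n-1)."""
    assert n >= 0                        # precondition
    total, i = 0, 0
    while i < n:
        # 1. Inferred by the interval abstract domain: a lower bound on i.
        assert 0 <= i
        # 2. Loop-modification inference: the enclosing method's modifies
        #    clause holds at each iteration (this loop touches only
        #    locals, so there is nothing further to check here).
        # 3. User-supplied invariant, relating total to i:
        assert total == i * (i - 1) // 2
        total += i
        i += 1
    assert total == n * (n - 1) // 2     # postcondition
    return total
```

Only the third assertion needs to be written by hand, and only because the method has a postcondition to establish; the first two come for free from the verifier.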
5.4 Methodology
The Spec# methodology led us to the first implementa-
tion of a sound modular approach to specifying and verifying
object invariants and method frames. Compared to earlier
solutions [21], the Spec# methodology is better suited for
automatic verification using SMT solvers. Since we started
the project, others have designed some alternative method-
ologies [15, 24]. The Spec# methodology has been stream-
lined for some common object-oriented patterns (e.g., ag-
gregate objects), and it lends itself to concise specifications
of programs that fall within those common patterns. How-
ever, for programs that use more complicated patterns, the
methodology can be too restrictive. For example, this caused
us to lose the interest of one otherwise avid user at Microsoft
who was trying to verify a large body of code.
The learning curve of the methodology has been steeper
than we would have liked, and non-expert users can have
problems knowing what to do in response to certain error
messages. The fact that we constantly changed the method-
ology to improve it, and along with it the terminology we
used, complicated the users’ view. We hope to have miti-
gated this problem somewhat with our new tutorial [17].
One breakthrough in the methodology was how to en-
code method frames to the SMT solver [3]. The beautiful
aspect of the solution was that the encoding required only
an ordinary universal quantifier—no recursive functions or
reachability predicates. However, many of the performance
problems we experienced arose from the large number of
these quantifiers that are emitted for methods that make
many calls.
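The idea behind the encoding can be sketched by modeling the heap explicitly (a hypothetical Python illustration, not the actual SMT encoding of [3]): the frame condition becomes a single universally quantified statement over heap locations.

```python
# Hypothetical sketch of encoding a method frame as one universal
# quantifier: model the heap as a map from (object, field) to value and
# require every location outside the modifies clause to be unchanged.

def frame_respected(pre_heap, post_heap, modifies):
    # "forall o, f :: (o, f) not in modifies ==> post[o, f] == pre[o, f]"
    return all(post_heap[loc] == pre_heap[loc]
               for loc in pre_heap
               if loc not in modifies)

pre  = {("a", "x"): 1, ("a", "y"): 2, ("b", "x"): 3}
post = {("a", "x"): 7, ("a", "y"): 2, ("b", "x"): 3}

ok  = frame_respected(pre, post, modifies={("a", "x")})  # frame respected
bad = frame_respected(pre, post, modifies=set())         # a.x changed
```

One such quantified fact is asserted after every call, which is why methods that make many calls flood the prover with quantifiers.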
5.5 Miscellaneous
We make four more remarks about our experience.
First, we had set out to build a sound verification sys-
tem. While we have found sound solutions to fundamen-
tal problems of modular verification of object-oriented pro-
grams, our implementation is not perfect. There are several
known unimplemented features and, given the number of er-
rors found in our implementation and semantic encoding so
far, we predict there are also unknown errors.
Second, the best and most far-reaching single design deci-
sion we made in the implementation of the Spec# program
verifier was to introduce an intermediate language in be-
tween the Spec# program and the formulas sent to the the-
orem prover. This intermediate verification language, called
Boogie (like the verification engine that uses it), was de-
signed for manual authoring as well as for automatic transla-
tion [2]. This extra layer of indirection meant that many al-
ternative design decisions could be investigated in a lightweight
fashion by just hand-modifying the translated Spec# pro-
gram. Another benefit of having this important separation
of concerns is the use of Boogie as the backend for other
programming languages and verification systems and as the
frontend for different theorem provers. For example, the
HAVOC and VCC tools mentioned in Sec. 4.2 both build on
Boogie.
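The flavor of such a translation can be conveyed with a toy weakest-precondition computation (a hypothetical Python sketch over a four-command language; actual Boogie is far richer): each command is reduced to a predicate over the initial state, which is what ultimately reaches the theorem prover.

```python
# Toy illustration (not Boogie itself) of reducing commands to a single
# verification condition. Predicates are Python functions over a state dict.

def wp(cmd, post):
    """Weakest precondition of `cmd` with respect to predicate `post`."""
    kind = cmd[0]
    if kind == "assign":                   # ("assign", var, expr_fn)
        _, var, expr = cmd
        return lambda s: post({**s, var: expr(s)})
    if kind == "assert":                   # ("assert", cond_fn)
        _, cond = cmd
        return lambda s: cond(s) and post(s)
    if kind == "assume":                   # ("assume", cond_fn)
        _, cond = cmd
        return lambda s: (not cond(s)) or post(s)
    if kind == "seq":                      # ("seq", c1, c2)
        _, c1, c2 = cmd
        return wp(c1, wp(c2, post))
    raise ValueError(f"unknown command: {kind}")

# x := x + 1; assert x > 0   under the trivial postcondition
prog = ("seq", ("assign", "x", lambda s: s["x"] + 1),
               ("assert", lambda s: s["x"] > 0))
pre = wp(prog, lambda s: True)
# pre(s) holds exactly when x + 1 > 0 in the initial state s.
```

Experimenting at this intermediate level, by hand-editing the translated program rather than the source-to-logic pipeline, is exactly the lightweight exploration the Boogie layer made possible.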
Third, a conclusion we have drawn from our interaction
with developers is that real developers do appreciate con-
tracts—contracts are not just an esoteric feature prescribed
by “Dijkstra clones”. Unfortunately, we have also seen an
unreasonable seduction by static checking. When program-
mers see our demos, they often develop a romantic enthusi-
asm that does not correspond to verification reality. Post-
installation depression can then set in as they encounter dif-
ficulties while trying to verify their own programs.
Fourth, from the research standpoint, having source syn-
tax for specification constructs homes in on the important
concepts and lets the program text be concise and usefully
descriptive. However, from the adoption standpoint, there
are two problems with this approach. One is that the engi-
neering overhead associated with being a superset language
is a high price to pay. The other is that we want adoption
in the platform, not just in a single language. Therefore, we
gradually steered toward a language-independent solution
by providing the specification constructs via a library, and
this eventually became Code Contracts. With Code Con-
tracts as an approximation of built-in contract features in
the platform, individual languages can now consider adding
convenient source syntax.
6. CONCLUSIONS
Since the Spec# project started, the Verified Software Ini-
tiative has organized the verification community to work to-
wards larger projects, larger risks, and a long-term view of
program verification. All of our work fits into these aims.
Going forward, we find that Spec# has evolved from a
single-language vision into four ongoing prongs.
One prong is the Spec# project itself, which is now enter-
ing a new phase: through a new open-source release (adding
to the binary download1), we hope to see continued improve-
ments of the Spec# programming system from a larger com-
munity. We also hope to see more use of Spec# in teaching,
especially in light of a new and comprehensive tutorial [17].
A second prong is the Boogie language and verification
engine, which are seeing continued adoption as a standard
intermediate verification language. Several projects, includ-
ing HAVOC and VCC, have discovered the usefulness of the
abstraction provided by the Boogie language.
A third prong is a new strand of research that attempts
functional-correctness verification using automatic tools. Us-
ing a variation of the Spec# methodology and the Boogie
intermediate verification language, VCC is an example of
such a project, and the experience with it so far has been
very positive.
A fourth prong aims at the mass adoption of specifications
in programming. Our language-independent Code Contracts
library, which has become an official part of the .NET 4.0
platform, goes beyond what any single language can do—it
puts a specification facility into the platform. Through the
associated tool support, we hope this will lead to improve-
ments in the software engineering process.
These four prongs continue to push frontiers in the quest
for verified software.
7. ACKNOWLEDGMENTS
Spec# would not exist were it not for all of the research
interns and users who helped bring it into existence. We
also express our thanks to the early reviewers of this paper.
1 http://research.microsoft.com/specsharp
8. REFERENCES
[0] A. L. Ambler, D. I. Good, J. C. Browne, W. F.
Burger, R. M. Cohen, C. G. Hoch, and R. E. Wells.
GYPSY: A language for specification and
implementation of verifiable programs. SIGPLAN
Notices, 12(3):1–10, Mar. 1977.
[1] J. Barnes. High Integrity Software: The SPARK
Approach to Safety and Security. Addison Wesley,
2003.
[2] M. Barnett, B.-Y. E. Chang, R. DeLine, B. Jacobs,
and K. R. M. Leino. Boogie: A modular reusable
verifier for object-oriented programs. In F. S. de Boer,
M. M. Bonsangue, S. Graf, and W.-P. de Roever,
editors, Formal Methods for Components and Objects:
4th International Symposium, FMCO 2005, volume
4111 of LNCS, pages 364–387. Springer, Sept. 2006.
M. Barnett, R. DeLine, M. Fähndrich, K. R. M.
Leino, and W. Schulte. Verification of object-oriented
programs with invariants. JOT, 3(6):27–56, 2004.
www.jot.fm.
[4] M. Barnett, K. R. M. Leino, and W. Schulte. The
Spec# programming system: An overview. In
G. Barthe, L. Burdy, M. Huisman, J.-L. Lanet, and
T. Muntean, editors, CASSIS 2004, Construction and
Analysis of Safe, Secure and Interoperable Smart
Devices, volume 3362 of LNCS, pages 49–69. Springer,
2005.
[5] L. Burdy, Y. Cheon, D. R. Cok, M. D. Ernst, J. R.
Kiniry, G. T. Leavens, K. R. M. Leino, and E. Poll.
An overview of JML tools and applications.
International Journal on Software Tools for
Technology Transfer, 7(3):212–232, June 2005.
[6] S. Chatterjee, S. K. Lahiri, S. Qadeer, and
Z. Rakamaric. A reachability predicate for analyzing
low-level software. In O. Grumberg and M. Huth,
editors, Tools and Algorithms for the Construction
and Analysis of Systems, 13th International
Conference, TACAS 2007, LNCS, pages 19–33.
Springer, Mar. 2007.
[7] E. Cohen, M. Dahlweid, M. Hillebrand,
D. Leinenbach, M. Moskal, T. Santen, W. Schulte, and
S. Tobies. Verifying system-level C code with the
Microsoft Verifying C Compiler (VCC). In T. Nipkow and
C. Urban, editors, Theorem Proving in Higher Order
Logics, 22nd International Conference, TPHOLs 2009,
volume 5674 of LNCS. Springer, 2009.
[8] D. Detlefs, G. Nelson, and J. B. Saxe. Simplify: a
theorem prover for program checking. J. ACM,
52(3):365–473, May 2005.
[9] D. L. Detlefs, K. R. M. Leino, G. Nelson, and J. B.
Saxe. Extended static checking. Research Report 159,
Compaq Systems Research Center, Dec. 1998.
[10] M. Fähndrich and S. Xia. Establishing object
invariants with delayed types. In Object-oriented
programming, systems, languages, and applications
(OOPSLA), pages 337–350. ACM Press, 2007.
[11] J.-C. Filliâtre and C. Marché. The
Why/Krakatoa/Caduceus platform for deductive
program verification. In Computer Aided Verification,
19th International Conference, volume 4590 of LNCS,
pages 173–177. Springer, 2007.
[12] C. Flanagan, K. R. M. Leino, M. Lillibridge,
G. Nelson, J. B. Saxe, and R. Stata. Extended static
checking for Java. In Proceedings of the 2002 ACM
SIGPLAN Conference on Programming Language
Design and Implementation (PLDI), pages 234–245.
ACM, May 2002.
[13] R. W. Floyd. Assigning meanings to programs. In
Mathematical Aspects of Computer Science, volume 19
of Proceedings of Symposium in Applied Mathematics,
pages 19–32. American Mathematical Society, 1967.
[14] B. Jacobs, F. Piessens, J. Smans, K. R. M. Leino, and
W. Schulte. A programming model for concurrent
object-oriented programs. ACM transactions on
programming languages and systems, 31(1):1–48, 2008.
[15] I. T. Kassios. Dynamic frames: Support for framing,
dependencies and sharing without restrictions. In
J. Misra, T. Nipkow, and E. Sekerinski, editors,
Formal Methods (FM), volume 4085 of LNCS, pages
268–283. Springer, 2006.
[16] G. T. Leavens, A. L. Baker, and C. Ruby. Preliminary
design of JML: A behavioral interface specification
language for Java. ACM SIGSOFT Software
Engineering Notes, 31(3):1–38, Mar. 2006.
[17] K. R. M. Leino and P. Müller. Spec# tutorial. In
LASER summer school lecture notes, LNCS. Springer,
2009. To appear.
[18] D. C. Luckham, S. M. German, F. W. von Henke,
R. A. Karp, P. W. Milne, D. C. Oppen, W. Polak, and
W. L. Scherlis. Stanford Pascal Verifier user manual.
Technical Report STAN-CS-79-731, Stanford
University, 1979.
[19] F. D. Mehta. Proofs for the Working Engineer. PhD
thesis, ETH Zurich, 2008.
[20] B. Meyer. Object-oriented Software Construction.
Series in Computer Science. Prentice-Hall
International, New York, 1988.
[21] P. Müller. Modular Specification and Verification of
Object-Oriented Programs, volume 2262 of LNCS.
Springer, 2002.
[22] T. Nipkow, L. C. Paulson, and M. Wenzel.
Isabelle/HOL: a proof assistant for higher-order logic.
Springer, 2002.
[23] S. Owre, J. M. Rushby, and N. Shankar. PVS: A
prototype verification system. In D. Kapur, editor,
Conference on Automated Deduction (CADE), volume
607 of Lecture Notes in Artificial Intelligence, pages
748–752. Springer, 1992.
[24] M. J. Parkinson and G. M. Bierman. Separation logic
and abstraction. In J. Palsberg and M. Abadi, editors,
Proceedings of the 32nd ACM SIGPLAN-SIGACT
Symposium on Principles of Programming Languages,
POPL 2005, pages 247–258. ACM, Jan. 2005.
APPENDIX
A. PROPOSED SIDEBAR: THE MSR ECOSYSTEM
Spec# came into existence and prospered because of the
rich set of other projects within Microsoft Research that
we collaborated with and have depended upon. Here is a
brief overview of some of them, along with pointers for more
information.
Z3 Z3 is a state-of-the-art Satisfiability Modulo Theories
(SMT) solver. It combines decision procedures for func-
tions, arithmetic, and logical quantifiers. Because of its
high performance, it is the default SMT solver used by
the Boogie verification engine.
CCI The Microsoft Research Common Compiler Infrastruc-
ture (CCI) is a set of base classes that implement com-
mon functionality needed by compilers. It takes care
of intermediate code generation and provides help with
symbol table management, meta-data importing, name
resolution, overload resolution, error reporting and so
on. It also includes functionality that helps with in-
tegration into the Visual Studio development environ-
ment. It was first developed as part of the implementa-
tion of Comega, but has since been used for a number
of other compilers, including the Spec# compiler. Re-
cently, a newly redesigned version of the core parts of
CCI has been released as an open source project (see
http://ccimetadata.codeplex.com).
Pex Pex is a white-box unit-test generation tool. It au-
tomatically produces a small test suite with high code
coverage for .NET programs. To this end, Pex performs
a systematic program analysis (using dynamic symbolic
execution, similar to path-bounded model-checking) to
determine test inputs for Parameterized Unit Tests.
Pex learns the program behavior by monitoring exe-
cution traces, and an SMT solver computes new test
inputs which exercise different program behavior. For
more information see http://research.microsoft.com/pex.
B. PROPOSED SIDEBAR: GLOSSARY
There were many terms for which we did not have room to
include citations. This is a list of terms for which we
could at least provide a one-sentence definition and citation,
if that seems desirable. (This list may not contain
everything from the paper that it could. We can continue to
add to it.)
• Behavioral subtyping
• bytecode/BML
• Hypervisor
• IDE
• JML
• MSIL
• .NET
• Singularity/Sing#
• SMT
• Eiffel
• SPARK Ada
• Event-B