J. Functional Programming 9 (1): 77–91, January 1999. Printed in the United Kingdom
© 1999 Cambridge University Press
de Bruijn notation as a nested datatype
RICHARD S. BIRD
Programming Research Group, Oxford University,
Wolfson Building, Parks Road, Oxford OX1 3QD, UK
ROSS PATERSON
Department of Computer Science, City University,
Northampton Square, London EC1V 0HB, UK
“I have no data yet. It is a capital mistake to
theorise before one has data.”
Sir Arthur Conan Doyle
The Adventures of Sherlock Holmes
Abstract
de Bruijn notation is a coding of lambda terms in which each occurrence of a bound variable x
is replaced by a natural number, indicating the ‘distance’ from the occurrence to the abstraction
that introduced x. One might suppose that in any datatype for representing de Bruijn terms,
the distance restriction on numbers would have to be maintained as an explicit datatype invariant.
However, by using a nested (or non-regular) datatype, we can define a representation in which
all terms are well-formed, so that the invariant is enforced automatically by the type system.
Programming with nested types is only a little more difficult than programming with regular
types, provided we stick to well-established structuring techniques. These involve expressing
inductively defined functions in terms of an appropriate fold function for the type, and using
fusion laws to establish their properties. In particular, the definition of lambda abstraction
and beta reduction is particularly simple, and the proof of their associated properties is
entirely mechanical.
Capsule Review
Many functional languages (certainly ML and Haskell) allow nested data types: recursive
data types in which the recursive occurrence on the right-hand side of the definition is applied
to different arguments than the left-hand side. (The Introduction of the paper elaborates.)
But such types are almost unusable unless the language supports polymorphic recursion, in
which a function can call itself recursively at a different type to the “parent” call. Abstracting
such functions into folds (as is commonly done with ordinary recursive functions) requires
the fold to take a polymorphic function as its argument. Finally, it turns out to be essential to
abstract over type constructors, not only over types.
This fascinating paper shows that if these extensions are supported, then nested data
types become genuinely useful. Through a particular example (de Bruijn notation), Bird and
Paterson show that nested data types can express and statically enforce useful invariants,
which is, of course, what type systems are for.
In short, this paper demonstrates by example that some relatively modest extensions to the
Hindley-Milner type system have practical utility. Whether this is an isolated example, or a
member of a large and compelling class of applications, remains to be seen.
1 Introduction
A standard representation of lambda terms, with variables of type v, in Haskell
involves essentially the following datatype:
data Term v = Var v | App (Term v, Term v) | Lam v (Term v)
The problem with the standard representation is that while abstraction is easy
to implement, application is not. Application of a lambda term Lam x b to an
argument t involves substituting t for all free occurrences of x in b. Care has to
be taken to avoid the capture of free variables in t by bound variables in b. To
overcome this problem, de Bruijn (1972) proposed a notation for lambda expressions
in which bound variables do not occur. In his notation, no variable appears after
the constructor Lam, and bound variables appear as natural numbers. The number
assigned to an occurrence of a bound variable x is the depth of nesting of Lam terms
between that occurrence and the (closest) binding occurrence of x. For example,
λx.x (λy.x y (λz.x y z))
translates to
λ.0 (λ.1 0 (λ.2 1 0))
This example is taken from Paulson (1996), which discusses de Bruijn notation in
detail.
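The translation above can be sketched as a small program. This is only an illustration using a plain integer-indexed type (not the nested datatype introduced later); the names `Named`, `DB`, `toDB` and `example` are hypothetical and do not appear in the paper.

```haskell
import Data.List (elemIndex)

-- Named lambda terms (a hypothetical type for this sketch).
data Named = V String | A Named Named | L String Named

-- Plain de Bruijn terms with integer indices.
data DB = DVar Int | DApp DB DB | DLam DB
  deriving (Eq, Show)

-- Convert a named term. The environment lists the enclosing binders,
-- innermost first, so the index of a variable is its position in the list.
-- Returns Nothing if a free variable is encountered.
toDB :: [String] -> Named -> Maybe DB
toDB env (V x)   = DVar <$> elemIndex x env
toDB env (A f e) = DApp <$> toDB env f <*> toDB env e
toDB env (L x b) = DLam <$> toDB (x : env) b

-- The example term  λx.x (λy.x y (λz.x y z)).
example :: Named
example =
  L "x" (A (V "x")
           (L "y" (A (A (V "x") (V "y"))
                     (L "z" (A (A (V "x") (V "y")) (V "z"))))))
```

Evaluating `toDB [] example` should give the `DB` value corresponding to λ.0 (λ.1 0 (λ.2 1 0)).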
If one wants to represent lambda terms involving both bound and free variables in
the de Bruijn style, then the declaration of Term v has to be changed. One possibility,
used in Paulson (1996), is to have two kinds of variable: free variables drawn from
v, and bound variables drawn from Int . Another possibility is to use a datatype
declaration
data Term v = Var v | App (Term v, Term v) | Lam (Term (Incr v))
data Incr v = Zero | Succ v
In the body of a lambda abstraction, the set of variables is augmented with an extra
element, the variable bound by the lambda. This variable is denoted by Zero; each
free variable x is renamed Succ x inside the lambda. For example, the terms λx.x
and λx.λy.x are represented as
Lam (Var Zero) and Lam (Lam (Var (Succ Zero)))
The term λx.λy.x y z, containing a free variable z, may be represented as the following
element of Term Char:
Lam (Lam (App (App (Var (Succ Zero), Var Zero), Var (Succ (Succ 'z')))))
The type Term is an example of a nested datatype (Bird and Meertens, 1998) because
its definition has a recursive use with a different argument from the left-hand side.
Such definitions are also sometimes called non-regular.
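As a minimal sketch of how the nested declaration behaves in a present-day Haskell compiler, the three example terms can be written down directly. The `deriving` clauses, the empty type `Empty`, the synonym `Closed` and the function `depth` are additions made for this illustration and are not part of the paper.

```haskell
-- The nested representation from the text; the deriving clauses are added
-- only so that the examples can be compared and printed.
data Incr v = Zero | Succ v
  deriving (Eq, Show)

data Term v = Var v | App (Term v, Term v) | Lam (Term (Incr v))
  deriving (Eq, Show)

-- A type with no values: a term of type Term Empty has no free variables.
-- (Empty and Closed are names chosen for this sketch.)
data Empty
type Closed = Term Empty

-- λx.x
identity :: Closed
identity = Lam (Var Zero)

-- λx.λy.x
constK :: Closed
constK = Lam (Lam (Var (Succ Zero)))

-- λx.λy.x y z, with the free variable 'z'.
openTerm :: Term Char
openTerm =
  Lam (Lam (App (App (Var (Succ Zero), Var Zero), Var (Succ (Succ 'z')))))

-- Depth of lambda nesting. The recursive call in the Lam case is at type
-- Term (Incr v), so the type signature is required.
depth :: Term v -> Int
depth (Var _)      = 0
depth (App (f, e)) = max (depth f) (depth e)
depth (Lam b)      = 1 + depth b
```

An ill-scoped term such as `Lam (Var (Succ Zero)) :: Closed` is rejected by the type checker, because `Zero` would have to be a value of type `Empty`.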
Our aim in this paper is to study this novel representation of lambda terms, and
to give the implementations of abstraction and application. Useful and interesting
examples of nested datatypes have been rather thin on the ground until recently,
and de Bruijn notation gives us an excellent opportunity to explore the theory
in the context of a specific example. We believe that the right way to proceed
into the largely uncharted territory of nested types is to stick to the structuring
principles provided by the now well-established theory of regular datatypes. This
theory is reviewed briefly in section 2. In section 3 we introduce the type of lambda
terms, and set up appropriate machinery for defining functions over this type. The
implementations of abstraction and application are given in section 4. In the final
section, we will generalise what we have learnt to cover an extension of de Bruijn’s
notation.
Another aim of the paper concerns proof. In our view, equational properties
of functions are most easily proved when functions are defined as combinations
of other functions, using functional composition rather than application as the
primary combining form. As a consequence, proof by induction is replaced by
appeal to general equational laws that make up standard theory. This material is
also reviewed briefly in section 2. Proofs of the various equations were generated
using the simple automatic calculator described in Bird (1998); we include a selection
of them.
All programs in typewriter font are expressed in Hugs 1.3c (Jones, 1998), an
extension of Haskell that provides a more flexible typing discipline.
2 Preliminaries
Let us begin, not with Term, but with the simpler inductive datatype of binary trees
(which is equivalent to Term without the difficult Lam case):
data BinTree a = Leaf a | Fork (Pair (BinTree a))
type Pair a = (a,a)
By default, Haskell allows Leaf and Fork to be non-strict functions, so the declaration
above captures partial and infinite trees as well as finite ones. However,
although all functions defined in this paper are legal Haskell (extended with a more
general typing discipline), we are only concerned with datatypes that are flat sets,
and functions that are total in the set-theoretic sense. Thus, all functions are considered
to be strict, as in ML. This will enable us to state equational laws without
mentioning strictness conditions explicitly.
2.1 Functors
For each datatype constructor¹, there is a corresponding action on functions, which
preserves the shape of a data structure while replacing elements within it. The classic
example is the map function on lists, and functional programmers call these actions
mapping functions. For the type constructor Pair, the mapping function is
mapP :: (a -> b) -> Pair a -> Pair b
mapP f (x, y) = (f x, f y)
¹ We do not consider type constructors that include function types.
The mapping function on binary trees is:
mapB :: (a -> b) -> BinTree a -> BinTree b
mapB f (Leaf x) = (Leaf . f) x
mapB f (Fork p) = (Fork . mapP (mapB f)) p
The slightly unusual form of the right-hand sides is intended to suggest the function-
level equations
mapB f · Leaf = Leaf · f (1)
mapB f · Fork = Fork · mapP (mapB f) (2)
Category theorists refer to the combination of type constructor and map function
as a functor. Hence the following laws, satisfied by any mapping function, are called
functor laws:
mapB id = id (3)
mapB (f · g) = mapB f · mapB g (4)
A further property, called naturality, plays an important role in many calculations.
A polymorphic function f :: M a → N a, where M and N are given type constructors,
may be viewed as a collection of functions, one for each instantiation of the type
variable a. Because f is polymorphic, i.e. defined independently of a, these instances
are related by the following naturality condition:
mapN k · f = f · mapM k (5)
for all functions k, where mapM and mapN are the map functions for the type
constructors M and N, respectively. Such functions f are called natural transformations.
As one example, any function of type
flatten :: BinTree a -> [a]
is a natural transformation, with naturality property
map f · flatten = flatten · mapB f (6)
Similarly, the naturality of the BinTree constructors Leaf :: a → BinTree a and
Fork :: Pair (BinTree a) → BinTree a is expressed by equations (1) and (2), which
define the action mapB. Note that the action on functions corresponding to the
identity type constructor is the identity, and a composition of type constructors
corresponds to a composition of actions.
2.2 Folds
The second general operator generalises the foldr function on lists. For binary trees,
the operator is
foldB :: (a -> b) -> (Pair b -> b) -> BinTree a -> b
foldB l f (Leaf x) = l x
foldB l f (Fork p) = (f . mapP (foldB l f)) p
The fold operator takes a function argument for each constructor of the datatype.
Its action is to replace the constructors in its input with the corresponding functions.
Often the effect is to reduce the data structure to a summary value, as in the first
two of the following examples:
size = foldB (const 1) (uncurry (+))
height = foldB (const 0) maxp
  where maxp (x, y) = 1 + max x y
flatten = foldB wrap (uncurry (++))
  where wrap x = [x]
A fundamental property of all fold operators is that they produce the unique
function satisfying the above defining equations. From this follows a trio of useful
calculational laws. The simplest is the identity law, which for binary trees is
foldB Leaf Fork = id (7)
The other two laws are more powerful, and heavily used in calculations. The fusion
law states that
h · foldB l f = foldB l′ f′
  ⇐ h · l = l′
    h · f = f′ · mapP h (8)
The map-fusion law states that
foldB l f · mapB h = foldB (l · h) f (9)
An immediate consequence of map-fusion and the identity law is an alternative
definition of mapB as a fold:
mapB h = foldB (Leaf · h) Fork (10)
The map operator for each regular datatype may be defined as a fold in this way, but
this does not hold for nested datatypes.
Fusion laws, functor properties, and naturality conditions, are all we need for a
powerful generic equational theory of inductive datatypes. For further details, see
Bird and de Moor (1997).
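The laws of this section can be spot-checked on a concrete tree. The following sketch repeats the definitions of section 2 in directly runnable form and tests laws (6), (7), (9) and (10) on a single sample value; `sample` and `lawsHold` are hypothetical names, and one test case is of course an illustration rather than a proof.

```haskell
type Pair a = (a, a)

data BinTree a = Leaf a | Fork (Pair (BinTree a))
  deriving (Eq, Show)

mapP :: (a -> b) -> Pair a -> Pair b
mapP f (x, y) = (f x, f y)

mapB :: (a -> b) -> BinTree a -> BinTree b
mapB f (Leaf x) = Leaf (f x)
mapB f (Fork p) = Fork (mapP (mapB f) p)

foldB :: (a -> b) -> (Pair b -> b) -> BinTree a -> b
foldB l f (Leaf x) = l x
foldB l f (Fork p) = f (mapP (foldB l f) p)

size :: BinTree a -> Int
size = foldB (const 1) (uncurry (+))

flatten :: BinTree a -> [a]
flatten = foldB (\x -> [x]) (uncurry (++))

-- A sample tree (test data invented for this sketch).
sample :: BinTree Int
sample = Fork (Fork (Leaf 1, Leaf 2), Leaf 3)

-- Each conjunct instantiates one of the laws on the sample tree.
lawsHold :: Bool
lawsHold =
     foldB Leaf Fork sample == sample                          -- identity (7)
  && (size . mapB (* 2)) sample
       == foldB (const 1 . (* 2)) (uncurry (+)) sample         -- map-fusion (9)
  && mapB (+ 1) sample == foldB (Leaf . (+ 1)) Fork sample     -- map as a fold (10)
  && map (+ 1) (flatten sample) == flatten (mapB (+ 1) sample) -- naturality (6)
```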
2.3 Monads
Monad operations provide a useful way of structuring many programs. Functional
programmers are introduced to monads as a type constructor with a certain binding
operation. Category theorists use a function-level definition, which is also more
convenient for calculations. A monad is defined as a type constructor M with a
mapping function mapM and two operations
unit :: a → M a
join :: M (M a) → M a
These natural transformations are required to satisfy the following coherence laws:
join · mapM unit = id (11)
join · unit = id (12)
join · mapM join = join · join (13)
In total, there are seven laws available for reasoning about a monad: the three
coherence laws, the two naturality laws for unit and join, and the two functor laws
for mapM , the mapping function associated with M.
A standard example of a monad is the list type constructor, with unit returning
a singleton list, and concat as the join operation. Binary trees also form a monad,
with unit Leaf and the following join function:
joinB :: BinTree (BinTree a) -> BinTree a
joinB = foldB id Fork
As we will see, lambda terms also form a monad; the unit and join operations on
lambda terms will be needed in the definition of lambda abstraction and application.
See Bird (1998) for further discussion of monads and monad laws, and the different
ways one can describe them.
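As a small illustration of the binary tree monad, the sketch below defines joinB as in the text and checks the three coherence laws (11) to (13) on sample values. The names `tt`, `ttt` and `coherence` are hypothetical test data, not part of the paper.

```haskell
type Pair a = (a, a)

data BinTree a = Leaf a | Fork (Pair (BinTree a))
  deriving (Eq, Show)

mapP :: (a -> b) -> Pair a -> Pair b
mapP f (x, y) = (f x, f y)

mapB :: (a -> b) -> BinTree a -> BinTree b
mapB f (Leaf x) = Leaf (f x)
mapB f (Fork p) = Fork (mapP (mapB f) p)

foldB :: (a -> b) -> (Pair b -> b) -> BinTree a -> b
foldB l f (Leaf x) = l x
foldB l f (Fork p) = f (mapP (foldB l f) p)

-- join grafts each inner tree in place of the leaf that holds it.
joinB :: BinTree (BinTree a) -> BinTree a
joinB = foldB id Fork

-- A tree of trees, and a tree of trees of trees (invented test data).
tt :: BinTree (BinTree Int)
tt = Fork (Leaf (Fork (Leaf 1, Leaf 2)), Leaf (Leaf 3))

ttt :: BinTree (BinTree (BinTree Int))
ttt = Fork (Leaf tt, Leaf (Leaf (Leaf 4)))

-- The coherence laws, with unit = Leaf, instantiated on the samples.
coherence :: Bool
coherence =
     joinB (mapB Leaf t) == t                        -- law (11)
  && joinB (Leaf t) == t                             -- law (12)
  && joinB (mapB joinB ttt) == joinB (joinB ttt)     -- law (13)
  where t = joinB tt
```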
3 de Bruijn notation
We can follow the same steps with the type Term a of lambda terms over a type a:
data Term v = Var v | App (Pair (Term v)) | Lam (Term (Incr v))
data Incr v = Zero | Succ v
3.1 Maps
The first step is to identify the map operators for the newly introduced types. The
mapping function corresponding to Incr is straightforward:
mapI :: (a -> b) -> Incr a -> Incr b
mapI f Zero = Zero
mapI f (Succ x) = (Succ . f) x
As we might expect, Term is more interesting:
mapT :: (a -> b) -> Term a -> Term b
mapT f (Var x) = (Var . f) x
mapT f (App p) = (App . mapP (mapT f)) p
mapT f (Lam t) = (Lam . mapT (mapI f)) t
Note the change of argument of mapT in the Lam case: the required mapping
function for Term (Incr a) is mapT (mapI f). As a result, mapT leaves bound variables
unchanged, and replaces only free variables. In the nested definition, bound variables
have become part of the shape of a term.
Note also that the argument of mapT in the Lam case has a different
type, namely Incr a → Incr b, but this is an instance of the declared signature. The
definition of mapT makes use of polymorphic recursion; it is the first function in this
paper whose type signature cannot be omitted.
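The behaviour of mapT on bound and free variables can be seen in a short runnable sketch. The sample terms `t1` and `t2` and the `deriving` clauses are additions for illustration only.

```haskell
type Pair a = (a, a)

data Incr v = Zero | Succ v
  deriving (Eq, Show)

data Term v = Var v | App (Pair (Term v)) | Lam (Term (Incr v))
  deriving (Eq, Show)

mapP :: (a -> b) -> Pair a -> Pair b
mapP f (x, y) = (f x, f y)

mapI :: (a -> b) -> Incr a -> Incr b
mapI _ Zero     = Zero
mapI f (Succ x) = Succ (f x)

-- Polymorphic recursion: the recursive call in the Lam case is at type
-- (Incr a -> Incr b) -> Term (Incr a) -> Term (Incr b), so the signature
-- cannot be inferred and must be written.
mapT :: (a -> b) -> Term a -> Term b
mapT f (Var x) = Var (f x)
mapT f (App p) = App (mapP (mapT f) p)
mapT f (Lam t) = Lam (mapT (mapI f) t)

-- λx. x y, with the free variable 'y'.
t1 :: Term Char
t1 = Lam (App (Var Zero, Var (Succ 'y')))

-- Renaming the free variables leaves the bound variable Zero untouched.
t2 :: Term String
t2 = mapT (\c -> [c, c]) t1
```

Here `t2` should equal `Lam (App (Var Zero, Var (Succ "yy")))`.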
3.2 Folds
The definition of the fold function for Term follows from the principle of replacing
constructors by functions:
foldT v a l (Var x) = v x
foldT v a l (App p) = (a . mapP (foldT v a l)) p
foldT v a l (Lam t) = (l . foldT v a l) t
Unfortunately, the last line of this definition will not pass a standard Haskell
typechecker: if foldT v a l is applied to a term of type Term V for some type V, then
the foldT v a l on the right side is applied to a term of type Term (Incr V). Hence
the argument functions v, a and l must be applicable at a range of different types;
effectively, they must be polymorphic. Haskell’s language of types cannot express
this without an extension called rank-2 type signatures (McCracken, 1984). Such
signatures have been implemented in GHC and also in Hugs 1.3c (Peyton Jones
et al., 1998; Jones, 1998). In the syntax of Hugs 1.3c, foldT can be made acceptable
by adding the following type signature:
foldT :: (forall a. a -> n a) ->
         (forall a. Pair (n a) -> n a) ->
         (forall a. n (Incr a) -> n a) ->
         Term b -> n b
Here the variable n denotes an arbitrary type constructor.
As a consequence of the arguments being natural transformations, foldT v a l is a
natural transformation, with associated property
mapN k · foldT v a l = foldT v a l · mapT k (14)
The corresponding naturality law does not hold for folds over regular datatypes, such as binary trees
or lists, because the arguments of the fold are not required to be natural.
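A minimal sketch of foldT in GHC, assuming the `RankNTypes` extension as the modern counterpart of the Hugs 1.3c rank-2 signatures. The constant functor `Count` and the function `occurrences` are hypothetical examples, chosen to show an instance that is natural because it never inspects a variable.

```haskell
{-# LANGUAGE RankNTypes #-}

type Pair a = (a, a)
data Incr v = Zero | Succ v
data Term v = Var v | App (Pair (Term v)) | Lam (Term (Incr v))

mapP :: (a -> b) -> Pair a -> Pair b
mapP f (x, y) = (f x, f y)

-- The three function arguments must be polymorphic, because the recursive
-- call in the Lam case is at type Term (Incr b).
foldT :: (forall a. a -> n a)
      -> (forall a. Pair (n a) -> n a)
      -> (forall a. n (Incr a) -> n a)
      -> Term b -> n b
foldT v a l (Var x) = v x
foldT v a l (App p) = (a . mapP (foldT v a l)) p
foldT v a l (Lam t) = (l . foldT v a l) t

-- A constant type constructor wrapped as a datatype (a helper for this sketch).
newtype Count a = Count Int

-- Number of variable occurrences, bound or free.
occurrences :: Term b -> Int
occurrences t =
  case foldT (\_ -> Count 1) plus (\(Count n) -> Count n) t of
    Count n -> n
  where
    plus :: Pair (Count a) -> Count a
    plus (Count x, Count y) = Count (x + y)
```

The first argument has to ignore its input, since it must work uniformly at every type; this is the restriction that the naturality condition expresses.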
The above naturality condition implies that no instance of foldT can manipulate
the values of free variables. As a result, we cannot define all the functions we would
like on terms as instances of foldT . This phenomenon motivates a more general
definition of the fold operator on nested datatypes such as Term; we will call it
gfold for generalised fold:
gfoldT :: (forall a. m a -> n a) ->
          (forall a. Pair (n a) -> n a) ->
          (forall a. n (Incr a) -> n a) ->
          (forall a. Incr (m a) -> m (Incr a)) ->
          Term (m b) -> n b
gfoldT v a l k (Var x) = v x
gfoldT v a l k (App p) = (a . mapP (gfoldT v a l k)) p
gfoldT v a l k (Lam t) = (l . gfoldT v a l k . mapT k) t
The two additional ingredients in the definition of gfoldT are, firstly, that the
argument of v is generalised from a to m a for an arbitrary type constructor m and,
secondly, that an extra argument k is provided for the fold. To explain the role of
the extra function k, observe that a lambda term with variables drawn from m b has
type
Term (Incr (m b))
Applying mapT k to this lambda term produces an element of type
Term (m (Incr b))
Applying gfoldT v a l k to this element produces an element of type
n (Incr b)
This is the correct type for an argument of l. More details of generalised folds and
their properties may be found in a companion paper (Bird and Paterson, 1998).
The arguments to gfoldT are natural transformations, and the result is also a
natural transformation. Thus, if gfoldT v a l k :: Term (M b) → N b, we have
mapN h · gfoldT v a l k = gfoldT v a l k · mapT (mapM h)
for all h, where mapM and mapN are the mapping functions associated with the
type constructors M and N.
The advantage of the generalised fold resides in the extra degree of freedom for
selecting the type constructor m. In theory, we can take m = Id, the identity type
constructor, and so obtain
foldT v a l = gfoldT v a l id (15)
as a special case. Thus gfoldT generalises foldT . Another instance of gfoldT takes
both m and n to be constant type constructors, delivering specific types for all
arguments. However, type constructor polymorphism in Haskell is limited, in that
type constructor variables may only be instantiated to datatype constructors (possibly
partially applied). The alternative to expressing these special cases by installing Id
and Const as new datatype constructors is to define specialised versions of gfoldT .
For example, the following version corresponds to the constant type constructors
case:
kfoldT :: (a -> b) -> (Pair b -> b) -> (b -> b) ->
          (Incr a -> a) ->
          Term a -> b
kfoldT v a l k (Var x) = v x
kfoldT v a l k (App p) = (a . mapP (kfoldT v a l k)) p
kfoldT v a l k (Lam t) = (l . kfoldT v a l k . mapT k) t
Note that kfoldT has exactly the same definition as gfoldT , but a different (more
specific) type. For example, we can convert a lambda term to a string by
showT :: Term String -> String
showT = kfoldT id showP ('L':) showI
  where showP (x, y) = "(" ++ x ++ " " ++ y ++ ")"
        showI Zero = "0"
        showI (Succ x) = 'S':x
In particular, we can use showT to convert an element of type Term Char to a string
in which individual character variables are printed without their quotes:
showTC :: Term Char -> String
showTC = showT . mapT wrap
  where wrap x = [x]
For example, applying showTC to
Lam (App (Var Zero, App (Var (Succ 'x'), Var (Succ 'y'))))
produces the string L(0 (Sx Sy)).
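The printing functions can be assembled into a self-contained sketch as follows. It assumes a single space as the separator inside showP; that separator is a reading of the listing above, not something this sketch can confirm.

```haskell
type Pair a = (a, a)
data Incr v = Zero | Succ v
data Term v = Var v | App (Pair (Term v)) | Lam (Term (Incr v))

mapP :: (a -> b) -> Pair a -> Pair b
mapP f (x, y) = (f x, f y)

mapI :: (a -> b) -> Incr a -> Incr b
mapI _ Zero     = Zero
mapI f (Succ x) = Succ (f x)

mapT :: (a -> b) -> Term a -> Term b
mapT f (Var x) = Var (f x)
mapT f (App p) = App (mapP (mapT f) p)
mapT f (Lam t) = Lam (mapT (mapI f) t)

-- The fold with both type constructors taken to be constant. In the Lam
-- case, mapT k first collapses Term (Incr a) to Term a, so the recursive
-- call is at the same type and no rank-2 signature is needed.
kfoldT :: (a -> b) -> (Pair b -> b) -> (b -> b) -> (Incr a -> a) -> Term a -> b
kfoldT v a l k (Var x) = v x
kfoldT v a l k (App p) = (a . mapP (kfoldT v a l k)) p
kfoldT v a l k (Lam t) = (l . kfoldT v a l k . mapT k) t

showT :: Term String -> String
showT = kfoldT id showP ('L' :) showI
  where
    showP (x, y)   = "(" ++ x ++ " " ++ y ++ ")"  -- separator is an assumption
    showI Zero     = "0"
    showI (Succ x) = 'S' : x

showTC :: Term Char -> String
showTC = showT . mapT (\x -> [x])
```

With these definitions, `showTC (Lam (App (Var Zero, App (Var (Succ 'x'), Var (Succ 'y')))))` evaluates to `"L(0 (Sx Sy))"`, and the term λx.λy.x is printed as `"LLS0"`.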
The function gfoldT satisfies similar fusion laws to those discussed above for
binary trees. Such laws are proved from the fact that gfoldT is the unique function
satisfying its defining equation. (This can be established by induction over terms.)
In particular, the identity law states that
gfoldT Var App Lam id = id (16)
The map-fusion law states that
gfoldT v a l k · mapT h = gfoldT (v · h) a l k′
  ⇐ k · mapI h = h · k′
The fold-fusion law is the following: suppose we have the typing
gfoldT v a l k :: Term (M a) → N a
Then
h · gfoldT v a l k = gfoldT v′ a′ l′ (mapM k′ · k) (17)
  ⇐ h · v = v′
    h · a = a′ · mapP h
    h · l = l′ · h · mapN k′
The proof consists of simple calculations to show that h · gfoldT v a l k satisfies the
defining equations of gfoldT v′ a′ l′ (mapM k′ · k). The Lam case is the longest:
h · gfoldT v a l k · Lam
= {definition of gfoldT}
h · l · gfoldT v a l k · mapT k
= {assumption}
l′ · h · mapN k′ · gfoldT v a l k · mapT k
= {naturality}
l′ · h · gfoldT v a l k · mapT (mapM k′) · mapT k
= {functor}
l′ · h · gfoldT v a l k · mapT (mapM k′ · k)
3.3 A monad
The type constructor Term is also a monad, with Var as the unit operator, and
joinT defined by
joinT :: Term (Term a) -> Term a
joinT = gfoldT id App Lam distT
distT :: Incr (Term a) -> Term (Incr a)
distT Zero = Var Zero
distT (Succ x) = mapT Succ x
The function distT replaces Succs on terms by Succs on variables. It satisfies the
following properties², easily established by cases:
distT · mapI Var = Var (18)
distT · mapI joinT = joinT · mapT distT · distT (19)
Using these equations, and the fusion laws for gfoldT , we can prove the coherence
laws for the monad operations on Term:
joinT · Var = id (20)
joinT · mapT Var = id (21)
joinT · mapT joinT = joinT · joinT (22)
For example, we give the proof of equation (22):
joinT · mapT joinT
= {definition of joinT}
gfoldT id App Lam distT · mapT joinT
= {map fusion, by distribution law (19)}
gfoldT (id · joinT) App Lam (mapT distT · distT)
= {identity}
gfoldT (joinT · id) App Lam (mapT distT · distT)
= {fusion (backwards), since joinT · Lam = Lam · joinT · mapT distT}
joinT · gfoldT id App Lam distT
= {definition of joinT}
joinT · joinT
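The monad structure on Term can be exercised concretely. The sketch below assumes GHC with `RankNTypes`, restates gfoldT, distT and joinT, and tests the coherence laws (20) to (22) on sample values; `tt`, `ttt` and `coherenceT` are hypothetical names, and a test on samples complements the calculation above rather than replacing it.

```haskell
{-# LANGUAGE RankNTypes #-}

type Pair a = (a, a)
data Incr v = Zero | Succ v deriving (Eq, Show)
data Term v = Var v | App (Pair (Term v)) | Lam (Term (Incr v))
  deriving (Eq, Show)

mapP :: (a -> b) -> Pair a -> Pair b
mapP f (x, y) = (f x, f y)

mapI :: (a -> b) -> Incr a -> Incr b
mapI _ Zero     = Zero
mapI f (Succ x) = Succ (f x)

mapT :: (a -> b) -> Term a -> Term b
mapT f (Var x) = Var (f x)
mapT f (App p) = App (mapP (mapT f) p)
mapT f (Lam t) = Lam (mapT (mapI f) t)

gfoldT :: (forall a. m a -> n a)
       -> (forall a. Pair (n a) -> n a)
       -> (forall a. n (Incr a) -> n a)
       -> (forall a. Incr (m a) -> m (Incr a))
       -> Term (m b) -> n b
gfoldT v a l k (Var x) = v x
gfoldT v a l k (App p) = (a . mapP (gfoldT v a l k)) p
gfoldT v a l k (Lam t) = (l . gfoldT v a l k . mapT k) t

-- Push a Succ on a whole term down to its variables.
distT :: Incr (Term a) -> Term (Incr a)
distT Zero     = Var Zero
distT (Succ x) = mapT Succ x

joinT :: Term (Term a) -> Term a
joinT = gfoldT id App Lam distT

-- A term whose free variables are themselves terms (invented test data).
tt :: Term (Term Char)
tt = Lam (App (Var Zero, Var (Succ (Lam (App (Var Zero, Var (Succ 'z')))))))

ttt :: Term (Term (Term Char))
ttt = App (Var tt, Lam (Var (Succ tt)))

coherenceT :: Bool
coherenceT =
     joinT (Var t) == t                              -- law (20)
  && joinT (mapT Var t) == t                         -- law (21)
  && joinT (mapT joinT ttt) == joinT (joinT ttt)     -- law (22)
  where t = joinT tt
```

Note how joinT renumbers the inner free variable: `joinT tt` is `Lam (App (Var Zero, Lam (App (Var Zero, Var (Succ (Succ 'z'))))))`, with 'z' now under two lambdas.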
4 Abstraction and application
It is time now to return to the main problem in hand, namely, to give the implementations
of abstraction and application.
Abstracting with respect to a free variable x is easy: each occurrence of x in a
term is replaced by Zero, and each occurrence of a variable y ≠ x is replaced by
Succ y. This is implemented by:
abstract :: Eq a => a -> Term a -> Term a
abstract x = Lam . mapT (match x)
² These equations are part of the statement that distT is a distributive law (Barr and Wells, 1984)
between the monads on Term and Incr.
match :: Eq a => a -> a -> Incr a
match x y = if x == y then Zero else Succ y
The definition of application is also quite short. We define application as a function
that takes a term t and the body b of a lambda abstraction, and replaces every
occurrence of Zero (the nameless variable bound by the abstraction) in b by t:
apply :: Term a -> Term (Incr a) -> Term a
apply t = joinT . mapT (subst t . mapI Var)
The function mapT (subst t · mapI Var) returns an element of Term (Term a), a term
of terms. The function joinT ‘flattens’ such elements into ordinary terms.
The actual substitution is done by the function subst t, a left inverse of match t:
subst :: a -> Incr a -> a
subst x Zero = x
subst x (Succ y) = y
Note that the type of subst implies the following ‘free theorem’ (Wadler, 1989):
f · subst x = subst (f x) · mapI f (23)
To check this definition of apply, let us prove that substituting an abstracted
variable returns the original term:
apply (Var x) · mapT (match x)
= {definitions}
joinT · mapT (subst (Var x) · mapI Var) · mapT (match x)
= {law (23)}
joinT · mapT (Var · subst x) · mapT (match x)
= {functor}
joinT · mapT (Var · subst x · match x)
= {functor, monad law (21)}
mapT (subst x · match x)
= {definitions of subst, match}
mapT id
= {functor}
id
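Putting the pieces together, the following sketch (GHC with `RankNTypes` assumed) collects abstract and apply with their supporting definitions and adds a one-step reduction function. The name `beta` and the sample terms in the usage note are hypothetical and are not taken from the paper.

```haskell
{-# LANGUAGE RankNTypes #-}

type Pair a = (a, a)
data Incr v = Zero | Succ v deriving (Eq, Show)
data Term v = Var v | App (Pair (Term v)) | Lam (Term (Incr v))
  deriving (Eq, Show)

mapP :: (a -> b) -> Pair a -> Pair b
mapP f (x, y) = (f x, f y)

mapI :: (a -> b) -> Incr a -> Incr b
mapI _ Zero     = Zero
mapI f (Succ x) = Succ (f x)

mapT :: (a -> b) -> Term a -> Term b
mapT f (Var x) = Var (f x)
mapT f (App p) = App (mapP (mapT f) p)
mapT f (Lam t) = Lam (mapT (mapI f) t)

gfoldT :: (forall a. m a -> n a) -> (forall a. Pair (n a) -> n a)
       -> (forall a. n (Incr a) -> n a)
       -> (forall a. Incr (m a) -> m (Incr a))
       -> Term (m b) -> n b
gfoldT v a l k (Var x) = v x
gfoldT v a l k (App p) = (a . mapP (gfoldT v a l k)) p
gfoldT v a l k (Lam t) = (l . gfoldT v a l k . mapT k) t

distT :: Incr (Term a) -> Term (Incr a)
distT Zero     = Var Zero
distT (Succ x) = mapT Succ x

joinT :: Term (Term a) -> Term a
joinT = gfoldT id App Lam distT

match :: Eq a => a -> a -> Incr a
match x y = if x == y then Zero else Succ y

abstract :: Eq a => a -> Term a -> Term a
abstract x = Lam . mapT (match x)

subst :: a -> Incr a -> a
subst x Zero     = x
subst _ (Succ y) = y

apply :: Term a -> Term (Incr a) -> Term a
apply t = joinT . mapT (subst t . mapI Var)

-- One beta step at the root, if the term is a redex (helper for this sketch).
beta :: Term a -> Maybe (Term a)
beta (App (Lam b, t)) = Just (apply t b)
beta _                = Nothing
```

For instance, `beta (App (abstract 'x' (App (Var 'x', Var 'y')), Var 'z'))` gives `Just (App (Var 'z', Var 'y'))`. Applying λx.λy.x to the free variable y gives `Lam (Var (Succ 'y'))`: the argument stays free, so no capture occurs.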
5 An extension of de Bruijn’s notation
Substitution on de Bruijn terms transforms arguments as well as function bodies,
thus precluding sharing. Consider the example term from section 1, with the variables
rewritten in unary notation:
λ.0 (λ.S0 0 (λ.SS0 S0 0))
If this term is applied to the term λ.0 S0, the result is
(λ.0 S0) (λ.(λ.0 SS0) 0 (λ.(λ.0 SSS0) S0 0))
where the three versions of the argument are λ.0 S0, λ.0 SS0 and λ.0 SSS0. There is a generalisation
of de Bruijn notation in which S can be applied to any term, not just a variable
(Paterson, 1991). Its effect is to escape the scope of the matching λ. With this looser
representation of terms, one can avoid transforming arguments while substituting.
In the above example, substitution yields
(λ.0 S0) (λ.S(λ.0 S0) 0 (λ.SS(λ.0 S0) S0 0))
In effect, we have postponed pushing the S’s down to the variables.
We still require that each S or 0 have a matching lambda. This constraint is
captured by the following definition:
data TermE a = VarE a
             | AppE (Pair (TermE a))
             | LamE (TermE (Incr (TermE a)))
Note that TermE is doubly nested. A similar definition can be used to model
quasiquotation (literal data with an escape operator) as in Scheme (Clinger and
Rees, 1991) or multi-stage programming languages like MetaML (Taha and Sheard,
1997).
Though TermE is more complex, we can follow the same steps as for BinTree
and Term. The mapping function for TermE is given by:
mapE :: (a -> b) -> TermE a -> TermE b
mapE f (VarE x) = (VarE . f) x
mapE f (AppE p) = (AppE . mapP (mapE f)) p
mapE f (LamE t) = (LamE . mapE (mapI (mapE f))) t
The generalised fold operator is
gfoldE :: (forall a. m a -> n a) ->
          (forall a. Pair (n a) -> n a) ->
          (forall a. n (Incr (n a)) -> n a) ->
          (forall a. Incr a -> m (Incr a)) ->
          TermE (m b) -> n b
gfoldE v a l k (VarE x) = v x
gfoldE v a l k (AppE p) = (a . mapP (gfoldE v a l k)) p
gfoldE v a l k (LamE t) = (l . gfoldE v a l k .
                           mapE (k . mapI (gfoldE v a l k))) t
Note the change in type for the last argument k: a lambda abstraction for extended
terms with variables of type m b has type
TermE (Incr (TermE (m b)))
Applying mapE (mapI (gfoldE v a l k)) to a value of this type produces an element of
type
TermE (Incr (n b))
Applying mapE k to this element produces an element of type
TermE (m (Incr (n b)))
A second recursive application of gfoldE v a l k now produces an element of the type
required by l, namely,
n (Incr (n b))
The identity law for extended terms is
gfoldE′ VarE AppE LamE id = id (24)
The map-fusion law is
gfoldE v a l (h · k) · mapE h = gfoldE (v · h) a l k (25)
The fusion law for gfoldE v a l k :: TermE (M a) → N a is
h · gfoldE v a l k = gfoldE v′ a′ l′ (mapM k′ · k) (26)
  ⇐ h · v = v′
    h · a = a′ · mapP h
    h · l = l′ · h · mapN (k′ · mapI h)
Extended terms also comprise a monad, with unit VarE and join operator defined
by:
joinE :: TermE (TermE a) -> TermE a
joinE = gfoldE id AppE LamE VarE
Verification of the monad laws is straightforward. For example, we will prove that
joinE · mapE joinE = joinE · joinE (27)
We have
joinE · mapE joinE
= {definition of joinE}
gfoldE id AppE LamE VarE · mapE joinE
= {definition of joinE}
gfoldE id AppE LamE (joinE · VarE · VarE) · mapE joinE
= {map fusion}
gfoldE joinE AppE LamE (VarE · VarE)
= {naturality of VarE}
gfoldE joinE AppE LamE (mapE VarE · VarE)
= {fusion (backwards)}
joinE · gfoldE id AppE LamE VarE
= {definition of joinE}
joinE · joinE
With the definitions above, we can define abstraction and application:
abstractE :: Eq a => a -> TermE a -> TermE a
abstractE x = LamE . mapE (mapI VarE . match x)
applyE :: TermE a -> TermE (Incr (TermE a)) -> TermE a
applyE t = joinE . mapE (subst t)
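The extended representation can be tried out in the same way. This sketch assumes GHC with `RankNTypes` and restates the definitions of this section; the sample body and argument named in the usage note are hypothetical. It illustrates the point made at the start of the section: substitution places the argument under S without transforming it.

```haskell
{-# LANGUAGE RankNTypes #-}

type Pair a = (a, a)
data Incr v = Zero | Succ v deriving (Eq, Show)
data TermE a = VarE a
             | AppE (Pair (TermE a))
             | LamE (TermE (Incr (TermE a)))
  deriving (Eq, Show)

mapP :: (a -> b) -> Pair a -> Pair b
mapP f (x, y) = (f x, f y)

mapI :: (a -> b) -> Incr a -> Incr b
mapI _ Zero     = Zero
mapI f (Succ x) = Succ (f x)

mapE :: (a -> b) -> TermE a -> TermE b
mapE f (VarE x) = VarE (f x)
mapE f (AppE p) = AppE (mapP (mapE f) p)
mapE f (LamE t) = LamE (mapE (mapI (mapE f)) t)

gfoldE :: (forall a. m a -> n a)
       -> (forall a. Pair (n a) -> n a)
       -> (forall a. n (Incr (n a)) -> n a)
       -> (forall a. Incr a -> m (Incr a))
       -> TermE (m b) -> n b
gfoldE v a l k (VarE x) = v x
gfoldE v a l k (AppE p) = (a . mapP (gfoldE v a l k)) p
gfoldE v a l k (LamE t) =
  (l . gfoldE v a l k . mapE (k . mapI (gfoldE v a l k))) t

joinE :: TermE (TermE a) -> TermE a
joinE = gfoldE id AppE LamE VarE

match :: Eq a => a -> a -> Incr a
match x y = if x == y then Zero else Succ y

subst :: a -> Incr a -> a
subst x Zero     = x
subst _ (Succ y) = y

abstractE :: Eq a => a -> TermE a -> TermE a
abstractE x = LamE . mapE (mapI VarE . match x)

applyE :: TermE a -> TermE (Incr (TermE a)) -> TermE a
applyE t = joinE . mapE (subst t)
```

Taking the body of λx.λy.x, namely `LamE (VarE (Succ (VarE Zero)))`, and applying `applyE t` to it yields `LamE (VarE (Succ t))`: the argument `t` appears unchanged beneath a single `Succ`.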
Finally, let us see how to convert extended terms into ordinary ones. We want a
function
cvtE :: TermE a -> Term a
We will define cvtE as an instance of gfoldE. Typing considerations dictate that
m = Id and n = Term in the type assignment for gfoldE. Once again Haskell forces
us to define a variant gfoldE′, whose definition is the same as that of gfoldE, but
with m specialised to Id. We define
cvtE = gfoldE' Var App (Lam . joinT . mapT distT) id
To check this definition, we can show that cvtE is a monad morphism, that is, it
satisfies the equations:
cvtE · VarE = Var (28)
cvtE · joinE = joinT · mapT cvtE · cvtE (29)
The first is immediate from the definition, and the second is an appeal to fusion:
cvtE · joinE
= {definition of joinE}
cvtE · gfoldE id AppE LamE VarE
= {fusion}
gfoldE cvtE App (Lam · joinT · mapT distT) (mapE id · VarE)
= {identity}
gfoldE cvtE App (Lam · joinT · mapT distT) VarE
= {map fusion (backwards)}
gfoldE id App (Lam · joinT · mapT distT) (cvtE · VarE) · mapE cvtE
= {definition of cvtE}
gfoldE id App (Lam · joinT · mapT distT) Var · mapE cvtE
= {identity}
gfoldE id App (Lam · joinT · mapT distT) (Var · id) · mapE cvtE
= {fusion (backwards)}
joinT · gfoldE′ Var App (Lam · joinT · mapT distT) id · mapE cvtE
= {definition of cvtE}
joinT · cvtE · mapE cvtE
= {naturality of cvtE}
joinT · mapT cvtE · cvtE
This equation is used in the proof that substitution on extended terms correctly
mirrors substitution on de Bruijn terms:
cvtE · applyE t = apply (cvtE t) · cvtBodyE (30)
where cvtBodyE converts an extended abstraction body to a simple one:
cvtBodyE :: TermE (Incr (TermE a)) -> Term (Incr a)
cvtBodyE = joinT . mapT distT . cvtE . mapE (mapI cvtE)
The proof is lengthy but routine, and we omit it.
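The conversion can be sketched in runnable form as follows, assuming GHC with `RankNTypes`. The variant gfoldE' is written out with m specialised to the identity, as described above; the sample term `ext` is a hypothetical example standing for λ.S(λ.0 z), where the inner abstraction is shifted as a whole.

```haskell
{-# LANGUAGE RankNTypes #-}

type Pair a = (a, a)
data Incr v = Zero | Succ v deriving (Eq, Show)
data Term v = Var v | App (Pair (Term v)) | Lam (Term (Incr v))
  deriving (Eq, Show)
data TermE a = VarE a | AppE (Pair (TermE a)) | LamE (TermE (Incr (TermE a)))
  deriving (Eq, Show)

mapP :: (a -> b) -> Pair a -> Pair b
mapP f (x, y) = (f x, f y)

mapI :: (a -> b) -> Incr a -> Incr b
mapI _ Zero     = Zero
mapI f (Succ x) = Succ (f x)

mapT :: (a -> b) -> Term a -> Term b
mapT f (Var x) = Var (f x)
mapT f (App p) = App (mapP (mapT f) p)
mapT f (Lam t) = Lam (mapT (mapI f) t)

mapE :: (a -> b) -> TermE a -> TermE b
mapE f (VarE x) = VarE (f x)
mapE f (AppE p) = AppE (mapP (mapE f) p)
mapE f (LamE t) = LamE (mapE (mapI (mapE f)) t)

gfoldT :: (forall a. m a -> n a) -> (forall a. Pair (n a) -> n a)
       -> (forall a. n (Incr a) -> n a)
       -> (forall a. Incr (m a) -> m (Incr a))
       -> Term (m b) -> n b
gfoldT v a l k (Var x) = v x
gfoldT v a l k (App p) = (a . mapP (gfoldT v a l k)) p
gfoldT v a l k (Lam t) = (l . gfoldT v a l k . mapT k) t

distT :: Incr (Term a) -> Term (Incr a)
distT Zero     = Var Zero
distT (Succ x) = mapT Succ x

joinT :: Term (Term a) -> Term a
joinT = gfoldT id App Lam distT

-- gfoldE with the type constructor m specialised to the identity.
gfoldE' :: (forall a. a -> n a) -> (forall a. Pair (n a) -> n a)
        -> (forall a. n (Incr (n a)) -> n a)
        -> (forall a. Incr a -> Incr a)
        -> TermE b -> n b
gfoldE' v a l k (VarE x) = v x
gfoldE' v a l k (AppE p) = (a . mapP (gfoldE' v a l k)) p
gfoldE' v a l k (LamE t) =
  (l . gfoldE' v a l k . mapE (k . mapI (gfoldE' v a l k))) t

cvtE :: TermE a -> Term a
cvtE = gfoldE' Var App (Lam . joinT . mapT distT) id

-- λ.S(λ.0 z): an abstraction whose body is a shifted inner abstraction.
ext :: TermE Char
ext = LamE (VarE (Succ (LamE (AppE (VarE Zero, VarE (Succ (VarE 'z')))))))
```

Converting `ext` pushes the outer S down to the free variable, so `cvtE ext` is `Lam (Lam (App (Var Zero, Var (Succ (Succ 'z')))))`.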
6 Conclusion
Our representation of de Bruijn terms illustrates the ability of nested datatypes to
express constraints on data structures, so that they can be enforced by the type
checker. It has also served as a test case for the extension to nested datatypes of
structuring principles developed for regular datatypes, using maps, folds and monads.
In the case of de Bruijn terms, these operators do most of the work, including
handling bound variables, so that the definition of application and abstraction is
particularly simple. Moreover, with programs structured in this way most proofs
are mechanical, and were indeed generated using the simple automatic calculator
described in Bird (1998).
Programs that manipulate nested types require a number of recently explored
extensions of the Hindley-Milner type system. The limited form of type constructor
polymorphism provided by Haskell has been an occasional hindrance, forcing us
to define specialised versions of polymorphic functions, or new datatypes that are
equivalent to existing types; in both cases an opportunity for reuse is lost. It might be
reasonable to design a language in which these restrictions were lifted, at the cost of
explicit abstraction and instantiation with respect to type constructors, but not types.
Acknowledgements
Oege de Moor suggested using a nested datatype for lambda terms. An anonymous
referee suggested a number of improvements.
References
Barr, M. and Wells, C. (1984) Toposes, triples and theories. Grundlehren der Mathematischen
Wissenschaften, no. 278. Springer-Verlag.
Bird, R. and de Moor, O. (1997) The Algebra of Programming. Prentice Hall.
Bird, R. (1998) Introduction to Functional Programming using Haskell. Prentice Hall.
Bird, R. and Paterson, R. (1998) Generalised folds for nested datatypes. Submitted.
Bird, R. S. and Meertens, L. (1998) Nested datatypes. Mathematics of Program Construction:
Lecture Notes in Computer Science 1422, pp. 52–67. Springer-Verlag.
Clinger, W. and Rees, J. (1991) Revised⁴ report on the algorithmic language Scheme. ACM
Lisp Pointers, IV(July–September).
de Bruijn, N. G. (1972) Lambda calculus notation with nameless dummies. Indagationes
Mathematicae, 34, 381–392.
Jones, M. (1998) A technical summary of the new features in Hugs 1.3c. Unpublished.
McCracken, N. J. (1984) The typechecking of programs with implicit type structure. Semantics
of Data Types: Lecture Notes in Computer Science 173, pp. 301–315. Springer-Verlag.
Paterson, R. A. (1991) Non-deterministic lambda-calculus: A core for integrated languages.
Declarative programming, Sassbachwalden. Springer-Verlag.
Paulson, L. C. (1996) ML for the Working Programmer (2nd edn). Cambridge University
Press.
Peyton Jones, S. et al. (1998) The Glasgow Haskell Compiler. Department of Computer
Science, University of Glasgow.
Taha, W. and Sheard, T. (1997) Multi-stage programming with explicit annotations. ACM
Symposium on Partial Evaluation and Semantics-based Program Manipulation, pp. 203–217.
Wadler, P. (1989) Theorems for free! 4th Conference on Functional Programming Languages
and Computer Architecture, pp. 347–359. IFIP.