J. Functional Programming 9 (4): 355–372, July 1999. Printed in the United Kingdom
© 1999 Cambridge University Press
A tutorial on the universality and
expressiveness of fold
GRAHAM HUTTON
University of Nottingham, Nottingham, UK
http://www.cs.nott.ac.uk/~gmh
Abstract
In functional programming, fold is a standard operator that encapsulates a simple pattern of
recursion for processing lists. This article is a tutorial on two key aspects of the fold operator
for lists. First of all, we emphasize the use of the universal property of fold both as a proof
principle that avoids the need for inductive proofs, and as a definition principle that guides
the transformation of recursive functions into definitions using fold. Secondly, we show that
even though the pattern of recursion encapsulated by fold is simple, in a language with tuples
and functions as first-class values the fold operator has greater expressive power than might
first be expected.
Capsule Review
Within the last ten to fifteen years, the algebra of datatypes has become a stable and well
understood element of the mathematics of program construction. Graham Hutton’s paper is
a highly readable, elementary introduction to the algebra centred on the well-known fold function
on lists. The paper distinguishes itself by focusing on how the properties are used for the
crucial task of ‘constructing’ programs, rather than on the post hoc verification of existing
programs. Several well-chosen examples are given, beginning at an elementary level and
progressing to more advanced applications. The paper concludes with a good overview and
bibliography of recent literature which develops the theory and its applications in more
depth.
1 Introduction
Many programs that involve repetition are naturally expressed using some form of
recursion, and properties proved of such programs using some form of induction.
Indeed, in the functional approach to programming, recursion and induction are the
primary tools for defining and proving properties of programs.
Not surprisingly, many recursive programs will share a common pattern of recur-
sion, and many inductive proofs will share a common pattern of induction. Repeating
the same patterns again and again is tedious, time consuming, and prone to error.
Such repetition can be avoided by introducing special recursion operators and proof
principles that encapsulate the common patterns, allowing us to concentrate on the
parts that are different for each application.
In functional programming, fold (also known as foldr) is a standard recursion
operator that encapsulates a common pattern of recursion for processing lists.
The fold operator comes equipped with a proof principle called universality, which
encapsulates a common pattern of inductive proof concerning lists. Fold and its
universal property together form the basis of a simple but powerful calculational
theory of programs that process lists. This theory generalises from lists to a variety
of other datatypes, but for simplicity we restrict our attention to lists.
This article is a tutorial on two key aspects of the fold operator for lists. First of
all, we emphasize the use of the universal property of fold (together with the derived
fusion property) both as proof principles that avoid the need for inductive proofs,
and as definition principles that guide the transformation of recursive functions into
definitions using fold. Secondly, we show that even though the pattern of recursion
encapsulated by fold is simple, in a language with tuples and functions as first-class
values the fold operator has greater expressive power than might first be expected,
thus permitting the powerful universal and fusion properties of fold to be applied
to a larger class of programs. The article concludes with a survey of other work on
recursion operators that we do not have space to pursue here.
The article is aimed at a reader who is familiar with the basics of functional
programming, say to the level of Bird and Wadler (1988) and Bird (1998). All
programs in the article are written in Haskell (Peterson et al., 1997), the standard
lazy functional programming language. However, no special features of Haskell are
used, and the ideas can easily be adapted to other functional languages.
2 The fold operator
The fold operator has its origins in recursion theory (Kleene, 1952), while the use
of fold as a central concept in a programming language dates back to the reduction
operator of APL (Iverson, 1962), and later to the insertion operator of FP (Backus,
1978). In Haskell, the fold operator for lists can be defined as follows:
fold :: (α → β → β) → β → ([α] → β)
fold f v [ ] = v
fold f v (x : xs) = f x (fold f v xs)
That is, given a function f of type α → β → β and a value v of type β, the function
fold f v processes a list of type [α] to give a value of type β by replacing the nil
constructor [ ] at the end of the list by the value v, and each cons constructor (:)
within the list by the function f. In this manner, the fold operator encapsulates a
simple pattern of recursion for processing lists, in which the two constructors for lists
are simply replaced by other values and functions. A number of familiar functions
on lists have a simple definition using fold. For example:
sum :: [Int] → Int
sum = fold (+) 0

product :: [Int] → Int
product = fold (×) 1

and :: [Bool] → Bool
and = fold (∧) True

or :: [Bool] → Bool
or = fold (∨) False
Recall that enclosing an infix operator ⊕ in parentheses (⊕) converts the operator
into a prefix function. This notational device, called sectioning, is often useful when
defining simple functions using fold. If required, one of the arguments to the operator
can also be enclosed in the parentheses. For example, the function (++) that appends
two lists to give a single list can be defined as follows:

(++) :: [α] → [α] → [α]
(++ ys) = fold (:) ys
In all our examples so far, the constructor (:) is replaced by a built-in function.
However, in most applications of fold the constructor (:) will be replaced by a
user-defined function, often defined as a nameless function using the λ notation, as
in the following definitions of standard list-processing functions:
length :: [α] → Int
length = fold (λx n → 1 + n) 0

reverse :: [α] → [α]
reverse = fold (λx xs → xs ++ [x]) [ ]

map :: (α → β) → ([α] → [β])
map f = fold (λx xs → f x : xs) [ ]

filter :: (α → Bool) → ([α] → [α])
filter p = fold (λx xs → if p x then x : xs else xs) [ ]
Programs written using fold can be less readable than programs written using
explicit recursion, but can be constructed in a systematic manner, and are better
suited to transformation and proof. For example, we will see later on in the article
how the above definition for map using fold can be constructed from the standard
definition using explicit recursion, and more importantly, how the definition using
fold simplifies the process of proving properties of the map function.
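For readers who wish to experiment, the definitions above can be transcribed into plain ASCII Haskell, with the Prelude's foldr playing the role of fold. The following sketch (the hiding clause and the sample evaluation are ours, not part of the paper) can be loaded directly into a Haskell interpreter:

-- A minimal sketch of the definitions in this section in plain Haskell,
-- where the paper's fold is the Prelude's foldr.
import Prelude hiding (sum, product, length, reverse, map, filter)

sum :: [Int] -> Int
sum = foldr (+) 0

product :: [Int] -> Int
product = foldr (*) 1

length :: [a] -> Int
length = foldr (\_ n -> 1 + n) 0

reverse :: [a] -> [a]
reverse = foldr (\x xs -> xs ++ [x]) []

map :: (a -> b) -> [a] -> [b]
map f = foldr (\x xs -> f x : xs) []

filter :: (a -> Bool) -> [a] -> [a]
filter p = foldr (\x xs -> if p x then x : xs else xs) []

-- ghci> sum [1,2,3]    -- foldr replaces (:) by (+) and [] by 0
-- 6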
3 The universal property of fold
As with the fold operator itself, the universal property of fold also has its origins
in recursion theory. The first systematic use of the universal property in functional
programming was by Malcolm (1990a), in his generalisation of Bird and Meertens'
theory of lists (Bird, 1989; Meertens, 1983) to arbitrary regular datatypes. For
finite lists, the universal property of fold can be stated as the following equivalence
between two definitions for a function g that processes lists:
g [ ] = v
g (x : xs) = f x (g xs)

⇔

g = fold f v

In the right-to-left direction, substituting g = fold f v into the two equations for g
gives the recursive definition for fold. Conversely, in the left-to-right direction the
two equations for g are precisely the assumptions required to show that g = fold f v
using a simple proof by induction on finite lists (Bird, 1998). Taken as a whole,
the universal property states that for finite lists the function fold f v is not just a
solution to its defining equations, but in fact the unique solution.
The key to the utility of the universal property is that it makes explicit the two
assumptions required for a certain pattern of inductive proof. For specific cases then,
by verifying the two assumptions (which can typically be done without the need for
induction) we can then appeal to the universal property to complete the inductive
proof that g = fold f v. In this manner, the universal property of fold encapsulates
a simple pattern of inductive proof concerning lists, just as the fold operator itself
encapsulates a simple pattern of recursion for processing lists.
The universal property of fold can be generalised to handle partial and infinite
lists (Bird, 1998), but for simplicity we only consider finite lists in this article.
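As an aside, one instance of the universal property can be checked mechanically with QuickCheck (a sanity check on randomly generated finite lists, not a proof); the choices of f, v and the property name below are illustrative assumptions rather than part of the paper:

-- A sketch: for this particular f and v, the function g defined directly
-- by the two equations agrees with foldr f v on finite lists.
import Test.QuickCheck

f :: Int -> Int -> Int
f = (+)            -- an illustrative choice of f

v :: Int
v = 0              -- an illustrative choice of v

g :: [Int] -> Int  -- defined directly by the two equations
g []       = v
g (x : xs) = f x (g xs)

prop_universal :: [Int] -> Bool
prop_universal xs = g xs == foldr f v xs

-- ghci> quickCheck prop_universal
-- +++ OK, passed 100 tests.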
3.1 Universality as a proof principle
The primary application of the universal property of fold is as a proof principle
that avoids the need for inductive proofs. As a simple first example, consider the
following equation between functions that process a list of numbers:
(+1) · sum = fold (+) 1
The left-hand function sums a list and then increments the result. The right-hand
function processes a list by replacing each (:) by the addition function (+) and the
empty list [ ] by the constant 1. The equation asserts that these two functions always
give the same result when applied to the same list.
To prove the above equation, we begin by observing that it matches the right-hand
side g = fold f v of the universal property of fold, with g = (+1) · sum, f = (+),
and v = 1. Hence, by appealing to the universal property, we conclude that the
equation to be proved is equivalent to the following two equations:
((+1) · sum) [ ] = 1
((+1) · sum) (x : xs) = (+) x (((+1) · sum) xs)
At first sight, these may seem more complicated than the original equation. However,
simplifying using the definitions of composition and sectioning gives
sum [ ] + 1 = 1
sum (x : xs) + 1 = x + (sum xs + 1)

which can now be verified by two simple calculations:

sum [ ] + 1
= { Definition of sum }
0 + 1
= { Arithmetic }
1

sum (x : xs) + 1
= { Definition of sum }
(x + sum xs) + 1
= { Arithmetic }
x + (sum xs + 1)
This completes the proof. Normally this proof would have required an explicit use of
induction. However, in the above proof the use of induction has been encapsulated
in the universal property of fold, with the result that the proof is reduced to a
simplification step followed by two simple calculations.
In general, any two functions on lists that can be proved equal by induction can
also be proved equal using the universal property of the fold operator, provided, of
course, that the functions can be expressed using fold. The expressive power of the
fold operator will be addressed later on in the article.
3.2 The fusion property of fold
Now let us generalise from the sum example and consider the following equation
between functions that process a list of values:
h · fold g w = fold f v
This pattern of equation occurs frequently when reasoning about programs written
using fold. It is not true in general, but we can use the universal property of fold
to calculate conditions under which the equation will indeed be true. The equation
matches the right-hand side of the universal property, from which we conclude that
the equation is equivalent to the following two equations:
(h · fold g w) [ ] = v
(h · fold g w) (x : xs) = f x ((h · fold g w) xs)

Simplifying using the definition of composition gives

h (fold g w [ ]) = v
h (fold g w (x : xs)) = f x (h (fold g w xs))
which can now be further simplified by two calculations:

h (fold g w [ ]) = v
⇔    { Definition of fold }
h w = v

and

h (fold g w (x : xs)) = f x (h (fold g w xs))
⇔    { Definition of fold }
h (g x (fold g w xs)) = f x (h (fold g w xs))
⇐    { Generalising (fold g w xs) to a fresh variable y }
h (g x y) = f x (h y)
That is, using the universal property of fold we have calculated without an explicit
use of induction two simple conditions that are together sufficient to ensure for
all finite lists that the composition of an arbitrary function and a fold can be fused
together to give a single fold. Following this interpretation, this property is called
the fusion property of the fold operator, and can be stated as follows:
h w = v
h (g x y) = f x (h y)

⇒

h · fold g w = fold f v
The first systematic use of the fusion property in functional programming was again
by Malcolm (1990a), generalising earlier work by Bird (1989) and Meertens (1983).
As with the universal property, the primary application of the fusion property is
as a proof principle that avoids the need for inductive proofs. In fact, for many
practical examples the fusion property is often preferable to the universal property.
As a simple first example, consider again the equation:
(+1) · sum = fold (+) 1
In the previous section this equation was proved using the universal property of
fold. However, the proof is simpler using the fusion property. First, we replace the
function sum by its definition using fold given earlier:
(+1) · fold (+) 0 = fold (+) 1
The equation now matches the conclusion of the fusion property, from which we
conclude that the equation follows from the following two assumptions:
(+1) 0 = 1
(+1) ((+) x y) = (+) x ((+1) y)
Simplifying these equations using the definition of sectioning gives 0 + 1 = 1 and
(x + y) + 1 = x + (y + 1), which are true by simple properties of arithmetic. More
generally, by replacing the use of addition in this example by an arbitrary infix
operator ⊕ that is associative, a simple application of fusion shows that:

(⊕ a) · fold (⊕) b = fold (⊕) (b ⊕ a)
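This corollary can be checked mechanically for the instance ⊕ = (+) using QuickCheck; the property name below is an illustrative assumption, and the Prelude's foldr plays the role of fold:

-- A sketch: doubling up the fusion corollary with ⊕ = (+).
import Test.QuickCheck

prop_fusion :: Int -> Int -> [Int] -> Bool
prop_fusion a b xs = ((+ a) . foldr (+) b) xs == foldr (+) (b + a) xs

-- ghci> quickCheck prop_fusion
-- +++ OK, passed 100 tests.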
For a more interesting example, consider the following well-known equation,
which asserts that the map operator distributes over function composition (·):
map f · map g = map (f · g)
By replacing the second and third occurrences of the map operator in the equation
by its definition using fold given earlier, the equation can be rewritten in a form
that matches the conclusion of the fusion property:
map f · fold (λx xs → g x : xs) [ ]
=
fold (λx xs → (f · g) x : xs) [ ]
Appealing to the fusion property and then simplifying gives the following two
equations, which are trivially true by the definitions of map and (·):
map f [ ] = [ ]
map f (g x : y) = (f · g) x : map f y
In addition to the fusion property, there are a number of other useful properties
of the fold operator that can be derived from the universal property (Bird, 1998).
However, the fusion property suffices for many practical cases, and one can always
revert to the full power of the universal property if fusion is not appropriate.
3.3 Universality as a definition principle
As well as being used as a proof principle, the universal property of fold can also be
used as a definition principle that guides the transformation of recursive functions
into definitions using fold. As a simple first example, consider the recursively defined
function sum that calculates the sum of a list of numbers:
sum :: [Int] → Int
sum [ ] = 0
sum (x : xs) = x + sum xs

Suppose now that we want to redefine sum using fold. That is, we want to solve the
equation sum = fold f v for a function f and a value v. We begin by observing that
the equation matches the right-hand side of the universal property, from which we
conclude that the equation is equivalent to the following two equations:

sum [ ] = v
sum (x : xs) = f x (sum xs)
From the first equation and the definition of sum, it is immediate that v = 0. From
the second equation, we calculate a definition for f as follows:
sum (x : xs) = f x (sum xs)
⇔    { Definition of sum }
x + sum xs = f x (sum xs)
⇐    { Generalising (sum xs) to y }
x + y = f x y
⇔    { Functions }
f = (+)
That is, using the universal property we have calculated that:
sum = fold (+) 0
Note that the key step (⇐) above in calculating a definition for f is the generalisation
of the expression sum xs to a fresh variable y. In fact, such a generalisation step is
not specific to the sum function, but will be a key step in the transformation of any
recursive function into a definition using fold in this manner.
Of course, the sum example above is rather artificial, because the definition of
sum using fold is immediate. However, there are many examples of functions whose
definition using fold is not so immediate. For example, consider the recursively
defined function map f that applies a function f to each element of a list:
map :: (α → β) → ([α] → [β])
map f [ ] = [ ]
map f (x : xs) = f x : map f xs

To redefine map f using fold we must solve the equation map f = fold g v for a
function g and a value v. By appealing to the universal property, we conclude that
this equation is equivalent to the following two equations:
map f [ ] = v
map f (x : xs) = g x (map f xs)
From the first equation and the definition of map it is immediate that v = [ ]. From
the second equation, we calculate a definition for g as follows:
map f (x : xs) = g x (map f xs)
⇔    { Definition of map }
f x : map f xs = g x (map f xs)
⇐    { Generalising (map f xs) to ys }
f x : ys = g x ys
⇔    { Functions }
g = λx ys → f x : ys
That is, using the universal property we have calculated that:
map f = fold (λx ys → f x : ys) [ ]
In general, any function on lists that can be expressed using the fold operator can
be transformed into such a definition using the universal property of fold.
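As a sanity check (not a proof), the calculated definition can be compared against the Prelude's map with QuickCheck; the names mapFold and prop_map below are illustrative assumptions:

-- A sketch: the calculated definition of map, written with foldr, agrees
-- with the Prelude's map on finite lists.
import Test.QuickCheck

mapFold :: (a -> b) -> [a] -> [b]
mapFold f = foldr (\x ys -> f x : ys) []

prop_map :: [Int] -> Bool
prop_map xs = mapFold (+1) xs == map (+1) xs

-- ghci> quickCheck prop_map
-- +++ OK, passed 100 tests.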
4 Increasing the power of fold: generating tuples
As a simple first example of the use of fold to generate tuples, consider the function
sumlength that calculates the sum and length of a list of numbers:
sumlength :: [Int] → (Int, Int)
sumlength xs = (sum xs, length xs)
By a straightforward combination of the definitions of the functions sum and
length using fold given earlier, the function sumlength can be redefined as a single
application of fold that generates a pair of numbers from a list of numbers:
sumlength = fold (λn (x, y) → (n + x, 1 + y)) (0, 0)

This definition is more efficient than the original definition, because it only makes a
single traversal over the argument list, rather than two separate traversals. General-
ising from this example, any pair of applications of fold to the same list can always
be combined to give a single application of fold that generates a pair, by appealing
to the so-called ‘banana split’ property of fold (Meijer, 1992). The strange name of
this property derives from the fact that the fold operator is sometimes written using
brackets (||) that resemble bananas, and the pairing operator is sometimes called
split. Hence, their combination can be termed a banana split!
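In plain Haskell, with foldr playing the role of fold, the single-traversal definition reads as follows (a sketch, with a sample evaluation added for illustration):

-- A sketch of the single-traversal definition using the Prelude's foldr.
sumlength :: [Int] -> (Int, Int)
sumlength = foldr (\n (s, l) -> (n + s, 1 + l)) (0, 0)

-- ghci> sumlength [3,4,5]
-- (12,3)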
As a more interesting example, let us consider the function dropWhile p that
removes initial elements from a list while all the elements satisfy the predicate p:
dropWhile :: (α → Bool) → ([α] → [α])
dropWhile p [ ] = [ ]
dropWhile p (x : xs) = if p x then dropWhile p xs else x : xs
Suppose now that we want to redefine dropWhile p using the fold operator. By
appealing to the universal property, we conclude that the equation dropWhile p =
fold f v is equivalent to the following two equations:
dropWhile p [ ] = v
dropWhile p (x : xs) = f x (dropWhile p xs)

From the first equation it is immediate that v = [ ]. From the second equation, we
attempt to calculate a definition for f in the normal manner:

dropWhile p (x : xs) = f x (dropWhile p xs)
⇔    { Definition of dropWhile }
if p x then dropWhile p xs else x : xs = f x (dropWhile p xs)
⇐    { Generalising (dropWhile p xs) to ys }
if p x then ys else x : xs = f x ys
Unfortunately, the final line above is not a valid definition for f, because the variable
xs occurs freely. In fact, it is not possible to redefine dropWhile p directly using fold.
However, it is possible indirectly, because the more general function

dropWhile* :: (α → Bool) → ([α] → ([α], [α]))
dropWhile* p xs = (dropWhile p xs, xs)

that pairs up the result of applying dropWhile p to a list with the list itself can be
redefined using fold. By appealing to the universal property, we conclude that the
equation dropWhile* p = fold f v is equivalent to the following two equations:

dropWhile* p [ ] = v
dropWhile* p (x : xs) = f x (dropWhile* p xs)
A simple calculation from the first equation gives v = ([ ], [ ]). From the second
equation, we calculate a definition for f as follows:
dropWhile* p (x : xs) = f x (dropWhile* p xs)
⇔    { Definition of dropWhile* }
(dropWhile p (x : xs), x : xs) = f x (dropWhile p xs, xs)
⇔    { Definition of dropWhile }
(if p x then dropWhile p xs else x : xs, x : xs) = f x (dropWhile p xs, xs)
⇐    { Generalising (dropWhile p xs) to ys }
(if p x then ys else x : xs, x : xs) = f x (ys, xs)
Note that the final line above is a valid definition for f, because all the variables
are bound. In summary, using the universal property we have calculated that:
dropWhile* p = fold f v
  where
    f x (ys, xs) = (if p x then ys else x : xs, x : xs)
    v = ([ ], [ ])
This definition satisfies the equation dropWhile* p xs = (dropWhile p xs, xs), but
does not make use of dropWhile in its definition. Hence, the function dropWhile itself
can now be redefined simply by dropWhile p = fst · dropWhile* p.
In conclusion, by first generalising to a function dropWhile* that pairs the desired
result with the argument list, we have now shown how the function dropWhile can
be redefined in terms of fold, as required. In fact, this result is an instance of a
general theorem (Meertens, 1992) that states that any function on finite lists that is
defined by pairing the desired result with the argument list can always be redefined
in terms of fold, although not always in a way that does not make use of the original
(possibly recursive) definition for the function.
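A sketch of this construction in plain Haskell, with foldr playing the role of fold; the names dropWhileP and dropWhileFold are illustrative assumptions rather than names used in the paper:

-- The tupling construction: dropWhileP pairs the result with the argument
-- list and is a genuine fold; dropWhileFold projects out the first component.
dropWhileP :: (a -> Bool) -> [a] -> ([a], [a])
dropWhileP p = foldr f v
  where
    f x (ys, xs) = (if p x then ys else x : xs, x : xs)
    v            = ([], [])

dropWhileFold :: (a -> Bool) -> [a] -> [a]
dropWhileFold p = fst . dropWhileP p

-- ghci> dropWhileFold (< 3) [1,2,3,1,2]
-- [3,1,2]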
4.1 Primitive recursion
In this section we show that by using the tupling technique from the previous section,
every primitive recursive function on lists can be redefined in terms of fold. Let us
begin by recalling that the fold operator captures the following simple pattern of
recursion for defining a function h that processes lists:
h [ ] = v
h (x : xs) = g x (h xs)

Such functions can be redefined by h = fold g v. We will generalise this pattern
of recursion to primitive recursion in two steps. First of all, we introduce an extra
argument y to the function h, which in the base case is processed by a new function
f, and in the recursive case is passed unchanged to the functions g and h. That is,
we now consider the following pattern of recursion for defining a function h:
h y [ ] = f y
h y (x : xs) = g y x (h y xs)

By simple observation, or a routine application of the universal property of fold,
the function h y can be redefined using fold as follows:

h y = fold (g y) (f y)
For the second step, we introduce the list xs as an extra argument to the auxiliary
function g. That is, we now consider the following pattern for defining h:
h y [ ] = f y
h y (x : xs) = g y x xs (h y xs)
This pattern of recursion on lists is called primitive recursion (Kleene, 1952). Tech-
nically, the standard definition of primitive recursion requires that the argument y
is a finite sequence of arguments. However, because tuples are first-class values in
Haskell, treating the case of a single argument y is sufficient.
In order to redefine primitive recursive functions in terms of fold, we must solve
the equation h y = fold i j for a function i and a value j. This is not possible
directly, but is possible indirectly, because the more general function

k y xs = (h y xs, xs)
that pairs up the result of applying h y to a list with the list itself can be redefined
using fold. By appealing to the universal property of fold, we conclude that the
equation k y = fold i j is equivalent to the following two equations:

k y [ ] = j
k y (x : xs) = i x (k y xs)

A simple calculation from the first equation gives j = (f y, [ ]). From the second
equation, we calculate a definition for i as follows:
k y (x : xs) = i x (k y xs)
⇔    { Definition of k }
(h y (x : xs), x : xs) = i x (h y xs, xs)
⇔    { Definition of h }
(g y x xs (h y xs), x : xs) = i x (h y xs, xs)
⇐    { Generalising (h y xs) to z }
(g y x xs z, x : xs) = i x (z, xs)
In summary, using the universal property we have calculated that:
k y = fold i j
  where
    i x (z, xs) = (g y x xs z, x : xs)
    j = (f y, [ ])

This definition satisfies the equation k y xs = (h y xs, xs), but does not make
use of h in its definition. Hence, the primitive recursive function h itself can now be
redefined simply by h y = fst · k y. In conclusion, we have now shown how an
arbitrary primitive recursive function on lists can be redefined in terms of fold.
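The construction can be packaged once and for all as a higher-order function; the following sketch in plain Haskell uses foldr for fold, and the names primrec and dropWhile3 are illustrative assumptions rather than definitions from the paper:

-- primrec f g y captures the primitive recursion scheme of Section 4.1:
--   h y [ ]      = f y
--   h y (x : xs) = g y x xs (h y xs)
-- implemented by tupling the result with the argument list, as in the text.
primrec :: (y -> b) -> (y -> a -> [a] -> b -> b) -> y -> [a] -> b
primrec f g y = fst . k
  where
    k = foldr i j
    i x (z, xs) = (g y x xs z, x : xs)
    j = (f y, [])

-- For example, dropWhile written with primrec (a hypothetical helper, with
-- a trivial extra argument ()):
dropWhile3 :: (a -> Bool) -> [a] -> [a]
dropWhile3 p = primrec (const []) (\_ x xs z -> if p x then z else x : xs) ()

-- ghci> dropWhile3 (< 3) [1,2,3,1,2]
-- [3,1,2]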
Note that the use of tupling to define primitive recursive functions in terms
of fold is precisely the key to defining the predecessor function for the Church
numerals (Barendregt, 1984). Indeed, the intuition behind the representation of the
natural numbers (or more generally, any inductive datatype) in the λ-calculus is the
idea of representing each number by its fold operator. For example, the number
3 = succ (succ (succ zero)) is represented by the term λf x → f (f (f x)), which is
the fold operator for 3 in the sense that the arguments f and x can be viewed as
the replacements for the succ and zero constructors respectively.
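A sketch of the corresponding construction for Church numerals in Haskell (the encoding, the RankNTypes extension and the names below are our assumptions, not part of the paper):

{-# LANGUAGE RankNTypes #-}

-- A Church numeral applies its first argument n times.
type Church = forall a. (a -> a) -> a -> a

three :: Church
three = \s z -> s (s (s z))

-- Predecessor via the same pairing trick: the pair tracks the results for
-- n-1 and for n, starting from (z, z).
predC :: Church -> Church
predC n = \s z -> fst (n (\(_, y) -> (y, s y)) (z, z))

toInt :: Church -> Int
toInt n = n (+1) 0

-- ghci> toInt (predC three)
-- 2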
5 Using fold to generate functions
Having functions as first-class values increases the power of primitive recursion,
and hence the power of the fold operator. As a simple first example of the use of
fold to generate functions, the function compose that forms the composition of a
list of functions can be defined using fold by replacing each (:) in the list by the
composition function (·), and the empty list [ ] by the identity function id:
compose :: [α → α] → (α → α)
compose = fold (·) id
As a more interesting example, let us consider the problem of summing a list of
numbers. The natural definition for such a function, sum = fold (+) 0, processes
the numbers in the list in right-to-left order. However, it is also possible to define a
function suml that processes the numbers in left-to-right order. The suml function is
naturally defined using an auxiliary function suml* that is itself defined by explicit
recursion and makes use of an accumulating parameter n:

suml :: [Int] → Int
suml xs = suml* xs 0
  where
    suml* [ ] n = n
    suml* (x : xs) n = suml* xs (n + x)
Because the addition function (+) is associative and the constant 0 is a unit for
addition, the functions suml and sum always give the same result when applied to
the same list. However, the function suml has the potential to be more efficient,
because it can easily be modified to run in constant space (Bird, 1998).
Suppose now that we want to redefine suml using the fold operator. This is not
possible directly, but is possible indirectly, because the auxiliary function
suml* :: [Int] → (Int → Int)

can be redefined using fold. By appealing to the universal property, we conclude
that the equation suml* = fold f v is equivalent to the following two equations:

suml* [ ] = v
suml* (x : xs) = f x (suml* xs)
A simple calculation from the first equation gives v = id. From the second equation,
we calculate a definition for the function f as follows:
suml* (x : xs) = f x (suml* xs)
⇔    { Functions }
suml* (x : xs) n = f x (suml* xs) n
⇔    { Definition of suml* }
suml* xs (n + x) = f x (suml* xs) n
⇐    { Generalising (suml* xs) to g }
g (n + x) = f x g n
⇔    { Functions }
f = λx g → (λn → g (n + x))
In summary, using the universal property we have calculated that:
suml* = fold (λx g → (λn → g (n + x))) id

This definition states that suml* processes a list by replacing the empty list [ ] by
the identity function id, and each constructor (:) by the function that takes
a number x and a function g, and returns the function that takes an accumulator
value n and returns the result of applying g to the new accumulator value n + x.
Note that the structuring of the arguments to suml* :: [Int] → (Int → Int) is
crucial to its definition using fold. In particular, if the order of the two arguments is
swapped or they are supplied as a pair, then the type of suml* means that it can no
longer be defined directly using fold. In general, some care regarding the structuring
of arguments is required when aiming to redefine functions using fold. Moreover,
at first sight one might imagine that fold can only be used to define functions that
process the elements of lists in right-to-left order. However, as the definition of suml*
using fold shows, the order in which the elements are processed depends on the
arguments of fold, not on fold itself.
In conclusion, by first redefining the auxiliary function suml* using fold, we have
now shown how the function suml can be redefined in terms of fold, as required:

suml xs = fold (λx g → (λn → g (n + x))) id xs 0
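In plain Haskell, with foldr playing the role of fold, the result reads as follows (sumlFold is an illustrative name):

-- A sketch of the calculated left-to-right sum using the Prelude's foldr.
sumlFold :: [Int] -> Int
sumlFold xs = foldr (\x g n -> g (n + x)) id xs 0

-- ghci> sumlFold [1,2,3]
-- 6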
We end this section by remarking that the use of fold to generate functions
provides an elegant technique for the implementation of ‘attribute grammars’ in
functional languages (Fokkinga et al., 1991; Swierstra et al., 1998).
5.1 The foldl operator
Now let us generalise from the suml example and consider the standard operator
foldl that processes the elements of a list in left-to-right order by using a function f
to combine values, and a value v as the starting value:
foldl :: (β → α → β) → β → ([α] → β)
foldl f v [ ] = v
foldl f v (x : xs) = foldl f (f v x) xs
Using this operator, suml can be redefined simply by suml = foldl (+) 0. Many other
functions can be defined in a simple way using foldl. For example, the standard
function reverse can be redefined using foldl as follows:

reverse :: [α] → [α]
reverse = foldl (λxs x → x : xs) [ ]

This definition is more efficient than our original definition using fold, because it
avoids the use of the inefficient append operator (++) for lists.
A simple generalisation of the calculation in the previous section for the function
suml shows how to redefine the function foldl in terms of fold:
foldl f v xs = fold (λx g → (λa → g (f a x))) id xs v
In contrast, it is not possible to redefine fold in terms of foldl, due to the fact that
foldl is strict in the tail of its list argument but fold is not. There are a number
of useful ‘duality theorems’ concerning fold and foldl, and also some guidelines for
deciding which operator is best suited to particular applications (Bird, 1998).
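Both results of this subsection can be transcribed into plain Haskell as a sketch; foldlViaFoldr and reverseL are illustrative names, and foldr and foldl are the Prelude operators:

-- foldl expressed via foldr, together with reverse defined using foldl.
foldlViaFoldr :: (b -> a -> b) -> b -> [a] -> b
foldlViaFoldr f v xs = foldr (\x g a -> g (f a x)) id xs v

reverseL :: [a] -> [a]
reverseL = foldl (\xs x -> x : xs) []

-- ghci> (foldlViaFoldr (-) 0 [1,2,3], foldl (-) 0 [1,2,3], reverseL [1,2,3])
-- (-6,-6,[3,2,1])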
5.2 Ackermann’s function
For our final example of the power of fold, consider the function ack that processes
two lists of integers, and is defined using explicit recursion as follows:
ack :: [Int] → ([Int] → [Int])
ack [ ] ys = 1 : ys
ack (x : xs) [ ] = ack xs [1]
ack (x : xs) (y : ys) = ack xs (ack (x : xs) ys)
This is Ackermann’s function, converted to operate on lists rather than natural
numbers by representing each number n by a list with n arbitrary elements. This
function is the classic example of a function that is not primitive recursive in a
first-order programming language. However, in a higher-order language such as
Haskell, Ackermann’s function is indeed primitive recursive (Reynolds, 1985). In
this section we show how to calculate the definition of ack in terms of fold.
First of all, by appealing to the universal property of fold, the equation ack =
fold f v is equivalent to the following two equations:

ack [ ] = v
ack (x : xs) = f x (ack xs)
A simple calculation from the first equation gives the definition v = (1 :). From the
second equation, proceeding in the normal manner does not result in a definition
for the function f, as the reader may wish to verify. However, progress can be
made by first using fold to redefine the function ack (x : xs) on the left-hand
side of the second equation. By appealing to the universal property, the equation
ack (x : xs) = fold g w is equivalent to the following two equations:

ack (x : xs) [ ] = w
ack (x : xs) (y : ys) = g y (ack (x : xs) ys)
The first equation gives w = ack xs [1], and from the second:
ack (x : xs) (y : ys) = g y (ack (x : xs) ys)
⇔    { Definition of ack }
ack xs (ack (x : xs) ys) = g y (ack (x : xs) ys)
⇐    { Generalising (ack (x : xs) ys) to zs }
ack xs zs = g y zs
⇔    { Functions }
g = λy → ack xs

That is, using the universal property we have calculated that:

ack (x : xs) = fold (λy → ack xs) (ack xs [1])
Using this result, we can now calculate a definition for f:
ack (x : xs) = f x (ack xs)
⇔    { Result above }
fold (λy → ack xs) (ack xs [1]) = f x (ack xs)
⇐    { Generalising (ack xs) to g }
fold (λy → g) (g [1]) = f x g
⇔    { Functions }
f = λx g → fold (λy → g) (g [1])

In summary, using the universal property twice we have calculated that:

ack = fold (λx g → fold (λy → g) (g [1])) (1 :)
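As a sketch in plain Haskell, the calculated fold can be checked against the direct recursive definition on small inputs; ackRec and ackFold are illustrative names, and foldr plays the role of fold:

-- The recursive definition and the calculated fold, side by side.
ackRec :: [Int] -> [Int] -> [Int]
ackRec []       ys       = 1 : ys
ackRec (_ : xs) []       = ackRec xs [1]
ackRec (x : xs) (_ : ys) = ackRec xs (ackRec (x : xs) ys)

ackFold :: [Int] -> [Int] -> [Int]
ackFold = foldr (\_ g -> foldr (\_ -> g) (g [1])) (1 :)

-- Both compute Ackermann's function on list lengths; for two lists of
-- length 2 the result has length 7 (= A(2,2)).
-- ghci> (length (ackRec [0,0] [0,0]), length (ackFold [0,0] [0,0]))
-- (7,7)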
6 Other work on recursion operators
In this final section we briefly survey a selection of other work on recursion operators
that we did not have space to pursue in this article.
Fold for regular datatypes. The fold operator is not specific to lists, but can
be generalised in a uniform way to ‘regular’ datatypes. Indeed, using ideas from
category theory, a single fold operator can be defined that can be used with any
regular datatype (Malcolm, 1990b; Meijer et al., 1991; Sheard and Fegaras, 1993).
Fold for nested datatypes. The fold operator can also be generalised in a natural
way to ‘nested’ datatypes. However, the resulting operator appears to be too general
to be widely useful. Finding solutions to this problem is the subject of current
research (Bird and Meertens, 1998; Jones and Blampied, 1998).
Fold for functional datatypes. Generalising the fold operator to datatypes that
involve functions gives rise to technical problems, due to the contravariant nature
of function types. Using ideas from category theory, a fold operator can be defined
that works for such datatypes (Meijer and Hutton, 1995a), but the use of this
operator is not well understood, and practical applications are lacking. However,
a simpler but less general solution has given rise to some interesting applications
concerning cyclic structures (Fegaras and Sheard, 1996).
Monadic fold. In a series of influential articles, Wadler showed how pure functional
programs that require imperative features such as state and exceptions can be
modelled using monads (Wadler, 1990, 1992a, 1992b). Building on this work, the
notion of a ‘monadic fold’ combines the use of fold operators to structure the
processing of recursive values with the use of monads to structure the use of
imperative features (Fokkinga, 1994; Meijer and Jeuring, 1995b).
Relational fold. The fold operator can also be generalised in a natural way from
functions to relations. This generalisation supports the use of fold as a specification
construct, in addition to its use as a programming construct. For example, a relational
fold is used in the circuit design calculus Ruby (Jones and Sheeran, 1990; Jones,
1990), the Eindhoven spec calculus (Aarts et al., 1992), and in a recent textbook on
the algebra of programming (Bird and de Moor, 1997).
Other recursion operators. The fold operator is not the only useful recursion oper-
ator. For example, the dual operator unfold for constructing rather than processing
recursive values has been used for specification purposes (Jones, 1990; Bird and
de Moor, 1997), to program reactive systems (Kieburtz, 1998), to program opera-
tional semantics (Hutton, 1998), and is the subject of current research. Other in-
teresting recursion operators include the so-called paramorphisms (Meertens, 1992),
hylomorphisms (Meijer, 1992), and zygomorphisms (Malcolm, 1990a).
Automatic program transformation. Writing programs using recursion operators can
simplify the process of optimisation during compilation. For example, eliminating
the use of intermediate data structures in programs (deforestation) is considerably
simplified when programs are written using recursion operators rather than general
recursion (Wadler, 1981; Launchbury and Sheard, 1995; Takano and Meijer, 1995).
A generic system for transforming programs written using recursion operators is
currently under development (de Moor and Sittampalan, 1998).
Polytypic programming. Defining programs that are not specific to particular
datatypes has given rise to a new field, called polytypic programming (Backhouse
et al., 1998). Formally, a polytypic program is one that is parameterised by one
or more datatypes. Polytypic programs have already been defined for a number of
applications, including pattern matching (Jeuring, 1995), unification (Jansson and
Jeuring, 1998), and various optimisation problems (Bird and de Moor, 1997).
Programming languages. A number of experimental programming languages have
been developed that focus on the use of recursion operators rather than general re-
cursion. Examples include the algebraic design language ADL (Kieburtz and Lewis,
1994), the categorical programming language Charity (Cockett and Fukushima,
1992), and the polytypic programming language PolyP (Jansson and Jeuring, 1997).
Acknowledgements
I would like to thank Erik Meijer and the members of the Languages and Program-
ming group in Nottingham for many hours of interesting discussions about fold.
I am also grateful to Roland Backhouse, Mark P. Jones, Philip Wadler, and the
anonymous JFP referees for their detailed comments on the article, which led to
a substantial improvement in both the content and presentation. This work is sup-
ported by Engineering and Physical Sciences Research Council (EPSRC) research
grant GR/L74491, Structured Recursive Programming.
References
Aarts, C., Backhouse, R., Hoogendijk, P., Voermans, E. and van der Woude,
J. (1992) A relational theory of datatypes. Available on the World-Wide-Web:
http://www.win.tue.nl/win/cs/wp/papers/papers.html.
Backhouse, R., Jansson, P., Jeuring, J. and Meertens, L. (1998) Generic programming: An
introduction. Lecture Notes of the 3rd International Summer School on Advanced Functional
Programming.
Backus, J. (1978) Can programming be liberated from the Von Neumann style? A functional
style and its algebra of programs. Comm. ACM, 21(8).
Barendregt, H. (1984) The Lambda Calculus: Its Syntax and Semantics. North-Holland.
Bird, R. (1989) Constructive functional programming. Proc. Marktoberdorf International Sum-
mer School on Constructive Methods in Computer Science. Springer-Verlag.
Bird, R. (1998) Introduction to Functional Programming using Haskell (2nd ed.). Prentice Hall.
Bird, R. and de Moor, O. (1997) Algebra of Programming. Prentice Hall.
Bird, R. and Meertens, L. (1998) Nested datatypes. In: Jeuring, J. (ed.), Proc. 4th International
Conference on Mathematics of Program Construction: Lecture Notes in Computer Science
1422. Springer-Verlag.
Bird, R. and Wadler, P. (1988) An Introduction to Functional Programming. Prentice Hall.
Cockett, R. and Fukushima, T. (1992) About Charity. Yellow Series Report No. 92/480/18,
Department of Computer Science, The University of Calgary.
de Moor, O. and Sittampalan, G. (1998) Generic program transformation. Lecture Notes of
the 3rd International Summer School on Advanced Functional Programming.
Fegaras, L. and Sheard, T. (1996) Revisiting catamorphisms over datatypes with embedded
functions. Proc. 23rd ACM SIGPLAN-SIGACT Symposium on Principles of Programming
Languages.
Fokkinga, M. (1994) Monadic maps and folds for arbitrary datatypes. Memoranda Informatica
94-28, University of Twente.
Fokkinga, M., Jeuring, J., Meertens, L. and Meijer, E. (1991) Translating attribute grammars
into catamorphisms. The Squiggolist, 2(1).
Hutton, G. (1998) Fold and unfold for program semantics. Proc. 3rd ACM SIGPLAN Inter-
national Conference on Functional Programming.
Iverson, K. E. (1962) A Programming Language. Wiley.
Jansson, P. and Jeuring, J. (1997) PolyP: a polytypic programming language extension. Proc.
24th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM
Press.
Jansson, P. and Jeuring, J. (1998) Polytypic unification. J. Functional Programming (to appear).
Jeuring, J. (1995) Polytypic pattern matching. Proc. 7th International Conference on Func-
tional Programming and Computer Architecture. ACM Press.
Jones, G. (1990) Designing circuits by calculation. Technical Report PRG-TR-10-90, Oxford
University.
Jones, G. and Sheeran, M. (1990) Circuit design in Ruby. In: Staunstrup (ed.), Formal Methods
for VLSI Design. Elsevier.
Jones, M. P. and Blampied, P. (1998) A pragmatic approach to maps and folds for param-
eterized datatypes. Submitted.
Kieburtz, R. B. (1998) Reactive functional programming. Proc. PROCOMET. Chapman &
Hall.
Kieburtz, R. B. and Lewis, J. (1994) Algebraic Design Language (preliminary definition). Oregon
Graduate Institute of Science and Technology.
Kleene, S. C. (1952) Introduction to Metamathematics. Van Nostrand Rheinhold.
Launchbury, J. and Sheard, T. (1995) Warm fusion: Deriving build-catas from recursive
definitions. Proc. 7th International Conference on Functional Programming and Computer
Architecture. ACM Press.
Malcolm, G. (1990a) Algebraic data types and program transformation. PhD thesis, Groningen
University.
Malcolm, G. (1990b) Algebraic data types and program transformation. Science of Computer
Programming, 14(2–3), 255–280.
Meertens, L. (1983) Algorithmics: Towards programming as a mathematical activity. Proc.
CWI Symposium.
Meertens, L. (1992) Paramorphisms. Formal Aspects of Computing, 4(5), 413–425.
Meijer, E. (1992) Calculating compilers. PhD thesis, Nijmegen University.
Meijer, E. and Hutton, G. (1995a) Bananas in space: Extending fold and unfold to expo-
nential types. Proc. 7th International Conference on Functional Programming and Computer
Architecture. ACM Press.
Meijer, E. and Jeuring, J. (1995b) Merging monads and folds for functional programming.
In: Jeuring, J. and Meijer, E. (eds.), Advanced Functional Programming: Lecture Notes in
Computer Science 925. Springer-Verlag.
Meijer, E., Fokkinga, M. and Paterson, R. (1991) Functional programming with bananas,
lenses, envelopes and barbed wire. In: Hughes, J. (ed.), Proc. Conference on Functional
Programming and Computer Architecture: Lecture Notes in Computer Science 523. Springer-
Verlag.
Peterson, J. et al. (1997) The Haskell language report, version 1.4. Available on the World-
Wide-Web: http://www.haskell.org.
Reynolds, J. C. (1985) Three approaches to type structure. Proc. International Joint Conference
on Theory and Practice of Software Development: Lecture Notes in Computer Science 185.
Springer-Verlag.
Sheard, T. and Fegaras, L. (1993) A fold for all seasons. Proc. ACM Conference on Functional
Programming and Computer Architecture. Springer-Verlag.
Swierstra, S. D., Alcocer, P. R. A. and Saraiva, J. (1998) Designing and implementing
combinator languages. Lecture Notes of the 3rd International Summer School on Advanced
Functional Programming.
Takano, A. and Meijer, E. (1995) Shortcut deforestation in calculational form. Proc. 7th
International Conference on Functional Programming and Computer Architecture. ACM Press.
Wadler, P. (1981) Applicative style programming, program transformation, and list operators.
Proc. ACM Conference on Functional Programming Languages and Computer Architecture.
Wadler, P. (1990) Comprehending monads. Proc. ACM Conference on Lisp and Functional
Programming.
Wadler, P. (1992a). The essence of functional programming. Proc. Principles of Programming
Languages.
Wadler, P. (1992b) Monads for functional programming. In: Broy, M. (ed.), Proc. Marktober-
dorf Summer School on Program Design Calculi. Springer-Verlag.
An important feature of the applicative style is the use of operators that package common patterns of computation. For example, the list operator map applies a function to every element of a list. Practical use of this style has been hampered by the fact that it can be very inefficient to execute. One remedy for this situation is to use source-to-source program transformation to convert applicative style programs to more efficient equivalents. This paper examines how list operators can be used to guide the transformation process. It describes a small set of list operators that possess a “complete” set of transformation rules, allowing transformations to be performed very efficiently. Whereas most previous transformation methods resemble proofs, this transformation method resembles algebraic manipulation.