Programming in the λ-Calculus
From Church to Scott and back
Jan Martin Jansen
Faculty of Military Sciences,
Netherlands Defence Academy,
Den Helder, the Netherlands
jm.jansen.04@nlda.nl
Abstract. Although the λ-calculus is well known as a universal pro-
gramming language, it is seldom used for actual programming or ex-
pressing algorithms. Here we demonstrate that it is possible to use the λ-
calculus as a comprehensive formalism for programming by showing how
to convert programs written in functional programming languages like
Clean and Haskell to closed λ-expressions. The transformation is based on
using the Scott-encoding for Algebraic Data Types instead of the more
common Church encoding. In this way we not only obtain an encoding
that is better comprehensible but that is also more efficient. As a proof of
the pudding we provide an implementation of Eratosthenes’ prime sieve
algorithm as a self-contained, 143 character length, λ-expression.
1 The Church and Scott Encodings for Algebraic Data
Types
The λ-calculus can be considered as the mother of all (functional) program-
ming languages. Every course or textbook on λ-calculus (e.g. [1]) spends some
time on showing how well-known programming constructs can be expressed in
the λ-calculus. It commonly starts by explaining how to represent natural
numbers; in almost all cases the Church numerals are chosen as the leading
example. The definition of Church numerals and operations on them shows that
it is possible to use the λ-calculus for all kinds of computations and that it is
indeed a universal programming language. The Church encoding can be gener-
alized for the encoding of general Algebraic Data Types (see [2]). This encoding
allows for a straightforward implementation of iterative (primitive recursive) or
fold-like functions on data structures, but often requires complex and inefficient
constructions for expressing general recursion.
It is less commonly known that there exists an alternative encoding of
numbers and algebraic data structures in the λ-calculus. This encoding has been
independently (re)discovered by several authors (e.g. [9, 8, 10] and
the author of this paper [6]), but it was originally attributed to Scott in an unpublished
lecture which is cited in Curry, Hindley and Seldin ([4], page 504) as: Dana
Scott, A system of functional abstraction. Lectures delivered at University of
California, Berkeley, Cal., 1962/63. Photocopy of a preliminary version, issued
by Stanford University, September 1963, furnished by author in 1968.¹ We will
therefore call it the Scott encoding. The encoding results in a representation
that is very close to algebraic data types as they are used in most functional
programming languages.
The goal of this paper is not to introduce a new (functional) programming
language, but to show how the λ-calculus itself can be used as a concise pro-
gramming formalism.
This paper starts with a discussion on Algebraic Data Types in Section 2. In
Section 3 it discusses how the Scott and Church encoding can be used to encode
Algebraic Data Types as λ-terms and how these approaches differ. Section 4
focusses on the encoding of recursive functions as λ-terms. In Section 5 the
focus is on the conversion of a complete Haskell or Clean program to a single
λ-term. The paper ends with a discussion in Section 7 and some conclusions in
Section 8.
2 The Nature of Algebraic Data Types
Consider Algebraic Data Type (ADT) definitions in languages like Clean or
Haskell such as tuples, booleans, temperature, maybe, natural (Peano) numbers
and lists:
data Boolean = True | False
data Tuple a b = Tuple a b
data Temperature = Fahrenheit Int | Celsius Int
data Maybe a = Nothing | Just a
data Nat = Zero | Suc Nat
data List t = Nil | Cons t (List t)
A type consists of one or more alternatives. Each alternative consists of a name,
possibly followed by a number of arguments. Algebraic Data Types are used for
several purposes:
- to make enumerations, like in Boolean;
- to package data, like in Tuple;
- to unite things of different kind in one type, like in Maybe and Temperature;
- to make recursive structures like in Nat and List (in fact to construct new
  types with an infinite number of elements).
The power of the ADT construction in modern functional programming lan-
guages is that one formalism can be used for all these purposes.
If we analyse the construction of ADT’s more carefully, we see that construc-
tor names are used for two purposes. First, they are used to distinguish the
different cases in a single type definition (like True and False in Boolean, Nothing
and Just in Maybe, and Fahrenheit and Celsius in Temperature). Second, we need them
for recognizing them as being part of a type and making type inference possible.
Therefore, all constructor names must be different in a single functional
program (module). For distinguishing the different cases in a function definition,
pattern matching on constructor names is used.
¹ I would like to thank Matthew Naylor for pointing me at this reference.
3 Representing Algebraic Data Types in the λ-calculus
In this section it is shown how to represent ADT’s in the λ-calculus. First, we
focus on non-recursive data types for which the Scott and Church encodings are
the same and thereafter on recursive types for which the encodings differ.
3.1 Named λ-expressions
First, some remarks about the notation of λ-expressions. For convenience we will
sometimes give λ-expressions names:
True ≡ λt f . t
These names can be used as macros in other λ-expressions. They are always
written in italics:
True (λf g . f g) (λf g . g f)
is a short-hand for:
(λt f . t) (λf g . f g) (λf g . g f)
Note that these macro names may not be used recursively, because this will
lead to an infinite substitution process. Later on we discuss how to represent
recursion in λ-expressions.
3.2 Expressing Enumeration Types in the λ-calculus
The simplest example of such a type is Boolean. We already noted that we use
pattern matching for recognizing different cases (constructors). So we are ac-
tually looking for an alternative for pattern matching using λ-expressions. The
simplest boolean pattern matching example is if-then-else:
ifte True t f =t
ifte False t f =f
But the same effect can easily be achieved by making True and False functions of
two variables, selecting the first or second argument respectively and by making
ifte the identity function. Therefore, the λ-calculus solution for this is straight-
forward:
True ≡ λt f . t
False ≡ λt f . f
ifte ≡ λi . i
This is also the standard encoding used for booleans that can be found in λ-
calculus courses and text books. Both Church and Scott use this encoding.
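As a minimal sketch, the boolean encoding can be tried out directly in Haskell. The lower-case names `true`, `false` and `ifte` are our own choice for this illustration (Haskell reserves capitalised names for constructors), and the types are one possible typing of the untyped terms.

```haskell
-- Booleans as two-argument selector functions, as in the text.
true :: a -> a -> a
true t _ = t

false :: a -> a -> a
false _ f = f

-- ifte only hands its two branches to the encoded boolean,
-- so it is the identity function on that boolean.
ifte :: (a -> a -> a) -> a -> a -> a
ifte = id
```

Here `ifte true x y` selects `x` and `ifte false x y` selects `y`, without any built-in conditional.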
3.3 Expressing a Simple Container Type in the λ-calculus
Tuple is the simplest example of a container type. If we group data into a con-
tainer, we also need constructions to get data out of it (projection functions). For
Tuple this can be realized by pattern matching or by using the selection functions
fst and snd. These functions are defined in Haskell as:
fst (Tuple a b) = a
snd (Tuple a b) = b
Containers can be expressed in the λ-calculus by using closures (partial applica-
tions). For Tuple the standard way to do this is:
Tuple ≡ λa b f . f a b
A tuple is a function that takes 3 arguments. If we supply only two, we have a
closure. This closure can take a third argument, which should be a 2 argument
function. This function is then applied to the first two arguments. The third
argument is therefore called a continuation (the function with which the com-
putation continues). It is now easy to find out what the definitions of fst and
snd should be:
fst ≡ λt . t (λa b . a)
snd ≡ λt . t (λa b . b)
If applied to a tuple, they apply the tuple to a two argument function, that
selects either the first (fst) or second (snd) argument.
Again, this definition of tuples is the standard one that can be found in λ-
calculus text books and courses. Also for this case the Church and Scott encoding
are the same.
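The closure-based tuple can be sketched in Haskell as follows. The names `tuple`, `fstT` and `sndT` are hypothetical and chosen only to avoid clashing with the Prelude; the types shown are one way to type the untyped terms.

```haskell
-- A tuple stores a and b in a closure and waits for a continuation f.
tuple :: a -> b -> (a -> b -> r) -> r
tuple a b f = f a b

-- The selectors pass a two-argument continuation that keeps one component.
fstT :: ((a -> b -> a) -> a) -> a
fstT t = t (\a _ -> a)

sndT :: ((a -> b -> b) -> b) -> b
sndT t = t (\_ b -> b)
```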
3.4 Expressing General Non-Recursive Multi-Case Types in the
λ-calculus
It is now a straightforward step to come up with a solution for arbitrary non-
recursive ADT’s. Just combine the two solutions from above. Let us look at the
definition of the function warm that takes a Temperature as an argument:
warm :: Temperature -> Boolean
warm (Fahrenheit f) = f > 90
warm (Celsius c) = c > 30
We have to find encodings for (Fahrenheit f) and (Celsius c). The enumeration
example tells that we should make a λ-expression with 2 arguments that returns
the first argument for Fahrenheit and the second argument for Celsius. The con-
tainer solution (as used for Tuple) tells us that we should feed the argument of
Fahrenheit or Celsius to a continuation function. Combining these two solutions
we learn that Fahrenheit and Celsius should both have 3 arguments. The first
one to be used for the closure and the second and third as continuation argu-
ments. Fahrenheit should choose the first continuation argument and apply it to
its first argument and Celsius should do the same with the second continuation
argument:
Fahrenheit ≡ λt f c . f t
Celsius ≡ λt f c . c t
Using this encoding the definition of warm becomes:
warm ≡ λt . t (λf . f > 90) (λc . c > 30)
In the body the first argument of t represents the Fahrenheit case and the second
one the Celsius case.
Also in this non-recursive case the Scott and Church approach do not differ.
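A small Haskell sketch of this combined encoding is given below. It assumes, as the definition of warm in the text does, that integers and comparison are available as primitives; here Haskell's built-in `Int` and `Bool` play that role. The lower-case names are our own.

```haskell
-- Each constructor takes its payload plus one continuation per alternative
-- and calls the continuation that belongs to its own case.
fahrenheit :: Int -> (Int -> r) -> (Int -> r) -> r
fahrenheit t f _ = f t

celsius :: Int -> (Int -> r) -> (Int -> r) -> r
celsius t _ c = c t

-- Pattern matching becomes application to one function per case.
warm :: ((Int -> Bool) -> (Int -> Bool) -> Bool) -> Bool
warm t = t (\f -> f > 90) (\c -> c > 30)
```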
3.5 Recursive Types in the λ-calculus: the Scott Encoding
In the Scott Encoding the previous strategy, as used for Temperature, is also ap-
plied to recursive types. As a matter of fact, the Scott Encoding ignores the
fact that we deal with a recursive type! Let us look for example at Nat and List.
Applying the strategy we used for Temperature for Nat we obtain the following
definitions:
Zero ≡ λz s . z
Suc ≡ λn z s . s n
Applying the same strategy for List, we obtain:
Nil ≡ λn c . n
Cons ≡ λx xs n c . c x xs
Functions like predecessor, head and tail can now easily be defined:
pred ≡ λn . n undef (λm . m)
head ≡ λxs . xs undef (λx xs . x)
tail ≡ λxs . xs undef (λx xs . xs)
Note that pred and tail have constant time complexity!
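The Scott numerals can be given a type in Haskell with the `RankNTypes` extension, which is an assumption of this sketch and not something the untyped λ-calculus needs. The helper names `caseNat`, `predS` and `toInt` are hypothetical, and `error` stands in for undef.

```haskell
{-# LANGUAGE RankNTypes #-}

-- A Scott numeral is its own case analysis: one continuation per constructor.
newtype Nat = Nat (forall r. r -> (Nat -> r) -> r)

zero :: Nat
zero = Nat (\z _ -> z)

suc :: Nat -> Nat
suc n = Nat (\_ s -> s n)

-- Unwrap the numeral and apply it to the two case continuations.
caseNat :: Nat -> r -> (Nat -> r) -> r
caseNat (Nat n) = n

-- Constant-time predecessor: the Suc case simply returns the stored number.
predS :: Nat -> Nat
predS n = caseNat n (error "pred of zero") (\m -> m)

-- Observation helper for testing only.
toInt :: Nat -> Int
toInt n = caseNat n 0 (\m -> 1 + toInt m)
```

Note that `predS` performs a single application and never traverses the numeral.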
As another example we give the Scott Encoding of the fold functions for Nat
and List. The Haskell definition of foldNat is given by:
foldNat f x Zero = x
foldNat f x (Suc n) = f (foldNat f x n)
The conversion for the Scott encoding of Nat is straightforward: the bodies of the
two cases simply appear as the first and second argument of n (later on we show
how to remove the recursive call to foldNat):
foldNat ≡ λf x n . n x (λn . f (foldNat f x n))
For foldList, the Haskell definition is:
foldList f d [] = d
foldList f d (h:t) = f h (foldList f d t)
Using the Scott encoding for lists this becomes:
foldList ≡ λf d xs . xs d (λh t . f h (foldList f d t))
The Scott encoding of ADT’s is completely equivalent to their counterparts in
Haskell and Clean. Functions acting on them can be straightforwardly converted
to their Scott versions.
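The two folds can be sketched in Haskell over typed Scott encodings. The `RankNTypes` extension and the newtype wrappers are assumptions of this sketch; the recursion is left to Haskell here, since removing it is treated separately in the text.

```haskell
{-# LANGUAGE RankNTypes #-}

newtype Nat = Nat (forall r. r -> (Nat -> r) -> r)
newtype List a = List (forall r. r -> (a -> List a -> r) -> r)

zero :: Nat
zero = Nat (\z _ -> z)

suc :: Nat -> Nat
suc n = Nat (\_ s -> s n)

nil :: List a
nil = List (\n _ -> n)

cons :: a -> List a -> List a
cons x xs = List (\_ c -> c x xs)

-- The bodies of the two Haskell cases appear as the two arguments of n.
foldNat :: (r -> r) -> r -> Nat -> r
foldNat f x (Nat n) = n x (\m -> f (foldNat f x m))

-- Likewise for lists: the Nil body first, then the Cons body.
foldList :: (a -> r -> r) -> r -> List a -> r
foldList f d (List xs) = xs d (\h t -> f h (foldList f d t))
```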
3.6 Recursive Types in the λ-calculus: the Church Encoding
Church uses an entirely different approach for the encoding of recursive data
types.
The Church definitions of natural numbers are:
Zero ≡ λf x . x
Suc ≡ λn f x . f (n f x)
If we compare this to the Scott approach we see that, instead of feeding only n
to the continuation function f, the result of n f x is fed to it. But this is exactly
the same thing as what happens in the fold function. The definition of foldNat
for Church encoded numerals can therefore be given by:
foldNat ≡ λf x n . n f x
In [5] Hinze states that Church numerals are actually folds in disguise. As a con-
sequence only primitive recursive functions on numbers can be easily expressed
using the Church encoding. For functions that need general recursion (or functions
for which the result for Suc n cannot be expressed using the result for n) we
run into trouble. Church himself was not able to solve this problem, but Kleene
found a way out during a visit to the dentist (as described by Barendregt in
[2]). A nice example of his solution is the predecessor function, which could be
easily expressed using the Scott encoding, as we saw earlier. To define it using
the Church encoding Kleene used a construction with pairs (Tuple):
pred ≡ λn . snd (n (λp . Tuple (Suc (fst p)) (fst p)) (Tuple Zero Zero))
Each pair combines the result of the recursive call with the previous element. A
disadvantage of this solution, besides that it is hard to comprehend, is that it
has complexity O(n) while the Scott version has constant complexity.
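Kleene's construction can be sketched in Haskell as follows. For brevity this sketch uses Haskell's built-in pairs instead of the encoded Tuple, and it assumes the `RankNTypes` extension; the names `czero`, `csuc`, `cfold` and `cpred` are our own.

```haskell
{-# LANGUAGE RankNTypes #-}

-- A Church numeral is a fold: it applies f n times to x.
newtype Church = Church (forall a. (a -> a) -> a -> a)

czero :: Church
czero = Church (\_ x -> x)

csuc :: Church -> Church
csuc (Church n) = Church (\f x -> f (n f x))

-- For Church numerals, folding is just application.
cfold :: (a -> a) -> a -> Church -> a
cfold f x (Church n) = n f x

-- Kleene's predecessor: iterate over pairs (current, previous),
-- starting from (0, 0), and return the second component at the end.
cpred :: Church -> Church
cpred n = snd (cfold (\p -> (csuc (fst p), fst p)) (czero, czero) n)

-- Observation helper for testing only.
toInt :: Church -> Int
toInt = cfold (+ 1) 0
```

The fold in `cpred` visits every successor of the numeral, which is where the linear cost comes from.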
The Church encoding for lists together with the function tail is given by:
Nil ≡ λf x . x
Cons ≡ λh t f x . f h (t f x)
tail ≡ λxs . snd (xs (λx rs . Tuple (Cons x (fst rs)) (fst rs)) (Tuple Nil Nil))
Also here the definition of Cons behaves like a fold (a foldr actually). Again, we
need the pair construction from Kleene for tail. The definition of foldList for
Church encoded lists is given by:
foldList ≡ λf d xs . xs f d
3.7 The Scott Encoding: the General Case
In general the mapping of an ADT to λ-expressions using the Scott encoding is
defined as follows. Given an ADT definition in Haskell or Clean:
data type_name t_1 ... t_k = C_1 t_{1,1} ... t_{1,n_1} | ... | C_m t_{m,1} ... t_{m,n_m}
Then this type definition with m constructors can be mapped to m λ-expressions:
C_1 ≡ λv_{1,1} ... v_{1,n_1} f_1 ... f_m . f_1 v_{1,1} ... v_{1,n_1}
...
C_m ≡ λv_{m,1} ... v_{m,n_m} f_1 ... f_m . f_m v_{m,1} ... v_{m,n_m}
Consider the (multi-case) pattern-based function f in Haskell or Clean defined on
this type:
f (C_1 v_{1,1} ... v_{1,n_1}) = body_1
...
f (C_m v_{m,1} ... v_{m,n_m}) = body_m
This function is converted to the following λ-expression (of course, the bodies
should also be encoded):
f ≡ λx . x
  (λv_{1,1} ... v_{1,n_1} . body_1)
  ...
  (λv_{m,1} ... v_{m,n_m} . body_m)
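To make the general scheme concrete, the sketch below applies it to a hypothetical three-constructor type, `data Shape = Point | Circle Int | Rect Int Int`, which does not occur in the paper. Each constructor takes its own arguments followed by one continuation per constructor, and the function `area` (also hypothetical, with 3 as a crude integer stand-in for π) supplies one λ-expression per case.

```haskell
-- Scott encoding of: data Shape = Point | Circle Int | Rect Int Int
point :: r -> (Int -> r) -> (Int -> Int -> r) -> r
point p _ _ = p

circle :: Int -> r -> (Int -> r) -> (Int -> Int -> r) -> r
circle rad _ c _ = c rad

rect :: Int -> Int -> r -> (Int -> r) -> (Int -> Int -> r) -> r
rect w h _ _ r = r w h

-- area Point = 0; area (Circle rad) = 3 * rad * rad; area (Rect w h) = w * h
-- becomes an application of the value to one body per constructor.
area :: (Int -> (Int -> Int) -> (Int -> Int -> Int) -> Int) -> Int
area x = x 0 (\rad -> 3 * rad * rad) (\w h -> w * h)
```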
3.8 From Church to Scott and back
It is straightforward to convert Church and Scott encoded numerals into each
other. Because a fold replaces constructors by functions and Church numerals
are actually folds, we can obtain the Scott representation by substituting back
the Scott versions of the constructors:
toScott ≡ λn . n Suc_s Zero_s
To go from Scott to Church we should use the Scott version of foldNat:
toChurch ≡ λn f x . foldNat f x n
The conversions between the Church and Scott encoding for lists are given by:
toScottList ≡ λxs . xs Cons_s Nil_s
toChurchList ≡ λxs f d . foldList f d xs
The list definitions are completely equivalent to those for numbers. They only use
a different fold function in toChurchList and different constructors in toScottList.
For other recursive ADT’s similar transformations can be defined.
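The conversions for numerals can be checked with a small Haskell sketch. It assumes the `RankNTypes` extension and wraps both encodings in newtypes; the names `zeroS`, `sucS`, `foldNatS` and the two observation helpers are our own.

```haskell
{-# LANGUAGE RankNTypes #-}

newtype Church = Church (forall a. (a -> a) -> a -> a)
newtype Scott = Scott (forall r. r -> (Scott -> r) -> r)

zeroS :: Scott
zeroS = Scott (\z _ -> z)

sucS :: Scott -> Scott
sucS n = Scott (\_ s -> s n)

-- The Scott version of foldNat.
foldNatS :: (a -> a) -> a -> Scott -> a
foldNatS f x (Scott n) = n x (\m -> f (foldNatS f x m))

-- A Church numeral is a fold, so feed it the Scott constructors.
toScott :: Church -> Scott
toScott (Church n) = n sucS zeroS

-- A Church numeral is obtained by partially applying the Scott fold.
toChurch :: Scott -> Church
toChurch n = Church (\f x -> foldNatS f x n)

-- Observation helpers for testing only.
churchToInt :: Church -> Int
churchToInt (Church n) = n (+ 1) 0

scottToInt :: Scott -> Int
scottToInt = foldNatS (+ 1) 0
```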
In the remainder of this paper we will concentrate on defining algorithms in
the λ-calculus using the Scott encoding.
4 Defining Functions using the Scott Encoding
Now that we know how to represent ADT’s, we can concentrate on functions. We
already gave some examples of them above (ifte, fst, snd, head, tail, pred, warm,
foldNat, foldList). The more interesting examples are the recursive functions. The
standard technique for defining a recursive function in the λ-calculus is to use a
fixed point operator. Let us look for example at addition for Peano numbers in
Haskell:
add Zero m = m
add (Suc n) m = Suc (add n m)
Using the Scott encoding, this becomes:
add_0 ≡ λn m . n m (λn . Suc (add_0 n m))
We now have to get rid of the recursive macro add_0 in this definition. The
standard way to do this is with the use of the Y fixed point combinator:
add_Y ≡ Y (λadd n m . n m (λn . Suc (add n m)))
Y ≡ λh . (λx . h (x x)) (λx . h (x x))
There is, however, another way to represent recursion. Instead of using a fixed
point operator we can also give the recursive function itself as an argument (like
this is done in the argument of Y in add_Y):
add ≡ λadd n m . n m (λn . Suc (add add n m))
The price to pay is that each call of add should have add as an argument. The
gain is that we do not need the fixed point operator any more. This definition is
also more efficient, because it uses fewer reduction steps during reduction than
the fixed-point version. The following example shows how add should be used
to add one to one (note the double add in the call):
(λadd . add add (Suc Zero) (Suc Zero)) add
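Self-application is not directly typeable in Haskell, so the sketch below wraps the function in a recursive newtype `Self`, which is an assumption of this illustration and not part of the untyped term. It also assumes the `RankNTypes` extension for the Scott numerals. The `NOINLINE` pragma is a precaution, because GHC's optimiser is known to have trouble with recursion encoded through a data type.

```haskell
{-# LANGUAGE RankNTypes #-}

newtype Nat = Nat (forall r. r -> (Nat -> r) -> r)

zero :: Nat
zero = Nat (\z _ -> z)

suc :: Nat -> Nat
suc n = Nat (\_ s -> s n)

-- A function that receives (a wrapped copy of) itself as first argument.
newtype Self a = Self (Self a -> a)

-- add ≡ λadd n m . n m (λn . Suc (add add n m)), with the wrapper made explicit.
addSelf :: Self (Nat -> Nat -> Nat) -> Nat -> Nat -> Nat
addSelf this@(Self call) (Nat n) m = n m (\k -> suc (call this k m))
{-# NOINLINE addSelf #-}

-- The initial call passes the function to itself: "add add".
add :: Nat -> Nat -> Nat
add = addSelf (Self addSelf)

-- Observation helper for testing only.
toInt :: Nat -> Int
toInt (Nat n) = n 0 (\m -> 1 + toInt m)
```

No fixed point combinator and no recursive Haskell definition is involved in `addSelf`; the only recursion is in the type `Self`.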
4.1 Mutually Recursive functions
For mutually recursive functions, we have to add all mutually recursive functions
as arguments for each function. An example to clarify this:
isOdd Zero = False
isOdd (Suc n) = isEven n
isEven Zero = True
isEven (Suc n) = isOdd n
This can be represented by λ-expressions as:
isOdd ≡ λisOdd isEven n . n False (λn . isEven isOdd isEven n)
isEven ≡ λisOdd isEven n . n True (λn . isOdd isOdd isEven n)
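The same idea can be sketched in Haskell for the mutually recursive pair. To keep the focus on recursion removal, this sketch uses an ordinary Haskell data type for Nat and the built-in `Bool` instead of their Scott encodings; the wrapper type `Fn` and the helper `apply` are hypothetical names needed only to type the self-passing.

```haskell
data Nat = Zero | Suc Nat

-- Every function in the group receives all functions of the group.
newtype Fn = Fn (Fn -> Fn -> Nat -> Bool)

apply :: Fn -> Fn -> Fn -> Nat -> Bool
apply (Fn f) = f

-- Neither definition refers to itself or to the other by name.
isOddF :: Fn -> Fn -> Nat -> Bool
isOddF _ _ Zero = False
isOddF o e (Suc n) = apply e o e n
{-# NOINLINE isOddF #-}

isEvenF :: Fn -> Fn -> Nat -> Bool
isEvenF _ _ Zero = True
isEvenF o e (Suc n) = apply o o e n
{-# NOINLINE isEvenF #-}

-- Tie the knot once, at the call site.
isOdd :: Nat -> Bool
isOdd = isOddF (Fn isOddF) (Fn isEvenF)

isEven :: Nat -> Bool
isEven = isEvenF (Fn isOddF) (Fn isEvenF)
```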
5 Converting Algorithms to the λ-calculus
We now have all ingredients ready for converting complete programs. The last
step to be made is combining everything into a single λ-expression. For example,
if we take the add 1 1 example from above, and substitute all macros, we obtain:
(λadd . add add ((λn z s.s n)(λz s. z)) ((λn z s.s n) (λz s. z)))
(λadd n m . n m (λn . (λn z s.s n) (add add n m)))
Using normal order (outermost) reduction this reduces to:
λz s. s (λz s. s (λz s. z))
which indeed represents the desired value 2. We can improve the readability by
introducing explicit names for zero and suc by abstracting out their definitions:
(λzero suc .
  (λadd .
    add add (suc zero) (suc zero))
  (λadd n m . n m (λn . suc (add add n m))))
(λz s . z) (λn z s . s n)
Here we applied a kind of inverted λ-lifting. We have used smart indentation
to make the expression more readable. Note the nesting in this definition: the
definition of add is inside the scope of the variables suc and zero, because its
definition depends on their definitions. In this way the macro reference Suc in
the definition of add can be replaced by a variable suc.
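The closed term for adding one to one can be run with a very small evaluator, sketched here in Haskell. This is not the interpreter of [6]: it is a hypothetical environment-based evaluator that relies on Haskell's laziness and only evaluates as far as needed to observe a result, instead of computing a full normal form. The constructor `Num` exists only so that a Scott numeral can be decoded into an `Int`.

```haskell
import Data.Maybe (fromMaybe)

data Term = Var String | Lam String Term | App Term Term

-- Semantic values: functions, plus integers used only to observe results.
data Value = Fun (Value -> Value) | Num Int

eval :: [(String, Value)] -> Term -> Value
eval env (Var x)   = fromMaybe (error ("unbound variable " ++ x)) (lookup x env)
eval env (Lam x b) = Fun (\arg -> eval ((x, arg) : env) b)
eval env (App f a) = apply (eval env f) (eval env a)

apply :: Value -> Value -> Value
apply (Fun f) a = f a
apply (Num _) _ = error "a number was applied as a function"

-- Helpers for multi-argument abstraction and application.
lams :: [String] -> Term -> Term
lams xs body = foldr Lam body xs

apps :: [Term] -> Term
apps = foldl1 App

v :: String -> Term
v = Var

zeroT, sucT, addT, onePlusOne :: Term
zeroT = lams ["z", "s"] (v "z")
sucT  = lams ["n", "z", "s"] (apps [v "s", v "n"])
addT  = lams ["add", "n", "m"]
          (apps [v "n", v "m", Lam "n" (App sucT (apps [v "add", v "add", v "n", v "m"]))])
-- (λadd . add add (Suc Zero) (Suc Zero)) add, with all macros substituted.
onePlusOne = App (Lam "add" (apps [v "add", v "add", one, one])) addT
  where one = App sucT zeroT

-- Decode a Scott numeral by supplying observable continuations.
toInt :: Value -> Int
toInt n = case apply (apply n (Num 0)) (Fun (\m -> Num (1 + toInt m))) of
  Num k -> k
  Fun _ -> error "not a numeral"
```

Evaluating `toInt (eval [] onePlusOne)` decodes the resulting numeral to 2.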
As another example, the right hand side of the Haskell function:
main = isOdd (Suc (Suc (Suc Zero)))
can be written as:
(λisOdd isEven . isOdd isOdd isEven (Suc (Suc (Suc Zero))) ) isOdd isEven
and after substituting all macro definitions and abstracting out definitions:
(λtrue false zero suc .
  (λisOdd isEven .
    isOdd isOdd isEven (suc (suc (suc zero))))
  (λisOdd isEven n . n false (λn . isEven isOdd isEven n))
  (λisOdd isEven n . n true (λn . isOdd isOdd isEven n)))
(λt f . t) (λt f . f) (λz s . z) (λn z s . s n)
which reduces to:
λt f . t
This shows that 3 is indeed an odd number.
5.1 Formalizing the Conversion
Above we mentioned the operation of abstracting out definitions. Here we make
this more precise. The conversion of a program into a closed λ-expression pro-
ceeds in a number of steps:
1. Remove all syntactic sugar like ZF-expressions (list comprehensions) and where and let expressions.
2. Eliminate algebraic data types by converting them to their Scott encoding.
3. Eliminate pattern-based function definitions by using the Scott encoding.
4. Remove (mutual) recursion by the introduction of extra variables.
5. Make a dependency sort of all functions, resulting in an ordered collection of
sets. So the first set contains functions that do not depend on other functions
(e.g. the Scott encoded ADT’s). The second set contains functions that only
depend on functions in the first set, etc. We can do this because all possible
cycles are already removed in the previous step.
6. Construct the resulting λ-expression by nesting the definitions from the dif-
ferent dependency sets. The outermost expression consists of an application
of a λ-expression with as variables the names of the functions from the first
dependency set and as arguments the λ-definitions of these functions. The
body of this expression is obtained by repeating this procedure for the
remaining dependency sets. The innermost expression is the main expression.
The result of this process is:
(λfunction_names_first_set .
(λfunction_names_second_set .
...
(λfunction_names_last_set .
main_expression)
function_definitions_last_set)
...
function_definitions_second_set)
function_definitions_first_set
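The dependency sort of step 5 can be sketched in Haskell as a repeated partition of the definitions into those whose dependencies have all been placed and the rest. The function `layers` and the example table `sieveDeps` are hypothetical names; the table lists, for a small program with functions zero, suc, cons, rem, nats, sieve and main, which other functions each one refers to after recursion has been removed.

```haskell
import Data.List (partition)

-- Group definitions into layers: each layer depends only on earlier layers.
-- Input: (name, names it depends on). Assumes all cycles were removed before.
layers :: [(String, [String])] -> [[String]]
layers = go []
  where
    go _ [] = []
    go done defs =
      let (ready, rest) = partition (all (`elem` done) . snd) defs
          names = map fst ready
      in if null ready
           then error "cyclic dependency: remove recursion first"
           else names : go (done ++ names) rest

-- Example dependency table (self-references already eliminated).
sieveDeps :: [(String, [String])]
sieveDeps =
  [ ("zero", []), ("suc", []), ("cons", [])
  , ("rem", ["cons", "zero"])
  , ("nats", ["cons", "suc"])
  , ("sieve", ["cons", "rem"])
  , ("main", ["sieve", "nats", "suc", "zero"])
  ]
```

For this table, `layers sieveDeps` yields the four sets zero/suc/cons, rem/nats, sieve and main, in that order, which is the nesting order used in step 6.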
6 Eratosthenes’ Prime Sieve as a Single λ-expression
As a last, more convincing example, we convert the following Haskell version of
the Eratosthenes prime sieve algorithm to a single λ-expression:
data Nat = Zero | Suc Nat
data Inflist t = Cons t (Inflist t)
nats n = Cons n (nats (Suc n))
sieve (Cons Zero xs) = sieve xs
sieve (Cons (Suc k) xs) = Cons (Suc k) (sieve (rem k k xs))
rem p Zero (Cons x xs) = Cons Zero (rem p p xs)
rem p (Suc k) (Cons x xs) = Cons x (rem p k xs)
main = sieve (nats (Suc (Suc Zero)))
Here we use infinite lists for the storage of numbers and the resulting primes.
sieve filters out the zeros in a list and calls rem to set multiples of prime numbers
to zero. Applying the first four steps of the conversion procedure results in:
Zero ≡ λz s . z
Suc ≡ λn z s . s n
Cons ≡ λx xs c . c x xs
nats ≡ λnats n . Cons n (nats nats (Suc n))
sieve ≡ λsieve ls . ls (λx xs . x (sieve sieve xs)
          (λk . Cons x (sieve sieve (rem rem k k xs))))
rem ≡ λrem p k ls . ls (λx xs . k (Cons Zero (rem rem p p xs))
          (λk . Cons x (rem rem p k xs)))
main ≡ sieve sieve (nats nats (Suc (Suc Zero)))
The dependency sort results in:
[{zero,suc,cons},{rem,nats},{sieve},{main}]
Putting everything into a single λ-expression this becomes:
(λzero suc cons .
(λrem nats .
(λsieve .
sieve sieve (nats nats (suc (suc zero))))
sieve)
rem nats)
Zero Suc Cons
And after substituting the λ-definitions for all macros:
(λzero suc cons .
(λrem nats .
(λsieve .
sieve sieve (nats nats (suc (suc zero))))
(λsieve ls . ls (λx xs . x (sieve sieve xs)
(λk . cons x (sieve sieve (rem rem k k xs))))))
(λrem p k ls . ls (λx xs . k (cons zero (rem rem p p xs))
(λk . cons x (rem rem p k xs))))
(λnats n . cons n (nats nats (suc n))))
(λz s . z) (λn z s . s n) (λx xs c . c x xs)
Which reduces to an infinite λ-expression starting with:
λc. c (λz s. s (λz s. s (λz s. z))) (λc. c (λz s. s (λz s. s (λz s. s (λz s. z))))
(λc. c (λz s. s (λz s. s (λz s. s (λz s. s (λz s. s (λz s. z)))))) ...
One can recognize the start of a list containing: 2, 3 and 5. Using single character
names the expression reduces to a 143 character length definition:
(λzsc.(λrf.(λe.ee(ff(s(sz))))(λel.lλht.h(eet)λk.ch(ee(rrkkt))))
(λrpkl.lλht.k(cz(rrppt))λk.ch(rrpkt))(λfn.cn(ff(sn))))(λzs.z)(λnzs.sn)(λhtc.cht)
This λ-term can also be considered as a constructive definition of what prime
numbers are. An even shorter definition of a prime number generator in the
λ-calculus can be found in Tromp [11].
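To check the sieve itself independently of the encoding, the Haskell source program can be run directly. The sketch below keeps the ordinary data types and adds two hypothetical helpers, `toInt` and `firstPrimes`, to observe a finite prefix of the infinite result; `rem` is renamed to `remove` only to avoid a clash with the Prelude function of the same name.

```haskell
data Nat = Zero | Suc Nat
data Inflist t = Cons t (Inflist t)

nats :: Nat -> Inflist Nat
nats n = Cons n (nats (Suc n))

-- Skip zeros; for a prime Suc k, zero out every (k+1)-th following element.
sieve :: Inflist Nat -> Inflist Nat
sieve (Cons Zero xs) = sieve xs
sieve (Cons (Suc k) xs) = Cons (Suc k) (sieve (remove k k xs))

-- Count down from p; when the counter hits Zero, overwrite with Zero and reset.
remove :: Nat -> Nat -> Inflist Nat -> Inflist Nat
remove p Zero (Cons _ xs) = Cons Zero (remove p p xs)
remove p (Suc k) (Cons x xs) = Cons x (remove p k xs)

-- Observation helpers (not part of the original program).
toInt :: Nat -> Int
toInt Zero = 0
toInt (Suc n) = 1 + toInt n

firstPrimes :: Int -> [Int]
firstPrimes n = go n (sieve (nats (Suc (Suc Zero))))
  where
    go 0 _ = []
    go k (Cons x xs) = toInt x : go (k - 1) xs
```

`firstPrimes 5` yields 2, 3, 5, 7 and 11, in agreement with the start of the reduced λ-expression.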
7 Discussion
We already indicated that the Scott encoding just combines the techniques used
for encoding booleans and tuples in the Church encoding as described in standard
λ-calculus text books and courses. The Scott and Church encodings only differ
for recursive types. A Church encoded type just defines how functions should be
folded over an element of the type. A fold can be characterized as a function that
replaces constructors by functions. The Scott encoding just packages information
into a closure. Recursiveness of the type is not visible at this level. Of course,
this is also the case for ADT’s in functional languages, where recursiveness is
only visible at the type level and not at the element level.
The representation achieved using the Scott encoding is equivalent to that of
ADT definitions in modern functional programming languages and allows for a
similar realization of functions defined on ADT’s. Also the complexity (efficiency)
of these functions is similar to their equivalents in functional programming languages.
This is in contrast to their counterparts using the Church encoding, which
sometimes have a much worse complexity. Therefore, from a programmer’s
perspective the Scott encoding is better than the Church encoding.
An interesting question now is: Why is the Scott encoding relatively unknown
and almost never mentioned in textbooks on the λ-calculus? The encoding is sim-
pler than the Church encoding and allows for a straightforward implementation
of functions acting on data types. Of course, the way ADT’s are represented in
modern functional programming languages is rather new and dates from lan-
guages like ISWIM [7], HOPE [3] and SASL [13, 12] and this was long after the
Church numerals were invented. Furthermore, ADT’s are needed and defined by
programmers, who needed an efficient way to define new types, which is rather
irrelevant for mathematicians and logicians studying the λ-calculus.
In [6] it is shown that this representation of functional programs can be
used to construct very efficient, simple and small interpreters for lazy functional
programming languages. These interpreters only have to implement β-reduction
and no constructors nor pattern matching.
Altogether, we argue that the Scott encoding also should have its place in
λ-calculus textbooks and courses, and that in λ-calculus courses for computer scientists
this encoding should have preference over the Church encoding.
8 Conclusions
In this paper we showed how the λ-calculus can be used to express algorithms
and Algebraic Data Types in a way that is close to the way this is done in modern
functional programming languages. To achieve this, we used a rather unfamiliar
encoding of ADT’s attributed to Scott. We showed that this encoding can be
considered as a logical combination of the way enumerations (like booleans)
and containers (like tuples) are normally encoded in the λ-calculus. The encoding
differs from the Church encoding and the connecting element between them is
the fold function.
For recursive functions we did not use the standard fixed-point combinators,
but instead used a simple technique where an expression representing a recursive
function is given (a reference to) itself as an argument. In this way the recursion
is made more explicit and this also results in a more efficient implementation
using fewer reduction steps.
We also sketched a systematic method for converting Haskell or Clean like
programs to closed λ-expressions.
Altogether we have shown that it is possible to express a functional program
in a concise way as a λ-expression and demonstrated that the λ-calculus is indeed
a universal programming language in a convincing way.
References
1. H. Barendregt. The lambda calculus, its syntax and semantics (revised edition),
volume 103 of Studies in Logic. North-Holland, 1984.
2. H. Barendregt. The impact of the lambda calculus in logic and computer science.
The Bulletin of Symbolic Logic, 3(2):181–215, 1997.
3. R. M. Burstall, D. B. MacQueen, and D. T. Sannella. Hope: An experimental
applicative language, 1980.
4. H. Curry, J. Hindley, and J. Seldin. Combinatory Logic, volume 2. North-Holland
Publishing Company, 1972.
5. R. Hinze. Theoretical pearl: Church numerals, twice! Journal of Functional
Programming, 15(1):1–13, 2005.
6. J. Jansen, P. Koopman, and R. Plasmeijer. Efficient interpretation by transform-
ing data types and patterns to functions. In H. Nilsson, editor, Revised Selected
Papers of the 7th Trends in Functional Programming ’06, volume 7, pages 73–90,
Nottingham, UK, 2006. Intellect Books.
7. P. J. Landin. The next 700 programming languages. Commun. ACM, 9(3):157–166,
1966.
8. T. A. Mogensen. Efficient Self-Interpretation in Lambda Calculus. Journal of
Functional Programming, 2:345–364, 1994.
9. J. Steensgaard-Madsen. Typed representation of Objects by Functions. ACM
Transactions on Programming Languages and Systems, 11(1):67–89, Jan. 1989.
10. A. Stump. Directly reflective meta-programming. Journal of Higher Order and
Symbolic Computation, 2008.
11. J. Tromp. John’s lambda calculus and combinatory logic playground, 2012.
http://homepages.cwi.nl/~tromp/cl/cl.html.
12. D. Turner. Some History of Functional Programming Languages, 2012. Invited
talk, Trends in Functional Programming 2012, St. Andrews, United Kingdom,
TFP 2012.
13. D. A. Turner. A new implementation technique for applicative languages. Softw.,
Pract. Exper., 9(1):31–49, 1979.
... Inductive datatypes can be encoded using Scott encodings [23,15], which we explain in Section 4.3. ...
... We use Scott encodings [23,15] to encode inductive types and its constructors. Scott encodings represent the matches on the inductive type. ...
Preprint
We provide a plugin extracting Coq functions of simple polymorphic types to the (untyped) call-by-value λ\lambda-calculus L. The plugin is implemented in the MetaCoq framework and entirely written in Coq. We provide Ltac tactics to automatically verify the extracted terms w.r.t a logical relation connecting Coq functions with correct extractions and time bounds, essentially performing a certifying translation and running time validation. We provide three case studies: A universal L-term obtained as extraction from the Coq definition of a step-indexed self-interpreter for \L, a many-reduction from solvability of Diophantine equations to the halting problem of L, and a polynomial-time simulation of Turing machines in L.
... While L simplifies the full λ-calculus, it inherits powerful techniques developed for the λ-calculus: Procedural recursion can be expressed with self-application, inductive data types can be expressed with Scott encodings [15,18], and program verification can be based on onestep reduction, the accompanying equivalence closure, and the connecting Church-Rosser property. ...
... Seen as a programming language, L is a language where all values are procedures. We now show how procedures can encode data using a scheme known as Scott encoding [15,18]. The Scott encoding of a value is a higher-order procedure providing the match construct for the value. ...
Article
Full-text available
We formalise a (weak) call-by-value λ\lambda -calculus we call L in the constructive type theory of Coq and study it as a minimal functional programming language and as a model of computation. We show key results including (1) semantic properties of procedures are undecidable, (2) the class of total procedures is not recognisable, (3) a class is decidable if it is recognisable, corecognisable, and logically decidable, and (4) a class is recognisable if and only if it is enumerable. Most of the results require a step-indexed self-interpreter. All results are verified formally and constructively, which is the challenge of the project. The verification techniques we use for procedures will apply to call-by-value functional programming languages formalised in Coq in general.
... With this transformation, the code is now prepared for the subsequent step, where data types and simple case expressions are further converted into functions. 5) Encoding Data Types: Data types in LambdaM are encoded using the Scott encoding [8]. In this encoding scheme, each data constructor and simple case expression is transformed into an ordinary function. ...
Preprint
Full-text available
The Massimult project aims to design and implement an innovative CPU architecture based on combinator reduction with a novel combinator base and a new abstract machine. The evaluation of programs within this architecture is inherently highly parallel and localized, allowing for faster computation, reduced energy consumption, improved scalability, enhanced reliability, and increased resistance to attacks. In this paper, we introduce the machine language LambdaM, detail its compilation into KVY assembler code, and describe the abstract machine Matrima. The best part of Matrima is its ability to exploit inherent parallelism and locality in combinator reduction, leading to significantly faster computations with lower energy consumption, scalability across multiple processors, and enhanced security against various types of attacks. Matrima can be simulated as a software virtual machine and is intended for future hardware implementation.
... We follow their approach in employing Scott's encoding [28,34] to incorporate inductive types and a fixed-point combinator ρ for recursion. For instance the Scott encoding of booleans is defined as ε bool true = λx y.x and ε bool false = λx y.y, or λλ1 and λλ0 using de Bruijn indices which we will avoid for examples. ...
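The boolean encoding quoted above can be tried out directly with Python lambdas. This is only a minimal sketch: the helper names `if_then_else`, `to_bool`, and `not_` are illustrative assumptions and do not come from the cited work. Note also that Python evaluates both branches eagerly, unlike a lazy λ-calculus reducer.

```python
# Scott-encoded booleans: a boolean is a function that selects one of two arguments.
true = lambda x: lambda y: x    # λx y. x
false = lambda x: lambda y: y   # λx y. y

def if_then_else(b, then_value, else_value):
    """Case analysis is plain application of the encoded boolean (hypothetical helper)."""
    return b(then_value)(else_value)

def to_bool(b):
    """Decode into a native Python bool, for inspection only (hypothetical helper)."""
    return b(True)(False)

# Negation swaps the two alternatives that the boolean selects from.
not_ = lambda b: lambda x: lambda y: b(y)(x)

print(if_then_else(true, "then", "else"))   # prints "then"
print(to_bool(not_(true)))                  # prints False
```

Because the encoded value already is its own case expression, no separate pattern-matching construct is needed.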
Article
The MetaCoq project aims to provide a certified meta-programming environment in Coq. It builds on Template-Coq, a plugin for Coq originally implemented by Malecha (Extensible proof engineering in intensional type theory, Harvard University, http://gmalecha.github.io/publication/2015/02/01/extensible-proof-engineering-in-intensional-type-theory.html, 2014), which provided a reifier for Coq terms and global declarations, as represented in the Coq kernel, as well as a denotation command. Recently, it was used in the CertiCoq certified compiler project (Anand et al., in: CoqPL, Paris, France, http://conf.researchr.org/event/CoqPL-2017/main-certicoq-a-verified-compiler-for-coq, 2017), as its front-end language, to derive parametricity properties (Anand and Morrisett, in: CoqPL’18, Los Angeles, CA, USA, 2018). However, the syntax lacked semantics, be it typing semantics or operational semantics, which should reflect, as formal specifications in Coq, the semantics of Coq’s type theory itself. The tool was also rather bare bones, providing only rudimentary quoting and unquoting commands. We generalize it to handle the entire polymorphic calculus of cumulative inductive constructions, as implemented by Coq, including the kernel’s declaration structures for definitions and inductives, and implement a monad for general manipulation of Coq’s logical environment. We demonstrate how this setup allows Coq users to define many kinds of general purpose plugins, whose correctness can be readily proved in the system itself, and that can be run efficiently after extraction. We give a few examples of implemented plugins, including a parametricity translation and a certified extraction to call-by-value λ-calculus. We also advocate the use of MetaCoq as a foundation for higher-level tools.
... Lists are represented as a recursive abstract datatype using the cons function as originally used in lambda calculus to represent lists using Church Encoding [17]. The syntax [a₁, a₂, . . . ...
Thesis
The Semantic Web is an emerging component of the set of technologies that will be known as Web 3.0 in the future. With the large changes it brings to how information is stored and represented to users, there is a need to re-evaluate how this information can be queried. Specifically, there is a need for Natural Language Interfaces that allow users to easily query for information on the Semantic Web. While there has been previous work in this area, existing solutions suffer from the problem that they do not support prepositional phrases in queries (e.g., “in 1958” or “with a key”). To address this, we improve on an existing semantics for event-based triplestores that supports prepositional phrases and demonstrate a novel method of handling the word “by”, treating it directly as a preposition in queries. We then show how this new semantics can be integrated with a parser constructed as an executable attribute grammar to create a highly modular and extensible Natural Language Interface to the Semantic Web that supports prepositional phrases in queries.
... The transformation of any of these number encodings to one of the other encodings is given by NumToNum. This uniform transformation is a generalisation of the transformations between Church and Scott numbers in [22]. The context determines the encodings n and m. ...
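To make the idea of converting between number encodings concrete, here is a hedged sketch of the two directions between Church and Scott numerals in Python. It is not the NumToNum transformation of the citing paper; all function names are assumptions chosen for illustration.

```python
# Church numerals: n = λf x. f (f (... x)), i.e. n-fold iteration.
c_zero = lambda f: lambda x: x
c_succ = lambda n: lambda f: lambda x: f(n(f)(x))

# Scott numerals: zero = λz s. z, succ n = λz s. s n, i.e. a case distinction.
s_zero = lambda z: lambda s: z
s_succ = lambda n: lambda z: lambda s: s(n)

def church_to_scott(n):
    """Iterate the Scott successor n times, starting from Scott zero."""
    return n(s_succ)(s_zero)

def scott_to_church(n):
    """Explicit recursion on the predecessor; a Scott numeral cannot iterate by itself."""
    return n(c_zero)(lambda pred: c_succ(scott_to_church(pred)))

def church_to_int(n):
    """Decode a Church numeral to a Python int (inspection helper)."""
    return n(lambda k: k + 1)(0)

def scott_to_int(n):
    """Decode a Scott numeral by repeatedly taking the predecessor (inspection helper)."""
    count = 0
    while True:
        pred = n(None)(lambda p: p)   # None signals zero, otherwise the predecessor
        if pred is None:
            return count
        count += 1
        n = pred

three = c_succ(c_succ(c_succ(c_zero)))
print(scott_to_int(church_to_scott(three)))                     # prints 3
print(church_to_int(scott_to_church(church_to_scott(three))))   # prints 3
```

The asymmetry is the point of the sketch: going from Church to Scott reuses the built-in iteration, while the reverse direction needs recursion supplied from outside the numeral.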
Conference Paper
From the λ-calculus it is known how to represent (recursive) data structures by ordinary λ-terms. Based on this idea one can represent algebraic data types in a functional programming language by higher-order functions. Using this encoding we only have to implement functions to achieve an implementation of the functional language with data structures. In this paper we compare the famous Church encoding of data types with the less familiar Scott and Parigot encodings. We show that one can use the encoding of data types by functions in a Hindley-Milner typed language by adding a single constructor for each data type. In an untyped context, like an efficient implementation, this constructor can be omitted. By collecting the basic operations of a data type in a type constructor class and providing instances for the various encodings, these encodings can co-exist in a single program. By changing the instance of this class we can execute the same algorithm in a different encoding. This makes it easier to compare the encodings with each other. We show that in the Church encoding selectors of constructors yielding the recursive type, like the tail of a list, have an undesirable strictness in the spine of the data structure. The Scott and Parigot encodings do not hamper lazy evaluation in any way. The evaluation of the recursive spine by the Church encoding makes the complexity of these destructors linear time. The same destructors in the Scott and the Parigot encoding require only constant time. Moreover, the Church encoding has problems with sharing reduction results of selectors. The Parigot encoding is a combination of the Scott and Church encoding. Hence we might expect that it combines the best of both worlds, but in practice it does not offer any advantage over the Scott encoding.
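The difference in cost between the two `tail` selectors can be illustrated with a small Python sketch. The names below are assumptions for illustration, and native Python tuples stand in for encoded pairs to keep the example short; Python's eager evaluation also means this shows the shape of the terms rather than the laziness behaviour discussed in the abstract.

```python
# Scott lists: a list is its own case expression.
s_nil = lambda on_nil: lambda on_cons: on_nil
s_cons = lambda h: lambda t: lambda on_nil: lambda on_cons: on_cons(h)(t)

# Scott tail: one application, independent of the list length.
s_tail = lambda xs: xs(s_nil)(lambda h: lambda t: t)

# Church lists: a list is its own right fold.
c_nil = lambda f: lambda z: z
c_cons = lambda h: lambda t: lambda f: lambda z: f(h)(t(f)(z))

def c_tail(xs):
    """Church tail: fold over the whole spine, carrying (tail so far, list so far)."""
    step = lambda h: lambda p: (p[1], c_cons(h)(p[1]))
    return xs(step)((c_nil, c_nil))[0]

def s_to_list(xs):
    """Decode a Scott list into a Python list (inspection helper)."""
    out = []
    while True:
        cell = xs(None)(lambda h: lambda t: (h, t))
        if cell is None:
            return out
        out.append(cell[0])
        xs = cell[1]

def c_to_list(xs):
    """Decode a Church list into a Python list (inspection helper)."""
    return xs(lambda h: lambda acc: [h] + acc)([])

s_list = s_cons(1)(s_cons(2)(s_cons(3)(s_nil)))
c_list = c_cons(1)(c_cons(2)(c_cons(3)(c_nil)))
print(s_to_list(s_tail(s_list)))   # prints [2, 3]
print(c_to_list(c_tail(c_list)))   # prints [2, 3]
```

`s_tail` merely hands back the stored tail, whereas `c_tail` has to rebuild the list element by element, which is where the linear cost comes from.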
Chapter
The Internet of Things, IoT, brings us large amounts of connected computing devices that are equipped with dedicated sensors and actuators. These computing devices are typically driven by a cheap microprocessor system with a relatively slow processor and a very limited amount of memory. Due to the special input-output capabilities of IoT devices and their connections it is very attractive to execute (parts of) programs on these microcomputers.
Conference Paper
Generalized algebraic data types (GADT) have been notoriously difficult to implement correctly in Scala. Both major Scala compilers, Scalac and Dotty, are currently known to have type soundness holes related to them. In particular, covariant GADTs have exposed paradoxes due to Scala's inheritance model. We informally explore foundations for GADTs within Scala's core type system, to guide a principled understanding and implementation of GADTs in Scala.
Conference Paper
We formalise a weak call-by-value λ-calculus we call L in the constructive type theory of Coq and study it as a minimal functional programming language and as a model of computation. We show key results including (1) semantic properties of procedures are undecidable, (2) the class of total procedures is not recognisable, (3) a class is decidable if it is recognisable, corecognisable, and logically decidable, and (4) a class is recognisable if and only if it is enumerable. Most of the results require a step-indexed self-interpreter. All results are verified formally and constructively, which is the challenge of the project. The verification techniques we use for procedures will apply to call-by-value functional programming languages formalised in Coq in general.
Article
This pearl explains Church numerals, twice. The first explanation links Church numerals to Peano numerals via the well-known encoding of data types in the polymorphic λ-calculus. This view suggests that Church numerals are folds in disguise. The second explanation, which is more elaborate, but also more insightful, derives Church numerals from first principles, that is, from an algebraic specification of addition and multiplication. Additionally, we illustrate the use of the parametricity theorem by proving exponentiation as reverse application correct.
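The two views mentioned in this abstract, numerals as folds and exponentiation as reverse application, can be sketched in a few Python lambdas. This is an illustrative sketch, not code from the pearl; `church`, `to_int`, and `power` are assumed names.

```python
# Church numerals: the numeral n applies a function n times (a fold over Peano naturals).
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))

add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))   # m + n
mul = lambda m: lambda n: lambda f: m(n(f))                   # m * n
power = lambda m: lambda n: n(m)                              # m ** n, reverse application

def church(k):
    """Build the Church numeral for a non-negative Python int (helper)."""
    n = zero
    for _ in range(k):
        n = succ(n)
    return n

def to_int(n):
    """Decode by folding with (+1) starting from 0 (helper)."""
    return n(lambda k: k + 1)(0)

two, three = church(2), church(3)
print(to_int(add(two)(three)))     # prints 5
print(to_int(mul(two)(three)))     # prints 6
print(to_int(power(two)(three)))   # prints 8
```

In `power`, the exponent is applied to the base: iterating "apply f m times" n times yields m to the n-th power applications of f.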
Conference Paper
In this paper we present the stepwise construction of an efficient interpreter for lazy functional programming languages like Haskell and Clean. The interpreter is realized by first transforming the source language to the intermediate language SAPL (Simple Application Programming Language) consisting of pure functions only. During this transformation algebraic data types and pattern-based function definitions are mapped to functions. This eliminates the need for constructs for Algebraic Data Types and Pattern Matching in SAPL. For SAPL a simple and elegant interpreter is constructed using straightforward graph reduction techniques. This interpreter can be considered as a prototype implementation of lazy functional programming languages. Using abstract interpretation techniques the interpreter is optimised. The performance of the resulting interpreter turns out to be very competitive in a comparison with other interpreters like Hugs, Helium, GHCi and Amanda for a number of benchmarks. For some benchmarks the interpreter even rivals the speed of the GHC compiler. Due to its simplicity and the stepwise construction this implementation is an ideal subject for introductory courses on implementation aspects of lazy functional programming languages.
Book
A, by 1984, reasonably complete survey of the untyped lambda calculus.
Article
A family of unimplemented computing languages is described that is intended to span differences of application area by a unified framework. This framework dictates the rules about the uses of user-coined names, and the conventions about characterizing functional relationships. Within this framework the design of a specific language splits into two independent parts. One is the choice of written appearances of programs (or more generally, their physical representation). The other is the choice of the abstract entities (such as numbers, character-strings, list of them, functional relations among them) that can be referred to in the language. The system is biased towards “expressions” rather than “statements.” It includes a nonprocedural (purely functional) subsystem that aims to expand the class of users' needs that can be met by a single print-instruction, without sacrificing the important properties that make conventional right-hand-side expressions easy to construct and understand.
Article
A systematic representation of objects grouped into types by constructions similar to the composition of sets in mathematics is proposed. The representation is by lambda expressions, which supports the representation of objects from function spaces. The representation is related to a rather conventional language of type descriptions in a way that is believed to be new. Ordinary control-expressions (i.e., case- and let-expressions) are derived from the proposed representation.
Article
Existing meta-programming languages operate on encodings of programs as data. This paper presents a new meta-programming language, based on an untyped lambda calculus, in which structurally reflective programming is supported directly, without any encoding. The language features call-by-value and call-by-name lambda abstractions, as well as novel reflective features enabling the intensional manipulation of arbitrary program terms. The language is scope safe, in the sense that variables can neither be captured nor escape their scopes. The expressiveness of the language is demonstrated by showing how to implement quotation and evaluation operations, as proposed by Wand. The language’s utility for meta-programming is further demonstrated through additional representative examples. A prototype implementation is described and evaluated.
Article
It is shown how by using results from combinatory logic an applicative language, such as LISP, can be translated into a form from which all bound variables have been removed. A machine is described which can efficiently execute the resulting code. This implementation is compared with a conventional interpreter and found to have a number of advantages. Of these the most important is that programs which exploit higher order functions to achieve great compactness of expression are executed much more efficiently.