arXiv:1812.08079v1 [cs.LO] 14 Dec 2018
Theory Presentation Combinators
Jacques Carette
and
Russell O’Connor
Department of Computing and Software, McMaster University
Hamilton, Ontario, Canada
carette@mcmaster.ca,roconnor@theorem.ca.
To build a scalable library of mathematics, we need a method which takes advantage of the
inherent structure of mathematical theories. Here we argue that theory presentation combinators
are a helpful tool towards that quest. We motivate our choice of combinators, and give them
precise semantics. We observe that the category of contexts plays a fundamental rôle (explicitly
or otherwise) in all such developments, so we will examine its structure carefully. In particular,
as it is a fibered category, cartesian liftings are pervasive. While our original work was based on
experience and intuition, this work is firmly grounded in categorical semantics, and has resulted
in a much cleaner and more powerful set of theory presentation combinators.
1. INTRODUCTION
A mechanized mathematics system, to be useful, must possess a large library of
mathematical knowledge, on top of sound foundations. While sound foundations
contain many interesting intellectual challenges, building a large library seems a
daunting task simply because of its sheer volume. However, as has been well-
documented [CFJ+11, CK11, GS10], there is a tremendous amount of redundancy
in existing libraries. Thus there is some hope that by designing the “right” meta-
language, guided by parsimony principles [Vel07], we can reduce the effort needed
to build a library of mathematics.
Our aim is to build tools that allow library developers to take advantage of com-
monalities in mathematics so as to build a large, rich library for end-users, whilst
expending much less actual development effort than in the past. In other words,
we continue with our approach of developing High Level Theories [CF08] through
building a network of theories, by putting our previous experiments [CFJ+11] on a
sound theoretical basis.
1.1 The Problem
The problem we wish to solve is easy to state: we want to shorten the development
time of large mathematical libraries. But why would mathematical libraries be any
different than other software, where the quest for time-saving techniques has been
long but vain [Bro95]? Because we have known since Whitehead’s 1898 text “A
treatise on universal algebra” [Whi98] that significant parts of mathematics have
a lot of structure, structure which we can take advantage of. The flat list of 342
structures gathered by Peter Jipsen [Jip] is both impressively large, and could easily
be greatly extended. Another beautiful source of structure in a theory graph is that
of modal logics; John Halleck’s web pages on Logic System Interrelationships [Hal]
Journal of Formal Reasoning Vol. ?, No. ?, Month Year, Pages 1–30.
are quite eye-opening.
[Figure 1: a chain of theories: Magma, Semigroup, Pointed Semigroup, Monoid, Group, Abelian Group.]
Fig. 1. Theories
Figure 1 shows what we are talking about: The presen-
tation of the theory Semigroup strictly contains that of the
theory Magma, and so on1. It is therefore pointless for a
human to enter this information multiple times – if it is ac-
tually possible to take advantage of this structure. Strict
inclusions at the level of presentations are only part of the
structure: for example, we know that a Ring actually con-
tains two isomorphic copies of Monoid, where the isomor-
phism is given by a simple renaming. There are further
commonalities to take advantage of, which we will explain
later in this paper.
Another question that arises naturally: is there sufficient
structure outside of the traditional realm of universal alge-
bra, in other words, beyond single-sorted equational theo-
ries, to make it worthwhile to develop significant infrastruc-
ture to leverage that structure? Luckily for us, it turns out
that there is.
We will also require tools to selectively hide (and reveal) this structure from end-
users. This latter requirement stems from the observation [CF08] that in practice,
when mathematicians are using theories rather than developing new ones, they
tend to work in a rather “flat” namespace. An analogy: someone working in Group
Theory will unconsciously assume the availability of all concepts from a standard
textbook, with their usual names and meanings. As their goal is to get some work
done, whatever structure system builders have decided to use to construct their
system should not leak into the application domain. They may not be aware of the
existence of pointed semigroups, nor should that awareness be forced upon them.
On the other hand, some application domains do rely on the “structure of theories”,
so we cannot unilaterally hide this structure from all users either.
1.2 Contributions
We previously explained our core ideas in [CO12], where a variant of the category of
contexts was presented as our setting for theory presentations. There we presented a
simple term language for building theories, along with two (compatible) categorical
semantics – one in terms of objects, another in terms of arrows. By using “tiny
theories”, this allowed reuse and modularity. We emphasized names, as the objects
we are dealing with are syntactic and ultimately meant for human consumption.
We also emphasized arrows: while this is categorically obvious, nevertheless the
current literature on this topic is very object-centric. Put another way: most of
the emphasis in other work is on operational issues, or evolved from operational
thinking, while our approach is unabashedly denotational, whilst still taking names
seriously.
We leverage that basis here, and extend our work in multiple ways2. First, we
1We are not concerned with models, whose inclusions go in the opposite direction.
2We provide a summary of the contributions here to guide the reader who wishes to focus on the
new ideas, even though much of the terminology used in this paragraph is only defined later.
enhanced contexts with definitions. We treat these as first-class citizens, so that
names introduced by definitions are dealt with in the same way as all other names.
The categorical semantics is extended to a fibration of generalized extensions over
contexts. This is not straightforward: taking names seriously prevents us from
having a cloven fibration without a renaming policy. But once this machinery is
in place, this allows us to build presentations by lifting views over extensions, a
very powerful mechanism for defining new presentations. There are obstacles to
taking the “obvious” categorical solutions: for example, having all pullbacks would
require that the underlying type theory have subset types, which is something we
do not want to force. Furthermore, equivalence of terms needs to be checked when
constructing mediating arrows, which in some settings may have implications for
the decidability of typechecking.
While the core ideas of [CO12] remain, the text has been almost completely
rewritten. We have also chosen to discuss related work, and its relationship to ours,
in its own section (8) rather than in the main text.
1.3 Plan of paper
We motivate our work with concrete examples in section 2. Section 3 lays out the
basic (operational) theory, with concrete algorithms. The theoretical foundation
of our work, the fibered category of contexts, is presented in full detail in section 4.
This allows us in section 5 to formalize a language of theory presentation
combinators. We close with some discussion, related work and conclusions in
sections 7–10.
2. MOTIVATION
We review, informally, the motivation for introducing a variety of combinators for
creating new theory presentations from old. We use an informal syntax which
should be readily understandable to anyone with a reasonable background in math-
ematics and type theory; section 5 will give a formal syntax and its semantics. Note
that the “intuitive” combinators that we present here are purely motivational, as
the semantics of some of these turn out to be awkward and/or contrived. We will
thus have to build our formal language (almost) from scratch, based on the seman-
tics we develop in sections 3 and 4. As we go, we will also comment on the problems
which need to be overcome to obtain a reasonable solution.
It is important to remember, throughout this section, that our principal perspec-
tive is that of system builders. Our task is to form a bridge (via software) between
tasks that end-users of a mechanized mathematics system may wish to perform, and
the underlying (semantic) theory concerned. This bridge is necessarily syntactic,
as syntax is the only entity which can be symbolically manipulated by computers.
More importantly, we must respect the syntactic choices of users, even when these
choices are not necessarily semantically relevant.
2.1 Extension
The simplest situation is where the presentation of one theory is included, verbatim,
in another. Concretely, consider Monoid and CommutativeMonoid.
Monoid ≜ {
    U ∶ Type
    (⋅) ∶ (U, U) → U
    e ∶ U
    right identity ∶ ∀x ∶ U. x ⋅ e = x
    left identity ∶ ∀x ∶ U. e ⋅ x = x
    associative ∶ ∀x, y, z ∶ U. (x ⋅ y) ⋅ z = x ⋅ (y ⋅ z)
}
CommutativeMonoid ≜ {
    U ∶ Type
    (⋅) ∶ (U, U) → U
    e ∶ U
    right identity ∶ ∀x ∶ U. x ⋅ e = x
    left identity ∶ ∀x ∶ U. e ⋅ x = x
    associative ∶ ∀x, y, z ∶ U. (x ⋅ y) ⋅ z = x ⋅ (y ⋅ z)
    commutative ∶ ∀x, y ∶ U. x ⋅ y = y ⋅ x
}
As expected, the only difference is that CommutativeMonoid adds a commutative
axiom. Thus, given Monoid, it would be much more economical to define
CommutativeMonoid ≜ Monoid extended by {commutative ∶ ∀x, y ∶ U. x ⋅ y = y ⋅ x}
2.2 Renaming
From an end-user perspective, our CommutativeMonoid has one flaw: such monoids
are frequently written additively rather than multiplicatively. Let us call a commu-
tative monoid written additively an abelian monoid, as we do with groups. Thus it
would be convenient to be able to say
AbelianMonoid ≜ CommutativeMonoid[(⋅) ↦ +, e ↦ 0]
Immediately, one is led to ask: how are AbelianMonoid and CommutativeMonoid
related? Traditionally, these are regarded as equal, for semantic reasons. However,
since we are dealing with presentations, as syntax, we wish to regard them as
isomorphic rather than equal3. In other words, we take a nominal rather than
structural approach, since we are dealing with syntax. While working up to iso-
morphism is a minor inconvenience for the semantics, this enables us to respect
user choices in names.
2.3 Combination
But even with these features, given Group, we would find ourselves writing
CommutativeGroup ≜ Group extended by {commutative ∶ ∀a, b ∶ U. a ⋅ b = b ⋅ a}
which is problematic: we lose the relationship that every commutative group is a
commutative monoid. In other words, we reduce our ability to transport results “for
3Univalent Foundations [Uni13] does not change this, as we can distinguish the two, as presentations.
free” to other theories, and must prove that these results transport, even though
the morphism involved is (essentially) the identity. Thus it is natural to further
extend our language with a facility that expresses this sharing. Taking a cue from
previous work, we might want to say
CommutativeGroup ≜ combine CommutativeMonoid, Group over Monoid
Informally, this can be read as saying that Group and CommutativeMonoid are both
“extensions” of Monoid, and CommutativeGroup is formed by the union (amalga-
mated sum) of those extensions. In other words, by over, we mean to have a
single copy of Monoid, to which we add the extensions necessary for obtaining
CommutativeMonoid and Group. This implicitly assumes that our two Monoid ex-
tensions are meant to be orthogonal, in some suitable sense.
Unfortunately, while this “works” to build a sizeable library (say of the order of
500 concepts) in a fairly economical way, it is nevertheless brittle. Let us examine
why this is the case. It should be clear that by combine, we really mean pushout.
But a pushout is a 5-ary operation on 3 objects and 2 arrows; our syntax gives
the 3 objects and leaves the arrows implicit. In other words, they have to be
inferred. This is a very serious mistake: these arrows are (in general) impossible
to infer, especially in the presence of renaming. As mentioned previously, there
are two distinct arrows from Monoid to Ring, with neither arrow being “better” or
somehow more canonical than the other. Furthermore, we know that pushouts can
also be regarded as a 2-ary operation on compatible arrows. In other words, even
though our goal is to produce theory presentations, using pushouts as a fundamental
building block gives us no choice but to take arrows seriously.
2.4 Arrows
If we revisit the extension and renaming operations, it is easy to see that these
operations not only create a new presentation, they also create a map from the
source presentation into the target presentation. For extensions, this is an injective
map. In other words,
CommutativeMonoid ≜ Monoid extended by {commutative ∶ ∀x, y ∶ U. x ⋅ y = y ⋅ x}
creates more than just CommutativeMonoid, it also creates a morphism from Monoid
to CommutativeMonoid. These can be written explicitly, and in this case this would
be
MtoCM ≜ [U ∶= U, (⋅) ∶= (⋅), e ∶= e, right identity ∶= right identity,
left identity ∶= left identity, associative ∶= associative,
commutative ∶= ∀x, y ∶ U. x ⋅ y = y ⋅ x] ∶ Monoid ⇒ CommutativeMonoid
where we use ⇒ to indicate that this is a construction, and ∶= to mean that this is
an assignment of terms of Monoid to names of CommutativeMonoid. Clearly this
would be very tedious to write out for larger theories. In concrete syntax, we would
prefer to write just the non-identity parts, so that for this case we would prefer
MtoCM ≜ [commutative ∶= ∀x, y ∶ U. x ⋅ y = y ⋅ x] ∶ Monoid ⇒ CommutativeMonoid
which is easily seen to be isomorphic to the definition we started with. Thus it
would be better to simply infer these morphisms when we can. We will however
not make inferability a requirement.
For renaming, it is natural to require that the map on names causes no collisions,
as that would rename multiple concepts to the same name. While this is a poten-
tially interesting operation on presentations, this is not the operation that users
have in mind for renaming. Collision-free renamings also induce an injective map.
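[Figure: a square of renamings. The arrows [U ↦ V] and [U ↦ W] lead from {U ∶ Type} to {V ∶ Type} and {W ∶ Type} respectively; the arrows [V ↦ ?] and [W ↦ ?] then lead to {? ∶ Type}, where the name ? is not determined.]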
Pushouts do create arrows as well,
but unfortunately renamings are a prob-
lem: there are simple situations where
there is no canonical name for some of
the objects in the result. For example,
take the presentation of Carrier, aka
{U ∶ Type}, and the arrows induced by
the renamings U ↦ V and U ↦ W; while
the result will necessarily be isomorphic
to Carrier, there is no canonical choice
of name for the end result. This is one
problem we must solve. The figure above illustrates the issue. It also illustrates
that we really do compute amalgamated sums and not simply syntactic union.
In general, a map from one presentation to another will be called a view. For
example, one can witness that the additive naturals form a monoid with a statement
such as
view Nat as Monoid via [U ∶= N, (⋅) ∶= +N, e ∶= 0, ...]    (1)
where we elide the names of the proofs. The right hand side of an assignment in
a view does not need to be a symbol, it can be any well-typed term. For example,
we can have a view from Magma to itself which maps the binary operation to its
opposite:
[U ∶= U, ⋅ ∶= flip ⋅] ∶ {U ∶ Type, (⋅) ∶ (U, U) → U} → {U ∶ Type, (⋅) ∶ (U, U) → U}    (2)
2.5 Little Theories
One important observation is that contexts of a type theory (or a logic) contain the
same information as a theory presentation. Given a context, theorems about specific
structures can be constructed by transport along views [FGT92]. For example, in
the context of the definition of Monoid (2.1), we can prove that the identity element,
e, is unique.
∀e′ ∶ U. ((∀x. e′ ⋅ x = x) ∨ (∀x. x ⋅ e′ = x)) → e′ = e
In order to apply this theorem in other contexts, we can provide a view from one
theory presentation to another. For example, consider the theory presentation of
semi-rings.
Semiring ≜ {
    U ∶ Type
    (+) ∶ (U, U) → U
    (×) ∶ (U, U) → U
    0 ∶ U
    1 ∶ U
    additive associative ∶ ∀x, y, z ∶ U. (x + y) + z = x + (y + z)
    additive commutative ∶ ∀x, y ∶ U. x + y = y + x
    additive left identity ∶ ∀x ∶ U. 0 + x = x
    additive right identity ∶ ∀x ∶ U. x + 0 = x
    multiplicative associative ∶ ∀x, y, z ∶ U. (x × y) × z = x × (y × z)
    multiplicative left identity ∶ ∀x ∶ U. 1 × x = x
    multiplicative right identity ∶ ∀x ∶ U. x × 1 = x
    left distributive ∶ ∀x, y, z ∶ U. x × (y + z) = x × y + x × z
    right distributive ∶ ∀x, y, z ∶ U. (y + z) × x = y × x + z × x
}
There are two naturally induced views from Monoid to Semiring, one assigning
⋅ to × and e to 1, and another assigning ⋅ to + and e to 0 (with the views also
assigning the monoid axioms to their respective axioms). Each of these two views
can be used to transport our example theorem to prove that 0 and 1 are unique
with respect to their associated binary operations.
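As an illustration only, transport along a view can be sketched in Python by treating a term as either a symbol (a string) or a nested tuple, and applying a view by substituting symbols. The encoding, the helper name `subst`, and the ASCII names `op`, `plus`, `times`, `zero` and `one` are assumptions made for this sketch, not notation from the paper; bound variables are assumed to be distinct from all presentation names, and nothing is type-checked.

```python
def subst(term, assignment):
    """Simultaneously replace symbols in a term (a str or a nested tuple)."""
    if isinstance(term, str):
        return assignment.get(term, term)
    return tuple(subst(t, assignment) for t in term)


# "The identity element is unique", stated over the Monoid symbols U, op, e.
# The bound variables e2 and x are assumed not to clash with presentation names.
unique_identity = (
    "forall", "e2", "U",
    ("implies",
     ("or",
      ("forall", "x", ("eq", ("op", "e2", "x"), "x")),
      ("forall", "x", ("eq", ("op", "x", "e2"), "x"))),
     ("eq", "e2", "e")),
)

# Two views from Monoid to Semiring, restricted to the signature
# (the assignments for the axioms are elided in this sketch).
additive = {"U": "U", "op": "plus", "e": "zero"}
multiplicative = {"U": "U", "op": "times", "e": "one"}

zero_unique = subst(unique_identity, additive)
one_unique = subst(unique_identity, multiplicative)
```

The same statement is thus reused twice: `zero_unique` talks about `plus` and `zero`, while `one_unique` talks about `times` and `one`.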
But these are not the only views from Monoid to Semiring. We do not have to
restrict to assigning constants to constants – we could map constants to arbitrary
terms (in the underlying language). For example we could send × to λx, y ∶ U. y × x.
This leads to the inevitable conclusion that, in general, we need an explicit
language for defining views. But we have to proceed with care, otherwise we risk
making simple situations complicated. For example, if we required explicit identity
views for extensions, this would be semantically correct but painfully verbose in
practice, as was pointed out earlier.
2.6 Models
It is important to remember that models are contravariant: while there is a presen-
tation view from Monoid to CommutativeMonoid, the model morphisms are from
⟦CommutativeMonoid⟧ to ⟦Monoid⟧. Theorems are also contravariant with respect
to model morphisms, so that they travel in the same direction as presentation views.
In this way a view to the empty theory presentation provides models of presen-
tations by assigning every constant to a closed term. It is worthwhile noting that
these models are internal to the underlying logic, rather than necessarily being Set-
models. For example, if our underlying logic can express the existence of a type
of natural numbers, N, then the view given by (1) can be used to transport our
example theorem to prove that 0 is the unique identity element for +N.
2.7 Tiny Theories
We noticed in our experiments [CFJ+11] that for ease of extension, it was best
to use tiny theories, in other words presentations which add in a single concept
at a time. This is useful both for defining pure signatures (presentations with no
axioms) as well as when defining properties such as commutativity. Typically one
proceeds by first defining the smallest typing context in which the property can be
stated. For commutativity, Magma is the smallest such context – which also turns
out to be a signature. We can then obtain the structures we are actually interested
in via a “mixin” of the necessary properties over a base signature.
An example might make this clearer. Suppose we want to construct the presenta-
tion of CommutativeSemiring by adding the commutativity property to Semiring
(see §2.5). As commutativity is defined as an extension to Magma, we need a view
from Magma to Semiring. This view will tell us (exactly!) which binary operation
we want to make commutative. Here we would pick the view that maps U to U
and (⋅) to (×). We can then combine that view with the injection from Magma to
CommutativeMagma to produce a CommutativeSemiring presentation.
CommutativeSemiring ≜ {
    U ∶ Type
    (+) ∶ (U, U) → U
    (×) ∶ (U, U) → U
    0 ∶ U
    1 ∶ U
    ...
    multiplicative commutative ∶ ∀x, y ∶ U. x × y = y × x
    ...
}
We see that this operation also requires that we provide a renaming that maps the
axiom name “commutative” to “multiplicative commutative” in order to avoid the
possibility of name collision (as addition was already commutative in Semiring).
2.8 Constructions
It is worthwhile noticing that there is nothing specific to CommutativeGroup in
the renaming ⋅ ↦ +, e ↦ 0: it can be applied to any theory where the pairs
(⋅, +) and (e, 0) have compatible signatures (including the case where they are not
present). Similarly, extend really defines a “construction” which can be applied to
any presentation whenever all the symbols used in the extension are defined. In
other words, a reasonable semantics should associate a whole class of arrows4 to
these operations. While it is tempting to think that these operations will induce
some endofunctors on presentations, this is not quite the case: name clashes will
prevent that.
2.9 Problems
Clearly we need to have a setting in which extensions, renamings and combinations
(or mixins) make sense. We will need to pay close attention to names, both to
allow pleasant names and prevent accidental collisions. In other words, to be able
to maintain human-readable names for all concepts, we will put the burden on
the library developers to come up with a reasonable naming scheme, rather than
to push that issue onto end users. Another way to see this is that symbol choice
4We are again being deliberately vague here; section 4 will make this precise.
carries a lot of intentional, as well as contextual, information which is commonly
used in mathematical practice.
Views will need to be formally defined, as well as a convenient language for
dealing with them. While in some situations, it is imperative to be explicit about
views, at other times they are obvious or easily inferred; in those latter situations,
usability dictates that we should let the system do the heavy lifting for us.
Furthermore, we do want to use both the little theories and tiny theories methods,
so our language (and semantics) needs to allow, even promote, that style. We will
see that, semantically, not all views have the same compositional properties. We
will thus want to single out, syntactically, as large a subset of well-behaved views
as possible, even though we know we can’t be complete.
Our earlier attempt used an explicit base for combine, which only works for
medium-scale libraries: we need to work more directly with views themselves. A
common solution uses long names, which automatically generates (new, long) names
to uniquely identify common names. But this has the effect of leaking the details
of how a presentation was constructed into the names of the constants of the new
presentation. This essentially prevents later refinements, as all these names would
change. As far as we can tell, any automatic naming policy will suffer from this
problem, which is why we insist on having the library developers explicitly deal
with name clashes. We can then check that this has been done consistently. In
practice few renamings are needed, so allowing the empty renaming annotation to
denote the identity renaming scheme makes our design choice lightweight.
3. BASIC SEMANTICS
In this section we present the necessary definitions from (dependent) type theory
and category theory which will form the basis of our theory presentation combina-
tors. First we formally describe theory presentations and views, then we describe
the semantics of our combinators.
Our presentations depend on a background type theory, but are otherwise agnostic
as to many of the internal details of that theory. From this type theory we require
the following:
—An infinite set of variable names V.
—A typing judgement for terms s of type σ in a context Γ, which we write Γ ⊢ s ∶ σ.
—A kinding judgement for types σ of kind κ in a context Γ, which we write
Γ ⊢ σ ∶ κ ∶ ◻. We further assume that the set of valid kinds κ ∶ ◻ is given and
fixed.
—A definitional equality (a.k.a. convertibility) judgement of terms s_1 of type σ_1
and s_2 of type σ_2 in a context Γ, which we write Γ ⊢ s_1 ∶ σ_1 ≡ s_2 ∶ σ_2. We will
write Γ ⊢ s_1 ≡ s_2 ∶ σ to denote Γ ⊢ s_1 ∶ σ ≡ s_2 ∶ σ.
—A notion of substitution on terms. Given a list of variable assignments [x_i ↦ s_{a_i}]_{i<n}
and an expression e, we write e[x_i ∶= s_{a_i}]_{i<n} for the term e after simultaneous
substitution of the variables {x_i}_{i<n} by the corresponding terms in the assignment.
We will often denote an assignment by v, and its application to a term e by e[v].
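The word "simultaneous" in the last requirement matters. A minimal Python sketch, assuming terms are encoded as strings (symbols) or nested tuples and that no binders are involved (the encoding and the name `subst` are illustrative assumptions), shows how a simultaneous substitution differs from applying the same replacements one after the other:

```python
def subst(term, assignment):
    """Apply every replacement of the assignment in a single pass."""
    if isinstance(term, str):
        return assignment.get(term, term)
    return tuple(subst(t, assignment) for t in term)


e = ("op", "x", "y")
swap = {"x": "y", "y": "x"}

# Simultaneous: x and y are exchanged.
simultaneous = subst(e, swap)

# Sequential: the first step already turns x into y, so the second step
# then rewrites both occurrences to x.
sequential = subst(subst(e, {"x": "y"}), {"y": "x"})
```

Here `simultaneous` is `("op", "y", "x")`, whereas `sequential` collapses to `("op", "x", "x")`.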
3.1 Theory Presentations
A theory presentation is a well-typed list of declarations and definitions. More
formally, Figure 2 gives the formation rules. In this definition, we use ∣Γ∣ to denote
the set of variables of a well-formed context Γ. Explicitly, it is given by
∣∅∣ = ∅    ∣Γ ; x ∶ σ∣ = ∣Γ∣ ∪ {x}    ∣Γ ; x ∶ σ ∶= s∣ = ∣Γ∣ ∪ {x}
Here x ∶ σ ∶= s denotes the declaration of a new synonym x for the term s of type σ.
It is possible to develop this theory without such definitions; however, including them
appears to make both the theory and practical implementations easier.

─────
∅ ctx

Γ ctx    x ∉ ∣Γ∣    Γ ⊢ σ ∶ κ ∶ ◻
─────────────────────────────────
(Γ ; x ∶ σ) ctx

Γ ctx    x ∉ ∣Γ∣    Γ ⊢ s ∶ σ
─────────────────────────────
(Γ ; x ∶ σ ∶= s) ctx

Fig. 2. Formation rules for contexts
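The formation rules can be read as a left-to-right check over a list of declarations. The Python sketch below is a rough approximation under stated assumptions: the typing and kinding premises are replaced by a mere scope check (every symbol mentioned must already be declared), terms are strings or nested tuples without binders, and `BACKGROUND`, `symbols` and `is_context` are hypothetical names introduced for illustration.

```python
BACKGROUND = {"Type", "->"}  # hypothetical constants of the background theory


def symbols(term):
    """All symbols occurring in a term (a str or a nested tuple)."""
    if isinstance(term, str):
        return {term}
    return set().union(*(symbols(t) for t in term))


def is_context(decls):
    """decls is a list of (name, type) or (name, type, definition) entries."""
    declared = set()
    for name, sigma, *definition in decls:
        if name in declared:                      # premise: x not in |Gamma|
            return False
        mentioned = symbols(sigma).union(*(symbols(s) for s in definition))
        if not mentioned <= declared | BACKGROUND:
            return False                          # stand-in for the typing premise
        declared.add(name)
    return True


magma = [("U", "Type"), ("op", ("->", "U", "U", "U"))]
# A declaration e : U followed by a definition sq : U := op e e.
pointed = magma + [("e", "U"), ("sq", "U", ("op", "e", "e"))]
```

With these definitions, `is_context(magma)` and `is_context(pointed)` hold, while reversing `magma` or declaring `U` twice is rejected.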
3.2 Views
A view from a theory presentation Γ to a theory presentation ∆ is an assignment
of well-typed expressions in ∆ to declarations of Γ. The assignments transport
well-typed terms in the context Γ to well-typed terms in ∆, by substitution. More
formally,
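∆ ctx
────────────
[ ] ∶ ∅ → ∆

(Γ ; x ∶ σ) ctx    [v] ∶ Γ → ∆    ∆ ⊢ r ∶ σ[v]
───────────────────────────────────────────────
[v, x ∶= r] ∶ (Γ ; x ∶ σ) → ∆

(Γ ; x ∶ σ ∶= s) ctx    [v] ∶ Γ → ∆    ∆ ⊢ r ≡ s[v] ∶ σ[v]
────────────────────────────────────────────────────────────
[v, x ∶= r] ∶ (Γ ; x ∶ σ ∶= s) → ∆

Fig. 3. Formation rules for views.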
There is a subtle but important distinction between assignments, [v], and views,
[v] ∶ Γ → ∆. A view is made up of 3 components: an assignment, a source
presentation and a target presentation. In particular, the same assignment can occur in
different views.
3.2.1 Extensions and Inclusions. An extension is a special type of view, which
we denote
[a ↦ r_a]_{a ∈ ∣Γ∣} ∶ Γ → ∆
where each expression r_a is a unique variable name from ∆. An inclusion is a
special type of extension of the form
[a ↦ a]_{a ∈ ∣Γ∣} ∶ Γ → ∆.
Inclusions have the nice property that there is at most one inclusion between
any two theory presentations, and that inclusions form a poset of presentations.
However this nice property is also a limitation. As we have hinted at before, Ring
is an extension of Monoid in two different ways, and hence both extensions cannot
be inclusions. We do not give inclusions any special status (unlike extensions); we
draw attention to them here as many other systems make inclusions play a very
special rôle.
3.2.2 Composition of Views. Given two views [v] ∶ Γ → ∆ and [w] ∶ ∆ → Φ, we
can compose them to create a view [v];[w] ∶ Γ → Φ. If [v] = [a ∶= r_a]_{a ∈ ∣Γ∣} then the
composite view is
[v];[w] ≜ [a ∶= r_a[w]]_{a ∈ ∣Γ∣}.
That this gives a well-defined notion of composition, and that it is associative is
standard [Car86, Jac99, Tay99].
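Composition by substitution can be sketched as follows, assuming that the assignment of a view is stored as a Python dictionary from names to terms (strings or nested tuples). The names `compose`, `flip`, `Nat`, `add`, `Int` and `iadd` are hypothetical and used only for illustration; the source and target presentations of each view are not tracked here.

```python
def subst(term, assignment):
    """Simultaneously replace symbols in a term (a str or a nested tuple)."""
    if isinstance(term, str):
        return assignment.get(term, term)
    return tuple(subst(t, assignment) for t in term)


def compose(v, w):
    """[v];[w]: apply w to every right-hand side r_a of v."""
    return {a: subst(r, w) for a, r in v.items()}


flip_view = {"U": "U", "op": ("flip", "op")}   # Magma -> Magma, as in view (2)
nat_view = {"U": "Nat", "op": "add"}           # Magma -> a hypothetical Nat
int_view = {"Nat": "Int", "add": "iadd"}       # hypothetical further view

composite = compose(flip_view, nat_view)
```

The composite sends `op` to `("flip", "add")`, and on this example composing in either bracketing order gives the same assignment.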
3.2.3 Equivalence of Views. Two views with the same domain and codomain,
[u], [v] ∶ Γ → ∆, are equivalent if ∆ ⊢ r_a ∶ (σ_a[u]) ≡ s_a ∶ (σ_a[v]) for every a ∈ ∣Γ∣, where
Γ ≜ [a ∶ σ_a]_{a ∈ ∣Γ∣}
[u] ≜ [a ∶= r_a]_{a ∈ ∣Γ∣}
[v] ≜ [a ∶= s_a]_{a ∈ ∣Γ∣}
3.2.4 The category of theory presentations. We now have all the necessary
ingredients to define the category of theory presentations P, with theory presentations
as objects, and views as morphisms. The identity inclusions are the identity mor-
phisms, and views act on views by substitution, which is associative and respects
the identity.
Note that in [CO12], we worked with C = P^op, which is traditionally called the
category of contexts, which is more often used in categorical logic [Car86, Jac99,
Tay99, Pit00]. But in our setting, and as is common in the context of specifications
(see for example [BG77, Smi93, CoF04] amongst many others), we prefer to take
our intuition from textual inclusion rather than models. Nevertheless, when it is
time to define the semantics, we will revert to using C, as this not only simplifies
certain arguments, it also makes our work easier to compare to that in categorical
logic.
3.3 Combinators
Having defined theory presentations and views (including extensions), we can now
define presentation and view combinators. In fact, all combinators in this section
will end up working in tandem on presentations and views. They allow us, as with
most combinators, to create new presentations/views from old, in a much more
convenient manner than building everything by hand.
The combinators are: extend, rename, combine and mixin. This list should be
unsurprising given §2. Although we expect the majority of theory presentations and
views will be constructed with these combinators, a few complex views will need
to be defined directly. The reader may have noticed the absence of combinators
such as delete or hide: this is quite purposeful on our part. While the operational
semantics on theory presentations for these is “obvious”, the denotational semantics
in terms of theory morphisms is backwards, and has distasteful properties.
We give the full details of the constructions, which are completely deterministic.
These can serve as a direct design for an implementation. In other words, this
section gives an operational semantics for the combinators. In the next section, we
will give them a categorical semantics; we make a few inline remarks here to help
the reader understand why we choose a particular construction.
3.3.1 Renaming. Given a presentation Γ and an injective renaming function
π ∶ Γ → V, we can construct a new theory presentation ∆ ≜ Γ[a ↦ π(a)]_{a ∈ ∣Γ∣} by
renaming Γ’s symbols: we will denote this action of π on Γ by π ⋅ Γ. We also
construct an extension [a ↦ π(a)]_{a ∈ ∣Γ∣} ∶ Γ → π ⋅ Γ which provides a translation
from Γ to the constructed presentation π ⋅ Γ; we denote this extension by v_π. For
this construction as a whole, we use the notation
R(Γ, π ∶ Γ → V) ≜ { pres = π ⋅ Γ
                    extend = v_π ∶ Γ → π ⋅ Γ }
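A minimal Python sketch of this construction follows, under stated assumptions: a presentation is a list of (name, type) pairs, types are strings or nested tuples, and π is given as a dictionary that may be partial, with missing names renamed to themselves (this default is a convenience of the sketch, not part of the definition above). The function name `rename` and the record keys are illustrative.

```python
def subst(term, assignment):
    """Simultaneously replace symbols in a term (a str or a nested tuple)."""
    if isinstance(term, str):
        return assignment.get(term, term)
    return tuple(subst(t, assignment) for t in term)


def rename(gamma, pi):
    """Return the record {pres, extend} for the renaming pi acting on gamma."""
    # Complete pi with identities so that it is defined on all of |gamma|.
    full = {name: pi.get(name, name) for name, _ in gamma}
    if len(set(full.values())) != len(full):
        raise ValueError("renaming is not injective")
    pres = [(full[name], subst(sigma, full)) for name, sigma in gamma]
    return {"pres": pres, "extend": full}


commutative_magma = [
    ("U", "Type"),
    ("op", ("->", "U", "U", "U")),
    ("commutative", ("eq", ("op", "x", "y"), ("op", "y", "x"))),
]

additive = rename(commutative_magma, {"op": "plus"})
```

Both the declared name and every occurrence of `op` inside later types are renamed, and `additive["extend"]` records the translation from the old names to the new ones.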
3.3.2 Extend. Given a theory presentation Γ, a fresh name a and a well-formed
type σ of some kind κ (i.e. Γ ⊢ σ ∶ κ ∶ ◻), we can construct a new theory presentation
∆ ≜ Γ ; a ∶ σ and the extension (an inclusion in this case) [b ↦ b]_{b ∈ ∣Γ∣} ∶ Γ → ∆. More
generally, given a sequence of fresh names, types and kinds, {a_i}_{i<n}, {σ_i}_{i<n}, and
{κ_i}_{i<n}, we can define a sequence of theory presentations Γ_0 ≜ Γ and Γ_{i+1} ≜ Γ_i ; a_i ∶ σ_i
so long as Γ_i ⊢ σ_i ∶ κ_i ∶ ◻. Given such a sequence we construct a new theory
presentation ∆ ≜ Γ_n with the extension (which is still an inclusion) [b ↦ b]_{b ∈ ∣Γ∣} ∶
Γ → ∆. Of course ∆ is the concatenation of Γ with {a_i ∶ σ_i ∶ κ_i}_{i<n}. We will thus
use Γ ⋊ ∆^+ to denote the target of this view whenever the components of ∆^+ are
clear from context. However ∆^+ is in general not a valid presentation, as it may
depend on Γ. This is why we use an asymmetric symbol ⋊.
It is worthwhile noting that general extensions [u] ∶ Γ → ∆ as defined in §3.2.1
can be decomposed into a renaming composed with an ⋊, in other words [u] ∶ Γ →
∆ = [u] ∶ Γ → (π ⋅ Γ ⋊ ∆^+), where π is defined by the action of the extension [u]
on Γ, namely π ⋅ Γ = Γ[u]. We will use the notation Γ[u] ⋊ ∆^+ as it makes the
dependence on the extension clearer.
Extensions which are inclusions are traditionally called display maps in C = P^op,
and our [b ↦ b]_{b ∈ ∣Γ∣} ∶ Γ → (Γ ; a ∶ σ) in P is denoted by â ∶ (Γ ; a ∶ σ) → Γ in C [Tay99],
and δ_a in [Jac99].
For notational convenience, we can encode the construction above as an explicit
function from the inputs as given above, to a record containing two fields, pres (for
presentation) and extend (for extension).
E(Γ, ∆^+) ≜ { pres = Γ ⋊ ∆^+
              extend = [b ↦ b]_{b ∈ ∣Γ∣} ∶ Γ → Γ ⋊ ∆^+ }
where ∆^+ = {a_i ∶ σ_i ∶ κ_i}_{i<n}.
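The record above might be computed along the following lines. This Python sketch only checks freshness of the new names; the premise Γ_i ⊢ σ_i ∶ κ_i ∶ ◻ is assumed to have been checked elsewhere, types are kept as opaque strings, and the function name `extend` and the record keys are illustrative assumptions.

```python
def extend(gamma, delta_plus):
    """Return the record {pres, extend} for gamma extended by delta_plus."""
    names = [name for name, _ in gamma]
    taken = set(names)
    pres = list(gamma)
    for a, sigma in delta_plus:
        if a in taken:
            raise ValueError(f"{a!r} is not fresh")
        taken.add(a)
        pres.append((a, sigma))      # Gamma_{i+1} = Gamma_i ; a_i : sigma_i
    # The accompanying view is the inclusion of gamma into the result.
    return {"pres": pres, "extend": {b: b for b in names}}


magma = [("U", "Type"), ("op", "(U, U) -> U")]
left_unital = extend(
    magma,
    [("e", "U"), ("left_identity", "forall x:U. op(e, x) = x")],
)
```

Note that the second new declaration mentions `e`, which is itself new: the added declarations depend on Γ and on each other, which is why they do not form a presentation on their own.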
3.3.3 Combine. Given two extensions [u_∆] ∶ Γ → ∆ and [u_Φ] ∶ Γ → Φ and two
injective renaming functions π_∆ ∶ ∆ → V and π_Φ ∶ Φ → V, we can combine them
and generate a new theory presentation Ξ. We require that
π_∆(x) = π_Φ(y) ⇔ ∃z ∈ Γ. x = z[u_∆] ∧ y = z[u_Φ].
Say that the two extensions decompose as ∆ = Γ[u_∆] ⋊ ∆^+ and Φ = Γ[u_Φ] ⋊
Φ^+. Then we define Ξ ≜ Ξ_0 ⋊ (Ξ_∆ ∪ Ξ_Φ) where Ξ_0 ≜ Γ[z ↦ π_∆(z[u_∆])]_{z ∈ ∣Γ∣} (or
equivalently Ξ_0 ≜ Γ[z ↦ π_Φ(z[u_Φ])]_{z ∈ ∣Γ∣}), Ξ_∆ ≜ ∆^+[x ↦ π_∆(x)]_{x ∈ ∣∆∣}, and Ξ_Φ ≜
Φ^+[y ↦ π_Φ(y)]_{y ∈ ∣Φ∣}. Note that, by construction, Ξ_0 ⋊ (Ξ_∆ ⋊ Ξ_Φ) is equivalent to
Ξ_0 ⋊ (Ξ_Φ ⋊ Ξ_∆); we denote this equivalence class5 of views by Ξ_0 ⋊ (Ξ_∆ ∪ Ξ_Φ).
The combination operation also provides the two extensions [v∆] ∶ ∆ → Ξ and
[vΦ] ∶ Φ → Ξ where
[v∆] ≜ [x ↦ π∆(x)]_{x∈∣∆∣}
[vΦ] ≜ [y ↦ πΦ(y)]_{y∈∣Φ∣}
A quick calculation shows that [u∆];[v∆] is equal to [uΦ];[vΦ] (and not just
equivalent); we denote this joint arrow [uv] ∶ Γ → Ξ. Furthermore, combine provides
a set of mediating views from the constructed theory presentation Ξ. Suppose we
are given views [w∆] ∶ ∆ → Ω and [wΦ] ∶ Φ → Ω such that the composed views
[u∆];[w∆] ∶ Γ → Ω and [uΦ];[wΦ] ∶ Γ → Ω are equivalent. We can combine [w∆]
and [wΦ] into a mediating view [wΞ] ∶ Ξ → Ω where
[wΞ] ≜ [π∆(x) ∶= x[w∆]]_{x∈∣∆∣} ∪ [πΦ(y) ∶= y[wΦ]]_{y∈∣Φ∣}.
This union is well defined since if π∆(x) = πΦ(y) then there exists z such that x =
z[u∆] and y = z[uΦ], in which case x[w∆] = z[u∆][w∆] and y[wΦ] = z[uΦ][wΦ]
are equivalent since by assumption [u∆];[w∆] and [uΦ];[wΦ] are equivalent. It is
also worthwhile noticing that this construction is symmetric in ∆ and Φ.
For this construction, we use the following notation, where we use the symbols
as defined above (omitting type information for notational clarity)
C(u∆, uΦ, π∆, πΦ)≜
pres =Ξ0⋊(Ξ∆∪ΞΦ)
extend∆=[v∆]∶∆→Ξ
extendΦ=[vΦ]∶Φ→Ξ
diag =[uv]∶Γ→Ξ
mediate =λ w∆wΦ. wΞ
The attentive reader will have noticed that we have painstakingly constructed an
explicit pushout in P. There are two reasons to do this: first, we need to be
this explicit if we wish to be able to implement such an operation. And second,
we do not want an arbitrary pushout, because we do not wish to work up to
isomorphism as that would “mess up” the names. This is why we need user-provided
injective renamings π∆and πΦto deal with potential name clashes. If we worked
up to isomorphism, these renamings would not be needed, as they can always be
manufactured by the system – but then these are no longer necessarily related to
the users’ names. Alternatively, if we use long names based on the (names of the)
views, the method used to construct the presentations and views “leaks” into the
names of the results, which we also consider undesirable.
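To make the role of the user-provided renamings concrete, here is a minimal Python sketch of the combine construction under strong simplifying assumptions that are ours, not the paper's: both extensions are plain inclusions of Γ, a presentation is a dictionary from names to types, a type is a tuple of tokens so that renaming can act inside it, and no type checking is done. The function names `rename` and `combine` and the field names are illustrative.

```python
from typing import Dict, Tuple

Presentation = Dict[str, Tuple[str, ...]]   # name -> type, as a tuple of tokens
Renaming = Dict[str, str]                   # names not listed are left unchanged

def rename(pres: Presentation, pi: Renaming) -> Presentation:
    """Apply a renaming to declared names and to the names inside their types."""
    out = {pi.get(n, n): tuple(pi.get(t, t) for t in ty) for n, ty in pres.items()}
    if len(out) != len(pres):
        raise ValueError("renaming is not injective")
    return out

def combine(gamma, delta, phi, pi_delta, pi_phi):
    """Sketch of C for two inclusions Γ ↪ ∆ and Γ ↪ Φ."""
    # Side condition: π∆(x) = πΦ(y) exactly when x and y are the same name of Γ.
    for x in delta:
        for y in phi:
            shared = x == y and x in gamma
            if (pi_delta.get(x, x) == pi_phi.get(y, y)) != shared:
                raise ValueError(f"renamings disagree on {x!r} and {y!r}")
    xi = rename(delta, pi_delta)                       # Ξ0 ⋊ Ξ∆
    xi.update(rename(phi, pi_phi))                     # add ΞΦ; the Ξ0 part coincides
    return {
        "pres": xi,
        "extend_delta": {x: pi_delta.get(x, x) for x in delta},
        "extend_phi": {y: pi_phi.get(y, y) for y in phi},
        "diag": {z: pi_delta.get(z, z) for z in gamma},
    }

carrier = {"U": ("type",)}
magma = {"U": ("type",), "*": ("U", "U", "->", "U")}
pointed = {"U": ("type",), "e": ("U",)}
pointed_magma = combine(carrier, magma, pointed, {}, {})
two_pointed = combine(carrier, pointed, pointed, {}, {"e": "e2"})
```

Because the result is a dictionary, comparing two results ignores declaration order, which loosely mirrors the equivalence class Ξ0 ⋊ (Ξ∆ ∪ ΞΦ). Combining `pointed` with itself under two identity renamings is rejected by the side condition, which is the name clash the renamings are meant to resolve.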
⁵ In practice, theory presentations are rendered (printed, serialized) using a topological sort where
ties are broken alphabetically, so as to be construction-order independent.
3.3.4 Mixin. Given a view [u∆] ∶ Γ → ∆, an extension [uΦ] ∶ Γ → Φ and two
disjoint injective renaming functions π∆ ∶ ∆ → V and πΦ+ ∶ Φ+ → V, where
the extension Φ decomposes as Φ = Γ[uΦ] ⋊ Φ+, we can mixin the view into the
extension, constructing a new theory presentation Ξ. We define Ξ ≜ Ξ1 ⋊ Ξ2 where
Ξ1 ≜ ∆[x ↦ π∆(x)]_{x∈∣∆∣}
Ξ2 ≜ Φ+[y ∶= π′_{Φ+}(y)]_{y∈∣Φ∣}
π′_{Φ+}(y) ≜ z[u∆][x ↦ π∆(x)]_{x∈∣∆∣}   when there is a z ∈ Γ such that z[uΦ] = y
π′_{Φ+}(y) ≜ πΦ+(y)                     when y ∈ Φ+
The mixin also provides an extension [v∆] ∶ ∆ → Ξ and a view [vΦ] ∶ Φ → Ξ, defined
as
[v∆] ≜ [x ↦ π∆(x)]_{x∈∣∆∣}
[vΦ] ≜ [y ∶= π′_{Φ+}(y)]_{y∈∣Φ∣}
By definition of extension, there is no z ∈ Γ that is mapped into Φ+ by [uΦ]. The
definition of π′_{Φ+} is arranged such that [u∆];[v∆] is equal to [uΦ];[vΦ] (and not
just equivalent); so we can denote this joint arrow by [uv] ∶ Γ → Ξ. In other words,
in a mixin, by only allowing renaming of the new components in Φ+, we ensure
commutativity on the nose rather than just up to isomorphism.
Mixins also provide a set of mediating views from the constructed theory presentation
Ξ. Suppose we are given a view [w∆] ∶ ∆ → Ω and a view [wΦ] ∶ Φ → Ω such
that the composed views [u∆];[w∆] ∶ Γ → Ω and [uΦ];[wΦ] ∶ Γ → Ω are equivalent.
We can combine [w∆] and [wΦ] into the mediating view [wΞ] ∶ Ξ → Ω defined as
[wΞ] ≜ [π∆(x) ∶= x[w∆]]_{x∈∣∆∣} ∪ [πΦ+(y) ∶= y[wΦ]]_{y∈∣Φ+∣}.
For mixin, again using the symbols as above, we denote the construction results
as
M(u∆, uΦ, π∆, πΦ)≜
pres =Ξ1⋊Ξ2
extend∆=[v∆]∶∆→Ξ
viewΦ=[vΦ]∶Φ→Ξ
diag =[uv]∶Γ→Ξ
mediate =λ w∆wΦ. wΞ
Symbolically the above is very similar to what was done in combine, and indeed
we are constructing all of the data for a specific pushout. However in this case the
results are not symmetric, as seen from the details of the construction of Ξ1and
Ξ2, which stems from the fact that in this case [vΦ]is an arbitrary view rather
than an extension.
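The asymmetry can be seen in a small Python sketch of the mixin construction. It makes assumptions well beyond the text: the view is restricted to a name-to-name map (no term assignments, so no substitution of terms into types is needed), the extension is an inclusion of Γ, types are tuples of tokens, and nothing is type-checked. The presentation `nat` with names `N` and `plus` is a hypothetical target chosen for illustration.

```python
from typing import Dict, Tuple

Presentation = Dict[str, Tuple[str, ...]]   # name -> type, as a tuple of tokens

def mixin(gamma, delta, phi, view, pi_delta, pi_phi_plus):
    """Sketch of M for a name-to-name view Γ → ∆ and an inclusion Γ ↪ Φ."""
    rd = lambda n: pi_delta.get(n, n)

    def pi_prime(y):
        # Names inherited from Γ follow the view and then π∆; new names use πΦ+.
        return rd(view[y]) if y in gamma else pi_phi_plus.get(y, y)

    xi1 = {rd(x): tuple(rd(t) for t in ty) for x, ty in delta.items()}
    xi2 = {
        pi_prime(y): tuple(pi_prime(t) if t in phi else t for t in ty)
        for y, ty in phi.items() if y not in gamma
    }
    if set(xi1) & set(xi2):
        raise ValueError("renamings are not disjoint")
    return {
        "pres": {**xi1, **xi2},                          # Ξ1 ⋊ Ξ2
        "extend_delta": {x: rd(x) for x in delta},       # [v∆]
        "view_phi": {y: pi_prime(y) for y in phi},       # [vΦ]
        "diag": {z: rd(view[z]) for z in gamma},         # [uv]
    }

carrier = {"U": ("type",)}
nat = {"N": ("type",), "plus": ("N", "N", "->", "N")}
pointed = {"U": ("type",), "e": ("U",)}
pointed_nat = mixin(carrier, nat, pointed, {"U": "N"}, {}, {})
```

In this example the point `e` ends up with type `N`: the extension is transported along the view, while `view_phi` is no longer an inclusion since it sends `U` to `N`. Composing the view with `extend_delta` gives the same map as composing the inclusion with `view_phi`, which is the on-the-nose commutativity discussed above.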
3.3.5 Reification of views. Although we (will) have a syntax and semantics for
views, there are times when we wish to take views and treat them as first-class
objects. For example, if we want to show that the set of all (small) groups and
group homomorphisms forms a category, we need to be able to have a “theory” of
group homomorphisms. But we can think of an even simpler example: we would
like to talk about the theory of opposite magmas (see the view 2 in §2.4). To do
this, we need to somehow internalize (reify) this view: this is a further reason to
add declarations to our presentations (§3.1).
Suppose we are given a view [v] ∶ Γ → ∆. We want to define a new presentation
which internalizes [v]. A priori, this would require copying all of Γ and ∆ into a
new presentation, and then defining relations between the terms of Γ and ∆ via [v].
However, ∆ might share some names with Γ, both on
purpose and by accident. While we could rename everything in ∆ and use [v] to
recover sharing, this is wasteful. Instead, we will only ask for a renaming for
those names of ∆ which introduce definitions.
Given an injective renaming function π ∶ ∆ → V, we can define a new presentation
Ξ ≜ Γ ⋊ π⋅∆ ⋊ w where w ≜ [π(z) ∶ σ[v] ∶= z[v]]_{z∶σ∈Γ}. Note how we have
used the convertibility axiom from the formation rules of views (Fig. 3) in the
definition of w. Naturally, we also get extensions u_Γ ∶ Γ ↪ Ξ = [b ↦ b]_{b∈∣Γ∣} and
u_∆ ∶ ∆ ↪ Ξ = [b ↦ π b]_{b∈∣∆∣}.
For internalization, using the symbols as above, we denote results of the con-
struction as
I(v, π)≜
pres =Ξ
extendΓ=uΓ
extend∆=u∆
4. THE CATEGORICAL THEORY OF SEMANTICS
At first glance, the definitions of combine and mixin may appear ad hoc and overly
complicated. This is because, in practice, the renaming functions π∆ and πΦ are
frequently the identity. The main reason for this is that mathematical vernacular
uses a lot of rigid conventions: an associative, commutative, invertible operator
which possesses a unit is usually named +, and its unit 0; backward composition is
○, forward composition is ;, and so on. But the usual notation of lattices is different
from that of semirings, even though they share a similar ancestry, so renamings are
clearly necessary at some point.
The details of the combinators combine and mixin can be motivated by giving
them a categorical specification. When we do, we find out that the mixin operation
is a Cartesian lifting in a suitable fibration, and the combine operation is a special
case of mixin.
While our primary interest is in theory presentations, the bulk of the categorical
work in this area has been done on the category of contexts, which is the opposite
category. To be consistent with the existing literature, we will give our semantics
in terms of B = P^op. Thus if [v] ∶ Γ → ∆ is a view from the theory presentation Γ to
the theory presentation ∆, then [v] is an arrow from ∆, considered as a context,
to Γ, considered as a context. When we are considering the category of contexts,
we will write such an arrow from ∆ to Γ as [v] ∶ ∆ ← Γ. Composition of two
arrows is simply the composition of views.
4.1 Semantics
The category of contexts forms the base category for a fibration. The fibered
category E is the category of theory extensions. The objects of E are extensions
of theory presentations. We write such objects as [u] ∶ Γ ↪ ∆ where Γ is the base
of the extension and ∆ is the extended theory presentation. The notation is to
remind the reader that the underlying arrows are in fact monos. An arrow between
two extensions is a pair of views forming a commutative square with the extensions.
Thus given extensions [u2] ∶ Γ2 ↪ ∆2 and [u1] ∶ Γ1 ↪ ∆1, an arrow between
these consists of two arrows [v∆] ∶ ∆2 ← ∆1 and [vΓ] ∶ Γ2 ← Γ1 from B such that
[u1] ○ [v∆] = [vΓ] ○ [u2] ∶ ∆2 ← Γ1. When we need to be very precise, we write such
an arrow as the square
          [v∆]
        ∆2 ← ∆1
  [u2]            [u1]
        Γ2 ← Γ1
          [vΓ]
We will write v^∆_Γ for such an arrow between u2 and u1 whenever the rest
of the information can be inferred from context. When given a specific arrow in E,
we will use the notation e⇇ for its name.
A fibration of E over B is defined by giving a suitable functor from E to B. Our
“base” functor sends an extension [e] ∶ Γ ↪ ∆ to Γ and sends an arrow v^∆_Γ between
u2 and u1 to its base arrow [vΓ] ∶ Γ2 ← Γ1.
Theorem 4.1. This base fibration is a Cartesian fibration.
This theorem, in slightly different form, can be found in [Jac99] and [Tay99]. We
give a full proof here because we want to make the link with our mixin construction
explicit. We use the results of §3.3 directly.
Proof. Suppose [u∆] ∶ ∆ ← Γ is an arrow in B (a view), and [uΦ] ∶ Γ ↪ Φ is an
object of E in the fiber of Γ (i.e. an extension). We need to construct a Cartesian
lifting of [u∆], which is a Cartesian arrow of E over [u∆]. The components of
the mixin construction are exactly the ingredients we need to create this Cartesian
lifting. Let π∆ ∶ ∆ → V and πΦ+ ∶ Φ+ → V be two disjoint injective renaming
functions. Note that such π∆ and πΦ+ always exist because V is infinite while ∆
and Φ+ are finite. Then let
M(u∆, uΦ, π∆, πΦ)≜
pres =Ξ
extend∆=[v∆]∶∆→Ξ
viewΦ=[vΦ]∶Φ→Ξ
diag =[uv]∶Γ→Ξ
mediate =λ w∆wΦ. wΞ
Then the square
              [vΦ]
             Ξ ← Φ
  e⇇ ≜ [v∆]         [uΦ]
             ∆ ← Γ
              [u∆]
is an arrow of E which is a Cartesian lift of [u∆].
Firstly, to see that e⇇ is in fact an arrow of E, we note that [v∆] ∶ ∆ → Ξ
is an extension, so [v∆] ∶ ∆ ↪ Ξ is an object of E. Next we need to show that
[v∆] ○ [u∆] = [vΦ] ○ [uΦ]. Let z ∈ Γ. Then z[u∆][v∆] = z[u∆][x ↦ π∆(x)]_{x∈∣∆∣}
by definition of [v∆]. On the other hand, z[uΦ][vΦ] = z[uΦ][y ∶= π′_{Φ+}(y)]_{y∈∣Φ∣}
by definition of [vΦ]. However, z[uΦ] is a variable since uΦ is an extension, and
by definition π′_{Φ+}(z[uΦ]) = z[u∆][x ↦ π∆(x)]_{x∈∣∆∣} so that we have z[u∆][v∆] =
z[uΦ][vΦ] as required.
Secondly we need to see that e⇇ is a Cartesian lift of [u∆] ∶ ∆ ← Γ. We note that
it is plain to see that [u∆] ∶ ∆ ← Γ is the base of e⇇. What remains is to show that
for any other arrow
              [wΦ]
             Ω ← Φ
  f⇇ ≜ [wΨ]         [uΦ]
             Ψ ← Γ
              [wΓ]
from E and any arrow [w0] ∶ Ψ ← ∆
from B such that [wΓ] = [u∆] ○ [w0] ∶ Ψ ← Γ, there is a unique mediating arrow
              [wΞ]
             Ω ← Ξ
  h⇇ ≜ [wΨ]         [v∆]
             Ψ ← ∆
              [w0]
from E such that
h⇇ ; f⇇ = e⇇    (3)
To show that such an h⇇ exists, we only need to construct [wΞ] ∶ Ω ← Ξ and
show that it has the required properties. We will show that the mediating arrow
[w] from the mixin construction given [wΦ] ∶ Φ → Ω and [w∆] ≜ [w0];[wΨ] ∶ ∆ → Ω
is the required arrow.
First we note that [u∆] ○ [w∆] = [uΦ] ○ [wΦ] as required by the mixin construction
for the mediating arrow since [u∆] ○ [w∆] = [u∆] ○ [w0] ○ [wΨ] = [wΓ] ○ [wΨ] =
[uΦ] ○ [wΦ]. Now taking [wΞ] ≜ [w] we need to show that h⇇ is a well defined
arrow in E by showing it forms a commutative square. Suppose x ∈ ∆. Then
x[v∆][wΞ] = π∆(x)[wΞ] = x[w∆] = x[w0][wΨ] as required. Next we need to show
that equation (3) holds. It suffices to show that [vΦ] ○ [wΞ] = [wΦ] since it is already
required that [u∆] ○ [w0] = [wΓ]. Suppose y ∈ Φ. There are two possibilities:
either y = z[uΦ] for some z ∈ Γ, or y ∈ Φ+ where Φ = Γ[uΦ] ⋊ Φ+. If y ∈ Φ+
then y[vΦ][wΞ] = πΦ+(y)[wΞ] = y[wΦ] as required. In case y = z[uΦ], then
y[vΦ][wΞ] = z[uΦ][vΦ][wΞ] = z[u∆][v∆][wΞ] = z[u∆][w0][wΨ] = z[wΓ][wΨ] =
z[uΦ][wΦ] = y[wΦ] as required.
Lastly we need to show that the mediating arrow h⇇ is the unique arrow
satisfying equation (3). Let j⇇ be another arrow of E, where j⇇ must have the
same shape as h⇇, but with [wΞ] replaced with [w′Ξ]. Suppose that
j⇇ ; f⇇ = e⇇
We need to show that [w′Ξ] = [wΞ]. Suppose z ∈ Ξ. There are two possibilities.
Either z = x[v∆] for some x ∈ ∆ or z = y[vΦ] for some y ∈ Φ+. Suppose z = x[v∆].
Then z[w′Ξ] = x[v∆][w′Ξ] = x[w0][wΨ] = x[v∆][wΞ] = z[wΞ] as required. On the
other hand, suppose z = y[vΦ]. Then z[w′Ξ] = y[vΦ][w′Ξ] = y[wΦ] = y[vΦ][wΞ] =
z[wΞ] as required. So [w′Ξ] = [wΞ] and hence j⇇ = h⇇, as required.
The above proof illustrates that the mixin operation is characterized by the
properties of a Cartesian lifting in the fibration of extensions. Notice that a Cartesian
lift is only characterized up to isomorphism. Thus there are potentially many
isomorphic choices for a Cartesian lift, and hence there are many possible choices for
how to mixin an extension into a view. This is the underlying reason why the mixin
construction requires a pair of renaming functions. The renaming functions pick
out a particular choice of mixin from the many possibilities. This ability to specify
which mixin construction to make is quite important as one cannot simply define
a mixin to be “the” Cartesian lift, since “the” Cartesian lift is only defined up to
isomorphism. It is important to remember that for user syntax, we cannot work up
to isomorphism!
Next we will see that combine is a special case of mixin.
Theorem 4.2. Given two extensions [u∆] ∶ Γ ↪ ∆ and [uΦ] ∶ Γ ↪ Φ and
renaming functions π∆ ∶ ∆ → V and πΦ ∶ Φ → V satisfying the requirement of the
combine construction, then
M(u∆, uΦ, π∆, πΦ+) = C(u∆, uΦ, π∆, πΦ)    (4)
where Φ = Γ[uΦ] ⋊ Φ+ and πΦ+ = πΦ ⇂ ∣Φ+∣, and equation (4) is interpreted
componentwise.
Proof. Suppose that
C(u∆, uΦ, π∆, πΦ)=
pres =Ξ0⋊(Ξ∆∪ΞΦ)
extend∆=[v∆]∶∆→Ξ
extendΦ=[vΦ]∶Φ→Ξ
diag =[uv]∶Γ→Ξ
mediate =λ w∆wΦ. wΞ
and
M(u∆, uΦ, π∆, πΦ+) =
pres = Ξ′
extend∆ = [v′∆] ∶ ∆ → Ξ′
viewΦ = [v′Φ] ∶ Φ → Ξ′
diag = [uv′] ∶ Γ → Ξ′
mediate = λ w∆ wΦ. wΞ′
Recall that Ξ = Ξ0 ⋊ (Ξ∆ ∪ ΞΦ) = Ξ0 ⋊ (Ξ∆) ⋊ ΞΦ where
Ξ0 ≜ Γ[z ↦ π∆(z[u∆])]_{z∈∣Γ∣}, Ξ∆ ≜ ∆+[x ↦ π∆(x)]_{x∈∣∆∣}, and ΞΦ ≜ Φ+[y ↦ πΦ(y)]_{y∈∣Φ∣}.
In particular note that Ξ0 = Γ[u∆][z[u∆] ↦ π∆(z[u∆])]_{z∈∣Γ∣}. Since ∆ = Γ[u∆] ⋊
∆+, we have that
Ξ0 ⋊ Ξ∆ = Γ[u∆][z[u∆] ↦ π∆(z[u∆])]_{z∈∣Γ∣} ⋊ ∆+[x ↦ π∆(x)]_{x∈∣∆∣}
        = (Γ[u∆] ⋊ ∆+)[x ↦ π∆(x)]_{x∈∣∆∣}
        = ∆[x ↦ π∆(x)]_{x∈∣∆∣}
Recall also that Ξ′ = Ξ′1 ⋊ Ξ′2 where Ξ′1 ≜ ∆[x ↦ π∆(x)]_{x∈∣∆∣} and
Ξ′2 ≜ Φ+[y ↦ π′_{Φ+}(y)]_{y∈∣Φ∣}.
So we see that Ξ′1 = Ξ0 ⋊ Ξ∆.
Next we show that π′_{Φ+} = πΦ. If y ∈ Φ then either y ∈ Φ+ or there is some z ∈ Γ
such that y = z[uΦ]. If y ∈ Φ+ then π′_{Φ+}(y) = πΦ+(y) = πΦ(y). If y = z[uΦ], then
π′_{Φ+}(y) = z[u∆][x ↦ π∆(x)]_{x∈∣∆∣} = π∆(z[u∆]) = πΦ(z[uΦ]) = πΦ(y). Therefore
Ξ′2 = ΞΦ and hence Ξ′ = Ξ.
Next we need to show that [v′∆] = [v∆] and [v′Φ] = [vΦ]. First we see that [v′∆]
and [v∆] are both defined to be [x ↦ π∆(x)]_{x∈∣∆∣}, so clearly they are equal. Next
we see that [vΦ] ≜ [y ↦ πΦ(y)]_{y∈∣Φ∣} and [v′Φ] ≜ [y ↦ π′_{Φ+}(y)]_{y∈∣Φ∣} are equal because
π′_{Φ+} = πΦ. This also gives that [uv] = [uv′].
Lastly we show that the mediating arrow of the combine is the same as the
mediating arrow of the mixin. Suppose we are given [w∆] ∶ ∆ → Ω and [wΦ] ∶ Φ → Ω
such that [u∆];[w∆] = [uΦ];[wΦ] ∶ Γ → Ω. To show that the mediating arrow
produced by combine, [wΞ] ∶ Ξ → Ω, is the same as the mediating arrow produced by
the mixin, it suffices to prove that the mediating arrow satisfies the universal property
of the Cartesian lift, since such an arrow is unique. Thus it suffices to show that
[vΦ];[wΞ] = [wΦ] ∶ Φ → Ω and [v∆];[wΞ] = [w∆] ∶ ∆ → Ω. Let y ∈ Φ. Then
y[vΦ][wΞ] = πΦ(y)[wΞ] = y[wΦ]. Let x ∈ ∆. Then x[v∆][wΞ] = π∆(x)[wΞ] =
x[w∆] as required.
Combine is rather well-behaved. In particular,
Proposition 4.1. C(u∆, uΦ, π∆, πΦ)=C(uΦ, u∆, πΦ, π∆), i.e. combine is com-
mutative.
It turns out that combine also satisfies an appropriate notion of associativity. In
other words, we can compute limits of cones of extensions.
4.2 No Lifting Views over Views
Why do we restrict ourselves to the fibration of extensions? Why not allow mixins
of arbitrary views over arbitrary views? If mixins over arbitrary views were allowed,
then the notion of a Cartesian lifting would reduce to that of a pullback. But to demand
that the category of contexts and views be closed under all pullbacks would require
too much from our type theory: we would have to have all equalizers (as we already
have all products). In particular, at the type level, this would force us to have subset
types, which is something we are not willing to impose. Thus a restriction is needed,
and our proposed restriction of only mixing in extensions into views appears to be
quite practical. Taylor [Tay99] is a good source of further reasons for the naturality
of restricting to this case.
5. SYNTAX AND SEMANTICS OF PRESENTATION COMBINATORS
We are now ready to give a concrete syntax for our presentation and view combinators.
This syntax reflects our desire to have a clean semantics, and thus is extracted
from the previous section, rather than trying to patch up our intuitive syntax. In
other words, we followed the development process of prototyping to get an idea
of what is needed, finding an elegant denotational semantics, and redoing everything to
match the elegant semantics on the nose.
We use A, B to denote names at the presentation/view level, x and y to denote
symbols, t to denote terms of the underlying type theory, and l to denote (raw)
contexts from the underlying type theory.
    tpc ∶∶= Empty
          | Theory {l}
          | extend A by {l}
          | combine A r1, B r2
          | mixin A r1, B r2
          | view A as B via v
          | A ; B
          | A r
    r ∶∶= [ren]
    v ∶∶= [assign]
    ren ∶∶= x ↦ y
          | ren, x ↦ y
    assign ∶∶= x ∶= t
             | assign, x ∶= t
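One way to read this grammar is as an abstract syntax tree. The sketch below is a hypothetical Python encoding, not the authors' implementation: operands A and B are kept as plain names, raw contexts and terms are kept as unparsed strings, and the class and field names are our own. The `show` function prints a node back in an ASCII rendering of the concrete syntax (`|->` for ↦).

```python
from dataclasses import dataclass
from typing import Dict, Union

Renaming = Dict[str, str]       # r ::= [ren]
Assignment = Dict[str, str]     # v ::= [assign], terms kept as raw strings

@dataclass
class Empty:
    pass

@dataclass
class Theory:
    context: str                # the raw context l

@dataclass
class Extend:
    base: str
    context: str

@dataclass
class Combine:
    left: str
    left_ren: Renaming
    right: str
    right_ren: Renaming

@dataclass
class Mixin:
    view: str
    view_ren: Renaming
    ext: str
    ext_ren: Renaming

@dataclass
class View:
    source: str
    target: str
    via: Assignment

@dataclass
class Seq:
    first: str
    second: str

@dataclass
class Rename:
    base: str
    ren: Renaming

Tpc = Union[Empty, Theory, Extend, Combine, Mixin, View, Seq, Rename]

def show_ren(r: Renaming) -> str:
    return "[" + ", ".join(f"{x} |-> {y}" for x, y in r.items()) + "]"

def show(e: Tpc) -> str:
    """Print a node back in (ASCII) concrete syntax."""
    if isinstance(e, Empty):
        return "Empty"
    if isinstance(e, Theory):
        return f"Theory {{{e.context}}}"
    if isinstance(e, Extend):
        return f"extend {e.base} by {{{e.context}}}"
    if isinstance(e, Combine):
        return f"combine {e.left} {show_ren(e.left_ren)}, {e.right} {show_ren(e.right_ren)}"
    if isinstance(e, Mixin):
        return f"mixin {e.view} {show_ren(e.view_ren)}, {e.ext} {show_ren(e.ext_ren)}"
    if isinstance(e, View):
        via = ", ".join(f"{x} := {t}" for x, t in e.via.items())
        return f"view {e.source} as {e.target} via [{via}]"
    if isinstance(e, Seq):
        return f"{e.first} ; {e.second}"
    return f"{e.base} {show_ren(e.ren)}"
```

For instance, `show(Combine("Magma", {}, "Pointed", {}))` yields the unsugared form of a combine with two empty renamings.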
Informally, these forms correspond to the empty theory, an explicit theory, a
theory extension, combining two extensions, mixing in a view and an extension,
explicit view, sequencing views, and renaming.
What might be surprising is that we do not have a separate language for presen-
tations and views. This is because our language does not have a single semantics
in terms of presentations, extensions or views, but rather has several compatible
semantics. In other words, our syntax will yield objects of B, objects of E (i.e.
extensions) and arrows of B (views).
The semantics is given by defining three partial maps, ⟦−⟧_B ∶ tpc ⇀ B, ⟦−⟧_E ∶
tpc ⇀ E, and ⟦−⟧_{B→} ∶ tpc ⇀ Hom_B. This is done by simultaneous structural recursion.
We also use ⟦−⟧_π for the straightforward semantics in V → V of a renaming.
⟦−⟧_B ∶ tpc ⇀ B
⟦Empty⟧_B = ∅
⟦Theory {l}⟧_B = l   when l ctx
⟦extend A by {l}⟧_B = E(⟦A⟧_B, id_{∣A∣}).pres
⟦combine A1 r1, A2 r2⟧_B = C(⟦A1⟧_E, ⟦A2⟧_E, ⟦r1⟧_π, ⟦r2⟧_π).pres
⟦mixin A1 r1, A2 r2⟧_B = M(⟦A1⟧_{B→}, ⟦A2⟧_E, ⟦r1⟧_π, ⟦r2⟧_π).pres
⟦view A as B via v⟧_B = (undefined)
⟦A ; B⟧_B = cod ⟦A ; B⟧_{B→}
⟦A r⟧_B = R(⟦A⟧_B, ⟦r⟧_π).pres
Recall that objects of E correspond to those arrows of B (i.e. views) which are
in fact extensions.
⟦−⟧_E ∶ tpc ⇀ E
⟦Empty⟧_E = I_∅
⟦Theory {l}⟧_E = !_l ∶ [] → ⟦l⟧_B
⟦extend A by {l}⟧_E = E(⟦A⟧_B, id_{∣A∣}).extend
⟦combine A1 r1, A2 r2⟧_E = C(⟦A1⟧_E, ⟦A2⟧_E, ⟦r1⟧_π, ⟦r2⟧_π).diag
⟦mixin A1 r1, A2 r2⟧_E = (undefined)
⟦view A as B via v⟧_E = (undefined)
⟦A ; B⟧_E = ⟦A⟧_E ; ⟦B⟧_E
⟦A r⟧_E = R(⟦A⟧_B, ⟦r⟧_π).extend
Lastly, arrows of B are views.
⟦−⟧_{B→} ∶ tpc ⇀ Hom_B
⟦Empty⟧_{B→} = I_∅
⟦Theory {l}⟧_{B→} = !_l ∶ [] → ⟦l⟧_B
⟦extend A by {l}⟧_{B→} = E(⟦A⟧_B, id_{∣A∣}).extend
⟦combine A1 r1, A2 r2⟧_{B→} = C(⟦A1⟧_E, ⟦A2⟧_E, ⟦r1⟧_π, ⟦r2⟧_π).diag
⟦mixin A1 r1, A2 r2⟧_{B→} = M(⟦A1⟧_{B→}, ⟦A2⟧_E, ⟦r1⟧_π, ⟦r2⟧_π).diag
⟦view A as B via v⟧_{B→} = [v] ∶ ⟦A⟧_B → ⟦B⟧_B
⟦A ; B⟧_{B→} = ⟦A⟧_{B→} ; ⟦B⟧_{B→}
⟦A r⟧_{B→} = R(⟦A⟧_B, ⟦r⟧_π).extend
All rules are strictly compositional except for ⟦A ; B⟧_B, but this is acceptable since
⟦A ; B⟧_{B→} is compositional.
Note that we could have interpreted ⟦view A as B via v⟧_B as cod ⟦view A as B via v⟧_{B→},
rather than leaving it undefined, but this is not actually helpful, since this is just ⟦B⟧_B, which is
not actually what we want. What we would really want is the result of doing the
substitution v into A, but the resulting presentation may no longer be well-formed.
So we chose to interpret the attempt to take the object component of a view as a
specification error. Similarly, even though we can give an interpretation as an
extension for mixin when A1 turns out to be an extension, and also for an extension r
in a view context (i.e. view A as B via r), we choose to make these specification
errors as well.
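The partiality described here can be summarised as a lookup table. The following Python sketch records, for each syntactic form, which of the three semantics the text gives an interpretation for; it assumes that the forms left without an interpretation are exactly the "specification errors" just discussed. The form labels, the semantics labels ("B", "E", "Hom") and the names `DEFINED`, `require` and `SpecificationError` are ours.

```python
class SpecificationError(Exception):
    """Raised when a form has no interpretation in the requested semantics."""

# Which semantics are defined for each form: "B" for objects of B,
# "E" for extensions, "Hom" for arrows of B (views).
DEFINED = {
    "Empty":   {"B", "E", "Hom"},
    "Theory":  {"B", "E", "Hom"},
    "extend":  {"B", "E", "Hom"},
    "combine": {"B", "E", "Hom"},
    "mixin":   {"B", "Hom"},          # no interpretation as an extension
    "view":    {"Hom"},               # only an arrow, never an object
    "seq":     {"B", "E", "Hom"},     # A ; B
    "rename":  {"B", "E", "Hom"},     # A r
}

def require(form: str, semantics: str) -> None:
    """Fail early if `form` cannot be interpreted in `semantics`."""
    if semantics not in DEFINED[form]:
        raise SpecificationError(f"{form} has no {semantics} semantics")
```

An interpreter could call `require` before dispatching, so that asking for the object component of a view fails with a clear message rather than producing an ill-formed presentation.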
We should also note here that in our implementation, we allow raw renamings
([ren]) and assignments ([assign]) to be named, for easier reuse. While renamings
can be given a simple categorical semantics (they induce a natural transformation
on B), assignments really need to be interpreted contextually since this requires
checking that terms t are well-typed.
Furthermore, we add a bit of syntactic sugar: A || B stands for combine A [], B [],
a rather common situation.
6. EXAMPLES
We show some progressively more complex examples, drawn from our library. These
are chosen to illustrate the power of the combinators, but also how these do solve
the various problems we highlighted in §2. In all the examples below, we are talking
solely about presentations of theories; we will nevertheless drop “presentation” and
talk about Group even though we are not interested in groups themselves, nor even
“Group Theory”. The reader needs to keep this distinction in mind when reading
our examples.
The simplest use of combine comes very quickly in a hierarchy built using tiny
theories, namely when we construct a pointed magma from a magma and (the
theory of) a point.
    Carrier := extend Empty by {U : type}
    Magma := extend Carrier by {∗ : (U,U) -> U}
    Pointed := extend Carrier by {e : U}
    PointedMagma := Magma || Pointed
where we have used the sugar for combine. Since ⟦Magma⟧_E and ⟦Pointed⟧_E
are both arrows from ⟦Carrier⟧_B, these can be combined into another extension
⟦PointedMagma⟧_E ∶ ⟦Carrier⟧_B → ⟦PointedMagma⟧_B.
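To illustrate how such a small library might be evaluated, here is a Python sketch of a toy interpreter for the fragment used so far (extend, and combine with empty renamings). It is an assumption-laden illustration rather than the paper's semantics: each definition is interpreted as a pair (base, target) of name-to-type dictionaries, types are opaque strings, and the tuple encoding of the library and the name `sem_E` are ours.

```python
def sem_E(name, lib):
    """Return (base, target) for the extension called `name`."""
    if name == "Empty":
        return {}, {}
    form = lib[name]
    if form[0] == "extend":
        _, parent, decls = form
        _, base = sem_E(parent, lib)
        if set(base) & set(decls):
            raise ValueError("extension names must be fresh")
        return base, {**base, **decls}
    if form[0] == "combine":                 # the sugar A || B: no renamings
        _, left, right = form
        base_l, pres_l = sem_E(left, lib)
        base_r, pres_r = sem_E(right, lib)
        if base_l != base_r:
            raise ValueError("combine needs two extensions of the same base")
        if (set(pres_l) - set(base_l)) & (set(pres_r) - set(base_r)):
            raise ValueError("name clash: a renaming is required")
        return base_l, {**pres_l, **pres_r}
    raise ValueError(f"unsupported form {form[0]!r}")

library = {
    "Carrier": ("extend", "Empty", {"U": "type"}),
    "Magma": ("extend", "Carrier", {"*": "(U,U) -> U"}),
    "Pointed": ("extend", "Carrier", {"e": "U"}),
    "PointedMagma": ("combine", "Magma", "Pointed"),
}
base, pointed_magma = sem_E("PointedMagma", library)
```

The pair returned for PointedMagma has the carrier as its base, matching the statement that the combination is again an extension of ⟦Carrier⟧_B. Combining Pointed with itself without a renaming is rejected, which is why the two-point theory needs a renaming.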
If we want a theory of two points, we need to rename one of them:
    TwoPointed := combine Pointed [], Pointed [U |-> V]
We can just as easily extend by properties:
    LeftUnital := extend PointedMagma by {
      axiom leftIdentity : forall x : U. e ∗ x = x
    }
This illustrates a design principle: properties should be defined as extensions of
their minimal theory. Such minimal theories are most often signatures, in other
words property-free theories. By the results of the previous section, this maximizes
reusability. Even though signatures have no specific status in our framework, they
arise very naturally as “universal base points” for theory development.
LeftUnital of course has a natural dual, RightUnital. While this is easy enough
to define explicitly, this should nevertheless give pause, as this is really duplicating
information which already exists. This can be solved using the following self-view :
    Flip := view Magma as Magma via [∗ |-> fun (x, y). y ∗ x]
Note that there is no interpretation for ⟦Flip⟧_B; if we were to perform the
substitution directly, we would obtain
    Theory {U : type; fun (x, y). y ∗ x : (U,U) -> U}
which is ill-defined since it contains the undefined symbol ∗.
One could be tempted to then write
    RightUnital := mixin Flip [], LeftUnital []
but this is incorrect since LeftUnital is an extension from PointedMagma, not
Magma. The solution is to write
    RightUnital := mixin Flip [], (PointedMagma ; LeftUnital) []
which gives a correct answer, but with an axiom still called leftIdentity; the
better solution is to write
    RightUnital := mixin Flip [],
        (PointedMagma ; LeftUnital) [leftIdentity |-> rightIdentity]
which is the RightUnital we want. Note that the construction also makes available
an extension from Magma (as if we had done the construction manually) as well as
views from LeftUnital and Magma.
Note that the syntax used above is sub-optimal: the path PointedMagma;LeftUnital
may well be needed again, and should be named. In other words,
    LeftUnit := PointedMagma ; LeftUnital
is a useful intermediate definition.
Note that the previous examples reinforce the importance of signatures, and of
arrows from signatures to “interesting” theories as important, separate entities.
For example, Monoid as an extension is most usefully seen as an arrow from the
presentation PointedMagma.
Our machinery also allows one to construct the inverse view, from LeftUnital
to RightUnital. Consider the view Flip;LeftUnital and the identity view from
LeftUnital to itself. These are exactly the needed inputs for mediate, which
returns a (unique) view from LeftUnital to RightUnital. Furthermore, we obtain
(from the construction of the mediating view) that this view composes with the view
from RightUnital to LeftUnital to give the identity. This is illustrated in Figure 4
where the ⟦−⟧_{B→} annotations on nodes are omitted; note that the arrows are in the
semantic category, which is the opposite of the one for theory presentations. Let
RU = M(⟦Flip⟧_{B→}, ⟦LeftUnital⟧_E, ⟦id⟧_π, ⟦[leftIdentity ↦ rightIdentity]⟧_π)
then Flip_RU = RU.view_LeftUnital and
Flip_LU = RU.mediate_LeftUnital(⟦LeftUnital⟧_E ; ⟦Flip⟧_{B→}, ⟦id⟧_E)
The construction of mediate ensures that Flip_LU ; Flip_RU = ⟦id⟧_E, provided that we
know that
⟦Flip⟧_{B→} ; ⟦Flip⟧_{B→} = ⟦id⟧_{B→} ∶ ⟦Magma⟧_B → ⟦Magma⟧_B.
The above identity is not, however, structural; it properly belongs to the underlying
type theory: it boils down to asking if
∀x ∶ U. flip (flip x) =_{βηδ} x
or, to use the notation of §3.1,
[U ∶ Type, x ∶ U] ⊢ flip (flip x) ≡ x ∶ U.
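The involution property can be checked on concrete terms. The Python sketch below is only an illustration on a hand-rolled term representation (nested tuples with the operator in first position), not a proof of the judgment above and not the conversion relation of the underlying type theory: `flip` applies the Flip assignment to every occurrence of the binary operator and performs the resulting β-step immediately by swapping the arguments.

```python
def flip(term):
    """Apply the Flip assignment to a term and β-reduce the result."""
    if isinstance(term, tuple) and term[0] == "*":
        _, left, right = term
        return ("*", flip(right), flip(left))
    if isinstance(term, tuple):                  # any other node, e.g. ("=", a, b)
        return (term[0],) + tuple(flip(t) for t in term[1:])
    return term                                  # a variable or a constant

left_identity = ("=", ("*", "e", "x"), "x")      # e ∗ x = x
right_identity = flip(left_identity)             # x ∗ e = x
round_trip = flip(right_identity)
```

Applying `flip` once turns the left identity axiom into the right identity axiom, and applying it twice returns the original term, which is the instance of flip (flip x) ≡ x that the mediating view relies on.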
Fig. 4. Construction of LeftUnital and RightUnital. See the text for the interpretation.
[Diagram omitted: nodes RightUnital, LeftUnital and Magma; arrows Flip_RU, Flip_LU, ⟦RightUnital⟧_E, ⟦LeftUnital⟧_E, ⟦Flip⟧_{B→} and ⟦id⟧_E.]
7. DISCUSSION
It is important to note that we are essentially parametric in the underlying type
theory. From a categorical point of view, this is hardly surprising: this is the whole
point of contextual categories [Car86]. A lot more features can be added to the
type theory, at no harm to the combinators themselves – see Jacobs [Jac99] and
Taylor [Tay99] for many such features.
One of the features we did choose to build in was to allow definitions in our
contexts. This is especially useful when transporting theorems from one setting
to another, as is done when using the “Little Theories” method [FGT92]. In this
setting it is frequently beneficial to first build up towers of conservative extensions
above each of the theories, so as to build up a more convenient common vocabulary,
which then makes interpretations easier to build (and use).
Lastly, we have implemented a “flattener” for our semantics, which just turns a
presentation A given in our language into a flat presentation Theory {l} by computing
cod(⟦A⟧_E). We have been very careful to ensure that all our constructions leave
no trace of the construction method in the resulting flattened theory. We strongly
believe that users of theories do not wish to be burdened by such details, and
we also want developers to have maximal freedom in designing a modular, reusable
and maintainable hierarchy without worrying about backwards compatibility of the
hierarchy, only the end results: the theory presentations.
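One ingredient of such trace-free output is the rendering order mentioned in the footnote on combine: a topological sort with alphabetical tie-breaking. The sketch below shows one possible way to obtain such an order in Python, using Kahn's algorithm with a heap; the dependency-map input format and the name `render_order` are assumptions made for illustration, and "alphabetical" here means Python's default string ordering.

```python
import heapq
from typing import Dict, List, Set

def render_order(deps: Dict[str, Set[str]]) -> List[str]:
    """Topologically sort declarations, breaking ties alphabetically."""
    pending = {name: set(ds) for name, ds in deps.items()}
    users: Dict[str, List[str]] = {name: [] for name in deps}
    for name, ds in deps.items():
        for d in ds:
            if d not in deps:
                raise ValueError(f"{name!r} depends on undeclared {d!r}")
            users[d].append(name)
    ready = [name for name, ds in pending.items() if not ds]
    heapq.heapify(ready)                  # smallest name first
    order = []
    while ready:
        name = heapq.heappop(ready)
        order.append(name)
        for user in users[name]:
            pending[user].discard(name)
            if not pending[user]:
                heapq.heappush(ready, user)
    if len(order) != len(deps):
        raise ValueError("cyclic dependency")
    return order

one = render_order({"U": set(), "op": {"U"}, "e": {"U"}})
other = render_order({"e": {"U"}, "U": set(), "op": {"U"}})
```

Both calls describe the same declarations entered in a different order, and both produce the same rendering order, so the printed theory does not reveal how it was assembled.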
8. RELATED WORK
We have been highly influenced by the early work of Burstall and Goguen [BG77,
BG79], Doug Smith’s Specware [Smi93, Smi99], and the work of Kapur, Musser
and Stepanov on Tecton [KMS82, KM92]. They gave us the basic operational
ideas, and some of the semantic tools we needed. But we quickly found out, much
to our dismay, that none of these approaches seemed to scale very well. Later, we
were hopeful that the approach of CASL [CoF04] might work, but then found that
their own base library was insufficiently factored and full of redundancies. Of the
vast algebraic specification literature around this topic, we want to single out the
work of Oriat [Ori00] on isomorphism of specification graphs as capturing similar
ideas to ours on extreme modularity. And it cannot be emphasized enough how
crucial Bart Jacobs’ book [Jac99] has been to our work.
Another line of influence is through universal algebra [Whi98, BS81], more pre-
cisely the constructions of universal algebra, rather than its theorems. That we can
manipulate signatures as algebraic objects is firmly from that literature. Of course,
we must generalize from the single-sorted equational approach of the mathematical
literature, to the dependently typed setting. As we eschew all matters dealing with
models, the syntactic manipulation aspects of universal algebra generalize quite
readily. The syntactic concerns are also why Lawvere theories [Law04] are not as
important to this work. Sketches [BW90] certainly could have been used, but would
have led us too far away from the elegance of using structures already present in
the λ-calculus (namely contexts) quite directly.
Institutions might also appear to be an ideal setting for our work. But even
as the relation to categorical logic has been worked out [GMdP+07], it remains
that these theories are largely semantic, in that they all work up to equivalence.
This makes the theory of institutions significantly simpler, however it also makes it
largely unusable for user-oriented systems: people really do care what names their
symbols have in their theory presentations.
After we had largely finished our work, we found various research threads which
had a lot in common with ours. They used different terminology, and frequently
provided no implementations.
Proceeding in chronological order, the Harper-Mitchell-Moggi work on Higher-
order Modules [HMM89] covers some of the same themes we do: a set of construc-
tions (at the semantic level) similar to ours is developed for ML-style modules.
However, they did not seem to realize that these constructions could be turned
into an external syntax, with crucial application to structuring a large library of
theories (or modules). Nor did they see the use of fibrations, since they avoided
such issues “by construction”. Moggi returned to this topic [Mog89], and did make
use of fibrations as well as categories with attributes, a categorical version of con-
texts. Post-facto, it is possible to recognize some of our ideas as being present in
Section 6 of that work; the emphasis is however completely different. In that same
vein, “Structured theory presentations and logic representations” [HST94] does have
a set of combinators. However, it seems flawed: Definition 3.3 of the signature of
a presentation requires that both parts of a union must have the same signature
(to be well formed) and yet their Example 3.6 on the next page is not well formed!
That being said, many parts of the theory-level semantics are the same. However, it
is our morphism-level semantics which really allows one to build large hierarchies
conveniently.
A rather overlooked 1997 Ph.D. thesis by Sherri Shulman [Shu97] presents a
number of interesting combinators. Unfortunately the semantics are unclear, espe-
cially in cases where theories have parts in common; there are heavy restrictions
on naming, and no renaming, which makes the building of large hierarchies fragile.
Nevertheless, there is much kinship here, especially that of extreme modularity. If
this work had been implemented in a mainstream tool, it would have saved us a lot
of effort.
[Tay99] does worry more about syntax. Although the semantic component is
there, there are no algorithms and no notion of building up a library. The categorical
tools are presented too, but not in a way to make the connection clear, and lead to
an implementable design.
MMT [RK13] is of course closest. But the structuring tools are still not as
nice as we’d like — [DHS09] shows some examples. Many of the problems that
we have identified as problems for scaling are still present. MMT does have some
advantages: it is foundation independent, and possesses some rather nice web-based
tools for pretty display. But their extend operation (named include) is theory-
internal, and its semantics is not given through flattening (which they have yet to
implement). The result is that their theory hierarchies explicitly suffer from the
“bundling” problem, as lucidly explained in [SvdW11], who introduce type classes
in Coq to help alleviate this problem. Furthermore, although theory morphisms
are first class, obtaining the “right” ones seems to be entirely manual.
Isabelle’s locales support locale expressions [Bal03], which are also reminiscent
of ours. However, we are unaware of a denotational semantics for them; furthermore,
neither combine nor mixin is supported. Axiom [JS92] does support theory
formation operations, but these are quite restricted, as well as defined purely op-
erationally. They were meant to mimic what mathematicians do informally when
operating on theories. No semantics for them has ever been published.
Coq has both Canonical Structures and type classes [SvdW11], but no combi-
nators to make new ones out of old. Similarly, Lean [dMKA+15] has some (still
evolving) structuring mechanisms but not combinators to form new theories from
old.
9. ACKNOWLEDGMENTS
We wish to thank Michael Kohlhase, Florian Rabe and William M. Farmer for
many fruitful conversations on this topic. We also wish to thank the participants
of the Dagstuhl Seminar 18341 “Formalization of Mathematics