© Springer International Publishing AG 2018. All rights are reserved.
15th Annual Conference on Systems Engineering Research
Disciplinary Convergence: Implications for Systems Engineering Research
Eds.: Azad M. Madni, Barry Boehm, Daniel A. Erwin, Roger Ghanem; University of Southern California
Marilee J. Wheaton, The Aerospace Corporation
Redondo Beach, CA, March 23-25, 2017
Categorical foundations for system engineering
Spencer Breiner (a), Eswaran Subrahmanian (a,b), Albert Jones (b)
(a) National Institute of Standards and Technology, spencer.breiner@nist.gov
(b) Carnegie Mellon University, sub@cmu.edu
Abstract
In this paper we argue that category theory (CT), the mathematical theory of abstract processes, could
provide a concrete formal foundation for the study and practice of systems engineering. To provide some
evidence for this claim, we trace the classic V-model of systems engineering, stopping along the way to (a)
introduce elements of CT and (b) show how these might apply in a variety of systems engineering contexts.
Keywords: Category theory, Foundations of system engineering, Mathematical modeling
Introduction
Systems are becoming more complex, both larger and more interconnected. As computation and
communication in system components go from novelty to norm, this trend will only intensify. In
particular, we have no generally accepted method for designing, testing and analyzing systems which mix
both physical and computational dynamics. We believe that a new formal foundation is required to model
and study such complex systems.
Existing approaches, typified by the V-model of systems engineering, are more heuristic than formal.
First we conceptualize the system, setting our various requirements and assumptions. Next we refine this
into a functional decomposition which details how our system will meet its goals. In realization, we map
these functions to components of our systems. Finally, we integrate these components into a true system,
testing along the way, before releasing the system for operation.
This says what we need to do, but not how to do it. A formal foundation would supplement this
framework with concrete tools and formal methods for accomplishing each step. Our goal in this paper is
to propose a candidate approach for such a foundation, based on a branch of mathematics called
category theory (CT).
We should mention some prior work associating CT and systems engineering. For example, CT is
listed as a foundational approach in the Systems Engineering Body of Knowledge (SEBOK, [1]), although
there is little detail associated with the entry. More substantively, Arbib & Manes [2] studied applications
of CT in systems control in the 1970s. This work was largely stymied by the unfamiliarity of categorical
ideas and the lack of good tools for implementing them (on which we will have more to say in the
conclusion).
CT is the mathematical theory of abstract processes, and as such it encompasses both physics and
computation. This alone makes it a good candidate for foundational work on modern systems. As we
proceed, we will also argue for other virtues, including expressivity, precision, universality and modularity.
To make our argument, we will trace through the classic V-model of systems engineering,
demonstrating along the way how CT might apply at each step in the process. We have chosen the
V-model not for validity (it oversimplifies) but merely for familiarity.
In tracing the V, we hope to accomplish two things. First, we aim to show the range of
categorical methods, as evidence that CT might provide a holistic foundation for systems
engineering. Second, and more important, we hope to introduce systems engineers to the language and
methods of CT, and pique the interest of the systems engineering community to investigate further. Our
hope is that one day soon this paper might serve as the preface to a much deeper study that systems
engineers and category theorists might write together.
1. Conceptualization
The first role for CT in systems engineering is as a precise technical language in which to express and
analyze models of systems information, ranging from theoretical predictions to raw data. The key feature
of CT in this respect is its abstraction. We can form categorical models from graphs, from logical
ontologies, from dynamical systems and more, and we can use categorical language to analyze the
relationships and interactions between these. To get a sense of what this looks like, we will model some
simple system architectures and the relationships between them.
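The categorical model for an abstract network is remarkably simple:
$$\mathbf{N} := \left\{\ \mathsf{Channel} \underset{\mathsf{target}}{\overset{\mathsf{source}}{\rightrightarrows}} \mathsf{Node}\ \right\} \qquad (1)$$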
The first thing to observe is that a category contains two types of entities, called objects and arrows.
Intuitively, we think of these as sets and functions, though they are abstract in the model itself. An
instance of the model replaces abstract objects and arrows with concrete sets and functions. It is not hard
to see that any network can be encoded as an instance of N, as in figure 1.
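As a minimal sketch of what such an instance looks like in code, the following Python fragment replaces the objects Channel and Node with finite sets and the arrows source and target with dictionaries. The node and channel names are hypothetical example data, and `is_instance` and `successors` are illustrative helpers, not part of any library.

```python
# Hypothetical example data: a three-node network with three channels.
nodes = {"a", "b", "c"}
channels = {"e1", "e2", "e3"}

# The two arrows of N become two functions Channel -> Node.
source = {"e1": "a", "e2": "b", "e3": "a"}
target = {"e1": "b", "e2": "c", "e3": "c"}


def is_instance(nodes, channels, source, target):
    """Check that source and target are total functions Channel -> Node."""
    return all(
        set(f) == channels and set(f.values()) <= nodes
        for f in (source, target)
    )


def successors(node):
    """Nodes reachable from `node` along a single channel."""
    return {target[c] for c in channels if source[c] == node}
```

Nothing here is specific to networks: any model with finitely many objects and arrows could be populated in the same way, one set per object and one dictionary per arrow.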
The key difference between categories and directed graphs is the set of construction principles which allow
us to combine the elements of our models. Foremost among these construction principles is arrow
composition; whenever we are given sequential arrows $A \xrightarrow{f} B \xrightarrow{g} C$, we can build a new arrow $f.g : A \to C$.
Another way to think of this is, when we draw categories as directed graphs, the arrows include paths of
edges as well as individual arcs. We also allow paths of length 0, called identities.
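To see why this is useful, consider the following simple model for a hierarchy of depth $n$:
$$\mathbf{H} := \left\{\ \mathsf{parent} : \mathsf{Node} \to \mathsf{Node},\quad \mathsf{root} : 1 \to \mathsf{Node},\quad \mathsf{parent}^{n} = \mathsf{const}.\mathsf{root}\ \right\} \qquad (2)$$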
Here the primary structure is the self-arrow parent : Node → Node, which sends each node to the level
above it in the hierarchy. By composing parent with itself we can trace our way up the hierarchy from
any node.
By itself, this is too flexible. There is nothing to ensure that all nodes are part of the same hierarchy
and, even worse, our “hierarchy” might contain loops! We can eliminate these worries by demanding that
the parent map is “eventually constant”: after $n$ repetitions, every node ends up at the same place. This
involves two ingredients: a construction and a path equation.
Fig. 1: Network as an N-instance
Categorical constructions generalize most set theoretic operations such as unions, intersections and
Cartesian products. The terminal object 1 stands in for a singleton set, and allows us to express the
notion of a constant value root ∈ Node. The path equation $\mathsf{parent}^{n} = \mathsf{const}.\mathsf{root}$ forces the $n$th parent
of any node to equal root, ensuring a single hierarchy with no loops.
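The path equation can be checked mechanically on any finite instance. The sketch below is one way to do so, assuming the instance is given as a set of nodes and a dictionary for parent; the function names and the example data are hypothetical.

```python
def iterate(f, n):
    """Return the n-fold composite of f with itself (n = 0 gives the identity)."""
    def composite(x):
        for _ in range(n):
            x = f[x]
        return x
    return composite


def is_hierarchy(nodes, parent, root, n):
    """Check the path equation parent^n = const.root on a finite instance."""
    well_typed = set(parent) == nodes and set(parent.values()) <= nodes
    if root not in nodes or not well_typed:
        return False
    nth_parent = iterate(parent, n)
    return all(nth_parent(x) == root for x in nodes)


# Hypothetical data: a tree of depth 2 whose root is its own parent.
tree_nodes = {"system", "engine", "cabin", "piston"}
tree_parent = {"system": "system", "engine": "system",
               "cabin": "system", "piston": "engine"}

# A "hierarchy" containing a loop, which the path equation rules out.
loop_nodes = {"x", "y"}
loop_parent = {"x": "y", "y": "x"}
```

Note that the equation forces the root to be its own parent, which is why the example tree maps "system" to itself.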
A more interesting example is the layered architecture L (figure 2), in which channels must conform to
a hierarchy of layers. Here the path equations constrain where channels may occur, while the + and /
constructions express the fact that channels may form either between layers (Γ) or within a layer ().
All of these models are fairly trivial. The main point is that the sorts of class modeling which systems
engineers already do are not too far away from a precise formal language. By carefully modeling our
concepts at the early stages of systems engineering we can express requirements more precisely,
identify misconceptions and inconsistencies, and establish concrete domain-specific languages. Best of
all, we get intuitive graphical presentations like those found in UML/SysML class diagrams without
sacrificing the semantic precision associated with OWL and other formal approaches to ontology.
CT also goes beyond these existing languages. A functor is a mapping between categories; it sends
objects to objects and arrows to (paths of) arrows, without changing the effects of composition. These
maps, along with other constructions like colimits and natural transformations, allow us to explicitly
identify and represent the relationships between individual categorical models, thereby linking them into
larger networks. This allows semantic ontologies to emerge organically from the bottom up, grounded in
practice, in contrast to the “upper ontology” approach (e.g., the Basic Formal Ontology [3]), which tries to
impose semantic structure from the top down.
A simple example is the idea that a hierarchy is a special type of network. This fact can be formalized
as a functor N → H. To define it we ask, for each component of N, what plays an analogous role in H?
The translation for Node is clear. In the hierarchy we have one channel for each node, so Channel also
maps to the same object Node. Since each channel maps from a node to its parent, target corresponds
with parent and source with the identity (zero-length path). Putting it all together, we have the functor
depicted in figure 3(a). Similarly, we can identify one hierarchy (of layers L) and two networks (of
channels C and layers L') in the layer architecture, corresponding to the four functors in figure 3(b). We
even have a path equation which acknowledges that the network of layers in L is just the
same as the network in H which is constructed from the hierarchy in L.
Fig. 2: Categorical model for layered architectures
Fig. 3: Functors translate between categorical models
The stylized models and relationships presented here are fairly trivial, but the general method of
categorical modeling is quite powerful. By varying the constructions we allow ourselves to use, CT
modeling can range in expressiveness from simple equations to full higher-order logic [12]. For more
thorough introductions to categorical modeling, see [23] or [10]. The main thing to remember is that
categorical methods provide tools for expressing and relating our formal models.
2. Decomposition
In the last section we met all the essential elements of category theory--objects and arrows,
composition, identities--except one: the associativity axiom. Given a sequence of three composable
arrows $A \xrightarrow{f} B \xrightarrow{g} C \xrightarrow{h} D$, we could first compose at $B$ and then at $C$, or vice versa. Both should yield the
same result: $(f.g).h = f.(g.h)$. When applied to processes, this axiom is so obvious it is difficult to
express in English:
Doing $f$ and then $g$, and then doing $h$
is the same as
doing $f$, and then doing $g$ and then $h$.
Because of this, there is no need to keep track of parentheses when we compose arrows.
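For ordinary functions the axiom can be observed directly. The following sketch reads a composite as "first one arrow, then the next" and compares the two bracketings on a range of inputs; the arrow names `f`, `g`, `h` and the helper `then` are hypothetical choices made for illustration.

```python
def then(first, second):
    """Diagrammatic composition: apply `first`, then `second`."""
    return lambda x: second(first(x))


def identity(x):
    """The zero-length path: do nothing."""
    return x


# Three hypothetical composable arrows on the integers.
f = lambda x: x + 1
g = lambda x: 2 * x
h = lambda x: x - 3

left = then(then(f, g), h)    # (f then g), then h
right = then(f, then(g, h))   # f, then (g then h)
```

Both bracketings describe the same pipeline, so a chain of any length can be written as a flat list of steps.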
This allows us to describe complex processes based on only two pieces of information: (i) the
descriptions of simpler subprocesses and (ii) the way they were chained together. Of course, systems
engineers know that complex emergent phenomena may arise from simple subprocesses. This does not
mean that compositional, categorical mathematics does not apply. Instead, it means that the
compositional representations of such systems may require greater complexity than the naïve models we
might produce from scratch. By demanding compositionality from the outset, we are forced to build
interaction into our models from the ground up!
One important step in this direction is to generalize the sorts of composition that we allow. In fact,
there are many different flavors of category theory, each of which supports a different notion of
composition. The plain categories that we met in the last section allow only unary (single-input) processes
and serial composition. Some varieties like groups, which formalize the mathematics of symmetry, restrict
ordinary categories to obtain simpler structures. Others like process categories and operads add in
additional construction principles like parallel composition and multiple input/output. Through these
constructions, categories axiomatize the most fundamental concepts in systems engineering: resources
and processes [7].
Fig. 4: Process decomposition as a string diagram
All of these share a common theme of composition and associativity. For groups, this allows us to
describe the way that arbitrary rigid motions can be decomposed into translations and rotations. More
generally, this allows us to express complicated structures in terms of smaller and simpler pieces. It can
also help to show when a chain of complicated operations has a simple and predictable outcome.
Process categories embody the mathematical structure of multi-resource functional
decomposition [7,4]. In the mathematical literature these are often referred to as “traced symmetric
monoidal categories”, but we feel that this nomenclature is too imposing given their simplicity and
importance. One particularly nice feature of these structures is that process categories support a
graphical syntax called string diagrams like the one in figure 4. Completely formal and technically precise,
these diagrams are nevertheless as intuitive and easy-to-read as flow charts.
Where string diagrams represent process flows, another class of structures called operads formalizes
the notion of a parts decomposition [21]. In an operad, the objects are interfaces and the arrows are
“wiring diagrams” which connect a set of small interfaces into one larger component. Here associativity
says that there is only one meaning for the phrase “a system of systems of systems.”
These representations make it easier to talk about relationships across scale. Some or all of the
subprocesses in figure 4 will have their own process decompositions. The only substantive constraint
on these decompositions is that they have the appropriate input and output strings. This leaves us with
one high-level categorical model P for the entire process and several low-level models $Q_i$ for the
individual subprocesses.
To express the relationship between these, we first combine the low-level pieces into a single
aggregate model $Q = \bigoplus_i Q_i$. This involves an operation called a colimit which generalizes set-theoretic
unions; building them requires explicitly representing the overlap between different models. Once we
build the aggregate model, we can then define a functor $P \to Q$ which essentially pastes copies of the
smaller diagrams $Q_i$ into the appropriate bubbles from P. This identifies an explicit model for the total
high-level process P inside the aggregate low-level model Q. Furthermore, we can also allow multiple
decompositions for a given subprocess, providing a framework for modularity and versioning.
3. Realization
During realization we turn our abstract models into concrete realizations. In spirit, the relationship
between these two is analogous to that between the logician's notions of syntax and semantics.
Roughly speaking, syntax is what we say and semantics is what we mean, or what we are talking about.
Models are like syntax: they describe how a product or system is supposed to work in terms of both
structure (decomposition and component interaction) and behavior (requirement and verification
specifications). Attaching semantics to these models means assigning each syntactic component to some
sort of concrete entity, in a way that mirrors the structure and behavior of the model.
Ultimately these concrete entities will be physical components and functioning source code, but before
we reach that point we must pass through many other, more abstract semantics. These might range from
the formal verification of a critical algorithm to a stochastic model of user behavior, but most have some
flavor of simulation. The motivating example to keep in mind is the simulation of a system in terms of
(discrete, continuous or hybrid) dynamical systems [15].
The key feature of the logician's semantics is compositionality: if we want to determine the truth of a
complex logical formula, it is enough to look at the truth values of its subformulas. This might seem to fail
for a given dynamical system: just because each component of my system is safe in isolation hardly
guarantees safety of the composite system. Doesn't the existence of emergent phenomena mean that the
behavior of a complex system is not determined by the behavior of its components? This
misunderstanding rests on a conflation of two distinct notions of “behavior”.
We can think of system behavior as a path through some high-dimensional state space; component
behavior is the projection of this path onto the subspace of component parameters. The problem is that
component dynamics in isolation trace out different paths than the projected system dynamics would.
This is why component safety in isolation does not entail system safety, even for the same component
metrics. This also means that there is no hope of composing individual component behaviors to derive
system behavior.
However, dynamical models, the differential equations which generate these paths, are composable:
we can derive the dynamical equations of a system from the dynamics of its components [24]. The
formula for this derivation will, of course, depend on how the components are connected to one another.
Each diagram like the one in Figure 4 generates its own formula. CT structures this relationship, making
the requirements of compositionality explicit through the language of categories and functors.
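A small numerical sketch may help to separate the two notions of behavior. It assumes two hypothetical open components, `tank` and `pump`, each given as a map from (state, input) to a derivative, and a hypothetical wiring `feedback` that feeds each component's state into the other's input. This is only an illustration of the idea, not the construction of [24]; the coefficients and the forward-Euler integrator are arbitrary choices.

```python
def euler(step, state, dt, n):
    """Integrate state' = step(state) with n forward-Euler steps of size dt."""
    for _ in range(n):
        derivative = step(state)
        state = tuple(s + dt * d for s, d in zip(state, derivative))
    return state


# Two hypothetical open components, each a map (state, input) -> derivative.
def tank(x, u):
    return -0.5 * x + u      # decays on its own, driven by the input u


def pump(y, v):
    return -y + v            # relaxes towards its input v


def feedback(f, g):
    """Wire each component's state into the other's input.

    The result is a closed system on the product state space (x, y),
    derived from the component equations and the wiring alone.
    """
    return lambda state: (f(state[0], state[1]), g(state[1], state[0]))


system = feedback(tank, pump)
isolated_tank = lambda state: (tank(state[0], 0.0),)

x_coupled, y_coupled = euler(system, (1.0, 0.0), 0.01, 500)
(x_alone,) = euler(isolated_tank, (1.0,), 0.01, 500)
```

With these coefficients each component decays to zero in isolation, while the coupled system grows, so the isolated path of `tank` differs from the projection of the system path. The system equations, by contrast, were obtained from the component equations by a fixed formula determined by the wiring.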
Logical semantics involves three main elements: (i) a syntactic model to be interpreted, (ii) an
assignment of syntactic elements to semantic objects, and (iii) a satisfaction relation which determines
whether this assignment meets the requirements of the model. However, traditional logic operates in a
fixed context of sets and functions (deterministic semantics), while CT broadens this to allow stochastic
semantics, dynamical semantics and more. Thus categorical semantics adds one further element, (iv) a
universe of semantic entities.
This approach relies on an important though informal distinction in CT between smaller, “syntactic”
categories and larger, “semantic” categories. Syntactic categories are like the architectural models
described in section 1, built directly from graphs (generators), path equations (relations) and
categorical structure (constructions).
Semantic categories instead use some other formalism, like set theory or matrix algebra, to define the
objects and arrows of a category directly. The prototypical example is the category of sets and functions,
denoted Sets, where composition (and hence path equations) is computed explicitly in terms of the rule
$(f.g)(x) = g(f(x))$. Many other semantic categories like Graph (graphs and homomorphisms) and Vect
(vector spaces and linear maps) can be constructed from set theoretic entities.
Once we adopt this viewpoint, the relationship between syntax and semantics can be represented as a
functor from one type of category to the other. We have already seen one example of this approach, in
figure 1, where we described a network instance in terms of a pair of functions. This is exactly the same
as a functor N → Sets: we map objects of N to objects of Sets and arrows of N to arrows of Sets (i.e., to
sets and functions).
The satisfaction relation for the semantic interpretation is determined by the preservation of categorical
structure. A good example is the coproduct “+”, used in our model for the layered architecture L (figure 2).
Not all functors L → Sets are semantically valid, only those which map the abstract coproduct in L to
a concrete coproduct (disjoint union) in Sets. We say that a model of L should preserve coproducts.
Implicit in any categorical model is a minimal set of construction principles required to preserve full
semantics.
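For finite sets, preservation of a coproduct is a condition that can be tested directly. The sketch below checks whether a candidate set, together with two inclusion maps, really is a disjoint union of its two parts. It uses the split of channels into between-layer and within-layer channels as an example; all names and data are hypothetical.

```python
def is_coproduct(a, b, s, inl, inr):
    """Check that (s, inl, inr) is a disjoint union of the finite sets a and b.

    inl and inr are dicts giving the inclusions a -> s and b -> s.
    """
    images = [inl[x] for x in a] + [inr[y] for y in b]
    no_collisions = len(set(images)) == len(images)   # injective and disjoint
    covers = set(images) == set(s)                    # nothing else in s
    return no_collisions and covers


# Hypothetical interpretation: channels between layers and within a layer.
between = {"up1", "up2"}
within = {"loop1"}

good_channels = {"c1", "c2", "c3"}
good_inl = {"up1": "c1", "up2": "c2"}
good_inr = {"loop1": "c3"}

# An invalid interpretation: one channel is counted as both kinds.
bad_channels = {"c1", "c2"}
bad_inl = {"up1": "c1", "up2": "c2"}
bad_inr = {"loop1": "c2"}
```

Both assignments are perfectly good families of functions, but only the first respects the intended meaning of “+”.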
Once we recognize that the traditional (logical) interpretations for a model M are the structure-preserving
functors M → Sets, we are in an easy position to generalize to a much wider array of
semantics. We have explicitly identified the necessary structural context (e.g., coproducts) in M, so we can
replace Sets by any other category which has these same features. We can use a category Dyn whose
objects are dynamical systems; a functor M → Dyn provides dynamical semantics. There is a category
Prob whose arrows are probabilistic mappings; a functor M → Prob describes stochastic semantics for
M. There is a computational category Type where arrows are algorithms; functors M → Type provide
computational interpretations for M. We can often compose these, for example mapping a model to a
dynamical system, and then mapping this to a computational simulation. Sometimes we can even mix
semantics together, so that in figure 4 we could give dynamical models for Heat and Simmer, a
computational model of Control and a stochastic Measure, and compose these to give a hybrid
dynamical model for the whole system.
4. Integration
The main role of our models in system integration is to collect and manage the tremendous amount of
structured data collected and analyzed during the integration process. This data is necessarily
heterogeneous, multi-scale and dispersed across many models and experts. Categorical models have
several nice features which can support the federation of this data.
First of all, we can regard a finite syntactic category M (like one of the architectural models in section
1) as a database schema [14,19,20]. Roughly speaking, the objects are tables and the arrows are foreign
keys. This means that we can use the models already produced during conceptualization and
decomposition to store the data generated during integration. Formally this depends on the functorial
semantics discussed in the previous section; we can think of an instance of the database as a functor
M → Sets mapping each table to a set of rows. Notice that this approach automatically ties the data that
we produce to our semantic models.
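To make the correspondence concrete, the sketch below renders the network model from section 1 as a relational schema using Python's built-in `sqlite3` module. The table and column names follow the model; the rows and the helper `insert_is_rejected` are hypothetical. It is an illustration of the objects-as-tables reading, not the translation defined in [14,19,20].

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite leaves enforcement off by default

# Objects become tables; the arrows source and target become foreign keys.
conn.executescript("""
    CREATE TABLE Node (id TEXT PRIMARY KEY);
    CREATE TABLE Channel (
        id     TEXT PRIMARY KEY,
        source TEXT NOT NULL REFERENCES Node(id),
        target TEXT NOT NULL REFERENCES Node(id)
    );
""")

# A database instance assigns a set of rows to each table.
conn.executemany("INSERT INTO Node VALUES (?)", [("a",), ("b",), ("c",)])
conn.executemany("INSERT INTO Channel VALUES (?, ?, ?)",
                 [("e1", "a", "b"), ("e2", "b", "c")])


def insert_is_rejected(row):
    """Return True if the database refuses a Channel row."""
    try:
        conn.execute("INSERT INTO Channel VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError:
        return True
    return False


targets = conn.execute(
    "SELECT target FROM Channel WHERE source = 'a'").fetchall()
```

The `NOT NULL` foreign keys are what make each column a total function from channels to nodes: a row whose target is not a known node is rejected by the database itself.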
A more significant challenge is the dispersion of data across many engineers using many different
models. In order to build a holistic picture of our system, we need some way of putting models together
and aggregating the data they contain. The CT approach involves a categorical construction called a
colimit, together with an additional twist.
A colimit is a categorical construction that generalizes unions, allowing us to build new objects by
gluing together old ones. For example, any graph can be constructed using colimits by gluing edges
together at nodes. To integrate two objects using a colimit, we first explicitly identify their overlap as a
third object, along with two maps embedding the overlap into each component. Given this data, the
colimit construction then produces a fourth object together with two maps which embed the original
components into the new object. See figure 5(a).
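In the category of finite sets this construction (a pushout) can be computed with a few lines of code. The sketch below glues two sets along a declared overlap; the function name `pushout`, the tagging scheme and the example data are hypothetical choices for illustration.

```python
def pushout(a, b, overlap, into_a, into_b):
    """Glue the finite sets a and b along `overlap`.

    into_a and into_b are dicts embedding the overlap into each component.
    Returns the glued set together with the two maps into it.
    """
    # Start from the disjoint union, tagging elements by their origin.
    parent = {("a", x): ("a", x) for x in a}
    parent.update({("b", y): ("b", y) for y in b})

    def find(z):
        """Follow links to the chosen representative of z's class."""
        while parent[z] != z:
            z = parent[z]
        return z

    # Identify the two images of every overlap element.
    for o in overlap:
        rep_a = find(("a", into_a[o]))
        rep_b = find(("b", into_b[o]))
        parent[rep_b] = rep_a

    glued = {find(z) for z in parent}
    from_a = {x: find(("a", x)) for x in a}
    from_b = {y: find(("b", y)) for y in b}
    return glued, from_a, from_b


# Hypothetical data: two node sets sharing one node under different names.
left_nodes = {"n1", "n2"}
right_nodes = {"m1", "m2", "m3"}
shared = {"s"}
glued, from_left, from_right = pushout(
    left_nodes, right_nodes, shared, {"s": "n2"}, {"s": "m1"})
```

The overlap has to be declared explicitly: nothing in the data says that "n2" and "m1" are the same node until the two embeddings say so.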
The twist is that, instead of looking at categorical constructions inside our models, now we are
interested in performing colimits with our models. This approach depends on the fact that CT is self-
referential: the methods of CT can be applied to study categories themselves. In particular, there is a
semantic category Cat whose objects are categories and whose arrows are functors. Colimits in this and
related semantic contexts can be used to define model integration. A very simple example is given in
figure 5(b).
In fact, we can form colimits from any number of components, so long as we accurately represent their
overlaps (and overlaps of overlaps, etc.), providing a scheme for wider integrations. However,
representing all those overlaps may be inefficient. Another alternative is to integrate serially, adding in
one new model at a time. CT provides us with a language to state and prove that either approach is valid,
and that the two options will yield equivalent results [25].
As for heterogeneity, CT constructions called sheaves have recently been proposed as “the canonical
datastructure for sensor integration” [18]. The main idea is that when different sensors capture
overlapping information, it must be restricted or transformed before it can be compared. In the simplest
example, to identify overlapping images we must first crop to their common ground (restriction) before
comparing the results. A simplistic algorithm would ask for perfect agreement on the restriction, but a
more sophisticated integration might allow small differences in shading or perspective (transformation).
We can also compare different types of information, so long as we can project them to a common context;
we might match up audio and video by translating both to time series and looking for common patterns.
CT provides the language and spells out the requirements for translating between contexts in this way.
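The restrict-then-compare step can be sketched in a few lines. The example below assumes that each reading is a dictionary from positions to values and that agreement is judged pointwise up to a tolerance; both are simplifying assumptions made for illustration, and the code shows only the consistency check on overlaps, not a sheaf in the sense of [18].

```python
def restrict(reading, region):
    """Restrict a reading (dict: position -> value) to a sub-region."""
    return {p: v for p, v in reading.items() if p in region}


def consistent(r1, r2, tolerance=0.0):
    """Compare two readings on their common ground, up to a tolerance."""
    common = set(r1) & set(r2)
    s1, s2 = restrict(r1, common), restrict(r2, common)
    return all(abs(s1[p] - s2[p]) <= tolerance for p in common)


# Hypothetical data: two cameras whose fields of view overlap at position 2.
camera_a = {0: 10.0, 1: 12.0, 2: 15.0}
camera_b = {2: 15.5, 3: 9.0}
```

With zero tolerance the two cameras disagree on their overlap; allowing a difference of up to 1.0 accepts them as views of the same scene.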
Finally, by mixing colimits with functors, we can connect our models across layers of abstraction [6].
Suppose that H is a model one level of abstraction above that of M and N in figure 5. Both M and N are
more detailed than H, but each only covers half the range. When we put them together, though, they do
cover the same range: every entity of H can be defined by mixing structures from M and from N.
Formally, this means that we can construct a refinement functor from H to the colimit of M and N, which tells us how to
compute high-level characteristics in terms of low-level ones, helping to trace high-level requirements to
low-level performance.
Fig. 5: The colimit construction
5. Operation
In operation, systems are never static. Components fail and need to be replaced. New models and
versions require tweaks to existing production and control systems. New technology or regulation changes
the environment in which our systems operate. Because of this, it is critical that our models should be
relatively easy to maintain and update. Here again, categorical methods have some nice features which
recommend them.
One significant challenge in updating a model is that we must take existing data attached to the
original model and shift it over to the new one. Thinking of our models as domain-specific languages, we
must translate our data from one language to another. These processes are often messy and ad hoc, but
categorical constructions can help to structure them.
As we mentioned in the last section, a class-type categorical model N like those discussed in section 1
can be translated more-or-less directly into database schemas [14,19,20] where objects are tables and
arrows are foreign keys. An instance of the database is a functor N → Sets which sends each abstract
table to a concrete set of rows. By generating our data stores directly from models, our data is
automatically tied to its semantics.
We can then use functors to formalize the relationship between old and new models. This will provide
a dictionary to guide our translation. Moreover, expressing the transformations in these terms can help to
organize and explain certain inevitable features of this process.
A good example is the phenomenon of duality between models and data. A meticulous reader will
have noted that, in the discussion of architectural models, we said that “every hierarchy is a special kind
of network”, but then proceeded to define a functor N → H. The direction has reversed!
The categorical formulation explains this fact: given a functor N → H and an instance H → Sets, we
can compose these at H to obtain an instance N → Sets. So every functor between syntactic models
defines a mapping of instances in the opposite direction. We might call this operation model restriction or
projection, and categorically speaking it is simply composition.
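For the functor from networks to hierarchies described in section 1, this restriction is short enough to write out. The sketch below takes a hierarchy instance (a set of nodes and a parent dictionary) and returns the corresponding network instance; the function name and the example data are hypothetical.

```python
def hierarchy_to_network(nodes, parent):
    """Restrict a hierarchy instance along the functor N -> H.

    Node and Channel are both sent to the object Node of H, source to the
    identity and target to parent, so the network instance is obtained by
    composition alone.
    """
    return {
        "Node": set(nodes),
        "Channel": set(nodes),              # one channel per node
        "source": {x: x for x in nodes},    # identity (zero-length path)
        "target": dict(parent),             # each channel points at the parent
    }


# Hypothetical hierarchy instance of depth 2.
nodes = {"system", "engine", "piston"}
parent = {"system": "system", "engine": "system", "piston": "engine"}

network = hierarchy_to_network(nodes, parent)
```

The functor goes from N to H, but the data moves from hierarchies to networks, which is exactly the reversal of direction noted above.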
While composition allows us to restrict data backwards along a functor, subtler and more significant
constructions called Kan extensions allow us to push data in the same direction as a functor [20]. In many
cases, data demanded by the new model will be unavailable in the old; in others, we may split one
concept into two, or vice versa. In all of these cases, Kan extensions provide explicit instructions for
building a “best approximation” to the old data, subordinate to the new schema.
Remarkably, the same operation of Kan extension can also be used to encode quantification in formal
logic [17] and periodic states in dynamical systems [15]. This points to a critically important aspect of
categorical methods: uniformity. The abstraction of CT allows us to apply the same set of tools to a
remarkably diverse set of problems and circumstances.
This can be problematic for beginners: even simple applications of CT may require learning several
abstract constructions. Why bother, when there are easier solutions to this problem or that? The value of
the CT approach only becomes apparent for more substantive problems, where the same familiar tools
can still be applied.
Another nice property of categorical models is modularity, which is supported by the fact that the
colimit construction is a functor. Suppose, for example, that we extend one of the models in figure 5(a) via
a functor N → N′. A categorical construction principle for the colimit then guarantees that we can build a
new map from the colimit of M and N to the colimit of M and N′. This allows us to update domain-specific models locally and
then lift these changes to a global context.
More generally, the category theoretic property of naturality (over the diagram of the colimit) encodes
the restrictions which must be satisfied if updates to multiple components are to be consistent with one
another. Other categorical constructions called fibrations have been useful in formalizing more general
9
bidirectional transformations, where updates may not be consistent with one another [13,9]. In fact, the
elucidation of this concept of naturality was the motivating goal in the original development of CT;
categories and functors were merely the supporting concepts which underpin ``natural transformations''
[11].
Our discussion here has tried to indicate the potential breadth of categorical analysis. In so doing, we
have sacrificed depth in return. There is much more to be said.
Conclusion
One by one, the elements of category theory may not seem so impressive. We already have OWL for
representing semantic information, and good tools for interacting with databases. The UML/SysML
language family allows us to build graphical models and translate them into code stubs for programming.
Modelica and other modeling languages allow us to describe component-based decompositions and link
these to dynamical simulations. R and other software provide tools for statistical modeling.
The real value of CT is that it provides a context in which all of these can interact, and a rigorous
language for defining and analyzing those interactions. Now we have a chance to formalize entire
toolchains and workflows: we can agree on a graphical model, produce from it a semantic (logical) model
and populate it with data from an existing schema. We can use that data to derive a dynamical model,
and transform this into a computational simulation before piping the results to statistical software for
analysis. This entire process can be structured by categorical models.
This indicates why systems engineering offers an ideal test bed for the emerging discipline of applied
category theory. First, there is no avoiding the need to employ formal methods from multiple disciplines.
The details of our system exist at different scales and layers of abstraction. The need to interface
between many groups and researchers generates many demands: precise language to prevent
misunderstanding, intuitive (e.g., graphical) representations for easy communication, and structural
modularity for putting these pieces together.
Today, CT can supply plausible suggestions for meeting all of these requirements and more. However,
much work is required to turn this promise into practice. We can identify at least two important obstacles
which have stymied the growth of applied category theory.
The first of these is CT’s learning curve, which is undeniably steep, but has become more gentle in recent
years. New textbooks [16,22] targeted at scientists and undergraduates have made the mathematical
ideas more accessible. New applications in areas like chemistry [7], electrical engineering [5] and
machine learning [8] have broadened the base of examples to more concrete, real-world problems.
A more substantial obstacle is tool support. Today CT can solve many problems at the conceptual
level, but there are few good tools for implementing those solutions. Outside of functional programming
(one of the major successes of CT) most software is academic, and it is neither simple enough nor
powerful enough to address system-scale demands. Addressing this deficiency will require substantial
funding and a concerted effort to bring together mathematicians with domain experts to attack complex,
real-world problems.
Fortunately, this requirement is less daunting than it seems. Because CT generalizes many other
formalisms, we should be able to use existing tools to solve categorically formulated problems. By turning
a category into a logical theory we can use an OWL theorem prover for validation. To analyze the
behavior of a functional model, we can derive a Petri net for simulation. By projecting our categorical
models back into existing formalisms, we can piggyback on existing tools and methods. The results of
these analyses can then be lifted back to the categorical level for a holistic appraisal.
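One way to picture such a projection is to send a small categorical schema to SQL, in the spirit of the database work cited in [14,20]: each object becomes a table and each arrow becomes a foreign-key column on its source table. The sketch below is an illustration under stated assumptions, not a definitive translation. SQL stands in here for the OWL and Petri net targets named above, the names `Arrow` and `schema_to_sql` and the example schema are hypothetical, and path equations between arrows are not translated.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Arrow:
    """A hypothetical generating arrow of a schema: name, source and target object."""
    name: str
    source: str
    target: str


def schema_to_sql(objects, arrows):
    """Project a schema to SQL DDL: one table per object,
    one foreign-key column per arrow leaving that object.
    Path equations, if any, are ignored in this sketch."""
    for a in arrows:
        if a.source not in objects or a.target not in objects:
            raise ValueError(f"arrow {a.name} refers to an unknown object")
    statements = []
    for obj in objects:
        columns = ["id INTEGER PRIMARY KEY"]
        columns += [
            f"{a.name} INTEGER NOT NULL REFERENCES {a.target}(id)"
            for a in arrows
            if a.source == obj
        ]
        statements.append(
            f"CREATE TABLE {obj} (\n  " + ",\n  ".join(columns) + "\n);"
        )
    return statements


# Example schema (an assumption for illustration).
objects = ["System", "Component", "Requirement"]
arrows = [
    Arrow("part_of", "Component", "System"),
    Arrow("satisfied_by", "Requirement", "Component"),
]
ddl = schema_to_sql(objects, arrows)

# Hand the projected schema to an existing tool: an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
for statement in ddl:
    conn.execute(statement)
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
```

Once the schema lives in a conventional database, ordinary SQL tooling can be used for validation and querying, and the results can be read back against the original objects and arrows.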
We envision an open, CT-based platform for information modeling and analysis. The platform should
support modules for the various CT constructions (e.g., functors, colimits) and translations (OWL, SQL,
Petri nets), which could then be assembled on a case-by-case basis to address specific problems. In the
long run, such a platform would be applicable across many domains, but to get there we first need to drill
down and provide a proof of concept. Systems engineering is the perfect candidate.
Disclaimer
Any mention of commercial products within NIST web pages is for information only; it does not imply
recommendation or endorsement by NIST.
References
1. Abran A, Moore JW, Bourque P, Dupuis R, Tripp LL. Software engineering body of knowledge. New York: IEEE Computer Society;
2004.
2. Arbib M, Manes G. Foundations of system theory: decomposable systems. Automatica 1974; 10(3):285-302.
3. Arp R, Smith B, Spear AD. Building ontologies with basic formal ontology. Cambridge: MIT Press; 2015.
4. Baez J, Stay M. Physics, topology, logic and computation: a Rosetta Stone. In: Coecke, B editor. New Structures for Physics 2011.
Heidelberg: Springer; p. 95-168.
5. Baez J, Fong B. A compositional framework for passive linear networks. arXiv preprint 2015:1504.05625.
6. Breiner S, Subrahmanian E, Jones A. Categorical models for process planning. Under review: Computers and Industry, 2016.
7. Coecke B, Fritz T, Spekkens RW. A mathematical theory of resources. Information and Computation 2014; 250:59-86.
8. Culbertson J, Sturtz K. Bayesian machine learning via category theory. arXiv preprint 2013:1312.1445.
9. Diskin Z. Algebraic models for bidirectional model synchronization. In Czarnecki K, et al. editors. International Conference on Model
Driven Engineering Languages and Systems 2008. Springer, p. 21-36.
10. Diskin Z, Maibaum T. Category theory and model-driven engineering: From formal semantics to design patterns and beyond. In
Cretu LG, Dumitriu F, editors. Model-Driven Engineering of Information Systems: Principles, Techniques, and Practice 2014.
Toronto: Apple; p. 173-206.
11. Eilenberg S, Mac Lane S. General theory of natural equivalences. Trans. AMS 1945; 58(2):231-294.
12. Jacobs B. Categorical logic and type theory. New York: Elsevier; 1999.
13. Johnson M, Rosebrugh R, Wood RJ. Lenses, fibrations and universal translations. Math. Struct. in Comp. Sci. 2012; 22(01):25-42.
14. Johnson M, Rosebrugh R, Wood RJ. Entity-relationship-attribute designs and sketches. Theory and Applications of Categories
2002; 10(3):94-112.
15. Lawvere FW. Taking categories seriously. Revista Colombiana de Matematicas 1986; XX:147-178.
16. Lawvere FW, Schanuel SH. Conceptual mathematics: a first introduction to categories. Cambridge: Cambridge; 2009.
17. MacLane S, Moerdijk I. Sheaves in geometry and logic: A first introduction to topos theory. New York: Springer Science & Business
Media; 2012.
18. Robinson M. Sheaves are the canonical datastructure for sensor integration. arXiv preprint 2016:1603.01446.
19. Rosebrugh R, Wood RJ. Relational databases and indexed categories. In Seely RAG, editor. Proceedings of the International
Category Theory Meeting 1991 1992. Providence: Canadian Mathematical Society (vol. 13):391-407.
20. Spivak DI. Functorial data migration. Information and Computation 2012; 217:31-51.
21. Spivak DI. The operad of wiring diagrams: Formalizing a graphical language for databases, recursion, and plug-and-play circuits.
arXiv preprint 2013:1305.0297.
22. Spivak DI. Category theory for the sciences. Cambridge: MIT Press; 2014.
23. Spivak DI, Kent RE. Ologs: a categorical framework for knowledge representation. PLoS One 2012; 7(1):e24274.
24. Spivak DI, Vasilakopoulou C, Schultz P. Dynamical systems and sheaves. arXiv preprint 2016:1609.08086.
25. Wisnesky R, Breiner S, Jones A, Spivak DI, Subrahmanian E. Using category theory to facilitate multiple manufacturing service
database integration. J Comput Inf Sci in Eng, In press.