Modules with Interfaces for Dynamic Linking and Communication

Yu David Liu Scott F. Smith
Department of Computer Science
The Johns Hopkins University
{yliu, scott}@cs.jhu.edu
Abstract. Module systems are well known as a means for giving clear
interfaces for the static linking of code. This paper shows how adding ex-
plicit interfaces to modules for 1) dynamic linking and 2) cross-computation
communication can increase the declarative, encapsulated nature of mod-
ules, and build a stronger foundation for language-based security and
version control. We term these new modules Assemblages.
1 Introduction
Module systems traditionally excel in static linking. In typical module systems
such as ML functors [Mac84], mixins [BC90,DS96,HL02], Units [FF98] and Jiazzi
[MFH01], each module has a list of features (including functions, classes, types,
submodules, etc.) as exports, and a list of imported features. Applications can
then be built by statically linking together a collection of modules. Definitions
of imported features are unknown when the module is written, but all names
must be resolved the moment the module is loaded and executed.
The rapid evolution of the Internet has changed the landscape of software
design, and now it is more common that encapsulated code segments contain
name references that can only be resolved at runtime. Applications of this nature
can generally be placed into two distinct categories, which we term dynamic
plugins and reactive computations, respectively.
Dynamic Plugins A dynamic plugin is our term for dynamically linked code:
a piece of code is dynamically plugged into an already running computation.
Dynamic plugins are ubiquitous in modern large-scale software designs, from
browser and operating system plugins, to incremental as-needed loading of
application features, to dynamic update of critical software systems that demand
non-stop service. When the main application is loaded, future dynamic plugins
may be completely unknown, and name references to them must be bound at
runtime.
Reactive Computations Reactive computations are the collection of au-
tonomous coarse-grained computations that are reactive to requests in a dis-
tributed environment. A reactive computation has its own collection of objects;
so, a thread in a JVM is not a reactive computation, but a whole JVM is, as is
an application domain in the Microsoft CLR. In the grid computing paradigm,
each grid cell can be viewed as a reactive computation communicating with other
cells. The communication between a Java applet and its loading virtual machine
can also be viewed as one between two reactive computations. Compared to a dy-
namic plugin, reactive computations are much more loosely coupled since object
references cannot be shared; this is because the computations are on different
nodes or involve parties with limited trust in each other.
As with dynamic plugins, cross-computation invocation also requires name
binding at runtime: when a computation is loaded, it cannot yet know about the
existence of other computations it intends to communicate with.
Previous research on dynamic linking [Blo83,LB98,SPW03,DLE03] and soft-
ware updating [HMN01,BHSS03] also focuses on the dynamic plugin problem,
and numerous projects on remote invocation such as RPC and RMI have tar-
geted reactive computations. In this paper we develop a new module-centered
approach to these two forms of computation which offers advantages over the
existing approaches by making interfaces more explicit, and increasing under-
standing and expressiveness of the code. In particular, we develop a new module
theory that, along with a standard form of interface for static linking, also in-
corporates explicit interfaces for dynamic plugins and reactive computations.
2 Our Approach: Assemblages
In this paper, we present a type safe module calculus with a single notion of
module that supports static linking, dynamic linking, and communication be-
tween reactive computations. Our modules are called assemblages. Assemblages
are code blocks that can be statically linked with one another to form bigger
assemblages. When an assemblage is loaded into memory, it becomes an assem-
blage runtime (or runtime for short), which serves as a reactive computation
with an explicit interface for cross-computation communication and an explicit
interface for dynamically plugging other assemblages into its current runtime.
All interactions of a runtime should be through these interfaces alone, giving
complete encapsulation.
2.1 Basic Model
Fig. 1 shows the three fundamental processes our calculus addresses. We now
introduce them separately, together with concepts and terms we will use through-
out the paper.
[Fig. 1. Three Fundamental Processes: (a) Static Linking, (b) Dynamic Linking,
(c) Cross-computation Communication]
Static Linking Our module calculus comes with a fairly standard notion of
static linking. Fig. 1 (a) shows expression A + A′, which represents the static
linking of assemblages A and A′. One somewhat unusual feature is that
assemblages themselves are first-class values in our calculus, and static
linking is also a first-class expression: linking can happen anywhere in a
program. Each assemblage is associated with a list of static linkers, each of
which defines which features it imports and which it exports. In the static
linking process, suppose A and A′ each have a static linker named ν. The pair of
static linkers is then matched against each other: imports of A are satisfied by
exports of A′ and vice versa. The resulting assemblage has a new static linker
named ν, which bears the exported features of both static linkers. Static
linkers that are not matched are carried over to the composed assemblage
(for example, when A has a static linker named ν′ but A′ does not). Notice that
A and A′ are stateless code entities. Units, mixin module systems, and various
other calculi for the static linking of code fragments [Car97,DEW99,WV00,AZ02]
all work in a related way, so this aspect of our theory is not particularly
unique.
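The matching of static linkers described above can be sketched in Python. This
is only an illustration under simplifying assumptions: an assemblage is reduced
to a dictionary from static linker names to linkers, types are ignored, and the
names StaticLinker and link are hypothetical rather than part of the calculus.
The sketch also assumes that imports satisfied by neither side remain as
imports of the merged linker.

```python
from dataclasses import dataclass, field


@dataclass
class StaticLinker:
    """Hypothetical stand-in for a static linker."""
    imports: set = field(default_factory=set)     # names of imported features
    exports: dict = field(default_factory=dict)   # feature name -> definition


def link(a, b):
    """Sketch of A + A': `a` and `b` map static linker names to StaticLinker."""
    result = {}
    for name in sorted(a.keys() | b.keys()):
        if name in a and name in b:
            left, right = a[name], b[name]
            clash = left.exports.keys() & right.exports.keys()
            if clash:
                raise ValueError(f"feature exported twice on {name}: {sorted(clash)}")
            # The merged linker bears the exported features of both linkers.
            exports = {**left.exports, **right.exports}
            # Assumption: imports that neither side satisfies stay imported.
            imports = (left.imports | right.imports) - set(exports)
            result[name] = StaticLinker(imports, exports)
        else:
            # Unmatched static linkers are carried over unchanged.
            result[name] = a[name] if name in a else b[name]
    return result
```

For instance, linking an assemblage that imports send and exports log with one
that imports log and exports send leaves a merged linker with both exports and
no pending imports for those two features.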
Dynamic Linking Dynamic linking is illustrated in Fig. 1 (b); expression
plugin_{ν↦ν′} A′ in our calculus triggers a dynamic linking. Assemblage A′ is
loaded and linked into the current assemblage runtime Ar where the plugin
expression is defined. The interesting thing here is not only that A′ is an
assemblage, but also that Ar is an assemblage runtime. Thus, while some projects
[Blo83,FF98,SPW03] recognize that dynamic plugins can be modelled as modules,
we additionally recognize that dynamic linking happens between a module runtime
and a module. By equipping Ar with interfaces describing its dynamic linking
behaviors (which we call dynamic linkers), a successful dynamic linking process
can thus be conceived as a bi-directional interface matching between the plugin
initiator’s codebase module and the module representing the dynamic plugin.
We believe this is more precise and explicit than the unidirectional notion of
dynamic linking found in other module systems, where the initiating computation
has no explicit interface to the module being linked.
A dynamic linker is the dynamic linking interface of an assemblage. It
specifies the dynamic linking behaviors after the assemblage is loaded to become
an assemblage runtime. Specifically, it defines what type of dynamic plugins the
assemblage expects. In Fig. 1 (b), the plugin-initiating assemblage runtime has
a dynamic linker named ν, and when expression plugin_{ν↦ν′} A′ is evaluated,
dynamic linker ν of the initiating assemblage runtime is matched with the static
linker ν′ of A′. This may seem like a mismatch, linking a static linker and a
dynamic linker, but the linking is in fact occurring across sorts: a runtime is
being linked with a piece of code. The result of a plugin expression is that the
runtime becomes the original runtime plus the runtime form of the newly added
plugin. Dynamic linking here does not increase the number of runtimes, because
dynamic linking is a tightly-coupled interaction. Java applets are a
non-example: they are not tightly coupled with the loading JVM, so they are best
not viewed as dynamically linked code. Individual applets are better modelled as
distinct runtimes communicating with their inhabiting VM, using the
cross-computation communication we introduce next.
Cross-Computation Communication Fig. 1 (c) demonstrates how expression
connect_{ν↦ν′} Ar′ sets up a cross-computation connection between two
assemblage runtimes, the runtime containing the connect expression (Ar) and
the runtime Ar′. In a similar vein to dynamic linking, both the party receiving
the cross-computation invocation and the party initiating the invocation must
have explicit interfaces declaring the form of this interaction on their
codebase assemblages, which we call connectors. A successful communication
process is thus a bi-directional interface matching between the initiator module
runtime and the receiver module runtime.
A connector specifies the cross-computation communication interface of an
assemblage runtime. Specifically, it defines what type of assemblage runtimes
the assemblage expects to communicate with. In Fig. 1 (c), the connect-initiating
assemblage runtime has a connector named ν, and when its connect_{ν↦ν′} Ar′
expression is evaluated, connector ν of the initiating assemblage runtime is
matched with connector ν′ of the assemblage runtime Ar′. The connect expression
does not merge the runtimes, and since the runtimes must remain distinct,
all parameters passed between the two must be passed by (deep) copy. These
runtimes may be on the same or different network nodes.
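The pass-by-deep-copy requirement can be illustrated with a small Python sketch.
It is not the semantics of the calculus, only an analogy: the class Runtime, its
report method and the helper invoke are hypothetical names, and both "runtimes"
live in one Python process for simplicity.

```python
import copy


class Runtime:
    """Hypothetical runtime that exports one function over a connection."""

    def __init__(self):
        self.received = []

    def report(self, reading):
        # The receiver keeps its own private copy of the argument.
        self.received.append(reading)
        return len(self.received)


def invoke(receiver_function, argument):
    """Sketch of a cross-computation call: argument and result cross by deep copy."""
    result = receiver_function(copy.deepcopy(argument))
    return copy.deepcopy(result)
```

Because the argument is copied before it crosses the boundary, a later mutation
by the caller is not visible to the receiver, so no object reference is shared
between the two runtimes.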
Before presenting further details of the system, we introduce a simple real-
world example.
VolcanoMain = assemblage {
    static linker NetLib {
        . . . statically linked some network library . . .
    }
    dynamic linker DetectorPlugin {
        import detectMethod()
        export getEnv == . . . get current environment snapshot . . .
    }
    connector CodeUpdate {
        import getDetectCode();
        export check(condition) == . . . check applicability of detect model . . .
    }
    ...
    // local feature implementation
    updateDetector == λx.(let cb = connect_{CodeUpdate↦Code} x in
                          let code = cb getDetectCode () in
                          let comp = plugin_{DetectorPlugin↦Detect} code in
                          comp.detectMethod())
    ...
}
Fig. 2. A Sensor Network Example
2.2 A Real World Example
In this section, we introduce a simplified volcano sensor network example to
demonstrate the basic ideas of assemblages. The example is in Fig. 2. For the
purpose of improved readability, we here use sugared syntax slightly different
from our calculus; type declarations are also omitted here for brevity. Initially
we define an example with core features only, and will extend it below to illustrate
more advanced features of our calculus.
In a typical sensor network [HSW+00] for a volcano sensing application, a
number of tiny smart sensor nodes are scattered in the crater of a volcano, each
of which functions as an independent computer. In addition to the parts for
regular computation, each sensor node is also equipped with a sensing device to
collect environment information, such as temperature, electromagnetism, etc.
Different sensor nodes can communicate with each other; at least some sensor
nodes can communicate with a base station situated outside the volcano to report
data or receive control information.
One critical constraint in the design of such a system is that, once sensor
nodes are physically deployed, they are unlikely to be reclaimable for purposes
such as software upgrade. In the example shown in Fig. 2, function
updateDetector provides support for dynamically updating the mathematical model
of volcano detection used in the sensor: in reality, it is not uncommon for
scientists to adjust the mathematical models after the sensors have been
deployed. Without getting into too much detail, the function works as follows:
it first sets up a connection cb with the base station, represented by function
argument x, via connect_{CodeUpdate↦Code} x. A new version of the detection
model code is then acquired by invoking cb getDetectCode (); subsequently, it is
dynamically plugged into the current runtime via
plugin_{DetectorPlugin↦Detect} code. The up-to-date detection method may then
be invoked via comp.detectMethod().
[Fig. 3. An Example on (a) Rebindable Dynamic Linkers (b) Generative Stateful
Connectors]
This example demonstrates some basic uses of dynamic linkers and connectors.
Observe how the dynamic linker and connector are both bi-directional interfaces:
they import some features and export others. Interface matching is a component
of both dynamic linking and connection. The expression
connect_{CodeUpdate↦Code} x indicates that the CodeUpdate connector of the
current assemblage runtime is connected to the Code connector of x. A connection
can be successfully established only if each party exports what the other party
needs to import (extra exports can be present and are ignored). We also show a
standard static linker NetLib, expecting some network libraries; this import
will need to be satisfied before the sensor program is up and running.
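The matching condition for a connection can be written down as a short Python
predicate. This is a minimal sketch that compares feature names only and ignores
types; the function name can_connect is hypothetical, and the contents of the
base station's Code connector (including getVersion) are assumptions made for
illustration, since Fig. 2 only shows the CodeUpdate side.

```python
def can_connect(party_a, party_b):
    """Each party is a pair (imports, exports) of sets of feature names."""
    imports_a, exports_a = party_a
    imports_b, exports_b = party_b
    # Both directions must hold; exports that nobody imports are simply ignored.
    return imports_a <= exports_b and imports_b <= exports_a


# CodeUpdate connector of VolcanoMain, as declared in Fig. 2.
code_update = ({"getDetectCode"}, {"check"})
# Assumed Code connector of the base station; getVersion is an extra export.
code = ({"check"}, {"getDetectCode", "getVersion"})
```

With these definitions can_connect(code_update, code) holds, while a peer that
does not export getDetectCode would be rejected.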
2.3 More on Assemblages
With the basic model introduced and a simple example given, we now look at
more advanced features of our calculus.
Rebindable Dynamic Linkers In the basic model presented in Fig. 1, dy-
namic linking was presented as a one-to-one relation between the initiating as-
semblage runtime and the dynamic plugin. This is however an oversimplified
view of the calculus, and does not completely coincide with reality. Consider the
sensor example again. Scientists might prefer to run multiple detection mod-
els at the same time, and compare the results of different models. The current
VolcanoMain assemblage however only has one dynamic linker. In a one-to-one
dynamic linking model, plugging in one detection assemblage would lead to the
invalidation of previous dynamic plugins on the same dynamic linker.
For this reason, dynamic linkers in our calculus are rebindable. This implies
different assemblages may be plugged in to the same dynamic linker at the
same time without interfering with each other. Fig. 3 (a) shows this one-to-N
relation. Here the Volcano Sensing Main assemblage runtime is the runtime form
of assemblage VolcanoMain after its static linker is satisfied; it is
dynamically linked to three different dynamic plugins representing three
different mathematical detection models. Interestingly, since dynamic plugins
themselves can have dynamic linkers, a plugin might in turn plug in other
assemblages as its own plugins. In Fig. 3 (a), a dynamic plugin of the Volcano
Sensing Main assemblage runtime, one representing say a parametric probabilistic
detection approach, further plugs different distribution models into its
DSModelPlugin dynamic linker, such as a Gaussian distribution or Poisson
distribution method.
The dynamic linking established by a plugin expression is called a dynamic
linkage, and the expression returns a value we call a dynamic linkage handle;
programmers can use it to refer to the particular assemblage just plugged in.
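Rebindability and linkage handles can be pictured with the following Python
sketch. It is an informal analogy rather than the calculus itself: the class
AssemblageRuntime and its methods plugin and call are hypothetical names,
handles are plain integers, and interface matching is omitted.

```python
import itertools


class AssemblageRuntime:
    """Hypothetical runtime with rebindable dynamic linkers."""

    def __init__(self, dynamic_linkers):
        self.dynamic_linkers = set(dynamic_linkers)
        self.linkages = {}              # handle -> (linker name, plugged-in features)
        self._ids = itertools.count(1)

    def plugin(self, linker, features):
        """Plug `features` (a dict of functions) into `linker`; return a fresh handle."""
        if linker not in self.dynamic_linkers:
            raise KeyError(f"no dynamic linker named {linker}")
        handle = next(self._ids)
        # Earlier linkages on the same linker are kept, not invalidated.
        self.linkages[handle] = (linker, dict(features))
        return handle

    def call(self, handle, feature, *args):
        """Rough analogue of accessing a feature through a dynamic linkage handle."""
        _, features = self.linkages[handle]
        return features[feature](*args)
```

Plugging two detection models into the same DetectorPlugin linker yields two
distinct handles, and each handle keeps reaching its own plugin.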
Generative Connectors Our connectors are also more nuanced than presented in
Fig. 1: connectors also need to be rebindable, as we just saw for dynamic
linkers. As shown in Fig. 3 (b), a task such as temperature measurement in a
typical volcano sensor network is achieved by the collaboration of a number of
sensors; each one of them usually communicates with its neighbors to exchange
data such as temperature information. Since the configuration of the network is
not fixed until sensors are scattered in the crater, the program developer
cannot define an a priori list of connectors, each of which is assigned to one
neighbor. Instead, each sensor must be equipped only with rebindable connectors
like TempQuery and TempReport given in Fig. 3 (b), where at any moment the
TempQuery connector of a sensor may be connected with multiple TempReport
connectors of its neighbors, and vice versa.
Another important issue is that each connection needs to keep its own
per-connection data: a sensor needs to record the temperature information
collected from its neighbors, together with the location where each temperature
was sampled. This kind of information varies from connection to connection; a
global state of the assemblage is not enough for this purpose. In Fig. 3 (b),
each connector is associated with a per-connection state store, which records
the private generative states associated with each connection. The index of the
store is the connection ID generated when a connection is established via a
connect expression.
Our calculus supports generative connectors where per-connection states are
supported. It implicitly also supports rebindability. Each successful connect
expression creates a connection, and the expression returns a value that is a
connection handle; with the handle, programmers can refer to different con-
nections (and the private per-connection state) on the same connector in the
same program. One additional advantage of using handles is there is no problem
with programs trying to access features on a non-connected connector—there
is no name by which to refer to the features on the connector until there is a
handle. Since multiple handles can be active and each connection has a unique
state, there is an analogy of a connection definition with a class, and each ac-
tive connection with an object, with the connection handle being the reference
to the object. These “objects” are something like facades in the facade design
pattern—they are the external interface to the component.
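The class/object analogy for generative connectors can be made concrete with a
small Python sketch. The class Connector and its methods are hypothetical names
chosen for illustration; the per-connection field last follows Fig. 3 (b), and
connection handles are modelled as plain integers.

```python
import copy
import itertools


class Connector:
    """Hypothetical connector with a per-connection state store."""

    def __init__(self, name, initial_state):
        self.name = name
        self.initial_state = initial_state   # template for per-connection fields
        self.store = {}                      # connection ID -> private state
        self._ids = itertools.count(1)

    def connect(self):
        """Create a connection and return its handle (here simply the connection ID)."""
        handle = next(self._ids)
        # Each connection gets a fresh copy of the initial state.
        self.store[handle] = copy.deepcopy(self.initial_state)
        return handle

    def state(self, handle):
        # State is reachable only through a handle, never through the connector name.
        return self.store[handle]
```

Two connections made on the same TempQuery connector then behave like two
objects of one class: writing last through one handle leaves the other
connection's last untouched.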
Typed Calculus and Types as Features Our calculus is typed, and the
type system has several pleasant properties such as soundness and decidability
of type checking. There is no runtime error associated with attempting to use
a connector that is not connected to anything, because connectors are only ac-
cessed via handles which only exist because a connection was created. The only
error possible is the handle could be stale because the connection has termi-
nated. We have also explored the possibilities that types are themselves treated
as features that are imported and exported across static linkers, dynamic linkers
and connectors. In this presentation however, we do not focus on these aspects
due to limited space. Interested readers can refer to [LS04] for details.
2.4 Why Dynamic Linkers and Connectors?
Static linkers, dynamic linkers and connectors together form the interfaces of
assemblages. Before proceeding to the formalization, we address an important
question concerning the purpose of the paper: Why dynamic linkers and connec-
tors?
First, modules with fully declarative interfaces lead to a more complete
program specification. Assemblages are highly declarative; in fact, all of an
assemblage’s potential for interaction outside its codebase can be read off of
the interfaces. This is a good thing for language design, leading to
provable type safety without obscurity. The idea also has impact on paradigms
of software development. Indeed, interfaces (static linkers, dynamic linkers and
connectors) can be defined at the design phase, reflecting designer’s intention of
the assemblage to interact with other parties. A declaration of DetectorPlugin
dynamic linker coincides with the designer’s intention that VolcanoMain module
will dynamically link to some detection model plugins; the CodeUpdate connector
coincides with the designer’s expectation that the module will communicate with
some codebase. In a large-scale software development process, software design
and software implementation are typically accomplished by different people. The
module calculus’ type system ensures the implementation will faithfully follow
the intention of the designer. For instance, a compile-time type mismatch would
occur if the implementor of VolcanoMain tried to communicate with a codebase
that does not provide a getDetectCode function.
Second, these interfaces provide crucial support for extending the calculus
to other important language features. Since all of the external interactions are
declared on the static linkers, dynamic linkers, and connectors, new modes of
external interaction can easily be layered on top of these existing notions. Exam-
ples include security (on connectors), transaction management (on connectors),
and version control (on dynamic linkers). For example, for security, since ev-
ery cross-computation invocation will need to be directed through connectors,
access control on connectors is enough to secure assemblage runtimes from unau-
thorized nonlocal access. We do not directly address these topics here, but the
module theory is designed with them in mind.
Third, dynamic linkers and connectors better model the complex interactions
between different parties than is possible in systems without them. A naive im-
plementation for dynamic plugins can be achieved by direct dynamic loading,
and invocations can thus be made on exported functions of the plugins. An
obvious problem with this approach, however, arises when callback functions are
needed. For reactive computations, distributed protocols often involve message
exchanges back and forth. If RMI or RPC is used, this interaction protocol
will be completely submerged in the code in the various methods, and no sin-
gle point in the program will indicate the protocol as a whole. In our example,
the CodeUpdate connector specifies all the possible interactions between a code
provider and a code client. The whole code updating protocol, although simple
in this example, is captured by CodeUpdate connector, giving a clear protocol
specification.
3 Syntax
The syntax of our calculus is shown in Fig. 4. It differs from the syntax used in
Sec. 2, but in a trivial manner only: in the calculus syntax we remove the key-
words (such as assemblage,import and export) used in the sugared syntax;
otherwise the two forms of syntax are identical. Notation mis used to represent
a sequence of entities m1, . . . , mn, and we take the empty sequence as a special
value φ. The ]operator denotes the concatenation of two sequences; for the
empty sequence, m]φ=φ]m=mfor any m. Since in this presentation, each
mican only take one of the three syntactical forms a7→ b,a== band a:b, we
also view mas a mapping and call it well-formed if it is a function, i.e. there
does not exist a1=a2but b16=b2; in this case when no confusion arises, we
also define m(a) = b.m(a) = iff a7→ b(or a== b, or a:b) is not present
for any b, or mis not well-formed. m{a7→ b}denotes a mapping update; it is
a mapping the same as m, except m{a7→ b}(a) = bwhile m(a) could be other
values.
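The sequence-as-mapping conventions of this section can be mirrored in a few
lines of Python. This is a sketch under stated assumptions: a sequence is a
Python list of (a, b) pairs, the undefined result of a lookup is represented by
None, and the function names is_well_formed, lookup and update are hypothetical.

```python
def is_well_formed(seq):
    """A sequence of (a, b) pairs is well-formed if no key maps to two different values."""
    seen = {}
    for a, b in seq:
        if a in seen and seen[a] != b:
            return False
        seen[a] = b
    return True


def lookup(seq, a):
    """Lookup of a: the bound value, or None when a is absent or seq is ill-formed."""
    if not is_well_formed(seq):
        return None
    for key, value in seq:
        if key == a:
            return value
    return None


def update(seq, a, b):
    """Mapping update: the same as seq except that a now maps to b."""
    rest = [(k, v) for k, v in seq if k != a]
    return rest + [(a, b)]
```

Concatenation of sequences corresponds to list concatenation, with the empty
list playing the role of the empty sequence.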
A    ::= ⟨S; D; C; L⟩                            assemblage
S, D ::= ν ↦ ⟨I; E⟩                              static linker, dynamic linker
C    ::= ν ↦ ⟨I; E; J⟩                           connector
I    ::= α : τ                                   imported feature
E    ::= α == λx.e : τ                           exported feature
L    ::= α == F : τ                              local feature
J    ::= α == ref F : τ                          per-connection state
F    ::= cst | A | λx.e | ref F                  feature
e    ::= () | x | cst | thisc | thisd            null value, variable, const
       | A : τ | e + e                           first-class assemblage, sum
       | plugin_{ν↦ν′} e | connect_{ν↦ν′} e      dynamic plugin, connect
       | α@local | α@ν | e.α | e  α | e  α e     feature access
       | λx.e : τ | e e                          first-class function, app
       | ref e | !e | e := e                     state
ν                                                interface name
α                                                feature name
τ                                                type, defined in Fig. 8
cst                                              integer constant
x                                                variable
Fig. 4. Assemblage Language Syntax
An assemblage (A) is composed of a well-formed sequence of static linkers (S),
dynamic linkers (D), connectors (C) and its local private code (L). Each static
linker or dynamic linker has a name (ν), a well-formed sequence of imported
features (I) and a well-formed sequence of exported features (E); each
connector, in addition, is associated with a well-formed sequence of
per-connection states (J). I ⊎ E (and in the connector case I ⊎ E ⊎ J) also must
be well-formed: we disallow the case where the same feature is imported and
exported on the assemblage. Features are chosen from constants (cst), functions
(λx.e), references (ref F) and nested assemblages (A). This particular choice is
made to preserve a balance between functional features and imperative features.
With references and functions around, primitive classes and objects can also be
modelled with widely known encoding techniques, and are left out of the calculus
for simplicity. α : τ denotes importing a feature named α with type τ, and
α == F : τ denotes exporting a feature named α, defined to be F with type τ.
Features to be imported/exported on interfaces must be functions. This
function-only restriction however does not restrict the expressiveness of the
calculus: importing/exporting references can be encoded as importing/exporting a
pair of a getter function and a setter function; importing/exporting assemblages
can be encoded as importing/exporting a function taking a null value and
returning the assemblage. Per-connection state (J) is defined via feature
α == ref F : τ, which means the state is named α, ref F will be its initial
value, and τ is its type.
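The two encodings mentioned above, a reference as a getter/setter pair and an
assemblage as a function from the null value, can be sketched in Python as
follows. The helper names export_reference and export_assemblage are
hypothetical, and a one-element list stands in for a reference cell.

```python
def export_reference(initial):
    """Encode an exported reference as a (getter, setter) pair of functions."""
    cell = [initial]              # private mutable cell, never exposed directly

    def get(_unit=None):
        return cell[0]

    def set_(value):
        cell[0] = value

    return get, set_


def export_assemblage(assemblage):
    """Encode an exported assemblage as a function taking the null value."""
    return lambda _unit=None: assemblage
```

An importer that receives only get and set_ can read and write the state without
ever holding the reference itself, which is why the function-only restriction
costs no expressiveness.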
[Fig. 5. The Big Picture]
Most of the expressions e have been explained in Sec. 2. Additionally, we
use α@local to refer to a feature α defined locally (in L). α@ν refers to a
feature α defined in static linker ν. thisd refers to the current dynamic
linkage, and thisc refers to the current connection. The meaning of “current”
depends on the dynamic linkage handle or connection handle which invokes the
function that the thisd or thisc expression is situated in. Because of the
rebindable nature of dynamic linking, we disallow a syntax like α@ν where ν is a
dynamic linker name: it would be ambiguous which of the dynamic linkages is
referred to if more than one dynamic linkage had been created from the same
dynamic linker, and it would be undefined if none were present. This restriction
on syntax also holds for connectors for similar reasons. e.α is used to refer to
a feature α in dynamic linkage handle e. Syntax e  α refers to a per-connection
state α in connection handle e. e  α e denotes an invocation of a function
defined in a connection handle. This expression can potentially denote a
cross-computation invocation, and has a different semantics from regular
function application; we therefore use different syntax for the two cases. () is
used to denote a null value of unit type, as in ML. let is encoded by function
application.
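The encoding of let by function application is the usual one: let x = e1 in e2
becomes (λx.e2) e1. A minimal Python rendering, with the hypothetical helper
name let, looks like this:

```python
def let(value, body):
    """let x = value in body(x), encoded as applying the function `body` to `value`."""
    return body(value)


# Nested lets, in the shape of updateDetector: each binding is visible in later ones.
result = let(3, lambda a:
         let(a + 1, lambda b:
         a * b))
```

Here a is bound to 3 and b to a + 1, so the innermost body computes their
product.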
ιr ∈ R                                          assemblage runtime ID
ιs ∈ S                                          store ID
ιn ∈ N                                          dynamic node ID
ιd ∈ D                                          dynamic linkage ID
ιc ∈ C                                          connection ID
G  ::= R                                        runtime global
R  ::= ιr ↦ ⟨T; H; Y⟩                           assemblage runtime
T  ::= ⟨N; K⟩                                   dynamic linking tree
H  ::= ιs ↦ v                                   heap
Y  ::= ιc ↦ ⟨ιr; ν1; ν2; Jr⟩                    live connection registry
N  ::= ιn ↦ ⟨Sr; Dr; Cr; Lr⟩                    dynamic node
K  ::= ιd ↦ ⟨ιn1; ν1; ιn2; ν2⟩                  dynamic linkage
Jr ::= α ↦ ιs                                   per-connection state
fv ::= cst | A | fun(ιr, ιn, λx.e) | ιs         feature value
v  ::= () | ιr | ιd | ιc | fv                   value
e  ::= · · · | v | inR(ιr, ιn, e)               expression at runtime
E  ::= [ ] | plugin_{ν↦ν′} E | connect_{ν↦ν′} E evaluation context
     | E + e | v + E
     | E e | v E | ref E | !E | E := e | v := E
     | E | E α | E α e | v  α E | load E
cst, A, α, J                                    defined in Fig. 4
Sr, Dr, Cr, Lr                                  see Sec. 4.2
Fig. 6. Operational Semantics Auxiliary Definitions
4 Operational Semantics
In this section we present the dynamic semantics of our calculus. We first infor-
mally explain the big picture of how a typical application appears at run-time,
and then we discuss formal details of the operational semantics.
4.1 The Big Picture
Sec. 2.3 gave an example of a temperature measurement application for sen-
sor networks; the illustrations used in that section (Fig. 3), however, are only
intended to target high-level concepts. In this section, the precise runtime snap-
shot of the same application is illustrated in Fig. 5. Here the whole network is
composed of multiple sensor nodes in the form of Fig. 3 (b) to perform tem-
perature measurement, while individual nodes are experimenting with different
computational models in the form of Fig. 3 (a).
At runtime, the entire application space, with all reactive computations of
concern, is called a runtime global; the whole temperature measurement network
is a runtime global, for instance. Inside it, independently deployed assemblage
runtimes run at potentially distributed locations. Each of them can be
created by an explicit load expression, during which a static assemblage is
loaded into memory. The first assemblage runtime in the runtime global is loaded
via a bootstrapping process. At load time, each runtime is associated with an
ID. In Fig. 5, we have three runtimes with IDs ιr1, ιr2 and ιr3. Assemblage
runtimes communicate with each other via connections over paired connectors;
each connection also has a connection ID, which serves as its connection handle.
In the figure, runtimes with IDs ιr1 and ιr2 are communicating via two
connections; the connection with ID ιc1 is between the connector TempQuery of
ιr1 and connector TempReport of ιr2.
Internally, each runtime contains a dynamic linking tree, a heap and a live
connection registry. The heap is standard and holds the multable data; it is de-
fined as a sequence of stores, each of which is associated with a store ID. In
Fig. 5, the heap of runtime ιr1currently has a store ιs2which holds a constant
value temp1. A runtime’s live connection registry holds the currently active con-
nections. It is defined as a table indexed by connection IDs; each entry contains
information such as what parties are involved in the connection (runtime IDs
and connector names), and the per-connection state store. For instance, the first
row of the live connection registry of runtime ιr1 shown in Fig. 5 indicates that
it currently has a connection ιc1 on its TempQuery connector, and that connection
is to TempReport of runtime ιr2. The last column indicates that the per-connection
field last has a reference value ιs1: the value is a reference since per-connection
states always contain mutable data, just as object fields in object-oriented
languages are mutable. The meanings of connector-related expressions of our
calculus, such as connect ν ↦ ν′ e, are defined as operations on live connection
registries. For instance, when a connection is established via a connect
expression, both involved parties have one entry added to their live connection
registry.
Per-connection states are allocated and initialized at connection establishment
time.
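The bookkeeping performed at connection establishment can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not part of the calculus: the names Runtime, connect and state_init are hypothetical, IDs are plain strings, and the heap is modelled as a dictionary from store IDs to values.

```python
from dataclasses import dataclass, field
from itertools import count

_fresh = count(1)  # hypothetical source of fresh connection and store IDs


@dataclass
class Runtime:
    rid: str
    heap: dict = field(default_factory=dict)      # store ID -> value
    registry: dict = field(default_factory=dict)  # connection ID -> entry


def connect(r1, nu1, r2, nu2, state_init):
    """Connect connector nu1 of r1 with connector nu2 of r2.

    state_init maps a connector name to the initial values of its
    per-connection fields (an assumption made for this sketch).
    """
    cid = f"c{next(_fresh)}"
    for me, mine, peer, theirs in ((r1, nu1, r2, nu2), (r2, nu2, r1, nu1)):
        state = {}
        for name, value in state_init.get(mine, {}).items():
            sid = f"s{next(_fresh)}"
            me.heap[sid] = value   # per-connection state is allocated on the heap
            state[name] = sid      # the field itself holds a reference (store ID)
        # Both parties get one entry in their live connection registry.
        me.registry[cid] = {"peer": peer.rid, "local": mine,
                            "remote": theirs, "state": state}
    return cid
```

For instance, connecting TempQuery of one runtime to TempReport of another with a per-connection field last adds one registry row on each side, and the row on the TempQuery side refers to a freshly allocated store.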
A dynamic linking tree is used to reflect the rebindable nature of dynamic
linkers, as described informally in Sec. 2.3. Indeed, if rebindability were not
supported, a plugin expression could just merge the codebase of the current
assemblage runtime with the code of dynamic plugin, in the same manner as
static linking A1 + A2. However, due to rebindability, each dynamic linker of the
current runtime can be associated with multiple independent dynamic plugins at
the same time. The data structure is in general a tree: when the main assemblage
is first loaded to memory, it creates a root node, with all application logic of the
main assemblage defined inside. Each plugin expression executed in the root
will result in the creation of a tree node representing the dynamic plugin (with
all application logic of the dynamic plugin defined inside), and the newly created
node becomes a child of the root. Since child nodes can themselves plug in code
(the plugin of a plugin), the tree can in general have depth greater than two.
That the data structure is a tree, rather than a DAG or an arbitrary graph,
follows directly from the way it is constructed: each plugin expression adds
exactly one new node and one edge from an existing node to it. In Fig. 5, the
dynamic linking tree is the internal representation of the application whose
high-level requirement is shown in Fig. 3 (a). A dynamic linking tree is composed
of a series of dynamic nodes (the tree nodes) and dynamic linkages (the tree edges).
Each dynamic linkage is referenced by its dynamic linkage ID ιd, which is the
realization of the dynamic linkage handle of the previous section. The behaviors
of dynamic linker-related expressions, such as plugin ν ↦ ν′ e, are operations on
dynamic linking trees. For instance, when a plugin is assembled via a plugin
expression, the dynamic plugin becomes a dynamic node that is a leaf of the
initiator runtime’s dynamic linking tree.
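As a rough illustration of this data structure, the following Python sketch keeps the nodes and the linkages in two tables and lets a plugin operation add one node and one edge. The class LinkingTree, its method names and the linker names used in the usage note are hypothetical; node contents are opaque placeholders rather than real assemblage code.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class LinkingTree:
    nodes: dict = field(default_factory=dict)     # node ID -> node contents
    linkages: dict = field(default_factory=dict)  # linkage ID -> (parent, dynamic linker, child, static linker)
    _ids: count = field(default_factory=lambda: count(1))

    def add_root(self, code):
        """Create the root node when the main assemblage is loaded."""
        nid = f"n{next(self._ids)}"
        self.nodes[nid] = code
        return nid

    def plugin(self, parent, dyn_linker, code, static_linker):
        """Add a new leaf under `parent` and return the dynamic linkage ID."""
        nid = f"n{next(self._ids)}"
        did = f"d{next(self._ids)}"
        self.nodes[nid] = code
        self.linkages[did] = (parent, dyn_linker, nid, static_linker)
        return did

    def children(self, nid):
        return [c for (p, _, c, _) in self.linkages.values() if p == nid]
```

Calling plugin twice on the same dynamic linker of the root yields two independent children, which is the rebindability discussed above; calling it on a child yields a tree of greater depth.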
Since concurrency is not the focus of this paper, our calculus assumes for
simplicity that at any moment only one assemblage runtime is active, and only
one dynamic node in this active runtime is performing the reduction. This fact
is shown in Fig. 5, where distinct notation is used for active runtimes and active
dynamic nodes.
4.2 A Formal Overview of Dynamic Semantics
Fig. 6 defines the relevant data structures that play a part in defining the dy-
namic semantics. Most of them have been explained with the example in Fig. 5.
The rest is explained below.
Formal Details of the Dynamic Linking Tree Each dynamic node is associ-
ated with an ID ιn. Given a static assemblage A = ⟨S; D; C; L⟩, its corresponding
dynamic node form N = ιn ↦ ⟨Sr; Dr; Cr; Lr⟩ is almost identical to A, except
that it has an ID ιn, and the features defined in Sr, Dr, Cr, Lr are slightly
different in form from their static counterparts due to function closures and
mutable state, which will be made clear when we explain feature values shortly. A
dynamic linkage is of the form ιd ↦ ⟨ιn1; ν1; ιn2; ν2⟩, denoting a tree edge with
ID ιd linking dynamic linker ν1 of dynamic node ιn1 with static linker ν2
of dynamic node ιn2. We use root(T) to denote the root node of the dynamic
linking tree T.
Feature Values At the source code level, our language supports four kinds of
features; see Fig. 4. At runtime, not all of them are values; the possible feature
values fv are defined in Fig. 6. ref F features are not values, since they indicate
a heap allocation; the corresponding value is the store ID where the value is
allocated on the heap. Function values are closures fun(ιr, ιn, λx.e), where ιr
and ιn are the IDs of the runtime and the dynamic node where the function is
defined. The reason why λx.e is not a value is that the body e might refer to
other features such as α@local. At runtime, if functions as first-class values are
passed from one dynamic node to another, the meaning of α@local would not
be preserved if the defining dynamic node were not recorded; passing closures
instead gives first-class functions the same meaning wherever they are applied.
Our language does not allow functions to be passed from
one runtime to another, so theoretically, the ιr information in function closures
could be removed. We keep it here to show that our language could support
function passing across runtimes without technical difficulty, which also implies
that mechanisms like RMI could be supported. The reason we do not
support function passing across runtimes is that it gives indirect access to a
runtime that is not explicit in an interface; this topic is elaborated in Sec. 5.1.
Source-code level features are converted to feature values when assemblages
are loaded, whether through the bootstrapping process or via an explicit load
expression, or when they are added to the current runtime via a plugin expression.
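The role of the node ID recorded in a closure can be illustrated with a small Python sketch. Everything here is a simplifying assumption made for illustration: Closure and apply_closure are hypothetical names, a dynamic node is reduced to a dictionary of its local features, and the function body receives a lookup function standing in for α@local.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Closure:
    rid: str        # runtime in which the function was defined
    nid: str        # dynamic node in which the function was defined
    body: Callable  # takes (lookup_local, argument)


def apply_closure(nodes, closure, arg):
    """Run the body against the local features of the *defining* node."""
    local = nodes[closure.nid]
    return closure.body(lambda name: local[name], arg)


# Two dynamic nodes that both define a local feature named "offset".
nodes = {"n1": {"offset": 10}, "n2": {"offset": 100}}
add_offset = Closure("r1", "n1", lambda local, x: x + local("offset"))
```

Even if add_offset is handed to code running in node n2, applying it still reads the offset of n1, because the closure carries n1 with it.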
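\[
\begin{array}{@{}ll@{}}
(\mathit{mcnxt}) & R,\ \mathbf{E}[e] \xrightarrow{\iota r_1,\,\iota n_1} R',\ \mathbf{E}[e'] \\
 & \quad \text{if } R,\ e \xrightarrow{\iota r_1,\,\iota n_1} R',\ e' \\[1ex]
(\mathit{plugin}) & R,\ \mathbf{plugin}\ \nu_1 \mapsto \nu_2\ A \xrightarrow{\iota r,\,\iota n} R\{\iota r \mapsto R'\},\ \iota d \\
 & \quad \text{if } R(\iota r) = \langle T; H; B\rangle,\ T = \langle N; K\rangle, \\
 & \quad \mathit{start}(A, \iota r, \iota n_2) = (N_2, H_2),\ \iota n_2, \iota d \text{ fresh}, \\
 & \quad K_2 = (\iota d \mapsto \langle \iota n; \nu_1; \iota n_2; \nu_2\rangle), \\
 & \quad R' = \langle\langle N \uplus N_2; K \uplus K_2\rangle; H \uplus H_2; B\rangle \\[1ex]
(\mathit{fun}) & R,\ \lambda x.e \xrightarrow{\iota r_1,\,\iota n_1} R,\ \mathbf{fun}(\iota r_1, \iota n_1, \lambda x.e) \\[1ex]
(\mathit{app}) & R,\ \mathbf{fun}(\iota r_1, \iota n_2, \lambda x.e)\ v \xrightarrow{\iota r_1,\,\iota n_1} R',\ \mathsf{inR}(\iota r_2, \iota n_2, e\{v/x\}) \\[1ex]
(\mathit{sum}) & R,\ A_1 + A_2 \xrightarrow{\iota r,\,\iota n} R,\ \langle S_1 \oplus S_2; D_1 \uplus D_2; C_1 \uplus C_2; L_1 \uplus L_2\rangle \\
 & \quad \text{if } A_i = \langle S_i; D_i; C_i; L_i\rangle,\ i \in \{1,2\} \\[1ex]
(\mathit{coninv}) & R,\ \iota c \triangleright \alpha\ v \xrightarrow{\iota r_1,\,\iota n_1}
  \left\{\begin{array}{l}
  R\{\iota r_2 \mapsto R_2\},\ \mathsf{inR}(\iota r_2, \iota n_2, E_2(\alpha)\{\iota c/\mathbf{thisc}\}\ v') \\
  R,\ \mathsf{inR}(\iota r_1, \iota n_1, E_1(\alpha)\{\iota c/\mathbf{thisc}\}\ v)
  \end{array}\right. \\
 & \quad \text{if } R(\iota r_i) = \langle T_i; H_i; Y_i\rangle, \\
 & \quad \mathit{root}(T_i) = (\iota n_i \mapsto \langle \mathit{Sr}_i; \mathit{Dr}_i; \mathit{Cr}_i; \mathit{Lr}_i\rangle), \\
 & \quad \mathit{Cr}_i(\nu_i) = \langle I_i; E_i; J_i\rangle \text{ for } i \in \{1,2\}, \\
 & \quad Y_1(\iota c) = \langle \iota r_2, \nu_1, \nu_2, \mathit{Jr}_1\rangle, \\
 & \quad \mathit{dcopy}(v, H_1) = (v', H'),\ R_2 = \langle T_2; H_2 \uplus H'; Y_2\rangle \\[1ex]
(\mathit{cons}) & R,\ \iota c \triangleright \alpha \xrightarrow{\iota r_1,\,\iota n_1} R,\ \mathit{Jr}_1(\alpha) \\
 & \quad \text{if } R(\iota r_1) = \langle T_1; H_1; Y_1\rangle,\ Y_1(\iota c) = \langle \iota r_2, \nu_1, \nu_2, \mathit{Jr}_1\rangle \\[1ex]
(\mathit{conn}) & R,\ \mathbf{connect}\ \nu_1 \mapsto \nu_2\ \iota r_2 \xrightarrow{\iota r_1,\,\iota n_1} R\{\iota r_1 \mapsto R'_1\}\{\iota r_2 \mapsto R'_2\},\ \iota c \\
 & \quad \text{if } \iota c \text{ fresh},\ R(\iota r_i) = \langle T_i; H_i; Y_i\rangle, \\
 & \quad \mathit{initS}(T_i, \nu_i, \iota r_i, \iota c) = (\mathit{Jr}_i, H'_i) \text{ for } i \in \{1,2\}, \\
 & \quad R'_1 = \langle T_1; H_1 \uplus H'_1; Y_1 \uplus \{\iota c \mapsto \langle \iota r_2; \nu_1; \nu_2; \mathit{Jr}_1\rangle\}\rangle, \\
 & \quad R'_2 = \langle T_2; H_2 \uplus H'_2; Y_2 \uplus \{\iota c \mapsto \langle \iota r_1; \nu_2; \nu_1; \mathit{Jr}_2\rangle\}\rangle \\[1ex]
(\mathit{load}) & R,\ \mathbf{load}\ A \xrightarrow{\iota r_1,\,\iota n_1} R \uplus (\iota r_2 \mapsto \langle\langle N_2; \phi\rangle; H_2; \phi\rangle),\ \iota r_2 \\
 & \quad \text{if } \iota n_2, \iota r_2 \text{ fresh},\ \mathit{start}(A, \iota r_2, \iota n_2) = (N_2, H_2) \\[1ex]
(\mathit{inre}) & R,\ \mathsf{inR}(\iota r_2, \iota n_2, e) \xrightarrow{\iota r_1,\,\iota n_1} R',\ \mathsf{inR}(\iota r_2, \iota n_2, e') \\
 & \quad \text{if } R,\ e \xrightarrow{\iota r_2,\,\iota n_2} R',\ e' \\[1ex]
(\mathit{inrv}) & R,\ \mathsf{inR}(\iota r_2, \iota n_2, v) \xrightarrow{\iota r_1,\,\iota n_1} R\{\iota r_1 \mapsto R_1\},\ v' \\
 & \quad \text{if } R(\iota r_1) = \langle T_1; H_1; Y_1\rangle,\ R(\iota r_2) = \langle T_2; H_2; Y_2\rangle, \\
 & \quad \mathit{dcopy}(v, H_2) = (v', H'),\ R_1 = \langle T_1; H_1 \uplus H'; Y_1\rangle
\end{array}
\]
Fig. 7. Selected Reduction Rules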
Values and Expressions at Runtime Values in our calculus can be feature
values; assemblage runtime IDs ιr (which serve as handles to assemblage run-
times, returned from load expressions); dynamic linkage IDs ιd (which serve as
handles to dynamic linkages, returned from plugin expressions); or connection
IDs ιc (which serve as handles to connections, returned from connect expres-
sions).
We extend the expressions given in Fig. 4 with new syntax in Fig. 6 (see the
definition of e) to aid in defining the operational semantics. inR(ιr, ιn, e) is an
auxiliary expression denoting a code context switch, meaning e is evaluated in
the runtime with ID ιr and the dynamic node with ID ιn; the expression is
particularly useful to model function invocations, during which the current
execution point is switched.
4.3 A Guided Tour to Reduction Rules
The operational semantics of our language is given in Fig. 7. The judgment
$G, e \xrightarrow{\iota r,\,\iota n} G', e'$
indicates a reduction of expression e in the presence of global runtime G, where
the current active runtime has ID ιr, and the current active dynamic node in ιr
has ID ιn. Evaluation contexts are defined in Fig. 6. Here we omit the rules for
expressions ref e, e := e′, !e, α@ν, α@local and e.α; these rules are relatively
straightforward. Also note that some of the rules might get stuck on certain
combinations of G and e. We largely omit the specifications of the faulty cases
here, but claim that the static type system introduced in Sec. 5 ensures these
faulty cases never arise during reduction at run time. Details
on the omitted reduction rules, specifications of these faulty expressions, and a
proof of the aforementioned claim can be found in [LS04].
We use e{e′/x} to denote capture-free substitution. If e is an assemblage, the
substitution does nothing: assemblages do not contain free variables, and even
all imported feature names must be explicitly declared on their static linkers,
dynamic linkers or connectors.
Assemblage Loading and Bootstrapping We first explain how an as-
semblage runtime loads another assemblage, and then discuss the process
of bootstrapping, where the first assemblage in the runtime global is loaded.
The (load) rule in Fig. 7 shows how loading is simply the creation of a new
assemblage runtime, and the result returned is a runtime handle, the ID of the
new runtime. Function start(A, ιr, ιn) = (N, H) prepares the initial dynamic
node (N) out of a static assemblage (A), together with the initial heap (H).
This function, whose formal definition is omitted here, follows directly from
these rules:
- For every function feature of the form α == λx.e defined in A, its corresponding
place in N is substituted with α == fun(ιr, ιn, λx.e).
- For every reference feature of the form α == ref F defined in A, its correspond-
ing place in N is substituted with α == ιs, where ιs is a fresh store ID,
and at the same time ιs ↦ fv is in H. Here fv is the feature value form of
F. Since a reference feature could be of a form like ref ref 0, this process
could lead to multiple stores defined in H.
- A and N are otherwise identical.
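The three rules above can be sketched as a small Python function. This is a simplified illustration under our own assumptions rather than the formal definition: an assemblage is flattened to a dictionary from feature names to features, a function feature is the hypothetical tuple ("lam", x, body), a reference feature is ("ref", F), and anything else is treated as a constant.

```python
from itertools import count


def start(assemblage, rid, nid, fresh=None):
    """Return (node, heap) for a static assemblage given as {name: feature}."""
    fresh = fresh or count(1)
    heap = {}

    def to_value(feature):
        if isinstance(feature, tuple) and feature[0] == "lam":
            # Function features become closures tagged with runtime and node IDs.
            return ("fun", rid, nid, feature)
        if isinstance(feature, tuple) and feature[0] == "ref":
            # Reference features allocate a store; nested refs allocate several.
            inner = to_value(feature[1])
            sid = f"s{next(fresh)}"
            heap[sid] = inner
            return sid
        return feature  # constants are already values

    node = {name: to_value(f) for name, f in assemblage.items()}
    return node, heap
```

With a feature of the form ref ref 0, the sketch allocates two stores, the outer one holding the store ID of the inner one.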
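The (load) rule does not perform any initialization, because an initializing
load can easily be defined using this primitive one; for example, one definition
could be
\[
\mathit{loadinit}\ A \;\stackrel{\mathrm{def}}{=}\;
\begin{array}[t]{l}
\mathbf{let}\ x_1 = \mathbf{load}\ A\ \mathbf{in} \\
\mathbf{let}\ x_2 = \mathbf{connect}\ \mathit{InitIn} \mapsto \mathit{InitOut}\ x_1\ \mathbf{in} \\
x_2 \triangleright \mathit{Main}\ ()
\end{array}
\]
which assumes connectors InitIn/InitOut are present on loaders/loadees re-
spectively, importing/exporting function Main(). Bootstrapping the first assem-
blage Aboot = ⟨S; D; C; L⟩ is accomplished by initiating execution in the state
\[
\langle \iota r \mapsto \langle\langle N; \phi\rangle; H; \phi\rangle,\ \iota r\rangle,\ C(\mathit{InitOut})(\mathit{Main})()
\]
where ιr, ιn are fresh and start(Aboot, ιr, ιn) = (N, H).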
Static Linking The (sum) rule shows how static linking of two first-class
assemblages happens. It merges their dynamic linkers, connectors and local def-
initions, with preconditions that these parts do not clash by name. The fact
that clash of local feature names would lead to stuck computations might be
counter-intuitive: in reality, local features are supposed to be invisible from the
outside, and therefore static linking of two assemblages with some shared local
feature names should be a valid operation. To avoid this dilemma, we stipulate
assemblages are freely α-convertible with regard to local feature names.
Static linkers are matched by name, according to the ⊕ operator. Given two
sequences of static linkers S1 and S2, S1 ⊕ S2 is the shortest sequence satisfying
all of the following conditions:
- If S1 has a static linker S by name ν but S2 does not, or vice versa, S is a
static linker in S1 ⊕ S2.
- If S1 and S2 both have a static linker by name ν whose bodies are ⟨I1; E1⟩
and ⟨I2; E2⟩ respectively, then S1 ⊕ S2 also has a static linker named ν with
body ⟨I; E1 ⊎ E2⟩, where I includes exactly the imported features whose names
are listed in I1 but not E2, or listed in I2 but not E1.
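The merging of static linkers just described can be sketched in Python as follows. The representation is an assumption made for illustration: a sequence of static linkers is a dictionary from linker names to pairs (imports, exports), where imports is a set of feature names and exports maps feature names to definitions. The function name merge_linkers is hypothetical, and a clash of exported names, on which the calculus would get stuck, is reported here as an exception.

```python
def merge_linkers(s1, s2):
    """Merge two sequences of static linkers, matching linkers by name."""
    merged = {}
    for name in s1.keys() | s2.keys():
        if name in s1 and name in s2:
            (i1, e1), (i2, e2) = s1[name], s2[name]
            clash = set(e1) & set(e2)
            if clash:
                raise ValueError(f"export clash in {name}: {sorted(clash)}")
            # Imports satisfied by the other side's exports disappear.
            imports = (i1 - set(e2)) | (i2 - set(e1))
            merged[name] = (imports, {**e1, **e2})
        else:
            # A linker present on one side only is kept unchanged.
            merged[name] = s1[name] if name in s1 else s2[name]
    return merged
```

In this sketch, an import that remains unsatisfied after merging simply stays in the import set of the merged linker, so it can still be resolved by a later static linking step.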
Dynamic Linking The (plugin) rule dynamically links a new assemblage to
the initiating runtime. A new dynamic node (N2) is created out of the assemblage
to be plugged in, via the start() function, and it is then added to the initiating
assemblage runtime by adding the node and an edge with ID ιd to the runtime’s
dynamic linking tree. Note that in dynamic linking, no new runtime is created;
the plugin will eventually become part of the initiating assemblage runtime.
This can be illustrated by the way the start() function is used: the initiating
runtime’s ID is passed to create the new dynamic node, not a fresh runtime ID.
The return value of the plugin expression is the ID of the newly created edge;
this is the dynamic linkage handle to the plugin, and conceptually represents
the link created out of the dynamic linking process. With this, features exported
from the plugin or from the initiating party can thus be accessed via an e.α
expression.
Cross-Computation Communication Connections are established by the
(conn) rule, which adds an entry to the live connection registry (Y) of both
connected parties. Per-connection states are also initialized at this point through
a simple function initS(); since all per-connection states are mutable, this func-
tion deals with the initialization of reference features in the same way as
the start() function explained earlier. Function features defined in connectors
are invoked by expression e ▹ α v, whose semantics is defined by (coninv).
Per-connection state can be referred to via expression e ▹ α; the related reduction
rule is (cons).
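\[
\begin{array}{@{}rcll@{}}
\tau & ::= & \mathbf{unit} \mid \mathbf{int} \mid \tau \to \tau \mid \tau\ \mathbf{ref} & \text{primitive types} \\
 & \mid & \mathbf{Asm}(\mathcal{S}, \mathcal{D}, \mathcal{C}, \mathcal{L}) & \text{assemblage type} \\
 & \mid & \mathbf{Rtm}(\mathcal{C}) & \text{runtime type} \\
 & \mid & \mathbf{Dlnk}(\mathcal{E}) & \text{dynamic linkage type} \\
 & \mid & \mathbf{Cnt}(\mathcal{E}, \mathcal{J}) & \text{connection type} \\
\mathcal{S}, \mathcal{D} & ::= & \nu \mapsto \langle \mathcal{I}; \mathcal{E}\rangle & \text{static linker / dynamic linker type} \\
\mathcal{C} & ::= & \nu \mapsto \langle \mathcal{I}; \mathcal{E}; \mathcal{J}\rangle & \text{connector type} \\
\mathcal{I}, \mathcal{E}, \mathcal{J}, \mathcal{L} & ::= & \alpha : \tau & \text{feature type declaration}
\end{array}
\]
Fig. 8. Type Syntax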
In the (coninv) rule, ιc ▹ α v invokes a function named α on a previously
established connection ιc, with v as the parameter. Since α could be defined
by either of the two parties that connection ιc connects, there are two possibilities:
1) α is exported by the assemblage runtime containing the expression; in this
case, the invocation is an intra-runtime one. 2) α is imported by the assemblage
runtime containing the expression; the invocation is thus an inter-runtime one.
In the latter case, a deep copy of parameter v is passed to the target runtime;
specifically, when v is a store ID, the heap cells associated with v are passed
along, with store IDs refreshed. The underlying design principle for the copy
semantics is object confinement: each assemblage runtime should have its own
political boundaries, and direct references across boundaries would cause many
problems, security problems among them. Function dcopy(v, H) = (v′, H′) defines
the value (v′) and the heap cells (H′) that need to be transferred if v under heap
H needs to be passed across computations. v′ is not always the same as v because
stores are refreshed if v is a store ID. In both case 1) and case 2), substitution
of ιc for thisc is needed to determine what the “current connection” means.
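A possible reading of dcopy is sketched below in Python. The encoding is a toy one chosen for illustration only: store IDs are strings beginning with "s", every other value is treated as primitive, and the function names are hypothetical. The sketch follows references transitively and renames each store exactly once, so shared or cyclic stores are copied consistently; whether the formal definition treats sharing this way is an assumption on our part.

```python
from itertools import count


def is_store_id(v):
    # Toy encoding: store IDs are strings starting with "s".
    return isinstance(v, str) and v.startswith("s")


def dcopy(v, heap, fresh):
    """Return (v', H'): a copy of v and of the heap cells it reaches,
    with every store ID replaced by a fresh one."""
    renamed, out = {}, {}

    def copy(x):
        if not is_store_id(x):
            return x  # primitive values are passed unchanged
        if x not in renamed:
            renamed[x] = f"s{next(fresh)}"   # refresh the store ID once
            out[renamed[x]] = copy(heap[x])  # copy the cell contents
        return renamed[x]

    return copy(v), out
```

The returned heap fragment is what would be merged into the target runtime's heap; the source heap is left untouched.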
5 The Type System
In this section, we informally explain the type system of our calculus. We start
with an overview which covers the major ideas behind the type system, then
we explain the type-checking process in detail. Some properties of the type sys-
tem are stated at the end. The complete formal type system with proofs of its
properties can be found in a technical report [LS04].
5.1 Overview
The Types The types are defined in Fig. 8. The assemblage type Asm(𝒮, 𝒟, 𝒞, ℒ)
contains type declarations of static linkers, dynamic linkers, connectors, and lo-
cal features. It is used in two situations: top-level typechecking and typechecking
of first-class assemblages. At the top level, each assemblage is a separate com-
pilation unit in our type system and is given an assemblage type. In the second
situation, first-class assemblages can appear anywhere as expressions, be passed
as arguments, etc.
The runtime type Rtm(𝒞) is the type of assemblage runtimes. When an
assemblage runtime is viewed by other assemblage runtimes, the only thing other
runtimes care about is how to communicate with the runtime. Thus, a runtime
type only contains the list of connector types. The dynamic linkage type Dlnk(ℰ)
is structurally a sequence of type declarations for the functions exported either
from the dynamic linking initiator’s dynamic linker or from the dynamic plugin’s
corresponding static linker. The connection type Cnt(ℰ, 𝒥) is structurally a
sequence of type declarations for the functions exported either from the
connection initiator’s connector or from the connection receiver’s connector,
together with 𝒥, the type declarations of the per-connection states.
Interface Matching As introduced in Sec. 2, static linking, dynamic linking
and connection establishment share one common trait: all three fundamental pro-
cesses involve bi-directional interface matching. This commonality is reflected
in the typechecking of three related expressions: A1 + A2, plugin ν1 ↦ ν2 e and
connect ν1 ↦ ν2 e; an interface match check is performed in all three cases,
between two static linkers in the first case, one dynamic linker and one static
linker in the second case, and two connectors in the third case. By definition,
each interface type, be it a static linker type, dynamic linker type or connector
type, is composed of a list of type declarations for imported features and a list
for exported features. Two interface types, say i1 and i2, are considered a match
iff
1. If i1 exports a feature α of type τ, and i2 imports a feature α of type τ′,
then τ must be a subtype of τ′. The same must also hold if i2 exports and
i1 imports.
2. i1 and i2 do not export features by the same name.
3. If i1 and i2 are both static linker types, they do not import features of the
same name. If one of them is not a static linker type, then every imported
feature in i1 (or i2) must match an exported feature of the same name in i2
(or i1).
Condition 1 is the most important one: features matched by name also match
by type. The flexible part is that our type system does not demand exact match-
ing of types; instead, it is acceptable if the export feature has a more precise
type than what is expected from the import counterpart. Our subtyping relation
is standard for primitive types; for types Asm(𝒮, 𝒟, 𝒞, ℒ), Rtm(𝒞), Dlnk(ℰ),
and Cnt(ℰ, 𝒥), subtyping is given the natural structural definition.
Condition 2 is used to avoid a feature name clash. For instance, if ιd is the
result of plugin ν ↦ ν′ e, the meaning of expression ιd.α would be ambiguous if
both dynamic linker ν of the initiator and static linker ν′ of the dynamic plugin
exported the feature α. The restriction here might not correspond to reality: in
real life, dynamic plugins might be developed independently, and such a name
clash does have a chance to happen. However, such a clash can easily be avoided
if the language supports either feature name renaming, or casting to remove
some exported features. Our calculus currently does not include these operators,
but they can easily be added without affecting the calculus core.
Condition 3 states that in the dynamic linking and connection cases, no dangling
imports are allowed if the interface match succeeds; for static linking, our calculus
does allow some imports to remain unsatisfied, since the result (say A) of A1 + A2
can still be statically linked in the future, e.g., by A + A3.
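The three matching conditions can be put together in a short Python sketch. The encoding is ours and purely illustrative: an interface type is a dictionary with an "imports" and an "exports" table from feature names to types plus a "static" flag saying whether it is a static linker type, and types are opaque strings. The subtype test is passed in as a parameter; the default used here, plain equality, is only a stand-in for the subtyping relation of the type system.

```python
def interfaces_match(i1, i2, is_subtype=lambda a, b: a == b):
    """Check the three interface-matching conditions on two interface types."""
    # Condition 1: an exported feature must be a subtype of the import it meets.
    for exp, imp in ((i1, i2), (i2, i1)):
        for name, tau in exp["exports"].items():
            if name in imp["imports"] and not is_subtype(tau, imp["imports"][name]):
                return False
    # Condition 2: no feature is exported by both sides.
    if i1["exports"].keys() & i2["exports"].keys():
        return False
    # Condition 3: depends on whether both sides are static linker types.
    if i1["static"] and i2["static"]:
        return not (i1["imports"].keys() & i2["imports"].keys())
    return (i1["imports"].keys() <= i2["exports"].keys()
            and i2["imports"].keys() <= i1["exports"].keys())
```

For example, a dynamic linker importing decode and exporting log matches a static linker that exports decode and imports log, but not one that additionally imports a feature the dynamic linker does not export.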
Principle of Computation Encapsulation and Parameter Passing Across
Computations One of the design principles of our calculus is computation en-
capsulation: reactive computations should only communicate with each other via
explicit interfaces, in our context, connectors. Parameter passing across reactive
computations, if not handled properly, could however violate this principle. The
three types of parameters that cause trouble are function closures, dynamic
linkage handles, and connection handles; our type system disallows the passing
of these three types of values.
We first consider the problematic case of passing connection handles. Suppose
the assemblage runtime with ID ιr1 contains a connect ν ↦ ν′ ιr2 expression which
returns a connection handle ιc. Had we allowed ιc to be passed as a parameter,
runtime ιr1 could pass it to some runtime ιr3 via some previously existing con-
nection. Now although runtime ιr3 does not have a connector ν (or ν′), it would
still be able to use features associated with connection ιc via ιc ▹ α e expres-
sions, meaning it would access a feature not through an explicit interface on its
own runtime, ιr3. Similarly, passing dynamic linkage handles could allow assemblage
runtimes to gain direct access to dynamic plugins they do not have interfaces to
plug in to.
The case for passing function closures across assemblage runtimes suffers from
a similar problem, but it is less obvious. In Sec. 4, we have already mentioned
how there is no technical difficulty in passing function closures across assemblage
runtimes; indeed we could just pass function closures in a manner similar to how
Java RMI passes object references. Now let us consider why a mechanism like this
would violate our encapsulation principle. Suppose the assemblage runtime with ID
ιr1 has a function closure fun(ιr1, ιn1, λx.e) and e contains an expression α@ν
to use a feature α exported from static linker ν of ιr1. Had we allowed this closure
to be passed to another runtime, it could, by several indirections (ιr1 to ιr2, and
ιr2 to ιr3 for instance), eventually be received by some runtime ιr3 that has no
direct communication with ιr1. But, by applying the function, this assemblage
runtime ιr3 would be able to access feature α of static linker ν of ιr1, through
a channel not explicit in ιr3’s interface.
The legal parameters that can be passed across computations are primitive
values such as integers, runtime handles, references and first-class assemblages.
Passing runtime handles is an important means for a runtime to “advertise” itself
to other runtimes. References are passed by deep copy (recall the reduction rule
(coninv) in Sec. 4). The exclusion of function closures from passable parameters
across computations might appear to rule out any code passing
in our calculus, but in fact it does not. If passing code is needed, users can encapsulate
the code as an assemblage and pass the assemblage; assemblages are completely
self-contained, without need for a closure, and so are nothing more than a kind
of data.
To enforce the principle of computation encapsulation, our type system checks
that for every imported function feature given in a connector type, its parameter
and return value cannot have any of the three types excluded above. This
well-formedness property must hold for all connector types.
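A minimal sketch of this well-formedness check is given below in Python. The type encoding is hypothetical: base types are strings, and constructed types are tuples whose first component is a tag such as "fun", "ref", "Rtm", "Dlnk" or "Cnt". The sketch also looks through reference types, on the assumption that a reference to a forbidden value should be rejected as well; this is our reading rather than something stated above.

```python
# Closures, dynamic linkage handles and connection handles may not cross runtimes.
FORBIDDEN = {"fun", "Dlnk", "Cnt"}


def passable(tau):
    """True if a value of type tau may be passed across computations."""
    if isinstance(tau, tuple):
        if tau[0] in FORBIDDEN:
            return False
        if tau[0] == "ref":
            return passable(tau[1])  # assumption: look through references
    return True


def connector_well_formed(imports):
    """imports maps feature names to types; function types are ("fun", arg, ret)."""
    return all(passable(t[1]) and passable(t[2])
               for t in imports.values()
               if isinstance(t, tuple) and t[0] == "fun")
```

Under this encoding, an imported function taking a runtime handle is accepted, while one taking a function or returning a reference to a connection handle is rejected.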
5.2 Details of Typechecking
We now explain how top-level typechecking is achieved, and how some important
expressions are typechecked, in our type system.
Assemblage Typechecking At top level, assemblage ⟨S; D; C; L⟩ as a separate
compilation unit is well-typed if all exported features defined in its static linkers
S, dynamic linkers D and connectors C, and all locally defined features L, are
well-typed. Well-typed assemblages have an assemblage type Asm(𝒮, 𝒟, 𝒞, ℒ),
which structurally corresponds to ⟨S; D; C; L⟩ in an intuitive way.
Typechecking an assemblage that appears inside a program as a first-class value
is the same as top-level typechecking, with the only exception that if the assem-
blage is annotated with assemblage type Asm(𝒮, 𝒟, 𝒞, ℒ) and typechecks, the
type we give to this assemblage expression is Asm(𝒮, 𝒟, 𝒞, φ). Assemblages are
encapsulated entities and on the outside local features should be invisible, just
as with private fields of objects.
Static Linking To typecheck expression A1 + A2, the following conditions have
to be satisfied:
- A1 and A2 both need to be well-typed, with types Asm(𝒮1, 𝒟1, 𝒞1, φ) and
Asm(𝒮2, 𝒟2, 𝒞2, φ) respectively.
- If 𝒮1 includes a static linker named ν and 𝒮2 includes a static linker with
the same name, the two linkers must match according to Sec. 5.1.
- No dynamic linkers in 𝒟1 and 𝒟2 can share the same name.
- No connectors in 𝒞1 and 𝒞2 can share the same name.
Expression A1 + A2 has type Asm(𝒮1 ⊔ 𝒮2, 𝒟1 ⊎ 𝒟2, 𝒞1 ⊎ 𝒞2, φ) if the above
conditions are met. Here the ⊔ operator is defined in the same way as the static
linker merging operator explained in Sec. 4.3, with S, I, E replaced by 𝒮, ℐ, ℰ.
Also notice that since
first-class assemblages are given a type in which local feature types are set to φ,
there can be no name clash checking on local features of A1 and A2. This is not
a problem, however, since assemblages are α-convertible with respect to local
feature names (see Sec. 4).
Dynamic Linking Expression plugin ν1 ↦ ν2 e, when well-typed in our type
system, has a dynamic linkage type Dlnk(ℰ); this corresponds to the fact that
the return value of a plugin expression is a dynamic linkage handle. It typechecks
iff
- It appears in an assemblage whose type is Asm(𝒮1, 𝒟1, 𝒞1, ℒ1).
- Expression e, which will evaluate to an assemblage, has a type Asm(𝒮2, 𝒟2, 𝒞2, φ).
- 𝒟1 includes a dynamic linker type ν1 ↦ ⟨ℐ1; ℰ1⟩, and 𝒮2 includes a static
linker type ν2 ↦ ⟨ℐ2; ℰ2⟩, and the two types match according to Sec. 5.1.
- ℰ = ℰ1 ⊎ ℰ2.
- 𝒞2 must be φ.
- Static linker ν2 is the only static linker in 𝒮2 which has imported features.
The last two conditions merit some further explanation. 𝒞2 must be φ because
if dynamic plugins had extra connectors, the assemblage runtime, after being
plugged in with dynamic plugins of this kind, would be faced with a dilemma: it
either needs to dynamically change its type to reflect some connectors that are
dynamically added, or these new connectors are not exposed to outsiders and are
de facto useless. To preserve simplicity, our calculus does not tackle this dilemma.
The last condition is necessary because otherwise, the imported features not
satisfied by dynamic linking would become dangling unresolved name references.
Cross-Computation Communication Expression connect ν1 ↦ ν2 e, when
well-typed in our type system, has a connection type Cnt(ℰ, 𝒥1); this corre-
sponds to the fact that the return value of a connect expression is a connection
handle. It typechecks iff:
- It appears in an assemblage whose type is Asm(𝒮1, 𝒟1, 𝒞1, ℒ1).
- Expression e, which evaluates to an assemblage runtime handle, has a type
Rtm(𝒞2).
- 𝒞1 includes a connector type ν1 ↦ ⟨ℐ1; ℰ1; 𝒥1⟩, and 𝒞2 includes a connector
type ν2 ↦ ⟨ℐ2; ℰ2; 𝒥2⟩, and the two connector types match according to Sec.
5.1.
- ℰ = ℰ1 ⊎ ℰ2.
5.3 Properties of the Type System
We have proved soundness of our type system [LS04]: we have shown that
the bootstrapping process preserves type, and that the subject reduction property
holds, i.e., the reduction $G, e \xrightarrow{\iota r,\,\iota n} G', e'$ always
preserves type. In addition, the typechecking process is decidable.
6 Related Work
In terms of static linking alone, our calculus is in the spirit of numerous module
systems and calculi mentioned in Sec. 1 and Sec. 2. The calculus presented here
supports first-class modules and static linking as first-class expressions, which
some of the aforementioned projects, such as ML functors, do not support. In this
presentation, we omitted how types can themselves be imported or exported as
features, but type importing/exporting, including bounded parametric types and
cross-module recursive types is covered in the long version [LS04]. Previous works
in this category, with the exception of Units, do not consider dynamic linking or
cross-computation communication. For instance, ML modules do not themselves
constitute a runtime, and even though structures have explicit interfaces for run-
time interaction via S.x, this is for tightly-coupled interaction within a single
runtime, not cross-computation invocation.
Dynamic linking is supported in Java. Although it does not support source-
level dynamic linking expressions, classes are loaded dynamically [LB98,DLE03].
The classloader mechanism provides a very powerful way to customize the dy-
namic linking process, but its maximum expressiveness, particularly classloader
delegation, requires much of the typechecking work to happen at dynamic link
time. In addition, we believe the granularity of modules provides a better layer
for dynamic linking, because explicit dynamic linking interfaces can be specified
without too much labor, and users can have more programmatic control over
the dynamic linking process. Dynamic linking of modules is explored in Argus
[Blo83], in the invoke expressions of Units [FF98], and in the dynamic export
declarations of MJ [CBGM03]. These projects only take advantage of the fact
that dynamic plugins are modules and so the interface is unidirectional only: the
running program has no explicit interface to the plugged-in code.
There have been many effective protocols developed for reactive computa-
tions, including RMI, RPC, and component architectures such as COM+ and
CORBA. These protocols generally define a one-way communication interface
only; the receiver has an interface, but not the sender. The bidirectional con-
nector as a concept has existed in the software engineering community for some
time, e.g. in [AG97], but those closest to ours are two programming language ef-
forts: ArchJava [ASCN03], and Cells [RS02]. In ArchJava, connectors are more
low-level than ours. Each connector may have a typecheck method which maxi-
mizes the flexibility of typechecking, something we do not support. Connectors in
our calculus share the same notion as in the Cells language, but connectors in
Cells do not consider rebindability and per-connection states. These projects in
general do not have a module system as a priority, and they therefore do not
address type imports and exports.
Research on software components [Szy98] is diverse. A number of industrial
component systems (such as COM+, JavaBeans) have been successful in mod-
elling reactive computations, and some, such as CORBA CCM, support both static
linking and cross-computation communication. They are only loosely re-
lated to our project, as they do not consider dynamic linking or type
issues.
7 Conclusions and Future Work
The major contribution of this paper is a novel module system where static
linking, dynamic linking and cross-computation communication are all defined
in a uniform framework by declaring explicit, bi-directional interfaces. Explicit
interfaces for dynamic linking and cross-computation communication provide
more declarative specifications of the interaction between parties, and also give
a stronger foundation for adding other critical language features such as security
and version control. We have yet to see a fully bi-directional dynamic linking
interface in the literature. Bi-directional communication interfaces are found for
example in [ASCN03,RS02], but our work builds this feature into module sys-
tems. In the full version of this paper [LS04] we show how the calculus presented
here can be extended to include types as features, and how they are useful for
situations involving dynamic linking and cross-computation communication.
Since every cross-computation communication must be directed through con-
nectors in our calculus, access control on connectors is enough to ensure the
network security of the assemblage. Assemblages provide a good granularity for
encapsulation, and the type system and semantics of our calculus restrict the
types of data that can be transferred across computations, which also coincides
with a proper policy of confinement in security. A future topic is to define a
complete security architecture based on this calculus.
It is our belief that rebindable dynamic linkers can provide a strong initial
basis upon which a rigorous theory of code version control can be built. Since
dynamic plugins can be rebound, each successive binding at runtime is a new
version of the code. Rebindability also allows multiple versions to co-exist. We
are interested in building a version control layer on top of our calculus.
Acknowledgements We would like to acknowledge Ran Rinat for contributions
at earlier stages of this project.
References
[AG97] Robert Allen and David Garlan. A formal basis for architectural connection.
ACM Transactions on Software Engineering and Methodology, 6(3):213–
249, 1997.
[ASCN03] Jonathan Aldrich, Vibha Sazawal, Craig Chambers, and David Notkin. Lan-
guage support for connector abstractions. In Proceedings of the Seventeenth
European Conference on Object-Oriented Programming, June 2003.
[AZ02] D. Ancona and E. Zucca. A calculus of module systems. Journal of func-
tional programming, 11:91–132, 2002.
[BC90] Gilad Bracha and William Cook. Mixin-based inheritance. In Norman Mey-
rowitz, editor, Proceedings of OOPSLA/ECOOP, pages 303–311, Ottawa,
Canada, 1990. ACM Press.
[BHSS03] G. Bierman, M. Hicks, P. Sewell, and G. Stoyle. Formalizing dynamic
software updating, 2003.
[Blo83] Toby Bloom. Dynamic module replacement in a distributed programming
system. Technical Report MIT/LCS/TR-303, 1983.
[Car97] Luca Cardelli. Program fragments, linking, and modularization. In Confer-
ence Record of POPL’97: The 24th ACM SIGPLAN-SIGACT Symposium
on Principles of Programming Languages, pages 266–277, 1997.
[CBGM03] John Corwin, David F. Bacon, David Grove, and Chet Murthy. MJ: a
rational module system for Java and its applications. In Proceedings of the
18th ACM SIGPLAN conference on Object-oriented programming, systems,
languages, and applications, pages 241–254, 2003.
[DEW99] Sophia Drossopoulou, Susan Eisenbach, and David Wragg. A fragment
calculus towards a model of separate compilation, linking and binary com-
patibility. In Logic in Computer Science, pages 147–156, 1999.
[DLE03] Sophia Drossopoulou, Giovanni Lagorio, and Susan Eisenbach. Flexible
models for dynamic linking. In 12th European Symposium on Programming,
2003.
[DS96] Dominic Duggan and Constantinos Sourelis. Mixin modules. In Proceedings
of the ACM SIGPLAN International Conference on Functional Program-
ming (ICFP ’96), volume 31(6), pages 262–273, 1996.
[FF98] Matthew Flatt and Matthias Felleisen. Units: Cool modules for HOT lan-
guages. In Proceedings of the ACM SIGPLAN ’98 Conference on Program-
ming Language Design and Implementation, pages 236–248, 1998.
[HL02] Tom Hirschowitz and Xavier Leroy. Mixin modules in a call-by-value set-
ting. In European Symposium on Programming, pages 6–20, 2002.
[HMN01] Michael W. Hicks, Jonathan T. Moore, and Scott Nettles. Dynamic software
updating. In SIGPLAN Conference on Programming Language Design and
Implementation, pages 13–23, 2001.
[HSW+00] Jason Hill, Robert Szewczyk, Alec Woo, Seth Hollar, David E. Culler, and
Kristofer S. J. Pister. System architecture directions for networked sen-
sors. In Architectural Support for Programming Languages and Operating
Systems, pages 93–104, 2000.
[LB98] Sheng Liang and Gilad Bracha. Dynamic class loading in the Java vir-
tual machine. In Conference on Object-oriented programming, systems,
languages, and applications (OOPSLA’98), pages 36–44, 1998.
[LS04] Yu David Liu and Scott F. Smith. Modules with Interfaces
for Dynamic Linking and Communication (long version),
http://www.cs.jhu.edu/~scott/pll/assemblage/asm.pdf.
Technical report, Baltimore, Maryland, March 2004.
[Mac84] D. MacQueen. Modules for Standard ML. In Proceedings of ACM Confer-
ence on Lisp and Functional Programming, pages 409–423, 1984.
[MFH01] S. McDirmid, M. Flatt, and W. Hsieh. Jiazzi: New-age components for
old-fashioned Java. In Proc. of OOPSLA, October 2001.
[RS02] Ran Rinat and Scott Smith. Modular internet programming with cells. In
Proceedings of the Sixteenth ECOOP, June 2002.
[SPW03] Nigamanth Sridhar, Scott M. Pike, and Bruce W. Weide. Dynamic module
replacement in distributed protocols. In Proceedings of the 23rd Interna-
tional Conference on Distributed Computing Systems, May 2003.
[Szy98] Clemens Szyperski. Component Software: Beyond Object-Oriented Pro-
gramming. ACM Press and Addison-Wesley, New York, NY, 1998.
[WV00] J. B. Wells and René Vestergaard. Equational reasoning for linking with
first-class primitive modules. In Programming Languages and Systems, 9th
European Symp. Programming, volume 1782, 2000.