Communicating Process Architectures 2015
P.H. Welch et al. (Eds.)
Open Channel Publishing Ltd., 2015
© 2015 The authors and Open Channel Publishing Ltd. All rights reserved.
CoCoL: Concurrent Communications Library
Kenneth SKOVHEDE¹ and Brian VINTER
Niels Bohr Institute, University of Copenhagen
Abstract. In this paper we examine a new CSP-inspired library for the Common Intermediate Language, dubbed CoCoL: Concurrent Communications Library. The use of the Common Intermediate Language makes the library accessible from a number of languages, including C#, F#, Visual Basic and IronPython. The processes are based on tasks and continuation callbacks, rather than threads, which enables networks with millions of running processes on a single machine. The channels are based on request queues with two-phase commit tickets, which enables external choice without coordination among channels. We evaluate the performance of the library on different operating systems, and compare the performance with JCSP and C++CSP.
Keywords. CSP, concurrent programming, process oriented programming, C#, .Net, Common Intermediate Language
Introduction
Since C. A. R. Hoare introduced the CSP algebra [1], a large number of implementations have been developed, among which the occam family [2,3] and later JCSP [4] have attracted attention. Where the occam family presents the user with a new language, designed to give easy access to CSP features, the JCSP approach is to use the Java language and environment and add CSP functionality.
Introducing a new language for CSP has some obvious benefits, such as natural constructs for expressing processes, external choice, guaranteed freedom from side effects, and other central CSP elements. When the CSP elements are instead introduced into an existing language, the features provided by that language limit the design freedom.
On the other hand, when adding CSP features to an existing language, the CSP implementation can leverage the existing user base, rather than require newcomers to learn a new syntax and set of semantics. With an existing language there is usually also an existing ecosystem with toolchains, support libraries etc. Another important benefit of adding CSP support to an existing language is that the user can choose a mixed approach, where only parts of the program use CSP constructs, and other parts use the native language approach, for example functional or object oriented.
With the Concurrent Communications Library (CoCoL) we choose the latter approach: implementing CSP functionality as a library for the Common Intermediate Language (CIL).
In CoCoL, we have experimented with implementing a communication oriented programming paradigm for the CIL languages, hosted entirely within the Common Language Runtime (CLR), and leveraging features of the CIL languages. With this paper we present the design considerations and measure the achieved performance compared to a number of related libraries.
¹ Corresponding Author: Kenneth Skovhede, Niels Bohr Institute, Blegdamsvej 17, DK-2100 Copenhagen OE. Tel.: +45 35325209; E-mail: skovhede@nbi.ku.dk.
CPA 2015 preprint the proceedings version will have other page numbers and may have minor differences.
Through the use of CoCoL, it is possible to use a CSP-like design approach from any of the languages that compile to CIL, including C#, F# and Visual Basic. This enables the existing users of a number of languages to apply CSP design principles without learning a new language.
1. Background
As CoCoL relies heavily on features found in the C# language and runtime, this section provides an overview of some of the components in that environment. This section is by no means an exhaustive listing of all features, but seeks to provide the foundation for understanding the implementation of CoCoL.
1.1. CLI Terminology
The Common Language Infrastructure (CLI) is a specification [5] for a runtime environment, which comprises the Common Language Runtime (CLR), the Common Type System (CTS), the Common Metadata format, and the Common Intermediate Language (CIL).
CIL is an assembly-like bytecode, comparable to Java bytecode, with the difference that CIL is designed to support a number of different languages. CIL is executed by the CLR, which can be compared to the Java Virtual Machine (JVM). Just as Java source code is compiled into Java bytecode and executed by the JVM, languages such as C#, F# and Visual Basic are compiled into CIL and executed by the CLR. The JVM and CLR environments also share other traits, such as being based on JIT compilation, having garbage collected memory, and differentiating between value-based and reference-based types [6,7].
Any language that compiles into CIL should also follow the Common Language Specification (CLS), which describes the rules a compatible language should observe. If a language uses the CTS and honors the CLS, any other language in the CLI can use compiled methods and types from that language, and vice versa. This interoperability feature is used in the CLI to provide a set of Standard Libraries, which provide common functionality, such as file access, network access and XML processing, to all languages.
The most prominent implementation of the CLI is the Microsoft .Net Runtime, which is available only on Windows. The open source Mono [8] implementation is feature complete in terms of the CLI, and available on all major platforms, but does not implement all of the support libraries shipped with the .Net implementation.
1.2. Generics
In a strongly typed language, a method that needs to operate on any type of data often uses the common Object type. As an example, a dynamic list (e.g. an ArrayList) can contain all types of data, by forcing the caller to convert, or cast, the data to the Object type before storing it, and then reversing the operation when retrieving it. This allows for a single implementation of a dynamic list, but comes with a large overhead when the data are primitives, such as integers, because these need to be boxed, that is, encapsulated in heap allocated objects that later need to be de-allocated. This issue has compelled both Java and CIL to introduce generics, which can be considered a type-safe kind of templates.
Since CIL was introduced later than Java, it has less legacy code to support than Java. This has prompted Microsoft to introduce a compatibility breaking change in the type system, which allows for the use of type-safe generic types [9]. Java has chosen the type erasure approach, which gives full backward compatibility, but completely removes the generic type information from the resulting bytecode [10]. In contrast, CIL introduced the generics into the type system, so the type information is available at runtime. The CIL runtime system uses the JIT compiler to generate typed classes based on the actual type. This allows the runtime to completely avoid casting into the common object super-type. Listing 1 shows an example of a generic method, which returns the first element of an array. In Java, no type information is preserved, so it would be required to use Object in place of T, and then cast the input. In the CIL version, the type information is present, so it can be transformed into the version shown in Listing 2, with no casting required. In a C++ template setting, the transformation would be performed at compile time, so a C++ binary must include the transformed versions for all used types. In CIL this is not required, because the expansion happens at runtime when the method is JIT compiled, which allows a library to export a function that takes any type as input.
    T First<T>(T[] data) {
        if (data == null)
            throw new ArgumentNullException("data");
        else if (data.Length == 0)
            return default(T);
        else
            return data[0];
    }
Listing 1: Generic function example
    int First(int[] data) {
        if (data == null)
            throw new ArgumentNullException("data");
        else if (data.Length == 0)
            return 0;
        else
            return data[0];
    }
Listing 2: Instantiated generic function
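The point about runtime type information can be illustrated with a small, self-contained sketch. The class name GenericsDemo and the helper Describe are hypothetical and only serve the illustration; they are not part of CoCoL. The sketch assumes nothing beyond the standard library.

```csharp
using System;

static class GenericsDemo
{
    // Returns the first element, or default(T) for an empty array.
    public static T First<T>(T[] data)
    {
        if (data == null)
            throw new ArgumentNullException("data");
        return data.Length == 0 ? default(T) : data[0];
    }

    // typeof(T) is resolved at runtime, because CIL keeps generic type information.
    public static string Describe<T>(T[] data)
    {
        return typeof(T).Name + "[" + data.Length + "]";
    }

    static void Main()
    {
        Console.WriteLine(First(new[] { 7, 8, 9 }));       // prints 7
        Console.WriteLine(First(new string[0]) == null);   // prints True
        Console.WriteLine(Describe(new[] { 1.5, 2.5 }));   // prints Double[2]
    }
}
```

No cast or boxing appears in the calling code, and the element type can still be inspected inside the generic method.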
1.3. Delegates and Lambda Functions
What is known as higher-order functions or function pointers is implemented in CIL using a delegate, which encapsulates the context of the caller, that is, the value of this [9]. When the delegate is created, the this context and target method are stored in a lookup table, and a pointer to this memory area is returned¹. When the delegate is invoked, the callback function is invoked in the correct context. This makes it possible to create a callback method that automatically carries state, and can invoke a method on a specific object instance.
    T First<T>(T[] data, Func<T, bool> test) {
        if (data == null)
            throw new ArgumentNullException("data");
        foreach (var n in data)
            if (test(n))
                return n;
        return default(T);
    }
Listing 3: Generic function with delegate
    var data = new int[] { 1, 2, 3 };
    var n = 0;
    var oneplus = First(data, x => {
        if (x > 0)
            n++;
        return n >= 2;
    });
Listing 4: Example use of a lambda method
A related technique is anonymous methods, also known as lambda methods, which are methods that can only be referenced by their handle (i.e. they have no name). In C# the => operator creates a lambda function, which can be combined with generics to implement common methods. As an example, consider the generic method in Listing 3, which accepts a delegate called test. In Listing 4, this method is used to pick the second positive integer in an array. Note that the variable n is declared outside the lambda function, but is still accessible from within, and can be used to keep state inside the lambda function. This reveals that while the lambda method looks like a delegate, it is in fact more complicated, as it needs to create a closure object that captures all accessible variables. The this context is used with the delegate to invoke the closure instance and allow the delegate code to access the variables inside the scope.
¹ Here, delegates refer to the pure function pointer-like feature in C#, not the delegate in F#, which is equivalent to a lambda closure in C#.
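As a minimal sketch of the closure behaviour described above, the following program shows that a captured variable lives in a closure object and outlives the method that declared it. The names ClosureDemo and MakeCounter are hypothetical and chosen only for this illustration.

```csharp
using System;

static class ClosureDemo
{
    // Returns a delegate whose captured variable 'count' outlives this call.
    public static Func<int> MakeCounter()
    {
        var count = 0;
        return () => ++count;
    }

    static void Main()
    {
        var next = MakeCounter();
        var other = MakeCounter();    // a separate closure object with separate state
        next();
        next();
        Console.WriteLine(next());    // prints 3
        Console.WriteLine(other());   // prints 1
    }
}
```

Each call to MakeCounter produces its own closure instance, which is why the two delegates count independently.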
1.4. Continuations
Another use of lambda methods is to provide a callback method for long-running operations. This kind of callback is commonly referred to as a continuation, as the code “continues” inside the callback method. Continuations can be considered similar to an event-based model, where the event “fires” once the long-running operation has completed.
Without callbacks, a multithreading approach would need to introduce a worker thread and handle communication with locks and monitors, which is known to be error prone and difficult for novice programmers [11,12].
    void Example() {
        var a = LoadUrl();
        var b = Download(a);
        file.Write(a, b);
    }
Listing 5: Sequential code
    void Example() {
        LoadUrl((a) => {
            Download(a, (b) => {
                file.Write(a, b);
            });
        });
    }
Listing 6: Continuation with lambda functions
    async void Example() {
        var a = await LoadUrl();
        var b = await Download(a);
        await file.Write(a, b);
    }
Listing 7: Finite state machine with await
    class State {
        object a, b;
        int state = 0;
        // Advances the state machine by one step
        public void Next() {
            switch (this.state) {
                case 0:
                    this.state = 1;
                    LoadUrl(this.SetA);
                    break;
                case 1:
                    this.state = 2;
                    Download(this.a, this.SetB);
                    break;
                case 2:
                    file.Write(this.a, this.b);
                    break;
            }
        }
        // Callbacks store the result and resume the state machine
        void SetA(object arg) {
            this.a = arg;
            this.Next();
        }
        void SetB(object arg) {
            this.b = arg;
            this.Next();
        }
    }
    void Example() {
        new State().Next();
    }
Listing 8: Callbacks with a finite state machine
A simple program that loads a URL from (slow) storage, and then downloads the content from the (slow) network can be written sequentially, as in Listing 5. Such a sequential program has the benefit that the flow is easy to follow, as each line is executed in full before advancing to the next line, and local variables store the program state. With the use of lambda functions, a similar program can be written in continuation style, as shown in Listing 6. This approach means that the initiating thread is not blocked during the long-running operations, but it has the drawback that the program flow becomes harder to follow, especially if one of the methods needs to throw an exception. The continuation approach also complicates the storage of state data (i.e. “a” and “b”), although the compiler handles this automatically.
A more structured approach is to use a finite state machine, as shown in Listing 8. From the number of lines alone, it is clear that this approach is less intuitive and harder to use. But it does define a strict program flow, which allows error handling to be introduced.
Fortunately, .Net 4.5 features two new keywords: async and await [13]. The async keyword is simply used for ensuring backwards compatibility with older code, where await was not a keyword and thus could be used as a variable name. By adding the async modifier to a function declaration, the compiler interprets await as a keyword. The await keyword is the major change: it automatically transforms the function into a finite state machine, and captures all variables in the scope into a new object instance. Each use of the await statement corresponds to a state in the state machine, and the callback points to the state object, such that once the operation completes, it advances the state.
This rewrite is performed solely at compile time, and is thus as efficient as if it were written by hand. This allows a function that needs to use long-running calls to be rewritten as a continuation-based method with only the addition of the async and await keywords, as illustrated in Listing 7. This does not solve the inherent problems found in concurrent programming, but fortunately these can be handled by a CSP-like channel approach!
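A runnable variant of the await-based example may help to make the state transitions concrete. In this sketch the methods LoadUrlAsync and DownloadAsync are hypothetical stand-ins for the slow operations, simulated with Task.Delay; the URL and the returned "payload size" are made up for the illustration.

```csharp
using System;
using System.Threading.Tasks;

static class AsyncDemo
{
    // Hypothetical stand-in for loading a URL from slow storage.
    static async Task<string> LoadUrlAsync()
    {
        await Task.Delay(50);
        return "http://example.org/";
    }

    // Hypothetical stand-in for a slow download; returns a fake payload size.
    static async Task<int> DownloadAsync(string url)
    {
        await Task.Delay(50);
        return url.Length;
    }

    public static async Task<string> ExampleAsync()
    {
        var a = await LoadUrlAsync();      // first suspension point
        var b = await DownloadAsync(a);    // second suspension point
        return a + " -> " + b;
    }

    static void Main()
    {
        // Main blocks here only so that the console program does not exit early.
        Console.WriteLine(ExampleAsync().Result);
    }
}
```

The local variables a and b survive both suspension points because the compiler moves them into the generated state object.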
1.5. Tasks
To further simplify the use of the await statement, the .Net library introduces a common Task class that can be awaited. Any method can thus signal that it is running asynchronously by returning a Task object, and the caller can use the await keyword, or call the Wait method on the Task to suspend the thread until completion. If the Wait method is called, the execution becomes sequential like that in Listing 5. If the await keyword is used instead, the program is written as shown in Listing 7, with the compiler automatically implementing it as shown in Listing 8.
A number of simple helper methods are also found in the Task class, such as WhenAll and WhenAny, which return a Task representing some combination of other Tasks.
The Task class resembles the promise or future idea found in NodeJS [14], Smalltalk [15] and C++11 [16] among others. For Java, the java.util.concurrent.Future [17] class is similar.
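The following sketch shows the WhenAll and WhenAny helpers together with the blocking Wait method. The names TaskDemo and DelayedValue are hypothetical, and the delays are arbitrary values chosen for the illustration.

```csharp
using System;
using System.Threading.Tasks;

static class TaskDemo
{
    // Produces 'value' after roughly 'delayMs' milliseconds.
    static async Task<int> DelayedValue(int value, int delayMs)
    {
        await Task.Delay(delayMs);
        return value;
    }

    // Completes when both tasks are done; results keep the argument order.
    public static async Task<int[]> AllAsync()
    {
        return await Task.WhenAll(DelayedValue(1, 100), DelayedValue(2, 10));
    }

    // Completes as soon as one task is done and returns that task's value.
    public static async Task<int> AnyAsync()
    {
        var winner = await Task.WhenAny(DelayedValue(1, 500), DelayedValue(2, 10));
        return await winner;
    }

    static void Main()
    {
        var all = AllAsync();
        all.Wait();                                        // blocking wait, as in Listing 5
        Console.WriteLine(string.Join(",", all.Result));   // prints 1,2
        Console.WriteLine(AnyAsync().Result);              // normally the faster task's value
    }
}
```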
2. Implementation
The overall design approach has been to make the library API as simple as possible, and to use as many existing language and runtime features as possible. This resulted in a library that contains only a single implementation of a channel, and no implementation of a process. The channel is a generic type, such that it can be typed to transfer a specific type of data, for example an integer.
2.1. Channels
The channel implementation supports multiple readers and multiple writers, that is, it is an any-to-any channel. Any communication on the channel is ordered, such that the first registered reader is guaranteed to read the first value written, and the same applies to the writers. Each communication is also atomic, meaning that a communication will always notify both the reader and the writer of the communication, or do nothing.
The source code for the channel implementation is 300 source lines, and the entire CoCoL library is implemented in less than 1500 lines [18].
A key feature of the channels is that they are based on continuations, that is, rather than block the caller until the operation has completed, they return a Task (see 1.4 and 1.5). Internally, the returned Tasks are stored in queues, to ensure ordered responses. Since the queues will have exactly one entry if the channels are used in a one-to-one manner, there would be very little gained in implementing specialized versions of the channels.
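To make the queue-of-Tasks idea concrete, the following is a minimal sketch of an un-buffered any-to-any channel built on TaskCompletionSource. It is not CoCoL's channel implementation: the class name SketchChannel and its members are hypothetical, and features such as external choice, buffering and retirement are left out.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Illustrative sketch only; not the CoCoL channel class.
class SketchChannel<T>
{
    private readonly object m_lock = new object();
    private readonly Queue<TaskCompletionSource<T>> m_readers =
        new Queue<TaskCompletionSource<T>>();
    private readonly Queue<KeyValuePair<T, TaskCompletionSource<bool>>> m_writers =
        new Queue<KeyValuePair<T, TaskCompletionSource<bool>>>();

    public Task<T> ReadAsync()
    {
        lock (m_lock)
        {
            if (m_writers.Count > 0)
            {
                // Oldest pending writer is served first. A production version
                // would complete tasks outside the lock.
                var w = m_writers.Dequeue();
                w.Value.SetResult(true);
                return Task.FromResult(w.Key);
            }
            // No writer yet: register a pending read and return its Task.
            var tcs = new TaskCompletionSource<T>();
            m_readers.Enqueue(tcs);
            return tcs.Task;
        }
    }

    public Task WriteAsync(T value)
    {
        lock (m_lock)
        {
            if (m_readers.Count > 0)
            {
                // Oldest pending reader receives the value.
                m_readers.Dequeue().SetResult(value);
                return Task.FromResult(true);
            }
            // No reader yet: register a pending write and return its Task.
            var tcs = new TaskCompletionSource<bool>();
            m_writers.Enqueue(
                new KeyValuePair<T, TaskCompletionSource<bool>>(value, tcs));
            return tcs.Task;
        }
    }
}

static class ChannelDemo
{
    static void Main()
    {
        var ch = new SketchChannel<int>();
        var r1 = ch.ReadAsync();      // pending, no writer yet
        var r2 = ch.ReadAsync();
        ch.WriteAsync(10).Wait();
        ch.WriteAsync(20).Wait();
        Console.WriteLine(r1.Result + " " + r2.Result);   // prints 10 20
    }
}
```

The two queues give the ordering guarantee described above: the first registered reader receives the first value written.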
A typical machine cannot run more than a few thousand threads, due to the memory
required for each thread stack. This is normally not a prohibitive limit, as it is far more threads
than there are physical execution units in the system. But in a CSP context, it is common to
create a large number of processes which stay inactive for long periods of time.
By mapping each process to a Task instead of a thread, they are stored as callback
methods with no stack. As mentioned in section 1.4, the state machine encapsulates any local
variables that might otherwise be stored on the stack. This makes it possible to create millions
of processes with a moderate amount of memory.
The Task approach also allows the runtime system to choose how many concurrent
threads it will use to run the tasks. The default implementation leaves this decision to the
ThreadPool, which automatically adjusts the number of active threads, based on the number
of queued Tasks.
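As a rough illustration of how cheap suspended Task-based processes are, the sketch below starts 100000 processes that all wait on the same signal before returning. The names ManyProcesses, Process and Run are hypothetical, and the process count is an arbitrary value chosen for the example.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

static class ManyProcesses
{
    // A "process" that stays suspended until the gate opens, then returns its id.
    static async Task<int> Process(int id, Task gate)
    {
        await gate;   // no thread and no thread stack is held while waiting
        return id;
    }

    public static long Run(int count)
    {
        var gate = new TaskCompletionSource<bool>();
        var processes = Enumerable.Range(0, count)
            .Select(i => Process(i, gate.Task))
            .ToArray();
        gate.SetResult(true);   // wake all suspended processes
        return Task.WhenAll(processes).Result.Sum(x => (long)x);
    }

    static void Main()
    {
        Console.WriteLine(Run(100000));   // prints 4999950000
    }
}
```

Creating the same number of operating system threads would normally not be feasible, whereas each suspended process here only occupies a small heap-allocated state object.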
A related approach to ensuring that processes waiting for communication do not require a stack can be found in the ProcessJ language, which can also be used to increase the number of processes in Java [19]. The project is similar, in that it provides an environment that allows millions of processes. The difference is that the ProcessJ approach relies on a custom compiler and a custom language, whereas CoCoL is implemented with features already in the CLR, and supports existing languages.
2.2. Integration With CIL
By using the built-in Task system found in the .Net 4.5 library, it is possible to mix channel communications with other kinds of blocking operations. It is also possible to use the utility methods that operate on Task objects, notably the ability to wait for multiple Tasks to complete.
    async Task ParDelta() {
        // Omitted channel declarations
        while (true) {
            var data = await input.ReadAsync();
            await Task.WhenAll(
                outA.WriteAsync(data),
                outB.WriteAsync(data)
            );
        }
    }
Listing 9: Parallel Delta function
    async Task SeqDelta() {
        // Omitted channel declarations
        while (true) {
            var data = await input.ReadAsync();
            await outA.WriteAsync(data);
            await outB.WriteAsync(data);
        }
    }
Listing 10: Sequential Delta function
As an example, it is possible to write the classic CSP Delta process, which reads an input channel and copies the value to two or more output channels, as the method shown in Listing 9. Note that the implementation in Listing 9 awaits both write operations in parallel through the WhenAll helper, and thus writes to both output channels in any order, making it a Parallel Delta. If a sequential delta is required, it can be written as shown in Listing 10, where the state machine will ensure that channel “A” is written before channel “B”. The parallel version uses a counter internally to wait for all the writes to complete. Our measurements show that this extra overhead is optimized away by the JIT compiler.
2.3. Additional Channel Features
CoCoL supports two complementary methods for getting a channel; both are found in the ChannelManager factory class. An anonymous channel can be created with a call to CreateChannel, which will simply create a new channel instance, which can be passed around. The method GetChannel takes a channel name as an argument, and will create a channel if no existing channel has the name; otherwise it will return the existing channel with that name.
Other than the creation logic, there is no difference between the channels. If the channels are shared in a complex call hierarchy with no easy way to pass the channel instance, the named approach may simplify this, but comes at the cost of managing a global (channel) namespace.
The channels default to being un-buffered, but can optionally be created as buffered implementations. In a buffered setup, a number of writes are allowed without having a designated reader. When reading from a buffered channel, a buffered write is taken from the queue, and if there are pending writers, the next writer is allowed to write.
Additionally, the channel implementation also supports poison logic, which is named Retire in CoCoL. A retired channel will return an exception on all non-buffered operations happening after Retire() has been called.
To support a mixed-mode program, where only parts of the code are written to utilize the concurrent model, a number of support methods are also present. One of the support methods is the blocking mode extension that simply blocks on each task.
2.4. Processes
As mentioned, there is no explicit support for processes in CoCoL. Instead, the user can start a process in a number of ways. One way is to simply start a thread and run the process as a normal thread. Another method is to create a class that implements the IProcess or IAsyncProcess interface, and start it using the CoCoL loader methods, and yet another way is to simply run an asynchronous function. These different approaches are shown in Listings 11, 12, 13, and 14. Once a process is running, it can simply exit the function to stop running. A thread can be joined, but if the process was started through the loader system, the caller does not know when it stopped. This can be remedied by a signal, such as a channel, or any other inter-process communication method. If the process was started as shown in Listing 12, the returned Task object can be (a-)waited upon.
In JCSP, the user would typically create instances of the different processes, and then execute them explicitly in either sequence or parallel. With CoCoL and asynchronous programming, the processes are started simply by calling a function, thus implicitly running all processes in parallel, but allowing the user to wait for completion.
    void Run(object data) {
        while (true) {
            output.Write(
                input.Read()
            );
        }
    }
    new Thread(Run).Start();
Listing 11: Thread process
    async Task Run() {
        while (true) {
            await output.WriteAsync(
                await input.ReadAsync()
            );
        }
    }
    Run();
Listing 12: Asynchronous process
    class Identity : IProcess {
        public void Run() {
            while (true) {
                output.Write(
                    input.Read()
                );
            }
        }
    }
    CoCoL.Loader
        .StartFromTypes(typeof(Identity));
Listing 13: IProcess process
    class Identity : IAsyncProcess {
        public async Task RunAsync() {
            while (true) {
                await output.WriteAsync(
                    await input.ReadAsync()
                );
            }
        }
    }
    CoCoL.Loader
        .StartFromTypes(typeof(Identity));
Listing 14: IAsyncProcess process
2.5. Alternation With Two-Phase Commit
In CSP there is a construct known as external choice or alternation, which is used to choose
between multiple available channels in a race-free manner. In JCSP and C++CSP this is
implemented with channel guards being passed to a method that chooses which channel to
use.
    bool Offer(object caller)
    {
        Monitor.Enter(m_lock);
        // Return and keep lock
        if (!m_taken)
            return true;
        Monitor.Exit(m_lock);
        return false;
    }
    void Commit()
    {
        m_taken = true;
        Monitor.Exit(m_lock);
    }
    void Withdraw()
    {
        Monitor.Exit(m_lock);
    }
Listing 15: Basic implementation of a two-phase-commit that allows a single operation
To keep CoCoL more in line with existing CIL terminology, the operations that perform external choice are called ReadFromAny and WriteToAny. Rather than implement guards for the channels, the functions simply take a list, or array, of channels. To implement the skip and timeout guards, the methods take a timeout argument, which can be either Timeout.Immediate for skip or any positive TimeSpan value for timeout. For ease of use, the default timeout is set to Timeout.Infinite, causing the operations to wait forever.
    Task Read(TwoPhaseCommit reader_commit)
    {
        var reader_task = new Task();
        while (!writer_queue.Empty()) {
            var writer_task, writer_commit = writer_queue.PeekHead();
            if (writer_commit.Offer()) {
                // Writer agreed, check reader
                if (reader_commit.Offer()) {
                    // Agreement, commit to communication
                    reader_commit.Commit();
                    writer_commit.Commit();
                    // Remove pending writer
                    writer_queue.Dequeue();
                    // Exchange value
                    reader_task.SetValue(writer_task.GetValue());
                    // Schedule callbacks
                    reader_task.SignalReady();
                    writer_task.SignalReady();
                    // Communication complete
                    return reader_task;
                } else {
                    // Reader declined, notify writer
                    writer_commit.Withdraw();
                    // Reader declined, so we stop trying
                    return reader_task;
                }
            } else {
                // Writer is no longer available,
                // so we remove it
                writer_queue.Dequeue();
            }
        }
        // No matching writer, suspend reader
        // Note that this can cause "littering"
        reader_queue.Enqueue(reader_task, reader_commit);
        return reader_task;
    }
Listing 16: Pseudo code for a channel using two-phase logic
Inside the ReadFromAny and WriteToAny methods, the selection is performed by creating a two-phase-commit object and, in turn, passing this to each channel. This will register a pending read or write on all the channels in the list, which is the intent or voting phase of the two-phase commit protocol [20]. Once a channel has a matching operation, it will invoke the Offer method on the two-phase-commit object passed by both the read and the write end. If one or both sides decline the offer (i.e. they have already communicated elsewhere), the declined requests are removed from the channel, and the Withdraw method is invoked on any side that accepted. If both sides agree to take the offer, the Commit method is called and the channel enqueues both the reader and the writer callback methods.
The implementation of the TwoPhaseCommit object is thus very simple: a call to Offer returns false if a communication has already completed, otherwise a lock is acquired and true is returned. The Withdraw method releases the lock, and the Commit method marks the instance completed and releases the lock, thus allowing only a single communication to succeed. A simplified version of the code in a two-phase-commit object is shown in Listing 15.
The simplified Read function shown in Listing 16 illustrates that each channel end relies solely on the two-phase-commit object for coordination, and thus each channel works the same way, regardless of how many channels the read and write are registered with.
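A runnable sketch of the ticket logic from Listing 15 shows how a single shared object lets only one of several competing channels complete. The names SingleOffer, TwoPhaseDemo and TryCommunicate are hypothetical and not CoCoL's actual classes; the sketch runs on one thread, since Monitor requires that the thread that acquired the lock also releases it.

```csharp
using System;
using System.Threading;

// Illustrative sketch following Listing 15; not the CoCoL class.
class SingleOffer
{
    private readonly object m_lock = new object();
    private bool m_taken = false;

    public bool Offer()
    {
        Monitor.Enter(m_lock);
        if (!m_taken)
            return true;          // the caller now holds the lock
        Monitor.Exit(m_lock);
        return false;
    }

    public void Commit()
    {
        m_taken = true;           // no later offer can succeed
        Monitor.Exit(m_lock);
    }

    public void Withdraw()
    {
        Monitor.Exit(m_lock);     // leave the offer open for another channel
    }
}

static class TwoPhaseDemo
{
    // Mimics one channel trying to complete a communication for this offer.
    public static bool TryCommunicate(SingleOffer offer)
    {
        if (!offer.Offer())
            return false;
        offer.Commit();
        return true;
    }

    static void Main()
    {
        var offer = new SingleOffer();   // shared by all channels in the choice
        Console.WriteLine(TryCommunicate(offer));   // prints True
        Console.WriteLine(TryCommunicate(offer));   // prints False
    }
}
```

The second attempt is declined because the first one committed, which is exactly the property an external choice over several channels needs.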
This approach scales to a large number of channels, because the channels themselves do
not participate in the communication but rely on the TwoPhaseCommit object to handle the
choice. This also allows custom versions of external choice, should it be desired. A drawback
to this method is that there could potentially be many unused communication offers in the
channels, which will grow if the channel is repeatedly being used as part of a set, but never
succeeds. This can be fixed by probing items in the queues inside the channels if the queues
exceed a threshold.
2.6. Alternation
As described above, the two-phase commit implementation in Listing 15 allows a single communication to continue. This makes it very simple to perform an external choice on any number of channels, by simply handing the same two-phase-commit instance to each channel.
The CSP priority alternation scheme can be implemented simply by registering the operation on each channel in the desired order. The first channel that is able to complete the communication will trigger the two-phase-commit instance as illustrated in Listing 16. The random alternation scheme can be implemented in much the same way, by shuffling the channel list, and then using the priority alternation method.
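The shuffling step for random alternation can be sketched with a standard Fisher-Yates shuffle. This is an assumption about one reasonable way to shuffle, not a description of CoCoL's code; the class name RandomAlternation is hypothetical, and strings stand in for channel instances.

```csharp
using System;
using System.Collections.Generic;

static class RandomAlternation
{
    // Fisher-Yates shuffle: every ordering of the list is equally likely.
    public static void Shuffle<T>(IList<T> items, Random rng)
    {
        for (var i = items.Count - 1; i > 0; i--)
        {
            var j = rng.Next(i + 1);   // 0 <= j <= i
            var tmp = items[i];
            items[i] = items[j];
            items[j] = tmp;
        }
    }

    static void Main()
    {
        // Strings stand in for channel instances in this sketch.
        var channels = new List<string> { "a", "b", "c", "d" };
        Shuffle(channels, new Random());
        // The shuffled list would now be handed to the priority alternation.
        Console.WriteLine(string.Join(" ", channels));
    }
}
```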
The fair alternation scheme requires that each channel receive an equal amount of com-
munication. Another way of expressing this is to say that the channel priority is ordered, so
that the least communicating channel has the highest priority. This way of expressing the
rules for fair alternation transforms the problem into a question of sorting the channels.
A simple approach to sorting the channels is to keep a counter for each channel, and
then simply sort the list of channels, using the counter values in increasing order. While this
works, it is not very efficient because changes in the usage counters are simple increments.
For an efficient solution, we have identified two different usage scenarios, shown in
Figure 2 and Figure 1. In the first scenario (Figure 2), some channels communicate very often
and others less frequently; in the second (Figure 1), all channels communicate an equal amount.
From Figure 2 we can see that this sorted list can become an “almost-sorted list”, if the
first channel with usage count 66 is used again. If any of the other channels are used, the
list remains sorted. For such a scenario, the bubble sort algorithm is very efficient, as it can
terminate early, minimizing the number of swaps and compares.
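The early-terminating repair can be sketched in a few lines of Python. The function name record_use is hypothetical, and the sketch assumes the counts are kept in increasing order with the channel list stored in a parallel array.

```python
def record_use(counts, channels, i):
    """Increment counts[i] and restore increasing order with a single bubble pass.

    Returns the number of swaps that were needed.
    """
    counts[i] += 1
    swaps = 0
    # Only element i changed, and only by +1, so it can only need to move right.
    while i + 1 < len(counts) and counts[i] > counts[i + 1]:
        counts[i], counts[i + 1] = counts[i + 1], counts[i]
        channels[i], channels[i + 1] = channels[i + 1], channels[i]
        i += 1
        swaps += 1
    return swaps
```

With the counts from Figure 2, using the first channel with count 66 costs one swap, while using any other channel costs none.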
In the scenario shown in Figure 1, a bubble sort could also be fairly efficient. However,
we consider this scenario to be the most likely, and thus we have an optimization for this,
which is to keep an index to the first element with the lowest usage count. If the channel
being used is to the left of this index, we swap with the element at the index, and increment
the index. If the channel being used is to the right, we apply bubble sort. Once the index
becomes -1, we re-scan the list to find the new lowest index. This optimization means that,
in a case where the communication always succeeds (i.e. there is always a waiting process),
we do not need to bubble the first element all the way to the end of the list.
CPA 2015 preprint the proceedings version will have other page numbers and may have minor differences.
K. Skovhede and B. Vinter / CoCoL: Concurrent Communications Library 11
This approach makes the actual sorting as limited as possible, with only a single counter
as extra overhead. In the implementation, the counter array and channel arrays are kept sep-
arate, with swaps performed on both, such that the channel list can be used without any
copying.
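One possible reading of the index optimization is sketched below in Python. The orientation is an assumption chosen to keep the sketch self-consistent: counts are stored in increasing order, and the marker (here last_min, a hypothetical name) points at the last entry holding the lowest count and moves towards the front. The library may lay out the list differently.

```python
class FairOrder:
    """Keeps channels ordered by usage count, least used first (illustrative only)."""

    def __init__(self, channels):
        self.channels = list(channels)
        self.counts = [0] * len(self.channels)   # kept separate from the channels
        self.last_min = len(self.channels) - 1   # last index holding the lowest count

    def _swap(self, a, b):
        # Swaps are applied to both arrays so the channel list is usable as-is.
        self.counts[a], self.counts[b] = self.counts[b], self.counts[a]
        self.channels[a], self.channels[b] = self.channels[b], self.channels[a]

    def _rescan(self):
        lowest = self.counts[0]
        i = 0
        while i + 1 < len(self.counts) and self.counts[i + 1] == lowest:
            i += 1
        self.last_min = i

    def used(self, i):
        """Record one communication on the channel currently at position i."""
        self.counts[i] += 1
        if i <= self.last_min:
            # Balanced case: one swap moves the channel behind the lowest block.
            self._swap(i, self.last_min)
            self.last_min -= 1
            if self.last_min < 0:
                self._rescan()
        else:
            # Unbalanced case: bubble right until the order is restored.
            while i + 1 < len(self.counts) and self.counts[i] > self.counts[i + 1]:
                self._swap(i, i + 1)
                i += 1
```

In the balanced case every update costs a single swap, regardless of how many channels share the lowest count.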
Usage counts in sort order: 0 0 0 0 0 1 1 1 1 1
(the figure marks the first element with the lowest usage count)
Figure 1. The fair alternation usage list under balanced load.
Usage counts in sort order: 0 4 6 12 32 42 66 66 85 92
(the figure marks the first element with the lowest usage count)
Figure 2. The fair alternation usage list under un-balanced load.
3. Results
To evaluate the performance of CoCoL we have chosen to compare with two existing CSP
libraries: C++CSP [21], and JCSP [4].
These libraries all include examples of the common CSP benchmarks: CommsTime and
Stressed ALT. This makes the performance results here somewhat comparable to those re-
ported in 2003 [21].
The benchmarks are used mostly unmodified from the source, with minor modifications
to even out the differences such as number of iterations, problem size, etc.
To further expand on these results, and to attempt to produce comparable cross-OS
results, we have executed all benchmarks on the same hardware: an Apple MacBook Pro
with a 2.8 GHz i7 processor and 16 GB of 1600 MHz DDR3 RAM, running OSX 10.10.3.
To produce results from other operating systems, we have used the Parallels Desktop
10.2.0 software to create virtual machines, and installed 64 bit Windows 8.1, and Ubuntu
14.04.2 guest operating systems. While there is certainly an overhead associated with run-
ning another operating system inside a virtual machine, we consider the results to be a good
indicator of the relative performance. We base this on the fact that the benchmarks are highly
CPU intensive and the Parallels software does not emulate the CPU, but presents it to the
guest OS.
The benchmarks running on the CLR use Mono [8], the most popular open source
implementation, which is available on all tested operating systems. For all operating
systems we have used Mono version 4.0.1 and, additionally for Windows, the Microsoft .Net
Runtime version 4.5.50709.
For Java and JCSP, we use the current Oracle JRE version 1.8.0_45 throughout the tests
on all three operating systems.
3.1. CommsTime
The CommsTime benchmark is a classic micro-benchmark whose purpose is to measure the
communication overhead introduced by a channel communication. While the benchmark is
somewhat simplistic, it does give a measure of how much overhead each communication adds.
A schematic representation of the CommsTime network is shown in Figure 3.
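For readers unfamiliar with the benchmark, the network can be sketched with Python's asyncio. This is an illustration only and none of it is CoCoL code: a one-slot asyncio.Queue stands in for an unbuffered channel (it is a buffer, not a true rendezvous), the function names are hypothetical, and the timings it produces are not comparable with the results reported here.

```python
import asyncio
import time


async def prefix(first, inp, out):
    # Emit the initial value, then forward everything that comes back around.
    await out.put(first)
    while True:
        await out.put(await inp.get())


async def delta(inp, out_a, out_b):
    # Copy every input value to both outputs.
    while True:
        value = await inp.get()
        await out_a.put(value)
        await out_b.put(value)


async def successor(inp, out):
    # Add one to every value passing through.
    while True:
        await out.put(await inp.get() + 1)


async def commstime(rounds):
    a, b, c, d = (asyncio.Queue(maxsize=1) for _ in range(4))
    workers = [
        asyncio.create_task(prefix(0, c, a)),
        asyncio.create_task(delta(a, d, b)),
        asyncio.create_task(successor(b, c)),
    ]
    start = time.perf_counter()
    seen = [await d.get() for _ in range(rounds)]  # the consumer role
    elapsed = time.perf_counter() - start
    for worker in workers:
        worker.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    # One round moves a value over each of the four channels a, b, c and d.
    return seen, elapsed / (rounds * 4)


def run_commstime(rounds):
    return asyncio.run(commstime(rounds))
```

Calling run_commstime(1000) returns the values seen by the consumer together with the measured time per communication in seconds.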
[Figure 3 diagram labels: Prefix, Delta, Consumer, Successor]
Figure 3. The CommsTime network.
[Figure 4 diagram labels: Prefix +, Delta, Consumer, and multiple Identity processes]
Figure 4. CommsTime scalable network.
3.1.1. CommsTime With CIL
To evaluate the different approaches to implementing channel communication in the Common
Intermediate Language, we have implemented the CommsTime example in multiple configurations.
The Await version uses the await keyword when reading and writing, and thus gets an
automatic implementation of the finite state machine. The Building blocks version also uses
await statements, but uses pre-cooked processes for the Prefix, Delta and Successor
processes, similar to other CSP libraries. The Blocking version uses the asynchronous
communication channels, but blocks on each call, thus requiring a thread for each process.
The Minimal version is an experiment with an extremely simple un-buffered channel
based on traditional locking and events. The BlockingCollection version explores the
option of using the BlockingCollection data structure found in the .Net 4.5 libraries. The
BlockingCollection looks similar to a channel, in that it supports a buffered collection
that blocks readers and writers, and some set operations that are similar to the ReadFromAny
and WriteToAny methods. The results are shown in Figure 5 for running on all operating
systems with Mono, and with the Microsoft .Net runtime on Windows.
From the results we can see that the Mono runtime is generally between 5 and 10 times
slower than the Microsoft .Net runtime. As expected, the version written with await and the
version using pre-cooked processes perform almost identically. If the CoCoL channels are
fitted with locks to provide a blocking interface, we can see that the execution time is
approximately 5 times slower. If the CoCoL library were implemented with locks and events, it
would generally be slower, except for Mono on Windows and OSX. The BlockingCollection
is generally many times slower, except on OSX where it seems to use an OSX specific feature
to obtain very fast execution times.
[Figure 5: bar chart. Vertical axis: microseconds per communication, 0 to 45. Groups: Await, Building blocks, Blocking, Minimal, BlockingCollection. Series: Win / .Net, Win / Mono, OSX / Mono, Linux 32 / Mono, Linux 64 / Mono.]
Figure 5. Communication time with CommsTime for a variety of similar approaches to implementing a com-
munication channel in C#. Lower is better.
3.1.2. CommsTime Compared to Other Libraries
To evaluate how the CoCoL approach compares to existing libraries, we run the same
CommsTime experiment with both JCSP and C++CSP. For Linux, we also measure with OpenJDK
1.7.0_79, which is the default Java library on Ubuntu.
Due to outdated dependencies, we were only able to get C++CSP running on Linux.
The 64-bit version compiles, but produces deadlocks and segmentation faults, thus we only
include the 32-bit results. The combined results are shown in Figure 6.
Overall, the fastest implementation is C++CSP when using a single thread for all
processes, resulting in a cooperative threading model with very fast switching. When C++CSP
uses multiple threads for the processes, the context switches cause the C++CSP
implementations to be consistently slower than the Mono and JCSP versions. Interestingly, the
(correct) parallel version in JCSP is consistently faster than the sequential version. The Mono
and JCSP versions are generally comparable, with neither being consistently faster. The .Net
runtime is significantly faster than any of the Mono or Java based versions, being more than
twice as fast as the fastest of them.
3.1.3. CommsTime Scaling
To evaluate the relative overhead when scaling systems beyond the small CommsTime
example, we have implemented a variant of the CommsTime network where we introduce
forwarding processes to form a communication ring as shown in Figure 4. With this ring-setup
it becomes trivial to increase the number of processes, and thus experiment with the
scalability of the systems. In Figure 7 we show how the communication overhead increases
slowly when the number of processes and channels increases. The general tendency is that
there is little extra overhead from running more processes. When the number of processes
and channels is increased by 5 orders of magnitude, the communication time at most doubles.
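The ring set-up can be approximated with a short Python asyncio sketch. It is a simplification rather than the benchmark used here: the driver coroutine plays the combined role of the non-forwarding processes, one-slot queues stand in for channels, and all names are hypothetical.

```python
import asyncio
import time


async def identity(inp, out):
    # A forwarding process: pass every value on unchanged.
    while True:
        await out.put(await inp.get())


async def ring(n_identity, laps):
    """Send a token around a ring of n_identity forwarders, laps times."""
    channels = [asyncio.Queue(maxsize=1) for _ in range(n_identity + 1)]
    workers = [
        asyncio.create_task(identity(channels[i], channels[i + 1]))
        for i in range(n_identity)
    ]
    start = time.perf_counter()
    token = 0
    for _ in range(laps):
        await channels[0].put(token)
        token = await channels[-1].get() + 1
    elapsed = time.perf_counter() - start
    for worker in workers:
        worker.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    # Each lap carries the token over every one of the n_identity + 1 channels.
    return token, elapsed / (laps * (n_identity + 1))
```

Running asyncio.run(ring(n, laps)) for growing n gives a rough impression of how the per-communication time changes with the number of task-based processes.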
[Figure 6: bar chart. Vertical axis: microseconds per communication, 0 to 20. Groups: Win, OSX, Linux 32 bit, Linux 64 bit. Series: CoCoL / .Net, CoCoL / Mono, Java / Par, Java / Seq, OpenJDK / Par, OpenJDK / Seq, CPP / Par / single, CPP / Par / multi, CPP / Seq / single, CPP / Seq / multi.]
Figure 6. Communication time with CommsTime for three different libraries on different operating systems.
Lower is better.
Figure 7. Communication time when scaling the number of channels and processes in CommsTime. Lower is
better.
3.2. Stressed Alt
The Stressed Alt benchmark uses a number of shared channels, each with a number of
writers contending to write to it. At the receiving end of the channels is a single
stressed reader, performing a fair read from the channel set, ensuring that no channel
starves.
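The shape of the benchmark can be sketched in Python with asyncio. This is an approximation under stated assumptions: the reader polls the channels instead of performing a true external choice, a one-slot queue stands in for each shared channel, and the names are hypothetical.

```python
import asyncio


async def writer(channel, ident):
    # Contend for the channel forever; put() suspends while the slot is full.
    while True:
        await channel.put(ident)


async def stressed_alt(n_channels, n_writers, reads):
    """Run n_channels x n_writers writers against one fair reader."""
    channels = [asyncio.Queue(maxsize=1) for _ in range(n_channels)]
    writers = [
        asyncio.create_task(writer(channel, (c, w)))
        for c, channel in enumerate(channels)
        for w in range(n_writers)
    ]
    counts = [0] * n_channels
    done = 0
    while done < reads:
        ready = [i for i, channel in enumerate(channels) if not channel.empty()]
        if not ready:
            await asyncio.sleep(0)  # yield so that the writers can run
            continue
        # Fair read: among the ready channels, take the least used one.
        pick = min(ready, key=lambda i: counts[i])
        channels[pick].get_nowait()
        counts[pick] += 1
        done += 1
    for task in writers:
        task.cancel()
    await asyncio.gather(*writers, return_exceptions=True)
    return counts
```

The returned list holds the number of reads per channel, which stays balanced when the reader is fair.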
When the stressed alt benchmark runs with JCSP on OSX, there is an issue with scaling
to more channels. When increasing from a 10x100 problem size to a 20x100 problem size,
the communication time increases from 46 microseconds to 140 microseconds, roughly a
threefold increase for a doubling of the channel count. This appears to be caused by an
implementation detail in the JVM that makes the time increase exponentially. To make the
figure easier to read, this measurement has been excluded from Figure 8. Again, the Mono
runtime is approximately 5 times slower than the .Net version. The .Net version performs
almost as well as the JCSP version, which has consistently good performance across all
operating systems, except OSX where a scaling issue appears.
One thing to note is that the JCSP version cannot handle more than approximately 3000
processes on Linux. Even with 4GB system memory, an out-of-memory exception is thrown
when the system attempts to start the many processes. Others report that the limit is around
7000 processes [19], but do not specify machine details.
With the CoCoL library, it is possible to use a million processes on all platforms. With the
.Net runtime it takes around 4 hours to complete 1000 rounds of 1 million communications.
The test setup with 1 million processes using the Mono runtime has not been completed, as
each test would take approximately 20 hours to complete.
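These run times can be converted into a per-communication cost with a little arithmetic, shown here as a small Python helper with a hypothetical name. It assumes that a complete test consists of 1000 rounds of 1 million communications for both runtimes, which the text states only for the .Net runtime.

```python
def microseconds_per_communication(hours, rounds, communications_per_round):
    """Convert a total run time into the average cost of one communication."""
    total_microseconds = hours * 3600 * 1e6
    return total_microseconds / (rounds * communications_per_round)


# About 14.4 microseconds for the .Net runtime (4 hours in total).
dotnet = microseconds_per_communication(4, 1000, 1_000_000)
# About 72 microseconds for Mono, if a test really takes 20 hours.
mono = microseconds_per_communication(20, 1000, 1_000_000)
```

Under that assumption the ratio between the two is 5, in line with the factor observed for the smaller problem sizes.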
[Figure 8: bar chart. Vertical axis: microseconds per communication, 0 to 50. Groups: Win / .Net, Win / Mono, OSX / Mono, Ubuntu 32 / Mono, Ubuntu 64 / Mono, Win / JCSP, OSX / JCSP, Ubuntu 32 / JCSP, Ubuntu 64 / JCSP. Series: 10x10, 10x100, 20x100, 100x100, 1000x1000.]
Figure 8. Communication time for Stressed Alt with JCSP and CoCoL on different operating systems. Lower
is better.
3.3. Mandelbrot
To investigate a slightly more realistic system, where each process has a varying, non-zero
amount of work to do, we have implemented a renderer for a Mandelbrot fractal. The
implementation takes the image dimensions and iteration count as input and then forwards each
pixel to the workers. The workers forward each result pixel to a renderer, which assembles
the picture.
We have implemented the process network in two flavors: static and dynamic. In the
static setup, the processes are loaded initially, and wait for channel input. There are
32 workers, reading from a shared input channel, and writing to a shared output
channel. In the dynamic setup, a worker is created per pixel, and the values are passed into
the constructor of each worker. The process that spawns the workers also collects all results
through a shared channel.
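The difference between the two flavors can be sketched with Python's asyncio. The sketch is illustrative and makes several assumptions: unbounded queues stand in for the shared channels, the mapping from pixels to the complex plane is arbitrary, all names are hypothetical, and because asyncio runs on a single thread it shows only the structure, not the parallel speed.

```python
import asyncio


def escape_time(cx, cy, max_iter):
    """Number of iterations before the point (cx, cy) escapes, capped at max_iter."""
    zx = zy = 0.0
    for n in range(max_iter):
        if zx * zx + zy * zy > 4.0:
            return n
        zx, zy = zx * zx - zy * zy + cx, 2.0 * zx * zy + cy
    return max_iter


def pixel_to_point(x, y, width, height):
    # Map the image onto the region [-2, 1] x [-1, 1] (an arbitrary choice).
    return -2.0 + 3.0 * x / width, -1.0 + 2.0 * y / height


async def collect(results, count):
    image = {}
    for _ in range(count):
        x, y, n = await results.get()
        image[(x, y)] = n
    return image


async def static_render(width, height, max_iter, n_workers=32):
    """Static flavor: a fixed pool of workers shares one input and one output."""
    jobs, results = asyncio.Queue(), asyncio.Queue()

    async def worker():
        while True:
            x, y = await jobs.get()
            cx, cy = pixel_to_point(x, y, width, height)
            await results.put((x, y, escape_time(cx, cy, max_iter)))

    workers = [asyncio.create_task(worker()) for _ in range(n_workers)]
    for y in range(height):
        for x in range(width):
            jobs.put_nowait((x, y))
    image = await collect(results, width * height)
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return image


async def dynamic_render(width, height, max_iter):
    """Dynamic flavor: one short-lived worker per pixel, created with its values."""
    results = asyncio.Queue()

    async def worker(x, y):
        cx, cy = pixel_to_point(x, y, width, height)
        await results.put((x, y, escape_time(cx, cy, max_iter)))

    tasks = [
        asyncio.create_task(worker(x, y))
        for y in range(height)
        for x in range(width)
    ]
    image = await collect(results, width * height)
    await asyncio.gather(*tasks)
    return image
```

Both functions produce the same image; they differ only in how many processes exist and how long each one lives.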
To evaluate how well the system scales with a large number of communications, we have
varied the size of the output image. For the dynamic approach, this means that as many as 4
million short-lived processes are created.
[Figure 9: bar chart. Vertical axis: microseconds per pixel, 0 to 25. Groups: Win / .Net / Static, Win / Mono / Static, OSX / Mono / Static, Ubuntu 32 / Mono / Static, Ubuntu 64 / Mono / Static, Win / .Net / Dynamic, Win / Mono / Dynamic, OSX / Mono / Dynamic, Ubuntu 32 / Mono / Dynamic, Ubuntu 64 / Mono / Dynamic. Series: 100x100, 500x500, 1000x1000, 2000x2000.]
Figure 9. Time pr. pixel for a Mandelbrot renderer. Lower is better.
When running the benchmarks, the actual image drawing was disabled to reduce issues
with the graphics library, and the maximum number of iterations for each pixel was set to
100. The results for various platforms are shown in Figure 9, and show that even when the
size of the image grows, the time to compute each pixel is almost constant. For most results,
the 100x100 pixel setup is too small to allow running at full speed. The Mono runtime has
nearly identical performance across all operating systems.
In the static configuration, the .Net runtime is clearly fastest, with the slowest being
approximately 5 times slower.
The dynamic setup is clearly slower than the static, but only by about a factor of 2 for
the Mono runtime. The .Net runtime has some issues handling this particular setup, and ends
up 10 times slower than the static version. This is most likely an issue with the thread pool
not matching the increase/decrease in workload well enough.
4. Future Work
The primary focus of CoCoL has been to define a minimal API for a single machine. The
CSP system lends itself nicely to multiple machines, so this should be investigated.
With distributed processes it is required that data being passed on a channel is
serializable for the network transport. Fortunately, CIL has a rich set of capabilities for
serializing and deserializing objects, which should make such efforts possible. A
distributed external choice is relatively straightforward to implement with the
two-phase-commit approach, but requires an efficient distributed lock.
Many kinds of data, such as file handles and the channels themselves, cannot be serial-
ized. One approach to solving this is to use the CIL remoting, which is a kind of RPC call,
where a proxy object forwards calls and data to the process that contains the original entry.
Once the serialization issues are resolved, the fact that processes do not depend on a
stack makes them portable. This portability follows from the finite state machine
encapsulation, where all state and accessible variables are captured in an object instance.
By serializing the state object, it is possible to migrate the process without handling
potential pointers to the local memory space. This allows inactive processes to be suspended
and migrated to a different machine to provide workload balancing.
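The idea can be illustrated with a toy process written as an explicit state object in Python. This is a sketch of the general technique only, not a proposed design for CoCoL: the class and its suspend and resume methods are hypothetical, and JSON is used merely as a convenient serialization format.

```python
import json


class SummingProcess:
    """A process whose entire state lives in fields, with nothing on a stack."""

    def __init__(self, remaining, total=0):
        self.remaining = remaining  # how many more values the process will read
        self.total = total          # running sum of the values read so far

    def step(self, value):
        """Consume one value; return True while the process expects more input."""
        if self.remaining == 0:
            return False
        self.total += value
        self.remaining -= 1
        return self.remaining > 0

    def suspend(self):
        """Serialize the state so it can be shipped to another machine."""
        state = {"remaining": self.remaining, "total": self.total}
        return json.dumps(state).encode("utf-8")

    @classmethod
    def resume(cls, blob):
        """Rebuild the process from a serialized state on the receiving side."""
        return cls(**json.loads(blob.decode("utf-8")))
```

Because the suspended process is just a byte string, it can be stored or sent over a network connection and continued elsewhere from the exact point where it stopped.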
The external choice implementation lends itself to different forms of external choice.
One such implementation could be a multicast operation, where the writer will atomically
write a value to n channels in a set of m channels, or not write at all.
5. Conclusion
CoCoL is a new library for handling concurrent programs without the need for traditional
locking constructs. By using CSP ideas, it becomes possible to build programs using tra-
ditional CSP logic. By using a terminology that is unlike the traditional CSP wording and
allowing a mixed paradigm approach, it is the authors’ hope that the library will become
popular outside the CSP community.
As CoCoL uses the CIL runtime it works seamlessly across all major operating systems.
Combined with the very small open source codebase, this leads the authors to consider it a
candidate for teaching CSP-like concurrency.
The use of continuations and general language integration makes it possible to write
most common CSP examples in a single file. With the task-based parallelism it becomes
possible to run millions of processes with moderate memory requirements.
All source code, including the benchmarks, can be found on the project website [18].
Acknowledgements
This research was supported by grant number 131-2014-5 from Innovation Fund Denmark.
References
[1] C.A.R. Hoare. Communicating Sequential Processes. Prentice-Hall, London, 1985. ISBN: 0-131-53271-5.
[2] Peter H Welch and Fred RM Barnes. Communicating mobile processes: Introducing occam-pi. In 25 Years of CSP, volume 3525 of Lecture Notes in Computer Science, pages 175–210.
[3] Peter H Welch and Fred RM Barnes. A CSP model for mobile channels. In CPA, pages 17–33, 2008.
[4] Peter H Welch, Neil CC Brown, James Moores, Kevin Chalmers, and Bernhard HC Sputh. Integrating and extending JCSP. Communicating Process Architectures 2007, 65:349–370, 2007.
[5] ECMA. ECMA-335: Common Language Infrastructure (CLI). ECMA, Geneva (CH), 2005.
[6] Shyamal Suhana Chandra and Kailash Chandra. A comparison of Java and C#. Journal of Computing Sciences in Colleges, 20(3):238–254, 2005.
[7] Jeremy Singer. JVM versus CLR: a comparative study. In Proceedings of the 2nd International Conference on Principles and Practice of Programming in Java, pages 167–169. Computer Science Press, Inc., 2003.
[8] Mono Project. The Mono project. http://www.mono-project.com/. [Online; accessed June 2015].
[9] Anders Hejlsberg, Scott Wiltamuth, and Peter Golde. C# Language Specification. Addison-Wesley Longman Publishing Co., Inc., 2003.
[10] Gilad Bracha, Martin Odersky, David Stoutamire, and Philip Wadler. Making the future safe for the past: Adding genericity to the Java programming language. ACM SIGPLAN Notices, 33(10):183–200, 1998.
[11] D. Szafron, J. Schaeffer, and A. Edmonton. An experiment to measure the usability of parallel programming systems. Concurrency Practice and Experience, 8(2):147–166, 1996.
[12] L. Hochstein, J. Carver, F. Shull, S. Asgari, and V. Basili. Parallel programmer productivity: A case study of novice parallel programmers. In Supercomputing, 2005. Proceedings of the ACM/IEEE SC 2005 Conference, pages 35–35. IEEE, 2005.
[13] Semih Okur, David L Hartveld, Danny Dig, and Arie van Deursen. A study and toolkit for asynchronous programming in C#. In Proceedings of the 36th International Conference on Software Engineering, pages 1117–1127. ACM, 2014.
[14] Stefan Tilkov and Steve Vinoski. Node.js: Using JavaScript to build high-performance network programs. IEEE Internet Computing, 14(6):80–83, 2010.
[15] Adele Goldberg and David Robson. Smalltalk-80: The Language and its Implementation. Addison-Wesley Longman Publishing Co., Inc., 1983.
[16] C++11. std::future namespace in C++11. http://en.cppreference.com/w/cpp/thread/future. [Online; accessed June 2015].
[17] Oracle. java.util.concurrent.Future class. https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/Future.html. [Online; accessed June 2015].
[18] K. Skovhede. CoCoL source code. https://github.com/kenkendk/cocol. [Online; accessed June 2015].
[19] Jan B. Pedersen and Andreas Stefik. Towards millions of processes on the JVM. 2014.
[20] Butler Lampson and Howard Sturgis. Crash recovery in a distributed data storage system. Xerox Palo Alto Research Center, Palo Alto, California, 1979.
[21] Neil C. C. Brown. C++CSP2: A Many-to-Many Threading Model for Multicore Architectures. In Alistair A. McEwan, Wilson Ifill, and Peter H. Welch, editors, Communicating Process Architectures 2007, pages 183–205, July 2007.