Abstract

Threads are a convenient and modular abstraction for writing concurrent programs, but often fairly expensive. The standard alternative to threads, event-loop programming, allows much lighter units of concurrency, but leads to code that is difficult to write and even harder to understand. Continuation Passing C (CPC) is a translator that converts a program written in threaded style into a program written with events and native system threads, at the programmer's choice. Together with two undergraduate students, we taught ourselves how to program in CPC by writing Hekate, a massively concurrent network server designed to efficiently handle tens of thousands of simultaneously connected peers. In this paper, we describe a number of programming idioms that we learnt while writing Hekate; while some of these idioms are specific to CPC, many should be applicable to other programming systems with sufficiently cheap threads.
arXiv:1102.0951v1 [cs.PL] 4 Feb 2011
CPC: programming with a massive number of lightweight threads
Gabriel Kerneis
Université Paris Diderot
Paris, France
kerneis@pps.jussieu.fr
Juliusz Chroboczek
Université Paris Diderot
Paris, France
1 Introduction
Threads are a convenient and modular abstraction for writing concurrent programs. Unfortunately,
threads, as they are usually implemented, are fairly expensive, which often forces the programmer to
use a somewhat coarser concurrency structure than he would want to. The standard alternative to threads,
event-loop programming, allows much lighter units of concurrency; however, event-loop programming
splits the flow of control of a program into small pieces, which leads to code that is difficult to write and
even harder to understand [1, 8].
Continuation Passing C (CPC) [4, 6] is a translator that converts a program written in threaded style
into a program written with events and native system threads, at the programmer’s choice. Threads in
CPC, when compiled to events, are extremely cheap, roughly two orders of magnitude cheaper than in
traditional programming systems; this encourages a somewhat unusual programming style.
Together with two undergraduate students [2], we taught ourselves how to program in CPC by writing
Hekate, a BitTorrent seeder, a massively concurrent network server designed to efficiently handle tens
of thousands of simultaneously connected peers. In this paper, we describe a number of programming
idioms that we learnt while writing Hekate; while some of these idioms are specific to CPC, many should
be applicable to other programming systems with sufficiently cheap threads.
The CPC translation process itself is described in detail elsewhere [6].
2 Cooperative CPC threads
The extremely lightweight, cooperative threads of CPC lead to a “threads are everywhere” feeling that
encourages a somewhat unusual programming style.
Lightweight threads Contrary to the common model of using one thread per client, Hekate spawns at
least three threads for every connecting peer: a reader, a writer, and a timeout thread. Spawning several
CPC threads per client is not an issue, especially when only a few of them are active at any time, because
idle CPC threads carry virtually no overhead.
The first thread reads incoming requests and manages the state of the client. The BitTorrent protocol
defines two states for interested peers: “unchoked,” i.e. currently served, and “choked.” Hekate maintains
90% of its peers in choked state, and unchokes them in a round-robin fashion.
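As an illustration, a round-robin unchoking pass over an array of peers might look like the following plain-C sketch; the peer structure, the MAX_UNCHOKED limit, and the rotate_unchoked function are hypothetical, not Hekate's actual code:

```c
#include <stddef.h>

#define MAX_UNCHOKED 4   /* hypothetical limit on simultaneously served peers */

struct peer {
    int choked;          /* 1 = choked, 0 = currently served */
};

/* Choke everyone, then unchoke the next MAX_UNCHOKED peers in circular
   order starting at *next; advance *next for the following pass. */
void rotate_unchoked(struct peer *peers, size_t n, size_t *next)
{
    for (size_t i = 0; i < n; i++)
        peers[i].choked = 1;
    for (size_t k = 0; k < MAX_UNCHOKED && k < n; k++)
        peers[(*next + k) % n].choked = 0;
    *next = (*next + MAX_UNCHOKED) % n;
}
```

Each call serves the next window of peers, so over successive passes every interested peer is eventually unchoked.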
The second thread is in charge of actually sending the chunks of data requested by the peer. It usually
sleeps on a condition variable, and is woken up by the first thread when needed. Because these threads
are scheduled cooperatively, the list of pending chunks is manipulated by the two threads without need
for a lock.
Each read on a network interface is guarded by a timeout, and a peer that has not been involved in
any activity for a period of time is disconnected. Earlier versions of Hekate which did not include this
protection would end up clogged by idle peers, which prevented new peers from connecting.
In order to simplify the protocol-related code, timeouts are implemented in the buffered read function,
which spawns a new timeout thread on each invocation. This temporary third thread sleeps for the
duration of the timeout, and aborts the I/O if it is still pending. Because most timeouts do not expire, this
solution relies on the efficiency of spawning and context-switching short-lived CPC threads [4, 6].

cps void
listening(hashtable *table) {
    /* ... */
    while(1) {
        cpc_io_wait(socket_fd, CPC_IO_IN);
        client_fd = accept(socket_fd, ...);
        cpc_spawn client(table, client_fd);
    }
}

Figure 1: Accepting connections and spawning threads
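The effect of such a guarded read can be approximated in plain C (outside CPC) by bounding the read with poll(); this is a sketch of the idea, not Hekate's implementation:

```c
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Read at most n bytes from fd, giving up after timeout_ms milliseconds.
   Returns the byte count, or -1 with errno = ETIMEDOUT on expiry. */
ssize_t read_with_timeout(int fd, void *buf, size_t n, int timeout_ms)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    int r = poll(&p, 1, timeout_ms);
    if (r == 0) {                /* timeout expired: caller disconnects the peer */
        errno = ETIMEDOUT;
        return -1;
    }
    if (r < 0)
        return -1;               /* poll itself failed */
    return read(fd, buf, n);
}
```

In CPC the same guard is expressed with a separate timeout thread, which keeps the protocol code free of timeout plumbing.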
Native and cps functions CPC threads might execute two kinds of code: native functions and cps
functions (annotated with the cps keyword). Intuitively, cps functions are interruptible and native functions
are not. From a more technical point of view, cps functions are compiled by performing a transformation
to Continuation Passing Style (CPS), while native functions execute on the native stack [6].
There is a global constraint on the call graph of a CPC program: a cps function may only be called
by a cps function; equivalently, a native function can only call native functions — but a cps function can
call a native function. This means that at any point in time, the dynamic chain consists of a “cps stack”
of cooperating functions followed by a “native stack” of regular C functions. Since context switches are
forbidden in native functions, only the former needs to be saved and restored when a thread cooperates.
Figure 1 shows an example of a cps function: listening calls the primitive cpc_io_wait to wait
for the file descriptor socket_fd to be ready, before accepting incoming connections with the native
function accept and spawning a new thread for each of them.
3 Comparison with event-driven programming
Code readability Hekate’s code is much more readable than its event-driven equivalents. Consider for
instance the BitTorrent handshake, a message exchange occurring just after a connection is established.
In Transmission1, a popular and efficient BitTorrent client written in (mostly) event-driven style, the
handshake is a complex piece of code, spanning over a thousand lines in a dedicated file. By contrast,
Hekate’s handshake is a single function of less than fifty lines including error handling.
While some of Transmission’s complexity is explained by its support for encrypted connections,
Transmission’s code is intrinsically much messier due to the use of callbacks and a state machine
to keep track of the progress of the handshake. This results in an obfuscated flow of control, scattered
across a dozen functions (excluding encryption-related functions), typical of event-driven code [1].
Expressivity Surprisingly enough, CPC threads turn out to be more expressive than native threads, and
allow some idioms that are more typical of event-driven style.
A case in point: buffer allocation for reading data from the network. When a native thread performs a
blocking read, it needs to allocate the buffer before the read system call; when many threads are blocked
waiting for a read, these buffers add up to a significant amount of storage. In an event-driven program,
1 http://www.transmissionbt.com
it is possible to delay allocating the buffer until after an event indicating that data is available has been
received.
The same technique is not only possible, but actually natural in CPC: buffers in Hekate are only
allocated after cpc io wait has successfully returned. This provides the reduced storage requirements
of an event-driven program while retaining the linear flow of control of threads.
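In plain C terms, the idiom amounts to waiting for readability before allocating. The following sketch (our own illustration, assuming a single blocking descriptor; not Hekate's code) makes the ordering explicit:

```c
#include <poll.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Wait until fd is readable, only then allocate a buffer and read into it.
   On success, *out owns a malloc'd buffer and the byte count is returned. */
ssize_t read_when_ready(int fd, size_t size, char **out)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    if (poll(&p, 1, -1) <= 0)    /* block until data is available */
        return -1;
    char *buf = malloc(size);    /* buffer allocated only now */
    if (buf == NULL)
        return -1;
    ssize_t n = read(fd, buf, size);
    if (n < 0) {
        free(buf);
        return -1;
    }
    *out = buf;
    return n;
}
```

With thousands of idle connections, deferring the malloc until after the wait is what keeps per-connection memory low.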
4 Detached threads
While cooperative, deterministically scheduled threads are less error-prone and easier to reason about
than preemptive threads, there are circumstances in which native operating system threads are necessary.
In traditional systems, this implies either converting the whole program to use native threads, or manually
managing both kinds of threads.
A CPC thread can switch from cooperative to preemptive mode at any time by using the cpc_attach
primitive (inspired by FairThreads’ ft_thread_link [3]). A cooperative thread is said to be attached to
the default scheduler, while a preemptive one is detached.
The cpc_attach primitive takes a single argument, a scheduler, either the default event loop (for
cooperative scheduling) or a thread pool (for preemptive scheduling). It returns the previous scheduler,
which makes it possible to eventually restore the thread to its original state. Syntactic sugar is provided
to execute a block of code in attached or detached mode (cpc_attached, cpc_detached).
Hekate is written in mostly non-blocking cooperative style; hence, Hekate’s threads remain attached
most of the time. There are a few situations, however, where the ability to detach a thread is needed.
Blocking OS interfaces Some operating system interfaces, like the getaddrinfo DNS resolver
interface, may block for a long time (up to several seconds). Although there exist several libraries which
implement equivalent functionality in a non-blocking manner, in CPC we simply enclose the call to the
blocking interface in a cpc_detached block (see Figure 2a).
Figure 2b shows how cpc_detached is expanded by the compiler into two calls to cpc_attach.
Note that CPC takes care to attach the thread before returning to the caller function, even though the
return statement is inside the cpc_detached block.
(a)
cpc_detached {
    rc = getaddrinfo(name, ...);
    return rc;
}

(b)
cpc_scheduler *s = cpc_attach(cpc_default_threadpool);
rc = getaddrinfo(name, ...);
cpc_attach(s);
return rc;

Figure 2: Expansion of cpc_detached in terms of cpc_attach
Blocking library interfaces Hekate uses the curl library2 to contact BitTorrent trackers over HTTP.
Curl offers both a simple, blocking interface and a complex, non-blocking one. We decided to use the one
interface that we actually understand, and therefore call the blocking interface from a detached thread.
Parallelism Detached threads make it possible to run on multiple processors or processor cores. Hekate
does not use this feature, but a CPU-bound program would detach computationally intensive tasks and
let the kernel schedule them on several processing units.
2 http://curl.haxx.se/libcurl/
prefetch(source, length); /* (1) */
cpc_yield(); /* (2) */
if(!incore(source, length)) { /* (3) */
cpc_yield(); /* (4) */
if(!incore(source, length)) { /* (5) */
cpc_detached { /* (6) */
rc = cpc_write(fd, source, length);
}
goto done;
}
}
rc = cpc_write(fd, source, length); /* (7) */
done:
...
The functions prefetch and incore are thin wrappers around the posix_madvise and mincore system calls.
Figure 3: An example of hybrid programming (non-blocking read)
5 Hybrid programming
Most realistic event-driven programs are actually hybrid programs [7, 9]: they consist of a large event
loop, and a number of threads (this is the case, by the way, of the Transmission BitTorrent client
mentioned above). Such blending of native threads with event-driven code is made very easy by CPC, where
switching from one style to the other is a simple matter of using the cpc_attach primitive.
This ability is used in Hekate for dealing with disk reads. Reading from disk might block if the
data is not in cache; however, if the data is already in cache, it would be wasteful to pay the cost of
a detached thread. This is a significant concern for a BitTorrent seeder because the protocol allows
requesting chunks in random order, making kernel readahead heuristics useless.
The actual code is shown in Figure 3: it sends a chunk of data from a memory-mapped disk file
over a network socket. In this code, we first trigger an asynchronous read of the on-disk data (1), and
immediately yield to threads servicing other clients (2) in order to give the kernel a chance to perform the
read. When we are scheduled again, we check whether the read has completed (3); if it has, we perform
a non-blocking write (7); if it hasn’t, we yield one more time (4) and, if that fails again (5), delegate the
work to a native thread which can block (6).
Note that this code contains a race condition: the prefetched block of data could have been swapped
out before the call to cpc_write, which would stall Hekate until the write completes. However, our
measurements show that the write never lasted more than 10 ms, which clearly indicates that the race
does not happen. Note further that the call to cpc_write in the cpc_detached block (6) could be
replaced by a call to write: we are in a native thread here, so the non-blocking wrapper is not needed.
However, the CPC runtime is smart enough to detect this case, and cpc_write simply behaves like write
when invoked in detached mode; for simplicity, we chose to use the CPC wrappers throughout our code.
6 Experimental results
Benchmarking a BitTorrent seeder is a difficult task because it relies either on a real-world load, which is
hard to control and only provides seeder-side information, or on an artificial testbed, which might fail to
accurately reproduce real-world behaviour. Our experience with Hekate in both kinds of setup shows that
CPC generates efficient code, lightweight enough to run Hekate on embedded hardware. This confirms
our earlier results [5], where we measured the performance of toy web servers.
Real-world workload To benchmark the ability of Hekate to sustain a real-world load, we need pop-
ular torrents with many requesting peers over a long period of time. Updates for Blizzard’s game World
of Warcraft (WoW), distributed over BitTorrent, meet those conditions: each of the millions of WoW
players around the world runs a hidden BitTorrent client, and at any time many of them are looking for
the latest update.
We have run an instance of Hekate seeding WoW updates without interruption for weeks. We saw up
to 1,000 connected peers (800 on average) and a throughput of up to 10 MB/s (around 5 MB/s on average).
Hekate never used more than 10% of the 3.16 GHz dual-core CPU of our benchmarking machine.
Stress-test on embedded hardware We have ported Hekate to OpenWrt3, a Linux distribution for
embedded devices. Hekate runs flawlessly on a MIPS-based router with a 266 MHz CPU, 32 MB of
RAM and a 100 Mbps network card. The torrent files were kept on a USB key.
Because Hekate maps every file it serves in memory, and the MIPS routers running OpenWrt are 32-bit
machines, we are restricted to no more than 2 GB of content. Our stress-test consists of 1,000 clients
requesting random chunks of a 1.2 GB torrent from a computer directly connected to the device. Hekate
sustained a throughput of 2.9 MB/s. The CPU was saturated, mostly with software interrupt requests
(60% sirq, with the usb-storage kernel module using up to 25% of the CPU).
7 Conclusions
Hekate has shown that CPC is a tool that is able to produce efficient network servers, even when used by
people who do not fully understand its internals and are not specialists in network programming. While
writing Hekate, we had a lot of fun exploring the somewhat unusual programming style that CPC’s
lightweight, hybrid threads encourage.
We have no doubt that CPC, possibly with some improvements, will turn out to be applicable to a
wider range of applications than just network servers, and are looking forward to experimenting with
CPU-bound distributed programs.
References
[1] A. Adya, J. Howell, M. Theimer, W. J. Bolosky, and J. R. Douceur. Cooperative task management without
manual stack management. In Proceedings of the 2002 USENIX Annual Technical Conference, 2002.
[2] P. Attar and Y. Canal. Réalisation d’un seeder bittorrent en CPC, June 2009. Rapport de stage.
[3] F. Boussinot. FairThreads: mixing cooperative and preemptive threads in C. Concurrency and Computation:
Practice and Experience, 18(5):445–469, 2006.
[4] J. Chroboczek. Continuation-passing for C: a space-efficient implementation of concurrency. Technical report,
PPS, Université Paris 7, 2005.
[5] G. Kerneis and J. Chroboczek. Are events fast? Technical report, PPS, Université Paris 7, 2009.
[6] G. Kerneis and J. Chroboczek. Continuation-Passing C, compiling threads to events through continuations.
Submitted for publication, 2010.
[7] V. S. Pai, P. Druschel, and W. Zwaenepoel. Flash: an efficient and portable web server. In Proceedings of the
1999 USENIX Annual Technical Conference, 1999.
[8] R. von Behren, J. Condit, and E. Brewer. Why events are a bad idea (for high-concurrency servers). In
Proceedings of the 9th conference on Hot Topics in Operating Systems, 2003.
[9] M. Welsh, D. Culler, and E. Brewer. SEDA: an architecture for well-conditioned, scalable internet services.
SIGOPS Oper. Syst. Rev., 35(5):230–243, 2001.
3 http://openwrt.org