
Developing an Akka Edge

An Introduction to Actor-based
Concurrency with Scala
Thomas Lockney and Raymond Tay
BLEEDINGedgePRESS
Table of Contents

1. Introduction
2. Working with Actors
3. Running with Akka
4. Handling Faults and Actor Hierarchy
5. Routers
6. Dispatchers
7. Remoting
8. Diving Deeper into Futures
9. Testing
10. Clusters
Appendix A: Further Reading
Appendix B: Schedulers
Preface
Who is this book for?
This book was written to act as a tour guide for newcomers to Akka. We're assuming that you have perhaps heard of
Akka, maybe read an article or blog post or two about it, but you don't really have your bearings with it. Our goal is
that, by the time you've finished reading this, you're able to navigate your way around Akka without feeling like
you've plunged in at the deep end.
We're writing this book focusing solely on Akka's Scala API. If you are coming from a background in an object-
oriented language such as Java, C#, or Ruby, you should be able to work through a couple of short tutorials on
Scala and gain enough understanding to follow along. None of the features we'll be describing require
understanding of the deeper details of Scala, so a basic reading-level familiarity should be sufficient.
If you are looking for good places to start, we've provided a handful of resources in the Appendix. Similarly, we've
given a list of books, articles and blog posts that dive more deeply into the topics we discuss so that you can, when
you're ready, find the material that will help you move towards an expert-level understanding of this material. But
keep in mind that the only real expertise comes from actually working with the tools you are given. Plan to spend
time and effort and you will gain mastery.
Why did we write this?
The idea to write this book came about when we were trying to think of a way to help manage a problem we've seen
since first becoming part of the Akka community. If you lurk on the Akka IRC channel (#akka on Freenode), it's not at
all uncommon to have people drop in and ask some pretty basic questions about getting started with Akka. You
might even say it's a common occurrence. Similarly, a lot of the questions that appear on the mailing list turn out to be
rather simple issues based on basic misunderstandings about how Akka works.
Many of the questions we hear are about integrating Akka into existing projects, figuring out what actor-thinking is all
about (though newcomers don't usually phrase it that way), or trying to understand some fundamental concept that
perhaps got missed in the requestor's perusal of the documentation provided by the Akka team.
A brief aside about the Akka docs
To be clear, the Akka docs are simply fantastic. If anything, they set a standard that we hope more projects aim for as
they consider how to compile their documentation. But they present a couple of problems for the newcomer. The
documentation is largely organized as a reference. There is a getting started section, but it's fairly brief and doesn't
walk through the process of putting together an Akka application beyond a pretty basic level. The documentation is
essentially feature-oriented, but if you don't know what feature it is that you need, you might be lost unless you read
the whole thing and spend a lot of time experimenting. If you're reading a map and you don't know what the symbols
mean, you're going to get lost; it's as simple as that.
Why should you read this?
If you're dealing with problems that inherently involve concurrency and running into trouble, you may have run across references
to Akka on some blog post, a mailing list, or maybe at some user group meeting. But, honestly, unless you've encountered the actor model before, perhaps by working with some other actor-based system (e.g., Erlang, Kilim, Celluloid, or Actorom), you're likely to be confused when you first try to dig in and get your bearings.
The hardest part for a lot of people to understand is that Akka is a toolkit with a number of very powerful tools that
can scale up or down from very small tasks to very, very large ones. Understanding how to pick out the smallest bits
first and add on what you need, as you need it, is not always so obvious.
Our goal is to walk through some simple examples, building up a body of knowledge about the most important
features Akka provides. In some cases, we'll take these examples further and build on them where it makes sense
and in other cases we'll present short, self-contained examples to highlight particular functionality. Our hope is that,
in the end, you can take what you learn here and start building your own Akka-based system, leaning on the existing
documentation when you need more in-depth information.
Acknowledgements
Thomas Lockney
First, I want to thank both the Scala and Akka communities for the generosity and support they have given me over
the last few years. I have never encountered a community that was so full of intelligent, yet humble and helpful
people. I also could not have produced the work you are reading without the time and work of my reviewers: Jamie
Allen, Scott Clasen, Scott Parish, Dave Rostron, Tymon Tobolski, and, most importantly, Viktor Klang, who provided
extensive feedback and notes to improve what I have put together here. Please keep in mind that any remaining
errors are my own.
I should also thank Roland Kuhn, who has repeatedly been of great assistance on the mailing list and the Akka IRC
channel, and, of course, Jonas Bonér, who had the great vision to create this incredible toolkit I describe here, and
the rest of the Akka team for all their amazing work. Oh, and thanks to Brendan McAdams for being a constant
shoulder to lean on when I was feeling like I had perhaps dived in a bit too deep.
Finally, I would not be able to get through a single day, whether writing this book or otherwise, without my amazing,
inspirational wife, Nicole, and our fun-loving, loyal, and ever cuddly dogs, Maggie and Harvey. This work is also
dedicated to the memory of our wonderful canine friends who have since passed on: Ida, Alfie, Henry, and Oscar.
Raymond Tay
I would like to thank Thomas and BleedingEdge Press for inviting me to partake in this work, where I can share my love and passion for distributed computing using Akka. This love could not have reached its zenith without the support of the Akka community, and I would like to acknowledge the following people: Roland Kuhn, Jamie Allen, Josh Suereth, Martin Odersky, Eugene Burmako, and Derek Wyatt, who have provided me with ideas, inspiration, and motivation, whether they know it or not. I'm grateful. Lastly, I would like to give thanks to my loving wife of 11 years, who has given me support over the time it took to write the book. Finally, I would like to thank the reviewers of our book; I hope that this book will stand the scrutiny of time, and any errors in the book are solely mine.
Chapter 1. Introduction
As developers, we certainly face some significant challenges every day, but there is a big one that most of us will
have to address sooner or later. As the number of both CPU cores and systems that your code runs on increases, the
issue of concurrency rears its head and the approaches that have become common practice show their insufficiency
for addressing these issues.
The common technique for coordinating concurrent tasks takes the form of shared, mutable data structures and locking mechanisms that allow only a single caller in a flow of execution to change the shared structure at any one time. This is commonly known as coarse-grained synchronization. If you pick up the typical book on concurrency in Java, C#, or C++, this is the approach you will generally see described. It works well when concurrency levels are low, but as concurrency starts to climb the data structure becomes a bottleneck, since it effectively forces serialized access to the structure. There are also fine-grained and optimistic synchronization techniques, but applying them requires judicious use, a good understanding of the trade-offs, and a good deal of knowledge about the application's behavior: the developer needs to know whether the component under concurrency is mostly performing read operations, write operations, or a mix of both, and then choose a mechanism to suit, e.g., employing a variation of a readers-writer lock.
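To make that read/write distinction concrete, here is a minimal sketch of our own (not taken from the example code accompanying this book) of a shared structure guarded by a java.util.concurrent.locks.ReentrantReadWriteLock, which lets many readers proceed concurrently while a writer gets exclusive access:

import java.util.concurrent.locks.ReentrantReadWriteLock

// A shared map of page-view counts, guarded by a readers-writer lock.
// Many threads may read concurrently; only one thread may write at a time.
class PageViewCounts {
  private val lock = new ReentrantReadWriteLock()
  private var counts = Map.empty[String, Long]

  def get(page: String): Long = {
    lock.readLock().lock()
    try counts.getOrElse(page, 0L)
    finally lock.readLock().unlock()
  }

  def increment(page: String): Unit = {
    lock.writeLock().lock()
    try counts = counts.updated(page, counts.getOrElse(page, 0L) + 1)
    finally lock.writeLock().unlock()
  }
}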
However, these techniques prove difficult to scale. We can understand this through Amdahl's law (formulated by Gene Amdahl in 1967), whose key measure is speedup. The law tells us that to achieve speedup we can add parallel processors, or alternatively exploit parallelism by changing our algorithm (from sequential to parallel) or by changing our approach altogether; one of the goals of this book is to focus on the latter with Akka. In the classic scenario where the application runs single-threaded, locks are unnecessary since there is no contention to guard against. But when the application is brought to bear in a distributed computing environment, with machines hosting multiple CPU cores, improper deployment of locks can cripple the system; consider, for example, holding a lock across a distributed operation with long latency. Locks and shared memory in the form described above are simply not feasible across systems.
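For reference, Amdahl's law can be stated as follows, where P is the fraction of the work that can be parallelized and N is the number of processors (this is the standard formulation, not a formula reproduced from this book's figures):

    speedup(N) = 1 / ((1 - P) + P / N)

Even with an unbounded number of processors the speedup is capped at 1 / (1 - P), which is why the serial portion of a program, such as code serialized behind a lock, comes to dominate as we add cores.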
The Problems of Concurrency
Generally speaking, we can group approaches used for coordinating concurrent operations in two broad camps: one
that uses locks to manage access to shared memory-based data structures or other shared resources; and one that
builds on the idea of message passing. We'll walk through a brief description of each of these to make sure our
understanding is clear.
The locking approach built around multiple threads accessing shared, mutable data structures is perhaps the most
common technique for concurrency coordination. Let's consider the simple example of a system that has two cores
and a handful of IO devices attached (the important point being that there are more devices than cores). We'll make
an assumption that the combined data throughput of these devices is more than a single core can process, without
causing other necessary work to back up. Putting aside the concern that we generally shouldn't make assumptions
about the hardware (this is an important concern, though), one possible approach when encountering this issue is to
try and spread the load across the cores. But when we're dealing with I/O, we need to manage buffers to hold the
data being processed. How do we make those buffers safe for use by both cores? One solution would be locking our
data structures.
As the title of this section might suggest (and as pointed out earlier), there are problems with the lock-oriented
approach that quickly bring about difficulties. At the least, we'll have to figure out at what scope to place our locks.
Locking the whole structure is a bad idea. To be clear, this is due to the bottleneck around this structure anytime a
thread needs to interact with it. Unfortunately, making the locks too fine-grained is also not a clear winner - it results
in far too much resource consumption, since the system has to manage the locking and unlocking.
And then there's the issue of lock contention and deadlock or, often worse, livelock. Deadlock occurs when two threads are running, each trying to access some shared resources, and neither can progress because each is waiting for the other to release the resource it needs.
Figure 1: Deadlock
We can get a sense of the sequence of events in a deadlock situation by looking at the diagram in Figure 1. There
are two threads shown, each of which need to obtain a lock on two different resources. But in this case they are
attempting to get the locks in differing orders, so thread 1 acquires the lock on A, with thread 2 acquiring the
lock on B. Next, thread 1 tries to acquire the lock for object B, but it must wait because thread 2 currently holds it. If,
in the meantime, thread 2 then tries to get a lock on object A, we end up with a deadlock. Both threads will sit waiting on the other thread to release the lock on the object they need until they are forcibly destroyed. This assumes that
some supervisor or guardian thread or process is watching for these cases and is able to safely do this without
severe consequences such as corrupted data.
This may look overly simplistic, but when we are dealing with larger numbers of threads and many shared resources
that require locks for safe usage, deadlocks can happen more frequently than we might hope. There are common
techniques for reducing the chances of this happening -- for example, by attempting to enforce predefined ordering
for any lock acquisitions -- but they usually end up being cumbersome to manage and maintain.
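To make the sequence in Figure 1 concrete, here is a small sketch of our own (not from the example code) in which two threads acquire the same pair of locks in opposite orders; run it often enough and it will eventually hang, with each thread holding one lock and waiting forever for the other:

object DeadlockDemo extends App {
  val lockA = new Object
  val lockB = new Object

  // Each worker takes 'first', pauses long enough for the other worker to
  // take its own first lock, and then blocks trying to take 'second'.
  def worker(name: String, first: AnyRef, second: AnyRef) =
    new Thread(new Runnable {
      def run(): Unit = {
        first.synchronized {
          Thread.sleep(100)
          second.synchronized {
            println(s"$name got both locks")
          }
        }
      }
    })

  val thread1 = worker("thread 1", lockA, lockB) // locks A, then B
  val thread2 = worker("thread 2", lockB, lockA) // locks B, then A

  thread1.start(); thread2.start()
  thread1.join(); thread2.join() // never returns once the deadlock occurs
}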
The livelock variation of this pattern occurs when two (or more) threads are competing for resources, and each keeps reacting to the state of the lock held by the other without ever making progress. That is, imagine thread 1 has a lock on A, but also needs a lock on B. But the lock on B is currently held by thread 2, and, in the meantime, thread 2 needs the lock on A held by thread 1.
Now, with this scenario, we have a slightly different form of deadlock. But consider what happens if each of these
threads is running code with retry logic that continually retries the lock acquisition on the second lock and never
releases the original lock it has already acquired. This can be even worse than a deadlock, since rather than having
our code sitting idle, the threads involved are constantly spinning CPU cycles. This means our system just expels a
lot of waste heat. Not the sort of thing our finance department will be very happy about when the utility bill shows up.
What is the Actor Model?
The actor model was first introduced in a paper written in 1973 by Carl Hewitt along with Peter Bishop and Richard
Steiger. In very general terms, it describes a method of computation that involves entities that send messages to
each other. Conceptually, this might sound much like object-oriented programming, but there are important
differences. In the years since the publication of that first paper, the model has been further refined into a complete
model of computation. Understanding this model is important to understanding Akka, as it is the core upon which
Akka is built.
The primary entities of this model are, as the name implies, artifacts called actors. An actor is just a single entity
(we might think of this as an instance of a class, in OOP terminology), which interacts with other actors entirely via
message passing. Each actor has a mailbox, which stores any messages it has not yet processed. The actor is
responsible for defining what behavior will be executed when it receives a given message, but it can also, on receipt
of a message, choose to change that behavior for subsequent messages. Related closely to this is the idea that an
actor is solely responsible for its state and this state cannot be changed directly by anything external to the actor.
State changes can happen based upon messages that have been received, but the actor maintains that
responsibility.
It's also important to understand that each actor, while it's handling a single message and executing any behavior
defined upon receipt of that message, is working in a sequential manner. Further, it processes each message in the
order in which they arrive in its mailbox, one message at a time. At any given time, however, a number of different actors can be executing their behaviors. We might even have a number of instances of a single type of actor all working to process messages as they are received. Akka improves the adaptability of the actor model to various use cases by allowing the developer to specify which built-in mailbox type to use, or even to supply a custom mailbox type.
Another property of actors is that they are never referenced directly. Each actor is given an address that is used to
send messages to it. For an actor to send a message to another actor, it must know that other actor's address. Actors
never reference each other directly, but only through these addresses. Later on in this book, starting with the next
chapter, we'll see how Akka uses this trait in an interesting and useful way.
The final characteristic of actors is that they can create other actors. This may be obvious, but it's an important point
to understand. This gives you enormous power to create complex systems of actors (we'll discuss more about actor systems in a bit). In fact, it is quite common to have a hierarchy of actors where one top-level actor acts as the initial starting point for your system and is responsible for starting up the other actors needed to express the remaining functionality you need. We'll be covering this more in chapter four.
To provide an example of how we might represent something using this model, let's imagine we have an actor, we
can call it StoreManager, that monitors access to the door of your bookstore. When the store is closed, we want our
security alarm to be triggered if the door opens unexpectedly. We could model the door opened event using a
DoorOpened message. To trigger the alarm, the message is TriggerAlarm. When our actor receives the
DoorOpened message, its behavior will be to send TriggerAlarm (to some other, currently unspecified actor that
somehow interfaces to the physical alarm system).
Now, the key missing piece here is how to handle things during normal business hours. In this case, when the
StoreManager actor receives the DoorOpened message, we almost certainly don't want to trigger the alarm. In this
case, we would have a scheduled message sent (we'll call it OpeningTime) at the moment when the store is
supposed to open (we'll get into scheduling messages later -- it's not explicitly part of the actor model). When
StoreManager receives this message, it will switch behavior so that any DoorOpened message it receives will not
trigger the alarm. Perhaps we could even have it trigger some other action, such as ringing a chime to let us know
someone has entered, or incrementing a counter, so we can see how many unique visitors we had in a given day. At
closing time, we would again have a scheduled message, ClosingTime, sent to cause the actor to revert to the
previous behavior.
Applying the actor model (without involving Akka yet), there's one identifiable actor already: StoreManager. This actor waits for the DoorOpened message and, based on the constraints above, sends out either a TriggerAlarm or a RingChime message.
Figure 2: StoreManager states
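As a preview of how this might look in Akka (we'll cover actor definitions and behavior switching with context.become properly in the coming chapters), here is a rough sketch of our own; the alarm reference and message names are the hypothetical ones from the description above:

import akka.actor.{Actor, ActorRef}

case object DoorOpened
case object OpeningTime
case object ClosingTime
case object TriggerAlarm
case object RingChime

class StoreManager(alarm: ActorRef) extends Actor {
  // The store starts out closed.
  def receive = closed

  def closed: Receive = {
    case DoorOpened  => alarm ! TriggerAlarm   // after hours: sound the alarm
    case OpeningTime => context.become(open)   // switch to business-hours behavior
  }

  def open: Receive = {
    case DoorOpened  => alarm ! RingChime      // business hours: just ring a chime
    case ClosingTime => context.become(closed) // back to after-hours behavior
  }
}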
Are You Distributed?
Up to this point, we've primarily been focusing on a single system, but the problem gets even more interesting when
we consider running across multiple independent systems. The simple fact is that most of the standard concurrency toolkits don't offer us a lot of help here, as their solutions focus on providing constructs that help developers work with shared-memory systems, which, for historical reasons, remain the most prevalent. In our store example, one scenario would be counting the number of customers entering the store: you could keep track of the number of times the RingChime message was seen, using the constructs provided by the toolkit to implement the tracking, a purely concurrency-related problem. There are of course many variations of this problem (we'll see some of these soon), and a common response to them is "just run your code on each system and use defensive coding to address possible issues."
What do we mean by "defensive coding"? Let's look at the most common approaches. Imagine that we have built a
system to handle managing users' financial accounts. Periodically we receive data about new transactions from
external systems that we don't manage. When that data comes in, we need to add these transactions into our
accounting history so we can later report back to the user their full transaction details.
But how do we handle these incoming transactions when we have redundant servers handling the work of pulling
the data in? Do we have each system independently pulling in the new data? How do we make sure they don't both
see the same item and add it, creating a duplicate record?
One approach would be to have the systems each communicating with each other over some RPC-style mechanism
and deciding which is to handle the incoming data. But then we have to get into a whole host of issues around
determining which system should be handling the data ingest and what to do when something goes wrong. As an
example, there is the unfortunately too frequent case of partial network outages. This will usually be seen by our
code as a sudden inability to reach resources on other machines. We can handle this within our business logic
using try/catch blocks, but before long, we find ourselves spending more time creating mechanisms to handle the
various failure scenarios and possible issues than we are on building the actual application.
The other common approach is to make this an issue of data consistency and push it all the way to some back-end
data store. Of course, this assumes we're using a data store that has a facility for handling this, such as a
transactional RDBMS and, further, that there are uniquely defining traits on the transaction data that allow us to
differentiate two seemingly identical transactions from each other. To understand this better, imagine that we go to
the store and buy a soda for $0.99, but then we remember our friend wanted one, too, and, feeling generous, we
make another transaction for a $0.99 soda. Depending on how the data shows up in our ingest system, these
transactions could look identical, depending upon the granularity of the identifying fields, as shown in Figure 3.
Figure 3: Soda purchases
Of course, the whole point of this approach is to have the database able to handle the case where we end up with
actual duplicates. That is, when two (or more) systems attempt to insert the same data. We have now essentially
pushed some portion of our application logic, that is, the logic that allows the database to determine whether two
records are actual duplicates, to another external system (the database, of course). Again, this would generally be
considered ancillary to the actual purpose of the application. But far too many developers now consider this part of
their standard operating procedure. This typically results in a segregated code base with three obvious layers: (a) a front end (typically JavaScript and HTML), (b) middleware (typically RESTful or asynchronous HTTP handlers), and (c) a back end (typically wrappers around a unified database API such as JDBC). In the Java world, the middleware and back-end code might be written with POJOs plus RMI (remote method invocation) or with JEE (also known as J2EE), a fair approach considering the times in which those technologies were invented, but largely a matter of bolting pieces together.
Figure 4: Soda purchase with resend
Both of these approaches clearly have issues and, from our perspective, neither seem very appealing. Why not just
use a toolkit that gives us built-in tools to handle these scenarios and make the whole issue of where our code is
running independent of (and, to a large degree, invisible to) the application logic itself? Wouldn't it be simpler if our
infrastructure code, the libraries and toolkits we build upon already handled this kind of situation for us?
This is where Akka can step in to provide us some help. Akka builds on the actor model (more on this momentarily),
which helps us by modelling interaction between components as sequences of message passing, which, as we'll
see, allows us to structure behaviors between components in a very fluid manner. Further, it is built on the
assumption that failures will occur, and therefore presents us with supervisors that help our systems to self-heal.
Finally, it gives us easy access to flexible routing and dispatching logic to make distributing our application across
multiple systems convenient and natural.
By moving concurrency abstractions up to a higher layer, Akka gives us a powerful toolkit for handling the issues
that have been discussed.
Before we move on, we present one possible way to use Akka to solve the problem we described earlier. Akka provides a module called akka-dataflow (which allows code to be executed asynchronously yet deterministically), and we can express one possible solution to the above problem by wrapping the sequence of actions into a workflow-like structure, as the following demonstrates:
flow {
  tryWithDatabasePool {
    purchaseSoda(sodaObject)
  }
}

// Note: flow blocks require the akka-dataflow module and the continuations compiler plugin.
def tryWithDatabasePool[A](code: => A)(implicit pool: ConnectionPoolDataSource) = {
  val connection = pool.getPooledConnection.getConnection
  try {
    connection.createStatement.execute(convertToSqlFromDomainModel(code))
  } finally {
    connection.close()
  }
}
The code above can be encapsulated in an Akka actor that watches for an (imaginary) message Purchase(sodaObject) carrying a payload, sodaObject. We can then hook up callbacks so that once our soda has been registered with the database, the status is made known to the requester, or the (pre-defined) recovery mechanism is activated. The following encapsulates these ideas:
class Soda extends Actor {
  import context.dispatcher // ExecutionContext for the future callbacks

  def receive = {
    case Purchase(sodaObject) =>
      val requester = sender // capture the sender for use in the callbacks
      val work = flow { /* as shown above */ }
      work onSuccess { case _ => requester ! Done }  // tell requester it's OK
      work onFailure { case _ => activateRecovery }  // start 'recovery' mechanism
  }

  def activateRecovery = { /* send email and/or register a 'fail' for this transaction */ }
}
We'll explore the most significant of these in the chapters ahead, but first it might help to provide a bit of background
for the most fundamental idea it builds upon.
How does this compare to traditional concurrency?
Now that the actor model has been detailed, it's worth talking a bit more about how this compares to the traditional
approach to concurrency used in most other popular languages (in other words, Java, C++, etc.).
As described above, the usual technique used to coordinate concurrent operations, that is the approach considered
standard, is the use of locks, semaphores, mutexes, or any of the other variations on this theme within the context of
thread-based execution models. This approach relies on some shared state or resources that must be restricted to
only be accessed by any one given thread of control at a time. There are a lot of variations on this, such as only
locking for write operations, while allowing reads from multiple concurrent callers, but the problems they exhibit fall
largely into a single category.
Imagine we have two threads and a single field that needs to be updated whenever one of those threads encounters
some event (perhaps it's triggered by someone visiting our website and the field is a counter of total visitors). For
some portion of the time, things appear to be just fine, but then two visitors come to our website at the same time,
both triggering an update to the counter. Obviously, only one thread can win. So we add a lock around the method
that updates this field. That seems to do the trick.
Figure 5: Thread contention
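A minimal sketch of that lock-around-the-counter approach (our own illustration, not code from the example project) might look like this; every updating thread has to queue up behind the same monitor:

// A visitor counter shared between request-handling threads.
class VisitorCounter {
  private var total = 0L

  // Only one thread at a time may enter the synchronized block,
  // so under heavy load threads pile up here waiting for each other.
  def recordVisit(): Unit = this.synchronized {
    total += 1
  }

  def current: Long = this.synchronized {
    total
  }
}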
But let's consider what happens when you're no longer dealing with two threads, but with potentially hundreds or
thousands. At this stage, you might think everything is just fine, until we open up our profiler and see how many threads are blocked, waiting to get access to the lock. It's very difficult to reason about these scenarios and to figure out the right strategy to prevent excessive resource consumption as the JVM tries to negotiate this logic. This is not something most developers are going to have the time to do. Thankfully, this is exactly the kind of problem that the actor model and Akka are designed to help with.
Figure 6: Heavy thread contention
Since this book is about Akka, and therefore about the JVM, we might wonder what the situation is given the generally positive statements often heard about the java.util.concurrent package. This package, originally under the EDU.oswego.cs.dl.util.concurrent namespace and later formalized in JSR 166, started out as a project of Doug Lea's with the goal of adding an efficient implementation of the commonly used utility classes needed for concurrent programming in Java. It contains some very powerful and well-tested code, without question. But it's also very low-level and requires very careful coding to use it effectively.
In our opinion, it's best to look at these as the fundamental structures upon which better abstractions can be built. In
fact, Doug Lea has made it clear on numerous occasions that he intends these tools to be used as the underlayer of
more approachable libraries. Ideally, these abstractions will allow us to more easily reason about the problems
which we need to address.
What is Akka and what does it bring to the table?
Akka is a project first begun by Jonas Bonér in 2008, but with roots that go much, much deeper. It pulls ideas from
Erlang and Clojure, but also adds a lot of its own innovations and brings in the elegance of Scala. There is a Java
API as well, which you can use if someone forces you to -- but perhaps at that point you should look for other work or
make new friends!
Akka is built firmly on the actor model. We've already covered that fairly well, but we'll describe, briefly, how Akka
approaches this model and what else it brings along. The model itself is just a framework for modelling problems,
but actual implementations are free to implement the specifics however they wish. In our opinion, Akka makes some
very good, pragmatic choices here.
Actors in Akka are very lightweight. The memory usage of a single instance is on the order of a few hundred bytes
(less than 600). Using Akka's actors for seemingly simple tasks should not be a concern from this perspective unless
we're in an extremely memory restricted environment. But even if we are, there's a fair chance any non-actor
approach to concurrency could easily use similar memory just trying to manage the locks, mutexes and state
juggling needed to keep it "safe".
Akka organizes actors into hierarchies, referred to as actor systems. To begin the life of any actor, you must first
obtain a reference to the ActorSystem (we'll show you how to get one in the next chapter) upon which all actors
reside. At the top level of the system, we have three actors called guardians. We won't really directly interact with
these actors the way we would other actors, but it's important to understand how they form the top of the hierarchical tree. We can create our own actors from this top level, which can then create their own child actors, resulting in a tree of actors, as shown in the following diagram.
Figure 7: Akka's actor hierarchy
There are important reasons Akka structures actors into these hierarchies. The first reason is for fault-tolerance.
Akka includes a supervision system that allows each parent actor to determine a course of action when a child actor
experiences a failure (that is, throws an exception). The choices, when this happens, are: restart the child, resume the child, terminate the child, or escalate the failure to the next level up (i.e., the grandparent). We'll go into more details
about each of these when we get into implementation details later.
This hierarchy also allows for a simple, file-system like addressing scheme that will be familiar to developers. For
instance, an actor called "accountCreditor" (actors can either be given names or the framework will assign one) that
was created in an actor system called "accounting" and with a parent called "accountMonitor" might be addressed
as "akka://accounting/user/accountMonitor/accountCreditor". We won't dwell on all the specifics of this right now, but
it's a useful property that also ties closely into the next feature.
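To make the addressing scheme concrete, looking an actor up by its path might go roughly like this (a sketch assuming Akka 2.2 or later, where actorSelection is available; the system and actor names are the hypothetical ones from the paragraph above):

import akka.actor.ActorSystem

val accounting = ActorSystem("accounting")

// Look up the hypothetical accountCreditor actor by a relative path...
val byRelativePath =
  accounting.actorSelection("/user/accountMonitor/accountCreditor")

// ...or by the full URI form, which is also what remoting builds on.
val byFullPath =
  accounting.actorSelection("akka://accounting/user/accountMonitor/accountCreditor")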
This last feature we want to highlight is the very simple and easily configured remoting that Akka provides. This
feature is the core that provides the key to truly scalable concurrency. With Akka remoting, we are able to both create
and interact with actors on another distinct Akka instance. This could be on the same machine, running under a
different JVM instance, or it could be located on another machine entirely.
And as we mentioned earlier, the remoting capability is very much integrated with the addressing of actors. From the
point of view of a given actor, if you have the address of another actor, you don't need to know or care whether that
actor is local or remote. This is called location transparency and it's a very powerful concept because it allows you to
focus on the problem-at-large instead of experiencing an internal hemorrhage tinkering with the mechanics of it. The
reason is that remoting can be driven entirely via configuration. We can have an application, the very same binaries, that runs entirely in a single JVM, or across a large number of JVMs, with the only change being in the configuration.
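As a preview (remoting is the subject of chapter seven), enabling it really is a matter of configuration; for Akka 2.2.x the relevant section of application.conf looks roughly like the following, where the hostname and port are placeholders you would adjust per deployment:

akka {
  actor {
    provider = "akka.remote.RemoteActorRefProvider"
  }
  remote {
    enabled-transports = ["akka.remote.netty.tcp"]
    netty.tcp {
      hostname = "127.0.0.1"
      port = 2552
    }
  }
}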
Where are we going with this?
Now that we've covered the basic ideas, we can start to look at where this is all headed. At this point we're assuming
you have some problem to solve or will likely encounter some problem in the near future that involves the need to
build with concurrency in mind. We're hoping to demonstrate how Akka can help solve these problems.
In the next chapter, we'll show how to get started with Akka by way of some simple, but realistic examples. From
there, we'll demonstrate how we can take that simple example, make it more resilient, make it distributed, and
perhaps a few more interesting things that would almost certainly be anything but simple with a thread-based
approach.
Chapter 2: Working with Actors
In this chapter, we'll get a brief description of how actors are created in Akka, including how we can add actors into
an existing project and then we'll see some details on how to interact with actors beyond just blindly sending
messages. Many of the examples in this book assume you are working in a UNIX/Linux environment; they should generally work in a Windows-based environment with a few minor adjustments. In later chapters, we'll revisit
portions of this example and refine it, explaining the features Akka provides along the way.
Determining the scope
It's hard to imagine there are many developers out there right now who haven't written some kind of web-based
code, whether it was a Rails application, a Java servlet or whatever. This is a seemingly simple task, but a host of
complexities can quickly rear their head. Though some of these systems are fairly trivial, in most cases the incoming
requests require interaction with some other system components. This might be a database or it could be other web-
services or perhaps a message queue, such as ActiveMQ or RabbitMQ.
In this case, we're going to see a simple service for managing a collection of bookmarks. The service will allow for
the creation of new bookmarks and the ability to query the saved bookmarks. We'll leave out editing to keep the
example code reasonably simple. As we develop this system, we'll look at what enhancements we might like to add
and explore how Akka can help out.
You can imagine this service might start as a simple weekend hack that you put together to keep track of links you
see on Twitter, Facebook, and elsewhere. But perhaps over time you show it to a few friends who ask to start using it
and eventually it might grow into something much larger, needing multiple servers and dedicated resources. So, you
want to make sure the service is robust and that it is resilient to failures.
We'll look at how to build a simple service like that just described and then see how to add on the additional
functionality as the requirements expand. We'll go over the basics in this chapter and get comfortable with Akka
before we move into deeper waters in later chapters.
Laying the foundation
Now the first question is where to start. We're going to keep things somewhat simplified and ignore a few details, like
validation, but these are easy enough to add later. We're going to build this using a very simple Java servlet-based
approach, since we really don't want to complicate matters by introducing additional dependencies. And we'll use
Jetty to handle loading and running the servlet, without requiring a complete container setup.
With that out of the way, let's think about how we might approach this. We can start by defining a simple servlet that
accepts an HTTP Post request, parses the parameters given and updates some data structure. Here's a quick
example of how this might work:
import javax.servlet.http.{HttpServletResponse, HttpServletRequest, HttpServlet}
import javax.servlet.annotation.WebServlet
import java.util.UUID
import scala.collection.concurrent.{Map => ConcurrentMap}

case class Bookmark(title: String, url: String)

@WebServlet(name = "bookmarkServlet", urlPatterns = Array("/"))
class BookmarkServlet(bookmarks: ConcurrentMap[UUID, Bookmark]) extends HttpServlet {

  override def doPost(req: HttpServletRequest,
                      res: HttpServletResponse) {
    val out = res.getOutputStream()
    val title = req.getParameter("title")
    val url = req.getParameter("url")
    val bookmark = Bookmark(title, url)
    val uuid = UUID.randomUUID()
    bookmarks.put(uuid, bookmark)
    out.print("Stored bookmark with uuid: " + uuid)
  }

  override def doGet(req: HttpServletRequest,
                     res: HttpServletResponse) {
    val out = res.getOutputStream()
    val bookmarkId = req.getParameter("uuid")
    bookmarks.get(UUID.fromString(bookmarkId)) match {
      case Some(bookmark) => out.println("Retrieved " + bookmark)
      case None           => out.println("Bookmark with UUID specified does not exist.")
    }
  }
}
This code is fairly typical for a servlet. We're using a scala.collection.concurrent.Map (aliased here as
ConcurrentMap, to highlight the type) to store the bookmarks, so it's thread-safe. Let's exercise this example a little
just to get a feel of how it works.
Store and Retrieve a bookmark
Navigate to the directory ch2-working-with-actors/Non-Akka and start SBT:
> sbt
Next, enter run at the SBT prompt:
> run
and SBT (the Simple Build Tool) will look for our main class, Bookmarker, and start it, at which point you'll notice that a simple web server has lodged itself on your machine. A quick note here: most of the code and commands you'll see later on assume an SBT session with dependencies already loaded. Next, we are going to store our first bookmark in our system and retrieve it using a key that will be returned to the user. Assuming you are working with some kind of UNIX system, you would do the following to store our first bookmark (we're using curl, a standard UNIX command-line tool for performing HTTP operations, in our case POST and GET):
> curl localhost:8080 -d title="Developing an Akka edge" -d url="http://somepublisher.com/akka-book"
and that previous command would return the following message with the identifier embedded in it:
> Stored bookmark with uuid: 39d5968f-921e-41f0-a829-42a9cbf91ea9
Next, you can continue to use curl to query our system, or fire up a browser:
> curl localhost:8080?uuid=39d5968f-921e-41f0-a829-42a9cbf91ea9
And the following message is rendered on your command prompt or in your browser:
Retrieved Bookmark(Developing an Akka edge,http://somepublisher.com/akka-book)
This is all good and well until we need to do something besides simply adding, editing, and getting the list of
bookmarks. Let's assume at some point you want to retrieve a bookmark by URL, perhaps in order to avoid adding
duplicates. Since the URL is contained within the Bookmark object, we would have to traverse the Map looking at all
of the contained objects in it to find any bookmarks with the specified URL. It's certainly possible to build a set of
data-structures to store various cross-referencing indexes, but at this point you'll end quickly lose the thread-safety of
the concurrent map we've used above. Instead, you would need to manually lock the collections all at once, so that
any changes to one get reflected in the others.
There's also the issue that you can't easily interact with the same map from other instances of the application or
from another machine. A better solution would be to use an external database of some form. Once you begin
interacting with external services, though, actors can be very useful, as we'll see shortly (and, to a larger degree,
throughout the rest of this book).
Putting actors to work
First, though, we're going to take you on a brief detour to show you how to create and use actors in Akka. As we mentioned in the previous chapter, it all starts with an ActorSystem. This is the top-level entry point which you need
to have in place before you can create any actors. Creating one is as simple as giving it a name (we will be using
this later):
scala> import akka.actor.ActorSystem
import akka.actor.ActorSystem

scala> val system = ActorSystem("SimpleSystem")
system: akka.actor.ActorSystem = akka://SimpleSystem
But this is not very useful by itself. We still need to create an actual actor, which we can do by defining a class
that extends the akka.actor.Actor trait and that contains a receive method:
scala> import akka.actor.Actor
import akka.actor.Actor

scala> class SimpleActor extends Actor {
     |   def receive = {
     |     case "aMessage" => // ...
     |     case i: Int => // ...
     |   }
     | }
defined class SimpleActor
Now that we have an actor class defined, we can create an instance of it using the actor system we created earlier.
This will create an instance of the class, start it (note: this is done asynchronously), and return an instance of an
akka.actor.ActorRef.
scala> import akka.actor.Props
import akka.actor.Props

scala> system.actorOf(Props[SimpleActor], name = "mySimpleActor")
res1: akka.actor.ActorRef = Actor[akka://SimpleSystem/user/mySimpleActor]
Don't worry about what Props is doing there for now, but know that it's a configuration object typically used when
creating Actors. Later, we'll show you a few cases where it serves a more clear purpose, but for now just look at it as
a basic factory for actor instances.
An ActorRef is the object you must always use to reference an actor in Akka. When we discussed how actors are always referred to by some address in the previous chapter, this is the corresponding representation of that address
as used by Akka. From within the actor itself, you can get this reference using the self field. It's important to
understand that this reference is the mechanism through which you interact with an actor. This separation is
important for Akka to maintain location transparency and to prevent you from directly calling methods on actor
instances, which can break certain properties of the actor model. From outside of the actor you cannot, for instance,
call fields defined on the actor object itself; you can only interact through the interface that is exposed through this
ActorRef object. Here's an example of what would happen if you (even) tried:
scala> class AActor extends Actor {
     |   var state = 42L
     |   def changeState(x: Long) = state = x
     |   def receive = { case x => println(s"YOU SENT: ${x}") }
     | }
defined class AActor

scala> system.actorOf(Props[AActor], "a-actor")
res9: akka.actor.ActorRef = Actor[akka://SimpleSystem/user/a-actor#-1660648705]

scala> res9.state
<console>:17: error: value state is not a member of akka.actor.ActorRef
       res9.state
            ^

scala> res9.changeState
<console>:17: error: value changeState is not a member of akka.actor.ActorRef
       res9.changeState
There's another form of this actor initialization that you should know about, though, which comes into play when you
have an actor class that takes constructor parameters.
scala> class NamedActor(name: String) extends Actor {
     |   def receive = {
     |     case _ => // ...
     |   }
     | }
defined class NamedActor

scala> system.actorOf(Props(new NamedActor("myActor")), name = "myNamedActor")
res2: akka.actor.ActorRef = Actor[akka://SimpleSystem/user/myNamedActor]
Beyond this, you will later also encounter cases where you want to create actors from within other actors. This is a
very important technique that you'll later discover is fundamental to building systems capable of handling a wide
range of tasks. So far, we've been using the ActorSystem instance, generally called system in the examples we've
shown, but within an actor instance, there's a private field called context that is defined on the Actor trait, which
provides similar functionality. In fact, as with ActorSystem, it extends the ActorRefFactory trait, which defines the
actorOf and actorFor (which you will be seeing later in the book) methods. You can also use context to get a
reference to the current actor system by calling context.system. The flip-side of creating actors, which also starts
them, is stopping them. You can use the context to stop an actor by calling context.stop(ref) where ref is an
ActorRef instance. Starting and stopping an actor is done asynchronously.
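Here's a small sketch of our own (with hypothetical actor and message names) showing a parent actor creating a child through its context and later stopping it:

import akka.actor.{Actor, Props}

class Child extends Actor {
  def receive = {
    case msg => println(s"child received: $msg")
  }
}

class Parent extends Actor {
  // context.actorOf creates the child as a child of *this* actor,
  // so its path will end in .../user/parent/child
  val child = context.actorOf(Props[Child], name = "child")

  def receive = {
    case "work" => child ! "do something"
    case "done" => context.stop(child) // stopping is asynchronous, too
  }
}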
Finally, the last piece you need to understand for now is how to send messages to actors. Messages can be of any
type. Akka will not prevent you from sending whatever you want to send. But it's important to understand the protocol
defined for a given actor.
We used the word protocol here for a reason. A protocol is a well-defined set of rules for how communication
occurs between two or more parties. When you're communicating with an actor (whether from another actor or from
non-actor code), you should take the time to define your protocol of communication carefully. This is done by
defining a set of messages that the actor will accept and possibly respond to. In Akka, we generally use case
classes or case objects to define these messages. It's also typical to define them on the companion object for the
actor, but we're getting ahead of the game. We'll come back to this later.
You can send messages using the ! method which is used with ActorRef objects. This method, referred to as tell,
might look odd, at first, but you'll get used to it quickly and it makes for a very nice and succinct syntax:
someActor ! MyMessage("hello")
Within an actor, you use the self ActorRef when it needs to send itself a message:
self ! MyMessage("hello")
After that whirlwind tour of using actors in Akka, let's return to our earlier work and see how we can bring Akka into
the picture and let it help us out. Rather than showing you a full database enabled example right away, we're going
to simply use an actor as a standin for the database. Later, we'll make this actor the actual interface for the database,
taking advantage of Akka features such as fault-tolerance and routing to give us a robust and scalable way to
interact with the database. Let's look at our new actor first:
import akka.actor.Actor
import java.util.UUID

case class Bookmark(uuid: UUID, title: String, url: String)

object BookmarkStore {
  case class AddBookmark(title: String, url: String)
  case class GetBookmark(uuid: UUID)
}

class BookmarkStore extends Actor {
  import BookmarkStore.{GetBookmark, AddBookmark}

  var bookmarks = Map[UUID, Bookmark]()

  def receive = {
    case AddBookmark(title, url) =>
      val exists = bookmarks.values.exists { bm =>
        bm.title == title && bm.url == url
      }
      if (!exists) {
        val id = UUID.randomUUID
        bookmarks += (id → Bookmark(id, title, url))
        sender ! Some(id)
      } else
        sender ! None
    case GetBookmark(uuid) =>
      sender ! bookmarks.get(uuid)
  }
}
There's not much to this simple actor. It's essentially just acting as a wrapper around a Map that stores Bookmark
objects referenced by their UUID. The interesting thing to note here is how the actor responds to the messages it
receives. The sender method used here is defined within the base Actor class and provides the ActorRef to the
originator of the current message. You will use sender frequently when writing actor code. One important thing to
note is that sender is a method and the object returned by it will change in different contexts. That is, if you need to
reference it later (to pass on to some other actor or for replying later within some other context), you should create a
copy of the object in a val and pass that around, as needed.
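For example, here is a small sketch of our own (with hypothetical names) showing the sender being captured in a val before any asynchronous work is kicked off:

import akka.actor.Actor
import scala.concurrent.Future

class LookupActor extends Actor {
  import context.dispatcher // ExecutionContext for the Future callback

  def receive = {
    case query: String =>
      // Capture the current sender now; by the time the Future completes,
      // sender may well refer to the sender of a different message.
      val requester = sender
      Future {
        s"result for $query"
      } foreach { result =>
        requester ! result
      }
  }
}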
On the servlet end, we're going to take advantage of the Java Servlet 3.0 support for asynchronous calls. While this
requires a bit more setup, it makes for much more natural integration with actors. First, here's the boilerplate to
actually create the class and make the necessary imports:
import java.util.UUID
import javax.servlet.http.{HttpServletResponse, HttpServletRequest, HttpServlet}
import javax.servlet.annotation.WebServlet
import akka.actor.{ActorRef, ActorSystem}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.ExecutionContext
import scala.concurrent.duration._
import scala.util.{Failure, Success}

@WebServlet(asyncSupported = true)
class BookmarkServlet(system: ActorSystem, bookmarkStore: ActorRef) extends HttpServlet {

  // import the case classes we use for communicating with the actor
  import BookmarkStore.{AddBookmark, GetBookmark}
}
Let's look at the doPost method to see how we interact with the actor:
override def doPost(req: HttpServletRequest,
                    res: HttpServletResponse) {

  import ExecutionContext.Implicits.global

  val asyncCtx = req.startAsync()
  val writer = asyncCtx.getResponse.getWriter
  val title = req.getParameter("title")
  val url = req.getParameter("url")

  implicit val timeout = Timeout(5 seconds)
  asyncCtx.setTimeout(5 * 1000)
  val uuidFuture = bookmarkStore ? AddBookmark(title, url)
  uuidFuture.mapTo[Option[UUID]].onComplete {
    case Success(uuid) =>
      writer.write(s"Successfully created bookmark with uuid = $uuid")
    case Failure(error) =>
      writer.write("Failure creating bookmark: " + error.getMessage)
  }
}
Here we first get a reference to an ExecutionContext. An ExecutionContext is an abstraction for defining the context
in which some code will be executed. Here we need to specify this because of the requests we're making to the
actor. You'll likely notice, after all the discussion earlier about the ! method, that we're not actually using it. Instead, we're using the ? method which, along the same lines as tell, is referred to as ask. This method returns a future, which is a container for some value that may or may not have been computed yet. Since this newly created future can't be allowed to run forever, you need to supply a timeout, which we do here via the implicit value timeout. Since sending a message to an actor is asynchronous, we need to use a future to get a response in non-actor code. The onComplete method is a callback that's available on Future instances. It's this callback that requires the implicit ExecutionContext referenced earlier: because it executes asynchronously, the ExecutionContext needs to be available to supply a thread in which it can execute. We also use Future's mapTo method to constrain the type of the return value. This is useful since messages sent by actors are inherently of a dynamic type, so we can't guarantee the type of the message that will be sent back to us from an actor. If the Future takes longer than this timeout to return a value, we will receive a Failure in the onComplete callback.
You'll also note that we invoke req.startAsync(). This code is part of the Servlet 3.0 API which allows us to get a
handle on the current request context so that we can then later return a response to the caller of this servlet
asynchronously without having to keep a thread busy. In this case, we get a Writer instance via the
asyncCtx.getResponse.getWriter call and then set a timeout (there's a notable similarity here to what we are doing
with the Future, for good reason) before our asynchronous code begins executing. This will allow the servlet
container to interrupt the flow of execution if the timeout is exceeded and return an appropriate response to the
client, hopefully before the client's connection to the server has itself timed out.
The futures used by Akka are now part of the Scala standard library. They were added with the release of Scala
2.10, but were in large part derived from the futures used in earlier versions of Akka. We brought this up in case you
see mention of 'Akka futures' and wonder if they are a different sort of beast.
The servlet's doGet method will look very familiar:
override def doGet(req: HttpServletRequest,
                   res: HttpServletResponse) {

  implicit val ec = ExecutionContext.Implicits.global

  val asyncCtx = req.startAsync()
  val writer = asyncCtx.getResponse.getWriter
  val bookmarkId = UUID.fromString(req.getParameter("uuid"))

  implicit val timeout = Timeout(5 seconds)
  asyncCtx.setTimeout(5 * 1000)
  val bookmarkFuture = bookmarkStore ? GetBookmark(bookmarkId)

  bookmarkFuture.mapTo[Option[Bookmark]].onComplete {
    case Success(bm) =>
      writer.write(bm.getOrElse("Not found").toString)
    case Failure(error) =>
      writer.write("Could not retrieve bookmark: " + error.getMessage)
  }
}
This is essentially using the same pattern we saw in the doPost example. The last method of this servlet we should
examine provides a very important bit of functionality, shutting down the ActorSystem when we're done with it. This
is why earlier we passed the ActorSystem to the servlet. You should be on the lookout, when designing your Akka-
based application for existing hooks such as this that might be available to handle tasks like cleanly shutting down
your system:
override def destroy() {
  system.shutdown()
}
Putting our actor to better use
Now that we've shown a basic actor and how to integrate it into a common scenario, let's look a bit more at how we
can expand on this to really leverage the capabilities of actors. As previously discussed, actors are excellent at
handling concurrency and scenarios where failures are an expected occurrence, so let's consider a situation where these are particularly important. As mentioned earlier, it really makes sense for an application like the one above to use an external database. This database could be your typical relational system, such as MySQL or PostgreSQL, a non-
relational store, such as MongoDB, or perhaps even a web service sitting in front of some unknown anonymous
store (that is, it might not be something of which we actually know the internals). To keep things simple, we'll use a
standin for a data store, but this could just as easily be an external system:
import scala.collection.concurrent.TrieMap

trait Database[DATA, ID] {
  def create(id: ID, data: DATA): ID
  def read(id: ID): Option[DATA]
  def update(id: ID, data: DATA)
  def delete(id: ID): Boolean
  def find(data: DATA): Option[(ID, DATA)]
}

object Database {

  def connect[DATA, ID](service: String): Database[DATA, ID] = {
    new Database[DATA, ID] {

      // We're using a thread-safe concurrent collection here to avoid
      // odd behavior when you run the example code.
      private val store = TrieMap[ID, DATA]()

      def create(id: ID, data: DATA): ID = {
        store += (id → data)
        id
      }

      def read(id: ID) = {
        store.get(id)
      }

      def update(id: ID, data: DATA) {
        for (item <- store.get(id)) {
          store += (id → data)
        }
      }

      def delete(id: ID) = {
        store -= id
        !store.contains(id)
      }

      def find(data: DATA) = {
        store.find(_._2 == data)
      }
    }
  }
}
As you can see, this data store of ours supports the basic CRUD (create, read, update, and delete) operations as
well as a very simple find operation that assumes the candidate value matches precisely. We'll revise our earlier
actor to use this implementation:
case class Bookmark(title: String, url: String)

class BookmarkStore(database: Database[Bookmark, UUID]) extends Actor {

  import BookmarkStore.{GetBookmark, AddBookmark}

  def receive = {
    case AddBookmark(title, url) =>
      val bookmark = Bookmark(title, url)
      database.find(bookmark) match {
        case Some(found) => sender ! None
        case None =>
          val uuid = UUID.randomUUID
          database.create(uuid, bookmark)
          sender ! Some(uuid)
      }
    case GetBookmark(uuid) =>
      sender ! database.read(uuid)
  }
}
You'll notice that the definition of Bookmark was modified slightly. This isn't critical, but it makes our usage of the
simple data store we created earlier a bit easier. You can see that the actor now expects to have a database
instance passed to it, and that the database is parameterized on the Bookmark type we're storing in it, along with the
UUID we're using for our keys. The rest of the functionality here is effectively unchanged from the caller's point of
view, since the actor still sends back the same response messages in each case.
Now, in the code that starts our application, we can add one line and modify the line that starts the actor in order to
supply the necessary database instance:
val database = Database.connect[Bookmark, UUID]("bookmarkDatabase")

val bookmarkStore = system.actorOf(Props(new BookmarkStore(database)))
This still hasn't really brought anything new to the table. In the real world, if this were an external system as we've
described above, we'd be expecting failures, resource contention (how many connections does the database allow
or handle before giving us trouble?), etc. Let's add a couple of things to give us both a bit of load-balancing and
failure handling:
import akka.actor.{OneForOneStrategy, Props, ActorSystem}
import akka.actor.SupervisorStrategy.{Escalate, Restart}
import scala.concurrent.duration._
import akka.routing.RoundRobinRouter

val databaseSupervisorStrategy =
  OneForOneStrategy(maxNrOfRetries = 5, withinTimeRange = 30 seconds) {
    case e: Exception => Restart
    case _ => Escalate
  }

val bookmarkStore =
  system.actorOf(Props(new BookmarkStore(database)).
    withRouter(RoundRobinRouter(nrOfInstances = 10,
      supervisorStrategy = databaseSupervisorStrategy)))
This code needs some explanation, of course, and while we will be going into much more detail later in chapters
four and five, we'll give you a brief glimpse here of what's going on. First, let's cover the databaseSupervisorStrategy
object that we created. Supervisors are a core feature of Akka that are responsible for determining what to do in the
case of exceptions within your actors. In this particular strategy we've defined, we are telling the supervisor to restart
an actor that throws a regular exception, and to escalate any other errors. As briefly mentioned in the previous chapter,
actors in an actor system form a hierarchy and each actor in that hierarchy is the supervisor for its children (this
relationship defines the supervision hierarchy). Escalating means passing the failure up to the parent, in which case
that parent's own supervisor is responsible for determining how to handle the failure.
Here, we're inserting a parent actor for the BookmarkStore actor by way of the RoundRobinRouter that's been added
to the creation. Routers allow you to determine how messages are sent to one or more child actors. In this case, we
decided to use the simple round robin strategy, which iterates through the actors it is routing for in sequence,
sending each new message to the next actor in turn and restarting at the beginning of the sequence when it has
reached the last actor. The parameters we've passed set the number of actors to start within this router
(that is, 10 instances of BookmarkStore) and the supervisor strategy we just described.
Wrap up
We've walked through an example showing how Akka can integrate with other libraries like the Servlet API to build
interesting combinations of capabilities. So far, we have sidestepped the question of how to actually build this app,
though. In the next chapter, we'll cover these details and show you a few useful helpers that make your work easier.
Chapter 3: Running with Akka
You now have a bit of a taste for Akka, the actor model, and how these concepts and tools can help you address
concurrency needs. But a necessary step in building any application is actually building the application. In this
chapter, we'll show you how to go about this, using the preferred tooling of the Scala community, and show you a
few additional bits of infrastructure you'll need to make this all work for you, and to help make your system more
maintainable.
sbt
sbt is a standard part of the Scala environment used for building libraries and applications. It's not the only choice
available to you, but it's definitely the most commonly used and recommended when working with Akka and Scala.
You're welcome, of course, to use whatever build tool you feel the most comfortable with, but at the very least it is
worth your time to get familiar enough with sbt that you can build projects that are using it, when you need to.
First, of course, you will need to install sbt. You can find the instructions and downloads available on the SBT
website. You will want to make sure you have the sbt command in your path, which you can verify by running the sbt
command in an empty directory (create one, if necessary). On the first run, sbt will download a number of its own
dependencies and then should leave you at a prompt that looks something like this:
[info] Loading global plugins from /Users/tlockney/.sbt/plugins
[info] Set current project to default-e05719 (in build file:/Users/tlockney/tmp/sbt-test/)
>
If you get something along these lines, you're now ready to use sbt, otherwise you'll need to double check the
installation instructions and try again. You can also find help on SBT's mailing list or in the #sbt IRC channel on
Freenode.
The sbt build definition
In our combined experience, build tools are all over the map, so to speak, in terms of functionality and capabilities,
but there are a handful of defining characteristics that you can use to differentiate them. The most immediately
obvious of these, to us, is how they expect you to lay out your project's source code. On one end of the spectrum,
you have tools like Ant and Rake, both of which go no further than suggesting what the default buildfile's name
should be. On the other end, Maven and sbt expect a project with a specific structure (though you can override
this, should you choose to do so). Mark Harrah, the original developer of sbt, chose to adopt many of the Maven
conventions with regard to this layout -- a decision which we feel was a wise one.
The structure you'll get used to seeing in sbt-based projects generally follows these guidelines, starting from
the root directory of the project. All source code and resources intended for the final build artifacts go under src, with
src/main being the location of your regular code and resources, while src/test holds the various test cases and
dependent resources. These directories will each contain one or more subdirectories with the following structure:
Scala code should go under src/main/scala, or src/test/scala in the case of tests and test-related utilities.
Similarly, Java code goes in src/main/java or src/test/java.
Resources, such as configuration files, go in src/main/resources or src/test/resources.
Your various library dependencies can be handled in a couple of different ways. One option is to simply copy these
Jar files into a lib directory at the top-level of your project. But more commonly, you will use managed dependencies,
which are declared as part of your build file.
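Putting these conventions together, a typical sbt project ends up looking something like the following (the project
name is just an example, and any directories you don't need can simply be omitted):
my-project/
  build.sbt
  lib/                (optional, for unmanaged jars)
  src/
    main/
      scala/
      java/
      resources/
    test/
      scala/
      java/
      resources/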
The simplest build file format is handled by a build.sbt file. This file is expected to be at the top-level of your project
and will generally contain information such as the project name, project version, the Scala version being used and
potentially a set of dependencies to include along with any additional remote repositories that should be queried for
these dependencies.
To get us started, here's a very simple build.sbt file that can be used for building a project that uses just the basic
Akka actor module:
name := "Simple Akka-based Project"

version := "0.1"

scalaVersion := "2.10.1"

libraryDependencies += "com.typesafe.akka" %% "akka-actor" % "2.1.2"
In the build.sbt format, you must separate each setting with one full blank line -- this is required by the parser that
reads these files. There is a more advanced format that uses full Scala objects to define your build, and you can find
more information about this in the sbt documentation.
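For reference, a full-configuration build equivalent to the build.sbt above might look roughly like the following,
placed in project/Build.scala. This is just a sketch; the object and project names here are arbitrary, and the sbt
documentation has the authoritative details:
import sbt._
import Keys._

object SimpleBuild extends Build {
  // Mirrors the settings from the build.sbt example above
  lazy val root = Project(id = "simple-akka-project", base = file(".")).settings(
    name := "Simple Akka-based Project",
    version := "0.1",
    scalaVersion := "2.10.1",
    libraryDependencies += "com.typesafe.akka" %% "akka-actor" % "2.1.2"
  )
}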
You will also see a handful of operators used to assign values here. The only ones we will be covering for now,
though, are :=, += and ++=. The first of these, :=, is the basic assignment operator. It is used to assign a specific,
fixed value to one of the build settings, as seen above for the name, version, and scalaVersion settings. The
+= operator is used to append a value to an existing setting -- resolvers and libraryDependencies are the most
common cases where you'll see this. Finally, the ++= operator is used to append a collection (in the form of a Seq())
to an existing setting. As with the simple append, this is most often used with the resolvers and libraryDependencies
settings. For example, you might see libraryDependencies appended to for each dependency:
libraryDependencies += "com.typesafe.akka" %% "akka-actor" % "2.1.2"

libraryDependencies += "com.typesafe.akka" %% "akka-remote" % "2.1.2"
Or, alternately, using the Seq approach:
libraryDependencies ++= Seq(
  "com.typesafe.akka" %% "akka-actor" % "2.1.2",
  "com.typesafe.akka" %% "akka-remote" % "2.1.2"
)
Notice that you did not need to separate each line in the second form — since you're updating a single setting, there
is no need to do so.
Compiling and running the code
Once you have your build set up correctly, you will of course need to compile and run your code. You might even
have tests you'd like to run to verify you haven't broken anything!
sbt defines quite a few tasks (and allows you to define more, but that's more advanced than we'll be covering here).
The most important of those to get familiar with, for now, are update, compile, test, and run. The update task tells sbt
to pull down any dependencies that it needs as defined in the build file. The compile and test tasks should be pretty
self-explanatory. Executing the run task causes sbt to look for any main methods defined in your code. If it finds more
than one, it will prompt you to select which to run.
You should also know that there are two standard ways to execute these tasks: one is to simply pass them as
arguments to the sbt command (e.g., sbt compile will cause sbt to compile your code). The other option is to run
them from within the sbt console -- as shown above when we showed you how to verify you had the sbt command
on your path. If you're running them within the sbt console, you can exit sbt using the exit command (there is a
difference between tasks and commands in sbt, but that subject is beyond the scope of this book).
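To make that concrete, here's a sketch of the two approaches; the exact sequence of tasks you run will of course
depend on your project:
$ sbt compile        # run a single task from the shell and exit

$ sbt                # or start the interactive console and run tasks from its prompt
> update
> compile
> test
> run
> exit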
Building your Akka-based project with sbt
Now that you've gotten the basics of using sbt, we can walk you through setting up a project to use Akka. We'll show
you what you need to run the code shown in the previous chapter and then you can build on this in later chapters. To
start, here's all you need in your build.sbt to get things working:
name := "A Simple Akka Project"

version := "0.1"

scalaVersion := "2.10.1"

libraryDependencies ++= Seq(
  "org.eclipse.jetty" % "jetty-server" % "9.0.0.v20130308",
  "org.eclipse.jetty" % "jetty-webapp" % "9.0.0.v20130308",
  "com.typesafe.akka" %% "akka-actor" % "2.1.2"
)
It's as simple as that. This is essentially the previous definition we showed earlier, but with the addition of the Jetty
dependencies. You'll also notice two different styles used for declaring the dependencies. In the first two cases, the
first two strings are shown with just a single % character, but the akka-actor dependency has two of them. The first
two declarations are equivalent to what you'd see in a Maven POM file, just written a bit differently.
The Akka declaration is using a style that's common for Scala libraries -- the extra % character tells sbt to append
the Scala binary version (in this case "2.10") to the end of the dependency name ("akka-actor"), in this form: "akka-
actor_2.10". You'll see this repeatedly in sbt build files and it's worth getting used to now. It could have been written
explicitly as "akka-actor_2.10". The primary reason to use this style is that, when you have a number of
dependencies that, in turn, depend on the version of Scala being used, it's simpler and less repetitive to just change
the scalaVersion setting.
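Written out in full, without the %% shorthand, that declaration would look like this:
libraryDependencies += "com.typesafe.akka" % "akka-actor_2.10" % "2.1.2"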
That's pretty much all there is to it, until you later decide to branch out and add new dependencies or get even more
advanced by adding sbt plugins to the mix. For instance, you might want to add one of the plugins available for
automatically generating IDE configurations. Later in the book we will need an additional Akka module (akka-remote)
to handle the examples involving remote actors.
Akka and application configuration
Akka uses the Typesafe Config library to handle configuring the actor system and the behavior of its component
parts. This library supports a number of different formats for your configuration files, but the one we'll focus on is
HOCON ("Human-Optimized Config Object Notation"), which is similar to JSON but both more powerful and simpler
to write. The format is very flexible in what it will accept and it's worth reading the documentation to get a feel for it.
The other significant thing to understand about this library is that it is built upon the philosophy that
your code should never contain default values. Instead, these default values are expected to be set in a file called
reference.conf, which is usually distributed as part of the JAR file your code lives in. The libraries you
depend upon carry their own reference.conf files in their JARs. At the application level, you then have an
application.conf file that allows you to override any of these settings. Additionally, you can use system properties to
override any settings. System properties are given the highest priority, allowing for very flexible runtime configuration
adjustments.
There are quite a few more features supported by this library and we'll be covering a few more of them as we look at
configuring Akka, but the best resource if you're curious is the library's own documentation. This library is becoming
more pervasive in the Scala community and is already being used for configuring the Play framework (actually, the
library was initially developed to supply a common configuration mechanism for Akka and Play). It's worth
familiarizing yourself with it on a deeper level.
Akka uses the standard mechanism described above to load a default configuration, which you can override in your
application by adding an application.conf file or by setting system properties. This configuration file will typically go
in the src/main/resources directory in order to be loaded correctly at runtime. We could show you the most basic
configuration file, but it would be hard to show in this book given that it would be entirely empty. In fact, the file
doesn't even need to exist. But to give you a sense of what you will typically see, here's a fairly straightforward
configuration that includes a few items we'll be talking more about later:
akka {
  event-handlers = ["akka.event.slf4j.Slf4jEventHandler"]
  loglevel = DEBUG
  log-config-on-start = on
  actor {
    debug {
      receive = on
      lifecycle = on
      unhandled = on
    }
  }
}
This configuration sets up a typical environment you might use while developing with Akka in order to effectively root
out possible issues. You'll want to familiarize yourself with the full documentation on the configuration settings
available, but in general terms this example sets up the logging system and turns on a variety of helpful log
messages.
You should take note of the structure of this configuration. As with JSON, it's made up of a sequence of nested
components, each of which is a child of the object in which it is contained. Each of these objects has a path
determined by this structure, and you have the option of specifying each setting using the same kind of dot notation
that's commonly seen in Java properties files. Here are a few functionally identical variations on one of the lines from
the previous configuration to give you an idea of how this can vary:
akka.actor {
  debug.receive = on
}

akka.actor.debug {
  receive = on
}

akka.actor.debug.receive = on
As we go through some of the additional functionality of Akka, we'll point out configuration settings that can be used
(and in some cases are necessary) to affect the behavior of the system. You can also use this configuration to define
your own settings specific to your application, but that's outside the scope of this book. You should read both the
Config library documentation as well as the documentation specific to the Akka configuration if you choose to do so.
There are special, though not onerous, considerations you'll need to make in order to avoid breaking the handling of
the configuration elsewhere.
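As a quick illustration of the kind of thing that's possible, here's a minimal sketch of reading a hypothetical
application-specific setting. The bookmarker.http.port key is just an example name we made up, not something Akka
defines:
import akka.actor.ActorSystem
import com.typesafe.config.ConfigFactory

// application.conf might contain, for example:
//   bookmarker {
//     http.port = 8080
//   }
val config = ConfigFactory.load()
val port = config.getInt("bookmarker.http.port")

// The same Config instance can also be passed explicitly when creating the ActorSystem
val system = ActorSystem("bookmarker", config)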
Logging in Actors and elsewhere
Logging in Akka is handled asynchronously, as you might expect. It does this using a special event bus. You enable
this by defining event handlers that should be subscribed to this bus. The default logging handler simply logs to
STDOUT and is enabled using the following configuration:
akka.event-handlers = ["akka.event.Logging$DefaultLogger"]
This simple STDOUT logging is often not ideal, so Akka also provides an SLF4J-based event handler in the
akka-slf4j module. You'll need to add this to your libraryDependencies in sbt. You will also want to add an SLF4J
backend such as the recommended Logback:
libraryDependencies ++= Seq(
  "com.typesafe.akka" %% "akka-slf4j" % "2.1.2",
  "ch.qos.logback" % "logback-classic" % "1.0.7"
)
You will also need to replace the default event handler using the following configuration.
akka.event-handlers = ["akka.event.slf4j.Slf4jEventHandler"]
Once you have this enabled, you can add a basic Logback configuration in src/main/resources.
<configuration>
  <appender name="FILE" class="ch.qos.logback.core.FileAppender">
    <file>logs/application.log</file>
    <encoder>
      <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger - %msg%n</pattern>
    </encoder>
  </appender>
  <root level="info">
    <appender-ref ref="FILE"/>
  </root>
</configuration>
Within your actors, you can add logging using either the general Logging facility or by mixing in the
ActorLogging trait. This latter option requires less setup, so we recommend using it. You can also use the
LoggingReceive wrapper (along with a bit of configuration) to have the messages received by your actor show up in
your logs. The configuration needed is:
akka.loglevel = DEBUG
akka.actor.debug.receive = on
And your actor code will look very much like you've already seen, with a few small additions:
import akka.actor.{Actor, ActorLogging}
import akka.event.LoggingReceive

case class Message()

class MyLoggingActor extends Actor with ActorLogging {
  def receive = LoggingReceive {
    case Message() => //...
    case _ => //...
  }
}
If you're familiar with any of the common Java-based logging frameworks, the logging API in Akka will feel very
familiar. The typical logging levels are supported: error, warning, info, and debug. In addition, it supports up to four
placeholder parameters in a format string, using the string {} for each position you want to be replaced. Also, as is
typical of error-level logging, you can pass an instance of Throwable as the first parameter for any calls to the
error method. All of this is available through the log object when the ActorLogging mixin is used. Here are a handful
of examples:
log.info("A very simple log message")

log.warning("Something potentially dangerous may have happened in {}", "current method")

log.error(anException, "Error encountered trying to do something dangerous. Error message: {}",
  anException.getMessage)

log.debug(
  """
  Dumping out a lot of info is typical for a debug statement, but we're just
  going to print the current unix timestamp and process user: {} {}
  """,
  System.currentTimeMillis, System.getenv.get("USER"))
Finally, for more general logging, you can use the Logging and LoggingAdapter objects to get access to the logging
system. You will not generally refer to the LoggingAdapter class directly; instead you'll use Logging's factory method
to create an instance. You need a reference to the current ActorSystem, but otherwise this can be used
anywhere in your code. This is actually what ActorLogging uses under the covers; the mixin just makes it simpler to
use inside of an actor. The only other thing you need is to specify the source that will identify the log statement in the
logs. Normally, this will be either an actor instance, a simple String, or a class instance. If you need something else,
you can define a LogSource[T] implicit value, but we'll leave that as an exercise for the reader:
import akka.event.Logging

val log = Logging(actorSystem, this)
log.info("Now we have a logging object instantiated!")
Wrap-up
Now that you know how to build your project with sbt, adjust the configuration as necessary and add logging where
you need it, you should be ready to move forward. In the next chapters, we'll dive deeper into the core features of
Akka that you'll be using as you build your applications.
Chapter 4: Handling Faults and Actor Hierarchy
Failures happen. It's a simple fact of life. In the context of our systems, this might be anything from network outages
to drive failures or even simple errors in your application logic. The key is that we have to assume these events will
occur. If we don't, we are guilty of simply ignoring reality.
When we think about failures, whether they are caused by exceptions, environmental factors or whatever, the most
likely way we have learned to think about them is in terms of containment. For example, take the typical try/catch
block that seems to be ubiquitous in nearly every popular language right now in whatever form it takes. This is all
about containing these errors and failures and making sure they don't cause the rest of our system to come crashing
down.
But there's a big issue that you have perhaps encountered. When these failures occur, we need to understand what
state our system is in at the point the error happened and what we need to do to make sure it's in a known-good
state. That's not always an easy question to answer, particularly when we consider how exceptions affect the
control flow of an application.
Typical failure handling
It's worthwhile to spend some time reviewing the mechanisms Scala has built in for handling failure cases before
stepping back and looking at what Akka brings to the mix. Some of this will perhaps be familiar, given that these
same techniques exist in many other languages. But Scala brings a couple of nice additions to the mix.
The first of these is exception handling using try/catch/finally. Here's a simple example to make sure the concept is
clear:
import java.io.{FileWriter, IOException}

val writer = new FileWriter("test.out")
try {
  writer.write(bigBlockOfData)
} catch {
  case e: IOException =>
    println("Failed to write data.")
} finally {
  writer.close()
}
In this example, we see a case statement for an exception type that we know might be encountered and a finally
block that will be executed whether the catch block is executed or not.
Another recent addition to the arsenal is Try[T], which allows for executing code that might be expected to result in
an exception being thrown. The classes Success[T] and Failure[T], both of which extend Try, are the concrete
instances which our code will receive depending on the success of the code passed to it. In the case of an
exception, the Failure instance returned will contain the exception that was thrown. Similarly, if no exception was
thrown, the Success instance will contain the final result of the code that was evaluated.
One of the benefits of Try is that it includes the map and flatMap operations, so we can use it in for comprehensions,
chaining together operations and proceeding only on a successful execution. In comparison to deeply nested
try/catch blocks, it should be apparent how much simpler this can make the code. Another reason to consider Try is
that it encodes a possibly failing operation in the type system rather than having to make the choice to explicitly
handle exceptions wherever they might occur or to delegate them up the call stack.
As a basic example of using Try, here we are retrieving the HTML source for Google's home page (the final output is
truncated, of course):
scala> import scala.util.Try; import scala.io.Source
import scala.util.Try
import scala.io.Source

scala> for (g <- Try{Source.fromURL("http://google.com")})
     |   yield g.getLines.foreach(println)
<!doctype html><html...
Failures can also be handled using a simple match on the result:
scala> import scala.util.{Failure, Success}
import scala.util.{Failure, Success}

scala> Try { Source.fromURL("http://imaginary.domain") } match {
     |   case Success(result) => //...
     |   case Failure(error) => println("Failed: " + error)
     | }
Failed: java.net.UnknownHostException: imaginary.domain
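To give a sense of the chaining mentioned above, here's a small sketch that composes two Try-wrapped steps in a
for comprehension. The URL is just an example; the second step only runs if the first one succeeded:
import scala.util.Try
import scala.io.Source

val page: Try[String] = for {
  src  <- Try(Source.fromURL("http://google.com"))
  body <- Try(src.getLines().mkString("\n"))
} yield body

// Fall back to a default value if any step failed
val bodyOrDefault = page.getOrElse("<unavailable>")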
Scala also provides Either[A, B], and it's worth understanding how it differs from Try[T].
The most obvious difference is that Either does not itself encode the notion of success and failure. It simply provides
two cases, Left and Right, and by convention Right is used for the successful result while Left carries the failure.
Try[T], in contrast, bakes these semantics directly into its subtypes, Success[T] and Failure[T].
A second difference is that with Either you must specify the types of both the success and the failure case, whereas
Try only asks you for the type of the success, since a failure is always some kind of Throwable. This shows up in how
you use Either: when composing Eithers you have to explicitly choose a projection via its left and right methods. That
becomes awkward when building and chaining computations, especially asynchronous computations where the
reasons for failure are plentiful and you often want to register callbacks to run on success or failure (as with Future[T]
and Promise[T]). Given that Try is the more general abstraction for this purpose (and reads better), it is our preferred
choice. Readers interested in more details should consult the documentation.
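To make the contrast concrete before moving on, here's a small sketch; parseEither and parseTry are hypothetical
helpers, not part of any library:
import scala.util.Try

def parseEither(s: String): Either[String, Int] =
  try Right(s.toInt)
  catch { case _: NumberFormatException => Left("not a number: " + s) }

// With Either (in the Scala 2.10-era API) you must choose a projection before mapping
val doubledEither: Either[String, Int] = parseEither("21").right.map(_ * 2)

def parseTry(s: String): Try[Int] = Try(s.toInt)

// With Try, the failure channel is always a Throwable, so map composes directly
val doubledTry: Try[Int] = parseTry("21").map(_ * 2)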
Let it crash
The Akka engineering team often uses the catch-phrase let it crash (they actually borrowed it from the Erlang
community, where it's sometimes phrased as let it fail). The idea is that failures should be accepted and handled
appropriately. But how? Understanding the answer to this question is, in many ways, fundamental to truly
understanding how to design with actors — at least, if the intent is to have the system be resilient to failures.
The primary idea to keep in mind is to isolate tasks that manipulate important data from those tasks that don't. This is
easy to do with actors.
For instance, consider an actor that needs to maintain some local representation of a value, perhaps a current
trading price for shares or commodities. This actor needs to periodically request updates to the value from a
remote service, and calling that remote service is risky. If the service provides a RESTful mechanism for retrieving
the current value, the caller might have to deal with any of a number of possible error conditions: connection
timeouts, HTTP failure code responses, improper data encodings, expired access credentials, etc.
The right way to approach this is to separate the remote request-making portion of this task into a new actor.
Depending upon the overall structure of the system, this actor might be instantiated and managed by the data-
caching actor, or it might be handled by some other actor that has more general responsibilities for handling these
kinds of remote requests (for example, it may also handle encoding and decoding of the data formats involved).
Either way, that request-making actor is a child of some other actor. The parent of that child is also its supervisor.
Fail fast
Another important idea is that of failing fast. That is, if a failure occurs, it's usually best to make sure the failure is
recognized and acted on immediately, but also to allow the actors to fail when they encounter problems. Following
the principle of keeping actors small and single purpose (sometimes referred to as the single-responsibility
principle), it makes sense to not include a huge amount of logic around handling failures, but rather to let the actor
fail and then either restart or recreate it and try again. This is where the topic of supervision comes into play.
Supervision
Supervisors are a key concept to understand and master in Akka. A poor understanding of them will almost certainly
lead to unexpected behavior in the actor system. This might result in data you expected to see disappearing,
requests to remote systems that shouldn't have occurred, or any of a number of other oddities.
Any actor in the system has at the very least a parent actor. At the topmost level are the special guardian actors
mentioned earlier in the book (more details on these shortly). In any non-trivial system, it's best to create one or
more top-level actors, which will in turn likely create additional actors as children. These children may even, in turn,
create additional actors as needed. This tree of actors forms the actor hierarchy of the system.
Each actor acts as a supervisor for its children. What this means is that any unhandled errors that occur within the
children are handled by the parent within a special structure called a SupervisorStrategy. The SupervisorStrategy for
a given actor is one of two types: either a OneForOneStrategy or an AllForOneStrategy. The difference between
these two types is what happens to the other child actors of the supervisor. In the case of the
OneForOneStrategy, only the failing actor has the strategy's response applied to it. With an AllForOneStrategy, all of
the failing actor's sibling actors are also affected. OneForOneStrategy is very often the best choice, unless the
collection of actors is closely interdependent in some way that requires action from all of them. For example, imagine
a monitoring system where a node-actor spawns children that each monitor one of a node's resources (CPU, RAM,
hard drive, liveness, and so on). If the liveness child reports that the node has gone down, an AllForOneStrategy
makes sense, since all of the monitoring children for that node should be stopped together. Now extrapolate to a data
center with 500 such node-actors: at that level you would probably use a OneForOneStrategy, so that monitoring for
an individual node can be restarted when that node comes back up without affecting the other 499.
The supervisor strategy's main purpose is to take the error that occurred and translate it into an appropriate course
of action, which is one of the following:
Resume — the actor should simply resume its operations, keeping all internal state
Restart — the actor should be restarted, resetting any internal state
Stop — the actor should simply be stopped
Escalate — the error should be escalated to the parent of the supervisor
No single one of these can be applied across the board to every case, so we need to determine which applies at any
given time. Note that Resume should only be used when it is certain that the code can continue without issue in its
current internal state. Since restarts are a common response, but need to be handled carefully to avoid cascading
failures, these strategies also take two initial parameters: the number of times an actor is allowed to restart and the
window of time in which that count is applied. To be precise, if the number of restarts is set to a maximum of 5 in a 60
second window and the actor has already restarted 5 times in that 60 second window, the actor will simply be
terminated if another failure occurs.
Let's look at an example of defining a simple strategy:
import akka.actor.{Actor, OneForOneStrategy, Props}
import akka.actor.SupervisorStrategy._
import scala.concurrent.duration._

case class ExpectedHiccup(m: String) extends Exception(m)
case class RemoteSystemFault(m: String) extends Exception(m)

class ChildActor extends Actor {
  def receive = {
    case x => throw RemoteSystemFault("fault!")
  }
}

class SimpleSupervisor extends Actor {
  override val supervisorStrategy =
    OneForOneStrategy(maxNrOfRetries = 5, withinTimeRange = 60 seconds) {
      case _: ExpectedHiccup => Resume
      case _: RemoteSystemFault => Restart
    }

  override def preStart(): Unit = {
    context.actorOf(Props[ChildActor], "child") // start the child when the supervisor starts
  }

  def receive = {
    case msg => context.actorFor("child") ! msg // deliver the message to the child
  }
}
What we have here is an actor, SimpleSupervisor, with its own strategy: we override supervisorStrategy with an
implementation that looks out for RemoteSystemFault and ExpectedHiccup. This actor spawns another actor,
ChildActor, and any message delivered to SimpleSupervisor is passed on to the ChildActor, which in turn throws a
RemoteSystemFault. What happens next is that the child actor is restarted.
It's important to point out what's not in this strategy: quite a bit. There are certainly many other possible exceptions
that might occur — it's impossible to say which they might be. In these unhandled cases, those errors are escalated
to this actor's supervisor.
Another important point is that, in the case where no strategy is defined, Akka will use the default strategy. This is
actually defined as one of two system strategies: SupervisorStrategy.defaultStrategy and
SupervisorStrategy.stoppingStrategy.
In the default strategy, the following cases are handled: an ActorInitializationException, which is thrown when
an actor's initialization logic fails, or an ActorKilledException, which is thrown when an actor receives an
akka.actor.Kill message. Both result in the actor being stopped; any other Exception instance will cause the
actor to be restarted, and any other instance of Throwable will be escalated.
The SupervisorStrategy.stoppingStrategy will simply stop any failing child actor. Note that any grandchild
actors (and their descendants) will also be stopped.
Both of these pre-defined strategies are of type OneForOneStrategy, with no limit specified for the maximum
number of restarts and no window defined. Given that, it's good to think about how failures will be handled when
there is no defined supervisor strategy. This can easily result in a system spiraling out of control, given that a generic
exception will cause the actor to be restarted. If this exception occurs every time the actor runs, it will be spinning
in place with potentially disastrous results.
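If you want the stopping behavior explicitly for a particular part of your hierarchy, you can simply assign the
pre-defined strategy yourself. Here's a minimal sketch, reusing the ChildActor from the earlier example:
import akka.actor.{Actor, Props, SupervisorStrategy}

class StoppingParent extends Actor {
  // Any child that fails is simply stopped, never restarted
  override val supervisorStrategy = SupervisorStrategy.stoppingStrategy

  val child = context.actorOf(Props[ChildActor], "child")

  def receive = {
    case msg => child forward msg
  }
}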
The actor-lifecycle
While this applies to more than just failure handling, it's worth briefly discussing the lifecycle of an actor in Akka. As
we've already seen, actors typically begin life with a call to actorOf(Props[SomeActor]). Akka starts the actor when it
is first created, and as with any other object in Scala, initialization code can be placed within the constructor (the
body of the class definition). We can also insert code to be run immediately before the actor is actually started, but
after the object has been created, using the preStart hook. A common pattern with actors is to have the actor send
itself a message when it starts to let it know to initiate some process (for instance, scheduling work using the
scheduler, described in Appendix B). Akka also provides the ability to perform cleanup, as necessary, using the
postStop hook. It's important to note that, at this point, the actor's mailbox will be unavailable.
import akka.actor.Actor

case object Initialize

class SelfInitializingActor extends Actor {
  override def preStart {
    super.preStart // empty implementation in the base type or class
    self ! Initialize
  }

  override def postStop {
    // perform some cleanup or just log the state of things
    super.postStop // empty implementation in the base type or class
  }

  def receive = {
    case Initialize => {
      // perform necessary initialization
    }
  }
}
The most important hooks, though, when it comes to handling failures within an actor, are provided to allow for
handling of additional tasks needed when restarts occur. The preRestart and postRestart methods both get passed
the Throwable instance that caused the restart. In the preRestart case, Akka also passes the message from the
actor's mailbox that caused the exception to occur. Note that the default postRestart implementation calls the
preStart method, so if we override postRestart and our code depends on preStart (for initialization, for instance), we
will need to call preStart directly or call super.postRestart with the same parameters.
You will typically use these restart hooks to handle cleanup chores in failure scenarios. A good example would be
when there is some interdependent resource that needs to know when the actor is available:
import akka.actor.{ Actor, ActorRef }

case class Available(ref: ActorRef)
case class Unavailable(ref: ActorRef)

class CodependentActor(dep: ActorRef) extends Actor {
  override def preStart {
    super.preStart
    dep ! Available(self)
  }

  override def preRestart(reason: Throwable,
                          message: Option[Any]) {
    // The default implementation stops and unwatches all child actors
    super.preRestart(reason, message)
    dep ! Unavailable(self)
  }

  // this overridden implementation is not really
  // needed, but it's here to show you the form
  override def postRestart(reason: Throwable) {
    preStart()
  }
}
The other key mechanism available as part of the whole actor lifecycle system is the so-called DeathWatch, which
provides a means to be notified when actors fail or when a particular actor has been stopped permanently (that is,
restarts don't count). In order to make use of this, an actor registers its interest in being notified by calling
context.watch on a reference to the target actor. When that target actor is shut down, the DeathWatch sends a
Terminated message, which includes the ActorRef for the deceased actor. It's also possible to receive multiple
Terminated messages for a single actor. This mechanism is very useful when you need to have the failure of one
actor trigger some other action, but be sure to use it carefully.
import akka.actor.{Actor, ActorRef, Terminated}

case class Register(ref: ActorRef)

class MorbidActor extends Actor {
  def receive = {
    case Register(ref) => context.watch(ref)
    case Terminated(ref) => // react to the stopped actor, e.g. deregister or replace it
  }
}
Understanding the actor lifecycle is an important factor for designing robust actor systems. The dependencies
between the components of an actor system should be built in such a way that the lifecycle of each individual
component is considered as part of the overall picture.
A bit more about the hierarchy
It's worth talking a bit more about the actor hierarchy and what, in particular, sits above the top-most actors. There
are three special actors known as guardians, which are internal to Akka. The one you will most often see referenced
is the user guardian. The user guardian is responsible for handling any errors that percolate up through the
actor hierarchy and which are not handled by any explicit supervisors lower in the tree. It normally implements the
default strategy described above, but as of Akka 2.1 that can be overridden via the setting
akka.actor.guardian-supervisor-strategy. To specify a different strategy, set this to the fully-qualified name of a
class that implements akka.actor.SupervisorStrategyConfigurator. Since this is rarely needed, we will leave it
as an exercise for the reader.
The other guardians to be aware of are the system and the root guardians. The system guardian is responsible for
certain system-level services that need to be available before any user-level actors are started and that need to still
be running after all user-level actors are shut down. One example is the logging actors that reside
under the system guardian. The startup and shutdown order of the guardians provides this guarantee: the
startup order is root, followed by system, followed by user, and shutdown happens in the reverse order.
The root guardian resides at the very top of the actor hierarchy. Its primary purpose is to handle faults that escalate
from the user or system guardians and a few other special root-level actors, with the default action being to stop
those children. This guardian, being an actor, still needs a supervisor and it has one: a special pseudo-actor that will
stop the child (root guardian) on any failure that reaches this level. Before the root guardian is stopped, the hierarchy
is walked, recursively stopping all children, after which the isTerminated flag of the ActorSystem is set to true. The
Akka team refers to this supervisor as "the one who walks the bubbles of space-time" — this is a reference to the
recursive nature of the supervisor hierarchy and the fact that this special supervisor needs to exist outside of the
"bubble of space-time" of the ActorSystem in order to avoid the issue of infinite recursion.
Guidelines for handling failures
Now that we've seen how Akka addresses exceptions and errors, we can cover some general principles that are
good to follow. These guidelines won't fit every scenario, but they are appropriate to use as a goal and careful
thought should be given when straying from this path.
We've already discussed the let it crash philosophy. But this idea is important enough to reinforce: the system
should be designed to allow for failure by isolating failure-prone operations in distinct actors. This allows for a lot of
flexibility when dealing with and accommodating these failures. As we'll see when we cover routers, this approach
also pairs well with pooled actors. When a single actor in the pool fails, we can still quickly retry the operation,
getting either another actor from the pool or possibly even the same actor after a restart.
One technique for isolating failures is to use what's known as the error kernel pattern. This pattern is made possible
by Akka's supervision hierarchy, since it provides the means to delegate tasks to child actors in a variety of ways, for
example via routers (which you'll learn about in the next chapter) and supervisor strategies.
The key idea with the error kernel pattern is that we are localizing the specific failure handling where it makes sense,
near the failure site, while allowing critical failures that might require more broadly scoped action to escalate as
necessary. A typical example of this would be interaction with some external system, such as a database or
webservice. In normal operation you would expect some failures when interacting with these systems, and
those failures should be isolated using the error kernel pattern or similar localized supervisors.
In this pattern, you typically create a new actor to handle some unit of work that has a potential failure case that
should be guarded against, while allowing other work to continue. An example of this in action follows:
import akka.actor.{OneForOneStrategy, Props, Actor}
import akka.actor.SupervisorStrategy.{Escalate, Restart}
import scala.concurrent.duration._

case class Payload()
case class CompletedWork()
case class UnableToPerformWorkException() extends Exception()

class ErrorKernelExample extends Actor {

  override val supervisorStrategy =
    OneForOneStrategy(maxNrOfRetries = 5, withinTimeRange = 15 seconds) {
      case _: UnableToPerformWorkException => Restart
      case _: Exception => Escalate
    }

  def receive = {
    case work: Payload =>
      context.actorOf(Props[Worker]) forward work
  }
}

class Worker extends Actor {
  def receive = {
    case work: Payload =>
      sender ! doSomeWork(work)
      context.stop(self)
  }

  def doSomeWork(work: Payload) = {
    // process the work and then return a success message
    CompletedWork
  }
}
In this example, the ErrorKernelExample actor uses a custom SupervisorStrategy that will restart a child actor that
throws an UnableToPerformWorkException, up to five times within a 15 second window. The actor itself waits for the
Payload message, which indicates that some work needs to be performed, and creates an instance of the Worker
actor, forwarding the Payload message to it. Because the message is forwarded, the child actor can reply directly to
the original sender once the work has completed successfully. Structuring the work this way allows the
ErrorKernelExample to remain available for further work even when an error occurs while performing the work.
However, we still want to give up eventually when the failure rate is too high, hence the limits given for the strategy.
We should also consider dedicated supervisors in some cases, as an additional means of isolation. A typical
approach is to have a single supervisor whose children all play a particular role in the system, since that helps
developers create logical abstractions and reason about them. This is really just another form of the error kernel
pattern shown above, but with a slightly different approach. Instead of creating anonymous actors to isolate
dangerous work, we create normal, named actors and give them dedicated parents whose sole purpose is to
supervise their children. We'll see an example of this in the context of our earlier example from the
second chapter.
First, we'll create a new actor to supervise our BookmarkStore actors. This is a very simplified example, but it should
provide an idea of how we can approach this:
import akka.actor.{Actor, OneForOneStrategy, Props}
import akka.actor.SupervisorStrategy.Restart
import akka.routing.RoundRobinRouter
import java.util.UUID
import scala.concurrent.duration._

class BookmarkStoreGuardian(database: Database[Bookmark, UUID]) extends Actor {

  override val supervisorStrategy =
    OneForOneStrategy(maxNrOfRetries = 5, withinTimeRange = 30 seconds) {
      case _: Exception => Restart
    }

  val bookmarkStore =
    context.actorOf(Props(classOf[BookmarkStore], database).
      withRouter(RoundRobinRouter(nrOfInstances = 10)))

  def receive = {
    case msg => bookmarkStore forward msg
  }
}
This actor, implementing the error kernel pattern, simply forwards messages on to the BookmarkStore
actors, which it places behind a RoundRobinRouter (which we'll see more of in the next chapter). The overridden
supervisorStrategy is very simple here. As with the example used in Chapter 2, it simply restarts the actors on any
failure until it has exceeded 5 failures within 30 seconds. If those limits are exceeded, the failures will be escalated
(in this case, as we'll see below, up to the top of our ActorSystem, resulting in the system shutting down).
Here's the revised form of our Bookmarker application that initializes all of this.
import org.eclipse.jetty.server.Server
import org.eclipse.jetty.servlet.{ServletHolder, ServletContextHandler}
import java.util.UUID
import akka.actor.{Props, ActorSystem}

object Bookmarker extends App {

  val system = ActorSystem("bookmarker")

  val database = Database.connect[Bookmark, UUID]("bookmarkDatabase")

  val bookmarkStoreGuardian =
    system.actorOf(Props(classOf[BookmarkStoreGuardian], database))

  val server = new Server(8080)
  val root = new ServletContextHandler(ServletContextHandler.SESSIONS)
  root.addServlet(new ServletHolder(new BookmarkServlet(system, bookmarkStoreGuardian)), "/")

  server.setHandler(root)
  server.start
  server.join
}
The only significant change here is the addition of the code that starts our BookmarkStoreGuardian. Take note of
how we pass this guardian actor to our servlet instead of the previous routed pool of BookmarkStore actors. Since
the guardian is simply forwarding messages down to the underlying actors, this works without having to change our
servlet code at all.
Wrap-up
Akka gives us powerful techniques for fault handling, but it also requires designing with failure in mind. But that's
something we should be doing anyway. In the next chapter, we'll begin looking at additional structures for handling
the flow of messages and allocation of work in our system using routers and dispatchers.
Chapter 5: Routers
Up to this point in the book, we've looked at some uses of actors. While you can certainly build complicated chains
of actors cooperating via message passing, you haven't really seen the components that allow for building more
interesting and complex actor topologies. This is where routers (and dispatchers, covered in the next chapter) enter
the picture.
Routers are a mechanism Akka provides to handle the flow of messages to actors. They allow for grouping actors
together and sending messages to different actors based upon a variety of different rules. Dispatchers, on the other
hand, are more concerned with the actual management of the actors' execution. In fact, they are themselves
instances of ExecutionContext, which we briefly looked at earlier in the context of futures.
It's easy to get confused at times about whether a router or a dispatcher is the right choice for a specific problem.
Hopefully, by the end of this chapter you will begin building an intuitive sense of when each of these provides an
appropriate solution.
The basics of routers
The basic purpose of a router is to provide a means to determine where messages are sent between a group of
possible actors. You can think of these as being similar to a load-balancer in front of a typical web application. In
fact, one of the most common use cases for routers in Akka is to balance some load across a set of actors. This can
be used quite effectively for situations that involve limited or costly resources, a pool of database connections or
other connections to external dependencies being common examples.
It's important right from the start to understand that, while routers are internally implemented as actors, they possess
certain properties and behaviors that are not always comparable to normal actors. For instance, they do not actually
use the same mailbox and message handling scheme used by other actors. Akka specifically bypasses these to
make routing efficient and simple. There is a cost, if you're actually implementing a router, but as we don't plan to
cover that here, you can safely ignore it for now.
One of the consequences of routers being actors is that they will be part of the actor hierarchy, so the path used to
address them will be based on the name given to the router, rather than the name of the actors used by the router.
This also has implications for how responses are sent when an actor behind a router is communicating with another
actor. Normally, if you just rely on an actor using sender from within a routed actor, any messages that other actor
sends back to its sender reference would go straight to the current actor, rather than being routed back through the
router. Depending upon the scenario, this might not be ideal. You can override this behavior and have any
responses sent back through the router using a variation on the normal message sending syntax (this actually uses
a method to which ! is effectively aliased behind the scenes). This form tells the actor on the other end to use the
current actor's parent, which is the router, as the sender:
// within an actor that is behind a router
sender.tell(SomeResponseMessage(), context.parent)
Built-in router types
Akka provides a variety of routers to use in your application.
RoundRobinRouter
The RoundRobinRouter is one of the simplest routers you'll encounter, but it is nonetheless very useful. In very
simple terms, it sends each message to the next actor in the ordered sequence of available routees, wrapping back
to the beginning when it reaches the end; you cannot dictate which actor it begins with.
This router works quite well for simple scenarios where you want to spread tasks across a set of actors, but where
there is not likely to be a large variation of time spent performing those tasks and where mailbox backlogs are not a
primary concern. The reason for this is straightforward. If the tasks have a significant variance in time incurred, the
likelihood of ending up with one or more actors backing up rises as the rate of messages being sent increases. You
might end up with a number of messages that were sent early in the sequence of events, but which end up sitting
unattended to for long periods of time while other messages sent much later are handled after only a short delay.
SmallestMailboxRouter
The SmallestMailboxRouter can help with the situation described above, where you want to avoid having messages
sit unhandled simply because they were sent to a busy actor while another actor has nothing to process. It looks at
the available routees and selects whichever one has the smallest (possibly empty) mailbox; the trade-off is that the
mailbox sizes it collects may already be stale by the time it uses them to route a message.
It's not a cure-all, though — even if incoming messages are always sent to the smallest mailbox, there is no way to
predict whether that mailbox is currently occupied by messages that will actually take longer to process than a much
larger mailbox full of messages awaiting a different actor. Also, this router is unable to view the mailbox size of
remote actors (remoting is covered later in the book). Given this limitation, remote actors are
given the lowest priority in the selection algorithm.
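Programmatically, creating one looks much like the RoundRobinRouter we used earlier. Here's a minimal sketch, in
which Worker stands in for whatever routee actor you are using:
import akka.actor.Props
import akka.routing.SmallestMailboxRouter

// Route each message to whichever Worker currently has the fewest queued messages
val workers = system.actorOf(
  Props[Worker].withRouter(SmallestMailboxRouter(nrOfInstances = 5)),
  "workers")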
BroadcastRouter
BroadcastRouter is a handy utility for cases where you need to send a message to all of its associated actors. One
use might be a monitoring system where you send a health check message to this router to detect whether the
nodes in your environment are alive or busy. When the router receives the health check message, it broadcasts the
request to its actors, which in turn send a TCP ping to the nodes they are responsible for, and the response times
are collated for storage.
RandomRouter
This router simply sends messages randomly to its routees. There's not much more to it than that. A use case for this
router might be to serve requests which returns a value given a particular key. This router would create a pool of
routees where each router would query the datastore e.g. in-memory cache, where the datastore holds some key-
value pair and returns the value if found or some message if not found. The important thing is that it doesn't matter
which actor serves the request but it is important that it does.
ScatterGatherFirstCompletedRouter
The ScatterGatherFirstCompletedRouter is a very special router that behaves quite differently from the other routers
described here. Like the BroadcastRouter, it will send any received message to all its routees, but it's intended to be
used with futures (therefore, it does not burden the router upfront) and the ask pattern, so that a response will be
returned, but the response returned will be the first of the routees to return a response.
This router essentially wraps together a common usage of futures, which is to handle precisely this scatter-gather
pattern. This is useful when the use case requires more sensitivity to time than others since this router chooses the
response that returns first.
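To make that concrete, here is a sketch of typical usage, combining the router with ask; it assumes an existing ActorSystem called system and a hypothetical LookupActor that replies to Lookup messages:

import akka.actor.Props
import akka.pattern.ask
import akka.routing.ScatterGatherFirstCompletedRouter
import akka.util.Timeout
import scala.concurrent.Future
import scala.concurrent.duration._

// Hypothetical query message; LookupActor is assumed to reply with an answer
case class Lookup(key: String)

implicit val timeout = Timeout(5.seconds)

val gatherer = system.actorOf(
  Props[LookupActor].withRouter(
    ScatterGatherFirstCompletedRouter(nrOfInstances = 5, within = 5.seconds)),
  name = "gatherer")

// Every routee receives the Lookup; the future completes with the first reply
val firstAnswer: Future[Any] = gatherer ? Lookup("some-key")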
ConsistentHashingRouter
A chapter alone could be dedicated to describing consistent hashing in depth, but in essence it's a means of
mapping hash keys such that the addition of new slots to the hash table results in a minimal remapping of keys. This
can be very useful, for instance, when you want to determine what set of servers to send a given request to. If the
mappings were to change each time a new key was added, there would be a very high cost to any such additions.
But with consistent hashing, this is minimized and thus there's a largely predictable mapping of a given request to a
target server or resource. This exact technique is used in a number of popular distributed data storage services.
The actual use of this router is complicated enough to be beyond the scope of this book, but it's a useful facility to be
aware of when designing distributed systems. It can be of particular use for cases where you are caching or
memoizing data within stateful actors and you need to reduce the need to refresh that cached data.
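That said, a small sketch may help make the idea concrete. Assuming a hypothetical CacheActor and a GetCachedValue message, a hashMapping function tells the router which part of the message to hash, so that requests for the same key consistently reach the same routee:

import akka.actor.Props
import akka.routing.ConsistentHashingRouter
import akka.routing.ConsistentHashingRouter.ConsistentHashMapping

// Hypothetical message; the key decides which routee handles it
case class GetCachedValue(key: String)

val hashMapping: ConsistentHashMapping = {
  case GetCachedValue(key) => key
}

val cacheRouter = context.actorOf(
  Props[CacheActor].withRouter(
    ConsistentHashingRouter(nrOfInstances = 10, hashMapping = hashMapping)),
  name = "cacheRouter")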
Others
The routers described above are just the existing routers provided as part of Akka's library, but it's not difficult to
create custom routers if need be. Reading the source code for the built-in routers will give you perhaps the best
indication of what's involved and, of course, the Akka docs include good material on this subject.
Using routers
There are two mechanisms to specify how a router instance should be created, one is purely programmatic and the
other uses the Akka configuration. Both have their uses and it's important to understand the reasons to choose one
over the other, which we'll cover as we look at each approach.
Configuration-based router creation
Creating routers via configuration is very simple and makes it easy to adjust the runtime routing strategy without
needing to make code changes and push out a new build:
0001 akka.actor.deployment {
0002 /configBasedRouter {
0003 router = round-robin
0004 nr-of-instances = 3
0005 }
0006 }
Assuming you have an actor called PooledActor, you would then use this by adding the following code. Note that
the Props object is still passed the actor class you intend to use as your routee, but you then modify the Props by
way of the withRouter method call:
0001 import akka.actor._
0002 import akka.routing.FromConfig
0003 val router = context.actorOf(Props[PooledActor].withRouter(FromConfig()),
0004 name = "configBasedRouter")
0005 router ! SomeMessage()
All routers available within Akka allow for resizable pools of routees. You can set this via code, but since we're
focusing on configuration-based router definition, let's look at some of the options available:
0001 akka.actor.deployment {
0002 /resizableRouter {
0003 router = smallest-mailbox
0004 resizer = {
0005 lower-bound = 2
0006 upper-bound = 20
0007 messages-per-resize = 20
0008 }
0009 }
0010 }
The configuration defined here specifies a starting size of 2 routees, in addition to assuring that we never drop below
2 routees in the pool. Further, we'll never have more than 20 routees and the router will only try to resize, if
necessary, after every 20 messages. This option is useful for assuring that the router is not spending an excessive
amount of time trying to resize the pool.
There are a handful of other configuration parameters available for the resizer and they can get a bit confusing on
your first encounter with them, so here's a brief summary of each and how they are used (a combined configuration
sketch follows the list):
pressure-threshold is used to define how the resizer determines how many actors are currently busy. If this
value is set to 0, then it uses the number of actors currently processing a message. If the value is set to the
default value of 1, it uses the number of actors which are processing messages and have one or more
messages in their mailbox. If the number is greater than 1, it will only include actors which have at least that
number of messages in their mailbox.
rampup-rate is used to determine how much to increase the routee pool size by when all current routees are
considered to be busy (based on the pressure-threshold). This value is a ratio and defaults to 0.2, which
means that when the resizer determines that it needs to create more routees, it will attempt to increase the pool
size by 20%.
backoff-threshold is used for reducing the size of the pool. Another ratio (this value defaults to 0.3) is
interpreted to mean that there must be less than 30% of the current routees busy before the pool is shrunk.
backoff-rate is essentially the inverse of rampup-rate, since it determines how much to decrease the pool size
when that is called for. The default rate of 0.1 means that it will be decreased by 10% when needed.
stop-delay is used to provide a small delay before a PoisonPill message is sent to the routees that are being
removed from the pool, in order to shut them down. The delay, defaulting to 1s, is provided to allow some time
for messages to be placed into the routees' mailboxes before sending them the message to terminate.
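Putting these together, a deployment block that touches all of the resizer knobs might look like the following sketch (the values are illustrative, not recommendations):

akka.actor.deployment {
  /tunedRouter {
    router = round-robin
    resizer {
      lower-bound = 2
      upper-bound = 16
      pressure-threshold = 1
      rampup-rate = 0.2
      backoff-threshold = 0.3
      backoff-rate = 0.1
      messages-per-resize = 10
      stop-delay = 1s
    }
  }
}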
You can easily spend a huge amount of time just trying to get the perfect configuration, but we'd advise against
spending excessive effort on this when you're first building your system. It takes careful testing to evaluate these
changes adequately, so working with the defaults is often a good place to start.
Programmatic router creation
Creating a router in code is also quite simple. If you choose to use this approach, you can always redefine the
configuration using the methods specified previously, assuming you've given the router a name you can use to
reference it in the configuration:
0001 val randomRouter = context.actorOf(Props[MyActor].withRouter(
0002 RandomRouter(nrOfInstances = 100)),
0003 name = "randomlyRouted")
As an alternative to configuration-driven routee sizing (either using a fixed nr-of-instances or by defining a resizer),
you can pass a collection of existing actors in to your router. This can be very useful when you need to perform more
complex setup of each actor instance than the Props factory-based approach allows. A very simple example of this
would be setting a specific name for each actor. Note that using this approach obviates the use of nr-of-instances or
any sort of resizing:
0001 val namedActors = Vector[ActorRef](
0002 context.actorOf(Props[MyActor], name = "i-am-number-one"),
0003 context.actorOf(Props[MyActor], name = "i-am-number-two")
0004 )
0005 val router = context.actorOf(Props.empty.withRouter(
0006 SmallestMailboxRouter(routees = namedActors)),
0007 "smallestMailboxRouter")
Routers and supervision
Routers, like other parts of your actor hierarchy, generally should be considered in light of supervision and failure
handling. By default, all routers will escalate any errors thrown by their routees, which can lead to some unexpected
behavior. For example, if your actor that creates the router has a policy of restarting all of its children on an exception
when one of your routees encounters an error, all of the children of the parent of the router will be restarted. This is
not a good thing. Thankfully, you can override the strategy used by the router easily enough as the following
example demonstrates:
0001 val router = context.actorOf(Props[MyActor].withRouter(
0002 RoundRobinRouter(nrOfInstances = 20,
0003 supervisorStrategy = OneForOneStrategy() {
0004 case _: DomainException => SupervisorStrategy.Restart
0005 }
0006 )
0007 ))
Continuing our example application
In the second chapter, we showed you a very simple example of using a router to both load balance across requests
to a database and to provide fault handling. We then changed the fault handling mechanism in the last chapter to
use a special intermediate actor to provide a supervisor for our routed actors. Let's expand this a bit further with what
we've learned here to make it more adaptive to changing load-handling needs.
First, we make a fairly simple change in Bookmarker, removing a bit more code we placed there earlier:
0001 import org.eclipse.jetty.server.Server
0002 import org.eclipse.jetty.servlet.{ServletHolder, ServletContextHandler}
0003 import java.util.UUID
0004 import akka.actor.{Props, ActorSystem}
0005 import akka.routing.{FromConfig}
0006 object Bookmarker extends App {
0007 val system = ActorSystem("bookmarker")
0008 val database = Database.connect[Bookmark, UUID]("bookmarkDatabase")
0009 val bookmarkStore =
0010 system.actorOf(Props(new BookmarkStore(database)).withRouter(FromConfig()),
0011 name = "bookmarkStore")
0012
0013 val bookmarkStoreGuardian =
0014 system.actorOf(Props(new BookmarkStoreGuardian(bookmarkStore)))
0015 val server = new Server(8080)
0016 val root = new ServletContextHandler(ServletContextHandler.SESSIONS)
0017 root.addServlet(new ServletHolder(new BookmarkServlet(system,
0018 bookmarkStoreGuardian)), "/")
0019
0020 server.setHandler(root)
0021 server.start
0022 server.join
0023 }
The primary change to note is that we now create the BookmarkStore router using the FromConfig call to tell Akka to
load the router settings from the application.conf file. Here's the minimal configuration used here:
0001 akka.actor.deployment {
0002 /bookmarkStore {
0003 router = round-robin
0004 nr-of-instances = 10
0005 resizer {
0006 lower-bound = 5
0007 upper-bound = 50
0008 pressure-threshold = 0
0009 rampup-rate = 0.1
0010 backoff-threshold = 0.4
0011 }
0012 }
0013 }
We're making some assumptions here that we should explain. Of course, this is a semi-imaginary example, given
that we're using a mock database interface. Even with a real database or other external data store, determining the
settings to use here would take a bit of analysis. In this case, we're assuming that, at minimum, we want a collection
of 5 BookmarkStore actors to interact with the database, ramping up to 50 in times of peak load. Further, we're setting
the pressure-threshold to 0 based upon the understanding that calls to an external system are expensive, so having
an actor currently processing a message is enough to consider it busy. However, we might not want to ramp up too
quickly, so we've set the rampup-rate to only increase the pool size by 10% at a time. We also want to shrink it back
down quickly so that idling resources are returned to the system, tending towards a small pool, so we've set the
backoff-threshold to drop the pool size whenever fewer than 40% of the routees are busy.
Wrap-up
This whirlwind tour of routers has hopefully given you an idea of the flexibility Akka gives you for creating robust
configurations that can handle very different types of workflow, depending upon your needs. There is a huge range
of choices available to you, so you might feel overwhelmed, but it's generally best to start with the minimum you
need to get things working. Then, through watching the performance and profiling under real workloads, you can get
a better sense of where to apply these different tools and understand how they might impact your overall system.
Chapter 6: Dispatchers
Now, we turn to the matter of dispatchers. Dispatchers and routers are distinct in the sense that the former governs
the mechanics of messaging while the latter provides the mechanics of applying a strategy to select an actor to route
a message to. It's perhaps best to think of dispatchers as embodying two fundamental concepts: management of
the actor mailbox and the threading strategy used to allow actors to do actual work. We'll discuss both of these in a
bit more depth as they are the focus of the primary differences between the various dispatcher choices.
Let's look at the mailbox choices first, as they are the simpler part of this picture. There are primarily two important
concerns that make up the selections available. That is, you have the choice of whether to assign a distinct mailbox
for each actor or to share one for all. Further, you can choose whether the mailboxes should be of unlimited capacity
or if they should be bounded to some fixed limit. The default dispatcher gives each actor its own unbounded
mailbox. You also have the option of implementing a priority-based mailbox, which requires mapping messages to a
numeric priority.
The task management piece is closely related to (and uses, behind the scenes)
Java's java.util.concurrent.ExecutorService facility. The purpose is to abstract out how threads are managed and
how actors are given time and resources in the form of these threads in which to perform their tasks.
The default dispatcher uses the relatively new Fork/Join framework developed by Doug Lea, one of the key
masterminds behind Java's java.util.concurrent package. This framework is included with the Java 7 release, but
Scala includes a copy of it, which means you can still use it with the Java 6 JVM. The Fork/Join framework is well
suited for actor-based workflows because of the work-stealing approach it uses to keep all threads in the pool busy
when possible. This approach tends to work well for systems that spawn lots of small tasks that need to be executed
concurrently.
Configuring and using dispatchers
Let's look at how you might typically configure and use the default dispatcher before we dive into other dispatcher
choices. The configuration of the default has a number of options (all custom dispatchers do, as well), but we'll focus
on a few key points:
0001 akka.actor {
0002
0003 default-dispatcher {
0004 # We'll look at other possible values for this setting later
0005 type = Dispatcher
0006 # This is the default already, but you also have the option of
0007 # using "thread-pool-executor", which corresponds to a
0008 # java.util.concurrent.ThreadPoolExecutor instance
0009 executor = "fork-join-executor"
0010
0011 # Throughput defines the "fairness" of the executor. This works by
0012 # setting the number of messages that the dispatcher will allow to
0013 # be processed before the thread is returned to the pool
0014 throughput = 10
0015
0016 # Since we defined this type of executor, we can use this section
0017 # to customize its behavior
0018 fork-join-executor {
0019
0020 # The minimum number of threads that should be allocated
0021 parallelism-min = 4
0022
0023 # The maximum number of threads that should be allocated
0024 parallelism-max = 32
0025
0026 # The executor will use this along with the number of available
0027 # CPU cores to select how many threads to allocate within the
0028 # bounds given by the previous settings. The formula used is
0029 # ceil(cores * factor). For example, if you have 8 cores and
0030 # the following 4.0 setting, the result will be 40, but since
0031 # we set an upper bound of 32, only 32 threads will be
0032 # allocated
0033 parallelism-factor = 4.0
0034
0035 }
0036 }
0037 }
Let's talk for a moment about how these settings should be used. This is an area that can take a lot of effort to get
right, so understanding some basic guidelines will be helpful. The first couple of configuration settings are fairly
straightforward, though the executor setting can take a few different values. The fork-join-executor value is shown
above; you can also use thread-pool-executor, or a fully qualified class name (FQCN) for a custom implementation of
the akka.dispatch.ExecutorServiceConfigurator abstract class. The two built-in values correspond to the
ForkJoinExecutorConfigurator and ThreadPoolExecutorConfigurator implementations derived from that class,
giving you a choice between the Java 7-style fork/join model and the thread pool model that was already available
in Java 6. The parallelism-min and parallelism-max values are simple bounds that are used in combination with the
parallelism-factor setting to determine how many threads the executor is allowed to allocate. You will need to
evaluate the nature of the work being performed by the actors (and futures, as we'll see in chapter 8) on this
executor to determine how best to adjust these settings. For instance, if you are performing a lot of CPU-bound work,
it doesn't make a lot of sense to allow the executor to use more threads than you have available CPU cores. On the
other hand, if your actors are spending a lot of time waiting, whether for IO operations, other actors within your
system, or whatever, then allocating sufficient threads to allow them enough processor time to check whether they
have messages ready for them is important.
The throughput setting is very useful: it allows you to tell the dispatcher how many messages to drain from a given
actor's mailbox before it returns the thread being used to the pool for another actor to use. A setting of 1 here allows
for maximum fairness, but also means that there may be a lot of context switching going on as the executor
schedules your actors, and you may see higher latencies, particularly on older machines. The difficulty is
determining the sweet spot for this value, and the answer is that it depends; you want to achieve both fairness in the
work distribution and a responsive application, and there's a good chance you will need to experiment to find that
sweet spot.
It's perhaps not obvious, but now that we've configured the above settings in akka.actor.default-dispatcher, we don't
actually have to modify any code to use it. Any actor for which we don't configure a specific dispatcher will use these
settings — that's why it's called the default-dispatcher, of course.
To configure a custom dispatcher, you simply need to add a section defining the settings for that dispatcher and give
it a suitable name. Here's a minimal configuration for a dispatcher that uses Java's
java.util.concurrent.ThreadPoolExecutor (Akka provides a convenient shortcut with the thread-pool-executor name
for this dispatcher and a corresponding implementation of an ExecutorServiceConfigurator for it).
0001 my-thread-pool-dispatcher {
0002
0003 type = Dispatcher
0004
0005 executor = "thread-pool-executor"
0006
0007 thread-pool-executor {
0008
0009 core-pool-size-min = 4
0010 core-pool-size-max = 32
0011 core-pool-size-factor = 2.0
0012
0013 max-pool-size-min = 8
0014 max-pool-size-max = 84
0015 max-pool-size-factor = 3.0
0016
0017 keep-alive-time = 30ms
0018
0019 task-queue-type = "linked"
0020 }
0021 }
These settings are essentially mirrors of the settings defined for the underlying ThreadPoolExecutor interface (see
the Javadocs for a lot more detail). In case you're not familiar with this interface, the first thing we need to cover are
the core-pool-size and the max-pool-size settings. The core pool is essentially the pool of threads that this executor
tries to keep available for work, while the maximum pool settings define an outer bound for the number of threads
available. If a new task is submitted to the pool and there are fewer threads than the core pool size, an additional
thread will be created even if there are other idle threads in the core pool as long as the maximum pool size is not
exceeded. The min, max and factor settings for each pool provide a way to dynamically size these pools based on
the number of CPU cores available. That is, the number of cores will be multiplied by the factor and constrained
within the bounds of the minimum and maximum sizes specified with the result being used to size the pools
appropriately. As an example, suppose you have set the pool minimum size to 4, maximum size to 32 and a factor of
3.0. With 4 CPU cores this will result in 12 threads being placed in the pool. But with 16 CPU cores, you would get
32 threads due to the maximum size being less than 16 * 3.0. Finally, for pool size related settings, the keep-alive-
time shown sets the time that the pool will wait before shutting down additional, non-core threads in the pool.
The task-queue-size and task-queue-type settings are used to define the type of queue used to hold incoming tasks
that are waiting for an available thread from the pool. Akka provides two implementations for use here: linked which
uses a LinkedBlockingQueue and array which uses an ArrayBlockingQueue. The LinkedBlockingQueue is, by
default, an unbounded queue that can grow as needed and, as such, it allows tasks to wait until a core pool thread is
available before allocating a thread for a task, so no more than core-pool-size-max threads will ever be used in this
case. You can also specify an optional task-queue-size setting for LinkedBlockingQueue to limit the number of tasks
that can be submitted when the queue is waiting on threads to be available. Tasks that are submitted when the
queue is at capacity will be rejected. The ArrayBlockingQueue is a fixed-size, bounded queue, so it also requires
the task-queue-size setting.
You would use the dispatcher just defined using the following code:
0001 import akka.actor.Props
0002
0003 val anActor =
0004 context.actorOf(Props[MyActor].withDispatcher("my-thread-pool-dispatcher"))
Note that this configuration can be defined anywhere in your configuration. In the case of my-thread-pool-dispatcher,
we're assuming that it's a top-level configuration entry. If you instead defined it at myapp.dispatchers.my-thread-pool-
dispatcher, the code would need to reference that full path to access it:
0001 context.actorOf(Props[MyActor]
0002 .withDispatcher("myapp.dispatchers.my-thread-pool-dispatcher"))
You don't need to know what all the values shown above mean unless you find yourself in need of an alternative to
the Fork/Join-based dispatcher. If that's the case, we highly recommend reading Java Concurrency in Practice to get
a solid understanding of the different executors offered by the Java concurrency libraries. In general, you will usually
want to stick with the default, Fork/Join-based dispatcher, as it provides very good performance characteristics for
most cleanly designed, non-blocking actor usages.
You should also be aware that these dispatchers can be used as an ExecutionContext since they implement that
interface. This means you can use them for executing futures by creating an implicit value that refers to them.
0001 implicit val myThreadPoolDispatcher =
0002 system.dispatchers.lookup("myapp.dispatchers.my-thread-pool-dispatcher")
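For example, with that implicit value in scope, a future created in the same scope will run its body on the custom dispatcher's threads (expensiveComputation is a hypothetical placeholder):

import scala.concurrent.Future

// Picks up the implicit ExecutionContext defined above, so the body
// executes on my-thread-pool-dispatcher rather than the default dispatcher
val result = Future {
  expensiveComputation()
}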
Provided dispatchers
Akka provides four standard dispatchers for your use: Dispatcher, PinnedDispatcher, BalancingDispatcher, and
CallingThreadDispatcher. We've already implicitly covered the default Dispatcher, but to complete the picture, we
should describe its other characteristics. As we've already seen, you can use any of fork-join-executor, thread-pool-
executor, or an FQCN pointing to an implementation of akka.dispatch.ExecutorServiceConfigurator to determine
how tasks submitted to this dispatcher will be executed. The dispatcher allocates a separate mailbox for each actor
assigned to it.
The PinnedDispatcher is a special dispatcher that allocates a single, dedicated thread for each actor assigned to it.
Said another way, this means that every actor you assign to this dispatcher is guaranteed to have a full thread
available for its use. If you're asking yourself why you would want to do this, think about cases where you have a
resource or set of resources that need to be given priority over most other parts of the system. The intent of this
dispatcher is to help assure that these actors will always have some CPU time available, although it's impossible to
truly guarantee this, of course. Also keep in mind that you don't want to overuse this pattern. In fact, use it very
sparingly. You can configure a dispatcher to be of type PinnedDispatcher with the following configuration:
0001 my-pinned-dispatcher {
0002 executor = "thread-pool-executor"
0003 type = PinnedDispatcher
0004 }
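Assigning an actor to it then looks just like using any other custom dispatcher. Here is a sketch with a hypothetical JournalWriter actor that we want to keep responsive:

import akka.actor.Props

// The actor gets its own dedicated thread, per the PinnedDispatcher semantics
val journalWriter = context.actorOf(
  Props[JournalWriter].withDispatcher("my-pinned-dispatcher"),
  name = "journalWriter")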
Another dispatcher you might hear about is BalancingDispatcher. This is what you might be tempted to call a work-
stealing dispatcher, though that term is not technically accurate. With this dispatcher, all actors using it share a
single mailbox and the dispatcher gives messages from that mailbox to idle actors as they become available. So it's
not really work-stealing in the typical sense, but it does attempt to distribute the load fairly across a set of actors.
One caveat to be aware of with this dispatcher is that it's intended to be used with actors that are all of the same
type. There's nothing that will prevent you from attempting to assign it to different actor types, but you will want to
make sure you understand what you're doing if you ever try to do so -- at the least, each actor type should assume it
will be receiving the same set of possible messages.
Using this dispatcher in your configuration is similar to the PinnedDispatcher. You simply assign it as the type, but
you probably want to use the fork-join-executor here:
0001 my-balancing-dispatcher {
0002 # this is not strictly necessary, as it is the default
0003 executor = "fork-join-executor"
0004 type = BalancingDispatcher
0005 }
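Using it then means creating several actors of the same type on that dispatcher, so they all pull work from the shared mailbox. A sketch, assuming an existing ActorSystem called system and a hypothetical ImageResizer actor:

import akka.actor.Props

// All four actors share the balancing dispatcher's single mailbox,
// so whichever actor is idle picks up the next pending message
val resizers = (1 to 4).map { i =>
  system.actorOf(
    Props[ImageResizer].withDispatcher("my-balancing-dispatcher"),
    name = s"resizer-$i")
}

// A message sent to any one of them will be handled by whichever routee frees up first
resizers.head ! "resize-request"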
Mailboxes
The subject of mailboxes has been discussed already, but we haven't really devoted much time to them. As
described earlier, mailboxes can be either bounded or unbounded. In the case of bounded mailboxes, you should
also be aware that they are typically blocking. That is, the act of adding a message to a bounded will block for some
period of time if the mailbox is currently full. The timeout for this is set in the configuration, as is the size and type of
the mailbox. Also, it is certainly possible to implement a non-blocking, bounded mailbox. In Akka 2.2, the default
mailbox type is the type akka.dispatch.UnboundMailbox i.e. unbounded mailbox.
Here are a couple quick configuration examples showing you basic unbounded and bounded configurations.
A typical unbounded mailbox
0001 my-unbounded-mailbox-dispatcher {
0002 mailbox-capacity = -1
0003 }
Here we are just asking for the most plain vanilla unbounded, nonblocking mailbox.
A mailbox with a 100 message limit
0001 my-bounded-mailbox-dispatcher {
0002 mailbox-capacity = 100
0003 mailbox-push-timeout-time = 0
0004 }
And here we create a simple bounded, blocking mailbox. It will accept up to 100 messages and then block when
new messages are enqueued. The mailbox-capacity key is self-explanatory; mailbox-push-timeout-time is the
maximum time you are willing to wait when pushing a message into a full mailbox, so naturally you will want this
value to be close to zero.
We can also create PriorityMailbox instances, though this requires a bit of custom code. The idea with these mailbox
types is that each incoming message is given a weighted priority that is used to determine in what order the
messages in the mailbox are processed. Assuming we wanted to create a mailbox that would handle AddBookmark
messages and prioritize them based on the contents of their URL, the following approach would be one possible
solution:
0001 package akkaguide
0002
0003 import akka.dispatch.{PriorityGenerator, UnboundedPriorityMailbox}
0004 import akka.actor.ActorSystem
0005 import com.typesafe.config.Config
0006
0007 object BookmarkPriorityQueueMailbox {
0008 import akkaguide.BookmarkStore.AddBookmark
0009
0010 val priorityGenerator = PriorityGenerator {
0011 case AddBookmark(title, url) if url.contains("typesafe.com") => 0
0012 case AddBookmark(title, url) if url.contains("oracle.com") => 2
0013 case _ => 1
0014 }
0015 }
0016
0017 class BookmarkPriorityQueueMailbox(settings: ActorSystem.Settings, config: Config)
0018 extends UnboundedPriorityMailbox(BookmarkPriorityQueueMailbox.priorityGenerator)
As you can see, this relies on creating a PriorityGenerator instance that ranks the messages as integer values. The
lower the value, the higher the priority the message is given. In this example, URLs containing the value "typesafe.com"
are given the highest priority, while those containing "oracle.com" are given the lowest. We assume you can see
where our biases lie. To use this mailbox, you simply specify the fully qualified class name of the new mailbox class
in the mailbox-type setting of the dispatcher configuration:
0001 boundedBookmarkDispatcher {
0002 type = Dispatcher
0003 executor = "fork-join-executor"
0004 mailbox-type = "akkaguide.BookmarkPriorityQueueMailbox"
0005 }
Now that we've defined this, we can modify the code that starts the BookmarkStoreGuardian we created
previously so that it uses this dispatcher to handle all messages that pass through it:
0001 val bookmarkStoreGuardian =
0002 system.actorOf(Props(new BookmarkStoreGuardian(bookmarkStore))
0003 .withDispatcher("boundedBookmarkDispatcher"))
There is another way to achieve this (new in Akka 2.2), which requires two pieces of configuration: (a) the
deployment and (b) the mailbox. The main idea is to decouple the mailbox configuration from that of the dispatcher;
the linkage between them is the name of the mailbox. Akka 2.2 lets us do this by pointing the mailbox value at a
configuration block whose mailbox-type names the implementation, as shown below:
0001 #(option 1)
0002 # Value of 'mailbox' here is name of another configuration block which
0003 # houses the configuration of the particular mailbox.
0004 akka.actor.deployment {
0005 /boundedMailboxactor {
0006 mailbox = boundedBookmarkMailbox
0007 ... other deployment configuration can be here
0008 }
0009 }
0010 #(option 2)
0011 boundedBookmarkMailbox {
0012 mailbox-type = "akkaguide.BookmarkPriorityQueueMailbox"
0013 mailbox-capacity = 1000
0014 mailbox-push-timeout-time = 1
0015 }
From the perspective of our example, this gives us two more ways to create an actor with the same semantics, and
they are equivalent, as shown below. The first form passes the name "boundedMailboxactor" to actorOf, which not
only names the actor but also, through this associative linking, discovers the mailbox's configuration and creates the
appropriate mailbox. The second form points at the mailbox configuration directly with withMailbox.
0001 // option 1
0002 val bookmarkStoreGuardian =
0003 system.actorOf(Props(new BookmarkStoreGuardian(bookmarkStore)), "boundedMailboxactor")
0004
0005 // option 2
0006 val bookmarkStoreGuardian =
0007 system.actorOf(Props(new BookmarkStoreGuardian(bookmarkStore)).withMailbox("boundedBookmarkMailbox"))
You should be aware that Akka provides six mailbox types, namely UnboundedMailbox,
SingleConsumerOnlyUnboundedMailbox, BoundedMailbox, UnboundedPriorityMailbox, BoundedPriorityMailbox
and the durable mailboxes. The first five of these are in-memory, with no message persistence; only the durable
mailboxes provide a way for you to persist messages, through their file-based implementation. Readers interested
in introducing durability for messages are invited to read the documentation for the details.
Other mailboxes
You can also create your own custom mailbox types. While this might seem quite appealing at first, we would
advise against it unless you have a very strong reason and are certain you understand the semantics of the
enqueueing and dequeueing used by whatever container you choose for your messages. Creating a custom
mailbox itself is easy, but getting it right is another matter, and the mailboxes supplied with Akka are most often all
you need. Implementing your own mailbox is not a trivial enterprise, but if you had to ask, we would invite you to
start with the question: "what properties should my mailbox have?" We suspect Akka has already answered a big
part of that question, since you've now learnt that Akka lets you configure a mailbox's capacity through
mailbox-capacity, throughput via a combination of dispatchers and routers (think of the scatter-gather pattern),
enqueueing delay via mailbox-push-timeout-time, and message durability through durable mailboxes. The sweet
spot is discovering a pattern combining these techniques that suits your use case.
Finally, you will also likely run across a reference to durable mailboxes, of which there were historically a number of
implementations supplied with Akka. For reasons of maintainability, these were all removed from the distribution
with the exception of one that is backed by a journaled transaction log on the filesystem. It's easy enough to use this
mailbox with the following configuration:
0001 journaled-file-dispatcher {
0002 mailbox-type = akka.actor.mailbox.filebased.FileBasedMailboxType
0003 }
0004
Using this mailbox type is just like using any other mailbox (though it provides a wealth of configuration parameters),
with the exception that the messages will still be retained if the virtual machine running Akka dies for whatever
reason. This can be very handy for protecting yourself against failure, particularly when isolating different resources
(local or remote) within your system, such as a remote webservice or an incoming data feed. There are additional
third-party durable mailboxes available, including one based on AMQP, which you can find online.
Wrap-up
This whirlwind tour of dispatchers has hopefully given you an idea of the flexibility Akka gives you for creating robust
configurations that can handle very different types of workflow, depending upon your needs. There are a huge range
of choices available to you, so you might feel overwhelmed, but it's generally best to start with the defaults to get
things working. Then, through watching the performance and profiling under real workloads, you can get a better
sense of where to apply these different tools and understand how they might impact your overall system.
Chapter 7: Remoting
The power of Akka is hopefully apparent to you by now, but really, we've only just begun to scratch the surface.
We've already highlighted the fact that Akka brings great flexibility and resiliency for problems involving
concurrency, but so far we've only focused on running Akka at the single machine level. Bringing additional
machines into the picture is where the capabilities of Akka really show their strongest side.
Akka's support for remoting is where the strength of the message passing approach is perhaps best highlighted.
This, and the fundamental focus on making everything in Akka support remote actors as easily as local actors,
makes a dramatic difference in your ability to take a single system application and transform it into a distributed
system without extreme contortions. That alone is far from typical in usual enterprise frameworks that most of us
have been subjected to over the years.
This ability to transparently interact with actors that may be local or remote, without having to change the underlying
code, is referred to as location transparency, as mentioned in the first chapter. Building your applications to take
advantage of this can be driven almost entirely through configuration. In this chapter we'll look at how to get started
with Akka's remoting module and how to get to the point where location transparency is just an assumed feature of
your system.
Setting up remoting
So far we've used only the most basic of the libraries that Akka provides. In order to get started with Akka remoting,
you'll need to add the following to your build definition:
libraryDependencies += "com.typesafe.akka" %% "akka-remote" % "2.2.3"
If you're running in the sbt console, don't forget to run reload so that it reloads the updated build configuration. The
next time you try to compile anything, sbt will see that there is a new dependency and attempt to retrieve it.
The other key addition is to your Akka configuration. These additions will enable remote access from other Akka
instances. It's important to note that the akka.remote.netty.tcp.hostname setting needs to reference a hostname that
any remote Akka instances will be able to resolve. In many of the Akka tutorials you'll see online, this field is set to
localhost or 127.0.0.1, which causes needless frustration when some hapless developer is trying to learn how to use
Akka remoting and can't figure out why two instances can't talk to each other. Make sure that if you use a hostname
here (and there's no reason not to; we're using IP addresses for convenience) it is globally addressable by all
Akka instances; typically this is done through an update to the UNIX hosts file (/etc/hosts) and/or your DNS
configuration.
The port number specified can also vary, as needed. In particular, if you want to try running multiple Akka instances
on a single machine, they will need to use unique port numbers to avoid errors. Here we use 2552, but if we were to
run another instance of an Akka remoting-based application on the same machine, we would need to use a different
port.
0001 akka {
0002 actor {
0003 provider = "akka.remote.RemoteActorRefProvider"
0004 }
0005 remote {
0006 enabled-transports = ["akka.remote.netty.tcp"]
0007 netty.tcp {
0008 hostname = "192.168.5.5"
0009 port = 2552
0010 }
0011 }
0012 }
Looking up and creating actors remotely
Let's review the creation and lookup of actors in a local context before considering the remote context. Creation of
actors, as we've learnt, is achieved by invoking actorOf on either an ActorSystem or an Actor's context, and we can
look up existing actors using actorFor in the same way. From Akka 2.2 onwards, actorFor is deprecated in favour of
actorSelection as a more unified way of conducting lookups; readers can consult the documentation for the details.
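For completeness, here is what the actorSelection form of a lookup looks like, using the same remote path we use with actorFor below; the selection can be used to send messages directly:

// actorSelection works for both local and remote paths and supports wildcards
val selection = context.actorSelection(
  "akka.tcp://remoteActorSystemName@192.168.5.5:2552/user/remoteActor")
selection ! SomeMessage("hello!")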
Now that we have the configuration in place, actually using remote actors is incredibly simple. Sending a message
to a remote actor requires simply knowing the actor's address. This code looks up an actor called remoteActor that
lives on the host with IP 192.168.5.5 using port 2552, with an actor system named remoteActorSystem:
0001 val remoteActor = context
0002 .actorFor("akka.tcp://remoteActorSystemName@192.168.5.5:2552/user/remoteActor")
0003 remoteActor ! SomeMessage("hello!")
Similarly, you can create an actor running on a remote node with just a bit more work:
0001 import akka.actor.{ Props, Deploy, AddressFromURIString }
0002 import akka.remote.RemoteScope
0003
0004 val remoteAddress =
0005 AddressFromURIString("akka.tcp://remoteActorSystem@192.168.5.5:2552/user/remoteActor")
0006
0007 val newRemoteActor = context.actorOf(Props[MyRemoteActor]
0008 .withDeploy(Deploy(scope = RemoteScope(remoteAddress))))
There is an additional requirement here, though, which is that the class MyRemoteActor being instantiated here
needs to be available to the runtime in both the local and remote JVM instances. One observation about these two
approaches is that the address string is embedded in the program code, which presents a problem as your
application accumulates more remote actor invocations across the overall system; however, it is a quick solution if
you are building out your actor application over a weekend.
Akka provides us another way to achieve this and reduces this form of coupling by mapping the address in the actor
deployment configuration. This would be the equivalent configuration for the previous example:
0001 akka.actor.deployment {
0002 /remoteActor {
0003 remote = "akka.tcp://remoteActorSystem@192.168.5.5:2552"
0004 }
0005 }
With this configuration in place, the previous code is simplified:
0001 val newRemoteActor = context.actorOf(Props[MyRemoteActor],
0002 name = "remoteActor")
You should notice that this looks just like the code we've used previously to start up actors. In other words, we're
creating a remote actor without the code having to take the location into account. Be aware that when we apply this
particular technique of creating the remote actor (with help from the configuration), we do not actually instantiate the
actor locally; rather, the request is directed to the daemon actor listening at the address given by
akka.actor.deployment.<actor name>.remote. This is truly location transparency.
Serialization
In Akka, it is easy to forget about data that's being transported from one actor to another actor or even ActorRefs
themselves i.e. we typically use Scala case objects or primitive types like Strings, Integers etc and even have
domain objects that might possibly capture some state of your application at the time prior to being transported
across the network. In remoting actors, this data problem is most glaring when your data is not serializable. Akka
caters for this by providing the default serializers or if you are not happy with the performance they deliver, you can
build your own too. The default configuration is a mapping of serializers (i.e. akka.actor.serializers) to their
respective message objects (i.e. akka.actor.serialization-bindings) and shown below:
0001 akka {
0002 actor {
0003 serializers {
0004 akka-containers = "akka.remote.serialization.MessageContainerSerializer"
0005 proto = "akka.remote.serialization.ProtobufSerializer"
0006 daemon-create = "akka.remote.serialization.DaemonMsgCreateSerializer"
0007 }
0008 serialization-bindings {
0009 # Since com.google.protobuf.Message does not extend Serializable but
0010 # GeneratedMessage does, need to use the more specific one here in order
0011 # to avoid ambiguity
0012 "akka.actor.ActorSelectionMessage" = akka-containers
0013 "com.google.protobuf.GeneratedMessage" = proto
0014 "akka.remote.DaemonMsgCreate" = daemon-create
0015 }
0016 }
0017 }
If your data is simple enough that you don't need to write your own serializer/deserializer, you can apply the typical
technique of marking your objects as serializable, just as you would in Java. If you do need a custom serializer,
Akka provides a framework for building one. We won't discuss serialization much further here, and we invite readers
to check the documentation for the details.
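As a rough sketch of what a custom serializer looks like, here is a deliberately naive one for the Bookmark type from our example application (we assume Bookmark(title, url) lives in the akkaguide package; a real implementation would use a robust wire format such as protobuf), together with the configuration that binds it:

import akka.serialization.Serializer

class BookmarkSerializer extends Serializer {
  // Unique, application-chosen id; values 0-16 are reserved by Akka
  def identifier: Int = 54321
  def includeManifest: Boolean = false

  def toBinary(obj: AnyRef): Array[Byte] = obj match {
    case b: Bookmark => s"${b.title}|${b.url}".getBytes("UTF-8")
    case other       => throw new IllegalArgumentException(s"Cannot serialize $other")
  }

  def fromBinary(bytes: Array[Byte], manifest: Option[Class[_]]): AnyRef = {
    val Array(title, url) = new String(bytes, "UTF-8").split('|')
    Bookmark(title, url)
  }
}

And the configuration that wires it up:

akka.actor {
  serializers {
    bookmark = "akkaguide.BookmarkSerializer"
  }
  serialization-bindings {
    "akkaguide.Bookmark" = bookmark
  }
}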
Securing communication of data
We've discussed data serialization briefly, which is primarily about making sure your data stays consistent as it
travels from source to destination; here we discuss making that communication more secure. Akka lets us
accomplish this in three ways, by providing mechanisms to secure the transport layer, the kinds of messages you
can send, and who can actually receive those messages:
Untrusted mode
Secure Cookie Handshake
SSL
Akka typically runs in trusted mode, i.e. any actor can connect to any actor (local, remote, or clustered) simply by
knowing its address. That's a problem because, by default, a remote party can send shutdown messages such as
PoisonPill, monitor actors remotely or locally (via the Terminated message), and perform remote deployment. You
can avoid this by turning on the following property in your configuration:
akka.remote.untrusted-mode = on
You should combine that setting with marking the sensitive messages your application sends with the trait
PossiblyHarmful (the super trait of Kill, PoisonPill, Terminated and ReceiveTimeout), which effectively muffles those
messages when received over an untrusted connection.
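As a tiny sketch, marking one of your own administrative messages looks like this (FlushCache is a hypothetical message):

import akka.actor.PossiblyHarmful

// In untrusted mode, a remote system will refuse to act on this message
case object FlushCache extends PossiblyHarmful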
A second option is to restrict who can actually send messages to the backend: by examining a cookie during the
handshaking process, a server can reject a client if this cookie is not present. You can generate this cookie in two
ways, either by invoking a script (found in $AKKA_HOME/scripts/generate_config_with_secure_cookie.sh) or
programmatically (via akka.util.Crypt.generateSecureCookie). The following configuration needs to be present on
both the client and server systems, or else the handshake will fail:
0001 akka.remote{
0002 secure-cookie="090A030E0F0A05010900000A0C0E0C0B03050D05"
0003 require-cookie=on
0004 }
The final option is to enable SSL in the transport layer. This is done through a transport (Akka defines transports as
mechanisms that initialize the underlying transmission capabilities, including establishing links with remote entities);
the default is the NettyTransport. We know this because, as you may recall, we set
akka.remote.enabled-transports = ["akka.remote.netty.tcp"] in our configuration at the start of this chapter. The
NettyTransport is not the only one you can use, since Akka allows you to build your own transport class, but we
won't discuss that here (the details are in the documentation). Using the NettyTransport, we can configure the
key-store, trust-store, protocol and so on; the following is the default configuration, which you can alter using your
project's security credentials:
0001 netty.ssl = {
0002 # Enable SSL/TLS encryption.
0003 # This must be enabled on both the client and server to work.
0004 enable-ssl = true
0005 security {
0006 key-store = "keystore"
0007 key-store-password = "changeme"
0008 key-password = "changeme"
0009 trust-store = "truststore"
0010 trust-store-password = "changeme"
0011 protocol = "TLSv1"
0012 enabled-algorithms = ["TLS_RSA_WITH_AES_128_CBC_SHA"]
0013 random-number-generator=""
0014 }
0015 }
Remoting with routers
Routers are also able to take advantage of remoting successfully, as you might imagine. This is done via some
additional configuration directives that we didn't cover in the Routing chapter. A simple example showing how
remoting and routers can be combined would be a broadcaster that allows for broadcasting messages to a set of
remote actors.
On the target remote nodes, you might define the receiving actors as follows:
0001 import akka.actor._
0002
0003 class ReceiverActor extends Actor with ActorLogging {
0004 def receive = {
0005 case msg => log.info("received: {}", msg)
0006 }
0007 }
Then, to send a message to each of these nodes concurrently, you would start up your router, to which you can then
send a message, which will appear on each of the remotes:
0001 val broadcaster = context.actorOf(Props[ReceiverActor]
0002 .withRouter(FromConfig()), name = "broadcaster")
0003
0004 broadcaster ! "hello, remote nodes!"
The configuration that tells this router to use a BroadcastRouter with remote nodes is very simple: it deploys three
instances of the routee actor across the nodes at 10.10.10.01, 10.10.10.02 and 10.10.10.03:
0001 akka.actor.deployment {
0002 /broadcaster {
0003 router = broadcast
0004 nr-of-instances = 3
0005 target {
0006 nodes = [
0007 "akka.tcp://broadcastExampleSystem@10.10.10.01:2552",
0008 "akka.tcp://broadcastExampleSystem@10.10.10.02:2552",
0009 "akka.tcp://broadcastExampleSystem@10.10.10.03:2552"
0010 ]
0011 }
0012 }
0013 }
Scatter-gather across a set of remote workers
Returning to the earlier example application, let's add some handy functionality to our system that grabs the pages
we bookmark. We're going to use a very simplified approach that's not really ideal, but that will at least get the idea
across, and then show you how you can use remoting effectively here to help ensure you actually retrieve the page
successfully. First, here's the new Crawler actor that we'll be using to grab the pages:
0001 import akka.actor.Actor
0002 import io.Source
0003
0004 object Crawler {
0005 case class RetrievePage(bookmark: Bookmark)
0006 case class Page(bookmark: Bookmark, content: String)
0007 }
0008
0009 class Crawler extends Actor {
0010 import Crawler.{RetrievePage, Page}
0011 def receive = {
0012 case RetrievePage(bookmark) =>
0013 val content = Source.fromURL(bookmark.url).getLines().mkString("\n")
0014 sender ! Page(bookmark, content)
0015 }
0016 }
There's not much to this. It just receives a request to retrieve a page using the URL in a given bookmark and then
sends the response back to the original requestor. There are a few different ways we could hook this into our
existing application, but keeping with the theme of simplicity, we'll just hook it into the code that stores the bookmark.
0001 class BookmarkStore(database: Database[Bookmark, UUID]) extends Actor {
0002
0003 import BookmarkStore.{GetBookmark, AddBookmark}
0004 import Crawler.{RetrievePage, Page}
0005
0006 val crawler = context.actorOf(Props[Crawler])
0007
0008 def receive = {
0009
0010 case AddBookmark(title, url) =>
0011 val bookmark = Bookmark(title, url)
0012 database.find(bookmark) match {
0013 case Some(found) => sender ! None
0014 case None =>
0015 database.create(UUID.randomUUID, bookmark)
0016 sender ! Some(bookmark)
0017 crawler ! RetrievePage(bookmark)
0018 }
0019
0020 case GetBookmark(uuid) =>
0021 sender ! database.read(uuid)
0022
0023 case Page(bookmark, pageContent) =>
0024 database.find(bookmark).map {
0025 found =>
0026 database.update(found._1, bookmark.copy(content = Some(pageContent)))
0027 }
0028 }
0029 }
The main items to note here are where we send the RetrievePage message to the crawler and then, later, when we
get the Page response back, we look the bookmark back up from the database and add the freshly retrieved
contents to it. This is all well and good, but imagine that you had a system that was not very reliable at pulling these
page contents, and you wanted a higher likelihood of actually getting the page contents successfully. Our approach
would be to use a ScatterGatherFirstCompletedRouter combined with a handful of remote nodes to farm out the
request handling. A few little changes will show how this might work:
0001 import akka.actor.{Props, Actor}
0002 import java.util.UUID
0003 import akka.routing.ScatterGatherFirstCompletedRouter
0004 import scala.concurrent.duration._
0005 import akka.pattern.{ask, pipe}
0006 import akka.util.Timeout
0007
0008 class BookmarkStore(database: Database[Bookmark, UUID], crawlerNodes: Seq[String]) extends Actor {
0009
0010 import BookmarkStore.{GetBookmark, AddBookmark}
0011 import Crawler.{RetrievePage, Page}
0012
0013 val crawlerRouter =
0014 context.actorOf(Props.empty.withRouter(ScatterGatherFirstCompletedRouter(routees = crawlerNodes,
0015 within = 30 seconds)))
0016
0017 def receive = {
0018
0019 case AddBookmark(title, url) =>
0020 val bookmark = Bookmark(title, url)
0021 database.find(bookmark) match {
0022 case Some(found) => sender ! None
0023 case None =>
0024 database.create(UUID.randomUUID, bookmark)
0025 sender ! Some(bookmark)
0026 import context.dispatcher
0027 implicit val timeout = Timeout(30 seconds)
0028 (crawlerRouter ? RetrievePage(bookmark)) pipeTo self
0029 }
0030
0031 case GetBookmark(uuid) =>
0032 sender ! database.read(uuid)
0033
0034 case Page(bookmark, pageContent) =>
0035 database.find(bookmark).map {
0036 found =>
0037 database.update(found._1, bookmark.copy(content = Some(pageContent)))
0038 }
0039 }
0040 }
There are a few new things here that we'll cover in more detail in the next chapter on futures, but the idea is that the
scatter-gather approach sends the request to each of the actors we've defined, which is determined by the
crawlerNodes parameter. We're assuming this will be passed in as a Seq of remote actor paths, for example:
"akka.tcp://bookmarker@192.168.5.5:2553/user/crawler". The application daemon is Bookmarker (as before). As an
example of how our sample application would work, we would start the ActorSystem named bookmarker, start two
remote actors using the configuration approach, and have our main actor look up these two remote actors whenever
a valid bookmark is created (recall that a valid bookmark contains a URL, which our crawler uses to fetch the given
page and store it in our in-memory cache, making it available on the next request). The following snippets from
application.conf and Bookmarker.scala capture this:
0001 akka.actor.deployment {
0002 /crawler-0 {
0003 remote = "akka.tcp://bookmarker@127.0.0.1:2552"
0004 }
0005 /crawler-1 {
0006 remote = "akka.tcp://bookmarker@127.0.0.1:2552"
0007 }
0008 }
0009 object Bookmarker extends App {
0010 ...
0011 val crawlers = Seq(system.actorOf(Props[Crawler], "crawler-0"), system.actorOf(Props[Crawler], "crawler-1"))
0012
0013 val bookmarkStoreGuardian =
0014 system.actorOf(Props(new BookmarkStoreGuardian(database,
0015 collection.immutable.Seq(
0016 "akka.tcp://bookmarker@127.0.0.1:2552/user/crawler-0",
0017 "akka.tcp://bookmarker@127.0.0.1:2552/user/crawler-1"))).withDispatcher("boundedBookmarkDispatcher"))
0018 }//end of Bookmarker
The router will send the request to each node, waiting up to 30 seconds for a response. The first result it gets back is
then sent to the BookmarkStore actor itself to process, just as it did earlier.
Gotchas and troubleshooting
The two most difficult things about working with remote actors are understanding when to use them and
troubleshooting things when they aren't working. Understanding the characteristics of the problem you're trying to
solve is a good place to start thinking about what sort of actor system you might want to design. You should ask
yourself whether the system you're building either requires more resources than a single system can provide, if it
necessarily requires being spread across multiple systems, or if there is some other design constraint that makes
interaction between multiple actor systems a requirement. Perhaps the important thing to keep in mind is that you
still need to focus on developing clear protocols for communicating between your actors. If you've designed a
reasonably robust actor hierarchy, with appropriate supervisors and topologies for distributing work, nothing should
need to change.
The point we want you to come away with is that working with remote actors should not generally be any different
from working with local actors. You should be assuming your actors may fail, behave unexpectedly (did you really
cover all the corner cases of that data parsing actor you wrote?), or whatever, no matter whether it is designed to run
locally or remotely. Sure there will be differences, particularly if you are doing things that are implicitly designed
around system-locality -- for example, accessing local files could fall into this bucket. But those are differences you
are hopefully taking into account regardless of whether you're using actors or not. But with remoting and the ease of
taking a local actor and transforming it into a remote one, Akka gives you one more reason to start designing with
actors.
When things do go wrong while you're using remote actors, keep in mind that they are a network-based resource. So
troubleshooting primarily falls into the same set of strategies you would use for troubleshooting any other network
based resource. If your traffic appears to simply not be appearing on your remote system, start with the network layer
and work up from there. Check any firewall rules; make sure any host names you use are actually resolving correctly
and that TCP ports are not being used by other resources or services. And, of course, don't forget to check that
things are actually plugged in!
There are also configuration options that Akka provides to help you debug the messages being sent between your
remote systems. These go hand-in-hand with the same steps mentioned in chapter three. To be specific, you need
to make sure your actors include logging functionality, by adding the akka.actor.ActorLogging trait to your actors and
by using the akka.event.LoggingReceive wrapper around the receive block in your actor. The configuration
directives are:
0001 akka {
0002 remote {
0003 log-received-messages = on
0004 log-sent-messages = on
0005 log-remote-lifecycle-events=on
0006 }
0007 }
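As a quick reminder of what that looks like on the actor side, here is a minimal, hypothetical actor combining the two (the wrapper only produces output once the debug logging settings described in chapter three are switched on):

import akka.actor.{Actor, ActorLogging}
import akka.event.LoggingReceive

// a made-up actor showing the ActorLogging trait plus the LoggingReceive wrapper
class ChattyActor extends Actor with ActorLogging {
  def receive = LoggingReceive {
    case msg => log.debug("handling {}", msg)
  }
}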
You can also get a ton of debugging information by listening to Akka's EventBus, which is used internally by Akka to
distribute internal messages. You can listen for all of the remoting lifecycle events by subscribing to the
RemotingLifecycleEvent type, or, if you only care about a subset, to one of its more specific subtypes such as
AssociatedEvent, DisassociatedEvent, or AssociationErrorEvent. Each of these is part of a hierarchy of events you
can listen for, but the top-level type is often what you want when you're just debugging. To find information about
the additional event types, see the documentation. Listening to this event bus is simply a matter of registering an
actor and subscribing to the particular event class type you want to listen for (this is called a classifier in the
documentation), as this example shows:
import akka.actor.{Actor, Props}
import akka.remote.RemotingLifecycleEvent

val eventStreamListener = system.actorOf(Props(new Actor {
  def receive = {
    case msg => println(msg)
  }
}))
system.eventStream.subscribe(eventStreamListener, classOf[RemotingLifecycleEvent])
Wrap-up
Truthfully, we've only scratched the surface, both with remoting and Akka in general. But now you've seen how to
put these tools into use and you should be able to take them from these simple examples to start building real
applications. It's important to remember to use the resources you have available to you, so be sure to continue
experimenting, reading and exploring with Akka. The best resource at your disposal is real experience.
Chapter 8: Diving Deeper into Futures
In the second chapter we walked you through some code that used futures, but we didn't spend much time on them.
We'd like to develop the idea a bit further here and show you how futures can be used effectively to interact with actors
and to sequence operations (or, computations, speaking more formally).
Clarifying our definition
Earlier we defined futures very generally, but let's be a bit more specific. A future represents the value of some
expression or computation that may not have yet completed or which is not yet known. This mechanism was first
proposed as a model for obtaining the result of parallel evaluation of expressions in a programming language.
As you saw earlier, futures are useful when you need to get a response back from an actor. This is precisely the
scenario described in the definition of futures: we want to send a message off to an actor and get a response, but we
don't know when or whether that response will come back. Our future gives us a handle by which to get that
response.
While the most obvious use of futures in the context of Akka is for receiving responses from actors when outside of
an actor context, you can also simply create futures directly. This is useful if you have a block of code you'd like to be
run asynchronously from the rest of your code. Yes, you could create an actor to do this, but that requires a lot more
work when you have access to futures essentially for free. Here, in the simplest form, is how you can execute a block
of code as a future:
0001 import scala.concurrent.future
0002 import scala.concurrent.ExecutionContext.Implicits.global
0003
0004 future { 1 + 2 }
Of course, this example is not really complete — we're not yet doing anything with the result.
Here's a more complete example, again using the global ExecutionContext (more on this topic momentarily) and this
time we compute the sum of three random numbers (between 0 and 100). We also added a type annotation to
demonstrate the type of the future being created:
0001 import scala.util.Random._
0002 import scala.concurrent._
0003 import scala.concurrent.ExecutionContext.Implicits.global
0004
0005 val sums: Future[Int] = Future.reduce(Seq(future(nextInt(100)), future(nextInt(100)), future(nextInt(100))))(_ + _)
0006 sums.map(x => println(s"Sum is ${x}"))
Execution Context
As we hinted at earlier, an ExecutionContext is an abstraction for something that provides a way to execute
computations. For example, it might represent a variable-sized thread pool that scales appropriately depending on
the number of CPU cores available.
There are a variety of ways to provide or get hold of an ExecutionContext. If you're familiar with the
java.util.concurrent API, you know that ExecutorService is very similar to what we've described. Conveniently, you
can use one via the ExecutionContext.fromExecutorService method, passing in the ExecutorService you want to
use. Keep in mind that if you use an ExecutorService instance, you will still need to shut it down
yourself — this can be done in the same place you shut down your ActorSystem (for example, as we showed in
the servlet example earlier, you might do this by overriding the servlet's destroy method).
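A minimal sketch of that approach, assuming a simple fixed-size pool, might look like this:

import java.util.concurrent.Executors
import scala.concurrent.ExecutionContext

val pool = Executors.newFixedThreadPool(4)
implicit val ec = ExecutionContext.fromExecutorService(pool)

// ... create and run futures on the pool here ...

pool.shutdown() // shutting the underlying pool down remains your responsibility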
As we showed you earlier, there's also a default, global ExecutionContext available by importing
scala.concurrent.ExecutionContext.Implicits.global. And if you're running within an environment using Akka, as you
likely will be at some point, you can use the system's or actor context's local dispatcher (more about this later in the
book) by using one of the following forms:
0001 class MyApp {
0002 val system = ActorSystem("MySystem")
0003 import system.dispatcher
0004 }
This form should only be used when you are not within an actual actor, but where you do have access to an
ActorSystem instance. The next form would be used within an actor:
0001 class MyActor extends Actor {
0002 import context.dispatcher
0003 }
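For instance, a future created inside an actor will then pick up that dispatcher implicitly; a small sketch with a made-up actor:

import scala.concurrent.future
import akka.actor.Actor

class UrlFetcher extends Actor {
  import context.dispatcher // this actor's dispatcher becomes the ExecutionContext
  def receive = {
    case url: String =>
      // runs on the actor's dispatcher rather than the global pool
      future { /* fetch the url here */ }
  }
}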
Choosing the appropriate ExecutionContext matters because it determines where your work actually runs. In
particular, an ExecutionContext lets you select either a fork-join pool or a thread-pool implementation by supplying
the appropriate Executor or ExecutorService (Java 7 ships with both, but Java 6 has only the thread pool). Without
getting too deep into specifics, a thread pool differs from a fork-join pool in its lack of work-stealing: a fork-join pool
is better suited to coarse-grained (potentially long-running) tasks, since work can be redistributed amongst the idle
workers, whereas in a thread-pool implementation that same work is handled by the assigned thread regardless of
how long it takes. Considering that creating your own Executor or ExecutorService is a non-trivial exercise, treat it
as a last resort and explore other techniques for distributing the work first.
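If you do want to see the two flavours side by side, Akka lets you describe them in configuration and look either one up as an ExecutionContext. The sketch below uses made-up dispatcher names and deliberately small pool sizes:

import akka.actor.ActorSystem
import com.typesafe.config.ConfigFactory

// two hypothetical dispatcher definitions, one fork-join based, one thread-pool based
val config = ConfigFactory.parseString("""
  forkjoin-dispatcher {
    type = Dispatcher
    executor = "fork-join-executor"
    fork-join-executor {
      parallelism-min = 2
      parallelism-max = 8
    }
  }
  threadpool-dispatcher {
    type = Dispatcher
    executor = "thread-pool-executor"
    thread-pool-executor {
      core-pool-size-min = 2
      core-pool-size-max = 8
    }
  }
""")
val system = ActorSystem("dispatchers", config.withFallback(ConfigFactory.load()))
// either one can then serve as the ExecutionContext for your futures
implicit val ec = system.dispatchers.lookup("forkjoin-dispatcher")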
For comprehensions
The futures defined within Scala's standard library define a set of very useful methods that you may have already
seen on a number of other classes in Scala. These methods are map, flatMap, filter, and foreach.
These methods collectively allow the Scala compiler to provide a bit of syntactic-sugar in the form of for-
comprehensions. Syntactic-sugar, if you're not familiar with the term, is a mechanism of the compiler that transforms
one form of syntax into another, equivalent form. Generally, this is used to make something that's somewhat
cumbersome into a nicer form.
First, let's just look at a very simple example of using map.
import scala.concurrent._
import scala.concurrent.ExecutionContext.Implicits.global

def asyncAction: Future[String] = {
  // ...
}
val delayedResponse = asyncAction
// only called if the future returns a successful result
val transformedResponse: Future[Int] = delayedResponse.map {
  response => response.length
}
We have included a few unnecessary type annotations here to help clarify what's happening. Of particular note is
the response type of map. It will always return another future. But the type parameter of that future may be different
from the initial future if you perform some kind of transformation within the map call.
Another important thing to realize is that the map block will not actually be called until the initial future itself actually
completes successfully. An example is as follows where you'll notice a 5 second latency when evaluating this
expression:
0001 val sums = Future.reduce(Seq(future(nextInt(100)),
0002 future(nextInt(100)),
0003 future{Thread.sleep(5000);5}, // sleeps for 5 seconds, returns 5
0004 future(nextInt(100)) ))(_ + _)
0005 sums.map(x => println(x))
If the initial future or the future returned by the call to map throws an exception, the final future returned from map will
be a Failure, containing whatever exception caused it to fail.
flatMap is very similar to map. What differentiates the two is that flatMap actually expects the code called within its
block to return a Future, directly, as opposed to returning a value which is then wrapped in a new Future. Otherwise,
the semantics are essentially the same, including the handling of exceptions, of course.
filter takes the successful result of a future and, based on the predicate you pass to it, either returns the original value in
another future or returns a Failure containing a NoSuchElementException. If an exception occurs either in the
original future or while evaluating the predicate, a Failure is also returned with the exception that caused the failure.
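Reusing the imports from the random-number example above, a quick sketch of filter looks like this:

// keep the random number only if it turns out to be even
val evenOnly: Future[Int] = future(nextInt(100)).filter(_ % 2 == 0)
// when the predicate does not hold, evenOnly completes with a NoSuchElementException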
Let's look at how we can take advantage of the awesome sugared syntax we've been offered. Let's say you have
requests to make to two different actors and you want to calculate some final value based on the results of calling
both those actors. You can do this using a for comprehension, essentially composing the final result from the
individual results:
// assumes an implicit akka.util.Timeout and an ExecutionContext are in scope
val itemId = 998
val buyersCurrency = "GBP"
val currentPrice: Future[Double] =
  (pricingActor ? GetPrice(itemId)).mapTo[Double]
val conversionRate: Future[Double] =
  (conversionRateActor ? GetConversion(buyersCurrency)).mapTo[Double]
val convertedPrice: Future[Double] =
  for { price <- currentPrice
        rate <- conversionRate } yield {
    price * rate
  }
You'll note that, again, we added some type annotations to make it clear what types are received by the calls. The
final result of the for comprehension is still a future, which is important to recognize.
Also, notice that we're making the requests to the actors before the for comprehension. If the calls were made
directly inside, we'd be effectively sequencing the actual calls. That is, one call would be made to pricingActor and
then it would sit waiting for the result. Only then would the second call to conversionRateActor be made.
To understand that, it helps to see how this gets transformed by the Scala compiler. We won't explain all the rules
here (the curious reader can consult the Scala language specification), but essentially the for
comprehension is turned into the following code:
val convertedPrice: Future[Double] = currentPrice.flatMap {
  case price => {
    conversionRate.map {
      case rate => {
        price * rate
      }
    }
  }
}
Looking at the example in this form hopefully makes it more clear why you want to make your calls to get the futures
prior to entering the for comprehension. To be clear, here's the equivalent code without the futures defined outside
the for comprehension after it has been de-sugared by the compiler.
val convertedPrice: Future[Double] =
  (pricingActor ? GetPrice(itemId)).mapTo[Double].flatMap {
    case price => {
      (conversionRateActor ? GetConversion(buyersCurrency)).mapTo[Double].map {
        case rate => {
          price * rate
        }
      }
    }
  }
If you look closely at this form, you'll see how the GetConversion message is not sent to the conversionRateActor
until after the price has already been returned by the pricingActor.
Sequencing futures
Sometimes you really need to call a series of futures as a sequence of operations. There are a couple of
approaches to this. One is to simply invoke the next future in the sequence in the callback of the current future your
code is waiting on. We won't even bother showing an example of this — it's cumbersome and quickly gets ugly.
Another approach is to use the for comprehensions we just demonstrated. This works well, but in some cases it
results in code that's not exactly fluid. Thankfully, the Scala 2.10 API has added a few new features to futures that
make this a lot nicer.
We were just talking about sequencing futures, and an extra mechanism is provided in the form of andThen. This
method is useful when you want to perform a side effect on the completion of another future. It differs from
onSuccess (or onFailure) in that the execution order of callbacks registered via onSuccess/onFailure is not defined,
and those callbacks do not produce another future when they complete, whereas chained andThen calls run in
order and each returns a future. The method andThen takes a partial function whose incoming type is Try[_], so you
can restrict the match to Success or Failure or, if you don't care, ignore it entirely. The result of the original future is
still returned as the final value. And, finally, you can chain together as many calls to andThen as you like.
import scala.util.Success

val f = future { 1 + 2 }
f andThen {
  case Success(i) => println("The result is: " + i)
} andThen {
  case _ => doSomethingElse() // don't care what we've got
}
There are cases where you want to make a series of calls and get the result from the first to return a result. In this
situation, you can use Future.firstCompletedOf, which will handle as many futures as you want to give it and give
you a future containing the result (whether a success or failure). This case can be very helpful in situations where
you're calling some service that might time out frequently.
0001 val tryOne, tryTwo, tryThree = future { makeSomeRequest() }
0002 val first = Future.firstCompletedOf(Seq(tryOne, tryTwo, tryThree))
Finally, you can transform a sequence of futures into a future of a sequence. Let's say you have a sequence of
URLs that you want to retrieve and then process once they're all done, but not before the entire sequence has
completed. You could do something like this:
import scala.io.Source

val urls = Seq("http://google.com", "http://twitter.com", "http://typesafe.com")
val pages = urls.map { url => future { Source.fromURL(url) } }
Future.sequence(pages).onComplete {
  case result => // process the Seq of sources here
}
Error handling and exceptions
The last, but most definitely not least, thing we need to cover is error handling with futures. We've already shown you
how you can use onFailure as part of this, but that's just one of the tools available, and we now know that onFailure,
like its cousin onSuccess, is non-deterministic in the sense that the execution order of callbacks is not program
order. We need to cover two additional observations about what we already know, and then show a couple of
helpers that are available to you.
The first is how to remedy the lack of execution order in callbacks registered with methods like onComplete,
onSuccess, and onFailure. We have already shown you one remedy in the last section, using andThen to execute
handlers for your futures deterministically, and we can generalize that further with combinators such as map, flatMap,
and recover, which let us not only handle the successes and failures of our futures but also generate new ones if we
wish. As an example of how this might work, assume we have a sequence of futures generating random numbers
(as before), but one of them throws an exception; we still want to be able to compute the sum of all the other, valid
numbers in the sequence. The following illustrates this approach:
val ss = Seq(future[Int](throw new Exception("haha!")), future(nextInt(100)), future(nextInt(100)))
// neutralize any failed future by recovering it to 0, then sum what remains
val cleaned = ss.map(f => f.recover { case _: Exception => 0 })
val sum = Future.fold(cleaned)(0)(_ + _)
sum.foreach(total => println(s"Sum of the valid numbers is $total"))
The second thing to discuss is how onFailure can be used in cases where it might not be so obvious. Although
we've already shown how callbacks can be used with futures even when using other mechanisms for handling the
results, such as for comprehensions, we haven't really commented on an important idea here. Notice how for
comprehensions and many of the other structures the library provides still give you a final future to deal with. This
outermost future can still have callbacks assigned to it:
0001 val firstFuture = someCallReturningAFuture()
0002 val secondFuture = anotherCallReturningAFuture()
0003 val thirdFuture = aFinalCallReturningAFuture()
0004 val sequencedResult = for { first <- firstFuture
0005 second <- secondFuture
0006 third <- thirdFuture } yield {
0007 first + second + third // we'll assume the results can all be added together
0008 }
0009
0010 sequencedResult.onFailure {
0011 case e: DomainException => {
0012 // do some cleanup here
0013 }
0014 }
This is not revolutionary, but again, it's important to recognize that after the for comprehension returns, we're still
dealing with a future and any errors happening anywhere within the sequence of calls to each of those methods will
bubble up to that final future. You can handle each of those individually and not use the for comprehension, but if
you choose to go that route, it makes sense to handle the errors at the top.
Another common scenario you might run into occurs when you want to attempt to get some value as a future and, in
the case of a failure, fall back to another future. This is provided by the appropriately named fallbackTo method,
which simply takes a future as its argument:
0001 val firstTry = future { doSomethingQuestionable() }
0002 val backup = future { doSomethingLessQuestionable() }
0003 val result = firstTry fallbackTo backup
The other mechanisms you'll likely want to take a look at are recover and recoverWith. In the case of recover, the
outgoing return type needs to be the same type as the original parameter type of the future you started with. In the
case of recoverWith, it should be another future, but parameterized with the same type as the original future.
Here's a quick example of each:
import scala.io.Source
import java.net.{MalformedURLException, UnknownHostException}

val candidateURLOne = "htt://google.xyz"
val candidateURLTwo = "http://google.com"
val getOne = future { Source.fromURL(candidateURLOne) }
val getTwo = future { Source.fromURL(candidateURLTwo) }

getOne.recover {
  case e: MalformedURLException =>
    Source.fromURL(candidateURLOne.replace("htt", "http"))
}.recoverWith {
  case e: UnknownHostException =>
    getTwo
}
As you can see, recover allows us to try a new operation, without it needing to be itself a future. The second
approach, using recoverWith, is useful when you have an operation that could take time to perform and an alternate
fallback approach that should be applied when the first doesn't work.
Note the difference between fallbackTo, which we showed you a little while ago, and recoverWith, which we just
demonstrated. fallbackTo takes a future and switches to it on any failure of the original future, while recoverWith lets
you match on the exception produced by the original future and return a suitable alternative based on it. Lastly, we
can generalize recoverWith by realizing that we are effectively applying a function over the future, and hence you
can utilize map and flatMap in addition to recover and recoverWith.
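To make that contrast concrete, here is a small sketch reusing the firstTry and backup futures from the fallbackTo example above; the ConnectException is just an arbitrary error we might choose to single out:

import java.net.ConnectException

val selective = firstTry.recoverWith {
  case _: ConnectException => backup   // fall back only for this particular failure
}
val always = firstTry fallbackTo backup // fall back on any failure at all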
Handling actor responses
We titled this section in a way that implies it's all about the response we get back from sending messages to actors.
But the fact is that this applies to any futures you might be using. This might feel like we're rehashing old ground, but
there's actually a lot more to handling results from futures than you might think. We've already seen a bit of this
previously, but let's go over some of the subtleties here to make sure we understand what's going on.
An additional method, mentioned briefly in the second chapter, for getting the result from a Future is to use one of the
callbacks provided. Of these, onComplete is basically the catch-all. It lets you get the result whether the call resulted
in a success or a failure. As you can probably deduce, onSuccess and onFailure handle the individual cases of
success and failure.
There are a couple of interesting behavioral details to know about before you start using these:
You can create an arbitrary number of callbacks on any future.
These callbacks will not be called in a specific order. To put it another way, do not assume the callbacks will
be called in the order you have defined them.
If any given callback throws an exception, the other callbacks will still be called.
Now that you understand the ground-rules, let's look at an example:
import akka.actor._
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.duration._
import scala.util.{Success, Failure}

case class Message(msg: String)
case class Fail(msg: String)

val system = ActorSystem("callbacks")
import system.dispatcher                  // ExecutionContext for the callbacks
implicit val timeout = Timeout(5.seconds) // needed by the ask (?) pattern

val responder = system.actorOf(Props(new Actor {
  def receive = {
    case Message(msg) => sender ! msg
    // note: we need to respond with a Status.Failure here!
    case Fail(msg) => sender ! Status.Failure(new Exception(msg))
  }
}))

val responseOne = responder ? Message("will succeed")
val responseTwo = responder ? Fail("will fail")
responseOne.onComplete {
  case Success(result) => println(result)
  case Failure(failure) => println("Oops! Something went wrong: " + failure)
}
responseTwo.onSuccess {
  case msg => println("This will never get called.")
}
responseTwo.onFailure {
  case e: Exception => println("This will get called.")
}
In this example, we are creating a very simple actor just to give us a response. In this case, we're dictating what type
of response we get based on the type of message we send. We're calling the actor twice and registering a
single onComplete callback on the first response and both an onSuccess and onFailure callback on the second
response.
As you can see again, each of the callbacks expects a PartialFunction to be passed to it.
The onComplete callback will always receive one of Success or Failure. In a similarly rigid manner, onFailure will
always receive some type of Throwable, typically some Exception object. onSuccess can receive pretty much
anything -- that's defined by your code or the code of the actor you're interacting with.
All these callbacks have a return type of Unit. This is important since it means you can't chain the callbacks together.
That isn't really a limitation, and it should help you to remember that they are not going to be called in any
predetermined order.
We should also mention that, in the rare case that you need to wait for a response, you can also perform a blocking
operation to get the result from a future. Really, you should likely only encounter this when you're using futures
alongside synchronous code. There is also a non-blocking alternative, pipeTo from akka.pattern.PipeToSupport,
which we'll come to in a moment. Let's first take a look at how you can wait on a response from a future. You still must
provide a timeout so it won't block forever:
0001 import scala.concurrent._
0002 import scala.concurrent.duration._
0003
0004 val responseFuture = future { longRunningOperation() }
0005 val response = Await.result(responseFuture, 5 seconds)
If the future completes with a failure, either through its own logic or from a timeout, the code calling Await.result will
instead receive an exception, so be prepared for this possibility if you need to handle those errors.
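If you would rather not catch that exception yourself, one option is to wrap the blocking call in a Try; a small sketch reusing responseFuture from above:

import scala.util.Try

// a timeout or a failed future shows up here as a Failure instead of being thrown
val guarded = Try(Await.result(responseFuture, 5.seconds))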
Now, let's take a look at how you can achieve the same result using pipeTo (without blocking):
import scala.concurrent._
import akka.actor._
import akka.pattern.pipe

class SomeActor extends Actor with ActorLogging {
  def receive = {
    // a failed future arrives as a Status.Failure message
    case Status.Failure(cause) => log.error(cause, "error in computation")
    // a successful future arrives as the plain result value
    case result => log.info("success: {}", result.toString)
  }
}

val actorRef = system.actorOf(Props[SomeActor], "SomeActor")
import system.dispatcher // ExecutionContext needed by both future and pipeTo
future { longRunningOperation() } pipeTo actorRef
Wrap-up
This has been a bit of a whirlwind tour of the use of futures in both Scala and Akka, but it's important foundation
material for you to understand as you start making use of these libraries. The fundamental nature of asynchronous
computation is that some results will not be known until some future time, so having a general but usable
mechanism for dealing with this is important. Futures are a way for developers to introduce concurrency (blocking
and non-blocking) into an otherwise sequential operation, and they are also a technique for improving responsiveness
by offloading tasks that could potentially be long-running. An example is the Action.Async construct in the Play
Framework, which improves concurrency by using a Future to serve HTTP requests. There is a downside, though:
futures can become the golden hammer applied to every problem and use case. We definitely urge you to step back
from futures, think about whether the goal you're trying to accomplish could be achieved using actors instead, and
factor futures into your design where they are needed.
Chapter 9: Testing
In this chapter, we focus our attention on testing actor systems. We start by looking at the testing strategy and
toolkit that Akka provides, and we will use examples to illustrate these test strategies. In this chapter we will
be using ScalaTest as our test framework, which works well with SBT. A nice thing about sbt is that it gels well with
the popular Jenkins and Hudson continuous integration environments, allowing the developer to execute test runs
on an automated basis, usually right after code check-ins. Writing tests for actors is a little trickier than writing actor
code, because when you are writing actor code you tend to express it in a sequential manner that reflects the work
process requirement (see the illustration below), but before you launch it into production you definitely want to gain
a certain level of confidence that the work process is correct for all users, in a concurrent / asynchronous
environment.
Akka provides a toolkit, akka-testkit, which gives you the ability to write tests that validate your actor's
processing logic (in isolation, without threads) as well as its inter-actor processing (with threads, communicating
with other actors), and it's quite natural to use. Taking a step back over what you've learnt so far and reflecting on
crafting actors, you will notice that the Akka model focuses your attention on crafting your application's logic inside
the message handler, i.e. def receive = { ... }
0001 class YourActor extends Actor {
0002 def receive = {
0003 // business logic
0004 // step 1, do this
0005 // step 2, do that
0006 ...
0007 // step N, complete
0008 }
0009 }
and Akka's actor model detaches you from thinking about how data and messages are transported between actors;
in fact, Akka gives the developer a configuration-based approach to customizing the actor system to your scaling
requirements, by increasing or decreasing parallelism, adjusting message routing and dispatching, and so on, as
illustrated in previous chapters.
Setting Up
Setting up the dependency is easy: you just add the following line to your build.sbt
0001 libraryDependencies ++= Seq( ...
0002 "com.typesafe.akka" %% "akka-testkit" % "2.2.1")
Akka is agnostic about which test framework you would like to use, though Akka's own tests were written
using ScalaTest; in this case we'll follow suit and add the following dependency to our build.sbt as well:
0001 libraryDependencies ++= Seq(...
0002 "com.typesafe.akka" %% "akka-testkit" % "2.2.1",
0003 "org.scalatest" % "scalatest_2.10" % "2.0.M6" % "test")
The next few sections will examine how to use Akka's testing toolkit to write tests, and you'll notice some parallels
with unit testing elsewhere in the JVM world.
Isolation Testing
This style of writing tests is usually the first form you will begin with, the reason being that you naturally want to
validate your application's logic before subjecting that logic to a concurrency test (the subject of a later section).
Let's use an example: assume the slightly crazy problem of an actor holding a reference to a stack of invoices,
modelled by Order. This actor can only add invoices upon receiving a message, Checkin, and it tallies the total
monies on that stack upon receiving the message, Checkout. Let's take a look at our code:
import akka.actor._
import collection.mutable.SynchronizedStack

sealed trait Message
case class Checkin(amount: Int) extends Message
case object Checkout extends Message

class InvoiceActor extends Actor with ActorLogging {
  private[this] val orders = Order()
  def receive = {
    case Checkin(amount) => orders.add(amount)
    case Checkout        => sender ! orders.tally
  }
  def topOfOrder = orders.top
}

trait Order {
  // Our restaurant is special: we only accept dollars, no coins!
  // we need synchronized access to our shared structure
  private[this] val orders: SynchronizedStack[Int] = new SynchronizedStack[Int]()
  def add(amount: Int): Unit = orders push amount
  def apply(x: Int) = orders.applyOrElse(x, (_: Int) => 0)
  def tally: Int = orders.foldLeft(0)(_ + _)
  def numOfOrders: Int = orders.size
  def top: Int = orders.top
}
object Order {
  def apply() = new Order {}
}
There are a few ways you can go about testing this code; for starters, we can identify two things we need to test,
first in isolation and subsequently under concurrency. Admittedly, we cheated a little here by choosing the
synchronized stack, because otherwise you might start with a plain stack, run into concurrency problems, and only
then switch it out for a synchronized one.
Test the validity of Order
Test the validity of the business logic within the actor, InvoiceActor
It's probably a good idea to test Order in isolation so that we can iron out the bugs before we begin writing tests for
the business logic in our actor. Here is how that test might look using a test framework like ScalaTest:
0001 import org.scalatest.FlatSpec
0002
0003 class OrderSpec extends FlatSpec {
0004 behavior of "A default Order i.e. no orders have been added"
0005
0006 private[this] val order = Order()
0007
0008 it should "have zero orders" in {
0009 assert(order.numOfOrders === 0)
0010 }
0011
0012 it should "return 0 when tally is invoked" in {
0013 assert(order.tally === 0)
0014 }
0015
0016 behavior of "An order when > 1 order(s) have been added"
0017
0018 it should "have exactly 2 orders when 2 orders are added" in {
0019 order.add(1)
0020 assert(order.top === 1)
0021 order.add(2)
0022 assert(order.numOfOrders === 2)
0023 }
0024
0025 it should "return 13 when tally is invoked for orders of values {1,2,5,5}" in {
0026 order.add(5)
0027 order.add(5)
0028 assert(order.tally === 13)
0029 }
0030 }
At this point we are satisfied that Order has been tested, and we can move on to writing tests for our actor. For our
actor, InvoiceActor, we would like to make sure the behavior is consistent not only in a single-threaded execution
environment but also in a multi-threaded one. To achieve the former, we can hook our actor into the akka-testkit
module by extending akka.testkit.TestKit and providing a container in which to host our actors, i.e. an ActorSystem.
The following code shows one way of crafting this test using the BDD (Behavior Driven Development) style in
ScalaTest.
import akka.testkit._
import akka.actor.{ActorRef, ActorSystem, Props}
import org.scalatest.{FlatSpecLike, BeforeAndAfterAll}
import org.scalatest.matchers.MustMatchers

class InvoiceActorSpec extends TestKit(ActorSystem()) with ImplicitSender
  with FlatSpecLike
  with BeforeAndAfterAll
  with MustMatchers {

  behavior of "`Invoice` actor taking 2 orders"

  it should "have two values {400, 500} when the same two orders are issued" in {
    val actor = TestActorRef[InvoiceActor]
    val ref = actor.underlyingActor
    actor ! Checkin(400)
    ref.topOfOrder must be (400)
    actor ! Checkin(500)
    ref.topOfOrder must be (500)
  }

  it should "return a tally of 900 after two orders of value 400 & 500" in {
    val actor = TestActorRef[InvoiceActor]
    actor ! Checkin(400)
    actor ! Checkin(500)
    actor ! Checkout
    expectMsg(900)
  }

  override def afterAll() { system.shutdown() }
}
Let's go through what we did to arrive at this test. The first thing we did was to extend TestKit and provide it with a
container in which all actors are hosted (for this test only); next, we mixed in a trait called FlatSpecLike, which is the
cousin of FlatSpec. We did this because FlatSpec is a class, and we cannot extend two classes, even in Scala.
Moving on, we mixed in another trait called ImplicitSender, and this trait is necessary for the test to succeed
because it allows the test code to receive the response messages.
The last statement we made about ImplicitSender deserves more attention. This particular trait is needed in your
tests because it provides a special environment in which your tests execute, and that environment is actually
an actor that's single-threaded, so that your tests are executed in written order.
Inside InvoiceActorSpec we have two tests, and once testing completes it is necessary to shut down the ActorSystem,
or else you will have unintentionally leaked it (which is bad, since we would essentially be wasting the computer's
resources). We need a way to inject this clean-up code, and we would like it to run once all of our tests complete.
You may argue that you would rather create and destroy an ActorSystem before and after every test, and you can
certainly do that, but if you recall, one of the core principles of unit testing is that it should complete quickly, and this
cycling of ActorSystems takes time, which leaves us with the former option. We are fortunate that ScalaTest's
BeforeAndAfterAll trait provides a method, afterAll, which gives us a place for our clean-up code.
Now that we've understood the basics of our actor testing lifecycle, we can dive into the two tests. You will
notice that they look similar but differ in a few ways. Common to both tests is the fact that we need to create a reference to
our actor under test; Akka names it TestActorRef, and this reference basically wraps our actor, InvoiceActor,
together with a special dispatcher called CallingThreadDispatcher (it doesn't create a new thread, but piggybacks on
the current one). This dispatcher is ideal for our purposes because we want to make sure our
test succeeds in isolation (which means our actor's logic works!) before subjecting it to concurrency.
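As an aside, the same dispatcher can be attached to any ordinary actor should you ever want one to process its messages synchronously on the calling thread; a minimal sketch (the variable name is ours):

import akka.actor.Props
import akka.testkit.CallingThreadDispatcher

// run an ordinary InvoiceActor on the calling thread instead of a thread pool
val syncInvoices = system.actorOf(
  Props[InvoiceActor].withDispatcher(CallingThreadDispatcher.Id))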
The next thing you almost always want to do when writing tests for actors is to make sure:
Messages are received and acted upon
Message handler logic is correct
For the former, we send messages to our actor under test via the familiar tell syntax, i.e. !, and we know the message
was acted upon because our actor's logic sends the result back to the sender; by using the function expectMsg, we
know that it did. There is an entire plethora of functions embedded in akka.testkit.TestKit for testing the validity of
responses, and we encourage you to explore the Scala docs for it.
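Before we move on to the latter, here is a hypothetical taste of a few of those other assertions; the snippet assumes it sits inside one of the tests above, where actor is the TestActorRef under test:

import scala.concurrent.duration._

within(500.millis) {        // everything inside must complete within this window
  actor ! Checkout
  expectMsgPF() {           // match the reply with a partial function
    case total: Int if total >= 0 => total
  }
}
expectNoMsg(100.millis)     // and nothing further should arrive afterwards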
For the latter, our actor's code indicates that upon receiving a Checkin message we deposit that amount into the
Order held by the actor, and we want to make sure that the amount recorded is indeed the amount received. Let's
assume we would rather not add another message, LastCheckin, just to expose this (though we see no real harm in
it, really); it would look like this:
case object LastCheckin
class InvoiceActor extends Actor with ActorLogging {
  def receive = {
    case Checkin(amount) => // as above
    case Checkout        => // as above
    case LastCheckin     => sender ! topOfOrder
  }
  // other code omitted
  private def topOfOrder = orders.top
}
In honesty, that would work, but there's another way! Akka gives us a way of achieving this without much
fanfare, by allowing access to our actor, InvoiceActor, through the method underlyingActor; that lets us
check the state of the Order after each Checkin with the expression
0001 ref.topOfOrder must be (400) // now, topOfOrder is made 'public' & no synchronization
There are a few ways we can see the tests we have written in action. Since we are using SBT, we could try the
following
>test
and that would trigger a one-off run of all the tests in our project. You would see the following output on the
console window:
0001 [info] OrderSpec:
0002 [info] A default Order i.e. no orders have been added
0003 [info] - should have zero orders
0004 [info] - should return 0 when tally is invoked
0005 [info] An order when > 1 order(s) have been added
0006 [info] - should have exactly 2 orders when 2 orders are added
0007 [info] - should return 13 when tally is invoked for orders of values {1,2,5,5}
0008 [info] InvoiceActorSpec:
0009 [info] `Invoice` actor taking 2 orders
0010 [info] - should have two values {400, 500} when the same two orders are issued
0011 [info] - should return a tally of 900 after two orders of value 400 & 500
0012 [info] Passed: : Total 6, Failed 0, Errors 0, Passed 6, Skipped 0
0013 [success] Total time: 1 s, completed Sep 15, 2013 11:51:54 AM
At this point, you should see that Akka does not interfere with how we write tests, and this is beneficial to developers:
they can keep using the testing and mocking frameworks that have worked well for them, and only worry about
asynchronicity when they have to.
Parallel Testing
This section is really about running tests in parallel; you would normally want to do that so your test cycle
completes quicker than when running your test suites in sequence. There are two ways to accomplish this:
one is to not involve Akka and let the unit testing framework handle running tests in parallel; the second is to involve
Akka, which allows not only parallelism in your unit tests but also launching multiple Scala modules, depending on
your requirements.
In the scenario where you want your unit testing framework to take care of launching tests in parallel, in the context of
ScalaTest we need to create test suites (a suite is really a Scala class you use to categorize your tests in a
meaningful way) with the following mixins:
FunSuite [with BeforeAndAfter] with ParallelTestExecution
FunSuiteLike [with BeforeAndAfter] with ParallelTestExecution
where the trait BeforeAndAfter should be mixed in before ParallelTestExecution. As far as we know, in the
ScalaTest 2.0.M7 release ParallelTestExecution must be the last trait mixed in, because a method, runTest, has
been marked final.
The following code demonstrates what the test code looks like after the appropriate traits have been mixed in to
allow parallel execution.
0001 import akka.testkit._
0002 import akka.actor.{ActorRef, ActorSystem, Props}
0003 import org.scalatest.{FlatSpecLike, BeforeAndAfterAll, FunSuiteLike, ParallelTestExecution}
0004 import org.scalatest.matchers.MustMatchers
0005 class InvoiceActorFunParSpec extends TestKit(ActorSystem()) with ImplicitSender
0006 with FunSuiteLike
0007 with BeforeAndAfterAll
0008 with MustMatchers
0009 with ParallelTestExecution {
0010 test("have three values {400, 500, 600} when the same three orders are issued") {
0011 val actor = TestActorRef[InvoiceActor]
0012 val ref = actor.underlyingActor
0013 actor ! Checkin(400)
0014 ref.topOfOrder must be (400)
0015 actor ! Checkin(500)
0016 ref.topOfOrder must be (500)
0017 actor ! Checkin(600)
0018 ref.topOfOrder must be (600)
0019 }
0020 test("return a tally of 1,500 after three orders of value 400, 500 & 600") {
0021 val actor = TestActorRef[InvoiceActor]
0022 actor ! Checkin(400)
0023 actor ! Checkin(500)
0024 actor ! Checkin(600)
0025 actor ! Checkout
0026 expectMsg(1500)
0027 }
0028 override def afterAll() = { system.shutdown}
0029 }
0030 class InvoiceActorFunParSpec2 extends TestKit(ActorSystem()) with ImplicitSender
0031 with FunSuiteLike
0032 with BeforeAndAfterAll
0033 with MustMatchers
0034 with ParallelTestExecution {
0035 test("have one value {400} when one order is issued") {
0036 val actor = TestActorRef[InvoiceActor]
0037 val ref = actor.underlyingActor
0038 actor ! Checkin(400)
0039 ref.topOfOrder must be (400)
0040 }
0041 test("return a tally of 500 after one order of value 500") {
0042 val actor = TestActorRef[InvoiceActor]
0043 actor ! Checkin(500)
0044 actor ! Checkout
0045 expectMsg(500)
0046 }
0047 override def afterAll() = { system.shutdown}
0048 }
To run these new tests, we go back to SBT and trigger them again. Depending on when you trigger these
tests, the order of execution for InvoiceActorFunParSpec and InvoiceActorFunParSpec2 may differ. A
sample output from my machine is as follows:
0001 [info] OrderSpec:
0002 [info] A default Order i.e. no orders have been added
0003 [info] - should have zero orders
0004 [info] - should return 0 when tally is invoked
0005 [info] An order when > 1 order(s) have been added
0006 [info] - should have exactly 2 orders when 2 orders are added
0007 [info] - should return 13 when tally is invoked for orders of values {1,2,5,5}
0008 [info] InvoiceActorFunParSpec2:
0009 [info] - have one value {400} when one order is issued
0010 [info] - return a tally of 500 after one order of value 500
0011 [info] InvoiceActorFunParSpec:
0012 [info] - have three values {400, 500, 600} when the same three orders are issued
0013 [info] - return a tally of 1,500 after three orders of value 400, 500 & 600
0014 [info] InvoiceActorSpec:
0015 [info] `Invoice` actor taking 2 orders
0016 [info] - should have two values {400, 500} when the same two orders are issued
0017 [info] - should return a tally of 900 after two orders of value 400 & 500
0018 [info] Passed: : Total 10, Failed 0, Errors 0, Passed 10, Skipped 0
0019 [success] Total time: 2 s, completed Sep 16, 2013 10:07:31 AM
The way ScalaTest handles parallel test execution is that it runs the test suites in parallel, while each test
within a suite executes in sequence, i.e. in written order. We'll take a look at another form of parallel
testing in the following section. In the second scenario, where you involve Akka to take care of launching your tests in
parallel, you have the choice of running tests in parallel on a single host, running tests in a coordinated manner
across distributed hosts, and so on.
Concurrent Testing
In the previous section, you've seen how the test framework we've chose allows the developer the capability of
writing test logic, validating your actor and also the capability to execute those tests in parallel. Well, Akka has
another tool in the toolbox which allows us, to launch tests in parallel across multiple JVM instances. The module
we will be discussing is the sbt-multi-jvm and it's different from akka-testkit in that it just cares about one thing:
detects the applications it needs to work with and launch each of them into separate JVMs, possibly on different
machines.
The sbt-multi-jvm module is actually part of a larger module, akka-multi-node-testkit which allows the developer to
craft distributed tests that reflects more of a reality than not i.e. coordinated distributed test. We'll take a look at how
to accomplish that a little while later.
In this section, we will first look at hooking our current tests (written with ScalaTest) into sbt-multi-jvm, followed
by an illustration of how to launch Scala modules within a test (i.e. controlled) environment. Finally, we'll
demonstrate how to use the multi-node testkit.
Setting Up
Akka, the cool dude that he is, has only two requirements in order to get this to work. First, we need to incorporate
the plugin into your project's plugin definition, project/plugins.sbt:
addSbtPlugin("com.typesafe.sbt" % "sbt-multi-jvm" % "0.3.8")
You can either restart sbt or issue the update command in the sbt console to pick up this new setting. The second
thing is to add a new definition to the build file, project/Build.scala; for starters, it should look like this:
import sbt._
import Keys._
import com.typesafe.sbt.SbtMultiJvm
import com.typesafe.sbt.SbtMultiJvm.MultiJvmKeys.{ MultiJvm }

object Chapter9Build extends Build {
  lazy val root = Project(id = "chapter9-testing-actors",
    base = file("."),
    settings = Project.defaultSettings ++ multiJvmSettings) configs(MultiJvm)

  lazy val multiJvmSettings = SbtMultiJvm.multiJvmSettings ++ Seq(
    // make sure that MultiJvm tests are compiled by the default test compilation
    compile in MultiJvm <<= (compile in MultiJvm) triggeredBy (compile in Test),
    // disable parallel tests
    parallelExecution in Test := false)
}
The final thing you need to do is to create a directory, src/multi-jvm/scala, and start depositing the tests you will
write in distributed mode there. The plugin picks out tests based on a naming convention: candidates for such
execution runs are identified by the string MultiJvm appearing in the class name, for example:

class InvoiceActorMultiJvmNode1 extends TestKit(ActorSystem()) with ImplicitSender
  with FlatSpecLike
  with BeforeAndAfterAll
  with MustMatchers { // as before }
The class name, specifically the MultiJvmNode1 part, is what makes the difference; you may wish to contrast this
approach against the previous approach we took with ScalaTest, and you might actually like it. The mechanism for
discovering which tests to execute is based on the naming convention: as we briefly mentioned, it looks for files and
tests that share the same marker string:

multi-jvm/
└── scala
    ├── InvoiceActorMultiJvmNode1.scala
    └── InvoiceActorMultiJvmNode2.scala

The following figure illustrates how the tests are grouped, as we described previously.
[Figure: How tests are grouped]
Tip: If you, or your organization, isn't too keen on the name MultiJvm, you can change it by placing an expression
similar to this in your build:

multiJvmMarker in MultiJvm := "GroupTest"

and the placement of that expression can be like this:

object Chapter9Build extends Build {
  lazy val root = Project(id = "chapter9-testing-actors",
    base = file("."),
    settings = Project.defaultSettings ++
      multiJvmSettings ++
      Seq(multiJvmMarker in MultiJvm := "GroupTest")) configs(MultiJvm) // as before

When you next trigger the multiple-JVM test run, it'll look for files which have GroupTest embedded within, inspect
those files, and execute the tests whose names carry that marker.
To illustrate how this would work, you would again employ sbt and enter the following expression into the console:

> multi-jvm:test

And in our scenario we have two tests running concurrently, i.e. InvoiceActorMultiJvmNode1 and
InvoiceActorMultiJvmNode2; when you trigger that run, you should see output like the following on the console:
0045 [info] * InvoiceActor
0046 [JVM-1] Run starting. Expected test count is: 2
0047 [JVM-1] InvoiceActorMultiJvmNode1:
0048 [JVM-1] Node1 : `Invoice` actor taking 2 orders
0049 [JVM-2] Run starting. Expected test count is: 2
0050 [JVM-2] InvoiceActorMultiJvmNode2:
0051 [JVM-2] Node2 : `Invoice` actor taking 2 orders
0052 [JVM-1] - should have two values {400, 500} when the same two orders are issued
0053 [JVM-1] - should return a tally of 900 after two orders of value 400 & 500
0054 [JVM-1] Run completed in 636 milliseconds.
0055 [JVM-1] Total number of tests run: 2
0056 [JVM-1] Suites: completed 1, aborted 0
0057 [JVM-1] Tests: succeeded 2, failed 0, canceled 0, ignored 0, pending 0
0058 [JVM-1] All tests passed.
0059 [JVM-2] - should have two values {400, 500} when the same two orders are issued
0060 [JVM-2] - should return a tally of 900 after two orders of value 400 & 500
0061 [JVM-2] Run completed in 666 milliseconds.
0062 [JVM-2] Total number of tests run: 2
0063 [JVM-2] Suites: completed 1, aborted 0
0064 [JVM-2] Tests: succeeded 2, failed 0, canceled 0, ignored 0, pending 0
0065 [JVM-2] All tests passed.
0066 [success] Total time: 1 s, completed Sep 19, 2013 4:48:01 PM
When you contrast the approach we took here with the one we took with ScalaTest, you'll quickly realize that it
doesn't take too much effort to convert those ScalaTest suites (or whatever test framework you're familiar with) into
this form. We believe they are not meant to replace the tests you've already prepared with your favourite test
framework, but it's always good to know you have alternatives.
Next, let's take a look at another use case for this module. Let's assume that you have crafted several applications,
i.e. Scala modules, and they start up depending on certain attributes or characteristics that you've represented in a
configuration file, or even passed on the command line using the familiar Java system properties:
0071 object MyApplication {
0072 val property1 = System.getProperty("some_property1")
0073 val property2 = System.getProperty("some_property2")
0074 // ...
0075 def main(args: Array[String]) = {
0076 // Asserting properties
0077 // Start up your application depending on the characteristics of those properties
0078 }
0079 }
At this point, we can enlist the help of sbt-multi-jvm to help us out! Assuming that we have factored out the
nasty-looking System.getProperty("XXX") calls, we might end up with something like this:
trait Configuration {
  val default = "Nothing"
  // System.getProperty is weird...but that's in the Java world...
  def getPropertyOrElse(prop: String, default: String = default) =
    System.getProperty(prop) match { case null => default; case x => x }
  def getCustomMessageForNode = getPropertyOrElse("custom_message")
  def getThreadPoolName = getPropertyOrElse("custom_pool_name")
  def getThreadPoolType = Class.forName(getThreadPoolName)
}
object Configuration {
  def apply = new Configuration {}
}
object SimpleModuleMultiJvmNode1 extends Configuration with FlatSpecLike with MustMatchers {
  def main(args: Array[String]) : Unit = {
    val p = getCustomMessageForNode
    val pn = getThreadPoolName
    pn must include ("ForkJoinPool") // assertion
  }
}
object SimpleModuleMultiJvmNode2 extends Configuration with FlatSpecLike with MustMatchers {
  def main(args: Array[String]) : Unit = {
    val p = getCustomMessageForNode
    val pn = getThreadPoolName
    pn must include ("ThreadPoolExecutor") // assertion
  }
}
In our example, we again followed the naming conventions described earlier to group the use cases we want to
categorize. With this in place (as shown), we have two modules that can be started in separate JVMs, each testing
the validity of our properties. To trigger this, we enter the expression multi-jvm:run SimpleModule into the sbt
console; a test failure would result in error messages spewed onto the console. Below is a sample run from our sbt
console:
0111 > multi-jvm:run SimpleModule
0112 [info] Compiling 5 Scala sources to /Users/raymondtay/akka-edge/ch9-testing-actors/target/scala-2.10/multi-jvm-classes...
0113 [warn] there were 2 deprecation warning(s); re-run with -deprecation for details
0114 [warn] one warning found
0115 [info] * SimpleModule
0116 [success] Total time: 5 s, completed Sep 22, 2013 9:51:26 PM
Before concluding this section, let us demonstrate how we can bootstrap our InvoiceActor into a Scala module and
have it run a test; this covers the use case where you have actors that you'd like to launch from a Scala module
resembling a standalone application. Here's what that piece of code now looks like, compared to what we did
earlier:
0120 object SimpleModuleMultiJvmNode1 extends TestKit(ActorSystem("SimpleModuleAS")) with Configuration with ImplicitSender
0121 with FlatSpecLike
0122 with MustMatchers
0123 with BeforeAndAfterAll {
0124 def doAction(f: Unit) = f
0125 def main(args: Array[String]) : Unit = {
0126 try doAction{
0127 val p = getCustomMessageForNode
0128 val pn = getThreadPoolName
0129 pn must include ("ForkJoinPool")
0130 val invoice = TestActorRef[InvoiceActor]
0131 invoice ! Checkin(500)
0132 invoice ! Checkout
0133 expectMsg(500)
0134 } finally afterAll
0135 }
0136 override def afterAll = { system.shutdown }
0137 }
0138
0139 <p>What happened here is that we used the previous technique of creating a <span style="constant">TestActorRef</span> to wrap around our
0140 <h2>Multi-Node Testing</h2>
0141 <p>Previously, we demonstrated that you can improve your testing cycle by running your tests concurrently but that doesn't really tell the whole story because a big part of your actor application would be communicating with other machines in the network. As you continue to write actor code, you would quickly realize that testing on a single host would out lived its usefulness and we need to find some manner to put the application to a distributed test. The following simplified diagram illustrates the concurrent testing we have encountered so far where actor-1 delivers the payload to actor-3 through actor-2 (which we'll assume is performing some middle-man processing) and the right-hand side of the diagram illustrates how each actor code is tested by launching parallel JVMs.
0142 <p style="text-align: justify;">&nbsp;<img src="static/chapter-9-actor-testing.002.jpg" alt="" width="1024" height="768"></p>
0143 <p style="text-align: justify;">When we want to test our application in a distributed way, ideally we would need a framework that allows our actor code to be hosted within a test framework like ScalaTest and the tests would be packaged along with our dependencies and shipped off to the destination nodes and begins the distributed testing. Fortunately for us, the framework is already present and is known as
0144 <p style="text-align: justify;">How this testkit works is that there is a <span style="constant">TestConductor</span> that coordinates and runs the tests across the nodes that we have deployed our application to. The next thing is for us to hook our tests to a specification which allows our tests to be hooked to the multi-node test framework's lifecycle and its called
0145 <h3 style="text-align: justify;">Setting up</h3>
0146 <p style="text-align: justify;">Edit the build.sbt file and include the following dependency to it:</p>
0147 <pre>"com.typesafe.akka" %% "akka-multi-node-testkit" % "2.2.1"</pre>
0148 <p style="text-align: justify;">and update the project by triggering the <em>update</em> command to the <em>sbt</em> console and sbt should shortly download the dependency to your local file system if it hasn't already done so. This testkit allows your test to be delivered and executed on various machines, so its probably a good idea to create password-less
0149 <h3 style="text-align: justify;">Example</h3>
0150 <p style="text-align: justify;">Let's take a further look by building the classic example of messaging forwarding involving three actors. The first actor will begin by sending the message to the middle-actor and that would be forwarded to the last actor, by which the last actor will send a reply back to the original actor. Next, our test environment would consists of a single host that's attached to two virtual machines with the localhost acting as the
[Figure: chapter-9-actor-testing.003_1.jpg]
The following code demonstrates this (we'll explain it in more detail in a short while):
import org.scalatest.{BeforeAndAfterAll, WordSpecLike}
import org.scalatest.matchers.MustMatchers
import akka.actor._
import akka.testkit.ImplicitSender
import akka.remote.testkit._

// `Stop` marks the end of the round trip; the destination appends it to its reply
case object Stop

// Your specification should mix in what you need for your test
trait ForwardMultiNodeSpec extends MultiNodeSpecCallbacks
  with WordSpecLike
  with MustMatchers
  with BeforeAndAfterAll {
  override def beforeAll() = multiNodeSpecBeforeAll()
  override def afterAll() = multiNodeSpecAfterAll()
}

// Create as many `roles` as you have `nodes`
object MConfig extends MultiNodeConfig {
  val node1 = role("Actor-1")
  val node2 = role("Actor-2")
  val node3 = role("Actor-3")
}

// the sbt-multi-jvm testkit launches one `class` per `node`
class ForwardMNMultiJvmNode1 extends ForwardMultipleNodeDemo
class ForwardMNMultiJvmNode2 extends ForwardMultipleNodeDemo
class ForwardMNMultiJvmNode3 extends ForwardMultipleNodeDemo

class ForwardMultipleNodeDemo extends MultiNodeSpec(MConfig)
  with ForwardMultiNodeSpec
  with ImplicitSender {
  import MConfig._

  def initialParticipants = roles.size

  // Override these two methods if you like to conduct some
  // setup/shutdown for `all` tests
  override def atStartup() = println("STARTING UP!")
  override def afterTermination() = println("TERMINATION!")

  "A multi-node illustrating `message-forwarding`" must {
    "wait for all nodes to enter barrier" in {
      enterBarrier("startingup")
    }
    "send to and receive from a remote node via `forwarding`" in {
      runOn(node1) {
        enterBarrier("deployed")
        var msgReturned = false
        val next = system.actorFor(node(node2) / "user" / "MiddleMan")
        val me = system.actorOf(Props(new Actor with ActorLogging {
          def receive = {
            case "hello"         ⇒ next forward ((self, "hello"))
            case ("hello", Stop) ⇒ msgReturned = true
          }
        }), "Initiator")
        me ! "hello"
        awaitCond(msgReturned)
      }
      runOn(node2) {
        val next = system.actorFor(node(node3) / "user" / "Destinator")
        system.actorOf(Props(new Actor with ActorLogging {
          def receive = {
            case (to, msg) ⇒ next forward ((to, msg))
          }
        }), "MiddleMan")
        enterBarrier("deployed")
      }
      runOn(node3) {
        system.actorOf(Props(new Actor with ActorLogging {
          def receive = {
            case (to: ActorRef, msg) ⇒ to ! ((msg, Stop))
          }
        }), "Destinator")
        enterBarrier("deployed")
      }
      enterBarrier("finished")
    }
  }
}
Let's go through in more detail what we've done here. We crafted a test consisting of three actors named Initiator, MiddleMan and Destinator; each of these actors is executed on the nodes node1, node2 and node3 respectively through the function runOn. If you examine the actor code, you'll notice it's just regular Akka actor code, but with one caveat: we need to make sure the order of startup is correct for our small simulation.
The Initiator needs to start up after MiddleMan and Destinator, otherwise the Initiator would not be able to locate the appropriate actor to forward the message to. The testkit gives us the answer by allowing us to place barriers, i.e. no one crosses a barrier until everyone has arrived at it.
Using barriers, we can control the order of startup, and in our example MiddleMan and Destinator start up before Initiator (which is what we want). Notice that we also place barriers at startup and shutdown in our test lifecycle; that is a good practice to abide by, since it keeps the tests from spinning out of control because of synchronization issues. When writing tests, there's often a place in the framework which prepares and subsequently releases resources for the tests, akin to
def atStartup(): Unit
def afterTermination(): Unit
which alter the default behavior (which is to do nothing) when your tests are starting up or shutting down. In our example, we simply print a message, and during the test run you will notice this output. To give this example a run, enter the following into the sbt console:
multi-node-test-only ForwardMN
and the testkit will package our application and its dependencies together, producing an artifact jar ch9-testing-actors_2.10.2-1.0-multi-jvm-assembly.jar, and transport it over to the hosts. Here's a sample output:
[info] Including protobuf-java-2.4.1.jar
[info] Including akka-actor_2.10-2.2.1.jar
[info] Including akka-multi-node-testkit_2.10-2.2.1.jar
... some lines omitted
[info] Including netty-3.6.6.Final.jar
[info] Including scalatest_2.10-2.0.M7.jar
[info] Including uncommons-maths-1.2.2a.jar
[info] Including scala-library.jar
[info] SHA-1: WrappedArray(-22, 53, -18, 49, -65, 38, -30, -5, 89, -62, 123, -65, -86, 27, -85, 90, -77, 104, -55, 19)
[info] Packaging /Users/tayboonl/akka-edge/ch9-testing-actors/target/ch9-testing-actors_2.10.2-1.0-multi-jvm-assembly.jar ...
[info] Done packaging.
[info] * ForwardMN
... more lines omitted reporting test results
You can correlate this with the several SSH processes that are started and that live for as long as the tests last, and the test will report that it completed successfully!
Timed Tests
Thus far, we have made no assumptions about how quickly our transactions should complete; to be specific, we made no assertions about the latency we are willing to tolerate when placing orders or checking out. The akka-testkit provides a mini-DSL in which we can express these latency concerns by placing the logic we want accomplished within a time range, using this:
within([min, ] max) {
  // time sensitive business logic
}
What this means is that the business logic must complete within the time window (min, max), and this mini-DSL can be nested. Let's take a look at how our tests can be crafted to account for this. Assume that our business owner lays down the requirement that any single order should take no more than 0.5 seconds to register, while 2 or more orders should take no more than 0.8 seconds; our system can then be validated against this requirement with the following changes to our code:
class InvoiceActorTimeSpec extends TestKit(ActorSystem()) with ImplicitSender
  with FlatSpecLike
  with BeforeAndAfterAll
  with MustMatchers
  with ParallelTestExecution {

  behavior of "`Invoice` actor taking 2 orders must complete within 800 millis"

  import scala.concurrent.duration._

  it should "have two values {400, 500} when the same two orders are issued" in {
    val actor = TestActorRef[InvoiceActor]
    val ref = actor.underlyingActor
    within(800 millis) {
      within(500 millis) {
        actor ! Checkin(400)
        ref.topOfOrder must be (400)
      }
      within(500 millis) {
        actor ! Checkin(500)
        ref.topOfOrder must be (500)
      }
    }
  }
}
On the subject of time comes the question: "Would the tests still give the same results when executed on a faster machine compared to a slower machine?" Many of the testing techniques in Akka carry a timeout value within which the test must complete, or a failure is reported. Most of these timeout values are implicit, but they can be made explicit, and in situations where you need to deliberately stretch a test's timing you can include an expression like this in your tests:
1.second.dilated
That assumes the expression is evaluated within an ActorSystem (e.g. inside a TestKit) with the necessary imports. Be careful when the success or failure of your tests depends on time, since on a heavily loaded system your application has to compete with other processes for CPU and I/O time; tests should, after all, be deterministic.
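The dilated duration is scaled by the akka.test.timefactor configuration setting, so the same test can be given more headroom on a slow or busy machine without touching the code. Here is a minimal sketch of how we might combine dilation with the within block from before, assuming it runs inside our TestKit-based spec so an implicit ActorSystem is in scope; the factor of 3.0 is just an illustrative value:

import scala.concurrent.duration._
import akka.testkit._ // brings the `dilated` enrichment into scope

// With `akka.test.timefactor = 3.0` in the test configuration,
// 1.second.dilated expands to 3 seconds at runtime.
within(1.second.dilated) {
  actor ! Checkin(400)
  actor ! Checkout
  expectMsg(400)
}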
Chapter 10 : Clusters
In this chapter, we'll explore a feature available in the latest Akka 2.2 (soon to be 2.3) release, codenamed
"Coltrane": Akka clusters. Prior to this release, clustering in Akka was experimental, but we have not been
able to conceal our excitement over this feature and we've decided to share with you, the reader, why you
should look out for it. Calling it a "feature" is probably an understatement, since akka-cluster is really more of a framework for
building and running long-lasting distributed applications.
Clustering technology isn't new; in the Java world, several clustering implementations have been offered in the past
(some survive to this day), and most of them don't call themselves clusters but rather go by the name middleware
(see Wikipedia's interpretation). The Java/Scala developers amongst us are probably familiar with implementations
like Oracle WebLogic, IBM WebSphere and RedHat's JBoss, which typically focused on the distributed computing
constructs derived from the J2EE specifications (dubbed by Sun Microsystems), now known as the JEE specifications
(driven by present-day Oracle). The clustering solutions of yore and of the present mostly focused on providing
computing abstractions that were commonplace at the time. Notable features include the famous two-phase commit
for coordinating transactions over a distributed system, and the abstraction that lets the developer design application
dataflows by encapsulating business logic in stateless or stateful workers, a.k.a. Enterprise JavaBeans (EJBs), with
their life-cycles managed by the containers (another name for J2EE/JEE middleware) that host them. Many of these
implementations provide fail-over for compute tasks when a node is under significant load or when a node goes
down, and many come with tooling that helps the developer be more productive in building J2EE/JEE applications.
The JEE specification covers more than what we have just described, and these implementations might be a perfect
match for businesses with those needs, but there is a problem, we think.
Adopting these solutions, in our opinion, comes with a cost, and that cost is visibility. In distributed systems, visibility
is an intrinsic property that is very much desired. What we mean is the ability to see how things actually work, not how
we think they should work, and ultimately to effect changes so that others can benefit from them. Akka, being open
source, gives the developer (or the team) unrestricted access to its source and hence the means to achieve this. An
example that epitomizes the problem with a lack of visibility is when a commercial cluster system does not behave
the way it is supposed to. What usually happens next is filing an incident report with the commercial entity supporting
the software (the customer being you or your organization), and the time to resolution can range from days to,
sometimes, months. In one situation some of us have experienced, we were advised to upgrade to a more recent
version of the software and had to work out a migration plan, and that is a pain we hope not to relive anytime soon.
Another observation is that we developers no longer build software in isolation or obscurity. With the proliferation of
open source technologies, from programming languages to cloud operating systems (Eucalyptus, OpenStack, etc.)
and their tooling, the case for building on closed systems rather than open ones grows weaker as this trend
continues. The fact that you've picked up this book, and know that you can download and examine not only this
book's source code but also the source code of Scala and Akka, means you have already acquired the fundamental
means to reason about not only how it works but, possibly, to change how it works. A nicety of akka-cluster is that it's
built on akka-actor technology, so in that sense it's an extension of the Actors you've already learnt.
Having said that, smart developers like yourselves have probably begun to wonder what kinds of applications you
can build using akka-cluster, and the answer is: plenty (see link).
We are not advocating that you drop those clustering solutions your organization has spent considerable resources
building on; rather, we're encouraging you to explore akka-cluster and see whether you can put it to use. We believe
the prospects are high, even more so once you've dabbled with actors for a while. Information gathered from water
coolers and company pantries reveals a compelling (somewhat crafty) strategy: discover a sensible clustering
approach, implement and test it, and finally swap out that J2EE/JEE solution for an Akka cluster. Works like a charm.
For the rest of the chapter, we are going to explore clusters in Akka by understanding how they work through
(relatively) painless examples which should give you a good basic intuition, and we'll demonstrate two use cases:
(a) A distributed computing cluster
(b) How you can integrate a REST server with an Akka cluster.
The last example was chosen because, prior to akka-cluster, many developers would build front-end services on a
REST/HTTP server framework like Play or possibly Spray, and build the back-end services those front ends consume
using actors in Akka; when they discovered akka-cluster, the more adventurous hakkers amongst us started migrating
those services onto clusters.
About Akka Clusters
In 2007, Amazon released a famous paper titled `Dynamo: Amazon's highly available key-value store`, in which the
authors explained in detail the design philosophies of Dynamo, covering cluster design, data replication, node
recovery, failure recovery and more. As far as we know, Akka derived the design of its cluster membership from
Dynamo and from Basho's Riak. Nodes in the cluster communicate their liveness and system metrics by gossiping
amongst themselves via a gossip protocol; the gossip protocol propagates data such as the state of the cluster, and
vector clocks are used to reconcile and merge any differences detected.
A cluster is made up of member nodes, and these nodes can be on the same host (with different port numbers, of
course) or on different hosts. Each node is identified by a string of the form hostname:port:uid. Akka clusters are
versatile in the sense that they have no strict requirement of hosting only actors; in fact you can run good ol' Scala
code if you like. However, the typical use case is to distribute an actor application over the cluster. Clusters in Akka
provide fault-tolerant, peer-to-peer based cluster membership with no single point of failure or single point of
bottleneck. Sweet!
Let's take a look at how they actually work through an example. This example has a simple architecture: a Scala
module will start a cluster and create an actor; this actor will print a message whenever it detects that a node has just
joined the cluster. The following code illustrates the module that will host our cluster:
package simplecluster

import akka.actor.{ActorSystem, Props}
import akka.cluster.Cluster
import akka.cluster.ClusterEvent.ClusterDomainEvent

object SimpleCluster {
  def main(args: Array[String]): Unit = {
    if (args.nonEmpty) System.setProperty("akka.remote.netty.tcp.port", args(0))
    val system = ActorSystem("Cluster")
    val clusterListener = system.actorOf(Props[SimpleClusterListener],
      name = "clusterListener")
    Cluster(system).subscribe(clusterListener, classOf[ClusterDomainEvent])
  }
}
What we have done here is create an ActorSystem and have that ActorSystem create the listener actor. At this
point in time, we have no cluster yet. The cluster comes to life when we obtain the Cluster extension through the call
Cluster(system), and we then ask it to deliver all cluster-related events, represented by the trait
ClusterDomainEvent, to our listener.
ClusterDomainEvent is the super-trait representing the events going on inside a cluster, e.g. a member going up or
down, a member becoming unreachable, changes in cluster behavior, or changes in cluster resource utilization.
The listener will be waiting for messages to arrive, and once they do, its response is to log some intrinsic
information. For example, the listener will log a status message whenever a new member shows up and whenever a
member goes down. Here's what the listener looks like:
package simplecluster

import akka.actor.{Actor, ActorLogging}
import akka.cluster.ClusterEvent._

class SimpleClusterListener extends Actor with ActorLogging {
  def receive = {
    case state: CurrentClusterState ⇒
      log.info("Current members: {}", state.members.mkString(", "))
    case MemberUp(member) ⇒
      log.info("Member is Up: {}", member.address)
    case UnreachableMember(member) ⇒
      log.info("Member detected as unreachable: {}", member)
    case MemberRemoved(member, previousStatus) ⇒
      log.info("Member is Removed: {} after {}",
        member.address, previousStatus)
    case _: ClusterDomainEvent ⇒ // ignore
  }
}
You'll notice that our listener watches for certain types of messages to process. If you are keen to find out how
events, failure detection and routing work in the (current) cluster, you can read the following sub-sections, or skip to
Start up a cluster just to see how things work!
Failure Detection
Failure detection is essential to the day-to-day operation of a cluster, because it is in the interest of service providers
and operators alike to be able to respond when one or more nodes in the cluster fail to provide the said service.
Akka clusters accomplish this by having each node monitored by a few other nodes; when a monitored node
becomes unreachable (because of a network failure, the JVM crashing, etc.) the monitors communicate this to the
others by gossiping, and likewise, if the node becomes reachable again everyone else learns about it through
gossip. This monitoring is done by sending heartbeat messages, and the interpretation of the arrival times of the
heartbeat messages is implemented in Akka following The Phi Accrual Failure Detector by Hayashibara et al.
The key formulation is this formula:
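Our rendering of the phi-accrual formulation, based on the paper and the Akka documentation, in LaTeX:

\varphi(t_{\mathrm{now}}) \;=\; -\log_{10}\bigl(1 - F(t_{\mathrm{now}} - t_{\mathrm{last}})\bigr)

where t_last is the arrival time of the most recent heartbeat.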
The function φ measures the level of suspicion (or, conversely, trust) that a given node is down, while the function F
is the cumulative distribution function of the historical heartbeat inter-arrival times. The formulation looks at historical
and current data to derive a level of suspicion for a node and so predict whether it is unreachable. You can control
the sensitivity of this detector by adjusting the threshold in your configuration, i.e. by adapting the value of
akka.cluster.failure-detector.threshold in application.conf to a value suitable for your production cluster environment;
another control you can apply is the key akka.cluster.failure-detector.acceptable-heartbeat-pause. You are invited to
read about the details in the Akka documentation.
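For instance, a conservative tuning might look like the following fragment of application.conf; the numbers here are purely illustrative, so check the defaults that ship with your Akka version before copying them:

akka.cluster.failure-detector {
  threshold = 12.0                  # higher = fewer false positives, slower detection
  acceptable-heartbeat-pause = 5 s  # tolerate longer pauses (e.g. GC) before suspecting a node
}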
Reactive Cluster Routers
Starting with Akka 2.2, a new kind of cluster-aware router was added to the family. This router is metrics-based, i.e. it
takes advantage of the metrics data read off the cluster to load-balance messages across the nodes, and it's known
as AdaptiveLoadBalancingRouter. The router consumes the metrics data (collected from the nodes in the cluster via
the gossip protocol) and forms weighted routees, i.e. each routee is given a weight based on its remaining capacity,
and messages are routed to the routee with the larger available capacity at the time of measurement. The following
metrics selectors are currently available: HeapMetricsSelector, CpuMetricsSelector,
SystemLoadAverageMetricsSelector and MixMetricsSelector; they weigh routees by the remaining capacity of the
JVM heap memory, the CPU (User + Sys + Nice + Wait on Linux systems) and the system load average respectively,
while MixMetricsSelector uses a combination of cpu + heap + system load. The routees associated with this router
are automatically registered or un-registered as nodes become reachable or unreachable.
However, starting with Akka 2.3, the AdaptiveLoadBalancingRouter is deprecated in favour of two routers which carry
over the adaptive load balancing while letting you choose the metrics selector (as described in the previous
paragraph) at router creation time, thereby increasing the flexibility of your cluster routing design:
AdaptiveLoadBalancingPool - Each router of this type owns its set of routees, and different routers do not
share routees. You might want to consider this type of router when you have a master node delegating
work to the other nodes in the cluster through the router.
AdaptiveLoadBalancingGroup - Routers of this type share their routees, even when the routers run on
different nodes in the cluster. You might want to consider deploying this type of router when you have
back-ends with front-ends acting as a bulkhead.
We won't delve too deeply into the usage of these new routers here and invite you to explore them in the Akka documentation; a configuration sketch follows below.
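To make that concrete, here is a sketch of how an adaptive, cluster-aware router can be declared purely in configuration, in the style of the Akka 2.3 deployment settings; the actor paths and numbers are our own placeholders:

akka.actor.deployment {
  /frontend/workerRouter {
    router = adaptive-group            # metrics-aware group router (Akka 2.3)
    metrics-selector = mix             # heap | cpu | load | mix
    nr-of-instances = 100
    routees.paths = ["/user/backend"]
    cluster {
      enabled = on
      use-role = backend
      allow-local-routees = off
    }
  }
}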
How events in the cluster work
When a cluster starts up, it creates many objects to help manage the state of the cluster, detecting whether members
are up or down, among other activities. In terms of membership detection, the approach is that an internal actor,
ClusterDaemon (which in turn is a child of another actor, ClusterCoreSupervisor), is the main party watching out
for things like membership changes through Akka's gossip protocol. Hence, when we say we want our actor
SimpleClusterListener to be the recipient of what goes on around the cluster through this expression
Cluster(system).subscribe(clusterListener, classOf[ClusterDomainEvent])
what effectively happens is that the ClusterDaemon actor delegates all events to a publisher actor,
ClusterDomainEventPublisher, which is responsible for pushing cluster-related events onto the event bus. In our
example, SimpleClusterListener is a subscriber of these events (which would otherwise, figuratively speaking, be
written to the void), and when events are published by the cluster, all subscribers are notified.
Returning from that digression, let's try starting up our cluster and have 2 more nodes join it, in turn.
Start up a cluster
Let's start up a cluster of three nodes, where two are seed nodes and the third node joins via either one of the
previously started seed nodes. Seed nodes are the entry points for other nodes joining the cluster; when a new node
joins, the seed nodes communicate amongst themselves to decide which one takes on the new node. When the new
node joins a particular seed node, you will notice a message that reflects this. Besides getting the relevant
dependencies ready in build.sbt, we need to provide the usual application.conf in src/main/resources with the
following contents:
akka {
  actor {
    provider = "akka.cluster.ClusterActorRefProvider"
  }
  remote {
    log-remote-lifecycle-events = off
    netty.tcp {
      hostname = "localhost"
      port = 0
    }
  }
  cluster {
    seed-nodes = [
      "akka.tcp://Cluster@localhost:2551",
      "akka.tcp://Cluster@localhost:2552"]
    auto-down = on
  }
  log-dead-letters-during-shutdown = off
}
To start this cluster up, we want the seed nodes to start first (before we allow other nodes to join) by issuing the
following commands in a terminal (notice that the port numbers correspond to those found in application.conf):
sbt "run-main simplecluster.SimpleCluster 2551"
sbt "run-main simplecluster.SimpleCluster 2552"
In the time between issuing the two commands, you will notice that the node listening on port 2551 keeps attempting
to connect to the other at port 2552 and fails until the other party starts up; depending on how far apart the two
commands were issued, you will also notice the leadership of the cluster swing from the node at port 2551 to the
node at port 2552. Next, you can bring up a new node, which will join either one of these seed nodes, by not
providing a port number (a random port will be assigned to this new node):
sbt "run-main simplecluster.SimpleCluster"
From this point onwards, you can add more nodes by issuing the above command in a terminal, and you can bring a
node down either by hitting ctrl-c in its terminal or programmatically via a call like
Cluster(system).leave(nodeAddress). An interesting thing you can do upon cluster startup is to run a piece of code
once the node is up, via an expression like Cluster(system).registerOnMemberUp(someComputation); we'll see in a
while how you might apply that idea in an application, and a small sketch follows below.
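As a minimal sketch of those last two calls, assuming system is the ActorSystem we created above:

import akka.cluster.Cluster

val cluster = Cluster(system)

// Run this block once *this* node's status becomes Up in the cluster.
cluster.registerOnMemberUp {
  println(s"${cluster.selfAddress} is now a full member of the cluster")
}

// Later, e.g. during a graceful shutdown, ask the cluster to remove us.
cluster.leave(cluster.selfAddress)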
A Distributed Computing Cluster Application
In this section, we are going to demonstrate how you can go about building a compute cluster that does one really
important thing: computing the value of π. If you think about a solution in Scala alone, you can develop a sequential
or a parallel approach; when you rethink the solution in Akka, you get creative and start thinking of a master-slave
model where a master actor collates the results of multiple slaves (basically parallel, but distributed); and finally, with
akka-cluster, you may choose to spread the computation over multiple nodes.
Let's start by outlining how the value of π can be calculated; we will gradually work from a sequential to a parallel
approach, then transition over to actors and finally a clustered approach. Calculating the value of π is
computationally intensive, or in other words, it is more CPU bound than I/O bound; though the idea of computing this
well known value may look trivial, getting it right may not be. The formula we will be using for this computation is this:
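This is the familiar Leibniz series, which the code below implements term by term (our LaTeX rendering):

\pi \;=\; 4\sum_{n=0}^{\infty} \frac{(-1)^{n}}{2n+1}
     \;=\; 4\left(1 - \tfrac{1}{3} + \tfrac{1}{5} - \tfrac{1}{7} + \cdots\right)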
It has the property of being embarrassingly parallel, which means that each part of the computation can be
completed in isolation, and it is on this formula that our code will be based.
A sequential approach
As always, we should establish a reference implementation against which to validate that we are doing the right
thing. As a simple example, we have encapsulated the computation of π in the function calculatePi, and the
computation within adopts a tail-recursive implementation which doesn't exhaust the stack, so we won't experience a
nasty StackOverflowError (admittedly, you can do this in other ways). Next, we provide an environment in which we
can observe and measure its execution by wrapping this function inside a Scala module, Pi. The code below
demonstrates how π can be calculated from a single executing thread, i.e. the main thread.
package pi_sequential

import scala.annotation.tailrec

object Pi {
  def calculatePi(start: Int, numberOfElements: Int): Double = {
    @tailrec
    def calculatePiFor(start: Int, limit: Int, acc: Double): Double =
      start match {
        case x if x == limit ⇒ acc
        case _ ⇒ calculatePiFor(start + 1, limit, acc + 4.0 * (1 - (start % 2) * 2) / (2 * start + 1))
      }
    calculatePiFor(start, start + numberOfElements - 1, 0.0)
  }

  def main(args: Array[String]): Unit = {
    val start = System.currentTimeMillis
    val numberOfIterations = 10000
    val numberOfElements = 10000
    var acc = 0.0
    for (i ← 0 until numberOfIterations)
      acc += calculatePi(i * numberOfElements, numberOfElements)
    println(s"\n\tpi approximation: ${acc}, took: ${System.currentTimeMillis - start} millis")
  }
}
When you invoke this application via sbt, you will get output resembling the following:
[info] Running pi_sequential.Pi
pi approximation: 3.1435501812459323, took: 739 millis
This is as simple as it gets, really. Compute-bound computations executed by a single thread are typically
inefficient in the sense that they don't take advantage of a modern machine's multi-core capabilities, and we can
improve upon that using the age-old technique known to most Java developers: running the computation on a pool
of threads.
A parallel approach via a Thread Pool
We know that calculating the value of π using the above formulation can be done in parallel and in isolation. What
this means is that we need an abstraction around each individual computation so that we can not only delay its
evaluation but also assign it to an executor service of our choice. The executor service is useful on modern machine
architectures where multiple CPU cores are available; it allows the developer to utilize the resources available on
the system, and in Scala one common technique is to create a pool of threads and, when work needs to be done,
simply delegate the task to that pool. We should caution the reader that there are tradeoffs in choosing between a
single-thread and a thread-pool strategy, and we invite readers to check out Java Concurrency in Practice by Brian
Goetz et al. for the details. The thread pool approach makes sense for a number of reasons, one of which we've
named already; it also works here because our tasks are homogeneous and independent (we'll see that soon), and
when there are more tasks than available threads, creating the pool up front shaves off some latency and resources,
since each computation picks any available thread from the same pool and we don't have to start and shut down
threads repeatedly. This strategy works well when the duration of each task is relatively short and you have a
considerable number of compute cores on the machine.
Hence, the first thing we need to do is identify the isolated task. In our case, we can use the previous example: if we
examine it closely, the expression 4.0 * (1 - (i % 2) * 2) / (2 * i + 1) comes to mind. This expression evaluates to a
Double, and we can delay its execution by reformulating it as a function of type () ⇒ Double. Knowing this, we can
collect all these individual expressions into a collection like List[() ⇒ Double]. This is not enough, though, because
by standard Scala rules we cannot dictate an execution preference to a plain Scala container, so we need another
abstraction that captures the function, i.e. () ⇒ Double, while allowing us to dictate our computational preference; we
call this abstraction a ThreadStrategy. Finally, we end up with an expression that reads like the following:
for {
  // iterate through all expressions
  i ← start until (start + numberOfElements - 1)
} yield executeStrategy.execute( () ⇒ 4.0 * (1 - (i % 2) * 2) / (2 * i + 1))
For the purpose of illustrating this example, we have created two strategies for executing functions: (a)
SameThreadStrategy and (b) ThreadPoolStrategy. The above functionality is encapsulated inside calculatePi, which
carries an implicit parameter indicating our preferred way of running the calculation. The following code shows our
approach:
def calculatePi(start: Int, numberOfElements: Int)
               (implicit executeStrategy: ThreadStrategy = SameThreadStrategy): Double = {
  val computations =
    for (i ← start until (start + numberOfElements - 1))
      yield executeStrategy.execute( () ⇒ 4.0 * (1 - (i % 2) * 2) / (2 * i + 1))
  computations.aggregate(0.0)( (acc, f) ⇒ f() + acc, _ + _ )
}
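The ThreadStrategy trait and its two strategies aren't shown at this point, so here is a minimal sketch of what we assume they look like, inferred from the signature above and from the thread-ID lines in the sample output; treat the details (pool size, logging) as illustrative:

import java.util.concurrent.{Callable, Executors}

// A strategy abstracts *where* a deferred Double computation runs.
trait ThreadStrategy {
  def execute(f: () ⇒ Double): () ⇒ Double
}

// Runs the function immediately on the calling thread.
object SameThreadStrategy extends ThreadStrategy {
  def execute(f: () ⇒ Double): () ⇒ Double = {
    val result = f()
    () ⇒ result
  }
}

// Submits the function to a fixed thread pool; the returned
// thunk blocks until the pooled computation has finished.
object ThreadPoolStrategy extends ThreadStrategy {
  private val pool = Executors.newFixedThreadPool(
    java.lang.Runtime.getRuntime.availableProcessors)

  def execute(f: () ⇒ Double): () ⇒ Double = {
    val future = pool.submit(new Callable[Double] {
      def call(): Double = {
        println(s"thread-ID: ${Thread.currentThread.getId}")
        f()
      }
    })
    () ⇒ future.get
  }
}

The key point is that execute defers the computation: SameThreadStrategy evaluates it right away on the caller, while ThreadPoolStrategy hands it to the pool and returns a thunk that blocks on the result when calculatePi aggregates.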
The function calculatePi is in turn wrapped inside a Scala module named Pi, in which we can measure and observe
its execution when given the thread pool versus the single thread; the following code demonstrates this:
override def main(args: Array[String]): Unit = {
  val start = System.currentTimeMillis
  val numberOfIterations = 10000
  val numberOfElements = 10000
  var acc = 0.0
  for (i ← 0 until numberOfIterations)
    acc += calculatePi(i * numberOfElements, numberOfElements)(ThreadPoolStrategy)

  println(s"\n\tpi approximation: ${acc}, took: ${System.currentTimeMillis - start} millis")
}
A run of this application via sbt gives output similar to the following (if you have a multi-core machine, you will notice
the console reporting different thread ids):
[info] Running pi_parallel.Pi
...
thread-ID: 477
thread-ID: 479
thread-ID: 478
thread-ID: 480
pi approximation: 3.1435501812459323, took: 41592 millis
The careful reader will notice that this took much longer than the sequential version, and the developer might be led
to believe that the sequential approach is significantly better! There are a number of plausible reasons for this: (a)
our algorithm implementation is probably not the most optimized, (b) the number of cores on the test machine is
small, (c) the speed of each core, and so on. As an example of how changing the algorithm can improve the run
times for approximating π, we can use a discrete approach, i.e. approximate the integral of the function whose area
is π by summing the rectangles under that function: we divide the area underneath the function into bins, partition the
data range so that each task processes its own bin, and aggregate all the partial sums once the tasks have
completed execution. That gives us a reasonably effective way of computing π and is commonly known as the
scatter-gather pattern; we'll see in the next section how to adapt this pattern to the actor model. The following formula
sums it up:
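Our LaTeX rendering of that idea, matching the (i + 0.5) * step midpoint used in the code below, with N bins of width Δx = 1/N:

\pi \;=\; \int_{0}^{1} \frac{4}{1 + x^{2}}\,dx
     \;\approx\; \sum_{i=0}^{N-1} \frac{4}{1 + x_i^{2}}\,\Delta x,
\qquad x_i = (i + 0.5)\,\Delta x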
Below is the code that illustrates the ideas we just talked about. We gave it another name,
calculatePiSlightlyClever, to contrast this approach with the one above, and you'll notice that we still model the
computation using the ThreadStrategy abstraction we created earlier.
def calculatePiSlightlyClever(start: Int, end: Int, step: Double)
                             (implicit executeStrategy: ThreadStrategy = SameThreadStrategy): Double = {
  val computations =
    for (i ← start until end)
      yield executeStrategy.execute { () ⇒
        val x = (i + 0.5) * step
        4.0 / (1.0 + x * x)
      }
  computations.aggregate(0.0)( (acc, f) ⇒ f() + acc, _ + _ )
}
This time, our runtime has improved significantly, by approximately six-fold, through selecting another algorithm and
applying simple task and data decomposition ideas to the calculation of π; a sample run of this computation shows:
[info] Running pi_parallel.PiSlightlyClever
pi approximation: 3.141592653589859, took: 955 millis
We caution the reader to be judicious when changing an algorithm, as it may result in a loss of precision; the best
advice is to consult subject matter experts and past and recent research. In our example, we experienced a small
loss of precision, and for our purposes in this book it's a tradeoff we can live with, but you might not have that luxury.
Thus far, we have explored a few ways to utilize the compute resources available on a single machine/node; let's
take a look at how we can model our problem with actors in the coming section.
An actors approach
In this section, we take a look at how we can implement this solution using actors. Considering that you've been
working through this book for a while now, you may already have ideas about how to go about it. Our approach
evolves from the previous section: we apply the scatter-gather pattern to actors. The term originated, to the best of
our knowledge, in vector/array addressing and has since been applied to many disciplines in computer science; the
Java fork-join model is probably the best analogy for understanding it, and our actor implementation follows the
same idea.
We are going to have a Master actor that is responsible for starting the Worker actors; when the master receives the
message Calculate, it hands each worker a payload encapsulated in a message called Work. Each Worker computes
its share of the value of π and returns the result by piggybacking the value on the message Result, and the master
keeps track of how many workers have returned and finally presents the result. The figure below
illustrates the scatter-gather approach.
Additionally, we want to know how long the computation took to complete, so the master actor records two
timestamps to measure the duration from its startup to the time all the workers have returned from their computation,
and we have a message that carries this, named ApproximatedPi. Here are the messages we use in this application:
sealed trait PiMessage
case object Calculate extends PiMessage
case class Work(start: Int, numberOfElements: Int) extends PiMessage
case class Result(value: Double) extends PiMessage
case class ApproximatedPi(pi: Double, duration: Duration)
and the Worker's implementation is shown as follows:
class Worker extends Actor {
  @tailrec
  private def calculatePiFor(start: Int, limit: Int, acc: Double): Double =
    start match {
      case x if x == limit ⇒ acc
      case _ ⇒ calculatePiFor(start + 1, limit, acc + 4.0 * (1 - (start % 2) * 2) / (2 * start + 1))
    }

  def receive = {
    case Work(start, numberOfElements) ⇒
      sender ! Result(calculatePiFor(start, start + numberOfElements - 1, 0.0))
  }
}
Next is the Master actor's implementation, which takes care of measuring the duration of the entire computation;
once all the results have been aggregated, we notify the listener, which simply prints out the result while the master
shuts down. All this code should be relatively familiar to you by now, and you may want to refresh yourself on the
chapters on Working with Actors and Routers.
class Master(numberOfWorkers: Int, numberOfMessages: Int, numberOfElements: Int, listener: ActorRef)
  extends Actor {

  var pi: Double = _
  var numberOfResults: Int = _
  val start: Long = System.currentTimeMillis
  val workerRouter = context.actorOf(
    Props[Worker].withRouter(RoundRobinRouter(numberOfWorkers)), name = "routerForWorkers")

  def receive = {
    case Calculate ⇒
      for (i ← 0 until numberOfMessages) workerRouter ! Work(i * numberOfElements, numberOfElements)
    case Result(value) ⇒
      pi += value
      numberOfResults += 1
      numberOfResults match {
        case x if x == numberOfMessages ⇒
          listener ! ApproximatedPi(pi, duration = (System.currentTimeMillis - start).millis)
          context.stop(self)
        case _ ⇒
      }
  }
}

class Listener extends Actor {
  def receive = {
    case ApproximatedPi(pi, duration) ⇒
      println(s"\n\tPi approximation: $pi, took: $duration ")
      context.system.shutdown()
  }
}
Other than keeping track of how many workers have completed their job, the master is also responsible for shutting
down all supervised actors after sending the result, ApproximatedPi, to the listener (another actor, which shuts down
the system after printing the result); the master shuts down by stopping itself through context.stop(self), and that
cascades the shutdown to the other actor it supervises, workerRouter.
Let's take this for a spin! A sample run on our test machine via sbt produces the following output (depending on
your machine specifications, your runtime may differ):
[info] Running pi_actors.Pi
Pi approximation: 3.1435501812459323, took: 456 milliseconds
A clustered approach
Earlier in this chapter, we covered some of the mechanics of an Akka cluster; here we're going to demonstrate how
to adapt the design of the actors approach to a clustered approach. There are several ways to design a solution to
this problem of computing the value of π; the approach we'd like to show you is to have a number of nodes start up
in our cluster and, once the cluster is ready, have a front-end service begin serving compute requests to the cluster,
with the cluster picking up jobs in FIFO order.
In our design, the cluster has two basic components: (a) a front-end service and (b) a back-end service. The
front-end service is responsible for serving requests to compute π using our back-end service, and within the back
end we reuse the scatter-gather design from the previous section. To help our front-end service manage the back
end's availability, we apply a simple coordination strategy to mark a backend as busy or idle. The backend service,
PiBackend, has two components: (a) a cluster-aware actor and (b) a Scala module that drives the creation of the
cluster; we take the same approach for the frontend service, named PiFrontend.
object PiBackend {
  def main(args: Array[String]): Unit = {
    // Override the configuration of the port when specified as program argument
    val config =
      (if (args.nonEmpty) ConfigFactory.parseString(s"akka.remote.netty.tcp.port=${args(0)}")
       else ConfigFactory.empty).withFallback(
        ConfigFactory.parseString("akka.cluster.roles = [backend]")).
        withFallback(ConfigFactory.load())
    val system = ActorSystem("ClusterSystem", config)
    system.actorOf(Props[PiBackend], name = "backend")
  }
}
The above driver for our backend starts our cluster-backed actor on the designated port (read in from the command
line), or on a randomly assigned port on the designated host. Simple. What happens next is a little more involved
(see the accompanying code following this paragraph): when the actor PiBackend starts up, it obtains the cluster
through Cluster(context.system) and assigns the returned value to an immutable value conveniently called cluster.
During startup it also registers itself to listen for when a new node joins the cluster and is ready to receive requests,
i.e. MemberUp, in the preStart function. When the cluster experiences changes (members come and go, or become
unreachable) a CurrentClusterState message is received, and for every node that is a member of the cluster and is
up (i.e. MemberUp) we notify the frontend of this node's presence by passing that member to our function register;
we'll see in a while why this is important. Thus far, we've described the mechanisms employed when building a
typical cluster, and we present the code:
class PiBackend extends Actor {
  var originalSender: ActorRef = _
  var jobDispatcher: ActorRef = _
  var currentJob: CalculatePi = _
  var pi: Double = _
  var numberOfResults: Int = _
  val start: Long = System.currentTimeMillis
  val numberOfWorkers = java.lang.Runtime.getRuntime.availableProcessors
  lazy val workerRouter = context.actorOf(
    Props[Worker].withRouter(RoundRobinRouter(numberOfWorkers)), name = "routerForWorkers")
  val cluster = Cluster(context.system)

  // subscribe to cluster changes, MemberUp
  // re-subscribe when restart
  override def preStart(): Unit = cluster.subscribe(self, classOf[MemberUp])

  def receive = {
    case (job: CalculatePi, orgSender: ActorRef) ⇒
      originalSender = orgSender // capture the original sender i.e. the module PiFrontend
      jobDispatcher = sender     // capture the dispatcher i.e. the actor PiFrontend
      currentJob = job           // capture the job the dispatcher wants me to do
      for (i ← 0 until job.numberOfMessages) workerRouter ! Work(i * job.numberOfElements, job.numberOfElements)
    case Result(value) ⇒
      pi += value
      numberOfResults += 1
      numberOfResults match {
        case x if x == currentJob.numberOfMessages ⇒
          originalSender ! ApproximatedPi(pi, duration = (System.currentTimeMillis - start).millis)
          jobDispatcher ! ((currentJob, Idle()))
          context.stop(self)
        case _ ⇒
      }
    case state: CurrentClusterState ⇒
      state.members.filter(_.status == MemberStatus.Up) foreach register
    case MemberUp(m) ⇒ register(m)
  }

  def register(member: Member): Unit =
    if (member.hasRole("frontend"))
      context.actorSelection(RootActorPath(member.address) / "user" / "frontend") ! BackendRegistration
}
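The cluster messages themselves aren't listed at this point in the chapter; here is a minimal sketch of what we assume they look like, inferred purely from how they are used in PiBackend above and PiFrontend below:

// Hypothetical message protocol, inferred from its usage in the surrounding code.
case class CalculatePi(numberOfMessages: Int, numberOfElements: Int)
case class CalculatePiFailed(reason: String, job: CalculatePi)
case object BackendRegistration

// Simple availability states the frontend tracks per backend.
sealed trait State
case class Idle() extends State
case class Busy() extends State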
Next, our backend service needs to know how to do two things: (a) compute π, and (b) communicate the results back
to the frontend service. This functionality is encapsulated in the receive function (as seen above), and now we need
to look at how our frontend service functions in relation to our backend service. Recall that we want the frontend
service to serve compute requests to our cluster, and we should not send a request to a backend before it has
finished the previous one. Another way to think about this is that the frontend, PiFrontend, is a gatekeeper of sorts for
our cluster.
PiFrontend attempts to send the job, CalculatePi, to our cluster by checking whether any backends are registered; it
knows this because each PiBackend sends the message BackendRegistration when it comes up in the cluster (see
the earlier code). Through this mechanism, PiFrontend is watching the multitude of PiBackends; in particular it
watches for any backend leaving the cluster or going down by keeping an eye out for the Terminated message, and
our reaction to that is to remove that particular node. Any new node is, by default, given the state Idle and transitions
to Busy when it is working on a compute request. The following code expresses these ideas:
class PiFrontend extends Actor {
  var backends = IndexedSeq.empty[(ActorRef, State)]
  var jobCounter = 0

  def receive = {
    case job: CalculatePi if backends.isEmpty ⇒
      sender ! CalculatePiFailed("Service unavailable, try again later", job)
    case job: CalculatePi ⇒
      jobCounter += 1
      // NOTE: for simplicity we assume at least one idle backend here
      val availBEs = backends.filter(_._2 == Idle())           // find `idle` backends
      val (backend, _) = availBEs(jobCounter % availBEs.size)  // pick one of them
      backend ! ((job, sender))                                // send job to the `Idle` backend
      val index = backends.indexWhere(_._1 == backend)
      backends = backends.updated(index, backend → Busy())     // mark backend as `Busy`
    case (_: CalculatePi, s: State) ⇒
      val index = backends.indexWhere(_._1 == sender)          // find the backend that replied
      backends = backends.updated(index, sender → Idle())      // mark backend as `Idle` again
    case BackendRegistration if !backends.exists(_._1 == sender) ⇒
      context watch sender
      backends = backends :+ (sender → Idle())
    case Terminated(a) ⇒
      backends = backends.filterNot(_._1 == a)
  }
}
We've seen how our frontend attempts to dispatch jobs to available (i.e. Idle) backends for processing, but we've not
seen how those requests are generated. We want to launch that avalanche of requests only when the cluster is up
(i.e. all nodes are in the Up state), and we handle this by configuring akka.cluster.min-nr-of-members to 3 in our
application.conf. Finally, we use a convenience method, registerOnMemberUp, to place our launch code, which
essentially lifts our code into a java.lang.Runnable:
object PiFrontend {
  def main(args: Array[String]): Unit = {
    // Override the configuration of the port when specified as program argument
    val config =
      (if (args.nonEmpty) ConfigFactory.parseString(s"akka.remote.netty.tcp.port=${args(0)}")
       else ConfigFactory.empty).withFallback(
        ConfigFactory.parseString("akka.cluster.roles = [frontend]")).
        withFallback(ConfigFactory.load())
    val system = ActorSystem("ClusterSystem", config)
    val frontend = system.actorOf(Props[PiFrontend], name = "frontend")
    val cluster = Cluster(system)
    cluster.subscribe(frontend, classOf[ClusterDomainEvent])
    cluster.registerOnMemberUp {
      import scala.util.Random._
      import system.dispatcher
      implicit val timeout = Timeout(10 seconds)
      // Depending on how fast your cluster computes each computation, not all jobs may get to
      // run on the cluster. For purpose of illustration, we'll vary the computation length
      for (n ← 1 to 5) {
        (frontend ? CalculatePi(nextInt(10000), nextInt(10000))) onSuccess {
          case result ⇒ println(result)
        }
        // wait a while until next request,
        // to avoid flooding the console with output
        Thread.sleep(2000)
      }
    }
  }
}
And here's the configuration in our application.conf:
akka {
  actor {
    provider = "akka.cluster.ClusterActorRefProvider"
  }
  remote {
    log-remote-lifecycle-events = off
    netty.tcp {
      hostname = "localhost"
      port = 0
    }
  }
  cluster {
    seed-nodes = [
      "akka.tcp://ClusterSystem@localhost:2551",
      "akka.tcp://ClusterSystem@localhost:2552"]
    auto-down = on
    min-nr-of-members = 3
  }
  log-dead-letters-during-shutdown = off
}
To run our example, open three terminals and execute the following commands in each terminal, in turn:
sbt "run-main pi_cluster.PiBackend 2551"0001 sbt "run-main pi_cluster.PiBackend 2552"
0002 sbt "run-main pi_cluster.PiFrontend"
When the Akka cluster framework detects that all members of the cluster are up i.e. akka.cluster.min-nr-of-members,
it'll run the code that we registered via registerOnMemberUp and here's a sample output captured from the terminal
hosting PiFrontend:
CalculatePiFailed(Service unavailable, try again later,CalculatePi(3351,4383))
ApproximatedPi(3.1935353605129233,8057 milliseconds)
ApproximatedPi(3.1438214359957004,11173 milliseconds)
CalculatePiFailed(Service unavailable, try again later,CalculatePi(6921,1532))
CalculatePiFailed(Service unavailable, try again later,CalculatePi(4597,4886))
The following diagram illustrates schematically how our cluster looks and the kinds of messages transmitted within it.
Our design of choice naturally has deficiencies, and for the purpose of this book we've ignored quite a bit; when
using clusters in a production environment, you would want to give it further thought and ask questions like:
Do my application(s) really need a cluster?
Our example could be accomplished through the actors approach too, but depending on the
requirements of the business, you may wish to ensure greater fault tolerance via Akka clusters.
How does one package and deploy the cluster in production? Automated? Manual is out of the question, of
course.
We barely covered that, since it is really a topic that involves organizational processes and
systems architecture, with a wide plethora of approaches and tools.
How do we manage failures in application/business logic?
In our example, we would like our cluster to shelve requests until resources become
available, but we didn't do that.
How do we manage the cluster administratively?
How do we make the cluster react to load? In particular, how do we scale up or down?
That last question is a topic we did not cover in this book, and we encourage readers to explore one particular facet
of the problem by employing the cluster-aware routers in the akka.cluster.routing package. We outlined the new
routers available in Akka 2.3, i.e. AdaptiveLoadBalancingGroup and AdaptiveLoadBalancingPool, in a previous
section, and they certainly allow a good level of granularity in your current or future cluster designs.
Testing the cluster approach
We are going to leverage the concepts developed in the chapter on Testing, in particular the multi-node testkit
(akka-multi-node-testkit), and apply them to our cluster design. The two test cases we want to validate are:
A frontend can be started within a time frame (with no backends), and issuing a request to compute Pi will
inevitably fail
A frontend and two backends are started, and issuing a compute Pi request will eventually succeed
(once the backends have registered with the frontend, as designed)
Hence, we know that we need three nodes (i.e. 1 frontend, 2 backends), and in multi-node testing we need to
assign a role to each of these nodes and provide as many concrete MultiJvm spec classes as there are nodes
required by the test, i.e. 3. The following demonstrates this:
object PiClusterSpecConfig extends MultiNodeConfig {
  // register the named roles (nodes) of the test
  val frontend1 = role("frontend1")
  val backend1 = role("backend1")
  val backend2 = role("backend2")
  ...
  nodeConfig(frontend1)(
    ConfigFactory.parseString("akka.cluster.roles = [frontend]"))

  nodeConfig(backend1, backend2)(
    ConfigFactory.parseString("akka.cluster.roles = [backend]"))
}

class PiClusterSpecMultiJvmNode1 extends PiClusterSpec
class PiClusterSpecMultiJvmNode2 extends PiClusterSpec
class PiClusterSpecMultiJvmNode3 extends PiClusterSpec
No surprises there, we hope. Next, we would like to start the first frontend, which will be used by the backends in the
subsequent test, and we'd like to make sure it can be started up. The test issues a CalculatePi request when no
backend exists yet; this will fail, and a CalculatePiFailed message will be caught, as the following illustrates:
"start first frontend" in within(15 seconds) {0001 runOn(frontend1) {
0002 // this will only run on the 'first' node
0003 Cluster(system) join node(frontend1).address
0004
0005 val piFrontend = system.actorOf(Props[PiFrontend], name = "frontend")
0006 piFrontend ! CalculatePi(10000, 10000)
0007
0008 expectMsgPF() {
0009 // no backends yet, service unavailble
0010 case CalculatePiFailed(_, CalculatePi(10000,10000))
0011 }
0012 }
0013 // this will run on all nodes
0014 // use barrier to coordinate test steps
0015 testConductor.enter("frontend1-started")
0016 }
The next test builds on the created frontend: we start our backends, i.e. PiBackend, lodging them on the two nodes
we defined earlier, and issue a CalculatePi request. This time round, if the test does not exceed the time frame (i.e.
20 seconds) and the computation completes, a message with the value 3.2454043060553874 is returned, as
expected. The following code encapsulates these ideas:
"start two backends which automatically registers, verify service" in within(20 seconds) {0001 runOn(backend1) {
0002 Cluster(system) join node(frontend1).address
0003 system.actorOf(Props[PiBackend], name = "backend")
0004 }
0005 testConductor.enter("backend1-started")
0006
0007 runOn(backend2) {
0008 Cluster(system) join node(frontend1).address
0009 system.actorOf(Props[PiBackend], name = "backend")
0010 }
0011 testConductor.enter("backend2-started")
0012
0013 runOn(frontend1) {
0014 assertServiceOk()
0015 }
0016
0017 testConductor.enter("all-ok")
0018 }
To run this, we issue the test command in sbt, and it reflects the execution of the two tests in their written order
(exiting with the message: All tests passed). A sample output of that run looks like this:
[info] * pi_cluster.PiClusterSpec
[JVM-1] Run starting. Expected test count is: 2
[JVM-1] PiClusterSpecMultiJvmNode1:
[JVM-1] The compute `Pi` service
[JVM-2] Run starting. Expected test count is: 2
[JVM-2] PiClusterSpecMultiJvmNode2:
[JVM-3] Run starting. Expected test count is: 2
[JVM-3] PiClusterSpecMultiJvmNode3:
[JVM-2] The compute `Pi` service
[JVM-3] The compute `Pi` service
[JVM-1] - must start first frontend
[JVM-2] - must start first frontend
[JVM-3] - must start first frontend
[JVM-1]
[JVM-1] No backend is up to service request
[JVM-1]
[JVM-1] !!!BACKEND REGISTERED!!!
[JVM-1]
[JVM-1] !!!BACKEND REGISTERED!!!
[JVM-1]
[JVM-1] YES!!! 3.2454043060553874
[JVM-1] - must start two backends which automatically registers, verify service
[JVM-3] - must start two backends which automatically registers, verify service
[JVM-2] - must start two backends which automatically registers, verify service
Next, we will look at how our design can be adapted to the Play framework; in particular, how we can front our Akka cluster with a Play instance and trigger these computations through an HTTP interface.
Integrating with Play 2.2.x Framework
The Play Framework is an advanced HTTP server implementation that is lightweight and stateless, and it is built on top of Akka. It supports asynchronous (i.e. non-blocking) I/O, streams, WebSockets, and Ajax, treats JSON as a first-class citizen, and is RESTful by default. When you combine that with the capabilities of Akka as a whole, you quickly realize how wide the design space becomes, and with that choice comes agility in design.
Previously, we had a configuration of three (3) nodes in which the frontend, PiFrontend, served requests via two backends, PiBackend, and that worked out fine. Quite possibly, in the not-so-distant future, you might want to experiment with building out the design so that a Play instance plays the role of the frontend and serves requests to your Pi cluster. There are a few ways you can do this; for the purpose of this book, we would like to show you how to reuse most of the components illustrated previously in the simplest manner.
Building out the Play Controller
Play adopts an MVC (model-view-controller) architecture. When a Play instance starts up, it loads the controller code, and its life begins; our frontend actor, PiFrontend, starts up upon receiving the first HTTP GET request. The controller does this by creating an ActorSystem with its configuration, and when the actor starts up, it joins the backend, which would have been started already. As before, the backends register themselves with the frontend; therefore, the first calculatePi request would inevitably fail, but subsequent requests should succeed. The following code demonstrates how our actor code is part of the controller and how a cluster is subsequently started.
package controllers

import akka.actor.{ActorRef, ActorSystem, Props}
import akka.cluster.Cluster
import akka.cluster.ClusterEvent.ClusterDomainEvent
import com.typesafe.config.{Config, ConfigFactory}
import play.api.mvc._

object Application extends Controller {
  import play.api.Play.current
  import pi_cluster._

  def loadClusterConfiguration(): Config = ConfigFactory.load("ClusterSystem.conf")
  def createActorSystem(configuration: Config) = ActorSystem.create("ClusterSystem", configuration)
  def createFrontendService(actorSystem: ActorSystem) = actorSystem.actorOf(Props[PiFrontend], "frontend")
  def createClusterAndWatch(actorSystem: ActorSystem)(service: ActorRef) =
    Cluster(actorSystem).subscribe(service, classOf[ClusterDomainEvent])

  val actorSystem = createActorSystem(loadClusterConfiguration())
  val frontend = createFrontendService(actorSystem)
  createClusterAndWatch(actorSystem)(frontend)

  // more code omitted
  def calculatePi() = { .... }
}
Next, calculatePi is the function invoked when an HTTP GET request arrives; we would like to forward the request to the backend and return the result to the requester. In Play, all requests are handled by an Action, but using that directly leads to synchronous code, which would not go well if our computation takes longer than expected. So we use another form, Action.async, which improves concurrency and returns an asynchronous result wrapped in a Future, i.e. Future[Result]. To guard against a longer-than-expected computation, we apply a common Play idiom and time out after a pre-defined duration (we use 5 seconds), returning to the requester either an error or the computed value of Pi. The following code, calculatePi, in the controller captures these ideas:
def calculatePi() = Action.async {
  import scala.util.Random._
  import scala.concurrent.Future
  // the ask pattern (?) below assumes akka.pattern.ask and an implicit
  // akka.util.Timeout are in scope, as is Play's default execution context
  val limit = 10000
  val catchAll = play.api.libs.concurrent.Promise.timeout("doh!", 5 seconds)
  val f = frontend ? pi_cluster.CalculatePi(nextInt(limit), nextInt(limit))
  Future.firstCompletedOf(Seq(f, catchAll)) map {
    case msg: String                     => NotFound
    case f: pi_cluster.CalculatePiFailed => Ok(f.reason)
    case data: pi_cluster.ApproximatedPi => Ok(data.pi.toString)
  }
}
Now that we have our function to be invoked, we need to provide a mapping via a routing file; this mapping maps an HTTP method (e.g. GET, POST, PUT, DELETE) and path to a particular implementation in the controller, and in our example we would like to invoke it through the URI http://<somehost>:<someport>/calculatePi. In Play, you do this by providing a routing file at conf/routes in the root directory of our Play application, i.e. play_with_akka_cluster, with the following contents:
GET     /                   controllers.Application.index
GET     /calculatePi        controllers.Application.calculatePi
GET     /assets/*file       controllers.Assets.at(path="/public", file)
A sample run
To get our example running, we need a few things ready; let's walk through them. First, get the code for the book and place it in a directory of your choice. Next, we need to build the code for this section, but before doing that, we must make the artifacts of our dependency available to our controller by publishing them locally, like this:
ch10-akka-cluster $> sbt publish-local
This deposits those artifacts in the local repository. To reference that dependency from the Play project, we include it in our sbt build file, build.sbt, like this:
libraryDependencies ++= Seq(
  "ch10-akka-cluster" %% "ch10-akka-cluster" % "1.0",
  ...
Finally, we build the controller logic against those published artifacts by launching sbt and issuing a compile command:
[info] Set current project to play_with_akka_cluster (in build file:/Users/tayboonl/akka-edge/ch10-akka-cluster/play_with_akka_cluster/)
[success] Total time: 1 s, completed Nov 12, 2013 1:52:04 PM
[play_with_akka_cluster] $
That done, we are ready to launch our revised application. To do that, we start two instances of PiBackend and then the Play instance (which loads our controller, Application) by issuing the following commands in three separate terminals:
ch10-akka-cluster $> sbt "run-main pi_cluster.PiBackend 2551"
ch10-akka-cluster $> sbt "run-main pi_cluster.PiBackend 2552"
ch10-akka-cluster/play_with_akka_cluster $> sbt run
As before, the backends connect to one another and our cluster, ClusterSystem, is formed. Next, we issue an HTTP GET request, for example using the UNIX command curl (e.g. curl localhost:9000/calculatePi), and eventually you receive a message with the computed value of Pi.
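For example, assuming the Play instance listens on the default port 9000, a session might look roughly like this; the value shown is simply the result we saw in the earlier cluster test and is purely illustrative, since each request computes Pi from randomly chosen parameters:
$ curl localhost:9000/calculatePi
3.2454043060553874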
Appendix A: Further Reading
Here are some resources that might help you out as you learn Akka and explore the fantastic world of Scala
development.
Books on Scala
Programming in Scala, Second Edition, Martin Odersky, Lex Spoon, and Bill Venners, Artima, 2010.
There's simply no more complete book available for understanding the entirety of the Scala programming
language.
Scala for the Impatient, Cay Horstmann, Addison-Wesley, 2012.
This is an excellent introduction to Scala for experienced programmers who just need to hit the ground running
quickly.
Scala in Depth, Joshua D. Suereth, Manning, 2012.
When you're ready to dig in to the true magic of Scala in a way that's clear and well-informed, you should not
hesitate to pick up Josh's book to be your guide.
Books on Akka
Akka Concurrency: Building reliable software in a multi-core world, Derek Wyatt, Artima, 2013.
This should probably be the next book you read, after you've had some time to explore Akka a bit more. Wyatt
goes into great depth to explain the details you need to know, while informing it from hard-won experience.
Akka in Action, Raymond Roestenburg and Rob Bakker, Manning, 2013.
I have only read a few chapters of this upcoming book, but from what I've seen, it will be another valuable
addition to any aspiring Akka engineer's arsenal of knowledge.
Blogs, blog posts, and articles
Scala for Java Refugees, Daniel Spiewak
If you're coming from the Java world, this is a good place to start. It's a few years old, yet still relevant and
useful for getting a quick taste of Scala.
A Scala Tutorial for Java programmers, Michel Schinz and Philipp Haller
This tutorial is also, as the title makes clear, intended for Java developers. It was my introduction to Scala and,
while it still took me a while to get my full bearings, it laid the groundwork that I'm still building upon.
Let it Crash, Akka Team
This is the blog maintained by the Akka core team, featuring news about upcoming releases, highlights of new
features, occasional guest posts, and all sorts of other goodness. You should definitely be keeping an eye on
this site.
Appendix B: Schedulers
Akka provides a fairly simple but capable task scheduling mechanism. The timing is based on Netty's HashedWheelTimer, which on each tick simply checks which tasks are overdue. This does not offer high precision, but it works for the large majority of cases where tasks simply need to be run on a regular basis.
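If the default resolution is too coarse for your needs, the scheduler's tick can be tuned through configuration when the ActorSystem is created. The following is a minimal sketch only; the keys come from Akka's reference configuration, but you should verify the names and defaults against the reference.conf of the Akka version you are using:
import akka.actor.ActorSystem
import com.typesafe.config.ConfigFactory

// override the scheduler's tick resolution and wheel size
val tunedConfig = ConfigFactory.parseString(
  """
  akka.scheduler.tick-duration = 10ms
  akka.scheduler.ticks-per-wheel = 512
  """).withFallback(ConfigFactory.load())

val tunedSystem = ActorSystem("TunedScheduler", tunedConfig)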
The scheduler is capable of either sending a message to a designated actor, running a simple function, or running
an instance of java.lang.Runnable. Further, it can perform these functions at either set intervals (along with an
optional initial delay before the first execution) or after a single interval. A few simple examples will help to illuminate
the workings of the scheduler.
Also, like futures, the scheduler requires an implicit ExecutionContext to be in scope. The simplest way to provide this is to import the actor system's dispatcher into the current scope. You can, of course, use an alternative dispatcher, as described in chapter six.
import akka.actor.{Actor, ActorSystem, Props}
import scala.concurrent.duration._

// an actor system to host the example
val system = ActorSystem("SchedulerExample")
import system.dispatcher

val actor = system.actorOf(Props(new Actor {
  def receive = {
    case msg => println(msg)
  }
}))
system.scheduler.schedule(0 seconds, 10 seconds, actor, "Hello!")
The example above will print the message Hello! every ten seconds, starting from the point when the code is executed, since a zero-second initial delay is specified. To send the same message just once, after a fifteen-second delay, use the following code:
system.scheduler.scheduleOnce(15 seconds, actor, "Hello!")
As mentioned earlier, you can also schedule plain functions or Runnables, using either schedule or scheduleOnce.
// 'log' is assumed to be a LoggingAdapter in scope, e.g. the system's log
val log = system.log

// execute this function every second with an initial 5 second delay
system.scheduler.schedule(5 seconds, 1 second) {
  log.info("tick!")
}
system.scheduler.scheduleOnce(60 seconds) {
  log.info("One minute is gone!")
}

val runnable = new Runnable {
  def run() {
    log.info("tock!")
  }
}
system.scheduler.schedule(5 seconds, 1 second, runnable)
system.scheduler.scheduleOnce(60 minutes) {
  log.info("Another hour gone.")
}
Each of these calls to schedule or scheduleOnce returns a Cancellable object, which defines the cancel and isCancelled methods. You can cancel a scheduled task using cancel and check whether it has already been cancelled using isCancelled.
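As a minimal sketch, reusing the actor and dispatcher from the earlier examples, holding on to that Cancellable looks like this:
// keep the handle returned by schedule so the task can be stopped later
val ticking = system.scheduler.schedule(0 seconds, 1 second, actor, "Hello!")
// ... some time later, when the periodic task is no longer needed:
ticking.cancel()
assert(ticking.isCancelled)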
I've found that it's often useful to have an actor perform some tasks at a regular interval. A first approach to
implementing this would likely involve creating the actor and then sending a message (we'll call it Tick for this
scenario) to the actor at specified intervals using schedule. The problem with this approach is that if the actor wants
to interrupt the scheduled ticking, it needs to have a reference to the Cancellable object for that task. But this can
sometimes be awkward to do in a clean manner.
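To make that awkwardness concrete, here is a rough sketch of the first approach; the TickHandler actor and the idea of mailing the Cancellable back to it are our own illustration, not code from the example project:
import akka.actor.{Actor, ActorSystem, Cancellable, Props}
import scala.concurrent.duration._

case object Tick
case object StopTicking

class TickHandler extends Actor {
  var ticking: Option[Cancellable] = None
  def receive = {
    case c: Cancellable => ticking = Some(c)   // the handle arrives after scheduling
    case Tick           => // do the periodic work here
    case StopTicking    => ticking.foreach(_.cancel())
  }
}

val system = ActorSystem("AwkwardTicking")
import system.dispatcher

val handler = system.actorOf(Props[TickHandler], "tick-handler")
val cancellable = system.scheduler.schedule(1 second, 1 second, handler, Tick)
handler ! cancellable  // awkward: the actor only learns about its own schedule afterwards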
A cleaner approach that gives you more control is to use scheduleOnce and then, in the block that handles the Tick
message, have it schedule the next Tick.
import akka.actor.{Actor, Props}
import scala.concurrent.duration._

case object Tick

class TickingActor extends Actor {

  // shortcut
  val system = context.system
  // the scheduler needs an implicit ExecutionContext in scope
  import context.dispatcher

  override def preStart() = {
    system.scheduler.scheduleOnce(1 second, self, Tick)
  }

  def receive = {
    case Tick => {
      system.scheduler.scheduleOnce(1 second, self, Tick)
      // do whatever work needs to be done
    }
  }
}
If you find for some reason you need to wait to schedule the next Tick message until after the work has been done, you should use a try/finally structure and place the call to scheduleOnce inside the finally block to prevent it from failing to get scheduled should any errors occur.
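A minimal sketch of that variant, keeping the same TickingActor shape as above, might look like this:
def receive = {
  case Tick =>
    try {
      // do whatever work needs to be done
    } finally {
      // even if the work above throws, the next Tick still gets scheduled
      system.scheduler.scheduleOnce(1 second, self, Tick)
    }
}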