ArticlePDF Available

About Automatic Benchmarking of IaaS Cloud Service Providers for a World of Container Clusters

Authors:

Abstract and Figures

Cloud service selection can be a complex and challenging task for a cloud engineer. Most current approaches try to identify a best cloud service provider by evaluating several relevant criteria like prices, processing, memory, disk, network performance, quality of service and so on. Nevertheless, the decision making problem involves so many variables, that it is hard to model it appropriately. We present an approach that is not about selecting a best cloud service provider. It is about selecting most similar resources provided by different cloud service providers. This fits much better practical needs of cloud service engineers. Especially, if container clusters are involved. EasyCompare, an automated benchmarking tool suite to compare cloud service providers, is able to benchmark and compare virtual machine types of different cloud service providers using an Euclidian distance measure. It turned out, that only 1% of theoretical possible machine pairs have to be considered in practice. These relevant machine types can be identiífied by systematic benchmark runs in less than three hours. We present some expectable but also astonishing evaluation results of EasyCompare used to evaluate two major and representative public cloud service providers: Amazon Web Services and Google Compute Engine.
Content may be subject to copyright.
Columbia International Publishing
Journal of Cloud Computing Research
(2015) Vol. 1 No. 1 pp. 16-34
Research Article
About Automatic Benchmarking of IaaS Cloud Service
Providers for a World of Container Clusters
Nane Kratzke1, Peter-Christian Quint1*
Received 9 April 2015; Published online 13 June 2015
© The author(s) 2015. Published with open access at www.uscip.us
Abstract
Cloud service selection can be a complex and challenging task for a cloud engineer. Most current approaches try
to identify a best cloud service provider by evaluating several relevant criteria like prices, processing, memory,
disk, network performance, quality of service and so on. Nevertheless, the decision making problem involves
so many variables, that it is hard to model it appropriately. We present an approach that is not about selecting
a best cloud service provider. It is about selecting most similar resources provided by different cloud service
providers. This 󰅮its much better practical needs of cloud service engineers. Especially, if container clusters are
involved. EasyCompare, an automated benchmarking tool suite to compare cloud service providers, is able to
benchmark and compare virtual machine types of different cloud service providers using an Euclidian distance
measure. It turned out, that only 1% of theoretical possible machine pairs have to be considered in practice.
These relevant machine types can be identi󰅮ied by systematic benchmark runs in less than three hours. We
present some expectable but also astonishing evaluation results of EasyCompare used to evaluate two major
and representative public cloud service providers: Amazon Web Services and Google Compute Engine.
Keywords: Cloud service selection; IaaS; Virtual machine; Performance; Cloud computing; Similarity measure;
Performance vector; Benchmark; Container cluster
1. Introduction
Cloud service selection can be a complex and challenging task for a cloud engineer. According to Sun et al. [59],
there exist several research directions like analytic hierarchy processes [16], utility function based method-
ologies [23], outranking approaches [30], simple additive weighting approaches[1], cloud feature models [48],
dynamic programming approaches [9], cloud service indexing [54], ranking approaches [29], Markov decision
process [63] or even astonishing combinations of Fuzzy logic, evidence theory and game theory [17] to support
this task. But all of theses mentioned approaches seem to have the tendency to end in complex mathematical
models. Due to this mathematical complexity these approaches seem to overstrain cloud engineers acceptance.
These approaches are often exaggerated or not well accepted by practioners. In practice, cloud service selection
is most of the time not realized very systematically. According to our experience, cloud service selection is more
*Corresponding email: nane.kratzke@󰅮h-luebeck.de
1 Lübeck University of Applied Sciences, Department of Electrical Engineering and Computer Science, Center of Excellence
CoSA
16
Nane Kratzke and Peter-Christian Quint / Journal of Cloud Computing Research
(2015) Vol. 1 No. 1 pp. 16-34
an evolutionary process. So often, the 󰅮irst cloud service provider is not selected systematically, it is just a ”mean
available”.
Therefore, our guiding question in this paper is an engineering one. It is not about selecting virtual machines
from a best cloud service provider, but it is about selecting the most similar cloud virtual machines provided
by different cloud service providers. Container clusters like Mesos [6], CoreOS [12] or Kubernetes [24] are get-
ting more and more attention in cloud computing due to the fact that they are providing a horizontally scalable
concept of one logical virtual machine, that can be deployed on up to thousands of ”physical” nodes. These ”phys-
ical” nodes can be hosted by different public providers and in private clouds or on bare-metal machines as well.
Horizontally scalable microservice approaches – widely used by companies like Facebook, Twitter, Net󰅮lix, and
so on – rely heavily on this kind of cluster solutions [43]. This is done to handle regional workloads, to provide
failover and over󰅮low capacities and also to avoid cloud vendor lock-in [32], [33]. All provided machines of such
kind of cluster should show similar performance characteristics to provide 󰅮ine-grained resource allocation ca-
pabilities [27]. At this point, one obvious question arises: How to identify similar virtual machine types from
different providers? None of the above mentioned approaches will answer this question. All of them are trying
to identify a best (in terms of performance or cost-effectiveness) resource. These approaches are not helpful to
identify most similar resources provided by different providers. A major problem is, that no two cloud service
providers provide exactly the same virtual machine types.
To answer these questions, we propose an automated benchmarking system called EasyCompare. The core idea
is to describe virtual machine type performances by a performance vector. A similarity value between 0 (not
similar) and 1 (very similar) can be calculated using these vectors. The presented approach can be used to
compare regional differences of providers, to compare machine variants of the same or different cloud service
providers, or to de󰅮ine machine types for a private cloud to align them to performance of machines types of
public cloud service providers, and so on.
The remainder of this paper is structured as follows. A proposed benchmarking architecture to collect bench-
mark data in the cloud is explained in Section 2. Section 3analyzes related work and existing benchmarking so-
lutions for cloud computing to derive some appropriate benchmarks to measure relevant performance aspects
of cloud provided virtual machines. Section 4presents some data, which has been collected to compare two
representative public IaaS Cloud Service Providers (Amazon Web Services, AWS and Google Compute Engine,
GCE). By analyzing this data, we will see that different cloud service providers provide a variety of comparable
and non comparable machine types, making it complicated (but not impossible) to compare virtual machine
types of different cloud service providers. Section 5applies these insights to derive a similarity measure, mak-
ing it possible to compare two arbitrary virtual machine types by a normalized Euclidian distance measure. The
proposed Euclidian distance measure is evaluated on the collected data. We conclude our 󰅮indings in Section 6
and show that our approach contributes to do a pragmatic cloud service selection only on relevant subsets of
cloud offerings.
2. Architecture of Automated Benchmarking
The general architecture of EasyCompare is shown in Figure 1. It is a Ruby command line tool which has been
developed by authors of this contribution to collect benchmark data of various cloud service providers. It has
an extendable architecture to support arbitrary cloud service providers by plugins. Provider internals are han-
dled by provider plugins for speci󰅮ic cloud service providers. The architecture is easily extendable for other
providers (see Figure 1). At its current development state plugins for two prominent and representative public
cloud service providers are provided: AWS and GCE. An OpenStack (an private cloud infrastructure) plugin is
under development. Nevertheless, EasyCompare is designed to work with every public or private IaaS cloud
infrastructure. It provides a data collection module to collect benchmark data from one or more IaaS cloud
infrastructures and it provides an analytics module to analyze collected benchmark data.
17
Nane Kratzke and Peter-Christian Quint / Journal of Cloud Computing Research
(2015) Vol. 1 No. 1 pp. 16-34
Fig. 1: Architecture of EasyCompare
2.1 EasyCompare’s Data Collection Module
The data collection process presented in Figure 2is executed by the data collection module of EasyCompare.
The data collection can be con󰅮igured by a benchmark de󰅮inition 󰅮ile (JSON format). This 󰅮ile is used to de󰅮ine
the provider, the datacenter (zone, region, etc.) and virtual machine types to be benchmarked. The 󰅮ile also
contains the iperf con󰅮iguration [37] for network benchmarking. One benchmark de󰅮inition 󰅮ile can contain a
set of arbitrary cloud service providers which are benchmarked in parallel (and automatic). For each provider
the following benchmark process is executed in parallel.
Fig. 2: Data Collection Process
In a deploy step all to be benchmarked virtual machines are deployed as well as the iperf server according to
the de󰅮initions provided in JSON 󰅮ile. A corresponding benchmark node is deployed for each to be benchmarked
virtual machine type. This node is used to collect relevant benchmark data of the virtual machine type. The
following install and run steps are performed per benchmark node.
The install step waits for all deployed benchmark nodes to be booted and installs the necessary benchmarking
software. This is realized via SSH remote shell commands executed via a Node interface. Therefore on all nodes
an automated benchmark suite (Phoronix [46]) is installed and con󰅮igured to operate non interactively and fully
18
Nane Kratzke and Peter-Christian Quint / Journal of Cloud Computing Research
(2015) Vol. 1 No. 1 pp. 16-34
automatic. Furthermore, Phoronix is used to install and con󰅮igure all used benchmarks. The used benchmarks
are introduced in Section 3. To do network benchmarking, iperf is installed as well.
The run step waits for all benchmarks to be installed and tells the Phoronix benchmark suite to execute them
sequentially on all benchmark nodes. Phoronix generates XML 󰅮iles with benchmark data. These XML 󰅮iles
are returned and saved to be analyzed and evaluated by the data analytic module (Section 2.2). It is up to the
responsible benchmarking analyst, to select a virtual machine type for the iperf server. It must be assured, that
networking bottlenecks are on the benchmark node’s side and not on the iperf server’s side (normally a virtual
machine type with maximum network performance of a provider should be selected to play the role of the iperf
server). The measured iperf benchmark data is returned and saved to be analyzed and evaluated by the data
analytic module.
The clean step simply shuts down all deployed benchmark and iperf server nodes to avoid costs. So a complete
benchmarking run against cloud service providers can be executed completely non-interactively.
2.2 EasyCompare’s Data Analytics Module
The analytics module of EasyCompare provides several commands to inspect, analyze, visualize and compare
collected benchmark data. All data provided to the following analytical commands can be 󰅮iltered by providers,
datacenters or virtual machine types of analytical interest.
The inspect command is used to identify high level meta data about benchmark data which has been recorded
by the collect module. It reports for which cloud service providers, datacenters and machine types benchmark
data are present. This meta data is primarily useful for analysts to 󰅮ilter existing benchmark data by the following
analytic commands.
The analyze command can be used to generate detailed tabular reports of benchmark data, which has been
recorded by the collect module. It is able to generate tabular reports like shown exemplary in Table 4.
The visualize command can be used to generate detailed visualized formats of benchmark data, which has been
recorded by the collect module. The values are presented as box plots to enable estimation about variances and
skewness of measured distributions. It is able to generate scripts, which can be processed by the statistical
computing framework R [57] to generate box plot visualizations like shown exemplary in Figure 3.
The compare command can be used to compare benchmark data recorded by the collect module. Data pre-
sentation is done by tabular formats to compare ef󰅮iciently a huge amount of machine type pairs from various
providers. With this command it is also possible to generate Tex snippets, which can be processed by the Tex
typesetting system [31] to generate similarity tables like shown exemplary in Table 1. The similarity of machine
types is calculated by a normalized Euclidian distance measure. Details about the normalized Euclidian distance
measure are explained in Section 5.
3. Benchmark Design
Benchmarking can be tricky in its details [60]. Especially if benchmarks are used to determine similarity across
several varying instance types. Therefore, several publications and approaches which are often used to evaluate
performance aspects of cloud provided virtual machines have been analyzed [7], [10], [19], [25], [28], [34], [41],
[56], [61].
Domain or application based benchmarks: CloudStone is a toolkit for characterizing the workload of
a typical Web 2.0 social networking application. CloudStone includes Faban [18], a workload framework de-
signed for non static and distributed web applications. CloudStone is capable to use the measurement results
for calculating a metric that quanti󰅮ies the dollars per user per month [56]. HiBench is a benchmark suite for
the open-source MapReduce implementation Apache Hadoop [3]. The suite contains real-world applications for
web-search indexing like Nutch Indexing and PageRank [64], Hadoop distributed 󰅮ile system [4], [28] and bench-
marks for large-scale machine learning like Bayesian Classi󰅳ication and K-means Clustering. HiBench contains
19
Nane Kratzke and Peter-Christian Quint / Journal of Cloud Computing Research
(2015) Vol. 1 No. 1 pp. 16-34
further benchmarks (like Sort,WordCount and TeraSort providing good performance indicators for MapReduce
[14]) [28]. CloudSuite is a benchmark suite designed for scale-out workload experiments [20],[45]. The suite
includes tasks that were identi󰅮ied as some of the more common tasks which are handled using cloud com-
puting: Data Analytics using Apache Hadoop MapReduce framework [3] and the Apache Mahout library [5]. It
is designed for machine learning analysis on large-scale datasets [45], Data Caching using Memchached [21],
Data Serving using Cassandra [2] and Yahoo! Cloud Serving Benchmark [11], Media Streaming using the Darwin
Streaming Server with workload by Faban [18], graph based machine learning using GraphLab [13], Web Serving
using CloudStone [56], Web Search using (comparable to HiBench and CloudSuite) the Nutch search engine to
benchmark the indexing process [45].
Synthetic benchmarks: CloudCmp is a very (and maybe the most) in󰅮luencing benchmark in this domain
and a framework designed for customers to choose their cloud provider. The framework compares performance
and price to calculate the best cloud provider services by the needed technical requirements. It relies on bench-
mark suites like SPECjvm2008 [53], iperf [37], SciMark FFT, Crypto.AES, Crypto.RSA and further benchmarks.
CloudCmp can calculate factors like cost per task [34]. PerfKit Benchmarker contains set of benchmarks to
measure and compare cloud offerings. The benchmarks are not tuned (i.e. use defaults) to re󰅳lect what most users
will see” [25]. PerfKit contains a lot of benchmarks which makes this approach very run time intensive. And a
lot of benchmarks seem to generate redundant performance data. Some of the used benchmarks by PerfKit are
bonnie [8], CoreMark [15], iperf and netperf, Spec CPU 2006 [26], Unixbench [55], HPCC [36]. ALIBI [10] is a
prototype of a resource monitoring system. It is designed to analyze nested virtualization overheads by using
synthetic benchmarks for CPU/Memory and I/0 workloads. For compute- and memory-bound workloads SPEC
CINT2006 [39] and for network I/O netperf [51] are used.
Conclusion for our benchmark design: We decided not to follow application based benchmarking ap-
proaches like CloudSuite or CloudStone. These approaches try to cover speci󰅮ic applications like Hadoop or
domains like web serving or data streaming. We do not assume the usage IaaS cloud services for a speci󰅮ic do-
main or application type. Therefore, synthetic-like benchmarks to measure and compare virtual machine types
in general terms of processing, memory, disk and networking performance are preferred to cover relevant per-
formance attributes identi󰅮ied – for instance – by approaches like CloudGenius [41]. Therefore, our approach
follows more the benchmark design philosophy of PerfKit or ALIBI without copying it exactly. Other than PerfKit
or ALIBI, the amounts of benchmarks are reduced to generate a set of performance data without redundancies
and in less time. But like PerfKit, benchmarks are con󰅮igured by using their default values to re󰅮lect what most
users will see.
Benchmark runs should have short run times for public cloud computing. Mainly, to reduce data collection costs.
On the other hand, the provided data has to be statistically signi󰅮icant. Small sample sizes and signi󰅮icance must
be no con󰅮lict but handled with care (see Schmid and Huber [52]). The benchmarks should be simply deploy-
able and executable on virtual machines. And the returned data should be simply analyzable. That is, why data
formats for benchmark reports like JSON or XML are used by EasyCompare. Automated benchmark toolsets like
Phoronix [46] are simply deployable. Furthermore, these tools even check deviations while measuring bench-
mark results. If necessary (due to intolerable deviations of already collected data), additional benchmark runs
are executed to provide statistically signi󰅮icant samples.
The Taychon benchmark is used to measure the processing performance of a virtual machine type [58].
The Tachyon benchmark scales well with increasing cores and is faster than other benchmarks used by
CloudCmp or PerfKit (SPECjvm2008 [53], SciMark FFT [47]). Finally, Tachyon has been selected to be part
of the SPEC MPI2007 [42] benchmark suite. Nevertheless, like CloudCmp, Tachyon covers 󰅮loating point
operations. Tachyon benchmark is nothing more than rendering several standardized raytracing scenes
with a same named raytracing library. Furthermore, the Tachyon benchmark is relatively independent
from memory sizes and measures processing performance only.
20
Nane Kratzke and Peter-Christian Quint / Journal of Cloud Computing Research
(2015) Vol. 1 No. 1 pp. 16-34
The Stream benchmark is used to measure the memory performance [38]. This is basically the same
approach like CloudCmp. Stream is a simple, synthetic benchmark designed to measure sustainable mem-
ory bandwidth (in MB/s) for four simple vector operations (Copy, Scale, Add and Triad). To reduce bench-
marking times we only applied the Triad operation. The triad operation has the most practical relevance
according to Raman [50]. The other operations Copy, Scale, Add are not used.
The IOZone benchmark [44] is used to measure read and write disk performance. The IOZone bench-
mark produces reliable results in shorter time than a DFSIO benchmark (often used by cloud bench-
marks). To reduce benchmarking duration, only the read and write modes of 512 MB 󰅮iles with a record
size of 1 MB are executed. Other modes (covering smaller and bigger 󰅮ile sizes and different record sizes)
of the IOZone benchmark are not executed. Most of the times additional modes generate no fundamental
new insights in the I/O performance of a system.
Like CloudCmp and PerfKit do, iperf is used to measure network transfer rates of intra cloud data trans-
fers. And like done by [10], data transfer rates are measured on the sender side to minimize communi-
cation counter part in󰅮luences. iperf is used to send 30 times as many data as in 10 seconds possible
to measure maximum transfer rates. Unlike CloudCmp, no wide-area delivery transfers are measured
because there are to much in󰅮luences, which can not be accounted reliably to a cloud service provider.
The above described benchmark suite is designed to be run in less than two hours even on small virtual machine
types.
4. Performance Evaluation of two Cloud Service Providers
In Section 3introduced benchmarks have been collected by EasyCompare (Section 2). Benchmark data has been
measured on various machine types provided by AWS and GCE (see Appendix, Table 4). Benchmarking has been
performed in several datacenters1of both providers and at different days2to measure a realistic distribution
and not a unique distribution of performance characteristics. Median values, amount of samples nand quartiles
(in box plots) are presented according to recommendations, if nothing can be assumed on underlying probability
distributions [52]. The variance and skewness of distributions can be estimated for each provider by provided
box plots.
4.1 Processing Performance
Figure 3shows that processing performance of a virtual machine is mainly in󰅮luenced by the amount of available
simultaneous executable threads. A n-thread machine is providing almost the same processing performance on
GCE as on AWS. Even processing optimized machines do not show a substantial better performance than stan-
dard machines. So, provider labels like ”computing” or ”high-cpu” for machine types are mainly marketing labels.
Both providers have low cost instances which are single thread processors and which show signi󰅮icant lower
processing performance than all other provided machine types (’m3.medium’ for AWS and ’n1-standard-1’ for
GCE). While all machine types show almost no variance for processing performance, this is not true for the low-
cost GCE machine type ’n1-standard-1. This machine type shows a signi󰅮icant performance variance (see Fig-
ure 3). This might be an indication that this machine type is a bit overbooked in the GCE infrastructure. Taking
all together, machine types can be well compared by their available amount of simultaneous executable threads.
1AWS: eu-central-1, us-east-1; GCE: europe-west1-d, europe-west1-c, us-central1-f, asia-east1-a
225th Feb., 26th Feb., 27th Feb., 1st Mar., and 2nd Mar. 2015
21
Nane Kratzke and Peter-Christian Quint / Journal of Cloud Computing Research
(2015) Vol. 1 No. 1 pp. 16-34
n1-highmem-16, GCE
n1-standard-16, GCE
n1-highcpu-16, GCE
n1-standard-8, GCE
n1-highcpu-8, GCE
n1-highmem-8, GCE
n1-highmem-4, GCE
n1-highcpu-4, GCE
n1-standard-4, GCE
n1-highmem-2, GCE
n1-highcpu-2, GCE
n1-standard-2, GCE
n1-standard-1, GCE
0
50
100
150
GCE Processing performance (smaller is better)
Taychon, Time [s]
c3.4xlarge, AWS
i2.4xlarge, AWS
r3.4xlarge, AWS
c3.2xlarge, AWS
i2.2xlarge, AWS
m3.2xlarge, AWS
r3.2xlarge, AWS
c3.xlarge, AWS
r3.xlarge, AWS
m3.xlarge, AWS
i2.xlarge, AWS
c3.large, AWS
m3.large, AWS
r3.large, AWS
m3.medium, AWS
0
50
100
150
AWS Processing performance (smaller is better)
Taychon, Time [s]
Fig. 3: Measured GCE and AWS processing performances
4.2 Memory Performance
More variances are likely for the memory performance of virtual machines. The reader can see this effect in
Figure 4by increasing box sizes. In general, memory performance is decreasing with processor performance
(so the memory performance is aligned to the processing performance by public cloud providers, which is more
than reasonable). AWS provides for high n-thread machines better memory performances than GCE. Similar to
marketing labels for processing machines, there seem to exist marketing labels for memory optimized machines.
Machines being labeled as RAM-optimized (AWS, r3.*) or ”highmem” (GCE) seem not to show better transfer
rates than standard machines (of course these machines provide substantial more memory, but the memory is
not faster or shows smaller variances in data transfer rates, check Table 4).
n1-highmem-16, GCE
n1-standard-16, GCE
n1-highcpu-16, GCE
n1-standard-8, GCE
n1-highcpu-8, GCE
n1-highmem-8, GCE
n1-highmem-4, GCE
n1-highcpu-4, GCE
n1-standard-4, GCE
n1-highmem-2, GCE
n1-highcpu-2, GCE
n1-standard-2, GCE
n1-standard-1, GCE
0
5000
10000
15000
20000
25000
30000
35000
GCE Memory transfer rates (higher is better)
Stream, Triad [MB/s]
c3.4xlarge, AWS
i2.4xlarge, AWS
r3.4xlarge, AWS
c3.2xlarge, AWS
i2.2xlarge, AWS
m3.2xlarge, AWS
r3.2xlarge, AWS
c3.xlarge, AWS
r3.xlarge, AWS
m3.xlarge, AWS
i2.xlarge, AWS
c3.large, AWS
m3.large, AWS
r3.large, AWS
m3.medium, AWS
0
5000
10000
15000
20000
25000
30000
35000
AWS Memory transfer rates (higher is better)
Stream, Triad [MB/s]
Fig. 4: Measured GCE and AWS memory transfer performances
22
Nane Kratzke and Peter-Christian Quint / Journal of Cloud Computing Research
(2015) Vol. 1 No. 1 pp. 16-34
4.3 Disk Performance (read and write)
I/O performances have the tendency to show even more variances than memory performances. This is the
nature of things and not very astonishing. This can be seen for GCE and AWS as well. AWS seems to provide a
slightly better read and write performance than GCE (see Figure 5). I/O performancesof both providers decrease
slightly with processing performances of machines. AWS shows better performances but also greater variances
in data transfer rates. This is especially true for write performances of non low-cost machines. On the other
hand, it must be stated that median write performance values of AWS are always on the upper side. So most of
the write operations are fast and only some are slow. In general, AWS seems to be the better provider for storage
intensive applications compared with GCE (if no low-cost machines are used). GCE shows more balanced I/O
performance across all machine types but on a lower level.
Machines which are labeled as storage optimized (AWS: all i2 instance types are optimized for storage intensive
tasks) seem not to show better transfer rates than standard machines. Of course these machines can be con-
󰅮igured to provide substantial more disk space than standard machines. But the transfer rates are not faster or
show smaller variances (check Table 4). Especially the ’i2.4xlarge’ instance type provides astonishing small
read performances. At this point we have to say, that AWS and GCE provide options to optimize I/O performance
by using SSDs (GCE and AWS) and assured I/O operations per second (AWS). We have only measured the default
con󰅮iguration of the mentioned machine types.
So, there exist a plenty of variations between different providers regarding disk performances which makes it
complicated to compare the various machine types regarding their IO performance.
4.4 Network Performance
Regarding network performances of provided machines, a complete different philosophy of AWS and GCE can be
seen. The AWS philosophy seems to provide a well de󰅮ined and leveled network performance which is decreas-
ing with decreasing processing performance of virtual machine type. On these well de󰅮ined levels of network
performance only minor variances in data transfer rates occur (see Figure 6).
On the other hand, the reader can see a complete different GCE network design philosophy (see Figure 6). Com-
parable to IO performance, GCE seems to provide a well-balanced network performance across all machine
types. This transfer performance seems not even aligned to the processing performance of the machine types.
In other words, GCE seems to try providing full network power for all machine types. This is bought in by sub-
stantial bigger variances of data transfer rates. Some GCE machine types ’n1-highmem-16’ and ’n1-highmem-8
seem to show signi󰅮icantly higher data transfer rates. Nevertheless, their boxes are overlapping with a lot of
other machines and so it is not clear whether this is intended by GCE or only a effect which has been measured
coincidentally.
So, measured network performances of GCE and AWS show completely different design philosophies for the net-
work design of a public cloud, making it even more complicated to compare the various machine types regarding
their network performance.
23
Nane Kratzke and Peter-Christian Quint / Journal of Cloud Computing Research
(2015) Vol. 1 No. 1 pp. 16-34
n1-highmem-16, GCE
n1-standard-16, GCE
n1-highcpu-16, GCE
n1-standard-8, GCE
n1-highcpu-8, GCE
n1-highmem-8, GCE
n1-highmem-4, GCE
n1-highcpu-4, GCE
n1-standard-4, GCE
n1-highmem-2, GCE
n1-highcpu-2, GCE
n1-standard-2, GCE
n1-standard-1, GCE
0
2000
4000
6000
8000
GCE Disk IO Read (higher is better)
IOZone, Read [MB/s]
c3.4xlarge, AWS
i2.4xlarge, AWS
r3.4xlarge, AWS
c3.2xlarge, AWS
i2.2xlarge, AWS
m3.2xlarge, AWS
r3.2xlarge, AWS
c3.xlarge, AWS
r3.xlarge, AWS
m3.xlarge, AWS
i2.xlarge, AWS
c3.large, AWS
m3.large, AWS
r3.large, AWS
m3.medium, AWS
0
2000
4000
6000
8000
AWS Disk IO Read (higher is better)
IOZone, Read [MB/s]
n1-highmem-16, GCE
n1-standard-16, GCE
n1-highcpu-16, GCE
n1-standard-8, GCE
n1-highcpu-8, GCE
n1-highmem-8, GCE
n1-highmem-4, GCE
n1-highcpu-4, GCE
n1-standard-4, GCE
n1-highmem-2, GCE
n1-highcpu-2, GCE
n1-standard-2, GCE
n1-standard-1, GCE
0
20
40
60
80
100
120
GCE Disk IO Write (higher is better)
IOZone, Write [MB/s]
c3.4xlarge, AWS
i2.4xlarge, AWS
r3.4xlarge, AWS
c3.2xlarge, AWS
i2.2xlarge, AWS
m3.2xlarge, AWS
r3.2xlarge, AWS
c3.xlarge, AWS
r3.xlarge, AWS
m3.xlarge, AWS
i2.xlarge, AWS
c3.large, AWS
m3.large, AWS
r3.large, AWS
m3.medium, AWS
0
20
40
60
80
100
120
AWS Disk IO Write (higher is better)
IOZone, Write [MB/s]
Fig. 5: Measured GCE and AWS disk IO performances
5. Describing and Comparing Virtual Machine Types
This contribution uses the following performance describing vector ito express virtual machine type perfor-
mances provided by different cloud service providers (see Section 4).
i=
i1
i2
i3
i4
i5
i6
i7
Processing: Amount of simultaneous executable threads
Processing: Processing time in seconds (Median of all Tachyon benchmark runs)
Memory: Memory size in MB
Memory: Memory transfer in MB/s (Median of all Stream Triad benchmark runs)
Disk: Data transfer in MB/s for disk reads (Median of all IOZone read stress benchmark runs)
Disk: Data transfer in MB/s for disk writes (Median of all IOZone write stress benchmark runs)
Network: Data transfer in MB/s via network (Median of all iperf benchmark runs)
(1)
The similarity s(i,j)of two virtual machine types can be analyzed by calculating their vector similarity. Although
this approach is mostly used in domains like information retrieval, text and data mining, it turned out that vec-
tor similarities can be used to determine the similarity of virtual machine types as well. But an appropriate
24
Nane Kratzke and Peter-Christian Quint / Journal of Cloud Computing Research
(2015) Vol. 1 No. 1 pp. 16-34
n1-highmem-16, GCE
n1-standard-16, GCE
n1-highcpu-16, GCE
n1-standard-8, GCE
n1-highcpu-8, GCE
n1-highmem-8, GCE
n1-highmem-4, GCE
n1-highcpu-4, GCE
n1-standard-4, GCE
n1-highmem-2, GCE
n1-highcpu-2, GCE
n1-standard-2, GCE
n1-standard-1, GCE
0
100
200
300
400
500
600
GCE Network transfer rates (higher is better)
iperf [MB/s]
c3.4xlarge, AWS
i2.4xlarge, AWS
r3.4xlarge, AWS
c3.2xlarge, AWS
i2.2xlarge, AWS
m3.2xlarge, AWS
r3.2xlarge, AWS
c3.xlarge, AWS
r3.xlarge, AWS
m3.xlarge, AWS
i2.xlarge, AWS
c3.large, AWS
m3.large, AWS
r3.large, AWS
m3.medium, AWS
0
100
200
300
400
500
600
AWS Network transfer rates (higher is better)
iperf [MB/s]
Fig. 6: Measured GCE and AWS network performances
similarity measure has to be chosen.
5.1 Choosing an Appropriate Similarity Measure
According to Lin, many similarity measures have been proposed [35]. McGill surveyed and compared over 60
similarity measures alone [40]. A good similarity measure has to ful󰅮ill three similarity intuitions (see Lin [35]).
Intuition 1: The similarity between A and B is related to their commonality. The more commonality they
share, the more similar they are.
Intuition 2: The similarit y between A and B is relatedto the differences between them. The more differences
they have, the less similar they are.
Intuition 3: The maximum similarity between A and B is reached when A and B are identical.
In our case, machine performances are described as numerical feature vectors. This reduces the set of possible
similarity measures to some promising approaches. Comparing two numerical feature vectors iand jis often
done in information retrieval by the cosine similarity [22] or the dice similarity [35]
scosine(i,j) = Pm
n=1(in·jn)
pPm
n=1(in)2·pPm
n=1(jn)2sdice (i,j) = 2Pm
n=1 injn
Pm
n=1 i2
n+Pm
n=1 j2
n
(2)
or via distance measures like euclidian distance [49, 62] or derived euclidian distance metric [35].
deuclidian(i,j) = v
u
u
t
m
X
n=1
(injn)2meuclidian(i,j)=1
1 + deuclidian(i,j)(3)
Sadly, it turned out, that all measures do not ful󰅮ill the second similarity intuition postulated by Lin [35] or had
other shortcomings. The Cosine similarity tends to rate a lot of non similar machines to be similar. The Dice
similarity tends to rate a lot of non-similar machines to be similar (although showing somewhat better results
than the Cosine similarity). The Euclidian distance is able to identify identical machines as totally similar and
showed reasonable results for second intuition similarities, but not in a normalized way. So a Euclidian simi-
larity value of 0 means total similarity but what does a Euclidian similarity value of 500 means? The Euclidian
25
Nane Kratzke and Peter-Christian Quint / Journal of Cloud Computing Research
(2015) Vol. 1 No. 1 pp. 16-34
distance metric provides a normalized value but tends to behave like a boolean identical operator (it only was
able to identify identical machines but rated all other machines as totally non similar, this has to do with the fact
that the distance metric takes absolute performance differences into account and no performance relations).
Nevertheless, measures comparing positions of a vector (Euclidian distances) showed more intuitive results
compared to measures comparing vector directions (like Cosine similarity). Therefore, this contribution pro-
poses a normalized Euclidian distance metric based on relative performance differences. The following vari-
ant of the Euclidian distance metric measures the relative performance relations of performance components
(i1, i2, . . . , im)and (j1, j2, . . . , jm)and normalizes the result s(i,j)to a similarity value between 0.0(iand jare
not similar at all) and 1.0(iand jare completely similar).
i,js(i,j) = 1 1
m
m
X
n=1
1min(in, jn)
max(in, jn)2
i,js(i,j) = s(j,i)
i,j0s(i,j)1
is(i,i) = 1
(4)
It produces similarity values which are in accordance with cloud engineer expectations and ful󰅮ills all relevant
similarity intuitions postulated by Lin [35].
Lin’s concept of similarity is designed as an information-theoretic de󰅮inition and therefore has to consider miss-
ing data as well. Missing data can not happen in our case. We had only to consider commonality (intuition 1),
if we would like to compare non complete numerical feature vectors. But that is not the case and so we do not
have to consider Lin’s intuition 1 for evaluation. But of course, we have to consider the second and third intu-
ition and show how the proposed similarity measure is in accordance to these intuitions. We do this by showing
that the proposed similarity measure ful󰅮ills the identical intuition (intuition 3) in Section 5.2 and the difference
intuition (intuition 2) in Section 5.3 and Section 5.4.
5.2 Evaluating the Distance Measure by Comparing Machine Types of the same Provider
Table 1shows exemplary the Euclidian similarity of various GCE machine types. On the main diagonal of Table 1
all similarity values are 1.0 and this must be true according to Lin’s third identical intuition because on the main
diagonal identical machine types are compared. And according to our expectation, similarities are decreasing
with increasing processing, memory, disk, or networking differences of virtual machine types (which is accord-
ing to Lin’s second difference intuition). Furthermore, we can even identify some similarity clusters in Table
1which are aligned to the amount of simultaneous executable threads. 16 thread machines show comparable
similarity values, the same is true for 8, 4 and 2 thread machines. The same effect can be seen for AWS machine
types as well (but due to page limitations, data is not presented).
The Euclidian distance measure works well with both providers. But we are going to double check that by com-
paring same machine types of different datacenters of the same provider and by comparing different machine
types of different providers.
5.3 Evaluation the Distance Measure by Comparing Provider Datacenters
Machine types from different datacenters of the same provider should show similar characteristics. Neverthe-
less, it is not realistic to expect a perfect similarity value of 1.0 on the main diagonal. But the resulting similarity
table should look very similar to the Table 1. And of course, slightly reduced similarity values are expectable
due to statistical 󰅮luctuations. And this is exactly what can be seen exemplary in Table 2for GCE. There still ex-
ist a clear (but not longer perfect) main diagonal. But almost all similarity values are decreasing and no exact
similarities exist any more.
26
Nane Kratzke and Peter-Christian Quint / Journal of Cloud Computing Research
(2015) Vol. 1 No. 1 pp. 16-34
Table 1: Euclidian similarity values of GCE machine types (sorted on both axis by descending amount
of simultaneous executable threads)
Similarities
n1-highcpu-16 (GCE)
n1-highmem-16 (GCE)
n1-standard-16 (GCE)
n1-highcpu-8 (GCE)
n1-highmem-8 (GCE)
n1-standard-8 (GCE)
n1-highcpu-4 (GCE)
n1-highmem-4 (GCE)
n1-standard-4 (GCE)
n1-highcpu-2 (GCE)
n1-highmem-2 (GCE)
n1-standard-2 (GCE)
n1-standard-1 (GCE)
n1-highcpu-16 (GCE) 1.00 0.81 0.85 0.66 0.62 0.67 0.60 0.60 0.76 0.53 0.53 0.54 0.53
n1-highmem-16 (GCE) 0.81 1.00 0.79 0.65 0.68 0.61 0.60 0.54 0.56 0.54 0.47 0.49 0.49
n1-standard-16 (GCE) 0.85 0.79 1.00 0.64 0.60 0.68 0.57 0.62 0.62 0.51 0.55 0.54 0.52
n1-highcpu-8 (GCE) 0.66 0.65 0.64 1.00 0.83 0.82 0.67 0.62 0.64 0.58 0.54 0.70 0.55
n1-highmem-8 (GCE) 0.62 0.68 0.60 0.83 1.00 0.78 0.64 0.58 0.60 0.56 0.50 0.51 0.51
n1-standard-8 (GCE) 0.67 0.61 0.68 0.82 0.78 1.00 0.63 0.66 0.67 0.54 0.58 0.58 0.55
n1-highcpu-4 (GCE) 0.60 0.60 0.57 0.67 0.64 0.63 1.00 0.80 0.82 0.65 0.60 0.62 0.61
n1-highmem-4 (GCE) 0.60 0.54 0.62 0.62 0.58 0.66 0.80 1.00 0.84 0.59 0.65 0.62 0.59
n1-standard-4 (GCE) 0.76 0.56 0.62 0.64 0.60 0.67 0.82 0.84 1.00 0.61 0.64 0.65 0.62
n1-highcpu-2 (GCE) 0.53 0.54 0.51 0.58 0.56 0.54 0.65 0.59 0.61 1.00 0.80 0.83 0.68
n1-highmem-2 (GCE) 0.53 0.47 0.55 0.54 0.50 0.58 0.60 0.65 0.64 0.80 1.00 0.84 0.67
n1-standard-2 (GCE) 0.54 0.49 0.54 0.70 0.51 0.58 0.62 0.62 0.65 0.83 0.84 1.00 0.69
n1-standard-1 (GCE) 0.53 0.49 0.52 0.55 0.51 0.55 0.61 0.59 0.62 0.68 0.67 0.69 1.00
The same effects (but less distinctive) can be seen for AWS. The reader can see that the Euclidian distance
measure works well with both providers and is even able to compare datacenter de󰅮inition characteristics of
providers. It ful󰅮ills Lin’s second intuition (difference intuition) and works well close to Lin’s third intuition
(identical intuition).
Table 2: Euclidian similarity values of various GCE machine types hosted in europe-west1-c and
us-central1-f (sorted on both axis by descending amount of simultaneous executable threads)
Similarities
n1-highcpu-16 (GCE, us-central1-f)
n1-highmem-16 (GCE, us-central1-f)
n1-standard-16 (GCE, us-central1-f)
n1-highcpu-8 (GCE, us-central1-f)
n1-highmem-8 (GCE, us-central1-f)
n1-standard-8 (GCE, us-central1-f)
n1-highcpu-4 (GCE, us-central1-f)
n1-highmem-4 (GCE, us-central1-f)
n1-standard-4 (GCE, us-central1-f)
n1-highcpu-2 (GCE, us-central1-f)
n1-highmem-2 (GCE, us-central1-f)
n1-standard-2 (GCE, us-central1-f)
n1-standard-1 (GCE, us-central1-f)
n1-highcpu-16 (GCE, europe-west1-c) 0.98 0.86 0.79 0.65 0.66 0.62 0.61 0.54 0.73 0.54 0.48 0.52 0.51
n1-highmem-16 (GCE, europe-west1-c) 0.83 1.00 0.79 0.65 0.67 0.62 0.61 0.54 0.59 0.54 0.48 0.51 0.51
n1-standard-16 (GCE, europe-west1-c) 0.83 0.80 1.00 0.65 0.60 0.67 0.56 0.61 0.59 0.49 0.55 0.53 0.50
n1-highcpu-8 (GCE, europe-west1-c) 0.68 0.66 0.64 1.00 0.81 0.84 0.66 0.62 0.66 0.57 0.55 0.72 0.56
n1-highmem-8 (GCE, europe-west1-c) 0.62 0.65 0.59 0.80 1.00 0.78 0.65 0.57 0.61 0.56 0.50 0.53 0.52
n1-standard-8 (GCE, europe-west1-c) 0.67 0.65 0.66 0.85 0.79 0.99 0.63 0.64 0.66 0.55 0.57 0.58 0.55
n1-highcpu-4 (GCE, europe-west1-c) 0.61 0.59 0.59 0.66 0.60 0.65 0.98 0.81 0.86 0.63 0.63 0.66 0.62
n1-highmem-4 (GCE, europe-west1-c) 0.59 0.56 0.62 0.64 0.58 0.66 0.80 0.98 0.83 0.60 0.65 0.63 0.59
n1-standard-4 (GCE, europe-west1-c) 0.75 0.59 0.60 0.66 0.60 0.66 0.83 0.82 1.00 0.62 0.63 0.65 0.61
n1-highcpu-2 (GCE, europe-west1-c) 0.55 0.53 0.53 0.58 0.53 0.57 0.63 0.60 0.65 0.97 0.83 0.85 0.69
n1-highmem-2 (GCE, europe-west1-c) 0.54 0.52 0.53 0.58 0.52 0.58 0.62 0.61 0.65 0.82 0.98 0.85 0.68
n1-standard-2 (GCE, europe-west1-c) 0.52 0.49 0.50 0.70 0.49 0.55 0.59 0.58 0.62 0.81 0.83 0.98 0.68
n1-standard-1 (GCE, europe-west1-c) 0.52 0.53 0.48 0.55 0.53 0.53 0.62 0.55 0.62 0.70 0.65 0.68 1.00
5.4 Evaluating the Distance Measure by Comparing Different Machines from Different Providers
The sections above have shown that the proposed Euclidian distance measure is able to produce plausible re-
sults. Nevertheless, the distance measure was not developed to compare different machine types of the same
provider (the measure is just cross checked against these uses cases to show that the results of the Euclidian
distance are reasonable). The main purpose of this measure is to compare different machine types of different
providers.
27
Nane Kratzke and Peter-Christian Quint / Journal of Cloud Computing Research
(2015) Vol. 1 No. 1 pp. 16-34
If different providers are compared, no sharp main diagonal can be excepted any longer. If that was the case, it
would mean, that different cloud service providers would try to copy performance characteristics of competitors
one by one. But of course, similarity clusters are expectable and should be identi󰅮ied by a similarity measure.
And these clusters should be aligned to some degree to the processing power of the instances. Of course, AWS
and GCE are both providing 16, 8, 4, 2 thread machines. And 16 thread machines should show a somewhat
similar and better performance than 8, 4, and 2 thread machines.
Table 3shows measured Euclidian similarity values for various machine pairs of AWS and GCE. A 󰅮irst ’16 thread
cluster’ consists of the ’*.4xlarge’ AWS and the ’n1-*-16’ GCE machine types. A ’8 thread cluster’ consists of the
*.2xlarge’ AWS and ’n1-*-8’ GCE machine types. The reader will identify easily 4 and 2 thread clusters as well
and will even identify the small ’one thread cluster’ consisting of the AWS ’m3.medium’ and GCE ’n1-standard-1
low-cost machine pair.
So the similarity values of the Euclidian distance measure are also reasonable for the intended use case to com-
pare different machine types from different providers and the similarity values are aligned to cloud engineer ex-
pectations. Even more interesting, the Euclidian similarity is also telling something about machine types which
are more similar than others – and therefore more preferable for container clusters. Table 3shows three pairs
of AWS and GCE machine types which show more similarity than other combinations (similarities of 0.97 and
0.98 which is comparable to inner GCE similarities between datacenter, see Table 2). These pairs would be a
󰅮irst choice for a container cluster.
Table 3: Similarity values of various AWS and GCE machines types (sorted on both axis by descending
amount of simultaneous executable threads)
Similarities
n1-highcpu-16 (GCE)
n1-highmem-16 (GCE)
n1-standard-16 (GCE)
n1-highcpu-8 (GCE)
n1-highmem-8 (GCE)
n1-standard-8 (GCE)
n1-highcpu-4 (GCE)
n1-highmem-4 (GCE)
n1-standard-4 (GCE)
n1-highcpu-2 (GCE)
n1-highmem-2 (GCE)
n1-standard-2 (GCE)
n1-standard-1 (GCE)
c3.4xlarge (AWS) 0.80 0.83 0.78 0.64 0.64 0.74 0.57 0.52 0.53 0.51 0.44 0.46 0.46
i2.4xlarge (AWS) 0.81 0.84 0.79 0.66 0.66 0.61 0.60 0.54 0.56 0.53 0.47 0.49 0.48
r3.4xlarge (AWS) 0.80 0.83 0.78 0.64 0.65 0.60 0.57 0.52 0.54 0.51 0.45 0.46 0.46
c3.2xlarge (AWS) 0.81 0.61 0.65 0.82 0.78 0.82 0.61 0.61 0.77 0.53 0.52 0.54 0.51
i2.2xlarge (AWS) 0.66 0.61 0.65 0.82 0.78 0.83 0.61 0.62 0.63 0.53 0.53 0.54 0.52
m3.2xlarge (AWS) 0.66 0.61 0.65 0.82 0.78 0.97 0.61 0.62 0.63 0.53 0.53 0.54 0.52
r3.2xlarge (AWS) 0.66 0.61 0.65 0.82 0.78 0.82 0.61 0.62 0.63 0.53 0.53 0.54 0.52
c3.xlarge (AWS) 0.60 0.54 0.61 0.76 0.58 0.65 0.79 0.83 0.83 0.58 0.61 0.75 0.57
i2.xlarge (AWS) 0.59 0.53 0.60 0.61 0.57 0.79 0.78 0.82 0.83 0.57 0.61 0.60 0.56
m3.xlarge (AWS) 0.75 0.56 0.60 0.64 0.60 0.65 0.82 0.82 0.98 0.61 0.61 0.62 0.59
r3.xlarge (AWS) 0.60 0.54 0.60 0.61 0.58 0.79 0.79 0.83 0.84 0.59 0.62 0.62 0.58
c3.large (AWS) 0.52 0.46 0.53 0.51 0.49 0.56 0.57 0.64 0.63 0.77 0.84 0.81 0.79
m3.large (AWS) 0.54 0.47 0.55 0.68 0.50 0.58 0.59 0.64 0.64 0.80 0.85 0.97 0.67
r3.large (AWS) 0.65 0.45 0.53 0.51 0.48 0.56 0.57 0.64 0.76 0.77 0.83 0.81 0.65
m3.medium (AWS) 0.38 0.34 0.41 0.39 0.36 0.42 0.43 0.48 0.47 0.51 0.57 0.55 0.85
The presented data was double checked by comparing same machine types of different datacenters of the same
provider. We did this for AWS and GCE. Due to page limitations, we presented only data for GCE. Nevertheless, all
collected data for AWS and GCE come to the same conclusion. Our proposed similarity measure is in accordance
with cloud service engineer’s expectation and in accordance with Lin’s difference and identical intuition of a
good similarity measure.
6. Conclusion
Cloud service selection can be a complex and challenging task for a cloud engineer. The underlying decision
making problem makes it dif󰅮icult to model it appropriately. Furthermore, very often cloud service selection is
not done very systematically in practice. Using a cloud service provider infrastructure is more an evolutionary
process than a conscious selection. At some point in time, there might arise the need to change a cloud service
provider or to diverse a deployment across several cloud service providers.
28
Nane Kratzke and Peter-Christian Quint / Journal of Cloud Computing Research
(2015) Vol. 1 No. 1 pp. 16-34
There exist several technologies to support this. One approach is to use container cluster technologies like
CoreOS, Kubernetes or Mesos. Well known companies like Google, Net󰅮lix, Twitter are doing this very successful
to handle regional workloads, to provide failover and over󰅮low capacities. Small and medium sized companies
can bene󰅮it as well. Nevertheless, these type of clusters rely on homogeneous nodes (nodes with similar per-
formance characteristics). Otherwise, the intended 󰅮ine-grained resource allocation of container clusters gets
limited. As Section 3showed, there exist several approaches to compare public cloud service providers. Most
of these approaches focus to select best-performing and cost-effective providers but not to select similar re-
sources offered by different providers. This contribution concentrated on 󰅮inding most similar resources and
not the most performant or cost-effective one’s.
EasyCompare is able to determine performance differences of different virtual machine types as a numeric value
between 0 (no similarity) and 1 (maximum similarity). This is done by calculating a normalized Euclidian dis-
tance measure. We evaluated the distance measure and showed that it is relevant, able to identify similar re-
sources and provides reasonable similarity values according to Lin’s intuitions of a good similarity measure.
We analyzed other similarity measures like cosine similarity, dice similarity, euclidian distance and euclidian
distance metric as well, but found no other appropriate measures to express virtual machine similarity in an
appropriate way.
Comparing virtual machines of different cloud service providers is mainly a combinatoric problem. It turned out,
that not hundreds of theoretical possible machine pairs have to be considered in practice, but only three to 󰅮ive
with relevant similarities. These three to 󰅮ive relevant machine types can be identi󰅮ied by systematic benchmark
runs in less than three hours. EasyCompare is designed to do fast and completely automatic benchmarking by
only collecting data relevant to identify such similarities. We analyzed AWS and GCE and it turned out that only
three machine pairs of these both providers (out of 195 analyzed machine pairs) are showing high similarities
(see Section 5.4). All other machine types (that is more than 98.5% of the problem space) are of minor relevance
for cloud engineers dealing with container clusters. Similar effects for other cloud service providers are more
than likely. Identifying most appropriate service providers for such a limited set of resources is obviously much
simpler than comparing complete cloud service provider resource offerings.
Acknowledgement
This study was funded by German Federal Ministry of Education and Research (03FH021PX4, Cloud TRANSIT).
We used research grants by courtesy of Amazon Web Services and Google Compute Engine. The authors thank
Lübeck University (Institute of Telematics) and fat IT solution GmbH (Kiel) for their general support of the re-
search project Cloud TRANSIT.
References
[1] A. Afshari, M. Mojahed, and R. M. Yusuff. Simple additive weighting approach to personnel selection prob-
lem. International Journal of Innovation, Management and Technology, 1(5):511–515, 2010.
[2] Apache. Cassandra. http://cassandra.apache.org/. Accessed May 20, 2015.
[3] Apache. Hadoop. http://hadoop.apache.org/. Accessed May 20, 2015.
[4] Apache. Hdfs. http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/
HdfsDesign.html. Accessed May 20, 2015.
[5] Apache. Mahout. http://mahout.apache.org/. Accessed May 20, 2015.
[6] Apache. Mesos. http://mesos.apache.org. Accessed May 20, 2015.
29
Nane Kratzke and Peter-Christian Quint / Journal of Cloud Computing Research
(2015) Vol. 1 No. 1 pp. 16-34
[7] G. Ataş and V. C. Gungor. Performance evaluation of cloud computing platforms using statistical methods.
Comput. Electr. Eng., 40(5):1636–1649, July 2014.
[8] T. Bray. Bonnie benchmark, 1996. http://www.textuality.com/bonnie/. Accessed May 20, 2015.
[9] C.-W. Chang, P. Liu, and J.-J. Wu. Probability-based cloud storage providers selection algorithms with max-
imum availability. In Parallel Processing (ICPP), 2012 41st International Conference on, pages 199–208.
IEEE, 2012.
[10] C. Chen, P. Maniatis, A. Perrig, A. Vasudevan, and V. Sekar. Towards veri󰅮iable resource accounting for out-
sourced computation. SIGPLAN Not., 48(7):167–178, Mar. 2013.
[11] B. F. Cooper, A. Silberstein, E. Tam, R. Ramakrishnan, and R. Sears. Benchmarking cloud serving systems
with ycsb. In Proceedings of the 1st ACM symposium on Cloud computing, pages 143–154. ACM, 2010.
[12] CoreOS. Coreos. https://coreos.com. Accessed May 20, 2015.
[13] Dato. Dato graphlab create. https://dato.com/products/create/open_source.html. Accessed May
20, 2015.
[14] J. Dean and S. Ghemawat. Mapreduce: simpli󰅮ied data processing on large clusters. Communications of the
ACM, 51(1):107–113, 2008.
[15] E. M. B. C. (EEMBC). Eembc – the embedded microprocessor benchmark consortium. https://www.
eembc.org/coremark/. Accessed May 20, 2015.
[16] D. Ergu, G. Kou, Y. Peng, Y. Shi, and Y. Shi. The analytic hierarchy process: Task scheduling and resource
allocation in cloud computing environment. J. Supercomput., 64(3):835–848, June 2013.
[17] C. Esposito, M. Ficco, F. Palmieri, and A. Castiglione. Smart cloud storage service selection based on fuzzy
logic, theory of evidence and game theory. Computers, IEEE Transactions on, PP(99):1–1, 2015.
[18] faban.org. Faban - helping measure performance. http://faban.org/index.html, 2014. Accessed May
20, 2015.
[19] M. Ferdman, A. Adileh, O. Kocberber, S. Volos, M. Alisafaee, D. Jevdjic, C. Kaynak, A. D. Popescu, A. Ailamaki,
and B. Falsa󰅮i. Clearing the clouds: A study of emerging scale-out workloads on modern hardware. SIGPLAN
Not., 47(4):37–48, Mar. 2012.
[20] M. Ferdman, A. Adileh, O. Kocberber, S. Volos, M. Alisafaee, D. Jevdjic, C. Kaynak, A. D. Popescu, A. Ailamaki,
and B. Falsa󰅮i. Clearing the clouds: a study of emerging scale-out workloads on modern hardware. In Pro-
ceedings of the seventeenth international conference on Architectural Support for Programming Languages
and Operating Systems, ASPLOS ’12, pages 37–48, New York, NY, USA, 2012. ACM.
[21] B. Fitzpatrick. Distributed caching with memcached. Linux journal, 2004(124):5, 2004.
[22] W. B. Frakes and R. Baeza-Yates, editors. Information Retrieval: Data Structures and Algorithms. Prentice-
Hall, Inc., Upper Saddle River, NJ, USA, 1992.
[23] S. K. Garg, S. Versteeg, and R. Buyya. Smicloud: A framework for comparing and ranking cloud services.
In Utility and Cloud Computing (UCC), 2011 Fourth IEEE International Conference on, pages 210–218. IEEE,
2011.
[24] Google. Kubernetes. http://kubernetes.io. Accessed May 20, 2015.
30
Nane Kratzke and Peter-Christian Quint / Journal of Cloud Computing Research
(2015) Vol. 1 No. 1 pp. 16-34
[25] Google. Per󰅮kit benchmarker. https://github.com/GoogleCloudPlatform/PerfKitBenchmarker,
2015. Accessed May 20, 2015.
[26] J. L. Henning. Spec cpu2006 benchmark descriptions. ACM SIGARCH Computer Architecture News,
34(4):1–17, 2006.
[27] B. Hindman, A. Konwinski, M. Zaharia, A. Ghodsi, A. D. Joseph, R. Katz, S. Shenker, and I. Stoica. Mesos:
A platform for 󰅮ine-grained resource sharing in the data center. In Proceedings of the 8th USENIX Confer-
ence on Networked Systems Design and Implementation, NSDI’11, pages 295–308, Berkeley, CA, USA, 2011.
USENIX Association.
[28] S. Huang, J. Huang, J. Dai, T. Xie, and B. Huang. The hibench benchmark suite: Characterization of the
mapreduce-based data analysis. In Data Engineering Workshops (ICDEW), 2010 IEEE 26th International
Conference on, pages 41–51. IEEE, 2010.
[29] R. Karim, C. Ding, and A. Miri. An end-to-end qos mapping approach for cloud service selection. In Services
(SERVICES), 2013 IEEE Ninth World Congress on, pages 341–348. IEEE, 2013.
[30] C.-K. Ke, Z.-H. Lin, M.-Y. Wu, and S.-F. Chang. An optimal selection approach for a multi-tenancy service
based on a sla utility. In Computer, Consumer and Control (IS3C), 2012 International Symposium on, pages
410–413. IEEE, 2012.
[31] D. E. Knuth and D. Bibby. The texbook, volume 1993. Addison-Wesley Reading, MA, USA, 1986.
[32] N. Kratzke. Lightweight virtualization cluster - howto overcome cloud vendor lock-in. Journal of Computer
and Communication (JCC), 2(12), oct 2014.
[33] N. Kratzke. A lightweight virtualization cluster reference architecture derived from open source paas plat-
forms. Open Journal of Mobile Computing and Cloud Computing (MCCC), 1(2):17–30, 2014.
[34] A. Li, X. Yang, S. Kandula, and M. Zhang. Cloudcmp: Comparing public cloud providers. In Proceedings of the
10th ACM SIGCOMM Conference on Internet Measurement, IMC ’10, pages 1–14, New York, NY, USA, 2010.
ACM.
[35] D. Lin. An information-theoretic de󰅮inition of similarity. In Proceedings of the Fifteenth International Con-
ference on Machine Learning, ICML ’98, pages 296–304, San Francisco, CA, USA, 1998. Morgan Kaufmann
Publishers Inc.
[36] P. R. Luszczek, D. H. Bailey, J. J. Dongarra, J. Kepner, R. F. Lucas, R. Rabenseifner, and D. Takahashi. The hpc
challenge (hpcc) benchmark suite. In Proceedings of the 2006 ACM/IEEE conference on Supercomputing,
page 213. Citeseer, 2006.
[37] Mark Gates, Alex Warshavsky. Iperf. https://iperf.fr/. Accessed May 20, 2015.
[38] J. D. McCalpin. Memory bandwidth and machine balance in current high performance computers. IEEE
Computer Society Technical Committee on Computer Architecture (TCCA) Newsletter, pages 19–25, Dec.
1995.
[39] H. McGhan. Spec cpu2006 benchmark suite. Microprocessor Report, 2006.
[40] M. McGill. An evaluation of factors affecting document ranking by information retrieval systems., 1979.
Accessed May 20, 2015.
[41] M. Menzel and R. Ranjan. Cloudgenius: Decision support for web server cloud migration. In Proceedings of
the 21st International Conference on World Wide Web, WWW ’12, pages 979–988, New York, NY, USA, 2012.
ACM.
31
Nane Kratzke and Peter-Christian Quint / Journal of Cloud Computing Research
(2015) Vol. 1 No. 1 pp. 16-34
[42] M. S. Müller, M. van Waveren, R. Lieberman, B. Whitney, H. Saito, K. Kumaran, J. Baron, W. C. Brantley, C. Par-
rott, T. Elken, et al. Spec mpi2007—an application benchmark suite for parallel systems using mpi. Con-
currency and Computation: Practice and Experience, 22(2):191–205, 2010.
[43] S. Newman. Building Microservices - Designing Fine-Grained Systems. O’Reilly, 2015.
[44] W. D. Norcott and D. Capps. Iozone 󰅮ilesystem benchmark. www.iozone.org, 55, 2003.
[45] E. Parallel Systems Architecture Lab. Cloudsuite benchmarks. http://parsa.epfl.ch/cloudsuite/
overview.html. Accessed May 20, 2015.
[46] Phoronix Media. Phoronix test suite. http://www.phoronix-test-suite.com/. Accessed May 20, 2015.
[47] R. Pozo and B. Miller. Java scimark 2.0. http://math.nist.gov/scimark2/. Accessed May 20, 2015.
[48] C. Quinton, N. Haderer, R. Rouvoy, and L. Duchien. Towards multi-cloud con󰅮igurations using feature models
and ontologies. In Proceedings of the 2013 international workshop on Multi-cloud applications and federated
clouds, pages 21–26. ACM, 2013.
[49] R. Rada, H. Mili, E. Bicknell, and M. Blettner. Development and application of a metric on semantic nets.
Systems, Man and Cybernetics, IEEE Transactions on, 19(1):17–30, 1989.
[50] K. Raman. Optimizing memory bandwidth on stream triad. https://software.intel.com/en- us/
articles/optimizing-memory-bandwidth-on-stream-triad, 2013. Accessed May 20, 2015.
[51] Rick Jones . Network performance benchmark netperf. http://www.netperf.org/netperf/. Accessed
May 20, 2015.
[52] H. Schmid and A. Huber. Measuring a small number of samples, and the 3v fallacy: Shedding light on
con󰅮idence and error intervals. Solid-State Circuits Magazine, IEEE, 6(2):52–58, Spring 2014.
[53] K. Shiv, K. Chow, Y. Wang, and D. Petrochenko. Specjvm2008 performance characterization. In Computer
Performance Evaluation and Benchmarking, pages 17–35. Springer, 2009.
[54] J. Siegel and J. Perdue. Cloud services measures for global use: the service measurement index (smi). In
SRII Global Conference (SRII), 2012 Annual, pages 411–415. IEEE, 2012.
[55] Smith, Ben and Grehan, Rick and Yager, Tom and Niemi, DC. byte-unixbench - a unix benchmark suite.
https://code.google.com/p/byte-unixbench. Accessed May 20, 2015.
[56] W. Sobel, S. Subramanyam, A. Sucharitakul, J. Nguyen, H. Wong, A. Klepchukov, S. Patil, A. Fox, and D. Patter-
son. Cloudstone: Multi-platform, multi-language benchmark and measurement tools for web 2.0. In Proc.
of CCA, volume 8, 2008.
[57] R. Statistical Package. R: A language and environment for statistical computing. Vienna, Austria: R Foun-
dation for Statistical Computing, 2009.
[58] J. Stone. Tachyon parallel / multiprocessor ray tracing system. http://jedi.ks.uiuc.edu/~johns/
raytracer, 1995. Accessed May 20, 2015.
[59] L. Sun, H. Dong, F. K. Hussain, O. K. Hussain, and E. Chang. Cloud service selection: State-of-the-art and
future research directions. Journal of Network and Computer Applications, 45(0):134 – 150, 2014.
[60] A. Traeger, E. Zadok, N. Joukov, and C. P. Wright. A nine year study of 󰅮ile system and storage benchmarking.
Trans. Storage, 4(2):5:1–5:56, May 2008.
32
Nane Kratzke and Peter-Christian Quint / Journal of Cloud Computing Research
(2015) Vol. 1 No. 1 pp. 16-34
[61] C. Vazquez, R. Krishnan, and E. John. Cloud computing benchmarking: A survey. In Proceedings of Interna-
tional Conference on Grid and Cloud Computing and Applications (GCA 2014), 2014.
[62] S. Wu, Y. Bi, and X. Zeng. Using the euclidean distance for retrieval evaluation. In A. Fernandes, A. Gray, and
K. Belhajjame, editors, Advances in Databases, volume 7051 of Lecture Notes in Computer Science, pages
83–96. Springer Berlin Heidelberg, 2011.
[63] J. Yang, W. Lin, and W. Dou. An adaptive service selection method for cross-cloud service composition.
Concurrency and Computation: Practice and Experience, 25(18):2435–2454, 2013.
[64] L. Yi and J. Dai. Experience from hadoop benchmarking with hibench: from micro-benchmarks toward
end-to-end pipelines. In Advancing Big Data Benchmarks, pages 43–48. Springer, 2014.
33
Nane Kratzke and Peter-Christian Quint / Journal of Cloud Computing Research
(2015) Vol. 1 No. 1 pp. 16-34
Appendix
Table 4: Measured median performance values of several AWS and GCE machine types (sorted by
descending processing performance)
Machine Type Threads Memory (MB) Tachyon (s) Memory
(MB/s)
Network
(MB/s)
I/O Write
(MB/s)
I/O Read
(MB/s)
c3.4xlarge (AWS) 16 30720 7.92
n=33
31512.40
n=83
254.00
n=120
123.16
n=87
7140.05
n=85
i2.4xlarge (AWS) 16 123904 8.45
n=33
27910.25
n=90
255.50
n=120
121.00
n=96
5629.10
n=93
r3.4xlarge (AWS) 16 123904 8.47
n=33
31502.55
n=76
256.00
n=120
122.80
n=85
7269.05
n=84
c3.2xlarge (AWS) 8 15360 15.72
n=36
29873.50
n=80
129.00
n=120
122.99
n=96
6914.41
n=84
i2.2xlarge (AWS) 8 62464 16.79
n=36
29278.75
n=80
128.50
n=120
122.17
n=89
6923.51
n=93
m3.2xlarge (AWS) 8 30720 16.83
n=36
29433.30
n=80
126.00
n=120
121.63
n=91
7157.27
n=84
r3.2xlarge (AWS) 8 62464 16.85
n=36
29511.40
n=82
129.50
n=120
122.98
n=94
6926.79
n=84
c3.xlarge (AWS) 4 8192 31.38
n=36
20137.95
n=80
96.55
n=120
122.94
n=93
7018.64
n=96
r3.xlarge (AWS) 4 30720 33.52
n=36
18674.15
n=78
96.70
n=120
122.00
n=87
6656.78
n=87
m3.xlarge (AWS) 4 15360 33.55
n=36
19613.50
n=80
128.50
n=120
112.72
n=105
6782.94
n=84
i2.xlarge (AWS) 4 30720 33.60
n=36
20289.05
n=80
96.60
n=120
122.76
n=87
7709.90
n=84
c3.large (AWS) 2 4096 62.46
n=36
11523.00
n=78
67.80
n=120
68.41
n=90
7449.54
n=87
m3.large (AWS) 2 8192 66.96
n=36
10977.95
n=80
85.30
n=120
82.21
n=84
6868.19
n=84
r3.large (AWS) 2 15360 66.98
n=36
11562.40
n=80
68.00
n=120
67.80
n=88
7618.09
n=88
m3.medium (AWS) 1 4096 168.05
n=36
6631.10
n=103
42.10
n=120
41.61
n=84
3779.64
n=100
n1-highmem-16 (GCE) 16 105472 8.05
n=39
26836.40
n=98
282.50
n=150
87.09
n=105
5650.34
n=175
n1-highcpu-16 (GCE) 16 15360 8.07
n=36
27008.70
n=80
117.50
n=120
86.85
n=84
5205.79
n=138
n1-standard-16 (GCE) 16 60416 8.07
n=42
26827.45
n=100
71.55
n=150
86.90
n=105
5560.74
n=162
n1-highcpu-8 (GCE) 8 8192 16.15
n=45
25811.80
n=99
148.50
n=150
85.21
n=105
4978.00
n=168
n1-highmem-8 (GCE) 8 53248 16.18
n=45
25664.05
n=120
302.50
n=150
85.33
n=105
5526.72
n=167
n1-standard-8 (GCE) 8 30720 16.22
n=45
25416.80
n=98
81.15
n=150
85.30
n=105
5283.85
n=151
n1-highmem-4 (GCE) 4 26624 32.34
n=45
18857.30
n=100
58.95
n=150
83.60
n=105
5354.39
n=146
n1-standard-4 (GCE) 4 15360 32.81
n=45
18164.70
n=100
93.25
n=150
83.59
n=105
4950.54
n=167
n1-highcpu-4 (GCE) 4 3584 32.88
n=45
18484.30
n=100
166.50
n=150
85.39
n=105
4787.73
n=169
n1-highmem-2 (GCE) 2 13312 64.82
n=45
10195.00
n=100
69.20
n=150
81.13
n=105
5054.46
n=174
n1-highcpu-2 (GCE) 2 2048 65.98
n=45
10577.40
n=98
190.50
n=150
84.52
n=105
5045.38
n=172
n1-standard-2 (GCE) 2 8192 67.57
n=45
9967.35
n=106
103.00
n=150
80.96
n=105
4693.51
n=180
n1-standard-1 (GCE) 1 4096 84.99
n=51
10740.60
n=99
120.50
n=150
61.65
n=109
4621.90
n=177
34
... This Section summarizes the main points of our performance considerations. The following of our studies investigated performance aspects on service level [31,32,33] and on elastic container platform level [34,35]. We developed the research benchmarking prototypes EASY-COMPARE for the platform level and PPBENCH for the service level (see Table 18) to do these pre-solution performance analytics. ...
... We have to search for the most similar -homogeneous -resources. Therefore we developed EASYCOMPARE [35] and Appendix D: Table 18. This benchmark suite uses a feature vector shown in equation (1) to describe the performance of a virtual machine type. ...
... So, similar machine types might be rare but they exist, and they are identifiable. EASYCOMPARE can be easily extended to benchmark further IaaS cloud service providers [35]. ...
Technical Report
Full-text available
The project CloudTRANSIT dealt with the question of how to transfer cloud applications and services at runtime without downtime across cloud infrastructures from different public and private cloud service providers. This technical report summarizes the outcomes of approximately 20 research papers that have been published throughout the project. This report intends to provide an integrated birds-eye view on these-so far-isolated papers. The report references the original papers where ever possible. This project also systematically investigated practitioner initiated cloud application engineering trends of the last three years that provide several promising technical opportunities to avoid cloud vendor lock-in pragmatically. Especially European cloud service providers should track such kind of research because of the technical opportunities to bring cloud application workloads back home to Europe. Such workloads are currently often deployed and inherently bound to U.S. providers. Intensified EU General Data Protection (GDPR) policies, European Cloud Initiatives, or "America First" policies might even make this imperative. So, technical solutions needed for these scenarios that are manageable not only by large but also by small and medium-sized enterprises. Therefore, this project systematically analyzed commonalities of cloud infrastructures and cloud applications. Latest evolutions of cloud standards and cloud engineering trends (like containerization) were used to derive a cloud-native reference model (ClouNS) that guided the development of a pragmatic cloud-transferability solution. This solution intentionally separated the infrastructure-agnostic operation of elastic container platforms (like Swarm, Kubernetes, Mesos/Marathon, etc.) via a multi-cloud-scaler and the platform-agnostic definition of cloud-native applications and services via an unified cloud application modeling language. Both components are independent but complementary. Because of their independence, they can even contribute (although not intended) to other fields like moving target based cloud security-but also distributed ledger technologies (block-chains) made provide options here. The report summarizes the main outcomes and insights of a proof-of-concept solution to realize transferability for cloud applications and services at runtime without downtime.
... This Section summarizes the main points of our performance considerations. The following of our studies investigated performance aspects on service level [29,30,31] and on elastic container platform level [32,33]. We developed the research benchmarking prototypes EASY-COMPARE for the platform level and PPBENCH for the service level (see Table 18) to do these pre-solution performance analytics. ...
... So, we do not search for best performing or the most cost effective resources, we have to search for the most similar -homogeneous -resources. Therefore we developed EASYCOMPARE [33] and Appendix D: Table 18. This benchmark suite uses a feature vector shown in equation (1) to describe the performance of a virtual machine type. ...
... So, similar machine types might be rare but they exist and they are identifiable. EASYCOMPARE can be easily extended to benchmark further IaaS cloud service providers [33]. ...
Preprint
Full-text available
The project CloudTRANSIT dealt with the question how to transfer cloud applications and services at runtime without downtime across cloud infrastructures from different public and private cloud service providers. This technical report summarizes the outcomes of more than 20 research papers and reports that have been published throughout the course of the project. The intent of this report is to provide an integrated birds-eye view on these-so far-isolated papers. The report references to the original papers where ever possible. This project also systematically investigated practitioner initiated cloud application engineering trends of the last three years that provide several promising technical opportunities to avoid cloud vendor lock-in pragmatically. Especially European cloud service providers should track such kind of research because of the technical opportunities to bring cloud application workloads back home to Europe that are currently often deployed and inherently bound to U.S. providers. Intensified EU General Data Protection (GDPR) policies, European Cloud Initiatives, or "America First" policies might even make this imperative. There are technical solutions needed for these scenarios that are manageable not only by large but also by small and medium sized enterprises. Therefore, this project systematically analyzed commonalities of cloud infrastructures and of cloud applications. Latest evolutions of cloud standards and cloud engineering trends (like containerization) were used to derive a cloud-native reference model (ClouNS) that guided the development of a pragmatic cloud-transferability solution. This solution intentionally separated the infrastructure-agnostic operation of elastic container platforms (like Swarm, Kubernetes, Mesos/Marathon, etc.) via a multi-cloud-scaler and the platform-agnostic definition of cloud-native applications and services via an unified cloud application modeling language. Both components are independent but complementary. Because of their independence they can even contribute (although not intended) to other fields like moving target based cloud security. The report summarizes the main outcomes and insights of a proof-of-concept solution to realize transferability for cloud applications and services at runtime without downtime.
... To compare similar machine types it was decided to follow the approach presented in [19] to make use of machine types from di↵erent providers that show high similarities regarding processing, networking, I/O and memory performance. Table 5 shows the selection of machine types. ...
... Figure 6 5 https://github.com/microservices-demo/microservices-demo (last access 3rd July 2017) Table 5. Used machine types and regions: The machine types have been selected according to the proposed method by Kratzke and Quint [19]. The OpenStack m1.large and m1.medium are research institute specific machine types and that have been intentionally defined to show maximum similarities with the other mentioned machine types. ...
Preprint
Full-text available
Cloud-native applications are often designed for only one specific cloud infrastructure or platform. The effort to port such kind of applications into a different cloud is usually a laborious one time exercise. Modern Cloud-native application architecture approaches make use of popular elastic container platforms (Apache Mesos, Kubernetes, Docker Swarm). These kind of platforms contribute to a lot of existing cloud engineering requirements. This given, it astonishes that these kind of platforms (already existing and open source available) are not considered more consequently for multi-cloud solutions. These platforms provide inherent multi-cloud support but this is often overlooked. This paper presents a software prototype and shows how Kubernetes and Docker Swarm clusters could be successfully transfered at runtime across public cloud infrastructures of Google (Google Compute Engine), Microsoft (Azure) and Amazon (EC2) and further cloud infrastructures like Open-Stack. Additionally, software engineering lessons learned are derived and some astonishing performance data of the mentioned cloud infrastruc-tures is presented that could be used for further optimizations of IaaS transfers of Cloud-native applications.
... To compare similar machine types it was decided to follow the approach presented in [19] to make use of machine types from different providers that show high similarities regarding processing, networking, I/O and memory performance. Table 5 shows the selection of machine types. ...
... It turned out that most of runtimes are due to low level IaaS infrastructure operations and not due to elastic container platform operations. Figure 6 Table 5. Used machine types and regions: The machine types have been selected according to the proposed method by Kratzke and Quint [19]. The OpenStack m1.large and m1.medium are research institute specific machine types and that have been intentionally defined to show maximum similarities with the other mentioned machine types. ...
Chapter
Full-text available
Cloud-native applications are often designed for only one specific cloud infrastructure or platform. The effort to port such kind of applications into a different cloud is usually a laborious one time exercise. Modern Cloud-native application architecture approaches make use of popular elastic container platforms (Apache Mesos, Kubernetes, Docker Swarm). These kind of platforms contribute to a lot of existing cloud engineering requirements. This given, it astonishes that these kind of platforms (already existing and open source available) are not considered more consequently for multi-cloud solutions. These platforms provide inherent multi-cloud support but this is often overlooked. This paper presents a software prototype and shows how Kubernetes and Docker Swarm clusters could be successfully transfered at runtime across public cloud infrastructures of Google (Google Compute Engine), Microsoft (Azure) and Amazon (EC2) and further cloud infrastructures like OpenStack. Additionally, software engineering lessons learned are derived and some astonishing performance data of the mentioned cloud infrastructures is presented that could be used for further optimizations of IaaS transfers of Cloud-native applications.
... The platforms operated a reference "sock-shop" application being one of the most complete reference applications for microservices architecture research [18]. Table IV lists the machine types that show a high similarity across different providers [19]. ...
Article
Full-text available
Cloud applications expose besides service endpoints also potential or actual vulnerabilities. Therefore, cloud security engineering efforts focus on hardening the fortress walls but seldom assume that attacks may be successful. At least against zero-day exploits, this approach is often toothless. Other than most security approaches and comparable to biological systems, we accept that defensive "walls" can be breached at several layers. Instead of hardening the "fortress" walls, we propose to make use of an (additional) active and adaptive defense system to attack potential intruders. This immune system is inspired by the concept of a moving target defense. This "immune system" works on two layers. On the infrastructure layer, virtual machines are continuously regenerated (cell regeneration) to wipeout even undetected intruders. On the application level, the vertical and horizontal attack surface is continually modified to circumvent successful replays of formerly scripted attacks. Our evaluations with two common cloud-native reference applications in popular cloud service infrastructures (Amazon Web Services, Google Compute Engine, Azure and OpenStack) show that itis technically possible to limit the time of attackers acting detected down to minutes. Further, more than 98% of an attack surface can be changed automatically and minimized, which makes it hard for intruders to replay formerly successful scripted attacks. So, even if intruders get a foothold in the system, it is hard for them to maintain it. Therefore, our proposals are robust and dynamically change due in response to security threats similar to biological immune systems.
... The platforms operated a reference "sock-shop" application being one of the most complete reference applications for microservices architecture research [17]. Table IV lists the machine types that show a high similarity across different providers [18]. ...
Preprint
Full-text available
Cloud applications expose - besides service endpoints - also potential or actual vulnerabilities. Therefore, cloud security engineering efforts focus on hardening the fortress walls but seldom assume that attacks may be successful. At least against zero-day exploits, this approach is often toothless. Other than most security approaches and comparable to biological systems we accept that defensive "walls" can be breached at several layers. Instead of hardening the "fortress" walls we propose to make use of an (additional) active and adaptive defense system to attack potential intruders - an immune system that is inspired by the concept of a moving target defense. This "immune system" works on two layers. On the infrastructure layer, virtual machines are continuously regenerated (cell regeneration) to wipe out even undetected intruders. On the application level, the vertical and horizontal attack surface is continuously modified to circumvent successful replays of formerly scripted attacks. Our evaluations with two common cloud-native reference applications in popular cloud service infrastructures (Amazon Web Services, Google Compute Engine, Azure and OpenStack) show that it is technically possible to limit the time of attackers acting undetected down to minutes. Further, more than 98% of an attack surface can be changed automatically and minimized which makes it hard for intruders to replay formerly successful scripted attacks. So, even if intruders get a foothold in the system, it is hard for them to maintain it.
... Our work in [1] presented the MyMinder architecture and problem formulation for selecting the most suitable CP to migrate the VM along with some initial experimental results that motivate the need for live VM migration from one CP to another. Although we presented the performance variability results from the bursting instance types of the selected public CPs, authors in [15] and [14] have experimentally proved that even dedicated instances show performance variability. ...
Chapter
Full-text available
Vendor lock-in is a major obstacle for cloud users in performing multi-cloud deployment or inter-cloud migration, due to the lack of standardization. Current research efforts tackling the inter-cloud migration problem are commonly technology-oriented with significant performance overheads. Moreover, current studies do not provide adequate support for decision making such as why and when inter-cloud migration should take place. We propose the architecture and the problem formulation of a Multi-objective dYnamic MIgratioN Decision makER (MyMinder) framework that assists cloud users in achieving a stable QoS performance in the post-deployment phase by helping decide on actions to be taken as well as providing support to achieve such actions. Additionally, we demonstrate the migration capability of MyMinder by proposing an Automated Triggering Algorithm (ATA), which uses existing Docker Swarm technology for application migration.
... The platforms operated a reference "sockshop" application being one of the most complete reference applications for microservices architecture research (Aderaldo et al., 2017). Table 4 lists the machine types that show a high similarity across different providers (Kratzke and Quint, 2015). ...
Conference Paper
Full-text available
Cloud applications expose - beside service endpoints - also potential or actual vulnerabilities. And attackers have several advantages on their side. They can select the weapons, the point of time and the point of attack. Very often cloud application security engineering efforts focus to harden the fortress walls but seldom assume that attacks may be successful. So, cloud applications rely on their defensive walls but seldom attack intruders actively. Biological systems are different. They accept that defensive "walls" can be breached at several layers and therefore make use of an active and adaptive defense system to attack potential intruders - an immune system. This position paper proposes such an immune system inspired approach to ensure that even undetected intruders can be purged out of cloud applications. This makes it much harder for intruders to maintain a presence on victim systems. Evaluation experiments with popular cloud service infrastructures (Amazon Web Services, Google Compute Engine, Azure and OpenStack) showed that this could minimize the undetected acting period of intruders down to minutes.
... Table IV lists the machine types that show a high similarity across different providers. These machine types have been selected according to [22]. The OpenStack m1.large and m1.medium are research institute specific machine types that have been intentionally defined to show maximum similarities with the other mentioned machine types. ...
Conference Paper
Full-text available
There is no such thing as an impenetrable system, although the penetration of systems does get harder from year to year. The median days that intruders remained undetected on victim systems dropped from 416 days in 2010 down to 99 in 2016. Perhaps because of that, a new trend in security breaches is to compromise the forensic trail to allow the intruder to remain undetected for longer in victim systems and to retain valuable footholds for as long as possible. This paper proposes an immune system inspired solution which uses a more frequent regeneration of cloud application nodes to ensure that undetected compromised nodes can be purged. This makes it much harder for intruders to maintain a presence on victim systems. Basically the biological concept of cell-regeneration is combined with the information systems concept of append-only logs. Evaluation experiments performed on popular cloud service infrastructures (Amazon Web Services, Google Compute Engine, Azure and OpenStack) have shown that between 6 and 40 nodes of elastic container platforms can be regenerated per hour. Even a large cluster of 400 nodes could be regenerated in somewhere between 9 and 66 hours. So, regeneration shows the potential to reduce the foothold of undetected intruders from months to just hours.
Article
Full-text available
Selection of qualified personnel is a key success factor for an organization. The complexity and importance of the problem call for analytical methods rather than intuitive decisions. In literature, there are various methods regarding personnel selection. This paper considers a real application of personnel selection with using the opinion of expert by one of the decision making model, it is called SAW method. This paper has applied seven criteria that they are qualitative and positive for selecting the best one amongst five personnel and also ranking them. Finally the introduced method is used in a case study.
Article
Full-text available
Cloud platforms encompass a large number of storage services that can be used to manage the needs of customers. Each of these services, offered by a different provider, is characterized by specific features, limitations and prices. In presence of multiple options, it is crucial to select the best solution fitting the requirements of the customer in terms of quality of service and costs. Most of the available approaches are not able to handle uncertainty in the expression of subjective preferences from customers, and can result in wrong (or sub-optimal) service selections in presence of rational/selfish providers, exposing untrustworthy indications concerning the quality of service levels and prices associated to their offers. In addition, due to its multi-objective nature, the optimal service selection process results in a very complex task to be managed, when possible, in a distributed way, for well-known scalability reasons. In this work, we aim at facing the above challenges by proposing three novel contributions. The fuzzy sets theory is used to express vagueness in the subjective preferences of the customers. The service selection is resolved with the distributed application of fuzzy inference or Dempster-Shafer theory of evidence. The selection strategy is also complemented by the adoption of a game theoretic approach for promoting truth-telling ones among service providers. We present empirical evidence of the proposed solution effectiveness through properly crafted simulation experiments.
Article
Full-text available
To overcome vendor lock-in obstacles in public cloud computing the capability to define transferable cloud-based services is crucial but not yet solved satisfactory. This is especially true for small and medium sized enterprises being typically not able to operate a vast staff of cloud service and IT experts. Actual state of the art cloud service design does not deal systematically how to define, deploy and operate cross-plattform capable cloud services. This is mainly due to inherent complexity of the field and differences in details between a plenty of existing public and private cloud infrastructures. One way to handle this complexity is to restrict cloud service design to a common subset of commodity features provided by existing public and private cloud infrastructures. Nevertheless these restrictions raise new service design questions and have to be answered in ongoing research in a pragmatic manner regarding the limited IT-operation capabilities of small and medium sized enterprises. By simplifying and harmonizing the use of cloud infrastructures using lightweight virtualization approaches the transfer of cloud deployments between a variety of cloud service providers will become possible. This article will discuss several aspects like high availability, secure communication, elastic service design, transferability of services and formal descriptions of service deployments which have to be addressed and are investigated by our ongoing research.
Article
Full-text available
Actual state of the art of cloud service design does not deal systematically how to define, deploy and operate cross-platform capable cloud services. By simplifying and harmonizing the use of IaaS cloud infrastructures using lightweight virtualization approaches, the transfer of cloud deployments between a variety of cloud service providers becomes more frictionless. This article proposes operating system virtualization as an appropriate and already existing abstraction layer on top of public and private IaaS infrastructures, and derives an reference architecture for lightweight virtualization clusters. This reference architecture is reflected and derived from several existing (open source) projects for container-based virtualization like Docker, Jails, Zones, Workload Partitions of various Unix operating systems and open source PaaS platforms like CoreOS, Apache Mesos, OpenShift, CloudFoundry and Kubernetes.
Article
Full-text available
Cloud technology connects a network of virtualized computers that are dynamically provisioned as computing resources, based on negotiated agreements between service providers and users. It delivers information technology resources in diverse forms of service, and the explosion of Cloud services on the Internet brings new challenges in Cloud service discovery and selection. To address these challenges, a range of studies has been carried out to develop advanced techniques that will assist service users to choose appropriate services. In this paper, we survey state-of-the-art Cloud service selection approaches, which are analyzed from the following five perspectives: decision-making techniques; data representation models; parameters and characteristics of Cloud services; contexts, purposes. After comparing and summarizing the reviewed approaches from these five perspectives, we identify the primary research issues in contemporary Cloud service selection. This survey is expected to bring benefits to both researchers and business agents.
Conference Paper
As Hadoop-based big data framework grows in pervasiveness and scale, realistically benchmarking Hadoop systems becomes critically important to the Hadoop community and industry. In this paper, we present our experience of Hadoop benchmarking with HiBench (an open source Hadoop benchmark suite widely used by Hadoop users), and introduce our recent work on advanced end-to-end ETL-recommendation pipelines based on our experience.
Article
Cloud computing aims at providing elastic applications that can be scaled at runtime. In practice, traditional service composition methods are not flexible enough to changes in cross-cloud environment. In view of this challenge, an adaptive service selection method for cross-cloud service composition is proposed in this paper, by dynamically selecting proper services with near-optimal performance for adapting to changes in time. First, concretely speaking, the service selection and execution are modeled with Markov decision process to ensure flexibility. Second, service pair is defined, and the way to build service pairs set is proposed to predict the performance of candidate services. Third, the adaptive service selection algorithm is designed to select proper cloud services for changing cross-cloud environment. Finally, a case study for cross-cloud service composition and experiments are presented for validating the feasibility of our proposal. Concurrency and Computation: Practice and Experience, 2013.(c) 2013 Wiley Periodicals, Inc.
Article
Outsourced computation services should ideally only charge customers for the resources used by their applications. Unfortunately, no verifiable basis for service providers and customers to reconcile resource accounting exists today. This leads to undesirable outcomes for both providers and consumers-providers cannot prove to customers that they really devoted the resources charged, and customers cannot verify that their invoice maps to their actual usage. As a result, many practical and theoretical attacks exist, aimed at charging customers for resources that their applications did not consume. Moreover, providers cannot charge consumers precisely, which causes them to bear the cost of unaccounted resources or pass these costs inefficiently to their customers. We introduce ALIBI, a first step toward a vision for verifiable resource accounting. ALIBI places a minimal, trusted reference monitor underneath the service provider's software platform. This monitor observes resource allocation to customers' guest virtual machines and reports those observations to customers, for verifiable reconciliation. In this paper, we show that ALIBI efficiently and verifiably tracks guests' memory use and CPU-cycle consumption.