--- PREPRINT ---
Impact of Application Load in
Function as a Service
Johannes Manner and Guido Wirtz
DSG, University Bamberg, An der Weberei 5, 96047 Bamberg, Germany
Abstract. Function as a Service (FaaS) introduces a different notion of scaling than related paradigms. The unlimited upscaling and the property of downscaling to zero running containers lead to a situation where the application load directly influences the number of running containers. We propose a combined simulation and benchmarking process for cloud functions to provide information on the performance and cost aspects for developers in an early development stage. Our focus in this paper is on simulating the concurrently running containers on a FaaS platform based on different function configurations. The experiment performed serves as a proof of concept and emphasizes the importance for design decisions and system requirements. Especially for self-hosted FaaS platforms or resources bound to cloud functions like database connections, this information is crucial for deployment and maintenance.
Keywords: Serverless Computing, Function as a Service, FaaS, Benchmarking, Simulation, Load Profile
1 Introduction
As in every virtualization technology, a cloud function container faces some performance challenges when it is created and executed for the first time. To estimate the quality of service a system delivers, benchmarking applications is crucial. Huppler [1] stated that a benchmark should be relevant, fair, verifiable, economical and repeatable. A number of experiments, e.g. [2,3,4,5,6], have been executed in the FaaS domain to assess this new paradigm in the cloud stack.
We investigate these benchmarks based on Huppler's requirements. All of them evaluate the performance of one or more FaaS platforms compared to each other or to related technologies like Virtual Machine (VM) based solutions. The biggest issue in general is the repeatability of a benchmark since the targeted field is evolving rapidly. Another problem is the lack of information about the settings and other influential factors of the mentioned experiments. Results are discussed in detail in each of these publications, but only a few of them describe all the steps necessary to repeat the experiment and verify the findings, as Kuhlenkamp and Werner [7] ascertained. All the benchmarks in their literature study are FaaS related and were conducted since 2015. They gave each experiment a score between 0 and 4 to assess the quality of the presented work. Workload generator, function implementation, platform configuration and other used services are the categories of their systematic literature study. The mean score was 2.6, which indicates that a lot of information is missing from the conducted benchmarks. Only 3 out of 26 experiments supplied all preconditions and parameters needed to reproduce the presented results. Therefore, results of different benchmarks are often not comparable to each other. The first category, the generation of load patterns and their topology, is the least discussed item: only every third FaaS benchmarking publication addresses the load pattern aspect.
As the load pattern topology has a major influence on the scaling behavior of a cloud function platform, we focus on this aspect here. It is also important for software architects constructing hybrid architectures, who need information about the incoming request rate in the non-FaaS parts of their systems. Otherwise, the FaaS part of an application can cause a Distributed Denial of Service (DDoS) attack on other parts. Our paper stresses this aspect in particular by (i) discussing different ways to specify load patterns, (ii) proposing a workflow for a combined FaaS simulation and benchmarking process and (iii) presenting a methodology to compute the number of running instances from a given load trace.
The outline of the paper is as follows. Section 2 discusses related work and addresses the first contribution, namely which load generation tools are suited to specify application workloads. Section 3 proposes a generic workflow for a simulation and benchmarking process of cloud functions and picks a single aspect, the number of concurrently running functions, as a proof of concept. The paper concludes with a discussion in Section 4 and an outlook in Section 5.
2 Related Work
2.1 Benchmarking FaaS
The open challenges Iosup and others [8] mentioned in their publication on Infrastructure as a Service (IaaS) benchmarking partly remain open challenges for FaaS as well. There is currently a lack of methodological approaches to benchmark cloud functions consistently. Malawski and others [6,9] conducted scientific workflow benchmarks and built their benchmarking pipeline on top of the Serverless Framework. They publish their benchmarking results continuously, but do not include simulations, which would reduce cost and time. Similar to this approach, Scheuner and Leitner [10] introduced a system where micro and application benchmarks are combined. Especially the micro benchmarking aspect is interesting for a consistent FaaS methodology since typically a single cloud function is the starting point. Three different load patterns are part of their contribution but hidden in the implementation of their system and therefore not directly mentioned, as in many other FaaS publications. These initial benchmarks focusing on a single aspect in isolation are important steps towards understanding the impact on system design and execution, but they are quite difficult to set up and need a lot of time for execution, as Iosup already noted for IaaS benchmarking. Therefore, [8] proposes a combination of simulating small-sized artificial workloads and conducting real world experiments as the most promising approach to get stable results with the least effort in time and money.
2.2 Load Patterns in Conducted Experiments
The "job arrival pattern" [8] is critical for the performance of any System Under Test (SUT), especially in FaaS, where scaling is determined by the given input workload. To perform repeatable benchmarks and to enable a simulation of cloud functions under different external circumstances, the documentation of load patterns is critical. As mentioned in the introduction, this is the least discussed aspect. Some authors explained their workloads in detail, as discussed by [7], but these descriptions are not sufficient:
McGrath and Brenner [11] performed a concurrency test and a backoff test. The concurrency test featured 15 test executions at 10 second intervals with an increasing number of concurrent requests: only 1 request was started in the first execution, and 15 concurrent requests were submitted in the last. This was repeated 10 times. The backoff test performed single invocations with 1 to 30 minutes of pausing time between them to investigate the expiration time of a cloud function container and the impact of cold starts on execution performance.
Lee and others [4] focused on concurrency tests. First they measured the function throughput per second by invoking the cloud functions 500, 1,000, 2,000, 3,000 and 10,000 times. The time between invocations was not mentioned, so it is not clear whether the second call reused the already warm containers from the first execution. Furthermore, they investigated different aspects with 1 request at a time, 100 concurrent requests and a few other settings, but likewise did not inform the reader about the wait time between calls or the exact distribution.
Figiela and others [12] conducted two CPU intensive benchmarks. The first one was executed every 5 minutes and invoked the different functions once. The second experiment used a fork-join model and executed 200, 400 and 800 concurrent tasks in parallel. The number of repetitions and the corresponding wait time between them were not mentioned and maybe not present.
Back and Andrikopoulos [2] used fast Fourier transformation, matrix manipulation and sleep use cases for their benchmark. They parameterized each function and executed each combination once a day on three consecutive days. It is unclear to the reader whether all of these measurements resulted in a cold start on the respective providers. The results are also prone to outliers since a sample size of 3 executions per combination can distort findings.
Das and others [13] implemented a sequential benchmark of cloud and edge resources, where the time between two consecutive invocations was between 10 and 15 seconds to avoid concurrent request executions. There is no information on how the authors dealt with the first invocation of a cloud function.
Manner and others [14] focused on the cold start overhead in FaaS. They defined a sequential load pattern to generate pairs of a single cold and a single warm start in order to compare the performance on a container basis. Warm starts were executed 1 minute after the cold execution returned. After each pair, the pattern paused for 29 minutes to achieve a shutdown of the container. W.r.t. the load pattern aspect, the experiments in this publication are reproducible, and all information necessary to repeat them is described.
All the presented workloads are artificial load patterns, where some reductions are made for simplicity and to assess a single detail or use case in FaaS. It is often unclear whether the authors used an established load generation tool or implemented a proprietary interface for submitting the workload. There is currently a lack of experiments which use real world load traces.
2.3 Load Generation Tools
Before discussing load generation tools, the kind of application load is important for any benchmark or simulation. Schroeder and others [15] defined three kinds of systems: closed, open and partly-open. In a closed system, the number of incoming requests can be predicted based on other parts of the system. In contrast, the workload of an open system is not predictable since users access the service randomly via an interface. A partly-open system is a combination of both.
We focus only on open systems since a single cloud function is the subject of our work and therefore has no other dependencies. A recent study about workload generators for web-based systems [16] presents a comprehensive collection and compares a lot of generation tools. For benchmarking FaaS, an arrival rate of requests is the needed input. Therefore, we picked two tools as references for generating workloads. Tools like JMeter focus on controlled workloads with constant, linear or stepwise increasing loads. This behavior is especially important to generate clean and clear experimental setups which isolate the different aspects under investigation. Based on these ideas, we also implemented some benchmarking modes in our prototype to control the execution of requests based on our needs and added some instrumentation to compare the execution time on the platform and on the local machine submitting the requests. On the other hand, there are tools to model real world load traces based on seasonal, bursty, noisy and trend parts, like LIMBO [17]. LIMBO enables the generation of a load pattern based on an existing trace or via a combination of mathematical functions. In contrast to JMeter, where the load can be submitted directly, LIMBO decouples the load generation from the submission via another tool, as suggested by [16].
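A small arrival-trace generator illustrates the open-system workloads discussed above. This is a hedged sketch, not the authors' tooling or the LIMBO API: the function name, the per-second rate model and the uniform jitter are our own assumptions, loosely following LIMBO's idea of composing a load intensity profile from a constant base part and a bursty part.

```python
import random

def arrival_trace(duration_s, base_rate, burst_start, burst_len, burst_rate, seed=42):
    """Generate sorted request timestamps (in seconds) for an open system:
    a constant base load with one bursty phase, jittered within each second."""
    rng = random.Random(seed)  # fixed seed keeps the trace reproducible
    stamps = []
    for second in range(duration_s):
        in_burst = burst_start <= second < burst_start + burst_len
        rate = burst_rate if in_burst else base_rate
        # spread this second's requests uniformly within the second
        stamps.extend(second + rng.random() for _ in range(rate))
    return sorted(stamps)

# e.g. 26 seconds of load: 2 req/s base rate, 6 req/s during seconds 10-14
trace = arrival_trace(26, 2, 10, 5, 6)
```

Such a trace could serve directly as the list of request timestamps consumed by the simulation described in Section 3.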
3 FaaS Benchmarking and Simulation
3.1 Combined Workflow
[Flowchart omitted; recoverable node labels: "Deploy Cloud ...", "Store Results", "Prediction suited for the use case?", "Store Simulation ... Execution Data", "Platform and Simulation Data".]
Fig. 1: Generic Pipeline for FaaS Benchmarking
Figure 1 presents a generic pipeline for FaaS benchmarking inspired by Iosup [8]. The SUT is not explicitly mentioned since a single cloud function is the SUT in our approach. The memory setting, the size of the deployment artifact etc. [14] directly influence the execution time and are therefore relevant, in combination with the load pattern, for assessing the number of concurrently running containers. After the cloud function, the load pattern and the mentioned metadata are provided, our prototype starts the simulation. Once the simulation is done, the user has to decide whether the simulated values are suited for the use case, e.g. whether the number of concurrently running instances does not exceed a limit, or whether the values have to be adjusted for another simulation run, e.g. by raising the memory setting and thereby reducing the overall execution time. In the latter case, the prototype stores the interim result for a later comparison with the next simulation round, so that a developer can assess which setting results in better cost and performance.
If the simulation is satisfying for the user, our prototype deploys the function using the Serverless Framework and submits the workload based on the load pattern. Our prototype uses synchronous Representational State Transfer (REST) calls to generate events on the FaaS platform, as introduced in [14]. This behavior is similar to the direct executor model proposed by [9]. Subsequently, the user analyses the platform execution data and compares the results with the predicted values of the simulation. Finally, the results are stored for further improving the simulation framework proposed in a prior paper [18].
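The simulate-decide-adjust loop of this pipeline can be sketched as follows. This is a minimal illustration with toy stand-ins, not the prototype's actual API: `simulate_peak_containers` is a crude placeholder model (the memory-to-speed ratio and the 1024.0 constant are invented for illustration), and the deploy and benchmark steps are only indicated as comments.

```python
def simulate_peak_containers(memory_mb, requests_per_second):
    """Toy stand-in for the simulation step: assume doubling the memory
    (and thus CPU) roughly halves the execution time, so fewer containers
    run concurrently at the peak of the load."""
    exec_time_s = 1024.0 / memory_mb  # purely illustrative model
    return max(requests_per_second) * exec_time_s

def choose_configuration(requests_per_second, container_limit, memory_mb=128):
    """Raise the memory setting until the predicted concurrency fits the
    limit, storing interim results for a cost/performance comparison."""
    interim_results = []
    while simulate_peak_containers(memory_mb, requests_per_second) > container_limit:
        interim_results.append(
            (memory_mb, simulate_peak_containers(memory_mb, requests_per_second)))
        memory_mb *= 2  # adjust the configuration and run another simulation
    # Here the prototype would deploy the function via the Serverless Framework,
    # submit the workload as synchronous REST calls, and compare the measured
    # platform data with the prediction.
    return memory_mb, interim_results
```

For a trace peaking at 7 requests per second and a limit of 30 concurrent containers, the loop would discard the 128 MB setting and settle on 256 MB under this toy model.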
3.2 Simulating Number of Cloud Function Containers
This section focuses on a single piece of the pipeline presented in Figure 1: Perform Simulation. We investigate only one aspect of this piece: the number of running containers. An important factor is the execution time w.r.t. different function configurations. The input of a function also highly influences its runtime performance, e.g. for sorting algorithms. We tackle this problem of varying execution times in future work, when refining the simulation engine. Further aspects are the associated cost impact, effects on used backend services like cloud databases etc. If the simulation exposes a high number of concurrently running containers and the concurrency level is problematic, the developer could throttle the incoming requests by using a queue etc.
Algorithm 1 is implemented by our prototype. A comparison with the scheduling algorithms used in open source FaaS platforms is still outstanding. Currently, the simulation uses the mean execution time (exec) of the investigated cloud function, the mean cold start time (cold) and the idle time until container shutdown (shutdown). The timeStamps are a list of double values marking the start times of the requests and are created manually. Statistical deviations are not included in this proposed simulation approach, but planned for future work. The gateway spawns events and triggers the function under test. Multi-tenancy is not included in our algorithm, but has an impact on performance and execution time, as Hellerstein and others [19] stated.
Algorithm 1 Basic Simulation - Number of Containers
1: procedure simulate(exec, cold, shutdown, timeStamps)
2:   for time in timeStamps do
3:     checkFinishedContainers(time)
4:     shutdownIdleContainers(time, shutdown)
5:     if idleContainerAvailable() then
6:       pickIdleAndExecute(exec)
7:     else
8:       spinUpAndExecute(cold, exec)
9:     end if
10:  end for
11:  shutdownAllContainers()
12:  generateContainerDistribution()
13: end procedure
In line 3, the program checks whether some of the containers have finished their execution at time and sets these containers to an idle state. The next function shuts down all idle containers which exceed the shutdown time. At this point, the internal state of the simulation is clean and the next request can be executed either by an already warm container (lines 5, 6) or by a new instance (line 8), which is affected by a cold start. Once all requests are served, the prototype produces a distribution of how many containers are running on a per-second basis.
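Algorithm 1 can be sketched in a few lines of Python. This is our illustrative reading of the algorithm, not the authors' prototype code; in particular, we assume a container counts as "running" only while it executes a request (including its cold start), which matches the tear-down to zero visible in Figure 2.

```python
import math
from collections import Counter

def simulate(exec_time, cold, shutdown, time_stamps):
    """Return a per-second count of concurrently running containers."""
    containers = []      # each: {"busy_until": t, "idle_since": t or None}
    busy_intervals = []  # (start, end) of every execution, incl. cold start
    for t in sorted(time_stamps):
        # Line 3: mark containers whose execution has finished as idle.
        for c in containers:
            if c["idle_since"] is None and c["busy_until"] <= t:
                c["idle_since"] = c["busy_until"]
        # Line 4: shut down containers idle for longer than `shutdown`.
        containers = [c for c in containers
                      if c["idle_since"] is None or t - c["idle_since"] < shutdown]
        # Lines 5-8: reuse a warm container or spin up a new one (cold start).
        warm = next((c for c in containers if c["idle_since"] is not None), None)
        if warm is not None:
            warm["idle_since"] = None
            warm["busy_until"] = t + exec_time
            busy_intervals.append((t, t + exec_time))
        else:
            end = t + cold + exec_time
            containers.append({"busy_until": end, "idle_since": None})
            busy_intervals.append((t, end))
    # Line 12: aggregate the busy intervals into a per-second distribution.
    distribution = Counter()
    for start, end in busy_intervals:
        for second in range(int(start), int(math.ceil(end))):
            distribution[second] += 1
    return distribution
```

For two requests at second 0 and one at second 4, with exec_time=2.0, cold=1.0 and shutdown=1800.0, the sketch reports two cold-started containers running during seconds 0 to 2 and a single reused warm container during seconds 4 and 5.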
4 Discussion
Figure 2 depicts an initial load trace and two corresponding simulations. The colored numbers are counts on a per-second basis and show the number of incoming requests (orange) and the numbers of concurrently running containers (yellow and gray). The input trace is artificially created and the values for these two simulations are chosen w.r.t. a prior investigation [14].
Timestamp InitialDistribution SimulationInput[10.0,0.3,1800.0] SimulationInput[5.1,0.25,1800.0]
2 4 10 10
3 7 17 17
4 6 23 23
5 4 27 25
6 3 30 25
7 3 33 25
8 3 36 25
9 3 39 25
10 6 43 25
11 6 46 25
12 6 49 25
13 4 49 26
14 4 49 27
15 4 48 28
16 0 43 25
17 0 40 18
18 0 37 12
19 0 34 8
20 0 31 4
21 0 25 0
22 0 19 0
23 0 12 0
49 49 49 48
23 25 25 25 25 25 25 25 25 26 27 28
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
InitialDistribution SimulationInput[10.0,0.3,1800.0] SimulationInput[5.1,0.2 5,1800.0]
Fig. 2: Example Distribution for an Artificial Load Trace
The execution time of the gray run (10.0) is roughly twice the execution time of the yellow run (5.1). On many FaaS platforms, like AWS Lambda and Google Cloud Functions, the CPU resources are directly coupled with the memory setting. We suppose, for example, that the yellow cloud function is deployed with a memory limit of 256 MB RAM, whereas the gray cloud function is restricted to 128 MB RAM. Assuming that the two functions are implemented in Java, the cold start time is not affected in the same way since the JVM startup is resource intensive in both cases. The shutdown time (1800.0) has no effect in this example since the considered interval is too short.
The gray and yellow graphs show a start-up, an execution and a tear-down phase. Our artificial distribution simulates a moderate load with a few invocations per second. The start-up phase is similar for the first 5 seconds; after 5 seconds the first containers are reused in the yellow simulation. The execution phase lasts only five seconds for the gray simulation (seconds 11 to 16) compared to ten seconds for the yellow simulation (seconds 6 to 16), but shows the impact of the supposed runtime configuration on the number of running containers. For self-hosted FaaS platforms or resources bound to cloud functions like database connections, the difference between 28 and 49 concurrently running containers influences system requirements and design decisions. The tear-down in the yellow case happens faster due to the shorter execution time. The output load trace is missing in this first version of our simulation.
4 Compare the input values to Algorithm 1 - SimulationInput[exec, cold, shutdown].
5 Source code, parameters and input trace are available on GitHub: https://github.
5 Future Work
The aim is to implement the suggested simulation and benchmarking pipeline in
our prototype. Therefore, the next step is to include an automated data picking
facility as Malawski and others [9] already implemented.
Our simulation model is based on a few parameters without statistical deviation to keep the system deterministic. We want to extend the simulation in this direction and also include the output load pattern of our simulation, since this output may be the input for another component of the overall (hybrid) application. Furthermore, we want to conduct a few benchmarks on constant, linear and bursty workloads to refine our simulation model, perform a realistic proof of concept of our work and include the multi-tenancy aspect.
To conclude, the topology of the load pattern has a major influence on the number of running containers on FaaS platforms. Our paper stresses this aspect in particular and puts emphasis on the lack of documentation in the experiments conducted in the literature. The presented simulation is a first step in our overall simulation approach towards predictability of platform behavior.
References
1. K. Huppler. The art of building a good benchmark. In Raghunath Nambiar and Meikel Poess, editors, Performance Evaluation and Benchmarking, pages 18–30. Springer, 2009.
2. T. Back and V. Andrikopoulos. Using a Microbenchmark to Compare Function as
a Service Solutions. In Service-Oriented and Cloud Computing. Springer, 2018.
3. D. Jackson and G. Clynch. An investigation of the impact of language runtime on
the performance and cost of serverless functions. In Proc. WoSC, 2018.
4. H. Lee et al. Evaluation of Production Serverless Computing Environments. In
Proc. WoSC, 2018.
5. W. Lloyd et al. Improving Application Migration to Serverless Computing Platforms: Latency Mitigation with Keep-Alive Workloads. In Proc. WoSC, 2018.
6. M. Malawski et al. Benchmarking Heterogeneous Cloud Functions. In Dora B. Heras and Luc Bougé, editors, Euro-Par 2017: Parallel Processing Workshops, pages 415–426. Springer International Publishing, 2018.
7. J. Kuhlenkamp and S. Werner. Benchmarking FaaS Platforms: Call for Community
Participation. In Proc. WoSC, 2018.
8. A. Iosup et al. IaaS Cloud Benchmarking: Approaches, Challenges, and Experience.
In Proc. MTAGS, 2012.
9. M. Malawski. Towards Serverless Execution of Scientific Workflows - HyperFlow Case Study. In Proc. WORKS, 2016.
10. J. Scheuner and P. Leitner. A Cloud Benchmark Suite Combining Micro and
Applications Benchmarks. In Proc. ICPE, 2018.
11. G. McGrath and P. R. Brenner. Serverless computing: Design, implementation,
and performance. In Proc. ICDCSW, 2017.
12. K. Figiela et al. Performance evaluation of heterogeneous cloud functions. Concurrency and Computation: Practice and Experience, 2018.
13. A. Das et al. EdgeBench: Benchmarking edge computing platforms. In Proc.
WoSC, 2018.
14. J. Manner et al. Cold Start Influencing Factors in Function as a Service. In Proc.
WoSC, 2018.
15. B. Schroeder et al. Open Versus Closed: A Cautionary Tale. In Proc. NSDI, 2006.
16. M. Curiel and A. Pont. Workload generators for web-based systems: Characteristics, current status, and challenges. IEEE Communications Surveys & Tutorials, 20(2):1526–1546, 2018.
17. J. von Kistowski et al. Modeling and extracting load intensity profiles. ACM
Transactions on Autonomous and Adaptive Systems, 11(4):1–28, 2017.
18. J. Manner. Towards Performance and Cost Simulation in Function as a Service.
In Proc. ZEUS (accepted), 2019.
19. J. M. Hellerstein et al. Serverless Computing: One Step Forward, Two Steps Back.
In Proc. CIDR, 2019.