Towards Performance Tooling Interoperability: An Open Format for Representing Execution Traces
Dušan Okanović1, André van Hoorn1, Christoph Heger2, Alexander Wert2, and Stefan Siegl2
1 Univ. of Stuttgart, Inst. of Software Technology, Reliable Software Systems, GER
2 NovaTec Consulting GmbH, CA Application Performance Management, GER
Abstract. Execution traces capture information on a software system's runtime behavior, including data on system-internal software control flows, performance, as well as request parameters and values. In research and industrial practice, execution traces serve as an important basis for model-based and measurement-based performance evaluation, e.g., for application performance monitoring (APM), extraction of descriptive and prescriptive models, as well as problem detection and diagnosis. A number of commercial and open-source APM tools that allow the capturing of execution traces within distributed software systems are available. However, each of the tools uses its own (proprietary) format, which means that each approach building on execution trace data is tool-specific.
In this paper, we propose the Open Execution Trace Exchange (OPEN.xtrace) format to enable data interoperability and exchange between APM tools and software performance engineering (SPE) approaches. Particularly, this enables SPE researchers to develop their approaches in a tool-agnostic and comparable manner. OPEN.xtrace is a community effort as part of the overall goal to increase interoperability of SPE/APM techniques and tools.
In addition to describing the OPEN.xtrace format and its tooling support, we evaluate OPEN.xtrace by comparing its modeling capabilities with the information that is available in leading APM tools.
1 Introduction
Dynamic program analysis aims to get insights from a software system based on runtime data collected during its execution [ ]. An important data structure used for dynamic program analysis is the execution trace. In its simplest form, an execution trace captures the control flow of method executions for a request served by the system. It can be represented by a dynamic call tree as depicted in Figure 1a [ ]. In the example, the method doGet(..) is the entry point to the processing of a request. The method doGet(..) calls the doFilter(..) method, which then calls doSearch(..), etc. The order and nesting of method executions can be obtained by performing a depth-first traversal of the dynamic call tree.
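The order and nesting recovered by such a depth-first traversal can be sketched in Java (the implementation language of the meta-model presented later); all class and method names below are invented for illustration and are not part of any APM tool or of OPEN.xtrace:

```java
import java.util.ArrayList;
import java.util.List;

// A minimal dynamic call tree: each node is one method execution,
// its children are the methods it invoked, kept in call order.
class CallNode {
    final String method;
    final List<CallNode> children = new ArrayList<>();

    CallNode(String method) { this.method = method; }

    // Record a callee of this node and return it for further nesting.
    CallNode call(String callee) {
        CallNode child = new CallNode(callee);
        children.add(child);
        return child;
    }

    // Depth-first traversal yields the order and nesting of executions.
    static List<String> traverse(CallNode node, int depth, List<String> out) {
        out.add("  ".repeat(depth) + node.method);
        for (CallNode child : node.children) traverse(child, depth + 1, out);
        return out;
    }
}
```

Traversing from the entry point of Figure 1a lists the executions in the order in which they were entered, with the indentation depth reflecting the nesting.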
To appear in Proceedings of the 13th European Workshop on Performance Engineering (EPEW '16). The final publication will be available at Springer.
(a) classic tree representation (dynamic call tree with nodes such as doGet(..), doFilter(..), doSearch(..), getData(..), executeQuery(..), processData(..), and renderData(..))

(b) profiler view:
doGet(..) - [JVM1@srv1] 0.2 sec
doFilter(..) - 0.2 sec
doSearch(..) - 0.15 sec
getData(..) - [HotSpot1@srv2] 0.13 sec
executeQuery(..) - org.h2.jdbc.PreparedStatement 0.13 sec
processData(..) - [HotSpot1@srv2] 0.03 sec
renderData(..) - [HotSpot1@srv2] 0.01 sec

Fig. 1: Example trace
The collection of execution traces is one of the basic features expected from application performance monitoring (APM) tools. For instance, it is required to fulfill at least the following three dimensions of APM functionality as defined by Kowall and Cappelli [ ]: i.) application topology discovery and visualization, ii.) user-defined transaction profiling, and iii.) application component deep dive. And indeed, the common commercial [ ] and open-source [ ] APM tools do support this feature based on application instrumentation, stack trace sampling, or a mixture of both. It needs to be emphasized that a lot more information than needed for reconstructing dynamic call trees is collected. For instance, the data may include information on timing (e.g., response times, CPU times), variables (HTTP request parameters, SQL queries), or error information (HTTP status code, Java exceptions). Figure 1b shows a simplified profiler-like view on the execution trace from Figure 1a, including location information and method response times, as provided by most APM tools. However, the type of data and the representation format differ greatly among the different APM tools.
In addition to the aforementioned three APM dimensions, execution traces provide the data set for various further software performance engineering (SPE) activities. For instance, researchers have proposed approaches for extracting and visualizing performance models [ ], as well as for detecting and diagnosing performance problems [ ]. Unfortunately, the existing approaches are tailored to the execution trace representations of specific APM tools or custom-made monitoring and tracing implementations.

To summarize, the efficient capturing of detailed execution trace data from software systems during development and operations is widely supported by APM tools. However, due to diverse and proprietary data formats, SPE approaches building on execution trace data are usually tool-specific.
To overcome this limitation, we propose the Open Execution Trace Exchange (OPEN.xtrace) format, serving as an open interchange format for representing execution traces provided by different APM tools. The format is accompanied by extensible tooling support to instantiate and serialize the OPEN.xtrace data, and to import and export OPEN.xtrace data from/to the data formats of APM tools. Under the umbrella of the Standard Performance Evaluation Corporation's Research Group (SPEC RG), OPEN.xtrace is developed as an ongoing community effort among SPE and APM researchers and industry practitioners as a part of the overall goal to increase the interoperability among tools and approaches in this field [ ]. The idea of a common format for execution traces goes in line with related community efforts to increase interoperability and usability [30], e.g., for performance [13, 24, 31] and workload [29] models.
The contribution of this paper is the presentation of the OPEN.xtrace format, its tooling support, and the evaluation that analyzes the format's completeness by comparing the provided data with the data available in leading commercial and open-source APM tools. It needs to be emphasized that OPEN.xtrace is a work in progress and that this paper presents its current state.
The remainder of this paper is organized as follows. Section 2 provides an
overview of related work. Section 3 describes the OPEN.xtrace format and its
tooling support. Section 4 includes the comparison with APM tools. In Section 5,
we draw the conclusions and outline future work. Supplementary material for this
paper, including the OPEN.xtrace software and the detailed data supporting
the evaluation, is available online [28].
2 Related work
Related works can be grouped into i.) interoperability and exchange formats in
software, service, and systems engineering in general, ii.) concrete efforts in this
direction in performance engineering in particular, as well as into iii.) formats
for representing trace data.
The idea of having standardized common data formats is not new and not
limited to the representation of execution traces. Various efforts in software,
service, and systems engineering to provide abstract data models and modeling
languages (meta-models) for concrete problems have been proposed and are used
in research and practice. Selected examples include TOSCA for representing cloud deployments [ ], CIM as an information model for corporate IT landscapes [ ], and the NCSA Common Log Format supported by common web and application servers [ ]. A well-defined data model (modeling language) comprises an abstract syntax, semantics, and one or more (textual, visual, or a combination of both) concrete syntaxes [ ]. The syntax is commonly based on meta-models, grammars, or schemas (object-relational, XML, etc.). Data models have proven to be most successful if they are developed and maintained by consortia of academic and industrial partners, such as DMTF, OASIS, or W3C. For this reason, OPEN.xtrace is being developed as an open community effort driven by SPEC RG from the very beginning.
For workload and performance models, researchers have proposed a couple of intermediate or interchange formats to reduce the number of required transformations between architectural and analytical performance models and tools. KLAPER [ ] and CSM (Core Scenario Model) [ ] focus on a scenario-based abstraction for performance models, and transformations from/to software design models (e.g., UML SPT/MARTE) and analytical models such as (layered) queueing networks are available. Similarly, PMIF (and extended versions of it) [ ] focuses on queueing models. WESSBAS [ ] is a modeling language for session-based workloads, supporting transformations to different load generators and performance prediction tools.
Few works exist on data formats for execution traces. Knüpfer et al. [ ] propose the Open Trace Format (OTF). It is suited for high performance computing, where the most important issues are overhead in both storage space and processing time, and scalability. Although similar in name to our format, OTF is not focused on execution traces, but on collections of arbitrary system events. Similar to OTF, the OpenTracing project provides an API for logging events on different platforms. Unlike OPEN.xtrace, OpenTracing focuses on so-called spans, i.e., logical units of work, not actual method executions. The Common Base Event format (CBE) was created as a part of IBM's Common Event Infrastructure, a unified set of APIs and infrastructure for standardized event management and data exchange between content manager systems [ ]. CBE stores data in XML files. Application Response Measurement (ARM) [ ] is an API to measure end-to-end transaction-level performance metrics, such as response times. Transaction-internal control flow is not captured and a data model is not provided. To summarize, there is no open and common format for representing execution traces. The existing formats either represent high-level events or are tailored to specific tools. Section 4 will provide details on execution trace data and representation formats of selected APM tools.
3 The OPEN.xtrace format

In Section 3.1, we provide an example to introduce additional concepts and terminology. In Section 3.2, the main components of OPEN.xtrace's data model are described in the form of a meta-model [ ]. In Section 3.3, the OPEN.xtrace instance of the example trace is presented. Section 3.4 presents the tooling support.
3.1 Example and terminology
The example execution trace shown in Figure 2, which extends the trace from Figure 1, results from an HTTP request to a distributed Java enterprise application, whose execution spans multiple network nodes.
 1 doGet(..) - foo.bar.EntryServlet ... JVM1@srv1
 2   doFilter(..) - foo.bar.SomeFilter
 3     doSearch(..) - foo.bar.FullSearchAction
 4       getData(..) - foo.bar.LoadAction ... HotSpot1@srv2
 5         log(..) - foo.bar.Logger
 6         loadData(..) - foo.bar.LoadAction
 7           list(..) - foo.bar.ListAction
 8             executeQuery(..) - org.h2.jdbc.PreparedStatement
 9             executeQuery(..) - org.h2.jdbc.PreparedStatement
10             executeQuery(..) - org.h2.jdbc.PreparedStatement
11     processData(..) - foo.bar.ProcessAction ... JVM1@srv1
12       processSingle(..) - foo.bar.ProcessAction
13       processSingle(..) - foo.bar.ProcessAction
14       processSingle(..) - foo.bar.ProcessAction
15     renderData(..) - foo.bar.RenderAction
Fig. 2: Sample trace (textual representation)
The trace starts with the execution of the doGet(..) method in the virtual machine JVM1 on the node srv1 (line 1). After the initial processing
on this node, the execution moves to the node srv2 (line 4). On this node, after
logging an event (line 5), data is fetched from a database by performing several
database calls (lines 8–10). Since the database is not instrumented, there are no
executions recorded on the node hosting it. After these calls, the execution returns
to srv1 (line 11), where the final processing is performed and the execution ends.
The complete list and structure of method executions to process the client
request is denoted as a trace. A part of the trace that is executed on a certain
location is called a subtrace. Locations can be identified with server addresses or
names, virtual machine names, etc.
Each execution within a trace is called a callable. This example shows several
kinds of executions: method-to-method call (e.g., line 2), move of the execution
to a different node (e.g., line 4), logging call (line 5), call to a database (e.g., line
9), and HTTP call (line 1).
A trace contains subtraces and has one subtrace—the one where the execution
starts—that acts as a root. Each subtrace can have child subtraces, and it acts as
a parent to them. Also, subtraces contain callables, with one callable—the entry
point of that subtrace—acting as a root. Callables that can call other callables,
e.g., method-to-method calls and remote calls, are called nesting callables.
Additionally, each of these entities contains performance-relevant information
such as timestamps, response times, and CPU times, which will be detailed in
the next section.
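Assuming invented class names (the actual OPEN.xtrace meta-classes follow in Section 3.2), the containment just described can be sketched as:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: a trace owns subtraces (one of them the root); each subtrace runs
// at one location and owns a tree of callables rooted at its entry point.
class SubTraceSketch {
    final String location;                        // e.g. "JVM1@srv1"
    SubTraceSketch parent;                        // null for the root subtrace
    final List<SubTraceSketch> children = new ArrayList<>();

    SubTraceSketch(String location) { this.location = location; }

    // The execution moves to another node: create a child subtrace there.
    SubTraceSketch spawn(String newLocation) {
        SubTraceSketch child = new SubTraceSketch(newLocation);
        child.parent = this;
        children.add(child);
        return child;
    }
}

class TraceSketch {
    final SubTraceSketch root;
    TraceSketch(SubTraceSketch root) { this.root = root; }

    // Number of subtraces reachable from a subtrace, i.e. the number of
    // locations the (distributed) execution touched below it.
    static int size(SubTraceSketch st) {
        int n = 1;
        for (SubTraceSketch child : st.children) n += size(child);
        return n;
    }
}
```

For the trace of Figure 2, the root subtrace would live at JVM1@srv1 and spawn one child subtrace at HotSpot1@srv2 when the execution moves to the second node.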
3.2 Meta-model
The core meta-classes of the OPEN.xtrace model are presented in Figure 3.
Fig. 3: Trace, SubTrace, Callable, and Location (core meta-classes; attributes shown include traceId, responseTime, exclusiveTime, timestamp, labels, name, key/value pairs, and the Location fields host, runtimeEnvironment, application, businessTransaction, and nodeType)
Trace is the container entity that encapsulates an entire execution trace. A Trace subsumes a logical invocation sequence through the target system, potentially passing multiple system nodes, containers, or applications.

Location specifies an execution context within the trace. It consists of the host identifier, the identifier of the runtime container (e.g., JVM) where the subtrace is executed, the identifier of the application, the business transaction identifier, and the node type. The business transaction specifies the business purpose of the trace. The node type describes the role of the node that the subtrace belongs to, e.g., "Application server" or "Messaging node".

A SubTrace represents an extract of the logical Trace that is executed within one Location. A Callable is a node in a SubTrace that represents any callable behavior (e.g., an operation execution). For each subtrace there is a root Callable, and each Callable has its containing subtrace. AdditionalInformation can be used to add information on a Callable that is tool-specific and not explicitly modeled by OPEN.xtrace. For simple types of additional information, labels can be used.

Trace, SubTrace, and Callable extend TimedElement, which provides response time and exclusive time. Response time represents the time it takes for an instance to execute. Exclusive time is the execution duration of the instance excluding the execution duration of all nested instances, e.g., a subtrace without its subtraces. If an instance has no nested elements, its exclusive time is equal to its response time.
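The relation between response time and exclusive time can be expressed as a small helper (a sketch; the names are not part of the format):

```java
import java.util.List;

// Exclusive time = response time minus the response times of all directly
// nested instances; for an instance without children it equals the
// response time itself.
class TimingSketch {
    static long exclusiveTime(long responseTime, List<Long> nestedResponseTimes) {
        long nested = 0;
        for (long t : nestedResponseTimes) nested += t;
        return responseTime - nested;
    }
}
```

For example, a callable with a response time of 150 ms whose only nested call runs for 130 ms has an exclusive time of 20 ms.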
The detailed inheritance hierarchy of the Callable is shown in Figure 4. LoggingInvocation and ExceptionThrow are used for logging and exception events, respectively. LoggingInvocation contains information on the logging level and the message, while ExceptionThrow contains the message, the cause, and the stack trace of the exception, as well as the class of the exception thrown.
Fig. 4: Callable with its inheritance hierarchy (attributes shown include labels and timestamp; errorMessage, cause, stackTrace, and throwableType for exception events; loggingLevel and message for logging events; exitTime; prepared, unboundSQLStatement, boundSQLStatement, parameterBindings, dbProductName, dbProductVersion, dbUrl, and sqlStatement for database calls; as well as responseTime, exclusiveTime, and id)
TimedCallable is used for modeling the exit time of synchronous events that have one, such as method executions and database calls. It also extends the TimedElement. RemoteInvocation is used if the execution of the trace moves from one location to another. It points to another SubTrace.

Calls to a database are represented with the DatabaseInvocation. As we do not expect the monitoring of internals of the database management systems, calls to databases cannot have child callables.

For callables that are able to call other callables, such as methods invoking other methods, NestingCallable is used. Each Callable can have one parent instance of NestingCallable type. Root callables in subtraces do not have parent callables. On the other hand, a NestingCallable can have multiple children, each of which is an instance of Callable.
The inheritance hierarchy for NestingCallable is shown in Figure 5.
Fig. 5: NestingCallable with its inheritance hierarchy (attributes shown include cpuTime and exclusiveCPUTime; the method signature fields className, methodName, packageName, parameterTypes, parameterValues, returnType, and constructor; and the HTTP fields URI, method, httpParameters, httpAttributes, httpSessionAttributes, and httpHeaders)
MethodInvocation is used for the representation of method executions. It contains information on the method's signature, e.g., method name, containing class and package, return type, and a list of parameter types, as well as their values. The time a method spent executing on a CPU, with or without the time on the CPU of called methods, is represented using the properties cpuTime and exclusiveCPUTime, respectively.

For modeling incoming HTTP calls, HTTPRequestProcessing is used. It contains the information on the URI, HTTP method, parameters, attributes (request and session), and headers. HTTP calls are always the root callables of their containing subtrace.
In practice, different APM tools provide different sets of data. To avoid situations where it is unclear whether some data is missing or simply not supported by a tool, some attributes are marked as optional. For a full list of optional values, please refer to the detailed documentation.
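One way such optional attributes can be surfaced in a Java implementation is via java.util.Optional; this is a sketch of the idea, not the actual OPEN.xtrace API:

```java
import java.util.Optional;

// Sketch: an optional attribute distinguishes "the source tool did not
// provide this value" (empty) from a present measurement, avoiding
// ambiguous nulls or magic values such as -1.
class MethodInfoSketch {
    private final Long cpuTime;   // null when the tool reports no CPU times

    MethodInfoSketch(Long cpuTime) { this.cpuTime = cpuTime; }

    Optional<Long> getCpuTime() { return Optional.ofNullable(cpuTime); }
}
```

A consumer can then handle the two cases explicitly, e.g. skip CPU-time analyses for traces imported from a tool that does not measure them.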
The current version of the OPEN.xtrace meta-model is implemented in Java [ ]. To provide native support for model-driven scenarios, we plan to develop a version using respective technologies such as Ecore [10].
3.3 Model of the sample trace
For the trace shown in Figure 2, the resulting object model would be similar
to the model depicted in Figure 6. The model has been simplified to fit space
constraints. Some methods from the listing as well as timing and additional
information have been omitted.
The trace model can be read as follows. The execution starts with the doGet
method (1) on location srv1. Other methods are successively called, until the
doSearch method is called (2). From there, the execution moves to subtrace
subTr2 on location srv2 (3). After the last method in this subtrace finishes
execution (4), the execution returns to srv1 (2) and continues with the execution
of doSearch until the end (5).
3.4 Tooling support
OPEN.xtrace provides not only the trace meta-model, but also a default
implementation, tool adapters, and serialization support, which are publicly
available [28].
Default implementation. The default implementation of OPEN.xtrace is meant to be used by, e.g., tool developers. Any implementation of the format can be converted into the default implementation and be used as such by the tools.
As stated above, in order to translate proprietary trace representations by APM tools into OPEN.xtrace, adapters are required. Similar to some other well-known approaches (e.g., JDBC), we provide interfaces which are supposed to be implemented by tool vendors or third parties. Currently, we
Fig. 6: Object model of the trace in Figure 2 (trace : Trace with two subtraces: subTr1 : SubTrace at srv1 : Location [host = srv1, runtimeEnvironment = JVM1], containing doGet [URI = /service/url/method], doFilter, doSearch, processData, processSingle, and renderData; and subTr2 : SubTrace at srv2 : Location [host = srv2, runtimeEnvironment = HotSpot1], reached via a remote invocation and containing getData, log, list, and executeQuery callables with database calls [sqlStatement = SELECT ...])
provide publicly available [ ] adapters for the following tools: Dynatrace [ ], inspectIT [ ], Kieker [ ], and CA APM (previously Wily Introscope) [ ]. Some details on the implementation of each tool adapter include the following:
– Data from Dynatrace [ ] has to be first exported into the XML format via Dynatrace's session storage mechanism and retrieved via the REST API. After that, the adapter parses the XML file and creates an OPEN.xtrace representation of the traces included in the session storage.
– inspectIT [ ] stores traces in the form of invocation sequences, in the internal storage called CMR. The adapter connects to the CMR, reads the traces directly from it, and creates the OPEN.xtrace representation.
– The Kieker [ ] adapter is implemented as a Kieker plugin. Integrated into the Kieker.Analysis component, it reads traces from a Monitoring Log/Stream and exports them as OPEN.xtrace traces. Additionally, the adapter supports the reverse direction, i.e., the transformation of OPEN.xtrace instances into the Kieker format.
– CA APM [ ] supports exporting of the trace data to XML. As with Dynatrace, this XML is then parsed and an OPEN.xtrace model is created.
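The JDBC-like adapter contract mentioned above can be sketched as a small interface; the names are invented for illustration (the real interfaces ship with the OPEN.xtrace tooling [28]):

```java
// Sketch of a vendor-neutral adapter contract: each APM tool supplies an
// implementation that converts its proprietary trace type P into the
// open representation.
interface TraceAdapter<P> {
    OpenTraceSketch toOpenTrace(P proprietaryTrace);
}

// Stand-in for the open format's Trace entity.
class OpenTraceSketch {
    final String rootMethod;
    OpenTraceSketch(String rootMethod) { this.rootMethod = rootMethod; }
}

// Hypothetical adapter for a tool that serializes traces as plain strings,
// e.g. "doGet(..) doFilter(..) doSearch(..)".
class StringTraceAdapter implements TraceAdapter<String> {
    @Override
    public OpenTraceSketch toOpenTrace(String trace) {
        return new OpenTraceSketch(trace.split(" ")[0]); // first token = entry point
    }
}
```

As with JDBC drivers, consumers program against the interface only, so a new tool is supported by dropping in a new adapter implementation.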
Serialization. OPEN.xtrace provides serialization and deserialization helpers that are based on the Kryo library. The current implementation provides serialization of binary data, but we also plan to implement a textual serialization. So far, we have not explicitly considered the storage layout and its efficiency.
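The round-trip idea behind these helpers can be sketched with standard java.io serialization as a dependency-free stand-in (the actual helpers use Kryo's binary format and API):

```java
import java.io.*;

// Serialize a trace element to bytes and read it back. OPEN.xtrace ships
// Kryo-based helpers for this; plain java.io serialization is used here
// only to keep the sketch self-contained.
class SerializationSketch {
    static byte[] toBytes(Serializable obj) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The same round-trip shape applies with Kryo, which trades the JDK's generic object streams for a more compact binary encoding.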
4 Evaluation
The goal of the evaluation is to assess whether OPEN.xtrace is expressive enough, i.e., whether it is able to represent the execution trace data provided by common APM tools.

The research questions that we want to answer are as follows:

RQ1: Can the APM tools provide the data required for OPEN.xtrace?
RQ2: Which data available in APM tools is not available in OPEN.xtrace?

By investigating RQ1, we want to see what the level of support for OPEN.xtrace in available APM tools is. The answer to RQ2 will give us information on the current modeling coverage and on how to further develop the format.
We analyzed the data that is provided by the APM tools and compared it with the data that is available in OPEN.xtrace. Since there are many tools, most of them proprietary, a complete survey of all APM tools is infeasible. Instead, we focus on the most popular tools with the largest market share, according to the Gartner report [ ]. In our future work, we plan to add information on additional APM tools as a part of the community effort.
The tools and sources of information that we analyzed are as follows.
– Dynatrace APM [ ]: The trial version, the publicly available documentation, as well as data exported from our industry partners were used.
– New Relic APM [ ]: The online demo with the sample application was used. However, in the trial version, we were not able to export the data, so the data was gathered from the UI and the available documentation.
– AppDynamics APM [ ]: AppDynamics was tested using the trial version.
– CA APM Solution [ ]: The licensed version of the tool was used to export the traces.
– Riverbed APM [ ]: The test version with the demo application was used. In this version, we were not able to export the data, so we used the available UI and the documentation.
– IBM APM [ ]: We used the online demo with the sample application and the provided UI.
Additionally, we analyzed the data from two open-source tools: Kieker [ ] and inspectIT [25].
For those tools that did not provide a demo application, the DVD Store [ ] was instrumented and used as a sample application. It has to be noted that in this survey we used only the basic distributions of the APM tools. Some of the tools, such as Kieker, have extension mechanisms that allow measuring additional data. For the cases where trial versions were used, to the best of our knowledge, this does not have an influence on the evaluation.
This section presents a condensed overview of the extensive raw data set developed in our study, which is available as a part of the supplementary material [ ]. To give an idea of the amount of the investigated APM tool features: the raw table of results includes around 340 features analyzed for each of the eight tools.
Coverage of OPEN.xtrace. After we collected the data from the tools, we compared the features of OPEN.xtrace to the data provided by the tools. The comparison is shown in Table 1. The features presented in the rows are extracted from the trace model (Section 3.2).
From the table we can see that, while no tool provides all of the data, method description and timing information are provided by all analyzed tools. The level of detail depends on the tool. IBM is one exception, since their tool provides only aggregated information about method executions over a time period. Examples of this kind of data are average, minimal, and maximal response and CPU times, the number of exceptions, the number of SQL calls, etc. In the other tools, this aggregated data is also available, but it is of no interest for OPEN.xtrace, since the format is intended to represent single traces.
Data not covered by OPEN.xtrace. The data collected in the survey showed that there is some data that is not covered by OPEN.xtrace, but is provided by some of the tools. Although this data can be modeled using additional information (see Section 3.2), we plan to include it explicitly in our future work.
Synchronization Time, Waiting Time, and Suspended Time: All three mentioned metrics are available in Dynatrace. While OPEN.xtrace provides means to show that a method was waiting, there are situations where it is important to know why the method was on hold. Synchronization time represents periods of waiting for access to a synchronization block or a method. Waiting time is the time spent waiting for an external component. Suspended time is the time the whole system was suspended due to some external event during which it could not execute any code.
Nested Exceptions: The nested exception can point to the real cause of the problem and therefore provide valuable information for the analysis. This metric is available in Dynatrace.
Features compared (grouped by category):
Method description: method name, package name, class name, parameter types, parameter values, return type, is constructor.
Timing information: response time, exclusive time, timestamp, CPU time, exclusive CPU time, exit time.
Location data: host, runtime environment, application, business transaction, node type.
Database call: SQL statement, is prepared, bound SQL statement, DB name, DB version, URL.
HTTP call: HTTP method, parameters, attributes, session attributes, headers.
Logging: logging level, message.
Error information: error message, cause, stack trace, throwable type.

Table 1: Comparison of data available in OPEN.xtrace to APM tools (one column per analyzed tool, including Dynatrace, New Relic, AppDynamics, CA Technologies, Riverbed, IBM, Kieker, and inspectIT; the per-cell support markers are omitted here)
Garbage Collector: There is a set of performance issues related to garbage collection, so this information can help to identify them. This metric is available in New Relic, AppDynamics, and IBM APM.
Thread Name: There are situations where a certain thread or thread group causes a problem. Adding this information to the location description would make the diagnosis of these problems easier. The thread name metric is available in Dynatrace, New Relic, and CA. The thread group name is available in CA.
HTTP Response Code and Response Headers: Knowing the state of the HTTP response can be important for detecting problems in traces that include HTTP calls. The response code is available in Dynatrace, New Relic, Riverbed, and IBM, while New Relic additionally provides response header information.
5 Conclusion
Execution trace data is an important basis for different SPE approaches. While a number of commercial and open-source APM tools provide support for the capturing of execution traces within distributed software systems, each of the tools uses its own (proprietary) format.
In this paper we proposed OPEN.xtrace and its tooling support, which provides a basis for execution trace data interoperability and allows for developing tool-agnostic approaches. Additionally, we compared OPEN.xtrace with the information that is available in leading APM tools, and evaluated its modeling capabilities. Our evaluation showed the level of support for the format in the most common APM tools, and provided us with guidelines on how to further extend the format.
Since this is a community effort, we plan to engage the public, including APM tool vendors, to influence the further development of OPEN.xtrace, all under the umbrella of the SPEC RG [ ]. Future work includes extensions of the modeling capabilities, e.g., to support asynchronous calls, and support for additional APM tools via respective adapters. In the long term, we want to extend the effort by also including non-trace data, e.g., system-level monitoring data in the form of time series.
Acknowledgments. This work is being supported by the German Federal Ministry of Education and Research (grant no. 01IS15004, diagnoseIT), by the German Research Foundation (DFG) in the Priority Programme "DFG-SPP 1593: Design For Future—Managed Software Evolution" (HO 5721/1-1, DECLARE), and by the Research Group of the Standard Performance Evaluation Corporation (SPEC RG). Special thanks go to Alexander Bran, Alper Hidiroglu, and Manuel Palenga, Bachelor's students at the University of Stuttgart, for their support in the evaluation of the APM tools.
References

[1] AppDynamics—Application Performance Monitoring and Management.
[2] CA—Application Performance Management.
[3] Dynatrace—Application Monitoring.
[4] IBM—Application Performance Management.
[5] Logging control in W3C httpd.
[6] New Relic APM.
[7] Riverbed—Application Performance Monitoring.
[8] Ammons, G., Ball, T., Larus, J.R.: Exploiting hardware performance counters with flow and context sensitive profiling. In: Proc. ACM SIGPLAN '97 Conf. on Programming Language Design and Implementation (PLDI '97). pp. 85–96 (1997)
[9] Binz, T., Breitenbücher, U., Kopp, O., Leymann, F.: TOSCA: Portable automated deployment and management of cloud applications. In: Advanced Web Services, pp. 527–549 (2014)
[10] Brambilla, M., Cabot, J., Wimmer, M.: Model-Driven Software Engineering in Practice. Morgan & Claypool Publishers, 1st edn. (2012)
[11] Brosig, F., Huber, N., Kounev, S.: Automated extraction of architecture-level performance models of distributed component-based systems. In: Proc. 26th IEEE/ACM Int. Conf. on Automated Software Engineering (ASE 2011). pp. 183–192 (2011)
[12] Canfora, G., Penta, M.D., Cerulo, L.: Achievements and challenges in software reverse engineering. Commun. ACM 54(4), 142–151 (2011)
[13] Ciancone, A., Drago, M.L., Filieri, A., Grassi, V., Koziolek, H., Mirandola, R.: The KlaperSuite framework for model-driven reliability analysis of component-based systems. Software and System Modeling 13(4), 1269–1290 (2014)
[14] Distributed Management Task Force: Common Information Model (CIM) standard. (Feb 2014)
[15] Elarde, J.V., Brewster, G.B.: Performance analysis of application response measurement (ARM) version 2.0 measurement agent software implementations. In: Proc. 2000 IEEE Int. Performance, Computing, and Communications Conf. (IPCCC '00). pp. 190–198 (2000)
[16] Fittkau, F., Finke, S., Hasselbring, W., Waller, J.: Comparing trace visualizations for program comprehension through controlled experiments. In: Proc. 2015 IEEE 23rd Int. Conf. on Program Comprehension (ICPC '15). pp. 266–276 (2015)
[17] Heger, C., van Hoorn, A., Okanović, D., Siegl, S., Wert, A.: Expert-guided automatic diagnosis of performance problems in enterprise applications. In: Proc. 12th Europ. Dependable Computing Conf. (EDCC '16). IEEE (2016), to appear
[18] van Hoorn, A., Waller, J., Hasselbring, W.: Kieker: A framework for application performance monitoring and dynamic software analysis. In: Proc. 3rd ACM/SPEC Int. Conf. on Performance Eng. (ICPE '12). pp. 247–248 (2012)
[19] Israr, T.A., Woodside, C.M., Franks, G.: Interaction tree algorithms to extract effective architecture and layered performance models from traces. Journal of Systems and Software 80(4), 474–492 (2007)
[20] Jacob, B., Lanyon-Hogg, R., Nadgir, D., Yassin, A.: A Practical Guide to the IBM Autonomic Computing Toolkit. IBM (2004)
[21] Kiciman, E., Fox, A.: Detecting application-level failures in component-based internet services. IEEE Trans. on Neural Networks 16(5), 1027–1041 (2005)
[22] Knüpfer, A., Brendel, R., Brunst, H., Mix, H., Nagel, W.E.: Introducing the Open Trace Format (OTF). In: Proc. 6th Int. Conf. on Computational Science (ICCS '06). pp. 526–533. Springer-Verlag (2006)
[23] Kowall, J., Cappelli, W.: Magic quadrant for application performance monitoring (2014)
[24] Lladó, C.M., Smith, C.U.: PMIF+: Extensions to broaden the scope of supported models. In: Proc. 10th Europ. Performance Evaluation Workshop (EPEW '13). pp. 134–148 (2013)
[25] NovaTec Consulting GmbH: inspectIT.
[26] Parsons, T., Murphy, J.: Detecting performance antipatterns in component based enterprise systems. Journal of Object Technology 7(3), 55–91 (2008)
[27] Rohr, M., van Hoorn, A., Giesecke, S., Matevska, J., Hasselbring, W., Alekseev, S.: Trace-context sensitive performance profiling for enterprise software applications. In: Proc. SPEC Int. Performance Evaluation Workshop (SIPEW '08). pp. 283–302 (2008)
[28] SPEC Research Group: OPEN APM interoperability initiative. (2016)
[29] Vögele, C., van Hoorn, A., Schulz, E., Hasselbring, W., Krcmar, H.: WESSBAS: Extraction of probabilistic workload specifications for load testing and performance prediction—A model-driven approach for session-based application systems. Journal on Software and System Modeling (2016), under review
[30] Walter, J., van Hoorn, A., Koziolek, H., Okanović, D., Kounev, S.: Asking "what"?, automating the "how"?: The vision of declarative performance engineering. In: Proc. 7th ACM/SPEC on Int. Conf. on Performance Engineering. pp. 91–94. ICPE '16, ACM (2016)
[31] Woodside, C.M., Petriu, D.C., Petriu, D.B., Shen, H., Israr, T., Merseguer, J.: Performance by unified model analysis (PUMA). In: Proc. 5th Int. Workshop on Software and Performance (WOSP '05). pp. 1–12 (2005)