Fig 1 - Example trace including an N+1 problem (in red rectangle)

Source publication
Conference Paper
Full-text available
Application performance management (APM) is a necessity for detecting and solving performance problems during the operation of enterprise applications. While existing tools provide alerting and visualization capabilities when performance requirements are violated during operation, the isolation and diagnosis of the problem's real root cause is the responsibility...

Contexts in source publication

Context 1
... APM tools provide a detailed white-box view into EA system stacks, ranging from system-level monitoring (e.g., CPU, memory, and network utilization) up to detailed execution traces that also span multiple nodes, including the client devices in distributed EAs. Figure 1 depicts an excerpt of a real trace from an example system. Analysis of the given trace reveals an N+1 problem in the application: one database query with a larger result set is followed by a sequence of short database queries for the items obtained in the result set of the initial query [18]. ...
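To make the antipattern concrete, here is a minimal JDBC-style sketch of the access pattern such a trace reveals, together with the usual fix. The schema (orders, order_items) and the queries are hypothetical illustrations, not taken from the source publication.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Hypothetical schema: orders(id, ...) and order_items(order_id, ...).
public class NPlusOneExample {

    // The antipattern: one query for the parent rows, then one query per row.
    static void loadWithNPlusOne(Connection con) throws Exception {
        try (PreparedStatement orders = con.prepareStatement("SELECT id FROM orders");
             ResultSet rs = orders.executeQuery()) {
            while (rs.next()) {
                long orderId = rs.getLong("id");
                // Executed N times: one short query per item of the initial result set.
                try (PreparedStatement items = con.prepareStatement(
                        "SELECT * FROM order_items WHERE order_id = ?")) {
                    items.setLong(1, orderId);
                    try (ResultSet itemRs = items.executeQuery()) {
                        while (itemRs.next()) { /* consume item */ }
                    }
                }
            }
        }
    }

    // The usual fix: fetch everything in one joined query (1 statement instead of N+1).
    static void loadWithJoin(Connection con) throws Exception {
        try (PreparedStatement joined = con.prepareStatement(
                "SELECT o.id, i.* FROM orders o JOIN order_items i ON i.order_id = o.id");
             ResultSet rs = joined.executeQuery()) {
            while (rs.next()) { /* consume joined row */ }
        }
    }
}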
Context 2
... is the task of the APM expert to diagnose such a problem, i.e., to identify its respective root cause(s). It can be assumed that a trace as depicted in Figure 1 is or can be collected for each request to a system-provided service. The common procedure is to manually inspect execution traces in a representation as depicted in Figure 1. ...
Context 3
... can be assumed that a trace as depicted in Figure 1 is or can be collected for each request to a system-provided service. The common procedure is to manually inspect execution traces in a representation as depicted in Figure 1. It needs to be emphasized that this may include hundreds or thousands of traces. ...
Context 4
... can be easily seen that this task is extremely time-consuming and error-prone. Moreover, APM experts tend to see the same problems (antipatterns), such as the one in Figure 1, over and over again, because they represent the common pitfalls that happen often in practice. ...
Context 5
... technology-specific rules provide more details, depending on the underlying system's specifics. Figure 3 illustrates the rule-based diagnosis process based on the example from Section II (Figure 1). The execution of each rule contributes additional insights into the diagnosis result. For example, a very basic rule identifies the slowest method (i.e., the one with the highest response or exclusive time) in a trace. ...
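As a rough illustration, the following sketch implements such a "slowest method" rule over a generic trace representation. The Call record and its fields are assumptions for illustration only, not diagnoseIT's actual API.

import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Illustrative trace element: method name plus response and exclusive time (ms).
// Exclusive time is the time spent in the method itself, excluding its callees.
record Call(String method, double responseTimeMs, double exclusiveTimeMs) {}

public class SlowestMethodRule {

    // The basic generic rule: flag the call with the highest exclusive time.
    static Optional<Call> apply(List<Call> trace) {
        return trace.stream().max(Comparator.comparingDouble(Call::exclusiveTimeMs));
    }

    public static void main(String[] args) {
        List<Call> trace = List.of(
                new Call("OrderService.list", 540.0, 15.0),
                new Call("OrderDao.query", 510.0, 480.0),
                new Call("Json.render", 10.0, 10.0));
        apply(trace).ifPresent(c ->
                System.out.println("Diagnosis hint: slowest method is " + c.method()));
    }
}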

Citations

... In the literature, several approaches have been proposed for modeling, analyzing, and optimizing the performance of software applications [9], [10]. Two main directions have been pursued: (i) model-based performance analysis, i.e., performance models are built out of Java applications [11], [12] and used for predictions; (ii) application performance monitoring, i.e., tools that collect trace data for inspection [13], [14]. Motivated by the recent trend of integrating development (Dev) and operations (Ops) teams, processes, and tools [15], [16], [17], it is necessary that software engineers are aware of the performance evolution of their applications. ...
Article
Full-text available
The detection of performance issues in Java-based applications is not trivial, since many factors concur to poor performance and software engineers are not sufficiently supported for this task. The goal of this manuscript is the automated detection of performance problems in running systems, to guarantee that no quality-related hindrances prevent their successful usage. Starting from software performance antipatterns, i.e., bad practices (e.g., extensive interaction between software methods) that express both the problem and its solution so that shortcomings can be identified and promptly fixed, we develop a framework that automatically detects seven software antipatterns capturing a variety of performance issues in Java-based applications. Our approach is applied to real-world case studies from different domains, and it captures four real-life performance issues of Hadoop and Cassandra that were not predicted by state-of-the-art approaches. As empirical evidence, we calculate the accuracy of the proposed detection rules, we show that code commits inducing and fixing real-life performance issues present interesting variations in the number of detected antipattern instances, and solving one of the detected antipatterns improves the system performance by up to 50%.
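As a rough illustration of what such a detection rule can look like, the sketch below flags caller/callee pairs with extensive interaction once their call count exceeds a threshold. The trace representation and the threshold are assumptions; the framework's actual rules are more elaborate.

import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative rule in the spirit of the framework described above: flag caller/callee
// pairs with extensive interaction, i.e., a call count above a threshold.
public class ExtensiveInteractionRule {

    record Invocation(String caller, String callee) {}

    static Map<String, Long> detect(List<Invocation> invocations, long threshold) {
        Map<String, Long> counts = new HashMap<>();
        for (Invocation inv : invocations) {
            counts.merge(inv.caller() + " -> " + inv.callee(), 1L, Long::sum);
        }
        counts.values().removeIf(c -> c < threshold); // keep only suspicious pairs
        return counts;
    }

    public static void main(String[] args) {
        var calls = List.of(
                new Invocation("Controller.show", "Dao.load"),
                new Invocation("Controller.show", "Dao.load"),
                new Invocation("Controller.show", "Renderer.render"));
        System.out.println(detect(calls, 2)); // {Controller.show -> Dao.load=2}
    }
}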
... A number of Kieker-related research projects have been and are being conducted over the years. Examples are Declare (Declarative Performance Engineering) [39], diagnoseIT (Expert-Guided Automatic Diagnosis of Performance Problems) [40], ContinuITy (Automated Load Testing in Continuous Software Engineering) [41], Orcas (Efficient Resilience Benchmarking of Microservice Architectures) [42], MooBench (Monitoring Overhead Benchmark) [33,43,44], iObserve (Integrated Observation and Modeling Techniques to Support Adaptation and Evolution of Software Systems) [45,46] and PubFlow (Workflows for Research Data Publication) [47]. ...
... During the past years, Kieker was employed in several industrial collaborations and technology transfer projects. This includes the funded technology-transfer projects DynaMod (Dynamic Analysis for Model-Driven Software Modernization) [57], MENGES (Model-Driven Engineering of Rail Control Centers) [58], and diagnoseIT (Expert-guided automatic diagnosis of performance problems in enterprise applications) [40]. Examples of such industrial collaborations with impact on Kieker's development include the following: ...
Article
Full-text available
Application-level monitoring and dynamic analysis of software systems are a basis for various tasks in software engineering research, such as performance evaluation and reverse engineering. The Kieker framework provides monitoring, analysis, and visualization support for these purposes. It commenced in 2006, and grew toward a high-quality open-source software that has been employed in a variety of software engineering research projects over the last decade. Several research groups constitute the open-source community to advance the Kieker framework. In this paper, we review Kieker’s history, development, and impact both in research and technology transfer with industry.
... All of these steps require a significant level of expertise and can usually be executed successfully only with significant effort. However, performance experts are in short supply, and even when they are available, the task of performance analysis can be time-consuming and error-prone [13]. Existing approaches that aim to simplify this process [34,35] are limited to either the choice of the underlying approach or the interpretation of results. ...
... While these solutions help experts identify application performance bottlenecks in their enterprise applications, they are mostly limited to alerting and visualization functions rather than providing advanced anomaly detection capabilities. Furthermore, root cause analysis can often only be performed manually, by experts with significant domain knowledge about the applications being assessed [7]. ...
Conference Paper
Full-text available
Existing application performance management (APM) solutions lack robust anomaly detection capabilities and root cause analysis techniques that do not require manual effort and domain knowledge. In this paper, we develop a density-based unsupervised machine learning model to detect anomalies within an enterprise application, based upon data from multiple APM systems. The research was conducted in collaboration with a European automotive company, using two months of live application data. We show that our model detects abnormal system behavior more reliably than a commonly used outlier detection technique and provides information for detecting root causes.
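The core idea of density-based detection can be sketched as follows: a sample whose epsilon-neighborhood contains fewer than minPts other samples is flagged as anomalous, as in DBSCAN-style noise detection. This is only a minimal illustration, not the paper's actual model; the metric choice, eps, and minPts values are assumptions.

import java.util.Arrays;

// Minimal density-based outlier sketch: a point with fewer than minPts neighbors
// within distance eps is flagged as anomalous (the idea behind DBSCAN noise points).
public class DensityOutlier {

    static boolean[] flagAnomalies(double[][] samples, double eps, int minPts) {
        boolean[] anomalous = new boolean[samples.length];
        for (int i = 0; i < samples.length; i++) {
            int neighbors = 0;
            for (int j = 0; j < samples.length; j++) {
                if (i != j && distance(samples[i], samples[j]) <= eps) neighbors++;
            }
            anomalous[i] = neighbors < minPts;
        }
        return anomalous;
    }

    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int k = 0; k < a.length; k++) sum += (a[k] - b[k]) * (a[k] - b[k]);
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        // Each row: normalized APM metrics, e.g., [responseTime, cpuLoad].
        double[][] metrics = {{0.1, 0.2}, {0.12, 0.22}, {0.11, 0.19}, {0.9, 0.95}};
        System.out.println(Arrays.toString(flagAnomalies(metrics, 0.1, 1)));
        // Prints [false, false, false, true]: the last sample is isolated.
    }
}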
... Similarly to our approach, the authors rely on workloads derived from the operational use of the software system. The diagnoseIT approach described by Heger et al.[69] relies on trace data from application performance management (APM) tools to discover recurring performance issues including a mapping to performance antipatterns.For this purpose, knowledge from APM experts is formalized as rules which are applied to the trace data to locate instances of known performance issues. As opposed to the approach described in this paper, diagnoseIT focuses on data obtained from APM tools during production and does not consider load testing or profiling. ...
Article
Full-text available
Context: The performance assessment of complex software systems is not a trivial task, since it depends on the design, the code, and the execution environment. All these factors may affect system quality and generate negative consequences, such as delays and system failures. The identification of bad practices leading to performance flaws is of key relevance to avoid expensive rework in redesign, reimplementation, and redeployment.
Objective: The goal of this manuscript is to provide a systematic process, based on load testing and profiling data, to identify performance issues from runtime data. These performance issues represent an important source of knowledge, as they are used to trigger the software refactoring process. Software characteristics and performance measurements are matched with well-known performance antipatterns to document common performance issues and their solutions.
Method: We execute load testing based on the characteristics of the collected operational profile, so as to produce representative workloads. Performance data from the system under test is collected using a profiler tool to create profiler snapshots and obtain performance hotspot reports. From such data, performance issues are identified and matched with the specification of antipatterns. Software refactorings are then applied to solve these performance antipatterns.
Results: The approach has been applied to a real-world industrial case study and to a representative laboratory study. Experimental results demonstrate the effectiveness of our tool-supported approach, which is able to automatically detect two performance antipatterns by exploiting the knowledge of domain experts. In addition, the software refactoring process achieves a significant performance gain at the operational stage in both case studies.
Conclusion: Performance antipatterns can be used to effectively support the identification of performance issues from load testing and profiling data. The detection process triggers antipattern-based software refactoring that, in our two case studies, results in a substantial performance improvement.
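As one concrete instance of matching measurements against an antipattern specification, the sketch below checks a series of profiler snapshots for "The Ramp" (a response time that grows steadily over time) via a least-squares slope. The threshold is an assumption, not a value calibrated in the paper.

// Illustrative check for "The Ramp" antipattern against profiler snapshots.
public class RampDetector {

    // snapshots: average response time (ms) of one operation, in time order.
    static boolean looksLikeRamp(double[] snapshots, double minSlopePerSnapshot) {
        if (snapshots.length < 2) return false;
        int n = snapshots.length;
        // Least-squares slope of response time over snapshot index.
        double meanX = (n - 1) / 2.0, meanY = 0;
        for (double y : snapshots) meanY += y / n;
        double num = 0, den = 0;
        for (int i = 0; i < n; i++) {
            num += (i - meanX) * (snapshots[i] - meanY);
            den += (i - meanX) * (i - meanX);
        }
        return num / den >= minSlopePerSnapshot;
    }

    public static void main(String[] args) {
        double[] series = {110, 130, 155, 170, 210}; // steadily increasing
        System.out.println(looksLikeRamp(series, 10.0)); // true: slope is 24 ms/snapshot
    }
}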
... There are also approaches to automatically detect root causes of typical performance problems, also known as performance antipatterns [15]. These approaches usually use trace data [6], but may also add other information, e.g., configuration data [12]. • System refactoring and adaptation. ...
... We see promising future research directions in automation of supporting activities and analysis of data. In our current research, we try to tackle selected challenges by including expert knowledge and analyzing performance concerns by a declarative approach [6,18]. Technology transfer of new APM approaches developed in research would benefit from the availability of real-world APM data for the evaluation of the approaches. ...
Conference Paper
Full-text available
The performance of application systems has a direct impact on business metrics. For example, companies lose customers and revenue in case of poor performance such as high response times. Application performance management (APM) aims to provide the required processes and tools to have a continuous and up-to-date picture of relevant performance measures during operations, as well as to support the detection and resolution of performance-related incidents. In this tutorial paper, we provide an overview of the state of the art in APM in industrial practice and academic research, highlight current challenges, and outline future research directions.
... Any performance degradation in these systems results in significant losses. There are many approaches for dealing with performance problems by performing diagnosis and root cause analysis, both before [10,15] and after [5] the software enters production. ...
... In large enterprise systems, application performance monitoring (APM) tools can report a large number of performance incidents, such as slow response times or increased resource consumption. However, the capabilities of these tools are limited to alerting and reporting performance problems [3], while only a few approaches deal with diagnosing them [5]. In both cases, all performance problems are reported separately, again leaving the performance expert to deal with one problem at a time. ...
... In our previous work [5], we presented diagnoseIT, an approach for the automated diagnosis of performance problems. The motivation for our work was the fact that APM practice requires enormous manual effort and expertise. ...
Conference Paper
As the importance of application performance grows in modern enterprise systems, many organizations employ application performance management (APM) tools to help them deal with potential performance problems during production. In addition to monitoring capabilities, these tools provide problem detection and alerting. In large enterprise systems these tools can report a very large number of performance problems. They have to be dealt with individually, in a time-consuming and error-prone manual process, even though many of them have a common root cause. In this vision paper, we propose using automatic categorization for dealing with large numbers of performance problems reported by APM tools. This leads to the aggregation of reported problems, reducing the work required for resolving them. Additionally, our approach opens the possibility of extending the analysis approaches to use this information for a more efficient diagnosis of performance problems.
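A minimal sketch of the categorization idea follows: group reported incidents by a coarse signature so that reports sharing a likely root cause are handled once. The signature used here (component plus problem type) is an assumed stand-in for the paper's categorization features.

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Group reported incidents by a coarse root-cause signature so that many
// individual reports collapse into a handful of actionable categories.
public class IncidentGrouping {

    record Incident(String component, String problemType, String details) {}

    static Map<String, List<Incident>> categorize(List<Incident> incidents) {
        return incidents.stream().collect(
                Collectors.groupingBy(i -> i.component() + "/" + i.problemType()));
    }

    public static void main(String[] args) {
        var reports = List.of(
                new Incident("OrderDao", "N+1", "checkout request #1"),
                new Incident("OrderDao", "N+1", "checkout request #2"),
                new Incident("AuthService", "HighCpu", "login burst"));
        categorize(reports).forEach((signature, group) ->
                System.out.println(signature + ": " + group.size() + " report(s)"));
    }
}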
... In addition to the aforementioned three APM dimensions, execution traces provide the data set for various further software performance engineering (SPE) activities. For instance, researchers have proposed approaches for extracting and visualizing performance models [11,16,19], as well as detecting and diagnosing performance problems [17,21,26,27]. Unfortunately, the existing approaches are tailored to the execution trace representations of specific APM tools or custom-made monitoring and tracing implementations. ...
Conference Paper
Full-text available
Execution traces capture information on a software system's runtime behavior, including data on system-internal software control flows, performance, as well as request parameters and values. In research and industrial practice, execution traces serve as an important basis for model-based and measurement-based performance evaluation, e.g., for application performance monitoring (APM), the extraction of descriptive and prescriptive models, as well as problem detection and diagnosis. A number of commercial and open-source APM tools that allow the capturing of execution traces within distributed software systems are available. However, each of the tools uses its own (proprietary) format, which means that each approach building on execution trace data is tool-specific. In this paper, we propose the OPEN.xtrace format to enable data interoperability and exchange between APM tools and software performance engineering (SPE) approaches. In particular, this enables SPE researchers to develop their approaches in a tool-agnostic and comparable manner. OPEN.xtrace is a community effort as part of the overall goal to increase the interoperability of SPE/APM techniques and tools. In addition to describing the OPEN.xtrace format and its tooling support, we evaluate OPEN.xtrace by comparing its modeling capabilities with the information that is available in leading APM tools.
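An illustrative tool-agnostic trace model in the spirit of what OPEN.xtrace enables is sketched below. The class and field names are assumptions for illustration and do not reproduce OPEN.xtrace's actual interfaces.

import java.util.ArrayList;
import java.util.List;

// Illustrative common trace model: a tree of timed calls that per-tool adapters
// populate, so analysis approaches can be written once, independently of the APM tool.
public class CommonTraceModel {

    static class Callable {
        final String signature;          // e.g., "OrderService.list()"
        final long entryTimestampNanos;
        final long responseTimeNanos;
        final List<Callable> children = new ArrayList<>();

        Callable(String signature, long entryTimestampNanos, long responseTimeNanos) {
            this.signature = signature;
            this.entryTimestampNanos = entryTimestampNanos;
            this.responseTimeNanos = responseTimeNanos;
        }
    }

    // One adapter per APM tool maps its proprietary trace into the common model.
    interface TraceAdapter<T> {
        Callable toCommonModel(T proprietaryTrace);
    }

    public static void main(String[] args) {
        Callable root = new Callable("OrderService.list()", 0L, 540_000_000L);
        root.children.add(new Callable("OrderDao.query()", 10_000_000L, 480_000_000L));
        System.out.println(root.signature + " has " + root.children.size() + " child call(s)");
    }
}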
... While many performance problem detection [22], diagnosis [17,33,47], prediction [21,34], and prevention [19,15] techniques have been developed, it is very hard to evaluate them. It is common to use either specific benchmarks [33,47], generated applications [20], or case studies that involve some real applications [12,13,22], where the results are compared to what experienced developers/testers would have expected. ...
Conference Paper
Full-text available
A challenging problem with today's increasingly large and distributed software systems is their performance behavior. To help developers avoid or detect mistakes that lead to performance problems, many researchers in software performance engineering have come up with classifications of such problems, called antipatterns. To test approaches for antipattern detection, data from running systems is required. However, the usefulness of this data is doubtful, as it may or may not include manifestations of performance problems. In this paper, we classify existing performance antipatterns with respect to their suitability for being injected and, based on this, introduce an extensible tool that allows injecting instances of these antipatterns into existing applications. The approach can be useful for researchers to test and validate their automated runtime problem evaluation and prevention techniques. Using two exemplary performance antipatterns, we demonstrate that the injection is easily possible and produces feasible, though currently rather clinical, results.
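A minimal sketch of the injection idea: wrap an existing operation so that it exhibits a known antipattern on demand, here a synthetic, growing delay in the spirit of "The Ramp". The interface and configuration shown are assumptions, not the tool's actual API.

// Wraps an operation so each invocation gets slower than the last, producing a
// detectable ramp for validating runtime problem detection techniques.
public class AntipatternInjector {

    interface Operation { void run() throws InterruptedException; }

    private long invocations = 0;

    // Inject a linearly growing artificial delay before the wrapped operation.
    Operation injectRamp(Operation target, long stepMillis) {
        return () -> {
            Thread.sleep(stepMillis * (++invocations));
            target.run();
        };
    }

    public static void main(String[] args) throws InterruptedException {
        var injector = new AntipatternInjector();
        Operation noisy = injector.injectRamp(
                () -> System.out.println("business logic"), 50);
        for (int i = 0; i < 3; i++) noisy.run(); // adds 50 ms, 100 ms, 150 ms of delay
    }
}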