Conference Paper

Deterministic Models of Software Aging and Optimal Rejuvenation Schedules

Zuse Inst. Berlin (ZIB), Berlin
DOI: 10.1109/INM.2007.374780 Conference: Integrated Network Management, 2007. IM '07. 10th IFIP/IEEE International Symposium on
Source: DBLP


Automated modeling of software aging processes is a prerequisite for cost-effective use of adaptive software rejuvenation as a self-healing technique. We consider the problem of such automated modeling in server-type applications whose performance degrades depending on the "work" done since the last rejuvenation, for example the number of served requests. This type of performance degradation, caused mostly by resource depletion, is common, as we illustrate in a study of the popular Axis Soap server 1.3. In particular, we propose deterministic models for approximating the leading indicators of aging and an automated procedure for statistically testing their correctness. We further demonstrate how to use these models to find rejuvenation schedules that are optimal with respect to a utility function. Our focus is on the important case in which the utility function is the average of a performance metric (such as maximum service rate). We also consider optional SLA constraints under which the performance should never drop below a specified level. Our approach is verified by a study of the aging processes in the Axis Soap 1.3 server. The experiments show that the deterministic modeling technique is appropriate in this case, and that optimizing rejuvenation schedules can greatly improve the average maximum service rate of an aging application.
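The idea of optimizing a rejuvenation schedule for average performance can be illustrated with a small sketch. It is not the authors' procedure: the linear degradation model `linear_rate`, the fixed rejuvenation downtime, and the function name `best_rejuvenation_interval` are all assumptions made for illustration. Given a deterministic model of the maximum service rate as a function of requests served since the last rejuvenation, the sketch searches for the number of requests per cycle that maximizes the average service rate, optionally under an SLA lower bound on the rate.

```python
from typing import Callable, Optional, Tuple


def best_rejuvenation_interval(
    rate: Callable[[int], float],
    downtime: float,
    max_requests: int,
    sla_min_rate: Optional[float] = None,
) -> Tuple[int, float]:
    """Return (n, avg): rejuvenating after every n requests maximizes
    avg = n / (time to serve n requests + downtime).

    rate(i) is the modeled maximum service rate (requests/s) after i
    requests have been served since the last rejuvenation. The model is
    assumed to be non-increasing in i, so the search stops at the first
    predicted stall or SLA violation.
    """
    best_n, best_avg = 0, 0.0
    elapsed = 0.0  # seconds needed to serve the first n requests of a cycle
    for n in range(1, max_requests + 1):
        r = rate(n - 1)
        if r <= 0 or (sla_min_rate is not None and r < sla_min_rate):
            break
        elapsed += 1.0 / r  # the n-th request is served at rate r
        avg = n / (elapsed + downtime)
        if avg > best_avg:
            best_n, best_avg = n, avg
    return best_n, best_avg


def linear_rate(i: int) -> float:
    """Hypothetical aging model: 100 requests/s when fresh, losing
    0.001 requests/s per served request (illustrative numbers only)."""
    return 100.0 - 0.001 * i


if __name__ == "__main__":
    # Assumed rejuvenation downtime of 60 s; search up to 90,000 requests.
    n, avg = best_rejuvenation_interval(linear_rate, 60.0, 90_000)
    print(f"unconstrained: rejuvenate every {n} requests, avg {avg:.2f} req/s")
    n, avg = best_rejuvenation_interval(linear_rate, 60.0, 90_000, sla_min_rate=80.0)
    print(f"with SLA >= 80 req/s: every {n} requests, avg {avg:.2f} req/s")
```

The exhaustive scan is deliberately simple. Measuring the schedule in served requests rather than wall-clock time mirrors the work-based view of aging described in the abstract, and the SLA constraint only shortens the admissible cycle length.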



Available from: Artur Andrzejak, Apr 25, 2015
  • Source
    • "Aging issues occur in any type of software that is sufficiently complex, but it is particularly troublesome in long-running applications. Examples include telecommunication systems, web-servers, web-service middleware [3], or cloud computing infrastructure [4]. Until today, the primary technique to combat software aging are controlled restarts known as software rejuvenation [5]."
    ABSTRACT: Memory-related software defects manifest after a long incubation time and are usually discovered in a production scenario. As a consequence, this frequently encountered class of so-called software aging problems incurs severe follow-up costs, including performance and reliability degradation, the need for workarounds (usually controlled restarts), and the effort of localizing the causes. While many excellent tools for identifying memory leaks exist, they are inappropriate for automated leak detection or isolation, as they require developer involvement or slow down execution considerably. In this work we propose a lightweight approach which allows for automated leak detection during the standardized unit or integration tests. The core idea is to compare, at the byte-code level, the memory allocation behavior of related development versions of the same software. We evaluate our approach by injecting memory leaks into the YARN component of the popular Hadoop framework and comparing the accuracy of detection and isolation in various scenarios. The results show that the approach can detect and isolate such defects with high precision, even if multiple leaks are injected at once.
    Full-text · Conference Paper · Aug 2013
  • Source
    • "Consequently, the primary technique to combat software aging are controlled restarts known as software rejuvenation [2]. The majority of research in the last two decades was targeting modeling of the degradation process and optimizing rejuvenation schedules [3], [4], [5], [6] (see also Sec. IV). "
    ABSTRACT: Software aging, i.e. degradation of software performance or functionality caused by resource depletion, is usually discovered only in the production scenario. This incurs large costs and delays in defect removal and requires provisional solutions such as rejuvenation (controlled restarts). We propose a method for detecting aging problems shortly after their introduction by runtime comparisons of different development versions of the same software. Possible aging issues are discovered by analyzing the differences in runtime traces of selected metrics. The required comparisons are workload-independent, which minimizes the additional effort of dedicated stress tests. Consequently, the method requires only minimal changes to the traditional development and testing process. This paves the way to detecting such problems before public releases, greatly reducing the cost of defect fixing. Our study focuses on the memory leaks of Eucalyptus, a popular open source framework for managing cloud computing environments.
    Full-text · Conference Paper · Jan 2013
  • Source
    • "This gives us a constant scale with respect to the execution of the load application. Workload-based sampling has also been shown to be more appropriate in other areas [7] [8] of failure prediction. "
    ABSTRACT: A recent trend in the design of commodity processors is the combination of multiple independent execution units on one chip. With the resulting increase in complexity and transistor count, it becomes more and more likely that a single execution unit on a processor becomes faulty. In order to tackle this situation, we propose an architecture for dependable process management in chip-multiprocessing machines. In our approach, execution units survey each other to anticipate future hardware failures. The prediction relies on the analysis of processor hardware performance counters by a statistical rank-sum test. Initial experiments with the Intel Core processor platform proved the feasibility of the approach, but also showed the need for further investigation due to a high variation in prediction quality in most of the cases.
    Full-text · Conference Paper · Jul 2009