Conference Paper

High-available grid services through the use of virtualized clustering.

Tech. Univ. of Catalonia Barcelona, Barcelona;
DOI: 10.1109/GRID.2007.4354113 Conference: 8th IEEE/ACM International Conference on Grid Computing (GRID 2007), September 19-21, 2007, Austin, Texas, USA, Proceedings
Source: DBLP

ABSTRACT Grid applications comprise several components and web-services that make them highly prone to the occurrence of transient software failures and aging problems. This type of failures often incur in undesired performance levels and unexpected partial crashes. In this paper we present a technique that offers high-availability for Grid services based on concepts like virtualization, clustering and software rejuvenation. To show the effectiveness of our approach, we have conducted some experiments with OGSA-DAI middleware. One of the implementations of OGSA-DAI makes use of use of Apache Axis V1.2.1, a SOAP implementation that suffers from severe memory leaks. Without changing any bit of the middleware layer we have been able to anticipate most of the problems caused by those leaks and to increase the overall availability of the OGSA-DAI Application Server. Although these results are tightly related with this middleware it should be noted that our technique is neutral and can be applied to any other Grid service that is supposed to be high-available.

  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: In this paper we present a comparative experimental study of the main software rejuvenation techniques developed so far to mitigate the software aging effects. We consider six different rejuvenation techniques with different levels of granularity: (i) physical node reboot, (ii) virtual machine reboot, (iii) OS reboot, (iv) fast OS reboot, (v) standalone application restart, and (vi) application rejuvenation by a hot standby server. We conduct a set of experiments injecting memory leaks at the application level. We evaluate the performance overhead introduced by software rejuvenation in terms of throughput loss, failed requests, slow requests, and memory fragmentation overhead. We also analyze the selected rejuvenation techniques’ efficiency in mitigating the aging effects. Due to the growing adoption of virtualization technology, we also analyze the overhead of the rejuvenation techniques in virtualized environments. The results show that the performance overheads introduced by the rejuvenation techniques are related to the granularity level. We also capture different levels of memory fragmentation overhead induced by the virtualization demonstrating some drawbacks of using virtualization in comparison with non-virtualized rejuvenation approaches. Finally, based on these research findings we present comprehensive guidelines to support decision making during the design of rejuvenation scheduling algorithms, as well as in selecting the appropriate rejuvenation mechanism.
    Performance Evaluation. 03/2013; 70(3):231–250.
  • Source
    Performance Evaluation 11/2013; 70(11):917–933. · 0.84 Impact Factor
  • [Show abstract] [Hide abstract]
    ABSTRACT: A number of studies have reported the phenomenon of “Software aging”, caused by resource exhaustion and characterized by progressive software performance degradation. In this article, we carry out an experimental study of software aging and rejuvenation for an on-line bookstore application, following the standard configuration of TPC-W benchmark. While real website is used for the bookstore, the clients are emulated. In order to reduce the time to application failures caused by memory leaks, we use the accelerated life testing (ALT) approach. We then select the Weibull time to failure distribution at normal level, to be used in a semi-Markov process, to compute the optimal software rejuvenation trigger interval. Since the validation of optimal rejuvenation trigger interval with emulated browsers will take an inordinate long time, we develop a simulation model to validate the ALT experimental results, and also estimate the steady-state availability to cross-validate the results of the semi-Markov availability model.
    ACM Journal on Emerging Technologies in Computing Systems 01/2014; 10(1):1. · 0.76 Impact Factor

Full-text (2 Sources)

Available from
Jun 3, 2014