Conference Paper

High-available grid services through the use of virtualized clustering

Tech. Univ. of Catalonia Barcelona, Barcelona
DOI: 10.1109/GRID.2007.4354113 Conference: 8th IEEE/ACM International Conference on Grid Computing (GRID 2007), September 19-21, 2007, Austin, Texas, USA, Proceedings
Source: DBLP


Grid applications comprise many components and web services, which makes them highly prone to transient software failures and software aging. Failures of this kind often lead to degraded performance and unexpected partial crashes. In this paper we present a technique that provides high availability for Grid services by combining virtualization, clustering and software rejuvenation. To show the effectiveness of our approach, we conducted experiments with the OGSA-DAI middleware. One implementation of OGSA-DAI uses Apache Axis V1.2.1, a SOAP implementation that suffers from severe memory leaks. Without changing the middleware layer at all, we were able to anticipate most of the problems caused by those leaks and to increase the overall availability of the OGSA-DAI Application Server. Although these results are tied to this particular middleware, the technique itself is neutral and can be applied to any other Grid service that needs to be highly available.



Available from: Javier Alonso Lopez
  • Source
    • "The determination of optimal interval to achieve maximum availability and minimum downtime cost, however, is mostly performed through building and analyzing an analytical model [27–29], whereas inspection-based rejuvenation is triggered in the case if aging effects measured through observations of system state violate restrict criteria or particular conditions. The rejuvenation trigger epoch is decided by a variety of mechanisms including threshold-based methods using aging indicators [30] [31] [32]; prediction-based approaches: machine learning, statistical approaches, or structural models [33] [34] [35] [36]; and mixed approaches using prediction methods to determine optimal threshold [37]. However, the implementation of inspection-based rejuvenation in a real environment could be troublesome for system administrator due to the growing complexity of the systems introduced by recent technologies (e.g., cloud computing) and heterogeneous environments (e.g., software defined data center) where the systems have to interact with each other. "
    ABSTRACT: It is important to assess availability of virtualized systems in IT business infrastructures. Previous work on availability modeling and analysis of the virtualized systems used a simplified configuration and assumption in which only one virtual machine (VM) runs on a Virtual Machine Monitor (VMM) hosted on a physical server. In this paper, we show a comprehensive availability model using stochastic reward nets (SRN). The model takes into account: (i) the detailed failures and recovery behaviors of multiple VMs, (ii) various other failure modes and corresponding recovery behaviors (e.g., hardware faults, failure and recovery due to Mandelbugs and aging related bugs), and (iii) dependency between different sub-components (e.g., between physical host failure and VMM, etc.) in a virtualized servers system. We also show numerical analysis on steady state availability, downtime in hours per year, transaction loss and sensitivity analysis. This model provides a new finding on how to increase system availability by combining both software rejuvenations at VM and VMM in a wise manner.
    Full-text · Article · Aug 2014 · The Scientific World Journal
  • Source
    • "However, these approaches require re-engineering the application, hence are not as cost-effective a solution. In this study we have selected the application rejuvenation technique discussed in [5] [14]. This approach uses virtualization technology so that it can be used for legacy as well as new applications without requiring application re-engineering. "
    ABSTRACT: In this paper we present a comparative experimental study of the main software rejuvenation techniques developed so far to mitigate the software aging effects. We consider six different rejuvenation techniques with different levels of granularity: (i) physical node reboot, (ii) virtual machine reboot, (iii) OS reboot, (iv) fast OS reboot, (v) standalone application restart, and (vi) application rejuvenation by a hot standby server. We conduct a set of experiments injecting memory leaks at the application level. We evaluate the performance overhead introduced by software rejuvenation in terms of throughput loss, failed requests, slow requests, and memory fragmentation overhead. We also analyze the selected rejuvenation techniques’ efficiency in mitigating the aging effects. Due to the growing adoption of virtualization technology, we also analyze the overhead of the rejuvenation techniques in virtualized environments. The results show that the performance overheads introduced by the rejuvenation techniques are related to the granularity level. We also capture different levels of memory fragmentation overhead induced by the virtualization demonstrating some drawbacks of using virtualization in comparison with non-virtualized rejuvenation approaches. Finally, based on these research findings we present comprehensive guidelines to support decision making during the design of rejuvenation scheduling algorithms, as well as in selecting the appropriate rejuvenation mechanism.
    Full-text · Article · Mar 2013 · Performance Evaluation
  • Source
    • "The inspection-based approaches can be subsequently divided in two categories: threshold-based approaches and prediction-based approaches. Threshold-based approaches are based on monitoring the aging effects and triggering the software rejuvenation when a specific threshold is exceeded [7]. In prediction-based approaches, a prediction method is applied to "
    ABSTRACT: Software rejuvenation has been addressed in hundreds of papers since it was proposed in 1995 by Huang et al. The growing number of research papers shows the great importance of this topic. However, no paper has yet studied software rejuvenation in the real world. This paper investigates to what extent software rejuvenation techniques are integrated into IT and Telco solutions. For this purpose, an intensive search was conducted across different sources such as company product websites, technical papers, white papers, US patents, and consultant surveys. The results show that IT and Telco companies develop software rejuvenation solutions to deal with software aging. The number of US patents addressing this issue confirms the interest of industry in developing mechanisms to deal with software aging-related failures. It has been observed that real software rejuvenation solutions mainly use time-based or threshold-based policies, while the US patents focus on predictive approaches.
    Full-text · Conference Paper · Jan 2012
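
The threshold-based policy described in the last excerpt above (monitor an aging indicator and trigger rejuvenation once it exceeds a limit) can be sketched roughly as follows. This is only an illustration, not the mechanism of any of the papers listed here: the names `ThresholdPolicy` and `first_trigger`, the 500 MB limit, and the rule of three consecutive samples are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional


@dataclass
class ThresholdPolicy:
    """Hypothetical threshold-based rejuvenation trigger.

    Rejuvenation is requested only after `consecutive` samples in a row
    reach the limit, so a single spike does not cause a restart.
    """
    limit_mb: float                 # assumed memory threshold, in MB
    consecutive: int = 3            # assumed number of samples required
    _streak: int = field(default=0, init=False, repr=False)

    def observe(self, used_mb: float) -> bool:
        """Feed one memory sample; return True when rejuvenation should start."""
        if used_mb >= self.limit_mb:
            self._streak += 1
        else:
            self._streak = 0        # indicator recovered, start counting again
        if self._streak >= self.consecutive:
            self._streak = 0        # reset after triggering
            return True
        return False


def first_trigger(samples: Iterable[float], policy: ThresholdPolicy) -> Optional[int]:
    """Return the index of the first sample that triggers rejuvenation, or None."""
    for index, used_mb in enumerate(samples):
        if policy.observe(used_mb):
            return index
    return None


if __name__ == "__main__":
    # Made-up memory readings (MB) from a slowly leaking service.
    readings = [100, 300, 520, 510, 530, 540]
    print(first_trigger(readings, ThresholdPolicy(limit_mb=500.0)))
```

In a real deployment the readings would come from a monitoring agent, and the trigger would start whatever rejuvenation action the system supports (for example, switching to a standby instance). Both are outside the scope of this sketch.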