Article

Distributed Environment for Efficient Virtual Machine Image Management in Federated Cloud Architectures


Abstract

The use of virtual machines (VMs) in Cloud computing provides various benefits in the overall software engineering lifecycle. These include efficient elasticity mechanisms resulting in higher resource utilization and lower operational costs. VMs as software artifacts are created using provider-specific templates, called virtual machine images (VMIs), and are stored in proprietary or public repositories for further use. However, some technology-specific choices can limit the interoperability among various Cloud providers and bundle the VMIs with nonessential or redundant software packages, leading to increased storage size, prolonged VMI delivery, sluggish VMI instantiation, and ultimately vendor lock-in. To address these challenges, we present a set of novel functionalities and design approaches for the efficient operation of distributed VMI repositories, specifically tailored for enabling (1) simplified creation of lightweight, size-optimized VMIs tuned for specific application requirements; (2) multi-objective VMI repository optimization; and (3) an efficient reasoning mechanism that helps optimize complex VMI operations. The evaluation results confirm that the presented approaches can reduce the VMI size by up to 55%, while trimming the image creation time by 66%. Furthermore, the repository optimization algorithms can reduce the VMI delivery time by up to 51% and cut storage expenses by 3%. Moreover, by implementing replication strategies, the optimization algorithms can increase the system reliability by 74%.


... Compact sub-structures (sub-graphs) of graphs find use in many scientific disciplines: from resource placement [43] to routing [30,31], structural analysis to data compression [45,46], and information coding and machine learning [15,19], to name but a few. Among the most popular and widely studied compact sub-structures of a graph are the spanning trees and, in particular, for weighted graphs and digraphs (directed graphs), the Minimum Spanning Trees (MST) [48]. ...
Preprint
Full-text available
We introduce the concept of Most, and Least, Compact Spanning Trees -- denoted respectively by $T^*(G)$ and $T^\#(G)$ -- of a simple, connected, undirected and unweighted graph $G(V, E, W)$. For a spanning tree $T(G) \in \mathcal{T}(G)$ to be considered $T^*(G)$, where $\mathcal{T}(G)$ represents the set of all the spanning trees of the graph $G$, it must have the least sum of inter-vertex pair shortest path distances from amongst the members of the set $\mathcal{T}(G)$. Similarly, for it to be considered $T^\#(G)$, it must have the highest sum of inter-vertex pair shortest path distances. In this work, we present an iteratively greedy rank-and-regress method that produces at least one $T^*(G)$ or $T^\#(G)$ by eliminating one extremal edge per iteration. The rank function for performing the elimination is based on the elements of the matrix of relative forest accessibilities of a graph and the related forest distance. We provide empirical evidence in support of our methodology using some standard graph families, and discuss potentials for computational efficiencies, along with relevant trade-offs, to enable the extraction of $T^*(G)$ and $T^\#(G)$ within reasonable time limits on standard platforms.
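For quick reference, the two objectives can be written out explicitly. This is a direct restatement of the definitions above, with $d_T(u,v)$ denoting the shortest-path distance between $u$ and $v$ inside the tree $T$:

```latex
% Most and Least Compact Spanning Trees, restated from the abstract;
% d_T(u,v) is the shortest-path distance between u and v within T.
T^{*}(G)  = \operatorname*{arg\,min}_{T \in \mathcal{T}(G)} \sum_{\{u,v\} \subseteq V} d_T(u,v),
\qquad
T^{\#}(G) = \operatorname*{arg\,max}_{T \in \mathcal{T}(G)} \sum_{\{u,v\} \subseteq V} d_T(u,v).
```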
... The ENTICE project [11,21] is a multidisciplinary team of computer scientists, application developers, cloud providers and operators with the aim of researching a ubiquitous repository-based technology for VMI (and container) management called the ENTICE environment. This environment provides a universal backbone for IaaS image management operations, accommodating the needs of different use cases with dynamic resource requirements (e.g., requiring resources for minutes or just for a few seconds) and other Quality of Service (QoS) requirements. ...
Article
Full-text available
Virtual machine (VM) images (VMIs) often share common parts of significant size as they are stored individually. Using existing de-duplication techniques for such images is non-trivial, imposes serious technical challenges, and requires direct access to clouds' proprietary image storages, which is not always feasible. We propose an alternative approach that splits images into shared parts, called fragments, which are stored only once. Our solution requires a reasonably small set of base images available in the cloud; additionally, only the increments are stored without the contents of base images, providing significant storage space savings. Composite images, consisting of a base image and one or more fragments, are assembled on-demand at VM deployment. Our technique can be used in conjunction with practically any popular cloud solution, and the storage of fragments is independent of the proprietary image storage of the cloud provider.
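As a rough illustration of the fragment idea (not the authors' implementation; the flat path-to-bytes layout and all identifiers below are invented), assembling a composite image amounts to overlaying increments on a shared base:

```python
# Sketch: compose a VMI from a base image plus incremental fragments.
# Shared content lives once in the base; fragments hold only the deltas.
from typing import Dict, List

def assemble_composite(base_id: str,
                       fragment_ids: List[str],
                       store: Dict[str, Dict[str, bytes]]) -> Dict[str, bytes]:
    """Overlay fragments (path -> content) onto the base image, in order."""
    image = dict(store[base_id])      # start from the shared base image
    for fid in fragment_ids:          # later fragments override earlier ones
        image.update(store[fid])
    return image

# Two composite images share one base; only the increments are stored extra.
store = {
    "ubuntu-base": {"/etc/os-release": b"ubuntu"},
    "frag-nginx":  {"/usr/sbin/nginx": b"..."},
    "frag-redis":  {"/usr/bin/redis-server": b"..."},
}
web_vmi = assemble_composite("ubuntu-base", ["frag-nginx"], store)
db_vmi  = assemble_composite("ubuntu-base", ["frag-redis"], store)
print(sorted(web_vmi), sorted(db_vmi))
```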
... Razavi and Kielmann [17] proposed an elastic VM deployment mechanism using VMI caches to overcome VM start-up bottlenecks. Kimovski et al. [18] described mechanisms for efficiently managing virtual machine images in federated cloud repositories. ...
Article
New software engineering technologies facilitate the development of applications from reusable software components, such as Virtual Machine and container images (VMI/CIs). Key requirements for the storage of VMI/CIs in public or private repositories are their fast delivery and cloud deployment times. ENTICE is a federated storage facility for VMI/CIs that provides optimisation mechanisms through the use of fragmentation and replication of images and a Pareto Multi-Objective Optimisation (MO) solver. The operation of the MO solver is, however, time-consuming due to the size and complexity of the metadata, which specifies various non-functional requirements for the management of VMI/CIs, such as geolocation, operational cost, and delivery time. In this work, we address this problem with a new semantic approach, which uses an ontology of the federated ENTICE repository, a knowledge base, and a constraint-based reasoning mechanism. Open Source technologies such as Protégé, Jena Fuseki, and Pellet were used to develop a solution. Two specific use cases, (1) repository optimisation with offline and (2) online redistribution of VMI/CIs, are presented in detail. In both use cases, data from the knowledge base are provided to the MO solver. It is shown that Pellet-based reasoning can be used to reduce the input metadata size used in the optimisation process by taking into consideration the geographic location of the VMI/CIs and the provenance of the VMI fragments. This process leads to a reduction of the input metadata size for the MO solver by up to 60% and a reduction of the total optimisation time of the MO solver by up to 68%, while fully preserving the quality of the solution, which is significant.
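The actual pipeline relies on an ontology, a knowledge base, and Pellet reasoning; the sketch below only illustrates the net effect of that pre-filtering step, using invented record fields rather than the ENTICE ontology schema:

```python
# Sketch of the pre-filtering effect: prune VMI/fragment metadata by
# geographic location and fragment provenance before the MO solver runs.
from dataclasses import dataclass
from typing import List, Set

@dataclass
class ImageMeta:
    image_id: str
    region: str          # geographic location of the hosting repository
    base_image: str      # provenance: the base the fragment derives from

def prune_for_solver(records: List[ImageMeta],
                     target_regions: Set[str],
                     relevant_bases: Set[str]) -> List[ImageMeta]:
    """Keep only records the current optimisation run can act on."""
    return [r for r in records
            if r.region in target_regions and r.base_image in relevant_bases]

records = [
    ImageMeta("vmi-1", "eu-west", "debian-9"),
    ImageMeta("vmi-2", "us-east", "debian-9"),
    ImageMeta("vmi-3", "eu-west", "centos-7"),
]
solver_input = prune_for_solver(records, {"eu-west"}, {"debian-9"})
print(len(solver_input), "of", len(records), "records passed to the solver")
```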
Conference Paper
Full-text available
Cloud Federation facilitates the aggregation of multiple services administered by different providers, thus opening the possibility for customers to profit from lower cost and better performance, while allowing cloud providers to offer more sophisticated services. Unfortunately, the current state of the art does not provide any substantial means for streamlined adaptation of federated Cloud environments. One of the essential barriers preventing Cloud federation is the inefficient management of distributed storage repositories for Virtual Machine Images (VMIs). In such environments, the VMIs are currently stored by Cloud providers in proprietary centralised repositories without considering application characteristics and their runtime requirements, causing high deployment and instantiation overheads. In this paper, a novel multi-objective optimization framework for VMI placement across distributed repositories in a federated Cloud environment is proposed. Based on communication performance requirements, VMI use patterns, and the structure of images or input data, the framework provides efficient means for transparent optimization of the distribution and placement of VMIs across distributed repositories, significantly lowering their provisioning time for complex resource requests and for executing user applications.
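A toy example of the trade-off such a framework navigates: each candidate repository maps to one objective vector (here delivery time versus storage cost, with invented bandwidths and prices), and only non-dominated placements survive:

```python
# Toy two-objective evaluation of candidate repositories for one 4 GB VMI.
def objectives(size_gb, bandwidth_gbps, price_per_gb):
    delivery_s = size_gb * 8 / bandwidth_gbps   # transfer time in seconds
    cost = size_gb * price_per_gb               # storage cost
    return delivery_s, cost

def dominates(a, b):
    """a dominates b: no worse in every objective, better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

candidates = {
    "repo-eu":   objectives(4.0, 10.0, 0.023),
    "repo-us":   objectives(4.0, 2.5, 0.010),
    "repo-asia": objectives(4.0, 1.0, 0.030),
}
pareto = {name for name, obj in candidates.items()
          if not any(dominates(other, obj)
                     for n, other in candidates.items() if n != name)}
print("Pareto-optimal placements:", pareto)  # {'repo-eu', 'repo-us'}
```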
Article
Full-text available
Elasticity, the ability to rapidly scale resources up and down on demand, is an essential feature of public cloud platforms. However, it is difficult to understand the elasticity requirements of a given application and workload, and whether the elasticity provided by a cloud provider will meet those requirements. We introduce the elasticity mechanisms of a typical Infrastructure as a Service (IaaS) cloud platform (inspired by Amazon EC2). We have enhanced our Service Oriented Performance Modelling method and tool to model and predict the elasticity characteristics of three realistic applications and workloads on this cloud platform. We compare the pay-as-you-go instance costs and end-user response time service level agreements for different elasticity scenarios. The model is also able to predict the elasticity requirements (in terms of the maximum instance spin-up time) for the three applications. We conclude with an analysis of the results.
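A back-of-the-envelope version of the question such a model answers, with invented numbers: does the instance spin-up time fit within the workload ramp, and how many extra instances must arrive?

```python
# Toy check: can scale-out keep pace with a workload ramp? All numbers invented.
def instances_needed(req_per_s: int, capacity_per_instance: int) -> int:
    return -(-req_per_s // capacity_per_instance)   # ceiling division

current, peak = 200, 1000     # requests/s before and after the ramp
cap = 100                     # requests/s a single instance can serve
ramp_duration_s = 120         # how quickly the workload rises
spin_up_s = 90                # time for a new instance to become ready

deficit = instances_needed(peak, cap) - instances_needed(current, cap)
meets_sla = spin_up_s <= ramp_duration_s   # capacity must arrive before the peak
print(f"extra instances needed: {deficit}, spin-up fits ramp: {meets_sla}")
```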
Conference Paper
Full-text available
The widespread usage of virtualization has had a major impact on disparate areas such as scientific computing, industrial businesses, and academic environments. This has led to a massive production of Virtual Machine Images (VMIs). The management of this broad spectrum of VMIs should consider the variety of operating systems, applications, and hypervisors. This paper describes the developments towards a catalog and repository system for VMIs. Using the industry-standard Open Virtualization Format (OVF), the system offers indexing and storage capabilities for VMIs, as well as user-oriented matchmaking algorithms to leverage VMI sharing.
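In spirit, the matchmaking step could resemble the sketch below; the attribute-counting score is invented for illustration and is not the paper's algorithm:

```python
# Sketch: rank catalogued VMIs by how many requested attributes they satisfy.
def match_score(image: dict, wanted: dict) -> int:
    return sum(1 for k, v in wanted.items() if image.get(k) == v)

images = [
    {"id": "vmi-a", "os": "debian", "hypervisor": "kvm", "app": "mpi"},
    {"id": "vmi-b", "os": "debian", "hypervisor": "xen", "app": "mpi"},
]
wanted = {"os": "debian", "hypervisor": "kvm"}
best = max(images, key=lambda img: match_score(img, wanted))
print(best["id"])  # vmi-a
```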
Article
Full-text available
Though traditional Ethernet-based network architectures such as Gigabit Ethernet have suffered from a huge performance gap compared to other high-performance networks (e.g., InfiniBand, Quadrics, Myrinet), Ethernet has continued to be the most widely used network architecture today. This trend is mainly attributed to the low cost of the network components and their backward compatibility with the existing Ethernet infrastructure. With the advent of 10-Gigabit Ethernet and TCP Offload Engines (TOEs), whether this performance gap can be bridged is an open question. In this paper, we present a detailed performance evaluation of the Chelsio T110 10-Gigabit Ethernet adapter with TOE. We have done performance evaluations in three broad categories: (i) detailed micro-benchmark performance evaluation at the sockets layer, (ii) performance evaluation of the Message Passing Interface (MPI) stack atop the sockets interface, and (iii) application-level evaluations using the Apache web server. Our experimental results demonstrate latency as low as 8.9 μs and throughput of nearly 7.6 Gbps for these adapters. Further, we see an order-of-magnitude improvement in the performance of the Apache web server while utilizing the TOE as compared to the basic 10-Gigabit Ethernet adapter without TOE.
Article
Full-text available
Cloud computing is an emerging commercial infrastructure paradigm that promises to eliminate the need for maintaining expensive computing facilities by companies and institutes alike. Through the use of virtualization and resource time sharing, clouds serve a large user base with different needs using a single set of physical resources. Thus, clouds have the potential to provide their owners the benefits of an economy of scale and, at the same time, become an alternative for scientists to clusters, grids, and parallel production environments. However, the current commercial clouds have been built to support web and small database workloads, which are very different from typical scientific computing workloads. Moreover, the use of virtualization and resource time sharing may introduce significant performance penalties for demanding scientific computing workloads. In this work, we analyze the performance of cloud computing services for scientific computing workloads. We quantify the presence in real scientific computing workloads of Many-Task Computing (MTC) users, that is, of users who employ loosely coupled applications comprising many tasks to achieve their scientific goals. Then, we perform an empirical evaluation of the performance of four commercial cloud computing services, including Amazon EC2, which is currently the largest commercial cloud. Finally, we compare through trace-based simulation the performance characteristics and cost models of clouds and other scientific computing platforms, for general and MTC-based scientific computing workloads. Our results indicate that the current clouds need an order of magnitude in performance improvement to be useful to the scientific community, and show which improvements should be considered first to address this discrepancy between offer and demand.
Conference Paper
Full-text available
Cloud federation has been proposed as a new paradigm that allows providers to avoid the limitation of owning only a restricted amount of resources, which forces them to reject new customers when they do not have enough local resources to fulfill their customers' requirements. Federation allows a provider to dynamically outsource resources to other providers in response to demand variations. It also allows a provider that has underused resources to rent part of them to other providers. Both can increase the provider's profit when used adequately. This requires that the provider has a clear understanding of the potential of each federation decision, in order to choose the most convenient one depending on the environment conditions. In this paper, we present a complete characterization of providers' federation in the Cloud, including decision equations to outsource resources to other providers, rent free resources to other providers (i.e., insourcing), or shut down unused nodes to save power, and we characterize these decisions as a function of several parameters. We then demonstrate in the evaluation section how a provider can enhance its profit by using these equations to exploit federation, and how the different parameters influence which is the best decision in each situation.
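The paper derives full decision equations; the two rules below only convey their flavour in heavily simplified form (all names and thresholds are invented, and real decisions would compare expected profits rather than simple thresholds):

```python
# Simplified federation decisions: serve locally, outsource, or reject a
# request; insource (rent out) or shut down an idle node.
def serve_decision(revenue, local_cost, outsource_price, has_local_capacity):
    if has_local_capacity and revenue > local_cost:
        return "serve locally"
    if revenue > outsource_price:       # still profitable via a peer provider
        return "outsource"
    return "reject"

def idle_node_decision(insourcing_revenue, power_cost):
    # Rent spare capacity to peers only if it beats powering the node down.
    return "insource" if insourcing_revenue > power_cost else "shutdown"

print(serve_decision(revenue=10.0, local_cost=12.0,
                     outsource_price=7.0, has_local_capacity=True))  # outsource
print(idle_node_decision(insourcing_revenue=3.0, power_cost=4.0))    # shutdown
```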
Conference Paper
Full-text available
The near-future evolution of cloud computing can be hypothesized in three subsequent stages: stage 1, "Monolithic" (now), in which cloud services are based on independent proprietary architectures; stage 2, "Vertical Supply Chain", in which cloud providers will leverage cloud services from other providers; and stage 3, "Horizontal Federation", in which small, medium, and large cloud providers will federate themselves to gain economies of scale and an enlargement of their capabilities. Currently, the major clouds are planning the transition to stage 2, but how to achieve stage 3 is unclear because some architectural limitations have to be overcome. In this paper, considering a general cloud architecture, we highlight such limitations and propose enhancements that add new federation capabilities. To address these concerns, we propose a solution based on the Cross-Cloud Federation Manager, a new component that can be placed inside cloud architectures, allowing a cloud to establish federation with other clouds according to a three-phase model: discovery, match-making, and authentication.
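A skeleton of the three-phase model with stand-in internals (discovery filters a registry of candidate clouds, match-making picks the cheapest fit, authentication is a placeholder handshake):

```python
# Skeleton of the discovery / match-making / authentication flow.
def discover(registry, needed_cpu):
    return [c for c in registry if c["free_cpu"] >= needed_cpu]

def match(candidates):
    return min(candidates, key=lambda c: c["price"]) if candidates else None

def authenticate(home, foreign):
    # Placeholder for the trust-establishment step (credential exchange).
    return {"from": home, "to": foreign["name"], "token": "<granted>"}

registry = [{"name": "cloud-B", "free_cpu": 64, "price": 0.9},
            {"name": "cloud-C", "free_cpu": 16, "price": 0.4}]
partner = match(discover(registry, needed_cpu=32))
if partner:
    session = authenticate("cloud-A", partner)
    print("federated with", session["to"])
```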
Conference Paper
Full-text available
The commercial success of Cloud computing and recent developments in Grid computing have brought platform virtualization technology into the field of high performance computing. Virtualization offers both more flexibility and security through custom user images and user isolation. In this paper, we deal with the problem of distributing virtual machine (VM) images to a set of distributed compute nodes in a Cross-Cloud computing environment, i.e., the connection of two or more Cloud computing sites. Armbrust et al. identified data transfer bottlenecks as one of the obstacles Cloud computing has to solve to be a commercial success. Several methods for distributing VM images are presented, and optimizations based on copy-on-write layers are discussed. The performance of the presented solutions and the security overhead are evaluated.
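One common way to realise such copy-on-write layering today, shown as an illustration rather than the paper's exact method, is a qcow2 overlay whose backing file is the shared base image (assumes qemu-img is installed and base.qcow2 exists):

```python
# Create a qcow2 copy-on-write overlay on top of a shared base image, so
# only per-VM deltas need to be stored or transferred.
import subprocess

def make_cow_overlay(base_path: str, overlay_path: str) -> None:
    subprocess.run(
        ["qemu-img", "create",
         "-f", "qcow2",      # format of the new overlay
         "-b", base_path,    # backing (base) image shared across VMs
         "-F", "qcow2",      # format of the backing file
         overlay_path],
        check=True,
    )

# Each site keeps one copy of the base; per-VM overlays start out tiny.
make_cow_overlay("base.qcow2", "vm42-overlay.qcow2")
```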
Conference Paper
Full-text available
FutureGrid (FG) is an experimental, high-performance test bed that supports HPC, cloud, and grid computing experiments for both application and computer scientists. FutureGrid includes the use of virtualization technology to support a wide range of operating systems and to provide a test bed for various cloud Infrastructure-as-a-Service (IaaS) frameworks. Therefore, efficient management of a variety of virtual machine images becomes a key issue. Current cloud frameworks do not provide a way to manage images for different IaaS frameworks. They typically provide their own image repositories, but in general they do not allow storing the metadata needed to handle images of other IaaS frameworks. We present a generic catalog and image repository to store images of any type. Our image repository has a convenient interface that distinguishes image types. Therefore, it is not only useful for FutureGrid, but also for any application that needs to manage images.
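A minimal sketch of what a type-agnostic catalog entry might hold; every field name is invented for illustration and is not the FutureGrid schema:

```python
# Sketch: a catalog that stores enough metadata to tell image types apart,
# plus a free-form field for framework-specific attributes.
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class CatalogEntry:
    image_id: str
    image_type: str                # e.g. "qcow2", "ami", "ovf", "vdi"
    os: str
    hypervisor: str
    extra: Dict[str, str] = field(default_factory=dict)  # per-IaaS metadata

catalog: Dict[str, CatalogEntry] = {}

def register(entry: CatalogEntry) -> None:
    catalog[entry.image_id] = entry

register(CatalogEntry("img-001", "qcow2", "ubuntu-20.04", "kvm",
                      {"eucalyptus.kernel": "eki-12345"}))
print([e.image_type for e in catalog.values()])  # ['qcow2']
```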
Conference Paper
Full-text available
This paper introduces a modified PSO, the Non-dominated Sorting Particle Swarm Optimizer (NSPSO), for better multiobjective optimization. NSPSO extends the basic form of PSO by making better use of particles' personal bests and offspring for more effective non-domination comparisons. Instead of a single comparison between a particle's personal best and its offspring, NSPSO compares all particles' personal bests and their offspring in the entire population. This proves to be effective in providing an appropriate selection pressure to propel the swarm population towards the Pareto-optimal front. By using the non-dominated sorting concept and two parameter-free niching methods, NSPSO and its variants have shown remarkable performance against a set of well-known difficult test functions (the ZDT series). Our results and comparison with NSGA-II show that NSPSO is highly competitive with existing evolutionary and PSO multiobjective algorithms.
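The selection idea can be sketched as follows; note this uses a simplified domination-count ranking in place of NSPSO's full non-dominated sorting and niching (minimisation assumed, data invented):

```python
# Sketch of NSPSO-style selection: pool all N personal bests with the N
# offspring and keep the N least-dominated points as the next swarm.
def dominates(a, b):
    return all(x <= y for x, y in zip(a, b)) and a != b

def nspso_select(pbests, offspring):
    pool = pbests + offspring                     # combined 2N population
    # Rank by how many pool members dominate each point (0 = non-dominated).
    ranked = sorted(pool, key=lambda p: sum(dominates(q, p) for q in pool))
    return ranked[:len(pbests)]                   # next swarm of size N

pbests    = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
offspring = [(1.5, 3.0), (0.8, 4.5), (3.0, 0.5)]
print(nspso_select(pbests, offspring))  # offspring can displace personal bests
```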
Conference Paper
Elastic cloud applications rely on fast virtual machine (VM) startup, e.g. when scaling out for handling increased workload. While there have been recent studies into the VM startup time in clouds, the effects of the VM image (VMI) disk size and its contents are little understood. To fill this gap, we present a detailed study of these factors on Amazon EC2. Based on our findings, we developed a novel approach for consolidating size and contents of VMIs. We then evaluated our approach with the ConPaaS VMI, an open-source Platform-as-a-Service runtime. Compared to an unmodified ConPaaS VMI, our approach results in up to four times reduction of the disk size, three times speedup for the VM startup time, and three times reduction of storage cost.
Conference Paper
The design of quality measures for approximations of the Pareto-optimal set is of high importance not only for the performance assessment, but also for the construction of multiobjective optimizers. Various measures have been proposed in the literature with the intention to capture different preferences of the decision maker. A quality measure that possesses a highly desirable feature is the hypervolume measure: whenever one approximation completely dominates another approximation, the hypervolume of the former will be greater than the hypervolume of the latter. Unfortunately, this measure—as any measure inducing a total order on the search space—is biased, in particular towards convex, inner portions of the objective space. Thus, an open question in this context is whether it can be modified such that other preferences such as a bias towards extreme solutions can be obtained. This paper proposes a methodology for quality measure design based on the hypervolume measure and demonstrates its usefulness for three types of preferences.
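For intuition, the hypervolume of a two-objective minimisation front is the area it dominates up to a chosen reference point; a sweep over the sorted front computes it exactly:

```python
# 2-D hypervolume (minimisation): area dominated by the front, bounded by ref.
def hypervolume_2d(front, ref):
    pts = sorted(front)                  # ascending in the first objective
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < prev_f2:                 # dominated points contribute nothing
            hv += (ref[0] - f1) * (prev_f2 - f2)
            prev_f2 = f2
    return hv

front = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
print(hypervolume_2d(front, ref=(5.0, 5.0)))   # 11.0
```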
Article
Small/medium cloud storage providers can hardly compete with the biggest cloud players such as Google, Amazon, Dropbox, etc. As a consequence, the cloud storage market depends on such mega-providers, and each small/medium provider cannot face alone the challenge of Big Data storage. A possible solution consists in establishing stronger partnerships among small/medium providers, where they can borrow and lend resources to each other according to the rules of the federated cloud ecosystem they belong to. Under such an approach, the challenge consists in creating federated cloud ecosystems able to compete with mega-providers, and one of the major problems in achieving such an ecosystem is the management of inter-domain communications. In this paper, we propose an architecture addressing this issue. In particular, we present and test a solution integrating the CLEVER Message Oriented Middleware (MOM) with the Hadoop Distributed File System (HDFS), i.e., one of the major massive storage solutions currently available on the market.
Conference Paper
To provide elastic cloud services with QoS guarantees, it is essential for cloud data centers to provision virtual machines rapidly according to user requests. Due to the bandwidth bottleneck of the centralized model, the P2P model has recently been adopted in data centers to relieve server workload by enabling sharing among VM instances. In this paper, we develop a simple theoretic model to analyze two typical P2P models for VM image distribution, namely, the isolated-image P2P distribution model and the cross-image P2P distribution model. We compare their efficiency under different parameter settings and derive their corresponding optimal server bandwidth allocation strategies. In addition, we propose a practical optimal server bandwidth provisioning algorithm for the chunk-level cross-image P2P distribution mechanism to further improve its efficiency. Extensive simulations are conducted to validate the effectiveness of our proposed algorithm.
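The paper builds its own analytic model; as independent intuition for why peer sharing relieves the server, the classic fluid bound for P2P content distribution compares as follows (all numbers invented):

```python
# Classic P2P fluid bound versus pure client-server distribution of an image.
def p2p_min_time(size, n_peers, server_up, peer_up, peer_down):
    # Limited by server upload, slowest peer download, and aggregate upload.
    return max(size / server_up,
               size / peer_down,
               n_peers * size / (server_up + n_peers * peer_up))

def client_server_time(size, n_peers, server_up):
    return n_peers * size / server_up   # the server uploads every copy itself

S, N = 4000.0, 50                       # image size (MB) and number of peers
print(client_server_time(S, N, server_up=1000.0))                  # 200.0 s
print(p2p_min_time(S, N, 1000.0, peer_up=100.0, peer_down=500.0))  # ~33.3 s
```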
Article
Multi-objective evolutionary algorithms (MOEAs) that use non-dominated sorting and sharing have been criticized mainly for: (1) their O(MN³) computational complexity (where M is the number of objectives and N is the population size); (2) their non-elitism approach; and (3) the need to specify a sharing parameter. In this paper, we suggest a non-dominated sorting-based MOEA, called NSGA-II (Non-dominated Sorting Genetic Algorithm II), which alleviates all of the above three difficulties. Specifically, a fast non-dominated sorting approach with O(MN²) computational complexity is presented. Also, a selection operator is presented that creates a mating pool by combining the parent and offspring populations and selecting the best N solutions (with respect to fitness and spread). Simulation results on difficult test problems show that NSGA-II is able, for most problems, to find a much better spread of solutions and better convergence near the true Pareto-optimal front compared to the Pareto-archived evolution strategy and the strength-Pareto evolutionary algorithm, two other elitist MOEAs that pay special attention to creating a diverse Pareto-optimal front. Moreover, we modify the definition of dominance in order to solve constrained multi-objective problems efficiently. Simulation results of the constrained NSGA-II on a number of test problems, including a five-objective, seven-constraint nonlinear problem, are compared with another constrained multi-objective optimizer, and the much better performance of NSGA-II is observed.
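The fast non-dominated sorting step the abstract refers to can be sketched as follows (minimisation; a complete NSGA-II would add crowding-distance selection within the last accepted front):

```python
# Fast non-dominated sort in the O(MN^2) style described by the abstract.
def fast_non_dominated_sort(pop):
    def dominates(a, b):
        return all(x <= y for x, y in zip(a, b)) and a != b

    dominated_by = {i: [] for i in range(len(pop))}  # solutions i dominates
    counts = [0] * len(pop)                          # how many dominate i
    fronts, current = [], []
    for i, p in enumerate(pop):
        for j, q in enumerate(pop):
            if dominates(p, q):
                dominated_by[i].append(j)
            elif dominates(q, p):
                counts[i] += 1
        if counts[i] == 0:
            current.append(i)                        # member of the first front
    while current:
        fronts.append(current)
        nxt = []
        for i in current:
            for j in dominated_by[i]:
                counts[j] -= 1
                if counts[j] == 0:
                    nxt.append(j)
        current = nxt
    return fronts

pop = [(1, 4), (2, 2), (3, 3), (4, 1), (4, 4)]
print(fast_non_dominated_sort(pop))   # [[0, 1, 3], [2], [4]]
```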
Kecskemeti G, Marosi ACs, Kertesz A. The ENTICE approach to decompose monolithic services into microservices. In: IEEE High Performance Computing & Simulation (HPCS), Innsbruck, Austria, July 18-22, 2016:591-596. https://doi.org/10.1109/HPCSim.2016.7568389
Jorge R. Understanding series and parallel systems reliability.
Schmidt M, Fallenbeck N, Smith M, Freisleben B. Efficient distribution of virtual machines for cloud computing.
Zitzler E, Brockhoff D, Thiele L. The hypervolume indicator revisited: on the design of Pareto-compliant indicators via weighted integration.
Feng WC, Balaji P, Baron C, Bhuyan LN, Panda DK. Performance characterization of a 10-Gigabit Ethernet TOE.