Publications (7), 0 Total impact
ABSTRACT: As modern service systems are pressured to provide competitive prices via cost-effective capacity planning, especially in the paradigm of cloud computing, service level agreements (SLAs) are becoming ever more sophisticated, e.g., specifying targets for different percentiles of response times. However, it is no mean feat to predict even the average response times of real systems, or of abstracted queueing systems that typically simplify system details, and it becomes even more complicated when managing SLAs defined by various percentiles of response times. To efficiently capture these different percentiles, we first develop a novel and autonomic methodology, termed Burst Based Simulation, which combines burst profiling on real systems with complex, state-dependent simulations. Moreover, based on our methodology, we construct an analysis for SLA management: the prediction of SLA violations given a certain request pattern. We evaluate our approach on two types of service systems, virtualized and bare-metal, with wide ranges of SLAs and traffic loads. Our evaluation results show that our methodology is able to achieve an average error below 15% when predicting different response time percentiles, and accurately capture SLA violations.
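To make the notion of a percentile-based SLA concrete, the following is a minimal sketch of how response time percentiles could be checked against SLA targets. It is not the Burst Based Simulation methodology itself; the function names, the nearest-rank percentile definition, and the `targets` mapping are illustrative assumptions.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile (an assumed definition, not from the paper):
    the smallest sample such that at least p% of samples are <= it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def sla_violations(samples, targets):
    """Return {percentile: observed value} for every violated target.

    `targets` is a hypothetical mapping from a percentile (e.g. 95)
    to the maximum allowed response time at that percentile.
    """
    violations = {}
    for p, limit in targets.items():
        observed = percentile(samples, p)
        if observed > limit:
            violations[p] = observed
    return violations
```

For example, with measured response times `[10, 20, ..., 100]` and targets `{50: 60, 95: 80}`, only the 95th percentile target would be reported as violated.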
Conference Paper: Cost-driven Service Provisioning in Hybrid Clouds
ABSTRACT: Hybrid clouds, which comprise nodes both in a private cloud and in a public cloud, have emerged as a new model for service providers to deploy their services. However, given Quality-of-Service requirements for each service, the question of how many private and public nodes the services should be deployed on in the most cost-effective way remains to be answered. The challenges faced in the hybrid cloud stem from the disparate time-varying requests across multiple services, the different cost structures of both types of nodes, and the performance characteristics of nodes. In this paper, we propose a novel algorithm to dynamically optimize the allocation of private and public nodes across services, with special focus on the performance-cost tradeoff between private and public nodes. The algorithm is based on an analytical cost-performance framework for service deployment in hybrid clouds. Our evaluation results based on trace-driven simulation show that our proposed node allocation algorithm can effectively achieve a good cost-performance ratio, compared to deployment in a purely public or a purely private cloud.
Conference Paper: Opportunistic Service Provisioning in the Cloud
ABSTRACT: There is an emerging trend to deploy services in cloud environments due to their flexibility in providing virtual capacity and pay-as-you-go billing features. Cost-aware services demand computation capacity such as virtual machines (VMs) from a cloud operator according to the workload (i.e., service invocations) and pay for the amount of capacity used following billing contracts. However, as recent empirical studies show, the performance variability, i.e., non-uniform VM performance, is inherently higher than in private hosting platforms, since cloud platforms provide VMs running on top of typically heterogeneous hardware shared by multiple clients. Consequently, the provisioning of service capacity in a cloud needs to consider workload variability as well as varying VM performance. We propose an opportunistic service replication policy that leverages the variability in VM performance, as well as the on-demand billing features of the cloud. Our objective is to minimize the service provisioning costs by keeping a lower number of faster VMs, while maintaining target system utilization. Our evaluation results on traces collected from in-production systems show that the proposed policy achieves significant cost savings and low response times.
Conference Paper: Minimizing Retrieval Cost of Multi-Layer Content Distribution Systems
ABSTRACT: Content distribution systems, such as on-demand video services, file-sharing networks [8,1], and content clouds, provide ubiquitous data access and data sharing for large numbers of end-users. To efficiently provide content access across geographic locations, content storage nodes with limited capacity are conventionally organized in a multi-layer architecture to facilitate vertical as well as horizontal peer content retrievals, each of which may have different bandwidth constraints and transport costs. Content management, i.e., caching strategies, is deployed to efficiently utilize storage nodes and further reduce network retrieval traffic and cost. An optimal system design of such a multi-layer system needs to accommodate the trade-offs between vertical communication, peer communication, storage capacity and the users' retrieval traffic, driven by caching policies. In this paper, we propose a generic optimization framework based on steady-state content diffusion to minimize the total content retrieval cost in multi-layer content distribution systems, while considering the aforementioned trade-offs. The derived optimal content diffusion can evaluate the optimality of caching policies, and dimension the size of the system and the node caching capacity. Furthermore, we develop Peer Aware Content Caching (PACC) policies based on the derived optimal content diffusion. Our simulation results show that PACC effectively caches content and minimizes vertical and horizontal content retrieval costs under different system scenarios.
Conference Paper: Load-Balancing Dynamic Service Binding in Composition Execution Engines
ABSTRACT: Performance and scalability of service-oriented applications, such as Web service compositions or business processes, depend on the dynamically bound services. In order to handle an increasing number of clients, load-balancing techniques are important. In this paper we assume the presence of multiple functionally equivalent services and explore different load-balancing algorithms to dynamically select service bindings with the goal of reducing average service response time. Using mathematical queueing models of service performance and simulation, we compare different service selection algorithms, including Static Lottery, Round-Robin, and Shortest-Queue. Furthermore, we propose linear and quadratic Dynamic Lottery service selection algorithms, which assign and periodically update service selection probabilities according to monitored average service response time. Our simulation environment models both stateless and stateful services and offers a wide range of service performance models with different degrees of variation in service response time. While the Shortest-Queue algorithm performs best in simulation settings with only stateless services or low variance of service response time, the Round-Robin and Dynamic Lottery algorithms work best in settings with stateful services and high variance of service performance.
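The idea of a lottery whose probabilities track monitored response times can be sketched as follows. The abstract does not give the exact update rule, so this sketch assumes that the "linear" variant weights each service by the inverse of its average response time and the "quadratic" variant by the inverse square; the function names and the `exponent` parameter are hypothetical.

```python
import random


def lottery_probabilities(avg_response_times, exponent=1):
    """Map monitored average response times to selection probabilities.

    Assumed rule (not confirmed by the abstract): weight = 1 / rt**exponent,
    with exponent=1 for a linear and exponent=2 for a quadratic lottery.
    """
    if not avg_response_times:
        raise ValueError("at least one service is required")
    if any(rt <= 0 for rt in avg_response_times.values()):
        raise ValueError("response times must be positive")
    weights = {s: 1.0 / rt ** exponent for s, rt in avg_response_times.items()}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


def pick_service(probabilities, rng=random):
    """Draw one service binding according to the current probabilities."""
    services = list(probabilities)
    weights = [probabilities[s] for s in services]
    return rng.choices(services, weights=weights, k=1)[0]
```

Under these assumptions, a periodic monitor would call `lottery_probabilities` with fresh measurements, and each incoming request would call `pick_service`. The quadratic variant shifts load toward fast services more aggressively than the linear one.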
ABSTRACT: The load on today's service-oriented systems varies strongly over time. It is advantageous to conserve energy by adapting the number of replicas according to the recent load. Over-provisioning of service replicas is to be avoided, since it increases the operating costs. Under-provisioning of service replicas leads to serious performance degradation and violates service-level agreements. To reduce energy consumption and maintain appropriate performance, we study two service replication strategies: (1) an arrival-rate-based policy and (2) a response-time-based policy. By simulation, we show that the average number of service replicas and the response time can be reduced, especially when combining our proposed replication strategies with load balancing schemes.
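As an illustration of what an arrival-rate-based replication policy might look like, the sketch below sizes the replica pool so that each replica stays at or below a target utilization. The sizing formula, the parameter names, and the default values are assumptions made for illustration; the abstract does not specify the policy's details.

```python
import math


def replicas_for_arrival_rate(arrival_rate, service_rate,
                              target_utilization=0.7, min_replicas=1):
    """Number of replicas needed for the recently observed arrival rate.

    Assumed model: each replica serves `service_rate` requests per second
    and should run at no more than `target_utilization` of that capacity.
    """
    if arrival_rate < 0:
        raise ValueError("arrival_rate must be non-negative")
    if service_rate <= 0:
        raise ValueError("service_rate must be positive")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    needed = math.ceil(arrival_rate / (service_rate * target_utilization))
    # Keep at least `min_replicas` running so the service stays reachable.
    return max(min_replicas, needed)
```

A response-time-based policy would instead compare the monitored response time with a threshold and add or remove replicas accordingly.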