Technical Report

Performance-driven Coordinated Grid Scheduling


Grid computing has emerged as a way to share geographically and organizationally distributed resources that may belong to different institutions or administrative domains. In this context, scheduling and resource management are usually performed by a grid resource broker. The scheduling task consists of distributing jobs among the resources of the different centers, and the need to coordinate the grid level with the underlying scheduling levels has already been identified. However, there is still a lack of policies for this approach. In this paper we describe and evaluate our coordinated grid scheduling strategy. We take as a reference the FCFS job scheduling policy and the matchmaking approach for resource selection. We also present a new job scheduling policy based on backfilling (JR-backfilling) that aims to improve workload execution performance while avoiding starvation, and the SLOW-coordinated resource selection policy, which uses the average bounded slowdown of the resources as the main parameter for resource selection. From our evaluation, based on trace-driven simulations of real grid systems, we conclude that our proposed coordinated strategy can substantially improve workload execution performance as well as resource utilization.
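
To make the resource selection criterion mentioned in the abstract more concrete, the following is a minimal sketch of selecting a center by average bounded slowdown. It is not the report's implementation: the abstract does not give the exact formula, so the sketch assumes the commonly used definition max((wait + run) / max(run, threshold), 1) with a 10-second threshold, and the names `FinishedJob`, `select_resource`, and the center labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FinishedJob:
    wait_time: float  # seconds spent queued (assumed unit)
    run_time: float   # seconds spent executing (assumed unit)


def bounded_slowdown(job, threshold=10.0):
    # Assumed definition: response time over run time, with the run time
    # floored at `threshold` so very short jobs do not dominate, and the
    # result floored at 1.
    response = job.wait_time + job.run_time
    return max(response / max(job.run_time, threshold), 1.0)


def average_bounded_slowdown(jobs, threshold=10.0):
    # An empty history is treated as "no slowdown observed" (an assumption).
    if not jobs:
        return 1.0
    return sum(bounded_slowdown(j, threshold) for j in jobs) / len(jobs)


def select_resource(history, threshold=10.0):
    # Pick the center whose recently finished jobs show the lowest
    # average bounded slowdown.
    return min(
        history,
        key=lambda name: average_bounded_slowdown(history[name], threshold),
    )


history = {
    "center_a": [FinishedJob(100.0, 100.0), FinishedJob(0.0, 5.0)],
    "center_b": [FinishedJob(10.0, 100.0)],
}
best = select_resource(history)
```

In this hypothetical history, `center_b` has the lower average and would be chosen; a real broker would also have to check that the job's requirements match the resource, as in the matchmaking approach the abstract takes as its reference.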


Available from: Ivan Rodero, Oct 07, 2015
  • Source
    ABSTRACT: Computational grids that couple geographically distributed resources such as PCs, workstations, clusters, and scientific instruments, have emerged as a next generation computing platform for solving large-scale problems in science, engineering, and commerce. However, application development, resource management, and scheduling in these environments continue to be a complex undertaking. In this article, we discuss our efforts in developing a resource management system for scheduling computations on resources distributed across the world with varying quality of service (QoS). Our service-oriented grid computing system called Nimrod-G manages all operations associated with remote execution including resource discovery, trading, scheduling based on economic principles and a user-defined QoS requirement. The Nimrod-G resource broker is implemented by leveraging existing technologies such as Globus, and provides new services that are essential for constructing industrial-strength grids. We present the results of experiments using the Nimrod-G resource broker for scheduling parametric computations on the World Wide Grid (WWG) resources that span five continents.
    Future Generation Computer Systems 10/2002; 18(8-18):1061-1074. DOI:10.1016/S0167-739X(02)00085-7 · 2.79 Impact Factor
  • Source
    ABSTRACT: In systems consisting of multiple clusters of processors interconnected by relatively slow network connections such as our Distributed ASCI Supercomputer (DAS), applications may benefit from the availability of processors in multiple clusters. However, the performance of single-application multicluster execution may be degraded due to the slow wide-area links. In addition, scheduling policies for such systems have to deal with more restrictions than schedulers for single clusters in that every component of a job has to fit in separate clusters. In this paper we present a measurement study of the total runtime of two applications, and of the communication time of one of them, both on single clusters and on multicluster systems. In addition, we perform simulations of several multicluster scheduling policies based on our measurement results. Our results show that in many cases, restricted forms of co-allocation in multiclusters have better performance than not allowing co-allocation at all.
    01/1970: pages 105-128;
  • Source
    ABSTRACT: Metacomputing is the aggregation of distributed and high-performance resources on coordinated networks. With careful scheduling, resource-intensive applications can be implemented efficiently on metacomputing systems at the sizes of interest to developers and users. In this paper, we focus on the problem of scheduling applications on metacomputing systems. We introduce the concept of application-centric scheduling, in which everything about the system is evaluated in terms of its impact on the application. Application-centric scheduling is used by virtually all metacomputer programmers to achieve performance on metacomputing systems. We describe two successful metacomputing applications to illustrate this approach, and describe AppLeS (Application-Level Scheduling) agents, which generalize the application-centric scheduling approach. Finally, we show preliminary results that compare AppLeS-derived schedules with conventional strip and blocked schedules for a 2D Jacobi code.
    High Performance Distributed Computing, 1996., Proceedings of 5th IEEE International Symposium on; 09/1996