Parallel implementations of three scientific applications using LB_Migrate
ABSTRACT: In this paper we focus on the implementation of large scientific applications with LB_Migrate, a dynamic load balancing library. The library employs dynamic loop scheduling techniques to counter the performance degradation caused by load imbalance, provides a flexible interface to the application's native data structures, and performs data migration. The library is reusable and is not application specific. For initial testing, the library was employed in three applications: the profiling of an automatic quadrature routine, the simulation of a hybrid model for image denoising, and N-body simulations. We discuss the original applications without the library and the changes made to the applications so that they can interface with the library, and we present experimental results. Performance results indicate that the library adds little overhead: at most 6%, with the exact amount varying from application to application. However, the benefits gained from using the library are substantial.
ABSTRACT: This paper proposes guided self-scheduling, a new approach for scheduling arbitrarily nested parallel program loops on shared memory multiprocessor systems. Exploiting loop parallelism is crucial to achieving high system and program performance. This method simultaneously achieves the two most important objectives: load balancing and very low synchronization overhead. For certain types of loops the authors show analytically that guided self-scheduling incurs minimal overhead and achieves optimal schedules. The authors discuss experimental results that show the advantage of guided self-scheduling over the most widely known dynamic methods.
IEEE Transactions on Computers 01/1987; 36:1425-1439. · 1.38 Impact Factor
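The chunk-size rule usually associated with guided self-scheduling can be sketched as follows: each time a processor becomes idle, it takes the ceiling of (remaining iterations / number of processors), so chunks start large and shrink toward one iteration. This is a minimal illustration for a single, non-nested loop; the function name `guided_chunks` and its parameters are assumptions made for this sketch and are not taken from the paper.

```python
def guided_chunks(n_iterations, n_processors):
    """Return the successive chunk sizes handed out under the
    guided self-scheduling rule: chunk = ceil(remaining / P).

    Illustrative only: names and signature are hypothetical.
    """
    if n_iterations < 0 or n_processors < 1:
        raise ValueError("need n_iterations >= 0 and n_processors >= 1")
    chunks = []
    remaining = n_iterations
    while remaining > 0:
        # Integer ceiling division avoids floating-point rounding.
        size = (remaining + n_processors - 1) // n_processors
        chunks.append(size)
        remaining -= size
    return chunks


if __name__ == "__main__":
    # 100 iterations shared among 4 processors.
    print(guided_chunks(100, 4))
```

Large early chunks keep the number of scheduling operations low, while the small final chunks let processors that finish early pick up leftover work, which is how the scheme trades off synchronization overhead against load balance.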
Conference Proceeding: Effectiveness of a Dynamic Load Balancing Library for Scientific Applications
ABSTRACT: The design of a general-purpose dynamic load balancing library for a wide variety of parallel applications is a very challenging task. The library has to address potentially unpredictable load imbalance in the application, interface with the data structures native to the application, and achieve significant performance improvement and scalability. In this paper we look into the design and sample results of a new dynamic load balancing library called LB_Migrate, targeted at large scientific applications with parallel loops as a major source of concurrency. The application must supply the library with a routine that encapsulates the computations for a chunk of loop iterates, and the data for the computations must be stored in an array of arbitrary type. We demonstrate the effectiveness of the library on two real applications: the profiling of an automatic quadrature routine and the simulation of a hybrid model for image denoising. The experimental results indicate that the library achieves up to 60% performance improvement for these applications.
Parallel and Distributed Computing, 2007. ISPDC '07. Sixth International Symposium on; 08/2007
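The interface contract described in the abstract (the application supplies a routine that processes one chunk of loop iterates over data held in an array) can be illustrated with a small sketch. This is not the LB_Migrate API: `run_dynamic`, `square_chunk`, and all parameters are hypothetical names chosen for illustration. The sketch uses threads in one process with fixed-size chunks, and it omits the data migration between processors that the library performs.

```python
import threading


def run_dynamic(chunk_routine, data, n_workers=4, chunk_size=8):
    """Hand out fixed-size chunks of loop iterates to idle workers.

    chunk_routine(data, results, start, size) is the application-supplied
    routine; it must process iterates start .. start + size - 1.
    Hypothetical sketch, not the LB_Migrate interface.
    """
    results = [None] * len(data)
    next_start = 0
    lock = threading.Lock()

    def worker():
        nonlocal next_start
        while True:
            # Claim the next unassigned chunk under the lock.
            with lock:
                start = next_start
                if start >= len(data):
                    return
                size = min(chunk_size, len(data) - start)
                next_start = start + size
            # Do the actual work outside the lock.
            chunk_routine(data, results, start, size)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def square_chunk(data, results, start, size):
    """Example application routine: square each element in the chunk."""
    for i in range(start, start + size):
        results[i] = data[i] * data[i]


if __name__ == "__main__":
    print(run_dynamic(square_chunk, list(range(20)), n_workers=3, chunk_size=4))
```

Because the scheduler only sees a starting index and a chunk length, the application's data layout stays opaque to it, which is the property that lets a scheduling layer of this kind remain independent of the application.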
ABSTRACT: HACS, a fourth-order Hermite integrator with the Ahmad-Cohen scheme, is implemented. HACS is self-starting, so its implementation is considerably simpler than the original ACS. Compared to ACS, HACS allows time steps twice as long for the same accuracy, and the increase in calculation cost per timestep is not very large. The actual gain in speed depends on the hardware and ranges between a factor of one and two. The gain using ACS would be significantly smaller on vector or parallel machines.
Publications of the Astronomical Society of Japan 03/1992; 44:141-151. · 2.44 Impact Factor