-
ABSTRACT: Known challenges for petascale machines are that (1) the costs of I/O for high performance applications can be substantial,
especially for output tasks like checkpointing, and (2) noise from I/O actions can inject undesirable delays into the runtimes
of such codes on individual compute nodes. This paper introduces the flexible ‘DataStager’ framework for data staging and
alternative services within it that jointly address (1) and (2). Data staging services that move output data from compute nodes
to staging or I/O nodes prior to storage are used to reduce I/O overheads on applications’ total processing times, and explicit
management of data staging offers reduced perturbation when extracting output data from a petascale machine’s compute partition.
Experimental evaluations of DataStager on the Cray XT machine at Oak Ridge National Laboratory establish both the necessity
of intelligent data staging and the high performance of our approach, using the GTC fusion modeling code and benchmarks running
on 1000+ processors.
Keywords: I/O, WARP, GTC, XT3, Datatap, XT4, Staging, Data services
Cluster Computing 04/2012; 13(3):277-290. · 0.52 Impact Factor
-
Proceedings of the 20th ACM International Symposium on High Performance Distributed Computing, HPDC 2011, San Jose, CA, USA, June 8-11, 2011; 01/2011
-
Proceedings of the 20th ACM International Symposium on High Performance Distributed Computing, HPDC 2011, San Jose, CA, USA, June 8-11, 2011; 01/2011
-
IEEE 7th International Conference on E-Science, e-Science 2011, Workshop Proceedings, Stockholm, Sweden, December 5-8, 2011; 01/2011
-
Proceedings of the 8th International Conference on Autonomic Computing, ICAC 2011, Karlsruhe, Germany, June 14-18, 2011; 01/2011
-
Jeremy Logan,
Scott Klasky,
Jay F. Lofstead,
Hasan Abbasi,
Stéphane Ethier,
Ray W. Grout,
Seung-Hoe Ku,
Qing Liu,
Xiaosong Ma,
Manish Parashar,
Norbert Podhorszki,
Karsten Schwan, Matthew Wolf
IEEE 7th International Conference on E-Science, e-Science 2011, Workshop Proceedings, Stockholm, Sweden, December 5-8, 2011; 01/2011
-
24th IEEE International Symposium on Parallel and Distributed Processing, IPDPS 2010, Atlanta, Georgia, USA, 19-23 April 2010 - Workshop Proceedings; 01/2010
-
Proceedings of the 7th International Conference on Autonomic Computing, ICAC 2010, Reston, VA, USA, June 7-11, 2010; 01/2010
-
Conference on High Performance Computing Networking, Storage and Analysis, SC 2010, New Orleans, LA, USA, November 13-19, 2010; 01/2010
-
Proceedings of the Third ACM International Conference on Distributed Event-Based Systems, DEBS 2009, Nashville, Tennessee, USA, July 6-9, 2009; 01/2009
-
Proceedings of the 2009 IEEE International Conference on Cluster Computing, August 31 - September 4, 2009, New Orleans, Louisiana, USA; 01/2009
-
ICPPW 2009, International Conference on Parallel Processing Workshops, Vienna, Austria, 22-25 September 2009; 01/2009
-
Norbert Podhorszki,
Scott Klasky,
Qing Liu,
Ciprian Docan,
Manish Parashar,
Hasan Abbasi,
Jay F. Lofstead,
Karsten Schwan, Matthew Wolf,
Fang Zheng,
Julian Cummings
Proceedings of the 4th Workshop on Workflows in Support of Large-Scale Science, WORKS 2009, November 16, 2009, Portland, Oregon, USA; 01/2009
-
Proceedings of the 18th ACM International Symposium on High Performance Distributed Computing, HPDC 2009, Garching, Germany, June 11-13, 2009; 01/2009
-
Fourth International Conference on e-Science, e-Science 2008, 7-12 December 2008, Indianapolis, IN, USA; 01/2008
-
ABSTRACT: The data needs of current and future PetaScale applications have increased over the last half decade to the extent that appropriate data management has become a crucial requirement. This concerns not only the storage of data produced by the new class of PetaScale applications, but also the data exchanges needed for coupling applications with concurrent analysis, online data visualization for validation, and others. To address such dynamic code coupling, we introduce the concept of an extensible, dynamic, and flexible data workspace, termed LIVE. In contrast to the data exchanges programmed with MPI, MPI-IO, or grid software, LIVE focuses on data exchanges carried out without a priori knowledge of potential data requirements. Examples include exchanges required by ad-hoc or dynamically determined methods for data validation, for general data analysis tasks, or for data visualization. Run on an execution environment comprised of integrated dynamic discovery and on-line management services, LIVE is used to create a ‘data workspace’ for a working molecular dynamics code base utilized by mechanical and materials engineers at Georgia Tech, for multi-scale materials modeling. Measurements of both this application’s data workspace and of the basic primitives in the LIVE framework demonstrate that the environment’s substantial flexibility has minimal impact on overall performance, and in fact, that it improves performance in a number of usage scenarios. In particular, for a visualization pipeline example derived from our collaborators, we see a slight improvement over a solution based on MPI-IO, and a further improvement of up to 5% by utilizing LIVE’s ability to overlap communication with user-specified computation.
Cluster Computing, 2007 IEEE International Conference on; 10/2007
-
Proceedings of the 16th International Symposium on High-Performance Distributed Computing (HPDC-16 2007), 25-29 June 2007, Monterey, California, USA; 01/2007
-
2005 IEEE International Conference on Cluster Computing (CLUSTER 2005), September 26 - 30, 2005, Boston, Massachusetts, USA; 01/2005
-
ABSTRACT: Monitoring the resources of distributed systems is essential to the successful deployment and execution of grid applications, particularly when such applications have well-defined QoS requirements. The dproc system-level monitoring mechanisms implemented for standard Linux kernels have several key components. First, utilizing the familiar /proc filesystem, dproc extends this interface with resource information collected from both local and remote hosts. Second, to predictably capture and distribute monitoring information, dproc uses a kernel-level group communication facility, termed KECho, which is based on events and event channels. Third, and the focus of this paper, is dproc's run-time customizability for resource monitoring, which includes the generation and deployment of monitoring functionality within remote operating system kernels. Using dproc, we show that (a) data streams can be customized according to a client's resource availabilities (dynamic stream management), (b) by dynamically varying distributed monitoring (dynamic filtering of monitoring information), an appropriate balance can be maintained between monitoring overheads and application quality, and (c) by performing monitoring at kernel level, the information captured enables decision making that takes into account the multiple resources used by applications.
09/2004;
-
ABSTRACT: Large-scale clusters require run-time monitoring of their resources to determine the appropriate allocations of resources to applications and to ensure that the applications' requirements for performance or quality of service (QoS) are met. Such applications share I/O devices, access remote sensors and large-scale remote data contained in digital libraries, support scientific collaboration by remote visualization of their data, and interact with other computations via the Grid [1, 2].
09/2004;