Publications (2) · 1.3 Total impact
- ABSTRACT: We present a collection of techniques for exploiting latent I/O asynchrony which can substantially improve performance in data-intensive parallel applications. Latent asynchrony refers to an application’s tolerance for decoupling ancillary operations from its core computation, and is a property of HPC codes not fully explored by current HPC I/O systems. Decoupling operations such as buffering and staging, reorganization, and format conversion in space and in time from core codes can shorten I/O phases, preserving valuable MPP compute cycles. We describe in this paper DataTaps, IOgraphs, and Metabots, three tools which allow HPC developers to implement decoupled I/O operations. Using these tools, asynchrony can be exploited by data generators which overlap computation with communication, and by data consumers that perform data conversion and reorganization out-of-band and on-demand. In the context of a data-intensive fusion simulation, we show that exploiting latent asynchrony through decoupling of operations can provide significant performance benefits.
  International Journal of High Performance Computing Applications 01/2011; 25:161-179. · 1.30 Impact Factor
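The core idea this abstract describes, overlapping computation with decoupled, asynchronous I/O so that output no longer stalls the compute loop, can be illustrated with a minimal sketch. This is not the DataTaps API; `compute_step`, `async_writer`, and `run` are hypothetical names standing in for a simulation timestep and a decoupled staging service:

```python
import queue
import threading
import time

def compute_step(step):
    # Stand-in for one timestep of the core computation (hypothetical workload).
    time.sleep(0.01)
    return [step] * 4

def async_writer(q, sink):
    # Drains buffers off the critical path, emulating a decoupled I/O service
    # that stages data while the simulation keeps computing.
    while True:
        buf = q.get()
        if buf is None:          # sentinel: no more buffers
            break
        sink.append(list(buf))   # stand-in for a staged write to storage
        q.task_done()

def run(steps=5):
    q = queue.Queue(maxsize=2)   # bounded queue: backpressure if I/O falls behind
    sink = []
    writer = threading.Thread(target=async_writer, args=(q, sink))
    writer.start()
    for step in range(steps):
        data = compute_step(step)  # core computation
        q.put(data)                # hand off buffer; compute proceeds while I/O drains
    q.put(None)                    # signal completion
    writer.join()
    return sink

if __name__ == "__main__":
    print(len(run()))  # all buffers staged asynchronously
```

The bounded queue is the key design point: compute only blocks when the staging side genuinely cannot keep up, which is the "latent asynchrony" being exploited.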
- ABSTRACT: The challenge of meeting the I/O needs of petascale applications is exacerbated by an emerging class of data-intensive HPC applications that requires annotation, reorganization, or even conversion of their data as it moves between HPC computations and data end users or producers. For instance, data visualization can present data at different levels of detail. Further, actions on data are often dynamic, as new end-user requirements introduce data manipulations unforeseen by original application developers. These factors are driving a rich set of requirements for future petascale I/O systems: (1) high levels of performance and therefore, flexibility in how data is extracted from petascale codes; (2) the need to support 'on demand' data annotation - metadata creation and management - outside application codes; (3) support for concurrent use of data by multiple applications, like visualization and storage, including associated consistency management and scheduling; and (4) the ability to flexibly access and reorganize physical data storage. We introduce an end-to-end approach to meeting these requirements: Structured Streams, streams of structured data with which methods for data management can be associated whenever and wherever needed. These methods can execute synchronously or asynchronously with data extraction and streaming, they can run on the petascale machine or on associated machines (such as storage or visualization engines), and they can implement arbitrary data annotations, reorganizations, or conversions. The Structured Streaming Data System (SSDS) enables high-performance data movement or manipulation between the compute and service nodes of the petascale machine and between/on service nodes and ancillary machines; it enables the metadata creation and management associated with these movements through specification instead of application coding; and it ensures data consistency in the presence of anticipated or unanticipated data consumers.
Two key abstractions implemented in SSDS, I/O graphs and Metabots, provide developers with high-level tools for structuring data movement as dynamically-composed topologies. A lightweight storage system avoids traditional sources of I/O overhead while enforcing protected access to data. This paper describes the SSDS architecture, motivating its design decisions and intended application uses. The utility of the I/O graph and Metabot abstractions is illustrated with examples from existing HPC codes executing on Linux Infiniband clusters and Cray XT3 supercomputers. Performance claims are supported with experiments benchmarking the underlying software layers of SSDS, as well as application-specific usage scenarios.
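The notion of structuring data movement as a dynamically composed topology of stages can be sketched in a few lines. This is a hypothetical illustration, not the SSDS I/O graph API; `stage`, `compose`, `annotate`, and `convert` are invented names:

```python
def stage(fn):
    # Wrap a per-record transform as a streaming stage (hypothetical helper).
    def run(records):
        for rec in records:
            yield fn(rec)
    return run

def compose(*stages):
    # Chain stages into a linear I/O-graph-like pipeline; composition happens
    # at runtime, so topologies can be assembled on demand.
    def run(records):
        for s in stages:
            records = s(records)
        return records
    return run

# Example graph: annotate records with metadata, then convert their format,
# mirroring the annotation/conversion roles the abstract assigns to these tools.
annotate = stage(lambda rec: {**rec, "tagged": True})
convert = stage(lambda rec: (rec["id"], rec["tagged"]))
pipeline = compose(annotate, convert)

result = list(pipeline([{"id": 1}, {"id": 2}]))
# result == [(1, True), (2, True)]
```

Because each stage is just a generator transform, stages can run lazily and be recombined without touching the producing application, which is the decoupling the abstractions aim for.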