This paper proposes Phoenix, a programming model for writing parallel and distributed applications that accommodate compute resources joining and leaving dynamically. In the proposed model, the nodes involved in an application see a large, fixed virtual node name space. They communicate via messages whose destinations are specified by virtual node names rather than by names bound to a physical resource. We describe the Phoenix API and show how it allows transparent migration of application state, with dynamic joining and leaving of nodes as a by-product. We also demonstrate through several application studies that the Phoenix model is close enough to regular message passing to serve as a general programming model that facilitates porting many parallel applications and algorithms to more dynamic environments. Experimental results indicate that applications with a small task migration cost can quickly take advantage of dynamically joining resources using Phoenix. Divide-and-conquer algorithms written in Phoenix achieved good speedup with a large number of nodes across multiple LANs (a 120-fold speedup using 169 CPUs across three LANs). We believe Phoenix provides a useful programming abstraction and platform for emerging parallel applications that must be deployed across multiple LANs and/or shared clusters with dynamically varying resource conditions.
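The idea of addressing messages to virtual node names can be illustrated with a small, single-process sketch. This is not the Phoenix API: the class `VirtualNameSpace` and its methods `send`, `join`, and `leave` are hypothetical names invented here, and the policy of handing half of a donor's names to a joining node is an assumption made only for illustration.

```python
from collections import defaultdict


class VirtualNameSpace:
    """Toy model (hypothetical, not the Phoenix API): a fixed range of
    virtual node names [0, size) partitioned among physical nodes."""

    def __init__(self, size, first_node):
        self.size = size
        # Every virtual name is always owned by exactly one physical node.
        self.owner = {v: first_node for v in range(size)}
        # Pending messages per physical node: (virtual name, payload).
        self.inbox = defaultdict(list)

    def send(self, dest_vname, payload):
        """Deliver to whichever physical node currently owns dest_vname."""
        physical = self.owner[dest_vname]
        self.inbox[physical].append((dest_vname, payload))
        return physical

    def _reassign(self, vnames, new_owner):
        """Move ownership of vnames, and their pending messages, to new_owner."""
        vnames = set(vnames)
        for v in vnames:
            self.owner[v] = new_owner
        for node in list(self.inbox):
            if node == new_owner:
                continue
            keep, move = [], []
            for msg in self.inbox[node]:
                (move if msg[0] in vnames else keep).append(msg)
            self.inbox[node] = keep
            self.inbox[new_owner].extend(move)

    def join(self, new_node, donor):
        """A joining node takes over the upper half of the donor's names
        (an assumed policy, chosen only to keep the sketch short)."""
        owned = sorted(v for v, p in self.owner.items() if p == donor)
        moved = owned[len(owned) // 2:]
        self._reassign(moved, new_node)
        return moved

    def leave(self, node, heir):
        """A leaving node hands all of its names to heir."""
        owned = [v for v, p in self.owner.items() if p == node]
        self._reassign(owned, heir)


if __name__ == "__main__":
    ns = VirtualNameSpace(8, "A")
    print(ns.send(6, "x"))   # owner of virtual name 6 before any join
    ns.join("B", "A")
    print(ns.send(6, "y"))   # same virtual name, different physical node
    ns.leave("B", "A")
    print(ns.send(6, "z"))
```

In this sketch the sender's code never changes when nodes join or leave: it keeps addressing virtual name 6, and only the ownership table moves. That is the property the abstract attributes to the model, shown here in a deliberately simplified form.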
ABSTRACT: Abstraction concepts based on process groups have largely dominated the design and implementation of communication patterns in message passing systems. Although such an approach seems pragmatic, given that participating processes form a 'group', in this dissertation we discuss subtle issues that affect the qualitative and quantitative aspects of this approach. To address these issues, we introduce the concept of a 'communication structure', which defines a communication pattern as an implicit runtime composition of localised patterns, known as 'roles'. During application development, communication structures are derived from the algorithm being implemented. These are then translated to an executable form by defining process-specific data structures, known as 'branching channels'. The qualitative advantages of the communication structure approach are that the resulting programming model is non-ambiguous, uniform, expressive, and extensible. To use a pattern is to access the corresponding branching channels; to define a new pattern is simply to combine appropriate roles. The communication
ABSTRACT: We describe the design and implementation of a "Grid-enabled" message passing library in the context of the Phoenix message passing model. It supports: (1) message routing between nodes that are not directly reachable because of firewalls and/or NAT; (2) resource discovery that eases configuration and allows nodes without static names (e.g., DHCP nodes) to participate in a computation without special effort; and (3) nodes dynamically joining and leaving a computation at runtime. We argue that, in future Grid environments, all of the above functions, not just routing across firewalls, will become important issues for Grid-enabled message passing systems, including MPI. Unlike solutions commonly proposed in previous work on Grid-enabled MPI, our system runs a distributed resource discovery and routing table construction algorithm rather than assuming that all such information is available in a static configuration file or the like. Experimental results using 400 nodes in three LANs indicate that our algorithm is able to dynamically discover participating peers, connect them to each other, and calculate a routing table. The elapsed time of our algorithm is only about twice that of an offline route calculation that simply connects nodes based on a fully given configuration.
Cluster Computing and the Grid, 2004. CCGrid 2004. IEEE International Symposium on; 05/2004