Grid Services for MPI
ABSTRACT Institutional grids consist of the aggregation of clusters belonging to different administrative domains to build a single parallel machine. To run an MPI application over an institutional grid, one has to address many challenges. One of the first problems to solve is the connectivity of nodes that do not belong to the same administrative domain. Techniques based on communication relays and dynamic port opening, among others, have been proposed. In this work, we propose a set of Grid or Web Services to abstract this connectivity service, and we evaluate the performance of this new level of communication for establishing the connectivity of an MPI application over an experimental grid.
Grid Services for MPI
Camille Coti², Ala Rezmerita¹, Thomas Herault¹, and Franck Cappello²
¹ Univ Paris Sud; LRI; INRIA Futurs; F-91405 Orsay France
² INRIA Futurs; F-91405 Orsay France
email@example.com, firstname.lastname@example.org, email@example.com, firstname.lastname@example.org
Institutional grids consist of the aggregation of clusters belonging to different
administrative domains that build a single parallel machine. In order to run an MPI
application over an institutional grid, one has to address many problems. One of the
first problems to solve is the connectivity of nodes not belonging to the same
administrative domain.
To protect the network from unauthorized access, many sites use firewalls. On some
sites, firewalls are configured to allow outbound connections and to block inbound
connections, often with the exception of a few well-known ports (e.g., SSH). On some
other sites there is a strict separation between the internal and external networks,
and only a front-end machine is accessible. These connectivity constraints limit the
execution of parallel applications across multiple sites.
The connectivity problems can sometimes be solved when only one site uses a firewall,
since all required connections can then be initiated from the protected site. However,
this solution requires modifications to applications or communication libraries. Also,
if all sites are using firewalls, this approach can no longer be applied. Another
solution is to configure the firewalls so that a port range is open and to adapt the
applications to use only these ports. However, this solution is a threat to site
security. Sometimes the only possibility for the compute nodes to communicate with the
outside world is to use the front-end machine as a bridge. In addition to causing
connection setup problems, the use of Network Address Translation (NAT) devices
complicates machine identification. The private addresses used in a NAT site are not
globally unique, which causes difficulties in creating a unique identifier for every
machine.
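
The identification problem can be illustrated with a short sketch. It assumes, purely for illustration, that each private address is qualified by a site-level label (for instance the public name of the site's front-end machine); the names `NodeId`, `site` and the example host names are hypothetical and do not describe the scheme used by the framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeId:
    """Hypothetical globally unique identifier for a node behind a NAT."""
    site: str          # assumed site-level label, e.g. the front-end's public name
    private_addr: str  # address that is only meaningful inside the site
    port: int


# Two machines on different sites may carry the same private address.
node_a = NodeId("frontend.site-a.example.org", "192.168.0.10", 5000)
node_b = NodeId("frontend.site-b.example.org", "192.168.0.10", 5000)

# The private address alone collides, but the qualified identifiers differ,
# so both nodes can be stored in the same registry.
registry = {node_a: "rank 0", node_b: "rank 1"}
```

Qualifying the address with a site label is only one possible design; any attribute that is unique per administrative domain would serve the same purpose.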
In this work, we propose a set of Grid or Web Services that provide a new level of
communication for establishing connectivity of MPI applications over an experimental
grid.
F. Cappello et al. (Eds.): EuroPVM/MPI 2007, LNCS 4757, pp. 393–394, 2007.
© Springer-Verlag Berlin Heidelberg 2007
We define a distributed framework to allow the grid infrastructure to provide services
to the applications. In this paper we detail a brokering service that provides the
computing nodes a way to communicate with each other. Other services can be implemented
in our framework, such as a monitoring service, a spawning service and a distributed
storage service. The brokering service establishes a connection between nodes that
would not be able to communicate with each other otherwise because a NAT and/or a
firewall stands between them. When the MPI library needs to establish a communication,
it invokes the brokering service, which finds the best method to establish this
connection (NAT and/or firewall bypassing) and returns the appropriate connection
information to the initiator of the connection.
Some techniques have been presented in . We implemented the service using the
light-weight web-services engine gSOAP and interfaced it with OpenMPI.
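
The decision taken by such a brokering service can be sketched as follows. This is a minimal illustration, not the gSOAP implementation described above: the `Node` description, the two boolean properties and the order in which methods are tried are all assumptions, and traversing TCP is left out for brevity.

```python
from dataclasses import dataclass
from enum import Enum


class Method(Enum):
    DIRECT = "direct connection"
    REVERSE = "reverse connection"
    HOLE_PUNCHING = "TCP hole punching"
    PROXY = "relay through a proxy"


@dataclass(frozen=True)
class Node:
    """Hypothetical description of a node as seen by the broker."""
    name: str
    accepts_inbound: bool  # firewall/NAT lets outside peers connect in
    allows_outbound: bool  # node may open connections to the outside


def choose_method(initiator: Node, target: Node) -> Method:
    """Pick a connection method; the preference order is an assumption."""
    if initiator.allows_outbound and target.accepts_inbound:
        return Method.DIRECT
    # The target connects back to the initiator instead.
    if target.allows_outbound and initiator.accepts_inbound:
        return Method.REVERSE
    # Both sides can only open outbound connections.
    if initiator.allows_outbound and target.allows_outbound:
        return Method.HOLE_PUNCHING
    # Fall back to a relay, e.g. on a front-end machine (assumed reachable).
    return Method.PROXY


open_node = Node("open", accepts_inbound=True, allows_outbound=True)
fw_a = Node("fw-a", accepts_inbound=False, allows_outbound=True)
fw_b = Node("fw-b", accepts_inbound=False, allows_outbound=True)
isolated = Node("isolated", accepts_inbound=False, allows_outbound=False)

print(choose_method(fw_a, open_node).value)
print(choose_method(open_node, fw_a).value)
print(choose_method(fw_a, fw_b).value)
print(choose_method(fw_a, isolated).value)
```

In this sketch the relay is the last resort because it is the only method that keeps an intermediate hop on the data path; the three others end in a direct connection between the two nodes.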
Fig. 1. Communication performance. (a) Bandwidth measured by NetPIPE (bandwidth in Mbps versus message size in bytes). (b) Latency measured by NetPIPE (latency in µs versus message size in bytes).
Figure 1 shows the impact of the framework on communication performance, measured
using the NetPIPE test. The nodes are interconnected by a proxy, which adds a hop
between them. The impact of this additional hop on bandwidth is shown in Figure 1(a)
and on latency in Figure 1(b). Since the service is invoked only to establish the
communication, it has no effect on the performance of the communications themselves.
Therefore, the other techniques, which establish a direct connection between two nodes
(reverse connection, traversing TCP and TCP hole punching), give the same performance
as a direct connection without a firewall. The additional cost induced by the
establishment of the connection is 10 ms.
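
Because the service is invoked once per connection, this 10 ms cost is amortized over the messages subsequently exchanged on that connection. The sketch below works through the arithmetic; the message counts are illustrative values and not measurements from the experiments.

```python
SETUP_COST_US = 10_000  # 10 ms connection-establishment cost, from the text


def amortized_setup_cost_us(n_messages: int) -> float:
    """One-time setup cost spread over n_messages, in microseconds."""
    if n_messages <= 0:
        raise ValueError("n_messages must be positive")
    return SETUP_COST_US / n_messages


# Illustrative message counts (assumed, not taken from the paper).
for n in (1, 100, 1000):
    print(n, amortized_setup_cost_us(n))
```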
1. Gabriel, E., Fagg, G.E., Bosilca, G., Angskun, T., Dongarra, J.J., Squyres, J.M., Sahay, V., Kambadur, P., Barrett, B., Lumsdaine, A., Castain, R.H., Daniel, D.J., Graham, R.L., Woodall, T.S.: Open MPI: Goals, concept, and design of a next generation MPI implementation. In: Proceedings, 11th European PVM/MPI Users' Group Meeting, Budapest, Hungary, pp. 97–104 (September 2004)
2. Rezmerita, A., Morlier, T., Néri, V., Cappello, F.: Private virtual cluster: Infrastructure and protocol for instant grids. In: Nagel, W.E., Walter, W.V., Lehner, W. (eds.) Euro-Par 2006. LNCS, vol. 4128, pp. 393–404. Springer, Heidelberg (2006)
3. van Engelen, R.: Pushing the SOAP envelope with web services for scientific computing. In: Proceedings of the International Conference on Web Services (ICWS), pp. 346–352 (2003)