WAMM (Wide Area Metacomputer Manager):
A Visual Interface for Managing Metacomputers
Version 1.0
Ranieri Baraglia, Gianluca Faieta, Marcello Formica,
Domenico Laforenza
CNUCE - Institute of the Italian National Research Council
Via S.Maria, 36 - I56100 Pisa, Italy
Tel. +39-50-593111 - Fax +39-50-904052
email: R.Baraglia@cnuce.cnr.it, D.Laforenza@cnuce.cnr.it
meta@calpar.cnuce.cnr.it
Contents

1 Introduction
  1.1 Metacomputer
2 Metacomputing environments
3 Design goals
4 WAMM
  4.1 Configuration
  4.2 Activation
  4.3 Windows
  4.4 Compilation
  4.5 Tasks
5 WAMM's implementation issues
  5.1 Structure of the program
  5.2 Host control
  5.3 Task control
  5.4 Remote commands
  5.5 Remote compilation
6 Future developments
7 Related work
References
1 Introduction
Recent years have seen a considerable increase in computer performance, mainly as a result of faster hardware and more sophisticated software. Nevertheless, there are still problems, in the fields of science and engineering, that cannot be tackled with currently available supercomputers. In fact, these problems, due to their size and complexity, require a computing power considerably higher than that available in a single machine. To deal with such problems, several supercomputers would need to be concentrated on the same site to obtain the total power required, which is clearly unfeasible both in terms of logistics and economics.

For a few years, some important research centers (mostly in the USA) have been experimenting with the cooperative use, via network, of geographically distributed computational resources. Several terms have been coined in connection with this approach, such as Metacomputing [1], Heterogeneous Computing [3], Distributed Heterogeneous Supercomputing [2], Network Computing, etc. One of the main reasons for introducing computer networks was to allow researchers to carry out their work wherever they pleased, by giving them rapid and transparent access to geographically distributed computing tools. Technological advances and the increasing diffusion of networks (originally used for file transfer, electronic mail and later remote login) now make it possible to achieve another interesting goal: to consider multiple resources distributed over a network as a single computer, that is, a metacomputer.
1.1 Metacomputer
A metacomputer is very different from a typical parallel MIMD-DM machine (e.g. Thinking Machines CM-5, nCUBE2, IBM SP2). Generally, a MIMD computer consists of tightly coupled processing nodes of the same type, size and power, whereas in a metacomputer the resources are loosely coupled and heterogeneous. Each resource can efficiently perform specific tasks (calculation, storage, rendering, etc.), and, in these terms, each machine can execute a suitable piece of an application. Thus, it is possible to exploit the affinity existing between software modules and architectural classes. For example, in a metacomputer containing a Connection Machine (CM-2) and an IBM SP2, an application which can be partitioned into two components, one data parallel and the other a coarse-grain task farm, would naturally exploit the features of both machines.
Metacomputing is now certainly feasible and could be an economically viable way to deal with some complex computational problems (not only technical and scientific ones), as a valid alternative to extremely costly traditional supercomputers. Metacomputing is still at an early stage and more research is necessary in several scientific and technological areas, for example:

1. methodologies and tools for the analysis, parallelization and distribution of an application on a metacomputer;
2. algorithms for process-processor allocation and load balancing in a heterogeneous environment;
3. user-friendly interfaces to manage and program metacomputers;
4. fault tolerance and security of metacomputing environments;
5. high performance networks.
2 Metacomputing environments
Developing metacomputing environments entails resolving several problems, both hardware (networks, mass memories with parallel access, etc.) and software (languages, development environments, resource management tools, etc.). Although many of the hardware problems are close to a solution, the software problems are still far from being resolved.

Currently available development environments generally have tools for managing the resources of a metacomputer, but often lack adequate tools for designing and writing programs. Without such tools, the software design cycle for metacomputers can come up against considerable difficulties. Typically, building an application for a metacomputer involves the following steps:

1. the user writes the source files on a local node of the metacomputer;
2. the source files are then transferred to every node;
3. the related compilation is made on all nodes;
4. if errors are detected or modifications are made, all the previous steps are repeated.
Compilation is needed on each node because it is not possible to determine a priori on which machines the modules that make up the application will run. If the right tools are not available, the user has to manually transfer the source files and execute the corresponding compilation on all the nodes; such operations have to be repeated whenever even the smallest error needs to be corrected. Therefore, even with just a few machines, a method to automate these operations is essential.

Some metacomputing environments provide tools to control certain aspects of the configuration and management of the virtual machine, such as the activation, insertion and removal of nodes (e.g. the PVM console [11]). In certain cases, however, easier to use management tools are needed, above all when working with large metacomputers with many nodes.

To alleviate these problems, we have developed a graphical interface based on OSF/Motif [21] and PVM, which simplifies the operations normally carried out to build and use metacomputer applications, as well as to manage parallel virtual machines.
3 Design goals
This section describes the guidelines we followed in designing the interface. We believe they are general enough to be applicable to any development tool for metacomputing.
Ease of use of a metacomputer.
The main aim of the interface should be to simplify the use of a metacomputer. This entails giving the user an overall view of the system, especially if there are many nodes spread out over several sites. At the same time, the individual resources should be easily identifiable. Although a simple list of the network addresses of the machines would probably be the fastest way to identify and access a particular node, it is better to group machines following some precise criteria, so as to facilitate the user's exploration of the resources available on the network.

In addition, the interface should let users work above all on the local node. Operations that need to be carried out on remote nodes should be executed automatically. Thus, developing software for metacomputers will mainly require the same tools used to write and set up sequential programs (editor, make, etc.). This way, the impact of a new programming environment will be less problematic.
The use of metacomputers cannot be simplified if the tools themselves are not easy and intuitive to use. It is well known that Graphical User Interfaces (GUIs) have gained the favor of computer users. We therefore decided to develop our interface as an X11 program, thus allowing users to access its functionalities via windows, menus and icons. This requires the use of graphical terminals, but it saves users from having to learn new commands, keyboard shortcuts, etc.
System control.
When working with a metacomputer, especially if a low-level programming environment such as PVM is used, it may be difficult to control the operations that occur on remote nodes, and inexperienced users could be discouraged. An interface for programming and using a metacomputer should offer users as much information as possible, and full control over what happens in the system. For example, users should never get into situations where they do not know what is happening on a certain node. If problems arise, they should be communicated with complete messages and not with cryptic error codes. If a problem is so serious that the interface can no longer be used, then the program must exit tidily.
Virtual Machine management.
The interface must have a set of basic functions to manage the virtual machine (addition/removal of nodes, control of the state of one or more nodes, creation and shutdown of the virtual machine). Essentially, all the basic functions of the PVM console should be implemented.
Process management.
Again, the functionalities to implement should be, at least, those provided by the PVM console. It must be possible to spawn processes; the interface must allow the use of all the activation parameters and flags that can be used in PVM. Users should be able to redirect the output of the tasks towards the interface, so that they can control the behaviour of the programs in "real time".
Remote commands.
When several machines are available, users often need to open sessions on remote hosts or, at least, execute remote commands (e.g. uptime or xload, so as to know the machine load). Using UNIX commands such as rsh to execute a program on a remote host is rather inconvenient, so the interface should simplify this. Further simplification is needed for X11 programs. When an X11 program is run on a remote host, its windows have to be displayed on the local graphical terminal, which itself must be allowed to accept them. The xhost command is used to permit this, but it should be made automatic when X11 programs are run from the interface.
Remote compilation.
One of the most important functionalities of the interface should be the ability to compile a program on remote machines. Once the local directory with the source code has been specified, along with the Makefile to use and the hosts where the program has to be compiled, the rest should be completely managed by the interface. This involves sending source files to remote hosts and starting compilers. Carrying out such operations by hand, apart from being time consuming, is also error prone.

Remote compilation is quite complex. The user of the interface must be able to follow the procedure step by step and, if necessary, stop it at any moment. For users to feel at ease with an automatic tool, all the operations should be carried out tidily. For example, temporary old files, created on the file systems of remote machines by a previous compilation, must be deleted transparently.
Configurability.
The interface must be configurable so that it can be adapted to any number of machines and sites. The configuration of the metacomputer must therefore not be hard-coded in the program, but specified in external files.

To modify any graphical element of the interface (colours, window size, fonts, etc.), resource files should be used. This is the standard technique for all X11 programs, and does not require the interface to be re-compiled. Also, using the X11 program editres, graphical elements can be modified without having to write a resource file.

Finally, the program should not impose any constraints on the number of nodes and networks in the system, nor on their type or location.
Figure 1: WAMM
4 WAMM
On the basis of the goals and criteria defined above, we have developed WAMM (Wide Area Metacomputer Manager), an interface prototype for metacomputing (fig. 2). WAMM was written in C; in addition to PVM, the OSF/Motif and xpm libraries are required (xpm is a freely distributable library which simplifies the use of pixmaps in X11; it is available via anonymous FTP from avahi.inria.fr). This section gives a general overview of the interface.
4.1 Configuration

To use WAMM, users have to write a configuration file which contains the description of the nodes that can be inserted into the virtual machine, i.e. all the machines that users can access and which have PVM installed. This operation only has to be done the first time WAMM is used.

The configuration file is written in a simple declarative language. An excerpt of a configuration file is shown below:
WAN italy {
TITLE "WAN Italy"
PICT italy.xpm
MAN cineca 290 190
MAN pisa 210 280
LAN caspur 300 430 }
MAN cineca {
TITLE "Cineca"
PICT cineca.xpm
LAN cinsp1 220 370
LAN cinsp2 220 400
LAN cinsp3 220 430
LAN cinsp4 220 460 }
...
MAN pisa {
TITLE "MAN Pisa"
PICT pisa.xpm
LAN cnuce 200 100
LAN sns 280 55 }
LAN caspur {
TITLE "Caspur"
HOST caspur01
HOST caspur02
...
HOST caspur08 }
...
HOST cibs {
ADDRESS cibs.sns.it
PICT IBM41T.xpm
ARCH RS6K
OPTIONS "&"
XCMD "Xterm" "xterm -sb"
CMD "Uptime" "uptime"
CMD "Who" "who"
}
...
The file describes the geographical network used, named italy. The network consists of some MANs and a LAN. For example, the pisa MAN includes the local networks cnuce and sns; the sns LAN contains various workstations, among which cibs.
As can be seen, the network is described with a tree-like structure. The root is the WAN, the geographical network that groups together all the hosts. Its children are Metropolitan (MAN) and Local (LAN) networks. A MAN can only contain local networks, whereas LANs contain only hosts, the leaves of the tree.

Various items can be specified for each declared structure, many of which are optional and have default values. Each node of the tree (network or host) can have a PICT item, which is used to associate a picture with the structure. Typically, geographical maps are used for networks, indicating where the resources are; icons representing the architecture are used for hosts.
The following is an example of a "rich" description of a host:
HOST cibs {
ADDRESS cibs.sns.it # host internet address
PICT IBM500.xpm # icon
ARCH RS6K # architecture type
INFO "RISC 6000 model 580" # other information
OPTIONS "& so=pw" # PVM options
XCMD "AIXTerm" "aixterm -sb"
XCMD "NEdit" "nedit" # remote commands
XCMD "XLoad" "xload"
CMD "Uptime" "uptime"
}
In this case the host is an IBM RISC 6000 workstation; the architecture type is RS6K (the same names adopted by PVM are used). To insert the node into the PVM, special flags have to be used (& so=pw are PVM's own options). The user can execute the aixterm, nedit, xload and uptime commands directly from the interface on the remote node. For example, using aixterm the user can connect directly to the machine. The aixterm program runs on the remote node, but the user receives the window on the local terminal and can use it to type UNIX commands.
4.2 Activation
The user starts the interface with the command:

wamm <configuration file>

PVM does not need to have been activated already. If the virtual machine does not exist at this point, WAMM creates it on the basis of the contents of the configuration file. The "base" window corresponding to the WAN is then shown to the user (fig. 2).
4.3 Windows
WAMM visualizes information relating to the networks (at WAN, MAN or
LAN level) in separate windows, one for each network. Hosts are shown
inside the window of the LAN they belong to.
Figure 2: WAMM, example of initial WAN window
The WAN window is split into three parts (fig. 2). At top left is the map indicated in the configuration file. There is a button for each sub-network; the user can select these buttons to open the corresponding windows.

All the hosts declared in the configuration file are listed on the right. The list has various uses: the user can access a host quickly by double clicking on the name of the machine, without having to navigate through the various sub-networks. By selecting a group of hosts, various operations can be invoked from the menu (a sketch of how these operations map onto PVM library calls is given after the list):

- insert hosts into PVM;
- remove hosts from PVM;
- check the hosts' status;
- compile on the selected hosts.
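As an illustration only, the following minimal sketch (not taken from the WAMM sources; the helper names are assumptions) shows how these menu operations can be expressed with the standard PVM 3 calls pvm_addhosts, pvm_delhosts and pvm_mstat.

/* Hypothetical sketch; helper names are assumptions, PVM calls are standard. */
#include <stdio.h>
#include <stdlib.h>
#include <pvm3.h>

/* Add or remove a group of hosts selected in the list. */
static void change_hosts(char **hostnames, int nhosts, int add)
{
    int *infos = (int *)malloc(nhosts * sizeof(int));
    int i, ret;

    ret = add ? pvm_addhosts(hostnames, nhosts, infos)
              : pvm_delhosts(hostnames, nhosts, infos);
    if (ret < 0)
        fprintf(stderr, "request failed: %d\n", ret);
    else
        for (i = 0; i < nhosts; i++)        /* per-host result codes */
            if (infos[i] < 0)
                fprintf(stderr, "%s: error %d\n", hostnames[i], infos[i]);
    free(infos);
}

/* Check whether a host of the virtual machine is still responding. */
static void check_host(char *hostname)
{
    int st = pvm_mstat(hostname);           /* PvmOk if the host answers */
    printf("%s: %s\n", hostname, st == PvmOk ? "OK" : "not responding");
}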
All the messages produced by WAMM are shown at the bottom. Figure 2 shows the information written when the program was started. MAN sub-networks are shown using the same type of window (fig. 3). The only difference is in the list of hosts, which, in this case, only includes the nodes that belong to the MAN.
For the local networks, the windows are organized differently (see fig. 4). The window reproduces a segment of Ethernet with the related hosts. For each host the following are shown: the icon, the current status (PVM means that the node belongs to the virtual machine), the type of architecture and any other information specified in the configuration file. Each icon has a popup menu associated with it, which can be activated using the right mouse button. This menu enables users to change the status of the node (add it to or remove it from PVM), run a compilation or execute one of the remote commands indicated in the configuration file. Basic operations on groups of hosts can still be carried out by selecting one or more nodes and invoking the operation from the window menu. In all cases, the results appear in the message area at the bottom of the window.
4.4 Compilation
Figure 3: WAMM, a MAN window representing Pisa

The compilation of a program is mostly managed by WAMM: the user only has to select the hosts where the compilation is to be done, and call Make from the Apps menu (fig. 2). Using a dialog box, the local directory that contains the source files and the Makefile can be specified, along with any parameters needed by the make command. No restrictions are placed on the type of source files to compile: they can be written in any language.
WAMM carries out the operations needed to compile an application in the following order (a sketch of these steps in terms of PVM calls is shown after the list):

1. all the source files are grouped into one file, in the standard tar format used on UNIX machines;
2. the file produced is compressed using the compress command;
3. a PVM task which deals with the compilation (PVMMaker) is spawned on each selected node; the compressed file is sent to all these tasks.
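The following is a hedged sketch of the interface side of these three steps; PVMMaker is the satellite described above, while the message tag, the temporary file name and the function name are illustrative assumptions, not WAMM internals.

/* Hypothetical sketch of the interface-side dispatch of a compilation. */
#include <stdio.h>
#include <stdlib.h>
#include <pvm3.h>

#define MSG_SOURCES 100                       /* arbitrary message tag */

int dispatch_compilation(const char *srcdir, char **hosts, int nhosts)
{
    char cmd[1024], *buf;
    int len, i, tid;
    FILE *fp;

    /* Steps 1-2: pack the source directory and compress it. */
    snprintf(cmd, sizeof(cmd),
             "tar cf - -C %s . | compress > /tmp/wamm_src.tar.Z", srcdir);
    if (system(cmd) != 0)
        return -1;

    /* Read the compressed archive into memory. */
    fp = fopen("/tmp/wamm_src.tar.Z", "rb");
    if (fp == NULL)
        return -1;
    fseek(fp, 0, SEEK_END);
    len = (int)ftell(fp);
    rewind(fp);
    buf = (char *)malloc(len);
    if (fread(buf, 1, len, fp) != (size_t)len) {
        fclose(fp);
        free(buf);
        return -1;
    }
    fclose(fp);

    /* Step 3: spawn one PVMMaker per selected host and send it the archive. */
    for (i = 0; i < nhosts; i++) {
        if (pvm_spawn("PVMMaker", (char **)0, PvmTaskHost, hosts[i], 1, &tid) != 1)
            continue;                         /* spawn failed on this host */
        pvm_initsend(PvmDataDefault);
        pvm_pkint(&len, 1, 1);                /* archive length */
        pvm_pkbyte(buf, len, 1);              /* archive contents */
        pvm_send(tid, MSG_SOURCES);
    }
    free(buf);
    return 0;
}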
Figure 4: WAMM, LAN window
At this point WAMM's work ends: the remaining operations are carried out at the same time by all the PVMMakers, each on its own node. Each PVMMaker performs the following actions:

1. it creates a temporary work directory inside the user's home directory;
2. the compressed file is received, expanded and saved in the new directory; the source files are extracted;
3. the UNIX make command is executed.

At the end of the compilation the working directory is not destroyed (but it will be on a subsequent compilation). If needed, the user can thus connect to the host, modify the source code if necessary, and manually start a new compilation on the same files.

Figure 5: WAMM, control window for the compilation
Each PVMMaker notifies WAMM of all the operations that have been executed. The messages received from the PVMMakers are shown in a control window, to let users check how the compilation is going. Figure 5 depicts a sample compilation run on seven machines. For each machine, the step in progress when the snapshot was taken can be seen. For example, astro.sns.it has received its own copy of the directory and is expanding it, while calpar has successfully completed the compilation.

By selecting one or more hosts in the control window, output messages can be seen before the compilation completes, along with any errors produced by make and by the compiler. A make can be stopped at any moment with a menu command. If a node fails (for example due to errors in the source code), this does not affect the other nodes.

If the compilation is successful, the same Makefile that was used to carry it out can copy the executable files produced into the directory

$PVM_ROOT/bin/$PVM_ARCH

used by PVM as a "storage" for executable files. This operation can be carried out on each node.
Figure 6: WAMM, PVM process spawning
4.5 Tasks
WAMM allows PVM tasks to be spawned and controlled. Programs are executed by selecting Spawn from the Apps menu (fig. 2). A dialog box is opened where the parameters used by PVM to request the execution of the tasks can be entered (fig. 6). The following can be specified (a sketch of the corresponding pvm_spawn call is given after the list):

- the name of the program;
- any command-line arguments to pass to the program;
- the number of copies to execute;
- the mapping scheme: by specifying Host, all the tasks are activated on one machine (whose address has to be indicated); by specifying Arch, PVM chooses only machines with a user-selected architecture; finally, Auto can be used to let PVM choose the nodes on which the copies have to be spawned;
- PVM's various flags (Debug, Trace, MPP);
- any redirection of the output of the program to WAMM (flag Output);
- the events to record if the Trace option is enabled.
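A hedged sketch of how the dialog settings could translate into the pvm_spawn call follows; the wrapper function and its parameters are assumptions, while the flags are PVM's own constants. The Output option is presumably built on pvm_catchout, which redirects the standard output of child tasks to the spawning task.

/* Hypothetical sketch; wrapper name and parameters are assumptions. */
#include <stdio.h>
#include <pvm3.h>

int spawn_from_dialog(char *program, char **args, int ncopies,
                      int use_host, char *where, int debug, int trace)
{
    int tids[64];                    /* task ids (fixed size for brevity)     */
    int flag = 0;                    /* PvmTaskDefault: "Auto" mapping        */
    int started;

    if (ncopies > 64)
        ncopies = 64;

    if (use_host)
        flag |= PvmTaskHost;         /* "Host": all copies on 'where'         */
    else if (where != NULL)
        flag |= PvmTaskArch;         /* "Arch": 'where' names an architecture */

    if (debug) flag |= PvmTaskDebug; /* run the tasks under a debugger        */
    if (trace) flag |= PvmTaskTrace; /* generate PVM trace messages           */

    started = pvm_spawn(program, args, flag, where, ncopies, tids);
    if (started < ncopies)
        fprintf(stderr, "only %d of %d copies started\n", started, ncopies);
    return started;
}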
By selecting Spawn, the new tasks are run. The Windows menu (fig. 2) can be used to open a control window that contains status information on all the PVM tasks being executed in the virtual machine (fig. 7). The data on the tasks are automatically updated: if a task terminates, its status is changed. New tasks that appear in the system are added to the list, even if they were not spawned by WAMM. The output of processes activated with the Output option can be seen in separate windows and can also be saved to a file. If the output windows are open, new messages from the tasks are shown immediately.

Figure 7: WAMM, control window for PVM tasks

Kill (task destruction) and Signal (sending a specific signal) are possible for all PVM tasks, including those not spawned by WAMM; a sketch of these two operations is shown below.
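The two menu entries map directly onto two PVM 3 primitives, as in this minimal sketch (the helper and its arguments are assumptions).

/* Hypothetical sketch; the helper name is an assumption. */
#include <signal.h>
#include <pvm3.h>

/* Destroy or signal every task selected in the control window. */
void act_on_tasks(int *tids, int ntids, int kill_them, int signum)
{
    int i;
    for (i = 0; i < ntids; i++) {
        if (kill_them)
            pvm_kill(tids[i]);             /* "Kill": terminate the task */
        else
            pvm_sendsig(tids[i], signum);  /* "Signal": e.g. SIGUSR1     */
    }
}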
5 WAMM's implementation issues
This section outlines the most important aspects of the implementation. Some reference is made to concepts and functionalities of the UNIX operating system and the PVM environment; see [20] and [11] for further details.
5.1 Structure of the program
Each complex function of WAMM is implemented by an independent module; the modules are then linked at compilation time. This type of structure is useful for any complex program and facilitates modifications to the code and the insertion of new functionalities. The set of modules can be subdivided into three levels:

Application modules.
These are high-level modules which implement task spawning and control, as well as the compilation of source code.

Graphic modules.
These include all the functions needed to create the graphical interface of the program.

Network modules.
These are control modules which act as an interface between the application and the underlying virtual machine.

The program is totally event-driven. Once the initialization of the internal modules and the related data structures is complete, the program stops and waits for messages from the PVM environment or from the user (for example, the termination of an active task or the selection of a button in a window). This is typical X11 program behaviour; a sketch of such a loop is given below.
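The following is a hedged sketch of one way such a loop can be organized, assuming a periodic Xt timeout is used to poll for pending PVM messages while XtAppMainLoop handles the X events; the interval and overall structure are assumptions, not the actual WAMM code.

/* Hypothetical sketch of an X11/Motif event loop that also polls PVM. */
#include <X11/Intrinsic.h>
#include <pvm3.h>

static void poll_pvm(XtPointer client_data, XtIntervalId *id)
{
    XtAppContext app = (XtAppContext)client_data;

    /* Non-blocking check for any PVM message (any source, any tag). */
    while (pvm_probe(-1, -1) > 0) {
        int bufid = pvm_recv(-1, -1);
        /* ... unpack the message and update the relevant window ... */
        (void)bufid;
    }
    /* Re-arm the timeout so polling continues (interval is arbitrary). */
    XtAppAddTimeOut(app, 250, poll_pvm, client_data);
}

int main(int argc, char **argv)
{
    XtAppContext app;
    Widget top = XtVaAppInitialize(&app, "WAMM", NULL, 0,
                                   &argc, argv, NULL, NULL);

    pvm_mytid();                           /* enroll in PVM */
    XtAppAddTimeOut(app, 250, poll_pvm, (XtPointer)app);
    XtRealizeWidget(top);
    XtAppMainLoop(app);                    /* X event loop  */
    return 0;
}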
5.2 Host control
During initialization WAMM enrolls in PVM, so that all the control functions of the virtual machine offered by the environment can be exploited. Specifically, the insertion and removal of hosts is monitored using the pvm_notify function, as sketched below. WAMM is informed of any changes in the metacomputer configuration and shows them to the user. The notification mechanism is also able to recognize any variations produced by external programs: for example, if hosts are added or removed using the PVM console, the modification is detected by WAMM as well.
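A sketch of the notification setup is shown below; pvm_notify and pvm_config are standard PVM 3 calls, while the message tags and the surrounding function are assumptions.

/* Hypothetical sketch of the host-notification setup. */
#include <stdlib.h>
#include <pvm3.h>

#define TAG_HOST_ADD  201
#define TAG_HOST_DEL  202

void watch_hosts(void)
{
    struct pvmhostinfo *hosts;
    int nhost, narch, i;
    int *dtids;

    /* Ask to be told whenever new hosts join the virtual machine.
       (For PvmHostAdd the cnt/tids arguments are not used; passing
       -1/NULL is the usual convention.) */
    pvm_notify(PvmHostAdd, TAG_HOST_ADD, -1, (int *)0);

    /* Ask to be told when any of the current hosts leaves or fails. */
    pvm_config(&nhost, &narch, &hosts);
    dtids = (int *)malloc(nhost * sizeof(int));
    for (i = 0; i < nhost; i++)
        dtids[i] = hosts[i].hi_tid;        /* pvmd tid of each host */
    pvm_notify(PvmHostDelete, TAG_HOST_DEL, nhost, dtids);
    free(dtids);

    /* The notifications arrive as ordinary PVM messages with these tags
       and are picked up by the main event loop. */
}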
5.3 Task control
Unfortunately, PVM's notification mechanism for tasks is not as complete as that for hosts: by using pvm_notify it is possible to find out when a given task terminates, but not when a new task appears in the system. To gain complete control over tasks too, WAMM uses satellite processes, named PVMTaskers. During the initialization phase a PVMTasker is spawned on each node in the virtual machine. Each PVMTasker periodically queries its own PVM daemon to get the list of tasks running on the node; when variations with respect to the previous check are found, the relevant information is sent to WAMM (a sketch of this polling loop is given below).
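In the following hedged sketch of the PVMTasker polling loop, the change detection is reduced to comparing task counts, whereas the real satellite would compare the full task lists; the tags, interval and function name are assumptions, while pvm_tasks and pvm_tidtohost are standard PVM 3 calls.

/* Hypothetical sketch of the PVMTasker polling loop. */
#include <unistd.h>
#include <pvm3.h>

#define TAG_TASK_UPDATE 300
#define POLL_SECONDS    5

void tasker_loop(int wamm_tid)
{
    struct pvmtaskinfo *tasks;
    int ntasks, i;
    int prev = -1;                         /* previous task count (toy diff) */
    int myhost = pvm_tidtohost(pvm_mytid());

    for (;;) {
        /* List only the tasks running under the local daemon. */
        pvm_tasks(myhost, &ntasks, &tasks);

        /* A real implementation compares the full tid lists; here the
           change test is reduced to the task count for brevity. */
        if (ntasks != prev) {
            pvm_initsend(PvmDataDefault);
            pvm_pkint(&ntasks, 1, 1);
            for (i = 0; i < ntasks; i++)
                pvm_pkint(&tasks[i].ti_tid, 1, 1);
            pvm_send(wamm_tid, TAG_TASK_UPDATE);
            prev = ntasks;
        }
        sleep(POLL_SECONDS);
    }
}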
Using PVMTasker processes is just one way to emulate a more complete pvm_notify. One drawback is that a PVMTasker has to be installed on each node indicated in the configuration file. An alternative is to let the interface itself request, from time to time, the complete list of tasks (PVM's pvm_tasks function can be used to do this). This method does not need satellites, but it does have some drawbacks; specifically:

- data from all the daemons have to be transmitted to WAMM, even if there have been no variations in the number of tasks compared to the previous check. In the first solution messages are only sent when necessary;
- if one of the nodes fails, then pvm_tasks waits until a timeout is reached. Some minutes may pass before the function resumes with the next nodes and sends the list of tasks to WAMM. The first solution, which is based on independent tasks, does not have this problem: if a node fails, only its own tasks will not be updated.
There is a third solution, which exploits PVM's concept of a tasker (taskers, along with hosters, were introduced with version 3.3 of PVM). A tasker is a PVM program enabled to receive the control messages which, in the virtual machine, are normally used to request the activation of a new process. This basically means that if a tasker process is active on a node, the local daemon does not activate the program itself, but passes the request to the tasker. The tasker executes it and, when the activated process has terminated, informs the daemon. We could write a tasker in such a way that it not only deals with the daemon, but also notifies the interface of the activation and termination of its own tasks. This solution is the least expensive in terms of communications (the periodic messages between the control task and the daemon of its node are eliminated too), but it is not without drawbacks:

- a program still has to be installed on each node;
- no information is available on PVM processes created before the taskers are registered.

When developing WAMM we tried out all three solutions. We opted for the first, as it was by far the best both in terms of network usage and control capabilities.
5.4 Remote commands
The PVMTasker processes described above are also used to execute programs on remote hosts: the satellite task receives the name of the program along with its command line arguments and executes a fork. The child process executes the program; its output is sent back to the interface (a sketch is shown below). This solution has one main drawback: it is impossible to execute commands on hosts where PVMTasker is not running.
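A minimal sketch of this command-execution service follows; popen is used here in place of the explicit fork/exec pair described in the text, to keep the output capture short, and the tag and function name are assumptions.

/* Hypothetical sketch of the remote-command service in the satellite. */
#include <stdio.h>
#include <pvm3.h>

#define TAG_CMD_OUTPUT 400

/* Run 'cmdline' locally and forward its output, line by line, to WAMM. */
void run_remote_command(int wamm_tid, const char *cmdline)
{
    char line[512];
    FILE *p = popen(cmdline, "r");

    if (p == NULL)
        return;
    while (fgets(line, sizeof(line), p) != NULL) {
        pvm_initsend(PvmDataDefault);
        pvm_pkstr(line);                   /* one line of command output */
        pvm_send(wamm_tid, TAG_CMD_OUTPUT);
    }
    pclose(p);
}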
The classical alternative consists in using commands such as rsh or rexec. These can be used for any host, even if it is not in PVM. For example, to find out the load of the node evans.cnuce.cnr.it, a user connected to calpar.cnuce.cnr.it can write:

rsh evans.cnuce.cnr.it uptime

The uptime command is executed on evans and the output is shown on calpar. The nodes do not have to belong to the virtual machine (nor, in fact, does PVM have to be installed). The problem arises from the fact that rsh and rexec are alternatives:

- to use rsh, the user has to give remote hosts permission to accept the execution requests, by creating an .rhosts file on each node used;
- to use rexec, no .rhosts files are needed, but unlike rsh, the password of the account on the remote host is requested.

Neither method is really satisfactory: .rhosts files create security problems and are often avoided by system administrators; requesting a password is not acceptable when there are many accounts or commands to deal with. PVM therefore allows both methods to be used: the user specifies in the hostfile, for each node, which one should be used (the same options are admitted in the configuration file used by WAMM). This information is needed because PVM has to activate the pvmd daemon, using either rsh or rexec, on all the remote nodes that are inserted into the virtual machine.

An alternative way to execute a remote command would entail examining the configuration file to establish whether rsh or rexec is required for each node. In any case it is easier to run the command from the PVMTasker on the remote node: with respect to the PVMTasker, the command is executed locally, so neither rsh nor rexec is needed.
5.5 Remote compilation
As described in the previous section, WAMM can compile programs on remote hosts by using a PVM task called PVMMaker. PVMMakers are spawned, on each node required, only at compilation time, and they terminate immediately after the conclusion of the make process. The compressed directory and the command line arguments for make are sent to the PVMMakers. Upon receipt of these data, each PVMMaker expands the directory, activates the compilation on its own node and sends messages back to the interface about the current activity. Output messages produced by the compilers are also sent to the interface, in the form of normal PVM messages (a sketch of the PVMMaker side is shown below).
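The following hedged sketch shows one possible shape of the PVMMaker side, assuming a fixed work directory, fixed tag values and line-by-line forwarding of the make output; the real satellite also receives the make arguments, which are omitted here. Only the PVM calls and the standard UNIX commands are real; everything else is an assumption.

/* Hypothetical sketch of a PVMMaker-like satellite. */
#include <stdio.h>
#include <stdlib.h>
#include <pvm3.h>

#define MSG_SOURCES  100
#define TAG_MAKE_MSG 500

int main(void)
{
    char *buf, line[512];
    int len, parent = pvm_parent();        /* tid of the spawning WAMM */
    FILE *archive, *make;

    /* Receive the compressed tar archive sent by the interface. */
    pvm_recv(parent, MSG_SOURCES);
    pvm_upkint(&len, 1, 1);
    buf = (char *)malloc(len);
    pvm_upkbyte(buf, len, 1);

    /* Save it, create a fresh work directory and expand the sources. */
    system("rm -rf $HOME/.wamm_build && mkdir $HOME/.wamm_build");
    archive = fopen("/tmp/wamm_src.tar.Z", "wb");
    fwrite(buf, 1, len, archive);
    fclose(archive);
    system("cd $HOME/.wamm_build && uncompress -c /tmp/wamm_src.tar.Z | tar xf -");

    /* Run make and forward its output to the interface, line by line. */
    make = popen("cd $HOME/.wamm_build && make 2>&1", "r");
    while (make && fgets(line, sizeof(line), make) != NULL) {
        pvm_initsend(PvmDataDefault);
        pvm_pkstr(line);
        pvm_send(parent, TAG_MAKE_MSG);
    }
    if (make) pclose(make);

    free(buf);
    pvm_exit();
    return 0;
}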
Much of what was said about the execution of remote commands also applies to the compilation. To transfer a file onto a host the UNIX command rcp can be used, but, like rsh, it requires a suitable .rhosts file on the destination node. The only valid alternative is to use the ftp protocol to transfer files, but managing it is considerably more complex than simply transferring data between tasks via PVM primitives.
The execution of the commands needed for the compilation could also be accomplished by using rsh or rexec. However, not only would this lead to the problems described above, but it would also not be as efficient as the solution based on the PVMMakers, since the interface would have to manage the results of all the operations on all the hosts. By using the PVMMaker, the interface only has to spawn the tasks, send them the directory with the source files and show the user the messages that come back from the various PVMMakers. The compilation is carried out in parallel on all the nodes.
The disadvantages of using PVMMakers are similar to those described for the PVMTaskers: a PVMMaker has to be installed on each node that the user has access to, and compilations cannot be made on a node that is not part of the virtual machine (the PVMMaker could not be activated there). To resolve the first point, the functionalities of the PVMTasker and the PVMMaker could perhaps be brought together into one task, thus simplifying the installation of WAMM. However, having two separate tasks offers greater modularity.
6 Future developments
The current version of WAMM is only the starting point for building a complete metacomputing environment based on PVM. Some implementation choices, such as the subdivision into modules with a similar structure, were made in order to simplify as far as possible the insertion of new features. Possible new features are described below.
Resource management.
The management of the resources of a metacomputer (nodes, networks, etc.) is one of the most important aspects of any metacomputing environment. WAMM currently lets users work only with the simple mechanisms implemented in PVM to control the nodes and task execution. For example, to spawn a task, the user can only choose whether to use a particular host, a class of architectures or any node identified by PVM on a round-robin basis. For complex applications, more sophisticated mapping algorithms are needed. Such algorithms could be implemented in the module that activates the tasks.
Performance analysis.
The current version of WAMM allows a task to be activated in trace mode: whenever the task calls a PVM function, a trace message is sent to the interface. By appropriately recording and organizing these data, all the information needed to study program performance can be obtained. Specifically, the times spent on calculation, communications and various overheads can be determined. At the moment WAMM does not make any use of the trace messages it receives; a future version should have a module which can collect these data, show them to the user as graphics or tables, and save them in a "standard" format which can be used in subsequent examinations via external tools, such as ParaGraph [15].
Remote commands.
The possibility of executing remote commands could be exploited to run the same operation on several hosts simultaneously. This type of functionality, not yet implemented, would resolve some problems regarding the development and maintenance of PVM programs. For example, with one command an old PVM executable file could be deleted from a group of nodes; manual deletion is not feasible when there are many nodes, so it would be advantageous to use the interface.
7 Related work
WAMM's features can be divided into two groups: on the one hand, it provides users with a set of facilities to control and configure the metacomputer; on the other, it can also be considered a software development tool. Some other packages offer similar functionalities.

XPVM is a graphical console for PVM with support for virtual machine and process management. The user can change the metacomputer configuration (by adding or removing nodes) and spawn tasks, in a way similar to WAMM's. Compared with WAMM, XPVM does not provide the same "geographical" view of the virtual machine and is probably more suitable for smaller systems. XPVM has no facilities for source file distribution, parallel compilation or the execution of commands on remote nodes; however, it includes a section for trace data analysis and visualization, not yet implemented in WAMM.
HeNCE [13, 14] is a PVM-based metacomputing environment which greatly simplifies the software development cycle. In particular, it implements a system for source file distribution and compilation on remote nodes, similar to that used by WAMM: source files can be compiled in parallel on several machines and this task is controlled by processes comparable to the PVMTaskers. HeNCE lacks all the virtual machine management facilities provided by WAMM; for these, it is often necessary to use the PVM console. It should be said that HeNCE was designed with different goals: the simplification of application development is mostly achieved by using a different programming model, with communication abstraction, rather than by providing remote compilation facilities.
References
[1] L. Smarr, C.E. Catlett. Metacomputing. Communications of the ACM, Vol. 35, No. 6, pp. 45-52, June 1992.
[2] R. Freund, D. Conwell. Superconcurrency: A Form of Distributed Heterogeneous Supercomputing. Supercomputing Review, pp. 47-50, October 1990.
[3] A.A. Khokhar, V.K. Prasanna, M.E. Shaaban, C. Wang. Heterogeneous Computing: Challenges and Opportunities. IEEE Computer, pp. 18-27, June 1993.
[4] P. Messina. Parallel and Distributed Computing at Caltech. Technical Report CCSF-10-91, Caltech Concurrent Supercomputing Facilities, California Institute of Technology, USA, October 1991.
[5] P. Messina. Parallel Computing in USA. Technical Report CCSF-11-91, Caltech Concurrent Supercomputing Facilities, California Institute of Technology, USA, October 1991.
[6] P. Huish (Editor). European Meta Computing Utilising Integrated Broadband Communication. EU Project Number B2010, Technical Report.
[7] P. Arbenz, H.P. Luthi, J.E. Mertz, W. Scott. Applied Distributed Supercomputing in Homogeneous Networks. IPS Research Report N.91-18, ETH Zurich.
[8] V.S. Sunderam. PVM: a Framework for Parallel Distributed Computing. Concurrency: Practice and Experience, 2(4):315-339, December 1990.
[9] G.A. Geist, V.S. Sunderam. Network-Based Concurrent Computing on the PVM System. Concurrency: Practice and Experience, Vol. 4(4), July 1992.
[10] J.J. Dongarra, G.A. Geist, R. Manchek, V.S. Sunderam. The PVM Concurrent System: Evolution, Experiences and Trends. Parallel Computing, 20 (1994), pp. 531-545.
[11] A.L. Beguelin, J.J. Dongarra, G.A. Geist, R. Manchek, V.S. Sunderam, W. Jiang. PVM3 Users' Guide and Reference Manual. Technical Report ORNL/TM-12187, Oak Ridge National Laboratory, May 1994.
[12] R. Baraglia, G. Bartoli, D. Laforenza, A. Mei. Network Computing: definizione e uso di una propria macchina virtuale parallela mediante PVM. CNUCE, Rapporto Interno C94-05, January 1994.
[13] A. Beguelin, J.J. Dongarra, G.A. Geist, R. Manchek, V.S. Sunderam. Graphical Development Tools for Network-Based Concurrent Supercomputing. Proceedings of Supercomputing '91, Albuquerque, 1991.
[14] A. Beguelin, J.J. Dongarra, G.A. Geist, R. Manchek, K. Moore, R. Wade, J. Plank, V.S. Sunderam. HeNCE: A Users' Guide, Version 2.0.
[15] M.T. Heath, J.E. Finger. ParaGraph: A Tool for Visualizing Performance of Parallel Programs. Oak Ridge National Laboratory, Oak Ridge, TN, 1994.
[16] G. Bertin, M. Stiavelli. Reports on Progress in Physics, 56, 493, 1993.
[17] S. Aarseth. Multiple Timescales. Ed. J.U. Brackbill and B.I. Cohen, p. 377, Orlando: Academic Press, 1985.
[18] L. Hernquist. Computer Physics Communications, 48, 107, 1988.
[19] H.E. Bal, J.G. Steiner, A.S. Tanenbaum. Programming Languages for Distributed Computing Systems. ACM Computing Surveys, Vol. 21, No. 3, September 1989.
[20] H. Hahn. A Student's Guide to UNIX. McGraw-Hill, Inc., 1993.
[21] M. Brain. Motif Programming: The Essentials ... and More. Digital Press, 1992.