
Computational Steering and Online Visualization of Scientific Applications on Large-Scale HPC Systems within e-Science Infrastructures


Abstract

In the past several years, many scientific applications from various domains have taken advantage of e-science infrastructures that share storage or computational resources such as supercomputers, clusters or PC server farms across multiple organizations. Especially within e-science infrastructures driven by high-performance computing (HPC) such as DEISA, online visualization and computational steering (COVS) has become an important technique to save compute time on shared resources by dynamically steering the parameters of a parallel simulation. This paper argues that future supercomputers in the Petaflop/s performance range with up to 1 million CPUs will create an even stronger demand for seamless computational steering technologies. We discuss upcoming challenges for the development of scalable HPC applications and limits of future storage/IO technologies in the context of next generation e-science infrastructures and outline potential solutions.
M.Riedel, Th.Eickermann, S.Habbinga, W.Frings, P.Gibbon, D.Mallmann, F.Wolf, A.Streit, Th.Lippert
John von Neumann Institute of Computing, Central Institute of Applied Mathematics
Forschungszentrum Juelich
D-52425 Juelich, Germany
Felix Wolf
Department of Computer Science
RWTH Aachen University
D-52056, Aachen, Germany
Wolfram Schiffmann
Department of Computer Science
University of Hagen
D-58097, Hagen, Germany
Andreas Ernst, Rainer Spurzem
Astronomisches Rechen-Institut
University of Heidelberg
D-69120, Heidelberg, Germany
Wolfgang E. Nagel
Center for Information Services and
High Performance Computing
Technische Universität Dresden
D-01069, Dresden, Germany
1. Introduction
Large-scale scientific research often relies on the collab-
orative use of computers and storage in combination with
experimental and diagnostic devices, such as magnetic res-
onance imaging (MRI) systems and large-scale telescopes.
One of the goals of such infrastructures is to facilitate
the routine interaction of e-scientists and their work flows
with advanced problem solving tools and computational re-
sources that supercomputer centers around the world pro-
vide. Many e-science applications take advantage of these
infrastructures to simulate phenomena related to a specific
scientific or engineering domain on advanced parallel
computer architectures. While various e-science
infrastructures exist today (e.g. EGEE [4], OSG [6],
PRAGMA [7], China-Grid [2]), infrastructures such as
DEISA [3] or TeraGrid [9], which are largely driven by
high-performance computing (HPC) needs, will face new
challenges during
the coming years. On the one hand, increasing complexity
of Grid applications that embrace multiple physical models
(i.e., multi-physics) and consider a larger range of scales
(i.e., multi-scale) is creating a steadily growing demand
for compute power. On the other hand, disproportional
power dissipation and diminishing returns from additional
instruction-level parallelism are limiting further progress in
uni-processor performance. Therefore, the only option left
to satisfy increasing Grid application demands is to harness
higher degrees of parallelism by employing a larger number
of moderately fast processor cores. Already today, most mi-
croprocessors integrate several cores on a single chip (i.e.,
multi-core) that can run in parallel. As a result of these
two developments, supercomputer Grids will have to inte-
grate peta-scale high-end computers with millions of CPUs.
For instance, the leading European supercomputing centers
have formed a consortium called the Partnership for
Advanced Computing in Europe (PACE) to create a persistent,
sustainable, pan-European HPC service with three to five
HPC leadership systems at the Petaflop/s level.
However, the integration of such large-scale computing
resources into future e-science infrastructures will pose ma-
jor challenges. First and foremost, it is unclear to many
e-scientists whether their code that shows satisfactory
performance on thousands of CPUs today will also scale beyond
a hundred thousand. Moreover, storage barriers can be ex-
pected to take effect if the evolution of storage technology
and interconnecting I/O systems in terms of latency and
bandwidth does not keep pace with the evolution of pure
computational power. Therefore, the tremendous growth
towards millions of multi-core CPUs raises a demand for
more advanced tooling for e-scientists to make their code
more efficient and scalable while minimizing the need to
access secondary storage (e.g. tapes). The Virtual Institute
High Productivity Supercomputing (VI-HPS) [11], for in-
stance, has been founded to provide scalable performance-
analysis tools that assist application developers in optimiz-
ing their codes. The optimization process supported by
these tools typically consists of an alternating sequence of
modifications and validation runs guided by targeted mea-
surements. The underlying approach is static in the sense
that the application is only modified between runs.
In contrast, computational steering tools follow a more
dynamic approach to increase the productivity of simula-
tions. The progress of a parallel program is guided on-line
during its execution on a supercomputer by dynamically
controlling selected application parameters. To do this most
effectively, such tools allow the user to watch the progress
of the simulation in real time using a suitable visualization
program (online visualization) and give them the ability to
intervene if necessary. That is, instead of performing costly
parameter sweeps, e-scientists that use such tools are able to
modify key application parameters at run time with imme-
diate visual feedback. In this way, unnecessary simulation
paths can be avoided along with the storage of undesired re-
sult data sets, saving valuable compute and I/O resources.
This paper is structured as follows. After reviewing up-
coming challenges in high-performance computing, Section
2 introduces the VISIT communication library that can be
used to steer HPC applications. Section 3 describes Xn-
body, a visualization program suitable for steering in com-
bination with VISIT. After that, two use cases are discussed,
one related to astrophysics in Section 4, another related to
plasma physics in Section 5. Based on our observations,
Section 6 proposes ways in which computational steering
and online visualizations can be seamlessly integrated into
e-science infrastructures. Finally, after surveying related
work in Section 7, we present our conclusion in Section 8.
2. The VISIT communication library for Computational Steering and Data Transfer
Communication libraries for visualization and steering
provide a loosely coupled approach between an e-Science
application (parallel simulation) on the one hand and its vi-
sualization on the other. In principle, the basic design goal
of this kind of system is to transport specific datasets
between the visualization and simulation and thus convey
scientific and additional information such as steering com-
mands. The open source VISualization Interface Toolkit
(VISIT) [12], for instance, is such a light-weight communi-
cation library that can be used for online visualization and
computational steering. VISIT was initially developed
during the German project Gigabit Testbed West and has
since been continuously enhanced, becoming stable software
in recent years. It is used in several application projects,
for instance in the astrophysics and plasma-physics
applications that are described in more detail in the
steering case studies in Sections 4 and 5.
Figure 1. The HPC oriented design of the
VISIT toolkit: Server acts as client.
As shown in Figure 1, VISIT uses a simple client-server
approach; no central data server or data manager is used.
Thus, data is exchanged directly between a simulation and
a visualization. One of the main design goals of
VISIT is to minimize the load on the steered simulation
and to prevent failures or slow operations of the visualiza-
tion from disturbing the simulation progress. Additionally,
the part of VISIT that is used by the e-Science application
(simulation) should be easy to port to and use on all rele-
vant platforms including supercomputers and clusters and
potentially smaller resources within a Grid infrastructure.
Therefore, VISIT does not rely on any external software or
special environments and has language bindings for C and
FORTRAN. C and FORTRAN are still the most widely used
programming languages on HPC-driven Grid resources such as
supercomputers today, and thus any of these applications
can in principle be instrumented with VISIT library calls to
enable steering. This instrumentation is done manually by
scientists, following an approach similar to MPI, so that it
can be better aligned with the data model of the
application, reducing the overall overhead of
instrumentation. Furthermore, VISIT supports AVS/Express [1]
and VTK [21].
In more detail, all these design goals have led to a
design of VISIT in which the e-Science application
(simulation) acts as a client that initiates all operations,
such as opening a connection, sending data to be visualized
or receiving new steering parameters. Furthermore, the
design ensures that operations are guaranteed to complete or
fail after a user-specified timeout. VISIT has an MPI-like
API based on typed messages
that are sent or received from either the VISIT server (visu-
alization) or the VISIT client (simulation). Supported types
are strings, integers, floats of various length, user defined
data structures and arrays of these types. Hence, it can
be smoothly instrumented into scientific application code
to enable computational steering if the application itself is
based on a step-wise algorithm. That is, instrumenting a
parallel simulation only makes sense if it computes
its data in a step-wise fashion to provide interim results
that in turn allow for an online visualization of the scien-
tific data. Also, the time for one computational step within
the application should not take too long in order to avoid
long-winded online visualizations and to see the impact of
computational steering in a real-time fashion.
On the other hand, the time for one computational step
should not be too short, in order to give the user of COVS
the chance to steer the application parameters based on the
current status, before the next data from the simulation is
already displayed. Most notably, in order to allow for com-
putational steering, the e-Science application (simulation)
must be parameterized in a way that allows the change of
values during its run-time. Analysis and case studies re-
vealed that applications that are based on N-body problems
[23] are typically well suited to be steered, while
applications in other scientific areas, such as quantum
computing simulations or lattice quantum chromodynamics,
cannot be steered due to long computational steps without
interim results.
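The client-driven, timeout-bounded interaction described above can be sketched as follows. This is a minimal illustration in Python and not the VISIT API: the class SteeringChannel, its methods, and the parameters dt and velocity are hypothetical names chosen for this example, and in-process queues stand in for the network connection. The simulation initiates every exchange once per computational step and simply continues if the visualization is slow or absent.

```python
import queue


class SteeringChannel:
    """Hypothetical stand-in for a steering connection (not the VISIT API)."""

    def __init__(self, capacity=4):
        self._to_vis = queue.Queue(maxsize=capacity)  # simulation -> visualization
        self._to_sim = queue.Queue()                  # visualization -> simulation

    def send_data(self, step, data, timeout):
        """Called by the simulation; gives up after `timeout` seconds."""
        try:
            self._to_vis.put((step, data), timeout=timeout)
            return True
        except queue.Full:
            return False  # visualization too slow: drop the frame, keep computing

    def recv_steering(self, timeout):
        """Called by the simulation; returns None if nothing arrived in time."""
        try:
            return self._to_sim.get(timeout=timeout)
        except queue.Empty:
            return None

    def steer(self, **params):
        """Called by the visualization side to queue new parameter values."""
        self._to_sim.put(params)


def run_simulation(channel, steps, params, timeout=0.01):
    """Step-wise toy simulation that applies steering updates between steps."""
    x, history = 0.0, []
    for step in range(steps):
        update = channel.recv_steering(timeout)
        if update:
            params.update(update)            # change parameter values at run-time
        x += params["dt"] * params["velocity"]
        history.append(x)
        channel.send_data(step, x, timeout)  # interim result for online display
    return history
```

Because both calls return after at most the given timeout, a stalled visualization costs the simulation a bounded amount of time per step, which mirrors the design goal of not disturbing the simulation progress.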
Finally, the communication between the server and
clients can be secured by using Secure Shell (SSH) tun-
nels which makes it feasible to integrate VISIT as a com-
putational steering tool into Grid infrastructures. In fact,
the newly developed VISIT multiplexer and collaboration
server support collaborative visualization and steering ses-
sions as part of a recently developed framework that allows
geographically dispersed e-Scientists to share and steer one
Grid application together. Performance evaluations of this
framework can be found in [18]; they show that the VISIT
components scale. The same is expected for large-scale HPC
systems and will be investigated further in the future.
3. Xnbody as Scientific Visualization of N-body
Problems with Steering Capabilities
The real-time (online) visualization tool Xnbody [13] can
be used to illustrate the step-wise output of a Grid appli-
cation that represents a massively parallel simulation in the
context of N-body problems. In the context of this work it is
important that Xnbody integrates a VISIT server and is thus
able to receive data transfers from a parallel simulation that
is instrumented with a VISIT client. If the number of
particles is too high, VISIT can apply a compression
technique [12]. Xnbody is based on the VTK toolkit [21]
that in turn is used to create visualization idioms of the re-
ceived data. A visualization idiom is any specific sequence
of data enrichment and enhancement transformations, visu-
alization mappings, and rendering transformations that pro-
duce an abstract display of a scientific dataset [17].
Figure 2. Steering options within Xnbody.
As shown in Figure 2, Xnbody provides a Graphical User
Interface (GUI) for e-Scientists to view the online
visualization and to input steering commands that are
transported via the communication library from the
visualization to the simulation. The numerical parameters
shown, such as amplitude, spot size, target size/depth, and
others, are actually used in one of our case studies in the
field of plasma physics. These parameters influence the
computational process of the massively parallel simulation
at run-time; the data output changes accordingly and is
directly viewable in Xnbody in different visualization
areas that display a scientific dataset in 3D or 2D. Hence,
the visualization and the simulation represent a distributed
application that exchanges data via a bi-directional
connection. Of course, most parallel applications require
different steering GUIs for their parameters, which is
achieved by compiling Xnbody with different pre-compiler
directives.
Xnbody provides the functionality to illustrate the
step-wise progress of the simulations and also allows for the
control of the simulation through the definition of steering
parameters. This makes it possible to abort the simulation
early or to correct wrong behavior before it finishes, thus
avoiding useless computations. Also, the simulation can be
steered to focus on interesting areas of the simulation
field (e.g. a specific important parameter range). As a
consequence, the online visualization saves computational
time via direct interaction of the e-Scientists with the
simulation and thus also significantly reduces the amount of
data that has to be stored for later post-processing. In
addition, the time spent on post-processing is reduced,
because steered data often contains condensed useful data
instead of many intermediate results that do not explicitly
meet the expectations of the e-Scientist but would still be
stored for later analysis. This will decrease the impact of
the storage/IO barriers described above on petascale
systems, because the amount of data is significantly reduced.
The implementation of the Xnbody visualization implies
several requirements. First and foremost, the simulation
process should not be disturbed by any connected visualiza-
tion processes. In this context, a buffer-mechanism was in-
tegrated into the scientific visualization and is also provided
via VISIT to the parallel applications. The usage of these
buffers decreases the amount of data transfers, because the
data of the simulation is collected at the buffer of the simu-
lation and then transferred block-wise via the communica-
tion library to the visualization. At the visualization, these
data blocks are stored within another buffer to provide the
data to be visualized. As an example of their usage,
buffers allow trajectories of objects to be illustrated
efficiently, including the up-to-date position of each object.
This implies that the buffer allows for requests of datasets of
a specific time step to get the positions of the objects. A First
In First Out (FIFO) data structure is used for the implemen-
tation of the buffer since the data of the simulation is sent
chronologically. If the buffer for the scientific data is full,
the data of the oldest steps of the simulation are removed
from the buffer. Hence, the buffer consists of data over a
specified period of time. Finally, the buffer provides sig-
naling mechanisms to notify the visualization process when
new data arrives which in turn leads to the update of the
illustration of the data within the Xnbody GUI.
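The buffer behavior described above (chronological FIFO storage, removal of the oldest steps when full, and signaling on arrival of new data) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the Xnbody implementation: the class StepBuffer and its method names are hypothetical, and a condition variable stands in for whatever signaling mechanism Xnbody actually uses.

```python
import threading
from collections import deque


class StepBuffer:
    """Bounded FIFO of (timestep, data) entries with arrival signaling."""

    def __init__(self, capacity):
        self._steps = deque(maxlen=capacity)  # a full deque drops its oldest entry
        self._cond = threading.Condition()

    def push(self, timestep, data):
        """Store the data of one simulation step and wake up waiting readers."""
        with self._cond:
            self._steps.append((timestep, data))
            self._cond.notify_all()

    def wait_for_newer(self, last_seen, timeout=None):
        """Block until a step newer than `last_seen` exists; None on timeout."""
        def has_newer():
            return bool(self._steps) and self._steps[-1][0] > last_seen

        with self._cond:
            if self._cond.wait_for(has_newer, timeout):
                return self._steps[-1]
            return None

    def timesteps(self):
        """Timesteps currently held, oldest first."""
        with self._cond:
            return [t for t, _ in self._steps]
```

With a capacity of three, pushing steps 0 to 4 leaves steps 2, 3 and 4 in the buffer, so the buffer always covers the most recent period of simulated time.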
In some scenarios, especially in those that visualize the
trajectory of objects, the described block-wise sending of
the data can lead to a situation in which the buffer does not
contain data of a specific object for every time step. For
example, this is the case if a parallel simulation does not
compute the position of each object in each computational
step. In this scenario, Xnbody must compute several
positions of the object in advance by extrapolating from
older positions within the buffer. In other scenarios, the
user can use the buffer to navigate through the data of the
scientific simulation, for instance to look at an important
time period twice, e.g. when the simulation is paused. To sum
up, the buffer design should support two types of requests.
Firstly, a request for all objects at a specific time
(timestamp), where positions that are not included in the
data are approximated. Secondly, a request for all positions
of a specific object within the buffer in order to compute
trajectories more efficiently.
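The two request types can be illustrated with a short Python sketch. It assumes a buffer represented as a chronological list of (timestep, {object id: position}) entries and uses linear extrapolation from the two most recent samples; the function names, the data layout and the choice of linear extrapolation are assumptions made for this example, since the text does not specify the scheme Xnbody uses.

```python
def trajectory(buffer, obj):
    """Second request type: all buffered (time, position) samples of one object."""
    return [(t, objs[obj]) for t, objs in buffer if obj in objs]


def positions_at(buffer, time):
    """First request type: positions of all objects at `time`.

    Objects without a sample at `time` are extrapolated linearly from their
    two most recent earlier samples (or held fixed if only one exists).
    """
    ids = {obj for _, objs in buffer for obj in objs}
    result = {}
    for obj in ids:
        samples = [(t, p) for t, p in trajectory(buffer, obj) if t <= time]
        if not samples:
            continue  # object not yet known at this time
        t1, p1 = samples[-1]
        if t1 == time or len(samples) < 2:
            result[obj] = p1
        else:
            t0, p0 = samples[-2]
            w = (time - t1) / (t1 - t0)  # how far past the last sample we are
            result[obj] = tuple(b + w * (b - a) for a, b in zip(p0, p1))
    return result
```

For instance, if object "b" was last reported at steps 0 and 1 but not at step 2, positions_at returns its position continued along the line through the two known samples.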
For communication, the visualization Xnbody interacts
with the communication library VISIT for the data ex-
change. During that process, the visualization must also
accept user inputs from e-Scientists, for instance to influ-
ence the next computational step of the simulation via new
steering parameters. This is realized by a thread-oriented
design where one thread listens to buffer events and thus to
the arrival of new data to be visualized and another thread
listens to the inputs of the end-user. These parallel activities
are typically well supported by visualization technologies
(e.g. socket events in VTK). To sum up, the visualization
must handle two events. Firstly, the inputs of a user (e.g.
new steering parameters, mouse clicks on specific objects)
within the visualization. Secondly, the visualization must
cope with events in the context of communication that
indicate that new data is within the buffer to be visualized.
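The thread-oriented design can be sketched in a few lines of Python. The sketch assumes that data arrivals and user inputs are delivered through two queues and that a None entry signals shutdown; the function run_visualization and this queue-based event model are hypothetical simplifications and do not reflect the actual Xnbody or VTK event handling.

```python
import queue
import threading


def run_visualization(data_events, user_events, steering_out):
    """Run one listener thread per event source until both receive None."""
    rendered = []

    def data_listener():
        # Reacts to buffer events: new simulation data to be displayed.
        while True:
            item = data_events.get()
            if item is None:
                break
            rendered.append(item)  # stand-in for redrawing the scene

    def input_listener():
        # Reacts to user input: forwards steering commands to the simulation.
        while True:
            command = user_events.get()
            if command is None:
                break
            steering_out.put(command)

    threads = [
        threading.Thread(target=data_listener),
        threading.Thread(target=input_listener),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return rendered
```

Keeping the two listeners separate means that a burst of incoming data does not delay the forwarding of a steering command, and vice versa.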
Xnbody uses the visualization technology VTK that
transforms the scientific data into visual representations or
visualization idioms [17] with different shapes and colors.
Basically, it uses the functionality of OpenGL for 3D
graphics, but offers interfaces and functions on a higher
level of abstraction, including visualization algorithms. The
visualization idioms and their integration into the Xnbody
GUI allow users to set up the view on the data. There is a
wide variety of representations of scientific data related to
N-body
problems, for instance computed objects can be represented
as balls or just points. Furthermore, their trajectory or their
velocity can be represented with arrows. On the other hand,
arrows are also often useful to indicate different kinds of
forces that influence objects. Other attributes of scientific
data, for instance the density of objects, can be represented
by different colors or object sizes. Finally, these graphical
schematic representations can be augmented with number
scales or other additional information such as the number
of objects shown in the graphical representation (e.g. 1
million particles or electrons). To sum up, the integrated
visualization technology typically supports this kind of
visual representation and the functionality to create
visualization idioms. All in all, the Xnbody visualization is
a very important tool during scientific analysis since it
shapes how scientists and end-users perceive the data.
Finally, Xnbody is used in conjunction with several par-
allel applications related to N-body problems such as the
massively parallel codes nbody6++ and PEPC that are both
described in more detail as steering case studies in the next
sections. In addition, Xnbody can be seamlessly integrated
into e-Science infrastructures that use the UNICORE Grid
middleware [10] as a secure access method (see Section 6).
4. Case Study: Scalable Parallel nbody6++
Code used in the Astrophysics Community
N-body problems appear in many scientific areas such
as astrophysics, plasma-physics, molecular dynamics, and
fluid dynamics. Today, such problems are commonly solved
by using divide-and-conquer mechanisms or parallel com-
puting techniques. Nbody6++ [22] is a parallel variant of
the Aarseth-type N-body code nbody6 suitable for N-body
simulations on massively parallel resources within HPC
driven e-Science infrastructures such as DEISA or Tera-
Grid. N-body problems are concerned with determining
the effect of forces between so-called bodies [23], often
also called particles. Different subversions of the main
code Nbody6++ have been developed at the Astronomisches
Rechen-Institut (ARI) at the University of Heidelberg for
applications in different areas of astrophysical research.
These areas include the dynamics of star clusters in
galaxies and their centres, the formation of planetary
systems and the dynamical evolution of planetesimals. In
addition, other areas such as the dynamics of galactic
nuclei containing black holes and Post-Newtonian
corrections to classical Newtonian gravity are covered by
the code. For upcoming peta-scale systems, it is expected
that the parallel code will scale to many more CPUs than are
used today, provided that the computing speed and the
communication speed between nodes are sufficient. Hence,
when the code runs on many more CPUs, more data will be
generated, and it becomes more important to significantly
reduce the amount of data via steering.
call flvisit_nbody2_steering_recv( ...
... ACTIVE.eq.1) Then
... 0,& MPI_...
Figure 3. Instrumentation of the Nbody6++
parallel code with the VISIT library to receive
new steering parameters to update specific
values (e.g. DTADJ and DELTAT) and finally
broadcast them to other nodes via MPI.
The instrumentation of the FORTRAN-based parallel
nbody6++ code is shown in Figure 3. In particular, the
code provides a step-wise approach that is necessary for
online visualization and to allow new steering parameters
to change the nbody6++ simulation during its run-time.
The instrumentation with VISIT enables the online
visualization of intermediate scientific data during its
computation on a parallel computer within e-Science
infrastructures. The ARI in Heidelberg uses Xnbody and VISIT
with nbody6++ to follow the orbital evolution of the
simulated many-particle systems by eye while the simulation
is running on a parallel computer. An e-Scientist is
therefore able to correct wrong behavior of the simulation
process immediately by changing steering parameters within
the Xnbody GUI. These computational steering parameters make
it possible to change data output parameters (e.g. data
output intervals) and physical parameters (e.g. position of
black holes) in a running simulation instantaneously.
Finally, they allow for a soft termination of the parallel
nbody6++ simulation without loss of already computed data.
By using the massively parallel code in conjunction
with Xnbody and the underlying VISIT communication and
steering library, the efficiency of the astrophysics team at
the ARI of the University of Heidelberg has been improved,
and the combination of technologies is also very useful for
the presentation of scientific results. Moreover, as ARI is
part of the German national Grid AstroGrid-D science
community, the newly developed support for collaborative
sessions with Xnbody and VISIT within e-Science
infrastructures such as D-Grid allows new forms of
collaboration and experience exchange with geographically
dispersed e-Scientists within AstroGrid-D. Finally, Figure 4
shows the usage of Xnbody with the nbody6++ massively
parallel code.
Figure 4. Steering nbody6++ with Xnbody.
5. Case Study: Scalable PEPC Solver used in
the Plasmaphysics Community
Another case study in the area of e-Science applications
that represent N-body problems concerns several routines of
the Pretty Efficient Parallel Coulomb solver (PEPC) project
[16]. The massively parallel code uses a hierarchical tree
algorithm to perform potential and force summation of N
charged particles in a time O(N log N), allowing mesh-free
particle simulation on length- and time-scales usually
possible only with particle-in-cell or hydrodynamic
techniques. The parallel routines can be used for the
simulation of a particle accelerator via laser pulses. In the
context of this work it is important that the routines of the
PEPC code are augmented with a VISIT client and thus provide
data in a step-wise fashion for online visualization, for
instance by using the Xnbody scientific visualization. In
addition, the Fortran code in Figure 5 shows that at the
beginning of each computational step new steering parameters
are used by calling the method vis_control, which is sketched
in Figure 6. Afterwards, vis_(vec)fields_nbody sends field
data to VISIT for visualization of scalar volumetric data.
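To give a feeling for what the O(N log N) scaling means at the particle numbers mentioned in this section, the following Python sketch compares the number of pair interactions in a direct summation with an N log N operation count. The function names are hypothetical, the base-2 logarithm is an arbitrary choice, and constant factors of the tree algorithm are ignored, so the result is only an order-of-magnitude indication and not a measurement of PEPC.

```python
import math


def direct_pairs(n):
    """Number of distinct particle pairs evaluated by direct summation."""
    return n * (n - 1) // 2


def tree_operations(n):
    """N log2 N, the growth rate of a hierarchical tree algorithm
    (constant factors deliberately omitted)."""
    return n * math.log2(n)


def speedup_estimate(n):
    """Ratio of the two counts; an order-of-magnitude indication only."""
    return direct_pairs(n) / tree_operations(n)
```

For N = 30 million particles, direct summation involves roughly 4.5e14 pairs per step, whereas N log2 N is below 1e9, which is why a tree algorithm makes mesh-free simulation feasible at this scale.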
if ( mod(itime,ivis) == 0 ) then
   call vis_parts_nbody(vcount)
   vcount = vcount + 1
...
if ( mod(itime,min(ivis,ivis_... .and. steering) call vis_...
if ( mod(itime,ivis_fields) == 0 ) then
   call vis_fields_nbody(itime+itime_start)
   call vis_vecfields_nbody(itime+itime_start)
...
Figure 5. The step-wise process of PEPC-B
uses steering parameters.
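The interval logic of Figure 5 can be restated as a small Python sketch. The names itime, ivis, ivis_fields and steering follow the listing; the function plan_step, the action labels, and in particular the exact condition that guards the steering call (which is truncated in the listing) are assumptions made for illustration.

```python
def plan_step(itime, ivis, ivis_fields, steering):
    """Return which visualization/steering actions fire at time step `itime`."""
    actions = []
    if itime % ivis == 0:
        actions.append("send_particles")       # particle data every ivis steps
    # Assumed guard: check for steering input at the shorter of both intervals.
    if steering and itime % min(ivis, ivis_fields) == 0:
        actions.append("steering_control")
    if itime % ivis_fields == 0:
        actions.append("send_fields")          # field data every ivis_fields steps
    return actions
```

With ivis = 2 and ivis_fields = 4, particle data is sent every second step and field data every fourth, so the amount of transferred data can be tuned independently for each kind of output.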
The Fortran 90 code PEPC-B, which represents the
magnetoinductive version of PEPC, is realized in a step-wise
fashion. In particular, Figure 6 illustrates the
instrumentation of the code with VISIT for steering, which
realizes an interactive control of the laser target
configuration. In a similar manner, the code is instrumented
to realize the beam control and thus provides interactive
control of particle sources. In addition to steering, VISIT
is instrumented into the code to send field data for surface
visualization or for visualization of scalar volumetric
data. Finally, the particle data itself is sent to VISIT for
real-time visualization. All in all, these code parts
demonstrate that steering parameters are usually very
specific to the scientific code, but helpful. In the context
of upcoming petascale systems, it is clear that any
many-body code that attempts to model macroscopic systems
will benefit from higher particle numbers to either (a)
improve the statistics, thus better reproducing the
mathematical set of equations used to describe the system,
or (b) permit simulation of larger, perhaps more realistic
systems. For many of these systems (plasma devices, large
proteins, galaxies, ionic crystals), there is no upper limit
on the number of simulation particles. PEPC currently runs
with 20-30 million particles on 2048 processors. This is
close to the scalability limit of the present algorithm,
which requires local copies of certain global data and is
therefore memory bound. In principle, this could be overcome
with some code restructuring, which should permit scaling up
to 10k processors. Hence, the impact of computational
steering on decreasing data sizes is lower here, since the
number of CPUs used is currently not expected to move
towards petascale systems.
subroutine vis_control
  integer :: isteer1=0,isteer2=0,isteer3=0,...
  ...8 :: dsteer1,dsteer2,dsteer3,dsteer4
  call flvisit_nbody2_check_conn(lvisit_active)
  call flvisit_nbody2_steering_recv(dsteer1,...
  ..._beam = dsteer1
  ..._beam = dsteer4
  call MPI_BCAST( th_beam, 1, MPI_REAL, 0, ...
Figure 6. Interactive control of the particle
source in PEPC code with VISIT.
The instrumented steering capabilities are valuable in
practice, and Xnbody provides immediate visual insight into
the computational process. This is useful to verify start-up parameters
such as the initial alignment of laser and target, or to per-
form quick trial runs with reduced numbers of particles as a
prelude to full-blown production scenarios. Figure 7 shows
the use of Xnbody in conjunction with the PEPC parallel
code (see Figure 2 for PEPC steering GUI).
Figure 7. Steering PEPC with Xnbody.
6. Integration into e-Science Infrastructures
In order to seamlessly integrate the presented visualiza-
tion and steering tools within e-Science infrastructures, we
developed the COVS framework [18]. This framework
basically provides secure access to Grid resources, enabling
the computational steering of e-Science applications
(parallel simulations) as well as the visualization of their
outcome and the sharing of the results with multiple
collaborators. It also allows for connection performance
measurements. In many e-Science infrastructures, a massively
parallel simulation like nbody6++ or PEPC runs on a Grid
resource such as a supercomputer that is typically not
locally available to the broader scientific community.
Therefore, the parallel simulations and Xnbody
visualizations are often geographically dispersed and thus
raise the demand for a secure and fast transfer of
scientific data, especially when applying the online
visualization technique in conjunction with computational
steering.
To seamlessly integrate such a secure data transfer in
common e-Science and Grid environments, a Grid middle-
ware is used as one of the fundamental building blocks
of the COVS framework. The COVS framework imple-
mentation presented here relies on the Web services-based
UNICORE 6 Grid middleware [10] since it represents a
rather lightweight solution and is in fact a ready-to-run
Grid system freely available as open source under the BSD
license on SourceForge. UNICORE is the Grid middleware of the
HPC-driven DEISA supercomputing e-Science infrastruc-
ture and thus it is reasonable to implement the COVS frame-
work for HPC applications in a middleware that is already
used in such HPC-based environments.
Figure 8. COVS with VISIT and UNICORE.
As shown in Figure 8, UNICORE allows for the submission
of parallel simulations through an interface that accepts
job descriptions in the standardized JSDL format [14]. Fig-
ure 8 also illustrates the newly developed COVS Grid ser-
vices that are used to manage collaborative VISIT-based vi-
sualization sessions and that are further used to establish
secure channels based on an SSH public key exchange via
Grid technologies. In more detail, the COVS Grid services
are implemented as higher-level services to benefit from the
strong security infrastructure UNICORE provides. In par-
ticular, the COVS Grid services are used to interact with
the VISIT communication library to exchange pieces of in-
formation about participants and the submitted simulation
jobs. In addition, such pieces of information include contact
and control information that the COVS Grid services within
UNICORE use to control the wrapped VISIT multiplexer
and collaboration server, for instance, the start and stop of
the data-flow to visualizations. Another core building block
of the COVS framework is the Grid client. This technol-
ogy is used by e-Scientists to submit a simulation job to the
UNICORE Grid middleware or to manage a COVS session.
A suitable Grid client suite that submits computational jobs
to UNICORE using open connection technologies is the
open source Grid Programming Environment ( GPE) Client
[20] from Intel . By using this Grid client, the whole COVS
framework implementation is still flexible since GPE clients
allow job submissions to UNICORE, Globus Toolkits or
even gLite-based systems in the near future. Hence, this
client can be used with many Grid middleware systems cur-
rently available and thus leads to a highly flexible client im-
plementation that is not explicitly bound to one dedicated
Grid middleware system.
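The start and stop control of the data flow described above
can be pictured with a small fan-out sketch. It is only an
illustration under stated assumptions: the class and method
names (Multiplexer, Viewer, join, start, stop, publish) are
hypothetical and are not the VISIT or COVS API, and frames
are kept in memory instead of being sent over a network.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Viewer:
    """Hypothetical stand-in for one connected visualization client."""
    name: str
    received: List[bytes] = field(default_factory=list)
    active: bool = False  # data flows only while the viewer is started


class Multiplexer:
    """Toy fan-out: forwards each simulation frame to all active viewers."""

    def __init__(self) -> None:
        self.viewers: Dict[str, Viewer] = {}

    def join(self, name: str) -> None:
        # Register a participant; no data is sent until start() is called.
        self.viewers[name] = Viewer(name)

    def start(self, name: str) -> None:
        self.viewers[name].active = True

    def stop(self, name: str) -> None:
        self.viewers[name].active = False

    def publish(self, frame: bytes) -> int:
        """Deliver one frame to every active viewer; return the delivery count."""
        delivered = 0
        for viewer in self.viewers.values():
            if viewer.active:
                viewer.received.append(frame)
                delivered += 1
        return delivered


mux = Multiplexer()
mux.join("alice")
mux.join("bob")
mux.start("alice")
first = mux.publish(b"frame-1")   # only alice is started
mux.start("bob")
second = mux.publish(b"frame-2")  # both viewers receive this frame
mux.stop("alice")
third = mux.publish(b"frame-3")   # only bob remains active
print(first, second, third)
```

In this sketch a management layer only flips the per-viewer
flag, while the data path itself stays untouched, which
mirrors the separation between session control and data
transfer described in the text.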
Finally, the decoupling of the Grid client from the
underlying Grid middleware described above is one of the
fundamental benefits of the COVS framework compared to
other, rather inflexible approaches in Grids that are
tightly coupled with one specific Grid middleware. Using the
extension mechanisms of this client, the COVS Client plugin
represents another core building block of the COVS framework
and is used to exchange security and contact information
with the communication library VISIT integrated into the
Xnbody visualization. This functionality retains the single
sign-on feature on the client side and provides the
communication library with contact details about the Grid
sites that run a parallel simulation. The implementation of
such a COVS Client plugin is based on the GPE GridBeans
technology [20] and is named COVS GridBean. It provides
e-Scientists with a GUI to access and control the state of a
collaborative visualization session. To do so, the COVS
GridBean uses the GPE Client framework to access the
underlying Grid middleware UNICORE through open standards
such as WS-RF [5] and JSDL [14]. Please refer to [18] for
more details about COVS.
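To give an impression of what a JSDL job description for a
parallel simulation might contain, the following sketch
builds a minimal document with the Python standard library.
The element names and the two namespaces follow JSDL 1.0
[14]; the job name, executable path and argument are
hypothetical placeholders, and a production job submitted to
UNICORE would typically carry further elements (for example
resource requirements) that are omitted here.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the JSDL 1.0 specification.
JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"
ET.register_namespace("jsdl", JSDL)
ET.register_namespace("jsdl-posix", POSIX)


def build_jsdl(job_name, executable, arguments):
    """Return a minimal JSDL job definition as an XML string."""
    job = ET.Element(f"{{{JSDL}}}JobDefinition")
    desc = ET.SubElement(job, f"{{{JSDL}}}JobDescription")
    ident = ET.SubElement(desc, f"{{{JSDL}}}JobIdentification")
    ET.SubElement(ident, f"{{{JSDL}}}JobName").text = job_name
    app = ET.SubElement(desc, f"{{{JSDL}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX}}}Executable").text = executable
    for arg in arguments:
        ET.SubElement(posix, f"{{{POSIX}}}Argument").text = arg
    return ET.tostring(job, encoding="unicode")


# Hypothetical values, chosen only for illustration.
doc = build_jsdl("nbody-steering-demo", "/path/to/nbody6++", ["input.dat"])
print(doc)
```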
7. Related Work
There are various projects that conduct research on
computational steering technologies and their often related
visualization approaches, such as the framework of Brodlie
et al. described in [15]. While this framework is more
high-level, more concrete and flexible solutions are still
rather rare; this paper presents one. Also, SCIRun [19] and
the UK RealityGrid project [8] focus on how e-Scientists can
make effective use of a Grid and its visualizations. The
best-known work in RealityGrid is its steering library,
which provides calls that can be embedded into each of the
three components: simulation, visualization, and a steering
client. The difference to our approach is that it is tightly
integrated with the Imperial College e-Science Networked
Infrastructure (ICENI) middleware, while we provide a more
loosely coupled design to support multiple Grid
8. Conclusions
This paper describes how computational steering and online
visualization tools can be used within today's and future
e-Science infrastructures. With computational steering,
e-Scientists have a tool that eases the challenges of
IO/memory barriers in future peta-scale HPC systems, which
rely on multi-core technologies, because they are able to
focus on areas of interest during the computational process
of a Grid application. This will become increasingly
important as the number of CPUs increases, e.g. within
systems that are part of the DEISA or TeraGrid
infrastructure. Hence, e-Scientists that use the
VISIT-based COVS framework tool are able to modify key
application parameters at run time with immediate visual
feedback. The actual benefit is twofold. The central idea
is to avoid unnecessary computations and the associated
storage of checkpoints and/or final result data by directly
guiding the computation to parameter subspaces of inter-
est while circumnavigating other parts that otherwise would
have to be processed. This is accomplished by providing
the user with the ability to dynamically modify parameters
of an ongoing computation and to receive immediate visual
feedback on the effects of this modification.
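The feedback cycle summarized above can be sketched as a toy
time-stepping loop. This is a minimal illustration and not
the COVS or VISIT implementation: the function names, the
steerable parameter "rate" and the decision rule are all
hypothetical, and the callback stands in for the combination
of online visualization and user interaction.

```python
def run_simulation(steps, on_sample):
    """Toy loop: compute, publish a sample, apply steering updates."""
    params = {"rate": 1.0}  # hypothetical steerable parameter
    value = 0.0
    history = []
    for step in range(steps):
        value += params["rate"]           # one "compute" step
        history.append(value)
        updates = on_sample(step, value)  # visual feedback to the user
        params.update(updates)            # steering takes effect next step
    return history


def scientist(step, value):
    """Hypothetical steering decision: halt growth once value reaches 3."""
    return {"rate": 0.0} if value >= 3.0 else {}


history = run_simulation(6, scientist)
print(history)
```

The loop never restarts: the parameter change arrives while
the run is in progress and only affects the remaining steps,
which is the property that lets steering skip computations
and stored output that turn out to be of no interest.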
The integration of the VISIT toolkit within the Grid
middleware UNICORE can be used by all visualizations and
simulations that rely on the VISIT communication library. We
described an approach where binary protocol-based VISIT data
transfers are secured with SSH connections, because
firewalls on supercomputers and clusters typically allow
connections via SSH. In addition, this approach provides
much better performance than using XML-based Web services
message exchanges for large data transfers. However, Web
services are well suited for the management of connections
and for controlling COVS sessions that enable e-Scientists
to seamlessly steer their Grid applications.
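One factor that plausibly contributes to this performance
difference is message size. The sketch below is an
assumption-based illustration, not a measurement from this
paper: it compares a raw binary encoding of floating-point
values with the same bytes base64-encoded inside a
hypothetical XML element. Parsing cost and protocol
envelopes, which would add further overhead, are ignored.

```python
import base64
import struct


def binary_payload(values):
    """Pack values as little-endian 64-bit floats (8 bytes each)."""
    return struct.pack(f"<{len(values)}d", *values)


def xml_payload(values):
    """Wrap the same bytes, base64-encoded, in a hypothetical XML element."""
    encoded = base64.b64encode(binary_payload(values)).decode("ascii")
    return f'<data encoding="base64">{encoded}</data>'.encode("ascii")


values = [float(i) for i in range(1000)]
raw = binary_payload(values)
wrapped = xml_payload(values)
print(len(raw), len(wrapped))
```

Base64 alone expands the data by about one third, so the
text-based message is larger than the binary one even before
any surrounding message envelope is added.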
Finally, the open source implementation of the COVS
framework was shown in conjunction with the massively
parallel codes nbody6++ and PEPC at numerous events such as
OGF18, the Supercomputing 2006 conference, and OGF19.
Furthermore, the tools presented here were demonstrated to
end-users in DEISA at DEISA training events and recently at
the International Supercomputing Conference 2007.
[1] AVS/Express.
[2] China-Grid.
[3] DEISA.
[4] EGEE.
[5] OASIS - WSRF Technical Committee. http://www.oasis-open.org/committees/wsrf
[6] Open Science Grid - OSG.
[7] Pacific Rim Applications and Grid Middleware Assembly -
[8] RealityGrid.
[9] TeraGrid.
[11] VI-HPS.
[12] VISIT.
[13] XNBODY.
[14] A. Anjomshoaa, M. Drescher, D. Fellows, S. McGough,
D. Pulsipher, and A. Savva. Job Submission Description
Language (JSDL) - Specification Version 1.0. Open Grid
Forum Proposed Recommendation, 2006.
[15] K. Brodlie et al. Distributed and Collaborative
Visualization. In F. Berman, G. C. Fox, and A. J. G. Hey, editors,
Computer Graphics Forum, Volume 23, 2004.
[16] P. Gibbon et al. Performance Analysis and Visualization of
the N-body tree code PEPC on massively parallel comput-
ers. In Proc. of the ParCo, Malaga, Spain, 2005.
[17] R. B. Haber et al. Visualization idioms: A conceptual model
for scientific visualization systems. In Visualization in Sci-
entific Computing, pages 74–93, 1990.
[18] M. Riedel et al. Design and Evaluation of a Collaborative
Online Visualization and Steering Framework Implementation
for Computational Grids. In Proc. of the 8th IEEE/ACM
Int. Conf. on Grid Comp., Austin, USA, 2007.
[19] S. Parker and C. Johnson. SCIRun: A scientific pro-
gramming environment for computational steering. In
H. Meuer, editor, Proceedings of Supercomputer 95 (New
York). Springer Verlag, 1995.
[20] R. Ratering et al. GridBeans: Supporting e-Science and Grid
Applications. In 2nd IEEE Int. Conf. on e-Science and Grid
Comp. (e-Science 2006), Amsterdam, Netherlands, 2006.
[21] W. Schroeder, K. Martin, and B. Lorensen. The Visualization
Toolkit - An Object-Oriented Approach to 3D Graphics, 3rd
Edition. Kitware Inc., 2002. ISBN 1-930934-07-6.
[22] R. Spurzem et al. Nbody6 features. 2003. ftp://ftp.ari.uni-
[23] B. Wilkinson and M. Allen. Parallel Programming. Prentice
Hall, 1999. ISBN 0-13-671710-1.