e-Science Infrastructure Integration Invariants to Enable
HTC and HPC Interoperability Applications
M. Riedel, M.S. Memon,
A.S. Memon, D. Mallmann, Th. Lippert
Jülich Supercomputing Centre
Forschungszentrum Jülich
Jülich, Germany
m.riedel@fz-juelich.de
D. Kranzlmüller
Ludwig Maximilians University Munich
Munich, Germany
A. Streit
Karlsruhe Institute of Technology
Karlsruhe, Germany
Abstract — During the past decade, significant international and interdisciplinary research has increasingly been carried out by global collaborations that often share resources within a single production e-science infrastructure. More recently, the increasing complexity of e-science applications has led them to embrace multiple physical models (i.e. multi-physics) and to consider longer and more detailed simulation runs as well as a larger range of scales (i.e. multi-scale). This increase in complexity creates a steadily growing demand for cross-infrastructure operations that take advantage of multiple e-science infrastructures with a wider variety of resource types. Since interoperable e-science infrastructures are still not seamlessly provided today, we proposed in earlier work the Infrastructure Interoperability Reference Model (IIRM), which represents a trimmed-down version of the Open Grid Services Architecture (OGSA) in terms of functionality and complexity, while being more specifically tailored to production use and thus easier to implement. This contribution focuses on several important reference model invariants that are often neglected when infrastructure integration activities are performed, thus hindering seamless interoperability in many respects. In order to demonstrate the relevance of our invariant definitions, we provide insights into two accompanying cross-infrastructure use cases from the bio-informatics and fusion science domains.
Keywords: HPC; HTC; Interoperability; Reference Model
I. INTRODUCTION
Computational simulation, and thus scientific computing, is well accepted alongside theory and experiment in today's science. The term 'e-science' [20] evolved to describe a new research field that focuses on collaboration in key areas of science using so-called next-generation data and computing infrastructures (i.e. e-science infrastructures) to extend the potential of scientific computing. A wide variety of e-science applications already take advantage of the various e-science infrastructures that have evolved over the last couple of years into production research environments, in the sense of being successfully used as tools by scientists on a daily basis. The initial reference model for such e-science infrastructures (aka Grids), the Open Grid Services Architecture (OGSA), was originally defined by Foster et al. in 2003 [21], but over the last decade we have observed only a slow adoption rate, so that OGSA has not yet become a common basis for such production infrastructures in Europe and beyond. This absence of a realistically implementable and community-accepted reference model led to the different 'non-interoperable architectures' of production e-science infrastructures of the last decade, such as EGEE/EGI [22], NorduGrid [26], DEISA [23], or PRACE [24]. This lack of interoperability is a hindrance, since we observe a growing interest in the seamless use of multiple infrastructures from a single client [27, 28, 29]. We therefore introduced in earlier work our Infrastructure Interoperability Reference Model (IIRM) [25, 11, 30, 41] to overcome these limitations in e-science interoperability.
In this contribution, we present a set of invariants that complement the reference model with respect to infrastructure integration issues, thus defining several relevant constraints in the areas of information exchange, accounting, and attribute-based authorization. Reference model adoptions that satisfy our proposed invariants will significantly increase their chances of interoperating with other existing pan-European e-science infrastructures.
The remainder of this contribution is structured as follows. After the introduction, we review our infrastructure interoperability reference model and present its core building blocks in Section 2. Section 3 then provides details about an invariant concerning resource and service information, while Section 4 gives insights into an invariant related to accounting and the tracking of resource usage. Section 5 explains why attribute-based authorization also requires an important agreement on common attributes, before Section 6 introduces the accompanying use cases that demonstrate the significance of the invariants for operational interoperability. In Section 7 we survey related work, and the paper ends with some concluding remarks.
II. INFRASTRUCTURE REFERENCE MODEL
Although OGSA represents a good architectural blueprint
for e-science infrastructures in general, we argue that the
scope of OGSA is too broad to be realistically
implementable in production e-science infrastructures today.
Therefore, the fundamental idea of our e-science
Infrastructure Interoperability Reference Model (IIRM) [25]
is to formulate a well-defined set of emerging open
standards and some small refinements of them in order to
address the interoperability needs of state-of-the-art
production e-science infrastructures. In order to identify this well-defined set of standards, we have worked with many scientific interoperability use cases [40, 27, 28, 29, 41] over the years. Based on these efforts, the lessons learned about the most crucial functionality led to the core building blocks of the reference model design shown in Figure 1.
Figure 1. The infrastructure interoperability reference model core building blocks in the context of science-area-specific clients, HTC-driven and HPC-driven applications, as well as production e-science infrastructures.
As shown in Figure 1, the core building blocks are well embedded in the typical environments of e-science infrastructures, with different types of clients accessing them in order to execute different types of applications that are typically based on different computing paradigms. These computing paradigms are High Throughput Computing (HTC), which is used for applications that are embarrassingly parallel, and High Performance Computing (HPC), which is used for applications that require a good interconnection between cores and tackle a massively parallel problem. Both create and re-use forms of shared scientific data within e-science infrastructures today. In most cases, different production Grid infrastructures exist to satisfy these demands: HPC-driven Grid infrastructures (i.e. DEISA, PRACE) with large-scale HPC resources, and HTC-driven Grids (i.e. EGEE/EGI, NorduGrid) with a large number of smaller clusters or PC pools. Note that EGEE has essentially completed its transformation into the more recent EGI, but since several of the research results were obtained on the comparable EGEE infrastructure, we keep the EGEE/EGI notation here.
Reviewing Figure 1, the most crucial functionality identified for actually enabling interoperability between production e-science infrastructures is data management and control, including data transfer, as well as job management and control. In the context of the latter, the execution environment is also important, for instance in order to have common environment variables that describe different computational resource conditions (e.g. available memory, cores, etc.) for use by the applications themselves during run-time. Apart from this functionality, there are two special kinds of design elements, security and information, which are both part of this contribution but are described in other respects in [25]. A common security setup is in most cases the major showstopper for enabling interoperability between Grid infrastructures, and since it affects basically every layer it can be considered a rather vertical building block (cp. Figure 1). The same holds for information, which refers to up-to-date information about the infrastructure resources. These pieces of information range from the number of available CPUs and cores and the free disk space to lists of supported applications and services. This motivates us to overcome the limitations of both information and security in order to reach more stable interoperability setups, based on clear invariants of the reference model design that we introduce in this paper.
A full description of our Web services (WS)-based reference model is out of the scope of this contribution; we only highlight the core building blocks here and refer to other sources such as [25, 11, 30, 41] for more details. In short, the core building blocks are the OGF standard Storage Resource Manager (SRM) [9] data control and management interface, which makes use of GridFTP [19] for wide-area data transfers. Another core building block, for job management and control, is represented by improved versions of the OGSA Basic Execution Service (OGSA-BES) [32] and the Job Submission Description Language (JSDL) [17]. The execution environment (e.g. common environment variables) refers to standardization efforts started within the GIN community group [40] that have not yet been finalized as a proposed standardization document; we also consider accounting via the Usage Record Format (UR) [5] in this context. The information standard that plays a very important role in the design model is the GLUE2 [1] building block. Finally, the security building blocks refer to a broader range of HTTP-based authentication and attribute-based authorization methods, essentially based on X.509 proxies, the Security Assertion Markup Language (SAML) [6], and the eXtensible Access Control Markup Language (XACML) [7].
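To illustrate the idea of a common execution environment, the following minimal sketch shows how an application could read such resource conditions at run-time from common environment variables; since the GIN-initiated standardization of these names has not been finalized, the variable names used here are purely hypothetical placeholders.

```python
import os

# Hypothetical common environment variable names; placeholders for
# whatever a finalized GIN/OGF execution-environment profile defines.
cores = int(os.environ.get("GRID_NUM_CORES", "1"))
memory_mb = int(os.environ.get("GRID_MEMORY_PER_CORE_MB", "1024"))
scratch_dir = os.environ.get("GRID_SCRATCH_DIR", "/tmp")

# The application adapts its run-time behaviour to the resource
# conditions described by the (common) execution environment.
print(f"Running on {cores} cores, {memory_mb} MB per core, "
      f"scratch space at {scratch_dir}")
```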
The majority of the concepts behind this reference model have already been adopted, or are currently being adopted, by gLite [35], ARC [33], UNICORE [34], and dCache [43] (as an SRM implementation) through the European Middleware Initiative (EMI) project [42] in which we are involved.
III. GLOBAL INFORMATION INVARIANT
The first infrastructure integration element of the reference
model is the Global Information Invariant (GII) that defines
the overall cross-infrastructure information ecosystem. This
section describes the fundamental information exchange that
occurs during cross-infrastructure operation in terms of this
invariant and the reference model building blocks that are
used in context. The intent is not to provide an in-depth
view of the component information mechanism, but instead
to give a view from the user’s perspective of the information
aspects to be provided by architectures that adopt our
reference model. To keep this section simple, we
concentrate solely on information exchanges and omit
descriptions of functional component behavior.
The global information invariant defines, at the highest level, the abstract 'service and resource information property' preserved by all the infrastructure services in particular and by the whole interoperable infrastructure ecosystem in general. We define this invariant as follows:
Definition 1 (Global Information Invariant)
Components being part of the interoperable network of
services that make up the interoperable e-science
infrastructures are constrained by the common information
exchange policy that mandates the use of a particular
common information schema.
As shown in Figure 2, we have defined an information exchange policy that mandates the use of GLUE2 [1] as the common information schema. Any relevant production e-science infrastructure needs to satisfy this invariant in order to be compliant with our reference model design and thus achieve 'basic semantic interoperability' with other infrastructures. This avoids the semantic loss of information that occurs when different information schemas are used in interoperability setups.
Figure 2. The Global Information Invariant ensures with a common
GLUE2-based constraint that all services of the infrastructure reference
model can exchange information without any semantic loss.
Although such an invariant in general, and the use of a common information schema like GLUE2 in particular, may sound trivial, we observed in the past that this was not the case in the European production e-science infrastructures EGEE/EGI, NorduGrid, and DEISA/PRACE. In [2], Field et al. describe the use of a NorduGrid schema for the NorduGrid infrastructure, while the EGEE infrastructure is largely based on the proprietary GLUE1.3 information schema. In parallel, the DEISA infrastructure used UNICORE services [10] that described resources via the so-called Common Information Model (CIM) [4] of the Distributed Management Task Force (DMTF). The findings of these studies reveal the need for a common information schema (not CIM) that is accepted by the wider community.
We build on the findings of the aforementioned research and define the use of GLUE2 to satisfy our global information invariant, in order to enhance interoperability between production e-science infrastructures in Europe. As a consequence, we need to set the constraint that reference model services that enable Grid job submission to computational resources (i.e. OGSA-BES) expose GLUE2-compliant information, as shown in Figure 2. In earlier work [3], we have already shown that OGSA-BES can be augmented with GLUE2 to expose standard-compliant information about computational resources (i.e. number of cores, etc.) instead of exposing its own proprietary information schema. In addition, it is essential that the exposed GLUE2 elements are in turn re-used for job submission to services as part of the JSDL job description schema. This is necessary so that the resource requirements in the JSDL document can be matched with the actual resource information published by the corresponding services, which in turn avoids semantic mismatches. The details of how this can be done were published at last year's workshop [11], so we concentrate here on the higher-level impacts.
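As a purely illustrative sketch of this re-use, the following Python fragment maps GLUE2-style attribute values obtained from a resource description into the resources section of a JSDL-like document, so that the request cannot drift semantically from what the service exposes; the attribute keys and element names are simplified placeholders rather than the exact normative GLUE2/JSDL schema terms.

```python
import xml.etree.ElementTree as ET

# Simplified GLUE2-style values as they might be obtained from an
# information query (keys abbreviated for illustration only).
glue2_share = {
    "ExecutionEnvironment.LogicalCPUs": 64,
    "ExecutionEnvironment.MainMemorySize": 131072,  # MB
}

# Build a JSDL-like resources section that re-uses the published values.
resources = ET.Element("Resources")
ET.SubElement(resources, "TotalCPUCount").text = str(
    glue2_share["ExecutionEnvironment.LogicalCPUs"])
ET.SubElement(resources, "TotalPhysicalMemory").text = str(
    glue2_share["ExecutionEnvironment.MainMemorySize"])

print(ET.tostring(resources, encoding="unicode"))
```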
Figure 2 also illustrates that our invariant enforces the constraint of exposing GLUE2 on services that manage data storage resources (i.e. SRM) as well. In this context, we refer to the use of an out-of-band information system (e.g. based on LDAP [12]) that exposes GLUE2-based information about resources collected from information providers. More recently, the EMI project has been implementing these concepts in order to satisfy our defined invariant for the relevant reference model service elements listed in Table I. EMI is one of the major software providers for EGEE/EGI, and thus such service elements are very likely to be deployed on these infrastructures as shown in Figure 2, leading to their 'basic semantic interoperability' in the near future.
TABLE I.
Reference Model Service Constraints that Satisfy the Invariant | Area    | Reference Model Service
(a) Exposure of GLUE2 information                              | Compute | OGSA-BES
(b) Exposure of GLUE2 information                              | Data    | SRM
(c) Re-use of GLUE2 elements for jobs                          | Compute | JSDL
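To give a flavour of how a client might query such an out-of-band, LDAP-based information system for GLUE2 entries, the following sketch uses the Python ldap3 library; the endpoint, search base, object class, and attribute names are illustrative assumptions in the spirit of the GLUE2 LDAP rendering rather than a normative recipe.

```python
from ldap3 import Server, Connection, ALL

# Assumed information-index endpoint and search base of a deployment.
server = Server("ldap://info.example-infrastructure.eu:2170", get_info=ALL)
conn = Connection(server, auto_bind=True)

# Look up computing-service entries rendered from GLUE2; names below
# follow the GLUE2 LDAP rendering in spirit but may differ in detail.
conn.search(
    search_base="o=glue",
    search_filter="(objectClass=GLUE2ComputingService)",
    attributes=["GLUE2EntityName", "GLUE2ComputingServiceTotalJobs"],
)

for entry in conn.entries:
    print(entry.entry_dn, entry.GLUE2EntityName)
```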
IV. GLOBAL ACCOUNTING INVARIANT
While the previous section introduced an invariant about common information exchange, this section focuses on resource accounting as a particularly important 'type of information' that is highly relevant for production e-Science infrastructures today. In this context, we refer to resource accounting as tracking the usage of the various services and, most notably, of the underlying resources they provide access to. But before we define such an invariant, we need to explain why the handling of accounting differs from the handling of the general service and resource information introduced in the previous section.
The information encoded in GLUE2 and exposed via information services is in the majority of cases available to all infrastructure users, while accounting data typically raises privacy issues that limit its open exposure via generally accessible services. Also, the information encoded in GLUE2 is relatively static, with only a few changes in its dynamic parts (e.g. TotalJobs in the ComputingService entity), compared to resource accounting information, which is highly dynamic in e-science infrastructures. Especially when modern large-scale HPC systems that serve a large number of users at the same time are part of these infrastructures, the amount of information about the resources used by end-users can be much higher than in infrastructures traditionally driven by HTC. Hence, this issue becomes even more relevant in interoperable e-science infrastructure setups with a plethora of end-users and services.
Having clarified this issue, and in terms of the larger infrastructure integration aspect, we are now able to describe the fundamental accounting information exchange that occurs during cross-infrastructure operations in terms of a Global Accounting Invariant (GAI). This global accounting invariant defines, at the highest level, the abstract 'resource accounting property' preserved by the relevant infrastructure services in particular and by the whole interoperable e-science infrastructure ecosystem in general. This invariant is defined as follows:
Definition 2 (Global Accounting Invariant)
Components being part of the interoperable network of
services that make up the interoperable e-science
infrastructures are constrained by the common accounting
exchange policy that defines a particular common resource
usage schema.
As shown in Figure 3, we have defined an accounting exchange policy that enforces the use of URs [5] as the common resource usage schema. Any relevant production e-science infrastructure needs to satisfy this invariant in order to be compliant with our reference model design and thus achieve 'basic accounting interoperability' with other infrastructures. This overcomes the limitations, as often observed today, that are faced when tracking resource usage across infrastructure boundaries.
Figure 3. The Global Accounting Invariant ensures with a common UR-
based constraint that the resource usage of infrastructure reference model
adoptions can be tracked across e-science infrastructure boundaries.
A common resource tracking model based on URs, in line with our defined invariant, has essentially been in use for several years as part of the computing services of different e-Science infrastructures (e.g. DEISA). It is thus easier to enforce this constraint for the computing services (i.e. OGSA-BES) than for the storage services (i.e. SRM), which so far have not tracked resource usage. This can be partly explained by the fact that we need to extend the UR definition towards support for storage information in order to satisfy our defined invariant. This is indicated in Figure 3 with the UR + ΔY entity, where ΔY denotes the storage resource enhancements.
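The following minimal sketch illustrates the idea of a UR record extended with storage-related fields (the ΔY above); the core element names loosely follow the OGF Usage Record format (namespaces omitted for brevity), while the storage extension elements are hypothetical placeholders for such enhancements rather than part of the published UR schema.

```python
import xml.etree.ElementTree as ET

# Core elements loosely follow the OGF Usage Record format.
ur = ET.Element("JobUsageRecord")
ET.SubElement(ur, "GlobalJobId").text = "https://bes.example.eu/jobs/4711"
ET.SubElement(ur, "GlobalUserName").text = "/DC=eu/DC=example/CN=Jane Doe"
ET.SubElement(ur, "WallDuration").text = "PT2H30M"  # ISO 8601 duration
ET.SubElement(ur, "MachineName").text = "hpc-cluster.example.eu"

# Hypothetical storage-related extension (the Delta-Y of Figure 3);
# element names are illustrative only.
storage = ET.SubElement(ur, "StorageUsageExtension")
ET.SubElement(storage, "StorageSystem").text = "srm://dcache.example.eu"
ET.SubElement(storage, "BytesStored").text = str(250 * 1024**3)

print(ET.tostring(ur, encoding="unicode"))
```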
The aforementioned work is essential to enable consistent 'basic accounting interoperability' between production e-science infrastructures in Europe; therefore, the EMI project is working on these enhancements. In this context, we refer to the use of an out-of-band accounting system (e.g. SGAS [14]) that works with UR-based information collected from accounting providers. The transfer mechanism can be realized using messaging implementations (e.g. ActiveMQ [15]), since our earlier investigations [16] revealed that a WS-based Resource Usage Service (RUS) [13] is not scalable enough when HPC-driven resources are used. Finally, we also raise the demand to extend the UR schema with aggregated resource usage tracking concepts, as shown in Figure 3. In that case, an administrative accounting client takes advantage of the convenient use of aggregated URs (indicated by ΔZ).
TABLE II.
Reference Model Service Constraints that Satisfy the Invariant | Area    | Reference Model Service
(a) Track resource usage with UR                               | Compute | OGSA-BES
(b) Track resource usage with UR                               | Data    | SRM
(c) Enhance URs for Storage Systems                            | Data    | SRM
(d) Enhance URs for aggregated usage                           | All     | All
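As a sketch of the aggregated usage tracking mentioned above (the ΔZ enhancement), the following fragment groups individual usage records by VO and sums their consumption into the kind of aggregate an administrative accounting client would typically want; the record structure and field names are assumptions for illustration only.

```python
from collections import defaultdict

# Individual (already parsed) usage records; field names are assumed.
records = [
    {"vo": "wisdom",  "wall_seconds": 9000,  "cores": 1},
    {"vo": "wisdom",  "wall_seconds": 12000, "cores": 1},
    {"vo": "euforia", "wall_seconds": 3600,  "cores": 512},
]

# Aggregate core-hours per VO instead of shipping raw records.
aggregated = defaultdict(float)
for rec in records:
    aggregated[rec["vo"]] += rec["wall_seconds"] * rec["cores"] / 3600.0

for vo, core_hours in sorted(aggregated.items()):
    print(f"{vo}: {core_hours:.1f} core-hours")
```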
V. GLOBAL AUTHORIZATION ATTRIBUTES INVARIANT
The next infrastructure integration element of the reference model is the Global Authorization Attributes Invariant (GAAI), which defines one particularly important aspect of the overall cross-infrastructure security ecosystem. In this section we describe the need for a common set of security attributes to enable attribute-based authorization during secure production cross-infrastructure operations. In our given context, such security attributes typically convey pieces of information about project or Virtual Organization (VO) [18] memberships as well as role possession.
The global authorization attributes invariant defines, at the highest level, the abstract 'service authorization property' preserved by the infrastructure services in particular and by the whole interoperable e-science infrastructure ecosystem in general. It is defined as follows:
Definition 3 (Global Authorization Attributes Invariant)
Components being part of the interoperable network of
services that make up the interoperable e-science
infrastructures are constrained by the common security
attribute exchange policy that defines a particular common
set of attributes as part of the authorization of end-users.
As shown in Figure 4, we have defined a common security exchange policy that enforces the use of common security attributes, which are exchanged using different approaches. While one approach uses attributes encoded in Security Assertion Markup Language (SAML) [6] tokens, another approach ships the security attributes as part of X.509 proxy extensions. Even if a production e-science infrastructure uses different channels for transporting these attributes, it can still be compliant with the reference model and its invariant by using the common set of attributes for authorization decisions, and can thereby achieve 'basic authorization interoperability' with other infrastructures. This is possible since the same attributes, even if differently encoded, state the same security information; their transfer method is a separate issue.
This is best explained by describing the other essential parts of attribute-based authorization, namely the corresponding Attribute Authority (AA) and the security policies, as illustrated in Figure 4. In our approach, one central AA such as the Virtual Organization Membership Service (VOMS) [8] is responsible for releasing signed attributes about end-users in two ways: SAML and X.509. This AA is one central source of trust, and it is therefore important that it satisfies our global authorization attributes invariant. While it is not necessarily important which way the attributes take from clients to services, it is important that the sink also conforms to our invariant. Since the attributes are signed, they cannot easily be compromised or used by other e-scientists; what is essential is that the sink that enforces authorization (AuthZ) trusts the source (i.e. the AA).
Figure 4. The Global Authorization Attributes Invariant ensures with a
common set of security attributes that the service usage of the infrastructure
reference model can be authorized across infrastructure boundaries.
As shown in Figure 4, such a sink consists of authorization policies that are traditionally encoded using either Gridmap files or, more recently, policies based on the eXtensible Access Control Markup Language (XACML) [7]. In order to achieve 'basic authorization interoperability' we define the constraint that the sink, like the source, needs to satisfy our invariant. This raises the demand that such authorization policies include authorization definitions for our common security attributes. Hence, during job submissions or data transfers between storage systems, we define the constraint that our services (i.e. OGSA-BES and SRM) have to use such a sink in order to enforce attribute-based authorization without violating our defined invariant.
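To make the role of the common attribute set concrete, the following sketch shows an attribute-based authorization check as such a sink might perform it conceptually: attributes extracted from a SAML assertion or from X.509 proxy extensions are matched against policy entries expressed over the same common attribute names. The attribute names, policy structure, and action identifier are simplified assumptions, not the actual EMI SAML/XACML profiles.

```python
# Attributes as they might be extracted from a SAML assertion or from
# the VOMS extensions of an X.509 proxy (illustrative values).
user_attributes = {"vo": "euforia", "group": "/euforia/fusion",
                   "role": "simulation-user"}

# Policy entries expressed over the SAME common attribute names; this
# stands in for an XACML rule set and is deliberately simplified.
policy = [
    {"vo": "euforia", "role": "simulation-user", "permit": ["bes:CreateActivity"]},
    {"vo": "wisdom",  "role": "docking-user",    "permit": ["bes:CreateActivity"]},
]

def authorize(attrs, action):
    """Permit the action if a policy entry matches the user's common
    attributes; otherwise deny (deny-by-default)."""
    return any(rule["vo"] == attrs["vo"] and rule["role"] == attrs["role"]
               and action in rule["permit"] for rule in policy)

print(authorize(user_attributes, "bes:CreateActivity"))  # -> True
```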
This topic is considerably more complex, but it is essential for achieving a secure operational level of interoperability between HTC-driven and HPC-driven e-science infrastructures in Europe today. So far, the HTC-driven e-science infrastructure EGEE/EGI has used its own attribute profile based on so-called Fully Qualified Attribute Names (FQANs) [8]. In contrast, DEISA did not perform any kind of attribute-based authorization through UNICORE for a long time and only recently developed methods for it. Hence, there was not even basic authorization interoperability between these important infrastructures, which meant that e-scientists who wanted to use both had to obtain access to each in a different fashion.
More recently, the technical capabilities have become available in the UNICORE middleware and are partly used in DEISA for some cross-infrastructure applications. But to achieve seamless operational interoperability without dedicated setups, we need to satisfy our global authorization attributes invariant. Therefore, the EMI project has begun to define a common SAML attribute profile that matches another common profile based on XACML. With this work, 'basic authorization interoperability' between the European middleware systems and their storage elements becomes possible.
VI. ACCOMPANYING USE CASES & IMPLEMENTATION
Our two use cases are the WISDOM use case [28] from the bio-informatics domain and the EUFORIA use case [27] from the fusion science domain. The WISDOM use case aims at developing new drugs against Malaria, using different types of e-Science infrastructures for large-scale in-silico docking and molecular dynamics validation methods. The EUFORIA use case and its applications simulate aspects of the ITER tokamak, a fusion device that may become the basis for future fusion power plants. The major goal of this use case is thus to enhance the modeling capabilities for ITER, together with the fusion modeling community, through the use of e-Science infrastructures and through the adaptation, optimization, and integration of a set of critical applications for edge and core transport modeling as well as turbulence simulations. The majority of the application codes of this use case allow for the construction of complex scientific workflows that produce advanced physics results via the use of different types of e-Science infrastructures.
Both use cases implement the e-science design pattern [30] of jointly using HPC and HTC resources within interoperable e-science infrastructures to perform e-science via a larger cross-infrastructure scientific workflow. The WISDOM use case implements this pattern by first using HTC resources for in-silico docking methods that do not require a fast interconnect between CPUs. Afterwards, the best results of the docking step are verified with scalable molecular dynamics techniques that simulate the docked compounds over time on HPC resources. In the EUFORIA use case, many fusion workflows can take advantage of e-Science infrastructure interoperability; one concrete example is the HELENA-ILSA workflow [27]. The HELENA [46] code is first executed on HTC resources, and its results are then used by the ILSA code on HPC resources.
Figure 5. The defined invariants ensure operational interoperability that goes beyond WS-based interface/protocol interoperability, thus enabling e-science infrastructure interoperability for e-science applications.
In order to demonstrate the significance of our invariant definitions in the greater context of e-science applications and IIRM adoptions, we focus on the computational parts and describe a detailed job submission relevant for these two use cases. The described invariant implementations and applied methods are also relevant for other reference model core building blocks (e.g. SRM), but these cannot be shown due to page restrictions. As shown in Figure 5, several IIRM core building blocks and their invariant implementations form a relatively complex architecture with essentially four layers. At the bottom, we have the 'resource layer' that stands for the available e-Science infrastructure resources. Example resources are HPC and HTC computing resources; data storage resources can also be found on this layer in production setups, but they are omitted for the clarity of Figure 5 and of the use case description. On top of this layer, our IIRM service entities are shown, which offer access to the underlying resource layer and its functionalities. These services together form one 'virtual network of interoperable services' [44] and are as such independent of the 'real physical e-Science infrastructures' (e.g. EGEE/EGI or DEISA/PRACE), mainly because of their interoperability based on the IIRM and their conformance to the invariants described in this paper. We nevertheless include one dedicated layer for the production infrastructures on top of these interoperable services in order to model reality as closely as possible in Figure 5. The final layer of the architecture stands for the 'scientific clients', which can be any form of the rather abstract 'scientific gateway' providing easy access to e-Science infrastructure services used by scientists on a daily basis. While the WISDOM scientists actually prefer Web-based access via portal techniques, the EUFORIA scientists use the KEPLER framework [48] with Grid modifications through gLite and UNICORE actors [27].
Due to the aforementioned complexity of the architecture, we describe the use of the IIRM elements and their impact on the different levels in a step-wise fashion during cross-infrastructure application runs of our two use cases. Each step marked as (n) corresponds to one step (n) within the architecture illustrated in Figure 5. We start with an e-scientist who works with one of the aforementioned scientific gateways (0) of his field (bio-informatics or fusion science), which includes clients for the various services (i.e. OGSA-BES or SRM) of our reference model design. Complementary to the functional services, there must be an integrated client able to obtain valid credentials from an Attribute Authority (AA) such as the VOMS system. The client also has methods to query an information service of interest (e.g. via LDAP) in order to obtain the status of the e-Science infrastructures and their offered services.
Hence, before the e-scientist is even able to work with our interoperable infrastructures, we need to expose the existence of the computational resources and their properties in a consistent manner. Here the global information invariant comes into play, requiring such properties to be exposed via GLUE2. In this sense, we describe an HPC resource with a GLUE2 entity (1) and an HTC resource with another GLUE2 entity (2), as shown in Figure 5. These entities are exposed via their corresponding information providers (3 and 4) to an information system (5). For scalability reasons, we do not mandate specific Web service interfaces as part of the IIRM but rather refer to an out-of-band mechanism that provides a scalable solution (e.g. based on LDAP). With this, our first invariant is satisfied, so that we can state that we have 'basic semantic interoperability' in our e-science application ecosystem. This basic semantic interoperability is essential since it clarifies the use of terms across infrastructures that previously described their resources with different terms, languages, or information models. The constraint of having one common information model enables interoperability in terms of fundamental information exchange and provides clarity, thus avoiding the semantic loss of information that occurs when one proprietary model is translated into another proprietary model and vice versa. Based on this semantic interoperability, the e-scientist can use his client to query the information system (6) in order to search for suitable computational resources (across multiple infrastructures), which in our case are clearly described with GLUE2 entities.
Once the systems have been identified, the right credentials can be obtained from the AA, which in our case is the VOMS server (7). This server releases signed attribute statements about end-users (e.g. project or VO membership, role possession, etc.), encoded either in SAML or in X.509 proxies, depending on the relevant systems of interest found in the information service query. Hence, the interoperability of the AA interface is not as important as the agreement on the attribute formats. In this context it is essential that the attribute statements are in the same common security attribute format in order to achieve 'basic authorization interoperability'. This is important because it lays the foundation for enforcing our global authorization attributes invariant later, during the AuthZ decisions within the service layer.
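The following sketch illustrates how a VOMS-style Fully Qualified Attribute Name could be normalized into a common attribute set that both encodings (SAML assertions and X.509 proxy extensions) can map onto; the FQAN layout follows the usual /vo/group/Role=... convention, while the target attribute names are illustrative assumptions.

```python
def fqan_to_common_attributes(fqan: str) -> dict:
    """Map a VOMS-style FQAN such as
    '/euforia/fusion/Role=simulation-user/Capability=NULL'
    onto a simple common attribute set (names are illustrative)."""
    parts = [p for p in fqan.strip("/").split("/") if p]
    vo, groups, role = parts[0], [], None
    for part in parts[1:]:
        if part.startswith("Role="):
            value = part.split("=", 1)[1]
            role = None if value == "NULL" else value
        elif part.startswith("Capability="):
            continue  # capabilities are ignored in this sketch
        else:
            groups.append(part)
    return {"vo": vo, "group": "/" + "/".join([vo] + groups), "role": role}

print(fqan_to_common_attributes("/euforia/fusion/Role=simulation-user/Capability=NULL"))
# -> {'vo': 'euforia', 'group': '/euforia/fusion', 'role': 'simulation-user'}
```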
After these basic steps have been taken, the e-scientist wants to use an HTC resource with an embarrassingly parallel job. In the WISDOM use case an application called AutoDock is used, while the EUFORIA use case uses the HELENA application in this step. According to the IIRM design, the e-scientist re-uses elements of the obtained GLUE2-based resource descriptions to form an enhanced JSDL document (8) that is submitted to an enhanced OGSA-BES implementation within gLite. One example of the enhancements of the JSDL document is the use of GLUE2 elements in its resource description section. Along with this submission goes the security credential obtained from the VOMS server, in our case an X.509 proxy credential with attribute statements that conform to the common set of attributes stating the VO, group, and role of the e-scientist. Before the execution of the job is performed, the authorization (AuthZ) policy framework of gLite (i.e. gJAF) (9) is responsible for extracting the attributes from the credential and checking them against the policy. Since we set the constraint that the policy definitions must also be in the same common attribute format (10), access is granted (11) when the e-scientist has obtained the credential from a trusted VOMS server that also satisfies our global authorization attributes invariant. When the execution of the HTC job is finished, a usage record entity (12) is created and forwarded via a dedicated accounting sensor (13) to a central accounting system (14) such as SGAS. This concludes the HTC execution part of the larger scientific workflow.
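Purely to summarize the flow of steps (6) to (14), the following sketch mirrors the HTC part in Python form; every function here is a hypothetical stub that only names the interaction of the corresponding step in Figure 5 and is not a real client API of gLite, VOMS, LDAP, or SGAS.

```python
# Hypothetical stubs: each merely names an interaction from Figure 5.

def query_information_system(schema, kind):        # step (6)
    return [{"name": "htc-cluster.example.eu", "bes": "https://bes.example.eu"}]

def get_voms_credential(vo, encoding):             # step (7)
    return {"vo": vo, "group": f"/{vo}", "role": "docking-user",
            "encoding": encoding}

def build_jsdl(application, glue2_resource):       # step (8)
    return {"application": application, "resource": glue2_resource["name"]}

def submit_to_bes(endpoint, jsdl, credential):     # steps (9)-(11): gJAF checks
    return {"activity": "urn:activity:4711"}       # the common attributes first

def collect_and_forward_usage_record(activity):    # steps (12)-(14)
    print("UR for", activity["activity"], "forwarded to SGAS")

# HTC part of the cross-infrastructure workflow, mirroring Figure 5.
resources = query_information_system(schema="GLUE2", kind="HTC")
proxy = get_voms_credential(vo="wisdom", encoding="x509-proxy")
jsdl = build_jsdl(application="autodock", glue2_resource=resources[0])
activity = submit_to_bes(resources[0]["bes"], jsdl, credential=proxy)
collect_and_forward_usage_record(activity)
```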
Typically, the results of the HTC-based workflow steps are evaluated, often manually, before the best results are considered for submission to the HPC-based workflow elements. Since these steps are in many cases very similar to the HTC steps, owing to the interoperability and the common use of the invariants, we keep the description of this part shorter to avoid repeating details.
Starting the HPC-based workflow part, we are back at the scientific gateway, where this time we submit an enhanced JSDL document (15) with a different credential set that nevertheless encodes the same attribute statements as used in the HTC setup. In other words, the same identity is used, encoded in a different way, while the actual security-relevant information remains the same. This is an important aspect, since it relates to the trust of end-users in the infrastructure and to the fact that their identity remains the same independent of which infrastructure is actually used. Afterwards, the AuthZ policy entity (16) of the UNICORE middleware extracts the attributes encoded in a signed SAML assertion and performs an authorization decision based on the common set of attributes (17). In this sense, we satisfy the global authorization attributes invariant, which here avoids the need to set up (often manually) dedicated authorization policy elements for end-users of other e-science infrastructures. We observed that such a manual setup is typically hard to maintain, time-consuming, and in many cases also error-prone.
When access is granted (18), based on a match of the end-user attributes provided in the SAML credential with the defined XACML security policy, the HPC execution is started. In the WISDOM use case, the HPC application uses several programs of the AMBER [31] molecular dynamics package (~80 programs), while more recently NAMD [45] is increasingly used because it scales better on large-scale HPC resources. In the EUFORIA use case, the HPC application in our example is the ILSA code [47], which in turn is a composite of several other codes intended to run on HPC resources. Finally, after the job run, a usage record entity (19) is created and forwarded via an accounting sensor (20) to the same accounting system as before (21), which can be used for billing and similar purposes. This concludes the HPC workflow part.
VII. RELATED WORK
Related work in the field of standardization is clearly found among the members of the JSDL, UR, GLUE2, OGSA-BES, and SRM working groups of the OGF. Several ideas about invariants and related IIRM concepts also arise from the work of these members and from discussions within the Production Grid Infrastructure (PGI) working group, a spin-off of the Grid Interoperation Now (GIN) community group. In recent years we have discussed, and will continue to discuss, how we can align our work in order to obtain a new set of specifications that do not fundamentally change the existing specifications (i.e. preserve backwards compatibility where possible) and thus improve them without breaking their emerging stability.
Related work in the field of reference models typically leads to the Open Grid Services Architecture (OGSA) [21], which has a much broader scope than our approach. Hence, our approach represents only a subset of this scope, but it is much more focused on e-Science and thus more detailed with respect to our scientifically driven environments. In this sense, our IIRM reference model delivers a much more detailed approach for how open standards can be improved and used in scientific applications that require interoperability of production e-science Grids today, such as WISDOM [28], EUFORIA [27], or VPH [29]. Neither this contribution nor the reference model in the larger context aims at replacing OGSA; rather, they represent a medium-term milestone towards, perhaps, a full OGSA in the future.
Nevertheless, in earlier work we critically reviewed OGSA as the basic Grid reference model, and one outcome of this process was a set of relevant factors that are shown in Table III and that represent previously unpublished criticism of OGSA from a production Grid perspective. Each of these factors is underpinned by several 'indicators' that we briefly introduce here before we survey related work in the field based on the factors and their indicators.
In general, the boundaries between the factors are not completely strict, and in many cases a factor fulfillment of 'yes' only indicates a 'certain direction' of the reference model in question rather than an in-depth analysis evaluating every single piece of the model.
We start the description with the factor of being 'service-based', which is marked as yes when the reference model follows the basic design principles of Service Oriented Architectures (SOAs). Only if the reference model entities are modeled as services (with unique semantics) that offer clear service interfaces is this factor fulfilled (i.e. yes).
A reference model is truly applicable in the 'e-science context' when it focuses on scientific use cases and as such is not driven by commercial applications, which shift the focus of the services to reference model entities other than those required in our environments. It is also important to cover the essential functional areas, which are Grid execution management, Grid data management, Grid security elements, and Grid information management. These essential functional areas are taken from OGSA, while we drop those areas that we observed to be not a priority for production Grids today (i.e. self-management) or to be handled by the existing environment or the corresponding implementation (i.e. resource management services and the handling of stateful resources).
The factor of ‘details for implementation’
addresses the major critics of OGSA that was by far too
high-level to be actually really implemented. Also, over
time the hugh set of services specifications (i.e. resource
selection services, etc.) did not appear. Therefore, relevant
indicators for this factor are the use of concrete Web service
(WS) technologies to implement the abstract SOA designs.
But apart from this, another indicator is referencing
specifications with concrete portTypes (e.g. operations) to
indicate that enough details for implementation exist.
Related to the previous factor is the factor we refer to as 'realistically implementable', which goes beyond the aforementioned factor in terms of the availability of specifications as well as of funding from projects that already work on implementations of the corresponding reference model parts. This factor is also underpinned by the indicator of whether the number of core service entities is not too large, where we set a 'lower than 5' boundary based on the aforementioned four core services that should be available.
We marked a reference model in Table III as 'yes' for the factor 'standards-based' when it shows several indicators. First, the reference model must at least be based on normative standard specifications (while some additions, where useful, are allowed as long as they do not break backwards compatibility). In this sense, it is important that the specification is developed by a real Standards Development Organization (SDO) following an open process and is not developed by a closed consortium of projects or vendors.
TABLE III.
Relevant Factors                                   | OGSA | EGA | CCA | CSA | CPN
(a) Service based                                  | yes  | yes | yes | yes | no
(b) e-Science Context                              | yes  | no  | yes | no  | yes
(c) Details for implementation                     | no   | no  | yes | yes | no
(d) Realistically implementable                    | no   | no  | yes | no  | no
(e) Standards based                                | yes  | yes | yes | yes | no
(f) Adoption in e-Science production technologies  | no   | no  | no  | no  | no
(g) Relationships between functional areas         | no   | yes | no  | yes | no
(h) Cover Invariant Constraint Topics              | no   | no  | no  | no  | no
Another relevant factor of Table III addresses the criticism that the e-science domain itself is extremely broad in scope, projects, directions, and approaches. To ensure that the reference model is truly applicable, we defined the factor 'adoption in e-Science production technologies and infrastructures'. In this sense, we added indicators of whether the corresponding reference model has a high probability of being relevant for EGEE/EGI and DEISA/PRACE through the adoption of technologies typically deployed on these infrastructures. A further indicator is whether the reference model has already been partly adopted, or is planned to be adopted, by one of the four major production middleware systems relevant in Europe today: ARC, gLite, UNICORE, and Globus [49].
Another factor concerns the specification of the 'relationships between the different relevant functional areas'. These areas were defined above via an analysis of OGSA from a production Grid perspective. In contrast to OGSA, however, we do not see these functional areas on the same level. For example, the relationships of compute as well as data with information and security are important. We argue that these are orthogonal to each other, while OGSA basically treats them as parallel services.
The final factor essentially concerns the presence of constraints (i.e. invariants) of the kind that form the content of this paper.
Based on these factors and their indicators, we surveyed a wide variety of existing reference models in the field, as shown in Table III, and evaluated via these factors whether they are useful for our interoperability use cases in our production e-science infrastructure environments. We studied the Enterprise Grid Alliance Reference Model (EGA) [36], the Common Component Architecture (CCA) [37], the OASIS Service Component Architecture (CSA) [38], and Colored Petri Nets (CPN) [39]. As a conclusion of this survey, none of them gives detailed insights into how the core building blocks of a realistically implementable reference model can be defined in terms of existing normative specifications, as we have done. Also, none of the surveyed reference models provides constraints or other relevant detailed pieces of information about the issues addressed by our set of defined invariants.
VIII. CONCLUSIONS
In this contribution we have shown how to overcome certain limitations that are often neglected in e-science infrastructure interoperability because they go beyond the usual 'interface compliance'. In other words, we provide insights showing that even with interoperability at the interface level, through the adoption of our IIRM using OGSA-BES or SRM, there are still significant factors that can break interoperability setups of e-science infrastructures today. As a consequence, infrastructure integration activities and interoperability remain in many cases only a time-limited endeavor with unstable, often manually configured setups that are error-prone and thus hinder real production e-science applications from being productive on a daily basis.
The three constraints defined through the Global Information Invariant, the Global Accounting Invariant, and the Global Authorization Attributes Invariant directly address these significant factors and are able to significantly increase the production-level interoperability of e-science infrastructures that satisfy them. As our two accompanying use cases reveal, the definition of our invariants as a complement to the reference model is not only relevant but also has a significant impact in supporting real cross-infrastructure e-science applications.
From our findings we can further conclude that, in our context of information exchange, accounting with resource usage tracking, and attribute-based authorization, the 'common interface' (i.e. OGSA-BES, SRM) itself is not as important as a 'common schema or set of attributes' (i.e. GLUE2, JSDL, UR, and common security attributes). Even if interoperability at the OGSA-BES WS interface level is established, mismatches can cause uncertainties when the resource descriptions of the JSDL document used during submission do not match the GLUE2-based resource descriptions. Similarly, it makes no sense to perform cross-infrastructure accounting when the tracked resource usage records have different semantics. For example, compute records on HTC resources differ from those on HPC resources, but they can still be tracked in a common schema. In addition, it is not optimal when end-users have the right credentials with the right set of attributes stating their VO or group membership, but the different e-science infrastructures use different sets of attributes, thus failing to reach the basic authorization interoperability that can easily be achieved when a common set of attributes is used.
Finally, we would like to raise awareness that many of the technological foundations needed to satisfy our invariants are currently being broadly developed in gLite, UNICORE, ARC, and dCache via the EMI project, while several early concept implementations have already been performed to check the suitability of our approaches. This in turn leads to more stable interoperability setups of pan-European infrastructures such as EGI and DEISA, making the joint use of HTC and large-scale HPC realistic and having a true impact on production in general and on cross-infrastructure use cases (e.g. WISDOM, EUFORIA, VPH) in particular.
ACKNOWLEDGMENT
We thank the OGF GIN and PGI groups for fruitful discussions in the context of this work. We also thank our EMI JRA1 development area leaders John White (HIP), Laurence Field (CERN), Patrick Fuhrmann (DESY), and Marco Cecchi (INFN), who drive the development work forward together with the author. The research results of this project are co-funded by the EC under FP7 Collaborative Project Grant Agreement Nr. 261611. Because of the page restrictions for this workshop, we only present the high-level invariants and refer to the EMI project for insights into the low-level developments in this context.
REFERENCES
[1] S. Andreozzi et al., GLUE Specification 2.0, OGF Grid Final Document
Nr. 147, 2009.
[2] L. Field et al. Grid Information System Interoperability: The Need For
A Common Information Model. In Proc. of the IGIIW Workshop, e-
Science Conference 2008, Indianapolis, USA, 2008.
[3] M.S. Memon, A.S. Memon, M. Riedel, A. Streit, F. Wolf, Enabling Grid Interoperability by Extending HPC-driven Job Management with an Open Standard Information Model, in Proceedings of the International Conference on Computer and Information Science (ICIS 2009), June 2009, IEEE, ACIS-ICIS, pp. 506-511.
[4] DMTF: Common Information Model (CIM) Standards,
http://www.dmtf.org/standards/cim/
[5] Mach R., Lepro-Metz R., Jackson S., McGinnis L., Usage Record –
Format Recommendation, OGF GFD98
[6] S. Cantor, J. Kemp, R. Philpott, and E. Maler. Assertions and Protocols
for the OASIS Security Assertion Markup Language, OASIS
Standard, 2005
[7] Moses, T., et al.: eXtensible Access Control Markup Language, OASIS
Standard (2005)
[8] Alfieri, R., et al.: From gridmap-file to voms: managing authorization
in a grid environment. In: Future Generation Comp. Syst.,
21(4):549-558 (2005)
[9] A. Sim et al., The Storage Resource Manager Interface Specification
Version 2.2. OGF Grid Final Document Nr. 129, 2008.
[10] A.S. Memon, M.S. Memon, Ph. Wieder, B. Schuller
CIS: An Information Service based on the Common Information
Model, Proceedings of 3rd IEEE International Conference on e-
Science and Grid Computing, Bangalore, India, December, 2007,
IEEE Computer Society, ISBN 0-7695-3064-8, pp. 465 - 472
[11] M. Riedel, M.S. Memon, A.S. Memon, A. Streit, F. Wolf, Th. Lippert,
B. Konya, A. Konstaninov, O. Smirnova, M. Marzolla, L.
Zangrando, J. Watzl, D. Kranzlmüller, Improvements of Common
Open Grid Standards to Increase High Throughput and High
Performance Computing Effectiveness on Large-scale Grid and e-
Science Infrastructures, Seventh High-Performance Grid Computing
(HPGC) Workshop at International Parallel and Distributed
Processing Symposium (IPDPS) 2010, April 19-23, 2010, Atlanta,
USA, ISBN 978-1-4244-6533-0, pp.1-7
[12] Open LDAP [Online], available: http://www.openldap.org
[13] OGF - OGSA - RUS Working Group (OGSA-RUS) [Online],
available: https://forge.gridforum.org/projects/rus-wg
[14] T. Sandholm et al., A service-oriented approach to enforce grid
resource allocation, Int. Journal of Cooperative Inf. Systems, Vol.15,
2006
[15] Apache ActiveMQ [Online], available: http://activemq.apache.org/
[16] W. Frings, M. Riedel, A. Streit, D. Mallmann, S. van den Berghe, D.
Snelling, and V. Li, LLview: User-Level Monitoring in
Computational Grids and e-Science Infrastructures. In Proc. of
German e-Science Conference 2007, Baden-Baden, Germany
[17] A. Anjomshoaa et al., Job Submission Description Language
Specification V.1.0. OGF (GFD56), 2005.
[18] I. Foster, C. Kesselman, and S. Tuecke. The Anatomy of the Grid - Enabling Scalable Virtual Organizations. In F. Berman, G. C. Fox, and A. J. G. Hey, editors, Grid Computing - Making the Global Infrastructure a Reality, pages 171-198. John Wiley & Sons Ltd, 2003.
[19] I. Mandrichenko et al., GridFTP v2 Protocol Description, OGF Grid
Final Document Nr. 47, 2005.
[20] e-Science Definition [Online], available:
http://www.e-science.clrc.ac.uk
[21] I. Foster, C. Kesselman, J. M. Nick, and S. Tuecke. The Physiology of the Grid. In F. Berman, G. C. Fox, and A. J. G. Hey, editors, Grid Computing - Making the Global Infrastructure a Reality, pages 217-249. John Wiley & Sons Ltd, 2003.
[22] EGI [Online], Available: http://www.egi.org/
[23] DEISA. [Online], Available: http://www.deisa.org
[24] PRACE [Online], available: www.prace-project.eu/
[25] M. Riedel et al., “Research Advances by using Interoperable e-Science
Infrastructures - The Infrastructure Interoperability Reference Model
applied in e-Science,” in Journal of Cluster Computing, SI Recent
Research Advances in e-Science, 2009.
[26] P. Eerola et al., "Building a Production Grid in Scandinavia", in IEEE
Internet Computing, 2003, vol.7, issue 4, pp.27-35
[27] M. S. Memon, M. Riedel, et al. Lessons learned from jointly using
HTC- and HPC-driven e-science infrastructures in Fusion Science.
In proceedings of the IEEE ICIET 2010 Conference, Pakistan, 2010.
[28] M. Riedel et al., “Improving e-Science with Interoperability of the e-
Infrastructures EGEE and DEISA,” in Proc. of the MIPRO, 2007.
[29] M. Riedel, B. Schuller, M. Rambadt, M.S. Memon, A.S. Memon, A.
Streit, F. Wolf, Th. Lippert, S.J. Zasada, S. Manos, P.V. Coveney, F.
Wolf, D. Kranzlmüller - Exploring the Potential of Using Multiple e-
Science Infrastructures with Emerging Open Standards-based e-
Health Research Tools, Proceedings of the The 10th IEEE/ACM
International Symposium on Cluster, Cloud and Grid Computing
(CCGrid 2010), May 17-20, 2010, Melbourne, Victoria, Australia,
pp. 341-348, ISBN 978-0-7695-4039-9
[30] M. Riedel. E-Science Infrastructure Interoperability Guide - The
Seven Steps towards Interoperability for e-Science. In Book „Guide
to e-Science: Next Generation Scientific Research and Discovery“,
Springer, 2011.
[31] AMBER [Online], available: http://amber.scripps.edu
[32] I. Foster et al., OGSA Basic Execution Service Version 1.0. Open Grid
Forum Grid Final Document Nr. 108, 2007.
[33] M. Ellert et al., Advanced Resource Connector middleware for lightweight computational Grids, Future Generation Computer Systems 23 (2007) 219-240.
[34] A. Streit et al., “UNICORE - From Project Results to Production
Grids.” in Grid Computing: The New Frontiers of High Performance
Processing, Advances in Parallel Computing 14, L. Grandinetti, Ed.
Elsevier, pp. 357–376.
[35] E. Laure et al., “Programming the Grid with gLite,” in Computational
Methods in Science and Technology, 2006, pp. 33–46.
[36] Enterprise Grid Alliance (EGA) Reference Model [Online], available:
http://www.ogf.org/UnderstandingGrids/documents/EGA_reference
_model.pdf
[37] R. Armstrong et al. Toward a Common Component Architecture for
High-Performance Scientific Computing. In Proc. of HPDC, 1999.
[38] OASIS CSA [Online], available: http://www.oasis-opencsa.org/sca
[39] C. Bratosin, W. van der Aalst, N. Sidorova, and N. Trcka. A Reference Model for Grid Architectures and its Analysis. In LNCS, Vol. 5331, pages 898-913, 2008.
[40] M. Riedel, E. Laure, et al., “Interoperation of World-Wide Production
e-Science Infrastructures,” in Journal on Concurrency and Comp.:
Practice and Experience, 2008.
[41] M. Riedel, A. Streit, Th. Lippert, F. Wolf, D. Kranzlmueller
Concepts and Design of an Interoperability Reference Model for
Scientific- and Grid Computing Infrastructures, in Proc. of the ACC
Conference, 2009, ISBN 978-960-474-124-3, Pages 691 - 698
[42] The EMI Project [Online], available: http://www.eu-emi.eu/
[43] dCache [Online], available: http://www.dcache.org/
[44] M. Riedel et al., Towards individually formed computing
infrastructures with high throughput and high performance
computing resources of large-scale grid and e-science
infrastructures, in Proceedings of the MIPRO 2010
[45] NAMD [Online], available: http://www.ks.uiuc.edu/Research/namd/
[46] Huysmans, G.T.A. et al., Proc. CP90 Conf. on Comp. Physics Proc.
(1991)
[47] Huysmans, G.T.A. et al., Phys. Plasmas 8(10), 4292 (2001)
[48] I. Altintas, C. Berkley, E. Jaeger, M. Jones, B. Ludascher, and S. Mock, "Kepler: an extensible system for design and execution of scientific workflows", in Proceedings of the 16th International Conference on Scientific and Statistical Database Management (SSDBM), 2004, pp. 423-424.
[49] I. Foster. Globus Toolkit version 4: Software for Service-Oriented
Science. In Proceedings of IFIP Int. Conference on Network and
Parallel Computing, 2005, LNCS 3779, pages 213–223