Conference Paper

Interoperable job execution and data access through UNICORE and the Global Federated File System

... This article is a joint and extended version of [34] and [40]. ...
Article
Full-text available
Emerging challenges for scientific communities include efficiently processing big data obtained from experimentation and computational simulations. Supercomputing architectures are available to support scalable, high-performance processing environments, but many existing algorithm implementations are still unable to cope with their architectural complexity. One approach is to provide innovative technologies that use these resources effectively and also deal with geographically dispersed large datasets. Those technologies should be accessible in a way that data scientists running data-intensive computations do not have to deal with the technical intricacies of the underlying execution system. Our work primarily focuses on providing data scientists with transparent access to these resources in order to easily analyze data. The impact of our work is illustrated by describing how we enabled access to multiple high-performance computing resources through an open, standards-based middleware that takes advantage of the unified data management provided by the Global Federated File System. Our architectural design and its associated implementation are validated by a use case that requires massively parallel DBSCAN outlier detection on a 3D point cloud dataset.
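As a rough illustration of this use case, the sketch below flags outliers in a synthetic 3D point cloud with DBSCAN. It uses scikit-learn as a stand-in for the massively parallel implementation the paper describes; the synthetic data and the eps/min_samples values are assumptions of this sketch, not figures from the paper.

    import numpy as np
    from sklearn.cluster import DBSCAN

    rng = np.random.default_rng(42)
    core = rng.normal(size=(10_000, 3))        # dense core of the point cloud
    stray = rng.uniform(-8, 8, size=(100, 3))  # sparse far-away points
    points = np.vstack([core, stray])

    # DBSCAN labels low-density points as -1; treat those as outliers.
    labels = DBSCAN(eps=0.5, min_samples=10).fit_predict(points)
    outliers = points[labels == -1]
    print(f"{len(outliers)} of {len(points)} points flagged as outliers")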
Article
Full-text available
This document specifies the semantics and structure of the Job Submission Description Language (JSDL). JSDL is used to describe the requirements of computational jobs for submission to resources, particularly in Grid environments, though it is not restricted to them. The document includes the normative XML Schema for JSDL, along with examples of JSDL documents based on this schema.
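For concreteness, a minimal JSDL document has roughly the shape shown below, modelled on the "echo" example in the specification; treat the exact element set as illustrative. The namespaces are those of the published JSDL 1.0 schema.

    import xml.dom.minidom

    JSDL_ECHO = """\
    <jsdl:JobDefinition
        xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
        xmlns:posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
      <jsdl:JobDescription>
        <jsdl:Application>
          <posix:POSIXApplication>
            <posix:Executable>/bin/echo</posix:Executable>
            <posix:Argument>hello</posix:Argument>
          </posix:POSIXApplication>
        </jsdl:Application>
      </jsdl:JobDescription>
    </jsdl:JobDefinition>
    """

    # Well-formedness check only; full validation would need the JSDL XSD.
    print(xml.dom.minidom.parseString(JSDL_ECHO).toprettyxml(indent="  "))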
Article
Full-text available
The WS-PGRADE/gUSE generic DCI gateway framework has been developed to support a large variety of user communities. It provides a general-purpose, workflow-oriented graphical user interface to create and run workflows on various DCIs, including clusters, Grids, desktop Grids and clouds. The framework can be used by NGIs to support small user communities who cannot afford to develop their own customized science gateway. The WS-PGRADE/gUSE framework also provides two API interfaces (the Application Specific Module API and the Remote API) to create application-specific science gateways according to the needs of different user communities. The paper describes in detail the workflow concept of WS-PGRADE, the DCI Bridge service that enables access to most of the popular European DCIs, and the Application Specific Module and Remote API concepts used to generate application-specific science gateways.
Article
Full-text available
The Nordic Data Grid Facility (NDGF) consists of Grid resources running ARC middleware in Denmark, Finland, Norway and Sweden. These resources serve many virtual organisations and contribute a large fraction of the total worldwide resources for the ATLAS experiment, whose data is distributed and managed by the DQ2 software. Managing ATLAS data within NDGF, and between NDGF and the other Grids used by ATLAS (the Enabling Grids for E-sciencE Grid and the Open Science Grid), presents a unique challenge for several reasons. Firstly, the entry point for data, the Tier 1 centre, is physically distributed among heterogeneous resources in several countries and yet must present a single access point for all data stored within the centre. The middleware framework used in NDGF differs significantly from other Grids, specifically in the way that all data movement and registration is performed by services outside the worker node environment. Also, the service used for cataloging the location of data files is different from other Grids but must still be usable by DQ2 and ATLAS users to locate data within NDGF. This paper presents in detail how we solve these issues to allow seamless worldwide access to data within NDGF.
Article
Full-text available
This specification defines the syntax and semantics for XML-encoded assertions about authentication, attributes, and authorization, and for the protocols that convey this information. Status: This is a second Committee Draft approved by the Security Services Technical Committee on 21 September 2004.
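As a non-normative illustration, a bare-bones SAML 2.0 authentication assertion has roughly the following shape; the identifier, timestamps, issuer URL and subject below are made up for this sketch, and a real assertion would also carry an XML signature.

    SAML_ASSERTION = """\
    <saml:Assertion xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"
                    ID="_example123" Version="2.0"
                    IssueInstant="2004-09-21T09:22:05Z">
      <saml:Issuer>https://idp.example.org</saml:Issuer>
      <saml:Subject>
        <saml:NameID>alice@example.org</saml:NameID>
      </saml:Subject>
      <saml:AuthnStatement AuthnInstant="2004-09-21T09:22:00Z">
        <saml:AuthnContext>
          <saml:AuthnContextClassRef>
            urn:oasis:names:tc:SAML:2.0:ac:classes:PasswordProtectedTransport
          </saml:AuthnContextClassRef>
        </saml:AuthnContext>
      </saml:AuthnStatement>
    </saml:Assertion>
    """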
Article
Full-text available
This specification defines protocol bindings for the use of SAML assertions and request-response messages in communications protocols and frameworks. Status: This is a working draft produced by the Security Services Technical Committee. See the Revision History for details of changes made in this revision.
Article
Full-text available
In the last three years, activities in Grid computing have changed; in particular, in Europe the focus has moved from purely research-oriented work on concepts, architectures, interfaces, and protocols towards activities driven by the usage of Grid technologies in the day-to-day operation of e-infrastructure and in application-driven use cases. This change is also reflected in the UNICORE activities [1]. The basic components and services have been established, and the focus is now increasingly on enhancement with higher-level services, integration of upcoming standards, deployment in e-infrastructures, setup of interoperability use cases and integration of applications. The development of UNICORE started more than 10 years ago, when in 1996 users, supercomputer centres and vendors discussed "what prevents the efficient use of distributed supercomputers?". The result of this discussion was a consensus which still guides UNICORE today: seamless, secure and intuitive access to distributed resources. Since the end of 2002, UNICORE has been continuously developed in several EU-funded projects, with a subsequent broadening of the UNICORE community to participants from across Europe. In 2004 the UNICORE software became open source, and since then UNICORE has been developed within the open source developer community. Publishing UNICORE as open source under the BSD license has promoted a major uptake in the community, with contributions from multiple organisations. Today the developer community includes developers from Germany, Poland, Italy, the UK, Russia and other countries. The structure of the paper is as follows. Section 2 describes the architecture of UNICORE 6 as well as the standards it implements, while Section 3 focuses on its clients. Section 4 covers recent developments and advancements of UNICORE 6, while Section 5 gives an outlook on future planned developments. The paper closes with a conclusion.
Article
A federated, secure, standardized, scalable, and transparent mechanism for accessing and sharing resources, particularly data resources, across organizational boundaries, one that requires no application modification and does not disrupt existing data access patterns, has been needed for some time in the computational science community. The Global Federated File System (GFFS) addresses this need and is a foundational component of the NSF-funded eXtreme Science and Engineering Discovery Environment (XSEDE) program. The GFFS allows user applications to access (create, read, update, delete) remote resources in a location-transparent fashion. Existing applications, whether they are statically linked binaries, dynamically linked binaries, or scripts (shell, Perl, Python), can access resources anywhere in the GFFS without modification (subject to access control). In this paper we present an overview of the GFFS and its most common use cases: accessing data at an NSF center from a home or campus machine, accessing data on a campus machine from an NSF center, directly sharing data with a collaborator at another institution, accessing remote computing resources, and interacting with remote running jobs. We present these use cases and how they are realized using the GFFS.
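The location transparency described above means that, once the GFFS namespace is exposed through a local mount point, an unmodified script uses ordinary file I/O. A minimal sketch, in which the mount point and file path are hypothetical:

    from pathlib import Path

    # Hypothetical local mount point of the GFFS namespace (e.g. via a FUSE client).
    mount = Path.home() / "gffs"
    remote_file = mount / "home" / "alice" / "results.csv"

    # Plain file I/O; access control is enforced by the GFFS, not by the script.
    with open(remote_file) as fh:
        print(fh.readline())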
Article
As computational Grids move away from the prototyping state, reliability, performance, and ease of use and maintenance become focus areas for their adoption. In this paper, we describe the ARC (Advanced Resource Connector) Grid middleware, in which these issues have been given special consideration. We present an in-depth view of the existing components of ARC, and discuss some of the new components, functionalities and enhancements currently under development. This paper also describes the architectural and technical choices that have been made to ensure scalability, stability and high performance. The core components of ARC have already been thoroughly tested in demanding production environments, where they have been in use since 2002. The main goal of this paper is to provide a first comprehensive description of ARC.
Conference Paper
In recent years, hype around Web services and their use in emerging software applications has prompted the creation of many standards and proto-standards. The OGF has seen a number of standards making their way through its design and editing pipelines. While this standards process progresses, it is important that implementations of these standards develop in parallel, both to validate the efforts of the standards authors and to provide feedback for further specification refinement. No specification exists in isolation; rather, it composes with others to form higher-order products. These specifications will form the grid infrastructure of the future, and an evaluation of this emerging work becomes increasingly relevant. Genesis II is a grid system implemented using these standards that serves both to provide the feedback described above and to function as a production-level grid system for research at the University of Virginia.
Conference Paper
In 2007, the most challenging high-energy physics experiment ever, the Large Hadron Collider (LHC) at CERN, will produce a sustained data stream on the order of 300 MB/sec, equivalent to a stack of CDs as high as the Eiffel Tower once per week. While it is being produced, this data is distributed and persistently stored at several dozen sites around the world, forming the LHC data grid. The destination sites are expected to provide the necessary middleware, so-called Storage Elements, offering standard protocols to receive the data and store it in the site-specific storage systems. A major player in the set of Storage Elements is the dCache/SRM system. dCache/SRM has proven capable of managing the storage and exchange of several hundred terabytes of data, transparently distributed among dozens of disk storage nodes. One of the key design features of dCache is that although the location and multiplicity of the data are autonomously determined by the system, based on configuration, CPU load and disk space, the namespace is uniquely represented within a single file system tree. The system has been shown to significantly improve the efficiency of connected tape storage systems through caching, 'gather & flush' and scheduled staging techniques. Furthermore, it optimizes the throughput to and from data clients and smooths the load of the connected disk storage nodes by dynamically replicating datasets when load hot spots are detected. The system is tolerant of failures of its data servers, which enables administrators to use commodity disk storage components. Access to the data is provided by various standard protocols. Furthermore, the software comes with an implementation of the Storage Resource Manager protocol (SRM), which is evolving into an open standard for grid middleware to communicate with site-specific storage fabrics.
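The CD-stack comparison in this abstract can be sanity-checked with simple arithmetic; the 700 MB capacity and 1.2 mm thickness per CD below are assumptions of this sketch, not figures from the paper.

    rate_mb_s = 300                           # sustained data rate in MB/sec
    seconds_per_week = 7 * 24 * 3600
    total_mb = rate_mb_s * seconds_per_week   # ~1.8e8 MB, about 181 TB

    cds_per_week = total_mb / 700             # assumed 700 MB per CD
    stack_m = cds_per_week * 1.2e-3           # assumed 1.2 mm per CD
    # Prints ~181 TB/week and a ~311 m stack, close to the Eiffel Tower's ~300 m.
    print(f"{total_mb / 1e6:.0f} TB/week, stack height ~{stack_m:.0f} m")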
Article
This paper presents the security architecture of the sixth version of the UNICORE grid middleware. The sixth iteration of UNICORE introduced a number of new security-related solutions which distinguish UNICORE from other grid middleware such as Globus, gLite or NorduGrid ARC; these are presented in this paper. The paper discusses low-level security: user authentication, non-repudiation control and trust delegation. UNICORE's unique approach to the challenge of trust delegation is called explicit trust delegation (ETD); its discussion constitutes the most significant and extensive part of this paper. ETD is compared with the popular Grid Security Infrastructure (GSI). High-level security services (such as authorization services) are not described in this paper.
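To give an intuition for trust delegation chains, the toy model below (an assumption-laden sketch, not the UNICORE implementation, which uses signed SAML assertions) records each step as "issuer allows subject to act on the original user's behalf" and accepts a chain only if it starts at the user and hands off without gaps.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Delegation:
        issuer: str    # who issued this delegation assertion
        subject: str   # who may now act on the issuer's behalf
        user: str      # the original user the work is done for

    def chain_is_valid(chain, user, consignor):
        """Accept only an unbroken chain from the user to the consignor."""
        current = user
        for d in chain:
            if d.issuer != current or d.user != user:
                return False
            current = d.subject
        return current == consignor

    chain = [Delegation("alice", "gateway", "alice"),
             Delegation("gateway", "worker", "alice")]
    print(chain_is_valid(chain, user="alice", consignor="worker"))  # True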
RNS Specification 1.1
  • M Morgan
  • A Grimshaw
  • O Tatebe
RNS 1.1 OGSA WSRF Basic Profile Rendering 1.0
  • M Morgan
  • O Tatebe
OGSA Basic Execution Service (BES), Version 1.0
  • I Foster
ByteIO Specification 1.0
  • M Morgan
The Storage Resource Manager Interface Specification, Version 2.2
  • A Sim