Towards Federated Service Discovery and Identity Management in Collaborative Data and Compute Cloud Infrastructures

This paper compares three multi-national research infrastructures, one that provides data services, one that provides compute services, and one that supports linguistics research. The aim is to jointly provide services to the user communities, and, perhaps eventually, seamlessly interoperate. To this end, we look at and compare how the infrastructures build their service federations (trust, service status, information systems), and how they manage users (identities, authentication, and authorisation).
J Grid Computing (2018) 16:663–681
Shiraz Memon · Jensen Jens · Elbers Willem · Helmut Neukirchen · Matthias Book · Morris Riedel
Received: 18 May 2017 / Accepted: 11 June 2018 / Published online: 26 June 2018
© Springer Nature B.V. 2018
H. Neukirchen · M. Book
University of Iceland, Reykjavik, Iceland
S. Memon · M. Riedel
Jülich Supercomputing Centre, Forschungszentrum Jülich, Leo-Brandt Straße, 52428 Jülich, Germany
J. Jens
STFC, Harwell Oxford Campus, Didcot, UK
E. Willem
CLARIN ERIC, Utrecht, Netherlands
Keywords Distributed infrastructure · Federated identity management · Service discovery · Standards · Interoperation · Cloud computing
1 Introduction
Distributed compute, data, and more recently, cloud infrastructures have been successful in providing resources to a wide variety of research communities. The e-Infrastructure Reflection Group identified in 2004 the outline/vision of a distributed infrastructure comprised of fabric (disk, CPU, networks), and a “middleware” layer connecting the infrastructure across sites; user communities would then develop and deploy their own applications on top of the e-infrastructure [44]. Also the Foster/Kesselman vision of grid computing [31], with computing available on demand through standard interfaces, was hugely influential in the development and use of e-infrastructures, leading for example to the middleware that is known as Globus Toolkit [29] and more recent Globus cloud services [30].
The established e-infrastructures have been very successful, having provided resources to researchers on a national or multinational scale in TeraGrid [36], European National Grid Initiatives (NGIs), Extreme Science and Engineering Discovery Environments (XSEDEs) [52], or, in the case of the world-wide
... It can register, unregister and search into an associated directory, allowing access to the offered services. Therefore, discovery services are composed of a system side, where the communication with the services and the directory is performed; and a directory side where services are stored after their discovery [20]. ...
... A possible solution to deal with the performance problem is to form a peer-to-peer network of discovery services to decrease the latency rate and be able to respond more quickly to the requests [9]. Another solution could be based on the application of heuristics to optimize the query process into this kind of search spaces [10]. ...
The current state of the technologies related to the Web of Things (WoT) and the Internet of Things (IoT) fosters the creation of service directories gathering resource descriptions. These directories are aimed at enabling the service discovery and supporting providers and consumers with a shared element for their communication and interoperability between the involved agents. This interoperability can be ensured by using the abstract layer of the W3C WoT recommendations. However, many of the existing approaches do not include a service discovery mechanism and those that include a WoT directory lack certain functionalities required by distributed systems specific to the WoT such as a Service-Oriented Architecture (SOA). This paper proposes a federated service discovery approach to support the management of WoT applications which ensures that the integration of the whole system components can be addressed by following a SOA. It is aimed at providing query and storage functionality for WoT resources, but also is intended to be connected to other WoT directories by applying a customizable approach based on recommender systems. Thus, we guarantee a flexible mechanism to obtain sets of ranked WoT resources to be utilized in different kinds of applications and domains.
... Cloud providers collaboration for the purpose of service discovery and management has been discussed in [37]. Solutions such as hybrid cloud and adaptive scheduling for heterogeneous workloads have impacted the performability of workflow applications in cloud environments [38,39]. ...
Most Internet of Things (IoT)-based service requests require excessive computation which exceeds an IoT device’s capabilities. Cloud-based solutions were introduced to outsource most of the computation to the data center. The integration of multi-agent IoT systems with cloud computing technology makes it possible to provide faster, more efficient and real-time solutions. Multi-agent cooperation for distributed systems such as fog-based cloud computing has gained popularity in contemporary research areas such as service composition and IoT robotic systems. Enhanced cloud computing performance gains and fog site load distribution are direct achievements of such cooperation. In this article, we propose a workflow-net based framework for agent cooperation to enable collaboration among fog computing devices and form a cooperative IoT service delivery system. A cooperation operator is used to find the topology and structure of the resulting cooperative set of fog computing agents. The operator shifts the problem defined as a set of workflow-nets into algebraic representations to provide a mechanism for solving the optimization problem mathematically. IoT device resource and collaboration capabilities are properties which are considered in the selection process of the cooperating IoT agents from different fog computing sites. Experimental results in the form of simulation and implementation show that the cooperation process increases the number of achieved tasks and is performed in a timely manner.
Currently, there is a proliferation of technological tools with a Science Gateway approach. For IT administrators, managing these kinds of tools is not a trivial activity, although there is a significant volume of related studies. This situation represents a latent challenge to IT administrators in TERS (Technology Ecosystem for Research Support). This paper analyzes and classifies studies related to IT resources and services management applicable to this type of technology ecosystem. Methodologically, we used an adaptation of guidelines aimed at the construction of an SMS (Systematic Mapping Study). Additionally, we performed an analysis of the papers to recognize inferences and trends in them, which allowed us to claim that cloud computing technology plays a predominant role. We consider it good practice for implementations that support research processes. In this sense, we recommend that those interested in this topic prioritize cloud technologies to achieve adequate management of the set of IT resources and services used to support Science Gateway environments.
Central elements of the TERENO network are “terrestrial observatories” at the catchment scale, which were selected in climate-sensitive regions of Germany for the regional analysis of climate change impacts. Within these observatories, small-scale research facilities and test areas are placed in order to accomplish energy, water, carbon and nutrient process studies across the different compartments of the terrestrial environment. Following a hierarchical scaling approach (point-plot-field), this detailed information and the gained knowledge will be transferred to the regional scale using integrated modelling approaches. Furthermore, existing research stations are enhanced and embedded within the observatories. In addition, mobile measurement platforms that enable monitoring of dynamic processes at the local scale, up to the determination of spatial patterns at the regional scale, are applied within TERENO.
AARC (Authentication and Authorisation for Research Communities) is a two-year EC-funded project to develop and pilot an integrated cross-discipline authentication and authorisation framework, building on existing authentication and authorisation infrastructures (AAIs) and production federated infrastructure. AARC also champions federated access and offers tailored training to complement the actions needed to test AARC results and to promote AARC outcomes. This article describes a high-level blueprint architecture for interoperable AAIs.
Conference Paper
The CLARIN research infrastructure aims to place language resources and services within easy reach of humanities researchers. One of the measures to make access easy is to allow these researchers to access them using their home institutions' credentials. However, the technology used for this makes it hard for services to make a delegated call, i.e., a call on behalf of the researcher, to other services. In this paper, several use cases, e.g., interaction with a researcher’s private workspace or protected resources, show how user delegation would enrich the capabilities of the infrastructure. To enable these use cases, various technical solutions have been investigated, and some of these have been used in pilot implementations of the use cases. This paper reports on the use cases, the research and the implementation experiences.
Conference Paper
Globus Auth is a foundational identity and access management platform service designed to address unique needs of the science and engineering community. It serves to broker authentication and authorization interactions between end-users, identity providers, resource servers (services), and clients (including web, mobile, desktop, and command line applications, and other services). Globus Auth thus makes it easy, for example, for a researcher to authenticate with one credential, connect to a specific remote storage resource with another identity, and share data with colleagues based on another identity. By eliminating friction associated with the frequent need for multiple accounts, identities, credentials, and groups when using distributed cyberinfrastructure, Globus Auth streamlines the creation, integration, and use of advanced research applications and services. Globus Auth builds upon the OAuth 2 and OpenID Connect specifications to enable standards-compliant integration using existing client libraries. It supports identity federation models that enable diverse identities to be linked together, while also providing delegated access tokens via which client services can obtain short term delegated tokens to access other services. We describe the design and implementation of Globus Auth, and report on experiences integrating it with a range of research resources and services, including the JetStream cloud, XSEDE, NCAR's Research Data Archive, and FaceBase.
Conference Paper
EPOS is an e-Infrastructure for solid Earth science in Europe. It integrates many heterogeneous Research Infrastructures (RIs) using a novel approach based on the harmonization of existing service and component interfaces. EPOS is designed to provide an architectural framework for new Research Infrastructures in the domain, and to interface with increasing sophistication with existing RIs, working with them in co-development from their present state to a future integrated state. The key is the metadata catalogue based on CERIF, which provides the virtualization required for EPOS to provide a homogeneous view over the heterogeneity. Architectural concepts, together with a plan for integration and collaboration with EPOS nodes in order to interoperate, are presented in this paper.
"Grid" computing has emerged as an important new field, distinguished from conventional distributed computing by its focus on large-scale resource sharing, innovative applications, and, in some cases, high-performance orientation. In this article, we define this new field. First, we review the "Grid problem," which we define as flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources (what we refer to as virtual organizations). In such settings, we encounter unique authentication, authorization, resource access, resource discovery, and other challenges. It is this class of problem that is addressed by Grid technologies. Next, we present an extensible and open Grid architecture, in which protocols, services, application programming interfaces, and software development kits are categorized according to their roles in enabling resource sharing. We describe requirements that we believe any such mechanisms must satisfy, and we discuss the central role played by the intergrid protocols that enable interoperability among different Grid systems. Finally, we discuss how Grid technologies relate to other contemporary technologies, including enterprise integration, application service provider, storage service provider, and peer-to-peer computing. We maintain that Grid concepts and technologies complement and have much to contribute to these other approaches.
This chapter traces the development of the computing service for Large Hadron Collider (LHC) data analysis at CERN over the 10 years prior to the start-up of the accelerator. It explores the main factors that influenced the choice of technology, a data-intensive computational Grid, provides a brief explanation of the fundamentals of Grid computing, and records some of the technical and organisational challenges that had to be overcome to achieve the capacity, performance, and usability requirements of the LHC experiments.
The Distributed Computing Infrastructure (DCI) has become an indispensable tool for scientific research. Such infrastructures are composed of many independent services that are managed by autonomous service providers. The discovery of services is therefore a primary function, which is a precursor for enabling efficient workflows that utilise multiple cooperating services. As DCIs, such as the European Grid Initiative (EGI), are based on a federated model of cooperating yet autonomous service providers, a federated approach to service discovery is required that seamlessly fits into the operational and management procedures of the infrastructure. Many existing approaches rely on a centralised service registry, which is not suited to a federated deployment and operational model. A federated service registry is therefore required that is capable of scaling to handle the number of services and discovery requests found in a production DCI. In this paper we present the EMI Registry (EMIR), a decentralised architecture that supports both hierarchical and peering topologies, enabling autonomous domains to collaborate in a federated infrastructure. An EMIR pilot service is used in order to evaluate a prototype of this architecture under real-world conditions with a geographically-dispersed deployment. The results of this initial deployment are provided along with a few performance measurements.
Conference Paper
Federated Identity Management is considered a promising approach to facilitate secure resource sharing between collaborating partners. The adoption rate of identity federation technologies in the industrial domain, however, has not been as expected. A structured survey provides the basis for this paper, which reports on challenges related to Federated Identity Management. This paper presents a narrative of the main challenges that are reported in existing FIdM research, and provides a starting point for those who seek to learn more about these concepts.