Conference Paper

GridBeans: Support e-Science and Grid Applications


Abstract

Large-scale scientific research often relies on the collaborative use of Grid and e-Science infrastructures that provide computational or storage-related resources. One of the ideas behind these infrastructures is to facilitate the routine interaction of scientists and their workflows with advanced problem-solving tools and computational resources. While many production Grid projects and e-Science infrastructures have begun to offer end-users services for the use of resources during the past several years, the corresponding emerging standards defined by GGF and OASIS still appear to be in flux. In this paper, we present the GridBean technology, which bridges the gap between the constantly changing basic Grid or e-Science infrastructures and the need for stable application development environments for Grid users.


... We will discuss our recent progress in this area, including efficient Green's function evaluation and parallelization strategies based on block diagonalization techniques and the Krylov subspace method [1] [2], the use of scattering states for improved performance of finite bias transport calculations [3], finite-element techniques coupled to first-principles calculations [4], and multi-scale models that take advantage of methods that operate on different complexity levels and length scales [4] [5] [6]. ...
... References: [1] D. E. Petersen, H. H. B. Sørensen, P. C. Hansen, S. Skelboe, and K. Stokbro, Journal of Computational Physics 227, 3174-3190 (2008) [2] S. Skelboe, The Scheduling of a Parallel Tiled Matrix Inversion Algorithm, submitted (2010) [3] H. H. B. Sørensen, P. C. Hansen, D. E. Petersen, S. Skelboe, and K. Stokbro, Phys. ...
... Méhaut, A. Neelov, S. Goedecker, Density functional theory calculation on many-cores hybrid central processing unit-graphic processing unit architectures; Journal of Chemical Physics, Vol. 131, p. 034103 (2009) [2] L. Genovese, A. Neelov, S. Goedecker, T. Deutsch, S. A. Ghasemi, A. Willand, D. Caliste, O. ...
Article
On the nanoscale, electrical currents behave radically differently than on the microscale. As the active regions become comparable to or smaller than the mean free path of the material, it becomes necessary to describe the electron transport by quantum-mechanical methods instead of using classical relations like Ohm's law. Over the past decade, methods for computing electron tunneling currents in nanosized junctions have evolved steadily, and are now approaching a sophistication where they can provide real assistance in the development of novel semiconductor materials and devices. At the same time, the industry's demand for such solutions is rising rapidly to meet the challenges both above and below the 16 nm node. In this paper we provide an overview of the current state of the art in modeling electrical currents on the nanoscale using atomic-scale simulations.
... In general, the framework offers Web Service interfaces and a GUI framework, which hides the complexity of the underlying Grid technology from the user. An example of a GPE-based client is the UNICORE Application Client [79], which is specifically designed to support scientists with one particular scientific application GUI. GPE offers components on three different levels: utility level, service level and application level. ...
... The GridBean Software Development Kit and a GUI framework are added at the level of applications. This GridBean [79] concept is a plugin technology for clients that abstracts from applications. It is the fundamental concept of GPE and represents the foundation to link Grid technologies and e-Science applications. ...
... The UNICORE Rich Client (URC) [37] is integrated in the client tier of the UNICORE three tier architecture (see Section 2.1). UNICORE offers at the client tier several other clients, like a command line client [34], an Application client [79] and a high level API. They exploit all features provided by the underlying infrastructure. ...
Article
Today, most biological applications place high demands on data resources and computational power. To meet these demands, powerful computer and storage systems are required, used in conjunction with modern software technologies that allow for seamless execution of applications on these systems. e-Science infrastructures, which provide access to data and computing resources for scientific domains such as computational biology, are suited to meet these requirements. They are commonly used via Grid middleware systems, which in turn are accessible to users via Grid clients. Grid clients allow biological scientists to seamlessly access e-Science infrastructures without necessarily being well versed in computer science. Furthermore, they provide powerful graphical user interfaces that hide the low-level complexity of the underlying infrastructure, so that scientists can concentrate on their scientific challenges.
... Currently, UNICORE offers four different client variants: command line client, graphical rich client, a portal client and a high-level API. The UNICORE Rich Client (URC) [57] provides us with the software basis to integrate eSBMTools within an application extension called GridBean [58]. In addition, UNICORE includes a workflow engine and a powerful graphical workflow editor, completely integrated in the service layer and the URC, respectively. ...
... The implementation of the SBM GridBean is based on the GridBean API [58] and consists of three major parts: ...
Article
Background: Molecular dynamics (MD) simulations provide valuable insight into biomolecular systems at the atomic level. Notwithstanding the ever-increasing power of high performance computers, current MD simulations face several challenges: the fastest atomic movements require time steps of a few femtoseconds, which are small compared to biomolecularly relevant timescales of milliseconds or even seconds for large conformational motions. At the same time, scalability to a large number of cores is limited, mostly due to long-range interactions. An appealing alternative to atomic-level simulations is coarse-graining the resolution of the system or reducing the complexity of the Hamiltonian to improve sampling while decreasing computational costs. Native structure-based models, also called Gō-type models, are based on energy landscape theory and the principle of minimal frustration. They have been tremendously successful in explaining fundamental questions of, e.g., protein folding, RNA folding or protein function. At the same time, they are computationally sufficiently inexpensive to run complex simulations on smaller computing systems or even commodity hardware. Still, their setup and evaluation are quite complex even though sophisticated software packages support their realization. Results: Here, we establish an efficient infrastructure for native structure-based models to support the community and enable high-throughput simulations on remote computing resources via GridBeans and UNICORE middleware. This infrastructure organizes the setup of such simulations, resulting in increased comparability of simulation results. At the same time, complete workflows for advanced simulation protocols can be established and managed on remote resources by a graphical interface, which increases reusability of protocols and additionally lowers the entry barrier into such simulations for, e.g., experimental scientists who want to compare their results against simulations. We demonstrate the power of this approach by illustrating it for protein folding simulations for a range of proteins. Conclusions: We present software enhancing the entire workflow for native structure-based simulations, including exception handling and evaluations. By extending the capability and improving the accessibility of existing simulation packages, the software goes beyond the state of the art in the domain of biomolecular simulations. Thus we expect that it will stimulate more individuals from the community to employ modeling more confidently in their research. Electronic supplementary material: The online version of this article (doi:10.1186/1471-2105-15-292) contains supplementary material, which is available to authorized users.
... Also, we identified a demand for a managing Grid client (e.g. GPE Client suite [8], or CogKits) within the COVS framework that provides an overview of the session by using information exposed by the underlying Grid middleware. Therefore, the Grid client as well as its COVS specific plugin for COVS session management also represent core building blocks. ...
... Therefore, the Grid client as well as its COVS specific plugin for COVS session management also represent core building blocks. In our implementation, we used the GPE application client and developed a COVS GridBean as shown in Figure 2. GridBeans are scientific application-specific plugins for the GPE clients [8]. Finally, the parallel simulation and scientific visualization represent core building blocks that are scientific application-specific and thus it is feasible to provide a concrete example in the next Section. ...
Article
Many production e-Science infrastructures (e.g. DEISA, D-Grid) have begun to offer a wide variety of services for end-users during the past several years. Many e-Scientists solve their scientific problems by using parallel computing applications on clusters, and collaborative online visualization and steering (COVS) is known as a tool for analyzing and better understanding these applications. In the absence of a widely accepted COVS framework within Grids, visualizations are often created using proprietary technologies assuming a dedicated scenario. This makes it feasible to analyze the usual requirements to provide a blueprint for a more general COVS framework that can be integrated into Grid middleware systems such as UNICORE, gLite, or Globus Toolkits. These requirements lead to a design that was successfully implemented as a higher-level service in UNICORE and presented at numerous places such as the Open Grid Forum 19 and 20, Europar 2006, Supercomputing 2006 and DEISA trainings.
... The GridBean Application Programming Interface (API) [8] and the UNICORE workflow engine [9] provide very flexible infrastructure for application integration in workflows, facilitating the management of storage and staging of input and output data. In particular, input and output files of GridBeans connected in a workflow can be shared using a special UNICORE workflow storage. ...
Article
Managing data exchange in scientific workflow simulations is a challenge for which existing solutions have only limited success. Based on a data-model approach, we developed a service-oriented, platform- and language-independent framework which provides seamless access for heterogeneous applications to data storage resources distributed over grids and clouds. After implementing a working prototype of the framework, we demonstrated its benefits for a complex workflow application from the domain of multiscale materials modeling. Because of its generic architecture and wide standards conformity, the framework can be deployed as grid or cloud middleware for various other domains of computational science to enable rapid development of complex workflow applications.
... In [8] the concept of a GridBean in the GPE grid middleware platform was introduced. A GridBean is an object responsible for generating the job description for grid services and providing a graphical user interface for input and output data. ...
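The division of responsibilities described in this snippet (collect user input, then generate a job description) can be illustrated with a minimal sketch. This is not the GPE or UNICORE GridBean API, which is Java-based; the class names `JobDescription` and `ExampleGridBean`, the application name, and the output file name are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class JobDescription:
    """Hypothetical, simplified job description (not real JSDL)."""
    application: str
    arguments: List[str]
    stage_in: List[str] = field(default_factory=list)
    stage_out: List[str] = field(default_factory=list)


class ExampleGridBean:
    """Hypothetical plugin that maps user inputs to a job description."""

    application = "example-simulation"  # assumed application name

    def __init__(self) -> None:
        # Values that a GUI input panel would normally collect from the user.
        self.inputs: Dict[str, str] = {}

    def set_input(self, name: str, value: str) -> None:
        self.inputs[name] = value

    def build_job(self) -> JobDescription:
        # Turn each input into a "--name=value" argument, in sorted key order.
        args = [f"--{key}={value}" for key, value in sorted(self.inputs.items())]
        # Stage in the input file only if the user supplied one.
        stage_in = [self.inputs["input_file"]] if "input_file" in self.inputs else []
        return JobDescription(
            application=self.application,
            arguments=args,
            stage_in=stage_in,
            stage_out=["result.dat"],  # assumed output file name
        )
```

In this sketch the client only ever sees a `JobDescription`, so the plugin can change how inputs are gathered without affecting how jobs are submitted.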
Conference Paper
Computer-Aided Engineering (CAE) systems demand a vast amount of computing resources to simulate modern hi-tech products. In this paper, we consider the problem-oriented approach to access remote distributed supercomputer resources using the concept of distributed virtual test bed (DiVTB). DiVTB provides a problem-oriented user interface to distributed computing resources within the grid, online launch of CAE simulation, automated search, monitoring and allocation of computing resources for carrying out the virtual experiments. To support the concept of DiVTB the DiVTB technology was developed. DiVTB technology provides a solution for the development and deployment of DiVTB, integration of most common CAE systems into the distributed computing environment as grid services (based on the UNICORE grid middleware) and web access to CAE simulation process.
... It illustrates a collaborative scenario with two geographically dispersed participants (i.e., client tier A and B). Both run on their machine a scientific visualization that interacts with a COVS GridBean plug-in, which extends the GPE UNICORE Grid client [8]. The Grid client is used to submit computational jobs to the Grid and to access two dedicated COVS services that are compliant with the Web Service Resource Framework (WS-RF) standard [21]. ...
Chapter
Over the past several years, large-scale scientific research has often relied on the collaborative use of massive computational power, fast networks, and large storage capacities provided by e-science infrastructures (e.g., DEISA, EGEE). Especially within e-science infrastructures driven by high-performance computing (HPC) such as DEISA, collaborative online visualization and computational steering (COVS) has become an important technique to enable HPC applications with interactivity and visualized feedback mechanisms. In earlier work we have shown a prototype COVS technique implementation based on the visualization interface toolkit (VISIT) and the Grid middleware of DEISA, named Uniform Interface to Computing Resources (UNICORE). Since then the approach has grown into a broader COVS framework. More recently, we investigated the impact of using the computational steering capabilities of the COVS framework implementation in UNICORE on large-scale HPC systems (i.e., IBM BlueGene/P with 65536 processors) and the use of attribute-based authorization. In this chapter we emphasize the improved collaborative features of the COVS framework and present new insights into how we deal with dynamic management of n participants, transparency of Grid resources, and virtualization of hosts of end-users. We also show that our interactive approach to HPC systems fully supports the necessary single sign-on feature required in Grid and e-science infrastructures. Keywords: Scientific visualization, Computational steering, COVS, VISIT, UNICORE
... The parallel simulations that are used in conjunction with the COVS framework are submitted to the computational resource using a Grid client (e.g. GPE Grid Client [5]) and the underlying Grid middleware (e.g. UNICORE) of the correspondent Grids (e.g. ...
Conference Paper
Today's large-scale scientific research often relies on the collaborative use of a Grid or e-Science infrastructure (e.g. DEISA, EGEE, TeraGrid, OSG) with computational, storage, or other types of physical resources. One of the goals of these emerging infrastructures is to support the work of scientists with advanced problem-solving tools. Many e-Science applications within these infrastructures aim at simulations of a scientific problem on powerful parallel computing resources. Typically, a researcher first performs a simulation for some fixed amount of time and then analyses results in a separate post-processing step, for instance, by viewing results in visualizations. In earlier work we have described early prototypes of a Collaborative Online Visualization and Steering (COVS) Framework in Grids that performs both simulation and visualization at the same time (online) to increase the efficiency of e-Scientists. This paper evaluates the evolved mature reference implementation of the COVS framework design that is ready for production usage within Web service-based Grid and e-Science infrastructures.
Chapter
After several years of development, computational infrastructure has matured considerably. In particular, now that grid technology has reached production level, users have several options for dealing with insufficient resources when submitting large-scale jobs to the infrastructure. The capability of dynamic provisioning, fast deployment of virtual machines and interoperation can help users complete their jobs correctly even if the local resources are not adequate. To orchestrate these approaches well and find the optimal solution for end users, this chapter proposes a novel orchestrating model to optimize these approaches efficiently. An image processing application running on three heterogeneous grid platforms is adopted to demonstrate the efficiency and capability of the proposed model. The results proved that the optimal solution can efficiently orchestrate the jobs among the grids.
Article
Traditional grid managers and schedulers considered availability and optimization of computing resources and did not take into account data transfer delays when deciding on job submission. This is acceptable in applications where data transfer time is negligible compared to job execution time. However, many e-science applications, such as linear algebra, image processing, and data mining [3], are becoming more data intensive. In this paper, we propose a genetic algorithm (GA)-based approach that coschedules computational and networking resources. The proposed approach assumes the availability of an on-demand reconfigurable optical network infrastructure. We study the performance of the proposed approach and compare it to traditional grid scheduling. Simulation results show the advantages of the proposed coscheduling approach, especially in networking-intensive applications.
Conference Paper
Native structure-based models are a broadly used technique in biomolecular simulation, allowing understanding of complex processes in the living cell involving biomacromolecules. Based on energy landscape theory and the principle of minimal frustration, these models find wide application in simulating complex biological processes as diverse as protein or RNA folding and assembly, conformational transitions associated with allostery, and structure prediction. To allow rapid adoption by scientists, especially experimentalists, having no background in programming or high performance computing, we here provide an effective user interface to existing applications running on distributed computing resources. Based on the gateway technologies WS-PGRADE and gUSE, we developed a web-based community application service for native structure-based modeling by integrating a powerful user interface with an existing UNICORE grid application based on the eSBMTools package. The eSBM portlet has been integrated into the MoSGrid portal and is immediately accessible to the bioinformatics, biophysics and structural biology communities.
Conference Paper
Computer-Aided Engineering (CAE) systems demand a vast amount of computing resources to simulate modern hi-tech products. In this paper we consider the problem-oriented approach to access remote distributed supercomputer resources using the concept of distributed virtual test bed (DiVTB). DiVTB provides a problem-oriented user interface to distributed computing resources within the grid, online launch of CAE simulation, automated search, monitoring and allocation of computing resources for carrying out the virtual experiments. To support the concept of DiVTB the CAEBeans technology was developed. CAEBeans technology provides a solution for the development and deployment of DiVTB, integration of most common CAE systems into the distributed computing environment as grid services (based on the UNICORE grid middleware) and web access to CAE simulation processes.
Conference Paper
Many existing problem solving environments provide scientists with convenient methods for building scientific applications over distributed computational and storage resources. In many cases a basic set of features of such environments is sufficient to conduct a complete experiment flow. However, complex cases often require extensions supporting an external piece of software or a communication standard not integrated beforehand. Most environments deal with such cases by providing an extension facility and letting third parties add required features. The GridSpace environment also includes several mechanisms for extending its own functionality and here we describe how this can be accomplished. We focus on extensions already implemented such as local job submission and scripting language repositories, as well as on a GUI extension point which can be used to add custom graphical user interfaces to GridSpace experiments independently of their release process.
Conference Paper
After several years of development, computational infrastructure has matured considerably. In particular, now that grid technology has reached production level, users have several options for dealing with insufficient resources when submitting large-scale jobs to the infrastructure. Dynamic provisioning, fast deployment of virtual machines, and interoperation can help users finish their jobs correctly even if the local resources are not adequate. To orchestrate these approaches well and find the optimal solution for users, this paper proposes a novel orchestrating model that optimizes these approaches for high efficiency. An image processing application running on three heterogeneous grid platforms is adopted to demonstrate the efficiency and capability of the proposed model. The results proved that the optimal solution can efficiently orchestrate the jobs among the grids.
Article
Especially within grid infrastructures driven by high-performance computing (HPC), collaborative online visualization and steering (COVS) has become an important technique to dynamically steer the parameters of a parallel simulation or to just share the outcome of simulations via visualizations with geographically dispersed collaborators. In earlier work, we have presented a COVS framework reference implementation based on the UNICORE grid middleware used within DEISA. This paper lists current limitations of the COVS framework design and implementation related to missing fine-grained authorization capabilities that are required during collaborative COVS sessions. Such capabilities use end-user information about roles, project membership, or participation in a dedicated virtual organization (VO). We outline solutions and present a design and implementation of our architecture extension that uses attribute authorities such as the recently developed virtual organization membership service (VOMS) based on the security assertion markup language (SAML).
Article
Many production Grid infrastructures such as DEISA, EGEE, or TeraGrid have begun to offer services to end-users that include access to computational resources. The major goal of these infrastructures is to facilitate the routine interaction of scientists and their workflows with advanced tools and seamless access to computational resources via Grid middleware systems such as UNICORE, gLite or Globus Toolkits. While UNICORE 5 has been used in production Grids for several years, an early prototype of the new Web services-based UNICORE 6 recently became available and will be continuously improved in the next months for its use in production. In the absence of a widely accepted framework for visualization and steering, the new UNICORE 6 Grid middleware does not provide such a higher-level service by default. This motivates this contribution to support e-Scientists in upcoming WS-based UNICORE Grids with visualization and steering techniques. In this paper we present the augmentation of the early standards-based UNICORE 6 prototype with a higher-level service for collaborative online visualization and steering. We describe the seamless integration of this service within UNICORE Grids while retaining the convenient single sign-on feature.
Article
Many production Grid and e-science infrastructures have offered their broad range of resources via services to end-users during the past several years, with an increasing number of scientific applications that require access to a wide variety of resources and services in multiple Grids. But the vision of world-wide federated Grid infrastructures, in analogy to the electrical power Grid, is still not seamlessly provided today. This is partly because Grids provide a much greater variety of services (job management, data management, data transfer, etc.) than the electrical power Grid, but also because the emerging open standards still need to be improved for production usage. This paper addresses exactly these improvements with a well-defined design of an infrastructure interoperability reference model that is based on open standards refined with experience gained from production Grid interoperability use cases. This contribution gives insights into the core building blocks in general, but focuses significantly on the computing building blocks of the reference model in particular.
Conference Paper
Molecular simulations play an important role in understanding the mechanisms at the microscopic level of all organisms. The increasing computing power available on single computers is still not enough for molecular simulations. In this case, grid middleware, which has the ability to distribute calculations in a seamless and secure way over different computing systems, becomes an immediate solution. In this paper we present a GridBean developed for the NAMD application, which is commonly used for molecular simulations. The GridBean is designed according to the standards of the Grid Programming Environment (GPE) currently being developed and implemented, based on the latest versions of the Globus and UNICORE middleware.
Article
In the last three years activities in Grid computing have changed; in particular in Europe the focus moved from pure research-oriented work on concepts, architectures, interfaces, and protocols towards activities driven by the usage of Grid technologies in day-to-day operation of e-infrastructure and in application-driven use cases. This change is also reflected in the UNICORE activities [1]. The basic components and services have been established, and now the focus is increasingly on enhancement with higher level services, integration of upcoming standards, deployment in e-infrastructures, setup of interoperability use cases and integration of applications. The development of UNICORE started more than 10 years ago, when in 1996 users, supercomputer centres and vendors were discussing "what prevents the efficient use of distributed supercomputers?". The result of this discussion was a consensus which still guides UNICORE today: seamless, secure and intuitive access to distributed resources. Since the end of 2002 continuous development of UNICORE has taken place in several EU-funded projects, with the subsequent broadening of the UNICORE community to participants from across Europe. In 2004 the UNICORE software became open source and since then UNICORE has been developed within the open source developer community. Publishing UNICORE as open source under the BSD license has promoted a major uptake in the community with contributions from multiple organisations. Today the developer community includes developers from Germany, Poland, Italy, UK, Russia and other countries. The structure of the paper is as follows. In Section 2 the architecture of UNICORE 6 as well as implemented standards are described, while Section 3 focusses on its clients. Section 4 covers recent developments and advancements of UNICORE 6, while in Section 5 an outlook on future planned developments is given. The paper closes with a conclusion.
Conference Paper
Heterogeneous grid workflow management plays an important role in grid interoperation. It integrates service resources from different grid systems and packs them as atomic services into a composite service. The research focuses on how to shield the differences among heterogeneous grid services and how to provide a flexible way to compose services. The heterogeneous grid workflow management mechanism, based on virtual services, complies with BPEL4WS standards and shields the differences in both the organization of service information and the types of service among heterogeneous grids. The mechanism compensates for the weakness that traditional mechanisms require prior static binding in service selection and scheduling. The proposed mechanism can provide new functions such as service backup and dynamic service selection based on QoS to improve the flexibility and stability of workflow management. In addition, to address the tight coupling between data and service, the proposed mechanism employs a virtual data space and a corresponding data transfer solution.
Conference Paper
With e-science applications becoming more and more data-intensive, data is generally generated and stored at different locations and can be divided into independent subsets to be analyzed in a distributed manner at many compute locations across an optical grid. Optimal utilization of optical grid resources is required. This is generally achieved by minimizing application completion time, which is calculated as the sum of the times spent on data transmission and analysis. We propose a Genetic Algorithm (GA) based approach that co-schedules computing and networking resources to achieve this objective. The proposed approach defines a schedule of which data subsets to transfer to which sites at what times in order to minimize data processing time, and defines the routes used for transferring data subsets in order to minimize data transfer times. Simulation results show the advantages of the proposed approach in both minimizing the maximum application completion time and reducing the overall genetic algorithm execution time.
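The objective stated in this abstract, completion time as the sum of transmission and analysis time, can be made concrete with a small sketch of a schedule evaluation function, the kind of quantity a genetic algorithm could use as its fitness. This is not the authors' algorithm: the `Assignment` tuple layout, the function name `completion_time`, and the assumption that subsets at one site are handled one after another while sites run in parallel are illustrative simplifications.

```python
from typing import Dict, List, Tuple

# Hypothetical record for one data subset:
# (site name, transfer time in seconds, analysis time in seconds)
Assignment = Tuple[str, float, float]


def completion_time(schedule: List[Assignment]) -> float:
    """Return the maximum, over sites, of summed transfer + analysis time.

    Simplifying assumption: subsets assigned to the same site are processed
    sequentially, and different sites work in parallel.
    """
    per_site: Dict[str, float] = {}
    for site, transfer, analysis in schedule:
        per_site[site] = per_site.get(site, 0.0) + transfer + analysis
    # An empty schedule completes immediately.
    return max(per_site.values(), default=0.0)
```

A candidate schedule with a lower `completion_time` would be preferred; route selection and link contention, which the abstract also mentions, are deliberately left out of this sketch.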
Conference Paper
Today, many scientific disciplines rely heavily on computer systems for in-silico experimentation or data management and analysis. The employed computer hardware and software is heterogeneous and complies with different standards, interfaces and protocols for interoperation. Grid middleware systems like UNICORE 6 try to hide some of the complexity of the underlying systems by offering high-level, uniform interfaces for executing computational jobs or storing, moving, and searching through data. Via UNICORE 6, computer resources can be accessed securely with different software clients, e.g. the UNICORE Command line Client (UCC) or the graphical UNICORE Rich Client (URC), which is based on Eclipse. In this paper, we describe the design and features of the URC, and highlight its role as a flexible and extensible Grid client framework using the QSAR field as an example.
Conference Paper
Most grid applications require the processing of large amounts of data stored at different locations across the network, which makes optical grid infrastructures well suited to such applications. The increasing data and communication intensity of these applications calls for new mechanisms and theories on how to optimally allocate optical grid resources. Typical Divisible Load Theory (DLT) optimally allocates computational resources. In this paper, we introduce the Network Aware Divisible Load Algorithm (NADLA), which extends DLT to optimally allocate both the computational and networking resources of an optical grid. We assume a data- and communications-intensive application where data is stored at different sites across the optical grid and can be divided into independent subsets to be processed in parallel at different sites. The algorithm defines an optimal data transfer schedule that specifies when to transfer which data subsets to which sites across the optical grid in order to minimize the overall application completion time. It consists of two phases. In the first phase, NADLA provides a load distribution that minimizes application completion time, taking into account network connectivity and the availability of computational and networking resources. Site connectivity is estimated by considering the bandwidth and the expected free time of all the links connecting this site to other sites. In the second phase, a simple greedy algorithm is used to allocate computational and networking resources. Extensive simulations are conducted to examine the performance of the proposed algorithm for different application types and sizes, and different optical grid topologies. Simulation results show the advantages of the proposed algorithm over the traditional DLT approach for the category of applications considered.
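The basic idea of divisible load allocation that this abstract builds on can be sketched in its simplest textbook form: split the load so that every site finishes at the same time. The sketch below is not NADLA; it assumes zero transfer cost and a known constant processing speed per site, and the function name `divisible_load_split` is hypothetical. Under those assumptions each share is simply proportional to the site's speed.

```python
from typing import List


def divisible_load_split(total_load: float, speeds: List[float]) -> List[float]:
    """Split a divisible load so that all sites finish at the same time.

    Assumptions (not from the paper): transfer cost is zero, and site i
    processes speeds[i] load units per second, so its share is
    proportional to its speed.
    """
    if not speeds or any(speed <= 0 for speed in speeds):
        raise ValueError("speeds must be a non-empty list of positive numbers")
    total_speed = sum(speeds)
    return [total_load * speed / total_speed for speed in speeds]
```

A network-aware variant, as the abstract describes, would additionally account for link bandwidth and availability when computing the shares; that part is not modeled here.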
Conference Paper
Full-text available
Many existing Grid technologies and resource management systems lack a standardized job submission interface in Grid environments or e-Infrastructures. Even if the same language for job description is used, the interface for job submission often differs between these technologies. The evolution of the standardized Job Submission Description Language (JSDL) and the OGSA - Basic Execution Services (OGSA-BES) paves the way to improve the interoperability of all these technologies, enabling cross-Grid job submission and better resource management capabilities. In addition, the ByteIO standards provide useful mechanisms for data access that can be used in conjunction with these improved resource management capabilities. This paper describes the integration of these standards into the recently released UNICORE 6 Grid middleware, which is based on open standards such as the Web Services Resource Framework (WS-RF) and WS-Addressing (WS-A).
Conference Paper
In the last decade, life science applications have become more and more integrated into e-Science environments, hence they are typically very demanding, both in terms of computational capabilities and data capacities. In particular, access to life science applications embedded in such environments via Grid clients still constitutes a major hurdle for scientists who do not have an IT background. Life science applications often comprise a whole set of small programs instead of a single executable. Many of the graphical Grid clients are not perfectly suited for these types of applications, as they often assume that Grid jobs will run a single executable instead of a set of chained executions (i.e. sequences). This means that in order to execute a sequence of multiple programs on a single Grid resource, piping data from one program to the next, the user would have to run a hand-written shell script. Otherwise each program is independently scheduled as a Grid job, which causes unnecessary file transfers between the jobs, even if they are scheduled on the same resource. We present a generic solution to this problem and provide a reference implementation, which seamlessly integrates with the Grid middleware UNICORE. Our approach focuses on a comfortable user interface for the creation of such program sequences, validated in UNICORE-driven HPC-based Grids. We applied our approach in order to provide support for the usage of the AMBER package (a widely-used collection of programs for molecular dynamics simulations) within Grid workflows. Finally, we provide a scientific use case of our approach leveraging the interoperability of two different scientific infrastructures, which represents an instance of the infrastructure interoperability reference model.
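The chaining idea described in this abstract, running several small programs in sequence on one resource and handing each program's output to the next, can be sketched in a few lines. This is only an illustration under stated assumptions, not the UNICORE reference implementation: `run_sequence` is a hypothetical helper, and the two steps are stand-in Python one-liners rather than AMBER programs.

```python
import subprocess
import sys


def run_sequence(commands, input_data=b""):
    """Run commands one after another on the local machine.

    The standard output of each command becomes the standard input of the
    next one, so no intermediate files are staged between the steps.
    """
    data = input_data
    for cmd in commands:
        # check=True stops the sequence as soon as one step fails.
        result = subprocess.run(cmd, input=data, capture_output=True, check=True)
        data = result.stdout
    return data


# Stand-in steps (hypothetical): upper-case the input, then reverse it.
UPPERCASE = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]
REVERSE = [sys.executable, "-c",
           "import sys; sys.stdout.write(sys.stdin.read()[::-1])"]

if __name__ == "__main__":
    print(run_sequence([UPPERCASE, REVERSE], b"abc").decode())
```

Treating the whole sequence as one unit is what avoids scheduling each program as a separate Grid job with its own file transfers.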
Article
Full-text available
Computational simulations, and thus scientific computing, are the third pillar alongside theory and experiment in today's science. The term e-science evolved as a new research field that focuses on collaboration in key areas of science using next generation computing infrastructures (i.e. so-called e-science infrastructures) to extend the potential of scientific computing. During the past years, significant international and broader interdisciplinary research is increasingly carried out by global collaborations that often share a single e-science infrastructure. More recently, increasing complexity of e-science applications that embrace multiple physical models (i.e. multi-physics) and consider a larger range of scales (i.e. multi-scale) is creating a steadily growing demand for world-wide interoperable infrastructures that allow for new innovative types of e-science by jointly using different kinds of e-science infrastructures. But interoperable infrastructures are still not seamlessly provided today, and we argue that this is due to the absence of a realistically implementable infrastructure reference model. Therefore, the fundamental goal of this paper is to provide insights into our proposed infrastructure reference model, which represents a trimmed-down version of OGSA in terms of functionality and complexity, while on the other hand being more specific and thus easier to implement. The proposed reference model is underpinned with experiences gained from e-science applications that achieve research advances by using interoperable e-science infrastructures.
Article
The first step in finding a "drug" is screening chemical compound databases against a protein target. In silico approaches like virtual screening by molecular docking are well established in modern drug discovery. As molecular databases of compounds and target structures are becoming larger and more and more computational screening approaches are available, there is an increased need for compute power and more complex workflows. In this regard, computational Grids are predestined and offer seamless compute and storage capacity. In recent projects related to pharmaceutical research, the high computational and data storage demands of large-scale in silico drug discovery approaches have been addressed by using Grid computing infrastructures, both in the pharmaceutical industry and in academic research. Grid infrastructures are part of the so-called eScience paradigm, where a digital infrastructure supports collaborative processes by providing relevant resources and tools for data- and compute-intensive applications. Substantial computing resources, large data collections and services for data analysis are shared on the Grid infrastructure and can be mobilized on demand. This review gives an overview of the use of Grid computing for in silico drug discovery and tries to provide a vision of future development of more complex and integrated workflows on Grids, spanning from target identification and target validation via protein-structure and ligand dependent screenings to advanced mining of large scale in silico experiments.
Conference Paper
Full-text available
In a distributed grid environment with ambitious service demands, the job submission and management interfaces provide functionality of major importance. Emerging e-science and grid infrastructures such as EGEE and DEISA rely on highly available services that are capable of managing scientific jobs. It is the adoption of emerging open standard interfaces which allows the distribution of grid resources in such a way that their actual service implementation or grid technologies are not isolated from each other, especially when these resources are deployed in different e-science infrastructures that consist of different types of computational resources. This paper motivates the interoperability of these infrastructures and discusses solutions. We describe the adoption of various open standards that recently emerged from the Open Grid Forum (OGF) in the field of job submission and management by well-known grid technologies, namely gLite and UNICORE. This has a fundamental impact on the interoperability between these technologies and thus within the next generation e-science infrastructures that rely on these technologies.
Conference Paper
Full-text available
In the past several years, many scientific applications from various domains have taken advantage of e-science infrastructures that share storage or computational resources such as supercomputers, clusters or PC server farms across multiple organizations. Especially within e-science infrastructures driven by high-performance computing (HPC) such as DEISA, online visualization and computational steering (COVS) has become an important technique to save compute time on shared resources by dynamically steering the parameters of a parallel simulation. This paper argues that future supercomputers in the Petaflop/s performance range with up to 1 million CPUs will create an even stronger demand for seamless computational steering technologies. We discuss upcoming challenges for the development of scalable HPC applications and limits of future storage/IO technologies in the context of next generation e-science infrastructures and outline potential solutions.
Article
Full-text available
This document specifies the semantics and structure of the Job Submission Description Language (JSDL). JSDL is used to describe the requirements of computational jobs for submission to resources, particularly in Grid environments, though not restricted to the latter. The document includes the normative XML Schema for the JSDL, along with examples of JSDL documents based on this schema.
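To make the structure of a JSDL document more concrete, the following sketch builds a minimal job description with Python's standard library. It is an illustration, not a normative example from the specification: the job name, executable, and file name are hypothetical values, the helper `build_jsdl` is an assumption of this sketch, and the output is not validated against the JSDL schema.

```python
import xml.etree.ElementTree as ET

# Namespaces of JSDL 1.0 and its POSIX application extension.
JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def build_jsdl(job_name, executable, arguments, stdout_file):
    """Return a minimal JSDL job definition as an XML string."""
    ET.register_namespace("jsdl", JSDL_NS)
    ET.register_namespace("jsdl-posix", POSIX_NS)

    job_def = ET.Element(f"{{{JSDL_NS}}}JobDefinition")
    description = ET.SubElement(job_def, f"{{{JSDL_NS}}}JobDescription")

    # Human-readable identification of the job.
    ident = ET.SubElement(description, f"{{{JSDL_NS}}}JobIdentification")
    ET.SubElement(ident, f"{{{JSDL_NS}}}JobName").text = job_name

    # What to run, described with the POSIX application extension.
    app = ET.SubElement(description, f"{{{JSDL_NS}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX_NS}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX_NS}}}Executable").text = executable
    for arg in arguments:
        ET.SubElement(posix, f"{{{POSIX_NS}}}Argument").text = arg
    ET.SubElement(posix, f"{{{POSIX_NS}}}Output").text = stdout_file

    return ET.tostring(job_def, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical job: print the date in UTC and capture standard output.
    print(build_jsdl("demo-job", "/bin/date", ["-u"], "stdout.txt"))
```

A real submission would typically add resource requirements and data staging elements, which are omitted here to keep the sketch short.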
Article
Full-text available
started. The goal was to provide users of the German supercomputer centers with a seamless, secure, and intuitive access to the heterogeneous computing resources at the centers consistent with the recommendations of the German Science Council. A first prototype was developed in project UNICORE to demonstrate the concept. The current production version was created in a follow-on project UNICORE Plus which was completed in 2002. UNICORE's scope is expanded in projects like EUROGRID, GRIP and OpenMolGRID to provide support for additional application areas and functions like resource brokering. Section 2 familiarises the term Grid and illustrates its major ideas and concepts. UNICORE as a realization of these ideas is introduced in Section 3 followed by a detailed description of the benefits UNICORE offers to users and application developers. Section 5 summarizes the paper and concludes with an outlook.
Article
Full-text available
The UNICORE (UNiform Interface to COmputing REsources) software provides a Grid infrastructure together with a computing portal for engineers and scientists to access supercomputer centres from anywhere on the Internet. While UNICORE is primarily designed for the submission and control of batch jobs, it is also feasible to establish an on-line connection between an application and the UNICORE user-client. This opens up the possibility of performing on-line visualization and computational steering of applications under UNICORE control while maintaining the security provided by this system. This contribution describes the design of a steering extension to UNICORE based on the steering toolkit VISIT (VISualization Interface Toolkit). VISIT is a lightweight library that supports bidirectional data exchange between visualizations and parallel applications. As an example application, a parallel simulation of a laser-plasma interaction that can be steered by an AVS/Express application is presented.
Chapter
This chapter describes the Open Grid Services Architecture (OGSA)—the standards-based Grid computing framework. It introduces the service-oriented architecture principles that underlie OGSA and provides a detailed description of the Web services mechanisms and the Open Grid Services Infrastructure specification that together define the core interfaces and behaviors underlying OGSA. The development of OGSA represents a natural evolution of the Globus Toolkit 2.0 in which the key concepts of factory, registry, reliable and secure invocation, and so on exist in a less general and flexible form and without the benefits of a uniform interface definition language. The development of OGSA also represents a natural evolution of Web services. By integrating support for transient, stateful service instances with existing Web services technologies, OGSA extends significantly the power of the Web services framework, while requiring only minor extensions to existing technologies. A defining feature of Grids is the sharing of networks, computers, and other resources and services. This sharing introduces a need for resource and service management.
Conference Paper
Sequence analysis is one of the most fundamental tasks in molecular biology. Because of the increasing number of sequences, ever more computing power is needed. One solution is grid environments, which make use of computing centers. In this paper we present a plug-in which enables the use of BLAST software for sequence analysis within Grid environments such as UNICORE (Uniform Interface to Computing Resources) and GPE (Grid Programming Environment).
Article
The UNICORE Grid-technology provides a seamless, secure and intuitive access to distributed Grid resources. In this paper we present the recent evolution from project results to production Grids. At the beginning UNICORE was developed as a prototype software in two projects funded by the German research ministry (BMBF). Over the following years, in various European-funded projects, UNICORE evolved to a full-grown and well-tested Grid middleware system, which today is used in daily production at many supercomputing centers worldwide. Beyond this production usage, the UNICORE technology serves as a solid basis in many European and International research projects, which use existing UNICORE components to implement advanced features, high level services, and support for applications from a growing range of domains. In order to foster these ongoing developments, UNICORE is available as open source under BSD licence at SourceForge, where new releases are published on a regular basis. This paper is a review of the UNICORE achievements so far and gives a glimpse on the UNICORE roadmap.
Conference Paper
The Globus Toolkit (GT) has been developed since the late 1990s to support the development of service-oriented distributed computing applications and infrastructures. Core GT components address, within a common framework, fundamental issues relating to security, resource access, resource management, data movement, resource discovery, and so forth. These components enable a broader “Globus ecosystem” of tools and components that build on, or interoperate with, GT functionality to provide a wide range of useful application-level functions. These tools have in turn been used to develop a wide range of both “Grid” infrastructures and distributed applications. I summarize here the principal characteristics of the recent Web Services-based GT4 release, which provides significant improvements over previous releases in terms of robustness, performance, usability, documentation, standards compliance, and functionality. I also introduce the new “dev.globus” community development process, which allows a larger community to contribute to the development of Globus software.
Article
We often encounter in distributed systems the need to model, access, and manage state. This state may be, for example, data in a purchase order, service level agreements representing resource availability, or the current load on a computer. We introduce two closely related approaches to modeling and manipulating state within a Web services (WS) framework: the Open Grid Services Infrastructure (OGSI) and WS-Resource Framework (WSRF). Both approaches define conventions on the use of the Web service definition language schema that enable the modeling and management of state. OGSI introduces the idea of a stateful Web service and defines approaches for creating, naming, and managing the lifetime of instances of services; for declaring and inspecting service state data; for asynchronous notification of service state change; for representing and managing collections of service instances; and for common handling of service invocation faults. WSRF refactors and evolves OGSI to exploit new Web services standards, specifically WS-addressing, and to respond to early implementation and application experiences. WSRF retains essentially all of the functional capabilities present in OGSI, while changing some syntax (e.g., to exploit WS-addressing) and also adopting a different terminology in its presentation. In addition, WSRF partitions OGSI functionality into five distinct composable specifications. We explain the relationship between OGSI and WSRF and the related WS-notification specifications, explain the common requirements that both address, and compare and contrast the approaches taken to the realization of those requirements.
GridFTP: Protocol Extensions to FTP for the Grid
  • W Allcock
Execution Services Interfaces
  • D Snelling
  • I Foster
UNICORE - From Project Results to Production Grids. Elsevier, L. Grandinetti (Edt.), Grid Comp
  • A Streit
  • D Erwin
  • T Lippert
  • D Mallmann
  • R Menday
  • M Rambadt
  • M Riedel
  • M Romberg
  • B Schuller
  • P Wieder
Globus Toolkit 4: Software for Service-Oriented Systems
  • I Foster