March 2018 · 28 Reads · 2 Citations
October 2017 · 64 Reads · 2 Citations
Journal of Physics Conference Series
In this paper we discuss the design, implementation considerations, and performance of a new Resilience Service in the dCache storage system, which is responsible for file availability and durability.
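The core job of such a resilience service can be illustrated with a simplified sketch: given a target replica count, decide which pools should receive new copies of a file. The function and pool names below are hypothetical illustrations, not dCache's actual API.

```python
# Hypothetical sketch of resilience-style replica maintenance:
# compute which pools need a new copy to reach the target replica count.
def plan_copies(replicas, target, pools):
    """Return pools that should receive new copies of a file.

    replicas -- pools currently holding a replica
    target   -- desired number of distinct-pool replicas
    pools    -- all pools available for placement
    """
    have = set(replicas)
    missing = target - len(have)
    if missing <= 0:
        return []  # already at or above target: nothing to do
    candidates = [p for p in pools if p not in have]
    return candidates[:missing]

# A file with one replica and a target of 3 needs two new copies.
print(plan_copies(["pool-a"], 3, ["pool-a", "pool-b", "pool-c", "pool-d"]))
# → ['pool-b', 'pool-c']
```

A real implementation must additionally react to pool failures and respect placement constraints (e.g. distinct hosts or racks), but the planning step has this shape.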
October 2017 · 51 Reads
Journal of Physics Conference Series
For over a decade, dCache.org has delivered robust software used at more than 80 universities and research institutes around the world, allowing these sites to provide reliable storage services for the WLCG experiments as well as many other scientific communities. The flexible architecture of dCache allows it to run in a wide variety of configurations and on many platforms, from an SoC-based all-in-one Raspberry Pi up to hundreds of nodes in a multi-petabyte installation.

Due to the lack of managed storage at the time, dCache implemented data placement, replication and data integrity directly. Today, many alternatives are available: S3, GlusterFS, CEPH and others. While such solutions position themselves as scalable storage systems, they cannot be used by many scientific communities out of the box. The absence of community-accepted authentication and authorization mechanisms, the use of product-specific protocols and the lack of a namespace are some of the reasons that prevent wide-scale adoption of these alternatives. Most of these limitations are already solved by dCache. By delegating low-level storage management to these newer systems and providing the missing layers through dCache, we obtain a solution that combines the benefits of both worlds: industry-standard storage building blocks with the access protocols and authentication required by scientific communities.

In this paper, we focus on CEPH, a popular clustered storage system that supports file, block and object interfaces. CEPH is often used in modern computing centres, for example as a backend to OpenStack services. We show prototypes of dCache running with a CEPH backend and discuss the benefits and limitations of such an approach. We also outline the roadmap for supporting 'delegated storage' within the dCache releases.
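The 'delegated storage' idea, namely a namespace layer on top of a flat object store, can be sketched in a few lines. All class and path names below are illustrative toys, not the real dCache or librados/S3 APIs.

```python
import uuid

# Toy sketch of delegated storage: a POSIX-like namespace layer
# (the role dCache plays) on top of a flat key/object store
# (the role CEPH or S3 plays). Purely illustrative names.
class FlatObjectStore:
    """Stands in for a CEPH/S3-style backend: opaque keys, no paths."""
    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]

class NamespaceLayer:
    """Maps hierarchical paths to opaque object keys in the backend."""
    def __init__(self, backend):
        self.backend = backend
        self._paths = {}

    def write(self, path, data):
        key = self._paths.setdefault(path, uuid.uuid4().hex)
        self.backend.put(key, data)

    def read(self, path):
        return self.backend.get(self._paths[path])

ns = NamespaceLayer(FlatObjectStore())
ns.write("/pnfs/example.org/data/file1", b"hello")
print(ns.read("/pnfs/example.org/data/file1"))  # b'hello'
```

The point of the split is that replication, scrubbing and hardware management stay in the backend, while authentication, protocols and the namespace stay in the layer above.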
October 2017 · 93 Reads · 2 Citations
Journal of Physics Conference Series
For over a decade, dCache has relied on the authentication and authorization infrastructure (AAI) offered by VOMS, Kerberos, Xrootd etc. Although the established infrastructure has worked well and provided sufficient security, the implementation of its procedures and the underlying software is often seen as a burden, especially by smaller communities trying to adopt existing HEP software stacks [1]. Moreover, scientists are increasingly dependent on service portals for data access [2]. In this paper, we describe how federated identity management systems can facilitate the transition from the traditional AAI infrastructure to novel solutions like OpenID Connect. We investigate the advantages offered by OpenID Connect with regard to 'delegation of authentication' and 'credential delegation for offline access'. Additionally, we demonstrate how macaroons can provide a finer-grained authorization mechanism that supports anonymized delegation.
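The macaroon property the abstract relies on is that caveats are folded into an HMAC chain, so a delegate can append restrictions but never remove them. A minimal stdlib sketch of that construction (real deployments would use a library such as pymacaroons; this is only to show the principle):

```python
import hashlib
import hmac

# Minimal macaroon-style HMAC chain. Each caveat re-keys the HMAC,
# so adding a caveat is possible without the root key, but removing
# one invalidates the signature.
def mint(root_key, identifier):
    sig = hmac.new(root_key, identifier.encode(), hashlib.sha256).digest()
    return {"id": identifier, "caveats": [], "sig": sig}

def add_caveat(m, caveat):
    # Anyone holding the macaroon can attenuate it further.
    sig = hmac.new(m["sig"], caveat.encode(), hashlib.sha256).digest()
    return {"id": m["id"], "caveats": m["caveats"] + [caveat], "sig": sig}

def verify(root_key, m, satisfied):
    # Only the minting service (holding root_key) can verify.
    sig = hmac.new(root_key, m["id"].encode(), hashlib.sha256).digest()
    for c in m["caveats"]:
        if c not in satisfied:
            return False  # a caveat is not met in this context
        sig = hmac.new(sig, c.encode(), hashlib.sha256).digest()
    return hmac.compare_digest(sig, m["sig"])

key = b"server-secret"
m = add_caveat(mint(key, "user=alice"), "path=/data/alice")
print(verify(key, m, {"path=/data/alice"}))  # True
print(verify(key, m, set()))                 # False: caveat unmet
```

Because verification needs only the root key and the caveat list, the bearer's identity never has to travel with the token, which is what enables the anonymized delegation mentioned above.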
December 2015 · 90 Reads
Journal of Physics Conference Series
The availability of cheap, easy-to-use sync-and-share cloud services has split the scientific storage world into the traditional big data management systems and the very attractive sync-and-share services. With the former, the location of data is well understood, while the latter is mostly operated in the Cloud, resulting in a rather complex legal situation. Besides legal issues, these two worlds have little overlap in user authentication and access protocols. While traditional storage technologies, popular in HEP, are based on X.509, cloud services and sync-and-share software technologies are generally based on username/password authentication or mechanisms like SAML or OpenID Connect. Similarly, the data access models offered by the two are somewhat different, with sync-and-share services often using proprietary protocols. As both approaches are very attractive, dCache.org developed a hybrid system providing the best of both worlds. To avoid reinventing the wheel, dCache.org decided to embed another open source project, OwnCloud, which offers the required modern access capabilities but does not support the managed data functionality needed for large-capacity data storage. With this hybrid system, scientists can share files and synchronize their data with laptops or mobile devices as easily as with any other cloud storage service. On top of this, the same data can be accessed via established mechanisms, like GridFTP to serve the Globus Transfer Service or the WLCG FTS3 tool, or the data can be made available to worker nodes or HPC applications via a mounted filesystem. As dCache provides a flexible authentication module, the same user can access the storage via different authentication mechanisms, e.g. X.509 and SAML. Additionally, users can specify the desired quality of service or trigger media transitions as necessary, thus tuning data access latency to the planned access profile. Such features are a natural consequence of using dCache.
We describe the design of the hybrid dCache/OwnCloud system, report on several months of operational experience running it at DESY, and elucidate the future roadmap.
December 2015 · 36 Reads
Journal of Physics Conference Series
X.509, the dominant identity system in grid computing, has proved unpopular with many user communities. More popular alternatives generally assume that the user interacts via a web browser. Such alternatives allow a user to authenticate with many services using the same credentials (user name and password), and they allow users from different organisations to form collaborations quickly and simply. However, scientists generally require that their custom analysis software has direct access to the data. Such direct access is not currently supported by the alternatives to X.509, as they require the use of a web browser. Various approaches to solving this issue are being investigated as part of the Large Scale Data Management and Analysis (LSDMA) project, a German-funded national R&D project. These involve dynamic credential translation (creating an X.509 credential) to provide backwards compatibility, in addition to direct SAML- and OpenID Connect-based authentication. We present a summary of the current state of the art and the status of the federated identity work funded by the LSDMA project, along with the future road map.
June 2014 · 64 Reads · 5 Citations
Journal of Physics Conference Series
With over ten years in production use, the dCache data storage system has evolved to match the ever-changing landscape of storage technologies, offering new solutions to both existing problems and new challenges. In this paper, we present three areas of innovation in dCache: providing efficient access to data with NFS v4.1 pNFS, adoption of CDMI and WebDAV as alternatives to SRM for managing data, and integration with alternative authentication mechanisms.
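One practical consequence of WebDAV support is that a directory listing becomes an ordinary HTTP request: a PROPFIND whose XML body (per RFC 4918) names the properties to return. A sketch of building such a body with the standard library follows; the property selection is illustrative, and the door URL a client would send this to is not shown.

```python
import xml.etree.ElementTree as ET

# Build the XML body of a WebDAV PROPFIND request (RFC 4918).
# A client would send this with method PROPFIND and header "Depth: 1"
# to list a directory on a WebDAV-capable storage endpoint.
def propfind_body(props):
    ET.register_namespace("D", "DAV:")
    root = ET.Element("{DAV:}propfind")
    prop = ET.SubElement(root, "{DAV:}prop")
    for p in props:
        ET.SubElement(prop, "{DAV:}" + p)  # e.g. displayname
    return ET.tostring(root, encoding="unicode")

body = propfind_body(["displayname", "getcontentlength"])
print(body)
```

Because the transport is plain HTTP(S), standard clients, proxies and browsers can interact with the storage without grid-specific tooling, which is the appeal relative to SRM.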
December 2012 · 76 Reads · 14 Citations
Journal of Physics Conference Series
For over a decade, dCache has been synonymous with large-capacity, fault-tolerant storage using commodity hardware that supports seamless data migration to and from tape. In this paper we provide some recent news of changes within dCache and the community surrounding it. We describe the flexible nature of dCache that allows both externally developed enhancements to dCache facilities and the adoption of new technologies. Finally, we present information about avenues the dCache team is exploring for possible future improvements in dCache.
December 2011 · 83 Reads · 1 Citation
April 2010 · 41 Reads
Journal of Physics Conference Series
The Tier-1 facility operated by the Nordic DataGrid Facility (NDGF) differs significantly from other Tier-1s in several respects: it is not located at one or a few sites but is instead distributed throughout the Nordic countries, and it is not under the governance of a single organisation but is instead built from resources under the control of a number of different national organisations. Being physically distributed makes the design and implementation of the networking infrastructure a challenge. NDGF has its own internal OPN connecting the sites participating in the distributed Tier-1. To assess the suitability of the network design and the capacity of the links, we present a model of the internal bandwidth needs of the NDGF Tier-1 and its associated Tier-2 sites. The model takes the different types of workload into account and can handle different kinds of data management strategies. It has already been used to dimension the internal network structure of NDGF. We also compare the model with real-life measurements.
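The core of any such bandwidth model is converting per-workload data volumes and time windows into a sustained rate per link. A toy version of that calculation, with purely illustrative numbers (the paper's actual model and parameters are not reproduced here):

```python
# Toy bandwidth model: each workload moves a data volume within a time
# window; the demand on a link is the sum over workloads crossing it.
def required_gbps(workloads):
    """workloads: list of (volume_in_TB, window_in_hours) pairs."""
    total_bps = 0.0
    for volume_tb, hours in workloads:
        bits = volume_tb * 8e12           # TB -> bits (1 TB = 8e12 bits)
        total_bps += bits / (hours * 3600)  # sustained bit rate
    return total_bps / 1e9                # -> Gbit/s

# e.g. 100 TB of raw data in 24 h plus 20 TB of derived output in 12 h
demand = required_gbps([(100, 24), (20, 12)])
print(round(demand, 2))  # → 12.96
```

A realistic model would add per-workload routing (which links each transfer crosses), peak-to-average ratios and headroom, but the dimensioning question reduces to sums of this form.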
... Subsequently, an SRM + HTTP-TPC solution was put in place to replace SRM + GridFTP [4] at Tier-1 tape endpoints. This process occurred in 2020/2021 and required site-issued tokens (the most widely used implementation being macaroons [16]) plus support in FTS/Gfal2 and SEs [17]. Tape operations still had to be done with SRM bringOnline [2]. ...
Reference:
An HTTP REST API for Tape-backed Storage
March 2018
... Identity harmonization builds upon provisioning by linking heterogeneous identities of SPs or federated IdPs. Thereby, it obtains an aggregated identity with harmonized attributes, which is looped back to the other IdPs and SPs. Since provisioning (and identity harmonization) are the primary use cases for SCIM, scientific contributions mostly use it the intended way. ...
October 2017
Journal of Physics Conference Series
... To address those requirements, the Resilience [8] subsystem of dCache, which is responsible for data durability, is evolving into the QoS Engine. A significant amount of Resilience's architecture needed to be refactored in order to be placed on top of the already implemented functionality. ...
October 2017
Journal of Physics Conference Series
... dCache -Several PB of dCache [130] storage are available for multiple purposes. dCache storage is not fully POSIX compliant but network file system (NFS) mounts with limited functionality are available on Fermilab interactive machines. ...
June 2014
Journal of Physics Conference Series
... We have operated a dCache [12] storage element on physical hardware since our site was commissioned in 2010. However, motivated by the goals of physical consolidation of all data onto our Ceph [13] cloud storage cluster and logical consolidation of all site services onto Kubernetes in our cloud, we are evaluating EOS [14] as an alternative solution. ...
December 2012
Journal of Physics Conference Series
... ARC [20] is a middleware suite used by high-throughput computing communities. ARC's integrations with dCache [26] and DDM (Distributed Data Management) [13] solutions are mostly used by the ATLAS [13] particle physics community at the Large Hadron Collider. dCache and DDM are distributed data management platforms providing storage and retrieval of huge amounts of data. ...
July 2008
Journal of Physics Conference Series
... The storage management system dCache at Brookhaven National Laboratory (BNL) is the disk cache for a large collection of high-energy physics (HEP) data, primarily from the A Toroidal LHC ApparatuS (ATLAS) experiment [1,2]. Storage space on dCache is much smaller than the full ATLAS data collection residing on tape archives and distributed data centers. ...
July 2008
Journal of Physics Conference Series