All content in this area was uploaded by Anna Maria Schleimer on Nov 04, 2022
Architecture Design Options for Federated Data Spaces
Anna Maria Schleimer
Fraunhofer ISST
TU Dortmund University
anna.schleimer@isst.fraunhofer.de
Nils Jahnke
Fraunhofer ISST
nils.jahnke@isst.fraunhofer.de
Boris Otto
TU Dortmund University
Fraunhofer ISST
boris.otto@tu-dortmund.de
Abstract
The massive growth of data and the increasing
potential of data analytics in industrial production fuel
the emergence of data spaces and corresponding
platforms that realize data ecosystems and enable
data-driven sustainability applications. To leverage
their benefits of demand-driven and scalable data
integration, the stakeholders of emerging data space
initiatives must make informed decisions about their
data space support platforms (DSSPs). This study
proposes a conceptual framework that is based on
federated architectures and considers existing data
infrastructure endeavors. Based on existing literature
about data ecosystem resources and an explorative
single case study of an industrial data space with
sustainability-focused applications, we elaborate on
the key design options of data, services, and
computing infrastructures. The resulting conceptual
framework guides design decisions for DSSPs. The
framework captures not only the resources involved
but also the operational concepts of federated services
and shared services to introduce governance
mechanisms and sustainability policies.
Keywords: data spaces, federated architectures, data
space support platform, data sharing, sustainable
manufacturing
1. Introduction
The industrial sector currently faces a “data tsunami”
from huge and complex data sets, which pose a
challenge to traditional processing and database
management tools (Zhong et al., 2016, p. 572) but also
hold immense potential for data analytics (Dai et al.,
2020). These opportunities include the realization and
improvement of sustainable supply chain management
or circular economy applications that rely on
technologies such as big data analytics, simulation, or
digital twins for industrial applications, and depend on
inter-organizational shared data (Z. Chen & Huang,
2021; Mageto, 2021). With the resulting information
transparency facilitated by shared data, environmental
benefits like natural resource replenishment, as well as
carbon footprint tracking, and social responsibility
actions can be realized (Khan & Abonyi, 2022).
However, industrial data has distinct properties that
challenge data sharing: it is typically characterized by
massive volume, heterogeneous data types, real-time
existence, sensitivity to delays, and considerable value
potential (Dai et al., 2020). These characteristics create barriers
to data sharing for applications focused on industry
sustainability and resilience, including lack of data
interoperability, trust, and privacy concerns (Z. Chen
& Huang, 2021; Walden et al., 2021). The need to
solve these issues and foster data sharing by
implementing new data management, sharing, and
integration capabilities has led to the increasing
emergence of data spaces and data space support
platforms (DSSPs) (Franklin et al., 2005; Otto &
Jarke, 2019). For instance, the European Commission
(2020) proposes a data space for traceability and
innovative services to address social, environmental,
and economic issues related to batteries. Hence, data
ecosystems emerging on top of data spaces are
beneficial not only to individual companies but also to
entire economies and societies (Capiello et al., 2020).
To enable ecosystems and new, data-driven
applications, the concept of “data infrastructure as
platform” emerged as a key enabler (Castro et al., 2021,
para. 5). In contrast to private and often central
platforms, data infrastructure platforms can also be
realized as open and public solutions (Beverungen et
al., 2022). The underlying platform technologies can
be considered a form of digital infrastructure due to
their enabling role for any applications built on top of
them (Constantinides et al., 2018) and, thus, not only
enable single data spaces but also the purposeful,
organized federation of multiple data spaces. In
hierarchical systems such as a federated DSSP,
stakeholders must understand the architecture and its
dependencies to arrive at a well-defined set of
requirements from “informed deliberations among
stakeholders with shared as well as competing
interests” (Whalen et al., 2012, p. 55). Using an
architecture-driven approach, the following research
question arises:
What are the architecture design options
for federated industrial data spaces?
In response to this question, this study proposes a
framework for structured design decisions according
to the systems’ architectural decomposition (Whalen
et al., 2012). The design options outline opportunities
to realize sustainability and sovereignty-oriented
business applications and policies. The framework is
based on an explorative single case study of an
emerging large-scale data space initiative in the
industrial sector that tackles sustainability-related use
cases. Following Ridder (2017), the study is meant to
fill gaps in existing theory and therefore relies on a
research framework derived from the literature. In this
paper we outline the design options on the basis of the
resources defined in the data ecosystem metamodel
(Oliveira et al., 2018) and the different layers defined
in the federated architecture literature (Busse et al.,
2000; Heimbigner & McLeod, 1985). The initiative
represents an extreme case, since it (a) consists of
more than 100 participants from areas such as the
automotive, manufacturing, and IT sectors, including
small and medium-sized enterprises (SMEs); (b)
enables data-driven use cases to foster sustainable
manufacturing and supply chain resilience; and (c)
commits to leveraging the developments of multiple
data infrastructure initiatives at the same time.
2. Background
2.1. Data Spaces
Data spaces enable demand-driven and flexible
data integration within and across domains (Curry,
2020). In the industrial sector, the term is commonly
used to describe an alliance of organizations that
collaborate for data sharing purposes. From a technical
viewpoint, the term describes a particular data
integration concept (Franklin et al., 2005; Halevy et al.,
2006) realized via a set of enabling services that allow
for scalability and the integration of governance
mechanisms (Curry, 2020). The key characteristics of
data spaces are semantic integration and vocabularies
according to Linked Data principles, decentralized data
holding, and support for nesting and overlaps of data
(Franklin et al., 2005; Halevy et al., 2006). Data spaces
are an enabler for data ecosystems, a term that
describes (analogously to biological ecosystems) a set
of loosely coupled actors that jointly create value from
data and compete with data and service offerings
(Jacobides et al., 2018). Different resources are
involved in data ecosystems as modeled in Oliveira et
al.’s (2018) data ecosystem metamodel. Resources
contain data sets, systems infrastructure for storage
and computing, and data-based software solutions.
These solutions can include reusable assets such as
components and services but also applications to
produce, provide, or consume data by different actors.
In addition to the business applications that process or
pre-process data, a set of resources is also required to
enable data exchange and related communication
between data space participants, which is generally
realized via an additional abstraction layer. Figure 1
presents a four-layered model (Curry et al., 2019;
Curry, 2020) that illustrates the position of data
services in the technology stack. Several not-for-profit
associations have suggested key concepts and roles for
this abstraction layer to support the domain-
independent standardization of DSSPs (Nagel &
Lycklama, 2021). In addition to such a standardized
set of services, the specific characteristics of industrial
data require a particular and flexible design of services
and industrial DSSPs. First, the volume and velocity
of data flows are considerable and presuppose a highly
scalable data management and integration concept that
also considers the implications that different operating
systems such as cloud-edge combinations will bring.
Second, the data is private and highly protected, in
contrast to, e.g., information available as open data.
Third, manufacturing processes and supply chain
networks have their own hierarchies that demand easily
adjustable governance capabilities to modify the
framing conditions of a data sharing collaboration on
a case-by-case basis.
Figure 1. Framework to enable data ecosystems
(Curry et al., 2019; Curry, 2020, p. 8). Layers shown:
Communication and Sensing, Middleware, Data, and
Intelligent Applications.
2.2. Federated Architectures
DSSPs must balance the autonomy of the various
participants against the considerable demands on their
ability to communicate and negotiate with one another,
at both the technological and organizational levels.
These fundamental challenges are addressed in DSSPs’
architectural structure, which presents a federated
architecture that connects decentralized databases to a
joint data exchange group
(Heimbigner & McLeod, 1985). The key concerns of
federated architectures are autonomy and self-
organization of the involved entities while creating a
“‘game field’ with the necessary rules and
infrastructure supporting functions so that all of the
‘players’ are able to find the data they need” (Duan,
2009, p. 166). Considering the basic types of network
models, the federated approach represents a
hierarchical model in which entities are organized into
multiple layers, as shown in Figure 2. Although the
other models also offer advantages for distinct use
cases, the hierarchical model reflects common
corporate structures and allows conglomerates of
different entities to have their own policies and
processes; it also provides a hierarchical control,
discovery, and governance structure (Duan, 2009).
Production networks place the same demands and
require the same structures.
Figure 2. Basic network model types: central,
distributed, and hierarchical (Duan, 2009, p. 170)
From a conceptual perspective, federated
architectures may be distinguished into three different
layers: the global presentation layer, the federation
layer, and the local layer; the local layer also includes
a wrapper layer (Busse et al., 2000). These layers can
also be described as overlay network layer, service
provider layers and peer-to-peer overlay network layer
(G. Chen et al., 2008). A mapping of the traditional
federated information systems layers to the data space
concept is displayed in Figure 3. In order to realize a
federated architecture, Steinke and Hommel (2018,
p. 1) note that not only technologies but also the
“management needs to become federated to support
the collaboration between multiple organizations”. A
federated service thus enables organization and
execution among multiple autonomous entities and
accommodates different hierarchies of governance and
their policies. Such policies might be related to
interoperability or security. Within a DSSP, such
capabilities for governance and technology policies are
conceptually located in the federation layer. Different
alliances have been established to design and analyze
different aspects of DSSPs (Otto & Jarke,
2019). In addition, a DSSP is a form of data platform
(Kramberg & Heinzl, 2021) that can be private (Castro
et al., 2021) or public (Beverungen et al., 2022). Some
federated services are realized as shared services,
which are a common management concept for sharing
costs among a collaboration network (Borman &
Ulbrich, 2011; van Fenema et al., 2014). The key
characteristics of federated services, however, are
their distributed nature, their ability to encompass
governance mechanisms, and their hierarchical
network character, all of which allow for the
inscription of properties such as standards or policies
in a top-down manner. These characteristics mean that
a federated service can also be decentrally realized at
the autonomous entities’ location and still be governed
in a top-down way without being a shared service.
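Figure 3. Data spaces as federated systems. Elements
shown: data resources of Companies A, B, and C; several
data spaces; global data availability; and the layers
Local Foundation Layer, Federation Layer, and Global
Presentation Layer.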
3. Conceptual Framework
3.1. Research Approach
This study follows the paradigm of design-
oriented research (Hevner et al., 2004) and realizes the
benefits of conceptual modeling and of single case
studies. Conceptual models are generally abstractions
that require simplification of the real system or the part
of the real world they represent (Robinson, 2010).
Case studies consist of the analysis of real-world
phenomena (Baskerville et al., 2018; Yin, 1981) and
are characterized, among other aspects, by being
highly complex and focusing on one particular
research question. Because data spaces and their
technologies are still an emerging field and this study
has an explorative character, a single case study from
an established conceptual lens is a suitable approach
to extend the body of knowledge (Yin, 2010). First,
the key concepts and structures are derived from
literature and form a conceptual research framework
(section 3.2). The study’s conceptual framework
draws on federated architecture concepts (Busse et al.,
2000; Heimbigner & McLeod, 1985), data space
characteristics (Franklin et al., 2005; Halevy et al.,
2006), and the associated data ecosystem resources
(Oliveira et al., 2018). Second, the single-case study is
analyzed with a strong emphasis on the federation
layer and service design (section 3.3). Third, the
results are generalized (section 3.4). Subsequently, section 4
continues with the application of the conceptual
framework in a circular economy use case requiring
data sharing.
3.2. Research Framework
The conceptual framework used to systematically
analyze the case is derived from Oliveira et al.’s
(2018) data ecosystem metamodel as well as the layers
of federated architectures (Busse et al., 2000;
Heimbigner & McLeod, 1985), which were explained
in sections 2.1 and 2.2, respectively. Because the
framework focuses on the design of a DSSP, different
data characteristics, service categories, and
infrastructural options for hosting and computing are
of interest. The framework is displayed in Table 1 and
includes nine fields numbered from I to IX, with the
data ecosystem resources on the vertical dimension
(columns) and the architectural layers on the
horizontal dimension (rows). Each field represents a
resource on a certain layer; e.g., field V denotes the
services on the federation layer. As DSSPs focus on the creation
of a federation layer, this architectural layer is strongly
emphasized.
Table 1. Conceptual research framework
Dimension    Data   Service   Infrastructure
Global       I      II        III
Federation   IV     V         VI
Local        VII    VIII      IX
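The numbering in Table 1 follows a simple row-by-row scheme, which the following minimal Python sketch makes explicit. The function name `field` and the constant names are illustrative assumptions and not part of the framework itself.

```python
# Layers (rows) and resources (columns) as listed in Table 1.
LAYERS = ("Global", "Federation", "Local")
RESOURCES = ("Data", "Service", "Infrastructure")
ROMAN = ("I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX")


def field(layer: str, resource: str) -> str:
    """Return the Roman-numeral field for a layer/resource pair."""
    row = LAYERS.index(layer)
    col = RESOURCES.index(resource)
    # Fields are numbered row by row, three resources per layer.
    return ROMAN[row * len(RESOURCES) + col]


print(field("Federation", "Service"))  # V
```

An unknown layer or resource name raises a `ValueError`, which is sufficient for a sketch of this size.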
The vertical data dimension refers to analytical
data as well as operational data and metadata for the
purpose of data sharing. While analytical data is
generally characterized by its applications in machine
learning and is often the object of interest in
distributed data platforms and data markets,
operational data is involved in ongoing internal
operations and can be used in a DSSP as well
(Dehghani, 2022; Inmon et al., 2019). Industrial data
can have characteristics that lead to specific data-
space-enabling service demands, such as volume or
policies.
The vertical service dimension describes private
business applications, federated services, and shared
services. While private business applications refer to
privately owned applications for analytic purposes,
federated services refer to those that form a federation
and connect different autonomous participants.
Federated services may be shared services, as
visualized in Figure 4. They can also be realized
decentrally or via single intermediary business
partners instead of as a collaborative network. The aim
of the management concept of shared services is to
consolidate services to reduce costs. If this approach is
applied across organizational boundaries, then
organizations form a shared service network and can
collaborate to gain mutual benefits (Borman &
Ulbrich, 2011). Inter-organizational shared services
foster process and output innovation while involving
multiple organizations (van Fenema et al., 2014).
The vertical infrastructure dimension refers to the
deployment and operation of different services that
process data and thus describes design related to
storage and computing options. This dimension also
captures the quantity dimensions, which describe how
often a certain solution is instantiated.
Figure 4. Federated and shared services
3.3. Case Analysis
The following section describes the service
landscape of the industrial data space initiative under
the conceptual lens. The local level (VII, VIII, IX)
includes data-driven business applications (VIII) that
handle the productive data of individual data space
participants (VII). All decisions regarding the data,
service, and infrastructure design and operation of
these business applications belong to the participants
(IX), who consume data via the DSSP and may
provide their results (again via the DSSP) to other
participants. The business applications realized via the
DSSP are characterized by being strongly reliant on
data from multiple sources and diverse participants,
including SMEs and large industrial organizations. For
example, to enable a carbon footprint calculation
along the supply chain, a large set of data is required
that will demand different design options on the
federation layer. The DSSP provides cloud-agnostic
endpoints in the form of interfaces and system
adapters that can be adapted to any participant’s
systems and provide data and metadata, such as
provenance information or policies for further
processing.
Dimensions IV–VI refer to the federation level.
Services at this level (V) may be divided according to
their operating model into those that exist only as one
common instance provided by the alliance and those
that can exist in multiple instances in a decentralized
manner among each participant (VI). The first option
of having only one instance of a service is referred to
as a central shared service. The services include
intermediary services that only handle metadata and
self-information, such as the cataloging or logging of
data exchanges, as well as business services, which go
beyond the intermediary function and are used to
process production data (IV). Examples of the latter
are services that anonymize a vehicle identification
number to ensure data protection. The different
options are described in Figure 5. A particular benefit
of the central shared service is that it provides one
trusted instance that every participant has agreed on.
On the other hand, central services introduce single
points of failure and bottlenecks, both in terms of
technical performance and in terms of power structure.
A central service can also represent a central trusted
data source that contains information such as master
data and trust-relevant member information. Despite
these advantages, the aim in this case is to use as few
central services as possible to avoid bottlenecks and
increase scalability.
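As one hedged illustration of a business service that processes production data, the sketch below replaces a vehicle identification number with a keyed hash. The case description does not state how the anonymization is actually performed; keyed hashing (HMAC-SHA256) is assumed here purely for illustration, and strictly speaking it yields a stable pseudonym rather than fully anonymized data. The function name, the key, and the sample number are hypothetical.

```python
import hashlib
import hmac


def pseudonymize_vin(vin: str, secret_key: bytes) -> str:
    """Map a VIN to a stable pseudonym using HMAC-SHA256 (assumed technique)."""
    # Normalize so that formatting differences do not change the pseudonym.
    normalized = vin.strip().upper().encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()


# Hypothetical key and made-up VIN, for illustration only.
key = b"example-key-held-by-the-service"
token = pseudonymize_vin("WVWZZZ1JZXW000001", key)
print(len(token))  # 64 hexadecimal characters
```

Because the same key always yields the same pseudonym, records about one vehicle remain linkable across data exchanges without exposing the number itself.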
Figure 5. Design options for shared services
The services (V) can be additionally characterized
as federated services. They can be shared services, but
they do not necessarily have to be, as depicted in Figure
4. Federated services enable a distributed nature and
autonomous usage while incorporating common
agreements, which mainly consist of interoperability
and security aspects, such as policies for how to
formulate outputs. They also provide the option to
include policies regarding ecological and social
properties. The case examined in this study has
different subcategories of federated services, such as
federated integration services, federated data
services, and federated business services, as
summarized in Figure 6. The federated services are
also categorized as mandatory, optional, or
recommended. For example, the use of a specific
identity and certification service is mandatory for all
data exchange partners, as is the use of a registry for
digital twins. To incorporate these services,
participants can leverage an open-source reference
implementation. Further, it is envisioned that
commercial services with similar functionality will be
available in the future.
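The usage categories can be made concrete with a small check of whether a participant has integrated all mandatory federated services. This is a minimal sketch: the two mandatory services are those named above, whereas the service identifiers, the `catalog` entry with its recommended status, and the function name are assumptions for illustration.

```python
from enum import Enum


class Usage(Enum):
    MANDATORY = "mandatory"
    OPTIONAL = "optional"
    RECOMMENDED = "recommended"


# Identifiers are hypothetical; only the two mandatory entries come from the case.
FEDERATED_SERVICES = {
    "identity_and_certification": Usage.MANDATORY,
    "digital_twin_registry": Usage.MANDATORY,
    "catalog": Usage.RECOMMENDED,  # assumed status, for illustration
}


def missing_mandatory(deployed: set[str]) -> set[str]:
    """Return the mandatory federated services a participant has not integrated."""
    required = {name for name, usage in FEDERATED_SERVICES.items()
                if usage is Usage.MANDATORY}
    return required - deployed


print(missing_mandatory({"identity_and_certification", "catalog"}))
# {'digital_twin_registry'}
```

A participant is ready for data exchange in this sketch when the returned set is empty.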
Figure 6. Design options for federated services
When realized at the participant level, the
participants can decide to have multiple instances or
shared versions in a subgroup (VI). Due to
interoperability concerns, most shared services are
also federated services, but this is not a necessity. In
particular, atomically small and encapsulated services
at the bottom of the technology stack are not subject
to the federation agreements.
The federation layer also comprises the data
infrastructure services. The interoperability of these
services is driven by data infrastructure initiatives,
which place a common layer between all data spaces
and have the goal of creating a global data space
(Dataspace Business Alliance, 2021). Such services
have a cloud-agnostic design (VI) and follow a
predefined architecture and specifications (V), which
allows for multiple implementation options and
multiple instantiations.
The remaining global layer (I–III) describes the
resulting data availability across data spaces. Enabled
by the federation layer, the data on this global layer
consists of shared data or data processing results (I).
The service (II) and infrastructure options (III) for the
data are unrestricted and concern not only the data
space participants but also their end users, who benefit
from data or service products based on the data space.
[Figure 5 content. Shared service options shown: one instance shared by the alliance or multiple instances shared by the alliance; intermediary functions (metadata only) or business functions (metadata and data); distributed or central operation and hosting.]
[Figure 6 content. Federated service options shown: one or multiple instances shared by the alliance, multiple instances shared by autonomously defined subgroups of participants, or an instance at the participants’ side; intermediary functions (metadata only) or business functions (metadata and data); distributed or central operation and hosting.]
Table 2. Conceptual Framework
Columns: Category; Data (Type and Holder); Service (Governance, Specification, Implementation, Usage Mandatory); Infrastructure (Occurrence, Operation and Deployment)

Global

Global Dataspace
  Data and Service: Data or services as business application results, available for end users and provided by data space participants.
  Occurrence: *
  Operation and Deployment: Operation and deployment are up to the end users and participants.

Federation Layer

Central Service
  Governance, Specification, Implementation: The alliance defines, specifies, and implements the central service.
  Usage Mandatory: Defined by alliance.
  Occurrence: 0…1
  Operation and Deployment: The alliance defines how they are operated and where they are deployed.

Federated Business Service
  Data (Type and Holder): Participants’ data to enable federated processing.
  Governance: The alliance defines governance rules.
  Specification, Implementation: Considering the federation governance, multiple specifications and implementations exist.
  Usage Mandatory: Defined by alliance.
  Occurrence: *
  Operation and Deployment: The alliance defines how they are operated and where they are deployed.

Federated Intermediary Service
  Data (Type and Holder): Participants’ metadata to enable data sharing.
  Governance, Specification, Implementation: The alliance decides whether additional federated intermediary services are required and how they are designed and realized.
  Usage Mandatory: Defined by alliance.
  Occurrence: *
  Operation and Deployment: The alliance defines how they are operated and where they are deployed.

Federated Data Infrastructure Service
  Data (Type and Holder): Participants’ metadata and alliance’s self-information if required by data infrastructure.
  Governance, Specification: The governance and specifications are defined by the data infrastructure.
  Implementation: Different implementations can be used, among them a reference implementation.
  Usage Mandatory: Defined by data infrastructure.
  Occurrence: 1…*
  Operation and Deployment: The alliance defines how the services are operated and where they are deployed; the data infrastructures may also operate some of them.

Local

Business Application
  Data (Type and Holder): Participants’ data obtained via the data space system.
  Service: The complete design and governance of business application services belongs to the participant.
  Occurrence: *
  Operation and Deployment: Operation and deployment are each participant’s decision.
3.4. Resulting Conceptual Framework
The case analysis of a particular industrial data
space and its DSSP as an extreme case, based on a
domain-neutral conceptual framework that relates
architecture to resources, can be abstracted to a general
framework for industrial DSSPs whose connectivity and
governance design options make it possible to address
sustainability-relevant properties (Table 2). The
horizontal rows present the architectural layer and the
vertical columns the different resource design options.
On the vertical axis are the data, services, and
infrastructure options. For the data involved in a
DSSP, the design options exist to decide whether
productive data or only metadata will be processed.
The holder is also defined as being the responsible
actor or holder of decision rights to determine data
usage. The services involved have different design
options for the authority and design of governance,
specifications, and implementation and whether the
usage is mandatory. The infrastructural design options
address the decisions and implications about one
versus multiple instantiations of services. Different
operational and deployment options should also be
considered to enable performance in the targeted
industrial environment. When considering the
horizontal axis, the focus lays on the federation layer,
which is the key layer for balancing different
autonomy and communication purposes. At the local
level, the business application category implies
complete autonomy for participants in terms of the
design and operation of their applications.
The global level represents the presentation level
of the resulting data and service availability, drawn
from the aggregation of several resources. On the
federation layer, industrial enterprises have various
options for shared and federated services that can be
designed according to their needs. Such needs can be
manifested as interoperability, demands for policies
and data sovereignty, and the security or performance
demands that guide design decisions. One design
option is the provisioning of a central service, which
exists at most once per data space.
Federated services can occur multiple times, once, or
not at all, depending on their type. If federated services
are handling metadata only, they are referred to as
federated intermediary services. If they are handling
actual data, they are labeled as federated business
services. Data infrastructure services also represent
federated services where certain design decisions are
made by the data infrastructure initiatives.
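The classification rule and the occurrence column of Table 2 can be sketched as follows. This is an illustrative encoding and not part of the framework: the class and function names are hypothetical, and `*` is read as "zero or more", which is an assumption based on common multiplicity notation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Multiplicity:
    lower: int
    upper: Optional[int]  # None stands for an unbounded upper limit ("*")

    def allows(self, count: int) -> bool:
        return count >= self.lower and (self.upper is None or count <= self.upper)


# Occurrence values transcribed from Table 2; "*" is assumed to mean 0..*.
OCCURRENCE = {
    "central service": Multiplicity(0, 1),
    "federated business service": Multiplicity(0, None),
    "federated intermediary service": Multiplicity(0, None),
    "federated data infrastructure service": Multiplicity(1, None),
    "business application": Multiplicity(0, None),
}


def federated_category(handles_actual_data: bool) -> str:
    """Metadata-only services are intermediary; data-handling ones are business."""
    if handles_actual_data:
        return "federated business service"
    return "federated intermediary service"


def occurrence_ok(category: str, instances: int) -> bool:
    """Check an instance count against the occurrence given for the category."""
    return OCCURRENCE[category].allows(instances)


print(federated_category(handles_actual_data=False))  # federated intermediary service
print(occurrence_ok("central service", 2))  # False
```

Such a check could, for example, flag a design in which a second instance of a central service is planned.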
4. Enabling Sustainability and Resilience
In the following, the framework is applied to an
exemplary circular economy use case in pump
manufacturing. Pumps used in industrial application
scenarios (e.g., as part of a chemical plant) consist of
components provided by different suppliers. These
include the shaft, impeller, housing, bearing, and motor,
among others. At the end of a pump's life, it must be
decided whether the components can be
repaired, refurbished, reconditioned, reprocessed, or
remanufactured. As the main component, the motor is of
special interest, as it can often be easily separated from
the other parts and potentially be reused or
remanufactured for other applications. Further, the
motor contains valuable elements such as rare earths
that are being used for permanent magnets due to their
high efficiency and high energy density (Li et al.,
2019). In order to make sound reuse decisions, such as
opting for remanufacturing of the motor versus
recycling of distinct parts and materials, data along the
whole lifecycle of the product is needed. For instance,
motor curve data measured during service may
indicate the wear of the device, environmental data
gives hints about the contact of certain parts with toxic
substances that impact reusability from environmental
and safety perspectives, and demand data about parts
or materials enables the assessment of the economic
benefits of different reuse options. Assuming the pump
motor is given to a recycling service provider after its
end-of-life, information about the scenarios mentioned
above is commonly not available, as the data streams are
interrupted between different stages of the product
lifecycle across stakeholders and systems (Wang &
Wang, 2019). To share and prepare the required data
throughout the lifecycle, different data space support
services are required that consider technological
constraints as well as the trust and governance aspects
of the stakeholders involved.

Table 3. Conceptual Framework applied to Circular Economy Use Case
Columns: Category; Example and Explanation; Benefits to foster data sharing

Global Dataspace
  Example and Explanation: In sum, the information about the product lifecycle relevant for recycling decisions.
  Benefits to foster data sharing: The data availability creates information about different products during their lifecycle.

Central Service
  Example and Explanation: One commonly used frontend and a functionality that gives an overview of data transactions by analyzing metadata, and one service that analyzes the payload data to estimate the CO2 savings reached via the product reuse decisions.
  Benefits to foster data sharing: A central portal allows for a single point of contact for end users to execute data sharing via the data space activities.

Federated Business Service
  Example and Explanation: Specialized services are required to detect toxic materials during the product lifecycle and issue alerts. One service is run as a hyperscaler-based cloud solution, another version runs on a European-hosted solution in case data is not allowed to leave Europe, and a further cloud service demands extensive computing power due to distinct artificial intelligence algorithms to detect certain implications of complex materials.
  Benefits to foster data sharing: By offering distinct services that issue alerts, different analysis methods can be used, and different hosting options allow for compliance conformity and the fulfillment of computing demands.

Federated Intermediary Service
  Example and Explanation: One service is a distinct logging service for audit reasons that includes a history of data exchange partners. Further services are distinct search and query functions and a corresponding catalog that is tailored to circular economy needs. Additionally, a suitable data model is needed that fits the sustainability demands.
  Benefits to foster data sharing: To realize circular economy applications and data integration via data space principles, different interoperable metadata-processing services are required.

Federated Data Infrastructure Service
  Example and Explanation: An identity management approach is selected, and the necessary components and support systems are provided. For instance, it is defined which identity certificate providers are eligible and how identities are proved.
  Benefits to foster data sharing: A standardized approach to identity management allows for easy connection to other data spaces.

Business Application
  Example and Explanation: Raw data as the basis of the circular economy use case is collected at the participant level, such as in PLM, ERP, or MES systems. In-house data and data obtained via the data space can be processed in the participants' own applications to gain information that determines the potential use and specific constraints of components.
  Benefits to foster data sharing: Data is only shared on a need-to-know basis and remains with the participant until a data sharing agreement is reached.
In light of the circular economy use case described
above, Table 3 illustrates how certain design choices
foster data sharing to achieve higher transparency for
sustainability actions. Providing and applying generic
federated data infrastructure services enables the easy
integration of a broad range of participants and their
data into different data spaces. Such integration across
multiple data spaces fosters the sharing of data across
domains, which may be crucial for some information
chains. For example, sharing the carbon footprint of
manufacturing enterprises with banks may allow for
sustainable financing (Xu & Li, 2020). Supply chains
may also cross different jurisdictions, which requires
relying on common, fundamental agreements. Further,
disruptive scenarios require flexibility in data spaces
and participants, for example dynamic changes of supply
chains due to interruptions (such as environmental
disasters) or business interruptions due to new
business models that require different data products.
While data infrastructures enable uniformity and
standardization, the flexible design of added federated
intermediary and business services allows for
purposeful tailoring to the demands of individual data
spaces. In this way, data spaces can adjust and lower
their energy and cloud resource consumption and define
their own governance rules and machine-interpretable
information, including ecological or social fairness
information in addition to data protection and
interoperability information. This information enables
informed decisions to grant or deny access to the
whole data space or to certain resources.
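The access decision described above can be illustrated with a minimal sketch. It assumes that each participant publishes a machine-interpretable self-description and that a data space defines its own governance rules over the attributes in it. All class names, attribute names, and thresholds below are hypothetical and are not taken from the case study or from any data space specification.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class SelfDescription:
    """Hypothetical machine-interpretable description of a participant."""
    participant: str
    attributes: Dict[str, Any] = field(default_factory=dict)


@dataclass
class AccessRule:
    """One governance rule: the attribute must exist and pass the check."""
    attribute: str
    is_satisfied: Callable[[Any], bool]


def decide_access(description: SelfDescription,
                  rules: List[AccessRule]) -> Tuple[bool, List[str]]:
    """Grant access only if every rule holds; also report the failed attributes."""
    failed = [
        rule.attribute
        for rule in rules
        if rule.attribute not in description.attributes
        or not rule.is_satisfied(description.attributes[rule.attribute])
    ]
    return (not failed, failed)


# Illustrative rules only: attribute names and thresholds are assumptions.
data_space_rules = [
    AccessRule("identity_verified", lambda value: value is True),
    AccessRule("renewable_energy_share", lambda value: value >= 0.5),
]

recycler = SelfDescription(
    "recycler-a",
    {"identity_verified": True, "renewable_energy_share": 0.7},
)
vendor = SelfDescription("vendor-b", {"identity_verified": True})

print(decide_access(recycler, data_space_rules))
print(decide_access(vendor, data_space_rules))
```

In this sketch a missing attribute counts as a failed rule, so a participant that does not disclose sustainability information is denied access. A data space could equally treat such attributes as optional, which would be a governance decision rather than a technical one.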
Beyond the ecological or environmental aspects of
sustainability, the system design also allows for long-
term use, reliability, and stability, which makes the
system itself sustainable and avoids the need for
large-scale rebuilds.
5. Discussion
The design of industrial DSSPs must consider
different services as well as their processed data and
operational options simultaneously. Service categories
distinguish between (a) shared and federated services,
(b) the use of highly sensitive business data and
metadata, and (c) services that support the data space
defined by domain-neutral data infrastructure. The
different categories also follow different business
models. Consequently, the conceptual framework
displays the nature of data infrastructures and
highlights their infrastructural characteristics (Hanseth
& Monteiro, 1998). Besides defining the services for
each category, data space alliances must also decide
what is mandatory to be used and what is not. This
decision covers whole service instances but also
dedicated governance rules, specifications, or
infrastructural options. The case study examined in the
present study further distinguishes between optional
and recommended services. Notably, the demand for
some services may imply dependencies on other
services, which then become implicitly mandatory or
can cause lock-in effects. The abstraction level of the conceptual
framework (Table 2) allows for a unifying view on
DSSPs of different natures and their comparison. The
focus on operational environments allows for
comparing and composing different options. Different
operation options can be selected depending on the use
case’s specific threats and targets. For example,
Adhikari and Winslett (2019, p. 974) note that “supply
chain data and its threat model are a good match for
blockchains […] other fine-grained data from a factory
floor can be valuable for manufacturing analytics, but
is a poor match for blockchains, due to its volume
[and] velocity”. This characteristic highlights the
necessity for different design options especially for the
infrastructural and operational aspects.
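The observation that dependencies can make services implicitly mandatory can be sketched as a transitive closure over a dependency graph. The following is a minimal illustration under stated assumptions: the service names and the dependency relation are hypothetical and do not describe the services of the examined case.

```python
from typing import Dict, Set


def implicitly_mandatory(mandatory: Set[str],
                         depends_on: Dict[str, Set[str]]) -> Set[str]:
    """Return services that are not declared mandatory but are required
    transitively by at least one mandatory service."""
    required: Set[str] = set()
    stack = list(mandatory)
    while stack:
        service = stack.pop()
        for dependency in depends_on.get(service, set()):
            # Visit each dependency once, which also guards against cycles.
            if dependency not in required:
                required.add(dependency)
                stack.append(dependency)
    return required - mandatory


# Hypothetical dependency relation between data space services.
depends_on = {
    "data_catalog": {"identity_management", "vocabulary_hub"},
    "usage_control": {"identity_management"},
    "identity_management": {"certificate_authority"},
}

print(sorted(implicitly_mandatory({"data_catalog"}, depends_on)))
```

Declaring only the hypothetical `data_catalog` as mandatory pulls in three further services in this example. Making such closures explicit may help a data space alliance recognize potential lock-in effects before fixing its list of mandatory services.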
6. Conclusion, Limitations, and Outlook
This study has elaborated on the foundational
concepts of a data space support platform (DSSP) and
has proposed a conceptual framework for industrial,
federated data spaces aimed at creating information
transparency. The use of this model can ease the
design of DSSPs at an emerging development stage
and enables sustainable applications as well as design
decisions in manufacturing that are reliant on the data
shared across organizations. The following limitations
must be considered, however. First, the case
considered in this study is a single case and thus does
not allow for comparison between different cases. The
case is also a data space endeavor in the ramp-up stage
and is not yet fully operationalized. The conceptual
analysis shows only a snapshot, and the concepts and
services of the case have yet to be completely defined
and may still change. Future research opportunities
could include a detailed analysis of the remaining
properties of data ecosystem resources, namely quality,
standards, and license constraints (Oliveira et al.,
2018). Doing so would allow for further locating
production-specific standards and constraints in a
more fine-grained manner. Additionally, key
components and sustainability-specific concepts could
be added and refined as an additional governance layer.
Closely related are also the implications of centralized
or decentralized service design and operation,
including the costs or any legal implications that arise.
7. Acknowledgements
This work has been supported by the German
Federal Ministry for Economic Affairs and Climate
Action in the context of the GAIA-X4KI project (no.
19A21011E).
8. References
Adhikari, A., & Winslett, M. (2019). A hybrid architecture
for secure management of manufacturing data in
industry 4.0. In 2019 IEEE International
Conference on Pervasive Computing and
Communications Workshops (PerCom
Workshops). Symposium conducted at the
meeting of IEEE.
Baskerville, R., Baiyere, A., Gregor, S., Hevner, A., &
Rossi, M. (2018). Design Science Research
Contributions: Finding a Balance between
Artifact and Theory. Journal of the Association
for Information Systems, 19(5), 358–376.
https://doi.org/10.17705/1jais.00495
Beverungen, D., Hess, T., Köster, A., & Lehrer, C. (2022).
From private digital platforms to public data
spaces: implications for the digital
transformation. Electronic Markets. Advance
online publication.
https://doi.org/10.1007/s12525-022-00553-z
Borman, M., & Ulbrich, F. (2011). Managing
Dependencies in Inter-Organizational
Collaboration: The Case of Shared Services for
Application Hosting Collaboration in Australia.
In I. Staff (Ed.), 2011 44th Hawaii International
Conference on System Sciences (pp. 1–10). IEEE.
https://doi.org/10.1109/HICSS.2011.295
Busse, S., Kutsche, R.‑D., & Leser, U. (2000). Strategies
for the Conceptual Design of Federated
Information Systems. In EFIS. Symposium
conducted at the meeting of Citeseer.
Capiello, C., Gal, A., Jarke, M., & Rehof, J. (2020). Data
Ecosystems: Sovereign Data Exchange among
Organizations (Dagstuhl Seminar 19391).
Advance online publication.
https://doi.org/10.4230/DagRep.9.9.66.
Castro, A., Machado, J., Roggendorf, M., & Soll, H.
(2021). How to build a data architecture to drive
innovation—today and tomorrow. McKinsey
Digital.
https://www.mckinsey.de/business-functions/mckinsey-digital/our-insights/how-to-build-a-data-architecture-to-drive-innovation-today-and-tomorrow
Chen, G., Low, C. P., & Yang, Z. (2008). Coordinated
Services Provision in Peer-to-Peer Environments.
IEEE Transactions on Parallel and Distributed
Systems, 19(4), 433–446.
https://doi.org/10.1109/TPDS.2007.70745
Chen, Z., & Huang, L. (2021). Digital twins for
information-sharing in remanufacturing supply
chain: A review. Energy, 220, 119712.
https://doi.org/10.1016/j.energy.2020.119712
Constantinides, P., Henfridsson, O., & Parker, G. G.
(2018). Introduction—Platforms and
Infrastructures in the Digital Age. Information
Systems Research, 29(2), 381–400.
https://doi.org/10.1287/isre.2018.0794
Curry, E. (2020). Real-time Linked Dataspaces: Enabling
Data Ecosystems for Intelligent Systems. Springer
Nature. https://doi.org/10.1007/978-3-030-29665-0
Curry, E., Derguech, W., Hasan, S., Kouroupetroglou, C.,
& ul Hassan, U. (2019). A Real-time Linked
Dataspace for the Internet of Things: Enabling
“Pay-As-You-Go” Data Management in Smart
Environments. Future Generation Computer
Systems, 90, 405–422.
https://doi.org/10.1016/j.future.2018.07.019
Dai, H.‑N., Wang, H., Xu, G., Wan, J., & Imran, M.
(2020). Big data analytics for manufacturing
internet of things: opportunities, challenges and
enabling technologies. Enterprise Information
Systems, 14(9-10), 1279–1303.
Dataspace Business Alliance. (2021). Unleashing the
European Data Economy.
https://data-spaces-business-alliance.eu/
Dehghani, Z. (2022). Data mesh: Delivering data-driven
value at scale. O'Reilly.
Duan, N. (2009). Design Principles of a Federated Service-
oriented Architecture Model for Net-centric Data
Sharing. The Journal of Defense Modeling and
Simulation: Applications, Methodology,
Technology, 6(4), 165–176.
https://doi.org/10.1177/1548512909352790
European Commission. (2020, December 10). Green Deal:
Sustainable batteries for a circular and climate
neutral economy [Press release].
https://ec.europa.eu/commission/presscorner/detail/en/ip_20_2312
Franklin, M., Halevy, A., & Maier, D. (2005). From
databases to dataspaces. ACM Sigmod Record,
34(4), 27–33. https://doi.org/10.1145/1107499.1107502
Halevy, A., Franklin, M., & Maier, D. (2006). Principles of
dataspace systems: 2006 ACM SIGMOD
International Conference on Management of
Data.
Hanseth, O., & Monteiro, E. (1998). Understanding
information infrastructure. Unpublished
manuscript. Retrieved from
http://heim.ifi.uio.no/~oleha/Publications/bok.pdf
Heimbigner, D., & McLeod, D. (1985). A federated
architecture for information management. ACM
Transactions on Information Systems (TOIS),
3(3), 253–278.
Hevner, A. R., March, S. T., Park, J., & Ram, S. (2004).
Design Science in Information Systems Research.
MIS Quarterly: Management Information
Systems, 28(1), 75–105.
https://doi.org/10.2307/25148625
Inmon, W. H., Linstedt, D., & Levins, M. (2019). Data
architecture: A primer for the data scientist
(Second Edition). Academic Press.
Jacobides, M. G., Cennamo, C., & Gawer, A. (2018).
Towards a theory of ecosystems. Strategic
Management Journal, 39(8), 2255–2276.
https://doi.org/10.1002/smj.2904
Khan, A. A., & Abonyi, J. (2022). Information sharing in
supply chains – Interoperability in an era of
circular economy. Cleaner Logistics and Supply
Chain, 5, 100074.
https://doi.org/10.1016/j.clscn.2022.100074
Kramberg, P., & Heinzl, A. (2021).
Datenplattformökosysteme [Data Platform
Ecosystems]. HMD Praxis Der
Wirtschaftsinformatik, 58(3), 477–493.
https://doi.org/10.1365/s40702-021-00716-0
Li, Z., Kedous-Lebouc, A., Dubus, J.‑M., Garbuio, L., &
Personnaz, S. (2019). Direct reuse strategies of
rare earth permanent magnets for PM electrical
machines – an overview study. The European
Physical Journal Applied Physics, 86(2), 20901.
https://doi.org/10.1051/epjap/2019180289
Mageto, J. (2021). Big Data Analytics in Sustainable
Supply Chain Management: A Focus on
Manufacturing Supply Chains. Sustainability,
13(13), 7101. https://doi.org/10.3390/su13137101
Nagel, L., & Lycklama, D. (2021). Design Principles for
Data Spaces - Position Paper.
https://design-principles-for-data-spaces.org/
https://doi.org/10.5281/zenodo.5105744
Oliveira, M. I. S., Oliveira, L. E. R. A., Batista, M. G. R.,
& Lóscio, B. F. (2018). Towards a meta-model
for data ecosystems. In M. Janssen, S. A. Chun,
& V. Weerakkody (Eds.), Proceedings of the
19th Annual International Conference on Digital
Government Research Governance in the Data
Age - dgo '18 (pp. 1–10). ACM Press.
https://doi.org/10.1145/3209281.3209333
Otto, B., & Jarke, M. (2019). Designing a multi-sided data
platform: findings from the International Data
Spaces case. Electronic Markets, 29(4), 561–580.
https://doi.org/10.1007/s12525-019-00362-x
Ridder, H.‑G. (2017). The theory contribution of case study
research designs. Business Research, 10(2), 281–
305. https://doi.org/10.1007/s40685-017-0045-z
Robinson, S. (2010). Conceptual modelling: Who needs it.
SCS M&S Magazine, 2(7).
Steinke, M., & Hommel, W. (2018). A data model for
federated network and security management
information exchange in inter-organizational IT
service infrastructures. In NOMS 2018 - 2018
IEEE/IFIP Network Operations and Management
Symposium (pp. 1–2). IEEE.
https://doi.org/10.1109/NOMS.2018.8406162
van Fenema, P. C., Keers, B., & Zijm, H. (2014).
Interorganizational Shared Services: Creating
Value across Organizational Boundaries. In T.
Bondarouk (Ed.), Advanced Series in
Management: Vol. 13. Shared services as a new
organizational form (Vol. 13, pp. 175–217).
Emerald Group Publishing.
https://doi.org/10.1108/S1877-636120140000013009
Walden, J., Steinbrecher, A., & Marinkovic, M. (2021).
Digital Product Passports as Enabler of the
Circular Economy. Chemie Ingenieur Technik,
93(11), 1717–1727.
https://doi.org/10.1002/cite.202100121
Wang, X. V., & Wang, L. (2019). Digital twin-based
WEEE recycling, recovery and remanufacturing
in the background of Industry 4.0. International
Journal of Production Research, 57(12),
3892–3902.
https://doi.org/10.1080/00207543.2018.1497819
Whalen, M. W., Gacek, A., Cofer, D., Murugesan, A.,
Heimdahl, M. P. E., & Rayadurgam, S. (2012).
Your “what” is my “how”: Iteration and hierarchy
in system design. IEEE Software, 30(2), 54–60.
Xu, X., & Li, J. (2020). Asymmetric impacts of the policy
and development of green credit on the debt
financing cost and maturity of different types of
enterprises in China. Journal of Cleaner
Production, 264, 121574.
https://doi.org/10.1016/j.jclepro.2020.121574
Yin, R. K. (1981). The Case Study as a Serious Research
Strategy. Knowledge, 3(1), 97–114.
https://doi.org/10.1177/107554708100300106
Yin, R. K. (2010). Case study research: Design and
methods (4th ed.). Applied social research
methods series: Vol. 5. Sage.
Zhong, R. Y., Newman, S. T., Huang, G. Q., & Lan, S.
(2016). Big Data for supply chain management in
the service and manufacturing sectors:
Challenges, opportunities, and future
perspectives. Computers & Industrial
Engineering, 101, 572–591.
https://doi.org/10.1016/j.cie.2016.07.013