Date of publication xxxx 00, 0000, date of current version xxxx 00, 0000.
Digital Object Identifier 10.1109/ACCESS.2023.3307026
Dynamic, Context-aware Cross-layer
Orchestration of Containerized
Applications
RUTE C. SOFIA1(Senior Member, IEEE), DOUG DYKEMAN2, PETER URBANETZ2, AKRAM
GALAL1(Member, IEEE), DUSHYANT DAVE1
1fortiss GmbH, Research Institute of the Free State of Bavaria - associated with the Technical University of Munich, Guerickestraße 25, 80805 München, Germany
2IBM Research Europe, Zurich Research Laboratory, Switzerland
Corresponding author: Rute C. Sofia ORCID ID: 0000-0002-7455-5872 (e-mail: sofia@fortiss.org).
ABSTRACT Container orchestration handles the semi-automated management of applications across
Edge-Cloud, providing features such as autoscaling, high availability, and portability. Having been devel-
oped for Cloud-based applications, container orchestration faces challenges in the context of decentralized
Edge-Cloud environments, requiring a higher degree of adaptability in the verge of mobility, heterogeneous
networks, and constrained devices. In this context, this perspective paper aims at igniting discussion on
the aspects that a dynamic orchestration approach should integrate to support an elastic orchestration of
containerized applications. The motivation for the provided perspective focuses on proposing directions to
better support challenges faced by next-generation IoT services, such as mobility, or privacy preservation,
advocating the use of context-awareness and a cognitive, cross-layer approach to container orchestration
in order to be able to provide adequate support to next generation services. A proof-of-concept (available
open-source software) of the discussed concept has been implemented in a testbed composed of embedded
devices.
INDEX TERMS Context-awareness, IoT, Edge computing, Machine Learning, Data observability.
I. INTRODUCTION
INTERNET-based services, and in particular Internet of
Things (IoT) services, provide a way to exploit data across
different vertical domains to reach a higher degree of effi-
ciency. Up until recently, IoT data processing and storage
has been mostly based on Cloud solutions, which implied
transmitting all the collected data from IoT devices to the
Cloud. However, the increasing amount of generated data, and the frequent exchange of such data with the Cloud, brought new challenges, such as delays in data processing and increases in energy consumption and costs [1]. Edge computing [2] is a
paradigm that can assist in overcoming some IoT challenges,
by pushing the computation and data processing "closer" to
the end-user, or to field-level devices. This decentralization
of computation and networking functions across the so-
called Edge-Cloud continuum [3] envisions the distribution
of computation, data, storage, and application logic often
across multiple operational regions controlled by different
service operators. The spread of different functions across IoT device-Edge-Cloud is today made possible via softwarization, and via software-based as well as hardware-based virtualization solutions that support the easy setup and deployment of applications based on micro-service architectures [4].
The most popular software-based virtualization approaches
considered in the context of the Cloud-Edge continuum are
Virtual Machines (VM), managed by Virtual Machine Monitors (VMM), also known as hypervisors, and container technologies, such as Docker [5]–[7]. Container technologies and hypervisors provide the basis to run applications
efficiently, by supporting isolation of applications within the
system (software and hardware). Key differences between these technologies relate to the level of isolation provided.
When considering a container solution such as Docker, the
application is isolated via containerized images, but still
shares the same kernel with different containers on the same
host. Hence, processes running inside containers that com-
pose an application can be accessible to the host system,
assuming privileges for such exist. VM approaches make
anything running inside a VM independent from the host
operating system. At start time, a VM boots a new dedicated
This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2023.3307026
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
Sofia et al.: Dynamic Container Orchestration
kernel slice for a VM environment, and therefore, activates
required operating system processes, which a container-
based approach does not need to have. Performance issues,
and comparison between different approaches are available
in related literature [8]. The focus of this paper concerns
container-based technologies, and in particular, solutions that
manage container technologies, known as container orches-
trators [9].
Due to the flexibility introduced by container-based ap-
proaches, today IoT services that have been traditionally
deployed on the Cloud, e.g., an IoT data analytics service,
can now be easily deployed across Edge-Cloud. Container-
based approaches for IoT enable the exploration of new
computational models and facilitate the deployment of new
business models derived from the expanded data processing
capabilities from the Edge to the Cloud [10]. However, the
generalized use of container technologies to support a fast
and flexible deployment of IoT services across Edge-Cloud
is only feasible if there is an adequate control plane, capable
of managing the setup and run-time of containerized applica-
tions. Container orchestrators provide a way to reduce human
intervention and to increase efficiency in the overall setup and
lifetime management of IoT services and applications across
Edge-Cloud [11]–[13]. Specifically, container orchestration
pertains to the handling and scheduling of the workload
(microservice, component of an application plus its state)
of individual containers for applications that are based on
micro-service architectures. Some of the tasks handled by
orchestrators include: configuration and scheduling as well
as provisioning of containers; checking availability of con-
tainers; scaling the system to balance application workloads
across the overall infrastructure; allocating resources to the
different containers, and monitoring their health; securing
the communication exchange between containers. The most
popular examples of container orchestrators are Kubernetes (K8s) 1 and Docker Swarm (DS) 2 [14].
The current generation of container orchestrators ad-
dresses the scheduling and distribution of workload across
Edge-Cloud in a semi-static fashion, relying on a replication
approach. By replication, it is meant that the orchestrator handles multiple replicas of the same containerized application, or of its containerized micro-services, deciding to activate or to stop specific replicas based on a pre-configured set of rules. This approach is semi-static, requiring support from a human operator for each deployed and active application.
However, next generation IoT and Internet services rely on
multiple applications across Edge-Cloud. With the integra-
tion of computation on the Edge and in particular on the so-
called far Edge/deep Edge [15], applications and their micro-services will be running across mobile, often constrained
interconnected nodes. Hence, a high degree of variability in
1https://kubernetes.io/
2https://dockerswarm.rocks/
terms of computational and networking resources has to be
supported, and next-generation container orchestrators have to be able to account for such behaviour.
Accordingly, this work is motivated by the belief that, for container orchestrators to achieve the higher degree of elasticity required in the face of challenges such as mobility, intermittent connectivity, and the deployment of applications across constrained devices, it is important to devise new features based upon context-awareness and upon learning and adaptation capabilities.
Context-awareness in this paper refers to the capability of a system to consider knowledge about its environment in order to perform specific actions [16], [17]. In this context, a system can be a computational system, a cyber-physical system, or a set of cyber-physical systems. Usually, contextual information falls into a wide range of categories, such as computing context (e.g., available processors, memory, nearby resources); user context (e.g., location, user profiles, nearby users); environmental context (e.g., lighting, temperature, etc.).
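As an illustration, such categories can be captured in a simple structured context record. The following Python sketch is purely didactic; all class and field names are our own illustrative assumptions, not part of any orchestrator API:

```python
from dataclasses import dataclass, field

@dataclass
class ComputingContext:
    cpu_cores_free: int                 # available processors
    memory_free_mb: int                 # available memory
    nearby_resources: list = field(default_factory=list)

@dataclass
class UserContext:
    location: str                       # e.g., an Edge zone identifier
    profile: str = "default"
    nearby_users: int = 0

@dataclass
class EnvironmentalContext:
    lighting_lux: float = 0.0
    temperature_c: float = 20.0

@dataclass
class NodeContext:
    """Aggregated contextual information sensed for one node."""
    computing: ComputingContext
    user: UserContext
    environment: EnvironmentalContext

# Example: context sensed on a constrained Edge node.
ctx = NodeContext(
    computing=ComputingContext(cpu_cores_free=2, memory_free_mb=512),
    user=UserContext(location="edge-zone-a"),
    environment=EnvironmentalContext(temperature_c=31.5),
)
```

A record of this kind would be the input that a context-aware orchestrator consumes when ranking candidate nodes.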
In the context of the perspective provided in this paper, context-awareness concerns functional and non-functional requirements derived from the application and system, from the network, from the user, and from the data. In other words, container orchestration mechanisms need to take into consideration a jointly devised cross-layer approach to provide adequate support to next-generation IoT and Internet services.
To achieve such a dynamic behaviour, this perspective
paper identifies and debates different aspects that need to
be addressed, and proposes a high-level functional design
for a context-aware orchestration, focusing on the following
challenges:
• What kind of context-awareness, and which parameters, are relevant to integrate into an orchestrator?
• How can context be modelled in a way that makes it applicable to the overall lifetime management of Edge-Cloud applications?
• What is the role of Machine Learning (ML) in the overall orchestration process? Which type of learning pattern is required?
• Is current replication enough, or are there use cases where it would be beneficial to fully hand over (offload) application workload and its state from one location to another?
In order to contribute to answering the aforementioned
research questions, this paper provides the following contri-
butions:
• Provides a novel perspective, promoting a debate on current container orchestration approaches and proposing steps towards a more dynamic orchestration behaviour, better suited to next-generation IoT services across the Edge-Cloud continuum.
• Proposes the use of context-awareness, detailing specific indicators that can be considered and the process to integrate context-awareness into reference container orchestrators.
• Proposes an architecture for dynamic container orchestration based on ML and context-awareness, integrating parameters that relate to application requirements, data requirements, system requirements, network requirements, and user behaviour.
• Describes a first proof of concept (Movek8s) deployed in a testbed based on embedded devices, to assist readers and developers in understanding the overall proposed concept.
The rest of the paper is organized as follows. Section II
describes work related to ours, explaining our key contribu-
tions. Section III covers background on what is container or-
chestration; which architectural solutions are available, their
advantages and gaps. Section IV provides an overview on
scenarios across different vertical domains, where container
orchestration requires a more dynamic behaviour than the
one available today. Section V addresses the integration of
ML on container orchestrators; approaches as of today; why
further decentralisation in the training and learning process
is required, and which advantages such decentralisation may
bring. Section VI debates on data observability and its role
in achieving a dynamic and robust container orchestration
across Edge-Cloud. Section VII proposes a functional ar-
chitecture for dynamic orchestration, explaining the role of
context-awareness; ML-based orchestration; and how such
framework may interact with an orchestrator such as Kuber-
netes. Section VIII describes a proof-of-concept, Movek8s 3,
implemented in a testbed composed of embedded devices.
This proof-of-concept helped us in understanding the impli-
cations of integrating a simple form of context-awareness
(location) into a container orchestrator (Kubernetes), and
is described to assist the reader in understanding how the
proposed framework may be instantiated, and the purpose of
doing so. Section IX concludes the paper and proposes a few
directions for future work.
II. RELATED WORK
Current orchestrators such as K8s and DS offer fundamental
features such as (i) resource control, (ii) service scheduling,
(iii) load-balancing, (iv) auto-scaling, and (v) high service
availability. However, they have limitations, such as a lack of elasticity to handle more variable service behaviour, or support for a higher degree of automation, i.e., zero configuration [25]. This section summarizes the analysed related work, highlighting the expected contributions of our proposed concept.
To assist the reader, a summary of the analysed approaches
including advantages and disadvantages is provided in Ta-
ble 1.
A first category of related literature considers a more adap-
tive behaviour, based on improvements to the orchestration
3https://git.fortiss.org/iiot_external/movek8s
scheduling. For instance, Bulej et al. propose a self-adapting
Kubernetes (K8s) scheduler aiming at a better support of
time-sensitive applications [18]. Their approach relies on
continuous probing to assess current performance, and to
allow K8s to detect and to react to failures, in order to
satisfy bounded latency requirements. This brings in some
degree of adaptation, but has the disadvantage of requiring
constant probing by the system. Rossi et al. have proposed
a scheduling approach that takes into consideration geo-
location [19]. Their approach considers the application of Reinforcement Learning (RL) to dynamically control the number of application replicas across different locations, and addresses
the distribution based on an optimization problem that takes
into consideration network-aware heuristics, e.g., path or
hop delay. Zhang et al. outline a predictive container auto-
scaling algorithm (ASARSA) [20]. ASARSA combines Au-
toregressive Integrated Moving Average (ARIMA) and Artifi-
cial Neural Network models to schedule containers in a timely manner and to improve the accuracy of scheduling decisions. Although
ARIMA scales well to large-scale datasets, it is limited to
linear models [26]. Similarly, the application of ML to the scheduling process for autoscaling has also been described by Toka et al., who provide a taxonomy for the application of ML in container orchestration [26]. In this category of related work, there
is a specific focus on improving, via self-adaptation, an existing feature of K8s, often related to the support of time-sensitive applications, for instance auto-scaling or load-balancing. This adaptation is often based on heuristics that
take into consideration functional application or networking
requirements, such as latency.
Another category of related work concerns the application
of context-awareness to orchestrators, so that the overall
orchestration process can be adapted to existing conditions
and the resulting application deployment becomes smoother.
A debate on the use of context-awareness has been provided
by Ogbuachi et al. in the context of Edge-based applications
across a 5G infrastructure [21]. The authors defined context
as cluster and network data that should be integrated into
the K8s scheduler, proposing a measure of usability of a
node. Specifically, their algorithm, which has been shown
to improve the behaviour in comparison to K8s, relies on
collected node (e.g., CPU, memory) and network data (e.g.,
network delay) and current workload, as well as requirements
of the applications, to propose an adaptation while preserving
fairness in terms of workload distribution. Kaur et al. propose
a multi-objective scheduling optimization based on integer
linear programming, KEIDS, which aims at minimizing the
energy usage for Industrial IoT (IIoT) environments [22].
Context is defined in their work as node energy usage, carbon footprint emissions, and also performance interference. Their algorithm shows relevant performance improvements in terms of energy consumption efficiency in comparison to existing K8s schedulers, based on real-time Google
traces. Energy and latency, along with specific node usage
indicators such as CPU and memory, are the most common
TABLE 1. Advantages and disadvantages of analysed related work and of our proposed concept.

• Self-adapting K8s scheduler for time-sensitive applications [18]. Pros: supports bounded latency. Cons: requires continuous active probing and supports only latency aspects, ignoring other relevant requirements from the application, data flow, or network.
• K8s scheduler with knowledge of geo-location and network-awareness in the form of path and hop delay [19]. Pros: dynamically controls the number of replicas across different locations, taking hop count and path delay into consideration. Cons: provides a very basic form of network-awareness and does not integrate other requirements.
• ASARSA, predictive auto-scaler algorithm [20]. Pros: improves scheduling accuracy via NN models. Cons: scales well if datasets are dense, but is limited to linear models.
• Network-aware approach [21]. Pros: provides more flexible scheduling based on a measure of node usability that considers both node usage and network usage aspects. Cons: considers autoscaling aspects only, and does not support intermittent connectivity.
• KEIDS [22]. Pros: supports energy awareness. Cons: does not consider other relevant context indicators that can be sensed across the supported system, going beyond network and application requirements [23].
• Advocacy of a higher degree of K8s decentralization in terms of application workload setup and runtime, to support heterogeneous Edge-Cloud environments [24]. Pros: debates the need for support of decentralization in the setup and runtime of applications. Cons: focuses on network and application requirements for a single vertical domain, Smart Cities.
• Our concept. Pros: supports Edge-Cloud considering application-awareness, system-awareness, network-awareness, and data-awareness, thus creating more elastic support for heterogeneous and decentralized Edge-Cloud environments. Cons: requires an integrated approach to data-compute-network orchestration, which brings a complexity trade-off that needs to be analysed for single-cluster and federated-cluster environments.
context indicators considered in elastic orchestration. How-
ever, there are other relevant context indicators that can be
sensed across the supported system, going beyond network
and application requirements [23].
In this work, we aim at explaining why it is relevant
to consider other types of parameters to define context, in
particular external parameters to a system, and how these
parameters can be modelled and applied in terms of container
orchestration. It is our opinion that orchestration based on multiple parameters (metadata) collected across different OSI layers can achieve more elasticity and robustness, and is expected to increase the level of efficiency and fairness of a system in terms of application workload scheduling decisions.
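To make this idea concrete, a cross-layer ranking could, for instance, combine normalized indicators from different layers into a single per-node score. The indicator names and weights in the following sketch are purely illustrative assumptions, not values proposed by the paper:

```python
# Hypothetical cross-layer node scoring: each indicator is normalized to
# [0, 1] (1 = best) and combined with per-layer weights. Indicator names
# and weights are illustrative only.
WEIGHTS = {
    "app_latency_ok": 0.3,   # application-requirement satisfaction
    "cpu_free": 0.25,        # system awareness
    "link_quality": 0.25,    # network awareness
    "data_freshness": 0.2,   # data awareness
}

def cross_layer_score(indicators: dict) -> float:
    """Weighted sum of normalized cross-layer indicators for one node."""
    return sum(WEIGHTS[k] * indicators.get(k, 0.0) for k in WEIGHTS)

def rank_nodes(nodes: dict) -> list:
    """Return node names ordered from best to worst score."""
    return sorted(nodes, key=lambda n: cross_layer_score(nodes[n]),
                  reverse=True)

nodes = {
    "edge-1":  {"app_latency_ok": 1.0, "cpu_free": 0.2,
                "link_quality": 0.9, "data_freshness": 0.8},
    "cloud-1": {"app_latency_ok": 0.4, "cpu_free": 0.9,
                "link_quality": 0.6, "data_freshness": 0.5},
}
best = rank_nodes(nodes)[0]  # the node an elastic scheduler would prefer
```

In this toy example, the latency-constrained Edge node outranks the resource-rich Cloud node because application and data indicators outweigh raw compute availability.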
The proposed approach for dynamic container orchestration, to be debated in the next sections, aims at supporting a heterogeneous, multi-provider Edge-Cloud continuum by considering requirements derived from the application and application-based parameters; system awareness (e.g., information about the computational status of nodes); network awareness (e.g., information about specific available paths or links); and data awareness (e.g., information about the status of data, such as data freshness). These aspects shall be further debated in section VII, where our perspective on the building blocks of a dynamic container orchestration framework is presented.
The need to allow K8s to support a higher degree of
decentralization in terms of application workload setup and
run-time, in particular considering more variable environ-
ments across Edge-Cloud, has been the subject of a literature
review in the context of Smart Cities [24]. The authors debate
on the need to consider custom parameters (custom context
indicators) based on networking or application requirements,
to allow K8s to support Edge environments. However, it is
relevant to go beyond a specific domain, and to address such
support in a global way, not necessarily tied to a specific do-
main, e.g., Manufacturing, Smart Cities, as shall be explained
in section IV.
III. CONTAINER ORCHESTRATION BACKGROUND
A. TERMINOLOGY
This subsection introduces terminology that is used through-
out the paper. An application is defined as being based on
a micro-service architecture, where each component (micro-
service) can be run independently based on container tech-
nology, such as Docker; such a component is named a containerized micro-service. A containerized micro-
service is therefore composed of the binary system, work-
load, data, and state, i.e., a set of global variables defined in
the micro-service and required at run-time. A containerized
application therefore consists of one or several containerized
micro-services which are interconnected via specific inter-
facing policies. The different micro-services may run on the
same device, in a decentralized way (independently, e.g., a
broker and an application monitor), or in a distributed way,
across different devices or virtual machines. To scale the
application, it is feasible to consider transparent replication
or offloading, also known as relocation.
Replication implies that an application and its micro-
services are copied across different locations, i.e., there are
multiple replicas of the application in a system, some active,
and some inactive. With replication, the decision to activate
or to stop an application is performed based on specific
configuration and a set of policies. Replication requires a way
to synchronize state across an "old" and a "new" environment
where the application is deployed, and implies that even
inactive applications consume resources (at least storage).
Offloading implies moving an application and its micro-services across different environments. Therefore, with offloading, the application workload and its state are deleted in the "old" environment, not simply made inactive; a copy of the workload and of the state, followed by eventual adaptation of the state, is performed in the "new" environment.
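The difference in resource semantics between the two mechanisms can be sketched with a toy in-memory model (the environment and service names are illustrative, and this is not the behaviour of any concrete orchestrator):

```python
# Toy model contrasting replication and offloading of a containerized
# micro-service across environments.
def replicate(envs, service, src, dst):
    """Copy service to dst and deactivate it at src: the inactive
    replica remains at src, still consuming (at least) storage."""
    envs[dst][service] = {"state": dict(envs[src][service]["state"]),
                          "active": True}
    envs[src][service]["active"] = False  # replica kept, not deleted

def offload(envs, service, src, dst):
    """Move service to dst: workload and state are deleted at src."""
    envs[dst][service] = {"state": dict(envs[src][service]["state"]),
                          "active": True}
    del envs[src][service]  # nothing of the service remains at src

envs = {
    "edge-old": {"broker": {"state": {"seq": 42}, "active": True}},
    "edge-new": {},
}
offload(envs, "broker", "edge-old", "edge-new")
```

After the `offload` call, the "old" environment holds no trace of the service, whereas `replicate` would have left an inactive copy behind.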
Applications supported in this process can be stateless
or stateful. Stateless applications require no data storage to
work. An example is a Web search. Stateful applications keep state on clusters, and require that state (data, status of the application) be kept and eventually retrieved.
Edge, Cloud definitions, including the notions of far
Edge/near Edge follow the line of thought being driven
in the European initiative Next Generation IoT (NGIoT) 4,
and in particular, the vision for smart, decentralised Edge-
Cloud environments for IoT applications [15]. Moreover, as
shall be further explained in the next section, there are a
few definitions used throughout the paper that relate with
orchestration.
Container follows the K8s definition, where it is a package
with the overall settings (workload, state, data) to allow
execution of an application in an independent way.
A Pod also follows the K8s definition, being a logical wrapper entity for containers to be executed on a K8s cluster. This logical wrapper "holds" a group of one or more containers with shared storage and network resources, as well as a common namespace, providing a definition to run the containers.
A cluster corresponds to the logical environment where Pods run, in a way that has been orchestrated by a human operator.
Hence, a container runs logically in a Pod. A Pod may
hold more than one container. A cluster can hold multiple
Pods (not necessarily related); Pods are grouped via logical
boundaries, their namespace. Hence, a Pod is the unit of
replication in a cluster.
The applications across Edge-Cloud are therefore orches-
trated via multiple clusters, where an Edge environment, or
an Edge-Cloud environment may be inside a single cluster
(e.g., if under the operation of a same service provider) or of
multiple clusters (e.g., across multi-domain environments).
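The container/Pod/cluster relationship described above can be summarized in a minimal didactic sketch (a toy model, not the K8s API):

```python
# Didactic model of the K8s resource hierarchy: a cluster holds Pods,
# Pods are grouped by namespace, and a Pod may hold several containers.
class Pod:
    def __init__(self, name, namespace, containers):
        self.name = name
        self.namespace = namespace          # logical boundary grouping Pods
        self.containers = list(containers)  # one or more containers

class Cluster:
    def __init__(self):
        self.pods = []

    def add(self, pod):
        self.pods.append(pod)

    def replicate(self, pod, suffix="-replica"):
        """The Pod, not the individual container, is the unit of
        replication in a cluster."""
        copy = Pod(pod.name + suffix, pod.namespace, pod.containers)
        self.add(copy)
        return copy

cluster = Cluster()
web = Pod("web", namespace="shop", containers=["nginx", "sidecar-logger"])
cluster.add(web)
cluster.replicate(web)  # both containers are replicated together
```

Note how replicating the Pod necessarily carries all of its containers along, which is exactly why the Pod is the replication unit.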
B. CONTAINER TECHNOLOGY
Container technology such as Docker 5 provides a virtualization solution to isolate applications together with their
4https://www.ngiot.eu/
5https://www.docker.com/
state, and eventually data, and thus provides the means to run
applications in a way that is independent of the underlying
Operating System (OS) and hardware. Docker popularized
the container pattern and has been an important player in the
development of the underlying technology, but the container
ecosystem is much broader than just Docker.
The management of containerized applications is performed via container orchestrators, as explained before. Container orchestrators assist in configuring and managing the overall application setup and life cycle. As also explained, the two most relevant container orchestrators existing today are K8s and DS, which are further described in the next subsections. From these, several other variants have been derived.
C. KUBERNETES (K8S)
K8s has been affirming itself as the de-facto container or-
chestrator. As an open-source solution, it provides support
for automating the deployment of applications, also support-
ing an adequate scaling and overall run-time management.
Originally developed for Cloud-based services, K8s requires
some adaptation for Edge-based services.
The overall management approach of K8s, whose architecture is represented in Figure 1, relies on the concept of a Pod, i.e., an abstraction element to manage one or more containers with shared resources and network configuration: a collection of containers sharing the same IP and port space. Pods are assigned to worker nodes, where a local daemon (Kubelet) manages the worker node's life cycle. Kubelet is
also the entry point to the K8s control plane. The control
plane is therefore composed of Pods that reside on main
nodes and implement etcd, the scheduler, the controller
manager and the API server.
etcd is an internal key-value database, based on the Raft consensus protocol [27], which represents the desired state of a cluster. It stores all the information about the Pods: in
which node they should run, number of instances, etc.
The controller is the control-plane component (control loop) that runs processes which continuously check the state of the cluster and compare it to the desired state in etcd, making or requesting changes when required.
The controller usually performs re-scheduling based on
actions provided by the API server, but it can also execute
the action itself. For example, a controller can scale nodes in
a cluster. When the desired state is updated, the controllers
will detect the mismatch and try to bring the cluster state to
the desired one. K8s has some built-in controllers, but new
controllers can easily be added to a cluster, for example, by
running them as a Pod, or even outside the cluster.
The desired state of the cluster is defined by the user via
the API server. This component receives the commands of
the user and stores the desired state in etcd.
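The controller behaviour described above, comparing the observed cluster state with the desired state and issuing corrective actions, can be illustrated with a simplified reconciliation loop. This is a sketch only: real controllers interact with the API server and etcd, which are mocked here as plain dictionaries:

```python
# Simplified reconciliation: desired vs. observed replica counts per Pod.
def reconcile(desired: dict, observed: dict) -> list:
    """Return the corrective actions a controller would request in order
    to bring the observed state to the desired one."""
    actions = []
    for pod, want in desired.items():
        have = observed.get(pod, 0)
        if have < want:
            actions.append(("start", pod, want - have))
        elif have > want:
            actions.append(("stop", pod, have - want))
    return actions

desired = {"web": 3, "broker": 1}   # desired state (as if read from etcd)
observed = {"web": 1, "broker": 2}  # state observed in the cluster
actions = reconcile(desired, observed)
# actions now lists the start/stop requests closing the state mismatch
```

A real controller would run this comparison continuously and submit the resulting requests through the API server rather than returning them.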
The scheduler (kube-scheduler) handles the matching of Pods (new or unscheduled) to nodes, so that Kubelet (on the worker nodes) can run the Pods. In a first phase
FIGURE 1. High-level perspective of the K8s architecture.
(filtering), the scheduler checks which nodes can meet the
scheduling requirements. These nodes are named feasible
nodes. In a second phase (scoring), the scheduler ranks the feasible nodes for the "best" Pod deployment, by calculating scheduling priorities, also defined in the desired state. Among the filters and priorities, the scheduler can take into account the CPU and RAM usage of the Pods and/or the nodes.
Since the scheduling algorithm can be very simple, and since scheduling is an optimization problem, there are ways to extend the scheduler so that it handles Pod placement with finer-grained detail. The three main ways to extend the scheduler are:
• Adding custom filters and priorities to the scheduler and recompiling it from source code.
• Implementing a completely new scheduler that runs instead of, or in parallel with, the default scheduler.
• Implementing a scheduler extender, which provides callbacks that the default scheduler calls at the end of each phase of the decision, as proposed by Santos et al. for the case of latency and bandwidth [28].
Once Pods are assigned to nodes, the kubelet handles the
execution of the Pods. K8s does not actually offload application
workload from one node to another; instead, it relies on
replication: the K8s scheduler sends commands to the kubelet
on the selected nodes to start the containers, and to the kubelet
on the old nodes to stop them. The containerized application
state changes, however, since the container in the new
location is not the same as the original one. Hence, keeping
the state adequately synchronized across new and old
containers requires extra configuration by the cluster
administrator.
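This replication-based "move" can be sketched as follows; `TinyOrchestrator` is a hypothetical in-memory stand-in for a control plane, not the K8s API:

```python
import itertools
import time

class TinyOrchestrator:
    """Hypothetical in-memory stand-in for a cluster control plane."""
    def __init__(self):
        self.running = {}                       # container id -> node name
        self._ids = itertools.count(1)

    def start(self, image, node):
        cid = f"{image}-{next(self._ids)}"
        self.running[cid] = node
        return cid

    def is_ready(self, cid):
        return cid in self.running              # assume instantly ready in this sketch

    def stop(self, cid):
        self.running.pop(cid, None)

    def container_on(self, node):
        return next(c for c, n in self.running.items() if n == node)

def move_by_replication(orch, image, old_node, new_node):
    """'Moving' a workload as replication-based orchestrators do: start a fresh
    replica on the new node, wait for readiness, then stop the old container.
    The in-memory state of the old container is NOT carried over."""
    new_id = orch.start(image, new_node)        # new container, empty state
    while not orch.is_ready(new_id):
        time.sleep(0.1)                         # real orchestrators poll health probes
    orch.stop(orch.container_on(old_node))      # stop the old copy only afterwards
    return new_id

orch = TinyOrchestrator()
orch.start("sensor-app", "node-a")
moved = move_by_replication(orch, "sensor-app", "node-a", "node-b")
print(orch.running)  # {'sensor-app-2': 'node-b'}
```

The sketch makes the state problem visible: `sensor-app-2` starts empty, so any state accumulated by `sensor-app-1` must be synchronized by mechanisms external to the orchestrator.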
D. DOCKER SWARM
DS is the native Docker orchestration tool, currently known
as Docker Swarm Mode. While K8s was first designed to
manage a single cluster, DS has been devised to support the
management of nodes across multiple clusters.
Similarly to K8s, the DS architecture consists of two types
of nodes: manager and worker nodes. Manager nodes run the
control plane, which maintains the desired cluster state and
assigns tasks (containers) to the worker nodes, which execute
the given tasks. On each worker node, an agent runs and
reports the node's internal state to the manager nodes, as
illustrated in Figure 2.
The control plane integrates four components:
Orchestrator: evaluates the state of the cluster by com-
paring it with the desired one, and creates the tasks for
each service description.
Allocator: enables the assignment of tasks to worker
nodes by allocating their corresponding IP addresses.
Scheduler: receives the created tasks and checks the
available nodes and their resources to decide where the
tasks will run.
Dispatcher: connects to each worker node and sends
task assignments to it. Each worker node periodically
reports its health status to the Dispatcher.
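The Orchestrator's desired-versus-actual comparison can be illustrated with a minimal reconcile sketch (our own simplification, not SwarmKit code):

```python
def reconcile(desired, actual):
    """Compare desired replica counts per service with the tasks actually
    running, and emit the start/stop actions a control loop would schedule."""
    actions = []
    for service, want in desired.items():
        have = actual.get(service, 0)
        if have < want:
            actions += [("start", service)] * (want - have)
        elif have > want:
            actions += [("stop", service)] * (have - want)
    for service, have in actual.items():
        if service not in desired:              # service removed from the spec
            actions += [("stop", service)] * have
    return actions

print(reconcile({"web": 3, "db": 1}, {"web": 1, "cache": 2}))
# [('start', 'web'), ('start', 'web'), ('start', 'db'), ('stop', 'cache'), ('stop', 'cache')]
```

In the real control plane, the emitted tasks would then flow through the Allocator, Scheduler, and Dispatcher described above.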
FIGURE 2. High-level perspective of the Docker Swarm architecture.
A manager node can also be a worker node, and the ro-
bustness of the swarm can be improved by having more
than one manager node. This robustness is widely known as
high availability. The capability of having multiple manager
nodes is a feature embedded in Docker Swarm, which uses a
Raft implementation to maintain a distributed and consistent
internal state of the entire swarm. This internal cluster state
is stored in a Raft log on the manager nodes.
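The Raft majority requirement behind this high availability can be made concrete: with n managers, a quorum of floor(n/2)+1 must remain reachable for the swarm to stay writable (illustrative sketch):

```python
def raft_quorum(managers: int) -> int:
    """Majority quorum for a Raft-backed control plane: the cluster state
    can only be updated while a majority of manager nodes is reachable."""
    return managers // 2 + 1

for n in (1, 3, 5, 7):
    print(f"{n} managers: quorum {raft_quorum(n)}, tolerates {n - raft_quorum(n)} failures")
# 1 managers: quorum 1, tolerates 0 failures
# 3 managers: quorum 2, tolerates 1 failures
# 5 managers: quorum 3, tolerates 2 failures
# 7 managers: quorum 4, tolerates 3 failures
```

This is why odd numbers of managers are preferred: adding a fourth manager to a three-manager swarm raises the quorum without improving fault tolerance.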
E. COMPARISON OF K8S AND DS
K8s and DS offer common capabilities, such as a scheduler
for allocating containers to nodes, high availability, and state
and health checks of containers, among others. Both solutions
use declarative languages, meaning that the human operator
defines the desired state of the clusters via specific tools, with
some degree of flexibility and a high level of interoperability.
Due to its integration into Docker, DS provides an easy way
of working with its API, in contrast to K8s, which introduces
its own CLI, making its learning curve steeper.
K8s implements its functionalities following a modular
approach, where different functional blocks and services
communicate via the API server. If new features (or third-
party applications) are to be considered, then these also
require integration with the API server.
Regarding performance aspects, DS performs better and
adds less overhead, as shown by Nickoloff [29], who provides
a performance evaluation of multiple aspects, e.g., the time
required to start thousands of containers in both K8s and
DS, considering a cluster with one thousand nodes. Pan et al.
experimented with the same tools to investigate the overhead
added by each tool compared to containers running directly
on Docker [30]. Their results show a larger overhead for K8s.
Beltre et al. also compared the performance of these tools
against bare-metal solutions, focusing on communications
in HPC applications [31]. Their analysis shows that both
K8s and DS can achieve near bare-metal performance over
Remote Direct Memory Access (RDMA) networking, when
high-performance transports are enabled.
To better compare key features across K8s and DS, and
to discuss how orchestrators should evolve, Table 2
provides a descriptive comparison of features for K8s and
DS, and proposes a few directions towards a novel generation
of dynamic orchestrators. The proposed features are the basis
for the discussion in the next sections.
F. REPLICATION AND OFFLOADING
Container orchestrators manage containerized applications
based on human intervention, relying on replication, as
explained earlier.
In Edge-Cloud environments, and in particular in far-Edge
environments, replication may create challenges, as devices
are often mobile and resource-constrained. Hence, it is im-
portant to also consider the possibility of offloading, and
to debate and evaluate the implications of such a process in
the context of dynamic container orchestration.
Replication focuses on scaling aspects: replicas of micro-
services are deployed in different nodes, and can be activated
or deactivated to achieve a specific performance objective.
Offloading is often used in Edge environments, when there
is the need, for instance, to run an application independently
on the Edge to meet latency or other types of requirements.
Offloading implies a load reduction in the former environ-
ment, but requires additional management to ensure stability
of the overall system.
In dynamic environments, for instance Edge-based envi-
ronments involving mobile devices such as Unmanned Aerial
Vehicles (UAVs), Automated Guided Vehicles (AGVs), or even
cars, there is a need not just to scale up the system, but in
many cases also to scale down resource usage. In addition,
environments that encompass multiple clusters belonging to
different operators may require that security parameters be
updated whenever a container is moved to a new domain, or
provisioning of a dynamic and distributed trust management
scheme.
With the replication model, this implies that during a
scheduling process, which is usually based on a two-phase
filtering and ranking scheme, the eligible nodes are selected
based on the statically available resources at some instant in
time. In a very dynamic environment, where new Pods
are frequently assigned to new nodes to reduce latency,
the system may reach a point where the existing resources
are not enough to meet the current system requirements.
It should be highlighted that K8s never moves Pods from
one node to another to free up resources. It is in such
variable mobility environments that offloading seems to be
the more interesting model. However, its application may imply
additional adjustments to the system, and it is therefore im-
portant to understand the offloading requirements and steps.
It is also relevant to understand which indicators (beyond
node usage indicators such as CPU and memory) should be
passed to a container orchestrator scheduler in order to best
meet the requirements of more dynamic environments. This
aspect (which context indicators to consider, and why)
TABLE 2. Comparison of features between K8s and DS, and expected features in a dynamic orchestrator.

Setup, installation and configuration
  K8s: Scripting is required to bring up a cluster, define the specific environment, define the internetworking between components (Pod network), and set up the dashboard.
  DS: The Docker CLI runs the different programmes and therefore simplifies setup. The resulting clusters are not as robust as in K8s.
  Future directions: Better integration of a CLI and UI, to address zero-configuration.

Building and running containers
  K8s: Specific API, client, and configuration (YAML) definitions.
  DS: The Docker CLI allows new containers to be easily deployed.
  Future directions: Initial set of parameters set via a dashboard. Setting up a cluster, as well as adding and removing nodes, would be automated via a zero-configuration mechanism.

Logging and monitoring
  K8s: Inbuilt tools provide support to understand the cause of failures. Analysis is manual. Monitoring provides an assessment of nodes and containerized services.
  DS: No inbuilt tools, but support is provided via third-party tools, such as ELK (logging) and Riemann (monitoring).
  Future directions: Basic inbuilt tools as well as interfaces towards third-party tooling are required.

Scalability
  K8s: Intra-cluster concept; runs into problems for cluster federation. Envisions auto-scaling. 5000-node clusters, amounting to 150,000 Pods.
  DS: Scales better than K8s, given that it provides a basis to deploy containers faster in large environments. Envisions auto-scaling. Claimed to be up to 5 times more scalable than K8s. 1000-node clusters with 30,000 containers.
  Future directions: The starting point is the scalability KPIs provided by DS.

Security
  K8s: Pod security policies can provide access to a full cluster. The K8s model for service account tokens enables API servers to be reachable from every container by default.
  DS: Control-plane data exchanged between nodes is encrypted and shared with neighboring nodes via a gossip protocol. Encryption keys are only stored in manager nodes. Application data is not encrypted in transit. Secrets (passwords, tokens, keys) are encrypted and stored as Raft data. A PKI is used to enable nodes to communicate with each other securely.
  Future directions: Encrypted exchange of data by default, based, e.g., on a social opportunistic protocol; the overlay could be based, e.g., on DLT.

Workload migration
  K8s: Manual configuration is required to provide adequate auto-scaling or load-balancing.
  DS: Should perform adaptive load-balancing across a cluster.
  Future directions: Performs load balancing taking into consideration not just workload, but also other types of context and behaviour inference, intra- and inter-cluster.

Node support
  K8s: Supports up to 5000 nodes.
  DS: Supports 2000+ nodes.
  Future directions: Expected to support at least 2000 nodes.

Optimization target
  K8s: One large cluster.
  DS: Multiple small clusters.
  Future directions: Overall optimization strategies, to balance operations both intra- and inter-cluster.

Networking
  K8s: Flat overlay network interconnecting Pods.
  DS: Docker daemons interconnected via overlay networks; overlay network driver.
  Future directions: The overlay should be content-oriented (data-driven), thus requiring a new degree of network abstraction.

Availability
  K8s: High availability; health checks performed directly on Pods.
  DS: High availability; containers are restarted if failures occur.
  Future directions: High availability and redundancy, also supporting mobility aspects, due to the integration of context-awareness.

Mobility support
  K8s: Not supported.
  DS: Not supported.
  Future directions: Inbuilt support, due to a better adaptation to the overall conditions, internal and external to the system (context-awareness).

Compatibility with other tools
  K8s: Not compatible with existing Docker CLI and Compose tools.
  DS: Some support for third-party tools.
  Future directions: Should ensure adequate integration of varied third-party tooling.

Architecture
  K8s: Client-server model.
  DS: Native clustering approach.
  Future directions: Clustering approach.

Adaptability
  K8s: Setup configuration.
  DS: Setup configuration.
  Future directions: Integrates behaviour inference (ML) to support a higher degree of automation during application setup and lifetime management.
is addressed in the next subsection. It is also important to
understand why offloading should occur, and the impact of
offloading workload and the respective state, both at an
instant in time and on future system operation.
IV. GUIDING SCENARIOS
The integration of IoT services across different vertical sec-
tors is expected to grow, backed up by the technological
capability to distribute micro-services. A key enabler for this
is the capability to deploy a service in a way that improves
the overall system performance, which is frequently tied to
latency reduction: the closer the workload is to the user
or data source, the smaller the latency. This distribution
of micro-services brings additional advantages, in particular
for scenarios involving mobile devices. A key issue in this
context is the mobility of devices that is tied to the user's
mobility behaviour. Similarly, more advanced scenarios are
based on (IoT-Edge) devices that are mobile independently of
users. Overall, providing support for mobility is an essential
requirement for future orchestrators. Examples of application
categories that can profit from a more dynamic orchestration
are described in the remainder of this section, including:
Time-sensitive applications, e.g., IIoT critical applica-
tions that require bounded latency and low jitter.
Pervasive sensing applications, e.g., Mobile Crowd
Sensing (MCS) applications, that are time-sensitive and
require low energy consumption.
Mobile content/video streaming, requiring low latency
which is supported via storage closer to the user.
A. MANUFACTURING
A worker in a factory relies on a certified device that can
be used only in specific areas, to perform maintenance of
the machines. During a specific period of time (in a day,
in a week) the worker relies on the device to collect in-
formation on machines. The worker then moves to another
location, leaving the device on the specific premises. On the
new premises, the worker will use another device, but the
application needs to be transferred beforehand.
While the worker is performing his tasks, the orchestrator
predicts (based on learning of the worker mobility patterns)
that the worker will likely move to a new location to con-
tinue his work. It therefore triggers a decision to offload the
application to a device at the predicted location to allow work
to continue without interruption. Part of the orchestration
implementation will be to ensure that the data required by
the application is available locally. For example, if the appli-
cation includes production monitoring, production plans and
related data should be prefetched to the device at the new
location.
In this case, the regular mobility of the user represents
a type of user behaviour trigger, which can be fed into an
orchestrator, to best estimate when and where to offload
the application workload and state, and to ensure that any
required data is available.
B. SMART CITIES
Smart Cities benefit greatly from the use of AI services.
Coupled with ubiquitous connectivity (LoRa/LoRaWAN, 5G,
etc.), AI allows cities to consider different sensing aspects to
improve overall city planning. An example is audio/video
processing at Edge servers co-located with cameras in park-
ing lots. Today, the collected data is sent to the Cloud for
further processing. Based on Edge computing, it is feasible to
run specific AI/ML models co-located with the cameras and
process the data locally.
An adaptation based on specific context indicators brings
benefits in the sense that the running application may adapt,
in specific locations, its ML model to best suit the
context (e.g., peak number of vehicles in certain seasons) and
the overall needs (e.g., energy reduction, or the need to process
data faster during a specific time of the day).
Here, a dynamic orchestrator would support the required
adaptation. For instance, it would be feasible to perform
system adaptation on the Edge nodes to run specific micro-
services based on service or situation needs (e.g., on a
camera, count the number of electric vehicles around; assess
abnormal situations based on noise).
C. FARMING
A set of UAVs is used in the context of crop lifecycle
monitoring. The UAVs carry a data analysis application,
which supports local pre-processing of the collected IoT
data, to circumvent the issue of intermittent connectivity, and
to reduce energy consumption. The pre-processed datasets
are time-stamped and downloaded to an Edge node when
connectivity is available. Thanks to the global view of data
provided by Pathfinder, it is simple for a back-end application
running in the Cloud to gather the pre-processed data and
complete the overall crop monitoring and planning. This
can be done without implementing any application-specific
mechanisms, aside from a naming convention including time-
stamping.
D. MOBILITY
e-Mobility solutions are becoming increasingly decentralized
and integrate different e-Mobility services (e.g., transport,
energy monitoring) in an attempt to provide personalized
recommendations to the user. Such recommendations require
heavy data analysis, today performed in the Cloud. With the
increase in the heterogeneity of possible e-Mobility services,
and with the integration of support for a more dynamic
lifestyle, where commuting requires the use of different
infrastructures, feedback to the user becomes more complex.
The use of context-awareness and of a dynamic orchestrator
facilitates the integration of new services and the possibility
to handle data locally. Here, the role of the orchestrator is
to assist the data exchange across federated cluster envi-
ronments, making it possible for the application to provide
recommendations interactively, in real time.
As an example, consider an employee travelling to attend
a meeting in another city. Based on user preferences and
predicted traffic, a travel plan is created that includes a
train to the destination city and then a taxi to the meeting
location, which might be convenient in an unfamiliar city. If
on arrival at the destination train station, local traffic is not
as expected, the traveller may reroute his/her trip to use a
local subway or tram system that bypasses the unexpected
road congestion. The orchestrator predicts the need to run
the planning application on the Edge close to the destination
train station, with up-to-date data on local road and public
transportation operations in order to interact with the traveller
to make it easy to adapt the trip.
In this example, the orchestrator would rely on user pref-
erences (e.g., preferred means of transportation), location
(distance to a station), and dynamic information (traffic con-
ditions or service interruptions) to ensure that the necessary
information is collected to assist the user in making good,
dynamic decisions for planning and adapting travel plans.
V. THE ROLE OF AI/ML IN CONTAINER
ORCHESTRATION
A. KEY CHALLENGES
ML-based container orchestration technologies have been
leveraged in Cloud computing environments for various pur-
poses, such as resource efficiency, load-balancing, energy ef-
ficiency, and Service Level Agreement (SLA) assurance [26],
[32]. One main challenge when making an offloading deci-
sion relates to the learning and adaptation capability, consid-
ering both internal and external system variables. The load
on system resources, as well as the number of users and
the data being consumed, change constantly, and both users
and system nodes may be mobile, thus adding to the dynam-
ics of the overall system [33], [34]. Therefore, integrating
intelligence into this process is both challenging and es-
sential [35]. Managing computing resources and optimizing
costs in multi-cluster Edge-Cloud environments are highly
demanding tasks, as container adoption is growing across
multiple container management platforms [25]. Whether on-
premises, in public Clouds, or both, the operational overhead
of container adoption is significant. As administrators can-
not predict the computing resource demands of applications,
they typically reserve more computing resources for an ap-
plication workload and state than needed. Therefore, the inte-
gration of learning and prediction using AI/ML is recognized
as an essential element of container orchestration [36].
B. FEATURES AND BENEFITS
Supporting elasticity as well as internal and external sys-
tem dynamics is a fundamental requirement in Edge-Cloud
environments, and therefore offering appropriate resource
scaling is one of the most important features. Adopting
AI/ML techniques in the orchestration process ensures
better adaptability: it increases overall operational
efficiency and provides more flexibility by replacing
manual configuration with digital intelligence, reducing the
need for manual resource monitoring, tracking of data us-
age, calculating optimal configurations, and changing the
configurations accordingly [37]. These tasks are automated
and become routine when accomplished using AI/ML tech-
niques in the orchestration process. Common issues, such
as over-provisioned computing resources, deployments
with a poorly selected number and size of Pods, or under-
provisioning of resources, e.g., for time-critical workloads,
may be avoidable with ML. ML integration is also beneficial
to assist an adequate prediction of resource consumption in a
way that can best adapt the system over time. Hence, employ-
ing AI/ML in orchestrators provides a learning mechanism
for application resource usage patterns and enables resource
prediction down to the container level. While continuously
generating recommendations, the system keeps learning from
additional data. Overall, this can result in a reduction in
spending, while increasing the application service quality and
delivering the necessary performance.
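As a minimal illustration of such prediction-driven provisioning, the following sketch uses a naive moving-average forecast of container CPU usage to recommend a replica count instead of a fixed over-provisioned reservation (function and parameter names are hypothetical):

```python
import math

def predict_usage(samples, window=3):
    """Naive forecast: mean of the last `window` observed CPU samples (cores).
    A real orchestrator would use a learned time-series model instead."""
    recent = samples[-window:]
    return sum(recent) / len(recent)

def recommend_replicas(samples, per_replica_capacity, headroom=0.2):
    """Recommend just enough replicas to cover the predicted load plus a
    headroom margin, avoiding both over- and under-provisioning."""
    predicted = predict_usage(samples) * (1 + headroom)
    return max(1, math.ceil(predicted / per_replica_capacity))

cpu_history = [0.8, 1.1, 1.6, 1.9, 2.1]   # observed cores used by the workload
print(recommend_replicas(cpu_history, per_replica_capacity=1.0))  # 3
```

An ML-based orchestrator would replace the moving average with a trained predictor, but the decision loop (forecast, add headroom, size the deployment) stays the same.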
The interlock of AI/ML with the orchestration mechanism
can provide precise offloading decisions based on the overall
dynamic operation and state of a system, across time, and also
taking into consideration the overall context of the system.
Such context can be based on different external and internal
parameters, e.g., derived from network or application re-
quirements, from data observability, or from user behaviour,
as will be discussed later in subsection VII-A.
The optimization of the objectives and metrics of ML-
based approaches for container orchestration has been in-
vestigated, and multiple methods have been extensively dis-
cussed in the related literature. Figure 3 provides a taxonomy
for the application of ML across different features of container
orchestration [26]. This taxonomy addresses benefits ranging
from resource efficiency and energy efficiency to cost
efficiency, for instance.
FIGURE 3. ML-based container orchestration taxonomy [26].
The proposed taxonomy details the use of ML across five
specific orchestration categories: application architecture,
infrastructure, optimization objectives, behaviour modeling,
and resource provisioning. In the context of the proposed tax-
onomy, application architecture represents the behaviour and
internal structures of containerized application components.
Infrastructure indicates the environments or platforms where
applications operate, and considers single-Cloud, multi-Cloud,
and hybrid-Cloud infrastructure patterns; federated Clouds
fall, in this context, into hybrid Clouds. Optimization objectives
are the improvements that ML-based approaches attempt
to achieve. Behaviour modeling leverages ML models for
pattern recognition and simulation of system and application
behaviours, besides forecasting future tendencies according
to collected data. Resource provisioning prescribes the
resource management policies of containerized applications
at different phases of the container life cycle, under diverse
scenarios [26]. While relevant, the cited taxonomy isolates
the use of ML across the different proposed categories,
without considering that several aspects are interlocked. For
instance, the optimization objectives provide examples of
specific efficiency measures to consider, without integrating
other aspects, e.g., behaviour modelling.
In our opinion, the integration of ML in orchestration serves
the overall goal of improving adaptability and, as such,
requires a cross-category approach, where a central point
is the consideration of multi-objective optimization approaches
that improve overall system efficiency, adapting to different
environments without the need for additional manual in-
tervention. For this purpose, Figure 4 provides a high-level
perspective on the key aspects that are required in a dynamic
orchestrator. As illustrated, we advocate the integration of
context based on multiple parameters (data observability,
application and network requirements, user behaviour) that
bring context-awareness to an adaptive ML-based layer, ca-
pable of behaviour modelling and prediction. Such an approach
needs to be abstracted from the underlying architectures, so
that it can be deployed on Cloud-based architectures, Edge-
Cloud, or fully decentralized Edge architectures. For this
purpose, it is relevant to understand which kinds of ML-based
approaches can be leveraged to reach decentralization. These
aspects are discussed next.
C. ML-BASED ARCHITECTURES FOR EDGE-CLOUD
The implementation of ML-based approaches generally fol-
lows three phases: data pre-processing, model training, and
model testing. In data pre-processing, the learning task and
the learning parameters are declared, the datasets used for
training are loaded, and the required libraries are imported
according to the learning task and datasets. Once
the data is pre-processed, it is used to train the model. After-
wards, the model is tested based on the achieved convergence
accuracy and evaluation parameters. All of these phases are
conducted based on different ML architectures, of which fed-
erated learning is an example [38]. This section aims at ex-
plaining the main differences between existing types of architec-
tures (centralized vs. decentralized; distributed, federated, and
swarm learning). The section does not exhaustively describe
the different algorithms that can be used across Edge-Cloud,
given that such aspects would require a deeper, survey-
oriented analysis, and the focus of this work is on advocating
the need for a cognitive and decentralized orchestration, and
on providing a first insight into the role of cross-layer context-
awareness in this process. For this topic, we further direct the
reader to related literature on AI approaches to
the Edge-Cloud continuum [39], [40].
1) Centralized Learning
In centralized ML architectures, data training is handled
at a central location, usually the Cloud. This often requires
an infrastructure capable of supporting heavy data
transmission from a high number of data sources towards
the Cloud. In general, centralized training is computationally
efficient, assuming a not-so-large number of sources and
adequate interconnections to the Cloud. However, centralized
approaches imply, in particular when considering personal
data, potential security and confidentiality breaches, as data
is stored in the Cloud [38].
2) Distributed Learning
The increased risks of moving large bulks of data to a
centralized entity motivated the evolution of the distributed
ML approach, where training, prediction, and inference are
based on live-streaming data. Distributed learning relies on a
multi-node approach, where datasets are trained locally (per
node). In distributed learning, the server distributes
a pre-trained or generic ML model to the participating en-
tities. Distributed learning is particularly beneficial when
there are frequent and large updates of data, and when
some devices are resource-constrained; hence, distributed
learning approaches become beneficial in Edge-Cloud envi-
ronments [38].
3) Federated Learning
Federated Learning (FL) follows a hierarchical approach,
where algorithm training can consider datasets on different
Edges without having to transmit such datasets to the Cloud
(only learning parameters are transmitted). This technique
therefore reduces the amount of data transferred and minimizes
the privacy concerns regarding users' private data.
FL can be based on a centralized approach or on a de-
centralized approach. In a centralized approach, FL relies,
for instance, on a Cloud server to orchestrate the algorithm and
ensure the coordination of the different learning parameters, e.g., for
different datasets on different Edges. Hence, there is an initial
selection of specific (Edge) nodes to consider in the training
and to aggregate the updates. This approach partially inherits
the issues of centralized learning, as the Cloud
server can become a bottleneck; however, it mitigates privacy
issues. In decentralized FL, the different involved enti-
ties (nodes) directly coordinate among themselves the global
model to consider. In this case, the main issue may relate to
the underlying infrastructure [38].
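A centralized FL round can be sketched as FedAvg-style parameter averaging; the local training rule below is a toy stand-in for real gradient steps, and all names are our own illustration:

```python
def train_locally(model, data):
    """Toy local update: nudge each parameter towards the local data mean.
    (Stands in for real local gradient descent; illustrative only.)"""
    mean = sum(data) / len(data)
    return [w + 0.1 * (mean - w) for w in model]

def federated_round(global_model, clients):
    """One round of centralized federated averaging (FedAvg-style): each
    client trains locally; only parameters -- never raw data -- are
    aggregated, weighted by local dataset size."""
    updates, sizes = [], []
    for local_data in clients:
        updates.append(train_locally(global_model, local_data))  # data stays on the Edge
        sizes.append(len(local_data))
    total = sum(sizes)
    return [sum(u[i] * s for u, s in zip(updates, sizes)) / total
            for i in range(len(global_model))]

new_model = federated_round([0.0], [[1.0, 1.0], [3.0, 3.0, 3.0, 3.0]])
print(new_model)  # weighted average of the two local updates
```

In decentralized FL or swarm learning, the same averaging step would be performed peer-to-peer rather than by a central server.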
4) Swarm Learning
Swarm learning⁶ is a decentralized, privacy-preserving ap-
proach that is built on the principles of Distributed Ledger
Technology (DLT), thus supporting learning in a decentral-
ized way, close to the data sources. Raw data is kept local,
while learning parameters are shared via a swarm network;
hence, models are built independently at the Edge. Swarm
learning integrates security to support data sovereignty and con-
fidentiality, via smart contracts involving only pre-authorized
participants. Swarm learning therefore supports a dynamic
onboarding of nodes, via smart contracts. Once a new node
enters (via a smart contract), it obtains a model and performs
local training until specific synchronization conditions are
met. Then, the model parameters are exchanged via a swarm
⁶https://github.com/HewlettPackard/swarm-learning
FIGURE 4. ML as a key aspect in dynamic container orchestration.
API, and merged to create an updated model with updated
parameter settings, before a new training round is started.
An example of the application of Swarm Learning has been
provided by Warnat-Herresthal et al. in the context of clinical
disease classifiers, showing promising results in terms of
decentralized learning, while keeping data privacy [41].
5) Summary
In the Internet, with the increasing integration of IoT across different vertical domains, sending large bulks of data to the Cloud for training and inference clashes with the ongoing decentralization trend.
In this context, FL and its variants (centralized, decentralized, hierarchical) have contributed to decentralizing the learning process across Edge-Cloud. Federated learning allows participant nodes to collaboratively train local models on their data without revealing sensitive information to a central Cloud. Nonetheless, while it provides anonymity, it does not preserve privacy.
Swarm learning, on the other hand, seems to bring additional benefits in comparison to FL. The most visible one is privacy preservation. A less visible benefit concerns the possibility to integrate more nodes (hence, more local training on different Edges) and a higher heterogeneity in terms of local model training.
To assist the evolution of container orchestrators, it is important to consider decentralized and privacy-preserving approaches, with Swarm learning being a relevant example. Other variants of decentralized FL may also bring advantages in this context, in particular approaches that consider aspects such as mobility [42], constrained devices, or synchronization [43].
Three specific challenges need to be addressed by design when considering future container orchestrators and the need to integrate a decentralized learning approach for orchestration: mobility of the involved entities [44]; privacy preservation, as attempted via swarm learning; and the capability to perform training involving constrained devices [45].
VI. THE NEED FOR DATA OBSERVABILITY IN
IOT-EDGE-CLOUD COMPUTING
Data observability encompasses providing a view of the many aspects of an organization's data, including, for example, information describing where data is stored, how it is structured (schema), where it originated (lineage), how it is classified, how it is used, and how it is protected, in order to make it possible to get more value from the data and, in general, to better manage and protect it. Data observability becomes increasingly important as IT environments are expanded to include Cloud resources, since, like the IT resources themselves, the data becomes more distributed and more difficult to manage efficiently. Edge computing further increases the amount and distribution of data by adding compute and data domains that are closer to real-world data workflows in areas such as Smart Cities, Agriculture, Manufacturing, and Health.
A. MOTIVATION
Our objective is to improve how data is used and managed
across IoT-Edge-Cloud and company internal processing and
data domains. We want to ensure that data can be found,
that data pipelines are working as intended to provide data
for processing, and above all that data is protected and used
in compliance with an organization’s policies and applicable
regulations. Improved data observability is essential for the
orchestration of computation. By incorporating information
describing data and the state of the networks and systems on
which that data is processed, we will be able to significantly
improve process orchestration.
Some of our specific objectives for improving data observ-
ability across compute and storage hierarchies include:
- Knowing where data is stored and available for processing: Data needs to be moved to where it will be processed, or processing moved to the data. Moving data may mean copying and possibly transforming complete data sets. Alternatively, data can be queried remotely to avoid copying. Rich data observability will make it possible to further optimize orchestration by accounting for the cost of accessing data.
- The ability to exploit and manage data as things change: In dynamic IoT-Edge environments, events can
occur that impact how data can be used or protected.
The flow of data might be interrupted by a disruption
in the systems where it is generated. A data pipeline
might break due to changes in the data structure or a
change in the IT environment. A new deployment of
an application that processes source data to produce
higher-value assets such as machine learning models
may fail. Prevention or fast detection of such situations
is essential to minimize disruptions in the real-world
systems being monitored and managed.
- Providing end-to-end observability: The objective of
Edge computing is to bring computation and data stor-
age closer to where data is produced. Therefore, many
Edge-computing scenarios will require the use of multi-
ple Edge clusters to accomplish this. The administrative
control of the domains may belong to different parts of a
company or be provided by different service providers.
Optimization of orchestration and data management re-
quires visibility of data and the state of resources across
these domains.
- Compliance: As data processing and storage become more distributed, and data is moved or copied through IoT-Edge-Cloud and on-premise domains, the risk of violating either company policies or regulations grows. Therefore, it is essential to monitor these environments to ensure that data is stored and processed in compliance with the applicable policies and laws.
- Data protection: Similarly, in dynamic IoT-Edge-Cloud environments, data becomes more exposed to theft or tampering. In the case of sensitive data, special attention is required to ensure that data is properly protected both at rest and when it is moved or copied. Deployed processes might include vulnerabilities that expose data, for example by encrypting it with weak or poorly configured cryptographic algorithms. All cryptographic components (algorithms, configuration, keys, certificates) must be adequate to achieve the required level of protection.
- Flexibility to deal with diverse systems: Expanding existing distributed IT environments to include Cloud and Edge resources requires monitoring and controlling an increasingly wide range of systems. Therefore, the monitoring system must be easy to adapt to these new systems.
Orchestration across multiple domains can benefit from
rich, cross-domain data observability. Orchestration needs to
account for the cost of moving data or moving processing,
therefore considering factors such as the size of the datasets,
frequency of access, frequency of data updates, and network
bandwidth and latency. The availability of data lineage infor-
mation and metrics on the frequency of updates and freshness
of copies of datasets may help with locating copies of data
that enable better overall orchestration decisions. In general,
making good orchestration decisions requires having an up-
to-date view of the resources available across all domains. In
particular, triggers need to be available to make it possible
to adapt the system to changes that may disrupt running
processes, thus warranting re-orchestration.
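As a minimal illustration of such cost accounting, a scheduler could compare the recurring cost of copying a frequently updated dataset against the one-off cost of moving the processing to the data. All units, thresholds, and function names below are assumptions for the sketch, not a calibrated model.

```python
# Illustrative "move the data" vs. "move the processing" comparison,
# using the factors named above: dataset size, update frequency, and
# network bandwidth/latency. Weights and units are assumptions.

def transfer_time_s(size_gb, bandwidth_mbps, latency_ms):
    """Rough time to copy a dataset over the network, in seconds."""
    return (size_gb * 8000.0) / bandwidth_mbps + latency_ms / 1000.0

def prefer_moving_processing(size_gb, update_freq_per_h,
                             bandwidth_mbps, latency_ms,
                             container_deploy_s=30.0):
    """True if shipping the container to the data looks cheaper than
    repeatedly copying a frequently updated dataset."""
    # A dataset updated f times per hour must be re-copied f times.
    hourly_copy_cost = update_freq_per_h * transfer_time_s(
        size_gb, bandwidth_mbps, latency_ms)
    return hourly_copy_cost > container_deploy_s

# 10 GB dataset, updated 4x/hour, over a 100 Mbit/s Edge link:
# each copy takes ~800 s, so moving the processing wins.
```

Data-lineage and freshness metadata, as discussed above, would supply the update frequency and candidate copies for such a comparison.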
B. PATHFINDER DATA OBSERVABILITY
Pathfinder is a system developed at IBM Research to provide
data observability by automatically collecting, linking, and
enriching metadata describing data as well as the systems
that process and store the data [46], [47]. Pathfinder collects
metadata from a wide range of sources so that information
that is normally siloed is available from a single source
to simplify orchestration and data management tasks. The
complete collection of linked and enriched metadata is called
the Enterprise Data Map (EDM).
Pathfinder is event-oriented, built on Apache Kafka, to support applications that need to be informed of changes in system state.
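A sketch of how an application could consume such state-change events is shown below. The event schema, event type names, and handler registration are hypothetical; in a real deployment, the serialized messages would arrive from a Kafka consumer loop rather than being fed in directly.

```python
# Sketch of reacting to state-change events in an event-oriented
# metadata system. The event shape and type names are invented
# for illustration; a broker consumer would call dispatch() for
# each received message.

import json

HANDLERS = {}

def on_event(event_type):
    """Register a callback for a given metadata event type."""
    def register(fn):
        HANDLERS[event_type] = fn
        return fn
    return register

def dispatch(raw_message):
    """Route a serialized event to its registered handler."""
    event = json.loads(raw_message)
    handler = HANDLERS.get(event["type"])
    return handler(event) if handler else None

@on_event("dataset.updated")
def handle_dataset_update(event):
    # In a real deployment this would trigger re-orchestration.
    return f"re-schedule processing for {event['dataset']}"

# Feed one event directly, standing in for a consumer loop:
result = dispatch('{"type": "dataset.updated", "dataset": "sensor-logs"}')
```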
Pathfinder uses connectors to interface to systems in order
to collect metadata. For example, a connector to a database
or file system can provide information on the tables or files
that are available. Connectors to data catalogs can provide
metadata on how data is classified, who owns the data, and
other relevant information found in catalogs. Connectors
to data pipelines provide information on data lineage and
transformations that have been performed on source data
to produce new datasets. The Pathfinder connector model
ensures that the system can be extended to support new
sources of metadata.
The independence of Pathfinder connectors from the core
system simplifies the deployment and administration of the
system in an environment with multiple administrative do-
mains. The Pathfinder core system does not require any per-
missions to access any of the systems from which metadata is
collected. The connectors collect metadata and share it with the core system. The only requirement is that
connectors and core system have connectivity. Connectors
can be deployed and provided with credentials under control
of the domain in which they run. Thanks to this flexibility,
Pathfinder can create an end-to-end view of data and other
resources, including across multi-domain environments.
Pathfinder uses enrichers to add additional information to
the EDM based on the collected metadata combined with
other information and analysis. As a simple example, an
enricher could be created to analyze the quality (relative
to some desired metric) of a dataset, which is then added
to the metadata and thus available to all applications that
might process that data. A more complex enricher would be
required to evaluate policies, making use of the information
in the EDM to determine if data is being stored and processed
in compliance with those policies. In this case, Pathfinder has
a compliance role by implementing data security and privacy
controls [48]. If metadata is available describing known
vulnerabilities to processes running in an IT environment,
this view can be enriched by tagging the data assets that are
exposed due to those vulnerabilities.
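A minimal sketch of such an enricher follows, assuming an invented metadata schema: it tags the data assets handled by processes with known vulnerabilities, so that the exposure becomes visible to all consumers of the metadata.

```python
# Hypothetical sketch of a Pathfinder-style enricher: given collected
# metadata (datasets and the processes that touch them), tag the data
# assets exposed through processes with known vulnerabilities.
# The metadata schema is invented for illustration.

def tag_exposed_assets(datasets, vulnerable_processes):
    """Add an 'exposed' tag to datasets handled by vulnerable processes.

    datasets: list of dicts with 'name', 'processed_by', 'tags'
    vulnerable_processes: set of process names with known vulnerabilities
    """
    for ds in datasets:
        if any(p in vulnerable_processes for p in ds["processed_by"]):
            ds["tags"].append("exposed")
    return datasets

edm = [
    {"name": "patients", "processed_by": ["etl-v1"], "tags": ["pii"]},
    {"name": "telemetry", "processed_by": ["agg-v2"], "tags": []},
]
tag_exposed_assets(edm, {"etl-v1"})
# "patients" is now tagged ["pii", "exposed"]; "telemetry" is untouched
```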
Significant value in a data observability system comes
from the fact that data is linked and enriched such that the
collection of metadata (the Pathfinder EDM) provides more
information and value than the sum of the individual pieces.
For example, if the EDM includes the data classification
learned from a data catalog, and lineage information learned
from a data pipeline to show where data has been copied
and how it has been transformed, a compliance enricher can
be created. Since we view data compliance evaluation as
one of the fundamental elements of Pathfinder, the system
has to have a data model that explicitly includes all of the
information required to implement this enricher.
If an application wants to include metadata that is only
of interest within that application, it is sufficient that the
connector for that application adds the information with an
appropriate tag, linked to the relevant element(s) in the EDM.
Such information cannot be interpreted by Pathfinder but can
be used by all components of the application that understand
it.
VII. DYNAMIC CONTAINER ORCHESTRATION
FRAMEWORK
The previous sections presented container orchestration as it is today; existing tools that can provide better offloading support; AI/ML for learning and system adaptation; and Pathfinder as a relevant tool to provide data observability.
In this section, we propose a conceptual framework to
support dynamic container orchestration, as represented in
Figure 5, which provides a high-level scheme for the pro-
posed functional blocks of dynamic container orchestration.
Such a framework could be implemented within an existing orchestrator, or as a third-party application. The explanation provided in this section considers the case where the framework is implemented as a third-party application to K8s, thus interacting with the K8s scheduler. The different components are explained in the following subsections, after a discussion of the type of context that should be considered to achieve a more elastic container orchestration.
A. CONTEXT-AWARE OFFLOADING TRIGGERS
In current container orchestration solutions, the need to repli-
cate resources is usually based on system requirements (e.g.,
CPU usage of a node reaches a specific level, or energy level
is perceived as high); application requirements (e.g., bounded latency); and network policies. To further support dynamic Edge-
Cloud environments, container orchestration needs to inte-
grate context-awareness into the offloading process.
A first step towards achieving context-aware orchestration
is the definition of a basic set of context-awareness parameter
categories to consider in the offloading process, how to model
context, and how such modelling can be integrated.
Our proposal is to consider the following categories of context-aware triggers, as illustrated in Figure 6 and discussed next: i) system; ii) network; iii) application; iv) data observability; v) user behaviour and preferences.
The acquisition of context parameters is handled via the
context acquisition block. This requires a sensing module or
a sensing interface that may be present on each worker node
in an orchestration system, and is passively fed with context.
It can also be performed via a monitoring protocol that
may require active probing instead of passive sensing. The
Pathfinder system described in subsection VI-B provides a
general mechanism for generating some of the context-aware
triggers. Pathfinder connectors interface with the systems distributed across the Edge-Cloud environment and gather information on their status. This information is then available in the Pathfinder EDM. For example, server utilization
statistics can be gathered across all domains. If a particular
server’s utilization exceeds a defined threshold, an event can
be generated to trigger re-orchestration. Similar triggers can
be defined to ensure that the system reacts to network delays
and problems with applications (e.g., if an application is no
longer responsive, it can be restarted). Also, data compliance
triggers are generated by Pathfinder enrichers based on how
data is classified, how it has been transformed, where it is
stored, and the purpose for which it is being used. Note that
if information cannot be determined automatically (e.g., the
purpose for processing), reports can be generated for offline
investigation.
1) System Triggers
System triggers refer to internal parameters of a system, such as CPU or memory utilization, which can be engineered to improve the efficiency of the overall system. In current orchestrators, the overall Edge-Cloud infrastructure is modelled in accordance with the needs of an application, to scale adequately over time. For instance, an application may require more CPU capability than is available at a specific node; the orchestrator then adds such capability, based on available nodes. Common system parameters available in orchestrators are CPU, memory, and storage. Energy consumption can be integrated into an orchestrator scheduler via third-party solutions.
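For reference, CPU and memory are declared per container in K8s via container.resources; the scheduler uses the requests for placement, while the limits cap usage on the node. The Pod and image names below are illustrative.

```yaml
# Standard K8s per-container resource declaration.
apiVersion: v1
kind: Pod
metadata:
  name: sensor-processor
spec:
  containers:
  - name: app
    image: example/sensor-processor:1.0
    resources:
      requests:
        cpu: "250m"
        memory: "128Mi"
      limits:
        cpu: "500m"
        memory: "256Mi"
```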
2) Network Triggers
Regarding the networking level, the context that surrounds
the nodes in different container orchestration tools can as-
sist in better defining opportunities for replication/offloading
over time and space. Current orchestrators rely on a networking overlay, and do not consider specific functional or non-functional networking requirements. A containerized application is often deployed on a collective set of nodes interconnected via a public or private IP network. Moreover, there is frequent interaction between containers, and between containers and the data they process. Accordingly, each container should be reachable and discoverable thanks to the container networking mechanism.
The horizontal Pod autoscaling feature of K8s provides
service availability and scalability by increasing the num-
ber of replicas/Pods.
FIGURE 5. High-level block diagram of the proposed framework.
FIGURE 6. Proposed context-aware categories and examples of specific parameters that can be sensed.
As a result, load balancing between replicas of an application is performed by distributing requests equally across all Pods in a K8s cluster. Taking
into consideration that these requests might be forwarded to
remote worker nodes, this approach can result in long delays,
especially in Edge computing environments where worker
nodes are geographically dispersed [49].
From the RTT, latency, and allocated bandwidth between the worker node and the master node, up to traffic load-balancing operations, varied QoS aspects from a node, path, and link perspective are examples of functional requirements that should be considered in the orchestration process. Non-functional requirements include, for instance, mobility or portability.
Currently, because orchestrators rely on overlay networks, what is considered from a network perspective are Service Level Agreements (SLAs). SLAs are not enough to provide an adequate adaptation of the underlying infrastructure, particularly in mobile environments, where frequent IP changes are handled by mobility solutions such as Mobile IPv6 (MIPv6) [50]. Network Function Virtualization (NFV) approaches today make it possible to integrate mobility management into orchestrators [51], thus bringing the possibility to better adapt the overall system orchestration, taking into consideration both computational and networking resources. Considering an approach similar to that of MIPv6, it is possible, for instance, to handle changes of IP subnets across an old Pod and a new Pod.
3) Application Triggers
Current orchestrators rely on application functional require-
ments to best model the system scaling or load-balancing.
The support for time-sensitive applications is nonetheless
limited, as explained in Section II, where we have described
related literature that proposes changes to the K8s scheduler
to best meet the needs of time-sensitive applications. In the
context of Industrial IoT, for instance, critical applications often require minimum, bounded latency below one millisecond. Tight time synchronization among devices serving a specific application is often a challenge, e.g., for the time-aware queuing disciplines that support the exchange of critical traffic. These are examples of QoS parameters. Current orchestrators require finer-grained support for QoS to cope with such challenges. In addition, non-functional requirements, such as the sensitiveness level of an application, or certification and compliance aspects, may prevent an application from running under specific conditions on a node.
Other requirements, such as application usage, roaming
and location preferences, or even desired time to completion
of a service, can be provided via a semantic abstraction of the
application, during setup.
4) Data Observability Triggers
Events to re-orchestrate application processing can also be
triggered by a solution such as Pathfinder based on data-
related changes. The most basic event of this type could be
created based on a new or updated dataset. These datasets
could be tagged in the EDM by the application or pipeline
that creates them, or a Pathfinder enricher could create triggers automatically based on predefined conditions. Many different dataset characteristics could be considered. For example, if a dataset is not updated within a specified time, or if a dataset grows to exceed a specified size, a trigger can be generated to schedule an application to handle the condition.
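The two dataset conditions just mentioned (staleness and size growth) can be sketched as simple threshold checks; the metadata shape and threshold values below are illustrative.

```python
# Sketch of dataset-level trigger conditions: staleness (not updated
# within a time window) and size growth. Thresholds and the metadata
# shape are assumptions for the sketch.

import time

def dataset_triggers(meta, now=None,
                     max_age_s=3600, max_size_bytes=10**9):
    """Return the list of triggers raised for one dataset.

    meta: dict with 'name', 'last_update' (epoch s), 'size_bytes'
    """
    now = time.time() if now is None else now
    triggers = []
    if now - meta["last_update"] > max_age_s:
        triggers.append(("freshness", meta["name"]))
    if meta["size_bytes"] > max_size_bytes:
        triggers.append(("size", meta["name"]))
    return triggers

stale = {"name": "edge-cam-frames", "last_update": 0,
         "size_bytes": 2 * 10**9}
# at now=7200 this raises both a freshness and a size trigger
```

Each raised trigger would then schedule the registered application to handle the condition, as described above.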
Triggers can also be created based on data compliance issues. Regulations, such as the European General Data Protection Regulation (GDPR)7, and company policies can constrain, for example, where data can be processed and the purpose of that processing, and are therefore important factors to be considered as part of orchestrating data processing.
In many cases, data protection issues should be evaluated
when initial orchestration decisions are made, before data is
transferred and processed. This is the only way to ensure
compliance. In distributed Edge-Cloud environments, it will not always be possible to control all compliance issues a priori. For example, data may be copied by a process that is not controlled (e.g., if that process is not documented) from an Edge to the Cloud for processing. To cover such cases, it is important
to implement compliance monitoring to complement real-
time controls. This can be implemented using a data observ-
ability system like Pathfinder, which monitors the data and
processing in each domain, and uses enrichers to evaluate
the compliance status. In some cases a compliance violation
could result in an orchestration trigger to move processing so
that it is compliant. In other cases evaluation of a particular
policy may not be possible (for example, if the purpose of
processing by an application has not been documented), in
which case a report should be generated to trigger a manual process to ensure that compliance is maintained or restored.
5) User Behaviour Triggers
The generalized application of pervasive technologies, such as IoT, IIoT, and MCS, has given rise to the possibility of exploring user behaviour across different domains to improve system and networking behaviour. A common indicator of user behaviour is user location, which is widely applied to improve the support of pervasive applications or the overall system behaviour [52]. However, new technologies make it possible
to explore and to integrate other facets of user and human
7https://gdpr-info.eu/
behaviour, in an attempt to benefit the overall orchestration
of resources, in particular across mobile heterogeneous Edge-
Cloud environments. An example of how user behaviour can
be applied to improve the overall interconnection across users
has been developed in the context of the H2020 UMOBILE
project 8, where a context-aware agent has been developed
to capture and model user mobility to assist the overall
network operation and in particular, to support social-aware
opportunistic routing 9.
Other forms of user behaviour, including user preferences,
have been considered, e.g., social interaction, group forma-
tion, and social/physical proximity [53] to improve overall
system performance and to provide better support for task
offloading across the Edge [54]. User mobility has also
been applied in the context of energy consumption improve-
ment [55].
User behaviour and preferences are therefore relevant to consider in the context of improving container orchestration, in particular for mobile environments, as briefly explained in Section IV.
While there are a few starting points to model user be-
haviour, it is relevant to understand which particular features
can be considered, and how such context can be passed
to the orchestrator scheduler, to improve the overall use of
resources across Edge-Cloud.
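As a minimal sketch of one such feature, an indicator of user affinity to an IP network could combine visit frequency and duration into a per-network score to be passed to the scheduler. The scoring formula below is an assumption for the sketch.

```python
# Illustrative user-to-network affinity score, combining frequency and
# duration of visits to an IP network. The formula (count * total
# duration) is an assumption, not a validated metric.

from collections import defaultdict

def network_affinity(visits):
    """visits: list of (network_id, duration_s) observations.
    Returns networks ranked by total time weighted by visit count."""
    count = defaultdict(int)
    total = defaultdict(float)
    for net, dur in visits:
        count[net] += 1
        total[net] += dur
    return sorted(
        ((net, count[net] * total[net]) for net in count),
        key=lambda x: x[1], reverse=True)

log = [("wifi-home", 3600), ("wifi-office", 1800),
       ("wifi-home", 1800), ("lte", 300)]
ranking = network_affinity(log)
# "wifi-home" ranks first: visited twice for 5400 s in total
```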
6) Summary, Basic Set of Offloading Triggers
To sum up, a precise offloading decision depends on a set of
operations and parameters that are required to be fulfilled
in an automated and dynamic way. These parameters are
currently contingent on the triggers coming from the orches-
trator system, the networking features, and the application
requirements. However, it is feasible to further improve this by integrating triggers both external and internal to the orchestrator, for instance, by taking data protection issues into account and by integrating human behaviour into the orchestrator scheduler.
A proposed initial set of triggers based on the categories
discussed is provided in Table 3, where the first column pro-
vides the type of trigger, system (S); application (A); network
(N); data observability (D); user behaviour (U). The second
column identifies the parameter that shall be part of the set
of offloading triggers, while the third column gives insight into the current support in K8s. The fourth column explains
the purpose of the integration of the parameter; the fifth
explains units to consider, while the sixth column provides
a description on how the parameters can be modelled.
The parameters taken into account by an orchestrator should not be treated in isolation, as they often depend on other parameters that must be monitored by the orchestrator. For example, the perceived QoS levels depend on other aspects, such as mobility or user location.
8https://umobile-project.eu
9https://umobile-project.eu/phocadownload/papers/wp-contextualmanager.pdf
TABLE 3. Initial proposed set of context-aware offloading triggers, and how they can be modelled to be integrated in K8s, as an example of an orchestrator.
Type | Parameter | K8s Support | Purpose | Modelling
S | CPU usage | Min-Max | Prevent computational exhaustion | Integrated into K8s as container.resources
S | Memory usage | Min-Max | Prevent storage exhaustion | Integrated into K8s as container.resources
S | Energy consumption | No | Optimize energy consumption | Refer to Scaphandre for a modelling example10
N | Path length (hops) | No, only network policies | Reduce latency | Interconnection between Pods across multi-path environments can consider a hop count distribution conditioned to the Euclidean distance, to estimate a best path
N | Network congestion | No | Improve resilience, support mobility | Based on jitter and packet loss modelling
N | Available bandwidth | Partial, limits can be set via network policies for a cluster | Reduce latency and improve resilience | Consider an adaptive approach based on passive probing, to increase or decrease bandwidth
A | Security | Partial (ACL), focus on authorization and access control only | Consider communication encryption to prevent security breaches | Integrate adaptive link-level security
A | QoS | Limited, based on specific user configuration | Integrate diverse application QoS levels, to support real-time, time-sensitive, and bandwidth-intensive applications | Consider the use of the DiffServ codepoint [56]
A | Geo-location | No | Take into consideration the application geo-location needs, due to resilience or compliance | Model the cluster in accordance with the geographical proximity of the micro-services
D | Data new or updated | No | Ensure that new or updated data is processed in a timely fashion | If a new dataset is created or updated within the defined scope, schedule the registered application to process the dataset
D | Data freshness | No | Ensure that data is updated at minimal intervals | If a threshold is exceeded, schedule the registered application to handle the issue
D | Data compliance | No | Ensure adherence to regulations and company policies | If a policy violation is detected, schedule the registered application to handle the violation
U | Geo-location | No | Replicate/offload workload and state closer to the user, to improve latency and resilience, or due to compliance | Model the cluster in accordance with the user geo-location (centrality measure based on the user location)
U | Volume of visits to an IP network | No | Improve resilience and latency based on a user affinity to a network | Combine the frequency of visits with the duration of visits to a specific IP network (betweenness)
By providing a multi-objective integration of the proposed
parameters, it is our belief that the overall orchestration will
become more fluid, and provide a better response to the needs
of applications and the user.
B. CONTEXT ACQUISITION AND MODELLING
The context acquisition integrates two sub-components, illus-
trated in Figure 7.
First, it senses and gathers parameters, the context-aware offloading triggers described in subsection VII-A.
These parameters are generated based on changes in state
of the system, network and application, taking into consid-
eration data observability as well as user behaviour. The
parameters are sensed or discovered (e.g., via a tool such as Pathfinder), as has been described, or can be statically configured, as happens today. The acquisition
component collects and stores the different parameters into
different multivariate time-series. It may also perform local
data aggregation per category, to reduce the resulting dataset.
The multivariate time-series datasets are then passed to
the normalization and standardization component. Here, the
datasets are first normalized (and if required standardized).
This is required since each trigger type has its own unit, as described in Table 3. Accordingly, context normalization avoids scale disparities by creating new values that maintain the general distribution and ratios of the dataset, while keeping values within a common scale applied across all numeric columns used in the model.
The result is then passed to the performance measurement
component.
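The normalization step can be sketched as a per-parameter min-max rescaling of the multivariate time-series; the data layout below is illustrative.

```python
# Min-max normalization sketch for the multivariate time-series
# produced by the acquisition component: each parameter (column) is
# rescaled to [0, 1] so that triggers with different units become
# comparable.

def min_max_normalize(series):
    """series: dict of parameter name -> list of raw samples.
    Returns the same structure with values scaled to [0, 1]."""
    out = {}
    for name, values in series.items():
        lo, hi = min(values), max(values)
        span = hi - lo or 1.0           # avoid division by zero
        out[name] = [(v - lo) / span for v in values]
    return out

raw = {"rtt_ms": [10.0, 30.0, 50.0], "cpu_pct": [20.0, 80.0, 50.0]}
norm = min_max_normalize(raw)
# rtt_ms -> [0.0, 0.5, 1.0]; cpu_pct -> [0.0, 1.0, 0.5]
```

Note how the relative ordering and ratios within each column are preserved while the units disappear, which is the property the text requires.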
C. CONTEXT-AWARE PERFORMANCE PROFILING
The normalized context datasets are passed to the context-
aware performance profiling block, illustrated in Figure 8.
This block combines the received datasets based on pre-configured heuristics that aim at providing a measure of performance for a specific performance efficiency profile. For instance, assuming a user wants to optimize the system for greenness, this block would select and combine weighted context datasets (e.g., hop count, energy consumption) in accordance with a specific function (e.g., the product of hop count and energy consumption).
The combination of parameters is weighted, i.e., the differ-
ent categories of context parameters are weighted according
VOLUME 4, 2016 17
This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2023.3307026
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
Sofia et al.: Dynamic Container Orchestration
FIGURE 7. High-level representation of the context acquisition and modelling.
FIGURE 8. High-level representation of the context-aware performance profiling block.
to data preferences, or according to learning over time, driven
by feedback provided by the ML-based offloading decision
engine.
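For illustration, the weighted combination could be sketched as follows; this is a hedged example in which the "greenness" weights and parameter names are assumptions, standing in for the pre-configured heuristics:

```python
def profile_score(context, weights):
    """Weighted sum of normalized context parameters (lower is better).

    `context` and `weights` map parameter names to values; parameters
    are assumed to be pre-normalized to [0, 1] by the acquisition block.
    """
    return sum(weights[name] * context[name] for name in weights)

# Hypothetical 'greenness' profile: penalize hop count and energy equally.
greenness_weights = {"hop_count": 0.5, "energy": 0.5}

node_a = {"hop_count": 0.2, "energy": 0.4}
node_b = {"hop_count": 0.8, "energy": 0.9}

# Pick the candidate with the lowest weighted score for this profile.
best = min(("node_a", node_a), ("node_b", node_b),
           key=lambda item: profile_score(item[1], greenness_weights))
```

In the framework, the weights themselves would not be fixed: they are adjusted according to data preferences and to feedback from the ML-based offloading decision engine.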
D. ML-BASED OFFLOADING ENGINE
The third component of the framework is the ML-based
offloading engine, where different mathematical methods are
applied to train on, and to classify, the optimized performance
metrics dataset, assisting K8s in deciding whether to perform
offloading.
In this block, ML is employed to provide behaviour in-
ference and also to perform prediction that can assist in
understanding the impact that the proposed optimization may
bring to the system.
This engine is therefore composed of a predictor component
and an offloading decision component. The predictor
assists in estimating the impact that the combined
context-aware optimization metric proposal may have on the
system. The ML-based offloading decision component learns about
changes in demand derived from contextual information, and
proposes offloading configuration recommendations to the
K8s scheduler.
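A minimal, stdlib-only sketch of this predictor/decision split (substituting a moving-average predictor and a threshold rule for a trained ML model; all names are hypothetical) could be:

```python
from collections import deque

class OffloadingEngine:
    """Toy predictor + decision rule standing in for the ML components."""

    def __init__(self, window=3, threshold=0.7):
        self.history = deque(maxlen=window)  # recent profile scores
        self.threshold = threshold

    def predict(self, score):
        """Moving-average estimate of the system's profile score."""
        self.history.append(score)
        return sum(self.history) / len(self.history)

    def recommend(self, score):
        """Recommend offloading when the predicted score degrades
        past the configured threshold (lower scores are better)."""
        return self.predict(score) > self.threshold

# Degrading profile scores eventually trigger an offloading recommendation.
engine = OffloadingEngine()
decisions = [engine.recommend(s) for s in (0.4, 0.6, 0.9, 0.95)]
```

A real deployment would replace the moving average with a trained model and feed the resulting recommendations to the K8s scheduler, as well as back to the performance-profiling block.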
E. CROSS-LAYER AND CONTEXT-AWARE
ORCHESTRATION CHALLENGES
The capability to inject metadata and additional data into
an orchestrator brings the possibility of selecting the infrastructure
that best adapts to user expectations (QoE) and
application requirements (QoS). In this context, a few
challenges need to be tackled, as described in this subsection.
1) Mobility
Across a dynamic Edge-Cloud continuum, the overall infras-
tructure (data-network-computation) is dynamic. The net-
work of computing nodes and connected devices is dynamic
and its state varies. Usually, a network overlay is estab-
lished between Pods, relying on the existing Internet routing.
Hence, a first challenge concerns supporting mobility, and
performing a placement that can provide lower latency and
energy consumption, which implies offloading "closer" to the
user/data sources. For this, context-awareness needs to
consider aspects that derive from social proximity and social
mobility.
A first way to tackle this challenge is to move applications
as close as possible to the end-user location. However,
if the end-user moves frequently, this will result in additional
overhead in terms of signaling and energy consumption [34].
A possibility is to integrate mobility estimation as context-awareness [57], to
assist the scheduler in deciding whether or not to re-schedule
applications across the Edge-Cloud continuum. Mobile-Kube [34]
proposes considering energy consumption as an additional
metric, deriving an optimization of the overall infrastructure
based on mobility and energy. Similar approaches, placing
workloads "closer" to the data source/end-user, can be
developed by considering additional context-aware metrics.
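To illustrate the trade-off between proximity and migration overhead, a hedged sketch (with a hypothetical dwell-time estimate standing in for a mobility-estimation model) could gate re-scheduling as follows:

```python
def should_migrate(expected_dwell_s, migration_cost_s, gain_per_s):
    """Migrate only if the expected saving over the user's dwell time
    at the new location outweighs the one-off migration overhead.

    expected_dwell_s: predicted time the user stays near the new node,
    migration_cost_s: signaling/energy overhead expressed in seconds,
    gain_per_s: per-second latency/energy benefit of being closer.
    All parameter names are illustrative assumptions.
    """
    return expected_dwell_s * gain_per_s > migration_cost_s

# A brief visit does not justify the overhead; a long stay does.
short_visit = should_migrate(expected_dwell_s=10, migration_cost_s=30,
                             gain_per_s=1.0)
long_stay = should_migrate(expected_dwell_s=600, migration_cost_s=30,
                           gain_per_s=1.0)
```

This is the intuition behind mobility estimation as context-awareness: without it, frequent movement would trigger migrations whose overhead exceeds their benefit.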
2) Context-awareness Integration
One of the key challenges in container orchestrators such as
Kubernetes is providing cross-layer orchestration, allowing
placement decisions to occur based on real-time resource
demands related to the application and computational nodes,
to the network, and to the data. The currently available
schedulers provide partial support for these aspects, as
explained in section VII. When considering constrained devices
and intermittent connectivity or variable-quality network links,
the problem becomes more complex. A first challenge in this
context is the definition of a subset of parameters that can
provide a meaningful definition of a node, covering not
just computational resources but also networking resources
(e.g., egress and ingress bandwidth; centrality value), data
freshness/compliance and energy consumption based not just
on the energy consumed due to computational processes, but
also due to networking processes. For examples of parameters,
we refer the reader to the work under development by
Sofia et al. in the context of the Horizon Europe CODECO
project11. Specific parameters providing an initial starting
point are publicly available in the CODECO report D9,
Annex I [58].
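As an illustration of such a node definition (a sketch with hypothetical field names, not a CODECO specification), the cross-layer parameters could be grouped as:

```python
from dataclasses import dataclass

@dataclass
class NodeContext:
    """Cross-layer description of a candidate node (hypothetical fields)."""
    # computational resources
    cpu_available: float          # fraction of CPU free, 0..1
    mem_available_mb: int
    # networking resources
    egress_bw_mbps: float
    ingress_bw_mbps: float
    centrality: float             # topological centrality value, 0..1
    # data aspects
    data_freshness_s: float       # age of the freshest relevant data
    data_compliant: bool          # e.g., data may be processed locally
    # energy, covering compute as well as networking processes
    energy_compute_w: float
    energy_network_w: float

    def total_energy_w(self) -> float:
        return self.energy_compute_w + self.energy_network_w

edge1 = NodeContext(0.6, 2048, 100.0, 100.0, 0.4, 2.0, True, 3.5, 1.2)
```

The point of the grouping is that a "meaningful" node definition spans all four layers, rather than the purely computational view used by default schedulers.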
VIII. APPLICABILITY EXAMPLE: THE MOVEK8S PROOF
OF CONCEPT
To better illustrate how such a dynamic orchestration frame-
work would work in practice, this section describes a proof-
of-concept, MoveK8s 12, which has been implemented in
a Raspberry Pi testbed at the fortiss Industrial IoT Lab,
for a manufacturing use-case. MoveK8s has a Technology
Readiness Level of 3 (TRL3), i.e., it concerns "Research that
Proves the Feasibility of the Concept". The proof-of-concept
aimed at understanding, in practice, how context-awareness
in a simple form could be integrated into K8s, and how
workload migration (rather than replication) could be deployed.
Aspects such as scalability, performance impact in terms
of resource utilization, and the impact of several external (and
internal) factors on the overall system are out of the scope
of this perspective paper, and are expected to be addressed in
the near future.
The underlying scenario for the proof-of-concept is rep-
resented in Figure 9, where three different nodes have been
set in a single K3s cluster. Each node corresponded to
11https://he-codeco.eu
12https://git.fortiss.org/iiot_external/movek8s
a Raspberry Pi 4 Model B. The nodes have been labelled
Edge1, Edge2, and Cloud. An Edge-based IIoT application for
environmental monitoring in manufacturing environments,
TSMatch 13, has then been set up on the two Edge nodes. The
user's initial location is Edge 1 (IP: 10.0.33.34). The selected
application integrates multiple micro-services, each running
in a separate Docker container: (i) graph database, (ii) TSMatch
Broker, and (iii) TSMatch Engine. Moreover, the end-user
relies on an application that interacts with the TSMatch
Engine located on the Edge device, performing IoT service
requests (e.g., monitor the temperature in a room and send an
alert when it reaches a threshold).
From a K8s perspective, this deployment corresponds to
a simple cluster, where each Edge node has been set as a
worker node, and the Cloud node runs the K8s master node.
Movek8s has been developed as a third-party Web-based
application (Flask), co-located with the master node.
Movek8s requests K8s to change the workload placement
based on location. In this first proof-of-concept, the user
relies on a QR code at each location: when the user moves
from Edge 1 to Edge 2, the new location is currently activated
manually (via the QR code). The active location information
is then sent to Movek8s. A more efficient approach could be
to predict the mobility of the user based on the pattern of
visits to the new location, and to pass that information to
the future Movek8s decision handling engine represented in
Figure 5. Moreover, in the current proof-of-concept, Movek8s
is placed in the master node. However, in future versions, we
envision that Movek8s (or micro-services thereof) will have
to be installed on worker nodes as well. For instance, the
context acquisition and modeling functional block represented
in Figure 5 should interface with each worker node.
Once the active location is received, Movek8s creates the
application workload at the selected location and deletes the
workload at the prior location (if it exists).
K3s handles the interconnection of nodes within the cluster
as usual: each worker node knows the master node's IP
address, the K3s API port, and its K3s token. Movek8s gathers
the provided user location and interacts with K3s to activate
the TSMatch replica closest to the user-provided location. For
the selected TSMatch Edge-based application, the Movek8s
offloading process is as follows:
1) The three TSMatch containers belong to the same Pod,
on Edge 1.
2) When a new location reaches Movek8s, it triggers the
offloading of TSMatch to Edge 2.
3) The prior Pod (Edge 1) is deleted, thereby terminating
the active TSMatch containers.
4) Movek8s then schedules the creation of a new Pod
containing the TSMatch containers, on Edge 2.
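The four steps above can be sketched as a minimal simulation, with plain dictionaries standing in for the K8s API objects (all names are hypothetical):

```python
TSMATCH_CONTAINERS = ("graph-db", "tsmatch-broker", "tsmatch-engine")

def offload(cluster, pod_name, new_node):
    """Migrate (not replicate) a Pod: delete it on the old node, then
    re-create the same containers on the node closest to the user."""
    old_node = cluster.get(pod_name, {}).get("node")
    if old_node == new_node:
        return cluster  # already at the target location, nothing to do
    # Step 3: delete the prior Pod, terminating its active containers.
    cluster.pop(pod_name, None)
    # Step 4: schedule a new Pod with the same containers on the new node.
    cluster[pod_name] = {"node": new_node,
                         "containers": list(TSMATCH_CONTAINERS)}
    return cluster

# Steps 1-2: the TSMatch Pod starts on Edge 1; a new user location
# reaching Movek8s triggers the offloading to Edge 2.
cluster = {"tsmatch": {"node": "edge1",
                       "containers": list(TSMATCH_CONTAINERS)}}
cluster = offload(cluster, "tsmatch", "edge2")
```

In the actual proof-of-concept these operations are issued against the K3s API rather than an in-memory dictionary, but the delete-then-recreate sequence is the same.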
Policies defined to implement data security and privacy
protection can constrain the orchestration process. In the pre-
vious example, if cameras are used in the warehouse to mon-
13https://git.fortiss.org/iiot_external/tsmatch
FIGURE 9. Movek8s topology and K8s architecture.
itor and control a process, the images may include pictures
of the people working there. The company may therefore
define a policy that these images may only be processed in
the local Edge domain. Before implementing the offloading
described above, the orchestrator should evaluate the applicable
policies and, in this case, decide to leave the processes
running in the warehouse if the workload is processing these
images. The information required to recognize this condition
is provided by data observability. In this example, we assume
the privacy-protection policy has priority over the location
trigger.
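A hedged sketch of such a policy check (with hypothetical policy and workload attributes) could gate the offloading decision as follows:

```python
def allowed_target_nodes(workload, candidate_nodes, policies):
    """Filter candidate nodes by data-protection policies before the
    location trigger is considered; policies have priority."""
    allowed = candidate_nodes
    for policy in policies:
        if policy["data_tag"] in workload["data_tags"]:
            allowed = [n for n in allowed
                       if n["domain"] in policy["allowed_domains"]]
    return allowed

# Hypothetical policy: camera images may only be processed in the
# local edge domain, regardless of where the user moves.
policies = [{"data_tag": "camera-images",
             "allowed_domains": {"warehouse-edge"}}]
workload = {"name": "image-analytics", "data_tags": {"camera-images"}}
nodes = [{"name": "edge1", "domain": "warehouse-edge"},
         {"name": "cloud", "domain": "public-cloud"}]

targets = allowed_target_nodes(workload, nodes, policies)
```

The metadata needed to tag the workload (e.g., that it processes camera images) is exactly what data observability provides to the orchestrator.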
IX. SUMMARY AND FUTURE WORK
Next-generation IoT applications are expected to be deployed
across a highly dynamic Edge-Cloud environment, involving
different types of wireless and cellular infrastructures, mo-
bile and constrained nodes, and large-scale deployments. At
present, the spread of different functions across IoT devices,
Edge nodes, and the Cloud is achieved via virtualization.
Containerized applications require flexible management,
today provided by solutions such as K8s. However, such
tools have limitations in the context of decentralized
environments, as they handle the management of applications
across the Edge-Cloud in a static way, based on human
intervention. This paper proposes an architecture for
a dynamic container orchestration based on ML and context
awareness, integrating parameters that relate to application
requirements, data requirements, system requirements, net-
work requirements, and user behavior to support a more
elastic orchestration of containerized applications. The components
of the proposed framework have been explained,
showing the relevance of context-awareness integration, the
different types of relevant parameters, and how these can
be modeled and integrated into the orchestrator, endowing
container orchestrators with sufficient elasticity, adaptation,
and learning capability. The relevant role of data observability,
and trends towards decentralisation, have been advocated.
Finally, context-aware orchestration has been demonstrated
in a testbed as a proof-of-concept, to understand the
operational limitations
derived from the integration of context. The presented work
is the starting point for context-awareness integration in
the context of the Horizon Europe CODECO project. The
proposed categories of parameters are being used to further
refine a cross-layer notion of context-awareness that can be
modelled and integrated into K8s as a representative example
of a container orchestrator.
Future research should address the integration of context-
awareness into the orchestrator scheduler(s), by considering
approaches beyond the K8s filter-and-score mechanism (e.g.,
graph utilization maximization) that can meet application and
user needs, taking into consideration the different layers of
orchestration (network, computational, data, user, etc.).
ACKNOWLEDGMENTS
This work has been partially funded by the following
projects: Horizon Europe CODECO, Grant nr. 101092696,
and the fortiss-IBM C4AI EDGE project.
REFERENCES
[1] A. Yousefpour, C. Fung, T. Nguyen, K. Kadiyala, F. Jalali, A. Niakanlahiji,
J. Kong, and J. P. Jue, “All one needs to know about fog computing
and related edge computing paradigms: A complete survey, Journal of
Systems Architecture, vol. 98, pp. 289–330, 2019.
[2] W. Yu, F. Liang, X. He, W. G. Hatcher, C. Lu, J. Lin, and X. Yang, “A
Survey on the Edge Computing for the Internet of Things, 2017.
[3] A. Brogi, S. Forti, and A. Ibrahim, Predictive Analy-
sis to Support Fog Application Deployment. John Wiley
Sons, Ltd, 2019, ch. 9, pp. 191–221. [Online]. Available:
https://onlinelibrary.wiley.com/doi/abs/10.1002/9781119525080.ch9
[4] J. Pan and J. McElhannon, “Future Edge Cloud and Edge Computing for
Internet of Things Applications,” IEEE Internet of Things Journal, vol. 5,
no. 1, pp. 439–449, 2017.
[5] A. S. Thyagaturu, P. Shantharama, A. Nasrallah, and M. Reisslein, “Operating
Systems and Hypervisors for Network Functions: A Survey of
Enabling Technologies and Research Studies,” IEEE Access, vol. 10, pp.
79825–79873, 2022.
[6] K. M. Majidha Fathima and N. Santhiyakumari, “A Survey on Evolution of
Cloud Technology and Virtualization,” in Third International Conference
on Intelligent Communication Technologies and Virtual Mobile Networks
(ICICV), 2021, pp. 428–433.
[7] A. Randal, “The Ideal Versus the Real: Revisiting the History of Virtual
Machines and Containers,” ACM Comput. Surv., vol. 53, no. 1, feb 2020.
[Online]. Available: https://doi.org/10.1145/3365199
[8] A. M. Potdar, N. D G, S. Kengond, and M. M. Mulla,
“Performance Evaluation of Docker Container and Virtual
Machine,” Procedia Computer Science, vol. 171, pp. 1419–
1428, 2020, third International Conference on Computing
and Network Communications (CoCoNet’19). [Online]. Available:
https://www.sciencedirect.com/science/article/pii/S1877050920311315
[9] E. Casalicchio, Container Orchestration: A Survey. Springer
International Publishing, 2019, pp. 221–235. [Online]. Available:
https://doi.org/10.1007/978-3-319-92378-9_14
[10] S. Wang, J. Xu, N. Zhang, and Y. Liu, “A Survey on Service Migration in
Mobile Edge Computing,” IEEE Access, vol. 6, pp. 23511–23528, 2018.
[11] B. Wang, C. Wang, W. Huang, Y. Song, and X. Qin, “A Survey and
Taxonomy on Task Offloading for Edge-Cloud Computing,” IEEE Access,
vol. 8, pp. 186080–186101, 2020.
[12] F. Liu, G. Tang, Y. Li, Z. Cai, X. Zhang, and T. Zhou, “A Survey on Edge
Computing Systems and Tools, Proceedings of the IEEE, 2019.
[13] S. Dubey and J. Meena, “Computation Offloading Techniques in Mobile
Edge Computing Environment: A Review,” in 2020 International Confer-
ence on Communication and Signal Processing (ICCSP), 2020, pp. 1217–
1223.
[14] A. Malviya and R. K. Dwivedi, “A Comparative Analysis of Container
Orchestration Tools in Cloud Computing, in 2022 9th International Con-
ference on Computing for Sustainable Global Development (INDIACom),
2022, pp. 698–703.
[15] R. C. Sofia (Ed.), “A Vision on Smart, Decentralised Edge Computing Re-
search Directions,” NGIoT white paper, DOI: 10.5281/zenodo.5837299,
2001.
[16] G. D. Abowd, A. K. Dey, P. J. Brown, N. Davies, M. Smith, and
P. Steggles, “Towards a Better Understanding of Context and Context-
awareness,” in International symposium on handheld and ubiquitous com-
puting. Springer, 1999, pp. 304–307.
[17] W. Liu, X. Li, and D. Huang, “A Survey on Context Awareness, in
2011 International Conference on Computer Science and Service System
(CSSS), 2011, pp. 144–147.
[18] L. Bulej, T. Bureš, P. Hnětynka, and D. Khalyeyev, “Self-adaptive K8S
Cloud Controller for Time-sensitive Applications,” in 2021 47th Euromicro
Conference on Software Engineering and Advanced Applications
(SEAA), 2021, pp. 166–169.
[19] F. Rossi, V. Cardellini, F. Lo Presti, and M. Nardelli, “Geo-distributed
Efficient Deployment of Containers with Kubernetes, Computer
Communications, vol. 159, pp. 161–174, 2020. [Online]. Available:
https://www.sciencedirect.com/science/article/pii/S0140366419317931
[20] S. Zhang, T. Wu, M. Pan, C. Zhang, and Y. Yu, “A-SARSA: A Predictive
Container Auto-Scaling Algorithm Based on Reinforcement Learning,” in
2020 IEEE International Conference on Web Services (ICWS), 2020, pp.
489–497.
[21] M. Chima Ogbuachi, C. Gore, A. Reale, P. Suskovics, and B. Kovács,
“Context-Aware K8S Scheduler for Real Time Distributed 5G Edge
Computing Applications,” in 2019 International Conference on Software,
Telecommunications and Computer Networks (SoftCOM), 2019, pp. 1–6.
[22] K. Kaur, S. Garg, G. Kaddoum, S. H. Ahmed, and M. Atiquzzaman,
“KEIDS: Kubernetes-Based Energy and Interference Driven Scheduler
for Industrial IoT in Edge-Cloud Ecosystem,” IEEE Internet of Things
Journal, vol. 7, no. 5, pp. 4228–4237, 2020.
[23] D. Silva and R. C. Sofia, “A Discussion on Context-Awareness to Better
Support the IoT Cloud/Edge Continuum,” IEEE Access, vol. 8, pp.
193686–193694, 2020.
[24] S. Böhm and G. Wirtz, “Cloud-Edge Orchestration for Smart Cities: A
Review of Kubernetes-based Orchestration Architectures, EAI Endorsed
Transactions on Smart Cities, vol. 6, no. 18, pp. e2–e2, 2022.
[25] C. Carrión, “Kubernetes Scheduling: Taxonomy, Ongoing Issues and
Challenges,” ACM Comput. Surv., vol. 55, no. 7, dec 2022. [Online].
Available: https://doi.org/10.1145/3539606
[26] Z. Zhong, M. Xu, M. A. Rodriguez, C. Xu, and R. Buyya, “Machine
Learning-Based Orchestration of Containers: A Taxonomy and Future
Directions,” ACM Comput. Surv., vol. 54, no. 10s, sep 2022. [Online].
Available: https://doi.org/10.1145/3510415
[27] D. Ongaro and J. Ousterhout, “In Search of an Understandable Consensus
Algorithm,” in Proceedings of the 2014 USENIX Conference on USENIX
Annual Technical Conference, ser. USENIX ATC’14. USA: USENIX
Association, 2014, p. 305–320.
[28] J. Santos, T. Wauters, B. Volckaert, and F. De Turck, “Towards Network-
Aware Resource Provisioning in Kubernetes for Fog Computing Appli-
cations,” in 2019 IEEE Conference on Network Softwarization (NetSoft),
2019, pp. 351–359.
[29] J. Nickoloff, “Evaluating Container Platforms at Scale, 2016. [Online].
Available: https://medium.com/on-docker/evaluating-container-platforms-
at-scale-5e7b44d93f2c
[30] Y. Pan, I. Chen, F. Brasileiro, G. Jayaputera, and R. Sinnott, “A Perfor-
mance Comparison of Cloud-Based Container Orchestration Tools, in
2019 IEEE International Conference on Big Knowledge (ICBK), 2019,
pp. 191–198.
[31] A. M. Beltre, P. Saha, M. Govindaraju, A. Younge, and R. E. Grant,
“Enabling HPC Workloads on Cloud Infrastructure Using Kubernetes
Container Orchestration Mechanisms,” in 2019 IEEE/ACM International
Workshop on Containers and New Orchestration Paradigms for Isolated
Environments in HPC (CANOPIE-HPC), 2019, pp. 11–20.
[32] N. Naydenov and S. Ruseva, “Combining Container Orchestration and
Machine Learning in the Cloud: a Systematic Mapping Study,” in 2022
21st International Symposium INFOTEH-JAHORINA (INFOTEH), 2022,
pp. 1–6.
[33] S. V. Gogouvitis, H. Mueller, S. Premnadh, A. Seitz,
and B. Bruegge, “Seamless computing in industrial systems
using container orchestration,” Future Generation Computer
Systems, vol. 109, pp. 678–688, 2020. [Online]. Available:
https://www.sciencedirect.com/science/article/pii/S0167739X17330236
[34] S. Ghafouri, A. Karami, D. B. Bakhtiarvan, A. S. Bigdeli, S. S. Gill,
and J. Doyle, “Mobile-Kube: Mobility-aware and energy-efficient service
orchestration on kubernetes edge servers,” in 2022 IEEE/ACM 15th In-
ternational Conference on Utility and Cloud Computing (UCC). IEEE,
2022, pp. 82–91.
[35] P. Gonzalez-Gil, A. Robles-Enciso, J. A. Martínez, and A. F. Skarmeta,
“Architecture for Orchestrating Dynamic DNN-Powered Image Processing
Tasks in Edge and Cloud Devices,” IEEE Access, vol. 9, pp.
107137–107148, 2021.
[36] W. Sun, J. Liu, and Y. Yue, “AI-Enhanced Offloading in Edge Computing:
When Machine Learning Meets Industrial IoT,” IEEE Network, vol. 33,
no. 5, pp. 68–74, 2019.
[37] Z. Rejiba and J. Chamanara, “Custom Scheduling in Kubernetes:
A Survey on Common Problems and Solution Approaches,” ACM
Comput. Surv., vol. 55, no. 7, dec 2022. [Online]. Available:
https://doi.org/10.1145/3544788
[38] S. Abdulrahman, H. Tout, H. Ould-Slimane, A. Mourad, C. Talhi, and
M. Guizani, “A Survey on Federated Learning: The Journey From Cen-
tralized to Distributed On-Site Learning and Beyond, IEEE Internet of
Things Journal, vol. 8, no. 7, pp. 5476–5497, 2021.
[39] M. M. John, H. Holmström Olsson, and J. Bosch, “AI on the Edge: Ar-
chitectural Alternatives, in 2020 46th Euromicro Conference on Software
Engineering and Advanced Applications (SEAA), 2020, pp. 21–28.
[40] Z. Zou, Y. Jin, P. Nevalainen, Y. Huan, J. Heikkonen, and T. Westerlund,
“Edge and Fog Computing Enabled AI for IoT-An Overview,” in 2019
IEEE International Conference on Artificial Intelligence Circuits and
Systems (AICAS), 2019, pp. 51–56.
[41] S. Warnat-Herresthal, H. Schultze, K. L. Shastry, S. Manamohan,
S. Mukherjee, V. Garg, R. Sarveswara, K. Händler, P. Pickkers, N. A.
Aziz et al., “Swarm Learning for Decentralized and Confidential Clinical
Machine Learning,” Nature, vol. 594, no. 7862, pp. 265–270, 2021.
[42] Y. Liu, J. Nie, X. Li, S. H. Ahmed, W. Y. B. Lim, and C. Miao, “Federated
Learning in the Sky: Aerial-Ground Air Quality Sensing Framework With
UAV Swarms, IEEE Internet of Things Journal, vol. 8, no. 12, pp. 9827–
9837, 2021.
[43] M. Chen, D. Gündüz, K. Huang, W. Saad, M. Bennis, A. V. Feljan,
and H. V. Poor, “Distributed Learning in Wireless Networks: Recent
Progress and Future Challenges,” IEEE Journal on Selected Areas in
Communications, vol. 39, no. 12, pp. 3579–3605, 2021.
[44] J. Kang, Z. Xiong, D. Niyato, Y. Zou, Y. Zhang, and M. Guizani, “Reliable
Federated Learning for Mobile Networks,” IEEE Wireless Communica-
tions, vol. 27, no. 2, pp. 72–80, 2020.
[45] A. Imteaj, U. Thakker, S. Wang, J. Li, and M. H. Amini, “A Survey on
Federated Learning for Resource-Constrained IoT Devices,”IEEE Internet
of Things Journal, vol. 9, no. 1, pp. 1–24, 2022.
[46] S. Rooney, L. Garcés-Erice, D. Bauer, and P. Urbanetz, “Pathfinder:
Building the Enterprise Data Map,” in 2021 IEEE International Conference
on Big Data (Big Data), 2021, pp. 1909–1919.
[47] D. Bauer, C. Giblin, L. Garcés-Erice, N. Pardon, S. Rooney, E. Toniato,
and P. Urbanetz, “Revisiting Data Lakes: The Metadata Lake,” in
Proceedings of the 23rd International Middleware Conference Industrial
Track, ser. Middleware Industrial Track ’22. New York, NY, USA:
Association for Computing Machinery, 2022, p. 8–14. [Online]. Available:
https://doi.org/10.1145/3564695.3564773
[48] “NIST SP 800-53 Rev. 5, Security and Privacy Controls for
Information Systems and Organizations,” 2020. [Online]. Available:
https://csrc.nist.gov/publications/detail/sp/800-53/rev-5/final
[49] Q.-M. Nguyen, L.-A. Phan, and T. Kim, “Load-Balancing of
Kubernetes-Based Edge Computing Infrastructure Using Resource
Adaptive Proxy, Sensors, vol. 22, no. 8, 2022. [Online]. Available:
https://www.mdpi.com/1424-8220/22/8/2869
[50] C. E. Perkins, J. Arkko, and D. B. Johnson, “Mobility Support in
IPv6,” RFC 3775, Jun. 2004. [Online]. Available: https://www.rfc-
editor.org/info/rfc3775
[51] Leiter, D. Huszti, N. Galambosi, E. Lami, M. S. Salah, P. Kulics, and
L. Bokor, “Cloud-native IP-based mobility management: a MIPv6 Home
Agent standalone microservice design,” in 2022 13th International Sympo-
sium on Communication Systems, Networks and Digital Signal Processing
(CSNDSP), 2022, pp. 252–257.
[52] C. Harris and V. Cahill, “Exploiting user behaviour for context-aware
power management,” in WiMob’2005), IEEE International Conference
on Wireless And Mobile Computing, Networking And Communications,
2005., vol. 4, 2005, pp. 122–130 Vol. 4.
[53] R. C. Sofia, L. Carvalho, and F. M. Pereira, The Role of Smart Data in
Inference of Human Behavior and Interaction. Chapman and Hall/CRC,
2019, pp. 191–214.
[54] F. Saeik, J. Violos, A. Leivadeas, M. Avgeris, D. Spatharakis, and D. De-
chouniotis, “User Association and Behavioral Characterization during
Task Offloading at the Edge,” in 2021 IEEE International Mediterranean
Conference on Communications and Networking (MeditCom), 2021, pp.
70–75.
[55] A. Przybylowski, S. Stelmak, and M. Suchanek, “Mobility Behaviour in
View of the Impact of the COVID-19 Pandemic—Public Transport Users
in Gdansk Case Study,” Sustainability, vol. 13, no. 1, 2021. [Online].
Available: https://www.mdpi.com/2071-1050/13/1/364
[56] S. Blake, D. Black, M. Carlson, E. Davies, Z. Wang, and W. Weiss,
“RFC2475: An Architecture for Differentiated Service,” USA, 1998.
[57] O. Aponte and R. C. Sofia, “Mobility management optimization via
inference of roaming behavior, in 2019 International Conference on Wire-
less and Mobile Computing, Networking and Communications (WiMob).
IEEE, 2019, pp. 71–76.
[58] R. C. Sofia (Ed.), H. Mueller, J. Solomon, R. Touma, L. G.-E., L.
C. Murillo, D. Remon, A. Espinosa, J. Soldatos, N. Psaromanolakis,
L. Mamathas, I. Kapetanidou, V. Tsaoussidis, J. Martrat, I. P.
Mariscal, P. Urbanetz, D. Dykemann, V. Theodorou, S. Ferlin-Reiter,
E. Paraskevoulakou, P. Karamolegkos, “CODECO D9 -Technological
Guidelines, Reference Architecture, and Initial Open-source Ecosystem
Design v1.0,” Jul. 2023, All CODECO Partners have contributed to the
deliverable. [Online]. Available: https://doi.org/10.5281/zenodo.8143860
RUTE C. SOFIA (PhD 04, IEEE Senior Mem-
ber) is the Industrial IoT department head at the
research institute fortiss. She is also an Invited
Associate Professor of University Lusófona de
Humanidades e Tecnologias, and an Associate Re-
searcher at ISTAR, Instituto Universitário de Lis-
boa. Rute’s research background has been devel-
oped on industry (Grupo Forum, Lisbon; Siemens
AG, Nokia Networks, Munich) and on academia
(FCCN, Lisbon; INESC TEC, Porto; ULHT, Lis-
bon; Bundeswehr Universität, Munich). She was a co-founder of the por-
tuguese COPELABS research unit, and was the COPELABS scientific
director (2013-2017), where she was a Senior Researcher (2010-2019). She
has also co-founded the COPELABS spin-off Senception Lda (2013-2019).
Her current research interests are: network architectures and protocols;
IoT; Edge computing; Edge AI; in-network computing; 6G. Rute holds over
70 peer-reviewed publications in her fields of interest, and 9 patents. She
is an ACM Europe Councilor; an ACM Senior member, an IEEE Senior
Member. She was an IEEE ComSoc N2Women Awards co-chair (2020-
2021), and is the IEEE ComSoc WICE industry liaison deputy. She leads
the 6G CONASENSE platform. She is an associate editor of several
venues, such as IEEE Access and IEEE Network.
DOUG DYKEMAN (PhD 88) manages the AI for
Data Integration research group at IBM Research
in Zurich Switzerland. In his career he has focused
on a range of networking and systems manage-
ment topics, spanning the telco (initially at North-
ern Telecom) and enterprise (at IBM) worlds. His
personal focus has covered research, product
development, and standardization in this space.
His current team is focused on using AI and
other technologies to simplify management of data
in large organizations. The IBM Pathfinder data observability project is
focused on using metadata to build a complete map of data in an organization.
The team is investigating how to use this enterprise data map to support a range
of applications, including continuous compliance, data and AI workflow
maintenance, and achieving agility in managing the use of cryptography.
Doug has a Ph.D. in Computer Science from the University of Waterloo,
Canada.
PETER URBANETZ is a member of the AI for
Data Integration team at IBM Research in Zurich
Switzerland. His recent projects have focused on
data management in large organizations. The team
created the Cognitive Enterprise Data Platform
(CEDP) for the IBM Chief Data Office to provide
a state-of-the art platform for analyzing data based
on Hadoop. From the CEDP experience the team
recognized the constraints associated with copying
data to a common data analytics platform, and
therefore developed the idea of managing data based on decentralized data
stores (the data remains with the organization that manages it) with
centralized metadata, making it possible to simplify the job of managing
and analyzing data while removing any practical scalability limits.
AKRAM GALAL (PhD 22, IEEE Member) was a
Researcher at the IIoT research department, for-
tiss, Munich (08.22-01.23). During his Ph.D., he
was a Visiting Researcher at the National Insti-
tute of Standards and Technology (NIST), USA
(2019), and a Visiting Researcher at Interuniver-
sity Microelectronics Centre (IMEC) within Ghent
University, Belgium (2020). Akram served as a
Solution Design Consultant at Tawasul Telecom,
a regional Information and Communication Tech-
nology solutions provider in Kuwait (2014-2017). Also, he served as an
Enterprise Networks Engineer at Telecom Egypt, the dominant Internet
service provider in Egypt (2011-2014). Akram research interests relate with
IoT, Internet of nano-things, edge computing, software-defined networking,
network function virtualization, and AI/ML for data communication. He has
participated in several national/international research projects in collabora-
tion with multiple partners from academia and industry and funded either by
the EU, such as 5GROUTES and MARSAL or by the Spanish government,
such as TRUE5G and 5GCity.
DUSHYANT ANIRUDHDHABHAI DAVE (M.Sc. 2022) was a Student Assistant at the fortiss IIoT research department (2021-2022), where he was responsible for the development and installation of the Movek8s demonstrator. Dushyant obtained his M.Sc. in Informatics at the Technical University of Munich. He has industrial experience, having completed, during his master's studies, an internship at the TUM-IBM OpenPower project (2020-2021). His research interests relate to software engineering, ML-driven communication and orchestration, and Edge computing.