
Sustainable Life-Cycle of Intelligent Socio-Technical Systems

Abstract

Artificial Intelligence (AI) is ubiquitous in our society and a key driver of economic growth. Intelligent socio-technical systems (ITS), that is, physical artifacts with a specific purpose that have intelligent components at their heart, surround us in the form of intelligent household devices, medical support systems, robotics components in manufacturing, or components of cognitive automation, to name just a few. The current design of AI components for ITS mainly targets the introduction phase, where their technical functionality is optimized based on given training data. Broader, long-term, and non-functional objectives can be evaluated in later phases of the life-cycle: How does a model behave in previously unencountered situations? Does the ITS have unwanted effects on human behavior and welfare? How can model maintenance be realized efficiently with regard to energy consumption? How can novel demands be dealt with? As largely statistical tools, AI models always make mistakes; yet these often become apparent only in later stages of the system’s life-cycle, as they arise from interference with the environment or society. The SAIL network aims for a natural and efficient way of enabling humans to shape ITS throughout their whole life-cycle, seeking alternatives to the currently predominant approach of black-box AI components that require huge data sets and resources. SAIL targets the sustainability of AI not only with respect to its mere technical function but also with respect to its cognitive compatibility and aspects of societal welfare. SAIL constitutes an interinstitutional network of Bielefeld University, Paderborn University, University of Applied Sciences Bielefeld, and OWL University of Applied Sciences and Art, in which about 80 principal investigators and researchers have joined forces towards a holistic approach to AI, which enables us to address the long-term societal and ecological impact of ITS, the demands and cognition of human users, and their surrounding IT infrastructure.
The SAIL network kicked off in summer 2022, incorporating networking activities such as yearly retreats and regular lecture series, international outreach via its fellowship program and international events, and science communication and transfer to society and industrial partners via tailored activities. At the heart of SAIL are its interdisciplinary research activities, which are bundled in its interdisciplinary graduate school with its regular colloquium and AI-schools.
Within SAIL, we aim for overarching design principles to address challenges following three dedicated research areas:
Principles of human agency as a way of valuing, communicating, and aligning objectives of humans and ITS, in particular in the growth phase.
Efficient realizations thereof w.r.t. time and memory, energy, and cognitive load, throughout the whole life-cycle to enable real-time operations involving human partners.
Prosilience as a way of modeling and mitigating possibly unwanted long-term effects (errors) on a functional, cognitive, and societal level for mature ITS or systems transitioning to new environments.
These research areas are pursued with a special focus on two application domains which address ITS with societal and industrial relevance: intelligent industrial workspaces and adaptive healthcare assistance systems. These dedicated transfer scenarios facilitate the integration of interdisciplinary research and act as catalysts to cover the spectrum from fundamental research up to technology transfer.
www.sail.nrw
SAIL Research Network
R1: HUMAN AGENCY TO SHAPE COOPERATIVE INTELLIGENCE
SAIL’s first research pillar, human agency and the capability to shape systems according to human needs rather than vice versa, targets key requirements for a sustainable human-centric design. Recent advances in large language models, notably chatbots such as OpenAI’s ChatGPT, allow humans to interact with machines in natural language. Yet these interactions are still far from natural, as they are limited to one or two modalities; current transformer-based models lack higher-order symbolic knowledge, and they might fall prey to spurious correlations, which can cause wrong results, referred to as hallucinations. As they constitute black-box models, the underlying rationale and limitations of AI systems often remain opaque to humans. Hence, human agency is limited when interacting with AI systems, as humans’ objectives and the mathematical modeling underlying the AI system are typically not aligned.
In SAIL, we address key challenges to facilitate human agency in cognitive interaction technologies. The question of which orchestration of verbal and non-verbal signals enables natural modes of interaction between humans and ITS has led SAIL researchers to investigate the specific effects of referential gaze and multimodality [14], as well as the influence of context and causality bias on human perception [10,12].
A further focus is put on technologies which enhance the explainability of AI models, thus facilitating human understanding [7,8,9]. Approaches which make such models more trustworthy can rely on probing with semantic adversarials which might occur in real life [1], and on a better understanding of how to identify misuse and unwanted effects such as hate speech [3] and of how this capability changes based on historical context [4]. Further research deals with the suitability of different AI paradigms for generating truthful content [6] and with understanding the aspects which are crucial for humans to accept AI-based support in applications [2,13].
Within this interdisciplinary endeavor, research area (R1) integrates natural
language interfaces and multi-modal social signals, while relying on expertise from
linguistics, psychology, human-computer interaction, and systems engineering.
TRANSCENDING LIMITATIONS OF LARGE LANGUAGE MODELS
Research area: R1 (Overarching Challenges)
Recently, major advances in AI have been made by the introduction of large language models (LLMs), which can outperform older models in computer science benchmarks and are fluent in dialog with users. ITS that are based on LLMs have in fact already left the lab and are changing the way we interact with digital assistants and how we produce and consume texts; applications such as OpenAI’s ChatGPT or Microsoft’s Bing assistant are already used by millions of people every day.
Despite these successes, LLMs have severe limitations that SAIL research investigates and works to solve. Interaction with LLMs is text-based, which makes it necessary to expand to other modalities. LLMs are trained on huge amounts of text, as well as on feedback from humans, which favors popular information and introduces biases. Despite appearances, LLMs neither understand us as humans understand each other, nor can they distinguish between fact and fiction, so that we cannot be sure whether they are telling us the truth or making things up. One mitigation investigated in SAIL is to combine LLMs with knowledge graphs, so that they are more likely to produce factual information.
JUNIOR RESEARCH GROUP LEADERS IN FOCUS: ÖZGE ALAÇAM
Can you tell us something about yourself?
I am Özge Alaçam, and I grew up in Istanbul, Turkey. I earned my Bachelor's and Master’s degrees in Instructional Design and Cognitive Science at METU, Ankara, and completed my PhD in Informatics at Uni Hamburg, Germany. Currently, I lead a Junior Research Group at Uni Bielefeld as part of the SAIL project and serve as an acting professor at the Center for Information and Language Processing at LMU. I live in Bielefeld with my partner and our three cats. In my free time, I enjoy diving, underwater photography, and exploring beautiful places in NRW by bicycle.
What do you find interesting about SAIL?
SAIL provided me with the opportunity to have my own junior research group, which contributed a lot to my visibility, to establishing broader networks, and to preparing me for my next academic goal, becoming a full-time professor. I especially appreciate the collaborative culture of Uni Bielefeld. Furthermore, the supportive environment from the established faculty members and the SAIL project creates a fruitful and nourishing working atmosphere.
What is your research area?
My main research areas are Multimodal NLP (Situated Language Understanding, Gaze-Contingent LMs) and Hate Speech Detection. Integrating multimodal information (visual information, gaze, etc.) into language models to achieve better comprehension of the environment, and understanding or mitigating hate speech, are challenging issues, and we are far away from being able to model these intricate relationships. Achieving this will contribute to smoother human-computer interaction and safer online communication.
RELEVANCE FOR INDUSTRIAL APPLICATIONS
Research area: R1
In industry, human-centric AI revolutionizes work processes, enhances human capabilities, and creates added value. However, this technological progress has its challenges. AI systems are vulnerable to subtle manipulations of input data, often imperceptible to humans. These artificially perturbed inputs are referred to as adversarial examples and pose a significant threat to the reliability and safety of AI systems.
To counter this threat, we are investigating semantic adversarials. These comprise not only artificially computed perturbations but also naturally occurring erroneous inputs. Semantic adversarials retain the meaningful context of the input data and allow for larger perturbations compared to conventional adversarial examples. By probing AI systems with adversarial examples and semantic adversarials in training, more resilient models can be achieved.
In our work, adversarial examples were successfully generated on industrial data, and countermeasures and their limitations were identified. Another topic that combines human interaction with an industrial context and a security-relevant application is banknote authentication and document design. The boundaries between adversarial examples and semantic adversarials are currently blurring here.
Figure: Illustration of the forgery of a banknote specimen. (a) The banknote specimen is classified based on printing structure in different image regions. (b) Original region A6 of the forgery, detected as a forgery. (c) Modified region A6 of the forgery, detected as genuine. The changes required to deceive a machine learning system are so small that they cannot be detected even in the zoomed-in image regions.
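To make the notion of an adversarial example more concrete, the following minimal sketch perturbs the input of a toy logistic classifier with the fast gradient sign method (FGSM), a standard textbook attack. It is an illustration only and not the method used in SAIL; the classifier weights, the input, and the budget `epsilon` are hypothetical values chosen so that a small, bounded change flips the decision.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Probability of class 1 under a logistic (linear) classifier."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def sign(value):
    return (value > 0) - (value < 0)


def fgsm_perturb(weights, bias, x, label, epsilon):
    """Shift every feature by +/- epsilon in the direction that increases the loss."""
    p = predict_proba(weights, bias, x)
    # Gradient of the cross-entropy loss with respect to the input: (p - y) * w
    grad = [(p - label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


# Hypothetical toy classifier and input (illustrative values only).
weights, bias = [2.0, -1.0, 0.5], 0.0
x, label = [0.3, 0.2, 0.1], 1

x_adv = fgsm_perturb(weights, bias, x, label, epsilon=0.2)
print(predict_proba(weights, bias, x))      # above 0.5: classified as class 1
print(predict_proba(weights, bias, x_adv))  # below 0.5: the decision is flipped
```

Adding such perturbed inputs, together with their correct labels, to the training data is one common way of probing and hardening a model. Semantic adversarials as described above would replace the gradient-based perturbation with naturally occurring erroneous inputs.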
DEMONSTRATORS IN MEDICINE
Research area: R1
SAIL's focus on social sustainability is manifested in its medical applications, in collaboration with the newly established Medical School OWL at Bielefeld University with its unique research focus on persons with disabilities.
Our endeavor is to build a more inclusive and supportive healthcare system by ensuring that technology is accessible and responsive to all users. In healthcare, catering to the diverse needs of individual users is essential, particularly in daily life and home environments. SAIL aims to make assistance through ITS possible and enhance user experience by developing AI models and interfaces that are adaptive, barrier-free, and explainable. This ensures that ITS can adjust and be adjusted to individual needs and contexts in interaction with their users, strengthening human agency and participation.
Specifically, one of SAIL's PhD tandems is working on an interdisciplinary project to develop personalized, explainable interactions with humanoid robots. These interaction sessions are tailored to users’ personalities and designed to prepare shy children and individuals with disabilities for upcoming medical examinations, reducing anxiety and empowering them to express their thoughts and participate actively in the examination. This approach promotes a sense of control and participation in their medical care.
DIGITAL HUMANITIES MEETS COMPUTATIONAL LINGUISTICS
Research area: R1
In our recent collaborative effort within SAIL, we contributed to the fusion of computational methods with Digital Humanities research. Our investigation yielded a pilot dataset covering Early Modern English texts, tailored to the novel task of detecting changes of hateful word meaning. This dataset, which is enriched with annotations from domain experts, served as a valuable evaluation resource for the development of methods leveraging LLMs to tackle this challenge.
Our interdisciplinary approach not only enables advancing computational techniques that, in turn, contribute to new historical insights, but also illuminates the requisites for collaboration across different research fields. Along our journey, we encountered the contrasting objectives of computational linguists, who aim for generalizable methods, and of Digital Humanities researchers, who pursue specific analytical goals. Furthermore, it became evident that efforts are needed to reconcile divergent interpretations of fundamental concepts such as “data” and “context”.
We envision significant advantages in fostering interdisciplinary collaboration, particularly in enhancing access to high-quality resources for increasingly data-driven and data-reliant LLMs. Simultaneously, it is imperative to critically assess challenges which arise from interdisciplinary research and to systematically define the interrelationship between researchers, their agency, and AI technology.
R2: PROSILIENCE AND LONG-TERM ROBUSTNESS THROUGH HUMAN-CENTERED DESIGN
Classical AI technologies can be supported by mathematical guarantees that hold for well-defined settings of the introduction phase and for formal mathematical, i.e., functional objectives. The behavior of human partners and the development of the social context and technical environment can be subject to (unexpected) variations, such that medium- to long-term guarantees of an ITS’s functionality hardly exist. Within SAIL, we aim for prosilience (proactive actions to guarantee resilience over time) as a fundamental design principle for robust behavior in unexpected situations with respect to diverse technology-centered and human-centered requirements.
Research questions within SAIL address the challenge of how to deal with systematic noise and sensor faults that occur in complex realistic environments, including medical applications [15,16]. Complementary research aims for a better understanding of the regularization which is implied by architectural choices of modern neural architectures, such as the type of non-linearity, the choice of cost function, or the chosen representation, such as one induced by adaptive dimensionality reduction [19,21,23]. Important measures to improve long-term robustness aim for regularization based on domain knowledge such as algebraic invariances [20] or inductive biases from the system’s dynamical properties [28]. Specific dynamic properties might also enable efficient model identification, leading to better generalization behavior and controllability of the system [22,23,25]. As evolving systems might lead to unpredicted behavior, particular focus is put on the explainability of evolving systems, dynamic fact checking, and the preservation of formal mathematical guarantees during model adaptation [17,18,24]. As these approaches primarily concern technical aspects, the research is complemented by the investigation of long-term effects on social behavior, specifically team building of AI agents and humans [26,27].
Research area (R2) combines machine learning technologies with knowledge-based representations for context information and control theory to model long-term technological constraints. Moreover, organizational science and sociology enable us to understand and mediate the long-term societal impact of ITS.
CONTINUAL LEARNING FROM DATA STREAMS
Research area: R2 (Overarching Challenges)
Data is continuously generated in many processes in companies, industry, and society. This constant stream of data can be used to train AI models and dynamically adapt them to the current situation. This is in contrast to the traditional approach, where a static amount of data is used for training.
An exemplary application area for continuous learning addressed in the SAIL project is industrial manufacturing processes, where AI models are continuously adapted to current production conditions to detect error states.
The dynamic adaptation of AI is often very beneficial, but it also brings new challenges: in the example mentioned, how can a real anomaly be distinguished from a benign change to which the AI system should adapt? Which of the numerous new data points in the data stream should be presented to an expert in order to correctly assess the situation (in the example: has a new, previously unknown error occurred)?
In SAIL, we deal with precisely these challenges so that the dynamically learning AI can interact with the user as efficiently as possible. One focus area, for example, is new developments in the field of active learning, in which new data is specifically identified for the most efficient incremental adaptation possible. Another focus addressed in SAIL is methods which ensure that the AI does not drift into unsafe states during continuous learning, in which operational safety is no longer guaranteed.
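The question of which stream elements to present to an expert can be illustrated with uncertainty sampling, a standard active learning heuristic. The sketch below is a generic example under simplifying assumptions, not a SAIL method: the class `StreamLearner`, the `margin` parameter, and the `expert_label` function are hypothetical, and the model is a plain online logistic regression that asks for a label only when its prediction is close to 0.5.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class StreamLearner:
    """Online logistic regression that queries an expert only for uncertain points."""

    def __init__(self, n_features, learning_rate=0.5, margin=0.15):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = learning_rate
        self.margin = margin  # half-width of the "uncertain" band around 0.5
        self.queries = 0      # number of labels requested from the expert

    def predict_proba(self, x):
        return sigmoid(sum(w * xi for w, xi in zip(self.w, x)) + self.b)

    def is_uncertain(self, x):
        return abs(self.predict_proba(x) - 0.5) < self.margin

    def update(self, x, y):
        # One stochastic gradient step on the cross-entropy loss.
        error = self.predict_proba(x) - y
        self.w = [w - self.lr * error * xi for w, xi in zip(self.w, x)]
        self.b -= self.lr * error

    def process(self, x, ask_expert):
        """Handle one stream element; query the expert only if the model is unsure."""
        if self.is_uncertain(x):
            self.queries += 1
            self.update(x, ask_expert(x))
        return self.predict_proba(x) >= 0.5


def expert_label(x):
    """Stand-in for a human expert: class 1 if the single feature is positive."""
    return 1 if x[0] > 0 else 0


# Toy stream alternating between two clearly separated conditions.
stream = [[1.0], [-1.0]] * 10
learner = StreamLearner(n_features=1)
predictions = [learner.process(x, expert_label) for x in stream]
print(f"{learner.queries} expert queries for {len(stream)} stream elements")
```

Once the model has become confident on this stationary toy stream, no further labels are requested. In a drifting stream, predictions would move back into the uncertain band and trigger new queries; distinguishing such benign drift from a real anomaly requires additional mechanisms that this sketch does not cover.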
JUNIOR RESEARCH GROUP LEADERS IN FOCUS: DANIEL LEITE
Can you tell us something about yourself?
I am a researcher in the Department of Computer Science, DICE Group, Paderborn University. For 11 years I was a professor and researcher in Brazil, Chile, and Slovenia in the areas of incremental machine learning, dynamic systems, and control. I earned my PhD in 2012 from UNICAMP, Brazil.
What does a workday look like for you?
A typical workday involves a variety of tasks such as defining priorities; responding to emails; attending meetings; preparing presentations; analyzing data; and improving algorithms and models. Additionally, typical tasks involve defining experimental setups to test a particular hypothesis; reviewing and editing papers; writing rebuttals and reports; contributing to the organization of scientific events; studying research papers; and developing new methods or methodologies.
A look into the future: What do you expect to happen with AI in the next years?
A research focus of AI involves developing procedures to manage situations or events that have never been seen before. The idea is that machines should capture the essence of information from these new situations in real time and generalize their understanding to handle similar instances in the future. Models of various phenomena, including physical, chemical, biological, meteorological, economic, and ecological ones, have been autonomously developed within the internal structure of machines, without human intervention. However, the widespread adoption of such self-evolving AI is still domain-dependent.
A significant distinction between human intelligence and AI is our ability to generalize across domains. Scientific progress in this area is imperative and should be observed in the near future.
JUNIOR RESEARCH GROUP LEADERS IN FOCUS: MICHIEL STRAAT
Can you tell us something about yourself?
My name is Michiel Straat and I am from the Netherlands. I studied computer science and earned my PhD at the University of Groningen, focusing on theoretical aspects of machine learning. In my spare time, I enjoy playing the piano, reading, and practicing sports such as squash, running, and tennis.
Can you tell us about a topic you are working on?
Before joining SAIL, I studied learning behaviour in artificial neural networks theoretically. In my postdoc in SAIL, I am currently focusing on applications of AI in fluid dynamics. This broad field, which describes the flow of fluids in various conditions, is highly relevant to urgent challenges such as the energy transition and climate change. AI methods have the potential to bring much progress here by complementing traditional methods with data-driven modelling.
What do you find interesting about SAIL?
What I find most interesting and appealing is that SAIL targets aspects of intelligent systems that are crucial for further progress in AI, such as human agency, robustness, computational and data efficiency, and sustainability. A large group of experts who work on these topics is part of SAIL, and they inspire each other by exchanging research results and ideas. This creates a very stimulating environment for performing research in AI.
What do you do to stay productive?
I practice sports each morning because I think it is most important for my productivity.
ROBUST DESIGN OF AUTONOMOUS VEHICLES THROUGH AI-SUPPORTED TESTING
Research area: R2
The development and validation of (partially) autonomous vehicles is very complex and involves a great deal of time and expense. Numerous requirements and use cases must be taken into account, and the final system must be versatile and meet high safety standards. Consequently, the number of test cases to be considered (e.g., a lane change by a vehicle in front) is enormous; it is expected that more than one million road kilometers will have to be tested in order to obtain certification, which is a major challenge even for virtual tests using realistic computer simulations.
At this point, machine learning can help in two ways. On the one hand, the data obtained from expensive simulations can be used to train so-called surrogate models, with the support of which further tests can be carried out many times faster. The resulting higher data density then enables efficient test validation, including safety guarantees, as well as the further optimization of additional criteria, such as energy consumption. On the other hand, machine learning methods can be used to make an intelligent selection of simulations. For example, test runs can be identified that are on the boundary between safe and unsafe system behavior. As such tests are much more informative than randomly selected ones, the number of expensive simulations can be further reduced without jeopardizing the robustness of the system.
In the SAIL project, there is a dedicated focus on the question of how dynamic systems occurring in this context can be identified as efficiently as possible from just a few observations by exploiting basic mathematical principles.
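Why test runs near the boundary between safe and unsafe behavior are so informative can be sketched with a deliberately simple example. It assumes a single scenario parameter (the gap to the vehicle in front) and a safety outcome that changes exactly once along that parameter; both the simulator `simulate_lane_change` and its hidden threshold are hypothetical stand-ins for an expensive simulation, and plain bisection stands in for the more general selection methods referred to above.

```python
def simulate_lane_change(gap_m):
    """Stand-in for an expensive simulation: True if the manoeuvre is safe.

    The hidden threshold of 23.7 m is an arbitrary illustrative value.
    """
    return gap_m >= 23.7


def locate_safety_boundary(simulate, unsafe_gap, safe_gap, tolerance):
    """Bisect between a known-unsafe and a known-safe gap, counting simulation runs."""
    runs = 0
    while safe_gap - unsafe_gap > tolerance:
        candidate = (unsafe_gap + safe_gap) / 2.0
        runs += 1
        if simulate(candidate):
            safe_gap = candidate    # boundary lies at or below the candidate
        else:
            unsafe_gap = candidate  # boundary lies above the candidate
    return unsafe_gap, safe_gap, runs


low, high, runs = locate_safety_boundary(simulate_lane_change, 0.0, 100.0, 0.5)
print(f"boundary between {low:.2f} m and {high:.2f} m after {runs} simulations")
```

Each run is placed where the outcome is least predictable, so the interval containing the boundary halves with every simulation. Covering the same range with a uniform grid at 0.5 m resolution would require about 200 runs. Realistic scenarios have many parameters and noisy outcomes, which is where learned surrogate models and more elaborate selection strategies come in.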
RELEVANCE FOR MEDICAL APPLICATIONS: INTELLIGENT HANDLING OF MISLABELED DATA IN A MEDICAL CONTEXT
Research area: R2
Wrong blood in tube (WBIT) errors are those where the name on the blood sample tube does not match the person to whom the blood in the tube belongs. This is a serious and not uncommon occurrence in the medical domain that can have grave consequences such as a delayed diagnosis, misdiagnosis, or mistreatment.
Within SAIL, we pursue collaborative research with Klinikum Lippe Detmold to develop a machine learning (ML)-based system to detect WBIT errors. Unlike conventional measures, our approach utilizes ML's ability to analyze complex patterns in data, such as blood characteristics of the patients, for real-time WBIT error detection. We aim to create a system which is not only accurate and cost-effective but also enables identifying risk factors and detecting unusual medical constellations, ultimately improving patient safety.
We plan to conduct studies in multiple healthcare centers to validate the effectiveness and robustness of our system. We believe that our close collaboration with esteemed institutions like Klinikum Lippe in the OWL area will elevate the quality of medical services available to its residents. Through this innovative approach, we aim to set a new standard in WBIT error detection, transforming laboratory medicine practices and enhancing patient safety in clinical settings.
HUMAN PERCEPTION OF AI-BASED AGENTS AS TEAM MEMBERS
Research area: R2
Human beings possess an inherent desire for social interaction. Thus, interactions with coworkers can either fulfill or threaten basic social needs. With the growing presence of robots or, more generally, AI-based agents as work colleagues, this essential aspect of human nature may be disrupted, potentially leading to feelings of social exclusion.
SAIL explores forms of social exclusion in work-related human-robot interactions and has chosen the restaurant industry as a contemporary use case. Many skilled workers left the restaurant industry during the COVID-19 pandemic, which aggravated the already existing staffing problem. As a consequence, plenty of restaurants, also in OWL, started using robots as waiters to counteract skill shortages.
This is associated with the interdisciplinary challenge of examining team processes known from interpersonal relations at work with regard to work-related human-robot interactions.
R3: SUSTAINABILITY AND EFFICIENCY IN HUMAN-CENTERED ENVIRONMENTS
Current AI models for ITS can involve hundreds of billions of model parameters and training tokens. These extensive resource demands limit their suitability for fast model adaptation in human-machine interaction and for continued adaptation during their life-cycle. Furthermore, there is a need for data- and resource-efficient AI technologies in the light of AI's ecological footprint. While research begins to address technological demands such as learning from little data or resource-efficient model distillation, we also need to consider efficiency regarding human-computer interaction and the cognitive constraints of a human partner. SAIL leverages and connects cognitive components, human-computer interaction expertise, and novel engineering technologies to create innovative approaches for challenges pertaining to data economy and knowledge integration, energy efficiency, and cognitive efficiency.
Active learning constitutes one prominent technology which enables a data-efficient adaptation of AI models to new scenarios. SAIL researchers have proposed innovative methods to deal with complex regression tasks, missing data, or a low query budget [29,30,31,33]. In biomedical applications, gathering labels from human experts is often problematic; this can be countered by novel architectural design principles which enable training based on artificial surrogate labels, thus minimizing the cognitive load [36]. Another option is offered by suitable foundation models, which allow us to target tracking tasks based on one-shot adaptation procedures [37]. Learning from few data is also of high relevance in contexts where foundation models might not yet exist due to the novelty of the sensor technologies used. SAIL researchers demonstrated how to devise reliable systems based on little data for an electronic nose sensor in the biomedical context [38,40].
For dynamic systems, efficiency can be improved by means of system identification, saving resources and improving system robustness [35]. Design principles which incorporate system invariances or equivariance enable a substantial reduction of model size and hence of both energy and time consumption [34,39]. Distributed design in line with the system structure can save time within the model exploration phase [32].
Research area (R3) provides methodologies to master the complexity of involved tasks efficiently, in real time, and on restricted hardware, and to do so while maximizing ‘cognitive’ efficiency, i.e., the ratio between knowledge gain and invested effort. For this purpose, we integrate AI methods for small data sets with engineering expertise such as approximate computing and energy-efficient design, principles of nature such as those offered in biomechatronics, and cognitive science.
COGNITIVE EDGE COMPUTING
Research area: R3 (Overarching Challenges)
Modern AI methods have become an integral part of today's industry. However, the performance of established ML methods, as a sub-area of AI, has so far mostly been based on the use of powerful computing resources in the cloud. The application depends on powerful hardware resources not only for the training of the models but also for their execution. The issues addressed by large providers of AI expertise differ considerably from the requirements of industrial applications, which place high demands on low latency and real-time capability.
Progress has been made in the past in the efficient execution of AI algorithms on embedded systems (cognitive edge computing). At all levels of their execution, there is a high number of potential hardware architectures and AI accelerators that differ in terms of the available system resources, such as performance or energy requirements. Shifting the execution of the AI algorithms as close as possible to the origin of the data offers a high potential for energy savings compared to using central cloud resources. Examples of relevant hardware architectures are embedded microcontrollers with integrated AI acceleration, embedded GPUs/FPGAs, dedicated AI hardware accelerators, or high-end HPC GPUs/FPGAs.
The research goal within SAIL is the development of sustainable industrial applications by advancing AI algorithms for efficient execution on resource-limited hardware in the sense of an AI-hardware co-design. When exploring the design space, we consider not only a one-way linear process from the model to the inference, but also the impact of the selection of suitable hardware on the original AI model.
JUNIOR RESEARCH GROUP LEADERS IN FOCUS: MICHAEL RÖDER
Can you tell us about a topic you are working on?
We are working on an improved evaluation of complex AI
applications. The goal is to make it easier to understand the
performance of these applications and to identify potential
problems in them. A short-term goal is to provide hints for
developers regarding possible improvements to their software.
A long-term goal is to enable complex AI applications to
improve themselves over time.
A look into the future: What do you expect to happen with AI
in the next years?
I think that machine learning and AI will be integrated into many
applications that people use every day. Apart from the “classic”
examples like AI-based personal assistants, I think that
AI-based applications will be used in many areas of work. For
example, small or medium-sized businesses may start to use AI
for planning their work or optimizing internal processes.
This will be made possible by current research that makes
creating and training an AI pipeline easier by automating many
of the steps that currently require an expert. One of these steps
is the evaluation of AI algorithms, which we are trying to
improve with our current work.
What do you find interesting about SAIL?
SAIL brings together young researchers, PhD students, and AI
experts to work on cutting-edge AI technologies. All these
different people bring their own views on research questions,
and it is great to see how different collaborations emerge from
this group of people.
JUNIOR RESEARCH GROUP LEADERS IN FOCUS: ALAA OTHMAN
Can you tell us something about yourself?
I'm Egyptian, have been married since 2006, and have three
children. I started working in Egypt in 2003 as a teaching
assistant at a private institute of computer science, and after
getting my master's degree I found another job as an assistant
lecturer at a public university. I visited Belgium (Ghent
University) as a PhD student, funded by one of the Erasmus
programs (2014-2015). Then I returned to Egypt to finish my
PhD. Directly after my PhD defense, I found a job here in
Germany at Frankfurt University of Applied Sciences
(March 2017).
Then I found a good job here at Bielefeld University of
Applied Sciences (March 2019). My hobbies are watching
and playing football with my family and some friends. Reading
is also one of my hobbies.
What is your research area?
My research area focuses on semi-supervised learning, which
deals with situations where the available training data is limited
and obtaining more data is challenging.
In such cases, I explore the use of intelligent methods known as
active learning to select the most informative and representative
data points for labeling.
This approach significantly reduces the cost and time involved
in the annotation process.
The topic of active learning in the context of semi-supervised
learning is of great interest to a broader audience.
It addresses a common problem faced by many fields: the
scarcity of sufficient training data. By developing techniques to
maximize the value of available data, active learning offers
practical solutions that can enhance model performance and
efficiency. This research has implications in various domains,
benefiting industries such as healthcare, finance, and natural
language processing, where acquiring labeled data can be
resource-intensive.
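To make the idea of selecting the most informative points for labeling more concrete, the following is a minimal sketch of pool-based active learning with uncertainty sampling. It is an illustration only and not the method developed in the research described in the interview: the toy one-dimensional data, the nearest-centroid scoring, and the names `centroids`, `prob_class1`, `most_uncertain`, `active_learning_loop`, and `oracle` are all assumptions made for this example.

```python
import math

def centroids(labeled):
    """Mean feature value per class, from (x, y) pairs with y in {0, 1}."""
    sums, counts = {0: 0.0, 1: 0.0}, {0: 0, 1: 0}
    for x, y in labeled:
        sums[y] += x
        counts[y] += 1
    return {c: sums[c] / counts[c] for c in (0, 1)}

def prob_class1(x, cents):
    """Soft score in (0, 1): the closer x is to the class-1 centroid, the higher."""
    d0, d1 = abs(x - cents[0]), abs(x - cents[1])
    return 1.0 / (1.0 + math.exp(d1 - d0))

def most_uncertain(pool, cents):
    """Uncertainty sampling: choose the pool point whose score is nearest 0.5."""
    return min(pool, key=lambda x: abs(prob_class1(x, cents) - 0.5))

def active_learning_loop(labeled, pool, oracle, budget):
    """Query `budget` labels, always asking about the most uncertain point."""
    labeled, pool = list(labeled), list(pool)
    queried = []
    for _ in range(budget):
        cents = centroids(labeled)          # refit the toy model
        x = most_uncertain(pool, cents)     # select the query
        pool.remove(x)
        labeled.append((x, oracle(x)))      # the "annotator" supplies the label
        queried.append(x)
    return labeled, queried

def oracle(x):
    """Hypothetical annotator: the true class boundary sits at x = 5."""
    return int(x > 5.0)

seed = [(0.0, 0), (10.0, 1)]                # one labeled example per class
pool = [1.0, 2.0, 4.0, 4.9, 5.2, 7.0, 9.0]  # unlabeled candidates
labeled, queried = active_learning_loop(seed, pool, oracle, budget=2)
```

With a budget of two queries, the loop spends both labels on points near the current decision boundary rather than on points far inside a class, which is the intuition behind reducing annotation cost.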
LOW-RESOURCE AI: TRANSFORMERS FOR FIELD-PROGRAMMABLE GATE ARRAYS
Accelerating inference for neural network models using
reprogrammable hardware such as Field-Programmable
Gate Arrays (FPGAs) is an active area of research.
Mapping such models to hardware layer-by-layer results in
custom-tailored streaming dataflow architectures that can
deliver extreme performance and energy efficiency.
Since for such architectures model dimensions are
constrained by available hardware, we apply aggressive
approximation techniques such as network pruning and
low-bit quantization to minimize resource requirements. In
our project, we leverage the FINN open-source compilation
framework to create the first-ever complete FPGA
streaming dataflow architecture for transformer models.
Transformers pose substantial challenges for dataflow
architectures, including the proper approximation of the
attention and softmax building blocks and the automated
balancing of the different layers of the architecture.
We are showcasing performance and resource figures on
tiny GPT-like transformers for text generation. Scaling to
medium-sized models will be possible by deploying the
accelerated transformers across multiple devices in an
FPGA cluster.
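The two approximation techniques named for the FPGA transformer project, network pruning and low-bit quantization, can be illustrated with a small sketch. This is a generic, simplified example and does not use or represent the FINN toolflow: the function names `magnitude_prune`, `quantize_symmetric`, and `dequantize`, the weight values, the 50% sparsity, and the 4-bit width are all assumptions chosen for illustration.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    pruned, removed = [], 0
    for w in weights:
        if abs(w) <= threshold and removed < k:
            pruned.append(0.0)
            removed += 1
        else:
            pruned.append(w)
    return pruned

def quantize_symmetric(weights, bits):
    """Uniform symmetric quantization to signed integers of width `bits`."""
    qmax = 2 ** (bits - 1) - 1              # e.g. 7 for 4 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map integer codes back to approximate real-valued weights."""
    return [v * scale for v in q]

weights = [0.9, -0.05, 0.3, -0.6, 0.02, 0.4]   # hypothetical layer weights
pruned = magnitude_prune(weights, sparsity=0.5)
q, scale = quantize_symmetric(pruned, bits=4)
approx = dequantize(q, scale)
```

After pruning, half of the weights are exactly zero and need no multiplier; after quantization, each remaining weight is a 4-bit integer plus one shared scale factor, which is the kind of reduction that lowers hardware resource requirements.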
AI COMPONENTS FOR INTELLIGENT CARE BEDS
In 2019, 3.2 million people worldwide were affected by pressure ulcers.
The development of pressure ulcers is due to positioning problems,
e.g. in bedridden or immobilized patients of different ages. In this SAIL
sub-project, we are combining different machine learning approaches
to determine the pressure distribution between the body and the
positioning surface - for example, the back of the body and the lying
surface of a person lying down. Instead of a costly and hygienically
problematic sensor mat underneath the lying person to measure the
pressure distribution at several thousand locations on the back of the
body, our approach uses cost-effective and inconspicuously installable
video, thermal imaging and depth camera systems. These sensors are
integrated into the trapeze bar above the patient's bed.
By fusing the sensor information, an image of the lying person is
created from above, which in particular contains a height profile. Based
on this information, an image-to-image transformation is learned using
a deep neural network. After training, the neural network is able to
estimate a pressure distribution map of the back of the body based on
the height profile of the front of the body. The network must be
prevented from "hallucinating", i.e. inventing non-existent pressure
forces. For this purpose, the system is extended so that it can check
the physical plausibility of the predicted pressure distribution.
In future, a care bed will then be able to independently change the tilt
angle of the lying surface depending on the predicted pressure pattern.
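The care bed description mentions a physical plausibility check on the predicted pressure distribution but does not say how it is implemented. One conceivable check, sketched below purely as an assumption, is that no cell may carry negative pressure and that the pressure integrated over the lying surface should roughly match the person's weight. The function names, the grid size, the cell area, and the 20% tolerance are hypothetical, and the sketch assumes a horizontal surface that supports the whole body.

```python
G = 9.81  # gravitational acceleration in m/s^2

def total_force(pressure_map, cell_area_m2):
    """Integrate pressure (Pa) over all grid cells to obtain a force in newtons."""
    return sum(p * cell_area_m2 for row in pressure_map for p in row)

def is_plausible(pressure_map, cell_area_m2, body_mass_kg, rel_tol=0.2):
    """Hypothetical plausibility test for a predicted pressure map.

    Rejects maps with negative pressures and maps whose integrated
    force deviates from the body weight by more than `rel_tol`.
    """
    if any(p < 0 for row in pressure_map for p in row):
        return False
    expected = body_mass_kg * G
    return abs(total_force(pressure_map, cell_area_m2) - expected) <= rel_tol * expected

# Toy 2x2 maps in pascals; each cell covers 0.1 m^2 (illustrative values only).
consistent_map = [[2000.0, 2000.0], [2000.0, 1800.0]]    # about 780 N in total
hallucinated_map = [[2000.0, 2000.0], [2000.0, 9000.0]]  # about 1500 N in total

ok = is_plausible(consistent_map, 0.1, body_mass_kg=80.0)
rejected = not is_plausible(hallucinated_map, 0.1, body_mass_kg=80.0)
```

A check of this kind cannot confirm that a prediction is correct, but it can flag predictions that contain pressure forces the body could not physically produce.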
SUSTAINABLE AI IN ENGINEERING
In engineering, machine learning models are most useful
when they are explainable. Recent advances have
introduced efficient learning algorithms for
the large-scale structured data often found in this field. In
particular, approaches that combine knowledge graphs
and class expression learning have been shown to be
well suited for engineering. Still, bridging the gap
between the formal language used by these machine
learning approaches and the wealth of natural languages
spoken by engineers has remained a challenge.
With LoLa, SAIL addresses cognitive sustainability by
developing a multilingual large language model that can
be used by machine learning algorithms to generate
explanations in over 160 languages. The model has
1.3 billion parameters and was trained on over
450 billion tokens gathered from the Cultura-X
dataset. While heavily tailored towards generating
German, LoLa also covers low-resource languages such
as Tamil and Bangla. With this broad language
coverage, LoLa achieves two feats. First, it transforms
formal languages into natural language, enabling
engineers to fully understand the behavior of machine
learning models. Second, it achieves this goal in the
native languages of a large portion of the population of
engineers. In this way, LoLa lowers their cognitive load
significantly.
REFERENCES
26
[9] Abdullah Fathi Ahmed, Asep Fajar Firmansyah, Mohamed Ahmed Sherif, Diego Moussallem, and Axel-Cyrille Ngonga
Ngomo. Explainable integration of knowledge graphs using large language models. In Natural Language Processing and
Information Systems, 124–139, Cham, 2023. Springer Nature Switzerland. https://arxiv.org/pdf/2111.08486
[1] Julian Knaup, Christoph-Alexander Holst, and Volker Lohweg. Robust training with adversarial examples on industrial
data. In Proceedings - 33. Workshop Computational Intelligence, 123–142, Berlin, Germany, November 2023.
https://publikationen.bibliothek.kit.edu/1000162754/151572782.
[2] Inga Jagemann, Ole Wensing, Manuel Stegemann, and Gerrit Hirschfeld. Acceptance of medical artificial intelligence in
skin cancer screening: Choice-based conjoint survey. JMIR Form Res, 8:e46402, Jan 2024.
https://formative.jmir.org/2024/1/e46402.
[3] Sanne Hoeken, Sina Zarrieß, and ÖOzge Alacam. Identifying slurs and lexical hate speech via light-weight dimension
projection in embedding space. In Proceedings of the 13th Workshop on Computational Approaches to Subjectivity,
Sentiment, & Social Media Analysis, 278–289, Toronto, Canada, July 2023. Association for Computational Linguistics.
https://aclanthology.org/2023.wassa-1.25.
[4] Sanne Hoeken, Sophie Spliethoff, Silke Schwandt, Sina Zarrieß, and Özge Alacam. Towards detecting lexical change of
hate speech in historical data. In Proceedings of the 4th Workshop on Computational Approaches to Historical Language
Change, 100–111, Singapore, December 2023. Association for Computational Linguistics.
https://aclanthology.org/2023.lchange-1.11.
[5] Sanne Hoeken, Ozge Alacam, Antske Fokkens, and Pia Sommerauer.¨ Methodological insights in detecting subtle
semantic shifts with contextualized and static language models. In Findings of the Association for Computational Linguistics:
EMNLP 2023, 3662–3675, Singapore, December 2023. Association for Computational Linguistics.
https://aclanthology.org/2023.findings-emnlp.237.
[5] Sanne Hoeken, Özge Alacam, Antske Fokkens, and Pia Sommerauer. Methodological insights in detecting subtle
semantic shifts with contextualized and static language models. In Findings of the Association for Computational Linguistics:
EMNLP 2023, 3662–3675, Singapore, December 2023. Association for Computational Linguistics.
https://aclanthology.org/2023.findings-emnlp.237.
[6] Christian Witte, David M. Schmidt, and Philipp Cimiano. Comparing generative and extractive approaches to information
extraction from abstracts describing randomized clinical trials. Journal of Biomedical Semantics, 2024.
https://jbiomedsem.biomedcentral.com/articles/ 10.1186/s13326-024-00305-2.
[7] Ashwin Prasad Shivarpatna Venkatesh, Jiawei Wang, Li Li, and Eric Bodden. Enhancing comprehension and navigation
in jupyter notebooks with static analysis. In 2023 IEEE International Conference on Software Analysis, Evolution and
Reengineering (SANER), 391–401, 2023. https://ieeexplore.ieee.org/iel7/10123438/10123439/10123615.pdf
[8] N’Dah Jean Kouagou, Stefan Heindorf, Caglar Demir, and Axel-Cyrille Ngonga Ngomo. Neural class expression
synthesis. In The Semantic Web, 209–226, Cham, 2023. Springer Nature Switzerland. https://papers.dice-
research.org/2023/ESWC_NCES/NCES_ public.pdf.
[10] Judith Sieker, Oliver Bott, Torgrim Solstad, and Sina Zarrieß. Beyond the bias: Unveiling the quality of implicit causality
prompt continuations in language models. In Proceedings of the 16th International Natural Language Generation
Conference, 206–220, Prague, Czechia, September 2023. Association for Computational Linguistics.
https://aclanthology.org/2023.inlg-main.15.
11] Lina Mavrina and Stefan Kopp. Predicting grounding state for adaptive explanation generation in analogical problem-
solving. In Proceedings of the 27th Workshop on the Semantics and Pragmatics of Dialogue, 2023. https://pub.uni-
bielefeld.de/record/2981840#ama.
[12] Judith Sieker and Sina Zarrieß. When your language model cannot even do determiners right: Probing for anti-
presuppositions and the maximize presupposition! principle. In Proceedings of the 6th BlackboxNLP Workshop: Analyzing
and Interpreting Neural Networks for NLP, 180–198, Singapore, December 2023. Association for Computational Linguistics.
https://aclanthology.org/2023.blackboxnlp-1.14.
[13] Hitesh Dhiman, Yutaro Nemoto, Holger Mühlan, Michael Fellmann, and Carsten Röcker. Towards designing assistants
for well-being: clarifying the relationship between users’ intrinsic motivation and expectations from assistants. In ICIS 2022
Proceedings, number 10, 2022.
https://aisel.aisnet.org/icis2022/user_behaivor/user_ behaivor/10.
[14] Özge Alacam, Eugen Ruppert, Sina Zarrieß, Ganeshan Malhotra, Chris¨ Biemann, and Sina Zarrieß. Modeling
referential gaze in task-oriented settings of varying referential complexity. In Findings of the Association for Computational
Linguistics: AACL-IJCNLP 2022, 197–210, Online only, November 2022. Association for Computational Linguistics.
https://aclanthology.org/2022.findings-aacl.19.
[15] Zafran Hussain Shah, Marcel Müller, Wolfgang Hübner, Tung-Cheng Wang, Daniel Telman, Thomas Huser, and
Wolfram Schenck. Evaluation of Swin Transformer and knowledge transfer for denoising of superresolution structured
illumination microscopy data. GigaScience, 13,01 2024.
https://academic.oup.com/gigascience/article/doi/10.1093/gigascience/giad109/7529002.
[17] Daniel Leite, Alisson Silva, Gabriella Casalino, Arnab Sharma, Danielle Fortunato, and Axel-Cyrille Ngonga Ngomo.
EGNN-C+: interpretable evolving granular neural network and application in classification of weakly-supervised EEG data
streams. 2024. https://arxiv.org/pdf/ 2402.17792.
[16] Lars Quakernack, Valerie Vaquet, Barbara Hammer, and Jens Haubrock. A sensor fault detection and imputation
framework for electrical distribution grids. In 2023 IEEE PES Innovative Smart Grid Technologies Europe (ISGT EUROPE),
1–5, 2023. https://ieeexplore. ieee.org/document/10407843.
[18] Jörn Tebbe, Christoph Zimmer, Ansgar Steland, Markus LangeHegermann, and Fabian Mies. Efficiently computable
safety bounds for Gaussian processes in active learning. In Proceedings of The 27th International Conference on Artificial
Intelligence and Statistics, volume 238 of Proceedings of Machine Learning Research, 1333–1341. PMLR, 02–04 May
2024. https://proceedings.mlr.press/v238/tebbe24a.html.
[19] Hans Harder and Sebastian Peitz. On the continuity and smoothness of the value function in reinforcement learning and
optimal control, 2024. https://arxiv.org/abs/2403.14432.
[30] Alaa Tharwat and Wolfram Schenck. Active learning for handling missing data. IEEE Transactions on Neural Networks
and Learning Systems, 1–15, 2024. https://pubmed.ncbi.nlm.nih.gov/38277246/.
[20] Zafran Hussain Shah, Marcel Müller, Wolfgang Hübner, Tung-Cheng Wang, Daniel Telman, Thomas Huser, and
Wolfram Schenck. Evaluation of Swin Transformer and knowledge transfer for denoising of super-resolution structured
illumination microscopy data. GigaScience, 13, 01 2024. https://link.springer.com/chapter/10.1007/ 978-3-031-30047-9_9.
[21] Frederieke Richert, Michiel Straat, Elisa Oostwal, and Michael Biehl. Layered neural networks with gelu activation, a
statistical mechanics analysis. In Proceedings ESANN 2023, 435–440. i6doc.com publication, 2023.
https://research.rug.nl/en/publications/ layered-neural-networks-with-gelu-activation-a-statistical-mechan.
[22] Thorben Markmann, Michiel Straat, and Barbara Hammer. Koopman-based surrogate modelling of turbulent rayleigh-
benard convection, 2024. https://arxiv.org/abs/2405.06425. Accepted for IJCNN24.
[23] Nico Migenda, Ralf Möller, and Wolfram Schenck. Adaptive local principal component analysis improves the clustering
of high-dimensional data. Pattern Recognition, 146, 2024. https://www.sciencedirect.
com/science/article/pii/S0031320323007276?via%3Dihub.
[24] Umair Qudus, Michael Röder, Sabrina Kirrane, and AxelCyrille Ngonga Ngomo. Temporalfc: A temporal fact checking
approach over knowledge graphs. In The Semantic Web – ISWC 2023, 465–483, Cham, 2023. Springer Nature Switzerland.
https://link.springer. com/chapter/10.1007/978-3-031-47240-4_25.
[25] Annika Junker, Keno Pape, Julia Timmermann, and Ansgar Trächtler. Adaptive koopman-based models for holistic
controller and observer design. IFAC-PapersOnLine, 56(3):625–630, 2023.
https://www.sciencedirect.com/science/article/pii/S2405896323024278.
[26] Clarissa Sabrina Arlinghaus and Günter W. Maier. Being ignored is not the only possible form of social exclusion in
human-agent interaction. In HAI ’23: The Importance of Human Factors for Trusted HumanRobot Collaborations, 2023.
https://pub.uni-bielefeld.de/record/ 2984976.
[27] Clarissa Sabrina Arlinghaus and Günter W. Maier. Social exclusion in personnel selection – the risk of discriminating AI
biases. In Proceedings of Interdisciplinary Approaches in Human-Agent Interaction (Inter.HAI ’23), 2023. https://pub.uni-
bielefeld.de/record/2984822
[31] Bjarne Jaster and Martin Kohlhase. Active learning for regression problems with ensemble methods. In Proceedings -
33. Workshop Computational Intelligence, 9–30, Berlin, Germany, November 2023.
https://publikationen.bibliothek.kit.edu/1000162754/151572782.
[32] Sebastian Peitz, Jan Stenner, Vikas Chidananda, Oliver Wallscheid, Steven L. Brunton, and Kunihiko Taira. Distributed
control of partial differential equations using convolutional reinforcement learning. Physica D: Nonlinear Phenomena, 461,
2024. https://www.sciencedirect.com/science/article/pii/S0167278924000472.
[33] A. Tharwat and W. Schenck. Using methods from dimensionality reduction for active learning with low query budget.
IEEE Transactions on Knowledge amp; Data Engineering, (01):1–14, feb 5555. https://www.
computer.org/csdl/journal/tk/5555/01/10433684/1Uvg4zZhThu.
34] Hans Harder and Sebastian Peitz. Predicting PDE fast and efficiently with equivariant extreme learning machines, 2024.
https://arxiv.org/ abs/2404.18530.
[35] Stefan Werner and Sebastian Peitz. Learning a model is paramount for sample efficiency in reinforcement learning
control of PDEs, 2023. https://arxiv.org/abs/2302.07160. Accepted for European Control Conference 2024.
[36] Dominik Stallmann and Barbara Hammer. Unsupervised cyclic siamese networks automating cell imagery analysis.
Algorithms, 16(4):205, 2023. https://www.mdpi.com/1999-4893/16/4/205.
[38] Julius Wörner, Maurice Moelleken, Joachim Dissemond, and Miriam Pein-Hackelbusch. Supporting wound infection
diagnosis: advancements and challenges with electronic noses. Frontiers in Sensors,4, 2023.
https://www.frontiersin.org/articles/10.3389/fsens.2023.1250756/full.
[37] Tristan Kenneweg, Philip Kenneweg, and Barbara Hammer. Foundation model vision transformers are great tracking
backbones. In 2024 International Conference on Artificial Intelligence, Computer, Data Sciences and Applications (ACDSA),
1–6, 2024. https://ieeexplore. ieee.org/document/10467598.
[28] Marc Harkonen, Markus Lange-Hegermann, and Bogdan Raita. Gaussian process priors for systems of linear partial
differential equations with constant coefficients. In Proceedings of the 40th International Conference on Machine Learning,
volume 202 of Proceedings of Machine Learning Research, 12587–12615. PMLR, 23–29 Jul 2023.
https://proceedings.mlr.press/v202/harkonen23a.html.
[29] Alaa Tharwat and Wolfram Schenck. A survey on active learning: State-of-the-art, practical challenges and research
directions. Mathematics, 11(4), 2023. https://www.mdpi.com/2227-7390/11/4/820.
[39] Sebastian Peitz, Hans Harder, Feliks Nüske, Friedrich Philipp, Manuel Schaller, and Karl Worthmann. Partial
observations, coarse graining and equivariance in Koopman operator theory for large-scale dynamical systems, 2023.
https://arxiv.org/abs/2307.15325.
[39] Sebastian Peitz, Hans Harder, Feliks Nüske, Friedrich Philipp, Manuel Schaller, and Karl Worthmann. Partial
observations, coarse graining and equivariance in Kkoopman operator theory for large-scale dynamical systems, 2023.
https://arxiv.org/abs/2307.15325.
AUTHORS
Prof'in Dr Barbara Hammer (Bielefeld University)
Dr Özge Alacam (Bielefeld University)
Clarissa Sabrina Arlinghaus (Bielefeld University)
Mona Brinkmann (TH OWL) | Design
Prof'in Dr Helene Dörksen (TH OWL)
Sanne Hoeken (Bielefeld University)
Prof Dr Thorsten Jungeblut (HSBI)
Julian Knaup (TH OWL)
Dr Daniel Leite (Paderborn University)
Prof Dr Volker Lohweg (TH OWL)
Prof Dr Günter Maier (Bielefeld University)
Prof Dr Axel-Cyrille Ngonga Ngomo (Paderborn University)
Dr Alaa Othman (HSBI)
Jun Prof Dr Sebastian Peitz (Paderborn University)
Prof Dr Marco Platzner (Paderborn University)
Dr Ole Pütz (Bielefeld University)
Prof Dr Dr Dr Carsten Röcker (TH OWL)
Dr Michael Röder (Paderborn University)
Prof Dr Wolfram Schenck (HSBI)
Prof Dr Axel Schneider (HSBI)
Prof'in Dr Silke Schwandt (Bielefeld University)
Sophie Jasmin Spliethoff (Bielefeld University)
Dr Michiel Straat (Bielefeld University)
Baris Gün Sürmeli (TH OWL)
Prof'in Dr Anna-Lisa Vollmer (Bielefeld University)
Prof'in Dr Sina Zarrieß (Bielefeld University)
... Humans have an innate need for social contact [28], which can be fulfilled through coworker interactions [175]. Now that more and more robots are being used as work colleagues [180,185,275], this essential aspect of our social nature could be disrupted and make people feel socially excluded [15,94]. ...
... Given this, the restaurant industry is an excellent example of how using robots as coworkers can counteract the shortage of skilled workers and relieve the existing human workforce. Plus, it also represents an excellent example of human-robot teams [15,94]. Current research on robot restaurants focuses on customers [109,117,126,135,154,232,237] but neglects the impact on humans working in such restaurants. ...
... We live in a time where human-robot interactions at work are increasing [180,185] and are often viewed positively at first [87,222,256,265]. Robots can be used to mitigate the shortage of skilled workers (e.g., in gastronomy) [94]. However, our results show that these new forms of work also have adverse effects that need to be viewed critically. ...
Preprint
Full-text available
Nowadays, we can observe more and more work situations where humans and robots work together (e.g., in manufacturing, care, or gastronomy). Consequently, the question arises if work still satisfies fundamental social needs (e.g., belonging, self-esteem, meaningful existence) or if human-robot teams make people feel excluded with severe consequences for individuals and organizations. Buildingon the temporal need-threat model, we examined restaurant employees’ reactions to social inclusion and exclusion from human or robot coworkers in two pre-registered studies (N 1 = 74; N 2 = 256). Our findings demonstrate that social inclusion from human or robot coworkers leads to higher need fulfillment, while social exclusion (ostracism and rejection) from human or robot coworkers triggers need-threat (i.e., low need fulfillment). However, the effect was more pronounced when being included or excluded by human coworkers, possibly due to more internal and uncontrollable attributions. Participants assumed interpersonal like/dislike when included/excluded by human coworkers, whereas they blamed the robots’ programming for being included or excluded by robot coworkers. Ignored participants show more organizational citizenship behavior (e.g., relieving a coworker’s workload) and less counterproductive behavior (e.g., insultinga coworker) towards their human coworkers but not towards their robot coworkers. Both studies showed that people do not mindlessly interpret robot behavior as like social behavior by humans and, therefore, demonstrate a case where the “Computers Are Social Actors” paradigm is not supported. Consequently, social dynamics within human team members should be prioritized in human-robot teams to maintain a healthy work environment.
Conference Paper
Full-text available
The value function plays a crucial role as a measure for the cumulative future reward an agent receives in both reinforcement learning and optimal control. It is therefore of interest to study how similar the values of neighboring states are, i.e., to investigate the continuity of the value function. We do so by providing and verifying upper bounds on the value function’s modulus of continuity. Additionally, we show that the value function is always Hölder continuous under relatively weak assumptions on the underlying system and that non-differentiable value functions can be made differentiable by slightly “disturbing” the system.
Article
Full-text available
Background Systematic reviews of Randomized Controlled Trials (RCTs) are an important part of the evidence-based medicine paradigm. However, the creation of such systematic reviews by clinical experts is costly as well as time-consuming, and results can get quickly outdated after publication. Most RCTs are structured based on the Patient, Intervention, Comparison, Outcomes (PICO) framework and there exist many approaches which aim to extract PICO elements automatically. The automatic extraction of PICO information from RCTs has the potential to significantly speed up the creation process of systematic reviews and this way also benefit the field of evidence-based medicine. Results Previous work has addressed the extraction of PICO elements as the task of identifying relevant text spans or sentences, but without populating a structured representation of a trial. In contrast, in this work, we treat PICO elements as structured templates with slots to do justice to the complex nature of the information they represent. We present two different approaches to extract this structured information from the abstracts of RCTs. The first approach is an extractive approach based on our previous work that is extended to capture full document representations as well as by a clustering step to infer the number of instances of each template type. The second approach is a generative approach based on a seq2seq model that encodes the abstract describing the RCT and uses a decoder to infer a structured representation of a trial including its arms, treatments, endpoints and outcomes. Both approaches are evaluated with different base models on a manually annotated dataset consisting of RCT abstracts on an existing dataset comprising 211 annotated clinical trial abstracts for Type 2 Diabetes and Glaucoma. For both diseases, the extractive approach (with flan-t5-base) reached the best F1F1F_1 score, i.e. 
0.547 (±0.006±0.006\pm 0.006) for type 2 diabetes and 0.636 (±0.006±0.006\pm 0.006) for glaucoma. Generally, the F1F1F_1 scores were higher for glaucoma than for type 2 diabetes and the standard deviation was higher for the generative approach. Conclusion In our experiments, both approaches show promising performance extracting structured PICO information from RCTs, especially considering that most related work focuses on the far easier task of predicting less structured objects. In our experimental results, the extractive approach performs best in both cases, although the lead is greater for glaucoma than for type 2 diabetes. For future work, it remains to be investigated how the base model size affects the performance of both approaches in comparison. Although the extractive approach currently leaves more room for direct improvements, the generative approach might benefit from larger models.
Article
Full-text available
Recently, it has been challenging to generate enough labeled data for supervised learning models from a large amount of free unlabeled data due to the high cost of the labeling process. Here, the active learning technique provides a solution by annotating a small but highly informative set of unlabeled data. This ensures high generalizability in space and improves classification performance with test data. The task is more challenging when the query budget is small, the data is imbalanced, multiple classes are present, and no predefined knowledge is available. To address these challenges, we present a novel active learner geometrically based on principal component analysis (PCA) and linear discriminant analysis (LDA). The proposed active learner consists of two phases: The PCA-inspired exploration phase, in which regions with high variances are explored, and the LDA-inspired exploitation phase, in which boundary points between classes are selected. The proposed geometric strategy improves the search capabilities of the active learner, allowing it to explore the space of minority classes even with multiple minority classes and a small query budget. Experiments on synthetic and real binary and multi-class imbalanced data show that the proposed algorithm has significant advantages over multiple known active learners.
Article
Full-text available
We present a convolutional framework which significantly reduces the complexity and thus, the computational effort for distributed reinforcement learning control of dynamical systems governed by partial differential equations (PDEs). Exploiting translational equivariances, the high-dimensional distributed control problem can be transformed into a multi-agent control problem with many identical, uncoupled agents. Furthermore, using the fact that information is transported with finite velocity in many cases, the dimension of the agents' environment can be drastically reduced using a convolution operation over the state space of the PDE, by which we effectively tackle the curse of dimensionality otherwise present in deep reinforcement learning. In this setting, the complexity can be flexibly adjusted via the kernel width or by using a stride greater than one (meaning that we do not place an actuator at each sensor location). Moreover, scaling from smaller to larger domains-or the transfer between different domains-becomes a straightforward task requiring little effort. We demonstrate the performance of the proposed framework using several PDE examples with increasing complexity, where stabilization is achieved by training a low-dimensional deep deterministic policy gradient agent using minimal computing resources.
Article
Full-text available
Background: Convolutional neural network (CNN)–based methods have shown excellent performance in denoising and reconstruction of super-resolved structured illumination microscopy (SR-SIM) data. Therefore, CNN-based architectures have been the focus of existing studies. However, Swin Transformer, an alternative and recently proposed deep learning–based image restoration architecture, has not been fully investigated for denoising SR-SIM images. Furthermore, it has not been fully explored how well transfer learning strategies work for denoising SR-SIM images with different noise characteristics and recorded cell structures for these different types of deep learning–based methods. Currently, the scarcity of publicly available SR-SIM datasets limits the exploration of the performance and generalization capabilities of deep learning methods.
Results: In this work, we present SwinT-fairSIM, a novel method based on the Swin Transformer for restoring SR-SIM images with a low signal-to-noise ratio. The experimental results show that SwinT-fairSIM outperforms previous CNN-based denoising methods. Furthermore, as a second contribution, two types of transfer learning, namely direct transfer and fine-tuning, were benchmarked in combination with SwinT-fairSIM and CNN-based methods for denoising SR-SIM data. Direct transfer did not prove to be a viable strategy, but fine-tuning produced results comparable to conventional training from scratch while saving computational time and potentially reducing the amount of training data required. As a third contribution, we publish four datasets of raw SIM images and already reconstructed SR-SIM images. These datasets cover two different types of cell structures, tubulin filaments and vesicle structures. Different noise levels are available for the tubulin filaments.
Conclusion: The SwinT-fairSIM method is well suited for denoising SR-SIM images. By fine-tuning, already trained models can be easily adapted to different noise characteristics and cell structures. Furthermore, the provided datasets are structured in a way that the research community can readily use them for research on denoising, super-resolution, and transfer learning strategies.
Conference Paper
We introduce a modified incremental learning algorithm for evolving Granular Neural Network Classifiers (eGNN-C+). We use double-boundary hyper-boxes to represent granules, and customize the adaptation procedures to enhance the robustness of outer boxes for data coverage and noise suppression, while ensuring that inner boxes remain flexible to capture drifts. The classifier evolves from scratch, incorporates new classes on the fly, and performs local incremental feature weighting. As an application, we focus on the classification of emotion-related patterns within electroencephalogram (EEG) signals. Emotion recognition is crucial for enhancing the realism and interactivity of computer systems. The challenge lies in developing high-performance algorithms capable of effectively managing individual differences and non-stationarities in physiological data without relying on subject-specific information. We extract features from the Fourier spectrum of EEG signals obtained from 28 individuals engaged in playing computer games (a public dataset). Each game elicits a different predominant emotion: boredom, calmness, horror, or joy. We analyze individual electrodes, time window lengths, and frequency bands to assess the accuracy and interpretability of the resulting user-independent neural models. The findings indicate that both brain hemispheres assist classification, especially electrodes on the temporal (T8) and parietal (P7) areas, alongside contributions from frontal and occipital electrodes. While patterns may manifest in any band, the Alpha (8-13 Hz), Delta (1-4 Hz), and Theta (4-8 Hz) bands, in this order, exhibited higher correspondence with the emotion classes. The eGNN-C+ demonstrates effectiveness in learning EEG data. It achieves an accuracy of 81.7% and a 0.0029 II interpretability using 10-second time windows, even in the face of a highly stochastic, time-varying 4-class classification problem.
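Extracting band features from the Fourier spectrum of one EEG window can be sketched as follows. The band limits and the 10-second window length come from the abstract above; the 128 Hz sampling rate, the use of mean spectral power as the feature, and the function name `band_powers` are assumptions made for illustration and may differ from the authors' feature definition.

```python
import numpy as np

# Band limits in Hz as given in the abstract.
BANDS = {"delta": (1.0, 4.0), "theta": (4.0, 8.0), "alpha": (8.0, 13.0)}

def band_powers(window, sampling_rate):
    """Mean spectral power per band for a single-electrode time window."""
    spectrum = np.abs(np.fft.rfft(window)) ** 2
    freqs = np.fft.rfftfreq(window.shape[0], d=1.0 / sampling_rate)
    return {
        name: float(spectrum[(freqs >= lo) & (freqs < hi)].mean())
        for name, (lo, hi) in BANDS.items()
    }

if __name__ == "__main__":
    sampling_rate = 128  # assumed value, not stated in the abstract
    t = np.arange(10 * sampling_rate) / sampling_rate  # 10-second window
    synthetic = np.sin(2 * np.pi * 10.0 * t)  # synthetic 10 Hz oscillation
    powers = band_powers(synthetic, sampling_rate)
    print(max(powers, key=powers.get))
```

One such feature vector per electrode and window would then be passed to the incremental classifier.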
Article
Recently, the massive growth of IoT devices and Internet data, which are widely used in many applications, including industry and healthcare, has dramatically increased the amount of free unlabeled data collected. However, this unlabeled data is of no direct use for training supervised machine learning models. The expense and time required for labeling make the problem even more challenging. Here, the active learning (AL) technique provides a solution by labeling small but highly informative and representative data, which guarantees a high degree of generalizability over space and improves classification performance with data we have never seen before. The task is more difficult when the active learner has no predefined knowledge, such as initial training data, and when the obtained data is incomplete (i.e., contains missing values). In previous studies, the missing data are first imputed. Then, the active learner selects from the available unlabeled data, regardless of whether the points were originally observed or imputed. However, selecting inaccurately imputed data points would negatively affect the active learner and prevent it from selecting informative and/or representative points, thus reducing the overall classification performance of the prediction models. This motivated us to introduce a novel query selection strategy that accounts for imputation uncertainty when querying new points. For this purpose, we first introduce a novel multiple imputation method that considers feature importance in selecting the most promising feature groups for missing-value estimation. This multiple imputation method provides the ability to quantify the imputation uncertainty of each imputed data point. Furthermore, in each of the two phases of the proposed active learner (exploration and exploitation), imputation uncertainty is taken into account to reduce the probability of selecting points with high imputation uncertainty.
We tested the effectiveness of the proposed active learner on different binary and multiclass datasets with different missing rates.
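The idea of penalizing points with uncertain imputations during query selection can be sketched as follows. This is an illustrative stand-in, not the authors' method: the spread across several completed copies of the data serves as the uncertainty measure, and the linear penalty, the `penalty` parameter, and both function names are hypothetical.

```python
import numpy as np

def imputation_uncertainty(imputations):
    """Per-point uncertainty from multiple imputation.

    imputations has shape (M, n_points, n_features): M completed copies of the
    same data. Observed entries are identical across copies, so only imputed
    entries contribute to the spread.
    """
    return imputations.std(axis=0).mean(axis=1)

def select_query(informativeness, uncertainty, penalty=1.0):
    """Pick the point with the best informativeness after an uncertainty penalty."""
    span = uncertainty.max() - uncertainty.min()
    if span > 0:
        scaled = (uncertainty - uncertainty.min()) / span
    else:
        scaled = np.zeros_like(uncertainty)
    return int(np.argmax(informativeness - penalty * scaled))

if __name__ == "__main__":
    base = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    copies = np.stack([base, base, base])
    # Point 1 has a missing first feature that each imputation fills differently.
    copies[0, 1, 0] = 0.0
    copies[2, 1, 0] = 6.0
    informativeness = np.array([0.5, 0.9, 0.6])  # hypothetical scores
    print(select_query(informativeness, imputation_uncertainty(copies)))
```

With `penalty=0` the rule reduces to ordinary informativeness-based selection, which makes the effect of the uncertainty term easy to compare.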