Fabian Hinder’s research while affiliated with Citec and other places


Publications (48)


An Algorithm-Centered Approach To Model Streaming Data
  • Preprint
  • File available

December 2024 · 16 Reads

Fabian Hinder · Valerie Vaquet · David Komnick

Besides the classical offline setup of machine learning, stream learning constitutes a well-established setup where data arrives over time in potentially non-stationary environments. Concept drift, the phenomenon that the underlying distribution changes over time, poses a significant challenge. Yet, despite high practical relevance, there is little to no foundational theory for learning in the drifting setup comparable to classical statistical learning theory in the offline setting. This can be attributed to the lack of an underlying object comparable to a probability distribution as in the classical setup. While there exist approaches to transfer ideas to the streaming setup, these start from a data perspective rather than an algorithmic one. In this work, we suggest a new model of data over time that is aimed at the algorithm's perspective. Instead of defining the setup using time points, we utilize a window-based approach that resembles the inner workings of most stream learning algorithms. We compare our framework to others from the literature on a theoretical basis, showing that in many cases both model the same situation. Furthermore, we perform a numerical evaluation and showcase an application in the domain of critical infrastructure.


Adversarial Attacks for Drift Detection

November 2024 · 4 Reads

Concept drift refers to the change of data distributions over time. While drift poses a challenge for learning models, requiring their continual adaption, it is also relevant in system monitoring to detect malfunctions, system failures, and unexpected behavior. In the latter case, the robust and reliable detection of drifts is imperative. This work studies the shortcomings of commonly used drift detection schemes. We show how to construct data streams that are drifting without being detected. We refer to those as drift adversarials. In particular, we compute all possible adversarials for common detection schemes and underpin our theoretical findings with empirical evaluations.


Figure 1: Overview of the particularities of and key tasks in WDNs (in the boxes) and their dependencies. Red connections mark key challenges, green potentials, and blue additional constraints.
Summary of approaches for leakage detection and localization alongside their data requirements and a summary of how the stages of drift detection are realized.
Challenges, Methods, Data -- a Survey of Machine Learning in Water Distribution Networks

October 2024 · 124 Reads

Valerie Vaquet · Fabian Hinder · André Artelt · [...]

Research on methods for planning and controlling water distribution networks gains increasing relevance as the availability of drinking water will decrease as a consequence of climate change. So far, the majority of approaches are based on hydraulics and engineering expertise. However, with the increasing availability of sensors, machine learning techniques constitute a promising tool. This work presents the main tasks in water distribution networks, discusses how they relate to machine learning, and analyses how the particularities of the domain pose challenges to and can be leveraged by machine learning approaches. Besides, it provides a technical toolkit by presenting evaluation benchmarks and a structured survey of the exemplary task of leakage detection and localization.


Figure 1: Synthetic datasets (a: "local" with linear model (black), b: "XOR", c: fair projection of "local"). Label is color/hatched; Protected attribute is shape.
FairGLVQ: Fairness in Partition-Based Classification

October 2024 · 13 Reads

Fairness is an important objective throughout society, from the distribution of limited goods such as education, through hiring and payment, to taxes, legislation, and jurisprudence. Due to the increasing importance of machine learning approaches in all areas of daily life, including those related to health, security, and equity, an increasing amount of research focuses on fair machine learning. In this work, we focus on the fairness of partition- and prototype-based models. The contribution of this work is twofold: 1) we develop a general framework for fair machine learning of partition-based models that does not depend on a specific fairness definition, and 2) we derive a fair version of learning vector quantization (LVQ) as a specific instantiation. We compare the resulting algorithm against other algorithms from the literature on theoretical and real-world data, showing its practical relevance.





One or two things we know about concept drift-a survey on monitoring in evolving environments. Part B: locating and explaining concept drift

July 2024 · 20 Reads · 3 Citations

Frontiers in Artificial Intelligence

In an increasing number of industrial and technical processes, machine learning-based systems are being entrusted with supervision tasks. While they have been successfully utilized in many application areas, they frequently are not able to generalize to changes in the observed data, which environmental changes or degrading sensors might cause. These changes, commonly referred to as concept drift, can trigger malfunctions in the used solutions, which are safety-critical in many cases. Thus, detecting and analyzing concept drift is a crucial step when building reliable and robust machine learning-driven solutions. In this work, we consider the setting of unsupervised data streams, which is highly relevant for different monitoring and anomaly detection scenarios. In particular, we focus on the tasks of localizing and explaining concept drift, which are crucial to enable human operators to take appropriate action. Next to providing precise mathematical definitions of the problem of concept drift localization, we survey the body of literature on this topic. By performing standardized experiments on parametric artificial datasets, we provide a direct comparison of different strategies. Thereby, we can systematically analyze the properties of different schemes and suggest first guidelines for practical applications. Finally, we explore the emerging topic of explaining concept drift.



One or two things we know about concept drift-a survey on monitoring in evolving environments. Part A: detecting concept drift

June 2024 · 22 Reads · 15 Citations

Frontiers in Artificial Intelligence

The world surrounding us is subject to constant change. These changes, frequently described as concept drift, influence many industrial and technical processes. As they can lead to malfunctions and other anomalous behavior, which may be safety-critical in many scenarios, detecting and analyzing concept drift is crucial. In this study, we provide a literature review focusing on concept drift in unsupervised data streams. While many surveys focus on supervised data streams, so far, there is no work reviewing the unsupervised setting. However, this setting is of particular relevance for monitoring and anomaly detection which are directly applicable to many tasks and challenges in engineering. This survey provides a taxonomy of existing work on unsupervised drift detection. In addition to providing a comprehensive literature review, it offers precise mathematical definitions of the considered problems and contains standardized experiments on parametric artificial datasets allowing for a direct comparison of different detection strategies. Thus, the suitability of different schemes can be analyzed systematically, and guidelines for their usage in real-world scenarios can be provided.
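
As an illustration of what unsupervised drift detection can look like in practice, the following is a minimal sketch of a generic two-window scheme: the most recent window of a stream is compared with the window before it, and drift is reported when the two-sample Kolmogorov-Smirnov statistic exceeds a threshold. It is not taken from the survey or any other listed work; the function names, the window size, and the threshold are assumptions chosen for the example.

```python
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def detect_drift(stream, window=100, threshold=0.5):
    """Return the first position where the two latest windows differ, else None.

    `window` and `threshold` are illustrative defaults, not recommended values.
    """
    for t in range(2 * window, len(stream) + 1):
        reference = stream[t - 2 * window : t - window]  # older window
        current = stream[t - window : t]                 # most recent window
        if ks_statistic(reference, current) > threshold:
            return t
    return None


# Synthetic one-dimensional streams: one stationary, one with a mean shift at t = 300.
rng = random.Random(0)
stationary = [rng.gauss(0.0, 1.0) for _ in range(600)]
drifting = stationary[:300] + [rng.gauss(3.0, 1.0) for _ in range(300)]

print("stationary stream:", detect_drift(stationary))
print("drifting stream:  ", detect_drift(drifting))
```

A fixed threshold keeps the sketch short; a real detector would more likely calibrate the decision with a statistical test and account for repeated testing over time.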


Citations (28)


... Drift might manifest itself in a change of the input distribution, the posterior distribution, or any representation of features (see e.g. [26,27] for a detailed recent discussion). Hence the current model might become invalid either because there does not exist a model fitting both D_t1 and D_t2, or because the variability of D_t2 cannot easily be predicted based on D_t1. ...

Reference:

Machine learning in distributed, federated and non-stationary environments - recent trends
Feature-based analyses of concept drift
  • Citing Article
  • October 2024

Neurocomputing

... In contrast to other methods, those usually come with formal descriptions and guarantees on what they can and cannot do (Hinder and Hammer, 2023). Such are useful for various setups with semantic features, in particular sensor networks (Hinder et al., 2023a; Vaquet et al., 2024b). • Local feature importance techniques like Saliency ...

Localizing of Anomalies in Critical Infrastructure using Model-Based Drift Explanations
  • Citing Conference Paper
  • June 2024

... This is particularly relevant when the underlying system is part of critical infrastructure [7]. A promising way to address this problem is to consider it through the lens of concept drift [8,9,10,4,11], i.e., a change of the underlying data generating process. Here, a core task is drift detection [10] which is closely related to analyzing and understanding the drift [4]. ...

One or two things we know about concept drift-a survey on monitoring in evolving environments. Part B: locating and explaining concept drift

Frontiers in Artificial Intelligence

... from a single distribution. However, in many real-world applications, data is generated over time and possibly subject to various changes which are commonly referred to as concept drift [2,6,1] or drift for shorthand. Drift can arise from a variety of sources as, for instance, seasonal patterns, shifting user demands, sensor aging, and environmental factors. ...

One or two things we know about concept drift-a survey on monitoring in evolving environments. Part A: detecting concept drift

Frontiers in Artificial Intelligence

... This differs from a time series or stochastic process which are randomly sampled functions from time to data where observations can depend on each other, but each time point has only one definite value. Although both describe data and time interdependencies, and observed data can usually be modeled in both setups, their interpretation and areas of application differ significantly (Hinder et al., 2024). For instance, measuring the temperature of an object over time is a time series, yielding a single value per time. ...

A Remark on Concept Drift for Dependent Data
  • Citing Chapter
  • April 2024

Lecture Notes in Computer Science

... Besides, avoiding high dimensional data if possible is essential. In this case, feature selection might offer a good solution (Hinder and Hammer, 2023). Finally, when relying on tree-based methods, it is crucial to design an appropriate preprocessing if the drift inflicts itself in data correlations. ...

Feature Selection for Concept Drift Detection
  • Citing Conference Paper
  • January 2023

... As standard frameworks, for example, statistical learning theory [7], impose the assumption that all considered data is drawn i.i.d. from a single distribution, drift complicates theoretical analysis [4]. Besides, the established way of modeling data over time and defining drift are not suitable for analyzing stream machine learning on a theoretical level [3,2,4]. ...

On the Change of Decision Boundary and Loss in Learning with Concept Drift
  • Citing Chapter
  • April 2023

Lecture Notes in Computer Science