Arthur Zimek’s research while affiliated with University of Southern Denmark and other places

Publications (167)


Simple example of a dataset with ten inliers and two outliers. The table shows the linearly normalized outlier scores of eight different algorithms (introduced in Sect. 2.3). Note the differences in the scores, particularly for the outliers at (1,1) and (2,2)
Baseline dataset used in the experiments. Outliers are shown in orange and inliers in blue
Examples of datasets in the sensitivity analysis. Except for case (e), outliers are shown in orange and inliers in blue
Sensitivity analysis on the number of data points. Colors map to i values in Table 4
Sensitivity analysis on the number of dimensions. Colors map to i values in Table 4

What do anomaly scores actually mean? Dynamic characteristics beyond accuracy
  • Article
  • Full-text available

November 2024 · 12 Reads · Data Mining and Knowledge Discovery
Henrique O. Marques · Arthur Zimek · Tanja Zseby

Anomaly detection has become pervasive in modern technology, covering applications from cybersecurity to medicine and system failure detection. Before outputting a binary outcome (i.e., anomalous or non-anomalous), most algorithms evaluate instances with outlierness scores. But what does a score of 0.8 mean? Or what is the practical difference compared to a score of 1.2? Score ranges are assumed to be non-linear and relative, their meaning established by weighting the whole dataset (or a dataset model). While this is perfectly true, algorithms also impose dynamics that decisively affect the meaning of outlierness scores. In this work, we aim to gain a better understanding of the effect that both algorithms and specific data particularities have on the meaning of scores. To this end, we compare established outlier detection algorithms and analyze them beyond common metrics related to accuracy. We disclose trends in their dynamics and study the evolution of their scores when facing changes that should render them invariant. For this purpose, we abstract characteristic S-curves and propose indices related to discriminant power, bias, variance, coherence, and robustness. We discovered that each studied algorithm shows biases and idiosyncrasies, which habitually persist regardless of the dataset used. We provide methods and descriptions that facilitate and extend a deeper understanding of how the discussed algorithms operate in practice. This information is key to deciding which one to use, thus enabling a more effective and conscious incorporation of unsupervised learning in real environments.
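
As a rough illustration of the kind of score dynamics discussed above (not the indices proposed in the article), the following sketch linearly normalizes the scores of one detector and inspects the sorted score profile, the characteristic "S-curve", together with two simple summary statistics. The detector choice, the toy data, and the statistics are assumptions for illustration only.

    # Hypothetical illustration: inspect the shape of an outlier score profile
    # beyond ranking accuracy. These are not the indices from the article.
    import numpy as np
    from sklearn.neighbors import LocalOutlierFactor

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, size=(100, 2)),       # inliers
                   np.array([[6.0, 6.0], [8.0, 1.0]])])   # two planted outliers

    lof = LocalOutlierFactor(n_neighbors=10)
    lof.fit(X)
    scores = -lof.negative_outlier_factor_                # higher = more outlying

    # Linear (min-max) normalization, as in the example table above
    norm = (scores - scores.min()) / (scores.max() - scores.min())

    # The sorted, normalized scores form the detector's characteristic S-curve
    s_curve = np.sort(norm)

    # Crude summaries of the score profile (illustrative only)
    print("score variance:", norm.var())
    print("gap between the second-highest score and the next:",
          s_curve[-2] - s_curve[-3])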

Transparent Neighborhood Approximation for Text Classifier Explanation

November 2024 · 1 Read

Recent literature highlights the critical role of neighborhood construction in deriving model-agnostic explanations, with a growing trend toward deploying generative models to improve synthetic instance quality, especially for explaining text classifiers. These approaches overcome the challenges in neighborhood construction posed by the unstructured nature of text, thereby improving the quality of explanations. However, the deployed generators are usually implemented via neural networks and lack inherent explainability, sparking arguments over the transparency of the explanation process itself. To address this limitation while preserving neighborhood quality, this paper introduces a probability-based editing method as an alternative to black-box text generators. This approach generates neighboring texts by applying manipulations based on in-text contexts. By substituting the generator-based construction process with recursive probability-based editing, the resulting explanation method, XPROB (explainer with probability-based editing), exhibits competitive performance according to the evaluation conducted on two real-world datasets. Additionally, XPROB's fully transparent and more controllable construction process leads to superior stability compared to the generator-based explainers.
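
XPROB's editing procedure is only summarized in the abstract; the sketch below merely illustrates the general idea of probability-based, in-context token substitution for neighborhood construction, not the paper's actual method. The corpus, the bigram model, and the perturb helper are hypothetical stand-ins.

    # Hypothetical sketch: generate neighboring texts by replacing tokens with
    # words that are probable in the same local context, estimated from bigram
    # counts over a small corpus. This is not XPROB's actual algorithm.
    from collections import Counter, defaultdict
    import random

    corpus = ["the movie was great", "the movie was boring",
              "the plot was great", "the acting was terrible"]

    # Estimate P(word | previous word) from bigram counts
    bigrams = defaultdict(Counter)
    for sent in corpus:
        tokens = sent.split()
        for prev, cur in zip(tokens, tokens[1:]):
            bigrams[prev][cur] += 1

    rng = random.Random(0)

    def perturb(text):
        """Return one neighboring text by resampling a token from its context."""
        tokens = text.split()
        idx = rng.randrange(1, len(tokens))          # keep the first token fixed
        candidates = bigrams.get(tokens[idx - 1])
        if candidates:
            words, counts = zip(*candidates.items())
            tokens[idx] = rng.choices(words, weights=counts, k=1)[0]
        return " ".join(tokens)

    print([perturb("the movie was great") for _ in range(5)])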


Systematic Review of Generative Modelling Tools and Utility Metrics for Fully Synthetic Tabular Data

November 2024 · 18 Reads · ACM Computing Surveys

Sharing data with third parties is essential for advancing science, but it is becoming more and more difficult with the rise of data protection regulations, ethical restrictions, and growing fear of misuse. Fully synthetic data, which transcends anonymisation, may be the key to unlocking valuable untapped insights stored away in secured data vaults. This review examines current synthetic data generation methods and their utility measurement. We found that more traditional generative models, such as Classification and Regression Tree models alongside Bayesian Networks, remain highly relevant and are still capable of surpassing deep learning alternatives like Generative Adversarial Networks. However, our findings also reveal the same lack of agreement on evaluation metrics uncovered in earlier reviews, which poses a persistent obstacle to advancing the field. We propose a tool for evaluating the utility of synthetic data and illustrate how it can be applied to three synthetic data generation models. By streamlining evaluation and promoting agreement on metrics, researchers can explore novel methods and generate compelling results that will convince data curators and lawmakers to embrace synthetic data. Our review emphasises the potential of synthetic data and highlights the need for greater collaboration and standardisation to unlock its full potential.
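
The utility evaluation tool proposed in the review is not reproduced here; as a minimal, generic sketch of one widely used utility measure, the propensity-score mean squared error (pMSE), the snippet below trains a classifier to distinguish real from synthetic rows. The toy data and the choice of classifier are assumptions.

    # Minimal sketch of a generic utility measure (propensity-score MSE, pMSE):
    # train a classifier to tell real from synthetic rows; probabilities close
    # to the synthetic share indicate data that is hard to distinguish.
    # This is not the evaluation tool proposed in the review.
    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    def pmse(real: pd.DataFrame, synthetic: pd.DataFrame) -> float:
        data = pd.concat([real, synthetic], ignore_index=True)
        labels = np.r_[np.zeros(len(real)), np.ones(len(synthetic))]
        clf = LogisticRegression(max_iter=1000).fit(data, labels)
        propensity = clf.predict_proba(data)[:, 1]
        return float(np.mean((propensity - labels.mean()) ** 2))

    # Toy usage with numeric columns; real tabular data would need encoding first
    rng = np.random.default_rng(1)
    real = pd.DataFrame(rng.normal(size=(200, 3)), columns=list("abc"))
    synth = pd.DataFrame(rng.normal(0.1, 1.1, size=(200, 3)), columns=list("abc"))
    print("pMSE:", pmse(real, synth))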

Fig. 4: Mean rank of linear scaling, non-robust, and robust Gaussian scaling variants for the harmonic improvement score of the stratified sharpness, refinement, and calibration errors for outliers and inliers: Gaussian scaling with sample mean as center and nMAD as scale performs best.
Robust Statistical Scaling of Outlier Scores: Improving the Quality of Outlier Probabilities for Outliers (Extended Version)

August 2024 · 42 Reads

Outlier detection algorithms typically assign an outlier score to each observation in a dataset, indicating the degree to which an observation is an outlier. However, these scores are often not comparable across algorithms and can be difficult for humans to interpret. Statistical scaling addresses this problem by transforming outlier scores into outlier probabilities without using ground-truth labels, thereby improving interpretability and comparability across algorithms. However, the quality of this transformation can be different for outliers and inliers. Missing outliers in scenarios where they are of particular interest - such as healthcare, finance, or engineering - can be costly or dangerous. Thus, ensuring good probabilities for outliers is essential. This paper argues that statistical scaling, as commonly used in the literature, does not produce equally good probabilities for outliers as for inliers. Therefore, we propose robust statistical scaling, which uses robust estimators to improve the probabilities for outliers. We evaluate several variants of our method against other outlier score transformations for real-world datasets and outlier detection algorithms, where it can improve the probabilities for outliers.
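
The sketch below only illustrates the underlying idea: Gaussian statistical scaling of raw outlier scores, with a robust variant that replaces the standard deviation by the normalized median absolute deviation (nMAD) as the scale estimate while keeping the sample mean as the center (the combination highlighted in Fig. 4). The exact estimator combinations studied in the paper may differ, and the function name is illustrative.

    # Illustrative sketch of Gaussian statistical scaling of outlier scores and
    # a robust variant that uses the normalized median absolute deviation (nMAD)
    # as the scale estimate. Not necessarily the paper's exact formulation.
    import numpy as np
    from scipy.special import erf

    def gaussian_scaling(scores, robust=False):
        """Map raw outlier scores to [0, 1] 'outlier probabilities'."""
        scores = np.asarray(scores, dtype=float)
        center = scores.mean()                       # sample mean as center
        if robust:
            scale = 1.4826 * np.median(np.abs(scores - np.median(scores)))  # nMAD
        else:
            scale = scores.std()
        z = (scores - center) / (scale * np.sqrt(2))
        return np.maximum(0.0, erf(z))               # scores below center map to 0

    raw = np.array([1.0, 1.1, 0.9, 1.2, 1.0, 1.1, 9.5, 12.0])  # two large scores
    print(gaussian_scaling(raw))                # non-robust: outliers inflate std
    print(gaussian_scaling(raw, robust=True))   # robust: outlier probabilities near 1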


Fig. 1. Example of the function approximation (left) and first-order derivative (right) with all and with half of the observations, based on accuracy as the performance measure.
Fig. 2. FSDEM score based on accuracy for two different feature selection methods and target feature number ranges.
Fig. 3. FSDEM and stability score for the two scenarios.
Selected datasets for the experiments and their characteristics.
FSDEM: Feature Selection Dynamic Evaluation Metric

August 2024 · 18 Reads

Expressive evaluation metrics are indispensable for informative experiments in all areas, and while several metrics are established in some areas, others, such as feature selection, offer only indirect or otherwise limited evaluation metrics. In this paper, we propose a novel evaluation metric to address several problems of its predecessors and allow for flexible and reliable evaluation of feature selection algorithms. The proposed metric is a dynamic metric with two properties that can be used to evaluate both the performance and the stability of a feature selection algorithm. We conduct several empirical experiments to illustrate the use of the proposed metric in the successful evaluation of feature selection algorithms. We also provide a comparison and analysis to show the different aspects involved in the evaluation of feature selection algorithms. The results indicate that the proposed metric is successful in carrying out the evaluation task for feature selection algorithms. This paper is an extended version of a paper accepted at SISAP 2024.
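
FSDEM itself is defined in the paper and is not reproduced here; the sketch below only illustrates the general idea of a dynamic evaluation: tracking a performance measure as the number of selected features grows and summarizing the level and slope of the resulting curve (cf. Fig. 1). The dataset, selector, and classifier are arbitrary choices for illustration.

    # Illustrative sketch of a dynamic feature-selection evaluation: measure
    # accuracy as a function of the number of selected features and summarize
    # the curve and its first-order differences. Not the exact FSDEM definition.
    import numpy as np
    from sklearn.datasets import load_breast_cancer
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = load_breast_cancer(return_X_y=True)
    ks = range(1, X.shape[1] + 1)

    # Performance curve over the number of selected features
    curve = []
    for k in ks:
        model = make_pipeline(SelectKBest(f_classif, k=k),
                              StandardScaler(),
                              LogisticRegression(max_iter=1000))
        curve.append(cross_val_score(model, X, y, cv=5).mean())
    curve = np.array(curve)

    # Summaries of the curve: overall level and approximate first-order derivative
    print("mean performance over k:", curve.mean())
    print("mean slope:", np.diff(curve).mean())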


Evaluating outlier probabilities: assessing sharpness, refinement, and calibration using stratified and weighted measures

July 2024 · 31 Reads · 3 Citations · Data Mining and Knowledge Discovery

An outlier probability is the probability that an observation is an outlier. Typically, outlier detection algorithms calculate real-valued outlier scores to identify outliers. Converting outlier scores into outlier probabilities increases the interpretability of outlier scores for domain experts and makes outlier scores from different outlier detection algorithms comparable. Although several transformations to convert outlier scores to outlier probabilities have been proposed in the literature, there is no common understanding of good outlier probabilities and no standard approach to evaluate outlier probabilities. We require that good outlier probabilities be sharp, refined, and calibrated. To evaluate these properties, we adapt and propose novel measures that use ground-truth labels indicating which observations are outliers or inliers. The refinement and calibration measures partition the outlier probabilities into bins or use kernel smoothing. Compared to the evaluation of probabilities in supervised learning, several aspects are relevant when evaluating outlier probabilities, mainly due to the imbalanced and often unsupervised nature of outlier detection. First, stratified and weighted measures are necessary to evaluate the probabilities of outliers well. Second, the joint use of the sharpness, refinement, and calibration errors makes it possible to independently measure the corresponding characteristics of outlier probabilities. Third, equiareal bins, where the product of the number of observations per bin and the bin length is constant, balance the number of observations per bin against the bin length, allowing accurate evaluation of different outlier probability ranges. Finally, we show that good outlier probabilities, according to the proposed measures, improve the performance of the follow-up task of converting outlier probabilities into labels for outliers and inliers.
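
As a rough, simplified illustration of stratified evaluation with binning (not the article's exact sharpness, refinement, and calibration measures, and without its equiareal binning), the sketch below computes a binned calibration error separately for ground-truth outliers and inliers. All names and the toy data are illustrative.

    # Simplified sketch: a binned calibration error computed separately
    # (stratified) for ground-truth outliers and inliers.
    import numpy as np

    def binned_calibration_error(probs, labels, n_bins=10):
        """Frequency-weighted squared gap between mean probability and outlier rate per bin."""
        probs, labels = np.asarray(probs, float), np.asarray(labels, int)
        edges = np.linspace(0.0, 1.0, n_bins + 1)
        err, total = 0.0, len(probs)
        for lo, hi in zip(edges[:-1], edges[1:]):
            mask = (probs >= lo) & (probs <= hi) if hi == 1.0 else (probs >= lo) & (probs < hi)
            if mask.any():
                err += mask.sum() / total * (probs[mask].mean() - labels[mask].mean()) ** 2
        return err

    rng = np.random.default_rng(0)
    labels = np.r_[np.zeros(95, int), np.ones(5, int)]            # 5% outliers
    probs = np.clip(labels * 0.7 + rng.uniform(0, 0.3, 100), 0, 1)

    # Stratified evaluation: outliers and inliers assessed separately, not pooled
    print("inliers :", binned_calibration_error(probs[labels == 0], labels[labels == 0]))
    print("outliers:", binned_calibration_error(probs[labels == 1], labels[labels == 1]))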


Citations (72)


... In short, Gaussian normalization democratizes algorithms by making their scores and dynamics considerably more similar and comparable (less dependent on the algorithm used), although at the cost of a notable reduction in the variance of scores and suppressing dynamic nuances that might be used for the diagnosis of the data context and the algorithm performance (see "Appendix 2"). In fact, transforming raw scores into probability estimates does not guarantee that these new probability scores are fully well defined or reliable either (Röchner et al. 2024). Therefore, assuming the loss of information and a possible distortion due to the statistical modeling, Gaussian normalization is advisable in most practical cases, since it largely decouples the score from the algorithm. ...

Reference:

What do anomaly scores actually mean? Dynamic characteristics beyond accuracy
Evaluating outlier probabilities: assessing sharpness, refinement, and calibration using stratified and weighted measures

Data Mining and Knowledge Discovery

... The dimensionality of the reduced representation should ideally match the intrinsic dimensionality of the data [17]. The bare minimum of parameters required to explain the observable qualities of the data is known as the intrinsic dimension of the data [18,19]. Dimensionality reduction is crucial in many fields because it reduces the negative effects of dimensionality and other high-dimensional spatial characteristics [1,20]. ...

Dimensionality-Aware Outlier Detection

... Our work advances the research on observers-based unsupervised learning, which originated SDO [4], SDOstream [5] and SDOclust [6]. The remainder of this paper is structured as follows: In Section II, we introduce observersbased OD. ...

SDOclust: Clustering with Sparse Data Observers
  • Citing Chapter
  • October 2023

Lecture Notes in Computer Science

... informative features without requiring the use of labels [11,22,24]. Feature selection takes a global approach to select a subset of representational features for a full dataset as opposed to local approaches such as subspace methods for clustering [12] or outlier detection [26], or to projection and approximation approaches for nearest neighbor search [3], where different feature subsets or combinations could be relevant for different patterns or locations in a dataset. ...

Clustering High-Dimensional Data
  • Citing Chapter
  • February 2023

... Failure to adapt the model can lead to degraded performance and an inability to detect emerging anomalies or recognize shifts in normal behaviour. Our review reveals that anomaly detection and model updates are the main focus of existing research on streaming anomaly detection. Although some surveys (Ntroumpogiannis et al. 2023; Lu et al. 2023; Vázquez et al. 2023) have assessed the current streaming anomaly detection algorithms, their assessments have the following limitations: ...

Anomaly detection in streaming data: A comparison and evaluation study
  • Citing Article
  • July 2023

Expert Systems with Applications

... Filter methods are faster than wrapper methods. On the other hand, outlier rejection techniques are divided into three primary classes: statistical, cluster, and neighbor procedures [24,25]. Information gain (IG) acts as a filter mechanism in P²S feature selection to rapidly identify the most informative collection of features [26]. ...

On the evaluation of outlier detection and one-class classification: a comparative study of algorithms, model selection, and ensembles

Data Mining and Knowledge Discovery

... However, football tactics may evolve with time [30] and recent publications have considered on and off-ball movements' contribution to creating promising opportunities during open play events [31][32][33][34]. McCarthy et al. evaluated on-ball passing sequences occurring in specific field zones that lead to or concede goal-scoring opportunities [28]. Shitansu et al. utilized on-ball player-to-player passing interactions to predict teams that create goal-scoring opportunities (shots) in event sequences [35]. ...

Analyzing Passing Sequences for the Prediction of Goal-Scoring Opportunities
  • Citing Chapter
  • February 2023

Communications in Computer and Information Science

... Researchers have developed methods to detect and correct biases in NLP. These include statistical techniques to identify biased patterns in data [34] and innovative approaches using advanced machine learning to explore different aspects of bias, such as gender, race, and disability [35][36][37]. Notably, efforts have been made to debias word embeddings and mitigate attribute bias in tasks like natural language inference [38,39]. Moreover, emerging research has expanded the understanding of bias beyond simple demographic factors, investigating how biases related to race, gender, disability, nationality, and religion are replicated in NLP models [40][41][42]. ...

Power of Explanations: Towards automatic debiasing in hate speech detection
  • Citing Conference Paper
  • October 2022

... This kind of data is prevalent in various practical applications, such as social networks and biological networks [16,35,37,43]. Recently, there has been growing interest in unsupervised multiplex graph learning (UMGL) due to its ability to extract valuable information from multiple relationships among nodes without relying on label information [3,21,31,39]. ...

Unsupervised Representation Learning on Attributed Multiplex Network
  • Citing Conference Paper
  • October 2022

... To this end, we developed BrainAlign to map the mouse and human brain spatial transcriptomics into a shared space without labels for whole-brain alignment based on two microarray sequencing datasets 17,19 . To overcome the issue of data scale discrepancy and learn general embeddings for human and mouse brain STs, we constructed a heterogeneous graph neural network trained with a kind of contrastive loss in a self-supervised manner 23 . We showed that BrainAlign mapped mouse and human spots into the same space and facilitated the alignment of most homologous regions between mouse and human. ...

A Simple Meta-path-free Framework for Heterogeneous Network Embedding
  • Citing Conference Paper
  • October 2022