Figure 2 - uploaded by Kaize Ding

Anomaly discovery curve results.
Source publication
Anomaly detection on attributed networks is concerned with finding nodes whose patterns or behaviors deviate significantly from the majority of reference nodes. It has been applied successfully in many real-world settings such as network intrusion detection, opinion spam detection and system fault diagnosis, to name a few. Despite their e...
Contexts in source publication
Context 1
... the budget T is set to 250. Figure 2 presents the anomaly discovery curves of all the algorithms on all three datasets from t = 1 to T. We also report the cumulative precision and recall results in Table 3. From the evaluation results, we make the following observations: ...
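As a minimal sketch of how an anomaly discovery curve and the cumulative precision and recall could be computed, assume each query t = 1..T returns binary feedback (1 if the queried node is a true anomaly, 0 otherwise). The function names and the exact metric definitions (precision as anomalies found divided by queries made, recall as anomalies found divided by all anomalies) are assumptions for illustration, not taken from the source publication.

```python
from typing import List, Sequence, Tuple


def discovery_curve(feedback: Sequence[int]) -> List[int]:
    """Cumulative number of true anomalies found after each query t = 1..T."""
    curve, found = [], 0
    for is_anomaly in feedback:
        found += int(is_anomaly)
        curve.append(found)
    return curve


def cumulative_precision_recall(
    feedback: Sequence[int], total_anomalies: int
) -> Tuple[List[float], List[float]]:
    """Assumed definitions: precision@t = found/t, recall@t = found/total."""
    if total_anomalies <= 0:
        raise ValueError("total_anomalies must be positive")
    curve = discovery_curve(feedback)
    precision = [found / t for t, found in enumerate(curve, start=1)]
    recall = [found / total_anomalies for found in curve]
    return precision, recall


# Toy run with a budget of T = 5 queries and 4 anomalies in the whole graph.
feedback = [1, 0, 1, 1, 0]
curve = discovery_curve(feedback)
precision, recall = cumulative_precision_recall(feedback, total_anomalies=4)
print(curve, precision[-1], recall[-1])
```

Plotting `curve` against t for each algorithm would give one line per method, which is the usual shape of such a figure.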
Similar publications
Anomaly detection in multivariate time series is valuable for many applications. In this context, unsupervised and semi-supervised deep learning methods that estimate how normal a new observation is have shown promising results on benchmark datasets. These methods are dependent on a threshold that determines which points should be regarded as anoma...
Citations
... In the past decade, various approaches have been proposed for unsupervised GAD. Reconstruction-based methods [6,9,17,35] usually adopt an autoencoder or GAN as the backbone, aiming to reconstruct the structural or contextual information of the raw graph data. After model training, the objects with higher reconstruction errors are defined as anomalies. ...
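The scoring step described in this snippet can be sketched as follows, assuming a trained model has already produced a reconstructed adjacency matrix and a reconstructed attribute matrix. The weighting parameter `alpha`, the use of a per-node L2 norm, and all function names are illustrative assumptions rather than the definition used by any specific method cited here.

```python
import math
from typing import List

Matrix = List[List[float]]


def row_errors(original: Matrix, reconstructed: Matrix) -> List[float]:
    """L2 norm of the residual of each row (one row per node)."""
    return [
        math.sqrt(sum((a - b) ** 2 for a, b in zip(row, row_hat)))
        for row, row_hat in zip(original, reconstructed)
    ]


def anomaly_scores(
    adj: Matrix, adj_hat: Matrix, attrs: Matrix, attrs_hat: Matrix, alpha: float = 0.5
) -> List[float]:
    """Weighted sum of structure and attribute reconstruction errors (assumed form)."""
    structure = row_errors(adj, adj_hat)
    attribute = row_errors(attrs, attrs_hat)
    return [alpha * s + (1 - alpha) * a for s, a in zip(structure, attribute)]


def rank_nodes(scores: List[float]) -> List[int]:
    """Node indices sorted from most to least anomalous."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy example: node 1 has poorly reconstructed attributes.
adj = [[0.0, 1.0], [1.0, 0.0]]
adj_hat = [[0.0, 1.0], [1.0, 0.0]]
attrs = [[1.0, 0.0], [0.0, 1.0]]
attrs_hat = [[1.0, 0.0], [3.0, 5.0]]
scores = anomaly_scores(adj, adj_hat, attrs, attrs_hat)
print(scores, rank_nodes(scores))
```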
... We follow the strategy in OCGNN [38] and split the dataset into training, validation, and test set by the ratio of 6 : 1 : 3. Baselines. We compare our proposed framework with the following baselines: a heuristic method DEG which directly uses node degree as the anomaly score, two traditional approaches including a density-based method LOF [3] and a clustering-based method SCAN [41], a residual-based approach Radar [23], two contrastive learning-based approaches including CoLA [27] and GRADAT [8], four state-of-the-art hypersphere learning-based approaches including OCGNN [38], AAGNN [51], MHGL [49] and OCGTL [33], four autoencoder-based approaches including Dominant [6], ...
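To make the setup above concrete, here is a rough sketch of the DEG heuristic (node degree used directly as the anomaly score) and a 6 : 1 : 3 index split. It assumes an undirected graph given as an edge list and a random shuffle for the split; the function names, the seed, and the rounding of split sizes are hypothetical choices, not details from the cited work.

```python
import random
from collections import Counter
from typing import List, Sequence, Tuple


def degree_scores(edges: Sequence[Tuple[int, int]], num_nodes: int) -> List[int]:
    """DEG-style score: the degree of each node in an undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return [degree[i] for i in range(num_nodes)]


def split_indices(
    num_nodes: int, ratios: Tuple[int, int, int] = (6, 1, 3), seed: int = 0
) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle node indices and cut them into train/validation/test parts."""
    indices = list(range(num_nodes))
    random.Random(seed).shuffle(indices)
    total = sum(ratios)
    n_train = num_nodes * ratios[0] // total
    n_val = num_nodes * ratios[1] // total
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


edges = [(0, 1), (0, 2), (0, 3), (1, 2)]
scores = degree_scores(edges, num_nodes=5)
train, val, test = split_indices(10)
print(scores, len(train), len(val), len(test))
```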
Graph Anomaly Detection (GAD) plays a vital role in various data mining applications such as e-commerce fraud prevention and malicious user detection. Recently, Graph Neural Network (GNN) based approaches have demonstrated great effectiveness in GAD by first encoding graph data into low-dimensional representations and then identifying anomalies under the guidance of supervised or unsupervised signals. However, existing GNN-based approaches implicitly follow the homophily principle (i.e., the "like attracts like" phenomenon) and fail to learn discriminative embeddings for anomalies that connect to many normal nodes. Moreover, such approaches identify anomalies from a unified global perspective but overlook diversified abnormal patterns conditioned on local graph context, leading to suboptimal performance. To overcome these limitations, we propose a Multi-hypersphere Heterophilic Graph Learning (MHetGL) framework for unsupervised GAD. Specifically, we first devise a Heterophilic Graph Encoding (HGE) module to learn distinguishable representations for potential anomalies by purifying and augmenting their neighborhood in a fully unsupervised manner. Then, we propose a Multi-Hypersphere Learning (MHL) module to enhance the detection capability for context-dependent anomalies by jointly incorporating critical patterns from both global and local perspectives. Extensive experiments on ten real-world datasets show that MHetGL outperforms 14 baselines. Our code is publicly available at https://github.com/KennyNH/MHetGL.
... In [11], the authors study anomaly detection on attributed networks in an interactive setting, in which the system actively communicates with a human expert by issuing a limited number of queries about ground-truth anomalies. The aim is to present as many true anomalies as possible to the human expert within the given budget. ...
Computer networks are vulnerable to cyberattacks that can affect the accessibility, confidentiality, and integrity of mission-critical data. Anomaly detection has become a fundamental area of study because of its wide range of uses, such as detecting unusual network traffic behavior, detecting disease in MRI images, and detecting fraud in credit card transactions. In many real-life anomaly detection problems, we encounter heterogeneous data with different kinds of features, such as categorical and continuous features. Data heterogeneity makes it hard to compare data examples. In addition, data behavior may shift over time in streaming settings. Finally, it is difficult to obtain data labels, since large volumes of data arrive every day and cannot be classified manually. Autoencoders are a kind of feed-forward neural network that can be trained to perform anomaly detection by learning a representation of the ‘normal’ input instances and identifying abnormal instances by comparing the reconstruction error with a predetermined anomaly threshold. This paper concentrates on improving IDS effectiveness with the proposed Stacked Autoencoder Hoeffding Tree approach (SAE-HT), using the Rain Optimization Algorithm (ROA) for feature selection. Experiments on the NSL-KDD dataset show that our model performs well on multi-class classification and achieves higher accuracy than the other ML methods compared.
... 2) Graph-based outlier detection models, including GCN autoencoder (Kipf & Welling, 2016b), GAAN, DOMINANT (Ding et al., 2019), ANOMALOUS (Peng et al., 2018), and SL-GAD (Zheng et al., 2021). 3) Transformation-based outlier detection approaches, such as ...
... Representation-based methods, including EnergyDef (Gong & Sun, 2024), aim to generate synthetic OOD nodes but often fail to capture the true features of real OOD nodes. Graph anomaly detection methods, like DOMINANT (Ding et al., 2019) and SL-GAD (Zheng et al., 2021), detect general anomalies through reconstruction errors, but they struggle to distinguish between OOD nodes and general anomalies. Recent works (Bazhenov et al., 2022; Liu et al., 2023; Ding & Shi, 2023) explore graph-level OOD detection but cannot be directly applied to node-level OOD detection due to the complexity of node dependencies. ...
Detecting out-of-distribution (OOD) nodes in the graph-based machine-learning field is challenging, particularly when in-distribution (ID) node multi-category labels are unavailable. Thus, we focus on feature space rather than label space and find that, ideally, during the optimization of known ID samples, unknown ID samples undergo more significant representation changes than OOD samples, even if the model is trained to fit random targets. We call this the Feature Resonance phenomenon. The rationale behind it is that even without gold labels, the local manifold may still exhibit smooth resonance. Based on this, we further develop a novel graph OOD framework, dubbed Resonance-based Separation and Learning (RSL), which comprises two core modules: (i) a more practical micro-level proxy of feature resonance that measures the movement of feature vectors in one training step; and (ii) integration with a synthetic OOD node strategy to train an effective OOD classifier. Theoretically, we derive an error bound showing the superior separability of OOD nodes during the resonance period. Empirically, RSL achieves state-of-the-art performance, reducing the FPR95 metric by an average of 18.51% across five real-world datasets.
... Traditional approaches like Oddball [1] rely on power-law relationships between local graph features, while more recent deep learning-based approaches are more generalizable. For instance, DOMINANT [7] employs a graph autoencoder to identify anomalies based on graph reconstruction. ComGA [22] introduces a tailored GCN to learn distinguishable node representations by explicitly capturing community structure. ...
... Test-time training baselines include TENT [34], GraphCL [43], and GTrans [13]. As GAD baselines, we consider self-supervised learning-based methods CoLA [21], SL-GAD [48], and HCM-A [11], reconstruction-based methods DOMINANT [7] and ComGA [22], and a local affinity-based method TAM [26]. ...
Graph Anomaly Detection (GAD) has demonstrated great effectiveness in identifying unusual patterns within graph-structured data. However, while labeled anomalies are often scarce in emerging applications, existing supervised GAD approaches are either ineffective or not applicable when moved across graph domains due to distribution shifts and heterogeneous feature spaces. To address these challenges, we present AdaGraph-T3, a novel test-time training framework for cross-domain GAD. AdaGraph-T3 combines supervised and self-supervised learning during training while adapting to a new domain during test time using only self-supervised learning by leveraging a homophily-based affinity score that captures domain-invariant properties of anomalies. Our framework introduces four key innovations to cross-domain GAD: an effective self-supervision scheme, an attention-based mechanism that dynamically learns edge importance weights during message passing, domain-specific encoders for handling heterogeneous features, and class-aware regularization to address imbalance. Experiments across multiple cross-domain settings demonstrate that AdaGraph-T3 significantly outperforms existing approaches, achieving average improvements of over 6.6% in AUROC and 7.9% in AUPRC compared to the best competing model.
... However, these methods have primarily demonstrated promising results on artificial datasets within the general GAD context. The artificial anomalous patterns (e.g., dense subgraph and feature outliers) are relatively straightforward to identify (Ding et al. 2019;Gu and Zou 2024). In contrast, in real-world fraud detection, fraudsters often introduce more heterophilic connections with benign users to conceal their activities. ...
... Therefore, in this work, we aim to address the pressing need for developing unsupervised GFD methods. Graph Anomaly Detection (GAD) is a broader concept than GFD, aiming to identify not only fraudsters but also any rare and unusual patterns that significantly deviate from the majority in graph data (Ding et al. 2019;Zheng et al. 2021b;Wang et al. 2025;Liu et al. 2024b;Cai et al. 2024;Cai and Fan 2022;Zhang et al. 2024;Liu et al. 2023). Therefore, GAD techniques can be directly applied to GFD, especially in unsupervised learning scenarios (Li et al. 2024;Liu et al. 2024a). ...
... Therefore, GAD techniques can be directly applied to GFD, especially in unsupervised learning scenarios (Li et al. 2024;Liu et al. 2024a). Given the broad scope of GAD and the difficulty in obtaining real-world anomalies, many unsupervised GAD methods have been designed and evaluated on several datasets with artificially injected anomalies (Ding et al. 2019;Liu et al. 2021b;Jin et al. 2021;Zheng et al. 2021a;Duan et al. 2023a,b;Pan et al. 2023). Despite their decent performances, these methods rely on the strong homophily of the datasets with injected anomalies, which limits their applications under graph heterophily (Zheng et al. 2022a. ...
Graph fraud detection (GFD) has rapidly advanced in protecting online services by identifying malicious fraudsters. Recent supervised GFD research highlights that heterophilic connections between fraudsters and users can greatly impact detection performance, since fraudsters tend to camouflage themselves by building more connections to benign users. Despite the promising performance of supervised GFD methods, their reliance on labels limits their application to unsupervised scenarios. Additionally, accurately capturing complex and diverse heterophily patterns without labels poses a further challenge. To fill the gap, we propose a Heterophily-guided Unsupervised Graph fraud dEtection approach (HUGE) for unsupervised GFD, which contains two essential components: a heterophily estimation module and an alignment-based fraud detection module. In the heterophily estimation module, we design a novel label-free heterophily metric called HALO, which captures the critical graph properties for GFD, enabling its outstanding ability to estimate heterophily from node attributes. In the alignment-based fraud detection module, we develop a joint MLP-GNN architecture with ranking loss and asymmetric alignment loss. The ranking loss aligns the predicted fraud score with the relative order of HALO, providing an extra robustness guarantee by comparing heterophily among non-adjacent nodes. Moreover, the asymmetric alignment loss effectively utilizes structural information while alleviating the feature-smooth effects of GNNs. Extensive experiments on 6 datasets demonstrate that HUGE significantly outperforms competitors, showcasing its effectiveness and robustness. The source code of HUGE is at https://github.com/CampanulaBells/HUGE-GAD.
... It has found broad applications in areas such as financial fraud detection, transaction analysis, and social networks [1,22,26,30]. Most existing GAD methods, whether supervised or unsupervised, follow a paradigm of training and inference on the same graph, making the trained models struggle to generalize to new/unseen graphs due to the distribution shift between the training and testing graphs [6,12,20,28]. Further, this type of method can become inapplicable in applications where graph data is not accessible during training due to data privacy or other data access issues. Generalist/foundation models for graphs have achieved remarkable progress in tackling these challenges in general tasks such as node classification, graph classification, and link prediction [9,16,32,41,45], but they are difficult to generalize to the GAD task. ...
... Unsupervised methods, which typically assume the absence of both normal and abnormal labels, have garnered significant attention due to their more practical setting assumption on data labels. They generally incorporate some conventional techniques such as reconstruction [6], one-class classification [28,37,43], contrastive learning [20,25], and adversarial learning [5] into graph learning to capture the normal patterns within the graph and then assign an anomaly score for each node based on its deviation from the normal patterns. However, these methods still follow the paradigm of training and inference on the same graph, making them struggle to generalize to new/unseen graphs due to the distribution shift between training and testing set. ...
Graph anomaly detection (GAD) aims to identify abnormal nodes that differ from the majority of the nodes in a graph, which has been attracting significant attention in recent years. Existing generalist graph models have achieved remarkable success in different graph tasks but struggle to generalize to the GAD task. This limitation arises from their difficulty in learning generalized knowledge for capturing the inherently infrequent, irregular and heterogeneous abnormality patterns in graphs from different domains. To address this challenge, we propose AnomalyGFM, a GAD-oriented graph foundation model that supports zero-shot inference and few-shot prompt tuning for GAD in diverse graph datasets. One key insight is that graph-agnostic representations for normal and abnormal classes are required to support effective zero/few-shot GAD across different graphs. Motivated by this, AnomalyGFM is pre-trained to align data-independent, learnable normal and abnormal class prototypes with node representation residuals (i.e., representation deviation of a node from its neighbors). The residual features essentially project the node information into a unified feature space where we can effectively measure the abnormality of nodes from different graphs in a consistent way. This provides a driving force for the learning of graph-agnostic, discriminative prototypes for the normal and abnormal classes, which can be used to enable zero-shot GAD on new graphs, including very large-scale graphs. If there are few-shot labeled normal nodes available in the new graphs, AnomalyGFM can further support prompt tuning to leverage these nodes for better adaptation. Comprehensive experiments on 11 widely-used GAD datasets with real anomalies, demonstrate that AnomalyGFM significantly outperforms state-of-the-art competing methods under both zero- and few-shot GAD settings.
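The abstract describes a representation residual as the deviation of a node's representation from its neighbors. One simple way to sketch that idea, under the assumption that the residual is the node's feature vector minus the mean of its neighbors' vectors, is shown below. The function name, the mean aggregation, and the zero residual for isolated nodes are illustrative assumptions and may differ from what AnomalyGFM actually computes.

```python
from typing import Dict, List


def neighbor_residuals(
    features: List[List[float]], neighbors: Dict[int, List[int]]
) -> List[List[float]]:
    """Residual of each node: its features minus the mean of its neighbors' features."""
    residuals = []
    for i, h in enumerate(features):
        nbrs = neighbors.get(i, [])
        if not nbrs:
            # Assumed convention: an isolated node has a zero residual.
            residuals.append([0.0] * len(h))
            continue
        mean = [
            sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(len(h))
        ]
        residuals.append([x - m for x, m in zip(h, mean)])
    return residuals


# Toy triangle graph where node 2 differs from its two neighbors.
features = [[1.0, 1.0], [1.0, 1.0], [5.0, 1.0]]
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
residuals = neighbor_residuals(features, neighbors)
print(residuals)
```

Because the residual depends only on a node's contrast with its surroundings, it is less tied to the raw feature scale of any one graph, which is the intuition the abstract gives for a unified feature space.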
... Detecting anomalies in graph-structured data is a pivotal task across diverse fields, including social networks (Hassanzadeh et al., 2012), cybersecurity (Wang & Zhu, 2022), transportation systems (Hu et al., 2020), and biological networks (Singh & Vig, 2017). Traditional methods primarily focus on identifying anomalies through structural irregularities or attribute deviations (Ding et al., 2019; Fan et al., 2020), such as abnormal connections or unusual feature values. However, these approaches often overlook the underlying geometric properties of graphs, particularly the graph curvature, which encapsulates essential information about the global and local topology of the graph. ...
... These methods employ autoencoders to reconstruct graph structures (adjacency matrices) and node attributes, characterizing anomalies with a higher reconstruction error as anomalous nodes or substructures are relatively difficult to reconstruct. For example, DOMINANT (Ding et al., 2019) and AnomalyDAE (Fan et al., 2020) use a dual discriminative mechanism to simultaneously detect structural and attribute anomalies by minimizing reconstruction loss in the adjacency and feature matrices. (Gu et al., 2019), this idea has been extended by models like κ-GCN (Bachmann et al., 2020), which uses the κ-stereographic model, and Q-GCN (Xiong et al., 2022), which operates on pseudo-Riemannian manifolds. ...
Does the intrinsic curvature of complex networks hold the key to unveiling graph anomalies that conventional approaches overlook? Reconstruction-based graph anomaly detection (GAD) methods overlook such geometric outliers, focusing only on structural and attribute-level anomalies. To this end, we propose CurvGAD - a mixed-curvature graph autoencoder that introduces the notion of curvature-based geometric anomalies. CurvGAD introduces two parallel pipelines for enhanced anomaly interpretability: (1) Curvature-equivariant geometry reconstruction, which focuses exclusively on reconstructing the edge curvatures using a mixed-curvature, Riemannian encoder and Gaussian kernel-based decoder; and (2) Curvature-invariant structure and attribute reconstruction, which decouples structural and attribute anomalies from geometric irregularities by regularizing graph curvature under discrete Ollivier-Ricci flow, thereby isolating the non-geometric anomalies. By leveraging curvature, CurvGAD refines the existing anomaly classifications and identifies new curvature-driven anomalies. Extensive experimentation over 10 real-world datasets (both homophilic and heterophilic) demonstrates an improvement of up to 6.5% over state-of-the-art GAD methods.
... Anomaly-based detection is a dynamic approach that identifies deviations from normal network behavior to uncover botnet-related activities. Unlike signature-based methods, which rely on predefined patterns, anomaly detection leverages statistical analysis, machine learning, and behavioral modeling to detect previously unknown or evolving botnets [24,25]. This approach operates at multiple granularities, including packet-level and flow-level anomaly detection. ...
The growing threat of social botnets demands advanced detection techniques to identify sophisticated malicious activities within network traffic. This paper introduces a graph-based detection framework leveraging the Composite Node Information - Variance Inflation Factor (CNI-VIF) method for enhanced feature selection. By integrating traditional statistical metrics with graph-specific attributes like centrality measures, CNI-VIF effectively reduces dimensionality while preserving crucial features. The proposed methodology is validated using multiple machine learning models across the diverse botnet datasets CTU-13, IoT-23, and NCC-2, demonstrating superior accuracy, reduced computational overhead, and robust detection performance. The framework integrates machine learning models, including Logistic Regression, Random Forest, SVM, Ensemble, FFNN, and Convolutional Neural Networks, achieving near-perfect detection rates with minimal false positives and false negatives. Furthermore, the proposed methodology reduces computational time by up to 80% compared to the state-of-the-art method, highlighting its suitability for real-time botnet detection in complex datasets. Comparative analysis confirms the methodology's advantage over existing state-of-the-art solutions, emphasizing its practical utility for real-time botnet detection.
... Since we apply the structure-based reconstruction loss as the objective function of traditional graph representation learning L ssl in the current version of GFL-LC, here we decide to compare it with other objective functions that can be used as graph representation learning. The first is the objective function capable of reconstructing both structures and attributes presented in DOM [50]. Intuitively, DOM contains a wider range of views than only the current structure-based reconstruction graph representation learning objective function. ...
Graph few-shot learning aims to achieve node classification with extremely limited label information. To mitigate the performance degradation of the graph neural network model under label scarcity, some previous work has attempted to facilitate the subsequent classification task with the aid of deep clustering. However, these methods merely treat this process as an intermediate step in providing prior knowledge, and do not fully exploit the potential of the clustering technique, which prevents the model from being trained in a more intuitive end-to-end manner. Therefore, we propose a novel graph few-shot learning framework via label-aware clustering (GFL-LC). Benefiting from a tailored clustering process, our framework can obtain the final classification result without a classifier. Specifically, to encourage labeled nodes of the same class to learn similar feature representations that help group them into the same cluster, we first incorporate label information into the graph representation learning process by defining an objective function. Next, after obtaining the preliminary label-aware clustering result, we draw on the idea of reinforcement learning to further improve its quality, introducing optimization guidance through feedback on both the accuracy of labeled node division and structural rationality. Finally, considering that the ultimate task is a classification task, we design an alignment mechanism to process the clustering result and transform it into the classification result. Comprehensive experiments on several benchmark datasets show that GFL-LC outperforms its peers and confirm the superiority of our framework under the few-shot setting.
... Owing to the rarity of anomaly events in real-world applications, existing GAD methods primarily rely on unsupervised learning-based approaches. These methods typically utilize reconstruction-based [8,17,32] and contrastive learning-based techniques [15,20,31,34] to learn normal patterns within unlabeled data. Anomalies are then detected as deviations from these patterns. ...
... Anomalies are then detected as deviations from these patterns. For example, reconstruction-based methods like DOMINANT [8] and ComGA [17] employ autoencoders to reconstruct graph structures and attributes, whereas anomalies are identified through substantial reconstruction errors. On the other hand, contrastive learning-based methods, such as COLA [15] and PREM [20], detect anomalies by comparing nodes with their surrounding structures and identifying anomalies that have large inconsistencies highlighted via contrastive losses. ...
Semi-supervised graph anomaly detection (GAD) has recently received increasing attention. It aims to distinguish anomalous patterns in graphs under the guidance of a moderate amount of labeled data and a large volume of unlabeled data. Although the proposed semi-supervised GAD methods have achieved great success, their performance degrades seriously when the provided labels are extremely limited due to unpredictable factors. Besides, the existing methods primarily focus on anomaly detection in static graphs, and little effort has been devoted to the continuous evolution of graphs over time (dynamic graphs). To address these challenges, we propose a novel GAD framework (EL-DGAD) to tackle the anomaly detection problem in dynamic graphs with extremely limited labels. Specifically, a transformer-based graph encoder model is designed to more effectively preserve evolving graph structures beyond the local neighborhood. Then, we incorporate an ego-context hypersphere classification loss to classify temporal interactions according to their structure and temporal neighborhoods while ensuring the normal samples are mapped compactly against anomalous data. Finally, the above loss is further augmented with an ego-context contrasting module which utilizes unlabeled data to enhance model generalization. Extensive experiments on four datasets and three label rates demonstrate the effectiveness of the proposed method in comparison to the existing GAD methods.