January 2025
·
2 Reads
September 2024
·
6 Reads
The Computer Journal
Protocol reverse engineering is crucial for normative verification, malware behavior analysis, and vulnerability discovery. However, uncovering the structural features of binary protocols concealed within dense data representations remains a significant challenge. Accurately identifying keyword segments associated with message types is a prerequisite for meaningful semantic analysis and protocol state machine reduction. In this work, we introduce a novel approach for inferring keywords from binary protocols based on probabilistic statistics. Working at byte granularity, our method employs heuristic rules to filter out offset positions that are clearly unrelated to message types. We further filter the candidate byte offsets using constraint relations and provide a probabilistic ranking of each offset as the keyword segment. To enhance the reliability of keyword segment inference, we use a Monte Carlo algorithm to assess the difference between message clustering by candidate byte offset and random message clustering, and reorder the candidate offsets according to the results. We then select the optimal values from both orderings and present the final inference results. Experimental results demonstrate that our method identifies keyword segments more accurately than previous techniques.
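The heuristic filtering step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' algorithm: the rule used here (drop offsets that are constant or that take too many distinct values) and the names `candidate_offsets` and `max_types` are assumptions chosen for illustration, and the constraint-relation and Monte Carlo stages are not shown.

```python
from collections import Counter


def candidate_offsets(messages, max_types=16):
    """Return byte offsets whose values could plausibly encode a message type.

    Hypothetical heuristic (an assumption, not taken from the paper): an offset
    is dropped if its byte is constant across all messages, or if it takes more
    than `max_types` distinct values.
    """
    # Only offsets present in every message can be compared.
    min_len = min(len(m) for m in messages)
    kept = []
    for off in range(min_len):
        distinct = len(Counter(m[off] for m in messages))
        if 1 < distinct <= max_types:
            kept.append(off)
    return kept


# Toy trace: offset 0 is a constant magic byte, offset 1 is a type field
# with three values, offset 2 is a per-message counter.
msgs = [bytes([0xAA, i % 3, i]) for i in range(30)]
print(candidate_offsets(msgs, max_types=4))
```

On this toy trace only the type field survives, because the magic byte never varies and the counter takes a different value in every message.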
August 2024
May 2024
·
18 Reads
·
1 Citation
Website fingerprinting (WF) is a traffic analysis attack that enables local eavesdroppers to infer a user’s browsing destination, even when the user is on the Tor anonymity network. While advanced attacks based on deep neural networks (DNNs) can perform feature engineering automatically and attain accuracy rates of over 98%, research has demonstrated that DNNs are vulnerable to adversarial samples. As a result, many researchers have explored using adversarial samples as a defense mechanism against DNN-based WF attacks and have achieved considerable success. However, these methods suffer from high bandwidth overhead or require access to the target model, which is unrealistic. This paper proposes CMAES-WFD, a black-box WF defense based on adversarial samples. The generation of adversarial examples is formulated as a constrained optimization problem and solved with the Covariance Matrix Adaptation Evolution Strategy (CMAES) optimization algorithm. Perturbations are injected into local parts of the original traffic to control bandwidth overhead. In the experiments, CMAES-WFD reduced the accuracy of Deep Fingerprinting (DF) and VarCnn to below 8.3%, with a bandwidth overhead of at most 14.6% and 20.5%, respectively. In particular, for Automated Website Fingerprinting (AWF), which has a simple structure, CMAES-WFD reduced the classification accuracy to only 6.7% with a bandwidth overhead of less than 7.4%. Moreover, CMAES-WFD was shown to be robust against adversarial training to a certain extent.
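To make the notion of bandwidth overhead from local perturbation concrete, here is a minimal sketch. It assumes a trace is a list of cell directions (+1 outgoing, -1 incoming), a common representation in the WF literature that the abstract itself does not specify, and the helpers `inject_dummies` and `bandwidth_overhead` are hypothetical names. The CMAES search that would choose where and how much to inject is not shown.

```python
def inject_dummies(trace, start, count, direction=1):
    """Insert `count` dummy cells of one direction at index `start`."""
    if not 0 <= start <= len(trace):
        raise ValueError("start must lie inside the trace")
    return trace[:start] + [direction] * count + trace[start:]


def bandwidth_overhead(original, defended):
    """Fraction of extra cells relative to the original trace length."""
    return (len(defended) - len(original)) / len(original)


# Toy trace of 100 cells; the perturbation is confined to one local position.
original = [1, -1, -1, 1] * 25
defended = inject_dummies(original, start=10, count=10)
print(bandwidth_overhead(original, defended))
```

Confining the injected cells to a short window keeps the overhead proportional to the window size rather than to the trace length, which is the trade-off the abstract refers to.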
April 2024
·
1 Read
·
1 Citation
February 2024
·
9 Reads
·
1 Citation
Addressing inherent limitations in distinguishing metrics relying solely on Euclidean distance, especially within the context of geo-indistinguishability (Geo-I) as a protection mechanism for location-based service (LBS) privacy, this paper introduces an innovative and comprehensive metric. Our proposed metric not only incorporates geographical information but also integrates semantic, temporal, and query data, serving as a powerful tool to foster semantic diversity, ensure high service similarity, and promote spatial dispersion. We extensively evaluate our technique by constructing a comprehensive metric for Dongcheng District, Beijing, using road network data obtained through the OSMNX package and semantic and temporal information acquired through Gaode Map. This holistic approach proves highly effective in mitigating adversarial attacks based on background knowledge. Compared with existing methods, our proposed protection mechanism showcases a minimum 50% reduction in service quality and an increase of at least 0.3 times in adversarial attack error using a real-world dataset from Geolife. The simulation results underscore the efficacy of our protection mechanism in significantly enhancing user privacy compared to existing methodologies in the LBS location privacy-protection framework.
January 2024
·
1 Read
November 2023
·
9 Reads
·
1 Citation
September 2023
·
6 Reads
June 2023
·
32 Reads
Deep learning has achieved good results in traffic classification in recent years owing to its strong feature representation ability. However, existing traffic classification technology cannot meet the requirements for incremental learning of tasks in online scenarios. In addition, because malicious traffic is highly concealed and updated quickly, few labeled samples can be captured, and such small samples cannot drive neural network training, resulting in poor performance of the classification model. Therefore, this paper proposes an incremental learning method for small-sample malicious traffic classification. The method uses a pruning strategy to find the redundant network structure and dynamically allocates redundant neurons for training, based on the proposed measurement method, according to the difficulty of the new class. This enables the network to perform incremental learning without excessively consuming storage and computing resources, and the reasonable allocation improves the classification accuracy of new classes. At the same time, through the knowledge transfer method, the model can reduce catastrophic forgetting of old classes, relieve the pressure of training a large number of parameters with small-sample data, and improve classification performance. Experiments involving multiple datasets and settings show that our method outperforms the established baseline in classification accuracy while consuming 50% less memory.
... However, most network intrusion datasets have the problem of imbalanced data distribution [30]. Based on adversarial samples of traffic obfuscation in the IDS, an adversarial obfuscation method based on the generative adversarial network was proposed [31], which clearly divided functional and nonfunctional features and only added perturbation to nonfunctional features, so as to avoid model detection while maintaining functionality of traffic. The conventional deep learning (DL) approaches frequently grapple with issues like data loss and overfitting when confronted with class-imbalanced datasets. ...
April 2024
... To validate the advantages of MTMSGDM proposed by the research institute, it was tested in a larger scale real-world network scenario, which contains thousands of links and has been subjected to attacks including DDoS attacks, malware propagation, and information theft. The research mainly compares the fair allocation model, single-stage Stackelberg model, Adversarial Website Fingerprinting Defense Based on Covariance Matrix Adaptation Evolution Strategy (CMAES-WFD) [32], and Multi-level Stackelberg model [33] with the MTMSGDM model. As shown in Figure 12, in large-scale network scenarios, the average defense effectiveness of MTMS-GDM is as high as 273, followed by CMAES-WFD with an average defense effectiveness of 261. ...
May 2024
... Layer 3 is the secondary type, and the leaf nodes in layer 4 contain the real locations of specific points of interest on the map. [29] is the number of hops that must be traversed between the leaf nodes corresponding to two locations in the location semantic tree, which measures the semantic similarity between locations. The greater the semantic distance, the greater the semantic difference between locations. ...
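The hop-count distance in the excerpt above can be sketched as follows. This is an illustration only: the tree contents, the child-to-parent dictionary encoding, and the function name `semantic_hops` are assumptions, as are layers 1 and 2 (taken here to be a root and a primary type), since the excerpt elides them.

```python
def semantic_hops(parent, a, b):
    """Number of edges on the path between nodes `a` and `b` in a tree.

    `parent` maps each node to its parent; the root has no entry.
    """
    # Record how many hops each ancestor of `a` is from `a`.
    dist_a = {}
    node, d = a, 0
    while node is not None:
        dist_a[node] = d
        node = parent.get(node)
        d += 1
    # Walk up from `b` until the first shared ancestor is reached.
    node, d = b, 0
    while node is not None:
        if node in dist_a:
            return dist_a[node] + d
        node = parent.get(node)
        d += 1
    raise ValueError("nodes are not in the same tree")


# Hypothetical four-layer tree: root, primary type, secondary type, POI.
parent = {
    "food": "root", "shopping": "root",
    "cafe": "food", "restaurant": "food", "mall": "shopping",
    "Cafe A": "cafe", "Cafe B": "cafe",
    "Restaurant C": "restaurant", "Mall D": "mall",
}
print(semantic_hops(parent, "Cafe A", "Cafe B"))
print(semantic_hops(parent, "Cafe A", "Restaurant C"))
print(semantic_hops(parent, "Cafe A", "Mall D"))
```

In this toy tree, two points of interest that share a secondary type are closer than two that share only a primary type, which in turn are closer than two that meet only at the root.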
February 2024
... Dai et al. [3] comprehensively extracted features from multiple angles, including traffic statistical features, TLS handshake fields, and certificates, and applied various machine learning models for classification, with the extreme gradient boosting (XGBoost) model achieving the highest accuracy rate of 97.71%. Gu et al. [4] used three independent feature extraction networks for pre-training, fully exploring the diversity and heterogeneity of TLS traffic. However, as the TLS protocol continues to evolve, the features selected by this method need to be updated promptly. ...
November 2023
... A comprehensive analysis of the challenges and prospects of machine learning in smart grid cybersecurity is offered by Berghout et al. 2022. A number of writers have examined approaches, algorithms, and assessment methods for moving target defense strategies, such as classification and categorization (Navas et al., 2020; Sun et al., 2023). However, they have yet to give much attention to deep learning techniques. ...
April 2023
... Dimensionality reduction techniques and feature selection are also critical preprocessing steps for deriving optimal and minimal subsets of relevant input features, which facilitates more efficient learning [45][46][47][48]. Stacked autoencoders (SAEs) are widely used in unsupervised feature learning and dimensionality reduction to improve intrusion detection, and Muhammad et al. [49] proposed the use of SAE with two latent layers, succeeded by a supervised deep neural network classifier. ...
February 2023
... Consequently, current methodologies typically meticulously analyze the sample distribution of the image data when handling hyperspectral image data features stemming from multiple source domains. This scrutiny aims to ensure that the network model adeptly captures the distribution characteristics of each source domain [20][21][22][23][24]. ...
December 2022
... Zhang et al. [22] also used 8 MCFP malware classes in detecting encrypted malicious traffic. Li et al. [23] selected a random set of 20 MCFP malware classes for unknown malware class detection. Zhao et al. [24] used 10 MCFP malware classes in their prototype based learning method in malware classification. ...
January 2023
... To detect imbalanced traffic, Al-Yaseen et al. [18] proposed a hybrid IDS combining support vector machine, extreme learning machine and K-means clustering algorithm to improve the detection rate of uncommon attacks. Ahmim et al. [19] proposed HCPTC-IDS, an IDS based on the predicted probabilities of decision trees. The HCPTC-IDS system consists of two layers: the first layer is a decision tree, and the second layer is an end classifier that combines the different probabilities of the first layer to make predictions. ...
November 2022
... Recently, a new series of WF attacks based on deep learning (DL) has been proposed to improve the accuracy, effectiveness, and robustness of WF attacks [8][9][10][11][12][13][33][34][35]. They have maintained more than 90% accuracy, become state-of-the-art WF attacks, and are used to benchmark other attacks and defenses. ...
October 2021