Khalid Ismail’s research while affiliated with Birmingham City University and other places


Publications (5)


Figures: OOP Poisoned Model Generation · Surrogate Model Development · Calculating Distances from Decision Boundaries · GMM visualization of features relationship in the dataset with PCA reduction · Performance analysis of support vector machines (SVM) with consistent poisoning · (+6 more)

Outlier-oriented poisoning attack: a grey-box approach to disturb decision boundaries by perturbing outliers in multiclass learning
Article · Publisher preview available · February 2025 · 9 Reads · 1 Citation

Mohamed Ben Farah · Khalid Ismail

Poisoning attacks are a primary threat to machine learning (ML) models, aiming to compromise their performance and reliability by manipulating training datasets. This paper introduces a novel attack, the outlier-oriented poisoning (OOP) attack, which manipulates the labels of the samples most distant from the decision boundaries. To ascertain the severity of the OOP attack for different degrees of poisoning (5–25%), we analyzed the variance, accuracy, precision, recall, F1-score, and false positive rate of the chosen ML models. Benchmarking the OOP attack, we have analyzed key characteristics of multiclass machine learning algorithms and their sensitivity to poisoning attacks. Our analysis helps in understanding the behaviour of multiclass models against data poisoning attacks and contributes to effective mitigation of such attacks. Utilizing three publicly available datasets (IRIS, MNIST, and ISIC), our analysis shows that KNN and GNB are the most affected algorithms, with decreases in accuracy of 22.81% and 56.07% for the IRIS dataset with 15% poisoning, whereas, for the same poisoning level and dataset, Decision Trees and Random Forest are the most resilient algorithms, with the least accuracy disruption (12.28% and 17.52%). We have also analyzed the correlation between the number of dataset classes and the performance degradation of the models. Our analysis highlighted that the number of classes is inversely proportional to the performance degradation, specifically the decrease in accuracy of the models, which is normalized with an increasing number of classes. Further, our analysis identified that an imbalanced dataset distribution can aggravate the impact of poisoning on machine learning models.
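
To make the attack concrete, the sketch below illustrates the general idea under stated assumptions: a linear SVM surrogate approximates the victim's decision boundaries (the grey-box setting), a confidence-margin score stands in for distance from the boundaries, and the farthest samples have their labels flipped. The function name oop_poison, the surrogate choice, and the margin proxy are illustrative, not the authors' released implementation.

```python
# Minimal sketch of an outlier-oriented label-flipping attack (illustration only).
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import LinearSVC

def oop_poison(X, y, poison_rate=0.15, random_state=0):
    rng = np.random.default_rng(random_state)
    # Grey-box setting: a surrogate model approximates the victim's decision boundaries.
    surrogate = LinearSVC(dual=False, max_iter=5000).fit(X, y)
    scores = surrogate.decision_function(X)              # shape (n_samples, n_classes)
    # Distance proxy: margin of the true-class score over the runner-up class score.
    true_scores = scores[np.arange(len(y)), y]
    masked = scores.copy()
    masked[np.arange(len(y)), y] = -np.inf
    margin = true_scores - masked.max(axis=1)
    # Poison the samples farthest from the boundaries (largest margins).
    n_poison = int(poison_rate * len(y))
    idx = np.argsort(margin)[-n_poison:]
    y_poisoned = y.copy()
    classes = np.unique(y)
    for i in idx:
        y_poisoned[i] = rng.choice(classes[classes != y[i]])  # flip to a different class
    return y_poisoned, idx

X, y = load_iris(return_X_y=True)
y_poisoned, flipped = oop_poison(X, y, poison_rate=0.15)
print(f"Flipped {len(flipped)} of {len(y)} labels")
```

Retraining any of the studied classifiers on (X, y_poisoned) and comparing metrics against the clean baseline reproduces, in outline, the per-model sensitivity analysis described above.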



Deep behavioral analysis of machine learning algorithms against data poisoning

November 2024 · 59 Reads · 2 Citations

Poisoning attacks represent one of the most common and practical adversarial attempts on machine learning systems. In this paper, we have conducted a deep behavioural analysis of six machine learning algorithms, analyzing the impact of poisoning and the correlation between poisoning levels and classification accuracy. Adopting an empirical approach, we highlight the practical feasibility of data poisoning, comprehensively analyzing the factors of individual algorithms affected by poisoning. We used public datasets (UNSW-NB15, BotDroid, CTU13, and CIC-IDS-2017) and varying poisoning levels (5–25%) to conduct a rigorous analysis across different settings. In particular, we analyzed the accuracy, precision, recall, F1-score, false positive rate, and ROC of the chosen algorithms. Further, we conducted a sensitivity analysis of each algorithm to understand the impact of poisoning on its performance and the characteristics underpinning its susceptibility to data poisoning attacks. Our analysis shows that, for 15% poisoning of the UNSW-NB15 dataset, the accuracy of the Decision Tree decreases by 15.04%, with an increase of 14.85% in the false positive rate. Further, with 25% poisoning of the BotDroid dataset, the accuracy of K-nearest neighbours (KNN) decreases by 15.48%. On the other hand, Random Forest is comparatively more resilient against poisoned training data, with a decrease in accuracy of 8.5% for 15% poisoning of the UNSW-NB15 dataset and 5.2% for the BotDroid dataset. Our results highlight that 10–15% dataset poisoning is the most effective poisoning rate, significantly disrupting classifiers without introducing overfitting, whereas 25% poisoning is detectable because of high performance degradation and overfitting of the algorithms. Our analysis also helps in understanding how asymmetric features and noise affect the impact of data poisoning on machine learning classifiers. Our experimentation and analysis are publicly available at: https://github.com/AnumAtique/Behavioural-Analaysis-of-Poisoned-ML/.
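
The kind of sweep described above can be reproduced in outline as follows. This is a minimal sketch on a synthetic dataset rather than UNSW-NB15 or BotDroid, using random label flipping as a stand-in for the paper's poisoning strategy, so the exact numbers will differ.

```python
# Minimal sketch: accuracy degradation of a classifier across poisoning levels (illustration only).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

rng = np.random.default_rng(0)
baseline = None
for level in [0.00, 0.05, 0.10, 0.15, 0.20, 0.25]:
    y_poisoned = y_tr.copy()
    idx = rng.choice(len(y_tr), size=int(level * len(y_tr)), replace=False)
    y_poisoned[idx] = 1 - y_poisoned[idx]              # flip binary labels at the chosen rate
    clf = RandomForestClassifier(random_state=0).fit(X_tr, y_poisoned)
    acc = accuracy_score(y_te, clf.predict(X_te))
    baseline = acc if baseline is None else baseline   # clean-data accuracy at level 0
    print(f"poisoning={level:.0%}  accuracy={acc:.3f}  drop={baseline - acc:.3f}")
```

Swapping RandomForestClassifier for any of the six studied algorithms, and accuracy for the other reported metrics, repeats the same behavioural comparison for that model.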


Outlier-Oriented Poisoning Attack: A Grey-box Approach to Disturb Decision Boundaries by Perturbing Outliers in Multiclass Learning

November 2024 · 11 Reads

Poisoning attacks are a primary threat to machine learning models, aiming to compromise their performance and reliability by manipulating training datasets. This paper introduces a novel attack, the Outlier-Oriented Poisoning (OOP) attack, which manipulates the labels of the samples most distant from the decision boundaries. The paper also investigates the adverse impact of such attacks on different machine learning algorithms within a multiclass classification scenario, analyzing their variance and the correlation between different poisoning levels and performance degradation. To ascertain the severity of the OOP attack for different degrees of poisoning (5%–25%), we analyzed the variance, accuracy, precision, recall, F1-score, and false positive rate for the chosen ML models. Benchmarking our OOP attack, we have analyzed key characteristics of multiclass machine learning algorithms and their sensitivity to poisoning attacks. Our experimentation used three publicly available datasets: IRIS, MNIST, and ISIC. Our analysis shows that KNN and GNB are the most affected algorithms, with decreases in accuracy of 22.81% and 56.07% while the false positive rate increases to 17.14% and 40.45% for the IRIS dataset with 15% poisoning. Further, Decision Trees and Random Forest are the most resilient algorithms, with the least accuracy disruption of 12.28% and 17.52% with 15% poisoning of the IRIS dataset. We have also analyzed the correlation between the number of dataset classes and the performance degradation of the models. Our analysis highlighted that the number of classes is inversely proportional to the performance degradation, specifically the decrease in accuracy of the models, which is normalized with an increasing number of classes. Further, our analysis identified that an imbalanced dataset distribution can aggravate the impact of poisoning for machine learning models.
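
Since the false positive rate is reported per model in a multiclass setting, the helper below shows one common way to compute it from the confusion matrix, macro-averaged over classes. The function name and the macro averaging are assumptions for illustration; the paper does not specify its exact aggregation.

```python
# Minimal sketch: macro-averaged false positive rate for multiclass predictions (illustration only).
import numpy as np
from sklearn.metrics import confusion_matrix

def macro_false_positive_rate(y_true, y_pred):
    cm = confusion_matrix(y_true, y_pred)
    fp = cm.sum(axis=0) - np.diag(cm)                              # predicted as class c, actually another class
    tn = cm.sum() - cm.sum(axis=0) - cm.sum(axis=1) + np.diag(cm)  # neither true nor predicted class c
    return float(np.mean(fp / (fp + tn)))
```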


Machine learning security and privacy: a review of threats and countermeasures

April 2024 · 274 Reads · 8 Citations

EURASIP Journal on Information Security

Machine learning has become prevalent in transforming diverse aspects of our daily lives through intelligent digital solutions. Advanced disease diagnosis, autonomous vehicular systems, and automated threat detection and triage are some prominent use cases. Furthermore, the increasing use of machine learning in critical national infrastructures such as smart grids, transport, and natural resources makes it an attractive target for adversaries. The threat to machine learning systems is aggravated by the ability of malicious actors to reverse engineer publicly available models, gaining insight into the algorithms underpinning these models. Focusing on the threat landscape for machine learning systems, we have conducted an in-depth analysis to critically examine the security and privacy threats to machine learning and the factors involved in developing these adversarial attacks. Our analysis highlighted that feature engineering, model architecture, and targeted system knowledge are crucial aspects in formulating these attacks. Furthermore, one successful attack can lead to other attacks; for instance, poisoning attacks can lead to membership inference and backdoor attacks. We have also reviewed the literature concerning methods and techniques to mitigate these threats, including data sanitization, adversarial training, and differential privacy, whilst identifying their limitations. Cleaning and sanitizing datasets may lead to other challenges, including underfitting and degraded model performance, whereas differential privacy does not completely preserve a model's privacy. Leveraging the analysis of attack surfaces and mitigation techniques, we identify potential research directions to improve the trustworthiness of machine learning systems.
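
Of the mitigations surveyed, data sanitization is the easiest to illustrate. The sketch below shows one simple heuristic (not a method prescribed by the paper): drop training samples whose label disagrees with most of their nearest neighbours. This filters many label-flip poisons but, as the review notes, risks discarding legitimate data and underfitting.

```python
# Minimal sketch of a k-NN label-agreement sanitization heuristic (illustration only).
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_label_sanitize(X, y, k=10, agreement=0.5):
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)     # k + 1 because each point matches itself
    _, idx = nn.kneighbors(X)
    neighbour_labels = y[idx[:, 1:]]                     # drop the self-match in column 0
    # Fraction of neighbours that share each sample's own label.
    support = (neighbour_labels == y[:, None]).mean(axis=1)
    keep = support >= agreement
    return X[keep], y[keep], np.where(~keep)[0]          # cleaned data plus indices of dropped samples
```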

Citations (2)


... The poisoning levels are set between 5% and 25%, in increments of 5%, studying model behavior in different settings. These poisoning limits are inspired by [40], which highlighted that poisoning levels above 25% lead to abrupt performance degradation that is detectable. With our analysis, we have identified important parameters of each algorithm that are sensitive to poisoning attacks, answering how the models are getting misclassified and identifying optimal poisoning rates for each algorithm. ...

Reference:

Outlier-oriented poisoning attack: a grey-box approach to disturb decision boundaries by perturbing outliers in multiclass learning
Deep behavioral analysis of machine learning algorithms against data poisoning

... In order to obtain answers to the research questions formulated in the introductory part of this article, the literature review process guidelines provided in [17] have been used. A typical literature review process defines categories and develops a review protocol for selecting research articles. ...

Machine learning security and privacy: a review of threats and countermeasures

EURASIP Journal on Information Security