Cyber Threat Detection Using Machine Learning Techniques: A Performance Evaluation Perspective

October 2020

DOI:10.1109/ICCWS48432.2020.9292388

Conference: IEEE International Conference on Cyber Warfare and Security
At: Islamabad, Pakistan

Authors:

Kamran Shaukat Dar

The University of Newcastle, Australia

Shan Chen

Cyber Threat Detection Using Machine Learning Techniques: A Performance Evaluation Perspective

Kamran Shaukat, Suhuai Luo, Shan Chen

The University of Newcastle, Australia

Kamran.shaukat@uon.edu.au

Dongxi Liu

Data61, Commonwealth Scientific and Industrial Research Organization, Australia

Abstract: The present-day world has become fully dependent on cyberspace for every aspect of daily living. The use of cyberspace is rising with each passing day, and the world is spending more time on the Internet than ever before. As a result, the risks of cyber threats and cybercrimes are increasing. The term 'cyber threat' refers to illegal activity performed using the Internet. Cybercriminals keep changing their techniques to get past the wall of protection. Conventional techniques are not capable of detecting zero-day attacks and sophisticated attacks. Thus far, many machine learning techniques have been developed to detect cybercrimes and battle against cyber threats. The objective of this research work is to evaluate some of the widely used machine learning techniques for detecting some of the most threatening cyber threats to cyberspace. Three primary machine learning techniques are investigated: deep belief network, decision tree and support vector machine. We present a brief exploration to gauge the performance of these machine learning techniques in spam detection, intrusion detection and malware detection based on frequently used and benchmark datasets.

Keywords: Cyber Threat; Cybercrime; Performance Evaluation; Machine Learning Application; Intrusion Detection System; Malware Detection; Spam Classification

I. INTRODUCTION

Cyberspace refers to the global environment that facilitates the sharing of electronic resources from all over the world. A resource can be an electronic document, audio, video, image or tweet. Cyberspace incorporates a wide range of components, including the Internet, technically skilled users, system resources, data and untrained users. It provides a global arena with virtually unlimited access to information and resources. At present, cyberspace plays the leading role in data transfer and information exchange, with all its rapidly growing losses and gains. After 2017, cyberspace gained more popularity. Internet usage has risen to 81% in developed countries and is still growing all over the globe [1]. The expansion of cyberspace has also given rise to the risks of cybercrimes and cyber threats.

With the growing range of cyber threats, cyber security has also made a considerable number of enhancements to compete against cybercrimes. Cyber security refers to the set of technologies, technology experts and processes used to protect cyberspace from cybercriminals [2]. There are two main approaches to cyber security: conventional cyber security and automated cyber security. Conventional cyber security has numerous downsides that contribute to strengthening cybercrimes, including unqualified users, weak configuration of system resources and limited access to clean data [3]. The future of cyber security lies in automation. Advanced and automated cyber security techniques are highly needed because they can learn from experience to detect new polymorphic cyberattacks and keep pace with evolving cybercrimes [4].

A cyber threat is an act in which someone attempts to steal information, violate integrity rules or harm a computing device or network. Cyber threats include phishing, malware, attacks on IoT devices, denial-of-service attacks, spam, intrusion on a network or mobile device, financial fraud and ransomware, to name a few [5, 6]. Malware detection, intrusion detection and spam detection are discussed in this paper.

An email that is unwanted or unsolicited is called a spam email. Spam emails are mostly used for advertisement or for spreading fraudulent material. Spam occupies network and computer resources such as bandwidth and memory, and it wastes time [7]. Another cyber threat is malware. Malware, short for malicious software, is software that is installed on a computer to disrupt its operation and harm electronic data. Viruses, worms, ransomware, adware, spyware, malvertising and Trojan horses are considered significant types of malware [8]. Malicious intrusions into computer networks and devices are another cyber threat to cyberspace. These intrusions are used to scan for and identify the vulnerabilities of a network or computer system. An intrusion detection system (IDS) is used to protect against these intrusions. Intrusion detection falls into three classes, namely signature/misuse-based, anomaly-based and hybrid [9, 10].

Machine learning (ML) is the most effective and fundamental strategy to compete against cyber threats and overcome the limitations of conventional security systems [11]. Machine learning is a subclass of artificial intelligence (AI) [12]. The appealing quality of machine learning techniques is that they do not need to be explicitly programmed, as they can automatically learn from experience to generate results [13]. Despite these strengths, machine learning techniques have their own constraints and limitations.

On the strength of these benefits, ML techniques are expanding their scope into almost every area of life, including cyber security [14], medical science [15, 16], education [17, 18], intrusion detection [19, 20], spam detection [21, 22] and malware detection [23]. Almost all well-known machine learning techniques have been applied to detect and classify different cyber threats. Commonly used techniques include decision tree, random forest, naive Bayes, support vector machine, K-nearest neighbor, deep belief network, artificial neural network and K-means, to name a few [24, 25]. In this article, we consider the decision tree, deep belief network and support vector machine techniques, and we compare them based on frequently used and benchmark datasets.

II. LITERATURE REVIEW

The authors in [26] analyzed the applications of widely used machine learning techniques to protect cyberspace from cybercriminals. They also described various obstacles faced during the implementation of machine learning techniques. The work concluded that, although machine learning techniques are opening up various ways to protect cyberspace against cybercriminals, many advancements are still needed to protect the classifiers from adversarial attacks. Machine learning classifiers themselves are highly vulnerable to cyber threats and adversarial attacks.

The authors in [27] presented a brief review of several publications related to the implementation of machine learning models to enhance cyber security. They addressed some commonly faced barriers to machine learning techniques in finding appropriate datasets that are best suited to a specific security problem.

The authors in [28] presented a brief performance comparison of different machine learning techniques, specifically in anomaly detection. They gauged the effect of feature selection on the performance of ML for IDS. They claimed that the convolutional neural network (CNN) classifier is underused and could bring vast advancements in cyber security if it were used to its full potential.

The authors in [29] analyzed the role of various machine learning techniques in spam detection, malware detection and intrusion detection. They claimed that no machine learning technique is immune to cyberattacks, and that every technique is still struggling to keep pace with continuously evolving cybercrimes.

The authors in [30] proposed a novel machine learning technique for spam detection in text messages using content-based features. They concluded that the proposed averaged neural network with content-based feature selection outperformed most recent machine learning techniques in terms of accuracy on the same dataset. The authors in [31] stated that signature-based classification techniques produce high error rates in mobile malware detection. They proposed an image-based deep learning technique for mobile malware detection, aiming to discriminate between malicious families and legitimate applications using grey-scale images.

The authors in [32] proposed a statistical semi-supervised machine learning technique for intrusion detection on Android mobile devices. The increase in data traffic will also give rise to cybercrimes. Consequently, more advanced machine learning techniques need to be developed to detect malicious activities and protect Android mobile devices against advanced cybercrimes.

In this paper, we provide a review of widely used machine learning techniques and gauge their performance in detecting some widely known cybercrimes. We analyze three widely used machine learning techniques, namely decision tree, deep belief network and support vector machine. Most review articles focus on a single threat, whereas we consider three: intrusion, spam and malware. We provide a comparison of the performance of each classifier on frequently used datasets, and we state the computational complexity of each classifier. The following section discusses the fundamentals of machine learning, gives an overview of the considered classifiers and describes the criteria used to evaluate the performance of a classifier. The discussion section covers the cyber threats and provides the performance evaluation in terms of accuracy, recall and precision. Lastly, the conclusion section concludes the study.

III. FUNDAMENTALS OF MACHINE LEARNING

Artificial intelligence is a branch of computer science based on the simulation of the human brain by an artificial entity to automate a given process. Machine learning is a sub-branch of AI. It achieves a specific goal by learning from experience without being explicitly programmed. Hence, machine learning does not need to be fed explicit rules; it learns them from data [33]. There are three sub-branches of machine learning, namely supervised learning, unsupervised learning and semi-supervised learning. In supervised learning, the target class/label is known in advance, whereas the target classes are unknown in unsupervised learning. Unsupervised learning divides the data into clusters based on the similarity between data objects. Semi-supervised learning combines characteristics of both supervised and unsupervised learning.

Decision tree, random forest, naive Bayes, support vector machine, K-nearest neighbour, deep belief network, artificial neural network and K-means are widely used learning techniques for detecting cyber threats. We consider three of them: decision tree, deep belief network and support vector machine. Each technique is briefly described below.

A deep belief network (DBN) is a deep architecture built by stacking layers of Restricted Boltzmann Machines (RBMs). A deep belief network is trained with a greedy, layer-by-layer approach. Every layer communicates with the previous layer and the next layer, but within a layer the nodes do not communicate laterally with each other. Every layer except the first and the last serves as both an input layer and an output layer. The last layer is the classifier layer. The computational complexity of DBN is $O((n+N)k)$, where k is the number of iterations, n is the number of records and N is the number of parameters in the DBN [34].
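The layered structure described above can be illustrated with a short sketch. It shows only forward propagation through stacked layers, where each unit reads the previous layer and never its own layer; the greedy RBM pre-training is omitted. The weights, the layer sizes and the names `layer_forward` and `dbn_forward` are hypothetical and chosen for illustration, not taken from the cited works.

```python
import math


def sigmoid(z):
    """Logistic activation used for the units of each layer."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(weights, biases, inputs):
    """Activate one layer from the previous layer only.

    weights[j][i] links input unit i to unit j of this layer. No unit reads
    another unit of the same layer, so there are no lateral connections.
    """
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def dbn_forward(layers, x):
    """Pass x through every (weights, biases) pair; the last pair acts as the classifier layer."""
    activations = x
    for weights, biases in layers:
        activations = layer_forward(weights, biases, activations)
    return activations


# Hypothetical, hand-picked parameters: 2 inputs -> 3 hidden units -> 1 output.
layers = [
    ([[0.5, -0.5], [1.0, 1.0], [0.0, 0.0]], [0.0, 0.0, 0.0]),
    ([[1.0, -1.0, 0.0]], [0.0]),
]
score = dbn_forward(layers, [1.0, 1.0])
print(score)
```

In a trained network the parameters would come from layer-wise pre-training followed by fine-tuning, which is where the iteration count k in the complexity estimate enters.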

Decision tree (DT) is a supervised machine learning technique. The main components of a decision tree are nodes, paths and leaf nodes. A node can be a root node or an intermediate node. The decision tree applies if-then rules to choose the most suitable splitting node at each level. A leaf node, or terminal node, is an ending node and denotes the decision class [35]. The time complexity of DT is $O(mn^2)$, where n is the number of instances and m is the number of attributes [36, 37].
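As a minimal sketch of how a splitting node can be chosen, the example below scores each attribute by information gain and picks the highest one. Information gain is an assumption made for illustration (the text does not name a split criterion), and the toy attributes `has_link` and `all_caps` and their labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the rows on one attribute."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(len(part) / len(labels) * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels):
    """Attribute to test at the current node: the one with the highest gain."""
    return max(rows[0].keys(), key=lambda a: information_gain(rows, labels, a))


# Hypothetical toy data: four messages described by two boolean attributes.
rows = [
    {"has_link": True, "all_caps": True},
    {"has_link": True, "all_caps": False},
    {"has_link": False, "all_caps": True},
    {"has_link": False, "all_caps": False},
]
labels = ["spam", "spam", "ham", "ham"]

root = best_attribute(rows, labels)
print(root)
```

Here `has_link` separates the two classes perfectly, so it is selected as the root; a full tree would repeat the same choice recursively on each partition until the leaves are pure.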

TABLE I. CONFUSION MATRIX

| | Predicted as Normal | Predicted as Attack |
|---|---|---|
| Actual Labeling as Normal | $T_{Positive}$ | $F_{Negative}$ |
| Actual Labeling as Attack | $F_{Positive}$ | $T_{Negative}$ |

Support vector machine (SVM) is another widely used supervised machine learning model. SVM searches for the hyperplane that best separates the data into two classes, one on each side of the hyperplane. Each side of the hyperplane denotes a separate class, and the class of every data point depends on the side of the hyperplane on which it lands. Support vector machine consumes a lot of space and time when handling larger and noisier datasets [25]. The computational complexity of SVM is $O(n^2)$, where n is the number of instances [38, 39].
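The idea of assigning a class by the side of the hyperplane can be sketched as follows. The training routine is a simple sub-gradient descent on the regularised hinge loss, chosen here for brevity; it is an assumption and not the solver used in the cited works. The data points, the label coding (+1 for attack, -1 for normal) and the hyperparameters are hypothetical.

```python
def decision_value(w, b, x):
    """Signed score w.x + b; its sign tells on which side of the hyperplane x lies."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(points, labels, lr=0.1, lam=0.01, epochs=50):
    """Sub-gradient descent on the L2-regularised hinge loss (labels are +1 or -1)."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            if y * decision_value(w, b, x) < 1:
                # Inside the margin or misclassified: move the hyperplane.
                w = [wi - lr * (lam * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Safely classified: only apply the regularisation shrinkage.
                w = [wi - lr * lam * wi for wi in w]
    return w, b


def classify(w, b, x):
    """Map the side of the hyperplane to a class name."""
    return "attack" if decision_value(w, b, x) >= 0 else "normal"


# Hypothetical, linearly separable toy data.
points = [(2.0, 2.0), (3.0, 3.0), (-2.0, -2.0), (-3.0, -1.0)]
labels = [1, 1, -1, -1]

w, b = train_linear_svm(points, labels)
predictions = [classify(w, b, x) for x in points]
print(predictions)
```

A kernel SVM follows the same decision rule but evaluates it in a transformed feature space, which is one source of the space and time cost mentioned above.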

A matrix that is used to evaluate the performance of a machine learning classifier is called a confusion matrix [40], as depicted in Table 1. $T_{Positive}$ is the number of normal instances that are correctly classified as normal. $T_{Negative}$ is the number of attack instances that are correctly classified as attack. $F_{Negative}$ is the number of normal instances that are misclassified as attack. $F_{Positive}$ is the number of attack instances that are misclassified as normal.
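The four counts can be tallied directly from paired actual and predicted labels. The sketch below follows the convention of Table 1, in which "normal" is the positive class; the label sequences and the function name `confusion_counts` are hypothetical.

```python
def confusion_counts(actual, predicted, positive="normal"):
    """Tally TP, TN, FP and FN with `positive` as the positive class."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            counts["TP"] += 1  # normal correctly classified as normal
        elif a != positive and p != positive:
            counts["TN"] += 1  # attack correctly classified as attack
        elif a == positive:
            counts["FN"] += 1  # normal misclassified as attack
        else:
            counts["FP"] += 1  # attack misclassified as normal
    return counts


# Hypothetical labels for eight network records.
actual = ["normal", "normal", "normal", "attack", "attack", "attack", "attack", "normal"]
predicted = ["normal", "attack", "normal", "attack", "normal", "attack", "attack", "normal"]

counts = confusion_counts(actual, predicted)
print(counts)
```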

Precision

Precision is the percentage of correctly classified positive instances out of all instances classified as positive.

$$\text{Precision} = \frac{T_{Positive}}{T_{Positive} + F_{Positive}} \tag{1}$$

Error Rate

The error rate (ERate) is the percentage of misclassified instances out of all instances in the dataset.

$$\text{ERate} = \frac{F_{Positive} + F_{Negative}}{T_{Negative} + F_{Positive} + F_{Negative} + T_{Positive}} \tag{2}$$

Recall

Recall is the percentage of correctly classified positive instances out of all actual positive instances in the dataset.

$$\text{Recall} = \frac{T_{Positive}}{T_{Positive} + F_{Negative}} \tag{3}$$
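A short worked example may help to see how equations (1) to (3) behave on concrete numbers. The counts below are hypothetical, and the function names are chosen for illustration only.

```python
def precision(tp, fp):
    """Equation (1): correct positives over everything predicted positive."""
    return tp / (tp + fp)


def error_rate(tp, tn, fp, fn):
    """Equation (2): misclassified instances over all instances."""
    return (fp + fn) / (tp + tn + fp + fn)


def recall(tp, fn):
    """Equation (3): correct positives over all actual positives."""
    return tp / (tp + fn)


# Hypothetical confusion-matrix counts for 200 instances.
tp, tn, fp, fn = 90, 95, 10, 5

print(f"precision  = {precision(tp, fp):.4f}")
print(f"error rate = {error_rate(tp, tn, fp, fn):.4f}")
print(f"recall     = {recall(tp, fn):.4f}")
```

With these counts, precision is 90/100, the error rate is 15/200 and recall is 90/95, so a classifier can have high precision and still miss some positive instances.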

IV. DISCUSSION AND PERFORMANCE EVALUATION

A wide range of cybercrimes attempt daily to breach the privacy of users' data on computer networks and mobile devices. An extensive range of machine learning techniques has been developed to battle against cybercrimes. However, those techniques still lag a step behind the cybercrimes. In our review, we focus on three cardinal detection tasks, namely intrusion detection, malware detection and spam detection. We consider three learning models: decision tree, support vector machine and deep belief network. Datasets play an important role because the results depend on the type and size of the dataset. The diversity of a dataset helps to evaluate the performance of the classifier in the training and testing phases. Real-time and diverse datasets produce better results than a customized dataset. In this review, we consider frequently used and benchmark datasets, namely KDD CUP 99 [41], Spambase [42], Twitter dataset [43], Enron [44], NSL-KDD [45], DARPA [46] and malware datasets [47]. We compare the performance of the machine learning models in detecting these cyber threats.

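TABLE II. PERFORMANCE RESULTS OF SPAM DETECTION USING MACHINE LEARNING MODELS

| Cyber Threat | Learning Model | Dataset | Reference | Published Year | Sub-Domain | Precision | Accuracy | Recall |
|---|---|---|---|---|---|---|---|---|
| Spam Detection | Support Vector Machine | Spambase | [48] | 2011 | Email Spam | 93.12 % | 96.90 % | 95.00 % |
| Spam Detection | Support Vector Machine | Spambase | [49] | 2015 | Email Spam | 79.02 % | 79.50 % | 68.67 % |
| Spam Detection | Support Vector Machine | Twitter Dataset | [50] | 2018 | Spam Tweets | 92.91 % | 93.14 % | 93.14 % |
| Spam Detection | Support Vector Machine | Twitter Dataset | [51] | 2015 | Spam Tweets | 95.20 % | 93.60 % | |
| Spam Detection | Decision Tree | Enron | [52] | 2016 | Email Spam | 98.00 % | 96.00 % | 94.00 % |
| Spam Detection | Decision Tree | Enron | [52] | 2016 | Email Spam | 98.00 % | 96.00 % | 94.00 % |
| Spam Detection | Decision Tree | Spambase | [53] | 2014 | Email Spam | 91.51 % | 92.08 % | 88.08 % |
| Spam Detection | Decision Tree | Spambase | [54] | 2014 | Email Spam | - | 94.27 % | 91.02 % |
| Spam Detection | DBN | Enron | [55] | 2016 | Email Spam | 96.49 % | 95.86 % | 95.61 % |
| Spam Detection | DBN | Enron | [56] | 2016 | Email Spam | 98.39 % | 97.50 % | 98.02 % |
| Spam Detection | DBN | Spambase | [57] | 2007 | Email Spam | 94.94 % | 97.43 % | 96.47 % |
| Spam Detection | DBN | Spambase | [58] | 2018 | Email Spam | 96.00 % | 89.20 % | - |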

We take accuracy, recall and precision as the evaluation factors to measure the performance of the classification models. Table 2, Table 3 and Table 4 present the performance of the three learning models for spam detection, intrusion detection and malware detection, respectively. The Cyber Threat and Learning Model columns are self-explanatory. The Dataset column shows the frequently used or benchmark dataset for each threat. The Reference column gives the citation of the paper that reports the evaluation results. The values of the Sub-Domain column differ for each cyber threat. The Performance Results columns show the results reported in each cited article. The following sub-sections discuss each cyber threat.

A. Spam Detection

Spam is a threat to computer and network resources. It is a term used for an unwanted message. Spam can take different forms, such as text messages, images and videos on mobile devices [59]. Spam tweets and spam emails are the forms most often encountered on computing devices and networks. Spam messages consume a lot of network resources, such as bandwidth, and spam emails in the form of unnecessary advertisements consume a lot of time. Machine learning techniques have been applied in the literature to distinguish between a genuine email and a spam email, as shown in Table 2. SVM has shown a good accuracy of 96.90 % on the Spambase dataset [48], and DT has reached 96.00 % on the Enron dataset [52]. However, DBN has outperformed both with a precision of 98.39 % using the Enron dataset [56]. DBN also outperformed SVM and DT in terms of recall. On the Spambase dataset, SVM performed better than DT with an accuracy of 96.90 % [48]. Using the Enron dataset, the decision tree has shown a precision of 98.00 %, which is higher than the SVM precision values in Table 2 and similar to that of DBN [52]. It is apparent from Table 2 that DBN has performed better than the other learning models on these particular datasets. Based on the above evaluation metrics, the authors recommend using DBN for spam detection.
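The comparison behind this recommendation can be reproduced with a few lines of code. The sketch below is restricted to the rows of Table 2 that report all three metrics, with the duplicated [52] row listed once; the dataset names of continuation rows are assumed to carry over from the row above. The `Result` tuple and `best_by` helper are hypothetical names.

```python
from collections import namedtuple

Result = namedtuple("Result", "model dataset reference precision accuracy recall")

# Rows of Table 2 with precision, accuracy and recall all reported (values in percent).
TABLE_2 = [
    Result("SVM", "Spambase", "[48]", 93.12, 96.90, 95.00),
    Result("SVM", "Spambase", "[49]", 79.02, 79.50, 68.67),
    Result("SVM", "Twitter Dataset", "[50]", 92.91, 93.14, 93.14),
    Result("DT", "Enron", "[52]", 98.00, 96.00, 94.00),
    Result("DT", "Spambase", "[53]", 91.51, 92.08, 88.08),
    Result("DBN", "Enron", "[55]", 96.49, 95.86, 95.61),
    Result("DBN", "Enron", "[56]", 98.39, 97.50, 98.02),
    Result("DBN", "Spambase", "[57]", 94.94, 97.43, 96.47),
]


def best_by(rows, metric):
    """Row with the highest value of the given metric."""
    return max(rows, key=lambda row: getattr(row, metric))


best = {metric: best_by(TABLE_2, metric) for metric in ("precision", "accuracy", "recall")}
for metric, row in best.items():
    print(f"best {metric}: {row.model} on {row.dataset} {row.reference} ({getattr(row, metric):.2f} %)")
```

Because the cited studies use different datasets and experimental setups, such a ranking is only indicative and should not be read as a controlled benchmark.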

B. Intrusion Detection

Malicious intrusions into computer networks and devices are another cyber threat to cyberspace. These intrusions are used to identify the vulnerabilities of a network [60] and the weaknesses within a computer system for further attacks. An intrusion detection system is used to protect against these intrusions. Intrusion detection falls into three classes, namely signature/misuse-based, anomaly-based and hybrid [61]. Intrusions can be detected on the network or on a host computer. Conventional techniques are unable to keep pace with evolving intrusions. Commonly used datasets are DARPA and the KDD versions; however, these datasets are more than fifteen years old. Table 3 presents the evaluation results of intrusion detection. On the NSL-KDD dataset, DBN performed better than SVM and DT in terms of accuracy, reaching 96.70 % [62].

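TABLE III. PERFORMANCE RESULTS OF INTRUSION DETECTION SYSTEM USING MACHINE LEARNING MODELS

| Cyber Threat | Learning Model | Dataset | Reference | Published Year | Sub-Domain | Precision | Accuracy | Recall |
|---|---|---|---|---|---|---|---|---|
| Intrusion Detection | Support Vector Machine | NSL-KDD | [63] | 2019 | Anomaly-Based | - | 89.70 % | - |
| Intrusion Detection | Support Vector Machine | NSL-KDD | [41] | 2014 | Hybrid-Based | 74.00 % | 82.37 % | 82.00 % |
| Intrusion Detection | Support Vector Machine | DARPA | [64] | 2007 | Hybrid-Based | - | 69.80 % | - |
| Intrusion Detection | Support Vector Machine | DARPA | [65] | 2014 | Anomaly-Based | - | 95.11 % | - |
| Intrusion Detection | Decision Tree | KDD | [66] | 2018 | Misuse-Based | - | 99.96 % | - |
| Intrusion Detection | Decision Tree | KDD | [67] | 2017 | Hybrid-Based | - | 86.29 % | 78.00 % |
| Intrusion Detection | Decision Tree | NSL-KDD | [68] | 2019 | Hybrid-Based | - | 93.40 % | - |
| Intrusion Detection | Decision Tree | NSL-KDD | [69] | 2017 | Hybrid-Based | 91.15 % | 90.30 % | 90.31 % |
| Intrusion Detection | DBN | KDD | [61] | 2015 | Anomaly-Based | - | 97.50 % | - |
| Intrusion Detection | DBN | NSL-KDD | [62] | 2015 | Hybrid-Based | 97.90 % | 96.70 % | - |
| Intrusion Detection | DBN | NSL-KDD | [70] | 2017 | Anomaly-Based | 88.60 % | 90.40 % | 95.30 % |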

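TABLE IV. PERFORMANCE RESULTS OF MALWARE DETECTION USING MACHINE LEARNING MODELS

| Cyber Threat | Learning Model | Dataset | Reference | Published Year | Sub-Domain | Precision | Accuracy | Recall |
|---|---|---|---|---|---|---|---|---|
| Malware Detection | Support Vector Machine | Malware Dataset | [71] | 2017 | Static | - | 94.37 % | - |
| Malware Detection | Support Vector Machine | Malware Dataset | [72] | 2013 | Dynamic | - | 95.00 % | |
| Malware Detection | Support Vector Machine | Enron | [73] | 2015 | Dynamic | - | 97.10 % | - |
| Malware Detection | Support Vector Machine | Enron | [52] | 2016 | Static | 84.74 % | 91.00 % | 100 % |
| Malware Detection | Decision Tree | Custom | [74] | 2016 | Static | 99.40 % | 99.90 % | - |
| Malware Detection | Decision Tree | Malware Dataset | [75] | 2017 | Static | - | 84.70 % | - |
| Malware Detection | Decision Tree | Malware Dataset | [76] | 2014 | Static | 97.90 % | - | 96.70 % |
| Malware Detection | DBN | Custom | [77] | 2016 | Dynamic | 78.08 % | 71.00 % | 59.09 % |
| Malware Detection | DBN | Custom | [77] | 2016 | Static | 83.00 % | 89.03 % | 98.18 % |
| Malware Detection | DBN | KDD CUP99 | [77] | 2016 | Hybrid | 95.77 % | 96.76 % | 97.84 % |
| Malware Detection | DBN | KDD CUP99 | [78] | 2015 | Hybrid | - | 91.40 % | 95.34 % |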

However, the decision tree has shown an outstanding accuracy of 99.96 % on the KDD dataset, better than DBN and SVM [66]. This is the best accuracy among the learning classifiers regardless of the dataset [66]. DBN has reported the best precision and recall values, 97.90 % and 95.30 %, respectively [62, 70]. Based on the considered articles, the decision tree is recommended as the best learning classifier for intrusion detection, as depicted in Table 3.
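Since the ranking of the classifiers changes with the dataset, it can be useful to group the accuracy values of Table 3 by dataset before comparing them. The sketch below does this; the dataset of each continuation row is assumed to carry over from the row above it in the table, and the helper name is hypothetical.

```python
# (model, dataset, reference, accuracy in percent), taken from Table 3.
TABLE_3 = [
    ("SVM", "NSL-KDD", "[63]", 89.70),
    ("SVM", "NSL-KDD", "[41]", 82.37),
    ("SVM", "DARPA", "[64]", 69.80),
    ("SVM", "DARPA", "[65]", 95.11),
    ("DT", "KDD", "[66]", 99.96),
    ("DT", "KDD", "[67]", 86.29),
    ("DT", "NSL-KDD", "[68]", 93.40),
    ("DT", "NSL-KDD", "[69]", 90.30),
    ("DBN", "KDD", "[61]", 97.50),
    ("DBN", "NSL-KDD", "[62]", 96.70),
    ("DBN", "NSL-KDD", "[70]", 90.40),
]


def best_accuracy_per_dataset(rows):
    """Map each dataset to the (model, reference, accuracy) with the highest accuracy."""
    best = {}
    for model, dataset, reference, accuracy in rows:
        if dataset not in best or accuracy > best[dataset][2]:
            best[dataset] = (model, reference, accuracy)
    return best


best = best_accuracy_per_dataset(TABLE_3)
for dataset, (model, reference, accuracy) in sorted(best.items()):
    print(f"{dataset}: {model} {reference} with {accuracy:.2f} %")
```

The grouping makes the two statements in the text compatible: DBN leads on NSL-KDD, while the decision tree leads on KDD and overall.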

C. Malware Detection

Malware, short for malicious software, is software that is installed on a computer to disrupt its operation and harm electronic data. Viruses, worms, ransomware, adware, spyware, malvertising and Trojan horses are considered significant types of malware [79]. Malware interrupts the normal flow of computer operations and disrupts the availability of computer and network resources. With the growing use of computing and mobile devices, cybercriminals are finding it easier to compromise the integrity of data. Machine learning techniques are being used to detect malware. The performance of each learning classifier is depicted in Table 4. Static detection is a sub-domain of malware detection in which applications are tested for malware without executing them. In dynamic detection, by contrast, the applications are tested by executing them. Hybrid detection is a mixture of static and dynamic detection [80]. The decision tree has shown the best overall accuracy of 99.90 % on custom data collected by the author [74]. However, on the malware dataset, SVM performed better than the decision tree in terms of accuracy (95.00 % [72] versus 84.70 % [75]). SVM also reported the best recall value of 100 % [52]. Based on the cited papers, SVM is recommended for detecting and classifying malware.
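To make the notion of static detection concrete, the sketch below derives a few features from the raw bytes of a file without running it. The choice of features (size, number of distinct byte values and byte entropy) is an assumption for illustration; the cited studies may rely on entirely different static features, and the function names are hypothetical.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, between 0 and 8 bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in Counter(data).values())


def static_features(data: bytes) -> dict:
    """Features computed from the file content only; nothing is executed."""
    return {
        "size": len(data),
        "distinct_bytes": len(set(data)),
        "entropy": byte_entropy(data),
    }


# Two hypothetical byte strings: highly repetitive versus uniformly spread content.
padding = bytes(16)
uniform = bytes(range(256))

print(static_features(padding))
print(static_features(uniform))
```

Feature vectors of this kind could then be passed to any of the three classifiers discussed above; a dynamic approach would instead record behaviour observed while the sample runs in a controlled environment.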

V. CONCLUSION

Cyber threats are increasing at a growing pace. Conventional security techniques are not capable of coping with these threats. Machine learning techniques are being applied to overcome the limitations of conventional security systems, and they play a role at both ends: the defender's end and the attacker's end. We have presented a performance comparison of three learning models for detecting and classifying intrusion, spam and malware. We have considered frequently used and benchmark datasets to compare the evaluation results in terms of recall, precision and accuracy. From the discussion in the previous section, we conclude that no single learning technique can be recommended for every cyber threat; different learning models suit different cyber threats. In addition, a large number of authors have highlighted the constraints faced by machine learning techniques. We have observed that there is a dire need for up-to-date benchmark datasets to test the latest advancements in machine learning for cyber threat detection. Available datasets lack diversity and sophisticated attacks, and they contain missing values. There is also a need for customized learning models specifically designed for security purposes. In future, we will focus on analyzing more learning techniques for cyber threat detection.

REFERENCES

[1] "ICT Facts and Figures 2017." Telecommunication Development Bureau, International Telecommunication Union (ITU), Technical Report. https://www.itu.int/en/ITU-D/Statistics/Pages/facts/default.aspx (accessed October 09, 2019).

[2] "What is Cyber-Security?" https://www.kaspersky.com.au/resource-center/definitions/what-is-cyber-security (accessed January 11, 2020).

[3] F. Farahmand, S. B. Navathe, P. H. Enslow, and G. P. Sharp, "Managing vulnerabilities of information systems to security incidents," in Proceedings of the 5th international conference on Electronic commerce, 2003: ACM, pp. 348-354.

[4] P. Szor, The Art of Computer Virus Research and Defense. Pearson Education, 2005.

[5] M. Jump, "Fighting Cyberthreats with Technology Solutions," Biomedical instrumentation & technology, vol. 53, no. 1, pp. 38-43, 2019.

[6] N. Kostyuk and C. Wayne, "Communicating Cybersecurity: Citizen Risk Perception of Cyber Threats," 2019.

[7] A. K. Jain, D. Goel, S. Agarwal, Y. Singh, and G. Bajaj, "Predicting Spam Messages Using Back Propagation Neural Network," Wireless Personal Communications, vol. 110, no. 1, pp. 403-422, 2020.

[8] "Malware Types and Classifications." https://www.lastline.com/blog/malware-types-and-classifications/ (accessed April 18, 2020).

[9] N. Sultana, N. Chilamkurti, W. Peng, and R. Alhadad, "Survey on SDN based network intrusion detection system using machine learning approaches," Peer-to-Peer Networking and Applications, vol. 12, no. 2, pp. 493-501, 2019.

[10] M. Pradhan, C. K. Nayak, and S. K. Pradhan, "Intrusion Detection System (IDS) and Their Types," in Securing the Internet of Things: Concepts, Methodologies, Tools, and Applications: IGI Global, 2020, pp. 481-497.

[11] I. Firdausi, A. Erwin, and A. S. Nugroho, "Analysis of machine learning techniques used in behavior-based malware detection," in 2010 second international conference on advances in computing, control, and telecommunication technologies, 2010: IEEE, pp. 201-203.

[12] A. V. Joshi, Machine Learning and Artificial Intelligence. Springer, 2020.

[13] D. Michie, D. J. Spiegelhalter, and C. Taylor, "Machine learning," Neural and Statistical Classification, vol. 13, 1994.

[14] K. Shaukat, A. Rubab, I. Shehzadi, and R. Iqbal, "A Socio-Technological analysis of Cyber Crime and Cyber Security in Pakistan," Transylvanian Review, vol. 1, no. 3, 2017.

[15] K. Shaukat, N. Masood, A. B. Shafaat, K. Jabbar, H. Shabbir, and S. Shabbir, "Dengue fever in perspective of clustering algorithms," arXiv preprint arXiv:1511.07353, 2015.

[16] K. Shaukat, N. Masood, S. Mehreen, and U. Azmeen, "Dengue fever prediction: A data mining problem," Journal of Data Mining in Genomics & Proteomics, vol. 2015, 2015.

[17] K. Shaukat, I. Nawaz, and S. Zaheer, Students Performance: A Data Mining Perspective. LAP Lambert Academic Publishing, 2017.

[18] K. Shaukat, I. Nawaz, S. Aslam, S. Zaheer, and U. Shaukat, "Student's performance in the context of data mining," in 2016 19th International Multi-Topic Conference (INMIC), 2016: IEEE, pp. 1-8.

[19] S. Dey, Q. Ye, and S. Sampalli, "A machine learning based intrusion detection scheme for data fusion in mobile clouds involving heterogeneous client networks," Information Fusion, vol. 49, pp. 205-215, 2019.

[20] B. Geluvaraj, P. Satwik, and T. A. Kumar, "The future of cybersecurity: Major role of artificial intelligence, machine learning, and deep learning in cyberspace," in International Conference on Computer Networks and Communication Technologies, 2019: Springer, pp. 739-747.

[21] A. A. Alurkar et al., "A Comparative Analysis and Discussion of Email Spam

Classification Methods Using Machine Learning Techniques," Applied Machine

Learning for Smart Data Analysis, p. 185, 2019.

[22] E. G. Dada, J. S. Bassi, H. Chiroma, A. O. Adetunmbi, and O. E. Ajibuwa,

"Machine learning for email spa m filtering: review, approaches and open

research problems," Heliyon, vol. 5, no. 6, p. e01802, 2019.

[23] P. Jain, "Machine Learning versus Deep Learning for Malware Detection," 2019.

[24] P. Thiyagarajan, "A Review on Cyber Security Mechanisms Using Machine and

Deep Learning Algorithms," in Handbook of Research on Machine and Deep

Learning Applications for Cyber Security: IGI Global, 2020, pp. 23-41.

[25] S. S. Iyer and S. Rajagopal, "Applications of Machine Learning in Cyber

Security Domain," in Handbook of Research on Machine and Deep Learning

Applications for Cyber Security: IGI Global, 2020, pp. 64-82.

[26] V. Ford and A. Siraj, "Applications of machine learning in cyber security," in

Proceedings of the 27th International Conference on Computer Applications in

Industry and Engineering, 2014.

[27] H. Jiang, J. Nagra, and P. Ahammad, "Sok: Applying machine learning in

security-a survey," arXiv preprint arXiv:1611.03186, 2016.

[28] E. Hodo, X. Bellekens, A. Hamilton, C. Tachtatzis, and R. Atkinson, "Shallow

and deep netwo rks int rusio n detect ion system: A taxono my and s urvey, " arXiv

preprint arXiv:1701.02145, 2017.

[29] G. Apruzzese, M. Colajanni, L. Ferretti, A. Guido, and M. Marchetti, "On the

effectiveness of machine and deep learning for cyber security," in 2018 10th

International Conference on Cyber Conflict (CyCon), 2018: IEEE, pp. 371-390.

[30] S. Sheikhi, M. Kheirabadi, and A. Bazzazi, "An Effective Model for SMS Spam

Detection Using Content-based Features and Averaged Neural Network,"

International Journal of Engineering, vol. 33, no. 2, pp. 221-228, 2020.

[31] F. Mercaldo and A. Santone, "Deep learning for image-based mobile malware

detection," Journal of Computer Virology and Hacking Techniques, pp. 1-15,

2020.

[32] J. Ribeiro, F. B. Saghezchi, G. Mantas, J. Rodriguez, S. J. Shepherd, and R. A.

Abd-Alhameed, "An autonomous host-based intrusion detection system for

android mobile devices," Mobile Networks and Applications, vol. 25, no. 1, pp.

164-172, 2020.

[33] C. Chen et al., "A performance evaluation of machine learning-based streaming

spam tweets detection," IEEE Transactions on Computational Social Systems,

vol. 2, no. 3, pp. 65-76, 2015.

[34] Z. Chen, S. Liu, K. Jiang, H. Xu, and X. Cheng, "A data imputation method

based on deep belief network," in 2015 IEEE International Conference on

Computer and Information Technology; Ubiquitous Computing and

Communications; Dependable, Autonomic and Secure Computing; Pervasive

Intelligence and Computing, 2015: IEEE, pp. 1238-1243.

[35] D. M. Farid, N. Harbi, and M. Z. Rahman, "Combining naive bayes and decision

tree for adaptive intrusion detection," arXiv preprint arXiv:1005.4496, 2010.

[36] J. R. Quinlan, "C4.5: Programs for Machine Learning," San Mateo, CA: Morgan Kaufmann, 1993.

[37] P. S. Oliveto, J. He, and X. Yao, "Time complexity of evolutionary algorithms

for combinatorial optimization: A decade of results," International Journal of

Automation and Computing, vol. 4, no. 3, pp. 281-293, 2007.

[38] C. J. Burges, "A tutorial on support vector machines for pattern recognition,"

Data mining and knowledge discovery, vol. 2, no. 2, pp. 121-167, 1998.

[39] G. D. Forney, "The Viterbi algorithm," Proceedings of the IEEE, vol. 61, no. 3,

pp. 268-278, 1973.

[40] X. Deng, Q. Liu, Y. Deng, and S. Mahadevan, "An improved method to construct

basic probability assignment based on the confusion matrix for classification

problem," Information Sciences, vol. 340, pp. 250-261, 2016.

[41] M. S. Pervez and D. M. Farid, "Feature selection and intrusion classification in

NSL-KDD cup 99 dataset employing SVMs," in The 8th International

Conference on Software, Knowledge, Information Management and Applications

(SKIMA 2014), 2014: IEEE, pp. 1-6.

[42] "Spambase Dataset. Center for Machine Learning and Intelligent Systems at UC

Irvine." https://archive.ics.uci.edu/ml/datasets/Spambase (accessed January 31,

2020).

[43] D. Gunawan, R. F. Rahmat, A. Putra, and M. F. Pasha, "Filtering Spam Text

Messages by Using Twitter-LDA Algorithm," in 2018 IEEE International

Conference on Communication, Networks and Satellite (Comnetsat), 2018: IEEE,

pp. 1-6.

[44] B. Klimt and Y. Yang, "Introducing the Enron corpus," in CEAS, 2004.

[45] B. Ingre and A. Yadav, "Performance analysis of NSL-KDD dataset using

ANN," in 2015 International Conference on Signal Processing and

Communication Engineering Systems, 2015: IEEE, pp. 92-96.

[46] A. Chahal and R. Nagpal, "Performance of Snort on Darpa Dataset and Different

False Alert Reduction Techniques," in 3rd International Conference on

Electrical, Electronics, Engineering Trends, Communication, Optimization and

Sciences (EEECOS).

[47] H. Kim, T. Cho, G.-J. Ahn, and J. H. Yi, "Risk assessment of mobile applications

based on machine learned malware dataset," Multimedia Tools and Applications,

vol. 77, no. 4, pp. 5027-5042, 2018.

[48] W. Awad and S. ELseuofi, "Machine learning methods for spam e-mail

classification," International Journal of Computer Science and Information

Technology, vol. 3, no. 1, pp. 173-184, 2011.

[49] R. Karthika and P. Visalakshi, "A hybrid ACO based feature selection

method for email spam classification," WSEAS Transactions on Computers,

vol. 14, pp. 171-177, 2015.

[50] G. Jain, M. Sharma, and B. Agarwal, "Spam detection on social media using

semantic convolutional neural network," International Journal of Knowledge

Discovery in Bioinformatics, vol. 8, no. 1, pp. 12-26, 2018.

[51] C. Chen et al., "A performance evaluation of machine learning-based streaming

spam tweets detection," IEEE Transactions on Computational Social Systems,

vol. 2, no. 3, pp. 65-76, 2015.

[52] Z. Khan and U. Qamar, "Text Mining Approach to Detect Spam in Emails," in

The International Conference on Innovations in Intelligent Systems and

Computing Technologies (ICIISCT2016), 2016, p. 45.

[53] S. A. Saab, N. Mitri, and M. Awad, "Ham or spam? A comparative study for

some content-based classification algorithms for email filtering," in MELECON

2014-2014 17th IEEE Mediterranean Electrotechnical Conference, 2014: IEEE,

pp. 339-343.

[54] Y. Zhang, S. Wang, P. Phillips, and G. Ji, "Binary PSO with mutation

operator for feature selection using decision tree applied to spam detection,"

Knowledge-Based Systems, vol. 64, pp. 22-31, 2014.

[55] A. Tyagi, "Content Based Spam Classification-A Deep Learning Approach,"

University of Calgary, 2016.

[56] I. J. Alkaht and B. Al-Khatib, "Filtering SPAM Using Several Stages Neural

Networks," International Review on Computers and Software, vol. 11, p. 2, 2016.

[57] G. Tzortzis and A. Likas, "Deep belief networks for spam filtering," in 19th

IEEE International Conference on Tools with Artificial Intelligence (ICTAI

2007), 2007, vol. 2: IEEE, pp. 306-309.

[58] Y. Rizk, N. Hajj, N. Mitri, and M. Awad, "Deep belief networks and cortical

algorithms: A comparative study for supervised classification," Applied

Computing and Informatics, 2018.

[59] A. Sharaff, N. K. Nagwani, and A. Dhadse, "Comparative study of classification

algorithms for spam email detection," in Emerging research in computing,

information, communication and applications: Springer, 2016, pp. 237-244.

[60] C. Yin, Y. Zhu, J. Fei, and X. He, "A deep learning approach for intrusion

detection using recurrent neural networks," IEEE Access, vol. 5, pp. 21954-21961,

2017.

[61] M. Z. Alom, V. Bontupalli, and T. M. Taha, "Intrusion detection using deep

belief networks," in 2015 National Aerospace and Electronics Conference

(NAECON), 2015: IEEE, pp. 339-344.

[62] S. Jo, H. Sung, and B. Ahn, "A comparative study on the performance of

intrusion detection using decision tree and artificial neural network models,"

Journal of the Korea Society of Digital Industry and Information Management,

vol. 11, no. 4, pp. 33-45, 2015.

[63] J. Lee, J. Kim, I. Kim, and K. Han, "Cyber Threat Detection Based on Artificial

Neural Networks Using Event Profiles," IEEE Access, vol. 7, pp. 165607-165626,

2019.

[64] L. Khan, M. Awad, and B. Thuraisingham, "A new intrusion detection system

using support vector machines and hierarchical clustering," The VLDB journal,

vol. 16, no. 4, pp. 507-521, 2007.

[65] R. Kokila, S. T. Selvi, and K. Govindarajan, "DDoS detection and analysis in

SDN-based environment using support vector machine classifier," in 2014 Sixth

International Conference on Advanced Computing (ICoAC), 2014: IEEE, pp.

205-210.

[66] P. Mishra, V. Varadharajan, U. Tupakula, and E. S. Pilli, "A detailed

investigation and analysis of using machine learning techniques for intrusion

detection," IEEE Communications Surveys & Tutorials, vol. 21, no. 1, pp. 686-728, 2018.

[67] J. Kevric, S. Jukic, and A. Subasi, "An effective combining classifier

approach using tree algorithms for network intrusion detection," Neural

Computing and Applications, vol. 28, no. 1, pp. 1051-1058, 2017.

[68] A. Ahmim, L. Maglaras, M. A. Ferrag, M. Derdour, and H. Janicke, "A novel

hierarchical intrusion detection system based on decision tree and rules-based

models," in 2019 15th International Conference on Distributed Computing in

Sensor Systems (DCOSS), 2019: IEEE, pp. 228-233.

[69] B. Ingre, A. Yadav, and A. K. Soni, "Decision tree based intrusion detection

system for NSL-KDD dataset," in International Conference on Information and

Communication Technology for Intelligent Systems, 2017: Springer, pp. 207-218.

[70] D. Kwon, H. Kim, J. Kim, S. C. Suh, I. Kim, and K. J. Kim, "A survey of deep

learning-based network anomaly detection," Cluster Computing, pp. 1-13, 2017.

[71] Y. Cheng, W. Fan, W. Huang, and J. An, "A Shellcode Detection Method Based

on Full Native API Sequence and Support Vector Machine," in IOP Conference

Series: Materials Science and Engineering, 2017, vol. 242, no. 1: IOP

Publishing, p. 012124.

[72] A. Mohaisen and O. Alrawi, "Unveiling zeus: automated classification of

malware samples," in Proceedings of the 22nd International Conference on

World Wide Web, 2013: ACM, pp. 829-832.

[73] P. Shijo and A. Salim, "Integrated static and dynamic analysis for

malware detection," Procedia Computer Science, vol. 46, pp. 804-811, 2015.

[74] Q. Jamil and M. A. Shah, "Analysis of machine learning solutions to detect

malware in android," in 2016 Sixth International Conference on Innovative

Computing Technology (INTECH), 2016: IEEE, pp. 226-232.

[75] D. Moon, H. Im, I. Kim, and J. H. Park, "DTB-IDS: an intrusion detection

system based on decision tree using behavior analysis for preventing APT

attacks," The Journal of supercomputing, vol. 73, no. 7, pp. 2881-2895, 2017.

[76] Z. Salehi, A. Sami, and M. Ghiasi, "Using feature generation from API

calls for malware detection," Computer Fraud & Security, vol. 2014, no. 9,

pp. 9-18, 2014.

[77] Z. Yuan, Y. Lu, and Y. Xue, "Droiddetector: android malware

characterization and detection using deep learning," Tsinghua Science and

Technology, vol. 21, no. 1, pp. 114-123, 2016.

[78] Y. Li, R. Ma, and R. Jiao, "A hybrid malicious code detection method based

on deep learning," International Journal of Security and Its Applications,

vol. 9, no. 5, pp. 205-216, 2015.

[79] R. Vinayakumar, M. Alazab, K. Soman, P. Poornachandran, and S.

Venkatraman, "Robust intelligent malware detection using deep learning," IEEE

Access, vol. 7, pp. 46717-46738, 2019.

[80] A. Damodaran, F. Di Troia, C. A. Visaggio, T. H. Austin, and M. Stamp, "A

comparison of static, dynamic, and hybrid analysis for malware detection,"

Journal of Computer Virology and Hacking Techniques, vol. 13, no. 1, pp. 1-12,

2017.
