An MBS Model For Enterprise Security Using User
and Entity Behavior Analytics
Pranali Navghare
Dept. of Computer Engineering
Pune Institute of Computer Technology
Pune, India
prnavghare@pict.edu
Nikita Kapadnis
Dept. of Computer Engineering
Pune Institute of Computer Technology
Pune, India
nykapadnis@pict.edu
Dipika Raigar
Dept. of Computer Engineering
Pune Institute of Computer Technology
Pune, India
ddbhaiyya@pict.edu
Abstract: Businesses face an ever-growing problem of how to identify and guard against insider threats. Users with legitimate access to sensitive organizational data are placed in a position of power that can be abused to harm the enterprise. The harm can range from monetary and intellectual property theft to the destruction of assets and enterprise reputation. Traditional intrusion detection systems are neither designed for nor capable of identifying those who act maliciously inside an enterprise. In this paper, we describe an automated system capable of detecting insider threats within an enterprise. We outline a tree-structured profiling technique that records the activities conducted by each user and each task, and we use it to obtain a consistent representation of features that provides a rich description of the user's behavior. Deviation can then be assessed from the amount of variance that each user exhibits across multiple attributes, compared against their peers. The primary function of User and Entity Behavior Analytics (UEBA) is to track normal user behavior. UEBA defines a baseline for each entity in the environment, and actions are evaluated by comparison with these predefined baselines.
Index Terms: User and Entity Behavior Analytics, Anomaly Detection, Deep Learning, Security
I. INTRODUCTION
User and Entity Behavior Analytics (UEBA) is a type of
cybersecurity technology that focuses on detecting abnormal
behavior by users and entities within an organization’s network
[38]. UEBA uses machine learning algorithms and statistical
models to analyze vast amounts of data from various sources,
such as log files, network traffic, and user activity, to identify
potential security threats [6]. UEBA is designed to detect
a wide range of security threats, including insider threats,
account takeover attacks, and other types of malicious activity
[1].
UEBA is important because it provides organizations with a
proactive approach to cybersecurity. Rather than relying solely
on reactive measures, such as firewalls and antivirus software,
UEBA enables organizations to detect security threats in real-
time before they cause significant damage [38]. UEBA can
also help organizations reduce the time it takes to detect and
respond to security incidents, which is critical for minimizing
the impact of a security breach [1].
Insider attacks are a critical security concern for businesses, as they can seriously threaten private records and intellectual property [1]. According to the 2019 Insider Risk Report, 68% of corporations are regularly affected by insider attacks. The focus of cybersecurity specialists is generally limited to protecting corporate networks from external attacks, using preventive measures such as firewalls, IDS, IPS, proxy servers, and security gateways. Failure to comply with these measures will result in suspension [36]. A valid user typically has more rights and access privileges than an invalid user. This raises the question of which security policies need to be in place to protect sensitive information if it is compromised by valid users [6].
User and Entity Behavior Analysis (UEBA) is an analytics tool based on machine learning and deep learning. The UEBA model overcomes shortcomings of traditional systems, such as their failure to adapt to evasive attackers [1]. Through statistical and time-series data analysis, UEBA perceives and recognizes minute details that humans cannot. UEBA systems typically require large amounts of data to be effective, so it is important to have a good data management strategy in place to ensure that data is collected and stored in a way that is easily accessible and usable for analysis [6]. It is also important to have a team of skilled data scientists and security analysts who can interpret the results of machine learning models and take appropriate action to mitigate any security threats that are identified [6].
Recently, the topic of insider threats has received a lot of
attention in the literature. Researchers have proposed several
different models aimed at preventing or detecting the pres-
ence of attacks (e.g. [23] and [24]). Similarly, many works
Mukt Shabd Journal
Volume XII, Issue XII, DECEMBER/2023
ISSN NO : 2347-3150
Page No : 1580
examine the psychological and behavioral characteristics of
potentially threatening insiders as a means of detection (for
example, [25]-[27]). Kammüller and Probst [28] explore how
organizations can identify attack vectors based on policy vio-
lations to minimize the possibility of insider attacks. Similarly,
Ogiela and Ogiela [9] investigated how low-threshold secret
sharing can be used to protect against insider threats. For
the remainder of this section, we choose to focus exclusively
on research dealing with the practical aspects of designing
and developing systems capable of predicting or detecting
the presence of insider threats. Spitzner’s early work [30]
deals with the use of honeypots (decoy machines capable
of attracting attacks) to detect insider attacks. However, as
security awareness increases, those who choose to carry out
insider attacks are finding more subtle ways to harm or deceive
their organizations, requiring more sophisticated prevention
and detection. Early work by Magklaras and Furnell [31]
investigated how to estimate the level of threat likely to come
from a specific insider based on a specific user behavior
profile. As they acknowledge, a lot of work remains to be done
to validate the proposed solution. Myers et al. [32] consider
how web server log data can be used to identify malicious
insiders attempting to exploit internal systems. Maloof and
Stephens [33] propose a tool to detect when insiders violate
need-to-know constraints within an organization. Okolica
et al. [34] use probabilistic latent semantic
indexing on users to identify employee interests, which are
used to form a social graph that can highlight insiders. Liu et
al. [35] proposed a multi-layered framework called sensitive
information dissemination detection, including network-level
application identification, content signature generation and
detection, and covert communication detection.
II. LITERATURE SURVEY
| Ref | Dataset Used | Algorithm/Approach | Performance |
| --- | --- | --- | --- |
| [5] | CERT insider threat dataset | LSTM | 90.17% accuracy |
| [4] | Custom 18-month LDAP from a company | Improved LSTM-GAN | Accuracy highest at 128 hidden layers, i.e. 91.14% |
| [3] | DARPA ADAMS | Real-time anomaly detection in streaming heterogeneity (RADISH) | 50% of all malicious sessions are detected, and about 92% of sessions that are flagged malicious are actually benign |
| [6] | CMU-CERT, Enron email dataset, sample data by Centre for the Protection of National Infrastructure (CPNI), and in-house generated data | Tree-based profiling using standard deviation in the Mahalanobis distance | 42% precision, 100% recall |
| [7] | Custom in-house generated | SVD | False positive rate computed at 2.2% |
III. METHODOLOGY
A. Data Preparation
User and Entity Behavior Analytics is performed using the Computer Emergency Response Team (CERT) insider threat dataset. The dataset is generated by CMU and is one of the most popular datasets for User and Entity Behavior Analytics. Its size makes it well suited for training and testing the proposed system at Big Data scale. The dataset consists of several logs (logon.csv, email.csv, device.csv, http.csv, file.csv, and psychometric.csv) for over 1000 employees. It covers 502 days, 1000 users, and 32,770,227 log entries. Some of the logs are manually injected by domain experts. Along with the logs, it contains metadata such as role, project, functional unit, department, and team. These logs help to analyze the features and roles of the employees. The dataset is divided into training and validation sets.
| Types of Features | Features from Dataset |
| --- | --- |
| Action Features | Num. device; Files exe copy; Files jpg copy; Files txt/doc/pdf copy; Files zip copy; Num. emails sending; Internal email sends; Num. Internal email receive; Num external email receive; Size of emails; Num. attachments; Num. websites; Num. career sites; Num. news sites; Num. tech sites |
| Action Sequences | log on; log off; HTTP; device connect; device disconnect; Email |
| Role Features | The average of all the features selected from the Action Features, such as Num. emails sending, Internal email send, Num. Internal email receives, etc. |
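As a rough illustration of how the count-style action features listed above could be derived, the following sketch aggregates raw log events into one feature vector per user and day. The event tuple layout, the event type names, and the feature names (`num_device`, `num_emails`, `num_websites`) are assumptions made for this example; they are not taken from the CERT files or from the authors' implementation.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical mapping from an event type to the action feature it increments.
FEATURE_FOR_EVENT = {
    "device_connect": "num_device",
    "email_send": "num_emails",
    "http": "num_websites",
}


def daily_action_features(events):
    """Count events per (user, day).

    `events` is an iterable of (timestamp, user, event_type) tuples, where
    timestamp is a datetime. Event types without a mapping are ignored.
    """
    counts = defaultdict(Counter)
    for timestamp, user, event_type in events:
        feature = FEATURE_FOR_EVENT.get(event_type)
        if feature is not None:
            counts[(user, timestamp.date())][feature] += 1
    return counts


events = [
    (datetime(2010, 1, 4, 9, 0), "u1", "http"),
    (datetime(2010, 1, 4, 9, 5), "u1", "http"),
    (datetime(2010, 1, 4, 9, 30), "u1", "email_send"),
    (datetime(2010, 1, 5, 10, 0), "u1", "device_connect"),
]
features = daily_action_features(events)
```

Role features could then be obtained by averaging these per-day vectors over all users who share a role.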
B. Feature Extraction
Feature extraction is the method of creating a new, smaller set of features that captures most of the useful information in the raw data. Feature extraction is a crucial step for User and Entity Behavior Analytics: the selected features determine the predictions of anomalous behavior and, in turn, the accuracy of the model. The main problem in feature extraction is the choice of the user's time window. If a fixed window is used, there is a chance of missing the anomalous part. A long window becomes too generic, while a short window can split the anomalous part. Hence, a flexible window based on the user's session time is used. The user's activity is directly tied to the user's session. A user's session activity consists of device login/logout, HTTP, file, and email
activities. Session Calculation: To generate a feature vector, the user's activities such as device login/logout, HTTP, file, and email are used. These features, which are common to all users, are aggregated. Sessions are calculated for each of these events. Session calculation helps in finding the user's active time; it therefore takes into account device activity during weekdays and weekends, HTTP activity, email activity, and file activity. The session calculation can establish the baseline for an individual user's activity.
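The session calculation described above can be sketched as follows: a session opens at a logon event and closes at the next logoff, and every event in between is attributed to that session. The event names and the tuple layout are assumptions for this example, and the handling of unmatched logons (they are simply dropped) is one possible choice rather than the authors' rule.

```python
from datetime import datetime


def build_sessions(events):
    """Group one user's time-ordered events into logon-to-logoff sessions.

    `events` is a list of (timestamp, action) tuples sorted by timestamp.
    Returns a list of dicts with the session start, end, and actions.
    """
    sessions = []
    current = None
    for timestamp, action in events:
        if action == "logon":
            # A new logon starts a fresh session (an unclosed one is dropped).
            current = {"start": timestamp, "actions": []}
        elif current is not None:
            if action == "logoff":
                current["end"] = timestamp
                sessions.append(current)
                current = None
            else:
                current["actions"].append(action)
    return sessions


def active_seconds(session):
    """Length of a session in seconds, i.e. the user's active time."""
    return (session["end"] - session["start"]).total_seconds()


log = [
    (datetime(2010, 1, 4, 9, 0), "logon"),
    (datetime(2010, 1, 4, 9, 10), "http"),
    (datetime(2010, 1, 4, 9, 20), "email"),
    (datetime(2010, 1, 4, 17, 0), "logoff"),
    (datetime(2010, 1, 5, 8, 30), "logon"),
    (datetime(2010, 1, 5, 8, 45), "file"),
    (datetime(2010, 1, 5, 12, 30), "logoff"),
]
sessions = build_sessions(log)
```

Because each session spans exactly the interval in which the user was active, the window length adapts to the user rather than being fixed in advance.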
Fig. 1. Session Calculation
1) Action Features: Action features contain the numerical features from the dataset. They include the number of thumb drives used, the number of emails sent, the number of website visits, whether users logged on or logged off during working time, etc. These features contain the user's daily activity for each time period and help to characterize the daily activities of the user. This activity may differ for every user based on the user's role. The features describe the characteristics of every user, including the user's daily work and working habits.
2) Action Sequences: The action sequence captures the sequential behavior of the user for each time period. It records all the activities performed by the user on a day-to-day basis, from the first login to the last logout within a day. The result is an array of all activities performed by the user in a given period of time. A user's sequence of actions may not always be the same, but it helps in understanding the user's working habits.
3) Role Features: Role features are statistical features computed from all colleagues in the same group. Role features identify an individual user as part of a particular group. A group or organization profile is formed by averaging the activities performed by all of its users. Based on the similarity between the user's activities and the organization's activities, the user is classified. Role classification helps to understand the average behavior of a particular organization.
Fig. 2. Architecture diagram
IV. ALGORITHMIC MODEL
A. LSTM
Recurrent Neural Networks can be used to analyze sequences using several layers of artificial neural networks. LSTM is a particular recurrent architecture used for time-series analysis and sequence generation. We take advantage of the LSTM's sequence-generating capabilities: we train LSTMs to learn the normal action sequences and to predict the action sequence of the next state based on the history. After training, the deviation between the true action sequence and the LSTM-generated action sequence indicates an anomaly. The LSTM takes sequential data as input and predicts an output that is also a sequence. The user's log of actions over N days, in the form of an array such as log in, web, file, HTTP, email, log off, is given as input to the LSTM. There are two LSTM layers, with the output of the first layer given as input to the next. The two layers use the "tanh" activation function and the final prediction layer uses the "relu" activation function.
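To make the input format concrete, the sketch below encodes an action sequence as integers and cuts it into (history, next action) pairs, which is a common way to set up next-step prediction for a sequence model. The vocabulary, the history length of 3, and the function names are assumptions for illustration; the network itself is not shown.

```python
# Hypothetical vocabulary of action types, in no particular order.
ACTIONS = ["logon", "web", "file", "http", "email", "logoff"]
ACTION_TO_ID = {name: index for index, name in enumerate(ACTIONS)}


def encode(sequence):
    """Map action names to integer ids."""
    return [ACTION_TO_ID[action] for action in sequence]


def next_step_pairs(encoded, history_len):
    """Slide a window over the sequence and pair each history with the
    action that follows it."""
    pairs = []
    for end in range(history_len, len(encoded)):
        pairs.append((encoded[end - history_len:end], encoded[end]))
    return pairs


day = ["logon", "web", "email", "file", "logoff"]
pairs = next_step_pairs(encode(day), history_len=3)
```

At detection time, the model's predicted next action for each history would be compared with the action that actually occurred; frequent mismatches suggest a departure from the learned routine.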
B. ConvLSTM
To use ConvLSTMs for anomaly detection, the algorithm is
typically trained on normal or regular patterns in the data.
During training, the network learns to capture the tempo-
ral dependencies and spatial relationships within the input
sequences. The convolutional layers extract spatial features,
while the LSTM layers capture the sequential patterns. Once
trained, the ConvLSTM model can be used to detect anomalies
in new data by comparing the predicted output with the
actual input. Anomalies are identified as deviations from the
learned normal patterns. This can be done by computing a
reconstruction error, which quantifies the difference between
the predicted and actual input. High reconstruction errors
indicate the presence of anomalies.
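The reconstruction-error test described above can be sketched without any deep-learning library. The example assumes the model's predictions are already available as plain lists, and it flags a sample when its error exceeds the mean plus three standard deviations of the errors seen on normal data. That threshold rule is a common heuristic chosen for this sketch, not something specified in the paper.

```python
from statistics import mean, pstdev


def reconstruction_error(actual, predicted):
    """Mean squared difference between an input and its reconstruction."""
    return mean((a - p) ** 2 for a, p in zip(actual, predicted))


def fit_threshold(normal_errors, k=3.0):
    """Threshold = mean + k * standard deviation of errors on normal data."""
    return mean(normal_errors) + k * pstdev(normal_errors)


def is_anomalous(actual, predicted, threshold):
    """True when the reconstruction error exceeds the fitted threshold."""
    return reconstruction_error(actual, predicted) > threshold


# Errors observed on (assumed) normal validation samples.
normal_errors = [0.010, 0.012, 0.009, 0.011, 0.008]
threshold = fit_threshold(normal_errors)

flag_normal = is_anomalous([1.0, 2.0, 3.0], [1.1, 2.0, 2.9], threshold)
flag_odd = is_anomalous([1.0, 2.0, 3.0], [3.0, 0.0, 6.0], threshold)
```

The first sample is reconstructed closely and stays below the threshold, while the second is reconstructed poorly and is flagged.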
V. EXPERIMENTAL RESULTS
In this section, we describe the results of the LSTM and MBS models. The deviation between the real features and the predicted results, which indicates the degree of anomaly, is shown in the graphs and diagrams below. The details of the LSTM and ConvLSTM models are given in the following tables. The LSTM model has 2 layers and the ConvLSTM model has 3 layers.
| Layers | Parameters |
| --- | --- |
| Input | Dim = 128 |
| Reshape | Dim = (4, 32) |
| LSTM | Units = 100; Activation Function = tanh |
| LSTM | Units = 160; Activation Function = tanh |
| Dense | Dim = 32; Activation Function = relu |
| Layers | Parameters |
| --- | --- |
| Input | Dim = 92 |
| Reshape | Dim = (4,6,8,1) |
| ConvLSTM | Filters = 24, kernel size = (2,3); Activation Function = relu |
| ConvLSTM | Filters = 24, kernel size = (2,3); Activation Function = tanh |
| ConvLSTM | Filters = 24, kernel size = (2,3); Activation Function = tanh |
| maxpooling | Pool size = (3,3) |
| Dense | Dim = 48; Activation Function = relu |
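As a worked check on the LSTM configuration in the first table above, the following sketch verifies that the reshape preserves the input size and counts trainable parameters layer by layer. It assumes a standard LSTM cell with four gates and a bias per gate, and a fully connected output layer with bias. The paper does not state these details, so the totals are illustrative only.

```python
from math import prod


def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights, and a bias."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    """Weight matrix plus one bias per output unit."""
    return input_dim * units + units


input_size = 128
timesteps, features = 4, 32
# A reshape is only valid if the element count is unchanged.
reshape_ok = prod((timesteps, features)) == input_size

layer_params = [
    lstm_params(features, 100),   # first LSTM layer: 32 inputs -> 100 units
    lstm_params(100, 160),        # second LSTM layer: 100 inputs -> 160 units
    dense_params(160, 32),        # dense output layer: 160 inputs -> 32 units
]
total_params = sum(layer_params)
```

Under these assumptions the three layers contribute 53,200, 167,040, and 5,152 parameters, for 225,392 in total.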
The deviation between the true and predicted features, which indicates the anomaly, is measured with the WDD loss. WDD stands for Weighted Deviation Degree. It measures the squared difference between true and predicted values, scaled linearly by a weight.
\[ \mathrm{WDD} = \frac{1}{|V|} \sum_{y \in V} w \, (y - \hat{y})^2 \]
Here, V is the set of all features, y is the true value, and ŷ is the predicted value. w is the specially designed weight value. Along with the WDD loss, the Mean Squared Error (MSE) loss is calculated for some models. The MSE is the average of the squared differences between true and predicted values.
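The two losses can be sketched in a few lines of Python. The weights passed to `wdd_loss` are placeholders, since the paper does not say how the weight values are designed; one weight per feature is assumed here.

```python
def wdd_loss(y_true, y_pred, weights):
    """Weighted Deviation Degree: weighted squared differences averaged
    over the feature set V."""
    terms = [w * (y - y_hat) ** 2 for y, y_hat, w in zip(y_true, y_pred, weights)]
    return sum(terms) / len(terms)


def mse_loss(y_true, y_pred):
    """Mean squared error: the unweighted average of squared differences."""
    return wdd_loss(y_true, y_pred, [1.0] * len(y_true))


y_true = [2.0, 0.0, 5.0]
y_pred = [1.0, 0.0, 3.0]
weights = [1.0, 1.0, 2.0]   # hypothetical: the third feature counts double

wdd = wdd_loss(y_true, y_pred, weights)
mse = mse_loss(y_true, y_pred)
```

With these values the weighted loss is larger than the unweighted one because the feature with the biggest error also carries the biggest weight.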
\[ \mathrm{MSE} = \frac{1}{|V|} \sum_{y \in V} (y - \hat{y})^2 \]
Action Features Result: The model was trained on the diverse user activity features and captured the connections between them. ConvLSTM, which models both temporal and spatial characteristics, was used to handle the user action features. However, the model encountered overfitting issues, so it was tested with several different numbers of epochs and certain parameters were changed to improve its performance, resulting in better deviation detection for anomalous user behaviors.
Fig. 3. WDD LOSS of Action Feature
Fig. 4. MSE LOSS of Action Feature
Role Features Result: The model studied partial action features of users with the same role and calculated their mean value. It then compared the daily role features of each user with these standard role features. The results, presented in the figure, showed that deviations between daily features and standard role features remained within a range of 0-2 for the first 200 days. After 200 days, however, some users exhibited suspicious behavior that produced a significant difference, suggesting that role features can partially indicate abnormal behavior.
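The role-feature comparison described above can be illustrated as follows: average each feature over all users sharing a role to obtain the standard role features, then measure how far one user's daily features lie from that average. The Euclidean distance used here is an assumption for the sketch, since the paper does not name the deviation measure, and the feature values are invented.

```python
from math import sqrt


def role_profile(feature_rows):
    """Column-wise mean over the feature vectors of users in one role."""
    count = len(feature_rows)
    return [sum(column) / count for column in zip(*feature_rows)]


def deviation(user_features, profile):
    """Euclidean distance between a user's daily features and the role mean."""
    return sqrt(sum((u - p) ** 2 for u, p in zip(user_features, profile)))


# Hypothetical daily features (emails sent, websites visited) for one role.
role_rows = [[10.0, 20.0], [12.0, 18.0], [8.0, 22.0]]
profile = role_profile(role_rows)

typical = deviation([10.0, 21.0], profile)
unusual = deviation([13.0, 24.0], profile)
```

A user whose deviation grows well beyond the range seen for peers in the same role would be a candidate for closer inspection.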
Fig. 5. MBS Result
MBS Results: The proposed Malicious Behavior Detection System (MBS) model aims to identify malicious user behaviors from three perspectives. Normal and anomalous points are distinguishable under this approach, although some false positives and false negatives exist. The model uses a Multilayer Perceptron (MLP) to learn the relationships between the deviations and determine abnormal behaviors. Experimental results demonstrate the effectiveness of MBS, which outperforms the baseline model in all metrics and reaches an AUC of 0.96.
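For readers unfamiliar with the metric, AUC can be computed directly from anomaly scores and labels as the probability that a randomly chosen anomalous sample scores higher than a randomly chosen normal one, with ties counted as one half. The scores below are made up for the example and are unrelated to the reported value of 0.96.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    `labels` uses 1 for anomalous and 0 for normal samples.
    """
    positives = [s for s, label in zip(scores, labels) if label == 1]
    negatives = [s for s, label in zip(scores, labels) if label == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


scores = [0.9, 0.8, 0.35, 0.4, 0.1]
labels = [1, 1, 1, 0, 0]
value = auc(scores, labels)
```

A value of 1.0 would mean every anomalous sample outranks every normal one, and 0.5 corresponds to random ordering.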
Fig. 6. WDD LOSS of Action Feature
VI. CONCLUSION
In this work, we have implemented a Multi-model Based System for UEBA using the ConvLSTM algorithm. The multi-model system shows improved accuracy over a single-model system and over traditional machine learning algorithms. With the help of deep learning models, we have built a system with which an administrator can detect insider attacks within the organization. To enhance the proposed system, we have created a dashboard, a web application in which the administrator can keep watch on malicious users and receive alerts in case of an insider attack. In the future, this model could be deployed as software that runs continuously in real time.
ACKNOWLEDGMENT
We would like to take this opportunity to thank our guide
Prof. Pranali Navghare for giving us all the help and guidance
we needed. We are really grateful to her for her kind support.
Her valuable suggestions were very helpful. We are also
grateful to Dr. G.V. Kale, Head of Department of Computer
Engineering, and Dr. S. T. Gandhe, Principal, Pune Institute
of Computer Technology for their indispensable support and
suggestions.
REFERENCES
[1] M. Shashanka, M.-Y. Shen and J. Wang, "User and entity behavior analytics for enterprise security," 2016 IEEE International Conference on Big Data (Big Data), 2016, pp. 1867-1874, doi: 10.1109/BigData.2016.7840805.
[2] R. Sommer and V. Paxson. Outside the closed world: On using machine learning for network intrusion detection. In Security and Privacy (SP), 2010 IEEE Symposium on, pages 305-316. IEEE, 2010.
[3] B. Böse, B. Avasarala, S. Tirthapura, Y.-Y. Chung and D. Steiner, "Detecting Insider Threats Using RADISH: A System for Real-Time Anomaly Detection in Heterogeneous Data Streams," in IEEE Systems Journal, vol. 11, no. 2, pp. 471-482, June 2017, doi: 10.1109/JSYST.2016.2558507.
[4] Haowei Liu 2021 J. Phys.: Conf. Ser. 1994 012021, doi: 10.1088/1742-6596/1994/1/012021.
[5] Balaram Sharma, Prabhat Pokharel, and Basanta Joshi. 2020. User Behavior Analytics for Anomaly Detection Using LSTM Autoencoder - Insider Threat Detection. In Proceedings of the 11th International Conference on Advances in Information Technology (IAIT2020). Association for Computing Machinery, New York, NY, USA, Article 5, 1-9. https://doi.org/10.1145/3406601.3406610.
[6] P. A. Legg, O. Buckley, M. Goldsmith and S. Creese, "Automated Insider Threat Detection System Using User and Role-Based Profile Assessment," in IEEE Systems Journal, vol. 11, no. 2, pp. 503-512, June 2017, doi: 10.1109/JSYST.2015.2438442.
[7] Yousef, Rasheed and Jazzar, Mahmoud. (2021). Measuring the Effectiveness of User and Entity Behavior Analytics for the Prevention of Insider Threats. Xi'an Jianzhu Keji Daxue Xuebao/Journal of Xi'an University of Architecture and Technology. XIII. 175-181. 10.37896/JXAT13.10/313918.
[8] A. Saaudi, Z. Al-Ibadi, Y. Tong and C. Farkas, "Insider Threats Detection Using CNN-LSTM Model," 2018 International Conference on Computational Science and Computational Intelligence (CSCI), Las Vegas, NV, USA, 2018, pp. 94-99, doi: 10.1109/CSCI46756.2018.00025.
[9] Machine Learning; Investigators from Dalhousie University Release New Data on Machine Learning (Analyzing Data Granularity Levels for Insider Threat Detection Using Machine Learning). Journal of Engineering, 2020.
[10] LeCun Y, Bengio Y, Hinton GE. Deep Learning. Nature, 2015, 521(7553):436-444.
[11] Schmidhuber J. Deep Learning in Neural Networks: An Overview. Neural Networks, 2015, 61:85-117.
[12] Bengio Y, Courville A, Vincent P. Representation Learning: A Review and New Perspectives. IEEE TPAMI, 2013, 35(8):1798-1828.
[13] Yu Y, Canales S. Conditional LSTM-GAN for Melody Generation from Lyrics. 2019.
[14] Jingxian Yang, Shuai Zhang, Yue Xiang, Jichun Liu, Junyong Liu, Xiaoyan Han, Fei Teng. LSTM auto-encoder based representative scenario generation method for hybrid hydro-PV power system. IET Generation, Transmission & Distribution, 2020, 14(24).
[15] Malhotra Pankaj, Anusha Ramakrishnan, Gaurangi Anand, Lovekesh Vig, Puneet Agarwal, and Gautam Shroff, LSTM-based encoder-decoder for multi-sensor anomaly detection. arXiv preprint arXiv:1607.00148 (2016).
[16] Insider Threat Test Dataset. (November 2016). Retrieved March 6, 2020 from https://resources.sei.cmu.edu/library/asset-view.cfm?assetid=508099.
[17] Fangfang Yuan, Yanan Cao, Yanmin Shang, Yanbing Liu, Jianlong Tan, and Binxing Fang. 2018. Insider Threat Detection with Deep Neural Network. Lecture Notes in Computer Science, Computational Science - ICCS 2018 (2018), 43-54.
[18] Iffat A. Gheyas and Ali E. Abdallah. 2016. Detection and prediction of insider threats to cyber security: a systematic literature review and meta-analysis. Big Data Analytics 1, 1 (2016). DOI: http://dx.doi.org/10.1186/s41044-016-0006-0.
[19] Ali H. Mirza and Selin Cosan. 2018. Computer network intrusion detection using sequential LSTM Neural Networks autoencoders. 2018 26th Signal Processing and Communications Applications Conference (SIU) (2018).
[20] Pang, G., Shen, C., Cao, L., & Hengel, A. V. D. (2021). Deep Learning for Anomaly Detection. ACM Computing Surveys, 54(2), 1-38. doi:10.1145/3439950.
[21] Davide Abati, Angelo Porrello, Simone Calderara, and Rita Cucchiara. 2019. Latent space autoregression for novelty detection. In CVPR. 481-490.
[22] Charu C. Aggarwal. 2017. Outlier Analysis. Springer.
[23] Samet Akcay, Amir Atapour-Abarghouei, and Toby P. Breckon. 2018. GANomaly: Semi-supervised anomaly detection via adversarial training. In ACCV. Springer, 622-637.
[24] Leman Akoglu, Hanghang Tong, and Danai Koutra. 2015. Graph based anomaly detection and description: A survey. Data Min. Knowl. Discov. 29, 3 (2015), 626-688.
[25] Elie Aljalbout, Vladimir Golkov, Yawar Siddiqui, Maximilian Strobel, and Daniel Cremers. 2018. Clustering with deep learning: Taxonomy and new methods. arXiv:1801.07648.
[26] J. Andrews, Thomas Tanay, Edward J. Morton, and Lewis D. Griffin. 2016. Transfer representation-learning for anomaly detection.
[27] Fabrizio Angiulli, Fabio Fassetti, Giuseppe Manco, and Luigi Palopoli. 2017. Outlying property detection with numerical attributes. Data Min. Knowl. Discov. 31, 1 (2017), 134-163.
[28] Fabrizio Angiulli, Fabio Fassetti, and Luigi Palopoli. 2009. Detecting outlying properties of exceptional objects. ACM Trans. Database Syst. 34, 1 (2009), 1-62.
[29] Fabrizio Angiulli and Clara Pizzuti. 2002. Fast outlier detection in high dimensional spaces. In PKDD. Springer, 15-27.
[30] Martin Arjovsky, Soumith Chintala, and Léon Bottou. 2017. Wasserstein generative adversarial networks. In ICML. 214-223.
[31] Terje Aven. 2016. Risk assessment and risk management: Review of recent advances on their foundation. Eur. J. Operat. Res. 253, 1 (2016), 1-13.
[32] Fatemeh Azmandian, Ayse Yilmazer, Jennifer G. Dy, Javed A. Aslam, and David R. Kaeli. 2012. GPU-accelerated feature selection for outlier detection using the local kernel density ratio. In ICDM. IEEE, 51-60.
[33] Kevin Bache and Moshe Lichman. 2013. UCI machine learning repository. Retrieved from http://archive.ics.uci.edu/ml.
[34] Yoshua Bengio, Aaron Courville, and Pascal Vincent. 2013. Representation learning: A review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35, 8 (2013), 1798-1828.
[35] Paul Bergmann, Michael Fauser, David Sattlegger, and Carsten Steger. 2019. MVTec AD - A comprehensive real-world dataset for unsupervised anomaly detection. In CVPR. 9592-9600.
[36] Insider Threat Report 2020, Cyber Insiders. https://www.cybersecurity-insiders.com/wp-content/uploads/2019/11/2020-Insider-Threat-Report-Gurucul.pdf.
[37] S. Khaliq, Z. U. Abideen Tariq and A. Masood, "Role of User and Entity Behavior Analytics in Detecting Insider Attacks," 2020 International Conference on Cyber Warfare and Security (ICCWS), 2020, pp. 1-6, doi: 10.1109/ICCWS48432.2020.9292394.
[38] A. Salitin and A. H. Zolait, "The role of User Entity Behavior Analytics to detect network attacks in real time," 2018 International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies (3ICT), 2018, pp. 1-5, doi: 10.1109/3ICT.2018.8855782.
[39] M. Raut, S. Dhavale, A. Singh and A. Mehra, "Insider Threat Detection using Deep Learning: A Review," 2020 3rd International Conference on Intelligent Sustainable Systems (ICISS), Thoothukudi, India, 2020, pp. 856-863, doi: 10.1109/ICISS49785.2020.9315932.
[40] P. Goyal and K. Gupta, "Supervised Learning Approach for User and Entity Behavior Analytics," 2020 3rd International Conference on Computing and Communications Technologies (ICCCT), 2020, pp. 1-6, doi: 10.1109/ICCCT49228.2020.9269224.
[41] N. R. Pandit and V. M. Thakare, "An Overview of Supervised Learning Algorithms for User and Entity Behavior Analytics," 2021 7th International Conference on Advanced Computing and Communication Systems (ICACCS), 2021, pp. 782-787, doi: 10.1109/ICACCS51225.2021.9375201.
[42] R. Zhang, J. Zhang, and Y. Jia, "A Survey on Unsupervised Learning for User and Entity Behavior Analytics," 2021 IEEE 5th Information Technology and Mechatronics Engineering Conference (ITOEC), 2021, pp. 237-242, doi: 10.1109/ITOEC51615.2021.00055.
[43] J. Datta, R. Dasgupta, S. Dasgupta and K. R. Reddy, "Real-Time Threat Detection in UEBA using Unsupervised Learning Algorithms," 2021 5th International Conference on Electronics, Materials Engineering & Nano-Technology (IEMENTech), Kolkata, India, 2021, pp. 1-6, doi: 10.1109/IEMENTech53263.2021.9614848.
Mukt Shabd Journal
Volume XII, Issue XII, DECEMBER/2023
ISSN NO : 2347-3150
Page No : 1585
ResearchGate has not been able to resolve any citations for this publication.
Article
Full-text available
The increasing penetration of renewable energy sources causes complex uncertainties of the power system. To capture such uncertainties in power system planning, an important step is to generate representative scenarios. In this work, a long short term memory (LSTM) auto‐encoder based approach is proposed to generate representative scenarios in an integrated hydro‐photovoltaic (PV) power generation system, which consists of feature extraction by LSTM Encoder, scenario clustering in feature domain by combining gap statistics method and K‐means++, and representative scenario reconstruction by using LSTM Decoder. Compared with traditional scenario selection and generation methods, the proposed method can better capture the patterns of multivariate time‐series data in both temporal and spatial dimensions. A case study in southwest China is used to demonstrate the effectiveness of the proposed method, which outperforms other existing methods by achieving the lowest SSE and DBI indices of 0.89 and 0.12, respectively, and obtaining the best SIL and CHI scores of 0.93 and 2.30, respectively, In addition, the case study shows the proposed model setup works more stable for scenario generation.
Conference Paper
Full-text available
Organizations are using advanced security solutions to protect their information resources. However, even such high investments, traditional security approaches failed to protect the network structure against state-of-the-art attacks. New proactive approaches to security are on the rise such as User Entity Behavior Analytics (UEBA). UEBA is a type of cybersecurity process that uses machine learning, algorithms, and statistical analyses to detect real-time network attacks. This paper aims to assess the value and success of using behavior analytics in securing the network from not-before-seen attacks such as zero-day attacks. This paper uses a systematic literature review and self-administrated survey and interviews with convenience sampling of high profile network users and top security vendors. Survey and interviews with various security experts are utilized to verify the matter-of-fact effectiveness of the solutions based on behavior analytics. During collecting the primary data via a survey, researchers will go for a structured interview with vendors who are selling solutions to understand the performance of behavior analytics-based solutions and the distinct features of their solutions. The results of literature review, survey, interviews and focus groups will be used to assess the value and success of using behavior analytics in securing the network from not-before-seen attacks such as zeroday attacks. The endeavor of this paper is to highlight the weaknesses and strengths of different UEBA solutions and their effectiveness for detecting network attacks in real-time interaction. This research contrasts top fifteen UEBA technologies based on use cases and capabilities and highlights common usage scenarios. Based on the evidence, recommendations will be given.
Conference Paper
Full-text available
The detection of anomalous structures in natural image data is of utmost importance for numerous tasks in the field of computer vision. The development of methods for unsupervised anomaly detection requires data on which to train and evaluate new approaches and ideas. We introduce the MVTec Anomaly Detection (MVTec AD) dataset containing 5354 high-resolution color images of different object and texture categories. It contains normal, i.e., defect-free, images intended for training and images with anomalies intended for testing. The anomalies manifest themselves in the form of over 70 different types of defects such as scratches, dents, contaminations, and various structural changes. In addition, we provide pixel-precise ground truth regions for all anomalies. We also conduct a thorough evaluation of current state-of-the-art unsupervised anomaly detection methods based on deep architectures such as convolutional autoencoders, generative adversarial networks, and feature descriptors using pre-trained convolutional neural networks, as well as classical computer vision methods. This initial benchmark indicates that there is considerable room for improvement. To the best of our knowledge, this is the first comprehensive, multi-object, multi-defect dataset for anomaly detection that provides pixel-accurate ground truth regions and focuses on real-world applications.
Chapter
Anomaly detection is a classical problem in computer vision, namely the determination of the normal from the abnormal when datasets are highly biased towards one class (normal) due to the insufficient sample size of the other class (abnormal). While this can be addressed as a supervised learning problem, a significantly more challenging problem is that of detecting the unknown/unseen anomaly case, which takes us instead into the space of a one-class, semi-supervised learning paradigm. We introduce such a novel anomaly detection model by using a conditional generative adversarial network that jointly learns the generation of high-dimensional image space and the inference of latent space. Employing encoder-decoder-encoder sub-networks in the generator network enables the model to map the input image to a lower-dimensional vector, which is then used to reconstruct the generated output image. The additional encoder network maps this generated image to its latent representation. Minimizing the distance between these images and the latent vectors during training aids in learning the data distribution for the normal samples. As a result, a larger distance metric from this learned data distribution at inference time is indicative of an outlier from that distribution, that is, an anomaly. Experimentation over several benchmark datasets, from varying domains, shows the model's efficacy and superiority over previous state-of-the-art approaches.
Article
Melody generation from lyrics has been a challenging research issue in the field of artificial intelligence and music, which enables us to learn and discover latent relationships between interesting lyrics and accompanying melodies. Unfortunately, the limited availability of a paired lyrics–melody dataset with alignment information has hindered the research progress. To address this problem, we create a large dataset consisting of 12,197 MIDI songs each with paired lyrics and melody alignment through leveraging different music sources where alignment relationship between syllables and music attributes is extracted. Most importantly, we propose a novel deep generative model, conditional Long Short-Term Memory (LSTM)–Generative Adversarial Network for melody generation from lyrics, which contains a deep LSTM generator and a deep LSTM discriminator both conditioned on lyrics. In particular, lyrics-conditioned melody and alignment relationship between syllables of given lyrics and notes of predicted melody are generated simultaneously. Extensive experimental results have proved the effectiveness of our proposed lyrics-to-melody generative model, where plausible and tuneful sequences can be inferred from lyrics.
Conference Paper
Identifying anomalies from log data for insider threat detection is, in practice, a very challenging task for security analysts. User behavior modeling is very important for the identification of these anomalies. This paper presents unsupervised user behavior modeling for anomaly detection. The proposed approach uses an LSTM-based autoencoder to model user behavior based on session activities and thus identify the anomalous data points. The proposed method follows a two-step process. First, it calculates the reconstruction error using the autoencoder on the non-anomalous dataset; second, it uses that error to define the threshold that separates the outliers from the normal data points. The identified outliers are then classified as anomalies. The CERT insider threat dataset has been used for the research work. For each user, the feature vectors are prepared by extracting key information from the corresponding raw events and aggregating the data points based on the user's actions within that user's sessions. An LSTM autoencoder has been implemented for behavior learning and anomaly detection. For any unseen behavior or anomaly pattern, the model produces a high reconstruction error, which is an indication of an anomaly. The experimental results show that in the best case, the model produced an Accuracy of 90.17%, True Positives of 91.03%, and False Positives of 9.84%. Thus, the results suggest that the proposed approach can be effectively used in automatic anomaly detection.
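The two-step thresholding described in this abstract can be illustrated with a minimal Python sketch. It does not implement the LSTM autoencoder itself: the reconstructed vectors are assumed to come from some already trained model. The abstract does not state how the threshold is derived, so the mean plus k standard deviations rule, the function names, and the sample values below are all assumptions made for illustration only.

```python
import statistics


def reconstruction_error(original, reconstructed):
    """Mean squared error between a feature vector and its reconstruction."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def fit_threshold(normal_errors, k=3.0):
    """Step 1: derive a threshold from errors measured on non-anomalous data.

    Assumed rule (not taken from the paper): mean + k * population std.
    """
    mean = statistics.fmean(normal_errors)
    std = statistics.pstdev(normal_errors)
    return mean + k * std


def flag_anomalies(errors, threshold):
    """Step 2: a session is flagged when its error exceeds the threshold."""
    return [e > threshold for e in errors]


# Hypothetical reconstruction errors from sessions known to be normal.
normal_errors = [0.10, 0.12, 0.09, 0.11, 0.10]
threshold = fit_threshold(normal_errors)

# Hypothetical errors for two unseen sessions.
flags = flag_anomalies([0.11, 0.50], threshold)
```

With these made-up numbers, the first unseen session falls below the threshold and the second is flagged. In practice a percentile of the normal-data errors is another common choice for the threshold, and the value of k trades false positives against missed anomalies.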